CS 297 PROPOSAL
A SCALABLE SEARCH ENGINE AGGREGATOR
Pooja B. Mishra (pooja192009@gmail.com)
Advisor: Dr. Chris Pollett
Description:
Yioop is a PHP search engine developed by Dr. Pollett. Search engines typically perform certain tasks, such as updating news feeds and RSS feeds and sending out notifications, periodically and in bulk. Currently, these tasks are not done periodically in Yioop.
I will build an aggregator that periodically updates group notifications and news feeds. This aggregator will be distributed over all machines in the Yioop instance. My tasks will include modifying or enhancing various Yioop features, after first experimenting with its current load capacity and architecture.
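One way the aggregator's work could be spread over the machines in a Yioop instance is to hash each feed URL to a machine index. The sketch below is only an illustration under that assumption: the proposal does not fix a distribution scheme, and the function name `assign_feeds` and the example URLs are hypothetical, not part of Yioop.

```python
import zlib


def assign_feeds(feed_urls, num_machines):
    """Map each feed URL to a machine index in [0, num_machines)."""
    if num_machines < 1:
        raise ValueError("num_machines must be at least 1")
    # Start with an empty feed list for every machine.
    assignment = {index: [] for index in range(num_machines)}
    for url in feed_urls:
        # crc32 gives the same value on every run and every machine,
        # unlike Python's built-in hash(), which is salted per process.
        machine = zlib.crc32(url.encode("utf-8")) % num_machines
        assignment[machine].append(url)
    return assignment


if __name__ == "__main__":
    # Hypothetical feed URLs, used only to show the call.
    feeds = [
        "http://example.com/a.rss",
        "http://example.com/b.rss",
        "http://example.org/c.xml",
    ]
    for machine, urls in assign_feeds(feeds, 6).items():
        print(machine, urls)
```

Because the assignment depends only on the URL and the machine count, each machine could compute its own share of the feeds without coordinating with the others. Changing the number of machines would reshuffle most feeds, which a real design might need to address.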
Deliverables:
As part of CS 297, I will produce the deliverables below.
1. Get familiar with the current news updater feature and experiment with the current news feed features in the search engine. This will include:
Download the search engine and install it on the local machine.
Do a test crawl and get the news updater working, using a copy of the code obtained with git clone.
2. Currently, the news feed feature runs on only one of the six dedicated server machines. This feature needs to be improved. As part of this deliverable, I will:
Experiment with the capacity of the news feed feature to see how many news feeds it updates in an hour.
Test the news feature with increasing numbers of search sources until it fails. ApacheBench will be used to fire queries and measure how the feature performs over different numbers of records.
3. Currently, video in the MP4 format does not work in the Mozilla Firefox browser.
To overcome this, I will add a feature such as FFmpeg re-encoding to the news updater.
4. A CS 297 report of approximately 10 pages containing the results of the above deliverables and any other useful findings. This will take a couple of weeks.
The technologies that we will be working with will mainly include PHP, JavaScript, MySQL, DB2, and PostgreSQL.
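The re-encoding step in deliverable 3 could look roughly like the sketch below, which builds an FFmpeg command that converts an MP4 file to WebM. This is a sketch under assumptions: the proposal names neither a target format nor the way the news updater would invoke FFmpeg, so WebM (VP8 video, Vorbis audio) and the helper names `build_webm_command` and `reencode` are illustrative choices. It also assumes an `ffmpeg` build that includes the `libvpx` and `libvorbis` encoders.

```python
import subprocess
from pathlib import Path


def build_webm_command(source, ffmpeg_binary="ffmpeg"):
    """Return the argument list that re-encodes `source` to a .webm file."""
    source = Path(source)
    # Write the output next to the input, with the extension swapped.
    target = source.with_suffix(".webm")
    return [
        ffmpeg_binary,
        "-i", str(source),     # input file
        "-c:v", "libvpx",      # VP8 video encoder (assumed target codec)
        "-c:a", "libvorbis",   # Vorbis audio encoder (assumed target codec)
        str(target),
    ]


def reencode(source):
    """Run FFmpeg on `source` and return the path of the new file."""
    command = build_webm_command(source)
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(command, check=True)
    return command[-1]
```

Keeping the command construction separate from the call that runs it makes the arguments easy to inspect or log before any video is actually processed.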
Timetable:
The timeline for the above deliverables is given below:
Due date | Deliverable |
Tuesday, 9 September 2014 | Read and learn about the Yioop search engine's news updater |
Tuesday, 16 September 2014 | Deliverable #1: Install Yioop and do a test crawl with the news updater |
Tuesday, 23 September 2014 | Learn about the search engine's news feed features and how to distribute an application across a distributed environment |
Tuesday, 30 September 2014 | Deliverable #2: Experiment with the news feed feature and its capacity |
Tuesday, 14 October 2014 | Read and learn about FFmpeg re-encoding |
Tuesday, 21 October 2014 | Deliverable #3: Code for incorporating the FFmpeg re-encoding feature |
Tuesday, 25 November 2014 | Deliverable #4: CS 297 Final Report |
List of references:
http://linuxers.org/tutorial/ffmpeg-tutorial-beginners
http://www.cs.usask.ca/faculty/eager/loadsharing.pdf
http://www.computer.org/portal/web/ds/home
http://scalingsystems.com/2011/09/07/reading-list-for-distributed-systems/