CS298 Proposal

Scalable Search Engine Aggregator

Pooja Mishra (pooja192009@gmail.com)

Advisor: Dr. Chris Pollett

Committee Members: Dr. Sami Khuri, Dr. Robert Chun.

Abstract:

Yioop is a PHP search engine developed by Dr. Pollett. It is designed to allow users to produce indexes of a website or a collection of websites. The number of pages a Yioop index can handle ranges from that of a small site to tens or hundreds of millions of pages. Like other search engines, Yioop includes media updaters, such as a news updater, and features such as uploading video sources. It also supports many common features of a search portal, such as user discussion groups, blogs, wikis, and a news aggregator. Certain search engine tasks, such as updating news feeds and RSS feeds and sending out notifications, are done periodically in bulk. These functions are all part of the media updater that a search engine typically runs. Yioop has a news_updater process that can be used to re-index RSS and Atom feeds on an hourly basis, so that this more timely information can be incorporated into Yioop search results. The list of video and news sites can be configured through the GUI.

CS297 Results

  • Studied the news updater feature of Yioop in detail
  • Manually tested how many news sources a single-machine news updater can handle
  • Coded a distributed version of the news updater (so that it can now run on multiple machines)
  • Learned and understood the FFMPEG tool
  • Discussed and proposed the architecture for implementing the video uploader as part of the media updater feature of Yioop
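
One way to picture the distributed news updater from the results above is as a partitioning of the configured news sources across machines. The following is a minimal Python sketch of that idea, not Yioop's actual implementation (Yioop itself is written in PHP). The function name assign_feeds, the machine names, and the feed URLs are all hypothetical, and hashing the feed URL is an assumed partitioning scheme chosen only for illustration.

```python
import hashlib


def assign_feeds(feed_urls, machine_names):
    """Assign each feed URL to exactly one machine using a stable hash.

    Hypothetical helper for illustration; the hashing scheme is an
    assumption, not the method used by Yioop's news updater.
    """
    if not machine_names:
        raise ValueError("at least one machine is required")
    assignment = {name: [] for name in machine_names}
    for url in feed_urls:
        # MD5 is used only to get a stable, evenly spread number from the
        # URL; it is not used for security here.
        digest = hashlib.md5(url.encode("utf-8")).hexdigest()
        index = int(digest, 16) % len(machine_names)
        assignment[machine_names[index]].append(url)
    return assignment


if __name__ == "__main__":
    # Made-up feeds and machine names, for demonstration only.
    feeds = [
        "http://example.com/news.rss",
        "http://example.org/world.atom",
        "http://example.net/tech.rss",
    ]
    for machine, urls in assign_feeds(feeds, ["machine_a", "machine_b"]).items():
        print(machine, urls)
```

Because the hash depends only on the URL, each machine can work out its own share of the feeds without coordinating with the others, as long as all machines see the same list of sources and machine names. A drawback of this simple modulo scheme is that adding or removing a machine reassigns many feeds.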

Proposed Schedule

Week 1 (Jan. 27-Feb. 02): Prepare the CS298 proposal and upload it
Week 2 (Feb. 03-09): Experiment with different user interface ideas for the news updater
Weeks 3, 4 (Feb. 03-16): Deliverable #1: Implement the user interface for the news updater and optimize it
Week 5 (Feb. 17-23): Discuss the previously proposed video uploader architecture
Weeks 6, 7 (Feb. 24-Mar. 09): Deliverable #2: Code the video uploader
Week 8 (Mar. 10-16): Read about the group feed features and walk through the code for sending notification emails
Week 9 (Mar. 17-23): Deliverable #3: Code the aggregation of group feed features and raise a code review request for it
Week 10 (Mar. 24-30): Discuss the automated test framework for news updater testing
Week 11 (Mar. 31-Apr. 06): Build the framework and get it reviewed
Week 12 (Apr. 07-13): Understand the archival process for news articles
Week 13 (Apr. 14-20): Implement the automated archival process and raise a code review request for it
Week 14 (Apr. 21-27): Create a first draft of the CS298 report
Week 15 (Apr. 28-May 04): Create the final CS298 report and submit it to the advisor and committee members
Week 16 (May 04-12): Defense

Key Deliverables:

  • Software
    • User Interface for news updater
    • Video uploader implementation
    • Code for mail distribution
  • Report
    • CS298 Report
    • Project code and test result documentation

Innovations and Challenges

  • Experimenting with different user interface samples and checking whether any framework could be used
  • Optimizing the code for efficiency and keeping track of the requesting machine in a distributed environment
