Week 1-2: 08/29/2011 - 09/10/2011 | Discuss the project in detail with the advisor |
Week 3: 09/11/2011 - 09/18/2011 | Download and study Yioop code and perform experiments with it |
Week 4: 09/19/2011 - 09/26/2011 | Deliverable 1 due: Study how URL shortening service links are handled by Yioop and associate the short links correctly with the original page. |
Week 5: 09/27/2011 - 10/3/2011 | Download and study Sphinx code |
Week 6: 10/4/2011 - 10/11/2011 | Deliverable 2 due: Perform experiments with Sphinx to index a newly created database |
Week 7-8: 10/12/2011 - 10/25/2011 | Explore various page ranking algorithms |
Week 9: 10/26/2011 - 11/1/2011 | Deliverable 3 due: Study various page ranking algorithms that can be used with URL shorteners. |
Week 10-11: 11/2/2011 - 11/16/2011 | Study how to crawl password-restricted sites |
Week 12: 11/17/2011 - 11/24/2011 | Deliverable 4 due: Sample scripts for Yioop to crawl password-restricted sites |
Week 13-14: 11/25/2011 - 12/8/2011 | Work on CS297 Report |
Week 15: 12/9/2011 - 12/14/2011 | Deliverable 5 due: CS297 Report |
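Deliverable 1 in the schedule concerns associating short links with the original page. The following is only a minimal sketch of that idea in Python, not Yioop's actual code (Yioop itself is written in PHP). It assumes a hypothetical `next_hop` callable that returns the redirect target of a URL, or `None` when the URL does not redirect; the `short.example` addresses and the `REDIRECTS` table are made up for illustration and stand in for real HTTP redirect responses.

```python
from typing import Callable, Dict, List, Optional


def resolve_short_url(url: str,
                      next_hop: Callable[[str], Optional[str]],
                      max_hops: int = 5) -> str:
    """Follow redirects from `url` until a non-redirecting page is reached.

    Stops early on a redirect loop or after `max_hops` redirects and
    returns the last URL reached.
    """
    seen = {url}
    for _ in range(max_hops):
        target = next_hop(url)
        if target is None or target in seen:
            return url
        seen.add(target)
        url = target
    return url


def associate(links: List[str],
              next_hop: Callable[[str], Optional[str]]) -> Dict[str, List[str]]:
    """Group every crawled link under the original page it resolves to."""
    by_original: Dict[str, List[str]] = {}
    for link in links:
        original = resolve_short_url(link, next_hop)
        by_original.setdefault(original, []).append(link)
    return by_original


# Hypothetical redirect table standing in for HTTP redirect responses.
REDIRECTS: Dict[str, str] = {
    "http://short.example/a1": "http://short.example/b2",
    "http://short.example/b2": "http://www.example.com/article",
}

index = associate(
    ["http://short.example/a1", "http://www.example.com/article"],
    REDIRECTS.get,
)
```

Here both the short link and the direct link end up grouped under `http://www.example.com/article`, so anything recorded for the short link can be credited to the original page. In a real crawler, `next_hop` would issue an HTTP request and read the redirect target from the response.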
Deliverables:
The full project will be done when CS298 is completed. The following will be done by the end of CS297:
1. Study how URL shortening service links are handled by Yioop and associate the short links correctly with the original page.
2. Perform experiments with Sphinx to index a newly created database.
3. Study various page ranking algorithms that can be used with URL shorteners.
4. Modify Yioop to crawl password-restricted sites.
5. CS297 Report.
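As a rough illustration of how deliverables 1 and 3 could fit together, the sketch below runs a basic power-iteration PageRank after rewriting short links to their original URLs, so that a link made through a shortener still counts toward the original page. This is one possible reading, not the algorithm the project commits to: the `alias` map, the example URLs, and the function names are all hypothetical, and the damping factor of 0.85 is just the commonly used default.

```python
from typing import Dict, List


def canonicalize(links: Dict[str, List[str]],
                 alias: Dict[str, str]) -> Dict[str, List[str]]:
    """Rewrite short-link aliases to their original URLs before ranking."""
    merged: Dict[str, List[str]] = {}
    for src, targets in links.items():
        src = alias.get(src, src)
        merged.setdefault(src, [])
        for target in targets:
            target = alias.get(target, target)
            # Drop self-links and duplicate edges created by merging.
            if target != src and target not in merged[src]:
                merged[src].append(target)
    return merged


def pagerank(links: Dict[str, List[str]],
             damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Basic power-iteration PageRank over an adjacency-list graph."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no out-links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical crawl: page A links through a shortener, page B links directly.
ARTICLE = "http://www.example.com/article"
SHORT = "http://short.example/a1"
graph = {
    "http://a.example/": [SHORT],
    "http://b.example/": [ARTICLE],
    ARTICLE: ["http://a.example/"],
}
ranks = pagerank(canonicalize(graph, {SHORT: ARTICLE}))
```

After canonicalization the short link no longer appears as a separate page, and the article receives rank from both pages that point to it, whether they used the shortener or not.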
References:
[1] Accessing the Deep Web. Bin He, Mitesh Patel, Zhen Zhang, Kevin Chen-Chuan Chang. ACM. 2007.
[2] http://en.wikipedia.org/wiki/Invisible_Web#Accessing, August 22, 2011.
[3] http://www.fridaytrafficreport.com/exploring-the-deep-web/, August 21, 2011.
[4] http://en.wikipedia.org/wiki/Sphinx_(search_engine), August 28, 2011.