CS297 Proposal
Processing Posting Lists Using OpenCL
Radha Kotipalli (sowmyu94@yahoo.com)
Advisor:
Dr. Chris Pollett
Description:
A search engine uses an inverted index to look up, for a given term, all the documents containing that term. This list of documents is called a posting list. Posting lists are typically stored in a compressed binary format. My project is to create a GPU-based, parallel reader of posting lists for Yioop.
Yioop is an open source search engine designed and developed in PHP by Dr. Chris Pollett. We will use OpenCL to program the GPU. We will then run experiments comparing our GPU posting list handler with the existing mechanism in Yioop.
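To make "compressed binary format" concrete, here is a minimal sketch in C of one common textbook scheme: store the gaps between sorted document ids and encode each gap with variable-byte coding. This is an illustration only and is not claimed to be the format Yioop uses; the function names (vbyte_encode, vbyte_decode, compress_postings, decompress_postings) are hypothetical, and the sketch assumes document ids are strictly increasing 32-bit values and that the input to the decoder is well formed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one value as variable-byte: 7 data bits per byte, low-order
   group first, high bit set on every byte except the last.
   Returns the number of bytes written (1 to 5). */
static size_t vbyte_encode(uint32_t value, uint8_t *out)
{
    size_t n = 0;
    assert(out != NULL);
    while (value >= 128) {
        out[n++] = (uint8_t)((value & 0x7F) | 0x80);
        value >>= 7;
    }
    out[n++] = (uint8_t)value;
    return n;
}

/* Decode one value written by vbyte_encode.
   Returns the number of bytes consumed. Assumes well-formed input. */
static size_t vbyte_decode(const uint8_t *in, uint32_t *value)
{
    size_t n = 0;
    uint32_t result = 0;
    int shift = 0;
    assert(in != NULL && value != NULL);
    while (in[n] & 0x80) {
        result |= (uint32_t)(in[n] & 0x7F) << shift;
        shift += 7;
        n++;
    }
    result |= (uint32_t)in[n] << shift;
    *value = result;
    return n + 1;
}

/* Compress a strictly increasing list of document ids by storing the
   gap to the previous id. The caller must provide at least
   5 * count bytes in out. Returns the number of bytes used. */
size_t compress_postings(const uint32_t *doc_ids, size_t count, uint8_t *out)
{
    size_t used = 0;
    uint32_t prev = 0;
    for (size_t i = 0; i < count; i++) {
        used += vbyte_encode(doc_ids[i] - prev, out + used);
        prev = doc_ids[i];
    }
    return used;
}

/* Inverse of compress_postings: rebuild the ids by summing the gaps.
   Returns the number of ids written to doc_ids. */
size_t decompress_postings(const uint8_t *in, size_t length, uint32_t *doc_ids)
{
    size_t pos = 0, count = 0;
    uint32_t prev = 0;
    while (pos < length) {
        uint32_t gap;
        pos += vbyte_decode(in + pos, &gap);
        prev += gap;
        doc_ids[count++] = prev;
    }
    return count;
}
```

Decoding of this kind is sequential, because each id depends on the sum of all earlier gaps. That dependency is one reason a parallel reader is an interesting problem.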
Schedule:
Week 1 (Jan 27 - Feb 2): Set the schedule for future meetings and work on the proposal
Week 2 (Feb 3 - Feb 9): Discuss the project in detail with the professor and work on the proposal and deliverables
Weeks 3, 4 (Feb 10 - Feb 23): Read about index compression
Week 5 (Feb 24 - Mar 2): Deliverable #1: Finish reading about index compression
Week 6 (Mar 3 - Mar 9): Work on a C program that creates a posting from a given array of a word and its locations in the document
Week 7 (Mar 10 - Mar 16): Deliverable #2: Write a C program that takes an array of a word and its locations and converts them to a posting
Weeks 8, 9 (Mar 17 - Mar 30): Learn OpenCL
Week 10 (Mar 31 - Apr 6): Deliverable #3: Write a program in OpenCL
Weeks 11, 12 (Apr 7 - Apr 20): Learn about PHP extensions
Week 13 (Apr 21 - Apr 27): Deliverable #4: Write a simple PHP extension for the above OpenCL program
Week 14 (Apr 28 - May 4): Work on the final report
Week 15 (May 5 - May 11): Deliverable #5: Write the CS297 report
Deliverables:
The full project will be done when CS298 is completed. The following will
be done by the end of CS297:
1. Finish reading about index compression.
2. Write a C program that takes an array of docids and locations and converts them to a posting.
3. Write a program in OpenCL.
4. Write a simple PHP extension for the above OpenCL program.
5. Write the CS297 report.
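As a rough idea of what deliverable 2 involves, the following C sketch groups an array of (docid, location) pairs into one posting per document. The layout is an assumption made for illustration: the Occurrence and Posting structs and the build_postings function are hypothetical names, the input is assumed to be sorted by docid, and the actual deliverable may use a different representation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One occurrence of the term: which document, and where in it. */
typedef struct {
    uint32_t doc_id;
    uint32_t position;
} Occurrence;

/* Hypothetical posting layout: one entry per document. The locations
   themselves stay in the input array, starting at index `first`. */
typedef struct {
    uint32_t doc_id;
    uint32_t frequency;  /* number of occurrences in this document */
    size_t first;        /* index of the first occurrence in the input */
} Posting;

/* Group occurrences (sorted by doc_id) into postings.
   Writes at most `capacity` postings and returns how many were written. */
size_t build_postings(const Occurrence *occ, size_t count,
                      Posting *out, size_t capacity)
{
    size_t n = 0;
    assert(occ != NULL || count == 0);
    assert(out != NULL || capacity == 0);
    for (size_t i = 0; i < count; i++) {
        if (n > 0 && out[n - 1].doc_id == occ[i].doc_id) {
            /* Same document as the previous occurrence: extend it. */
            out[n - 1].frequency++;
        } else {
            if (n == capacity) {
                break;  /* output buffer is full */
            }
            /* The input must be sorted by doc_id. */
            assert(n == 0 || occ[i].doc_id > out[n - 1].doc_id);
            out[n].doc_id = occ[i].doc_id;
            out[n].frequency = 1;
            out[n].first = i;
            n++;
        }
    }
    return n;
}
```

For example, the occurrences (2,5), (2,17), (9,1), (9,4), (9,30), (11,8) would yield three postings, for documents 2, 9, and 11, with frequencies 2, 3, and 1.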
References:
1. Büttcher, S., Clarke, C., Cormack, G. (2010). Information Retrieval: Implementing and Evaluating Search Engines. Cambridge, Massachusetts: MIT Press.
2. Gaster, B., Howes, L., Kaeli, D., Mistry, P., Schaa, D. (2011). Heterogeneous Computing with OpenCL. Morgan Kaufmann.
3. Golemon, S. (2006). Extending and Embedding PHP. Indianapolis, Indiana: Sams.