CS298 Proposal

Processing Posting Lists Using OpenCL

Radha Kotipalli (sowmyu94@yahoo.com)

Advisor: Dr. Chris Pollett

Committee Members: Dr. Sami Khuri, Dr. Thomas Austin

Abstract:

The objective of this project is to create a GPU-based, parallel reader of posting lists for Yioop, an open source search engine designed and developed in PHP by Dr. Chris Pollett. We will use OpenCL to write the GPU code, and we will then run experiments comparing the performance of our GPU posting list handler with the existing mechanism in Yioop. Search engines use an inverted index to look up, for a given term, the list of all documents containing that term. This list of documents is called a posting list. Posting lists are typically stored in a compressed binary format. OpenCL (Open Computing Language) provides a platform for efficient parallel programming. GPUs have hundreds or thousands of processing units, rather than the one to eight cores of a typical CPU. OpenCL provides an effective way to program GPUs and to transfer data between the CPU and the GPU in a system. Many posting list operations can be done in parallel, so implementing posting list algorithms in OpenCL will likely improve the performance of the search engine.
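
To make the terms above concrete, here is a minimal C sketch of a posting list as a sorted array of document IDs. It shows two operations: converting IDs to gaps and back (a common step before compression, since gaps are small numbers), and intersecting two lists, which is what a two-term query needs. The function names are hypothetical and this is not Yioop's code; whether Yioop stores its postings as gaps in this way is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Replace sorted doc IDs with gaps in place: ids[i] becomes ids[i] - ids[i-1].
 * Walks backwards so each subtraction still sees the original predecessor. */
void ids_to_gaps(uint32_t *ids, size_t n)
{
    for (size_t i = n; i > 1; i--)
        ids[i - 1] -= ids[i - 2];
}

/* Inverse of ids_to_gaps: a running (prefix) sum restores the doc IDs. */
void gaps_to_ids(uint32_t *gaps, size_t n)
{
    for (size_t i = 1; i < n; i++)
        gaps[i] += gaps[i - 1];
}

/* Intersect two posting lists of strictly increasing doc IDs.
 * out must have room for min(na, nb) entries.
 * Returns the number of common IDs written to out. */
size_t intersect_postings(const uint32_t *a, size_t na,
                          const uint32_t *b, size_t nb,
                          uint32_t *out)
{
    size_t i = 0, j = 0, n = 0;
    assert(out != NULL);
    while (i < na && j < nb) {
        if (a[i] < b[j]) {
            i++;                 /* a is behind, advance it */
        } else if (a[i] > b[j]) {
            j++;                 /* b is behind, advance it */
        } else {
            out[n++] = a[i];     /* document contains both terms */
            i++;
            j++;
        }
    }
    return n;
}
```

For example, intersecting the lists 2, 5, 9, 14, 21 and 5, 7, 14, 30 yields the documents 5 and 14. The gap form of the first list is 2, 3, 4, 5, 7.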

CS297 Results

  • Studied different index compression algorithms and wrote a C program implementing a Huffman encoder/decoder.
  • Wrote a C program to encode/decode posting lists using the Simple-9 compression algorithm.
  • Installed the necessary C++ bindings for Microsoft Visual Studio and wrote an OpenCL program to compute the square of each element of an array.
  • Installed PHP extensions for Windows and wrote a simple PHP extension for a hello world program.
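
As an illustration of the Simple-9 scheme mentioned in the results above, the following is a minimal C sketch, not the CS297 program itself. Simple-9 packs values into 32-bit words: 4 bits select one of nine layouts and the remaining 28 bits hold 28 one-bit values, 14 two-bit values, and so on down to a single 28-bit value. Placing the selector in the top four bits and the first value in the lowest bits is an assumption of this sketch, since implementations differ on bit order; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The nine Simple-9 layouts: values per word and bits per value. */
static const unsigned S9_COUNT[9] = {28, 14, 9, 7, 5, 4, 3, 2, 1};
static const unsigned S9_BITS[9]  = { 1,  2, 3, 4, 5, 7, 9, 14, 28};

/* Encode n values (each below 2^28) into out.
 * out needs room for n words in the worst case.
 * Returns the number of 32-bit words written. */
size_t s9_encode(const uint32_t *in, size_t n, uint32_t *out)
{
    size_t pos = 0, words = 0;
    while (pos < n) {
        unsigned sel, k;
        /* Pick the densest layout whose next values all fit. */
        for (sel = 0; sel < 9; sel++) {
            if (n - pos < S9_COUNT[sel])
                continue;                      /* not enough values left */
            for (k = 0; k < S9_COUNT[sel]; k++)
                if (in[pos + k] >> S9_BITS[sel])
                    break;                     /* value too wide */
            if (k == S9_COUNT[sel])
                break;                         /* layout fits */
        }
        assert(sel < 9);  /* fails only if a value needs more than 28 bits */
        uint32_t word = (uint32_t)sel << 28;
        for (k = 0; k < S9_COUNT[sel]; k++)
            word |= in[pos + k] << (k * S9_BITS[sel]);
        out[words++] = word;
        pos += S9_COUNT[sel];
    }
    return words;
}

/* Decode `words` Simple-9 words into out.
 * out needs room for up to 28 values per word.
 * Returns the number of values written. */
size_t s9_decode(const uint32_t *in, size_t words, uint32_t *out)
{
    size_t n = 0;
    for (size_t w = 0; w < words; w++) {
        unsigned sel = in[w] >> 28;
        assert(sel < 9);                       /* selectors 9..15 are unused */
        unsigned bits = S9_BITS[sel];
        uint32_t mask = (1u << bits) - 1u;
        for (unsigned k = 0; k < S9_COUNT[sel]; k++)
            out[n++] = (in[w] >> (k * bits)) & mask;
    }
    return n;
}
```

With this greedy choice, the ten values 3, 1, 4, 1, 5, 9, 2, 6, 300, 70000 pack into three words: seven 4-bit values, then two 14-bit values, then one 28-bit value.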

Proposed Schedule

Week 1 (Aug 25 - Aug 30): CS298 Proposal
Week 2 (Aug 31 - Sep 6): Get the latest version of Yioop and understand the existing code
Week 3 (Sep 7 - Sep 13): Deliverable 1: C program to encode/decode posting lists using the Modified-9 compression algorithm (Yioop's existing algorithm)
Week 4 (Sep 14 - Sep 20): Finish deliverable 1
Week 5 (Sep 21 - Sep 27): Deliverable 2: Port the above C program to OpenCL
Week 6 (Sep 28 - Oct 4): Continue working on deliverable 2
Week 7 (Oct 5 - Oct 11): Finish deliverable 2
Week 8 (Oct 12 - Oct 18): Deliverable 3: Write PHP extensions for the above OpenCL program
Weeks 9-10 (Oct 19 - Nov 1): Continue working on deliverable 3
Week 11 (Nov 2 - Nov 8): Finish deliverable 3
Weeks 12-13 (Nov 9 - Nov 22): Deliverable 4: Conduct performance comparison tests between the GPU posting list handler and the existing Yioop system
Week 14 (Nov 23 - Nov 29): Start working on CS298 report
Week 15 (Nov 30 - Dec 6): Finish CS298 report and submit to advisor and committee members
Week 16 (Dec 7 - Dec 13): Defense

Key Deliverables:

  • Software
    • Write a C program to encode/decode posting lists using the Modified-9 compression algorithm (Yioop's existing algorithm)
    • Convert the above C program into an OpenCL program
    • Write PHP extensions for the above OpenCL program
    • Conduct performance comparison tests between the GPU posting list handler and the existing Yioop system
  • Report
    • CS298 Report
    • Project code and test results documentation
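
One possible way to approach the C-to-OpenCL conversion listed above is sketched below in plain C. It rests on two assumptions that the proposal does not state: that Simple-9 can stand in for Modified-9 (whose exact format is not described here), and that the work is split so that each compressed word is decoded by one work-item. A first pass computes where each word's output begins (an exclusive prefix sum of per-word value counts). The second pass then decodes every word independently. The loop over `gid` only emulates on the CPU what `get_global_id(0)` would supply in an OpenCL kernel; all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simple-9 layouts (stand-in for Modified-9): values per word, bits per value.
 * Selector is assumed to sit in the top 4 bits of each word. */
static const unsigned S9_COUNT[9] = {28, 14, 9, 7, 5, 4, 3, 2, 1};
static const unsigned S9_BITS[9]  = { 1,  2, 3, 4, 5, 7, 9, 14, 28};

/* Body of one work-item: decode word number gid into out,
 * starting at the precomputed position offsets[gid].
 * It reads and writes nothing that another work-item touches. */
static void decode_work_item(size_t gid, const uint32_t *words,
                             const size_t *offsets, uint32_t *out)
{
    unsigned sel = words[gid] >> 28;
    unsigned bits = S9_BITS[sel];
    uint32_t mask = (1u << bits) - 1u;
    for (unsigned k = 0; k < S9_COUNT[sel]; k++)
        out[offsets[gid] + k] = (words[gid] >> (k * bits)) & mask;
}

/* CPU emulation of the two-pass scheme.
 * offsets must hold n_words entries; out must hold every decoded value.
 * Returns the total number of decoded values. */
size_t decode_parallel_shape(const uint32_t *words, size_t n_words,
                             size_t *offsets, uint32_t *out)
{
    size_t total = 0;

    /* Pass 1: exclusive prefix sum of the per-word value counts. */
    for (size_t w = 0; w < n_words; w++) {
        unsigned sel = words[w] >> 28;
        assert(sel < 9);         /* selectors 9..15 are not valid here */
        offsets[w] = total;
        total += S9_COUNT[sel];
    }

    /* Pass 2: iterations are independent, so on a GPU each one
     * could run as a separate work-item. */
    for (size_t gid = 0; gid < n_words; gid++)
        decode_work_item(gid, words, offsets, out);

    return total;
}
```

The first pass is written sequentially here for clarity; prefix sums themselves have well-known parallel formulations, so it need not remain a serial step.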

Innovations and Challenges

  • Working with novel index compression schemes
  • Writing PHP extensions that interface with OpenCL
  • Conducting performance tests to measure relative performance improvements, which may require dedicated hardware

References:

1. Büttcher, S., Clarke, C., Cormack, G. (2010). Information Retrieval: Implementing and Evaluating Search Engines. Cambridge, Massachusetts: MIT Press.

2. Gaster, B., Howes, L., Kaeli, D., Mistry, P., Schaa, D. (2011). Heterogeneous Computing with OpenCL. Morgan Kaufmann.

3. Golemon, S. (2006). Extending and Embedding PHP. Indianapolis, Indiana: Sams.