CS298 Proposal

Question Answering System

Niravkumar Patel (niravkumar.patel1989@gmail.com)

Advisor: Dr. Chris Pollett

Description:

Yioop is an open source search engine developed and managed by Dr. Christopher Pollett. Currently, when a user submits a query, Yioop returns documents relevant to that query. A summarizer is a process that creates a short summary from a potentially long text document. When processing pages, the Yioop crawler runs a summarizer and then indexes only the contents of the summary it produces. Users sometimes search for a specific piece of information rather than for a document, and the information in these summaries can be used to answer such queries. However, the summary itself might not have its sentences arranged as question-answer pairs.

I will add a new module, called the Question-Answering System, that extracts the information stored in the summary, and I will modify Yioop so that the summary data can be used to answer questions. By applying natural language processing techniques, the Question-Answering System should be efficient enough to interpret a user's query and answer it from the available data.
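As a rough illustration of the kind of information the module would extract, the sketch below pulls a (subject, predicate, object) triplet out of a sentence that has already been annotated with part-of-speech tags. It is a simplified heuristic over a flat tag sequence, not the parse-tree approach planned for the project. The function name `extract_triplet`, the hand-tagged example sentence, and the use of Penn Treebank style tags are assumptions made only for this example.

```python
from typing import List, Optional, Tuple

# A tagged sentence is a list of (word, tag) pairs.
# Penn Treebank style tags are assumed here (NN* = noun, VB* = verb).
Tagged = List[Tuple[str, str]]


def extract_triplet(tagged: Tagged) -> Optional[Tuple[str, str, str]]:
    """Naive heuristic: the first verb is the predicate, the nouns before
    it form the subject, and the nouns after it form the object."""
    verb_index = next(
        (i for i, (_, tag) in enumerate(tagged) if tag.startswith("VB")),
        None,
    )
    if verb_index is None:
        return None  # no verb, so no predicate to build a triplet around

    subject = " ".join(w for w, t in tagged[:verb_index] if t.startswith("NN"))
    obj = " ".join(w for w, t in tagged[verb_index + 1:] if t.startswith("NN"))
    if not subject or not obj:
        return None
    return subject, tagged[verb_index][0], obj


# Hand-tagged example sentence (hypothetical input).
sentence = [
    ("Chris", "NNP"), ("Pollett", "NNP"), ("develops", "VBZ"),
    ("the", "DT"), ("Yioop", "NNP"), ("search", "NN"), ("engine", "NN"),
]
print(extract_triplet(sentence))
```

A real implementation would walk the noun-phrase and verb-phrase subtrees of a parse tree instead of scanning tags linearly, which matters for sentences with several clauses or verbs.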

Schedule:

Weeks 1-3: 19 AUG 2015 - 10 SEP 2015. Deliverable 1: Generate a parse tree from a statement annotated by the part-of-speech tagger.
Weeks 4-6: 11 SEP 2015 - 19 OCT 2015. Deliverable 2: Extract triplets from the generated parse tree.
Weeks 7-9: 20 OCT 2015 - 10 NOV 2015. Deliverable 3: Develop a user query parser.
Weeks 10-12: 11 NOV 2015 - 30 NOV 2015. Deliverable 4: Efficient storage of extracted triplets.
Week 13: 01 DEC 2015 - 15 DEC 2015. Deliverable 5: CS298 Report.

Deliverables:

The project will be considered done when CS 298 is completed. The following will be completed by the end of CS 298:

1. Generate a parse tree from a statement annotated by the part-of-speech tagger.

2. Extract triplets from the generated parse tree.

3. Develop a user query parser.

4. Efficient storage of extracted triplets.

5. CS 298 Report.
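To make the relationship between deliverables 3 and 4 concrete, the following minimal sketch stores triplets in an in-memory index and answers a question of the form "Who/What <verb> <object>?". The `TripletStore` class, the `parse_question` function, and the single question pattern are hypothetical and serve only as an illustration. The actual storage format and query grammar are to be decided during the project, and Yioop itself is not modeled here.

```python
from collections import defaultdict
from typing import DefaultDict, List, Optional, Tuple


class TripletStore:
    """Hypothetical in-memory store keyed by (predicate, object)."""

    def __init__(self) -> None:
        self._by_pred_obj: DefaultDict[Tuple[str, str], List[str]] = defaultdict(list)

    def add(self, subject: str, predicate: str, obj: str) -> None:
        # Keys are lowercased so that lookups are case-insensitive.
        self._by_pred_obj[(predicate.lower(), obj.lower())].append(subject)

    def subjects(self, predicate: str, obj: str) -> List[str]:
        # .get avoids creating empty entries for unknown keys.
        return list(self._by_pred_obj.get((predicate.lower(), obj.lower()), []))


def parse_question(question: str) -> Optional[Tuple[str, str]]:
    """Parse 'Who/What <verb> <object>?' into (verb, object).
    Returns None for any other question shape."""
    words = question.strip().rstrip("?").split()
    if len(words) >= 3 and words[0].lower() in {"who", "what"}:
        return words[1], " ".join(words[2:])
    return None


def answer(store: TripletStore, question: str) -> List[str]:
    parsed = parse_question(question)
    if parsed is None:
        return []
    predicate, obj = parsed
    return store.subjects(predicate, obj)


store = TripletStore()
store.add("Chris Pollett", "develops", "Yioop")
print(answer(store, "Who develops Yioop?"))
```

Keying the index on the known parts of the triplet makes each lookup a single dictionary access. Questions that ask for the object instead of the subject would need a second index keyed by (subject, predicate).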

References:

Yioop Documentation

[2015] Information Extraction From Text by Steven Bird, Ewan Klein, and Edward Loper. 2015.

[2007] Triplet Extraction From Sentences by Delia Rusu, Lorand Dali, Blaž Fortuna, Marko Grobelnik, and Dunja Mladenić. 2007.

[2003] Integrating Web-based and Corpus-based Techniques for Question Answering by Boris Katz, Jimmy Lin, Daniel Loreto, Wesley Hildebrandt, Matthew Bilotti, Sue Felshin, Aaron Fernandes, Gregory Marton, and Federico Mora. TREC 2003.

[2001] Gathering Knowledge for a Question Answering System from Heterogeneous Information Sources by Boris Katz, Jimmy Lin, and Sue Felshin. ACL Workshop 2001.