CS297 Proposal
Translating Natural Languages to SPARQL Query Language for RDF based Question Answering System
Shreya Satish Bhajikhaye (shreyasatish.bhajikhaye@sjsu.edu)
Advisor: Dr. Chris Pollett
Description:
RDF is a data framework used for managing linked resources and knowledge graphs. SPARQL is a powerful language developed to query the continuously growing body of RDF data, but using it requires an understanding of its syntax and semantics. This project aims to overcome this limitation by letting an average user search for information using natural language. The system will use word embeddings to map the sentence structures in the question to the terms in the underlying database. It will then build a SPARQL query according to the user's requirements.
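As a rough illustration of the final step described above, the sketch below builds a SPARQL query string from a triple in which one slot is unknown. It assumes Wikidata as the underlying database (as in the deliverables), and the function name `build_query`, the `?answer` variable, and the example identifiers (Q142 for France, P36 for capital, Q90 for Paris) are illustrative choices, not part of the proposed design.

```python
from typing import Optional

# Assumed Wikidata prefixes; the proposal does not fix these details.
PREFIXES = (
    "PREFIX wd: <http://www.wikidata.org/entity/>\n"
    "PREFIX wdt: <http://www.wikidata.org/prop/direct/>\n"
)


def build_query(subject: Optional[str], predicate: str, obj: Optional[str]) -> str:
    """Build a SELECT query from a triple; the None slot becomes ?answer."""
    # Exactly one of subject/obj must be unknown for this simple sketch.
    if (subject is None) == (obj is None):
        raise ValueError("exactly one of subject/obj must be unknown")
    s = f"wd:{subject}" if subject else "?answer"
    o = f"wd:{obj}" if obj else "?answer"
    return f"{PREFIXES}SELECT ?answer WHERE {{ {s} wdt:{predicate} {o} . }}"


# "What is the capital of France?" -> (France, capital, ?)
query = build_query("Q142", "P36", None)
print(query)
```

In the full system, the triple would presumably come from the word-embedding mapping rather than being written by hand.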
Schedule:
Week 1 | Aug 25 - Aug 31 | Discuss topics and draft proposal
Week 2 | Sep 01 - Sep 07 | Finalize schedule and read about RDF and the SPARQL query language
Week 3 | Sep 08 - Sep 14 | Work on deliverable 1 - Read relevant topics in [1]
Week 4 | Sep 15 - Sep 21 | Deliverable 1 due - SPARQL queries
Week 5 | Sep 22 - Sep 28 | Read [2] to understand concepts in Question Answering systems. Select three types to read in depth.
Week 6 | Sep 29 - Oct 05 | Work on deliverable 2
Week 7 | Oct 06 - Oct 12 | Deliverable 2 due
Week 8 | Oct 13 - Oct 19 | Read papers [3] and [4]
Week 9 | Oct 20 - Oct 26 | Implement a simple neural network
Week 10 | Oct 27 - Nov 02 | Work on deliverable 3
Week 11 | Nov 03 - Nov 09 | Work on deliverable 3
Week 12 | Nov 10 - Nov 16 | Deliverable 3 due
Week 13 | Nov 17 - Nov 23 | Work on deliverable 4
Week 14 | Nov 24 - Nov 30 | Deliverable 4 due. Start work on report
Week 15 | Dec 01 - Dec 07 | First draft of report due. Revise draft of report
Week 16 | Dec 08 | Deliverable 5 due - CS297 report
Deliverables:
The full project will be done when CS298 is completed. The following will
be done by the end of CS297:
1. Implement five complex queries to Wikidata using SPARQL.
2. Read about three different Question Answering systems and implement some non-trivial aspects of one.
3. Build a neural network that maps sentence terms to subject, object and predicate.
4. Write a program that finds the entities in a sentence and locates the terms in the Wikidata items.
5. Submit the CS297 report.
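To make deliverable 4 more concrete, here is a minimal sketch of finding entities in a sentence and linking them to Wikidata items. It assumes a small hand-made label table (`LABEL_TO_ID`) and a greedy longest-match strategy; both are stand-ins chosen for illustration, and the actual program may instead look labels up in Wikidata directly or use a learned model.

```python
import re

# Hypothetical, hand-made label table mapping lowercase labels to Wikidata
# item IDs; a real system would look labels up in Wikidata itself.
LABEL_TO_ID = {
    "france": "Q142",
    "paris": "Q90",
    "germany": "Q183",
    "berlin": "Q64",
    "new york city": "Q60",
}


def find_entities(sentence: str, max_len: int = 3):
    """Return (label, item ID) pairs using greedy longest-match lookup."""
    tokens = re.findall(r"\w+", sentence.lower())
    found = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in LABEL_TO_ID:
                found.append((phrase, LABEL_TO_ID[phrase]))
                i += n
                break
        else:
            # No known label starts at this token; move on.
            i += 1
    return found


print(find_entities("Is Paris the capital of France?"))
```

Matching the longest phrase first keeps multi-word labels such as "new york city" from being split into shorter, unrelated matches.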
References:
[1] DuCharme, B. (2013). Learning SPARQL: Querying and Updating with SPARQL 1.1. O'Reilly.
[2] Dimitrakis, E., Sgontzos, K., & Tzitzikas, Y. (2020). A survey on question answering systems over linked data and documents. Journal of Intelligent Information Systems, 55, 233-259.
[3] Rusu, D., Dali, L., Fortuna, B., Grobelnik, M., & Mladenić, D. (2007). Triplet Extraction from Sentences.
[4] Liu, Y., Zhang, T., Liang, Z., Ji, H., & McGuinness, D. (2018). Seq2RDF: An End-to-end Application for Deriving Triples from Natural Language Text. ArXiv, abs/1807.01763.