CS297 Proposal
Translating Natural Languages to SPARQL Query Language for RDF based Question Answering System
Shreya Satish Bhajikhaye (shreyasatish.bhajikhaye@sjsu.edu)
Advisor: Dr. Chris Pollett
Description: RDF is a data framework used for managing linked resources and knowledge graphs. SPARQL is a powerful language developed to query the continuously growing body of RDF data, but using it requires an understanding of its syntax and semantics. This project aims to overcome this limitation by letting an average user search for information using natural language. The system will use word embeddings to map the sentence structures in a question to the terms in the underlying database. It will then build a SPARQL query according to the user's requirements.
Schedule:
Deliverables: The full project will be done when CS298 is completed. The following will be done by the end of CS297:
1. Implement five complex queries to Wikidata using SPARQL.
2. Read about three different Question Answering systems and implement some non-trivial aspects of one.
3. Build a neural network that maps sentence terms to subject, object and predicate.
4. Write a program that finds the entities in a sentence and locates the matching terms among the Wikidata items.
5. CS297 report due.
References:
[1] DuCharme, B. (2013). Learning SPARQL: Querying and Updating with SPARQL 1.1. O'Reilly.
[2] Dimitrakis, E., Sgontzos, K., & Tzitzikas, Y. (2020). A survey on question answering systems over linked data and documents. J Intell Inf Syst 55, 233-259.
[3] Rusu, D., Dali, L., Fortuna, B., Grobelnik, M., & Mladenić, D. (2007). Triplet Extraction from Sentences.
[4] Liu, Y., Zhang, T., Liang, Z., Ji, H., & McGuinness, D. (2018). Seq2RDF: An End-to-end Application for Deriving Triples from Natural Language Text. ArXiv, abs/1807.01763.
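Illustrative sketch: The mapping step outlined in the description (word embeddings that tie question terms to database terms, followed by query construction) might look roughly like the Python below. This is only a toy under stated assumptions, not the project's design: the three-dimensional vectors are hand-made stand-ins for real pretrained embeddings, the function names `cosine`, `nearest_property`, and `build_query` are hypothetical, and the entity is assumed to have been resolved to a Wikidata ID already. The property IDs P36 (capital), P569 (date of birth), and P1082 (population) are real Wikidata properties, and `wd:` / `wdt:` are prefixes predefined by the Wikidata Query Service.

```python
import math

# Hand-made toy vectors for question words (assumption: a real system
# would load pretrained word embeddings instead).
TOY_WORD_VECTORS = {
    "capital": (0.9, 0.1, 0.0),
    "born": (0.0, 0.9, 0.1),
    "population": (0.1, 0.0, 0.9),
}

# Hand-made toy vectors for the labels of three Wikidata properties.
PROPERTY_VECTORS = {
    "P36": (1.0, 0.0, 0.0),    # capital
    "P569": (0.0, 1.0, 0.0),   # date of birth
    "P1082": (0.0, 0.0, 1.0),  # population
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_property(word):
    """Return the property ID whose label vector is most similar to the word."""
    vec = TOY_WORD_VECTORS[word]
    return max(PROPERTY_VECTORS, key=lambda pid: cosine(vec, PROPERTY_VECTORS[pid]))


def build_query(entity_id, question_word):
    """Build a one-triple SPARQL query for an already-resolved entity ID."""
    prop = nearest_property(question_word)
    return f"SELECT ?answer WHERE {{ wd:{entity_id} wdt:{prop} ?answer . }}"


if __name__ == "__main__":
    # "What is the capital of France?" with France resolved to Q142.
    print(build_query("Q142", "capital"))
```

Running the script prints a query string that could be pasted into the Wikidata Query Service. Entity linking and handling of multi-triple questions are deliberately left out of this sketch.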