CS297 Proposal
High performance document store implementation in Rust
Ishaan Aggarwal (ishaan.aggarwal@sjsu.edu)
Advisor: Dr. Chris Pollett

Description: The aim of this project is to implement a high-performance document store that is robust and memory efficient. This will be achieved by migrating the older PHP-based data storage implementation of the Yioop! open source search engine to a Rust-based implementation. Rust was chosen because it allows more efficient memory management, easier maintenance, greater robustness, and faster performance.

Schedule:
Deliverables: The full project will be done when CS298 is completed. The following will be done by the end of CS297:
1. A single-node server that can receive requests for a document by key and return the corresponding document.
2. Implement linear hashing in Rust. This will be leveraged in Deliverable 4.
3. Read and write documents from/to WARC files.
4. Implement the key-value store using consistent hashing.
5. TBD - Migrate some portion of the PHP code to Rust.

References:
[1] [2012] Corbett, J., Dean, J., Epstein, M. et al. Spanner: Google's globally distributed database. In Proceedings of OSDI'12: Tenth Symposium on Operating System Design and Implementation, Hollywood, CA, October 2012.
[2] [2019] Khan, S., Liu, X., Ali, S. A., and Alam, M. (2019). Storage solutions for big data systems: A qualitative study and comparison. arXiv preprint arXiv:1904.11498.
[3] [2020] Okazaki, S. (2020). An experimental study of memory management in Rust programming for big data processing (Doctoral dissertation, Boston University).
[4] [2021] Rust programming best practices with examples: https://github.com/mre/idiomatic-rust
[5] [2018] Gjengset, J., Schwarzkopf, M., Behrens, J., Araujo, L. T., Ek, M., Kohler, E., ... and Morris, R. (2018). Noria: dynamic, partially-stateful data-flow for high-performance web applications. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18) (pp. 213-231).
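
Illustrative note on Deliverable 4: the proposal names consistent hashing for the key-value store but does not describe a design. The following is a minimal sketch of one common form of the technique, a hash ring with virtual nodes, and is not taken from the Yioop code. The names `HashRing`, `ring_position`, `replicas`, and the node labels are hypothetical, and the standard library's `DefaultHasher` is assumed only for convenience.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Map any hashable value to a position on the ring.
/// Assumption: DefaultHasher is used for illustration only. Its output is
/// not guaranteed to be stable across Rust releases, so a persistent store
/// would need to pin a specific hash function.
fn ring_position<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical consistent-hash ring that maps document keys to node names.
pub struct HashRing {
    /// Ring positions in sorted order, each owned by one node.
    ring: BTreeMap<u64, String>,
    /// Number of virtual positions placed on the ring per node.
    replicas: u32,
}

impl HashRing {
    pub fn new(replicas: u32) -> Self {
        HashRing { ring: BTreeMap::new(), replicas }
    }

    /// Place `replicas` virtual positions for `node` on the ring.
    pub fn add_node(&mut self, node: &str) {
        for i in 0..self.replicas {
            self.ring.insert(ring_position(&(node, i)), node.to_string());
        }
    }

    /// Remove all virtual positions that belong to `node`.
    pub fn remove_node(&mut self, node: &str) {
        for i in 0..self.replicas {
            self.ring.remove(&ring_position(&(node, i)));
        }
    }

    /// Return the node that owns `key`: the first ring position at or after
    /// the key's position, wrapping around to the smallest position.
    /// Returns None when the ring is empty.
    pub fn node_for(&self, key: &str) -> Option<&str> {
        let pos = ring_position(&key);
        self.ring
            .range(pos..)
            .next()
            .or_else(|| self.ring.iter().next())
            .map(|(_, node)| node.as_str())
    }
}
```

In this sketch, a caller would build a ring with `HashRing::new(16)`, register nodes with `add_node`, and route each document key with `node_for`. Removing a node reassigns only the keys that node owned, which is the property that usually motivates consistent hashing over plain modulo placement.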