CS297 Proposal

Evaluating the performance of NoSQL and Time Series databases using TSBS

Aarsh Patel (aarsh.patel@sjsu.edu)

Advisor: Dr. Chris Pollett

Description: Time series are measurements or events that are tracked, monitored, down-sampled, and aggregated over time. Many time series databases have been developed specifically to store such data, but traditional NoSQL databases such as MongoDB and Cassandra can also be used for it. The Time Series Benchmark Suite (TSBS) is a recent open source benchmarking suite for time series data that supports many time series and NoSQL databases, and it still receives features and updates periodically. The goal of this project is to evaluate the performance of four databases (three time series databases and MongoDB) across a range of queries. Metrics such as data storage footprint and read and write performance will form the basis for answering the research question: how do traditional NoSQL databases perform against time series databases when storing time series data?

Schedule:
Deliverables:

The full project will be done when CS298 is completed. The following will be done by the end of CS297:

1. A research study on time series data, workloads, and their advantages.
2. Finalizing the benchmarking suite and the databases.
3. Research on the databases, studying the benchmark code and Go programs.
4. Implementation of the benchmark locally with a small dataset with the chosen databases and queries.
5. CS297 report.

References:

[1] S. N. Z. Naqvi, S. Yfantidou, and E. Zimanyi, “Time series databases and influxdb.” [Online]. Available: https://jira.lsstcorp.org/secure/attachment/37574/influxdb_2017.pdf. [Accessed: 10-Mar-2023].

[2] “DB-Engines ranking,” DB-Engines. [Online]. Available: https://db-engines.com/en/ranking/time+series+dbms/all. [Accessed: 10-Mar-2023].

[3] J. Han, H. E, G. Le, and J. Du, “Survey on NoSQL database,” 2011 6th International Conference on Pervasive Computing and Applications, pp. 363–366, 2011.

[4] V. Abramova and J. Bernardino, “NoSQL databases: MongoDB vs Cassandra,” Proceedings of the International C* Conference on Computer Science and Software Engineering, pp. 14–22, 2013.

[5] D. Paul, “Time Series Database (TSDB) guide: Influxdb,” InfluxData, 09-Feb-2023. [Online]. Available: https://www.influxdata.com/time-series-database/. [Accessed: 10-Mar-2023].

[6] “Timescale docs,” TimescaleDB - Timeseries database for PostgreSQL. [Online]. Available: https://docs.timescale.com/. [Accessed: 10-Mar-2023].