CS297 Proposal
Robust Cache System for Yioop
Rushikesh Padia (padiarushi3012@gmail.com)
Advisor: Dr. Chris Pollett

Description: Yioop is an open source search engine that allows users to create indexes of websites. It has distributed crawlers that crawl the open internet and index web pages. It also allows users to add their own websites. Currently, Yioop uses a simple query caching mechanism that relies on an expiration time to invalidate stale cache entries. This mechanism can be replaced with state-of-the-art approaches to improve speed and precision. The goal of the project is to implement a cache management system that improves the search results of the Yioop search engine. The cache system will be responsible for the efficient storage and retrieval of search query results.
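As an illustration of the expiration-based approach described above, the following is a minimal sketch of a query result cache that invalidates entries by age. It is not Yioop's actual code (Yioop is written in PHP; Python is used here only for brevity), and the names `ExpiringQueryCache`, `ttl_seconds`, and `clock` are hypothetical, introduced for this example.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class ExpiringQueryCache:
    """Hypothetical query-result cache that treats entries older than
    ttl_seconds as stale. An illustration only, not Yioop's implementation."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        # The clock is injectable so that expiry can be exercised without sleeping.
        self.clock = clock
        # Maps a query string to (time stored, cached results).
        self._entries: Dict[str, Tuple[float, List[str]]] = {}

    def put(self, query: str, results: List[str]) -> None:
        """Store results for a query, stamped with the current time."""
        self._entries[query] = (self.clock(), results)

    def get(self, query: str) -> Optional[List[str]]:
        """Return cached results, or None on a miss or a stale entry."""
        entry = self._entries.get(query)
        if entry is None:
            return None
        stored_at, results = entry
        if self.clock() - stored_at >= self.ttl_seconds:
            # The entry has outlived its time-to-live: drop it and report a miss.
            del self._entries[query]
            return None
        return results


if __name__ == "__main__":
    cache = ExpiringQueryCache(ttl_seconds=3600)
    cache.put("open source search", ["result-1", "result-2"])
    print(cache.get("open source search"))
```

A cache of this kind decides staleness from the age of an entry alone: it does not consider how often a query is issued, how costly it is to recompute, or whether the underlying index has changed. Those are the kinds of signals the approaches listed in the deliverables aim to exploit.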
Schedule:
Deliverables: The full project will be done when CS298 is completed. The following will be done by the end of CS297:
1. Understand Yioop and add a new media job that runs queries to populate the caches
2. Implement the MLDC algorithm [3]
3. Implement the STDC algorithm [4]
4. Implement the SSDC algorithm [5]
5. CS 297 Report

References:
[1] B. Cambazoglu and R. Baeza-Yates, "Scalability Challenges in Web Search Engines," in Synthesis Lectures on Information Concepts, Retrieval, and Services, vol. 7, 2011, pp. 27-50. doi: 10.1007/978-3-642-20946-8_2.
[2] R. Ozcan, I. S. Altingovde, and A. Ulusoy, "Cost-Aware Strategies for Query Result Caching in Web Search Engines," ACM Trans. Web, vol. 5, no. 2, May 2011. doi: 10.1145/1961659.1961663.
[3] T. Kucukyilmaz, B. B. Cambazoglu, C. Aykanat, and R. Baeza-Yates, "A machine learning approach for result caching in web search engines," Information Processing & Management, vol. 53, no. 4, pp. 834-850, 2017. doi: 10.1016/j.ipm.2017.02.006.
[4] I. Mele, N. Tonellotto, O. Frieder, and R. Perego, "Topical result caching in web search engines," Information Processing & Management, vol. 57, no. 3, p. 102193, 2020. doi: 10.1016/j.ipm.2019.102193.
[5] T. Kucukyilmaz, "Exploiting temporal changes in query submission behavior for improving the search engine result cache performance," Information Processing & Management, vol. 58, no. 3, p. 102533, 2021. doi: 10.1016/j.ipm.2021.102533.
[6] H. Ma, O. Tao, C. Zhao, P. Li, and L. Wang, "Impact of replacement policies on static-dynamic query results cache in web search engines," in 2017 IEEE International Conference on Intelligence and Security Informatics (ISI), 2017, pp. 137-139. doi: 10.1109/ISI.2017.8004890.
[7] R. Solar, V. Gil-Costa, and M. Marin, "Evaluation of Static/Dynamic Cache for Similarity Search Engines," in SOFSEM 2016: Theory and Practice of Computer Science, 2016, pp. 615-627.
[8] R. Blanco, E. Bortnikov, F. Junqueira, R. Lempel, L. Telloli, and H. Zaragoza, "Caching Search Engine Results over Incremental Indices," in Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2010, pp. 82-89. doi: 10.1145/1835449.1835466.
[9] T. Trinh, D. Wu, and J. Z. Huang, "C3C: A New Static Content-Based Three-Level Web Cache," IEEE Access, vol. 7, pp. 11796-11808, 2019. doi: 10.1109/ACCESS.2019.2892761.