CS298 Proposal
Robust Cache System for Yioop
Rushikesh Padia (rushikeshlalit.padia@sjsu.edu)
Advisor: Dr. Chris Pollett
Committee Members: Dr. Ben Reed, Batul Merchant

Abstract:
According to recent studies, the average internet user expects search results in 0.5 to 2 seconds. Google has also established guidelines that the response time for a search query should be under 200 milliseconds. Achieving such ambitious goals requires an efficient caching mechanism. In web search engines, caching is implemented at multiple layers to improve performance and response time. Caching query results has the highest impact on response time because a cached result is returned without processing the query. It also reduces the load on web servers, which improves their performance. Yioop is an open-source web search engine that implements result caching. The current implementation uses a single dynamic cache based on Marker's algorithm. A single dynamic cache can capture short-term trends in queries, but it fails to capture long-term trends and perennially popular queries. To capture such trends, Static-Dynamic (SD) caches are commonly used. In this project, we implement a Static-Topic-Dynamic (STD) cache in Yioop, which adds a topic-based layer to the SD cache. In an STD cache, a fraction of the cache is allocated to this layer, and its entries are separated by topic, such as weather and education. The search results for a query on a specific topic are stored in the cache section for that topic. This allows Yioop to use cache space effectively by adapting to the temporal locality of different topics.

CS297 Results
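As a rough illustration of the Static-Topic-Dynamic layout described in the abstract, the following Python sketch shows one way a lookup could be routed across the three layers. It is not Yioop's implementation: the class names, the keyword-based `classify` function, and the section capacities are all hypothetical, and a plain LRU policy stands in for the Marker's algorithm that the abstract says Yioop currently uses.

```python
from collections import OrderedDict


class LRUCache:
    """Small LRU cache (an assumed stand-in for the dynamic replacement policy)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        if self.capacity <= 0:
            return
        self.entries[key] = value
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used


class STDCache:
    """Hypothetical Static-Topic-Dynamic result cache."""

    def __init__(self, static_entries, topic_capacities, dynamic_capacity, classify):
        self.static = dict(static_entries)  # fixed set of always-popular queries
        self.topics = {t: LRUCache(c) for t, c in topic_capacities.items()}
        self.dynamic = LRUCache(dynamic_capacity)  # queries with no known topic
        self.classify = classify  # maps a query to a topic name or None

    def _layer_for(self, query):
        # Queries whose topic has no dedicated section fall back to the dynamic layer.
        return self.topics.get(self.classify(query), self.dynamic)

    def get(self, query):
        if query in self.static:
            return self.static[query]
        return self._layer_for(query).get(query)

    def put(self, query, results):
        if query in self.static:
            return  # the static layer is never replaced at run time
        self._layer_for(query).put(query, results)


def classify(query):
    """Toy keyword classifier; a real system would use a trained topic model."""
    words = query.lower().split()
    if "weather" in words or "forecast" in words:
        return "weather"
    if "university" in words or "course" in words:
        return "education"
    return None
```

Because each topic has its own section, a burst of weather queries in this sketch can only evict other weather entries; cached education results and the static entries are unaffected.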
Proposed Schedule
Key Deliverables:
Innovations and Challenges
References:
[1] H. Ma, O. Tao, C. Zhao, P. Li, and L. Wang, "Impact of replacement policies on static-dynamic query results cache in web search engines," in 2017 IEEE International Conference on Intelligence and Security Informatics (ISI), 2017, pp. 137-139. doi: 10.1109/ISI.2017.8004890.
[2] R. Solar, V. Gil-Costa, and M. Marin, "Evaluation of Static/Dynamic Cache for Similarity Search Engines," in SOFSEM 2016: Theory and Practice of Computer Science, 2016, pp. 615-627.
[3] R. Blanco, E. Bortnikov, F. Junqueira, R. Lempel, L. Telloli, and H. Zaragoza, "Caching Search Engine Results over Incremental Indices," in Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2010, pp. 82-89. doi: 10.1145/1835449.1835466.
[4] T. Trinh, D. Wu, and J. Z. Huang, "C3C: A New Static Content-Based Three-Level Web Cache," IEEE Access, vol. 7, pp. 11796-11808, 2019, doi: 10.1109/ACCESS.2019.2892761.
[5] T. Kucukyilmaz, B. B. Cambazoglu, C. Aykanat, and R. Baeza-Yates, "A machine learning approach for result caching in web search engines," Information Processing & Management, vol. 53, no. 4, pp. 834-850, 2017, doi: https://doi.org/10.1016/j.ipm.2017.02.006.
[6] I. Mele, N. Tonellotto, O. Frieder, and R. Perego, "Topical result caching in web search engines," Information Processing & Management, vol. 57, no. 3, p. 102193, 2020, doi: https://doi.org/10.1016/j.ipm.2019.102193.
[7] M. Catena and N. Tonellotto, "Energy-Efficient Query Processing in Web Search Engines," IEEE Transactions on Knowledge and Data Engineering, vol. 29, no. 7, pp. 1412-1425, 2017, doi: 10.1109/TKDE.2017.2681279.
[8] B. Cambazoglu and R. Baeza-Yates, "Scalability Challenges in Web Search Engines," in Synthesis Lectures on Information Concepts, Retrieval, and Services, vol. 7, 2011, pp. 27-50. doi: 10.1007/978-3-642-20946-8_2.
[9] R. Ozcan, I. S. Altingovde, and A. Ulusoy, "Cost-Aware Strategies for Query Result Caching in Web Search Engines," ACM Trans. Web, vol. 5, no. 2, May 2011, doi: 10.1145/1961659.1961663.