Project Blog


CS 298


Week 16 - December 6th 2022

Discussed the out-of-memory issue. Decided to use an LRU cache to store frequently used term embeddings and avoid creating vectors for all the terms at once, and to use pack/unpack functions to convert embedding vectors to binary strings to save memory.

Things to do this week

  • Implement LRU cache to store frequent term embeddings
  • Use pack-unpack functions for embedding vectors
  • Prepare defense slides
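
A minimal sketch of the LRU cache and packing idea from this week's discussion, written in Python for illustration and not taken from the actual patch. Python's struct module stands in for the pack and unpack functions mentioned above; the EmbeddingCache class, the load_embedding callback and the 4-byte float format are all assumptions made for the example.

```python
import struct
from collections import OrderedDict


def pack_vector(vector):
    """Pack a list of floats into a bytes string (4 bytes per value)."""
    return struct.pack(f"<{len(vector)}f", *vector)


def unpack_vector(blob):
    """Recover the list of floats from a bytes string made by pack_vector."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


class EmbeddingCache:
    """Least-recently-used cache that keeps term embeddings in packed form."""

    def __init__(self, capacity, load_embedding):
        self.capacity = capacity
        # Hypothetical callback: term -> list of floats (e.g. computed on demand).
        self.load_embedding = load_embedding
        self.entries = OrderedDict()

    def get(self, term):
        if term in self.entries:
            self.entries.move_to_end(term)  # mark as most recently used
            packed = self.entries[term]
        else:
            packed = pack_vector(self.load_embedding(term))
            self.entries[term] = packed
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)  # evict least recently used
        return unpack_vector(packed)
```

Only the terms that are actually requested get a vector, and at most capacity of them stay in memory at once, each stored as a compact binary string instead of a list of number objects.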

Week 15 - November 29th 2022

Revised the recommendation job patch and the ManyUserExperiment patch to account for the changes made in the DescriptionUpdate media job. Discussed how to handle meta details in descriptions: if the number of terms in a line is less than the length of the context window, treat those terms as meta words.

Things to do this week

  • Revise the patch to handle meta words as discussed above
  • Complete the draft for report
  • Meet committee members and confirm defense date
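
The meta word rule discussed above can be sketched roughly as follows. This is an illustration in Python, not the patch itself; the function name split_meta_words and the whitespace tokenization are assumptions.

```python
def split_meta_words(description, context_window):
    """Separate terms on short lines (meta words) from regular content terms.

    A line with fewer terms than the context window is treated as meta
    detail, so all of its terms are collected as meta words.
    """
    meta_words, content_terms = [], []
    for line in description.splitlines():
        terms = line.split()  # assumption: simple whitespace tokenization
        if len(terms) < context_window:
            meta_words.extend(terms)
        else:
            content_terms.extend(terms)
    return meta_words, content_terms
```

For example, with a context window of 5, a short line such as "Runtime: 95" would be classified as meta words, while a full sentence of the description would be kept as content.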

Week 14 - November 22nd 2022

Used the data from the ITEM_IMPRESSION_SUMMARY table, as it is persisted for almost a year. Demonstrated the code changes for thread and group recommendations. Decided to recommend threads from the groups a user is already a member of, and to use the same table for storing term and item embeddings, differentiated by the ITEM_TYPE column.

Things to do this week

  • Fix the issue of group recommendations not being calculated
  • Start drafting the report

Week 12, 13 - November 15th 2022

Demonstrated the first working version of resource recommendation. Decided to get rid of the existing recommendation code for threads and groups and use the same mechanism as for resources.

Things to do this week

  • Update code for threads and groups recommendation
  • Use stop words from the Tokenizer so that the locale is taken into account
  • Figure out how to handle data removal from ITEM_IMPRESSION table

Week 11 - November 1st 2022

Demonstrated how descriptions for the resources are fetched using the recommendation file.

Things to do this week

  • Complete the code for calculating term and item embeddings for resource descriptions

Week 10 - October 25th 2022

Discussed maintaining a text file similar to needs_description.txt to retrieve the descriptions for resources.

Things to do this week

  • Implement Hash2Vec on resources to find term embeddings and item embeddings
  • Start the work on deliverable 4

Week 9 - October 18th 2022

Demonstrated the completed, updated ManyUserExperiment and the results of the completed DescriptionUpdate media job on the test resources created by ManyUserExperiment. Discussed using the binary return type of the md5 function when calculating the resource id.

Things to do this week

  • Update md5 function return type while calculating resource id
  • Start the work on deliverable 4
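
To illustrate why the binary md5 return type is attractive, here is a small Python sketch. In PHP, md5 takes an optional second argument that requests raw binary output; Python's hashlib offers the same choice through digest() and hexdigest(). The function name resource_id and the idea of hashing a resource path are assumptions made for the example, since the exact input to the hash is not stated here.

```python
import hashlib


def resource_id(resource_path, binary=True):
    """Return the MD5 of a resource path as raw bytes or as a hex string.

    The raw form is 16 bytes long, half the size of the 32-character
    hexadecimal form, which matters when many ids are stored.
    """
    digest = hashlib.md5(resource_path.encode("utf-8"))
    return digest.digest() if binary else digest.hexdigest()
```

Both forms carry the same information: calling .hex() on the binary id gives back the hexadecimal string.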

Week 8 - October 11th 2022

Demonstrated the updated search source form fields and their use in the DescriptionUpdate media job. Discussed how to capture wiki resource views in the ITEM_IMPRESSION table and how to update ManyUserExperiment to create test data.

Things to do this week

  • Finalize the remaining implementation of DescriptionUpdate media job
  • Start on deliverable 3 goals

Week 7 - October 4th 2022

Demonstrated the DescriptionUpdate media job in test mode, where it only fetches the query page HTML and logs it to the terminal. Discussed the need to update the search source form fields to handle custom HTML tags and attributes.

Things to do this week

  • Update the search source add and edit forms
  • Complete the DescriptionUpdate media job

Week 6 - September 27th 2022

Demonstrated the resource processing mechanism as discussed last week. Created an initial DescriptionUpdate media job that runs only in test mode and only fetches the thumb folder paths listed in the global needs_description.txt file.

Things to do this week

  • Complete the DescriptionUpdate Job to fully run in test mode

Week 5 - September 20th 2022

Discussed how to mark a resource as needing a description: check, when a user reads a wiki page, whether that page has Update Resource Description set to true. Maintain a needs_description.txt file in each thumb folder that tracks the resources needing a description update, and copy the path of that file to the global needs_description.txt file in the resources folder.

Things to do this week

  • Change the resource tracking mechanism as discussed above
  • Start basic implementation of DescriptionUpdate media job

Week 4 - September 13th 2022

Demonstrated the first iteration of the UI for adding and editing a Description source. Discussed adding a field for test values to the edit source form and changing the form field labels to more suitable names. Also discussed adding a checkbox to the edit wiki page form that lets the user include or exclude that page's resources from description updates.

Things to do this week

  • Update the names of form fields
  • Add test values input field
  • Add checkbox field in the edit wiki page form
  • Create a text file to store details about resources that need a description update

Week 3 - September 6th 2022

Discussed Anirudh's Hash2Vec approach and concluded that it should be improved by combining TF-IDF scores with Hash2Vec embedding vectors to build one vector for an entire item, rather than using vectors for individual terms, and by measuring the similarity between two items as the cosine between their item vectors. Discussed the high-level functionality of the media job for scraping relevant details of media items using Search Sources and Web Scrapers in Yioop.

Things to do this week

  • Read about Media Job in Yioop Documentation
  • Explore the Search Source and Web Scrapers code base
  • Design the UI for Media Items Search Source
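
One possible reading of the item vector idea discussed this week, sketched in Python. It assumes the term vectors have already been produced (for instance by Hash2Vec) and combines them as a TF-IDF weighted sum; the exact weighting scheme, the function names, and the document_frequency and total_items inputs are assumptions, not the approach as finally implemented.

```python
import math
from collections import Counter


def tf_idf_weights(item_terms, document_frequency, total_items):
    """Return {term: tf * idf} for one item, given corpus document frequencies."""
    counts = Counter(item_terms)
    return {
        term: (count / len(item_terms))
        * math.log(total_items / document_frequency[term])
        for term, count in counts.items()
    }


def item_vector(item_terms, term_vectors, document_frequency, total_items):
    """Sum the term vectors of an item, each scaled by its TF-IDF weight."""
    dimension = len(next(iter(term_vectors.values())))
    vector = [0.0] * dimension
    weights = tf_idf_weights(item_terms, document_frequency, total_items)
    for term, weight in weights.items():
        for i, value in enumerate(term_vectors[term]):
            vector[i] += weight * value
    return vector


def cosine(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0
```

Two items are then compared with a single cosine between their item vectors, instead of comparing every pair of term vectors.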

Week 2 - August 30th 2022

Created a new password for access to edit the project site. Discussed retiring the old recommendation mechanism in favor of using only the Hash2Vec approach, and the changes to be made to the Hash2Vec patch. Also learned about creating a group to store media items and about the Search Source mechanism in Yioop, which downloads media items from a given source and updates the destination wiki page.

Things to do this week

  • Update hash2vec patch as per the comments in MantisBT and retire the old recommendation mechanism
  • Get a second committee member and obtain the add code for CS 298 from the department as soon as possible

Week 1 - August 23rd 2022

Organizational meeting for the Fall 2022 semester. Discussed the project and finalized the major outcomes of CS 298.

Things to do this week

  • Draft proposal for CS 298
  • Find two committee members for CS 298
  • Complete CS 298 Request Form and Petition for Advancement to Candidacy

CS 297


Week 15 - May 10th 2022

Discussed the changes made in Anirudh's patch and decided to use a min heap data structure to improve performance when finding similar words. Also discussed how to handle credit redemption when only the default credit redeem script is available.

Things to do this week

  • Update changes to Anirudh's patch as discussed
  • Complete the report
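
A sketch of how a min heap can speed up finding similar words, written in Python with the standard heapq module. It is an illustration of the general technique, not the code in Anirudh's patch; the function most_similar and its inputs are hypothetical. The heap holds only the k best candidates seen so far, so the full list of scores never has to be sorted.

```python
import heapq
import math


def cosine(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar(query_vector, term_vectors, k):
    """Return the k terms whose vectors are closest to query_vector."""
    if k <= 0:
        return []
    heap = []  # min heap of (similarity, term); heap[0] is the weakest kept
    for term, vector in term_vectors.items():
        score = cosine(query_vector, vector)
        if len(heap) < k:
            heapq.heappush(heap, (score, term))
        elif score > heap[0][0]:
            # Better than the weakest kept candidate: swap it in.
            heapq.heapreplace(heap, (score, term))
    return [term for score, term in sorted(heap, reverse=True)]
```

With n candidate words this takes O(n log k) time and O(k) extra memory, compared with O(n log n) time for sorting every score.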

Week 14 - May 3rd 2022

Discussed the structure and content of the report. The credits redeem patch had issues applying because of the PublicHelpPages.php file, so a new patch needs to be created to resolve this and to remove the commits that contained test Stripe keys.

Things to do this week

  • Update credits redeem patch
  • Complete refactoring of Anirudh's patch
  • Start drafting report

Week 13 - April 26th 2022

Demo of the completed Selenium testing project with the shell script and mochawesome HTML reports, and of the Manage Credits UI modified as discussed last time. Decided to redirect users to the Stripe site for the verification step and discussed the changes to be made in Anirudh's code.

Things to do this week

  • Update latest Selenium code on Deliverable 2 page
  • Use Stripe Connect for verification flow
  • Create Issue for credits redeem on Issue Tracker
  • Start refactoring Anirudh's code

Week 12 - April 19th 2022

Demo of the working credit conversion feature. Discussed a few points on collecting the identity document and showing relevant warnings to users who do not provide it. Discussed writing a shell script for the Selenium testing project that downloads Yioop locally before running the test cases.

Things to do this week

  • Write shell script for Selenium project
  • Redesign credit redeem UI to use the tab menu
  • Allow users to update their debit card details when they redeem credits
  • Create help menu for Redeem credits which states that Yioop does not handle or store the information

Week 11 - April 12th 2022

Received the payment processing script to use as a reference for the charging logic. Discussed two ways of letting users set up a Stripe account for redeeming credits: (1) redirect them to Stripe, or (2) perform everything within Yioop.

Things to do this week

  • Implement second way for redeeming the credits
  • Start reading reference 4 - Neural Collaborative Filtering

Week 10 - April 5th 2022

Demo of the completed testing project. Discussed creating a cron job to run the test cases periodically and write the results to a specific location. Summarized the reading on reference 3 and noted the idea of introducing tags on threads and groups for use in recommendation.

Things to do this week

  • Start learning about Stripe
  • Become familiar with the existing purchase credit code in Yioop

Week 9 - March 29th 2022

Spring break


Week 8 - March 22nd 2022

Implemented the sign-in UI test case for the web version. Discussed the possibility of removing the PhantomJS code completely from Yioop. Will try to add Yioop as a dependency of the testing project.

Things to do this week

  • Implement more test cases for Groups and Discussions
  • Continue reading reference paper 3 - Collaborative Filtering Recommender Systems

Week 7 - March 15th 2022

Localized the strings for emoji and submitted a new patch. Discussed the possibility of using Selenium and creating a separate project for testing the Yioop UI. Summarized the readings on reference article 2 - Item based collaborative filtering.

Things to do this week

  • Implement basic setup for testing project
  • Start reading reference paper 3 - Collaborative Filtering Recommender Systems

Week 6 - March 8th 2022

Implemented the emoji shortcuts feature, fixed the message row length so that it no longer exceeds its container, created an issue on the issue tracker, and submitted the patch. Started exploring Selenium.

Things to do this week

  • Explore the existing UI testing code
  • Try to fix the code so that it at least executes the test cases
  • Find alternative for PhantomJS as it is not maintained
  • Start reading next reference article on Item based Collaborative Filtering

Received feedback on the issue tracker to add localized strings for the emoji descriptions and shortcuts.


Week 5 - March 1st 2022

Demo of completed Deliverable 1 - Emoji Picker. Discussed a feature for typing the shortcut that corresponds to an emoji and converting it to the emoji on submit using a regex. Reviewed the model equations included in the ppt. Created an issue tracker account, which was promoted to the developer role.

Things to do this week

  • Work on shortcuts for emoji
  • Fix message length exceeding the container and breaking the UI
  • Start learning about Selenium
  • Create issue for including the patch
  • Create Deliverable 1 page and upload the patch file along with the ppt
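
The shortcut conversion discussed above could look roughly like this. The sketch is in Python for illustration; the :name: shortcut syntax, the shortcut names and the EMOJI_SHORTCUTS table are all hypothetical, since the actual shortcut format is not specified here.

```python
import re

# Hypothetical shortcut table; the real feature's shortcut names may differ.
EMOJI_SHORTCUTS = {
    "smile": "\U0001F604",
    "heart": "\u2764\uFE0F",
    "thumbsup": "\U0001F44D",
}

# Assumed syntax: a shortcut is a name wrapped in colons, e.g. :smile:
SHORTCUT_PATTERN = re.compile(r":([a-z0-9_]+):")


def replace_shortcuts(message):
    """Replace each known :shortcut: with its emoji; leave unknown ones as typed."""
    return SHORTCUT_PATTERN.sub(
        lambda match: EMOJI_SHORTCUTS.get(match.group(1), match.group(0)),
        message,
    )
```

Falling back to the original text for unknown names keeps ordinary messages such as "10:30:45" unchanged when they are submitted.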

Week 4 - February 22nd 2022

Discussed Deliverable 1 - Emoji Picker, reviewed the starter code for it, and discussed enhancements such as showing a description on hover, adapting it for mobile devices, and creating a separate helper class for rendering the emoji picker. Also looked into issue tracking for submitting the patch. Finally, went through a ppt that summarizes reference 1: Chapter 4, How Netflix Recommends Movies, of Networked Life.

Things to do this week

  • Complete the Emoji Picker Deliverable
  • Include the equations used for the prediction models in the ppt
  • Set up an account for issue tracking and for submitting the patch request

Week 3 - February 15th 2022

Discussed the proposal and revised the description and deliverables to make them more concrete. Discussed two things to be done for CS 298: (1) collaborative filtering on the search results to show results that a user is more likely to click, and (2) including the patch for Anirudh's Hash2Vec approach and extending it to wiki pages and wiki resources. Also discussed the tasks for next week.

Things to do this week

  • Meet at noon, not at 2:00 pm
  • Create separate element for emoji which draws emojis using unicode code points
  • Script part to make emojis clickable and logic to send their html to input text when clicked

Week 2 - February 8th 2022

Discussed the change in topic. Discussed and listed a few enhancements to Yioop, finalized the deliverables for CS 297, and briefly discussed the final project for CS 298.

Things to do this week

  • Add Bio
  • Update the Project Blog
  • Draft electronic version of the proposal
  • Download the Yioop project via git clone
  • Explore the codebase of Yioop
  • Locate the code to incorporate Emojis in Chat System

Week 1 - February 1st 2022

Organizational meeting on Zoom. Met the other students doing their master's projects under the professor. Decided the meeting times for the rest of the semester and discussed the CS 297 proposal template.

Things to do this week

  • Draft the proposal