CS298 Proposal
Keyword Search in Social Networks
Vijeth Patil (vijeth.patil@gmail.com)
Advisor: Dr. Chris Pollett
Committee Members: Mark Stamp, Soon Tee Teoh
Abstract:
The idea of this project is to access data (feeds, posts) from social networks and create a search engine over the collected data. The user of the system provides credentials for the given social networks, and my search project then accesses the posts of the user and the user's friends in these networks. On a social network, a user has access to the documents they authored and to the documents authored by their friends. Social networking sites like Facebook allow you to search for a user but not for posts, while other sites like Twitter allow you to search by keyword but not restricted to the people you are following. My search provides the ability to search by keyword over the posts of a user's friend network.
Social search provides richer sources of relevance information than traditional web pages. Some examples are ratings, relevant links, comments and interest in a topic. My project will use these additional signals to enhance the search ranking function in ways that traditional search engines and social networks are not able to do. For example, if a user searches for a product and sees that 20 friends also like the same product, the user is likely to feel more comfortable with that product and more likely to use it. The same process occurs on Twitter: when a person sees trusted followers already engaging with a business, it acts as an unspoken endorsement. As another example, when a user searches for a keyword, the user gets web resource bookmarks posted by friends on social bookmarking sites like del.icio.us. These are more relevant than the search results returned by del.icio.us itself, which give no preference to friends' bookmarks.
CS297 Results
- Created a Twitter application to access a user's data from Twitter.
- Developed a Facebook application to access a user's data on Facebook.
- Implemented an ArchiveBundleIterator to crawl and index a user's bookmarks from Del.icio.us into Yioop!.
- Wrote the CS297 report.
Proposed Schedule
Week 1: Jan 26 - Feb 1 | Get the CS298 proposal approved and uploaded |
Week 2: Feb 2 - Feb 7 | Read the APIs for Facebook and Twitter |
Week 3-4: Feb 8 - Feb 21 | Develop the program to extract data from Facebook |
Week 5: Feb 22 - Feb 28 | Develop the program to extract data from Twitter |
Week 6: Feb 29 - March 6 | Read the Yioop documentation on indexing data |
Week 7: March 7 - March 13 | Implement the iterator for indexing the Facebook and Twitter data into Yioop (phase 1) |
Week 8: March 14 - March 20 | Implement the iterator for indexing the Facebook and Twitter data into Yioop (phase 2) |
Week 9: March 21 - March 27 | Read the Del.icio.us APIs for extracting a user's bookmark data |
Week 10: March 28 - April 3 | Develop the Yioop extension to index the Del.icio.us bookmarks |
Week 11: April 4 - April 10 | Tune Yioop to include the results from Facebook, Twitter and Del.icio.us in its web search results |
Week 12: April 11 - April 17 | Test all the features of the extension and document them |
Week 13-14: April 18 - May 1 | Work on the CS298 report |
Week 15: May 2 - May 9 | Submit the report to advisor and committee |
Week 16: May 10 - May 17 | Defense |
Key Deliverables:
- Software
- Deliverable 1: Program to extract the posts, comments and tweets from Facebook and Twitter
- Deliverable 2: Develop a Yioop extension to index the data from Facebook and Twitter
- Deliverable 3: Extending Yioop to also extract and index data from Del.icio.us
- Deliverable 4: Tuning the ranking algorithm of Yioop so that results from this data are ranked using a function based on the relevancy, recency and source of the post
- Report
- CS298 Project Report
- Project code documentation
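Deliverable 4 calls for a ranking function based on the relevancy, recency and source of a post. The following is a minimal sketch of one possible form of such a function, written in Python for brevity; it is not Yioop code. The multiplicative combination, the per-source weights, the 30-day half-life and the names `Hit`, `social_score` and `rank` are all assumptions made for illustration, not values or names fixed by this proposal.

```python
from dataclasses import dataclass

# Hypothetical per-source weights; the proposal does not fix these values.
SOURCE_WEIGHT = {"facebook": 1.0, "twitter": 0.8, "delicious": 0.9}

# Hypothetical half-life: the recency factor halves every 30 days.
HALF_LIFE_DAYS = 30.0


@dataclass
class Hit:
    doc_id: str
    relevance: float  # keyword-match score from the index, assumed in [0, 1]
    age_days: float   # time since the post was created
    source: str       # which social network the post came from


def social_score(hit: Hit) -> float:
    """Combine relevancy, recency and source into a single score."""
    recency = 0.5 ** (hit.age_days / HALF_LIFE_DAYS)
    weight = SOURCE_WEIGHT.get(hit.source, 0.5)  # default for unknown sources
    return hit.relevance * recency * weight


def rank(hits):
    """Return hits ordered from highest to lowest combined score."""
    return sorted(hits, key=social_score, reverse=True)
```

Under these assumed weights, a fresh Facebook post with moderate keyword relevance outranks a two-month-old tweet with high relevance, which is the kind of trade-off the tuning work in Deliverable 4 would need to evaluate.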
Innovations and Challenges
- The structure of the data shared on Facebook, Twitter and Del.icio.us differs from service to service; converting it into one common indexable format is challenging.
- Accessing a user's posts, likes, comments and tweets from Facebook and Twitter is challenging, as is indexing and ranking this social data effectively in Yioop.
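The first challenge, mapping differently structured data into one common indexable format, could be approached with a per-source field map. The sketch below is in Python and is illustrative only: the `SocialDocument` record, the `normalize` function and every raw field name in `FIELD_MAPS` are hypothetical and have not been checked against the actual Facebook, Twitter or Del.icio.us APIs. Timestamps are assumed to arrive as ISO 8601 strings.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class SocialDocument:
    """Common record that an indexer could consume, whatever the source."""
    source: str
    author: str
    text: str
    url: str
    created: datetime


# Hypothetical mapping from common field name to each source's raw key.
FIELD_MAPS = {
    "facebook": {"author": "from_name", "text": "message",
                 "url": "link", "created": "created_time"},
    "twitter": {"author": "screen_name", "text": "text",
                "url": "url", "created": "created_at"},
    "delicious": {"author": "user", "text": "description",
                  "url": "href", "created": "time"},
}


def normalize(source: str, raw: dict) -> SocialDocument:
    """Convert one raw item from the given source into the common record."""
    keys = FIELD_MAPS[source]
    return SocialDocument(
        source=source,
        author=raw[keys["author"]],
        text=raw[keys["text"]],
        url=raw.get(keys["url"], ""),  # some items may carry no link
        created=datetime.fromisoformat(raw[keys["created"]]),
    )
```

Keeping the source-specific knowledge in a table rather than in separate functions means that supporting another network would mostly require adding one more entry, assuming its items are similarly flat.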
References:
[2010] Search in Social Networks with Access Control. Truls A. Björklund, Michaela Götz, Johannes Gehrke. ACM. 2010.