Chris Pollett > Students > Umaranikar
Prototype - Find relevant ads for the search query

The aim of this deliverable is to design and develop a model that retrieves relevant ads from the advertisement database based on the user's query. To implement the system, we first created a bag of words model from the documents available for indexing. The bag of words model is used to retrieve documents relevant to the user's query using the cosine similarity ranking algorithm.

Cosine similarity is a measure of similarity between two vectors of an inner product space: it is the cosine of the angle between them. The cosine of 0° is 1, and it is less than 1 for any other angle. It is thus a judgement of orientation and not magnitude: two vectors with the same orientation have a cosine similarity of 1, two vectors at 90° have a similarity of 0, and two diametrically opposed vectors have a similarity of -1, independent of their magnitude. Cosine similarity is most often used in positive space, for example with non-negative term counts, where the outcome is neatly bounded in [0,1].

Implementation overview:
Bag of words data format: < term : number of docs containing term
Input:
Intermediate output:
Output:
Important source files:

Example:
Suppose the user enters the search query "cloud". Using cosine similarity, we retrieve the set of documents relevant to the query. A new set of documents is then built from the retrieved documents. Each new document consists of the entered keyword and the three words immediately on either side of it. In this case, I parsed the retrieved documents to locate "cloud" and formed documents of the form <3 words> cloud <3 words>, as shown in the figure. Next, the cosine similarity between each of these documents and each ad in the inventory is calculated by forming vectors. The cosine similarity ranking then gives the most relevant ads for the given search query.

Implementation:
Source Code - ZIP
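The steps in the example above (build keyword-centred snippets from the retrieved documents, turn snippets and ads into term vectors, rank ads by cosine similarity) can be sketched roughly as follows. This is a minimal Python illustration, not the project's actual source code. All function names, the sample documents, and the sample ads are hypothetical. It also makes two assumptions the write-up does not state: vectors hold raw term counts, and an ad's score is its best similarity against any snippet.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector has no direction; treat as unrelated
    return dot / (norm_a * norm_b)


def context_windows(text, keyword, width=3):
    """Return '<width words> keyword <width words>' snippets found in text."""
    tokens = tokenize(text)
    keyword = keyword.lower()
    return [
        " ".join(tokens[max(0, i - width): i + width + 1])
        for i, tok in enumerate(tokens)
        if tok == keyword
    ]


def rank_ads(query, documents, ads, width=3):
    """Rank ads by their best cosine similarity to any keyword snippet.

    Assumption: 'documents' are already the documents retrieved for the
    query, and the query is a single keyword.
    """
    windows = [w for doc in documents for w in context_windows(doc, query, width)]
    window_vecs = [Counter(tokenize(w)) for w in windows]
    scored = []
    for ad in ads:
        ad_vec = Counter(tokenize(ad))
        best = max((cosine_similarity(ad_vec, wv) for wv in window_vecs), default=0.0)
        scored.append((best, ad))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical data, for illustration only.
sample_documents = [
    "Many companies now move their storage to cloud computing platforms for lower cost",
]
sample_ads = [
    "Affordable cloud storage and computing plans",
    "Fresh pizza delivered to your door",
]

for score, ad in rank_ads("cloud", sample_documents, sample_ads):
    print(f"{score:.3f}  {ad}")
```

Because term counts are never negative, every score produced here falls in [0,1], which matches the positive-space case described above. Weighting terms by the per-term document counts stored in the bag of words model (for example, an inverse document frequency factor) would be a natural refinement, but whether the original implementation does so is not stated here.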