
Deliverable 2

Goal

The goal of this deliverable was to simulate the OPIC algorithm in PHP. A basic understanding of the algorithm was gained by reading the reference paper on OPIC. The algorithm works on the following principle: initially, some “cash” is distributed to each page, where cash is simply a numerical value allotted to the page. When a page is crawled, it distributes its current cash equally among all the pages it points to, and the cash it distributed is recorded in the history of the page. The importance of a page is then obtained from this “credit history”. In the simulator, the static nodes of the matrix represent the web pages.
The idea is that the flow of cash through a page is proportional to its importance. At each step, the estimate of the importance of any page k is (H[k] + C[k]) / (G + 1), where H[k] is the history of page k, C[k] is its current cash, and G is the total cash accumulated in the histories so far. The algorithm is run for a chosen number of iterations, until the estimates have converged to an acceptable degree.
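
The principle described above can be sketched in a few lines. This is a minimal illustration in Python, not the PHP code of the simulator. The function names, the round-robin visiting order, and the requirement that every page has at least one out-link are assumptions made for the sketch.

```python
from fractions import Fraction


def opic_step(links, cash, history, total, page):
    """Visit one page and return the updated total G.

    links[page] is the list of pages that `page` points to.
    Assumption of this sketch: every page has at least one out-link.
    """
    targets = links[page]
    if not targets:
        raise ValueError(f"page {page} has no out-links")
    amount = cash[page]
    history[page] += amount           # H[k]: record the cash being distributed
    total += amount                   # G: all cash recorded in histories so far
    cash[page] = Fraction(0)          # the page gives away everything it holds
    share = amount / len(targets)
    for target in targets:
        cash[target] += share         # equal share to every page it points to
    return total


def importance(cash, history, total, page):
    """Estimate from the text: (H[k] + C[k]) / (G + 1)."""
    return (history[page] + cash[page]) / (total + 1)


def simulate(links, iterations):
    """Run OPIC with 1/n initial cash per page and return all estimates."""
    n = len(links)
    cash = [Fraction(1, n)] * n
    history = [Fraction(0)] * n
    total = Fraction(0)
    for i in range(iterations):
        # Assumption: pages are visited in round-robin order.
        total = opic_step(links, cash, history, total, i % n)
    return [importance(cash, history, total, k) for k in range(n)]


if __name__ == "__main__":
    # Three pages linked in a cycle: 0 -> 1 -> 2 -> 0.
    for k, estimate in enumerate(simulate([[1], [2], [0]], 300)):
        print(k, float(estimate))
```

Exact fractions are used so that the estimates always sum to exactly 1: the cash held by all pages stays at 1, and the histories sum to G. On the three-page cycle the estimates start out uneven and approach 1/3 each as the number of iterations grows.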

OPIC Simulator

As part of the deliverable, a simulator was developed in PHP to demonstrate the working of the Online Page Importance Computation (OPIC) algorithm. The tool uses an n × n adjacency matrix over n static nodes (n can be set in the configuration file) to depict a static network. A link between two nodes is activated or deactivated by checking or unchecking the corresponding radio button in the adjacency matrix.
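
One way to picture the adjacency matrix behind the interface is as an n × n grid of on/off flags. The following Python sketch is an illustration only: the names `empty_matrix`, `toggle_link`, and `out_links`, the constant `N`, and the particular links are hypothetical and not taken from the simulator's source.

```python
def empty_matrix(n):
    """Return an n x n adjacency matrix with every link switched off."""
    return [[False] * n for _ in range(n)]


def toggle_link(matrix, src, dst):
    """Flip the link src -> dst, like checking or unchecking one cell."""
    matrix[src][dst] = not matrix[src][dst]


def out_links(matrix):
    """For each page, list the pages it currently points to."""
    return [[dst for dst, on in enumerate(row) if on] for row in matrix]


N = 5  # hypothetical stand-in for the value read from the configuration file
matrix = empty_matrix(N)
for src, dst in [(0, 1), (0, 2), (1, 2), (2, 0), (3, 0), (4, 3)]:
    toggle_link(matrix, src, dst)   # activate a link
toggle_link(matrix, 0, 2)           # deactivate the link 0 -> 2 again

print(out_links(matrix))
```

Row i of the matrix holds the out-links of page i, so the list returned by `out_links` is exactly the information the algorithm needs when a page distributes its cash.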

In the snapshot of the tool below, the links between pages have been set for a 5 × 5 matrix.

opic snapshot 1

After setting the links, the user can step through the algorithm by clicking the Next button. The table shows each page with its links, cash, history, and importance. The page with the red background is the one currently being visited by the algorithm. The snapshot below depicts one of the iterations.
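
A text-only version of such a table might look like the following Python sketch. The function name `format_table` and the column layout are assumptions, and an asterisk stands in for the red background of the page being visited. The example state is what a three-page network with 1/3 initial cash per page would hold right after page 0 has been visited.

```python
def format_table(links, cash, history, total, current):
    """Render one row per page: links, cash, history, importance.

    `current` is the page being visited; it is marked with '*'.
    Importance uses the estimate (H[k] + C[k]) / (G + 1).
    """
    lines = [f"  {'page':>4}  {'links':<10}{'cash':>8}{'history':>9}{'importance':>12}"]
    for k, targets in enumerate(links):
        importance = (history[k] + cash[k]) / (total + 1)
        marker = "*" if k == current else " "
        link_text = ",".join(str(t) for t in targets)
        lines.append(
            f"{marker} {k:>4}  {link_text:<10}"
            f"{cash[k]:>8.4f}{history[k]:>9.4f}{importance:>12.4f}"
        )
    return "\n".join(lines)


# Hypothetical state: links 0 -> 1,2; 1 -> 2; 2 -> 0, after visiting page 0.
links = [[1, 2], [2], [0]]
cash = [0.0, 0.5, 0.5]
history = [1 / 3, 0.0, 0.0]
total = 1 / 3

print(format_table(links, cash, history, total, current=0))
```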

opic snapshot 2

Source Code

[opic]