
    [297 Proposal]

    [Del 1: Reading Slides (pdf)]

    [Del 2 & 3: Code (zip)]

    [297 Report (pdf)]

    [299 Proposal]

    [299 Blog]

    [Del 1: Detailed Plan (pdf)]

    [Del 2: Admin Interface]

    [299 Report (pdf)]



My name is Shawn Tice, and I'm an M.S. student in the Computer Science department at SJSU. This semester (Fall 2012) I'm working with Dr. Pollett on introducing web page classification into the Yioop! crawl process. The classifier will be supervised: trained first on a small corpus supplied by the user, then extended through an interactive bootstrapping process. See the proposal for details.

In Spring 2012 I worked with Dr. Pollett on improving Yioop's archive crawl process by standardizing the archive iterator interface and making it possible to distribute archive crawling over fetchers on several machines. See the proposal for details. You can also access the project blog and final report via the links to the left.

You can contact me through my university email address, which is just my first name, a period, then my last name at