Chris Pollett > Students > Shenoy

Implement a Hindi Part of Speech Tagger

Description: The aim of this deliverable was to implement a Hindi part-of-speech (PoS) tagger. Given a Hindi sentence, the tagger should tag each word in the sentence with its most likely part of speech.

The main challenge in implementing the tagger was obtaining a Hindi lexicon. Once I had found one, I had to clean the lexicon file so that its format was similar to that of the English lexicon. With the lexicon in place, I implemented a rule-based part-of-speech tagger.
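As a rough illustration of the idea, a lexicon-driven, rule-based tagger could be sketched as below. This is not the code from the patch. The lexicon format (one word per line followed by its tags, most likely first), the tag names (NN, NNP, VM, VAUX, PSP), the single contextual rule, and the function names `load_lexicon` and `tag_sentence` are all assumptions made for this example.

```python
# Toy lexicon (hypothetical data): word followed by its tags, most likely first.
LEXICON_TEXT = """\
राम NNP
किताब NN
है VAUX
में PSP
"""


def load_lexicon(text):
    """Parse 'word TAG1 TAG2 ...' lines into a dict of word -> list of tags."""
    lexicon = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2:  # skip blank or malformed lines
            lexicon[parts[0]] = parts[1:]
    return lexicon


def tag_sentence(sentence, lexicon, default_tag="NN"):
    """Tag each whitespace-separated word in the sentence."""
    words = sentence.split()
    known = [word in lexicon for word in words]

    # Step 1: give each word its most likely tag from the lexicon;
    # words missing from the lexicon get the default tag.
    tags = [lexicon[w][0] if w in lexicon else default_tag for w in words]

    # Step 2: apply a contextual rule (illustrative only): an unknown word
    # directly before an auxiliary verb is retagged as a main verb.
    for i in range(len(words) - 1):
        if not known[i] and tags[i + 1] == "VAUX":
            tags[i] = "VM"

    return list(zip(words, tags))


lexicon = load_lexicon(LEXICON_TEXT)
tagged = tag_sentence("राम किताब पढ़ता है", lexicon)
for word, tag in tagged:
    print(word, tag)
```

In this sketch the word पढ़ता is deliberately left out of the toy lexicon, so it first receives the default tag and is then corrected by the contextual rule. A real tagger would need a much larger lexicon and rule set.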

The patch with the Hindi part-of-speech tagger: Hindi Part of Speech Tagger