Forrest

Implemented a segmenter for Yioop that splits Chinese sentences into words

Description

This project is related to Chinese NLP. Since I am not proficient in machine learning and could not implement a conditional random field (CRF) myself, I found an alternative approach that combines a statistical method with a dictionary lookup.
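The exact way the two methods are combined is not spelled out on this page. As a rough illustration only, one common way to mix a dictionary lookup with word statistics is to consider every dictionary word that matches at each position and pick the split with the highest total log-probability under unigram counts. The sketch below is written in Python rather than Yioop's PHP, and the function name `segment`, the `word_counts` table, and the `max_word_len` limit are all hypothetical, not taken from the project.

```python
import math


def segment(sentence, word_counts, max_word_len=4):
    """Split a sentence into words using a dictionary plus unigram statistics.

    Hypothetical sketch: word_counts maps dictionary words to corpus counts.
    A dynamic program picks the split whose words have the highest summed
    log-probability. Characters not covered by the dictionary are emitted
    as single-character words with a small fallback probability.
    """
    total = sum(word_counts.values())
    # Fallback score for a single character missing from the dictionary.
    unknown_logp = math.log(1.0 / (total + 1))

    n = len(sentence)
    # best[i] = (best score for sentence[:i], start index of the last word)
    best = [(float("-inf"), 0)] * (n + 1)
    best[0] = (0.0, 0)

    for end in range(1, n + 1):
        for start in range(max(0, end - max_word_len), end):
            word = sentence[start:end]
            if word in word_counts:
                logp = math.log(word_counts[word] / total)
            elif end - start == 1:
                logp = unknown_logp
            else:
                continue  # multi-character strings must be in the dictionary
            score = best[start][0] + logp
            if score > best[end][0]:
                best[end] = (score, start)

    # Walk the back-pointers from the end of the sentence to recover the words.
    words = []
    end = n
    while end > 0:
        start = best[end][1]
        words.append(sentence[start:end])
        end = start
    return words[::-1]


# Toy dictionary with made-up counts, for illustration only.
toy_counts = {"我": 50, "喜欢": 30, "喜": 2, "欢": 2, "北京": 40, "北": 3, "京": 3}
print(segment("我喜欢北京", toy_counts))
```

With the toy counts above, the longer dictionary words "喜欢" and "北京" win over their single-character pieces because one frequent word scores higher than two rare ones. A real segmenter would need a much larger dictionary and a better treatment of unknown words such as names.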

The results look good: the segmenter reaches an accuracy of 96.5% on the Academia Sinica (AS) dataset and 89.8% on the Peking University (PKU) dataset.

Results:

Chinese Sentence Segmentation