Chris Pollett > Students >
Forrest

Project Blog: Chinese Language Processing


14.Dec.10th

Done

report draft 2

To do

final report; debug reverse maximum matching
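
For reference, reverse maximum matching can be sketched as below. This is a minimal illustration and not the project's actual code: the function name, the `max_word_len` parameter, and the toy dictionary are all assumptions made for the example. The idea is to scan from the end of the sentence and, at each position, take the longest dictionary word that ends there, falling back to a single character.

```python
def reverse_maximum_matching(text, dictionary, max_word_len=4):
    """Segment text right-to-left, taking the longest dictionary match each time.

    `dictionary` is any container of known words (hypothetical here);
    `max_word_len` caps the candidate length and is an assumed setting.
    """
    words = []
    end = len(text)
    while end > 0:
        # Try the longest candidate ending at `end` first, then shrink it.
        for length in range(min(max_word_len, end), 0, -1):
            candidate = text[end - length:end]
            # A single character is always accepted as the fallback.
            if length == 1 or candidate in dictionary:
                words.append(candidate)
                end -= length
                break
    # Words were collected from the end, so restore reading order.
    words.reverse()
    return words


# Toy dictionary, made up for this example only.
toy_dictionary = {"研究", "研究生", "生命", "起源"}
print(reverse_maximum_matching("研究生命的起源", toy_dictionary))
# prints ['研究', '生命', '的', '起源']
```

With this toy dictionary, a forward scan would pick "研究生" first and leave "命" stranded, which is one reason the reverse direction is often tried for Chinese.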


13.Dec.3rd

Done

report draft 1

To do

report draft 2, segmentation patch 4


12.Nov.26th

Done

Partial QA implementation

To do

report draft 1


11.Nov.19th

Done:

Second part of question answering system

TO DO:

QA partial implementation


10.Nov.12th

Done:

Started the question answering part

To Do

Find algorithm for extracting the answers from candidate pages.


09.Nov.5th

Have done:

POS tagging

TO DO:

Question answering

Deliverable 3


08.Oct.29th

Done:

Finalized word segmentation

Started POS tagging

To do:

First draft of the POS tagging program.


07.Oct.22nd

Finalize Chinese Segmentation.

Revise some structure of the code.

TODO

POS slides


06.Oct.15th

Continued work on Chinese segmentation


05.Oct.1st

Done:

Explained CRF (conditional random fields);

Explained CWS (Chinese word segmentation);

To Do:

Try to implement CRF within two weeks;


04.Sept.24th

Done this week:

Deliverable 1

To Do:

CRF and Chinese segmentation


03.Sept.17th

Things talked about today

Some bugs mentioned:

Searching for Chinese words under the English setting returns Chinese pages with English Lang: lang=zh

All Chinese pages are unsafe: safe:all

On Windows, Chinese text is saved(?) as Unicode (not sure if this is a bug, but it can be figured out later)

TODO:

Use Google Trends to calculate score for words


02.Sept.10th

Stuff talked about today:

Chinese suggestion word list and the format

How to generate the list using 4-grams / 5-grams

Stuff Todo:

Fix the current Chinese tokenizer bug in Yioop

Generate the 4-gram / 5-gram file
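
A rough sketch of how such a file could be generated is shown below. It assumes character-level n-grams counted over plain text lines; the function name, the sample lines, and the idea of keeping raw counts are assumptions for illustration, and the actual list format discussed in the meeting is not reproduced here.

```python
from collections import Counter


def char_ngram_counts(lines, n):
    """Count every run of n consecutive characters in each line.

    Character-level grams are an assumption; the real list may be
    built from a different unit or a different corpus.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        # Slide a window of width n across the line.
        for i in range(len(line) - n + 1):
            counts[line[i:i + n]] += 1
    return counts


# Made-up sample lines, only to show the shape of the result.
sample = ["中文分词很有趣", "中文分词"]
four_grams = char_ngram_counts(sample, 4)
five_grams = char_ngram_counts(sample, 5)

# List the most frequent 4-grams first.
for gram, count in four_grams.most_common():
    print(gram, count)
```

The counts could then be written out one gram per line, ordered by frequency, in whatever format the suggestion list ends up using.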


01.Sept.3rd

Fixed some proposal issues.

Talked about the project/subject I need to work on.

Talked about the localization in Yioop.