Chris Pollett > Students >
Forrest

Generate a Chinese word-frequency file so Yioop can suggest Chinese words based on input

Description

The goal of the term project is to add some features to Yioop, so I decided to get familiar with Yioop first by making some enhancements to it. For this deliverable, I was given a list of Chinese words, and my task was to find the frequency of those words and use the Yioop tool to add them.

Results:

1. Word frequencies are retrieved from Google Trends using pytrends. Each word's frequency is calculated as the average of its weekly hotness: a word that is hot for the whole year is a frequently searched word, while a word that is hot only for a short time is not. I rank the words based on this rule.

[Screenshot: Chinese word suggestions]

2. Fixed two minor bugs. The first was that the Chinese stopwordsRemover was not working as expected. The second was that a single Chinese character entered in the Yioop search bar was not correctly encoded for searching.

3. Added a sorting statement so that the suggested words are ordered by frequency instead of lexicographic order.

[Screenshot: suggested words in lexicographic order]
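The ranking rule from item 1 and the ordering change from item 3 can be illustrated with a minimal Python sketch. This is not the code used for the deliverable: the function names, the sample words, and the weekly values are all hypothetical, and the pytrends settings (`hl="zh-CN"`, `tz=0`, a 12-month timeframe, batches of five keywords) are assumptions. The fetch helper needs network access and the pytrends package, so the example below runs only on made-up weekly values.

```python
from statistics import mean


def average_interest(weekly_scores):
    """Mean of the weekly interest values (0-100) for one word."""
    return mean(weekly_scores) if weekly_scores else 0.0


def rank_by_frequency(weekly_by_word):
    """Return (word, average) pairs, highest average first.

    Ties are broken by the word itself so the order is deterministic.
    """
    scores = {w: average_interest(v) for w, v in weekly_by_word.items()}
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


def fetch_weekly_interest(words, timeframe="today 12-m"):
    """Hypothetical helper: pull weekly interest from Google Trends.

    Assumes pytrends is installed. Google Trends accepts at most five
    keywords per request, and values are normalised within each request,
    so averages from different batches are only roughly comparable.
    """
    from pytrends.request import TrendReq  # imported lazily: needs network

    client = TrendReq(hl="zh-CN", tz=0)  # assumed locale and timezone offset
    result = {}
    for start in range(0, len(words), 5):
        batch = words[start:start + 5]
        client.build_payload(batch, timeframe=timeframe)
        frame = client.interest_over_time()
        for word in batch:
            result[word] = frame[word].tolist() if word in frame else []
    return result


# Made-up weekly values: two steadily searched words and one short spike.
sample = {
    "天气": [60] * 52,
    "天空": [20] * 52,
    "天才": [0] * 48 + [100, 90, 40, 10],
}

ranked = rank_by_frequency(sample)
by_frequency = [word for word, _ in ranked]  # 天气, 天空, 天才
by_codepoint = sorted(sample)                # 天才, 天气, 天空
```

With these values the briefly popular word averages well below the two steady ones, so it drops to the end of the frequency ordering even though it comes first in plain code-point (lexicographic) order.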