CS297 Proposal

Text Summarization

Youn, Kim (youn.kim@students.sjsu.edu)

Advisor: Dr. Chris Pollett

Description:

With the growth of the internet, readers are easily overwhelmed by the volume of text available. Few people want to spend time reading every long passage on a web page.
A summary gives an accurate, non-redundant extract of a web page, so a reader can quickly judge what is worth reading and what is not, and save time.
Thus, an efficient text summarizer is needed. The major search engines (Google, Yahoo, etc.) use automated text summarization to present condensed descriptions of search results.
My project also deals with text summarization. For CS 297, we will write sample programs that use the Lanczos algorithm to perform SVD (Singular Value Decomposition); the algorithm makes it possible to decompose very large sparse matrices. The goal is to become familiar with the Lanczos algorithm.
For CS 298/299, we will build such a text summarizer in pure SQL (Structured Query Language).
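
For orientation, the standard relationship between SVD and the Lanczos algorithm can be written as follows. The symbols (A for the matrix being decomposed, Q_k and T_k for the Lanczos basis and tridiagonal matrix after k steps) are our own notation for this sketch, not something fixed by the proposal.

```latex
% SVD of an m x n matrix A, and the symmetric eigenproblem it induces
\[
A = U \Sigma V^{T}, \qquad A^{T} A = V \,\Sigma^{T} \Sigma\, V^{T}
\]
% k steps of Lanczos on the symmetric matrix A^T A give an orthonormal
% basis Q_k and a small symmetric tridiagonal matrix T_k
\[
Q_k^{T} \left( A^{T} A \right) Q_k = T_k
\]
```

The singular values of A are the square roots of the eigenvalues of A^T A, and the eigenvalues of T_k approximate the extreme eigenvalues of A^T A. Only matrix-vector products with A and A^T are needed, which is why the method suits large sparse matrices.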

Schedule:

Week 1 (Jan. 25-Jan. 29): Write CS 297 proposal
Week 2 (Feb. 1-Feb. 5): Read about eigenvalues and eigenvectors
Week 3 (Feb. 8-Feb. 12): Deliverable 1 due
Week 4 (Feb. 15-Feb. 19): Read about Singular Value Decomposition
Week 5 (Feb. 22-Feb. 26): Read about the Lanczos algorithm
Week 6 (Feb. 29-Mar. 5): Work on Deliverable 2
Week 7 (Mar. 8-Mar. 12): Work on Deliverable 2
Week 8 (Mar. 15-Mar. 19): Deliverable 2 due
Week 9 (Mar. 22-Mar. 26): Read about text summarization
Week 10 (Mar. 29-Apr. 2): Work on Deliverable 3
Week 11 (Apr. 5-Apr. 9): Deliverable 3 due
Week 12 (Apr. 12-Apr. 16): Read about SQL
Week 13 (Apr. 19-Apr. 23): Work on Deliverable 4
Week 14 (Apr. 26-May 2): Deliverable 4 due
Week 15 (May 10-May 14): Write report
Week 16 (May 17-May 21): Deliverable 5 due

Deliverables:

The full project will be done when CS298 is completed. The following will be done by the end of CS297:

1. Sample program that computes SVD using brute force

2. Sample program that generates a tridiagonal matrix using the Lanczos algorithm

3. Sample program that computes SVD using the Lanczos algorithm

4. Sample program that extracts a summary from the original content using the Lanczos algorithm

5. CS 297 Report
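
As a rough illustration of what the second deliverable involves, here is a minimal pure-Python sketch of Lanczos tridiagonalization for a small dense symmetric matrix. The function names (`lanczos`, `matvec`, `dot`), the example matrix, and the breakdown tolerance are hypothetical choices for this sketch, not the project's actual code. It also adds full reorthogonalization for numerical stability, which goes beyond the basic three-term recurrence. For SVD, one would apply the same routine to A^T A.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def matvec(A, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, v) for row in A]

def lanczos(A, v0, m):
    """Run up to m Lanczos steps on the symmetric matrix A, starting from v0.

    Returns (alphas, betas, Q): the diagonal and off-diagonal entries of the
    tridiagonal matrix T, and the list of orthonormal Lanczos vectors.
    """
    n = len(A)
    norm = math.sqrt(dot(v0, v0))
    q = [x / norm for x in v0]       # normalized starting vector
    q_prev = [0.0] * n
    beta = 0.0
    alphas, betas, Q = [], [], []
    for j in range(m):
        Q.append(q)
        w = matvec(A, q)
        alpha = dot(w, q)
        # Three-term recurrence: remove components along q and q_prev.
        w = [wi - alpha * qi - beta * pi for wi, qi, pi in zip(w, q, q_prev)]
        # Full reorthogonalization against all earlier Lanczos vectors
        # (an addition for stability, assumed affordable at this size).
        for u in Q:
            c = dot(w, u)
            w = [wi - c * ui for wi, ui in zip(w, u)]
        alphas.append(alpha)
        beta = math.sqrt(dot(w, w))
        # Stop after the last requested step, or on breakdown
        # (the Krylov subspace is exhausted; 1e-12 is an assumed tolerance).
        if j == m - 1 or beta < 1e-12:
            break
        betas.append(beta)
        q_prev, q = q, [wi / beta for wi in w]
    return alphas, betas, Q

# Hypothetical 3 x 3 symmetric example.
A_example = [[4.0, 1.0, 2.0],
             [1.0, 3.0, 0.0],
             [2.0, 0.0, 1.0]]
alphas_example, betas_example, Q_example = lanczos(A_example, [1.0, 1.0, 1.0], 3)
print("diagonal of T:", alphas_example)
print("off-diagonal of T:", betas_example)
```

When the number of steps equals the matrix dimension and no breakdown occurs, T is similar to A, so the sum of the returned diagonal entries matches the trace of A. That gives a quick sanity check on an implementation.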

References:

Lanczos algorithm. Wikipedia. Online at http://en.wikipedia.org/wiki/Lanczos_algorithm

[2005] A Note on Singular Value Decomposition. Sang Min Oh. College of Computing, Georgia Institute of Technology. 2005.

[2002] A Singularly Valuable Decomposition: The SVD of a Matrix. The American University. 2002.

[2004] Clustered SVD strategies in latent semantic indexing. Jing Gao, Jun Zhang. Laboratory for High Performance Scientific Computing and Computer Simulation, Department of Computer Science, University of Kentucky. 2004.