Chris Pollett > Students >
Bui

Project Blog


Defense - December 10th 2021

Defend CS 298 project.

Final things to do:

  • Add slides and final report to web page
  • Graduate :D

Week 16 - December 7th 2021

Discuss defense slides.

Things to do for this week:

  • Add suggested topics to defense slides
  • Practice presentation for defense

Week 15 - November 30th 2021

Discuss completed report draft.

Things to do for this week:

  • Send report draft out to committee members for feedback
  • Upload report draft to Turnitin and schedule defense
  • Work on defense slides

Week 14 - November 23rd 2021

Discuss current report draft.

Things to do for this week:

  • Revise draft with added suggestions to introduction and experiments section

Week 13 - November 16th 2021

Discuss current report progress

Things to do for this week:

  • Finish writing final report draft

Week 12 - November 9th 2021

Picked WARC-KIT as the name for the final product. Also, discussed report deadlines.

Things to do for this week:

  • Continue writing final report
  • Continue bug squashing and creation of more examples to demonstrate

Week 11 - November 2nd 2021

Discuss completed GraphQL Server for querying

Things to do for this week:

  • Make 4-5 different hash tables/queries to show off
  • Begin final report
  • Think of a better name for the completed application, one with a catchy acronym

Week 10 - October 26th 2021

Discuss combined code

Things to do for this week:

  • Complete GraphQL Server for queries

Week 9 - October 19th 2021

Discussed work done so far in combining code and outline of last few weeks

Things to do for this week:

  • Continue work on combining code

Week 8 - October 12th 2021

Discuss completed WarcPacker and discuss work with Driver and GraphQL

Things to do for this week:

  • Continue work on Driver and GraphQL
  • Start combining LinearHashTable, WarcPacker, and Driver.
  • Outline last few weeks of work

Week 7 - October 5th 2021

Discuss work done so far on WarcPacker and the issue with creating dynamic routes with GraphQL.

Things to do for this week:

  • Continue work on WarcPacker
  • Find a solution for the GraphQL limitation.

Week 6 - September 28th 2021

Discussed the finished PackedTableTools and work done so far on the driver format. Also discussed making another tool, WarcPacker, to transform data filtered from WARCParser into a PackedTableTools format for insertion into the LinearHashTable.

Things to do for this week:

  • Continue work on GraphQL/Driver implementation.
  • Begin work on a WarcPacker? (naming is hard).

Week 5 - September 21st 2021

Discussed work done so far on indexing with blob columns and PackedTableTools. Also discussed a driver format using HTTP/Express servers and GraphQL.

Things to do for this week:

  • Finish Linear Hash Table, PackedTableTools, and blob integration.
  • Begin work on a driver in the discussed format.

Week 4 - September 14th 2021

Discussed the currently implemented PackedTableTools and issues with packing and unpacking floating point numbers in JavaScript

Things to do for this week:

  • Integrate PackedTableTools with the Linear Hash Table and include indexes/blob columns
  • Fix packing of floating point numbers if possible
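
The notes do not record what the floating point issue was. One common pitfall, shown here only as an assumption, is that a JavaScript number is a 64-bit double, so packing it into a 4-byte float column does not round-trip exactly. The helper names below are hypothetical and are not the PackedTableTools API; the sketch uses only Node.js Buffer methods.

```javascript
// Hypothetical helpers for illustration; not the project's PackedTableTools code.

// Pack a number as an 8-byte big-endian IEEE 754 double.
function packDouble(value) {
  const buf = Buffer.alloc(8);
  buf.writeDoubleBE(value, 0);
  return buf;
}

function unpackDouble(buf, offset = 0) {
  return buf.readDoubleBE(offset);
}

// Pack a number as a 4-byte big-endian IEEE 754 float (lossy for most values).
function packFloat(value) {
  const buf = Buffer.alloc(4);
  buf.writeFloatBE(value, 0);
  return buf;
}

function unpackFloat(buf, offset = 0) {
  return buf.readFloatBE(offset);
}

const original = 0.1;
console.log(unpackDouble(packDouble(original)) === original); // true
console.log(unpackFloat(packFloat(original)) === original);   // false
```

If a 4-byte column is required, comparing against Math.fround(value) rather than the original value is one way to test the round trip.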

Week 3 - September 7th 2021

Discuss work done so far with PackedTableTools and improved LinearHashTable

Things to do for this week:

  • Continue PackedTableTools Node.js implementation
  • Research into ODBC and/or memcached drivers

Week 2 - August 31st 2021

Discuss project outline and ways to make data inserted into the Linear Hash Table more compressed and SQL-like with Yioop's PackedTableTools.php

Things to do for this week:

  • Improve performance of Linear Hash Table with Node.js stream optimizations
  • Begin work on a Node.js PackedTableTools implementation for inserting records

Week 1 - August 24th 2021

Kickoff meeting and CS298 Proposal discussion

Things to do for this week:

  • Finish CS298 Proposal
  • Turn in CS298 Request Form + Proposal to CS office

Begin CS298


Week 14 - May 11th 2021

Discuss CS 297 Report and plans for next semester

Things to do for this week:

  • Revise CS 297 Report and add to sidebar

Week 13 - May 4th 2021

Discuss CS 297 Report, Del 4, and CS 298 proposal and deadlines for next semester

Things to do for this week:

  • Finish up Del 4
  • Write and finish 297 Report

Week 12 - April 27th 2021

Small demonstration of the in-memory implementation of Del 4 Consistent Hashing and discussed the final CS 297 report outline

Things to do for this week:

  • Continue work on Del 4 Consistent Hashing with physical implementation
  • Work on CS 297 Report paper outline/draft
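
For readers unfamiliar with the technique, an in-memory consistent hashing ring might look roughly like the sketch below. This is not the Del 4 code: the class name, the MD5-based ring position, the virtual-node count, and the server names are all assumptions made for illustration.

```javascript
const crypto = require('crypto');

// Map a string to an unsigned 32-bit position on the ring (first 4 bytes of MD5).
function ringPosition(key) {
  return crypto.createHash('md5').update(key).digest().readUInt32BE(0);
}

// Hypothetical in-memory ring; each node gets several virtual points.
class HashRing {
  constructor(nodes = [], replicas = 50) {
    this.replicas = replicas;
    this.ring = []; // array of { pos, node }, kept sorted by pos
    nodes.forEach((node) => this.addNode(node));
  }

  addNode(node) {
    for (let i = 0; i < this.replicas; i++) {
      this.ring.push({ pos: ringPosition(`${node}#${i}`), node });
    }
    this.ring.sort((a, b) => a.pos - b.pos);
  }

  removeNode(node) {
    this.ring = this.ring.filter((entry) => entry.node !== node);
  }

  // A key belongs to the first point at or after its position, wrapping around.
  getNode(key) {
    if (this.ring.length === 0) return null;
    const pos = ringPosition(key);
    const entry = this.ring.find((e) => e.pos >= pos) || this.ring[0];
    return entry.node;
  }
}

// Usage with made-up server names.
const ring = new HashRing(['server-a', 'server-b', 'server-c']);
const before = ring.getNode('example.com/page');
ring.removeNode('server-c');
const after = ring.getNode('example.com/page');
console.log(before, after);
```

The property of interest is that removing a node only reassigns the keys that node owned; keys on the remaining nodes stay where they were.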

Week 11 - April 20th 2021

Presented slides on paper [4] and discussed initial consistent hashing implementation

Things to do for this week:

  • Continue work on Consistent Hashing implementation

Week 10 - April 13th 2021

Presented completed Del 3 WARC Parser and discussed Del 4 Consistent hashing

Things to do for this week:

  • Make a quick tweak to Del 3 to accept arguments straight from the command line
  • Start on Del 4 by making multiple accessible express servers.
  • Read and create slides on consistent hashing paper [4]

Week 9 - April 6th 2021

Showed complex arguments implementation for WARC Parser and discussed anime recommendations.

Things to do for this week:

  • Continue work on CDX file index creator of WARC files for Del 3

Week 8 - March 23rd 2021

Presented slides on paper [3] and presented the Node.js CLI program that extracts WARC records, filtered either directly or through a CDX index, using specified arguments.

Things to do for this week:

  • Continue further work on Del 3 by implementing more complex arguments like multiple argument parameters, ranged arguments, and reading from multiple WARC files
  • Add CDX file indexing of WARC files functionality to program

Week 7 - March 16th 2021

Discussed work done so far on the Del 3 WARC parser/writer, such as basic parsing and filtering.

Things to do for this week:

  • Continue work on Del 3 by adding some sort of query language or format for filtering extracted WARC records.
  • Read and create slides for paper [3] on Web Archive profiling

Week 6 - March 9th 2021

Paper [2] presentation on WARC file migration and discussed Del 2 linear hash table implementations

Things to do for this week:

  • Start on Del 3, a WARC reader/writer

Week 5 - March 2nd 2021

Discussed CDXJ files and more about linear hash tables

Things to do for this week:

  • Continue work on persistent linear hash table
  • Read and create slides on WARC file migration paper [2]

Week 4 - February 23rd 2021

Presented corrected slides on WARC and CDX files and the Dynamic Hashing paper. Also discussed making the linear hash table implementation Del 2

Things to do for this week:

  • Continue work on KV store by implementing a Linear Hashing Scheme for Del 2
  • Create summary of CDXJ file structure and generate a CDXJ example file
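
As background for the Del 2 work, the core of a linear hashing scheme can be sketched in memory as follows. This is an illustrative sketch, not the project's persistent LinearHashTable: the class, the FNV-1a hash, the starting bucket count, and the load threshold are all assumptions.

```javascript
// FNV-1a string hash, returned as an unsigned 32-bit integer.
function hashKey(key) {
  let h = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    h ^= key.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// Hypothetical in-memory table; buckets are split one at a time, in order.
class LinearHashTable {
  constructor(initialBuckets = 4, maxLoad = 2) {
    this.n0 = initialBuckets; // bucket count at level 0
    this.level = 0;           // number of completed doubling rounds
    this.split = 0;           // index of the next bucket to split
    this.maxLoad = maxLoad;   // average entries per bucket before a split
    this.size = 0;
    this.buckets = Array.from({ length: initialBuckets }, () => []);
  }

  bucketIndex(key) {
    const h = hashKey(key);
    let idx = h % (this.n0 * 2 ** this.level);
    // Buckets before the split pointer were already split this round,
    // so they are addressed with the next level's hash function.
    if (idx < this.split) idx = h % (this.n0 * 2 ** (this.level + 1));
    return idx;
  }

  get(key) {
    const entry = this.buckets[this.bucketIndex(key)].find((e) => e[0] === key);
    return entry ? entry[1] : undefined;
  }

  set(key, value) {
    const bucket = this.buckets[this.bucketIndex(key)];
    const entry = bucket.find((e) => e[0] === key);
    if (entry) { entry[1] = value; return; }
    bucket.push([key, value]);
    this.size++;
    if (this.size / this.buckets.length > this.maxLoad) this.splitNext();
  }

  splitNext() {
    const old = this.buckets[this.split];
    this.buckets[this.split] = [];
    this.buckets.push([]); // image bucket at index split + n0 * 2^level
    this.split++;
    if (this.split === this.n0 * 2 ** this.level) {
      this.level++; // round finished: every bucket of this level was split
      this.split = 0;
    }
    for (const [k, v] of old) this.buckets[this.bucketIndex(k)].push([k, v]);
  }
}

const table = new LinearHashTable();
for (let i = 0; i < 100; i++) table.set(`url-${i}`, i);
console.log(table.get('url-42'), table.buckets.length);
```

The table grows by one bucket per split rather than doubling all at once, which is what makes the scheme attractive for a file-backed store.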

Week 3 - February 16th 2021

Discussed work done so far on Del 1, and slides about WARC files and the incorrect CDX file format

Things to do for this week:

  • Continue work on Del 1 by replacing the simple JSON store with a linear hashing implementation
  • Look up the correct CDX file format and fix slides from previous week
  • Read paper [1] about dynamic hashing and create slides that summarize the paper

Week 2 - February 9th 2021

Discussed the CS 297 document, set up an account in the system where this blog resides, and discussed Del 1 details.

Things to do for this week:

  • Work on Del 1, which is implementing a simple document key store with simple timing tests
  • Look up WARC and CDX file formats and create slides for them

Week 1 - February 2nd 2021

First meeting. Finalized the project topic and discussed the CS 297 Proposal.

Things to do for this week:

  • Work on CS 297 Proposal document

Begin CS297