CS298 Proposal

Schemes to make ARIES and XML work in harmony

Thien An Nguyen (thien_an9@yahoo.com)

Advisor: Dr. Chris Pollett

Committee Members:

Dr. Melody Moh, Dept. of Computer Science, SJSU (moh@cs.sjsu.edu)

Dr. Tsau Young Lin, Dept. of Computer Science, SJSU (tylin@cs.sjsu.edu)

Abstract:

One of the most important issues for a database product is its availability for transactions. When a failure occurs, the Recovery Component of a database product relies on its log records to recover data. The database is unavailable while recovery is in progress, and recovery takes longer when there are more log records to process. As eXtensible Markup Language (XML) gains popularity as a way of sharing information over the Internet, the need for databases that store XML documents natively is increasing. Relational and hierarchical databases that have been adapted to store XML often use a separate layer built on top of their existing storage structure. For native XML databases, most research papers study storage management, specifically logical and physical storage, but rely on existing relational database recovery algorithms such as Algorithms for Recovery and Isolation Exploiting Semantics (ARIES). It is therefore natural to look for a faster and more fine-grained recovery approach that is tailored to XML data. In CS298, we will implement a modified version of ARIES, based on NATIX [Kan03], a native XML Database Management System (DBMS), so that it works with a native XML database. NATIX uses an improved way of writing log records to enhance the performance of the Recovery Manager. We will use the toy XML database that we built in CS297 and work on its Recovery Manager. Our goal for this project is to build a recovery management system for this database so that experiments with native XML databases and recovery can be carried out.

CS297 Results

  • Defined our own Data Manipulation Language (DML).
  • Built a parser.
  • Built the overall structure of our database system.
  • Provided basic operations such as create, insert, update, delete, store, and print the XML tree.

Proposed Schedule

Week 1 (Aug 25th): First pass of Log Manager.
Week 2 (Sep 1st): Second pass of Log Manager according to Dr. Pollett's revisions.
Week 3 & 4 (Sep 8th - Sep 15th): First and second pass of Archive/checkpoint.
Week 5 (Sep 22nd): Testing and debugging.
Week 6 & 7 (Sep 29th - Oct 6th): First and second pass of Recovery Manager, and rollback operation.
Week 8 & 9 (Oct 13th - Oct 20th): First and second pass of Restart Recovery.
Week 10 & 11 (Oct 27th - Nov 20th): Testing and debugging.
Week 12 & 13 (Nov 10th - Nov 23rd): Writing Final Report.
Week 14 (Nov 24th): Preparation for Oral Presentation.
Week 15 & 16 (Dec 1st - Dec 13th): Final Oral Presentation.

Key Deliverables:

  • Software
    • Toy native XML database whose query and DML engine supports the basic operations: create database, insert, update, delete, select, commit, rollback, and print data from the database. Commands can be read from a file, and a transaction number can be associated with each command.
    • Log Manager. The Log Manager will be able to write a single composite log record for multiple updates made by the same transaction to the same record. This is called subsidiary logging. When a rollback is specified, all of the transaction's updates will be undone.
    • Checkpoint and restart recovery. A checkpoint command can be specified in a test script. When the database system is shut down and then started up again, updated data for committed transactions that had not yet been flushed to stable storage will be recovered.
  • Report
    • Description of our native XML Database and CS298 Report
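
The composite-record idea in the Log Manager deliverable can be illustrated with a minimal sketch. It is not the NATIX implementation or the planned CS298 design; the names `CompositeRecord`, `LogManager`, `log_update`, `publish`, and `rollback` are hypothetical, and the in-memory dictionary `store` stands in for the XML records. The sketch assumes that a composite record only needs the before-image of the first update and the after-image of the latest one.

```python
from dataclasses import dataclass


@dataclass
class CompositeRecord:
    """One log record covering several updates by one transaction on one record."""
    txn_id: int
    record_id: str
    before: str            # value before the first merged update (for undo)
    after: str             # value after the latest merged update (for redo)
    update_count: int = 1


class LogManager:
    """Hypothetical log manager that merges updates before publishing them."""

    def __init__(self):
        self.log = []       # published log records, oldest first
        self.pending = {}   # (txn_id, record_id) -> CompositeRecord

    def log_update(self, txn_id, record_id, before, after):
        key = (txn_id, record_id)
        rec = self.pending.get(key)
        if rec is None:
            self.pending[key] = CompositeRecord(txn_id, record_id, before, after)
        else:
            # Merge into the existing record: keep the oldest before-image.
            rec.after = after
            rec.update_count += 1

    def publish(self, txn_id):
        """Move the transaction's pending composite records into the log."""
        for key in [k for k in self.pending if k[0] == txn_id]:
            self.log.append(self.pending.pop(key))

    def rollback(self, txn_id, store):
        """Undo the transaction's updates, newest first."""
        for key in [k for k in self.pending if k[0] == txn_id]:
            store[key[1]] = self.pending.pop(key).before
        for rec in reversed(self.log):
            if rec.txn_id == txn_id:
                store[rec.record_id] = rec.before


# Usage: transaction 1 updates record "n1" three times, then rolls back.
store = {"n1": "a"}
lm = LogManager()
for new_value in ["b", "c", "d"]:
    lm.log_update(1, "n1", store["n1"], new_value)
    store["n1"] = new_value
merged = lm.pending[(1, "n1")]   # a single record instead of three
lm.rollback(1, store)
```

A full ARIES-style rollback would also write compensation log records; that step is left out here to keep the sketch short.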

Innovations and Challenges

  • XML recovery is a new and active research topic in databases.
  • We are implementing several hard subsystems of a DBMS.
  • We can quickly simulate various real-life situations by creating appropriate script files for transaction schedules, and use them to compare various tweaks to the basic algorithm.
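
As a rough illustration of a script-driven experiment, the sketch below runs a small transaction schedule with a checkpoint, simulates a crash, and then recovers the updates of committed transactions from the log. Everything about it is an assumption made for illustration: the script syntax (`<txn> update <record> <value>`, `<txn> commit`, `<txn> rollback`, `checkpoint`, `crash`), the function names `run_script` and `restart`, and the simplification that updates reach the buffer only at commit. It is not the ARIES algorithm itself, only a toy in its spirit.

```python
def run_script(lines):
    """Run a schedule; return (stable_store, log) as they would survive a crash."""
    stable = {}      # stable storage: survives a crash
    buffer = {}      # committed values not yet flushed: lost in a crash
    writes = {}      # txn -> {record: value}, uncommitted updates
    first_lsn = {}   # txn -> log index of its first update
    log = []         # assumed to be on stable storage (write-ahead logging)
    for line in lines:
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "checkpoint":
            stable.update(buffer)
            buffer.clear()
            # Redo must start at the first update of any still-active transaction.
            redo_from = min(first_lsn.values(), default=len(log))
            log.append(("checkpoint", redo_from))
        elif parts[0] == "crash":
            break
        else:
            txn, cmd = int(parts[0]), parts[1]
            if cmd == "update":
                first_lsn.setdefault(txn, len(log))
                writes.setdefault(txn, {})[parts[2]] = parts[3]
                log.append(("update", txn, parts[2], parts[3]))
            elif cmd == "commit":
                buffer.update(writes.pop(txn, {}))
                first_lsn.pop(txn, None)
                log.append(("commit", txn))
            elif cmd == "rollback":
                writes.pop(txn, None)
                first_lsn.pop(txn, None)
                log.append(("rollback", txn))
    return stable, log


def restart(stable, log):
    """Redo updates of transactions that committed after the last checkpoint."""
    ckpt_index, redo_from = -1, 0
    for i, rec in enumerate(log):
        if rec[0] == "checkpoint":
            ckpt_index, redo_from = i, rec[1]
    recovered = dict(stable)
    pending = {}     # txn -> updates seen so far during the scan
    for i in range(redo_from, len(log)):
        rec = log[i]
        if rec[0] == "update":
            pending.setdefault(rec[1], {})[rec[2]] = rec[3]
        elif rec[0] == "rollback":
            pending.pop(rec[1], None)
        elif rec[0] == "commit":
            updates = pending.pop(rec[1], {})
            if i > ckpt_index:   # earlier commits were flushed by the checkpoint
                recovered.update(updates)
    return recovered


# Usage: transaction 2 commits after the checkpoint; transaction 3 never commits.
script = [
    "1 update n1 a",
    "2 update n2 x",
    "1 commit",
    "checkpoint",
    "2 update n2 y",
    "3 update n3 q",
    "2 commit",
    "crash",
]
stable, log = run_script(script)
recovered = restart(stable, log)
```

In this run only `n1` has reached stable storage at the time of the crash; the restart scan begins at the first update of transaction 2 rather than at the start of the log, which is the kind of saving a checkpoint is meant to provide.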

References:

1. [Kan03] Core Technologies for Native XML Database Management System. Carl-Christian Kanne. 2003.

2. [W3C] Namespaces in XML, World Wide Web Consortium 14-January-1999.

3. [Mohan90] ARIES: A Transaction Recovery Method Supporting Fine-Granularity Locking and Partial Rollbacks Using Write-Ahead Logging. C. Mohan, D. Haderle, B. Lindsay, H. Pirahesh, P. Schwarz. ACM Transactions on Database Systems, 1990.

4. [Mohan99] Repeating History Beyond ARIES. C. Mohan. Proc. 25th International Conference on Very Large Data Bases, Edinburgh, 1999.

5. [Mohan93] ARIES/LHS: A Concurrency Control and Recovery Method Using Write-Ahead Logging for Linear Hashing with Separators. C. Mohan. Proceedings of the 9th IEEE International Conference on Data Engineering, 1993.

6. [Lahiri01] Fast-Start: Quick Fault Recovery in Oracle. T. Lahiri, A. Ganesh, R. Weiss, A. Joshi. ACM SIGMOD, 2001.