More Database Services

CS185c

Chris Pollett

Feb. 18, 2010

(These notes are based on slides which are copyright IBM)

Outline

More Data Manager
Buffer Manager
z/OS services DB2 makes use of

Introduction

Last day, we discussed the RDS in detail and how it processes SQL statements.
We then began talking about the Data Manager and its major responsibilities.
Today, we will continue talking about the DM in detail and then talk about the Buffer Manager in detail.

Data Manager and Data Manipulation

image showing how a select-rt maps to pages

The data manipulation services of the DM allow other DB2 components to load, retrieve, insert, change, or delete record instances that are stored in DB2 databases.
These services perform any locking needed to concurrent access to the data and to ensure that only committed modifications can be accessed by another transaction.
These services also perform the logging to ensure that modifications to data objects can be backed out or redone in the event of a system failure or a media recovery operation.
Data manipulation services also enforce referential constraints (for example, foreign key constraints).
The Data Manager uses the buffer manager component to read pages from DASD and write pages to DASD.

Data Manipulation without Using Index

A picture of the DB2 components of DM and BM involved in a Relation scan

Runtime requests DM for a record.
DM requests BM for a page.
DM evaluates stage 1 predicate.
DM copies column data from buffer pool to thread storage for qualified row.

Data Manipulation Using Index Access

A picture of the DB2 components of DM and BM involved when an Index is involved

The runtime requests the DM for a row.
The DM passes the index search ranges to the IM
The IM request an index page from BM.
The IM returns RID to DM.
The DM requests the page of this RID (row ID) from BM.
The DM applies stage 1 predicates.
The DM copies column data from buffer pool to thread private buffer.

Index Manager

The IM is responsible to handle the folllowing operations on indexes:
- Load
- Retrieve
- Insert
- Change
- Delete
The IM uses the Buffer Manager to get and write index pages

Index Management Scenario

A picture showing how index pages can live in the buffer pool

The interface between the IM and the BM is very similar to the interface between the DM and the BM.
The showing of the entire index tree in the buffer pool is only for the purpose of illustration. Only the index pages which contain qualified rows are loaded into the buffer pool.

Relation Scan versus Index Scan

A Relation Scan scans all the pages of a table, it involves a lot of I/Os usually, and therefore can be slow.
An Index Scan only needs to scan or do a search in an index to file an RID. That RID is then used to retrieve the page containing a given row.
An index row is typically much smaller than a table row, so many more of them can fit in page.
Indexes also typically use a B-tree structure which is a balanced search tree optimized to have a branching factor proportional to the number of index rows that can fit in a page.
Because of this, only logarithmic base this branching factor many index pages need to be searched to find the RID.

Lock Management

The DM uses the IRLM to lock resources
Locking is used to preserve data consistency and integrity in a conccurrent setting
It does this by prohibiting concurrent updates to the same resource
There are two kinds of locks: explicit and implicit.
Explicit locking services allow the DB2 users to lock specific table spaces, tables, or partitions.
The DM implicitly, acting on behalf of DBMS, locks DBDs, table spaces, partitions, pages or rows during the resource allocation process.

Log Management

The Log Manager records changes that are done on the database.
These are stored in log files.
The two main kinds of log records (there are others we will talk about later) are: undo log records and redo log records.
An "undo" log entry records a data value prior to a change.
A "redo" log entry records a data value after a change.
During a recovery restart, the undo log entries are used to "back-out" uncommitted changes that went to DASD, redo log entries are used to re-apply lost committed changes.

Temporary File Management

There are several different kinds of temporary files in DB2:
- Work files -- Used to store result of sort. These are created and deleted by RDS
- Created global temporary table (CGTT) -- these are used by applications to share temporary results among threads. These are created by DDL and have definitions stored in the catalog
- Declared global temporary table (DGTT) -- these are also used by applications to share temporary results among threads and are created using DDL statements. They aren't stored in the catalog though. They are dropped when the thread ends
DM data manipulation services are used to insert into and retrieve from these temporary files.

Recovery Management

The Recovery Manager applies undo/redo logs to restore DB2 to a consistent state.
Undoing of uncommitted changes can occur during: a DB2 restart, an ABORT, or a ROLLBACK.
Redoing lost committed changes can occur during a DB2 restart

DM for INSERT Statement with Cluster Index

A picture with what DM does for a clustered index insert

The Runtime invokes the DM to process an INSERT RT
The DM invokes the IM to find a candidate page to insert into
The DM writes column values to the candidate page
All indexes are updated

DM for INSERT Statement without Cluster Index

A picture with what DM does for a non - clustered index insert

The Runtime invokes the DM to process an INSERT RT
The DM invokes the BM to find a page at the end of the table
The DM writes column values to the candidate page
Then all indexes are updated

DM for UPDATE and DELETE Statements

A picture with what DM does for update and delete

A SELECT RT is first processed to find the the DM row to be updated or deleted
Then the SELECT RT specifies whether access is via index
After a row is qualified, the Runtime invokes the DM to process the positioned update or delete

Buffer Manager

Recall the Buffer Manager controls the movement of pages between their page sets and virtual storage buffers
DB2 data is stored in linearly addressable DASD data set collections called page sets. Within a page set, the data is stored in fixed-sized units called pages.
DB2 functions, for examples, Data Manager or Utilities, that process these data do not directly access the page sets, but instead operate on virtual storage copies of the pages.
The Buffer Manager (BM) is the component which controls the movement of these pages between their page sets and virtual storage buffers.

Major Functions of Buffer Manager

Buffer management
- Services DM and IM get page requests
- Deferred write pages to DASD
- Manages buffer pools usage
- Page latch for page integrity
Buffer management
- Open and close
- Preformat disk

Buffer Pools Management Strategy

Here are the major operations that the BM needs to perform and the algorithm or data structure used to do it:

Search -- uses a hashing table
Replace -- uses one of Least Recently Used (LRU), Most Recently Used (MRU), First-In-First-Out (FIFO) algorithms. The choice from amongst these can be specified in the DB2 ZPARM
Write -- Deferred write

Buffer Pools Management Queues

The hash table the BM uses to look up pages makes use of a linked-list structure to handle page control information blocks which have the same hash value.
A queue of page control information blocks is also maintained for each buffer pool to keep track of the least recently used page. A slightly different queue structure called the SLRU queue is used to handle page control information blocks when sequential pre-fetches are done.
One last queue, the deferred write queue, is used to handle information about when a page is updated. If a page is written to, its page control information block is marked dirty and inserted into a deferred write queue under the corresponding data set anchor. When the queue threshold is reached for a data set, the dirty pages in the queue are written out to the disk. When the total number of queue elements for all data sets reach another preset threshold, all dirty pages are written out to their corresponding data sets.

Prefetch

A prefetch reads a set of pages into the buffer pool with only one asynchronous I/O
There are three types of prefetches:
- A Sequential Prefetch reads a sequential set of pages. The maximum number of pages for sequential pre-fetch is determined by the size of the buffer pool used. It might be used during a table scan
- A List Prefetch reads a set of data pages determined by a list of RIDs taken from the index. IT might be used for certain kinds of joins.
- A Dynamic pre-fetch is a kind of prefetch done by DB2 on the fly based on observations of past page accesses

Controlling the Number of Prefetch Pages in the Buffer Queue

If prefetching is done too aggressively the buffer pool can fill with pages from prefetches.
Many of these pages might not be needed.
To avoid this problem there is a parameter VPSEQT, the Sequential steal threshold, which can be set.
This effects the allowed length of sequential LRU, and so can be used to limit the total number of prefetch pages.

Automatic Page Recovery

Another problem that the BM might sometimes need to handle is if a page is temporarily marked broken during a page update.
The DM makes a broken page mark in the page header at the start of buffer page update and remove the mark when the update is completed.
If the thread "abends" (abnormal end of task -- IBM/360 lingo) before the buffer page update is completed, then the page stays broken.
On the next reference of this buffer page, BM has the mechanism to automatically refreshes the page from DASD and applies logs to back out the uncommitted changes.

Data Set Management

The BM manages data set open and close activity>
The BM preformats DASD space
- DASD space needs to be preformatted before a row or key can be inserted
- First preformat is done at
  CREATE TABLESPACE or CREATE INDEX
- Subsequent preformats are triggered by inserts
- Each preformat formats 2 cylinders except for very small tables/indexes

New Topic -- Z/OS services that DB2 make use of

DB2 makes use of several different z/OS services...
These include:
- The Physical Storage Manager
- The Lock Manager
- The Security Server
Let's spend a moment to look at each of these...

The Physical Storage Manager

We have said already that we have two different notions of storage: logical and physical.
The logical notions of storage are high level abstractions like databases, table spaces, and tables.
The physical notions of storage corresponds to kinds of storage groups and how the data is actually written to DASD.
Physical storage also come in different type from the fast and expensive RAM to DASD to finally tape which is slowest but perhaps cheapest.
The flow of data between these different kinds of storage needs to be managed.
You could imagine an end-user trying to manage this process but that would be cumbersome.
DB2 typically manages logical notions of storage, but it can also be set up to manage data motion between different types of physical storage.
The perhaps best way to manage physical storage is to have DB2 use the z/OS system managed storage (SMS).
It is also the required way to manage physical storage for table spaces and index spaces larger than 4 GB.
The main z/OS subsystem associated with SMS is the Data Facility Storage Management Subsystem (DFSMS).

Making use of SMS

In order to tell DB2 to use SMS you use VOLUMES(*) when you specify the storage group.
In the DB2 setting, SMS will take care of the following things
- Security, placement, migration, backup, recall, recovery, and deletion
- It also ensures availability of current data

More Database Services

Outline

Introduction

Data Manager and Data Manipulation

More on Data Manipulation

Data Manipulation without Using Index

Data Manipulation Using Index Access

Index Manager

Index Management Scenario

Relation Scan versus Index Scan

Lock Management

Log Management

Temporary File Management

Recovery Management

DM for INSERT Statement with Cluster Index

DM for INSERT Statement without Cluster Index

DM for UPDATE and DELETE Statements

Buffer Manager

Major Functions of Buffer Manager

Buffer Pools Management Strategy

Buffer Pools Management Queues

Prefetch

Controlling the Number of Prefetch Pages in the Buffer Queue

Automatic Page Recovery

Data Set Management

New Topic -- Z/OS services that DB2 make use of

The Physical Storage Manager

Making use of SMS