Outline
- Persistent Data structures in DB2 up to tables
- Quiz
- Storage Groups, Indexes, the Catalog, the Directory
Introduction
- Last week, we introduced DB2 for z/OS.
- We talked about its history, some of its features, and the environment in which it runs.
- We then began to talk about data structures that DB2 uses and dove right in and began to talk about
types of table spaces.
- We went into a little bit of detail on how segmented table spaces work.
- Today, we are going to back up talk a little about what data structures DB2 has before resuming again our discussion of table spaces.
Bird's eye View of DB2 Data Structures
The following picture illustrates the principle data sturctures of DB2:
Persistent Structures - Databases
- A database is a logical collection of table spaces and index spaces.
- For DB2, the default database is DSNDB04 -- data source name data base 4
- There are two kinds of databases in DB2:
- User Databases -- there can be up to 65271 of these. Each can contain multiple tablespaces, indexes, etc.
- System Database -- each DB2 subsystem has only two system databases: the Catalog (DSNDB06) and the Directory DSNDB01
Persistent Structures - Table Spaces
- A DB2 table space is a logical container which consists of a set of pages.
- A table space is always associated with a storage group. That's how data in a table space is stored in DASD (Direct Access Storage Device such as Disk and in contrast to other kinds of storage like tape).
- A table space can have one or more tables in it.
- As we said last day, a table space can consist of a number of VSAM linear data sets (LDS). A VSAM LDS can span one or more volumes. Table spaces are divided into equal-sized units, called pages, which are written to or read from a disk in one operation.
- We listed five different kinds (segmented, partitioned, etc.) of table spaces last day.
Persistent Structures - Tables
- Tables are logical structures maintained by DB2.
- Tables are made up of columns and rows.
- The rows of a relational table have no fixed order.
- The order of the columns, however, is always the order in which you specified them when you defined the table.
- We call the data stored in a given column of a given row a value.
- Sometimes I will use the word field to mean column.
- Each column in table has a type (a fixed set such as DATE, CHAR, etc). Any value that appears in a given column of any row must be of the type of that column.
- A row is a sequence of values such that the `n`th value is a value of the `n`th column of the table
- Every table must have one or more columns, but the number of rows can be zero.
Quiz
Which of the following statements is true?
- DB2 and IMS cannot interoperate with each other.
- The z/OS platform is 32 bit.
- Each segment in a segmented table space has the same number of pages.
Types of Tables
DB2 supports five main types of tables:
- Base table -- A table created with the SQL statement CREATE TABLE and used to hold persistent user data.
- Temporary table -- can be created by either the SQL statement CREATE GLOBAL TEMPORARY TABLE (a created temporary table) or DECLARE GLOBAL TEMPORARY TABLE (a declared temporary table) and used to hold data temporarily, such as the intermediate results of SQL transactions
- Materialized query table -- A table defined with the SQL CREATE TABLE statement, that contains materialized data that is derived from one or more source tables.
- Auxiliary table -- A special table that holds Large OBject data (LOB )and XML data.
- Clone table -- A table that is structurally identical to a base table. The clone table is created in a different instance of the same table space as the base table, is structurally identical to the base table in every way, and has the same indexes, before triggers, and LOB objects or XML.
Persistent Structures - Page
- Data in table spaces is stored and allocated in pages which as we mentioned before can come in 4, 8, 16, or 32KB sizes.
- The size of the data page is determined by the buffer pool in which you define the table space.
- For example, a table space that is defined in a 4KB buffer pool has 4KB page sizes, and one that is defined in an 8KB buffer pool has 8KB page sizes.
Persistent Structures - Storage Group
- A storage group is a set of disks volumes used to hold the data sets
- DB2 SQL commands can be used to manipulate storage groups:
CREATE STOGROUP SG01 VOLUMES (vol1, vol2, vol3) VCAT DSNCAT;
ALTER STOGROUP SG01 REMOVE VOLUMES (vol3);
- All volumes of a given storage group must have the same device type. However, parts of a single database can be stored in different storage groups.
- There are two ways to manage the user data sets: One is user-managed, another is DB2-managed.
- Since one of the reasons for using DB2 is performance and reliability, you might manage your own data sets to use different IBM tools for guaranteeing these properties.
More on Storage
There are also a couple of options for how storage for table spaces is managed with regards to DB2. You can have:
- DB2 managed table spaces
- Commonly used for user data table spaces.
- DB2 allocates the table spaces on the volumes contained in the storage group.
- Requires user to make sure you don't run out of space.
- System managed table space
- DFHSMS (Storage Management System) is an extension to z/OS that manages disk,
tape, and optical storage.
- It handles allocation and expansion requests so you don't have to worry about them.
Ways you can manipulate a storage group
- You can allocate storage for table spaces and indexes
- You can define the necessary VSAM data sets
- You can extend and delete VSAM data sets
- You can alter VSAM data sets
Persistent Storage Structures - Indexes
- An index is an ordered set of (key, row pointer) pairs where the row pointers point to rows of a table.
- Typically, such pairs are much smaller than a table row, and so DB2 can use indexes to improve performance.
- Furthermore, sometimes specialized data structures such as B+-trees can be used so that given a key one can find the (key, row pointer) pair; and hence, the table row, quickly.
- Since table rows are often not ordered by a given column's values, if you want to search for a row of a given value, and you don't have an index you often have to search each row in turn. This is called a table scan.
Types of Indexes
DB2 supports several types of indexes which we will talk more about as the semester progresses.
- Unique index
- Clustering index
- Padded index
- Not padded index
- Index on expression
- XML index
- Compressed index
- Data-partitioned secondary index
- Non-partitioned secondary index
System Database - Catalog
- So far we have been talking about persistent storage structures
- We also have been describing mainly user databases.
- We are now going to consider the DB2 Catalog which is a persistent system database
- We are also going to consider a slightly faster system database known as the directory
- We will then consider how these interact with in-memory storage structures.
System Database - More Catalog
- The Catalog is one place where DB2 stores information about what databases and tables are available in DB2. i.e.,
meta data.
- When an object inside a database is created, DB2 assigns it an identifier that is stored in the DB2 catalog.
- These OBject IDentifiers are known as OBID.
- A table space or an index space is assigned a PSID, or Page Set ID, because these objects are associated with physical data sets.
- A database is assigned a DataBase Identifier, or DBID.
- DB2 Catalog contains many system tables, for example SYSDATABASE, SYSTABLESPACE, SYSTABLES, SYSCOLUMNS. The catalog tables can be queried by DB2 users.
Catalog Structure
Major table spaces of the Catalog are :
- SYSDBASE -- contains object information
- SYSPLAN -- contains information about plans, DBRMs, statements in the plan, authorization, plan dependencies
- SYSCOPY -- contains recovery information
- SYSTATS -- contains statistics information
- SYSSTR -- contains string conversion information
- SYSDBAUTH -- contains database information and authorization information
- SYSUSER -- contains user's system privileges (sysadmin,sysoper,sysaduit)
System Database - Directory
- The Directory is a second dictionary-like structure in addition to the DB2 Catalog to store meta information for DB2.
- Its for DB2's internal use only, and are not accessible by SQL.
- It is designed to be faster for DB2 to get information out of than the Catalog, so it uses a binary format.
Directory Structure
Major table spaces of the Directory are :
- SCT02 Skeleton cursor (SKCT) -- Contains the internal form of SQL statements that are contained in an application. When you bind a plan, DB2 creates a skeleton cursor table in SCT02.
- SPT01 Skeleton package (SKPT) -- Similar to SCT02, except that the skeleton package table is created when you bind a package.
- SYSLGRNX Log range -- Tracks the opening and closing of table spaces, indexes, or partitions.
- SYSUTILX System utilities -- Contains a row for every utility job that is running.
- DBD01 Database descriptor (DBD) -- Contains internal information, called database descriptors (DBDs), about the databases that exist within the DB2 subsystem.