Chris Pollett > Old Classes > CS157a
HW3 Solutions Page

5.23
(a) What is the total capacity of a track and what is its useful capacity?
There are 20 blocks (512 bytes each) and 20 interblock gaps (128 bytes each). Thus, the total capacity of a track is 20*(512+128) = 12800 bytes. The useful capacity is 20*512 = 10240 bytes.
(b) How many cylinders are there?
400 (the same as the number of tracks on a surface).
(c) What are the total capacity and the useful capacity of a cylinder?
This will be the corresponding capacity of a track times the number of tracks in a cylinder. I.e., total capacity of a cylinder = 12800*15*2 = 384,000 bytes (*2 because the disks are double-sided); useful capacity of a cylinder = 10240*15*2 = 307,200 bytes.
(d) What are the total capacity and the useful capacity of a disk pack?
This will be the corresponding capacity of a cylinder times the number of cylinders in the disk pack. I.e., total capacity of a disk pack = 384,000*400 = 153,600,000 bytes; useful capacity of a disk pack = 307,200*400 = 122,880,000 bytes.
(e) What are the transfer rate in bytes/msec and the block transfer time in msec? What is the rotational delay in msec? What is the bulk transfer rate?
Using the formulas from Appendix A, the transfer rate is the track size divided by the number of milliseconds per revolution:
tr = 12800/(60*1000/2400) = 512 bytes/msec.
The block transfer time in msec is the block size divided by tr:
btt = 512/512 = 1 msec.
The rotational delay is half a revolution on average:
rd = (1/2)*(60*1000/2400) = 12.5 msec.
The bulk transfer rate is (block size/(block size + interblock gap))*tr:
btr = (512/640)*512 = 409.6 bytes/msec.
(f) How much time does it take on average in msec to locate and transfer a single block, given its block address?
This can be calculated as seek time + rotational delay + btt, i.e., 30 + (1/2)*(60000/2400) + 1 = 43.5 msec.
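The arithmetic in parts (a) through (f) can be double-checked with a few lines of Python; all the constants below are the ones given in the problem statement.

```python
# Constants from problem 5.23: 512-byte blocks, 128-byte gaps, 20 blocks/track,
# 400 tracks/surface, 15 double-sided disks, 2400 rpm, 30 msec average seek.
BLOCK, GAP, BLOCKS_PER_TRACK = 512, 128, 20
TRACKS_PER_SURFACE, SURFACES = 400, 15 * 2
RPM, SEEK = 2400, 30

track_total = BLOCKS_PER_TRACK * (BLOCK + GAP)   # (a) 12800 bytes
track_useful = BLOCKS_PER_TRACK * BLOCK          # (a) 10240 bytes
cyl_total = track_total * SURFACES               # (c) 384,000 bytes
cyl_useful = track_useful * SURFACES             # (c) 307,200 bytes
pack_total = cyl_total * TRACKS_PER_SURFACE      # (d) 153,600,000 bytes
pack_useful = cyl_useful * TRACKS_PER_SURFACE    # (d) 122,880,000 bytes

msec_per_rev = 60 * 1000 / RPM                   # 25 msec per revolution
tr = track_total / msec_per_rev                  # (e) transfer rate, 512 bytes/msec
btt = BLOCK / tr                                 # (e) block transfer time, 1 msec
rd = msec_per_rev / 2                            # (e) rotational delay, 12.5 msec
btr = BLOCK / (BLOCK + GAP) * tr                 # (e) bulk transfer rate, 409.6 bytes/msec
single_block = SEEK + rd + btt                   # (f) 43.5 msec
```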
(g) Calculate the average time it would take to transfer 20 random blocks, and compare this with the time it would take to transfer 20 consecutive blocks.
If the random blocks are on random cylinders this would be 20*43.5 = 870 msec. If the random blocks were on the same cylinder the answer would be 30 + 20*13.5 = 300 msec. If the blocks are consecutive, the formula is:
seek time + rotational delay + (number of blocks * block size / bulk transfer rate)
= 30 + 12.5 + (20*512/409.6) = 67.5 msec.

5.27
Load these records into the file in the given order, using the hash function h(K) = K mod 8. Calculate the average number of block accesses for a random retrieval on Part#.
In my solution below, Ri represents a record pointer to record i in the overflow file.

Add 2369
Bucket No | Bucket Data
0 |
1 | 2369
2 |
3 |
4 |
5 |
6 |
7 |

Add 3760
Bucket No | Bucket Data
0 | 3760
1 | 2369
2 |
3 |
4 |
5 |
6 |
7 |

Add 4692
Bucket No | Bucket Data
0 | 3760
1 | 2369
2 |
3 |
4 | 4692
5 |
6 |
7 |

Add 4871
Bucket No | Bucket Data
0 | 3760
1 | 2369
2 |
3 |
4 | 4692
5 |
6 |
7 | 4871

Add 5659
Bucket No | Bucket Data
0 | 3760
1 | 2369
2 |
3 | 5659
4 | 4692
5 |
6 |
7 | 4871

Add 1821
Bucket No | Bucket Data
0 | 3760
1 | 2369
2 |
3 | 5659
4 | 4692
5 | 1821
6 |
7 | 4871

Add 1074
Bucket No | Bucket Data
0 | 3760
1 | 2369
2 | 1074
3 | 5659
4 | 4692
5 | 1821
6 |
7 | 4871

Add 7115
Bucket No | Bucket Data | Overflow file
0 | 3760 |
1 | 2369 |
2 | 1074 |
3 | 5659 7115 |
4 | 4692 |
5 | 1821 |
6 | |
7 | 4871 |

Add 1620
Bucket No | Bucket Data | Overflow file
0 | 3760 |
1 | 2369 |
2 | 1074 |
3 | 5659 7115 |
4 | 4692 1620 |
5 | 1821 |
6 | |
7 | 4871 |

Add 2428
Bucket No | Bucket Data | Overflow file
0 | 3760 | (1) 2428
1 | 2369 |
2 | 1074 |
3 | 5659 7115 |
4 | 4692 1620 R1 |
5 | 1821 |
6 | |
7 | 4871 |

Add 3943
Bucket No | Bucket Data | Overflow file
0 | 3760 | (1) 2428
1 | 2369 |
2 | 1074 |
3 | 5659 7115 |
4 | 4692 1620 R1 |
5 | 1821 |
6 | |
7 | 4871 3943 |

Add 4750
Bucket No | Bucket Data | Overflow file
0 | 3760 | (1) 2428
1 | 2369 |
2 | 1074 |
3 | 5659 7115 |
4 | 4692 1620 R1 |
5 | 1821 |
6 | 4750 |
7 | 4871 3943 |

Add 6975
Bucket No | Bucket Data | Overflow file
0 | 3760 | (1) 2428 (2) 6975
1 | 2369 |
2 | 1074 |
3 | 5659 7115 |
4 | 4692 1620 R1 |
5 | 1821 |
6 | 4750 |
7 | 4871 3943 R2 |

Add 4981
Bucket No | Bucket Data | Overflow file
0 | 3760 | (1) 2428 (2) 6975
1 | 2369 |
2 | 1074 |
3 | 5659 7115 |
4 | 4692 1620 R1 |
5 | 1821 4981 |
6 | 4750 |
7 | 4871 3943 R2 |

Add 9208
Bucket No | Bucket Data | Overflow file
0 | 3760 9208 | (1) 2428 (2) 6975
1 | 2369 |
2 | 1074 |
3 | 5659 7115 |
4 | 4692 1620 R1 |
5 | 1821 4981 |
6 | 4750 |
7 | 4871 3943 R2 |
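The whole insertion sequence can be replayed with a short Python sketch. The bucket capacity of 2 records is inferred from the tables above, and the cost model matches the one used in the averaging that follows: one header access plus one bucket access, plus one more access if the record lives in the overflow file.

```python
# Replay the 5.27 insertions: 8 buckets, 2 records per bucket,
# extra records go to an overflow file.
BUCKETS, CAPACITY = 8, 2
keys = [2369, 3760, 4692, 4871, 5659, 1821, 1074, 7115,
        1620, 2428, 3943, 4750, 6975, 4981, 9208]

buckets = [[] for _ in range(BUCKETS)]
overflow = []
for k in keys:
    b = buckets[k % BUCKETS]
    if len(b) < CAPACITY:
        b.append(k)
    else:
        overflow.append(k)  # the home bucket keeps a pointer Ri to this record

# Cost per retrieval: header access + bucket access (+ overflow access)
accesses = [3 if k in overflow else 2 for k in keys]
avg = sum(accesses) / len(keys)  # 32/15, about 2.13
```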
To look up any of the items stored in a bucket requires at least two block accesses: one to look up the block address of the bucket number in the file header, the other to read the block for the bucket. If the record is in a block linked to a bucket, then 3 accesses are required.
Note: If you ignored the block access to the file header I won't take off points.
Given the above, the average number of block accesses will be:
(2+2+2+2+2+2+2+2+3+2+2+2+2+2+3)/15 = 2.13 block accesses.

6.14
(a) Calculate the record size in bytes.
The record size will be the sum of the field sizes: 30+9+9+40+9+8+1+4+4+1 (for the deletion marker) = 115 bytes.
(b) Calculate the blocking factor bfr and the number of file blocks b, assuming an unspanned file organization.
The blocking factor is the block size divided by the record size, rounded down, i.e., [512/115] = 4 records/block. As there are 30,000 records, this requires 30,000/4 rounded up = 7,500 blocks.
(c) Calculate (i) the index blocking factor, (ii) the number of first-level entries and the number of first-level index blocks, (iii) the number of levels needed if we make it into a multilevel index, (iv) the total number of blocks required by the multilevel index, and (v) the number of block accesses needed to search for and retrieve a record from the file using the primary index.
(i) An index record will consist of 9 bytes for the key + 6 bytes for the block pointer = 15 bytes. So bfr_i is [512/15] = 34 records/block.
(ii) The number of first-level entries will be the number of blocks calculated in (b), i.e., 7500 entries. The number of blocks needed for the index is 7500/34 rounded up = 221 blocks.
(iii) If we build a second-level index on (ii), it must store 221 records, again each 15 bytes long. This index will require 221/34 rounded up = 7 blocks to store. As this is more than one block, we need a third-level index. This will store seven records and, as the blocking factor is 34, will fit in one block.
(iv) From our analysis above, the original file has 7500 blocks, the first-level index has 221 blocks, the second-level index has 7 blocks, and the top-level index has 1 block. This gives a total of 7729 blocks. (If you didn't count the 7500, that's okay.)
(v) Binary searching the first-level index will take log2(221) rounded up = 8 block accesses. Following the block pointer to the given block gives one more access, for a total of 9 block accesses on average.
(d) Repeat part (c), but assume we have a
secondary index.
(i) An index record will consist of 9 bytes for the key + 6 bytes for the block pointer = 15 bytes. (Note it would have been okay if you used record pointers, but the values below would have changed.) So bfr_i is [512/15] = 34 records/block.
(ii) The number of first-level entries will be the total number of records, i.e., 30,000. The number of blocks needed for the index is thus 30,000/34 rounded up = 883 blocks.
(iii) If we build a second-level index on (ii), it must store 883 records, again each 15 bytes long. This index will require 883/34 rounded up = 26 blocks to store. As this is more than one block, we need a third-level index. This will store 26 records and, as the blocking factor is 34, will fit in one block.
(iv) From our analysis above, the original file has 7500 blocks, the first-level index has 883 blocks, the second-level index has 26 blocks, and the top-level index has 1 block. This gives a total of 8410 blocks. (If you didn't count the 7500, that's okay.)
(v) Binary searching the first-level index will take log2(883) rounded up = 10 block accesses. Following the block pointer to the given block gives one more access, for a total of 11 block accesses on average.
(e) Now suppose we are creating a secondary index on DEPARTMENTCODE using option 3 from Section 6.1.3. Calculate (i) the index blocking factor, (ii) the number of blocks needed by the level of indirection that stores record pointers, (iii) the number of first-level entries and the number of first-level index blocks, (iv) the number of levels needed if we make it into a multilevel index, (v) the total number of blocks required by the multilevel index, and (vi) the number of block accesses needed to search for and retrieve all records that have a specific DEPARTMENTCODE value using the index.
(i) An index record will consist of 9 bytes for the DEPARTMENTCODE + 6 bytes for the block pointer = 15 bytes. So bfr_i is [512/15] = 34 records/block.
(ii) The level of indirection consists of blocks of record pointers, each of which is 7 bytes long, so [512/7] = 73 will fit in one block. There are 1000 distinct department values and the records are distributed evenly among them, so each department has 30,000/1000 = 30 such record pointers to store, and these all fit in one block. Thus, the level of indirection will require 1000 blocks, one for each DEPARTMENTCODE.
(iii) The number of first-level entries will be the number of blocks in the level of indirection, i.e., 1000. The number of blocks needed for the index is thus 1000/34 rounded up = 30 blocks.
(iv) If we build a second-level index on (iii), it must store 30 records, again each 15 bytes long. These can all be stored in one block, so a third level is not needed.
(v) From our analysis above, the original file has 7500 blocks, the level of indirection has 1000 blocks, the first-level index has 30 blocks, and the top-level index has 1 block. This gives a total of 8531 blocks. (If you didn't count the 7500, that's okay.)
(vi) Here I am assuming they meant using the first-level index. (However, I won't take off points if you calculated using the multilevel index.) Looking up a given DEPARTMENTCODE involves locating it in a first-level index block, which takes log2(30) rounded up = 5 block accesses, then reading the level-of-indirection block together with the block the first record is in. Since we've already read in a block of records for this DEPARTMENTCODE, retrieving each record requires one block access. To retrieve all 30 records will thus take 5 + 1 + 30 = 36 block accesses.
(f) Now suppose the file is ordered by DEPARTMENTCODE and we are creating a clustering index on DEPARTMENTCODE that uses block anchors. Calculate (i) the index blocking factor, (ii) the number of first-level entries and the number of first-level index blocks, (iii) the number of levels needed if we make it into a multilevel index, (iv) the total number of blocks required by the multilevel index, and (v) the number of block accesses needed to search for and retrieve all records that have a specific DEPARTMENTCODE value using the index.
(i) An index record will consist of 9 bytes for the key + 6 bytes for the block pointer = 15 bytes. So bfr_i is [512/15] = 34 records/block.
(ii) The number of first-level entries will be the number of distinct DEPARTMENTCODEs, i.e., 1000 entries. The number of blocks needed for the index is 1000/34 rounded up = 30 blocks.
(iii) If we build a second-level index on (ii), it must store 30 records, again each 15 bytes long. These can all be stored in one block, so a third level is not needed.
(iv) Since we are using block anchors, the original file size will be more than the 7500 blocks calculated before. Each DEPARTMENTCODE has 30 records associated with it, which take 30/4 rounded up = 8 blocks to store. Thus, to store all departments the file has 8000 blocks, the first-level index has 30 blocks, and the top-level index has 1 block. This gives a total of 8031 blocks. (If you didn't count the 8000, that's okay.)
(v) Here I am assuming they meant using the first-level index. (However, I won't take off points if you calculated using the multilevel index.) Looking up a given DEPARTMENTCODE involves locating it in a first-level index block, which takes log2(30) rounded up = 5 block accesses. Once the first record is found, the remaining records with this DEPARTMENTCODE follow it in the same 8 blocks, so retrieving all 30 records takes 5 + 8 = 13 block accesses.
(g) Suppose we want to have a B+-tree access structure on SSN.
Calculate (i) p and pleaf, (ii) the number of leaf-level blocks needed if blocks are 69 percent full, (iii) the number of levels needed if internal nodes are 69 percent full, (iv) the total number of blocks required by the B+-tree, and (v) the number of block accesses needed to search for and retrieve a record from the file.
(i) An internal node of a B+-tree can have up to p tree pointers and p-1 field values, all of which must fit in a 512-byte block. So we have (p*6) + (p-1)*9 <= 512. Solving for p gives 34. For leaf nodes we have pleaf*(record pointer size + field size) + block pointer size <= block size. This gives pleaf*(7+9) + 6 <= 512. Solving this yields pleaf = 31.
(ii) If the leaf nodes are 69 percent full, then on average they will contain .69*31 = approximately 21 record pointers. As there are 30,000 records, this means we need 30,000/21 rounded up = 1429 leaf nodes.
(iii) If internal nodes are 69 percent full, then an internal node will have 34*.69, or approximately 23, pointers. The level above the leaf nodes must have 1429/23 rounded up = 63 nodes to be able to point at all the leaf nodes; the level above this must have at least 3 nodes, and the level above that can be the root. Thus, including the leaf level, we have 4 levels.
(iv) From our analysis in (iii), we will need 1429+63+3+1 = 1496 blocks for our B+-tree.
(v) Searching down this tree to a leaf requires 4 block accesses; one additional access is needed to look up the record, for a total of 5 block accesses on average.
Repeat (g) but for a B-tree rather than a B+-tree.
(i) For a B-tree, pleaf and p are the same, and we must have
p*(block pointer size) + (p-1)*(record pointer size + field size) <= block size.
So (p*6) + (p-1)*16 <= 512. Solving for p gives 24. We will use this value, but for the reasons given in the book on page 174, in real life we would probably choose p to be smaller (like 23), in which case the results below would be different.
(ii), (iii), and (iv): Working from the root to the leaves, assuming everything is 69 percent full, the root has .69*24 = approximately 16 pointers and stores 15 records. The next level has 16 nodes, 240 entries, and 256 pointers; the level after that has 256 nodes, 3840 entries, and 4096 pointers; the last level has 30000-3840-240-15 = 25905 entries and 25905/15 = 1727 nodes. So the total number of blocks we will need is 1727+256+16+1 = 2000.
(v) The average search time can be estimated as the sum over levels of (fraction of entries at that level)*(search cost for that level):
(15/30000)*2 + (240/30000)*3 + (3840/30000)*4 + (25905/30000)*5 = .001 + .024 + .512 + 4.3175 = 4.8545 accesses on average.

6.15
Show how a B+-tree with p=4 and pleaf=3 will grow when the
given items are inserted.
The items are inserted in the order: 23, 65, 37, 60, 46, 92, 48, 71, 56, 59, 18, 21, 10, 74, 78, 15, 16, 20, 24, 28, 39, 43, 47, 50, 69, 75, 8, 49, 33, 38. My solution was approximated using HTML tables, with the links from nodes at a given level to the level below left out. [The step-by-step tree diagrams are not reproduced in this text version.]
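The insertion sequence for 6.15 can be replayed with a minimal B+-tree sketch. This is an illustrative reconstruction, not the original diagrams: it assumes the standard split rules (a leaf split copies its middle key up, an internal split pushes its middle key up), so tie-breaking details may differ from trees drawn by hand.

```python
import bisect

P, PLEAF = 4, 3  # max tree pointers per internal node; max entries per leaf

class Node:
    def __init__(self, leaf=True):
        self.leaf = leaf
        self.keys = []
        self.kids = []  # child nodes; only used by internal nodes

def insert(root, key):
    """Insert key into the tree and return the (possibly new) root."""
    path, node = [], root
    while not node.leaf:                      # descend, remembering the path
        path.append(node)
        node = node.kids[bisect.bisect_right(node.keys, key)]
    bisect.insort(node.keys, key)
    while True:                               # split upward as needed
        limit = PLEAF if node.leaf else P - 1
        if len(node.keys) <= limit:
            return root
        mid = len(node.keys) // 2
        right = Node(leaf=node.leaf)
        if node.leaf:
            right.keys = node.keys[mid:]
            node.keys = node.keys[:mid]
            up_key = right.keys[0]            # leaf split: copy key up
        else:
            up_key = node.keys[mid]           # internal split: push key up
            right.keys = node.keys[mid + 1:]
            right.kids = node.kids[mid + 1:]
            node.keys = node.keys[:mid]
            node.kids = node.kids[:mid + 1]
        if path:
            parent = path.pop()
            i = bisect.bisect_right(parent.keys, up_key)
            parent.keys.insert(i, up_key)
            parent.kids.insert(i + 1, right)
            node = parent
        else:                                 # the root itself split
            new_root = Node(leaf=False)
            new_root.keys = [up_key]
            new_root.kids = [node, right]
            return new_root

def leaf_keys(node):
    """All keys at the leaf level, left to right."""
    if node.leaf:
        return list(node.keys)
    return [k for c in node.kids for k in leaf_keys(c)]

items = [23, 65, 37, 60, 46, 92, 48, 71, 56, 59, 18, 21, 10, 74, 78,
         15, 16, 20, 24, 28, 39, 43, 47, 50, 69, 75, 8, 49, 33, 38]
root = Node(leaf=True)
for k in items:
    root = insert(root, k)
```

Printing the tree after each call to insert reproduces the step-by-step growth the HTML tables showed.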
/*
File: StudentDB.sql
Purpose: Creates the tables for the student database from Fig 1.2 of the book.
Remarks: At the time of the homework you didn't know about constraints on relations, so the SQL below doesn't mention primary keys, foreign keys, etc.
*/
CREATE TABLE STUDENT
(
Name VARCHAR(15) NOT NULL,
StudentNumber INT NOT NULL,
Class INT,
Major VARCHAR(4)
);
CREATE TABLE COURSE
(
CourseName VARCHAR(30) NOT NULL,
CourseNumber VARCHAR(10) NOT NULL,
CreditHours INT,
Department VARCHAR(4)
);
CREATE TABLE SECTION
(
SectionIdentifier INT NOT NULL,
CourseNumber VARCHAR(10) NOT NULL,
Semester VARCHAR(6),
Year CHAR(2),
Instructor VARCHAR(15)
);
CREATE TABLE GRADE_REPORT
(
StudentNumber INT NOT NULL,
SectionIdentifier INT NOT NULL,
Grade CHAR(1)
);
CREATE TABLE PREREQUISITE
(
CourseNumber VARCHAR(10) NOT NULL,
PrerequisiteNumber VARCHAR(10) NOT NULL
);
/*
File: Section.ctl
Purpose: Loads data for the Section table
*/
LOAD DATA
INFILE Section.dat
INTO TABLE Section
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
(SectionIdentifier, CourseNumber, Semester, Year, Instructor)
/*
File: Section.dat
Purpose: Has data for the Section table
*/
"85","MATH2410","FALL","98","King"
"92","CS1310","FALL","98","Anderson"
"102","CS3320","Spring","99","Knuth"
"112","MATH2410","FALL","99","Chang"
"119","CS1310","FALL","99","Anderson"
"135","CS3380","FALL","99","Stone"
/*
File: Section.log
Purpose: Log file for the load process
*/
SQL*Loader: Release 8.1.7.0.0 - Production on Wed Apr 3 04:39:34 2002
(c) Copyright 2000 Oracle Corporation. All rights reserved.
Control File: Section.ctl
Data File: Section.dat
Bad File: Section.bad
Discard File: none specified
(Allow all discards)
Number to load: ALL
Number to skip: 0
Errors allowed: 50
Bind array: 64 rows, maximum of 65536 bytes
Continuation: none specified
Path used: Conventional
Table SECTION, loaded from every logical record.
Add option in effect for this table: INSERT
Column Name Position Len Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
SECTIONIDENTIFIER FIRST * , O(") CHARACTER
COURSENUMBER NEXT * , O(") CHARACTER
SEMESTER NEXT * , O(") CHARACTER
YEAR NEXT * , O(") CHARACTER
INSTRUCTOR NEXT * , O(") CHARACTER
Table SECTION:
6 Rows successfully loaded.
0 Rows not loaded due to data errors.
0 Rows not loaded because all WHEN clauses were failed.
0 Rows not loaded because all fields were null.
Space allocated for bind array: 64500 bytes (50 rows)
Space allocated for memory besides bind array: 0 bytes
Total logical records skipped: 0
Total logical records read: 6
Total logical records rejected: 0
Total logical records discarded: 0
Run began on Wed Apr 03 04:39:34 2002
Run ended on Wed Apr 03 04:39:34 2002
Elapsed time was: 00:00:00.30
CPU time was: 00:00:00.03