HW3 Solutions Page

5.23
(a) What is the total capacity of a track and what is its useful capacity?
There are 20 blocks (512 bytes each) and 20 interblock gaps (128 bytes each). Thus, the total capacity of a track is 20*(512+128) = 12800 bytes. The useful capacity is 20*512 = 10240 bytes.
(b) How many cylinders are there?
400 (the same as the number of tracks on a surface).
(c) What are the total capacity and the useful capacity of a cylinder?
This is the corresponding capacity of a track times the number of tracks in a cylinder,
i.e., total capacity of a cylinder = 12800*15*2 = 384000 bytes
(the factor of 2 is because the disks are double-sided);
useful capacity of a cylinder = 10240*15*2 = 307200 bytes.
(d) What are the total capacity and the useful capacity of a disk pack?
This is the corresponding capacity of a cylinder times the number of cylinders in a disk pack,
i.e., total capacity of a disk pack = 384000*400 = 153,600,000 bytes;
useful capacity of a disk pack = 307200*400 = 122,880,000 bytes.
(e) What are the transfer rate in bytes/msec and the block transfer time in msec? What is the rotational delay in msec? What is the bulk transfer rate?
Using the formulas from Appendix A, the transfer rate is the track size divided by the time for one revolution:
tr = 12800/(60*1000/2400) = 512 bytes/msec.
The block transfer time in msec is block size/tr:
btt = 512/512 = 1 msec.
The average rotational delay is half a revolution:
rd = (1/2)*(60*1000/2400) = 12.5 msec.
The bulk transfer rate is (block size/(block size + interblock gap))*tr:
btr = (512/640)*512 = 409.6 bytes/msec.
(f) How much time does it take on average in msec to locate and transfer a single block, given its block address?
This can be calculated as seek time + rotational delay + btt,
i.e., 30 + 12.5 + 1 = 43.5 msec.
(g) Calculate the average time it would take to transfer 20 random blocks and compare this with the time it would take to transfer 20 consecutive blocks.
If the random blocks are on random cylinders this would be
20*43.5 = 870 msec.
If the random blocks were all on the same cylinder the answer would be
30 + 20*(12.5 + 1) = 300 msec.
If the blocks are consecutive the formula is: seek time + rotational delay + (number of blocks * block size / bulk transfer rate):
30 + 12.5 + (20*512/409.6) = 67.5 msec.
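
As a sanity check, the numbers above can be recomputed with a short Python script like the one below (a sketch I am adding for reference, not part of the graded solution; the variable names are my own).

# Drive parameters used in the solution above.
BLOCK = 512                # bytes per block
GAP = 128                  # bytes per interblock gap
BLOCKS_PER_TRACK = 20
TRACKS_PER_CYL = 15 * 2    # 15 double-sided disks
CYLINDERS = 400
RPM = 2400
SEEK = 30.0                # average seek time in msec

track_total = BLOCKS_PER_TRACK * (BLOCK + GAP)        # 12800 bytes
track_useful = BLOCKS_PER_TRACK * BLOCK               # 10240 bytes
rev_ms = 60 * 1000 / RPM                              # 25 msec per revolution
tr = track_total / rev_ms                             # 512 bytes/msec
btt = BLOCK / tr                                      # 1 msec
rd = rev_ms / 2                                       # 12.5 msec
btr = (BLOCK / (BLOCK + GAP)) * tr                    # 409.6 bytes/msec

one_block = SEEK + rd + btt                           # 43.5 msec
random_20 = 20 * one_block                            # 870 msec
same_cyl_20 = SEEK + 20 * (rd + btt)                  # 300 msec
consecutive_20 = SEEK + rd + 20 * BLOCK / btr         # 67.5 msec

print(track_total, track_useful, tr, btt, rd, btr)
print(one_block, random_20, same_cyl_20, consecutive_20)
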
5.27 Load these records into the file in the given order, using the hash function h(K) = K mod 8. Calculate the average number of block accesses for a random retrieval on Part#.
In my solution below, each bucket is one block and holds at most two records, and Ri denotes a record pointer to record i in the overflow file.
Add 2369

  Bucket 0:
  Bucket 1: 2369
  Bucket 2:
  Bucket 3:
  Bucket 4:
  Bucket 5:
  Bucket 6:
  Bucket 7:

Add 3760

  Bucket 0: 3760
  Bucket 1: 2369
  Bucket 2:
  Bucket 3:
  Bucket 4:
  Bucket 5:
  Bucket 6:
  Bucket 7:

Add 4692

  Bucket 0: 3760
  Bucket 1: 2369
  Bucket 2:
  Bucket 3:
  Bucket 4: 4692
  Bucket 5:
  Bucket 6:
  Bucket 7:

Add 4871

  Bucket 0: 3760
  Bucket 1: 2369
  Bucket 2:
  Bucket 3:
  Bucket 4: 4692
  Bucket 5:
  Bucket 6:
  Bucket 7: 4871

Add 5659

  Bucket 0: 3760
  Bucket 1: 2369
  Bucket 2:
  Bucket 3: 5659
  Bucket 4: 4692
  Bucket 5:
  Bucket 6:
  Bucket 7: 4871

Add 1821

  Bucket 0: 3760
  Bucket 1: 2369
  Bucket 2:
  Bucket 3: 5659
  Bucket 4: 4692
  Bucket 5: 1821
  Bucket 6:
  Bucket 7: 4871

Add 1074

  Bucket 0: 3760
  Bucket 1: 2369
  Bucket 2: 1074
  Bucket 3: 5659
  Bucket 4: 4692
  Bucket 5: 1821
  Bucket 6:
  Bucket 7: 4871

Add 7115

  Bucket 0: 3760
  Bucket 1: 2369
  Bucket 2: 1074
  Bucket 3: 5659, 7115
  Bucket 4: 4692
  Bucket 5: 1821
  Bucket 6:
  Bucket 7: 4871

Add 1620

  Bucket 0: 3760
  Bucket 1: 2369
  Bucket 2: 1074
  Bucket 3: 5659, 7115
  Bucket 4: 4692, 1620
  Bucket 5: 1821
  Bucket 6:
  Bucket 7: 4871

Add 2428

  Bucket 0: 3760
  Bucket 1: 2369
  Bucket 2: 1074
  Bucket 3: 5659, 7115
  Bucket 4: 4692, 1620 -> R1
  Bucket 5: 1821
  Bucket 6:
  Bucket 7: 4871
  Overflow file: (1) 2428

Add 3943

  Bucket 0: 3760
  Bucket 1: 2369
  Bucket 2: 1074
  Bucket 3: 5659, 7115
  Bucket 4: 4692, 1620 -> R1
  Bucket 5: 1821
  Bucket 6:
  Bucket 7: 4871, 3943
  Overflow file: (1) 2428

Add 4750

  Bucket 0: 3760
  Bucket 1: 2369
  Bucket 2: 1074
  Bucket 3: 5659, 7115
  Bucket 4: 4692, 1620 -> R1
  Bucket 5: 1821
  Bucket 6: 4750
  Bucket 7: 4871, 3943
  Overflow file: (1) 2428

Add 6975

  Bucket 0: 3760
  Bucket 1: 2369
  Bucket 2: 1074
  Bucket 3: 5659, 7115
  Bucket 4: 4692, 1620 -> R1
  Bucket 5: 1821
  Bucket 6: 4750
  Bucket 7: 4871, 3943 -> R2
  Overflow file: (1) 2428, (2) 6975

Add 4981

  Bucket 0: 3760
  Bucket 1: 2369
  Bucket 2: 1074
  Bucket 3: 5659, 7115
  Bucket 4: 4692, 1620 -> R1
  Bucket 5: 1821, 4981
  Bucket 6: 4750
  Bucket 7: 4871, 3943 -> R2
  Overflow file: (1) 2428, (2) 6975

Add 9208

  Bucket 0: 3760, 9208
  Bucket 1: 2369
  Bucket 2: 1074
  Bucket 3: 5659, 7115
  Bucket 4: 4692, 1620 -> R1
  Bucket 5: 1821, 4981
  Bucket 6: 4750
  Bucket 7: 4871, 3943 -> R2
  Overflow file: (1) 2428, (2) 6975

To look up any of the items stored in a bucket requires at least two block accesses: one to look up the block address of the bucket number in the file header, and one to read the block for the bucket.
If the record is in an overflow block linked to a bucket, then 3 accesses are required. Note: if you ignored the block access to the file header I won't take off points.
Given the above, the average number of block accesses is:
(2+2+2+2+2+2+2+2+3+2+2+2+2+2+3)/15 = 32/15
= 2.13 block accesses (approximately).
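
For reference, here is a short Python sketch of the hashing above (my own addition, not required for the homework). It assumes each bucket is one block holding at most two records, puts extra records into a separate overflow file, and recomputes the 32/15 average with the file-header access counted.

# Hash file simulation for 5.27: 8 buckets, h(K) = K mod 8,
# two records per bucket, extra records go to an overflow file.
keys = [2369, 3760, 4692, 4871, 5659, 1821, 1074, 7115,
        1620, 2428, 3943, 4750, 6975, 4981, 9208]

BUCKETS, CAPACITY = 8, 2
buckets = [[] for _ in range(BUCKETS)]
overflow = []

for k in keys:
    b = k % BUCKETS
    if len(buckets[b]) < CAPACITY:
        buckets[b].append(k)
    else:
        overflow.append(k)           # record pointer Ri kept in bucket b

# 2 accesses (header + bucket block) for records in their bucket,
# 3 accesses (header + bucket block + overflow block) otherwise.
costs = [2 if k in buckets[k % BUCKETS] else 3 for k in keys]
print(sum(costs) / len(costs))       # 32/15 = 2.13...
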
6.14
(a) Calculate the record size in bytes.
The record size is the sum of the field sizes:
30+9+9+40+9+8+1+4+4+1 (for the deletion marker) = 115 bytes.
(b) Calculate the blocking factor bfr and the number of file blocks b, assuming an unspanned file organization.
The blocking factor is the block size divided by the record size, rounded down,
i.e., bfr = floor(512/115) = 4 records/block.
Since there are 30,000 records, this requires b = 30,000/4 rounded up = 7,500 blocks.
(c) Calculate (i) the index blocking factor, (ii) the number of first-level entries and the number of first-level index blocks, (iii) the number of levels needed if we make it into a multilevel index, (iv) the total number of blocks required by the multilevel index, (v) the number of block accesses needed to search for and retrieve a record from the file using the primary index.
(i) An index record consists of 9 bytes for the key + 6 bytes for the block pointer = 15 bytes. So the index blocking factor is bfr_i = floor(512/15) = 34 records/block.
(ii) The number of first-level entries is the number of data blocks calculated in (b), i.e., 7500 entries. The number of blocks needed for this index level is 7500/34 rounded up = 221 blocks.
(iii) If we build a second index level on (ii), it must store 221 entries, again each 15 bytes long. This level requires 221/34 rounded up = 7 blocks. As this is more than one block, we need a third level. It stores 7 entries and, since the blocking factor is 34, fits in one block. So the multilevel index has 3 levels.
(iv) From the analysis above, the original file has 7500 blocks, the first index level has 221 blocks, the second level has 7 blocks, and the top level has 1 block. This gives a total of 7729 blocks. (It is okay if you didn't count the 7500 data blocks.)
(v) Binary searching the first index level takes log2(221) rounded up = 8 block accesses. Following the block pointer to the data block gives one more access, for a total of 9 block accesses on average.
(d) Repeat part (c) but assume we have a secondary index.
(i) An index record consists of 9 bytes for the key + 6 bytes for the block pointer = 15 bytes. (It would also have been okay to use record pointers, but then the values below would change.) So bfr_i = floor(512/15) = 34 records/block.
(ii) The number of first-level entries is now the total number of records, i.e., 30,000. The number of blocks needed for this index level is thus 30,000/34 rounded up = 883 blocks.
(iii) If we build a second index level on (ii), it must store 883 entries, again each 15 bytes long. This level requires 883/34 rounded up = 26 blocks. As this is more than one block, we need a third level. It stores 26 entries and, since the blocking factor is 34, fits in one block. So the multilevel index has 3 levels.
(iv) From the analysis above, the original file has 7500 blocks, the first index level has 883 blocks, the second level has 26 blocks, and the top level has 1 block. This gives a total of 8410 blocks. (It is okay if you didn't count the 7500 data blocks.)
(v) Binary searching the first index level takes log2(883) rounded up = 10 block accesses. Following the block pointer to the data block gives one more access, for a total of 11 block accesses on average.
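
The multilevel index calculations in (c) and (d) follow the same pattern: keep dividing the number of entries by the index blocking factor until a level fits in one block. Here is a small Python sketch of that pattern (my own helper, not from the book) that reproduces the numbers above.

import math

BLOCK = 512
bfr_i = BLOCK // (9 + 6)            # 34 index entries per block

def multilevel(first_level_entries):
    """Return the block count of each index level, bottom level first."""
    levels = []
    entries = first_level_entries
    while True:
        blocks = math.ceil(entries / bfr_i)
        levels.append(blocks)
        if blocks == 1:
            return levels
        entries = blocks            # the next level indexes these blocks

# (c) Primary index: one entry per data block (7500 blocks).
print(multilevel(7500))             # [221, 7, 1] -> 3 levels, 229 index blocks
# (d) Secondary index: one entry per record (30,000 records).
print(multilevel(30000))            # [883, 26, 1] -> 3 levels, 910 index blocks

# Block accesses if only the first level is binary searched, plus one
# access for the data block itself.
print(math.ceil(math.log2(221)) + 1)    # 9
print(math.ceil(math.log2(883)) + 1)    # 11
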
(e) Now suppose we are creating a secondary index on DEPARTMENTCODE using option 3 from section 6.1.3. Calculate (i) the index blocking factor, (ii) the number of blocks needed by the level of indirection that stores the record pointers, (iii) the number of first-level entries and the number of first-level index blocks, (iv) the number of levels needed if we make it into a multilevel index, (v) the total number of blocks required by the multilevel index, (vi) the number of block accesses needed to search for and retrieve all records that have a specific DEPARTMENTCODE value using the index.
(i) An index record consists of 9 bytes for the DEPARTMENTCODE + 6 bytes for the block pointer = 15 bytes. So bfr_i = floor(512/15) = 34 records/block.
(ii) The level of indirection consists of blocks of record pointers, each pointer 7 bytes long, so floor(512/7) = 73 pointers fit in one block. There are 1000 distinct DEPARTMENTCODE values and the records are distributed evenly among them, so each department has 30,000/1000 = 30 record pointers to store, which all fit in one block. Thus the level of indirection requires 1000 blocks, one for each DEPARTMENTCODE.
(iii) The number of first-level entries is the number of blocks in the level of indirection, i.e., 1000 entries. The number of blocks needed for this index level is 1000/34 rounded up = 30 blocks.
(iv) If we build a second index level on (iii), it must store 30 entries, again each 15 bytes long. These all fit in one block, so a third level is not needed; the multilevel index has 2 levels.
(v) From the analysis above, the original file has 7500 blocks, the level of indirection has 1000 blocks, the first index level has 30 blocks, and the top level has 1 block. This gives a total of 8531 blocks. (It is okay if you didn't count the 7500 data blocks.)
(vi) Here I am assuming they meant using the first-level index. (However, I won't take off points if you calculated using the multilevel index.) Looking up a given DEPARTMENTCODE involves locating it in the first-level index, which takes log2(30) rounded up = 5 block accesses, and then reading the level-of-indirection block for that department. Each of the 30 record pointers then costs one block access to fetch its record, so retrieving all 30 records takes 5 + 1 + 30 = 36 block accesses.
(f) Now suppose the file is ordered by DEPARTMENTCODE and we are creating a clustering index on DEPARTMENTCODE that uses block anchors. Calculate (i) the index blocking factor, (ii) the number of first-level entries and the number of first-level index blocks, (iii) the number of levels needed if we make it into a multilevel index, (iv) the total number of blocks required by the multilevel index, (v) the number of block accesses needed to search for and retrieve all records that have a specific DEPARTMENTCODE value using the index.
(i) An index record consists of 9 bytes for the key + 6 bytes for the block pointer = 15 bytes. So bfr_i = floor(512/15) = 34 records/block.
(ii) The number of first-level entries is the number of distinct DEPARTMENTCODE values, i.e., 1000 entries. The number of blocks needed for this index level is 1000/34 rounded up = 30 blocks.
(iii) If we build a second index level on (ii), it must store 30 entries, again each 15 bytes long. These all fit in one block, so a third level is not needed; the multilevel index has 2 levels.
(iv) Since we are using block anchors, the original file will take more than the 7500 blocks calculated before. Each DEPARTMENTCODE has 30 records associated with it, which take 30/4 rounded up = 8 blocks to store. Thus the data file has 8000 blocks, the first index level has 30 blocks, and the top level has 1 block. This gives a total of 8031 blocks. (It is okay if you didn't count the 8000 data blocks.)
(v) Here I am assuming they meant using the first-level index. (However, I won't take off points if you calculated using the multilevel index.) Looking up a given DEPARTMENTCODE involves locating it in the first-level index, which takes log2(30) rounded up = 5 block accesses. Once the anchor block is found, all the records with this DEPARTMENTCODE are stored contiguously in the 8 blocks counted in (iv), so retrieving all 30 records takes 5 + 8 = 13 block accesses.
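
As a quick check on (e) and (f), the following Python sketch (again my own, with my own variable names) recomputes the block counts and access counts for the indirection-based secondary index and for the clustering index, under the assumptions stated above.

import math

BLOCK, RECORDS, DEPTS, BFR = 512, 30000, 1000, 4
bfr_i = BLOCK // (9 + 6)                 # 34 index entries per block
per_dept = RECORDS // DEPTS              # 30 records per DEPARTMENTCODE

# (e) Option-3 secondary index with a level of indirection.
ptrs_per_block = BLOCK // 7              # 73 record pointers per block
indirection_blocks = DEPTS * math.ceil(per_dept / ptrs_per_block)    # 1000
first_level = math.ceil(DEPTS / bfr_i)   # 30 index blocks; second level = 1
total_e = 7500 + indirection_blocks + first_level + 1                # 8531
access_e = math.ceil(math.log2(first_level)) + 1 + per_dept          # 36

# (f) Clustering index with block anchors: each department starts a new block.
blocks_per_dept = math.ceil(per_dept / BFR)                          # 8
total_f = DEPTS * blocks_per_dept + first_level + 1                  # 8031
access_f = math.ceil(math.log2(first_level)) + blocks_per_dept       # 13

print(total_e, access_e, total_f, access_f)
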
(g) Suppose we want to have a B+-tree access structure on SSN. Calculate (i) p and pleaf, (ii) the number of leaf-level blocks needed if blocks are 69 percent full, (iii) the number of levels needed if internal nodes are 69 percent full, (iv) the total number of blocks required by the B+-tree, (v) the number of block accesses needed to search for and retrieve a record from the file.
(i) An internal node of a B+-tree can have up to p tree pointers and p-1 field values, and these must all fit in a 512-byte block. So we need p*6 + (p-1)*9 <= 512. Solving for p gives p = 34. For leaf nodes we need pleaf*(record pointer size + field size) + block pointer size <= block size, i.e., pleaf*(7+9) + 6 <= 512. Solving this yields pleaf = 31.
(ii) If the leaf nodes are 69 percent full, then on average each contains about 0.69*31 = 21 record pointers. As there are 30,000 records, we need 30,000/21 rounded up = 1429 leaf nodes.
(iii) If internal nodes are 69 percent full, then an internal node has about 0.69*34 = 23 pointers. The level above the leaves must have 1429/23 rounded up = 63 nodes to be able to point at all the leaf nodes, the level above that must have at least 3 nodes, and the level above that can be the root. Thus, including the leaf level, we have 4 levels.
(iv) From the analysis in (iii), we need 1429 + 63 + 3 + 1 = 1496 blocks for the B+-tree.
(v) Searching down the tree to a leaf requires 4 block accesses. One additional access is needed to look up the record, for a total of 5 block accesses on average.
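
The capacity calculation in (g)(i) just solves two small inequalities. The sketch below (mine, not the book's) finds p and pleaf by brute force and then stacks up the 69-percent-full levels.

import math

BLOCK, BP, RP, KEY = 512, 6, 7, 9    # block, block-pointer, record-pointer, key sizes

# Largest p with p tree pointers and p-1 keys per internal node.
p = max(n for n in range(2, BLOCK) if n * BP + (n - 1) * KEY <= BLOCK)       # 34
# Largest pleaf with pleaf (key, record pointer) pairs plus a next-leaf pointer.
pleaf = max(n for n in range(1, BLOCK) if n * (RP + KEY) + BP <= BLOCK)      # 31

fill_leaf = int(0.69 * pleaf)        # about 21 entries per leaf
fill_node = int(0.69 * p)            # about 23 pointers per internal node

levels = [math.ceil(30000 / fill_leaf)]                # 1429 leaf blocks
while levels[-1] > 1:
    levels.append(math.ceil(levels[-1] / fill_node))   # 63, 3, 1
print(levels, sum(levels), len(levels) + 1)            # 1496 blocks, 5 accesses per lookup
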
(h) Repeat (g) but for a B-tree rather than a B+-tree.
(i) For a B-tree, pleaf and p are the same, and we must have p*(block pointer size) + (p-1)*(record pointer size + field size) <= block size. So p*6 + (p-1)*16 <= 512. Solving for p gives 24. We will use this value, but for the reasons given in the book on page 174, in real life we would probably choose p to be smaller (like 23), in which case the results below would differ.
(ii), (iii) and (iv): Working from the root to the leaves and assuming every node is 69 percent full, the root has about 0.69*24 = 16 pointers and stores 15 entries. The next level has 16 nodes, 240 entries, and 256 pointers; the level after that has 256 nodes, 3840 entries, and 4096 pointers; and the last level holds the remaining 30000-3840-240-15 = 25905 entries in 25905/15 = 1727 nodes. So the total number of blocks we need is 1727+256+16+1 = 2000.
(v) The average search time can be estimated as:
sum over the levels of (fraction of entries at that level) * (search cost in block accesses for that level)
= (15/30000)*2 + (240/30000)*3 + (3840/30000)*4 + (25905/30000)*5 = 0.001 + 0.024 + 0.512 + 4.3175 = 4.8545 block accesses on average.
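
Here is a short Python sketch (my own) of the level-by-level counts and the weighted average in (v); the per-level costs of 2, 3, 4, and 5 block accesses are taken directly from the solution above.

# B-tree of order p = 24 with every node 69 percent full.
p = 24
pointers = int(0.69 * p)          # 16 pointers per node
entries = pointers - 1            # 15 entries per node

RECORDS = 30000
level_nodes, level_entries, remaining = [], [], RECORDS
nodes = 1                         # start with the root
while remaining > nodes * entries:
    level_nodes.append(nodes)
    level_entries.append(nodes * entries)
    remaining -= nodes * entries
    nodes *= pointers             # each node fans out to 16 children
# The bottom level holds whatever entries are left.
level_nodes.append(-(-remaining // entries))     # ceil(25905/15) = 1727
level_entries.append(remaining)

print(level_nodes, sum(level_nodes))             # [1, 16, 256, 1727], 2000 blocks
# Average cost, using the per-level block-access costs from the solution (2..5).
avg = sum(e / RECORDS * (i + 2) for i, e in enumerate(level_entries))
print(avg)                                       # 4.8545
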
6.15 Show how the B+-tree with p=4 and pleaf=3 will grow when the given items are inserted.
In the diagrams below, each line shows one level of the tree from the root down, with each node's keys in brackets; the pointers from a node to its children (and between leaves) are left out.
Add 23

  [23]

Add 65

  [23 65]

Add 37

  [23 37 65]

Add 60

  [37]
  [23 37]  [60 65]

Add 46

  [37]
  [23 37]  [46 60 65]

Add 92

  [37 60]
  [23 37]  [46 60]  [65 92]

Add 48

  [37 60]
  [23 37]  [46 48 60]  [65 92]

Add 71

  [37 60]
  [23 37]  [46 48 60]  [65 71 92]

Add 56

  [37 48 60]
  [23 37]  [46 48]  [56 60]  [65 71 92]

Add 59

  [37 48 60]
  [23 37]  [46 48]  [56 59 60]  [65 71 92]

Add 18

  [37 48 60]
  [18 23 37]  [46 48]  [56 59 60]  [65 71 92]

Add 21

  [37]
  [21]  [48 60]
  [18 21]  [23 37]  [46 48]  [56 59 60]  [65 71 92]

Add 10

  [37]
  [21]  [48 60]
  [10 18 21]  [23 37]  [46 48]  [56 59 60]  [65 71 92]

Add 74

  [37]
  [21]  [48 60 71]
  [10 18 21]  [23 37]  [46 48]  [56 59 60]  [65 71]  [74 92]

Add 78

  [37]
  [21]  [48 60 71]
  [10 18 21]  [23 37]  [46 48]  [56 59 60]  [65 71]  [74 78 92]

Add 15

  [37]
  [15 21]  [48 60 71]
  [10 15]  [18 21]  [23 37]  [46 48]  [56 59 60]  [65 71]  [74 78 92]

Add 16

  [37]
  [15 21]  [48 60 71]
  [10 15]  [16 18 21]  [23 37]  [46 48]  [56 59 60]  [65 71]  [74 78 92]

Add 20

  [37]
  [15 18 21]  [48 60 71]
  [10 15]  [16 18]  [20 21]  [23 37]  [46 48]  [56 59 60]  [65 71]  [74 78 92]

Add 24

  [37]
  [15 18 21]  [48 60 71]
  [10 15]  [16 18]  [20 21]  [23 24 37]  [46 48]  [56 59 60]  [65 71]  [74 78 92]

Add 28

  [18 37]
  [15]  [21 24]  [48 60 71]
  [10 15]  [16 18]  [20 21]  [23 24]  [28 37]  [46 48]  [56 59 60]  [65 71]  [74 78 92]

Add 39

  [18 37]
  [15]  [21 24]  [48 60 71]
  [10 15]  [16 18]  [20 21]  [23 24]  [28 37]  [39 46 48]  [56 59 60]  [65 71]  [74 78 92]

Add 43

  [18 37 48]
  [15]  [21 24]  [43]  [60 71]
  [10 15]  [16 18]  [20 21]  [23 24]  [28 37]  [39 43]  [46 48]  [56 59 60]  [65 71]  [74 78 92]

Add 47

  [18 37 48]
  [15]  [21 24]  [43]  [60 71]
  [10 15]  [16 18]  [20 21]  [23 24]  [28 37]  [39 43]  [46 47 48]  [56 59 60]  [65 71]  [74 78 92]

Add 50

  [18 37 48]
  [15]  [21 24]  [43]  [56 60 71]
  [10 15]  [16 18]  [20 21]  [23 24]  [28 37]  [39 43]  [46 47 48]  [50 56]  [59 60]  [65 71]  [74 78 92]

Add 69

  [18 37 48]
  [15]  [21 24]  [43]  [56 60 71]
  [10 15]  [16 18]  [20 21]  [23 24]  [28 37]  [39 43]  [46 47 48]  [50 56]  [59 60]  [65 69 71]  [74 78 92]

Add 75

  [37]
  [18]  [48 60]
  [15]  [21 24]  [43]  [56]  [71 75]
  [10 15]  [16 18]  [20 21]  [23 24]  [28 37]  [39 43]  [46 47 48]  [50 56]  [59 60]  [65 69 71]  [74 75]  [78 92]

Add 8

  [37]
  [18]  [48 60]
  [15]  [21 24]  [43]  [56]  [71 75]
  [8 10 15]  [16 18]  [20 21]  [23 24]  [28 37]  [39 43]  [46 47 48]  [50 56]  [59 60]  [65 69 71]  [74 75]  [78 92]

Add 49

  [37]
  [18]  [48 60]
  [15]  [21 24]  [43]  [56]  [71 75]
  [8 10 15]  [16 18]  [20 21]  [23 24]  [28 37]  [39 43]  [46 47 48]  [49 50 56]  [59 60]  [65 69 71]  [74 75]  [78 92]

Add 33

  [37]
  [18]  [48 60]
  [15]  [21 24]  [43]  [56]  [71 75]
  [8 10 15]  [16 18]  [20 21]  [23 24]  [28 33 37]  [39 43]  [46 47 48]  [49 50 56]  [59 60]  [65 69 71]  [74 75]  [78 92]

Add 38

  [37]
  [18]  [48 60]
  [15]  [21 24]  [43]  [56]  [71 75]
  [8 10 15]  [16 18]  [20 21]  [23 24]  [28 33 37]  [38 39 43]  [46 47 48]  [49 50 56]  [59 60]  [65 69 71]  [74 75]  [78 92]
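
If you want to check a trace like the one above mechanically, here is a small Python sketch of B+-tree insertion (my own code, not the book's algorithm verbatim). It uses the split convention the diagrams above appear to follow: an overflowing leaf keeps its first ceil((pleaf+1)/2) keys and copies its last key up, while an overflowing internal node pushes up the key just left of its middle. With p=4 and pleaf=3 it reproduces the final tree shown above.

import math

P, PLEAF = 4, 3   # internal node order and leaf capacity from problem 6.15

class Node:
    def __init__(self, leaf=True):
        self.leaf = leaf
        self.keys = []
        self.children = []   # child nodes; empty for leaves

def insert(root, key):
    """Insert key and return the (possibly new) root."""
    split = _insert(root, key)
    if split is None:
        return root
    sep, right = split
    new_root = Node(leaf=False)
    new_root.keys, new_root.children = [sep], [root, right]
    return new_root

def _insert(node, key):
    if node.leaf:
        node.keys.append(key)
        node.keys.sort()
        if len(node.keys) <= PLEAF:
            return None
        # Leaf split: left keeps ceil(n/2) keys, its last key is copied up.
        mid = math.ceil(len(node.keys) / 2)
        right = Node(leaf=True)
        right.keys, node.keys = node.keys[mid:], node.keys[:mid]
        return node.keys[-1], right
    # Internal node: descend into the child for this key.
    i = 0
    while i < len(node.keys) and key > node.keys[i]:
        i += 1
    split = _insert(node.children[i], key)
    if split is None:
        return None
    sep, right = split
    node.keys.insert(i, sep)
    node.children.insert(i + 1, right)
    if len(node.keys) <= P - 1:
        return None
    # Internal split: the key just left of the middle moves up (not copied).
    mid = (len(node.keys) - 1) // 2
    up = node.keys[mid]
    new_right = Node(leaf=False)
    new_right.keys, new_right.children = node.keys[mid + 1:], node.children[mid + 1:]
    node.keys, node.children = node.keys[:mid], node.children[:mid + 1]
    return up, new_right

def show(root):
    level = [root]
    while level:
        print("  ".join(str(n.keys) for n in level))
        level = [c for n in level for c in n.children]

root = Node()
for k in [23, 65, 37, 60, 46, 92, 48, 71, 56, 59, 18, 21, 10, 74, 78,
          15, 16, 20, 24, 28, 39, 43, 47, 50, 69, 75, 8, 49, 33, 38]:
    root = insert(root, k)
show(root)   # prints the same tree as the final diagram above
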
/*
 File: StudentDB.sql
 Purpose: Creates the tables for the student database from Fig 1.2 of the book.
 Remarks: At the time of the homework you didn't know about constraints
 on relations, so the SQL below doesn't mention primary keys, foreign keys, etc.
*/
CREATE TABLE STUDENT
(
    Name VARCHAR(15) NOT NULL,
    StudentNumber INT NOT NULL,
    Class INT,
    Major VARCHAR(4)
);

CREATE TABLE COURSE
(
    CourseName VARCHAR(30) NOT NULL,
    CourseNumber VARCHAR(10) NOT NULL,
    CreditHours INT,
    Department VARCHAR(4)
);

CREATE TABLE SECTION
(
    SectionIdentifier INT NOT NULL,
    CourseNumber VARCHAR(10) NOT NULL,
    Semester VARCHAR(6),
    Year CHAR(2),
    Instructor VARCHAR(15)
);

CREATE TABLE GRADE_REPORT
(
    StudentNumber INT NOT NULL,
    SectionIdentifier INT NOT NULL,
    Grade CHAR(1)
);

CREATE TABLE PREREQUISITE
(
    CourseNumber VARCHAR(10) NOT NULL,
    PrerequisiteNumber VARCHAR(10) NOT NULL
);
/*
  File: Section.ctl
  Purpose: Loads data for the Section table
*/
LOAD DATA
INFILE Section.dat
INTO TABLE Section
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
( SectionIdentifier, CourseNumber, Semester, Year, Instructor )
/*
  File: Section.dat
  Purpose: Has data for the Section table
*/
"85","MATH2410","FALL","98","King"
"92","CS1310","FALL","98","Anderson"
"102","CS3320","Spring","99","Knuth"
"112","MATH2410","FALL","99","Chang"
"119","CS1310","FALL","99","Anderson"
"135","CS3380","FALL","99","Stone"
/*
  File: Section.log
  Purpose: Log file for the load process
*/
SQL*Loader: Release 8.1.7.0.0 - Production on Wed Apr 3 04:39:34 2002

(c) Copyright 2000 Oracle Corporation. All rights reserved.

Control File:  Section.ctl
Data File:   Section.dat
 Bad File:   Section.bad
 Discard File: none specified

(Allow all discards)

Number to load: ALL
Number to skip: 0
Errors allowed: 50
Bind array:   64 rows, maximum of 65536 bytes
Continuation:  none specified
Path used:   Conventional

Table SECTION, loaded from every logical record.
Add option in effect for this table: INSERT

   Column Name                  Position   Len  Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
SECTIONIDENTIFIER                   FIRST     *   ,  O(") CHARACTER
COURSENUMBER                         NEXT     *   ,  O(") CHARACTER
SEMESTER                             NEXT     *   ,  O(") CHARACTER
YEAR                                 NEXT     *   ,  O(") CHARACTER
INSTRUCTOR                           NEXT     *   ,  O(") CHARACTER


Table SECTION:
 6 Rows successfully loaded.
 0 Rows not loaded due to data errors.
 0 Rows not loaded because all WHEN clauses were failed.
 0 Rows not loaded because all fields were null.


Space allocated for bind array:    64500 bytes (50 rows)
Space allocated for memory besides bind array:    0 bytes

Total logical records skipped:     0
Total logical records read:  6
Total logical records rejected:     0
Total logical records discarded:    0

Run began on Wed Apr 03 04:39:34 2002
Run ended on Wed Apr 03 04:39:34 2002

Elapsed time was:   00:00:00.30
CPU time was:     00:00:00.03