Binary Data

Information is measured in bits and bytes

1 bit = 0 or 1
1 byte = 8 bits
1 kilobyte = 2^10 bytes (appx. 1000 bytes)
1 megabyte = 2^20 bytes (appx. 1 million bytes)
1 gigabyte = 2^30 bytes (appx. 1 billion bytes)
1 terabyte = 2^40 bytes (appx. 1 trillion bytes)
1 petabyte = 2^50 bytes
1 exabyte = 2^60 bytes
1 zettabyte = 2^70 bytes
1 yottabyte = 2^80 bytes
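
The "approximately" in the list above can be checked by direct arithmetic. The following is a minimal Python sketch; the names PREFIXES and binary_vs_decimal are illustrative and not part of these notes. It compares each power of two with the nearby power of ten:

```python
# Illustrative names; not part of the original notes.
PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa", "zetta", "yotta"]

def binary_vs_decimal(k):
    """Return (2^(10k), 10^(3k), ratio) for the k-th prefix (k = 1 is kilo)."""
    binary = 2 ** (10 * k)
    decimal = 10 ** (3 * k)
    return binary, decimal, binary / decimal

for k, prefix in enumerate(PREFIXES, start=1):
    binary, decimal, ratio = binary_vs_decimal(k)
    print(f"1 {prefix}byte = 2^{10 * k} bytes, about {ratio:.3f} x 10^{3 * k}")
```

The ratio starts at 1.024 for a kilobyte and grows with each prefix, so the approximation gets looser as the units get larger.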

How much information is in ...

Information object                                       How many bytes

A binary decision                                        1 bit
A single text character                                  1 byte
A typical text word                                      10 bytes
A typewritten page                                       2 kilobytes (KB)
A low-resolution photograph                              100 kilobytes
A short novel                                            1 megabyte (MB)
The contents of a 3.5 inch floppy disk                   1.44 megabytes
A high-resolution photograph                             2 megabytes
The complete works of Shakespeare                        5 megabytes
A minute of high-fidelity sound                          10 megabytes
One meter (or close to a yard) of shelved books          100 megabytes
The contents of a CD-ROM                                 500 megabytes
A pickup truck filled with books                         1 gigabyte (GB)
The contents of a DVD                                    17 gigabytes
A collection of the works of Beethoven                   20 gigabytes
A library floor of academic journals                     100 gigabytes
50,000 trees made into paper and printed                 1 terabyte (TB)
An academic research library                             2 terabytes
The print collections of the U.S. Library of Congress    10 terabytes
The National Climatic Data Center database               400 terabytes
Three years of EOS data (2001)                           1 petabyte (PB)
All U.S. academic research libraries                     2 petabytes
All hard disk capacity developed in 1995                 20 petabytes
All printed material in the world                        200 petabytes
Total volume of information generated in 1999            2 exabytes (EB)
All words ever spoken by human beings                    5 exabytes

Nibbles as unsigned integers

A length n binary word is any sequence of n bits.

Fact: There are 2^n length n binary words. So for example, there are 2^8 = 256 distinct bytes. (Proof?)
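
The count can be checked for small n by listing every word. This is a minimal Python sketch; the function name binary_words is an illustrative choice, not something defined in these notes. Each of the n positions is filled independently with 0 or 1, which is also the idea behind the proof:

```python
from itertools import product

def binary_words(n):
    """Return all length-n binary words as strings of '0' and '1'."""
    return ["".join(bits) for bits in product("01", repeat=n)]

nibbles = binary_words(4)
print(len(nibbles))                    # 16
print(nibbles[:4])                     # ['0000', '0001', '0010', '0011']
print(len(binary_words(8)) == 2 ** 8)  # True
```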

A nibble = 4 bits. So there are 2^4 = 16 nibbles.

An unsigned integer is a sequence of digits (0 – 9). We interpret the digits appearing in an integer as factors of powers of 10. For example:

234 = 2 * 10^2 + 3 * 10^1 + 4 * 10^0

Note that the power of 10 corresponds to the position of the digit in the sequence.

We can do the same for binary words. We can interpret the bits as factors of powers of 2. For example:

1011 = 1 * 2^3 + 0 * 2^2 + 1 * 2^1 + 1 * 2^0 = 8 + 2 + 1 = 11
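
The same sum can be written as a short Python sketch. The function name unsigned_value is illustrative, and the binary word is assumed to be given as a string of '0' and '1' characters:

```python
def unsigned_value(word):
    """Interpret a string of '0'/'1' characters as an unsigned integer."""
    n = len(word)
    total = 0
    for position, bit in enumerate(word):
        power = n - 1 - position          # leftmost bit has the highest power
        total += int(bit) * 2 ** power
    return total

print(unsigned_value("1011"))                    # 11
print(unsigned_value("1011") == int("1011", 2))  # True
```

Python's built-in int(word, 2) performs the same conversion; the explicit loop is only there to mirror the sum of powers of 2.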

What unsigned integer is represented by the byte: 00010111? The nibble 1111?

Nibbles as signed integers

There are several ways to interpret nibbles as signed integers. In all representations we want to interpret the leftmost bit as the sign of the number:

0 = +, 1 = -

It is also desirable to have +0 = -0 = 0000.

The standard representation is called the two's complement system. It interprets nibbles with leftmost bit 0 as sums of powers of two in the way we saw before:

0000 = 0
0001 = 1
0010 = 2
0011 = 3
0100 = 4
0101 = 5
0110 = 6
0111 = 7

For negative numbers the idea is this:

n + (-n) = 0

For example:

5 + (-5) = 0

Notice that in the unsigned interpretation 15 + 1 = 1111 + 1 = 10000 = 16, which does not fit in four bits; dropping the leftmost bit leaves 0000 = 0. In other words, we can treat 16 as another 0. So now our equation becomes:

n + (-n) = 16

or:

-n = 16 – n

So for example:

-5 = 16 – 5 = 11 = 1011
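
The rule -n = 16 – n can be sketched in Python as follows. The function name encode_nibble is illustrative, and the range check is an assumption based on four bits giving only 16 patterns:

```python
def encode_nibble(n):
    """Return the 4-bit pattern for n, assuming -8 <= n <= 7."""
    if not -8 <= n <= 7:
        raise ValueError("a nibble can only hold -8 through 7")
    # A negative number -k is stored as the unsigned value 16 - k.
    unsigned = n if n >= 0 else 16 + n
    return format(unsigned, "04b")

print(encode_nibble(5))   # 0101
print(encode_nibble(-5))  # 1011
```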

Using this scheme we have:

1111 = 15 = 16 – 1 = 0 – 1 = -1
1110 = 14 = 16 – 2 = 0 – 2 = -2
1101 = 13 = 16 – 3 = 0 – 3 = -3
1100 = 12 = 16 – 4 = 0 – 4 = -4
1011 = 11 = 16 – 5 = 0 – 5 = -5
1010 = 10 = 16 – 6 = 0 – 6 = -6
1001 = 9 = 16 – 7 = 0 – 7 = -7
1000 = 8 = 16 – 8 = 0 – 8 = -8
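
Going the other way, a nibble can be decoded by reading it as an unsigned value and subtracting 16 when the leftmost bit is 1. This is a minimal sketch with an illustrative function name, decode_nibble; it prints all 16 nibbles with their signed values:

```python
def decode_nibble(word):
    """Interpret a 4-character string of '0'/'1' as a signed integer."""
    unsigned = int(word, 2)
    # Leftmost bit 1 means negative: the stored value is 16 - k for -k.
    return unsigned - 16 if word[0] == "1" else unsigned

for value in range(16):
    word = format(value, "04b")
    print(word, "=", decode_nibble(word))
```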

One problem with the two's complement system is that we have a representation for -8, but not for 8!

The two's complement system gets its name because there is a shortcut for figuring out the negative of a number: take the one's complement and add 1. The one's complement of a binary number, ~n, is the number that results from inverting each bit. So:

-n = ~n + 1

For example:

-(-3) = -(1101) = ~(1101) + 1 = 0010 + 1 = 0011 = 3
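
The shortcut can be tried out with Python's bitwise operators. This is a sketch under one assumption worth stating: Python integers are not limited to four bits, so the result of ~ and of the addition is masked with 1111 to keep only the low four bits. The names MASK and negate_nibble are illustrative.

```python
MASK = 0b1111  # keep only the low four bits

def negate_nibble(word):
    """Return the nibble for -n given the nibble for n, using ~n + 1."""
    value = int(word, 2)
    ones_complement = ~value & MASK        # invert each of the four bits
    result = (ones_complement + 1) & MASK  # add 1, discarding any carry out
    return format(result, "04b")

print(negate_nibble("1101"))  # 0011
print(negate_nibble("0101"))  # 1011
print(negate_nibble("1000"))  # 1000
print(negate_nibble("0000"))  # 0000
```

The last two lines show the edge cases: negating 1000 (-8) gives 1000 again because 8 has no nibble representation, and negating 0000 gives 0000, so there is only one zero.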