Arrays
An array is a named group of several consecutive variables of the same type. Each of the component variables is accessed from the array name and an index number.
Assume SIZE is an integer constant:
const int SIZE = 3;
The following declaration creates three arrays, each consisting of three variables of type double:
double vec1[SIZE], vec2[SIZE], vec3[SIZE];
We can initialize these arrays using a for loop. Be careful: array indices start at 0, so the maximum valid index is SIZE – 1 = 2:
for(int i = 0; i < SIZE; i++)
{
vec1[i] = 2 * i + 1;
vec2[i] = i / 2.0; // 2.0 forces floating-point division
vec3[i] = 1;
}
[Figure: vec1, vec2, and vec3 laid out in the computer's memory.]
We can add the entries in vec1 to the corresponding entries in vec2 and store them in vec3 using a for loop:
for(int i = 0; i < SIZE; i++)
vec3[i] = vec1[i] + vec2[i];
Another for loop prints vec3:
for(int i = 0; i < SIZE; i++)
cout << vec3[i] << ' ';
Producing the output:
1 3.5 6
We can introduce VECTOR as an alternative name for length-three arrays of doubles by using a typedef declaration, but the syntax is a little tricky:
typedef double VECTOR[SIZE];
Now we can simply declare vec1, vec2, and vec3 as VECTORs:
VECTOR vec1, vec2, vec3;
We can also initialize arrays in their declarations:
double vec4[3] = {100, 200, 300};
Of course we can also define arrays of arrays. This might be a good way to represent matrices:
double mat1[SIZE][SIZE], mat2[SIZE][SIZE];
We need a nested for loop to initialize these:
for(int i = 0; i < SIZE; i++)
for(int j = 0; j < SIZE; j++)
{
mat1[i][j] = 3 * i + j + 1;
mat2[i][j] = 2 * mat1[i][j];
}
Each row of one of our matrices is an ordinary array:
for(int i = 0; i < SIZE; i++)
vec3[i] = mat1[1][i] * mat2[i][1];
Problem
Write a program that prompts the user for the rows of a 3 x 3 matrix, then prints its determinant. (Check your math books for the definition of the determinant.)
Pointers
Each variable has a unique address in the computer's memory. We can discover the address of a variable x using the address-of operator: &x. For example, on my computer executing the statements:
int nums[3] = {100, 200, 300};
for(int i = 0; i < 3; i++)
cout << "&nums[" << i << "] = " << &nums[i] << '\n';
produces the output:
&nums[0] = 0012FF6C
&nums[1] = 0012FF70
&nums[2] = 0012FF74
Addresses, also called pointers, are usually given in hexadecimal notation. Notice that the difference between two consecutive pointers is 4. That's because on my computer:
sizeof(int) = 4 (bytes)
If we change nums to an array of short ints, then the output produced is:
&nums[0] = 0012FF70
&nums[1] = 0012FF72
&nums[2] = 0012FF74
The difference between consecutive pointers is now 2 because on my computer:
sizeof(short) = 2 (bytes)
Although pointers look like integers, they are not. For example, it's illegal to store a pointer in an integer variable:
int x = 42;
int y = &x; // error
Instead, pointers must be stored in pointer variables. The type of all pointers to ints is called int*:
int x = 42;
int* y = &x; // ok
The syntax of a pointer declaration is a little tricky. Officially, a declaration has four parts:
SPECIFIER BASE_TYPE DECLARATOR INITIALIZER;
The specifier (e.g. const) and the initializer (e.g. = { 100, 200, 300 }) are optional. The base type is any valid C++ type name: int, float, vector<string>, etc. The declarator is a name and, optionally, some declarator operators. For example, recall the declaration of the nums array:
int nums[3] = {100, 200, 300};
This declaration has no specifier. The base type is int, the initializer is = {100, 200, 300} and the declarator consists of the name, nums, and the declarator operator [3]. The syntax is complicated because it suggests nums is an integer variable, but that's not the case.
As it turns out, even though * appeared next to the base type in:
int* y = &x;
it is also a declarator operator. C++ simply ignores the spaces. We could have written:
int *y = &x;
This can cause problems. For example, if we attempt to declare two pointers, we can get in trouble:
int* y = &nums[1], z = &nums[2]; // error: z is an int!
The correct syntax is:
int *y = &nums[1], *z = &nums[2]; // ok
[Figure: the array nums, with y pointing to nums[1] and z pointing to nums[2].]
There's more confusion to come. What can we do with a pointer? Two things. First, we can dereference it, which means we can discover the value stored inside the variable it points to. What's confusing is the syntax: we use the same operator that was used to declare the pointer:
cout << *y; // prints 200
cout << *z; // prints 300
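Be careful. Dereferencing a null pointer is the most common cause of programs crashing:
int *p = 0; // p is a null pointer
cout << *p; // error, p doesn't point to anything
The value of a null pointer is 0, so as long as p has been initialized (an uninitialized pointer holds an arbitrary address), a safe way to dereference it is:
if (p) cout << *p; // ok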
The other thing we can do is pointer arithmetic. For example:
cout << y << '\n'; // prints 0012FF70
y = y + 1;
cout << y << '\n'; // prints 0012FF74
cout << *y << '\n'; // prints 300
Notice that incrementing a pointer to an int by 1 really increases the address it holds by sizeof(int).
Array Pointer Duality
There is a close relationship between pointers and arrays: in most expressions, an array name behaves like a constant pointer to the array's first element. C++ translates the expression:
nums[i];
into:
*(nums + i);
Here's another way to traverse an array. Note the similarity with iterators:
for(int* p = nums; p != nums + 3; p++)
*p = *p + 1;
Problem
Assume
sizeof(int) = sizeof(int*) = sizeof(int**)
Assume the following declarations are made:
int nums[3] = {100, 200, 300}, *p = &nums[1], **q = &p;
Assume:
p = 5000 and q = 6000
Compute the values produced by the following expressions:
*p =
*p + 1 =
*(p + 1) =
**q =
*q =
&nums[0] =
p[1] =
Constant Pointers
Mixing the const specifier with pointer declarations creates four combinations. Assume the following declarations are made:
int x = 42;
int *a = &x;
const int *b = &x;
int * const c = &x;
const int * const d = &x;
Here is what you can and can't do with these pointers:
*a = 0; // ok
a++; // ok
*b = 0; // error, b points to a constant
b++; // ok
*c = 0; // ok
c++; // error, c is a constant
*d = 0; // error, d points to a constant
d++; // error, d is a constant
Heap Variables
A program's memory is divided into four segments: the stack, the static segment, the code segment, and the heap.
Local variables and parameters are allocated in the stack. Global variables are allocated in the static segment. Binary instructions are allocated in the code segment. The computer manages allocation and deallocation of variables in these segments. The programmer, however, can allocate and deallocate variables in the heap segment while the program is running.
The new operator allocates variables in the heap. It returns a pointer to the allocated variable:
int *p = new int;
double *q = new double;
The delete operator deallocates heap variables so that their memory can be reallocated later:
delete p;
delete q;
All of this would be pretty uninteresting if it weren't also possible to create array variables in the heap:
int* p = new int[50]; // p points to a heap array of length 50
for(int i = 0; i < 50; i++)
p[i] = i;
Use the delete[] operator to delete an array:
delete[] p;
A heap array is also called a dynamic array because, unlike an ordinary array, it can be replaced by a larger one while the program is running. For example, to make room in p for more elements and then add one, we can write:
int* temp = new int[100];
for(int i = 0; i < 50; i++)
temp[i] = p[i];
delete[] p;
p = temp;
p[50] = 50;
Vectors
Arrays have two problems. First, C++ does not automatically validate indices. Instead, the program just crashes, or worse, memory is corrupted:
double vec[3] = {100, 200, 300};
vec[4] = 0; // oops!
Second, arrays can't expand to accommodate more elements. A length three array is always a length three array.
To solve these problems the standard C++ library includes a vector class. Actually, vector is not a class, it's a template for generating classes. For example, here is how we would declare different types of vectors:
vector<double> vec1, vec2, vec3;
vector<vector<double> > mat1, mat2;
vector<char> msg1, msg2, msg3, msg4;
(Note the space between the two closing angle brackets in the middle declaration. Compilers that predate C++11 need it to prevent confusion with the right shift operator.)
We can add as many elements as we want to vec1:
vec1.push_back(7);
vec1.push_back(42);
vec1.push_back(19);
vec1.push_back(100);
vec1.push_back(-13);
We can access the members of vec1 using subscripts:
int n = vec1.size();
for(int i = 0; i < n; i++)
vec1[i] = vec1[i] * 2;
Unfortunately, the subscript operator, [], doesn't validate index bounds. For that we need the at() function encapsulated by vec1, which throws an out_of_range exception (declared in <stdexcept>) when the index is invalid:
try
{
cout << vec1.at(500); // oops!
}
catch(const out_of_range& e)
{
cerr << e.what() << '\n';
}
To make vectors available to our programs, we must include the vector header file:
#include <vector>
using namespace std;
Iterators
Officially, vectors, in fact all of the container templates, use iterators as abstract pointers. We aren't allowed to know what an iterator is, but here's how one is declared:
vector<double>::iterator p;
All vectors encapsulate iterators that point to the first element and just beyond the last element:
vec1.begin(); // points to beginning of vec1
vec1.end(); // points to just beyond the end of vec1
An iterator can be incremented or decremented using the ++ and -- operators. The element an iterator p points to can be accessed by dereferencing the iterator using the unary * operator. Putting all this together, here's another way to traverse vec1:
for(p = vec1.begin(); p != vec1.end(); p++)
cout << *p << ' ';
Characters
Here are some example declarations of character variables initialized by character literals. Notice that literal characters are bracketed by single quotes and that char is the name of the type of all characters:
char zero = '0', quote = '"', a = 'a', period = '.', space = ' ';
Internally, a character is represented by an integer. For example:
cout << "the code for " << period << " = " << int(period) << '\n';
prints "the code for . = 46" on my computer.
An appropriate integer can be converted into the character it represents. For example, the statement:
cout << char(98) << '\n';
prints b on my computer.
Control characters represent special keys such as the return key, etc. They have special literal representations that use the backslash character:
char newline = '\n',
tab = '\t',
backslash = '\\',
bell = '\a',
ret = '\r',
backspace = '\b';
The standard C library contains functions for classifying and manipulating characters. They are declared in <cctype>:
int isalpha(int);
int isupper(int);
int islower(int);
int isdigit(int);
int isspace(int);
int iscntrl(int);
int ispunct(int);
int isalnum(int);
int toupper(int);
int tolower(int);
Here's a little program for determining character codes. Unfortunately, we'll have to make the user type <Ctrl> + c to break out of the loop:
#include <iostream>
using namespace std;
int main()
{
while(true)
{
cout << "Enter a character (<Ctrl>c to quit): ";
char response = cin.get();
cout << "int(" << response << ") = ";
cout << int(response) << '\n';
cin.ignore(1000, '\n'); // discard the rest of the line, including '\n'
}
return 0;
}
Here's a sample output produced on my computer, which uses ASCII character codes:
Enter a character (<Ctrl>c to quit): 6
int(6) = 54
Enter a character (<Ctrl>c to quit): 7
int(7) = 55
Enter a character (<Ctrl>c to quit): a
int(a) = 97
Enter a character (<Ctrl>c to quit): A
int(A) = 65
Enter a character (<Ctrl>c to quit): b
int(b) = 98
Enter a character (<Ctrl>c to quit): B
int(B) = 66
Enter a character (<Ctrl>c to quit):
int( ) = 32
Enter a character (<Ctrl>c to quit): ?
int(?) = 63
Enter a character (<Ctrl>c to quit):
C Strings
A C string variable is simply an array of characters. To allow C strings of varying sizes, these arrays are often allocated in the heap:
typedef char String80[80];
typedef char* String;
A C string literal is a sequence of characters bracketed by double quotes. Backslash is the escape character:
const char* path = "D:\\aaa\\nnn\n"; // = D:\aaa\nnn (string literals are read-only, hence const)
String80 prompt = "Type \"q\" to quit\n"; // = Type "q" to quit
C strings are always terminated by the NUL character, where NUL = char(0). The standard C library provides functions for manipulating C strings (see below).
Programmers should use C++ strings (see below) instead of C strings, because C strings don't check for out-of-range indices, and they don't allocate and deallocate memory for themselves. Instead, these jobs are left to the programmer.
There are several places where C strings are still used in C++. One example is command line arguments. The DOS console always passes the entire command line, including the command name, to main() as an array of C strings (i.e., an array of pointers to arrays of chars). The console also passes the length of this array to main().
test.cpp (A simple expression evaluator)
#include <string> // string functions declared here (not used)
#include <cmath> // pow declared here
#include <cstdlib> // atof declared here
#include <iostream>
using namespace std;
int main(int argc, char* argv[])
{
if (argc != 4)
{
cerr << "usage: " << argv[0] << " NUMBER OPERATOR NUMBER\n";
exit(1);
}
double num1 = atof(argv[1]);
double num2 = atof(argv[3]);
double num = 0; // result stored here
char op = argv[2][0];
switch (op)
{
case '+': num = num1 + num2; break;
case '*': num = num1 * num2; break;
case '-': num = num1 - num2; break;
case '/': num = num1 / num2; break;
case 'e': num = pow(num1, num2); break; // DOS won't allow ^
default:
cerr << "unrecognized operator: " << op << '\n';
return 1;
} // switch
cout << "result = " << num << '\n';
return 0;
}
Program Output
Notice that although the executable is renamed, the new name is used in the error message:
D:>rename string3.exe eval.exe
D:>eval 22 + 15
result = 37
D:>eval 3.2 * 7
result = 22.4
D:>eval 15 / 3
result = 5
D:>eval 15 - 5
result = 10
D:>eval 2 e 5
result = 32
D:>eval 23
usage: eval NUMBER OPERATOR NUMBER
D:>eval 6 # 2
unrecognized operator: #
D:>
C++ Strings
The standard C++ library provides a string class declared in <string>. A C++ string is an object that encapsulates many useful functions. The program below demonstrates some of these functions.
test.cpp
The comments in the program mark the most important string operations:
#include <cstdio>
#include <iostream>
#include <stdexcept> // out_of_range declared here
#include <string>
using namespace std;
int main()
{
// converting C strings to C++ strings
string s1("California"), s2, s3;
s2 = "Nevada";
// converting C++ strings to C strings
printf("%s\n\n", s1.c_str());
cout << "s3 empty? = " << s3.empty() << "\n\n";
// assignment copies strings
s3 = s2;
cout << "s2 = " << s2 << '\n';
cout << "s3 = " << s3 << '\n';
s3[0] = 'n';
cout << "s2 = " << s2 << '\n';
cout << "s3 = " << s3 << "\n\n";
// strings can grow
s3 = "Nevada is in the USA";
cout << s3 << "\n\n";
// 3 ways to traverse a string
for(int i = 0; i < s1.size(); i++)
cout << s1[i] << ' ';
cout << '\n';
// at() does index bounds checking
for(int i = 0; i < s2.length(); i++)
cout << s2.at(i) << ' ';
cout << '\n';
string::iterator p;
for(p = s1.begin(); p != s1.end(); p++)
cout << *p << ' ';
cout << "\n\n";
// error handling
try
{
cout << s1.at(500); // index too big!
}
catch(const out_of_range& e)
{
cerr << e.what() << "\n\n";
}
// comparing strings
cout << s1 << " == " << s2 << " = " << (s1 == s2) << '\n';
cout << s1 << " != " << s2 << " = " << (s1 != s2) << '\n';
cout << s1 << " <= " << s2 << " = " << (s1 <= s2) << "\n\n";
// concatenation
cout << s1 + ' ' + "is next to" + ' ' + s2 << "\n\n";
// finding substrings
int pos = s1.find("for");
s3 = s1.substr(pos, 3); // 3 chars starting at pos
cout << s3 << '\n';
s3 = s1.substr(pos, string::npos); // tail of s1
cout << s3 << "\n\n";
// replacing substrings
s3 = s1;
s3.replace(pos, 3, "XXXXX");
cout << s3 << "\n\n";
// stats
cout << "s1's max size = " << s1.max_size() << '\n';
cout << "s1's capacity = " << s1.capacity() << '\n';
cout << "s1's size = " << s1.size() << "\n\n";
// erasing substrings
s1.erase(pos, 3);
cout << s1 << '\n';
cout << "s1's max size = " << s1.max_size() << '\n';
cout << "s1's capacity = " << s1.capacity() << '\n';
cout << "s1's size = " << s1.size() << "\n\n";
// string I/O
cout << "Enter 3 strings separated by white space: ";
cin >> s1 >> s2 >> s3;
cout << "\nYou entered: " << s1 << ' ' << s2 << ' ' << s3 << '\n';
return 0;
}
Program Output
California
s3 empty? = 1
s2 = Nevada
s3 = Nevada
s2 = Nevada
s3 = nevada
Nevada is in the USA
C a l i f o r n i a
N e v a d a
C a l i f o r n i a
invalid string position
California == Nevada = 0
California != Nevada = 1
California <= Nevada = 1
California is next to Nevada
for
fornia
CaliXXXXXnia
s1's max size = 4294967293
s1's capacity = 31
s1's size = 10
Calinia
s1's max size = 4294967293
s1's capacity = 31
s1's size = 7
Enter 3 strings separated by white space: shoe crab rotator
You entered: shoe crab rotator
File Streams
Input File Streams
An input file stream (ifstream) is a special kind of input stream (istream). This means all operations we can perform on input streams can also be performed on input file streams. The main trick is knowing that when an input file stream fails for any reason, the stream itself tests as false (0) when it is used in a condition.
We can construct an input file stream from a file name:
ifstream ifs("data.txt");
If the file isn't found or if its permission level doesn't allow reading, then ifs enters the fail state and tests as false:
if (!ifs)
{
cerr << "Can't open file\n";
return 1;
}
We can use the same test to determine when the end of the file is reached. This is more reliable than calling eof(), which still returns false when only white space characters remain in the file:
while(ifs >> val) { ... }
For example, the following program computes the average of a file of numbers:
#include <fstream>
#include <iostream>
#include <string>
using namespace std;
int main()
{
double num, total = 0;
int count = 0;
string source;
cout << "Enter file name: ";
cin >> source;
ifstream ifs(source.c_str()); // before C++11 the constructor requires a C string
if (!ifs)
{
cerr << "can't open file";
return 1;
}
while (ifs >> num)
{
total += num;
count++;
}
cout << "Average = " << total/count << '\n';
return 0;
}
Of course, if a non-number is encountered in the file, even so much as a punctuation mark, the extraction fails, the loop exits, and the program prints the average of the numbers seen so far.
Output File Streams
An output file stream (ofstream) is a special kind of output stream (ostream). This means any operations we can perform on an output stream can also be performed on an output file stream.
Everyone's favorite file function changes lower case letters to upper case letters. I'm using the toupper() function defined in the standard C library, so I need to include <cctype>:
#include <cctype>
#include <fstream>
#include <iostream>
#include <string>
using namespace std;
int main()
{
char c;
string source, dest;
cout << "Enter input file name: ";
cin >> source;
ifstream ifs(source.c_str());
if (!ifs)
{
cerr << "can't open file";
return 1;
}
cout << "Enter output file name: ";
cin >> dest;
ofstream ofs(dest.c_str());
if (!ofs)
{
cerr << "can't open file";
return 1;
}
while (ifs.get(c))
ofs.put(toupper(c));
return 0;
}
Counting Words
Counting words and lines in a file is harder than you'd think. I increment my word counter each time I encounter a non letter following a letter. If the last character in the file was a letter, I increment my word counter again. I increment my line counter each time I encounter a newline character. If the last character in the file was not a newline, I increment my line count once more. Can you think of other improvements?
#include <cctype>
#include <fstream>
#include <iostream>
#include <string>
using namespace std;
int main()
{
char c;
int charCount = 0, wordCount = 0, lineCount = 0;
bool wasLetter = false;
string source;
cout << "Enter file name: ";
cin >> source;
ifstream ifs(source.c_str());
if (!ifs)
{
cerr << "can't open file";
return 1;
}
while (ifs.get(c))
{
charCount++;
if (isalpha(c))
wasLetter = true;
else
{
if (c == '\n') lineCount++;
if (wasLetter) wordCount++;
wasLetter = false;
}
}
if (wasLetter) wordCount++;
if (charCount > 0 && c != '\n') lineCount++; // c is unset if the file was empty
cout << "char count = " << charCount << '\n';
cout << "word count = " << wordCount << '\n';
cout << "line count = " << lineCount << '\n';
return 0;
}
Grep
Grep is a famous UNIX utility that searches a given file line by line for a given pattern. It prints the lines that contain the pattern, along with their line numbers. The ultimate pattern matching algorithm is the subject of an upper division algorithms course. I simply use the find() function encapsulated by string objects. Note that command line arguments are used to get the pattern and file name. This program also uses the standard library's getline() function to fetch an entire line and put it into a string variable:
#include <fstream>
#include <iostream>
#include <string>
using namespace std;
int main(int argc, char** argv)
{
if (argc != 3)
{
cerr << "Usage: " << argv[0] << " PATTERN FILE\n";
return 1;
}
int lineNum = 0;
string line;
string pattern = argv[1];
string file = argv[2];
ifstream ifs(file.c_str());
if (ifs)
while (getline(ifs, line))
{
lineNum++;
if (string::npos != line.find(pattern))
cout << lineNum << ":\t" << line << '\n';
}
else
{
cerr <<"can't open file\n";
return 1;
}
return 0;
}
Notice that all of these programs depend on the fact that get(), getline(), and >> return the stream itself after each extraction, so the result can be tested directly in a loop condition.
Problem
Assume we are interested in describing the distribution of a set of test scores: s1, ..., sn. The center of the distribution is given by the mean:
mean = (s1 + ... + sn)/n
The spread of the scores is given by the standard deviation:
sd = sqrt(variance);
The variance is the average variation:
variance = (v1 + ... + vn)/n
The variation of the ith score is the square of its difference from the mean:
vi = (si – mean)^2
Write a program that prompts the user for a file of test scores, then computes and prints the mean, minimum score, maximum score, and standard deviation.