Separate Compilation
Keeping a large program (say, more than 250 lines) in a single source file has several disadvantages:
1. A minor change requires recompilation of the entire file.
2. Reusing part of the program, a class for example, in another program requires a risky copy and paste operation. The class declaration and all member function implementations must be located, copied (don't press the cut button!), and pasted into another file.
3. Several programmers can't work on the program simultaneously.
For these reasons, most programs are divided into several source files, each containing logically related declarations. Each source file is compiled separately, producing a file of machine language instructions called an object file. With VC++, object files have a .obj extension (most Unix compilers use .o).
It's the job of a program called a linker to link a project's object files together into a single executable file, which has a .exe extension on Windows. The linker is responsible for associating every reference to a name in one object file with the definition of that name, which might be in another object file. (This process is called resolution.) Sometimes the definition of a name isn't in any of the project's object files. In this case the linker searches the standard C++ library (libcp.lib), the C library (libc.lib), and any special libraries specified by the programmer for the name's definition. (Only the definitions needed by the program are extracted from a library and linked into the program.) If the name's definition still can't be found, the VC++ linker generates a pair of ugly error messages:
Linking...
main.obj : error LNK2001: unresolved external symbol "int __cdecl sine(int)" (?sine@@YAHH@Z)
Debug/Lab5b.exe : fatal error LNK1120: 1 unresolved externals
Error executing link.exe.
Lab5b.exe - 2 error(s), 0 warning(s)
A source file can belong to many programs simultaneously, and there is no requirement that a source file must be contained in a particular folder (directory). This makes reusing and sharing easy.
Header Files
Multiple source file programs sound great, but there is a problem. While the compiler is willing to accept that a name can be used in a source file without being defined there (after all, it's the linker's job to find the definition), the compiler must at least know the type of the name. But the compiler only compiles one file at a time. It knows nothing of the program's other source files; it can't be expected to go searching for a name's type the same way the linker searches for a name's definition. Regrettably, it is the programmer's duty to put type declarations of all names used but not defined in a source file at the top of the file.
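As a minimal sketch of what such hand-written declarations look like, the listing below uses two names it does not define. The names half, SCALE, and scaledHalf are hypothetical, and the "other file" is folded into the same listing only so that the sketch builds as a single file.

```cpp
// main.cpp (sketch): uses names that are defined in another source file.
// half, SCALE, and scaledHalf are hypothetical names for illustration.

// Type declarations at the top of the file. They tell the compiler the
// types of the names; the linker finds the definitions later.
extern const double SCALE;
double half(double x);

double scaledHalf(int n)
{
    // Because the prototype says half takes a double, the compiler
    // converts the int n to a double before the call.
    return SCALE * half(n);
}

// other.cpp (sketch): would normally be a separate source file.
// The extern declaration (usually obtained from a header) gives the
// constant external linkage so that other files can refer to it.
extern const double SCALE;
const double SCALE = 10.0;

double half(double x)
{
    return x / 2;
}
```

Without the prototype of half, the compiler would reject the call in scaledHalf, even though a definition exists elsewhere in the program.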
#include
This job would be worse than it already is if it weren't for the #include directive. A preprocessor directive is a command beginning with a # symbol that can appear in a source or header file. When a file is compiled, it is first submitted to the C/C++ preprocessor. The preprocessor executes all directives.
Preprocessor directives usually alter the source file in some way. For example, the #include directive takes a file name for its argument. When executed by the preprocessor, it replaces itself with the entire contents of the file. (Be careful: if a file includes itself, directly or through another file, the inclusion never ends on its own, and most preprocessors eventually give up with an error.)
How do we use #include to solve our problem? If a programmer creates a source file called utils.cpp containing definitions that may be useful in other source files, the programmer also creates a header file containing the corresponding type declarations called utils.h. A second source file, say main.cpp, that uses the definitions contained in utils.cpp only needs to include the directive:
#include "utils.h"
Normally, utils.h would also include type declarations needed by utils.cpp, so utils.cpp also includes utils.h.
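To make the textual replacement concrete, here is a rough sketch of what the compiler sees after the preprocessor has run. The file names area.h and area.cpp and the functions rectangleArea and floorArea are hypothetical; the definition from area.cpp is shown in the same listing only so that the sketch builds as one file.

```cpp
// What the compiler sees for main.cpp once the preprocessor has replaced
//     #include "area.h"
// with the contents of the (hypothetical) header area.h.

// ---- text pasted in from area.h ----
double rectangleArea(double width, double height);  // prototype only

// ---- the rest of main.cpp ----
double floorArea()
{
    // The call is type-checked against the prototype above. The linker
    // later connects it to the definition that lives in area.cpp.
    return rectangleArea(4.0, 2.5);
}

// ---- area.cpp, shown here only so the sketch links as a single file ----
double rectangleArea(double width, double height)
{
    return width * height;
}
```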
Example
Assume utils.cpp contains a constant and two functions. These functions use names such as cin, cout, <<, and string, which are defined in the standard library:
utils.cpp
#include <string>
#include <iostream>
using namespace std;
#include "utils.h"
/************************************************/
// constants
// centigrade to Fahrenheit
const double CONVERSION_FACTOR = 1.8;
/************************************************/
double getNumber(const string& prompt)
/* PURPOSE: prompt user for a number
   RECEIVES: the prompt
   RETURNS: the number entered
   REMARKS: passing prompt by value causes
            a bogus return value in release
            build
*/
{
   double response;
   cout << prompt + " -> ";
   cin >> response;
   cout << "You entered " << response << '\n';
   return response;
}
/************************************************/
bool getResponse(const string& question)
/* PURPOSE: ask user yes/no question
   RECEIVES: a yes/no question
   RETURNS: response
   REMARKS: converts yes/no response into true/false
*/
{
   string response;
   bool yes, no, invalid;
   do
   {
      cout << question + " (y/n) -> ";
      cin >> response;
      yes = response == "y" || response == "Y";
      no = response == "n" || response == "N";
      invalid = !(yes || no);
      if (invalid)
         cout <<
            "Type \"y\" for \"yes,\" \"n\" for \"no\".\n";
   }
   while (invalid);
   return yes;
}
utils.h
The corresponding header file contains includes, prototypes of the two functions, an extern declaration of the constant, and an inline function:
#ifndef UTILS_H
#define UTILS_H
// includes needed by the declarations below and by clients
#include <string>
#include <iostream>
using namespace std;
/************************************************/
// constants
// centigrade to Fahrenheit
extern const double CONVERSION_FACTOR;
/************************************************/
double getNumber(const string& prompt);
/* PURPOSE: prompt user for a number
   RECEIVES: the prompt
   RETURNS: the number entered
   REMARKS: passing prompt by value causes
            a bogus return value in release
            build
*/
/************************************************/
bool getResponse(const string& question);
/* PURPOSE: ask user yes/no question
   RECEIVES: a yes/no question
   RETURNS: response
   REMARKS: converts yes/no response into true/false
*/
/************************************************/
inline double CentigradeToFahrenheit(double ctemp)
/* PURPOSE: converts centigrade to Fahrenheit
   RECEIVES: ctemp = centigrade temperature
   RETURNS: Fahrenheit temperature
*/
{
   return CONVERSION_FACTOR * ctemp + 32;
}
#endif
main.cpp
To make the compiler happy, main.cpp needs to include utils.h:
#include "utils.h"
int main()
{
   bool more = true;
   double ftemp, ctemp;
   cout << "conversion factor = ";
   cout << CONVERSION_FACTOR << '\n';
   while (more)
   {
      ctemp = getNumber("Enter a centigrade temperature");
      ftemp = CentigradeToFahrenheit(ctemp);
      cout << "Degrees Fahrenheit = " << ftemp << '\n';
      more = getResponse("Another?");
   }
   return 0;
}
Of course, all three files must be added to the project before the linker will be happy.
Standard Library Headers
utils.cpp uses names and operators defined in the standard C++ library (cin, cout, <<, >>, string), so it includes the corresponding header files:
#include <string>
#include <iostream>
For convenience, declarations of the standard library names are organized into 51 header files according to functionality (see section 16.1.2 of Stroustrup's The C++ Programming Language, 3rd ed.).
<...> vs. "..."
The names used in utils.cpp are declared in the standard headers <iostream> and <string> (Stroustrup's book can help you figure out which header files to include), but which directory are these header files located in? Fortunately, programmers don't need to know. Bracketing standard header files with angle brackets < ... > instead of quotes " ... " instructs the preprocessor to look in the "usual directory" for these files.
<iostream> vs. <iostream.h>
Standard C++ library headers such as <iostream> and <string> have no .h extension. Many older compilers also supplied a pre-standard <iostream.h>, but the two are not interchangeable. VC++, for example, has two incompatible stream libraries, the old one and the new one. The old one uses nonstandard I/O features provided by Microsoft before their counterparts were officially included in the ANSI standard library. The .h extension is used by VC++ as a signal that the programmer wants the old stream library rather than the new. Therefore, to make programs platform independent, programmers should refrain from using the .h extension when including a standard C++ header. (See the VC++ Info Viewer topic Port to the Standard C++ Library for more details.)
Note: Spaces within < ... > are significant. For example, "#include < iostream >" will fail.
Namespaces
The modularity principle says a program should be decomposed into a collection of loosely coupled, tightly cohesive modules. A file containing logically related definitions can be regarded as a module. (Functions, classes, and objects are examples of mini modules.)
The abstraction principle says the interface of a module should be independent or separate from its implementation. We can identify header files with module interfaces and source files with the corresponding implementations.
C++ provides another decomposition mechanism: namespaces. A namespace is a collection of logically related names. By declaring its names to belong to a particular namespace:
// module1.h
namespace Module1
{
   void service1();
   void service2();
   void service3();
   ...
}
// module2.h
namespace Module2
{
   void service1();
   void service2();
   void service3();
   ...
}
a module can control which names are visible to clients, and can protect itself from name conflicts with other modules.
There are three ways for a client module to use a name in a namespace. Either the name must be qualified:
// client.cpp
#include "module1.h"
#include "module2.h"
Module1::service2();
Module2::service3();
or the name can be selected from the namespace with a using declaration, then used without qualification:
// client.cpp
#include "module1.h"
#include "module2.h"
using Module1::service2;
using Module2::service3;
service2();            // Module1::service2
Module2::service2();   // not selected, so still qualified
service3();            // Module2::service3
or the entire namespace can be dumped into the client's namespace:
// client.cpp
#include "module1.h"
#include "module2.h"
using namespace Module1;
service2();            // Module1::service2
service3();            // Module1::service3
Module2::service3();   // names from other namespaces still need qualification
The entire standard library exists in a namespace called std, so standard library names should be qualified by clients:
std::cout << "Hello, world!\n";
This gets a little tedious, so lazy programmers (like Stroustrup) often dump the entire std namespace into the global namespace:
#include <string>
#include <iostream>
using namespace std;
Note: The old <iostream.h> makes its names available in the global namespace, so they need no qualification; <iostream> declares them in std.
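The three styles can be compared side by side in one small file. This is only a sketch: the function bodies, which return their own names as strings, and the client functions qualified, selected, and dumped are made up for illustration. Here the using declarations and the using directive are placed inside functions, which limits their effect to that function.

```cpp
#include <string>

// Two modules that offer services with the same names.
// The bodies are hypothetical: each returns its own qualified name.
namespace Module1
{
    std::string service2() { return "Module1::service2"; }
    std::string service3() { return "Module1::service3"; }
}

namespace Module2
{
    std::string service2() { return "Module2::service2"; }
    std::string service3() { return "Module2::service3"; }
}

// Style 1: qualify every name.
std::string qualified()
{
    return Module1::service2() + ", " + Module2::service3();
}

// Style 2: select individual names with using declarations.
std::string selected()
{
    using Module1::service2;
    using Module2::service3;
    return service2() + ", " + service3();
}

// Style 3: dump a whole namespace with a using directive.
std::string dumped()
{
    using namespace Module1;
    return service2() + ", " + service3();
}
```

Calling qualified() and selected() yields the same pair of names, while dumped() resolves both calls to Module1.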
The One-Definition Rule (ODR)
What may a header file contain? Why not include utils.cpp in main.cpp and save the linker a lot of work?
Definitions vs. Declarations
A definition binds a name to a value. Here are some examples of C++ definitions:
int num = 42;
double square(double x) { return x * x; }
const double pi = 3.1416;
struct Date { int d, m, y; };
template <class T> T id(T x) { return x; }
typedef stack<Date> DateStack;
These definitions create the following bindings: num to a variable containing 42, square to a function, pi to 3.1416, Date to a type, id to a template, and DateStack to a type.
A declaration binds a name to a type. All definitions are declarations because they implicitly bind the name to the type of the associated value. For example, num is bound to int, square is bound to double square(double), pi is bound to double, Date is bound to struct, id is bound to template <class T> T id(T), and DateStack is bound to stack<Date>.
Some declarations bind names to types but not to values. Let's call these pure type declarations:
DECLARATIONS = DEFINITIONS + PURE TYPE DECLARATIONS
Here are some examples of pure type declarations:
double square(double x);
extern int num;
struct Date;
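The difference shows up as soon as a declaration is repeated. In the following sketch, which reuses num, square, and Date from the examples above and adds a hypothetical helper yearOf, every pure type declaration appears twice without complaint, while each definition appears exactly once.

```cpp
// Pure type declarations: each may appear any number of times,
// as long as the occurrences agree with one another.
extern int num;
extern int num;

double square(double x);
double square(double x);

struct Date;
struct Date;

// Definitions: each appears exactly once.
int num = 42;

double square(double x)
{
    return x * x;
}

struct Date { int d, m, y; };

// yearOf is a hypothetical helper. It reads a member of Date, so it
// needs the full definition of Date, not just the declaration.
int yearOf(const Date& date)
{
    return date.y;
}
```

Duplicating any of the three definitions in this file would be rejected by the compiler.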
Definitions Must Be Unique
The One-Definition Rule (ODR) states that a definition may only occur once in a C++ program, but a pure type declaration can occur multiple times as long as the occurrences don't contradict each other. For example, having declared:
double square(double);
in a program, I may not also declare in the same program:
Date square(double);
(But I may declare double square(Date); Why?)
Contents of a Header File
Because a header file might be included in several source files (utils.h was included in utils.cpp and main.cpp), placing the definition of num or square in a header file might result in a compiler or linker error. (Why?) By contrast, header files are a good place for pure type declarations.
ODR Exceptions
ODR allows multiple definitions of types (enums, structs, unions, classes, and typedefs), inline functions, and templates on three conditions:
1. They appear in different source files.
2. They are token-for-token identical.
3. The meanings of the tokens remain the same.
Here are three ways of violating these conditions. See if you can spot them:
Example 1:
// file1.cpp
struct Date { int d, m, y; };
struct Date { int d, m, y; };
Example 2:
// file1.cpp
struct Date { int d, m, y; };
// file2.cpp
struct Date { int d, m, year; };
Example 3:
// file1.cpp
typedef int INT;
struct Date { INT d, m, y; };
// file2.cpp
typedef char INT;
struct Date { INT d, m, y; };
More Header File Contents
We have already seen that preprocessor directives, namespace declarations, and pure type declarations are placed in header files. Thanks to the ODR exceptions, type, template, and inline function definitions may also be placed in header files.
Type definitions are placed in header files because a type is frequently needed in several source files. Inline functions must go in header files because they are needed by the compiler to expand calls to the inline functions where they occur in source files. Template definitions must also go in header files because they are needed by the compiler to create instances of the template where they occur in source files. (Inline functions and templates are advanced C++ features you will learn about later.)
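Putting these rules together, a header might look roughly like the sketch below. The file name shapes.h and the names Point, dot, larger, and length are hypothetical and serve only to show one item of each kind.

```cpp
// shapes.h (hypothetical header used only for illustration)
#ifndef SHAPES_H
#define SHAPES_H

// Type definition: allowed in a header by the ODR exceptions.
struct Point { double x, y; };

// Inline function definition: the compiler needs the body wherever
// a call is expanded.
inline double dot(const Point& a, const Point& b)
{
    return a.x * b.x + a.y * b.y;
}

// Template definition: the compiler needs the body wherever the
// template is instantiated.
template <typename T>
T larger(const T& a, const T& b)
{
    return a < b ? b : a;
}

// Pure type declaration: the definition of this ordinary function
// belongs in a source file such as shapes.cpp, not in the header.
double length(const Point& p);

#endif
```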
#ifndef/#endif
There is a problem with the exceptions to the ODR. If a header file such as utils.h contains a struct definition:
// utils.h
struct Date { int d, m, y; };
// etc.
and if a second header file, tools.h, includes utils.h:
// tools.h
#include "utils.h"
// etc.
then any source file that unwittingly includes utils.h and tools.h:
// main.cpp
#include "utils.h" // needed?
#include "tools.h"
// etc.
violates the first condition of the exceptions to the ODR, because the definition of Date now occurs twice within main.cpp.
To avoid this problem programmers surround the contents of their header files with #ifndef/#endif directives:
// utils.h
#ifndef UTILS_H
#define UTILS_H
struct Date { int d, m, y; };
// etc.
#endif
If the symbol UTILS_H has already been defined by the preprocessor, then everything is ignored until the matching #endif directive. Otherwise, the symbol UTILS_H is defined using the #define directive, and the declarations inside are processed normally.
See section 4.9 and chapter 9 of Stroustrup's The C++ Programming Language, 3rd ed., for more details about header files.
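The effect is easiest to see by writing out what main.cpp looks like after preprocessing, with both copies of utils.h pasted in. The sketch below does this by hand; the helper yearOf is hypothetical and only shows that Date is usable afterwards.

```cpp
// main.cpp after preprocessing (sketch): the text of utils.h arrives
// twice, once directly and once through tools.h.

// ---- first copy of utils.h ----
#ifndef UTILS_H
#define UTILS_H
struct Date { int d, m, y; };
#endif

// ---- second copy of utils.h (pulled in by tools.h) ----
// UTILS_H is already defined, so the preprocessor skips everything up
// to the matching #endif and Date is defined only once.
#ifndef UTILS_H
#define UTILS_H
struct Date { int d, m, y; };
#endif

// ---- rest of main.cpp ----
// yearOf is a hypothetical helper for illustration.
int yearOf(const Date& date)
{
    return date.y;
}
```

Removing the #ifndef, #define, and #endif lines from both copies leaves two definitions of Date in one file, which the compiler reports as a redefinition.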
Problems
Problem 1
In the last chapter we defined several functions that probably can be reused in other applications:
inline double square(double x) { ... }
inline double cube(double x) { ... }
const double pi = acos(-1);
const double e = exp(1);
template <typename Data>
void swap(Data& x, Data& y) { ... }
#define DEBUG_MODE true
void error(const string& gripe) { ... }
bool getResponse(const string& question) { ... }
template <typename Data>
void getData(Data& var, const string& prompt) { ... }
template <typename Value>
string toString(const Value& val) { ... }
template <typename Value>
void fromString(Value& val, const string& str) { ... }
istream& getLine(istream& in, string& str) { ... }
void controlLoop(const string& prompt = "command") { ... }
Create files called utils.h and utils.cpp containing these declarations. You may add declarations of other reusable functions to these files, but be discriminating. You may add a function, then later discover a better function, but now you must leave the old version in because old programs depend on it.
Problem 2
Reorganize the Blackjack program into separate files.
Problem 3
Reorganize the calculator program into separate files.