Strings

Characters

Here are some example declarations of character variables initialized by character literals. Notice that literal characters are bracketed by single quotes and that char is the name of the type of all characters:

char zero = '0', quote = '"', a = 'a', period = '.', space = ' ';

Internally, a character is represented by an integer. For example:

cout << "the code for " << period << " = " << int(period) << '\n';

prints "the code for . = 46" on my computer, which uses ASCII character codes.

An appropriate integer can be converted into the character it represents. For example, the statement:

cout << char(98) << '\n';

prints b on my computer.

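Control characters represent special keys such as the Return and Tab keys. They, along with a few characters that are awkward to type between quotes (such as the backslash itself), have special literal representations that begin with the backslash character: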

char newline = '\n',
      tab = '\t',
      backslash = '\\',
      bell = '\a',
      ret = '\r',
      backspace = '\b';

The standard C library contains functions for classifying and manipulating characters. They are declared in <cctype>:

int isalpha(int);
int isupper(int);
int islower(int);
int isdigit(int);
int isspace(int);
int iscntrl(int);
int ispunct(int);
int isalnum(int);

int toupper(int);
int tolower(int);

Here's a little program for determining character codes. Unfortunately, we'll have to make the user type <Ctrl> + c to break out of the loop:

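#include <iostream>
#include <limits>
using namespace std;

int main()
{
   while(true)
   {
      cout << "Enter a character (<Ctrl>c to quit): ";
      char response = cin.get();
      cout << "int(" << response << ") = ";
      cout << int(response) << '\n';
      // discard the rest of the line, including '\n'
      // (cin.sync() is not guaranteed to do this on every compiler)
      cin.ignore(numeric_limits<streamsize>::max(), '\n');
   }

   return 0;
}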

Here's a sample output produced on my computer, which uses ASCII character codes:

Enter a character (<Ctrl>c to quit): 6
int(6) = 54
Enter a character (<Ctrl>c to quit): 7
int(7) = 55
Enter a character (<Ctrl>c to quit): a
int(a) = 97
Enter a character (<Ctrl>c to quit): A
int(A) = 65
Enter a character (<Ctrl>c to quit): b
int(b) = 98
Enter a character (<Ctrl>c to quit): B
int(B) = 66
Enter a character (<Ctrl>c to quit):
int( ) = 32
Enter a character (<Ctrl>c to quit): ?
int(?) = 63
Enter a character (<Ctrl>c to quit):

C Strings

A C string variable is simply an array of characters. The array can have a fixed size or, to allow C strings of varying sizes, it can be allocated in the heap and reached through a pointer:

typedef char String80[80];
typedef char* String;

A C string literal is a sequence of characters bracketed by double quotes. Backslash is the escape character:

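// a string literal cannot be modified, so point to it with const char*
const char* path = "D:\\aaa\\nnn\n"; // = D:\aaa\nnn plus a newline
String80 prompt = "Type \"q\" to quit\n"; // = Type "q" to quit plus a newline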

C strings are always terminated by the NUL character, where NUL = char(0). The standard C library provides functions for manipulating C strings; they are declared in <cstring>.

Programmers should use C++ strings (see below) instead of C strings, because C strings don't check for out-of-range indices, and they don't allocate and deallocate memory for themselves. Instead, these jobs are left to the programmer.

There are several places where C strings are still used in C++. One example is command-line arguments. The DOS console always passes the entire command line, including the command name, to main() as an array of C strings (more precisely, an array of pointers to char, conventionally named argv). The console also passes the length of this array to main() (conventionally named argc).

test.cpp (A simple expression evaluator)

#include <string>    // string functions declared here (not used)
#include <cmath>     // pow declared here
#include <cstdlib>   // atof declared here
#include <iostream>
using namespace std;

int main(int argc, char* argv[])
{
   if (argc != 4)
   {
      cerr << "usage: " << argv[0] << " NUMBER OPERATOR NUMBER\n";
      return 1;
   }

   double num1 = atof(argv[1]);
   double num2 = atof(argv[3]);
   double num = 0; // result stored here
   char op = argv[2][0];

   switch (op)
   {
      case '+': num = num1 + num2; break;
      case '*': num = num1 * num2; break;
      case '-': num = num1 - num2; break;
      case '/': num = num1 / num2; break;
      case 'e': num = pow(num1, num2); break; // DOS won't allow ^
      default:
         cerr << "unrecognized operator: " << op << '\n';
         return 1;
   } // switch

   cout << "result = " << num << '\n';
   return 0;
}

Program Output

Notice that after the executable is renamed, the usage message shows the new name, because argv[0] holds the command name as it was typed:

D:>rename string3.exe eval.exe

D:>eval 22 + 15
result = 37
D:>eval 3.2 * 7
result = 22.4
D:>eval 15 / 3
result = 5
D:>eval 15 - 5
result = 10
D:>eval 2 e 5
result = 32
D:>eval 23
usage: eval NUMBER OPERATOR NUMBER

D:>eval 6 # 2
unrecognized operator: #

D:>

C++ Strings

The standard C++ library provides a string class declared in <string>. A C++ string is an object that encapsulates many useful functions. The program below demonstrates some of these functions.

test.cpp

Each commented group of statements demonstrates one of the most important string operations:

#include <cstdio>
#include <iostream>
#include <stdexcept> // out_of_range declared here
#include <string>
using namespace std;

int main()
{
   // converting C strings to C++ strings
   string s1("California"), s2, s3;
   s2 = "Nevada";

   // converting C++ strings to C strings
   printf("%s\n\n", s1.c_str());

   cout << "s3 empty? = " << s3.empty() << "\n\n";

   // assignment copies strings
   s3 = s2;
   cout << "s2 = " << s2 << '\n';
   cout << "s3 = " << s3 << '\n';
   s3[0] = 'n';
   cout << "s2 = " << s2 << '\n';
   cout << "s3 = " << s3 << "\n\n";

   // strings can grow
   s3 = "Nevada is in the USA";
   cout << s3 << "\n\n";

   // 3 ways to traverse a string

   // 1. subscripts (no bounds checking)
   for(string::size_type i = 0; i < s1.size(); i++)
      cout << s1[i] << ' ';

   cout << '\n';

   // 2. at() does index bounds checking
   for(string::size_type i = 0; i < s2.length(); i++)
      cout << s2.at(i) << ' ';

   cout << '\n';

   // 3. iterators
   string::iterator p;
   for(p = s1.begin(); p != s1.end(); p++)
      cout << *p << ' ';

   cout << "\n\n";

   // error handling

   try
   {
      cout << s1.at(500); // index too big!
   }
   catch(const out_of_range& e)
   {
      cerr << e.what() << "\n\n";
   }

   // comparing strings

   cout << s1 << " == " << s2 << " = " << (s1 == s2) << '\n';
   cout << s1 << " != " << s2 << " = " << (s1 != s2) << '\n';
   cout << s1 << " <= " << s2 << " = " << (s1 <= s2) << "\n\n";

   // concatenation
   cout << s1 + ' ' + "is next to" + ' ' + s2 << "\n\n";

   // finding substrings
   string::size_type pos = s1.find("for"); // string::npos if not found
   s3 = s1.substr(pos, 3); // 3 chars starting at pos
   cout << s3 << '\n';
   s3 = s1.substr(pos, string::npos); // tail of s1
   cout << s3 << "\n\n";

   // replacing substrings
   s3 = s1;
   s3.replace(pos, 3, "XXXXX");
   cout << s3 << "\n\n";

   // stats
   cout << "s1's max size = " << s1.max_size() << '\n';
   cout << "s1's capacity = " << s1.capacity() << '\n';
   cout << "s1's size = " << s1.size() << "\n\n";

   // erasing substrings
   s1.erase(pos, 3);
   cout << s1 << '\n';
   cout << "s1's max size = " << s1.max_size() << '\n';
   cout << "s1's capacity = " << s1.capacity() << '\n';
   cout << "s1's size = " << s1.size() << "\n\n";

   // string I/O
   cout << "Enter 3 strings separated by white space: ";
   cin >> s1 >> s2 >> s3;
   cout << "\nYou entered: " << s1 << ' ' << s2 << ' ' << s3 << '\n';

   return 0;
}

Program Output

California

s3 empty? = 1

s2 = Nevada
s3 = Nevada
s2 = Nevada
s3 = nevada

Nevada is in the USA

C a l i f o r n i a
N e v a d a
C a l i f o r n i a

invalid string position

California == Nevada = 0
California != Nevada = 1
California <= Nevada = 1

California is next to Nevada

for
fornia

CaliXXXXXnia

s1's max size = 4294967293
s1's capacity = 31
s1's size = 10

Calinia
s1's max size = 4294967293
s1's capacity = 31
s1's size = 7

Enter 3 strings separated by white space: shoe crab rotator

You entered: shoe crab rotator