Unix Lab



The grep utility


Searching files for text

Suppose you know that a file you are looking for contains a specific text string, but you don't remember the name of the file. Or suppose that, in a Java program consisting of many files, you would like to find the source code for a specific method. The grep utility can help you in both of these cases.


Simple uses of grep

Copy the file /handouts/cs46blab/graph.usa into your working directory. This file contains a list of pairs of city names together with the distance in miles between them. Examine the first few lines with the head command.

In its simplest form, the grep command will look like:

grep string file(s)

where string is a string of characters you are looking for in the one or more files you have named. For example, let's look for the string Denver in the file graph.usa:

grep Denver graph.usa
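
If the lab file is not at hand, the following sketch shows the same kind of search on a small stand-in file. The city pairs, the distances, and the one-pair-per-line layout are made up for illustration; the real graph.usa may be laid out differently.

```shell
# Build a tiny stand-in for graph.usa (hypothetical data, not the lab file).
sample=$(mktemp)
cat > "$sample" <<'EOF'
Denver Cheyenne 101
Boulder Denver 28
Omaha Lincoln 58
EOF

# Print every line that contains the string Denver anywhere on the line.
matches=$(grep Denver "$sample")
printf '%s\n' "$matches"

rm -f "$sample"
```

Both lines that mention Denver are printed, whether the name comes first or second on the line.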

Copy all the files with a .java extension from /handouts/cs46blab into your working directory. If you look at the Patterns.java file (using more, for example) you will see that the update method is called in the program. Suppose we want to find out where the update method is defined in these files (here you could easily find it by inspection, but pretend you had many files and did not know where to look). Type:

grep update *

What is the role of the '*' character? If you are not sure, check out the use of metacharacters. Which files contain mention of update? How can you tell?

Knowing the file may be useful, but it would be more useful to know the line number within the file. Type:

grep -n update *

Compare this output with the previous output and determine the line numbers on which update is mentioned in each file. In which file, and on what line, does the definition of the update method start?
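
As a rough guide to reading the -n output, here is a sketch that runs the same command in a scratch directory. The two Java files, Caller.java and Model.java, are hypothetical and stand in for the lab files; only the shape of the output carries over.

```shell
# Work in a scratch directory holding two hypothetical Java files.
dir=$(mktemp -d)
cat > "$dir/Caller.java" <<'EOF'
class Caller {
  void run(Model m) { m.update(); }
}
EOF
cat > "$dir/Model.java" <<'EOF'
class Model {
  int n;
  void update() { n++; }
}
EOF

# With several files and -n, each match is printed as file:line-number:text.
numbered=$(cd "$dir" && grep -n update *)
printf '%s\n' "$numbered"

rm -rf "$dir"
```

Each output line starts with the file name, then the line number, then the matching line itself, separated by colons.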

Use the man command to determine the purpose of the -i option for grep. What about the -c option?


Using regular expressions

Sometimes we need to be more specific about the string match that we are looking for.

As an example, consider the graph.usa file again. Suppose we want to find all lines containing the string Denver but only if it appears at the beginning of the line. We can use regular expressions for this. If you are unfamiliar with regular expressions please review that module.

To look for occurrences of Denver at the beginning of a line, type:

grep '^Denver' graph.usa

The single quotes surrounding the regular expression "protect" it from the shell, so that the shell doesn't try to interpret the metacharacter '^' before grep sees it. Compare this output to the output of the earlier grep Denver command.

The '^' character at the beginning of the regular expression forces the match to be anchored at the beginning of the line. If the expression doesn't find a match at the beginning of the line, it's not a match.
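
The effect of the anchor can be sketched on a small stand-in file. The data below is hypothetical and only assumes one city pair per line.

```shell
# Hypothetical stand-in data: Denver appears first on one line, second on another.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Denver Cheyenne 101
Boulder Denver 28
Omaha Lincoln 58
EOF

# Unanchored: counts lines with Denver anywhere on the line.
anywhere=$(grep -c 'Denver' "$sample")
# Anchored: keeps only lines that start with Denver.
at_start=$(grep '^Denver' "$sample")

printf 'lines mentioning Denver: %s\n' "$anywhere"
printf '%s\n' "$at_start"

rm -f "$sample"
```

Two lines mention Denver, but only the line that begins with it survives the anchored search.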

The '$' character works the same way at the other end of the line. When it appears at the end of a regular expression, it forces the match to be anchored at the end of the line. If the expression doesn't find a match at the end of the line, it's not a match.
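
A minimal sketch of end-of-line anchoring, again on made-up data that assumes the distance is the last field on each line:

```shell
# Hypothetical stand-in data: "28" ends one line and sits inside "281" on another.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Boulder Denver 28
Dallas Tulsa 281
Omaha Lincoln 58
EOF

# Unanchored: counts lines with 28 anywhere, including inside 281.
anywhere=$(grep -c '28' "$sample")
# Anchored: keeps only lines whose last two characters are 28.
at_end=$(grep '28$' "$sample")

printf 'lines containing 28: %s\n' "$anywhere"
printf '%s\n' "$at_end"

rm -f "$sample"
```

The line ending in 281 contains 28 but does not end with it, so the anchored search drops it.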

To look for all lines in which the distance between cities is less than 100 miles (more precisely, a two-digit distance from 10 through 99), type:

grep ' [1-9][0-9]$' graph.usa

Do you understand why this works?
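
One way to check your reasoning is to try the pattern on a few made-up distances. The sketch below uses hypothetical city pairs and assumes the distance is the last field on the line, preceded by a single space.

```shell
# Hypothetical stand-in data with one-, two-, and three-digit distances.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Denver Cheyenne 101
Boulder Denver 28
Omaha Lincoln 58
Oakland Berkeley 7
Austin Waco 100
EOF

# The pattern reads: a space, one digit 1-9, one digit 0-9, then end of line.
# Only a number of exactly two digits at the end of the line can match.
short=$(grep ' [1-9][0-9]$' "$sample")
printf '%s\n' "$short"

rm -f "$sample"
```

The three-digit distances fail because a third digit stands between the two matched digits and the end of the line, and the one-digit distance fails because the pattern demands two digits.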

How would you find out the line numbers for these lines?

Suppose we want to find all lines in graph.usa that mention a city whose name ends in the characters City. In particular, we will accept any alphabetic character followed by City. What is the grep command to do this?

What if we want to find names of cities that end in an 's' character? (Hint: look for an s character followed by a blank.)


You can find a more comprehensive guide to regular expressions using:

man -s5 regex

(scroll down until you find the material). On systems that keep this page in a different manual section (Linux, for example), try man 7 regex instead. This will give you more flexibility in how to specify strings for which you want to search.

Finally, you can use more powerful regular expressions with a more capable version of grep called egrep (on current systems, grep -E is the equivalent spelling). If you find that grep doesn't understand your regular expression, try egrep.
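
As a sketch of the difference, the example below searches for either of two city names at the start of a line. It uses grep -E, the POSIX spelling of egrep, and runs on hypothetical stand-in data.

```shell
# Hypothetical stand-in data.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Denver Cheyenne 101
Omaha Lincoln 58
Austin Waco 100
EOF

# Basic grep treats ( | ) as ordinary characters, so nothing matches here.
# "|| true" keeps the script going, since grep exits non-zero on no match.
basic=$(grep -c '^(Denver|Omaha) ' "$sample" || true)

# Extended syntax treats (Denver|Omaha) as "Denver or Omaha".
extended=$(grep -E '^(Denver|Omaha) ' "$sample")

printf 'basic grep matches: %s\n' "$basic"
printf '%s\n' "$extended"

rm -f "$sample"
```

The same pattern finds nothing with plain grep and two lines with the extended syntax, which is the kind of case where switching to egrep helps.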


These pages were developed by John Avila SJSU CS Dept.