Grep Command Tutorial For Unix


Extended Regular Expressions (egrep or grep -E)

grep -E and egrep are the same command. Both search files for patterns that are interpreted as extended regular expressions. An extended regular expression goes beyond the previously mentioned options; it uses additional metacharacters to create more complex and powerful search strings. grep -E and grep take the same command-line options; the only differences are in how they process the search pattern:

?

? in an expression carries the meaning of optional. The character immediately preceding the question mark may or may not appear in the target string. For example, say you are looking for the word “behavior”, which can also be written as “behaviour”. Instead of using the or (|) option, you can use the command:

egrep 'behaviou?r' filename

 

As a result, the search is successful for both “behavior” and “behaviour” because it will treat the presence or absence of the letter “u” the same way.
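
As a quick check of the optional match, the sketch below pipes a few sample words into grep -E (equivalent to egrep). The sample words are made up for illustration; the third has two “u” characters and should not match.

```shell
# Sample words are illustrative; only the first two should match.
matches=$(printf '%s\n' behavior behaviour behaviouur | grep -E 'behaviou?r')
printf '%s\n' "$matches"
```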

+

The plus sign looks at the previous character and requires it to appear at least once, while allowing any number of further repetitions. For instance, the following command would match both “pattern1” and “pattern11111”, but would not match “pattern”:

egrep 'pattern1+' filename
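
The same behavior can be tried on standard input. This is a minimal sketch with made-up input lines, using grep -E in place of egrep.

```shell
# "pattern" has no trailing 1, so it is filtered out.
matches=$(printf '%s\n' pattern pattern1 pattern11111 | grep -E 'pattern1+')
printf '%s\n' "$matches"
```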

{n,m}

The braces specify how many times the preceding character needs to be repeated before a match occurs. For instance, instead of searching for “patternnnn”, you could enter the following command:

egrep 'pattern{4}' filename

 

This will match any string that contains “patternnnn” without going through the trouble of typing out repeated strings. In order to match at least four repetitions, you would use the following command:

egrep 'pattern{4,}' filename

 

On the other hand, look at the following example:

egrep 'pattern{,4}' filename

 

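Although it would fit in with the conventions already used, this form is not portable. POSIX does not define an interval with the lower bound omitted, so some versions of egrep will not treat it as “no more than four”. GNU grep accepts {,4} as an extension meaning “at most four repetitions”. To express an upper limit portably, write the lower bound explicitly, as in pattern{0,4}.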

To match between four and six repetitions, use the following:

egrep 'pattern{4,6}' filename
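
One point worth testing is that an unanchored interval still matches longer runs, because the line only has to contain the pattern. The sketch below generates illustrative lines ending in one to seven “n” characters and counts matches with and without -x, which requires the whole line to match.

```shell
# Lines "pattern", "patternn", ... ending in 1 to 7 n characters (illustrative data).
samples=$(w=patter; for i in 1 2 3 4 5 6 7; do w="${w}n"; printf '%s\n' "$w"; done)

# Unanchored: any line containing "patter" plus at least four n's matches.
unanchored=$(printf '%s\n' "$samples" | grep -E -c 'pattern{4,6}')

# -x requires the whole line to match, so the 7-n line is excluded.
whole_line=$(printf '%s\n' "$samples" | grep -E -x -c 'pattern{4,6}')

echo "unanchored: $unanchored, whole line: $whole_line"
```

If the upper bound matters, anchoring the pattern or using -x is one way to enforce it.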

|

Used in a regular expression, the pipe character signifies “or.” As a result, pipe (|) allows you to combine several patterns into one expression. For example, suppose you need to find either of two names in a file. You could issue the following command:

egrep 'name1|name2' filename

 

It would match on lines containing either “name1” or “name2”.
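
A small sketch of alternation, assuming hypothetical names and scores as input:

```shell
# Names and values are made up; lines for alice and carol should be printed.
matches=$(printf '%s\n' 'alice: 10' 'bob: 20' 'carol: 30' | grep -E 'alice|carol')
printf '%s\n' "$matches"
```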

( )

Parentheses can be used to “group” particular strings of text for the purposes of backreferences, alternation, or simply readability. Additionally, the use of parentheses can help resolve any ambiguity in precisely what the user wants the search pattern to do. Patterns placed inside parentheses are often called subpatterns.

Parentheses also limit the scope of the pipe (|). This allows the user to more tightly define which strings are part of the “or” operation. For instance, to search for lines that contain either “pattern” or “pattarn”, you would use the following command:

egrep 'patt(a|e)rn' filename

 

Without the parentheses, the search pattern would be patta|ern, which would match if the string “patta” or “ern” is found, a very different outcome than the intention.
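
The difference between the grouped and ungrouped forms can be seen side by side. In this sketch the extra words “modern” and “patta” are illustrative additions chosen to trigger the ungrouped pattern.

```shell
samples=$(printf '%s\n' pattern pattarn modern patta)

# Grouped: only the two intended spellings match.
grouped=$(printf '%s\n' "$samples" | grep -E 'patt(a|e)rn')

# Ungrouped: anything containing "patta" or "ern" matches, so all four lines do.
ungrouped=$(printf '%s\n' "$samples" | grep -E 'patta|ern')

printf 'grouped:\n%s\nungrouped:\n%s\n' "$grouped" "$ungrouped"
```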

 

In basic regular expressions, the backslash (\) negates the metacharacter’s behavior and forces the search to match the character in a literal sense. The same happens in egrep, but there is an exception. The metacharacter { is not supported by the traditional egrep. Although some versions interpret \{ literally, it should be avoided in egrep patterns. Instead, [{] should be used to match the character without invoking the special meaning.
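
A minimal sketch of the bracket form, using two made-up lines of source text as input:

```shell
# [{] matches a literal left brace without relying on backslash escaping.
matches=$(printf '%s\n' 'if (x) {' 'return x;' | grep -E '[{]')
printf '%s\n' "$matches"
```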

It is not precisely true that basic grep does not have these metacharacters as well. It does, but they cannot be used directly. Each of the special metacharacters in extended regular expressions needs to be prefaced by an escape to draw out its special meaning. Note that this is the reverse of normal escaping behavior, which usually strips special meaning.

 

Table 3 illustrates how to use the extended regular expressions metacharacters with basic grep.

Table 3. Basic versus extended regular expressions comparison

Basic regular expressions     Extended regular expressions

'\(red\)'                     '(red)'
'a\{1,3\}'                    'a{1,3}'
'behaviou\?r'                 'behaviou?r'
'pattern\+'                   'pattern+'

 

Table 3 shows why people prefer to just use extended grep when they want to use extended regular expressions. Convenience aside, it is also easy to forget a necessary escape in a basic regular expression, which causes the metacharacter to be read as a literal character, so the pattern silently fails to match what was intended. An ideal regular expression should be clear and use as few characters as possible.
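
The effect of a forgotten escape can be sketched as follows. The three input lines are illustrative, and -x is used so that only whole-line matches count. With the braces left unescaped, basic grep treats them as literal characters and matches only the line that literally reads a{1,3}.

```shell
samples=$(printf '%s\n' a aaa 'a{1,3}')

# Basic syntax with escaped braces and extended syntax behave the same.
bre=$(printf '%s\n' "$samples" | grep -x 'a\{1,3\}')
ere=$(printf '%s\n' "$samples" | grep -E -x 'a{1,3}')

# Escapes forgotten in basic grep: the braces are taken literally.
unescaped=$(printf '%s\n' "$samples" | grep -x 'a{1,3}')

printf 'bre:\n%s\nere:\n%s\nunescaped:\n%s\n' "$bre" "$ere" "$unescaped"
```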

 

 

Fixed Strings (fgrep or grep -F)

 

In the following section, we discuss grep -F, or fgrep. fgrep is known as fixed string or fast grep. It is known as “fast grep” because of the great performance it has compared to grep and egrep. It accomplishes this by dropping regular expressions altogether and looking for a defined string pattern. It is useful for searching for specific static content in a precise manner, much like searching for an exact phrase.

 

The command to invoke fgrep is:

fgrep string_pattern filename
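
To see what dropping regular expressions means in practice, the sketch below searches for a string containing a dot. The price lines are made up. As a regular expression the dot matches any character, while grep -F (equivalent to fgrep) treats it as a literal dot.

```shell
samples=$(printf '%s\n' 'price: 1.50' 'price: 1x50')

# Regular expression: "." matches any character, so both lines count.
as_regex=$(printf '%s\n' "$samples" | grep -c '1.50')

# Fixed string: only the line with a literal "1.50" counts.
as_fixed=$(printf '%s\n' "$samples" | grep -F -c '1.50')

echo "regex: $as_regex, fixed: $as_fixed"
```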

By design, fgrep was intended to operate fast and free of intensive functions; as a result, it can take a more limited set of command-line options. The most common ones are:

-b
fgrep -b string_pattern filename

Shows where in the file the string_pattern was found. Traditional Unix versions of fgrep report the block number, while GNU grep reports a byte offset. Because entire lines are printed by default, the byte number displayed is the byte offset of the start of the line.
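
The following sketch assumes GNU grep (or another implementation that reports byte offsets for -b); other versions may print a block number instead. The three input lines are illustrative: “alpha” and “beta” plus their newlines occupy 11 bytes, so the matching line starts at offset 11.

```shell
# Assumes -b prints a byte offset, as GNU grep does.
offset_line=$(printf 'alpha\nbeta\ngamma\n' | grep -F -b 'gamma')
echo "$offset_line"
```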

-c
fgrep -c string_pattern filename

 

This counts the number of lines that contain one or more instances of the string_pattern.
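
Note that the count is of lines, not of individual occurrences. In this sketch with made-up input, the first line contains the string twice but is counted once.

```shell
# Two lines contain "error", even though the string appears three times.
count=$(printf '%s\n' 'error error' 'ok' 'error' | grep -F -c 'error')
echo "$count"
```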

-e string
fgrep -e string_pattern filename

 

Used to search for more than one pattern or when the string_pattern begins with a hyphen. Though you can use a newline character to specify more than one string, you could instead use multiple -e options, which is useful in scripting:

fgrep -e string_pattern1 -e string_pattern2 filename

-f file
fgrep -f pattern_file filename

Reads the strings to search for from pattern_file, one per line, instead of taking them from the command line. This is the same behavior as the -f option in grep. To save the results of a search in a file instead of printing them to the terminal, use shell redirection (>) rather than an option.
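
As a sketch of supplying several strings at once, the example below uses repeated -e options and then a file of strings passed with -f, which POSIX and GNU grep both read as a list of patterns, one per line. The names, scores, and temporary file are hypothetical.

```shell
samples=$(printf '%s\n' 'alice: 10' 'bob: 20' 'carol: 30')

# Several strings given directly on the command line.
with_e=$(printf '%s\n' "$samples" | grep -F -e 'alice' -e 'carol')

# The same strings read from a file, one per line.
tmp=$(mktemp)
printf '%s\n' 'alice' 'carol' > "$tmp"
with_f=$(printf '%s\n' "$samples" | grep -F -f "$tmp")
rm -f "$tmp"

# -e also protects a string that begins with a hyphen.
hyphen=$(printf '%s\n' 'use -v here' | grep -F -e '-v')

printf '%s\n' "$with_e" "$with_f" "$hyphen"
```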

-h
fgrep -h string_pattern filename

When the search is done in more than one file, using -h stops fgrep from displaying filenames before the matched output.
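
The sketch below creates two small files in a temporary directory (file names and contents are made up) and compares the output with and without -h.

```shell
dir=$(mktemp -d)
printf 'needle\n' > "$dir/a.txt"
printf 'needle\n' > "$dir/b.txt"

# With several files, each match is prefixed with its filename.
with_names=$(cd "$dir" && grep -F 'needle' a.txt b.txt)

# -h suppresses the filename prefix.
without_names=$(cd "$dir" && grep -F -h 'needle' a.txt b.txt)
rm -rf "$dir"

printf '%s\n' "$with_names" "$without_names"
```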

-i
fgrep -i string_pattern filename

 

The -i option tells fgrep to ignore capitalization contained in the string_pattern when matching the pattern.
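
For instance, with made-up log lines, an uppercase search string still finds both capitalizations:

```shell
# Both "Error" and "error" match the uppercase search string.
matches=$(printf '%s\n' 'Error: disk full' 'error: retry' 'ok' | grep -F -i 'ERROR')
printf '%s\n' "$matches"
```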

-l
fgrep -l string_pattern filename

Displays the files containing the string_pattern but not the matching lines themselves.
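
A short sketch with two hypothetical files, only one of which contains the string:

```shell
dir=$(mktemp -d)
printf 'needle here\n' > "$dir/a.txt"
printf 'nothing\n' > "$dir/b.txt"

# Only the name of the file that contains the string is printed.
files=$(cd "$dir" && grep -F -l 'needle' a.txt b.txt)
rm -rf "$dir"

echo "$files"
```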

-n
fgrep -n string_pattern filename

 

Prints out the line number before the line that matches the given string_pattern.
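
With three illustrative input lines, the match on the second line is prefixed with its line number and a colon:

```shell
numbered=$(printf 'alpha\nbeta\ngamma\n' | grep -F -n 'beta')
echo "$numbered"
```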

-v
fgrep -v string_pattern filename

 

Prints every line that does not contain the given string_pattern.
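
One common use is filtering out unwanted lines. This sketch, with made-up input, drops any line containing a “#” character:

```shell
# Lines containing "#" are excluded; the rest are printed.
kept=$(printf '%s\n' 'keep me' '# comment' 'keep too' | grep -F -v '#')
printf '%s\n' "$kept"
```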

-x
fgrep -x string_pattern filename

Prints only the lines that match the string_pattern in their entirety, that is, lines that consist of the string and nothing else. This is not the default behavior; without -x, fgrep prints any line that contains the string_pattern anywhere in it.
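
The effect of -x is easy to check by comparing it with a plain search. In this sketch the input words are illustrative; all three contain “cat”, but only one line is exactly “cat”.

```shell
samples=$(printf '%s\n' 'cat' 'cat food' 'bobcat')

# Without -x, every line containing "cat" is counted.
anywhere=$(printf '%s\n' "$samples" | grep -F -c 'cat')

# With -x, the whole line must equal "cat".
exact=$(printf '%s\n' "$samples" | grep -F -x 'cat')

echo "anywhere: $anywhere, exact: $exact"
```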

 

 

Perl-Style Regular Expressions (grep -P)

 
