# 4.1. Regular expressions

## 4.1.1. What are regular expressions?

A *regular expression* is a pattern that describes a set of strings. Regular expressions are constructed analogously to arithmetic expressions by using various operators to combine smaller expressions.

The fundamental building blocks are the regular expressions that match a single character. Most characters, including all letters and digits, are regular expressions that match themselves. Any metacharacter with special meaning may be quoted by preceding it with a backslash.
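As a minimal sketch of quoting a metacharacter, the commands below count matching lines with and without a backslash in front of the dot. The `grep` command and the sample strings are assumptions chosen for illustration; the text above does not name a particular tool.

```shell
# Hypothetical sample input: only the first line has a literal dot between "a" and "c".
sample='a.c
abc
a-c'

# Unquoted ".": a metacharacter that matches any single character,
# so every line matches.
any_char=$(printf '%s\n' "$sample" | grep -c 'a.c')

# Quoted "\.": the backslash removes the special meaning,
# so only the line with a literal dot matches.
literal_dot=$(printf '%s\n' "$sample" | grep -c 'a\.c')

echo "unquoted: $any_char matching lines, quoted: $literal_dot matching line"
```

The letters `a` and `c` match themselves in both patterns; only the treatment of the dot changes.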

## 4.1.2. Regular expression metacharacters

A regular expression may be followed by one of several repetition operators. The table below lists these together with the other common metacharacters:

**Table 4-1. Regular expression operators**

Operator | Effect |
---|---|
`.` | Matches any single character. |
`?` | The preceding item is optional and will be matched at most once. |
`*` | The preceding item will be matched zero or more times. |
`+` | The preceding item will be matched one or more times. |
`{N}` | The preceding item is matched exactly N times. |
`{N,}` | The preceding item is matched N or more times. |
`{N,M}` | The preceding item is matched at least N times, but not more than M times. |
`-` | Within a bracket list, denotes a range, unless it is the first or last character of the list or the ending point of a range. |
`^` | Matches the empty string at the beginning of a line; as the first character of a bracket list, it negates the list, so that characters not in the list are matched. |
`$` | Matches the empty string at the end of a line. |
`\b` | Matches the empty string at the edge of a word. |
`\B` | Matches the empty string provided it is not at the edge of a word. |
`\<` | Matches the empty string at the beginning of a word. |
`\>` | Matches the empty string at the end of a word. |
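The repetition operators and the line anchors from the table can be tried out with a short sketch like the following. It assumes `grep -E` (extended syntax) and a made-up word list; neither comes from the text above.

```shell
# Hypothetical sample data, one candidate string per line.
words='color
colour
colouur
b
ab
aab
aaab'

# "?": the "u" may appear at most once.
optional=$(printf '%s\n' "$words" | grep -E '^colou?r$')

# "*": zero or more "a" before the final "b".
zero_or_more=$(printf '%s\n' "$words" | grep -E '^a*b$')

# "+": one or more "a" before the final "b".
one_or_more=$(printf '%s\n' "$words" | grep -E '^a+b$')

# "{2}": exactly two "a" before the final "b".
exactly_two=$(printf '%s\n' "$words" | grep -E '^a{2}b$')

# "{1,2}": at least one and at most two "a" before the final "b".
one_to_two=$(printf '%s\n' "$words" | grep -E '^a{1,2}b$')

printf '%s\n' "colou?r:" "$optional" "a*b:" "$zero_or_more" "a+b:" "$one_or_more"
printf '%s\n' "a{2}b:" "$exactly_two" "a{1,2}b:" "$one_to_two"
```

The `^` and `$` anchors force each pattern to cover the whole line. Without them, `a{2}b` would also select `aaab`, because the pattern could match the substring `aab` inside it.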

Two regular expressions may be concatenated; the resulting regular expression matches any string formed by concatenating two substrings that respectively match the concatenated subexpressions.

Two regular expressions may be joined by the infix operator "|"; the resulting regular expression matches any string matching either subexpression.

Repetition takes precedence over concatenation, which in turn takes precedence over alternation. A whole subexpression may be enclosed in parentheses to override these precedence rules.
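The precedence rules can be checked with a small experiment. This is a sketch that assumes `grep -E` with the `-x` option (the whole line must match) and an invented list of strings.

```shell
# Hypothetical sample data, one candidate string per line.
input='ab
ac
c
abb
abab'

# Concatenation binds tighter than alternation:
# "ab|c" means ("ab") or ("c"), not "a" followed by "b" or "c".
alt_loose=$(printf '%s\n' "$input" | grep -E -x 'ab|c')

# Parentheses override this: "a" followed by either "b" or "c".
alt_grouped=$(printf '%s\n' "$input" | grep -E -x 'a(b|c)')

# Repetition binds tighter than concatenation:
# in "ab+" the "+" applies to "b" only.
rep_single=$(printf '%s\n' "$input" | grep -E -x 'ab+')

# Parentheses make the "+" apply to the whole group "ab".
rep_group=$(printf '%s\n' "$input" | grep -E -x '(ab)+')

printf '%s\n' "ab|c:" "$alt_loose" "a(b|c):" "$alt_grouped"
printf '%s\n' "ab+:" "$rep_single" "(ab)+:" "$rep_group"
```

Here `ab|c` selects `ab` and `c`, while `a(b|c)` selects `ab` and `ac`. Likewise `ab+` selects `ab` and `abb`, while `(ab)+` selects `ab` and `abab`.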

## 4.1.3. Basic versus extended regular expressions

In basic regular expressions the metacharacters "?", "+", "{", "|", "(", and ")" lose their special meaning; instead use the backslashed versions "\?", "\+", "\{", "\|", "\(", and "\)".
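The following sketch contrasts the two syntaxes using the "+" operator. It assumes GNU `grep`, where plain `grep` uses basic and `grep -E` uses extended regular expressions; other implementations may treat the backslashed form `\+` differently. The sample input is invented for illustration.

```shell
# Hypothetical sample input: the first line contains a literal plus sign.
lines='a+
aaa'

# Basic syntax, unquoted "+": an ordinary character,
# so only the line containing the text "a+" matches.
bre_plain=$(printf '%s\n' "$lines" | grep -c 'a+')

# Basic syntax, backslashed "\+": the repetition operator,
# so every line with one or more "a" matches.
bre_backslash=$(printf '%s\n' "$lines" | grep -c 'a\+')

# Extended syntax, unquoted "+": the repetition operator.
ere_plain=$(printf '%s\n' "$lines" | grep -E -c 'a+')

# Extended syntax, backslashed "\+": a literal plus sign.
ere_backslash=$(printf '%s\n' "$lines" | grep -E -c 'a\+')

echo "basic:    a+ -> $bre_plain, a\\+ -> $bre_backslash"
echo "extended: a+ -> $ere_plain, a\\+ -> $ere_backslash"
```

The roles of the plain and backslashed forms are swapped between the two syntaxes, which is why the same pattern can silently select different lines depending on the mode a command uses.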

Check your system documentation to see whether a command that uses regular expressions supports extended expressions.