12 Reasons to Learn and Use Regular Expressions

Regular Expressions are one of the most powerful tools available to programmers. Here are 12 reasons why:

  1. Regular Expressions are integrated into many programmer tools and programming languages. Here is a partial list:

    • The vi editor, which comes standard with Unix and Linux operating systems.

    • Any decent programmer's editor.

    • The grep command found standard on many operating systems including Unix/Linux.

    • The Perl programming language.

    • The PHP programming language.

  2. Regular Expressions are a language within a language.

    That's right. Regular Expressions are a subject of study all to themselves. In fact, there's even a book written about this language: Jeffrey Friedl's Mastering Regular Expressions, from O'Reilly.

  3. Each time you use Regular Expressions, you are invoking a powerful search engine.

    No, not a search engine on the World Wide Web. This search engine is one that searches through text looking for specific patterns.

    Wall Street financial analysts are some of the folks who benefit from the existence of Regular Expressions. Their offices use Regular Expressions to scan text for financial data.

    At the office, they have news feeds to supply them with financial information. Their programmers have done something quite useful with these news feeds: They have written programs that scan the incoming news releases for hard financial and statistical data. They grab this data -- which consists of information like quarterly revenues and quarterly earnings -- and format it and put it into a database.

    Without a powerful tool like Regular Expressions, these analysts would have to hire someone to extract this information by eye and hand. Regular Expressions make their lives so much easier.
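
    As a rough illustration of the kind of program described above, here is a minimal Python sketch. Python is this example's choice, not something the analysts are known to use, and the company name, the sentence, and the pattern are all invented. Real news feeds are far messier and would need much more robust patterns.

```python
import re

# Hypothetical news-release sentence; real feeds vary widely in wording.
release = (
    "Acme Corp. reported quarterly revenues of $12.5 million "
    "and quarterly earnings of $1.3 million."
)

# Capture the metric name, the dollar amount, and the scale word.
pattern = re.compile(
    r"quarterly (revenues|earnings) of \$([0-9]+\.[0-9]+) (million|billion)"
)

# Collect the figures in a dict, ready to be written to a database.
figures = {
    metric: (float(amount), scale)
    for metric, amount, scale in pattern.findall(release)
}

print(figures)
```

    The dictionary stands in for the database step, which is left out here.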

  4. If your search string is very simple, Regular Expression syntax is very simple.

    Here are some examples of that simplicity:

    • Find all lines that contain the text Apple:

      Apple

    • Find all lines that start with the text Apple:

      ^Apple

    • Find all lines that end with the text Apple:

      Apple$
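
    To try these three patterns out, here is a small sketch using Python's re module (the choice of Python and the sample lines are assumptions of this example):

```python
import re

# Made-up sample lines for illustration.
lines = ["Apple pie", "Green Apple", "I like Apple juice", "Banana bread"]

# re.search looks for the pattern anywhere in the line.
contains = [line for line in lines if re.search(r"Apple", line)]

# ^ anchors the pattern to the start of the line.
starts = [line for line in lines if re.search(r"^Apple", line)]

# $ anchors the pattern to the end of the line.
ends = [line for line in lines if re.search(r"Apple$", line)]

print(contains)
print(starts)
print(ends)
```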

  5. You can match upper or lower case:

    Find all lines that contain either Apple or apple:

    [Aa]pple
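
    A short Python sketch of this pattern, with invented sample lines. Note that the bracket only covers the first letter, so an all-capitals APPLE is not matched. Python's re.IGNORECASE flag, shown for comparison, ignores case for the whole pattern:

```python
import re

# Made-up sample lines for illustration.
lines = ["Apple pie", "apple sauce", "APPLE CIDER", "Banana bread"]

# [Aa] accepts either case for the first letter only.
first_letter_either = [line for line in lines if re.search(r"[Aa]pple", line)]

# re.IGNORECASE ignores case for every letter in the pattern.
any_case = [line for line in lines if re.search(r"apple", line, re.IGNORECASE)]

print(first_letter_either)
print(any_case)
```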

  6. You can match either one string or another:

    • Find all lines containing either Apples or Oranges:

      Apples|Oranges
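
    The same idea as a minimal Python sketch (sample lines are made up):

```python
import re

# Made-up sample lines for illustration.
lines = ["Apples are red", "Oranges are orange", "Pears are green"]

# The | means "either the pattern on the left or the one on the right".
either = [line for line in lines if re.search(r"Apples|Oranges", line)]

print(either)
```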

  7. With Regular Expressions, you can quantify how many times a character repeats:

    • Match the letter Z when it appears exactly 3 times in a row:

      Z{3}

    • Match the letter Z when it appears 3 or more times in a row:

      Z{3,}

    • Match the letter Z when it appears anywhere from 3 to 6 times in a row:

      Z{3,6}

    • Match Z when it appears 1 or more times in a row:

      Z+

      (Z+ and Z{1,} mean exactly the same thing!)

    • Match Z when it appears 0 or more times in a row:

      Z*

      (Z* and Z{0,} mean exactly the same thing!)
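
    One way to see the quantifiers at work is to test each one against runs of Z of different lengths. The sketch below uses Python's re.fullmatch, which requires the pattern to account for the whole string. The helper name accepted and the sample runs are inventions of this example:

```python
import re

# Runs of the letter Z with lengths 0, 1, 2, 3, 4 and 7.
runs = ["", "Z", "ZZ", "ZZZ", "ZZZZ", "ZZZZZZZ"]

def accepted(pattern):
    """Return the lengths of the runs that the pattern matches in full."""
    return [len(run) for run in runs if re.fullmatch(pattern, run)]

exactly_three = accepted(r"Z{3}")
three_or_more = accepted(r"Z{3,}")
three_to_six = accepted(r"Z{3,6}")
one_or_more = accepted(r"Z+")
zero_or_more = accepted(r"Z*")

# When searching inside a longer run, Z{3} still finds three Z's in a row.
inside_longer_run = re.search(r"Z{3}", "ZZZZ").group()

print(exactly_three, three_or_more, three_to_six)
print(one_or_more, zero_or_more, inside_longer_run)
```

    The last line is worth noticing: a search (as opposed to a full match) with Z{3} also succeeds on ZZZZ, because three Z's in a row do appear inside it.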

  8. With Regular Expressions, you can match whole classes of characters. Here are some examples that match a class of characters for 1 character position:

    • Match any uppercase letter for 1 character position:

      [A-Z]

    • Match any lowercase letter for 1 character position:

      [a-z]

    • Match any digit for 1 character position:

      [0-9]

    • Match a chosen few uppercase letters, lowercase letters, and digits for 1 character position:

      [ABCxyz789]

    • Match a chosen few letters or any digit for 1 character position:

      [ABCxyz0-9]
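
    Here is a small Python sketch that applies each of these classes to one made-up string. re.findall returns every single character that fits the class:

```python
import re

# Made-up sample text: one character per position to be tested.
text = "Ab3xZ9"

upper = re.findall(r"[A-Z]", text)               # uppercase letters
lower = re.findall(r"[a-z]", text)               # lowercase letters
digits = re.findall(r"[0-9]", text)              # digits
chosen_few = re.findall(r"[ABCxyz789]", text)    # only the listed characters
letters_or_digit = re.findall(r"[ABCxyz0-9]", text)  # listed letters, any digit

print(upper, lower, digits)
print(chosen_few, letters_or_digit)
```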

  9. Regular Expressions allow you to match any character by using a period:

    • Match any single character (in most tools, any character except a newline):

      . (1 period)

    • Match any 2 characters:

      .. (2 periods)

    • Match any 3 characters:

      ... (3 periods)

    • Match any 3 characters at the beginning of a line:

      ^...

    • Match any 3 characters at the end of a line:

      ...$
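
    A brief Python sketch of the period patterns, run against an invented line. The last check shows Python's default behavior, in which a period does not match a newline:

```python
import re

# Made-up sample line.
line = "abcdefg"

any_one = re.search(r".", line).group()         # first character found
any_two = re.search(r"..", line).group()        # first two characters found
first_three = re.search(r"^...", line).group()  # three characters at the start
last_three = re.search(r"...$", line).group()   # three characters at the end

# By default in Python, a period does not match a newline character.
dot_matches_newline = re.search(r".", "\n") is not None

print(any_one, any_two, first_three, last_three, dot_matches_newline)
```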

  10. Character classes and quantifiers can be combined. Here are some examples:

    • Match the letter a, b, or c:

      [abc]

    • Match any two of the letters a, b, or c appearing side-by-side:

      [abc]{2}

      The above Regular Expression matches the following strings:

      aa ab ac ba bb bc ca cb cc
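
    To check which strings [abc]{2} accepts, the Python sketch below builds every two-letter string over a, b, and c and tests each one. The variable names are inventions of this example:

```python
import re
from itertools import product

# Every two-letter string that can be built from a, b and c.
candidates = ["".join(pair) for pair in product("abc", repeat=2)]

# Keep the ones that [abc]{2} matches in full.
matched = [s for s in candidates if re.fullmatch(r"[abc]{2}", s)]

# A string with a letter outside the class is rejected.
rejects_outsider = re.fullmatch(r"[abc]{2}", "ad") is None

print(matched)
print(rejects_outsider)
```

    All nine combinations pass, since each position is matched independently of the other.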

  11. Regular Expressions can be combined by placing these expressions side-by-side:

    • Match any character at the beginning of a line followed by a capital Z:

      ^.Z (caret, period, Z)

    • Match any line that consists of nothing but three capital Z letters:

      ^ZZZ$ (caret, Z, Z, Z, dollar sign)
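
    Both combined patterns can be tried out with a short Python sketch. The sample lines are made up:

```python
import re

# Made-up sample lines for illustration.
lines = ["AZ first", "Zebra", "ZZZ", "ZZZZ", " ZZZ"]

# Any one character at the start of the line, then a capital Z.
second_char_z = [line for line in lines if re.search(r"^.Z", line)]

# Nothing on the line but three capital Z letters.
only_zzz = [line for line in lines if re.search(r"^ZZZ$", line)]

print(second_char_z)
print(only_zzz)
```

    Notice that the line with a leading space matches ^.Z, because a space counts as "any character", but it does not match ^ZZZ$.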

  12. Regular Expressions can be used to match just about anything!

    That's right. Regular Expressions are a powerful tool that can be used to scan just about any file for data with an identifiable pattern.

    This is especially true of programming languages such as Perl, where the implementation of Regular Expressions has been extended to an extraordinary level of capability.

©Edward Abbott, 2004. All rights reserved. Revised May 5, 2004.

Questions or comments? Email me at ed@WebSiteRepairGuy.com.