Regex: character classes bracket the possible

One of the ways in which regular expressions (regex) are more powerful than simple pattern matching filters is that the regex syntax offers a wide set of metacharacters that can be used to identify complex patterns.

For instance, regex uses a set of square brackets, [], to hold a character class: a set of possible characters, any one of which can fill a single position in the pattern.

In other words, using a character class, you can match an expression that could have any one of a number of characters in a given position.

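For instance, the regex h[eu]llo World would match either hello World or hullo World. Regex is case sensitive by default, so the capitalized Hello World would match only if the search is set to ignore case.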
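As a minimal sketch of the example above, here is the same pattern run through Python's standard re module. The choice of Python and the sample strings are assumptions made for illustration. The second pattern adds the re.IGNORECASE flag to show what changes when case is ignored.

```python
import re

# Sample strings are made up for illustration.
samples = ["hello World", "hullo World", "hallo World", "Hello World"]

# Case-sensitive: the second position may be "e" or "u" only.
strict = re.compile(r"h[eu]llo World")
# Same pattern, but ignoring case.
relaxed = re.compile(r"h[eu]llo World", re.IGNORECASE)

strict_hits = [s for s in samples if strict.fullmatch(s)]
relaxed_hits = [s for s in samples if relaxed.fullmatch(s)]

print(strict_hits)
print(relaxed_hits)
```

In both cases "hallo World" is rejected, because "a" is not one of the characters listed in the class.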

Character classes have a range of metacharacters to help advanced searching.

Within a character class, the - character represents a range of characters: <H[1-6]> would match <H1> through <H6>.
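The heading example can be checked with a short sketch, again assuming Python's re module and a made-up list of tags. It suggests that the range covers the digits 1 through 6 only, and that the class still stands for exactly one character.

```python
import re

heading = re.compile(r"<H[1-6]>")

# Hypothetical tags, some inside the range and some outside it.
tags = ["<H1>", "<H3>", "<H6>", "<H7>", "<H0>", "<H12>"]
matched = [t for t in tags if heading.fullmatch(t)]

print(matched)
```

"<H12>" fails because [1-6] matches a single digit, not a two-digit number.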

Ranges within character classes also work for letters, though they are case sensitive: [a-zA-Z] would work for all letters.

Character classes can consist of a combination of ranges and literal characters: [a-z7!].
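To see how ranges and literal characters combine, the following sketch (Python's re module, with an invented sample sentence) lists every character that each class picks out of the same text.

```python
import re

letters = re.compile(r"[a-zA-Z]")  # any letter, either case
mixed = re.compile(r"[a-z7!]")     # lowercase letters, the digit 7, or "!"

text = "Room 7b is OPEN!"  # sample text, made up for illustration

letter_hits = letters.findall(text)
mixed_hits = mixed.findall(text)

print(letter_hits)
print(mixed_hits)
```

The second class skips every capital letter, since a-z covers lowercase letters only.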

Note, however, that each instance of a character class is a set of possible values for a single position: [acquainted] will match any one character that is an a, c, q, u, i, n, t, e or d, not the word acquainted itself. Listing a letter twice, as with the second "a" here, has no effect.
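A small sketch makes the difference visible. It assumes Python's re module and an arbitrary sample phrase, and compares the bracketed class with the same letters written as a plain literal.

```python
import re

char_class = re.compile(r"[acquainted]")  # one character from the set
literal = re.compile(r"acquainted")       # the whole word, in order

sample = "quiet dog"  # made-up text that never contains "acquainted"

class_hits = char_class.findall(sample)
literal_hits = literal.findall(sample)

print(class_hits)
print(literal_hits)
```

The class finds six separate one-character matches, while the literal pattern finds nothing.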

You can also exclude characters by placing ^ at the start of a character class: [^c] matches any single character other than c. s[^k] will highlight any instance where an “s” is followed by a character other than “k,” and ignore those where the “s” is followed by “k” (such as “sky”).
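The behavior of a negated class can be sketched as follows, assuming Python's re module and a hypothetical word list. Note that [^k] still has to match some character, so an "s" at the very end of a string does not count.

```python
import re

not_c = re.compile(r"[^c]")
s_not_k = re.compile(r"s[^k]")

# Every character in "cab" except the "c".
not_c_hits = not_c.findall("cab")

# Made-up words: which ones contain an "s" followed by something other than "k"?
words = ["sky", "sun", "ask", "best", "gas"]
s_hits = [w for w in words if s_not_k.search(w)]

print(not_c_hits)
print(s_hits)
```

"gas" is left out because nothing follows its "s," so there is no character for [^k] to match.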

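The dot, “.”, is a placeholder. It represents any single character (in most regex engines, any character except a newline). For instance, if you are looking for a word with an unknown second character (“h7llo” or “hxllo”), you could use h.llo, which would match any occurrence of the pattern “h?llo.” The dot must sit outside the square brackets to work this way: inside a character class it is just a literal period, so h[.]llo matches only the text “h.llo.”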
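The dot only acts as a wildcard outside square brackets. This sketch, assuming Python's re module and invented sample strings, puts the bare dot next to a dot placed inside a character class.

```python
import re

any_char = re.compile(r"h.llo")       # dot as a wildcard
literal_dot = re.compile(r"h[.]llo")  # dot inside a class is a literal period

samples = ["h7llo", "hxllo", "h.llo", "hllo"]

any_hits = [s for s in samples if any_char.fullmatch(s)]
dot_hits = [s for s in samples if literal_dot.fullmatch(s)]

print(any_hits)
print(dot_hits)
```

"hllo" matches neither pattern, because the dot requires exactly one character in that position.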

Keep in mind that regex metacharacters such as “^” and “-” have different meanings inside character classes than outside them. Inside the brackets, a leading ^ negates the class and - marks a range; outside them, ^ anchors a match to the start of a line and - is an ordinary hyphen.
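As a rough illustration of that difference, the sketch below runs each metacharacter both inside and outside a class. It assumes Python's re module, and the sample strings are invented.

```python
import re

# Outside a class, ^ anchors the match to the start of the string.
starts_with_a = re.compile(r"^a")
# As the first character inside a class, ^ negates the class.
not_a = re.compile(r"[^a]")

anchor_hits = starts_with_a.findall("banana")
negated_hits = not_a.findall("banana")

# Outside a class, - is a literal hyphen.
literal_hyphen = re.compile(r"a-c")
# Inside a class, between two characters, - forms a range.
range_class = re.compile(r"[a-c]")

hyphen_hits = literal_hyphen.findall("a-c abc")
range_hits = range_class.findall("a-c abc")

print(anchor_hits, negated_hits)
print(hyphen_hits, range_hits)
```

"banana" does not start with "a," so the anchored pattern finds nothing, while the negated class picks out every character that is not an "a."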

Material taken from the book:




all mistakes are my own however…–Joab Jackson




