Intro to Regular Expressions: Anchors

Welcome back to our series "Intro to Regular Expressions". If you haven't done so, you may want to read the previous articles in this series first.

Anchors

A Regular Expression pattern (or Regex, as we will call it) contains characters that represent the text you are seeking, either directly (a literal) or indirectly (a special character). So, the Regex foo will find the text “foo” and the Regex \d will find a digit. These were covered in the first and second blog posts.

There are also characters in a Regex which match a location in the searched text. That location, or anchor, could be before the searched text, it could be within the searched text, or it could be after the searched text.

The Magna Carta begins like this:

John, by the grace of God King of England, Lord of Ireland, Duke of Normandy and Aquitaine and Count of Anjou, to his archbishops, bishops, abbots, earls, barons, justices, foresters, sheriffs, stewards, servants and to all his officials and loyal subjects, greeting.

For whatever reason, you wish to create a Regex that finds words beginning with the letter “J” or “j”, followed by any vowel. (Most Regex engines offer a case-sensitivity option. We will choose the option that makes the Regex search case-insensitive.)

Your regex could be this: j[aeiou] The construct [aeiou] is called a character class and matches an “a”, an “e”, an “i”, an “o” or a “u”. This regex will match the “Jo” of John, the “jo” of Anjou, the “ju” of justices and the “je” of subjects. But what if, for whatever reason, you wish to limit this match to the beginning of the text? That is where an anchor comes in.
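
To try the unanchored pattern yourself, here is a minimal sketch in Python. The choice of Python and its standard `re` module is our assumption for illustration (the article itself is not tied to any language); the `re.IGNORECASE` flag stands in for the case-insensitive search option mentioned above.

```python
import re

# Opening sentence of the Magna Carta, as quoted above.
TEXT = (
    "John, by the grace of God King of England, Lord of Ireland, "
    "Duke of Normandy and Aquitaine and Count of Anjou, to his archbishops, "
    "bishops, abbots, earls, barons, justices, foresters, sheriffs, stewards, "
    "servants and to all his officials and loyal subjects, greeting."
)

# re.IGNORECASE lets the literal "j" match both "J" and "j".
unanchored = re.compile(r"j[aeiou]", re.IGNORECASE)

# Without an anchor, the pattern matches anywhere in the text,
# including in the middle of a word.
matches = unanchored.findall(TEXT)
print(matches)  # ['Jo', 'jo', 'ju', 'je']
```

The four matches come from John, Anjou, justices and subjects, in that order.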

The character ^, when used in a regex like this: ^j[aeiou], will anchor the match to the beginning of the text. (Be aware that using the ^ in a character class, like this: [^aeiou], has a different meaning; we will explain this in a separate blog post later.)

The regex ^j[aeiou] thus matches only the “Jo” of John.
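
The effect of the ^ anchor can be checked with a short Python sketch (Python and its `re` module are our assumption for illustration, not something this series prescribes). It also shows that the anchor itself consumes no characters: the match still covers only the two characters “Jo”.

```python
import re

# Opening sentence of the Magna Carta, as quoted above.
TEXT = (
    "John, by the grace of God King of England, Lord of Ireland, "
    "Duke of Normandy and Aquitaine and Count of Anjou, to his archbishops, "
    "bishops, abbots, earls, barons, justices, foresters, sheriffs, stewards, "
    "servants and to all his officials and loyal subjects, greeting."
)

anchored = re.compile(r"^j[aeiou]", re.IGNORECASE)

# Only the match at the very beginning of the text survives.
anchored_matches = anchored.findall(TEXT)
print(anchored_matches)  # ['Jo']

# The anchor matches a position, not a character, so the
# matched span is still just offsets 0 to 2.
first = anchored.search(TEXT)
print(first.span())  # (0, 2)

# A text that does not begin with "j" plus a vowel yields no match,
# even though "jo" occurs later in it.
no_match = anchored.search("Count of Anjou")
print(no_match)  # None
```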

There are many such anchors. Here are a few common ones:

  • $ anchors the match to the end of the text.[^1]
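  • \b matches only at a word boundary, that is, between a word character and a non-word character (or at the start or end of the text next to a word character).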
  • \z matches only at the very end of the text, regardless of search options.
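
The anchors in this list can be explored with the following Python sketch. Two assumptions to note: Python is our choice for illustration only, and Python's `re` module has traditionally spelled the end-of-text anchor as \Z rather than \z, so the sketch uses \Z. The `re.MULTILINE` flag plays the role of the search option that makes $ anchor to the end of each line. The two-line sample string is made up for this example.

```python
import re

# Opening sentence of the Magna Carta, as quoted above.
TEXT = (
    "John, by the grace of God King of England, Lord of Ireland, "
    "Duke of Normandy and Aquitaine and Count of Anjou, to his archbishops, "
    "bishops, abbots, earls, barons, justices, foresters, sheriffs, stewards, "
    "servants and to all his officials and loyal subjects, greeting."
)

# \b restricts the match to the start of a word, so the "jo" inside
# Anjou and the "je" inside subjects no longer match.
word_start = re.findall(r"\bj[aeiou]", TEXT, re.IGNORECASE)
print(word_start)  # ['Jo', 'ju']

# $ anchors the match to the end of the text.
ends_with_greeting = re.search(r"greeting\.$", TEXT) is not None
print(ends_with_greeting)  # True

# A made-up two-line sample with a trailing newline.
lines = "first line\nsecond line\n"

# By default, Python's $ matches at the end of the text
# (or just before a final newline).
dollar_default = re.findall(r"line$", lines)
print(dollar_default)  # ['line']

# With re.MULTILINE, $ matches at the end of every line.
dollar_multiline = re.findall(r"line$", lines, re.MULTILINE)
print(dollar_multiline)  # ['line', 'line']

# \Z (Python's spelling of \z) matches only at the absolute end,
# which here comes after the trailing newline, so "line" is not found.
absolute_end = re.findall(r"line\Z", lines)
print(absolute_end)  # []
```

Other Regex engines differ in the details of $ and in how they spell the absolute end-of-text anchor, so check the documentation of the engine you are using.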

If you wish to experiment with Regular Expressions, so that you might use them for your own applications, or simply wish to learn more about this incredibly useful tool, consider purchasing our macOS app ReX-T. We provide a library with all the anchors and examples of their use, as well as other Regex constructs. You can even create and save your own useful Regular Expressions for use at another time.

Head over to the App Store and get ReX-T. The introductory price is valid only for a limited time.

Come back soon to read our next installment of this series. We'll be talking about what Assertions are and what you can use them for. Or subscribe to our newsletter to stay up to date with our articles and other stuff.

[^1]: Depending on your search options, the $ anchor may anchor to the end of the whole text or a single line.
