Regular Expressions - Grouping and string methods

mconner89

Mike Conner

Posted on August 31, 2020

In my last post, I talked about how to construct and use regular expressions. In this post, I'm going to go a little more in-depth and hopefully demonstrate how powerful regular expressions can be!

Grouping

The ability to use special operators is one of the reasons regular expressions are so powerful. Combine that with the fact that regular expressions allow you to group several characters together and use operators on those entire groups, and you can formulate much more specific search patterns than could be achieved with strings alone. Surrounding characters with parentheses, then following those parentheses with an operator, applies that operator to the entire group. For example:

const waluigi = /wa+(ha+)+/;

waluigi.test('waha');  // returns true
waluigi.test('waaaahaaaaha');  // returns true
waluigi.test('waahahaahahaa');  // returns true

In the above regexp, we have several "+" operators, as well as a parentheses group. Notice that we have surrounded "ha+" in parentheses, then followed those parentheses with a "+" operator. This means that "ha" can occur one or more times, with one or more "a"s trailing each "h". We can also combine parentheses with the pipe operator "|", which functions similarly to the JavaScript "or" operator. It denotes that the choice on either side of the operator will produce a match:

const animals = /(cat|dog|turtle)s/;
animals.test('I like cats');  // returns true
animals.test('I like dogs');  // returns true
animals.test('I like turtles');  // returns true
animals.test('I like squids');  // returns false

Note that the pipe operator will also work outside of parentheses.
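
One thing to watch for: without parentheses, the pipe splits the entire pattern, so anything after it only belongs to the last alternative. Here is a small sketch of the difference (the names grouped and ungrouped are made up for this illustration):

```javascript
// With a group, the "s" must follow whichever animal matched.
const grouped = /(cat|dog)s/;
// Without a group, the two alternatives are "cat" and "dogs".
const ungrouped = /cat|dogs/;

const groupedCat = grouped.test('one cat');     // false: "cat" is not followed by "s"
const ungroupedCat = ungrouped.test('one cat'); // true: "cat" on its own is an alternative
const ungroupedDog = ungrouped.test('one dog'); // false: only "dogs" is an alternative
```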

Boundaries

The "^" (caret) symbol and the "$" symbol match the start and the end of a string, respectively:

const carat = /^\d/;

carat.test('5 time 5 is 25');  // returns true
carat.test('Five time five is 25');  // returns false

const dollar = /\d$/;

dollar.test('five times five is 25');  // returns true
dollar.test('five times five is twenty-five');  // returns false

const caratDollar = /^\d.+\d$/;
caratDollar.test('5 times 5 is 25');  // returns true
caratDollar.test('5 times 5 is twenty-five');  // returns false
caratDollar.test('Five times 5 is 25');  // returns false
caratDollar.test('Five times 5 is twenty-five');  // returns false

In the above example, we see that we start our carat regexp with "^", followed by "\d". This means that the first character of our string must be a digit. Similarly, in the dollar regexp, we use the "$" symbol to denote that our string must end with a digit. We combine the two in caratDollar, with ".+" in the middle, to test that our string starts and ends with a digit and can have anything else (except a newline character!) in between the two.

We can use the "\b" marker similarly. It denotes a word boundary: a position where a word character (a letter, a digit, or an underscore) sits next to a non-word character, or next to the start or end of the string.

const spaceFirst = /\bcat/;
spaceFirst.test('I like cats');  // returns true
spaceFirst.test('.cats');  // returns true
spaceFirst.test('9cats');  // returns false
spaceFirst.test('concat');  // returns false


const spaceLast = /cat\b/;
spaceLast.test('I like cats');  // returns false
spaceLast.test('I have a cat');  // returns true
spaceLast.test('I have a cat.');  // returns true
spaceLast.test('concatenate');  // returns false

const allAlone = /\bcat\b/;
allAlone.test('I like cats');  // returns false
allAlone.test('I have a cat');  // returns true
allAlone.test('My cat is friendly');  // returns true
allAlone.test('I have a cat.');  // returns true
allAlone.test('concatenate');  // returns false
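
Because digits and underscores count as word characters, "\b" can behave in ways that are easy to miss. A quick sketch, using the same pattern as allAlone (the test strings and variable names here are invented for illustration):

```javascript
const wholeCat = /\bcat\b/;

// "_" is a word character, so there is no boundary after "cat".
const withUnderscore = wholeCat.test('cat_food');  // false
// "-" is not a word character, so "cat" stands alone here.
const withHyphen = wholeCat.test('cat-food');      // true
// Digits are word characters too.
const withDigit = wholeCat.test('cat9');           // false
```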

With string methods

Finally, regular expressions can be used with several string methods to return more than just true or false. First, let's talk about search. While you can't use a regexp with the .indexOf method, you can use it with .search. This will return the index of the first match, or -1 if no match was found, just like .indexOf. For example:

const searchExp = /chicken/;
const searchString = `Don't count your chickens before they hatch`;

searchString.search(searchExp);  // returns 17
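
To round that out, here is a short sketch of the "no match" case, plus a pattern that a plain string passed to .indexOf could not express. The variable names are only for this example:

```javascript
const proverb = `Don't count your chickens before they hatch`;

// Nothing matches, so we get -1 back.
const noTurkey = proverb.search(/turkey/);  // -1

// Find the first word that STARTS with "h". The "h" in "chickens"
// and in "they" is skipped because there is no word boundary before it.
const firstHWord = proverb.search(/\bh/);   // 38
```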

Unlike .indexOf, however, .search has no way to start from a specific index. Next, you have .match, which is built around regular expressions (a string argument is converted into one). With the 'g' flag, .match returns an array of every match in the string; without it, you get only the first match, along with its capture groups. Either way, it returns null if nothing matched. This is useful when you remember that regular expressions can be more specific than strings. Let's see an example:

const matchExp = /\d+/g;
const matchString = 'I had a 10, 9, 4, 2, and ace.';
matchString.match(matchExp);  // returns ["10", "9", "4", "2"]
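
For comparison, here is a rough sketch of what .match gives back without the 'g' flag, and when nothing matches at all. The names hand, firstNumber, and safeMatches are just for this illustration:

```javascript
const hand = 'I had a 10, 9, 4, 2, and ace.';

// No 'g' flag: a single result that also records where it was found.
const firstNumber = hand.match(/\d+/);
const firstText = firstNumber[0];      // "10"
const firstIndex = firstNumber.index;  // 8

// No match at all returns null, not an empty array.
const noQueens = hand.match(/queen/g); // null

// A common guard if you always want an array to loop over.
const safeMatches = hand.match(/queen/g) || [];  // []
```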

And finally, we have .replace! We can use it exactly as we would with a string pattern, just with a regexp (and all of the operators available to it), but there are some other interesting use cases for regular expressions and .replace. For one, we can use the g flag to indicate that we wish to replace EVERY occurrence of the match in the string. (There is a replaceAll method that does the same thing, but at the time of this writing it wasn't fully supported in all browsers, while .replace with regular expressions is.)

const replaceAllExp = /(cat|dog|fish)/g;
const replaceAllString = 'cat dog fish';
replaceAllString.replace(replaceAllExp, 'turkey');  // returns 'turkey turkey turkey'
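
To see what the g flag is actually doing, here is the same replacement run with and without it (a minimal sketch; the variable names are made up for the comparison):

```javascript
const pets = 'cat dog fish';

// Without 'g', only the first match is replaced.
const firstOnly = pets.replace(/(cat|dog|fish)/, 'turkey');   // 'turkey dog fish'

// With 'g', every match is replaced.
const everyOne = pets.replace(/(cat|dog|fish)/g, 'turkey');   // 'turkey turkey turkey'
```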

We can also refer to group matches in the replacement string. This is much easier to think about after you see it happen:

const namesExp = /(\w+), (\w+)/g;
const names = 'Potter, Harry, Weasley, Ronald, Granger, Hermione';
names.replace(namesExp, "$2 $1");  // returns "Harry Potter, Ronald Weasley, Hermione Granger"

In our regular expression, we have two groups, denoted by the parentheses. In our call to the .replace method, notice that our replacement string has $2 and $1 in it. These refer to the groups in the regular expression, numbered from left to right. We are essentially placing whatever the second group matched in front of whatever the first group matched, with a space in between the two.

A final use of regular expressions with .replace is passing a function as the replacement. The function is called for each match, and whatever it returns is substituted in. This is also possible when the pattern is a plain string, but again, a regular expression allows us to be more specific with our search pattern:

const funcExp = /\b(jfk|fdr)\b/g;
const presidents = "I prefer jfk to fdr";
presidents.replace(funcExp, str => str.toUpperCase());  // returns "I prefer JFK to FDR"
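
The two ideas can also be combined: the replacement function receives the full match as its first argument, followed by whatever each group captured. Here is a sketch that reuses the names pattern from earlier; the parameter names last and first are my own labels for the two groups:

```javascript
const roster = 'Potter, Harry, Weasley, Ronald';

// match is the whole matched text; last and first are groups 1 and 2.
const shouted = roster.replace(
  /(\w+), (\w+)/g,
  (match, last, first) => `${first} ${last.toUpperCase()}`
);
// shouted is "Harry POTTER, Ronald WEASLEY"
```

This does the same swap as "$2 $1", but lets us run code on each group before putting the pieces back together.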

Conclusion

When working with strings, regular expressions are a powerful tool to keep in mind. They can be used to beef up our string methods, or allow us to perform actions that would normally take multiple lines of code with a single expression.
