Joao Marques
Posted on October 28, 2024
Why do I need Regex?
Regular expressions are patterns that help us search for specific sequences in a text. In Java, they are used with classes in the java.util.regex package.
With regex, we can find patterns, replace text, and validate inputs without adding too much code.
Basic Syntax
Let’s go over some common regex symbols and what they do:
Literal Characters: The simplest regex is just plain text. hello matches any occurrence of hello in a string.
Wildcards:
- .: Matches any single character, except line terminators by default (h.llo matches hello, hallo, hxllo).
Character Sets:
- [abc]: Matches any one character within the brackets (h[aeiou]llo matches hello, hallo).
- [a-z]: Matches any lowercase letter from a to z.
Quantifiers:
- *: Matches zero or more occurrences of the preceding element (go*gle matches google, ggle, goooooooogle).
- +: Matches one or more occurrences of the preceding element (go+gle matches google, gooogle but not ggle).
- ?: Matches zero or one occurrence of the preceding element (colo?ur matches both colur and colour).
Anchors:
- ^: Indicates the start of a line (^hello matches any line that begins with hello). In Java, ^ matches only at the start of the whole input unless the Pattern.MULTILINE flag is set.
- $: Indicates the end of a line (world$ matches any line that ends with world). The same note about Pattern.MULTILINE applies.
Groups:
- (abc): Groups multiple characters as a single unit ((ha)+ matches ha, haha, hahaha).
Escape Characters:
Some characters (like . or *) have special meanings, so prefix them with a backslash \ to use them literally. For instance, \. will match a literal dot. Inside a Java string literal the backslash itself must be escaped, so that pattern is written "\\.".
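To see several of these symbols in action, here is a minimal sketch. It relies on Pattern.matches(), which checks whether the whole input matches, and on the Pattern.MULTILINE flag for the anchors. The class name RegexSyntaxDemo and the helpers fullMatch and foundInAnyLine are made up for this illustration.

```java
import java.util.regex.Pattern;

class RegexSyntaxDemo {

    // Hypothetical helper: true only if the WHOLE input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // Hypothetical helper: true if the regex matches anywhere in the input,
    // with ^ and $ treated as line anchors (Pattern.MULTILINE).
    static boolean foundInAnyLine(String regex, String input) {
        return Pattern.compile(regex, Pattern.MULTILINE).matcher(input).find();
    }

    public static void main(String[] args) {
        // Quantifiers
        System.out.println(fullMatch("go*gle", "ggle"));   // true: zero o's allowed
        System.out.println(fullMatch("go+gle", "ggle"));   // false: at least one o needed
        System.out.println(fullMatch("colo?ur", "colur")); // true: the second o is optional

        // Character set and group
        System.out.println(fullMatch("h[aeiou]llo", "hxllo")); // false: x is not in the set
        System.out.println(fullMatch("(ha)+", "hahaha"));      // true: "ha" repeated

        // Escaping: "\\." in Java source is the regex \. (a literal dot)
        System.out.println(fullMatch("a\\.b", "a.b")); // true
        System.out.println(fullMatch("a\\.b", "axb")); // false
        System.out.println(fullMatch("a.b", "axb"));   // true: unescaped dot is a wildcard

        // Anchors, applied per line
        System.out.println(foundInAnyLine("^hello", "say hi\nhello world")); // true
        System.out.println(foundInAnyLine("world$", "hello world\nbye"));    // true
        System.out.println(foundInAnyLine("^world", "hello world\nbye"));    // false
    }
}
```

Without Pattern.MULTILINE, the first anchor check would be false, because ^ would then match only at the very start of the input.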
Short example:
- Pattern: Holds a compiled regular expression; you create one with Pattern.compile().
- Matcher: Applies the pattern to a specific text and helps find matches.
Here’s a quick example of how these classes work together:
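import java.util.regex.*;

public class RegexBasicsDemo {
    public static void main(String[] args) {
        String text = "hxllo hallo hbllllllo hello";

        // "." stands for exactly one character between "h" and "llo"
        Pattern pattern = Pattern.compile("h.llo");
        Matcher matcher = pattern.matcher(text);

        // find() moves to the next match; group() returns the matched text.
        // "hbllllllo" is skipped: after "hbll" the next character is l, not o.
        while (matcher.find()) {
            System.out.println("Wildcard match found: " + matcher.group());
        }
    }
}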
What will be printed:
- Wildcard match found: hxllo
- Wildcard match found: hallo
- Wildcard match found: hello
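The next example uses a character set together with replaceAll() to replace every match:
import java.util.regex.*;

public class RegexReplaceExample {
    public static void main(String[] args) {
        String text = "hello hzllo hallo hillo";

        // Matches "h", then one vowel, then "llo"
        Pattern pattern = Pattern.compile("h[aeiou]llo");
        Matcher matcher = pattern.matcher(text);

        // Every match becomes "hi"; "hzllo" stays because z is not a vowel
        String result = matcher.replaceAll("hi");

        System.out.println("Original text: " + text);
        System.out.println("Text after replacement: " + result);
    }
}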
What will be printed:
- Original text: hello hzllo hallo hillo
- Text after replacement: hi hzllo hi hi
Useful Java Regex Methods
- matches(): Checks if the whole text matches the regex pattern.
- find(): Searches for occurrences of the pattern in the text (returns true if, and only if, a subsequence of the input sequence matches this matcher's pattern).
- group(): Returns the matched text after calling find().
- replaceAll(): Replaces matches in the text with a replacement string.
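The difference between matches() and find() is easy to miss, so here is a small sketch that runs all four methods against the same pattern. The class name RegexMethodsDemo, the helper findAll, and the sample text are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexMethodsDemo {

    // Hypothetical helper: collects every substring that find() locates.
    static List<String> findAll(String regex, String text) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        List<String> found = new ArrayList<>();
        while (matcher.find()) {
            found.add(matcher.group()); // text of the current match
        }
        return found;
    }

    public static void main(String[] args) {
        Pattern digits = Pattern.compile("[0-9]+");
        String text = "order 66, item 7";

        // matches() needs the ENTIRE text to fit the pattern
        System.out.println(digits.matcher(text).matches()); // false
        System.out.println(digits.matcher("66").matches()); // true

        // find() + group() pick out each occurrence inside a longer text
        System.out.println(findAll("[0-9]+", text)); // [66, 7]

        // replaceAll() swaps every occurrence for the replacement string
        System.out.println(digits.matcher(text).replaceAll("#")); // order #, item #
    }
}
```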
My opinion about regex
As a Java developer, I’ve come to really appreciate regex for how powerful it can be with text processing. It’s amazing to see how one well-crafted line of regex can handle tasks that might otherwise need an entire block of code. For straightforward matching, regex feels perfect: it’s concise, efficient, and ideal for things like validating formats or extracting patterns.
But I know not everyone feels the same way. Regex can be far from intuitive, and when patterns start getting complex, readability suffers. It’s easy to create patterns that work like magic, yet are nearly impossible for anyone else (or even yourself, later on, after you come back from a nice vacation) to understand at a glance. Complex patterns can quickly become "write-only" code.
In these situations, I’ve found it better to break validation down into smaller, simpler steps. This keeps things clearer and makes it easier for others to follow the logic. While regex is such a valuable tool in Java, I think it’s best used with a bit of restraint, especially in team environments. After all, writing maintainable code means thinking of the next person who’ll read it.
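As one possible illustration of the smaller-steps idea, here is a sketch of a password check split into one tiny pattern per rule. The rules (at least 8 characters, one digit, one lowercase and one uppercase letter) and all names are assumptions chosen for the example, not taken from a real project.

```java
import java.util.regex.Pattern;

class PasswordCheckDemo {

    // One small, named pattern per rule (hypothetical rules for illustration).
    // A single dense pattern along the lines of
    //   ^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z]).{8,}$
    // would do roughly the same job, but is much harder to read.
    private static final Pattern HAS_DIGIT = Pattern.compile("[0-9]");
    private static final Pattern HAS_LOWER = Pattern.compile("[a-z]");
    private static final Pattern HAS_UPPER = Pattern.compile("[A-Z]");

    static boolean isValid(String password) {
        if (password.length() < 8) {
            return false; // too short
        }
        if (!HAS_DIGIT.matcher(password).find()) {
            return false; // no digit
        }
        if (!HAS_LOWER.matcher(password).find()) {
            return false; // no lowercase letter
        }
        return HAS_UPPER.matcher(password).find(); // needs an uppercase letter too
    }

    public static void main(String[] args) {
        System.out.println(isValid("Passw0rdX")); // true
        System.out.println(isValid("password"));  // false: no digit, no uppercase
        System.out.println(isValid("Sh0rt"));     // false: fewer than 8 characters
    }
}
```

Each rule can be read, tested, and changed on its own, and a failing check can later report which rule was broken, something a single combined pattern cannot easily do.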