30 Days of Python 👨‍💻 - Day 19 - Regular Expressions

arindamdawn

Arindam Dawn

Posted on July 9, 2020


Regular expressions (regex/regexp) are a powerful programming concept supported in nearly every programming language, but beginners often find them confusing and hard to interpret. A regular expression is a sequence of characters that defines a pattern for efficiently searching strings. They offer a wide array of use cases when dealing with text, such as searching, validation, and replacement.

Today I explored how to use regex in Python.

A regular expression (shortened as regex or regexp also referred to as rational expression) is a sequence of characters that define a search pattern. Usually such patterns are used by string searching algorithms for "find" or "find and replace" operations on strings, or for input validation. It is a technique developed in theoretical computer science and formal language theory. (Wikipedia).

From my prior experience writing JavaScript programs, I am already familiar with regular expressions, and there are tons of amazing resources about them on the web. My intention today was to check out the syntax and methods for using them in Python, as this will come in very handy when building projects in the upcoming days. So I have compiled some great regex resources in this post, along with some practical coding exercises that I can use as a reference in the future. It might help other enthusiasts as well. There is no need to memorize each and every regex rule: they can always be Googled when needed, and the most common regex patterns are readily available, so most of the time we don't need to create complex patterns ourselves.

However, knowing how to read regex patterns is a great skill to have, as it helps in understanding what a given pattern is doing.

Here are some cool regex resources specific to Python

To practice and test regular expressions, Regex101 is a great learning playground. It also helps in generating the equivalent Python regexp pattern.

Regex methods in Python

To use regular expressions in Python, the built-in re module needs to be imported. This module comes with several functions for working with regex.

Function        Description
re.search       Checks if the given pattern is present anywhere in the input string
                Output is a re.Match object, usable in conditional expressions
                r-strings are preferred for defining the RE
                Use a bytes pattern for bytes input
                Python also maintains a small cache of recently used REs
re.fullmatch    Ensures the pattern matches the entire input string
re.compile      Compiles a pattern for reuse, outputs a re.Pattern object
re.sub          Search and replace
                re.sub(r'pat', f, s) calls function f with a re.Match object as argument
re.escape       Automatically escapes all metacharacters
re.split        Splits a string based on the RE
                Text matched by the groups will be part of the output
                Portion matched by the pattern outside a group won't be in the output
re.findall      Returns all the matches as a list
                If 1 capture group is used, only its matches are returned
                If more than 1 is used, each element will be a tuple of capture groups
                Portion matched by the pattern outside a group won't be in the output
re.finditer     Iterator with a re.Match object for each match
re.subn         Gives a tuple of the modified string and the number of substitutions
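To make the table a bit more concrete, here is a minimal sketch that exercises a few of these functions. The sample text and the helper name next_day are made up for illustration and are not from any particular project.

```python
import re

text = 'Day 19: regex, Day 20: testing'  # illustrative sample input

# re.search returns a re.Match object (truthy) or None
match = re.search(r'Day (\d+)', text)
first_day = match.group(1) if match else None

# With one capture group, re.findall returns only that group's matches
days = re.findall(r'Day (\d+)', text)

# With more than one capture group, each element is a tuple of groups
pairs = re.findall(r'Day (\d+): (\w+)', text)

# re.split drops the separator unless it sits inside a capture group
parts = re.split(r',\s*', text)
parts_with_sep = re.split(r'(,)\s*', text)

# re.sub can take a function that receives each re.Match object
def next_day(m):
    return 'Day ' + str(int(m.group(1)) + 1)

bumped = re.sub(r'Day (\d+)', next_day, text)

# re.subn also reports how many substitutions were made
replaced, count = re.subn(r'Day', 'D', text)

# re.escape makes metacharacters such as + match literally
literal_hit = re.search(re.escape('1+1'), '1+1=2')
unescaped_hit = re.search('1+1', '1+1=2')  # here + means "one or more 1s"

print(first_day, days, pairs)
print(parts, parts_with_sep)
print(bumped, replaced, count)
print(bool(literal_hit), bool(unescaped_hit))
```

Pasting any of these patterns into Regex101 should show why each call returns what it does.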

Code Exercises

Let's write some code to try out a few practical use cases of regex in Python applications.

  • Password Validator
# Prompts the user to enter a password and validates it
# Criteria:
# Min 8 characters
# Only letters, numbers and @$!%*?& allowed
# Should have at least 1 uppercase character
# Should have at least 1 lowercase character
# Should have at least 1 special character
# Should have at least 1 number
import re

def password_checker():
    password = input('Please enter a password: ')
    # Each lookahead (?=...) asserts one requirement without consuming input
    password_pattern = re.compile(
        r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$"
    )
    # fullmatch succeeds only if the whole password matches the pattern
    result = password_pattern.fullmatch(password)
    if result:
        print('Valid password')
    else:
        print('Invalid password')

password_checker()

Note: The above code can be made more interactive by checking each condition separately and displaying an individual error for any condition that fails. If the above regex looks confusing, try copying it into regex101, which will break the regex down into chunks with explanations.
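One possible way to report each failed condition separately is sketched below. The RULES list, the ALLOWED pattern, the error messages, and the function name password_errors are all hypothetical names chosen for this sketch; they simply split the single pattern above into one small check per criterion.

```python
import re

# Hypothetical rule table: (pattern, message shown when the pattern is not found)
RULES = [
    (r'[a-z]', 'must contain at least 1 lowercase character'),
    (r'[A-Z]', 'must contain at least 1 uppercase character'),
    (r'\d', 'must contain at least 1 number'),
    (r'[@$!%*?&]', 'must contain at least 1 special character'),
]

# Only letters, digits and @$!%*?& are allowed anywhere in the password
ALLOWED = re.compile(r'[A-Za-z\d@$!%*?&]*')

def password_errors(password):
    """Return a list of messages, one per failed criterion (empty if valid)."""
    errors = []
    if len(password) < 8:
        errors.append('must be at least 8 characters long')
    for pattern, message in RULES:
        if not re.search(pattern, password):
            errors.append(message)
    if not ALLOWED.fullmatch(password):
        errors.append('contains characters that are not allowed')
    return errors

for candidate in ['Passw0rd!', 'short']:
    problems = password_errors(candidate)
    print(candidate, '->', problems if problems else 'Valid password')
```

The trade-off is a few more lines in exchange for feedback that tells the user exactly which rule was missed.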

I prefer using the compile method to store the regex pattern as a reference which can be used later on. It returns a re.Pattern object, whose methods such as search and fullmatch can be called directly.
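As a rough illustration of reusing a compiled pattern, the sketch below compiles a year pattern once and applies it to several strings. The pattern, variable names, and sample lines are assumptions made up for this example.

```python
import re

# Compile once, then reuse the re.Pattern object for several operations
year_pattern = re.compile(r'\b(?:19|20)\d{2}\b')

lines = ['Released in 1991', 'No year here', 'Written in 2020']

# Pattern objects expose the same operations as the module-level functions
lines_with_year = [line for line in lines if year_pattern.search(line)]
all_years = year_pattern.findall('From 1991 to 2020')

print(lines_with_year)
print(all_years)
```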

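The r before the regex string tells the Python interpreter that it is a raw string. In a raw string, backslashes are not treated as Python escape sequences, so regex tokens such as \d or \b can be written without doubling the backslash. Regex metacharacters such as . or * still need to be escaped to match them literally.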
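A small sketch of the difference raw strings make; the sample words are arbitrary and the variable names are only for illustration.

```python
import re

# In a normal string '\n' is a single newline character;
# in a raw string it is a backslash followed by the letter n
newline_len = len('\n')
raw_len = len(r'\n')

# Doubling the backslash in a normal string gives the same pattern text
same_pattern = '\\d+' == r'\d+'

# Without the r prefix, '\b' is a backspace character, not a word boundary
no_match = re.findall('\bcat\b', 'cat concat')
matches = re.findall(r'\bcat\b', 'cat concat')

print(newline_len, raw_len, same_pattern)
print(no_match, matches)
```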

  • Extract numbers from a string
# Program to extract numbers from a string

import re

string = 'Python was introduced in 1992. This is year 2020.'
pattern = r'\d+'  # raw string, so \d reaches the regex engine unchanged

result = re.findall(pattern, string)
print(result) # ['1992', '2020']

These are some basic examples of how regex can be used in Python.
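Building on the number-extraction exercise, here is a hedged sketch that uses re.finditer to also recover where each number sits in the string and converts the matches to integers. The variable names found and years are just for illustration.

```python
import re

string = 'Python was introduced in 1992. This is year 2020.'

# finditer yields a re.Match object per match, which carries position info
found = [(m.group(), m.start(), m.end()) for m in re.finditer(r'\d+', string)]

# The matched text is always a str, so convert it when numbers are needed
years = [int(text) for text, _, _ in found]

for text, start, end in found:
    print(f'{text} found at positions {start} to {end}')
print(years)
```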

Here are some good articles for more in-depth information on regex in Python

That's all for today. Tomorrow I will be digging into testing techniques in Python. I am pretty excited about that.

Have a great one!
