The ‘Walrus’ Operator

The headline feature of Python’s version 3.8 release was the addition of the ‘assignment expression operator’ (:=) – colloquially known as the ‘Walrus’ operator.

According to the Python docs:

"[The Walrus operator] assigns values to variables as part of a larger expression"

So… what does that mean? Basically, it lets you assign a value to a variable and test it in a conditional at the same time, potentially collapsing a couple of lines of code into one.
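As a minimal sketch of that idea (the `readings` list and the threshold of 5 are made up for illustration), the two snippets below should behave the same way; the second folds the assignment into the condition:

```python
# Hypothetical data, only for illustration
readings = [4, 8, 15, 16, 23, 42]

# Without the Walrus operator: assign first, then test
count = len(readings)
if count > 5:
    message_plain = f"Too many readings ({count})"

# With the Walrus operator: assign and test in one expression.
# The parentheses matter: without them, n would be bound to the
# result of the comparison (True/False) rather than the length.
if (n := len(readings)) > 5:
    message_walrus = f"Too many readings ({n})"
```

Note that `n` stays available after the `if` block, just like a variable assigned on its own line.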

When I first saw the Walrus operator I thought, that’s kinda cool. Not sure I will use it that often though.

But recently I have found myself using it more and more as an elegant (in my opinion) solution for regular expression matching.

Walrus Operator Use Case: Regular Expression Matching

Let’s start with an example of regular expression matching without using the Walrus Operator. We will use a regular expression to extract the date (YYYYMMDD) and city name from an example file name.

If you are new to regular expressions, check out this website that walks through the syntax for regular expressions: https://regexone.com

Setup

# Example: extract date and city from filename

import re

FILENAME = "20220614_london.csv"

# regular expression pattern to capture the date and city
# (raw string so \d reaches the regex engine unchanged; \. matches a literal dot)
regex_pat = r"(\d{8})_(.*)\.csv"

Version 1: Regex only

# extract date and city from filename
matches = re.match(regex_pat, FILENAME)

# show extracted data
matches.groups()
('20220614', 'london')

This works fine, but I would argue there are a couple of issues.

Firstly, a problem arises if the supplied string (file name) does not match the expected pattern. In this scenario, the re.match function will return None because no matches would be found. When we then call matches.groups() to display the extracted information, we will get an AttributeError:

AttributeError: 'NoneType' object has no attribute 'groups'
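To see the failure for yourself, here is a small sketch. The file name `london-report.txt` is a hypothetical example that does not follow the date_city convention, and the pattern is my own raw-string take on the one from the setup:

```python
import re

# Assumed pattern: 8 digits, an underscore, the city, then ".csv"
pattern = r"(\d{8})_(.*)\.csv"

# Hypothetical file name that does not follow the convention
bad_filename = "london-report.txt"

result = re.match(pattern, bad_filename)
print(result)  # None, because the pattern does not match

try:
    result.groups()
except AttributeError as err:
    # Calling .groups() on None is what triggers the error
    error_message = str(err)
    print(error_message)
```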

The way to protect against this error is to check that the matches variable is not None before doing anything else with it:

Version 2: Regex with conditional

# improved example
matches = re.match(regex_pat, FILENAME)

# check matches is not None before calling matches.groups()
if matches:
    print(matches.groups())

Further reading

I used this approach in another post about extracting information from Google Cloud Storage URIs.

However, this leads to the second issue (although admittedly much less of a problem). We have had to add an extra line of code for the conditional. This seems unnecessarily verbose.

This is where the Walrus operator comes in handy. We can extract the information using the regular expression and check it is not None in a single line of code.

Final version: Walrus Operator

# assignment and conditional in one go using := syntax
if matches := re.match(regex_pat, FILENAME):
    print(matches.groups())
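The same pattern scales nicely when you are looping over several file names. Here is a rough sketch; the list of file names is invented for illustration, and the raw-string pattern is my own variant of the one above:

```python
import re

# Assumed pattern: 8 digits, an underscore, the city, then ".csv"
pattern = r"(\d{8})_(.+)\.csv"

# Hypothetical file names; the last two do not follow the convention
filenames = ["20220614_london.csv", "20220615_paris.csv", "notes.txt", "london.csv"]

parsed = []
skipped = []
for name in filenames:
    # Assign the match and test it in the same line
    if m := re.match(pattern, name):
        date, city = m.groups()
        parsed.append((date, city))
    else:
        skipped.append(name)

print(parsed)
print(skipped)
```

File names that do not match simply end up in `skipped` instead of raising an error.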

Conclusion

You can use the Walrus Operator to write more concise code. It is particularly handy for regular expression matching, or any time you assign a variable from a function that could return None.
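One caveat worth sketching: a bare `if x := ...` tests truthiness, not "is not None". That is fine for re.match, because match objects are always truthy, but for functions that can return falsy values such as 0 or an empty string, an explicit None check is safer. The `settings` dictionary below is a made-up example:

```python
# Hypothetical settings; a timeout of 0 is a valid value here
settings = {"timeout": 0, "region": "eu-west"}

# Truthiness check: the branch is skipped because 0 is falsy
if timeout := settings.get("timeout"):
    truthy_branch_ran = True
else:
    truthy_branch_ran = False

# Explicit None check: the branch runs because the key exists
if (timeout := settings.get("timeout")) is not None:
    none_check_branch_ran = True
else:
    none_check_branch_ran = False

# Missing key: dict.get returns None, so the branch is skipped
if (retries := settings.get("retries")) is not None:
    missing_key_branch_ran = True
else:
    missing_key_branch_ran = False
```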

Side Note ⭐️

While researching this article, I learned that you can explicitly name the capture groups in regular expressions using the (?P<name>...) syntax. We can then access their values by name rather than by index.

Using the example above:

import re

regex_pat = r"(?P<date>\d{8})_(?P<city>.*)\.csv"

if matches := re.match(regex_pat, FILENAME):
    print(matches.group('date'))
    print(matches.group('city'))

Accessing the values by name rather than by index makes your regular expression code even more readable, particularly if you have a long regex pattern with multiple capturing groups.
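If you want all the named groups at once, match objects also offer groupdict(), and (since Python 3.6) you can index a match by group name. A small sketch, using my own raw-string version of the named-group pattern:

```python
import re

# Assumed pattern with named groups for the date and the city
pattern = r"(?P<date>\d{8})_(?P<city>.+)\.csv"

if m := re.match(pattern, "20220614_london.csv"):
    # All named groups as a dictionary
    fields = m.groupdict()
    print(fields)  # {'date': '20220614', 'city': 'london'}

    # Index the match directly by group name
    print(m["city"])  # london
```

The dictionary form is convenient if you plan to pass the extracted values straight into another function or a DataFrame row.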

Happy coding!
