Writing More Idiomatic and Pythonic Code

Martin Heinz

Posted on September 1, 2020

There are lots of ways to implement the same feature, algorithm or function. Some of them are straightforward and clear, and therefore better; others are confusing or inefficient, and therefore worse. The Python community often uses terms like Pythonic or idiomatic to describe code that follows certain (natural, proper) style and conventions. That's the kind of good, clear code we all try to write every day, and in this article we will go over a few tips, conventions and idioms that will help you write more idiomatic and Pythonic code.

Identity and Equality Comparisons

Not just in Python, but in really any programming language, you can fall into the trap of mixing up identity and value equality. In Python you have the choice of using either is or == for comparisons, where is checks identity (whether both names refer to the very same object) and == checks value.

Considering that most of the time we only care about value, not identity, we would usually choose ==. There are cases, however, where you should always use the is operator instead. One of those is comparison with Python's singletons - None, True or False (although for a plain truthiness check a bare if x: is usually all you need).

Using is None, is True or is False isn't just about convention or improved readability, though. It is also slightly faster, which can add up if you use x is None instead of x == None inside a loop. Why is that, you might ask? Well, it's because the is operator cannot be overloaded the way == can (which roughly boils down to a.__eq__(b)), so Python can skip the lookup of the dunder methods that are needed to evaluate a comparison using ==.

So, the bottom line is that you should use is whenever identity is what you actually want to check, most notably when comparing against singletons, as it is more readable, faster and idiomatic. To find out whether you can use it, ask yourself whether you care about the value or the identity of the variables being compared.
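To make the difference concrete, here is a minimal sketch. The AlwaysEqual class is a hypothetical example I made up purely to show that == can be overloaded while is cannot:

```python
# Two distinct list objects with the same contents.
a = [1, 2, 3]
b = [1, 2, 3]
c = a  # `c` is just another name for the very same object as `a`

print(a == b)  # True  - same value
print(a is b)  # False - different objects
print(a is c)  # True  - same object


class AlwaysEqual:
    """Hypothetical class whose overloaded __eq__ claims equality with anything."""

    def __eq__(self, other):
        return True


x = AlwaysEqual()
print(x == None)  # True  - answered by the overloaded __eq__
print(x is None)  # False - identity cannot be overloaded
```

The last two lines are the reason why x is None is the safer check: it cannot be fooled by whatever __eq__ an object happens to define.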

Context Managers Instead of try/finally

In other languages it's common practice to use try/finally to manage resources and to make sure you dispose of opened files or acquired locks if an exception occurs. You could use try/finally in Python too, but we can do better using the with statement:

# Bad
from urllib.request import urlopen

page = urlopen(url)  # open before `try`, so `page` is always defined in `finally`
try:
    ...
finally:
    page.close()

# Good
from contextlib import closing

with closing(urlopen(url)) as page:
    ...

The code above shows usage of the so-called context management protocol, which consists of 2 methods - __enter__ and __exit__ - that are called when entering and exiting the body of the with block, respectively. You probably already know about the with statement and its usage, but you might not know about contextlib used above. It's a module that provides utilities for working with the with statement, including tools for turning functions into context managers. As for the closing function above, it just makes sure that the .close() method of the object, in this case page, gets called when the block exits.
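If you have never written a context manager yourself, the following sketch shows both flavours: a class implementing __enter__ and __exit__ by hand, and a generator function wrapped with contextlib.contextmanager. The names ManagedResource and managed are hypothetical and only record what happens, they don't manage any real resource:

```python
from contextlib import contextmanager


class ManagedResource:
    """Hypothetical resource implementing the protocol by hand."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self  # this is what gets bound to the name after `as`

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append("exit")
        return False  # do not swallow exceptions raised in the body


@contextmanager
def managed(events):
    """The same idea written as a generator function."""
    events.append("enter")
    try:
        yield events
    finally:
        events.append("exit")  # runs even if the body raises


with ManagedResource() as resource:
    resource.events.append("body")
print(resource.events)  # ['enter', 'body', 'exit']

log = []
with managed(log):
    log.append("body")
print(log)  # ['enter', 'body', 'exit']
```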

Usage of the context management protocol isn't limited to management of resources, though. It can also be used for suppressing exceptions (with suppress(...)) or redirecting output (with redirect_stdout(...)):

# Bad
import os
try:
    os.remove(path)
except FileNotFoundError:
    pass

# Good
from contextlib import suppress

with suppress(FileNotFoundError):
    os.remove(path)
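The other use mentioned above, redirecting output, could look roughly like this. It's a small sketch that captures whatever is printed into an in-memory buffer (the buffer and captured names are just my choice):

```python
import io
from contextlib import redirect_stdout

buffer = io.StringIO()
with redirect_stdout(buffer):
    print("captured")  # written to `buffer`, not to the terminal

captured = buffer.getvalue()
print(repr(captured))  # 'captured\n'
```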

Checking If Parameter Was Provided

From time to time you might need to define a function that takes optional arguments. This can be done in Python very easily and you surely know how:

def myfunc(x, y=10):  # `y` is optional
    ...

myfunc(5)      # x = 5, y = 10
myfunc(5, 25)  # x = 5, y = 25

Most of the time, we use optional arguments to allow the user of our function to omit an obvious default argument or a rarely used option. In some cases, though, we might want to change the behaviour of our function based not just on the value of the optional argument but also on whether the argument was provided at all. One reasonable solution for this case could be to use None as the default (when it's None do X, when it's not do Y). But what if None is an acceptable value? You could choose another throwaway value, but there is a nice idiomatic solution to this:

_no_value = object()

def myfunc(x, y=_no_value):
    if y is _no_value:
        print("Optional parameter wasn't supplied...")

We can solve this problem by creating a constant - for example, one called _no_value - which we set as the default value for the optional argument. By doing this we avoid clashing with any possibly acceptable value, because we're actually not checking value at all - we are checking identity. In other words, we are checking whether the y argument refers to the exact same object as the one assigned to _no_value.
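Here is a slightly fuller sketch of where this pays off. The get_setting helper and the settings dictionary are hypothetical; the point is only that None can be passed as a perfectly valid default while "no default at all" is still detectable:

```python
_no_value = object()  # unique sentinel; no caller can pass it by accident


def get_setting(settings, key, default=_no_value):
    """Hypothetical lookup helper: raise unless a default was explicitly given."""
    if key in settings:
        return settings[key]
    if default is _no_value:
        raise KeyError(key)
    return default  # may legitimately be None


settings = {"debug": True}
print(get_setting(settings, "debug"))          # True
print(get_setting(settings, "timeout", None))  # None - an explicit default
try:
    get_setting(settings, "timeout")
except KeyError:
    print("no default supplied")
```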

Multiple Assignment

One of the nice features of Python is multiple assignment. In its simplest form it looks like this:

a = b = "something"

This is nice as it shortens and simplifies code, but I personally rarely get a chance to use it (keep in mind that both names end up bound to the same object, which matters for mutable values). A much more practical version of this can be used when unpacking iterables into multiple variables:

some_list = ["value1", "value2"]
first, second = some_list

This is definitely the preferable option over assigning values to each variable using indices, as it creates less visual noise, is more concise and is also less error-prone.
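To illustrate the "less error-prone" part, here is a short comparison using a hypothetical record tuple. With unpacking, a mismatch between the number of names and the number of values fails loudly instead of silently picking the wrong index:

```python
record = ("Alice", 30, "alice@example.com")  # hypothetical record

# Index-based assignment: every line repeats the source name and a magic number.
name = record[0]
age = record[1]
email = record[2]

# Unpacking: one line, and the shape of the data is checked for us.
name, age, email = record

try:
    name, age = record  # wrong number of targets
except ValueError as error:
    print(f"Unpacking failed: {error}")
```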

Variable Unpacking

Building on the previous example and going a little further - we can also use a star expression to unpack elements of an iterable of arbitrary length:

first, *middle, last = [1, 2, 3, 4, 5]
# first = 1, middle = [2, 3, 4], last = 5

first, second, *rest = [1, 2, 3, 4, 5]
# first = 1, second = 2, rest = [3, 4, 5]

name, address, *_, email = ["John", "Some Street", "Credit Card Number", "Phone Number", "john@gmail.com"]
# name = "John", address = "Some Street", email = "john@gmail.com"

header_row, *table_rows = open("filename").read().split("\n")
# header_row -> first line
# table_rows -> list of remaining lines

Quite often, values in iterables will have some pattern or known component, which can be easily extracted using unpacking. This is usually a better solution than explicitly indexing into the iterable, as that creates unreadable code full of unnamed, unexplained index values.

There's one thing to be aware of when using a star expression, though. Unpacking with a star expression always creates a list, even if the variable receives zero values from the unpacking. This can be nice, considering that you won't need to do any extra type checking, but it can also be a bit surprising to receive [] instead of None.
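A quick sketch of that behaviour, using throwaway values of my own choosing. Note that the starred name is a list regardless of the type of the iterable on the right:

```python
first, *rest = [1]
print(first, rest)  # 1 []

head, *tail = "abc"  # any iterable works; the starred name is still a list
print(head, tail)  # a ['b', 'c']

*init, last = (1, 2, 3)
print(init, last)  # [1, 2] 3
```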

If we wanted to stretch the limits of this feature, we could even unpack nested iterables in a single assignment:

geographies = {
    "EMEA": ("France", "Germany", "UK", "Sweden"),
    "LA": ("Brazil", "Argentina", "Chile", "Cuba"),
}

((geo1, (first1, *rest1)),
 (geo2, (first2, *rest2))) = geographies.items()

print(f"In {geo1} there is {first1} and {len(rest1)} more countries.")
# In EMEA there is France and 3 more countries.

I don't necessarily recommend doing this, as it will not produce very readable or nice code, but it's good to know the limits of a tool we use, even if we're not going to use this particular option very often or at all.

Swapping Values

In many other languages you would need an extra variable and 3 lines of code to swap 2 variables. In Python, however, there is a better way, similar to the previously shown multiple assignment:

# Bad
temp = a
a = b
b = temp

# Good
a, b = b, a

This is super simple and super useful, and it's one of those features that remind you how great Python is. Apart from swapping variables, this also applies to elements of mutable sequences (e.g. lists) addressed by index, which can commonly be seen in sorting algorithms:

a[i-1], a[i] = a[i], a[i-1]

This all might seem like some Python magic, but in reality Python simply evaluates everything on the right-hand side first, conceptually packs the results into a temporary tuple, and only then assigns them to the targets on the left, one by one from left to right.
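Because the right-hand side is fully evaluated before anything is assigned, the same trick works for updating several related variables at once. Here is a small sketch; the fibonacci function is my own illustration, not something from the standard library:

```python
a, b = 1, 2
a, b = b, a
print(a, b)  # 2 1


def fibonacci(n):
    """Return the first n Fibonacci numbers."""
    result = []
    current, following = 0, 1
    for _ in range(n):
        result.append(current)
        # Both expressions on the right use the *old* values of the names.
        current, following = following, current + following
    return result


print(fibonacci(8))  # [0, 1, 1, 2, 3, 5, 8, 13]
```

Without tuple assignment, the update inside the loop would need a temporary variable to avoid overwriting current before it is used.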

Processing Lists in Parallel

Oftentimes when working with - for example - databases or CSV tables, you will find yourself with multiple lists of related data. It might be a few columns from a database table, a few related datasets, etc. Regardless of what the data really is, you will probably want to work with it and process it in parallel. The simplest way to do that in Python is to use zip:

countries = [...]
population = [...]
for country, pop in zip(countries, population):
    print(f"{country} has population of {pop}.")

The zip function takes a variable number of iterables and returns a lazy iterator that yields tuples containing one element from each of the supplied iterables. This is great for processing data and it's also efficient because - as I mentioned - the iterator is lazy, so it never builds a complete list of tuples in memory, only the current tuple of elements.

When using this function you might come to realize that it's not so great when working with lists of different lengths, as it's going to yield values only until the shortest of the lists is exhausted, which might not always be desirable. In case you'd rather consume values until the longest of the lists is exhausted, you can instead use itertools.zip_longest, which will fill missing values with None or with the fillvalue provided as an argument.
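The difference between the two is easiest to see side by side. The letters and numbers lists below are placeholder data of my own:

```python
from itertools import zip_longest

letters = ["a", "b", "c"]
numbers = [1, 2]  # deliberately one element shorter

print(list(zip(letters, numbers)))
# [('a', 1), ('b', 2)]

print(list(zip_longest(letters, numbers)))
# [('a', 1), ('b', 2), ('c', None)]

print(list(zip_longest(letters, numbers, fillvalue=0)))
# [('a', 1), ('b', 2), ('c', 0)]
```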

Avoid map, filter and reduce

Python has many functional programming concepts and tools like lambda expressions, list comprehensions, the functools module, etc. There are, however, a few that are frowned upon by many people. These are map, reduce and filter. What is bad about these functions, though? Well, there are multiple reasons, but the one I have to agree with is that it's usually cleaner and clearer to write a list comprehension instead of map or filter, and in the case of reduce the code becomes hard to read when used with a non-trivial function argument. Another good reason to dislike these functions is that ideally there should be only one right way to do things, so why use map, filter, reduce or even lambda when we have list comprehensions?
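For comparison, here is the same small computation written both ways. This is just a sketch with made-up numbers; which version reads better is of course a matter of taste:

```python
from functools import reduce

numbers = [1, 2, 3, 4, 5, 6]

# map + filter with lambdas
squares_of_even = list(map(lambda n: n * n, filter(lambda n: n % 2 == 0, numbers)))

# The equivalent list comprehension
squares_of_even_lc = [n * n for n in numbers if n % 2 == 0]

print(squares_of_even_lc)  # [4, 16, 36]
print(squares_of_even == squares_of_even_lc)  # True

# reduce with a lambda versus the built-in that already does the job
total = reduce(lambda acc, n: acc + n, numbers)
print(total == sum(numbers))  # True
```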

It's understandable if you disagree with me, but before writing some angry comment, you might want to read a short write-up by Guido van Rossum, which might change your mind.

Bottom line - use the above functions sparingly and ideally just replace them with list comprehensions wherever possible.

"The only purpose of 'reduce' is to write really obfuscated code that shows how cool you are. I'm just not that cool." — Guido van Rossum

Batteries Included

Python has for a very long time maintained the philosophy of "batteries included", meaning that you will find lots of useful tools, modules and functions in the standard library that you wouldn't expect to be there. You should always check whether the problem you are trying to solve or the function you are trying to implement isn't already somewhere in the standard library, and if you can't find it, chances are you aren't looking hard enough.

There are many examples of these "batteries" all over the standard library. The first module that comes to my mind is itertools, which provides iterator building blocks. Another great one is functools with its collection of higher-order functions, and I also have to mention the collections module with very useful data types like Counter, deque or namedtuple, just to name a few.
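A tiny taste of what these modules offer, as a sketch with sample data I made up:

```python
from collections import Counter, deque, namedtuple
from itertools import chain

# Counter: count hashable items without writing the loop yourself.
words = ["spam", "eggs", "spam", "ham", "spam"]
print(Counter(words).most_common(1))  # [('spam', 3)]

# deque with maxlen: a fixed-size window that drops the oldest items.
recent = deque(maxlen=3)
for i in range(5):
    recent.append(i)
print(list(recent))  # [2, 3, 4]

# namedtuple: a lightweight, immutable record with named fields.
Point = namedtuple("Point", ["x", "y"])
p = Point(1, 2)
print(p.x + p.y)  # 3

# itertools.chain: iterate over several iterables as if they were one.
print(list(chain([1, 2], [3], [4, 5])))  # [1, 2, 3, 4, 5]
```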

So, next time you need some fairly common functionality in your program, don't reinvent the wheel. Go see the Python library docs, grab what's already there and save yourself some time.

The "Bunch" Idiom

When you define a Python class, you will most likely declare a couple of attributes in its __init__ method. You might declare just one or two attributes, but you can also end up with something like this:

class Person:
    def __init__(self, first_name, last_name, age, height, weight, gender, address, ssn):
        self.first_name = first_name
        self.last_name = last_name
        self.age = age
        self.height = height
        self.weight = weight
        self.gender = gender
        self.address = address
        self.ssn = ssn

With just a few attributes in a class it's kind of okay to write them out and it won't clutter your code that much, but if there were many attributes - like the eight in the code above - would you still be okay with writing them all out? Well, I wouldn't. So, to avoid it you can use the so-called "bunch" idiom:

class Person:
    def __init__(self, **kwargs):
        self.__dict__.update(**kwargs)

class Person:
    def __init__(self, **kwargs):
        vars(self).update(**kwargs)  # Alternatively use `vars()`

The snippets above demonstrate usage of self.__dict__, which is a dictionary that stores all the attributes of an instance (unless __slots__ is declared). Here we pass all keyword arguments of the constructor to its update method, which creates the corresponding attributes. It's also possible to use vars(self), which looks a little nicer in my opinion.

You might consider this a dirty hack, but I think it's okay to use it, especially if you have a class (data structure) for storing a bunch of attributes without any real functionality. Also, Raymond Hettinger says it's okay to update the instance dictionary directly, so there's that.
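Here is how using such a class could look in practice, with made-up attribute values. As a side note that goes slightly beyond the idiom itself, the standard library ships types.SimpleNamespace, which behaves in a very similar way out of the box:

```python
from types import SimpleNamespace


class Person:
    def __init__(self, **kwargs):
        vars(self).update(kwargs)


person = Person(first_name="John", last_name="Doe", age=30)
print(person.first_name, person.age)  # John 30
print(vars(person))
# {'first_name': 'John', 'last_name': 'Doe', 'age': 30}

# Standard library alternative with the same "bag of attributes" behaviour.
point = SimpleNamespace(x=1, y=2)
print(point.x, point.y)  # 1 2
```

One trade-off to keep in mind: because any keyword is accepted, a typo such as Person(frist_name="John") silently creates a new attribute instead of raising an error.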

Conclusion

I definitely recommend using all of the above idioms and tips in your Python code, and I believe they will make your code more Pythonic and idiomatic. There is, however, no single answer to "What is Pythonic?" or "What is idiomatic?", and what works for me might not work for you. So, use idioms to make your code more readable, concise and effective, and not just because they are idiomatic. In the same way, use language-specific features of Python to improve your code, not just to make it more Pythonic.
