Writing More Idiomatic and Pythonic Code
Martin Heinz
Posted on September 1, 2020
There are lots of ways to implement the same feature, algorithm or function. Some are straightforward and clear, and therefore better; others are confusing or inefficient, and therefore worse. The Python community often uses terms like Pythonic or idiomatic to describe code that follows a certain (natural, proper) style and set of conventions. That's the kind of good, clear code we all try to write every day, and in this article we will go over a few tips, conventions and idioms that will help you write more idiomatic and Pythonic code.
Identity and Equality Comparisons
Not just in Python, but really in any programming language, you can fall into the trap of mixing up identity and value equality. In Python you have the choice of using either `is` or `==` for comparisons, where `is` checks identity and `==` checks value.
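Considering that most of the time we only care about value, not identity, we would usually choose `==`. There are cases, however, where you should always use the `is` operator instead. One of those is comparison with Python's singletons: `None`, `True` or `False`. Note that this is about checking for the singleton itself; for a plain truthiness test, PEP 8 recommends writing `if x:` rather than comparing against `True` or `False` at all.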
Using `is None`, `is True` or `is False` isn't just about convention or improved readability, though. It also improves performance, especially if you use `x is None` instead of `x == None` inside a loop. Why is that, you might ask? Well, it's because the `is` operator cannot be overloaded like `==` (which, roughly speaking, ends up calling `a.__eq__(b)`), so Python can skip the lookup of the dunder methods that are needed to evaluate a comparison using `==`.
So, the bottom line here is that you should use `is` whenever identity is what you actually care about, as it is more readable, faster and idiomatic. To find out whether you can use it, ask yourself whether you care about the value or the identity of the variables being compared.
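To make the difference concrete, here is a minimal sketch. `AlwaysEqual` is a hypothetical class invented for this illustration; it overloads `__eq__` so that it claims to be equal to everything, which is exactly the kind of thing `is` cannot be fooled by.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


obj = AlwaysEqual()

print(obj == None)  # True: the overloaded __eq__ answers the question
print(obj is None)  # False: identity cannot be overloaded

first = [1, 2, 3]
second = [1, 2, 3]
print(first == second)  # True: same value
print(first is second)  # False: two distinct list objects
```

In other words, `x == None` asks the object what it thinks, while `x is None` asks the interpreter whether the two names refer to the very same object.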
Context Managers Instead of try/finally
In other languages it's common practice to use `try/finally` to manage resources and to make sure you dispose of opened files or acquired locks if an exception occurs. You could use `try/finally` in Python too, but we can do better using the `with` statement:
from contextlib import closing
from urllib.request import urlopen

# Bad
page = urlopen(url)  # open before `try`, so `finally` never sees an unbound `page`
try:
    ...
finally:
    page.close()

# Good
with closing(urlopen(url)) as page:
    ...
The code above shows usage of the so-called context manager protocol, which consists of 2 methods, `__enter__` and `__exit__`, which are called when entering and exiting the body of the `with` block, respectively. You probably already know about the `with` statement and its usage, but you might not know about `contextlib`, used above. It's a module that provides utilities for working with context managers, including tools for turning functions into them. As for the `closing` function above, it just makes sure the `.close()` method of the object (in this case `page`) is called when the block exits.
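As a rough illustration of turning a function into a context manager, here is a minimal sketch using `contextlib.contextmanager`. The names `managed_resource` and `events` are made up for this example; the code before `yield` plays the role of `__enter__` and the code in `finally` plays the role of `__exit__`.

```python
from contextlib import contextmanager

events = []  # records the order in which things happen


@contextmanager
def managed_resource(name):
    """Hypothetical resource: set up before the block, clean up after it."""
    events.append(f"open {name}")
    try:
        yield name
    finally:
        # Runs even when the body of the `with` block raises.
        events.append(f"close {name}")


try:
    with managed_resource("db") as res:
        events.append(f"use {res}")
        raise ValueError("something went wrong")
except ValueError:
    pass

print(events)  # ['open db', 'use db', 'close db']
```

The cleanup step is recorded even though the body raised, which is the same guarantee `try/finally` gives, just packaged so that callers cannot forget it.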
Usage of the context manager protocol isn't limited to management of resources, though. It can also be used for suppressing exceptions (`with suppress(...)`) or redirecting output (`with redirect_stdout(...)`):
# Bad
import os

try:
    os.remove(path)
except FileNotFoundError:
    pass

# Good
from contextlib import suppress

with suppress(FileNotFoundError):
    os.remove(path)
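The `redirect_stdout` variant mentioned above works the same way. A small sketch, assuming we simply want to capture what `print` writes into an in-memory buffer (the names `buffer` and `captured` are only for illustration):

```python
import io
from contextlib import redirect_stdout

buffer = io.StringIO()
with redirect_stdout(buffer):
    print("captured")  # goes into `buffer`, not to the terminal

captured = buffer.getvalue()
print(repr(captured))  # 'captured\n'
```

Once the `with` block exits, `sys.stdout` is restored, so the final `print` goes to the terminal again.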
Checking If Parameter Was Provided
From time to time you might need to define a function that takes optional arguments. This can be done in Python very easily and you surely know how:
def myfunc(x, y=10):  # `y` is optional
    ...

myfunc(5)      # x = 5, y = 10
myfunc(5, 25)  # x = 5, y = 25
Most of the time, we use optional arguments to allow the user of our function to omit an obvious default argument or a rarely used option. In some cases, though, we might want to change the behaviour of our function based not just on the value of the optional argument but also on whether the argument was provided at all. One reasonable solution for this case could be to use `None` as the default (when it's `None` do X, when it's not do Y). But what if `None` is an acceptable value? You could choose another throwaway value, but there is a nice idiomatic solution to this:
_no_value = object()

def myfunc(x, y=_no_value):
    if y is _no_value:
        print("Optional parameter wasn't supplied...")
We can solve this problem by creating a constant, called for example `_no_value`, which we set as the default value for the optional argument. By doing this we avoid colliding with any possibly acceptable value, because we're not checking value at all, we are checking identity. In other words, we are checking whether the `y` argument refers to the exact same object as the one assigned to `_no_value`.
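Here is a short sketch of the sentinel in action for a case where `None` is a legitimate argument. The function `describe` and its return strings are hypothetical, chosen only to show the three distinguishable situations.

```python
_no_value = object()  # unique sentinel; nothing else is identical to it


def describe(x, y=_no_value):
    """Hypothetical function for which None is a legitimate value of y."""
    if y is _no_value:
        return f"x={x}, y not supplied"
    return f"x={x}, y={y!r}"


print(describe(5))        # x=5, y not supplied
print(describe(5, None))  # x=5, y=None
print(describe(5, 25))    # x=5, y=25
```

With `None` as the default, the first two calls would be indistinguishable; with the sentinel they are not.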
Multiple Assignment
One of the nice features of Python is multiple assignment. In its simplest form it looks like this:
a = b = "something"
This is nice as it shortens and simplifies code, but I personally rarely get a chance to use it. A much more practical version of this can be used when unpacking iterables into multiple variables:
some_list = ["value1", "value2"]
first, second = some_list
This is definitely preferable to assigning values to each variable using indices, as it creates less visual noise, is more concise and is also less error-prone.
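One thing worth keeping in mind with the chained form `a = b = ...` is that both names end up bound to the same object, which matters as soon as that object is mutable. A small sketch with throwaway names:

```python
a = b = []      # both names refer to the same list object
a.append(1)
print(b)        # [1]
print(a is b)   # True

c, d = [], []   # unpacking two separate lists instead
c.append(1)
print(d)        # []
```

For immutable values such as strings or numbers this sharing is harmless, which is why the string example above behaves as expected.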
Variable Unpacking
Building on the previous example and going a little further, we can also use a star expression to unpack elements of an iterable of arbitrary length:
first, *middle, last = [1, 2, 3, 4, 5]
# first = 1, middle = [2, 3, 4], last = 5

first, second, *rest = [1, 2, 3, 4, 5]
# first = 1, second = 2, rest = [3, 4, 5]

name, address, *_, email = ["John", "Some Street", "Credit Card Number", "Phone Number", "john@gmail.com"]
# name = "John", address = "Some Street", email = "john@gmail.com"

with open("filename") as f:  # `with` closes the file for us
    header_row, *table_rows = f.read().split("\n")
# header_row -> first line
# table_rows -> list of remaining lines
Quite often, values in iterables will have some pattern or known component, which can be easily extracted using unpacking. This is usually a better solution than explicitly indexing into the iterable, as indexing creates hard-to-read code full of unnamed values.
There's one thing to be aware of when using a star expression, though. Unpacking with a star expression always creates a list, even if the variable receives zero values from unpacking. That can be nice, considering that you won't need to do any extra type checking, but it can also be a bit surprising to receive `[]` instead of `None`.
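A quick sketch of that behaviour, using throwaway values:

```python
first, *rest = [1]
print(first, rest)  # 1 []

head, *middle, tail = "ab"
print(head, middle, tail)  # a [] b

# The starred name is a list even when the source is a tuple.
x, *others = (1, 2, 3)
print(type(others).__name__, others)  # list [2, 3]
```

So a check like `if rest:` works uniformly, whereas `if rest is None:` would never be true.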
If we wanted to stretch the limits of this feature, then we could even unpack multiple levels of iterable into other iterables:
geographies = {
    "EMEA": ("France", "Germany", "UK", "Sweden"),
    "LA": ("Brazil", "Argentina", "Chile", "Cuba"),
}

((geo1, (first1, *rest1)),
 (geo2, (first2, *rest2))) = geographies.items()

print(f"In {geo1} there is {first1} and {len(rest1)} more countries.")
# In EMEA there is France and 3 more countries.
I don't necessarily recommend doing this, as this will not produce very readable or nice code, but it's good to know limits of a tool we use even if we're not going to use this particular option very often or at all.
Swapping Values
In many other languages you would need an extra variable and 3 lines of code to swap 2 variables. In Python, however, there is a better way, similar to the previously shown multiple assignment:
# Bad
temp = a
a = b
b = temp
# Good
a, b = b, a
This is super simple and super useful, and it's one of those features that remind you how great Python is. Apart from swapping variables, this also works for elements of mutable sequences (e.g. lists) addressed by index, which can be commonly seen in sorting:
a[i-1], a[i] = a[i], a[i-1]
This all might seem like some Python magic, but in reality the right-hand side is simply evaluated first, in full, and only then are its values assigned to the targets on the left, from left to right. Python takes care of holding the intermediate values, so you never have to create a temporary variable yourself.
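Because the whole right-hand side is evaluated before anything is assigned, the same trick extends beyond plain swaps. A small sketch with made-up values; the Fibonacci-style loop is only there to show that both right-hand expressions see the old values:

```python
a, b = 1, 2
a, b = b, a
print(a, b)  # 2 1

items = [3, 1, 2]
items[0], items[1] = items[1], items[0]
print(items)  # [1, 3, 2]

# Both right-hand values are computed from the old `prev` and `curr`.
prev, curr = 0, 1
for _ in range(5):
    prev, curr = curr, prev + curr
print(prev, curr)  # 5 8
```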
Processing Lists in Parallel
Oftentimes when working with, for example, databases or CSV tables, you will find yourself with multiple lists of related data. It might be a few columns from a database table, a few related datasets, etc. Regardless of what the data really is, you will probably want to work with it and process it in parallel. The simplest way to do that in Python is to use `zip`:
countries = [...]
population = [...]

for country, pop in zip(countries, population):
    print(f"{country} has population of {pop}.")
The `zip` function takes a variable number of iterables and returns a lazy iterator that yields tuples containing one element from each of them. This is great for processing data and it's also memory-efficient because, as I mentioned, the iterator is lazy: it never builds a full list of tuples, only the current tuple of elements.
When using this function you might come to realize that it's not so great when working with lists of different lengths, as it's going to yield values only until the shortest of the lists is exhausted, which might not always be desirable. In case you'd rather consume values until the longest of the lists is exhausted, you can instead use `itertools.zip_longest`, which will fill missing values with `None` or with the `fillvalue` provided as an argument.
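The difference between the two is easiest to see side by side. The sample data below is made up for this sketch, with `population` deliberately one element short:

```python
from itertools import zip_longest

countries = ["France", "Germany", "Brazil"]  # made-up sample data
population = [68, 83]                         # deliberately one short

print(list(zip(countries, population)))
# [('France', 68), ('Germany', 83)]

print(list(zip_longest(countries, population)))
# [('France', 68), ('Germany', 83), ('Brazil', None)]

print(list(zip_longest(countries, population, fillvalue=0)))
# [('France', 68), ('Germany', 83), ('Brazil', 0)]
```

If a length mismatch should be treated as a bug rather than padded over, Python 3.10 and later also accept `zip(..., strict=True)`, which raises `ValueError` instead of silently truncating.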
Avoid map, filter and reduce
Python has many functional programming concepts and functions, like `lambda` expressions, list comprehensions, the `functools` module, etc. There are, however, a few that are frowned upon by many people. These are `map`, `reduce` and `filter`. What is bad about these functions, though? Well, there are multiple reasons, but the one I have to agree with is that it's usually cleaner and clearer to write a list comprehension instead of `map` or `filter`, and in the case of `reduce` the code becomes hard to read when used with a non-trivial function argument. Another good reason to dislike these functions is that ideally there should be only one right way to do things, so why use `map`, `filter`, `reduce` or even `lambda` when we have list comprehensions?
It's understandable if you disagree with me, but before writing some angry comment, you might want to read the short write-up by Guido van Rossum, which might change your mind.
Bottom line: use the above functions sparingly and ideally just replace them with list comprehensions wherever possible.
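To compare the two styles directly, here is a small sketch that computes the squares of the even numbers in a list both ways, and a sum with `reduce` versus the built-in `sum`. The data and variable names are made up for the illustration.

```python
from functools import reduce

numbers = [1, 2, 3, 4, 5, 6]

# map + filter with lambdas
squares_of_evens = list(map(lambda n: n * n, filter(lambda n: n % 2 == 0, numbers)))

# The same result as a list comprehension
squares_of_evens_lc = [n * n for n in numbers if n % 2 == 0]

print(squares_of_evens == squares_of_evens_lc)  # True
print(squares_of_evens_lc)  # [4, 16, 36]

# reduce versus a built-in that states the intent directly
total_reduce = reduce(lambda acc, n: acc + n, numbers, 0)
total_sum = sum(numbers)
print(total_reduce, total_sum)  # 21 21
```

Both versions produce the same values; the comprehension reads left to right as "this expression, for these items, if this condition", while the `map`/`filter` version has to be read inside out.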
"The only purpose of 'reduce' is to write really obfuscated code that shows how cool you are. I'm just not that cool." — Guido van Rossum
Batteries Included
Python has for a very long time maintained the philosophy of "batteries included", meaning that you will find lots of useful tools, modules and functions in the standard library that you wouldn't expect to be there. You should always check whether the problem you are trying to solve or the function you are trying to implement isn't already somewhere in the standard library, and if you can't find it, chances are you aren't looking hard enough.
There are many examples of these "batteries" all over the standard library. The first module that comes to my mind is `itertools`, which provides iterator building blocks. Another great one is `functools`, with its collection of higher-order functions, and I also have to mention the `collections` module, with very useful datatypes like `Counter`, `deque` or `namedtuple`, just to name a few.
So, next time you need some fairly common functionality in your program, don't reinvent the wheel. Go see the Python library docs, grab what's already there and save yourself some time.
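As a small taste of the `collections` datatypes named above, here is a sketch with made-up data. Each of these replaces a few lines of hand-written bookkeeping:

```python
from collections import Counter, deque, namedtuple

# Counter: tally occurrences without a manual dict loop
words = ["spam", "eggs", "spam", "ham", "spam"]
counts = Counter(words)
print(counts.most_common(1))  # [('spam', 3)]

# deque with maxlen: keeps only the most recent items
recent = deque(maxlen=3)
for n in range(5):
    recent.append(n)
print(list(recent))  # [2, 3, 4]

# namedtuple: a tuple whose fields have names
Point = namedtuple("Point", ["x", "y"])
p = Point(1, 2)
print(p.x + p.y)  # 3
```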
The "Bunch" Idiom
When you define a Python class you will most likely declare a couple of attributes in its `__init__` method. You might declare just one or two attributes, but you can also end up with something like this:
class Person:
    def __init__(self, first_name, last_name, age, height, weight, gender, address, ssn):
        self.first_name = first_name
        self.last_name = last_name
        self.age = age
        self.height = height
        self.weight = weight
        self.gender = gender
        self.address = address
        self.ssn = ssn
With just a few attributes in a class it's kind of okay to write them out and it won't clutter your code that much, but if there were 10 or so attributes, roughly as in the code above, would you still be okay with writing them all out? Well, I wouldn't. So, to avoid it you can use the so-called "bunch" idiom:
class Person:
    def __init__(self, **kwargs):
        self.__dict__.update(**kwargs)

class Person:
    def __init__(self, **kwargs):
        vars(self).update(**kwargs)  # Alternatively use `vars()`
The snippet above demonstrates usage of `self.__dict__`, which is the dictionary that stores all the attributes of an instance (unless `__slots__` is declared). Here we pass any keyword arguments of the constructor to the `update` method, which creates all the attributes. It's also possible to use `vars(self)`, which looks a little nicer in my opinion.
You might consider this a dirty hack, but I think it's okay to use it, especially if you have a class (data structure) for storing a bunch of attributes without any real functionality. Also, Raymond Hettinger says it's okay to update the instance dictionary directly, so there's that.
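A short usage sketch of the idiom follows, with made-up attribute values. It also shows one trade-off to be aware of (nothing validates the keyword names) and `types.SimpleNamespace`, a standard library class that behaves much the same way for simple attribute bags.

```python
from types import SimpleNamespace


class Person:
    def __init__(self, **kwargs):
        vars(self).update(kwargs)


person = Person(first_name="John", last_name="Doe", age=30)  # made-up values
print(person.first_name, person.age)  # John 30
print(vars(person))
# {'first_name': 'John', 'last_name': 'Doe', 'age': 30}

# Nothing validates the keyword names, so a typo silently becomes a new attribute.
typo = Person(frist_name="John")
print(hasattr(typo, "first_name"))  # False

# types.SimpleNamespace from the standard library behaves much the same way.
ns = SimpleNamespace(first_name="John", age=30)
print(ns.first_name, ns.age)  # John 30
```

Whether that flexibility is a feature or a liability depends on how much structure the class is supposed to enforce.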
Conclusion
I definitely recommend using all of the above idioms and tips in your Python code, and I believe they will make your code more Pythonic and idiomatic. There is, however, no single answer to "What is Pythonic?" or "What is idiomatic?", and what works for me might not work for you. So, use idioms to make your code more readable, concise and effective, and not just because they're idiomatic. In the same way, use language-specific features of Python to improve your code, not just to make it more Pythonic.