A Test Driven Approach to Python Packaging
Derek D.
Posted on June 15, 2020
Recently I reached a breaking point with importing local packages in Python. I consistently found myself dealing with ImportError or ModuleNotFoundError exceptions when starting a new project. A lot of it is my fault for never sitting down and properly understanding Python's import mechanism until now, but some of it comes from the plethora of answers and advice online that tell you how to fix a problem but not why the fix works. I hope to bridge that gap today.
There are a few key questions I want to answer about this topic that I think will clear up most of the confusion.
- Is an __init__.py file needed in every folder?
- Is a package you import the same thing as the package you install from PyPI?
- Do pytest and Python's builtin unittest module import packages differently?
The simple answers to these questions are
- Yes if using Python 3.2 or earlier, and No if using Python 3.3 or newer.
- No, they are different even though they represent the same code.
- Yes
Historical Perspective
Although Python 2 reached end of life this year, its influence on Python devs has been profound. When I started writing Python several years ago, every project I worked with only supported Python 2, and in Python 2 an __init__.py file was required in every directory; otherwise the directory wasn't importable. You can test this for yourself. Create a directory structure as shown below, open up a Python 2 REPL and try to import greeter.lang. It doesn't work until you add an __init__.py to the lang folder.
experiment/
|_greeter
| |_ __init__.py
| |_lang
| | |_en.py
| | |_es.py
| |
| |_greeting.py
|
|_tests
| |_test_greeting.py
This behaviour was present even in the early days of Python 3. PEP-420 finally removed that requirement in Python 3.3, but long-time Python developers have kept adding __init__.py files because they never knew to change their habits. The habit has since been passed down to newer devs, either directly from those developers or through historical answers on Q&A sites like Stack Overflow.
If you don't believe me that __init__.py files are no longer needed in every directory, you can read it for yourself in the Python 3 docs here or in this screenshot of the paragraph that confirms it.
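If you would rather check this from code than from the docs, here is a minimal sketch, assuming Python 3.3 or newer. It rebuilds part of the experiment tree in a temporary directory, leaves the lang folder without an __init__.py, and imports greeter.lang.en anyway. The GREETING constant and the build_experiment helper are made up for the demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_experiment(root: Path) -> None:
    """Recreate part of the experiment/ tree; lang/ gets no __init__.py."""
    lang = root / "greeter" / "lang"
    lang.mkdir(parents=True)
    (root / "greeter" / "__init__.py").write_text("")
    # GREETING is a hypothetical constant, only here so there is something to read.
    (lang / "en.py").write_text("GREETING = 'Hello'\n")


with tempfile.TemporaryDirectory() as tmp:
    build_experiment(Path(tmp))
    sys.path.insert(0, tmp)  # make the temporary tree importable
    try:
        importlib.invalidate_caches()
        en = importlib.import_module("greeter.lang.en")
        greeting = en.GREETING
        lang_pkg = sys.modules["greeter.lang"]
        # A package without __init__.py has no file backing it.
        lang_has_file = getattr(lang_pkg, "__file__", None) is not None
    finally:
        sys.path.remove(tmp)

print(greeting, lang_has_file)  # Hello False
```

On Python 3.2 or earlier the import_module call would raise ImportError until lang/__init__.py is added.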
I have some examples of common scenarios you'll find yourself in that I want to share with you, but before that, I figure we should define what a package actually is.
Paraphrasing from the setuptools documentation (setuptools is the Python library for building and distributing PyPI packages):
"A package in the context of PyPI is a distribution of bundled software. A package in the context of Python is a container of modules."
Simply put, a module is just a Python source code file.
Additionally, there are now two types of Python packages: "Regular Packages", which adhere to the requirements of Python 3.2 and earlier (i.e. an __init__.py in every directory), and "Namespace Packages", which follow the rules from PEP-420 and bring a whole new meaning to being 420 friendly.
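To make the distinction concrete, here is a rough sketch of how you might ask the import system which kind of package it sees. The package_kind helper and the demo_regular / demo_namespace directory names are hypothetical; the only real API used is importlib.util.find_spec, and the check relies on namespace packages having no origin file in their module spec.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path


def package_kind(name: str) -> str:
    """Classify an importable name as module, regular or namespace package."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return "not found"
    if spec.submodule_search_locations is None:
        return "module"  # a plain .py file (or builtin), not a package
    # No __init__.py means no origin file ("namespace" on older Python 3).
    if spec.origin in (None, "namespace"):
        return "namespace package"
    return "regular package"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "demo_regular").mkdir()
    (root / "demo_regular" / "__init__.py").write_text("")
    (root / "demo_namespace").mkdir()
    (root / "demo_namespace" / "en.py").write_text("")
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        kinds = {
            name: package_kind(name)
            for name in ("demo_regular", "demo_namespace", "demo_namespace.en")
        }
    finally:
        sys.path.remove(tmp)

for name, kind in kinds.items():
    print(name, "->", kind)
```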
With that in mind, let's get down to the examples. Since most of my confusion came while I was following Test Driven Development, all of these examples use pytest and Python's builtin unittest module to run a test module that imports some other Python module/package of varying complexity.
Example 1
The simplest case is where the test is in the same module as the code it is testing. The directory structure would look like this.
example-1/
|_ greeter.py
Both pytest greeter.py and python -m unittest greeter.py pass. Since there is nothing to import, this is expected. If you want to see the source code for this example, it's available on GitHub.
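For readers who don't want to click through, a single-file module of this shape might look roughly like the sketch below. This is not the code from the GitHub repo: the greet function and its behaviour are assumptions made up for illustration. The test is written as a unittest.TestCase so that both runners can collect it.

```python
import unittest


def greet(name):
    """Hypothetical function under test: build a greeting for name."""
    return "Hello, {}!".format(name)


class GreeterTest(unittest.TestCase):
    # The test lives in the same module as greet(), so nothing is imported
    # from another file and neither runner has anything to look up.
    def test_greet(self):
        self.assertEqual(greet("World"), "Hello, World!")
```

Saved as greeter.py, this should be runnable with either of the two commands above.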
Example 2
A slightly more complex but more common case is when the test module is separate from the module being tested. The directory structure would look like this.
example-2/
|_greeter.py
|_test_greeter.py
In this case pytest test_greeter.py and python -m unittest test_greeter.py once again pass, even without an __init__.py file. The source code for this example is also available on GitHub.
More complex projects need better organization than a single source code module and a single test module. This is where Python packages come into play. Examples 3, 4 and 5 all deal with the various ways a package can be structured (i.e. nesting the test module within the source code package or keeping it separate from the source package).
Example 3
This example covers the case where the test module is nested within the source code package. The directory structure looks like this.
example-3/
|_translator/
| |_greeter.py
| |_tests/
| | |_test_greeter.py
This time only python -m unittest translator/tests/test_greeter.py passes the test. pytest translator/tests/ fails with the error ModuleNotFoundError: No module named 'translator'. That's because, unlike python -m unittest, the pytest command does not add the current directory to sys.path; in its default import mode it adds translator/tests instead, so translator can only be imported once it has been installed as a package. You can get pytest to pass by adding an __init__.py file under the translator directory, adding a setup.py file under the example-3 directory and running pip install . to install it. You can see the source code here and the PR that fixes it here.
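One way to picture the failure: an import only succeeds if some directory on sys.path contains the package. The sketch below assumes, as I understand pytest's default "prepend" import mode, that pytest puts the test file's directory at the front of sys.path, while python -m unittest run from example-3 has the project root available. The findable and build_example_3 helpers are hypothetical; importlib.machinery.PathFinder is the standard-library finder that searches a list of directories.

```python
import tempfile
from importlib.machinery import PathFinder
from pathlib import Path


def build_example_3(root: Path) -> Path:
    """Recreate the example-3 layout (no __init__.py anywhere)."""
    tests = root / "translator" / "tests"
    tests.mkdir(parents=True)
    (root / "translator" / "greeter.py").write_text("")
    (tests / "test_greeter.py").write_text("")
    return tests


def findable(name, search_dirs) -> bool:
    """True if `name` can be located when only search_dirs are searched."""
    return PathFinder.find_spec(name, [str(d) for d in search_dirs]) is not None


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    tests_dir = build_example_3(root)
    # Roughly what pytest sees: only translator/tests is searched.
    from_tests_dir = findable("translator", [tests_dir])
    # Roughly what python -m unittest sees: the project root is searched.
    from_project_root = findable("translator", [root])

print(from_tests_dir, from_project_root)  # False True
```

Installing the package with pip puts translator in a location that is always on sys.path, which is why the fix works regardless of where pytest is started.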
Example 4
This is the same as example 3 except the tests package is outside the source code package.
example-4/
|_ translator/
| |_ greeter.py
|
|_ tests
| |_ test_greeter.py
Once again python -m unittest tests/test_greeter.py passes but pytest tests/ fails with a ModuleNotFoundError. The fix is exactly the same as in example 3: add an __init__.py file under the translator directory and a setup.py file under the example-4 directory, then run pip install . to install it. The source code can be seen here and the PR to fix it can be seen here.
Example 5
Example 5 is a little more complex because there is a subpackage under the translator package. The directory structure looks like this.
example-5/
|_translator/
| |_ greeter.py
| |_ lang/
| | |_ en.py
| | |_ es.py
|
|_tests
| |_test_greeter.py
As you've probably already guessed, python -m unittest tests/test_greeter.py passes and pytest tests/ fails with ModuleNotFoundError. If you apply the same fix from examples 3 and 4, pytest still fails, but this time with the error ModuleNotFoundError: No module named 'lang'. This error is caused by the setup.py file I've been using. In setup.py I use the find_packages() function from setuptools, which traverses the directory structure looking for packages, but it only finds "Regular Packages". That's right, the ones that require an __init__.py file to be present. So to fix this example, an __init__.py needs to be added under the lang directory as well. Unless you used the -e flag in your pip install command, you'll need to rerun pip install . afterwards, since the __init__.py has to exist when the package is installed.
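The discovery rule is easier to see in code. The function below is a simplified standard-library stand-in for what find_packages() does, not the setuptools implementation itself: a directory counts only if it contains an __init__.py, and the walk does not descend into directories that lack one. The names find_regular_packages and build_example_5 are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def find_regular_packages(where):
    """Simplified stand-in for find_packages(): regular packages only."""
    found = []
    for current, dirs, _files in os.walk(where):
        keep = []
        for name in sorted(dirs):
            full = os.path.join(current, name)
            if os.path.isfile(os.path.join(full, "__init__.py")):
                rel = os.path.relpath(full, where)
                found.append(rel.replace(os.sep, "."))
                keep.append(name)
        # Only descend into directories that are themselves regular packages.
        dirs[:] = keep
    return found


def build_example_5(root: Path) -> None:
    """Recreate example-5 after the fix from examples 3 and 4 was applied."""
    lang = root / "translator" / "lang"
    lang.mkdir(parents=True)
    (root / "tests").mkdir()
    (root / "translator" / "__init__.py").write_text("")
    (root / "translator" / "greeter.py").write_text("")
    (lang / "en.py").write_text("")
    (lang / "es.py").write_text("")
    (root / "tests" / "test_greeter.py").write_text("")


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    build_example_5(root)
    before = find_regular_packages(root)
    (root / "translator" / "lang" / "__init__.py").write_text("")
    after = find_regular_packages(root)

print(before)  # ['translator']
print(after)   # ['translator', 'translator.lang']
```

setuptools also provides find_namespace_packages(), which is meant for packages without __init__.py; I have not used it in these examples.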
Conclusion
I hope this saves you hours of debugging import issues so you can focus on the more enjoyable parts of coding. Here are some best practices I've taken away from this journey.
- __init__.py files should only be used when needed by setuptools or when a package needs some initial setup on import.
- Tests should remain outside the source code package (this can help with things like Docker images, which should only contain production code).
- pip install -e . will pick up new subpackages that have an __init__.py without needing to rerun pip install .
A good next step would be to look into importlib, which exposes the implementation of the import statement.
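As a small taste of it, importlib.import_module is the function form of the import statement. This sketch uses the standard library's json.decoder as an arbitrary example and shows that the result is the same cached module object an ordinary import would give you.

```python
import importlib
import sys

# Equivalent in effect to "import json.decoder", but driven by a string.
decoder = importlib.import_module("json.decoder")

# Imported modules are cached in sys.modules, parents included.
same_object = decoder is sys.modules["json.decoder"]
parent_loaded = "json" in sys.modules

print(same_object, parent_loaded)  # True True
```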
Thanks for reading. If you enjoy this content check out my other articles on DEV.to, or tune into the Namespace Podcast which I co-host with another TDD, Python junkie like myself.