Python Tips and Tricks You Haven't Already Seen
Martin Heinz
Posted on August 6, 2019
Note: This was originally posted at martinheinz.dev
There are plenty of articles about cool Python features such as variable unpacking, partial functions, and enumerating iterables. There is much more to talk about when it comes to Python, though, so here I will try to show some of the features I know and use that I haven't yet seen mentioned elsewhere. So here we go.
Sanitizing String Input
The problem of sanitizing user input applies to almost every program you might write. Often it's enough to convert characters to lower or upper case, and sometimes you can use a regex to do the work, but for complex cases there might be a better way:
user_input = "This\nstring has\tsome whitespaces...\r\n"
character_map = {
ord('\n') : ' ',
ord('\t') : ' ',
ord('\r') : None
}
user_input.translate(character_map) # This string has some whitespaces...
In this example you can see that the whitespace characters "\n" and "\t" have been replaced by a single space and "\r" has been removed completely. This is a simple example, but we could take it much further and generate big remapping tables using the unicodedata module and its combining() function, producing a map which we could use to remove all accents from a string.
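The accent-stripping idea might look roughly like the sketch below. The helper name strip_accents and the NFD normalization step are assumptions of this sketch: normalizing first splits each accented letter into a base letter plus combining marks, and the translation table then deletes those marks.

```python
import sys
import unicodedata

# Map every combining code point to None so that translate() deletes it.
combining_map = dict.fromkeys(
    code for code in range(sys.maxunicode + 1)
    if unicodedata.combining(chr(code))
)

def strip_accents(text):
    # Hypothetical helper: decompose accented characters, then drop the marks.
    decomposed = unicodedata.normalize("NFD", text)
    return decomposed.translate(combining_map)

print(strip_accents("Crème brûlée"))  # Creme brulee
```

Building the table walks the whole Unicode range once, so it is worth creating it a single time and reusing it.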
Taking a Slice of an Iterator
If you try to take a slice of an iterator, you will get a TypeError stating that the generator object is not subscriptable, but there is an easy solution to that:
import itertools
s = itertools.islice(range(50), 10, 20) # <itertools.islice object at 0x7f70fab88138>
for val in s:
    ...
Using itertools.islice we can create an islice object, which is an iterator that produces the desired items. It's important to note, though, that this consumes all generator items up to the start of the slice, as well as all the items in our islice object.
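To make the consumption behaviour concrete, here is a small sketch. The generator count_up_to is a made-up example, not part of itertools.

```python
import itertools

def count_up_to(n):
    # Hypothetical generator used only for this illustration.
    for i in range(n):
        yield i

gen = count_up_to(10)

# Items 0 and 1 are consumed and thrown away, items 2..4 are returned.
first = list(itertools.islice(gen, 2, 5))
print(first)  # [2, 3, 4]

# The underlying generator continues from where islice stopped.
rest = list(gen)
print(rest)  # [5, 6, 7, 8, 9]
```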
Skipping the Beginning of an Iterable
Sometimes you have to work with files which you know start with a variable number of unwanted lines, such as comments. itertools again provides an easy solution to that:
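import itertools
# The backslash right after the opening quotes stops the string from starting
# with an empty line, which would make dropwhile stop immediately.
string_from_file = """\
// Author: ...
// License: ...
//
// Date: ...
Actual content...
"""
for line in itertools.dropwhile(lambda line: line.startswith("//"), string_from_file.splitlines()):
    print(line)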
This code snippet produces only the lines after the initial comment section. This approach can be useful when we only want to discard items (lines in this instance) at the beginning of the iterable and don't know how many of them there are.
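One detail worth seeing in action is that dropwhile stops filtering as soon as the predicate fails once, so matching items later in the iterable are kept. A minimal sketch with made-up sample lines:

```python
import itertools

# Sample data invented for this illustration.
lines = ["// header", "// more header", "data 1", "// inline comment", "data 2"]

result = list(itertools.dropwhile(lambda line: line.startswith("//"), lines))
print(result)  # ['data 1', '// inline comment', 'data 2']
```

If every comment line should go, a plain filter or list comprehension is the better tool.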
Functions with only Keyword Arguments (kwargs)
It can be helpful to create a function that only takes keyword arguments to provide (force) more clarity when using such a function:
def test(*, a, b):
    pass
test("value for a", "value for b") # TypeError: test() takes 0 positional arguments...
test(a="value", b="value 2") # Works...
As you can see, this can be easily achieved by placing a single * argument before the keyword arguments. There can of course be positional arguments too if we place them before the * argument.
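Mixing positional and keyword-only parameters could look like the sketch below. The function connect and its parameters are invented for illustration.

```python
def connect(host, port, *, timeout=10, retries=3):
    # host and port may be passed positionally; timeout and retries may not.
    return f"{host}:{port} timeout={timeout} retries={retries}"

print(connect("localhost", 5432, timeout=5))  # localhost:5432 timeout=5 retries=3

try:
    connect("localhost", 5432, 5)  # third positional argument is rejected
except TypeError as error:
    print(f"TypeError: {error}")
```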
Creating an Object That Supports with Statements
We all know how to, for example, open a file or acquire a lock using the with statement, but can we actually implement our own? Yes, we can implement the context manager protocol using the __enter__ and __exit__ methods:
class Connection:
    def __init__(self):
        ...
    def __enter__(self):
        # Initialize connection...
        return self  # the returned value is what "as c" binds to
    def __exit__(self, type, value, traceback):
        # Close connection...
        ...
with Connection() as c:
    # __enter__() has executed
    ...
# c.__exit__() executes
This is the most common way to implement context management in Python, but there is an easier way to do it:
from contextlib import contextmanager
@contextmanager
def tag(name):
    print(f"<{name}>")
    yield
    print(f"</{name}>")
with tag("h1"):
    print("This is Title.")
The snippet above implements the context management protocol using the contextmanager decorator. The first part of the tag function (before yield) is executed when entering the with block, then the block is executed, and finally the rest of the tag function is executed.
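One caveat with the generator form: if the with block raises, the exception is re-raised at the yield, so code after it only runs when it is wrapped in try/finally. A sketch of that pattern, using a made-up managed_resource function and an events list just to record what happened:

```python
from contextlib import contextmanager

events = []  # records what happened, for illustration only

@contextmanager
def managed_resource(name):
    events.append(f"open {name}")
    try:
        yield name  # the yielded value is bound by "as"
    finally:
        # Runs even if the with block raised an exception.
        events.append(f"close {name}")

try:
    with managed_resource("db") as resource_name:
        raise ValueError("something went wrong")
except ValueError:
    pass

print(events)  # ['open db', 'close db']
```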
Saving Memory with __slots__
If you ever wrote a program that created a really big number of instances of some class, you might have noticed that your program suddenly needed a lot of memory. That is because Python uses dictionaries to represent attributes of class instances, which makes it fast but not very memory efficient. Usually that is not a problem. However, if it becomes a problem for your program, you might try using __slots__:
class Person:
    __slots__ = ["first_name", "last_name", "phone"]
    def __init__(self, first_name, last_name, phone):
        self.first_name = first_name
        self.last_name = last_name
        self.phone = phone
What happens here is that when we define the __slots__ attribute, Python uses a small fixed-size array for the attributes instead of a dictionary, which greatly reduces the memory needed for each instance. There are also some downsides to using __slots__ - we can't declare any new attributes and we are restricted to the ones listed in __slots__. Multiple inheritance is also restricted: a class can't inherit from several base classes that each define non-empty __slots__.
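Both downsides can be observed directly. The classes below are invented for illustration, and the behaviour shown is what CPython does.

```python
class Point:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y

point = Point(1, 2)
has_dict = hasattr(point, "__dict__")
print(has_dict)  # False: there is no per-instance dictionary

try:
    point.z = 3  # "z" is not listed in __slots__
except AttributeError as error:
    print(f"AttributeError: {error}")

class A:
    __slots__ = ("a",)

class B:
    __slots__ = ("b",)

try:
    # Two bases with non-empty __slots__ cannot be combined.
    C = type("C", (A, B), {})
except TypeError as error:
    print(f"TypeError: {error}")
```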
Limiting CPU and Memory Usage
If instead of optimizing your program memory or CPU usage, you want to just straight up limit it to some hard number, then Python has a library for that too:
import signal
import resource
# To limit CPU time
def time_exceeded(signo, frame):
    print("CPU exceeded...")
    raise SystemExit(1)
def set_max_runtime(seconds):
    # Install the signal handler and set a resource limit
    soft, hard = resource.getrlimit(resource.RLIMIT_CPU)
    resource.setrlimit(resource.RLIMIT_CPU, (seconds, hard))
    signal.signal(signal.SIGXCPU, time_exceeded)
# To limit memory usage
def set_max_memory(size):
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    resource.setrlimit(resource.RLIMIT_AS, (size, hard))
Here we can see both options: setting a maximum CPU runtime as well as a maximum memory limit. Note that the resource module is only available on Unix systems. For the CPU limit we first get the soft and hard limit for that specific resource (RLIMIT_CPU) and then set it using the number of seconds specified by the argument and the previously retrieved hard limit. Finally, we register a handler for the SIGXCPU signal that causes system exit if the CPU time is exceeded. As for the memory, we again retrieve the soft and hard limit and set it using setrlimit with the size argument (in bytes) and the retrieved hard limit.
Controlling What Can Be Imported and What Not
Some languages have a very obvious mechanism for exporting members (variables, methods, interfaces), such as Golang, where only members starting with an upper-case letter are exported. In Python, on the other hand, a wildcard import brings in every name that doesn't start with an underscore, unless we use __all__:
def foo():
    pass
def bar():
    pass
__all__ = ["bar"]
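Using the code snippet above, we can limit what gets imported when using from some_module import *. For this specific example, a wildcard import will only import bar. Note that __all__ only affects wildcard imports, so from some_module import foo still works. We can also leave __all__ empty, in which case a wildcard import brings in nothing and any later use of the module's names fails with NameError. An AttributeError on wildcard import occurs when __all__ lists a name that the module doesn't define.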
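The effect can be checked without creating a file on disk. The sketch below builds a throwaway module in memory and registers it under the name some_module; doing it this way is purely an assumption of the example, a real project would use an ordinary .py file.

```python
import sys
import types

source = '''
def foo():
    return "foo"

def bar():
    return "bar"

__all__ = ["bar"]
'''

# Build an in-memory module and register it so that it can be imported.
module = types.ModuleType("some_module")
exec(source, module.__dict__)
sys.modules["some_module"] = module

# Run the wildcard import in a fresh namespace and look at what arrived.
namespace = {}
exec("from some_module import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
print(exported)  # ['bar']

# An explicit import is not affected by __all__.
from some_module import foo
print(foo())  # foo
```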
Comparison Operators the Easy Way
It can be pretty annoying to implement all the comparison operators for one class, considering there are quite a few of them - __lt__, __le__, __gt__, and __ge__. But what if there was an easier way to do it? functools.total_ordering to the rescue:
from functools import total_ordering
@total_ordering
class Number:
    def __init__(self, value):
        self.value = value
    def __lt__(self, other):
        return self.value < other.value
    def __eq__(self, other):
        return self.value == other.value
print(Number(20) > Number(3))  # True
print(Number(1) < Number(5))  # True
print(Number(15) >= Number(15))  # True
print(Number(10) <= Number(2))  # False
How does this actually work? The total_ordering decorator simplifies the process of implementing ordering for instances of our class. We only need to define __eq__ and one of the ordering methods (__lt__ here), which is the minimum needed to derive the remaining operations, and that's the job of the decorator - it fills the gaps for us.
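A slightly more realistic sketch is shown below. The Version class is made up for this example; returning NotImplemented for foreign types is a common convention rather than something total_ordering requires.

```python
from functools import total_ordering

@total_ordering
class Version:
    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        # Tuples compare element by element, which gives version ordering.
        return (self.major, self.minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() < other._key()

    def __repr__(self):
        return f"Version({self.major}, {self.minor})"

versions = [Version(2, 0), Version(1, 5), Version(1, 10)]
ordered = sorted(versions)
print(ordered)  # [Version(1, 5), Version(1, 10), Version(2, 0)]
print(Version(1, 5) >= Version(1, 5))  # True, derived by total_ordering
```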
Conclusion
Not all of these features are essential in day-to-day Python programming, but some of them might come in handy from time to time, and they can simplify tasks that would otherwise be quite lengthy and annoying to implement. I also want to note that all of these features are part of the Python standard library, even though some of them seem to me like pretty non-standard things to have there. So whenever you decide to implement something in Python, first go looking for it in the standard library, and if you can't find it, then you are probably not looking hard enough (if it's really not there, then it's surely in some third-party library). 🙂