Would it be silly to write my own defaultdict?
RobinFiveWords
Posted on September 23, 2024
TL;DR — Yes? Python's defaultdict
is exactly the right tool for the job I call upon it to do.
Python
defaultdict
is part of Python's standard library. Python is already a high-level language, so it seems arbitrary to say I'll allow myself to use some but not all parts of it as given to me.
I do, however, like when I know everything that something does, and I know it doesn't do anything else. If I'm not mistaken, defaultdict
is 300 lines of C code (starting at line 2175 here), and I can't say I understand everything that it does. When I use defaultdict
, almost always I am looping through data, aggregating values, and looking to avoid the KeyError
the first time a key appears in the data (or having to check whether the key exists each time).
I'm sure millions of people have done this for homework or fun before, but I didn't search for their solutions; I searched for how to make an object "subscriptable", how to add a key to a dict without the subscript syntax (which I'm guessing would infinitely recurse), and how to call the parent class's __init__
method (it had been a while). The following seems to work fine:
class DefaultDict(dict):
def __init__(self, value_type):
self.value_type = value_type
super().__init__()
def __getitem__(self, key):
if key not in self:
self.update({key: self.value_type()})
return self.get(key)
I probably will never use this, because defaultdict
is already the right tool for the job. It throws an exception immediately when the default factory is not callable or None. Someone has already thought through what type the result should have when taking the union of a dict
and a defaultdict
. It has a nice repr
.
Where I find writing my own tool to be most useful is if an existing tool is complex in a way that makes it unclear whether it is the right tool to use. Or if an existing tool solves a bigger problem than I am trying to solve.
Java
I must not have needed a "default HashMap" before, because I just now discovered computeIfAbsent
. While I was thinking it through, though, this one-off code seemed straightforward enough:
ArrayList<Integer> list = d.get(key)
if (list == null) {
list = new ArrayList<Integer>();
d.put(key, list);
}
The more concise way:
d.computeIfAbsent(key, (x -> new ArrayList<Integer>())).add(value)
I think I'm looking for a way to extend HashMap
so that I can write
d.myGet(key).add(value)
and it will run the previous code. This seems somewhat straightforward...
class DefaultMap extends HashMap<Integer, ArrayList<Integer>> {
ArrayList<Integer> myGet(Integer key) {
return this.computeIfAbsent(key, (z -> new ArrayList<Integer>()));
}
}
...except I would need to create a separate class for each set of key-value types. There might be a way to do this with generics, but because (if I understand correctly) classes and functions aren't first-class objects in Java, I'm not sure how that would be done. Here's where I really need to search for an answer.
And of course the top Google result is a link I've visited before, because of course this isn't the first time I've wondered about it. The .class
property effectively allows the class of the default value, such as ArrayList
, to be passed as a parameter to the constructor. I'm going to have to let this simmer.
Posted on September 23, 2024
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.