Transposing options/results with iterators

elshize

Michał Siedlaczek

Posted on January 5, 2021

Transposing options/results with iterators

In this article, I would like to share some of my thoughts about a problem I encountered, my process of solving it, and possibly start a discussion about the validity of the proposed approach. I certainly would like to hear any comments and opinions about both the problem and the decisions I made in my solution.

The topic of this article is transforming, or "transposing", an optional collection or an iterator into an iterator of options. You might be familiar with Option::transpose and Result::transpose methods, which transform Option<Result<T, E>> into Result<Option<T>, E> and vice versa. My goal is quite similar, but instead, I would like to transform Option<I: IntoIterator> into impl Iterator<Option<I::Item>> and Result<I: IntoIterator, E> into impl Iterator<Result<I:Item, E>>.

First, I will discuss the motivation behind it and how such operation could be defined. Then, I will walk you through the implementation of this functionality, which uses a trait to extend Option and Result with an additional method. So hopefully, even if the motivation fails to convince you, there is still some value in following the implementation details.

Keywords: rust algorithm iterator

Source Code

I shared the code from this article in a small crate available on Github. I also uploaded the reference documentation if you would like to take a closer look.

Motivation

First, let me share with you what motivated me to look into this problem in the first place. Suppose you have a type that stores two values, one of which is potentially expensive to compute. The latter doesn't need to be provided when creating an object, and if it is missing, it either gets computed or is given a default value.

struct Entry {
    id: String,
    value: u64,
}

impl Entry {
    pub fn new(id: String, value: Option<u64>) -> Self {
        Self {
            id,
            value: value.unwrap_or_else(Self::default_value),
        }
    }
    fn default_value() -> u64 {
        todo!("Compute or provide default")
    }
}
Enter fullscreen mode Exit fullscreen mode

Let's assume that the IDs are read from a file, line by line. The values are computed by some other process and also stored in a file. Our main program takes the first file and optionally the one with values. The details about this particular setup are not important, the point is that we either have all the values or none. We now want to load that data to Vec<Entry>. Here is an example of how it could be done (note that I will panic instead of handling errors for simplicity, but the same can be implemented with proper error handling):

fn load_entries<R: BufRead>(ids: R, values: Option<R>) -> Vec<Entry> {
    let mut values = values.map(|r| {
        r.lines().map(|line| {
            line.expect("cannot read line")
                .parse::<u64>()
                .expect("cannot parse as u64")
        })
    });
    ids.lines()
        .map(|line| {
            let id = line.expect("cannot read line");
            Entry::new(id, values.as_mut().and_then(|values| values.next()))
        })
        .collect()
}
Enter fullscreen mode Exit fullscreen mode

This certainly works, but it is a bit verbose and I must say I dislike the mutable iterator in this context. I wish I could just zip the two together. And this is precisely what I set out to do below.

Iterator Adapter

Our iterator adapter can be easily implemented using std::iter::from_fn as we see below:

fn transpose_into_iter<I: IntoIterator>(iter: Option<I>)
    -> impl Iterator<Item = Option<I::Item>>
{
    let mut inner = iter.map(|i| i.into_iter());
    std::iter::from_fn(move || {
        inner.as_mut().map_or(Some(None), |iter| iter.next().map(Some))
    })
}
Enter fullscreen mode Exit fullscreen mode

That was pretty easy. We simply encapsulate what we did before in the map function. Let's quickly go through the entire function. The argument is of type Option<I> where I implements IntoIterator. This essentially means that we can call into_iter() on any value of type I to get an iterator over its elements. Notice that IntoIterator is trivially implemented for any I: Iterator, in which case into_iter() simply returns self. Many other types implement IntoIterator, such as collections, e.g., Vec and HashSet, slices, and even Option and Result.

IntoIterator has two associated types: Item and IntoIter. The latter is the type resulting iterator, and the former is the type of the elements of that iterator. We can refer to these types with a double colon: I::Item. We exploit this in defining the return type of the function: impl Iterator<Item = Option<I::Item>>. It means that this function returns a type that implements the Iterator trait, and its elements have type Option<I::Item>.In other words, the elements of the resulting iterator will be the elements of the input iterator (if such exists) wrapped in Option.

In the body of the function, we first map the optional value from the type of Option<I: IntoIterator> to Option<I: Iterator>. Then, we return a special from_fn iterator adapter, which simply calls the passed closure each time the iterator's next() method is called. It will always return Some(None) (thus, the actual element returned is None) if inner is None. Notice how this will result in an infinite iterator, always returning None. If inner is Some(_), we will return the next element wrapped in Some until the underlying iterator runs out of elements. A minor change to the code would make it produce None after all existing elements. Notice how the latter would be more general, as we can call take_while(Option::is_some) to reduce it to a finite sequence. However, I decided to stick with the first approach. Let me know if you have an opinion in favor of either of them.

Either way, we can now use this adapter in our example:

fn load_entries_with_transpose<R: BufRead>(ids: R, values: Option<R>)
    -> Vec<Entry>
{
    let values = values.map(|r| {
        r.lines().map(|line| {
            line.expect("cannot read line")
                .parse::<u64>()
                .expect("cannot parse as u64")
        })
    });
    ids.lines()
        .zip(transpose_into_iter(values))
        .map(|(line, value)| Entry::new(line.expect("cannot read line"), value))
        .collect()
}
Enter fullscreen mode Exit fullscreen mode

Nice, no more mutable state. We could leave it as is and call it a day, but notice that we will have to have another function for Result because we can't simply overload it for another type. In Rust, function overload is achieved through traits. However, impl Trait return value is not allowed in trait functions, so we will have to come up with a concrete (associated) type.

Implementing Trait

As mentioned before, we have to return a concrete type in a trait.

pub trait IterTranspose {
    fn transpose_into_iter(self) -> T;
}
Enter fullscreen mode Exit fullscreen mode

Now, what should T be? It probably depends what type implements the trait. This is a perfect use for an associated type:

pub trait IterTranspose {
    type Iter: Iterator;
    fn transpose_into_iter(self) -> Self::Iter;
}
Enter fullscreen mode Exit fullscreen mode

Now the return value has a concrete type, but because it is an associated type, it depends on the implementation. What would be Iter for the example above? This is a bit tricky, because we can't express the concrete type of a closure. We could set it to std::iter::FromFn<Box<FnMut() -> Option<T>>> but then we would additionally have to define T and, more importantly, it would introduce dynamic dispatch. Instead, we can create a simple struct that implements Iterator directly, which is as simple as our previous implementation:

pub struct OptionTransposedIter<I> {
    inner: Option<I>,
}

impl<I: Iterator> Iterator for OptionTransposedIter<I> {
    type Item = Option<I::Item>;

    fn next(&mut self) -> Option<Self::Item> {
        self.inner
            .as_mut()
            .map_or(Some(None), |iter| iter.next().map(Some))
    }
}
Enter fullscreen mode Exit fullscreen mode

Now we are ready to implement IterTranspose:

impl<I: IntoIterator> IterTranspose for Option<I> {
    type Iter = OptionTransposedIter<I::IntoIter>;

    fn transpose_into_iter(self) -> Self::Iter {
        OptionTransposedIter {
            inner: self.map(I::into_iter),
        }
    }
}
Enter fullscreen mode Exit fullscreen mode

And finally, this is what our example looks like using the new interface:

fn load_entries_with_transpose<R: BufRead>(ids: R, values: Option<R>)
    -> Vec<Entry>
{
    use IterTranspose;
    let values = values.map(|r| {
        r.lines().map(|line| {
            line.expect("cannot read line")
                .parse::<u64>()
                .expect("cannot parse as u64")
        })
    });
    ids.lines()
        .zip(values.transpose_into_iter())
        .map(|(line, value)| Entry::new(line.expect("cannot read line"), value))
        .collect()
}
Enter fullscreen mode Exit fullscreen mode

Result

The implementation for Result<T, E> is very similar, and I won't repeat it here, but if you are interested in my implementation, visit the Github repository. One important difference is that the Err variant holds a value of type E and needs to produce it repeatedly. This means that it must implement E: Clone. Besides that, the implementation for Result is analogous to the one above.

Summary

We discussed how to transform an Option<I: IntoIter> (or Result<I: IntoIter, E>) to an impl Iterator<Item = I::Item>, similar to how Option::transpose transforms Option<Result<T, E>> into Result<Option<T>, E>. Such operation can be useful, e.g., when we want to zip a collection with another collection that is optional. We first saw how to solve this problem using std::iter::from_fn, and then how to overload our adapter function for multiple types (Option and Result) using a custom trait with an associated type.

Although the problem we solved is rather niche, I found the path to solution interesting. It touches several important parts of Rust, such as iterator, traits, and generic programming. It also addressed a problem I have come across more than once now, and it was fun to approach it as a generic solution and write it as a crate.

If you have any questions or suggestions, don't hesitate to reach out on Twitter or Mastodon.

💖 💪 🙅 🚩
elshize
Michał Siedlaczek

Posted on January 5, 2021

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related