Avoid hidden traps in switching between languages: a tale of queries and sequences

giodiblasi

Giovanni Di Blasi

Posted on November 5, 2019

You have probably had to switch between two or more languages or frameworks within a short time.
In these situations, it's important to keep in mind the differences between the languages and the tools they offer. I'm not talking about syntax: what matters is knowing the implementation details.

When I work on my side project I often switch between C# and JavaScript. In this post, I would like to share some thoughts on how querying sequences works in these two languages.

Context

Let's define a task to solve so that we have a practical scenario to talk about.
Imagine we have a list of users (what an original example!).
Each user object is first serialized and then encoded in base64.


const encodedUsers = [
        "eyAibmFtZSI6IlBldGVyIiwgImFnZSI6MjAgfQ==",   //{ "name":"Peter", "age":20 }
        "eyAibmFtZSI6IlNpbW9uIiwgImFnZSI6MzEgfQ==",   //{ "name":"Simon", "age":31 }
        "eyAibmFtZSI6IkVyaWMiLCAiYWdlIjo0MCB9"        //{ "name":"Eric", "age":40 }
];

So what we want is to find the first user over 30 years old.
Spoiler: the user is Simon 😄

Let's implement it

It's easy to imagine that we will need a few methods as building blocks to solve our task:

  • decode: given a base64 input, it returns the decoded version.
  • parse: given a string that represents a user, it returns a User object.
  • isOlderThan: given a User object, it returns true if the user's age is greater than a given age.

Let's define them in C# and in JavaScript:

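  • C#

// Requires: using System; using System.Text; using Newtonsoft.Json; (Json.NET)
// The User class is expected to expose the name and age members.

// Decodes a base64 string into its UTF-8 text.
private string Decode(string encodedUser)
{
    Console.WriteLine("DECODING: " + encodedUser);
    return Encoding.UTF8.GetString(Convert.FromBase64String(encodedUser));
}

// Deserializes a JSON string into a User object.
private User Parse(string serializedUser)
{
    var user = JsonConvert.DeserializeObject<User>(serializedUser);
    Console.WriteLine("PARSED: " + user.name);
    return user;
}

// Returns true if the user is older than the given age.
private bool IsOlderThan(int age, User user)
{
    Console.WriteLine("CHECK AGE OF: " + user.name);
    return user.age > age;
}

  • JavaScript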

// Decodes a base64 string into its UTF-8 text (Node.js Buffer API).
const decode = encodedUser => {
    console.log("DECODING: ", encodedUser);
    return Buffer.from(encodedUser, 'base64').toString('utf8');
};

// Parses a JSON string into a user object.
const parse = serializedUser => {
    const user = JSON.parse(serializedUser);
    console.log("PARSED: ", user.name);
    return user;
};

// Returns true if the user is older than the given age.
const isOlderThan = (age, user) => {
    console.log("CHECK AGE OF: ", user.name);
    return user.age > age;
};

Now that we have defined our methods, we need to transform our base64 strings into users so that we can test their ages.
In the C# implementation we can use LINQ to query our sequence; in JavaScript we can use the map() method to apply our methods and find() to locate our user:

  • C# (LINQ)

var userName = encodedUsers
        .Select(Decode)
        .Select(Parse)
        .First(user => IsOlderThan(30, user))
        .name;

Console.WriteLine("The first user over 30 years old is " + userName);

  • JavaScript

const userName = encodedUsers
    .map(decode)
    .map(parse)
    .find(user => isOlderThan(30, user))
    .name;

console.log("The first user over 30 years old is", userName);

These two implementations look very similar: both allow us to apply our methods to the sequence of users in a very expressive way.
As you can imagine, they give the same result: the first user over 30 years old is Simon 🎉

Deep dive

We put some logging in our decode, parse and isOlderThan methods, and we are damn curious to take a look at the output, right?
OK, let's go!

  • C# (LINQ)
DECODING: eyAibmFtZSI6IlBldGVyIiwgImFnZSI6MjAgfQ==
PARSED: Peter
CHECK AGE OF: Peter

DECODING: eyAibmFtZSI6IlNpbW9uIiwgImFnZSI6MzEgfQ==
PARSED: Simon
CHECK AGE OF: Simon

The first user over 30 years old is Simon
  • JavaScript

DECODING:  eyAibmFtZSI6IlBldGVyIiwgImFnZSI6MjAgfQ==
DECODING:  eyAibmFtZSI6IlNpbW9uIiwgImFnZSI6MzEgfQ==
DECODING:  eyAibmFtZSI6IkVyaWMiLCAiYWdlIjo0MCB9

PARSED:  Peter
PARSED:  Simon
PARSED:  Eric

CHECK AGE OF:  Peter
CHECK AGE OF:  Simon

The first user over 30 years old is Simon

What??? The outputs are very different!

Although both C# and JavaScript provide tools that give us a very similar way to query a sequence, they are implemented with totally different approaches.

The first difference we can observe is the order of execution: in the C# implementation, Decode() is executed on the first item of the sequence, then Parse() is applied to the resulting value; the same happens for the second item, and so on.

In the JavaScript implementation, instead, decode() is executed for every item in the sequence, and only then is parse() applied to each item of the resulting sequence.

But the biggest difference is that in the C# implementation Decode() and Parse() are never called for the user "Eric" 🤯

In fact, the documentation for Select says:

This method is implemented by using deferred execution. The immediate return value is an object that stores all the information that is required to perform the action. The query represented by this method is not executed until the object is enumerated.

So when we invoke Select passing our Decode() and Parse() methods, we are only collecting the queries, without executing them!
This explains why they are never executed on the last item of our sequence:
First() stops iterating the sequence at the second user (Simon), so no query is ever executed on the third item.
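
To make the idea concrete, here is a minimal sketch of deferred execution written in JavaScript with generator functions. This is not how LINQ is implemented; lazyMap and lazyFirst are hypothetical helpers introduced only for illustration, and the trace array is just a way to record which steps actually ran.

```javascript
// Hypothetical helpers, not from the original post: a lazy map and a lazy "first".
function* lazyMap(source, fn) {
    for (const item of source) {
        yield fn(item); // fn runs only when the consumer pulls the next value
    }
}

function lazyFirst(source, predicate) {
    for (const item of source) {
        if (predicate(item)) return item; // stop pulling once a match is found
    }
    return undefined;
}

const encodedUsers = [
    "eyAibmFtZSI6IlBldGVyIiwgImFnZSI6MjAgfQ==", // Peter, 20
    "eyAibmFtZSI6IlNpbW9uIiwgImFnZSI6MzEgfQ==", // Simon, 31
    "eyAibmFtZSI6IkVyaWMiLCAiYWdlIjo0MCB9"      // Eric, 40
];

const trace = []; // records which steps actually ran

const decode = encodedUser => {
    trace.push("decode");
    return Buffer.from(encodedUser, "base64").toString("utf8");
};

const parse = serializedUser => {
    const user = JSON.parse(serializedUser);
    trace.push("parse " + user.name);
    return user;
};

// Building the pipeline runs nothing: a generator starts on the first pull.
const users = lazyMap(lazyMap(encodedUsers, decode), parse);
const stepsBeforeQuery = trace.length;

// Only now are items decoded and parsed, one at a time, until a match is found.
const firstOver30 = lazyFirst(users, user => user.age > 30);

console.log(stepsBeforeQuery, firstOver30.name, trace);
```

In this sketch the trace holds entries for Peter and Simon only: the third string is never decoded, which mirrors the C# output above.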

The JavaScript documentation for map(), instead, tells us that this method uses immediate execution:

map calls a provided callback function once for each element in an array, in order, and constructs a new array from the results.

This means that for every map() execution the query is immediately executed and a new sequence is created.
This is why the queries are executed on every item in our sequence.
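
A small sketch, using plain numbers instead of users, shows the same behaviour in isolation. The trace array and the variable names are illustrative assumptions, not part of the original example.

```javascript
const trace = []; // records every callback invocation
const numbers = [1, 2, 3];

// map() runs the callback on every element right away and returns a new array.
const doubled = numbers.map(n => {
    trace.push("double " + n);
    return n * 2;
});
const callsAfterMap = trace.length; // all the mapping work is already done here

// find() stops at the first match, but it cannot undo the work map() did.
const firstBig = doubled.find(n => {
    trace.push("check " + n);
    return n > 3;
});

console.log(callsAfterMap, firstBig, trace);
```

Here the callback passed to map() has already run three times before find() is even called, while find() itself checks only the first two elements.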

On the one hand, immediate execution can run queries that could have been avoided; on the other hand, deferred execution can run the same queries on the same items more than once.
Take a look at this example:

  • C# (LINQ) - Deferred Execution

var users = encodedUsers
            .Select(Decode)
            .Select(Parse);

Console.WriteLine("The first user over 30 years old is " + users.First(user => IsOlderThan(30, user)).name);
//....
Console.WriteLine("The first user over 35 years old is " + users.First(user => IsOlderThan(35, user)).name);

  • Output

DECODING: eyAibmFtZSI6IlBldGVyIiwgImFnZSI6MjAgfQ==
PARSED: Peter
CHECK AGE OF: Peter

DECODING: eyAibmFtZSI6IlNpbW9uIiwgImFnZSI6MzEgfQ==
PARSED: Simon
CHECK AGE OF: Simon

The first user over 30 years old is Simon

DECODING: eyAibmFtZSI6IlBldGVyIiwgImFnZSI6MjAgfQ==
PARSED: Peter
CHECK AGE OF: Peter

DECODING: eyAibmFtZSI6IlNpbW9uIiwgImFnZSI6MzEgfQ==
PARSED: Simon
CHECK AGE OF: Simon

DECODING: eyAibmFtZSI6IkVyaWMiLCAiYWdlIjo0MCB9
PARSED: Eric
CHECK AGE OF: Eric

The first user over 35 years old is Eric

We notice that Decode() and Parse() are executed twice on Peter and Simon. This happens because, with deferred execution, the users variable does not hold the results: it holds a "promise" to execute our queries over the items, so the queries are executed again each time we call First().
Implementing the same example in JavaScript, we notice that the users variable contains the results of the two map() calls, so decode() and parse() run only once per item:

  • JavaScript - Immediate Execution

const users = encodedUsers
    .map(decode)
    .map(parse);

console.log("The first user older than 30 is ", users.find(user => isOlderThan(30, user)).name);
//...
console.log("The first user older than 33 is ", users.find(user => isOlderThan(33, user)).name);

  • Output

DECODING:  eyAibmFtZSI6IlBldGVyIiwgImFnZSI6MjAgfQ==
DECODING:  eyAibmFtZSI6IlNpbW9uIiwgImFnZSI6MzEgfQ==
DECODING:  eyAibmFtZSI6IkVyaWMiLCAiYWdlIjo0MCB9

PARSED:  Peter
PARSED:  Simon
PARSED:  Eric

CHECK AGE OF:  Peter
CHECK AGE OF:  Simon

The first user older than 30 is  Simon

CHECK AGE OF:  Peter
CHECK AGE OF:  Simon
CHECK AGE OF:  Eric

The first user older than 33 is  Eric
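
The repeated work seen in the deferred C# example can also be reproduced in JavaScript, which may help when comparing the two models side by side. The sketch below assumes a hypothetical lazyMap generator and a query() function that rebuilds the lazy sequence on every call (a generator object can be consumed only once, so re-enumeration has to be simulated this way). It then materializes the sequence once with Array.from, which plays a role similar to calling ToList() on a LINQ query.

```javascript
// Hypothetical helper, not from the original post.
function* lazyMap(source, fn) {
    for (const item of source) yield fn(item);
}

const firstMatch = (source, predicate) => {
    for (const item of source) if (predicate(item)) return item;
    return undefined;
};

const rawUsers = [
    '{ "name":"Peter", "age":20 }',
    '{ "name":"Simon", "age":31 }',
    '{ "name":"Eric", "age":40 }'
];

let parseCalls = 0;
const parse = serializedUser => {
    parseCalls++;
    return JSON.parse(serializedUser);
};

// Deferred: every call to query() starts the work again from the first item.
const query = () => lazyMap(rawUsers, parse);
const over30 = firstMatch(query(), user => user.age > 30); // parses Peter, Simon
const over35 = firstMatch(query(), user => user.age > 35); // parses Peter, Simon, Eric
const deferredCalls = parseCalls;

// Materialized: run the pipeline once, then query the stored results.
parseCalls = 0;
const cachedUsers = Array.from(query());
const cachedOver30 = cachedUsers.find(user => user.age > 30);
const cachedOver35 = cachedUsers.find(user => user.age > 35);
const cachedCalls = parseCalls;

console.log(deferredCalls, cachedCalls, over30.name, over35.name);
```

In this sketch the deferred version parses five times across the two lookups, while the materialized version parses each user exactly once. Which trade-off is better depends on how many items are needed and how often the sequence is queried.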

Look under the hood

I don't mean that one implementation is better than the other: these are simply two different approaches to solving the same problem.
Not being aware of these differences can make it hard to write good code: we could introduce unexpected behaviours like the ones shown in these examples.
So I think that when we approach a new language or framework, we should take some time to look at what's under the hood.
