Matthijs
Posted on March 24, 2024
So I was trying to come up with a way to speed up the initial start time of Pipeliner... and I did it! But it is one of the most nasty ass pieces of code I've written in a long time... and here is why :)
Background
But first a little bit of background. The project is intended to simplify things (like all code we write I guess :)) and as per my previous post I like to KISS. So instead of having to deal with dependencies I wanted Pipeliner to auto include the relevant files.
The initial setup was a bit of a blunt instrument, but it worked. It would simply include all .sh files using the source
command and all functions were available. Naturally the code base grew and this method would also take longer and longer. To the point where it took 1 whole damn second to start a script! ARE YOU KIDDING ME?! 1 WHOLE SECOND? UNFUCKINGACCEPTABLE!!!
So I had to come up with something new... which I did... and it's nasty... very nasty!
The "code"
#!/bin/bash
Files_Pre_Import_Classes() {
files=$(Files_Get_Class_Files)
for file in $files; do
functions=$(grep "^ *[A-Za-z0-9_]\+ *( *) *{*" $file | sed -E "s/ *\( *\) *\{.*//")
for function in $functions; do
if type -t $function > /dev/null 2>&1; then #function/command already defined
continue
fi
eval "$function() { source $file; $function \"\$@\"; }"
done
done
}
Line 1: #!/bin/bash
Well it doesn't get any nastier than this now does it? Bash (or at least sh) is one of the oldest, if not the oldest, scripting languages around. If there were ever a red flag to stop doing what you're doing then this is absolutely it. This shit was conceived in the late 70s... hardly anyone reading this was bourne back then!
Although interestingly shell scripting is still around and is probably growing in popularity because of all the DevOps/Platform/provisioning shit we're doing. So how bad could it be... insert dumpster fire meme here
Line 6: grep... AND sed
I love regex... if there was ever one brilliant way to simplify and at the same time ridiculously overcomplicate your code: REGEX FTFW! It is so elegant yet so terrifying... that I try to infect as many people as possible with them. Honestly they're soo much fun!
But here's the most troublesome part about them... the dialects. Most typically use PCRE, but not grep... or sed. Well at least not by default and even if you could convince them... don't expect it to work on MacOS and Ubuntu. So yeah lets stick with it being all over the place.
Line 7: for function in
Yeah lets loop through a list of functions... if you see this kinda code, run the hell away! There are very few usecases to deal with classes, functions and/or variables in a dynamic way. And you know what's coming next...
Line 12: evil
Often misspelled as eval
, the evil
command is the Devil... on a bad day... when just kicked in the nuts. Any time you think you need evil
, throw yourself off the highest building in the vicinity... it'll likely end better than continuing whatever ridiculous thing you were doing.
Why is evil
such a bad idea you ask? It allows you to do things the language isn't designed to do... so don't fucking do it! Honestly don't use it! I can't repeat it enough, you shouldn't need it... And if you do, you're probably trying to do something you shouldn't be doing it the first place.
The big problem with it is that it has an enormous potential for unintended side effects... and nasty ass security holes! When you use evil
you want to do something dynamic, so you'll have at least 1 variable in there... you must ensure the values are "sane", i.e. do NOT trust user input.
Imagine my regex's above are a bit brittle (which is quite likely since grep+sed suck balls), what if I defined a function like:
NastyAssFunction() { { echo } ; sudo rm -Rf / ;
The only thing that prevents this from wiping your harddrive is the .*
at the end of the sed statement. And yes, this is an invitation to come up with a way to break this code because I'm not 100% convinced the code is solid.
So what does it do?
Honestly this is the best part :) For each function definition it finds (in any .sh file) it dynamically predefines a function that:
- imports the relevant .sh file
- which in turn will overwrite all predefined functions with their real definition
- call the same function again with the same arguments so it executes using the real function definition
- any subsequent calls to the function will use the real definition going forward
There is so much potential for failure, it's insane. For starters a function calling itself is likely to cause havoc just for the sake of it being recursive. And yes the first version did in fact royally fuck things up and ended up in an endless loop :) TBH I'm actually surprised it even works, depending on how the language implements imports & recursion it's likely that something like node would just keep calling the predefined one... assuming it's even possible.
Conclusion
In order to make things simple we often need to resort to complicated mechanisms. Although this particular implementation can potentially be made less terrible, the problem it's trying to solve is a difficult one. And you could argue that a dev should just handle it, but why? Why should we be bothered with mundane repetitive tasks like dependency management? We can automate this shit... or at least we should be able to...
The big question is: should we be sticking nasty ass code in our code base? I can guarantee that someone (likely me) is gonna run into issues with this. So is it worth the hassle or should we KISS? Or why doesn't whatever language simply handle this for us? If I use some namespace can't it just figure out where it comes from and auto include the relevant dependencies at compile time?
And for the record: I'm still considering whether to leave the code in or not...
Posted on March 24, 2024
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.