The Tragedy of the Jigsaw Bot
Adam Rozanski
Posted on August 14, 2019
Our story begins on the 25th of March. For a week or two I had been working on a ticket about our build server (Team City) not sending emails to anyone. The build server software’s official support team was taking quite a while to get back to me, and I had tried everything I could think of. Not wanting to leave the ticket untouched in the meantime, I decided to come up with a plan.
“What if, instead of an email, the notification of a failed build went via Slack directly to the user in a private chat? Most of my teammates don’t like email anyway, and this would put me a step or two closer to my goal of integrating the entire CI/CD pipeline with the chat client.”
Over the course of the week, in my free time between major tasks, I experimented with a few plugins for the build server. Nothing seemed to work correctly, and the server logs didn’t clearly indicate that I had a problem or that the server was even attempting to use the shiny new tool I had configured. However, with some help from support I learned how to enable the logging correctly. Apparently, I was missing several scopes the bot needed to talk to me in my Slackbot channel.
I had been writing PowerShell on the side to send myself simple messages so that I could test the permissions of the bot. As it turns out, there’s a lot of documentation for Slack bots, but I had not seen a unified guide that covered setting one up with all the permission scopes you’d want in one place, explained in a way I could understand. I knew that you could send bot chat to a channel, but messaging individual users privately was another thing, and (to me) there was no straightforward logic to doing so.
Typically, posting to a public or private channel only requires the incoming-webhook scope and an incoming webhook URL. However, a person’s private Slackbot channel is unique to them. The scopes I ended up granting the bot are: incoming-webhook, channels:write, chat:write:bot, groups:write, bot, users:read, users:read.email.
$stuff = "I have the shiniest meat bicycle!"
Invoke-RestMethod -Method POST -Uri "https://slack.com/api/chat.postMessage?token=xoxb-SUPER_LONG_TOKEN&channel=SlackUserID&text=$stuff" -ContentType "application/json"
“I have the shiniest meat bicycle!” was the very first thing I saw in my Slackbot channel from my own bot (test messages in my opinion should be fun, so I use Borderlands 2 quotes). Soon after, I was sending URLs and semi-complex text structures in URL encoding, followed by JSON objects. Success!
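For the JSON-object variant mentioned above, here is a minimal sketch of how such a request could be assembled. It assumes the token is passed in an `Authorization: Bearer` header and the message in a JSON body, which Slack's `chat.postMessage` accepts. The helper name `New-SlackMessageRequest`, the placeholder token, and the placeholder user ID are all hypothetical and not part of the original script.

```powershell
# Hypothetical helper (not from the original script): builds the arguments
# for a chat.postMessage call without sending anything.
function New-SlackMessageRequest {
    param(
        [Parameter(Mandatory)] [string] $Token,    # bot token, e.g. xoxb-...
        [Parameter(Mandatory)] [string] $Channel,  # channel ID or user ID
        [Parameter(Mandatory)] [string] $Text
    )
    @{
        Method      = 'Post'
        Uri         = 'https://slack.com/api/chat.postMessage'
        Headers     = @{ Authorization = "Bearer $Token" }
        ContentType = 'application/json; charset=utf-8'
        # The JSON body replaces the query-string parameters.
        Body        = @{ channel = $Channel; text = $Text } | ConvertTo-Json
    }
}

# Placeholder values for illustration only.
$request = New-SlackMessageRequest -Token 'xoxb-PLACEHOLDER' -Channel 'U0000000' -Text 'I have the shiniest meat bicycle!'

# To actually send the message, splat the hashtable:
# Invoke-RestMethod @request
```

Building the request separately from sending it keeps the token out of the URL and makes the payload easy to inspect before anything reaches Slack.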
However, this was only the beginning. It would be a major task, and a huge spoiler of fun, to ask my pals in development to go through a semi-complicated procedure to get the Slack user ID I needed and then to record it somewhere in a kind of database or file. The next logical step, then, was to tell my script an email address, have it figure out the rest from there, and send a message that way. The good news for me is that Team City has a built-in variable for the username, and I had been asking everyone who registered on the server to use their work email so they could actually get emails when builds failed.
$who = "%teamcity.build.triggeredBy.username%"
$targetUser = Invoke-RestMethod -Method POST -Uri "https://slack.com/api/users.lookupByEmail?email=$who@emaildomain.com&token=xoxb-super_long_token" -ContentType 'application/x-www-form-urlencoded' | ConvertTo-Json
($targetUser -split "`r`n") | ForEach-Object {
    $userID = Select-String -InputObject $_ -Pattern '"id": '
    if ($userID) {
        Write-Output "FOUND"
        $cleanID = $userID -replace '"id":', "" -replace '"', "" -replace ',', ""
        $cleanID = $cleanID.Trim()
    }
}
Above I’m doing some cleanup. The response from the API is a JSON object, and in order to look up my Slack user ID (a string) I need to parse it. Once I grab a line that has “id”: “MY_ID”, I just remove the extra stuff I don’t need, and that leaves me with just my target ID as a string.
Within an hour I had it set up so that if I gave it my email, it would look up my user ID in Slack, clean up the ID it got out of the returned JSON, and then send it back to Slack with the API call that sends the private user a message. Easy, right?
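As an alternative to the string cleanup, the ID can also be read as a property, because `Invoke-RestMethod` already turns a JSON response into an object. The sketch below uses `ConvertFrom-Json` on a made-up sample so it runs offline. The sample JSON is an assumption, trimmed to the fields that matter, and the ID in it is invented.

```powershell
# Assumed sample of a users.lookupByEmail response, trimmed to the relevant
# fields. The ID is invented for illustration.
$sampleJson = '{"ok": true, "user": {"id": "U0123ABCD", "name": "someone"}}'

# Invoke-RestMethod performs this conversion automatically for JSON responses;
# ConvertFrom-Json stands in for it here so the example needs no network.
$response = $sampleJson | ConvertFrom-Json

# Slack sets "ok" to false when the lookup fails, so check it first.
if ($response.ok) {
    $cleanID = $response.user.id
} else {
    $cleanID = $null
}
```

With this approach there is no need to pipe the response through `ConvertTo-Json` or to strip quotes and commas afterwards.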
It’s now Friday and I have about half an afternoon left in my work week when the realization hits me:
- April 1st is a Monday this year. I could prank the dev team by challenging them, at random, to a game of Russian roulette whenever they triggered a build. If I took the current time and used the last digit of the minute in a switch, I’d get a reasonably fair roulette. For example, imagine it’s 1:25 pm: we grab the last digit of the time, 5, and generate a result based on that. Using the logic I present below, a build triggered at 1:25 would gain the user nothing, a build triggered at 1:29 would earn them 200 points, and a build triggered at 1:23 would cost them 100 points.
$a = Get-Date
$array = [int[]](($a.Minute -split '') -ne '')
$yei = 0
switch ($array[-1]) {
    0 { $yei = 0 }
    1 { $yei = -25 }
    2 { $yei = 50 }
    3 { $yei = -100 }
    4 { $yei = 100 }
    5 { $yei = 0 }
    6 { $yei = 25 }
    7 { $yei = -50 }
    8 { $yei = 0 }
    9 { $yei = 200 }
    default { $yei = "ID:10-T Error" }
}
- It would be an ideal way to test out the build notifier as a concept and turn it into a real thing if it works out. From a previous project, I already knew how to look up the user in the system and verify their authority to run specific builds and execute scripts against certain servers.
- All the pieces to make this work are here; it just needs someone to assemble the jigsaw puzzle. If I doubled down, I could give it a Jigsaw flavor and feel without overtly pressuring anyone into playing my game if they are not interested or do not have the time. I get to indulge my love of pranks and be useful at the same time.
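The minute-based roulette from the first point above can also be sketched as a small function that takes the time as a parameter, which makes the 1:25, 1:29, and 1:23 examples easy to check. The function name `Get-RoulettePoints` is hypothetical. The payout table is copied from the switch statement, and the modulo operation is a swapped-in way of getting the last digit of the minute.

```powershell
# Hypothetical helper; the payouts are copied from the switch statement above.
function Get-RoulettePoints {
    param([datetime] $Time = (Get-Date))

    # Index = last digit of the minute, value = points won or lost.
    $payouts = @(0, -25, 50, -100, 100, 0, 25, -50, 0, 200)

    # Minute % 10 yields the last digit of the minute (0 through 9).
    return $payouts[$Time.Minute % 10]
}

# The three examples from the text, using April 1st, 2019 as the date.
$at125 = Get-RoulettePoints -Time ([datetime]'2019-04-01 13:25')   # 0
$at129 = Get-RoulettePoints -Time ([datetime]'2019-04-01 13:29')   # 200
$at123 = Get-RoulettePoints -Time ([datetime]'2019-04-01 13:23')   # -100
```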
By COB that day I had Jigsaw bot tested out and working. The whole process works like this:
- A user triggers a build, and at random they’re challenged to a game.
- I look up the user on the whitelist and blacklist. If the user is not whitelisted or blacklisted when they’re chosen for a challenge, then they are issued a challenge by the Team City build server directly in Slack. They’d be instructed to (if they wanted) play the game by using the SAW build at the bottom of the page, in a section reserved for Utilities.
- Team City updates the whitelist by allowing that user to play.
- Team City sends me a message, so I have a record that a game is coming and should be prepared to send them points.
- The user plays the game. Based on another time-based roulette like the one shown above, they’re given or penalized a minuscule, arbitrary amount of points. Any user caught trying to play the game who wasn’t invited via the whitelist would be blacklisted and mocked in Slack in front of everyone on their team. Once the user completes the game they’re also blacklisted from playing again, in order to keep it simple and prevent everyone from winning too much. If the RNG engine was not being generous with the developers and there were not a lot of players by midweek, my plan was to clear the blacklist and let them try again, for the fun of it.
Nothing could go wrong with this plan, right?
Riiiiiiiiiight…
Come Monday, hell had pretty much broken loose in my Slackbot channel. It turns out that unless the user clicked the [RUN] button or the […] next to it to initiate a build, the server would set the default service account name as the value of the %teamcity.build.triggeredBy.username% system variable when it went to do a lookup. Slack would then try to look up this invalid user and send nothing to me. I was 100% in the dark on what was going on. It wasn’t until Tuesday of that week that an actual user got labeled as the one triggering builds. By then I already had about 100 IMs in chat saying that %teamcity.build.triggeredBy.username% had been challenged to a game. Very helpful. Veeeeeery helpful.
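In hindsight, a small guard in front of the Slack lookup might have kept those messages out of my channel. The sketch below is hypothetical and was not part of the original script. It assumes that an unresolved parameter shows up as a literal `%...%` string, as it did in the IMs above, and the function name `Test-ResolvedUserName` is made up.

```powershell
# Hypothetical guard (not in the original script). It assumes an unresolved
# build parameter appears as a literal "%...%" string.
function Test-ResolvedUserName {
    param([string] $Name)

    # Nothing to look up if the value is empty or whitespace.
    if ([string]::IsNullOrWhiteSpace($Name)) { return $false }

    # Reject values that still look like an unexpanded %parameter%.
    return $Name -notmatch '^%.+%$'
}

$placeholderOk = Test-ResolvedUserName -Name '%teamcity.build.triggeredBy.username%'  # False
$realUserOk    = Test-ResolvedUserName -Name 'jdoe'                                   # True
```

A check like this would still let the default service account name through, so that name would need its own comparison.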
Remember the whitelist/blacklist function that I mentioned earlier? The lists lived in a build artifact that accumulated the names of everyone who had been challenged as the week progressed. Once a user gets challenged, they’re put on the whitelist and are not bothered again while they decide whether to play the game. The whitelist check wasn’t working for anyone, so by midweek Matt came to me saying that he was getting challenged to a game every 5 minutes while running a series of database changes on multiple databases. Although he wasn’t upset (he was actually mildly amused), he did ask if I could turn it off while I looked for a way to make Jigsawbot not play favorites with him.
Oh, and did I mention that almost everyone found this irritating? Pretty much everyone I spoke to believed that it was either a social engineering scam or just me goofing around, and no one reported it to me or my team either way, save for the 4 users who contacted me because they were just amused enough to ask. Out of that subset of developers, only half were even remotely interested in playing along and did so (and I rewarded them as per the rules of the game). At least our IT department’s anti-phishing training was not lost on my developer friends.
Everything was going so smoothly.
So I worked on it, and after some hair-pulling moments realized where my errors were, and got it to leave poor Matt alone.
Pseudo-code for what I was doing:
if (we should challenge the user) {
    challenge the user;
    if (not on whitelist and not on blacklist) { whitelist them; }
}
What I instead should have been doing:
if (we should challenge the user) {
    if (not on whitelist and not on blacklist) {
        challenge the user;
        whitelist them;
    }
}
This never came up in my tests because I’d play the game directly after being challenged. I didn’t have a test case scenario that would adequately cover the instance of “I need to run 5 different builds at once that would trigger this before playing” and the gap in my code’s behavior wasn’t readily apparent until it was out for the “world” to see.
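The missing test case is easy to sketch once the corrected logic is wrapped in a function. The version below is an illustration only: the two `HashSet` variables are in-memory stand-ins for the real whitelist and blacklist (which lived in a build artifact), and the function name `Invoke-JigsawChallenge` is hypothetical.

```powershell
# Hypothetical in-memory stand-ins for the real whitelist and blacklist.
$whitelist = [System.Collections.Generic.HashSet[string]]::new()
$blacklist = [System.Collections.Generic.HashSet[string]]::new()

# Returns $true only when a challenge is actually issued.
function Invoke-JigsawChallenge {
    param(
        [string] $User,
        [bool]   $ShouldChallenge   # result of the random "challenge this build?" roll
    )
    if (-not $ShouldChallenge) { return $false }

    # Corrected logic: the list check happens BEFORE the challenge is issued.
    if ($whitelist.Contains($User) -or $blacklist.Contains($User)) { return $false }

    [void] $whitelist.Add($User)   # already challenged, now allowed to play
    return $true                   # this is where the Slack message would be sent
}

# The scenario that was never tested: five builds in a row before playing.
$results = 1..5 | ForEach-Object { Invoke-JigsawChallenge -User 'matt' -ShouldChallenge $true }
$challengeCount = @($results | Where-Object { $_ }).Count   # 1
```

With the original ordering, the same five builds would have produced five challenges.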
With the correct logic in place, I was able to fix what my problems were, and Matt finally got some peace and quiet. I, however, wasn’t satisfied. If I wanted to make this into a usable bot script for build errors, I needed to understand how the build software gets the VCS changes from Git to learn who the user was.
With my week of sad and failed pranks nearly over, I began working on grabbing the username from Git so I could feed it into my script and the circle would be complete. A few git commands piped to a file and a sprinkle of regex later, I had cooked up something that reliably gets the email address for a username, or for users who sign their commits with their first and last name, and reports it back to the script. Now I can do all the fancy things described above without worrying about a bad username being passed in because the system’s default is not valid for the script’s purposes.
The icing on this cake is that I got the plugin working while I was doing this (you know, the one that started this whole damn journey). Ironically, this script isn’t usable for me in its current form without completely redoing all of our builds and sending them back to the dark ages.
However, should I ever need to migrate to other build software with fewer plugins and a UI that is less friendly and more command-line driven, I have a script that will send private messages. Here it is:
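A rough idea of the regex step, as a sketch rather than the script I actually used: it assumes the author line is in the `Name <email>` form that `git log` can print with the `%an` and `%ae` format placeholders. The function name `Get-AuthorEmail` and the sample author are made up.

```powershell
# Hypothetical helper (not the original script): extracts the email from an
# author string in "Name <email>" form, such as the output of
#   git log -1 --format='%an <%ae>'
function Get-AuthorEmail {
    param([string] $Author)

    # Capture whatever sits between the angle brackets and contains an @.
    if ($Author -match '<([^<>\s]+@[^<>\s]+)>') {
        return $Matches[1]
    }
    return $null   # no email found in the string
}

# Made-up sample author line.
$email = Get-AuthorEmail -Author 'Jane Doe <jane.doe@example.com>'
```

The resulting address can then go straight into the email lookup instead of the build server's username variable.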
Cover image by Jose Francisco Morales on Unsplash