The Discord Bot That Nearly Killed Me (Created in C++)

timbeaudet

🏎️Tim Beaudet🚀

Posted on November 18, 2019

Project Overview

If you are considering writing a bot yourself, or enjoy reading the adventures of other developers, then this article will serve as a list of pitfalls and how to avoid them, or as the story of the quest I took on. This is a long one as I faced many challenges, so go grab that coffee or popcorn and learn something along the way.

This adventure was taken in C++. While other languages and frameworks would have made the project significantly easier, C++ was chosen for three reasons: code reuse from a Twitch bot I use for my programming livestream, it is my comfort language, and I want to use websockets in other C++ projects. The last reason was the primary driving force.

The objective of the project was to create a Discord bot to run on my in-house server box 24/7. The server box runs Linux, and at the start of the project I was running Ubuntu on my development machine as an experiment in gamedev on Linux. The project started with a search for websocket frameworks.

I did not want to use boost. While I like the concept, I have nightmares about it from other projects. In fact I very much dislike dependency management in C++, but it is still sometimes easier than writing a custom implementation. So I dug into seasocks, got it built and linking, and started using it before realizing… seasocks is specifically a websocket server and does not have client capabilities. Wasted effort.

The First Challenge, Quitting and a Minor Victory

The first 6 hours were spent jumping between various frameworks and writing a custom implementation. Then I quit the project. Yes, I quit the project after 6 hours, deeming it not worth the hassle and effort. After a break of a couple of days the project was reawakened with the intent to write the implementation myself and power on through. I get knocked down, but I get back up again.

Secure websockets were required to connect to Discord, although they are not required for my future websocket needs. The next several hours of the project were spent dealing with this, first attempting to use OpenSSL, and what a fiasco that was. With much help from Twitch chat, I jumped over to LibreSSL and in a few more hours I finally had a connection with Discord. The first baby step. The connection died after a few seconds (no heartbeats were sent), but the upgrade from https:// to wss:// was successful.

I already have TCP and UDP socket implementations for my game development projects, and I wanted to use WebSockets through those socket implementations for future needs. The challenge was getting LibreSSL to use those sockets, though this one was rather easy. Swapping tls_connect_socket() for tls_connect_cbs() allowed callbacks to be used for reading and writing to a custom socket implementation. After a bit of cleanup from all the prototyping I was feeling pretty solid after two victories.

Implementing WebSocket Protocol

This started out way worse than expected; rfc6455 was a bit scary at first. As stated, the https to wss upgrade had been implemented in some form to get TLS working with LibreSSL. But to send or receive data over a websocket, special frames are used which contain a header that can be anywhere between 2 and 14 bytes depending on the payload length and masking. This was a perfect place to use a C++ bitfield. Alternatively, manual bitwise operations could meet the needs, but bitfields are cleaner.

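#include <cstdint>

// First two bytes of every WebSocket frame (rfc6455 section 5.2).
// Bitfield layout is implementation-defined; this ordering assumes the
// compiler allocates bits starting from the least significant one.
struct FrameHeader
{
    std::uint8_t mOpCode : 4;   // frame type: text, binary, close, ping, pong
    std::uint8_t mReserved : 3; // RSV1-3, zero unless an extension is negotiated
    std::uint8_t mFinished : 1; // FIN, set on the final frame of a message
    std::uint8_t mLength : 7;   // payload length, or 126/127 for extended lengths
    std::uint8_t mMask : 1;     // set when a 4 byte masking key follows
};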

I admit the image in rfc6455#page-28 describing the frame did throw me for a loop, and my first instinct flipped the most and least significant bits of each byte. This was pretty easy to discover and fix. Each frame will have at least two bytes, and after reading those bytes in they can be cast into the header to extract information. The actual payload length is stored either within those 7 bits, or in an extra 2 or 8 bytes that follow the first two when the length (in FrameHeader) is 126 or 127 respectively. Pay attention to the endianness of those larger values: the network expects big-endian across the wire, most significant byte first.
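For anyone who prefers the manual bitwise route over a bitfield, here is a minimal sketch of reading the header that way. The names ParsedHeader and ParseFrameHeader are made up for illustration and are not the bot's actual code; the sketch assumes C++17 for std::optional and that the received bytes are already sitting in a std::vector.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical result type, illustrative only.
struct ParsedHeader
{
    bool finished;               // FIN bit: last frame of a message
    std::uint8_t opCode;         // low 4 bits of the first byte
    bool masked;                 // MASK bit: a 4 byte masking key follows the length
    std::uint64_t payloadLength; // actual payload length in bytes
    std::size_t headerSize;      // total header bytes, between 2 and 14
};

// Returns std::nullopt when the buffer does not yet hold the whole header.
std::optional<ParsedHeader> ParseFrameHeader(const std::vector<std::uint8_t>& bytes)
{
    if (bytes.size() < 2) { return std::nullopt; }

    ParsedHeader header{};
    header.finished = (bytes[0] & 0x80) != 0;
    header.opCode = static_cast<std::uint8_t>(bytes[0] & 0x0F);
    header.masked = (bytes[1] & 0x80) != 0;

    // 0..125 is the length itself, 126 and 127 announce extended lengths.
    const std::uint8_t shortLength = static_cast<std::uint8_t>(bytes[1] & 0x7F);
    std::size_t extendedBytes = 0;
    if (126 == shortLength) { extendedBytes = 2; }
    else if (127 == shortLength) { extendedBytes = 8; }

    header.headerSize = 2 + extendedBytes + (header.masked ? 4 : 0);
    if (bytes.size() < header.headerSize) { return std::nullopt; }

    header.payloadLength = shortLength;
    if (extendedBytes > 0)
    {
        // Extended lengths arrive big-endian: most significant byte first.
        header.payloadLength = 0;
        for (std::size_t index = 0; index < extendedBytes; ++index)
        {
            header.payloadLength = (header.payloadLength << 8) | bytes[2 + index];
        }
    }
    return header;
}
```

Building the length with shifts means the result does not depend on the endianness of the machine running the code.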

An additional 4 bytes are added to describe the mask when the mask bit is on. Apparently this is to prevent intermediaries such as proxies from mistaking frames for http traffic and caching them. Though this made little sense to me, I pressed on with help from viewers, playing with the XOR operator to mask the payload. A websocket client is expected to set the mask bit and xor each byte of the payload with the bytes of the mask, cycling through the 4 mask bytes. The mask is randomly selected for each frame of data. Note, this doesn’t add security, as anyone can see the mask and unmask the data; it just makes two frames that would otherwise be identical look different.
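The masking step is small enough to sketch in a few lines. ApplyMask and MakeMaskingKey are hypothetical names for illustration; std::random_device is assumed to be a good enough entropy source here, which depends on the platform.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Picks a fresh 4 byte masking key; call this once per outgoing frame.
std::array<std::uint8_t, 4> MakeMaskingKey()
{
    std::random_device entropy;
    std::array<std::uint8_t, 4> key{};
    for (std::uint8_t& byte : key)
    {
        byte = static_cast<std::uint8_t>(entropy() & 0xFF);
    }
    return key;
}

// XORs each payload byte with the masking key, cycling through its 4 bytes.
// Applying the same key a second time restores the original bytes.
void ApplyMask(std::vector<std::uint8_t>& payload, const std::array<std::uint8_t, 4>& maskingKey)
{
    for (std::size_t index = 0; index < payload.size(); ++index)
    {
        payload[index] ^= maskingKey[index % 4];
    }
}
```

Because XOR undoes itself, the same function serves for both masking and unmasking.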

At this point data was being received from Discord in WebSocket frames and could be parsed as json. Upon connection Discord sends a HELLO message through the gateway which tells the bot how often the HEARTBEAT message should be sent to keep the connection alive. Each heartbeat has to contain the last sequence number given by Discord, which is fairly straightforward, and at this point the connection could live on, but do nothing otherwise.
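As a rough sketch of the heartbeat payload: the Discord gateway uses opcode 1 for HEARTBEAT, with the last received sequence number (or null if none has arrived yet) in the d field. MakeHeartbeatPayload is a hypothetical helper that builds the json by hand to stay self-contained; a real bot would more likely go through whatever json library it already uses.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Builds the HEARTBEAT json; lastSequence is empty until Discord has sent
// a message carrying a sequence number.
std::string MakeHeartbeatPayload(const std::optional<std::int64_t>& lastSequence)
{
    const std::string sequence = lastSequence ? std::to_string(*lastSequence) : "null";
    return "{\"op\":1,\"d\":" + sequence + "}";
}
```

The returned string would then be sent as a text frame every heartbeat_interval milliseconds, the interval that arrives in the HELLO message.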

Becoming a Detective

After receiving a HELLO the bot is expected to send an IDENTIFY message which contains the bot’s token, and Discord returns a READY message on success. At that point the bot is connected… Or it should have been. After sending the IDENTIFY message my bot would immediately disconnect once a HEARTBEAT message was sent, or, if a HEARTBEAT was never sent, never receive the READY message before being disconnected by timeout.

There were many hours of digging into this issue. A debug tool for logging hexdumps was added to my debug framework as well. Not entirely sure how I lived this long without that tool, but it will definitely save me in the future. With the hexdumps I was able to start comparing what was getting sent to expectations. I also wrote a small ‘test’ of sorts that created a websocket frame and parsed it to ensure everything worked.
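A hexdump helper does not need to be fancy to be useful. This is a minimal sketch under my own assumptions about the format (lowercase hex, 16 bytes per line); HexDump is a hypothetical name and not the tool from the debug framework mentioned above.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Formats bytes as two hex digits each, separated by spaces, 16 per line.
std::string HexDump(const std::vector<std::uint8_t>& bytes)
{
    static const char digits[] = "0123456789abcdef";
    std::string text;
    for (std::size_t index = 0; index < bytes.size(); ++index)
    {
        if (index > 0) { text += (0 == index % 16) ? '\n' : ' '; }
        text += digits[bytes[index] >> 4];
        text += digits[bytes[index] & 0x0F];
    }
    return text;
}
```

Dumping a frame right before it is written to the socket makes it possible to compare it byte by byte against the layout in rfc6455.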

By far the hardest case to solve was that of the disappearing bug. At first it started with a few random failures and “that was weird”, but no obvious suspect was found. The investigation continued and multiple suspects were questioned. Undefined behavior was discovered when the bot was run several times, without recompiling, with different results: a simple test frame with the string “INDIE” was sometimes unmasked correctly as “INDIE” and sometimes as “INDGD”. Digging deeper into the handling of the payload revealed a rookie mistake: holding a reference to data within a std::vector while calling push_back(). The solutions were simple: either reserve the size required or do not hold the reference.
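To make the mistake concrete, here is a sketch of the bug pattern and both fixes. The scenario is invented for illustration (appending the unmasked bytes to the same vector that holds the masked ones) and AppendUnmasked is a hypothetical name; the original code may have looked quite different.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Appends the unmasked form of the first maskedSize bytes to the same buffer.
// Assumes maskedSize <= buffer.size().
void AppendUnmasked(std::vector<std::uint8_t>& buffer, std::size_t maskedSize,
    const std::array<std::uint8_t, 4>& maskingKey)
{
    // Bug pattern: a pointer or reference taken before the loop dangles as soon
    // as a push_back() outgrows the capacity and the vector reallocates.
    //   const std::uint8_t* masked = buffer.data();
    //   buffer.push_back(masked[index] ^ maskingKey[index % 4]); // undefined behavior

    // Fix 1: reserve the final size up front so no reallocation happens.
    buffer.reserve(buffer.size() + maskedSize);

    // Fix 2: hold an index instead of a reference and look the byte up each time.
    for (std::size_t index = 0; index < maskedSize; ++index)
    {
        const std::uint8_t unmasked =
            static_cast<std::uint8_t>(buffer[index] ^ maskingKey[index % 4]);
        buffer.push_back(unmasked);
    }
}
```

Either fix alone is enough; the sketch shows both only to put them side by side. Whether the buggy version fails depends on when the vector happens to reallocate, which is why the results can change from run to run.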

When everything was looking good with the test code, Discord still failed to respond to IDENTIFY with a READY. A significant investigation revealed the IDENTIFY message was larger than 125 bytes, while heartbeats and tests were smaller. This led to the discovery that endianness had been ignored. I take the simplest approach first, and in previous experiences endianness was often mentioned but in practice things always seemed to work without messing around. Not this time.

The solution was quite easy. Just reverse the order of the bytes, relative to how they are stored in memory, when sending or receiving. So if the uint16 length was 0x1234, stored in memory (little-endian) as 0x34, 0x12, then over the wire the bytes need to be sent (big-endian) as 0x12, 0x34. The same process applies for the uint64; there are simply more bytes to swap around. Finally the READY message arrived.
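One way to sidestep the swap entirely is to build the bytes with shifts, which produces big-endian output no matter how the host stores integers. The sketch below writes the length portion of a frame header that way; AppendBigEndian and AppendPayloadLength are hypothetical names for illustration.

```cpp
#include <cstdint>
#include <vector>

// Appends the low byteCount bytes of value, most significant byte first.
void AppendBigEndian(std::vector<std::uint8_t>& output, std::uint64_t value, int byteCount)
{
    for (int shift = (byteCount - 1) * 8; shift >= 0; shift -= 8)
    {
        output.push_back(static_cast<std::uint8_t>((value >> shift) & 0xFF));
    }
}

// Writes the second header byte plus any extended length bytes.
void AppendPayloadLength(std::vector<std::uint8_t>& output, std::uint64_t length, bool masked)
{
    const std::uint8_t maskBit = masked ? 0x80 : 0x00;
    if (length <= 125)
    {
        output.push_back(static_cast<std::uint8_t>(maskBit | length));
    }
    else if (length <= 0xFFFF)
    {
        output.push_back(static_cast<std::uint8_t>(maskBit | 126));
        AppendBigEndian(output, length, 2); // 0x1234 goes out as 0x12, 0x34
    }
    else
    {
        output.push_back(static_cast<std::uint8_t>(maskBit | 127));
        AppendBigEndian(output, length, 8);
    }
}
```

Payloads of 125 bytes or fewer never touch the extended length path, which would explain why small heartbeats and test frames can work while a larger message fails.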

Sending a Chat Message

After receiving the READY message other messages came in as well, MessageCreate being the interesting one to dig into. It was very easy to parse the json object to get the contents of the message and add a very simple if (message == "!time") { Respond("time is…"); }. Well, that was where simplicity ended. Implementing Respond() took a lot of digging into the Discord documentation before figuring out that, apparently, sending a message requires the http api and cannot be done through the websocket connection.

I don’t know and cannot speculate why responding to a message is done through an entirely different connection when a perfectly good connection already exists, but I am sure there are reasons. Sending an http post was not too hard since I already had a wrapper around libcurl to do just this, with the exception that my wrapper didn’t send data, only post parameters, headers and url. It was quick work to find and implement a way to post data with CURLOPT_POSTFIELDS. However, I am cursed.

With postfields you need to give a pointer to the data and the size of the data in another option. Unless using CURLOPT_COPYPOSTFIELDS, the data needs to be managed on your end. This was all effortless. But it did not work. Discord sent back {"message": "Cannot send an empty message","code": 50006}. Debug output from curl showed the entire contents of my post data was sent successfully, so how was the message empty? I checked the json, and everything else, about 400 times. I even used curl through the command-line with the --libcurl file.c switch to compare the generated code with mine.

After a lot of attempts I removed the null-terminator that I had naively copied into the data given to postfields. This was the problem. As a game-developer I am not as versed in the internet or the http protocol exactly, but sending a null-terminator byte is evidently extremely bad and Discord throws the contents away. I guess this is in defense of a Null Byte Poisoning attack, where the server sanitizes content up to the null-byte but potentially processes unsanitized content after it.
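The off-by-one is easy to reproduce without touching the network. In this sketch MakePostData is a hypothetical helper, the json body is only an example, and the libcurl calls appear in comments so the snippet stays self-contained.

```cpp
#include <string>
#include <vector>

// Copies a json body into a buffer for posting, without the C string terminator.
std::vector<char> MakePostData(const std::string& json)
{
    // Bug pattern: copying size() + 1 bytes from c_str() drags the '\0' along:
    //   std::vector<char>(json.c_str(), json.c_str() + json.size() + 1);
    return std::vector<char>(json.begin(), json.end());
}

// With libcurl the buffer would then be handed over roughly like this:
//   curl_easy_setopt(handle, CURLOPT_POSTFIELDSIZE, static_cast<long>(data.size()));
//   curl_easy_setopt(handle, CURLOPT_POSTFIELDS, data.data());
// The buffer has to stay alive until the transfer finishes, since
// CURLOPT_POSTFIELDS does not copy it.
```

The same trap exists with string literals: sizeof on a char array counts the terminator, while strlen and std::string::size() do not.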

Finally I smashed through the last wall victorious. The bot responded to the !time command with my local time. A significant amount of code cleanup and refactoring occurred so the project could be maintained into the future and more commands added. There are many plans to enhance my discord server and live-streaming overlay.

Wrap-up

The takeaway is that programming is often about persistence. Digging through concrete walls with a plastic spoon. I nearly quit this project at the start, but instead I got back up and powered through. By jumping over, crushing through and going around multiple walls, I managed to get my Discord bot working. It makes the project much more rewarding. For more of my projects check out my development stream on twitch.tv/timbeaudet.
