Gealber Morales
Posted on May 14, 2021
MQTT over TCP
Introduction
In this article I will try to understand the flow of a single MQTT connection. For those who don't know what MQTT is, the official MQTT site (mqtt.org) has reliable information about this protocol, along with several use cases. I personally recommend that you check this protocol out.
Prerequisites
- Docker
- emqx/emqx docker image
- Python
- paho-mqtt python library
- Wireshark
Setup
Having Docker and Python 3 installed, run the following commands:
- Downloading emqx/emqx Docker image
docker pull emqx/emqx
- Running this image as a container
docker run -d --name emqx -p 18083:18083 -p 1883:1883 emqx/emqx:latest
- Making sure the container is up without problems
docker ps -a
Check the STATUS column for the emqx container; it should start with Up.
- Install paho-mqtt python library
pip install paho-mqtt
- Run Wireshark
sudo wireshark
After all this we should be good to go.
Preparing python scripts
The Python scripts are the same ones you can find in MQTT in Python (Paho), with one modification: we don't want to send that many messages, so I will send only one. All the analysis will be done with the publisher script, but you can perform the same task with the subscriber script.
# pub.py modified from original
import random
import time

from paho.mqtt import client as mqtt_client

# Configuration
broker = '0.0.0.0'  # the emqx container publishes port 1883 on this machine
port = 1883
topic = "/python/mqtt"
client_id = f"python-mqtt-{random.randint(0, 1000)}"


def connect_mqtt():
    """
    connect_mqtt: handle the connection.
    """
    def on_connect(client, userdata, flags, rc):
        # rc is the return code carried by the broker's CONNACK packet
        if rc == 0:
            print("Connected to MQTT Broker!")
        else:
            print(f"Failed to connect, return code {rc}")

    # Written for paho-mqtt 1.x. Newer releases (2.x) expect
    # mqtt_client.CallbackAPIVersion.VERSION1 as the first argument.
    client = mqtt_client.Client(client_id)
    client.on_connect = on_connect
    client.connect(broker, port)
    return client


def publish(client):
    """
    publish: publish data to the broker.
    """
    msg_count = 0
    # Give the network loop time to finish the CONNECT / CONNACK exchange
    time.sleep(1)
    msg = f"messages: {msg_count}"
    result = client.publish(topic, msg)
    # result behaves like (rc, mid); rc == 0 means success
    status = result[0]
    if status == 0:
        print(f"Send `{msg}` to topic: `{topic}`")
    else:
        print(f"Failed to send message to topic `{topic}`")
    msg_count += 1


def run():
    client = connect_mqtt()
    # Start the background thread that handles the network traffic
    client.loop_start()
    publish(client)


if __name__ == '__main__':
    run()
General overview of this setup
If you are new to MQTT, all this setup may seem pretty strange. I'll do my best to explain why all of it is needed; a sketch made with excalidraw will help you understand.
According to this sketch, both the publisher and the subscriber are connected to emqx/emqx, which acts as the broker. In this case I'll be looking at only one part of this communication: the interaction between the publisher and the broker.
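To make the roles in the sketch concrete, here is a toy in-memory model of the publisher, broker and subscriber relationship. It is only an illustration under simplifying assumptions: `ToyBroker` is a hypothetical class invented for this example, it matches topics exactly, and it has nothing to do with how emqx is actually implemented.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Callback = Callable[[str, str], None]


class ToyBroker:
    """Hypothetical in-memory stand-in for a broker (exact topic match only)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callback) -> None:
        # A subscriber registers its interest in a topic with the broker.
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message: str) -> int:
        # The publisher only talks to the broker; the broker fans the
        # message out to whoever subscribed to that topic.
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(topic, message)
        return len(callbacks)


received = []
toy_broker = ToyBroker()
toy_broker.subscribe("/python/mqtt", lambda t, m: received.append((t, m)))
delivered = toy_broker.publish("/python/mqtt", "messages: 0")
```

The point of the model is that the publisher never needs to know who the subscribers are; it only needs a connection to the broker, which is the part analyzed in the rest of this article.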
Looking inside the packets
It's time to look at the packets in this communication, to at least get an idea of how this works. If you followed all the steps above, you should have emqx running and waiting for connections on port 1883. Now make sure to select the appropriate network interface in Wireshark, as I explained in the tutorial related to TCP. The filter in this case will be tcp.port == 1883 || mqtt. With all this in place, it is time to run the pub.py script; let's see what happens.
python pub.py
Output:
Connected to MQTT Broker!
Send `messages: 0` to topic: `/python/mqtt`
If you read the output, it seems that the publisher just connected and sent the message to the topic /python/mqtt. Mmm... 🤔 but that's not all; there's a lot of back and forth between the publisher and emqx. Let's explore the packets that we captured with Wireshark.
After running the script you should be seeing something like this.
As I mentioned, the message is not just sent.
TCP handshake
The first three packets that you can see on Wireshark are related to the TCP handshake between the publisher and emqx.
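The handshake itself is done by the operating system, not by the MQTT library. As a rough sketch, the following opens a plain TCP connection on the loopback interface; the local listening socket is only a stand-in for emqx so the example runs without a broker, and the ephemeral port is an assumption of this example (the real broker listens on 1883).

```python
import socket

# Stand-in for the broker: listen on an ephemeral loopback port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.settimeout(5)
server.bind(("127.0.0.1", 0))
server.listen(1)
host, port = server.getsockname()

# A blocking connect returns once the three-way handshake
# (SYN, SYN-ACK, ACK) has completed from the client's side.
client = socket.create_connection((host, port), timeout=5)
conn, peer = server.accept()

handshake_done = client.getpeername() == (host, port)
same_client_port = peer[1] == client.getsockname()[1]

conn.close()
client.close()
server.close()
```

No application data has been exchanged at this point, which is why the first three packets in the capture carry no MQTT payload.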
CONNECT command
After the handshake is established, the publisher sends emqx a CONNECT command; this is something new. What is this CONNECT command? It is part of the MQTT protocol. Let's look inside the packet.
Mmm... 🤔 it is not so easy to get an idea just from this picture, right? As a personal recommendation, you should read this picture alongside the MQTT specification. If you are too lazy to read part of a specification, I understand, so stay with me and question everything I say; that will help you think more critically about this subject. Any protocol specification is a very dense and extremely technical document, so don't worry if you don't understand everything. In our simple case you should be looking at section 2, MQTT Control Packet format, and also skimming section 1, Introduction.
According to the specification an MQTT Control Packet is structured in the following way:
- Fixed Header, present in all MQTT Control Packets
- Variable Header, present in some MQTT Control Packets
- Payload, present in some MQTT Control Packets
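As a minimal sketch of that structure, the Fixed Header can be decoded by hand: the high four bits of the first byte are the packet type, the low four bits are flags, and the following one to four bytes hold the Remaining Length in the variable-length encoding described in section 2 of the MQTT 3.1.1 specification. The function names and the partial `PACKET_TYPES` table are choices made for this example.

```python
from typing import Dict, Tuple

# Partial table of MQTT 3.1.1 control packet types.
PACKET_TYPES: Dict[int, str] = {
    1: "CONNECT", 2: "CONNACK", 3: "PUBLISH", 4: "PUBACK",
    8: "SUBSCRIBE", 9: "SUBACK", 12: "PINGREQ", 13: "PINGRESP",
    14: "DISCONNECT",
}


def decode_remaining_length(data: bytes, offset: int = 1) -> Tuple[int, int]:
    """Return (value, bytes_used) for the Remaining Length field."""
    value = 0
    multiplier = 1
    for used in range(1, 5):  # the field is at most 4 bytes long
        byte = data[offset + used - 1]
        value += (byte & 0x7F) * multiplier
        if (byte & 0x80) == 0:  # continuation bit not set: last byte
            return value, used
        multiplier *= 128
    raise ValueError("malformed Remaining Length (more than 4 bytes)")


def parse_fixed_header(data: bytes) -> dict:
    packet_type = data[0] >> 4
    remaining_length, used = decode_remaining_length(data)
    return {
        "type": PACKET_TYPES.get(packet_type, f"type {packet_type}"),
        "flags": data[0] & 0x0F,
        "remaining_length": remaining_length,
        "header_size": 1 + used,
    }


header = parse_fixed_header(bytes([0x10, 0x19]))
```

Here `header["type"]` is "CONNECT" and `header["remaining_length"]` is 25, meaning 25 more bytes (Variable Header plus Payload) follow the two header bytes.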
Now back to the picture. Can we identify some of these parts? I will do my best. According to the picture, which shows the traffic intercepted by Wireshark, we have:
- Header Flags, the first byte of the Fixed Header, which specifies the Message Type; in our case it is a Connect command.
- Connect Flags, which belong to the Variable Header of the CONNECT packet and contain flags specific to this Message Type. One really important field here is the QoS, which stands for Quality of Service (in a CONNECT packet it applies to the Will message). QoS can take three values: 0, 1 or 2. Section 4.3 of the specification, Quality of Service levels and protocol flows, is dedicated to it; check it out to get an idea of its importance.
There's also more information that Wireshark shows you, like Protocol Name, Version, Msg Len (the Remaining Length field of the Fixed Header), and Client ID. Protocol Name and Version are used to identify the request with the protocol; it is like saying "hey, I'm speaking MQTT v3.1.1", so that the other side can return an appropriate response. The Client ID, as the name suggests, is a way to identify the client that is connecting.
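To see where those fields live in the bytes, here is a sketch that builds a minimal MQTT 3.1.1 CONNECT packet by hand and decodes the Connect Flags byte. It is not what paho-mqtt does internally; the helper names, the keep-alive of 60 seconds and the client ID `python-mqtt-42` are assumptions made for illustration, and no username, password or Will message is included.

```python
import struct


def encode_string(text: str) -> bytes:
    """MQTT strings are UTF-8 with a 2-byte big-endian length prefix."""
    raw = text.encode("utf-8")
    return struct.pack("!H", len(raw)) + raw


def encode_remaining_length(length: int) -> bytes:
    out = bytearray()
    while True:
        length, digit = divmod(length, 128)
        if length > 0:
            digit |= 0x80  # continuation bit: more bytes follow
        out.append(digit)
        if length == 0:
            return bytes(out)


def build_connect(client_id: str, keep_alive: int = 60) -> bytes:
    connect_flags = 0x02  # only Clean Session set
    variable_header = (
        encode_string("MQTT")             # Protocol Name
        + bytes([4, connect_flags])       # Version 4 means v3.1.1
        + struct.pack("!H", keep_alive)   # Keep Alive in seconds
    )
    payload = encode_string(client_id)    # Client ID
    body = variable_header + payload
    return bytes([0x10]) + encode_remaining_length(len(body)) + body


def decode_connect_flags(flags: int) -> dict:
    return {
        "username": bool(flags & 0x80),
        "password": bool(flags & 0x40),
        "will_retain": bool(flags & 0x20),
        "will_qos": (flags >> 3) & 0x03,
        "will_flag": bool(flags & 0x04),
        "clean_session": bool(flags & 0x02),
    }


packet = build_connect("python-mqtt-42")
flags = decode_connect_flags(packet[9])
```

With this client ID the packet is 28 bytes long: 2 bytes of Fixed Header, 10 bytes of Variable Header and 16 bytes of Payload.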
EMQX TCP ACK
After the CONNECT command, emqx sends an ACK packet over TCP with the common data related to this packet type.
EMQX CONNECT ACK command
Now, using the MQTT protocol, emqx sends the CONNECT ACK (CONNACK) command to the publisher. This is the equivalent of saying: "Ok, I acknowledge your connection attempt", all in the sense of the MQTT protocol, because as we saw before they are already connected by TCP. They are just agreeing on "talking MQTT" from now on. Here's a screenshot of how Wireshark sees this case.
Note the Return Code: it tells us "Connection accepted". Thanks to Wireshark we can see this in human-readable format.
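What Wireshark does here can be approximated in a few lines. The sketch below decodes a four-byte MQTT 3.1.1 CONNACK packet and maps the Return Code to text; the function name and the sample bytes are assumptions for this example, while the return code meanings follow the 3.1.1 specification.

```python
CONNACK_RETURN_CODES = {
    0: "Connection accepted",
    1: "Connection refused: unacceptable protocol version",
    2: "Connection refused: identifier rejected",
    3: "Connection refused: server unavailable",
    4: "Connection refused: bad user name or password",
    5: "Connection refused: not authorized",
}


def parse_connack(packet: bytes) -> dict:
    # Fixed Header: 0x20 (type 2, flags 0) and Remaining Length 2.
    if len(packet) != 4 or packet[0] != 0x20 or packet[1] != 0x02:
        raise ValueError("not an MQTT 3.1.1 CONNACK packet")
    return {
        "session_present": bool(packet[2] & 0x01),
        "return_code": packet[3],
        "meaning": CONNACK_RETURN_CODES.get(packet[3], "reserved"),
    }


result = parse_connack(bytes([0x20, 0x02, 0x00, 0x00]))
print(result["meaning"])  # Connection accepted
```

This return code is the same value that paho-mqtt passes as `rc` to the `on_connect` callback in pub.py.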
Publisher sends an ACK by TCP and emqx responds with an ACK by TCP
TCP chatting between both points before sending the actual message.
Publisher sends the Publish message command
Now the publisher actually sends the data to emqx.
Here's where all the important information goes; take a look at the picture. We get the Msg Length, Topic Length, Topic and the actual Message. It is in this last packet that the publisher really sends all the data: it specifies the Topic and the message, among other things. In this way a subscriber that is connected to emqx and "subscribed" to this specific topic can read the message.
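The same fields can be reproduced by hand. The sketch below builds and parses a QoS 0 PUBLISH packet for the topic and message used in pub.py (paho-mqtt publishes with QoS 0 unless told otherwise). The function names are invented for this example, and it assumes the packet is small enough for a one-byte Remaining Length.

```python
import struct
from typing import Tuple


def build_publish_qos0(topic: str, message: str) -> bytes:
    topic_bytes = topic.encode("utf-8")
    payload = message.encode("utf-8")
    # Variable Header: length-prefixed topic. QoS 0 has no packet identifier.
    body = struct.pack("!H", len(topic_bytes)) + topic_bytes + payload
    if len(body) > 127:
        raise ValueError("sketch only handles a one-byte Remaining Length")
    # 0x30: packet type 3 (PUBLISH), DUP=0, QoS=0, RETAIN=0
    return bytes([0x30, len(body)]) + body


def parse_publish_qos0(packet: bytes) -> Tuple[int, int, str, str]:
    if packet[0] != 0x30:
        raise ValueError("not a QoS 0 PUBLISH packet")
    msg_length = packet[1]
    topic_length = struct.unpack("!H", packet[2:4])[0]
    topic = packet[4:4 + topic_length].decode("utf-8")
    message = packet[4 + topic_length:2 + msg_length].decode("utf-8")
    return msg_length, topic_length, topic, message


packet = build_publish_qos0("/python/mqtt", "messages: 0")
fields = parse_publish_qos0(packet)
```

For this input `fields` is `(25, 12, "/python/mqtt", "messages: 0")`: 2 bytes of topic length, 12 bytes of topic and 11 bytes of message add up to a Msg Length of 25.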
Conclusion
We saw how chatty protocols are, especially this one. In this case we analyzed MQTT over TCP, but there's another case that deserves our attention: MQTT over WebSocket. We didn't cover the subscriber part here, but you should try it yourself.
That's all 👋