Build own Kubernetes - NodePort Service networking
Jonatan Ezron
Posted on October 21, 2022
In the previous articles, we established connectivity between pods across nodes using virtual network devices and some iptables rules.
In this article, we are going to build the iptables rules for a NodePort service, so that connections can reach the cluster from outside. I found the kubernetes-services-and-iptables blog post helpful for the iptables rules behind some of the services.
This article continues from the network architecture of the last article.
First, to keep our rules organized and well structured, we are going to create custom chains and put our rules there. We start by deleting all the rules from the last article, creating a new chain called KUBE-SERVICES, and hooking it into the PREROUTING and OUTPUT chains so that every packet going through those chains is affected by our rules:
iptables -t nat -N KUBE-SERVICES
iptables -t nat -A PREROUTING -j KUBE-SERVICES
iptables -t nat -A OUTPUT -j KUBE-SERVICES
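As an optional sanity check, you can confirm the new chain is hooked into both places:
iptables -t nat -L PREROUTING -n --line-numbers
iptables -t nat -L OUTPUT -n --line-numbers
Each should list a single KUBE-SERVICES jump.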
Now we create a custom chain for our ClusterIP service called KUBE-SVC-1, and add a rule to KUBE-SERVICES that matches the desired destination IP (the virtual cluster IP, 172.17.10.10) and port and jumps to it:
iptables -t nat -N KUBE-SVC-1
iptables -t nat -A KUBE-SERVICES -d 172.17.10.10/16 -p tcp -m tcp --dport 3001 -j KUBE-SVC-1
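A small note on the mask: because of the /16, -d 172.17.10.10/16 actually matches the whole 172.17.0.0/16 network (iptables masks the address when storing the rule), which is why the listings below show 172.17.0.0/16 as the destination. You can see the normalized rule with:
iptables -t nat -S KUBE-SERVICES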
Now, for each request to the ClusterIP service, we want to load balance across the existing pods as before, so we add the same DNAT statistic rules as before, this time to KUBE-SVC-1:
iptables -t nat -A KUBE-SVC-1 -m statistic --mode nth --every 4 --packet 0 -p tcp -m tcp -j DNAT --to-destination 10.0.1.2:8080
iptables -t nat -A KUBE-SVC-1 -m statistic --mode nth --every 3 --packet 0 -p tcp -m tcp -j DNAT --to-destination 10.0.1.3:8080
iptables -t nat -A KUBE-SVC-1 -m statistic --mode nth --every 2 --packet 0 -p tcp -m tcp -j DNAT --to-destination 10.0.2.2:8080
iptables -t nat -A KUBE-SVC-1 -p tcp -m tcp -j DNAT --to-destination 10.0.2.3:8080
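A quick word on the statistic numbers: the first rule takes 1 of every 4 packets (25%); of the 3 in 4 that fall through, the second takes 1 of every 3 (another 25% of the total); of the remaining half, the third takes 1 of every 2 (25% again); and the last rule catches everything that is left (the final 25%), so the four pods are balanced evenly.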
If you list all the iptables rules, the output should be as follows (this output was captured a little later in the process, so it already shows the KUBE-NODEPORTS and KUBE-MARK-MASQ chains we create further down):
root@91cda5c7dade:/agent# iptables --list -n -t nat --line-number
Chain PREROUTING (policy ACCEPT)
num target prot opt source destination
1 KUBE-SERVICES all -- 0.0.0.0/0 0.0.0.0/0
Chain INPUT (policy ACCEPT)
num target prot opt source destination
Chain OUTPUT (policy ACCEPT)
num target prot opt source destination
1 KUBE-SERVICES all -- 0.0.0.0/0 0.0.0.0/0
Chain POSTROUTING (policy ACCEPT)
num target prot opt source destination
Chain KUBE-MARK-MASQ (0 references)
num target prot opt source destination
1 MARK all -- 0.0.0.0/0 0.0.0.0/0 MARK or 0x4000
Chain KUBE-NODEPORTS (1 references)
num target prot opt source destination
1 KUBE-SVC-1 tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:30001
Chain KUBE-SERVICES (2 references)
num target prot opt source destination
1 KUBE-SVC-1 tcp -- 0.0.0.0/0 172.17.0.0/16 tcp dpt:3001
2 KUBE-NODEPORTS all -- 0.0.0.0/0 0.0.0.0/0
Chain KUBE-SVC-1 (2 references)
num target prot opt source destination
1 DNAT tcp -- 0.0.0.0/0 0.0.0.0/0 statistic mode nth every 4 tcp to:10.0.1.2:8080
2 DNAT tcp -- 0.0.0.0/0 0.0.0.0/0 statistic mode nth every 3 tcp to:10.0.1.3:8080
3 DNAT tcp -- 0.0.0.0/0 0.0.0.0/0 statistic mode nth every 2 tcp to:10.0.2.2:8080
4 DNAT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp to:10.0.2.3:8080
If you make a request to http://172.17.10.10:3001 from inside the node or from a pod, it will work as before.
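As a quick sanity check (the loop below is just an example), you can hit the cluster IP a few times from inside a node and watch the responses rotate over the four pods:
for i in $(seq 1 8); do curl -s http://172.17.10.10:3001; echo; done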
Now for the NodePort it should be fairly easy: a NodePort service opens a selected port, from the range 30000-32767, on every node, so that a client can reach the service by sending a request to any node on that port.
Basically, we need to forward incoming requests on that port to the ClusterIP load-balancing rules that already exist.
Let's say we pick NodePort 30001, create a KUBE-NODEPORTS chain, and add it to our KUBE-SERVICES chain:
iptables -t nat -N KUBE-NODEPORTS
iptables -t nat -A KUBE-NODEPORTS -p tcp -m tcp --dport 30001 -j KUBE-SVC-1
iptables -t nat -A KUBE-SERVICES -j KUBE-NODEPORTS
If we make a request from our environment, outside of the node, to NodeIP:30001:
❯ curl http://172.17.0.3:30001
HOSTNAME:91cda5c7dade IP:10.0.1.2
❯ curl http://172.17.0.3:30001
HOSTNAME:91cda5c7dade IP:10.0.1.3
❯ curl http://172.17.0.3:30001
^C
The third request hangs without returning a response. This happens because the last two backends are pods on the second node: when such a pod replies, the response arrives at the process outside of the cluster (our host) with a source IP it does not expect, so the connection never completes. We therefore need to MASQUERADE these outgoing requests; Kubernetes does this by marking every service request with the MARK module and then MASQUERADEing the marked packets.
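If you want to see what is going wrong, you can run a packet capture on the host while repeating the curl (a rough sketch; docker0 is an assumption for a setup where the nodes are containers on the default Docker bridge, adjust the interface to your environment):
tcpdump -ni docker0 'tcp port 30001 or tcp port 8080'
The SYN leaves for 172.17.0.3:30001, but the reply from the pod on the second node, if it arrives at all, carries the pod's 10.0.2.x address as its source, so the host's TCP stack cannot match it to the connection it opened.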
Let's start by adding a KUBE-MARK-MASQ chain, putting a mark rule in it, and inserting it as the first rule in KUBE-SERVICES, matching our ClusterIP destination and port, and in KUBE-NODEPORTS, matching our NodePort port:
iptables -t nat -N KUBE-MARK-MASQ
iptables -t nat -A KUBE-MARK-MASQ -j MARK --set-xmark 0x4000/0x4000
iptables -t nat -I KUBE-SERVICES 1 ! -s 10.0.0.0/16 -d 172.17.10.10/16 -p tcp -m tcp --dport 3001 -j KUBE-MARK-MASQ
iptables -t nat -I KUBE-NODEPORTS 1 -p tcp -m tcp --dport 30001 -j KUBE-MARK-MASQ
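A short note on the 0x4000/0x4000 value/mask pair: the mask restricts the rule to the single 0x4000 bit of the packet mark, so setting it here (and testing it later in POSTROUTING) leaves any other mark bits used by other tools untouched; this is the same bit kube-proxy uses for its masquerade mark.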
Now we also want to mark requests forwarded to our pods, so we create a chain for each pod forward called KUBE-SVC-1-X, where X is just a serial number; each chain marks the packet and forwards it to the relevant pod:
# pod1
iptables -t nat -N KUBE-SVC-1-1
iptables -t nat -A KUBE-SVC-1-1 -s 10.0.1.2/16 -j KUBE-MARK-MASQ
iptables -t nat -A KUBE-SVC-1-1 -p tcp -m tcp -j DNAT --to-destination 10.0.1.2:8080
# pod2
iptables -t nat -N KUBE-SVC-1-2
iptables -t nat -A KUBE-SVC-1-2 -s 10.0.1.3/16 -j KUBE-MARK-MASQ
iptables -t nat -A KUBE-SVC-1-2 -p tcp -m tcp -j DNAT --to-destination 10.0.1.3:8080
# pod3
iptables -t nat -N KUBE-SVC-1-3
iptables -t nat -A KUBE-SVC-1-3 -s 10.0.2.2/16 -j KUBE-MARK-MASQ
iptables -t nat -A KUBE-SVC-1-3 -p tcp -m tcp -j DNAT --to-destination 10.0.2.2:8080
# pod4
iptables -t nat -N KUBE-SVC-1-4
iptables -t nat -A KUBE-SVC-1-4 -s 10.0.2.3/16 -j KUBE-MARK-MASQ
iptables -t nat -A KUBE-SVC-1-4 -p tcp -m tcp -j DNAT --to-destination 10.0.2.3:8080
And add these chains to the KUBE-SVC-1 chain in place of the rules we wrote earlier (delete the old KUBE-SVC-1 rules before running these commands):
iptables -t nat -A KUBE-SVC-1 -m statistic --mode nth --every 4 --packet 0 -j KUBE-SVC-1-1
iptables -t nat -A KUBE-SVC-1 -m statistic --mode nth --every 3 --packet 0 -j KUBE-SVC-1-2
iptables -t nat -A KUBE-SVC-1 -m statistic --mode nth --every 2 --packet 0 -j KUBE-SVC-1-3
iptables -t nat -A KUBE-SVC-1 -j KUBE-SVC-1-4
Now we add a MASQUERADE rule to the POSTROUTING chain:
iptables -t nat -A POSTROUTING -m mark --mark 0x4000/0x4000 -j MASQUERADE
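If the conntrack tool is installed (purely optional, for inspection), you can list the tracked connections and see the NodePort destination next to the pod address it was translated to:
conntrack -L -p tcp | grep 30001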
Don't forget to apply all the iptables rules we added on the second node as well!
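One possible shortcut (a sketch, not a requirement) is to dump the nat table on the first node once all of the rules below are in place and replay it on the second node, instead of retyping everything. Note that a plain restore replaces the second node's entire nat table, so only do this if that table holds nothing else you need:
# on the first node, after finishing the rules
iptables-save -t nat > /tmp/kube-nat.rules
# copy the file to the second node, then run there:
iptables-restore < /tmp/kube-nat.rules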
We just need to change the last NodePort rule in KUBE-SERVICES to match only packets addressed to the node itself:
# delete the old one
iptables -t nat -D KUBE-SERVICES 3
iptables -t nat -A KUBE-SERVICES -m addrtype --dst-type LOCAL -j KUBE-NODEPORTS
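The addrtype match with --dst-type LOCAL restricts the NodePort rules to packets whose destination address is one of the node's own addresses. If you are curious which addresses qualify, the kernel keeps them in the local routing table (just an inspection command, not part of the setup):
ip route show table local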
All of your iptables rules in the nat table should now be as follows:
root@91cda5c7dade:/agent# iptables --list -n -t nat --line-number
Chain PREROUTING (policy ACCEPT)
num target prot opt source destination
1 KUBE-SERVICES all -- 0.0.0.0/0 0.0.0.0/0
Chain INPUT (policy ACCEPT)
num target prot opt source destination
Chain OUTPUT (policy ACCEPT)
num target prot opt source destination
1 KUBE-SERVICES all -- 0.0.0.0/0 0.0.0.0/0
Chain POSTROUTING (policy ACCEPT)
num target prot opt source destination
1 MASQUERADE all -- 0.0.0.0/0 0.0.0.0/0 mark match 0x4000/0x4000
Chain KUBE-MARK-MASQ (6 references)
num target prot opt source destination
1 MARK all -- 0.0.0.0/0 0.0.0.0/0 MARK or 0x4000
Chain KUBE-NODEPORTS (1 references)
num target prot opt source destination
1 KUBE-MARK-MASQ tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:30001
2 KUBE-SVC-1 tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:30001
Chain KUBE-SERVICES (2 references)
num target prot opt source destination
1 KUBE-MARK-MASQ tcp -- !10.0.0.0/16 172.17.0.0/16 tcp dpt:3001
2 KUBE-SVC-1 tcp -- 0.0.0.0/0 172.17.0.0/16 tcp dpt:3001
3 KUBE-NODEPORTS all -- 0.0.0.0/0 0.0.0.0/0 ADDRTYPE match dst-type LOCAL
Chain KUBE-SVC-1 (2 references)
num target prot opt source destination
1 KUBE-SVC-1-1 all -- 0.0.0.0/0 0.0.0.0/0 statistic mode nth every 4
2 KUBE-SVC-1-2 all -- 0.0.0.0/0 0.0.0.0/0 statistic mode nth every 3
3 KUBE-SVC-1-3 all -- 0.0.0.0/0 0.0.0.0/0 statistic mode nth every 2
4 KUBE-SVC-1-4 all -- 0.0.0.0/0 0.0.0.0/0
Chain KUBE-SVC-1-1 (1 references)
num target prot opt source destination
1 KUBE-MARK-MASQ all -- 10.0.0.0/16 0.0.0.0/0
2 DNAT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp to:10.0.1.2:8080
Chain KUBE-SVC-1-2 (1 references)
num target prot opt source destination
1 KUBE-MARK-MASQ all -- 10.0.0.0/16 0.0.0.0/0
2 DNAT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp to:10.0.1.3:8080
Chain KUBE-SVC-1-3 (1 references)
num target prot opt source destination
1 KUBE-MARK-MASQ all -- 10.0.0.0/16 0.0.0.0/0
2 DNAT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp to:10.0.2.2:8080
Chain KUBE-SVC-1-4 (1 references)
num target prot opt source destination
1 KUBE-MARK-MASQ all -- 10.0.0.0/16 0.0.0.0/0
2 DNAT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp to:10.0.2.3:8080
And now if we make the request from our host:
❯ curl http://172.17.0.3:30001
HOSTNAME:91cda5c7dade IP:10.0.1.2
❯ curl http://172.17.0.3:30001
HOSTNAME:91cda5c7dade IP:10.0.1.3
❯ curl http://172.17.0.3:30001
HOSTNAME:f76b32aa3747 IP:10.0.2.2
❯ curl http://172.17.0.3:30001
HOSTNAME:f76b32aa3747 IP:10.0.2.3
Everything is working!
All of the work we have done in the last two posts needs to happen automatically on each pod creation and deletion; in the next article, we will automate this process in code!