Kubernetes Services: Expose Your App to the Internet
Olufisayo Bamidele
Posted on November 23, 2023
In the previous articles in this series, we talked about the main pillars of Kubernetes: clusters, the control plane, nodes, workloads, pods, containers, and services. We dockerized a simple Node.js echo server and ran it on our machine inside a single-node Kubernetes cluster. In this article, we run that same workload in a Kubernetes cluster in the Cloud, but before that, let's talk about a topic I glossed over quickly in the last article: Services.
Kubernetes Services
While writing the last articles, I realized that the word "service" is overloaded in tech; it could mean a web backend application, a daemon running in an OS, a cloud offering, or even a module in a codebase. A "Service" is also a thing in Kubernetes, and it differs from all of the meanings above. For clarity, we will be very specific when discussing Services in Kubernetes: we will call them k8s-Services.
What is a K8S-Service?
A k8s-service is an abstraction that controls communication with a target group of Pods within a cluster and the exposure of those pods outside a Cluster. To be more specific, a k8s-service abstracts the networking of related pods.
Why Do We Need K8s-Services
In the last part of this series, we mentioned that a Pod gets assigned a unique IP address that we can use to communicate with the Pod inside the cluster. So why can't we talk to Pods directly using their IP addresses? Technically, we can, but we shouldn't. Pods, by nature, are short-lived. They come in and out of existence depending on many factors, such as:
The Pod's health: If a Pod exceeds its allocated memory or keeps failing its health checks, it stops receiving traffic. Once Kubernetes detects this, it kills that Pod and replaces it with a new one.
The configuration of the Deployment that started the Pod: Pods are usually started as members of a Deployment. A Deployment is a Kubernetes configuration that describes the desired state of a Pod, including how many replicas of it we would like to have. Combined with autoscaling, it is also possible to specify the number of Pods you want when a specific event occurs. For example, one can add a configuration like this: "If total CPU usage goes above 60%, create a new replica of this Pod to handle the extra requests" (see the sketch below). With a configuration like that, replicas of a Pod are expected to come in and out of existence depending on how intensively each Pod is being used.
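As a rough sketch (not part of the original project), the "scale up at 60% CPU" rule above would typically be expressed with a HorizontalPodAutoscaler that points at the Deployment; the names and numbers below are hypothetical:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: node-echo-hpa
spec:
  scaleTargetRef: # the Deployment whose replica count is managed
    apiVersion: apps/v1
    kind: Deployment
    name: node-echo-deployment
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60 # add replicas when average CPU goes above 60%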
Each new Pod replica gets its own unique IP address, so relying on Pod IP addresses is unreliable: every replacement Pod shows up with a new one. See the image below.
K8s-services solve the problem described above. To communicate with a group of Pods without manually keeping track of their IP addresses, we create a k8s-service that sits between a request and the destination Pods. Like Pods, k8s-services get assigned unique IP addresses on creation, but unlike Pods, k8s-services are not ephemeral because nothing capable of crashing runs inside them. They exist until deleted; they are essentially a data structure that gets translated into network routing configuration on the operating system.
With the setup in the image above, clients of the Pod replicas do not need to remember the IP address of each Pod. That responsibility is given to the k8s-service. Clients only need to remember the k8s-service's IP address.
Geek Bit ℹ️: The image above shows that k8s-services keep track of destination Pod IP addresses in a table. While most of the diagram oversimplifies a Kubernetes cluster, this part is fairly literal. A component of Kubernetes called kube-proxy is responsible for translating your k8s-service specification into network configuration. The configuration is usually implemented on your OS as iptables NAT rules or as IPVS rules. Most cloud service providers, like AWS and Google, run kube-proxy in iptables mode. If you're running Kubernetes on Linux, you can view the translation of your k8s-service configuration as NAT rules by running
sudo iptables -t nat -L KUBE-SERVICES -n | column -t
As a Kubernetes user, of course, you are usually not concerned with this implementation detail unless you're an administrator.
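Closer to day-to-day use, you can also ask Kubernetes which Pod IP addresses a k8s-service is currently routing to. A minimal example, assuming a service named node-echo-service like the one we create below:

kubectl get endpoints node-echo-service

The ENDPOINTS column lists the IP:port pairs of the Pods that currently match the service's selector.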
Types Of k8s-Service
ClusterIP K8S-Service
Whenever you create a k8s-service without specifying a type, ClusterIP is the type Kubernetes creates. When you create a ClusterIP k8s-service, Kubernetes assigns it a private static IP address, which routes requests to the matching Pods. ClusterIPs only allow communication within the cluster; machines outside the cluster cannot reach Pods through a ClusterIP k8s-service unless you expose it through something called an Ingress.
We will cover ingresses in another part of this series. See the illustration below.
Defining a ClusterIP in Kubernetes
This example edits the k8s-service example in the last article.
file_name: node-echo-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: node-echo-service
spec:
  selector:
    app: node-echo # the label of the group of pods that this service maps to
  ports:
    - port: 80
      targetPort: 5001
selector: app: node-echo instructs Kubernetes to put this ClusterIP k8s-service in front of any Pod with the label app=node-echo. port: 80 is the port the service binds to. targetPort: 5001 is the port that our container is listening on; that is where the k8s-service will forward traffic to.
To create the service, run kubectl apply -f node-echo-service.yaml. If the configuration does not contain a syntax error, you should get an output that says service/node-echo-service created on your terminal.
To confirm the creation of our service, type
kubectl get services
You should see the following output.
Geek Bit ℹ️: Pay attention especially to the EXTERNAL-IP column. Notice that the value is none. Also, notice that the CLUSTER-IP column shows a class A private IP address. Networking 101: private IPs are only usable within a LAN, and the Kubernetes cluster's network is the LAN in this scenario. This ClusterIP k8s-service cannot receive internet traffic. The other two kinds of k8s-services are built on top of ClusterIP.
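If you want to poke at a ClusterIP k8s-service from your own machine without exposing it, one option (a quick sketch, not part of the original walkthrough) is kubectl's port forwarding:

kubectl port-forward service/node-echo-service 8080:80
curl -d "hello" 127.0.0.1:8080

This tunnels local port 8080 to the service's port 80 for as long as the port-forward command is running; it is a debugging convenience, not a way to expose the service.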
NodePort K8S-Service
NodePort k8s-services build on top of ClusterIP. In addition to getting a static private IP address, a NodePort k8s-service receives traffic from outside the Kubernetes cluster by opening a port on every Node. Traffic through these open ports can hit any Node, and the service forwards the requests to the available matching Pods. When configuring a NodePort k8s-service, we deal with three ports.
targetPort: The destination port of the matching Pods. Usually, the port that your container is listening on
port: The port that the k8s service binds to
nodePort: The port on each Node that accepts public traffic
See the illustration below.
💡 I like to describe NodePort K8s-Service as a k8s-service whose public IP address is the address of every Node in the Kubernetes Cluster.
Although NodePort k8s-services allow clients from outside the Cluster to communicate with our Pods, they are not production-ready for the following reasons:
- NodePort k8s-services only allow traffic on ports 30000 to 32767 (by default). Those are non-standard ports in a production environment. Browsers and HTTP clients default to port 80 for HTTP and port 443 for HTTPS; any other port would require users to specify it explicitly. Imagine having to remember the port of every website that you visit.
- NodePort k8s-services receive internet traffic through all the Nodes available in the cluster. This is problematic in production because clients must keep track of all those IP addresses. At the very least, you need a static, permanent, public IP address associated with the k8s-service for your workload to be production-ready. You can achieve this through the creation of an Ingress (don't think about this for now) or by using the next k8s-service type, called LoadBalancer.
Defining a NodePort K8s-service
To define a NodePort k8s-service, we need to add two new properties to the configuration in the ClusterIP section.
- Under the spec property, add the property "type" whose value is NodePort, i.e., type: NodePort
- Under the ports entry, add the property "nodePort", whose value is any port you choose within the NodePort range, i.e., nodePort: 30000
file_name: node-echo-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: node-echo-service
spec:
  type: NodePort # telling k8s that we are talking about NodePort
  selector:
    app: node-echo # the label of the group of pods that this service maps to
  ports:
    - port: 80
      targetPort: 5001
      nodePort: 30000 # The port that receives traffic from the internet
Note: If you skip the nodePort property, Kubernetes will automatically choose a value for you.
Submit the new configuration to Kubernetes by running kubectl apply -f node-echo-service.yaml. If your configuration contains no syntax error, you should get an output that says service/node-echo-service configured.
To see the result, run kubectl get services node-echo-service -o wide
. Your result should look similar to the screenshot below.
Pay attention to the TYPE and PORT(S) columns. The TYPE column now says "NodePort," and the PORT(S) column maps the k8s-service's port 80 to port 30000 on our machine.
We can now communicate with our workload by running curl -d "amazing" 127.0.0.1:30000
GeekBit ℹ️: NodePort is not useless in production; it's just not suitable for most web applications. Suppose I run a compute-intensive workload (say, image processing) in Kubernetes and I have dedicated an entire cluster to it. I want to balance incoming tasks across Nodes so that every Node in the cluster always has roughly the same number of tasks running on it. I'd go for a NodePort k8s-service and set externalTrafficPolicy to Local, ensuring that traffic hitting a Node is only served by Pods on that Node. Finally, I'd put a network load balancer in front of the k8s-service. Don't worry if you don't understand all of that yet; keep following this series, and it'll eventually make sense.
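For the curious, that setting is just one extra line in the service spec. A rough sketch based on the NodePort configuration above (the comment marks the only addition):

apiVersion: v1
kind: Service
metadata:
  name: node-echo-service
spec:
  type: NodePort
  externalTrafficPolicy: Local # only Pods on the receiving Node serve the traffic
  selector:
    app: node-echo
  ports:
    - port: 80
      targetPort: 5001
      nodePort: 30000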
LoadBalancer K8s-service
With the LoadBalancer k8s-service type, the service gets a static public IP address. This is what we want in production for web servers 🤗. The IP address is then announced across the underlying network infrastructure.
See the illustration below.
⚠️ Note that Kubernetes doesn't ship with a network load balancer by default, so one would usually have to install a plugin such as MetalLB for load balancing. You don't need to worry about this, though, since your cloud provider makes a load balancer available to your cluster by default.
There are two other types of Kubernetes services which I am intentionally skipping in this part. As we dive deeper into Kubernetes networking in the future, I will talk about these in more detail.
Defining a LoadBalancer K8s-service
To define a k8s-service of type LoadBalancer, take the yaml config file from the NodePort section and:
- Change the type from NodePort to LoadBalancer, i.e., type: LoadBalancer
- Remove the nodePort property.
The resulting yaml should look like so:
file_name: node-echo-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: node-echo-service
spec:
  type: LoadBalancer
  selector:
    app: node-echo # the label of the group of pods that this service maps to
  ports:
    - port: 80
      targetPort: 5001
Apply the configuration by running kubectl apply -f node-echo-service.yaml. You should get the following output: service/node-echo-service configured
Running kubectl get services node-echo-service -o wide
, you should get an output similar to the screenshot below.
Observe the EXTERNAL-IP column. It now says localhost; this is because I don't have a load balancer installed in my local cluster, but in the Cloud you would get a public IP address.
If we run curl -d "load balancers are amazing" localhost
without specifying any port, we should get those exact words echoed back to us.
Deploying our Kubernetes Workload to the Cloud
From the first part of this series up to this article, we have covered the basics we need to nail down before deploying workloads to a Kubernetes cluster in the Cloud. Now it's time to do the do. Let's take our workload to the Cloud.
I chose Google Cloud for this demo because I've had more experience with Kubernetes on GCP.
Step 1: Set up the projects we want to deploy
We will be using the project we used in the previous part.
Clone the repository and duplicate the folder "k8s-node-echo".
Rename the duplicate folder with a name of your choice. I'm calling mine "k8s-node-echo-with-loadbalancer".
cd into the k8s-node-echo-with-loadbalancer folder.
Step 2: Build The Project As a Docker Image
Create a Docker Hub account. Docker Hub is like GitHub, but for Docker images; this is where we will push our image. Note: Docker Hub is not the only place we can push our images, just as GitHub is not the only place to push our code; it is simply one of the popular destinations for open-source Docker images. Take note of your username while signing up. It will be useful later on.
Log in to your Docker Hub account from Docker Desktop by clicking the Login icon, as shown in the screenshot below.
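If you prefer the terminal (or don't use Docker Desktop), logging in with the Docker CLI works just as well:

docker login

It will prompt for your Docker Hub username and password (or an access token) and store the credentials for subsequent docker push commands.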
Back in your terminal, in our project folder, run the following command
docker build -f Dockerfile . -t <yourdockerhubusername>/node-echo:v1 -t <yourdockerhubusername>/node-echo:latest
For example, for me, that would be
docker build -f Dockerfile . -t ngfizzy/node-echo:v1 -t ngfizzy/node-echo:latest
🚨 If you're working on an Apple silicon (M1 or newer) MacBook, remember to add the --platform linux/amd64 flag, i.e., docker build --platform linux/amd64 -f Dockerfile . -t ngfizzy/node-echo:v1 -t ngfizzy/node-echo:latest. This is because the ARM architecture (which M-series chips are based on) is not the default architecture most cloud service providers use.
The -t flag specifies the name of your image. The image name contains three parts.
ngfizzy: your docker hub username
node-echo: our application name
v1: the version of our application
The second -t option simply tags the same image with latest as well.
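If you forget the second -t at build time, you can always add the alias afterwards (a hypothetical example using my image name):

docker tag ngfizzy/node-echo:v1 ngfizzy/node-echo:latest

docker tag creates another name for the same image ID, so there is nothing extra to rebuild.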
Running docker images | grep "REPOSITORY\|ngfizzy"
should show you more information about the image you just built, like the screenshot below
Step 3: Push the image to Docker Hub
Run
docker push ngfizzy/node-echo:v1
If everything works out, your output should look similar to mine in the screenshot below.
Run the same command for ngfizzy/node-echo:latest, i.e., docker push ngfizzy/node-echo:latest. If you visit your Docker Hub account, you should be able to see those images there now. Here's mine:
Step 4: Update your node-echo-deployment.yaml file
- Clear the excessive comments I used for explaining the file in the previous article of this series
- Update the image property to <yourdockerhubusername>/node-echo:latest (mine is ngfizzy/node-echo:latest)
- Change the imagePullPolicy property's value to Always
The resulting configuration should look like this.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: node-echo-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: node-echo
  template:
    metadata:
      labels:
        app: node-echo
    spec:
      containers:
        - name: node-echo
          image: ngfizzy/node-echo:latest
          imagePullPolicy: Always
          resources:
            limits:
              cpu: 1
              memory: 256Mi
          ports:
            - name: node-echo-port
              containerPort: 5001
          livenessProbe:
            httpGet:
              path: /
              port: node-echo-port
          readinessProbe:
            httpGet:
              path: /
              port: node-echo-port
          startupProbe: # health check used only while the container starts up
            httpGet:
              path: /
              port: node-echo-port
Only the image and imagePullPolicy lines changed in the configuration above.
Step 5: Update the node-echo-service.yaml file
Replace the content of node-echo-service.yaml with the LoadBalancer
configuration in the load balancer section of this article. Here's the configuration to save you from scrolling
apiVersion: v1
kind: Service
metadata:
  name: node-echo-service
spec:
  type: LoadBalancer
  selector:
    app: node-echo # the label of the group of pods that this service maps to
  ports:
    - port: 80
      targetPort: 5001
Step 6: Create a GKE Cluster on GCP
- Create a GCP account if you don't already have one
- Create a GCP project if you don't already have one
- On the home page, click on the
Create GKE Cluster
button as shown in the image below
⚠️ If you have not previously enabled the Cloud Compute and GKE APIs, you'll be prompted to do so; follow the prompts. When you're done, return to the home page and click the Create GKE Cluster button again.
You'd be presented with the following settings page after clicking.
For demo purposes, we will accept all the default settings and click the submit button at the bottom of the screen. That should redirect you to the page shown in the screenshot below.
It takes a couple of minutes for the Cluster to be created.
Step 7: Install the gcloud CLI if you've not done that already
Follow the instructions here https://cloud.google.com/sdk/docs/install
Step 8: Log in to Google Cloud on your CLI and set your gcloud project
gcloud auth login
gcloud config set project <your-project-id>
Step 9: Connect to your GKE Cluster on your local machine
On the GKE Cluster page, click on the connect button. Follow the numbers in the screenshot below for Navigation.
After clicking connect, a pop-up box appears; click the copy icon to copy the connection command to your clipboard. Go back to your CLI and paste the command. You should get the following output.
Fetching cluster endpoint and auth data.
kubeconfig entry generated for <your cluster name>.
Confirm that you are now connected by running kubectl config get-contexts; you should see at least one entry in the table, and the name of one of them should start with gke_.
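For reference, the command you copy from the console typically looks like the sketch below; the cluster name, region, and project ID are placeholders, so copy the real command from your own console rather than this one:

gcloud container clusters get-credentials <your-cluster-name> --region <your-region> --project <your-project-id>

It fetches the cluster's endpoint and credentials and writes them into your local kubeconfig, which is why kubectl can talk to the cluster afterwards.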
Deploy To GKE Cluster
Now that we are connected to the GKE cluster:
- Apply our deployment by running kubectl apply -f node-echo-deployment.yaml. You might get a warning saying Warning: autopilot-default-resources-mutator: Autopilot updated Deployment... Don't worry about this.
- Apply your k8s-service config by running kubectl apply -f node-echo-service.yaml
- Confirm the deployment by running kubectl get all. You should see an output similar to the screenshot below.
Test your deployment
- Run kubectl get services node-echo-service
- Copy the IP address under the EXTERNAL-IP column and send a POST request to it like the one below:
curl -d "hello world" 34.123.423.124
The server will echo "hello world" back to you.
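One thing to be aware of (not shown in the original screenshots): right after you apply the service, the EXTERNAL-IP column may show <pending> while GCP provisions the load balancer. You can wait for the address with:

kubectl get services node-echo-service -w

The -w flag watches the resource and prints a new line whenever it changes, so you'll see the public IP appear once the load balancer is ready.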
Summary
In this article, we took a more detailed look at Kubernetes Services and then used that knowledge to deploy a simple server to the Internet. We are still just scratching the surface of Kubernetes. In this series, I aim to gradually reveal container and Kubernetes features until we can paint a complete picture of how everything works from end to end.
In the next article, we will come full circle and take a deeper look at containers, exploring how they do what they do.