Networking Fundamentals for Software Developers | Part 1: OSI and TCP/IP models
Bogdan Milojević
Posted on July 24, 2023
Welcome, welcome, welcome
Hi there again! As you already know (you probably don't, since you have better things to do), I want to become a better developer, so I decided to learn more about networking. Naturally, I spent the past few weeks wrapping my head around layers, protocols, reverse proxies and other buzzwords in the networking domain.
I'll share the fundamental concepts I've learned along the way in this article series. Networking is a huge topic, so I won't go into too much detail for various reasons, one of them being that, unfortunately, I don't know everything about everything (yet), but I'll share a list of useful resources for further learning.
Part one focuses on a high-level overview of the TCP/IP and OSI models, with a concrete example to guide us and light the dark path ahead.
I'm particularly proud of all the illustrations, which I designed myself.
Bit of History: Protocol Wars
Once upon a time, a few lads had a great idea. They wanted to connect two computers and share data between them. This revolutionary concept led to the birth of ARPANET. It didn't take long for other companies and organisations to jump aboard and build their own networks.
These networks utilized proprietary technology and architecture, with their own set of protocols in place. What happened when two computers from different networks wanted to communicate? Radio silence. They couldn't understand each other. To overcome this issue, the need for common rules and standards in the form of a networking model became evident.
After numerous meetings and intense discussions known as the 'Protocol wars', two models emerged: the one we mostly use, TCP/IP, and the OSI model. Both models are simply sets of standards which engineers adhere to when developing computer networks.
Eventually, the TCP/IP protocol stack won, since it proved to be more practical. Leading vendors as well as ARPA switched to TCP/IP, marking the end of a decades-long dialogue. Although TCP/IP became the dominant protocol stack, the OSI model retained its usefulness as a reference.
If you have taken any networking classes, you may have noticed that educators primarily use the OSI model when explaining computer network concepts. This is not a big issue, since layers 1 to 4 cover essentially the same ground in both models. When network engineers refer to the application layer, they'll usually call it "Layer 7" regardless of the protocol stack.
The bottom line is that we need a standardized way of establishing communication between two systems. Imagine you had to develop multiple versions of your app: one that works over Wi-Fi, one for fibre-optic cables and one for every other medium there is. Fortunately, we don't have to do this, since a long time ago some smart people agreed upon a standardized, globally used networking model that we now take for granted.
One, Two, Three, Four… Seven?
Both TCP/IP and OSI models organize functionalities into separate layers. Each layer has a set of protocols that operate on it. Dividing the functionalities into layers provides easier development, maintenance, and interoperability of network protocols.
For example, a layer 3 data unit (packet) can be transmitted across various layer 1 media such as radio waves, copper or fibre-optic cables. The underlying technology doesn't matter as long as we follow common standards.
Let's finally take a look at the TCP/IP and OSI models:
The OSI model consists of seven layers, while the TCP/IP model has four (a five-layer variant is also commonly used, as in the picture above). As already mentioned, OSI is a conceptual model, something to refer to. In practice, the TCP/IP stack handles presentation and session layer functionality within the application layer.
When considering the flow of data within the protocol stack, each layer from 1 to 4 has its own named Protocol Data Unit (PDU), while layers 5 to 7 simply deal with data:
- L1 - Bits
- L2 - Frame
- L3 - Packet
- L4 - Segment
- L5, L6, L7 - Data
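To make the nesting of PDUs a bit more tangible, here is a minimal sketch of encapsulation using plain JavaScript objects. Everything in it is made up for illustration: the function names `encapsulate` and `describeNesting`, the ports, and the IP and MAC addresses are assumptions, and real headers are binary structures with many more fields.

```javascript
// Illustrative only: each layer wraps the PDU from the layer above
// and adds its own (heavily simplified) header.
function encapsulate(data) {
  const segment = {
    pdu: 'segment', // layer 4
    header: { srcPort: 51000, dstPort: 80 },
    payload: data,
  };
  const packet = {
    pdu: 'packet', // layer 3
    header: { srcIp: '192.168.1.10', dstIp: '203.0.113.5' },
    payload: segment,
  };
  const frame = {
    pdu: 'frame', // layer 2
    header: { srcMac: 'aa:aa:aa:aa:aa:aa', dstMac: 'bb:bb:bb:bb:bb:bb' },
    payload: packet,
  };
  return frame;
}

// Walk from the outermost PDU inwards and list what wraps what.
function describeNesting(pdu) {
  const names = [];
  let current = pdu;
  while (current !== null && typeof current === 'object') {
    names.push(current.pdu);
    current = current.payload;
  }
  names.push('data');
  return names.join(' > ');
}

const exampleFrame = encapsulate('GET /v2/providers.json HTTP/1.1');
console.log(describeNesting(exampleFrame)); // frame > packet > segment > data
```

On the receiving side the same thing happens in reverse: each layer strips its own header and hands the payload up.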
Journey Through The Layers
Let's take a journey through the protocol stack with a concrete example. In this example we'll send a simple GET request with the JavaScript fetch API and explore what happens at each layer:
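fetch('http://api.apis.guru/v2/providers.json')
  .then(response => response.json())
  .then(json => console.log(json))
  // Log network or parsing failures instead of leaving the rejection unhandled
  .catch(error => console.error('Request failed:', error))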
We are taking a top-to-bottom approach, meaning L7 will be the first one covered.
Once we send a GET request via HTTP, the request line (HTTP method, request path and protocol version) and headers such as Host (the hostname of the destination server) and Accept are filled in. The HTTP protocol defines a number of headers. You can see a full list of standardized HTTP headers here. It's possible to define custom headers if your application needs them.
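To get a feel for what actually goes on the wire at layer 7, here is a rough sketch that builds the text of an HTTP/1.1 request by hand. The helper name `buildHttpRequest` and the choice of headers are my own assumptions for illustration; a real browser's fetch sends more headers than this.

```javascript
// A minimal sketch: assemble the request line and headers as plain text.
// HTTP/1.1 separates lines with CRLF and ends the header block with an empty line.
function buildHttpRequest(method, rawUrl, extraHeaders = {}) {
  const url = new URL(rawUrl);
  const headers = {
    Host: url.host, // hostname (plus port, if it is not the default one)
    Accept: 'application/json',
    Connection: 'close',
    ...extraHeaders, // custom headers, if the application needs them
  };
  const lines = [`${method} ${url.pathname}${url.search} HTTP/1.1`];
  for (const [name, value] of Object.entries(headers)) {
    lines.push(`${name}: ${value}`);
  }
  return lines.join('\r\n') + '\r\n\r\n';
}

console.log(buildHttpRequest('GET', 'http://api.apis.guru/v2/providers.json'));
```

Notice that there is no IP address anywhere in this text. Resolving the hostname to an address and delivering the bytes is a job for the layers below.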
For simplicity's sake, we are going to skip layers 5 and 6, like true network engineers.
Layer 4 - Full Message Delivery
As data passes through Layer 4, it is encapsulated within a TCP segment, which includes the source and destination ports as well as a sequence number. Knowing the destination and source ports is crucial, as it ensures that the data is directed to the correct process on both systems. Additionally, the TCP header contains other fields, as illustrated.
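Here is a small sketch of how the first few TCP header fields are laid out in bytes, assuming a plain 20-byte header without options. The function names `buildTcpHeader` and `parseTcpHeader` are hypothetical, the port and sequence numbers are arbitrary, and the checksum is left at zero, so this is a teaching aid rather than something you could put on a real network.

```javascript
// TCP header fields are stored in network byte order (big-endian),
// which is also what DataView uses by default.
function buildTcpHeader({ srcPort, dstPort, seq, ack = 0, flags = 0 }) {
  const bytes = new Uint8Array(20);
  const view = new DataView(bytes.buffer);
  view.setUint16(0, srcPort);  // bytes 0-1: source port
  view.setUint16(2, dstPort);  // bytes 2-3: destination port
  view.setUint32(4, seq);      // bytes 4-7: sequence number
  view.setUint32(8, ack);      // bytes 8-11: acknowledgment number
  view.setUint8(12, 5 << 4);   // data offset: 5 words of 32 bits = 20 bytes
  view.setUint8(13, flags);    // flag bits, e.g. SYN = 0x02, ACK = 0x10
  view.setUint16(14, 65535);   // window size
  // Checksum (bytes 16-17) and urgent pointer (18-19) stay zero in this sketch.
  return bytes;
}

function parseTcpHeader(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  return {
    srcPort: view.getUint16(0),
    dstPort: view.getUint16(2),
    seq: view.getUint32(4),
    ack: view.getUint32(8),
    headerLength: (view.getUint8(12) >> 4) * 4,
    flags: view.getUint8(13),
  };
}

const SYN = 0x02;
const rawHeader = buildTcpHeader({ srcPort: 51000, dstPort: 80, seq: 1000, flags: SYN });
console.log(parseTcpHeader(rawHeader));
```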
Layer 4 is responsible for end-to-end message transmission. It's got two major protocols under its belt, TCP and UDP: the dynamic duo of data delivery!
TCP is a stateful protocol designed to ensure reliable, ordered and complete data transmission; it will never let you down. UDP, on the other hand, doesn't guarantee delivery or ordering. It transmits data without verifying that it reaches the destination, a bit like sending a message in a bottle and hoping it ends up in the right hands.
When using TCP, a three-way handshake is performed before sending actual application data in order to establish a reliable connection with the server. It's like asking the server "Hey man, I want to send you some data, are you able to communicate with me?"
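The three messages of the handshake (SYN, SYN-ACK, ACK) and how their sequence and acknowledgment numbers relate can be sketched like this. The function name `threeWayHandshake` and the initial sequence numbers are assumptions for illustration; real stacks pick unpredictable initial sequence numbers and handle 32-bit wraparound, which I ignore here.

```javascript
// Simulate the three segments exchanged before any application data is sent.
function threeWayHandshake(clientIsn, serverIsn) {
  // 1. Client proposes its initial sequence number (ISN).
  const syn = { from: 'client', flags: ['SYN'], seq: clientIsn };
  // 2. Server answers with its own ISN and acknowledges the client's SYN.
  const synAck = { from: 'server', flags: ['SYN', 'ACK'], seq: serverIsn, ack: syn.seq + 1 };
  // 3. Client acknowledges the server's SYN; the connection is established.
  const ack = { from: 'client', flags: ['ACK'], seq: syn.seq + 1, ack: synAck.seq + 1 };
  return [syn, synAck, ack];
}

for (const message of threeWayHandshake(1000, 5000)) {
  console.log(message);
}
```

Each side acknowledges the other's sequence number plus one, which is how both ends confirm they can send and receive.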
Layer 3 - All About IP
Way down we go. All the way down to the network layer, home of the famous IP address. When sending a message to your buddy, you need to know his number, right? It's pretty much the same concept with networking.
Our TCP segment gets wrapped once more, in an IP packet this time, whose header includes the source IP address, the destination IP address and a bunch of other fields.
Every device with internet access has its own IP address. Go to whatismyip.com and find out your IP address and then share it with us in the comments! Just kidding, don't share it with anyone, not a good idea.
IP addresses are built for routing, i.e. finding a path for data to travel the vast network, across thousands of routers, switches, servers and other beautiful stuff.
Your public IP address is the address of your router, i.e. your network address, assigned to you by your ISP (Internet Service Provider). It's the address that is visible to the external world, and it is used to identify your network on the global internet.
Every device connected to your router has its own private address, which is reachable only within your local network. Network address translation (NAT) is the process of mapping multiple private IPs to a single public IP address. NAT is a crucial mechanism for conserving public IP addresses, as there is a limited number of IPv4 addresses available.
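A home router typically does this by also rewriting ports (often called port address translation). Below is a toy sketch of such a translation table. The class name `NatTable`, its methods, the starting port and all addresses are assumptions I made up for the example; real NAT implementations also track protocol, timeouts and connection state.

```javascript
// Toy NAT table: many private ip:port pairs share one public IP,
// each distinguished by a different public port.
class NatTable {
  constructor(publicIp, firstPort = 40000) {
    this.publicIp = publicIp;
    this.nextPort = firstPort;
    this.outbound = new Map(); // "privateIp:privatePort" -> public port
    this.inbound = new Map();  // public port -> { ip, port }
  }

  // Called when a device in the local network sends a packet out.
  translateOutbound(privateIp, privatePort) {
    const key = `${privateIp}:${privatePort}`;
    if (!this.outbound.has(key)) {
      const publicPort = this.nextPort++;
      this.outbound.set(key, publicPort);
      this.inbound.set(publicPort, { ip: privateIp, port: privatePort });
    }
    return { ip: this.publicIp, port: this.outbound.get(key) };
  }

  // Called when a reply arrives at the public address.
  translateInbound(publicPort) {
    return this.inbound.get(publicPort) ?? null;
  }
}

const nat = new NatTable('203.0.113.7');
console.log(nat.translateOutbound('192.168.1.10', 51000));
console.log(nat.translateOutbound('192.168.1.11', 51000));
console.log(nat.translateInbound(40001));
```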
Layer 2 - Home of The Switch
The data link layer will take an IP packet and wrap it in a frame with source and destination MAC addresses. A MAC address is also known as a burned-in address. Every network interface has a MAC address assigned by its manufacturer, which is meant to be unique.
While IP addresses are used for end-to-end transmission, MAC addresses are used for hop-to-hop transmission. If you are sending a message to a computer outside of your network (or subnet), all packets will be sent to your router, also known as the default gateway.
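How does a host decide whether the destination is inside or outside its network? Roughly, it compares the network part of both addresses using the subnet mask. Here is a sketch of that decision for IPv4; the function names (`ipToInt`, `sameSubnet`, `firstHop`) and the addresses are assumptions for illustration.

```javascript
// Convert dotted IPv4 notation to an unsigned 32-bit integer.
function ipToInt(ip) {
  return ip.split('.').reduce((acc, octet) => ((acc << 8) | Number(octet)) >>> 0, 0);
}

// Two addresses are in the same subnet if their network bits match.
function sameSubnet(ipA, ipB, prefixLength) {
  const mask = prefixLength === 0 ? 0 : (0xffffffff << (32 - prefixLength)) >>> 0;
  return ((ipToInt(ipA) & mask) >>> 0) === ((ipToInt(ipB) & mask) >>> 0);
}

// Local destinations are reached directly, everything else goes via the gateway.
function firstHop(srcIp, dstIp, prefixLength, gatewayIp) {
  return sameSubnet(srcIp, dstIp, prefixLength) ? dstIp : gatewayIp;
}

console.log(firstHop('192.168.1.10', '192.168.1.20', 24, '192.168.1.1')); // 192.168.1.20
console.log(firstHop('192.168.1.10', '203.0.113.5', 24, '192.168.1.1')); // 192.168.1.1
```

The frame's destination MAC address belongs to whatever this first hop is, while the destination IP address in the packet stays the same all the way.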
While a router connects multiple networks, a switch connects multiple hosts in a local network. Traditionally, switches were exclusively layer 2 devices. Nowadays there are multi-layer switches that can operate on layer 3 as well.
ARP, usually placed at layer 2 (or between layers 2 and 3), is the protocol used to map IP addresses to MAC addresses. Here is an awesome video that further explains MAC addresses and data link layer.
Layer 1 - I'm All About That Bits
Layer one defines the physical connection between two devices and handles sending digital data through physical media such as copper wires, radio waves and optical cables. Layer one deals with raw bits and handles their transmission.
In other words, the main job of layer one is to take digital bits and transmit them using electrical signals, electromagnetic waves or light. Depending on the medium, this process is known as line coding or modulation.
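Just to demystify the "raw bits" part: any data the upper layers hand down is ultimately a sequence of bytes, and every byte is eight bits. A tiny sketch (the helper name `toBits` is my own) shows what the first three characters of our request look like at that level:

```javascript
// Encode text as UTF-8 bytes and print each byte as eight bits.
function toBits(text) {
  const bytes = new TextEncoder().encode(text);
  return Array.from(bytes, byte => byte.toString(2).padStart(8, '0')).join(' ');
}

console.log(toBits('GET')); // 01000111 01000101 01010100
```

How those ones and zeros become voltages, radio waves or light pulses is where the physical layer standards take over.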
Now back to our example. Finally, your network card will take data link frames and convert them into appropriate physical signals depending on the medium and transmit them over to your router. Your router then forwards the packets to your ISP.
Your ISP's router determines the next-hop router based on the destination IP address and its routing table. Of course, our computer isn't directly connected to the backend server. There is a whole jungle of switches, routers, firewalls, proxies etc. in between, so this process is repeated until the packet reaches its destination.
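A routing table lookup usually follows the "longest prefix match" rule: among all routes that contain the destination address, the most specific one wins. Here is a simplified sketch of that idea. The route entries, next-hop names and function names are invented for the example, and real routers use far more efficient data structures than a linear scan.

```javascript
// Convert dotted IPv4 notation to an unsigned 32-bit integer.
function ipToInt(ip) {
  return ip.split('.').reduce((acc, octet) => ((acc << 8) | Number(octet)) >>> 0, 0);
}

// Does the address fall inside the given CIDR block, e.g. "203.0.113.0/24"?
function matchesPrefix(ip, cidr) {
  const [network, lengthText] = cidr.split('/');
  const length = Number(lengthText);
  const mask = length === 0 ? 0 : (0xffffffff << (32 - length)) >>> 0;
  return ((ipToInt(ip) & mask) >>> 0) === ((ipToInt(network) & mask) >>> 0);
}

// Pick the matching route with the longest prefix.
function lookupNextHop(routes, dstIp) {
  let best = null;
  for (const route of routes) {
    const length = Number(route.prefix.split('/')[1]);
    if (matchesPrefix(dstIp, route.prefix) && (best === null || length > best.length)) {
      best = { length, nextHop: route.nextHop };
    }
  }
  return best ? best.nextHop : null;
}

const exampleRoutes = [
  { prefix: '0.0.0.0/0', nextHop: 'isp-core' },        // default route
  { prefix: '203.0.113.0/24', nextHop: 'peer-a' },
  { prefix: '203.0.113.128/25', nextHop: 'peer-b' },
];

console.log(lookupNextHop(exampleRoutes, '203.0.113.200')); // peer-b
console.log(lookupNextHop(exampleRoutes, '203.0.113.5'));   // peer-a
console.log(lookupNextHop(exampleRoutes, '8.8.8.8'));       // isp-core
```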
This resource is great for learning more about routing.
Connection Is An Illusion
I briefly mentioned the concept of a TCP connection, but the thing is, it doesn't exist (in a sense). When I think about a connection, I imagine two sides tightly coupled to one another. In networking, of course, that's not the case: there are many nodes between the two sides, without a direct link between them.
A connection is just an abstraction which is only known to the end parties; nothing in between is aware of it. At layer 3, each packet that is part of this "connection" can take a different route from side A to side B. Additionally, packets may arrive out of order, and some may be dropped and re-sent, depending on many different variables (the internet is a packet-switched network, after all).
TCP takes control of this and it makes sure data arrives reliably and in the correct order. Software that is using TCP sockets can take this for granted, send and receive data smoothly without worrying about the underlying process.
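To show the ordering part of that job, here is a heavily simplified sketch of how a receiver could put segments back in order using their sequence numbers. The function name `reassemble` and the sample segments are made up, payloads are assumed to be ASCII text so that one character equals one byte, and real TCP adds acknowledgments, retransmission timers and flow control on top.

```javascript
// Rebuild the byte stream from segments that may arrive out of order or twice.
// Each segment's seq is the offset of its first byte in the stream.
function reassemble(segments, initialSeq = 0) {
  const bySeq = new Map();
  for (const segment of segments) {
    if (!bySeq.has(segment.seq)) {
      bySeq.set(segment.seq, segment.data); // ignore duplicate deliveries
    }
  }
  const ordered = [...bySeq.entries()].sort((a, b) => a[0] - b[0]);

  let expectedSeq = initialSeq;
  let stream = '';
  for (const [seq, data] of ordered) {
    if (seq !== expectedSeq) break; // gap: a segment is missing, wait for a resend
    stream += data;
    expectedSeq = seq + data.length;
  }
  return stream;
}

const received = [
  { seq: 6, data: 'world' },
  { seq: 0, data: 'hello ' },
  { seq: 6, data: 'world' }, // duplicate
];
console.log(reassemble(received)); // hello world
```

The application reading from the socket only ever sees the reassembled stream.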
So a socket is an operating-system object created on behalf of a process to represent a network connection (client/server) or listening port (server) and referred to by that process in system calls used for communicating over that connection.
Don't confuse TCP sockets with WebSocket, which is a layer 7 protocol.
Grain of Salt
In the real world, things often aren't cut and dried. There are protocols that span multiple layers or sit between layers, without clear separation. Networking models are our best effort to make sense of everything that happens, but that's not possible all of the time, so take everything with a grain of salt.
To add to that, different networks may use different protocols, equipment and tech, but the important thing is that everything is compatible and we can build software regardless of the underlying infrastructure. As a software developer, you really want to focus on layer 7 and layer 4, since that's what you'll most often interact with.
So thank you for reading this far, now you have a high-level overview of networking models and fundamental concepts. But we are far from done. Stay tuned.
Further learning: