kazauwa

Igor Perepilitsyn

Posted on September 8, 2022

Intro to HTTP

The Hypertext Transfer Protocol (HTTP) is a set of rules for exchanging data over the internet. It was invented to standardize communication, e.g., how browsers should request data from servers and how servers should respond. HTTP is used almost everywhere on the web, so as a web developer you will interact with the protocol a lot.

The basic request

HTTP/1.1 is a text-based protocol, so a human can easily read its messages. Here is a simplified version of the message a browser generates to ask for the main page of dev.to:

GET / HTTP/1.1
Host: dev.to
User-Agent: Chrome/102.26.590.728
Accept-Language: en-US

The first line, called the request line, states what we want. We tell the server that we want to read (GET) the resource at the root path / (as in dev.to/). The line ends with the protocol version, HTTP/1.1.
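
To make the three parts of the request line concrete, here is a minimal sketch that splits it apart. The function name `parse_request_line` is made up for this illustration, and the code assumes a well-formed line with single spaces between the parts.

```python
def parse_request_line(line):
    """Split an HTTP/1.1 request line into (method, target, version)."""
    # A well-formed request line has exactly three space-separated parts.
    method, target, version = line.strip().split(" ")
    return method, target, version


method, target, version = parse_request_line("GET / HTTP/1.1")
print(method)   # GET
print(target)   # /
print(version)  # HTTP/1.1
```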

Headers

The second and following lines contain additional information called headers. They carry the metadata that the client and server need to communicate correctly. A header consists of a name and a value separated by a colon. Each line holds exactly one header. The protocol does not limit their number, but servers cap the total size of a request they are willing to handle.

In HTTP/1.1, headers are optional, except for Host, which specifies the domain of the requested resource. The last two headers in the example above identify our browser and our preferred language.
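
As a rough sketch of the name/value structure, the snippet below reads the headers of the example request into a dictionary. The helper `parse_headers` is a made-up name, and the parsing is deliberately simplified: it assumes CRLF line endings and keeps only the last value if a header repeats.

```python
RAW_REQUEST = (
    "GET / HTTP/1.1\r\n"
    "Host: dev.to\r\n"
    "User-Agent: Chrome/102.26.590.728\r\n"
    "Accept-Language: en-US\r\n"
    "\r\n"
)


def parse_headers(raw):
    """Return the headers of a raw HTTP/1.1 message as a dict.

    Header names are case-insensitive, so they are lower-cased here.
    """
    head = raw.split("\r\n\r\n", 1)[0]      # everything before the empty line
    header_lines = head.split("\r\n")[1:]   # skip the request line
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


headers = parse_headers(RAW_REQUEST)
print(headers["host"])              # dev.to
print(headers["accept-language"])   # en-US
```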

The complete list of available headers is long, and you don’t need to know all of them: in most cases, the browser handles them for you.

Response

And here’s how a simplified HTTP response may look:

HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 31890
Content-Encoding: gzip

<response body>

The first line, called the status line, consists of the HTTP version, a status code, and a short reason phrase (200 OK here) that briefly describes the result of the request. As in requests, the following lines contain headers. For instance, the server tells our browser that the response contains an HTML page (Content-Type: text/html).

An HTTP response can have a response body, separated from the headers by an empty line: an HTML page, a file (a bunch of bytes), or nothing at all. The body is what the browser renders on our screen.
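
The layout of a response can be sketched the same way. The example below is an illustration only: `split_response` is a made-up helper, the tiny HTML body is invented, and the code assumes an uncompressed response with CRLF line endings.

```python
RAW_RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 13\r\n"
    b"\r\n"
    b"<p>Hello</p>\n"
)


def split_response(raw):
    """Split a raw response into (status_code, reason, headers, body)."""
    # The first empty line separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers, body


code, reason, headers, body = split_response(RAW_RESPONSE)
print(code, reason)                # 200 OK
print(headers["content-type"])     # text/html
print(len(body))                   # 13
```

Note that Content-Length counts the bytes of the body only, not the status line or the headers.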

That’s it for the basic layout of an HTTP message.

Cookies

Cookies are small pieces of information that persist across requests. They are used to “remember” a user on a website. For example, a cookie typically keeps you logged in, which is why you don’t have to authenticate each time you visit a website. Cookies are also one of the reasons you see targeted ads. They are stored in your browser and sent back with every request to the site that set them.

In terms of HTTP, cookies are headers. They are initially sent with the response:

Set-Cookie: <cookie-name>=<cookie-value>

If the browser has saved cookies, it passes them in subsequent requests in the Cookie header:

Cookie: name=value
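
Both headers can be produced and parsed with Python's standard `http.cookies` module. This is a small sketch, not a full session implementation; the cookie names `session_id` and `theme` and their values are invented for the example.

```python
from http.cookies import SimpleCookie

# Server side: build the Set-Cookie header for a response.
response_cookie = SimpleCookie()
response_cookie["session_id"] = "abc123"
set_cookie_header = response_cookie.output()
print(set_cookie_header)  # Set-Cookie: session_id=abc123

# Server side, on a later request: read the Cookie header the browser sent.
request_cookie = SimpleCookie()
request_cookie.load("session_id=abc123; theme=dark")
print(request_cookie["session_id"].value)  # abc123
print(request_cookie["theme"].value)       # dark
```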

Methods

Methods, or HTTP verbs, describe which action we want to perform on a resource. The action is executed on the server side; hence methods are only used in requests. Nine methods are in common use: eight are defined in the core HTTP specification, and PATCH is defined in a separate RFC. Some of them may seem identical, but three properties help us tell them apart:

  • Safety — action only reads the resource and does not change it.
  • Idempotence — doing the same action repeatedly will produce the same result.
  • Cacheability — the response can be saved somewhere and retrieved for later use.
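
One practical use of these properties is deciding whether a failed request can be repeated automatically. The sketch below encodes the safety and idempotence values this article assigns to each method; the function name `can_retry_automatically` is hypothetical, and real HTTP clients apply more nuanced rules.

```python
# Values taken from the per-method tables in this article.
SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}
IDEMPOTENT_METHODS = SAFE_METHODS | {"PUT", "DELETE"}


def can_retry_automatically(method):
    """Repeating an idempotent request should not change the outcome."""
    return method.upper() in IDEMPOTENT_METHODS


print(can_retry_automatically("GET"))   # True
print(can_retry_automatically("PUT"))   # True
print(can_retry_automatically("POST"))  # False
```

Every safe method is also idempotent, which is why the second set is built from the first.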

Now, let’s have a look at each method. I’ll leave out CONNECT and TRACE because they are pretty rare.

GET

Safety Idempotence Cacheability
+ + +

GET is the most frequently used method on the internet. It is used to read a resource. When navigating a website in a browser, we make a GET request.

GET / HTTP/1.1
Host: httpbin.org

The server responds with:

HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Content-Length: 9593

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>httpbin.org</title>
...

POST

Safety Idempotence Cacheability
- - sometimes

The POST method sends data to the server. The data is placed in the request body, which works the same way as the body of an HTTP response. POST alters the state of a resource, often creating a new instance of something. It is the second most frequently used type of request on the internet. When you submit a form on a website, your browser typically sends a POST request.

POST /post HTTP/1.1
Host: httpbin.org
Content-Type: multipart/form-data; boundary=--------------------------878806948835616453637143
Content-Length: 285

----------------------------878806948835616453637143
Content-Disposition: form-data; name="login"

kazauwa
----------------------------878806948835616453637143
Content-Disposition: form-data; name="password"

supersecret
----------------------------878806948835616453637143--
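
For comparison, here is a sketch of the same form prepared with Python's standard library. Note the assumptions: it uses the simpler `application/x-www-form-urlencoded` encoding rather than the multipart encoding shown above, it assumes the `https` scheme for httpbin.org, and it only builds the request object without sending anything.

```python
from urllib.parse import urlencode
from urllib.request import Request

form = {"login": "kazauwa", "password": "supersecret"}

# The request body must be bytes; urlencode produces "key=value&key=value".
body = urlencode(form).encode("ascii")

request = Request(
    "https://httpbin.org/post",
    data=body,
    headers={"Content-Type": "application/x-www-form-urlencoded"},
    method="POST",
)

print(request.get_method())  # POST
print(body)                  # b'login=kazauwa&password=supersecret'
```

Passing the object to `urllib.request.urlopen` would actually send it.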

PUT

Safety Idempotence Cacheability
- + -

PUT is like POST, but idempotent: it creates the resource at the given URL or replaces it entirely, so repeating the same PUT request leaves the server in the same state. Repeating a POST, in contrast, can lead to unexpected effects. Imagine submitting a registration form multiple times and creating multiple identical users.
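
The difference is easy to see in a toy model. The snippet below is not a real server: `users` stands in for a database, and `post_user` and `put_user` are hypothetical handlers for `POST /users` and `PUT /users/<id>`.

```python
import itertools

users = {}                      # hypothetical in-memory "database"
_next_id = itertools.count(1)   # source of fresh ids for POST


def post_user(data):
    """POST /users: store the data under a new id on every call."""
    user_id = next(_next_id)
    users[user_id] = dict(data)
    return user_id


def put_user(user_id, data):
    """PUT /users/<id>: create or replace whatever is stored under that id."""
    users[user_id] = dict(data)


post_user({"login": "kazauwa"})
post_user({"login": "kazauwa"})
print(len(users))  # 2

users.clear()
put_user(1, {"login": "kazauwa"})
put_user(1, {"login": "kazauwa"})
print(len(users))  # 1
```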

PATCH

Safety Idempotence Cacheability
- - -

Also known as partial PUT. Unlike POST, it cannot be used in HTML forms. PATCH is meant to send only partial changes to a resource (say, we only want to change a username), but in practice not every server implements it.
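
A small sketch of the difference between replacing and partially updating a resource follows. It models the resource as a dictionary and applies a simple merge, which is one possible way a server might interpret a PATCH body; the email address is invented for the example.

```python
def put(resource, new_representation):
    """PUT replaces the whole resource with what the client sent."""
    return dict(new_representation)


def patch(resource, changes):
    """PATCH (merge style) applies only the listed changes and keeps the rest."""
    return {**resource, **changes}


user = {"username": "kazauwa", "email": "igor@example.com"}

print(patch(user, {"username": "igor"}))
# {'username': 'igor', 'email': 'igor@example.com'}

print(put(user, {"username": "igor"}))
# {'username': 'igor'}
```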

DELETE

Safety Idempotence Cacheability
- + -

DELETE is used to delete a resource. A request body is allowed but not required. As with PUT, repeating the request should leave the server in the same state. After all, you can’t delete something twice.
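
Idempotence is about the state on the server, not about the status code the client gets back. In the toy handler below (hypothetical names, in-memory storage), the second call answers differently, yet the stored data ends up the same. Some servers choose to answer 204 both times; that is a design decision.

```python
documents = {"42": "draft"}  # hypothetical stored resources


def delete_document(doc_id):
    """Remove a document and return an HTTP status code."""
    if doc_id in documents:
        del documents[doc_id]
        return 204  # No Content: deleted, nothing to send back
    return 404      # Not Found: it is already gone


print(delete_document("42"))  # 204
print(delete_document("42"))  # 404
print(documents)              # {}
```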

HEAD

Safety Idempotence Cacheability
+ + +

HEAD works almost like GET, except that the response must not have a body. If a body is present, the client must ignore it. HEAD is used to fetch headers that help plan a subsequent request. For instance, before downloading a file, it may be a good idea to check its size (the Content-Length header) and see whether we have room to save it.
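
To try this without touching a real website, the sketch below starts a throwaway server on the local machine and sends it a HEAD request. Everything about the server (the `Handler` class, the tiny HTML body) is made up for the demonstration; only the standard library is used.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"<p>Hello</p>\n"


class Handler(BaseHTTPRequestHandler):
    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()

    def do_GET(self):
        self._send_headers()
        self.wfile.write(BODY)

    def do_HEAD(self):
        # Same headers as GET, but nothing is written after them.
        self._send_headers()

    def log_message(self, format, *args):
        pass  # silence the default request logging


# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

connection = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
connection.request("HEAD", "/")
response = connection.getresponse()
size = int(response.getheader("Content-Length"))
body = response.read()
connection.close()
server.shutdown()
server.server_close()

print(size)  # 13
print(body)  # b''
```

The client learns the size of the page without downloading it.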

OPTIONS

Safety Idempotence Cacheability
+ + -

OPTIONS asks a server which methods can be performed on a resource. They are listed in the Allow header of the response. You probably won’t use it directly, but browsers sometimes send it on your behalf, for example as a CORS preflight request. Not many servers implement this method.

OPTIONS / HTTP/1.1
Host: httpbin.org

The server responds with:

HTTP/1.1 200 OK
Allow: OPTIONS, HEAD, GET
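
A client could turn the Allow header into a set and check it before sending a request. This is a minimal sketch with a made-up helper name, `allowed_methods`.

```python
def allowed_methods(allow_header):
    """Turn 'OPTIONS, HEAD, GET' into {'OPTIONS', 'HEAD', 'GET'}."""
    return {m.strip().upper() for m in allow_header.split(",") if m.strip()}


allow = allowed_methods("OPTIONS, HEAD, GET")
print("GET" in allow)   # True
print("POST" in allow)  # False
```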

Status codes

Every HTTP response has a status code, a 3-digit number that indicates whether the request was successful or not. Status codes are divided into 5 groups:

  • 1xx — informational
  • 2xx — successful
  • 3xx — redirection
  • 4xx — client errors
  • 5xx — server errors
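
Since the first digit determines the group, a status code can be classified with integer division. The sketch below also looks up the standard reason phrase with Python's `http.HTTPStatus`; the `describe` helper is a made-up name, and it raises an error for codes that `HTTPStatus` does not know.

```python
from http import HTTPStatus

GROUPS = {
    1: "informational",
    2: "successful",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code):
    """Return the code with its reason phrase and group name."""
    group = GROUPS[code // 100]        # the first digit selects the group
    phrase = HTTPStatus(code).phrase   # e.g. "Not Found" for 404
    return f"{code} {phrase} ({group})"


print(describe(200))  # 200 OK (successful)
print(describe(404))  # 404 Not Found (client error)
print(describe(503))  # 503 Service Unavailable (server error)
```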

Let’s take a look at some of the common codes. The complete list is much longer.

1xx

Informational codes notify a client of an intermediate status of their request. You probably won’t ever see them.

2xx

Status codes from this group indicate that the request was accepted and successfully processed by a server:

  • 200 OK — success. Most of the time, you will see this status code.
  • 201 Created — request resulted in creating a new resource. It is usually sent in response to POST and PUT requests.
  • 202 Accepted — request is accepted but not yet processed. For example, the operation takes a long time to finish and is handled by some other application that runs once a day.
  • 204 No Content — the request succeeded and the body is empty, but headers may be useful. You may see it in response to a HEAD request. It is also commonly returned for unsafe requests such as PUT or DELETE when the server has nothing to send back.

3xx

These codes indicate that a client needs to take action to finish the request.

  • 301 Moved Permanently — the URL of the resource has changed. The new one is given in the Location header of the response.
  • 302 Found — similar to the previous one, except that the change is temporary. For example, search engines keep the old URL in their results instead of switching to the new one.
  • 304 Not Modified — resource contents are cached and haven’t changed since the last request. It is safe to stop the current request there and use the cached data.
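
The action a client takes for each of these codes can be sketched as a small decision function. The names `next_step` and `cached_body` and the example URL are assumptions made for this illustration; real clients also limit the number of redirects they follow.

```python
def next_step(status, headers, cached_body=None):
    """Decide what a client does after receiving a response."""
    if status in (301, 302):
        # The new address is in the Location header: request it next.
        return ("request", headers["Location"])
    if status == 304:
        # Nothing changed: reuse the copy saved earlier.
        return ("use_cache", cached_body)
    return ("done", None)


print(next_step(301, {"Location": "https://example.com/new"}))
# ('request', 'https://example.com/new')

print(next_step(304, {}, cached_body=b"<p>Hello</p>"))
# ('use_cache', b'<p>Hello</p>')
```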

4xx

Status codes from this group indicate that there was a mistake in the request:

  • 400 Bad Request — the request is malformed, and the server cannot process it. There may be many reasons for that, but the error is often somewhere in the request body.
  • 401 Unauthorized — authentication is required to proceed.
  • 403 Forbidden — the user is authenticated but not allowed to access the resource.
  • 404 Not Found — the famous “page not found” code. Indicates that resource doesn’t exist. Some websites may return 404 instead of 403 to hide the existence of some pages from unauthorized users.
  • 405 Method Not Allowed — resource does not support the request method. Remember the OPTIONS method? You’ll see this status code if the requested method is not among those listed in the Allow header.
  • 429 Too Many Requests — there are too many requests from the client, and they are being rate-limited. Used against DDoS and brute-force attacks.
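
The order in which a server checks these conditions decides which code the client sees. Below is one possible ordering as a toy example; the `ACCESS` table, the user names, and the `status_for` function are all hypothetical.

```python
# Hypothetical access list: resource path -> users allowed to read it.
ACCESS = {"/reports": {"alice"}}


def status_for(path, user):
    """Pick a status code for a read request by `user` (None = anonymous)."""
    if path not in ACCESS:
        return 404  # the resource does not exist
    if user is None:
        return 401  # we don't know who is asking
    if user not in ACCESS[path]:
        return 403  # we know who is asking, and they are not allowed
    return 200


print(status_for("/reports", None))     # 401
print(status_for("/reports", "bob"))    # 403
print(status_for("/reports", "alice"))  # 200
print(status_for("/missing", "alice"))  # 404
```

A site that wants to hide the existence of /reports would return 404 in the 403 branch instead.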

5xx

These status codes indicate that the server has encountered errors during request processing:

  • 500 Internal Server Error — unhandled error on the server side. It usually happens due to bugs in the application.
  • 503 Service Unavailable — the server is not ready to process the request. The response may hint at when it will become available again in the Retry-After header.
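
A client can use that hint to decide how long to wait before trying again. The sketch below handles 503 and also 429 from the previous group. The function name and the default delay are assumptions, and only the "number of seconds" form of Retry-After is parsed; the header may also carry a date, which this sketch treats as missing.

```python
def retry_delay(status, headers, default=1.0):
    """Return the seconds to wait before retrying, or None if not retryable."""
    if status not in (429, 503):
        return None
    value = headers.get("Retry-After", "")
    if value.isdigit():
        return float(value)   # the server told us how long to wait
    return default            # no usable hint: fall back to a default


print(retry_delay(503, {"Retry-After": "120"}))  # 120.0
print(retry_delay(429, {}))                      # 1.0
print(retry_delay(500, {}))                      # None
```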

If you have any questions, feel free to ask them in the comments.
