HTTP: redirects, POST and GET requests, and the "lost data"
Arseny Zinchenko
Posted on November 23, 2019
We have a web application which has to accept POST-requests from clients.
In front of this application, there is some proxy service, no matter which – initially, we faced the issues on an AWS’s Application Load Balancer, then I reproduced them with NGINX, and it will “work” for any other proxying system.
Besides proxying – this service also performs an HTTP (80) to HTTPS (443) redirect.
The problem appears accurately during such a redirect:
- a client sends a
POST
requests via HTTP - a proxy returns 301 or 302 redirect to HTTPS
- the client then sends a request via HTTPS, but:
- in some cases this
POST
becomesGET
- or it still will be
POST
but all its data will be “lost”
- in some cases this
A testing environment setup
For the testing purpose, let’s use the next configuration:
- a NINGX accepts API-requests
- NGINX passes the via
proxy_pass
by HTTP to a backend-service- to reproduce the
POST
toGET
problem – a backend with a Go application in Docker container will be used - to reproduce the “lost” data and empty
Сontent-length
– a Python script will be used to run as a web-server
- to reproduce the
NGINX
Just a common config – NGINX, listen on 80, makes redirects to HTTPS:
server {
listen 80;
server_name dev.poc.example.com;
location / {
return 302 https://dev.poc.example.com$request_uri;
}
}
...
And a server 443 {}
– also common config with a proxy_pass
to the backend on an 8081 port:
...
server {
listen 443 ssl;
server_name dev.poc.example.com;
...
location / {
proxy_pass http://localhost:8081;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
}
Go-app in Docker
We have multiple containers in a stack there, but currently we are interested in a go-queue-consumer with a local NGINX (the whole initial system still in the Proof of Concept state now, so don’t wonder about too many NGINXes here).
We’re just interested in its logs with a POST
or GET
methods.
Python web-server
And another web-server to play the backend role to see the data length issue – quickly googled Python script:
#!/usr/bin/env python3
"""
Very simple HTTP server in python for logging requests
Usage::
./server.py [<port>]
"""
from http.server import BaseHTTPRequestHandler, HTTPServer
import logging
class S(BaseHTTPRequestHandler):
def _set_response(self):
self.send_response(200)
self.send_header('Content-type', 'text/html')
self.end_headers()
def do_GET(self):
logging.info("GET request,\nPath: %s\nHeaders:\n%s\n", str(self.path), str(self.headers))
self._set_response()
self.wfile.write("GET request for {}".format(self.path).encode('utf-8'))
def do_POST(self):
content_length = int(self.headers['Content-Length']) # <--- Gets the size of data
post_data = self.rfile.read(content_length) # <--- Gets the data itself
logging.info("POST request,\nPath: %s\nHeaders:\n%s\n\nBody:\n%s\n",
str(self.path), str(self.headers), post_data.decode('utf-8'))
self._set_response()
self.wfile.write("POST request for {}".format(self.path).encode('utf-8'))
def run(server_class=HTTPServer, handler_class=S, port=8081):
logging.basicConfig(level=logging.INFO)
server_address = ('', port)
httpd = server_class(server_address, handler_class)
logging.info('Starting httpd...\n')
try:
httpd.serve_forever()
except KeyboardInterrupt:
pass
httpd.server_close()
logging.info('Stopping httpd...\n')
if __name__ == '__main__':
from sys import argv
if len(argv) == 2:
run(port=int(argv[1]))
else:
run()
Examples: reproduce the Problem
A POST
‘s data lost after a redirect
The first thing we faced with was the fact that after HTTP to HTTPS redirects – our POST requests lose their data.
To reproduce it – let’s use the Python script mentioned above.
Run it on the 8081 port:
root@ip-10-0-15-118:/home/admin# ./test\_post.py 8081
INFO:root:Starting httpd...
And run curl
with POST
and some data in --data
:
$ curl -vL -X POST http://dev.poc.example.com/ -d "param1=value1¶m2=value2"
...
* Trying 52.***.***.224:80...
...
> Content-Length: 27
...
< HTTP/1.1 302 Moved Temporarily
...
< Content-Length: 161
< Connection: keep-alive
< Location: https://dev.poc.example.com/
...
> POST / HTTP/1.1
> Host: dev.poc.example.com
> User-Agent: curl/7.67.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 502 Bad Gateway
< Server: nginx/1.10.3
...
< Content-Type: text/html
< Content-Length: 173
Let’s see what’ happening here:
- a
POST
request is sent via HTTP, Content-Length: 27 - a 302 redirect issued to HTTPS
- and a
POST
request is sent via HTTPS, Content-Length: 173
In the NGINX logs, we can see a standard HTTP POST
:
...
==> /var/log/nginx/dev.poc.example.com-error.log <==
2019/11/23 09:52:41 [error] 19793#19793: \*51100 upstream prematurely closed connection while reading response header from upstream, client: 194.\*\*\*.\*\*\*.26, server: dev.poc.example.com, request: "POST / HTTP/1.1", upstream: "http://127.0.0.1:8081/", host: "dev.poc.example.com"
==> /var/log/nginx/dev.poc.example.com-access.log <==
194.\*\*\*.\*\*\*.26 - - [23/Nov/2019:09:52:41 +0000] "POST / HTTP/1.1" 502 173 "-" "curl/7.67.0"
...
But let’s take a look at the stderr of the Python-server running:
...
Exception happened during processing of request from ('127.0.0.1', 38224)
Traceback (most recent call last):
File "/usr/lib/python3.5/socketserver.py", line 313, in _handle_request_noblock
self.process_request(request, client_address)
File "/usr/lib/python3.5/socketserver.py", line 341, in process_request
self.finish_request(request, client_address)
File "/usr/lib/python3.5/socketserver.py", line 354, in finish_request
self.RequestHandlerClass(request, client_address, self)
File "/usr/lib/python3.5/socketserver.py", line 681, in __init__
self.handle()
File "/usr/lib/python3.5/http/server.py", line 422, in handle
self.handle_one_request()
File "/usr/lib/python3.5/http/server.py", line 410, in handle_one_request
method()
File "./test_post.py", line 22, in do_POST
content_length = int(self.headers['Content-Length']) # <--- Gets the size of data
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'
...
Pay attention at those lines:
...
content\_length = int(self.headers['Content-Length'])
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'
E.g., the application got an empty/missing Content-Length field.
Still, if issue a new request directly via HTTP (with the redirect disabled on NGINX) or via HTTPS – everything will work, as expected:
$ curl -vL -X POST https://dev.poc.example.com/ -d "param1=value1¶m2=value2"
...
> POST / HTTP/1.1
> Host: dev.poc.example.com
> User-Agent: curl/7.67.0
> Accept: */*
> Content-Length: 27
> Content-Type: application/x-www-form-urlencoded
>
* upload completely sent off: 27 out of 27 bytes
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Server: nginx/1.10.3
< Date: Sat, 23 Nov 2019 09:55:07 GMT
< Content-Type: text/html
< Transfer-Encoding: chunked
< Connection: keep-alive
<
* Connection #0 to host dev.poc.example.com left intact
POST request for /
And the Python’s application stdout:
...
INFO:root:POST request,
Path: /
Headers:
Host: dev.poc.example.com
X-Real-IP: 194.***.***.26
X-Forwarded-For: 194.***.***.26
X-Forwarded-Proto: https
Connection: close
Content-Length: 27
User-Agent: curl/7.67.0
Accept: */*
Content-Type: application/x-www-form-urlencoded
Body:
param1=value1¶m2=value2
127.0.0.1 - - [23/Nov/2019 09:55:07] "POST / HTTP/1.0" 200 -
Now, let’s go to the next second problem.
A POST
became GET
During debugging the problem discussed above, when a backend didn’t receive data during POST
I found another “exciting” behavior of a request during HTTP-redirects.
Let’s see how our POST
request will become… a GET
request!
The details and the root cause will be examined later on this post, and now let’s reproduce the problem using two HTTP-clients – curl
and Postman.
curl
Run a request with the POST
type via HTTP with -L
to follow a redirect to the HTTPS.
In this case, we are using not the Python script, used above, but our real backend Docker container to demonstrate its work and to check its logs.
The data itself, as well as any errors, have no value here – we are interested only in requests methods used – POST
and GET
.
curl -vL -X POST http://dev.poc.example.com/skin/api/v1/receipt -d "{}"
...
> POST /skin/api/v1/receipt HTTP/1.1
> Host: dev.poc.example.com
> User-Agent: curl/7.67.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 400 Bad Request
< Server: nginx/1.10.3
< Date: Sat, 23 Nov 2019 10:07:37 GMT
< Content-Type: application/json; charset=utf-8
< Content-Length: 58
< Connection: keep-alive
<
* Connection #1 to host dev.poc.example.com left intact
{"message":"Validation failed: unable to parse json body"}
Once again – do not look at the errors here – instead, let’s go to the NGINX’s logs:
==> /var/log/nginx/dev.poc.example.com-access.log <==
194.\*\*\*.\*\*\*.26 - - [23/Nov/2019:10:07:37 +0000] "POST /skin/api/v1/receipt HTTP/1.1" 400 58 "-" "curl/7.67.0"
Seems all good here? A POST
was sent – a POST
was received.
Just to check – let’s run curl
with explicitly specified GET
type to see the backend’s response and NGINX logs in this case:
$ curl -L -X GET http://dev.poc.example.com/skin/api/v1/receipt -d "{}"
404 page not found
And NGINX logs with this GET
:
==> /var/log/nginx/dev.poc.example.com-access.log <==
194.\*\*\*.\*\*\*.26 - - [23/Nov/2019:10:07:37 +0000] "POST /skin/api/v1/receipt HTTP/1.1" 400 58 "-" "curl/7.67.0"
194.\*\*\*.\*\*\*.26 - - [23/Nov/2019:10:09:57 +0000] "GET /skin/api/v1/receipt HTTP/1.1" 404 18 "-" "curl/7.67.0"
All good, all seems to be correct – no problems at all, huh?
Postman
But now, let’s use Postman to send the same request to the same endpoint: a POST
via HTTP to trigger and follow a redirect to the HTTPS:
And?…
And – let’s see NGINX logs now:
==> /var/log/nginx/dev.poc.example.com-access.log <==
194.\*\*\*.\*\*\*.26 - - [23/Nov/2019:10:07:37 +0000] "POST /skin/api/v1/receipt HTTP/1.1" 400 58 "-" "curl/7.67.0"
194.\*\*\*.\*\*\*.26 - - [23/Nov/2019:10:09:57 +0000] "GET /skin/api/v1/receipt HTTP/1.1" 404 18 "-" "curl/7.67.0"
194.\*\*\*.\*\*\*.26 - - [23/Nov/2019:10:11:44 +0000] "GET /skin/api/v1/receipt HTTP/1.1" 404 18 "http://dev.poc.example.com/skin/api/v1/receipt" "PostmanRuntime/7.19.0"
Errr…
What?!?
But I sent an explicitly specified POST
request?
Once again:
- make a request with
curl
, HTTP => HTTPS redirect issued,POST
in logs – all good - make a request with Postman, HTTP => HTTPS redirect issued, but
GET
in logs – WTF???
POST, GET and a “lost” data
Well, now we can understand where our data went – because our POST
became GET
.
The root cause, 3xx redirects, and HTTP RFC
Actually, both problems are caused by the same reason.
Let’s start our investigation by reading the 301 code description in the RFC 2616 – https://tools.ietf.org/html/rfc2616#section-10.3.2, especially its Note :
Note: When automatically redirecting a POST request after
receiving a 301 status code, some existing HTTP/1.0 user agents
will erroneously change it into a GET request.
I.e., some clients after sending POST
and receiving 301 – will change the requests’ type to the GET
.
But this only the very beginning!
During the further reading, at the 302 code description in the same RFC 2016 – https://tools.ietf.org/html/rfc2616#section-10.3.3 we’ll see another Note :
Note: RFC 1945 and RFC 2068 specify that the client is not allowed
to change the method on the redirected request. However, most
existing user agent implementations treat 302 as if it were a 303
response, performing a GET on the Location field-value regardless
of the original request method. The status codes 303 and 307 have
been added for servers that wish to make unambiguously clear which
kind of reaction is expected of the client.
RFC 1945 about 3хх redirects – https://tools.ietf.org/html/rfc1945#section-9.3
RFC 2068 about 3хх redirects – https://tools.ietf.org/html/rfc2068#section-10.3.2
I.e., RFC 1945 and RFC 2068 states that a client has no change type for a redirected request, but most of them will treat the 302 code as… 303!
Let’s go to the next part and read about the 303 code – https://tools.ietf.org/html/rfc2616#section-10.3.4:
The response to the request can be found under a different URI and
SHOULD be retrieved using a GET method on that resource.
I.e., when a client considers that it received the 303 code – it always will perform the GET.
So, what we saw with the Postman case (and our mobile application clients where we faced that issue at first):
- a client performs
POST
by HTTP - gets a redirect to the HTTPS with either 301 or 302
- considers it as 303
- and changes its own request’s type sent via HTTPS from to the
GET
, “losing” all original’s data
The Solution
I found the solution after reading the Mozilla’s documentation (although there was a hint in the RFC 2016 Notes for the 302), which says about 301 and 302 redirects:
It is therefore recommended to set the 302 code only as a response for GET or HEAD methods and to use 307 Temporary Redirect instead , as the method change is explicitly prohibited in that case.
Okay – let’s update our NGINX and change code 302 to the 307:
server {
listen 80;
...
location / {
# return 302 https://dev.poc.example.com$request_uri;
return 307 https://dev.poc.example.com$request_uri;
}
}
...
Reload configs, and repeat the request from curl
:
$ curl -L -X POST http://dev.poc.example.com/skin/api/v1/receipt -d "{}"
{"message":"Validation failed: fields 'hardware\_id' and 'receipt' are mandatory"}
And we got a valid response from the backend, which just asks for a normal data now.
Redirect worked, a POST
request was received.
And the NGINX log:
==> /var/log/nginx/dev.poc.example.com-access.log <==
194.\*\*\*.\*\*\*.26 - - [23/Nov/2019:10:07:37 +0000] "POST /skin/api/v1/receipt HTTP/1.1" 400 58 "-" "curl/7.67.0"
194.\*\*\*.\*\*\*.26 - - [23/Nov/2019:10:09:57 +0000] "GET /skin/api/v1/receipt HTTP/1.1" 404 18 "-" "curl/7.67.0"
194.\*\*\*.\*\*\*.26 - - [23/Nov/2019:10:11:44 +0000] "GET /skin/api/v1/receipt HTTP/1.1" 404 18 "http://dev.poc.example.com/skin/api/v1/receipt" "PostmanRuntime/7.19.0"
194.\*\*\*.\*\*\*.26 - - [23/Nov/2019:10:35:51 +0000] "POST /skin/api/v1/receipt HTTP/1.1" 422 81 "-" "curl/7.67.0"
194.\*\*\*.\*\*\*.26 - - [23/Nov/2019:10:36:00 +0000] "POST /skin/api/v1/receipt HTTP/1.1" 422 81 "-" "curl/7.67.0"
Repeat the request from the Postman:
NGINX logs:
==> /var/log/nginx/dev.poc.example.com-access.log <==
194.\*\*\*.\*\*\*.26 - - [23/Nov/2019:10:07:37 +0000] "POST /skin/api/v1/receipt HTTP/1.1" 400 58 "-" "curl/7.67.0"
194.\*\*\*.\*\*\*.26 - - [23/Nov/2019:10:09:57 +0000] "GET /skin/api/v1/receipt HTTP/1.1" 404 18 "-" "curl/7.67.0"
194.\*\*\*.\*\*\*.26 - - [23/Nov/2019:10:11:44 +0000] "GET /skin/api/v1/receipt HTTP/1.1" 404 18 "http://dev.poc.example.com/skin/api/v1/receipt" "PostmanRuntime/7.19.0"
194.\*\*\*.\*\*\*.26 - - [23/Nov/2019:10:35:51 +0000] "POST /skin/api/v1/receipt HTTP/1.1" 422 81 "-" "curl/7.67.0"
194.\*\*\*.\*\*\*.26 - - [23/Nov/2019:10:36:00 +0000] "POST /skin/api/v1/receipt HTTP/1.1" 422 81 "-" "curl/7.67.0"
194.\*\*\*.\*\*\*.26 - - [23/Nov/2019:10:37:57 +0000] "POST /skin/api/v1/receipt HTTP/1.1" 422 81 "http://dev.poc.example.com/skin/api/v1/receipt" "PostmanRuntime/7.19.0"
All works, as expected.
AWS Application Load Balancer redirects
Unfortunately, I wasn’t able to find how to set 307 when using AWS ALB:
That’s all, folks!
Useful links
Those problems on other sites
- POST Request redirects to GET in Nginx proxy and NodeJS
- Trouble while passing POST request through proxy_pass
- How to proxy_pass POST with headers
- Nginx removes Content-Length header for chunked content
- POST Request with response 302 and content-length 0 crashes nginx
- Reverse Proxy POST problem
- HTTP POST redirect – no Content-Length
Helped to find the solution
Similar posts
- 06/26/2019 NGINX: multi-branch deployment with Ansible, NGINX map and HTTP Headers
- 09/10/2015 NGINX: настройка SSL и HTTPS
- 09/20/2018 NGINX: gzip и ETag weak validation
Posted on November 23, 2019
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.