Kristjan Siimson
Posted on July 5, 2020
Application hanged
I’ll start by describing a situation that prompted me to explore topic of timeouts. I worked with a typical PHP and MySQL application. The server receives a HTTP request, passes it to PHP using FastCGI interface, PHP establishes a connection to MySQL database, and application then executes queries over the connection.
Client
| (HTTPS:443)
Server
| (FastCGI:9000)
PHP-FPM
| (mysqli:3306)
MySQL
The problem I encountered was related to communication between PHP and MYSQL. The application was hanging until server timed out the request. An investigation revealed a hanging query and MySQL, which caused queries on a certain table to hang indefinitely.
Problems
Apart from the obvious problem that the application was hanging, the available processes can quickly be used up. A malicious actor can exploit this in a denial-of-service attack. Also, small issues, such as a single endpoint failing, can quickly snowball into all endpoints failing.
It may also be hard to log such issues, especially if the request indeed runs indefinitely. In such cases, the only indication of a problem would be excessive number of active processes in PHP-FPM.
Timeouts
Timeouts are a way to mitigate this problems and can be configured in several layers, and to achieve optimal results certainly should be done so. The layers we’re going to look at are server, application and database. Let’s look at what each level has to provide. I will be looking at Apache and Nginx web servers, PHP and MySQL.
Web server
Web servers are the first layer where the request arrives. I tested locally with official Apache PHP docker image, and official Nginx and PHP-FPM images.
The first curiosity was Apache timeouts, which didn’t appear to have any effect whatsoever. Even with Timeout
and RequestReadTimeout
set to low values, sleep(PHP_MAX_INT);
would run obliviously for hours, until I finally closed the browser tab.
Nginx fared better – the request was terminated when fastcgi_read_timeout
was exceeded. When there was no output at the time of termination, Nginx served a static 504 error page. I didn’t find a way to gracefully shut down the PHP application however, as there was no SIGTERM
signal, and shutdown function wasn’t triggered. When I increased process_control_timeout
setting in PHP-FPM, then the PHP application continued as normal, and connection_aborted
function didn’t report anything unusual. Therefore, I would recommend to use web server timeout as a last resort.
PHP
An obvious solution to timing out the process is by setting the max_execution_time
value in PHP configuration. What is not so obvious is that max_execution_time
appears to ignore time spent on IO, such as running database queries. This is what a typical PHP application spends majority of it’s time on. Thus, it seems that it is only useful for catching problems like infinite loops. So something to think about, but not as useful as it appears.
The set_time_limit() function and the configuration directive max_execution_time only affect the execution time of the script itself. Any time spent on activity that happens outside the execution of the script such as system calls using system(), stream operations, database queries, etc. is not included when determining the maximum time that the script has been running. This is not true on Windows where the measured time is real.
Source: PHP: set_time_limit
A workaround involves using pcntl
extension, which is not supported in many environments, such as mod_php
module for Apache, because the PHP application shouldn’t be fiddling with processes in server environment.
The workaround seems fairly innocent, and is possible to use in PHP-FPM, so I will take a look at it anyway:
<?php
# https://stackoverflow.com/questions/7493676/detecting-a-timeout-for-a-block-of-code-in-php/7493838
# fork the process
$pid = pcntl_fork();
if ($pid === 0) {
# application code
# cleanup on SIGALRM
pcntl_async_signals(true);
pcntl_signal(SIGALRM, function() {
sleep(60); # simulation of cleanup work
exit;
});
# simulation of long running process
sleep(60);
} else {
# timeout process
# timing variables
$softTimeout = 5; //s
$hardTimeout = $softTimeout + 5; //s
$interval = 16000; // μs
$start = microtime(true);
pcntl_async_signals(true);
$handler = function($signo) use ($hardTimeout, $interval, $pid, $start) {
posix_kill($pid, $signo);
do {
usleep($interval);
$result = pcntl_wait($status, WNOHANG|WUNTRACED);
if ((microtime(true) - $start) >= $hardTimeout) {
# Immediate termination
posix_kill($pid, SIGKILL);
pcntl_wait($status);
break;
}
} while($result === 0);
};
# Termination signals
pcntl_signal(SIGTERM, $handler);
pcntl_signal(SIGINT, $handler);
pcntl_signal(SIGQUIT, $handler);
pcntl_signal(SIGHUP, $handler);
# Alarm signal
pcntl_signal(SIGALRM, $handler);
register_shutdown_function(function() {
pcntl_alarm(0);
});
pcntl_alarm($softTimeout);
pcntl_wait($pid);
}
EDIT: Reflecting back on this, a simpler approach without forking is also possible with pcntl_alarm
, and is probably a better choice in most situations:
<?php
$timeout = 1;
# cleanup on SIGALRM
pcntl_async_signals(true);
pcntl_signal(SIGALRM, function() {
echo "interupt";
exit;
});
register_shutdown_function(function() {
pcntl_alarm(0);
});
pcntl_alarm($timeout);
sleep(PHP_INT_MAX);
EDIT2: Preliminary testing with PHP-FPM shows a somewhat expected caveat with this approach, which is related to PHP-FPM reusing same processes for multiple requests. pcntl-alarm
will send SIGALRM
to the PHP-FPM worker, which may at time of triggering be doing nothing or processing another request. So at minimum, the script should do it's best to clean up the alarm in register_shutdown_function
. I've update the previous example accordingly.
EDIT3: I'd like to see that done in PHP-FPM, and to some extent it is possible with request_terminate_timeout
, but AFAIK this can only be set to same value for all workers in the pool, and it kills the worker (I'd prefer trying a soft termination first and not killing the worker unless necessary).
A safer approach could be to offload work using Gearman. With such approach PHP-FPM would only pass requests to Gearman. When designing such a solution, it is important to keep in mind that Gearman workers do not benefit from persistent database connections.
Database
Communication with database is where PHP applications tend to spend most of their time. It would be only natural to establish limits to database operations.
In my case, the application specified MYSQLI_OPT_CONNECT_TIMEOUT
using mysqli_options. But why didn’t that stop the long running query? PHP documentation comment section provided a clue. Turns out that connection timeout only times out the phase where PHP establishes connection to MySQL server and it doesn’t affect the queries at all. In fact, with persistent database connections, the database connections are shared between requests, and can therefore be rather long lived. PHP has another option for limiting query execution times, MYSQLI_OPT_READ_TIMEOUT
, but it does not appear in the documentation. Digging in PHP source, this is first mentioned in a non-released NEWS file for PHP 7.1 RC, but the constant doesn’t seem to make appearance before PHP 7.2 RC. And many users report success in earlier PHP versions by manually declaring the constant.
When testing this option, the behavior appears to be as follows:
- If the query execution interval is below timeout, nothing happens.
- When timeout is exceeded the connection is dropped (
2006 MySQL server has gone away
).
There seems to be more though. mysqlnd has an INI setting mysqlnd.net_read_timeout
, which reads to do the same thing, but the documentation also notes MYSQL_OPT_READ_TIMEOUT
only works for TCP/IP connections, and prior to MySQL 5.1.2, only for Windows.
Testing with mysqli revealed that both ways work identically, sockets included. However, the option can only be passed with mysqli_options
, whereas the INI setting can be also used with PDO.
Summary
Control over timeouts in PHP appears rather complicated, especially when compared to asynchronous languages like Node JS, but can certainly be done right.
Given the issues with timeouts and lack of pcntl support in Apache, Nginx + PHP-FPM combo seems like a more solid option.
The timeout strategies in PHP need a careful thought. An architecture using Gearman certainly looks promising, and I have used it with success in the past. However, I will probably give process forking pcntl_alarm
with PHP-FPM a try.
The read timeouts in MySQL is a must combined with connect timeout to avoid available worker pool exhaustion in case of a hanging query.
Posted on July 5, 2020
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.