Optimizing buffer allocation (NGINX)

September 4, 2015

Nginx uses buffers to store request and response data at various stages. Optimal
buffer allocation can help you reduce both memory consumption and CPU usage.
The following table lists directives that control buffer allocation and the stages they
are applied to:


As you can see, most of the directives take two arguments: a number and a size.
The number argument specifies the maximum number of buffers that can be
allocated per request. The size argument specifies the size of each buffer.




The preceding figure illustrates how buffers are allocated for a data stream. Part
a shows what happens when an input data stream is shorter than the buffer size
specified in the directives above. The data stream occupies only part of the buffer,
even though the space for the whole buffer is allocated from the heap. Part b shows a data
stream that is longer than a single buffer, but shorter than the longest allowed chain
of buffers. As you can see, if the buffers are used in the most efficient way, some of
them will be fully used and the last one might be only partially used. Part c shows a
data stream that is much longer than the longest chain of buffers allowed. Nginx tries
to fill all available buffers with input data and flushes them once the data is sent.
After that, empty buffers wait until more input data becomes available.
New buffers are allocated as long as there are no free buffers at hand and input data
is available. Once the maximum number of buffers is allocated, Nginx waits until
used buffers are emptied and reuses them. This makes sure that no matter how
long the data stream is, it will not consume more memory per request (the number of
buffers multiplied by the size) than specified by the corresponding directive.
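The allocation behavior described above can be modeled in a few lines. This is only a rough sketch, not Nginx's actual allocator: the names parse_size and plan_buffers are hypothetical, the limit of four 8k buffers is an arbitrary example, and the model assumes buffers are allocated one at a time until the limit is reached.

```python
import math

def parse_size(text):
    """Convert an Nginx-style size such as '40k' or '1m' to bytes."""
    units = {"k": 1024, "m": 1024 * 1024}
    suffix = text[-1].lower()
    if suffix in units:
        return int(text[:-1]) * units[suffix]
    return int(text)

def plan_buffers(stream_len, number, size):
    """Model how many buffers a stream of stream_len bytes occupies.

    number and size mirror the two arguments of a buffer directive.
    """
    size_bytes = parse_size(size)
    needed = math.ceil(stream_len / size_bytes)
    allocated = min(needed, number)  # never more than the directive allows
    return {
        "allocated": allocated,
        "peak_bytes": allocated * size_bytes,
        "reuses_buffers": needed > number,
    }

# Hypothetical limit: 4 buffers of 8k each.
case_a = plan_buffers(3 * 1024, 4, "8k")    # shorter than one buffer
case_b = plan_buffers(20 * 1024, 4, "8k")   # spans several buffers
case_c = plan_buffers(500 * 1024, 4, "8k")  # far longer than the chain
```

In this model, only case_c reaches the limit and has to reuse buffers, and its peak memory stays at the number of buffers multiplied by the size.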
The smaller the buffers, the higher the allocation overhead: Nginx needs to spend
more CPU cycles to allocate and free buffers. The larger the buffers, the higher the
memory consumption overhead. If a response occupies only a fraction of a buffer,
the remainder of the buffer is not used, even though the entire buffer has to be
allocated from the heap.
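To make the trade-off concrete, the sketch below counts allocations and unused bytes for a single response. The function name buffer_overhead and the 10 KB response are hypothetical, and the model assumes the response fits within the allowed chain so that every buffer is allocated exactly once.

```python
import math

def buffer_overhead(response_len, buffer_size):
    """Return (allocations, unused_bytes) for one response."""
    allocations = math.ceil(response_len / buffer_size)
    unused = allocations * buffer_size - response_len
    return allocations, unused

response = 10 * 1024  # hypothetical 10 KB response

small = buffer_overhead(response, 1 * 1024)   # 1k buffers: many allocations
large = buffer_overhead(response, 64 * 1024)  # 64k buffer: mostly empty
```

With 1k buffers the response needs ten allocations and wastes nothing; with a 64k buffer it needs one allocation and leaves 54 KB unused.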
The smallest part of the configuration that the buffer size directives can be
applied to is a location. This means that if a mixture of large and small responses
shares the same location, the combined buffer usage pattern will vary.
Static files are read into buffers controlled by the output_buffers directive unless
sendfile is set to on. For static files, multiple output buffers don't make much
sense, as they are filled in blocking mode anyway (this means a buffer cannot be
emptied while another one is being filled). However, larger buffers lead to a lower
system call rate. Consider the following example:
location /media {
    output_buffers 1 256k;
}
If the output buffer size is too large without threads or AIO, it can lead to long
blocking reads that will affect worker process responsiveness.
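The effect of the buffer size on the system call rate can be estimated with simple arithmetic. The sketch below assumes that every read fills the buffer completely; the function name blocking_reads, the 10 MB file, and the 32k comparison size are hypothetical.

```python
import math

def blocking_reads(file_size, buffer_size):
    """Number of read calls needed to push a file through one buffer."""
    return math.ceil(file_size / buffer_size)

file_size = 10 * 1024 * 1024  # hypothetical 10 MB media file

reads_32k = blocking_reads(file_size, 32 * 1024)    # 320 reads
reads_256k = blocking_reads(file_size, 256 * 1024)  # 40 reads
```

The larger buffer cuts the number of reads by a factor of eight, but each read moves eight times more data and therefore blocks the worker for longer.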
When a response body is pipelined from a proxied server or a FastCGI, uwsgi, or SCGI
server, Nginx is able to read data into one part of the buffers and simultaneously
send the other part to the client. This makes the most sense for long replies.


Assume you tuned your TCP stack before reading this chapter. The total size of a
buffer chain is then connected to the kernel socket's read and write buffer sizes. On
Linux, the maximum size of a kernel socket read buffer can be examined using the
following command:
$ cat /proc/sys/net/core/rmem_max
The maximum size of a kernel socket write buffer can be examined using the
following command:
$ cat /proc/sys/net/core/wmem_max
These settings can be changed using the sysctl command or via /etc/sysctl.conf
at system startup.
In my case, both of them are set to 163840 (160 KB). This is low for a real system, but
let's use it as an example. This number is the maximum amount of data Nginx can
read from or write to a socket in one system call without the socket being suspended.
With reads and writes going asynchronously, we need buffer space no less than the
sum of rmem_max and wmem_max for an optimal system call rate.
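The sum can be computed directly from the kernel settings. The following is a minimal sketch: the helper names read_limit and min_chain_bytes are hypothetical, and the fallback value 163840 is simply the example figure from the text, used when the /proc files are not readable (for instance, on a non-Linux system).

```python
from pathlib import Path

def read_limit(path, default):
    """Read an integer kernel setting, falling back to default."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return default

def min_chain_bytes(rmem_max, wmem_max):
    """Buffer space needed so that reads and writes can overlap."""
    return rmem_max + wmem_max

# Values for the machine this runs on (or the fallback from the text).
rmem = read_limit("/proc/sys/net/core/rmem_max", 163840)
wmem = read_limit("/proc/sys/net/core/wmem_max", 163840)
needed = min_chain_bytes(rmem, wmem)

# The example values from the text: 327680 bytes, that is, 320 KB.
example = min_chain_bytes(163840, 163840)
```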
Assume that Nginx proxies long files with the preceding rmem_max and wmem_max
settings. The following configuration should yield the lowest system call rate with the
minimum amount of memory per request in the most extreme case:
location @proxy {
    proxy_pass http://backend;
    proxy_buffers 8 40k;
}
The same considerations apply to the fastcgi_buffers, uwsgi_buffers, and
For short response bodies, the buffer size should be slightly larger than the predominant
size of a response. In this case, all replies will fit into one buffer, so only one allocation
per request will be needed.
For the preceding setup, assume that most of the replies fit into 128 KB, while some span
up to dozens of megabytes. The optimal buffer configuration will be somewhere
between proxy_buffers 2 160k and proxy_buffers 4 80k.
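The candidates can be compared side by side. The sketch below is illustrative only: the function name describe is hypothetical, and it assumes the 320 KB chain size (rmem_max plus wmem_max) and the 128 KB typical reply from the example.

```python
import math

KB = 1024
CHAIN_BYTES = 320 * KB    # rmem_max + wmem_max from the example
TYPICAL_REPLY = 128 * KB  # size that most replies fit into

def describe(number, size_kb):
    """Summarize one proxy_buffers candidate."""
    size = size_kb * KB
    return {
        "directive": f"proxy_buffers {number} {size_kb}k",
        "chain_bytes": number * size,
        "allocs_for_typical_reply": math.ceil(TYPICAL_REPLY / size),
    }

candidates = [describe(2, 160), describe(4, 80), describe(8, 40)]
```

All three candidates give the same 320 KB chain for long replies, but a typical reply needs one allocation with 160k buffers, two with 80k buffers, and four with 40k buffers.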


In the case of response body compression, the size of the GZIP buffer chain must
be downscaled by the average compression ratio. For the preceding setup, assume
that the average compression ratio is 3.4. The following configuration should yield the
lowest system call rate with a minimal amount of memory per request in the presence of
response body compression:
location @proxy {
    proxy_pass http://backend;
    proxy_buffers 8 40k;
    gzip on;
    gzip_buffers 4 25k;
}
In the preceding configuration, we make sure that in the most extreme case, if half of
the proxy buffers are being used for reception, the other half is ready for compression.
GZIP buffers are configured in a way that makes sure that the compressor output for
half of the uncompressed data occupies half of the output buffers, while the other half
of the buffers with compressed data is sent to the client.
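The downscaling can be checked with a short calculation. This is a sketch under the assumptions of the example (a 320 KB proxy chain and an average compression ratio of 3.4); the function name gzip_chain_bytes is hypothetical, and reading 25k as a rounded-up value is an interpretation rather than something the text states.

```python
import math

KB = 1024

def gzip_chain_bytes(proxy_chain_bytes, ratio):
    """Scale the uncompressed chain size down by the compression ratio."""
    return proxy_chain_bytes / ratio

proxy_chain = 8 * 40 * KB      # proxy_buffers 8 40k
scaled = gzip_chain_bytes(proxy_chain, 3.4)

number = 4                     # buffer count from gzip_buffers 4 25k
min_size_kb = math.ceil(scaled / number / KB)
configured = number * 25 * KB  # total size of gzip_buffers 4 25k
```

The scaled chain is roughly 94 KB, so four buffers need at least 24 KB each; the configured 25k covers that with a small margin.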
