This is part 2 of a quasi-series on hardening node.js for production systems (e.g. the Silly Face Society). The previous article covered a process supervisor that creates multiple node.js processes, listening on different ports for load balancing. This article will focus on HTTP: how to lighten the incoming load on node.js processes. Update: I’ve also posted a part 3 on zero downtime deployments in this setup.
Our stack consists of nginx serving external traffic by proxying to upstream node.js processes running express.js. As I’ll explain, nginx is used for almost everything: gzip encoding, static file serving, HTTP caching, SSL handling, load balancing and spoon feeding clients. The idea is to use nginx to prevent unnecessary traffic from hitting our node.js processes. Furthermore, we remove as much overhead as possible for traffic that has to hit node.js.
Too much talk. Here is our nginx config:
http {
    proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=one:8m max_size=3000m inactive=600m;
    proxy_temp_path /var/tmp;
    include mime.types;
    default_type application/octet-stream;
    sendfile on;
    keepalive_timeout 65;

    gzip on;
    gzip_comp_level 6;
    gzip_vary on;
    gzip_min_length 1000;
    gzip_proxied any;
    gzip_types text/plain text/html text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript;
    gzip_buffers 16 8k;

    upstream silly_face_society_upstream {
        server 127.0.0.1:61337;
        server 127.0.0.1:61338;
        keepalive 64;
    }

    server {
        listen 80;
        listen 443 ssl;

        ssl_certificate /some/location/sillyfacesociety.com.bundle.crt;
        ssl_certificate_key /some/location/sillyfacesociety.com.key;
        ssl_protocols SSLv3 TLSv1;
        ssl_ciphers HIGH:!aNULL:!MD5;

        server_name sillyfacesociety.com www.sillyfacesociety.com;

        if ($host = 'sillyfacesociety.com' ) {
            rewrite ^/(.*)$ http://www.sillyfacesociety.com/$1 permanent;
        }

        error_page 502 /errors/502.html;

        location ~ ^/(images/|img/|javascript/|js/|css/|stylesheets/|flash/|media/|static/|robots.txt|humans.txt|favicon.ico) {
            root /usr/local/silly_face_society/node/public;
            access_log off;
            expires max;
        }

        location /errors {
            internal;
            alias /usr/local/silly_face_society/node/public/errors;
        }

        location / {
            proxy_redirect off;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
            proxy_set_header Host $http_host;
            proxy_set_header X-NginX-Proxy true;
            proxy_set_header Connection "";
            proxy_http_version 1.1;
            proxy_cache one;
            proxy_cache_key sfs$request_uri$scheme;
            proxy_pass http://silly_face_society_upstream;
        }
    }
}
Also available as a gist.
Perhaps this code dump isn’t particularly enlightening: I’ll step through the config and give pointers on how it complements the express.js code.
The nginx <-> node.js link
First things first: how can we get nginx to proxy / load balance traffic to our node.js instances? We’ll assume that we are running two instances of express.js on ports 61337 and 61338. Take a look at the upstream section:
http {
    ...
    upstream silly_face_society_upstream {
        server 127.0.0.1:61337;
        server 127.0.0.1:61338;
        keepalive 64;
    }
    ...
}
The upstream directive specifies that these two instances work in tandem as an upstream server for nginx. The keepalive 64; directive tells each nginx worker process to keep up to 64 idle connections to the upstream servers open for reuse. This is the size of a connection cache, not a limit on total connections: if there is more traffic then nginx will open more connections to the upstream servers, and it closes the surplus once they fall idle.
upstream alone is not sufficient – nginx needs to know how and when to route traffic to node. The magic happens within our server section. Scrolling to the bottom, we have a location / section like:
http {
    ...
    server {
        ...
        location / {
            proxy_redirect off;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header Host $http_host;
            proxy_set_header X-NginX-Proxy true;
            ...
            proxy_set_header Connection "";
            proxy_http_version 1.1;
            proxy_pass http://silly_face_society_upstream;
        }
        ...
    }
}
This section is a fall-through for traffic that hasn’t matched any other rules: we have node.js handle the traffic and nginx proxy the response. The most important part of the section is proxy_pass – this tells nginx to use the upstream server group that we defined higher up in the config. Next in line is proxy_http_version, which tells nginx to use HTTP/1.1 for connections to the upstream servers. HTTP/1.1, combined with the empty Connection header and the keepalive setting in the upstream section, lets nginx reuse connections to node.js instead of establishing a new one for every proxied request, which has a significant impact on response latency. Finally, we have a few proxy_set_header directives to tell our express.js processes that this is a proxied request and not a direct one. Full explanations can be found in the HttpProxyModule docs.
This part of the config is the minimum amount needed to get nginx serving port 80 and proxying our node.js processes underneath. The rest of this article will cover how to use nginx features to lighten the traffic load on node.js.
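On the node.js side, the headers set in the location / section are the only way to learn who the original client was, because every socket now originates from nginx on 127.0.0.1. Here is a minimal sketch of reading them, using only the plain node.js request object. The helper name clientInfo is my own invention for illustration, and the sketch assumes node.js is reachable only through nginx, since a client that can connect directly could forge these headers.

```javascript
// Sketch: recover the original client details from the headers that the
// nginx "location /" block sets. clientInfo is a hypothetical helper name.
function clientInfo(req) {
  const headers = req.headers || {};

  // X-Forwarded-For may hold a comma-separated chain of addresses; the
  // left-most entry is the address reported for the original client.
  const forwardedFor = (headers["x-forwarded-for"] || "")
    .split(",")
    .map((part) => part.trim())
    .filter(Boolean);

  // Fall back to the socket address when no proxy headers are present.
  const socketAddress = req.socket ? req.socket.remoteAddress : undefined;

  return {
    // X-Real-IP is set by nginx from $remote_addr, so prefer it.
    ip: headers["x-real-ip"] || forwardedFor[0] || socketAddress,
    // nginx sends "X-NginX-Proxy: true"; node.js lower-cases header names.
    viaNginx: headers["x-nginx-proxy"] === "true",
    host: headers.host,
  };
}
```

Inside an express.js route, calling clientInfo(req) would give you the address to log or rate-limit on. express.js also has a trust proxy setting that serves a similar purpose; check its documentation before relying on it.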
Static file intercept
Although express.js has built in static file handling through some connect middleware, you should never use it. Nginx can do a much better job of handling static files and can prevent requests for non-dynamic content from clogging our node processes. The location directive in question is:
http {
    ...
    server {
        ...
        location ~ ^/(images/|img/|javascript/|js/|css/|stylesheets/|flash/|media/|static/|robots.txt|humans.txt|favicon.ico) {
            root /usr/local/silly_face_society/node/public;
            access_log off;
            expires max;
        }
        ...
    }
}
Any request with a URI starting with images, img, css, js, ... will be matched by this location. In my express.js directory structure, the public/ directory is used to store all my static assets – things like CSS, javascript and the like. Using root, I instruct nginx to serve these files without ever talking to the underlying servers. The expires max; directive is a caching hint that these assets are immutable. For other sites, it may be more appropriate to use a quicker cache expiry through something like expires 1h;. Full information can be found in nginx’s HttpHeadersModule.
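To check which URIs the static location will intercept before deploying, the pattern can be tried out in node.js. This is only a rough sketch: nginx uses PCRE rather than JavaScript regular expressions, but for a pattern this simple the two should agree. The name servedByNginx is hypothetical.

```javascript
// The pattern from the nginx location block, transcribed as a JavaScript
// RegExp (slashes escaped; everything else left exactly as in the config).
const staticPattern =
  /^\/(images\/|img\/|javascript\/|js\/|css\/|stylesheets\/|flash\/|media\/|static\/|robots.txt|humans.txt|favicon.ico)/;

// Returns true when nginx would serve the URI from public/ itself,
// false when the request would fall through to node.js.
function servedByNginx(uri) {
  return staticPattern.test(uri);
}

console.log(servedByNginx("/css/site.css")); // static asset
console.log(servedByNginx("/recent"));       // dynamic route
```

One thing this makes visible: the dots in robots.txt, humans.txt and favicon.ico are unescaped, so they match any character. That is harmless here, because a stray match just results in nginx looking for a file that does not exist.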
Caching
In my opinion, any caching is better than no caching. Sites with extremely heavy traffic will use all kinds of caching solutions including varnish for HTTP acceleration and memcached for fragment caching and query caching. Our site isn’t so high-traffic but caching is still going to save us a fortune in server costs. For simplicity of configuration I decided to use nginx’s built-in caching.
Nginx’s built in caching is crude: when an upstream server provides HTTP header hints like Cache-Control, it enables caching with an expiry time matching the header hint. Within the expiry time, the next request will pull a cached file from disk instead of hitting the underlying node.js process. To set up caching, I have set two directives in the http section of the nginx config:
http {
    ...
    proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=one:8m max_size=3000m inactive=600m;
    proxy_temp_path /var/tmp;
    ...
}
These two lines instruct nginx that we are going to use it in caching mode. proxy_cache_path specifies the root directory for our cache, the directory depth (levels), the max_size of the cache and the inactive expire time. More importantly, keys_zone names a shared memory zone (one) and sets the size of the in-memory key index (8m). When nginx receives a request, it computes an MD5 hash of the cache key and uses this key index to find the corresponding file on disk. If it is not available, the request will hit our underlying node.js processes. Finally, to make our proxied requests use this cache, we have to change the location / section to include some caching information:
http {
    server {
        ...
        location / {
            ...
            proxy_cache one;
            proxy_cache_key sfs$request_uri$scheme;
            ...
        }
        ...
    }
}
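To make the cache lookup less abstract, here is a small sketch of how a cache key ends up as a file under /var/cache/nginx. It follows the layout described in the nginx documentation for levels=1:2: the file is named after the MD5 hash of the key, the first directory is the last hex character of that hash, and the second directory is the two characters before it. The function name cacheFilePath is hypothetical, and the sketch assumes the key is the literal sfs prefix followed by the request URI and the scheme.

```javascript
const crypto = require("crypto");

// Sketch of how a proxy_cache_key value maps to a file under a
// proxy_cache_path configured with levels=1:2.
function cacheFilePath(root, key) {
  const hash = crypto.createHash("md5").update(key).digest("hex");
  const level1 = hash.slice(-1);     // last hex character
  const level2 = hash.slice(-3, -1); // the two characters before it
  return `${root}/${level1}/${level2}/${hash}`;
}

// proxy_cache_key sfs$request_uri$scheme, for a plain HTTP request to /recent
const recentKey = "sfs" + "/recent" + "http";
console.log(cacheFilePath("/var/cache/nginx", recentKey));
```

Because $scheme is part of the key, the HTTP and HTTPS versions of the same URI are cached as separate entries. This can be handy when debugging: compute the path for a URI and check whether the file exists on disk.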
The proxy_cache directive instructs nginx that it can use our one keys_zone to cache responses to incoming requests. MD5 hashes will be computed from the proxy_cache_key value.
We have one miss: express.js will not be serving the proper HTTP cache hint headers. I wrote a quick piece of middleware that will provide this functionality.
# Returns express.js middleware that marks the response as cacheable for `seconds`.
cacheMiddleware = (seconds) -> (req, res, next) ->
  res.setHeader "Cache-Control", "public, max-age=#{seconds}"
  next()
It is not appropriate to apply this middleware globally – certain requests (e.g. POST requests that affect server state) should never be cached. As such, I use it on a per-route basis in my express.js app:
...
app.get "/recent", cacheMiddleware(5 * 60), (req, res, next) ->
  # When someone hits /recent, nginx will cache it for 5 minutes!
  ...
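For readers who are not using CoffeeScript, a rough JavaScript equivalent of the cache middleware might look like the sketch below. The method check is my own addition rather than part of the original middleware: it is a guard in case the middleware is accidentally attached to a route that changes server state. The name cacheFor is hypothetical.

```javascript
// JavaScript sketch of per-route cache headers for express.js.
function cacheFor(seconds) {
  return function (req, res, next) {
    if (req.method === "GET" || req.method === "HEAD") {
      // Safe to cache: tell nginx (and browsers) how long to keep it.
      res.setHeader("Cache-Control", `public, max-age=${seconds}`);
    } else {
      // Added guard (an assumption, not in the original): never cache
      // responses to requests that may change server state.
      res.setHeader("Cache-Control", "no-store");
    }
    next();
  };
}
```

It would be used the same way as the CoffeeScript version, for example app.get("/recent", cacheFor(5 * 60), handler), where handler stands in for your own route function.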
GZIP
GZIP is a no-brainer for HTTP. When responses are compressed, clients spend less time hogging up your server and everyone saves money on bandwidth. You could use some express.js middleware to handle gzipping of outgoing responses but nginx will do a better job and leave express.js with more resources. To enable gzipped responses in nginx, add the following lines to the http section:
http {
    ...
    gzip on;
    gzip_comp_level 6;
    gzip_vary on;
    gzip_min_length 1000;
    gzip_proxied any;
    gzip_types text/plain text/html text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript;
    gzip_buffers 16 8k;
    ...
}
I won’t go into details on what these directives do. Like caching, any gzip is better than no gzip. For more control, there are a thousand micro-optimizations that you can perform that are all documented well under nginx’s HttpGzipModule.
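If you want a feel for what gzip buys you, and for why a setting like gzip_min_length exists, node’s built-in zlib module can approximate it. This is only a sketch: it uses compression level 6 to mirror gzip_comp_level 6, the payload is made up, and nginx’s actual output may differ by a few bytes. The name gzipSavings is hypothetical.

```javascript
const zlib = require("zlib");

// Compare the raw and gzipped size of a response body at level 6.
function gzipSavings(text) {
  const raw = Buffer.from(text, "utf8");
  const compressed = zlib.gzipSync(raw, { level: 6 });
  return { rawBytes: raw.length, gzippedBytes: compressed.length };
}

// A made-up JSON listing, similar in shape to what an API route might return.
const listing = JSON.stringify(
  Array.from({ length: 200 }, (_, i) => ({ id: i, name: "silly face", votes: i % 7 }))
);

console.log(gzipSavings(listing)); // repetitive JSON shrinks substantially
console.log(gzipSavings("ok"));    // tiny bodies get bigger, not smaller
```

The second call shows why small responses are best left alone: the gzip header and trailer cost more than the compression saves.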
SSL
Continuing our theme of leaving node.js to handle only basic HTTP, we arrive at SSL. As long as upstream servers are within a trusted network, it doesn’t make sense to encrypt traffic further than nginx – node.js can serve HTTP traffic for nginx to encrypt. This setup is easy to configure; in the server directive you can tell nginx how to configure SSL:
http {
    ...
    server {
        ...
        listen 443 ssl;
        ssl_certificate /some/location/sillyfacesociety.com.bundle.crt;
        ssl_certificate_key /some/location/sillyfacesociety.com.key;
        ssl_protocols SSLv3 TLSv1;
        ssl_ciphers HIGH:!aNULL:!MD5;
        ...
    }
}
listen 443 ssl; tells nginx to accept SSL traffic on port 443. ssl_certificate and ssl_certificate_key tell nginx where to find the certificate and private key for your server. The ssl_protocols and ssl_ciphers lines instruct nginx on how to serve traffic. These are details: full configuration options are available in the nginx HttpSslModule.
We are almost there. The above configuration will get nginx decrypting traffic and proxying unencrypted requests to our upstream server. However, the upstream server may need to know whether it is in a secure context or not. This can be used to serve SSL-enabled assets from CDNs like Cloudfront, or to reject requests that come unencrypted. Add the following line to the location / section:
http {
    ...
    server {
        ...
        location / {
            ...
            proxy_set_header X-Forwarded-Proto $scheme;
            ...
        }
    }
}
This will send an HTTP header hint down to your node.js processes. Again, I whipped up a bit of middleware that makes SSL detection a little bit easier:
app.use (req, res, next) ->
  req.forwardedSecure = (req.headers["x-forwarded-proto"] == "https")
  next()
Within your routes, req.forwardedSecure will be true iff nginx is handling HTTPS traffic. For reference, the silly face society uses SSL for Facebook authentication token exchange when a user logs in using SSO (single sign on) on their phone. As an extension of implementing this, I also threw up a secure version of the site here.
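As an illustration of the "reject requests that come unencrypted" case, here is a small JavaScript sketch of a route-level guard built on the same X-Forwarded-Proto header. The name requireSecure and the 403 response are my own choices for the example, not something from our codebase; a redirect to the HTTPS URL would be an equally reasonable design. It only makes sense when node.js is reachable solely through nginx, since the header is otherwise trivial to forge.

```javascript
// Sketch: refuse requests that did not reach nginx over HTTPS.
// Uses only the plain node.js response API, so it also works in express.js.
function requireSecure(req, res, next) {
  const proto = (req.headers["x-forwarded-proto"] || "").toLowerCase();
  if (proto === "https") {
    return next(); // nginx terminated SSL for this request
  }
  // Plain HTTP (or a request that bypassed nginx): reject it.
  res.statusCode = 403;
  res.setHeader("Content-Type", "text/plain");
  res.end("HTTPS required\n");
}
```

In express.js this could be attached to individual routes, for example app.post("/login", requireSecure, handler), where /login and handler are placeholders.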
Wrapping up
Phew. We covered (1) how to set up node.js as an upstream server for nginx and (2) how to lighten the load on the upstream server by letting nginx handle load balancing, static files, SSL, GZIP and caching. Caveat emptor: the silly face society hasn’t launched yet. The above configuration is based on personal testing and research: we haven’t reached production-level traffic yet. If anyone is still reading, I welcome suggestions for improvements in the comments.
Incidentally, our use of express.js for HTTP traffic is an accident. We started using node.js to provide a socket-server for the real time “party mode” of the silly face society. As we got close to launching, we decided to add a Draw Something-esque passive mode so that the Silly Face Society could have the launch inertia to be successful. Instead of rewriting our technology stack, we reused as much of the old code as possible and exposed an HTTP interface. If I had to do it all from scratch, I would re-evaluate our choices: node.js is a hard beast to tame for CRUD-apps.
Does your interest in nginx align with your interest in silly faces? Try out the Silly Face Society on the iPhone.
This article has also been translated to the Serbo-Croatian language by Anja Skrba from Webhostinggeeks.com.