livetvmatches.com: nginx Drupal 8 configuration with Microcaching and 410 Wildcards

Submitted by nigel on Thursday 7th April 2016
LiveTVMatches.com
OK, for the impatient, here's the nginx configuration for my new website. It includes microcaching and some wildcard configuration for returning HTTP 410 'Gone' at the web server, before the framework / CMS is hit.
/etc/nginx/nginx.conf
worker_processes  1;
error_log  /var/log/nginx/error.log;
events {
    worker_connections  1024;
    use epoll;
}
 
http {
    include       mime.types;
    default_type  application/octet-stream;
 
    fastcgi_cache_path /var/cache/nginx2 levels=1:2 keys_zone=microcache:1m max_size=1000m;
    log_format cache '$remote_addr - $remote_user [$time_local] "$request" '
                '$status $upstream_cache_status $body_bytes_sent "$http_referer" '
                '"$http_user_agent" "$http_x_forwarded_for"';
 
    sendfile        on;
 
    keepalive_timeout  65;
 
    gzip  on;
    gzip_min_length 1100;
    gzip_buffers 4 32k;
    gzip_types text/plain application/x-javascript text/xml text/css;
 
    include vhosts.d/*.conf;
}
/etc/nginx/vhosts.d/mywebsite.conf
server {
  listen                *:80;
  server_name           mywebsite.com www.mywebsite.com;
  client_max_body_size 1m;
  root /srv/www/htdocs/mywebsite/docroot;
  index index.php;
 
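  # Use the "cache" log_format from nginx.conf so $upstream_cache_status (HIT/MISS/BYPASS) is recorded
  access_log            /var/log/nginx/mywebsite_access.log cache;
  error_log             /var/log/nginx/mywebsite_error.log;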
 
  error_page 410 /custom_410.html;
  location  = /custom_410.html {
    root /srv/www/htdocs/mywebsite/docroot/;
    internal;
  }
 
  location ~ \..*/.*\.php$ {
    return 403;
  }
 
  # Block access to hidden directories
  location ~ (^|/)\. {
    return 403;
  }
 
  location ~ ^/sites/.*/private/ {
    return 403;
  }
 
  # No php is touched for static content
  location / {
    try_files $uri @rewrite;
  }
 
  # pass the PHP scripts to FastCGI server
  location ~ \.php$ {
    fastcgi_index index.php;
    try_files $uri =404;
    fastcgi_split_path_info ^(.+\.php)(/.+)$;
 
    set $no_cache "";
    if ($request_method !~ ^(GET|HEAD)$) {
      set $no_cache "1";
    }       
 
    if ($no_cache = "1") {
      add_header Set-Cookie "_mcnc=1; Max-Age=2; Path=/";
      add_header X-Microcachable "0";
    }
 
    if ($http_cookie ~ SESS) {
      set $no_cache "1";
    }               
 
    fastcgi_no_cache $no_cache;
    fastcgi_cache_bypass $no_cache;
    fastcgi_cache microcache;
    fastcgi_cache_lock on;
    fastcgi_cache_key $server_name|$request_uri;
    fastcgi_cache_valid 404 30m;
    fastcgi_cache_valid 200 10s;
    fastcgi_max_temp_file_size 10M;
    fastcgi_cache_use_stale updating;
 
    # The address or socket on which FastCGI requests are accepted. Set yours in www.conf
    fastcgi_pass unix:/var/run/php5-fpm.sock;
    fastcgi_pass_header Set-Cookie;
    fastcgi_pass_header Cookie;
    fastcgi_ignore_headers Cache-Control Expires Set-Cookie;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    include fastcgi_params;
  }
 
  # Clean URLs
  location @rewrite {
    rewrite ^ /index.php;
  }
 
  # Bogus links from previous owner of website
  location ~ ^/.*(wp-content|feed|category|streaming-|online-|live-|watch-|highlights-|stream-)(.*) {
    return 410;
  }
 
  # Image styles
  location ~ ^/sites/.*/files/styles/ {
    try_files $uri @rewrite;
  }
 
  location ~* \.(js|css|png|jpg|jpeg|gif|ico)$ {
    expires max;
    log_not_found off;
  }
 
  location = /themes/yourtheme/favicon.ico {
    log_not_found off;
    access_log off;
  }
 
  location = /robots.txt {
    allow all;
    log_not_found off;
    access_log off;
  }
 
}
/srv/www/htdocs/mywebsite/docroot/custom_410.html
<html>
<head>
<style><!--
                body {font-family: arial,sans-serif}
                img { border:none; }
//--></style>
   <meta name="robots" content="noindex">
   <title>Page Gone - 410 Error</title>
</head>
<body>
<blockquote>
<hr>
<h1>Error 410 - Page deleted or gone</h1>
This might be because:
<ul>
  <li>You have typed the web address incorrectly, or the page you were looking for may have been deleted.</li>
</ul>
</blockquote>
</body>
</html>

OK, so this isn't a tutorial on nginx, microcaching or Drupal, since there are many of those online, but this configuration worked for me. Microcaching is recommended for short-term caching of dynamic content and goes well with nginx; see 'The Benefits of Microcaching with NGINX'. The configuration needs a cache directory, and I went for /var/cache/nginx2, as can be seen in the fastcgi_cache_path line of nginx.conf, so be sure to create the directory beforehand as root or it won't work.
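
A minimal sketch of creating the cache directory follows. The function name prepare_cache_dir and the worker account name "nginx" are assumptions for illustration (your distribution may use www-data or another account); the directory must be whatever path fastcgi_cache_path names in nginx.conf.

```shell
#!/bin/sh
# prepare_cache_dir DIR OWNER
# Creates DIR (and any missing parents), makes it private, and hands it
# to OWNER when that is possible.
prepare_cache_dir() {
    dir="$1"
    owner="$2"
    mkdir -p "$dir" || return 1
    chmod 700 "$dir" || return 1
    # chown needs root and an existing account; skip it quietly otherwise.
    if [ "$(id -u)" -eq 0 ] && id "$owner" >/dev/null 2>&1; then
        chown "$owner" "$dir" || return 1
    fi
    return 0
}

# Typical call, run as root. The path is the fastcgi_cache_path value from
# nginx.conf; "nginx" is an assumed worker account name.
# prepare_cache_dir /var/cache/nginx2 nginx
```

Afterwards, nginx -t checks the configuration syntax and nginx -s reload applies it.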

Tuning the cache values will depend on your use case. Most blogs suggest 1 or 2 seconds, but since my data isn't particularly dynamic I opted for 10 seconds (the fastcgi_cache_valid 200 10s line). To test the performance of your cache you can use the ApacheBench (ab) command line utility. I would take its figures with a HUGE pinch of salt, but it is very good for comparison purposes and shows whether your changes are heading in the right direction.
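
One way to make that comparison, as a rough sketch: run ab once against the cached path and once with a cookie that forces a cache bypass, and compare the requests-per-second figures. The helper name ab_rps, the request counts and the cookie name SESStest are assumptions for illustration; the URL is the placeholder host from the vhost file.

```shell
#!/bin/sh
# ab_rps: read ApacheBench output on stdin and print only the mean
# "Requests per second" figure.
ab_rps() {
    awk -F: '/^Requests per second/ { split($2, f, " "); print f[1] }'
}

# Cached path: anonymous GETs should be served from the microcache.
# ab -n 1000 -c 20 http://www.mywebsite.com/ | ab_rps
#
# Uncached comparison: a cookie containing "SESS" trips the
# "if ($http_cookie ~ SESS)" rule, so every request reaches PHP.
# ab -n 1000 -c 20 -C 'SESStest=1' http://www.mywebsite.com/ | ab_rps
```

If the first figure is not clearly higher than the second, the cache is probably not being hit.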

The configuration for 410 Gone errors was a necessity in my case. I learned that my domain http://livetvmatches.com had been pre-owned purely from the fact that Googlebot was generating thousands of 404 Not Found errors, all for URLs offering illegal streams of live football that my domain used to serve, or perhaps used as a redirection starting point. Whilst Google won't score a site down for 404s, it now recognises 410 Gone and will desist from trying the URL again. By looking at my configuration, which searches for various words within the URL, you should gain an appreciation of how to achieve the same effect on your site.
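
Before deploying a wildcard like this, it is worth checking which paths it would catch, since a word such as live- may also appear in URLs you want to keep. The sketch below copies the pattern from the 'Bogus links' location block and tests sample paths with grep -E; the helper name is_gone and the sample paths are made up for illustration. It only exercises the pattern itself, not nginx's location precedence (for example, a URL ending in .php is handled by the earlier PHP location first).

```shell
#!/bin/sh
# Pattern copied from the "Bogus links" location block in the vhost file.
GONE_PATTERN='^/.*(wp-content|feed|category|streaming-|online-|live-|watch-|highlights-|stream-)(.*)'

# is_gone PATH: succeeds if PATH matches the 410 pattern.
is_gone() {
    printf '%s\n' "$1" | grep -Eq "$GONE_PATTERN"
}

# Report how each sample path would be treated by the pattern.
for path in /watch-some-match /wp-content/uploads/a.jpg /node/1 /; do
    if is_gone "$path"; then
        echo "$path -> 410"
    else
        echo "$path -> passed on"
    fi
done

# Against the running server, the status line can be inspected with:
# curl -sI http://www.mywebsite.com/watch-some-match | head -n 1
```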
