Friday, September 23, 2016

Varnish Security Firewall (VSF) heatmap with the Elastic Stack


I am using the Varnish reverse proxy to increase the overall performance of my website: it caches objects in memory and serves them amazingly fast, thus reducing the number of requests that reach my backend web servers.

Recently I discovered VSF (Varnish Security Firewall), which is basically a set of rules written in VCL (Varnish Configuration Language) that you include in your own VCL. These rules perform a bunch of checks (you can modify them, or add your own custom rules if needed) for a variety of things, such as file extensions, empty user-agent strings, cross-site scripting and much more. If a request to your website is caught by one of these rules, you can display an error and block the request.
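To give you an idea of what such a rule looks like, here is a minimal sketch of a custom rule in plain VCL. I am assuming VSF's convention of recording the rule name in the X-VSF-RuleName request header (which we will log later); the exact mechanism in your VSF version may differ, and the extension list here is just an example:

sub vcl_recv {
    # Hypothetical custom rule: block requests for backup/dump files.
    # Setting X-VSF-RuleName mimics what VSF does when a rule fires,
    # so the rule name shows up in the varnishncsa log later on.
    if (req.url ~ "\.(bak|old|sql)(\?.*)?$") {
        set req.http.X-VSF-RuleName = "Forbidden file extension";
        return (synth(403, "Forbidden"));
    }
}

Varnish concatenates multiple vcl_recv definitions, so a fragment like this can live alongside your main VCL.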

With Varnish, I also use varnishncsa to log all requests, response times and so on to a log file on my server. I use filebeat (part of Beats) from the good people at Elastic to ship all entries, as they come, in (near) real time to logstash (and subsequently into elasticsearch).

This allows me to produce the nice dashboard pictured above with kibana. Since the dashboard is linked to the indexed log files, I can dynamically change the heatmap: if I make a rectangular selection on the map (geo coordinates), the heatmap will show blocked requests for that area only, as seen in the screenshot below.



So how does it all work?

I assume you have a working Varnish, as well as a working logstash, elasticsearch and kibana environment; if not, there are plenty of tutorials available that can help you. Installing VSF and filebeat should be equally straightforward.

I run FreeBSD on my systems, you might choose to run something else.

Start by including the VSF rules before vcl_recv() in your default VCL.

/usr/local/etc/varnish/default.vcl:

[...]
include "/usr/local/etc/varnish/security/vsf.vcl";
sub vcl_recv {
[...]

and reload your VCL:
# varnishadm vcl.load vsftest default.vcl && varnishadm vcl.use vsftest

If everything works, you should see:
VCL compiled.

VCL 'vsftest' now active
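You can also confirm which VCL configurations are loaded, and which one is active, at any time:

# varnishadm vcl.list

The newly loaded 'vsftest' should be marked as active in the listing.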


Next, make sure that the rule description is being logged by varnishncsa. On FreeBSD, the varnishncsa format is defined in /etc/rc.conf, so you need to add the following in the right place on your system:

/etc/rc.conf:
varnishd_enable="YES"
varnishd_config="/usr/local/etc/varnish/default.vcl"
varnishd_storage="malloc,52G"
varnishd_admin=":81"
varnishncsa_enable="YES"
varnishncsa_pidfile="/var/run/varnishncsa.pid"
varnishncsa_file="/var/log/varnishncsa.log"
varnishncsa_logformat="%{X-Forwarded-For}i %u %t %m '%{Host}i' '%U' '%q' %s %b '%{Referer}i' '%{User-agent}i' %{Varnish:time_firstbyte}x %{Varnish:handling}x '%{X-VSF-RuleName}i'"
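After changing the logformat, restart varnishncsa so it picks up the new format (this assumes the rc script shipped with the Varnish port):

# service varnishncsa restart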


You can find an overview of the different formatters for the logformat in the varnishncsa(1) man page.
The one we are interested in right now is "%{X-VSF-RuleName}i", which is set by VSF whenever a request is caught by a rule.
Here's an example:

123.123.255.248 - [23/Sep/2016:00:26:36 +0200] PROPFIND '5.5.5.5' '/webdav/' '' 403 279 '-' 'WEBDAV Client' 0.000060 synth '-' 'Method Not Allowed'
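You can trigger a rule on purpose to verify that the rule name ends up in the log. The URL here is hypothetical, and which requests get blocked depends on your ruleset:

# curl -s -o /dev/null 'http://www.example.com/backup.sql'
# tail -1 /var/log/varnishncsa.log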

We need to ship this to logstash. As mentioned earlier, I use filebeat:

filebeat.yml
filebeat:
  prospectors:
    -
      paths:
        - /var/log/varnishncsa.log
      input_type: log
      document_type: varnish

output.redis:
    hosts: ["172.16.0.1"]
    port: 6379
    password: "xxxxmysecurestringxxx"
    index: "filebeat"
    db: 0
    timeout: 5



I ship all my entries to a redis instance (broker) that my different logstash instances will poll continuously.
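For completeness, the input side of the logstash configuration could look something like this; a sketch, where host, password and key must of course match the filebeat output above:

input {
  redis {
    host => "172.16.0.1"
    port => 6379
    password => "xxxxmysecurestringxxx"
    data_type => "list"
    key => "filebeat"
    db => 0
  }
}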

So far so good. We now have to configure logstash to match and manipulate the log entries before inserting them into elasticsearch.

Here is the relevant section from my configuration.

/usr/local/etc/logstash/logstash.conf
        if [type] == "varnish" {
                grok {
                        patterns_dir => "/usr/local/etc/logstash/patterns"
                        match => [
                                "message", "%{VARNISH}"
                        ]
                        named_captures_only => true
                }
                geoip {
                        source => "ip1"
                        target => "geoip"
                        database => "/usr/local/etc/logstash/GeoLiteCity.dat"
                        add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
                        add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}"  ]
                }
                mutate {
                        convert => [ "[geoip][coordinates]", "float"]
                        convert => [ "bytes", "integer" ]
                        convert => [ "berespms", "integer" ]
                }
        }

The geoip section takes the IP address in the field "ip1" and looks it up in GeoLiteCity.dat, which returns geo coordinates into the field geoip.coordinates.
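One caveat: for the map visualization in kibana to work, elasticsearch must map geoip.coordinates as a geo_point. The default logstash index template only does this for geoip.location, so with a custom field name like the one used here, you may need a mapping fragment along these lines in your index template (a sketch; adjust it to your template and elasticsearch version):

"geoip": {
  "properties": {
    "coordinates": {
      "type": "geo_point"
    }
  }
}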

Here is my grok pattern for the Varnish log:

VARNISH %{IP:ip1} - \[%{HTTPDATE:timestamp}\] %{WORD:method} '%{NOTSPACE:host}' '%{NOTSPACE:path}' '(?:%{URIPARAM:param}|)' %{NUMBER:http_status} (?:%{NUMBER:bytes}|-) '(?:%{NOTSPACE:referrer}|-)' %{QS:agent} %{BASE10NUM:berespms} %{WORD:cache_handling} (?:%{QS:vsfvuln}|-)
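Before going live, you can test the pattern against the example log line from earlier with a throwaway logstash run (a sketch; adjust the paths to your installation):

# echo "<paste a log line here>" | logstash -e '
input { stdin {} }
filter {
  grok {
    patterns_dir => "/usr/local/etc/logstash/patterns"
    match => [ "message", "%{VARNISH}" ]
  }
}
output { stdout { codec => rubydebug } }'

If the pattern matches, the parsed fields are printed; if not, the event will carry a _grokparsefailure tag.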


Once this is in place, and you have restarted / reloaded all the different new configurations, you should be able to see vsfvuln and the GeoIP information in the indexed data:



I hope this was helpful, feel free to ask questions in the comment section below.

Thanks to all the people at Varnish Cache for providing this incredibly versatile and powerful software.