Own honeypots. Part 2 - nginx

Hello, my favourite visitor! In the previous article I told you about the basics of honeypotting. Today I want to show you why and how we can process web traffic.

Why would I want to analyze web junk?!

Well, there is a huge number of web vulnerabilities across a variety of services, from a firmware backdoor in a home router to a misconfigured web server that leaks an internal Jupyter Notebook. Bots nowadays are a bit clever, so they rarely try to exploit a system on the first attempt: their first step is a scan. That means we can see which services bots are scanning for and look up the recent or best-known vulnerabilities of those services.

How can we detect malicious traffic if I run a real website?

This is the simplest task, IMO. All we need to do is create a new virtual host that catches all traffic with no Host header or with a host name that is not defined in the server configuration (our external IP address, for example). Why does this work? According to RFC 7230, “A client MUST send a Host header field in all HTTP/1.1 request messages”. So when a real browser requests encryp.ch, the web server will always know it. Anything else is a suspicious client sending potentially malicious requests that try to exploit bugs in software which does not fully conform to the RFC.
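The decision rule itself fits in a few lines of Python. This is only a sketch of the logic, not what nginx does internally; the names `KNOWN_HOSTS` and `is_suspicious` and the list of domains are my assumptions for the example.

```python
# Hypothetical set of host names that this machine really serves.
KNOWN_HOSTS = {"encryp.ch", "www.encryp.ch"}


def is_suspicious(host_header):
    """Return True when the Host header is missing, empty or unknown."""
    if not host_header:
        return True
    # Cut off an optional ":port" suffix and compare case-insensitively.
    # IPv6 literals are not handled specially; they simply end up unknown.
    hostname = host_header.split(":", 1)[0].lower()
    return hostname not in KNOWN_HOSTS
```

With this rule a request for "encryp.ch" passes as normal, while a request addressed to "10.0.0.2:80" or one with no Host at all lands in the trap.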

Practice time

Now we have theory to put into practice. In this guide I will use nginx because, in my view, it is superior to Apache in both performance and security.

First, create two vhosts: one logs requests together with their bodies, the other just returns a 200 status code. We need two vhosts to be able to use $request_body in our log_format. nginx does not read the request body when there is no backend and it only serves static files; the variable is filled in only in locations handled by directives such as proxy_pass. So we lie to nginx: we proxy every request to a dummy backend, nginx reads the body, and we see it in the logs.

    # log_format is used to describe our log entries
    # for more information about fields we are saving,
    # see docs: https://nginx.org/en/docs/varindex.html
    # https://nginx.org/en/docs/http/ngx_http_log_module.html#log_format
    log_format detailed_json escape=json "{\"request\": {\"client_addr\": \"$remote_addr\", \"client_port\": \"$remote_port\", \"conn_id\": \"$connection\", \"request_body\": \"$request_body\", \"full_line\": \"$request\", \"parsed\": {\"protocol\": \"$server_protocol\", \"method\": \"$request_method\", \"uri\": \"$request_uri\", \"host\": \"$http_host\"}, \"completion\": \"$request_completion\", \"lenght\": \"$request_length\", \"user-agent\": \"$http_user_agent\"}, \"response\": {\"time\": {\"iso8601\": \"$time_iso8601\"}, \"code\": $status}}";

    # our honeypot virtual host
    server {
        # replace 10.0.0.2 with your external IP
        listen   10.0.0.2:80 default_server reuseport so_keepalive=off backlog=4096;
        listen 10.0.0.2:8000 default_server reuseport so_keepalive=off backlog=4096;
        listen 10.0.0.2:8008 default_server reuseport so_keepalive=off backlog=4096;
        listen 10.0.0.2:8080 default_server reuseport so_keepalive=off backlog=4096;
        listen 10.0.0.2:8888 default_server reuseport so_keepalive=off backlog=4096;
        server_name _ "";

        access_log /var/log/nginx/honeypot-full.json detailed_json;
        log_not_found off;
        
        location = /robots.txt {
            access_log off;
            return 200 "User-agent: *\nDisallow: /\nDisallow: /i-ignore-robots-txt\n";
        }

        location / {
            proxy_pass http://unix:/var/run/nginx-dummy.sock;
        }
    }

    # our dummy vhost, used for request body processing
    server {
        # can listen on anything you wish,
        # but remember to update the proxy_pass directive
        listen unix:/var/run/nginx-dummy.sock;
        return 200 "";
    }

Remember to paste these lines before the include of your real vhosts (it may look like include vhosts/*.conf), so that this server is used as the default trap.
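To check that the trap works, you can send it a request with a body and with the IP address in the Host header. Below is a minimal sketch using only the Python standard library; the function names, the /probe path and the test body are made up for illustration, and 10.0.0.2 is the same placeholder address as in the config.

```python
import socket


def build_probe(host, body=b"test=1"):
    """Build a raw HTTP/1.1 POST request whose Host header is `host`."""
    head = (
        "POST /probe HTTP/1.1\r\n"  # hypothetical path, anything will do
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


def send_probe(address, port=80, timeout=5.0):
    """Send the probe to address:port and return the raw response bytes."""
    request = build_probe(address)
    with socket.create_connection((address, port), timeout=timeout) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling send_probe("10.0.0.2") from another machine should produce a new entry in /var/log/nginx/honeypot-full.json with test=1 in the request_body field.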

Now our logs contain lines like this one:

    {"request": {"client_addr": "74.201.28.70", "client_port": "60174", "conn_id": "11231", "request_body": "", "full_line": "GET / HTTP/1.1", "parsed": {"protocol": "HTTP/1.1", "method": "GET", "uri": "/", "host": "10.0.1.2:80"}, "completion": "OK", "lenght": "120", "user-agent": "libwww-perl/6.53"}, "response": {"time": {"iso8601": "2021-04-03T09:48:20+03:00"}, "code": 200}}
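Since every entry is a complete JSON object on its own line, reading it back is trivial. Here is a small sketch in Python that parses the sample entry above and pulls out the interesting fields; the `summarize` helper and the flat dictionary it returns are my own choice, not part of any nginx tooling.

```python
import json

# The sample log entry from this article, split over several lines.
SAMPLE = (
    '{"request": {"client_addr": "74.201.28.70", "client_port": "60174", '
    '"conn_id": "11231", "request_body": "", "full_line": "GET / HTTP/1.1", '
    '"parsed": {"protocol": "HTTP/1.1", "method": "GET", "uri": "/", '
    '"host": "10.0.1.2:80"}, "completion": "OK", "lenght": "120", '
    '"user-agent": "libwww-perl/6.53"}, '
    '"response": {"time": {"iso8601": "2021-04-03T09:48:20+03:00"}, "code": 200}}'
)


def summarize(line):
    """Flatten one honeypot log entry into a small dictionary."""
    entry = json.loads(line)
    request = entry["request"]
    return {
        "ip": request["client_addr"],
        "method": request["parsed"]["method"],
        "uri": request["parsed"]["uri"],
        "host": request["parsed"]["host"],
        "user_agent": request["user-agent"],
        "body": request["request_body"],
        "status": entry["response"]["code"],
        "time": entry["response"]["time"]["iso8601"],
    }
```

Note that the status code comes back as an integer, because $status is written without quotes in the log_format, while everything else is a string.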

Conclusion

This setup is a simplified solution with no online request processing, so both the server load and the functionality are minimal. But we can set up something to process the JSON logs automatically: report the IP, send the payload to VirusTotal, or ban the IP from accessing our real services.
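As a starting point for such processing, here is a rough sketch in Python that counts requests per client IP and picks candidates for a ban. The function names and the threshold of 3 hits are arbitrary assumptions; how the resulting list reaches your firewall is up to you.

```python
import json
from collections import Counter


def count_requests_per_ip(lines):
    """Count honeypot log entries per client address, skipping bad lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            ip = json.loads(line)["request"]["client_addr"]
        except (json.JSONDecodeError, KeyError, TypeError):
            # Not valid JSON or not in the expected layout: ignore it.
            continue
        counts[ip] += 1
    return counts


def ban_candidates(counts, threshold=3):
    """Return a sorted list of IPs seen at least `threshold` times."""
    return sorted(ip for ip, hits in counts.items() if hits >= threshold)
```

Both functions accept any iterable of lines, so you can pass an open file object for /var/log/nginx/honeypot-full.json straight to count_requests_per_ip and then feed the result to ban_candidates.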