Sophos UTM DNS resolution bug leads to misclassification of web traffic.


*UPDATE 01\10\2017* I FINALLY got an error that names the file that is faulting. I started having various IPv6 sites not load, or load very slowly. Inspection of the logs showed the following:

2017:01:10-10:19:53 etcutm httpproxy[8853]: id="0003" severity="info" sys="SecureWeb" sub="http" request="0xc3fb6a00" function="connect_server_timeout" file="dns.c" line="788" message="Connection to www.cbmsystems.com using IPv6 timed out, re-trying to connect using IPv4"
2017:01:10-10:19:53 etcutm httpproxy[8853]: id="0003" severity="info" sys="SecureWeb" sub="http" request="0xc3fb7000" function="connect_server_timeout" file="dns.c" line="788" message="Connection to www.cbmsystems.com using IPv6 timed out, re-trying to connect using IPv4"
2017:01:10-10:19:53 etcutm httpproxy[8853]: id="0003" severity="info" sys="SecureWeb" sub="http" request="0xc3e85800" function="connect_server_timeout" file="dns.c" line="788" message="Connection to www.cbmsystems.com using IPv6 timed out, re-trying to connect using IPv4"
2017:01:10-10:19:53 etcutm httpproxy[8853]: id="0003" severity="info" sys="SecureWeb" sub="http" request="0xc3e4f600" function="connect_server_timeout" file="dns.c" line="788" message="Connection to waynesborocc.com using IPv6 timed out, re-trying to connect using IPv4"
2017:01:10-10:19:53 etcutm httpproxy[8853]: id="0003" severity="info" sys="SecureWeb" sub="http" request="0xc3ff0000" function="connect_server_timeout" file="dns.c" line="788" message="Connection to waynesborocc.com using IPv6 timed out, re-trying to connect using IPv4"
2017:01:10-10:19:53 etcutm httpproxy[8853]: id="0003" severity="info" sys="SecureWeb" sub="http" request="0xc4024c00" function="connect_server_timeout" file="dns.c" line="788" message="Connection to www.cbmsystems.com using IPv6 timed out, re-trying to connect using IPv4"

It is not only these sites but others as well. Disabling the HTTP proxy entirely mitigated the issue. Disabling IPv6 mitigated it as well. However, with IPv6 left on and the HTTP proxy off, the firewall passed IPv6 traffic without trouble to the same sites that timed out over IPv6 with the HTTP proxy enabled. I was told (again) by Sophos that the issue was my network. Again I was able to show that the issue was not on my end, as things worked fine with the proxy off. I explained that if it were on my end, I would have the issue with the proxy on or off, or with my PC plugged directly into the cable modem. (I had zero IPv6 issues in any configuration except having the HTTP proxy enabled with IPv6 active.) That was December of last year. Today I got a VERY helpful technician from escalation who is determined to look through my years of results with an open mind and tell me what he thinks. He has my direct number to reach me with any questions. I have hope that this longstanding DNS issue inside the HTTP proxy will finally be resolved. It might take some time…but I have a glimmer of hope.
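To get a feel for how many sites are affected, the fallback messages can be tallied per host name. The following is only a rough sketch in Python: the function name count_ipv6_fallbacks is made up for illustration, and the pattern assumes the message wording stays exactly as in the excerpt above.

```python
import re
from collections import Counter

# Matches the message text seen in the log excerpt and captures the host name.
# Assumption: the proxy always words the message this way.
FALLBACK_RE = re.compile(r"Connection to (\S+) using IPv6 timed out")


def count_ipv6_fallbacks(lines):
    """Return a Counter mapping host name -> number of IPv6 timeout fallbacks."""
    counts = Counter()
    for line in lines:
        match = FALLBACK_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Feeding it the lines of a saved proxy log (for example, an open file object) and calling most_common() on the result lists the hosts that fall back to IPv4 most often.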

The HTTP proxy in UTM 9 has always been a finicky beast. Unfortunately, ever since Sophos took over Astaro AND moved the HTTP proxy from Squid to a proprietary in-house proxy, the results have been less than stellar. I have gone back and forth with Sophos on multiple occasions to prove that the issue is simple: the HTTP proxy itself, for some reason, does not properly perform reverse DNS (RDNS) lookups on all connections. I have not, I must admit, extensively checked this issue on IPv6.

The security implications are also concerning. The Web Protection feature relies heavily on the web proxy, so when the proxy is unable to properly classify traffic, category classifications (malware, suspicious, and so on) are not fully effective in securing the edge of client networks.

Fixing the lookup inside the proxy is the only way you are going to get this to work correctly without having to build a growing and ever-changing list of exceptions…unless you want to whitelist the entire root of Akamai, Edgesuite, Microsoft, Yahoo, Amazon, Limelight Networks, and so on. If you have to keep adding these things, the purpose of the UTM kinda goes up in flames. Why pay all the money for the UTM when pfSense or a Netgear can do the same job? Unfortunately for Sophos, this misclassification causes a ton of problems, as the entire core of the UTM’s network security relies on this functionality working properly. Because of this issue, web traffic has a high rate of misclassification, leading to symptoms such as:

  1. Traffic like Netflix or Facebook being classified as general HTTP traffic.
  2. Traffic being left unclassified.
  3. Application control profiles becoming ineffective for managing traffic on networks.
  4. Mobile devices, phones, computers and other network devices being unable to use services like Netflix, Amazon Video, and Debian updates, among others.
  5. Bypass regular expressions growing to unmanageable sizes and becoming ineffective as CDN node addresses change, which is well-established CDN behavior.
  6. Traffic that should be classified as malware or suspicious (which includes advertisements) being passed without inspection.
  7. Valid traffic being blocked if the filter option to block unclassified traffic is enabled, leading to loss of web service for users.

Right now, according to my UTM screen, I have downloaded a total of 23.3 GB of unclassified HTTP traffic and only 2.0 GB of YouTube traffic. The problem is that nearly all of that unclassified traffic is…Netflix. So what does this mean? If I try to use application control to limit Netflix traffic on my network…it will not work correctly, as the traffic is not correctly classified. Let’s fire up Netflix and see what happens…

Now, with Netflix streaming on my computer as I type this, I see this come up:

2016:11:28-14:21:45 etcutm httpproxy[15703]: id="0001" severity="info" sys="SecureWeb" sub="http" name="http access" action="pass" method="CONNECT" srcip="192.168.255.53" dstip="23.246.5.149" user="" group="" ad_domain="" statuscode="200" cached="0" profile="REF_HttProContaInterNetwo3 (Internal Network)" filteraction="REF_HttCffInternal (Internal)" size="3213138" request="0xe3e35200" url="https://23.246.5.149/" referer="" error="" authtime="0" dnstime="2" cattime="20816" avscantime="0" fullreqtime="118360798" device="0" auth="0" ua="" exceptions="" category="9998" reputation="unverified" categoryname="Uncategorized"

2016:11:28-14:21:54 etcutm httpproxy[15703]: id="0001" severity="info" sys="SecureWeb" sub="http" name="http access" action="pass" method="CONNECT" srcip="192.168.255.53" dstip="23.246.5.163" user="" group="" ad_domain="" statuscode="200" cached="0" profile="REF_HttProContaInterNetwo3 (Internal Network)" filteraction="REF_HttCffInternal (Internal)" size="23855719" request="0x11877200" url="https://23.246.5.163/" referer="" error="" authtime="0" dnstime="3" cattime="20727" avscantime="0" fullreqtime="126654592" device="0" auth="0" ua="" exceptions="" category="9998" reputation="unverified" categoryname="Uncategorized"

192.168.255.53 is the current IP of my machine. See the url? Not the destination IP…look for url: url="https://23.246.5.149/". That doesn’t seem to be such an issue, right? OK, now look at the end. What is the classification? categoryname="Uncategorized". No problem, right? Wrong. Let me show you the RDNS from the UTM itself, not from inside the HTTP proxy but at the root shell level:

nslookup 23.246.5.149
Server: 127.0.0.1
Address: 127.0.0.1#53

Non-authoritative answer:
149.5.246.23.in-addr.arpa name = ipv4_1.cxl0.c279.iad001.ix.nflxvideo.net.

Authoritative answers can be found from:
in-addr.arpa nameserver = a.in-addr-servers.arpa.
in-addr.arpa nameserver = c.in-addr-servers.arpa.
in-addr.arpa nameserver = e.in-addr-servers.arpa.
in-addr.arpa nameserver = b.in-addr-servers.arpa.
in-addr.arpa nameserver = d.in-addr-servers.arpa.
in-addr.arpa nameserver = f.in-addr-servers.arpa.
a.in-addr-servers.arpa internet address = 199.212.0.73
b.in-addr-servers.arpa internet address = 199.253.183.183
c.in-addr-servers.arpa internet address = 196.216.169.10
d.in-addr-servers.arpa internet address = 200.10.60.53
e.in-addr-servers.arpa internet address = 203.119.86.101
b.in-addr-servers.arpa has AAAA address 2001:500:87::87
d.in-addr-servers.arpa has AAAA address 2001:13c7:7010::53
e.in-addr-servers.arpa has AAAA address 2001:dd8:6::101
f.in-addr-servers.arpa has AAAA address 2001:67c:e0::1
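
The same reverse lookup can be reproduced from any machine with a few lines of Python, which is handy for checking many addresses at once. This is only a sketch using the standard library: ptr_name and reverse_lookup are names made up for illustration, and the answer depends on whichever resolver the machine running it is configured to use.

```python
import ipaddress
import socket


def ptr_name(ip):
    """Return the reverse-lookup (PTR) query name for an IPv4 or IPv6 address."""
    return ipaddress.ip_address(ip).reverse_pointer


def reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return the PTR host name for ip, or None if the lookup fails.

    The resolver argument exists only so a fake can be passed in for testing.
    """
    try:
        hostname, _aliases, _addresses = resolver(ip)
        return hostname
    except OSError:  # socket.herror and socket.gaierror are subclasses of OSError
        return None
```

Calling ptr_name("23.246.5.149") builds the 149.5.246.23.in-addr.arpa name that appears in the nslookup answer above, and reverse_lookup("23.246.5.149") asks the system resolver for the host name behind it.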

 

OK, so who does nflxvideo.net belong to? A quick public whois turns up:

Domain Name: nflxvideo.net
Registry Domain ID: 1658109188_DOMAIN_NET-VRSN
Registrar WHOIS Server: whois.markmonitor.com
Registrar URL: http://www.markmonitor.com
Updated Date: 2016-10-27T08:29:56-0700
Creation Date: 2011-05-25T12:47:43-0700
Registrar Registration Expiration Date: 2017-05-25T00:00:00-0700
Registrar: MarkMonitor, Inc.
Registrar IANA ID: 292
Registrar Abuse Contact Email: 
Registrar Abuse Contact Phone: +1.2083895740
Domain Status: clientUpdateProhibited (https://www.icann.org/epp#clientUpdateProhibited)
Domain Status: clientTransferProhibited (https://www.icann.org/epp#clientTransferProhibited)
Domain Status: clientDeleteProhibited (https://www.icann.org/epp#clientDeleteProhibited)
Registry Registrant ID:
Registrant Name: Domain Administrator
Registrant Organization: Netflix, Inc.
Registrant Street: 100 Winchester Circle,
Registrant City: Los Gatos
Registrant State/Province: CA
Registrant Postal Code: 95032
Registrant Country: US
Registrant Phone: +1.4085403700
Registrant Phone Ext:
Registrant Fax: +1.4085403737
Registrant Fax Ext:
Registrant Email: 

 

Houston, we have a problem. For some reason the RDNS works fine at the root shell, but inside the HTTP proxy the lookup either is not done or fails. There’s nothing in the logs to explain why. As long as my link to Netflix comes from that IP address, it is counted as unclassified traffic and is NOT subject to the application control feature of the Sophos UTM. What if other misclassifications occur? They do…and this has potentially big security implications for Sophos UTM. If it is not able to properly classify something as basic as Netflix, what else gets messed up?
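
To see how much traffic slips through this way, the access log can be scanned for entries that are both uncategorized and addressed by a bare IP in the url field. The sketch below is a rough illustration in Python, not anything Sophos ships: the function names are hypothetical, and it assumes the key="value" layout of the two access log lines shown earlier.

```python
import ipaddress
import re
from urllib.parse import urlsplit

# Plain double quote plus the typographic quotes that blog or mail software
# sometimes substitutes when a log line is pasted.
QUOTES = '"\u201c\u201d\u2033'
FIELD_RE = re.compile(rf'(\w+)=[{QUOTES}]([^{QUOTES}]*)[{QUOTES}]')


def parse_fields(line):
    """Split one httpproxy log line into a dict of its key="value" fields."""
    return dict(FIELD_RE.findall(line))


def is_ip_literal(host):
    """True if host is a bare IPv4 or IPv6 address rather than a name."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def uncategorized_ip_requests(lines):
    """Yield (dstip, size) for uncategorized requests whose URL host is a bare IP."""
    for line in lines:
        fields = parse_fields(line)
        if fields.get("categoryname") != "Uncategorized":
            continue
        host = urlsplit(fields.get("url", "")).hostname or ""
        if is_ip_literal(host):
            yield fields.get("dstip", host), int(fields.get("size") or 0)
```

Summing the second element of each pair gives the volume of traffic that bypassed classification, and each dstip can then be reverse-resolved by hand to see who it really belongs to.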

I have had these issues myself and I have seen multiple reports of them: Netflix, Facebook, YouTube, Microsoft updates, Linux updates, and many more. What has been the Sophos response for more than 3 years? To tell me it’s not an issue and that the answer is to build increasingly large lists of exceptions. These exceptions effectively bypass the UTM’s network security features and allow the traffic to go through unscanned, unclassified, and unchecked. This seems like a small problem, but considering our dependence upon DNS due to IPv4’s exhaustion and the increasing traffic over IPv6…DNS resolution working properly at your edge protection device is CRUCIAL to proper security. I have other detailed research on this, as it has been going on for more than 3 years. I have multiple tickets containing my logs, which I have not yet properly redacted. I can provide those if requested, but they all show the same type of issue I have shown in two log lines. It isn’t anything complicated…the fact that DNS resolution fails leads to misclassification of web traffic. Because of this there are numerous security implications for the Sophos UTM product and its partners and customers.

Comments are welcome on my Twitter feed here for this post. My main Twitter feed is here.