Okay, I have to admit I was concerned.  We installed Windows Server 2008 R2, set up our applications, and then added the configuration required for Nagios to monitor the host and its associated services.  But Nagios claimed the host was down!  Simple pings returned a response.  Now what?

Background

I configure Nagios to use check_tping to monitor hosts, not the standard ICMP ping.  Why, you ask?  Some network devices do not handle ICMP on their fast-path in silicon, leaving it to be processed by the CPU.  During periods when the CPU is busy, ICMP will not be a good measure of host or network responsiveness.  In some networks, ICMP may be handled with a different QoS profile.  The best gauge of response, over a network, is something close to what applications use.  Guess what?  The Transmission Control Protocol!  How is it leveraged for monitoring?  A SYN packet is directed at the host on a port that is closed (it could be an open port; more on that later).  A host operating system will typically respond with a TCP reset (RST ACK, to be exact).  This is a simple two-packet exchange without extra overhead, a kind of network equivalent to “take pictures, leave [very few] footprints!”  If there is an intervening firewall device, the chosen port will have to be opened on it to allow the SYN packets to reach the destination host.
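To make the idea concrete, here is a minimal Python sketch of a TCP-based reachability probe.  It is not the check_tping plugin itself, just an illustration under a few assumptions: the function name `tcp_probe` and its return labels are made up for this example, and it relies on the operating system's ordinary `connect()` call to send the SYN rather than crafting packets by hand.

```python
import socket


def tcp_probe(host, port, timeout=2.0):
    """Send a SYN via connect() and classify what comes back.

    Returns one of three illustrative labels (hypothetical names):
      "refused"     - a TCP reset came back: the host is up, the port is closed
      "open"        - the three-way handshake completed
      "no-response" - timeout or another error (host down, or packets dropped)
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return "open"
    except ConnectionRefusedError:
        # The two-packet case: our SYN was answered with RST ACK.
        return "refused"
    except OSError:
        # Covers timeouts, unreachable networks, and name resolution failures.
        return "no-response"
    finally:
        sock.close()
```

For monitoring purposes, both "refused" and "open" mean the host answered; only "no-response" suggests the host is down or that something in the path is silently dropping the probe.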

Next Steps

Windows Server 2008 R2 comes with a host-based firewall and it is enabled by default.  Hmmm, well, let’s leave it enabled.  How much trouble can it be?  Given the decision to leave the firewall enabled, testing proceeded as follows (with packet sniffers on each end and firewall logging enabled on Windows Server 2008 R2):

  1. Initial test without any modifications.  Failure.
  2. Added an inbound rule for the domain profile and re-tested.  Failure.
  3. Disabled the firewall for the domain profile.  Failure.
  4. Re-enabled the firewall for the domain profile and modified the inbound rule to apply to all network profiles (domain, private, and public).  Failure.
  5. Disabled the firewall for *all* network profiles.  Success!

Failure = SYN packets observed leaving the source host, and then either logged as dropped at the destination (test #1) or observed arriving without any response being returned (tests #2-4).

Success = SYN packets observed leaving the source host, arriving at the destination host and RST ACK packets being returned (test #5).

There was a test #0.  This involved sending the SYN packet probes to an open port.  However, this results in the TCP three-way handshake actually completing, which also means connection termination has to happen.  In the usual case, this will involve seven packets: three for connection establishment and four for connection termination.  In addition, a connection slot is tied up on the server, and on the client side, the connection may go through a TIME_WAIT (aka 2MSL) state.  In this case, it does not, because Windows Server 2008 R2 issues a RST ACK immediately after the client's active close.  After connection establishment the sequence is:

  1. FIN sent by the client
  2. ACK from Windows
  3. An immediate RST ACK from Windows

I have to presume that some heuristic is in play here based on the time the connection was open.  For very short connections, Windows may decide to issue the reset rather than close the connection gracefully by sending its own FIN packet and waiting for the final ACK from the client.  Under attack, this might help to stave off resource exhaustion.  I find this to be an acceptable strategy.
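As a quick tally of the packet counts discussed above, the following Python sketch lists each exchange and counts its packets.  The flag labels are simplified (for example, a real FIN usually carries ACK as well), and the dictionary keys are names invented for this illustration.

```python
# Simplified packet sequences for each probe style described in the text.
HANDSHAKE = ["SYN", "SYN ACK", "ACK"]

exchanges = {
    # Probe to a closed port: SYN out, reset back.
    "closed port": ["SYN", "RST ACK"],
    # Probe to an open port with a textbook graceful close.
    "open port, graceful close": HANDSHAKE + ["FIN", "ACK", "FIN", "ACK"],
    # Probe to an open port as observed in test #0.
    "open port, observed in test #0": HANDSHAKE + ["FIN", "ACK", "RST ACK"],
}

packet_counts = {name: len(packets) for name, packets in exchanges.items()}

for name, count in packet_counts.items():
    print(f"{name}: {count} packets")
```

Even with the shortened close, the open-port probe costs three times as many packets as the closed-port probe, on top of holding a connection slot for the duration.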

Conclusions

Although I initially feared that the host OS network stack had changed its default behavior, this is not the case.  However, if the Windows firewall is enabled for *any* active network profile, even if traffic is destined for a NIC that does not match a profile for which the firewall is enabled, the firewall is still in the communication path and its default behavior is to silently drop packets destined for unopened ports.  In this case, the public network profile was active because there was a second, unused NIC that did not have a configured address (well, only a self-assigned 169.254.n.n one).  Who wudda thunk it?

While one can make arguments either way about whether a firewall should issue TCP resets or silently drop packets, since it is supposed to be nothing more than a “bump on the wire”, I maintain that host-based firewalls should issue resets.  The firewall and host share the same MAC and IP address, so what is the point of maintaining some illusion of separation?  It is true that not all internal traffic can be trusted or considered non-malicious, but at the very least, silent drop vs. reject (e.g., via a TCP RST) should be a configurable option.  As the scenario above shows, probes to closed ports can be useful and not just for the malicious person performing reconnaissance.
