Operating System Fingerprinting with Packets (Part 1)

by Chris Sanders [Published on 31 Aug. 2011 / Last Updated on 31 Aug. 2011]

In this article series I will describe active and passive OS fingerprinting, the concepts that make them possible, and walk through some examples of how to do this in both a manual and an automated fashion.

Introduction

Context. It’s the single most important thing you have when approaching a system from an offensive or defensive perspective. If you are approaching a system through the eyes of an intruder then you will need to know everything you can about that system so that you can exploit it, exfiltrate data from it, and eventually cover your tracks as you make your escape. On the other hand, if you are defending a system then it's knowledge of that system's architecture that will let you know what it may be vulnerable to, and whether the activity you are watching is an indicator of compromise or simply noise on the wire. Context sets the stage for every action you take. It’s because of this that attackers and defenders spend so much time trying to gain this context.

If you are an attacker, you call it reconnaissance. If you are a defender, you call it profiling. Either way, the goal is the same: to find out everything you can about a system. This often starts with finding out what operating system (OS) the host is running. There are many ways to determine this information, but my favorite methods involve OS fingerprinting with packets. In this article series I will describe active and passive OS fingerprinting, the concepts that make them possible, and walk through some examples of how to do this in both a manual and an automated fashion.

RFCs, Developers, and Packets

In order for different systems to be able to communicate with one another it’s important that standards exist. These standards govern the operation of the various protocols that allow for network communication, such as IP, TCP, and UDP. Each standard is defined in a Request for Comments (RFC) document that explains the rules for implementation of the protocol. With these RFCs in existence, interoperability of systems is a breeze.

With that in mind, it’s the connection between these RFCs and the developers that utilize them that is of interest to us. Let’s look at RFC 791, for instance, which is the RFC for the Internet Protocol (IP).

If you read through the RFC, you will find information about a field within an IP packet called Time to Live (TTL). The TTL field is used to define the maximum time a packet can remain in transit on a network until it is discarded. The RFC contains a thorough description of what the TTL value is, how it works, and how devices should utilize it. Interestingly enough however, the RFC does not define a standardized value for TTL. As a result, this field is left open to interpretation by the developers who are creating each software implementation of the IP protocols. This ambiguity means that different operating systems utilize different default TTL values, which means that simple packet analysis can be used to determine what general type of operating system transmitted a particular packet.
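To make the location of the TTL field concrete, here is a minimal Python sketch that reads it out of a raw IPv4 header. The function name `read_ipv4_ttl` and the sample header (addresses, ID, and so on) are made up for illustration; only the field layout comes from RFC 791, where the TTL is the ninth byte of the header.

```python
import struct


def read_ipv4_ttl(header: bytes) -> int:
    """Return the TTL field from a raw IPv4 header (RFC 791 layout)."""
    if len(header) < 20:
        raise ValueError("an IPv4 header is at least 20 bytes long")
    version = header[0] >> 4  # high nibble of the first byte
    if version != 4:
        raise ValueError(f"not an IPv4 header (version={version})")
    return header[8]  # TTL is the ninth byte (offset 8)


# Hypothetical sample header, invented for this example:
# version/IHL, TOS, total length, ID, flags/fragment, TTL, protocol,
# checksum (left as 0), source address, destination address.
sample_header = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 52, 0x1234, 0x4000, 128, 6, 0,
    bytes([10, 0, 0, 5]), bytes([10, 0, 0, 9]),
)

print(read_ipv4_ttl(sample_header))
```

In practice a packet analysis tool decodes this field for you; the point is simply that the value is a single byte that every sender must fill in with something.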

The example I’ve just given is just one of many that allow for operating system fingerprinting to take place. Some of these instances center on default values of certain fields and only require a sample of packet traffic from the host in question, whereas others focus on how a host might respond to certain types of specially crafted packets, requiring you to communicate with the host. The difference between these is what determines whether you are performing passive or active OS fingerprinting, which we will dig into individually.

Passive OS Fingerprinting

Passive OS fingerprinting is the examination of a passively collected sample of packets from a host in order to determine its operating system platform. It is called passive because it doesn’t involve communicating with the host being examined. This technique is preferred by those on the defensive side of the fence because those individuals are typically viewing hosts from the perspective of a network intrusion detection system or a firewall. This will typically mean that there is some level of access to capture packets generated by the host in order to perform the examination of that data.

Comparing Hosts

Given a passive collection scenario, let’s examine two individual packets collected from two separate hosts.


Figure 1: Packet from Host A


Figure 2: Packet from Host B

Before we begin looking at the differences in these packets, let’s go ahead and see what they have in common. First, we can easily determine that both hosts are sending a TCP packet to another host. The TCP packet has only the SYN flag set, so this should be the first packet in a sequence of communication. Both hosts are sending their packets to the destination port 80, so we will assume that this is attempted communication with a web server.
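The two checks described above (only the SYN flag set, destination port 80) can be sketched in Python against the fixed part of a TCP header. This is an illustration only: `summarize_tcp` and the sample header values are hypothetical and are not taken from the packets in the figures.

```python
import struct

SYN = 0x02  # bit value of the SYN control flag


def summarize_tcp(tcp_header: bytes) -> dict:
    """Pull the ports and control flags out of a raw TCP header."""
    if len(tcp_header) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    # Source port, destination port, sequence number, acknowledgment number,
    # then 16 bits holding the data offset, reserved bits, and control flags.
    src_port, dst_port, _seq, _ack, off_flags = struct.unpack(
        "!HHIIH", tcp_header[:14]
    )
    flags = off_flags & 0x00FF  # CWR, ECE, URG, ACK, PSH, RST, SYN, FIN
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "syn_only": flags == SYN,
    }


# Hypothetical SYN header: source port 49152, destination port 80,
# data offset of 8 words (32 bytes), only SYN set, window 8192.
sample_syn = struct.pack(
    "!HHIIHHHH", 49152, 80, 0, 0, (8 << 12) | SYN, 8192, 0, 0
)

print(summarize_tcp(sample_syn))
```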

Now we can start to look at a couple of things that differ between the two packets. The first thing that jumps out to most people is that Host A mentions having an invalid checksum. In this case, that is actually normal based upon the test environment the packet was captured in, so that can be ignored. Beyond that, we can discern a couple of interesting variances. Let’s step through these differences and then we will try to draw some conclusions about the host that sent each packet.

Time to Live

The first variance in the two packets is the Time to Live (TTL) value. We’ve discussed that the RFC documentation defines the TTL field as a means to define the maximum time a packet can remain in transit on a network until it is discarded. The RFC doesn’t define a default value, and as such, different operating systems use different default values. In this case, we can note that packet A uses a TTL of 128 and packet B uses a TTL of 64. Remember that these packets were captured directly from the transmitting source. If you were capturing these packets from the perspective of the recipient, you would see varying numbers depending on how many router hops are in between the source and destination. For instance, instead of a TTL of 128, you may see a value of 116. Generally, it’s safe to assume that if you see a number close to 128 or 64, then the default TTL was most likely one of these two.
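The "close to 128 or 64" reasoning can be written down as a small Python sketch: round the observed TTL up to the nearest common default and treat the difference as the hop count. The list of candidate defaults is an assumption on my part; 64 and 128 come from this article, and 255 is included because it is another widely used default. The function name is hypothetical.

```python
from typing import Tuple

# Assumed candidate defaults, in ascending order. Only 64 and 128 appear in
# the article; 255 is added as another commonly seen default.
COMMON_INITIAL_TTLS = (64, 128, 255)


def estimate_initial_ttl(observed_ttl: int) -> Tuple[int, int]:
    """Return (likely initial TTL, estimated router hops) for an observed TTL."""
    if not 1 <= observed_ttl <= 255:
        raise ValueError("TTL must be between 1 and 255")
    for initial in COMMON_INITIAL_TTLS:
        if observed_ttl <= initial:
            return initial, initial - observed_ttl
    raise AssertionError("unreachable: 255 is the largest possible TTL")


for ttl in (128, 116, 64, 57):
    initial, hops = estimate_initial_ttl(ttl)
    print(f"observed {ttl}: initial TTL probably {initial}, about {hops} hops away")
```

This is a heuristic, not a measurement: a host more than 64 hops away from a sender that uses 128 would be misread as a sender that uses 64, although paths that long are unusual.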

Length

I mentioned earlier that these two packets are performing the exact same function. That might lead you to assume that they would be the exact same size, but that’s not the case. Packet A reports its length as 52 bytes, whereas packet B reports a length of 60 bytes. That means that the source host transmitting packet B added an additional 8 bytes to its SYN packet.

The source of these extra bytes can be found in the TCP header portion of the packet. The TCP header is variable length. This means that according to specification, each and every packet containing a TCP header must include a certain number of required fields, but may optionally include a few other fields if they are needed. These additional fields are referred to as TCP Options.

In packet A, there are three TCP options set. These are the Maximum Segment Size, Window Scale, and TCP SACK Permitted options. There are also a few bytes of padding that make up a total of 12 bytes of TCP options. 


Figure 3: TCP Header of Packet A

In packet B, we have those same options, with the addition of the Timestamp option (which also replaces a few of the padding bytes we see in packet A).


Figure 4: TCP Header of Packet B

The Timestamp option accounts for the additional 8 bytes in the packet sent from Host B: the option itself is 10 bytes long, but it takes the place of 2 bytes of the padding seen in packet A. Both packets still perform the same function; the packet from Host B just provides a bit more information about the host that created and transmitted it.
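To check the arithmetic, here is a rough Python sketch that walks a TCP options field and adds up the sizes. The two byte strings are assumptions: they are typical layouts containing the options named above, not the actual bytes from the figures, and the MSS, window scale, and timestamp values in them are invented. The sketch also assumes a 20-byte IP header and a 20-byte base TCP header.

```python
# Option kinds as assigned by IANA for the options discussed here.
OPTION_NAMES = {
    1: "NOP (padding)",
    2: "Maximum Segment Size",
    3: "Window Scale",
    4: "SACK Permitted",
    8: "Timestamp",
}


def parse_tcp_options(options: bytes) -> list:
    """Return a list of (option name, length in bytes) tuples."""
    parsed = []
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:  # End of Option List: nothing meaningful follows
            break
        if kind == 1:  # NOP is a single byte with no length field
            length = 1
        else:
            if i + 1 >= len(options):
                raise ValueError("truncated option")
            length = options[i + 1]
            if length < 2 or i + length > len(options):
                raise ValueError("bad option length")
        parsed.append((OPTION_NAMES.get(kind, f"kind {kind}"), length))
        i += length
    return parsed


# Assumed layouts (hypothetical values):
# A: MSS, NOP, Window Scale, NOP, NOP, SACK Permitted
options_a = bytes.fromhex("02 04 05 b4 01 03 03 08 01 01 04 02")
# B: MSS, SACK Permitted, Timestamp, NOP, Window Scale
options_b = bytes.fromhex(
    "02 04 05 b4 04 02 08 0a 00 00 00 01 00 00 00 00 01 03 03 07"
)

for label, opts in (("A", options_a), ("B", options_b)):
    total = 20 + 20 + len(opts)  # IP header + base TCP header + options
    print(f"Packet {label}: {len(opts)} bytes of options, {total} bytes total")
    for name, length in parse_tcp_options(opts):
        print(f"  {name}: {length}")
```

With these assumed layouts the totals come out to 52 and 60 bytes, matching the lengths reported for packets A and B.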

Making a Determination

Now that we’ve pointed out a few differences between the packets, we can try to determine the source operating system of each machine. In an effort to narrow things down a bit, I will go ahead and tell you that one is a Windows device and the other is a Linux device. In order to determine which packet belongs with which OS, you can either deploy both operating systems in a test environment and do the research yourself, or look at a chart that someone has already prepared for you.


Figure 5: A brief OS fingerprinting chart

The chart above is a very brief version of a more detailed chart put out by the SANS Institute. Based upon this chart, we can determine that Host A is a Windows device, and Host B is a Linux device.
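A chart like this is essentially a lookup table, which can be sketched in a few lines of Python. The table below contains only the two signatures discussed in this article (initial TTL and SYN packet length); it is a hypothetical illustration, not a reproduction of the SANS chart, and a real table would need many more fields and entries.

```python
# Hypothetical signature table built only from the two hosts in this article:
# (initial TTL, SYN packet length in bytes) -> OS family
SIGNATURES = {
    (128, 52): "Windows",
    (64, 60): "Linux",
}


def guess_os_family(initial_ttl: int, syn_length: int) -> str:
    """Look up an OS family; return 'unknown' when nothing matches."""
    return SIGNATURES.get((initial_ttl, syn_length), "unknown")


print("Host A:", guess_os_family(128, 52))
print("Host B:", guess_os_family(64, 60))
print("Other: ", guess_os_family(255, 44))
```

Returning "unknown" for anything that does not match exactly is a deliberate choice here: a wrong guess is usually worse than no guess.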

With all of that said, it’s important to mention that passive fingerprinting isn’t always an exact science. We tend to group these values into their respective operating system family (Windows, Linux, etc.), when in reality, a Fedora Linux device may produce a different packet size by default than a SUSE Linux device. In addition, a lot of these values are configurable, whether by modifying a configuration file or the system registry. This means that although a Windows device may typically have a certain TCP Receive Window size by default, a simple registry modification could change this behavior and fool our attempts at fingerprinting the OS.

Overall, there are all sorts of weird protocol quirks and default values of particular fields that can be used to passively fingerprint a system with packets alone. With this knowledge, you should be able to do a little bit of experimentation yourself to see what variances you can find between systems.

Wrapping Up

In this article we’ve discussed passive OS fingerprinting and gone through an example where we’ve compared two similar packets sent from devices using different operating systems. In the next article in this series we will talk about active OS fingerprinting and how the responses networked devices give to certain types of packets can help clue us in to the operating system they are running.

The Author — Chris Sanders

Chris Sanders is a technology consultant, author, and researcher born near Paducah, Kentucky. Chris serves as senior network security analyst for the US Department of Defense (SPAWAR) through Honeywell HTSI and is the author of the book "Practical Packet Analysis", as well as contributing many articles in the field. You can read more about Chris on his personal blog located at http://www.chrissanders.org.