Archive for November, 2009

Investigation of encrypted traffic

November 23rd, 2009 Clear2Go No comments

As traffic on the Internet becomes more and more encrypted, driven by privacy concerns and the need to protect data from third parties, prying eyes, marketers, service providers and others, behavioural profiling of network sessions will become more and more necessary.  Already, many products claim to do behavioural profiling of network activity, to varying degrees, to assist with behaviour detection.  There is increasingly active research in this area by vendors, law enforcement, bad guys and others.

I reviewed a report which indicated that, because the data was encrypted, it was impossible to determine anything useful.  This is not always the case, but I have seen this conclusion in reports and investigations many times when dealing with encrypted or unidentified data.  Aside from the marketing which says that if your Internet sessions are encrypted then you are safe (nothing could be further from the truth), many network administrators do not understand behavioural profiling or have not had much experience with it.  Behavioural profiling of networks can be very complex, and research in this area is relatively new.  To give some insight into how one might profile network sessions, and to show how behavioural profiling can be used to extract information, I decided to walk through a simple example and answer a simple question.  Specifically, what are the differences between an encrypted network session where one is watching a program or video (user providing no input) and an interactive network session (user providing input)?  I used the SSH protocol to illustrate.

I used video over SSH to watch a program.  The program was approximately 24 minutes in duration and was hosted on a server at my ISP.   There were no problems watching the program: it didn’t pause or stop, and it was just like watching a typical television program (in fact I watched it on my flat screen TV).  I used a device to capture the traffic between the server hosting the program and my home for the entire duration of the program.  Finally, I captured an interactive SSH session in which I was logged into a server at my ISP, doing some coding and running some shell commands.

Attempting to look at the actual data in either of these captures is useless.  Since the data is encrypted, knowing what was transmitted without access to the session keys is close to impossible.  That being stated, what behavioural characteristics can we observe to tell us what might be going on?

I separated each of the two captures by direction, which gave me four capture files: video received, video transmitted, interactive data received and interactive data transmitted.
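As a rough illustration of that split, here is a minimal sketch in Python.  It assumes the packets have already been parsed into simple records; the `Packet` record, its field names and the example addresses are hypothetical and are not taken from my actual captures.

```python
from collections import namedtuple

# Hypothetical record type: one entry per captured packet.
Packet = namedtuple("Packet", ["timestamp", "src", "dst", "length"])


def split_by_direction(packets, local_addr):
    """Split packets into (received, transmitted) relative to local_addr."""
    received = [p for p in packets if p.dst == local_addr]
    transmitted = [p for p in packets if p.src == local_addr]
    return received, transmitted


# Made-up example: 192.0.2.10 is the home end, 198.51.100.5 is the server.
packets = [
    Packet(0.000, "192.0.2.10", "198.51.100.5", 120),
    Packet(0.020, "198.51.100.5", "192.0.2.10", 1448),
    Packet(0.041, "198.51.100.5", "192.0.2.10", 1448),
]
received, transmitted = split_by_direction(packets, "192.0.2.10")
print(len(received), len(transmitted))  # 2 1
```

Doing this once for the video capture and once for the interactive capture yields the four data sets used in the rest of this post.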

Bandwidth

             Received    Transmitted   Ratio (Transmitted / Received)
Video        193.2 MB    7.0 MB        0.036
Interactive  0.59 MB     0.58 MB       0.98

Looking at the table above, the video session has a much larger amount of data received than transmitted, whereas the interactive session has a similar amount of data transmitted and received.  Analysis of most video streaming, and of flows where downloading is occurring, will yield similar results.  The ratio of received to transmitted data will be high (equivalently, the transmitted-to-received ratio shown in the table will be low).  Interactive sessions tend to have a more balanced ratio of transmitted to received data than a video session.  This of course depends on what the user is doing in the interactive session, but typically this has been the case in my experience.
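The Ratio column can be reproduced from the byte totals in the table.  The following is a minimal sketch; the function name is my own for illustration, and the only inputs are the totals shown above.

```python
def tx_rx_ratio(received, transmitted):
    """Ratio of transmitted to received data (both in the same unit)."""
    if received == 0:
        raise ValueError("no data received; ratio is undefined")
    return transmitted / received


# Totals from the table above, in MB.
video = tx_rx_ratio(received=193.2, transmitted=7.0)
interactive = tx_rx_ratio(received=0.59, transmitted=0.58)
print(round(video, 3), round(interactive, 2))  # 0.036 0.98
```

A value far below 1 points to a download-heavy session, while a value near 1 points to a more balanced exchange.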

Inter-packet timing

Another interesting metric is the time difference, or delta, between two consecutive packets.  When watching a video or listening to music, the delta between two packets tends to be small in comparison to an interactive session.  There are a few reasons for this.  Since the video is being viewed, it is important to ensure that the data arrives in a timely manner so that the video does not ‘freeze’ while being watched.   Some software attempts to write the video data to disk in advance of viewing to help mitigate this problem, but that leaves an exposure where a savvy individual can obtain a copy of the video by simply making a copy of the temporary file.  As a result, newer software tends to keep the data in memory and not write it to disk.  The result is the need to ensure a smooth delivery of data, minimizing both the delay between packets and the variation in that delay (known as jitter).

             Received (seconds)            Transmitted (seconds)
             Maximum    Mean    Std Dev.   Maximum    Mean    Std Dev.
Video        3.065      0.021   0.094      3.051      0.014   0.076
Interactive  4028.555   3.568   88.736     4028.544   2.162   69.137

I wrote a simple Python script which takes a capture file as input, calculates the inter-packet timing for each pair of consecutive packets and then outputs, among other information, the results you see in the table above.  The Maximum field is the largest time between packets, the Mean is the average time between packets, and the standard deviation is a measure of how ‘different’ the inter-packet times are from the ‘normal’.  For those that don’t know or wish to have a refresher in standard deviation, here is a good place to start. However, most languages and spreadsheets have functions to calculate this for you if you do not wish to learn the math.  In simple terms, and using our specific example, if all the packets had exactly the same time between them then the standard deviation would be 0.  The more the timing between packets varies, the greater the standard deviation will be.
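The script itself is not reproduced here, but the calculation can be sketched as follows.  This is an illustrative reconstruction, not the original script: it assumes a classic libpcap file with microsecond timestamps (not pcapng), reads only the per-packet timestamps, and uses the population standard deviation, which is an assumption on my part.  The function names are hypothetical.

```python
import io
import statistics
import struct


def read_pcap_timestamps(stream):
    """Return packet timestamps in seconds from a classic libpcap stream."""
    header = stream.read(24)  # fixed-size global header
    if len(header) < 24:
        raise ValueError("truncated pcap global header")
    magic = header[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"  # file written little-endian
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"  # file written big-endian
    else:
        raise ValueError("not a classic microsecond pcap file")
    timestamps = []
    while True:
        record = stream.read(16)  # per-packet record header
        if len(record) < 16:
            break
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(
            endian + "IIII", record
        )
        timestamps.append(ts_sec + ts_usec / 1_000_000)
        stream.read(incl_len)  # skip the (encrypted) packet bytes
    return timestamps


def inter_packet_stats(timestamps):
    """Maximum, mean and standard deviation of consecutive packet deltas."""
    deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if not deltas:
        raise ValueError("need at least two packets")
    return {
        "max": max(deltas),
        "mean": statistics.mean(deltas),
        # Population standard deviation; the original script may differ.
        "stdev": statistics.pstdev(deltas),
    }


# Tiny synthetic capture: three packets at 0.0 s, 0.5 s and 2.0 s.
capture = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
for sec, usec in [(0, 0), (0, 500000), (2, 0)]:
    capture += struct.pack("<IIII", sec, usec, 4, 4) + b"\x00" * 4

timestamps = read_pcap_timestamps(io.BytesIO(capture))
stats = inter_packet_stats(timestamps)
print(stats)
```

For a real capture, open the file in binary mode and pass the file object to `read_pcap_timestamps`, once per direction.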

Notice that the standard deviation is much higher for the interactive session than for the video session.  Sessions that stream data tend to have a low standard deviation for inter-packet timing.  If you think about it this makes sense: in an interactive session you can walk away from the computer, or the program could be waiting for input from the user, so data transmission will fluctuate more.

Bandwidth, inter-packet timing, and statistics such as standard deviation and mean are just a few things that can be used to narrow down what a particular subject’s activities might be.  In corporate or law enforcement investigations, profiling network behaviour can be a useful tool to determine if you need to spend more time on the investigation or if you have the right target.  Using our example above,  suppose a corporation wants to determine which employees are watching streaming videos.  A scan of the network data reveals an individual who has encrypted sessions, but these sessions show a transmit / receive ratio that is in line with typical interactive sessions and not video sessions.  If the standard deviation of the inter-packet timing is also high for these sessions, you can rule that individual out as a person of interest immediately.  This has the advantage of not encroaching on privacy unnecessarily, and it saves time by allowing you to focus on the users whose network sessions have characteristics that fit the behaviour you are looking for.
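To make that screening step concrete, here is a deliberately simple sketch.  The function name and both thresholds are hypothetical values chosen only so that the two sessions measured in this post fall on opposite sides; they are not validated cut-offs, and a real deployment would need to tune them against known traffic.

```python
def looks_like_streaming(tx_rx_ratio, ipt_stdev, max_ratio=0.1, max_stdev=1.0):
    """Rough screen: low transmit/receive ratio and steady packet timing.

    max_ratio and max_stdev are illustrative assumptions, not tuned values.
    """
    return tx_rx_ratio <= max_ratio and ipt_stdev <= max_stdev


# Values from the tables in this post (received-direction std dev, seconds).
sessions = {
    "video": (0.036, 0.094),
    "interactive": (0.98, 88.736),
}
results = {
    name: looks_like_streaming(ratio, stdev)
    for name, (ratio, stdev) in sessions.items()
}
print(results)  # {'video': True, 'interactive': False}
```

Sessions that fail the screen can be set aside early, which is the point of the approach: only sessions that fit the profile get a closer look.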

For those of you who feel comfortable because the data is ‘encrypted’, that comfort can be a false sense of security.  These are only two of the many metrics and techniques that can be applied to the data.  This area has active research and there are many products that will do this type of analysis in an automated fashion.  For those interested in this, although older now, this is a great paper where an experiment was conducted to determine what movie people were watching even though the movie data was encrypted.  They used behavioural data to fingerprint the movies, then applied the fingerprints to encrypted transmitted data.

Leadership isn’t about you

November 18th, 2009 Clear2Go No comments

A great leadership post by Marshall Goldsmith on the Harvard Business blog.  It is a very simple concept, but one that many leaders and companies forget.

Categories: Leadership and Management

COFEE, Forensics and Security via Obscurity

November 9th, 2009 Clear2Go No comments

Anyone in the digital forensics community will have heard today’s big story: Microsoft’s live forensic toolkit, called COFEE, has been leaked (pun intended) onto the Internet.  Normally this would not be big news, but since it was supposedly designed for “Law enforcement only” it is being reported on and discussed widely.

I remember when this was announced. Like many, I was able to obtain some factual information on COFEE ‘unofficially’ through a few contacts.  If you take a bunch of open source and freeware programs and wrap them up in a pretty GUI-based system that lets you create profiles to control which of these programs are run, in what order and with what switches, that is COFEE.  You can then load a particular profile or profiles onto a USB key.  You insert the USB key into the target and COFEE runs the requested commands and options in the profile (assuming auto-run is enabled; if not, you can start COFEE manually).  The output is saved and you can view it in a simple reporting package that organizes the information hierarchically by type: REGISTRY, POLICY, MEMORY, PASSWORDS and other categories.  Of course you have to have user access to the system for COFEE to work, ideally an administrative level account.

One of the ‘selling points’ was that any untrained officer could run COFEE on a target system and not have to understand what they are doing.  If the investigation does go to court, it will be expected that chain of custody, documentation and due diligence have all been taken care of.  More importantly, I could see the lawyer for the defense saying something to the effect of “Let me see if I understand this.  Officer Joe here, who has no knowledge of digital forensics, ran COFEE on the target system unsupervised.  Officer Joe, are you sure the process list is complete?  There were no hidden processes that are not being shown?  Are you certain that you obtained every active user on the system?  How are you sure? Are you certain you have a copy of all areas of memory and nothing was missed?”  The key to digital forensics is not the tool; it is having an understanding of what data is being extracted, how it is being extracted, what the extracted data means, and being able to explain what might have been missed or might be inaccurate and why.  This requires knowledge and training.

However, my biggest issue with COFEE has always been the “law enforcement only” approach.  It never works.  The software will eventually get out.  It is just another play on “Security via Obscurity”.   Why restrict it to law enforcement?  The only argument I have heard in support is that if the tools get out, the anti-forensics community will figure out a way around them so they don’t work.  But that type of research and software deployment is alive and well and has been for some time.  I even received training on how to fool forensic memory acquisition software in 2007 at Blackhat.  To be honest, if law enforcement is investigating a breach at a nuclear plant or some other critical infrastructure, I really hope they use the many other publicly available tools for their investigation instead of COFEE, along with individuals who know what they are doing.  Personally, I’d feel more confident in them.   Of the few lawyers and law enforcement officials I do know, I have not heard of an untrained officer using COFEE on a system to date as the primary source of data gathering.  That is not scientific, but I hope it is a sign that they are smarter than that.

Categories: Forensics

You can still be detected if using a proxy

November 2nd, 2009 Clear2Go No comments

Setting your proxy settings in Firefox or Internet Explorer does not mean that you are undetectable.  In fact, with most websites today embedding applications that provide video, audio, gaming and other services, it is more common than ever before to find evidence in logs and databases that can reveal who you are.  Most people involved with network security already know this, but if you are not one of them you may think you are anonymous when in fact you are not.

I was talking to an individual recently who was involved in an investigation.  They assumed that by using a proxy, the target site would not have an IP address or any other data logged that could link them to the target site.  I explained that this is a false assumption and why, but it got me thinking about others in law enforcement or corporate security who may be conducting investigations and feel comfortable that they are hidden behind a proxy service when they are actually exposed.

If a target site wants to detect you, there are many ways it can accomplish this easily, and often sites obtain identifying information unintentionally.  Here is a quick and simple example I put together.  First, I shut down all the servers and clients on my home network except a single computer and the gateway.  On the gateway, I captured all the traffic entering and leaving the network. Next, I configured Firefox to use an SSH proxy.  SSH has the ability to emulate a SOCKS4 or SOCKS5 proxy.  One caveat when using SOCKS4 or SOCKS5 this way is that DNS is not necessarily proxied: SOCKS4 has no way to carry hostnames, and with SOCKS5 Firefox resolves names locally unless its network.proxy.socks_remote_dns preference is enabled.  This is not a concern for this particular investigative scenario, but it could be for other investigations, so it is important to be aware of the issue.

Firefox was configured to proxy via Socks 5:

[Image: Firefox SOCKS 5 proxy configuration]

Next, I visited a site that hosted the latest Britney Spears video entitled ‘3′.  The page load is shown below.

[Image: page load of the site hosting the Britney Spears ‘3’ video]

The initial page loads along with the embedded video player.  Up to this point, the logs show that packets are entering and leaving only via the configured proxy server, which is our desired behaviour.

[Image: capture of the initial HTTP load via the proxy, cleansed]

The communication between the proxy server and the client, as shown above, continues until the video player application loads.  Once the player loads, it first does a DNS request for the video service.

[Image: capture of the video player's DNS query, cleansed]

The player then connects directly to the video service, bypassing the proxy.  At this point you have been identified.   This continues as the audio and video are streamed to the client.

[Image: capture of the RTMP stream, cleansed]

Keep in mind that you may already have been identified through the proxy itself.  It is entirely possible and likely that the website or player has transmitted other information about your system within the RTMP stream itself or even HTTP.  The problem stems from the fact that these embedded objects are in fact executable programs that can bypass the browser and other system settings.

If you are involved in an investigation where you don’t want to be detected by the target, do not assume that by using a proxy you are safe from detection.  There are ways to avoid this kind of detection, but they require more sophisticated network and client configuration.  Regardless of your setup and configuration, I would suggest always capturing the data transmitted and received.  Even if you don’t analyze every packet, it provides a detailed log of what actually was transmitted and received, allowing you to go back and verify if necessary.
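As an illustration of that kind of verification, the sketch below scans flow records taken from a capture and lists any destination the client contacted other than the proxy.  The function name, the flow tuple layout and all addresses are hypothetical; the only detail taken from the example above is the pattern of a DNS query followed by a direct RTMP connection (1935 is the usual RTMP port).

```python
import ipaddress


def find_proxy_bypass(flows, client_ip, proxy_ip, allowed=()):
    """Return (dst, dst_port) pairs the client reached without the proxy.

    flows is an iterable of (src, dst, dst_port) tuples from a capture.
    allowed lists extra destinations that are acceptable (hypothetical).
    """
    client = ipaddress.ip_address(client_ip)
    permitted = {ipaddress.ip_address(proxy_ip)}
    permitted.update(ipaddress.ip_address(a) for a in allowed)
    leaks = []
    for src, dst, dst_port in flows:
        if ipaddress.ip_address(src) != client:
            continue  # only outbound traffic from the client matters here
        if ipaddress.ip_address(dst) not in permitted:
            leaks.append((dst, dst_port))
    return leaks


# Made-up flows: SSH to the proxy, then a DNS query and a direct RTMP stream.
flows = [
    ("192.0.2.10", "198.51.100.5", 22),
    ("198.51.100.5", "192.0.2.10", 50022),
    ("192.0.2.10", "203.0.113.53", 53),
    ("192.0.2.10", "203.0.113.80", 1935),
]
leaks = find_proxy_bypass(flows, "192.0.2.10", "198.51.100.5")
print(leaks)  # [('203.0.113.53', 53), ('203.0.113.80', 1935)]
```

An empty result does not prove you were not identified, since identifying data can still travel through the proxy, but a non-empty one shows immediately that something bypassed it.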

Categories: Auditing, Privacy / Anonymity