
Archive for the ‘Forensics’ Category

Information leakage and privacy

March 1st, 2010 Clear2Go No comments

Have you ever sent an email at work from a personal email account such as Hotmail, Gmail, or your personal account at your service provider? When you do that, you might assume that since you are sending the email from a central system, it would not be possible for the recipient to learn anything about you beyond what you give them and an email address. Unfortunately this is not true. Information is leaked in many ways. SMTP, DNS, and HTTP can all leak information about a particular individual or organization. In my experience, most people know this is possible, but fail to grasp the ease with which information about a person or company can be discovered.

Here is a simple example to illustrate. When speaking to email users, I have found that many feel their location could not be determined by the recipient of an email unless they specifically give it, or that it would at least be difficult to find out. They feel even more comfortable with this belief when they are using their personal email via a browser from a terminal at work or an Internet cafe.

I was recently corresponding with a friend of mine.  She has a Rogers email account that she uses for her personal email.  She sent me a response to an email.  By looking at the email itself, there is no information that would give away where she was located.  However, if I look at the email headers a wealth of information is available.  Let’s focus on one piece.

* headers not required for purposes of entry have been removed and others edited as required to protect identities

The ‘Received:’ header above displays an IP address.  Taking that IP address and doing a ‘whois’ (shown below) reveals the company name where the email originated.

* removed ISP information and edited company info to ensure privacy
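The header and whois output themselves were removed for privacy. As a rough illustration only, here is a minimal Python sketch of how the bracketed IP addresses could be pulled out of the ‘Received:’ headers of a raw message. The function name, the sample message, and its addresses (taken from the reserved documentation ranges) are all hypothetical, and real headers vary by mail server. Each server prepends its own header, so the last ‘Received:’ header is usually the one closest to the sender.

```python
import re
from email import message_from_string

# Matches an IPv4 address in square brackets, as commonly seen in Received headers.
BRACKETED_IPV4 = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")

# Hypothetical message; addresses come from the reserved documentation ranges.
SAMPLE_MESSAGE = (
    "Received: from mx.example.net (mx.example.net [203.0.113.5])\n"
    "\tby mail.example.org; Mon, 1 Mar 2010 09:00:02 -0500\n"
    "Received: from [192.0.2.10] (helo=webmail.example.net)\n"
    "\tby mx.example.net; Mon, 1 Mar 2010 09:00:01 -0500\n"
    "From: someone@example.net\n"
    "Subject: test\n"
    "\n"
    "body\n"
)


def received_ips(raw_message):
    """Return bracketed IPv4 addresses from Received headers, top to bottom."""
    msg = message_from_string(raw_message)
    ips = []
    for header in msg.get_all("Received", []):
        ips.extend(BRACKETED_IPV4.findall(header))
    return ips


if __name__ == "__main__":
    ips = received_ips(SAMPLE_MESSAGE)
    # The last entry is the hop closest to the sender in this sample.
    print("candidate originating address:", ips[-1])
```

The resulting address is what you would then hand to a ‘whois’ lookup.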

How could this information be used? If someone wanted to surreptitiously gather intelligence on a target, they could send an email to the target asking an innocuous question. By responding, the target has unknowingly revealed their place of employment. A few searches on Google, a picture on Facebook of yourself and family members … you get the idea.

This type of information gathering has valid uses. Determining a time-line of a target and their actions for a corporate or legal investigation, determining if your spouse is cheating on you, or determining if your teenage child is lying are some examples.

I am not suggesting that you should try to hide this or not use the Internet. I am also not suggesting it will be fixed anytime soon, if ever. I am suggesting that you be aware. Be aware that in today’s world, data about yourself is being leaked all the time and any determined individual or group can find out what you are up to with minimal effort. Be aware that even the most common activity leaks data.

How secure or anonymous do you feel when using the Internet?


Categories: Forensics, Privacy / Anonymity Tags:

Tor and plausible deniability

February 18th, 2010 Clear2Go 2 comments

Once again I have been experimenting with the Tor network. In doing so I have set up some Tor nodes. I have received a few notifications that my computer ‘may be infected’. For a brief period of time, Google requested that I enter a CAPTCHA to confirm I am human. These are all expected minor nuisances when running Tor as an exit node. My main reason for setting up Tor this time is to obtain a better understanding of what happens to behavioural and static detection when a Tor exit node is present.

If you want privacy or anonymity on the Internet, there are many things you can do. Proxies, Tor, encrypted tunnels, compromised systems, and many other techniques are available. None of these will guarantee you anonymity or privacy, but they each help, and the more you can do the better. There are caveats of course, and in several cases while consulting I have come across scenarios where a client thought they were being anonymous but were in fact not as anonymous as they thought. When you are trying to be anonymous, the use of monitoring techniques and system checks really helps.

I’ve realized that running a Tor exit node but not using it yourself gives you anonymity.  I’ve always known this inherently, but I’ve realized that it is even better than I thought.  Say you are an evil person doing something evil on the Internet.  If your activities were being tracked by your service provider due to a warrant from law enforcement or laws were put in place that required all service providers to track and retain your Internet surfing activities for a period of time, they would be recording the surfing habits of every connection that selected your Tor node as its exit node.

If they accused you of illegal activity, you could easily say “that was not me, it must have been someone using my Tor node”. While this is not a guarantee the criminal would not get caught, it would increase the cost of the investigation significantly: more investigation time, and more forensics to prove that the suspect is the criminal. Add in anti-forensics on the terminals and systems you use for the crime and the costs of the investigation will increase again, forcing them to assess if it is worth the time, money, and resources required.

If countries are going to deploy retention laws similar to the above, it will only be a matter of time before they have to outlaw services such as Tor in order to make those laws effective at catching the serious criminals. From a Tor network perspective, these laws might help increase the node count of the Tor network on the Internet, which is a good thing for them.

I wonder if lawmakers consider these questions when suggesting these laws?

Confirming email delivery

February 4th, 2010 Clear2Go 3 comments

http://www.flickr.com/photos/tiffanyhoran/4288875968/

Most people have come to expect that when an email is sent it will arrive at its destination. Over the last decade, email delivery has become much more reliable due to many factors such as better network architecture, better mail server design, and load-balancing and failover design, all driven by increased reliance on email in today’s world. There is also the ability to request a delivery receipt in most email clients, although users typically disable this feature themselves, or the security policy of the organization disables it. Email however is not a guaranteed delivery service. Neither the SMTP protocol nor the process of email delivery on the Internet guarantees delivery.

One technique that I have used when someone has either not responded or indicated that they did not receive my email is to check the server delivery logs. While this does not guarantee that the email was placed in the destination user’s mailbox, it does indicate acceptance at the mail exchanger of the ISP or company.

Above is an email I sent to a friend last week confirming plans for dinner.  By viewing the headers and looking for the SMTP “Message-ID” field, I can then search for that ID in the log files of the mail server.

# cat maillog | grep -i "4B618D6A.2070804"
Jan 28 08:13:19 mailsvr sendmail[20093]: o0SDDGVS020093: from=<xx@xxxxxxxxxx.org>, size=399,, nrcpts=1, msgid=<4B618D6A.2070804@xxxxxxxxx.org>, proto=ESMTP, daemon=MTA, relay=eee.dddd.ca [216.bbb.ccc.12]
#
# cat maillog | grep -i "o0SDDGVS020093"
Jan 28 08:13:19 mailsvr sendmail[20093]: o0SDDGVS020093: from=<xx@xxxxxxxxxx.org>, size=399,, nrcpts=1, msgid=<4B618D6A.2070804@xxxxxxxxx.org>, proto=ESMTP, daemon=MTA, relay=eee.dddd.ca [216.bbb.ccc.12]
Jan 28 08:13:20 mailsvr sendmail[20098]: o0SDDGVS020093: to=<yyyyyy@gggggggg.com>, ctladdr=<xx@xxxxxxxxxxx.org> (501/501), delay=00:00:01, xdelay=00:00:01, mailer=esmtp, pri=120399, relay=ttttttt.hhhhhhcom. [142.fff.rrr.227], dsn=2.0.0, stat=Sent (Ok: queued as 51E02514002)
#

In this case the mail server is running Sendmail, so depending on your server, the procedure might be slightly different. Using the SMTP Message-ID field as a search parameter, I obtain the log entry containing the unique ID of the Sendmail delivery process for that message, in this case “o0SDDGVS020093”. Searching the log file for that unique ID shows me the remote mail server that accepted the email for delivery. The status is “Sent” and is confirmed by a Delivery Status Notification (dsn) code of 2.0.0.
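The same two-step lookup can be scripted. The following is only a sketch, assuming Sendmail log lines shaped like the two shown above. The function names are hypothetical and the regular expressions are tuned to this sample, not to every Sendmail configuration.

```python
import re

# Captures the Sendmail queue ID that follows "sendmail[pid]: ".
QUEUE_ID = re.compile(r"sendmail\[\d+\]: (\w+): ")
# Captures recipient, DSN code and status word from a delivery line.
DELIVERY = re.compile(r"to=<([^>]*)>.*dsn=([\d.]+), stat=(\w+)")

# The two log lines from the example above, plus one unrelated line.
SAMPLE_LOG = [
    "Jan 28 08:13:19 mailsvr sendmail[20093]: o0SDDGVS020093: from=<xx@xxxxxxxxxx.org>, size=399,, nrcpts=1, msgid=<4B618D6A.2070804@xxxxxxxxx.org>, proto=ESMTP, daemon=MTA, relay=eee.dddd.ca [216.bbb.ccc.12]",
    "Jan 28 08:13:20 mailsvr sendmail[20098]: o0SDDGVS020093: to=<yyyyyy@gggggggg.com>, ctladdr=<xx@xxxxxxxxxxx.org> (501/501), delay=00:00:01, xdelay=00:00:01, mailer=esmtp, pri=120399, relay=ttttttt.hhhhhhcom. [142.fff.rrr.227], dsn=2.0.0, stat=Sent (Ok: queued as 51E02514002)",
    "Jan 28 08:14:02 mailsvr sendmail[20110]: o0SDE1AB020110: from=<other@example.org>, size=120, nrcpts=1, msgid=<unrelated@example.org>, proto=ESMTP, daemon=MTA, relay=localhost",
]


def trace_message(log_lines, message_id):
    """Return all log lines sharing a queue ID with the given Message-ID."""
    needle = message_id.lower()
    queue_ids = set()
    for line in log_lines:
        if "msgid=" in line and needle in line.lower():
            match = QUEUE_ID.search(line)
            if match:
                queue_ids.add(match.group(1))
    related = []
    for line in log_lines:
        match = QUEUE_ID.search(line)
        if match and match.group(1) in queue_ids:
            related.append(line)
    return related


def delivery_results(log_lines):
    """Return (recipient, dsn, status) for each delivery line found."""
    results = []
    for line in log_lines:
        match = DELIVERY.search(line)
        if match:
            results.append(match.groups())
    return results


if __name__ == "__main__":
    for result in delivery_results(trace_message(SAMPLE_LOG, "4B618D6A.2070804")):
        print(result)
```

On a live server you would read the lines from the mail log file instead of the sample list.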

There are many other fields and status messages in server logs, some of which you can see above, that are useful resources when troubleshooting or doing forensic work involving an email transmission in an investigation. Although this might appear to be too technical for a general user, I have used the logs to confirm for myself whether email is getting to at least the mail exchanger. These records can assist in determining if the email arrived. At the very least, you can use them as evidence the email was received by the destination company. While it is not 100% proof, it is typically a good indicator.

In one instance, I was not getting a response from my daughter’s school concerning a particular issue.  After several attempts, I sent a new email asking why they were not responding, as it appeared obvious the school board was receiving the emails and I attached the log.  I had a response within the hour.  I am sure the users didn’t fully understand each field, but it was enough to get a response.   I don’t know of any service providers or companies that provide an on-line interface to check status of messages, but it might not be a bad service to offer.

Categories: Forensics Tags:

Investigation of encrypted traffic

November 23rd, 2009 Clear2Go No comments

As the traffic on the Internet becomes more and more encrypted due to privacy concerns and the need to protect data from third parties, prying eyes, marketers, service providers and others, behavioural profiling of network sessions will become more and more necessary. Already, there are many products that claim to do behavioural profiling of network activity in varying degrees to assist with behaviour detection. There is more and more active research in this area by vendors, law enforcement, bad guys and others.

I reviewed a report which indicated that because the data was encrypted, it was impossible to determine anything useful. This is not always the case, but I have seen this conclusion in reports and investigations many times when dealing with encrypted or unidentified data. Aside from the marketing which says that if your Internet sessions are encrypted then you are safe (nothing could be further from the truth), many network administrators do not understand, or have not had much experience with, behavioural profiling. Behavioural profiling of networks can be very complex, and research is relatively new in this area. To give some insight into how one might profile network sessions and show how one can use behavioural profiling to extract information, I decided to walk through a simple example and answer a simple question. Specifically, what are the differences between an encrypted network session where one is watching a program or video (user providing no input), compared to an interactive type of network session where one is interacting (providing input)? I used the SSH protocol to illustrate.

I used video over SSH to watch a program.  The program was approximately 24 minutes in duration and was hosted on a server at my ISP.   There were no problems watching the program, it didn’t pause or stop, and it was just like watching a typical television program (in fact I watched it on my flat screen TV).  I used a device to capture the traffic between the server hosting the program and my home for the entire duration of the program.  Finally, I captured an interactive SSH session which was me logged into a server at my ISP, where I was doing some coding and some shell commands.

Attempts to look at the actual data of either of these captures will be useless. Since the data is encrypted, knowing what was transmitted without access to the session keys is close to impossible, if not impossible. That being stated, what behaviour characteristics can we observe to tell us what might be going on?

I separated each of the two captures by direction, which gave me 4 capture files: video received, video transmitted, interactive data received and interactive data transmitted.

Bandwidth

             Received    Transmitted    Ratio (transmitted / received)
Video        193.2 MB    7.0 MB         0.036
Interactive  0.59 MB     0.58 MB        0.98

Looking at the table above, the video session has a much larger amount of data received than transmitted, compared to the interactive session where a similar amount of data is transmitted and received. Analysis of most video streaming, and of flows where downloading is occurring, will yield similar results. The ratio of received to transmitted data will be high. Interactive sessions tend to have a more balanced ratio of transmitted to received data compared to a video session. This of course depends on what the user is doing in the interactive session, but typically this has been the case in my experience.
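The ratio column appears to be transmitted divided by received. A small Python sketch of that arithmetic, using the byte counts from the table (the function name is hypothetical):

```python
def tx_rx_ratio(received_mb, transmitted_mb):
    """Return transmitted / received for one session."""
    if received_mb <= 0:
        raise ValueError("received volume must be positive")
    return transmitted_mb / received_mb


# (received MB, transmitted MB) taken from the table above.
SESSIONS = {
    "video": (193.2, 7.0),
    "interactive": (0.59, 0.58),
}

if __name__ == "__main__":
    for name, (rx, tx) in SESSIONS.items():
        print(f"{name}: {tx_rx_ratio(rx, tx):.3f}")
```

A value near 1 suggests a balanced, interactive style of session, while a value near 0 suggests a mostly one-way transfer such as streaming or downloading.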

Inter-packet timing

Another interesting metric is the time difference or delta between two packets. When watching a video or listening to music, the delta between two packets tends to be small in comparison to an interactive type of session. There are a few reasons for this. Since the video is being viewed, it is important to ensure that the data arrives in a timely manner so as to not have the video ‘freeze’ while being watched. Some software attempts to write the video data to disk in advance of viewing to help mitigate this problem, but that leaves an exposure where a savvy individual can obtain a copy of the video by simply making a copy of the temporary file. As a result, newer software tends to attempt to keep the data in memory and not write it to disk. The result is the need to ensure a smooth delivery of data, minimizing variation in the delay between packets (known as jitter).

             Received (seconds)              Transmitted (seconds)
             Maximum    Mean    Std Dev.     Maximum    Mean    Std Dev.
Video        3.065      0.021   0.094        3.051      0.014   0.076
Interactive  4028.555   3.568   88.736       4028.544   2.162   69.137

I wrote a simple Python script which takes a capture file as input, calculates the inter-packet timing for each pair of consecutive packets, and then outputs, among other information, the results you see in the table above. The maximum is the largest time between packets, the mean is the average time between packets, and the standard deviation is a measure of how much the inter-packet times vary around that mean. For those that don’t know or wish to have a refresher in standard deviation, here is a good place to start. However, most languages and spreadsheets have functions to calculate this for you if you do not wish to learn the math. In simple terms and using our specific example, if all the packets had the exact same time between them then the standard deviation would be 0. The greater the difference in timing between packets, the greater the standard deviation will be.
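The script itself is not reproduced here. As an illustration only, a minimal sketch of the same calculation might look like the following. It assumes a classic libpcap capture file with microsecond timestamps (not pcapng), reads only the per-packet record headers, and uses the population standard deviation. The function names and the example file name are hypothetical.

```python
import statistics
import struct


def read_pcap_timestamps(path):
    """Return packet timestamps in seconds from a classic libpcap file."""
    with open(path, "rb") as fh:
        header = fh.read(24)  # global header
        if len(header) < 24:
            raise ValueError("file too short to be a pcap capture")
        magic = header[:4]
        if magic == b"\xd4\xc3\xb2\xa1":
            endian = "<"  # written on a little-endian machine
        elif magic == b"\xa1\xb2\xc3\xd4":
            endian = ">"  # written on a big-endian machine
        else:
            raise ValueError("not a classic microsecond pcap file")
        timestamps = []
        while True:
            record = fh.read(16)  # per-packet record header
            if len(record) < 16:
                break
            ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(
                endian + "IIII", record
            )
            timestamps.append(ts_sec + ts_usec / 1_000_000)
            fh.seek(incl_len, 1)  # skip the packet bytes themselves
        return timestamps


def inter_packet_stats(timestamps):
    """Return max, mean and standard deviation of consecutive packet deltas."""
    deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if not deltas:
        raise ValueError("need at least two packets")
    return {
        "max": max(deltas),
        "mean": statistics.mean(deltas),
        "stdev": statistics.pstdev(deltas),
    }


# Example (hypothetical file name):
# print(inter_packet_stats(read_pcap_timestamps("video_received.pcap")))
```

Evenly spaced packets give a standard deviation of 0, matching the description above.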

Notice that the standard deviation is much higher for the interactive session than the video session. Sessions that stream data tend to have a low standard deviation for inter-packet timing. If you think about it this makes sense: in an interactive session you can walk away from the computer, or the program could be waiting for input from the user, so data transmission will fluctuate more.

Bandwidth, inter-packet timing, and measures such as standard deviation and mean are just a few things that can be used to narrow down what a particular subject’s activities might be. In corporate or law enforcement investigations, profiling network behaviour can be a useful tool to determine if you need to spend more time on the investigation or if you have the right target. Using our example above, suppose a corporation wants to determine which employees are watching streaming videos. A scan of the network data reveals an individual who has encrypted sessions, but these sessions show a transmit / receive ratio that is in line with typical interactive sessions and not video sessions. If the standard deviation of the inter-packet timing is also higher for these sessions, then you can rule them out as an individual of interest immediately. This has the advantage of focusing your investigation, avoids encroaching on privacy unnecessarily, and saves time by allowing you to focus on the users that have network sessions with characteristics that fit the behaviour you are looking for.

For those of you that feel comfortable because the data is ‘encrypted’, that comfort can be a false sense of security. These are two of the many metrics and theorems that can be used on the data. This area has active research and there are many products that will do this type of analysis in an automated fashion. For those interested in this, although older now, this is a great paper where an experiment was conducted to determine what movie people were watching even though the movie data was encrypted. They used behavioural data to fingerprint the movies, then applied the fingerprints to encrypted transmitted data.

COFEE, Forensics and Security via Obscurity

November 9th, 2009 Clear2Go No comments

Anyone in the digital forensics community will have heard today’s big story: Microsoft’s live forensic toolkit called COFEE has been leaked (pun intended) onto the Internet. Normally this would not be big news, but since it was supposedly designed for “Law enforcement only” it is being reported on and discussed widely.

I remember when this was announced. Like many, I was able to obtain some factual information on COFEE ‘unofficially’ through a few contacts. If you take a bunch of open source and freeware programs and wrap them up in a pretty GUI based system that lets you create profiles to control which of these programs are run, in what order, and with what switches, that is COFEE. You can then load a particular profile or profiles on a USB key. You insert the USB key into the target and COFEE runs the requested commands and options in the profile (assuming auto-run is enabled; if not, you can manually start COFEE). The output is saved and you can view it in a simple reporting package that organizes the information hierarchically by type: REGISTRY, POLICY, MEMORY, PASSWORDS and other categories. Of course you have to have user access to the system for COFEE to work, ideally an administrative level account.

One of the ‘selling points’ was that any untrained officer could run COFEE on a target system and not have to understand what they are doing. If the investigation does go to court, it will be expected that chain of custody, documentation, and due diligence are all taken care of. More importantly, I could see the lawyer for the defense saying something to the effect of “Let me see if I understand this. Officer Joe here, who has no knowledge of digital forensics, ran COFEE on the target system unsupervised. Officer Joe, are you sure the process list is complete? There were no hidden processes that are not being shown? Are you certain that you obtained every active user running on the system? How are you sure? Are you certain you have a copy of all areas of memory and nothing was missed?” The key to digital forensics is not the tool. It is having an understanding of what data is being extracted, how it is being extracted, what the extracted data means, and being able to explain what might have been missed or might be inaccurate and why. This requires knowledge and training.

However, my biggest issue with COFEE has always been the “law enforcement only” type of approach. It never works. The software will eventually get out. It is just another play on “Security via Obscurity”. Why restrict it to law enforcement? The only argument I have heard to support that is that if the tools get out, the anti-forensics community will figure out a way around them so they don’t work. This type of research and software deployment is alive and well and has been for some time. I even received training on how to fool forensic memory acquisition software in 2007 at Blackhat. To be honest, if law enforcement is investigating a breach at a nuclear plant or some other critical infrastructure, I really hope they use many other publicly available tools for their investigation instead of COFEE, and individuals that know what they are doing. Personally, I’d feel more confident in them. Of the few lawyers and law enforcement officials I do know, I have not heard of an untrained officer using COFEE on a system to date as the primary source of data gathering. That is not scientific, but I hope it is a sign that they are smarter than that.

Categories: Forensics Tags:

What is my daughter up to on the Internet, part I

October 25th, 2009 Clear2Go No comments

My daughter has recently become much more interested in some of the social networking sites such as Facebook and YouTube. This is a little concerning for my wife and me. We encourage her to use technology as much as possible, but at the same time there is an inherent risk. There is software you can purchase and install that will download the latest list of bad sites and look for questionable URLs and even questionable pictures, but I didn’t want to move to this level just yet. She is not running Windows.

The problem became how could I use some standard networking tools to passively monitor what she is up to on the Internet? I made some basic assumptions.  First, I am only interested in HTTP for now.  Second, I want to extract the sites she visits and do not care about the data that is returned at this point.

We have a Linux box that acts as our gateway to the Internet, so that seemed like the best place to deploy the solution. The first thing was to create a regular expression (regex) that will examine each packet that leaves our internal network and look for commands from the HTTP protocol specification. Any packets matching this will be saved for future analysis. The regex I created is:

^([Gg][Ee][Tt]|[Pp][Oo][Ss][Tt]|[Hh][Ee][Aa][Dd]|[Pp][Uu][Tt]|[Dd][Ee][Ll][Ee][Tt][Ee]|[Tt][Rr][Aa][Cc][Ee]|[Oo][Pp][Tt][Ii][Oo][Nn][Ss]|[Cc][Oo][Nn][Nn][Ee][Cc][Tt])\x20.*[Hh][Tt][Tt][Pp]\x2f\x31\x2e

This regex looks for any packet that begins with an HTTP 1.x command such as GET, POST, HEAD, PUT, DELETE, TRACE, OPTIONS, or CONNECT. The command is followed by a space, the requested resource, and then the HTTP version, HTTP/1. I am aware the regex could be made more optimal. I chose to not worry about it as this format makes it easier to explain and understand if you are not familiar with regular expressions. For those with DPI experience, there are more complex and accurate ways to detect HTTP. For example, ipoque, the company that initiated opendpi.org, released some “demo code” that shows some of the ways deep packet inspection (DPI) works. You can run the demo code on any pre-saved capture files you have and it will attempt to inform you of the protocols that are in the capture file. If you look at their code for HTTP detection, they have a multi-stage approach that looks at both sides of the flow to determine if the protocol is in fact HTTP. Any vendors selling DPI equipment today should be taking this type of approach to protocol detection when possible. However, for the purposes of determining what an individual is doing, I feel this is overkill. If the situation was a company that was ‘suspicious’ of an employee and just wanted to investigate, simple solutions are better. If criminal activity was found and the data goes to court, you want to be able to explain how you gathered the data, why it is valid and what it means. Keep the explanation as simple as possible in these potential circumstances.

The only missing piece is to restrict the capture to TCP packets egressing from a particular computer (in this case my daughter’s). This can be accomplished by adding a Berkeley Packet Filter (BPF) expression to ngrep, which will pre-filter the packets prior to the application of the regular expression. The final command I deployed was:

ngrep -O ./httpWatch1.cap -d eth1 -tq -Wbyline '^([Gg][Ee][Tt]|[Pp][Oo][Ss][Tt]|[Hh][Ee][Aa][Dd]|[Pp][Uu][Tt]|[Dd][Ee][Ll][Ee][Tt][Ee]|[Tt][Rr][Aa][Cc][Ee]|[Oo][Pp][Tt][Ii][Oo][Nn][Ss]|[Cc][Oo][Nn][Nn][Ee][Cc][Tt])\x20.*[Hh][Tt][Tt][Pp]\x2f\x31\x2e' 'src host 10.1.1.40 and tcp'

This records to a file called httpWatch1.cap all packets arriving on my internal interface eth1 that contain an HTTP 1.x command, are TCP, and come from my daughter’s computer. The screen shot below of the first few packets shows what you can expect throughout the file.

HTTP capture, first few packets
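For readers more comfortable with a scripting language than with ngrep, the intent of the filter (a payload that starts with an HTTP method, followed by a space, a request target, and an HTTP/1.x version) can be sketched in Python. This is only an illustration of the matching logic on payload bytes you have already extracted. The function name is hypothetical, and it uses a case-insensitive flag instead of the bracketed character classes.

```python
import re

# Method at the very start, a space, the request target, then "HTTP/1."
HTTP_REQUEST = re.compile(
    rb"^(GET|POST|HEAD|PUT|DELETE|TRACE|OPTIONS|CONNECT)\x20.*HTTP/1\.",
    re.IGNORECASE,
)


def is_http_request(payload):
    """Return True if a TCP payload looks like the start of an HTTP 1.x request."""
    return HTTP_REQUEST.match(payload) is not None


if __name__ == "__main__":
    samples = [
        b"GET /index.html HTTP/1.1\r\nHost: www.example.com\r\n\r\n",
        b"HTTP/1.1 200 OK\r\n",  # a response, not a request
    ]
    for payload in samples:
        print(is_http_request(payload), payload[:24])
```

Like the ngrep expression, this only checks the first line of the payload and makes no attempt to validate the rest of the request.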

I let it capture for approximately 8 days.  In the next few days I will post how to take the data in this file and manipulate it to extract the information I am looking for.

Categories: Forensics, monitoring Tags:

DNS versus HTTP_GET for a forensic investigation

July 14th, 2009 Clear2Go No comments

Back in May I was asked to give a presentation to law enforcement.  The presentation is here.  Since then, I have been asked to clarify the advantage of using DNS as opposed to HTTP when conducting an investigation.  It is not that one would use DNS instead of HTTP, but use DNS first to assess if further investigation into HTTP and other protocols is warranted. I will use the same example I used in the presentation to explain.

When a browser or application (hereafter I will just use ‘browser’) goes to a website, it almost always does a DNS request first for the site that the user is looking for. The DNS request is basically asking “What server houses the site I am looking for?” In the simplest case, the browser makes a DNS request for an address based on the bookmark, link, or address entered. A DNS response to the request comes back with the IP address. The browser then connects to the IP address and asks for the particular URL.

The capture below shows the DNS request and response in green and the request for the URL in blue. In this particular example, the user requested to go to http://www.facebook.com. Frame 1 shows the DNS request for www.facebook.com and frame 2 shows the response from the DNS server indicating the browser should connect to 69.63.180.15. Then in frames 3, 4, and 5 you see the connection to 69.63.180.15 and finally, in frame 6, the request for the root web page.

DNS lookup and HTTP get of www.facebook.com


Frames 7 and on are the data being transferred, along with the other HTTP GETs made. In fact, that one request for a web page generated many HTTP GET requests. You can see all the HTTP GET requests for http://www.facebook.com in the capture below.

Facebook homepage all HTTP GET requests


During an investigation, if you initially capture the HTTP requests, it is a lot harder to walk through each one and determine what the request is asking, what the response is, and whether each request has relevance to the investigation. It can be done, but it is more work and more time. While this effort may be necessary, often at the beginning of an investigation you want to determine first if further investigation is required. Suppose you are investigating an individual suspected of selling stolen items on eBay. If you never see a DNS request for eBay or another auction site from that user, it may not make any sense to investigate further and could be a waste of time. Maybe you have the wrong individual.

When I have been asked to determine what a particular user, employee, or service is doing, I usually start with DNS. By extracting what the subject was trying to look up in DNS, you can quickly compile a time-line of sites and applications they were using. From this data, you can determine if you need to investigate further and, if so, what applications, sites, and protocols you should focus on. I find this allows me to focus my investigation more easily, and not waste time looking at data that is not relevant to the investigation.
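As a rough sketch of what compiling such a time-line can involve, the following Python decodes the looked-up name from raw DNS query payloads. It assumes you have already extracted (timestamp, UDP payload) pairs from a capture. The function names are hypothetical, and it only handles the first question in a standard query, following the DNS message layout of a 12-byte header followed by length-prefixed labels.

```python
import struct


def dns_query_name(payload):
    """Return the first question name from a raw DNS query, or None."""
    if len(payload) < 12:
        return None
    _ident, flags, qdcount = struct.unpack("!HHH", payload[:6])
    if flags & 0x8000 or qdcount == 0:  # QR bit set means a response
        return None
    labels = []
    pos = 12  # the question section starts after the fixed header
    while pos < len(payload):
        length = payload[pos]
        if length == 0:  # zero-length label terminates the name
            break
        if length & 0xC0:  # compression pointer, not expected in a question
            return None
        pos += 1
        label = payload[pos:pos + length]
        if len(label) < length:
            return None
        labels.append(label.decode("ascii", errors="replace"))
        pos += length
    else:
        return None  # ran off the end without finding a terminator
    return ".".join(labels).lower()


def dns_timeline(packets):
    """Return sorted (timestamp, name) pairs for every DNS query seen."""
    timeline = []
    for timestamp, payload in packets:
        name = dns_query_name(payload)
        if name:
            timeline.append((timestamp, name))
    return sorted(timeline)


if __name__ == "__main__":
    # Hand-built query for www.facebook.com (type A, class IN).
    query = (
        struct.pack("!HHHHHH", 0x1234, 0x0100, 1, 0, 0, 0)
        + b"\x03www\x08facebook\x03com\x00"
        + struct.pack("!HH", 1, 1)
    )
    print(dns_timeline([(1247580000.0, query)]))
```

Sorting by timestamp gives the ordered list of names the subject tried to resolve, which is usually enough to decide where to look next.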

DNS has a few other advantages too. It is not encrypted, so it is easy to analyze. It is the standard directory of the Internet and is used by most if not all applications and services. While I acknowledge that a serious ‘anti-forensic’ individual or group might set up and deploy infrastructure to avoid detection via DNS, such as a VPN tunnel or their own DNS services for sites and applications where they wish to not be easily tracked, this is not typical behaviour and would be the exception, not the rule.

Categories: Forensics Tags:

Forensic extraction of files from a browser memory cache

June 18th, 2009 Clear2Go No comments
photo courtesy of http://www.flickr.com/photos/dotlizard/3577921340/


I was doing some network research and came across a site I had not seen before that streamed music.  Similar to my previous investigation with another site, this site was  playing the music, yet the network activity had already stopped.

mp3 file transfer during audio stream play

Checking my network history monitor, the music file transfer had completed in about 15 seconds. As with the previous investigation, I ran the lsof command on the web browser process to see what files were being accessed. There were no files that related to any media file. Here was an application actively playing a song for which there was no network activity and no files listed as open by the application. This piqued my curiosity.

No network sockets open moving data and no files open on the file system, and the song is playing away. That left memory as the only option for where the file could be located. Firefox has the ability to show what is currently in its memory cache, so I started there.

ff memory cache

By opening a new tab and entering “about:cache” in the address bar, you will get a list of cache devices. Selecting the memory cache brought me to this page. At the top, you can see a 5MB file from the site where I was listening to the music. Right click on the entry and select ‘Save link as’. Give it a file name and save. Firefox will produce an XHTML file.

Opening this file in your favourite text editor, you can see it contains a bunch of HTML tags, as well as  a complete memory dump in ASCII format of the file.  The memory dump is what we are interested in.

ff memory dump in xhtml

We need to extract the ASCII representation of the binary file. To do this, you want to search for ‘00000000:’ which is the beginning of the binary data that was used by the browser application. I am using ‘vi’ above, but any editor with search and replace will work. Delete everything prior to this number, so that the first line in the file is the line containing the ‘00000000:’.

ff start of memory dump

Above, you can see the start of the memory dump.  You want to delete everything prior to the start of the memory dump.

firefox end of memory dump

Finally there are a few HTML tags at the end of the memory dump that you need to remove as well.  Once you have done that, save the file as a text file.  The file should just contain lines that have a memory offset and a series of hexadecimal numbers.

In order to get data in the text file ready to be converted into a binary file, we have to remove the memory offset column.  This is the first column of numbers up to and including the ‘:’.  To do this, I passed the file through a program called ‘awk’ and gave awk instructions to remove the first column.

firefox output, extract hexadecimal values and remove memory offset

The command

cat untouchedMem.part | awk '{print $2" "$3" "$4" "$5" "$6" "$7" "$8" "$9" "$10" "$11" "$12" "$13" "$14" "$15" "$16" "$17}' > untouched.hex

takes the file called ‘untouchedMem.part’, removes the memory offset column and puts the results in a new file called ‘untouched.hex’. This file can now be converted into a binary file. To do this I used the command ‘xxd’. ‘xxd’ is a *nix command that can take a binary file and create a hex dump of it, or do the reverse. In this case we want the reverse.

convert hexdump to binary using xxd

The result from the command

xxd -p -r untouched.hex untouched.mp3

creates a binary file called ‘untouched.mp3’ from the ASCII hex dump file ‘untouched.hex’.  Select your favourite mp3 player and play the file.  You should be listening to the complete music file as transmitted to your desktop during the streaming.  The process outlined here is not limited to music files.  It will work for any binary file that is kept in the browser memory.
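For readers without awk or xxd to hand, the two steps (dropping the offset column, then turning the hexadecimal values back into bytes) can be approximated in Python.  This is a sketch under assumptions rather than a drop-in replacement: it assumes each row looks like ‘offset: up to sixteen two-digit hex values, then an ASCII column’, and the function names are hypothetical.

```python
import re

# A token counts as a byte only if it is exactly two hex digits.
HEX_BYTE = re.compile(r"^[0-9a-fA-F]{2}$")


def dump_row_to_bytes(row, width=16):
    """Convert one 'offset: hh hh ... ascii' row into raw bytes."""
    # Drop the offset column: everything up to and including the first ':'.
    _, _, rest = row.partition(":")
    data = bytearray()
    for token in rest.split()[:width]:
        if not HEX_BYTE.match(token):
            # Assumed to be the ASCII column of a short final row.
            break
        data.append(int(token, 16))
    return bytes(data)


def dump_to_bytes(rows):
    """Join the bytes of every dump row, in order."""
    return b"".join(dump_row_to_bytes(row) for row in rows)
```

Writing the result with `open("untouched.mp3", "wb")` would give the same kind of file as the xxd step.  One limitation to keep in mind: on a short final row, ASCII text that happens to look like two hex digits would be misread as data, so the tail of the file is worth checking by hand.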

Many system investigations involve immediately pulling the plug on the target system so as to preserve as much of the current state as possible in non-volatile storage.  However, this is a simple example of where you would lose potential data.  The concept of live forensics tries to solve this problem by extracting data from a live system.  There are ways to image memory while the system is running, and there are also ways to fool the software doing the imaging, so one has to be careful.  In a full-scale investigation, one would use tools to image as much of the memory as possible, not just the browser memory.  But for smaller investigations, this type of procedure would suffice as long as the process is properly documented as it is carried out.

I have not yet tried this on Windows, but I suspect it would work.  Most *nix tools have a Windows variant.  In the case of xxd, Hextools does the same for Windows.  There are browser tools you can add to do this extraction automatically, such as Orbit.  From a forensic and explanatory perspective, doing it by hand provides a more detailed understanding, and you have documented each step that occurred.  This makes the result easier to justify as well as to understand.

Categories: Forensics Tags:

Extracting audio and video from Imeem and other flash sites

June 7th, 2009 Clear2Go No comments
http://www.flickr.com/photos/soldiersmediacenter/1039179706/


The other evening I was working on my laptop with imeem.com running in the background.  At one point I needed a change, so I grabbed a quick trace file of imeem transferring a video for playback.  The transfer finished quite quickly, and although the video was playing, most of it had yet to be played.  Obviously it had to be stored on disk somewhere, and my browser was accessing it.  I ran the list open files command ‘lsof | grep -i firefox’ to see which files Firefox had open.  The result was many open files.  A few in the temporary directory (/tmp) caught my attention, so I filtered on them.


What interested me was this line:

firefox   6887        mike   82u      REG        8,1 54674048  892940 /tmp/FlashaxZz4P

I copied that file to my videos directory, then selected the file and opened it with VLC.  As expected, it is a flash file containing the video I was previously watching.
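If you need to pick such entries out of a long lsof listing, the line above can be broken into its fields programmatically.  The Python sketch below assumes the default lsof column order (COMMAND, PID, USER, FD, TYPE, DEVICE, SIZE/OFF, NODE, NAME) with no blank columns, and the helper names are invented for illustration.  The second helper builds the /proc path through which Linux exposes a process’s open file descriptors; that part is Linux-specific.

```python
from collections import namedtuple

OpenFile = namedtuple(
    "OpenFile", "command pid user fd ftype device size node name"
)


def parse_lsof_line(line):
    """Split one complete lsof row into named fields."""
    # maxsplit=8 keeps any spaces inside the file name intact.
    parts = line.split(None, 8)
    if len(parts) != 9:
        raise ValueError("unexpected lsof row: %r" % line)
    command, pid, user, fd, ftype, device, size, node, name = parts
    return OpenFile(command, int(pid), user, fd, ftype, device,
                    int(size), int(node), name)


def proc_fd_path(entry):
    """Return the /proc path for the descriptor (Linux only)."""
    # The FD column is a number plus mode letters, e.g. "82u".
    fd_number = "".join(ch for ch in entry.fd if ch.isdigit())
    return "/proc/{}/fd/{}".format(entry.pid, fd_number)
```

For the line shown above this yields the path /tmp/FlashaxZz4P, a size of 54674048 bytes, and the descriptor path /proc/6887/fd/82, either of which could be handed to a copy command while the browser tab is still open.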

Subsequent investigation at some other sites revealed that this behaviour is not specific to imeem but comes from the flash player itself.  It works for music, video, and any other type of flash file.  If you close the browser window, the file is ‘deleted’, so if you do want to copy it, you have to do so before closing the browser tab.  I haven’t checked, but I suspect that any of the standard forensics tools would be able to extract the file even if it was ‘deleted’.  Finally, the video or music starts playing while the download is still in progress, so you have to be sure the file has finished downloading before you copy it.

Given that imeem allows you to play a video or song as often as you wish, I don’t really know why someone would bother copying the video or music for general watching.  From an evidence perspective, however, I could see wanting to copy exactly what a subject was watching or listening to and putting those files into evidence in case the file becomes unavailable, changes location, or the subject claims that it was not what they were seeing or listening to.  A copy of the file along with the network trace of the file request, submitted with appropriate documentation and hashes, would be useful in these cases.
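As an illustration of the hashing step, here is a short Python sketch that computes a SHA-256 digest of a copied file.  The choice of SHA-256 and the function name are my assumptions; use whatever algorithm your evidence procedures call for.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large media files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Recording the digest at the time of the copy, alongside the network trace, lets anyone later confirm that the file in evidence has not changed.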

I am not sure how this would work on a Microsoft Windows system, given that temporary files are handled differently there, but I may investigate that later to see if there are similar results.

Categories: Forensics Tags:

DNS forensics and working with service providers

May 29th, 2009 Clear2Go No comments

I had the privilege yesterday of speaking to some law enforcement personnel and forensics experts.  The topic was DNS forensics, the SSL server_name option, and working with service providers.  I enjoyed the opportunity.  I really like talking about network forensics and being surrounded by smart people who are experts in their field.  It also allows me to practice my public speaking, which is always good.

The DNS section of the presentation was based on my earlier two posts on DNS analysis, which are here and here.  The SSL server_name section was based on my post that is here.  I have not yet posted about “Working with service providers”, but I have been engaged with service providers all over the world consistently for almost 5 years, so I spoke about my experiences and thoughts.

The presentation slides are here.

Categories: Forensics, law enforcement, monitoring Tags: