Archive - Forensics

Extracting audio and video from Imeem and other flash sites

http://www.flickr.com/photos/soldiersmediacenter/1039179706/

The other evening I was working on my laptop with imeem.com running in the background. At one point I wanted a change, so I grabbed a quick trace file of imeem transferring a video for playback. The transfer finished quite quickly: the video was playing, but most of it had yet to be played. Obviously it had to be stored on disk somewhere, with my browser accessing it. I ran the list-open-files command and filtered for Firefox (‘lsof | grep -i firefox’). The result was many open files. A few in the temporary directory (/tmp) caught my attention, so I filtered on them.

[Screenshot: lsof output for firefox, filtered on /tmp]

What interested me was this line:

firefox   6887        mike   82u      REG        8,1 54674048  892940 /tmp/FlashaxZz4P

I copied that file to my videos directory, selected it, and opened it with VLC. As expected, it is a Flash video file containing the video I had been watching.

Subsequent investigation at some other sites revealed that this is not specific to imeem; it is how the Flash player itself works. It applies to music, video, and any other type of Flash media. If you close the browser window or tab, the file is ‘deleted’, so if you want to copy it you have to do so before closing the tab. I haven’t checked, but I suspect that any of the standard forensics tools would be able to extract the file even after it was ‘deleted’. Finally, the video or music starts playing while the download is still in progress, so you have to be sure the file has finished downloading before you copy it.
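As a rough illustration of the steps above, here is a minimal Python sketch that picks the Flash temporary files out of lsof output and copies one while recording a SHA-256 hash. The function names, the ‘.flv’ suffix, and the assumption that the files always match /tmp/Flash* are mine, based only on the single lsof line shown earlier. It does not check whether the download has finished.

```python
import hashlib
import shutil
from pathlib import Path


def flash_temp_files(lsof_output, command="firefox"):
    """Return (pid, size, path) for regular /tmp/Flash* files held open by `command`."""
    hits = []
    for line in lsof_output.splitlines():
        # lsof columns: COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
        fields = line.split(None, 8)
        if len(fields) < 9:
            continue
        cmd, pid, _user, _fd, ftype, _dev, size, _node, name = fields
        if (cmd.lower().startswith(command)
                and ftype == "REG"
                and name.startswith("/tmp/Flash")):
            hits.append((int(pid), int(size), name))
    return hits


def copy_with_hash(src, dest_dir):
    """Copy `src` into `dest_dir` (assumed suffix .flv) and return (copy path, SHA-256)."""
    dest = Path(dest_dir) / (Path(src).name + ".flv")
    shutil.copy2(src, dest)
    digest = hashlib.sha256(dest.read_bytes()).hexdigest()
    return dest, digest
```

Feeding it the output of ‘lsof’ captured as text gives a short list to choose from; running flash_temp_files twice a few seconds apart and comparing the sizes is one simple way to guess whether the download is still growing.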

Given that imeem allows you to play a video or song as often as you wish, I don’t really know why someone would bother copying the video or music for general watching. From an evidence perspective, though, I could see wanting to copy exactly what a subject was watching or listening to and putting those files into evidence in case the file becomes unavailable, changes location, or the subject claims that it was not what they were seeing or listening to. A copy of the file along with the network trace of the file request, submitted with appropriate documentation and hashes, would be useful in these cases.

I am not sure how this would work on a Microsoft Windows system, given that temporary files are handled differently there, but I may investigate that later to see if there are similar results.

DNS forensics and working with service providers

I had the privilege yesterday of speaking to some law enforcement personnel and forensics experts. The topics were DNS forensics, the SSL server_name option, and working with service providers. I enjoyed the opportunity. I really like talking about network forensics and being surrounded by smart people who are experts in their field. It also lets me practice my public speaking, which is always good.

The DNS section of the presentation was based on my two earlier posts on DNS analysis, which are here and here. The SSL server_name section was based on my post that is here. I have not posted about “Working with service providers” yet, but I have been engaged with service providers all over the world consistently for almost 5 years, so I spoke about my experiences and thoughts.

The presentation slides are here.

Using DNS to determine when someone is home — DNS analysis, Part II

Last month, I did a quick write up on a DNS trace that I had extracted.  The trace was all the DNS queries that left my house over a few days.  Using that same trace, I noticed that there were many queries to the domain of my employer.   This in itself was not unusual, but one particular query caught my eye:

2009-02-08 05:34:02.680383 IP 216.240.7.12.58684 > 208.67.222.222.53: 30554+ A? ap-1.sandvine.com. (35)
2009-02-08 05:34:03.037603 IP 208.67.222.222.53 > 216.240.7.12.58684: 30554 1/0/0 A 216.16.234.191 (51)

This query happened every 10-20 minutes. Tracing it back, I realized it was coming from my mobile phone. This got me thinking: could one determine when I was or was not home with just access to a DNS trace? To answer that, I did a bit of investigation of the name ap-1.sandvine.com.

mike@Janel:~/investigation/homeDns$ dig @ns1.domainmonger.com ap-1.sandvine.com

; <<>> DiG 9.5.0-P2 <<>> @ns1.domainmonger.com ap-1.sandvine.com
; (1 server found)
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 36335
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 0
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;ap-1.sandvine.com. IN A

;; ANSWER SECTION:
ap-1.sandvine.com. 60 IN A 216.16.234.191

;; AUTHORITY SECTION:
sandvine.com. 60 IN NS ns1.domainmonger.com.
sandvine.com. 60 IN NS ns2.domainmonger.com.

;; Query time: 92 msec
;; SERVER: 216.98.150.33#53(216.98.150.33)
;; WHEN: Sun Apr 12 12:29:19 2009
;; MSG SIZE rcvd: 100

mike@Janel:~/investigation/homeDns$

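From the output above, the record for ap-1.sandvine.com has a time-to-live (TTL) of 60 seconds. Since the queries show up only every 10-20 minutes, my mobile is not re-resolving the name each time the record expires; it either ignores the TTL or only looks the name up when it needs it. While interesting to know, it doesn’t help answer my question.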

I extracted all queries for ap-1.sandvine.com along with the timestamp of each and quickly plotted them with gnuplot. Next, I pulled my calendar and daily logs and added notes to the graph. The y-axis is irrelevant. The red dots show when the queries were made, and the green arrows and notes are my comments based on my calendar and logs.

A third party could easily determine when I was or was not home with a high degree of certainty.    With mobile phones now having wi-fi capabilities and connecting to the local wireless network it becomes trivial to use them as a vector to determine when someone is home or not.  I ran the same analysis on my wife’s mobile and got similar results (I didn’t add them to the chart here).
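To make the analysis a little more concrete, here is a minimal Python sketch of the same idea without the plot: pull the query timestamps for one name out of tcpdump text output like the two lines shown earlier, then list the long gaps between consecutive queries. The function names and the 30-minute threshold are assumptions of mine, chosen only because the phone queried every 10-20 minutes while it was on the home network.

```python
from datetime import datetime, timedelta


def query_times(tcpdump_text, name):
    """Return the timestamps of A queries for `name` found in tcpdump text output."""
    times = []
    for line in tcpdump_text.splitlines():
        # Queries look like "... 30554+ A? ap-1.sandvine.com. (35)";
        # responses carry no "A?" and are skipped.
        if f" A? {name}." in line:
            stamp = " ".join(line.split()[:2])
            times.append(datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f"))
    return times


def quiet_periods(times, max_gap=timedelta(minutes=30)):
    """Return (start, end) pairs where consecutive queries are more than `max_gap` apart."""
    ordered = sorted(times)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_gap]
```

Each pair returned by quiet_periods is a stretch with no queries from the phone, which is a candidate for ‘not home’ (or phone switched off); it still needs to be checked against something like the calendar notes described above.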

Obviously you could use other protocols and do a much more detailed analysis and correlation (or just execute standard physical surveillance), but DNS is a good choice because it is required for the Internet to function, is standardized, and is not encrypted. This was a relatively simple exercise and reasonably cost effective. I am not a lawyer, but I suspect, based on the ongoing privacy debate and some recent court decisions, that DNS queries executed by an individual or a business might be considered ‘public’ with no expectation of privacy. I’d argue that with access to DNS information from a particular entity, one could glean interesting information about a competing company.

Anti-Forensics – not as easy as once thought

image by wchulseiee (http://www.flickr.com/photos/wchulseiee/2427418216/)

My laptop is pretty secure. I am not silly enough to think that it is 100% secure or that no one could get into it, but relative to most laptops out there it’s not too bad. There are weaknesses due to time or software requirements, but I think I am aware of most of them. I don’t encrypt the operating system (yet), but all data partitions are encrypted. It has been configured with the goal that all sensitive data and metadata (web browser, IM, video, audio, cache, bookmarks) is encrypted. Once data is no longer ‘required’ it is stored on the servers at the office and then ‘wiped’ off the encrypted drives at regular intervals. All metadata is wiped from the encrypted drives each weekend, which leaves at most one week of metadata, assuming an attacker can get into the encrypted drives to view it. The main reason for all this is to protect customer data. Like others in my industry, I work with institutions and their data. In many cases that data can be politically, financially, or image ‘sensitive’ if it were to get into the wrong hands. Should my laptop ever be stolen, I want to at least make it difficult for an attacker to gain access to the data in a reasonable period of time.

Imagine my surprise when, while re-configuring my laptop, I discovered that my deleted-file metadata had somehow been reset to write to a different location, on an unencrypted area of my drive. The following is a partial view of the files I discovered. The files went back as far as November 2008.

Trash Meta Directory on laptop

These are standard text files with information about each file that was deleted.   The information includes the original file location as well as a timestamp indicating when the file was deleted.

Trash meta data file details

Even though the actual data files were not present, there is a lot of information here. Just from the data contained in the files above, one could easily determine the names of files worked on, their importance, the directory structure of the encrypted partitions, the date each file was deleted, and more. You could very easily put together a timeline of a customer, the projects being worked on, and the dates of project activity. That is useful information that can be sold, or used to a competing company’s or party’s advantage in court, in a bid, or in a competitive product or service.
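To show how little effort such a timeline takes, here is a minimal Python sketch. It assumes the metadata files follow the freedesktop.org Trash layout (a ‘[Trash Info]’ section with ‘Path’ and ‘DeletionDate’ keys), which matches the description above but is not confirmed by the screenshots; the function names are hypothetical.

```python
import configparser
from datetime import datetime
from urllib.parse import unquote


def parse_trashinfo(text):
    """Return (original path, deletion time) from one trash metadata file's text."""
    # interpolation=None because paths are percent-encoded and contain '%'.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    info = parser["Trash Info"]
    return unquote(info["Path"]), datetime.fromisoformat(info["DeletionDate"])


def deletion_timeline(texts):
    """Return (path, deletion time) pairs sorted from oldest to newest deletion."""
    return sorted((parse_trashinfo(t) for t in texts), key=lambda entry: entry[1])
```

Reading every metadata file in the directory and passing the contents to deletion_timeline yields the directory names, file names, and dates in order, which is exactly the kind of picture an attacker would want.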

There is a lot of ‘negativity’ with Anti-Forensics lately, especially in the forensics community.   Although I understand and appreciate the problems and concerns they have, I believe anti-forensics is necessary and a good thing.   It all depends on who is using it and why.  Needless to say, I have fixed the problem with my laptop, and ‘double checked’ my drive encryption and scripts to ensure correct execution.

TLS/SSL data leakage

If you ask most people about TLS or SSL, they understand that it has something to do with ‘securing’ information that is on the Internet. People with a networking background will understand it as an encrypted session which encrypts everything above layer 5, effectively user data. In the case of HTTP, this would include the URL that a user was requesting, such as https://www.tdcanadatrust.com. I was looking at a network capture file recently and was shocked to find, in the initial client hello packet at the start of the SSL session, the name of the server I was accessing in clear text: www.tdcanadatrust.com.

You can see the server name in the SSL client hello packet. The client hello is the first part of the initial SSL handshake sequence, sent when an application attempts to establish an SSL session.

Using Wireshark and digging a little deeper, I found it is classified as an ‘Extension’ labeled ‘server_name’.

It appears to be one of the acceptable extensions for SSL.  A quick check of the RFC revealed that it is an optional addition that applications such as a browser can add to the SSL negotiation process.

<snip>
2.2. Extended Server Hello

The extended server hello message format MAY be sent in place of the
server hello message when the client has requested extended
functionality via the extended client hello message specified in
Section 2.1.

……

In order to provide the server name, clients MAY include an extension
of type “server_name” in the (extended) client hello.  The
“extension_data” field of this extension SHALL contain
“ServerNameList” where:

struct {
NameType name_type;
select (name_type) {
</snip>

As it turns out, this functionality was added to permit virtual hosting of SSL/TLS enabled sites. Without it, every site requires a unique IP address. Given that reasoning, I expect it to become commonplace in the future. One can argue that with the destination IP address of a network flow (which is not encrypted), determining which site a user is visiting is trivial when each IP address maps to a single SSL application. By that argument, adding the server_name option is no different, and hence there are no added privacy concerns. While I agree with this, the extension makes it much easier to automate statistics and monitoring of network flows.

The main point to keep in mind is that although your data is still encrypted, TLS/SSL still reveals the sites you visit.
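To illustrate how easy that automation is, here is a minimal Python sketch that reads the server name straight out of the raw bytes of a client hello. It assumes the bytes start at the TLS record header and that the whole client hello fits in one record; the function name is my own, and real traffic would first need TCP reassembly.

```python
import struct


def client_hello_server_name(record):
    """Return the host name in a ClientHello's server_name extension, or None."""
    try:
        # Record type 0x16 is handshake; handshake type 0x01 is ClientHello.
        if record[0] != 0x16 or record[5] != 0x01:
            return None
        pos = 5 + 4                     # record header + handshake header
        pos += 2 + 32                   # client version + random
        pos += 1 + record[pos]          # session id (1-byte length prefix)
        (suites_len,) = struct.unpack_from("!H", record, pos)
        pos += 2 + suites_len           # cipher suites
        pos += 1 + record[pos]          # compression methods
        (ext_total,) = struct.unpack_from("!H", record, pos)
        pos += 2
        end = min(pos + ext_total, len(record))
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack_from("!HH", record, pos)
            pos += 4
            if ext_type == 0:           # extension type 0 is server_name
                # Layout: list length (2), name type (1), name length (2), name.
                name_type = record[pos + 2]
                (name_len,) = struct.unpack_from("!H", record, pos + 3)
                if name_type != 0:      # 0 is host_name
                    return None
                start = pos + 5
                return record[start:start + name_len].decode("ascii", errors="replace")
            pos += ext_len
    except (IndexError, struct.error):
        return None                     # truncated or carries no extensions
    return None
```

Nothing here needs a key or any decryption, which is the point: the name is sent before encryption starts.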

Extracting audio from last.fm

Since I have been listening to last.fm lately and just recently pulled a capture file for analysis, I was wondering if audio extraction would work in the case of an investigation. It turns out that the procedure I wrote up back in October works well. The end result is a directory of files containing the streamed audio from last.fm, which can be played as standard mp3 files.

Extracting picture files from network streams

Part of my work and interest involves investigation and analysis of network traffic for one reason or another. These tasks fall under the larger umbrella of network forensics. Given the growth of the Internet and the transition from stored to streaming media, both video and audio, the ability to perform analysis on network traffic is becoming more important. New products are coming out that do this. For example, the company I work for has a product that assists service providers in responding to law enforcement requests. These products have many features, but one common feature is the ability to capture raw network traffic or packets from a live network and write them to a file or set of files for later analysis. Capturing raw network traffic is nothing new. It has been around for years and is often used by network administrators, researchers, and many others. The advent of network analysis products for service providers and law enforcement brings the basic abilities of traffic capture and analysis to a much wider and in some cases less technical audience.

What methodology, tools, and procedure would one use to determine what is happening in a particular trace file where web browsing is present? Let’s use a simple example for illustration purposes. Assume you are investigating an individual who is suspected of brokering the sale of known stolen items. This individual visits the websites of their partners in crime. Using a specific product or several open source software packages, you capture the target’s traffic to and from their system. You now have files containing the packets of data that were received and transmitted by the target’s system. Besides extracting the URLs, passwords, and other information, it would be nice to get a list of the graphics and pictures contained in this capture, as it is suspected that pictures of the stolen items are typically sent to and received from potential buyers.

An open source application called Driftnet is designed to extract graphics and photographs from network traffic by ‘spying’ on the data that is transmitted or received via a network interface. In its default mode of operation, Driftnet ‘listens’ on a live network connection and displays graphic images as they pass by on the network. I have seen and used this software to covertly ‘spy’ on targets attached to a network and reveal, in near real time, any graphics that are viewed on the target desktops. Here is how to use it on a network capture file to extract the graphics inside the file.

On my *nix system, I start up two terminal sessions. In the first session, I start Driftnet and configure it to listen on the loopback interface of my laptop. The loopback interface is an interface like any other, but it does not communicate on the network; it is only visible to, and only communicates with, services running on the local device that are bound to it.
We start Driftnet and tell it to listen on the loopback interface, use adjunct mode, and write the graphics out to a specified directory. Adjunct mode tells Driftnet not to open a window on the console to display the graphics but instead to write them to a storage device. The directory option tells Driftnet where to write the graphics it finds.

Now we need to replay the network capture file on the loopback interface. By doing this, Driftnet will ‘see’ the flows and extract the graphics, writing them to the directory we specified.

Tcpreplay is an open source program designed to replay a captured network file out a specified interface. In a second terminal window we run tcpreplay, specify the packets should be replayed out the loopback interface, and specify the network capture file containing the packets to be replayed.
If tcpreplay completes successfully, you will see some status output similar to the screen capture information above. In the terminal window where Driftnet was started, text lines will scroll by listing the graphics it has found and the filenames it is using to write them to storage.
Once the capture file has been fully replayed on the loopback interface, you can simply browse to the directory you specified for the graphics files. There you can browse through the graphics that were part of the target’s traffic.
This technique is simple and can give you a general idea of what an individual or group of individuals is viewing from a graphical perspective. This allows an assessment to be made of whether the investigation needs to go further. If further investigation is warranted, the procedure, generated graphics, and capture files can be digitally fingerprinted and documented for use as evidence.
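The two-terminal procedure above can also be scripted. The following is a minimal Python sketch, assuming driftnet and tcpreplay are installed and the script has the privileges both tools need; ‘-i’ selects the interface, ‘-a’ is adjunct mode, and ‘-d’ is the output directory. The function names and the two-second settle delay are arbitrary choices of mine.

```python
import subprocess
import time


def driftnet_command(out_dir, interface="lo"):
    """Build the Driftnet command: listen on `interface`, adjunct mode, save to `out_dir`."""
    return ["driftnet", "-i", interface, "-a", "-d", out_dir]


def tcpreplay_command(pcap, interface="lo"):
    """Build the tcpreplay command that replays `pcap` out of `interface`."""
    return ["tcpreplay", "-i", interface, pcap]


def extract_images(pcap, out_dir, settle=2.0):
    """Replay `pcap` on loopback while Driftnet writes any images it sees to `out_dir`."""
    listener = subprocess.Popen(driftnet_command(out_dir))
    try:
        time.sleep(settle)              # give Driftnet time to start listening
        subprocess.run(tcpreplay_command(pcap), check=True)
        time.sleep(settle)              # let Driftnet finish writing the last images
    finally:
        listener.terminate()
        listener.wait()
```

The output directory needs to exist before Driftnet starts. Keeping the command builders separate from the function that runs them makes it easy to log the exact commands used as part of the documented procedure.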

* network photograph courtesy of IssacMao

Obtaining a mms video stream for analysis

A friend of mine sent me an e-mail which contained a file called “Technology.wvx”. The file was 328 bytes in size. Opening the file played a video mash-up which was obviously larger than 328 bytes. I was also curious as to what a “.wvx” file was.

Looking at the file showed it to be an XML-format file with a reference to a separate URL:

mms://a215.v47369f.c47369.g.vm.akamaistream.net/7/215/47369/v0001/sonybmgsftp.download.akamai.com/34732/promommxnonflash/GMM_Rome_DidYouKnow_300.wmv


MPlayer is a very powerful and robust movie player. Besides supporting a multitude of file formats (MPEG, VOB, AVI, ASF, and WMV, to list a few), it is often able to play damaged files. Although I primarily use *nix as my operating system, MPlayer is available for Windows, Mac, and other operating systems, making this process available on those platforms as well. The MPlayer source code is also available for those wishing to compile it.

Mplayer has many features which are beyond the scope of this post, but one nice feature is the ability to read in a raw stream and write it to a file. The two parameters we used to tell mplayer to read the stream and write the file to disk were:

-dumpstream: Dump the raw stream without making any conversions or changes to it. In our case the stream is the URL from the ‘Technology.wvx’ file above.

-dumpfile: The filename to dump the stream to. I chose ‘s.wmv’ for this example.

The full command used was:
mplayer -dumpstream "mms://a215.v47369f.c47369.g.vm.akamaistream.net/7/215/47369/v0001/sonybmgsftp.download.akamai.com/34732/promommxnonflash/GMM_Rome_DidYouKnow_300.wmv" -dumpfile s.wmv

MPlayer will output a number of messages. This version printed several error messages during the process, but these did not affect the final video file. The result was a local file called ‘s.wmv’ which, when opened in a video player, played back the video and audio nicely.
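For anyone who wants to script these two steps, here is a minimal Python sketch that pulls the stream references out of a .wvx file and builds the mplayer command shown above. It assumes the references appear as href="..." attributes, as is usual in this kind of playlist; a regular expression is used rather than an XML parser because such files are not always well-formed. The function names and the sample URL in the note are hypothetical.

```python
import re

# Match href="..." regardless of case, since the tag and attribute case varies.
HREF = re.compile(r'href\s*=\s*"([^"]+)"', re.IGNORECASE)


def stream_urls(wvx_text):
    """Return every URL referenced by an href attribute in the playlist text."""
    return HREF.findall(wvx_text)


def dump_command(url, out_file="s.wmv"):
    """Build the mplayer command that saves the raw stream at `url` to `out_file`."""
    return ["mplayer", "-dumpstream", url, "-dumpfile", out_file]
```

For a playlist containing a single reference such as mms://example.invalid/clip.wmv, passing dump_command(stream_urls(text)[0]) to subprocess.run performs the same capture as the command line above, with the URL passed as one argument so no shell quoting is needed.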

The ability to save streaming media is necessary and has many valid uses. Being able to play it when the Internet is not available is one simple example. A better example is investigations. From an investigative point of view, you want to be able to save the actual data for evidence purposes. Investigations can take time, and often you have no control over the server that streamed the data. The video stream could be removed, or the server or URI could suddenly change. By properly documenting your activities, including timestamp information, trace files, log captures, appropriate hashes, and the procedure used to obtain and verify the video stream, evidence can be provided to interested parties with reasonable assurance that it is accurate.
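As one small piece of that documentation, here is a minimal Python sketch that records a hash, size, source, and UTC timestamp for a saved file. The function name and the field names are my own choices, not any evidentiary standard, and the timestamp is when the record was made, not when the stream was captured.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def evidence_record(path, source_url):
    """Return a dict describing the saved file: source, size, SHA-256, and record time."""
    data = Path(path).read_bytes()
    return {
        "file": str(path),
        "source": source_url,
        "size_bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
        "recorded_utc": datetime.now(timezone.utc).isoformat(),
    }
```

Calling evidence_record on ‘s.wmv’ with the mms URL right after the dump, and storing the result alongside the trace files and notes, gives a later reviewer a way to confirm the file has not changed.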

In the future, as content on the Internet goes from a ‘download and play’ scenario to a ‘video streaming on demand’ scenario, forensically finding evidence on a target device will become more difficult, simply because the data isn’t stored on the device. There may be evidence of it in cache, swap files, and the like, but these can be overwritten quickly, and software is getting smarter. Most browsers and players have the option not to cache if told to do so. Smart people create a ‘secure cache or swap area’. In this case the caches and swap files are configured to write to encrypted disks or partitions using file systems that do not have journaling. These are then wiped prior to shutdown. Smarter people boot from a read-only USB key and create a ‘secure cache’. Using the technique above, combined with proper documentation of the process, gives reasonable proof that the file you have captured is what the target was viewing.

Network Forensics – Extracting audio, video and other binary data from capture files

I remember a few consulting gigs years ago where I was required to extract binary data such as Microsoft Word documents, audio, and video files from network captures. The process was quite involved, using a sniffer, hex editors, base64 decoders, and other software to accomplish the task. Today, there are many commercial and freely available pieces of software that hide the process involved in conducting these activities.

Assuming you have access to a network stream, either in your corporate network or at an ISP via a warrant and you capture the network data of a particular subject, how do you review the binary data contained in the capture? Let’s assume that you are profiling a subject and they are visiting myspace briefly and appear to be listening to an audio track that is streaming from a server to the subject.

To accomplish this, I often use a utility called Chaosreader. Chaosreader is an older utility written in Perl, but I find it still does a good job of extracting binary data from a network capture in the standard pcap format. The other benefit is that, since it is written in Perl, the code can be reviewed to understand technically how this is accomplished.

To keep this post focused and easy to understand, I trimmed the capture file to contain just the portion where the user connected to myspace to start listening to the audio file in question, but this is not a requirement in a normal investigation.
Running Chaosreader on the capture file is a simple step. A summary of the files created from the capture file is listed and it creates an index.html file which you can point your browser to.
Looking at the resulting output, it is obvious the two files of interest are the session_0005.part_01.data and the session_0005.www.html. In a large capture with many sessions it is easier to view the index.html that was generated.
Viewing the index.html file with your browser will show a chart that breaks down each of the sessions listing a timestamp, duration, 5 tuple, service, data size transmitted, and links to the files that are associated with the flow.

The session we are interested in is session 5. It is by far the largest and will be the audio file that was being streamed to the subject. What we do not know is what type of data was streamed. Was it wmv, mpeg, or some other format? Selecting the as_html link for session 5 displays a text version of the file, including headers.

Here we can see two blocks of text. The first block in red shows the subject (client) requesting the resource to be streamed to them. The second block of text in blue is the data response to the request from the server. The header information is transmitted prior to the data which informs the client of the data that is about to be transmitted, then the binary data is transmitted to the client. Specifically if you look at the ‘Content-Type‘ header, the data format is ‘audio/mpeg‘.

Armed with this information, we simply rename the file session_0005.part_01.data, which contains the binary stream, to something more meaningful with an extension that matches the content type. I used .mpg; .mp3 is the usual extension for ‘audio/mpeg’ data.
Select your preferred media player and play the file.
Keep in mind that, depending on the quality of the network connection, there is sometimes minor ‘noise’ in the output due to retransmits that happen on the network. Chaosreader provides other information not discussed here. I encourage anyone interested to experiment with it and other software available via the open source community.
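The header check and rename can be sketched in a few lines of Python. This is a minimal example, assuming you have the text of the server’s response headers (for instance copied from the as_html view); the function names and the small extension table are hypothetical and cover only a few common types.

```python
from pathlib import Path

# A deliberately small, hand-picked mapping; extend it as needed.
EXTENSIONS = {
    "audio/mpeg": ".mp3",
    "video/x-ms-wmv": ".wmv",
    "video/x-flv": ".flv",
    "image/jpeg": ".jpg",
}


def content_type(response_head):
    """Return the lower-cased Content-Type value from raw HTTP response headers."""
    for line in response_head.splitlines()[1:]:     # skip the status line
        if not line.strip():
            break                                   # blank line ends the headers
        name, _, value = line.partition(":")
        if name.strip().lower() == "content-type":
            return value.split(";")[0].strip().lower()
    return None


def rename_by_content_type(data_file, response_head):
    """Rename `data_file` with the extension for its Content-Type; return the resulting path."""
    suffix = EXTENSIONS.get(content_type(response_head))
    if suffix is None:
        return Path(data_file)                      # unknown type: leave the file alone
    target = Path(data_file).with_suffix(suffix)
    Path(data_file).rename(target)
    return target
```

In an investigation it is safer to work on a copy of the extracted file and leave the original Chaosreader output untouched.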

Surveillance of people

I came across this article. It is a great synopsis of how easy it is to track the location of someone using their own mobile phone. Third party companies are popping up to offer services like this. How do they do it? It is easy, since some service providers are selling location data to anyone who wants it. What interested me about the article is that it highlights how security analysis is changing. If you look at many of the current research papers and projects, they involve using statistical data to determine patterns and what a particular user or group of users is up to. This removes the need for signatures, and can also yield useful information even if encryption is present.

Some key statements in the article that caught my attention:

  • Anyone can, for instance, sign up – at £29.99 a year – to mapAmobile.com (‘you’ll always know where your loved ones are’), which allows you to follow the movements of your ‘family and friends’ on a computer screen
  • That this sort of enterprising solution is possible is the result of the major networks – in the UK, Vodafone, Orange, O2 and T-Mobile – having decided, in around 2002, to sell their location data to any company willing to pay for it.
  • the information your phone provides is out there anyway. It doesn’t belong to you, and anyone with the required resources can do with it what they will.
  • Everyone on a network, he said, is part of a group; most groups talk to other groups, creating a spider’s web of interactions.
  • The remaining groups ranged in size from two to 142 subscribers. Members of these groups only ever called each other – clear evidence of antisocial behaviour – and, in one extreme case, a group was identified in which all the subscribers only ever called a single number at the centre of the web. This section of the ThorpeGlen presentation ended with one word: ‘WHY??’
  • It also sells ‘profiling’ systems, which measure the behaviour pattern of an individual subscriber and, using statistical analysis, determine whether that same pattern is now appearing from another source.

A recent example of this type of research is the Switzerland project which is currently in alpha at the time of this post. This is an open source project designed to detect when service providers modify or change subscriber packets before letting them continue on in the network.

Another research project was able to detect what movie you were watching via a Slingbox even though it was encrypted.
