Forensic extraction of files from a browser memory cache

photo courtesy of http://www.flickr.com/photos/dotlizard/3577921340/

I was doing some network research and came across a music-streaming site I had not seen before.  As in my previous investigation of another site, the music kept playing even though the network activity had already stopped.

mp3 file transfer during audio stream play

Checking my network history monitor, I saw that the music file had finished downloading in about 15 seconds.  As in the previous investigation, I ran the lsof command on the web browser process to see what files were being accessed.  None of them related to a media file.  Here was an application actively playing a song with no network activity and no open files to account for it.  This piqued my curiosity.
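The lsof check described above could be sketched as a pair of small shell functions. This is only an illustration: the function names, the list of extensions and the process name `firefox` in the example are my assumptions, not something from the original session.

```shell
#!/bin/sh
# Filter lsof output (read from stdin) for paths that end in a common
# media extension. The extension list is illustrative, not exhaustive.
media_files() {
    grep -i -E '\.(mp3|m4a|ogg|flac|wav)( |$)' || true
}

# List the open files of one process and keep only media-looking ones.
# $1 is the process ID to inspect.
open_media_files() {
    lsof -p "$1" 2>/dev/null | media_files
}

# Example (assumes the browser process is named "firefox"):
#   open_media_files "$(pgrep -o firefox)"
```

An empty result matches what I saw: the browser had no media file open on the file system.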

There were no network sockets moving data and no files open on the file system, yet the song was playing away.  That left memory as the only place the file could be.  Firefox can show what is currently in its memory cache, so I started there.

ffMemoryCache1

By opening a new tab and entering “about:cache” in the address bar, you will get a list of cache devices.  Selecting the memory cache brought me to this page.  At the top, you can see a 5MB file from the site where I was listening to the music.  Right click on the entry and select ‘Save link as’.  Give it a file name and save.  Firefox will produce an XHTML file.

Opening this file in your favourite text editor, you can see it contains a bunch of HTML tags, as well as a complete memory dump of the file in ASCII format.  The memory dump is what we are interested in.

ff memory dump in xhtml

We need to extract the ASCII representation of the binary file.  To do this, search for ‘00000000:’, which marks the beginning of the binary data that was used by the browser application.  I am using ‘vi’ above, but any editor with search and replace will work.  Delete everything prior to this number, so that the first line in the file is the line containing ‘00000000:’.

ff start of memory dump

Above, you can see the start of the memory dump.  You want to delete everything prior to the start of the memory dump.

firefox end of memory dump

Finally there are a few HTML tags at the end of the memory dump that you need to remove as well.  Once you have done that, save the file as a text file.  The file should just contain lines that have a memory offset and a series of hexadecimal numbers.
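The manual trimming described above could also be done with a filter. The sketch below assumes that every dump line starts with an eight-digit hexadecimal offset followed by a colon, as ‘00000000:’ suggests; the function name and the file names in the example are placeholders. If the first dump line shares its line with an opening tag, this filter would drop it, so check the first line of the result by eye.

```shell
#!/bin/sh
# Keep only lines that start with an 8-digit hexadecimal offset followed
# by a colon, e.g. "00000000:". Everything else (the HTML before and
# after the dump) is dropped. Reads stdin, writes stdout.
dump_lines() {
    grep -E '^[0-9A-Fa-f]{8}:' || true
}

# Example (file names are placeholders):
#   dump_lines < saved-cache-entry.xhtml > untouchedMem.part
```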

In order to get data in the text file ready to be converted into a binary file, we have to remove the memory offset column.  This is the first column of numbers up to and including the ‘:’.  To do this, I passed the file through a program called ‘awk’ and gave awk instructions to remove the first column.

firefox output, extract hexadecimal values and remove memory offset

The command

cat untouchedMem.part | awk '{print $2" "$3" "$4" "$5" "$6" "$7" "$8" "$9" "$10" "$11" "$12" "$13" "$14" "$15" "$16" "$17}' > untouched.hex

takes the file called ‘untouchedMem.part’, removes the memory offset column and puts the results in a new file called ‘untouched.hex’.  This file can now be converted into a binary file.  To do this I used the command ‘xxd’.  ‘xxd’ is a *nix command that can take a binary file and create a hex dump of it, or do the reverse.  In this case we want the reverse.
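The column-stripping step can also be written as a loop instead of listing every field. This is a sketch under a few assumptions: the offset is the first whitespace-separated field, each line carries at most 16 bytes, and each byte is written as two hex digits. The function name `hex_columns` is mine. The loop stops at the first field that is not two hex digits, so a short final line does not usually leak its ASCII column into the output; it can still be fooled if that ASCII text happens to start with a two-character token that looks like hex.

```shell
#!/bin/sh
# Print only the hex byte columns of each dump line.
# Field 1 is the offset ("00000000:"); fields 2..17 hold up to 16 bytes.
# Stop early at the first field that is not exactly two hex digits.
hex_columns() {
    awk '{
        line = ""
        for (i = 2; i <= 17 && i <= NF; i++) {
            if ($i !~ /^[0-9A-Fa-f][0-9A-Fa-f]$/) break
            line = line (line == "" ? "" : " ") $i
        }
        print line
    }'
}

# Example:
#   hex_columns < untouchedMem.part > untouched.hex
```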

convert hexdump to binary using xxd

The result from the command

xxd -p -r untouched.hex untouched.mp3

creates a binary file called ‘untouched.mp3’ from the ASCII hex dump file ‘untouched.hex’.  Select your favourite mp3 player and play the file.  You should be listening to the complete music file as transmitted to your desktop during the streaming.  The process outlined here is not limited to music files.  It will work for any binary file that is kept in the browser memory.
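Before opening a player, a quick look at the first few bytes can show whether the conversion produced something plausible. The sketch below is a heuristic only, and the function name `looks_like_mp3` is hypothetical. It relies on two facts about MP3 files: many begin with an ID3v2 tag whose first three bytes are the letters "ID3", and a bare MPEG audio stream begins with a frame sync of eleven set bits (0xFF followed by a byte whose top three bits are set).

```shell
#!/bin/sh
# Return 0 if the file in $1 starts like an MP3 file, 1 otherwise.
# Heuristic only: a file can pass this check and still be damaged.
looks_like_mp3() {
    # First three bytes as lowercase hex with no spaces, e.g. "494433".
    head3=$(od -An -tx1 -N3 "$1" | tr -d ' \n')
    case "$head3" in
        494433*) return 0 ;;   # "ID3": an ID3v2 tag precedes the audio
        ff[ef]*) return 0 ;;   # MPEG audio frame sync (11 set bits)
        *)       return 1 ;;
    esac
}

# Example:
#   looks_like_mp3 untouched.mp3 && echo "header looks like MP3"
```

If the check fails, the usual suspects are leftover HTML in the text file or an offset column that was not removed.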

Many system investigations involve immediately pulling the plug on the target system so as to preserve as much of the current state as possible on non-volatile storage.  However, this is a simple example of where you would lose potential data, because whatever is held only in memory disappears with the power.  The concept of live forensics tries to solve this problem by extracting data from a live system.  There are ways to image memory while the system is running, and there are also ways to fool the software that does this, so one has to be careful.  In a full-scale investigation, one would use tools to image as much of the memory as possible, not just the browser memory.  But for smaller investigations, this type of procedure would suffice, as long as the process is properly documented as it is carried out.

I have not tried this on Windows yet, but I suspect it would work.  Most *nix tools have a Windows variant.  In the case of xxd, Hextools does the same for Windows.  There are browser tools you can add to do this extraction automatically, such as Orbit.  From a forensic and explanatory perspective, doing it this way provides a more detailed understanding, and you have documented the steps that occurred.  This makes the result easier to justify as well as understand.

  • Conrad

    This is a slightly more compact and simpler command, which already takes care of the editing and the first column.

    sed -n -e '//,//p' untouched.part | awk '{print $2" "$3" "$4" "$5" "$6" "$7" "$8" "$9" "$10" "$11" "$12" "$13" "$14" "$15" "$16" "$17}' | xxd -p -r > untouched.mp3

    • Clear2Go

      Hi Conrad,

      Thanks for the sed command to do the translation!
      -mike.

  • Conrad

    The above comment stripped out the pre tags. If the moderator could see them in the first sed command and put them back in, then the comment above would be useful…