Archive for February, 2009

DNS analysis – Part I

February 15th, 2009 Clear2Go 2 comments

I have been investigating DNS lately. I set up a capture of all DNS queries that left my house over a period of approximately six days. Three people in my house use the Internet in one way or another. Using some quick scripts I wrote, I extracted the query names from the captured DNS traffic. With this data as input to some graphing software, I created a couple of visualizations. The first is a standard word tag visualization, in which the larger a word appears, the more often it is referenced in the dataset.

What can you learn from a visualization such as this? Could you build a profile of the persons in this house just from their DNS queries? And if you can, what does it tell you? Twitter is obviously used in the house, as the largest number of references are made to ‘twitter’. ‘Sandvine’ also appears often. There are references to ‘mac’ and ‘apple’. ‘facebook’ is also large relative to the others. There are queries to ‘thepiratebay’. What do these all mean? What can we infer from them, and are our inferences accurate?

Here is the same dataset, this time using the full query names, visualized as a bubble graph.

From this visualization, ‘twitter.com’ and ‘search.twitter.com’ receive most of the queries, so it is safe to say that at least one person in this residence probably has an active Twitter account. ‘DC-2.sandvine.com’ shows that someone regularly looks up what is probably a domain controller for ‘sandvine.com’. If you were to infer from this that a Sandvine employee lives here, you would be correct. You cannot actually reach any of those servers without using a VPN, but because of the way DNS works, the queries for them often leak anyway.
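Both visualizations come down to frequency counts over the captured query names. The following is a minimal sketch of that counting step, not the scripts actually used for this post. The sample list, the `IGNORED_LABELS` set, and the function names are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical sample of captured query names; the real capture is not shown here.
SAMPLE_QUERIES = [
    "twitter.com",
    "search.twitter.com",
    "twitter.com",
    "www.facebook.com",
    "DC-2.sandvine.com",
]

# Labels assumed too generic to be interesting in a word tag view.
IGNORED_LABELS = {"www", "com", "net", "org", "ca"}


def normalize(name):
    """Lower-case a query name and drop the trailing root dot, if any."""
    return name.strip().lower().rstrip(".")


def full_name_counts(queries):
    """Count complete query names, as a bubble graph would weight them."""
    return Counter(normalize(q) for q in queries if q.strip())


def label_counts(queries):
    """Count individual labels (words), as a word tag view would weight them."""
    counts = Counter()
    for q in queries:
        for label in normalize(q).split("."):
            if label and label not in IGNORED_LABELS:
                counts[label] += 1
    return counts


if __name__ == "__main__":
    print(label_counts(SAMPLE_QUERIES).most_common(3))
    print(full_name_counts(SAMPLE_QUERIES).most_common(3))
```

Counting labels and counting full names give different views of the same data: the first highlights a brand such as ‘twitter’ across all of its hostnames, while the second keeps individual hosts such as ‘dc-2.sandvine.com’ visible.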

Over the next few weeks, I will be working with this data and the graphs above, along with other tools and DNS vectors, to determine what else can be inferred from DNS alone.

TLS/SSL data leakage

February 5th, 2009 Clear2Go No comments

If you ask most people about TLS or SSL, they understand that it has something to do with ‘securing’ information that is on the Internet. People with a networking background will understand it as an encrypted session which encrypts everything above layer 5, effectively the user data. In the case of HTTP, this would include the URL that a user was requesting, such as https://www.tdcanadatrust.com. I was looking at a network capture file recently, and was shocked to find the name of the server I was accessing, www.tdcanadatrust.com, in the client hello packet at the very start of the SSL session.

You can see the server name in the SSL client hello packet. The client hello is the first part of the initial SSL handshake sequence, sent when an application attempts to establish an SSL session.

Using Wireshark and digging a little deeper, I found it is classified as an ‘Extension’ labeled ‘server_name’.

It appears to be one of the acceptable extensions for SSL.  A quick check of the RFC revealed that it is an optional addition that applications such as a browser can add to the SSL negotiation process.

<snip>
.2. Extended Server Hello

The extended server hello message format MAY be sent in place of the
server hello message when the client has requested extended
functionality via the extended client hello message specified in
Section 2.1.

……

In order to provide the server name, clients MAY include an extension
of type “server_name” in the (extended) client hello.  The
“extension_data” field of this extension SHALL contain
“ServerNameList” where:

struct {
NameType name_type;
select (name_type) {
</snip>

As it turns out, this functionality was added to permit virtual hosting of SSL/TLS enabled sites. Without it, every such site requires its own unique IP address. With that reasoning, I expect it to become commonplace in the future. One can argue that when each IP address is mapped to a single SSL application, the destination IP address of a network flow (which is not encrypted) already makes it trivial to determine which site a user is visiting. By that argument, adding the server_name extension is no different, and hence there are no added privacy concerns. While I agree with this, the extension makes it much easier to automate statistics gathering and monitoring of network flows.

The main point to keep in mind is that although your data is still encrypted, TLS/SSL still reveals the sites you visit.

Categories: Forensics, Privacy / Anonymity