Collecting and analyzing Tor exit node traffic

This is a thesis for a professional degree at advanced level from Blekinge Tekniska Högskola/Faculty of Computer Sciences; Blekinge Tekniska Högskola/Department of Computer Science

Abstract: Background. With Internet usage increasing across the world, journalists, dissidents, and criminals have moved their operations online, and in turn governments and law enforcement have increased surveillance of their countries’ networks. This has increased the popularity of programs that mask users’ identities online, such as the Tor Project. By encrypting the traffic and routing it through several nodes, Tor hides the user’s identity. But how are Tor users utilizing the network, and is any of the traffic sent in plain text despite the dangers of doing so? How has the usage of Tor changed compared to 11 years ago?

Objectives. The objective of the thesis is to analyze captured Tor network traffic to reveal what data is sent through the network. The collected data is used to draw conclusions about Tor usage and is compared with previous studies.

Methods. Three Tor exit nodes are set up and operated for one week in the US, Germany, and Japan. We deploy packet sniffers that perform deep packet inspection on each traffic flow to identify attributes such as the application protocol, the number of bytes sent in the flow, and the content type if the traffic was sent in plain text. All stored data is anonymized.

Results. A total of 100.35 million flows were recorded, with 32.47% of them sending 4 or fewer packets in total. The most used application protocol was TLS, with 55.03% of total traffic. HTTP accounted for 15.91%, and 16% used unknown protocols. The country receiving the most traffic was the US, with over 45% of all traffic, followed by the Netherlands, the UK, and Germany, each the destination of less than 10% of recorded traffic. The most frequently used destination ports were 443 at 49.5%, 5222 at 12.7%, 80 at 11.9%, and 25 at 9.3%.

Conclusions. The experiment shows that it is possible to perform traffic analysis on the Tor network and acquire significant data. It shows that the Tor network is used around the world, with the US and Europe accounting for most of the traffic. As expected, there has been a shift from HTTP to HTTPS traffic compared to previous research. However, there is still unencrypted traffic on the network, some of which could be explained by automated tools such as web crawlers. Tor users need to be more aware of what traffic they send through the network, since a user with malicious intent can perform the same experiment and potentially acquire unencrypted sensitive data.
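
The per-flow statistics reported in the abstract (protocol shares, destination port shares, and the fraction of flows with 4 or fewer packets) could be computed roughly as in the following Python sketch. It is not the thesis authors' tooling. The `Flow` record, its field names, the keyed-hash anonymization, and the sample values are all assumptions made for illustration, and shares are computed by flow count, which may differ from how the thesis measures "total traffic".

```python
from collections import Counter
from dataclasses import dataclass
import hashlib
import hmac


@dataclass(frozen=True)
class Flow:
    """Hypothetical per-flow record; field names are illustrative only."""
    src_ip: str
    dst_port: int
    protocol: str   # application protocol label from a DPI step, e.g. "TLS"
    packets: int
    bytes_sent: int


def anonymize_ip(ip: str, key: bytes) -> str:
    """Replace an address with a truncated keyed hash (one possible
    anonymization scheme; the thesis does not specify its method here)."""
    return hmac.new(key, ip.encode("ascii"), hashlib.sha256).hexdigest()[:16]


def share(counter: Counter) -> dict:
    """Convert raw counts into percentages of the total count."""
    total = sum(counter.values())
    return {k: 100.0 * v / total for k, v in counter.items()} if total else {}


def summarize(flows) -> dict:
    """Aggregate flow records into the kind of shares the abstract reports."""
    flows = list(flows)
    short = sum(1 for f in flows if f.packets <= 4)
    return {
        "flows": len(flows),
        "protocol_share_pct": share(Counter(f.protocol for f in flows)),
        "port_share_pct": share(Counter(f.dst_port for f in flows)),
        "short_flow_pct": 100.0 * short / len(flows) if flows else 0.0,
    }


# Made-up sample data, not taken from the thesis.
key = b"example-secret-key"
sample = [
    Flow(anonymize_ip("192.0.2.1", key), 443, "TLS", 10, 5200),
    Flow(anonymize_ip("192.0.2.2", key), 443, "TLS", 3, 400),
    Flow(anonymize_ip("192.0.2.3", key), 80, "HTTP", 2, 300),
    Flow(anonymize_ip("192.0.2.4", key), 5222, "unknown", 20, 9000),
]
summary = summarize(sample)
print(summary)
```

Hashing addresses with a secret key before storage keeps flows from the same source linkable for counting while avoiding storage of the raw address; whether that is sufficient anonymization depends on the threat model.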
