2022-04-17 19:56:26 +02:00
parent 02d8abd2cd
commit db2a4c42d2
4 changed files with 58 additions and 50 deletions
--- a/acronyms.tex
+++ b/acronyms.tex
@@ -28,11 +28,6 @@
  long  = {Domain Generation Algorithm}
 }

-\DeclareAcronym{dns}{
-  short = {DNS},
-  long  = {Domain Name System}
-}
-
 \DeclareAcronym{iot}{
  short = {IoT},
  long  = {Internet of Things}
--- a/content.tex
+++ b/content.tex
@@ -1,15 +1,6 @@
 %{{{ introduction
 \section{Introduction}

-TODO: problemstellung, forschungsfragen
-
-%}}} introduction
-
-
-%{{{ motivation
-\clearpage{}
-\section{Motivation}
-
 The Internet has become an irreplaceable part of our day-to-day lives.
 We are always connected via numerous \enquote{smart} and \ac{iot} devices.
 We use the Internet to communicate, shop, handle financial transactions, and much more.
@@ -19,24 +10,26 @@ Many personal and professional workflows are so dependent on the Internet, that
 In 2021, there were around 10 billion Internet connected \ac{iot} devices and this number is estimated to more than double over the next years up to 25 billion in 2030~\cite{bib:statista_iot_2020}.
 Many of these devices run on outdated software, don't receive regular updates, and don't follow general security best practices.
 While in 2016 only \SI{77}{\percent} of German households had a broadband connection with a bandwidth of \SI{50}{\mega\bit\per\second} or more, in 2020 it was already \SI{95}{\percent} with more than \SI{50}{\mega\bit\per\second} and \SI{59}{\percent} with at least \SI{1000}{\mega\bit\per\second}~\cite{bib:statista_broadband_2021}.
-Their nature as small, always online devices---often without any direct user interaction---behind Internet connections that are getting faster and faster makes them a desirable target for botnet operators.
+Their nature as small, always online devices---often without any direct user interaction---behind Internet connections that are getting faster and faster makes them a desirable target for \emph{botnet} operators.
+
+A \emph{botnet} is a network of malware-infected computers, called \emph{bots}, controlled by a \emph{botmaster}.
+Botnets are controlled via a \emph{\ac{c2} channel}.
+The communication patterns of a \ac{c2} channel can be \emph{centralized}, \emph{decentralized} or \emph{distributed}.
+Centralized or decentralized botnets use one or more coordinating hosts, which the bots contact to receive new commands.
+Distributed botnets create a \emph{\ac{p2p}} network as their communication layer.
+The \ac{c2} channel for centralized and decentralized botnets can use anything from \ac{irc} over HTTP to Twitter~\cite{bib:pantic_covert_2015}.
+
 In recent years, \ac{iot} botnets have been responsible for some of the biggest \ac{ddos} attacks ever recorded---creating up to \SI{1}{\tera\bit\per\second} of traffic~\cite{bib:ars_ddos_2016}.
+Bots are also abused for other malicious activities, such as banking fraud, proxying to hide the attacker's identity, and sending spam emails, to name just a few.

-%}}} motivation
-
-\clearpage{}
-\section{Background and Related Work}
-
-Botnets consist of infected computers, so called \emph{bots}, controlled by a \emph{botmaster}.
-\emph{Centralized} and \emph{decentralized botnets} use one or more coordinating hosts, called \emph{\ac{c2} servers}, respectively.
-These \ac{c2} servers can use any protocol from \ac{irc} over HTTP to Twitter~\cite{bib:pantic_covert_2015} as communication channel with the infected hosts.
-The abuse of infected systems includes several activities---\ac{ddos} attacks, banking fraud, proxies to hide the attacker's identity, sending of spam emails, just to name a few.
-
-Analyzing and shutting down a centralized or decentralized botnet is comparatively easy since the central means of communication (the \ac{c2} IP addresses or domain names, Twitter handles or \ac{irc} channels), can be extracted from the malicious binaries or determined by analyzing network traffic and can therefore be considered publicly known.
-
+The constantly growing damage produced by botnets has many researchers and law enforcement agencies trying to shut down these operations~\cite{bib:nadji_beheading_2013, bib:nadji_still_2017, bib:dittrich_takeover_2012, bib:fbiTakedown2014}.
 A coordinated operation with help from law enforcement, hosting providers, domain registrars, and platform providers could shut down or take over the operation by changing how requests are routed or simply shutting down the controlling servers/accounts.

-To complicate take-down attempts, botnet operators came up with a number of ideas: \acp{dga} use pseudorandomly generated domain names to render simple domain blacklist-based approaches ineffective~\cite{bib:antonakakis_dga_2012} or fast-flux \ac{dns} entries, where a large pool of IP addresses is randomly assigned to the \ac{c2} domains to prevent IP based blacklisting and hide the actual \ac{c2} servers~\cite{bib:nazario_as_2008}.
+The monetary value of these botnets directly correlates with the amount of effort botmasters are willing to put into implementing defense mechanisms against take-down attempts.
+Botnet operators came up with a number of ideas: \acp{dga} use pseudorandomly generated domain names to render simple domain blacklist-based approaches ineffective~\cite{bib:antonakakis_dga_2012}, and fast-flux DNS entries randomly assign a large pool of IP addresses to the \ac{c2} domains to prevent IP-based blacklisting and hide the actual \ac{c2} servers~\cite{bib:nazario_as_2008}.
+Analyzing and shutting down a centralized or decentralized botnet is comparatively easy since the central means of communication (the \ac{c2} IP addresses or domain names, Twitter handles or \ac{irc} channels) can be extracted from the malicious binaries or determined by analyzing network traffic and can therefore be considered publicly known.
+A number of botnet operations were taken down by shutting down the \ac{c2} channel~\cite{bib:nadji_beheading_2013} and as the defenders upped their game, so did attackers---the concept of \ac{p2p} botnets emerged.
+The idea is to build a distributed network without \acp{spof} in the form of \ac{c2} servers as shown in \Fref{fig:p2p}.

 %{{{ fig:c2vsp2p
 \begin{figure}[h]
@@ -55,30 +48,48 @@ To complicate take-down attempts, botnet operators came up with a number of idea
 \end{figure}
 %}}}fig:c2vsp2p

-A number of botnet operations were shut down like this~\cite{bib:nadji_beheading_2013} and as the defenders upped their game, so did attackers---the concept of \ac{p2p} botnets emerged.
-The idea is to build a distributed network without \acp{spof} in the form of \ac{c2} servers as shown in \Fref{fig:p2p}.
+This lack of a \ac{spof} makes \ac{p2p} botnets more resilient to take-down attempts since there is no easy way to stop the communication and botmasters can easily rejoin the network and send new commands.
+
+Taking down a \ac{p2p} botnet requires intricate knowledge of the botnet's characteristics, \eg{} size, risk, distribution over IP subnets or geolocations, network topology, participating peers and protocol characteristics.
+This knowledge can be obtained by monitoring the activity of known peers in the botnet.
+
+\todo{few words about monitoring}
+
+In this work, we will show how a collaborative system of crawlers and sensors can make the monitoring and information gathering phase more efficient and more resilient to detection, and how collaborative monitoring can help circumvent anti-monitoring techniques.
+
+%}}} introduction
+
+
+%%{{{ motivation
+%\clearpage{}
+%\section{Motivation}
+
+
+
+
+%%}}} motivation
+
+\clearpage{}
+\section{Background and Related Work}
+
+
 In a \ac{p2p} botnet, each node in the network knows a number of its neighbors and connects to those. Each of these neighbors has a list of neighbors of its own, and so on.
 The botmaster only needs to join the network to send new commands or receive stolen data, but there is no need for a coordinating host that is always connected to the network.
 Any of the nodes in \Fref{fig:p2p} could be the botmaster but they don't even have to be online all the time since the peers will stay connected autonomously.
 In fact, there have been arrests of \ac{p2p} botnet operators, but due to the autonomy offered by the distributed approach, the botnet stays intact and continues operating~\cite{bib:netlab_mozi}.
 Especially worm-like botnets, where each peer tries to find and infect other systems, can keep lingering for many years.

-This lack of a \ac{spof} makes \ac{p2p} botnets more resilient to take-down attempts since the communication is not stopped and botmasters can easily rejoin the network and send commands.
-
-Successful take-downs of a \ac{p2p} botnet require intricate knowledge over the network topology, protocol characteristics and participating peers.
-This knowledge can be obtained by monitoring peer activity in the botnet.

 Bots in a \ac{p2p} botnet can be split into two distinct groups according to their reachability: peers that are not publicly reachable (\eg{} because they are behind a \ac{nat} router or firewall) and those that are publicly reachable, also known as \emph{superpeers}.
-In contrast to centralized botnets with a fixed set of \ac{c2} servers, in a \ac{p2p} botnet, every superpeer might take the roll of a \ac{c2} server and \emph{non-superpeers} will connect to those superpeers when joining the network.
+In contrast to centralized botnets with a fixed set of \ac{c2} servers, in a \ac{p2p} botnet, every superpeer might take the role of a \ac{c2} server and \emph{non-superpeers} will connect to those superpeers when joining the network.

 As there is no well-known server in a \ac{p2p} botnet, the bots have to coordinate autonomously.
 This is achieved by connecting the bots among each other.
 Bot \textit{B} is considered a \emph{neighbor} of bot \textit{A}, if \textit{A} knows and connects to \textit{B}.
-Since bots can become unavailable, they have to consistently update their neighbor lists to avoid losing their connection into the botnet.
-This is achieved by periodically querying their neighbor's neighbors.
-This process is known as \emph{\ac{mm}}.
+Since bots can go offline or become unavailable (\eg{} because the system was shut down or the malware infection was detected and removed), they have to consistently update their neighbor lists to avoid losing their connection into the botnet.
+This is achieved by periodically querying their neighbors' neighbors in a process known as \emph{\ac{mm}}.

-\Ac{mm} can be distinguished into two categories: \emph{structured} and \emph{unstructured}~\cite{bib:baileyNextGen}
+\Ac{mm} can be distinguished into two categories: \emph{structured} and \emph{unstructured}~\cite{bib:baileyNextGen}.
 Structured \ac{p2p} botnets have strict rules for a bot's neighbors based on its unique ID and often use a \ac{dht}, which allows persisting data in a distributed network.
 The \ac{dht} could contain an ordered ring structure of IDs and neighborhood in the structure also means neighborhood in the network, as is the case in the Kademlia botnet~\cite{bib:kademlia2002}.
 In botnets that employ unstructured \ac{mm} on the other hand, bots ask any peer they know for new peers to connect to, in a process called \emph{peer discovery}.
@@ -145,14 +156,14 @@ Also getting access to the required datasets might not be possible for everyone.

 Like most botnet detection mechanisms, the passive ones work by building communication graphs and finding tightly coupled subgraphs that might be indicative of a botnet~\cite{bib:botgrep2010}. An advantage of passive detection is that it is independent of protocol details, specific binaries, or the structure of the network (\ac{p2p} vs.\ centralized/decentralized)~\cite{bib:botminer2008}.

-\begin{itemize}
+% \begin{itemize}

-  \item Large scale network analysis (hard to differentiate from legitimate \ac{p2p} traffic (\eg{} BitTorrent), hard to get data, knowledge of some known bots required)~\cite{bib:zhang_building_2014}
+%   \item Large scale network analysis (hard to differentiate from legitimate \ac{p2p} traffic (\eg{} BitTorrent), hard to get data, knowledge of some known bots required)~\cite{bib:zhang_building_2014}

-  \item Heuristics: Same traffic patterns, same malicious behaviour
+%   \item Heuristics: Same traffic patterns, same malicious behaviour

-\end{itemize}
-\todo{no context}
+% \end{itemize}
+% \todo{no context}

 Passive monitoring is only mentioned for completeness and not further discussed in this thesis.

@@ -177,6 +188,7 @@ Every entry \textit{E} in the peer exchange response received from bot \textit{A
 Therefore, edges should only be considered valid if at least one crawler or sensor was able to contact, or was contacted by, peer \textit{E}, thereby confirming that \textit{E} is an existing participant in the botnet.

 A sensor implements the passive part of the botnet's \ac{mm}.
+It is introduced into the network by crawlers or other sensors and waits for other peers to contact it.
 They cannot be used to create the botnet graph (only edges into the sensor node) or find new peers, but are required to enumerate the whole network, including non-superpeers.

 %}}} active detection
@@ -184,17 +196,18 @@ They cannot be used to create the botnet graph (only edges into the sensor node)
 %{{{ monitoring prevention
 \subsubsection{Anti-Monitoring Techniques}

-The constantly growing damage produced by botnets has many researchers and law enforcement agencies trying to shut down these operations~\cite{bib:nadji_beheading_2013, bib:nadji_still_2017, bib:dittrich_takeover_2012, bib:fbiTakedown2014}.
-The monetary value of these botnets directly correlates with the amount of effort botmasters are willing to put into implementing defense mechanisms against take-down attempts.

-Some of these countermeasures are explored by \citeauthor{bib:andriesse_reliable_2015} in \citetitle{bib:andriesse_reliable_2015} and include deterrence, which limits the number of bots per IP address or subnet; blacklisting, where known crawlers and sensors are blocked from communicating with other bots in the network (mostly IP based); disinformation, when fake bots are placed in the peer lists, to invalidate the data collected by crawlers; and active retaliation like \ac{ddos} attacks against sensors or crawlers~\cite{bib:andriesse_reliable_2015}.
+\citeauthor{bib:andriesse_reliable_2015} explore some monitoring countermeasures in \citetitle{bib:andriesse_reliable_2015}.
+These include deterrence, which limits the number of bots per IP address or subnet; blacklisting, where known crawlers and sensors are blocked from communicating with other bots in the network (mostly IP-based); disinformation, where fake bots are placed in the peer lists to invalidate the data collected by crawlers; and active retaliation like \ac{ddos} attacks against sensors or crawlers~\cite{bib:andriesse_reliable_2015}.
+

-In this work, we try to find ways to make the monitoring and information gathering phase more efficient and resilient to detection.

 %}}} monitoring prevention

 %}}} detection techniques

+
+
 %%{{{ detection criteria
 %\subsection{Detection Criteria}

@@ -218,7 +231,7 @@ The implementation of the concepts of this work will be done as part of \ac{bms}
 \footnotetext{\url{https://github.com/Telecooperation/BMS}}
 \Ac{bms} is intended for a hybrid active approach of crawlers and sensors (reimplementations of the \ac{p2p} protocol of a botnet, that won't perform malicious actions) to collect live data from active botnets.

-In an earlier project, we implemented different node ranking algorithms---among others \emph{PageRank}~\cite{bib:page_pagerank_1998} and \emph{SensorRank}---to detect sensor candidates in a botnet, as described in \citetitle{bib:karuppayah_sensorbuster_2017}.
+In an earlier project, we implemented different graph ranking algorithms---among others \emph{PageRank}~\cite{bib:page_pagerank_1998} and \emph{SensorRank}---to detect sensor candidates in a botnet, as described in \citetitle{bib:karuppayah_sensorbuster_2017}.
 Both ranking algorithms exploit the differences in a sensor's or crawler's \(\deg^+\) and \(\deg^-\) to weight the nodes.
 Sensors will have few or no outgoing edges, since they don't participate actively in the botnet, while crawlers have only outgoing edges.
 Another way to enumerate candidates for sensors in a \ac{p2p} botnet is to find \acp{wcc} in the graph.
--- a/metadata.tex
+++ b/metadata.tex
@@ -1,4 +1,4 @@
-\title{Collaborative Monitoring of Fully Distributed Botnets}
+\title{Collaborative Monitoring of \Acs*{p2p} Botnets}
 % \title{Centralized Crawling of Decentralized Botnets\\
 % Collaborative Crawling of Decentralized Botnets\\
 % Centralized Crawling of P2P Botnets}
--- a/report.pdf
+++ b/report.pdf