Valentin Brandl 2022-03-29 20:39:29 +02:00
parent eb587029e9
commit e3b44f7331
4 changed files with 72 additions and 74 deletions

View File

@ -85,4 +85,9 @@
long = {network address translation}
}
\DeclareAcronym{md5}{
short = {MD5},
long = {Message-Digest Algorithm 5},
}
% vim: set filetype=tex ts=2 sw=2 tw=0 et :

View File

@ -11,22 +11,18 @@ Many personal and professional workflows are so dependent on the internet, that
The number of connected \ac{iot} devices is around 10 billion in 2021 and is estimated to be constantly growing over the next years up to 25 billion in 2030~\cite{bib:statista_iot_2020}.
Many of these devices run on outdated software, don't receive any updates, and don't follow general security best practices.
While in 2016 only \SI{77}{\percent} of German households had a broadband connection with a bandwidth of \SI{50}{\mega\bit\per\second} or more, in 2020 it was already \SI{95}{\percent} with more than \SI{50}{\mega\bit\per\second} and \SI{59}{\percent} with at least \SI{1000}{\mega\bit\per\second}~\cite{bib:statista_broadband_2021}\todo{graph as image?}.
This makes them an attractive target for botmasters since they are easy to infect, always online, behind internet connections that are getting faster and faster, and due to their nature as small devices, often without any direct user interaction, an infection can go unnoticed for a long time.
While in 2016 only \SI{77}{\percent} of German households had a broadband connection with a bandwidth of \SI{50}{\mega\bit\per\second} or more, in 2020 it was already \SI{95}{\percent} with more than \SI{50}{\mega\bit\per\second} and \SI{59}{\percent} with at least \SI{1000}{\mega\bit\per\second}~\cite{bib:statista_broadband_2021}.
Their nature as small devices---often without any direct user interaction---that are always online and behind internet connections that are getting faster and faster makes them a desirable target for botnets.
In recent years, \ac{iot} botnets have been responsible for some of the biggest \ac{ddos} attacks ever recorded---creating up to \SI{1}{\tera\bit\per\second} of traffic~\cite{bib:ars_ddos_2016}.
\todo{what is a bot? Infected systems. Malware. DGA, beispiele, tree vs graph}
A botnet is a network of infected computers with some means of communication to control the infected systems.
Classic botnets use one or more central coordinating hosts called \ac{c2} servers.
These \ac{c2} servers could use any protocol from \ac{irc} over \ac{http} to Twitter~\cite{bib:pantic_covert_2015} as communication channel with the infected hosts.
Abusive use of infected systems includes several things\todo{things = bad}---\ac{ddos} attacks, banking fraud, use as proxies to hide the attacker's identity, sending spam emails\dots{}
Analyzing and shutting down a centralized botnet is comparatively easy since every bot knows the IP address, domain name, Twitter handle or \ac{irc} channel the \ac{c2} servers are using.
Analyzing and shutting down a centralized botnet is comparatively easy since the central means of communication (the \ac{c2} IP address or domain name, Twitter handle or \ac{irc} channel) are publicly known.
A coordinated operation with help from law enforcement, hosting providers, domain registrars, and platform providers could shut down or take over the operation by changing how requests are rooted or simply shutting down the controlling servers/accounts.
A coordinated operation with help from law enforcement, hosting providers, domain registrars, and platform providers could shut down or take over the operation by changing how requests are routed or simply shutting down the controlling servers/accounts.
To complicate take-down attempts, botnet operators came up with a number of ideas: \acp{dga} use pseudorandomly generated domain names to render simple domain blacklist-based approaches ineffective~\cite{bib:antonakakis_dga_2012}, and fast-flux \ac{dns} randomly assigns IP addresses from a large pool to the \ac{c2} domains to prevent IP-based blacklisting~\cite{bib:nazario_as_2008}.
@ -35,33 +31,32 @@ To complicate take-down attempts, botnet operators came up with a number of idea
\centering
\begin{subfigure}[b]{.5\textwidth}
\centering
\includegraphics[width=1\linewidth]{c2.pdf}
\includegraphics[width=1\linewidth]{c2.drawio.pdf}
\caption{Topology of a \ac{c2} controlled botnet}\label{fig:c2}
\end{subfigure}%
\begin{subfigure}[b]{.5\textwidth}
\centering
\includegraphics[width=1\linewidth]{p2p.pdf}
\includegraphics[width=1\linewidth]{p2p.drawio.pdf}
\caption{Topology of a \ac{p2p} botnet}\label{fig:p2p}
\end{subfigure}%
\caption{Communication paths in different types of botnets}\label{fig:c2vsp2p}
\end{figure}
\todo{better image for p2p, really needed?}
%}}}fig:c2vsp2p
A number of botnet operations were shut down like this~\cite{bib:nadji_beheading_2013} and as the defenders upped their game, so did attackers\todo{too informal?}---the concept of \ac{p2p} botnets came up.
The idea is to build a decentralized network without \acp{spof} where the \ac{c2} servers are as shown in \autoref{fig:p2p}.
A number of botnet operations were shut down like this~\cite{bib:nadji_beheading_2013} and as the defenders upped their game, so did attackers---the concept of \ac{p2p} botnets emerged.
The idea is to build a decentralized network without \acp{spof} in the form of \ac{c2} servers as shown in \autoref{fig:p2p}.
In a \ac{p2p} botnet, each node in the network knows a number of its neighbors and connects to those; each of these neighbors has a neighbor list of its own, and so on.
Any of the nodes in \autoref{fig:p2p} could be the bot master but they don't even have to be online all the time since the peers will stay connected autonomously.
The bot master only need to join the network to send new commands or receive stolen data.
The bot master only needs to join the network to send new commands or receive stolen data.
This lack of a \ac{spof} makes \ac{p2p} botnets more resilient to take-down attempts since the communication is not stopped and botmasters can easily rejoin the network and send commands.
This lack of a \ac{spof} makes \ac{p2p} botnets more resilient to take-down attempts since the communication is not stopped and bot masters can easily rejoin the network and send commands.
The constantly growing damage produced by botnets has many researchers and law enforcement agencies trying to shut down these operations~\cite{bib:nadji_beheading_2013, bib:nadji_still_2017, bib:dittrich_takeover_2012, bib:fbi_takedown_2014}.
The monetary value of these botnets directly correlates with the amount of effort, botmasters are willing to put into implementing defense mechanisms against take-down attempts.
The constantly growing damage produced by botnets has many researchers and law enforcement agencies trying to shut down these operations~\cite{bib:nadji_beheading_2013, bib:nadji_still_2017, bib:dittrich_takeover_2012, bib:fbiTakedown2014}.
The monetary value of these botnets directly correlates with the amount of effort bot masters are willing to put into implementing defense mechanisms against take-down attempts.
Some of these countermeasures include deterrence, which limits the number of allowed bots per IP address or subnet to 1; blacklisting, where known crawlers and sensors are blocked from communicating with other bots in the network (mostly IP based); disinformation, when fake bots are placed in the neighborhood lists, which invalidates the data collected by crawlers; and active retaliation like \ac{ddos} attacks against sensors or crawlers~\cite{bib:andriesse_reliable_2015}.
\todo{source for constantly growing, position in text}
\todo{take-down? take down?}
Successful take-downs of a \ac{p2p} botnet require intricate knowledge of the network topology, protocol characteristics and participating peers.
%}}} motivation
@ -101,13 +96,12 @@ There are two distinct methods to map and get an overview of the network topolog
\subsubsection{Passive Detection}
For passive detection, traffic flows are analysed in large amounts of collected network traffic (\eg{} from \acp{isp}).
This has some advantages in that it is not possible for botmasters to detect or prevent data collection of that kind, but it is not trivial to distinguish valid \ac{p2p} application traffic (\eg{} BitTorrent, Skype, cryptocurrencies, \ldots) from \ac{p2p} bots.
This has some advantages in that it is not possible for bot masters to detect or prevent data collection of that kind, but it is not trivial to distinguish valid \ac{p2p} application traffic (\eg{} BitTorrent, Skype, cryptocurrencies, \ldots) from \ac{p2p} bots.
\citeauthor{bib:zhang_building_2014} propose a system of statistical analysis to solve some of these problems in~\cite{bib:zhang_building_2014}.
Also getting access to the required datasets might not be possible for everyone.
\todo{no context}
\todo{BotGrep (in zhang\_building\_2014)}
\todo{BotMiner (in zhang\_building\_2014)}
Like most botnet detection mechanisms, the passive ones work by building communication graphs and finding tightly coupled subgraphs that might be indicative of a botnet~\cite{bib:botgrep2010}. An advantage of passive detection is that it is independent of protocol details, specific binaries or the structure of the network (\ac{p2p} vs.\ centralized)~\cite{bib:botminer2008}.
\begin{itemize}
\item Large scale network analysis (hard to differentiate from legitimate \ac{p2p} traffic (\eg{} BitTorrent), hard to get data, knowledge of some known bots required)~\cite{bib:zhang_building_2014}
@ -115,6 +109,7 @@ Also getting access to the required datasets might not be possible for everyone.
\item Heuristics: Same traffic patterns, same malicious behaviour
\end{itemize}
\todo{no context}
%}}} passive detection
@ -122,7 +117,7 @@ Also getting access to the required datasets might not be possible for everyone.
\subsubsection{Active Detection}
In this case, a subset of the botnet protocol is reimplemented to place pseudo-bots or sensors in the network, which will only communicate with other nodes but won't accept or execute commands to perform malicious actions.
The difference in behaviour from the reference implementation and conspicuous graph properties (\eg{} high \(\deg^{+}\) vs.\ low \(\deg^{-}\)) of these sensors allows botmasters to detect and block the sensor nodes.
The difference in behaviour from the reference implementation and conspicuous graph properties (\eg{} high \(\deg^{+}\) vs.\ low \(\deg^{-}\)) of these sensors allow bot masters to detect and block the sensor nodes.
There are three subtypes of active detection:
@ -140,19 +135,19 @@ There are three subtypes of active detection:
%}}} detection techniques
%{{{ detection criteria
\subsection{Detection Criteria}
%%{{{ detection criteria
%\subsection{Detection Criteria}
\begin{itemize}
%\begin{itemize}
\item \ac{p2p} online time vs host online time
% \item \ac{p2p} online time vs host online time
\item neighbourhood lists
% \item neighbourhood lists
\item no/few \ac{dns} lookups; instead direct lookups from routing tables
% \item no/few \ac{dns} lookups; instead direct lookups from routing tables
\end{itemize}
%}}} detection criteria
%\end{itemize}
%%}}} detection criteria
%}}} introduction
@ -169,18 +164,19 @@ Both ranking algorithms use the \(\deg^+\) and \(\deg^-\) to weight the nodes.
Another way to enumerate candidates for sensors in a \ac{p2p} botnet is to find \acp{wcc} in the graph.
Sensors will have few to no outgoing edges, since they don't participate actively in the botnet.
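The two quantities mentioned above, \acp{wcc} and the number of outgoing edges per node, can be sketched as follows.
This is only an illustration under stated assumptions and not the detection algorithm of any specific botnet: nodes are assumed to be numbered \(0, \ldots, n-1\), and the names \mintinline{go}{Edge}, \mintinline{go}{weaklyConnectedComponents} and \mintinline{go}{outDegrees} are hypothetical.

```go
package main

import "fmt"

// Edge is a hypothetical directed edge between two node IDs in [0, n).
type Edge struct{ From, To int }

// weaklyConnectedComponents groups the nodes 0..n-1 into components,
// ignoring edge direction. It uses a union-find structure with path halving.
// Components are returned ordered by their smallest member.
func weaklyConnectedComponents(n int, edges []Edge) [][]int {
	parent := make([]int, n)
	for i := range parent {
		parent[i] = i
	}
	find := func(x int) int {
		for parent[x] != x {
			parent[x] = parent[parent[x]] // path halving
			x = parent[x]
		}
		return x
	}
	for _, e := range edges {
		a, b := find(e.From), find(e.To)
		if a != b {
			parent[a] = b
		}
	}
	groups := make(map[int][]int)
	for v := 0; v < n; v++ {
		r := find(v)
		groups[r] = append(groups[r], v)
	}
	var comps [][]int
	seen := make(map[int]bool)
	for v := 0; v < n; v++ {
		r := find(v)
		if !seen[r] {
			seen[r] = true
			comps = append(comps, groups[r])
		}
	}
	return comps
}

// outDegrees counts the outgoing edges of every node.
func outDegrees(n int, edges []Edge) []int {
	deg := make([]int, n)
	for _, e := range edges {
		deg[e.From]++
	}
	return deg
}

func main() {
	edges := []Edge{{0, 1}, {1, 2}, {3, 4}}
	fmt.Println(weaklyConnectedComponents(6, edges))
	fmt.Println(outDegrees(6, edges))
}
```

Nodes with an out-degree of zero, or components that stay isolated from the rest of the graph, are the kind of candidates the paragraph above refers to.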
The goal of this work is to complicate detection mechanisms like this for botmasters by centralizing the coordination of the system's crawlers and sensors, thereby reducing the node's rank for specific graph metrics.
The goal of this work is to complicate detection mechanisms like this for bot masters by centralizing the coordination of the system's crawlers and sensors, thereby reducing the node's rank for specific graph metrics.
The coordinated work distribution also helps in efficiently monitoring large botnets where one sensor is not enough to track all peers.
The changes should allow the current sensors to use the new abstraction with as few changes as possible to the existing code.
The final result should be as general as possible and not depend on any botnet's specific behaviour, but it assumes, that every \ac{p2p} botnet has some kind of \enquote{getNeighbourList} method in the protocol, that allows other peers to request a list of active nodes to connect to.
The final results should be as general as possible and not depend on any botnet's specific behaviour, but they rely on the assumption that every \ac{p2p} botnet has some kind of \enquote{getNeighbourList} method in the protocol that allows other peers to request a list of active nodes to connect to.
In the current implementation, each sensor will itself visit and monitor each new node it finds.
The idea for this work is to report newfound nodes back to the \ac{bms} backend first, where the graph of the known network is created, and a sensor is selected, so that the specific ranking algorithm doesn't calculate to a suspiciously high or low value.
In the current implementation, each crawler will itself visit and monitor each new node it finds.
The idea for this work is to report newfound nodes back to the \ac{bms} backend first, where the graph of the known network is created, and a fitting worker is selected to achieve the goal of the corresponding coordination strategy.
That sensor will be responsible to monitor the new node.
If it is not possible to select a specific sensor so that the monitoring activity stays inconspicuous, the coordinator can do a complete shuffle of all nodes between the sensors to restore the desired graph properties or warn if more sensors are required to stay undetected.
The improved sensor system should allow new sensors to register themselves and their capabilities (\eg{} bandwidth, geolocation ), so the amount of work can be scaled accordingly between hosts.
The improved crawler system should allow new crawlers to register themselves and their capabilities (\eg{} bandwidth, geolocation), so the amount of work can be scaled accordingly between hosts.
Further work might even consider autoscaling the monitoring activity using some kind of cloud computing provider.
To validate the result, the old sensor implementation will be compared to the new system using different graph metrics.
@ -189,9 +185,6 @@ To validate the result, the old sensor implementation will be compared to the ne
If time allows, \ac{bsf}\footnotemark{} will be used to simulate a botnet, place sensors in the simulated network and measure the improvement achieved by the coordinated monitoring effort.
\footnotetext{\url{https://github.com/tklab-tud/BSF}}
\todo{which botnet?}
As a proof of concept, the coordinated monitoring approach will be implemented and deployed in the (Sality, Mirai, ...)? botnet.
%}}} methodology
%{{{ primitives
@ -199,22 +192,21 @@ As a proof of concept, the coordinated monitoring approach will be implemented a
The coordination protocol must allow the following operations:
\todo{Testnet + testnet crawler erweitern um mit complete knowledge zu verifizieren}
\subsubsection{Register Worker}
%{{{ sensor to backend
\subsubsection{Sensor to Backend}
\mintinline{go}{register(capabilities)}: Register new worker with capabilities (which botnet, available bandwidth, \ldots). This is called periodically and used to determine which worker is still active, when splitting the workload.
\todo{bestehende session Mechanik verwenden/erweitern}
\todo{failedTries im backend statt eigenem nachrichtentyp: remove?}
\begin{itemize}
\subsubsection{Report Peer}
\item \mintinline{go}{registerSensor(capabilities)}: Register new sensor with capabilities (which botnet, available bandwidth, \ldots). This is called periodically and used to determine which crawler is still active, when splitting the workload.
\mintinline{go}{reportPeer(peers)}: Report found targets. Both successful and failed attempts are reported to detect as early as possible when a peer becomes unavailable.
\item \mintinline{go}{unreachable(targets)}:
\subsubsection{Report Edge}
\item \mintinline{go}{requestTasks() []PeerTask}: Receive a batch of crawl tasks from the coordinator. The tasks consist of the target peer, if the crawler should start or stop the operation, when it should start and stop monitoring and the frequency.
\mintinline{go}{reportEdge(edges)}: Report found edges. Edges are found by querying the neighbourhood list of known peers. This is how new peers are detected.
\end{itemize}
\subsubsection{Request Tasks}
\mintinline{go}{requestTasks() []PeerTask}: Receive a batch of crawl tasks from the coordinator. Each task consists of the target peer, whether the crawler should start or stop the operation, when it should start and stop monitoring, and the frequency.
\begin{minted}{go}
type Peer struct {
@ -231,22 +223,6 @@ type PeerTask struct {
}
\end{minted}
%}}} sensor to backend
%{{{ backend to sensor
% TODO: remove?
\subsubsection{Backend to Sensor}
% \begin{itemize}
% % \item \mintinline{go}{stopCrawling(targets)}: Stop crawling a batch of nodes
% \end{itemize}
%}}} backend to sensor
%}}} primitives
%}}} methodology
@ -255,7 +231,7 @@ type PeerTask struct {
\section{Coordination Strategies}
Let \(C\) be the set of available crawlers.
Without loss of generality, if not stated otherwise, we assume that \(C\) is known when \ac{bms} is started and will not change afterward.
Without loss of generality, if not stated otherwise, I assume that \(C\) is known when \ac{bms} is started and will not change afterward.
There will be no joining or leaving crawlers.
This assumption greatly simplifies the implementation due to the lack of changing state that has to be tracked while still exploring the described strategies.
A production-ready implementation of the described techniques can drop this assumption but might have to recalculate the work distribution once a crawler joins or leaves.
@ -283,12 +259,12 @@ Load balancing allows scaling out, which can be more cost-effective.
\todo{weighted round robin}
Work is evenly distributed between crawlers according to their capabilities.
For the sake of simplicity, we will only consider the bandwidth as capability but it can be extended by any shared property between the crawlers, \eg{} available memory, CPU speed.
For the sake of simplicity, only the bandwidth will be considered as a capability, but this can be extended to any shared property between the crawlers, \eg{} available memory or CPU speed.
For a given crawler \(c_i \in C\) let \(B(c_i)\) be the total bandwidth of the crawler.
The total available bandwidth is \(b = \sum\limits_{c \in C} B(c)\).
The weight \(W(c_i) = \frac{B(c_i)}{b}\)\todo{proper def for weight} defines which fraction of the work gets assigned to \(c_i\).
The set of target peers \(P = \langle p_0, p_1, \ldots, p_{n-1} \rangle\) is partitioned into \(|C|\) subsets according to \(W(c_i)\) and each subset is assigned to its crawler \(c_i\).
The mapping \mintedinline{go}{gcd(C)} is the greatest common divisor of all peers in \mintedinline{go}{C}, \(\text{maxWeight}(C) = \max \{ \forall c \in C : W(c) \}\).
The mapping \mintinline{go}{gcd(C)} is the greatest common divisor of the weights of all crawlers in \mintinline{go}{C}, and \(\text{maxWeight}(C) = \max \{ W(c) : c \in C \}\).
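As a rough illustration of the proportional split, the following sketch computes each crawler's share of the total bandwidth and partitions a peer list accordingly.
It assumes that the weight is the share \(B(c_i) / b\) and that bandwidths are non-negative integers; the names \mintinline{go}{Crawler}, \mintinline{go}{weights} and \mintinline{go}{partition} are hypothetical and not part of \ac{bms}.

```go
package main

import "fmt"

// Crawler is a hypothetical stand-in for a registered crawler.
type Crawler struct {
	Name      string
	Bandwidth int // B(c_i), e.g. in Mbit/s; assumed to be non-negative
}

// totalBandwidth returns b, the sum of B(c) over all crawlers.
func totalBandwidth(crawlers []Crawler) int {
	total := 0
	for _, c := range crawlers {
		total += c.Bandwidth
	}
	return total
}

// weights returns the assumed weight W(c_i) = B(c_i) / b for every crawler.
func weights(crawlers []Crawler) []float64 {
	w := make([]float64, len(crawlers))
	total := totalBandwidth(crawlers)
	if total == 0 {
		return w
	}
	for i, c := range crawlers {
		w[i] = float64(c.Bandwidth) / float64(total)
	}
	return w
}

// partition splits peers into one consecutive slice per crawler. The slice
// boundaries follow the cumulative bandwidth, so sizes are proportional to
// the weights and every peer is assigned exactly once.
func partition(peers []string, crawlers []Crawler) [][]string {
	parts := make([][]string, len(crawlers))
	total := totalBandwidth(crawlers)
	if total == 0 {
		return parts
	}
	start, acc := 0, 0
	for i, c := range crawlers {
		acc += c.Bandwidth
		end := len(peers) * acc / total
		parts[i] = peers[start:end]
		start = end
	}
	return parts
}

func main() {
	crawlers := []Crawler{{"c0", 100}, {"c1", 300}}
	peers := []string{"p0", "p1", "p2", "p3", "p4", "p5", "p6", "p7"}
	fmt.Println(weights(crawlers))
	for i, part := range partition(peers, crawlers) {
		fmt.Println(crawlers[i].Name, part)
	}
}
```

This static split is only meant to make the weight definition concrete; it does not interleave assignments between crawlers the way a round-robin scheme does.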
The following weighted round-robin algorithm distributes the work according to the crawlers' capabilities:
@ -334,9 +310,26 @@ for _, peer := range peers {
\subsubsection{IP-based Partitioning}\label{sec:ip_part}
Assuming IP addresses in a botnet are evenly distributed with regard to their \(\mod |C|\)\todo{source? law of large numbers}.
Using \(m(i) = i \mod |C|\) as mapping to determine which IP is assigned to which crawler.
This ensures neighboring IP addresses (\eg{} in the same \ac{as} and/or geolocation) get visited by different crawlers.
The output of cryptographic hash functions is uniformly distributed---even substrings of the calculated hash hold this property.
Calculating the hash of an IP address and distributing the work with regard to \(\text{hash}(\text{IP}) \mod \abs{C}\) creates roughly evenly sized buckets for each worker to handle.
This gives us the mapping \(m(i) = \text{hash}(i) \mod \abs{C}\) to sort peers into buckets.
Any hash function can be used but since it must be calculated often, a fast function should be used.
While the \ac{md5} hash function must be considered broken for cryptographic use, it is faster to calculate than hash functions with longer output.
For the use case at hand, only the uniform distribution property is required so \ac{md5} can be used without sacrificing any kind of security.
\begin{figure}[H]
\centering
\includegraphics[width=1\linewidth]{./md5_ip_dist.png}
\caption{Distribution of the lowest byte of \ac{md5} hashes over IPv4}\label{fig:md5IPDist}
\end{figure}
\ac{md5} returns a \SI{128}{\bit} hash but Go cannot directly work with \SI{128}{\bit} integers.
It would be possible to implement the modulo operation for arbitrarily sized integers, but the uniform distribution also holds for substrings of hashes.
\autoref{fig:md5IPDist} shows the distribution of the lowest \SI{8}{\bit} for \ac{md5} hashes over all \(2^{32}\) IP addresses in their representation as \SI{32}{\bit} integers.
By exploiting the even distribution offered by hashing, the work of each crawler is also evenly distributed over all IP subnets, \ac{as} and geolocations.
This ensures neighboring peers (\eg{} in the same \ac{as}, geolocation or IP subnet) get visited by different crawlers.
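The mapping \(m(i) = \text{hash}(i) \mod \abs{C}\) can be sketched in Go as follows.
This is a minimal sketch under stated assumptions: the IPv4 address is hashed in its \SI{32}{\bit} big-endian integer representation, and only the lowest \SI{64}{\bit} of the \ac{md5} digest are used for the modulo operation, which is one possible choice of substring.
The function name \mintinline{go}{bucket} is hypothetical.

```go
package main

import (
	"crypto/md5"
	"encoding/binary"
	"fmt"
	"net"
)

// bucket maps an IPv4 address, given as a 32-bit integer, to one of n
// crawlers using m(i) = hash(i) mod n. Only the lowest 64 bits of the
// 128-bit MD5 digest are used, since Go has no native 128-bit integers.
// n must be greater than zero.
func bucket(addr uint32, n int) int {
	var raw [4]byte
	binary.BigEndian.PutUint32(raw[:], addr)
	sum := md5.Sum(raw[:])
	low := binary.BigEndian.Uint64(sum[8:])
	return int(low % uint64(n))
}

func main() {
	// 192.0.2.17 is a documentation address used purely as an example.
	ip := net.ParseIP("192.0.2.17").To4()
	addr := binary.BigEndian.Uint32(ip)
	fmt.Println("crawler index:", bucket(addr, 4))
}
```

Because the result only depends on the address and \(\abs{C}\), every component that knows the number of crawlers can compute the responsible crawler without shared state.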
%}}} load balancing
@ -632,12 +625,12 @@ Also this does not help against the \ac{wcc} metric since this would create a bi
\centering
\begin{subfigure}[b]{.5\textwidth}
\centering
\includegraphics[width=1\linewidth]{sensorbuster1.pdf}
\includegraphics[width=1\linewidth]{dot/sensorbuster1.pdf}
\caption{\acp{wcc} for independent crawlers}\label{fig:sensorbuster1}
\end{subfigure}%
\begin{subfigure}[b]{.5\textwidth}
\centering
\includegraphics[width=1\linewidth]{sensorbuster2.pdf}
\includegraphics[width=1\linewidth]{dot/sensorbuster2.pdf}
\caption{\acp{wcc} for collaborating crawlers}\label{fig:sensorbuster2}
\end{subfigure}%
\caption{Differences in graph metrics}\label{fig:sensorbuster}

Binary file not shown.

View File

@ -105,7 +105,7 @@ headsepline,
% custom commands
\input{commands}
\graphicspath{{assets/dot/}, {assets/}}
\graphicspath{{assets/}}
\setcounter{tocdepth}{2}