Valentin Brandl 2022-04-09 18:09:14 +02:00
parent 548b7016e2
commit fdf37d3cce
6 changed files with 51 additions and 28 deletions


@ -1,6 +1,6 @@
\begin{abstract}
Botnets pose a huge risk to general internet infrastructure and services.
-Decentralized \Acs*{p2p} topologies make it harder to detect monitor and take those botnets offline.
+Distributed \Acs*{p2p} topologies make it harder to detect, monitor and take those botnets offline.
This work explores ways to make monitoring of fully distributed botnets more efficient, resilient, and harder to detect by using a collaborative, coordinated approach.
\todo{do me}
\end{abstract}


@ -1,4 +1,4 @@
-\newcommand{\eg}{\textit{e}.\textit{g}.}
+\newcommand{\eg}{e.g.}
% Keywords command
\providecommand{\keywords}[1]


@ -4,7 +4,7 @@
The internet has become an irreplaceable part of our day-to-day lives.
We are always connected via numerous \enquote{smart} and \ac{iot} devices.
We use the internet to communicate, shop, handle financial transactions, and much more.
-Many personal and professional workflows are so dependent on the internet, that they won't work when being offline, and with the pandemic, we are living through, this dependency grew even stronger.
+Many personal and professional workflows are so dependent on the internet that they no longer work offline, and with the pandemic we are living through, this dependency has grown even stronger.
%{{{ motivation
% \subsection{Motivation}
@ -21,9 +21,9 @@ In recent years, \ac{iot} botnets have been responsible for some of the biggest
\section{Background}
Botnets consist of infected computers, so-called \textit{bots}, controlled by a \textit{botmaster}.
-\textit{Centralized} and \textit{decentralized botnets} use one or more coordinating hosts called \textit{\ac{c2} servers} respectively.
+\textit{Centralized} and \textit{decentralized botnets} rely on one or more coordinating hosts, called \textit{\ac{c2} servers}, respectively.
These \ac{c2} servers can use any protocol, ranging from \ac{irc} and \ac{http} to Twitter~\cite{bib:pantic_covert_2015}, as a communication channel with the infected hosts.
-The abuse of infected systems includes several activities---\ac{ddos} attacks, banking fraud, as proxies to hide the attacker's identity, send spam emails\dots{}
+The abuse of infected systems includes several activities: \ac{ddos} attacks, banking fraud, proxying to hide the attacker's identity, sending spam emails\dots{}
Analyzing and shutting down a centralized or decentralized botnet is comparatively easy since the central means of communication (the \ac{c2} IP addresses or domain names, Twitter handles, or \ac{irc} channels) can be extracted from the malicious binaries or determined by analyzing network traffic, and can therefore be considered publicly known.
@ -59,7 +59,7 @@ Especially worm-like botnets, where each peer tries to find and infect other sys
This lack of a \ac{spof} makes \ac{p2p} botnets more resilient to take-down attempts since the communication is not stopped and botmasters can easily rejoin the network and send commands.
-Bots in a \ac{p2p} botnet can be split into two distinct groups according to their reachability: publicly reachable peers, also known as \textit{superpeers}, and those, that are not (\eg{} because they are behind a \ac{nat} router or firewall).
+Bots in a \ac{p2p} botnet can be split into two distinct groups according to their reachability: peers that are not publicly reachable (\eg{} because they are behind a \ac{nat} router or firewall) and those that are publicly reachable, also known as \textit{superpeers}.
In contrast to centralized botnets with a fixed set of \ac{c2} servers, in a \ac{p2p} botnet every superpeer might take the role of a \ac{c2} server, and \textit{non-superpeers} connect to those superpeers when joining the network.
As there is no well-known server in a \ac{p2p} botnet, the bots have to coordinate autonomously.
@ -72,13 +72,14 @@ This process is known as \textit{\ac{mm}}.
\Ac{mm} comes in two forms: structured and unstructured~\cite{bib:baileyNextGen}.
Structured \ac{p2p} botnets often use a \ac{dht} and strict rules for a bot's neighbors based on its unique ID.
In unstructured botnets, on the other hand, bots ask any peer they know for new peers to connect to, in a process called \textit{peer discovery}.
To enable peers to connect to unstructured botnets, the malware binaries include hardcoded lists of superpeers for the newly infected systems to connect to.
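As a sketch of how such peer discovery might look in code (the \texttt{getPeerList} callback is a stand-in for the botnet-specific neighbor request, not an actual API):
\begin{minted}{go}
// Sketch only: breadth-first peer discovery in an unstructured botnet.
// getPeerList abstracts the botnet-specific neighbor request.
func discover(seeds []string, getPeerList func(addr string) ([]string, error)) map[string]bool {
	known := make(map[string]bool)
	queue := append([]string(nil), seeds...) // start from the hardcoded superpeers
	for len(queue) > 0 {
		addr := queue[0]
		queue = queue[1:]
		if known[addr] {
			continue
		}
		known[addr] = true
		neighbors, err := getPeerList(addr) // ask the peer for its neighbor list
		if err != nil {
			continue // unreachable: possibly churned or behind a NAT
		}
		queue = append(queue, neighbors...)
	}
	return known
}
\end{minted}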
The concept of \textit{churn} describes when a bot becomes unavailable.
There are two types of churn:
\begin{itemize}
-\item \textit{IP churn}: A bot becomes unreachable because it got a new IP address assigned. The bot is still available but under another address.
+\item \textit{IP churn}: A bot becomes unreachable because it got assigned a new IP address. The bot is still available but under another address.
\item \textit{Device churn}: The device is actually offline, \eg{} because the infection was cleaned, it got shut down or lost its internet connection.
@ -207,10 +208,10 @@ In this work we try to find ways to make the monitoring and information gatherin
The implementation of the concepts of this work will be done as part of \ac{bms}\footnotemark, a monitoring platform for \ac{p2p} botnets described by \citeauthor{bib:bock_poster_2019} in \citetitle{bib:bock_poster_2019}.
\footnotetext{\url{https://github.com/Telecooperation/BMS}}
-\Ac{bms} uses a hybrid active approach of crawlers and sensors (reimplementations of the \ac{p2p} protocol of a botnet, that won't perform malicious actions) to collect live data from active botnets.
+\Ac{bms} is intended for a hybrid active approach of crawlers and sensors (reimplementations of a botnet's \ac{p2p} protocol that do not perform malicious actions) to collect live data from active botnets.
-In an earlier project, I implemented different node ranking algorithms (among others \enquote{PageRank}~\cite{bib:page_pagerank_1998}) to detect sensors and crawlers in a botnet, as described in \citetitle{bib:karuppayah_sensorbuster_2017}.
-Both ranking algorithms use the \(\deg^+\) and \(\deg^-\) to weight the nodes.
+In an earlier project, we implemented different node ranking algorithms (among others \enquote{PageRank}~\cite{bib:page_pagerank_1998}) to detect sensor candidates in a botnet, as described in \citetitle{bib:karuppayah_sensorbuster_2017}.
+Both ranking algorithms exploit the difference between a sensor's \(\deg^+\) and \(\deg^-\) to weight the nodes.
Another way to enumerate candidates for sensors in a \ac{p2p} botnet is to find \acp{wcc} in the graph.
Sensors will have few to no outgoing edges, since they do not participate actively in the botnet, while crawlers have only outgoing edges.
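For illustration, a simplified sketch of how these degree-based candidates could be derived from a snapshot of the directed peer graph (the types are stand-ins, not the actual \ac{bms} data model):
\begin{minted}{go}
// Sketch: flag sensor/crawler candidates via in-/out-degree asymmetry.
type edge struct{ from, to string }

func degreeCandidates(edges []edge) (sensors, crawlers []string) {
	in, out := map[string]int{}, map[string]int{}
	nodes := map[string]bool{}
	for _, e := range edges {
		out[e.from]++
		in[e.to]++
		nodes[e.from], nodes[e.to] = true, true
	}
	for n := range nodes {
		switch {
		case out[n] == 0 && in[n] > 0:
			sensors = append(sensors, n) // known by many, knows none
		case in[n] == 0 && out[n] > 0:
			crawlers = append(crawlers, n) // knows many, known by none
		}
	}
	return sensors, crawlers
}
\end{minted}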
@ -218,16 +219,25 @@ The goal of this work is to complicate detection mechanisms like this for botmas
The coordinated work distribution also helps in efficiently monitoring large botnets where one crawler is not enough to track all peers.
The changes should allow the current crawlers and sensors to use the new abstraction with as few changes as possible to the existing code.
-The final results should be as general as possible and not depend on any botnet's specific behaviour, but it assumes, that every \ac{p2p} botnet has some kind of \enquote{getPeerList} method in the protocol, that allows other peers to request a list of active nodes to connect to.
+The goal of this work is to show how cooperative monitoring of a \ac{p2p} botnet can help with the following problems:
+\begin{itemize}
+\item Impede detection of monitoring attempts by reducing the impact of the aforementioned graph metrics
+\item Circumvent anti-monitoring techniques
+\item Make crawling more efficient
+\end{itemize}
+The final results should be as general as possible and not depend on any botnet's specific behaviour (except for the mentioned anti-monitoring techniques, which might be unique to some botnets), but we assume that every \ac{p2p} botnet has some way of determining a bot's neighbors.
In the current implementation, each crawler will itself visit and monitor each new node it finds.
-The idea for this work is to report newfound nodes back to the \ac{bms} backend first, where the graph of the known network is created, and a fitting worker is selected to achieve the goal of the according coordination strategy.
-That sensor will be responsible to monitor the new node.
+The general idea in this thesis is to report newfound nodes back to the \ac{bms} backend first, where the graph of the known network is created and a fitting worker is selected according to the active coordination strategy.
+That worker will be responsible for monitoring the new node.
If it is not possible to select a specific sensor such that the monitoring activity stays inconspicuous, the coordinator can completely reshuffle all nodes between the sensors to restore the desired graph properties, or warn that more sensors are required to stay undetected.
The improved crawler system should allow new crawlers to register themselves and their capabilities (\eg{} bandwidth, geolocation), so the workload can be scaled and distributed between hosts accordingly.
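A minimal sketch of how the coordinator might select a fitting worker based on such registrations; the capability fields are illustrative assumptions rather than the actual \ac{bms} schema:
\begin{minted}{go}
// Sketch: capability-aware worker selection in the coordinator.
type workerInfo struct {
	id       string
	botnet   string
	capacity int // how many peers this worker can monitor
	assigned int // peers currently assigned to it
}

func selectWorker(workers []workerInfo, botnet string) *workerInfo {
	var best *workerInfo
	for i := range workers {
		w := &workers[i]
		if w.botnet != botnet || w.assigned >= w.capacity {
			continue // wrong botnet or already saturated
		}
		if best == nil || w.assigned < best.assigned {
			best = w // prefer the least-loaded fitting worker
		}
	}
	return best // nil signals that more workers are required
}
\end{minted}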
Further work might even consider autoscaling the monitoring activity using some kind of cloud computing provider.
%}}} methodology
@ -236,23 +246,26 @@ Further work might even consider autoscaling the monitoring activity using some
The coordination protocol must allow the following operations:
-\subsubsection{Register Worker}
+\begin{description}
-\mintinline{go}{register(capabilities)}: Register new worker with capabilities (which botnet, available bandwidth, \ldots). This is called periodically and used to determine which worker is still active, when splitting the workload.
+\item[Register Worker] Register a new worker with its capabilities (which botnet, available bandwidth and processing power, \dots{}).
+This is called periodically and used to determine which workers are still active when assigning new tasks.
-\subsubsection{Report Peer}
+\item[Report Peer] Report found peers.
+Both successful and failed attempts are reported to detect churned peers and blacklisted crawlers as soon as possible.
-\mintinline{go}{reportPeer(peers)}: Report found targets. Both successful and failed attempts are reported, to detect as soon as possible, when a peer became unavailable.
+\item[Report Edge] Report found edges.
+Edges are created by querying the peer list of a bot.
+This is how new peers are detected.
-\subsubsection{Report Edge}
+\item[Request Tasks] Receive a batch of crawl tasks from the coordinator.
+Each task consists of the target peer, whether the worker should start or stop monitoring it, when the monitoring should start and stop, and the frequency at which the peer should be contacted.
-\mintinline{go}{reportEdge(edges)}: Report found edges. Edges are found by querying the peer list of known peers. This is how new peers are detected.
+\item[Request Neighbors] Sensors can request a list of candidate peers to return when their peer list is queried.
-\subsubsection{Request Tasks}
+\end{description}
-\mintinline{go}{requestTasks() []PeerTask}: Receive a batch of crawl tasks from the coordinator. The tasks consist of the target peer, if the crawler should start or stop the operation, when it should start and stop monitoring and the frequency.
-\begin{listing}
+\begin{listing}[H]
\begin{minted}{go}
type Peer struct {
    BotID string
@ -269,7 +282,8 @@ type PeerTask struct {
\end{minted}
\caption{Relevant Fields for Peers and Tasks}\label{lst:peerFields}
\end{listing}
+\todo{caption not shown, link in list of listings is broken}
\Fref{lst:peerFields} shows the Go structures used for crawl tasks.
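Taken together, the primitives could map onto a Go interface roughly as sketched below; \texttt{Capabilities} and \texttt{Edge} are illustrative helper types, while \texttt{Peer} and \texttt{PeerTask} refer to the structures above:
\begin{minted}{go}
// Sketch: the coordination primitives as a Go interface.
// Capabilities and Edge are illustrative helper types; Peer and
// PeerTask are the structures shown in the listing above.
type Capabilities struct {
	Botnets   []string // botnet protocols the worker implements
	Bandwidth int      // available bandwidth, used when sizing batches
}

type Edge struct{ From, To Peer }

type Coordinator interface {
	Register(caps Capabilities) error       // periodic; doubles as liveness signal
	ReportPeer(peers []Peer) error          // successful and failed attempts alike
	ReportEdge(edges []Edge) error          // results of peer list queries
	RequestTasks() ([]PeerTask, error)      // next batch of crawl tasks
	RequestNeighbors(n int) ([]Peer, error) // candidates for a sensor's replies
}
\end{minted}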
%}}} primitives
@ -289,7 +303,10 @@ The protocol primitives described in \Fref{sec:protPrim} already allow for this
%{{{ load balancing
\subsection{Load Balancing}\label{sec:loadBalancing}
-This strategy simply splits the work into chunks and distributes the work between the available crawlers.
+Depending on a botnet's size, a single crawler is not enough to monitor all superpeers.
+While it is possible to run multiple uncoordinated crawlers, they may find and monitor the same peers, making this approach inefficient with regard to the available computing resources.
+The load balancing strategy solves this problem by systematically splitting the crawl tasks into chunks and distributing them among the available crawlers.
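As a baseline, such a split could be as simple as the following round-robin sketch; the strategies investigated below refine this, \eg{} by weighting chunks according to crawler capabilities:
\begin{minted}{go}
// Sketch: uniform round-robin distribution of crawl tasks.
func splitTasks(tasks []PeerTask, crawlerIDs []string) map[string][]PeerTask {
	if len(crawlerIDs) == 0 {
		return nil // no active crawlers registered
	}
	assignment := make(map[string][]PeerTask, len(crawlerIDs))
	for i, task := range tasks {
		id := crawlerIDs[i%len(crawlerIDs)] // rotate through the crawlers
		assignment[id] = append(assignment[id], task)
	}
	return assignment
}
\end{minted}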
The following load balancing strategies will be investigated:
\begin{itemize}
@ -842,6 +859,9 @@ Doing so would allow a constant crawl interval for even highly volatile botnets.
Placing churned peers or peers with suspicious network activity (\eg{} those behind carrier-grade \acp{nat}) might just offer another characteristic to flag sensors in a botnet.
This should be investigated, and possible mitigations evaluated.
+Autoscaling features offered by many cloud-computing providers should be evaluated to automatically add or remove crawlers based on the monitoring load, a botnet's size, and the number of active peers.
+This should also make it fast and easy to create workers with new IP addresses in different geolocations.
%}}} further work
%{{{ acknowledgments



@ -22,6 +22,9 @@ headsepline,
% footsepline,
]{OTHRartcl}
+% page layout and margins
+\usepackage[a4paper, total={6in, 8in}]{geometry}
+% document language and hyphenation
\usepackage[main=english,english,ngerman]{babel}
% math stuff


@ -63,6 +63,6 @@ pkgs.mkShell {
pkgs.graphviz
pkgs.pygmentex
-pkgs.pythonPackages.pygments
+pkgs.python3Packages.pygments
];
}