Content

This commit is contained in:
parent 8e03043f8b
commit 7f285cd0ab

content.tex (35 changed lines)
@@ -54,8 +54,6 @@ Taking down a \ac{p2p} botnet requires intricate knowledge of the botnet's chara
 Just like for centralized and decentralized botnets, to take down a \ac{p2p} botnet, the \ac{c2} channel needs to be identified and disrupted.
 By \emph{monitoring} peer activity of known participants in the botnet, this knowledge can be obtained and used to find attack vectors in the botnet protocol.
 
-\todo{few words about monitoring}
-
 In this work, we will show how a collaborative system of crawlers and sensors can make the monitoring and information-gathering phase of a \ac{p2p} botnet more efficient and resilient to detection, and how collaborative monitoring can help circumvent anti-monitoring techniques.
 
 %}}} introduction
@@ -233,7 +231,6 @@ They depend on suspicious graph properties to enumerate candidate peers~\cite{bi
 \Ac{bms} is intended for a hybrid active approach of crawlers and sensors (reimplementations of the \ac{p2p} protocol of a botnet that won't perform malicious actions) to collect live data from active botnets.
 
 In an earlier project, we implemented different graph ranking algorithms---among others \emph{PageRank}~\cite{bib:page_pagerank_1998} and \emph{SensorRank}---to detect sensor candidates in a botnet, as described in \citetitle{bib:karuppayah_sensorbuster_2017}.
-In an earlier project, we implemented the ranking algorithms described in \citetitle{bib:karuppayah_sensorbuster_2017} for \ac{bms}.
 
 %%{{{ detection criteria
 %\subsection{Detection Criteria}
@@ -366,7 +363,7 @@ To keep the distribution as even as possible, we keep track of the last crawler
 For the sake of simplicity, only the bandwidth will be considered as a capability, but it can be extended by any property shared between the crawlers, \eg{} available memory or processing power.
 For a given crawler \(c_i \in C\), let \(cap(c_i)\) be the capability of the crawler.
 The total available capability is \(B = \sum\limits_{c \in C} cap(c)\).
-With \(G\) being the greatest common divisor of all the crawler's capabilities, the weight \(W(c_i) = \frac{cap(c_i)}{G}\).
+With \(G\) being the greatest common divisor of all the crawlers' capabilities, the weight of a crawler is \(W(c_i) = \frac{cap(c_i)}{G}\).
 \(\frac{cap(c_i)}{B}\) gives us the percentage of the work a crawler is assigned.
 % The set of target peers \(P = <p_0, p_1, \ldots, p_{n-1}>\), is partitioned into \(|C|\) subsets according to \(W(c_i)\) and each subset is assigned to its crawler \(c_i\).
 % The mapping \mintinline{go}{gcd(C)} is the greatest common divisor of all peers in \mintinline{go}{C}, \(\text{maxWeight}(C) = \max \{ \forall c \in C : W(c) \}\).
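The weighting scheme in this hunk maps directly onto code. The following Go sketch is illustrative only (the identifiers are ours, not taken from the BMS codebase): it computes the weight W(c_i) = cap(c_i)/G for each crawler from a list of capabilities.

// Sketch of the capability-based weighting described above; the
// identifiers are illustrative, not from the BMS codebase.
package main

import "fmt"

func gcd(a, b int) int {
	for b != 0 {
		a, b = b, a%b
	}
	return a
}

// weights returns W(c_i) = cap(c_i)/G for every crawler, where G is
// the greatest common divisor of all capabilities.
func weights(caps []int) []int {
	g := caps[0]
	for _, c := range caps[1:] {
		g = gcd(g, c)
	}
	w := make([]int, len(caps))
	for i, c := range caps {
		w[i] = c / g
	}
	return w
}

func main() {
	caps := []int{100, 200, 300} // e.g. bandwidth per crawler
	fmt.Println(weights(caps))   // [1 2 3], i.e. proportional shares
}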
@@ -421,7 +418,8 @@ Given the hash function \(H\), calculating the hash of an IP address and distrib
 This gives us the mapping \(m(i) = H(i) \mod \abs{C}\) to sort peers into buckets.
 
 Any hash function can be used, but since it must be calculated often, a fast function should be used.
-While the \ac{md5} hash function must be considered broken for cryptographic use~\cite{bib:stevensCollision}, it is faster to calculate than hash functions with longer output.\todo{md5 crypto broken, distribution not?}
+While the \ac{md5} hash function must be considered broken for cryptographic use~\cite{bib:stevensCollision}, it is faster to calculate than hash functions with longer output.
+Collisions for \ac{md5} have been found, but collision resistance is not required.
 For the use case at hand, only the uniform distribution property is required, so \ac{md5} can be used without sacrificing any kind of security.
 
 This strategy can also be weighted using the crawlers' capabilities by modifying the list of available workers so that a worker can appear multiple times according to its weight.
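To make the mapping m(i) = H(i) mod |C| concrete, here is a minimal Go sketch. Interpreting the first eight bytes of the MD5 digest as an unsigned integer is an implementation choice of ours, not necessarily how BMS does it; the weighted variant from the last context line is covered by repeating a worker in the list according to its weight.

// Minimal sketch of the hash-based assignment m(i) = H(i) mod |C|,
// including the weighted variant where a crawler appears in the
// worker list once per unit of weight. Names are illustrative.
package main

import (
	"crypto/md5"
	"encoding/binary"
	"fmt"
)

// assign maps a peer (by IP address string) to one of the workers.
func assign(ip string, workers []string) string {
	sum := md5.Sum([]byte(ip))
	// Use the first 8 bytes of the digest as an unsigned integer.
	h := binary.BigEndian.Uint64(sum[:8])
	return workers[h%uint64(len(workers))]
}

func main() {
	// c1 has weight 2, so it appears twice and receives ~2/3 of peers.
	workers := []string{"c0", "c1", "c1"}
	fmt.Println(assign("198.51.100.7", workers))
}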
@@ -554,7 +552,7 @@ While the effective frequency of the whole system is halved compared to~\Fref{fi
 \subsection{Creating and Reducing Edges for Sensors}
 
 \citetitle*{bib:karuppayah_sensorbuster_2017} describes different graph metrics to find sensors in \ac{p2p} botnets.
-These metrics depend on the uneven ratio between incoming and outgoing edges for crawlers.
+These metrics depend on the uneven ratio between incoming and outgoing edges for sensors.
 The \emph{SensorBuster} metric uses \acp{wcc} since naive sensors don't have any edges back to the main network in the graph.
 
 
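As a sketch of the WCC criterion (our illustration, not the SensorBuster implementation): treat the directed botnet graph as undirected, compute weakly connected components with union-find, and flag every component other than the main one as suspicious.

// Rough sketch of the WCC-based SensorBuster idea: peers whose weakly
// connected component is not the main (largest) one are sensor
// candidates. Union-find over undirected edges; names are illustrative.
package main

import "fmt"

type dsu struct{ parent []int }

func newDSU(n int) *dsu {
	p := make([]int, n)
	for i := range p {
		p[i] = i
	}
	return &dsu{p}
}

func (d *dsu) find(x int) int {
	for d.parent[x] != x {
		d.parent[x] = d.parent[d.parent[x]] // path halving
		x = d.parent[x]
	}
	return x
}

func (d *dsu) union(a, b int) { d.parent[d.find(a)] = d.find(b) }

func main() {
	n := 6
	// Directed edges, treated as undirected for WCC purposes.
	edges := [][2]int{{0, 1}, {1, 2}, {2, 0}, {4, 5}} // node 3 is isolated
	d := newDSU(n)
	for _, e := range edges {
		d.union(e[0], e[1])
	}
	sizes := map[int]int{}
	for v := 0; v < n; v++ {
		sizes[d.find(v)]++
	}
	// Components outside the largest one are sensor candidates.
	fmt.Println(sizes)
}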
@@ -609,7 +607,7 @@ The following candidates to place on the neighbor list will be investigated:
 \textbf{Other Sensors:} Returning all the other sensors when responding to peer list requests, thereby effectively creating a complete graph \(K_{\abs{C}}\) among the workers, creates valid outgoing edges.
 The resulting graph will still form a \ac{wcc} with no edges back into the main network.
 
-Building a complete graph \(G_C = K_{\abs{C}}\) between the sensors by making them return the other known worker on peer list requests would still produce a disconnected component and while being bigger and maybe not as obvious at first glance, it is still easily detectable since there is no path from \(G_C\) back to the main network (see~\Fref{fig:sensorbuster2} and~\Fref{tab:metricsTable}).\todo{where?}
+Building a complete graph \(G_C = K_{\abs{C}}\) between the sensors by making them return the other known workers on peer list requests would still produce a disconnected component; while it is bigger and maybe not as obvious at first glance, it is still easily detectable since there is no path from \(G_C\) back to the main network (see~\Fref{fig:sensorbuster2} and~\Fref{tab:metricsTable}).
 
 
 %{{{ churned peers
@@ -906,7 +904,7 @@ This is good enough for balancing the tasks among workers.
 %{{{ eval redu requ freq
 \subsection{Reduction of Request Frequency}
 
-To evaluate the request frequency optimization described in \Fref{sec:stratRedReqFreq}, crawl a simulated peer and check if the requests are evenly distributed and how big the deviation from the theoretically optimal result is.
+To evaluate the request frequency optimization described in \Fref{sec:stratRedReqFreq}, we crawl a simulated peer and check whether the requests are evenly distributed and how large the deviation from the theoretically optimal result is.
 To get more realistic results, the crawlers and simulated peer are running on different machines so they are not within the same LAN\@.
 We use the same parameters as in the example above:
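The measurement tooling itself is not shown in this diff; one plausible shape for it, sketched in Go with illustrative names, is to compare consecutive inter-request intervals observed at the simulated peer against the expected interval and report the mean absolute deviation.

// Illustrative sketch (not the actual evaluation code): measure how far
// observed request intervals deviate from the expected interval.
package main

import (
	"fmt"
	"time"
)

// meanAbsDeviation returns the average absolute difference between
// consecutive request intervals and the expected interval.
func meanAbsDeviation(ts []time.Time, expected time.Duration) time.Duration {
	var sum time.Duration
	for i := 1; i < len(ts); i++ {
		d := ts[i].Sub(ts[i-1]) - expected
		if d < 0 {
			d = -d
		}
		sum += d
	}
	return sum / time.Duration(len(ts)-1)
}

func main() {
	base := time.Now()
	ts := []time.Time{base, base.Add(501 * time.Millisecond), base.Add(999 * time.Millisecond)}
	fmt.Println(meanAbsDeviation(ts, 500*time.Millisecond)) // 1.5ms
}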
@@ -999,7 +997,7 @@ With this experiment, we try to estimate the impact of the latency.
 \caption{Average deviation per crawler}\label{tab:perCralwerDeviation}
 \end{table}
 
-The monitored peer crawler \emph{c0} are located in Falkenstein, Germany, \emph{c1} in Nurnberg, Germany, \emph{c2} is in Helsinki, Finland and \emph{c3} in Ashburn, USA, to have some geographic distribution.
+The monitored peer and crawler \emph{c0} are located in Falkenstein, Germany, \emph{c1} in Nuremberg, Germany, \emph{c2} in Helsinki, Finland, and \emph{c3} in Ashburn, USA, to have some geographic distribution.
 
 The average deviation per crawler is below \SI{0.002}{\second}, even with some outliers due to network latency or server load.
 The crawler \emph{c3} in the experiment is the furthest away from the monitored host, therefore the larger deviation due to network latency is expected.
@@ -1090,7 +1088,7 @@ SensorBuster relies on the assumption that sensors don't have any outgoing edges
 
 For the \ac{wcc} metric, it is obvious that even a single edge back into the main network is enough to connect the sensor back to the main graph and therefore beat this metric.
 
-\subsubsection{Effectiveness against Page- and SensorRank}
+\subsection{Reducing Incoming Edges to Reduce Page- and SensorRank}
 
 In this section, we will evaluate how adding outgoing edges to a sensor impacts its PageRank and SensorRank values.
 Before doing so, we will check the impact of the initial rank by calculating it with different initial values and comparing the value distribution of the result.
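The rank computation itself is not part of this diff. As a rough illustration of the damping-free PageRank iteration PR(v) = sum of PR(u)/outdeg(u) over incoming neighbors u, with the initial value exposed as a parameter to mirror the initial-rank experiment, consider this Go sketch (names are ours; the SensorRank variant from the cited paper is omitted here):

// Illustrative sketch (not the BMS implementation): iterated
// damping-free PageRank with a configurable initial value.
package main

import "fmt"

// pagerank runs n iterations of PR(v) = sum over incoming u of
// PR(u)/outdeg(u), starting every node at init.
func pagerank(pred map[int][]int, outdeg map[int]int, nodes []int, init float64, n int) map[int]float64 {
	pr := make(map[int]float64, len(nodes))
	for _, v := range nodes {
		pr[v] = init
	}
	for i := 0; i < n; i++ {
		next := make(map[int]float64, len(nodes))
		for _, v := range nodes {
			for _, u := range pred[v] {
				next[v] += pr[u] / float64(outdeg[u])
			}
		}
		pr = next
	}
	return pr
}

func main() {
	nodes := []int{0, 1, 2}
	pred := map[int][]int{1: {0}, 2: {0, 1}} // edges 0->1, 0->2, 1->2
	outdeg := map[int]int{0: 2, 1: 1}
	// Compare result distributions for different initial values.
	fmt.Println(pagerank(pred, outdeg, nodes, 0.25, 3))
}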
@@ -1268,11 +1266,11 @@ Experiments were performed, in which the incoming edges for the known sensor are
 \end{figure}
 \end{landscape}
 
-\Fref{fig:pr0} and \Fref{fig:sr0} show the situation on the base truth without modifications.
+The graphs with 0 removed edges show the situation on the ground truth without modifications.
 
 
 We can see in \Fref{fig:prFiltered} and \Fref{fig:srFiltered} that we have to reduce the incoming edges by \SI{20}{\percent} and \SI{30}{\percent} respectively to get average values for SensorRank and PageRank.
-This also means that the number of incoming edges for a sensor must be about the same as the average about of incoming edges as can be seen in \Fref{fig:in3}.
+This also means that the number of incoming edges for a sensor must be about the same as the average number of incoming edges.
 Depending on the protocol details of the botnet (\eg{} how many incoming edges are allowed per peer), this means that a large number of sensors is needed if we want to monitor the whole network.
@@ -1343,7 +1341,7 @@ The server-side part of the system consists of a \ac{grpc} server to handle the
 \section{Conclusion}
 
 Collaborative monitoring of \ac{p2p} botnets allows circumventing some anti-monitoring efforts.
-It also enables more effective monitoring systems for larger botnets, since each peer can be visited by only one crawler.
+We were able to show that it also enables more effective monitoring systems for larger botnets, since each peer can be visited by only one crawler.
 The current concept of independent crawlers in \ac{bms} can also use multiple workers, but there is no way to ensure a peer is not watched by multiple crawlers, thereby using unnecessary resources.
 
 We were able to show that a collaborative monitoring approach for \ac{p2p} botnets helps to circumvent anti-monitoring and monitoring-detection mechanisms and is helpful to improve resource usage when monitoring large botnets.
@@ -1363,15 +1361,16 @@ This might bring some performance issues to light which can be solved by investi
 
 Another way to expand on this work is automatically scaling the available crawlers up and down, depending on the botnet size and the number of concurrently online peers.
 Doing so would allow a constant crawl interval for even highly volatile botnets.
+Autoscaling features offered by many cloud-computing providers can be evaluated to automatically add or remove crawlers based on the monitoring load, a botnet's size, and the number of active peers.
+This should also allow the creation of workers with new IP addresses in different geolocations in a fast, easy and automated way.
+This also requires investigating hosting providers that allow botnet crawling in their terms of use.
+
+The current backend implementation assumes an immutable set of crawlers.
+For autoscaling to work, efficient reassignment of peers has to be implemented to account for added or removed workers.
 
 Placing churned peers or peers with suspicious network activity (those behind carrier-grade \acp{nat}) might just offer another characteristic to flag sensors in a botnet.
 The feasibility of this approach should be investigated and maybe there are ways to mitigate this problem.
 
-Autoscaling features offered by many cloud-computing providers can be evaluated to automatically add or remove crawlers based on the monitoring load, a botnet's size, and the number of active peers.
-This should also allow the creation of workers with new IP addresses in different geolocations in a fast, easy and automated way.
-The current implementation assumes an immutable set of crawlers.
-For autoscaling to work, efficient reassignment of peers has to be implemented to account for added or removed workers.
-
 %}}} further work
 
 %{{{ acknowledgments
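The further-work notes above only state that efficient peer reassignment is needed once crawlers can be added or removed. Consistent hashing is one standard technique that bounds reassignment to the peers owned by the affected worker; the following Go sketch is our suggestion, not the documented BMS design.

// Sketch of consistent hashing for peer-to-crawler assignment: removing
// a worker only reassigns the peers it owned. Names are illustrative.
package main

import (
	"crypto/md5"
	"encoding/binary"
	"fmt"
	"sort"
)

type ring struct {
	hashes  []uint64
	workers map[uint64]string
}

func hash(s string) uint64 {
	sum := md5.Sum([]byte(s))
	return binary.BigEndian.Uint64(sum[:8])
}

// newRing places each worker at several points on a hash ring; more
// replicas give a smoother distribution.
func newRing(workers []string, replicas int) *ring {
	r := &ring{workers: map[uint64]string{}}
	for _, w := range workers {
		for i := 0; i < replicas; i++ {
			h := hash(fmt.Sprintf("%s-%d", w, i))
			r.hashes = append(r.hashes, h)
			r.workers[h] = w
		}
	}
	sort.Slice(r.hashes, func(i, j int) bool { return r.hashes[i] < r.hashes[j] })
	return r
}

// owner returns the worker responsible for the given peer IP: the first
// ring point at or after the peer's hash, wrapping around.
func (r *ring) owner(ip string) string {
	h := hash(ip)
	i := sort.Search(len(r.hashes), func(i int) bool { return r.hashes[i] >= h })
	if i == len(r.hashes) {
		i = 0 // wrap around the ring
	}
	return r.workers[r.hashes[i]]
}

func main() {
	r := newRing([]string{"c0", "c1", "c2"}, 16)
	fmt.Println(r.owner("198.51.100.7"))
}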
@@ -12,10 +12,12 @@
 \studyprogramme{Master Informatik}
 %\startingdate{1.\,November 2088}
 %\closingdate{11.\,Dezember 2089}
+\startingdate{2021-12-01}
+\closingdate{2022-05-01}
 
 \firstadvisor{Prof.\ Dr.\ Christoph Skornia}
 \secondadvisor{Prof.\ Dr.\ Thomas Waas}
-%\externaladvisor{Dr. Klara Endlos}
+\externaladvisor{Leon Böck}
 
 \date{\today}
 % \date{}
BIN report.pdf (binary file not shown)
@@ -107,7 +107,7 @@ headsepline,
 \usepackage[pdftex,colorlinks=false]{hyperref}
 
 % make overfull hbox warnings prominently visible in document
-\overfullrule=2cm
+% \overfullrule=2cm
 
 \pagestyle{headings}
 
@@ -134,7 +134,7 @@ headsepline,
 
 \clearpage{}
 
-\listoftodos{}
+% \listoftodos{}
 
 \include{content}
 