Add content
This commit is contained in:
parent
73073ded36
commit
3edb5b2bd1
1
assets/architecture.drawio
Normal file
@@ -0,0 +1 @@
<mxfile host="app.diagrams.net" modified="2022-03-12T15:07:40.300Z" agent="5.0 (X11)" etag="3AFIBAUFaqle-vFZ6r29" version="17.1.2" type="device"><diagram id="LEI1KErRWcNUmDFmud2C" name="Page-1">7Vxbc5s4FP41zLSdMYMkBPgxxHHbTTtNLzvdtx1iZJstIApyEvfXVxhkA8IOdW0uW2cyHnQQQjrnOzddUNB18PQ6dqLle+oSX4Ga+6SgiQIh0CFU0n/NXWcU0zAywiL23LzSjvDZ+0FyopZTV55LklJFRqnPvKhMnNEwJDNWojlxTB/L1ebUL781chZEInyeOb5M/eq5bJlRLazt6G+It1iKNwMtvxM4onJOSJaOSx8LJHSjoOuYUpZdBU/XxE+ZJ/iSPTfdc3fbsZiErMkDP/559/ZxBNlf63mAzI8fvI+j9yM9a+XB8Vf5gPPOsrXgAAndq5SRvHTv09k3BdmukyxJ2izgBX5/6vE3oYmWlXIhAj5Ce8kCP68X01Xobp5K68m9zweU0FU8y1+NxzcetT7F9jR4/Nehr0ePk9sRzlHgxAvCDtSDKKtI3JKEc+a8JjQgLF7zCo87uQIjF1ZMfId5D2UQODmWFttnt83dUY+PA2o57qFuqEbhz8z5nCsBFCARLWaDzhspyrDSrq5BFZeaAuNKUxlfpKb4RWHQO9IGJL8AGGN4gAEDAAwwy2KFBj4OIcjAqoV2f92iBeABwsVsihe9P3gx4PF4KTWEkaUaY4g0M/9tFy8DNC+wsX3pEC9QKL4Iw+CxHkjXVIywySM7a/NbRo/esn0xh4cXMB4AXgDGqg52AQyGJ4KPBlSIu8WMNTzMNIZMhyEMqpiYo0MY3ZSCFhWZCCAdQQthS8irLbyMh4cXNAC8ABOpoGIL0LF2pRrGQIBbBYkY/pBA0jjO7dKomOVEuiLlo/NoNFbB/vy87TRJcHhI6IGN52E6DGOqgDGAoWq6bpoGAGNdQ6dxUBiZ6tjCmqkb0LAMbLULHjkGjklEY5YcwhCNSCjjpISoc8S+Q3BMXL4qwnCsA2AauqmVha2D89gc2HIYDOUwuL+wGcIUno6hqmHLggI7J4KNNOXbFClcZs66UC1KKyT1LxN16HyekPPgrS6MNnyWwsd74JeL9HIWE4eRRNzhbyrc7B8whW97FpigS2AeNjx6Ndhpisx0kQOb1lg3kWHpEGiVdo12DRrSJIBt0TRc3IAOcSNZnqOhomPcFjYOLQHV2J57YVzs958/bfxfwfbc77U8BUy4MY2+CDnuYPFAYkaeSqyXkCKLMX+gmuZkxaKMrZy2LCw+V6VTFHGJo7/KPljDvgpDkqUTpZcBdVcpzf6PMLYWS+aWIGzXyvWmjNovz73cAxK7YA234Gm0ZBt/i6nRcgMncaiH5ngO+1PlGipX9oT4ZMEtYVqy7X3OtaoQDptxI9hXpQBm11rRYO9Eu1qBLloxqltwrgD7HV0svHDRH2D3zt43WIVtF9n4guxR3VLnKe39NQ3uvbBPBl/Xe6YXDVYO29UL86IXo0bzCr+hF1crRrluBB7rj2bICyEdawaQk++OVWN8UY3t7EET3fhK428kbqIZB51I7Dz6d4Q3VKsljWb1zu1V9iw5d6c7/UuvL/n1bsq2PeW541lJv3XHEEsP/VGe3mXh4JKG39ZvzKzAfer5ZPo2dPcBvl1oVzcKdA7s3iUb4JJt3NbvBjxlunHz5MyYv/4QzsjU64ty9E89YO8SDnDJOG7r90HKK239QXZ1jqkHyO5dOvAMC/8QZO9PB6KScIzvq/RIrx1wpHqhgq74XS164r8bIWgZfcRolN3TC/e4aNjI8b1F/tyMC2ajJ3mbVa/hFQhOwBu0fbn0Nm1j7szIllx+pJpfeDU5R1SldTfmranIpuOeS5ju470NPJNnSWNexumgxRF10T9Q29WMte
F9Em3K6UCz/vJf3yPJC1VVI951NS2vX2ZDJ3FM49/isE/mrMBiuWO7jty4i0I30lJdL04m+0Y9m/qrZPmirh+yGMs9qxjI1Mx5M8e/ynA12SDPzlE2yfpiU15r7m+2Ac03W6ftOQ1ZcR9QWp46geen1uYN8R9I2qpS2hN0AickHdM1ZCe03eu+LH9u4FxeqC6v7lr1/2hzl8ZMwzF2oreJ0OYXCrYVPElpX5zkmwKvhYq/HLr2VpIjvWZnSq32WmdT3v27IH4rSW2yfHwotO/FFKZ0JKTzgL932y3gZb/Fbd0Bie69wR/tAe9iyuiM+j30gGeP2L+vSMIyh7qJ19PLsmfdhvFl19pFGJ8uZMndfLXLd/rRxY2ecPUWIcor5gVEnaxibq9o2E2/NgvozTr2/06JpMMHqMZRb78x0FJSJE/+/50QidMnPV5SOflx7nOXVr1UznE2V4hJyBdVxNb0cIlVbagaV5/5cEnNwaOBoKLxscoWUYGfEWbj02nVhk52vpYXd19tzKrvvn2Jbn4C</diagram></mxfile>
34
content.tex
@@ -436,6 +436,40 @@ This implementation is highly optimized but also tightly coupled and grown over
The abstraction became leaky and extending it proved to be complicated.
A new crawler abstraction was created with testability, extensibility and most features of the existing implementation in mind, which can be ported back to be used by the existing crawlers.

%{{{ fig:crawler_arch
\begin{figure}[h]
\centering
\includegraphics[width=\linewidth]{architecture.drawio.pdf}
\caption{Architecture of the new crawler}\label{fig:crawler_arch}
\end{figure}
%}}}fig:crawler_arch

The new implementation consists of three main interfaces:

\begin{itemize}
\item \textbf{FindPeer}, to receive new crawl tasks from any source

\item \textbf{ReportPeer}, to report newly found peers

\item \textbf{Protocol}, the actual botnet protocol implementation used to ping a peer and request its neighbourhood list
\end{itemize}
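
The three interfaces listed above can be sketched as follows. This is a minimal illustration in Python, not the report's actual code: the method names, the reduced \textbf{PeerTask} (only IP address and port), the \texttt{Peer} alias and the helper \texttt{crawl\_once} are assumptions made for the example, and \textbf{Protocol} is called \texttt{BotnetProtocol} here to avoid a clash with \texttt{typing.Protocol}.

```python
from dataclasses import dataclass
from typing import Protocol

# Hypothetical representation of a discovered peer: (ip, port).
Peer = tuple[str, int]


@dataclass(frozen=True)
class PeerTask:
    """Reduced crawl task; scheduling fields are omitted in this sketch."""
    ip: str
    port: int


class FindPeer(Protocol):
    """Source of crawl tasks (file, coordinator, ...)."""
    def find_peers(self) -> list[PeerTask]: ...


class ReportPeer(Protocol):
    """Sink for newly found peers."""
    def report_peer(self, peer: Peer) -> None: ...


class BotnetProtocol(Protocol):
    """Botnet-specific protocol: ping a peer and request its neighbourhood list."""
    def ping(self, task: PeerTask) -> bool: ...
    def neighbours(self, task: PeerTask) -> list[Peer]: ...


def crawl_once(finder: FindPeer, proto: BotnetProtocol, report: ReportPeer) -> None:
    """One round: fetch tasks, ping each peer, report neighbours of responsive peers."""
    for task in finder.find_peers():
        if proto.ping(task):
            for neighbour in proto.neighbours(task):
                report.report_peer(neighbour)
```

Because the interfaces are structural, any object with matching methods can be plugged in, which is what makes test doubles for the finder, the protocol and the reporter straightforward.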

Currently, \textbf{FindPeer} can draw on two sources: it can read peers from a file on disk or request them from the \ac{grpc} BMS coordinator.
The \textbf{ExactlyOnceFinder} delegate wraps another \textbf{FindPeer} instance and ensures that the wrapped source is queried only once.
It is used to implement the bootstrapping mechanism of the old crawler, which loads the list of bootstrap nodes from a text file once, when the crawler starts.
\textbf{CombinedFinder} combines any number of \textbf{FindPeer} instances and returns the combined results of querying all of them.
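
The two delegating finders described above might look like the following sketch. The method name \texttt{find\_peers}, the \textbf{StaticFinder} stand-in source and the use of plain (ip, port) tuples are assumptions for illustration; the real sources read from a file or from the coordinator.

```python
class StaticFinder:
    """Hypothetical stand-in source that always returns the same peers."""

    def __init__(self, peers):
        self._peers = list(peers)

    def find_peers(self):
        return list(self._peers)


class ExactlyOnceFinder:
    """Queries the wrapped finder on the first call only; later calls return nothing."""

    def __init__(self, inner):
        self._inner = inner
        self._done = False

    def find_peers(self):
        if self._done:
            return []
        self._done = True
        return self._inner.find_peers()


class CombinedFinder:
    """Concatenates the results of all wrapped finders."""

    def __init__(self, *finders):
        self._finders = finders

    def find_peers(self):
        result = []
        for finder in self._finders:
            result.extend(finder.find_peers())
        return result


# Bootstrap peers are delivered once; the second source is asked every time.
bootstrap = ExactlyOnceFinder(StaticFinder([("10.0.0.1", 1000)]))
coordinator = StaticFinder([("10.0.0.2", 2000)])
finder = CombinedFinder(bootstrap, coordinator)
```

Wrapping the bootstrap source rather than special-casing it keeps the crawl loop unaware of where its tasks come from.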

The \textbf{PeerTask} instances returned by \textbf{FindPeer} contain the IP address and port of the peer, whether the crawler should start or stop the operation, when to start and stop crawling, and at which interval the peer should be crawled.
For each task, a \textbf{CrawlPeer} and a \textbf{PingPeer} worker are started or stopped as specified in the received \textbf{PeerTask}.
These workers use the \textbf{ReportPeer} interface to report any new peer they find.
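
As a rough sketch of the task handling described above, a \textbf{PeerTask} could be modelled as a small record, with a registry that starts or stops the per-peer workers. All field names and the \textbf{WorkerRegistry} class are hypothetical; real workers would run concurrently, whereas this example only tracks which peers are active and at which interval.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class PeerTask:
    ip: str
    port: int
    start: bool                            # True: start crawling, False: stop
    start_at: Optional[datetime] = None    # when to begin crawling
    stop_at: Optional[datetime] = None     # when to end crawling
    interval: Optional[timedelta] = None   # None: use the crawler's default


class WorkerRegistry:
    """Tracks the peers that currently have crawl and ping workers (assumed design)."""

    def __init__(self, default_interval: timedelta):
        self._default = default_interval
        self._active = {}  # (ip, port) -> crawl interval

    def apply(self, task: PeerTask) -> None:
        key = (task.ip, task.port)
        if task.start:
            # A per-task interval takes precedence over the default.
            interval = task.interval if task.interval is not None else self._default
            self._active[key] = interval
        else:
            self._active.pop(key, None)

    def interval_for(self, ip: str, port: int) -> Optional[timedelta]:
        """Interval of the active worker for this peer, or None if not crawled."""
        return self._active.get((ip, port))
```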

The currently available reporters are \textbf{LoggingReport}, which simply logs new peers to give feedback from the crawler at runtime, and \textbf{BMSReport}, which reports back to \ac{bms}.
\textbf{BatchedReport} delegates to a \textbf{ReportPeer} instance: it collects newly found peers up to a specified batch size and only then flushes and actually reports them.
\textbf{AutoCommitReport} automatically flushes a delegated \textbf{ReportPeer} instance after a fixed amount of time and is used in combination with \textbf{BatchedReport} to ensure that batches are written regularly, even if the batch limit has not been reached yet.
\textbf{CombinedReport} works analogously to \textbf{CombinedFinder} and combines many \textbf{ReportPeer} instances into one.
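
The reporter delegates described above compose in the same way as the finders. The sketch below is an illustration under assumptions: the method names, the \textbf{ListReport} stand-in sink and the injectable clock are invented for the example, and \textbf{AutoCommitReport} is simplified so that \texttt{poll} must be called periodically (for example from a timer) instead of running its own background task.

```python
import time


class ListReport:
    """Hypothetical stand-in sink that records every reported peer."""

    def __init__(self):
        self.peers = []

    def report_peer(self, peer):
        self.peers.append(peer)

    def flush(self):
        pass  # nothing is buffered here


class BatchedReport:
    """Buffers peers and forwards them once batch_size is reached or on flush()."""

    def __init__(self, inner, batch_size):
        self._inner = inner
        self._batch_size = batch_size
        self._buffer = []

    def report_peer(self, peer):
        self._buffer.append(peer)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self):
        for peer in self._buffer:
            self._inner.report_peer(peer)
        self._buffer.clear()
        self._inner.flush()


class AutoCommitReport:
    """Flushes the wrapped reporter when max_age seconds have passed since the last flush."""

    def __init__(self, inner, max_age, clock=time.monotonic):
        self._inner = inner
        self._max_age = max_age
        self._clock = clock
        self._last_flush = clock()

    def report_peer(self, peer):
        self._inner.report_peer(peer)

    def poll(self):
        # Simplification: a caller invokes this periodically.
        if self._clock() - self._last_flush >= self._max_age:
            self.flush()

    def flush(self):
        self._inner.flush()
        self._last_flush = self._clock()


class CombinedReport:
    """Forwards every peer and every flush to all wrapped reporters."""

    def __init__(self, *reports):
        self._reports = reports

    def report_peer(self, peer):
        for report in self._reports:
            report.report_peer(peer)

    def flush(self):
        for report in self._reports:
            report.flush()
```

Wrapping a \textbf{BatchedReport} in an \textbf{AutoCommitReport} gives both behaviours: full batches are written immediately, and partial batches are written once the time limit expires.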

\textbf{PingPeer} and \textbf{CrawlPeer} use the implementation of the botnet \textbf{Protocol} to perform the actual crawling at predefined intervals, which can be overridden on a per-\textbf{PeerTask} basis.

%}}} implementation

% vim: set filetype=tex ts=2 sw=2 tw=0 et foldmethod=marker spell :
BIN
report.pdf
Binary file not shown.
@@ -29,6 +29,8 @@ headsepline,
\usepackage{amsfonts}
\usepackage{mathtools}

\usepackage{tikz}

% positioning
\usepackage{float}

@@ -85,7 +87,7 @@ headsepline,
% custom commands
\include{commands}

\graphicspath{{assets/dot/}}
\graphicspath{{assets/dot/}{assets/}}

\setcounter{tocdepth}{2}