Argus
From NSMWiki
Introduction
Argus is an open source IP audit tool written by Carter Bullard that has been under development for over 10 years. This wiki is an attempt to fill out the documentation for Argus. The Argus distribution contains man pages that describe the various programs and their command-line switches; what is missing is tutorial-style material describing how one might use Argus day to day. The material here should be used in conjunction with the man pages and the information on the Argus Home.
Argus consists of two parts. The first is a server, which records the network traffic visible from one or more of the NICs on the machine. Argus assembles the information about this traffic into network flows. Data about the flows is written to disk, or to a network socket if a client has connected. The second part is a collection of clients, which read flow data either from Argus log files or directly from an Argus server via a network socket.
Who uses Argus
Many universities, corporations, and government entities use Argus to record both internal traffic flows and flows entering and leaving their network(s). These records are used in both immediate network utilization analysis, and historical analysis or trending. With a sensor network using Argus, organizations may validate the connectivity of end-hosts through multiple routers. If routers A, B, and C are passing traffic for hosts Y and Z, Argus may be used to determine latency and other problems between routers B and C (which may not be apparent in packet captures).
Historical netflow data can be used in forensic investigations several months, or years, after an incident has taken place. Argus' netflow records offer up to a 10,000:1 ratio from the packet size to the record written to disk, which allows installations to save records for much longer than full packet captures. When network security is very important, non-repudiation becomes a very important requirement that must be provided throughout the network. Argus provides the basic data needed to establish a centralized network activity audit system. If done properly, this system can account for all network activity in and out of an enclave, which can provide the basic system needed to assure that someone can't deny having done something in the network.
Network research labs have used Argus to provide network performance measurements of unique protocols, such as Infiniband over IPv6. Argus can be quickly adapted to new protocols, and in some cases, provides the basic metrics without extension. Individuals use Argus in their home networks to give them a heads up on DSL and Cable Modem based networks. Argus provides a higher order view into packet data, that allows a network user the ability to see problems quickly.
Argus server
Clients
What sorts of things can argus do
Argus is primarily a network activity monitoring system. Historically, Argus has been used to support network security management and network forensics through its ability to establish an audit trail of network activity.
Stéphane Peters's Cheat sheet
List originally contributed by Stéphane Peters (v3).
Examples
ragrep example: Finding Palevo / Sality virus activity
As of V3.0.2 ragrep() is obsolete. You should use the newer argus-clients-3.0.2 programs, all of which allow you to grep, using the "-e" option. bash code :
#!/bin/bash
# File : ragrep-sality.sh
s="solfire.aljosaborkovic.com"
s="$s|kukutrustnet777.info"
s="$s|www.kjwre.*fqwieluoi.info"
s="$s|l33t.brand-clothes.net"
s="$s|pica.banjalucke-ljepotice.ru"
s="$s|maellisromance.com"
s="$s|217.32.75.74"
s="$s|pingaksh.com"
s="$s|radio.irib.ir"
s="$s|regal-mont.pondi.hr"
s="$s|sandra.prichaonica.com"
s="$s|sasgrowth.com"
s="$s|snowboard619.w.interia.pl"
s="$s|spargeunid.go.ro"
s="$s|stakrix.st.funpic.de"
s="$s|us516757.bizhostnet.com"
s="$s|www.abassiehmunicipality.com"
s="$s|www.polaris.ge"
s="$s|www.railwayservices.be"
s="$s|www.senaauto.ge"
s="$s|ziyagokalpilkogretim72.meb.k12.tr"
ra -s "+suser:50 -bytes" -e "$s" $* - udp port 53
It is really a one-liner like this, split on several lines for editing.
ra -s "+suser:50 -bytes" -e "solfire.aljosaborkovic.com|kukutrustnet777.info|www.kjwre.*fqwieluoi.info" -nr $file - udp port 53
You need to use "ragrep" in previous versions of argus-clients(3.0.0 for example).
The Palevo and Sality viruses are known to try to connect to one of these sites, so this script identifies the computers that have made such DNS requests and that are therefore, with a high degree of probability, infected.
The resulting RE is an OR of several strings and one further RE (www.kjwre.*fqwieluoi.info), which catches what is probably a random number. The script is launched like this:
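The pattern can also be assembled from a list instead of repeated assignments. The following is a minimal sketch: build_pattern is a hypothetical helper, the sample host name used for the check is made up, and no Argus program is called. It only shows how the alternation string is put together and checked with grep -E before it is handed to the -e option.

```shell
# build_pattern: join the lines read on stdin into one alternation of the
# form "a|b|c". Hypothetical helper, not part of argus-clients.
build_pattern() {
    paste -s -d '|' -
}

# Demo with two fixed names and the one RE taken from the script above.
pattern=$(printf '%s\n' 'kukutrustnet777.info' 'www.kjwre.*fqwieluoi.info' 'pingaksh.com' | build_pattern)
echo "$pattern"

# Check the pattern against a made-up sample name before using it.
echo 'www.kjwre77638dfqwieluoi.info' | grep -E -q "$pattern" && echo "match"
```

Keeping the names one per line makes the list easier to maintain than a single long string.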
ragrep-sality.sh -nr $file
ragrep-sality.sh -nr $file -w /tmp/sality-traces.ra
Here is an output:
StartTime Flgs Proto SrcAddr Sport Dir DstAddr Dport SrcPkts DstPkts State srcUdata
01/03 08:21 e udp 1.0.4.1.44177 <-> 100.0.1.1.53 1 1 CON s[40]=.............sandra.prichaonica.com.....
01/03 08:21 e udp 1.0.4.1.40419 <-> 100.0.1.1.53 1 1 CON s[44]=.............solfire.aljosaborkovic.com.....
01/03 08:21 e udp 1.0.5.1.32200 <-> 100.0.1.1.53 1 1 CON s[40]=.Y...........sandra.prichaonica.com.....
01/03 08:22 e udp 1.0.5.1.29661 <-> 100.0.1.1.53 1 1 CON s[44]=.............solfire.aljosaborkovic.com.....
01/03 08:29 e udp 1.0.5.1.32554 <-> 100.0.1.1.53 1 1 CON s[40]=.............sandra.prichaonica.com.....
01/03 08:30 e udp 1.0.5.1.44465 <-> 100.0.1.1.53 1 1 CON s[44]=.............solfire.aljosaborkovic.com.....
01/03 08:30 e udp 1.0.4.1.29810 <-> 100.0.1.1.53 1 1 CON s[40]=b............sandra.prichaonica.com.....
01/03 08:31 e udp 1.0.4.1.41186 <-> 100.0.1.1.53 1 1 CON s[44]=yc...........solfire.aljosaborkovic.com.....
...
01/03 10:27 * udp 1.0.9.2.42875 <-> 100.0.1.1.53 1 1 CON s[44]=e............solfire.aljosaborkovic.com.....
01/03 10:42 e udp 1.0.15.1.46746 -> 197.0.7.1.53 2 0 INT s[50]= O................V...........sandra.prichaonica.c
01/03 10:42 e udp 1.0.12.1.45079 <-> 100.0.1.1.53 1 1 CON s[40]=.............sandra.prichaonica.com.....
01/03 10:42 * udp 1.0.9.3.31681 <-> 100.0.1.1.53 1 1 CON s[40]=.............sandra.prichaonica.com.....
01/03 10:42 e udp 1.0.15.1.46746 -> 197.0.2.1.53 3 0 INT s[50]= O................V...........sandra.prichaonica.c
01/03 10:42 e udp 1.0.15.1.46746 -> 197.0.3.1.53 3 0 INT s[50]= O................V...........sandra.prichaonica.c
01/03 10:42 e udp 1.0.15.1.46746 -> 197.0.4.1.53 3 0 INT s[50]= O................V...........sandra.prichaonica.c
other
Flow filtering on certain port range :
ra -r $file - dst port gt 1024 and dst port lt 2048
Use racluster() to generate the counts you are looking for:
racluster -m proto -r $file -s proto spkts dpkts sbytes dbytes
Proto SrcPkts DstPkts SrcBytes DstBytes
udp 15567 12390 2912004 3240927
tcp 900187 866302 410506598 722771403
icmp 645 522 123240 61250
Packet Loss (with IP address):
ragraph loss saddr daddr -M 10s -r $file -title 'Packet Loss / IPs' -w ploss.png
Packet Loss (number of packets)
ragraph loss spkts dpkts -M 10s -r $file -title 'Packet Loss / Packets' -w ploss2.png
Jitter (number of packets)
ragraph jitter saddr daddr -M 10s -r $file -title 'Jitter' -w jitter.png
Concurrent transactions:
ragraph trans -M 10s -r $file -title 'Concurrent Transactions' -w transac.png
Note (2010-0617): From the code, it does look like the value is trans/sec. We have explicit
code for controlling that, and it looks like "Trans" doesn't correct for
the GAUGE/AVERAGE artifacts that rrd and rrd_graph generate.
If you make this change to ragraph():
thoth:~ carter$ diff `which ragraph` /tmp/ragraph
1093c1093
< /Trans/ and do {$power[$x] = 1.0 ; };
---
> /Trans/ and do {$power[$x] = $STEP ; };
It will graph the actual 'trans' value in each time bin.
Top talkers & Listeners
racluster -m matrix -r $file -w - | rasort -m bytes | less
Note: piping through 'ra -n' again was redundant and a waste of CPU cycles (FYI: the -s switch is also available for rasort when one requires a different output)
Rastrip always removes argus management transactions, thus having the same effect as a 'not man' filter expression.
To remove the tcp network DSR (data structure record?):
rastrip "-m -net"
(or something like it)
To see if you get something useful:
rastrip "-M time flow metric"
Yes, you can pipe rastrip(). Try something like this:
rastrip -S $server -w - | rasplit [options] -r -
racluster -r $file -M net 192.168.0.0/16 -m daddr/16 - "host 192.168.0.10 or host 192.168.0.11"
% ra -nr $file -s saddr sport daddr dport
SrcAddr Sport DstAddr Dport
1.2.3.58.1140 1.2.4.5.41460
1.2.3.55.4100 1.2.4.5.41460
1.2.3.3.3336 1.2.5.6.135
Split records into 5 minute files
rasplit -M time 5m -S argus-north... -w /var/log/argus/\$srcid/%Y/%m/%d/file.%Y.%m%d.%H.%M.%S
one for every day
rasplit -S radium -M 1d -w /path/argus-\$srcid.%Y.%m.%d.log
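The -w arguments above mix a $srcid placeholder with strftime-style time fields such as %Y and %H. As a rough illustration of how the time fields expand, the sketch below runs the same fields through date. expand_template is a hypothetical helper, the -d option assumes GNU date, and the $srcid part is left out because rasplit fills that in itself.

```shell
# expand_template TIMESTAMP TEMPLATE
# Print TEMPLATE with its %-fields filled in for TIMESTAMP (UTC).
# Assumes GNU date (the -d option); hypothetical helper for illustration only.
expand_template() {
    date -u -d "$1" "+$2"
}

expand_template '2008-02-26 19:05:00' '%Y/%m/%d/file.%Y.%m%d.%H.%M.%S'
```

Trying a template this way before starting a long-running rasplit is a cheap check that the directory layout comes out as intended.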
It is possible to execute a command after each file is written, e.g. to compress it or to insert its data into a database:
rastream -S argus -B 15s -w /archive/\$srcid/%Y/%m/%d/ntam.%Y.%m.%d.%H.%M.%S \
-f /usr/local/bin/rastreamshell
There is an example file in the distribution, SRC/support/Config/rastream.sh :
#!/bin/sh
#
# Argus Client Software. Tools to read, analyze and manage Argus data.
# Copyright (C) 2000-2011 QoSient, LLC.
# All Rights Reserved
#
# Script called by rastream, to process files.
#
# Since this is being called from rastream(), it will have only a single
# parameter, filename,
#
# Carter Bullard <carter@qosient.com>
#
PATH="/usr/local/bin:$PATH"; export PATH
package="argus-clients"
version="3.0.2"
OPTIONS="$*"
FILES=
while test $# != 0
do
case "$1" in
-r) shift; FILES="$1"; break;;
esac
shift
done
racluster -M replace -r $FILES
gzip $FILES
exit 0
Comma separated value
%cat ra3.conf.t
RA_PRINT_LABELS=0
RA_FIELD_DELIMITER=','
RA_PRINT_NAMES=proto
RA_TIME_FORMAT="%y-%m-%d %T"
RA_PRINT_DURATION=no
RA_PRINT_LASTIME=yes
%ra3 -F ra3.conf.t -r icmp3.argus | more
StartTime,Flgs,Proto,SrcAddr,Sport,Dir,DstAddr,Dport,SrcPkts,DstPkts,SrcBytes,DstBytes,State
06-06-27 11:20:28.911941, v ,icmp,142.58.201.99,,->,142.58.201.254,,1,0,102,0,ECO
06-06-27 11:20:28.911946, v ,icmp,142.58.201.99,,->,142.58.201.254,,1,0,102,0,ECO
06-06-27 11:20:28.911951, v ,icmp,142.58.201.99,,->,142.58.201.254,,1,0,102,0,ECO
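With comma-separated output, ordinary text tools can post-process the records. As a minimal sketch (sum_column is a hypothetical helper, the sample rows are copied from the output above, and a header row is assumed to be present), this awk wrapper looks up a column by its header name and totals it:

```shell
# sum_column NAME: read CSV with a header row on stdin and print the sum of
# the column whose header is NAME. Hypothetical helper, not an Argus tool.
sum_column() {
    awk -F',' -v name="$1" '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == name) col = i; next }
        col     { total += $col }
        END     { print total + 0 }
    '
}

sum_column SrcBytes <<'EOF'
StartTime,Flgs,Proto,SrcAddr,Sport,Dir,DstAddr,Dport,SrcPkts,DstPkts,SrcBytes,DstBytes,State
06-06-27 11:20:28.911941, v ,icmp,142.58.201.99,,->,142.58.201.254,,1,0,102,0,ECO
06-06-27 11:20:28.911946, v ,icmp,142.58.201.99,,->,142.58.201.254,,1,0,102,0,ECO
06-06-27 11:20:28.911951, v ,icmp,142.58.201.99,,->,142.58.201.254,,1,0,102,0,ECO
EOF
```

For the three sample rows this prints 306. Looking the column up by name rather than by position keeps the helper working when the -s field list changes.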
racluster -m saddr/23 daddr proto dport -w -r $file - dst net 10.1.2.0/23 \
| rasort -m proto daddr dport dbytes - \
-s ltime saddr sport daddr dport spkts dpkts sbytes dbytes \
|less
To do a top talkers for say IP addresses
(racluster can do it for any object in the record, top mac addrs, top
tos bytes, top mpls label, top vlan, top port, top ttl, etc....):
racluster -M rmon -m saddr -r $file - ip
A list with 2 columns, IP-address and bytes used:
racluster -M rmon -m saddr -r $file -w - - ip \
| rasort -m bytes -s saddr bytes | head -20
... not to be confused with :
racluster -M rmon -m saddr -r $file -w - - ip \
| rasort -N 20 -m bytes -s saddr bytes
... equivalent to :
racluster -M rmon -m saddr -r $file -w - - ip \
| ra -N 20 | rasort -m bytes -s saddr bytes
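As the text above describes it, the difference between these forms is only the order of the steps: the first sorts every record and then keeps 20 lines, while the other two keep the first 20 records and then sort those. A small sketch with plain sort and head on made-up "address bytes" lines (no Argus tools involved, all names hypothetical) shows why the results can differ:

```shell
# Made-up sample: address and byte count, in arrival order.
sample() {
    printf '%s\n' '10.0.0.1 500' '10.0.0.2 100' '10.0.0.3 900' '10.0.0.4 700'
}

# Sort everything by bytes (descending), then keep the first N lines.
sort_then_head() { sort -k2,2nr | head -n "$1"; }

# Keep the first N lines in arrival order, then sort only those.
head_then_sort() { head -n "$1" | sort -k2,2nr; }

echo "sort, then first 2:"; sample | sort_then_head 2
echo "first 2, then sort:"; sample | head_then_sort 2
```

Only the first variant returns the two largest entries; the second returns whichever two records happened to arrive first.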
A list with 2 columns, IP-address and bytes used (carter version):
racluster -M rmon -m proto sport -r $file -w - - ip | \
rasort -m bytes proto sport -s stime dur proto sport spkts dpkts sbytes dbytes
Monitoring of 802.1q packets is already supported. If your input traffic carries VLAN tags, adding
-s +svlan +dvlan
to your ra command will display the VLAN tag values in hex form and you can filter ra (or other clients) traffic on vlan tags.
To see the VLAN in decimal form, use these options:
-s +svid +dvid
Top src address based on src bytes in a collection of records
racluster -m saddr -w - -R 2006/09/28 - ip | rasort -m sbytes
Top address, regardless of direction (The "-M rmon" folds the src and dst addresses together, putting the values into the saddr field.):
racluster -M rmon -m saddr -w - -R 2006/09/28 - ip | rasort -m sbytes
2007-0305 (Argus-info Digest, Vol 19, Issue 5) What is the current best way to get a report like :
ramon -nn -L0 -M svc -r $file - | head -25
racluster -M rmon -m proto sport -r $file -w - - tcp or udp | \
ra -N 25 -s proto sport spkts dpkts sbytes dbytes
2007-0321 (Argus-info Digest, Vol 19, Issue 30) Looking for functionality like: ramon -M TopN or -M Matrix try this:
racluster -r $file -M rmon -m saddr - ip ( this generates stats based on IP address)
racluster -r $file -m matrix - ip (based on IP matrix)
to do whatever TopN you want, pipe the output to rasort(). So to get the Top10 in packets received and transmitted:
racluster -r $file -M rmon -m saddr -w - | rasort -m pkts -w - | ra -N 10
To get the Top5 in bytes per second transmitted:
racluster -r $file -M rmon -m saddr -w - | rasort -m srate -w - | ra -N 5 -s +srate
2007-1102 (Argus-info Digest, Vol 27, Issue 2) I(Terry) run the following collectors:
/opt/argus/sbin/argus -X -d -A -i eth2 -P 561
/opt/argus/sbin/radium -X -d -C -S 1006 -P 564
/opt/argus/sbin/radium -X -d -C -S 1007 -P 565
I(Terry) have another process that aggregates these:
/opt/argus/sbin/radium -X -d -S localhost:561 -S localhost:564 -S \
localhost:565 -P 569
2008-0215 Some examples of ragraph: ( http://search.gmane.org/?query=ragraph&group=gmane.network.argus )
ragraph bytes proto -M 60s -r strange-broadcast-10000.argus -fill -stack \
-w ./strange-broadcast-10000.png
ragraph -r inputfiles* -t 12-13
ragraph spkts dport -M 1h -n -n -r argus.dat.04 - src net X/20
ragraph pkts dport -M 10s -T 60 -S 192.168.1.101 -p0
ragraph bytes saddr -M 1m -m saddr/24
rabins -M soft zero -p6 -GL0 -s ltime bytes -nn -M 1m \
-r $files - srcid eligate1 and icmp | head
ragraph sbytes dbytes -M rmon time 1m -m smac -t 2007/10/04 \
-r $file -w ragraph.png -- ether host 00:15:F2:64:92:13
ragraph pkts proto -M 1m -title 'eligate2: protocol distribution' \
-height 200 -t 2007/10/04 -r /var/log/argus/argus.log \
-w /var/www/argus/eligate2/proto/current.png - srcid eligate2
rahisto -r datafile -H drate 140:100-170K
bash> for i in 1s 2s 5s 10s 15s 20s 30s 45s 1m 2m 5m 10m 15m 20m 30m 1h 2h; do echo $i ;\
ragraph rate dport -M $i -r output.file -t 18-20 -m proto dport -upper 5000 -lower 7000 \
-title "Aggregation Metric Distribution Analysis - Resolution $i" ;\
mv ragraph.png aggregation.$i.png; done
rasort -R ${stats_dir}/.../day -m bytes smac saddr -w - \
| ra -N 20 -w top20.talkers.list
; ra -s addr -r top20.talkers.list > addrs.list
; rafilteraddr -f addrs.list -R ${stats_dir}/..../daily > /tmp/data
; ragraph spkts dpkts saddr -M 1m -w /tmp/ragraph.png
2008-0228 (Argus-info Digest, Vol 30, Issue 41)
to insert data every 5 minutes, it can be as easy as:
rastream -S live.argus.stream -f yourMysqlImport.sh -M time 5m -B 15s \
-w /opt/ARGUS/OUTBOUND/%Y/%m/%d/argus.%Y.%m.%d.%H.%M.%S
This would generate an argus archive broken out by year/month/day, containing one file every 5 minutes. 15 seconds after the end of each 5 minute clock boundary, your script would be run against the file, indexing the data and then compressing the file. It could also remove the file if you're not interested in keeping the archive.
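A script passed with -f might be structured like the sketch below. It follows the sample rastream.sh shown earlier, which picks the file name from a "-r" argument. The function names and the placeholder import step are assumptions, and the real indexing or database import is left as a comment.

```shell
# file_from_args: print the file name that follows "-r" in the argument list,
# mirroring the option loop of the sample rastream.sh. Hypothetical helper.
file_from_args() {
    while [ $# -gt 0 ]; do
        case "$1" in
            -r) shift; printf '%s\n' "$1"; return 0 ;;
        esac
        shift
    done
    return 1
}

# process_file: placeholder for the real work (import, then compress).
process_file() {
    file=$(file_from_args "$@") || { echo "no -r argument" >&2; return 1; }
    # ... run your own import command against "$file" here ...
    echo "would import and then gzip: $file"
}

process_file -r /opt/ARGUS/OUTBOUND/2008/02/28/argus.2008.02.28.10.05.00
```

Separating the argument parsing from the processing step makes it easy to test the script by hand on an existing file before attaching it to a live stream.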
2008-0305 (Argus-info Digest, Vol 31, Issue 6)
When the records are not well formed, you need the "-M rmon" option
to make the records direction-less. Because of the direction-less nature
you can use "dport" or "sport" as the merge key, but you have to be consistent,
as you will need to pipe the output to ra() to select the ports you're interested in:
racluster -M rmon -r $file -m proto dport -w - | \
ra -L 0 -s stime dur proto dport spkts dpkts sbytes dbytes - dst port 80 or 443
equivalent to (in argus clients v2.0.6)
ramon -M Svc -nn -r argus-$DATE.arg - port 80 or 443
Bandwidth usage flow by flow on 26th Feb from 19h to 20h,
unnecessary columns have been cut to keep every record on a single line
( from : http://www.vorant.com/nsmwiki/Argus#How_do_I_do_IP_accounting_by_IP :-)
cd /archive/2008/02/26
racluster -w - -M rmon -m saddr daddr -r argus.19.00.00.gz -w - - ip and dur gt 1 \
| rasort -m sload -w - \
| ra -N 15 -p 0 -s "-flgs -proto -dir -state +avgdur +sload +dload +trans"
List all possible state fields of a file
% ra -r $file -nn | awk '{print $NF}' | sort | uniq -c | sort -nr
91104 CON
77066 FIN
65763 TIM
55618 ECO
41232 INT
28724 RST
798 ECR
467 URP
2 CLO
1 STA
2008-0312 (Argus-info Digest, Vol 31, Issue 15) Print headers in ra* version 3.*
"-L 0" will print the headers once, "-L 40" will print the headers every 40 lines, etc ...
2008-0312 (Argus 3: Statistics for Major Protocols) (C.S. Lee) You can cluster or merge the records based on the flow key, which is suitable for data mining, data management and report generation. Let's generate a statistical report using the protocol as the flow key. Note that the command below specifies -m proto and uses -s to print the wanted fields:
racluster -L0 -m proto -r $file -s proto trans pkts bytes appbytes - tcp or udp or icmp
2008-0317 On which date did this long-running argus file start? (By default, ra* clients use the "%T" format, i.e. HH:MM:SS.)
cat /tmp/rarc
RA_TIME_FORMAT="%D %T"
ra -s "stime" -F /tmp/rarc -N 1 -L 0 -nr $file
StartTime
02/29/08 18:42:55
2008-02-28 simple gnuplot plot file to generate a graph of "Total Bytes By Protocol" using argus data; assuming gnuplot is installed in /opt/local/bin/gnuplot (Carter Bullard).
% chmod 755 barchart.bytesxproto.plt
% racluster -m proto -r argus.out -s proto spkts dpkts sbytes dbytes > racluster.dat
% ./barchart.bytesxproto.plt
------ begin barchart.bytesxproto.plt ------
#!/opt/local/bin/gnuplot -persist
#
# G N U P L O T
# Version 4.2 patchlevel 2
# last modified 31 Aug 2007
# System: Darwin 9.2.0
#
# Copyright (C) 1986 - 1993, 1998, 2004, 2007
# Thomas Williams, Colin Kelley and many others
#
# Type `help` to access the on-line reference manual.
# The gnuplot FAQ is available from http://www.gnuplot.info/faq/
#
# Send bug reports and suggestions to <http://sourceforge.net/projects/gnuplot>
#
reset
#
# Create simple barchart of Total Bytes by Protocol
# The racluster.dat file was generated using:
#
# racluster -m proto -r argus.out -s proto spkts dpkts sbytes dbytes
#
# And is of the format:
#
# Proto SrcPkts DstPkts SrcBytes DstBytes
# pim 53267 18086 48793554 1085160
# ospf 1764 0 213220 0
# [more]
#
set termoption font "Verdana, 12"
set size square 0.90,0.90
set bmargin 4
set title "Total Bytes By Protocol" font "Verdana,22"
set style data histogram
set style histogram cluster gap 1
set style fill solid border -1
set tics font "Verdana,14"
set boxwidth 0.80
set grid
set ylabel "Log Total Bytes" font "Verdana,18"
set logscale y 10
set auto y
set label 1 "Generated by Argus using Gnuplot"
set label 1 at graph 1.02, 0.62 rotate by 90 font "Verdana,9"
#
set key autotitle columnhead
plot 'racluster.dat' using 4:xticlabels(1) ti col, \
     '' using 5 ti col
#
------ end barchart.bytesxproto.plt ------
2008-0326 Count flows in groups of 10 minutes: show only the flow start times, cut after the tens-of-minutes digit, strip the first line (headers), add a trailing zero and delete leading spaces to show a nice HH:MM line, count them, swap the columns, and insert a delimiter. The result is ready to be fed into your favorite spreadsheet.
echo 'RA_TIME_FORMAT="%H:%M"' > raTime.conf
ra -F raTime.conf -s stime -nr $file | \
cut -c -4 | \
uniq -c | \
sed -e '1d' \
-e 's/$/0/' \
-e 's/^ *//' \
-e 's/\(.*\) *\(.*\)/\2,\1/' > flowcounts.csv
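The same binning can be done in one awk step instead of cut, uniq and sed. This is a sketch under assumptions: bin10 is a hypothetical helper, and its input is one HH:MM start time per line, such as the ra command above prints; lines that do not look like a time (the header, for instance) are skipped.

```shell
# bin10: read HH:MM times on stdin, print "HH:M0,count" per 10-minute bin,
# in order of first appearance. Hypothetical helper for illustration.
bin10() {
    awk '
        /^ *[0-9][0-9]:[0-9][0-9]/ {
            sub(/^ +/, "")                  # drop leading spaces
            bin = substr($0, 1, 4) "0"      # 08:27 -> 08:20
            if (!(bin in count)) order[++n] = bin
            count[bin]++
        }
        END { for (i = 1; i <= n; i++) print order[i] "," count[order[i]] }
    '
}

printf '%s\n' 'StartTime' '08:21' '08:29' '08:30' '10:42' | bin10
```

Unlike uniq -c, this version does not depend on the records arriving in time order, since each bin is counted in an array.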
2008-0409 Carter's version, thanks to Nick Diel - This example assumes you have already merged status flow records, so records = flows, if not add another pipe of racluster. If you have multiple collectors, you can have rabins merge on something else
such as proto if you are filtering on tcp.
echo 'RA_TIME_FORMAT="%H:%M"' > raTime.conf # (you could also add this to your rarc file)
rastrip -r $file -M -agr -w - | \
rabins -M nomodify time 10m -m srcid -s stime trans -c , -F raTime.conf > flowcounts.csv
2008-0409 Carter's note : When you only want a single flow counted once, in the time bin
when it started. To do this you don't want to modify/split the flow records, so use this option:
rabins -M nomodify
2008-0409 Stéphane Peters : Small all-purpose script to count and totalize all columns : /bin/tot
#!/bin/awk -f
BEGIN { max = 0 }
{
    if ( NF > max ) max = NF;
    for ( i=1 ; i <= NF ; i++ ) {
        tot[i]+=$(i);
    }
}
END {
    for ( i=1 ; i <= max ; i++ ) {
        if ( tot[i] > 1000000 )
            printf "%sm\t", tot[i]/1000000;
        else if ( tot[i] > 1000 )
            printf "%sk\t", tot[i]/1000;
        else if ( tot[i] == 0 )
            printf "-\t";
        else
            printf "%s\t",tot[i];
    }
    printf "\n";
}
2007-10-04 Wolfgang Barth : "I'm using the following code for graphing interface load:" (http://thread.gmane.org/gmane.network.argus/5338/focus=5348)
/usr/local/bin/rabins -M rmon 1m -m smac -t 2007/10/04 \
-r /var/log/argus/argus.log -w - - srcid eligate2 | \
/usr/local/bin/ragraph sbytes dbytes -M 1m -title 'eligate2: Load' \
-height 200 -upper 1000000 -rigid -lower 1000000 -rigid -t 2007/10/04 \
-w /var/www/argus/eligate2/load/current.png -r - - ether dst 00:15:F2:64:92:13
2008-06-25 From Peter Van Epp: How to put commas in large numbers (http://article.gmane.org/gmane.network.argus/6062)
The following perl fragment will add commas if you run the ra output through an appropriate perl script:
sub commas {
local($_) = @_;
1 while s/(.*\d)(\d\d\d)/$1,$2/;
$_;
}
and called like this:
$pcount = &commas($count);
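If the rest of the pipeline is shell rather than perl, the same substitution loop can be written with sed. This is a minimal sketch; the function name commas is reused here only for illustration.

```shell
# commas NUMBER: print NUMBER with a comma between each group of three digits.
# Same idea as the perl loop: repeat the substitution until nothing changes.
commas() {
    printf '%s\n' "$1" | sed -e ':a' -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/' -e 'ta'
}

commas 722771403
```

Each pass inserts one comma, starting from the right, and the "ta" branch repeats the substitution for as long as it still matches.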
2008-12-29 (Argus-info Digest, Vol 40, Issue 5) ragraph with large files
Carter : When you are graphing objects like ports, you can use the aggregation features of ragraph() to minimize the memory use. For example, you can use "-m proto dport" in :
ragraph dbytes sbytes dport -M 5m -t $time -fill -stack -invert -title \"$title\" \
$log -w $filename $filter
That should constrain your graph so that it doesn't use much memory at all (max should be, what, 64K ports for udp and tcp in memory for each 5m period). Thing to note : the destination port field doesn't decode without the protocol field having a valid value.
2009-02-13 (Argus-info Digest, Vol 42, Issue 15) Radium repository example
rasplit -M time 5m -S radium -w experiment/\$srcid/%Y/%m/%d/argus.%Y.%m.%d.%H.%M.%S
ra -S remoteRadium/path/to/specific/argus/file/argus.2009.02.13.15.20.00.gz
2009-04-24 (Argus-info Digest, Vol 44, Issue 35) argus reads tcpdump files
First, creation of the tcpdumpfile (CAP / PCAP format), followed by the conversion with argus
tcpdump -i eth0 -n -w testdump ; argus -mAJZR -r testdump -w testdump.arg3
ralabel example
(Argus-info Digest, Vol 59, Issue 33)
First, create a "ralabel.conf" file:
RALABEL_ARGUS_FLOW=yes
RALABEL_ARGUS_FLOW_FILE="argus-flow-file"
Second, create an "argus-flow-file" :
# Argus-flow-file
#
# Our application
filter="host 10.1.2.3 and port 80" label="Appserver - web traffic"
filter="host 10.1.2.3" label="Appserver - other traffic"
# Proxy
filter="host 10.1.2.4 and port 8080" label="Proxy server - normal traffic"
filter="host 10.1.2.4 and port 80" label="Proxy server - web traffic"
filter="host 10.1.2.4" label="Proxy server - other traffic"
filter="udp and port 53" label="DNS traffic"
Use it (some fields have been removed to fit the wiki page) :
ralabel -f ralabel.conf -nr $f -s "-status -sbytes -dbytes +label:40"
Result:
26/07 11:59 tcp 1.0.2.2.9405 -> 10.2.3.4.80 503846 RST flow=Proxy server - web traffic
26/07 11:59 tcp 10.2.3.4.8080 <?> 1.0.3.1.8248 163 CON flow=Proxy server - normal traffic
26/07 11:59 tcp 1.0.4.1.8820 -> 10.1.2.3.80 9895 FIN flow=Appserver - web traffic
26/07 11:59 icmp 1.0.5.1.8 <-> 10.2.3.4.11736 204 ECO flow=Proxy server - other traffic
26/07 11:59 tcp 1.0.6.1.9286 -> 10.2.3.4.8080 5381 FIN flow=Proxy server - normal traffic
26/07 11:59 tcp 1.0.4.1.8821 -> 10.1.2.3.80 1475 FIN flow=Appserver - web traffic
26/07 11:59 icmp 1.0.5.1.8 <-> 10.2.3.4.11736 204 ECO flow=Proxy server - other traffic
26/07 11:59 icmp 1.0.5.1.8 <-> 10.2.3.4.11736 204 ECO flow=Proxy server - other traffic
26/07 11:59 tcp 1.0.7.1.57268 -> 10.2.3.4.8080 1208 CON flow=Proxy server - normal traffic
26/07 11:59 tcp 1.0.8.1.9265 <?> 10.2.3.4.8080 242 CON flow=Proxy server - normal traffic
26/07 11:59 tcp 1.0.9.1.22513 -> 10.2.3.4.8080 9252 FIN flow=Proxy server - normal traffic
26/07 11:59 tcp 1.0.9.1.22516 -> 10.2.3.4.8080 9200 FIN flow=Proxy server - normal traffic
26/07 11:59 tcp 1.0.9.1.22518 -> 10.2.3.4.8080 155672 FIN flow=Proxy server - normal traffic
rasplit example, working with pipes
(Argus-info Digest, Vol 76, Issue 21, Jesse Bowling)
To have argus generate both a flow file as well as a pcap file of the data as it's captured... a hackish way to go about it...YMMV:
mkfifo tcpdump.fifo
mkfifo argus.fifo
tcpdump -r tcpdump.fifo -w /pcaps/%Y_%m_%d_%H%M_test.pcap -G 300 &
argus -r argus.fifo -w - | rasplit -r - -M time 5m -w /argus/%Y_%m_%d_%H%M_test.argus &
tcpdump -i eth0 -s 2048 -w - | tee argus.fifo > tcpdump.fifo &
ratop filters
(2007-02-14 04:46:06, Carter Bullard)
There are three types of filters in ratop(), the first is a remote filter, which will be transmitted to a remote argus source, thus limiting the amount of traffic on the wire. The second is a local input filter. You would use this type of filter if the remote does not support the type of filter you want to use. This is a compatibility feature. The third is a display filter, which will control what records are displayed, without affecting the internal buffers of ratop().
You differentiate the filter types using the keywords "remote", "local" and "display".
Without a keyword, you get "remote", and the remote filter is sent, if there is an argus server to send it to, and it is used as an input filter for ratop().
So ... try this:
ratop -r file
This causes ratop() to process the file without any type of input filtering. Once the data is done, then in ratop(), call up the "Specify filter: " prompt by typing:
f
and then at the prompt type:
display tcp and dst port 80
ratop various commands
(2012-01-23, Carter Bullard)
Command '/','?', 'n', 'N' - search
ratop.1 is like 'vi', in that you will have at the bottom of the screen a status line. If you type '/', you go into forward search mode, and you can type any string, then carriage return, and like 'vi', the cursor will bounce to that string in the developing flow cache display that ratop.1 is printing. That is a regex, so you can put really bizarre things in there. It will search across multiple pages, and then 'N', and 'n' allow you to go to the next or previous. These search on the actual strings on the screen, so you have to have fields displayed in order to search on them. If you type '?' you go into backwards search mode.
Control-r: Reverse flow direction
One that is important is the control-r command, as that reverses the direction of a specific flow record on the screen. Get the cursor to a line you want to reverse, then hit 'control-r'.
Command ':' - options
If you type ':' you will be in command mode, where you can type options and commands. Command 'h' prints the help screen. Using the ':' command method you can, for example, change the sorting algorithm on the fly (command 's') or change the fields (command 'F').
Command ':H' - Human bytes
At any time while ratop is reading data, type command 'H', and most of the numeric metrics, such as bytes, appbytes, packet counts, rates, loads, etc., will be converted to the appropriate abbreviations. 'H' is a toggle, so you can hit it as many times as you like to flip the abbreviations, and when you're done, carriage return will put ratop.1 back into navigation mode.
Command ':s' - Save cache
Because you can do a command 's' at anytime to save the cache that ratop.1 is working with, you can use ratop.1 to do corrections on flow records.
Command ':a' - add
One that is not mentioned in the help screen is the 'A' option, to add new lines to the display. The only one supported right now is 'totals', which puts the aggregate of the entire cache that ratop.1 is working with as the first line on the display. Remove it with "-totals", so type:
:atotals
gives you an "Add: " prompt, then type "totals", then carriage return.
As in 'vi', the 'h', 'j', 'k', 'l' navigation works, so you can move the cursor around if your arrow keys don't work.
various parameters
"-M hex" : ra hex dump
(2012-01-22 20:28:50, Carter Bullard, "Re: argus client obfuscation")
ra* programs currently support the "-M ascii", "-M hex", "-M encode32", and "-M encode64" command-line options, which are undocumented. I will change this support to " -M printer=ascii", "-M printer=hex" ...
Useful Links
Argus - Downloads (official download page; current stable version v3.0.0 since Apr 2008!)
http://www.qosient.com/argus/downloads.shtml
Argus - Home
http://www.qosient.com/argus/index.shtml
Argus - FAQ
http://www.qosient.com/argus/faq.shtml
Argus - Development File Listing dev (download page for the next release in development)
http://qosient.com/argus/dev/
Argus - Previous versions
http://qosient.com/argus/src/
Argus news on Gmane
http://news.gmane.org/gmane.network.argus (thread look)
http://blog.gmane.org/gmane.network.argus (blog look)
Argus - NSMWiki (but ... it's here !)
http://nsmwiki.org/Argus
Argus - WTFWiki (another one, updated 2012-07)
http://wtf.hijacked.us/wiki/index.php/Argus
Argus - Documentation / How To File
http://www.qosient.com/argus/howto.shtml
http://web.archive.org/web/20080119143705/http://qosient.com/argus/how-to.htm (last copy of old version)
[ARGUS] rahisto dialog
http://blog.gmane.org/gmane.network.argus/month=20061121
Argus Tips and Tricks: more than 17 extensive posts from C.S. Lee about Argus (When {Puffy} Meets ^RedDevil^: of C.S. Lee)
A good starting point to understand how the argus records work
http://geek00l.blogspot.com/2007/12/network-flow-demystified.html
Packets -> Flows -> CSV -> Graph
http://geek00l.blogspot.com/2007/11/packet-flow-csv-graph.html
Argus 3: Statistics for Major Protocols
http://geek00l.blogspot.com/2008/01/argus-3-statistics-for-major-protocols.html
Argus 3: German Article
http://geek00l.blogspot.com/2008/01/argus-3-german-article.html
... for the remaining ones, look at C.S. Lee's blog posts talking about Argus3 :
http://geek00l.blogspot.com/search/label/Argus3
Beber l'ARGUSnaute (French tutorial)
http://www.minithins.net/warehouse/argus.txt
Scan of The Month 28 (one use of argus : analysis of a successful compromise and the attacker's actions after it.)
http://old.honeynet.org/scans/scan28/sol/2/index.html
Argus – Auditing network activity (PDF of Russ McRee)
http://holisticinfosec.org/toolsmith/docs/november2007.pdf
ArgusEye - A GUI for Argus
http://www.datenspionage.de/arguseye/
RRDtool - About RRDtool, used by ragraph
http://oss.oetiker.ch/rrdtool/
AfterGlow
http://afterglow.sourceforge.net/
argus.traffic.perl (Peter Van Epp) - multiple probe setup - was written for argus 2.0.6
ftp://ftp.sfu.ca/pub/unix/argus/
echo_response.pl (Peter Van Epp) A perl script for finding echo response packets.
ftp://ftp.andrew.cmu.edu/pub/argus/contrib/echo_response.pl
What we do with argus (Russel Fulton) (earlier mailing list on osdir.com)
http://osdir.com/ml/network.argus/2001-06/msg00090.html
ARGUS-2.0.1 IPFX BOF - IETF London, UK - Wed, August 8, 2001 ; slide show (Carter Bullard)
http://www.ietf.org/proceedings/51/slides/ipfx-2/sld001.htm
Re: Combination of snort and argus (Russel Fulton) - Description of an argus record
http://osdir.com/ml/security.ids.snort.sigs/2002-10/msg00109.html
Example working on the LBNL/ICSI Enterprise Tracing Project with sguil / argus (Richard Bejtlich)
http://taosecurity.blogspot.com/2007/05/lbnlicsi-enterprise-tracing-project.html
Howto run argus on OpenWRT (CS Lee)
http://geek00l.blogspot.com/2009/04/argus-3x-on-linksys-wrt54gl.html
FAQ
There is an FAQ on www.qosient.com which covers most of the basic issues you are likely to face when starting out with Argus. This is required reading for all new argus users.
Feel free to add new questions and answers here:
How do I do IP accounting by IP
Argus 3.0 provides the fundamental data needed to generate an accounting of network activity from many different perspectives. Because Argus data can contain ethernet addresses, VLAN tags, MPLS labels, Layer 4 protocol identifiers, and port numbers, argus data can provide the information needed to do Layer 2, sub Layer 3, and Layer 4+ accounting. Argus data is capable of this through flexible aggregation, provided by the client program, racluster().
A useful task for racluster() is to provide an accounting for each IP address in a stream of argus data. However, this is actually quite a complex task, because of the nature of argus data (bi-directional flow data) and what a single address accounting actually represents. I'll get into this later.
Racluster() is the program of choice for generating an IP address inventory and the metrics that can be derived for each address. It provides simple command line directives to generate the list of IP addresses and their activity. Let's assume you have a file, argus.data, that contains the records of interest. To generate the simple per IP address data, run racluster() with these options:
racluster -M rmon -m saddr -r argus.data - ip
This will take in IP data records and track each individual IP address. After processing the entire file, it will output argus data that provides aggregated statistics, such as the transaction count and the total byte and packet counts, for each IP address observed. Let's look at the options in detail.
The "-M rmon" option is the secret sauce that converts bi-directional flow data, which contains two IP addresses in each record, into IETF RMON style data, where you get a listing of individual IP addresses, with the input and output packet and byte counts observed in the data.
The "-m saddr" option specifies the aggregation model for the data and, in this case, causes racluster() to aggregate using the source IP address as the flow key. This results in racluster() providing records with unique IP addresses in the "saddr" field of the output argus records.
The "-r argus.data" option simply specifies some input data. You could use the "-R" option to specify an entire filesystem of argus data, or you could use the "-S" option to specify a remote stream of argus data. Distribution programs like radium() allow you to read remote argus data files using "-S hostname:/full/path/name/to/the/argus.data", so there are a lot of options for getting data into racluster().
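The effect of combining "-M rmon" with "-m saddr" can be illustrated outside of Argus. The following is a minimal Python sketch, not racluster()'s actual implementation: the record layout, the sample addresses, and the function name rmon_by_saddr are all assumptions made for illustration. Each bidirectional record is duplicated with its fields reversed, and every copy is then aggregated on its source address.

```python
from collections import defaultdict

# Hypothetical, simplified bidirectional flow records (not the real argus
# record layout): (saddr, daddr, spkts, dpkts, sbytes, dbytes)
flows = [
    ("10.0.0.1", "10.0.0.2", 9, 13, 1055, 11936),
    ("10.0.0.3", "10.0.0.2", 4, 5, 400, 900),
]


def rmon_by_saddr(records):
    """Mirror each record, then aggregate every copy on its source address."""
    totals = defaultdict(lambda: {"trans": 0, "out_pkts": 0, "in_pkts": 0,
                                  "out_bytes": 0, "in_bytes": 0})
    for saddr, daddr, spkts, dpkts, sbytes, dbytes in records:
        # The original record and its reversed twin: each endpoint gets a
        # turn as the "saddr" of a record.
        copies = (
            (saddr, spkts, dpkts, sbytes, dbytes),
            (daddr, dpkts, spkts, dbytes, sbytes),
        )
        for addr, out_pkts, in_pkts, out_bytes, in_bytes in copies:
            entry = totals[addr]
            entry["trans"] += 1
            entry["out_pkts"] += out_pkts
            entry["in_pkts"] += in_pkts
            entry["out_bytes"] += out_bytes
            entry["in_bytes"] += in_bytes
    return dict(totals)


per_ip = rmon_by_saddr(flows)
for addr in sorted(per_ip):
    print(addr, per_ip[addr])
```

In this sketch 10.0.0.2 never appears as a source in the input, yet it still gets its own entry with both directions accounted for, which is the point of the RMON-style conversion.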
racluster() outputs argus data, so you can use the output to generate reports and graphs, and in other types of argus data processing. A common practice is to read the output of racluster() using rasort() to generate a "top" whatever list. Say you wanted the top 15 IP addresses, based on input packet load: run racluster(), as above, to generate the IP address data, and then run rasort() using these options:
rasort -m sload -r racluster.output.file -w - | ra -N 15
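The sort-then-truncate step is simple enough to sketch in a few lines of Python. This is only an illustration of the idea, assuming records already reduced to dictionaries; the field values and the helper name top_n are hypothetical and are not part of the argus-clients tools.

```python
import heapq


def top_n(records, field, n=15):
    """Return the n records with the largest value in `field`, largest first."""
    return heapq.nlargest(n, records, key=lambda rec: rec[field])


# Hypothetical per-address records, as might be derived from racluster() output.
records = [
    {"saddr": "10.0.0.1", "sload": 1200.0},
    {"saddr": "10.0.0.2", "sload": 98000.5},
    {"saddr": "10.0.0.3", "sload": 430.25},
]

top = top_n(records, "sload", n=2)
for rec in top:
    print(rec["saddr"], rec["sload"])
```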
Having racluster() write its output to a file allows you to read in prior runs of racluster() before reading the raw data. This type of preprocessing allows you to "seed" your IP address accounting system with existing lists of IP addresses. Many sites use this strategy to keep a running total of all the IP addresses seen over some arbitrary period of time, say the last 28 days, and it provides the basic data needed for a windowed IP accountability facility, such as that needed to provide "traceback". As an example, you can process data to generate hourly IP address lists, store them in an argus-archive-like filesystem, and then use that data to generate daily/weekly/monthly/annual IP address lists, such as this:
racluster -m saddr -r 2006/12/10/ipaddrs.2006.12.10.* -w ipaddrs.2006.12.10.daily.out
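The roll-up from hourly lists to a daily list amounts to summing per-address counters. A minimal Python sketch of that idea follows, assuming each hourly list has already been reduced to an address-to-byte-count mapping; the sample data and the name merge_address_lists are hypothetical.

```python
from collections import Counter


def merge_address_lists(*hourly_lists):
    """Sum per-address counters from several shorter periods into one total."""
    total = Counter()
    for counts in hourly_lists:
        total.update(counts)  # adds counts for addresses already present
    return total


# Hypothetical hourly byte counts per address.
hour_00 = {"10.0.0.1": 1200, "10.0.0.2": 300}
hour_01 = {"10.0.0.2": 700, "10.0.0.9": 50}

daily = merge_address_lists(hour_00, hour_01)
for addr in sorted(daily):
    print(addr, daily[addr])
```

Feeding a previous total back in as the first argument corresponds to the "seeding" described above.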
Conversely, by having the list of active IP addresses, you can build the list of addresses in your address space that are not used. This can be used as a "dark net" inventory, useful for doing scan detection.
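As a rough sketch of the "dark net" idea, the unused addresses are the set difference between the address space and the active list. The example below uses Python's standard ipaddress module; the network (a documentation prefix), the active list, and the function name dark_addresses are assumptions for illustration.

```python
import ipaddress


def dark_addresses(network, active):
    """List the usable host addresses in `network` that are not in `active`."""
    net = ipaddress.ip_network(network)
    seen = {ipaddress.ip_address(addr) for addr in active}
    return [str(host) for host in net.hosts() if host not in seen]


# Hypothetical active list, e.g. the saddr column of a racluster() report.
active = ["192.0.2.1", "192.0.2.2", "192.0.2.5"]
dark = dark_addresses("192.0.2.0/29", active)
print(dark)
```

Traffic destined to any address in the resulting list is a candidate for scan detection, since no legitimate host is expected to answer there.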
How do people use Argus and manage its files?
(Carter's response to Torbjorn.Wictorin, Argus-info Digest, Vol 48, Issue 38)
Most universities and corporations that run argus use it along with snort, or some other type of IDS, at their enterprise border. They use the IDS as their up-front security sensor, and argus as the "cover your behind" technology. The two basic strategies are to keep all their argus data to support historical forensics, or to toss it after looking at the IDS logs and seeing that not much is/was happening.
The first approach is usually chosen by sites that have technically advanced security personnel, that have been seriously attacked, or that for some reason have a real awareness of the issues and know that the commercial IDS/IPS market is lacking. For sites that are underfunded or are less technically oriented, argus or argus-like strategies usually aren't being used. If these types of sites are using flow data, it's almost always Netflow data, and they are using a commercial report generator to give the data some utility. These strategies normally do not store significant amounts of flow data, as that would be a cost to the customer.
So when a site does collect a lot of flow data, they generally partition the data for scaling (like you are doing). Universities and small corporations generate argus data in the subdomains/workgroups/dorms, where 500GB can store a year's worth of flow data.
When the point of collection is the enterprise boundary, and a site is really using the data, and justifying the expense of collecting it all, the site invests in storage, but they also do a massive amount of preprocessing to get the data load down.
Most sites generate 5m-1h files. We recommend 5 minutes. Most sites run racluster() with the default settings on their files, sometime early in the process, and then gzip the files. Just running racluster() with the default parameters will usually reduce a particular file by 50-70%. I took yesterday's data from one of my small workgroups, clustered it and compressed it, and got these listings:
thoth:tmp carter$ ls -lag data*
-rw-r--r-- 1 wheel 93096940 Aug 28 10:30 data
-rw-r--r-- 1 wheel 12534420 Aug 28 10:34 data.clustered
-rw-r--r-- 1 wheel 2781879 Aug 28 10:30 data.clustered.gz
So, from 93 MB to under 3 MB is pretty good. Reading these gzip'd files performs pretty well, but if you are going to process them repeatedly, then delaying compression for a few days is the norm.
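The reduction at each stage can be worked out directly from the three file sizes in the listing. The short Python calculation below uses only those byte counts; the helper name reduction is just for illustration.

```python
# Byte counts taken from the ls listing above.
raw, clustered, gzipped = 93096940, 12534420, 2781879


def reduction(before, after):
    """Percentage by which `after` is smaller than `before`."""
    return 100.0 * (1 - after / before)


print(f"racluster alone:   {reduction(raw, clustered):.1f}% smaller")
print(f"gzip on clustered: {reduction(clustered, gzipped):.1f}% smaller")
print(f"overall:           {reduction(raw, gzipped):.1f}% smaller")
print(f"overall ratio:     {raw / gzipped:.1f}:1")
```

For this particular file, clustering alone does better than the usual 50-70%, and the combined result is about 97% smaller than the raw data.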
Because searching hundreds of GB of primitive data is not very gratifying if you're looking for speed, almost all big sites process the data as it comes in to generate "derived views" that are their first glance tables, charts and information systems. After creating these "derived views" some sites toss the primitive data (the data from the probes). For billing or quota system verification, most sites generate the daily reports, retain the aggregated/processed argus records, and throw away the primitive data. I've seen methods that toss, literally, 99.8% of the data within the first 24 hours, and still retain enough to do a good job on security awareness.
There was a time when scanning traffic generated most of the flow data (> 40%). That has shifted in the last 3-4 years, but we have filters that can very quickly remove data to your dark address space and split it to other directories. Some sites use the data, many sites toss it.
Some sites want to track their IP address space, because they have found that that is important to them, some want to retain flow records only for "the dorms". The argus-clients package has programs to help facilitate all of this, but you need to figure out what will work for you.
Some discussions
rabins -M nomodify
(Argus-info Digest, Vol 32, Issue 5)
Rabins() could increase the reported flow count - why ?
In Nick's example, rabins() is reporting the number of flows during a specific 10m period. Flows that cross hard time boundaries do exist in multiple time periods. So if the question is:
"what is the flow count in any given time period"
splitting flows across time boundaries is the correct thing to do. There are lots of reasons to ask this question. How many concurrent flows are on the wire, and how that changes over time, is a very sensitive test for lots of network phenomena. I like to track the natural beacons on the wire (network management stations pinging around, switches doing end system inventory by arping every IP address in the subnet every 300 secs), and it's easy using this type of stat, as computers are pretty good clocks.
I suspect that either 10 flows cross the 10 minute boundaries, in this run, or there is a single flow that spans 10 time boundaries, or something in between.
However, if the question is instead:
"what is the flow initiation count in any given time period"
Well, then you only want a single flow counted once, in the time bin when it started. To do this you don't want to modify/split the flow records, so use this option:
rabins -M nomodify
And it will only tally the flows that start in that time period.
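The difference between the two questions can be sketched with a small Python model. This is not how rabins() is implemented; flows are reduced to hypothetical (start, end) pairs in seconds, the function name flows_per_bin is an assumption, and a flow ending exactly on a boundary is counted in the following bin as a simplification.

```python
from collections import Counter

BIN = 600  # 10-minute bins, in seconds


def flows_per_bin(flows, split=True, bin_size=BIN):
    """Count flows per time bin.

    split=True  counts a flow in every bin it overlaps (split at boundaries).
    split=False counts a flow only in the bin where it starts (like nomodify).
    """
    counts = Counter()
    for start, end in flows:
        first = int(start // bin_size)
        last = int(end // bin_size) if split else first
        for index in range(first, last + 1):
            counts[index * bin_size] += 1
    return dict(counts)


# Hypothetical flows: one short, one crossing a single boundary, one long.
flows = [(30, 90), (550, 650), (100, 1900)]

print(flows_per_bin(flows, split=True))   # flows present in each bin
print(flows_per_bin(flows, split=False))  # flows initiated in each bin
```

With splitting, the three input flows produce seven per-bin entries, which is the kind of increase in reported flow count discussed above; without it, the total stays at three.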
A statistical note:
The time boundary, relative to the actual real time of the flows, can be random. Flows, for the most part, don't really care that it's 2pm somewhere, so the time "bins" that we create start at a random time relative to flow records. But this is not always the case. Actually, if a human is not behind the generation of flows, then the flows are being created by a clock, because computers are really just digital alarm clocks in an abstract way.
This is called "flow clocking".
This is important in understanding the phenomenon that you point out. If you lengthen the time size of the bins from 10m to say 1h, the probability of a flow crossing a time boundary will go down in a probabilistic manner if flows start randomly. But if your flows are sync'd by a clock of some kind (like cron launching a job), then you will find that as the bin gets longer, all of a sudden, it goes from ( > N ) to ( == N).
Carter
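The statistical note above can be illustrated with a small Python sketch. It rests on two assumptions that are not from the original post: for a randomly started flow of duration d and a bin size B, the chance of crossing a boundary is taken as min(1, d/B) (uniformly distributed start time), and the "clocked" flows are a hypothetical job that starts 9m55s past every hour and runs for 10 seconds.

```python
def crosses_boundary(start, duration, bin_size):
    """True if the flow's start and end fall into different time bins."""
    return int(start // bin_size) != int((start + duration) // bin_size)


def random_start_crossing_probability(duration, bin_size):
    """Crossing probability when the start time is uniform within a bin."""
    return min(1.0, duration / bin_size)


# Randomly started 10-second flows: the probability shrinks smoothly.
print(random_start_crossing_probability(10, 600))   # 10m bins
print(random_start_crossing_probability(10, 3600))  # 1h bins

# Clocked flows: N = 24 flows, one per hour, each starting at hh:09:55.
clocked = [(hour * 3600 + 595, 10) for hour in range(24)]
crossing_10m = sum(crosses_boundary(s, d, 600) for s, d in clocked)
crossing_1h = sum(crosses_boundary(s, d, 3600) for s, d in clocked)
print(crossing_10m, crossing_1h)
```

In this model every clocked flow is split with 10m bins, so the reported count is well above N, while none is split with 1h bins and the count drops to exactly N.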
What is jitter?
Jitter is the variance of the interpacket arrival times for a given flow/link/virtual circuit/whatever. The network has the ability to 'shape' traffic, either because it's 'rate limiting', i.e. 'slow', or because it is busy and possibly introducing queuing artifacts, or it's congested and it drops packets, or packets are taking differing paths in the network and arriving in different burst patterns or in a different order. With your 'rate limited' link, you will get all kinds of jitter issues, depending on the size of the packets and the contention for the single link. Jitter is normally of interest for voice and video traffic, which have pretty stringent jitter tolerances if they are real time, but there are a lot of high performance applications, like remote file sharing, where jitter impacts peak performance.
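Following the definition above, jitter can be computed from a list of packet arrival times. The Python sketch below uses the standard statistics module; the timestamps (in milliseconds) and the function name interarrival_jitter are hypothetical, and this is not necessarily the exact estimator Argus uses for its own jitter fields.

```python
import statistics


def interarrival_jitter(timestamps):
    """Return (mean gap, variance of gaps, std deviation of gaps)."""
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    return statistics.mean(gaps), statistics.pvariance(gaps), statistics.pstdev(gaps)


# Hypothetical arrival times in milliseconds for two flows with the same
# average packet rate.
steady = [0, 20, 40, 60, 80]   # perfectly paced
bursty = [0, 10, 50, 60, 80]   # same mean gap, uneven pacing

print(interarrival_jitter(steady))
print(interarrival_jitter(bursty))
```

Both flows have the same mean gap, so only the variance distinguishes the evenly paced flow from the one whose packets have been bunched up in transit.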
An argus filter is not a ra* filter
(Argus-info Digest, Vol 19, Issue 1)
Russel Fulton said: Hmmm... trap for the unwary :) The filter "tcp and dst port 80" means something rather different to ra and argus! It took me about half an hour to figure out why argus was seeing traffic in just one direction after I applied this filter. I've got so used to using filters with ra, where the filter applies to flows, that I simply assumed that argus filters would behave the same. They don't; they behave just like tcpdump filters (i.e. they are packet filters).
Carter's reply: You are absolutely correct!!! The input filter for argus is a libpcap filter, and so it's a packet filter. All other filters are flow filters.
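The one-direction effect Russel describes can be reproduced with a toy model in Python. This is a sketch only: the packet tuples, the build_flow helper, and the dictionary layout are assumptions, and neither libpcap nor Argus is involved. The same "dst port 80" test is applied once to packets before flow assembly and once to the finished flow record.

```python
# Hypothetical packets of one HTTP connection: (src, sport, dst, dport, nbytes)
packets = [
    ("172.17.132.81", 1415, "172.17.130.2", 80, 60),    # client -> server
    ("172.17.130.2", 80, "172.17.132.81", 1415, 1500),  # server -> client
    ("172.17.132.81", 1415, "172.17.130.2", 80, 52),    # client -> server
]


def dst_port_80(pkt):
    """Packet-level test: is this single packet addressed to port 80?"""
    return pkt[3] == 80


def build_flow(pkts):
    """Assemble packets into one bidirectional record, keyed on the first packet."""
    if not pkts:
        return None
    src, sport, dst, dport, _ = pkts[0]
    flow = {"saddr": src, "sport": sport, "daddr": dst, "dport": dport,
            "spkts": 0, "dpkts": 0, "sbytes": 0, "dbytes": 0}
    for p_src, p_sport, _, _, nbytes in pkts:
        if (p_src, p_sport) == (src, sport):
            flow["spkts"] += 1
            flow["sbytes"] += nbytes
        else:
            flow["dpkts"] += 1
            flow["dbytes"] += nbytes
    return flow


# Packet filter applied before assembly: replies never reach the flow.
filtered_early = build_flow([p for p in packets if dst_port_80(p)])

# Flow filter applied after assembly: the whole record is kept or dropped.
full = build_flow(packets)
filtered_late = full if full["dport"] == 80 else None

print(filtered_early)
print(filtered_late)
```

The early-filtered record has no destination packets at all, while the late-filtered record keeps both directions of the conversation.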
racluster -M rmon: how to use it?
(post on gmane 2007-02-22)
Wolfgang Barth: I want to plot inbound/outbound traffic - not src/dst traffic. I'm using something like this:
racluster -M rmon -r argus.log - \
srcid elibridge_dmz and src host 172.17.132.81 \
and dst host 172.17.130.2 and tcp dst port 80 and tcp src port 1415
The output is:
2007-02-21 08:15:27.658658 tcp 172.17.132.81.1415 -> 172.17.130.2.www \
9 13 1055 11936 FIN
2007-02-21 08:15:27.658658 tcp 172.17.130.2.www -> 172.17.132.81.1415 \
13 9 11936 1055 FIN
The flow is duplicated. Okay, if RMON works this way, how can I filter out inbound and outbound traffic?
Carter: The "-M rmon" option works by duplicating the record, reversing all the fields and then merging. So specify the object:
"-m smac" - by interface
"-m svlan" - by vlan
"-m smpls" - by mpls label
"-m saddr" - by IP address
"-m proto sport" - by port
"-m smac sdsb" - by diffserv label
So if you want to aggregate based on interface, then use:
-M rmon -m smac
This will give you in/out stats based on interface. When you print/graph the records, you will want to print the smac field, so the full command:
racluster -M rmon -m smac -s smac spkts dpkts sbytes dbytes
should give you what you're looking for.
rabins() is really the combination of rasplit() and racluster()
("5 minute load": post on gmane 2009-04-19)
Carter: rabins() is really the combination of rasplit() and racluster(), where rasplit() divides data into the time slots, and each slot has its own racluster() context. At the end of the run, rabins() should print out the results of all the clusters that it has in its cache.
If you were to do something like:
rasplit -M hard time 5m -r 01.gz -w /tmp/test/argus.%Y.%m.%d.%H.%M.%S
and then:
for i in /tmp/test/argus*; do echo "$i"; racluster -r "$i" -m srcid -s stime bytes load; done
you should get data similar to what rabins() is trying to do. [assuming use of bash()]
I would like to print all the SYN/ACK occurrences
... to detect possible SYN flood attacks (2010-07-16); actually, I can use the -Zs flag in ra to get the result I was looking for
(Carter's response to Riccardo, Argus-info Digest, Vol 59, Issue 15)
You can specify these types of filters to get records that have specific tcp flag settings:
ra -r file - syn and not ack
ra -r file - src syn and dst ack
ra -r file - src push and dst urg
The keyword synack is a special-case argus status indication that matches when the source sent the syn and the destination sent an ack.
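The direction-qualified filters can be pictured as set tests on the flags each side of a flow was seen to send. The Python sketch below is a simplified model, not ra's filter engine: the record layout, the sample records, and the helper name matches are assumptions, and only the "src ... and dst ..." forms are modelled.

```python
def matches(record, src=(), dst=()):
    """True if the source sent all flags in `src` and the destination all in `dst`."""
    return set(src) <= record["src_flags"] and set(dst) <= record["dst_flags"]


# Hypothetical flow records with the TCP flags observed from each side.
records = [
    {"id": 1, "src_flags": {"syn", "ack", "fin"}, "dst_flags": {"syn", "ack", "fin"}},
    {"id": 2, "src_flags": {"syn"}, "dst_flags": set()},
    {"id": 3, "src_flags": {"syn", "push", "ack"}, "dst_flags": {"syn", "ack", "urg"}},
]

# Like "src syn and dst ack", which is how the text describes synack.
src_syn_dst_ack = [r["id"] for r in records if matches(r, src=["syn"], dst=["ack"])]

# Like "src push and dst urg".
src_push_dst_urg = [r["id"] for r in records if matches(r, src=["push"], dst=["urg"])]

# SYNs that never got any reply: the records of interest for SYN flood hunting.
unanswered = [r["id"] for r in records if matches(r, src=["syn"]) and not r["dst_flags"]]

print(src_syn_dst_ack, src_push_dst_urg, unanswered)
```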