Argus

From NSMWiki


Introduction

Argus is an open source layer 2+ auditing tool (including IP audit) written by Carter Bullard which has been under development for over 10 years. This wiki is an attempt to fill out the documentation for the Argus system.

The `argus` and `argus-clients` distributions contain man pages which describe the various programs and their command line switches (also available on the web); what is missing is tutorial-type material describing how one might use Argus day to day. The material here should be used in conjunction with the man pages and the information on `argus` and the various clients on the Argus site.

For more information, refer to the Argus Server and Clients section below.

What can Argus and Argus data do for you

Argus can be used to help support network security management and network forensics.

Argus can easily be adapted to serve as a network activity monitoring system, answering a variety of activity questions (such as bandwidth utilization). It can also be used to track network performance through the stack and to capture higher-level protocol data.
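
For instance, a per-host bandwidth summary can be produced by aggregating stored records and sorting the result; this is a sketch, and the capture file name `daily.argus` is a hypothetical example:

```shell
# Aggregate flows by source address, report packet and byte totals per
# host, then sort the hosts by total byte count (ascending by default).
racluster -r daily.argus -m saddr -s saddr pkts bytes -w - | \
  rasort -r - -m bytes -s saddr pkts bytes
```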

With additional mining techniques (such as utilizing moving averages), Argus data can be used for "spike tracking" of many fields.

With the correct strategies, argus data can be mined to determine whether you were attacked or compromised in the past, after an attack has been announced and indicators of compromise (IOCs) have been established. Carter has worked on mining argus data for IOCs indicative of an APT1 compromise and, soon, for a Heartbleed compromise.

Who uses Argus

Many universities, corporations, and government entities use Argus to record both internal traffic flows and flows entering and leaving their network(s). These records are used in both immediate network utilization analysis, and historical analysis or trending. With a sensor network using Argus, organizations may validate the connectivity of end-hosts through multiple routers. If routers A, B, and C are passing traffic for hosts Y and Z, Argus may be used to determine latency and other problems between routers B and C (which may not be apparent in packet captures).

Historical flow data can be used in forensic investigations months, or even years, after an incident has taken place. Argus' flow records offer up to a 10,000:1 reduction from packet data to the record written to disk, which allows installations to retain records far longer than full packet captures. When network security matters, non-repudiation becomes an important requirement that must be provided throughout the network, and Argus provides the basic data needed to establish a centralized network activity audit system. Done properly, such a system can account for all network activity in and out of an enclave, providing the basis for assuring that someone can't deny having done something in the network.

Network research labs have used Argus to provide network performance measurements of unique protocols, such as InfiniBand over IPv6. Argus can be quickly adapted to new protocols and, in some cases, provides the basic metrics without extension. Individuals use Argus on their home networks to keep an eye on DSL- and cable-modem-based connections. Argus provides a higher-order view into packet data that allows a network user to see problems quickly.

Argus Server and Clients

The Server (`argus`):

Consisting of the `argus` binary, the server retrieves packets received by one or more of the network interfaces available on a machine. `argus` then assembles these packets into binary records that represent network flows. `argus` can write this binary data to disk and/or to a network socket, and can also write pcap files of received packets.
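
A sketch of the two most common output modes; the interface name and output path are assumptions for illustration:

```shell
# Read packets from eth0 and write binary flow records to a file.
argus -i eth0 -w /var/log/argus/flows.out
# Alternatively, serve flow records to clients on TCP port 561.
argus -i eth0 -P 561
```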

The Clients (`argus-clients`):

The clients consist of many binaries and scripts distributed within the `argus-clients` package. Their purpose is to read flow data in any of the following ways: from files of binary flow data produced by `argus` or an `argus-client` (see `-r`), directly from an `argus` probe via a network socket (see `-S`), and/or through an unnamed pipe (like stdin) fed by the stdout of another `argus-client` (see `-w -` and `-r -`). Most of the core clients are written in C with high performance in mind, while a few scripts are written in Perl.

It is important to realize that these are examples, which means that you can extend them to create new clients.
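
The three read modes above can be sketched as follows (the file path is a hypothetical example):

```shell
# 1) Read from a file of stored binary flow records:
ra -r /var/log/argus/flows.out
# 2) Read directly from a running argus probe over the network:
ra -S 127.0.0.1:561
# 3) Read from stdin, fed by another client's stdout:
ra -S 127.0.0.1:561 -w - | racount -r -
```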

Processing type classifications

The "processing type" classifications are loosely defined, meaning you can likely apply some of the other categories to most clients. This classification of clients has not been endorsed or created by Carter; it was manufactured in an effort to simplify understanding of the many Argus clients.

Note that all "live stream" and "buffered stream" clients can also process "idle data." When processing idle data with a "buffered stream" client, do not use the `-B` option.

Processing Types
Live stream: Used to process live data in real time (stream processing). See `-S`, `-w -` and `-r -` in the ra() man page.
Buffered stream: Holds live stream data in a buffer and outputs it when the buffer timespan expires. This is useful for aggregation of data, for example. See `-B` in each client's man page.
Idle data: Mostly used to process data that has already been stored. See `-w -` and `-r -`.
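
A sketch of the three types with ra-family clients; the probe address, hold time, and file name are assumptions:

```shell
# Live stream: print records as they arrive from the probe.
ra -S 127.0.0.1:561
# Buffered stream: hold records for 15 seconds so late-arriving data
# can be merged into hard 5-second bins before output.
rabins -S 127.0.0.1:561 -B 15s -M 5s hard
# Idle data: process records already stored on disk (no -B needed).
racluster -r flows.out -m saddr
```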

List of clients

Argus Core Clients
client (processing type): description
ra (live stream): Reads argus output streams, either through a socket, a piped stream, or from a file; filters them; and optionally writes the output to a file, writes the binary records to its stdout, or prints the records to stdout in ASCII.
rabins (buffered stream): Time-based bin processor. Takes in an argus stream, aligns it to a time array, holds it for a hold period, and then outputs the bin contents as an argus stream. This is the basis for all stream block processors, and is used by ragraph() to structure data into graphing regions.
racluster (idle data): Command-line aggregation. Also see the racluster.conf man page.
racount (idle data): Tallies statistics about argus records.
radium (live stream): The argus record distribution node. Acts just like an ra* program, supporting all the options and functions of ra(), while providing access to data like argus does, supporting remote filtering and MAR record generation. This is an important workhorse for the argus architecture. Also see the radium.conf man page.
ranonymize (idle data): Anonymizes fields in argus records. Also see the ranonymize.conf man page.
rasort (idle data): Sorts argus records based on various fields.
rasplit (buffered stream): Reads argus data from an argus-data source and splits the output into consecutive sections of records based on size, count, time, or flow event, writing the output into a set of output files.


Argus Client Examples
client (processing type): description
raconvert (idle data): Converts ra() ASCII output back into argus binary records. Written in C.
radark (idle data): Reports on dark address space accesses. The technique uses racluster to identify the current dark address space, using "no response" indications and specific ICMP unreachable events, and then uses the list of dark address 'accessors' to generate a scanners list. Written in Perl.
radecode (live stream): Uses tshark to decode user data (see `-s` fields `suser` and `duser`). Written in Perl.
radump (live stream): Prints user data (see `-s` fields `suser` and `duser`) using various protocol printers. Written in C.
raevent (live stream): Event client skeleton for ra* programs. Add application-specific code, stir and enjoy. Written in C.
rafilteraddr (live stream): Filters records based on an address list, bypassing the standard filter compiler. Written in C.
ragraph (buffered stream): Uses rabins() and rrdtool to generate PNG-formatted graphs of argus data. Written in Perl.
ragrep (live stream): grep() implementation for searching argus user data. Written in C.
rahisto (idle data): Produces a histogram of the given data. Written in C.
rahosts (live stream): ra()-based host use report. Written in Perl.
ralabel (live stream): Adds descriptor labels to flows; this particular labeler adds descriptors based on addresses. Also see the ralabel.conf man page. Written in Perl.
rapolicy (idle data): Matches input argus records against a Cisco access control policy. Written in C.
raports (idle data): ra()-based host port use report. Written in Perl.
rarpwatch (live stream): IPv4 and IPv6 arpwatch. Written in C.
raservices (live stream): Discovers and validates network services using a byte-pattern definition produced by rauserdata() (also see ../argus-clients*/support/Config/std.sig). Written in C.
rasql (idle data): Reads binary data from the BLOB records optionally produced by rasqlinsert (these contain the entire binary flow record). Written in C.
rasqlinsert (live stream): Inserts flow metadata into databases using ratop's raclient.c-based record processing engine. Written in C.
rasqltimeindex (live stream): Reads argus data and builds a time index suitable for inserting into a database schema. Written in C.
rapath (idle data): Prints derivable path information from argus data. The strategy is to take in 'icmpmap' data and formulate path information for the collection of records received. By classifying all the flow data by the tuple {src, dst}, it can track any number of simultaneous traceroutes and report the results in a manner that preserves the granularity of the data seen, while providing the means to modify that granularity to get interesting results. Written in C.
rastream (buffered stream): Splits the output into consecutive sections of records based on size, count, time, or flow event, writing the output into a set of output files. Optionally, rastream() will run a program against each output file N seconds after the file is closed (which occurs after all data has arrived for the specified timespan). Written in C.
rastrip (live stream): Removes fields from argus records. Written in C.
ratemplate (live stream): Template for ra* client programs. Add application-specific code, stir and enjoy. Written in C.
ratimerange (live stream): Prints the time range for the data seen. Written in C.
rauserdata (idle data): Reads argus data and produces a byte-pattern file for use with raservices(). Written in C.
ratop (live stream): Curses-based argus UI modeled after the top program. Written in C.
routers (idle data): Modification of raports(). Written in Perl.
nvargus (idle data?): Tests the nvOS interfaces to ra* programs. Written in C.

Getting Started

Download the packages

There are two sets of packages, stable and development, retrievable via HTTP and FTP. The HTTP repositories offer both the stable and development trees; the FTP repositories are hosted at qosient.com.

Carter usually uses the naming convention `argus-latest.tar.gz` and `argus-clients-latest.tar.gz` for the latest released versions, but all previous versions are listed.

Install the packages

`argus` and `argus-clients` require the following packages to build:

gcc make bison libpcap libpcap-devel readline-devel flex

Some additional features (such as sql, geoip, graphing, and sasl support) require the following libraries:

mysql mysql-devel mysql-libs openssl-devel cyrus-sasl ncurses-devel rrdtool rrdtool-perl geoip-devel geoip perl-Geo-IP

Once the dependencies are installed, perform the following build process:

cd
tar zxvf argus-latest.tar.gz
cd argus-*
./configure
make && make install
cd
tar zxvf argus-clients-latest.tar.gz
cd argus-clients-*
./configure
make && make install

Build process

Note that if you perform a `./configure` without the additional libraries installed, then the script will not configure the build to compile the portions of code that are dependent on the missing libraries.

For example, if you run `./configure && make && make install` without the mysql, mysql-devel, and mysql-libs packages installed, the build process will not compile the rasql* clients (and no real error will be logged). To build these clients after you have already compiled (and installed the resulting binaries into your /sbin/ and /bin/ by way of `make install`), perform the following:

cd
cd argus-clients-*
make clean
./configure
make && make install

`make clean` removes the artifacts left behind by the previous `make`, and the following `./configure` regenerates the build files with references to the required libraries you have now installed.

Set up the `argus` probe

To quickly set up `argus` to bind to an interface and serve generated binary data about flows on port 561, perform the following:

cp ./argus-*/support/Config/argus.conf /etc/argus.conf
argus -d -i eth0 -P 561

Review both the argus.conf man page and the `argus` man page to understand the available options; you will get real value out of Argus only after reviewing them.

Set up clients

To quickly set up any of the `argus-clients`, perform the following:

cp ./argus-clients*/support/Config/rarc ~/.rarc

Review the rarc man page to understand usage.

Want to see flow data?

Try `ra`, the most fundamental core client:

ra -S 127.0.0.1:561

Review the ra man page to understand usage.

Want to save flow data?

This can be a bit more complex: depending on what you want to do, you may use a variety of clients to write binary files. What about file rotation? Do you want to write binary data to files, or metadata to an SQL database? Wouldn't it be useful to write information aggregated over a time span to files?
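
To make those questions concrete, here are hedged sketches of two common approaches; the probe address, paths, the 5-minute rotation interval, and rasplit's strftime-style output naming are assumptions here:

```shell
# Write binary records to time-rotated files, one file per 5 minutes;
# the %-escapes in the path are expanded per file-start time.
rasplit -S 127.0.0.1:561 -M time 5m -w "/var/log/argus/%Y/%m/%d/argus.%H.%M.%S"
# Or aggregate into hard 5-second bins first, then store the condensed stream.
rabins -S 127.0.0.1:561 -B 15s -M 5s hard -w /var/log/argus/binned.out
```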

What to do next

You know what you want and why you're here, but you have no idea how to get it.

Read the documentation

It is of utmost importance that you read the provided man pages for both the `argus` and the `argus-clients`.

The majority of questions are answered in TFM (the ... manual(s)).

The man pages are distributed with the `argus` and `argus-clients` packages, accessible via the `man` command after install.

The man pages are also available on QoSient's web site in PDF form.

Review examples

Beyond the examples written below, you should also refer to examples which Carter has made available on the QoSient site.

Refer to the mailing list archives:

A FAQ/wiki is always useful for gaining a fundamental understanding, but five-plus years of questions and banter are great when you are ready for more.

Gmane is used to archive the argus-info mailing list.

The Gmane web UI offers several renderings of the mailing list messages, including a threaded view, a blog-like view, a few RSS feeds, and even an NNTP interface.

Finding help:

Currently, the best way to communicate with other argus users, and with Carter himself, is via the argus-info mailing list. For real-time communication, there is also a currently underused IRC channel on freenode, #argus.

Quick tips

  • Please be prepared to perform thorough testing if recommended.
  • Please have a rudimentary understanding of *nix (such as how to list, (de)compress, move, edit, copy and generally manage files, how to find which port a local process is listening on, and how to use unnamed pipes).
  • Please do not make assumptions about the software, instead aim to state your problem exactly (segmentation faults are not "crashes"!), and be thorough (but not excessively so).
  • Please avoid "this doesn't work" type questions. Or if you do have a "this doesn't work" type problem, try to extract what doesn't work and be more exact with your message.
  • Please be nice. This includes not being rude, not being aggressive, being patient, reading responses thoroughly and responding with full detail.
  • Please be patient. Carter is a busy person, and the rest of the community is just that, your fellow users and nothing more!
  • Please be patient. Was this mentioned?


You must register for the mailing list in order to send Email.

  1. Access the mailing list administrative site, which is hosted by Carnegie Mellon University.
  2. Fill out the information below "Subscribing to Argus-info", and click "Subscribe". [If you are within the CMU network, you should be versed in mailing list subscription using the internal auth systems]
  3. To later edit your subscription settings (for instance, to not receive all messages or to adjust digest receipt), supply the subscribed email address under "Argus-info Subscribers" and click "Unsubscribe or edit options."

Note that you can utilize an RSS reader with the Gmane RSS feeds mentioned above. This includes the use of FeedBurner to "burn" those feeds for email delivery of the feed contents, an option similar to digest delivery.

Getting started with the clients

Client relations

All `argus-clients` share a large number of arguments/options with each other; this is essential to understanding `argus-client` command invocations. A majority of `argus-clients` take a core set of arguments from `ra` (review the man page), but have additional options described in their own man pages.

A few other clients can be said to "utilize" the main functions of other clients. For example, this is the case with `rabins` and `ragraph`, which utilize some options from `racluster` and `rasplit`.
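
For example, a single rabins() invocation can combine racluster-style aggregation (`-m`) with time binning (`-M`); the file name is a hypothetical example:

```shell
# Aggregate by source address within hard 10-second bins,
# reporting per-host packet and byte counts for each bin.
rabins -r hourly.argus -M 10s hard -m saddr -s stime saddr pkts bytes
```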

The pipeline

In the Want to see flow data? section above, we used ra() to attach to the TCP socket offered by the `argus` process at 127.0.0.1:561.

Each of the `argus-clients` has the ability to read and write to and from a file or an unnamed pipe (and even a few more things that are covered in the man pages). It is very common to see multiple `argus-clients` chained in a pipeline. This allows clients such as ralabel() to read from a source `argus` via a TCP socket, perform a labeling strategy according to a configured ralabel.conf, write output to the stdout unnamed pipe, then have an rastream() instance read from the stdin unnamed pipe and perform its configurable function.
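
A sketch of that ralabel-into-rastream pipeline; the probe address, configuration path, rotation interval, and the `-f` flag naming the ralabel configuration are assumptions:

```shell
# Label flows from a live probe, then hand them over stdout/stdin to
# rastream for time-based file rotation.
ralabel -S 127.0.0.1:561 -f /etc/ralabel.conf -w - | \
  rastream -r - -M time 5m -w "/var/log/argus/%Y/%m/%d/argus.%H.%M.%S"
```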

Here is an example flow chart showing some possible pipeline arrangements.

Please review the differences between `argus` and `radium` as they are major.

Introductory Client Usage

This section will contain introductory client usage examples, answering important questions such as how and why you would want to use a specific client in a specific way.

It will also contain a recommended path for new users that will help expand their knowledge. Some entries may seem redundant, but due to the reference nature of this page, they are purposefully so.

Warning: these examples are written to be introductory for new, not yet experienced users, so statements such as "[client] is used to (do X)" are limiting and could even be considered incorrect (in the strictest sense, as they are limiting). For new users, keep in mind that the `argus-clients` are actually code files that contain functions, and these functions can be used by any `argus-client`. Carter has engineered Argus to be very flexible.

For example, this is why ra()'s man page contains so much information and its arguments are "re-used" by many of the other `argus-clients`: the functions used by ra() are called by all of the other clients. [Truly, you could say there are function libraries shared between all the clients, not that ra()'s code is a gateway to these functions.]

ra(): Introduction

ra() is used alone to read and display argus data, generally to stdout. ra() can also read from and write to a variety of other things (like stdin/stdout). See the man page for more information.

ra(): Example

Mentioned previously in the Want to see flow data? section.

ra -S 127.0.0.1


ra(): What to do next
  1. Take a look at the pipeline section.
  2. Take a look at the rarc man page and the ~/.rarc file that you should have after properly configuring `argus-clients`, and refer to the RA_FIELD_SPECIFIER value.
  3. Take a look at the ra man page and review some very interesting options such as `-s`, `-N`, `-w`, `-r`.
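
A sketch combining those options; the file and output names are hypothetical examples:

```shell
# Read stored data, print only selected fields for the first 20 records,
# and also write the matching binary records out for later re-processing.
ra -r hourly.argus -N 20 -s stime saddr sport daddr dport pkts \
   -w tcp-sample.out - tcp
```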

....

rabins(): Introduction

rabins() is used to read and display or pass argus data that falls into bins (see Grouped data) of time, size, or count.

rabins(): Example

This example reads argus flow data from a file [`-r hourly.argus`], then returns `stime`, `ltime`, and `rate` [`-s stime ltime rate`] for everything that happened within each 1-second bin [`-M 1s`] (and only that second [`hard`]), consolidated by `srcid` [`-m srcid`] (generally a `srcid` identifies a single argus probe; each probe should have a unique `srcid`).

You might have already guessed what this does: reports packets per second (`rate`) per argus probe.

rabins -r hourly.argus -s stime ltime rate -M 1s hard -m srcid

....

rasort(): Introduction

rasort() is used to read, sort, then display or pass argus data by field. You can use rasort() in a similar fashion to ra().

rasort(): Example

This example utilizes the concept of a pipeline and builds on the previous rabins() example. rabins() reads all the argus data files in the current directory [`-r *`], takes all the data that happened within each 1-second bin [`-M 1s`] (and only that second [`hard`]), consolidated by `srcid`, and writes binary argus data to the unnamed pipe stdout [`-w -`]. rasort() then reads the binary argus data from the unnamed pipe stdin [`-r -`], sorts the records by `rate` [`-m rate`], and displays stime, ltime, and rate on stdout in human-readable form.

You may have already guessed what this does: reports an ascending sorted list of each second's packets per second (`rate`).

rabins -r * -M 1s hard -m srcid -w - | rasort -r - -m rate -s stime ltime rate | less

Calculated Fields

Warning: Section Needs Work
This section needs specific attention to maintain, expand, correct, or improve.

There are many fields available for mining within Argus flow records. For more information, refer to `-s` in the ra() man page and RA_FIELD_SPECIFIER within the ~/.rarc file that exists if you set up `argus-clients` properly. The code for the calculations is located in ../argus-clients*/common/argus_client.c (beginning around line 11548).
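
Note that `-s` accepts relative modifiers: a `+` prefix appends a field to the default column set and a `-` prefix removes one. A quick sketch, with a hypothetical file name:

```shell
# Append duration and rate to the default columns, drop the bytes column.
ra -r hourly.argus -s "+dur +rate -bytes"
```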

Argus Record Fields
type | field | description | formula (calculated fields only)
Regular fields:
srcid argus source identifier.
stime record start time
ltime record last time.
flgs flow state flags seen in transaction.
seq argus sequence number.
smac, dmac source or destination MAC addr.
soui, doui oui portion of the source or destination MAC addr
saddr, daddr source or destination IP addr.
proto transaction protocol.
sport, dport source or destination port number.
stos, dtos source or destination TOS byte value.
sdsb, ddsb source or destination diff serve byte value.
sco, dco source or destination IP address country code.
sttl, dttl src -> dst (sttl) or dst -> src (dttl) TTL value.
sipid, dipid source or destination IP identifier.
smpls, dmpls source or destination MPLS identifier.
spkts, dpkts src -> dst (spkts) or dst -> src (dpkts) packet count.
sbytes, dbytes src -> dst (sbytes) or dst -> src (dbytes) transaction bytes.
sappbytes, dappbytes src -> dst (sappbytes) or dst -> src (dappbytes) application bytes.
sload, dload source or destination bits per second.
sloss, dloss source or destination pkts retransmitted or dropped.
sgap, dgap source or destination bytes missing in the data stream. Available after argus-3.0.4
dir direction of transaction
sintpkt, dintpkt source or destination interpacket arrival time (mSec). See argus.conf:ARGUS_GENERATE_RESPONSE_TIME_DATA (`-R`), ARGUS_GENERATE_TCP_PERF_METRIC.
sintdist, dintdist source or destination interpacket arrival time distribution. See argus.conf:ARGUS_GENERATE_RESPONSE_TIME_DATA (`-R`), ARGUS_GENERATE_TCP_PERF_METRIC.
sintpktact, dintpktact source or destination active interpacket arrival time (mSec). See argus.conf:ARGUS_GENERATE_RESPONSE_TIME_DATA (`-R`), ARGUS_GENERATE_TCP_PERF_METRIC.
sintdistact, dintdistact source or destination active interpacket arrival time (mSec). See argus.conf:ARGUS_GENERATE_RESPONSE_TIME_DATA (`-R`), ARGUS_GENERATE_TCP_PERF_METRIC.
sintpktidl, dintpktidl source or destination idle interpacket arrival time (mSec). See argus.conf:ARGUS_GENERATE_RESPONSE_TIME_DATA (`-R`), ARGUS_GENERATE_TCP_PERF_METRIC.
sintdistidl, dintdistidl source or destination idle interpacket arrival time (mSec). See argus.conf:ARGUS_GENERATE_RESPONSE_TIME_DATA (`-R`), ARGUS_GENERATE_TCP_PERF_METRIC.
sjit, djit source or destination jitter (mSec). See argus.conf:ARGUS_GENERATE_JITTER_DATA (`-J`).
sjitact, djitact source or destination active jitter (mSec). See argus.conf:ARGUS_GENERATE_JITTER_DATA (`-J`).
sjitidle, djitidle source or destination idle jitter (mSec). See argus.conf:ARGUS_GENERATE_JITTER_DATA (`-J`).
state transaction state
suser, duser source or destination user data buffer. See argus.conf:ARGUS_CAPTURE_DATA_LEN (`-U`)
swin, dwin source or destination TCP window advertisement.
svlan, dvlan source or destination VLAN identifier.
svid, dvid source or destination VLAN identifier.
svpri, dvpri source or destination VLAN priority.
srng, erng start or end time for the filter timerange.
stcpb, dtcpb source or destination TCP base sequence number
tcprtt TCP connection setup round-trip time, the sum of 'synack' and 'ackdat'.
synack TCP connection setup time, the time between the SYN and the SYN_ACK packets.
ackdat TCP connection setup time, the time between the SYN_ACK and the ACK packets.
tcpopt The TCP connection options seen at initiation. The tcpopt indicator consists of a fixed length field, that reports presence of any of the TCP options that argus tracks. See ra() man page section on `-s`.
inode ICMP intermediate node.
offset record byte offset infile or stream.
spktsz, dpktsz histogram for the source (spktsz) or destination (dpktsz) packet size distribution
smaxsz, dmaxsz maximum packet size for traffic transmitted by the source (smaxsz) or destination (dmaxsz).
sminsz, dminsz minimum packet size for traffic transmitted by the source or destination.
Calculated fields:
dur duration of a flow. Formula: ltime - stime
rate, srate, drate packets per second. Formula: pkts / (ltime - stime)
trans aggregation record count.
runtime total active flow run time. This value is generated through aggregation, and is the sum of the records duration.
mean average duration of aggregated records.
stddev standard deviation of aggregated duration times.
sum total accumulated durations of aggregated records.
min minimum duration of aggregated records.
max maximum duration of aggregated records.
pkts total transaction packet count.
bytes total transaction bytes.
appbytes total application bytes.
load bits per second.
loss pkts retransmitted or dropped.
ploss percent pkts retransmitted or dropped.
sploss, dploss percent source or destination pkts retransmitted or dropped.
abr ratio between sappbytes and dappbytes. Formula: (sappbytes - dappbytes) / (sappbytes + dappbytes)
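
As a quick sanity check of the abr formula, with hypothetical values sappbytes=300 and dappbytes=100:

```shell
# (300 - 100) / (300 + 100) = 0.50, i.e. traffic skews toward the source.
awk 'BEGIN { s = 300; d = 100; printf "%.2f\n", (s - d) / (s + d) }'
```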

Configuration Examples

This section covers examples of configuring some features of Argus that are not very straightforward and require external resources to complete (such as SASL and GeoIP labeling).

Using SASL with Argus

Argus supports the use of SASL for authentication and encryption. Configuring SASL can be an involved process, depending on which 'mechs' you wish to use.

Here are some instructions for configuring a RHEL 6 machine to use the DIGEST-MD5 mechanism, a happy medium between security and ease of setup. This does not cover any other argus dependencies; it assumes you can already successfully compile argus and argus-clients and wish to configure SASL to work with them.

RHEL 6

Install the appropriate packages:

   sudo yum install cyrus-sasl cyrus-sasl-devel cyrus-sasl-md5

Configure your argus clients and server:

   make clean; ./configure --with-sasl && make && make install

Create a set of credentials in the sasldb2 database for connecting clients to use:

   saslpasswd2 -c -a argus USERNAME

It is important to call out, as the SASL documentation does, that this password is stored IN PLAIN TEXT in /etc/sasldb2. You must protect this file accordingly.

Create a configuration file for the 'argus' service in /etc/sasl2/argus.conf:

   pwcheck_method: auxprop
   mech_list: DIGEST-MD5
   auxprop_plugin: sasldb

Your argus.conf (for server) and .rarc (for clients) files should already contain these directives:

   ARGUS_MIN_SSF=40
   ARGUS_MAX_SSF=128

That's it! Restart your argus server, and attempt to connect with a client such as ra:

   ra -S localhost -N 10

You'll notice that you are prompted for a "Username" twice; this is because SASL is requesting both a "user id" and an "authorization id". In this way, you may authenticate as one user but act as another (if permissions are set accordingly; this is beyond the scope of this quick how-to).

Authenticating may become tiresome for you as an analyst, and it is certainly an issue for clients that need to connect automatically (such as those started at boot). You may store the credentials in an ra.conf/.rarc file using these macros:

   RA_USER_AUTH="user/user"
   RA_AUTH_PASS="password"

Much like the use of sasldb2, it is paramount to protect these configuration directives from unauthorized access.

You may read more about Cyrus SASL in its documentation.

Stéphane Peters's Cheat sheet

List originally contributed by Stéphane Peters (v3).

Examples

ragrep example: Finding Palevo / Sality virus activity

As of v3.0.2, ragrep() is obsolete; the argus-clients-3.0.2 programs all allow you to grep using the `-e` option. Bash code:

   #!/bin/bash
   # File : ragrep-sality.sh
   s="solfire.aljosaborkovic.com"
   s="$s|kukutrustnet777.info"
   s="$s|www.kjwre.*fqwieluoi.info"
   s="$s|l33t.brand-clothes.net"
   s="$s|pica.banjalucke-ljepotice.ru"
   s="$s|maellisromance.com"
   s="$s|217.32.75.74"
   s="$s|pingaksh.com"
   s="$s|radio.irib.ir"
   s="$s|regal-mont.pondi.hr"
   s="$s|sandra.prichaonica.com"
   s="$s|sasgrowth.com"
   s="$s|snowboard619.w.interia.pl"
   s="$s|spargeunid.go.ro"
   s="$s|stakrix.st.funpic.de"
   s="$s|us516757.bizhostnet.com"
   s="$s|www.abassiehmunicipality.com"
   s="$s|www.polaris.ge"
   s="$s|www.railwayservices.be"
   s="$s|www.senaauto.ge"
   s="$s|ziyagokalpilkogretim72.meb.k12.tr"
   ra -s "+suser:50 -bytes" -e "$s" $* - udp port 53

It is really a one-liner like the following, split across several lines for easier editing.

   ra -s "+suser:50 -bytes" -e "solfire.aljosaborkovic.com|kukutrustnet777.info|www.kjwre.*fqwieluoi.info" -nr $file - udp port 53   

You need to use "ragrep" with previous versions of argus-clients (3.0.0, for example).

Knowing that the Palevo and Sality viruses try to connect to one of these sites, this script identifies the computers that have made such DNS requests and that are (with a high degree of probability) infected.

The resulting RE is an ORing of several strings and another RE (www.kjwre.*fqwieluoi.info) to catch a presumably random component. The script is launched like this:

 ragrep-sality.sh -nr $file 
 ragrep-sality.sh -nr $file -w /tmp/sality-traces.ra

Here is an output:

     StartTime    Flgs  Proto            SrcAddr  Sport   Dir            DstAddr  Dport  SrcPkts  DstPkts State                          srcUdata
   01/03 08:21  e         udp            1.0.4.1.44177    <->          100.0.1.1.53            1        1   CON s[40]=.............sandra.prichaonica.com.....
   01/03 08:21  e         udp            1.0.4.1.40419    <->          100.0.1.1.53            1        1   CON s[44]=.............solfire.aljosaborkovic.com.....
   01/03 08:21  e         udp            1.0.5.1.32200    <->          100.0.1.1.53            1        1   CON s[40]=.Y...........sandra.prichaonica.com.....
   01/03 08:22  e         udp            1.0.5.1.29661    <->          100.0.1.1.53            1        1   CON s[44]=.............solfire.aljosaborkovic.com.....
   01/03 08:29  e         udp            1.0.5.1.32554    <->          100.0.1.1.53            1        1   CON s[40]=.............sandra.prichaonica.com.....
   01/03 08:30  e         udp            1.0.5.1.44465    <->          100.0.1.1.53            1        1   CON s[44]=.............solfire.aljosaborkovic.com.....
   01/03 08:30  e         udp            1.0.4.1.29810    <->          100.0.1.1.53            1        1   CON s[40]=b............sandra.prichaonica.com.....
   01/03 08:31  e         udp            1.0.4.1.41186    <->          100.0.1.1.53            1        1   CON s[44]=yc...........solfire.aljosaborkovic.com.....
   ...
   01/03 10:27  *         udp            1.0.9.2.42875    <->          100.0.1.1.53            1        1   CON s[44]=e............solfire.aljosaborkovic.com.....
   01/03 10:42  e         udp           1.0.15.1.46746     ->          197.0.7.1.53            2        0   INT s[50]= O................V...........sandra.prichaonica.c
   01/03 10:42  e         udp           1.0.12.1.45079    <->          100.0.1.1.53            1        1   CON s[40]=.............sandra.prichaonica.com.....
   01/03 10:42  *         udp            1.0.9.3.31681    <->          100.0.1.1.53            1        1   CON s[40]=.............sandra.prichaonica.com.....
   01/03 10:42  e         udp           1.0.15.1.46746     ->          197.0.2.1.53            3        0   INT s[50]= O................V...........sandra.prichaonica.c
   01/03 10:42  e         udp           1.0.15.1.46746     ->          197.0.3.1.53            3        0   INT s[50]= O................V...........sandra.prichaonica.c
   01/03 10:42  e         udp           1.0.15.1.46746     ->          197.0.4.1.53            3        0   INT s[50]= O................V...........sandra.prichaonica.c
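The regex-assembly trick in the script can be tested on its own, with grep -E standing in for ra's "-e" option. This is a sketch; the query names piped in below are samples (the kjwre one is a hypothetical instance of the random-string pattern):

```shell
# Assemble the alternation the same way the wrapper script does, then match
# sample query names with grep -E (a stand-in here for ra's -e "$s" option).
s="solfire.aljosaborkovic.com"
s="$s|kukutrustnet777.info"
s="$s|www.kjwre.*fqwieluoi.info"

matches=$(printf '%s\n' \
    "www.example.com" \
    "solfire.aljosaborkovic.com" \
    "www.kjwre77638fqwieluoi.info" \
  | grep -E "$s")
echo "$matches"
```

Only the second and third names match; the ".*" in the last branch absorbs the randomized middle of the kjwre domain.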

Other examples

Flow filtering on a certain port range:

  ra -r $file - dst port gt 1024 and dst port lt 2048

Use racluster() to generate the counts you are looking for:

   racluster -m proto -r $file -s proto spkts dpkts sbytes dbytes
   Proto  SrcPkts  DstPkts     SrcBytes     DstBytes 
     udp    15567    12390      2912004      3240927
     tcp   900187   866302    410506598    722771403
    icmp      645      522       123240        61250

Packet Loss (with IP address):

   ragraph loss saddr daddr -M 10s -r $file -title 'Packet Loss / IPs' -w ploss.png

Packet Loss (number of packets)

   ragraph loss spkts dpkts -M 10s -r $file -title 'Packet Loss / Packets' -w ploss2.png

Jitter (number of packets)

   ragraph jitter saddr daddr -M 10s -r $file -title 'Jitter' -w jitter.png

Concurrent transactions:

   ragraph trans -M 10s -r $file -title 'Concurrent Transactions' -w transac.png

Note (2010-0617): It does look, from the code, like it is trans/sec.  We have explicit
code for controlling that, and it looks like "Trans" doesn't correct for
the GAUGE/AVERAGE artifacts rrd and rrd_graph generate.

If you make this change to ragraph():

thoth:~ carter$ diff `which ragraph` /tmp/ragraph
1093c1093
<          /Trans/    and do {$power[$x] = 1.0 ; };
---
>          /Trans/    and do {$power[$x] = $STEP ; };

It will graph the actual 'trans' value in each time bin.


Top talkers & Listeners

   racluster -m matrix -r $file -w - | rasort -m bytes | less

Note: piping through 'ra -n' again is redundant and a waste of CPU cycles (FYI: the -s switch is also available for rasort when you need different output fields).

Rastrip always removes argus management transactions, thus having the same effect as a

   'not man'

filter expression.

To remove the tcp network DSR (data structure record):

   rastrip "-m -net"

(or something like it)

To see if you get something useful:

   rastrip "-M time flow metric" 

Yes, you can pipe rastrip(). Try something like this:

  rastrip -S $server -w - | rasplit [options] -r - 


   racluster -r $file -M net 192.168.0.0/16 -m daddr/16 - "host 192.168.0.10 or host 192.168.0.11"


   % ra -nr $file -s saddr sport daddr dport 
   SrcAddr        Sport      DstAddr        Dport
   1.2.3.58.1140         1.2.4.5.41460
   1.2.3.55.4100         1.2.4.5.41460
   1.2.3.3.3336          1.2.5.6.135


Split records into 5 minute files

   rasplit -M time 5m -S argus-north... -w /var/log/argus/\$srcid/%Y/%m/%d/file.%Y.%m%d.%H.%M.%S

one for every day

   rasplit -S radium -M 1d -w /path/argus-\$srcid.%Y.%m.%d.log 

It is possible to execute a command after each file is closed, e.g. to compress it or to insert the data into a database:

   rastream -S argus -B 15s -w /archive/\$srcid/%Y/%m/%d/ntam.%Y.%m.%d.%H.%M.%S \
      -f /usr/local/bin/rastreamshell 

There is an example file in the distribution, SRC/support/Config/rastream.sh :

#!/bin/sh
#
#  Argus Client Software.  Tools to read, analyze and manage Argus data.
#  Copyright (C) 2000-2011 QoSient, LLC.
#  All Rights Reserved
#
# Script called by rastream, to process files.
#
# Since this is being called from rastream(), it will have only a single
# parameter, filename,
#
# Carter Bullard <carter@qosient.com>
#
 
PATH="/usr/local/bin:$PATH"; export PATH
package="argus-clients"
version="3.0.2"
 
OPTIONS="$*"
FILES=
while  test $# != 0
do
    case "$1" in
    -r) shift; FILES="$1"; break;;
    esac
    shift
done
 
racluster -M replace -r $FILES
gzip $FILES
exit 0
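The argument-parsing loop in the script can be exercised on its own; this sketch (with a hypothetical file name) shows how it extracts the file name following -r:

```shell
# Simulate rastream invoking the script as: rastream.sh -M time 5m -r <filename>
set -- -M time 5m -r /tmp/ntam.2011.01.01.00.00.00

FILES=
while test $# != 0
do
    case "$1" in
    -r) shift; FILES="$1"; break;;
    esac
    shift
done

echo "$FILES"   # /tmp/ntam.2011.01.01.00.00.00
```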


Comma separated value

   %cat ra3.conf.t
   RA_PRINT_LABELS=0
   RA_FIELD_DELIMITER=','
   RA_PRINT_NAMES=proto
   RA_TIME_FORMAT="%y-%m-%d %T"
   RA_PRINT_DURATION=no
   RA_PRINT_LASTIME=yes    
   %ra3 -F ra3.conf.t -r icmp3.argus | more
   StartTime,Flgs,Proto,SrcAddr,Sport,Dir,DstAddr,Dport,SrcPkts,DstPkts,SrcBytes,DstBytes,State
   06-06-27 11:20:28.911941, v       ,icmp,142.58.201.99,,->,142.58.201.254,,1,0,102,0,ECO
   06-06-27 11:20:28.911946, v       ,icmp,142.58.201.99,,->,142.58.201.254,,1,0,102,0,ECO
   06-06-27 11:20:28.911951, v       ,icmp,142.58.201.99,,->,142.58.201.254,,1,0,102,0,ECO
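Once the output is comma-separated, ordinary text tools can post-process it. For example, this sketch sums the SrcBytes column (the 11th field) of the three sample lines above with awk:

```shell
# Sum SrcBytes (the 11th comma-separated field) of the sample CSV output.
total=$(awk -F, '{sum += $11} END {print sum}' <<'EOF'
06-06-27 11:20:28.911941, v       ,icmp,142.58.201.99,,->,142.58.201.254,,1,0,102,0,ECO
06-06-27 11:20:28.911946, v       ,icmp,142.58.201.99,,->,142.58.201.254,,1,0,102,0,ECO
06-06-27 11:20:28.911951, v       ,icmp,142.58.201.99,,->,142.58.201.254,,1,0,102,0,ECO
EOF
)
echo "$total"   # 306
```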


   racluster -m saddr/23 daddr proto dport -w -r $file - dst net 10.1.2.0/23 \
       | rasort -m proto daddr dport dbytes - \
       -s ltime saddr sport daddr dport spkts dpkts sbytes dbytes \
      |less


To generate a top-talkers report for, say, IP addresses (racluster can do it for any object in the record: top MAC addresses, top ToS bytes, top MPLS labels, top VLANs, top ports, top TTLs, etc.):

   racluster -M rmon -m saddr -r $file - ip


A list with 2 columns, IP-address and bytes used:

   racluster -M rmon -m saddr -r $file -w - - ip \
   |    rasort -m bytes -s saddr bytes |head -20

... not to be confused with:

   racluster -M rmon -m saddr -r $file -w - - ip \
   |    rasort  -N 20 -m bytes -s saddr bytes

... equivalent to:

   racluster -M rmon -m saddr -r $file -w - - ip \
   |    ra -N 20 | rasort -m bytes -s saddr bytes 

A list with 2 columns, IP-address and bytes used (carter version):

  racluster -M rmon -m proto sport -r $file -w - - ip | \
  rasort -m bytes proto sport -s stime dur proto sport spkts dpkts sbytes dbytes


Monitoring of 802.1q packets is already built in. If you have VLAN-tagged input traffic, adding

   -s +svlan +dvlan

to your ra command will display the VLAN tag values in hex form, and you can filter ra (or other clients) on VLAN tags.

To see the VLAN in decimal form, use these options:

   -s +svid +dvid


Top src address based on src bytes in a collection of records

   racluster -m saddr  -w - -R 2006/09/28 - ip | rasort -m sbytes

Top address, regardless of direction (The "-M rmon" folds the src and dst addresses together, putting the values into the saddr field.):

   racluster -M rmon -m saddr -w - -R 2006/09/28 - ip | rasort -m sbytes

2007-0305 (Argus-info Digest, Vol 19, Issue 5) What is the current best way to get a report like:

   ramon -nn -L0  -M svc -r $file - | head -25
   racluster -M rmon -m proto sport -r $file -w - - tcp or udp | \
       ra -N  25 -s proto sport spkts dpkts sbytes dbytes

2007-0321 (Argus-info Digest, Vol 19, Issue 30) If you are looking for functionality like "ramon -M TopN" or "ramon -M Matrix", try this:

     racluster -r $file -M rmon -m saddr - ip   (this generates stats based on IP address)
     racluster -r $file -m matrix - ip          (based on the IP matrix)

To do whatever TopN you want, pipe the output to rasort(). So, to get the Top10 in packets received and transmitted:

     racluster -r $file -M rmon -m saddr -w - | rasort -m pkts -w -  | ra -N 10

To get the Top5 in bytes per second transmitted:

      racluster -r $file -M rmon -m saddr -w - | rasort -m srate -w  - | ra -N 5 -s +srate
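The rasort-then-"ra -N" pattern is simply a numeric sort followed by head. As a stand-in illustration with coreutils (on a hypothetical two-column address/rate list, not real argus output):

```shell
# Top-5 by the numeric second column -- analogous to
# "rasort -m srate ... | ra -N 5" on argus records.
top5=$(printf '%s\n' \
    '10.0.0.1 500' '10.0.0.2 1500' '10.0.0.3 900' \
    '10.0.0.4 100' '10.0.0.5 700' '10.0.0.6 1200' \
  | sort -k2,2 -rn | head -5)
echo "$top5"
```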

2007-1102 (Argus-info Digest, Vol 27, Issue 2) I (Terry) run the following collectors:

   /opt/argus/sbin/argus -X -d -A -i eth2 -P 561
   /opt/argus/sbin/radium -X -d -C -S 1006 -P 564
   /opt/argus/sbin/radium -X -d -C -S 1007 -P 565

I (Terry) have another process that aggregates these:

   /opt/argus/sbin/radium -X -d -S localhost:561 -S localhost:564 -S \
   localhost:565 -P 569

2008-0215 Some examples of ragraph: ( http://search.gmane.org/?query=ragraph&group=gmane.network.argus )

   ragraph bytes proto -M 60s -r strange-broadcast-10000.argus -fill -stack  \
       -w ./strange-broadcast-10000.png
   ragraph -r inputfiles* -t 12-13
   ragraph spkts dport -M 1h -n -n -r argus.dat.04 - src net X/20
   ragraph pkts dport -M 10s -T 60 -S 192.168.1.101 -p0
   ragraph bytes saddr -M 1m -m saddr/24
   rabins -M soft zero -p6 -GL0 -s ltime bytes -nn -M 1m \
       -r $files - srcid eligate1 and icmp |  head
   ragraph sbytes dbytes -M rmon time 1m -m smac -t 2007/10/04 \
       -r $file -w ragraph.png -- ether host 00:15:F2:64:92:13
   ragraph pkts proto -M 1m -title 'eligate2: protocol distribution' \
       -height 200 -t 2007/10/04 -r /var/log/argus/argus.log \
       -w /var/www/argus/eligate2/proto/current.png - srcid eligate2
   rahisto -r datafile -H drate 140:100-170K 
   bash> for i in 1s 2s 5s 10s 15s 20s 30s 45s 1m 2m 5m 10m 15m 20m 30m 1h 2h; do echo $i ;\
         ragraph rate dport -M $i -r output.file -t 18-20 -m proto dport -upper 5000 -lower 7000 \
         -title "Aggregation Metric Distribution Analysis - Resolution $i" ;\
         mv ragraph.png aggregation.$i.png; done
   rasort -R ${stats_dir}/.../day -m bytes smac saddr -w - \
     | ra -N 20 -w top20.talkers.list
     ; ra -s addr -r top20.talkers.list > addrs.list
     ; rafilteraddr -f addrs.list -R ${stats_dir}/..../daily  > /tmp/data
     ; ragraph  spkts dpkts saddr -M 1m -w /tmp/ragraph.png


2008-0228 (Argus-info Digest, Vol 30, Issue 41) to insert data every 5 minutes, it can be as easy as:

  rastream -S live.argus.stream -f yourMysqlImport.sh -M time 5m -B 15s \
     -w /opt/ARGUS/OUTBOUND/%Y/%m/%d/argus.%Y.%m.%d.%H.%M.%S

This would generate an argus archive broken out by year/month/day containing files every 5 minutes; 15 seconds after the end of each 5-minute clock boundary, your script would be run against the file, indexing the data and then compressing the file. It could also remove the file if you're not interested in keeping the archive, etc.


2008-0305 (Argus-info Digest, Vol 31, Issue 6) When the records are not well formed, you need the "-M rmon" option to make the records direction-less. Because of the direction-less nature you can use "dport" or "sport" as the merge key, but you have to be consistent, as you will need to pipe the output to ra() to select the ports you're interested in:

  racluster -M rmon -r $file -m proto dport -w - | \
    ra -L 0 -s stime dur proto dport spkts dpkts sbytes dbytes - dst port 80 or 443

equivalent to (in argus clients v2.0.6)

  ramon -M Svc -nn -r argus-$DATE.arg - port 80 or 443


Bandwidth usage, flow by flow, on Feb 26th from 19h to 20h; unnecessary columns have been cut to keep every record on a single line (from: http://www.vorant.com/nsmwiki/Argus#How_do_I_do_IP_accounting_by_IP :-)

   cd /archive/2008/02/26
   racluster -M rmon -m saddr daddr -r argus.19.00.00.gz -w - - ip and dur gt 1 \
   |  rasort -m sload -w - \
   |  ra -N 15  -p 0 -s "-flgs -proto -dir -state +avgdur +sload +dload +trans"

List all the state values appearing in a file

 % ra -r $file -nn | awk '{print $NF}' | sort | uniq -c | sort -nr
 91104 CON
 77066 FIN
 65763 TIM
 55618 ECO
 41232 INT
 28724 RST
   798 ECR
   467 URP
     2 CLO
     1 STA
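The same last-column tally can be checked on a few fake records (sample lines below are made up, only the pipeline is the point):

```shell
# Tally the last whitespace-separated field, exactly as in the one-liner above.
counts=$(printf '%s\n' 'a b CON' 'c d CON' 'e f FIN' \
  | awk '{print $NF}' | sort | uniq -c | sort -nr)
echo "$counts"
```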

2008-0312 (Argus-info Digest, Vol 31, Issue 15) Print headers in ra* version 3.*

  "-L 0" will print the headers once, "-L 40" will print the headers every 40 lines, etc ...

2008-0312 (Argus 3: Statistics for Major Protocols) (C.S. Lee) You can cluster or merge records based on a flow key, which is suitable for data mining, data management and report generation. Let's generate a statistical report using protocol as the flow key; notice the "-m proto" in the command line below, and the use of "-s" to print the fields I want:

  racluster -L0 -m proto -r $file -s proto trans pkts bytes appbytes -\
   tcp or udp or icmp

2008-0317 When (on which date) does this long-running argus file start? (By default, ra* clients use the "%T" time format, i.e. HH:MM:SS.)

  cat /tmp/rarc
     RA_TIME_FORMAT="%D  %T"
  ra -s "stime"  -F  /tmp/rarc  -N 1 -L 0 -nr $file
              StartTime
     02/29/08  18:42:55
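The "%D  %T" specifiers are ordinary strftime(3) formats, so you can preview a format before putting it into a rarc file. A quick sketch, assuming GNU date is available, using the fixed epoch-zero timestamp:

```shell
# Preview the "%D  %T" format on a fixed UTC timestamp (GNU date assumed).
stamp=$(date -u -d @0 +"%D  %T")
echo "$stamp"   # 01/01/70  00:00:00
```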

2008-02-28 A simple gnuplot plot file to generate a graph of "Total Bytes By Protocol" using argus data, assuming gnuplot is installed in /opt/local/bin/gnuplot (Carter Bullard).

   % chmod 755 barchart.bytesxproto.plt
   % racluster -m proto -r argus.out -s proto spkts dpkts sbytes dbytes > racluster.dat
   % ./barchart.bytesxproto.plt
  ------ begin barchart.bytesxproto.plt ------
#!/opt/local/bin/gnuplot -persist
#
#       G N U P L O T
#       Version 4.2 patchlevel 2
#       last modified 31 Aug 2007
#       System: Darwin 9.2.0
#
#       Copyright (C) 1986 - 1993, 1998, 2004, 2007
#       Thomas Williams, Colin Kelley and many others
#
#       Type `help` to access the on-line reference manual.
#       The gnuplot FAQ is available from http://www.gnuplot.info/faq/
#
#       Send bug reports and suggestions to <http://sourceforge.net/projects/gnuplot>
#
#
reset
#
# Create simple barchart of Total Bytes by Protocol
# The racluster.dat file was generated using:
#
#     racluster -m proto -r argus.out -s proto spkts dpkts sbytes dbytes
#
# And is of the format:
#
# Proto  SrcPkts  DstPkts     SrcBytes     DstBytes
#   pim    53267    18086     48793554      1085160
#  ospf     1764        0       213220            0
#  [more]
#
set termoption font "Verdana, 12"
set size square 0.90,0.90
set bmargin 4
set title "Total Bytes By Protocol" font "Verdana,22"
set style data histogram
set style histogram cluster gap 1
set style fill solid border -1
set tics font "Verdana,14"
set boxwidth 0.80
set grid
set ylabel "Log Total Bytes" font "Verdana,18"
set logscale y 10
set auto y
set label 1 "Generated by Argus using Gnuplot"
set label 1 at graph 1.02, 0.62 rotate by 90 font "Verdana,9"
#
set key autotitle columnhead
plot 'racluster.dat' using 4:xticlabels(1) ti col, \
     '' using 5 ti col
#
 ------ end barchart.bytesxproto.plt ------

2008-0326 Count flows by groups of 10 minutes: show only the flow start times, cut after the tens-of-minutes digit, strip the first line (headers), add a trailing zero and delete leading spaces to show a nice HH:MM value, count them, invert the columns, and insert a delimiter. Ready to be fed into your favorite spreadsheet.

 echo 'RA_TIME_FORMAT="%H:%M"' > raTime.conf
 ra -F raTime.conf -s stime -nr $file | \
   cut -c -4 | \
   uniq -c | \
   sed -e '1d' \
       -e 's/$/0/' \
       -e 's/^ *//' \
       -e 's/\(.*\)  *\(.*\)/\2,\1/' > flowcounts.csv
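The cut/uniq/sed stages can be verified on canned input; this sketch replaces the ra stage with three fake stime lines (a header plus times falling into the same 10-minute bin):

```shell
# Same pipeline as above, fed with fake ra output instead of "ra -s stime".
csv=$(printf '%s\n' 'StartTime' '08:21' '08:23' '08:29' \
  | cut -c -4 \
  | uniq -c \
  | sed -e '1d' \
        -e 's/$/0/' \
        -e 's/^ *//' \
        -e 's/\(.*\)  *\(.*\)/\2,\1/')
echo "$csv"   # 08:20,3
```

The three timestamps collapse to the "08:2" prefix, gain a trailing zero to become 08:20, and come out as one CSV row with the count in the second column.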


2008-0409 Carter's version, thanks to Nick Diel. This example assumes you have already merged status flow records, so records = flows; if not, add another racluster to the pipe. If you have multiple collectors, you can have rabins merge on something else, such as proto, if you are filtering on tcp.

  echo 'RA_TIME_FORMAT="%H:%M"' > raTime.conf  # (you could also add this to your rarc file)
  rastrip -r $file -M -agr -w - | \
     rabins -M nomodify time 10m -m srcid -s stime trans -c , -F raTime.conf > flowcounts.csv


2008-0409 Carter's note: when you want a single flow counted only once, in the time bin when it started, you don't want to modify/split the flow records, so use this option:

  rabins -M nomodify


2008-0409 Stéphane Peters: small all-purpose script to count and total all columns: /bin/tot

#!/bin/awk -f
BEGIN{max=0}
{if ( NF > max ) max = NF;
       for ( i=1 ; i <= NF ; i++ ) {
               tot[i]+=$(i);
       }
}
END { for ( i=1 ; i <= max ; i++ ) {
       if ( tot[i] > 1000000 )
               printf "%sm\t", tot[i]/1000000;
       else if ( tot[i] > 1000 )
               printf "%sk\t", tot[i]/1000;
       else if ( tot[i] == 0 )
               printf "-\t";
       else
               printf "%s\t",tot[i];
       }
       printf "\n";
}
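To see what the script produces, the same awk body can be run inline on two sample rows (column sums 3000, 6 and 0 come out as "3k", "6" and "-"):

```shell
# Exercise the totalling awk inline on two rows of sample columns.
out=$(printf '%s\n' '1000 2 0' '2000 4 0' | awk '
BEGIN{max=0}
{if ( NF > max ) max = NF;
       for ( i=1 ; i <= NF ; i++ ) { tot[i]+=$(i); }
}
END { for ( i=1 ; i <= max ; i++ ) {
       if ( tot[i] > 1000000 )      printf "%sm\t", tot[i]/1000000;
       else if ( tot[i] > 1000 )    printf "%sk\t", tot[i]/1000;
       else if ( tot[i] == 0 )      printf "-\t";
       else                         printf "%s\t", tot[i];
       }
       printf "\n";
}')
echo "$out"
```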

2007-10-04 Wolfgang Barth : "I'm using the following code for graphing interface load:" (http://thread.gmane.org/gmane.network.argus/5338/focus=5348)

/usr/local/bin/rabins -M rmon 1m -m smac -t 2007/10/04 \
  -r /var/log/argus/argus.log -w - - srcid eligate2 | \
  /usr/local/bin/ragraph sbytes dbytes -M 1m -title 'eligate2: Load' \
  -height 200 -upper 1000000 -rigid -lower 1000000 -rigid -t 2007/10/04 \
  -w /var/www/argus/eligate2/load/current.png -r - - ether dst 00:15:F2:64:92:13

2008-06-25 From Peter Van Epp: How to put commas in large numbers (http://article.gmane.org/gmane.network.argus/6062)

The following perl fragment will add commas if you run the ra output through an appropriate perl script:

sub commas {
       local($_) = @_;
       1 while s/(.*\d)(\d\d\d)/$1,$2/;
       $_;
}

and called like this:

$pcount = &commas($count);

2008-12-29 (Argus-info Digest, Vol 40, Issue 5) ragraph with large files

Carter: When you are graphing objects like ports, you can use the aggregation features of ragraph() to minimize memory use. For example, you can use "-m proto dport" in:

ragraph dbytes sbytes dport -M 5m -t $time -fill -stack -invert -title \"$title\" \
   $log -w $filename $filter

That should constrain your graph so that it doesn't use much memory at all (the max should be, what, 64K ports for udp and tcp in memory for each 5m period). One thing to note: the destination port field doesn't decode unless the protocol field has a valid value.

2009-02-13 (Argus-info Digest, Vol 42, Issue 15) Radium repository example

 rasplit -M time 5m -S radium -w experiment/\$srcid/%Y/%m/%d/argus.%Y.%m.%d.%H.%M.%S
 ra -S remoteRadium/path/to/specific/argus/file/argus.2009.02.13.15.20.00.gz

2009-04-24 (Argus-info Digest, Vol 44, Issue 35) argus reads tcpdump files

First, create the tcpdump file (PCAP format), then convert it with argus:

 tcpdump -i eth0 -n -w testdump ;
 argus -mAJZR -r testdump -w testdump.arg3


ralabel example

(Argus-info Digest, Vol 59, Issue 33)

First, create a "ralabel.conf" file:

  RALABEL_ARGUS_FLOW=yes
  RALABEL_ARGUS_FLOW_FILE="argus-flow-file"

Second, create an "argus-flow-file":

 # Argus-flow-file
 #
 # Our application
 filter="host 10.1.2.3 and port 80" label="Appserver - web traffic"
 filter="host 10.1.2.3"             label="Appserver - other traffic"
 
 # Proxy
 filter="host 10.1.2.4 and port 8080" label="Proxy server - normal traffic"
 filter="host 10.1.2.4 and port 80"   label="Proxy server - web traffic"
 filter="host 10.1.2.4"               label="Proxy server - other traffic"
 
 filter="udp and port 53"             label="DNS traffic"

Use it (some fields have been removed to fit the wiki page):

 ralabel -f ralabel.conf -nr $f -s "-status -sbytes -dbytes +label:40"

Result:

26/07 11:59     tcp  1.0.2.2.9405      ->     10.2.3.4.80         503846   RST          flow=Proxy server - web traffic
26/07 11:59     tcp 10.2.3.4.8080     <?>      1.0.3.1.8248          163   CON       flow=Proxy server - normal traffic
26/07 11:59     tcp  1.0.4.1.8820      ->     10.1.2.3.80           9895   FIN             flow=Appserver - web traffic
26/07 11:59    icmp  1.0.5.1.8        <->     10.2.3.4.11736         204   ECO        flow=Proxy server - other traffic
26/07 11:59     tcp  1.0.6.1.9286      ->     10.2.3.4.8080         5381   FIN       flow=Proxy server - normal traffic
26/07 11:59     tcp  1.0.4.1.8821      ->     10.1.2.3.80           1475   FIN             flow=Appserver - web traffic
26/07 11:59    icmp  1.0.5.1.8        <->     10.2.3.4.11736         204   ECO        flow=Proxy server - other traffic
26/07 11:59    icmp  1.0.5.1.8        <->     10.2.3.4.11736         204   ECO        flow=Proxy server - other traffic
26/07 11:59     tcp  1.0.7.1.57268     ->     10.2.3.4.8080         1208   CON       flow=Proxy server - normal traffic
26/07 11:59     tcp  1.0.8.1.9265     <?>     10.2.3.4.8080          242   CON       flow=Proxy server - normal traffic
26/07 11:59     tcp  1.0.9.1.22513     ->     10.2.3.4.8080         9252   FIN       flow=Proxy server - normal traffic
26/07 11:59     tcp  1.0.9.1.22516     ->     10.2.3.4.8080         9200   FIN       flow=Proxy server - normal traffic
26/07 11:59     tcp  1.0.9.1.22518     ->     10.2.3.4.8080       155672   FIN       flow=Proxy server - normal traffic

rasplit example, working with pipes

(Argus-info Digest, Vol 76, Issue 21, Jesse Bowling)

To have argus generate both a flow file and a pcap file of the data as it's captured, here is a hackish way to go about it (YMMV):

mkfifo tcpdump.fifo
mkfifo argus.fifo
tcpdump -r tcpdump.fifo -w /pcaps/%Y_%m_%d_%H%M_test.pcap -G 300 &
argus -r argus.fifo -w - | rasplit -r - -M time 5m \
    -w /argus/%Y_%m_%d_%H%M_test.argus &
tcpdump -i eth0 -s 2048 -w - | tee argus.fifo > tcpdump.fifo &
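The plumbing itself (one producer tee'd into two FIFO consumers) can be illustrated with cat standing in for the tcpdump and argus readers, in a throwaway directory:

```shell
# One producer feeding two independent consumers through named pipes.
dir=$(mktemp -d)
mkfifo "$dir/a.fifo" "$dir/b.fifo"
cat "$dir/a.fifo" > "$dir/a.out" &    # stand-in for the pcap-writing tcpdump
cat "$dir/b.fifo" > "$dir/b.out" &    # stand-in for argus | rasplit
printf 'fake packet stream\n' | tee "$dir/a.fifo" > "$dir/b.fifo"
wait                                  # both consumers received the same data
```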

ratop filters

(2007-02-14 04:46:06, Carter Bullard)

There are three types of filters in ratop(), the first is a remote filter, which will be transmitted to a remote argus source, thus limiting the amount of traffic on the wire. The second is a local input filter. You would use this type of filter if the remote does not support the type of filter you want to use. This is a compatibility feature. The third is a display filter, which will control what records are displayed, without affecting the internal buffers of ratop().

You differentiate the filter types using the keywords "remote", "local" and "display".

Without a keyword you get "remote": the remote filter is sent if there is an argus server to send it to, and it is also used as an input filter for ratop().

So ... try this:

   ratop -r file

This causes ratop() to process the file without any input filtering. Once the data has been read, call up the "Specify filter: " prompt in ratop() by typing:

  f

and then at the prompt type:

   display tcp and dst port 80

ratop various commands

(2012-01-23, Carter Bullard)

Command '/','?', 'n', 'N' - search

ratop.1 is like 'vi', in that you have a status line at the bottom of the screen. If you type '/', you go into forward search mode; type any string, then carriage return, and like 'vi', the cursor will jump to that string in the developing flow cache display that ratop.1 is printing. The search string is a regex, so you can put really bizarre things in there. It will search across multiple pages, and 'n' and 'N' take you to the next or previous match. These search the actual strings on the screen, so you have to have the fields displayed in order to search on them. If you type '?' you go into backwards search mode.

Control-r: Reverse flow direction

One that is important is the control-r command, as that reverses the direction of a specific flow record on the screen. Get the cursor to a line you want to reverse, then hit 'control-r'.

Command ':' - options

If you type ':' you enter command mode, where you can type options and commands. Command 'h' prints the help screen. Using the ':' command method you can, for example, change the sorting algorithm on the fly (command 's') or change the displayed fields (command 'F').

Command ':H' - Human bytes

At any time, type command 'H' while ratop is reading data, and most of the numeric metrics (bytes, appbytes, packet counts, rates, loads, etc.) will be converted to the appropriate abbreviations. 'H' is a toggle, so you can hit it as many times as you like to flip the abbreviations; when you're done, carriage return puts ratop.1 back into navigation mode.

Command ':s' - Save cache

Because you can use command 's' at any time to save the cache that ratop.1 is working with, you can use ratop.1 to make corrections to flow records.

Command ':a' - add

One command that is not mentioned in the help screen is 'a', which adds new lines to the display. The only line supported right now is 'totals', which puts the aggregate of the entire cache that ratop.1 is working with as the first line of the display. Typing:

  :a

gives you an "Add: " prompt; type "totals", then carriage return. Remove the line again with "-totals".

Navigation

As in 'vi', the 'h', 'j', 'k', 'l' navigation works, so you can move the cursor around if your arrow keys don't work.

various parameters

"-M hex" : ra hex dump

(2012-01-22 20:28:50, Carter Bullard, "Re: argus client obfuscation")

ra* programs currently support the "-M ascii", "-M hex", "-M encode32", and "-M encode64" command-line options, which are undocumented. I will change this support to "-M printer=ascii", "-M printer=hex", etc.

Useful Links

Argus - Downloads (official download page; v3.0.0 stable since Apr 2008!)

  http://www.qosient.com/argus/downloads.shtml

Argus - Home

  http://www.qosient.com/argus/index.shtml

Argus - FAQ

  http://www.qosient.com/argus/faq.shtml

Argus - Development File Listing dev (download page for the next release in development)

  http://qosient.com/argus/dev/

Argus - Previous versions

  http://qosient.com/argus/src/

Argus news on Gmane

  http://news.gmane.org/gmane.network.argus (thread view)
  http://blog.gmane.org/gmane.network.argus (blog view)

Argus - NSMWiki (but... that's this page!)

  http://nsmwiki.org/Argus

Argus - WTFWiki (another one, updated 2012-07)

  http://wtf.hijacked.us/wiki/index.php/Argus

Argus - Documentation / How To File

  http://www.qosient.com/argus/howto.shtml
  http://web.archive.org/web/20080119143705/http://qosient.com/argus/how-to.htm (last copy of old version)

[ARGUS] rahisto dialog

  http://blog.gmane.org/gmane.network.argus/month=20061121

Argus Tips and Tricks: more than 17 extensive posts about Argus from C.S. Lee's blog, When {Puffy} Meets ^RedDevil^

  A good starting point to understand how the argus records work
     http://geek00l.blogspot.com/2007/12/network-flow-demystified.html
  Packets -> Flows -> CSV -> Graph
     http://geek00l.blogspot.com/2007/11/packet-flow-csv-graph.html
  Argus 3: Statistics for Major Protocols
     http://geek00l.blogspot.com/2008/01/argus-3-statistics-for-major-protocols.html
  Argus 3: German Article
     http://geek00l.blogspot.com/2008/01/argus-3-german-article.html
  ... for the remaining ones, look at C.S. Lee's blog posts talking about Argus3 : 
     http://geek00l.blogspot.com/search/label/Argus3
  

Beber l'ARGUSnaute (Tutoriel en français - French tutorial)

  http://www.minithins.net/warehouse/argus.txt

Scan of The Month 28 (one use of argus: analysis of a successful compromise and the attacker's actions after it)

  http://old.honeynet.org/scans/scan28/sol/2/index.html

Argus – Auditing network activity (PDF by Russ McRee)

  http://holisticinfosec.org/toolsmith/docs/november2007.pdf

ArgusEye - A GUI for Argus

  http://www.datenspionage.de/arguseye/

RRDtool - About RRDtool, used by ragraph

  http://oss.oetiker.ch/rrdtool/

AfterGlow

  http://afterglow.sourceforge.net/

argus.traffic.perl (Peter Van Epp) - multiple probe setup - was written for argus 2.0.6

  ftp://ftp.sfu.ca/pub/unix/argus/

echo_response.pl (Peter Van Epp) A perl script for finding echo response packets.

  ftp://ftp.andrew.cmu.edu/pub/argus/contrib/echo_response.pl

What we do with argus (Russel Fulton) (earlier mailing list on osdir.com)

  http://osdir.com/ml/network.argus/2001-06/msg00090.html

ARGUS-2.0.1 IPFX BOF - IETF London, UK - Wed, August 8, 2001 ; slide show (Carter Bullard)

  http://www.ietf.org/proceedings/51/slides/ipfx-2/sld001.htm

Re: Combination of snort and argus (Russel Fulton) - Description of an argus record

  http://osdir.com/ml/security.ids.snort.sigs/2002-10/msg00109.html

Example working on the LBNL/ICSI Enterprise Tracing Project with sguil / argus (Richard Bejtlich)

  http://taosecurity.blogspot.com/2007/05/lbnlicsi-enterprise-tracing-project.html

Howto run argus on OpenWRT (CS Lee)

  http://geek00l.blogspot.com/2009/04/argus-3x-on-linksys-wrt54gl.html

FAQ

There is an FAQ on www.qosient.com which covers most of the basic issues you are likely to face when starting out with Argus. This is required reading for all new argus users.

Feel free to add new questions and answers here:


How do I do IP accounting by IP

Argus 3.0 provides the fundamental data needed to generate an accounting of network activity from many different perspectives. Because Argus data can contain ethernet addresses, VLAN tags, MPLS labels, Layer 4 protocol identifiers, and port numbers, argus data can provide the information needed to do Layer 2, sub Layer 3, and Layer 4+ accounting. Argus data is capable of this through flexible aggregation, provided by the client program, racluster().

A useful task for racluster() is to provide an accounting for each IP address in a stream of argus data. This is actually quite a complex task, however, because of the nature of argus data (bi-directional flow data) and of what a single-address accounting actually represents. I'll get into this later.

Racluster() is the program of choice for generating an IP address inventory and the metrics that can be derived for each address. It provides simple command-line directives to generate the list of IP addresses and their activity. Let's assume you have a file, argus.data, that contains the records of interest. To generate the simple per-IP-address data, simply run racluster() with these options:

  racluster -M rmon -m saddr -r argus.data - ip

This will read in IP data records and track each individual IP address; after processing the entire file, it will output argus data providing aggregated statistics, such as the transaction count and the total byte and packet counts for each IP address observed. Let's look at the options in detail.

The "-M rmon" option is the secret sauce that converts bi-directional flow data, which contains two IP addresses in each record, into IETF RMON style data, where you get a listing of individual IP addresses, with the input and output packet and byte counts observed in the data.
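Conceptually (this is a toy sketch, not racluster's actual code), the rmon fold counts every bidirectional record twice, once under each endpoint address. The awk model below uses made-up columns saddr, daddr, sbytes, dbytes:

```shell
# Toy model of "-M rmon": each record contributes bytes-sent and
# bytes-received to BOTH of its endpoint addresses.
rmon=$(awk '{ snd[$1] += $3; rcv[$1] += $4;   # as seen from the source
              snd[$2] += $4; rcv[$2] += $3 }  # as seen from the destination
     END { for (a in snd) print a, snd[a], rcv[a] }' <<'EOF'
10.0.0.1 10.0.0.2 100 200
10.0.0.1 10.0.0.3 50 0
EOF
)
echo "$rmon"
```

Each address ends up on its own output line with its total sent and received bytes, which is exactly the shape of data the real "-M rmon" option produces in the saddr field.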

The "-m saddr" option specifies the aggregation model for the data, and in this case causes racluster() to aggregate using the source IP address as the flow key. This results in racluster() providing records with unique IP addresses in the "saddr" field of the output argus records.

The "-r argus.data" option simply specifies some input data. You could use the "-R" option to specify an entire filesystem of argus data, or the "-S" option to specify a remote stream of argus data. Distribution programs like radium() allow you to read remote argus data files using "-S hostname:/full/path/name/to/the/argus.data", so there are a lot of options for getting data into racluster().

racluster() outputs argus data, so you can use the output to generate reports and graphs, and in other types of argus data processing. A common practice is to read the output of racluster() with rasort() to generate a "top"-whatever list. Say you want the top 15 IP addresses based on input packet load: run racluster(), as above, to generate the IP address data, and then run rasort() with these options:

  rasort -m sload -r racluster.output.file -w - | ra -N 15

Having racluster() write its output to a file allows you to read in prior runs of racluster() before reading the raw data. This type of preprocessing allows you to "seed" your IP address accounting system with existing lists of IP addresses. This strategy is used by many sites to keep a running total of all the IP addresses seen over some arbitrary period of time, say the last 28 days, and it provides the basic data needed for a windowed IP accountability facility, such as that needed to provide "traceback". As an example, you can process data to generate hourly IP address lists, store them in an argus-archive-like filesystem, and then use that data to generate daily/weekly/monthly/annual IP address lists, such as this:

  racluster -m saddr -r 2006/12/10/ipaddrs.2006.12.10.* -w ipaddrs.2006.12.10.daily.out
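The rolling-window accounting described above can be sketched in a few lines of Python. This is an illustration, not part of the argus toolset: the per-day address sets are hypothetical, standing in for lists parsed out of the daily racluster() output files.

```python
from datetime import date, timedelta

def rolling_window_addresses(daily_sets, today, days=28):
    """Union of the per-day address sets for the last `days` days --
    a windowed IP accountability list like the one described above."""
    window = set()
    for offset in range(days):
        day = today - timedelta(days=offset)
        window |= daily_sets.get(day, set())
    return window

# Hypothetical daily lists, e.g. parsed from daily racluster() output files.
daily = {
    date(2006, 12, 10): {"192.0.2.1", "192.0.2.2"},
    date(2006, 12, 9):  {"192.0.2.2", "192.0.2.7"},
    date(2006, 11, 1):  {"203.0.113.5"},   # outside the 28-day window
}
print(sorted(rolling_window_addresses(daily, date(2006, 12, 10))))
# -> ['192.0.2.1', '192.0.2.2', '192.0.2.7']
```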

Conversely, by having the list of active IP addresses, you can build the list of addresses in your address space that are not used. This can serve as a "dark net" inventory, useful for scan detection.
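The dark-net inventory is just a set difference. As a sketch (again, not an argus program), assuming you have already extracted the active-address list from racluster() output into a Python list:

```python
import ipaddress

def dark_addresses(prefix, seen):
    """Return the host addresses in `prefix` that never appear in the
    observed flow data -- the "dark net" inventory."""
    net = ipaddress.ip_network(prefix)
    seen_set = {ipaddress.ip_address(a) for a in seen}
    return [str(h) for h in net.hosts() if h not in seen_set]

# Hypothetical active list, e.g. parsed from `racluster -M rmon -m saddr` output.
active = ["192.0.2.1", "192.0.2.3"]
print(dark_addresses("192.0.2.0/29", active))
# -> ['192.0.2.2', '192.0.2.4', '192.0.2.5', '192.0.2.6']
```

Traffic addressed to any member of this list is then a candidate scan indicator.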


How do people use Argus and manage its files?

(Carter's response to Torbjorn.Wictorin, Argus-info Digest, Vol 48, Issue 38)

Most universities and corporations that run argus use it along with snort, or some other type of IDS, at their enterprise border. They use the IDS as their up-front security sensor, and argus as the "cover your behind" technology. The two basic strategies are to keep all the argus data to support historical forensics, or to toss it after looking at the IDS logs and seeing that not much is/was happening.

The first approach is usually chosen by sites that have technically advanced security personnel, that have been seriously attacked, or that for some reason have a real awareness of the issues and know that the commercial IDS/IPS market is lacking. For sites that are underfunded or less technically oriented, argus, or argus-like strategies, usually aren't being used. If these types of sites are using flow data, it's almost always Netflow data, and they are using a commercial report generator to give the data some utility. These strategies normally do not store significant amounts of flow data, as that would be a cost to the customer.

So when a site does collect a lot of flow data, it generally partitions the data for scaling (like you are doing). Universities and small corporations generate argus data in the subdomains/workgroups/dorms, where 500GB can store a year's worth of flow data.

When the point of collection is the enterprise boundary, and a site is really using the data, and justifying the expense of collecting it all, the site invests in storage, but they also do a massive amount of preprocessing to get the data load down.

Most sites generate 5m-1h files. We recommend 5 minutes. Most sites run racluster() with the default settings on their files sometime early in the process, and then gzip the files. Just running racluster() with the default parameters will usually reduce a particular file by 50-70%. I took yesterday's data from one of my small workgroups, clustered it, compressed it, and got these listings:

  thoth:tmp carter$ ls -lag data*
  -rw-r--r--  1 wheel  93096940 Aug 28 10:30 data
  -rw-r--r--  1 wheel  12534420 Aug 28 10:34 data.clustered
  -rw-r--r--  1 wheel   2781879 Aug 28 10:30 data.clustered.gz

So, from 93 MB to 2 MB is pretty good. Reading these gzip'd files performs pretty well, but if you are going to process them repeatedly, then delaying compression for a few days is the norm.

Because searching 100's of GB of primitive data is not very gratifying if you're looking for speed, almost all big sites process the data as it comes in to generate "derived views" that are their first glance tables, charts and information systems. After creating these "derived views" some sites toss the primitive data (the data from the probes). For billing or quota system verification, most sites generate the daily reports, and retain the aggregated/processed argus records, and throw away the primitive data. I've seen methods that toss, literally, 99.8% of the data within the first 24 hours, and still retain enough to do a good job on security awareness.

There was a time when scanning traffic generated most of the flow data (> 40%). That has shifted in the last 3-4 years, but we have filters that can very quickly pull out data destined to your dark address space and split it to other directories. Some sites use the data, many sites toss it.

Some sites want to track their IP address space, because they have found that that is important to them, some want to retain flow records only for "the dorms". The argus-clients package has programs to help facilitate all of this, but you need to figure out what will work for you.


Some discussions

rabins -M nomodify

(Argus-info Digest, Vol 32, Issue 5)

rabins() can increase the reported flow count - why?

In Nick's example, rabins() is reporting the number of flows during a specific 10m period. Flows that cross hard time boundaries do exist in multiple time periods. So if the question is:

  "what is the flow count in any given time period"

splitting flows across time boundaries is the correct thing to do. There are lots of reasons to ask this question. How many concurrent flows are on the wire, and how that changes over time, is a very sensitive test for lots of network phenomena. I like to track the natural beacons on the wire (network management stations pinging around, switches doing end-system inventory by arping every IP address in the subnet every 300 secs), and it's easy using this type of stat, as computers are pretty good clocks.

I suspect that either 10 flows cross the 10-minute boundaries in this run, or there is a single flow that spans 10 time boundaries, or something in between.

However, if the question is instead:

  "what is the flow initiation count in any given time period"

Well, then you want each flow counted only once, in the time bin when it started. To do this you don't want to modify/split the flow records, so use this option:

  rabins -M nomodify

And it will only tally the flows that start in that time period.
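The difference between the two counting modes can be sketched in Python. This is an illustration of the semantics, not argus code; the flow tuples and bin size are made up.

```python
BIN = 600  # 10-minute bins, in seconds

# Each flow: (start_time, end_time) in seconds since some epoch.
flows = [(10, 50), (590, 1300), (700, 800)]

def bins_touched(start, end, size=BIN):
    """Default behavior: a flow is split across every bin it overlaps."""
    return range(int(start // size), int(end // size) + 1)

split_counts = {}
for s, e in flows:
    for b in bins_touched(s, e):
        split_counts[b] = split_counts.get(b, 0) + 1

# "-M nomodify": each flow is tallied only in the bin where it started.
start_counts = {}
for s, e in flows:
    b = int(s // BIN)
    start_counts[b] = start_counts.get(b, 0) + 1

print(split_counts)   # {0: 2, 1: 2, 2: 1} -- flow (590, 1300) spans bins 0-2
print(start_counts)   # {0: 2, 1: 1} -- every flow counted once, at its start
```

The long flow (590, 1300) is what inflates the per-bin totals in the default mode: it contributes to three bins but to only one with "nomodify".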


A statistical note:

The time boundary, relative to the actual real time of the flows, can be random. Flows, for the most part, don't really care that it's 2pm somewhere, so the time "bins" that we create start at a random time relative to the flow records. But this is not always the case. If a human is not behind the generation of flows, then the flows are being created by a clock, because computers are really just digital alarm clocks, in an abstract way.

This is called "flow clocking".

This is important in understanding the phenomenon that you point out. If you lengthen the bins from 10m to, say, 1h, the probability of a flow crossing a time boundary goes down, assuming flows start at random times. But if your flows are sync'd by a clock of some kind (like cron launching a job), then you will find that as the bin gets longer, all of a sudden the count goes from ( > N ) to ( == N ).
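For randomly started flows, the chance of crossing a bin boundary works out to min(1, duration / bin_size), which a small simulation confirms. This is a sketch of the statistics, with made-up durations, not anything argus computes:

```python
import random

def crossing_probability(duration, bin_size, trials=100_000, seed=1):
    """Estimate the chance a flow of `duration` seconds crosses a bin
    boundary when its start time is uniformly random (no flow clocking)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        start = rng.uniform(0, bin_size)
        if start + duration > bin_size:  # flow spills into the next bin
            hits += 1
    return hits / trials

# Analytically P = min(1, duration / bin_size) for random starts:
print(round(crossing_probability(60, 600), 2))    # 10m bins -> ~0.10
print(round(crossing_probability(60, 3600), 2))   # 1h bins  -> ~0.02
```

Clock-synchronized flows break the uniform-start assumption, which is why lengthening the bin produces the sudden ( > N ) to ( == N ) jump instead of this smooth decline.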

Carter

What is jitter?

Jitter is the variance of the inter-packet arrival times for a given flow/link/virtual circuit/whatever. The network has the ability to 'shape' traffic: because it's 'rate limiting', i.e. 'slow'; because it is busy and possibly introducing queuing artifacts; because it's congested and drops packets; or because packets are taking differing paths in the network and arriving in different burst patterns or in a different order. With your 'rate limited' link, you will get all kinds of jitter issues, depending on the size of the packets and the contention for the single link. Jitter is normally of interest for voice and video traffic, which have pretty stringent jitter tolerances if they are real time, but there are a lot of high-performance applications, like remote file sharing, where jitter impacts peak performance.
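Taking the definition above literally - variance of the inter-packet gaps - this can be computed in a few lines. A sketch with made-up timestamps, not argus's internal implementation:

```python
import statistics

def jitter(arrival_times):
    """Jitter as the population variance of the inter-packet arrival
    gaps, in the same (squared) units as the timestamps (seconds here)."""
    gaps = [b - a for a, b in zip(arrival_times, arrival_times[1:])]
    return statistics.pvariance(gaps)

# A smooth stream: a packet every 20 ms -> essentially zero jitter.
print(jitter([0.00, 0.02, 0.04, 0.06]))   # ~0 (evenly spaced)

# Same mean rate, but bursty gaps (queuing artifacts) -> non-zero jitter.
print(jitter([0.00, 0.005, 0.04, 0.06]))
```

The two streams carry the same packet count over the same interval; only the second would trouble a real-time voice or video application.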

An argus filter is not a ra* filter

(Argus-info Digest, Vol 19, Issue 1)

Russel Fulton said: Hmmm... trap for the unwary  :) The filter "tcp and dst port 80" means something rather different to ra and argus! It took me about half an hour to figure out why argus was seeing traffic in just one direction after I applied this filter. I've got so used to using filters with ra, where the filter applies to flows, that I simply assumed that argus filters would behave the same. They don't; they behave just like tcpdump filters (i.e. they are packet filters).

Carter's reply: You are absolutely correct!!! The input filter for argus is a libpcap filter, so it's a packet filter. All the other filters are flow filters.

racluster -M rmon: how to use it?

( post on gmane 2007-02-22)

Wolfgang Barth : I want to plot inbound/outbound traffic - not src/dst traffic. I'm using something like this:

racluster -M rmon -r argus.log - \
  srcid elibridge_dmz and src host 172.17.132.81 \
  and dst host 172.17.130.2 and tcp dst port 80 and tcp src port 1415

The output is:

2007-02-21 08:15:27.658658 tcp 172.17.132.81.1415 -> 172.17.130.2.www \
                           9       13         1055        11936   FIN
2007-02-21 08:15:27.658658 tcp 172.17.130.2.www  -> 172.17.132.81.1415 \
                          13        9        11936         1055   FIN

The flow is duplicated. Okay, if RMON works this way, how can I filter out inbound and outbound traffic?

Carter: The "-M rmon" option works by duplicating each record, reversing all the fields, and then merging. You then specify the object to aggregate on with "-m":

   "-m smac" - by interface
   "-m svlan" - by vlan
   "-m smpls" - by mpls label
   "-m saddr" - by IP address
   "-m proto sport" - by port
   "-m smac sdsb" - by diffserv label

So if you want to aggregate based on interface, then use:

  -M rmon -m smac

This will give you in/out stats based on interface. When you print/graph the records, you will want to print the smac field, so the full command:

  racluster -M rmon -m smac -s smac spkts dpkts sbytes dbytes

should give you what you're looking for.

rabins() is really the combination of rasplit() and racluster()

( "5 minute load" : post on gmane 2009-04-19)

Carter: rabins() is really the combination of rasplit() and racluster(), where rasplit() divides the data into time slots, and each slot has its own racluster() context. At the end of the run, rabins() prints out the results of all the clusters that it has in its cache.

If you were to do something like:

  rasplit -M hard time 5m -r 01.gz -w /tmp/test/argus.%Y.%m.%d.%H.%M.%S

and then:

 for i in /tmp/test/argus*; do echo $i; racluster -r $i -m srcid -s stime bytes load; done

you should get data similar to what rabins() is trying to do. [assuming use of bash()]

I would like to print all the SYN/ACK occurrence

... to detect possible SYN flood attacks (2010-07-16); actually I can use the -Zs flag in ra to get the result I was searching for

(Carter's response to Riccardo, Argus-info Digest, Vol 59, Issue 15)

You can specify these types of filters to get records that have specific tcp flag settings:

  ra -r file - syn and not ack
  ra -r file - src syn and dst ack
  ra -r file - src push and dst urg

The keyword synack is a special-case argus status indication that matches when the source sent the SYN and the destination sent an ACK.