File: README.md

TCPFLOW 1.4.5
=============
Downloads directory: http://www.digitalcorpora.org/downloads/tcpflow/


Compiling
---------
To compile for Linux

Be sure you have the necessary prerequisites. For Red Hat-based distributions, use the following command to install them:

    # yum -y install git gcc-c++ automake autoconf boost-devel cairo-devel libpcap-devel zlib-devel
    
If you are working on a Debian-based distribution, use this:

    # sudo apt-get install git gcc g++ automake autoconf libpcap-dev libboost-dev libssl-dev zlib1g-dev libcairo2-dev

Download the release from http://digitalcorpora.org/downloads/tcpflow/.  Compile and install with:

    ./configure
    make
    sudo make install

If you want to download the development tree with git, be sure to do a *complete* checkout with `--recursive` and then run `bootstrap.sh`, `configure` and `make`:

    git clone --recursive https://github.com/simsong/tcpflow.git
    cd tcpflow
    bash bootstrap.sh
    ./configure
    make
    sudo make install  


To download and compile for Amazon AMI:

    ssh ec2-user@<your ec2 instance>
    sudo yum -y install git make gcc-c++ automake autoconf boost-devel cairo-devel libpcap-devel openssl-devel zlib-devel
    git clone --recursive https://github.com/simsong/tcpflow.git
    cd tcpflow
    sh bootstrap.sh
    ./configure
    make


To compile for Windows with mingw on Fedora Core:
    
    yum -y install mingw64-gcc mingw64-gcc-c++ mingw64-boost mingw64-cairo mingw64-zlib
    mingw64-configure
    make



Introduction To tcpflow
=======================

tcpflow is a program that captures data transmitted as part of TCP
connections (flows), and stores the data in a way that is convenient
for protocol analysis and debugging.  Each flow is stored in its own
file, so a typical TCP connection is stored in two files, one for
each direction. tcpflow can also process packet captures stored by
'tcpdump'.

tcpflow stores all captured data in files that have names of the form:

       [timestampT]sourceip.sourceport-destip.destport[--VLAN][cNNNN]

where:
  timestamp is an optional timestamp of the time that the first packet was seen
  T is a delimiter that indicates a timestamp was provided
  sourceip is the source IP address
  sourceport is the source port
  destip is the destination IP address
  destport is the destination port
  VLAN is the VLAN port
  c is a delimiter indicating that multiple connections are present
  NNNN is a connection counter, when there are multiple connections with 
      the same [time]/sourceip/sourceport/destip/destport combination.  
      Note that connection counting rarely happens when timestamp prefixing is performed.

Here are some examples:

       128.129.130.131.02345-010.011.012.013.45103

  The contents of the above file would be data transmitted from
  host 128.129.130.131 port 2345, to host 10.11.12.13 port 45103.

       128.129.130.131.02345-010.011.012.013.45103c0005

  The sixth connection from 128.129.130.131 port 2345, to host 10.11.12.13 port 45103.

       1325542703T128.129.130.131.02345-010.011.012.013.45103

  A connection from 128.129.130.131 port 2345, to host 10.11.12.13 port 45103, that started
  at 5:18pm (-0500) on January 2, 2012.
  
       128.129.130.131.02345-010.011.012.013.45103--3

  A connection from 128.129.130.131 port 2345, to host 10.11.12.13
  port 45103 that was seen on VLAN port 3.
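
The naming scheme above can be taken apart mechanically when post-processing a directory of flows. The following is a minimal Python sketch, not part of tcpflow itself: the function name `parse_flow_name` and the returned dictionary keys are made up for illustration, and it assumes the default template with zero-padded IPv4 addresses and five-digit ports as shown in the examples. Filenames produced with a custom -F or -T template will not match.

```python
import re
from datetime import datetime, timezone

# Regular expression for the default naming scheme (IPv4 only):
#   [timestampT]sourceip.sourceport-destip.destport[--VLAN][cNNNN]
FLOW_NAME = re.compile(
    r"^(?:(?P<timestamp>\d+)T)?"
    r"(?P<src_ip>\d{3}\.\d{3}\.\d{3}\.\d{3})\.(?P<src_port>\d{5})-"
    r"(?P<dst_ip>\d{3}\.\d{3}\.\d{3}\.\d{3})\.(?P<dst_port>\d{5})"
    r"(?:--(?P<vlan>\d+))?"
    r"(?:c(?P<counter>\d+))?$"
)


def _unpad_ip(padded):
    """Turn '010.011.012.013' into '10.11.12.13'."""
    return ".".join(str(int(octet)) for octet in padded.split("."))


def parse_flow_name(name):
    """Split a flow filename into its components (hypothetical helper)."""
    m = FLOW_NAME.match(name)
    if m is None:
        raise ValueError("not a default-style tcpflow filename: %r" % name)
    ts = m.group("timestamp")
    vlan = m.group("vlan")
    counter = m.group("counter")
    return {
        # The timestamp is interpreted as seconds since the Unix epoch.
        "start": datetime.fromtimestamp(int(ts), tz=timezone.utc) if ts else None,
        "src_ip": _unpad_ip(m.group("src_ip")),
        "src_port": int(m.group("src_port")),
        "dst_ip": _unpad_ip(m.group("dst_ip")),
        "dst_port": int(m.group("dst_port")),
        "vlan": int(vlan) if vlan is not None else None,
        "counter": int(counter) if counter is not None else None,
    }


if __name__ == "__main__":
    print(parse_flow_name("1325542703T128.129.130.131.02345-010.011.012.013.45103"))
```

The start time is returned in UTC; convert it to a local zone if the capture host's wall-clock time is needed.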
   

You can change the template that is used to create filenames with the
-F and -T options.  If a directory appears in the template the directory will be automatically created.

If you use the -a option, tcpflow will automatically interpret HTTP responses.

       If the output file is
          208.111.153.175.00080-192.168.001.064.37314,

       Then the post-processing will create the files:
          208.111.153.175.00080-192.168.001.064.37314-HTTP
          208.111.153.175.00080-192.168.001.064.37314-HTTPBODY

       If the HTTPBODY was compressed with GZIP, you may get a 
       third file as well:

          208.111.153.175.00080-192.168.001.064.37314-HTTPBODY-GZIP

       Additional information about these streams, such as their MD5
       hash value, is also written to the DFXML file.
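
When many flows are processed with -a, it can help to collect each flow's derived files in one place. The sketch below is an illustration only: it assumes the three suffixes shown in the example above (`-HTTP`, `-HTTPBODY`, `-HTTPBODY-GZIP`), and the function name `group_http_outputs` is hypothetical. Other tcpflow versions may name post-processed files differently.

```python
from collections import defaultdict

# Suffixes taken from the example above. The longest comes first so that
# "-HTTPBODY-GZIP" is not mistaken for "-HTTPBODY" or "-HTTP".
HTTP_SUFFIXES = ("-HTTPBODY-GZIP", "-HTTPBODY", "-HTTP")


def group_http_outputs(filenames):
    """Map each base flow name to the files derived from it."""
    groups = defaultdict(dict)
    for name in filenames:
        for suffix in HTTP_SUFFIXES:
            if name.endswith(suffix):
                base = name[: -len(suffix)]
                groups[base][suffix.lstrip("-")] = name
                break
        else:
            # No known suffix: treat the name as the raw flow file itself.
            groups[name]["flow"] = name
    return dict(groups)


if __name__ == "__main__":
    base = "208.111.153.175.00080-192.168.001.064.37314"
    names = [base, base + "-HTTP", base + "-HTTPBODY", base + "-HTTPBODY-GZIP"]
    for flow, derived in group_http_outputs(names).items():
        print(flow, sorted(derived))
```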


tcpflow is similar to 'tcpdump', in that both process packets from the
wire or from a stored file. But it's different in that it reconstructs
the actual data streams and stores each flow in a separate file for
later analysis.

tcpflow understands sequence numbers and will correctly reconstruct
data streams regardless of retransmissions or out-of-order
delivery. However, tcpflow currently does not understand IP fragments; flows
containing IP fragments will not be recorded properly.
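
To illustrate what reconstructing a stream from sequence numbers involves, here is a simplified Python sketch. It is not tcpflow's implementation (tcpflow is written in C++ and handles many more cases). The function name `reassemble` and its inputs are assumptions for the example: `segments` is a list of `(sequence_number, payload)` pairs in arrival order, and `initial_seq` is the sequence number of the first data byte. It also assumes the stream is shorter than 4 GiB, so a single modulo handles sequence-number wraparound.

```python
SEQ_MODULUS = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def reassemble(segments, initial_seq):
    """Rebuild a byte stream from (seq, payload) pairs (illustrative only)."""
    chunks = {}
    for seq, payload in segments:
        # Offset of this segment relative to the first data byte.
        offset = (seq - initial_seq) % SEQ_MODULUS
        # A retransmission repeats an offset; keep the longest copy seen.
        if len(payload) > len(chunks.get(offset, b"")):
            chunks[offset] = payload

    stream = bytearray()
    for offset in sorted(chunks):
        if offset > len(stream):
            break  # a segment is missing; stop at the hole
        # Drop any leading bytes that earlier segments already supplied.
        stream += chunks[offset][len(stream) - offset:]
    return bytes(stream)


if __name__ == "__main__":
    arrived = [(1005, b" wor"), (1000, b"hello"), (1000, b"hello"), (1009, b"ld")]
    print(reassemble(arrived, initial_seq=1000))
```

Sorting by offset handles out-of-order delivery, and the slice on the last line of the loop discards bytes that overlap data already written, which covers both exact and partial retransmissions.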

tcpflow can output a summary report file in DFXML format. This file
includes information about the system on which the tcpflow program was
compiled, where it was run, and every TCP flow, including source and
destination IP addresses and ports, number of bytes, number of
packets, and (optionally) the MD5 hash of every bytestream. 

tcpflow uses the LBL Packet Capture Library (available at
ftp://ftp.ee.lbl.gov/libpcap.tar.Z) and therefore supports the same
rich filtering expressions that programs like 'tcpdump' support.  It
should compile under most popular versions of UNIX; see the INSTALL
file for details.

What use is it?
---------------

tcpflow is a useful tool for understanding network packet flows and
performing network forensics. Unlike programs such as Wireshark, which
show lots of packets or a single TCP connection, tcpflow can show
hundreds, thousands, or hundreds of thousands of TCP connections in
context. 

A common use of tcpflow is to reveal the contents of HTTP
sessions. Using tcpflow you can reconstruct web pages downloaded over
HTTP. You can even extract malware delivered as 'drive-by downloads.'

Jeremy Elson originally wrote this program to capture the data being
sent by various programs that use undocumented network protocols in an
attempt to reverse engineer those protocols.  RealPlayer (and most
other streaming media players), ICQ, and AOL IM are good examples of
this type of application.  It was later used for HTTP protocol
analysis.

Simson Garfinkel founded Sandstorm Enterprises in 1998. Sandstorm
created a program similar to tcpflow called TCPDEMUX and another
version of the program called NetIntercept. Those programs are
commercial. After Simson left Sandstorm he needed a TCP flow
reassembly program. He found tcpflow and took over its maintenance.

Bugs
----

Please report bugs on the [GitHub issue tracker](https://github.com/simsong/tcpflow/issues?state=open).

tcpflow currently does not understand IP fragments.  Flows containing
IP fragments will not be recorded correctly. IP fragmentation is
increasingly rare, so this does not seem to be a significant problem.


MAINTAINER
==========
Simson L. Garfinkel <simsong@acm.org>



ACKNOWLEDGEMENTS
================
Thanks to: 
* Jeffrey Pang, for the radiotap implementation
* Doug Madory, for the WiFi parser
* Jeremy Elson, for the original idea and initial TCP/IP implementation