This is only a rough draft - Megan 04/10/92
Summary of IETF BOF on Network Statistics and Analysis
1. Introduction
The purpose of this BOF is to instigate discussion
and information exchange within the community concerning
research in wide-area network traffic measurements.
Five brief presentations of related research were made,
followed by discussion of each.
One theme of the BOF was to discuss exactly what kind
of network instrumentation, measurement facilities, and
types of measurements should be recommended to the Internet
community. Many of us would like to encourage the managers
of stub networks and routers to collect and make available
information similar in spirit to the statistics
that NSFNET makes available through Merit/NSFNET Information
Services (NIS.NSF.NET). We hope this effort evolves
into an RFC, and eventually leads to a widespread
cooperative effort. We freely admit that the road to success
will be an iterative process, fraught with plenty of
challenging technical details.
The amount of space consumed by this data depends entirely on the
type of measurement. For example, collecting TCP SYN/FIN/RST packets
could lead to hundreds of megabytes a day, depending on the collection
site. Other methods, like sampling or recording the quantity of bytes
sent to particular destination networks might require less than a hundred
kilobytes a month. In the first case, the volume of trace data can be
on the order of one to two percent of the traffic itself, so the resulting
data may have to be shipped on tape rather than sent electronically to
the site where the network analysis will take place.
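The one-to-two-percent figure above can be turned into a rough sizing
estimate. The sketch below is only an illustration: the function name and
the 10 GB/day traffic figure are assumptions chosen for the example, not
numbers reported at the BOF.

```python
GB = 10**9
MB = 10**6


def trace_bytes_per_day(traffic_bytes_per_day: float, trace_fraction: float) -> float:
    """Estimate daily trace volume as a fixed fraction of the traffic carried."""
    if not 0.0 <= trace_fraction <= 1.0:
        raise ValueError("trace_fraction must be between 0 and 1")
    return traffic_bytes_per_day * trace_fraction


# Hypothetical collection site carrying 10 GB/day (assumed, not from the minutes).
daily_traffic = 10 * GB
low = trace_bytes_per_day(daily_traffic, 0.01)   # one percent of traffic
high = trace_bytes_per_day(daily_traffic, 0.02)  # two percent of traffic
print(f"{low / MB:.0f} to {high / MB:.0f} MB of trace data per day")
```

At that assumed traffic level the estimate lands in the "hundreds of
megabytes a day" range mentioned above.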
The Internet Activities Board (IAB) recently announced guidelines for
measurement activities. RFC 1262 lists bounds that should be commonly
acceptable. However, RFC 1262 directly addresses invasive
measurement activities, and is only marginally applicable
to passive data collection. We believe we will have to face
many new issues hitherto unaddressed. What we propose must honor
the concerns and restrictions that individual networks may impose,
yet be thorough enough to capture the data that we need to accomplish
the research goals, and should allow for flexibility. An example
of a difficult issue to resolve is privacy when using network
addresses, particularly because workstations with their own IP
addresses frequently map to individual users. Our efforts should
address privacy measures that still allow professional research
to be conducted.
Most likely, each of us has a different idea as to the data we need
to have measured to achieve our various objectives. Below,
we summarize these motivations and give a preliminary list of the
measurements and trace data that we believe should be collected or capturable.
We encourage you all to add to both the motivation list and chart
of traces and measurements, and mail them back to wanchar@usc.edu
for inclusion in this document.
2. Motivations
2.1 Artificial workload models (Danzig and Jamin)
Good artificial workload models are needed to drive simulations of
new resource management algorithms, flow control algorithms, and
routing algorithms. The artificial workload models that we are
developing consist of an application specific model (ftp, telnet, nntp, etc.)
and an application arrival rate model that is stub network dependent.
So far we have been able to identify applications from their port
numbers. As new transport protocols emerge, we may need other mechanisms.
Creating the application specific model requires full traces of TCP/IP
packet headers. Creating the stub network specific model requires
traces of TCP SYN/FIN/RST packets only. Most of our data has been collected
with statspy or tcpdump from a machine on the same
Ethernet segment as the stub network's gateway to the backbone.
We would like to collect SYN/FIN/RST traces from hundreds of stub
networks. Given current network bandwidth and usage, these traces
can range up to 200MB/day.
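As an illustration of the two steps described above (keeping only
SYN/FIN/RST packets and identifying the application from the port number),
here is a minimal sketch. It is not the statspy or tcpdump implementation;
the function names and the small port table are assumptions made for the
example. The flag bit values and header layout follow the standard TCP
header.

```python
import struct

# TCP flag bits in the flags byte of the TCP header.
FIN, SYN, RST, ACK = 0x01, 0x02, 0x04, 0x10

# Well-known ports for the applications named in the text (illustrative subset).
WELL_KNOWN_PORTS = {20: "ftp-data", 21: "ftp", 23: "telnet", 119: "nntp"}


def parse_tcp_header(header: bytes):
    """Return (src_port, dst_port, flags) from the fixed part of a TCP header."""
    if len(header) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    src_port, dst_port, _seq, _ack, _offset, flags = struct.unpack(
        "!HHIIBB", header[:14]
    )
    return src_port, dst_port, flags


def is_connection_event(flags: int) -> bool:
    """True for packets that open or close a conversation (SYN, FIN or RST set)."""
    return bool(flags & (SYN | FIN | RST))


def classify(src_port: int, dst_port: int) -> str:
    """Guess the application from whichever port is a well-known one."""
    return WELL_KNOWN_PORTS.get(dst_port) or WELL_KNOWN_PORTS.get(src_port) or "other"


def make_header(src_port: int, dst_port: int, flags: int) -> bytes:
    """Build a synthetic 20-byte TCP header, for testing only."""
    return struct.pack("!HHIIBBHHH", src_port, dst_port, 0, 0, 5 << 4, flags, 0, 0, 0)


syn = make_header(1025, 23, SYN)
src, dst, flags = parse_tcp_header(syn)
if is_connection_event(flags):
    print(classify(src, dst), "conversation event")
```

A collector built this way stores one short record per conversation
boundary instead of one per packet, which is why these traces are so much
smaller than full header traces.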
2.2 Network planning (Braun and Claffy)
SDSC and UCSD are undertaking a network analysis effort
with multiple goals of immediate applicability
and interest to the Internet environment, with respect
to both performance and ubiquity.
Areas of current investigation include: measurements and
analysis of resource consumption and latencies, network
performance degradation under resource starvation, and
end-to-end performance testing. We have determined, for
selected data sets, characteristics of network usage by
application, bandwidth requirements, and geographic distribution.
We are also exploring the role that granularity plays in traffic
analysis, both in statistical sampling of traffic on an
operational basis, and in the level of detail at which
one presents data to optimize the information/noise ratio.
We are currently analyzing data from a variety of
sources, including national networks as well as federal network
interconnection points of multiple agencies.
Statistical examination and manipulation of data reveals
significant traffic correlations, trends, and dependencies.
We are also undertaking collaborative efforts with Toshiya
Asaba and the WIDE statistics working group in Japan.
In particular, Asaba is largely responsible for the
analysis scripts which facilitated statistical
examination and data presentation. We first intended
the scripts for use in a study of international traffic
between Japan and other nations. We were able to adapt
the scripts for use in subsequent studies. Building a
public library of usable scripts for different analysis
tasks requires agreement on data formats in multiple
phases of collection and analysis. We would like to
see a collaborative effort within the community toward
accomplishing such a task.
Further information and slides are available by
sending requests to the SDSC Applied Network Research
Group, via hwb@sdsc.edu or kc@sdsc.edu
2.3 Stateful router studies (Estrin and Mitzel)
[Related information, though not presented at the BOF.]
The current Internet is based on a stateless (datagram) architecture.
However, many recent proposals rely on the maintenance of state
information within network routers, leading to our interest in the
implications of a ``stateful'' network layer.
We wish to collect internetwork traffic traces at the border routers
of stub and transit networks, and use this data to evaluate, or
predict, the effects of design alternatives for stateful architectures.
An important design decision is the level at which conversations are
defined. This determines the granularity of control over the network
traffic, and affects the scalability of the system. We are interested in
several granularities of conversations, ranging from
a single TCP application association, up to aggregation of all traffic
between two communicating networks. We will use the data to estimate the
number of active conversations at a router, and derive the
storage requirements for the associated conversation state table. We
will analyze the feasibility of fine grain control at the network
periphery and deeper within the network.
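The effect of conversation granularity on state table size can be
sketched as follows. This is an illustration under stated assumptions, not
the authors' analysis code: the record layout, the idle timeout used to
decide when a conversation is no longer active, the treatment of
conversations as directed, and all function names are hypothetical.
Network numbers are derived with classful rules, and class D/E addresses
are simply treated like class C here.

```python
def classful_network(ip: str) -> str:
    """Return the classful network number of a dotted-quad IPv4 address."""
    octets = ip.split(".")
    first = int(octets[0])
    if first < 128:      # class A: one network octet
        keep = 1
    elif first < 192:    # class B: two network octets
        keep = 2
    else:                # class C (and, as a simplification, D/E): three octets
        keep = 3
    return ".".join(octets[:keep])


def conversation_key(record, granularity: str):
    """Map a (time, src, sport, dst, dport) record to its conversation key."""
    _ts, src, sport, dst, dport = record
    if granularity == "tcp":
        return (src, sport, dst, dport)
    if granularity == "host":
        return (src, dst)
    if granularity == "network":
        return (classful_network(src), classful_network(dst))
    raise ValueError(f"unknown granularity: {granularity}")


def peak_active_conversations(records, granularity: str, idle_timeout: float) -> int:
    """Largest number of simultaneously active conversations in a time-sorted trace."""
    last_seen = {}
    peak = 0
    for record in records:
        now = record[0]
        # Drop conversations that have been idle longer than the timeout.
        for key in [k for k, t in last_seen.items() if now - t > idle_timeout]:
            del last_seen[key]
        last_seen[conversation_key(record, granularity)] = now
        peak = max(peak, len(last_seen))
    return peak


# Synthetic trace using private addresses (not measured data).
trace = [
    (0.0, "172.16.1.1", 1025, "172.17.1.1", 23),
    (1.0, "172.16.1.1", 1026, "172.17.1.1", 21),
    (2.0, "172.16.1.2", 1025, "172.17.1.9", 119),
    (100.0, "172.16.1.1", 1027, "192.168.7.5", 23),
]
for granularity in ("tcp", "host", "network"):
    print(granularity, peak_active_conversations(trace, granularity, idle_timeout=60.0))
```

Multiplying the peak count by the per-entry state size gives a first
estimate of the storage a router would need at each granularity.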
In conventional IP, the only lookup function normally required for
packet forwarding is a routing table lookup. This has been recognized
as a bottleneck in the forwarding process [Feldmeier, Jain].
It has been shown that the introduction of an LRU cache can substantially
improve the efficiency of the packet forwarding process. Route
caching is used in many existing routers. However, unlike the
stateful schemes investigated here, which require lookup based
on source--destination pairs, current route caches are based only
on destination host or network. It is not intuitively obvious whether
the solutions developed for routing table caches can be applied here.
We will use our network traffic traces to
perform trace driven simulations of an LRU cache, for different
conversation granularities, and thereby assess traffic locality and
the benefits of caching.
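A trace-driven LRU cache simulation of the kind described above can be
sketched in a few lines. The function name, the cache size, and the tiny
synthetic trace are assumptions for illustration; a real study would feed
in keys derived from collected traces at each conversation granularity.

```python
from collections import OrderedDict


def lru_hit_ratio(keys, cache_size: int) -> float:
    """Replay a sequence of lookup keys through an LRU cache; return the hit ratio."""
    if cache_size <= 0:
        raise ValueError("cache_size must be positive")
    cache = OrderedDict()  # oldest entry first, most recently used last
    hits = 0
    total = 0
    for key in keys:
        total += 1
        if key in cache:
            hits += 1
            cache.move_to_end(key)         # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict the least recently used entry
    return hits / total if total else 0.0


# Synthetic (source, destination) lookups, not measured data.
trace = [("A", "X"), ("B", "X"), ("A", "X"), ("C", "Y"), ("B", "X"), ("A", "Y")]

by_destination = [dst for _src, dst in trace]  # key used by current route caches
by_pair = list(trace)                          # key needed by stateful schemes

print("destination key:", lru_hit_ratio(by_destination, cache_size=2))
print("source-destination key:", lru_hit_ratio(by_pair, cache_size=2))
```

In this toy trace the same cache holds far fewer useful entries once the
key includes the source, which is the kind of locality question the
simulations are meant to answer with real data.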
2.4 Network monitoring (Schwartz and Pu)
Schwartz proposed that a dozen or so of us agree to
collaborate to collect traces and measurements. He also described
his recent study of FTP traffic, which showed that tools to
locate copies of large, replicated files may reduce wide-area
network traffic due to FTP. The unique aspect of Schwartz's traces
was that they examined application-level data in a
way that preserved privacy.
2.5 Host reliability and availability (Long)
Long summarized his study of Internet host reliability and
availability. This was the only active form of tracing
discussed during the BOF.
3. Measurements and traces
Here is a first pass at the type of data we would like to see
collected, and what studies would use this data. These categories
need to be detailed, and new categories probably need to be
filled in. The table identifies four types of data to collect.
These include captured packets and packet headers (excluding
data), headers of selected packets, summary data, and routing and
congestion data. The first three types of data are fairly well
defined, while the last is much less so. Although we can collect such
data from anywhere in the Internet, we group the measurement points
into three classes: entrances to stub networks, regional and backbone
routers, and international gateways.
                             TYPE OF DATA

            | Captured  | TCPDUMP      | NIS.NSF.NET- | Router       |
M           | Packets & | Conversation | like data    | Timing and   |
E           | Packet    | SYN/FIN/RST  |              | Queue length |
A           | Headers   | Traces       |              | (MIB)        |
S --------------------------------------------------------------------
U           | Workload  | Workload     |              | Congestion   |
R STUB      | models    | models       |              | studies      |
E NETWORKS  |           |              |              |              |
M           | Workload  |              | Workload     |              |
E           | Planning  |              | Planning     |              |
N --------------------------------------------------------------------
T           |           |              |              |              |
  REGIONAL  |           |              |              |              |
  AND       | Stateful  |              |              | Congestion   |
  BACKBONE  | Routers   |              |              | studies      |
P NETWORKS  |           |              |              |              |
O           | Workload  |              | Workload     |              |
I           | Planning  |              | Planning     |              |
N --------------------------------------------------------------------
T           |           |              |              | Congestion   |
  INTER-    |           |              |              | studies      |
  NATIONAL  |           |              |              |              |
  GATEWAYS  |           |              |              |              |
            | Workload  |              | Workload     |              |
            | Planning  |              | Planning     |              |
  --------------------------------------------------------------------

                               Table 1.
4. Trace formats and tools
We need to define the storage format for trace and statistical data.
For some tools, like tcpdump or statspy, the format is already defined.
Almost certainly we should adopt NSFNET's current format for the type of data
they collect. We also need to define ``sanitizer'' programs that address the
security and privacy concerns of particular networks.
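One possible shape for a ``sanitizer'' is sketched below. It replaces each
IP address with a keyed pseudonym, so the same host always maps to the same
token but the real address cannot be read from the trace. This is a
swapped-in illustration, not a scheme proposed at the BOF: the record
layout, the function names, the key handling, and the choice of
HMAC-SHA-256 are all assumptions.

```python
import hashlib
import hmac


def sanitize_address(address: str, secret_key: bytes) -> str:
    """Replace an IP address with a stable pseudonym derived from a secret key."""
    digest = hmac.new(secret_key, address.encode("ascii"), hashlib.sha256).hexdigest()
    return "host-" + digest[:12]


def sanitize_record(record, secret_key: bytes):
    """Sanitize one (time, src, sport, dst, dport) record; ports are kept as-is."""
    ts, src, sport, dst, dport = record
    return (
        ts,
        sanitize_address(src, secret_key),
        sport,
        sanitize_address(dst, secret_key),
        dport,
    )


# Hypothetical key; a collection site would keep its own key private.
key = b"site-local secret"
record = (0.0, "172.16.1.1", 1025, "172.17.1.1", 23)
print(sanitize_record(record, key))
```

Because the mapping is consistent within one key, per-host and
per-conversation statistics survive sanitization. Note that this sketch
also hides the network number, so a study that needs per-network
aggregation would have to treat the network part separately.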
There is an operations area in IETF which has been defining some standard
transport and storage formats for various kinds of operational data.
Dealing with gigabytes of data has a serious resource impact.
An effort has to be undertaken to identify schemes to make such large
quantities of data useful, possibly via multiple levels of data reduction.
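Multiple levels of data reduction might look like the following sketch,
which first collapses per-packet records into hourly byte counts per
destination network and then collapses those into per-network totals. The
record layout, the function names, and the choice of hourly bins are
assumptions for illustration only.

```python
from collections import Counter


def reduce_to_hourly(records) -> Counter:
    """Level 1: sum bytes per (hour, destination network) from per-packet records."""
    hourly = Counter()
    for timestamp, dst_network, nbytes in records:
        hourly[(int(timestamp // 3600), dst_network)] += nbytes
    return hourly


def reduce_to_totals(hourly: Counter) -> Counter:
    """Level 2: sum the hourly counts into one total per destination network."""
    totals = Counter()
    for (_hour, dst_network), nbytes in hourly.items():
        totals[dst_network] += nbytes
    return totals


# Synthetic records: (seconds since start, destination network, packet bytes).
packets = [
    (10.0, "172.17", 512),
    (20.0, "172.17", 1024),
    (4000.0, "172.17", 256),
    (4100.0, "192.168.7", 128),
]
hourly = reduce_to_hourly(packets)
totals = reduce_to_totals(hourly)
print(dict(hourly))
print(dict(totals))
```

Each level can be archived separately, so a study that only needs totals
never has to touch the raw trace.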
5. Mailing list
The current composition of wanchar@usc.edu is listed below.
Change requests can be sent to wanchar-request@usc.edu
afs@germany.eu.net
ala@merit.edu
amr@nri.reston.va.us
asaba@isr.recruit.co.jp
bac@sdsc.edu
becker@ans.net
boss@sunet.se
brunner@practic.com
calton@cs.columbia.edu
carson@utcs.utoronto.ca
cbagwell@gateway.mitre.org
chris@wugate.wustl.edu
cjw@nersc.gov
cward@westnet.net
dan@merlin.dev.cdx.mot.com
danzig@usc.edu
darrell@cse.ucsc.edu
estrin@usc.edu
fair@apple.com
golding@cis.ucsc.edu
goodwin@psc.edu
gruth@bbn.com
henry@oar.net
hwb@sdsc.edu
jamin@usc.edu
jfl@nersc.gov
jgodsil@ncsa.uiuc.edu
jkay@cs.ucsd.edu
jonchy@dxcoms.cern.ch
jrc@uswest.com
jun@wide.ad.jp
kc@sdsc.edu
kfall@cs.ucsd.edu
korz@bach.cs.columbia.edu
kr@concord.com
lear@sgi.com
lindahl@violet.berkeley.edu
lwinkler@anl.gov
mak@cnd.hp.com
mak@merit.edu
mankin@gateway.mitre.org
martin@cearn.cern.ch
medin@nsipo.nasa.gov
morris@ucar.edu
mws@sparta.com
nevil@aukuni.ac.nz
nitzan@ws1013.nersc.gov
ogud@cs.umd.edu
peter@usc.edu
peterd@cc.mcgill.ca
polyzos@cs.ucsd.edu
probins@bubba.wpd.sgi.com
pushp@cerf.net
rama@erlang.enet.dec.com
rbutler@ncsa.uiuc.edu
rcollet@icm1.icp.net
reschly@brl.mil
rgc@qsun.att.com
rin@qsun.att.com
rj@sgi.com
schwartz@cs.colorado.edu
sherk@sura.net
stats@nic.near.net
suelin@ibm.com
tmwalden@saturn.sys.acc.com
tom@cic.net
topolcic@nri.reston.va.us
van@horse.ee.lbl.gov
vcerf@nri.reston.va.us
vern@horse.ee.lbl.gov
vikas@jvnc.net
vu@polaris.dca.mil
whaley@ncsc.org