Measurement Studies of End-to-End Congestion Control in the Internet
This page surveys measurement studies
that shed light on the state of end-to-end
congestion control in the Internet.
It first motivates the topic,
then moves on to the measurements themselves.
Future predictions
-
Video Road Hogs Stir Fear of Internet Traffic Jam.
Steve Lohr, New York Times, March 13, 2008.
(An account might be needed to view this article...)
"Projections that the increasing amount of data on the Internet
will cause user demand to overwhelm the available capacity are disputed
by many experts. At the current rate of growth, global Internet traffic
could quadruple by 2011."
-
Study Warns of Internet Brownouts By 2010.
Slashdot, November 19, 2007.
"Our findings indicate that although core fiber and switching/routing
resources will scale nicely to support virtually any conceivable user
demand, Internet access infrastructure, specifically in North America, will
likely cease to be adequate for supporting demand within the next three to
five years."
TCP-friendly congestion control
-
The
Concerns
about End-to-End Congestion Control page
contains pointers to articles on the need for end-to-end congestion control,
and a list of some potentially TCP-unfriendly protocols and applications.
-
The TCP-Friendly
Congestion Control page
summarizes work (up to 1999) on congestion control algorithms
for non-TCP based applications,
focusing on schemes that use the "TCP-friendly" equation.
-
TBIT, the
TCP Behavior Inference Tool, characterized the TCP implementations
of (many) web servers in the Internet.
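The "TCP-friendly" equation mentioned in this section is not reproduced on this page.
As a rough illustration only, the sketch below uses the widely cited simplified
square-root model, in which the steady-state rate is proportional to
MSS / (RTT * sqrt(p)). The equations used by specific schemes also account for
retransmit timeouts, so this is an approximation under stated assumptions, and the
function name and parameters are hypothetical.

```python
import math


def tcp_friendly_rate(mss_bytes: float, rtt_s: float, loss_rate: float) -> float:
    """Approximate steady-state TCP sending rate in bytes per second.

    Simplified square-root model (assumes no delayed ACKs and no timeouts):
        rate = sqrt(3/2) * MSS / (RTT * sqrt(p))
    """
    if mss_bytes <= 0 or rtt_s <= 0:
        raise ValueError("mss_bytes and rtt_s must be positive")
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    return math.sqrt(1.5) * mss_bytes / (rtt_s * math.sqrt(loss_rate))


if __name__ == "__main__":
    # Illustrative inputs: 1460-byte segments, 100 ms RTT, 1% packet loss.
    rate = tcp_friendly_rate(1460, 0.1, 0.01)
    print(f"{rate * 8 / 1e6:.2f} Mbit/s")
```

Under this model, quadrupling the loss rate halves the allowed sending rate, which is
why the measured packet drop rates listed on this page matter for non-TCP applications.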
Internet traffic and weather
-
The Internet Traffic Report
charts
global
and
regional
packet-loss indexes, with data up to 2008.
For February 2000,
the Global Packet Loss statistic averaged 4% packet loss,
and the North American Packet Loss statistic averaged 1% packet loss.
For February 2001, the Global Packet Loss statistic averaged 2% packet loss.
-
IEPM,
the Internet End-to-end Performance Monitoring Group,
with data from 2000, 2001, and 2002,
used
PingER to plot
monthly stats of packet loss rates.
SLAC's PingER data includes
continent-to-continent
and
country-to-country
loss rates between participating sites going back to April 2000, and
a
history of loss rates from SLAC since October 2000, showing the steady
decrease in these packet loss rates over the last two years.
Some traceroute results are also available from
SLAC
and
NLANR.
-
Internet quality of service assessment
from Telcordia.
They measured delays and bandwidth to the top 100 URLs, and to
100 random URLs.
-
The
Internet Weather Report
has been discontinued, replaced by the
Internet Average,
which (in 2003) reported latency, packet loss, and reachability for
the last day, week, or month.
They had
yearly data from 1993 up to at least 2003.
-
As of 2008,
the
Internet Health Report
reports on the latency (of establishing a TCP connection) between
agents at a number of major U.S. backbones, over the last
hour or day.
-
The
distribution of packet drop rates from traces
from the NIMI mesh in August, 2002, shows a median
packet drop rate of 0.7%.
Description of traces.
-
CAIDA's
Internet
Measurement Infrastructure
web page (updated in 2008) lists a number of Internet weather sites.
-
NLANR's
Active Measurement Project
was active up to 2006,
including data for
packet loss rates between pairs of nodes.
The nodes were in the US, largely at NSF-supported
High Performance Computing sites.
-
Measurements of the distribution of
round-trip times for an aggregate (1999-2002).
-
On the Constancy of Internet Path Properties,
Yin Zhang, Nick Duffield, Vern Paxson, and Scott Shenker,
ACM SIGCOMM Internet Measurement Workshop, 2001.
"We explore three different notions of constancy...
Using a large measurement dataset gathered from the NIMI infrastructure, we
then apply these notions to three Internet path properties: loss, delay, and
throughput."
Bandwidth used by different traffic types
-
CAIDA's Traffic
Workload Overview plots statistics showing the relative
bandwidth usage of TCP, UDP, ICMP, and other IP traffic at
selected sites, for 1999.
In addition, their
Graphs of Ames Internet Exchange (AIX) traffic
show the relative bandwidth usage of different traffic types
from May 1999 to March 2000.
In June 1999,
measurements at backbone routers, edge routers, and major exchange points
show 90-95% of the bytes belonging to TCP traffic.
-
February 2001 measurements from a transatlantic link, showing
that 95% of the bytes are from TCP.
Follow-up mail gives more information.
-
CAIDA's
demonstration of the realtime monitoring of traffic flows by
CoralReef shows traffic on the SDSC inbound link at CERFnet
sorted by protocols, applications, flows, source and
destination hosts, source and destination ASes, and source
and destination countries.
-
CAIDA's
Mantra
monitors the current state of multicast routing and of multicast routing
protocols.
-
Henning Schulzrinne's
Long-Term Traffic Statistics page
points to a range of traffic statistics, including statistics about the
relative bandwidth of voice and data traffic, from 1998.
-
Krishnamurthy et al.'s paper,
On the Use and Performance of Content Distribution Networks,
reports an order of magnitude increase in the number of origin sites
using CDNs (Content Distribution Networks) between 11/99 and 12/00.
-
Gigandet et al.'s paper,
The Inktomi
Climate Lab,
models the network traffic load at proxy caches.
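The byte shares quoted in this section (for example, 90-95% of bytes belonging to TCP)
are per-protocol fractions of the total bytes in a trace. A minimal sketch of that
calculation follows, assuming the trace has already been reduced to
(protocol, packet size) pairs; the function name and the sample packets are
hypothetical and are not taken from any of the studies listed here.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def byte_share_by_protocol(packets: Iterable[Tuple[str, int]]) -> Dict[str, float]:
    """Return each protocol's fraction of the total bytes in the trace."""
    totals: Counter = Counter()
    for protocol, size_bytes in packets:
        totals[protocol] += size_bytes
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {protocol: n / grand_total for protocol, n in totals.items()}


if __name__ == "__main__":
    # Made-up packets for illustration only.
    sample = [("TCP", 1500), ("TCP", 1500), ("UDP", 200), ("ICMP", 64)]
    for protocol, share in sorted(byte_share_by_protocol(sample).items()):
        print(f"{protocol}: {share:.1%}")
```

Counting packets instead of bytes would give a different breakdown, since TCP data
packets tend to be larger than typical UDP or ICMP packets.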
Multimedia traffic characterization
-
The following paper presents a preliminary analysis of streaming media
traffic originating from a popular Internet audio service:
Art Mena and John Heidemann,
An Empirical Study of Internet Audio Traffic,
to appear in Proc. of IEEE Infocom 2000,
March 2000.
-
The following paper describes a tool for gathering traces
of Internet multimedia traffic,
and presents examples of the types of analysis the tool enables:
Jacobus van der Merwe, Ramón Cáceres, Yang-hua Chu, and Cormac Sreenan,
mmdump: A Tool for Monitoring Internet Multimedia Traffic,
AT&T Labs-Research TR 00.2.1,
February 2000.
Network Measurements at Specific Sites
-
The
UCLA Network Weather Report, and the
UCR Internet Reachability Report, both active in 2008.
-
An OC-192 backbone link in 2006:
W. John and S. Tafvelin,
Analysis of Internet Backbone Traffic and Header Anomalies
Observed,
IMC 2007.
This paper reports on packet size distribution,
transport protocol breakdown (96-97% TCP bytes),
ECN use (not used),
IP options (not used),
TCP MSS and SACK options (widely used), and other properties
of aggregate traffic.
- Sprint:
-
Packet Trace Analysis from Sprint ATL.
Measurements include link utilization, number of active flows,
traffic breakdown by protocol and by application, and packet size
distribution, on traces from 2000 to 2005.
-
Chuck Fraleigh et al,
Packet-Level Traffic Measurements from the Sprint IP Backbone,
IEEE Network, November/December 2003, V.17 N.6.
This paper includes measurements of traffic breakdown by application
(e.g., web, peer-to-peer); packet size distribution;
TCP round-trip times; and out-of-sequence packets.
"We observe 1-6 percent of streaming traffic." In all cases, over
90 percent of the traffic is TCP.
-
S. Jaiswal et al,
Measurement and Classification of Out-of-Sequence Packets in a Tier-1 IP
Backbone, Infocom 2003, also IEEE/ACM Transactions on Networking 2007.
"Our measurements
show a relatively consistent rate of out-of-sequence packets of
approximately 4%. We observe that a majority of out-of-sequence
packets are retransmissions, with a smaller percentage resulting
from in-network reordering." The traces are from 2002.
- SLAC:
The daily and historical
NETFLOW Status Report
shows traffic breakdown by protocol, application, and SLAC program.
2005-2007.
- Internet2:
Weekly Reports from Internet2 show data about bulk TCP
performance,
protocol distribution, ECN-capable traffic, application types,
and the like for traffic on Abilene. 2002-2007.
-
R. Nelson, D. Lawson, and P. Lorier,
Analysis of Long Duration Traces, CCR, January 2005.
"Some typical analyses are presented, covering protocol mix,
network trip times, and TCP flag analysis."
-
K. Papagiannaki, D. Veitch, and N. Hohn,
Origins of Microcongestion in an Access Router,
Passive and Active Measurement Workshop, 2004.
"We found that the link bandwidth reduction factor of 16 (from
OC-48 to OC-3) played a significant role in delay buildups."
"Finally the effect of individual 5-tuple flows, and sets of ‘bursty’
flows, was found to be small in most cases."
-
The
Tstat
tool has been used to collect a range of TCP/IP statistics from
traces
on an access link in Italy, for 2000-2002.
Statistics include IP protocol, TOS, and TTL fields,
flow length in bytes or packets,
port numbers, options, and advertised windows,
MSS, flight size, out-of-sequence and duplicate burst sizes,
and RTTs.
-
2007
Network Performance Statistics from the UW-Madison campus network.
This includes traffic breakdowns by IP protocol, application, and access link
bandwidth.
-
Measurements from the University of Auckland for 2001 show
information such as
packet losses
and
throughput
on the 4 Mbps access link
to the 100 Mbps local network.
-
Yubing Wang, Mark Claypool, et al,
An Empirical Study of RealVideo Performance Across the Internet,
IMW 2001.
"Overall video performance is most influenced by the bandwidth of the
end-user connection to the Internet, but high-bandwidth
Internet connections are pushing the video performance
bottleneck closer to the server."
-
Dmitri Loguinov and H. Radha,
Measurement Study of Low-bitrate Internet
Video Streaming, ACM SIGCOMM Internet Measurement Workshop (IMW),
November 2001.
This paper studied packet-loss rates
from a seven-month study of video traffic in the United States.
Of the dial-up connections that were able to provide packet drop rates
less than 15%, 75% experienced loss rates below 0.3%, and 91%
experienced loss rates below 2%.
- Hao Jiang and C. Dovrolis's paper,
Passive Estimation of TCP Round-Trip Times,
estimates the distribution of round-trip times, by connection and
by byte, from several tcpdump traces.
-
Mark Allman's
A Web Server's View of the Transport Layer
looks at path properties and TCP and HTTP protocol behavior from
clients to a particular web server. The paper also reports on the range
of round-trip times and of packet sizes.
-
The
MAWI Working Group Traffic Archive has trans-Pacific packet traces
from 1999-2003,
with
each day's statistics giving the protocol breakdown
and the ten biggest flows, including
outliers like
June 18, 2000, when 57% of the packets were from DNS traffic.
-
Michael S. Borella, Debbie Swider, Suleyman Uludag, Gregory B. Brewster,
Internet Packet Loss: Measurement and Implications for End-to-End
QoS, 1998.
"We analyze a month of Internet packet loss statistics for speech
transmission using three different sets of
transmitter /receiver host pairs." The packet drop rates for the
three paths, all in the U.S., ranged from 0.4% to 3.5%.
-
Hari Balakrishnan, Venkata N. Padmanabhan, Srinivasan Seshan, Mark
Stemm, Randy H. Katz,
TCP Behavior of a Busy Internet Server: Analysis and Improvements,
Infocom 1998.
From 1996 traces from the web server for the Atlanta Olympic games,
the total average packet drop rate was 0.5%.
-
The Nature of the Beast: Recent Traffic Measurements
from an Internet Backbone, from
K Claffy, G. Miller, and K. Thompson, gives
traffic measurements from the MCI backbone in 1998.
-
Trends in Wide Area IP Traffic Patterns:
A View from Ames Internet Exchange,
by S. McCreary and KC Claffy, describes trends in measurements
from May 1999 through March 2000. The trends include decreases
in RealAudio traffic, and increases in traffic from online games,
Napster, and IPSec.
Older Measurements
Mice and elephants
Colloquially, "mice" are short web transfers, and "elephants" are
long-lived connections or sessions. Several measurement studies
have shown that, while most of the connections represented in a
trace are mice, most of the bytes or packets in a trace generally
come from the elephants.
-
From Figure 9 of
Wide-Area Traffic: The Failure of Poisson Modeling by Paxson and Floyd,
1995, FTPDATA connections within a session are clustered into
bursts, and in one trace, half of the FTP traffic volume in bytes
comes from the largest 0.5% of the bursts. The traces are from
LBL, UCB, DEC, and UK, from 1994, 1989, 1995, and 1991 respectively.
In all of these traces, the largest 2% of the bursts account for
at least half of the bytes.
-
For the 1995
dataset of web client workloads
discussed in
Changes in Web Client Access Patterns: Characteristics and Caching
Implications by Barford et al., the largest 1% of the files
accounted for over half of the bytes transferred. In the 1998
dataset, the largest 5% of the files accounted for half of the bytes
transferred.
-
From Figure 2A in
Load-Sensitive Routing of Long-Lived IP Flows by Shaikh, Rexford,
and Shin, 1999, of the port-to-port flows in the ISP traffic trace,
more than 85% of the packets are from the roughly 20% of the flows
that have more than 10 packets. More than 50% of the packets are
from the less than 2% of the flows that have more than 130 packets.
-
In a July 2000 trace of wide-area traffic to and from UC Berkeley,
while half the flows were at most four packets long, only
10% of the bytes were from flows of 10 packets or less; only 25% of the
bytes were from flows of 100 packets or less; and 50% of the bytes
were from flows of 650 packets or less. [Ratul Mahajan]
-
We are aware of only one trace
that is dominated by mice rather than by elephants.
Vern Paxson reports that for an October 1998 trace of traffic
to and from a Yahoo image server,
more than half of the packets were from flows of at most twelve packets.
-
The
Self-Similarity and Long Range Dependence in Networks page
lists references on heavy-tailed distributions in network traffic.
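Several of the figures in this section have the form "the largest x% of the flows
account for y% of the bytes". The sketch below shows one way such a figure could be
computed from a list of per-flow byte counts. The function name and the sample flow
sizes are hypothetical, and the studies above may define flows, bursts, or files
differently.

```python
import math
from typing import Sequence


def top_share(flow_bytes: Sequence[int], top_fraction: float) -> float:
    """Fraction of all bytes carried by the largest `top_fraction` of flows."""
    if not flow_bytes:
        raise ValueError("flow_bytes must not be empty")
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    total = sum(flow_bytes)
    if total == 0:
        raise ValueError("trace contains no bytes")
    ordered = sorted(flow_bytes, reverse=True)
    # Always include at least one flow, rounding the count up.
    k = max(1, math.ceil(top_fraction * len(ordered)))
    return sum(ordered[:k]) / total


if __name__ == "__main__":
    # Made-up trace: one "elephant" and nine "mice".
    flows = [1000] + [10] * 9
    print(f"Top 10% of flows carry {top_share(flows, 0.1):.1%} of the bytes")
```

The same function applied to per-flow packet counts, rather than byte counts, gives
the packet-based variant of the statistic.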
Traffic measurements using global routing data
-
Cowie, J., Ogielski, A., Premore, B., and Yuan, Y.,
Global Routing Instabilities during Code Red II and Nimda
Worm Propagation.
Preliminary report on strong correlations between
BGP message storms and propagation periods for Microsoft worms such as
Code Red and Nimda.
"What were thought to be purely traffic-based denials of service in
fact are seen to generate widespread
end-to-end routing instability originating at the Internet's edge."
Congestion and Topology
-
A. Akella, S. Seshan, and A. Shaikh,
An Empirical Evaluation of Wide-Area Internet Bottlenecks, 2003.
"Conventional wisdom has been that the performance limitations in
the current Internet lie at the edges of the network i.e last mile
connectivity to users, or access links of stub ASes. As these links
are upgraded, however, it is important to consider where new bottlenecks
and hot-spots are likely to arise."
Tools and infrastructures for measurement and analysis
Ramón Cáceres
and Sally Floyd
Last modified in December 2008.