Internet Engineering Task Force                              Sally Floyd
INTERNET DRAFT                                                      ICSI
draft-flows-00c.txt                                        Ratul Mahajan
                                                                      UW
                                                             April, 2003


     Router Primitives for Protection Against High-Bandwidth Flows
                             and Aggregates


Status of this Memo


   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   The research literature includes many proposals to control high-
   bandwidth flows and aggregates to protect the remaining traffic. This
   document considers a framework for this based on rate-limiters before
   the output queue, and describes a set of primitives in routers that
   could support this framework.  The aim is to gather feedback on the
   feasibility of these primitives as a concrete step towards getting
   these mechanisms deployed.


Floyd, Mahajan                                                  [Page 1]


draft-flows-00c                                               April 2003


1.  Introduction.

   The network needs to protect itself and the rest of the traffic from
   the adverse impacts of high-bandwidth flows and aggregates.

   The key goal is to stop high-bandwidth flows or aggregates from
   consuming all the bandwidth at a time of high congestion. An
   additional reason for restricting such flows is to help maintain an
   incentive for users to use end-to-end congestion control for best-
   effort traffic.  (Thus, depending on local policy, one part of this
   framework might be to more aggressively restrict the bandwidth of
   those high-bandwidth flows determined to be misbehaving, that is, not
   to be using conformant end-to-end congestion control.)

   An aggregate is a collection of packets from one or more flows that
   have some property (e.g., destination prefix, protocol type, etc.) in
   common. Examples of events that generate high-bandwidth aggregates
   are DDoS attacks, flash crowds, and worms.  The purpose of
   restricting the bandwidth given to the aggregate is to preserve some
   fraction of the bandwidth for the general traffic that is not in that
   aggregate.

   This document does not propose specific mechanisms to counter the
   above threat. Instead it describes the overall framework of using
   rate-limiters before the output queue to restrict flows and/or
   aggregates, along with the primitives required from routers to
   support this framework. The goal is to gather feedback on whether the
   implementation of such primitives (and thus the mechanisms) is
   feasible.

   Other frameworks have also been proposed for controlling high-
   bandwidth flows and aggregates in routers, including scheduling-based
   frameworks based on lower-priority queues and preferential-dropping-
   based frameworks based on differential treatment in the output queue
   itself.  In this document we only consider the framework of rate-
   limiters before the output queue;  comparisons with other frameworks
   have been presented in [MF01] and [MBF+01].


2.  The Framework


   The overall rate-limiting framework is shown below in Figure 1. All
   high-level policy decisions such as which flows or aggregates to
   rate-limit or what their rate-limits should be are made in the dotted
   box labeled policy. It is not necessary that this policy be
   implemented in the router itself; it can be implemented in a box
   outside of the router. The policy box needs various inputs from the
   routers to make its decisions. In the following sections, we break
   down the functionality of this box into components, and describe the
   router support required to implement them.

   Figure 1 shows a rate-limiter in front of the output queue; in some
   router architectures, rate-limiters could be placed in front of the
   input queues as well.  The rate-limiter is programmed by the policy
   box.  The rate-limiter looks at each incoming packet, and decides
   whether to drop or mark this packet before forwarding it to the
   output queue, based on the past history of the specific flow or
   aggregate containing that packet.  The rate-limiter includes no
   queues, but merely looks at arriving packets one at a time.

   For example, the rate-limiter could determine if the arriving packet
   belongs to a rate-limited flow or aggregate, dropping or ECN-marking
   some of these packets as part of the rate-limiting strategy [MFW01,
   ECB01].  There are other proposals for rate-limiters that do not
   include the explicit identification of flows [PBPS01].  (Some
   aggregates could respond to preferential ECN-marking in the rate-
   limiter, while rate-limiters would have to use dropping instead of
   marking for more recalcitrant flows and aggregates.)  All packets not
   dropped in the rate-limiter are passed to the output queue.

   The rate-limiter is the only component in the fast path before the
   output queue, and is already present in most routers in some form;
   Cisco's Committed Access Rate (CAR) is one example.

   The output queue is assumed to be a FIFO output queue, optionally
   preceded by an Active Queue Management (AQM) module that decides
   whether to drop the arriving packet, ECN-mark it, or forward it to
   the output queue unmarked.  We make no assumption about the AQM
   mechanism in the output queue; it could even be a Drop-Tail queue
   with no AQM mechanism at all (particularly if we don't use the packet
   drop history to estimate the arrival rates of flows).  The assumption
   is that the AQM module in the output queue makes drop or marking
   decisions based on the level of congestion in the output queue,
   rather than on the past history of the specific flow or aggregate
   containing that packet.


             ______________              ___________________
            |              |            |                   |
         -->| Rate-limiter |----------->| AQM, output queue |-->
          : |______________|            |___________________|
          :       ^                               :
          :       :                               :
         .v.......v...............................v.
         :                                         :
         :                Policy                   :
         :.........................................:

    Figure 1:  Packet flow with rate-limiter before the output queue.

3. Essential Functionality

   This section describes the essential functionality that would be
   part of any rate-limiting strategy.

3.1 Identification of high-bandwidth flows or aggregates

   The identification of high-bandwidth flows or aggregates requires
   having some idea of the traffic arriving at the router. This can be
   accomplished using one of the following as inputs into the policy
   box.

    a. Headers of all traffic.
    b. Headers of a random sample of traffic.
    c. Headers of packets dropped at the output queue (only when dropped
       packets can be considered to be a random sample, as is the case
       for RED).

   Note that the first two do not necessarily require router support;
   they can be accomplished using external methods such as tapping the
   router's links. Also note that this functionality is already present
   in current routers in some form; Cisco's NetFlow is one example of a
   mechanism for sampling traffic.
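
   As an illustration of input (b), the following is a minimal sketch
   of how a policy box might turn a random sample of packet headers
   into a list of candidate high-bandwidth aggregates.  It is not one
   of the algorithms of [MFW01], [MBF+02], or [EV02].  The header
   fields ("dst", "length"), the /24 destination prefix used as the
   aggregate key, and the 50% share threshold are all assumptions made
   for the example.

```python
import random
from collections import Counter
from ipaddress import ip_network


def dest_prefix(dst_ip, prefix_len=24):
    """Map a destination address to its enclosing prefix (the aggregate key)."""
    return str(ip_network(f"{dst_ip}/{prefix_len}", strict=False))


def sample_headers(headers, probability, rng):
    """Input (b): keep each header independently with the given probability."""
    return [h for h in headers if rng.random() < probability]


def high_bandwidth_aggregates(sampled, share_threshold=0.5):
    """Return {prefix: share} for prefixes above the (hypothetical) threshold.

    The share is the fraction of sampled bytes destined to the prefix.
    """
    bytes_by_prefix = Counter()
    for h in sampled:
        bytes_by_prefix[dest_prefix(h["dst"])] += h["length"]
    total = sum(bytes_by_prefix.values())
    if total == 0:
        return {}
    return {
        prefix: count / total
        for prefix, count in bytes_by_prefix.items()
        if count / total > share_threshold
    }
```

   A caller would pass the sampled headers to
   high_bandwidth_aggregates() once per measurement interval, and hand
   any returned prefixes to the rate-limiter.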

   We do not delve into the algorithms used to identify high-bandwidth
   flows or aggregates; refer to [MFW01], [MBF+02], and [EV02] for
   example algorithms. But the choice of algorithms does depend on
   whether some other primitives are available. An important restriction
   is that the dimensions along which flows and aggregates are
   identified are limited by the dimensions along which rate-limiting
   can be implemented -- for instance, if the rate-limiter cannot
   rate-limit based on port numbers, the identified "flows" and
   "aggregates" cannot be based on port numbers. Similarly, an
   identification algorithm with a high false positive rate is
   acceptable as long as the primitives to closely observe the behavior
   of rate-limited traffic exist (Section 4.2).


   In identifying high-bandwidth flows or aggregates, the policy engine
   will need to take into account the kinds of aggregate specifications
   that are understood by the rate-limiter.  (Some rate-limiters might
   accept aggregate specifications that are arbitrary functions of a
   number of fields in the packet header, but other rate-limiters are
   likely to have more limited formats for specifying aggregates.)

3.2 Rate-limiting identified flows or aggregates

   The goal of rate-limiting is to limit the throughput of the
   identified flow or aggregate. It can be implemented using a virtual
   token-bucket rate-limiter before the output queue. (Note that other
   rate-limiting mechanisms are possible, but we focus on a token-bucket
   based rate-limiter because of its simplicity.) This is essentially a
   virtual queue with some permissible maximum burst and whose drain
   rate is the rate-limit itself. Incoming packets are dropped when this
   virtual queue is full, and go through to the real output queue
   otherwise. Such a rate-limiter already exists in some form; Cisco's
   Committed Access Rate (CAR) is one example.
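
   The token-bucket behavior described above can be sketched as
   follows.  This is an illustrative model only, not a description of
   CAR or of any router implementation; the class name, the parameter
   names, and the use of explicit timestamps are assumptions made for
   the example.

```python
class TokenBucket:
    """Virtual token-bucket rate-limiter: no packets are queued."""

    def __init__(self, rate_bps, burst_bytes, now=0.0):
        self.rate = rate_bps / 8.0    # refill rate in bytes per second
        self.burst = burst_bytes      # permissible maximum burst
        self.tokens = burst_bytes     # the bucket starts full
        self.last = now               # time of the last update, in seconds

    def allow(self, packet_bytes, now):
        """Return True to pass the packet to the output queue, False to drop."""
        elapsed = now - self.last
        self.last = now
        # Refill at the rate-limit, never exceeding the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False
```

   Each arriving packet results in a single call to allow(), which
   matches the requirement that the rate-limiter look at packets one at
   a time.  An ECN-marking variant would mark the packet instead of
   returning False.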

   In case of resource-constrained routers, it is possible to have just
   one rate-limiter for an output queue; all rate-limited flows and
   aggregates suffer the same fate, and are not protected from each
   other.  But wherever possible, it would be highly desirable to have
   many separate rate-limiters, for different flows or aggregates that
   are being limited.

3.3 (Un)Installing or updating rate-limiting filters at run-time

   Routers need to be able to install, remove, or change rate-limiting
   filters at runtime as directed by the policy box. The basis for doing
   this exists in most routers today, which let an operator load a new
   configuration file without disrupting forwarding.
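
   As a rough sketch of the interface the policy box would need, the
   following models a filter table that supports install, update,
   remove, and per-packet lookup.  The class and method names are
   hypothetical, aggregates are assumed to be destination prefixes, and
   the longest-match rule for overlapping filters is a design choice
   made for the example rather than something this document
   prescribes.

```python
from ipaddress import ip_address, ip_network


class FilterTable:
    """Run-time table of rate-limiting filters keyed by destination prefix."""

    def __init__(self):
        self._filters = {}  # network object -> rate-limit in bits per second

    def install(self, prefix, rate_bps):
        """Install a new filter, or update the rate of an existing one."""
        self._filters[ip_network(prefix)] = rate_bps

    def remove(self, prefix):
        """Remove a filter; removing an unknown prefix is a no-op."""
        self._filters.pop(ip_network(prefix), None)

    def lookup(self, dst):
        """Return (prefix, rate_bps) for the longest matching filter, or None."""
        addr = ip_address(dst)
        matches = [(net, rate) for net, rate in self._filters.items()
                   if addr in net]
        if not matches:
            return None
        net, rate = max(matches, key=lambda m: m[0].prefixlen)
        return str(net), rate
```

   Packets for which lookup() returns None bypass rate-limiting and go
   straight to the output queue.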

4.  Optional Primitives

   We take the reverse approach in this section: instead of specifying
   primitives needed for a given functionality, we specify what
   functionality is possible if a particular primitive is available.

4.1 Ambient drop rate

   A primitive to report the ambient drop rate (i.e., the packet
   drop/mark rate at the output queue) could be used to determine the
   level of congestion in the output queue, as input to determining the
   rate limits to be applied to flows or aggregates. This primitive is
   already available in current routers through management interfaces
   such as SNMP or Cisco's "show interface" command.


   Typically, flows would be rate-limited only when the output link is
   fully utilized.  Information on the ambient drop rate can be used by
   the policy box for this purpose.  (We note that a router might
   occasionally choose to rate-limit an aggregate even when the output
   link is not fully utilized, e.g., in response to the pushback
   requests described in Section 5.)
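
   One way the policy box might derive the ambient drop rate is from
   two successive snapshots of the output queue's packet counters, as
   sketched below.  The counter layout (forwarded, dropped-or-marked)
   and the 5% threshold are assumptions made for the example; counter
   wrap-around is ignored.

```python
def ambient_drop_rate(prev, curr):
    """Drop/mark rate at the output queue between two counter snapshots.

    prev and curr are (forwarded_packets, dropped_or_marked_packets) tuples
    read at the start and end of the measurement interval.
    """
    forwarded = curr[0] - prev[0]
    dropped = curr[1] - prev[1]
    arrivals = forwarded + dropped
    return dropped / arrivals if arrivals > 0 else 0.0


def should_rate_limit(drop_rate, threshold=0.05):
    """Hypothetical policy: rate-limit only under sustained congestion."""
    return drop_rate >= threshold
```

   A pushback request from a downstream router would bypass this check,
   since it can call for rate-limiting even when the local output link
   is not congested.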


4.2 Closer monitoring of rate-limited flows and aggregates.

   This is useful for a variety of purposes such as identifying
   misbehaving flows and narrowing the congestion signature for
   aggregates.

   4.2.1. Estimate of the arrival rate of identified flows or aggregates.

   4.2.2. Estimate of the packet drop rate at the rate-limiter for
   identified flows or aggregates.

   Taken together, these are helpful in determining whether the
   identified flow is misbehaving. Both of these are accessible when
   rate-limits are implemented through Cisco's CAR.
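
   The sketch below shows one way these two estimates could be combined
   to judge whether a flow is misbehaving.  The heuristic is an
   assumption made for illustration and is not the test used in
   [MFW01]: a flow is flagged if, after every interval in which it
   suffered a significant drop rate at the rate-limiter, its arrival
   rate failed to fall.  The function name and both thresholds are
   hypothetical.

```python
def looks_unresponsive(samples, min_drop_rate=0.1, tolerance=0.9):
    """Flag a flow whose arrival rate does not fall after heavy drops.

    samples is a list of (arrival_rate_bps, drop_rate) pairs, one per
    measurement interval, oldest first.
    """
    tested = 0
    for (prev_rate, prev_drop), (rate, _) in zip(samples, samples[1:]):
        if prev_drop < min_drop_rate:
            continue  # too few drops to expect any reaction
        tested += 1
        if rate < tolerance * prev_rate:
            return False  # the flow backed off after being dropped
    # Only flag the flow if at least one interval actually tested it.
    return tested > 0
```

   Depending on local policy, a flow flagged in this way could be given
   a more restrictive rate-limit.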

   4.2.3. Packet drop history or random sample of rate-limited traffic.

   This is useful for narrowing the congestion signature for identified
   aggregates, by checking whether most of the aggregate traffic is of a
   certain type. This may already be available as part of Section 3.1.
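
   As a minimal sketch of narrowing a congestion signature, the
   function below inspects a sample of headers from a rate-limited
   aggregate and reports the dominant value of one header field, if
   any.  The field names and the 80% share threshold are assumptions
   made for the example.

```python
from collections import Counter


def narrow_signature(sampled_headers, field, min_share=0.8):
    """Return the dominant value of a header field, or None.

    A value is dominant if it appears in at least min_share of the
    sampled packets of the aggregate.
    """
    if not sampled_headers:
        return None
    counts = Counter(h[field] for h in sampled_headers)
    value, count = counts.most_common(1)[0]
    if count / len(sampled_headers) >= min_share:
        return value
    return None
```

   If a dominant value is found, and the rate-limiter can match on that
   field, the filter could be replaced by a narrower one that leaves
   the rest of the aggregate alone.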

5. Enabling Pushback

   Pushback is a cooperative mechanism that routers could use to ask
   their upstream neighbors to rate-limit aggregates [MBF+02]. This
   saves bandwidth on the congested links and helps spatially isolate
   the attacking source(s). This section discusses the additional
   primitives required to enable pushback.


   1. Estimate of the relative contribution of input links for an
   identified aggregate.

   This helps in deciding which upstream neighbors are sending more
   traffic from that aggregate.  This can be accomplished by tagging
   packets with the ingress interface, which most routers already do.

   2. Communicating about pushback to upstream or downstream routers.


   Details about message formats used for this purpose can be found in
   [FBI+02].

   3. Access to routing tables

   This is required to narrow the congestion signature received from
   downstream. This should be easy to support, as it is done in
   software.
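
   For the first primitive above, the relative contribution of each
   input link could be estimated from packets tagged with their ingress
   interface, as in the sketch below.  The function name and the tuple
   layout are assumptions made for the example.

```python
from collections import Counter


def ingress_shares(tagged_packets):
    """Fraction of an aggregate's bytes arriving on each input interface.

    tagged_packets is an iterable of (ingress_interface, length_bytes)
    pairs for packets belonging to the identified aggregate.
    """
    bytes_by_interface = Counter()
    for interface, length in tagged_packets:
        bytes_by_interface[interface] += length
    total = sum(bytes_by_interface.values())
    if total == 0:
        return {}
    return {iface: count / total
            for iface, count in bytes_by_interface.items()}
```

   The resulting shares indicate which upstream neighbors contribute
   most of the aggregate and are therefore candidates for a pushback
   request.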

6.  Simple examples of rate-limiting.

   A simple example of rate-limiting would be for the router to have a
   static definition of some subset of the non-TCP/SCTP/DCCP traffic,
   and to rate-limit this subset to ensure that this traffic did not
   grab all of the bandwidth on the output link in times of high
   congestion.  This could be implemented in a router that had a single
   rate-limiter, along with some input to the policy box about the
   ambient drop rate in the output queue.

   Another example is rate-limiting worm traffic. A router could
   automatically detect when worm probes to a particular port are
   causing congestion, and rate-limit the probes to that port to save
   bandwidth for other traffic.

7.  Conclusions

   Most of the required primitives are already available in current
   routers. It should be easy to implement complete rate-limiting
   solutions.

8.  Acknowledgements

   We thank Steve Bellovin, Mark Handley, John Ioannidis, and Scott
   Shenker for feedback and contributions to this document.

9.  Normative References

   There are no normative references in this document.

10.  Informative References

   [ECB01] Ertemalp, F., Cheriton, D., and Bechtolsheim, A., "Using
   Dynamic Buffer Limiting to Protect against Belligerent Flows in
   High-Speed Networks", International Conference on Network Protocols
   (ICNP), November 2001.

   [EV02] Estan, C., and Varghese, G., "New Directions in Traffic
   Measurement and Accounting", ACM SIGCOMM, August 2002.


   [FBI+02] Floyd, S., Bellovin, S., Ioannidis, J., Kompella, K.,
   Mahajan, R., and Paxson, V., "Pushback Messages for Controlling
   Aggregates in the Network", Internet-draft: draft-floyd-pushback-
   messages-00.txt, expired draft, work in progress, July 2001.  URL
   "http://www.icir.org/pushback/".

   [MFW01] Mahajan, R., Floyd, S., and Wetherall, D., "Controlling High-
   Bandwidth Flows at the Congested Router", International Conference on
   Network Protocols (ICNP), November 2001.

   [MF01] Mahajan, R., and Floyd, S., "Controlling High-Bandwidth Flows
   at the Congested Router", ICSI Technical Report TR-01-001, April
   2001.

   [MBF+01] Mahajan, R., Bellovin, S., Floyd, S., Ioannidis, J., Paxson,
   V., and Shenker, S., "Controlling High Bandwidth Aggregates in the
   Network (Extended Version)", July 2001, URL
   "http://www.icir.org/pushback/".

   [MBF+02] Mahajan, R., Bellovin, S., Floyd, S., Ioannidis, J., Paxson,
   V., and Shenker, S., "Controlling High Bandwidth Aggregates in the
   Network", ACM SIGCOMM CCR, Vol 32, No. 3, July 2002.

   [PBPS01] Pan, R., Breslau, L., Prabhakar, B., and Shenker, S.,
   "Approximate Fairness through Differential Dropping", 2001.

11.  Security Considerations

   This document discusses primitives that enable mechanisms for
   selectively limiting network traffic.  Requests for such limiting
   must come from authorized sources only. If these requests are coming
   from control connections, the connections themselves should be
   properly authenticated, e.g., using IPsec.  If these requests are
   generated automatically, whether on-board the router or from some
   outside control element, there exists a risk that an adversary may
   send traffic that mimics legitimate traffic and thus cause legitimate
   traffic to be rate-limited.  When the policy engine is not on the
   router, there is also a need to protect and encrypt the traffic to
   the policy engine, to protect the privacy of customers' traffic.

12.  IANA Considerations

   There are no IANA considerations in this document.

   AUTHORS' ADDRESSES


      Sally Floyd
      Phone: +1 (510) 666-2989
      ICIR (ICSI Center for Internet Research)
      Email: floyd@icir.org
      URL: http://www.icir.org/floyd/

      Ratul Mahajan
      Department of Computer Science and Engineering
      University of Washington
      Email:  ratul@cs.washington.edu
      URL: http://www.cs.washington.edu/homes/ratul/

