Adaptive Web Caching
A collaboration between Lixia Zhang (UCLA),
Sally Floyd, and Van Jacobson (NRG, LBNL).
The Adaptive Web Caching project is a new DARPA-funded research project
whose aim is to design and build prototype implementations of protocols for
a self-configuring,
highly adaptive Web caching system that will scale to the
global information infrastructure. At a later stage, the project will
investigate possibilities for the incremental deployment of the
new protocols into the existing manually configured Web caching
infrastructure.
Overview
With the exponential growth of the Internet, the World
Wide Web is rapidly becoming a global-scale data dissemination system.
The unprecedented success of the Web, however, has also caused traffic overload
at data sources and along network paths. "Hot spots" of network load
have been repeatedly observed, with the same data being transmitted
over the same network links again and again to thousands of users.
The problem is not entirely new: a similar overload occurred in the past
at popular FTP servers. However, the old solution of manually configuring
a few replication sites no longer works.
Using multicast delivery
Given that the basic problem is data dissemination to thousands or
millions of users, the basic solution ought to be some form of multicast
delivery. That is, the data should be fetched only once from the origin
server, and then forwarded via a multicast tree to all the interested
parties. Unlike multicast delivery for realtime multimedia applications,
however, Web requests for the same data come asynchronously
because different users surf the Web at different times. Therefore, Web
"multicasting" must be done via caching: the network temporarily
buffers popular Web pages at places the pages have traveled through (due
to previous requests), so that future requests for those pages can be
served from the cache.
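The idea of caching along the path can be illustrated with a minimal sketch. The class names (Origin, CachingNode), the two-level campus/regional topology, and the absence of any expiry or consistency handling are all assumptions made for illustration; they are not part of the project's design.

```python
class Origin:
    """Hypothetical origin server that counts how often it is contacted."""

    def __init__(self, pages):
        self.pages = pages
        self.fetches = 0

    def get(self, url):
        self.fetches += 1
        return self.pages[url]


class CachingNode:
    """Hypothetical cache on the network path.

    `upstream` is the next hop toward the origin (another cache or the origin).
    """

    def __init__(self, upstream):
        self.upstream = upstream
        self.store = {}

    def get(self, url):
        # Keep a copy of every page that travels through this node,
        # so later requests can be answered without going upstream.
        if url not in self.store:
            self.store[url] = self.upstream.get(url)
        return self.store[url]


# Assumed topology: two campus caches share one regional cache.
origin = Origin({"/index.html": "<html>hello</html>"})
regional = CachingNode(origin)
campus_a = CachingNode(regional)
campus_b = CachingNode(regional)

first = campus_a.get("/index.html")   # travels all the way to the origin
second = campus_a.get("/index.html")  # answered by campus_a
third = campus_b.get("/index.html")   # answered by the regional cache
```

In this sketch three requests issued at different times cause a single fetch from the origin, which is the effect that multicast delivery would achieve for simultaneous receivers.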
We propose to undertake the design and prototype implementation of a
self-configuring, highly adaptive Web caching system that will
enable the World Wide Web as well as other data dissemination
applications to scale to the dimension of the global information
infrastructure and beyond. Our design uses IP multicast as a basic
building block. IP multicast serves two distinct functions: it is
the most efficient way to deliver the same data to multiple
receivers, and it is an information discovery vehicle, since a host can
multicast a query to a relevant group when it does not know exactly whom
to ask. Our caching design makes use of both features: we multicast
page requests in order to locate the nearest cached copy, and multicast
page responses in order to efficiently disseminate pages that are of
common interest.
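The discovery role of multicast can be sketched as a simulation, without real sockets. The Cache type, the distance_ms field (standing in for whatever distance measure the protocol would use), and both function names are hypothetical; the text does not specify how "nearest" is determined.

```python
from dataclasses import dataclass, field


@dataclass
class Cache:
    name: str
    distance_ms: float  # assumed metric: round-trip time from the requester
    pages: dict = field(default_factory=dict)


def multicast_query(group, url):
    """Simulate multicasting a page request to `group`.

    Every member sees the query; only those holding `url` answer.
    """
    return [cache for cache in group if url in cache.pages]


def locate_nearest(group, url):
    """Return the closest cache that answered, or None if nobody holds the page."""
    holders = multicast_query(group, url)
    return min(holders, key=lambda cache: cache.distance_ms, default=None)


group = [
    Cache("far", 40.0, {"/x": "X"}),
    Cache("near", 12.0, {"/x": "X"}),
    Cache("nearest-but-empty", 5.0),
]
hit = locate_nearest(group, "/x")
miss = locate_nearest(group, "/y")
```

The requester never needs to know in advance which member holds the page: it asks the whole group and picks among the responders.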
The need for a self-configuring system
In our proposed design, Web servers and cache servers are organized into
multiple, overlapping multicast groups, so that a client page request
can be either met by some cache server in a local group, or otherwise
forwarded to other group(s) that lie on the path towards the information
source or are otherwise judged as most likely to have the referenced
object. In order for this caching infrastructure to be robust,
scalable, and efficient, the organization of Web caches into
overlapping groups must be self-configuring. We propose to develop
self-organizing algorithms and protocols that allow cache groups to
dynamically adjust
themselves according to changing conditions in network topology, traffic
load, and user demands.
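The request path through overlapping groups can be sketched as follows. This is only an illustration under stated assumptions: the group membership and the next_group table are fixed by hand here, whereas the proposal is precisely that such organization should be self-configuring; the function name resolve and all data are hypothetical.

```python
def resolve(url, start_group, groups, next_group, origin):
    """Walk cache groups from the client's local group toward the origin.

    groups:     group name -> {cache name -> {url: body}}
    next_group: group name -> next group toward the origin (None at the last hop)
    Returns (body, where_it_was_found).
    """
    group = start_group
    while group is not None:
        # Within a group, any member holding the page can answer.
        for cache_name, pages in groups[group].items():
            if url in pages:
                return pages[url], f"{group}/{cache_name}"
        # Nobody in this group has it: forward toward the source.
        group = next_group.get(group)
    return origin[url], "origin"


# The gateway cache belongs to both groups, which is what makes them overlap.
gateway_pages = {}
groups = {
    "campus": {"c1": {}, "gateway": gateway_pages},
    "regional": {"gateway": gateway_pages, "r1": {"/paper.ps": "PS"}},
}
next_group = {"campus": "regional", "regional": None}
origin = {"/paper.ps": "PS", "/new.html": "NEW"}

cached = resolve("/paper.ps", "campus", groups, next_group, origin)
uncached = resolve("/new.html", "campus", groups, next_group, origin)
```

Here the first request is satisfied by a cache in the regional group, and the second falls through to the origin server because no group on the path holds the page.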
We believe that self-configuration is an essential
component of a range of loosely coupled, globally distributed systems
such as the Internet. Examples include the need for self-configuring
groups for scalable session message distribution in RTP, the need for
self-configuring groups for session messages and for local recovery in
scalable reliable multicast, and the need for self-configuring search
structures for information discovery protocols. We envision that the basic
approaches to self-configuration developed in this research can be further
extended to other large-scale systems.