Internet Experiment Note # 19
Notebook Section 126.96.36.199
A note on
Inter-Network Naming, Addressing, and Routing
John F. Shoch
Xerox Palo Alto Research Center
Palo Alto, California 94305
Taxonomies and terminology will not, by themselves, solve some of the
difficult problems associated with the inter-connection of computer networks;
but carefully choosing our words can help us to avoid misunderstanding and
refine our perceptions of the task.
In 'Through the Looking Glass', the White Knight tries to elucidate (for an
imprecise Alice) the important differences between what a song *is*, what it
*is called*, what it *is named*, and what *the name is called*; perhaps we
need to be equally vigilant with our use of the words 'name', 'address', and
Let me offer one scheme which has proven useful in analyzing this domain, and
begin by asserting that 'names', 'addresses', and 'routes' are different
entities. [Even one of my favorite papers introduces part of this topic by
merging two of these characteristics: "The question of addressing. or how to
name all the participants in an interconnected communication system...."]
The General Model
We can first construct an extremely general definition:
The 'name' of a resource indicates *what* we seek,
an 'address' indicates *where* it is, and
a 'route' tells us *how to get there*.
Some further detail to flesh this out:
I. A 'name' is a symbol -- usually a human-readable string -- identifying
some resource, or set of resources.
The string need not be meaningful to all users, and need not be drawn
from a uniform name space.
The strings can identitify processes, places, people, machines,
functions, or anything else the user chooses.
To be useful, however, there will probably be some mechanism available to
the user that will map names into addresses.
The name (what we seek) need not be bound to the address (where it is)
until this mapping takes place; the address (or addresses) associated
with a particular name may change over time.
II. An 'address', however, is the data structure whose format can be
recognized by all elements in the domain, and which defines the fundamental
Addresses must, therefore, be meaningful throughout the domain, and must
be drawn from some uniform address space.
This address space may be a "flat" one which spans the entire domain
(such as Social Security numbers), or it may be a hierarchical address
space (such as telephone numbers).
If a name has produced several different addresses for a particular
resource, some form of 'a priori' information may prove useful in
selecting a preferred address.
At the time one wishes to communicate with a particular address, there
will be some mechanism that will map an address into an appropriate
The address (where something is) need not be bound to the route (how to
get there) until this mapping takes place; the choice of an "appropriate"
route may change over time.
III. A 'route' is the specific information needed to forward a piece of
information to its specified address.
The routing action may only require one step to reach a destination (a
direct route), or it may require a series of steps in order to forward
the information on its way.
Where there is merely one hop in a simple topology, the routing decision
is usually straightforward.
When the path to an address requires several steps (as in a store and
forward system), the route defines a path through intermediate switching
In categorizing such routing mechanisms, one dimension is the *place* at
which each intermediate routing decision is specified:
a. The source may specificy all of the intermediate routing decisions,
and include this information along with the data being sent ('source
routing'). The source must have fairly comprehensive information about
the environment, but the switching points need not maintain routing
b. Alternatively, the source may only specify the destination
address, and the intermediate switching points choose the next
portion of the route ('hop-by-hop routing', sometimes called
'intermediate routing' or 'incremental routing'). In this case, the
source only needs enough information to reach the next switching
point, but each of these points must then have a routing table of
c. There may be a hybrid routing mechanism combining these two, in
which the source specifies certain major intermediate points, but
allows the underlying system to choose routes between those points.
A second dimension of the routing mechanims [sic] is the 'time constant'
of the routing decisions -- when are they specified, and how frequently
they may be changed:
a. Routing tables may be set once, left unchanged for relatively long
of time, and only changed to reflect major modifications to the
system ('fixed routing', or 'deterministic routing').
b. Alternatively, routing tables may be updated relatively
frequently, reflecting shorter term changes in the environment
('dynanamic routing', or 'adaptive routing').
[This measure is, in some sense, relative to the time constants of
the communication system itself.]
If there is a dynamic routing system providing periodic updates of
appropriate information, a third dimension of the routing process is the
a. Individual switching points may try to update their information in
an 'isolated' manner, perhaps by trying various routes and observing
b. Changes in the environment or connectivity may all be reported
back to one 'centralized' point, which is then responsible for
promulgating this inforniation to the sources (if we use source
routing) or to the switches (if we use hop-by-hop routing).
c. Alternatively. control of the dynamic update process may be
'distributed' among all of the sources or switches, which individualy
[sic] obtain information about the state of the system.
It is meaningful to talk about possible combinations of these
dimensions: 'fixed source routing' may be quite simple to implement,
while 'dynamic source routing' may be beneficial if the source can learn
about changes occurring throughout the system. 'Fixed hop-by-hop routing'
requires setting the routing tab1es at the switching points, while
'dynamic hop-by-hop routing' then requires updates to these routing
tables. 'Distributed, dynamic, hop-by-hop routing' might obtain these
periodic updates from neighboring switches, while a 'centralized,
dynamic, hop-by-hop' system would depend on some form of network coutrol
center to provide new information.
Thus, a *name* may be used to derive an *address*, which may then be used to
derive a *route*.
[There is an interesting similarity between this structure and mechanisms
used in programming languages (where one must bind a value to a
variable), or in operating systems (where one must link a particular
piece of code into a module). In all these cases, there is a certain
amount of added flexibility gained by deferring the binding process as
long as possible: MULTICS defers the binding of pieces of code until
run-time so that new instantiations of code will be used automatically,
while the Arpanet tries to defer routing decisions as long as possible,
in order to make the most informed choice.]
If one fails to deliver a piece of information to the resource originally
specified, an error recovery procedure can systematically try to undo each of
these binding decisions: first try an *alternate route*, then an *alternate
address*, then an *alternate name*.
Example #1: Naming/Addressing/Routing in the Telephone System
Before attempting to apply this scheme to a connected set of computer
networks, let's see how well it serves to understand the workings of another
domain: placing a telephone call.
[The telephone system, of course, uses circuit switching, and dedicates
resources for the lifetime of a call. We are here discussing the naming,
addressing, and routing which take place during call set-up.]
I. We can look up a 'name' in the phone book.
The name may be a specific person ("D. B. Cooper"), an institution
("Stanford University"), the local representative of a nation-wide
service ("the nearest TWA
office"), or a generic function ("Get me the police!", or the time
The names of different individuals may yield the same phone number.
One name may yield several phone numbers, if there are two telephones in
one home or if a business provides multiple incoming lines. In addition, a
considerate business might provide several incoming lines vith
different area codes.
Should a name yield more than one number, the caller may use some
additional information to aid in the choice of which to dial: is one
number less likely to be in use, or is one address in our own area code,
and thus cheaper.
There can be a two-level name-to-address mapping process: look up the
name "information" in the phone book, get that number (411), place a
call, and then present a new name to be looked-up by the information
II. The 'address' which we eventually find is just the telephone number.
The phone system did not have to understand the subscribers' names, but
it certainly must be able to understand their addresses (numbers).
Addressing in the phone network is based on a hierarchical structure --
the major contribution of area codes, as specified in the North American
There are certain conventions for defaulting parts of the address when it
is found: if the area code is not listed, we generally assume it is the
same as our own.
Every resource (person) accessible through the phone system is actually
represented by a telephone instrument, which we can address. Actual
signals are intended for the person, but they are always sent via the
III. Vhen we dial the number (in fact, each time we dial the number) the
phone network attempts to build an appropriate route to the destination,
reflecting current conditions.
Two successive calls to the same number need not be routed through the
same path, and the telephone plant provides alternate routes if a trunk
is busy. The distributed, adaptive nature of this routing is one of the
major factors which contributes to the reliability of the phone system.
Now consider the error recovery procedure if we are not able to reach the
person we seek: we move back through the hierarchy, trying alternate routes,
alternate addresses, and alternate names.
1. If we get a busy trunk indication, we know that the route chosen by the
network turned out to be inadequate. We can, in effect, ask the system to try
picking a route again, by re-dialing (re-transmitting). If we are
particularly concerned, we can ask an operator to override some of the
automatic routing decisions, and attempt an alternate route.
2. If all of the routes to this address are inadequate, or if the address is
unavailable (perhaps busy, or broken), we must fall back and find an
alternate address: perhaps there were two phone numbers listed, and we can
try the second. (How many of us have second phones for a terminal, which we
call in on when the regular line is busy?) This will lead to the selection of
a completely new route.
3. If the alternate address fails, we can fall back one more step, and try to
find an alternate name: perhaps the phone is listed in the name of one's
spouse, or maybe we should look up the name of the organization where one
The Implications of Hierarchical vs. Flat Address Spaces
It should be apparent that the structure of the address space is of central
importance: it is the major element of commonality in such an environment,
and can have a profound influence on the naming mechanism "above" and the
routing mechanism "below".
An address space can be partitioned in a hierarchical manner, or left as a
single uniform space. Use of a flat address space implies that:
1. The address given to any resource must be unique over the whole
2. There is no structure to the address which might aid the routing
process; instead, the routing mechanism must be able to handle all
addresses, without segmenting them into parts (i.e., there is no area
There are some advantages to this approach: it is logically simple, and has
the potential to optimize the particular route between any two resources. But
there are certain kinds of costs: upon adding a new set of resources to an
existing system, one must insure [sic] that none of the new addresses
conflict with those already used, and the routing process must now be able to
recognize all of the legal addresses (in the phone system, 10^10).
The alternative is to provide a hierarchical structure for addresses, as has
been done with area codes. Local addresses can then be changed within areas,
but need not be made known to all of the other parts.
Yet the choice of hierarchical addressing can have important ramifications
for the levels "below".
We would certainly not want to allow a strictly hierarchical address scheme
to serve as the justification for a strictly hierarchical physical topology:
that would allow only one path to each resource, and severely diminish
reliability. Thus, the underlying communications should provide alternate
paths to a destination.
Furthermore, the routing mechanism must now be flexible enough to take
advantage of the alternate paths available -- it would make litle sense to
lay a strictly hierarchical routing algorithm on top of a richly-connected
network. Given a hierarchical address, this form of area routing should be
able to pick out an appropriate field, and route the data towards that area,
through one of the available paths, but be able to identify an alternate path
Though the hierarchical address space is often viewed as preferable, there is
a cost here too: if the routing decision does in fact use the hierarchical
structure of the address and route to an area, then some information has been
lost: the ability to specify precise routes between two resources. This can
lead to several kinds of difficulties:
1. Two resources may be geographically close, but on distant branches of
the hierarchical tree, If the routing strictly parallels the addressing
hierarchy, it may require many hops to reach a nearby node.
The telephone system provides a routing hierarchy which matches the
addressing scheme, but then provides additional routes (high usage
trunks) across the tree, as warranted by the traffic; switching
centers then have the choice of routing through the specially
allocated trunk, or back up the regular hierarchy, which then becomes
the alternate route.
This means, however, that some central offices within an area code
are reachable through direct lines, while other less-heavily used
offices must be reached through the standard hierarchy. To make that
routing decision, the foreign switch must be provided with additional
information about the internal structure of the system within the
destination area code -- i.e., which central office prefixes are
reachable through a "short-cut". To take advantage of a direct
high-usage trunk from downtown New York to a
central office in downtown San Francisco, without going through the
California area code hierarchy, the switch in New York must be able
to perform what is called "6-digit translation" on the phone number,
examining both the area code and the central office prefix.
2. If a sub-area (netvork) is unintentionally partitioned so that half of
its internal addresses are now unreachable, a distant routing process
won't be able to recognize that, given only the sub-area number.
3. Should there be alternate paths into the partitioned sub-area, now
making those lost nodes reachable, the distant routing process will not
be able to make an informed choice about the proper route into the
4. Two resources may be in the same sub-area, but perhaps connected by an
inefficient route (many hops or poor quality). Provided there are two
paths in to the sub-area, it may make more sense for a resource to route
its traffic out of the area aud back in through the other path, but area
routing will indicate that the source is already in the area of the
destination, and may not allow a route going outside of that area.
In these unusual circumstances, we see that there are clear weaknesses at the
routing level which evolve from the choice of a hierarchical address space;
but in the most common cases this approach will work well.
This choice of hierarchical addresses may also impact the naming mechanism,
although that ought not be necessary. The name-to-address mapping can be
supplied by a process which is external to the system itself, and there is no
reason to enforce hierarchical names. A user may offer a name from which one
can directly derive the address, but it should also be acceptable to provide
a symbolic name, independent of the address format. An example may show the
different kinds of names which might be converted into adresses (telephone
1. The syntax of the string "(415) 767-2676" explicitly indicates the
three fields of the address, and their values.
2. The string "(SanFrancisco) Pop-Corn" has the syntactic structure of a
proper hierarchical address, but will require a symbolic look-up of the
3. Yet the string "GetMeTheTimeLady" -- when looked-up in the context of
my name dictionary -- should produce this same address.
The virtue of the third approach is that the names may remain the same, even
if the resource is moved to a different address, or if I should move to a
different area; only the name-lookup process must be made aware of the change.
Example #2: Inter-connected Computer Netvorks
It seems that this model of naming/addressing/routing provides a useful
framework for discussing the phone system, and that system provides a rich
set of examples for comparison with the domain of computer networks.
I. A name is merely a human-readable string meant to denote some resource:
"BBN-TenexA" (a host), "SAIL" and "SU-AI" (two names for the same host), "the
SRI mobile van", "a time server", "the Telnet server at ISI", "the Telnet
server on any Tenex", "the nearest FORTRAN compiler", "the error-logging
process in IMP32", "the inter-network routing process in Gateway2l" .... Note
that these names may identify resources outside of the communication system,
or processes within it.
Virtually any object can have a name. In practice, of course, any such
resource is usually represented by some sort of process which can act on its
The Arpanet system for handling text messages describes a slightly
different kind of name, in the form "User@Host", but that would not
really resolve down to a proper internetwork address. Instead, the "host"
name is used to derive an address for the
mail server process, and the "user" name is sent along as data, to be
interpreted by a higher-level protocol (the mail server) at the
Some local process (either local or part of some distant resource) may offer
to convert names into appropriate addesses, although a single name may yield
many addresses. The name-lookup process may know about the precise address of
certain well-known severs [sic], and may provide certain default conventions
if some portions of the address are not fully specified.
I have not yet mentioned the notion of names and addresses for
multi-destination pieces of information. The name
"Internet-Working-Group" could produce a whole set of names, which in
turn could be transformed into individual addressess [sic], and
individual copies of the information could be sent to each place. Most
communications subsystems do not yet have protocols which would provide
more congenial support for multi-destination packets.
If a single name does yield many addresses, one of them may be chosen on the
basis of some other information: bandwidth and delay characteristics, cost,
II. An address, in this context, really means an internetwork address: a
collection of bits in a standard format, recognizable by any object cognizant
of the inter-network protocols.
In most inter-network efforts we have adopted a hierarchical address space,
usually partitioned into (net, host, socket) tuples.
Note that this model is not quite adequate to describe the dynamic naming
and addessing conventions planned for the Distributed Computing System
(DCS) at UC Irvine. They had proposed that processes not be bound to
particular machines, but would be free to migrate around the ring; 16-bit
process ID's on each packet would be dynamically checked in each host,
and the packet given to the appropriate process, if it happened to be in
that host. The actual address of a destination process would remain
hidden from the sending program,
I would argue that the 16-bit process ID (although not human-readable) is
most accurately called a process 'name', since it may be used to describe
a generic resource without specifying a particular address. Thus, this
mechanism pushes "down" and distributes the task of mapping from names to
addresses, thereby deferring the binding until even later.
The mechamsm depended heavily on two aspects of the DCS ring:
-- the ability of the communications subsystem to broadcast a packet to
all nodes on the ring, and,
-- the ability of each interface to quickly look up a process name in a
content addressable memory (associative memory) used to identify those
processes resident in the node. (In the original proposal, this memory
could hold the names of 16 resident processes) [sic]
In a different communications enviromment -- or if there were more than
16 processes in one host -- this mechanism would not function properly.
In those cases, however, one can provide the same effect (allowing users
to communicate with resources stirctly [sic] by name, without specifying
an address) merely by providing an extra layer of protocol. I would
conclude that the different naming/addressing conventions in DCS are more
of a difference in style, rather than a difference in substance.
We usually think of these three components as physical objects -- networks
and machines -- but they are really only logical groupings used to form the
address. In a particular environment, for example, the "host" might be a
front-end processor, and the "sockets" might be specialized processors,
perfoming the function which the user sought at that
If a machine has only one network interface, processes in that machine will
all have the same <Net><Host> address, although from many other addresses
there may be multiple paths to this address.
If a machine has two network interfaces, to the same network or to two
different networks, it will have two alternate internetwork addresses for
each process resident there. Another process seeking to communicate might
select either address.
III. The underlying communications system provides the means for routing
packets to their destination.
Individual networks use different routing strategies for forwarding their
internal traffic: the DCS ring has a simple topology with no alternate
routes, while the Alohanet and the Etherent [sic] use a broadcast
communications media, requiring no routing. The Arpanet uses dynamic
hop-by-hop routing, while the Packet Radionet uses dynamic source routing.
Internetwork packets, however, may travel through various networks, routed
among internetwork gateways. The gateway routing processes make use of the
information specified in the internetwork address, and can route to the areas
specified. Upon arrival at a gateway entering the destination network, the
network-specific routing mechanism can be used to route it to the specified
The gateway processes may implement a form of dynamic hop-by-hop routing,
using complete networks as the communications channel to "adjacent" gateways.
These gateways can exchange routing tables and perform dynamic hop-by-hop
routing, or one could simplify the gateway by requiring users to perform
In one sense, these two methods can be combined: while the gateways normally
do hop-by-hop routing, there could be a special process resident in the
gateway which would take incoming packets, examine the data field, extract
the next address indicated there, and forward the packet to that destination.
(Danny Cohen has called this process a "forwarder", and I believe Steve
Crocker once called it an "oasis".) The effect is to provide source routing
to the user, by simply incorporating the list of intermediate addresses in
the data field of a packet, rather than in the header.
We have spoken earlier on the manner in which area routing using hierarchical
addresses cannot properly identify resources which may have been stuck in
different parts of a partitioned network -- it requires some additional
information to guide the selection of an appropriate route. If we are using
some form of source routing, and if some information about the internal
structure of the destination net can propogate back to the source, it may be
possible to then choose an alternate route which will lead into one part of a
In addition, a source routing or hybrid scheme which allows a user to specify
certain intermediate addresses allows a process to force traffic over a
particular network, perhaps in response to political, economic, or
performance considerations: "take any reasonable path across the country, but
I must use the Packet Satellite system across the Atlantic". More difficult,
however, is selectively eliminating some paths from the routing decision:
"take me to Europe, but don't ever go through Spain".
Naming/Addressing/Routing and Internetwork Gateways
The gateway function is. in general, performed in a machine which is
connected to at least two networks, and which therefore has at least two
addresses. Incoming packets which need to be forwarded may enter on any
attached net, using any of the legal addresses.
Yet there is a single gateway process which handles all of this traffic, and
which is responsible for maintaining relevent [sic] routing tables.
It has been suggested that a gateway might be viewed as two completely
separate "halves" actually residing on different machines, connected by a
"thin wire" communications channel. We do not find this abstraction
particularly useful, since it does not easily account for the shared gateway
processes which actually perform the routing function.
In this model, the gateway processes maintain routing tables to reach network
areas through other gateways, but they do not attempt to maintain any other
state about networks, hosts, or connections. Thus, these intermediate
gateways -- using dynamic hop-by-hop routing -- will not be able to deal with
Material not covered in this note
Distribution lists, multi-destination packets, and broadcasting.
Ways we might modify area routing to more gracefully deal with partitioned
Internetwork flow control and congestion control.
Flow control between gateways.
The material in this note has evolved from many conversations with David
Boggs, Ed Taft, Bob Metcalfe, and Yogen Dalal; any clarity which has emerged
is due to their willingness to struggle with the semantic demons. Recent
conversations with Danny Cohen have been particularly useful in understanding
his proposal for hybrid routing schemes (where the source routing process
specifies only some of the intermediate stops), and the manner in which area
routing may obscure a preferred off-net route. Much of the commentary on
routing has evolved from [McQuillan 1974] and [Sunshine 1977]; the latter
paper is a partictilarly useful one, touching on many of the issues raised
here, although we tend to differ on some of the fine points.
AT&T, 'Notes on Distance Dialing', 1975.
J. M. McQuillan, 'Adaptive routing algorithms for distributed
computer networks', BBN Report No. 2831, May 1974.
Carl A. Sunshine, 'Interconnection of computer networks', Computer
January 29, 1978 8:02 PM