Implementing MPOA in ATM Edge Devices
The MPOA edge device is an enhanced bridge, extended to understand internetworking packet formats and to detect flows. Though the MPOA specification is still under development, here are some guidelines for implementing it in ATM devices.
By Joel M. Halpern
The ATM Forum MPOA specification (currently under development) provides a number of capabilities for ATM devices. It synthesizes the existing ATM Forum LAN emulation (LANE) work and the Internet Engineering Task Force (IETF)
Next Hop Routing Protocol (NHRP) activities. More significantly for this article, the specification also provides support for a virtual router with
distributed virtual nets/subnets.
A virtual router is a collection of devices that together present the external behavior of a traditional bridge/router. The MPOA (multiprotocol over ATM) specification describes how the components use ATM to provide the capabilities of the virtual router. It also describes how these services are visible to Ethernet or Token Ring attached hosts, and how services are provided to ATM attached hosts.
MPOA components
There are many components to the MPOA
system. While this article focuses on one of them, it is helpful to keep in mind the context of the entire system. Figure 1 shows the systems architecture as described in the baseline specification.
All the MPOA operational constructs are defined in terms of functional groups (FG in the element names). These functional groups represent collections of capabilities that need to be implemented together. The specification describes in detail the required behaviors for each functional group. The functional groups
are used so that the designer is not constrained by whether to implement one or many functional groups in the same device.
The Internet address subgroup (IASG) represents a central concept in MPOA. Most internetworking protocols operate in terms of the network or subnet. For routing and related purposes, a similar concept must be created on ATM. Classical IP over ATM (RFC 1577), for example, simply transfers the concept intact and whole. MPOA is a little different. The IASG represents those aspects of
the subnet that MPOA supports. Thus, it includes the notion of broadcast scope, and of an aggregation, which is at the bottom of the routing hierarchy. However, in MPOA, ATM-attached hosts no longer need to be cognizant of the address aggregation. Default forwarding can be used, even between hosts that are in the same IASG. And direct communication can be used between hosts that are actually in different IASGs. An IASG defines the logical space over which MPOA operates.
As can be seen in Figure 1, MPOA
utilizes LANE, which is used to provide the necessary bridging support. The need for bridging is a natural consequence of the support for virtual routing, in two regards.
One capability of the MPOA system is to allow connectivity to a single subnet to be distributed across several edge devices. In order to ensure that the collection behaves as a single subnet, bridging is used. Specifically, a logical, virtual bridged LAN is created to support the subnet, with connectivity on all participating edge
devices.
The routing protocol and control functions of the virtual router reside in the route server functional group (RSFG) and remote forwarder functional group (RFFG). Therefore, hosts on traditional (Ethernet and Token Ring) media will attempt to communicate with the RSFG/RFFG as if it were their router. While the EDFG could actively serve as a proxy for the RSFG/RFFG, this would require significant control coordination. In addition, complexity would be introduced when multiple RSFGs/RFFGs were used. By
using bridging for this base communication, a number of positive goals are achieved:
The default path to the RSFG/RFFG exists reliably.
Communication exists without excess coordination.
The same mechanisms used to support the distributed subnet also support control/default communication.
The NHRP is being developed by the IETF. It allows stations attached to ATM (or other NBMA technologies) to obtain the information needed to establish direct communication
(virtual circuits over ATM) to communications peers on the ATM that are not in the same subnet. These communications peers may be NBMA-attached hosts, or they may be the routers that are the entry and egress to/from ATM for a particular destination. The IETF has specified a flexible, extensible protocol that supports these needs. This protocol is used by MPOA for its registration and address resolution needs.
The ATM host functional group (AHFG) represents the functions that an ATM-attached host (which
does not wish to use LANE) needs to support to participate in MPOA. Principally, these include registration and the use of the NHRP query/response protocol. The AHFG also may use default forwarding when it does not wish to establish a direct virtual circuit to its communications peer.
The IASG coordination functional group (ICFG) and default forwarder functional group (DFFG) together provide the capabilities required to coordinate AHFG participation in the IASG. Specifically, the ICFG accepts registrations
from MPOA AHFG, and answers queries about those hosts. The DFFG provides default forwarding services to those ATM attached hosts. It also provides forwarding between the LANE portion of the IASG and the AHFGs.
The RSFG and RFFG provide the control functions associated with a traditional router. The RSFG runs the routing protocols, and it also manages the cache information required by the edge devices. It participates in the NHRP resolution protocol used by all aspects of MPOA for address resolution. The
RFFG provides default, connectionless-packet forwarding between IASGs. It forwards packets according to the routing information maintained by the RSFG.
An edge device
The MPOA system includes edge devices that are responsible for bridging and cut-through, and for internetworking level forwarding. These devices logically consist of a bridge, usually supporting multiple virtual LANs, and the edge device functional group, as shown in Figure 2.
Derived from the MPOA baseline specification, the
edge device functional group (EDFG) is a logical layering on top of a bridge. The EDFG has two logical functions: flow detection and flow utilization. The EDFG must monitor the flow of traffic to MPOA-participating destinations. Using LANE, MPOA provides identification so that an EDFG can recognize which MAC destinations are associated with MPOA. Internetworking traffic to those destinations is monitored for flows to specific internetworking destinations. If a flow is detected (by a sufficient number of
packets in a short period of time), then a query is sent to determine what the destination of that flow (MAC and internetworking address pair) should be.
Once a flow has been detected and resolved, an ATM virtual circuit is established. Then, traffic for that MAC/internetworking destination is redirected over that flow. The MAC headers are stripped as part of the redirection, and the MPOA-specific header is added.
While there are many ways to implement an MPOA edge device, I will describe one common
approach to the implementation. The description will use IP as the example internetworking protocol. It will also assume that bridging is supported by 802.1D. For flow detection, we will assume that the criterion is the transmission of more than five packets in 1 second. Depending on the exact hardware and software architecture that this system is embedded in, there are likely to be significant optimizations over and above what is described here. The description here is intended to convey the necessary aspects of
implementation, not the precise methodology.
At the center of an MPOA edge device is an implementation of a bridge. In the usual case, this is an 802.1D-compliant bridge with virtual LAN support. There is an instantiation of the 802.1D spanning tree finite state machinery. This will control what ports are blocked, and what ports are receiving data. The EDFG machinery being described here is never addressed by the "local" MAC addresses. Therefore, all MPOA traffic is subject to the spanning tree
behaviors.
Upon receiving a packet, the bridging system performs the usual source learning and blocked-port checking. The packet is then analyzed to find the internetworking protocol (if such can be understood from the packet). It then uses the combination of ingress port, protocol, and destination MAC to find a bridging table entry. (Different architectures and approaches will use these fields in different orders. The listed order provides a very general system behavior.) The bridging table will give an egress
port and an internetworking flows table address. If MPOA is not interested in the MAC address, the flows table pointer will be null.
The egress port is immediately checked for two error conditions. If the egress port is blocked, or is the same as the ingress port, then an error indication is returned to the caller. If the caller was normal bridging, the error will be ignored and the packet discarded. The MPOA EDFG function can take action on certain error conditions.
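The lookup and the two egress checks described above can be sketched as follows. This is a minimal illustration rather than a prescribed design: the names `BridgeEntry`, `EgressError`, and `bridge_lookup` are hypothetical, and a Python dictionary keyed on (ingress port, protocol, destination MAC) stands in for whatever table structure a real bridge would use.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set, Tuple

BridgeKey = Tuple[int, str, str]  # (ingress port, protocol, destination MAC)


@dataclass
class BridgeEntry:
    egress_port: int
    # Reference to the internetworking flows table for this MAC.
    # None means MPOA is not interested in this MAC address.
    flows: Optional[dict] = None


class EgressError(Exception):
    """Egress port is blocked, or is the same as the ingress port."""


def bridge_lookup(table: Dict[BridgeKey, BridgeEntry], blocked: Set[int],
                  ingress_port: int, protocol: str,
                  dst_mac: str) -> Optional[BridgeEntry]:
    """Return the bridging entry, None if unknown, or raise EgressError."""
    entry = table.get((ingress_port, protocol, dst_mac))
    if entry is None:
        return None  # unknown destination: normal bridging floods the frame
    if entry.egress_port in blocked:
        raise EgressError("egress port is blocked")
    if entry.egress_port == ingress_port:
        raise EgressError("egress port equals ingress port")
    return entry
```

A plain bridging caller would catch `EgressError` and discard the packet, while the EDFG path could catch it and report the condition to its control peer.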
If the flows entry is non-null, then
the internetworking destination of the packet must be checked. The protocol classification has provided enough information to locate the destination. The flows table can use any one of a number of table structures. Patricia trees, radix trees, and simple binary trees are common. Fortunately, there is no need for hierarchical look-up in this context, as all entries are host routes. (Support for atomic prefixes is permitted by the specification, but not required.) The flow table consists of:
Control address for queries
One entry per detected internetworking address, each containing:
Internetworking address
Occurrence count
Timestamp
MPOA ATM address (or null if not yet known)
Flags (query-sent, VC-in-progress, do-no-VC, MPOA-tagging)
VPI/VCI (or 0 if not yet established)
MPOA tag.
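One possible in-memory layout for this table is sketched below. The class and field names (`FlowTable`, `FlowEntry`, `FlowFlags`, `touch`) are illustrative assumptions, not names from the specification. Because all entries are host routes, the sketch uses an exact-match dictionary in place of the tree structures mentioned above.

```python
from dataclasses import dataclass, field
from enum import Flag, auto
from typing import Dict, Optional, Tuple


class FlowFlags(Flag):
    NONE = 0
    QUERY_SENT = auto()
    VC_IN_PROGRESS = auto()
    DO_NO_VC = auto()
    MPOA_TAGGING = auto()


@dataclass
class FlowEntry:
    address: str                          # internetworking (IP) address
    count: int = 1                        # occurrence count
    timestamp: float = 0.0                # time the entry was created
    atm_address: Optional[bytes] = None   # None until resolved
    flags: FlowFlags = FlowFlags.NONE
    vpi_vci: Tuple[int, int] = (0, 0)     # (0, 0) until a VC exists
    tag: Optional[int] = None             # MPOA tag, if the far end gave one


@dataclass
class FlowTable:
    control_address: bytes                # where queries are sent
    entries: Dict[str, FlowEntry] = field(default_factory=dict)

    def touch(self, address: str, now: float) -> FlowEntry:
        """Create the entry on first sight, otherwise bump its count."""
        entry = self.entries.get(address)
        if entry is None:
            entry = FlowEntry(address=address, count=1, timestamp=now)
            self.entries[address] = entry
        else:
            entry.count += 1
        return entry
```

Hardware-assisted designs would likely pack the flags and VPI/VCI into a fixed-size record. The dataclass form is only meant to make the fields explicit.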
If no entry is found in the flow table for the destination internetworking address, then one is created, with an occurrence count of one, and a timestamp
equal to the current time.
If an entry is found, then the occurrence count is updated. If the entry does not have an ATM address (and no query has been sent), then the occurrence count and timestamp are used to decide if a query is needed. If so, and unless the do-no-VC flag is set, a query is built (with the MPOA tagging extension) and sent to the MPOA control address in the table. The query-sent bit is used to prevent excessive queries. Meanwhile, the packet is handed back to the bridging service to send.
The do-no-VC flag is set to reflect a failure of a query, and thereby prevent excessive queries.
If there is an ATM address, but there is no VC (and no VC in progress), then the occurrence count and timestamp are used to decide if one is needed (again, allowing for the do-no-VC flag). If so, VC establishment is begun, and the VC-in-progress bit is set to reflect this. In the absence of the VC, even if one is in progress, the packet is handed back to the bridging service for transmission.
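The per-packet decisions in the preceding paragraphs can be summarized in a small function. This is a sketch under stated assumptions: it uses the example threshold of more than five packets in 1 second, it restarts the counting window inline rather than in a periodic scan, and the names `Flow` and `on_packet` and the returned action strings are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

FLOW_PACKETS = 5    # a flow is "more than five packets ..."
FLOW_WINDOW = 1.0   # "... in 1 second"


@dataclass
class Flow:
    count: int = 1
    timestamp: float = 0.0
    atm_address: Optional[str] = None
    has_vc: bool = False
    query_sent: bool = False
    vc_in_progress: bool = False
    do_no_vc: bool = False


def on_packet(flow: Flow, now: float) -> str:
    """Update the entry for one packet and return the action to take."""
    if flow.has_vc:
        return "cut-through"
    if now - flow.timestamp > FLOW_WINDOW:
        # Window expired without a flow: start counting again.
        flow.count, flow.timestamp = 1, now
    else:
        flow.count += 1
    is_flow = flow.count > FLOW_PACKETS
    if is_flow and not flow.do_no_vc:
        if flow.atm_address is None:
            if not flow.query_sent:
                flow.query_sent = True
                return "bridge+query"
        elif not flow.vc_in_progress:
            flow.vc_in_progress = True
            return "bridge+setup-vc"
    # In every other case the packet simply goes back to the bridge.
    return "bridge"
```

Every path that does not already have a VC still returns the packet to the bridging service, so traffic keeps moving while resolution and VC setup are under way.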
If there is an MPOA
VC, then the MAC header is stripped from the frame, and an MPOA LLC/SNAP header is added. If tagging was provided by the far end, then the tag (and appropriate LLC/SNAP) is also added. The packet is then sent on the MPOA cut-through VC.
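As a rough illustration of the redirection step, the sketch below strips an untagged Ethernet II header and prepends an LLC/SNAP header. Several details are assumptions made for the example: the untagged case uses the RFC 1483 routed-IPv4 LLC/SNAP header, the tagged header bytes are a placeholder rather than values from the MPOA specification, and the tag is treated as a 32-bit quantity.

```python
import struct
from typing import Optional

ETH_HEADER_LEN = 14  # destination MAC (6) + source MAC (6) + EtherType (2)

# RFC 1483 LLC/SNAP header for routed IPv4: AA-AA-03, OUI 00-00-00, type 08-00.
LLC_SNAP_IPV4 = bytes.fromhex("aaaa030000000800")
# Placeholder only: the real "tagged" LLC/SNAP value is defined by the
# MPOA specification and is NOT reproduced here.
LLC_SNAP_TAGGED = bytes.fromhex("aaaa03000000ffff")


def encapsulate(frame: bytes, tag: Optional[int] = None) -> bytes:
    """Replace the MAC header with an LLC/SNAP header (and tag, if any)."""
    if len(frame) < ETH_HEADER_LEN:
        raise ValueError("frame shorter than an Ethernet header")
    payload = frame[ETH_HEADER_LEN:]
    if tag is None:
        return LLC_SNAP_IPV4 + payload
    # Assumed 32-bit tag in network byte order, placed before the payload.
    return LLC_SNAP_TAGGED + struct.pack("!I", tag) + payload
```

Token Ring frames and 802.3 frames with their own LLC header would need different header-length handling, which the sketch leaves out.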
There are many legitimate implementation variations on this aspect of the system. The timestamp may be replaced with a more complex structure for detecting flows. The upper-level protocol and port numbers in the packet may be examined for policy-based decisions to
establish a VC even before enough packets have been seen to warrant a flow. (An FTP data connection is recognizable, and likely to warrant a VC, for example.)
When the bridge packet handling detects a flow, an NHRP query is generated. This is sent to the control address associated with the table entry. A query is also generated if an NHRP trigger message is received. An NHRP trigger message is sent by the RSFG/RFFG if it has noticed something that it thinks the EDFG should treat as a flow. If such a message is
received, it will have the MAC address in it and will be received over a control VC (therefore having a known control ATM address). Using this, a cache entry is either created or updated, and the appropriate NHRP query is generated.
When an NHRP response is received, it causes an update of the cache entry. The extension fields in the query contain enough information to find the flow table entry. If the query succeeds, the table entry is updated with the ATM address (and tagging if provided by the remote
end), and a VC is established. (Auxiliary control structures should be used so that VC establishment causes the entries in the relevant flow table to be updated. There can be multiple flow entries waiting on a single VC establishment.) If the query resulted in failure, the "do-no-VC" bit is set to prevent excessive retries.
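The auxiliary structure mentioned above, in which several flow entries wait on one VC establishment, might look like the following. The class and method names are hypothetical, flow entries are reduced to small dictionaries, and a failed query is represented simply by passing no ATM address.

```python
from typing import Dict, List, Optional, Tuple


class CutThroughResolver:
    def __init__(self) -> None:
        self.flows: Dict[str, dict] = {}           # IP address -> flow entry
        self.waiting: Dict[str, List[str]] = {}    # ATM address -> waiting IPs
        self.vcs: Dict[str, Tuple[int, int]] = {}  # ATM address -> (VPI, VCI)

    def add_flow(self, ip: str) -> None:
        self.flows[ip] = {"atm": None, "tag": None, "vc": None,
                          "do_no_vc": False}

    def on_nhrp_response(self, ip: str, atm: Optional[str] = None,
                         tag: Optional[int] = None) -> str:
        flow = self.flows[ip]
        if atm is None:                  # query failed
            flow["do_no_vc"] = True      # prevents excessive retries
            return "failed"
        flow["atm"], flow["tag"] = atm, tag
        if atm in self.vcs:              # a VC to that peer already exists
            flow["vc"] = self.vcs[atm]
            return "attached"
        first = atm not in self.waiting
        self.waiting.setdefault(atm, []).append(ip)
        # Only the first waiter triggers signalling; the rest just queue.
        return "setup-started" if first else "queued"

    def on_vc_established(self, atm: str,
                          vpi_vci: Tuple[int, int]) -> List[str]:
        """Attach the new VC to every flow entry that was waiting on it."""
        self.vcs[atm] = vpi_vci
        ready = self.waiting.pop(atm, [])
        for ip in ready:
            self.flows[ip]["vc"] = vpi_vci
        return ready
```

Keying the waiting list on the ATM address means two destinations behind the same egress device share one signalling attempt.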
The selection of attributes for the VC is a local matter. Usually, UBR or ABR with 0 minimum cell rate is going to be used, but other variations are possible. If one chooses to
establish VCs with varying QoS attributes, it is even possible to have multiple VCs, although a more complex table structure is clearly necessary.
Also, the failure handling may be more complex. It is possible to establish time-intervals after which failures are retried, or additional information may be available that indicates that retries are useful.
The EDFG will receive MPOA NHRP requests. These will contain egress cache information and may contain a tagging extension.
The egress cache information
specifies the handling for a received internetworking frame. It specifies a particular protocol, ATM source address, and internetworking destination address. For internetworking packets received with that combination (over a cut-through NHRP VC), the egress cache information indicates what MAC header should be put on the frame, and how the arrival should be treated. MPOA treats all arriving internetworking packets as if they had logically arrived over a specific emulated LAN (thus providing an arriving bridge
port for the bridging system).
The tagging extension allows the EDFG to attempt to optimize received packet processing. If the tagging extension is present, the EDFG may provide a tag value that the remote sender will include in all packets to the given IP destination. For example, the EDFG may provide the memory address of the egress cache information.
The egress cache imposition information is stored so that when VC(s) are established or received from the remote party, the information, including the
identity of the RSFG which sent the imposition, is associated with that VC.
One important part of the system is the existence of the flow table and the ATM control address associated with that. This is based on information from LANE control. Whenever an unknown destination is being flooded onto ATM, a LANE proxy (bridge) performs an LE-ARP operation. If this results in the system resolving the address, an ATM address of a LANE peer is returned. If that LANE peer is also an MPOA device, the ATM device
type and ATM control address are returned in an extension to the LE-ARP message. (The EDFG's type and control address are provided in the LE-ARP request.) If the type is anything other than an EDFG, then the EDFG creates a flow entry for the associated MAC address and stores the associated control address in the table. The MPOA type and control address are also stored in information about the LANE VCs being established. If the VC is not associated with MPOA, that is stored so as to prevent repeated inquiries.
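The bookkeeping triggered by an LE-ARP reply could be sketched like this. The function name, the string device types ("EDFG", "RSFG"), and the dictionary layouts are assumptions made for the example; the actual extension format comes from the specification.

```python
from typing import Dict, Optional, Tuple


def on_le_arp_reply(flow_tables: Dict[str, dict],
                    vc_info: Dict[str, Tuple[Optional[str], Optional[str]]],
                    mac: str, lane_atm: str,
                    mpoa_type: Optional[str] = None,
                    control_atm: Optional[str] = None) -> bool:
    """Record the MPOA details of a LANE peer.

    Returns True if a new flow table was created for the MAC address.
    """
    # Remember the answer per LANE VC, including "not MPOA" (None, None),
    # so that the same peer is not asked again.
    vc_info[lane_atm] = (mpoa_type, control_atm)
    if mpoa_type is None or mpoa_type == "EDFG":
        return False  # no flow monitoring toward plain LANE peers or EDFGs
    if mac in flow_tables:
        flow_tables[mac]["control"] = control_atm
        return False
    flow_tables[mac] = {"control": control_atm, "entries": {}}
    return True
```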
LANE clients are permitted (and encouraged) to learn about reachability from incoming LANE frames. If the frame is coming in over a VC that already has the LE-ARP tagging information, then that is associated with the learned MAC address. If there is no such information, then this is a remotely initiated VC for LANE (one for which the local device has never done an LE-ARP). Therefore, using the source MAC of the received frame, an LE-ARP is done in order to get the MPOA control information.
When a packet is received whose LLC/SNAP information indicates it is MPOA data, it is processed by the EDFG.
If the packet contains an MPOA tag (there is a special LLC/SNAP to indicate this), then the tag is used as a lookup to check/find the egress cache information. If tagging is used, the cache entries must be validated. The ATM address in the cache entry is checked against the arriving VC, and the IP destination in the cache entry is checked against the destination in the packet.
If there is no tag in the packet, then the
arriving VC, protocol, and destination IP address are used to find an egress cache entry. If the entry indicates invalid information, then an error should be sent to the control MPOA device.
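A sketch of the two egress lookup paths, with the tag validation described above, follows. All names are hypothetical. The arriving VC is identified by its source ATM address, and the tag is a small integer handed out by the cache instead of the memory address used as an example earlier.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class EgressEntry:
    protocol: str
    source_atm: str      # ATM address of the sender of the cut-through VC
    dest_ip: str
    mac_header: bytes    # header to put on the frame before bridging
    elan_port: int       # emulated LAN the frame "arrives" on
    valid: bool = True


class EgressCache:
    def __init__(self) -> None:
        self.by_key: Dict[Tuple[str, str, str], EgressEntry] = {}
        self.by_tag: Dict[int, EgressEntry] = {}
        self._next_tag = 1

    def impose(self, entry: EgressEntry, want_tag: bool = False) -> Optional[int]:
        """Store an imposed entry; optionally allocate a tag for it."""
        self.by_key[(entry.source_atm, entry.protocol, entry.dest_ip)] = entry
        if not want_tag:
            return None
        tag, self._next_tag = self._next_tag, self._next_tag + 1
        self.by_tag[tag] = entry
        return tag

    def lookup(self, vc_atm: str, protocol: str, dest_ip: str,
               tag: Optional[int] = None) -> Optional[EgressEntry]:
        """Return the entry, or None when an error should be reported."""
        if tag is not None:
            entry = self.by_tag.get(tag)
            # A tag is only a hint: validate it against the VC and packet.
            if (entry is None or entry.source_atm != vc_atm
                    or entry.dest_ip != dest_ip):
                return None
        else:
            entry = self.by_key.get((vc_atm, protocol, dest_ip))
        if entry is None or not entry.valid:
            return None
        return entry
```

A `None` result corresponds to the case where an error is sent to the controlling MPOA device.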
Assuming a proper egress cache entry exists, then the MAC header is added to the frame, and the frame is handed off to the bridging system. The ostensible source of the frame is the port to the emulated LAN indicated in the cache entry. If the bridging system returns an error, then an error indication is sent to the
MPOA control address associated with the cache entry.
Clearly, several timers are required for this system. Most are coarse enough in granularity that a periodic scan of the tables is sufficient.
All NHRP-derived entries have a lifetime. Information derived from a local query (ingress information) must be reverified prior to lifetime expiration. Egress information must be discarded if it is not refreshed before a lifetime expires.
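The asymmetry between ingress and egress lifetimes can be shown in a few lines. This is a sketch only: the function name, the use of absolute expiry times, and the 30-second refresh margin are assumptions for illustration.

```python
from typing import Dict, List


def scan_lifetimes(ingress: Dict[str, float], egress: Dict[str, float],
                   now: float, refresh_margin: float = 30.0) -> List[str]:
    """Periodic lifetime scan; both tables map a key to an expiry time.

    Returns ingress keys to re-query; deletes egress entries that expired.
    """
    # Ingress information must be reverified BEFORE its lifetime runs out.
    to_reverify = [key for key, expiry in ingress.items()
                   if expiry - now <= refresh_margin]
    # Egress information is simply dropped if nobody refreshed it in time.
    for key in [key for key, expiry in egress.items() if expiry <= now]:
        del egress[key]
    return to_reverify
```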
A periodic updating of the flow-checking table will manage the updating
of counters for correct detection of flows. Also, this will detect if the flow is not being used, so that the VC may be torn down.
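A periodic pass over the flow table might be written as below. The names are hypothetical, the 60-second idle timeout is an arbitrary example value, and the `last_seen` field is an addition for the sketch (the table described earlier carries a single timestamp).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Flow:
    count: int
    timestamp: float     # start of the current counting window
    last_seen: float     # time of the most recent packet (assumed field)
    vc: Optional[Tuple[int, int]] = None


def age_flows(flows: Dict[str, Flow], now: float,
              window: float = 1.0, idle_timeout: float = 60.0) -> List[str]:
    """Reset stale counters; return addresses whose VC has gone idle."""
    idle = []
    for address, flow in flows.items():
        if flow.vc is not None:
            if now - flow.last_seen > idle_timeout:
                idle.append(address)   # candidate for VC teardown
        elif now - flow.timestamp > window:
            # No flow was detected in this window: start a fresh one.
            flow.count, flow.timestamp = 0, now
    return idle
```

The caller decides what to do with the returned addresses. A VC shared by several flow entries should be released only when all of them are idle.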
The MPOA edge device is an enhanced bridge. The bridge is enhanced to understand internetworking packet formats and to be able to detect flows. This article outlines the logic for these activities. Details of the formats and precise operational requirements will be in the forthcoming MPOA specification. Issues such as configuration are also dealt with there. VLAN
configuration is left to other standards groups, or individual designers.
Author's note: Since this article was composed, certain features have been deferred to later phases in the ATM Forum. Specifically, the support for multiple emulated LAN segments within an IASG has been deferred. Additionally, the ICFG/DFFG elements have been deferred, although ATM native hosts are still supported in their own subnets. None of these deferrals has any significant effect on the design and implementation of an MPOA edge device.
Joel Halpern is a principal engineer at Newbridge Networks in Herndon, VA. He is the routing director in the IETF, dealing with IP routing protocols, and is also a participant in the multiprotocol over ATM activity in the ATM Forum. He is also a contributing editor to Communication Systems Design.