-
-
-
-
-
-
- Network Working Group G. Armitage
- Request for Comments: 2121 Bellcore
- Category: Informational March 1997
-
-
- Issues affecting MARS Cluster Size
-
- Status of this Memo
-
- This memo provides information for the Internet community. This memo
- does not specify an Internet standard of any kind. Distribution of
- this memo is unlimited.
-
- Abstract
-
- IP multicast over ATM currently uses the MARS model [1] to manage the
- use of ATM pt-mpt SVCs for IP multicast packet forwarding. The scope
- of any given MARS service is the MARS Cluster - typically the same
- as an IPv4 Logical IP Subnet (LIS). Current IP/ATM networks are
- usually architected with unicast routing and forwarding issues
- dictating the sizes of individual LISes. However, as IP multicast is
- deployed as a service, the size of a LIS will only be as big as a
- MARS Cluster can be. This document provides a qualitative look at the
- issues constraining a MARS Cluster's size, including the impact of VC
- limits in switches and NICs, geographical distribution of cluster
- members, and the use of VC Mesh or MCS modes to support multicast
- groups.
-
- 1. Introduction
-
- A MARS Cluster is the set of IP/ATM interfaces that are willing to
- engage in direct, ATM level pt-mpt SVCs to perform IP multicast
- packet forwarding [1]. Each IP/ATM interface (a MARS Client) must
- keep state information regarding the ATM addresses of each leaf node
- (recipient) of each pt-mpt SVC it has open. In addition, each MARS
- Client receives MARS_JOIN and MARS_LEAVE messages from the MARS
- whenever there is a requirement that Clients around the Cluster need
- to update their pt-mpt SVCs for a given IP multicast group.
-
- The definition of Cluster 'size' can mean two things - the number of
- MARS Clients using a given MARS, and the geographic distribution of
- MARS Clients. The number of MARS Clients in a Cluster impacts on the
- amount of state information any given client may need to store while
- managing outgoing pt-mpt SVCs. It also impacts on the average rate
- of JOIN/LEAVE traffic that is propagated by the MARS on
- ClusterControlVC, and the number of pt-mpt VCs that may need
- modification each time a MARS_JOIN or MARS_LEAVE appears on
- ClusterControlVC.
-
-
-
- Armitage Informational [Page 1]
-
- RFC 2121 Issues affecting MARS Cluster Size March 1997
-
-
- The geographic distribution of clients affects the latency between a
- client issuing a MARS_JOIN, and it finally being added onto the pt-
- mpt VCs of the other MARS Clients transmitting to the specified
- multicast group. (This latency is made up of both the time to
- propagate the MARS_JOIN, and the delay in the underlying ATM cloud's
- reaction to the subsequent ADD_PARTY messages.)
-
- When architecting an IP/ATM network it is important to understand the
- worst case scaling limits applicable to your Clusters. This document
- provides a primarily qualitative look at the design choices that
- impose the most dramatic constraints on Cluster size. Since the focus
- is on worst-case scenarios, most of the analysis will assume
- multicast groups that are VC Mesh based and have all cluster members
- as sources and receivers. Engineering using the worst-case boundary
- conditions, then applying optimisations such as Multicast Servers
- (MCS), provides the Cluster with a margin of safety. It is hoped
- that more detailed quantitative analysis of Cluster sizing limits
- will be prompted by this document.
-
- Section 2 comments on the VC state requirements of the MARS model,
- while Sections 3 and 4 identify the group change processing load and
- latency characteristics of a cluster as a function of its size.
- Section 5 looks at how Multicast Routers (both conventional and
- combination router/switch architectures) increase the scale of a
- multicast capable IP/ATM network. Finally, Section 6 discusses how
- the use of Multicast Servers (MCS) might impact on the worst case
- Cluster size limits.
-
-
- 2. VC state limitations
-
- Two characteristics of ATM NICs and switches will limit the number of
- members a Cluster may contain. They are:
-
- The maximum number of VCs that can be originated from, or
- terminate on, a port (VCmax).
-
- The maximum number of leaf nodes supportable by a root node
- (LEAFmax).
-
- We'll assume that the MARS node has VCmax and LEAFmax values similar
- to those of Cluster members. VCmax affects the Cluster size because
- of the following:
-
- The MARS terminates a pt-pt control VC from each cluster member,
- and originates a VC for ClusterControlVC and ServerControlVC.
-
- When a multicast group is VC Mesh based, a group member terminates
- a VC from every sender to the group, per group.
-
- When a multicast group is MCS based, the MCS terminates a VC from
- every sender to the group.
-
- LEAFmax affects the Cluster size because of the following:
-
- ClusterControlVC from the MARS. It has a leaf node per cluster
- member (MARS Client).
-
- Packet forwarding SVCs out of each MARS Client for each IP
- multicast group being sent to. It has a leaf node for each group
- member when a group is VC Mesh based.
-
- Packet forwarding SVCs out of each MCS for each IP multicast group
- being sent to. It has a leaf node for each group member when a
- group is MCS based.
-
- If we have N cluster members, and M multicast groups active (using VC
- Mesh mode, and densely populated - all receivers are senders), the
- following observations may be made:
-
- ClusterControlVC has N leaf nodes, so
- N <= LEAFmax.
-
- The MARS terminates a pt-pt VC from each cluster member, and
- originates ClusterControlVC and ServerControlVC, so
- (N+2) <= VCmax.
-
- Each Cluster Member sources 1 VC per group, terminates (N-1) VCs
- per group, originates a pt-pt VC to the MARS, and terminates 1 VC
- as a leaf on ClusterControlVC, so
- (M*N) + 2 <= VCmax.
-
- The VC sourced by each Cluster member per group goes to all other
- cluster members, so
- (N-1) <= LEAFmax.
-
- Since all the above conditions must be simultaneously true, we can
- see that the most constraining requirement is either:
-
- (M*N) + 2 <= VCmax.
-
- or
-
- N <= LEAFmax.
-
- The limit involving VCmax is fundamentally controlled by the VC
- consumption of group members using a VC Mesh for data forwarding,
- rather than the termination of pt-pt control VCs on the MARS. (It is
- in practice going to be very dependent on the multicast group
- membership distributions within the cluster.)
-
- The LEAFmax limit comes from ClusterControlVC, and is independent of
- the density of group members (or the ratios of senders to receivers)
- for active multicast groups within the cluster.
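-
- The four worst-case conditions above can be combined into a single
- bound on N. The following is a minimal sketch only: the function
- name and the sample VCmax and LEAFmax figures are illustrative
- assumptions, not values taken from any particular NIC or switch.
-
```python
def mesh_cluster_limit(vc_max: int, leaf_max: int, groups: int) -> int:
    """Largest N meeting the worst-case VC Mesh conditions of section 2.

    Assumes M = groups densely populated VC Mesh groups (all receivers
    are senders) and identical limits on the MARS and every member.
    """
    if groups < 1:
        raise ValueError("groups must be at least 1")
    limits = [
        leaf_max,                # ClusterControlVC: N <= LEAFmax
        vc_max - 2,              # MARS: (N+2) <= VCmax
        (vc_max - 2) // groups,  # each member: (M*N) + 2 <= VCmax
        leaf_max + 1,            # per-group forwarding VC: (N-1) <= LEAFmax
    ]
    return max(0, min(limits))


# Hypothetical figures: 1024 VCs per port, 4096 leaves per root, 10 groups.
print(mesh_cluster_limit(vc_max=1024, leaf_max=4096, groups=10))
```
-
- With these assumed figures the per-member VCmax condition dominates,
- which matches the observation that VC Mesh consumption at group
- members, rather than control VC termination at the MARS, sets the
- limit.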
-
- Under UNI 3.0/3.1 the most obvious limit on LEAFmax is 2^15 (the leaf
- node ID is 15 bits wide). However, the signaling driver software for
- most ATM NICs may impose a limit much lower than this - a function of
- how much per-leaf node state information they need to store (and are
- capable of storing) for pt-mpt SVCs.
-
- VCmax is constrained by the ATM NIC hardware (for available
- segmentation or reassembly instances), or by the VC capacity of the
- switch port that the NIC is attached to. VCmax will be the smaller
- of the two.
-
- A MARS Client may impose its own state storage limitations, such that
- the combined memory consumption of a MARS Client and the ATM NIC's
- driver in a given host limits both LEAFmax and VCmax to values lower
- than the ATM NIC alone might have been able to support.
-
- It may be possible to work around LEAFmax limits by distributing the
- leaf nodes across multiple pt-mpt SVCs operating in parallel.
- However, such an approach requires further study, and doesn't solve
- the VCmax limitation associated with a node terminating too many VCs.
-
- A related observation can also be made that the number of MARS
- Clients in a Cluster may be limited by the memory constraints of the
- MARS itself. It is required to keep state on all the groups that
- every one of its MARS Clients has joined. For a given memory limit,
- the maximum number of MARS Clients must drop if the average number of
- groups joined per Client rises. Depending on the level of group
- memberships, this limitation may be more severe than LEAFmax.
-
-
- 3. Signaling load
-
- In any given cluster there will be an 'ambient' level of
- MARS_JOIN/LEAVE activity. The dynamic characteristics of this
- activity will depend on the types of multicast applications running
- within the cluster. For a constant relative distribution of multicast
- applications we can assume that, as the number of MARS Clients in a
- given cluster rises, so does the ambient level of MARS_JOIN/LEAVE
- activity. This increases the average frequency with which the MARS
- processes and propagates MARS_JOIN/LEAVE messages.
-
- The existence of MARS_JOIN/LEAVE traffic also has a consequential
- impact on signaling activity at the ATM level (across the UNI and
- {P}NNI boundaries). For groups that are VC Mesh supported, each
- MARS_JOIN or MARS_LEAVE propagated on ClusterControlVC will result in
- an ADD_PARTY or DROP_PARTY message sent across the UNIs of all MARS
- Clients that are transmitting to a given group. As a cluster's
- membership increases, so does the average number of MARS Clients that
- trigger ATM signaling activity in response to MARS_JOIN/LEAVEs.
-
- The size of a cluster needs to be chosen to provide some level of
- containment to this ambient level of MARS and UNI/NNI signaling.
-
- Some refinements to the MARS Client behaviour may also be explored to
- smooth out UNI signaling transients. MARS Clients are currently
- required to initiate revalidation of group memberships only when the
- Client next sends a packet to an invalidated group SVC. A Client
- could apply a similar algorithm to decide when it should issue
- ADD_PARTYs. For example, after seeing a MARS_JOIN, wait until it
- actually has a packet to send, send the packet, then initiate the
- ADD_PARTY. As a result actively transmitting Clients would update
- their SVCs sooner than intermittently transmitting Clients.
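-
- The deferred ADD_PARTY behaviour suggested above might be sketched
- as follows. This is an illustration under stated assumptions, not
- part of the MARS specification: the class and method names are
- hypothetical, UNI signaling is modelled as a simple log, and
- DROP_PARTY is issued immediately because the text above only
- proposes deferring ADD_PARTYs.
-
```python
class LazyMeshSender:
    """Hypothetical sender that defers ADD_PARTY until it next transmits."""

    def __init__(self):
        self.leaves = {}     # group -> ATM addresses already on the SVC
        self.pending = {}    # group -> addresses from MARS_JOINs, not yet added
        self.signalled = []  # (message, group, address) sent across the UNI

    def on_mars_join(self, group, atm_addr):
        # Only remember the new member; no UNI signaling happens yet.
        if atm_addr not in self.leaves.get(group, set()):
            self.pending.setdefault(group, set()).add(atm_addr)

    def on_mars_leave(self, group, atm_addr):
        # A member that leaves before being added never costs an ADD_PARTY.
        if atm_addr in self.pending.get(group, set()):
            self.pending[group].discard(atm_addr)
        elif atm_addr in self.leaves.get(group, set()):
            self.leaves[group].discard(atm_addr)
            self.signalled.append(("DROP_PARTY", group, atm_addr))

    def send_packet(self, group, packet):
        # Send on the SVC as it stands, then bring the SVC up to date.
        current = self.leaves.setdefault(group, set())
        delivered_to = sorted(current)
        for atm_addr in sorted(self.pending.pop(group, set())):
            current.add(atm_addr)
            self.signalled.append(("ADD_PARTY", group, atm_addr))
        return delivered_to
```
-
- A Client that never transmits to a group generates no ADD_PARTY
- traffic for it at all, while an active sender updates its SVC after
- its next packet.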
-
-
- 4. Group change latencies
-
- The group change latency can be defined as the time it takes for all
- the senders to a group to have correctly updated their forwarding
- SVCs after a MARS_JOIN or MARS_LEAVE is received from the MARS. This
- is affected by both the number of Cluster members and the
- geographical distribution of Cluster members. (Groups that are MCS
- based create the lowest impact when new members join or leave, since
- only the MCS needs to update its forwarding SVC.) Under some
- circumstances, especially modelling or simulation environments, group
- change latencies within a cluster may be an important characteristic
- to control.
-
- As noted in the previous section, the ADD_PARTY/DROP_PARTY signaling
- load created by membership changes in VC Mesh based groups goes up as
- the number of cluster members rises (assuming the worst case scenario of
- each cluster member being a sender to the group). As the UNI load
- rises, the ATM network itself may start delivering slower processing
- of the requested events.
-
- Wide geographic distribution of Cluster members also delays the
- propagation of MARS_JOIN/LEAVE and ATM UNI/NNI messages. The further
- apart various members are, the longer it takes for them to receive
- MARS_JOIN/LEAVE traffic on ClusterControlVC, and the longer it takes
- for the ATM network to react to ADD_PARTY and DROP_PARTY requests. If
- the long distance paths are populated by many ATM switches,
- propagation delays due to per-switch processing will add
- substantially to delays due to the speed of light.
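-
- A rough feel for how distance and switch count combine can be had
- from the sketch below. It is a deliberately crude model: the
- propagation speed of about 200,000 km/s in fibre, the per-switch
- processing time, and the treatment of each step as a single one-way
- traversal are all assumptions made for illustration.
-
```python
SPEED_IN_FIBRE_KM_PER_S = 200_000.0  # assumption: roughly two thirds of c


def one_way_delay(distance_km, switch_hops, per_switch_s):
    """Speed-of-light delay plus per-switch processing along one path."""
    return distance_km / SPEED_IN_FIBRE_KM_PER_S + switch_hops * per_switch_s


def group_change_latency(senders):
    """Worst case over all senders to the group.

    Each entry is a pair (join_path, add_party_path), where a path is a
    (distance_km, switch_hops, per_switch_s) tuple. The first path models
    the MARS_JOIN reaching the sender on ClusterControlVC, the second the
    ATM network reacting to the sender's ADD_PARTY.
    """
    return max(
        one_way_delay(*join_path) + one_way_delay(*add_party_path)
        for join_path, add_party_path in senders
    )
```
-
- The group change latency is set by the slowest sender, so a single
- distant member on a path with many switches determines the figure
- for the whole group.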
-
- (Unfortunately, mechanisms for smoothing out the transient ATM
- signaling load described in section 3 have a consequence of
- increasing the group change latency, since the goal is for some of
- the senders to deliberately delay updating their forwarding SVCs.
- This is an area where the system architect needs to make a
- situation-specific trade-off.)
-
- It is not clear what effect the internal processing of the MARS
- itself has on group change latency, and how this might be impacted by
- cluster size. A component of the MARS processing latency will depend
- on the specific database implementation and search algorithms as much
- as on the number of group members for the group being modified at any
- instant. Since the maximum number of group members for a given group
- is equal to the number of cluster members, there will be an indirect
- (even if small) relationship between worst case MARS processing
- latencies and cluster size.
-
-
- 5. Large IP/ATM networks using Mrouters
-
- Building a large scale, multicast capable IP over ATM network is a
- tradeoff between Cluster sizes and numbers of Mrouters. For a given
- number of hosts, the number of clusters goes up as individual
- clusters shrink. Since Mrouters are the topological intersections
- between clusters, the number of Mrouters rises as the size of
- individual clusters shrinks. (The actual number of Mrouters depends
- largely on the logical IP topology you choose to implement, since a
- single physical Mrouter may interconnect more than two Clusters at
- once.) It is a local deployment question as to what the optimal mix
- of Clusters and Mrouters will be.
-
- Currently two broad classes of Mrouters may be identified:
-
- Those that originate unique VCs into target Clusters, and
- forward/interleave data at the IP packet level (the Conventional
- Mrouter).
-
- Those that originate unique VCs into target Clusters, but create
- internal, cell level 'cut through' paths between VCs from
- different Clusters (e.g. the Cell Switch Router).
-
- How these Mrouters establish and manage the associations of VCs to IP
- traffic flows is beyond the scope of this document. However, it is
- worth looking briefly at their impact on VC consumption and ATM
- signaling load.
-
- 5.1 Impact of the Conventional Mrouter
-
- A conventional Mrouter acts as an aggregation point for both
- signaling and data plane loads. It hides host specific group
- membership changes in one cluster from senders within other clusters,
- and protects group members (receivers) in one cluster from having to
- be leaf nodes on SVCs from senders in other Clusters.
-
- When acting as an ingress point into a cluster, a conventional
- Mrouter establishes a single forwarding SVC for IP packets. This
- single SVC carries data from other clusters interleaved at the IP
- packet level. Only this single SVC needs to be modified in response
- to group memberships changes within the target cluster. As a
- consequence, there is no need for sources in other clusters to be
- aware of, or react to, MARS_JOIN/LEAVE traffic in the target cluster.
- (The consequential UNI signaling load identified in section 3 is also
- localized within the target Cluster.)
-
- MARS Clients within the target cluster also benefit from this data
- path aggregation because they terminate only one SVC from the Mrouter
- (per group), rather than multiple SVCs originating from actual
- senders in other Clusters.
-
- Conventional Mrouters help control the limiting factors described in
- sections 2, 3, and 4. A hypothetical 10000 node Cluster could be
- broken into two 5000 node Clusters, or four 2500 node Clusters, etc.,
- to reduce VC consumption. Or you might have 200 nodes of the overall
- 10000 that are known to join and leave groups rapidly, whilst the
- other 9800 are fairly steady - so you deploy clusters of 200, 2500,
- 2500, 2500, 2300 hosts respectively.
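-
- The effect of such a split on VC consumption can be illustrated
- with the worst-case member bound from section 2, (M*N) + 2. The
- sketch below assumes the Mrouter interface is counted among the N
- members of each cluster and uses a hypothetical M of 4 groups; the
- function names are illustrative.
-
```python
def member_vc_count(cluster_size: int, groups: int) -> int:
    """Worst-case VC Mesh consumption at one member: (M*N) + 2."""
    return groups * cluster_size + 2


def partition_report(cluster_sizes, groups):
    """Map each cluster size to the worst-case VC count of its members."""
    return {n: member_vc_count(n, groups) for n in cluster_sizes}


# One 10000 node Cluster versus the uneven split described above.
print(partition_report([10000], groups=4))
print(partition_report([200, 2500, 2300], groups=4))
```
-
- Per-member VC consumption falls in proportion to cluster size, so
- halving a Cluster roughly halves the VCmax each member requires.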
-
- 5.2 Impact of the Cell Switch Router (CSR)
-
- Another class of Mrouter, the Cell Switch Router (CSR) attempts to
- utilize IP level flow information to dynamically manage the switching
- of data through the device below the IP level. Once the CSR has
- identified a flow of IP traffic, and associated it with an inbound
- and outbound SVC, it begins to function as an ATM cell level device
- rather than a packet level device.
-
- Even when operating in this mode the CSR isolates attached Clusters
- from each other's MARS_JOIN/LEAVE activities, in the same manner as a
- conventional Mrouter. This occurs because the CSR manages its
- forwarding SVCs just like a normal MARS Client - responding to
- MARS_JOIN/LEAVE messages within the target cluster by updating the
- pt-mpt trees rooted on its own ATM ports.
-
- However, since AAL5 AAL_SDUs cannot be interleaved at the cell level
- on a single SVC, a CSR cannot simultaneously perform cell level cut-
- through and aggregate the IP packet flows from multiple senders onto
- a single SVC into a target Cluster. As a result, the CSR must
- construct a separate forwarding SVC into a target cluster for each
- SVC it is a leaf of in a source Cluster (to ensure that cells from
- individual sources are not interleaved prior to reaching the
- reassembly engines of the group members in the target cluster).
-
- Interestingly, the UNI signaling load offered within the target
- Cluster by the CSR is potentially greater than that of a conventional
- Mrouter. If there are N senders in the source Cluster, the CSR will
- have built N identical pt-mpt SVCs out to the group members within
- the target Cluster. If a new MARS_JOIN is issued within the target
- Cluster, the CSR must issue N ADD_PARTYs to update the N SVCs into
- the target Cluster. (Under similar circumstances a conventional
- Mrouter would have issued only one ADD_PARTY for its single SVC into
- the target Cluster.)
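-
- The difference in signaling load can be stated as a one-line rule.
- The sketch below simply restates the comparison above; the function
- name and the string labels are illustrative assumptions.
-
```python
def add_party_per_join(senders_in_source_cluster: int, mrouter: str) -> int:
    """ADD_PARTYs issued into the target Cluster for one new MARS_JOIN."""
    if mrouter == "conventional":
        return 1  # a single aggregated forwarding SVC to update
    if mrouter == "csr":
        return senders_in_source_cluster  # one forwarding SVC per sender
    raise ValueError("mrouter must be 'conventional' or 'csr'")
```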
-
- Thus, without the ability to provide internal cut-through forwarding
- with AAL_SDU boundaries intact, the CSR only provides for the
- isolation of MARS_JOIN/LEAVE traffic within clusters. It cannot
- provide the data path aggregation of a conventional Mrouter.
-
-
- 6. The impact of Multicast Servers (MCSs)
-
- Since the focus of this document is on worst-case scenarios, most of
- the analysis has assumed multicast groups that are VC Mesh based and
- have all cluster members as sources and receivers. The impact of
- using an MCS to support a multicast group can be dramatic in the
- context of the group's resource consumption, but less so in the
- over-all context of cluster size limits.
-
- The intra-cluster, per group impact of an MCS is somewhat analogous
- to the inter-cluster impact of a conventional Mrouter. The MCS
- aggregates the data flows (only 1 SVC terminates on each group
- member, independent of the number of senders), and isolates
- MARS_JOIN/LEAVE traffic (which is shifted to ServerControlVC rather
- than ClusterControlVC). The resulting UNI signaling traffic and load
- is reduced too, as only the forwarding SVC out of the MCS needs to be
- modified for every membership change in the MCS supported group.
-
- Deploying a mixture of MCS and VC Mesh based groups will certainly
- improve resource utilization. However, the actual extent of the
- improvements (and consequently how large the cluster can be made)
- will depend greatly on the dynamics of your typical applications and
- which characteristics from sections 2, 3, and 4 are your primary
- limitations.
-
- For example, if VCmax or LEAFmax (section 2) are primary limitations,
- one must keep in mind that each MCS itself suffers the same NIC
- limits as the MARS and MARS Clients. Even though using an MCS
- dramatically reduces the number of VCs per MARS Client per group,
- each MCS still needs to terminate 1 SVC per sender - potentially up
- to 1 SVC from each Cluster member. (This may become 1 SVC per member
- per group if the MCS supports multiple groups simultaneously.)
-
- Assume we have a Cluster where every group is MCS based, each MCS
- supports only one group, and both VCmax and LEAFmax apply equally to
- MCS nodes, the MARS, and MARS Client nodes. If we have N cluster
- members, M groups, and all receivers are senders for a given MCS
- supported group, the following observations may be made:
-
- Each MCS forwarding SVC has N leaf nodes, so
- N <= LEAFmax.
-
- Each MCS terminates an SVC from N senders, originates 1 SVC
- forwarding path, originates a pt-pt control SVC to the MARS, and
- terminates 1 SVC as a leaf on ServerControlVC, so
- N + 3 <= VCmax.
-
- MARS ClusterControlVC has N leaf nodes, so
- N <= LEAFmax.
-
- MARS ServerControlVC has M leaf nodes, so
- M <= LEAFmax.
-
- The MARS terminates a pt-pt VC from each cluster member, a pt-pt
- VC from each MCS, originates ClusterControlVC, and originates
- ServerControlVC, so
- N + M + 2 <= VCmax.
-
- Each Cluster Member sources 1 VC per group, terminates 1 VC per
- group, originates a pt-pt VC to the MARS, and terminates 1 VC as a
- leaf on ClusterControlVC, so
- 2*M + 2 <= VCmax.
-
- Since all the above conditions must be simultaneously true, we can
- see that the most constraining requirements are:
-
- N + M + 2 <= VCmax (if M <= N)
-
- 2*M + 2 <= VCmax (if M >= N)
- or
- N <= LEAFmax.
-
- (Assuming that in general M+2 > 3, so the VCmax constraint at each
- MCS is not a limiting factor.)
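-
- The conditions for the all-MCS case can be collected in the same
- way as the VC Mesh case of section 2. This is a sketch under the
- assumptions stated above (one group per MCS, every member a sender
- and receiver); the function name and sample figures are
- illustrative.
-
```python
def mcs_cluster_limit(vc_max: int, leaf_max: int, groups: int) -> int:
    """Largest N meeting the worst-case conditions when all M groups use an MCS.

    Returns 0 when M alone already violates a condition.
    """
    m = groups
    if m > leaf_max:        # ServerControlVC: M <= LEAFmax
        return 0
    if 2 * m + 2 > vc_max:  # each member: 2*M + 2 <= VCmax
        return 0
    limits = [
        leaf_max,           # forwarding SVC, ClusterControlVC: N <= LEAFmax
        vc_max - 3,         # each MCS: N + 3 <= VCmax
        vc_max - m - 2,     # MARS: N + M + 2 <= VCmax
    ]
    return max(0, min(limits))


# Hypothetical figures: 1024 VCs per port, 4096 leaves per root, 10 groups.
print(mcs_cluster_limit(vc_max=1024, leaf_max=4096, groups=10))
```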
-
- We can get a feel for the relative impacts of VC Mesh groups vs MCS
- based groups by considering a cluster where M1 represents the number
- of VC Mesh based groups, and M2 represents the number of MCS based
- groups. Again we assume worst case group density (all N cluster
- members are group members, all receivers are also senders).
-
- As noted in section 2, the VCmax constraint in VC Mesh mode comes
- from each MARS Client, and is:
-
- N*M1 <= VCmax - 2
-
- For the MCS case we have two scenarios, M2 <= N and M2 >= N.
-
- If M2 <= N we can see the VC consumption by VC Mesh based groups will
- become the applicable constraint on cluster size N when:
-
- N + M2 <= N*M1
- i.e.
- M1 >= 1 + (M2/N)
-
- Thus, if there is more than 1 VC Mesh based group, and fewer MCS based
- groups than cluster members (M2 < N), the constraint on cluster size
- is dictated by the VC Mesh characteristics: N*M1 <= VCmax - 2. (If M2
- == N, then there may be 2 VC Mesh based groups before the VC Mesh
- characteristics are the dictating factor.)
-
- Now, if M2 > N (more MCS based groups, and hence MCSes, than cluster
- members) the calculation is more complex since in this case VCmax at
- the MARS Client is the limiting parameter for both VC Mesh and MCS
- cases. The limit becomes:
-
- N*M1 + 2*M2 <= VCmax - 2
-
- However, on face value this is an odd situation anyway, since it
- implies more MCS entities than hosts or router interfaces into the
- cluster (given the assumption of one group per MCS).
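-
- The two mixed-mode cases can be checked mechanically. The sketch
- below restates the inequalities of this section as written; the
- function names are illustrative, and the figures in any call are
- assumptions rather than measurements.
-
```python
def mesh_is_dominant(n: int, m1: int, m2: int) -> bool:
    """True when VC Mesh groups set the VCmax limit (the M2 <= N case)."""
    if m2 > n:
        raise ValueError("comparison applies only when M2 <= N")
    return n + m2 <= n * m1  # equivalent to M1 >= 1 + (M2/N)


def fits_vc_max(n: int, m1: int, m2: int, vc_max: int) -> bool:
    """Check the VCmax conditions for M1 Mesh groups and M2 MCS groups."""
    if m2 <= n:
        # Mesh limit at each MARS Client and MCS-mode limit at the MARS.
        return n * m1 <= vc_max - 2 and n + m2 <= vc_max - 2
    # More MCS groups than members: the MARS Client limits both modes.
    return n * m1 + 2 * m2 <= vc_max - 2
```
-
- For example, with N = 100 and M2 = 10, a single VC Mesh group does
- not dominate (110 > 100) but two do (110 <= 200).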
-
- The impact of MCS entities that simultaneously support multiple
- groups is left for future study.
-
-
- 7. Open Issues
-
- There is a wide range of qualitative analysis that can be extracted
- from typical MARS deployment scenarios. This document does not
- attempt to develop any numerical models for VC consumptions, end to
- end latencies, etc.
-
-
- 8. Conclusion
-
- This document has provided a high level, qualitative overview of the
- parameters affecting the size of MARS Clusters. Limitations on the
- number of leaf nodes a pt-mpt SVC may support, sizes of the MARS
- database, propagation delays of MARS and UNI messages, and the
- frequency of MARS and UNI control messages are all identified as
- issues that will constrain Clusters. Conventional Mrouters are
- identified as useful aggregators of IP multicast traffic and
- signaling information. Cell Switch Routers are noted to offer only
- some of the aggregation attributes of conventional Mrouters. Large
- scale IP multicasting over ATM requires a combination of Mrouters and
- appropriately sized MARS Clusters. Finally, it has been shown that in
- a simple cluster where there are fewer MCS based groups than cluster
- members, two or more VC Mesh based groups are sufficient to render
- the use of Multicast Servers irrelevant to the worst case cluster
- size limit.
-
-
- Security Considerations
-
- Security issues are not discussed in this memo.
-
- Acknowledgments
-
- Thanks must go to Rajesh Talpade (Georgia Tech) for specific input on
- aspects of the VC Mesh vs MCS tradeoffs, and Joel Halpern (Newbridge)
- for general input on the document's focus.
-
-
- Author's Address
-
- Grenville Armitage
- Bellcore, 445 South Street
- Morristown, NJ, 07960
- USA
-
- EMail: gja@thumper.bellcore.com
- Phone: +1 201 829 2635
-
-
- References
-
- [1] Armitage, G., "Support for Multicast over UNI 3.0/3.1 based ATM
- Networks.", Bellcore, RFC 2022, November 1996.
-