InfiniBand scales as a network switch
By Rob Davis, Vice President, Advanced Engineering, QLogic Corp., Aliso Viejo, Calif. EE Times (09/18/00, 12:38 p.m. EST)
InfiniBand is a new interconnect architecture designed to significantly boost data transfers between servers and peripherals. InfiniBand abandons the shared-bus concept of PCI, shifting I/O control from processors to a channel-based, switched-fabric, point-to-point, full-duplex interconnect. This shift to intelligent I/O engines promises new levels of performance, scalability, and reliability in networking and attached storage.
To explore the current state of bus-based system architecture, the place to start is PCI. The conventional 32-bit, 33-MHz PCI local bus has become ubiquitous in servers and desktop computers for I/O devices. There are several reasons for this popularity, including PCI's processor independence, low-pin-count interface, and scalability up to 64-bit I/O performance. The technology has also benefited from evolutionary improvements.
Despite PCI's evolution, all PCI-based architectures still force devices to share the total available bandwidth. That approach was acceptable in nearly all environments when PCI was introduced years ago, but a growing number of distributed applications, such as e-commerce applications running in server clusters, call for a technology that can scale without degrading performance.
Pushed to the limit
PCI is also subject to signal-integrity and timing constraints that push the I/O interface to its limit. In some environments, these constraints make it difficult to deliver all data bits on the bus to their destination at the same time. Widening the bus would not help: a wider bus increases the likelihood that data bits arrive out of step with one another, making designs more difficult and devices more expensive.
Multivendor interoperability
InfiniBand, backed by the InfiniBand Trade Association (IBTA), is designed from the ground up with an eye to the future. Its well-layered, standardized architecture supports multivendor interoperability from day one and allows the architecture to evolve at the rate of the underlying technology.
InfiniBand specifies a 2.5-Gbit/s signaling rate per wire, with one-wide (1X), four-wide (4X), and 12-wide (12X) link widths. Data throughput thus ranges from 2.5 Gbits/s for a 1X link to 30 Gbits/s for a 12X link, resulting in lower latency and easier, faster sharing of data. By comparison, conventional PCI tops out at roughly 1 Gbit/s shared across all PCI slots, and even high-end PC servers with 64-bit, 66-MHz buses reach only about 4 Gbits/s of shared bandwidth.
InfiniBand also increases the amount of usable bandwidth. Data is sent serially over fiber-optic or copper cables and is encoded with redundant information to minimize data errors. The 4X link runs four signaling lanes in parallel for 10 Gbits/s, while the 12X link delivers 30 Gbits/s. Each InfiniBand link is full duplex. Further enhancing performance, InfiniBand employs a switched fabric that creates a direct, high-speed, virtual, dedicated channel between a server and other servers or I/O units.
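The arithmetic behind these figures is easy to check. The minimal sketch below assumes the redundant encoding mentioned above is 8b/10b line coding, an assumption not spelled out in this article; under it, roughly 80 percent of the raw signaling rate is left for data, which is also where the 24-Gbit/s processor-to-processor figure cited later in this article comes from.

# Back-of-the-envelope check of the link-rate figures quoted above.
# Assumption (not stated in this article): the redundant encoding is 8b/10b,
# so only 8 of every 10 transmitted bits carry data.

SIGNAL_RATE_GBPS = 2.5          # raw signaling rate per lane, per direction
ENCODING_EFFICIENCY = 8 / 10    # 8b/10b coding efficiency (assumed)

for width in (1, 4, 12):        # 1X, 4X and 12X link widths
    raw = width * SIGNAL_RATE_GBPS
    usable = raw * ENCODING_EFFICIENCY
    print(f"{width}X: {raw} Gbit/s raw, {usable} Gbit/s usable")
# Prints 2.5/2.0, 10.0/8.0 and 30.0/24.0 Gbit/s for the 1X, 4X and 12X widths.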
InfiniBand addresses scalability in several ways. First, InfiniBand offers scalable performance through multiple link widths. Second, the I/O fabric itself is designed to scale as workload increases without experiencing the latencies of shared-bus I/O architectures, as explained in more detail below. Third, InfiniBand's physical modularity obviates the need for customers to buy excess capacity up front in anticipation of future growth; instead, customers can buy what they need now and add capacity as their requirements grow, without disrupting existing operations.
InfiniBand's new form factor makes I/O easier to add, remove, and upgrade than today's shared-bus I/O cards. Instead of installing and removing cards on a bus, InfiniBand provides a single interface for multiple types of I/O. In this way, InfiniBand effectively takes I/O expansion outside the box, eliminating slot and bandwidth limitations while freeing the CPU to work on applications.
Redundant paths
On the reliability front, InfiniBand creates multiple redundant paths between nodes, reducing the amount of dedicated failover hardware that must be purchased. It also replaces the load/store communications model of shared-bus I/O with a more reliable message-passing paradigm.
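As a purely illustrative contrast, and not InfiniBand's actual programming interface, the sketch below shows the difference in spirit: load/store I/O has the CPU poke device registers directly over the bus, while channel-based message passing posts a work request to a queue that a channel adapter services on its own.

# Purely illustrative contrast; none of these names come from the InfiniBand
# specification or any real driver API.
from collections import deque

# Load/store style (shared bus): the CPU writes device registers itself and
# is tied up for the duration of each bus transaction.
device_registers = {}

def load_store_write(offset, value):
    device_registers[offset] = value

# Message-passing style (channel I/O): the CPU merely posts a work request;
# the channel adapter drains the queue and moves the data asynchronously.
send_queue = deque()

def post_send(data, remote_node):
    send_queue.append({"node": remote_node, "data": data})

load_store_write(0x10, 0xFF)
post_send(b"block of data", "storage-tca")
print(device_registers, list(send_queue))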
At the heart of the InfiniBand architecture is a bidirectional link with dedicated physical lanes sending and receiving data simultaneously. InfiniBand increases bandwidth by raising the number of physical lanes per link: the InfiniBand specification calls for two, eight, and 24 physical lanes. Half of the physical lanes in each link send data, while the other half receive data. Each physical lane has a theoretical signaling rate of 2.5 Gbits/s, so the two-, eight-, and 24-lane links have theoretical aggregate bandwidths of 5, 20, and 60 Gbits/s, respectively.
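That lane arithmetic, sketched using only the figures from the paragraph above:

# Aggregate (both directions) theoretical bandwidth per link.
LANE_RATE_GBPS = 2.5
for lanes in (2, 8, 24):          # total physical lanes; half send, half receive
    per_direction = (lanes // 2) * LANE_RATE_GBPS
    aggregate = lanes * LANE_RATE_GBPS
    print(lanes, per_direction, aggregate)
# 2 lanes -> 2.5 Gbit/s each way, 5 total; 8 -> 10/20; 24 -> 30/60.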
InfiniBand defines four types of devices: host channel adapters (HCA), target channel adapters (TCA), switches and routers. The HCA is installed in the server and connects to one or more switches. I/O devices connect to the switches through a TCA, thereby creating a subnet with up to 64,000 nodes. The HCA can communicate with one or more TCAs either directly or through one or more switches. A router interconnects several subnets.
Together, these components transform the system bus into a universal, dynamically configured, scalable interconnection mechanism that enables computers, peripherals and networks to work in concert to form one enormous, hybrid system.
While the InfiniBand specification allows a TCA to connect directly to an HCA via a serial link, the real power of InfiniBand comes from having a switch between the TCA and HCA. Connecting a TCA to an HCA through a switch enables devices on an InfiniBand network to be connected to several hosts.
Switches interconnect multiple links by passing data packets between ports within a subnet of up to 64,000 nodes. As in Internet Protocol networks, each packet carries addressing and error-detection data along with a payload. Each time a switch receives a packet, it reads the destination address at the start of the packet and sends the packet out the port indicated by an internal table. One or more switches can link multiple HCAs with multiple TCAs to provide high availability, higher aggregate bandwidth, load balancing, or data copying to a backup storage site. All InfiniBand devices are thus connected in a fabric.
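A hypothetical sketch of that lookup step, with class and field names invented for illustration rather than drawn from the InfiniBand specification, might look like this:

# Hypothetical sketch of per-switch forwarding; names are illustrative only.
class Packet:
    def __init__(self, destination, payload):
        self.destination = destination      # destination address in the header
        self.payload = payload

class Switch:
    def __init__(self):
        self.forwarding_table = {}          # destination address -> outgoing port

    def learn(self, destination, port):
        self.forwarding_table[destination] = port

    def forward(self, packet):
        # Read the destination address and pick the outgoing port from the table.
        port = self.forwarding_table.get(packet.destination)
        if port is None:
            raise LookupError("no route to node %r" % packet.destination)
        return port

switch = Switch()
switch.learn("storage-tca", port=3)
switch.learn("server-hca", port=1)
print(switch.forward(Packet("storage-tca", b"write block")))   # -> 3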
The idea behind InfiniBand is to create a switched-fabric, serial point-to-point link I/O architecture that meets the requirements for cost-effective I/O and expands and simplifies connectivity between devices, while improving reliability, scalability, and performance. InfiniBand achieves this goal by using a unified fabric to connect elements of computer systems.
Multiple stages
A switched fabric is simply an interconnection architecture that uses multiple stages of switches to route transactions between an initiator and a target. Each connection is a point-to-point link, which inherently has better electrical characteristics, allowing higher frequencies and greater throughput than bus architectures. The use of multistage switch architectures maximizes the flexibility and scalability of the interconnect.
The concept of switched-fabric, point-to-point interconnects is not a new one. The latest example of such interconnects can be found in Fibre Channel-based storage-area networks (SANs), in which shared-bandwidth hubs never achieved broad acceptance and switches quickly became the norm. The concept of the InfiniBand switch fabric thus draws on other mature and proven technologies and network architectures, utilizing the collective knowledge of switched-fabric implementations to deliver the best and most cost-effective I/O solutions.
The InfiniBand switched fabric brings increased bandwidth; the ability to aggregate bandwidth as connections are added; easy isolation of a fault to a single connection; and the ability to upgrade connections, and their bandwidth, one at a time as new technologies are developed.
We also anticipate an opportunity for enhanced integration with Fibre Channel-based SANs. By the time InfiniBand gains momentum in the marketplace, perhaps in 2002, a tremendous installed base of Fibre Channel and SAN products will be in place, particularly in the enterprise-class environments where InfiniBand makes its presence known first. Line-speed bridging between InfiniBand and Fibre Channel will be a critical requirement, and several vendors are actively addressing this challenge, including QLogic.
The processing demands of distributed e-commerce applications have led to increased use of clusters. As microprocessors become faster and less expensive, clustering becomes increasingly viable. Clustering presents at least a couple of challenges, however. To work effectively, clustering activities require high-speed interprocessor communications, with virtually no latency. Many clustering applications also require significant amounts of I/O, such as that required for disk or network access.
InfiniBand addresses both challenges head on. Multiple processors using HCAs can communicate with each other at up to 24 billion bits/s, and multiple I/O devices using TCAs can support the I/O needs of the cluster. Because the switches pass data on all ports at full bandwidth, all devices can work in concert.
InfiniBand building blocks are designed from the ground up to support high availability, as the following example demonstrates.
Assume that two servers need access to two different WANs. An HCA at each computer is used to access a TCA for each WAN through a switch.
Normally, each server can leverage both WANs, thereby providing twice as much capacity as a single WAN. However, if WAN 1 goes down or is congested, the InfiniBand switch will direct all data flow to WAN 2.
Similarly, if server 2 fails, the switch can direct all WAN data to server 1. To eliminate the switch as a single point of failure, an additional switch should be added to this configuration.
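A hypothetical sketch of that failover decision, with invented names and a deliberately simplified health model, might look like this:

# Hypothetical failover sketch; the names and health model are illustrative only.
def pick_wan(wan_status, preferred):
    """Return the preferred WAN if it is up, otherwise any other healthy WAN."""
    if wan_status.get(preferred, False):
        return preferred
    for wan, healthy in wan_status.items():
        if healthy:
            return wan
    raise RuntimeError("no healthy WAN available")

status = {"wan-1": False, "wan-2": True}       # WAN 1 is down or congested
print(pick_wan(status, preferred="wan-1"))     # -> wan-2: traffic is redirected

The same decision applies in reverse for a failed server, with the switch steering WAN traffic to the surviving host.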
QLogic has been working closely with the InfiniBand Trade Association to promote and develop InfiniBand. As one of the foremost proponents of InfiniBand, QLogic believes its endorsement will increase the architecture's adoption among industry leaders.
Draft spec
As InfiniBand evolves, we intend to put it into practice by leveraging our core expertise in high-performance host bus adapters (HBAs) and switching to bring high-bandwidth solutions to the industry-standard space.
The draft specification for InfiniBand is in preparation, and a final release is tentatively scheduled for this year. The industry's first InfiniBand products should start appearing by the first half of 2001, mostly in the form of storage and server connections. The first switches and routers should be introduced between then and the first half of 2002.
In addition to QLogic, companies that have announced significant product development efforts include Agilent Technologies, IBM, Intel, Sun Microsystems, and many other major vendors. Some early products will be demonstrated this fall at the Comdex trade show in Las Vegas.
QLogic's first InfiniBand-compliant products, including a switch, should also appear during 2001. QLogic demonstrated a prototype switch at the Intel Developer Forum in San Jose, Calif., last month. |