Who can resist a title like this?
How to Tell Your NAS From Your Elbow
computerworld.com
Storage-area networks are powerful but costly and hard to implement. Network-attached storage is comparatively inexpensive and easy but limited. Here's how IT managers are tackling the strategic but complicated question of how best to manage their storage.
By TOMMY PETERSON (August 07, 2000)
Wai Chong manages information systems at a health maintenance organization in California. Mark Silva is vice president of network operations at a large Boston investment firm. Mark Dahl is the distributed systems manager at the Anchorage, Alaska, subsidiary of a global oil company. Each is responsible for very different kinds of data, but all must store rapidly increasing amounts of information. Each is looking to networked storage to solve that problem.
That means they're grappling with whether to choose relatively inexpensive, easy-to-implement network-attached storage (NAS) or storage-area networks (SAN), which are potentially more powerful but also more expensive and harder to implement.
Managers tend to go with NAS if they have tight budgets, need to bring more storage online quickly and work at firms leery of fast-changing technology.
SANs are more appealing to companies that need fast data access for widely distributed users and have the money to make long-term investments in their storage infrastructures.
Information technology managers must weigh cost against ease of implementation and management, speed of data access, scalability, backup and fail-over capabilities and interoperability with other parts of the network. The decisions will become more urgent as the Internet and applications such as customer relationship management and enterprise resource planning generate more customer data.
Even when IT does decide on a strategy, management must be convinced that the move is worth it. (See related story, page 28.)
NAS usually occupies its own node on a LAN, typically an Ethernet network. In this configuration, a single server handles all data storage on the network, taking a load off the application or enterprise server. By detaching storage from individual servers, it makes the data available to any user on the network. NAS is essentially plug-and-play storage that uses proven Ethernet and SCSI technology.
A SAN, by contrast, is a high-speed dedicated subnetwork connecting storage disks or tapes with their associated servers. Although these components can be connected via other protocols, including SCSI or IBM's Escon optical fiber, they're associated with the emerging high-speed (133M to 4.25G bit/sec.), long-distance Fibre Channel protocol.
SAN technology is designed to support disk mirroring, backup and restoration, archiving and retrieval, data migration among storage devices and sharing of stored data among servers. SANs can also be configured to incorporate subnetworks such as NAS systems.
Weighing the Cost
Chong, an information systems manager at Omni Health Corp. in Sacramento, Calif., says he wants to implement a SAN to accommodate Omni's storage needs and its plans to give patients access to billing records over the Web. "A SAN is the best strategy for future needs," he says.
Earlier this year, Chong began talking to Compaq Computer Corp. about building a SAN, but his bosses recently applied the brakes to the project, wanting more time to consider costs and the still-evolving SAN technology. Chong's situation is common.
"These are expensive decisions - I spend a lot of my time thinking about storage and trying to think about it strategically," says Silva, vice president for network operations at State Street Corp. in Boston.
Omni's 1 terabyte (TB) of data is currently stored on rack-mounted disks connected to individual servers via SCSI buses. This common approach is easy and relatively cheap. A Guardian 90GB SCSI RAID array from Seagate Technology Inc. costs about $7,600, while NAS or SAN technology can cost hundreds of thousands or even millions of dollars. Storage can be increased merely by adding SCSI host bus adapter ports in the form of add-in cards to the server, daisy-chaining more devices off existing buses or adding servers - or all three.
However, each SCSI bus can support a maximum of 15 disk arrays, and each SCSI bus can stretch no farther than 75 ft. from the host.
Large storage needs can quickly translate into a dense jumble of hardware, with data accessible only through individual servers. To see data on other servers, users must go through the network - a process that's slow for the user and bogs down the network. In some cases, the user may not be able to see data without switching drives.
In addition, if any device needs maintenance, the entire string must be taken off-line.
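As a rough illustration of the arithmetic behind direct-attached SCSI storage, the sketch below uses the figures cited above (90GB per array, about $7,600 each, 15 arrays per bus) to estimate what a given capacity would take. It assumes 1TB equals 1,000GB and ignores servers, cabling and RAID overhead; the function name is hypothetical.

```python
import math

ARRAY_GB = 90           # Guardian SCSI RAID array capacity cited in the article
ARRAY_COST_USD = 7600   # approximate price cited in the article
ARRAYS_PER_BUS = 15     # per-bus limit cited in the article


def scsi_estimate(target_gb):
    """Return (arrays, buses, disk hardware cost in USD) needed to hold target_gb."""
    arrays = math.ceil(target_gb / ARRAY_GB)
    buses = math.ceil(arrays / ARRAYS_PER_BUS)
    return arrays, buses, arrays * ARRAY_COST_USD


# Assumes 1TB = 1,000GB, roughly Omni's current data volume.
arrays, buses, cost = scsi_estimate(1000)
print(arrays, buses, cost)
```

On these assumptions, 1TB fits on 12 arrays hanging off a single bus for roughly $91,200 in disk hardware, while 5TB would need 56 arrays spread across four buses, which is where the "dense jumble of hardware" starts to show.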
On the other hand, "we can get a NAS box in a shipment at 9 in the morning, and it can be up and involved in production by noon," says Dahl, distributed systems manager at BP Exploration (Alaska) Inc., an Anchorage-based subsidiary of global oil company BP Amoco PLC.
"NAS is very stable technology," says Lauri Vickers, an industry analyst at Cahners In-Stat Group in Newton, Mass. "It provides a lot more scalability than conventional storage" but costs much less.
However, she adds, SANs have "the most strategic intelligence that can be applied best to large volumes of stored data."
The promise of SANs must be balanced against their comparatively high costs, as well as implementation and management headaches, say analysts.
"Ninety-one cents of every dollar spent on nontraditional storage, specifically for SANs, goes to management and maintenance," says Vickers.
Another negative is the question of interoperability. The Storage Networking Industry Association (SNIA) and the Fibre Alliance are engaged in an ongoing battle over Fibre Channel standards. The Mountain View, Calif.-based SNIA has support from Compaq, Sun Microsystems Inc. and other vendors, while the Fullerton, Calif.-based Fibre Alliance is backed by Hopkinton, Mass.-based EMC Corp.
Complex Needs
BP Exploration recently invested in both NAS and SAN technologies to accommodate almost 5TB of data on Unix and Windows NT platforms.
NAS technology is "way ahead" of SAN technology for usability and reliability, especially in cross-platform applications, says Dahl. However, he says, SANs let him allocate data to any free space on a storage network.
State Street's major storage requirements are for its online transaction service, which demands high availability but not huge capacity, and a growing 3TB data warehouse. The data warehouse must store large amounts of data but has comparatively lower uptime requirements.
Silva has added 800GB of StorEdge disk arrays from Sun while he evaluates SAN products. He has been talking to EMC and Mountain View, Calif.-based Veritas Software Inc. Silva says he has a lot of confidence in Veritas, but he isn't sure he can afford EMC's high-end products on his approximately $2 million budget.
"If money weren't an issue, we'd go with EMC because we know they're solid products," says Silva. "Moving to a SAN, I do worry about interoperability and management issues." Those concerns lead some customers to stick with safe storage choices.
Earlier this year, Dow Jones & Co., an investment and publishing company in New York, spent about $3.5 million on storage technology, most of it configured as NAS, according to senior systems administrator Marc Appelbaum.
"Dow Jones doesn't want to set the pace with a new technology," says Appelbaum. "This is a conservative company [that] is not going to take a risk on a technology that's not mature."
The 100GB of data being stored by Dow Jones includes customer preferences, archived stories from The Wall Street Journal and the backup for the Web version of the newspaper.
Dow Jones turned to Storage Technology Corp., a Louisville, Colo.-based vendor that deals in SAN products and tape storage devices, for both products and help implementing them. The primary storage device is the StorageTek L700 tape library, with 13.6TB capacity and Ultra SCSI (20M byte/sec. data-transfer) connections.
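To put the 20M byte/sec. figure in perspective, here is a back-of-the-envelope sketch of how long a transfer would take at that rate. It assumes a single connection running at the full quoted rate with no protocol overhead, and 1GB equal to 1,000MB; real throughput would likely be lower, and the function name is hypothetical.

```python
ULTRA_SCSI_MB_S = 20  # Ultra SCSI transfer rate cited in the article


def transfer_hours(size_gb, rate_mbyte_per_sec):
    """Hours to move size_gb at a sustained rate, assuming 1GB = 1,000MB."""
    seconds = size_gb * 1000 / rate_mbyte_per_sec
    return seconds / 3600


# Dow Jones' 100GB of stored data over one connection.
print(round(transfer_hours(100, ULTRA_SCSI_MB_S), 2))
# The tape library's full 13.6TB capacity over one connection.
print(round(transfer_hours(13600, ULTRA_SCSI_MB_S), 1))
```

Under those assumptions, the 100GB data set moves in well under two hours, while filling the whole library through a single link would take more than a week, which is one reason multiple connections matter.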
Appelbaum says his department is planning to set up separate Compaq StorageWorks SANs for individual business units that request them because they offer high-speed access to data.
Easier Choices
For some IT managers, the choice of SAN vs. NAS was easier.
Ken Ciaccia, information services project manager at Armstrong World Industries Inc., has implemented a NAS configuration. He says the Lancaster, Pa.-based flooring company is currently storing 628GB of data from its SAP AG applications, with another 28GB added every month. The SAP data is stored on EMC drives attached to the Armstrong LAN.
In addition, the company is moving to an imaging system and needs to store contracts and other documents in Portable Document Format.
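A simple projection shows what Armstrong's SAP growth rate implies. The sketch below assumes the 28GB monthly increase stays constant (linear growth) and leaves out the imaging system, for which no figures are given; the function names are hypothetical.

```python
import math

CURRENT_GB = 628         # SAP data currently stored, per the article
GROWTH_GB_PER_MONTH = 28  # monthly increase, per the article


def projected_gb(months):
    """Stored data after a number of months, assuming constant monthly growth."""
    return CURRENT_GB + GROWTH_GB_PER_MONTH * months


def months_until(target_gb):
    """Whole months until stored data reaches target_gb under linear growth."""
    if target_gb <= CURRENT_GB:
        return 0
    return math.ceil((target_gb - CURRENT_GB) / GROWTH_GB_PER_MONTH)


print(projected_gb(12))    # volume one year out
print(months_until(1000))  # months until roughly 1TB (taken as 1,000GB)
```

At that pace the SAP data alone would reach 964GB in a year and pass the 1TB mark in about 14 months.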
"SAN is a technology that makes sense for applications where customers constantly have to access your data," Ciaccia says. "We have storage needs but not so much constant access needs."
John Stone, administration director at the Office of the Public Defender for the Ninth Judicial Circuit in Orlando, oversees the court records for two counties. Staff members in both his and the state attorney's office, as well as independent lawyers, need constant access to these files. Nine months ago, he implemented a SAN to provide faster access to data.
"We have over 240GB of storage capacity, [which] we don't really need all of right now, but we do need the speed that the storage network gives us," he says. "When you have six or seven hundred people trying to access data at the same time, you need a SAN, or they might as well go to lunch every time they try to get to a document file."
The court's SAN comprises one Compaq RAID Array 8000 Pedestal, two HSG80 array controllers, two Peripheral Component Interconnect-to-Fibre Channel adapters, two eight-port Fibre Channel switches, 15 9GB Ultra SCSI 10,000 RPM drives and six 18GB Ultra SCSI 10,000 RPM drives.
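The drive counts above can be totaled to check them against the "over 240GB" Stone cites. The sketch below sums the raw capacity and, as a simplifying assumption, halves it to approximate usable space if every drive is mirrored; the article does not say whether the quoted figure is raw or usable, and the function names are hypothetical.

```python
# Drive inventory from the article: (number of drives, size of each in GB)
DRIVES = [(15, 9), (6, 18)]


def raw_capacity_gb(drives):
    """Total capacity of all drives before any mirroring or striping."""
    return sum(count * size for count, size in drives)


def mirrored_capacity_gb(drives):
    """Approximate usable space if all data is mirrored (simply half the raw total)."""
    return raw_capacity_gb(drives) / 2


print(raw_capacity_gb(DRIVES))
print(mirrored_capacity_gb(DRIVES))
```

The raw total comes to 243GB, consistent with the figure Stone gives; full mirroring would leave roughly half of that for unique data.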
Another factor in Stone's decision to choose SAN technology was the electrical isolation the SAN allows for switches and controllers.
"We were looking for the most reliable and the fastest option, and this was what we chose," says Stone. "Everything is mirrored and striped on the drives. This provides everything, including hot backup, unless you lose the drive, and then all the data is mirrored."
After a "substantial" discount, Stone says, the court paid a little more than $100,000 for the storage network. Even though Stone says he's satisfied with the technology, his shop wasn't immune to the implementation hassles widely reported with SANs.
"[Compaq] got the technology, but you need to find the gurus to help you through the process," says Stone. "There wasn't a lot of knowledge in programming the Fibre Channel switches and the controllers [that allocate data among storage devices] here, and it took a while to find the people we needed to help."
Some predict the choices will become easier. Many industry observers say that the implementation headaches will ease as the distinctions between NAS and SAN fade. "The time will come when you won't tell the difference," says Thomas Coughlin, an independent industry consultant in San Jose. "Users are going to insist that . . . SANs become easier to implement and cheaper to maintain and that NAS becomes more scalable and flexible."
Vickers says she agrees, to a point. "NAS and SAN features will converge, especially at the high-volume, high-price end, and will look like SANs with NAS devices as part of the storage network," she says. "But you're not going to want to centralize all your storage, which is the general direction. Low-end NAS will continue in that form."
NAS AND SAN COMPARED
Connectivity technology
NAS: Most applications are supported by SCSI, Ethernet and/or TCP/IP. Some NAS devices use Fibre Channel for increased speed.
SAN: Associated with Fibre Channel, but SANs can be supported by Fast Ethernet or SCSI-2. Use of Fibre Channel increases distance over which the network can operate, from approximately 25 m to 10 km or more.
Access to stored data
NAS: Provides access to stored data related to the NAS device to all users on a LAN.
SAN: Provides access to all online stored data to all users on a LAN. Provides cross-platform access to data.
Speed of access to data
NAS: NAS server frees some server resources and increases speed of access over conventional storage but still consumes network bandwidth.
SAN: Provides fastest access to stored data because storage traffic moves on its own network.
Allocation of storage resources
NAS: NAS distributes stored data to the devices on its node of the network.
SAN: SAN architecture allows storage to be distributed across servers and storage devices, allowing for most efficient use of storage resources.
Implementation
NAS: Essentially plug-and-play into node of existing LAN.
SAN: Complex installation. Users consistently report the need for much support before, during and after the installation. Few IT shops have internal expertise to build and maintain a SAN at present.
Cost
NAS: Advocates point to relatively low initial investment in hardware, software and implementation support.
SAN: Advocates claim that initial costs are balanced by higher performance, leading to lower cost over time.
GLOSSARY
Availability: The degree to which a computer system or network is available.
Bus: A physical transmission channel in a computer or on a network that carries signals to and from devices attached to the channel.
Disaster recovery: Preventive measures using redundant hardware, software, data centers and other facilities that either ensure that a business can continue operations during a natural or man-made disaster or help restore business operations as quickly as possible.
Disk controller: Hardware that controls the writing and reading of data to and from a disk drive.
Disk mirroring: The creation of two copies of data on separate disk drives.
Disk striping: Spreading data across multiple drives and combining partitions from separate disks into a volume that the operating system recognizes as a single drive. Disk striping enhances performance by enabling multiple I/O operations in the same volume to proceed simultaneously.
Fabric: A Fibre Channel topology (in this case, linking storage units) that features one or more switching devices.
Fail-over: The process by which data is immediately and nondisruptively routed to an alternate data path or device in the event of the failure of an adapter, cable, channel controller or other device.
Fibre Channel: Fibre Channel is nominally a 1G bit/sec. data-transfer interface technology, although the specification allows data transfer rates from 133M bit/sec. up to 4.25G bit/sec. Data can be transmitted and received at 1G bit/sec. simultaneously.
Fibre Channel Arbitrated Loop (FC-AL): FC-AL places up to 126 devices on a loop to share bandwidth. Typically, this is done using a star layout that is logically a loop, employing a Fibre Channel hub. This allows IT managers to add or remove devices without having to bring down the entire loop.
Host bus adapter (HBA): A SCSI-2 adapter that plugs into a host and lets that host communicate with a device. The HBA usually performs at the lower level of the SCSI protocol and is normally the initiator.
Hot swapping: The process of removing and replacing a failed system component while the system remains online.
Hub: A device joining communication lines at a central location, providing a common connection to all devices on the network.
Interoperability: The ability of hardware and software made by a variety of different manufacturers to work seamlessly together.
Online transaction processor: Executes transactions the instant they’re received by the computer and updates master files immediately.
Protocol: A set of rules or standards to enable computers to communicate.
SCSI: The standard set of protocols for host computers communicating with attached peripherals. SCSI allows the connection of as many as six peripheral devices.
SCSI bus: A parallel bus that carries data and control signals from SCSI devices to an SCSI controller.
Switch: A network device that selects a path or circuit for sending data.
Workload balancing: A technique that ensures that no data path becomes overloaded while others have underutilized bandwidth, causing an I/O bottleneck. When one or more paths become busier than others, workload balancing shifts I/O traffic from the busy paths to the less-busy paths, further enhancing throughput over the already efficient multipathing method.