Interesting post on WAFL from the Motley Fool thread...
See: boards.fool.com
"> WAFL is USELESS in backend operations because those mainframes and high-end Unix servers have their own
> file systems which are more efficient especially in clustered configurations. WAFL is USEFUL in distribution
> systems because of the inefficiencies of general network file systems like NFS and CIFS.
>
> Obviously, you have different types of applications running on those operational, analytical and distribution
> systems so the file system is really only as good as your caching subsystem and data placement technology as well
> as your failure management technology.
Conjecture,
I respectfully disagree. A filesystem is what allows application programs to perform reads and writes to permanent storage without having to worry about the physical attributes of that storage system, not to mention how it is connected to the host computer. Physical attributes include which drive to use, which cylinder to use, which block. It also needs to make sure it isn't stepping on a data block that belongs to another user. It also needs to figure out which communications channel to use, how to set up the registers to address the selected block(s), and on and on. Add various flavors of RAID to the mix, and it gets real messy, real fast. When using a filesystem, the application program just needs to know that it can write its 4K block to byte offset 32768 of file /export/home/oracle/data/oradata.dbf, and know that the next time it is needed, that's where to go. The filesystem does all of the aforementioned grunt work.
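As a rough illustration of that last point, here is a minimal Python sketch of what the application side looks like: it names a file and a byte offset, and nothing else. The file name, the 4K block size, and the offset mirror the example above; the helper names (write_block, read_block) and the use of a temporary directory in place of /export/home/oracle/data are assumptions made purely for illustration.

```python
import os
import tempfile

BLOCK_SIZE = 4096   # the 4K block from the example above
OFFSET = 32768      # byte offset within the file

def write_block(path, offset, block):
    """Write one block at a byte offset; the filesystem picks the physical location."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    with os.fdopen(fd, "r+b") as f:
        f.seek(offset)
        f.write(block)

def read_block(path, offset, size):
    """Read the block back from the same logical address."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(size)

if __name__ == "__main__":
    # Hypothetical stand-in for /export/home/oracle/data/oradata.dbf
    with tempfile.TemporaryDirectory() as data_dir:
        path = os.path.join(data_dir, "oradata.dbf")
        block = b"\xab" * BLOCK_SIZE
        write_block(path, OFFSET, block)
        assert read_block(path, OFFSET, BLOCK_SIZE) == block
        print("file size in bytes:", os.path.getsize(path))
```

Nothing in the sketch mentions drives, cylinders, channels, or RAID. All of that stays behind the open/seek/read/write calls, which is the whole point of having a filesystem.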
I can't speak to mainframe filesystems, but as for open systems, NetApp's WAFL ranges anywhere from superior to the filesystems in use (Veritas) to vastly superior (in the case of the de facto standard Unix filesystem called Berkeley FFS). The Enginuity Operating System to which you refer sets up a volume on the Symm. The filesystem that lives within said volume is managed entirely by the machine that is hosting the application. The Celerra's data movers running DART OS do, of course, implement filesystems on the Symm, but I wasn't talking about Celerra. After all, there are very few businesses out there that use Celerra for the operational and analytical apps to which you refer. They use Symms, or filers.
It is easy to become confused by the terms NFS, Network File System, and CIFS, Common Internet File System. Even though filesystem is part of the name, they are not filesystems. They are communications protocols that allow application programs running on a client machine to perform file operations, e.g. read, write, directory lookup, change permissions, on a host machine via a network connection. The host machine, e.g. a NetApp filer, then performs the disk-related actions on behalf of the client machine.
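To make the division of labor concrete, here is a toy Python sketch of the idea. It is not the NFS or CIFS wire format, and every name in it (ToyFileServer, ToyClient, the request dictionaries) is invented for illustration. The client only builds request messages; the server, standing in for the filer, is the only side that touches storage. A direct function call stands in for the network.

```python
import os
import tempfile

class ToyFileServer:
    """Stands in for the host (e.g. a filer): it owns the storage and does the disk work."""

    def __init__(self, export_root):
        self.root = export_root

    def _resolve(self, name):
        # Toy safety check: only plain file names inside the export root.
        if name in ("", ".", "..") or os.path.basename(name) != name:
            raise ValueError("invalid name")
        return os.path.join(self.root, name)

    def handle(self, request):
        op = request["op"]
        if op == "lookup":
            return {"ok": True, "names": sorted(os.listdir(self.root))}
        path = self._resolve(request["name"])
        if op == "write":
            mode = "r+b" if os.path.exists(path) else "w+b"
            with open(path, mode) as f:
                f.seek(request["offset"])
                f.write(request["data"])
            return {"ok": True, "count": len(request["data"])}
        if op == "read":
            with open(path, "rb") as f:
                f.seek(request["offset"])
                return {"ok": True, "data": f.read(request["count"])}
        return {"ok": False, "error": "unsupported op"}

class ToyClient:
    """Client side: turns file calls into request messages and never touches a disk."""

    def __init__(self, send):
        self.send = send  # in a real system this call would cross the network

    def write(self, name, offset, data):
        return self.send({"op": "write", "name": name, "offset": offset, "data": data})

    def read(self, name, offset, count):
        return self.send({"op": "read", "name": name, "offset": offset, "count": count})["data"]

    def lookup(self):
        return self.send({"op": "lookup"})["names"]

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as export_root:
        client = ToyClient(ToyFileServer(export_root).handle)
        client.write("oradata.dbf", 0, b"hello")
        assert client.read("oradata.dbf", 0, 5) == b"hello"
        assert client.lookup() == ["oradata.dbf"]
```

The filesystem proper lives entirely inside the server class. The protocol is just the shape of the messages that pass between the two sides.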
There is a double win for the client machine. First, it doesn't need to allocate CPU and memory resources to manage the disks, so those resources stay available to the application program. Second, the speed at which the filer responds allows the application to get on with the task at hand more quickly. True, the NFS protocol adds overhead of a different nature to the equation. The efficiencies of WAFL, however, yield an overall gain in performance over a Veritas filesystem running on direct-attached fibre in spite of the NFS overhead. The Veritas filesystem, by the way, is no slouch. WAFL is just so much better. That was the purpose of the document from my earlier post the other day.
netapp.com
Now, if you could just come up with a way to use WAFL without the overhead of NFS, the system would really rock! That's where DAFS comes in. No more NFS, no more TCP/IP, no more context switching from application level to kernel level activity, no more buffering of data blocks in kernel space. The application program simply writes its data directly into the filer's cache, using the RDMA (Remote Direct Memory Access) protocol over VI.
Cache is of course very important for improving short-term response, and both the filer and the Symm make use of cache. Ultimately, however, the data in cache has to go to disk, and that's where the filesystem, or data placement, as you call it, comes in. The more efficient the filesystem is at laying out the data, the better the storage system will perform. The filer uses WAFL, and the Symm just does whatever the filesystem that runs on the host machine tells it to do. The Symm does apply whatever RAID mechanism it is configured to use, if any. The filesystem is frequently the Berkeley Fast File System or Veritas in the Unix world, or it is NTFS when supporting Windows hosts.
I beg to differ with your statement that WAFL is USELESS in backend operations. One would have to put Oracle in the backend operations category. For one out of many examples, please refer to the link "Oracle Selects Network Appliance to Support Oracle E-Business Needs":
netapp.com
Why would Oracle select NetApp if the resulting system provides inferior performance?
Alas, I ramble...
Philip"