More broad strokes from Jim Rothnie, EMC's Chief Technology Officer.
The three main focuses of EMC's R&D investment have been exploiting the advances in storage density, driving connectivity and re-inventing storage management. EMC has made great advances in reducing the number of people required to manage information. Soon each manager will be able to manage hundreds of terabytes of information, Rothnie said. This will come about by widespread adoption of storage networking and by advances in EMC management software. The vision is automation.
Just as a long-distance phone call is incredibly complicated yet remarkably simple for the end user, storage management should have the same ease of use. "We expect our customers to think of having one seamless information storage system," he said.
Rothnie said automation is different from "storage virtualization." Virtualization is a new term for creating an abstract view of information – something that EMC has been doing for years with the Symmetrix. Automation means moving that information to where it needs to be with minimal human involvement.
"Automation leads to better performance, simplicity in making changes in configuration and cost effectiveness," Rothnie said.
The brains of automation will be the next generation of EMC ControlCenter software. "ControlCenter will watch traffic at all points and will make a decision that the data should be here rather than there, using I/O Redirectors (next generation of PowerPath) and Data Movers (next generations of SRDF and MirrorView) to make it happen," Rothnie said. The result is the coordination of all these actions to automatically move information objects.
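The loop Rothnie describes can be sketched in a few lines. This is a hypothetical illustration, not EMC's ControlCenter code: a monitor watches I/O traffic per information object, a policy decides which tier each object belongs on, and the moves are handed off to a data-mover component. All names and the threshold are assumptions made for the example.

```python
# Hypothetical sketch of the automation loop: watch traffic, decide
# placement, delegate the move. Names and thresholds are illustrative,
# not EMC APIs.
from dataclasses import dataclass

@dataclass
class IoStats:
    object_id: str
    location: str        # tier the object currently lives on
    reads_per_sec: float # observed traffic for this object

def decide_placement(stats: IoStats, fast_tier_threshold: float = 100.0) -> str:
    """Decide which tier an object belongs on, based on observed traffic."""
    return "fast_tier" if stats.reads_per_sec >= fast_tier_threshold else "capacity_tier"

def automate(traffic: list[IoStats]) -> list[tuple[str, str, str]]:
    """Return the moves (object, from, to) for the data mover to perform."""
    moves = []
    for s in traffic:
        target = decide_placement(s)
        if target != s.location:
            moves.append((s.object_id, s.location, target))
    return moves
```

The point of the sketch is the division of labor: the decision logic (ControlCenter's role in Rothnie's description) is separate from the components that redirect I/O and physically move the data.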
EMC's information storage works with all server types and all forms of connectivity, and has open APIs and cooperative support agreements with many other companies. Soon, EMC's software products will work with storage hardware from other vendors, he said....
emc.com
Now, I'm going to take these steps, these kind of layers in the layer cake one at a time, and talk about what are the key issues for each of them, where do things stand today, and in some cases, a glimpse into the future for where we see these layers of the cake headed.
And we'll start out with the physical layer. This is the foundation, the hardware platform that lies beneath the entire stack of information storage resources in the information plant. This is the place where the foundations of reliability, performance and capacity all sit. Many suppliers of storage systems today focus all of their attention on those three characteristics, on what the hardware alone can do.
At EMC, we also focus a lot of attention on reliability, performance and capacity, and today we achieve a higher level on each of those measurements than anybody else in the industry. But we also understand that this physical layer is the foundation of everything else, and what you need inside it is the capability to execute the software that ultimately builds up all of the layers above.
Why is it necessary to do this? Back in the old days, when people organized storage this way and their needs were much, much simpler (a single server running a single application, with storage attached as a peripheral device on the back end), it was unnecessary to worry about the classes of functionality we'll talk about here today.
But the reality is our customers are not dealing with that kind of simple environment anymore. They have hundreds, or often thousands of separate computer systems. If they did deploy their storage resources as peripherals behind each one, that would lead to the thousand islands phenomenon - a thousand islands of separate corners of information which are extremely difficult to manage and share and protect.
So customers are looking to storage technology to simplify this picture, to put that storage into a common resource that everything can connect to and use effectively.
In order to do that, it is essential that higher levels of functionality be delivered: functions like replication and multipath file sharing, data protection, and data optimization - really, data placement optimization. I want to focus, just for a second, on that particular function as an illustration of what's important and where things are headed for the future.
Data placement optimization is something that I think is not yet well understood in our industry. EMC introduced products in this space about 18 months ago, and there are now around a thousand customers around the world using this capability. What it does is similar to what customers used to do themselves a couple of years ago: striping data, taking popular data objects and spreading them across multiple disk drives. But in an era of petabyte information plants and much more dynamic information, it has become literally impossible for human beings to deal with this by hand. And so the function of our optimization software is to do this automatically - to keep track of what's actually happening in the system and to reshuffle data as necessary to get it in the right place.
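The re-striping idea above can be made concrete with a small sketch. This is not EMC's algorithm, just an illustration of the principle: track the observed I/O load per object on each drive, then repeatedly move the hottest object that actually narrows the gap between the busiest and least-busy drive.

```python
# Illustrative rebalancer (assumed, not EMC's implementation): drives maps
# drive -> {object: observed I/O load}. Returns the (object, src, dst)
# moves performed, mutating `drives` in place.
def rebalance(drives: dict) -> list:
    def load(d):
        return sum(drives[d].values())

    moves = []
    improved = True
    while improved:
        improved = False
        busiest = max(drives, key=load)
        idlest = min(drives, key=load)
        gap = load(busiest) - load(idlest)
        # Moving an object of load x changes the gap to |gap - 2x|,
        # so any object with 0 < x < gap strictly reduces the imbalance.
        for obj, x in sorted(drives[busiest].items(), key=lambda kv: -kv[1]):
            if 0 < x < gap:
                drives[idlest][obj] = drives[busiest].pop(obj)
                moves.append((obj, busiest, idlest))
                improved = True
                break
    return moves
```

Each accepted move strictly reduces the sum of squared drive loads, so the loop terminates. The human-administrator version of this (manual striping) breaks down exactly where the speaker says it does: when the object population and traffic patterns change faster than a person can re-plan the layout.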
One of the facets of high functionality that I want to discuss with you for a moment is this: where does the intelligence belong in a modern storage deployment, the type that all of our customers are heading toward today? The various resources involved in accessing information are organized, somewhat simplistically, in the three layers shown here.
1) Servers, which run the applications;
2) a connectivity layer, which connects the servers to the information;
3) the storage system itself, which stores that data.
Functionality can be put in any of those three places, but for each particular function, it turns out that there is a best place to put it. And if you put that function somewhere else, it can cost you an order of magnitude in performance, and often really spells the difference between being able to do it at all, and being able to do it effectively.
Storage optimization is actually an excellent example of this, because optimization involves moving data from one part of the storage system to another. That really belongs in the storage system. If you don't put it there - for example, if you put that software up in the application server - it costs enormously more to perform these optimization functions. The data has to travel all the way up across the connectivity layer, into the servers, and back down to accomplish that data movement.
Data replication and remote mirroring are other examples of functions which involve data movement and are performed much more effectively in the storage layer than anywhere else.
And another place that functionality can be placed is in the server complex itself. Again, there are certain functions which belong there. This chart illustrates one, called path failover, which involves controlling the channels that come out from the servers and down into the storage complex. Controlling those channels when a failure occurs, or balancing the load across them, must be performed inside the server, so that's the right place to put that function.
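Server-side path failover of the kind described here (the role PowerPath plays in EMC's stack) can be sketched as a small driver object. The class and method names below are assumptions made for illustration, not a real PowerPath API: the server tracks which channels to the storage are alive, round-robins I/O across the survivors, and stops routing down a channel the moment it fails.

```python
# Hedged sketch of server-side multipathing: round-robin load balancing
# across live paths, with failover when a path dies. Names are illustrative.
class MultipathDriver:
    def __init__(self, paths):
        self.paths = list(paths)
        self.alive = {p: True for p in paths}
        self._next = 0

    def mark_failed(self, path):
        self.alive[path] = False   # failover: stop routing down this channel

    def mark_restored(self, path):
        self.alive[path] = True    # failback once the channel recovers

    def pick_path(self):
        """Choose the channel for the next I/O, balancing across live paths."""
        live = [p for p in self.paths if self.alive[p]]
        if not live:
            raise IOError("all paths to the storage system have failed")
        path = live[self._next % len(live)]
        self._next += 1
        return path
```

The sketch also shows why this function must live in the server: only the server knows, per I/O, which of its own host adapters the request is about to leave through, so only it can redirect that request before it hits a dead channel.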
What you find in the storage market today is that vendors who have products in one particular layer try to put all the value-add there. In practice, I think of it as "commoditize thy neighbor": the effort to move value into the part of the complex where you have the strongest position.
Certainly we see storage software suppliers who try to put it all in the server layer. We see suppliers of things like SAN appliances who try to put it all in the connectivity layer. And back in the old days, when EMC had a narrower scope of interest, we thought it all belonged in the storage system itself. But as our understanding of the whole storage environment matured, through discussions with customers and a clear picture of what they required, we came to understand that the right thing to do is to put functions where they belong.
And as a rule of thumb, that means putting the intelligence close to the thing that the intelligence is controlling. Do that, and you get implementations of advanced functionality which are far superior to any other approach. So distributed intelligence is really the answer to handling advanced functions effectively.
wwpi.com