The transcript of a panel discussion at the GlobusWorld 2005 Conference in Boston, Feb. 8.
John Dix, editor in chief of Network World, moderates a panel discussion of leading vendors on how grid computing will affect the end-to-end network.
Participants:
Cisco: Rob Redford, vice president, product and technology marketing
HP: Michael Feinberg, vice president and CTO of Network Storage Solutions
IBM: David Martin, program director, Internet Standards and Technology
Intel: Robert Fogel, director of Worldwide Grid Strategy and Business Development
Nortel: Franco Travostino, director of Advanced Technology
SAP: Alexander Gebhart, development manager, Netweaver Enterprise Grid Computing
Dix opening remarks: By many accounts, average system utilization across organizations is 15% to 20% today, while obviously the ideal would be around 80%. What’s more, some 20% of IS budgets go to operations, marginally less than the 25% earmarked for capital investments. We’ve created large, underutilized, complex environments that are costly to maintain. So there is a huge need to do this better, and the prevailing thinking at this point seems to be that grid is the answer.
And there is great opportunity because we have this confluence of developments:
The maturation of grid technology
General consensus that the future of computing is about networked, low-cost components
And the emergence of service-oriented architectures, or Web services.
The combination of these three elements seems to add up to a potent force, a sensible image of the future we can build toward.
What’s more, we have new applications on the horizon that will give us further impetus, one being efforts to instrument anything and everything of value - sensors in/on everything in every field, from medicine to military and manufacturing. This will create a flood of raw, real-time streaming data.
For the purpose of our discussion here today, we’re talking about the big grid: As our Intel speaker so eloquently put it: “The grid spanning the enterprise and used to integrate, provision, virtualize and aggregate all enterprise resources including compute, communication, storage, apps and data.”
And we have a good cross-section of vendor speakers to address the grid issue, including representatives from companies selling network, software, systems, storage and processors.
Since advances in one sector always make possible advances in another - more powerful computers, for example, making it possible to develop more demanding software - as a starting point, let’s talk about what the power and flexibility of grids means to networks, applications, storage, etc.
Dix: Franco – starting with you on the network side – do bigger pipes answer all the grid challenges, or do networks take a radical turn somewhere down the road?
Travostino (Nortel): My opinion is that bigger pipes help, but they are not the ultimate solution. My mantra is that we do not want to adapt applications to the network, but rather we want to adapt the network to the applications. When we talk about grid, it’s not just about moving petabytes or terabytes around – it’s about adding intelligence so that the network can better understand the workflow.
Redford (Cisco): I agree with Franco. It’s not just about fat pipes. It’s very easy to look at the problem and say there needs to be lots of connectivity to move the data around to all these different places. But the role of the network is a little bit inconsequential because right now we need to be primarily focused on getting all these different pieces to fit together. If you look at how any tech evolves from early origin stage to the practical stage – you have to create separation and isolation to make the different parts work before you look at how they work together to solve a broader set of problems. It’s not just a matter of making it work, you have to make it work effectively.
Everyone talks about all these unused CPU cycles that need to be put to better use. But there is an even larger problem, and that’s complexity and cost. The operational costs are very high for any large enterprise today, and they’re seeking ways to simplify. Grid offers a great opportunity, but [it won’t fly unless] it makes it less expensive to operate, makes it simpler to program, because the cost of systems integration and programming fundamentally drives the economics of any business decision. If we can’t figure out a way to make that work, we won’t be able to get grid out of the academic world and move it into the business mainstream.
Dix: Robert, from the processor viewpoint, does Moore’s Law matter any more since you can scale by adding more processors?
Fogel (Intel): I believe so. We’ve traditionally looked at Moore’s Law primarily from a performance perspective. But it’s really beginning to make a substantial impact in convergence – that is, the ability to integrate more and more into the processor. That includes the ability to manage the other aspects of the infrastructure, and that includes the network. As Moore’s Law continues to evolve – and it appears it has a pretty distant horizon – the impact will primarily be on this integration factor and the convergence of various aspects of the infrastructure.
Dix: Mike, does the grid change the equation for storage-area networks?
Feinberg (HP): I want to take a step back and examine what the word "grid" means. I think the word is confusing. To my mind, when we talk about grid we’re talking about a common management interface, so you can actually understand the resources that are out there and utilize them in a systematic manner across the environment. The second part is the capabilities of grids.
If you think about grid and the architectures people deploy, they are implicitly talking about geographic distribution. So what would storage have to deliver as capabilities? I think that’s still virgin territory, still being developed. The grid community talks about data a fair amount, but the relationship between data and storage is somewhat cloudy. I also think we should think about the concepts and tenets of pooling, sharing resources, dynamic provisioning and security – and how they manifest themselves in how you build your own technology.
In HP’s storage division, we’ve incorporated our vision of HP StorageWorks Grid. And this whole concept is to be able to lay that hardware down once and be able to reutilize that hardware to deliver new capabilities just in time.
Redford (Cisco): If you look at grid as a general abstraction mechanism – as a way of virtualizing lots of different resources – then the idea that you can ask a network for some storage or data, and it can figure out where it’s located and hand it to you, that’s a compelling idea. If you have to know explicitly where the data is and ask for it, it’s a lot more complicated to program. If the infrastructure and the middleware can give it to you regardless of where it is – via intelligence in the network, intelligence in the storage-area network and intelligence in the storage, working through middleware to provide that to you – that makes it a heck of a lot easier than each programmer having to worry about FTPing lots of data to different locations or having to be able to assemble the data needed to run their particular application.
Dix: Alexander, how might grid change enterprise applications?
Gebhart (SAP): At last year’s GlobusWorld somebody said the success of grid depends on how many grid-enabled applications we’ll see. Let me tell you a little bit about the challenges that arise when you try to grid-enable some parts of supply chain management applications or customer relationship management applications.
The first issue is the program decomposition – cutting the application into tiny pieces that can be executed in parallel. It sounds easy, but it’s quite a huge problem. The next challenge being encountered is deployment. What does the dynamic deployment look like? What does the installation look like? What does the dynamic customizing look like? How do we make sure we are in a perfect environment – that all the parts are there that we need for a specific application, that we have the right connections, the appropriate network bandwidth and so on.
Look at the end result: once the application has run on the grid, there is log data or tracing data residing on grid nodes. Since the grid node is empty prior to the execution of an application, it must be empty after the execution of the application, but the log files need to be somewhere. So this is also a major problem.
Even with the perfect infrastructure, infinite bandwidth and infinite storage, to grid-enable an application is a challenge. But the benefits are quite remarkable if this challenge is mastered somehow. The benefits for existing applications are an increase in the quality of results, and it will make possible new applications that we never thought possible due to the lack of resources available.
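[Editor's illustration, not part of the panel: a minimal Python sketch of the three steps Gebhart describes, under loud assumptions. The work is cut into pieces that can run in parallel, each piece runs on a "node" that starts empty, and the log is copied off the node before the node is emptied again. A temporary directory stands in for a grid node and a thread pool stands in for the scheduler; the names run_piece, decompose and run_on_grid are hypothetical and do not come from SAP or the Globus Toolkit.]

```python
import os
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor


def decompose(data, n_pieces):
    """Cut the input into at most n_pieces contiguous chunks."""
    size = max(1, -(-len(data) // n_pieces))  # ceiling division, never zero
    return [data[i:i + size] for i in range(0, len(data), size)]


def run_piece(piece):
    """Run one chunk on a scratch 'node' and return (result, log_text).

    The node directory is created empty and removed afterwards, so the
    log has to be read back before cleanup.
    """
    node_dir = tempfile.mkdtemp(prefix="grid_node_")  # stand-in for a grid node
    try:
        log_path = os.path.join(node_dir, "run.log")
        result = sum(piece)  # placeholder for the real computation
        with open(log_path, "w") as log:
            log.write(f"summed {len(piece)} items -> {result}\n")
        with open(log_path) as log:
            log_text = log.read()  # collect the log off the node
        return result, log_text
    finally:
        shutil.rmtree(node_dir)  # leave the node empty again


def run_on_grid(data, n_pieces=4):
    """Decompose, run the pieces in parallel, then merge results and logs."""
    pieces = decompose(data, n_pieces)
    with ThreadPoolExecutor(max_workers=n_pieces) as pool:
        outcomes = list(pool.map(run_piece, pieces))
    total = sum(result for result, _ in outcomes)
    logs = [log_text for _, log_text in outcomes]
    return total, logs
```

[Calling run_on_grid(list(range(100)), 4) sums the input in four chunks and returns the four log texts alongside the total. The sketch sidesteps the hard parts Gebhart raises, such as dynamic deployment and checking that the environment is complete.]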
Dix: How far away do you think we are from seeing commercial applications available for grid environments?
Gebhart (SAP): We’re well beyond the prototype phase. Now the challenge is to make them ready for production. But I can’t comment on the exact timing.
Martin (IBM): You can build a grid where there’s no knowledge of the network or resources or any dynamic capabilities – so the application thinks it’s running on a single processor. The other way to do grid is to make the applications very aware of the network and resources, and prioritize the requests and carefully describe the resources they need. We’re seeing both models emerge.
As programmers come into this grid environment, some of them really want to work in the single-processor environment, whereas others want to take advantage of the dynamic capabilities. So our challenge as middleware designers is to give people the environment they want – whether it’s a single Java execution environment or a really dynamic, collaborative environment that gives them the capability to distribute things around the world and know where things are running. That type of environment is starting to evolve, but all the vendors and Globus are struggling to keep enough hidden so that someone can come and quickly write applications, but also make enough available so that they can take advantage of the grid network.
Redford (Cisco): Dave’s point is worth emphasizing. We tend to think early on that there is one answer, the answer. But there’s not going to be an answer. Some programs are going to want a high degree of abstraction. Others will want a fine-grained degree of control, because that’s what’s necessary for that application. Both are perfectly legitimate; neither is right or wrong.
It’s the true sign of success – that you start needing those different degrees of control. It means that people are actually using it, it’s actually worthwhile. I think we’re going to end up with different levels within the management framework. We could supply a policy interface to all the different infrastructure pieces so they can all be administered through policy – but at some point, you’re going to want to get into specific devices that require fine-grained control.
Feinberg (HP): I’d add that we’re really talking about quality of service here. I agree wholeheartedly that there’s no one right answer. You probably don’t want or need at this point applications that understand the construction policy and the rate of protection of the storage grid. I think that’s the wrong model going forward – it doesn’t scale well. What we probably do is try to abstract it in a way that talks about QoS in a way that’s meaningful, and lends to discussions about latency, bandwidth, geographic distribution, etc.
The challenge we have with applications is there’s this balance between giving end users functionality and being a good citizen in infrastructure. I don’t think we want to change the balance where all programmers are worried about is being a good member of the infrastructure. We want them to be focused on delivering end-user functionality.
Dix: David, today grid requires different management systems for network, systems, security, etc. How do we get those to play nicely together in grid environments?
Martin (IBM): The real challenge for the industry – and for IBM, where we have a huge number of products in our portfolio – is to integrate all of these different layers under a common management structure. The solution is common standards. A lot of the win in grid is not necessarily around utilization – it’s cheap to buy more processors and storage. The win is to be able to quickly bring up a new application, and from the ground up be able to do it quickly and efficiently and with one common management interface.
Dix: As grids emerge, does the grid intelligence move out into the network as it has with SANs?
Travostino (Nortel): My opinion is that the intelligence has to be everywhere. For the network, there are going to be grid network services. In fact, in my team at Nortel, we have created several generations of those already. We brought the latest version to GlobusWorld, which is integrated with Globus Toolkit 4 and WSRF.
Redford (Cisco): There’s no one place for the intelligence to be. Intelligence has to be in all the layers because the issues we’re trying to solve are complexity and operational costs. That means finding ways to make the different pieces work together better. Asking which parts of the enterprise should be smart or dumb is like asking someone ‘What side of your brain would you like to be dumb, your left or your right?’ But if both sides are smart, including the things used to connect them, it works better.
We believe that the intelligence will reside in the upper-level middleware and the lower middleware, in the network, in the operating system – and over time, the isolated entities will get better integrated to simplify operation. There will be different parts of intelligence and different types of intelligence, but ultimately they will be brought together for better integration.
Back in the early days of the Internet, when it was all about bandwidth, they said, 'Let’s keep it simple and put the intelligence outside.' Now the bottleneck problems have been solved, and now that the connectivity problems have been solved, now the issue is, 'How do I make the different layers – rather than having an air gap between them – how do we move them together so they can work together more efficiently?'
Lots of things show up in the network as the natural consequence of that evolution of blending the layers together. When we talk about networks in an intelligent way, we’re talking about the networks actually participating with the applications and the services. The service-oriented infrastructure should be very much aligned with the intelligent network.
We think grid, service-oriented architectures, Web services – we think all of these things eventually have to flow together and will ultimately change the architecture of enterprise networks.
Dix: Given the scope and complexity of grid, can enterprise users realistically expect the vendors to play together nicely on this? Globus is a step forward, but are we looking at a huge compatibility/interoperability issue down the road?
Feinberg (HP): I think in general you see industry as a whole coming together on standards. I think this is an issue of 'All boats rise in a rising tide.' I’ve never seen a data center that has just one vendor. I think the industry as a whole is moving toward common standards, and the toolkit is helping us along that way.
Redford (Cisco): I don’t think anybody disagrees with standards. The issue isn’t whether you have standards, but how you arrive at certain standards. We can point in our history to lots of examples of efforts to define a standard that were done purely in an academic sense, but in the practical realities of deployment they failed.
So the issue is how do we take the academic piece and make it work, but also allow for creative experimentation? If you look at the Internet Engineering Task Force, one of the most successful models we’ve had in the history of the industry, they defined simple standards, allowed innovation and brought people together around what actually worked based on practical experience, and not just a bunch of people deciding the way it should be.
The trick is balancing all those pieces out. How do you balance out the need for standards and also allow for a certain degree of flexibility to try different things and see what works – and then decide based on that experience what the standard ought to be. If we don’t allow for that actual experimentation, trial and error and open discussion, then the standard doesn’t mean anything. You can all agree on a standard, but if it just means a standard degree of mediocrity, that model doesn’t work. You also need to avoid secular interests preventing you from getting there or from creating a standard that doesn’t make sense for everybody.
Travostino (Nortel): Inside of the Global Grid Forum (GGF) we’ve been working on networking considerations. Even though we have made a lot of progress over the last two years, we still haven’t gotten enough participation from the industry. So I’m here to solicit my colleagues, competitors and friends to join these operations; GGF and the Globus Consortium are working hard to get the discussion of standards going with the right people.
Fogel (Intel): To tie this back to Globus – as you heard this morning, Ian Foster and Steve Tuecke were talking about the two entities that were created this year. Univa is a commercial support mechanism for the Globus Toolkit. And the Globus Consortium is an industry consortium around accelerating the adoption of Globus in the enterprise. It’s those kinds of entities that I think are important in taking the standards and moving them into the industry.
Martin (IBM): In the IETF – what really started driving it was the network effect. Robert Metcalfe talks about how the value of the network device is increased by the networking of all devices. We’re just starting to see this in grid. Most grid installations now are stand-alone grids that can be used within specific companies’ data centers. We’re starting to see people hooking together their grids. That’s what’s really going to drive things forward with the vendors, to make sure that everything works together. Right now, if you hire IBM or HP or any of the other vendors to build your grid, it will work in that environment, but when you start hooking into other people’s grids, that’s what’s going to drive grid interoperability and standards integration issues forward. Industry is getting there quickly. SNIA has been working with a pretty narrow storage community to work on grid. The Liberty Alliance is working on grid security considerations. These industry efforts are being driven forward by the fact that organizations want grids that work together.
Dix: All of the vendors use different words to describe these various grid efforts, utility computing, on demand, etc. How do these efforts fit together, or do they?
Martin (IBM): If you dig down in all of the market speak, you find an incredibly common set of things. Though I’m depressed sometimes that everyone is using different terms, I’m encouraged by the fact that all of the concepts are generally the same. While some people talk about the Adaptive Enterprise and others talk about the Dynamic Grid Infrastructure, it’s all in the same general context. But there will always be a race for the market term.
Feinberg (HP): All the vendors are going to create unique implementations or solutions based on the same technologies. So we’ll be speaking a common language – like we do with TCP/IP – but we each have value adds. We have to understand that while we share relatively common views of how the data center should be run, we will all have our unique differentiation. HP has done a lot of research and we believe we have some unique differentiators. So all the vendors are not doing exactly the same thing.
Dix: The trend for the last five years or so has been to centralize resources. Does grid change that?
Feinberg (HP): There’s a reason for people trying to centralize. To the extent that grid allows a common technique and management style to manage distributed environments more efficiently, that changes the challenge for the customer. To the extent grid allows data to move and migrate in an efficient manner and be protected – that’s a large challenge in the distributed world that it’s addressing. You find that there’s a lot of data all over the world that’s not being protected or managed in the same way you’d want it to be.
Redford (Cisco): Sometimes I wonder whether there’s even a distinction between centralized and decentralized anymore. You can have a centralized control point and a common interface on the network wherever you happen to be. So it’s distributed and centralized at the same time.
Dix: How about from your viewpoint, Alexander? A number of companies have taken SAP instances from around the world and centralized them in one location.
Gebhart (SAP): It depends on the application. There are applications where centralization makes perfect sense. On the other hand, there are applications that are perfect for a distributed environment. There is no generic answer. But of course, we will see more distributed applications as we see more network bandwidth and reliability. If you look at the hardware evolution, it basically paves the road to a more distributed environment.
Travostino (Nortel): In my view, the defining problem for grid is decentralized control. Whether the resources are centralized or decentralized doesn’t matter.
Fogel (Intel): When I think of centralized vs. decentralized, I associate that with operating behind the firewall or outside of the firewall. I think we tend to think of operating behind the firewall as more of a centralized model. With grid and the evolution of the infrastructure, things are moving outside of the firewall. It’s very important for us to comprehend this whole idea of users and devices moving outside of the firewall and being able to accommodate that.
Martin (IBM): There really is no glass house data center anymore. Even though big data centers are efficient and powerful, they’re incredibly decentralized. A lot of the local processing of data is being done all around the world. The concept of having a centralized data center – other than from the physical standpoint – is essentially dead. It was dying in the client/server world, and grid is killing whatever remnants remained.
Dix: Let’s turn to the discussion of security. With so many resources outside of the firewall, so many scattered grid resources, how big of a problem is security?
Travostino (Nortel): I think there are a lot of new challenges that are brought about by grid. For example, access control and authorization take on a whole new meaning, because the roles and resources change very quickly, so all of the technology we have so far needs to be extended in that regard. One major area that needs to be addressed is the use of extensible standards. The tools that we have to write policy languages are primitive – there is no way, for example, to have a compiler detect a flaw or bad policies. So we’re still doing that largely by hand. And how is this going to scale when you have dynamic virtualization?
Redford (Cisco): We haven’t found a way to make security fundamentally work with the infrastructure we have now. But if we’ve learned anything, it’s that we need to have multi-level security; just like a bank has TV systems, guards, safes – and lots of different layers of security. There have to be multiple levels of security throughout the grid. We take the problems we have with security today and we multiply them substantially as we look at grid and the future of virtualization. There needs to be lots of thought put into the grid security piece, and it needs to be tackled in bite-sized chunks.
Feinberg (HP): There’s no new problem, per se, introduced by grid. Every enterprise has to think about going outside of the firewall when employees take a laptop out of the environment. I think grid may actually crystallize these problems so they can be solved. In some cases, enterprises don’t think through all the implications of security. Getting outside of the firewall forces people to think about it.
Gebhart (SAP): Transport-level, message-level and network-level security are very strong. But application security – sometimes you don’t know exactly where a node is residing. Usually an application developer writes for files to solve encryption and to use data that is stored locally and may be sensitive. In this area, security is weak, and also there is no standard here that can be used.
Redford (Cisco): If you really want to think about an application running across a distributed infrastructure, you can’t have that application asking for a user name and password 50,000 times while running across different grids. So determining pervasive credentials for application access across different domains would be a good starting point.
Martin (IBM): One of the traps that network engineers have fallen into is the ‘crunchy on the outside and soft and chewy in the middle’ model of network security. That security model is dying quickly, and the adoption of grid is going to push that over the edge. The idea of having a big corporate firewall – where inside the firewall is safe, and what’s outside is not – doesn’t work. What you have to do is start securing resources and identifying users in very specific ways with very specific authorization and define very specific rules. The grid environment is going to accelerate that, because you have to be able to secure resources.
Dix: The Globus Toolkit has been married to Web services, but observers say that some Web service tools aren’t adequate for grid performance. Do we need to see some core Web services tools evolve?
Travostino (Nortel): There is no other choice than using Web services. If you want to build an Internet-scale system of systems like a grid, you have to use Web services. But I don’t see performance of Web services being the most challenging problem. You can have fast paths and slow paths, versus data intensive paths – and define priorities and use XML accelerators and appliances that increase the performance of Web services. On the other hand, the reliability of Web services is very concerning. The dependability and robustness are the greater concerns.
Redford (Cisco): You’ve got to adopt the standards that are out there, and Web services is the reality. At Cisco we believe that by adding certain Web services into the network itself, we can make a lot of the fundamental mechanisms – like reliability – work. For example, right now we have to have application-level processes to guarantee a message going from one point to another point. But why? Why not have this be an intrinsic function within the network? Translation of different protocols can also be built into the network. So when we think about Web services and grid, they both have to evolve. Web services itself is still evolving so it’s not like it’s a done deal and static. Grid will force us to look at Web services in a different way. Web services, encapsulations and service-oriented architectures – these are all different ways to simplify programming.
Dix: Given that most of you expect grids to span geographies, the question is, are WANs keeping up?
Redford (Cisco): The bandwidth is there in the WAN – it’s a matter of whether you want to pay for it or not. So that’s just a service issue, of how expensive it will be. The greater issue is having intelligence inside the network. We’ve taken one step to making this whole infrastructure more autonomic, more self-provisioning. Where today a human has to make a decision, can tomorrow that decision be made at the network level? We have the technology to do that, it’s just a matter of determining how the protocols will work to balance it out.
Dix: OK, final question. Where do you think we are, what do you think will happen next?
Feinberg (HP): In terms of high-performance computing and the traditional grid computing – they’re moving us in a great direction. Companies like HP, IBM, Intel, Cisco, SAP, Nortel – they’re participating in the Globus Consortium and the GGF, and getting the standards out there. The commercial community is just getting engaged. Customers want to be able to utilize existing resources to solve greater problems. We’re at the starting point in commercial grids, but pretty far down the road in terms of viability for the scientific community. It’s still early, and we still need more collaboration.
Gebhart (SAP): We have completed advanced prototypes of grid. The next step will be to deliver grid applications.
Redford (Cisco): We definitely see grid as an early-stage technology. However, Cisco has been participating in grid-related activities for a while. We’re hopeful that as we get a better understanding of how the pieces will fit together, the network will continue to evolve to facilitate the application. We think you’ll start to see things coming out of the network industry this year that really apply to grids. We’ll see technologies that show this evolutionary path.
Travostino (Nortel): We’re moving along, but there is still a lot of hype. Nortel has produced a layer of middleware and integrated Globus Toolkit 4. We are learning from our beta customers. We think that there is good movement there, but we need to continue the discussion at the vendor level, to further the interoperability efforts.
Fogel (Intel): I see the network, or ‘fabric’, being elevated in terms of being one of the key resources, along with compute and storage – that has to be provisioned and aggregated and so on. I see a convergence of standards efforts. There are a number of different standards organizations that were announced this year – including the Globus Consortium. I see those bodies working together to create interoperable working groups and building blocks. In terms of technology, I see further steps in terms of convergence – addressing compute, storage management, storage reliability – being further converged into processing elements.
Martin (IBM): We’re starting to see people interconnect grids. Eventually, instead of many grids, we’ll see ‘the grid’, like the Internet, which was also many little pieces when it began. When the Internet was finally well connected and finally available, we got the Web. People think of the Web as the Internet, but really it’s an application that sits on the Internet. What we’re seeing today is a class of applications that sit on top of the grid. We’re building the infrastructure underneath to put all these things together and make them play together.