NextIO rolls out vCORE GPU appliance

NextIO recently introduced the vCORE consolidation appliance – a shared-resource appliance that aggregates GPUs in a single external enclosure.

I/O virtualization picks up steam

The vCORE appliance consists of a 4U (7-inch high), 20-inch deep enclosure containing either eight double-wide or 16 single-wide GPUs, which can be connected to servers so that GPUs are consolidated and shared in an industry standards-based fashion. The appliance currently accommodates NVIDIA Tesla or Quadro GPUs, although in the future it will work with GPUs from other vendors. It attaches to any x86-based blade or rack-mounted server via the PCIe bus. Having the GPUs concentrated in a single enclosure also provides investment protection and future-proofing – GPUs can be dynamically added to the system as needed or replaced when failures occur without affecting workloads running on other GPUs.

The vCORE consolidation appliance has 3+1 redundant fan cooling and 2,400W of power and cooling capacity. Carrier cards incorporated into the appliance allow hot-swappable GPU replacement. Further, the appliance can be managed remotely via the included nConnect management software, which offers a GUI, a command line and a third-party API. The vCORE appliance also supports the I/O virtualization specification for Fibre Channel and Gigabit Ethernet, which allows multiple operating systems to run simultaneously within a single computer and natively share PCIe devices.

Originally used for 3D gaming acceleration, GPUs have now come to the forefront of scientific computing, financial services' Monte Carlo simulations, oil and gas exploration and pattern recognition, among other applications. These are embarrassingly parallel workloads involving very large data sets and heavy floating-point computation – work that can be broken down into independent parallel processes, saving organizations time and money.
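
To make "embarrassingly parallel" concrete, here is a minimal CUDA sketch of a Monte Carlo estimate of pi, the same class of computation used in financial simulations: every GPU thread draws its own random samples independently, so the work scales across thousands of cores with no coordination until the final tally. It uses only standard CUDA and cuRAND calls and is illustrative only – it is not based on any NextIO or vCORE software.

// Monte Carlo estimate of pi: each thread samples points in the unit square
// independently and counts how many fall inside the quarter-circle.
#include <cstdio>
#include <curand_kernel.h>

__global__ void monte_carlo_pi(unsigned long long *hits,
                               int samples_per_thread,
                               unsigned long long seed)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    curandState state;
    curand_init(seed, tid, 0, &state);   // independent RNG stream per thread

    unsigned long long local_hits = 0;
    for (int i = 0; i < samples_per_thread; ++i) {
        float x = curand_uniform(&state);
        float y = curand_uniform(&state);
        if (x * x + y * y <= 1.0f)       // inside the quarter-circle
            ++local_hits;
    }
    atomicAdd(hits, local_hits);         // combine per-thread tallies
}

int main()
{
    const int threads = 256, blocks = 256, samples_per_thread = 4096;
    unsigned long long *d_hits, h_hits = 0;

    cudaMalloc(&d_hits, sizeof(unsigned long long));
    cudaMemcpy(d_hits, &h_hits, sizeof(h_hits), cudaMemcpyHostToDevice);

    monte_carlo_pi<<<blocks, threads>>>(d_hits, samples_per_thread, 1234ULL);
    cudaMemcpy(&h_hits, d_hits, sizeof(h_hits), cudaMemcpyDeviceToHost);

    double total = (double)threads * blocks * samples_per_thread;
    printf("pi ~= %f\n", 4.0 * h_hits / total);

    cudaFree(d_hits);
    return 0;
}

Because no thread depends on any other thread's samples, adding more GPUs simply divides the sample count – which is exactly why pooling GPUs in an external appliance is attractive for these workloads.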

All the advantages of using GPUs for technical and scientific computing come with a wealth of challenges, mostly centered on GPU maintenance and replacement. Workloads are typically split so that the sequential part of the application runs on the CPU and the computationally intensive part runs on the GPU. If the GPU requires replacement, all workloads running on the server containing that GPU must be stopped, the server taken down and the GPU replaced. The same thing happens when new firmware needs to be loaded or when a newer GPU needs to be added to the system – operations stop while labor-intensive, high touch-point GPU maintenance occurs. This in turn impacts every application running on the server, not just the application running on the failed GPU – an event that delays an organization's ability to do its work.
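
The sketch below shows why this failure domain is so broad in a direct-attach design: a typical CUDA application discovers the server's GPUs at startup and binds to one for the life of the process. It uses only standard CUDA runtime calls (cudaGetDeviceCount, cudaSetDevice) and is an illustration of the general pattern, not of vCORE behavior. Since the device list is fixed once the process starts, pulling or replacing a directly attached GPU means restarting every job on that server.

// Typical startup pattern: enumerate GPUs, then bind to one for the
// life of the process. Standard CUDA runtime API only.
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess || count == 0) {
        // No usable GPU: in a fixed direct-attach setup the job fails here.
        fprintf(stderr, "no GPU available: %s\n", cudaGetErrorString(err));
        return 1;
    }

    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        printf("device %d: %s\n", i, prop.name);
    }

    // Bind this process to device 0 for the rest of its lifetime.
    cudaSetDevice(0);
    // ... launch kernels here ...
    return 0;
}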

As the number of jobs and GPUs increases, so do the dependencies among them and the potential for resource contention. Jobs that don't require GPUs for computation may be assigned to servers containing GPUs, while workloads that do require GPUs wait for them to become available.

The result of these contention and maintenance issues is that system administrators tend to over-provision GPUs to ensure one is always available when needed for a planned workload. This approach is clearly problematic – TCO increases, labor intensifies and ROI decreases.
