According to industry folklore, the Open Compute Project (OCP) is a cost-cutting innovator that scaled up Facebook's data centers with commodity hardware. Last week, Wired and CIO speculated that OCP's announcement of the Telco Project, with large telecoms such as AT&T, Deutsche Telekom, and SK Telecom, targeted Cisco's and HP's proprietary gear with open-source commodity hardware so that the telecoms could compete with cloud service providers Amazon, Microsoft, and Google.
In the wake of the announcement, I asked Professor Frank Fitzek of the Technical University of Dresden to explain how large telecoms could use an OCP commodity hardware platform.
As holder of the Deutsche Telekom Chair of Communication Networks, Dr. Fitzek coordinates the 5G Lab in Germany, giving him insight into the future of networking. Generally, Dr. Fitzek sees the OCP Telco Project as a strategic initiative to move the cloud closer to the network's edge, where apps can be built to perform complex functions. It is a telecom-specific platform that combines Network Function Virtualization (NFV) and Software Defined Networking (SDN). He sees many more types of apps than the classic policy management use cases that the technical literature uses to explain SDN. He predicts an Internet of Things (IoT) with apps at the network edge that control autonomous vehicles and other types of autonomous devices, including robots, drones, and farm equipment.
Dr. Fitzek's vision of the IoT doesn't consist of islands of smart sensors and autonomous devices loosely coupled by a best-effort internet. He described a new 5G IoT platform that would enable not just a huge network of sensors but also real-time control of vehicles and robotic systems. The IoT, according to Dr. Fitzek, will have islands of automation interconnected by very low-latency, error-correcting 5G networks capable of coordinating thousands of autonomous vehicles traveling at 150 mph and robots interacting with traffic and humans. His design scale is 50 billion IoT devices, extensible to 500 billion.
Dr. Fitzek said the Google autonomous car, guided by a local cloud of cameras, lasers, radios, and sensors, can't detect and react to an upstream collision out of the range of the local cloud because the latency is too great. If the latency is greater than 1 ms, control and coordination of large-scale systems operating over long distances become difficult.
In this vision, growth in latency increases the reaction time of autonomous cars, meaning lower speeds and greater following distances between cars, which reduces highway throughput and increases travel time. Autonomous vehicles also can't see around corners on their own; 5G-connected vehicles can.
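To put the latency argument in concrete terms, the short Python sketch below estimates how far a vehicle travels while a control message is still in flight. The 150 mph speed and the 1 ms target come from the discussion above; the 10 ms and 100 ms comparison points and the function name blind_distance_m are illustrative assumptions, not figures from Dr. Fitzek.

```python
# Rough arithmetic only: distance a vehicle covers during one network delay.
MPH_TO_MPS = 1609.344 / 3600.0  # metres per second in one mile per hour


def blind_distance_m(speed_mph: float, latency_ms: float) -> float:
    """Metres travelled before a message sent now can take effect."""
    return speed_mph * MPH_TO_MPS * (latency_ms / 1000.0)


# 1 ms is the target cited in the article; 10 ms and 100 ms are
# assumed comparison points chosen for illustration.
for latency_ms in (1, 10, 100):
    distance = blind_distance_m(150, latency_ms)
    print(f"{latency_ms:>4} ms -> {distance:.2f} m")
```

At 150 mph, a 1 ms delay corresponds to roughly 7 centimeters of travel, while a 100 ms delay corresponds to nearly 7 meters, which is more than a typical car length.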
As latency increases, coordination becomes more difficult and finally fails. Take, for example, a person who uses a video feed to control a robot that must catch a ball thrown to it. At very low latency, it is easy for the person to observe the throw, adjust the robot's hands to match the ball's trajectory, and catch the ball. Now introduce a communications delay in both the video of the ball and the control of the robot. It becomes more difficult to coordinate the robot and catch the ball. As the latency increases, the person controlling the robot must make estimates based on less and less information about the ball, until finally the ball can no longer be caught.
In Dr. Fitzek's scenario, 5G network nodes can be located at distances of up to 60 miles from an autonomous vehicle or system. Autonomous cars would use all sensors to steer in traffic, relying on low-latency 5G to look around corners and synchronize the cars with changing traffic conditions or accidents.
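The 60-mile figure can be checked against the 1 ms latency target with simple propagation arithmetic. The sketch below is a back-of-the-envelope estimate, not Dr. Fitzek's derivation: it assumes the signal travels through optical fiber at about two-thirds the speed of light (a common rule of thumb) and ignores processing, queuing, and radio-access delays. The function name propagation_delay_ms is hypothetical.

```python
# Propagation delay only; real networks add processing and queuing time.
SPEED_OF_LIGHT_KM_S = 299_792.458
FIBER_VELOCITY_FACTOR = 2 / 3  # assumption: refractive index of about 1.5
KM_PER_MILE = 1.609344


def propagation_delay_ms(distance_miles: float, velocity_factor: float = 1.0) -> float:
    """One-way propagation delay in milliseconds over the given distance."""
    distance_km = distance_miles * KM_PER_MILE
    return distance_km / (SPEED_OF_LIGHT_KM_S * velocity_factor) * 1000.0


one_way_ms = propagation_delay_ms(60, FIBER_VELOCITY_FACTOR)
round_trip_ms = 2 * one_way_ms
print(f"one way: {one_way_ms:.3f} ms, round trip: {round_trip_ms:.3f} ms")
```

Under these assumptions, a 60-mile fiber path costs a little under 0.5 ms each way, so a round trip alone consumes nearly all of a 1 ms budget. That is one plausible reason the edge nodes cannot sit much farther away.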
Stateless data transmission using Random Linear Network Coding (RLNC) enables control of the mobile edge cloud to be distributed, increasing its resilience without increasing latency. The on-the-fly coding properties of RLNC distribute error correction and the associated load across all nodes on the network. RLNC is an elegant approach to stateless transmission that avoids the network congestion caused by error correction and retransmission, which degrades network performance. A receiver can reconstruct a lost packet from coded packets that arrive later. In over-simplified terms, each RLNC-encoded packet is a linear combination of previously sequenced packets, weighted by randomly generated coefficients. Since the RLNC sender doesn't need to listen for acknowledgements of successful transmission and perhaps retransmit, it can transmit continuously at near-wire speed, optimized for latency and network throughput. If recovering a lost or damaged packet takes too long, the autonomous node can ask nearby nodes to broadcast additional randomly coded packets, raising the odds that recovery completes sooner.
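The RLNC idea described above can be illustrated with a toy Python sketch. It is a deliberate simplification and not the scheme used in any 5G deployment: it codes a fixed block of four packets over the binary field GF(2), where a linear combination is just an XOR, whereas practical systems typically use larger fields and sliding windows. The names encode and Decoder, the sample messages, and the 30 percent loss rate are all hypothetical.

```python
import random


def encode(packets, rng):
    """Return (coefficient bitmask, XOR of the source packets it selects)."""
    mask = rng.randrange(1, 1 << len(packets))  # random non-zero coefficients
    payload = 0
    for i, packet in enumerate(packets):
        if (mask >> i) & 1:
            payload ^= packet
    return mask, payload


class Decoder:
    """Incremental Gaussian elimination over GF(2)."""

    def __init__(self, k):
        self.k = k
        self.rows = {}  # pivot bit -> (mask, payload), kept fully reduced

    def receive(self, mask, payload):
        """Absorb one coded packet; return True if it added new information."""
        for pivot, (m, p) in self.rows.items():
            if (mask >> pivot) & 1:
                mask ^= m
                payload ^= p
        if mask == 0:
            return False  # linearly dependent on packets already held
        pivot = mask.bit_length() - 1
        for q, (m, p) in list(self.rows.items()):
            if (m >> pivot) & 1:
                self.rows[q] = (m ^ mask, p ^ payload)
        self.rows[pivot] = (mask, payload)
        return True

    def complete(self):
        return len(self.rows) == self.k

    def decode(self):
        return [self.rows[i][1] for i in range(self.k)]


rng = random.Random(7)  # fixed seed so the run is repeatable
messages = [b"brake", b"lane2", b"speed", b"merge"]
source = [int.from_bytes(m, "big") for m in messages]

decoder = Decoder(len(source))
sent = 0
while not decoder.complete():
    mask, payload = encode(source, rng)
    sent += 1
    if rng.random() < 0.3:  # assumed 30% packet loss, no retransmission
        continue
    decoder.receive(mask, payload)

recovered = [value.to_bytes(5, "big") for value in decoder.decode()]
print(recovered == messages, "after", sent, "coded packets sent")
```

The sender never learns which packets were lost. It simply keeps emitting fresh random combinations, and the receiver recovers the original block once it has collected enough linearly independent ones, regardless of which specific packets arrived.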
After my conversation with Dr. Fitzek, the OCP Telco Project looks to me like the start of standardization of an open design for commodity NFV/SDN application servers at the network edge. The telecoms are defining a new platform that moves the cloud closer to the network edge, enabling low-latency control that would not be possible with today's incarnation of the internet and cloud.