With WebRTC, Real-Time Communications Come to the Browser

The WebRTC standard aims to make peer-to-peer communication over the Web as easy as picking up a phone. Here's what developers need to know about WebRTC, including how to set it up and what limitations the protocol currently faces.

Everyone has had the experience of trying to join a Web conference, only to realize that you must first download some plug-in, upgrade Java or Flash, or install another application. If you've ever had to explain to a customer how or why she needs to download and install a plug-in or application to meet with you online, you've probably also had the experience of throwing up your hands and saying, "You know what? I'll just call you."

Live audio and video streams and live conferencing over the Web are changing business and personal communications, but they're often deemed too complicated or unreliable for many end users. Creating a real-time communications (RTC) application is certainly too complicated for the average Web developer.

All of this is changing now, thanks to Web Real-Time Communications (WebRTC), an industry standard for building peer-to-peer communications right into Web applications.

Today, Real-Time Communications Mostly Means 'Call Me'

Until WebRTC came along, real-time communications required a plugin or a native app. Users had to download, install, upgrade, launch, configure and troubleshoot software to get from "I need to do this" to "I've done it."

You can currently use several applications for real-time communication, including Skype, Facetime, Google Talk, Yahoo Messenger, iChat, GotoMeeting and join.me. This doesn't include the VOIP phones sitting on your physical desks or all the RTC apps on your smartphones and tablets, either. Undoubtedly, there are several other RTC apps on your computers that you've downloaded and used only once, or that were downloaded in the form of Flash or Java applications when you visited various business Websites and used their live chat features.

In short, real-time communication is all over the Web today, both literally and figuratively.

You can use some RTC clients to communicate with people using different RTC programs; for example, iChat talks to Yahoo Messenger. For the most part, though, each of these programs is designed to work best with, or only with, other computers running the same software or plugin.

The plain old telephone service (POTS), on the other hand, is familiar, universal and simple. It's no wonder that so many people and businesses are still unable to completely dump the horrible sound quality and expense of analog telephones and phone service over copper wires. You need that phone to call into a Web meeting when the audio portion of the call doesn't work. You may also have a smartphone that's almost always connected to a Wi-Fi network, yet you still pay for a wireless plan in addition to high-speed Internet and phone service.

WebRTC Offers Browser-Agnostic Solution

Contrast this current state of affairs with a vision of what's possible with WebRTC. Anyone with a Web browser and a microphone can make calls to anyone else with a Web browser and a microphone. If one or both parties has some sort of video camera, the call can also involve video.

Furthermore, the JavaScript APIs involved in enabling this peer-to-peer communication are simple enough that you can create a WebRTC client with just five or six lines of JavaScript and HTML. The browsers involved in the conversation basically handle everything on your behalf.

If you have any experience with VOIP and video connections, you know that VOIP generally involves proxy and firewall issues, as well as codecs and signaling protocols, which need to be agreed upon by all parties involved. The idea of WebRTC is that HTTP and the Web have already solved the problem of how to get data from one point to another with very few of these issues. The Web just works.

If you have a WebRTC-capable browser (e.g., Chrome or Firefox) installed on your computer, you can use that browser to communicate with any other WebRTC client.

If somebody else has a Web browser with WebRTC support, whether on a desktop computer, a smartphone or a super-awesome wristwatch communication device of the future, you can talk with that person in real time just as easily and trouble-free as if you had picked up the handset of a 1960s wall-mounted rotary phone provided by the central telephone company.

How WebRTC Works: Establish Connection, Create Stream

In 2010, Google acquired Global IP Solutions (GIPS), which developed codecs and real-time voice and video software. In 2011, Google released Hangouts, which uses technology from GIPS, and open sourced the GIPS technologies in the form of WebRTC. (As of this writing, Hangouts still uses a plugin, but rumor (and logic) has it that a WebRTC version is in the works.) WebRTC 1.0 is currently a W3C Working Draft. Although the Working Draft has been implemented in several browsers already, the specification remains very much in flux.

The first step in establishing a voice and video connection between peers is to gain access to the microphone and camera on each device. Until recently, this wasn't really possible with Web browsers. The W3C developed a simple API called the Media Capture API that has gained some support among browser makers and was recently partially baked into Mobile Safari.

However, Media Capture doesn't provide any means for streaming video or audio. That's where the MediaStream API comes in.

The job of the MediaStream API is to ask the user for permission to access a camera and microphone and then to create a synchronized video and audio stream. It does this with a JavaScript method called getUserMedia().

The basic code for creating a stream and displaying it using an HTML5 video tag is as follows. It is taken, and modified slightly, from the Mozilla.org getUserMedia docs.

<script>
  navigator.getMedia = (navigator.getUserMedia ||
                        navigator.webkitGetUserMedia ||
                        navigator.mozGetUserMedia ||
                        navigator.msGetUserMedia);
  navigator.getMedia(
    // constraints
    {
      video: true,
      audio: true
    },
    // successCallback
    function(localMediaStream) {
      var video = document.getElementsByTagName('video')[0];
      video.src = window.URL.createObjectURL(localMediaStream);
      video.onloadedmetadata = function(e) {
        // Do something with the video here.
      };
    },
    // errorCallback
    function(err) {
      console.log("The following error occurred: " + err);
    }
  );
</script>
<video autoplay></video>
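
Newer browser releases also expose a promise-based form of this call, navigator.mediaDevices.getUserMedia(), and let a stream be attached to a video element through its srcObject property. What follows is a minimal sketch of the same preview in that style, not a drop-in replacement for the listing above. The helper name startLocalPreview is hypothetical, and the media source and video element are passed in as parameters so the function can also be exercised outside a browser.

```javascript
// Hypothetical helper, not part of any WebRTC API.
// mediaDevices: an object with getUserMedia(), normally navigator.mediaDevices.
// videoElement: the <video> element that should show the local preview.
async function startLocalPreview(mediaDevices, videoElement) {
  // Ask for camera and microphone access; the promise rejects if access is denied.
  const stream = await mediaDevices.getUserMedia({ video: true, audio: true });
  // Attach the stream directly instead of creating an object URL.
  videoElement.srcObject = stream;
  return stream;
}

// This part only runs in a browser that provides navigator.mediaDevices.
if (typeof navigator !== "undefined" && navigator.mediaDevices &&
    typeof document !== "undefined") {
  startLocalPreview(navigator.mediaDevices, document.querySelector("video"))
    .catch(function (err) {
      console.log("The following error occurred: " + err);
    });
}
```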

More Than One Way to Establish a Connection

Once a media stream is created, WebRTC uses the RTCPeerConnection API to communicate streaming data between peers. RTCPeerConnection, like MediaStream, employs a very simple interface. It takes a media stream and sends it to another recipient, where it's loaded into an RTCPeerConnection on their end. That's all there is to it.

Under the hood, however, RTCPeerConnection has to handle the following tasks:

  • Signal processing (including echo cancellation and noise reduction)
  • Codec selection
  • Peer-to-peer communication
  • Encryption
  • Bandwidth management

Before streaming between peers can begin, though, a process known as signaling must occur. Signaling is where things usually get sticky. Rather than reinvent the wheel, WebRTC leaves signaling up to the application. A developer is free to choose any two-way communication protocol, whether it's SIP, XMPP, WebSocket or even just JSON data.
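
Because WebRTC does not define the signaling format, every application invents its own. As one possible illustration of the "just JSON data" option, the sketch below wraps offers, answers and ICE candidates in a small JSON envelope. The message types, field names, function names and peer names are all assumptions made for this example, and the SDP string is a placeholder; the transport that carries the text (WebSocket, XMPP or anything else) is left out.

```javascript
// Hypothetical message types for an application-defined signaling protocol.
const SIGNAL_TYPES = ["offer", "answer", "candidate"];

// Wrap a session description or ICE candidate in a JSON envelope.
function encodeSignal(type, from, to, payload) {
  if (!SIGNAL_TYPES.includes(type)) {
    throw new Error("Unknown signal type: " + type);
  }
  return JSON.stringify({ type: type, from: from, to: to, payload: payload });
}

// Parse and validate an envelope received from the signaling channel.
function decodeSignal(text) {
  const message = JSON.parse(text);
  if (!SIGNAL_TYPES.includes(message.type)) {
    throw new Error("Unknown signal type: " + message.type);
  }
  return message;
}

// Example: a caller known as "alice" addresses its offer to "bob".
// The sdp value is a placeholder, not a real session description.
const wire = encodeSignal("offer", "alice", "bob", { type: "offer", sdp: "v=0..." });
console.log(decodeSignal(wire).to); // prints "bob"
```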

Signaling is the part of a WebRTC system that does require servers to get around firewalls and network address translation (NAT). Most often, all that's needed to initiate a peer-to-peer connection is a public IP address for the sender. WebRTC uses a Session Traversal Utilities for NAT (STUN) server to tell a WebRTC application that's behind a firewall its public IP address. The WebRTC application can then proceed with establishing a peer-to-peer connection with another application.

If, for some reason, establishing a peer-to-peer connection fails, WebRTC can fall back to establishing a connection through a server by using a technology called Traversal Using Relays around NAT (TURN). Obviously, a direct connection that needs a server only to discover public addresses (STUN) is less expensive and more efficient than routing all data through a relay server (TURN). Whenever possible, then, you want to establish connections using STUN.

The job of making sure that your calls are always as low-cost as possible falls to a protocol called Interactive Connectivity Establishment (ICE). In the majority of cases (86 percent of the time, according to Google), you can make video calls work using only STUN. The result is a huge savings in server resources over real-time video systems that always require a server.
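
An application tells the browser which STUN and TURN servers it may use through the configuration object passed to RTCPeerConnection. The sketch below builds such an object. The helper name buildIceConfig is hypothetical, and the server URLs and credentials are placeholders rather than real services. The iceServers and urls property names follow the current specification; early browser builds may expect slightly different names.

```javascript
// Build the configuration object passed to the RTCPeerConnection constructor.
// stunUrl: address of a STUN server; turn: optional TURN relay settings.
function buildIceConfig(stunUrl, turn) {
  const iceServers = [{ urls: stunUrl }];
  if (turn) {
    // A TURN server relays media, so it normally requires credentials.
    iceServers.push({
      urls: turn.url,
      username: turn.username,
      credential: turn.credential
    });
  }
  return { iceServers: iceServers };
}

// Placeholder servers and credentials, for illustration only.
const config = buildIceConfig("stun:stun.example.org", {
  url: "turn:turn.example.org",
  username: "demo-user",
  credential: "demo-secret"
});

// In a browser, ICE gathers candidates through these servers when connecting.
if (typeof RTCPeerConnection !== "undefined") {
  const connection = new RTCPeerConnection(config);
}
```

Listing both kinds of server lets ICE attempt the direct, STUN-assisted path and use the TURN relay only when nothing else connects.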

Here's a simple example of how RTCPeerConnection does its thing:

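<script>
  // localStream, remoteVideo and sendOffer() are assumed to be defined elsewhere by the application.
  var pc = new RTCPeerConnection(null);
  pc.onaddstream = gotRemoteStream;  // fires when the remote peer's media arrives
  pc.addStream(localStream);         // send the local media to the peer
  pc.createOffer(gotOffer);          // start negotiation
  function gotOffer(desc) {
    pc.setLocalDescription(desc);
    sendOffer(desc);                 // deliver the offer over the signaling channel
  }
  function gotAnswer(desc) {         // call this when the remote peer's answer comes back
    pc.setRemoteDescription(desc);
  }
  function gotRemoteStream(e) {
    // attachMediaStream() is assumed to be a display helper provided by the page,
    // such as the one in Google's adapter.js shim.
    attachMediaStream(remoteVideo, e.stream);
  }
</script>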
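
The listing above covers the calling side only. As a rough sketch of the answering side, the function below accepts an offer received over the signaling channel and produces an answer. It assumes the promise-returning forms of setRemoteDescription(), createAnswer() and setLocalDescription() found in later browser versions. The names handleOffer and sendAnswer are hypothetical; sendAnswer stands in for whatever the application uses to deliver the answer back to the caller.

```javascript
// Hypothetical callee-side handler.
// pc: an RTCPeerConnection (or any object with the same promise-returning methods).
// offer: the session description received from the caller.
// sendAnswer: application-defined function that delivers the answer to the caller.
async function handleOffer(pc, offer, sendAnswer) {
  await pc.setRemoteDescription(offer);   // record what the caller proposed
  const answer = await pc.createAnswer(); // describe what this side supports
  await pc.setLocalDescription(answer);   // commit to the answer locally
  sendAnswer(answer);                     // the caller hands this to its gotAnswer()
  return answer;
}
```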

At Google I/O 2013, Google nonchalantly demonstrated a full video chat client that was written using only 50 or so lines of JavaScript.

For data exchange over peer-to-peer connections, WebRTC includes an RTCDataChannel API. This uses the capabilities of RTCPeerConnection so users can exchange any type of data without having to worry about firewalls, proxies, size restrictions, third-party applications and other headaches that have plagued the seemingly simple task of transferring a file to another person since the beginning of computing.

RTCDataChannel uses an API modeled on WebSocket. A simple send() method and the onmessage event handler are all that's required to enable bidirectional, low-latency data connections over RTCPeerConnection. Possible applications include gaming, screen sharing, and even secure large file sharing.
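
To show how little is involved, here is a minimal sketch of a text chat built on those two members. The helper name wireChat is hypothetical. It accepts any object that offers send() and onmessage, which is why the same code would also work with a WebSocket. The browser-only lines are left as comments and assume an RTCPeerConnection named pc that has already been set up.

```javascript
// Hypothetical helper: wires a text chat onto an RTCDataChannel-like object.
// channel: anything with a send() method and an onmessage handler slot.
// onText: callback invoked with the text of each incoming message.
function wireChat(channel, onText) {
  channel.onmessage = function (event) {
    onText(event.data);
  };
  // Return a function the application can call to send text to the peer.
  return function sendText(text) {
    channel.send(text);
  };
}

// Browser-only usage, assuming pc is an existing RTCPeerConnection:
// const channel = pc.createDataChannel("chat");
// const say = wireChat(channel, function (text) { console.log("peer: " + text); });
// channel.onopen = function () { say("hello"); };
```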

RTCDataChannel can also deliver data in reliable or unreliable modes. If you just need data with the lowest possible latency, and a missed packet is OK once in a while, unreliable mode is the way to go. If the data must be correct (as in file transfer, for example), use reliable mode instead. ShareFest, an open source one-to-many sharing application, uses RTCDataChannel to allow sharing of files of up to around 1 GB without going through a server.
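
The delivery mode is chosen when the channel is created. The sketch below maps the two modes onto createDataChannel() options, using the ordered and maxRetransmits option names from the current specification; early browser builds may use different option names, so treat this as an assumption to verify against your target browsers. The helper name channelOptions and the channel labels are hypothetical.

```javascript
// Map the two delivery modes onto createDataChannel() options.
function channelOptions(mode) {
  if (mode === "unreliable") {
    // Unordered and never retransmitted: lowest latency, but packets may be lost.
    return { ordered: false, maxRetransmits: 0 };
  }
  if (mode === "reliable") {
    // Ordered with retransmission, which is also the default behavior.
    return { ordered: true };
  }
  throw new Error("Unknown mode: " + mode);
}

// Browser-only usage, assuming pc is an existing RTCPeerConnection:
// const gameState = pc.createDataChannel("game", channelOptions("unreliable"));
// const fileTransfer = pc.createDataChannel("file", channelOptions("reliable"));
```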

WebRTC Has Amazing Potential, and Limitations, Too

With WebRTC, the potential exists to clear the biggest hurdle in Web communications: making peer-to-peer voice and video (and data sharing, for that matter) as easy as typing messages into forms through the Web today.

Potential WebRTC applications beyond video conferencing include the following:

  • Websites that improve readability based on how far away the reader's head is.
  • Customer support calls that seamlessly integrate video, audio and desktop sharing.
  • Gaming.
  • Photo booth or audio recording apps that don't require a second endpoint.

WebRTC is supported and enabled by default in Google Chrome, Chrome for Android and the latest beta of Firefox. It can be used in Internet Explorer with Chrome Frame, a plugin that enables open Web technologies in Internet Explorer. Although Mobile Safari doesn't yet support WebRTC, Ericsson Labs' Bowser browser currently makes WebRTC possible on iOS and Android.

Of course, this is the Web, and nothing is dead easy. Both Microsoft and Apple have huge investments in their own RTC solutions. Microsoft has raised objections to Google's VP8 codec and hasn't added support for WebRTC to Internet Explorer. Apple's position on WebRTC isn't known, but it's speculated that Apple sees WebRTC as a threat to FaceTime; as a result, Apple may not be in much of a hurry to implement it.

Then there's the problem of how to implement conference calls in a peer-to-peer network. Groups larger than five users present real problems for WebRTC, owing to the complexity of updating and routing data from each peer to every other peer. The mathematics alone becomes overwhelming as the number of parties grows: in a full mesh of n peers, each peer must send to n-1 others, so the number of connections grows as n(n-1)/2, on the order of n squared.

More practical architectures for enabling peer-to-peer communications in larger groups include the star architecture, where one peer acts as the focus of the call and sends and receives to all other peers, or a server called a multipoint control unit (MCU), which relays all data between each of the peers.
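
The difference between the topologies is easy to put in numbers. The sketch below counts the connections needed for a full mesh, where every peer links directly to every other peer, and for a star, where one peer acts as the focus. The function names are made up for this illustration.

```javascript
// Connections needed when every peer links directly to every other peer.
function meshConnections(peers) {
  return (peers * (peers - 1)) / 2;
}

// Connections needed when one focus peer links to each of the other peers.
function starConnections(peers) {
  return peers - 1;
}

for (const peers of [2, 5, 10]) {
  console.log(peers + " peers: mesh " + meshConnections(peers) +
              ", star " + starConnections(peers));
}
// prints:
// 2 peers: mesh 1, star 1
// 5 peers: mesh 10, star 4
// 10 peers: mesh 45, star 9
```

In the mesh, each participant also has to encode and upload its stream n-1 times, which is usually what runs out first on a home connection.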

Security, meanwhile, is built into WebRTC in a serious way. First, all camera and microphone access is explicitly opt-in. That is, the browser will ask the user for each session whether the application can access the camera and microphone, and the user must click OK. Next, all data shared between peers is encrypted using AES. Lastly, because WebRTC doesn't use any plug-ins, it runs within the browser sandbox and has only the same access to the user's computer as any Web application does.

How WebRTC Is Being Used Today
