Web search engines for IoT: The new frontier

A new method for searching the web is needed to allow IoT devices to independently and securely discover other “things” in the connected world of the future

We are all intimately familiar with “googling” a keyword in a search engine to find related websites. For example, searching for “best French restaurant” in Google or Yahoo returns a list of websites related to that topic. However, this key feature of the current Web will have to be fundamentally reworked for the new types of devices that are expected to join the Web as part of the Internet of Things (IoT). After all, how will search work when your fridge needs to look something up, as it will before too long?

Traditional web search engines

When thinking about any technology evolution, it is useful to first understand how the current generation of technology works before we try to predict what will happen in the future. So let’s briefly review how search engines work today.

Search engines primarily use automated programs called Web crawlers to discover and visit every website they can reach on the Internet. At each visited website, the Web crawler makes a copy of the content and records it in a large database at the search engine. This database is then analyzed offline, and a fast lookup index is created so that a rapid search can be performed every time a human user sends a keyword search request. The result of the lookup is a ranked list of website addresses (i.e., Uniform Resource Identifiers, or URIs) that correspond to the keyword that was searched for. In the current Web, all the information transferred between the Web browser, the website and the search engine server is carried over the ubiquitous and well-known HTTP protocol.
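To make the crawl-then-index idea concrete, here is a minimal sketch in Python. It assumes the crawler has already copied three pages into a dictionary; the URIs, the page text and the function names `build_index` and `search` are invented for illustration, and ranking by raw word count is a deliberate simplification of what a real search engine does.

```python
from collections import defaultdict

def build_index(pages):
    """Offline step: map each lowercase word to {uri: occurrence count}."""
    index = defaultdict(dict)
    for uri, text in pages.items():
        for word in text.lower().split():
            index[word][uri] = index[word].get(uri, 0) + 1
    return index

def search(index, keyword):
    """Online step: return matching URIs, most occurrences first."""
    hits = index.get(keyword.lower(), {})
    # Ties are broken alphabetically by URI so results are deterministic.
    return sorted(hits, key=lambda uri: (-hits[uri], uri))

# Hypothetical crawled content (illustrative only).
pages = {
    "http://example.com/a": "best french restaurant in town",
    "http://example.com/b": "french bakery and french cafe",
    "http://example.com/c": "italian restaurant",
}
index = build_index(pages)
print(search(index, "french"))
```

The expensive work happens once in `build_index`, so each later keyword request is just a dictionary lookup followed by a sort.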

The search engine problem in IoT

The existing Pull model of information exchange, in which search engines send out Web crawlers to discover web server information, will unfortunately not work for most IoT cases. There are two main reasons for this.

First, many IoT devices will be battery or solar powered and thus will often be “sleeping” in a low-power mode when not performing their intended function. For example, a high-temperature sensor in a remote industrial application may only be physically activated when its hardware is heated above a certain temperature. When this happens, the sensor will wake up and send an HTTP message to a central controller to report an alarm. Below this temperature the sensor will be inactive and in sleep mode. So in general this temperature sensor will not be discoverable by Web crawlers sent out by a traditional search engine, as it will be sleeping most of the time and will not respond.

Second, many IoT devices will be located in semi-closed networks that will block traditional search engine Web crawlers from discovering them. For example, a fitness center may freely allow Web crawlers to discover its treadmills and other exercise equipment. However, for security and privacy reasons, the fitness center will use a firewall to block discovery of IoT devices like electronic door locks and video cameras.

Emerging solutions

A key solution for the IoT search problem is currently being standardized in the Internet Engineering Task Force (IETF). Specifically, a new type of search engine called a Resource Directory (RD) is being defined. This will be a highly distributed search engine, with multiple RDs expected for a given geographical area like a city. IoT devices are expected to register their web addresses (URIs) with their local RD using a Push model. This will typically be done when the IoT device is first installed and powered up.
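The Push model can be sketched as a toy in-memory directory. This is not the IETF protocol itself: the class, its method names, the device name and the URI are all hypothetical, and the registration lifetime is an assumption added here to show how stale entries might be dropped.

```python
import time

class ResourceDirectory:
    """Toy in-memory RD: devices push their URIs; entries expire after a lifetime."""

    def __init__(self):
        self._entries = {}  # endpoint name -> (list of URIs, expiry timestamp)

    def register(self, endpoint, uris, lifetime_s=3600, now=None):
        """Called by the device itself, e.g. when it is first powered up."""
        now = time.time() if now is None else now
        self._entries[endpoint] = (list(uris), now + lifetime_s)

    def lookup(self, endpoint, now=None):
        """Answered by the RD alone; the device can stay asleep."""
        now = time.time() if now is None else now
        uris, expires = self._entries.get(endpoint, ([], 0))
        return uris if now < expires else []

rd = ResourceDirectory()
# Hypothetical sensor registering once at time 0 with a 60-second lifetime.
rd.register("temp-sensor-1", ["http://sensor-1.example/temp"], lifetime_s=60, now=0)
print(rd.lookup("temp-sensor-1", now=30))
```

Because the directory answers lookups from its own records, a sleeping sensor never has to respond to a crawler; it only has to wake up long enough to register.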

Then when a search request is sent to the RD, the RD will first do access control and other security checks to make sure that only authorized parties are allowed to discover the relevant information. For example, suppose the fridge in my house wants to discover my home electricity meter to check the current time-of-use charge rate. The fridge wants to use this information to adjust its internal temperature up or down, within a certain bound, to reduce my electricity costs. In this case, the RD that serves my neighborhood will allow my fridge to discover the electricity meter URI because it knows that they are both part of my home network and are trusted devices. However, if an IoT device from my neighbor’s house made a similar request, the neighborhood RD would return an error message as that foreign device is not authorized to make that search query.
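The fridge and electricity meter scenario can be illustrated with a small sketch. It assumes the RD decides trust purely by comparing a home-network identifier recorded at registration; that policy, along with the class name, device names and URIs, is hypothetical and much simpler than a real access control scheme.

```python
class NeighborhoodRD:
    """Toy RD that only lets devices discover others on the same home network."""

    def __init__(self):
        self._uris = {}     # device id -> registered URI
        self._network = {}  # device id -> home network id

    def register(self, device_id, network_id, uri):
        self._uris[device_id] = uri
        self._network[device_id] = network_id

    def discover(self, requester_id, target_id):
        if target_id not in self._uris:
            raise LookupError(f"{target_id} is not registered")
        # Assumed policy: requester and target must share a home network.
        if self._network.get(requester_id) != self._network[target_id]:
            raise PermissionError(f"{requester_id} may not discover {target_id}")
        return self._uris[target_id]

rd = NeighborhoodRD()
rd.register("fridge", "home-42", "http://fridge.home-42.example/")
rd.register("meter", "home-42", "http://meter.home-42.example/rate")
rd.register("neighbor-tv", "home-43", "http://tv.home-43.example/")

print(rd.discover("fridge", "meter"))   # same home network: URI is returned
try:
    rd.discover("neighbor-tv", "meter")  # different home network: rejected
except PermissionError as err:
    print("error:", err)
```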

In addition to the IETF, another important body working on the challenges of IoT web searches is the Hypercat consortium. It is developing specifications that will allow data to be exchanged between data hubs in different domains. This will allow, for example, exchange of data between a neighborhood RD and Google’s global search engine.

A bright future

A major reason for the success of the Web over the last 20 years has been the use of search engines to organize a huge amount of web information and make it easily accessible to human users. If we wish to continue this success with the billions of IoT devices that are expected to join the Web over the coming years, then we will have to keep innovating. Fortunately, next-generation solutions like the IETF’s Resource Directory concept and the Hypercat metadata specification are under development, with much more on the horizon. It looks like search engine evolution is keeping pace with the parallel innovation going on in the worlds of 5G and IoT.

This article is published as part of the IDG Contributor Network.