Network World

Software moves streams in real time

By Ugur Cetintemel, Network World, 05/02/2005
This vendor-written tech primer has been edited by Network World to eliminate product promotion, but readers should note it will likely favor the submitter's approach.

Applications that process real-time data streams are pushing the limits of traditional data processing technologies. These applications are characterized by the need for sub-second response times, whether they involve automating trades, monitoring networks for intrusions, or tracking credit card transactions for fraud. Applications that depend on the traditional store-and-query model cannot handle the volume and velocity of streaming data, whose value might exist only in the moment.

A stream-processing engine (SPE) is data management software that enables the execution of queries and computations (and ultimately, actions) on streaming data in real time. Previously, queries and computations could be executed only on stored data, using standard database management systems. An SPE accepts SQL-like, stream-oriented, continuous queries and executes them over live event streams, outputting results in real time.

An SPE achieves real-time operation by integrating several mechanisms. First, it supports inbound processing, in which incoming event streams immediately start to flow through the continuous queries as they enter the system. The queries transform the events as they move, continuously producing results, all in main memory. Read or write operations to storage are optional and can be executed asynchronously in many cases.
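Inbound processing can be sketched as a chain of in-memory operators that each event passes through as it arrives. The sketch below is illustrative only and does not reproduce any particular SPE's query language; the event fields (`price`, `size`) and the function names are assumptions made for the example.

```python
from typing import Dict, Iterable, Iterator

Event = Dict[str, float]  # hypothetical event shape: {"price": ..., "size": ...}


def filter_large_trades(events: Iterable[Event], min_size: float) -> Iterator[Event]:
    """Continuous filter: pass through only events at or above min_size."""
    for event in events:
        if event["size"] >= min_size:
            yield event


def add_notional(events: Iterable[Event]) -> Iterator[Event]:
    """Continuous transform: annotate each event with price * size."""
    for event in events:
        yield {**event, "notional": event["price"] * event["size"]}


def run_inbound(source: Iterable[Event], min_size: float = 100.0) -> Iterator[Event]:
    # Events flow through the operators one at a time, entirely in memory;
    # nothing is written to storage before a result is produced.
    return add_notional(filter_large_trades(source, min_size))


ticks = [
    {"price": 10.0, "size": 50.0},
    {"price": 11.0, "size": 200.0},
    {"price": 12.0, "size": 150.0},
]
results = list(run_inbound(ticks))
```

Because the operators are generators, each result is produced as soon as its input event arrives, rather than after a batch has been stored and indexed.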

Inbound processing overcomes a limitation of the outbound processing model that conventional database management systems employ, in which data must be inserted into the database and indexed before any processing can take place. By removing storage from the critical path of processing, an SPE achieves significant performance gains compared with traditional processing approaches.

Second, an SPE adopts a single-process model, in which all time-critical operations (including event processing, storage and execution of custom application logic) are run as part of one multi-threaded process. This integrated approach eliminates high-overhead process switches present in solutions that use multiple software systems to provide the same capabilities.
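As a rough illustration of the single-process model, the sketch below runs event intake and event processing as two threads of one process that share an in-memory queue, so no inter-process communication is involved. The event fields (`src`, `bytes`), the 1000-byte threshold and the function names are assumptions for the example, not details from the article.

```python
import queue
import threading
from typing import List

events: queue.Queue = queue.Queue()  # shared, in-process hand-off between threads
STOP = object()                      # sentinel that tells the worker to exit
alerts: List[str] = []


def process_events() -> None:
    """Worker thread: apply application logic to each event as it arrives."""
    while True:
        item = events.get()
        if item is STOP:
            break
        if item["bytes"] > 1000:     # hypothetical rule: flag large transfers
            alerts.append(item["src"])


worker = threading.Thread(target=process_events)
worker.start()

# The main thread plays the role of the event intake.
for ev in [{"src": "10.0.0.1", "bytes": 200}, {"src": "10.0.0.2", "bytes": 5000}]:
    events.put(ev)
events.put(STOP)
worker.join()
```

Both threads share the same address space, so handing an event from intake to processing is a queue operation rather than a switch between separate software systems.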

Third, an SPE provides a flexible, in-process storage model and standards-based access to external databases. In-memory hash tables are used for very fast insert and look-up operations. Embedded databases are used to ensure persistence of data and can be accessed and manipulated using SQL-style declarative queries. External, remote-process databases are accessible through standard Open Database Connectivity (ODBC) calls and are convenient to use when supporting legacy databases or sharing a database with external applications.
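The first two storage tiers can be approximated with standard Python facilities: a `dict` standing in for the in-memory hash table and `sqlite3` standing in for an embedded database. This is a sketch under those assumptions, not the storage layer of any specific SPE; the table name, columns and `on_trade` handler are hypothetical, and the ODBC tier is omitted because it needs an external driver.

```python
import sqlite3
from typing import Dict

# Tier 1: in-memory hash table for fast insert and look-up.
last_price: Dict[str, float] = {}

# Tier 2: embedded, in-process database queried with SQL.
# ":memory:" keeps the example self-contained; a file path would persist data.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE trades (symbol TEXT, price REAL)")


def on_trade(symbol: str, price: float) -> None:
    """Record a trade in both the hash table and the embedded database."""
    last_price[symbol] = price
    db.execute("INSERT INTO trades VALUES (?, ?)", (symbol, price))


for symbol, price in [("IBM", 75.0), ("IBM", 76.0), ("MSFT", 25.0)]:
    on_trade(symbol, price)
db.commit()

# Declarative query over the stored history.
avg_ibm = db.execute(
    "SELECT AVG(price) FROM trades WHERE symbol = ?", ("IBM",)
).fetchone()[0]
```

The hash table answers "what is the latest value" look-ups without touching the database, while the embedded store keeps the full history available to SQL queries.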

An SPE has built-in operators for filtering, aggregating, correlating and merging that manipulate windows of events. Standard SQL is defined over finite-sized tables, so an execution engine knows when it has finished all its operations. In contrast, streams potentially never end, and an SPE must be instructed when to finish processing and output an answer.
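One common way to tell an engine when to output an answer is a count-based tumbling window: collect a fixed number of events, emit an aggregate, then start over. The sketch below assumes that window type for illustration; the article does not specify which window definitions an SPE supports, and `tumbling_average` is a hypothetical name.

```python
from typing import Iterable, Iterator, List


def tumbling_average(values: Iterable[float], window_size: int) -> Iterator[float]:
    """Emit the average of each consecutive, non-overlapping window of values."""
    window: List[float] = []
    for value in values:
        window.append(value)
        if len(window) == window_size:
            # The window boundary is the signal to finish and output a result.
            yield sum(window) / window_size
            window.clear()
    # A trailing partial window is never emitted; on an endless stream
    # there is no end of input at which to flush it.


averages = list(tumbling_average([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0], 3))
# averages == [2.0, 5.0]; the final 7.0 sits in an incomplete window
```

The window size plays the role that the end of a table plays in standard SQL: it bounds the input so that an aggregate has a well-defined moment of completion.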
