Event stream processing, or ESP, is a set of technologies designed to assist the construction of event-driven information systems. ESP technologies include event visualization, event databases, event-driven middleware, and event processing languages, or complex event processing (CEP). In practice, the terms ESP and CEP are often used interchangeably. ESP processes streams of event data with the goal of identifying meaningful patterns within those streams. It employs techniques such as detecting relationships between multiple events, event correlation, and event hierarchies, and it takes into account aspects such as causality, membership and timing.

ESP enables many different applications such as algorithmic trading in financial services, RFID event processing applications, fraud detection, process monitoring, and location-based services in telecommunications.
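
As an illustration of event correlation with a timing constraint, the following minimal sketch flags a withdrawal that follows several failed logins on the same account within a time window. The `Event` fields, the event kinds, and the thresholds are assumptions made for this example and do not come from any particular ESP product.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Event:
    timestamp: float  # seconds since an arbitrary epoch (assumed)
    account: str      # correlation key (assumed)
    kind: str         # e.g. "login_failed" or "withdrawal" (assumed)


def detect_suspicious(events: Iterable[Event],
                      max_failures: int = 3,
                      window: float = 60.0) -> Iterator[Event]:
    """Yield each withdrawal preceded by at least `max_failures` failed
    logins on the same account within the last `window` seconds."""
    recent_failures = defaultdict(deque)  # account -> failure timestamps
    for ev in events:  # events are assumed to arrive in timestamp order
        failures = recent_failures[ev.account]
        # Evict failures that have fallen out of the time window.
        while failures and ev.timestamp - failures[0] > window:
            failures.popleft()
        if ev.kind == "login_failed":
            failures.append(ev.timestamp)
        elif ev.kind == "withdrawal" and len(failures) >= max_failures:
            yield ev


stream = [
    Event(0.0, "alice", "login_failed"),
    Event(5.0, "alice", "login_failed"),
    Event(9.0, "bob", "withdrawal"),
    Event(12.0, "alice", "login_failed"),
    Event(20.0, "alice", "withdrawal"),
    Event(200.0, "alice", "withdrawal"),
]
alerts = list(detect_suspicious(stream))
```

Only the withdrawal at timestamp 20.0 is reported: it is correlated with three earlier failures on the same account, whereas the later withdrawal falls outside the window.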

1 Architecture concepts

1.1 Service Components

Service Components are software programs or a set of programs and data that implement functions that are relevant in a business context. For instance, while updating a table in a database has only a technical meaning, the process of updating a customer address - whatever that involves technically - has a business meaning. Hence "update customer address" or "check inventory in warehouse" are Service Components since they have business semantics as opposed to technical semantics. Service Component Architecture (SCA) refers to the design and composition of business applications from modular Service Components.
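
The distinction between business and technical semantics can be sketched as follows. The class name `CustomerService`, the table layout, and the address-history rule are hypothetical and serve only to show a business-level operation hiding several technical updates.

```python
import sqlite3


class CustomerService:
    """Hypothetical Service Component: its public method has business meaning."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def update_customer_address(self, customer_id: int, new_address: str) -> None:
        # Business operation: callers do not see which tables are touched.
        with self._conn:  # both technical updates run in one transaction
            self._conn.execute(
                "UPDATE customer SET address = ? WHERE id = ?",
                (new_address, customer_id),
            )
            self._conn.execute(
                "INSERT INTO address_history (customer_id, address) VALUES (?, ?)",
                (customer_id, new_address),
            )


# Assumed schema, created in memory so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, address TEXT)")
conn.execute("CREATE TABLE address_history (customer_id INTEGER, address TEXT)")
conn.execute("INSERT INTO customer (id, address) VALUES (1, '1 Old Road')")
conn.commit()

service = CustomerService(conn)
service.update_customer_address(1, "2 New Street")
current = conn.execute("SELECT address FROM customer WHERE id = 1").fetchone()[0]
history = conn.execute("SELECT COUNT(*) FROM address_history").fetchone()[0]
```

The caller works only with "update customer address"; the two SQL statements are an implementation detail that could change without affecting clients.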

1.2 Service Oriented Architecture (SOA)

SOA refers to the design of applications via components (often referred to as "services") that expose interfaces that can be called by other client applications. Multiple components can be chained together via request/reply calls to create a larger "composite application", which could potentially be considered a logical module within a larger business process. Unfortunately, the primary focus of SOA has been the concept of accessing functions in remote components to create a distributed application based on request/reply semantics. SOA infrastructure typically does not mandate any particular component model that guides developers to create software modules at a coarser level of granularity, one that matches higher-level business functions rather than lower-level technical functions. As such, typical SOA applications focus more on distributed computing than on the creation of reusable, modular Service Components, resulting in software systems that are difficult to develop, deploy and modify.

1.3 Event Driven Architecture (EDA)

EDA refers to the design of applications as a collection of components that exchange events to perform business functions. The major difference between SOA and EDA is that in an SOA, all intermediate Service Components suspend their operation until the relevant request/reply call returns, while in an EDA all Service Components continue to operate, since their focus is on processing incoming messages and publishing outgoing messages. EDA is thus typically more efficient than an SOA approach: events are processed concurrently, in pipelined fashion, by multiple software modules chained together, and no module waits for a blocked call to return. Unfortunately, current EDA approaches suffer from the same problem as SOA, since the focus is more on the event exchanges between distributed software components than on the modularity and granularity of the components participating in the EDA process.
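
The pipelined style described above can be sketched with threads and in-process queues. The stage functions, the queue names, and the `STOP` sentinel are assumptions of this example; a real EDA deployment would typically use a message broker rather than `queue.Queue`.

```python
import queue
import threading

STOP = object()  # sentinel marking the end of the stream (assumed convention)


def stage(handler, inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Consume events from inbox and publish handler results to outbox."""
    while True:
        event = inbox.get()
        if event is STOP:
            outbox.put(STOP)  # propagate shutdown downstream
            return
        outbox.put(handler(event))


raw, parsed, enriched = queue.Queue(), queue.Queue(), queue.Queue()

workers = [
    # Stage 1 parses text into integers.
    threading.Thread(target=stage, args=(lambda e: int(e), raw, parsed)),
    # Stage 2 enriches each number with a derived flag.
    threading.Thread(
        target=stage,
        args=(lambda n: {"value": n, "big": n > 10}, parsed, enriched),
    ),
]
for w in workers:
    w.start()

# The producer publishes and moves on; it never waits for a reply.
for text in ["3", "42", "7"]:
    raw.put(text)
raw.put(STOP)

for w in workers:
    w.join()

results = []
while True:
    item = enriched.get()
    if item is STOP:
        break
    results.append(item)
```

Each stage can work on a new event while the next stage is still processing the previous one, which is the concurrency that a chain of blocking request/reply calls does not provide.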

1.4 Service Component Architecture (SCA)

SCA is an architectural approach in which application developers decompose problems into smaller modules, each of which executes a well-defined business function and is implemented as an encapsulated component. The interactions between Service Components may be either request/reply (SOA) or via events (EDA). Service Component Architecture thus moves the focus of application design from the concept of distributed computing towards the intelligent design of modular Service Components. A single SCA application may involve multiple request/reply calls as well as multiple event-exchanges. As such, SCA logically unifies SOA and EDA into a single framework, since the distributed nature of the interaction between Service Components in an application is now overshadowed by the notion of software modularity. Finding the right level of granularity at which to implement a Service Component now becomes more important than the request/reply or event-driven exchanges of information between the components themselves.
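
A minimal sketch of a single Service Component that is reachable both by request/reply and by events is shown below. The `EventBus` class, the topic names, and the low-stock threshold are hypothetical and are not part of any SCA specification.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class EventBus:
    """Tiny in-process publish/subscribe bus (illustrative only)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        for handler in list(self._subscribers[topic]):
            handler(event)


class InventoryComponent:
    """Hypothetical Service Component for the 'check inventory' function."""

    def __init__(self, bus: EventBus, stock: Dict[str, int]) -> None:
        self._bus = bus
        self._stock = dict(stock)
        # Event-driven entry point (EDA style).
        bus.subscribe("order_placed", self._on_order_placed)

    def check_inventory(self, sku: str) -> int:
        # Request/reply entry point (SOA style): the caller gets an answer back.
        return self._stock.get(sku, 0)

    def _on_order_placed(self, event: dict) -> None:
        sku, quantity = event["sku"], event["quantity"]
        self._stock[sku] = self._stock.get(sku, 0) - quantity
        if self._stock[sku] < 5:  # assumed low-stock threshold
            self._bus.publish("stock_low", {"sku": sku, "remaining": self._stock[sku]})


bus = EventBus()
low_stock_alerts = []
bus.subscribe("stock_low", low_stock_alerts.append)
inventory = InventoryComponent(bus, {"widget": 8})

before = inventory.check_inventory("widget")
bus.publish("order_placed", {"sku": "widget", "quantity": 4})
after = inventory.check_inventory("widget")
```

The component boundary is drawn around the business function; whether a client reaches it by a call or by an event is a secondary choice.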

2 Implementations

There are several ESP frameworks, including Apache Flink.

Flink is a storage-agnostic stream processing framework, and is thus used in conjunction with data storage or brokering systems. A typical architectural pattern is to use Flink in conjunction with Apache Kafka to:

  1. Ingest data into other systems such as HDFS, databases, or search indices and create continuous ETL pipelines.
  2. Perform analytics directly on the moving data to create alerts, build dashboards, or power operational applications, obviating the need for ingestion and ETL.
  3. Perform machine learning on streams by continuously building models of the events as they arrive and using the model to serve online recommendations.
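
To make "analytics directly on the moving data" concrete, the following sketch computes per-key counts over fixed-size (tumbling) time windows. It is plain Python, not the Flink or Kafka API; the function name, the record layout, and the window size are assumptions of this example.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple


def tumbling_window_counts(
    records: Iterable[Tuple[float, str]], size: float
) -> Iterator[Tuple[float, Counter]]:
    """Yield (window_start, per-key counts) for consecutive fixed-size windows.

    Records are (timestamp, key) pairs assumed to arrive in timestamp order.
    Windows that contain no records are not emitted.
    """
    current_start = None
    counts: Counter = Counter()
    for timestamp, key in records:
        start = (timestamp // size) * size
        if current_start is None:
            current_start = start
        elif start != current_start:
            yield current_start, counts  # window closed: emit its result
            current_start, counts = start, Counter()
        counts[key] += 1
    if current_start is not None:
        yield current_start, counts  # flush the last, still-open window


page_views = [(1.0, "home"), (4.0, "cart"), (9.5, "home"), (12.0, "home"), (31.0, "cart")]
windows = list(tumbling_window_counts(page_views, size=10.0))
```

Results are emitted as soon as a window closes, so a dashboard or alerting rule can consume them without first loading the raw events into a separate store.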

Since Flink is a full-fledged system for batch processing as well, it can also be used for applications on top of static data. The following picture provides an overview of where Flink applications might fit in a broader stack: