The core of Odysseus provides all the mechanisms needed to perform the actual data processing and is complemented by further components, such as user management.
Odysseus can be seen as the middle layer in a three-layer architecture: it operates between the data sources that generate events and the applications that use the results. The application then only has to, for example, visualize the processed data in a dashboard.
Through a query interface (Query Interface), applications can define the processing in the form of queries in different languages, such as StreamSQL (an extension of SQL) or PQL (procedural). Queries can be submitted either from within a Java application or via a web interface.
Odysseus translates each query (Translate), optimizes it (Rewrite), and transforms it into an executable graph (Transform). This executable graph, called the query plan, consists of several reusable, self-contained operators. Some operators connect to the sensors or data sources (source and router), and others forward the results to the application (sink). In between, the data passes through multiple operators (pipe) on its way from the sources to the sinks, and each operator performs a specific function (e.g., filtering or aggregation). The data sources actively send their data to Odysseus, so that processing is done reactively, only when necessary. To handle unpredictable amounts of data, the plan manager and the scheduler monitor and execute the processing so that overload and starvation are avoided.
Odysseus is designed so that both the individual steps (translate, rewrite, transform, and scheduling) and the operators are extensible, interchangeable, and customizable. Even the data format, which is relational tuples in the standard configuration, is exchangeable, which allows, for example, the processing of RDF, XML, or JSON objects. To support this, Odysseus is implemented as an OSGi-based architecture and consists of more than 200 components, where different sets of components are logically bundled into features. These features can be easily updated or added via the integrated update system.
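The idea of a query plan built from sources, pipes, and sinks can be pictured with a small sketch. The following Python code is only an illustration of push-based operators under simplifying assumptions; the class names (Operator, Select, RunningSum, Sink) are hypothetical and are not part of the Odysseus API, and scheduling and plan management are left out entirely.

```python
from typing import Any, Callable, List


class Operator:
    """Hypothetical base operator: passes every element on to its subscribers."""

    def __init__(self) -> None:
        self.subscribers: List["Operator"] = []

    def subscribe(self, downstream: "Operator") -> "Operator":
        self.subscribers.append(downstream)
        return downstream  # returned so that subscriptions can be chained

    def process(self, element: Any) -> None:
        self.emit(element)  # default behaviour: pass the element through

    def emit(self, element: Any) -> None:
        for operator in self.subscribers:
            operator.process(element)


class Select(Operator):
    """Pipe that forwards only elements matching a predicate (filtering)."""

    def __init__(self, predicate: Callable[[Any], bool]) -> None:
        super().__init__()
        self.predicate = predicate

    def process(self, element: Any) -> None:
        if self.predicate(element):
            self.emit(element)


class RunningSum(Operator):
    """Pipe that emits the sum of all elements seen so far (aggregation)."""

    def __init__(self) -> None:
        super().__init__()
        self.total = 0

    def process(self, element: Any) -> None:
        self.total += element
        self.emit(self.total)


class Sink(Operator):
    """Sink that collects results for the application."""

    def __init__(self) -> None:
        super().__init__()
        self.results: List[Any] = []

    def process(self, element: Any) -> None:
        self.results.append(element)


# Build the plan: source -> filter -> aggregation -> sink.
source = Operator()
sink = Sink()
source.subscribe(Select(lambda v: v >= 0)).subscribe(RunningSum()).subscribe(sink)

# The data source pushes elements; processing happens reactively per element.
for reading in [3, -1, 4, 5]:
    source.process(reading)

print(sink.results)  # [3, 7, 12]
```

Because each operator only knows its subscribers, an individual operator can be replaced or reused in another plan without touching the rest, which is the property the architecture description relies on.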
Example of the use of Odysseus
The classic use of Odysseus can be compared with that of a database system: Odysseus is not a standalone application, but rather a platform for developing diverse applications.
As in a database system, there are different levels of abstraction and reusable components, which not only reduce errors during development but also significantly reduce the development effort. This, of course, lowers costs as well. The use of Odysseus in development is illustrated by the following example.
1. Installing Odysseus
Odysseus can be downloaded from the download area for different operating systems. The Odysseus executables can be unpacked to any location and run without an installation routine.
2. Access and control of Odysseus
The developer can access Odysseus via a web service interface or by using Odysseus Studio. As in a database system, Odysseus is controlled through a query language that is similar to SQL, an established standard in database systems. This level of abstraction allows the developer to express the processing steps in a much simpler way.
Instead of, for example, laboriously hand-coding in Java how elements or events are accepted and then transformed, filtered, and summed up, the developer can express this in a short query. The developer tells Odysseus what should be transformed, filtered, or summed up, but does not have to worry about how this is done correctly and efficiently. The steps that frequently occur in the development of such applications are as follows.
3. Create connections
First, you normally have to connect to the data sources and data sinks. Instead of implementing complex protocols, you can use Odysseus' flexible adapter interface. For example, to receive data via a TCP connection and parse it as CSV, the developer can define the following query:
CREATE STREAM exampleSource (id INTEGER, name STRING, value FLOAT) WRAPPER 'GenericPush' PROTOCOL 'CSV' TRANSPORT 'TCPClient' DATAHANDLER 'Tuple' OPTIONS ( 'port' '12345', 'host' 'odysseus.informatik.uni-oldenburg.de')
In this case, a TCP connection to host odysseus.informatik.uni-oldenburg.de on port 12345 is opened. The data is pushed into the system, so Odysseus responds reactively to incoming data, and each incoming CSV record is translated into a tuple of the form “(integer, string, float)”. The data stream is then registered under the name exampleSource, so that it is available for other uses. Using CREATE SINK instead of CREATE STREAM defines an outgoing connection for a sink.
Odysseus already provides a number of protocols and transport mechanisms, which can easily be exchanged for one another if necessary. New protocols or transport mechanisms can also be added through interfaces and the component-based architecture, even at run time.
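To illustrate what the CSV protocol step does conceptually, the following Python sketch converts one CSV record into a typed tuple according to the schema (id INTEGER, name STRING, value FLOAT) from the statement above. It is a minimal sketch and not Odysseus code: the function name parse_csv_line and the sample record are assumptions made for illustration, and the TCP transport is deliberately left out.

```python
import csv
from typing import Tuple

# Schema taken from the CREATE STREAM statement:
# (id INTEGER, name STRING, value FLOAT)
SCHEMA = (("id", int), ("name", str), ("value", float))


def parse_csv_line(line: str) -> Tuple[int, str, float]:
    """Translate one CSV record into a typed tuple (hypothetical helper)."""
    fields = next(csv.reader([line]))
    if len(fields) != len(SCHEMA):
        raise ValueError(
            f"expected {len(SCHEMA)} fields, got {len(fields)}: {line!r}"
        )
    # Apply the type of each schema attribute to the matching field.
    return tuple(cast(field.strip()) for (_, cast), field in zip(SCHEMA, fields))


# Sample record invented for this sketch.
print(parse_csv_line("101,sensorA,3.5"))  # (101, 'sensorA', 3.5)
```

Keeping the schema in one place and the parsing in another mirrors the separation between data handler and protocol: swapping CSV for another format would only change the parsing function, not the schema.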
4. Define query and processing
Once Odysseus has established connections to the data source and the data sink, the developer can define the processing steps. If, for example, the data stream exampleSource should be processed so that the attribute “value” is multiplied by 10 and only events with an “id” greater than 100 are considered, then the whole processing can be expressed by the following short query:
SELECT value * 10 FROM exampleSource WHERE id > 100
Odysseus generates a so-called query plan from this query, which encapsulates the operations of the query in operators. This ensures a high degree of reusability and also allows optimizations. Through multi-layer optimization strategies, Odysseus optimizes the query (without loss of accuracy or information) in order to perform the processing as efficiently as possible.
Odysseus then executes the query plan by establishing a connection to the data source and receiving the data. The data then runs through the optimized query plan and is transformed and filtered as specified by the query above. The results are made available to the application again, so that the developer can take them and, e.g., visualize the data or invoke a certain command.
Of course, Odysseus can handle more than one such request, and it also applies global optimization strategies, for example by exploiting similarities between multiple queries.
When defining queries, the developer can use Odysseus Studio. Odysseus Studio offers monitoring of the current state of Odysseus and an integrated editor for formulating queries.
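For comparison, the filter-and-transform semantics of the example query (SELECT value * 10 FROM exampleSource WHERE id > 100) can be written by hand. The Python sketch below only illustrates what the query computes on tuples of the form (id, name, value); the function name run_query and the sample events are assumptions for this illustration, and it says nothing about how Odysseus actually plans or optimizes the query.

```python
from typing import Iterable, Iterator, Tuple

Event = Tuple[int, str, float]  # (id, name, value), as in exampleSource


def run_query(events: Iterable[Event]) -> Iterator[float]:
    """Hand-written equivalent of the example query (hypothetical helper)."""
    for event_id, _name, value in events:
        if event_id > 100:      # WHERE id > 100
            yield value * 10    # SELECT value * 10


# Sample events invented for this sketch.
sample = [(42, "a", 1.5), (101, "b", 2.0), (250, "c", 0.5)]
results = list(run_query(sample))
print(results)  # [20.0, 5.0]
```

Even in this tiny case, the hand-written version mixes what is computed with how the stream is iterated, which is exactly the concern the declarative query leaves to the system.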