Get started with IBM Streams
Learn to use the components and features of IBM Streams, in particular Streams Studio. You’ll build and enhance a simple application based on a connected-car scenario in which you track vehicle locations, speeds, and other variables.
This course is for software developers who have a working knowledge of:
Install prerequisite software and lab files
To complete the labs in this course, you need to install one of the following versions of the IBM Streams Quick Start Edition:
Install the Quick Start Edition
You have several installation options, but the Quick Start Edition virtual machine is recommended. The following software is already installed on the Quick Start VMware image:
Install the lab
After you start the Quick Start VM image, you need to install the lab projects, data files, and toolkits.
Develop an application for a simple scenario: read vehicle location, speed, and other sensor data from a file, look for observations of a specific few vehicles, and write the selected observations to another file.
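The scenario above (read observations from a file, keep only a few vehicles, write the result to another file) can be sketched outside Streams in a few lines of Python. This is only an illustration of the data flow, not how the lab implements it: the column layout in FIELDS and the sample rows are assumptions made for this sketch, and the lab's actual data file may differ. The vehicle IDs C101 and C133 are the ones tracked later in the course.

```python
import csv
import io

# Hypothetical column layout; the lab's actual data file may differ.
FIELDS = ["id", "time", "latitude", "longitude", "speed", "heading"]


def filter_vehicles(source, sink, wanted_ids):
    """Copy rows whose vehicle ID is in wanted_ids from source to sink.

    Returns the number of rows written.
    """
    reader = csv.DictReader(source, fieldnames=FIELDS)
    writer = csv.DictWriter(sink, fieldnames=FIELDS, lineterminator="\n")
    written = 0
    for row in reader:
        if row["id"] in wanted_ids:
            writer.writerow(row)
            written += 1
    return written


# Made-up sample data standing in for the input file.
sample = io.StringIO(
    "C101,1001,37.78,-122.41,42.5,90\n"
    "C999,1002,37.79,-122.40,30.0,180\n"
    "C133,1003,37.80,-122.39,55.1,270\n"
)
result = io.StringIO()
count = filter_vehicles(sample, result, {"C101", "C133"})
```

In the lab, the same three stages are separate operators connected by streams, so each stage can be configured and monitored on its own.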
Create a project
You’re ready to start building your application. First, you’ll create a Streams project, which is a collection of files in a directory tree in the Eclipse workspace, and then an application in that project.
Review the Project Explorer
The Project Explorer shows both an object-based and a file-based view of all the projects in the workspace.
In Project Explorer, note that you can expand and collapse MyProject by clicking the twisty on the left.
Define a stream type
Rather than separately define the schema (stream type) in the declaration of each stream, create a type first so that each stream can simply refer to that type.
Create an application graph
You are now ready to construct the application graph and will need the following data for this section:
Add streams to your application graph
Output ports are shown as little yellow boxes on the right side of an operator. Input ports are on the left.
Specify stream properties
The streams are what hold the graph together, so give meaning to them first. Tell the operators how to do their jobs later.
Specify operator properties
With the streams fully defined, it is time to configure the operators.
1. In the graphical editor, select FileSink_1.
2. In the Properties view, click the Param tab.
Run your application
You are now ready to run this program or, in Streams Studio parlance, launch the build.
In the Project Explorer, right-click MyMainComposite (you might need to expand MyProject and my.name.space first), and then select Launch > Launch Active Build Config To Running Instance.
In this lab, you will further develop the vehicle data filtering application and get a more detailed understanding of the data flow and the facilities in Studio for monitoring and examining the running application.
Add operators to enhance monitoring
Two new operators are needed to make your application easier to monitor and debug. The Throttle operator copies tuples from input to output at a specified rate rather than as fast as possible.
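To make the idea of throttling concrete, here is a minimal Python sketch of a generator that passes items through unchanged while spacing them out in time. It is not the Streams Throttle operator: the function name, the rate parameter (items per second), and the injectable sleep and clock arguments are assumptions made for this illustration.

```python
import time


def throttle(tuples, rate, sleep=time.sleep, clock=time.monotonic):
    """Yield each item unchanged, emitting at most `rate` items per second.

    `sleep` and `clock` are injectable so the pacing can be checked
    without actually waiting.
    """
    period = 1.0 / rate
    last = None  # time at which the previous item was emitted
    for item in tuples:
        if last is not None:
            remaining = period - (clock() - last)
            if remaining > 0:
                sleep(remaining)
        last = clock()
        yield item


# Three items at 100 items per second: roughly 20 ms in total.
paced = list(throttle(["a", "b", "c"], rate=100.0))
```

Slowing the flow this way is what makes it practical to watch tuples move through a running job instead of seeing the whole file processed at once.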
Define the new stream and operator details
Now, you need to define the schema for the stream from the DirectoryScan and tell that operator where to look for files.
Monitor the application by using the instance graph
The Instance Graph in Streams Studio provides many ways to monitor what your application does and how data flows through a running job. This part of the lab explores those capabilities.
View stream data
While developing an application, you often want to inspect not just the overall tuple flow, but the actual data. Previously, you looked at the results file, but you can also see the data in the Instance Graph.
You should now understand how to further develop the vehicle data filtering application, how data flows through it, and how to use the facilities in Streams Studio to monitor and examine the running application.
In this lab, you will enhance the app you’ve built by adding an operator to compute an average speed over every five observations, separately for each vehicle tracked. After that, you will use the Streams Console to monitor results.
Add a window-based operator
You will compute average speeds over a window separately for vehicles C101 and C133. Use a tumbling window of a fixed number of tuples: each time the window collects the required number of tuples, the operator computes the result and submits an output tuple, discards the window contents, and is again ready to collect tuples in a now empty window.
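The tumbling-window behavior described above can be sketched in plain Python. This is an illustration of the windowing logic only, not the Streams operator itself: the function name and the (vehicle ID, speed) tuple shape are assumptions made for this sketch. Each vehicle gets its own window, and a window is emptied as soon as it has produced a result.

```python
from collections import defaultdict


def tumbling_average(observations, window_size=5):
    """Yield (vehicle_id, average_speed) each time a vehicle's window fills.

    After a result is produced, that vehicle's window is cleared and
    starts collecting again from empty.
    """
    windows = defaultdict(list)  # one window per vehicle ID
    for vehicle_id, speed in observations:
        window = windows[vehicle_id]
        window.append(speed)
        if len(window) == window_size:
            yield vehicle_id, sum(window) / window_size
            window.clear()


# Made-up speeds: C101 fills its window, C133 has only four observations.
observations = [("C101", s) for s in (10, 20, 30, 40, 50)] + [("C133", 60)] * 4
averages = list(tumbling_average(observations))
```

Here only C101 produces an average, because the C133 window is still one observation short. A tumbling window never emits partial results, which is why output appears in bursts rather than once per input tuple.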
Explore the Application Dashboard
Let’s look more closely at your running application. While the Management Dashboard is designed for administrators, the Application Dashboard is more useful for developers.
Maximize the Streams Graph card. You can also enlarge it by using the resize handle at the bottom right of the card. Enlarge it just enough to show the entire graph. Move it to another position and remove other cards as you see fit.
You now know how to enhance the application that you’ve built by adding an operator to compute an average speed over every five observations, separately for each vehicle tracked, and how to monitor the results in the Streams Console.
In this lab, you will prepare for bringing in live streaming data by adding a test for unexpected data. You will also split the application into two modules that connect automatically at runtime by using exported streams.
Add a test for unexpected data
A best practice is to validate the data that you receive from a feed. Data formats might not be well defined, ill-formed data can occur, and transmission noise can also appear.
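As a rough illustration of this kind of validation, the following Python sketch parses one comma-separated observation and rejects it if it is ill-formed. The six-column layout and the specific checks (coordinate ranges, non-negative finite speed) are assumptions chosen for this example; the lab defines its own test inside the Streams application.

```python
import math


def parse_observation(line):
    """Return a dict for a well-formed line, or None if any check fails.

    Assumed layout: id,time,latitude,longitude,speed,heading
    """
    parts = line.strip().split(",")
    if len(parts) != 6:
        return None
    vehicle_id, timestamp, lat_text, lon_text, speed_text, heading_text = parts
    try:
        lat = float(lat_text)
        lon = float(lon_text)
        speed = float(speed_text)
        heading = float(heading_text)
    except ValueError:
        return None  # a numeric field did not parse
    if not vehicle_id:
        return None
    if not (-90 <= lat <= 90) or not (-180 <= lon <= 180):
        return None
    if not math.isfinite(speed) or speed < 0:
        return None
    return {"id": vehicle_id, "time": timestamp, "latitude": lat,
            "longitude": lon, "speed": speed, "heading": heading}


# Made-up input: one good line and three kinds of bad data.
lines = [
    "C101,1001,37.78,-122.41,42.5,90",
    "C133,1002,not-a-number,-122.39,55.1,270",
    "C101,1003,95.0,-122.41,10.0,90",
    "garbled",
]
valid, rejected = [], []
for line in lines:
    observation = parse_observation(line)
    if observation is None:
        rejected.append(line)
    else:
        valid.append(observation)
```

Keeping the rejected lines, rather than silently dropping them, makes it possible to inspect what the feed is actually sending.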
Split off the ingest module
Now, it gets interesting. In a Streams application, data flows from operator to operator on streams, which are fast and flexible transport links.
Add a live feed
Rather than building a live-data ingest application from scratch, you will import a Streams project that has already been prepared.
Show location data on the map
The NextBus toolkit comes with another application that lets you view data in a way that is more natural for moving geographic locations, namely on a map.
Optional: Investigate back-pressure
This section builds on your exploration of the Streams Console in Lab 3. It assumes that you have kept the job from Lab 3 running for at least 40 minutes. To proceed, go back to the Application Dashboard in the Streams Console.
You should now understand how to add a test for unexpected data and how to split the application into two modules that connect automatically at runtime by using exported streams.
The labs in this course have barely scratched the surface of what Streams is capable of. Apart from a very small number of SPL expressions, no coding was involved in building a progressively more interesting application.