intro

Streamtools is a graphical toolkit for dealing with streams of data. Streamtools makes it easy to explore, analyse, modify and learn from streams of data.

You'll primarily interact with streamtools in the browser. However, since all functionality is exposed over HTTP, you can use tools like curl to send commands and even treat any part of streamtools as an API endpoint. More about that later.

install

binary

Download the appropriate version of the latest streamtools for your operating system from our releases on github.

Extract the archive (it'll be either a .tar.gz or a .zip).

Navigate to the extracted folder and run the st executable.

You should see streamtools start up, telling you it's running on port 7070.

Now, open a browser window and point it at localhost:7070. You should see a (nearly) blank page. At the bottom you should see a status bar that says client: connected to Streamtools followed by a version number. Congratulations! You're in.

source

Make sure you have go, git, hg, and bzr installed. You can download go for Mac OS X, Linux, FreeBSD and Windows from the golang.org website. Git, hg and bzr are simple enough to install using homebrew, apt or your OS package manager of choice.

Once you have these dependencies, compile streamtools with these commands:

mkdir -p ~/go/src/github.com/nytlabs
cd ~/go/src/github.com/nytlabs
git clone git@github.com:nytlabs/streamtools.git
cd streamtools
make

To start the streamtools server:

$ ./build/st
Apr 30 19:44:36 [ SERVER ][ INFO ] "Starting Streamtools 0.2.5 on port 7070"

You should see a message similar to the one above letting you know streamtools is running at port 7070.

getting started

Streamtools is a binary that can run on your local machine or a remote server. We usually run it using upstart on an Ubuntu EC2 server in Amazon's cloud. To begin with, though, we'll assume that you're running streamtools locally, on a machine you can touch. We're also going to assume you're running OS X or Linux. We do provide Windows binaries, but we don't know much about working on a Windows machine, so Windows users will need to translate these instructions themselves.

Before we go any further, you should make sure you've installed streamtools. Check out the directions on starting the server, either from a binary release or from source, if you haven't already done so.

As a "Hello World", try double-clicking anywhere on the page above the status bar, then type fromhttpstream and hit enter. This will bring up your first block. Double-click on the block and enter http://developer.usa.gov/1usagov in the Endpoint text-box. Hit the update button. Now double-click on the page and make a tolog block. Finally, connect the two blocks by clicking first on the fromhttpstream block's OUT route (a little black square on the bottom of the block) and then on the tolog block's IN route (the little black square on the top of the block). Click on the status bar and, after a moment, you should start to see JSON scroll through the log - these are live clicks on the US government short links! Click anywhere on the log to make it go away again.

how it works

Streamtools' basic paradigm is straightforward: data flows from blocks through connections to other blocks.

Together, these five concepts (blocks, rules, connections, routes and patterns) form the basic vocabulary we use to talk about streamtools, and about streaming data systems.

reference

blocks

Each block is briefly detailed below, along with the rules that define it. To make a block in streamtools, double click anywhere on the page and type the name of the block as it appears below. For programmatic access, see the API docs.

Blocks rely on some general concepts:

generator blocks

These blocks emit messages on their own.

flow blocks

These blocks are useful for shaping (transforming or manipulating) the stream in one way or another.

source blocks

These blocks hook into another system and collect messages to be emitted into streamtools.

sink blocks

These blocks send data to external systems.

state blocks

These blocks maintain a state, storing something about the stream of data.

random number blocks

These blocks emit random numbers when polled. So to generate a stream of random numbers, connect a generator block (like a ticker) to a random number block's POLL endpoint. Each of these blocks emits JSON of the form:

{
    "sample": 1234
}
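
As a rough illustration of consuming that output, the sketch below pulls the value out of each message and averages a couple of them. The message values and the extract_sample helper are made up for the example; only the "sample" key comes from the form shown above.

```python
import json


def extract_sample(raw_message):
    """Return the value stored under the "sample" key of one message."""
    return json.loads(raw_message)["sample"]


# Two messages shaped like the example above; the values are invented.
messages = ['{"sample": 1234}', '{"sample": 1236}']

samples = [extract_sample(message) for message in messages]
mean = sum(samples) / len(samples)
print(samples, mean)
```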

interface

Streamtools' GUI aims to be responsive and informative, meaning that you can both create and interrogate a live streaming system. At the same time, it aims to be as minimal as possible - the GUI has a very tight relationship with the underlying streamtools architecture, enabling users of streamtools to see and understand the execution of the system.

make a block

To make a block, double click anywhere on the background. Type the name of the block you'd like and press enter.

connect two blocks

To connect two blocks together, first click on an outbound route on the bottom of the block you want to connect from. Almost always this route will be labelled OUT when you mouse over it. Then click on an inbound route on the top of another block. There can be a few inbound routes; common ones are IN, RULE, and POLL. This will create a connection between the blocks.

set the rule of a block

To set a block's rules, double click it. This will open a window where you can enter rules. When you're done entering rules, hit the update button.

query a block

You can query a block's rules, or any other queryable route a block has, by clicking on the red squares on the right of the block. These will open a window that shows a JSON representation of that information. An example of a queryable route is COUNT for the count block. If you click on the little red square associated with the COUNT route, then you'll get a JSON representation of that block's current count.

delete a block

To delete a block you don't like anymore, click on it and press the delete (backspace) button on your keyboard.

move a block

To move a block around, simply drag it about the place.

see the last message that passed through a connection

To see the last message that passed through a connection, click and drag the connection's rate estimate. This creates a window containing the JSON representation of the last message to pass through that connection.

API

Streamtools provides a full RESTful HTTP API allowing the developer to programmatically control all aspects of streamtools. The API can be broken up into three parts: endpoints that deal with general aspects of streamtools, those that control blocks and those that control connections.

If you are running streamtools locally on the default port, all of the GET endpoints can be queried by visiting them in a browser:

http://localhost:7070/{endpoint}

For example, if you wanted to see the streamtools library, visit http://localhost:7070/library.
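
If you'd rather script this than use a browser, something like the following Python sketch should work. The helper names are ours, not part of streamtools, and get_json assumes the endpoint you ask for responds with a JSON body.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7070"  # the default local address used above


def endpoint_url(endpoint, base_url=BASE_URL):
    """Join the server address and an endpoint name such as "library"."""
    return base_url.rstrip("/") + "/" + endpoint.lstrip("/")


def get_json(endpoint, base_url=BASE_URL):
    """GET an endpoint and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(endpoint_url(endpoint, base_url)) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the URL needs no server; get_json("library") needs one running.
print(endpoint_url("library"))
```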

The POST endpoints are expecting you to send data. To use these you'll need to use the command line and a program called curl. For example, to create a new tofile block you need to send along the JSON definition of the block, like this:

curl http://localhost:7070/blocks -d'{"Type":"tofile","Rule":{"Filename":"test.json"}}'

This POSTs the JSON {"Type":"tofile","Rule":{"Filename":"test.json"}} to the /blocks endpoint.
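
If curl isn't to hand, the same request can be sketched in Python with the standard library. The function name is our own. Like the curl command above, the sketch sets no headers of its own, so it relies on the server accepting the body as sent.

```python
import json
import urllib.request


def build_create_block_request(block, base_url="http://localhost:7070"):
    """Prepare (but do not send) a POST of a block definition to /blocks."""
    body = json.dumps(block).encode("utf-8")
    # Supplying data makes urllib issue a POST rather than a GET.
    return urllib.request.Request(base_url + "/blocks", data=body)


request = build_create_block_request(
    {"Type": "tofile", "Rule": {"Filename": "test.json"}}
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# With a streamtools server running, send it with:
#     urllib.request.urlopen(request)
```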

streamtools

GET /library

The library endpoint returns a description of all the blocks available in the version of streamtools that is running.

GET /version

The version endpoint returns the current version of streamtools.

GET /export

Export returns a JSON representation of the current streamtools pattern.

POST /import

Import accepts a JSON representation of a pattern, creating it in the running streamtools instance. Any block ID collisions are resolved automatically, meaning you can repeatedly import the same pattern if it's useful.
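
One use for this pair of endpoints is copying a pattern from one streamtools instance to another. The sketch below treats the exported pattern as opaque bytes and hands it straight back to /import. The function names and the idea of two separate servers are assumptions made for the example.

```python
import urllib.request

BASE_URL = "http://localhost:7070"


def export_request(base_url=BASE_URL):
    """Prepare a GET of the current pattern."""
    return urllib.request.Request(base_url + "/export")


def import_request(pattern_bytes, base_url=BASE_URL):
    """Prepare a POST of a previously exported pattern."""
    return urllib.request.Request(base_url + "/import", data=pattern_bytes)


def copy_pattern(source_url, target_url):
    """Export from one server and import into another (both must be running)."""
    with urllib.request.urlopen(export_request(source_url)) as response:
        pattern = response.read()
    with urllib.request.urlopen(import_request(pattern, target_url)) as response:
        return response.status
```

Calling copy_pattern with the same address for both arguments would import a second copy of the pattern into the instance it came from.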

data

Every block that has an OUT route also has a websocket and a long-lived HTTP connection associated with it. These are super useful for getting data out of streamtools.

WEBSOCKET /ws/{id}

a websocket emitting every message sent on the block's OUT route.

GET /stream/{id}

a long-lived HTTP stream of every message sent on the block's OUT route.
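
Here is one way you might read the HTTP stream from Python. The sketch assumes that each message arrives as one JSON document per line, which this document does not promise, so check the output of your own instance before relying on it. The function names are ours.

```python
import json
import urllib.request


def iter_messages(lines):
    """Yield one decoded JSON message per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def stream_block(block_id, base_url="http://localhost:7070"):
    """Yield messages from GET /stream/{id} until the connection closes."""
    with urllib.request.urlopen(f"{base_url}/stream/{block_id}") as response:
        # The response object can be iterated line by line.
        yield from iter_messages(response)
```

With a server running, `for message in stream_block("1"): print(message)` would print messages as they arrive, where "1" stands in for the id of one of your blocks.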

blocks

A block's JSON representation uses the following schema:

{
  "Id":
  "Type":
  "Rule":{ ... }
  "Position":{
    "X":
    "Y":
  }
}

Only Type is required; everything else is generated automatically if you don't specify it. Id uniquely identifies the block within streamtools. It is normally just a number but can be any string. Type is the type of the block, selected from the streamtools library. Rule specifies the block's rule, which will be different for each block. Finally, Position specifies the x and y coordinates of the block from the top left corner of the screen.
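
A small helper can make the "only Type is required" rule concrete. In this sketch, make_block is a name we made up. It includes just the fields you pass and leaves the rest for streamtools to generate.

```python
import json


def make_block(block_type, rule=None, block_id=None, position=None):
    """Build a block definition, omitting every field that was not given."""
    block = {"Type": block_type}
    if block_id is not None:
        block["Id"] = block_id
    if rule is not None:
        block["Rule"] = rule
    if position is not None:
        x, y = position
        block["Position"] = {"X": x, "Y": y}
    return block


# The smallest valid definition, then one with every field filled in.
print(json.dumps(make_block("tofile")))
print(json.dumps(make_block("tofile", {"Filename": "test.json"}, "writer", (10, 20))))
```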

POST /blocks

To create a new block, simply POST its JSON representation as described above to the /blocks endpoint.

GET /blocks/{id}

Returns a JSON representation of the block specified by {id}.

DELETE /blocks/{id}

Deletes the block specified by {id}.

POST /blocks/{id}/{route}

Send data to a block. Each block has a set of default routes ("in","rule") and optional routes ("poll"), as well as custom routes that are defined by the block designer as they see fit. This will POST your JSON to the block specified by {id} via route {route}.

GET /blocks/{id}/{route}

Receive data from a block. Use this endpoint to query block routes that return data. The only default route is rule which, in response to a GET query, will return the block's current rule.
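
The two route endpoints differ only in whether you send a body, so a single helper can cover both. In this sketch, route_request is our own name, the block id "1" is a placeholder, and the rule payload reuses the tofile example from earlier. We're also assuming that a rule is updated by POSTing the new rule's JSON to the rule route.

```python
import json
import urllib.parse
import urllib.request


def route_request(block_id, route, payload=None, base_url="http://localhost:7070"):
    """Prepare a GET (no payload) or POST (with payload) for a block route."""
    url = "{}/blocks/{}/{}".format(
        base_url,
        urllib.parse.quote(str(block_id), safe=""),
        urllib.parse.quote(route, safe=""),
    )
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=data)


query = route_request("1", "rule")
update = route_request("1", "rule", {"Filename": "other.json"})
print(query.get_method(), query.full_url)
print(update.get_method(), update.full_url)
```

Either request can be sent to a running server with urllib.request.urlopen.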

connections

A connection's JSON representation uses the following schema:

{
  "Id":
  "FromId":
  "ToId":
  "ToRoute":
}

Here, only Id is optional. Id is used to uniquely refer to the connection inside streamtools. FromId refers to the block that data is flowing from. ToId refers to the block the data is flowing to. ToRoute tells the connection which inbound route to send data to.
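
As with blocks, a small builder can keep connection definitions tidy. In this sketch, make_connection and the block ids "1" and "2" are invented for illustration. Defaulting ToRoute to "in" is our choice, picked because "in" is listed among the default routes.

```python
import json


def make_connection(from_id, to_id, to_route="in", connection_id=None):
    """Build a connection definition; Id is left out unless given."""
    connection = {"FromId": from_id, "ToId": to_id, "ToRoute": to_route}
    if connection_id is not None:
        connection["Id"] = connection_id
    return connection


# Connect hypothetical block "1" to the "in" route of hypothetical block "2".
print(json.dumps(make_connection("1", "2")))
```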

POST /connections

Post a connection's JSON representation to this endpoint to create it.

GET /connections

Lists all the current connections.

GET /connections/{id}

Returns the JSON representation of the connection specified by {id}.

DELETE /connections/{id}

Deletes the connection specified by {id}.

GET /connections/{id}/{route}

Query a connection via its routes. Each connection has a rate route which will return an estimate of the rate of messages coming through it and a last route which will return the last message it saw.

command line

The streamtools server is completely contained in a single binary called st. It has a number of options:

More Info

For more info see Introducing Streamtools on The New York Times R&D Labs blog.

For background on responsive programming tools see Bret Victor's learnable programming.

If you're interested in learning more about visual programming languages, check out Interface Vision's fantastic roundup dating back to 1963.