Pipeline design

Edge and Mesh Pipelines enable the data flow between the edge and the mesh (cloud) environment. Edge Pipelines preprocess the data before sending it to the Mesh Pipelines, which in turn process the data in the cloud environment.

Every pipeline consists of a sequence of nodes. Each node is responsible for a specific task, such as filtering, transforming, or aggregating data. The nodes are connected in a directed acyclic graph (DAG) where the edges represent the data flow between the nodes.

Diagram: execution roles of the Edge and Mesh Adapters in the ETL (extract, transform, load) process.

Every node in the pipeline has a specific configuration that defines its behavior. The configuration includes the type of the node, the input and output schema, and any additional parameters required for the node to function correctly.

The following example shows the configuration of a Project node that controls which fields of the input data stream are passed on to the output stream:

- type: Project@1
  fields:
    - path: "$.Sinus"
      inclusion: false
    - path: "$.Constant_6"
      inclusion: false

Every node receives an input object that it can modify and pass on to the next node in the pipeline. The input object is a JSON object that represents the data stream at that point in the pipeline; the output object is the modified input object that is handed to the next node.
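For illustration, an input object arriving at the Project node above might look like the following sketch. The Sinus and Constant_6 fields correspond to the paths in the example configuration; the values and the Timestamp field are made up for this sketch:

{
  "Sinus": 0.42,
  "Constant_6": 6.0,
  "Timestamp": "2024-01-01T12:00:00Z"
}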

Pipeline definitions

We call the list of nodes to be executed a Pipeline Definition; the typical format is YAML. The following example shows a simple pipeline definition with two nodes:

triggers:
  - type: FromHttpRequest@1
    path: /demo
    method: POST
transformations:
  - type: PrintDebug@1

This pipeline definition consists of two parts: the triggers section and the transformations section. The triggers section defines the event that starts the pipeline execution, while the transformations section defines the sequence of nodes that are executed when the pipeline is triggered.
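Putting the two sections together, a pipeline that is triggered by an HTTP request, applies the Project node from the earlier example, and then prints the result for debugging could look like the following sketch (it only combines node types already shown above; the parameter values are illustrative):

triggers:
  - type: FromHttpRequest@1
    path: /demo
    method: POST
transformations:
  - type: Project@1
    fields:
      - path: "$.Sinus"
        inclusion: false
      - path: "$.Constant_6"
        inclusion: false
  - type: PrintDebug@1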

Node types

We distinguish between different types of nodes based on their functionality. Some nodes are Adapter-specific, while others are general-purpose nodes that can be used in any pipeline.

The following list provides an overview of the different node types:

Trigger nodes

Trigger nodes are used to start the execution of a pipeline based on an event. The event can be a message from a message broker, a polling event, or a webhook.
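The FromHttpRequest@1 node in the pipeline definition above is a webhook-style trigger. As a purely hypothetical sketch of a polling trigger (the node type FromPolling@1 and its interval parameter are invented here for illustration and may not exist under these names):

triggers:
  - type: FromPolling@1   # hypothetical node type, for illustration only
    interval: 10s         # hypothetical parameter: poll the source every 10 seconds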

Detailed information about the individual trigger nodes can be found in the 'Nodes' subsection of the docs.

Control nodes

Control nodes are used to control the flow of the pipeline. They can be used to iterate over an array, execute a sub-pipeline for each element, or execute a sub-pipeline based on a condition.
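As a purely hypothetical sketch (the ForEach@1 node type and its path and do parameters are invented here for illustration), a control node that runs a sub-pipeline for each element of an array might be configured like this:

transformations:
  - type: ForEach@1          # hypothetical node type, for illustration only
    path: "$.measurements"   # hypothetical: the array to iterate over
    do:                      # hypothetical: sub-pipeline executed per element
      - type: PrintDebug@1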

Detailed information about the individual control nodes can be found in the 'Nodes' subsection of the docs.

Transformation nodes

Transformation nodes are used to transform the data in the pipeline. They can be used to filter, aggregate, or enrich the data; the Project node shown earlier, which controls which fields are passed on, is a typical example.

Detailed information about the individual transformation nodes can be found in the 'Nodes' subsection of the docs.

Extract nodes

Extract nodes are used to extract data from a source, such as a database, a file, or a message broker. In most cases, however, Trigger nodes take on this role.

Detailed information about the individual extract nodes can be found in the 'Nodes' subsection of the docs.

Load nodes

Load nodes are used to load data into a target. They can be used to load data into a database, a file, or a message broker.
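As a purely hypothetical sketch (the ToDatabase@1 node type and its connection and table parameters are invented here for illustration and may not match the actual node catalogue), a load node at the end of a pipeline might look like this:

transformations:
  - type: PrintDebug@1
  - type: ToDatabase@1        # hypothetical node type, for illustration only
    connection: my-postgres   # hypothetical: reference to a configured connection
    table: measurements       # hypothetical: target table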

Detailed information about the individual load nodes can be found in the 'Nodes' subsection of the docs.