Expressions are an essential feature of the Serverless Workflow Specification. They are everywhere and they are powerful. As you should already be aware if you have ever watched a superhero movie, with great power comes great responsibility. In the Kogito universe, when discussing expressions, this famous sentence means there is a risk you will overuse them. Or, employing culinary terms, expressions are like the salt in a stew: you need to find the right amount for your recipe to excel.
But what exactly is an expression? In a nutshell, it is a string that adheres to certain conventions established by a language specification and that allows you to interact with the workflow data model. There are two terms in the previous sentence that deserve explanation: workflow data model and language specification.
Every workflow instance is associated with a data model. This model, regardless of whether your flow file is written in YAML or JSON, consists of a JSON object. The initial content of that object is set by the creator of the flow. As you already know, the flow can be started through a Cloud Event or an HTTP POST invocation. No matter which approach you use to start the flow, either the event or the request body will be a JSON object, which is expected to have a workflowdata property. Its value, typically another JSON object, will be used as the initial model. That model will be accessed and updated as part of the flow execution. Expressions are the mechanism defined by the Serverless Workflow Specification for the states to interact with the model.
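As a minimal sketch, the body of an HTTP POST that starts a flow might look like the following. Only the workflowdata wrapper comes from the text above; the language property and its value are illustrative assumptions. With this payload, the initial model would be the inner object.

```json
{
  "workflowdata": {
    "language": "English"
  }
}
```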
Kogito supports two expression languages: jsonpath and jq. The default one is jq, but you can change it to jsonpath by using the expressionLang property.
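For illustration, a workflow definition that switches to jsonpath might begin like the fragment below. This is a sketch that assumes expressionLang sits at the top level of the definition, as in the Serverless Workflow specification; the id and name values are hypothetical, and every other property of the definition is omitted.

```json
{
  "id": "example_flow",
  "name": "Example flow",
  "expressionLang": "jsonpath"
}
```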
Why these two languages? As you already guessed, since the flow model is a JSON object, it makes sense that the expression languages intended to interact with it are JSON-oriented ones.
Why is jq the default? Because it is more powerful. In fact, jsonpath is not suitable for all use cases supported by the specification, as you will soon find out in this post.
Given this quick introduction, let’s discuss in this post some use cases for expressions: switch state conditions, action function args and state filtering.
Switch conditions
Unlike a human, a flow does not have free will: its destiny is decided by the contents of the model and the flow designer.
Conditions inside a switch state allow the flow designer to choose which path the flow should follow depending on the model content. A condition is an expression that returns a boolean when evaluated against the model. If the condition associated with a state transition returns true, that is the place for the flow to go.
For example, in the greetings repository, we select which message should be displayed to the user depending on their language of choice: English or Spanish. In computational terms, if the value of the property language is English, the constant literal to be injected into the property message will be Hello from…; else, if the value of the same property is Spanish, the injected message will be Saludos desde….
Using jq as the expression language and JSON as the workflow definition language, the switch state contains:
"dataConditions": [
  {
    "condition": "${ .language == \"English\" }",
    "transition": "GreetInEnglish"
  },
  {
    "condition": "${ .language == \"Spanish\" }",
    "transition": "GreetInSpanish"
  }
]
Using jsonpath as the expression language and YAML as the workflow definition language, the switch would look like:
dataConditions:
  - condition: "${$.[?(@.language == 'English')]}"
    transition: GreetInEnglish
  - condition: "${$.[?(@.language == 'Spanish')]}"
    transition: GreetInSpanish
Note that, as required by the specification, all expressions in these examples are embedded within ${… }. But Kogito is smart enough to infer that the string within condition is an expression, so you can skip the wrapper and just write:
"dataConditions": [
  {
    "condition": ".language == \"English\"",
    "transition": "GreetInEnglish"
  },
  {
    "condition": ".language == \"Spanish\"",
    "transition": "GreetInSpanish"
  }
]
This behaves the same and is a bit shorter.
Function arguments
One of the coolest things about the Serverless Workflow Specification is the capability to define functions that can be invoked several times by the states of the flow. Each call can pass different values, which are specified using function arguments.
A function definition example can be found in the temperature conversion repository. This flow performs two consecutive REST invocations to convert Fahrenheit to Celsius (a subtraction and a multiplication).
Let’s focus on the first function definition, the subtraction.
"functions": [
  {
    "name": "subtraction",
    "operation": "specs/subtraction.yaml#doOperation"
  }
]
In the snippet above, we are defining a function called subtraction that performs an OpenAPI call. Kogito knows that the operation property defines an OpenAPI invocation because REST is the default operation type (adding "type": "rest" will also work, but it is redundant). The OpenAPI specification URI is the substring before the # character in the operation property. The operationId is the substring after the #. In the Kogito Serverless Workflow implementation, when a URI does not have a scheme, it is assumed to be located in the classpath of the Maven project.
paths:
  /:
    post:
      operationId: doOperation
      requestBody:
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/SubtractionOperation'
      responses:
        "200":
          description: OK
components:
  schemas:
    SubtractionOperation:
      type: object
      properties:
        leftElement:
          format: float
          type: number
        rightElement:
          format: float
          type: number
As you can see in the snippet above, the subtraction.yaml specification file referenced in this example defines an operationId called doOperation, which expects two parameters: leftElement and rightElement. In order to invoke a function, we use a functionRef construct. It is composed of the refName (which should match the function definition name) and the arguments to be used in the function call.
Function arguments are expressed as a JSON object whose property values might be either a string containing an expression or any JSON data type (string, number, boolean…). Note that in this example, the expression is not embedded within ${}. Kogito will infer it is a valid jq expression because of the . prefix, but you can embed it if you prefer to do so.
"functionRef": {
  "refName": "subtraction",
  "arguments": {
    "leftElement": ".fahrenheit",
    "rightElement": ".subtractValue"
  }
}
In the snippet above, we are specifying that the left number of the subtraction is equal to the fahrenheit property (which is an input number provided by the user invoking the flow) and that the right element is equal to the subtractValue property (which is a constant number injected into the flow model by the SetConstants state). After the expressions are evaluated, the resulting JSON object is used as the request body.
"functionRef": {
  "refName": "subtraction",
  "arguments": "{leftElement: .fahrenheit, rightElement : .subtractValue}"
}
You can also write function arguments as a string containing an expression that returns a JSON object. You should be aware that this capability might not be supported by the expression language (jsonpath does not support it, which is one of the reasons why jq is the default expression language in the specification). Also, it is only suitable when the OpenAPI operation does not define any path, query or header parameter. The snippet above returns the same JSON object as the previous one, but using a jq expression string rather than a JSON object.
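To make the evaluation concrete, here is a sketch of the request body that either form of arguments would resolve to, assuming the flow model at that point is {"fahrenheit": 100, "subtractValue": 32}. Both values are illustrative assumptions, not taken from the repository.

```json
{
  "leftElement": 100,
  "rightElement": 32
}
```

Each expression is replaced by the value it selects from the model, while the property names are kept as written in arguments.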
Setting OpenAPI parameters
In the previous example, the function arguments are used as the POST/PUT request body. But what happens if you want to set path, header or query parameters? In this case, you must use the JSON object approach, where the proper query, path or header parameters are extracted from arguments matching the name of the parameter. The remaining arguments in the JSON object, if any, will be used as the request body.
Let’s illustrate it with an example. Consider the following OpenAPI definition, which adds a header named pepe to the multiplication operation.
paths:
  /:
    post:
      operationId: doOperation
      parameters:
        - in: header
          name: pepe
          schema:
            type: string
          required: false
      requestBody:
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/MultiplicationOperation'
      responses:
        "200":
          description: OK
components:
  schemas:
    MultiplicationOperation:
      type: object
      properties:
        leftElement:
          format: float
          type: number
        rightElement:
          format: float
          type: number
You can set the value for the header pepe by including a property named pepe in the function arguments using the JSON object approach, as in the snippet below, which sets the value of the header to pepa. As explained before, the resulting POST request body will just contain leftElement and rightElement.
"functionRef": {
  "refName": "multiplication",
  "arguments": {
    "pepe": "pepa",
    "leftElement": ".subtraction.difference",
    "rightElement": ".multiplyValue"
  }
}
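As a rough sketch of how the arguments above are split, assume the model contains {"subtraction": {"difference": 68}, "multiplyValue": 0.5556} (illustrative values). The object below is only a descriptive representation of the resulting HTTP request, not a format used by Kogito: the headers and body keys are hypothetical labels.

```json
{
  "headers": {
    "pepe": "pepa"
  },
  "body": {
    "leftElement": 68,
    "rightElement": 0.5556
  }
}
```

The pepe argument matches the header parameter declared in the OpenAPI definition, so it travels as a header and is left out of the body.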
State Filtering
Let’s conclude our review of expression usage by considering the expressions example. The input model is an array of complex numbers (where x is the real coordinate and y the imaginary one) and the output model is the maximum value of the real coordinate within the input array.
The flow consists of an action expression (defined inside squareState) that calculates the maximum x and the minimum y; an action data filter (defined inside squareState) that selects the maximum x as the action output that should be merged into the model; and a state data filter (defined inside the finish state) that sets that max value as the whole model that will be returned to the caller. Let’s examine the three of them in detail.
"functions": [
  {
    "name": "max",
    "type": "expression",
    "operation": "{max: .numbers | max_by(.x), min: .numbers | min_by(.y)}"
  }
]
In the snippet above we define a max function of type expression. The operation property is a string containing a jq expression. This expression returns a JSON object, where the max property is the element of the input array with the largest x coordinate and the min property is the element with the smallest y coordinate.
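As a worked example, assume the input model is {"numbers": [{"x": 2, "y": 1}, {"x": 4, "y": 3}, {"x": 1, "y": 5}]} (illustrative values). Since jq's max_by and min_by return the matching array element rather than the bare coordinate, evaluating the expression against that model should produce:

```json
{
  "max": { "x": 4, "y": 3 },
  "min": { "x": 2, "y": 1 }
}
```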
Now let’s go to the place where this function is invoked.
"actions": [
  {
    "name": "maxAction",
    "functionRef": {
      "refName": "max"
    },
    "actionDataFilter": {
      "results": ".max.x",
      "toStateData": ".number"
    }
  }
]
Since we are only interested in the maximum x, besides invoking the function using functionRef, we add an action data filter. If we do not add this filter, the whole JSON object returned by the function call will be merged into the flow model. The filter has two properties: results, which selects the attribute, and toStateData, which indicates the name of the target property inside the flow model (if this property does not exist, it will be added). So, after executing the action, the flow model will consist of a number property storing the maximum value and the original numbers array. Then the flow moves to the next state: finish.
"name": "finish",
"type": "operation",
"stateDataFilter": {
  "input": "{result: .number}"
}
Since we do not want to return the user input as a result of the flow execution, the final stage consists of a state data filter that sets the contents of the output model. Hence, we set the model to be a JSON object containing a property named result, whose value is the maximum number calculated by the previous state, stored in the number property. We do this using the input property of the stateDataFilter construct, meaning that the model is changed before the state gets executed. So the final model content returned to the user contains a result property whose value is the maximum x.
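To trace the whole example, assume the flow is started with the numbers array shown under the first key below (illustrative values). The object is only a descriptive trace with hypothetical keys, not something Kogito produces: it shows the model after the action data filter is applied and the output after the state data filter.

```json
{
  "inputModel": {
    "numbers": [{ "x": 2, "y": 1 }, { "x": 4, "y": 3 }, { "x": 1, "y": 5 }]
  },
  "modelAfterMaxAction": {
    "numbers": [{ "x": 2, "y": 1 }, { "x": 4, "y": 3 }, { "x": 1, "y": 5 }],
    "number": 4
  },
  "flowOutput": {
    "result": 4
  }
}
```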
There are some aspects I would like to point out before jumping to the conclusion:
- The significant difference between action and state data filters. While the former just selects the portion of the action result that will be merged into the model, overriding only those properties in the flow model that share the name with the selected action result, the latter sets the whole flow model to the JSON object returned by the expression, discarding any existing property.
- Since stateDataFilter is expected to return a whole JSON object, jq is the only real valid option if you are using this construct (remember jsonpath expressions are not able to build new JSON objects).
- We use expression strings that are not embedded within ${..}. As explained before, this is just syntactic sugar. You can use the approach you prefer.
Conclusion
This post illustrates the usage of expressions in three main areas of the Serverless Workflow specification: branching using the Switch State, invoking OpenAPI services with arguments that are extracted from the flow model, and manipulating the flow model using the jq expression language. Now it is your turn to combine the three of them to perform complex and unrestricted service orchestration. The only limit is your imagination.