From server to cloud to serverless
The first wave of cloud adoption enabled developers to remotely manage virtual machines, CPU cores, memory, disk and networking, avoiding the hassle of buying and managing physical servers. IaaS and PaaS are part of that evolution, which enabled developers to move their applications (monoliths) to the cloud. As deployment and DevOps became more efficient, new development practices soon arose: some functionality got “sandboxed” and exposed over REST APIs as a service, such as payment processing or weather forecasting, which is what we call SaaS today.
The latest change in deployment is serverless: rather than buying “virtual capacity”, developers tap into remote distributed computation through cloud functions. For instance, in his blog, Slobodan Stojanović describes how to migrate existing applications to serverless.
To better understand the impact of serverless on the software architecture, consider the following picture:
AWS Lambda, Azure Functions, Google Cloud Functions, OpenWhisk: they liberate developers from managing virtual capacity, just like the cloud liberated developers from managing physical servers.
Developing applications using only cloud functions brings its own challenges, as demonstrated in this cartoon:
When implementing applications using cloud functions, some new problems arise:
- How to make microservices talk to each other in a secure way
- How to discover what microservices are being offered
- How to expose a unified application (REST) view to the outside world
- How to build logic using stateless functions
In this blog, I am focusing on the last problem, as the first three can be resolved by API gateways.
As an example of the serverless architecture, I am using the open source project OpenWhisk, which consists of dockerized sandboxed VMs that execute remote functions. Further scalability is achieved by Kafka, which routes functions to sharded VMs.
Below is the cloud function “hello world” written in Node.js (in Python, C# or any other language, the method signature would be pretty much the same):
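A minimal sketch of such a function, assuming the OpenWhisk Node.js convention of a `main` entry point that receives the invocation parameters as an object and returns a JSON-serializable object. The `name` parameter is an illustrative assumption.

```javascript
// Minimal OpenWhisk-style Node.js action (sketch).
// `main` is the entry point; `params` carries the JSON input of the invocation.
function main(params) {
  // `name` is an illustrative parameter, not something the platform requires.
  const name = (params && params.name) || 'world';
  // The returned object becomes the JSON result (payload) of the action.
  return { payload: `Hello, ${name}!` };
}
```

Calling `main({ name: 'Waylay' })` returns `{ payload: 'Hello, Waylay!' }`; nothing from one call survives into the next.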
Rules and orchestration using cloud functions
In this great article on martinfowler.com, Mike Roberts describes serverless architecture and one of the restrictions that comes with cloud functions:
“FaaS functions have significant restrictions when it comes to local (machine / instance bound) state. In short you should assume that for any given invocation of a function none of the in-process or host state that you create will be available to any subsequent invocation. This includes state in RAM and state you may write to local disk. In other words from a deployment-unit point of view FaaS functions are stateless. This has a huge impact on application architecture.”
Because functions are stateless, the options for building logic with them are limited. Here are some of the challenges:
- Challenges related to deciding when to run functions (via triggers, schedules, http calls etc)
- Challenges related to orchestration of functions
- Challenges related to building logic that spans multiple “runs” of the same rule
I will use OpenWhisk as a reference to see how these challenges are addressed today. One thing we can observe from the picture below is that function orchestration is mainly achieved by chaining actions, where the payload produced by one action is passed on to the next.
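The chaining idea can be sketched in plain JavaScript, without any platform API. The `sequence` helper and the two actions are hypothetical names used only for illustration; they mimic how the output payload of one stateless action becomes the input of the next.

```javascript
// Two stateless "actions": each takes a JSON payload and returns a new one.
const toCelsius = (params) => ({ celsius: (params.fahrenheit - 32) * 5 / 9 });
const classify = (params) => ({ ...params, status: params.celsius > 30 ? 'hot' : 'ok' });

// Hypothetical helper: builds a chain that feeds each result to the next action.
const sequence = (actions) => (input) =>
  actions.reduce((payload, action) => action(payload), input);

const temperatureCheck = sequence([toCelsius, classify]);
```

Here `temperatureCheck({ fahrenheit: 212 })` yields `{ celsius: 100, status: 'hot' }`. The chain itself holds no state: everything the second action knows comes from the payload.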
For more complicated use cases you can use compositions. For instance, you can do something like this:
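As a rough illustration of what a conditional composition does, here is a plain JavaScript sketch. It does not use the actual composer library: `ifThenElse`, `authenticate`, `success` and `failure` are hypothetical names, and a thrown error stands in for a failed action.

```javascript
// Hypothetical combinator: run `condition`, then `consequent` or `alternate`.
// A thrown error stands in for a failed action (its "exit code").
function ifThenElse(condition, consequent, alternate) {
  return (params) => {
    let succeeded;
    try {
      condition(params);
      succeeded = true;
    } catch (err) {
      succeeded = false;
    }
    return succeeded ? consequent(params) : alternate(params);
  };
}

// Illustrative actions; the token check is a placeholder for real authentication.
const authenticate = (params) => {
  if (params.token !== 'secret') throw new Error('unauthorized');
  return { user: 'demo' };
};
const success = () => ({ message: 'welcome' });
const failure = () => ({ message: 'access denied' });

const login = ifThenElse(authenticate, success, failure);
```

The only decision the composition can make is whether the first function succeeded or failed.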
As we can see, modelling the logic using stateless cloud functions is based on two features of the action method signature:
- Exit code of the function, which either fails or succeeds, like in the composer above
- JSON payloads that can be shuffled from one function’s output to another one as the input argument (params in the example below):
which leads us to only one architectural solution: execute actions in a flow and use the payload after each action to decide what to do next.
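A small sketch of that pattern, with hypothetical action names and a hypothetical `next` field in the payload: the flow runner contains no logic of its own and simply inspects the payload after each action.

```javascript
// Hypothetical actions; each returns a payload that names the next step.
const actions = {
  checkStock: (params) => ({ ...params, next: params.quantity > 0 ? 'charge' : 'notify' }),
  charge: (params) => ({ ...params, charged: true, next: null }),
  notify: (params) => ({ ...params, notified: true, next: null }),
};

// The flow only reads the payload after each action to pick the next one.
function runFlow(start, input) {
  let step = start;
  let payload = input;
  while (step) {
    payload = actions[step](payload);
    step = payload.next;
  }
  return payload;
}
```

Note how the routing decision leaks into the payload: every action has to know what comes after it, which is exactly what makes larger flows hard to follow.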
But how would we build more complex logic this way?
- One option is to use the exit code of the function as a decision criterion. That sounds like a terrible idea, so let’s assume and hope nobody will ever try this.
- Another option is to branch based on the content of the payload message. That way we need to dig into the payloads to know what is going on. Moreover, if we need to follow two or more outcomes that are encoded in the payload object, we have to split the payload result and create multiple flows at that point, similar to the Node-RED flow philosophy, as shown in the picture below:
If at some point we need to merge these flows, we’re in for even more fun… I have already argued in another blog post that decision trees are a terrible way to capture logic in more complex systems, and if we add to this problem the idea of splitting and merging functions while shuffling payloads, you know where this is all going: a gigantic cloud spaghetti.
To solve part of this problem, AWS’s approach is to wrap Lambda functions into Step Functions, as in the example below (defined using JSON notation). Hello World is a task that wraps a Lambda function and ends as soon as the function has executed (“End”: true); otherwise, the task could be chained explicitly to another task within the JSON definition:
Developers can start building logic using Step Functions, similar to BPM engines. You can also define callbacks to handle errors in the same task definition, but the overall complexity is not much lower than with the OpenWhisk approach.
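A sketch of such a single-task definition in the Amazon States Language, written here as a JavaScript object and serialized to JSON. The `Resource` ARN is a placeholder and the `Comment` text is illustrative.

```javascript
// Sketch of a one-state state machine in the Amazon States Language.
// The Resource ARN below is a placeholder, not a real function.
const helloWorldStateMachine = {
  Comment: 'A Hello World example with a single Task state',
  StartAt: 'HelloWorld',
  States: {
    HelloWorld: {
      Type: 'Task',
      Resource: 'arn:aws:lambda:REGION:ACCOUNT_ID:function:FUNCTION_NAME',
      End: true, // use Next: '<name of another state>' instead to chain tasks
    },
  },
};

// Serialize to the JSON document form of the definition.
const definition = JSON.stringify(helloWorldStateMachine, null, 2);
```
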
An alternative way – cloud function orchestration made easy
Update (November 2020): This content was presented in extended form at the Serverless Architecture Conference in Berlin in 2020: “Solving the weak spots of serverless with Directed Acyclic Graph Model”
At Waylay, we have come up with the concept of smart objects, sensors and actuators. With our smart agent concept, developers build logic by assembling sensors through logical gates, which in turn can trigger actuators. All sensors and actuators are sandboxed cloud functions, and sensors report their observations back to the cloud engine. The rule engine is a type of inference engine, or rather what we call a cloud-based Bayesian smart agent architecture for Internet of Things applications, which instantaneously propagates the results of the cloud functions through sensor states.
That means that some of the nodes in the graph represent objects as the source of information, such as a door (which can be open or closed), a weather forecast, a CRM system, a smart washing machine or just the temperature in a house. That way, developers are no longer concerned with (nor constrained by) the logic at the moment they write cloud functions. Since developers are free to add as many sensors and actuators as they like “on the fly”, the library of sensors and actuators becomes a sort of cloud DSL, where cloud functions (sensors and actuators) can easily be reused in multiple use cases. One side effect of this abstraction is that it enables role separation between the people responsible for sensor gathering and the people responsible for knowledge modelling.
This way of inference modelling also allows both the push mode (over REST/MQTT/WebSockets) and the pull mode (API, database, etc.) to be treated as first-class citizens. For the engine, the mode makes no difference, because as soon as a new insight is provided, it is propagated to all other nodes in the network. That also means that we don’t need pub/sub integration to trigger flows when new data arrives. Moreover, within a running (stateful) rule, Waylay can execute cloud functions (which are part of the bigger rule) as soon as data is present. So rather than forcing an architecture in which you start a flow per new data point, we can actually model a rule where some of the cloud functions “awake” as soon as data is presented to them.
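The sensor, gate and actuator idea can be illustrated with a toy sketch. This is not the Waylay engine or its API: every name below is hypothetical, and a plain boolean AND gate stands in for the inference engine. The point is only that the rule, not the functions, keeps the state, and that it reacts whenever any sensor reports.

```javascript
// Toy model only: a boolean AND gate stands in for the inference engine.
function createRule(sensorNames, actuator) {
  const states = {}; // last known state per sensor, kept by the rule, not by the functions
  return {
    // Called whenever a sensor (pushed or polled) reports a new state.
    report(sensor, state) {
      states[sensor] = state;
      const allTrue = sensorNames.every((name) => states[name] === true);
      return allTrue ? actuator(states) : null;
    },
  };
}

// Hypothetical actuator that records what it was triggered with.
const sent = [];
const sendAlert = (states) => {
  sent.push({ ...states });
  return 'alert sent';
};

const rule = createRule(['doorOpen', 'nobodyHome'], sendAlert);
```

Reporting `doorOpen` alone does nothing; once `nobodyHome` is also reported as true, the actuator fires, regardless of the order or the way in which the two observations arrived.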
There you go, cloud function orchestration made easy: serverless doesn’t need to be ‘headless’!
In the next blog, “Waylay engine, a new era of building cloud applications”, we shall see how we deal with cloud function orchestration using the Waylay inference engine.
Finally, if you are intrigued by Waylay internals, there is one more blog for you: “Serverless automation with Waylay engine”.