Introduction

Real-time fraud detection is crucial in payment processing. Stream processing engines must be capable of implementing sophisticated fraud detection algorithms to identify and prevent fraudulent transactions.

Stream processing engines are designed to process data in real time or near real time as it flows through a system. They are optimized for low-latency data processing, making them suitable for applications such as real-time analytics, monitoring, and fraud detection. Some of the stream processing engines on the market are Apache Kafka Streams, Apache Flink, Apache Storm, Apache Spark Streaming, Amazon Kinesis (with services such as Kinesis Streams, Kinesis Firehose, and Kinesis Analytics), Microsoft Azure Stream Analytics, and Google Cloud Dataflow (based on Apache Beam).

Need for speed

Before we continue, let’s look at how payment checks are done traditionally, especially in applications like anti-money laundering (AML). Most traditional applications in this domain let users create AML policy rules, or rely on suppliers to provide them - often hundreds of rules. These rules are then configured to run over all transaction logs once a day. As soon as one of the policies “breaks”, the transaction is flagged and put into an exception queue (and good luck finding out which rule triggered it). But let’s stay on the processing side of things. Most policy rules are built as decision trees or tables, or as simple conditional statements, which are well suited for ETL pipelines but impossible to extend to stream processing, which is what a real-time implementation requires.
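As a rough illustration - not taken from any particular AML product - a traditional batch-style policy check might look like the sketch below: a couple of decision-table rules expressed as plain conditionals, evaluated once a day over the full transaction log. The thresholds and field names are made up for the example.

```python
from dataclasses import dataclass

@dataclass
class Transaction:
    tx_id: str
    amount: float
    country: str
    daily_count: int  # transactions by the same account that day

# Two illustrative policy rules, the way a classic decision table would encode them.
def rule_large_amount(tx: Transaction) -> bool:
    return tx.amount > 10_000

def rule_high_frequency(tx: Transaction) -> bool:
    return tx.daily_count > 20 and tx.country != "BE"

POLICY_RULES = {
    "large_amount": rule_large_amount,
    "high_frequency": rule_high_frequency,
}

def nightly_batch_check(transactions: list[Transaction]) -> list[tuple[str, str]]:
    """Run every policy over yesterday's full transaction log, batch style."""
    exceptions = []
    for tx in transactions:
        for name, rule in POLICY_RULES.items():
            if rule(tx):
                exceptions.append((tx.tx_id, name))  # record which rule fired
    return exceptions
```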

On the other hand, these policies are hard to describe, implement, and manage as SQL-like rules, or even in CEP (complex event processing) syntax, so they are not well suited for stream processing engines.

So, even though stream processing frameworks are great, they still lack some basic functionality when it comes to creating complex and sophisticated rules - and that is what payment processing is all about. This is where the Waylay Engine comes into play.

Payment stream processing architecture with Waylay

In many cases, the journey of a payment transaction starts with the ETL process. Before this phase, the data is already stored in diverse database backends. This preparation step ensures that the windowed data needed by subsequent SQL queries is readily available. After undergoing the ETL process, the data is often consolidated into a larger BigQuery database. This database serves as the foundation for various analytical reports and machine learning algorithms, frequently running as batch processing. Additionally, as illustrated in the diagram below, payment entries are forwarded to either Kinesis or Kafka, from where they are processed in real time by the Waylay Engine.

AML Architecture blueprint
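As a minimal sketch of the ingestion side of this diagram - assuming the kafka-python client and a hypothetical payments topic, not any specific production setup - forwarding payment entries onto the bus could look roughly like this:

```python
import json
from kafka import KafkaProducer  # pip install kafka-python

# Hypothetical broker address and topic name, purely for illustration.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

payment_event = {
    "tx_id": "tx-000123",
    "account": "acct-42",
    "amount": 250.00,
    "currency": "EUR",
    "country": "BE",
}

# Each payment entry becomes one event on the stream; the Waylay Engine
# (or any other consumer) picks it up from here.
producer.send("payments", value=payment_event)
producer.flush()
```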

What sets the Waylay Engine apart is its unique ability to inject specially crafted policy rules directly into message buses such as Kafka or Kinesis (in that regard, you can consider the Waylay Engine the message consumer). Instead of processing data through conventional SQL queries, the Waylay Engine processes data using rule definitions. These rule definitions can also be composed of serverless endpoints, lambda functions, or even ML models. The design somewhat resembles using AWS Step Functions for event processing, but it allows for far more complex rules: AWS Step Functions is a finite state machine, whereas most policy rules resemble decision tables or trees.
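To make that concrete - this is not Waylay's actual API, just an illustrative sketch assuming the kafka-python client and the hypothetical payments topic from above - a consumer that evaluates a rule definition against each incoming payment event could look like this:

```python
import json
from kafka import KafkaConsumer  # pip install kafka-python

# Hypothetical rule definition: a named set of checks evaluated per event.
# In the real engine such nodes could just as well be serverless functions
# or ML models; here they are plain Python callables for illustration.
AML_RULE = {
    "large_amount": lambda event: event["amount"] > 10_000,
    "high_risk_country": lambda event: event["country"] in {"XX", "YY"},  # made-up codes
}

consumer = KafkaConsumer(
    "payments",                              # hypothetical topic name
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

for message in consumer:
    event = message.value
    fired = [name for name, check in AML_RULE.items() if check(event)]
    if fired:
        # In a real deployment this would go to a case-management or
        # reporting stream rather than stdout.
        print(f"transaction {event['tx_id']} flagged by: {fired}")
```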

AML Policy rule template

Furthermore, the Waylay Engine allows you to define rules that are executed natively within the engine itself, harnessing the computational speed of native processing. Instead of relying on SQL-like query languages, you simply define the rules using templates, which you can manage, test, and audit separately:

Testing “one policy” rule
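Purely as a sketch - this is not the engine's own test tooling - a single policy rule like the ones above can be exercised in isolation against a handful of synthetic transactions:

```python
# test_large_amount_rule.py - a standalone, hypothetical test for one policy rule.

def rule_large_amount(event: dict) -> bool:
    return event["amount"] > 10_000  # same illustrative threshold as before

def test_flags_large_transaction():
    assert rule_large_amount({"tx_id": "t1", "amount": 25_000})

def test_ignores_small_transaction():
    assert not rule_large_amount({"tx_id": "t2", "amount": 120.50})
```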

Another intriguing aspect is the ability to seamlessly blend native code with sub-rules that are described by machine learning models, cloud functions, or functions exposed as APIs by third parties. This combination enables you to delegate specific computations to external machine learning models and systems, integrating them directly into the data stream to enforce policy rules.
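Such a sub-rule could, for instance, delegate its decision to an externally hosted scoring model. The endpoint URL, payload, and threshold below are hypothetical; the point is only that a rule node can hand a computation off to an external service and fold the answer back into the policy decision:

```python
import requests

def rule_ml_fraud_score(event: dict, threshold: float = 0.8) -> bool:
    """Delegate the decision to an external ML scoring service (placeholder URL)."""
    response = requests.post(
        "https://scoring.example.com/v1/fraud-score",   # hypothetical endpoint
        json={"amount": event["amount"], "country": event["country"]},
        timeout=2,  # keep latency bounded; this sits in a real-time path
    )
    response.raise_for_status()
    return response.json()["score"] > threshold
```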

Since all these rules are described as templates, you can create a library of different policy rules and dynamically apply them to all incoming data streams within the system.
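Tying it together - again as a sketch rather than Waylay's actual template format - such a library can be registered once and fanned out over every event coming off the stream:

```python
# A hypothetical policy library: each entry pairs a template name with a check.
# The checks here are trivial stand-ins for the rules sketched earlier.
POLICY_LIBRARY = {
    "large_amount": lambda event: event["amount"] > 10_000,
    "high_frequency": lambda event: event.get("daily_count", 0) > 20,
}

def evaluate_policies(event: dict) -> list[str]:
    """Return the names of every policy in the library that fires for one event."""
    return [name for name, check in POLICY_LIBRARY.items() if check(event)]

# Applied to a single event coming off the stream:
print(evaluate_policies({"tx_id": "t3", "amount": 18_000, "daily_count": 3}))
# -> ['large_amount']
```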

Conclusion 

In this architecture, we leverage the synergy of two distinct paradigms: 

  • The robust capabilities of stream applications, enabling an event-driven architecture for real-time or near-real-time data processing, characterized by low latency, scalability, and fault tolerance across batch and stream processing modes.
  • Alongside them, the Waylay Engine as a formidable rules engine for transaction monitoring, capable of executing complex policy rules attached to the data input streams. Once the data has been processed, it flows back into the data output stream for subsequent tasks, including case management or reporting.

To better visualize this approach, consider the following video, in which each 'dot' within the AML rule represents native code injected into the Waylay Engine. Data is funneled through the data ingestion layer and processed by the Waylay Engine. The result is akin to a stream processing engine; however, the Waylay Engine distinguishes itself by not relying on an SQL query language for stream processing. Instead, it serves as a highly configurable rules engine for stream processing. Additionally, it offers the flexibility to seamlessly integrate serverless functions, machine learning models, and lambdas when the need arises.