Reactive Programming for Database Business Logic

Reactive Programming applies elegantly to database transaction logic, and can result in enormous savings in time, quality and simplicity. Here's how.

What is Reactive Programming?

Reactive Programming is a declarative approach where variables defined by expressions automatically react to changes in referenced values.

As well-explained in this wiki page, Reactive Programming is a declarative - not imperative (aka "procedural") - approach for automatic propagation of changes:

In computing, reactive programming is a programming paradigm oriented around data flows and the propagation of change. This means that it should be possible to express static or dynamic data flows with ease in the programming languages used, and that the underlying execution model will automatically propagate changes through the data flow.

For example, in an imperative programming setting, a := b + c would mean that a is being assigned the result of b + c in the instant the expression is evaluated. Later, the values of b and c can be changed with no effect on the value of a.

In reactive programming, the value of a would be automatically updated based on the new values.

This is an abstract definition, which can be adapted to various disciplines as described below. Regardless, the system needs to establish watches on the referenced values, and react when changes are observed. The watch mechanism may be automatic, or manual wherein you must explicitly invoke APIs to establish watches.

Current Industrial Use

Microsoft has been very active in this field. See the Reactive Framework Examples, and several other references from this StackOverflow posting.

By far the most famous commercial application is the spreadsheet. Your cells are defined by values referencing other cells - the referencing cell reacts to changes in the referenced cells by recomputing its value. This process can, of course, chain, enabling quite complex computational systems to be defined.

Reactive Database Logic

The powerful Reactive Programming model is perfectly suited to address the change propagation required for transaction processing, that is, when the server receives change (Insert, Update, or Delete) requests. The key approach borrows heavily from Domain Driven Design to provide Declarative Domain Logic.

Declarative is the more general term, so we narrow our focus here to Reactive Database Logic, characterized as described below.

1. Base Tables / Columns / Roles are variables

The relational database schema is the Object Model, used to define the variables that can be referenced in Reactive Programming. So, you can refer to customer.balance. This can be extended to provide for attributes not defined in the schema, but that is beyond the scope of this article.

Real world complexity requires that expressions be able to refer to data in other tables, so you can refer to Purchasorder.customer.balance, or Purchaseorder.lineitems. For these:

Parent role: we use the Foreign Key constraint name for references to the "many" side (as in Purchaseorder.customer)
Child Role: the table name for the "one" "side (as in Purchaseorder.lineitems).

Actual support should provide customized child role names, but that is beyond the scope of this document.

2. Define Multi-table Expressions for database columns

Sometimes called rules, these can be simple expressions, like this to compute the amount for a Lineitem row:

return row.price * row.quantity

Expressions might be conditional, like

if (row.qty_ordered <= 6)
return row.product_price * row.qty_ordered;
else
return row.product_price * row.qty_ordered * 0.8;

Multi-table expression support is crucial for multi-table transactions, such as

customer.balance = sum(purchaseorder.amount_total where paid = false)

**3. Establish automatic `watches` on dependent data**

This requires the system parse the expressions to determine the referenced data, which is then automatically watched.

**4. Multi-table recalculation**

Reactive means the system reacts to watched changes to recompute referencing data. Note that changes must be interpreted broadly: not just Updates, but also Inserts and Deletes.

The real power occurs when you consider multi-table derivations. For true declarative / Reactive support, you should not need to code SQL commands to access the data - this should be fully automated. So, deleting a Purchaseorder should adjust the balance.

5. SQL Automation

SQLs required for multi-table recalculation should be automated.

**6. Optimized SQL handling**

Which brings us to optimization. The agility / clarity advantages of Reactive Logic provide little value if there is a performance penalty.

SQLs are far more expensive than arithmetic, so for enterprise performance, the system needs to optimize update processing in much the same matter as we rely on relational database to optimize retrievals. The system must eliminate and optimize SQL with techniques such as:

Pruning: no SQL should occur if the watched data is not altered (e.g, changing an orders' date should not require access to the customer).
Adjustment: SQLs are expensive, but aggregate SQLs are even more so. Accordingly, recalculations of aggregates (eg., sums) should be done with 1 row updates.

Observe that aggregates can nest. So, a simple non-optimized transaction to pay an order would require a select sum of all the orders and items. Optimized adjustment logic can reduce these SQLs by orders of magnitude.

This is very important, since it provides the basis for real-time analytics, such as real-time accounting (instant posting of new transactions), real-time medical records, etc.
Caching: an order transaction with several line items should issue 1 adjustment to the order, not 1 per item

7. View/Resource Mapping to Objects

Critical View/Resource mapping to Objects is required, so you can define Views (or RESTful Resources) optimized for program use, and reuse the Reactive Logic associated with the base table.

8. Chaining, with Automatic Ordering

The power of spreadsheets is that derived cells can be referenced in other cells, so one change can naturally propagate through the data. The same is required for database logic.

Of course, more complex dependencies require ordering. For example,

a = b+c
b = c+d

requires that, on a change to c, b be calculated before a.

9. Imperative escapes

A good question is "How much of my database logic can be addressed with Reactive Programming?". The real-world answer is more than you think (in excess of 95%), but not 100%. So we need an imperative escape.

This is best provided by a familiar event model, where you define table events that are called on Inserts, Updates and Deletes. You might use events to send messages / email, start a business process etc. The real power is that these events are typically predicated on the Reactive derivations.

The language used for event logic is of course critical. The consensus choice here is JavaScript - the one language you'll probably need to use no matter what other choices you make.

10. Transactional support

Basic database integrity requires that multiple updates within a transaction are atomic - they are rolled back if errors occur. This applies to updates arising from both imperative events and Reactive logic.

11. Validation support

You can define a list of expressions for a table which must be true for a transaction to succeed (e.g., balance <= credit_limit), else an exception is raised.

Architecture

While this paper is primarily focused on the concepts of Reactive Database Logic, it's only interesting if it actually runs. It can do so in a number of ways:

Database Support: native database support could be provided though DDL extensions with corresponding runtime support. Many advantages, but a long process to add to the SQL standard, and it would still not provide natural support for remote non-SQL data.
Code Generation of triggers and stored procedures: this is, after all, what triggers do, so it would clearly be possible to generate these. This provides excellent active enforcement, but does not deal well with integrating multiple databases.
App Servers: traditional App Servers also enforce business logic, with support for integrating multiple databases. The API invocation, however, is not architecture independent.
RESTful Servers: a more recent approach is to build RESTful servers that provide Service Oriented Architectures, typically with JavaScript support for accessing multiple databases and RESTful services. Note that REST naturally provides for multi-row PUTs and POSTs, which complements the transactional nature of databases.

Reactive Logic Example

This sample problem illustrates the use and operation of Reactive Database Logic. Logic is expressed as a set of spreadsheet-like expressions that define what your data means.

Consider the Use Case Place Order, with the requirement Customer's balance may not exceed their credit limit.. The "Cocktail Napkin" requirements are illustrated above.

The following logic, which looks rather like a requirements document, should be fully executable:

Validation Logic
Validations are expressions that must be satisfied to commit transactions (else an exception is thrown). Far beyond simple single-attribute validations, the requirement is multi-attribute validations, such as this Customer Validation:

balance < credit_limit

Derivation Logic
Validations are the most familiar form of business logic, but derivations and events are the most valuable, per dependency / re-use automation noted below. Derivations here include multi-table derivation rules such as in this rule which defines the balance as the sum of the unpaid order totals:

sum(purchaseorder.amount_total where paid = false)

Derivations are conceptually similar to spreadsheet cell formulas. Recall that:

Derivations are multi-table, as in the example above
Derivations can chain, both to validations, and other derivations:

Purchaseorder.amount_total = sum(lineitem.amount) Lineitem.amount = qty_ordered * product_price Lineitem.product_price = copy(product.price)

You might declare the logic via a User Interface as shown below. Dialogs can be provided to make it simple to define (point and click), and the system produces this documentation automatically.

Reactive in Action

Reactive means that when the client submits some altered data, the server detects all changes and adjusts any dependent data - including related tables. This incurs ordering, and requires optimization.

The diagram below illustrates the server processing for the change in Line Item quantity (represented by the green circle at the bottom):

The amount is dependent on the quantity, so it is recomputed
The amountTotal is dependent on the amount, so it is adjusted. Note this may require a SQL (though the data might be in the cache), and SQL operations are pruned if the amount is not changed
The balance is dependent on the amountTotal, so it is adjusted
The result balance is credit checked. If an error is detected, the transaction is rolled back and an exception is returned.

The same processing is automatically applied to all Use Cases as a natural result of automatic watch logic - automatic reuse.

Reactive Agility

By the time you boil off change detection, change propagation, and SQL, you have burned off all the "noise". What is revealed is the pure, crystal logic.

As it turns out, this "noise" is most of the code of a traditional program. The 5 lines of Reactive Programming require literally 500 lines of imperative code. This proportion holds - in fact improves - as system size increases. The result is a 2 orders of magnitude higher level of abstraction.

Active Integrity

Imperative code runs when you call it. Code for integrity may exist, but it is of no value unless it's called in every required circumstance. So, verifying compliance entails checking all the paths of all the existing code.

By contrast, you don't call Reactive Logic. You just state it. It is systems' job to react as required - in all required circumstances.

So, if you want to verify compliance, just scan the logic. If it's there, you can be certain it is being applied.

Reactive Reuse

The Customer.balance rule, perhaps declared for Place Order, simply states that its value is the sum of the unpaid orders. This design intent is encapsulated into Purchaseorder, and then automatically re-used over all related transactions. So, the balance is increased if an order is added, decreased when it is deleted or paid, and so forth.

Put another way, Logic is associated with your tables, not with a specific transaction, request type, or method. This is what enables logic re-use.

So our logic above, perhaps conceived for Place Order, is automatically re-used for all these related transactions:

Delete Order: the balance is reduced
Pay Order: the balance is reduced
Reassign Order to a new customer - new customer balance increased, old balance decreased (for unpaid Orders)
Reassign a Line Item to a different Product (see above)
Add a Line Item
Delete a Line Item
Change Line Item Quantity

This "design one / solve many" applies to maintenance as well: when you change to your logic, the system automatically reflects it in all the relevant transactions.

Another perspective is Object Technology, which sought to factor out logic from applications into Domain Object methods: encapsulation. Reactive objects factor logic from the methods (still clumsy imperative code) into declarative reactive expressions:

Maintenance

Imperative programming is distinguished by ordering - your code must be executed in a correct order. The pain of this is most evident in the archaeology of maintenance, where most of the time is spent studying existing code to determine where to insert a few new lines.

This entire process is eliminated by automatic ordering. So, you just alter the logic, and the system computes a new execution order (and optimization strategy) based on the new dependencies.

Automatic Documentation

As shown in the screenshot in the Example above, our logic is a virtually direct entry of the cocktail napkin spec. So logic is not only executable, it is transparent documentation that enables Business Users and IT to partner and find missing / incorrect elements.

Reactive Performance

The optimizations discussed above: pruning, caching and adjustment are applied to every maintenance change. So, unlike conventional systems that slow down over time due to maintenance patches, reactive systems sustain a consistently high level of performance.

Final Thoughts

It is natural, when confronted with a new technology, to correlate it with well-known existing technologies:

Is it a 4GL? - absolutely not. 4GLs are imperative languages (now generally replaced by Object Oriented Languages), and satisfy none of the criteria above.
Is it an Object Oriented Language? - no (OO's are imperative), but integrates well. As noted above, Declarative / Reactive needs imperative escapes, for which OO languages are an appropriate match.
Is it a 5GL? - perhaps. I am less clear here, since this term has been used so loosely in the past. Recent attempts to redefine 5GLs as domain-specific declarative languages do fit this approach (the domain being the database), but I am reluctant to draw a clear conclusion here.

Conclusion

Argh - stupid program! Do what I mean, not what I say!!

Ever said that? Me too; only perhaps riper.

Programming can feel a bit like you are a warden. An adversarial relationship, struggling with misbehaving code.

Reactive Logic makes the system a partner - you say what you mean, it carries it out. So you become a conductor, not a warden.

Way more fun.

Intrigued? Get in touch, I'd love to hear your comments. Or, check it out - it's available now at Espresso Logic .

article