Performance Tuning in Zato Rule Engine

It's easy to write rules that are both clear and optimized for the rule engine's performance. Here's how.

First off, be specific whenever possible

Before implementing any of the changes below, be as specific with your rule matching as possible.

If you know that a specific rule is required to make a decision, provide its name when matching, i.e. evaluate this one rule instead of a whole set of rules of which only one will match. Likewise, if you know it will be a few of them, provide their names on input instead of evaluating all the rules you know won't match.

The end result will be the same, but the rule engine won't have to evaluate and reject unwanted rules during matching, so the whole process will be much faster if you have many rules to iterate through.
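As a rough illustration of the difference, here is a toy model in Python. It is not the Zato API: the `match` function, its `names` parameter and the `Region_NNNN` rules are all hypothetical and exist only to show how naming a rule reduces the number of rules that have to be evaluated.

```python
def match(rules, data, names=None):
    """Toy matcher: return (matched rule names, number of rules evaluated).

    `rules` maps a rule name to a predicate. When `names` is given,
    only those rules are evaluated; otherwise all of them are.
    """
    selected = names if names is not None else sorted(rules)
    matched = [name for name in selected if rules[name](data)]
    return matched, len(selected)

# 1000 hypothetical rules, of which only one can match any given input
rules = {
    f'Region_{i:04d}': (lambda data, i=i: data['region_id'] == i)
    for i in range(1000)
}

data = {'region_id': 42}

all_matched, all_evaluated = match(rules, data)
one_matched, one_evaluated = match(rules, data, names=['Region_0042'])

print(all_matched, all_evaluated)  # ['Region_0042'] 1000
print(one_matched, one_evaluated)  # ['Region_0042'] 1
```

Both calls produce the same match, but the second one looks at a single rule instead of a thousand.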

Rules are evaluated alphabetically

Rules are always evaluated alphabetically, so if you'd like to assign higher priority to some of them, you can use a numeric prefix for ordering (e.g., 01_, 02_) followed by a descriptive name.

rule
    01_Flight_Delays
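To preview the order that such names produce, you can sort them as plain strings. The sketch below assumes that alphabetical evaluation behaves like Python's default string sorting, which is an assumption rather than a documented guarantee, and all the rule names other than 01_Flight_Delays are made up. It also shows why zero-padded prefixes are safer than unpadded ones.

```python
# Hypothetical rule names; plain string sorting stands in for the engine's ordering
rule_names = ['Targeted_Offer', '02_Baggage_Fees', '01_Flight_Delays', '10_Lounge_Access']

evaluation_order = sorted(rule_names)
print(evaluation_order)
# ['01_Flight_Delays', '02_Baggage_Fees', '10_Lounge_Access', 'Targeted_Offer']

# Without zero-padding, '10_' sorts before '2_' because '1' < '2' as characters
unpadded_order = sorted(['2_Baggage_Fees', '10_Lounge_Access'])
print(unpadded_order)
# ['10_Lounge_Access', '2_Baggage_Fees']
```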

Put conditions that are more likely to be false first

rule
    Targeted_Offer
when
    region == 'northeast' and           # Highly selective condition first
    account_status == 'active' and      # Less selective condition
    customer_type == 'business'         # Less selective condition
then
    offer_region = 'ABC'

Why it works: The rule engine uses short-circuit evaluation for logical "and" operations. Once a condition fails, the remaining conditions in that rule aren't evaluated.

For example, if only 5% of customers are in the 'northeast' region, putting this condition first means the remaining conditions will be skipped 95% of the time.
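The effect can be illustrated with a small Python simulation. This is only a sketch of short-circuit evaluation in general, not the rule engine's own code. The customer data and the `count_evaluations` helper are made up for the illustration, with 5 out of 100 customers placed in the 'northeast' region.

```python
def count_evaluations(customers, conditions):
    """Count how many individual conditions run when 'and' short-circuits."""
    evaluated = 0
    for customer in customers:
        for condition in conditions:
            evaluated += 1
            if not condition(customer):
                break  # A failed condition ends the evaluation for this customer
    return evaluated

def is_northeast(customer):
    return customer['region'] == 'northeast'

def is_active(customer):
    return customer['account_status'] == 'active'

def is_business(customer):
    return customer['customer_type'] == 'business'

# 100 hypothetical customers, 5 of them (5%) in the 'northeast' region
customers = [
    {
        'region': 'northeast' if i < 5 else 'south',
        'account_status': 'active',
        'customer_type': 'business',
    }
    for i in range(100)
]

selective_first = count_evaluations(customers, [is_northeast, is_active, is_business])
selective_last = count_evaluations(customers, [is_active, is_business, is_northeast])

print(selective_first)  # 110
print(selective_last)   # 300
```

With the selective condition first, 95 customers are rejected after a single check, so the total work drops from 300 condition evaluations to 110.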

Use common conditions across rules

# ###########################################################################################

rule
    Premium_Business_Offer
when
    account_status == 'active'  and     # Common condition
    customer_type == 'business' and     # Common condition
    monthly_spend > 1000                # Unique condition
then
    offer_type = 'premium_business'

# ###########################################################################################

rule
    Premium_Business_Support
when
    account_status == 'active'  and     # Common condition
    customer_type == 'business' and     # Common condition
    support_tickets_open > 0            # Unique condition
then
    support_level = 'priority'

# ###########################################################################################

Why it works: The rule engine caches condition evaluation results. When multiple rules share common conditions, the engine evaluates these conditions only once and reuses the results across rules.

This significantly reduces redundant computation, and the more rules share common condition patterns, the more efficient the overall evaluation becomes. In the example above, both common conditions are evaluated once upfront rather than being re-evaluated for each rule. Only the unique conditions for each rule require additional evaluation.

This means that you need to maintain consistent naming conventions. Within the same rules file, use identical identifiers for the same business objects so the engine recognizes them as the same expression. For example, always use "account_status" rather than varying between "acc_status" or "status" across different rules.
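The idea of reusing condition results can be sketched in a few lines of Python. This is an assumption-laden model, not the engine's implementation: conditions are represented as hypothetical `(field, operator, value)` tuples, and a plain dictionary acts as the cache.

```python
import operator

# Operators supported by this toy model
OPS = {'==': operator.eq, '>': operator.gt}

def evaluate_rules(rules, data):
    """Return (matched rule names, number of conditions actually evaluated)."""
    cache = {}       # condition -> result, shared by all the rules
    evaluations = 0
    matched = []
    for name, conditions in rules.items():
        rule_matches = True
        for condition in conditions:
            if condition not in cache:
                field, op, value = condition
                cache[condition] = OPS[op](data[field], value)
                evaluations += 1
            if not cache[condition]:
                rule_matches = False
                break
        if rule_matches:
            matched.append(name)
    return matched, evaluations

rules = {
    'Premium_Business_Offer': [
        ('account_status', '==', 'active'),
        ('customer_type', '==', 'business'),
        ('monthly_spend', '>', 1000),
    ],
    'Premium_Business_Support': [
        ('account_status', '==', 'active'),
        ('customer_type', '==', 'business'),
        ('support_tickets_open', '>', 0),
    ],
}

data = {
    'account_status': 'active',
    'customer_type': 'business',
    'monthly_spend': 1500,
    'support_tickets_open': 0,
}

matched, evaluations = evaluate_rules(rules, data)
print(matched)      # ['Premium_Business_Offer']
print(evaluations)  # 4
```

Six conditions are written across the two rules, but only four are evaluated because the two common ones are served from the cache the second time. If one rule spelled the field as "acc_status", its condition would be a different cache key and would have to be evaluated again.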

Provide only the relevant fields on input

You should provide only the input parameters that are actually required from the business perspective. For example, don't pass entire large dictionaries serialized from JSON for evaluation. Instead, extract the fields that are genuinely needed and provide only those.

For instance, we have two rules here:

# ###########################################################################################

rule
    Usage_Reporter_Normal
when
    usage_percentage > 50 and
    peak_usage_percentage > 70
then
    report_type = 'normal'

# ###########################################################################################

rule
    Usage_Reporter_Max
when
    usage_percentage > 90      and
    peak_usage_percentage > 95 and
    account_type == 'enterprise'
then
    report_type = 'max'

# ###########################################################################################

And let's say our input is:

data = {
    'usage_percentage': 70,
    'peak_usage_percentage': 75
}

Now, the second rule will be skipped entirely because it requires an "account_type" field and there is no such field on input, so it doesn't make sense to evaluate it at all.

Why it works: Before an evaluation of a set of rules begins, Zato checks which of them cannot possibly match at all by filtering out those rules that need fields which you haven't provided on input.
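A minimal Python sketch of this kind of pre-filtering is below. It is an illustration under assumptions, not Zato's own code: the `required_fields` mapping is written by hand here, whereas the engine would derive the needed fields from the rules themselves.

```python
# Fields each rule needs, written out by hand for this illustration
required_fields = {
    'Usage_Reporter_Normal': {'usage_percentage', 'peak_usage_percentage'},
    'Usage_Reporter_Max': {'usage_percentage', 'peak_usage_percentage', 'account_type'},
}

data = {
    'usage_percentage': 70,
    'peak_usage_percentage': 75,
}

# Keep only the rules whose required fields are all present on input
candidates = [
    name for name, fields in required_fields.items()
    if fields.issubset(data)
]

print(candidates)  # ['Usage_Reporter_Normal']
```

Only the rules left in `candidates` would go on to have their conditions evaluated.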

Fewer rules with many conditions vs. more rules with fewer conditions

In most real-world scenarios, more rules with fewer conditions tend to perform better, but there's a middle ground to find because very granular rules (1-2 conditions each) can lead to rule explosion and maintenance challenges.

In practice:

  • Aim for 3-7 conditions per rule as a general guideline
  • Keep common condition patterns consistent across rules (maintain consistent naming conventions explained above)
  • Group related simple rules together in the same file
  • Consider splitting rules when they contain unrelated "or" conditions (see below)

When to split "or" conditions

Consider this rule:

rule
    Customer_Discount_Eligibility
when
    account_status == 'active' and
    (
        (customer_type == 'business'   and contract_years > 2) or
        (customer_type == 'individual' and loyalty_points > 1000)
    )
then
    discount_eligible = true
    discount_percentage = 15

And now compare it with these two:

rule
    Business_Discount_Eligibility
when
    account_status == 'active'  and
    customer_type == 'business' and
    contract_years > 2
then
    discount_eligible = true
    discount_percentage = 15

rule
    Individual_Discount_Eligibility
when
    account_status == 'active'    and
    customer_type == 'individual' and
    loyalty_points > 1000
then
    discount_eligible = true
    discount_percentage = 15

They express the same business ideas, but splitting rules with "or" conditions into equivalent "and" conditions may improve performance because:

  • Each rule accesses only the input fields it needs
  • Each rule can fail fast on its own conditions without evaluating the rest of the conditions
  • Simpler conditions have better cache utilization
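Before splitting a rule, it is worth confirming that the split version really matches the same inputs. The Python sketch below restates the conditions of the three rules above as plain functions and compares them over a small grid of hypothetical inputs. It checks logical equivalence only and says nothing about how the engine evaluates them.

```python
from itertools import product

def combined(d):
    # Customer_Discount_Eligibility
    return d['account_status'] == 'active' and (
        (d['customer_type'] == 'business' and d['contract_years'] > 2) or
        (d['customer_type'] == 'individual' and d['loyalty_points'] > 1000)
    )

def business(d):
    # Business_Discount_Eligibility
    return (
        d['account_status'] == 'active' and
        d['customer_type'] == 'business' and
        d['contract_years'] > 2
    )

def individual(d):
    # Individual_Discount_Eligibility
    return (
        d['account_status'] == 'active' and
        d['customer_type'] == 'individual' and
        d['loyalty_points'] > 1000
    )

# Hypothetical values on both sides of every threshold
inputs = [
    {
        'account_status': status,
        'customer_type': customer_type,
        'contract_years': years,
        'loyalty_points': points,
    }
    for status, customer_type, years, points in product(
        ['active', 'closed'], ['business', 'individual'], [1, 3], [500, 2000]
    )
]

equivalent = all(combined(d) == (business(d) or individual(d)) for d in inputs)
print(equivalent)  # True
```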

At the same time, many business concepts are naturally expressed using "or" logic:

rule
    Preferred_Customer
when
    (customer_type == 'platinum') or
    (customer_type == 'gold') or
    (customer_tenure_years > 5 and lifetime_value > 10000)
then
    service_level = 'preferred'
    discount_eligible = true

By using "or", all the related business logic stays together, the intent is clearer and there are fewer rules to maintain, so there are obvious advantages to using "or" logic too.

In practice, you need to balance it out:

  • Start with business-oriented rules and design them in a way that clearly expresses business concepts, including "or" conditions where it's natural to use them
  • Split only the "or" conditions that significantly impact performance
  • Don't optimize too early; try to keep it all on a business-friendly level instead

The easiest way to implement the performance suggestions in a way that's also very natural is to keep related rules in the same file or files.

For instance, don't create a single file with 500 rules for all possible situations. Instead, break it down into smaller files by their business purpose, e.g. one for billing, one for CRM, one for HR, and so on.

In this way, there will be a natural tendency for the rules to require the same or similar input parameters, in which case the rule engine will work better.