Behavior Driven Testing and Drools

Roddy

Posted on October 7, 2021

Behavior Driven Design (BDD) and Drools go together like chocolate and cayenne pepper. Your first reaction may be "blech!" but then you actually try it and it kind of grows on you like a fungus. It may not be your cup of cocoa, but it works and when you have a complex problem space, it may actually be an improvement in terms of programmer efficiency.

Behavior Driven Design (more commonly expanded as Behavior Driven Development) is one of those hand-wavy terms that is hard to explain in depth. Basically it's a form of software development where your requirements are described by behavior -- the behavior of the user, the system, the widget, whatever. Here's an example:

Given the user is on the login screen
When the user enters an incorrect password
Then the system should send an email

This requirement (called a "spec" or "specification") describes a behavior (actor fails to log in), the expected result (email is sent), and some preconditions (location is the login screen.) Specs are written in plain English (or another spoken language) rather than any sort of programming language or pseudo-code.

BDD is closely related to Test Driven Development (TDD) in that the behavior spec that describes the requirements also describes the test. Before any coding can begin, the specs are prepared describing the behavioral expectations for the system. Then code is written to implement this, and finally we use the same specs to verify the expected behavior. It's rather straightforward, yeah? It's basically TDD, except that instead of the unit test being the first thing that gets written, it's the human-readable description of the behavioral expectation of the system.

The behavior spec defines both our test case and our requirements. What if we could just run the spec like a unit test?

Enter cucumber. (The technology, not the vegetable.)

Cucumbers, gherkins, and other pickles

Cucumber is a BDD testing framework, based around the idea that your spec can both define the requirements and be the test to verify them. It does this by leveraging a language called gherkin, which really is just a list of restricted keywords ("given", "when", "then", etc.) That example spec I showed previously for a failed login behavior? That's gherkin.

It looks suspiciously like English, and that's the point. You don't need to know anything about code or programming to write a spec. (Gherkin can be internationalized as well. I've written behavioral specifications in Klingon just to prove it was possible. My manager laughed and made me rewrite it, of course.)

I'm not going to go too much in depth into Cucumber and how it works. All you really need to know is that your spec is written in something a regular non-programmer person can read, and it describes some behavioral requirement for your application. Cucumber translates that "plain English" into actions that integrate into the testing framework of your choice, thus allowing you to use your requirements as the inputs for your unit or integration test.

Drooling over cucumbers

Hopefully you already know that Drools is a business rules management system. You write rules in either "drl" syntax, in spreadsheets, or in glorified flowcharts, and then let your application throw data at it.

I'll be the first to jump on the "I love Drools!" bandwagon, but even I will admit that it is quite complicated. It's not something that you can drop entry level programmers into and expect them to just get; the framework is extremely powerful, but its abstractions are not always self-evident. If you print the documentation for the latest version of Drools (7.59 at time of writing), you'll get 616 pages about the framework.

This leaves us in a bit of a bind. Drools is supposed to model business logic in the form of rules, but to properly implement them you really need a programmer with a solid understanding of the framework. Your business analyst who understands the business may be able to write some simple or trivial rules, but they're also very likely to do it inefficiently. And for more complex rules, forget it. Your engineer, on the other hand, could write the rule -- but they don't have the necessary business knowledge to do so correctly.

This is where Behavior Driven Design shows up to save the day. Your business analyst writes out the expected behavior in plain English (or another common spoken language.) Your engineer then implements that plain English requirement in Drools. And then, finally, we drop that plain English behavioral specification into Cucumber, and verify that the system does what the business analyst asked it to do.

An anecdote

Yes, I too was skeptical when this was first proposed to me. I was working in healthcare IT for a US-based startup. My application would calculate how much a patient should be charged for whatever-it-is that they were in the hospital for. This was an extremely complex calculation, which involved things like health insurance policies (sometimes multiple ones with varying coverage rules), the patient's recent medical history (especially how many visits/how much they'd paid so far against the plan), the contracted cost of the procedure, physician fees for specialists, various discounts, offsets from charitable donations based on credit score, and so on. Honestly, I understood very little of the business; I've only been to the hospital a handful of times, mostly as a visitor. But I did have a robust set of Drools rules that would magically spit out a reasonable-looking number after I tossed the patient's life history, current horoscope, and yesterday's winning lottery picks into working memory.

And then the Affordable Care Act was passed (commonly called "Obamacare") and all of a sudden we had to update my rules. Now we had a problem. I had some very smart business analysts who understood exactly how we needed to calculate now. I had what was effectively a black box of rules that I could modify because I understood the Drools framework. And then we had a gaping chasm between us which basically represents the fact that we didn't really speak a common language. My analysts could talk about healthcare 'til the cows came home, but to me it sounded like the wah wah wah from the Peanuts cartoons.

So the analysts and I sat down and came up with a plan of action. Our first problem, we realized, is that while we had rules, they were effectively a black box. We had no way of knowing whether the numbers they were producing were accurate. Before we could determine how the rules had to change, we first had to understand how the rules worked today.

Fair enough. The solution here is clearly to write tests to verify behavior based on known inputs.

And this is where the eureka moment occurred. I, clearly, did not have the necessary business knowledge to prepare these test cases. The business analysts, on the other hand, did -- but they did not have the technical knowledge to implement the tests. So why not have them write gherkin test cases to validate the behavior of the system?

It was genius. After all, if a test failed in the future due to some changes, I could just take that plain English representation of a requirement back to the business analysts and say, "Hey, I implemented the XYZ use case for the ticket, but this other use case failed. Can you review?" And they could review it, because the other use case was written in English and not some programmer mumbo jumbo. Huzzah!

Cucumber and you

Healthcare is hard, and I don't work in that field anymore, so I can't exactly show you the sorts of rules I was dealing with there. Luckily coming up with toy examples for Drools questions is a particular hobby of mine, so we can still muddle through with an example.

So let's say we're writing a game about space piracy. We've got space pirates, regular civilians, and the military all zooming around in their spaceships, hauling cargo of various kinds (some of which is illegal.) Oh and there's also aliens.

Anyway, so we've got this game, and we've decided to implement our business logic in Drools. In this case our business logic will be stuff like validation (e.g. is this procedurally-generated ship valid?) and also calculations (e.g. my ship is of type X, I'm at coordinates Y, and I want to go to coordinates Z ... can I?) For the sake of this article, we're going to consider the validation rules -- but if you're curious about the rest of it, I may have gotten a wee bit carried away implementing the example application available on GitHub.

Right. So, validation. Today we get a new requirement: we need to restrict how our ships are named. Luckily our Business Analyst or Product Owner has jumped aboard the BDD bandwagon and gives us our requirements in gherkin:

Scenario: An Alien spaceship cannot have a name starting with U.S.S
  When I have an Alien spaceship named "U.S.S. Enterprise"
  Then my spaceship is not valid
  And the error message should contain: "only human ships may use prefix U.S.S."

Cool biz. It's a pretty simple requirement, so you can probably already imagine what the rule is going to look like, so we'll skip past that temporarily and instead talk about ... well how do I run this as a unit test?

The answer: Cucumber. I think I mentioned that. Specifically, two cucumber libraries: cucumber-java and cucumber-junit (because I am firmly in the JUnit camp, but for reference I have used TestNG for Cucumber and Drools as well and the principles are the same.)

For this example, we'll use the latest version of these two libraries: 6.10.4 at time of writing.

<dependency>
  <groupId>io.cucumber</groupId>
  <artifactId>cucumber-java</artifactId>
  <version>6.10.4</version>
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>io.cucumber</groupId>
  <artifactId>cucumber-junit</artifactId>
  <version>6.10.4</version>
  <scope>test</scope>
</dependency>

It's important to note that the cucumber-junit library is built on JUnit 4, so if you're using JUnit 5 you'll need to include a dependency on junit-vintage-engine and remember not to use any JUnit 5 features at all in your tests. (Even your assertions need to be the old ones in org.junit.Assert.) Cucumber also publishes a separate cucumber-junit-platform-engine artifact for the JUnit 5 platform, but this article sticks with the JUnit 4 runner.

The glue

The next thing we do is write what is called the "glue". This is a class (or classes) which "glue" the gherkin to the application, hence the name. It basically provides a translation layer between the gherkin and actual functionality. In our case, for each of the "steps" (the lines prefixed "When", "Then", "Given", "And"), we need to provide a bit of parsing logic to translate that into something that feeds data into the application, does an action, or asserts something.

Cucumber makes some useful annotations available, so the basic outline of our methods will look something like this:


public class SpaceshipValidationGlue {
    @When("^I have an Alien spaceship named \"(.*)\"$")
    public void setSpaceshipSpeciesAndName(String name) {
        // set the name of the spaceship we're passing into the rules
    }

    @Then("^my spaceship is not valid$")
    public void assertInvalid() throws Exception {
        // fire the rules (if needed) and assert the validity of the result
    }

    @Then("^the error message should contain: \"(.*)\"$")
    public void assertErrorMessageContains(String messages) throws Exception {
        // fire the rules (if needed) and verify the error message
    }
}

Basically, you have a method, annotated with one of the Cucumber @When, @Then, @Given, etc. annotations. These annotations contain a regular expression, which you can use to extract variables. Then in the method you can do whatever-it-is that is being described by the English representation. Note that while Cucumber provides all of the reserved keywords as annotations -- both in English and a variety of other languages -- it doesn't really matter which one you use. There's no difference in using @When versus @Then. However, I like to use @When for all of the steps which do some sort of initialization, and @Then for any which do assertions; that way I can just look at the annotation and know generally which broad category that method belongs to. But you can do as you like.
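To make the regular expression part concrete, here's a minimal sketch in plain Java of matching a step's text against a pattern and pulling out the capture group. This is not how Cucumber is implemented internally; the class name StepPatternSketch and its extractName method are made up for illustration, and only java.util.regex is involved.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: Cucumber does this matching for you.
class StepPatternSketch {

    // Same regular expression as in the @When annotation above.
    private static final Pattern NAMED_ALIEN_SHIP =
            Pattern.compile("^I have an Alien spaceship named \"(.*)\"$");

    // Returns the captured ship name, or empty if the step text does not match.
    static Optional<String> extractName(String stepText) {
        Matcher matcher = NAMED_ALIEN_SHIP.matcher(stepText);
        return matcher.matches() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // The Gherkin keyword ("When", "Then", ...) is not part of the matched text.
        System.out.println(extractName("I have an Alien spaceship named \"U.S.S. Enterprise\""));
        System.out.println(extractName("my spaceship is not valid"));
    }
}
```

The captured group is what ends up as the String argument of the annotated method, which is why the number of capture groups has to line up with the number of method parameters.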

Now if you're trying to follow along in my GitHub repository, you'll notice that the step definitions in there are a lot more complex -- because I have a lot more rules and a lot more features. It was a surprisingly fun toy example to put together, and I got totally carried away. I make no apologies, but it does mean that at least for this specific step definition you'll need to follow along here instead.

Right. Back to our spaceship name validation. Let's go about implementing the step definitions for this use case.

Usually I like to start from the top, but in this case we need to start from the bottom -- namely, what are we passing into the rules? In this case, we're passing in a Spaceship object, so we'll need to add that to our class definition. Here's where we run into our first gotcha.

Within a Scenario, each step is a separate method call, so we can't hand our spaceship from one step to the next as a parameter; we'll need to keep it in an instance field of the glue class. (Gherkin Scenarios are analogous to test cases.) With its default object factory, Cucumber news up a fresh instance of the glue class for each Scenario rather than for each step, so the field survives across the steps of one Scenario. I still like to initialize that state explicitly at the start of every Scenario, so nothing depends on how the class gets constructed. To do this, Cucumber gives us a @Before annotation, called a "hook", which we can use to decorate a "setup" method.

public class SpaceshipValidationGlue {

    private Spaceship spaceshipUnderTest;

    @Before
    public void initialize() {
        this.spaceshipUnderTest = new Spaceship();
    }

    // ...
}

Be careful with your imports! @Before should be imported from io.cucumber.java.Before -- there's an identically named JUnit 4 annotation (org.junit.Before) which you must not use.

Anyway, now that we've got our spaceship defined, we can implement the @When step definition:

@When("^I have an Alien spaceship named \"(.*)\"$")
public void setSpaceshipSpeciesAndName(String name) {
    this.spaceshipUnderTest.setName(name);
}

That takes care of our test setup. We've got two "Then" steps, which are the methods which validate the consequences of our action (our action being "we fired the validation rules.") So the first thing that has to happen is, clearly, that we need to fire the rules.

@Then("^my spaceship is not valid$")
public void assertInvalid() throws Exception {
    fireRules();

    // TODO validate the results
}


@Then("^the error message should contain: \"(.*)\"$")
public void assertErrorMessageContains(String messages) throws Exception {
    fireRules();

    // TODO verify the error message
}

private void fireRules() {
    KieServices kieServices = KieServices.Factory.get();
    KieContainer kContainer = kieServices.getKieClasspathContainer();
    KieBase kBase = kContainer.getKieBase("validation");
    KieSession session = kBase.newKieSession();
    try {
        session.insert(spaceshipUnderTest);
        session.fireAllRules();
    } finally {
        session.dispose();
    }
}

But wait! Why are we firing the rules twice?

Good question. We shouldn't be. We should instead track the fact that we've already fired the rules, and reuse the same results for subsequent assertions rather than firing them again. I like to do this with an AtomicBoolean tracking variable that gets reset in the @Before.

public class SpaceshipValidationGlue {
    private final AtomicBoolean isRulesFired = new AtomicBoolean(false);

    @Before
    public void initialize() {
        this.isRulesFired.set(false);
        // ...
    }

    // ...

    private void fireRules() {
        if(isRulesFired.compareAndSet(false, true)) {
            KieServices kieServices = KieServices.Factory.get();
            KieContainer kContainer = kieServices.getKieClasspathContainer();
            KieBase kBase = kContainer.getKieBase("validation");
            KieSession session = kBase.newKieSession();
            try {
                session.insert(spaceshipUnderTest);
                session.fireAllRules();
            } finally {
                session.dispose();
            }
        }
    }
}
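If the compareAndSet trick is new to you, here's a stripped-down sketch of just the guard, with a counter standing in for the Drools session. The class FireOnceSketch and its timesFired counter are hypothetical and exist only so the behavior can be observed without any rules on the classpath.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the glue class; the counter replaces the real rule session.
class FireOnceSketch {

    private final AtomicBoolean isRulesFired = new AtomicBoolean(false);
    private final AtomicInteger timesFired = new AtomicInteger();

    // Mirrors the @Before hook: allow the next scenario to fire again.
    void reset() {
        isRulesFired.set(false);
    }

    // compareAndSet flips false -> true exactly once, so the body runs once per reset.
    void fireRules() {
        if (isRulesFired.compareAndSet(false, true)) {
            timesFired.incrementAndGet(); // the real code would fire the Drools session here
        }
    }

    int timesFired() {
        return timesFired.get();
    }

    public static void main(String[] args) {
        FireOnceSketch glue = new FireOnceSketch();
        glue.fireRules(); // first "Then" step
        glue.fireRules(); // second "Then" step reuses the earlier run
        System.out.println("fired " + glue.timesFired() + " time(s)");
    }
}
```

A plain boolean field would do the same job when scenarios run on a single thread; the atomic version just folds the check and the flip into one call.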

We're getting close. Isn't this exciting?

Next thing we need to do is get the results out of the rules, so we can assert against something in our steps. I'm going to take the easy way out and use a global object called ValidationResult, but there are a number of ways we could do this instead. (E.g. we could pass the result object into working memory, we could have a "valid" flag on the spaceship itself, etc.)

Since we're going to have to share the results across potentially multiple methods doing assertions, we'll need to store the results somewhere that's accessible to all methods in this class -- namely as a class variable. So we'll just drop it alongside the Spaceship variable, and also update the @Before to reset the value between scenarios.
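I haven't shown ValidationResult itself, so here's a minimal sketch of what such a class might look like. Only the four method names (isValid, setIsValid, appendMessage, getError) come from the glue and rule code in this article; the fields, the "valid by default" starting state, and the way messages are joined are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: only the four method names are taken from the article's code.
class ValidationResult {

    private boolean valid = true; // assumed default: valid until a rule objects
    private final List<String> messages = new ArrayList<>();

    public boolean isValid() {
        return valid;
    }

    public void setIsValid(boolean valid) {
        this.valid = valid;
    }

    public void appendMessage(String message) {
        messages.add(message);
    }

    // Assumed behavior: null when nothing was reported, otherwise all messages joined.
    public String getError() {
        return messages.isEmpty() ? null : String.join("; ", messages);
    }

    public static void main(String[] args) {
        ValidationResult result = new ValidationResult();
        result.setIsValid(false);
        result.appendMessage("only human ships may use prefix U.S.S.");
        System.out.println(result.isValid() + " / " + result.getError());
    }
}
```

In a real project this would be a public class in its own file, so that both the rules and the glue can see it.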

Thus, our completed step definitions look like this:

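package app.roddy.space.validation.spaceship;

import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertNotNull;
import static org.junit.Assert.assertTrue;

import java.util.concurrent.atomic.AtomicBoolean;

import io.cucumber.java.Before;
import io.cucumber.java.en.Then;
import io.cucumber.java.en.When;
import org.kie.api.KieBase;
import org.kie.api.KieServices;
import org.kie.api.runtime.KieContainer;
import org.kie.api.runtime.KieSession;

// Spaceship and ValidationResult are application classes; import them from
// wherever they live in your project.
public class SpaceshipValidationGlue {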

    private final AtomicBoolean isRulesFired = new AtomicBoolean(false);

    private Spaceship spaceshipUnderTest;
    private ValidationResult result;

    @Before
    public void initialize() {
        this.isRulesFired.set(false);
        this.spaceshipUnderTest = new Spaceship();
        this.result = null;
    }


    @When("^I have an Alien spaceship named \"(.*)\"$")
    public void setSpaceshipSpeciesAndName(String name) {
        this.spaceshipUnderTest.setName(name);
    }

    @Then("^my spaceship is not valid$")
    public void assertInvalid() throws Exception {
        fireRules();

        assertFalse(result.isValid());
    }

    @Then("^the error message should contain: \"(.*)\"$")
    public void assertErrorMessageContains(String messages) throws Exception {
        fireRules();

        String errorMessage = result.getError();
        assertNotNull(errorMessage);
        assertTrue(errorMessage.contains(messages));
    }

    private void fireRules() {
        if(isRulesFired.compareAndSet(false, true)) {
            result = new ValidationResult();

            KieServices kieServices = KieServices.Factory.get();
            KieContainer kContainer = kieServices.getKieClasspathContainer();
            KieBase kBase = kContainer.getKieBase("validation");
            KieSession session = kBase.newKieSession();
            try {
                session.insert(spaceshipUnderTest);
                session.setGlobal("validation", result);
                session.fireAllRules();
            } finally {
                session.dispose();
            }
        }
        assertNotNull(result);
    }
}

As an aside, one of the cool things about these step definitions is that they're reusable. I can now include I have an Alien spaceship named "foo" in any of my scenarios, and this step definition will provide the implementation for it. IDEs like IntelliJ IDEA provide built-in support for Gherkin and will automatically map from the gherkin feature to the defined step so you can navigate between them. (Note that this might be a paid feature in IntelliJ; I'm unfamiliar with other IDEs that may or may not support this functionality as well.)

The test runner

Step definitions by themselves aren't tests though. They're just the translation layer, the glue. We still need the actual test runner.

I usually go through these examples stepwise, but for the runner it's easier if I just show you the whole thing. This is the test runner for this example feature:

import io.cucumber.junit.Cucumber;
import io.cucumber.junit.CucumberOptions;
import org.junit.runner.RunWith;

@RunWith(Cucumber.class)
@CucumberOptions(plugin = { "html:target/cucumber_html/spaceshipValidation/",
                            "json:target/json-spaceshipValidation.json",
                            "junit:target/junit-spaceshipValidation.xml"},
                 glue = {"app.roddy.space.validation.spaceship"},
                 features = "src/test/resources/features/validation/spaceshipValidation.feature",
                 tags = "@validation")
public class SpaceshipValidationRulesTest {

}

When you're writing Cucumber tests in Java, you'll end up with a lot of these test runners, and they'll end up being primarily copy-paste jobs. There's a lot of boilerplate, but not a good way of extracting the commonalities into something less verbose.

To start with, the @RunWith annotation is the JUnit 4 hook which identifies Cucumber as the test runner. Then @CucumberOptions configures this particular test suite as follows:

| Parameter | Description |
| --- | --- |
| plugin | These plugins configure how the test outputs its results. HTML generates a webpage with test results. JSON spits out JSON. JUnit is JUnit XML, commonly used as an input for external reporters (e.g. Jenkins plugins). |
| glue | An array of packages where our glue file(s) live. This is the package for the step definition file we just wrote for our example feature. |
| features | The full path to the feature file. This isn't relative to the class path (src/test/resources), but from the project root. |
| tags | Not really discussed here, but you can put tags on your scenarios to categorize them so that you can target them specifically. A common workflow is to tag stuff @wip or @skip and exclude it with this parameter (e.g. "not @wip"; older Cucumber versions wrote this as ~@wip). |

You'll notice that the class itself is empty. If you have some sort of common setup/teardown steps, this is where you'd put that logic, using the JUnit 4 @BeforeClass annotation (or its @AfterClass equivalent). An important thing to remember is that JUnit does not really treat each individual Scenario as a separate JUnit test; IDEs like IntelliJ IDEA are smart enough to show them individually, but the JUnit framework itself just sees the runner as a single black-box thing it's running. So anything that you put in this file will run before all of the scenarios or after all of the scenarios.

At my previous company, when I wrote Drools for a healthcare application, I developed a code-coverage reporting framework for Drools. It required attaching some listeners to the Drools session and collating the metrics for the suite, and this is where I implemented that sort of logic. Otherwise you could put some common setup/teardown in here as well. You might've noticed that we load the rules into memory every time we call fireRules() in the step definitions; we could improve performance by moving that "load" logic here so the rules are only loaded once.
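As a rough sketch of that "load once" idea, here's a tiny holder that runs an expensive loader on first use and hands back the cached value afterwards. LoadOnceSketch is a hypothetical helper, and the string it caches is just a stand-in; in the real glue the loader would be the KieServices/KieContainer/KieBase lookup from fireRules(), while each scenario would still create and dispose its own KieSession.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical helper: caches whatever the loader returns so it is built only once.
class LoadOnceSketch<T> {

    private final Supplier<T> loader;
    private T cached;

    LoadOnceSketch(Supplier<T> loader) {
        this.loader = loader;
    }

    // synchronized keeps the sketch safe if scenarios ever run in parallel.
    synchronized T get() {
        if (cached == null) {
            cached = loader.get();
        }
        return cached;
    }

    public static void main(String[] args) {
        AtomicInteger loads = new AtomicInteger();
        // Stand-in for the expensive classpath scan and rule compilation.
        LoadOnceSketch<String> rules = new LoadOnceSketch<>(() -> {
            loads.incrementAndGet();
            return "compiled rules";
        });
        rules.get();
        rules.get();
        System.out.println("loaded " + loads.get() + " time(s)");
    }
}
```

Held in a static field, a holder like this would be shared by every scenario in the JVM, which is the whole point: the slow part happens once and the per-scenario cost drops to creating a session.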

At this point, we're actually done with our unit tests. You can run SpaceshipValidationRulesTest as a JUnit test, and it will actually track down that gherkin Scenario and run it using your glue. The test will fail, of course, since we've not written the rule yet -- but hey that's test driven development for you. Huzzah!

The rule

The rule is pretty anti-climactic at this point. I intentionally chose a very simple rule with a very simple spec as an example, and as a result it's actually not very interesting in terms of implementation.

global ValidationResult validation;

rule "Only human spaceships can use the prefix U.S.S."
when
  Spaceship( name != null,
             name str[startsWith] "U.S.S.",
             belongsTo != Species.HUMAN)
then
  validation.setIsValid(false);
  validation.appendMessage("only human ships may use prefix U.S.S.");
end

Just like it says on the tin. If the spaceship is not a Human ship, and the name starts with "U.S.S.", then we reject this as not being a valid spaceship.
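If DRL syntax is unfamiliar, the left-hand side of that rule reads roughly like the plain Java predicate below. This is only an analogy for the three constraints, not what Drools does internally; the Spaceship class and the Species enum here are hypothetical stand-ins for the application's real types, and Species.ALIEN in particular is an assumed constant.

```java
// Plain-Java reading of the rule's three constraints (analogy only).
class PrefixRuleSketch {

    // True when the ship would trip the rule: a non-null name starting with
    // "U.S.S." on a ship that does not belong to humans.
    static boolean violatesPrefixRule(Spaceship ship) {
        return ship.getName() != null
                && ship.getName().startsWith("U.S.S.")
                && ship.getBelongsTo() != Species.HUMAN;
    }

    public static void main(String[] args) {
        System.out.println(violatesPrefixRule(new Spaceship("U.S.S. Enterprise", Species.ALIEN)));
        System.out.println(violatesPrefixRule(new Spaceship("U.S.S. Enterprise", Species.HUMAN)));
    }
}

// Hypothetical stand-ins for the application's real domain types.
enum Species { HUMAN, ALIEN }

class Spaceship {
    private final String name;
    private final Species belongsTo;

    Spaceship(String name, Species belongsTo) {
        this.name = name;
        this.belongsTo = belongsTo;
    }

    String getName() { return name; }

    Species getBelongsTo() { return belongsTo; }
}
```

Note that in this Java version a ship with no species set at all also counts as "not human", since null is not equal to Species.HUMAN.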

Conclusion

And there you have it. We have a complex system, a requirement written in plain English that can be run as a test, and a rule implemented against it. Clearly most real world applications won't involve spaceships, aliens, and piracy, but hopefully you can imagine how this might work with a legitimate business use case that requires in-depth knowledge.

Healthcare, banking, accounting, HR -- there's a lot of very complicated businesses which could benefit from business rules, but where it would be a waste of time (and effort) to try and educate your programmers to be subject matter experts. Better to let your SMEs and product owners describe the expected behavior in a way they understand, and let the programmers implement the system using their own expertise, and let the Cucumber keep both sides honest (and accurate.)

It's also worth noting that Cucumber isn't just for Java, and it's not just for unit tests. At the unit test level, it makes sense to implement tests with Java because a Drools application will be written in Java, by definition. But you can also design an integration or end-to-end testing framework that runs independently, and use other languages for that. Cucumber was originally written in Ruby; the healthcare application I've mentioned a few times had its e2e tests written in Cucumber and Ruby. There are also JavaScript and C# libraries, a much maligned Python implementation, and so on. I've only shown you the tip of the proverbial iceberg here, and focused on how you can validate your Drools business logic, but the reality is that this is a very powerful tool and there are many more use cases which may be of use to you.

More information

Cover image attribution: congerdesign @ pixabay.
