Approval Tests For PDF Document Generation

Jan Van Ryswyck

Posted on October 7, 2021

A while back I was confronted with a part of a legacy system that generates PDF documents. This legacy system used a well-known library for generating the requested PDF files. The good news was that there was a decent number of tests available. The not so good news was that these tests made heavy use of test doubles for swapping out most of the types provided by the third-party API. I strongly believe that using test doubles in such cases is not a good choice. In the past I already wrote about why to avoid using test doubles for types that you don’t own.

Tests like these can become a large impediment whenever we upgrade to a newer version of the third-party library. Major versions quite often introduce breaking changes to existing APIs. Whenever tests are strongly coupled to such an API, they basically need to be rewritten during the upgrade.

In order to reduce the coupling of these tests, I decided to replace them with Approval Tests instead. This technique is also known as “Golden Master” or “Characterization Test”. The idea behind an Approval Test is that after the first test run, some output needs to be visually verified and approved. During subsequent test runs, the approved output is compared to the current output. When there’s a difference between the two, the test fails.
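To make the mechanics concrete, here is a minimal sketch of the idea for plain-text output. It is not the code of any Approval Test library: the class name SimpleApproval, the verify method and the .approved.txt / .received.txt file names are assumptions made purely for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper that illustrates the approve/receive cycle; not a library API.
final class SimpleApproval {

    // Compares the received text with <testName>.approved.txt in the given directory.
    // When no approved file exists yet, or when the content differs, the received
    // text is written to <testName>.received.txt and the verification fails.
    static void verify(Path directory, String testName, String received) {
        Path approvedFile = directory.resolve(testName + ".approved.txt");
        Path receivedFile = directory.resolve(testName + ".received.txt");
        try {
            if (Files.exists(approvedFile)) {
                String approved = new String(
                        Files.readAllBytes(approvedFile), StandardCharsets.UTF_8);
                if (approved.equals(received)) {
                    // Output matches: remove any leftover from an earlier failing run.
                    Files.deleteIfExists(receivedFile);
                    return;
                }
            }
            // Keep the current output around so that it can be reviewed and approved.
            Files.write(receivedFile, received.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        throw new AssertionError("Failed Approval\n"
                + "  Approved:" + approvedFile + "\n"
                + "  Received:" + receivedFile);
    }
}
```

On the very first run no approved file exists, so verify writes the received file and fails. Once that file has been reviewed and renamed to the approved name, the same call passes until the output changes.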

This is especially useful whenever we have to deal with code of a legacy system. You can check out this video from Emily Bache where she demonstrates the Gilded Rose refactoring kata using Approval Tests. Highly recommended!

Usually the output of Approval Tests is captured in plain text files, which has nothing to do with PDF files. So I decided to extend the Java version of an Approval Test library to support PDF documents as well. Let’s have a look at some example code to demonstrate this extension.

Suppose that we have a small application that generates a PDF document. A generated document contains the refrain of a well-known song lyric. The following code shows a possible implementation.

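// Imports assume iText 7, whose API matches the calls below;
// the article itself does not name the library.
import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfWriter;
import com.itextpdf.layout.Document;
import com.itextpdf.layout.element.List;
import com.itextpdf.layout.element.ListItem;
import com.itextpdf.layout.element.Paragraph;

import java.io.ByteArrayOutputStream;

public class SingAlongPdfGenerator {

    public ByteArrayOutputStream generate(SingAlongData pdfData) {

        var outputStream = new ByteArrayOutputStream();
        PdfDocument pdf = new PdfDocument(new PdfWriter(outputStream));

        Document document = new Document(pdf);
        document.add(new Paragraph("Let's sing-a-long:"));

        // Bulleted list with one item per refrain line
        List list = new List()
            .setSymbolIndent(12)
            .setListSymbol("\u2022");

        for (var refrainLine : pdfData.getRefrainLines()) {
            list.add(new ListItem(refrainLine));
        }

        document.add(list);
        // Closing the document flushes the PDF content to the output stream
        document.close();

        return outputStream;
    }
}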

import java.util.List;

// DTO that carries the refrain lines to print
public class SingAlongData {

    private final List<String> refrainLines;

    public SingAlongData(List<String> refrainLines) {
        this.refrainLines = refrainLines;
    }

    public List<String> getRefrainLines() {
        return refrainLines;
    }
}

The SingAlongPdfGenerator class provides a method named generate that accepts a SingAlongData instance as its only parameter. The SingAlongData class is merely a DTO that provides a list of refrain lines. The generate method creates a new document containing a paragraph of text and a list of the refrain lines.

Let’s have a look at the code of the corresponding Approval Test.

public class SingAlongPdfGeneratorTests {

    @Test
    public void generateRickRollPdf() {

        var refrainLines = Arrays.asList(
            "Never gonna give you up",
            "Never gonna let you down",
            "Never gonna run around and desert you",
            "Never gonna make you cry",
            "Never gonna say goodbye",
            "Never gonna tell a lie and hurt you"
        );
        var data = new SingAlongData(refrainLines);

        var pdfGenerator = new SingAlongPdfGenerator();
        var result = pdfGenerator.generate(data);

        PdfApprovals.verify(result);
    } 
}

First we create an instance of the SingAlongData DTO, specifying some test data. Next we create an instance of the Subject Under Test, which in this case is the SingAlongPdfGenerator class, and call its generate method. This returns a ByteArrayOutputStream containing the data of the PDF document. Then we verify the result by calling the PdfApprovals.verify method. Notice that the anatomy of an Approval Test is identical to any other type of test, as it also adheres to the Arrange, Act, Assert pattern.

A new Approval Test always fails the very first time that it gets executed. After the initial test run, a PDF file is generated that needs to be visually verified and approved. So when we first run the generateRickRollPdf test, a file named SingAlongPdfGeneratorTests.generateRickRollPdf.received.pdf appears, which has the following content:

[Figure: A generated PDF file that needs to be approved]

After we have verified the PDF document, we approve it using the following command:

mv ~/src/test/resources/pdfapproval/SingAlongPdfGeneratorTests.generateRickRollPdf.received.pdf \
   ~/src/test/resources/pdfapproval/SingAlongPdfGeneratorTests.generateRickRollPdf.approved.pdf

At this point we have a PDF file named SingAlongPdfGeneratorTests.generateRickRollPdf.approved.pdf. When we now execute our test again, it passes because the newly generated PDF document matches the approved PDF document.
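The approval step can also be scripted. The following sketch does the same as the mv command above using java.nio. The class ApproveReceivedPdf and its approve method are hypothetical helpers that I made up for this example; they are not part of the Approval Test library.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for illustration; not part of any Approval Test library.
final class ApproveReceivedPdf {

    // Renames <testName>.received.pdf to <testName>.approved.pdf inside the given
    // directory, replacing a previously approved file when there is one.
    // Throws an IOException when the received file does not exist.
    static Path approve(Path directory, String testName) throws IOException {
        Path received = directory.resolve(testName + ".received.pdf");
        Path approved = directory.resolve(testName + ".approved.pdf");
        return Files.move(received, approved, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

For the test in this post, the directory would be the pdfapproval folder and the test name SingAlongPdfGeneratorTests.generateRickRollPdf.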

Note that we also have to make sure to commit the approved PDF file alongside our test code. Otherwise we have to approve the output of the test again when executed on another machine.

Let’s say that we want to make a change to the implementation of our SingAlongPdfGenerator class. For example, we’re going to make a small change to the text inside the paragraph.

document.add(new Paragraph("Let's sing-a-long shall we?"));

When we execute our test again, it fails due to the change that we’ve made.

Failed Approval
  Approved:~/src/test/resources/pdfapproval/SingAlongPdfGeneratorTests.generateRickRollPdf.approved.pdf
  Received:~/src/test/resources/pdfapproval/SingAlongPdfGeneratorTests.generateRickRollPdf.received.pdf

The test created a “received” PDF file again, alongside the existing PDF file that we approved earlier. We now have to visually compare the received and the approved file side by side. Needless to say, this is going to be quite cumbersome. For this reason I’ve added the capability that, in case of a failing test, a third PDF file is generated that contains the annotated differences between the two PDF files.

[Figure: A PDF file that indicates the differences]

The purple markers indicate where the changes are located in the document. The green colon indicates what is expected (approved). The red text that we’ve added to the paragraph shows the actual text found in the document (received). We now have to decide whether we want to approve the changes that we’ve made or not. Let’s say that we are happy with the change. To approve it, we simply run the approval command again, as shown earlier. And that’s it.
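To illustrate the expected-versus-actual idea behind such a report, here is a naive sketch that compares two texts line by line. This is not how the annotated PDF is produced, as the implementation of the extension isn’t shown in this post. TextDifferences and describe are made-up names, and the input is assumed to be text that was already extracted from both documents.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical helper for illustration; not the implementation of the PDF extension.
final class TextDifferences {

    // Walks both texts line by line and describes every position where they differ.
    // A line that exists in only one of the texts is reported as "<no line>".
    static List<String> describe(List<String> approved, List<String> received) {
        List<String> differences = new ArrayList<>();
        int lineCount = Math.max(approved.size(), received.size());
        for (int i = 0; i < lineCount; i++) {
            String expected = i < approved.size() ? approved.get(i) : null;
            String actual = i < received.size() ? received.get(i) : null;
            if (!Objects.equals(expected, actual)) {
                differences.add("line " + (i + 1)
                        + ": expected " + show(expected)
                        + " but was " + show(actual));
            }
        }
        return differences;
    }

    private static String show(String line) {
        return line == null ? "<no line>" : "[" + line + "]";
    }
}
```

Passing the approved and the received paragraph text of the example above, followed by the unchanged refrain lines, would yield a single entry for line 1 that shows both versions of the paragraph.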

With just a handful of these Approval Tests I was able to eliminate all the tightly coupled tests. Generating PDF files is more often than not an infrastructure concern. Sociable tests are much more appropriate in this case than solitary tests. Using Approval Tests this way turned out to be a very valuable approach.
