Property Testing with JSVerify · Part III
Gabriel Lebec
Posted on March 25, 2019
Part III: Intermediate JSVerify Usage
- In Part I, we explored the concept of property testing and saw basic JSVerify usage.
- In Part II, we took a detailed look at the language of JSVerify's README, including the meaning of types like
generator
,shrink
, andarbitrary
. - Armed with both conceptual knowledge and library-specific jargon, we are now ready to take a deeper dive into using JSVerify.
Using Built-In Arbitraries
JSVerify features a string-based Domain-Specific Language (DSL) documented here for describing the desired arbitrary data you want to test. The simplest possible examples use built-in primitive arbitraries by name.
// a dummy property used to produce example output
jsc.assert(jsc.forall(
'integer', 'nat', 'number', 'bool', 'falsy', 'char',
(int, nat, num, bool, fls, chr) => {
console.log(int, nat, num, bool, fls, chr)
return true
}
))
int |
nat |
num |
bool |
fls |
chr |
---|---|---|---|---|---|
6 |
4 |
-5.293295592069626 |
false |
'' |
´ |
-3 |
6 |
6.371534138917923 |
false |
false |
" |
-6 |
2 |
6.8075778381899 |
false |
NaN |
- |
5 |
2 |
0.37970419274643064 |
false |
null |
ø |
-1 |
6 |
-8.00378399901092 |
true |
0 |
\u001e |
-5 |
3 |
11.162152161821723 |
true |
false |
n |
4 |
4 |
-8.52733048191294 |
false |
NaN |
Z |
14 |
12 |
-20.764524302445352 |
false |
undefined |
' ' |
4 |
14 |
-2.3603379679843783 |
false |
undefined |
« |
... |
... |
... |
... |
... |
... |
Note, however, that using the DSL is optional; the functions the DSL refers to are available for direct use, and are also valid inputs to forall
.
// equivalent to previous example
jsc.assert(jsc.forall(
jsc.integer, jsc.nat, jsc.number, jsc.bool, jsc.falsy, jsc.char,
(int, nat, num, bool, fls, chr) => {
console.log(int, nat, num, bool, fls, chr)
return true
}
))
Some of the built-in primitive arbitraries are parameterized – they act as functions with parameters, for things like min and max bounds or to choose from an array of options. Expressing this using the string DSL can be tricky or unsupported, however. In such cases it is often easier to use the ordinary JS function equivalent:
// a dummy property used to produce example output
jsc.assert(jsc.forall(
jsc.integer(-2, 2), jsc.nat(5), jsc.elements([true, null]),
(int, nat, trueOrNull) => {
console.log(int, nat, trueOrNull)
return true
}
))
int |
nat |
trueOrNull |
---|---|---|
2 |
2 |
true |
2 |
1 |
null |
-1 |
5 |
null |
1 |
5 |
true |
0 |
0 |
null |
... |
... |
... |
Combinators: Deriving New Arbitraries from Existing Arbitraries
We have seen:
- atomic ("primitive") arbitraries, e.g.
jsc.bool : arbitrary bool
jsc.integer : arbitrary integer
- functions which map vanilla values to arbitraries, e.g.
jsc.integer(min: integer, max: integer) : arbitrary integer
jsc.elements(args: array a): arbitrary a
The next level in this hierarchy would be:
- functions which take arbitraries and return new arbitraries, e.g.
jsc.array(arb: arbitrary a): arbitrary (array a)
dict(arb: arbitrary a): arbitrary (dict a)
This kind of library design is known as the combinator pattern, and is a powerful yet concise way to build up complex functionality. Given a variety of primitive values and a selection of combinators, new values can be produced that can be fed back into those combinators, and so on – a self-reinforcing loop which creates a combinatorial explosion of possibilities.
We have already seen one example combinator: the jsc.array
function takes an arbitrary (e.g. jsc.nat
) and returns a new arbitrary (which generates arrays of that arbitrary value).
jsc.forall(
'[nat]', jsc.array(jsc.nat), // these are both equivalent
(arrayOfNats1, arrayOfNats2) => { /* use generated values */ }
)
arrayOfNats1 |
arrayOfNats2 |
---|---|
[7, 40, 4, 2] |
[18, 40, 15, 6] |
[11] |
[26] |
[] |
[] |
[] |
[14, 13, 17, 12] |
[3] |
[3, 6, 5] |
... |
... |
Since the return value of jsc.array
is itself an arbitrary
, it can become the input to jsc.array
again, to yield arrays-of-arrays:
jsc.forall(
'[[nat]]', jsc.array(jsc.array(jsc.nat)), // again, equivalent
(arrArrNat1, arrArrNat2) => { /* use generated values */ }
)
arrArrNat1 |
arrArrNat2 |
---|---|
[ [], [ 2, 10 ] ] |
[ [ 1 ] ] |
[ [ 10, 26, 9, 7 ] ] |
[] |
[ [ 22, 3, 17, 16 ], [ 13 ], [ 0, 23 ] ] |
[ [ 5, 10, 18, 4, 17 ], [ 17, 21 ] ] |
... |
... |
At this point we can begin to see why JSVerify includes a string-based DSL for specifying input types. '[[nat]]'
is much more concise than jsc.array(jsc.array(jsc.nat))
. On the other hand, method calls can be typed (e.g. for TypeScript), don't require learning a new syntax, and are a superset of what is possible with the DSL.
Some more examples of combinators are demonstrated below:
Combinator | String DSL Form | Method Call Form | Two Example Generated Values |
---|---|---|---|
pair |
'pair nat bool' |
jsc.pair(jsc.nat, jsc.bool) |
① [4, false] ② [0, true]
|
tuple |
'nat & nat & bool' |
jsc.tuple([jsc.nat, jsc.nat, jsc.bool]) |
① [0, 18, true] ② [9, 2, true] (all types will be used per generation) |
sum |
'char ¦ nat ¦ bool' |
jsc.sum([jsc.char, jsc.nat, jsc.bool]) |
① 9 ② 'h' (a random type will be chosen per generation) |
fn |
'a -> bool' |
jsc.fn(jsc.bool) |
① a function which maps 1 to true , 'hi' to false , null to true … ② a function which maps all inputs to false … |
record |
'{ name: asciinestring; age: nat }' |
jsc.record({ name: jsc.asciinestring, age: jsc.nat }) |
① { name: 'KUW<.', age: 0 } ② { name: '^L', age: 37 }
|
⚠️ (NB: for the sum
type, we have substituted the "broken bar" symbol ¦
for the pipe character |
because Markdown syntax uses the pipe character in tables.)
Putting it all together, you can specify reasonably complex arbitrary data with a minimal amount of work. And if your property fails, the failing input case can be automatically shrunk by JSVerify.
jsc.forall(
'[{ start: integer & integer; end: integer & integer }]',
(lineSegments) => { /* use generated values */ }
)
lineSegments |
---|
[ {start: [26, 6], end: [20, -4]} ] |
[] |
[ {start: [11, -6], end: [-14, 8]}, {start: [-14, -15], end: [11, -15]} ] |
... |
Transforming Arbitraries
With the supplied primitive arbitraries and combinators, you can do a lot, but not everything. As a very small example, what if very specifically want arbitrary even natural numbers? One strategy is to start with something close (integers) and transform them inside your property:
jsc.assert(jsc.forall(
'integer',
(integer) => {
const evenInt = integer * 2
return funcThatUsesEvens(evenInt) === 'cool'
}
))
This works (for some problems), but has a downside: if we ever encounter a failure, the failing input case that is logged is pre-transformation.
Error: Failed after 13 tests and 9 shrinks. rngState: 93bc7e5e6160384cf8;
Counterexample: 7
What we would really like is a value of type arbitrary even, so the failing cases are the even numbers fed into our function under test. JSVerify gives the capability to transform an existing arbitrary into a new arbitrary via the .smap
(surjective map) prototypal method:
.smap(f: a -> b, g: b -> a, newShow: (b -> string)?): arbitrary b
In other words, given arbA : arbitrary a
, you can create a transformed arbB : arbitrary b
by providing (at least) two functions:
-
f: a -> b
is a function that transforms the input case to the form you desire -
g: b -> a
is a function that maps derived values back to the original type (necessary for shrinking)
Applied to our example of even numbers:
// arbitrary even
const arbEven = jsc.integer.smap(
n => n * 2, // function to transform integer numbers to evens
e => e / 2 // function to map even numbers back to integers
)
jsc.assert(jsc.forAll(arbEven, e => {
return funcThatUsesEvens(e) === 'cool'
}))
As the docs note, the first function passed to smap
should be surjective. If you "map back" your derived value x
using the second function, then the first function should bring you back to x
exactly. That is, f(g(x))
always equals x
.
With our arbitrary even
type implemented, our failures now show actual even numbers which fail:
Error: Failed after 13 tests and 9 shrinks. rngState: 93bc7e5e6160384cf8;
Counterexample: 14
Defining Custom Arbitraries from Scratch
In Part II, we saw an example generator, shrink, and show function for a hypothetical "hydra"-type value.
const generatorHydra = (size) => {
const hydra = { heads: size ** 2 }
return hydra
}
const shrinkHydra = (hydra) => {
if (hydra.heads <= 0) return []
return [{ heads: hydra.heads - 1 }] // valid but inefficient
}
const showHydra = (hydra) => `{ heads: ${hydra.heads} }`
Together, these three functions almost make up the complete manual definition of a custom arbitrary hydra
value.
const almostArbitraryHydra = {
generator: generatorHydra,
shrink: shrinkHydra,
show: showHydra
}
However, there is one bit of spice we need to apply before this can actually be used in JSVerify as an arbitrary hydra
. JSVerify assumes that generators, shrinks, and arbitraries will have certain methods – for example, arbitraries will have .smap
as shown earlier. In order to add these methods to your own from-scratch arbitraries, JSVerify provides three helper functions named bless
.
const arbitraryHydra = jsc.bless({
generator: jsc.generator.bless(generatorHydra),
shrink: jsc.shrink.bless(shrinkHydra),
show: showHydra
})
jsc.bless(arbLike: {...}) : arbitrary a
jsc.generator.bless(genLike: (size: nat) -> a) : generator a
jsc.shrink.bless(shrinkLike: a -> [a]) : shrink a
- (there is no
bless
forshow
– it doesn't need it)
The bless
functions "upgrade" pseudo-generators / pseudo-shrinks / pseudo-arbitraries into bona-fide full versions suitable for use by e.g. jsc.forall
.
With our holy hydra fully evolved, we can use it in a property:
jsc.assert(jsc.forAll(hydra, (h) => {
return h.heads >= 0
}))
If we wanted to refer to the hydra arbitrary by name using the String DSL, we would add it to a namespace object and pass in that map as our penultimate env
argument to forall
:
const env = { hydra: hydra }
jsc.assert(jsc.forAll('hydra', env, (h) => {
return h.heads >= 0
}))
Deriving Generators
When defining new arbitraries, it was sometimes useful to start with existing arbitraries and use either combinators (like array
) or transformations (like smap
) on them. Similarly, when defining our own generators, it is sometimes useful to start with an existing generator.
Generator Combinators
Some of the same combinators exist for generators as for arbitraries. For example:
-
jsc.generator.array(gen: generator a): generator (array a)
- "given a generator of
a
, thearray
method returns a generator for arrays ofa
"
- "given a generator of
-
jsc.generator.tuple(gens: (generator a, generator b...)): generator (a, b...)
- "given an array of generators for
a
,b
, etc., thetuple
method returns a generator for arrays of form[a, b]
etc."
- "given an array of generators for
// (size: nat) -> [char, [nat], bool]
const generatorCharNatsBool = jsc.generator.tuple([
jsc.char.generator,
jsc.generator.array(jsc.nat.generator),
jsc.bool.generator
])
// example outputs:
// ['h', [0, 3], true]
// ['#', [91, 3, 84], false]
Note that generator combinator methods live on jsc.generator.COMBINATOR_NAME
whereas existing generators for a primitive can be found as jsc.PRIMITIVE_NAME.generator
.
Transforming Generators
Transforming a generator is easier than transforming a full arbitrary. With an arbitrary value, smap
needed a pair of functions – one to derive data, and one to map derived data back to the original type. Generators have a map
method which just needs the transformer:
(gen: generator a).map(f: a -> b): generator b
// (size : nat) -> nat
const evenGenerator = jsc.generator.nat.map(n => n * 2)
// example outputs:
// 12
// 0
// 54
const yellGenerator = jsc.generator.string.map(s => s + '!')
// example outputs:
// '!'
// 'c><92&X.!'
// 's'72$!'
Given a generator of a
-type values, and a transforming function a -> b
, the map
method will give you a generator of b
-type values.
As a side note: this is precisely the type needed to define a functor – a structure which can be mapped over. Just like arrays can be mapped (transforming the values in the array, but leaving the array structure alone), generators can be mapped (transforming the output of the function, but leaving its function nature / input type alone). Functors show up extensively in functional programming, but you don't need to know their theory in order to use map
.
Dependently-Generated Values
Let's consider something more challenging – a generator customized by other generators. As an example, we might want to generate both a nonempty array of values, and a random index on that array, e.g. to pass into includes
.
for all a ∈ Array x | a.length > 0,
for all i ∈ Nat | i < a.length,
a.includes(a[i])
- We know how to generate arbitrary nonempty arrays, e.g.
jsc.nearray(jsc.bool)
. - We know how to generate arbitrary naturals with a max size, e.g.
jsc.nat(8)
.
What we haven't seen yet is how to make the input of one generator (e.g. nat
) rely on values from the output of another generator (e.g. nearray
). As a first attempt, something like map
seems promising:
// Will not work
const genArrIdx =
jsc.generator.nearray(jsc.bool.generator) // gen [bool]
.map(arrBools => { // [bool] ->
const maxIdx = arrBools.length - 1
const arbIdx = jsc.nat(maxIdx)
return arbIdx.generator // gen nat
}) // -> gen (gen nat)
Unfortunately, this does quite not work – we have actually created a generator of (generator of nat). Consider this analogous example of returning an array from array map:
const result =
[1, 2, 3] // array int
.map(num => { // int ->
return [num, num] // array int
}) // -> array (array int)
console.log(result) // [[1, 1], [2, 2], [3, 3]]
Just as mapping an array to arrays returns a nested array, mapping a generator to generators returns a nested generator. To un-do this nesting, we need a new function, .flatMap
.
// Almost there…
const genArrIdx =
jsc.generator.nearray(jsc.bool.generator)
- .map( arrBools => {
+ .flatMap(arrBools => {
const maxIdx = arrBools.length - 1
const arbIdx = jsc.nat(maxIdx)
return arbIdx.generator // gen nat
- }) // -> gen (gen nat)
+ }) // -> gen nat
This gives us a generator nat
with no nesting. But as you may recall, we really wanted an array alongside a randomly-generated valid index on that array, not just the index alone. To solve this, we can use an inner map
to combine our two results:
// Works
const genArrIdx =
jsc.generator.nearray(jsc.bool.generator)
.flatMap(arrBools => {
const maxIdx = arrBools.length - 1
const arbIdx = jsc.nat(maxIdx)
return arbIdx.generator
+ .map(idx => {
+ return [arrBools, idx]
+ }) // -> gen [array bool, nat]
- }) // -> gen nat
+ }) // -> gen [array bool, nat]
A bit more tersely, and with JS syntax highlighting:
// generator [array bool, nat]
const genArrIdx =
jsc.generator.nearray(jsc.bool.generator).flatMap(arrBools => {
const maxIdx = arrBools.length - 1
const arbIdx = jsc.nat(maxIdx)
return arbIdx.generator.map(idx => [arrBools, idx])
})
The flatMap
method is just like map
except the function passed to it should return generators of values rather than just plain values. Where map
would produce a nested generator, flatMap
removes the nesting.
(generator a) .map(a -> b ) : generator b
(generator a).flatMap(a -> (generator b)) : generator b
As a side note: this is precisely the type needed to define a monad – a structure which can be un-nested after mapping. Like functors, monads show up often in functional programming, though again all you really need to know practically for this library is "return a generator in the function passed to flatMap".
Technically, flatMap
is just a convenience method. We could have defined our array+index generator manually as follows:
const genArrIdx =
jsc.generator.bless((size) => {
// create and use array generator:
const genArrBool = jsc.generator.nearray(jsc.bool.generator)
const arrBools = genArrBool(size)
const maxIdx = arrBools.length - 1
// create and use index generator:
const genIdx = jsc.nat(maxIdx).generator
const idx = genIdx(size)
// return resulting generated values
return [arrBools, idx]
})
Note how we pass the size
parameter through all the generators ourself; flatMap
handles this for us.
Regardless of which style (applying size
or using flatMap
) you prefer, when you sequence dependent generators you must write your own corresponding shrink function for the result. Shrinks aren't provided automatically, like they are with smap
.
Writing Shrinks
Like full arbitrary a
and lower-level generator a
functions, shrink
functions can be written from scratch or derived from existing shrinks. The derivation method is smap
, like with arbitraries.
(shrink a).smap(f: a -> b, g: b -> a): shrink b
const shrinkEven = jsc.nat.shrink.smap(
n => n * 2, // convert nat to even
e => e / 2 // convert even to nat
)
console.log(shrinkEven(32)) // [ 16, 24, 28, 30 ]
Earlier, we defined our own generator [array bool, nat]
function for constructing pairs of random arrays with random valid indexes (less than the array length). A corresponding shrink would have to preserve the property that the nat
value remains a valid index. In this case, a first attempt might re-use existing shrinks:
// Looks good, but there is an error…
const shrinkArrIdx = jsc.shrink.bless(([arr, idx]) => {
// shrink arrays, ensure valid index
const shrinkArrBool = jsc.shrink.nearray(jsc.bool.shrink)
const shrunkArrs = shrinkArrBool(arr)
const shrunkArrsWithIdx = shrunkArrs.map((shrunkArr) => {
const validIdx = Math.min(shrunkArr.length - 1, idx)
return [shrunkArr, validIdx]
})
// shrink index, re-use existing array
const shrunkIdxs = jsc.nat.shrink(idx)
const shrunkIdxsWithArr = shrunkIdxs.map((shrunkIdx) => {
return [arr, shrunkIdx]
})
// flat list of shrunk values? nope, this breaks:
return [...shrunkArrsWithIdx, ...shrunkIdxsWithArr]
})
Unfortunately, this fails.
TypeError: shrunkArrsWithIdx is not iterable
How odd – doesn't a shrink function return an array? That's what the type annotation seems to promise:
"shrink is a function
a -> [a]
, returning smaller values."
In reality, though, the array-based shrinks currently return lazy sequences as defined by the lazy-seq
package, by the same author as JSVerify. This improves the performance of shrinking arrays, which could otherwise be combinatorially expensive.
Looking at the lazy-seq
API docs, we can see that lazy sequences can be combined using an append
method – so that's what we'll do here.
const shrinkArrIdx = jsc.shrink.bless(([arr, idx]) => {
if (!arr.length) return []
const shrinkArrBool = jsc.shrink.nearray(jsc.bool.shrink)
const shrunkArrs = shrinkArrBool(arr)
const shrunkArrsWithIdx = shrunkArrs.map((shrunkArr) => {
const validIdx = Math.min(shrunkArr.length - 1, idx)
return [shrunkArr, validIdx]
})
const shrunkIdxs = jsc.nat.shrink(idx)
const shrunkIdxsWithArr = shrunkIdxs.map((shrunkIdx) => {
return [arr, shrunkIdx]
})
- return [...shrunkArrsWithIdx, ...shrunkIdxsWithArr]
+ return shrunkArrsWithIdx.append(shrunkIdxsWithArr)
})
This is not yet documented in JSVerify, so again be aware that the use of lazy-seq
methods on shrunk arrays may not be officially supported or subject to SEMVER rules. Cf. issue #297
Putting It All Together
Below is a complete, working implementation of our relatively involved arbitrary <array bool, index>
.
// generator <array bool, index>
const genArrIdx =
jsc.generator.nearray(jsc.bool.generator).flatMap(arrBools => {
const maxIdx = arrBools.length - 1
const arbIdx = jsc.nat(maxIdx)
return arbIdx.generator.map(idx => [arrBools, idx])
})
// shrink <array bool, index>
const shrinkArrIdx = jsc.shrink.bless(([arr, idx]) => {
// shrink arrays, ensure valid index
const shrinkArrBool = jsc.shrink.nearray(jsc.bool.shrink)
const shrunkArrs = shrinkArrBool(arr)
const shrunkArrsWithIdx = shrunkArrs.map((shrunkArr) => {
const validIdx = Math.min(shrunkArr.length - 1, idx)
return [shrunkArr, validIdx]
})
// shrink index, re-use existing array
const shrunkIdxs = jsc.nat.shrink(idx)
const shrunkIdxsWithArr = shrunkIdxs.map((shrunkIdx) => {
return [arr, shrunkIdx]
})
// lazy sequence of shrunk values
return shrunkArrsWithIdx.append(shrunkIdxsWithArr)
})
// arbitrary <array bool, index>
const arbArrIdx = jsc.bless({
generator: genArrIdx,
shrink: shrinkArrIdx
})
The moral of the story here might be "arbitraries whose generators sequence other generators are more trouble than they are worth." Of course, it will depend on the actual problem at hand, and in practice, it seems that such sequenced generators are not often needed. Much of the time, a new arbitrary value can be derived using smap
. Also, strictly speaking an arbitrary may omit the shrink, but that reduces the usefulness of property testing to an extent. Ultimately it is up to the developer to weigh the options.
Summary
- Arbitraries: "blessed" objects with generator, shrink, and show fns
- JSVerify ships with a selection of primitive (atomic) arbitraries like
jsc.nat
. - Combine arbitraries into more complex arbitraries using combinator functions like
jsc.array
. - Transform arbitraries into new arbitraries using the
.smap
method. - Define custom arbitraries from complete scratch by using the
jsc.bless
method on an object wrapping a generator, shrink, and optional show function. - Arbitraries can be passed in directly to
forall
, or else referenced by string DSL name with the help of anenv
argument.
- JSVerify ships with a selection of primitive (atomic) arbitraries like
- Generators: "blessed" functions from size to pseudorandom data
- The generators for built-in arbitraries can be accessed as
jsc.ARB_NAME.generator
. - Combine generators using provided combinators such as
jsc.generator.array
. - Transform generators using the
map
method. - Sequence generators using the
flatMap
method. - Define custom from-scratch generators by calling
jsc.generator.bless
on a function from size to random data. - Randomness in a from-scratch generator should be accomplished using
jsc.random
and/or by calling existing generators with thesize
.
- The generators for built-in arbitraries can be accessed as
- Shrinks: "blessed" functions from data to a collection of smaller data
- The shrinks for built-in arbitraries can be accessed as
jsc.ARB_NAME.shrink
. - Combine shrinks using provided combinators such as
jsc.shrink.array
. - Transform shrinks using the
smap
method. - Define custom from-scratch shrinks by calling
jsc.shrink.bless
on a function from data to arrays (orlazy-seq
values) of "smaller" values. - A good shrink should put small and/or likely-to-fail values near the start of the output collection.
- The shrinks for built-in arbitraries can be accessed as
- Shows: functions from data to a string
- The default
show
function isJSON.stringify
- It is good practice, though not absolutely required, for a
show
function to output valid JS code
- The default
Using existing arbitrary values is quick, easy, and powerful on its own. Combinators and smap
can also sometimes be a relatively painless way to yield the arbitrary value desired. More advanced use cases, building up complex data with interdependencies, can require more fluent wielding of flatMap
and/or manual definition of the shrink function. However, the benefits of having a fully-defined arbitrary for your property tests might be worth the extra difficulty.
At its best, property testing can provide a large amount of value and confidence for a small degree of effort and code. It is my hope that with greater knowledge of this approach in the JavaScript community, libraries like JSVerify may be more widely explored, maintained, extended and supported. Consider using JSVerify to augment your next project's test suite – you might find it catching mistakes you wouldn't have noticed with just a few unit tests.
Addendum
In the process of writing this article, I came to appreciate that JSVerify is useful but also has a few rough edges and outstanding issues. I would encourage anyone interested in JS property testing to also take a look at some of the competing tools in this space, including:
Posted on March 25, 2019
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.