Christian Neumanns
Posted on October 2, 2019
Introduction
There are two effective approaches to eliminate the daunting null pointer error:
- The Maybe/Option pattern - mostly used in functional programming languages.
- Compile-time null-safety - used in some modern programming languages.
This article aims to answer the following questions:
- How does it work? How do these two approaches eliminate the null pointer error?
- How are they used in practice?
- How do they differ?
Notes:
- Readers not familiar with the concept of null might want to read first: A quick and thorough guide to 'null'.
- For an introduction to Maybe / Option I recommend: F#: The Option type. You can also search the net for "haskell maybe" or "f# option".
Why Should We Care?
"I call it my billion-dollar mistake. It was the invention of the null reference in 1965. ... This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years. ..."
-- Tony Hoare
In the context of Java, Professor John Sargeant from the Manchester school of computer science puts it like this:
"Of the things which can go wrong at runtime in Java programs, null pointer exceptions are by far the most common."
-- John Sargeant
We can easily deduce:
"By eliminating the infamous null pointer error, we eliminate one of the most frequent reasons for software failures."
That's a big deal!
We should care about it.
Three Approaches
Besides showing the reason for the null pointer error, this article also aims to demonstrate how the null pointer error can be eliminated.
We will therefore compare three different approaches:
- The language uses null, but doesn't provide null-safety.
  In these languages null pointer errors occur frequently.
  Most popular languages fall into this category. For example: C, C++, Java, JavaScript, PHP, Python, Ruby, Visual Basic.
- The language doesn't support null, but uses Maybe (also called Option or Optional) to represent the 'absence of a value'.
  As null is not supported, there are no null pointer errors.
  This approach is mostly used in functional programming languages, but it can be used in non-functional languages as well.
  At the time of writing, the most prominent languages using this approach are probably Haskell, F#, and Swift.
- The language uses null and provides compile-time null-safety.
  Null pointer errors cannot occur.
  Some modern languages support this approach.
Source Code Examples
In this chapter we'll look at some source code examples of common use cases involving 'the absence of a value'. We will compare the code written in the three following languages representing the three approaches mentioned in the previous chapter:
- Java (supports null, but not null-safe)
  Java is one of the industry's leading languages, and one of the most successful ones in the history of programming languages. But it isn't null-safe. Hence, it is well suited to demonstrate the problem of the null pointer error.
- Haskell (Maybe type)
  Haskell is the most famous one in the category of pure functional languages. It doesn't support null. Instead it uses the Maybe monad to represent the 'absence of a value'.
  Note: I am by no means a Haskell expert. If you see any mistake or need for improvement in the following examples, then please leave a comment so that the article can be updated.
- PPL (supports null and is null-safe)
  The Practical Programming Language (PPL) supports null and has been designed with full support for compile-time null-safety from the ground up. However, be warned! PPL is just a work in progress, not yet ready for writing mission-critical enterprise applications. I use it in this article because (full disclosure!) I am the creator of PPL, and I want to spark some interest in it. I hope you don't mind - after reading this article.
All source code examples are available on Github. The Github source code files contain alternative solutions for some examples, not shown in this article.
Null-Safety
How does null-safety work in practice? Let's see.
Null Not Allowed
We start with an example of code where null is not allowed.
Say we want to write a very simple function that takes a positive integer and returns a string. Neither the input nor the output can be null. If the input value is 1, we return "one". If it is not 1, we return "not one". What does the code look like in the three languages? And, more importantly, how safe is it?
Java
This is the function written in Java:
static String intToString ( Integer i ) {
    if ( i == 1 ) {
        return "one";
    } else {
        return "not one";
    }
}
We can use the ternary operator and shorten the code a bit:
static String intToString ( Integer i ) {
    return i == 1 ? "one" : "not one";
}
Note: I am using type Integer, which is a reference type. I am not using type int, which is a value type. The reason is that null works only with reference types.
To test the code, we can write a simple Java application like this:
public class NullNotAllowedTest {

    static String intToString ( Integer i ) {
        return i == 1 ? "one" : "not one";
    }

    public static void main ( String[] args ) {
        System.out.println ( intToString ( 1 ) );
        System.out.println ( intToString ( 2 ) );
    }
}
If you want to try out this code you can use an online Java Executor like this one. Just copy/paste the above code in the Source File tab, and click Execute.
If you have Java installed on your system, you can also proceed like this:
- Save the above code in file NullNotAllowedTest.java.
- Compile and run it by typing the following two commands in a terminal:
javac NullNotAllowedTest.java
java NullNotAllowedTest
The output written to the OS out device is:
one
not one
So far so good.
Haskell
In Haskell, there are a few ways to write the function. For example:
intToString :: Integer -> String
intToString i = case i of
    1 -> "one"
    _ -> "not one"
Note: The first line in the above code could be omitted, because Haskell supports type inference for function arguments. However, it's considered good style to include the type signature, because it makes the code more readable. Hence, we will always include the type signature in the upcoming Haskell examples.
The above code uses pattern matching, which is the idiomatic way to write code in Haskell.
We can write a simple Haskell application to test the code:
intToString :: Integer -> String
intToString i = case i of
    1 -> "one"
    _ -> "not one"

main :: IO ()
main = do
    putStrLn $ intToString 1
    putStrLn $ intToString 2
As for Java, you can use an online Haskell executor to try out the code.
Alternatively, if Haskell is installed on your system, you can save the above code in file NothingNotAllowedTest.hs. Then you can compile and run it with these two commands:
ghc -o NothingNotAllowedTest NothingNotAllowedTest.hs
NothingNotAllowedTest.exe
The output is the same as in the Java version:
one
not one
PPL
In PPL the function can be written like this:
function int_to_string ( i pos_32 ) -> string
    if i =v 1 then
        return "one"
    else
        return "not one"
    .
.
Note: The comparison operator =v in the above code is suffixed with a v to make it clear we are comparing values. If we wanted to compare references, we would use operator =r.
We can shorten the code by using an if-then-else expression (instead of an if-then-else statement):
function int_to_string ( i pos_32 ) -> string = \
    if i =v 1 then "one" else "not one"
A simple PPL application to test the code looks like this:
function int_to_string ( i pos_32 ) -> string = \
    if i =v 1 then "one" else "not one"

function start
    write_line ( int_to_string ( 1 ) )
    write_line ( int_to_string ( 2 ) )
.
At the time of writing there is no online PPL executor available. To try out code you have to install PPL and then proceed like this:
- Save the above code in file null_not_allowed_test.ppl
- Compile and run the code in a terminal by typing:
ppl null_not_allowed_test.ppl
Again, the output is:
one
not one
Discussion
As we have seen (and expected), the three languages allow us to write 'code that works correctly'. Here is a reprint of the three versions for easy comparison:
- Java
static String intToString ( Integer i ) {
    return i == 1 ? "one" : "not one";
}
- Haskell
intToString :: Integer -> String
intToString i = case i of
    1 -> "one"
    _ -> "not one"
- PPL
function int_to_string ( i pos_32 ) -> string = \
    if i =v 1 then "one" else "not one"
A pivotal question remains unanswered:
"What happens in case of a bug in the source code?"
-- The Crucial Question
In the context of this article we want to know: What happens if the function is called with null as input? And what if the function returns null?
This question is easy to answer in the Haskell world. null doesn't exist in Haskell. Haskell uses the Maybe monad to represent the 'absence of a value'. We will soon see how this works. Hence, in Haskell it is not possible to call intToString with null as input. And we can't write code that returns null.
PPL supports null, unlike Haskell. However, all types are non-null by default. This is a fundamental rule in all effective null-safe languages. A PPL function with the type signature pos_32 -> string states that the function cannot be called with null as input, and it cannot return null. This is enforced at compile-time, so we are on the safe side. Code like int_to_string ( null ) simply doesn't compile.
"By default all types are non-null in a null-safe language."
"By default it is illegal to assign null."
-- The 'non-null by default' rule
What about Java?
Java is not null-safe. Every reference type is nullable, and there is no way to specify a non-null type for a reference. This means that intToString can be called with null as input. Moreover, nothing prevents us from writing code that returns null from intToString.
So, what happens if we make a function call like intToString ( null )? The program compiles, but the disreputable NullPointerException is thrown at run-time:
Exception in thread "main" java.lang.NullPointerException
at NullNotAllowedTest.intToString(NullNotAllowedTest.java:4)
at NullNotAllowedTest.main(NullNotAllowedTest.java:10)
Why? Because i is an Integer and 1 is an int, the test i == 1 unboxes i before comparing, so it is equivalent to i.intValue() == 1. But i is null in our case. And executing a method on a null object is impossible and generates a NullPointerException.
This is the well-known reason for the infamous billion-dollar mistake.
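Here is a minimal sketch that reproduces the failure. Comparing an Integer with an int literal makes the compiler unbox the Integer (a hidden call to intValue()), and that call cannot succeed on null. The class name UnboxingDemo and the helper throwsOnNull are made up for this illustration; they are not part of the article's test code.

```java
// Hypothetical demo class, not part of the article's test code.
class UnboxingDemo {

    // Same body as the article's intToString.
    static String intToString(Integer i) {
        // The compiler unboxes i here, roughly: i.intValue() == 1
        return i == 1 ? "one" : "not one";
    }

    // Returns true if intToString(null) throws a NullPointerException.
    static boolean throwsOnNull() {
        try {
            intToString(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(intToString(1));  // prints "one"
        System.out.println(throwsOnNull());  // prints "true"
    }
}
```

Note that the exception is raised inside intToString, far away from the place where the null value was created. In a larger program that distance is what makes such bugs hard to track down.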
What if intToString accidentally returns null, as in the following code:
public class NullNotAllowedTest {

    static String intToString ( Integer i ) {
        return null;
    }

    public static void main ( String[] args ) {
        System.out.println ( intToString ( 1 ) );
    }
}
Again, no compiler error. But a runtime error occurs, right? Wrong, the output is:
null
Why?
The reason is that System.out.println has been programmed to write the string "null" if it is called with null as input. The method signature doesn't show this, but it is clearly stated in the Java API documentation: "If the argument is null then the string 'null' is printed."
What if, instead of printing the string returned by intToString, we want to print the string's size (i.e. the number of characters)? Let's try it by replacing ...
System.out.println ( intToString ( 1 ) );
... with this:
System.out.println ( intToString ( 1 ).length() );
Now the program doesn't continue silently. A NullPointerException is thrown again, because the program tries to execute length() on a null object.
As we can see from this simple example, the result of misusing null is inconsistent.
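The inconsistency can be condensed into a few lines. In this sketch (the class name NullOutcomeDemo and the helper names are assumptions made for illustration), the same null result is silently turned into text in one place and throws in another:

```java
// Hypothetical demo class, not part of the article's test code.
class NullOutcomeDemo {

    // The buggy version from above: it accidentally returns null.
    static String intToString(Integer i) {
        return null;
    }

    // String concatenation converts null to the text "null", so no error occurs.
    static String concatenated() {
        return "Result: " + intToString(1);
    }

    // Calling a method on the null result throws a NullPointerException.
    static boolean lengthThrows() {
        try {
            intToString(1).length();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(concatenated());  // prints "Result: null"
        System.out.println(lengthThrows());  // prints "true"
    }
}
```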
In the real world, the final outcome of incorrect null handling ranges from totally harmless to totally harmful, and is often unpredictable. This is a general and frustrating property of all programming languages that support null, but don't provide compile-time null-safety. Imagine a big application with thousands of functions, most of them much more complex than our simple toy code. None of these functions are implicitly protected against misuses of null. It is understandable why null and the "billion dollar mistake" have become synonyms for many software developers.
We can of course try to improve the Java code and make it a bit more robust. For example, we could explicitly check for a null input in method intToString and throw an IllegalArgumentException. We could also add a NonNull annotation that can be used by some static code analyzers or super-sophisticated IDEs. But all these improvements require manual work, might depend on additional tools and libraries, and don't lead to a satisfactory and reliable solution. Therefore, we will not discuss them. We are not interested in mitigating the problem of the null pointer error, we want to eliminate it. Completely!
Null Allowed
Let's slightly change the specification of function int_to_string. We want it to accept null as input and return:
- "one" if the input is 1
- "not one" if the input is not 1 and not null
- null if the input is null
How does this affect the code in the three languages?
Java
This is the new code written in Java:
static String intToString ( Integer i ) {
    if ( i == null ) {
        return null;
    } else {
        return i == 1 ? "one" : "not one";
    }
}
We could again use the ternary operator and write more succinct code:
static String intToString ( Integer i ) {
    return i == null ? null : i == 1 ? "one" : "not one";
}
Whether to choose the first or second version is a matter of debate. As a general rule, we should value readability more than terseness of code. So, let's stick with version 1.
The crucial point here is that the function's signature has not changed, although the function's specification is now different. Whether the function accepts and returns null or not, the signature is the same:
String intToString ( Integer i ) {
This doesn't come as a surprise. As we saw already in the previous example, Java (and other languages without null-safety) doesn't distinguish between nullable and non-nullable types. All reference types are always nullable. Hence by just looking at a function signature we don't know if the function accepts null as input, and we don't know if it might return null. The best we can do is to document nullability for each input/output argument. But there is no compile-time protection against misuses.
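To make this concrete, here is a sketch with both variants side by side. The names SameSignatureDemo, strictIntToString and lenientIntToString are hypothetical. Only the Javadoc comments tell the two contracts apart, and the compiler checks neither of them.

```java
// Hypothetical demo class; the method names are made up for this illustration.
class SameSignatureDemo {

    /** Contract: i must not be null (nothing enforces this at compile-time). */
    static String strictIntToString(Integer i) {
        return i == 1 ? "one" : "not one";
    }

    /** Contract: i may be null, in which case null is returned. */
    static String lenientIntToString(Integer i) {
        if (i == null) {
            return null;
        }
        return i == 1 ? "one" : "not one";
    }

    public static void main(String[] args) {
        // Both calls compile, because both parameter types are simply Integer.
        System.out.println(lenientIntToString(null)); // prints "null"
        System.out.println(strictIntToString(null));  // throws NullPointerException
    }
}
```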
To check if it works, we can write a simplistic test application:
public class NullAllowedTest {

    static String intToString ( Integer i ) {
        if ( i == null ) {
            return null;
        } else {
            return i == 1 ? "one" : "not one";
        }
    }

    static void displayResult ( String s ) {
        String result = s == null ? "null" : s;
        System.out.println ( "Result: " + result );
    }

    public static void main ( String[] args ) {
        displayResult ( intToString ( 1 ) );
        displayResult ( intToString ( 2 ) );
        displayResult ( intToString ( null ) );
    }
}
Output:
Result: one
Result: not one
Result: null
Haskell
This is the code in Haskell:
intToString :: Maybe Integer -> Maybe String
intToString i = case i of
    Just 1  -> Just "one"
    Nothing -> Nothing
    _       -> Just "not one"
Key points:
- Haskell doesn't support null. It uses the Maybe monad.
  The Maybe type is defined as follows:
  data Maybe a = Nothing | Just a deriving (Eq, Ord)
  The Haskell doc states: "The Maybe type encapsulates an optional value. A value of type Maybe a either contains a value of type a (represented as Just a), or it is empty (represented as Nothing). The Maybe type is also a monad."
  Note: More information can be found here and here. Or you can read about the Option type in F#.
- The function signature clearly states that calling the function with no integer (i.e. the value Nothing in Haskell) is allowed, and the function might or might not return a string.
- For string values the syntax Just "string" is used to denote a string, and Nothing is used to denote 'the absence of a value'. Analogously, the syntax Just 1 and Nothing is used for integers.
- Haskell uses pattern matching to check for 'the absence of a value' (e.g. Nothing ->). The symbol _ is used to denote 'any other case'. Note that the _ case includes the Nothing case. Hence if we forget the explicit check for Nothing there will be no compiler error, and "not one" will be returned if the function is called with Nothing as input.
Here is a simple test application:
import Data.Maybe (fromMaybe)

intToString :: Maybe Integer -> Maybe String
intToString i = case i of
    Just 1  -> Just "one"
    Nothing -> Nothing
    _       -> Just "not one"

displayResult :: Maybe String -> IO ()
displayResult s =
    putStrLn $ "Result: " ++ fromMaybe "null" s

main :: IO ()
main = do
    displayResult $ intToString (Just 1)
    displayResult $ intToString (Just 2)
    displayResult $ intToString (Nothing)
Output:
Result: one
Result: not one
Result: null
Note the fromMaybe "null" s expression in the above code. In Haskell this is a way to provide a default value in case of Nothing. It's conceptually similar to the expression s == null ? "null" : s in Java.
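For comparison, the Maybe pattern can be approximated in Java with java.util.Optional (available since Java 8). The following is only a sketch, and the class name OptionalDemo is an assumption made for illustration. Unlike Haskell's Maybe, it does not give us real null-safety, because a reference of type Optional can itself still be null.

```java
import java.util.Optional;

// Hypothetical demo class showing the Maybe/Option pattern with java.util.Optional.
class OptionalDemo {

    // An empty Optional plays the role of Nothing; map only runs if a value is present.
    static Optional<String> intToString(Optional<Integer> i) {
        return i.map(n -> n == 1 ? "one" : "not one");
    }

    // orElse supplies a default value, similar to fromMaybe in Haskell.
    static String displayResult(Optional<String> s) {
        return "Result: " + s.orElse("null");
    }

    public static void main(String[] args) {
        System.out.println(displayResult(intToString(Optional.of(1))));    // prints "Result: one"
        System.out.println(displayResult(intToString(Optional.of(2))));    // prints "Result: not one"
        System.out.println(displayResult(intToString(Optional.empty())));  // prints "Result: null"
    }
}
```

Here the signature does state that the input and the output may be absent, which the plain Integer/String version cannot express.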
PPL
In PPL the code looks like this:
function int_to_string ( i pos_32 or null ) -> string or null
    case value of i
        when null
            return null
        when 1
            return "one"
        otherwise
            return "not one"
    .
.
Note: A case expression will be available in a future version of PPL (besides the case statement shown above). Then the code can be written more concisely as follows:
function int_to_string ( i pos_32 or null ) -> string or null = \
    case value of i
        when null: null
        when 1   : "one"
        otherwise: "not one"
Key points:
- In PPL null is a regular type (like string, pos_32, etc.) that has one possible value: null.
  It appears at the top of PPL's type hierarchy.
- PPL supports union types (also called sum types, or choice types). For example, if a reference can be a string or a number, the type is string or number.
  That's why we use the syntax pos_32 or null and string or null to denote nullable types. The type string or null simply means that the value can be any string or null.
- The function clearly states that it accepts null as input, and that it might return null.
- We use a case instruction to check the input and return an appropriate string. The compiler ensures that each case is covered in the when clauses. It is not possible to accidentally forget to check for null, because (in contrast to Haskell) the otherwise clause doesn't cover the null clause.
A simple test application looks like this:
function int_to_string ( i pos_32 or null ) -> string or null
    case value of i
        when null
            return null
        when 1
            return "one"
        otherwise
            return "not one"
    .
.

function display_result ( s string or null )
    write_line ( """Result: {{s if_null: "null"}}""" )
.

function start
    display_result ( int_to_string ( 1 ) )
    display_result ( int_to_string ( 2 ) )
    display_result ( int_to_string ( null ) )
.
Output:
Result: one
Result: not one
Result: null
Note the """Result: {{s if_null: "null"}}""" expression used in function display_result. We use string interpolation: an expression embedded between a {{ and }} pair. And we use the if_null: operator to provide a string that represents null. Writing s if_null: "null" is similar to s == null ? "null" : s in Java.
If we wanted to print nothing in case of null, we could code """Result: {{? s}}"""
Discussion
Again, the three languages allow us to write code that works correctly.
But there are some notable differences:
- In Haskell and PPL, the functions clearly state that 'the absence of a value' is allowed (i.e. Nothing in Haskell, or null in PPL). In Java, there is no way to distinguish between nullable and non-nullable arguments (except via comments or annotations, of course).
- In Haskell and PPL, the compiler ensures we don't forget to check for 'the absence of a value'. Executing an operation on a possibly Nothing or null value is not allowed. In Java we are left on our own.
Here is a comparison of the three versions of function int_to_string:
- Java
static String intToString ( Integer i ) {
    if ( i == null ) {
        return null;
    } else {
        return i == 1 ? "one" : "not one";
    }
}
- Haskell
intToString :: Maybe Integer -> Maybe String
intToString i = case i of
    Just 1  -> Just "one"
    Nothing -> Nothing
    _       -> Just "not one"
- PPL
New version (not available yet):
function int_to_string ( i pos_32 or null ) -> string or null = \
    case value of i
        when null: null
        when 1   : "one"
        otherwise: "not one"
Current version:
function int_to_string ( i pos_32 or null ) -> string or null
    case value of i
        when null
            return null
        when 1
            return "one"
        otherwise
            return "not one"
    .
.
And here is the function used to display the result:
- Java
static void displayResult ( String s ) {
    String result = s == null ? "null" : s;
    System.out.println ( "Result: " + result );
}
- Haskell
import Data.Maybe (fromMaybe)
displayResult :: Maybe String -> IO ()
displayResult s =
    putStrLn $ "Result: " ++ fromMaybe "null" s
- PPL
function display_result ( s string or null )
    write_line ( """Result: {{s if_null: "null"}}""" )
.
That's it for part 1. In part 2 (to be published soon) we'll have a look at some useful null-handling features used frequently in practice.
Header image by dailyprinciples from Pixabay.