100 Languages Speedrun: Episode 37: OCaml

taw

Tomasz Wegrzanowski

Posted on December 27, 2021


OCaml is a functional programming language with a very weird static type system (once you get past the basics), and what's possibly the ugliest syntax of any major programming language. Double semicolons are only the start of it.

Hello, World!

(* Hello, World! in OCaml *)
print_string "Hello, World!\n"

Multiple statements

So far it wasn't bad. Let's try to define a function and call it:

let ask_for_name () = (
  print_string "What's your name? ";
  read_line()
);;

print_string ("Hello, "^ask_for_name()^"!\n");;
Running it:

$ ocaml name.ml
What's your name? Kitty
Hello, Kitty!

So that worked, but that syntax is quite awful: ; sequences expressions inside a definition, and ;; ends each top-level phrase.
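For what it's worth, the double semicolons are mostly needed in the interactive toplevel. A minimal sketch of a file without any ;;, assuming the common convention of putting top-level side effects inside let () = ...; the greeting helper is a name made up for this illustration, and read_line is left out so the sketch runs non-interactively:

```ocaml
(* greeting is a hypothetical helper, not part of the example above *)
let greeting name = "Hello, " ^ name ^ "!\n"

(* A single ; still sequences expressions inside one definition,
   but no ;; is needed because every top-level phrase starts with let *)
let () =
  print_string (greeting "Kitty");
  print_string (greeting "World")
```

Whether this reads any better than the ;; version is a matter of taste.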

Fibonacci

let rec fib n =
  if n <= 2
    then 1
    else fib (n - 1) + fib (n - 2)
;;

for i = 1 to 20 do
  print_string("fib(" ^ (string_of_int(i)) ^ ")=" ^ (string_of_int(fib(i))) ^ "\n")
done;;

There's no string interpolation, and we need explicit type conversions. In OCaml we don't even do the usual (for verbose statically typed languages) "convert to X" - we need specifically "convert to X from Y" (string_of_int rather than a generic to_string), so it's twice as verbose.

To define a recursive function we need to write let rec instead of the usual let, which is a stupid design decision a lot of functional languages make.
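For reference, here is what the distinction does in practice. This is a small sketch with made-up names (step, countdown): without rec, the right-hand side of a definition sees the previous binding of the same name, and with rec it sees the function being defined.

```ocaml
(* Without rec, "step" on the right-hand side is the earlier definition *)
let step n = n + 1
let step n = step n * 2   (* calls the first step, so this is (n + 1) * 2 *)

(* With rec, the name refers to the function being defined *)
let rec countdown n = if n <= 0 then [] else n :: countdown (n - 1)

let () =
  Printf.printf "step 3 = %d\n" (step 3);
  List.iter (Printf.printf "%d ") (countdown 3);
  print_newline ()
```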

Oh and I'm typing a few parentheses more than idiomatic OCaml code would use, as I think that's going to be more readable for non-OCaml developers.

Unicode

Not only does OCaml have no string interpolation, it doesn't even have any equivalent of console.log. Well, why don't we create our own!

type printable =
    S of string
  | I of int
;;

let printable_to_string = function
    S s -> s
  | I i -> string_of_int i
;;

let rec string_join sep = function
    [] -> ""
  | [s] -> s
  | (s::ss) -> s ^ sep ^ (string_join sep ss)
;;

let console_log list =
  print_string ((string_join " " (List.map printable_to_string list)) ^ "\n")
;;

console_log [(S "Length of [1; 2; 3] is "); (I (List.length [1; 2; 3]))];;
console_log [(S "Length of \"Hello\" is"); (I (String.length "Hello"))];;
console_log [(S "Length of \"Żółw\" is"); (I (String.length "Żółw"))];;
console_log [(S "Length of \"💩\" is"); (I (String.length "💩"))];;

The results for the non-ASCII strings are wrong, of course:

$ ocaml unicode.ml
Length of [1; 2; 3] is  3
Length of "Hello" is 5
Length of "Żółw" is 7
Length of "💩" is 4

But first, what the hell is even going on here!

  • OCaml doesn't have any "polymorphic" functions that would accept multiple types - every function has just one type. So it's List.length for the length of a list, String.length for the length of a string, and so on. There's no easy way around it.
  • There's no String.join (the standard library calls it String.concat), but it's not too hard to write our own. Pattern matching is pretty decent.
  • We can define a custom data type that's either S string or I int, and then wrap every value we pass in the matching constructor. By the way, these constructor names generally need to be unique within a scope, so you can't easily reuse them in some different interface which would take I int | F float.
  • Once we wrap everything in the right type wrapper, we can send it to our console_log.
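The standard library does ship a join under a different name: String.concat takes the separator first and the list second. A sketch of the same logger built on it; console_log_line is a name invented here, and it returns the line instead of printing it so it is easy to check:

```ocaml
type printable = S of string | I of int

let printable_to_string = function
  | S s -> s
  | I i -> string_of_int i

(* String.concat sep list does the job of the hand-written string_join *)
let console_log_line items =
  String.concat " " (List.map printable_to_string items) ^ "\n"

let () =
  print_string
    (console_log_line [S "Length of \"Hello\" is"; I (String.length "Hello")])
```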

With this much pain for such a simple thing, does it get easier for more complex code? It does not, it gets a lot worse.

OCaml has a few outs. It has some (atrocious) macro functionality, and the type checker special-cases format string literals, which is what allows Printf.printf (with a static template only). And it has "polymorphic variants", which at least let you reuse those wrappers, so with a lot of extra explicit type declarations you can have one function take S | I and another take I | F. At least that's the idea; it runs into a lot of problems in practice.
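A minimal sketch of the polymorphic variant idea, with hypothetical function names show_si and show_if. The backtick-prefixed tags are not tied to a single type definition, so both functions can share the `I tag. This tiny case happens to need no annotations.

```ocaml
(* Accepts `S or `I *)
let show_si = function
  | `S s -> s
  | `I i -> string_of_int i

(* Accepts `I or `F, reusing the same `I tag *)
let show_if = function
  | `I i -> string_of_int i
  | `F f -> string_of_float f

let () =
  print_endline (show_si (`S "hello"));
  print_endline (show_si (`I 42));
  print_endline (show_if (`I 42));
  print_endline (show_if (`F 1.5))
```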

Oh, and you might have noticed that OCaml has no idea what Unicode even is. String.length counts bytes, not characters, so every answer for a non-ASCII string was wrong.
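If you need character counts, one workaround is to count UTF-8 code points by hand. The sketch below uses a hypothetical helper, utf8_length, which assumes the input is valid UTF-8 and counts code points rather than grapheme clusters; it works by skipping continuation bytes (those of the form 10xxxxxx).

```ocaml
(* Count every byte that is not a UTF-8 continuation byte.
   Assumes valid UTF-8 input; counts code points, not graphemes. *)
let utf8_length s =
  let count = ref 0 in
  String.iter (fun c -> if Char.code c land 0xC0 <> 0x80 then incr count) s;
  !count

let () =
  List.iter
    (fun s ->
      Printf.printf "%s: %d bytes, %d code points\n"
        s (String.length s) (utf8_length s))
    ["Hello"; "Żółw"; "💩"]
```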

FizzBuzz

No new issues here:

let fizzbuzz i =
  (* Use = (structural equality); == is physical equality *)
  if i mod 15 = 0
    then "FizzBuzz"
    else if i mod 5 = 0
      then "Buzz"
      else if i mod 3 = 0
        then "Fizz"
        else string_of_int i
;;

for i = 1 to 100 do
  print_string (fizzbuzz(i) ^ "\n")
done;;

Pythagorean theorem

I keep saying that in OCaml every function must have unique input types, and there are no polymorphic functions at all. A small note on terminology, as "polymorphic" refers to two different things. It can mean functions like List.length, which work on a list of any type because they don't look inside the elements (I'd just call these "generic functions"; the formal name is "parametric polymorphism"). Or it can mean functions like length that work with multiple container types (sometimes called "ad-hoc polymorphism" - but that's what most people mean by "polymorphic"). OCaml has generic functions, but no ad-hoc polymorphism, and it's extremely committed to that.
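The two meanings side by side, as a small sketch. The count function is a hand-written stand-in for List.length, introduced here only for illustration:

```ocaml
(* Parametric polymorphism: one definition, any element type,
   because the function never inspects the elements *)
let rec count = function
  | [] -> 0
  | _ :: rest -> 1 + count rest

let () =
  Printf.printf "%d %d\n" (count [1; 2; 3]) (count ["a"; "b"]);
  (* No ad-hoc polymorphism: count "Hello" is rejected by the type
     checker, so strings need their own function *)
  Printf.printf "%d\n" (String.length "Hello")
```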

How committed? Well, you can't even + two floats.

let a = 3.0;;
let b = 4.0;;
let c = Float.sqrt(a *. a +. b *. b);;

Printf.printf "%f^2 + %f^2 = %f^2\n" a b c;;

Every type has its own +, and *, and so on.

The only tiny exception to this is that =, > etc are polymorphic. Of course types on both sides must be the same.
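A quick sketch of that exception; larger is a made-up helper built on the polymorphic > operator:

```ocaml
(* > works on any type, as long as both sides have the same one *)
let larger a b = if a > b then a else b

let () =
  Printf.printf "%d\n" (larger 1 2);
  Printf.printf "%f\n" (larger 1.5 2.5);
  Printf.printf "%s\n" (larger "abc" "abd")
  (* larger 1 2.0 would not type-check: int on one side, float on the other *)
```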

Printf.printf saves us from a lot of nasty code.
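To see how much it saves, here is a sketch of the earlier Fibonacci loop with Printf.sprintf instead of string_of_int and ^. The fib_line helper is a name introduced for this example:

```ocaml
let rec fib n = if n <= 2 then 1 else fib (n - 1) + fib (n - 2)

(* %d expects an int; passing anything else is a compile-time type error *)
let fib_line i = Printf.sprintf "fib(%d)=%d" i (fib i)

let () =
  for i = 1 to 20 do
    print_endline (fib_line i)
  done
```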

Custom operators

One consequence of needing so many different operator variants is that OCaml lets us (and pretty much forces us to) define our own operators. For example this defines +& as addition of two 2D points:

type point = {x: float; y: float};;

let (+&) a b = {x=a.x +. b.x; y = a.y +. b.y};;

let a = {x=1.0; y=2.0};;
let b = {x=2.0; y=5.0};;
let c = a +& b;;

Printf.printf "<%f,%f>\n" c.x c.y;;

Of course we cannot do anything polymorphic here. 2D points of ints, or 3D points of floats, or anything like that would all need their own symbols.

The first character of an operator determines its precedence, so a +& b *& c would be treated as a +& (b *& c).
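A sketch that checks this. The *& operator is not defined anywhere above; here it is assumed to be a component-wise product, purely to have something to combine with +&. Note the spaces in ( *& ), without which (* would start a comment.

```ocaml
type point = {x: float; y: float}

let (+&) a b = {x = a.x +. b.x; y = a.y +. b.y}
(* Hypothetical component-wise product, added only to show precedence *)
let ( *& ) a b = {x = a.x *. b.x; y = a.y *. b.y}

let a = {x = 1.0; y = 2.0}
let b = {x = 2.0; y = 5.0}
let c = {x = 3.0; y = 4.0}

let () =
  (* Parsed as a +& (b *& c), giving x = 7 and y = 22.
     Grouping the other way would give x = 9 and y = 28. *)
  let r = a +& b *& c in
  Printf.printf "<%f,%f>\n" r.x r.y
```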

It's better than having to do Point2D.add etc., but it's still miserable compared to just having + work on everything, like it works in most other languages.

Oh, and OCaml doesn't love it when you reuse field names between different types. So type point3 = {x: float; y: float; z: float};; isn't forbidden, but it causes issues and can require a lot of manual type annotations.
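A sketch of what those annotations look like, assuming a reasonably recent compiler with type-directed field disambiguation; norm2 and norm3 are made-up helper names. Without an annotation, a.x resolves to the most recently defined type with an x field.

```ocaml
type point = {x: float; y: float}
type point3 = {x: float; y: float; z: float}

(* No annotation: a.x picks the latest type with an x field, point3 *)
let norm3 a = Float.sqrt (a.x *. a.x +. a.y *. a.y +. a.z *. a.z)

(* To use the older type, the argument has to be annotated *)
let norm2 (a : point) = Float.sqrt (a.x *. a.x +. a.y *. a.y)

let () =
  let p : point = {x = 3.0; y = 4.0} in
  Printf.printf "%f %f\n" (norm2 p) (norm3 {x = 1.0; y = 2.0; z = 2.0})
```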

Should you use OCaml?

No. And I'm saying that as someone who did a lot of OCaml back in the day.

OCaml offered a mix of features that was somewhat appealing a few decades ago - it's a functional garbage-collected language, statically compiled to speeds comparable to Java, with easy to understand eager semantics (no laziness and monads), and syntax which while godawful at least doesn't have millions of parentheses. All alternatives back then were either not really functional (C, Java), too parenthesized (Lisp), semantically too weird (Haskell), or too slow (Lisp, Ruby; generally Haskell too unless you put a lot of effort to work around its laziness).

Nowadays most languages have sufficient functional features (even totally non-functional ones like Kotlin), there's a plethora of LLVM-based languages that are fast enough, so OCaml's niche disappeared - and it was a small niche to begin with.

OCaml is also in a weird situation where a lot of users simply don't use the standard "standard library" and instead replace it with their own. And there are multiple such efforts. So the thing they're using, is it even OCaml? Either way, a replacement library can't fix the core language issues.

OCaml has far too many quirks and an atrocious syntax, and it lacks convenience functions provided by pretty much every language these days. There's no payoff for putting up with all that. Unless you want a job at Jane Street Capital I guess (who have their own replacement standard library too).

Fully committed "Functional Programming" didn't get far, but then neither did fully committed "Object Oriented Programming" (that's just Smalltalk and Ruby). But half-assed functional programming, just like half-assed OOP, is everywhere these days, and it's good enough. Either pick a language that does good-enough functional programming (which is most of them), or accept the sacrifices of something like Ruby, Haskell, Clojure, or Racket; they're all much less painful than OCaml's.

Code

All code examples for the series will be in this repository.

Code for the OCaml episode is available here.
