Feedback request on "holistic" programming language idea
DrBearhands
Posted on October 15, 2021
I want to make my own programming language, but I don't want to waste years of my life on a silly idea nobody would ever want, so here I am writing down my ideas and asking for feedback.
Pure, functional programming, particularly the dependently typed variety, holds a lot of potential business value. It lets me express (parts of) the spec as types and ensures my implementation matches the types, and therefore the spec. If there is no spec, the types become the (proto-)spec. The same program in Haskell or Idris will have fewer faults than something in, say, JavaScript, C++, or Java. At least, that is both my personal experience and what I keep hearing from other users. When strictly enforced, it is also a huge boon to security, especially when using third-party libraries, which sometimes steal your cryptocurrencies.
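As a minimal illustration of types-as-spec (in Haskell, with hypothetical names): moving an invariant into the type turns part of the spec into something the compiler checks.

```haskell
import Data.List.NonEmpty (NonEmpty ((:|)))

-- Partial: the spec "return the first element" silently fails at runtime on [].
firstOrBust :: [a] -> a
firstOrBust (x : _) = x
firstOrBust []      = error "no first element"

-- Total: the type NonEmpty a *is* the (proto-)spec "the input has a first element".
firstElem :: NonEmpty a -> a
firstElem (x :| _) = x
```

The second version cannot be called with an empty input, so that whole class of fault is gone before the program runs.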
There is a huge limitation in this approach: specifying a solution as programs. Type-safety ends at I/O, so defining a solution in terms of multiple communicating programs right from the get-go loses type-safety unnecessarily. E.g. in a CRUD application it is more insightful to declare how responses are produced for requests on an API endpoint, without mixing in explicit I/O operations to get data from one machine to the next. In theory, pure functions have explicit data dependencies, so we can split evaluation over nodes at a later point.
I will call the approach of leaving program I/O undefined "holistic programming", as it only specifies the I/O of the whole system/solution.
As an example, if we have a function:

    f : Int
    f = g + h

we could leave the definition as-is in the code, and decide at compile/deploy time to evaluate g on a different node/thread/whatever. If we had to specify this as a program immediately, the type of f would be lifted to IO Int, which tells us next to nothing about error handling, security, or correctness. The code would also be full of explicit communication.
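The contrast can be sketched in Haskell (fetchRemote is a hypothetical stand-in for a network round-trip, not a real API): once distribution is baked into the program, every type is lifted into IO and the definition is dominated by plumbing.

```haskell
-- Holistic style: f is a pure data dependency on g and h; *where*
-- g and h get evaluated is a deployment decision, not part of the code.
g, h :: Int
g = 20
h = 22

f :: Int
f = g + h

-- Program style: the same computation with distribution made explicit.
-- Hypothetical remote call standing in for real networking.
fetchRemote :: String -> IO Int
fetchRemote "g" = pure 20
fetchRemote _   = pure 22

fIO :: IO Int
fIO = do
  g' <- fetchRemote "g"   -- explicit communication
  h' <- fetchRemote "h"
  pure (g' + h')          -- IO Int says nothing about errors, security, or correctness
```

Both compute the same value, but only the first leaves the node/thread decision open.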
There are other such decisions that are often made in code, but (assuming termination and infinite resources) make no difference to the result of the evaluation: forking, vectorizing, monomorphizing, boxing, type-unrolling, type-mapping, evaluation strategy, heap recycling, choice of allocator...
What I'm suggesting is, essentially, a fairly extreme separation of concerns.
But if not with an IO monad, how should I/O be modelled? The precise model would be user-defined, specific to a solution: a CRUD application has conceptually different I/O than an embedded system or a video game. I/O would be declared by abstract linear types c, denoting independent I/O channels, and handler functions c ⊸ c. This is a generalized case of IO ().
Caveats
I realize there are some caveats that make hollistic programming non-trivial. I will list the ones I already thought about:
- Databases/shared state.
- How do I model inherently concurrent state changes (as opposed to concurrency for the sake of speed optimization)? I will have to formalize transactional state changes, and I do not know how to do this elegantly. Examples are multiple clients accessing the same database and asynchronous display/update in video games.
- Data in databases persists over code changes. To preserve correctness and data, compilation would become stateful. This is difficult, but on the flipside, anything is a step up from current ignoramus (et ignorabimus) approaches.
- Adding communication adds potential for errors caused by partitions. Bottoms are inevitable even in the best case due to potential power loss; nevertheless, smaller bottoms are better. I want to decrease uncertainty, not increase it. This could be addressed by lifting to arrows with a timeout condition in the spec, or ignored with a compile argument for non-critical applications.
- Some evaluation strategies (lazy, eager, greedy) may bottom when others don't. I'm not very concerned about this in practice. Most functions in BLOBA (Boring Line-of-Business Applications) are provably terminating in O(n) in my experience. In the worst case scenario, I can just force lazy evaluation on functions that cannot be proved to terminate by argument reduction. Good enough.
- Other processes on a system might break assumptions made by the compiler. E.g. a large constant value stored in a file might get deleted. I don't find this a valid concern. Messing with a system that the compiler assumes is dedicated is like changing bytecode, things will break.
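On the transactional-state caveat: Haskell's STM (from the stm package) is one existing, composable formalization of transactional state changes worth comparing against; this is a sketch of an existing point in the design space, not a claim about what the new language should do.

```haskell
import Control.Concurrent.STM

-- Two accounts updated atomically; STM guarantees no interleaving of
-- concurrent clients can observe or create an inconsistent total.
transfer :: TVar Int -> TVar Int -> Int -> STM ()
transfer from to amount = do
  modifyTVar' from (subtract amount)
  modifyTVar' to   (+ amount)

demo :: IO (Int, Int)
demo = do
  a <- newTVarIO 100
  b <- newTVarIO 0
  atomically (transfer a b 30)  -- each transfer is one indivisible transaction
  atomically (transfer a b 20)
  (,) <$> readTVarIO a <*> readTVarIO b
```

Whether something STM-like can be stretched across machines and code changes is exactly the hard part the caveat points at.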
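On the evaluation-strategy caveat, the classic Haskell illustration: under lazy evaluation an unused divergent argument is never forced, while an eager strategy would loop on the same expression.

```haskell
-- A bottom: evaluating loop never terminates.
loop :: Int
loop = loop

-- Lazy evaluation never forces the second component, so this is just 42.
lazyResult :: Int
lazyResult = fst (42, loop)

-- An eager strategy would evaluate loop before taking fst, and diverge:
-- eagerResult = loop `seq` 42   -- do not run: this never terminates
```

This is the kind of case where forcing lazy evaluation on functions that cannot be proved terminating would be the fallback.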
Other stuff
There are a few other, less novel and ambitious, things I want to achieve in my language
- Limited feature set. I want people to actually try this out. Make dependently typed programming as approachable as possible.
- Removing primitive types from the language. What is primitive depends on architecture. Booleans and floats are good examples of common primitives that don't always exist. User-specified mappings defined at compile-time should be used instead.
- (Semi-)automatic type unrolling/optimization: e.g. {Fin x | x <= 2^32} to int32, and Vect a n to a[n].
- Encourage using correct types over "one size fits all". E.g. fixed point over floats and pretty much anything over string. If nothing else, strings and floats should not be in the prelude.
- Raise warnings, or have a similar mechanism, for dubious bits/changes in code (e.g. unsafe operations).
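The type-unrolling bullet can be sketched in plain Haskell (hypothetical names; the point of the proposal is that the compiler would pick this representation automatically rather than requiring a smart constructor):

```haskell
import Data.Word (Word32)

-- Conceptually {Fin x | x <= 2^32}: a natural number known to fit in 32 bits.
-- A compiler could choose Word32 as the runtime representation; here a
-- smart constructor stands in for that proof obligation.
newtype Fin32 = Fin32 Word32 deriving (Eq, Show)

mkFin32 :: Integer -> Maybe Fin32
mkFin32 n
  | 0 <= n && n < 2 ^ (32 :: Integer) = Just (Fin32 (fromInteger n))
  | otherwise                         = Nothing
```

In the proposed language the bound would live in the type and the check would happen at compile time, so no Maybe would be needed.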
My current plan of attack is to fork Idris 2, remove the things that are incompatible with my ideas (primitive types, interfaces, probably every single library), muck about to get a better feel for how to define systems without I/O monads, and start building a custom, interactive backend. I am not sure whether this is sensible, or whether I will stray too far from Idris to benefit from future updates.
Your feedback is appreciated.