100 Languages Speedrun: Episode 78: Better Whenever Interpreter with Python and SLY
Tomasz Wegrzanowski
Posted on February 3, 2022
A while back I covered Whenever - a language where the program is a todo list. Whenever has no functions, no variables, and the state is just a todo list. As long as there's something on it, it picks an item at random, and performs it.
For this episode, I want to create a better version of Whenever language. It's mostly backwards compatible with the original and can run every Whenever program I could find. However, behavior for edge cases isn't documented anywhere, and absolute compatibility isn't the goal.
I made the following changes to make it more convenient, while keeping the challenge level just high as before:
- todo items can be identified by name, not just line number
- todo items don't need to be in any specific order
- empty lists are allowed
- comments with
//
are allowed - semicolons are optional
- you can mix
print
actions and counter actions in the same todo - if you name one of the todo items as
start
, program will begin with it as the only active action on the todo list - it uses much smarter execution model, so it can run much more ambitious programs than the original interpreter
- it detects some infinite loops if no action can possibly happen and throws an error
- it uses infinite precision integers
-
!
is supported - I didn't add
forget
as it's listed as deprecated
How it works
There's a lot of code coming, and it's only somewhat commented, so here's the big picture overview:
- you can run it with
whenever.py whenever_file.txt
- lexer and parser are implemented with SLY, I recently wrote about it, check it out if you need some help understanding how SLY works
-
WheneverLexer.py
implements the lexer, which turns Whenever program into a stream of tokens like["print", "(", '"Hello, World!"', ")"]
- it also handles trivial things like removing comments -
WheneverParser.py
implements the parser, which turns that stream of tokens into a program -
WheneverEval.py
is all the code that evaluates Whenever expressions, dealing with all the operators, type conversions, and so on -
whenever.py
runs the program using the previous classes
I'll go into a bit of detail when I deal with each of the files.
Examples of Better Whenever Programs
I included all examples from previous episode, as well as all examples from official documentation, but since I want to improve the language I also included a few more.
math.txt
, just a quick test script to check that operators have correct precedence:
a print(300 + 50 * 4 + 80 / 4 - (80 - 30) * 2)
fizzbuzz.txt
, a simple program simplified to take advantage of new features like comments, no-semicolons, and start
, but still using numbers for todo items:
// start
start 10,11
// variables
2 2
3 3
4 4
// printing
5 defer((N(6) + N(7) + N(8) + N(9)) != 3) 6#-N(6),7#-N(7),8#-N(8),9#-N(9)
6 defer(N(3) == 0 || N(4) == 0) print(N(2))
7 defer(3 || N(4) == 0) print("Fizz")
8 defer(N(3) == 0 || 4) print("Buzz")
9 defer(3 || 4) print("FizzBuzz")
// loop logic
10 defer(5 || N(2) >= 100) 2,3,-3#((N(3)/3)*3),4,-4#((N(4)/5)*5),5,6,7,8,9,10
11 defer(5 || N(2) < 100) -2#N(2),-3#N(3),-4#N(4),-10#N(10)
fizzbuzz2.txt
, same, but now with all todo items named, and !
used:
// start
start loop,loopend
// variables
a a
t t
f f
// printing
prn defer((N(pn) + N(pf) + N(pb) + N(pfb)) != 3) -pn#N(pn),-pf#N(pf),-pb#N(pb),-pfb#N(pfb)
pn defer(!t || !f) print(N(a))
pf defer(t || !f) print("Fizz")
pb defer(!t || f) print("Buzz")
pfb defer(t || f) print("FizzBuzz")
// loop logic
loop defer(prn || N(a) >= 100) a,t,-t#((N(t)/3)*3),f,-f#((N(f)/5)*5),prn,pn,pf,pb,pfb,loop
loopend defer(prn || N(a) < 100) -a#N(a),-t#N(t),-f#N(f),-loop#N(loop)
WheneverLexer.py
This is the simplest file. Possibly improvements would be support for escape codes like \n
in strings (but then print
would need to change to stop autoprinting \n
), and maybe some extra features like read
or forget
. I also didn't bother tracking line numbers, so error messages won't be great.
from sly import Lexer
class WheneverLexer(Lexer):
tokens = { PLUS, MINUS, TIMES, DIVIDE, LPAREN, RPAREN, COMMA, HASH, REL, NUM, ID, EOS, STRING, AND, OR, NOT, PRINT, DEFER, AGAIN, N, TRUE, FALSE, MOD }
ignore = " \t\r"
EOS = r"(\n|;|//[^\n]*\n)+"
PLUS = r"\+"
MINUS = r"-"
TIMES = r"\*"
DIVIDE = r"/"
MOD = r"%"
LPAREN = r"\("
RPAREN = r"\)"
COMMA = ","
HASH = "#"
REL = r"<=|<|>=|>|==|!="
OR = r"\|\|"
AND = "&&"
NOT = "!"
STRING = r'"[^"]*"'
NUM = r"[0-9]+"
ID = r"[a-zA-Z_][a-zA-Z0-9_]*"
ID['again'] = AGAIN
ID['defer'] = DEFER
ID['false'] = FALSE
ID['N'] = N
ID['print'] = PRINT
ID['true'] = TRUE
def NUM(self, t):
t.value = int(t.value)
return t
def STRING(self, t):
t.value = t.value[1:-1]
return t
SLY makes defining lexers really easy.
WheneverParser.py
The parser is long, but not really complicated. For AST I just use Python dictionaries, and I use explicit rules like exprm
instead of precedence tables, as I'm more used to this style. SLY by default has really weird error handling (print error to stderr, do nothing), so we need to override it to throw proper exceptions. I used the same messages as would be printed.
For simplicity, I don't distinguish between identifier (basically symbols) and strings, so you can do N("loop")
or print(Fizz)
, even though that shouldn't really be supported. It's largely harmless for now.
Most of the syntax is fairly obvious. One somewhat unusual thing I did was explicitly disallowing multiple comparison at the same level without parentheses, like (1 == 2 == 3
or 1 < 2 == 3 < 4
), as that's just really confusing.
from sly import Parser
from WheneverLexer import WheneverLexer
class WheneverParser(Parser):
tokens = WheneverLexer.tokens
@_("")
def program(self, p):
return {}
@_("EOS program")
def program(self, p):
return p.program
@_("todo program")
def program(self, p):
program = p.program
id, todo = p.todo
if id in program:
raise Exception(f"Duplicate todo id: {id}")
program[id] = todo
return program
@_("id modifiers actions EOS")
def todo(self, p):
again = []
defer = []
for mod in p.modifiers:
if "again" in mod:
again.append(mod["again"])
if "defer" in mod:
defer.append(mod["defer"])
return (p.id, {"again": again, "defer": defer, "actions": p.actions})
@_("")
def modifiers(self, p):
return []
@_("modifier modifiers")
def modifiers(self, p):
return [p.modifier] + p.modifiers
@_("DEFER LPAREN expr RPAREN")
def modifier(self, p):
return {"defer": p.expr}
@_("AGAIN LPAREN expr RPAREN")
def modifier(self, p):
return {"again": p.expr}
@_("action")
def actions(self, p):
return [p.action]
@_("action COMMA actions")
def actions(self, p):
return [p.action] + p.actions
@_("id")
def action(self, p):
return {"action": "change", "id": p.id, "count": {"value": 1}}
@_("MINUS id")
def action(self, p):
return {"action": "change", "id": p.id, "count": {"value": -1}}
@_("id HASH expr")
def action(self, p):
return {"action": "change", "id": p.id, "count": p.expr}
@_("MINUS id HASH expr")
def action(self, p):
return {"action": "change", "id": p.id, "count": {"op": "uminus", "a": p.expr}}
@_("PRINT LPAREN expr RPAREN")
def action(self, p):
return {"action": "print", "arg": p.expr}
# expr, going through all the priority levels
@_("exprlo")
def expr(self, p):
return p.exprlo
### expr logical or
@_("exprla")
def exprlo(self, p):
return p.exprla
@_("exprlo OR exprla")
def exprlo(self, p):
return {"op": "||", "a": p.exprlo, "b": p.exprla}
### expr logical and
@_("exprr")
def exprla(self, p):
return p.exprr
@_("exprla AND exprr")
def exprla(self, p):
return {"op": "&&", "a": p.exprla, "b": p.exprr}
### expr relational
### do not support chaining them without parentheses
@_("expra")
def exprr(self, p):
return p.expra
@_("expra REL expra")
def exprr(self, p):
return {"op": p.REL, "a": p.expra0, "b": p.expra1}
### expr additive
@_("expra PLUS exprm")
def expra(self, p):
return {"op": "+", "a": p.expra, "b": p.exprm}
@_("expra MINUS exprm")
def expra(self, p):
return {"op": "-", "a": p.expra, "b": p.exprm}
@_("exprm")
def expra(self, p):
return p.exprm
### expr multiplicative
@_("exprm TIMES expru")
def exprm(self, p):
return {"op": "*", "a": p.exprm, "b": p.expru}
@_("exprm DIVIDE expru")
def exprm(self, p):
return {"op": "/", "a": p.exprm, "b": p.expru}
@_("exprm MOD expru")
def exprm(self, p):
return {"op": "%", "a": p.exprm, "b": p.expru}
#### expr unary
@_("expru")
def exprm(self, p):
return p.expru
@_("value")
def expru(self, p):
return p.value
@_("NOT expru")
def expru(self, p):
return {"op": "!", "a": p.expru}
@_("MINUS expru")
def expru(self, p):
return {"op": "uminus", "a": p.expru}
### primitive value
@_("LPAREN expr RPAREN")
def value(self, p):
return p.expr
# No reason to distinguish between foo and "foo" yet
@_("ID")
def value(self, p):
return {"value": p.ID}
@_("NUM")
def value(self, p):
return {"value": p.NUM}
@_("N LPAREN id RPAREN")
def value(self, p):
return {"N": p.id}
@_("STRING")
def value(self, p):
return {"value": p.STRING}
@_("TRUE")
def value(self, p):
return {"value": True}
@_("FALSE")
def value(self, p):
return {"value": False}
@_("ID")
def id(self, p):
return p.ID
@_("NUM")
def id(self, p):
return p.NUM
def error(self, token):
if token:
lineno = getattr(token, "lineno", 0)
if lineno:
raise Exception(f"sly: Syntax error at line {lineno}, token={token.type}")
else:
raise Exception(f"sly: Syntax error, token={token.type}")
else:
raise Exception("sly: Parse error in input. EOF")
WheneverEval.py
Evaluating expressions involves so much code I put it into a separate file. Whenever has unusual type coercions - for example 3 || !4
means N(3) > 0 || !(N(4) > 0)
. Whenever also converts everything to string if added to a string, and converts strings (just initial digits, or 0 if there aren't any initial digits) and booleans to ints when used in int context.
I make comparison operators >
, <
, >=
, <=
, ==
, and !=
also convert to integers. Whenever doesn't really do anything useful with strings except printing them anyway, so there's never any point doing "foo" == "bar"
anyway.
Things like type coercions are among things that could be improved, but that would cost backwards compatibility.
class WheneverEval:
def __init__(self, program):
self.program = program
def int_eval_node(self, node):
a = self.int_eval(node["a"])
b = self.int_eval(node["b"])
return (a, b)
def int_eval(self, node):
return self.int(self.eval(node))
def bool_eval(self, node):
return self.bool(self.eval(node))
def str_eval(self, node):
return self.str(self.eval(node))
# A lot of aggressive casting to int here
def eval(self, node):
if "value" in node:
return node["value"]
if "N" in node:
return self.program.counters[node["N"]]
if "op" not in node:
raise Exception(f"Neither op nor value? {node}")
op = node["op"]
if op == "-":
(a, b) = self.int_eval_node(node)
return a - b
if op == "*":
(a, b) = self.int_eval_node(node)
return a * b
if op == "/":
(a, b) = self.int_eval_node(node)
return a // b
if op == "%":
(a, b) = self.int_eval_node(node)
return a % b
# Not even clear about this one:
if op == "+":
a = self.eval(node["a"])
b = self.eval(node["b"])
if isinstance(a, str) or isinstance(b, str):
return self.str(a) + self.str(b)
else:
return self.int(a) + self.int(b)
if op == "uminus":
a = self.int_eval(node["a"])
return -a
if op == "||":
a = self.bool_eval(node["a"])
b = self.bool_eval(node["b"])
return a or b
if op == "&&":
a = self.bool_eval(node["a"])
b = self.bool_eval(node["b"])
return a and b
if op == "!":
a = self.bool_eval(node["a"])
return not a
if op == "<":
(a, b) = self.int_eval_node(node)
return a < b
if op == "<=":
(a, b) = self.int_eval_node(node)
return a <= b
if op == ">":
(a, b) = self.int_eval_node(node)
return a > b
if op == ">=":
(a, b) = self.int_eval_node(node)
return a >= b
if op == "==":
(a, b) = self.int_eval_node(node)
return a == b
if op == "!=":
(a, b) = self.int_eval_node(node)
return a != b
raise Exception(f"TODO: Operation {op} not supported yet")
def int(self, value):
if isinstance(value, bool):
if value == True:
return 1
if value == False:
return 0
if isinstance(value, int):
return value
if isinstance(value, str):
if re.match(r"^-?\d+", value):
return int(value)
else:
return 0
raise Exception(f"Invalid type {value}")
def bool(self, value):
if isinstance(value, bool):
return value
return self.program.counters[value] > 0
def str(self, value):
if isinstance(value, bool):
if value == True:
return "true"
if value == False:
return "false"
if isinstance(value, str):
return value
if isinstance(value, int):
return str(value)
raise Exception(f"Invalid type {value}")
whenever.py
And finally the main program.
The main execution loop is like this:
- if there's
start
symbol, start with counters at{start: 1, evertyhing_else: 0}
. Otherwise give everything starting counter of 1. - mark any rule like
n n
as trivial - these do nothing when executed, we still need to track their value, but we never need to actually run them, as program state is same before and after running it (we could ignore defer here etc. for more optimization) - as long as there's a nontrivial rule with nonzero counter, pick something at random at run it
Trivial next todo selection algorithm would result in exceedingly slow execution. For example fib
from official documentation is so slow in it that it never finishes. With this improved interpreter it finishes in almost no time.
- every time, we copy list of all nontrivial rules as
actionable
- we pick a rule at random from among
actionable
- we check if rule is deferred, if yes, we remove it from
actionable
set and try again - if program isn't finished, but all rules are trivial or deferred, we raise exception instead of doing infinite loop
This means if we have situation like this:
- N(1) - 17167680177566 times, deferred while N(3) > 0
- N(2) - 27777890035289 times, deferred while N(3) > 0
- N(3) - 1 times, actionable
We don't have to keep re-rolling 17167680177566+27777890035289+1 times until we get to run todo item 3.
The algorithm is lazy, so we only check deferred rule if we rolled it, so normally it shouldn't slow us down. The interpreter is still not very optimized, and it's definitely possible to write very slow programs for it, but at least it avoids those exponential slowdowns for such very simple programs.
#!/usr/bin/env python3
import sys
from WheneverLexer import WheneverLexer
from WheneverParser import WheneverParser
from WheneverEval import WheneverEval
from random import randint
class WheneverProgram:
def __init__(self, path):
self.path = path
with open(path) as f:
self.text = f.read()
lexer = WheneverLexer()
parser = WheneverParser()
self.program = parser.parse(lexer.tokenize(self.text))
self.eval = WheneverEval(self)
# A lot more optimizations are possible
# total is the sum of all nontrivial counters
def init_counters(self):
self.total = 0
self.trivial = set()
self.nontrivial = set()
self.counters = {}
has_start = ("start" in self.program)
for id in self.program.keys():
if id == "start" or not has_start:
starting_value = 1
else:
starting_value = 0
self.counters[id] = starting_value
is_trivial = (self.program[id] == {'again': [], 'defer': [], 'actions': [{'action': 'change', 'id': id, 'count': {'value': 1}}]})
if is_trivial:
self.trivial.add(id)
else:
self.nontrivial.add(id)
self.total += starting_value
def is_deferred(self, id):
for expr in self.program[id]["defer"]:
if self.eval.bool_eval(expr):
return True
return False
def random_todo(self, actionable, actionable_total):
i = randint(0, actionable_total - 1)
for id in actionable:
i -= self.counters[id]
if i < 0:
return id
raise Exception("Math failure")
def execute_random_todo(self):
actionable = self.nontrivial.copy()
actionable_total = self.total
while True:
if actionable_total == 0:
raise Exception("All actions are deferred")
id = self.random_todo(actionable, actionable_total)
if self.is_deferred(id):
actionable.remove(id)
actionable_total -= self.counters[id]
else:
self.execute_todo(id)
return
# defer is checked before we call this
def execute_todo(self, id):
todo = self.program[id]
again = todo["again"]
remove_todo = True
for expr in todo["defer"]:
if self.eval.bool_eval(expr):
remove_todo = False
for action in todo["actions"]:
self.execute_action(action)
if remove_todo:
self.update_counter(id, -1)
def execute_action(self, action):
action_type = action["action"]
if action_type == "print":
s = self.eval.str_eval(action["arg"])
print(s)
elif action_type == "change":
id = action["id"]
count = self.eval.int_eval(action["count"])
self.update_counter(id, count)
else:
raise Exception(f"Unknown action: {action_type}")
def update_counter(self, id, change):
old_value = self.counters[id]
new_value = old_value + change
if new_value < 0:
new_value = 0
self.counters[id] = new_value
if id not in self.trivial:
self.total += (new_value - old_value)
def run(self):
self.init_counters()
while self.total > 0:
self.execute_random_todo()
if sum(self.counters.values()) != 0:
raise Exception("Program entered do-nothing infinite loop")
program = WheneverProgram(sys.argv[1])
program.run()
Should you try Better Whenever?
I think the improved version is much more friendly, as things like comments, variable names, start
rule just make the whole coding process more pleasant, and you can focus on doing crazy things in a language with esoteric execution model without all the hassle of turning everything into numbers.
It also definitely helps that this interpreter avoids a lot of exponential slowdown.
Like everything else in this series, I hacked this program during one evening, so it's not perfect. If you're interested in improving it further, let me know, and I can setup proper repository, documentation, specs etc. for it.
This interpreter follows the original idea very closely, but it's definitely possible to do something more creative with the "program as todo list" idea, just like AsciiDots really evolved Befunge idea.
Code
All code examples for the series will be in this repository.
Code for the Better Whenever Interpreter with Python and SLY episode is available here.
Posted on February 3, 2022
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.
Related
February 3, 2022