100 Languages Speedrun: Episode 78: Better Whenever Interpreter with Python and SLY

taw

Tomasz Wegrzanowski

Posted on February 3, 2022

100 Languages Speedrun: Episode 78: Better Whenever Interpreter with Python and SLY

A while back I covered Whenever - a language where the program is a todo list. Whenever has no functions, no variables, and the state is just a todo list. As long as there's something on it, it picks an item at random, and performs it.

For this episode, I want to create a better version of Whenever language. It's mostly backwards compatible with the original and can run every Whenever program I could find. However, behavior for edge cases isn't documented anywhere, and absolute compatibility isn't the goal.

I made the following changes to make it more convenient, while keeping the challenge level just high as before:

  • todo items can be identified by name, not just line number
  • todo items don't need to be in any specific order
  • empty lists are allowed
  • comments with // are allowed
  • semicolons are optional
  • you can mix print actions and counter actions in the same todo
  • if you name one of the todo items as start, program will begin with it as the only active action on the todo list
  • it uses much smarter execution model, so it can run much more ambitious programs than the original interpreter
  • it detects some infinite loops if no action can possibly happen and throws an error
  • it uses infinite precision integers
  • ! is supported
  • I didn't add forget as it's listed as deprecated

How it works

There's a lot of code coming, and it's only somewhat commented, so here's the big picture overview:

  • you can run it with whenever.py whenever_file.txt
  • lexer and parser are implemented with SLY, I recently wrote about it, check it out if you need some help understanding how SLY works
  • WheneverLexer.py implements the lexer, which turns Whenever program into a stream of tokens like ["print", "(", '"Hello, World!"', ")"] - it also handles trivial things like removing comments
  • WheneverParser.py implements the parser, which turns that stream of tokens into a program
  • WheneverEval.py is all the code that evaluates Whenever expressions, dealing with all the operators, type conversions, and so on
  • whenever.py runs the program using the previous classes

I'll go into a bit of detail when I deal with each of the files.

Examples of Better Whenever Programs

I included all examples from previous episode, as well as all examples from official documentation, but since I want to improve the language I also included a few more.

math.txt, just a quick test script to check that operators have correct precedence:

a print(300 + 50 * 4 + 80 / 4 - (80 - 30) * 2)
Enter fullscreen mode Exit fullscreen mode

fizzbuzz.txt, a simple program simplified to take advantage of new features like comments, no-semicolons, and start, but still using numbers for todo items:

// start
start 10,11
// variables
2 2
3 3
4 4
// printing
5 defer((N(6) + N(7) + N(8) + N(9)) != 3) 6#-N(6),7#-N(7),8#-N(8),9#-N(9)
6 defer(N(3) == 0 || N(4) == 0) print(N(2))
7 defer(3 || N(4) == 0) print("Fizz")
8 defer(N(3) == 0 || 4) print("Buzz")
9 defer(3 || 4) print("FizzBuzz")
// loop logic
10 defer(5 || N(2) >= 100) 2,3,-3#((N(3)/3)*3),4,-4#((N(4)/5)*5),5,6,7,8,9,10
11 defer(5 || N(2) < 100) -2#N(2),-3#N(3),-4#N(4),-10#N(10)
Enter fullscreen mode Exit fullscreen mode

fizzbuzz2.txt, same, but now with all todo items named, and ! used:

// start
start loop,loopend
// variables
a a
t t
f f
// printing
prn defer((N(pn) + N(pf) + N(pb) + N(pfb)) != 3) -pn#N(pn),-pf#N(pf),-pb#N(pb),-pfb#N(pfb)
pn defer(!t || !f) print(N(a))
pf defer(t || !f) print("Fizz")
pb defer(!t || f) print("Buzz")
pfb defer(t || f) print("FizzBuzz")
// loop logic
loop defer(prn || N(a) >= 100) a,t,-t#((N(t)/3)*3),f,-f#((N(f)/5)*5),prn,pn,pf,pb,pfb,loop
loopend defer(prn || N(a) < 100) -a#N(a),-t#N(t),-f#N(f),-loop#N(loop)
Enter fullscreen mode Exit fullscreen mode

WheneverLexer.py

This is the simplest file. Possibly improvements would be support for escape codes like \n in strings (but then print would need to change to stop autoprinting \n), and maybe some extra features like read or forget. I also didn't bother tracking line numbers, so error messages won't be great.

from sly import Lexer

class WheneverLexer(Lexer):
  tokens = { PLUS, MINUS, TIMES, DIVIDE, LPAREN, RPAREN, COMMA, HASH, REL, NUM, ID, EOS, STRING, AND, OR, NOT, PRINT, DEFER, AGAIN, N, TRUE, FALSE, MOD }

  ignore = " \t\r"
  EOS = r"(\n|;|//[^\n]*\n)+"
  PLUS = r"\+"
  MINUS = r"-"
  TIMES = r"\*"
  DIVIDE = r"/"
  MOD = r"%"
  LPAREN = r"\("
  RPAREN = r"\)"
  COMMA = ","
  HASH = "#"
  REL = r"<=|<|>=|>|==|!="
  OR = r"\|\|"
  AND = "&&"
  NOT = "!"
  STRING = r'"[^"]*"'
  NUM = r"[0-9]+"
  ID = r"[a-zA-Z_][a-zA-Z0-9_]*"

  ID['again'] = AGAIN
  ID['defer'] = DEFER
  ID['false'] = FALSE
  ID['N'] = N
  ID['print'] = PRINT
  ID['true'] = TRUE

  def NUM(self, t):
    t.value = int(t.value)
    return t

  def STRING(self, t):
    t.value = t.value[1:-1]
    return t
Enter fullscreen mode Exit fullscreen mode

SLY makes defining lexers really easy.

WheneverParser.py

The parser is long, but not really complicated. For AST I just use Python dictionaries, and I use explicit rules like exprm instead of precedence tables, as I'm more used to this style. SLY by default has really weird error handling (print error to stderr, do nothing), so we need to override it to throw proper exceptions. I used the same messages as would be printed.

For simplicity, I don't distinguish between identifier (basically symbols) and strings, so you can do N("loop") or print(Fizz), even though that shouldn't really be supported. It's largely harmless for now.

Most of the syntax is fairly obvious. One somewhat unusual thing I did was explicitly disallowing multiple comparison at the same level without parentheses, like (1 == 2 == 3 or 1 < 2 == 3 < 4), as that's just really confusing.

from sly import Parser
from WheneverLexer import WheneverLexer

class WheneverParser(Parser):
  tokens = WheneverLexer.tokens

  @_("")
  def program(self, p):
    return {}

  @_("EOS program")
  def program(self, p):
    return p.program

  @_("todo program")
  def program(self, p):
    program = p.program
    id, todo = p.todo
    if id in program:
      raise Exception(f"Duplicate todo id: {id}")
    program[id] = todo
    return program

  @_("id modifiers actions EOS")
  def todo(self, p):
    again = []
    defer = []
    for mod in p.modifiers:
      if "again" in mod:
        again.append(mod["again"])
      if "defer" in mod:
        defer.append(mod["defer"])
    return (p.id, {"again": again, "defer": defer, "actions": p.actions})

  @_("")
  def modifiers(self, p):
    return []

  @_("modifier modifiers")
  def modifiers(self, p):
    return [p.modifier] + p.modifiers

  @_("DEFER LPAREN expr RPAREN")
  def modifier(self, p):
    return {"defer": p.expr}

  @_("AGAIN LPAREN expr RPAREN")
  def modifier(self, p):
    return {"again": p.expr}

  @_("action")
  def actions(self, p):
    return [p.action]

  @_("action COMMA actions")
  def actions(self, p):
    return [p.action] + p.actions

  @_("id")
  def action(self, p):
    return {"action": "change", "id": p.id, "count": {"value": 1}}

  @_("MINUS id")
  def action(self, p):
    return {"action": "change", "id": p.id, "count": {"value": -1}}

  @_("id HASH expr")
  def action(self, p):
    return {"action": "change", "id": p.id, "count": p.expr}

  @_("MINUS id HASH expr")
  def action(self, p):
    return {"action": "change", "id": p.id, "count": {"op": "uminus", "a": p.expr}}

  @_("PRINT LPAREN expr RPAREN")
  def action(self, p):
    return {"action": "print", "arg": p.expr}

  # expr, going through all the priority levels
  @_("exprlo")
  def expr(self, p):
    return p.exprlo

  ### expr logical or
  @_("exprla")
  def exprlo(self, p):
    return p.exprla

  @_("exprlo OR exprla")
  def exprlo(self, p):
    return {"op": "||", "a": p.exprlo, "b": p.exprla}

  ### expr logical and
  @_("exprr")
  def exprla(self, p):
    return p.exprr

  @_("exprla AND exprr")
  def exprla(self, p):
    return {"op": "&&", "a": p.exprla, "b": p.exprr}

  ### expr relational
  ### do not support chaining them without parentheses
  @_("expra")
  def exprr(self, p):
    return p.expra

  @_("expra REL expra")
  def exprr(self, p):
    return {"op": p.REL, "a": p.expra0, "b": p.expra1}

  ### expr additive
  @_("expra PLUS exprm")
  def expra(self, p):
    return {"op": "+", "a": p.expra, "b": p.exprm}

  @_("expra MINUS exprm")
  def expra(self, p):
    return {"op": "-", "a": p.expra, "b": p.exprm}

  @_("exprm")
  def expra(self, p):
    return p.exprm

  ### expr multiplicative
  @_("exprm TIMES expru")
  def exprm(self, p):
    return {"op": "*", "a": p.exprm, "b": p.expru}

  @_("exprm DIVIDE expru")
  def exprm(self, p):
    return {"op": "/", "a": p.exprm, "b": p.expru}

  @_("exprm MOD expru")
  def exprm(self, p):
    return {"op": "%", "a": p.exprm, "b": p.expru}

  #### expr unary
  @_("expru")
  def exprm(self, p):
    return p.expru

  @_("value")
  def expru(self, p):
    return p.value

  @_("NOT expru")
  def expru(self, p):
    return {"op": "!", "a": p.expru}

  @_("MINUS expru")
  def expru(self, p):
    return {"op": "uminus", "a": p.expru}

  ### primitive value
  @_("LPAREN expr RPAREN")
  def value(self, p):
    return p.expr

  # No reason to distinguish between foo and "foo" yet
  @_("ID")
  def value(self, p):
    return {"value": p.ID}

  @_("NUM")
  def value(self, p):
    return {"value": p.NUM}

  @_("N LPAREN id RPAREN")
  def value(self, p):
    return {"N": p.id}

  @_("STRING")
  def value(self, p):
    return {"value": p.STRING}

  @_("TRUE")
  def value(self, p):
    return {"value": True}

  @_("FALSE")
  def value(self, p):
    return {"value": False}

  @_("ID")
  def id(self, p):
    return p.ID

  @_("NUM")
  def id(self, p):
    return p.NUM

  def error(self, token):
    if token:
      lineno = getattr(token, "lineno", 0)
      if lineno:
        raise Exception(f"sly: Syntax error at line {lineno}, token={token.type}")
      else:
        raise Exception(f"sly: Syntax error, token={token.type}")
    else:
        raise Exception("sly: Parse error in input. EOF")
Enter fullscreen mode Exit fullscreen mode

WheneverEval.py

Evaluating expressions involves so much code I put it into a separate file. Whenever has unusual type coercions - for example 3 || !4 means N(3) > 0 || !(N(4) > 0). Whenever also converts everything to string if added to a string, and converts strings (just initial digits, or 0 if there aren't any initial digits) and booleans to ints when used in int context.

I make comparison operators >, <, >=, <=, ==, and != also convert to integers. Whenever doesn't really do anything useful with strings except printing them anyway, so there's never any point doing "foo" == "bar" anyway.

Things like type coercions are among things that could be improved, but that would cost backwards compatibility.

class WheneverEval:
  def __init__(self, program):
    self.program = program

  def int_eval_node(self, node):
    a = self.int_eval(node["a"])
    b = self.int_eval(node["b"])
    return (a, b)

  def int_eval(self, node):
    return self.int(self.eval(node))

  def bool_eval(self, node):
    return self.bool(self.eval(node))

  def str_eval(self, node):
    return self.str(self.eval(node))

  # A lot of aggressive casting to int here
  def eval(self, node):
    if "value" in node:
      return node["value"]
    if "N" in node:
      return self.program.counters[node["N"]]
    if "op" not in node:
      raise Exception(f"Neither op nor value? {node}")
    op = node["op"]
    if op == "-":
      (a, b) = self.int_eval_node(node)
      return a - b
    if op == "*":
      (a, b) = self.int_eval_node(node)
      return a * b
    if op == "/":
      (a, b) = self.int_eval_node(node)
      return a // b
    if op == "%":
      (a, b) = self.int_eval_node(node)
      return a % b
    # Not even clear about this one:
    if op == "+":
      a = self.eval(node["a"])
      b = self.eval(node["b"])
      if isinstance(a, str) or isinstance(b, str):
        return self.str(a) + self.str(b)
      else:
        return self.int(a) + self.int(b)
    if op == "uminus":
      a = self.int_eval(node["a"])
      return -a
    if op == "||":
      a = self.bool_eval(node["a"])
      b = self.bool_eval(node["b"])
      return a or b
    if op == "&&":
      a = self.bool_eval(node["a"])
      b = self.bool_eval(node["b"])
      return a and b
    if op == "!":
      a = self.bool_eval(node["a"])
      return not a
    if op == "<":
      (a, b) = self.int_eval_node(node)
      return a < b
    if op == "<=":
      (a, b) = self.int_eval_node(node)
      return a <= b
    if op == ">":
      (a, b) = self.int_eval_node(node)
      return a > b
    if op == ">=":
      (a, b) = self.int_eval_node(node)
      return a >= b
    if op == "==":
      (a, b) = self.int_eval_node(node)
      return a == b
    if op == "!=":
      (a, b) = self.int_eval_node(node)
      return a != b
    raise Exception(f"TODO: Operation {op} not supported yet")

  def int(self, value):
    if isinstance(value, bool):
      if value == True:
        return 1
      if value == False:
        return 0
    if isinstance(value, int):
      return value
    if isinstance(value, str):
      if re.match(r"^-?\d+", value):
        return int(value)
      else:
        return 0
    raise Exception(f"Invalid type {value}")

  def bool(self, value):
    if isinstance(value, bool):
      return value
    return self.program.counters[value] > 0

  def str(self, value):
    if isinstance(value, bool):
      if value == True:
        return "true"
      if value == False:
        return "false"
    if isinstance(value, str):
      return value
    if isinstance(value, int):
      return str(value)
    raise Exception(f"Invalid type {value}")
Enter fullscreen mode Exit fullscreen mode

whenever.py

And finally the main program.

The main execution loop is like this:

  • if there's start symbol, start with counters at {start: 1, evertyhing_else: 0}. Otherwise give everything starting counter of 1.
  • mark any rule like n n as trivial - these do nothing when executed, we still need to track their value, but we never need to actually run them, as program state is same before and after running it (we could ignore defer here etc. for more optimization)
  • as long as there's a nontrivial rule with nonzero counter, pick something at random at run it

Trivial next todo selection algorithm would result in exceedingly slow execution. For example fib from official documentation is so slow in it that it never finishes. With this improved interpreter it finishes in almost no time.

  • every time, we copy list of all nontrivial rules as actionable
  • we pick a rule at random from among actionable
  • we check if rule is deferred, if yes, we remove it from actionable set and try again
  • if program isn't finished, but all rules are trivial or deferred, we raise exception instead of doing infinite loop

This means if we have situation like this:

  • N(1) - 17167680177566 times, deferred while N(3) > 0
  • N(2) - 27777890035289 times, deferred while N(3) > 0
  • N(3) - 1 times, actionable

We don't have to keep re-rolling 17167680177566+27777890035289+1 times until we get to run todo item 3.

The algorithm is lazy, so we only check deferred rule if we rolled it, so normally it shouldn't slow us down. The interpreter is still not very optimized, and it's definitely possible to write very slow programs for it, but at least it avoids those exponential slowdowns for such very simple programs.

#!/usr/bin/env python3

import sys
from WheneverLexer import WheneverLexer
from WheneverParser import WheneverParser
from WheneverEval import WheneverEval
from random import randint

class WheneverProgram:
  def __init__(self, path):
    self.path = path
    with open(path) as f:
      self.text = f.read()
    lexer = WheneverLexer()
    parser = WheneverParser()
    self.program = parser.parse(lexer.tokenize(self.text))
    self.eval = WheneverEval(self)

  # A lot more optimizations are possible
  # total is the sum of all nontrivial counters
  def init_counters(self):
    self.total = 0
    self.trivial = set()
    self.nontrivial = set()
    self.counters = {}
    has_start = ("start" in self.program)

    for id in self.program.keys():
      if id == "start" or not has_start:
        starting_value = 1
      else:
        starting_value = 0

      self.counters[id] = starting_value

      is_trivial = (self.program[id] == {'again': [], 'defer': [], 'actions': [{'action': 'change', 'id': id, 'count': {'value': 1}}]})
      if is_trivial:
        self.trivial.add(id)
      else:
        self.nontrivial.add(id)
        self.total += starting_value

  def is_deferred(self, id):
    for expr in self.program[id]["defer"]:
      if self.eval.bool_eval(expr):
        return True
    return False

  def random_todo(self, actionable, actionable_total):
    i = randint(0, actionable_total - 1)
    for id in actionable:
      i -= self.counters[id]
      if i < 0:
        return id
    raise Exception("Math failure")

  def execute_random_todo(self):
    actionable = self.nontrivial.copy()
    actionable_total = self.total

    while True:
      if actionable_total == 0:
        raise Exception("All actions are deferred")
      id = self.random_todo(actionable, actionable_total)

      if self.is_deferred(id):
        actionable.remove(id)
        actionable_total -= self.counters[id]
      else:
        self.execute_todo(id)
        return

  # defer is checked before we call this
  def execute_todo(self, id):
    todo = self.program[id]
    again = todo["again"]
    remove_todo = True
    for expr in todo["defer"]:
      if self.eval.bool_eval(expr):
        remove_todo = False
    for action in todo["actions"]:
      self.execute_action(action)
    if remove_todo:
      self.update_counter(id, -1)

  def execute_action(self, action):
    action_type = action["action"]
    if action_type == "print":
      s = self.eval.str_eval(action["arg"])
      print(s)
    elif action_type == "change":
      id = action["id"]
      count = self.eval.int_eval(action["count"])
      self.update_counter(id, count)
    else:
      raise Exception(f"Unknown action: {action_type}")

  def update_counter(self, id, change):
    old_value = self.counters[id]
    new_value = old_value + change
    if new_value < 0:
      new_value = 0
    self.counters[id] = new_value
    if id not in self.trivial:
      self.total += (new_value - old_value)

  def run(self):
    self.init_counters()
    while self.total > 0:
      self.execute_random_todo()
    if sum(self.counters.values()) != 0:
      raise Exception("Program entered do-nothing infinite loop")

program = WheneverProgram(sys.argv[1])
program.run()
Enter fullscreen mode Exit fullscreen mode

Should you try Better Whenever?

I think the improved version is much more friendly, as things like comments, variable names, start rule just make the whole coding process more pleasant, and you can focus on doing crazy things in a language with esoteric execution model without all the hassle of turning everything into numbers.

It also definitely helps that this interpreter avoids a lot of exponential slowdown.

Like everything else in this series, I hacked this program during one evening, so it's not perfect. If you're interested in improving it further, let me know, and I can setup proper repository, documentation, specs etc. for it.

This interpreter follows the original idea very closely, but it's definitely possible to do something more creative with the "program as todo list" idea, just like AsciiDots really evolved Befunge idea.

Code

All code examples for the series will be in this repository.

Code for the Better Whenever Interpreter with Python and SLY episode is available here.

💖 💪 🙅 🚩
taw
Tomasz Wegrzanowski

Posted on February 3, 2022

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related