Tomasz Wegrzanowski
Posted on February 21, 2022
The series is heading towards completion, but I have one unfinished business.
Back in episode 79 I did some prototyping of new esoteric language Tuples. Then in episode 80 I investigated a minimalistic version of it.
Let's try to turn it into a real language. By that I mean one that could at least in principle run Wordle.
We'll need all these features:
- parser for Tuple language
- support for reading files
- support for random number generation
- interactive I/O
- tuples with multiple dependencies
Special Tuples
We need a much richer interface to the runner than the one we had before. To let the runner evolve without messing up the program, only tuples starting with 0.*
will have special meaning. Everything else can safely be used by the programs.
So here are special tuple types we need.
If any file names are passed to the runner at start, these are pre-filled before program starts. They don't do anything special afterwards. This is about the bare minimum file I/O, it will let us implement common Unix programs like cat
or wc
, or read data files like wordle-dictionary.txt
. We can't create files, write files, or open files dynamically, but it's a start.
-
0.2.f.i.c
-i
-th character of the file with idf
has Unicode codepointc
(system-created, before start) -
0.2.f.i.0
- special value for first position after end of file (system-created, before start)
For random numbers, we need a request tuple and a response tuple:
-
0.4.id.min.max
- request to generate a random value - each request has uniqueid
in case you need multiple numbers (program-created) -
0.4.id.min.max.value
- random value betweenmin
andmax
(system-created, on demand)
So if you need to roll a bunch of d6s, you can create tuples 0.4.100.1.6
, 0.4.101.1.6
, 0.4.102.1.6
, and the runner will respond by creating some numbers like let's say 0.4.100.1.6.5
, 0.4.101.1.6.2
, and 0.4.102.1.6.3
.
I'm including id.min.max
in the response, just to make it clear what happens if someone requestsn 0.4.69.1.6
and then 0.4.69.1.10
- these will be independent requests.
For input we need "request to read i
-th character", and response with contents of that i
-th character, or 0
if end of input was reached:
-
0.0.i
- request to get more input up toi
-th character (program-created) - note that STDIN input will use full lines anyway -
0.0.i.c
-i
-th character of the input has Unicode codepointc
(system-created, only when requested) -
0.0.i.0
- special value for first position after end of input (system-created, only when requested)
And finally the output. I found it much easier to write Tuples programs if gaps in the output are allowed. If you don't need any interactivity, you can just create some 0.1.*.*
tuples and be done with it. But if you want to flush the output during program run, you need to create 0.3.i
tuples. Once the flush request has been fulfilled, corresponding 0.3.i.0
tuple will be created.
-
0.1.i.c
-i
-th character of the output has Unicode codepointc
(program-created, gaps allowed, 0s ignored) -
0.3.i
- request to print everything up toi
(skipping gaps; program-created) - the rest will be printed anyway when program ends -
0.3.i.0
- request acknowledgements (system-created after request done)
There's nothing special about this arrangement, it's just the minimum functionality I needed to enable writing the kind of mini-programs I wanted to do, like infinite loop printing, concatenating multiple files, or Wordle.
Structure of the episode
First, I'll show a bunch of Tuples programs, then the interpreter. I originally planned to do Wordle in Tuples, but this post was already getting crazy long with 16 [ ] examples, so I'll leave that as an exercise for the reader.
I put a lot of comments inside Tuples code, but maybe it's not enough as it's an esoteric language. I'd love it if other people also tried to use it. Let me know if you write anything in it.
First I'll recap programs from old episodes that still work. The main difference is just some extra comments.
Hello, World!
Many programs from previous episodes still work just fine. This is the Hello, World!:
# Just the characters in "Hello, World!\n"
-> (0, 1, 0, 72)
-> (0, 1, 1, 101)
-> (0, 1, 2, 108)
-> (0, 1, 3, 108)
-> (0, 1, 4, 111)
-> (0, 1, 5, 44)
-> (0, 1, 6, 32)
-> (0, 1, 7, 87)
-> (0, 1, 8, 111)
-> (0, 1, 9, 114)
-> (0, 1, 10, 108)
-> (0, 1, 11, 100)
-> (0, 1, 12, 33)
-> (0, 1, 13, 10)
$ ./tuples.rb examples/hello.txt
Hello, World!
Loop
# Initial state
-> (1, 1, 9)
# Generate data
(1, a, b) -> (1, a+1, b-1)
# Generate output, number to string conversion
# But send it to extra stage so we can filter the parts we don't want
# if second arg is negative, it won't get added to the state
(1, a, b) -> (2, (a/10)-1, 10*a+0, 48+(a/10))
(1, a, b) -> (2, 0, 10*a+1, 48+(a%10))
(1, a, b) -> (2, 0, 10*a+2, 10)
# Filtering is done, output the result
(2, a, b, c) -> (0, 1, b, c)
$ ./tuples.rb examples/loop.txt
1
2
3
4
5
6
7
8
9
10
Minimalistic Loop
That's the version that uses no operations except +1 and -1.
# Initial state
-> (1, 1, 19)
# Generate data
(1, a, b) -> (1, a+1, b-1)
# Send it to digit splitter
(1, a, b) -> (2, a, a, 0, 0)
# Split digits, send to r<10 filter
(2, i, a, d, r) -> (2, i, a-1, d, r+1)
(2, i, a, d, 10) -> (2, i, a, d+1, 0)
(2, i, 0, d, r) -> (3, i, d, r, r, 9)
# m<10 filter
(3, i, d, r, ra, rb) -> (3, i, d, r, ra-1, rb-1)
(3, i, d, r, 0, rb) -> (4, i, 0, 0, d, r)
# OK, we split the number and we're ready to print
# Multiply i by 3 to get final position, then send to three printers
(4, i, ia, 0, r, m) -> (4, i-1, ia, 3, r, m)
(4, i, ia, ib, r, m) -> (4, i, ia+1, ib-1, r, m)
(4, 0, ia, 0, r, m) -> (10, ia, r, r-1)
(4, 0, ia, 0, r, m) -> (20, ia+1, m)
(4, 0, ia, 0, r, m) -> (30, ia+1)
# First digit printer, needs additional check arguments to filter out zeroes
(10, i, d, z) -> (11, i, d, 48)
(11, i, a, b) -> (11, i, a+1, b-1)
(11, i, a, 0) -> (0, 1, i, a)
# Second digit printer
(20, i, d) -> (21, i, d, 48)
(21, i, a, b) -> (21, i, a+1, b-1)
(21, i, a, 0) -> (0, 1, i, a)
# Newline printer
(30, i) -> (0, 1, i+1, 10)
FizzBuzz
# Initial state
-> (1, 1, 99)
# Generate data
(1, a, b) -> (1, a+1, b-1)
# Print all "\n"s
(1, i, _) -> (0, 1, 1000*i + 100, 10)
# Decide which print function to call
(1, i, a) -> (20, (i%3)-1, (i%5)-1, i)
(1, i, a) -> (21, -(i%3), (i%5)-1, i)
(1, i, a) -> (22, (i%3)-1, -(i%5), i)
(1, i, a) -> (23, -(i%3), -(i%5), i)
# Print number
(20, a, b, i) -> (11, 1000*i + 99, i)
# Print Fizz
(21, a, b, i) -> (0, 1, 1000*i + 0, 70)
(21, a, b, i) -> (0, 1, 1000*i + 1, 105)
(21, a, b, i) -> (0, 1, 1000*i + 2, 122)
(21, a, b, i) -> (0, 1, 1000*i + 3, 122)
# Print Buzz
(22, a, b, i) -> (0, 1, 1000*i + 0, 66)
(22, a, b, i) -> (0, 1, 1000*i + 1, 117)
(22, a, b, i) -> (0, 1, 1000*i + 2, 122)
(22, a, b, i) -> (0, 1, 1000*i + 3, 122)
# Print FizzBuzz
(23, a, b, i) -> (0, 1, 1000*i + 0, 70)
(23, a, b, i) -> (0, 1, 1000*i + 1, 105)
(23, a, b, i) -> (0, 1, 1000*i + 2, 122)
(23, a, b, i) -> (0, 1, 1000*i + 3, 122)
(23, a, b, i) -> (0, 1, 1000*i + 5, 66)
(23, a, b, i) -> (0, 1, 1000*i + 6, 117)
(23, a, b, i) -> (0, 1, 1000*i + 7, 122)
(23, a, b, i) -> (0, 1, 1000*i + 8, 122)
# Convert string to number and output it
(11, i, n) -> (0, 1, i, 48+(n%10))
(11, i, n) -> (12, (n/10)-1, i-1, n/10)
(12, _, i, n) -> (11, i, n)
Minimalistic FizzBuzz
That's the version that uses no operations except +1 and -1.
# Initial state
# There's a lot of countdown counters here
-> (1, 0, 0, 9, 0, 0, 0, 100)
# cnt -= 1
(1, i1, i0, ic, fcnt, bcnt, pos, cnt) -> (2, i1, i0, ic, fcnt, bcnt, pos, 10, cnt-1)
# loop to do pos += 10
(2, i1, i0, ic, fcnt, bcnt, pos, posplus, cnt) -> (2, i1, i0, ic, fcnt, bcnt, pos+1, posplus-1, cnt)
(2, i1, i0, ic, fcnt, bcnt, pos, 0, cnt) -> (3, i1, i0, ic, fcnt, bcnt, pos, cnt)
# update i1, i0, and ic
# i0 - lowest digit
# i1 - highest digit
# ic + i0 == 9 always
(3, i1, i0, 0, fcnt, bcnt, pos, cnt) -> (4, i1+1, 0, 9, fcnt, bcnt, pos, cnt)
(3, i1, i0, ic, fcnt, bcnt, pos, cnt) -> (4, i1, i0+1, ic-1, fcnt, bcnt, pos, cnt)
# update fizz countdown
(4, i1, i0, ic, fcnt, bcnt, pos, cnt) -> (5, i1, i0, ic, fcnt-1, bcnt, pos, cnt)
(4, i1, i0, ic, 0, bcnt, pos, cnt) -> (5, i1, i0, ic, 2, bcnt, pos, cnt)
# update buzz countdown
(5, i1, i0, ic, fcnt, bcnt, pos, cnt) -> (6, i1, i0, ic, fcnt, bcnt-1, pos, cnt)
(5, i1, i0, ic, fcnt, 0, pos, cnt) -> (6, i1, i0, ic, fcnt, 4, pos, cnt)
# decide what to print based on counters
(6, i1, i0, ic, 0, bcnt, pos, cnt) -> (10, pos)
(6, i1, i0, ic, fcnt, 0, pos, cnt) -> (11, pos)
(6, i1, i0, ic, fcnt, bcnt, pos, cnt) -> (12, fcnt-1, bcnt-1, i1, i0, pos)
(6, i1, i0, ic, fcnt, bcnt, pos, cnt) -> (13, pos)
# continue the loop
(6, i1, i0, ic, fcnt, bcnt, pos, cnt) -> (1, i1, i0, ic, fcnt, bcnt, pos, cnt)
# print fizz
(10, pos) -> (20, pos, 0, 70)
(10, pos) -> (20, pos, 1, 105)
(10, pos) -> (20, pos, 2, 122)
(10, pos) -> (20, pos, 3, 122)
# print buzz
(11, pos) -> (20, pos, 4, 66)
(11, pos) -> (20, pos, 5, 117)
(11, pos) -> (20, pos, 6, 122)
(11, pos) -> (20, pos, 7, 122)
# print number
(12, fcheck, bcheck, i1, i0, pos) -> (14, i1-1, pos, i1)
(12, fcheck, bcheck, i1, i0, pos) -> (14, 0, pos+1, i0)
# print newline
(13, pos) -> (20, pos, 8, 10)
# digit print check
(14, dcheck, pos, c) -> (15, pos, c, 48)
# convert digit to ASCII
(15, pos, c, cplus) -> (15, pos, c+1, cplus-1)
(15, pos, c, 0) -> (0, 1, pos, c)
# print character at specific offset
(20, pos, posplus, c) -> (20, pos+1, posplus-1, c)
(20, pos, 0, c) -> (0, 1, pos, c)
Fibonacci
# Initial state
-> (10, 1, 1, 1, 99)
# Generate data
(10, i, a, b, cnt) -> (10, i+1, b, a+b, cnt-1)
# print "fib(", ")=", and "\n" parts
(10, i, a, b, cnt) -> (0, 1, 1000*i + 0, 102)
(10, i, a, b, cnt) -> (0, 1, 1000*i + 1, 105)
(10, i, a, b, cnt) -> (0, 1, 1000*i + 2, 98)
(10, i, a, b, cnt) -> (0, 1, 1000*i + 3, 40)
(10, i, a, b, cnt) -> (0, 1, 1000*i + 200, 41)
(10, i, a, b, cnt) -> (0, 1, 1000*i + 201, 61)
(10, i, a, b, cnt) -> (0, 1, 1000*i + 401, 10)
# Send i and fib(i) to conversion function to output at the right places
(10, i, a, b, cnt) -> (11, 1000*i + 199, i)
(10, i, a, b, cnt) -> (11, 1000*i + 399, a)
# Convert string to number and output it
(11, i, n) -> (0, 1, i, 48+(n%10))
(11, i, n) -> (12, (n/10)-1, i-1, n/10)
(12, _, i, n) -> (11, i, n)
Print File
OK, let's get to the new functionality.
Here's Tuples program for printing a single file. As you can see for esoteric language, it can be quite expressive:
# 0.2.0.*.* is contents of a file, just send them through
(0, 2, 0, i, b) -> (0, 1, i, b)
$ ./tuples.rb examples/print_file.txt spec/data/a.txt
One
Two
Three
Dice
The RNG system is very easy to use, we just generate some requests at start, and print in the right place when we get a response:
# Request some dice rolls
-> (0, 4, 1, 1, 6)
-> (0, 4, 2, 1, 6)
-> (0, 4, 3, 1, 6)
-> (0, 4, 4, 1, 6)
# Print the results
(0, 4, i, _, _, n) -> (0, 1, 2*i, 48+n)
(0, 4, i, _, _, n) -> (0, 1, 2*i+1, 10)
$ ./tuples.rb examples/dice.txt
6
1
4
2
$ ./tuples.rb examples/dice.txt
5
3
6
1
Yes
It's equivalent of yes
Unix program that prints infinite y
. We cannot actually generate infinite y
s at once, we need to generate just some defined chunk (in this case one line), and call flush.
Then in respone to flush succeeding, do the next chunk (in this case next line etc).
# Initial state
-> (1, 0)
# Print "y\n" and request callback
(1, i) -> (0, 1, i*2, 121)
(1, i) -> (0, 1, i*2+1, 10)
(1, i) -> (0, 3, i*2+1)
# If printed, get next state
(0, 3, ii, 0) -> (1, (ii/2) + 1)
$ ./tuples.rb examples/yes.txt | head
y
y
y
y
y
y
y
y
y
y
Echo
Echo program gets whatever's on STDIN, and echos it.
# Request first character
-> (0, 0, 0)
# Stop if EOF
(0, 0, pos, char) -> (1, pos, char, char-1)
# On character received, print it and request another
# (no flushing, just let it flush on exit)
(1, pos, char, _) -> (0, 1, pos, char)
(1, pos, char, _) -> (0, 0, pos+1)
$ ./tuples.rb examples/echo.txt
呕贸艂w likes 馃崹
呕贸艂w likes 馃崹
Print multiple files
Printing one file was easy, printing multiple files is more complicated, as we need to know where one file starts and another ends.
We could also just assume all files are smaller than let's say 1GB, and just offset each by 1_000_000_000
and that would totally work, but here's doing it the hard way:
# 2.i.size are file sizes
(0, 2, i, size, 0) -> (2, i, size)
# 3.i.size is total size of all files preceeding i
# check flag is == j
-> (3, 0, 0)
(3, i, presize), (2, j, size) -> (3, i+1, presize+size, -(i-j)**2)
(3, i, size, _) -> (3, i, size)
# output contents of each file at the right offset
# (we can safely add 0s to the output at duplicate position
# they're ignored by the output algorithm)
# check flag id1 == id2
(0, 2, id1, i, b), (3, id2, size) -> (4, size+i, b, -(id1-id2)**2)
(4, pos, b, _) -> (0, 1, pos, b)
$ ./tuples.rb examples/print_files.txt spec/data/{a,b,c}.txt
One
Two
Three
Four
Five
Six
Seven
Eight
Infinite Loop
Infinite loop is a mix of "yes" program and "loop" program.
# Initial state
-> (1, 0)
# Print number and "\n" and request flush
(1, i) -> (0, 1, 10*i+9, 10)
(1, i) -> (2, i, 10*i+8)
(1, i) -> (0, 3, 10*i+9)
# On flush, do next number
(0, 3, ii, 0) -> (1, (ii/10) + 1)
# (2, num, pos) prints number with last digit at pos
(2, num, pos) -> (0, 1, pos, 48+(num % 10))
(2, num, pos) -> (3, (num/10), pos-1, (num/10)-1)
(3, num, pos, _) -> (2, num, pos)
Hello, Name!
Nothing to complicated about it, except we want to keep going until we find either End-Of-File (0
) or newline character (10
).
# Request first character
-> (0, 0, 0)
# Print "Hello, "
-> (0, 1, 0, 72)
-> (0, 1, 1, 101)
-> (0, 1, 2, 108)
-> (0, 1, 3, 108)
-> (0, 1, 4, 111)
-> (0, 1, 5, 44)
-> (0, 1, 6, 32)
# Stop if newline or EOF
(0, 0, i, b) -> (1, i, b, b-1, ((b-10)**2)-1)
(0, 0, i, 0) -> (2, i)
(0, 0, i, 10) -> (2, i)
# On character received, print it and request another
# (no flushing, just let it flush on exit)
(1, pos, char, _, _) -> (0, 1, 10+pos, char)
(1, pos, char, _, _) -> (0, 0, pos+1)
# Print final parts
(2, i) -> (0, 1, 10+i, 33)
(2, i) -> (0, 1, 11+i, 10)
$ ./tuples.rb examples/hello_name.txt
Eve
Hello, Eve!
Count Lines
This program counts lines in a passed file:
# how many lines before N
-> (1, 0, 0)
# process file character by character
(0, 2, 0, pos1, 10), (1, pos2, cnt) -> (2, pos1+1, cnt+1, -(pos1-pos2)**2, 0, 0)
(0, 2, 0, pos1, c), (1, pos2, cnt) -> (2, pos1+1, cnt, -(pos1-pos2)**2, c-1, ((c-10)**2)-1)
(0, 2, 0, pos1, 0), (1, pos2, cnt) -> (3, cnt, -(pos1-pos2)**2)
# continue loop
(2, pos, cnt, _, _, _) -> (1, pos, cnt)
# print result
(3, cnt, _) -> (4, cnt, 100)
# always print "\n"
-> (0, 1, 101, 10)
(4, cnt, outpos) -> (0, 1, outpos, 48 + (cnt % 10))
(4, cnt, outpos) -> (4, (cnt/10), outpos-1, (cnt/10)-1)
(4, cnt, outpos, _) -> (4, cnt, outpos)
$ ./tuples.rb examples/count_lines.txt spec/data/a.txt
3
Random Word
I started writing Wordle program, but I didn't finish it, as this post was already far too long. But the first part is picking up random file from the Wordle dictionary.
Because we know every line is 6 characters (5 letters + newline), we can really simplify this. Random line of unknown length would be a lot more complicated.
# we know every line is 5 characters + newline, that makes it a lot easier
# as we can just start at 6*Nth character
# 10.x is count of words in the dictionary
(0, 2, 0, i, 0) -> (10, i/6)
# roll random index
# 11.x is character position of random word (6*random number)
(10, count) -> (0, 4, 0, 0, count-1)
(0, 4, 0, _, _, index) -> (11, index*6)
# 12.i.b is the random word, i-ith letter
(11, pos1), (0, 2, 0, pos2, b) -> (12, 0, b, -(pos1-pos2)**2)
(11, pos1), (0, 2, 0, pos2, b) -> (12, 1, b, -(pos1-pos2+1)**2)
(11, pos1), (0, 2, 0, pos2, b) -> (12, 2, b, -(pos1-pos2+2)**2)
(11, pos1), (0, 2, 0, pos2, b) -> (12, 3, b, -(pos1-pos2+3)**2)
(11, pos1), (0, 2, 0, pos2, b) -> (12, 4, b, -(pos1-pos2+4)**2)
(12, i, b, _) -> (12, i, b)
# print the random word
(12, i, b) -> (0, 1, i, b)
-> (0, 1, 5, 10)
$ ./tuples.rb examples/random_word.txt spec/data/wordle-answers-alphabetical.txt
sweep
$ ./tuples.rb examples/random_word.txt spec/data/wordle-answers-alphabetical.txt
honey
$ ./tuples.rb examples/random_word.txt spec/data/wordle-answers-alphabetical.txt
rumor
Sum
And finally a program that finds all the numbers in a file and sums them up.
# 10.i.b - at position i, there's digit b
(0, 2, 0, i, b) -> (10, i, b-48, -(48-b)*(57-b))
(10, i, b, _) -> (10, i, b)
# 11.i - at position i, there's a non-digit, non-EOF
(0, 2, 0, i, b) -> (11, i, (47-b)*(58-b), b-1)
(11, i, _, _) -> (11, i)
# 12.i - at position i, there's EOF
(0, 2, 0, i, 0) -> (12, i)
# 13.i.v - partial numbers coming into i
# if number ended at i, it would be v
# so "x123yz" would result in [0, 0, 1, 12, 123, 0]
-> (13, 0, 0)
(11, i) -> (13, i+1, 0)
(13, i, v), (10, j, b) -> (13, i+1, 10*v+b, -(i-j)**2)
(13, i, v, _) -> (13, i, v)
# 14.i.v is v if number ends just before i, or 0 otherwise
(13, i, v), (11, j) -> (14, i, v, -(i-j)**2)
(13, i, v), (12, j) -> (14, i, v, -(i-j)**2)
(10, i, _) -> (14, i, 0)
(14, i, v, _) -> (14, i, v)
# 15.i.v is sum of all numbers before i
-> (15, 0, 0)
(15, i, v), (14, j, u) -> (15, i+1, u+v, -(i-j)**2)
(15, i, v, _) -> (15, i, v)
# 16.v is the final sum
# (there's extra +1 as sums shifted)
(15, i, v), (12, j) -> (16, v, -(i-j-1)**2)
(16, v, _) -> (16, v)
# send to printer, also add "\n"
(16, v) -> (20, v, 100)
-> (0, 1, 101, 10)
# (20, num, pos) prints number with last digit at pos
(20, num, pos) -> (0, 1, pos, 48+(num % 10))
(20, num, pos) -> (21, (num/10), pos-1, (num/10)-1)
(21, num, pos, _) -> (20, num, pos)
Let's try it on some files:
$ cat spec/data/numbers.txt
15 38 9 007
$ ./tuples.rb examples/sum.txt spec/data/numbers.txt
69
$ cat spec/data/budget.txt
Food $200
Data $150
Rent $800
Candles $3600
Utility $150
$ ./tuples.rb examples/sum.txt spec/data/budget.txt
4900
Runner
OK, that's enough Tuples, let's see how it was implemented:
-
tuples.rb
- main program -
tuples_parser.rb
- Tuples parser -
tuples_program.rb
- Tuples program runner -
spec
- integration tests -
spec/data/*
- sample data files -
spec/integration/*.txt
- expected output
There's nothing interesting about specs, you can check the sources if you want to.
Main Program
The main program has very few responsibilities, it just loads some dependencies, handles ARGV, and loads files:
#!/usr/bin/env ruby
require "ostruct"
require "pathname"
require "pry"
require "set"
require_relative "tuples_parser"
require_relative "tuples_program"
class TuplesRunner
def initialize(program_path, *file_paths)
@program = TuplesParser.new(program_path).call
@files = file_paths.map { |file_path| Pathname(file_path).read }
end
def call
@program.reset
@files.each_with_index do |file, i|
@program.load_file file.codepoints, i
end
@program.call
end
end
Signal.trap("PIPE", "EXIT")
unless ARGV.size >= 1
STDERR.puts "Usage: #{$0} <input.txt> [<additional files> ...]"
exit 1
end
TuplesRunner.new(*ARGV).call
TuplesParser
The parser is half-implemented. Each line is parsed separately, with any empty, comment lines, or lines not containing ->
ignored. The left side of ->
is fully parsed. The right side of ->
just wrapped in a block and passed to Ruby eval.
This means the languge is not fully defined, but the intention is that you should only use integer arithmetic expressions (+
, -
, *
, /
, %
, **
, ()
, <<
, >>
, etc.), and no logic like ?:
/or
/and
.
It's a decent way to prototype the language, get the big picture right, and leave the uninteresting details for later.
class TuplesParser
def initialize(path)
@path = Pathname(path)
end
def parse_error(fragment)
raise "Parse error in #{@path}:#{@lineno}: #{fragment}"
end
def parse_left_part(fragment)
fragment.split(/\s*,\s*/).map do |frag|
case frag
when /\A\d+\z/
frag.to_i
when "_"
nil
else
frag.to_sym
end
end
end
def parse_left(text)
return [] if text.empty?
parse_error text unless text =~ /\A\((.*)\)\z/
parts = $1.split(/\)\s*,\s*\(/)
parts.map { |part| parse_left_part(part) }
end
def parse_right(text)
parse_error text unless text =~ /\A\((.*)\)\z/
eval "proc{ [#{$1}] }", nil, @path.to_s, @lineno
end
def call
rules0 = []
rules1 = []
rules2 = []
@lineno = 0
@path.each_line do |line|
@lineno += 1
next if line =~ /\A\s*#/
next unless line =~ /(.*)->(.*)/
right = parse_right($2.strip)
left = parse_left($1.strip)
case left.size
when 0
rules0 << right
when 1
rules1 << [left[0], right]
when 2
rules2 << [left[0], left[1], right]
else
parse_error "Only rules with 0-2 dependencies allowed: #{text}"
end
end
TuplesProgram.new(rules0, rules1, rules2)
end
end
TuplesProgram
And finally the runner. For rules with 0 or 1 dependencies it's actually reasonably performant, for rules with 2 dependencies it's ridiculously slow.
If you run the program with DEBUG=1
, you'll see a lot of debug messages, so you can debug your Tuples programs this way.
class TuplesProgram
def initialize(rules0, rules1, rules2)
@rules0 = rules0
@rules1 = rules1
@rules2 = rules2
end
def debug
STDERR.puts yield if ENV["DEBUG"]
end
def add(state)
return if @states.include?(state)
return if state.any?(&:negative?)
debug{ "Adding #{state.inspect}" }
@states << state
@queue << state
end
def process(state)
@rules1.each do |pattern, block|
process_rule state, pattern, block
end
end
def process_rule(state, pattern, block)
return unless pattern.size == state.size
args = OpenStruct.new
pattern.size.times do |i|
# _ => nil => ignore
next unless pattern[i]
# symbol => pass to block
if pattern[i].is_a?(Symbol)
args[pattern[i]] = state[i]
# number => check for equality
elsif pattern[i] != state[i]
return
end
end
add args.instance_eval(&block)
end
# This is very slow
def process_rule2s
@rules2.each do |pattern1, pattern2, block|
part1s = []
part2s = []
@states.each do |state|
next unless state.size == pattern1.size
h = process_partial_rule(pattern1, state)
part1s << h if h
end
@states.each do |state|
next unless state.size == pattern2.size
h = process_partial_rule(pattern2, state)
part2s << h if h
end
debug{ "Rule2: #{part1s.inspect} v #{part2s.inspect}" }
part1s.each do |part1|
part2s.each do |part2|
arg = OpenStruct.new(part1.merge(part2))
add arg.instance_eval(&block)
end
end
end
end
def process_partial_rule(pattern, state)
args = {}
pattern.zip(state).each do |patterni, statei|
next unless patterni
if patterni.is_a?(Symbol)
args[patterni] = statei
elsif patterni != statei
return
end
end
args
end
def load_file(data, file_index)
data.each_with_index do |c, i|
add [0, 2, file_index, i, c]
end
add [0, 2, file_index, data.size, 0]
end
def reset
@queue = []
@states = Set[]
@requests_done = Set[]
@input_done = -1
@output_done = -1
@input = []
@no_more_input = false
@rules0.each do |block|
add block.call
end
end
def process_requests
active = false
input_request = @input_done
output_request = @output_done
@states.select{|s| s[0] == 0}.each do |state|
next if @requests_done.include?(state)
if state.size == 3 and state[1] == 0
input_request = [state[2], input_request].max
elsif state.size == 3 and state[1] == 3
output_request = [state[2], output_request].max
add [0, 3, state[2], 0]
elsif state.size == 5 and state[1] == 4
# Can just process this right away
add [*state, rand(state[3]..state[4])]
else
next
end
active = true
@requests_done << state
end
process_output output_request if output_request != @output_done
process_input input_request if input_request != @input_done
active
end
def process_input(input_request)
debug{ "Processing input request up to #{input_request}" }
while input_request >= @input.size or @no_more_input
line = STDIN.gets
if line
debug{ "Adding #{line.codepoints}" }
@input.push *line.codepoints
else
debug{ "Adding 0 for EOF" }
@no_more_input = true
@input.push 0
break
end
end
@input.each_with_index do |c, i|
add [0, 0, i, c]
end
@input_done = input_request
end
def process_output(output_request)
output_todo = @states.filter{|s|
s.size == 4 and s[0] == 0 and s[1] == 1 and s[2] > @output_done and s[3] != 0
}
if output_request
output = output_todo.filter{|s| s[2] <= output_request}.sort.map(&:last)
debug{ "Processing output request #{@output_done+1} to #{output_request}: #{output.inspect}" }
@output_done = output_request
else
output = output_todo.sort.map(&:last)
end
print output.pack("U*")
end
def call
loop do
until @queue.empty?
process @queue.pop until @queue.empty?
process_rule2s
end
break unless process_requests
end
process_output(nil)
if ENV["DEBUG"]
@states.sort.each do |state|
p state
end
end
end
end
Future Work
There are quite a few limitations:
- it is it is really slow, especially anything with 2-dependency rules
- the parser is only half-implemented, what goes on the right side is not defined at all
- it could be packaged in a proper gem
There are also some questions where the language could go:
- Tuples language is quite friendly for libraries, so we could have a directive like
use library
to include libraries for let's say printing numbers etc. - We could extend pattern matching system to treat multiples of the same as equality so we could do
(13, i, v), (10, i, b) -> (13, i+1, 10*v+b)
instead of(13, i, v), (10, j, b) -> (13, i+1, 10*v+b, -(i-j)**2)
. This would only really make sesne if that brings significant performance improvements to the 2-dependency rules. - The whole
0.*
interface feels quite messy, I'm not sure if it could be improved somehow
Code
All code examples for the series will be in this repository.
Code for the Extending Esoteric Language Tuples episode is available here.
Posted on February 21, 2022
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.