Building a stack-based virtual machine, part 3 - instructions

jimsy

James Harton

Posted on September 5, 2018

Building a stack-based virtual machine, part 3 - instructions

In part 2 we build a stack that we can use to store our operands. That's pretty cool. The next step for us is to define instructions that allow us to perform operations on the operands in the stack.

In case you missed the definitions in part 1 let's recap what we need to make instructions.

  • A name - Technically this is optional, but I'll be damned if I'm going to use a machine where I have to remember the opcodes of our instructions.
  • An opcode - A unique number for this instruction. It's important that this doesn't change between VM versions otherwise we won't be able to run old code.
  • Arity - The number of arguments this instruction takes - that's different to the number of operands it pops from the stack. Think about the instruction push 2; where does the 2 come from? It is supplied as an argument. Most instructions will take 0 arguments but sometimes you need to give an instruction more to go on than simply the contents of the stack.
  • Behaviour - The action that our instruction will perform. In a CPU this is made using transistors but in our case we'll use Rust functions.

With all that in mind, let's build our instruction type. Remember that because our operands are generic we need our instructions to be generic too.

pub struct Instruction<T> {
  pub op_code: usize,
  pub name: String,
  pub arity: usize,
  pub fun: InstructionFn<T>
}

pub type InstructionFn<T> = fn(machine: &mut Machine<T>, args: &[usize]);

impl<T> Instruction<T> {

    pub fn new(op_code: usize, name: &str, arity: usize, fun: InstructionFn<T>) -> Instruction<T> {
        Instruction {
            op_code: op_code,
            name: String::from(name),
            arity: arity,
            fun: fun
        }

    }
}
Enter fullscreen mode Exit fullscreen mode

Don't worry about the Machine type. We'll get to that in a later episode.

Something important to note here is that the instructions arguments aren't of type T here - they're indexes into the program's data section. We'll get on to that later too.

Because the Instruction type is not much more than a handy container the tests are correspondingly simple:

#[cfg(test)]
mod test {
    use super::*;

    struct Operand(i64);

    fn noop(_machine: &mut Machine<Operand>, _args: &[usize]) {}

    #[test]
    fn new() {
        let operand = Instruction::new(13, "noop", 7, noop);
        assert_eq!(operand.op_code, 13);
        assert_eq!(operand.name, "noop".to_string());
        assert_eq!(operand.arity, 7);
    }
}
Enter fullscreen mode Exit fullscreen mode

Next we need an instruction table - a structure which we can store instructions in and look them up by opcode and name. We're going to use a HashMap.

pub struct InstructionTable<T>(HashMap<usize, Instruction<T>>);

impl<T> InstructionTable<T> {
    pub fn new() -> InstructionTable<T> {
        InstructionTable(HashMap::new())
    }

    pub fn by_op_code(&self, op_code: usize) -> Option<&Instruction<T>> {
        self.0.get(&op_code)
    }

    pub fn by_name(&self, name: &str) -> Option<&Instruction<T>> {
        self.0
            .values()
            .find(|ref instr| instr.name == name)
    }

    pub fn insert(&mut self, instr: Instruction<T>) {
        self.0.insert(instr.op_code, instr);
    }

    pub fn is_empty(&self) -> bool {
        self.0.is_empty()
    }
}

Enter fullscreen mode Exit fullscreen mode

That wasn't so hard. Now let's write some tests:

#[cfg(test)]
mod test {
    use super::*;
    use machine::Machine;

    fn noop(_machine: &mut Machine<usize>, _args: &[usize]) {}

    #[test]
    fn new() {
        let table: InstructionTable<usize> = InstructionTable::new();
        assert!(table.is_empty())
    }

    #[test]
    fn insert() {
        let mut table: InstructionTable<usize> = InstructionTable::new();
        assert!(table.is_empty());
        table.insert(Instruction::new(0, "NOOP", 0, noop));
        assert!(!table.is_empty());
    }

    #[test]
    fn by_op_code() {
        let mut table: InstructionTable<usize> = InstructionTable::new();
        table.insert(Instruction::new(0, "NOOP", 0, noop));
        let instr = table.by_op_code(0).unwrap();
        assert_eq!(instr.name, "NOOP");
    }

    #[test]
    fn by_name() {
        let mut table: InstructionTable<usize> = InstructionTable::new();
        table.insert(Instruction::new(0, "NOOP", 0, noop));
        let instr = table.by_name("NOOP").unwrap();
        assert_eq!(instr.op_code, 0);
    }
}
Enter fullscreen mode Exit fullscreen mode

We've come to the end of today's journey. We now have an operand stack, and a way to collect and use our instructions. In our next episode we're going to create the necessary structures to represent a program that our machine can execute.

💖 💪 🙅 🚩
jimsy
James Harton

Posted on September 5, 2018

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related