Building a stack-based virtual machine, part 6 - function calls

jimsy

James Harton

Posted on September 14, 2018

Building a stack-based virtual machine, part 6 - function calls

In our last episode we finally ran a program. It was pretty exciting. This episode we're going to add function calling.

Up until now whenever we've used the term "stack" we've been referring to the operand stack, but typically when programmers talk about stacks they're talking about call stacks.

So what is a call stack and what is it used for? I'm glad you asked.

When you call a function you're executing a jump from one place in the program (instruction pointer) to another (hopefully named) place in the program. The difference between a function call and a jump is that you have to store the place you jumped from somewhere so that you can return back to it when you're done.

Arguments are passed by simply pushing them on to the stack before performing the call. The callee should pop the appropriate number of arguments off the stack and push the appropriate number of return values back onto the stack.

Yes, this means that functions can have multiple return values. How a machine takes arguments and return values is called it's calling convention. Here we've opted for the simplest possible convention to allow you, the compiler writer, to decide for yourself how you want function calling to behave.

So what is a call stack a stack of? We usually refer to them as stack frames so that's what we're going to build. At this stage our frame consists of a single value - our return address.

Frames will usually contain other environmental information, such as local variables or anything else that should be within this scope.

pub struct Frame {
    pub return_address: usize
}

impl Frame {
  pub fn new(return_address: usize) -> Frame {
    Frame { return_address: return_address }
  }
}
Enter fullscreen mode Exit fullscreen mode

So how do we make a stack out of them? Turns out we already have a generic Stack<T> that we can use for the job. Let's add it to our machine:

pub struct Machine<'a, T> {
  code: Code<T>,
  instruction_table: &'a InstructionTable<T>,
  ip: usize,
  operand_stack: Stack<T>,
  call_stack: Stack<Frame>,
}
Enter fullscreen mode Exit fullscreen mode

We need to update our new function to initialise the call stack. We create a frame which points past the last instruction in our program so that if the last frame is popped off the IP is pointing outside the program and the machine halts.

We also need to add the following methods to our machine:

  • jump - update the machine's instruction pointer to that of the named label.
  • call - create a new stack frame with the current IP as it's return address.
  • ret - pop the top frame off the stack and set the machine's IP back to the original location.
impl<'a, T> Machine<'a, T> {
    pub fn new(code: Code<T>, instruction_table: &'a InstructionTable<T>) -> Machine<'a, T> {
        let frame = Frame::new(code.code.len());
        let mut call_stack = Stack::new();
        call_stack.push(frame);

        Machine {
            code: code,
            instruction_table: instruction_table,
            ip: 0,
            call_stack: call_stack,
            operand_stack: Stack::new(),
        }
    }

    pub fn jump(&mut self, label: &str) {
        self.ip = self
            .code
            .get_label_ip(label)
            .expect(&format!("Attempted to jump to unknown label {}", label));
    }

    pub fn call(&mut self, label: &str) {
        self.call_stack.push(Frame::new(self.ip));
        self.jump(label);
    }

    pub fn ret(&mut self) {
        let frame = self.call_stack.pop();
        self.ip = frame.return_address;
    }

}
Enter fullscreen mode Exit fullscreen mode

So now that we have our methods we can convert our earlier arithmetic program into one that has an add function. Here we use an enum as our operand type because now we need to store both labels (strings) and integers.

Before we code it up, here's our program as assembler so that it's easier to follow:

@0 = I(3)
@1 = I(4)
@2 = S("add_fun")

.main:
    push @0
    push @1
    call @2
    ret

.add_fun:
    add
    ret
Enter fullscreen mode Exit fullscreen mode

So in order to understand what it's doing let's go through it step by step:

  1. Push the integer 3 onto the stack.
  2. Push the integer 4 onto the stack.
  3. Call the "add_fun" function, which pushes a new stack frame and jumps to the add_fun label.
  4. Add pops the two operands off the stack and pushes back the result of adding them together.
  5. Ret pop the frame off the call stack and return by setting the IP back to 3.
  6. Ret pop the last frame off the call stack and end the program.
#[derive(Clone, Debug, PartialEq)]
enum Operand {
    I(i64),
    S(String),
}

impl Operand {
    fn to_i(&self) -> Option<i64> {
        match self {
            &Operand::I(i) => Some(i),
            _ => None,
        }
    }

    fn to_s(&self) -> Option<&str> {
        match self {
            &Operand::S(ref s) => Some(s),
            _ => None,
        }
    }
}

fn push(machine: &mut Machine<Operand>, args: &[usize]) {
    let arg = machine.get_data(args[0]).clone();
    machine.operand_push(arg)
}

fn add(machine: &mut Machine<Operand>, _args: &[usize]) {
    let rhs = machine.operand_pop().to_i().unwrap();
    let lhs = machine.operand_pop().to_i().unwrap();
    machine.operand_push(Operand::I(lhs + rhs));
}

fn call(machine: &mut Machine<Operand>, args: &[usize]) {
    let label = machine.get_data(args[0]).clone();
    machine.call(label.to_s().unwrap());
}

fn ret(machine: &mut Machine<Operand>, _args: &[usize]) {
    machine.ret();
}

fn instruction_table() -> InstructionTable<Operand> {
    let mut it = InstructionTable::new();
    it.insert(Instruction::new(0, "push", 1, push));
    it.insert(Instruction::new(1, "add", 0, add));
    it.insert(Instruction::new(2, "call", 1, call));
    it.insert(Instruction::new(3, "ret", 0, ret));
    it
}

impl From<i64> for Operand {
    fn from(i: i64) -> Self {
        Operand::I(i)
    }
}

impl<'a> From<&'a str> for Operand {
    fn from(s: &'a str) -> Self {
        Operand::S(s.to_string())
    }
}

#[test]
fn example() {
    let it = instruction_table();
    let mut builder: Builder<Operand> = Builder::new(&it);
    builder.push("push", vec![Operand::from(3)]);
    builder.push("push", vec![Operand::from(4)]);
    builder.push("call", vec![Operand::from("add_fun")]);
    builder.push("ret", vec![]);
    builder.label("add_fun");
    builder.push("add", vec![]);
    builder.push("ret", vec![]);
    let mut machine: Machine<Operand> = Machine::new(Code::from(builder), &it);
    machine.run();
    let result = machine.operand_pop().to_i().unwrap();
    assert_eq!(result, 7);
}
Enter fullscreen mode Exit fullscreen mode

Awesome. Now we have function calling. In the next episode I'll show you how to implement conditional instructions.

💖 💪 🙅 🚩
jimsy
James Harton

Posted on September 14, 2018

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related