Writing a new programming language. Part IV: Boolean expressions and if statements

kgrech

Konstantin Grechishchev

Posted on August 18, 2022

Writing a new programming language. Part IV: Boolean expressions and if statements

It was pleasing to see the number of reactions and reads on the last part of the tutorial! As promised, I am publishing part IV, where we will look into boolean type support and more complex programs with if statements!

Goal for today

We have two goals for today:

  • Extend LR-language to support boolean type as well as boolean operations, such as <, <=, >, >=, ==, !=, &&, || and !.
  • Add the support of the if statements. We would like to be able to run the following program as the result of today's excise:
{
    let pi: Float = 3.14;
    let r: Int = 5;
    let square: Float = 2 * pi * r * r;
    let value: String = "The square of the circle with the r = " + r + " is " + square + ".";
    if square > 100.0 && square <= 300.0 {
        value = value + " It is > 100.";
        if square <= 200.0 {
            value = value + " It is <= 200.";
        } else {
            value = value + " It is > 200.";
        }
    } else {
        value = value + " It is <= 100.";
    }
}
Enter fullscreen mode Exit fullscreen mode

Let's start!

Boolean type and operations

Boolean expressions are required to implement conditions and loops (coming soon!). There are a few main challenges we have to address to support boolean logic in LR-language:

  • Introduce the boolean type itself
  • Support parsing, representation in the syntax tree, and execution of a decent amount of boolean operations (see above)
  • Ensure we respect the expression precedence rules between among operations and existing arithmetical operations. For example, for expression 2 + 2 == 4 && 5 > 2 * 4, we need to ensure to compute * and + first, follow with == and > and execute conjunction as the last step.

Despite it looking like a lot of work, I'll try not to spend a lot of time on it. My main reason is that boolean implementation is just a bit more complex version of the work we've done in parts II and III of this tutorial. I will only cover the main ideas behind the implementation to help you to navigate through the code on your own. You can see all the changes required to be made in this commit.

Boolean type

Adding the new type is no different compared to the work we've done last time. We just have to do the following steps:

  • Make changes to Value and Type enums to have a new variant for the type called Value::Bool and Type::Bool.
  • Modify the grammar to recognize true and false terminals and boolean literals and add the
BoolLiteral => Box::new(Expr::Constant(Value::Bool(<>)))
Enter fullscreen mode Exit fullscreen mode

line to make use of them in expressions.

  • Make changes to existing implementations of the Add, Sub, Mul and Div traits to support Value::Bool variant. Lucky it is not a lot of effort as most of the operations are not defined for the boolean types, except probably a concatenation with a string.

Expressions

We can continue with parsing and execution of the operations once the type is defined. It is important to define the operator precedence correctly. We are not going to reinvent the wheel and will just use the same precedence used in rust:

  • The ! operator is the top priority
  • After that, we need to take care of existing math operations
  • Next, we evaluate comparison operations <, <=, >, >=, ==, !=
  • Finally, we have conjunction operator (and) && followed by disjunction (or) ||.

Negation operation

Here is the first real challenge. Negation operator ! accepts only one operand and so far we can only represent the expressions containing two operands and an operation between them! In order to overcome it, we rename the existing Opcode struct to BinaryOpcode and introduce a new one:

#[derive(PartialEq, Eq, Hash, Debug, Clone, Copy)]
pub enum UnaryOpcode {
    Not,
}
Enter fullscreen mode Exit fullscreen mode

We can then make the changes to Expr enum to represent both unary and binary operations:

pub enum Expr {
    Constant(Value),
    Identifier(String),
    BinaryOp(Box<Expr>, BinaryOpcode, Box<Expr>),
    UnaryOp(UnaryOpcode, Box<Expr>),
}
Enter fullscreen mode Exit fullscreen mode

We are now forced to introduce the new match branch to the evalutate_expression function in order to make the code to compile after the above change:

Expr::UnaryOp(op, exp) => {
    let result = match op {
        UnaryOpcode::Not => !evalutate_expression(frame, exp)?,
    };
    result.map_err(|e| ExpressionError::OperationError(expr.to_string(), e))
}
Enter fullscreen mode Exit fullscreen mode

Note that we apply the ! operator to Value struct here. Rust will only allow us to do so if we implement the Not trait for it:

impl Not for Value {
    type Output = Result<Value, OperationError>;

    fn not(self) -> Self::Output {
        match self {
            Value::Int(v) => Ok(Value::Int(!v)),
            Value::Float(_) => unary_error!(Not, Float),
            Value::String(_) => unary_error!(Not, String),
            Value::Bool(v) => Ok(Value::Bool(!v)),
        }
    }
}
Enter fullscreen mode Exit fullscreen mode

As a bonus, we've made ! to work as a bitwise negation for Int type. The unary_error is a tiny helper macro to return an error.

Finally, we modify our grammar to be able to parse ! unary operation:

UnaryResult: Box<Expr> = {
    UnaryOp Term => Box::new(Expr::UnaryOp(<>)),
    Term,
};

UnaryOp: UnaryOpcode = {
    "!" => UnaryOpcode::Not,
}
Enter fullscreen mode Exit fullscreen mode

Comparison and logical operations

We are going to apply the same trick as in part II to implement the operator precedence correctly. As a reminder, the idea is to avoid the necessity to deal with the ordering in runtime and instead take care of it at the phase of parsing.

First, let's rename existing Expr non-terminal into Sum to better reflect the meaning of it. We would then apply the following induction rules:

  • The result of comparison is either just a single Sum or two Sum with a compassion operator in between
  • The conjunction is either just a single Comparison or a conjunction followed by && and followed by Comparison.
  • Finally, the disjunction is either just a single Conjunction or a disjunction followed by || and followed by Conjunction.

This is how the AST for 1 + 2 == 3 && 4 < 5 * 6 expression could be visualized as graph:
Image description

The above idea could be expressed in the formar grammar in the following way:

pub Expr: Box<Expr> = {
    Disjunction
}

Disjunction: Box<Expr> = {
    Disjunction DisjOp Conjunction => Box::new(Expr::BinaryOp(<>)),
    Conjunction,
};

DisjOp: BinaryOpcode = {
    "||" => BinaryOpcode::Disj,
}

Conjunction: Box<Expr> = {
    Conjunction ConjOp Comparison => Box::new(Expr::BinaryOp(<>)),
    Comparison,
};

ConjOp: BinaryOpcode = {
    "&&" => BinaryOpcode::Conj,
}

Comparison: Box<Expr> = {
    Summ CompareOp Summ => Box::new(Expr::BinaryOp(<>)),
    Summ,
};

CompareOp: BinaryOpcode = {
    "==" => BinaryOpcode::Equals,
    "!=" => BinaryOpcode::NotEquals,
    "<" => BinaryOpcode::Lower,
    ">" => BinaryOpcode::Greater,
    "<=" => BinaryOpcode::LowerEquals,
    ">=" => BinaryOpcode::GreaterEquals,
}
Enter fullscreen mode Exit fullscreen mode

The reason this grammar enforces the construction of the above graph is that to produce, for example, Comparison we first produce two Summ involved. This way, the Expr for Comparison would be applied to the result of the Sum expressions evaluation as operands.

We are able to parse new expressions now, but not yet able to evaluate them. You might expect to see implementations of a few other rust traits (like PartialEq) to define required operations, however there are a few problems with this approach. First, rust does not allow to redefine && and || operations. Second, unlike Add, Sub, Mul and Div traits, the PartialEq does not allow to use Result as a return type. In other words, PartialEq expects the comparison not to fail, which is not true for our case.

Luckily enough, we have no need to implement the traits! We can just define functions for required operations and called them instead in the evalutate_expression method:

BinaryOpcode::Conj => conjunction(value_1, value_2),
BinaryOpcode::Disj => disjunction(value_1, value_2),
BinaryOpcode::Equals => equals(value_1, value_2),
BinaryOpcode::NotEquals => not_equals(value_1, value_2),
BinaryOpcode::Greater => greater(value_1, value_2),
BinaryOpcode::GreaterEquals => greater_equals(value_1, value_2),
BinaryOpcode::Lower => lower(value_1, value_2),
BinaryOpcode::LowerEquals => lower_equals(value_1, value_2),
Enter fullscreen mode Exit fullscreen mode

To keep this tutorial short, I avoid providing the implementation of the above functions here. You can find it in the operations/logical.rs file in this commit.

Nested code blocks

Before we jump into the implementation of the condition operator, we have to solve another problem with our code. Do you remember an optional parent: Option<Box<Frame>> field inside the Frame struct which we've planned for the parent frame variables lookup? So far we never had a need to support nested frames and never set it to Some. Let's fix it!

In order to do so, we introduce a new structure called Block and consider the block as one variant of the statement enum:

Statement: Statement = {
    "let" <Identifier> ":" <Type> "=" <Expr> ";" => Statement::new_definition(<>),
    <Identifier> "=" <Expr> ";" => Statement::new_assignment(<>),
    Block => Statement::new_block(<>)
};

Block: Block = {
    "{" <Statements> "}" => Block::new_statements(<>),
}
Enter fullscreen mode Exit fullscreen mode

The execute_statement match should be now extended to have a new branch:

Statement::Block(block) => {
    frame = execute_block(frame, block)?;
    Ok(frame)
}
Enter fullscreen mode Exit fullscreen mode

To execute the block, we first create a new child frame with the current frame as a parent, execute the statements of the block using a new frame and get the (possibly modified) parent frame back:

pub fn execute_block(frame: Frame, block: &Block) -> Result<Frame, RuntimeError> {
    match block {
        Block::StatementsBlock(statements) => {
            let mut block_frame = Frame::new(Box::new(frame));
            block_frame = execute_statements(block_frame, statements)?;
            Ok(*block_frame.take_parent().unwrap())
        }
    }
}
Enter fullscreen mode Exit fullscreen mode

If Statements

It is time to implement the main concept of today's tutorial! We are going to support the following syntax:

if expression { statements } else { statements }
Enter fullscreen mode Exit fullscreen mode

The { and } are mandatory (for simplicity) and the else branch is optional.

The representation of the above using the grammar is the following:

Block: Block = {
    "{" <Statements> "}" => Block::new_statements(<>),
    "if" <Expr> "{" <Statements> "}" => Block::new_condition(<>, None),
    "if" <exp:Expr> "{" <thn:Statements> "}" "else" "{" <els:Statements> "}" => Block::new_condition(exp, thn, Some(els)),
}
Enter fullscreen mode Exit fullscreen mode

Here is the final code of the Block struct:

pub enum Block {
    StatementsBlock(Vec<Statement>),
    Condition {
        expression: Box<Expr>,
        then_block: Vec<Statement>,
        else_block: Option<Vec<Statement>>,
    },
}
Enter fullscreen mode Exit fullscreen mode

Since we've added a new Condition variant, we need to update execute_block again:

pub fn execute_block(frame: Frame, block: &Block) -> Result<Frame, RuntimeError> {
    match block {
        Block::StatementsBlock(statements) => {
            // Existing code
        }
        Block::Condition {
            expression,
            then_block,
            else_block,
        } => execute_condition(frame, expression.as_ref(), then_block, else_block),
    }
}
Enter fullscreen mode Exit fullscreen mode

The main logic is hidden in the execute_condition method:

pub fn execute_condition(
    mut frame: Frame,
    expr: &Expr,
    then_block: &[Statement],
    else_block: &Option<Vec<Statement>>,
) -> Result<Frame, RuntimeError> {
    let value = evalutate_expression(&mut frame, expr)?;
    if let Value::Bool(value) = value {
        let mut block_frame = Frame::new(Box::new(frame));
        if value {
            block_frame = execute_statements(block_frame, then_block)?;
        } else if let Some(else_block) = else_block {
            block_frame = execute_statements(block_frame, else_block)?;
        }
        Ok(*block_frame.take_parent().unwrap())
    } else {
        Err(RuntimeError::NonBooleanCondition(
            expr.to_string(),
            (&value).into(),
        ))
    }
}
Enter fullscreen mode Exit fullscreen mode

We evaluate the condition expression, ensure that the result is boolean and execute then or else branch (if define) depending on the condition value.

Summary

We've done it! The LR-Language supports if statements now! Here is how the main frame of the above program looks after the execution:

Frame {
    parent: None,
    local_variables: {
        "pi": Float(
            3.14,
        ),
        "r": Int(
            5,
        ),
        "value": String(
            "The square of the circle with the r = 5 is 157. It is > 100. It is <= 200.",
        ),
        "square": Float(
            157.0,
        ),
    },
}
Enter fullscreen mode Exit fullscreen mode

Please stay in touch for Part V, where I plan to implement for and while loops! I will publish it if this post gets 30 reactions!

The state of the source code after this part could be found in the part_4 branch of the GitHub repository!

💖 💪 🙅 🚩
kgrech
Konstantin Grechishchev

Posted on August 18, 2022

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related