Writing a new programming language. Part IV: Boolean expressions and if statements
Konstantin Grechishchev
Posted on August 18, 2022
It was pleasing to see the number of reactions and reads on the last part of the tutorial! As promised, I am publishing part IV, where we will look into boolean type support and more complex programs with if
statements!
Goal for today
We have two goals for today:
- Extend LR-language to support boolean type as well as boolean operations, such as
<
,<=
,>
,>=
,==
,!=
,&&
,||
and!
. - Add the support of the if statements. We would like to be able to run the following program as the result of today's excise:
{
let pi: Float = 3.14;
let r: Int = 5;
let square: Float = 2 * pi * r * r;
let value: String = "The square of the circle with the r = " + r + " is " + square + ".";
if square > 100.0 && square <= 300.0 {
value = value + " It is > 100.";
if square <= 200.0 {
value = value + " It is <= 200.";
} else {
value = value + " It is > 200.";
}
} else {
value = value + " It is <= 100.";
}
}
Let's start!
Boolean type and operations
Boolean expressions are required to implement conditions and loops (coming soon!). There are a few main challenges we have to address to support boolean logic in LR-language:
- Introduce the boolean type itself
- Support parsing, representation in the syntax tree, and execution of a decent amount of boolean operations (see above)
- Ensure we respect the expression precedence rules between among operations and existing arithmetical operations. For example, for expression
2 + 2 == 4 && 5 > 2 * 4
, we need to ensure to compute*
and+
first, follow with==
and>
and execute conjunction as the last step.
Despite it looking like a lot of work, I'll try not to spend a lot of time on it. My main reason is that boolean implementation is just a bit more complex version of the work we've done in parts II and III of this tutorial. I will only cover the main ideas behind the implementation to help you to navigate through the code on your own. You can see all the changes required to be made in this commit.
Boolean type
Adding the new type is no different compared to the work we've done last time. We just have to do the following steps:
- Make changes to
Value
andType
enums to have a new variant for the type calledValue::Bool
andType::Bool
. - Modify the grammar to recognize
true
andfalse
terminals and boolean literals and add the
BoolLiteral => Box::new(Expr::Constant(Value::Bool(<>)))
line to make use of them in expressions.
- Make changes to existing implementations of the Add, Sub, Mul and Div traits to support
Value::Bool
variant. Lucky it is not a lot of effort as most of the operations are not defined for the boolean types, except probably a concatenation with a string.
Expressions
We can continue with parsing and execution of the operations once the type is defined. It is important to define the operator precedence correctly. We are not going to reinvent the wheel and will just use the same precedence used in rust:
- The
!
operator is the top priority - After that, we need to take care of existing math operations
- Next, we evaluate comparison operations
<
,<=
,>
,>=
,==
,!=
- Finally, we have conjunction operator (and)
&&
followed by disjunction (or)||
.
Negation operation
Here is the first real challenge. Negation operator !
accepts only one operand and so far we can only represent the expressions containing two operands and an operation between them! In order to overcome it, we rename the existing Opcode
struct to BinaryOpcode
and introduce a new one:
#[derive(PartialEq, Eq, Hash, Debug, Clone, Copy)]
pub enum UnaryOpcode {
Not,
}
We can then make the changes to Expr
enum to represent both unary and binary operations:
pub enum Expr {
Constant(Value),
Identifier(String),
BinaryOp(Box<Expr>, BinaryOpcode, Box<Expr>),
UnaryOp(UnaryOpcode, Box<Expr>),
}
We are now forced to introduce the new match branch to the evalutate_expression
function in order to make the code to compile after the above change:
Expr::UnaryOp(op, exp) => {
let result = match op {
UnaryOpcode::Not => !evalutate_expression(frame, exp)?,
};
result.map_err(|e| ExpressionError::OperationError(expr.to_string(), e))
}
Note that we apply the !
operator to Value
struct here. Rust will only allow us to do so if we implement the Not trait for it:
impl Not for Value {
type Output = Result<Value, OperationError>;
fn not(self) -> Self::Output {
match self {
Value::Int(v) => Ok(Value::Int(!v)),
Value::Float(_) => unary_error!(Not, Float),
Value::String(_) => unary_error!(Not, String),
Value::Bool(v) => Ok(Value::Bool(!v)),
}
}
}
As a bonus, we've made !
to work as a bitwise negation for Int
type. The unary_error
is a tiny helper macro to return an error.
Finally, we modify our grammar to be able to parse !
unary operation:
UnaryResult: Box<Expr> = {
UnaryOp Term => Box::new(Expr::UnaryOp(<>)),
Term,
};
UnaryOp: UnaryOpcode = {
"!" => UnaryOpcode::Not,
}
Comparison and logical operations
We are going to apply the same trick as in part II to implement the operator precedence correctly. As a reminder, the idea is to avoid the necessity to deal with the ordering in runtime and instead take care of it at the phase of parsing.
First, let's rename existing Expr
non-terminal into Sum
to better reflect the meaning of it. We would then apply the following induction rules:
- The result of comparison is either just a single
Sum
or twoSum
with a compassion operator in between - The conjunction is either just a single
Comparison
or a conjunction followed by&&
and followed byComparison
. - Finally, the disjunction is either just a single
Conjunction
or a disjunction followed by||
and followed byConjunction
.
This is how the AST for 1 + 2 == 3 && 4 < 5 * 6
expression could be visualized as graph:
The above idea could be expressed in the formar grammar in the following way:
pub Expr: Box<Expr> = {
Disjunction
}
Disjunction: Box<Expr> = {
Disjunction DisjOp Conjunction => Box::new(Expr::BinaryOp(<>)),
Conjunction,
};
DisjOp: BinaryOpcode = {
"||" => BinaryOpcode::Disj,
}
Conjunction: Box<Expr> = {
Conjunction ConjOp Comparison => Box::new(Expr::BinaryOp(<>)),
Comparison,
};
ConjOp: BinaryOpcode = {
"&&" => BinaryOpcode::Conj,
}
Comparison: Box<Expr> = {
Summ CompareOp Summ => Box::new(Expr::BinaryOp(<>)),
Summ,
};
CompareOp: BinaryOpcode = {
"==" => BinaryOpcode::Equals,
"!=" => BinaryOpcode::NotEquals,
"<" => BinaryOpcode::Lower,
">" => BinaryOpcode::Greater,
"<=" => BinaryOpcode::LowerEquals,
">=" => BinaryOpcode::GreaterEquals,
}
The reason this grammar enforces the construction of the above graph is that to produce, for example, Comparison
we first produce two Summ
involved. This way, the Expr
for Comparison
would be applied to the result of the Sum
expressions evaluation as operands.
We are able to parse new expressions now, but not yet able to evaluate them. You might expect to see implementations of a few other rust traits (like PartialEq) to define required operations, however there are a few problems with this approach. First, rust does not allow to redefine &&
and ||
operations. Second, unlike Add, Sub, Mul and Div traits, the PartialEq does not allow to use Result
as a return type. In other words, PartialEq
expects the comparison not to fail, which is not true for our case.
Luckily enough, we have no need to implement the traits! We can just define functions for required operations and called them instead in the evalutate_expression
method:
BinaryOpcode::Conj => conjunction(value_1, value_2),
BinaryOpcode::Disj => disjunction(value_1, value_2),
BinaryOpcode::Equals => equals(value_1, value_2),
BinaryOpcode::NotEquals => not_equals(value_1, value_2),
BinaryOpcode::Greater => greater(value_1, value_2),
BinaryOpcode::GreaterEquals => greater_equals(value_1, value_2),
BinaryOpcode::Lower => lower(value_1, value_2),
BinaryOpcode::LowerEquals => lower_equals(value_1, value_2),
To keep this tutorial short, I avoid providing the implementation of the above functions here. You can find it in the operations/logical.rs
file in this commit.
Nested code blocks
Before we jump into the implementation of the condition operator, we have to solve another problem with our code. Do you remember an optional parent: Option<Box<Frame>>
field inside the Frame
struct which we've planned for the parent frame variables lookup? So far we never had a need to support nested frames and never set it to Some
. Let's fix it!
In order to do so, we introduce a new structure called Block
and consider the block as one variant of the statement enum:
Statement: Statement = {
"let" <Identifier> ":" <Type> "=" <Expr> ";" => Statement::new_definition(<>),
<Identifier> "=" <Expr> ";" => Statement::new_assignment(<>),
Block => Statement::new_block(<>)
};
Block: Block = {
"{" <Statements> "}" => Block::new_statements(<>),
}
The execute_statement
match should be now extended to have a new branch:
Statement::Block(block) => {
frame = execute_block(frame, block)?;
Ok(frame)
}
To execute the block, we first create a new child frame with the current frame as a parent, execute the statements of the block using a new frame and get the (possibly modified) parent frame back:
pub fn execute_block(frame: Frame, block: &Block) -> Result<Frame, RuntimeError> {
match block {
Block::StatementsBlock(statements) => {
let mut block_frame = Frame::new(Box::new(frame));
block_frame = execute_statements(block_frame, statements)?;
Ok(*block_frame.take_parent().unwrap())
}
}
}
If Statements
It is time to implement the main concept of today's tutorial! We are going to support the following syntax:
if expression { statements } else { statements }
The {
and }
are mandatory (for simplicity) and the else branch is optional.
The representation of the above using the grammar is the following:
Block: Block = {
"{" <Statements> "}" => Block::new_statements(<>),
"if" <Expr> "{" <Statements> "}" => Block::new_condition(<>, None),
"if" <exp:Expr> "{" <thn:Statements> "}" "else" "{" <els:Statements> "}" => Block::new_condition(exp, thn, Some(els)),
}
Here is the final code of the Block
struct:
pub enum Block {
StatementsBlock(Vec<Statement>),
Condition {
expression: Box<Expr>,
then_block: Vec<Statement>,
else_block: Option<Vec<Statement>>,
},
}
Since we've added a new Condition
variant, we need to update execute_block
again:
pub fn execute_block(frame: Frame, block: &Block) -> Result<Frame, RuntimeError> {
match block {
Block::StatementsBlock(statements) => {
// Existing code
}
Block::Condition {
expression,
then_block,
else_block,
} => execute_condition(frame, expression.as_ref(), then_block, else_block),
}
}
The main logic is hidden in the execute_condition
method:
pub fn execute_condition(
mut frame: Frame,
expr: &Expr,
then_block: &[Statement],
else_block: &Option<Vec<Statement>>,
) -> Result<Frame, RuntimeError> {
let value = evalutate_expression(&mut frame, expr)?;
if let Value::Bool(value) = value {
let mut block_frame = Frame::new(Box::new(frame));
if value {
block_frame = execute_statements(block_frame, then_block)?;
} else if let Some(else_block) = else_block {
block_frame = execute_statements(block_frame, else_block)?;
}
Ok(*block_frame.take_parent().unwrap())
} else {
Err(RuntimeError::NonBooleanCondition(
expr.to_string(),
(&value).into(),
))
}
}
We evaluate the condition expression, ensure that the result is boolean and execute then or else branch (if define) depending on the condition value.
Summary
We've done it! The LR-Language supports if statements now! Here is how the main frame of the above program looks after the execution:
Frame {
parent: None,
local_variables: {
"pi": Float(
3.14,
),
"r": Int(
5,
),
"value": String(
"The square of the circle with the r = 5 is 157. It is > 100. It is <= 200.",
),
"square": Float(
157.0,
),
},
}
Please stay in touch for Part V, where I plan to implement for
and while
loops! I will publish it if this post gets 30 reactions!
The state of the source code after this part could be found in the part_4 branch of the GitHub repository!
Posted on August 18, 2022
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.
Related
August 18, 2022