How the Code Formatter Is Implemented in Turtle Graphics

amrdeveloper

Amr Hesham

Posted on November 3, 2022

Hello everyone. In the last article I introduced the Turtle Graphics Android app project, with implementation details and resources about the scripting language, the editor, generating documentation, etc. After I published the app, it got more than 2000 downloads along with good ratings and feedback, so I decided to add support for code formatting. In this article, I will explain in detail how a simple code formatter works and how I implemented one in the Turtle Graphics app.

Code formatter

For programmers, code formatters are an essential tool in our day-to-day jobs. They make code easier to read, but have you ever asked yourself how they work?

Before talking about code formatters, let's first talk about how compilers turn your code from text into a data structure that they can process, for example to do type checking.

Let's start our story with a file that contains a simple hello world example:

fun main() {
   print("Hello, World!")
}

The first step is to read this text file and convert it into a list of tokens. A token is an object that represents a keyword, number, bracket, string, etc., together with its position in the source code, for example:

data class Token (
   val kind : TokenKind,
   val literal : String,
   val line : Int,
)

We can also save the file name and the start and end columns, so that when we want to report an error we can provide useful information about its position, for example:

Error in File Main Line 10: Missing semicolon :D
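As a rough sketch of how that extra position information could be carried around and turned into a message, assuming hypothetical names (SourcePosition and formatError are made up for illustration and are not taken from the app):

```kotlin
// Hypothetical position record; the field names are assumptions for this sketch.
data class SourcePosition(
    val fileName: String,
    val line: Int,
    val columnStart: Int,
    val columnEnd: Int,
)

// Builds a message in the same shape as the example above, plus the column range.
fun formatError(position: SourcePosition, message: String): String =
    "Error in File ${position.fileName} Line ${position.line}, " +
        "Columns ${position.columnStart}-${position.columnEnd}: $message"
```

A token could then hold a SourcePosition instead of only a line number, so every later step can point at the exact place in the source.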

This step is called the scanner, lexer, or tokenizer, and at the end of it we have a list of tokens, for example:

{ FUN_KEYWORD, "fun", 1 }
{ IDENTIFIER, "main", 1 }
{ LEFT_PAREN, "(", 1 }
{ RIGHT_PAREN, ")", 1 }
{ LEFT_BRACE, "{", 1 }
{ IDENTIFIER, "print", 2 }
{ LEFT_PAREN, "(", 2 }
{ STRING, "Hello, World!", 2 }
{ RIGHT_PAREN, ")", 2 }
{ RIGHT_BRACE, "}", 3 }

The result is a list of tokens:

val tokens : List<Token> = tokenizer(input)

Note that in this step we can already check for some errors, such as an unterminated string or char, unsupported symbols, etc.
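To make the tokenizer step more concrete, here is a minimal sketch that handles just enough syntax for the hello world example. It is not the Turtle Graphics tokenizer: the TokenKind values and the error messages are assumptions, and it ignores numbers, escape sequences, and strings that span several lines.

```kotlin
enum class TokenKind { FUN_KEYWORD, IDENTIFIER, STRING, LEFT_PAREN, RIGHT_PAREN, LEFT_BRACE, RIGHT_BRACE }

data class Token(val kind: TokenKind, val literal: String, val line: Int)

// Single-character tokens handled by this sketch.
val symbols = mapOf(
    '(' to TokenKind.LEFT_PAREN, ')' to TokenKind.RIGHT_PAREN,
    '{' to TokenKind.LEFT_BRACE, '}' to TokenKind.RIGHT_BRACE,
)

fun tokenizer(input: String): List<Token> {
    val tokens = mutableListOf<Token>()
    var line = 1
    var i = 0
    while (i < input.length) {
        val c = input[i]
        when {
            c == '\n' -> { line++; i++ }
            c.isWhitespace() -> i++
            c in symbols -> { tokens.add(Token(symbols.getValue(c), c.toString(), line)); i++ }
            c == '"' -> {
                val end = input.indexOf('"', i + 1)
                // Lexical error: the closing quote is missing.
                if (end == -1) error("Line $line: unterminated string")
                tokens.add(Token(TokenKind.STRING, input.substring(i + 1, end), line))
                i = end + 1
            }
            c.isLetter() -> {
                val start = i
                while (i < input.length && input[i].isLetterOrDigit()) i++
                val word = input.substring(start, i)
                val kind = if (word == "fun") TokenKind.FUN_KEYWORD else TokenKind.IDENTIFIER
                tokens.add(Token(kind, word, line))
            }
            // Lexical error: a character this sketch does not know about.
            else -> error("Line $line: unsupported symbol '$c'")
        }
    }
    return tokens
}
```

Here error() simply throws an exception; a real tokenizer would more likely collect diagnostics and keep scanning so it can report several problems at once.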

After this step, you can forget about your text file and deal only with this list of tokens. Now we convert groups of tokens into nodes according to our language grammar. For example, when we see FUN_KEYWORD, we know we are building a function declaration node, and we expect a name, parentheses, parameters, etc.

In this step, we need a data structure that represents the program in a way we can traverse and validate later. It is called an Abstract Syntax Tree (AST). Each node in the AST represents a statement, such as if, while, a function declaration, or a variable declaration, or an expression, such as an assignment or a unary operation. Each node stores the information required by the next steps, for example:

Function Declaration

data class Function (
   var name : String,
   var arguments : List<Argument>,
   var body : List<Statement>
)

Variable Declaration

data class Var (
   var name : String,
   var value : Expression
)

This step is called parsing, and we end up with an AST object that we can use later to traverse all nodes.

var astNode = parse(tokens)
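As an illustration of what parse could look like, here is a small recursive descent sketch for a deliberately tiny grammar: a function without parameters whose body contains only calls with one string argument. The node types are simplified assumptions (the function node is named FunctionNode here to avoid any confusion with Kotlin's built-in Function type), and the grammar is invented for this example rather than taken from the app.

```kotlin
enum class TokenKind { FUN_KEYWORD, IDENTIFIER, STRING, LEFT_PAREN, RIGHT_PAREN, LEFT_BRACE, RIGHT_BRACE }
data class Token(val kind: TokenKind, val literal: String, val line: Int)

// Simplified nodes for this sketch only.
data class Call(val callee: String, val argument: String)
data class FunctionNode(val name: String, val body: List<Call>)

class Parser(private val tokens: List<Token>) {
    private var current = 0

    // Consumes the next token if it has the expected kind, otherwise reports a syntax error.
    private fun expect(kind: TokenKind): Token {
        val token = tokens.getOrNull(current) ?: error("Expected $kind but reached end of input")
        if (token.kind != kind) error("Line ${token.line}: expected $kind but found ${token.kind}")
        current++
        return token
    }

    // function := "fun" IDENTIFIER "(" ")" "{" call* "}"
    fun parseFunction(): FunctionNode {
        expect(TokenKind.FUN_KEYWORD)
        val name = expect(TokenKind.IDENTIFIER).literal
        expect(TokenKind.LEFT_PAREN)
        expect(TokenKind.RIGHT_PAREN)
        expect(TokenKind.LEFT_BRACE)
        val body = mutableListOf<Call>()
        while (tokens.getOrNull(current)?.kind == TokenKind.IDENTIFIER) body.add(parseCall())
        expect(TokenKind.RIGHT_BRACE)
        return FunctionNode(name, body)
    }

    // call := IDENTIFIER "(" STRING ")"
    private fun parseCall(): Call {
        val callee = expect(TokenKind.IDENTIFIER).literal
        expect(TokenKind.LEFT_PAREN)
        val argument = expect(TokenKind.STRING).literal
        expect(TokenKind.RIGHT_PAREN)
        return Call(callee, argument)
    }
}

fun parse(tokens: List<Token>): FunctionNode = Parser(tokens).parseFunction()
```

Each grammar rule becomes one function, and the FUN_KEYWORD token is what tells the parser which rule to follow.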

If the language is statically typed, such as Java, C, or Go, we go on to the type checker step. The goal of this step is to check that the user uses types correctly. For example, if the user declares a variable with type int, it should store only integers, and an if condition must be a boolean (or an integer in a language like C).
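A toy version of such a check might look like the sketch below. It only knows literal values and one rule (the declared type must match the type of the initial value); every name in it, including Type, TypedVar, and checkVar, is an assumption made for the example.

```kotlin
enum class Type { INT, BOOL, STRING }

sealed interface Expression
data class IntLiteral(val value: Int) : Expression
data class BoolLiteral(val value: Boolean) : Expression
data class StringLiteral(val value: String) : Expression

// A variable declaration that carries an explicit type annotation.
data class TypedVar(val name: String, val declaredType: Type, val value: Expression)

// Computes the type of an expression; literals are the only case in this sketch.
fun typeOf(expression: Expression): Type = when (expression) {
    is IntLiteral -> Type.INT
    is BoolLiteral -> Type.BOOL
    is StringLiteral -> Type.STRING
}

// Returns an error message, or null when the declaration is well typed.
fun checkVar(node: TypedVar): String? {
    val actual = typeOf(node.value)
    return if (actual == node.declaredType) null
    else "Variable ${node.name} is declared as ${node.declaredType} but initialized with $actual"
}
```

A real type checker also has to resolve names, function calls, and operators, but the idea stays the same: compute a type for each expression and compare it with what the context expects.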


After this step, we have the same AST, but now we know that it is valid, and we can compile it to any target or evaluate it. We can also do formatting, static analysis, optimization, code style checks, etc.

For example, suppose that we want all developers to declare variables without using _ in the name. To check that, we traverse our AST to find all Var nodes and check them:

fun checkVarDeclaration(node : Var) {
   if (node.name.contains("_")) {
      reportError("Oops, your variable name ${node.name} contains _")
   }
}
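The traversal that finds every Var node can be sketched as a recursive function. The node types here are simplified assumptions: Block stands in for anything that contains other statements (a function body, an if, a while), and findNamingErrors is a hypothetical name.

```kotlin
sealed interface Statement

// The value is left out because this check only looks at the name.
data class Var(val name: String) : Statement

// Stand-in for any node that contains nested statements.
data class Block(val body: List<Statement>) : Statement

// Walks the tree depth first and collects one message per offending variable.
fun findNamingErrors(node: Statement): List<String> = when (node) {
    is Var ->
        if ("_" in node.name) listOf("Oops, your variable name ${node.name} contains _")
        else emptyList()
    is Block -> node.body.flatMap { findNamingErrors(it) }
}
```

Returning the messages as a list instead of reporting them directly keeps the check easy to test and lets the caller decide how to show them.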

Now we want to format the code, so how do we do that? It works the same way: we traverse our AST, and for each node we write it back to text, but formatted, for example:

fun formatVarDeclaration(node : Var) : String {
   // `indentation` is assumed to hold the whitespace for the current nesting level
   val builder = StringBuilder()
   builder.append(indentation)
   builder.append("var ")
   builder.append(node.name)
   builder.append(" = ")
   builder.append(formatValue(node.value))
   builder.append("\n")
   return builder.toString()
}

In this simple method, we write the node back to a string with the correct indentation and add a new line after it, so that no two variables are declared on the same line. The value is also formatted, using another function. You can use the Visitor design pattern to make it easy to handle all node types.
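Here is one way the Visitor pattern could be applied to formatting, as a sketch rather than the app's actual code. It assumes simplified nodes (the value is already a string, and the function node is called FunctionNode), a three-space indent, and hypothetical names for the visitor interface and the Formatter class.

```kotlin
interface Visitor<R> {
    fun visitVar(node: Var): R
    fun visitFunction(node: FunctionNode): R
}

interface Statement {
    fun <R> accept(visitor: Visitor<R>): R
}

data class Var(val name: String, val value: String) : Statement {
    override fun <R> accept(visitor: Visitor<R>): R = visitor.visitVar(this)
}

data class FunctionNode(val name: String, val body: List<Statement>) : Statement {
    override fun <R> accept(visitor: Visitor<R>): R = visitor.visitFunction(this)
}

// Writes each node back to text; `depth` tracks how deeply the current node is nested.
class Formatter(private val indentUnit: String = "   ") : Visitor<String> {
    private var depth = 0

    private fun indentation() = indentUnit.repeat(depth)

    override fun visitVar(node: Var) = "${indentation()}var ${node.name} = ${node.value}\n"

    override fun visitFunction(node: FunctionNode): String {
        val builder = StringBuilder()
        builder.append(indentation()).append("fun ${node.name}() {\n")
        depth++ // statements in the body are indented one level deeper
        node.body.forEach { builder.append(it.accept(this)) }
        depth--
        builder.append(indentation()).append("}\n")
        return builder.toString()
    }
}
```

Calling accept on the root node with a Formatter returns the formatted text for the whole tree. Adding a new node type means adding one visit method, and other passes such as a style checker can reuse the same accept methods.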

At the end of this step, we end up with a string that represents the same input file but formatted and then we write it back to the file.

This is the basic implementation of a code formatter. A real production code formatter must handle more cases. For example, what if the code is not valid? Should I format only valid code? Should we read the whole program every time we want to format or compile the code?
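One simple answer to the first question is to leave the text untouched whenever parsing fails. The sketch below shows only that policy, assuming the parser signals a syntax error by throwing IllegalStateException (which is what Kotlin's error() function does); formatIfValid and its parameters are hypothetical names.

```kotlin
// Formats the source only when it parses; otherwise returns it unchanged.
// `parse` and `format` are passed in so the policy stays independent of the language.
fun <A> formatIfValid(source: String, parse: (String) -> A, format: (A) -> String): String {
    val ast = try {
        parse(source)
    } catch (e: IllegalStateException) {
        // Invalid code: keep the user's text exactly as it was.
        return source
    }
    return format(ast)
}
```

More advanced tools use error-tolerant parsers so they can still format the valid parts of a broken file, but that is a much bigger topic.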

Now back to Turtle Graphics. In this project I had already done all the required steps and had a ready AST, so I just write it back with code like you saw above ^_^. In my case, I read the code from the UI, format it, and write it back to the UI.

If you are interested and want to read more I suggest

  • Read at least one compiler book, such as Crafting Interpreters
  • Read about the Language Server Protocol (LSP)
  • Watch the TypeScript compiler explained by its author, Anders Hejlsberg
  • Think about what else you can do with your program once you have it as an AST

I hope you enjoyed my article.

Enjoy Programming 😋.
