Eli
Posted on December 14, 2020
About
I decided one day, that I would learn C++ and make a programming language. It is PieLang. I have been working on it for 6 months, and hope it will come out swell. I dived into it, and found that the water was, way, way, way to deep. I decided to take a break, and learn how Lexing and Parsing work. 2 months later, I continued. It is an advanced topic, and I still don't know it all. But this is the basics to help you start creating your own programming language!
Lexing
Lexing or lex is the first step of making a programming language. I, if I need to provide code snippets, will write them in C++.
Well, Lexing basically takes the source code(file/input) and turns them into a stream of tokens. Take this code for example:
print(2 + 2);
This would return a token stream, something like this:
IDENT: "print"
LPAREN: '('
NUMBER: '2'
OP: '+'
NUMBER: '2'
RPAREN: ')'
EOL: ';'
EOL stands for end of line by the way
You see how that went character by character? It returned a token stream!
If you don't get what I mean, think of this:
std::string Lexer(){
// ...
return token_stream;
}
Or something like that.
Lets do another tokenize...
int main(){
std::cout << "Hello, world!";
}
Let's see what the lexer spits out this time:
IDENT: "int"
IDENT: "main"
LPAREN: '('
RPAREN: ')'
LCURLY: '{'
IDENT: "std::cout"
LESS_THAN: '<'
LESS_THAN: '<'
STRING: "Hello, world!"
EOL: ';'
RCURLY: '}'
Woah, that was much more! Imagine your biggest program, then imagine how many tokens there were!
Note: See how the double quotes 'disappear'? Thats because the lexer sees them, but it doesn't output them as tokens, it just tells the lexer: "Oh, hey, this is a string!"
Double-Note: std::
is a library so if the real lexer saw that it would be like: "Lets access the struct/namespace/class called std! Oh wait, that is a built in lib! Lets access that instead"
Lets imagine that we were checking if something was a print statement(this code is not real)!
if(type == "IDENT" && val == "std::cout"){
//...
std::cout << string;
}
That is not real code, it just helps us understand more!
Now that you know some lexing, its parsing...
Parsing
The parser creates an AST(abstract syntax tree) and then either executes it or sends it to the compiler/interpreter to do the work.
Example of an AST:
+
/ \
1 2
This basic representation of a AST helps us understand it better! Lets take a look here.
The +
symbol is telling that we are adding these numbers.
Then the two numbers are below it(1, 2
).
Lets take a more advanced equation.
1+2*3-4
Hmm, we couldn't do that in a if statement! That's why we have to send it to the parser!
-
/ \
+ 4
/ \
1 *
/\
2 3
Remember that little AST that we made before? Well that is this, just more of them packed together! Don't believe me? Lemme explain. We start of with -
or minuce. (Remember how we started off with the +
last time?) Then we went to +
and 4
. Lets think of the plus as a variable, which it holds this value:
+ = 1+(2*3)
I know that is against the rules of coding, but this is just to show you
You get it? Good. Cause next is the second stage.
The parser eventually solves all of the problems, like so:
# Second stage
-
/ \
+ 4
/ \
1 6
Notice: Notice how the (2*3) turned to six? Yeah. Cool right?
# Third stage
-
/ \
7 4
Yay, we got it all the way down to 7-4
! Now we can just do the equation...
# Final stage
3
Boom! We just solved 1+2*3-4
!
After
It takes a strong mind to actually code these things. I hope you can do this, with the foundation I have created for you.
Links
Here are some useful links!
Craftinginterpreters.com This link helped me with the basic knowledge I know, so go check it out: https://craftinginterpreters.com
PieLang Want to see the progress that I am making with my language? Here is the github link: https://github.com/elipie/PieLang
Then, the most useful link I could give you is: https://google.com Google and any browser is a great source of ideas, and help.
Found something?
Found a bug, error, typo, or something thats not true? Put it in the comments
Happy coding, and stay safe.
Posted on December 14, 2020
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.