CS50: Week 2, Arrays
Luciano Muñoz
Posted on May 2, 2022
Hi veryone! Continuing with the CS50 series, this is what I found interesting in Week 2. Topics included in this lecture were compilers, debugging techniques, memory, and mostly arrays.
We’re using the C programming language, a compiled language. It means that a compiler traduces the source code into machine code and generates the binary file that the computer runs to execute the software. On the other hand we have also interpreted languages, like Python or Javascript, which are traduced at run-time by an interpreter, line by line.
Compilers execute 4 tasks when compiling code: preprocessing, compiling, assembling, and linking.
-
Preprocessing involves replacing lines that start with a
#
, as#include
. For example,#include <cs50.h>
will tell the compiler to look for that header file, since it contains content, like prototypes of functions, that we want to include in our program. Then the compiler will essentially copy and paste the contents of those header files into our program. - Compiling takes our source code, in C, and converts it to another language called assembly language.
- The next step is to take the code in assembly and translate it to binary by assembling it. The instructions in binary are machine code, which a computer’s CPU can run directly.
- The last step is linking, where previously compiled versions of libraries that we included earlier, like
cs50.c
, are combined with the compiled binary of our program. In other words, linking is the process of combining all the machine code into one binary file.
That a language is compiled does not mean that you will never need to debug a program, it is an advantage that the compiler prevents some errors, but we going to debug our programs anyway.
There are different techniques to debug code, like printing variable values on the screen to understand what is doing the software. A better option is using a debugger, which you can run line by line to watch the flow the code follow and how each variable changes. Lastly, we have the rubber duck debugging technique, where we explain what we’re trying to do in our code to a rubber duck (or other inanimate object), and just by talking through our own code out loud step-by-step, we might realize the mistakes we made.
I mentioned in my last post about some of the data types that come with C, which we can use to declare variables and store specific data, like int
, float
, or char
.
Each variable is stored in memory with a fixed number of bytes. These are common type size for most computer systems:
-
bool
is 1 byte size -
char
is 1 byte size -
int
is 4 bytes size, -
float
is 4 bytes size -
double
is 8 bytes size -
long
is 8 bytes size - but talking about
strings
, the size will vary depending on the length of the string.
We can think of bytes stored in RAM as though they were in a grid, one after the other.
A char
whose size is 1 byte will is stored in one of those, but an int
will use four bytes, four of those squares.
When talking about arrays, we’re referring to one of the most widely used data structure in most programming languages.
In an array, we can store more than one value. In C all the values must be of the same type, but in other languages, an array can contain different types of values.
The particularly of arrays is that they’re stored in memory back-to-back, or contiguously, and we can access their values using bracket notation starting from the element 0: array[0]
, array[1]
, array[2]
, etc.
Arrays, unlike the other data types, don’t have a fixed size in memory, it’s up to you to assign the size when defining the array in the code. But, once you set the size of the array it is permanent, and can’t be changed.
C doesn’t have native support for strings, they’re just an array of type char
. Each character in our string is stored in a byte of memory as well:
In C, strings end with a special character, '\0'
, or a byte with all eight bits set to 0, so our programs have a way of knowing where the string ends. This character is called the null character, or NUL. So, we actually need four bytes to store the string HI!
.
The last words of this lecture were about command-line arguments. I’ve seen this before with Python, and now I realize that Python took it from C, and used even the same variable name. In C we have the variables argc
and argv
that the main
function gets automatically when the program is run from the command line. argc
(argument count) store the number of arguments, and argv[]
(argument vector) is an array of the arguments themselves.
These variables receive at least one parameter when we run our software: the program itself.
$ ./name
argc = 1
argv[0] = './name'
$ ./name Luciano
argc = 2
argv[0] = './name'
argv[1] = 'Luciano'
The code below checks if it receives a name as an argument and prints hello, {name}
:
#include <cs50.h>
#include <stdio.h>
int main(int argc, string argv[])
{
if (argc != 2)
{
printf("missing command-line argument\n");
return 1;
}
printf("hello, %s\n", argv[1]);
return 0;
}
Note also that the main
function returns an integer value called an exit status. There are some conventions but by default, we return 0
if nothing went wrong, and 1
or higher if something fails.
Check the code for the Problem set of Week 2.
Next week is about algorithms for sorting and searching, a challenging but necessary topic.
I hope you find interesting what I’m sharing about my learning path. If you’re taking CS50 also, left me your opinion about it.
See you soon! 😉
Thanks for reading. If you are interested in knowing more about me, you can give me a follow or contact me on Instagram, Twitter, or LinkedIn 💗.
Posted on May 2, 2022
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.