Be Grateful for JavaScript Arrays: A Comparison with C
Josh Melo
Posted on October 20, 2022
Ahh, JavaScript arrays. For a lot of us, they're the first data structure we’re introduced to, and for good reason! There are so many ways we can use our little (or sometimes massive) list-like structures. But have you ever lain in bed and been jolted awake by curiosity as to how they work under the hood? No, you’re probably a normal person! But for the rest of you, stick around, and you too might develop a stronger love for JavaScript.
Learning how arrays and simple methods like .push() work in lower-level languages can really deepen your understanding of and appreciation for the wonders of JavaScript arrays. And what better language to do this in than the language that practically runs our world, C?
In this post, we’ll take a look at how you could go about implementing something similar to .push() in C. We'll discuss some of the major differences between C and JavaScript, memory management and heap vs. stack memory in C, array types in C, and variables and pointers. And at the end, hopefully you'll have a newfound appreciation for JavaScript's humble array .push() method!
Background
Here are some key differences between JavaScript and C:
- JavaScript is considered a high-level language, meaning there are a lot of abstractions between you (the developer) and the machine. That's a fancy way of saying that high-level languages do a lot of work for us. Here’s a great article that highlights these differences.
- C requires manual memory management, whereas JavaScript provides automatic memory management. More on this later.
- C must be compiled in advance, whereas JavaScript is compiled right before being executed, often referred to as just-in-time compilation.
- Arrays in C can hold a single data type (char, int, float, etc.), whereas in JavaScript, arrays can hold mixed data types out of the box, such as strings, numbers, and boolean values. (However, arrays in C can also be made to hold different types with some work.)
The ease of developing in high-level languages often comes at the cost of performance. C is generally used for system programming and embedded systems where speed is more of a concern, while JavaScript is generally used in the browser and (more recently) on servers with Node.
Everything you can do in JS you can do in C as well, with a lot of work! The main advantages of higher-level languages are the tools that come out of the box for us developers to use, as well as the ease of cross-platform deployment.
Push It!
Let’s take a look at how we would implement pushing an element into an array in both languages.
JavaScript code:
const myArr = [1,2,3,4,5];
myArr.push(6);
console.log(myArr);
//[1,2,3,4,5,6]
No surprises here. How about in C?
#include <stdio.h>
#include <stdlib.h>
int main(void) {
    // sizeof yields a size_t, so store the result in one
    size_t sizeOfInteger = sizeof(int);
    int *myArr = malloc(sizeOfInteger * 5);
    if (myArr == NULL) {
        fprintf(stderr, "There was an error allocating memory for the array\n");
        exit(EXIT_FAILURE);
    }
    for (int i = 0; i < 5; i++) {
        myArr[i] = i + 1;
    }
    int *myArrExpanded = realloc(myArr, sizeOfInteger * 6);
    if (myArrExpanded == NULL) {
        fprintf(stderr, "There was an error reallocating memory for the array\n");
        free(myArr); // realloc failed, so the original block is still allocated
        exit(EXIT_FAILURE);
    }
    myArrExpanded[5] = 6;
    for (int i = 0; i < 6; i++) {
        printf("%d ", myArrExpanded[i]);
    }
    printf("\n");
    free(myArrExpanded);
    return 0;
}
Quite a lot more code! So what exactly is our good friend JavaScript abstracting away for us? To understand what’s happening here, we need to dive into a few concepts. Let’s start with how memory is laid out in C and JavaScript programs.
Memory Layout
Stack
When C and JavaScript programs are running, the memory they have access to is split up into a few areas. One of these areas is known as the call stack, which holds function invocations and variables in their scope.
You can think of the call stack like a stack of pancakes. And for more deliciousness, let’s say each pancake has syrup and chocolate chips on them. It’s easy to stack our flapjacks on top of one another, but it becomes more tricky when you want to remove, let’s say, the third pancake in the stack. Unless, of course, you lift up two pancakes at a time and use your other hand to grab the third one. But then you risk dropping all of those pancakes and spilling syrup everywhere!
This is why stacks follow the last in, first out method. Meaning, if I place 4 pancakes on the stack, in order to get to the second one, I’ll first need to remove the fourth and third.
When functions are called in your program, the local variables within the function and the return address of the calling function, collectively known as a stack frame, are pushed to the call stack. Once the function has completed, the stack frame is removed from the stack and any variables that were declared in that function are removed.
Here’s an in-depth example diagram of the call stack in action in JavaScript.
Generally, the size of each stack frame is worked out when the program is compiled, and the call stack as a whole has a limited size that is typically fixed before the program starts running. (While you can allocate more memory on the stack during run time using alloca in C, this is generally discouraged.) The main idea here is that if you need data to stick around between function calls and/or the amount of memory you need at run time is not known ahead of time, you need to rely on another, more persistent, area of memory.
Heap
In C, when we want data to persist between function calls, we can request memory from the operating system (OS). If the OS is able to give us the memory we requested, we’ll have a place for our data in a region of memory called the heap. (Despite the shared name, this region has nothing to do with the tree-like heap data structure.)
The variables that are allocated on the heap are not destroyed at the end of any function call like they are on the stack. Instead, these variables stick around until we free them up. In lower-level languages, we manually have to free these objects up when we’re done using them. But in higher-level languages like JavaScript, we have a built-in sanitation crew called the garbage collector come in to automatically throw these variables in the trash when our program is done using them.
Here’s a really fantastic, short video describing the differences between the heap and stack.
Note: There are other areas of C program memory. You can read about those here.
Let’s now go over some of the different types of arrays in C.
Arrays
We don’t really talk much about different types of arrays in JavaScript because JavaScript arrays can do it all! They are dynamic, meaning they can shrink, grow, and be of mixed types. They’re the omnivore of the array world. (This is also true of a lot of high-level languages.) Simple arrays in C are a different beast.
Here’s an example of a simple array of integers in C. Take a look at the top row. These numbers refer to the element's location in memory. We can actually make use of these memory locations using pointers, which we’ll touch on below.
Fixed-Length Arrays in C
Fixed-length arrays have their sizes determined at compile time. Their sizes cannot change at run time and must be known prior to compiling your program. So int arr[5] = {1,2,3,4,5}; will stay as a 5-element integer array for the entire duration of the program.
Variable-Length Arrays in C
Variable-length arrays were introduced in C99 (released in 1999, 27 years after C was created!). Their sizes can be chosen at run time. For example:
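#include <stdio.h>
int main(void) {
    int array_len;
    printf("Enter desired length of array: ");
    // The '&' operator in C means the "address of". We'll touch on this more below
    if (scanf("%d", &array_len) != 1 || array_len < 1) {
        return 1; // the input was not a usable length
    }
    int arr[array_len];
    return 0;
}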
This C code takes input from the user and sizes the array based on that input. The variable-length array's length doesn't need to be known prior to run time, but once the array has been created, its size cannot be changed. This is a bit closer to what we want to implement, but it’s still not going to allow us to add an element to the array.
Dynamic Arrays in C
As you may have guessed, dynamic arrays can be resized when the program is running. This is possible because the memory they use is allocated on the heap, which means we can request more memory from the OS at run time if needed. But it’s important to note that requesting memory from the OS is a costly operation because a number of different steps need to take place for this run time allocation. So each type of array has its tradeoffs and use cases.
Let’s now return to the C code example from the beginning:
#include <stdio.h>
#include <stdlib.h>
int main(void) {
    // Request memory from the OS
    // Hello OS! I'm requesting memory for 5 times the size of an integer
    size_t sizeOfInteger = sizeof(int);
    int *myArr = malloc(sizeOfInteger * 5);
    // Verify that the allocation was successful
    if (myArr == NULL) {
        fprintf(stderr, "There was an error allocating memory for the array\n");
        exit(EXIT_FAILURE);
    }
    // Initialize array values
    for (int i = 0; i < 5; i++) {
        myArr[i] = i + 1;
    }
    // Move the array values to a new, expanded memory location through realloc
    int *myArrExpanded = realloc(myArr, sizeOfInteger * 6);
    if (myArrExpanded == NULL) {
        fprintf(stderr, "There was an error reallocating memory for the array\n");
        free(myArr); // realloc failed, so the original block is still allocated
        exit(EXIT_FAILURE);
    }
    myArrExpanded[5] = 6;
    // Print the values
    for (int i = 0; i < 6; i++) {
        printf("%d ", myArrExpanded[i]);
    }
    printf("\n");
    // Free up the memory
    free(myArrExpanded);
    return 0;
}
I’ve left comments on each section of code I felt was important and wrote further explanations below.
Request Memory from the OS
The function call to malloc (memory allocation) is probably the most unsettling piece of this code, but it’s really not so bad! malloc takes one parameter: how much memory to request from the OS in bytes.
Different data types take up different sizes (for example, typically 4 bytes for an integer, and always 1 byte for a char). So we can make use of the sizeof operator to determine how many bytes we need for our data type, and multiply that by the length of the array we want. If we want a 5 element array, all we have to do is multiply 5 by the size of an integer to get space for a 5 integer array!
You may be thinking that malloc returns an array. It definitely seems like it given its usage in the for loops afterwards! But what it actually returns is a pointer. A pointer is simply a variable that stores a memory location.
Consider a simpler example:
Here we have an array of 5 integers, and each element has a specific address in memory. It’s like their home address. You can see here that the value of arr[0] is 1 since that’s what we assigned to the first element. The address where that 1 lives is stored in myPointer.
So in our original example, malloc returns a pointer to the first element of our array.
Why use pointers? And why would they reuse the asterisk when it already means multiplication? Good questions! Unfortunately, I can’t answer the latter, but there are a number of reasons to use pointers. For example, in JavaScript, when we pass objects to functions, we pass what's known as a reference to that object. C does not have this concept of a reference, so the only way for us to say “I want to pass this array of integers to this function and modify it without making a copy” is by using pointers.
Check That Allocation Worked
When we request memory from the OS, there is a chance the OS doesn’t have enough memory to spare. So we should always check to make sure the request was successful, which we do in the if statement here. If the allocation fails, we exit the program with an error code.
Move Array Values to a Larger Memory Location
The realloc function resizes memory that was previously allocated with malloc to the new size given in the second parameter. If the block can't be grown where it is, realloc allocates a new, larger block, copies the existing data over, and frees the old one. Either way, we should use the pointer it returns from then on. This is how we can achieve dynamic arrays and, more broadly, dynamically allocated memory.
Assign and Free Up Memory
We can then safely assign a value to our sixth array element, and finally free the reallocated memory to prevent memory leaks, now that we're no longer using it. Recall that the call stack memory is deallocated after the function finishes running, but that is not the case with the heap. In C, the heap’s objects can and will stick around until you manually deallocate them using free.
Comparison with JavaScript
With our new appreciation of how arrays are resized in C, let’s take another look at the JS code:
const myArr = [1,2,3,4,5];
myArr.push(6);
console.log(myArr);
//[1,2,3,4,5,6]
In that first line alone, the JavaScript engine requests memory for us, and we never have to state a fixed size. No need to manually allocate memory or check if the allocation was successful.
Once we push 6 into the array, we don’t need to worry about reallocating the memory that was previously allocated. Values still live on the stack and heap in JavaScript, but you as the programmer don’t need to worry about it. The language does it for you.
You can see how this process could get onerous in C when repeated over and over in a complex application. So thank goodness for JavaScript!
Conclusion
We’ve only skimmed the surface of each of these topics, but I think this knowledge gives us an appreciation for higher-level languages as a whole. Now, when you look at .push() you can appreciate just how much JS is doing for you. :)