Balanced Trees - The Experience

This article isn't what you think it is. This is not a deep exploration of this amazing data structure, nor is it a beginner's guide explaining the properties or a walkthrough of the code. NO!
This article talks about a personal experience I had while navigating the Introduction to Data Structures and Algorithms class (CS010C) at the University of California, Riverside. I will talk about what a freshman like me, who wants to pursue software engineering, experiences as a young padawan in the field. I will also talk about the various steps I had to take, the tools I had to use, and the amount of hair I had to rip out to get a few special cases working (my code isn't even close to complete).

The Goliath - A threat to my being

Earlier in the course, I tackled various other, much simpler data structures like linked lists, binary search trees, stacks, etc. For anyone unfamiliar with these terms, trust me, you already know them. These were simple structures; yes, they were hard to code, but the logic was easily comprehendible. You could almost see them move in front of your eyes. For example, in the earlier stages of the course, we started out by writing a linked list. This was the code for the remove() function for the linked list:

void Playlist::RemoveSong() {
    cout << "REMOVE SONG" << endl;
    //checking if list is empty
    if (head == nullptr) {
        cout << "Your Playlist is Empty!" << endl;
        return;
    }

    string id;
    cout << "Enter song's unique ID:\n";

    //validating input
    if(!(cin >> id)) {
        runtime_error("Invalid Input Type -- run the program again");
    }



    //Keeping track of current node and previous node
    PlaylistNode* curr = head;
    PlaylistNode* prev = nullptr;

    //finding song with the same id as input
    while (curr != nullptr && curr->GetID() != id) {
        prev = curr;
        curr = curr->GetNext();
    }

    //case: no element has the given ID
    if (curr == nullptr) {
        cout << "Song with ID " << id << " not found in the playlist!" << endl;
        return;
    }

    //case: only one element and it matches the ID
    if (curr == head && curr == tail) {
        head = nullptr;
        tail = nullptr;
    }

    else{

        //case: the node with the given ID is the head
        if (curr == head) {
            head = head->GetNext();
        }

        //case: the node with the given ID is the tail
        else if (curr == tail) {
            tail = prev;
        }

        //case: the node is some value in the middle
        else {
            prev->SetNext(curr->GetNext());
        }
    }

    //output name of song with ID
    cout <<"\"" <<curr->GetSongName() <<"\""  << " removed.\n\n";

    //deleting song
    delete curr;
}

All this code condenses down to this movement here:

As you might have noticed, it is a simple movement. Something digestible, something comprehendible. Now let us look at the code for my B-Tree remove function:

//remove💀💀💀
void Tree::remove(const string& word) {

    // if the tree is empty
    if ( !root ) return;

    // search for the word
    Node* loc = search(word);
    if (loc == nullptr) return;

    //root
    if (loc->parent == nullptr) {
        //if the root has no children
        if (noChildren(loc)) {
            //if the root has one element
            if (loc->numData == 1) {
                delete loc;
                root = nullptr;
                return;
            }

            //if the root has two elements
            --loc->numData;

            //if small is the word
            if (loc->small == word) {
                loc->small = loc->large;
                loc->large = "";
                return;
            }

            //if large is the word
            else {
                loc->large = "";
                return;
            }
        }

        //if the root has children
        if (loc->numData == 1) {
            if (loc->left->numData == 1 && loc->middle->numData == 1) {
                swap(loc->middle->small, loc->small);
                remove(word);
                delete loc->middle;
                return;
            }
        }

    }

    //all leaf nodes
    if (noChildren(loc)) {

        //if node has two elements
        if (loc->numData == 2) {
            --loc->numData;
            if (word == loc->small) {
                loc->small = loc->large;
                loc->large = "";
                return;

            }
            loc->large = "";
            return;
        }

        //merging
        Node* _parent = loc->parent;
        loc->small = "";
        if (_parent->numData == 1 and _parent->parent == nullptr){
            if (_parent->left->numData == 1 && _parent->middle->numData == 1) {
                string smaller = midStr(_parent->small, _parent->left->small, _parent->middle->small);
                string larger = maxStr(_parent->small, _parent->left->small, _parent->middle->small);

                _parent->small = smaller;
                _parent->large = larger;

                delete _parent->left;
                delete _parent->middle;

                _parent->left = nullptr;
                _parent->middle = nullptr;

                _parent->numData++;

            }
        }
        return;

    }

    throw runtime_error("remove() :: node is an internal node and not the root");
}
 // Note: This is not even close to complete

For reference, this is what the graphical representation looks like:

When you first look at it, it almost seems as if the tree has a mind of its own. This is scary...This is complex!
Till now all we had been looking at were simple changes of pointer values, and then you get slapped with this monstrosity, which is almost used in all Database Management Systems. This brought my capabilities into question; it made me question if this field is for me all together.

The Painful Beginning - as always...

I was sitting in my dimly lit dorm room, trying to figure out where to start (while contemplating my life choices). I sat there for 5 minutes, then 10 minutes, then 30, then an hour, then 3 hours, and there, I had hardly written 10 concrete lines of code. I felt petrified as I realized that the deadline was in three days. I then turned to watch YouTube tutorials; I found nothing concrete except theory, diagrams, and animations. I was stuck.
As I sat there, staring at my Zybooks assignment, I heard the voice of my lovely professor, Pat Miller, come to my head - "Try the simple cases first...start by getting the first test case."
There, something clicked. Yes, I was still panicking, but there was a new hope.

The Shared Distress - You are not alone!

It was the next day. I kept reminding myself of the newly established ideology - target the test cases first. I started doing that. Turns out, if you don't obsess over a general solution - a master algorithm - and instead tackle each case carefully, you seem to see patterns. These similarities can be overlooked when we look at the problem as a whole.
I started entering a state of flow; there was no me, there were only my thoughts, my keyboard, and Neo Vim. After 2 hours, I was ready to tackle the insert function.
I remember getting texts from a bunch of my friends. Turns out, this was a shared distress. It isn't a good idea to celebrate a common misery, but it gave me courage - I am not alone!

Tools & Utilities - Work Smart!

This project was not only different for me because of the young exposure to the data structure, but also because it was the first time I took my time to do real debugging. This is how I used to debug before this assignment:

#define _DEBUG_
#include <iostream>

int main() {

#ifdef _DEBUG_
std::cout << "Debug Message" << std::endl;
#endif
return -1;
}

This was pretty straightforward - if DEBUG is defined, we will print out this statement in the terminal, else it will treat it as if it doesn't exist.
Yet, this time, things weren't that simple. I had to look at various pointers and values while the program ran!
To achieve this, I used a tool called GDB which was a command-line debugger. This was my first time using a debugger and it was the best decision of my life. I could now set breakpoints in my program and see exactly where things go wrong. I could also check variable values while the program runs - objective achieved!
Another CLI tool that helped me was Valgrind - this tool is magical. Let's say you write a member function and it works as expected. Yet, you would never even notice if there is a memory leak while the program is running. Valgrind helps you see that; it provides details regarding how much memory has been leaked, where it was created, and much more! This was a helpful tool as one of the parameters on the grading rubric was memory leaks. For me, it helped me get better grades.
Lastly, what made everything glued together was TMUX and Neo Vim. These two didn't directly contribute to the making of the B-Tree, but - oh boy - it made the development process lightning fast. TMUX is a multiplexer tool for the command line i.e. it helps you create multiple terminal sessions and windows. It also helps you create panes on the same page. This, along with Neo Vim - a terminal text editor - made debugging silky smooth.

Looking Back

While this B-Tree may have seemed like an insurmountable Goliath at first, breaking it down into smaller tasks and using the right tools turned this terrifying beast into a conquerable foe. This project wasn't just about the B-Tree; it was about learning to approach problems strategically, leverage the available resources, and finding comfort in the fact that even the most complex problems can be untangled with a little focus and perseverance. And who knows, maybe someday I'll even look back at B-Trees with a sense of accomplishment, rather than the cold sweat they induce now.

Blog

Balanced Trees - The Experience

Nidheesh Manishkumar Vakharia

The Goliath - A threat to my being

The Painful Beginning - as always...

The Shared Distress - You are not alone!

Tools & Utilities - Work Smart!

Looking Back

Join Our Newsletter. No Spam, Only the good stuff.

Related