Ideas for making more informative names
amu
Posted on March 2, 2024
Naming conventions never really caught my eye; I just accepted the mental load they always carry. For the past few months I've been thinking about them. There are systems and conventions per language and per team, but I never questioned them deeply.
Here are some ideas that I've been testing on a few projects, and they have been helping me. (Note that many of these ideas really apply to larger codebases and are probably not useful in a small one.)
1. PascalCase and camelCase are inconvenient for the brain to parse
Consider these two ways of naming the variable:
- AutomateAsync
- automate_async
I find the latter much cleaner for 2 reasons:
1) It might not seem like much, but the spacing created by the underscore helps with readability, and it adds up in a large file. If your codebase's naming convention is PascalCase, then when your brain wants to parse AutomateAsync it reads the name letter by letter and splits it by detecting capital letters. So it has to consider at least 26 possibilities for each letter of the word. With the underscore, the boundary is already marked for you; it feels like O(1). It's fundamentally less complex. Of course, if you are familiar with the code, you might have no problems, because the longer names read as a single unit to you. But if the codebase is huge and is developed by multiple people, or you wrote the code but haven't touched it for a few months, you need to focus more on the names when reading a currently unfamiliar part to understand how it works.
2) Moreover, a PascalCased name, by nature, slows down your brain's crazy cache-based skimming: while reading, your brain speeds up the word parsing process by focusing more on the begining and the end of the wrod. That's why you might not have picked up the tyops in the previous sentence. And the typo in the word "tyops" in the last sentence. In essence, your brain uses its word cache instead of repeatedly processing every single word from its RAM/disk. That is why you read faster as you read more books. When reading code, if your brain is aware that the names in the codebase are PascalCased, it can't rely on that optimization as much, because it knows it is going to miss the middle of the name, where the other words begin.
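As a rough analogy in code (not a model of how reading actually works), here is a minimal C++ sketch of splitting both kinds of names into words. The function names are made up for this illustration. Both loops still walk the whole string; the difference is the question asked at each character: "is this the separator?" versus "which case is this letter, and does that start a new word?"

```cpp
#include <cctype>
#include <string>
#include <vector>

// Splits a snake_case name: the separator alone marks every word boundary.
std::vector<std::string> split_snake_case(const std::string& name) {
    std::vector<std::string> words;
    std::string current;
    for (char c : name) {
        if (c == '_') {
            if (!current.empty()) words.push_back(current);
            current.clear();
        } else {
            current += c;
        }
    }
    if (!current.empty()) words.push_back(current);
    return words;
}

// Splits a PascalCase or camelCase name: every character has to be
// classified as upper or lower case before a boundary can be found.
std::vector<std::string> split_pascal_case(const std::string& name) {
    std::vector<std::string> words;
    std::string current;
    for (char c : name) {
        if (std::isupper(static_cast<unsigned char>(c)) && !current.empty()) {
            words.push_back(current);
            current.clear();
        }
        current += c;
    }
    if (!current.empty()) words.push_back(current);
    return words;
}
```

With these, split_snake_case("automate_async") gives "automate" and "async", and split_pascal_case("AutomateAsync") gives "Automate" and "Async". The case-based rule also gets ambiguous with acronyms: a hypothetical name like HTTPServer falls apart into "H", "T", "T", "P", "Server" under this simple splitter, while http_server has no such problem.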
2. Names can provide additional context by being longer
Names for files, classes, functions, etc. can save a lot of time if they provide more context on their own. Again, this is more helpful in a larger codebase and is overkill if the project is on a smaller scale.
My basic idea here is that using a separator and adding context to names is helpful in many different use cases.
Here are some examples in different contexts and sizes:
2.1. For Naming Files [+ Directory Trees]
Without needing any context on the project, consider renaming a file called "effects.hpp" to "effects_manager.hpp". And in the same codebase, rename the file "sound_played.hpp" to "sound_played_event.hpp". In this example, as you might have guessed, these files do something with sounds. The additional context in the names helps indicate the granularity of responsibilities in the C++ header files. E.g. I expect that if I open the event header, I will see fewer things happening than when I open the manager header.
Of course, this isn't the first and only resort for providing context on the file level. In fact, providing helpful context by having a good folder structure pays off a lot.
Here's an example file tree from a part of the Godot game engine source code.
servers/
|__> audio/
     |__> effects/
          |__> audio_effect_amplify.cpp
          |__> audio_effect_amplify.h
          |__> audio_effect_capture.cpp
          |__> audio_effect_capture.h
          |__> audio_effect_chorus.cpp
          |__> audio_effect_chorus.h
          |__> audio_effect_delay.cpp
          |__> audio_effect_delay.h
     |__> audio_driver_dummy.cpp
     |__> audio_driver_dummy.h
     |__> audio_effect.cpp
     |__> audio_effect.h
     |__> audio_filter_sw.cpp
     |__> audio_filter_sw.h
     |__> audio_rb_resampler.cpp
     |__> audio_rb_resampler.h
     |__> audio_stream.cpp
     |__> audio_stream.h
|__> camera/
     |__> camera_feed.cpp
     |__> camera_feed.h
You can see how the directory names provide context, so the path of a file provides context too. The file names also provide context on their own, which matters because when you work in a code editor it's not nice to always have to look at the path of each file.
Here's an absolute nightmare of an example file tree, from Valve's newly open-sourced audio engine, Steam Audio:
(Abbreviated list)
|__> src/
     |__> core/
          |__> binaural_effect.cpp
          |__> binaural_effect.h
          |__> math_functions.cpp
          |__> math_functions.h
          |__> matrix.h
          |__> memory_allocator.cpp
          |__> memory_allocator.h
          |__> mesh.cpp
          |__> mesh.fbs
          |__> mesh.h
          |__> profiler.cpp
          |__> profiler.h
          |__> propagation_medium.h
          |__> quaternion.h
          |__> reverb.fbs
          |__> reverb_effect.cpp
          |__> reverb_effect.h
          |__> scene.cpp
          |__> scene.fbs
          |__> scene.h
          |__> scene_factory.cpp
          |__> scene_factory.h
          |__> serialized_object.cpp
          |__> serialized_object.h
          |__> sh.cpp
          |__> sh.h
          |__> speaker_layout.cpp
          |__> speaker_layout.h
          |__> sphere.fbs
          |__> sphere.h
          |__> sse_float4.h
          |__> stack.h
          |__> thread_pool.cpp
          |__> thread_pool.h
          |__> triple_buffer.h
          |__> types.h
          |__> util.h
          |__> vector.fbs
          |__> vector.h
          |__> window_function.cpp
          |__> window_function.h
Looking at the files, I don't get why the reverb effect code sits next to the thread pool or a 3D model file for a sphere. What does sh.cpp even mean? I may find out that the code is phenomenally written when I open the files and read them. But the file naming and structure are just not good, in the sense that they are not welcoming to a new maintainer. Bear in mind that your future self can also feel like a new maintainer sometimes, even if you are the original author of the code.
This is generally a good question to ask: does it make sense when you put yourself in the shoes of a person who is reading your code for the first time?
2.2. For Naming Functions
Here are some examples:
1) For functions that I know are for debugging or are temporary, I like to put that in the name to make sure I don't use them more than they should be used. For example, sometimes a temp method exposes too much of the structure.
For example:
void calculate_mass__tmp();
void serialize_file__debug();
Sometimes I get so annoyed that I have to write a really dirty function that I add that fact to the name too. So that whenever I call that function I get reminded that it needs to be cleaned. Eventually.
void send_request__TOO_DIRTY();
3. Naming in Reverse can Help Readability
Let's start by mentioning some examples:
You have a class for your game called player. The name "player" by itself is too vague for me when there are a lot of classes hanging around (as I discussed in the previous ideas). So I can add some context to the name, for example "player_entity". This idea is about using "entity_player" instead. The reason is that when you have a bunch of entities, the list becomes easier to digest, like the following:
entity__player
entity__car
entity__ship
entity__bird
And maybe some components:
entity__player
entity__car
entity__ship
entity__bird
component__collider2d
component__collider3d
component__transform
component__rigidbody2d
component__rigidbody3d
Another example: you have a state machine and you have functions to handle each state:
enum states { PLAY, PAUSE, STOP };
void on_state__PLAY();
void on_state__PAUSE();
void on_state__STOP();
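To make the state machine example a bit more concrete, here is a minimal C++ sketch of how such handlers might be wired up. The dispatcher handle_state and the strings the handlers return are placeholders invented for this sketch, not part of any real project. One caveat: C++ formally reserves identifiers containing a double underscore for the implementation, so a stricter codebase might prefer a single underscore; the spelling here follows the convention used in this post.

```cpp
#include <string>

// Scoped enum so the state names don't leak into the surrounding scope.
enum class states { PLAY, PAUSE, STOP };

// One handler per state. The shared on_state__ prefix keeps them next to
// each other in sorted listings and autocomplete.
// The returned strings are placeholders for real work.
std::string on_state__PLAY()  { return "playing"; }
std::string on_state__PAUSE() { return "paused"; }
std::string on_state__STOP()  { return "stopped"; }

// Hypothetical dispatcher: the handler name can be read straight off
// the enum value it serves.
std::string handle_state(states s) {
    switch (s) {
        case states::PLAY:  return on_state__PLAY();
        case states::PAUSE: return on_state__PAUSE();
        case states::STOP:  return on_state__STOP();
    }
    return "unknown";  // not reached for valid enum values
}
```

Calling handle_state(states::PAUSE) ends up in on_state__PAUSE(). Because the enum value appears verbatim in the handler name, searching for "PAUSE" finds both the state and the code that handles it.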
For variables specifically, this usually doesn't seem that useful to me, unless there are many variables with similarly formatted names. I generally don't like it because, when I care about a variable, I'm most likely dealing with the code inside the function and I'm willing to spend energy to learn what it does. But when I'm calling a function, I may not care about what happens inside it; I just want to pass inputs and receive outputs. In other words, when I'm using a function the stakes are higher. When I'm writing something inside a function, the stakes are lower relative to calling.
As a general idea I think it is nice to be:
- Extra considerate about the interface [things that are exposed]
- Considerate enough about the non-interface [things that aren't exposed]
But here's an example for a bunch of variables, in a shop system of a game:
queries_max
transactions_max
users_count
queries_count
transactions_count
I can flip the context in the variable names and bring it to the front so that the variables are sorted by context. So users_count will be count_users (kinda like count_of_users with an omitted "of").
Now they are like this:
max_queries
max_transactions
count_users
count_queries
count_transactions
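One way to see the "sorted by context" effect is to sort both spellings alphabetically, the way an outline view or autocomplete list might. This is a small C++ sketch; sorted_names and the two list names are made up for the illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Returns the names in alphabetical order, roughly how a symbol outline
// or an autocomplete popup might list them.
std::vector<std::string> sorted_names(std::vector<std::string> names) {
    std::sort(names.begin(), names.end());
    return names;
}

// The shop-system variables with the context at the end of the name.
const std::vector<std::string> names_context_last = {
    "queries_max", "transactions_max",
    "users_count", "queries_count", "transactions_count",
};

// The same variables with the context flipped to the front.
const std::vector<std::string> names_context_first = {
    "max_queries", "max_transactions",
    "count_users", "count_queries", "count_transactions",
};
```

Sorting names_context_last gives queries_count, queries_max, transactions_count, transactions_max, users_count, which groups by subject. Sorting names_context_first gives count_queries, count_transactions, count_users, max_queries, max_transactions, which groups all the counts together and all the maxima together. Which grouping is more useful depends on what you scan for more often.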
These ideas will change, 100%
These were my recent ideas I wanted to talk about in this post. I am sure they will change for me. Maybe even as soon as next week.
However, the main concepts that I am sure about are these:
- The way I name things, at least, can be improved.
- Language naming conventions are opinions that can be broken if you want to.
Anyways, I hope this gave you some ideas. Maybe ideas that you count as really bad and things to avoid. Which is okay with me. That's why it's fun.