Learning Zig : a Python dev's learning notes

taikedz

Tai Kedzierski

Posted on October 13, 2024

Cover photo (C) Tai Kedzierski

Scripting languages and System languages are two parts of a vast ecosystem. I wanted to cross to the other side to see what life is like there...

Having trawled through a lot of reference material, I found some common-use concepts still lacking for the beginner, so this aims to stand as a jumping-off point for anyone else coming from a Python/scripting background and starting out with systems languages in general, and Zig in particular.

Zig? ⚡

Zig is the ace new kid on the block and, whilst still in its early days, is proving that it has a lot to offer. The release of the bun JS/TS runtime launched its name to the fore and developer interest has increased rapidly over the last two years - it debuted on Stack Overflow's 2023 survey at an impressive initial 39.5% on the "admired" rating metric, and the following year rose to 73.8% - well ahead of C's 47.4%, and only just tailing Rust's 82.2%.

I myself work predominantly with Python, a garbage-collected scripting language with lots of QOL features, and have been dabbling with Go (both sitting at a tidy 67% rating for the "admired" metric). Both occupy the same space of wanting to facilitate rapid development, but whereas Python offers a rich language feature set for flexibility and readability, Go offers a much more stripped-back syntax with a focus on simplicity and efficiency.

I've coded in Go a little bit, and was considering basing my next efforts there, as it feels much more approachable for a script language specialist; but I felt it would be a much better horizon-widener to dive properly into a systems programming language. I tried to get on board with Rust, but felt its hype and promises ended up not delivering enough against the difficulty to work with it. The worst for me was the amount of code nesting its management entailed, especially managing error cases against option-matching.

Documentation and Learning Resources

There are a number of good learning resources for the basics - the order in which I stepped through was to first read the language reference. Following some advice I saw for Rust, I read through the documentation first whilst resisting the temptation to experiment too much.

I then started working through the Ziglings exercises one by one, and found them to be pretty good at gently easing the learner through basic concepts. After exercise 25 or so, I decided that perhaps I didn't need to go through the rest and chose instead to rewrite my Go project, Os-Release Query. I had written it in Go as a beginner in about two or three sessions, so I reckoned it was simple enough to proceed with. In Zig, it took me about two weeks, with several sessions of learning and stumbling.

As I contended with a few teething difficulties, I found myself frequently going to Zig.guide for simple examples and explanations of usage - specifically the ArrayList and Allocators pages, the former being key for doing general file reading tasks. My first mistake was to think about string arrays the same way as in Python or in Go (see "Arrays, Slices, Strings and ArrayLists" below).

Finally, sensing I was needing a bit more guidance, I found myself stepping through several excellent videos from Dude the Builder's (José Colón) Zig in Depth series. Having read through the standard documentation, and cross-referencing against the standard library documentation, Colón's guidance was the last missing piece for getting proper progress.

And of course, web searching and StackOverflow filled in specific gaps. Notably, I kept coming back to a string concatenation answer on SO.

Concepts

Arrays, Slices, Strings and ArrayLists

Unless you've worked in C before, you are in for a rude awakening when you discover that there is no native language construct for a "string" (beyond a double-quote notation for static literals). Most modern languages have them. All scripting languages have them in some form. Go and Rust have them. But C and Zig share the same principle: there is no string, only sequences of bytes. Whilst C has a "character" char type, Zig takes it a step further by only using an unsigned integer type, u8, to represent a byte.

// C
char hi[] = "hello";
// or even
char* hi = "hello";

// Zig
const hi: []const u8 = "hello";

On the Zig side, note the const in the slice type notation: most functions that receive a "string" as a parameter are best written on the assumption that the data could be a literal, which is hard-coded and immutable. If a function is defined as

fn myFunc(data: []u8) void { ... }

then the argument cannot be a string literal; the function instead requires that the data be mutable, that it can change. Most frequently that means a slice pointing into heap memory.

There are many subtleties to this, and I mainly worked against an item from Zig.news to keep myself sane.
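As a rough sketch of the const distinction, assuming Zig 0.13 (the function names countVowels and shout are made up for illustration): a read-only function takes []const u8 and so accepts literals, while a function that writes in place needs []u8 and a mutable buffer.

```zig
const std = @import("std");

/// Only reads the bytes, so it accepts literals and mutable buffers alike.
fn countVowels(data: []const u8) usize {
    var count: usize = 0;
    for (data) |byte| {
        switch (byte) {
            'a', 'e', 'i', 'o', 'u' => count += 1,
            else => {},
        }
    }
    return count;
}

/// Writes into the bytes in place, so it needs a mutable slice.
fn shout(data: []u8) void {
    for (data) |*byte| {
        byte.* = std.ascii.toUpper(byte.*);
    }
}

pub fn main() void {
    const literal: []const u8 = "hello";
    std.debug.print("{d}\n", .{countVowels(literal)});

    // A local array is mutable, so a pointer to it coerces to []u8.
    var buffer = [_]u8{ 'h', 'e', 'l', 'l', 'o' };
    shout(&buffer);
    const view: []const u8 = &buffer;
    std.debug.print("{s}\n", .{view});

    // shout(literal); // would not compile: the literal's bytes are const
}
```

Passing the mutable buffer to countVowels would also work, since []u8 coerces to []const u8 but not the other way round.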

I made the early mistake of trying to load the lines of an ASCII file into an array of arrays of bytes, whereas what I actually needed was an ArrayList (a dynamically growable and iterable sequence of items) of slices, which are 'views' (a pointer and a length) onto arrays (fixed-length sequences of a given type).
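The ArrayList-of-slices idea might look something like the following, assuming Zig 0.13 and its std.ArrayList API. The helper collectLines and the sample os-release text are invented for illustration, not taken from the project.

```zig
const std = @import("std");

/// Splits `text` on newlines and collects each non-empty line as a slice.
/// The slices are views into `text`; no line is copied.
fn collectLines(allocator: std.mem.Allocator, text: []const u8) !std.ArrayList([]const u8) {
    var lines = std.ArrayList([]const u8).init(allocator);
    errdefer lines.deinit();

    var it = std.mem.splitScalar(u8, text, '\n');
    while (it.next()) |line| {
        if (line.len == 0) continue; // skip blank lines
        try lines.append(line);
    }
    return lines;
}

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();

    // Sample content only; a real program would read this from a file.
    const text = "NAME=\"Ubuntu\"\nID=ubuntu\n\nVERSION_ID=\"24.04\"\n";

    const lines = try collectLines(gpa.allocator(), text);
    defer lines.deinit();

    for (lines.items, 0..) |line, i| {
        std.debug.print("{d}: {s}\n", .{ i, line });
    }
}
```

The list owns only its own backing storage, so the text being viewed has to stay alive for as long as the slices are in use.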

Stack and Heap

Stack vs heap is a key concept in any language where you are made to manage memory manually; there's no way around it really. Coming in from GC languages, I had to be disabused of some simplistic notions.

Understanding the stack and the heap was a bit more than "heap is big and persistent/stack is small and throwaway". It also goes hand-in-hand with reasoning on passing by value or reference, which in turn is more than "did you pass through a native literal type or a pointer." Geeks for Geeks goes over the principle from a mainly C/C++ point of reference, whereas OpenMyMind.net walks through it specifically with Zig examples.

I am not in a position to explain its nuances myself as I am still learning, but the shift in understanding was crucial for comprehending what I was doing in my project. I can only recommend those two sites as mandatory reading before proceeding further.
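For a first feel of the difference, here is a small sketch assuming Zig 0.13; the names greetInto and greetAlloc are hypothetical. One function writes into a buffer that lives in the caller's stack frame, the other asks an allocator for heap memory that the caller must later free. It does not replace the reading above.

```zig
const std = @import("std");

/// Stack-backed: writes into a buffer the caller owns, allocating nothing.
/// The returned slice is only valid while the caller's buffer is alive.
fn greetInto(buffer: []u8, name: []const u8) ![]u8 {
    return std.fmt.bufPrint(buffer, "hello, {s}", .{name});
}

/// Heap-backed: the result outlives this call, and the caller must free it.
fn greetAlloc(allocator: std.mem.Allocator, name: []const u8) ![]u8 {
    return std.fmt.allocPrint(allocator, "hello, {s}", .{name});
}

pub fn main() !void {
    // This array lives in main's stack frame and is gone when main returns.
    var buffer: [64]u8 = undefined;
    const on_stack = try greetInto(&buffer, "stack");
    std.debug.print("{s}\n", .{on_stack});

    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    // This memory stays valid until it is explicitly freed.
    const on_heap = try greetAlloc(allocator, "heap");
    defer allocator.free(on_heap);
    std.debug.print("{s}\n", .{on_heap});
}
```

Returning a slice of a function's own local array would be the classic mistake here, since that array stops being valid once the function returns.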

Allocators

One of the key features of Zig is its use of allocators, which is one of the major improvements in memory management assistance we get from the compiler and its debug-mode builds. It is still possible to double-free and leak memory from improperly-written code - but using the debug build (the default build mode) causes exceedingly useful stack traces to show the origins of leaked/non-freed memory and give you an early chance of catching and remediating memory problems.

The General Purpose Allocator can be the go-to reference for pretty much anything if you just want to get started. Both Zig Guide and Dude the Builder give an excellent run down on how to use them - once you see them in use a couple of times, they usually feel mostly plug-and-play.
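A minimal setup sketch, assuming Zig 0.13 where the type is named std.heap.GeneralPurposeAllocator; the helper squares is made up for the example. The deinit() call returns a status that says whether anything was left unfreed.

```zig
const std = @import("std");

/// Allocates `n` slots and fills them with 0, 1, 4, 9, ...
/// The caller owns the returned slice and must free it.
fn squares(allocator: std.mem.Allocator, n: usize) ![]u32 {
    const out = try allocator.alloc(u32, n);
    for (out, 0..) |*slot, i| slot.* = @intCast(i * i);
    return out;
}

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    // Runs last: reports whether any allocation was never freed.
    defer {
        if (gpa.deinit() == .leak) std.debug.print("leak detected\n", .{});
    }
    const allocator = gpa.allocator();

    const numbers = try squares(allocator, 4);
    // Deleting this line should make the debug build report the leak.
    defer allocator.free(numbers);

    std.debug.print("{any}\n", .{numbers});
}
```

Because defers run in reverse order, the free happens before the leak check.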

Zig makes a promise - and sets a standard/best-practice - of no hidden allocations: if a function does not take an allocator instance, or a type is not initialized with one, then it promises no allocation to happen. Likewise, library creators are encouraged to abide by the same principle. This goes a long way to ensuring no nasty memory leak surprises deep in a project's dependencies.
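In practice the convention shows up in function signatures. The sketch below assumes Zig 0.13, and joinPath and isAbsolute are hypothetical helpers: the first takes an allocator, so the caller knows it may allocate and that the result needs freeing; the second takes none, so by convention it cannot allocate.

```zig
const std = @import("std");

/// Takes an allocator: the signature itself says "this may allocate".
/// The caller owns the returned slice and must free it.
fn joinPath(allocator: std.mem.Allocator, dir: []const u8, file: []const u8) ![]u8 {
    return std.mem.concat(allocator, u8, &[_][]const u8{ dir, "/", file });
}

/// No allocator parameter: by convention, no allocation happens here.
fn isAbsolute(path: []const u8) bool {
    return path.len > 0 and path[0] == '/';
}

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    const path = try joinPath(allocator, "/etc", "os-release");
    defer allocator.free(path);

    std.debug.print("{s} absolute? {}\n", .{ path, isAbsolute(path) });
}
```

String concatenation is a good example of why this matters to a Python developer: joining two strings needs new memory, so in Zig it needs an allocator.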

Structs and Methods

Unlike in C, structs can come with their own dedicated functions, or methods, declared inside the struct itself and firmly attached to the type. (Go has methods too, but declares them outside the type, via receivers.) Like in Python, if the first parameter is a self reference, the function can be called on an instance; without it, it can only be called on the struct type itself.

This gives the flexibility of type-specific operations carried through an instance.method() notation, but avoids the pitfall of object-oriented inheritance deluges by simply not permitting inheritance.

I find this notationally and conceptually convenient, and a good introduction to the idea of "composition over inheritance" that is becoming more and more popular. Stuff that, JavaDeepInheritanceSingletonFactoryInstantiatorHelper!
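A small sketch of both kinds of function, assuming Zig 0.13; the Counter type is invented for the example.

```zig
const std = @import("std");

const Counter = struct {
    count: u32 = 0,

    /// No self parameter: called on the type, like a Python @staticmethod.
    pub fn startingAt(value: u32) Counter {
        return Counter{ .count = value };
    }

    /// Pointer self: called on an instance, and allowed to modify it.
    pub fn increment(self: *Counter) void {
        self.count += 1;
    }

    /// Value self: called on an instance, read-only.
    pub fn isZero(self: Counter) bool {
        return self.count == 0;
    }
};

pub fn main() void {
    var c = Counter.startingAt(5); // called on the struct type
    c.increment(); // called on the instance
    std.debug.print("{d} {}\n", .{ c.count, c.isZero() });
}
```

Unlike Python, the self parameter's type decides whether the method may mutate the instance: *Counter can, Counter cannot.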

Build System

The build system in Zig is interesting. I cannot claim to understand much of it at all, except to say that I am glad I don't have to contend with Makefiles. Whether it is in fact better remains to be seen (for me), but I am going to keep exploring.

One thing I do enjoy very much however is the build.zig.zon file, and how it manages dependencies.

.{
  .dependencies = .{
    .zg = .{
      .url = "https://codeberg.org/dude_the_builder/zg/archive/v0.13.2.tar.gz",
      .hash = "122055beff332830a391e9895c044d33b15ea21063779557024b46169fb1984c6e40",
    },

    .module2 = .{
      .url = "https://site/path/item2.tar.gz",
      .hash = "1210abcdef1234567891",
    },
  },
}

Using other people's libs

There is no central package manager in Zig - simply point at tarball URLs online and fetch them. ZigList.org tracks libraries on GitHub and Codeberg (primary host of Ziglings and a preferred space for some Zig creators), and if you do publish one on either platform, it should eventually get picked up.

The hash however is not a hash of the tarball, but a cumulative hash of each exported item from the package.

Either way, there is an interesting aspect of the system: after performing zig build, any dependencies are downloaded and unpacked to ~/.cache/zig/p/<hash>/... . There are two aspects that I came up against in playing with this:

  • You cannot easily calculate the hash in advance yourself
  • If a hash exists already, then the tarball is not re-downloaded.

So when I wanted to add my second module, I needed a placeholder hash. I blindly re-used the hash from the prior module and could not figure out why it a/ didn't prompt me for hash re-validation on my bogus hash, and b/ didn't find the files from the second module. Since that hash already existed in the cache, the first module's unpacked files were simply being re-used.

Eventually, after trawling the net, I found two resources from which I figured out that I should simply copy an existing hash and then mangle the last few bytes of it. The standard documentation is admitted by the core team to not be fully there yet - which I understand, since the dust hasn't completely settled on the final forms of many things and a lot is still moving about - so other people's blogs are a godsend...
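As an illustration of that trick, assuming Zig 0.13's build.zig.zon format: the fragment below declares the second module with a deliberately bogus hash, made by copying the zg hash and overwriting its last four characters. The project name my-app is hypothetical, and the URL is the same placeholder as above.

```zig
.{
    .name = "my-app",
    .version = "0.1.0",
    .dependencies = .{
        .module2 = .{
            .url = "https://site/path/item2.tar.gz",
            // Placeholder only: the zg hash with its last four characters
            // mangled, so that it matches nothing already in the cache.
            .hash = "122055beff332830a391e9895c044d33b15ea21063779557024b46169fb1984c0000",
        },
    },
    .paths = .{
        "build.zig",
        "build.zig.zon",
        "src",
    },
}
```

With a hash that matches no cache entry, zig build should download the tarball and fail with a hash mismatch that reports the hash it actually computed, which can then be pasted in place of the placeholder.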

Using a library involves adding some code in your build.zig file:

    // Find the dependency named "zg" as declared in build.zig.zon
    const dep_zg = b.dependency("zg", .{});

    // Get the `CaseData` module from the `zg` dependency
    //   and register as importable module "zg_CaseData"
    // In project code, use @import("zg_CaseData")
    exe.root_module.addImport("zg_CaseData", dep_zg.module("CaseData"));

Usually the README of the library will give you more detailed instructions and, hopefully, the hash to use as well. The above was an example for using Dude the Builder's zg library for UTF-8 string handling.

Creating your own lib

One item I created for my project was an ASCII file reader and, knowing that I would likely want to re-use that code, I decided to try to make my own library.

The most minimal snippet for creating a library in Zig 0.13.0 consists of a source directory, a build.zig.zon to declare relevant items, and a build.zig for executing the build when it is imported:

The build.zig.zon file:

.{
    .name = "my-lib",
    .version = "0.1.0",
    .paths = .{
        "build.zig",
        "build.zig.zon",
        "src",
    },
}

The build.zig file:

const std = @import("std");

pub fn build(b: *std.Build) void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{});

    _ = b.addModule("my-lib", .{
        .root_source_file = b.path("src/main-file.zig"),
        .target = target,
        .optimize = optimize,
    });
}

That's it. Then follow the steps above to import it into another project.
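On the consuming side, a build.zig might look roughly like this, assuming Zig 0.13. The executable name my-app and the path src/main.zig are hypothetical. Because the library name contains a hyphen, the consumer's build.zig.zon would need to declare the dependency key as .@"my-lib".

```zig
const std = @import("std");

pub fn build(b: *std.Build) void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{});

    // Hypothetical consumer executable.
    const exe = b.addExecutable(.{
        .name = "my-app",
        .root_source_file = b.path("src/main.zig"),
        .target = target,
        .optimize = optimize,
    });

    // Find the dependency declared as .@"my-lib" in build.zig.zon,
    // passing along the same target and optimize settings.
    const dep_mylib = b.dependency("my-lib", .{
        .target = target,
        .optimize = optimize,
    });

    // Expose the module that the library registered with b.addModule("my-lib", ...).
    // In project code, use @import("my-lib")
    exe.root_module.addImport("my-lib", dep_mylib.module("my-lib"));

    b.installArtifact(exe);
}
```

The name passed to dep_mylib.module() has to match the one the library gave to b.addModule(), while the first argument of addImport() is whatever name the consuming code wants to use in @import.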

Final thoughts

It's only the beginning for me with Zig and so far, it has been rewardingly challenging. Going on this little escapade has taught me much about how to design for memory efficiency (I have not achieved it - but I know how my Python-oriented code design would not fly in a performance-oriented space).

And that's growth. 🌱
