Golang - Garbage Collection in General
Satyajiijt Roy
Posted on August 17, 2022
As we all know, Go (Golang) is a garbage-collected language, like Java, Python, C#, and others. More precisely, Go is a statically typed, garbage-collected language.
What is Garbage Collection and Why it is needed
So many articles have been written about this subject, so I am going to keep this short and try to get some deeper insight into the concept.
In SML programs (and in most other programming languages), it is possible to create garbage: allocated space that is no longer usable by the program.
When software is executed on your computer, there are two important memory regions that it uses: the stack and the heap.
The example code has a RandomBox type, and GenerateRandomBox() is a function that returns a RandomBox. The reference is kept on the stack and the struct data is on the heap. Go uses escape analysis to determine that.
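A minimal sketch of what such a type and function might look like. The field names, the random ranges, and the `//go:noinline` directive are assumptions made for illustration, not the original listing.

```go
package main

import (
	"fmt"
	"math/rand"
)

// RandomBox is a hypothetical version of the type discussed above;
// the field names are assumptions.
type RandomBox struct {
	Width  int
	Height int
}

// GenerateRandomBox returns a pointer to a RandomBox. The pointer outlives
// this function's stack frame, so escape analysis places the struct on the heap.
// The noinline directive keeps the call from being inlined, which could
// otherwise let the compiler keep the value on the caller's stack.
//
//go:noinline
func GenerateRandomBox() *RandomBox {
	box := RandomBox{
		Width:  rand.Intn(100) + 1,
		Height: rand.Intn(100) + 1,
	}
	return &box
}

func main() {
	b := GenerateRandomBox() // b is the reference held on main's stack
	fmt.Println(b.Width > 0, b.Height > 0)
}
```

Building with `go build -gcflags=-m` prints the compiler's escape analysis decisions, which is a quick way to check where a value ends up.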
When the reference goes away, the struct data left on the heap becomes garbage. This is how garbage is created, and it needs to be cleaned up from time to time.
In the simplest strategy, when the program's memory footprint reaches a certain threshold, the whole application is suspended, the garbage collector scans all allocated objects and reclaims the ones that are no longer used, and once this process ends the user program continues. Go also used this strategy to implement garbage collection in its early days, but today's implementation is much more complicated.
Garbage collection can be initiated manually or automatically, depending on the programming language you are using. Every language runtime, whether the program is compiled or interpreted, uses a specific algorithm to perform garbage collection. Garbage collection in a compiled language works the same way as in an interpreted language.
Go uses a tracing garbage collector, even though its code is compiled to machine code ahead of time. Specifically, Go uses a concurrent mark-and-sweep garbage collection algorithm.
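To make the idea of tracing concrete, here is a toy mark-and-sweep pass over a hand-built object graph. It is a sequential sketch for illustration only: the `object` type and the `mark`, `sweep`, and `collect` functions are hypothetical names, and Go's real collector is concurrent and far more involved.

```go
package main

import "fmt"

// object is a hypothetical heap object that may reference other objects.
type object struct {
	name   string
	refs   []*object
	marked bool
}

// mark flags obj and everything reachable from it.
func mark(obj *object) {
	if obj == nil || obj.marked {
		return
	}
	obj.marked = true
	for _, r := range obj.refs {
		mark(r)
	}
}

// sweep keeps marked objects (clearing the mark for the next cycle)
// and reports the names of the unmarked ones as reclaimed.
func sweep(heap []*object) (live []*object, freed []string) {
	for _, o := range heap {
		if o.marked {
			o.marked = false
			live = append(live, o)
		} else {
			freed = append(freed, o.name)
		}
	}
	return live, freed
}

// collect marks from the roots, then sweeps the whole heap.
func collect(roots, heap []*object) ([]*object, []string) {
	for _, r := range roots {
		mark(r)
	}
	return sweep(heap)
}

func main() {
	a := &object{name: "a"}
	b := &object{name: "b"}
	c := &object{name: "c"}
	d := &object{name: "d"}
	a.refs = []*object{b}
	c.refs = []*object{d} // c and d point at each other,
	d.refs = []*object{c} // but no root reaches them

	live, freed := collect([]*object{a}, []*object{a, b, c, d})
	fmt.Println(len(live), freed) // prints: 2 [c d]
}
```

Because reachability is traced from the roots, the c/d cycle is reclaimed even though each of the two objects is still referenced by the other.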
What happens in Garbage Collection Process
Go's garbage collector is called concurrent because it can safely run in parallel with the main program. When the runtime decides, based on some conditions (discussed below), that it is time to run a garbage collection, these are the phases it follows:
Mark Setup
- The runtime stops everything, which is literally called the stop-the-world (STW) step. Nothing gets done in your application during this time.
- The first thing it does is turn on the write barrier. The write barrier does not block writes to memory; it lets the collector keep track of pointer writes that goroutines make while it is running, so that your application does not lose any data. Every goroutine has to be stopped in order to turn it on, which is why this step needs the STW.
- Once the STW is done and the write barrier is on, the collector moves on to the next phase.
Marking
In this phase the following happens:
- Inspect the stacks to find the root pointers into the heap
- Traverse the heap from those roots and see which values are still in use
The collector also runs on goroutines, just like our code, and it takes 25% of the available CPU capacity for itself. So on a machine with 4 threads, for example, 1 thread will be dedicated to the collector.
Now, the collector may find that it could run out of memory before it finishes this task, because some goroutine is allocating faster than the collector can mark. In that case it picks that goroutine and asks it to help with marking. This process is called Mark Assist.
Mark Termination
Here the collector performs an STW again, turns the write barrier off, performs some clean-up, and calculates when the next garbage collection should run.
Note: The goal is to keep each STW pause short, on the order of 100 microseconds or less for every collection it needs to perform.
Once the Mark Termination phase is complete (the STW is over and the write barrier is off), the application starts working again with all the OS threads available.
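The STW pauses described above can be observed from inside a program. The sketch below forces a few collections and reads the recorded pause durations through `debug.ReadGCStats`; the `recentPauses` helper is a hypothetical name, and the numbers it reports depend entirely on the machine and the workload.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
	"time"
)

// recentPauses forces n collections and returns the pause durations
// recorded for the n most recent cycles (most recent first).
func recentPauses(n int) []time.Duration {
	for i := 0; i < n; i++ {
		_ = make([]byte, 1<<20) // create a little garbage for each cycle
		runtime.GC()            // blocks until the collection completes
	}
	var stats debug.GCStats
	debug.ReadGCStats(&stats)
	if len(stats.Pause) > n {
		return stats.Pause[:n]
	}
	return stats.Pause
}

func main() {
	for i, p := range recentPauses(3) {
		fmt.Printf("cycle %d: total STW pause %v\n", i, p)
	}
}
```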
This is what happens every time a collection runs. You may object that the collector did the marking and identified the dangling values, but it did not clean them up. That part is called Sweeping.
Myth Buster: Sweeping is not part of the garbage collection cycle itself; it happens outside of the collection process.
Sweeping
This is the process where we reclaim the memory allocations that were not marked. We need to get them back; that is the whole purpose of garbage collection.
- Reclaiming the unused locations happens when a new allocation happens. So, technically, the latency of sweeping is not added to the garbage collection.
Garbage Collection Triggers
- The first metric the garbage collector watches is the growth of the heap. By default, it runs when the heap doubles in size.
- The second metric the garbage collector watches is the delay between two garbage collections. If none has been triggered for more than two minutes, one cycle is forced.
Memory allocation can also trigger garbage collection. This happens in the runtime.mallocgc function, which at run time divides objects on the heap into micro objects, small objects, and large objects by size. Creating any of these three kinds of object can trigger a new garbage collection cycle. A cycle can also be triggered manually by calling runtime.GC().
GC Collector knobs
Unlike Java, it was decided that a Go developer shouldn't have to tune the garbage collector whenever they move to different hardware. So only one tuning knob is provided: the debug.SetGCPercent function, or equivalently the GOGC environment variable. The default for GOGC is 100 (that is, 100%).
GC Pacer
A pacer algorithm tries to figure out when to start a new collection. As we have seen, this calculation happens during the Mark Termination phase of a garbage collection. If the pacer finds that it is advantageous to start a collection even before the trigger condition is met, it will do that. So the takeaway is that a collection can start earlier than the calculated point if the pacer thinks that is beneficial.
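The pacer's internal decisions are not exposed directly, but the heap goal it works toward is visible through `runtime.MemStats.NextGC`. The sketch below reads it right after a forced cycle; `heapGoal` is a hypothetical helper name, and the values printed will differ from run to run.

```go
package main

import (
	"fmt"
	"runtime"
)

// heapGoal forces one collection and returns the heap currently in use
// together with the target heap size for the next cycle.
func heapGoal() (heapAlloc, nextGC uint64) {
	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.HeapAlloc, m.NextGC
}

func main() {
	alloc, goal := heapGoal()
	fmt.Printf("heap in use: %d bytes, next GC goal: %d bytes\n", alloc, goal)
}
```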
I hope this blog was able to provide a little more in-depth insight into the garbage collection process and how it is implemented in Go.
Happy Learning!!