Garbage Collection - an introduction for everyone
Sendil Kumar
Posted on January 30, 2020
Programs, when they run, needs memory. They use memory to store themselves and during execution, they store values and data structures. But memory is not an indefinite resource. They need to manage the memory efficiently. To manage the memory, programs collect and remove the unused memory (a.k.a garbage). The process is called as garbage collection. Programming languages allow you to manage memory either manual or automatic.
The textbook definition of Garbage collection is as follows:
Garbage collection (GC), also known as automatic memory management, is the automatic recycling of dynamically allocated memory. Garbage collection is performed by a garbage collector which recycles memory that it can prove will never be used again. Systems and languages which use garbage collection can be described as garbage-collected.
Manual Memory Management
The manual memory management provides developers with the flexibility to optimally allocate and de-allocate memory. They provide high performance, low latency, low pause time, and high throughput if done correctly. Because developers precisely know when to need the memory. Since it is a manual process, it is an error-prone and redundant burden for developers.
If memory management is not properly done it leads to:
- Memory leak, when the program is not de-allocating dead objects
- Memory corruption, when the program tries to free the already freed memory.
- Dangling pointers, when the program tries to access memory that was freed.
What is pause time?
When garbage collection happens, the program pauses its execution and remove the dead objects. The pause time is directly proportional to the amount of garbage collection. Longer the pause time, more garbage is collected but lesser the performance. On the other hand, shorter the pause time, higher the performance but lesser the garbage collected.
What is Throughput?
Throughput is the percentage of time not spent in garbage collection over the time spent of overall execution. Ideally, languages aim for higher throughput. Higher the throughput, shorter the pauses.
What is Latency?
Latency is the responsiveness of an application. Garbage collection pauses affect the responsiveness of applications.
The efficiency of the garbage collection algorithm is measured with the above properties.
Automatic Memory Management
Automatic Memory Management is not as efficient as Manual Memory Management. Since it has to guess where the garbage is and when to collect them. But it addresses the error-prone manual process of managing memory. The automatic garbage collection ensures the memory is properly collected when it is no longer needed. When all the references are de-allocated, the object is completely removed.
Automatic Memory Management provides lower performance, higher latency, and lower throughput than compared with (efficiently done) manual memory management. But on the other hand, it eases the burden from the developers to manually manage memory with malloc
and dealloc
.
The automatic memory management does not have a clue on how the program is using the memory/objects. To de-allocate memory correctly, they scan the entire object tree from root and remove those objects that are not reachable or track the object references. Based on how they operate the garbage collection process is broadly classified into two types:
- Reference counting
- Tracing
Reference Counting
Reference counting is the most intuitive method of garbage collection. It is very simple. In reference counting an object is live if and only if the object has references greater than zero. To achieve that each object reference in the reference counting world has a counter attached to it.
Consider a GCObject
class. The class has an rc
(reference counter) to count its reference. The reference counting operates on object references during their initialisation and de-initialisation.
The pseudo structure of an object in reference counting world is as follows:
- Add a counter
rc
to the object
class GCObject(val name: String) {
var rc = 0
}
Wondering about Kotlin syntax, check out the below post
- Increase the counter, for every new instance that is created for the object.
class GCObject(val name: String) {
var rc = 0
fun addNewInstance() {
rc++
}
}
- For every de-referencing, decrease the counter. When the counter reaches zero reclaim the memory.
class GCObject(val name: String) {
var rc = 0
fun removeNewInstance() {
rc--;
if(rc == 0) {
reclaimSpace(this)
}
}
}
Advantages
- The memory is reclaimed immediately when the object reference count reaches zero. This makes reference counting more efficient than other methods of garbage collection.
- The memory management pressure is evenly distributed across the program execution.
Disadvantages
Reference counting needs to be atomic, this will add performance overhead for the program.
Reclaiming cyclic data structures is painful. It is difficult to detect cyclic references.
Reference counting has smaller pauses while reclaiming memory. But if the object reclaimed is large reference counting may still induce pauses.
Remove performance overhead
The reference counting algorithm is simple and intuitive
For every object allocation/de-allocation, the runtime has to increment/decrement the reference count. There can be multiple process/threads acting on initialising/de-initialising objects. The reference is updated concurrently. When an operation increments the reference count rc
it should know the exact value of the reference count. In other words, all the operation on the reference count must be atomic. A mismatch in the reference count leads to memory leak or corruption.
First acquire a lock before updating the reference count, then update the reference count. This atomic operation adds an overhead. It increases latency and decreases overall performance.
We fix this performance problem with a buffer. Instead of updating the reference count for every object creation, use a buffer to queue the reference count updates. Adding to the queue need not be atomic. This increases performance. Using a buffer eliminates concurrency issues.
Refer to the paper On-the-Fly Reference-Counting Garbage Collector by Levanoni, Petrank, et.al.,.
Fix Cyclic references
The other major problem with reference counting is finding cyclic references.
For example, when we have a cyclic reference like below:
data class Person(val name: String) {
var house: House
}
data class House(val name: String, val owner: Person)
The reference counter need not know when to reclaim the cyclic object. Does it have to garbage collect the Person object when the Person reference count goes to zero or wait until House reference count goes to zero.
This inability to detect cycles is one of the major drawbacks of reference counting algorithm.
The runtime must know about the cyclic references, to properly manage the memory. There are weirder and corner cases that the runtime should handle while dealing with the cyclic objects.
In Swift, we use owned
or weak
references to refer the cyclic references. The owned
reference refers that the dependencies are cyclic and the references live for the same duration or longer than its cyclic counterpart. The weak
reference refers that the dependencies are cyclic and the references live shorter than its cyclic counterpart. The runtime with this heuristic handles managing the memory.
class Person {
let name: String
var house: House?
}
class House {
let name: String
var owner: Person?
}
Refer more here.
Alternative approach
Based on the paper by David F. Bacon et al.
They provide a new concurrent cycle detection algorithm that traces cycles locally. They use Lins' Lazy cyclic Reference counting that reduces the asymptotic complexity from O(n^2)
to O(n)
.
Check out the following repositories to understand more about how pure reference counting works. Choose the language of your choice: Go
or Java
sendilkumarn / java-purerc
A Pure Reference Counting Garbage Collection in Java
Throughput:
Naive reference counting requires synchronised count changes. This increases garbage collection time. That, in turn, reduces the throughput. But using a buffer increase the throughput (i.e., less time spent on garbage collection).
Pause time:
Naive reference counting algorithm does not have pause time, the garbage collection process is distributed evenly throughout the program. The garbage collection kicks in when the reference count reaches zero. Using the modified reference counting algorithm to detect cycles, the program should stop the world, while detecting cycles.
Who uses it?
Swift, SmallTalk, LISP, Perl uses Reference counting for garbage collection. AssemblyScript introduced reference counting garbage collection into the world of WebAssembly.
Tracing Garbage Collection
Tracing Garbage Collector is one of the most popular garbage collection. It is extensively studied and implemented. In tracing, the Garbage collectors traverse through all live objects and remove those objects that do not have any reference.
i.e., tracing collectors work on the live objects and the reference collectors work on the dead objects.
Tracing garbage collector does not happen every time an object is initialised/de-initialised. This reduces the performance overhead. The garbage collection is triggered at a random interval or when required (like when the server has low requests or memory usage is very high).
The tracing garbage collection works as follows:
Mark
The garbage collector traverse through all the reachable objects in the heap. It marks the reachable objects as live.
Sweep
Except for the marked objects, Sweep the entire heap.
This is the mark and sweep
algorithm. This is the simplest tracing garbage collection algorithm.
Advantages
Reference counting adds a performance overhead, but mark and sweep algorithm does not add any overhead.
All live objects are traversed in the marking step. This takes care of the cyclic dependencies.
The tracing garbage collection is tunable. We can tune it for increasing either latency or throughput.
Disadvantages
Removing dead objects happen only in the sweeping phase. Until then dead objects live in the heap.
After sweeping, the memory allocation is fragmented.
The memory management pressure is not evenly distributed across the program execution. The garbage collection runs in periodic (configurable) intervals. The program execution is halted when garbage collection happens.
Mark - Sweep - Compact
The sweeping operation fragments the memory. Allocating new memory in a fragmented heap is difficult. It is always preferred to have contagious memory for allocation.
Why contagious memory check Flat Memory Model.
The Mark - Sweep - Compact works as follows:
Mark
The garbage collector traverse through all the reachable objects in the heap. It marks the reachable objects as live.
Sweep
Except for the marked objects, Sweep the entire heap.
Compact
Slide the memory and remove the fragmentation. The garbage collector will run through the entire live objects one more time during this phase.
Advantages
Similar to mark-sweep, mark-sweep-compact does not add any overhead on the program execution.
All live objects are traversed in the marking step. This takes care of the cyclic dependencies.
The compaction step increases the speed of allocating new memories.
Disadvantages
Removing dead objects happen only in the sweeping phase. Until then dead objects live in the heap.
The memory management pressure is not evenly distributed across the program execution. The garbage collection runs in periodic (configurable) intervals. The program execution is halted when garbage collection happens.
The compaction step takes a second pass on the live objects. The memory relocation operation is also expensive. This increases pause time and decreases throughput.
Mark - Copy
The major drawback of mark-sweep-compact is the compaction step.
Compaction does two steps:
- Trace and mark the fragmented memory
- Compact the memory Both the steps stop the program execution.
We avoid this compaction in the Mark-Copy. The available memory is first divided into two parts. Let us call them A and B.
The A is responsible for new memory allocation. The B acts as a backup.
Mark
The garbage collector traverse through all the reachable objects in the "A" part. It marks the reachable objects as live.
Copy
Move all the marked objects into the "B" segment. Unlike compaction, copying does not add any overhead.
Sweep
Clear the "A" segment. Now the roles of "A" and "B" are reversed. The "B" is responsible for new memory allocation. The "A" acts as a backup.
Advantages
- Much faster memory allocation and reduces the complexity due to compaction.
- Similar to mark-sweep, mark-compact does not add any overhead on the program execution.
- All live objects are traversed in the marking step. This takes care of the cyclic dependencies.
Disadvantages
Since the heap is split into two segments, the total available memory is less.
Repeated copying of objects between primary and secondary regions might lead to poor performance.
Throughput & Pause Time:
In tracing garbage collection, higher the throughput, more the pause time. Increase in pause time reduces the responsiveness of the application.
Interested to explore further
Java - Garbage Collection Basics
Unified theory of Garbage Collection
Discussions 🐦 Twitter // 💻 GitHub // ✍️ Blog
If you like this article, please leave a like or a comment. ❤️
Posted on January 30, 2020
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.