Concurrency Patterns in Go: sync.WaitGroup
Jon Calhoun
Posted on January 11, 2021
This article was originally an email from my Go newsletter. If you are interested in learning a little bit about Go every week please consider subscribing.
When working with concurrent code, the first thing most people notice is that the rest of their code doesn't wait for the concurrent code to finish before moving on. For instance, imagine that we wanted to send a message to a few services before shutting down, and we started with the following code:
package main
import (
"fmt"
"math/rand"
"time"
)
func notify(services ...string) {
for _, service := range services {
go func(s string) {
fmt.Printf("Starting to notifing %s...\n", s)
time.Sleep(time.Duration(rand.Intn(3)) * time.Second)
fmt.Printf("Finished notifying %s...\n", s)
}(service)
}
fmt.Println("All services notified!")
}
func main() {
notify("Service-1", "Service-2", "Service-3")
// Running this outputs "All services notified!" but we
// won't see any of the services outputting their finished messages!
}
If we were to run this code with some sleeps in place to similuate latency, we would see an "All services notified!" message output, but none of the "Finished notifying ..." messages would ever print out, suggesting that our applicaiton doesn't wait for these messages to send before shutting down. That is going to be an issue!
One way to solve this problem is to use a sync.WaitGroup. This is a type provided by the standard library that makes it easy to say, "I have N tasks that I need to run concurrently, wait for them to complete, and then resume my code."
To use a sync.WaitGroup
we do roughly four things:
- Declare the
sync.WaitGroup
- Add to the WaitGroup queue
- Tell our code to wait on the WaitGroup queue to reach zero before proceeding
- Inside each goroutine, mark items in the queue as done
The code below shows this, and we will discuss the code after you give it a read.
func notify(services ...string) {
var wg sync.WaitGroup
for _, service := range services {
wg.Add(1)
go func(s string) {
fmt.Printf("Starting to notifing %s...\n", s)
time.Sleep(time.Duration(rand.Intn(3)) * time.Second)
fmt.Printf("Finished notifying %s...\n", s)
wg.Done()
}(service)
}
wg.Wait()
fmt.Println("All services notified!")
}
At the very start of our code we achieve (1) by declaring the sync.WaitGroup. We do this before calling any goroutines so it is available for each goroutine.
Next we need to add items to the WaitGroup queue. We do this by calling Add(n), where n
is the number of items we want to add to the queue. This means we can call Add(5)
once if we know we want to wait on five tasks, or in this case we opt to call Add(1)
for each iteration of the loop. Both approaches work fine, and the code above could easily be replaced with something like:
wg.Add(len(services))
for _, service := range services {
go func(s string) {
fmt.Printf("Starting to notifing %s...\n", s)
time.Sleep(time.Duration(rand.Intn(3)) * time.Second)
fmt.Printf("Finished notifying %s...\n", s)
wg.Done()
}(service)
}
In either case, I recommend calling Add() outside of the concurrent code to ensure it runs immediately. If we were to instead place this inside of the goroutine it is possible that the program will get to the wg.Wait() line before the goroutine has run, in which case wg.Wait() won't have anything to wait on and we will be in the same position we were in before. This is shown on the Go Playground here: https://play.golang.org/p/Yl4f_5We6s7
We need to also mark items in our WaitGroup queue as complete. To do this we call Done(), which unlike Add()
, does not take an argument and needs to be called for as many items are in the WaitGroup queue. Because this is dependent on the code running in a goroutine, the call to Done()
should be run inside of the goroutine we want to wait on. If we were to instead call Done()
inside the for loop but NOT inside the goroutine it would mark every task as complete before the goroutine actually runs. This is shown on the Go Playground here: https://play.golang.org/p/i2vu2vGjYgB
Finally, we need to wait on all the items queued up in the WaitGroup to finish. We do this by calling Wait(), which causes our program to wait there until the WaitGroup's queue is cleared.
It is worth noting that this pattern works best when we don't care about gathering any results from the goroutines. If we find ourselves in a situation where we need data returned from each goroutine, it will likely be easier to use channels to communicate that information. For instance, the following code is very similar to the wait group example, but it uses a channel to receive a message from each goroutine after it completes.
func notify(services ...string) {
res := make(chan string)
count := 0
for _, service := range services {
count++
go func(s string) {
fmt.Printf("Starting to notifing %s...\n", s)
time.Sleep(time.Duration(rand.Intn(3)) * time.Second)
res <- fmt.Sprintf("Finished %s", s)
}(service)
}
for i := 0; i < count; i++ {
fmt.Println(<-res)
}
fmt.Println("All services notified!")
}
Why don't we just use channels all the time? Based on this last example, we can do everything a sync.WaitGroup does with a channel, so why the new type?
The short answer is that sync.WaitGroup
is a bit clearer when we don't care about the data being returned from the goroutine. It signifies to other developers that we just want to wait for a group of goroutines to complete, whereas a channel communicates that we are interested in results from those goroutines.
That's it for this one. In future articles I'll present a few more concurrency patterns, and we can continue to expand upon the subject. If you happen to have a specific use case and want to share it (or request I write about it), just send me an email.
Posted on January 11, 2021
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.