Speeding Up Go Concurrency: Unlocking the Power of Array Segmentation

Hey there, fellow Gophers! 🌟

If you’ve been playing around with Go, you know that goroutines are the bread and butter of concurrency. But here’s the thing: when multiple goroutines try to access the same array, things can get messy real quick. You end up having to use mutex locks to avoid race conditions, and that can slow things down big time. But don’t worry, I’ve got a cool trick up my sleeve – segmenting the array to let goroutines work independently without stepping on each other’s toes. Let’s dive in!

Use Case: Crunching Big Data

Imagine you’re building a web scraper that collects tons of data from different sources and dumps it all into a big array. You’ve got multiple goroutines fetching and processing data, and you want them to work concurrently to speed things up. But how do you keep them from getting in each other’s way?

The Mutex Approach

First up, let’s see how we usually do it with mutex locks.

package main

import (
    "fmt"
    "sync"
)

func main() {
    const size = 1000000
    data := make([]int, size)
    var mu sync.Mutex
    var wg sync.WaitGroup

    for i := 0; i < 10; i++ {
        wg.Add(1)
        go func(start int) {
            defer wg.Done()
            for j := start; j < start+size/10; j++ {
                mu.Lock()
                data[j] = j * 2 // Example operation
                mu.Unlock()
            }
        }(i * (size / 10))
    }

    wg.Wait()
    fmt.Println("Processing complete")
}

In this setup, every goroutine locks the entire array before it writes to it. It’s safe, but man, is it slow!

The Segmentation Magic

Now, let’s try something cooler. We’ll segment the array so each goroutine works on a different part without any locks.

package main

import (
    "fmt"
    "sync"
)

func main() {
    const size = 1000000
    data := make([]int, size)
    var wg sync.WaitGroup

    for i := 0; i < 10; i++ {
        wg.Add(1)
        go func(start int) {
            defer wg.Done()
            for j := start; j < start+size/10; j++ {
                data[j] = j * 2 // Example operation
            }
        }(i * (size / 10))
    }

    wg.Wait()
    fmt.Println("Processing complete")
}

Here, each goroutine handles a specific segment of the array, and guess what – no locks needed! 🚀

Showdown: Benchmarking Both Approaches

Alright, time to pit these two against each other and see which one’s faster. Here’s the benchmark code:

package main

import (
    "sync"
    "testing"
)

const size = 1000000

func ProcessWithMutex(data []int, wg *sync.WaitGroup, size int) {
    var mu sync.Mutex
    for i := 0; i < 10; i++ {
        wg.Add(1)
        go func(start int) {
            defer wg.Done()
            for j := start; j < start+size/10; j++ {
                mu.Lock()
                data[j] = j * 2
                mu.Unlock()
            }
        }(i * (size / 10))
    }
}

func ProcessWithSegmentation(data []int, wg *sync.WaitGroup, size int) {
    for i := 0; i < 10; i++ {
        wg.Add(1)
        go func(start int) {
            defer wg.Done()
            for j := start; j < start+size/10; j++ {
                data[j] = j * 2
            }
        }(i * (size / 10))
    }
}

func BenchmarkWithMutex(b *testing.B) {
    data := make([]int, size)
    var wg sync.WaitGroup

    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        ProcessWithMutex(data, &wg, size)
        wg.Wait()
    }
}

func BenchmarkWithSegmentation(b *testing.B) {
    data := make([]int, size)
    var wg sync.WaitGroup

    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        ProcessWithSegmentation(data, &wg, size)
        wg.Wait()
    }
}

Run these benchmarks using go test -bench . and let’s see the results!

Benchmark Results:

With Mutex:
- Operations: 14
- Average Time Per Operation: ~82.48 ms (82,482,426 ns)
With Segmentation:
- Operations: 4948
- Average Time Per Operation: ~0.23 ms (231,511 ns)

Analysis:

Performance with Mutex:
- Only 14 operations in the testing period.
- Each operation took around 82.48 milliseconds.
- Mutexes are safe but introduce a lot of overhead and slow things down.
Performance with Segmentation:
- A whopping 4948 operations in the same period.
- Each operation took just 0.23 milliseconds.
- By letting goroutines work on separate parts of the array, we cut out the need for locks and speed things up massively.

Conclusion:
Segmentation blows mutexes out of the water! It’s about 357 times faster, making it a fantastic way to boost performance when dealing with large datasets and high concurrency.

So next time you’re juggling goroutines, think about segmenting your data. It’s a game-changer!

Happy coding, and feel free to share your thoughts and results in the comments below. Let’s make Go concurrency even more awesome! 🚀✨

P.S: Thank you for reading! Feel free to leave your comments below, and let's stay connected on LinkedIn or follow me on Twitter or Github for more insights and updates in data mining and analytics.

Blog

Speeding Up Go Concurrency: Unlocking the Power of Array Segmentation

Ahmad Ganjtabesh

Use Case: Crunching Big Data

The Mutex Approach

The Segmentation Magic

Showdown: Benchmarking Both Approaches

Join Our Newsletter. No Spam, Only the good stuff.

Related