A search engine - Part 4: Concurrency and safety

Since our index can now be accessed from several goroutines, data races can occur unless we guard against them. We have to synchronize access to our data structures. We can use the sync package for this, which is part of Go's standard library.

Go's Mutex

The Mutex type from the sync package lets us place a lock on a shared resource using mutual exclusion. The first call to Mutex.Lock() sets the mutex to locked. If another goroutine calls Lock() on the same mutex, it blocks (and therefore waits) until the mutex is released by a call to Mutex.Unlock(). We'll use this simple synchronization helper to prevent concurrent access to the same data structure.
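To make the Lock()/Unlock() behaviour concrete, here is a minimal sketch that is independent of our search engine. The counter type and the countConcurrently helper are hypothetical names made up for this illustration. Many goroutines increment one shared integer, and the mutex makes sure only one of them touches it at a time.

```go
package main

import (
	"fmt"
	"sync"
)

// counter is a hypothetical shared resource guarded by a mutex.
type counter struct {
	mu sync.Mutex
	n  int
}

// inc increments the counter while holding the lock.
func (c *counter) inc() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n++
}

// value returns the current count while holding the lock.
func (c *counter) value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

// countConcurrently starts one goroutine per worker, each incrementing once,
// waits for all of them and returns the final count.
func countConcurrently(workers int) int {
	var c counter
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.inc()
		}()
	}
	wg.Wait()
	return c.value()
}

func main() {
	fmt.Println(countConcurrently(1000)) // prints 1000
}
```

Without the Lock()/Unlock() pair in inc(), the increments could overlap and the final count would no longer be reliable.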

We extend the GlSearch struct type with a mutex and initialize it in the New() function.

type GlSearch struct {
    config Config
    cache  []string
    index  map[string][]int
    // the synchronization helper
    mu     *sync.Mutex
}

// New creates a new GlSearch instance
func New(c Config) *GlSearch {

    glsearch := GlSearch{
        config: c,
        cache:  []string{},
        index:  map[string][]int{},
        mu:     &sync.Mutex{},
    }

    return &glsearch
}

Using a deferred call, we can easily make sure that the mutex gets unlocked when our method returns. We simply add these two lines to every method that reads or writes our shared data.

// Add appends the provided string to the cache and updates the index accordingly
func (g *GlSearch) Add(s string) {
    g.mu.Lock()
    defer g.mu.Unlock()

    // ...
}


// Find queries the index and returns the results from the cache
func (g *GlSearch) Find(s string) SearchResult {
    g.mu.Lock()
    defer g.mu.Unlock()

    // ...
}
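To see the locking pattern under load, here is a rough, self-contained sketch. The miniSearch type is a hypothetical, stripped-down stand-in for GlSearch: its word-based indexing, the []string return value of Find and the fillConcurrently helper are assumptions made for this example, not the code from the earlier parts of this series.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// miniSearch is a hypothetical, simplified stand-in for GlSearch.
type miniSearch struct {
	mu    sync.Mutex
	cache []string
	index map[string][]int
}

// newMiniSearch returns an empty, ready-to-use instance.
func newMiniSearch() *miniSearch {
	return &miniSearch{index: map[string][]int{}}
}

// Add stores s and indexes each whitespace-separated, lowercased word.
// The indexing scheme is an assumption for this sketch.
func (m *miniSearch) Add(s string) {
	m.mu.Lock()
	defer m.mu.Unlock()

	id := len(m.cache)
	m.cache = append(m.cache, s)
	for _, w := range strings.Fields(strings.ToLower(s)) {
		m.index[w] = append(m.index[w], id)
	}
}

// Find returns the cached strings that were indexed under the word w.
func (m *miniSearch) Find(w string) []string {
	m.mu.Lock()
	defer m.mu.Unlock()

	var out []string
	for _, id := range m.index[strings.ToLower(w)] {
		out = append(out, m.cache[id])
	}
	return out
}

// fillConcurrently adds n entries from n goroutines and returns the store.
func fillConcurrently(n int) *miniSearch {
	m := newMiniSearch()
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			m.Add(fmt.Sprintf("entry %d", i))
		}(i)
	}
	wg.Wait()
	return m
}

func main() {
	m := fillConcurrently(100)
	fmt.Println(len(m.Find("entry"))) // prints 100
}
```

Running a program like this with Go's race detector enabled (go run -race) is a quick way to check that no unsynchronized access slipped through.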

The Flush() method

This method will reset our cache and index, while keeping the configuration the same.

Here we just initialize a new slice and map and let the old ones be garbage collected.

// Flush resets the index and cache
func (g *GlSearch) Flush() {
    g.mu.Lock()
    defer g.mu.Unlock()

    g.cache = []string{}
    g.index = map[string][]int{}
}
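The effect of such a reset can be checked with a small sketch. The flushStore type below is again a hypothetical stand-in (it has no config, and it indexes every string under itself), so treat it as an illustration of the reset pattern and not as the real GlSearch.

```go
package main

import (
	"fmt"
	"sync"
)

// flushStore is a hypothetical stand-in holding a cache and an index.
type flushStore struct {
	mu    sync.Mutex
	cache []string
	index map[string][]int
}

// newFlushStore returns an empty, ready-to-use instance.
func newFlushStore() *flushStore {
	return &flushStore{cache: []string{}, index: map[string][]int{}}
}

// add stores s and indexes it under itself (simplified, an assumption).
func (f *flushStore) add(s string) {
	f.mu.Lock()
	defer f.mu.Unlock()

	f.index[s] = append(f.index[s], len(f.cache))
	f.cache = append(f.cache, s)
}

// flush replaces the cache and index with fresh, empty values.
func (f *flushStore) flush() {
	f.mu.Lock()
	defer f.mu.Unlock()

	f.cache = []string{}
	f.index = map[string][]int{}
}

// size reports the number of cached entries and index keys.
func (f *flushStore) size() (entries, keys int) {
	f.mu.Lock()
	defer f.mu.Unlock()

	return len(f.cache), len(f.index)
}

func main() {
	f := newFlushStore()
	f.add("go")
	f.add("mutex")
	fmt.Println(f.size()) // prints 2 2

	f.flush()
	fmt.Println(f.size()) // prints 0 0

	f.add("again")
	fmt.Println(f.size()) // prints 1 1
}
```

Because flush() takes the same lock as add() and size(), no other goroutine can observe a state where only one of the two structures has been reset.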

That's it for this episode: we ensured that our data can't be accessed in parallel. In the next one we'll talk about more generic access to the index and cache.

Stay tuned.

David Kröll
