A search engine - Part 4: Concurrency and safety
David Kröll
Posted on December 8, 2020
Since our index can now be accessed from several goroutines, data races can occur unless we add some logic to prevent them. We have to synchronize access to our data structures. We can use the sync package for this, which is part of Go's standard library.
Go's Mutex
The Mutex type from the sync package allows us to place a lock on a shared resource using mutual exclusion. The first call to Mutex.Lock() sets the mutex to locked. If another goroutine then calls Lock() on the same mutex, it blocks (and therefore waits) until the goroutine holding the lock calls Mutex.Unlock(). We'll use this simple synchronization helper to prevent concurrent access to the same data structure.
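To see the blocking behaviour in isolation, here is a minimal sketch that is not part of the search engine itself. The function name countWithMutex and the shared counter are made up for illustration. Many goroutines increment one counter, and the mutex makes sure only one of them touches it at a time.

```go
package main

import (
	"fmt"
	"sync"
)

// countWithMutex is a hypothetical helper: it starts n goroutines that each
// increment a shared counter once while holding the mutex.
func countWithMutex(n int) int {
	var (
		mu      sync.Mutex // the zero value is an unlocked mutex
		wg      sync.WaitGroup
		counter int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()         // blocks while another goroutine holds the lock
			defer mu.Unlock() // released when this function returns
			counter++
		}()
	}
	wg.Wait() // wait for all goroutines before reading the result
	return counter
}

func main() {
	fmt.Println(countWithMutex(1000)) // prints 1000
}
```

Without the Lock()/Unlock() pair the increments would race with each other, and the final value could be lower than expected.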
We extend the GlSearch struct type with a mutex and initialize it in the New() function.
type GlSearch struct {
	config Config
	cache  []string
	index  map[string][]int
	// the synchronization helper
	mu *sync.Mutex
}
// New creates a new GlSearch instance
func New(c Config) *GlSearch {
	glsearch := GlSearch{
		config: c,
		cache:  []string{},
		index:  map[string][]int{},
		mu:     &sync.Mutex{},
	}
	return &glsearch
}
Using a deferred call, we can easily make sure that the mutex gets unlocked when our method returns. We simply add these two lines at the top of every method that reads or writes our shared data.
// Add appends the provided string to the cache and updates the index accordingly
func (g *GlSearch) Add(s string) {
	g.mu.Lock()
	defer g.mu.Unlock()
	// ...
}
// Find queries the index and returns the results from the cache
func (g *GlSearch) Find(s string) SearchResult {
	g.mu.Lock()
	defer g.mu.Unlock()
	// ...
}
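To try the pattern end to end, here is a self-contained sketch with a simplified stand-in type. The names miniIndex, newMiniIndex and addConcurrently are hypothetical, and splitting on whitespace is only an assumption that keeps the example short. It is not the actual GlSearch implementation from the earlier parts.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// miniIndex is a hypothetical, simplified stand-in for GlSearch.
type miniIndex struct {
	mu    sync.Mutex
	cache []string
	index map[string][]int
}

func newMiniIndex() *miniIndex {
	return &miniIndex{index: map[string][]int{}}
}

// Add stores s and records its position under every whitespace-separated word.
func (m *miniIndex) Add(s string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	pos := len(m.cache)
	m.cache = append(m.cache, s)
	for _, w := range strings.Fields(s) {
		m.index[w] = append(m.index[w], pos)
	}
}

// Find returns all cached strings that were indexed under the word w.
func (m *miniIndex) Find(w string) []string {
	m.mu.Lock()
	defer m.mu.Unlock()
	var out []string
	for _, pos := range m.index[w] {
		out = append(out, m.cache[pos])
	}
	return out
}

// addConcurrently calls Add from n goroutines and returns the hit count for "go".
func addConcurrently(n int) int {
	m := newMiniIndex()
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			m.Add(fmt.Sprintf("go document %d", i))
		}(i)
	}
	wg.Wait()
	return len(m.Find("go"))
}

func main() {
	fmt.Println(addConcurrently(100)) // prints 100
}
```

Running a program like this with go run -race is a handy way to check that no unsynchronized access is left. If you remove the locking from Add, the race detector should complain.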
The Flush() method
This method will reset our cache and index, while keeping the configuration the same.
Here we just initialize a new slice and map and let the old ones be garbage collected.
// Flush resets the index and cache
func (g *GlSearch) Flush() {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.cache = []string{}
	g.index = map[string][]int{}
}
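The effect of replacing the slice and map can be sketched with a small stand-in type. The names store and flushDemo are hypothetical and only serve the illustration. A slice that was obtained before the flush still sees the old data, because the old backing array stays alive until nothing references it anymore.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a hypothetical stand-in holding just the fields that Flush touches.
type store struct {
	mu    sync.Mutex
	cache []string
	index map[string][]int
}

// Flush swaps in a fresh slice and map while holding the lock.
func (s *store) Flush() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.cache = []string{}
	s.index = map[string][]int{}
}

// flushDemo returns the length of a snapshot taken before the flush,
// followed by the cache and index sizes after the flush.
func flushDemo() (int, int, int) {
	s := &store{
		cache: []string{"a", "b"},
		index: map[string][]int{"a": {0}, "b": {1}},
	}
	snapshot := s.cache // still points at the old backing array
	s.Flush()
	return len(snapshot), len(s.cache), len(s.index)
}

func main() {
	fmt.Println(flushDemo()) // prints 2 0 0
}
```

Assigning new values instead of clearing in place keeps Flush short, and the garbage collector reclaims the old data once the last reference to it is gone.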
So we're done for this episode: we ensured that our data can't be accessed in parallel. In the next one we'll talk about more generic access to the index and cache.
Stay tuned.