Building a Solr Cluster with Terraform – Part 1

documentednerd

Kevin Mack

Posted on April 29, 2019

It’s no surprise that I’ve been talking a lot about how amazing Terraform is. Recently I’ve also been doing a lot of investigation into Solr and how to build a scalable Solr cluster.

Given the Kubernetes template, I wanted to try my hand at something similar. The goals of this project were the following:

  1. Build a generic template for creating a SolrCloud cluster with distributed shards.
  2. Build out the ability to scale the cluster, for now using Terraform to manually trigger increases in cluster size.
  3. Make the nodes automatically add themselves to the cluster.

I could do this using just bash scripts and Packer, but I wanted to try my hand at cloud-init instead.

That is going to be the end result, though. I want to walk through the various steps I take to get there. The first real step is working through the installation of Solr on the Linux machines that will make up the cluster.

So let’s start with “What is Solr?” Solr is open source software for building a search engine. It works in the same vein as Elasticsearch and similar technologies. Solr has been around for quite a while and is used by some of the largest companies that implement search to handle search requests from their customers, Netflix and CareerBuilder among them.

So I decided to try my hand at creating my first Solr cluster, and reviewed the getting started guide.

I ended up looking into it more, and built the following script to create a “getting started” Solr cluster.

    # Prerequisites for fetching packages over HTTPS
    sudo apt-get install -y apt-transport-https ca-certificates curl software-properties-common
    sudo apt-get install -y gnupg-curl
    # wget needs -qO- to write to stdout, and apt-key add needs "-" to read stdin
    sudo wget -qO- https://www.apache.org/dist/lucene/solr/8.0.0/solr-8.0.0.zip.asc | sudo apt-key add -
    sudo apt-get update -y
    sudo apt-get install -y unzip
    # Download and unpack Solr 8.0.0
    sudo wget http://mirror.cogentco.com/pub/apache/lucene/solr/8.0.0/solr-8.0.0.zip
    sudo unzip -q solr-8.0.0.zip
    ls
    sudo mv -f solr-8.0.0 /usr/local/bin/solr-8.0.0
    sudo rm -f solr-8.0.0.zip
    # Solr needs a Java runtime
    sudo apt-get install -y default-jdk
    sudo chmod +x /usr/local/bin/solr-8.0.0/bin/solr
    sudo chmod +x /usr/local/bin/solr-8.0.0/example/cloud/node1/solr
    sudo chmod +x /usr/local/bin/solr-8.0.0/example/cloud/node2/solr
    # Start the SolrCloud example with default answers.
    # Note: bin/solr normally refuses to run as root; add -force if it exits with that warning.
    sudo /usr/local/bin/solr-8.0.0/bin/solr -e cloud -noprompt

The above will configure a “getting started” Solr cluster that leverages all the system defaults and is hardly a production implementation, so my next step will be to change that. But for the sake of getting something running, I took the script and moved it into a Packer template using the following JSON. The script above is the “../scripts/Solr/provision.sh” referenced in the template.

    {
      "variables": {
        "deployment_code": "",
        "resource_group": "",
        "subscription_id": "",
        "location": "",
        "cloud_environment_name": "Public"
      },
      "builders": [
        {
          "type": "azure-arm",
          "cloud_environment_name": "{{user `cloud_environment_name`}}",
          "subscription_id": "{{user `subscription_id`}}",
          "managed_image_resource_group_name": "{{user `resource_group`}}",
          "managed_image_name": "Ubuntu_16.04_{{isotime \"2006_01_02_15_04\"}}",
          "managed_image_storage_account_type": "Premium_LRS",
          "os_type": "Linux",
          "image_publisher": "Canonical",
          "image_offer": "UbuntuServer",
          "image_sku": "16.04-LTS",
          "location": "{{user `location`}}",
          "vm_size": "Standard_F2s"
        }
      ],
      "provisioners": [
        {
          "type": "shell",
          "script": "../scripts/ubuntu/update.sh"
        },
        {
          "type": "shell",
          "script": "../scripts/Solr/provision.sh"
        },
        {
          "execute_command": "chmod +x {{ .Path }}; {{ .Vars }} sudo -E sh '{{ .Path }}'",
          "inline": [
            "/usr/sbin/waagent -force -deprovision+user && export HISTSIZE=0 && sync"
          ],
          "inline_shebang": "/bin/sh -e",
          "type": "shell"
        }
      ]
    }
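The template leaves most of its variables empty, so they have to be supplied at build time. As a rough sketch of how that might look, the helper below assembles `packer validate` and `packer build` commands with one `-var 'key=value'` flag per variable. The file name `solr.json`, the function names, and every sample value are placeholders chosen for illustration and do not come from the original project. The sketch only prints the commands, so it runs without Packer installed.

```shell
#!/bin/sh
# Sketch only: prints Packer commands instead of running them.
# TEMPLATE and all sample values below are hypothetical placeholders.
TEMPLATE="${TEMPLATE:-solr.json}"

# Turn each key=value argument into a -var 'key=value' flag.
packer_var_flags() {
  local flags=""
  local pair
  for pair in "$@"; do
    flags="${flags} -var '${pair}'"
  done
  # Drop the leading space left by the loop.
  printf '%s\n' "${flags# }"
}

# Print the full command line for a subcommand (validate or build).
packer_command() {
  local subcommand="$1"
  shift
  printf 'packer %s %s %s\n' "$subcommand" "$(packer_var_flags "$@")" "$TEMPLATE"
}

# Validate first, then build, with the same set of variables.
for step in validate build; do
  packer_command "$step" \
    "deployment_code=demo" \
    "resource_group=my-resource-group" \
    "subscription_id=00000000-0000-0000-0000-000000000000" \
    "location=eastus"
done
```

Once the values are real, the printed lines can be pasted into a terminal as they are. Running `validate` before `build` catches template typos before any Azure resources are created.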

The only other script referenced is “update.sh”, which updates the Ubuntu image and installs the Azure CLI:

    #!/bin/bash
    sudo apt-get update -y
    sudo apt-get upgrade -y
    # Azure CLI: register the Microsoft package repository and signing key
    AZ_REPO=$(sudo lsb_release -cs)
    sudo echo "deb [arch=amd64] https://packages.microsoft.com/repos/azure-cli/ $AZ_REPO main" | sudo tee /etc/apt/sources.list.d/azure-cli.list
    sudo curl -L https://packages.microsoft.com/keys/microsoft.asc | sudo apt-key add -
    # -y keeps the installs from waiting on a prompt during the unattended Packer run
    sudo apt-get install -y apt-transport-https
    sudo apt-get update && sudo apt-get install -y azure-cli
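To make the repository step a little more concrete, here is a small sketch of the line that ends up in `/etc/apt/sources.list.d/azure-cli.list`. The function name `azure_cli_source_line` is made up for this example. It takes the release codename as an argument and falls back to `lsb_release -cs` only when none is given. On the Ubuntu 16.04 image used by the template, that codename is `xenial`.

```shell
#!/bin/sh
# Sketch: build the azure-cli apt source line for a given Ubuntu codename.
# azure_cli_source_line is a hypothetical helper, not part of update.sh.
azure_cli_source_line() {
  # Use the first argument if given, otherwise ask the system.
  local codename="${1:-$(lsb_release -cs)}"
  printf 'deb [arch=amd64] https://packages.microsoft.com/repos/azure-cli/ %s main\n' "$codename"
}

# Ubuntu 16.04 reports "xenial" from lsb_release -cs.
azure_cli_source_line xenial
# prints: deb [arch=amd64] https://packages.microsoft.com/repos/azure-cli/ xenial main
```

Passing the codename explicitly makes it easy to check the output on a machine that is not running the target Ubuntu release.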

With the template and both scripts in place, I’m in a good spot to create an image with Solr configured.
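One thing that may be worth having at this point is a quick check that Solr actually answers on a machine where the example cluster has been started. The sketch below is a generic retry helper plus a `curl` probe. The names `retry` and `solr_is_up`, the attempt and delay values, and the `SOLR_URL` default are assumptions made for illustration. The URL assumes the first node of the cloud example listens on Solr's default port, 8983.

```shell
#!/bin/sh
# Sketch: retry a command until it succeeds or the attempts run out.
# retry, solr_is_up and SOLR_URL are illustrative names, not from the original scripts.

# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
  local attempts="$1"
  local delay="$2"
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    # Only wait if another attempt is coming.
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Assumed default: first node of the cloud example on port 8983.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr/admin/info/system}"

# Succeeds only if Solr returns a non-error HTTP response.
solr_is_up() {
  curl -fsS -o /dev/null "$SOLR_URL"
}
```

A call such as `retry 10 3 solr_is_up && echo "Solr is answering"` would then wait up to roughly half a minute for the node to come up before giving up.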

For next steps I will be doing the following:

  • Building a more “production friendly” implementation of Solr into the script.
  • Investigating leveraging cloud-init instead of the “golden image” experience with Packer.
  • Building out templates around the use of ZooKeeper for managing the nodes.