Setting up SolrCloud for Production

dhar01

Loknath Dhar

Posted on July 28, 2022

I wasn't familiar with SolrCloud when I started working with it. While learning it and working with it directly, I made a lot of mistakes. Even setting up SolrCloud for production was a hassle for me (I'm a slow learner). But over time I got the hang of it, and I think writing an up-to-date guide to setting up SolrCloud on a production server is a good idea.

The official documentation is excellent and you should read it. This post is very straightforward. If you want explanations or want to learn more, please consult the official documentation.

Let's get started.

(I'm writing this guide based on my experience with Linux servers.)

To make it easy to jump to the installation directories, I put these in .bashrc:

export solr_home=/opt/solr
PATH=/opt/solr/bin:$PATH

export zookeeper_home=/opt/zookeeper
PATH=/opt/zookeeper/bin:$PATH

$ cd $zookeeper_home    # takes you to /opt/zookeeper
$ cd $solr_home    # takes you to /opt/solr

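Note that this solr_home variable is not the same as Solr's own SOLR_HOME variable. Here solr_home is just a shortcut to the installation directory, while SOLR_HOME (set later in solr.in.sh) points to Solr's data directory.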

Zookeeper

To bring SolrCloud into production, we need an external Zookeeper server (not the embedded one) to manage our configuration and coordination centrally. We're going to work with 3 servers. We will install Zookeeper and Apache Solr on all of them with the same configuration.

First, let's configure Apache Zookeeper. Install the Java version required by the official documentation. At the time of writing, the latest Zookeeper and Solr both need Java 11.

$ sudo apt install openjdk-11-jdk

# check java version
$ java -version
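
If you script the setup, a quick version check can catch a wrong Java before Zookeeper or Solr fails to start. Here is a minimal sketch. The java_major helper is a name I made up for illustration, and it assumes java -version prints the usual version "x.y.z" line.

```shell
#!/bin/sh
# java_major is a hypothetical helper: it reads `java -version` output
# on stdin and prints the major version number.
java_major() {
  awk -F'"' '/version/ {
    split($2, parts, "[.]")
    # Legacy numbering: "1.8.0_312" means Java 8
    if (parts[1] == "1") print parts[2]; else print parts[1]
    exit
  }'
}

# Only run the check when java is on the PATH.
if command -v java >/dev/null 2>&1; then
  major=$(java -version 2>&1 | java_major)
  if [ "${major:-0}" -lt 11 ]; then
    echo "Java $major found, but this guide expects Java 11"
  fi
fi
```

Note that java -version writes to stderr, which is why the sketch redirects it with 2>&1 before parsing.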

Download the latest stable version from the official website. Note that we don't want the source (src) bundle; we need the binary (bin) version.

Installation and configuration

Working as root is not recommended, but I'm going to work as the root user anyway. My installation directory will be under /opt.

# the binary release is named apache-zookeeper-<version>-bin.tar.gz
$ tar -xvf apache-zookeeper-*-bin.tar.gz -C /opt

$ ln -s /opt/apache-zookeeper-*-bin /opt/zookeeper

We need to create a configuration file for Zookeeper under $zookeeper_home/conf/. The file name will be zoo.cfg:

tickTime=2000
dataDir=/var/lib/zookeeper/data
dataLogDir=/var/lib/zookeeper/logs

clientPort=2181
4lw.commands.whitelist=mntr,conf,ruok

initLimit=5
syncLimit=2

# replace Server1, Server2, Server3 with the IP addresses or host names of your servers
server.1=Server1:2888:3888
server.2=Server2:2888:3888
server.3=Server3:2888:3888

autopurge.snapRetainCount=3
autopurge.purgeInterval=1

We will create a Zookeeper environment file in the same place as zoo.cfg, which is under $zookeeper_home/conf. The file name will be zookeeper-env.sh:

JAVA_HOME="/usr/lib/jvm/java-1.11.0-openjdk-amd64"
ZOO_LOG_DIR="/var/lib/zookeeper/logs"
ZOO_LOG4J_PROP="INFO,ROLLINGFILE"

# raise jute.maxbuffer (the largest znode/request Zookeeper accepts) to 50,000,000 bytes, about 50 MB
JVMFLAGS="$JVMFLAGS -Djute.maxbuffer=50000000"

Create the directories defined in the configuration:

mkdir -p /var/lib/zookeeper/data
mkdir -p /var/lib/zookeeper/logs

Create a myid text file under the /var/lib/zookeeper/data directory. Put the server id in that file on a single line. The id must match the number in that server's server.N line in zoo.cfg. In the case of server 2, the file will contain: 2

echo "2" >/var/lib/zookeeper/data/myid
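
Since the id has to agree with zoo.cfg, you could also look it up instead of typing it by hand on each server. The sketch below is only an illustration: zk_myid_for is a hypothetical helper, and it assumes the server.N=host:2888:3888 layout shown above with host names or IPv4 addresses (an IPv6 address would confuse the colon handling).

```shell
#!/bin/sh
# zk_myid_for is a hypothetical helper, not a Zookeeper command.
# $1: path to zoo.cfg
# $2: this server's host name or IP, exactly as written in zoo.cfg
zk_myid_for() {
  awk -v host="$2" '
    /^server\.[0-9]+=/ {
      id = $0
      sub(/^server\./, "", id)   # drop the "server." prefix
      sub(/=.*/, "", id)         # keep only the number before "="
      addr = $0
      sub(/^[^=]*=/, "", addr)   # drop everything up to "="
      sub(/:.*/, "", addr)       # drop the ":2888:3888" ports
      if (addr == host) { print id; found = 1; exit }
    }
    END { exit !found }          # non-zero status when no line matched
  ' "$1"
}
```

On server 2 you would call it as zk_myid_for /opt/zookeeper/conf/zoo.cfg Server2 and write the printed number into /var/lib/zookeeper/data/myid.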

Now you can start Zookeeper whenever you want, but it needs to be started before Solr.

$ cd $zookeeper_home
$ bin/zkServer.sh start
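
Once Zookeeper is started on all three servers, it's worth checking that each one answers. The zoo.cfg above whitelists the ruok four-letter command, and a healthy server replies imok. A rough sketch, assuming nc (netcat) is installed; zk_is_ok and zk_check_ensemble are names I made up:

```shell
#!/bin/sh
# zk_is_ok is a hypothetical helper; it assumes nc (netcat) is installed.
# $1: host, $2: client port (defaults to 2181, the clientPort in zoo.cfg)
zk_is_ok() {
  reply=$(printf 'ruok' | nc -w 2 "$1" "${2:-2181}" 2>/dev/null) || true
  [ "$reply" = "imok" ]
}

# Check every host passed as an argument and report one line per host.
# Returns non-zero if at least one host did not answer.
zk_check_ensemble() {
  failed=0
  for host in "$@"; do
    if zk_is_ok "$host"; then
      echo "$host: imok"
    else
      echo "$host: no answer"
      failed=1
    fi
  done
  return "$failed"
}
```

Call it as zk_check_ensemble Server1 Server2 Server3. Running bin/zkServer.sh status on a server also shows whether that server is currently the leader or a follower.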

Apache SolrCloud

Download the latest Solr (bin version) to the server, move the file to /opt and extract the installer script from it.

$ tar xzf solr-*.tgz solr-*/bin/install_solr_service.sh --strip-components=2

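Install it under /opt, which is the installer's default location.

$ sudo bash ./install_solr_service.sh solr-*.tgz
# the installer normally creates the /opt/solr symlink itself; run this from /opt only if the link is missing
$ ln -s solr-*/ solr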

Edit bin/solr.in.sh to configure Solr. The same settings can also be applied system-wide through an include file under /etc/default (the installer script places one at /etc/default/solr.in.sh).

#writing include file

SOLR_PID_DIR=/var/solr
SOLR_HOME=/var/solr/data

#LOG SETTINGS

LOG4J_PROPS=/var/solr/log4j2.xml
SOLR_LOGS_DIR=/var/solr/logs

SOLR_HEAP="1g"

SOLR_JAVA_HOME="/usr/lib/jvm/java-1.11.0-openjdk-amd64"
ZK_HOST="zk-server1:2181,zk-server2:2181,zk-server3:2181"
SOLR_LOG_LEVEL=INFO

# Data backup location for replication environment
SOLR_OPTS="$SOLR_OPTS -Dsolr.allowPaths=/mnt/solr_backup"

# soft commit every 10 seconds (the value is in milliseconds)
SOLR_OPTS="$SOLR_OPTS -Dsolr.autoSoftCommit.maxTime=10000"
SOLR_HOST="zk-server-ip" # this server's own IP address; Solr registers itself in Zookeeper under it

# SOLR_HEAP above and SOLR_JAVA_MEM both set the heap size; keep only one of them
SOLR_JAVA_MEM="-Xms2g -Xmx2g"
ZK_CLIENT_TIMEOUT="30000"
SOLR_PORT=8983

# Listen on all network interfaces so that other machines can reach Solr
SOLR_JETTY_HOST="0.0.0.0"

# Set these only if you have enabled basic authentication.
# They let the bin/solr script authenticate its own requests, so it runs without errors
SOLR_AUTH_TYPE="basic"
SOLR_AUTHENTICATION_OPTS="-Dbasicauth=username:password"

Create directories defined in the configuration:

$ mkdir -p /var/solr/data
$ mkdir -p /var/solr/logs
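
When you repeat this on several servers, it is easy to forget a directory on one of them. The following is a minimal sketch of a pre-start check. check_solr_dirs is a hypothetical helper (not part of Solr), and it assumes the include file sets SOLR_PID_DIR, SOLR_HOME and SOLR_LOGS_DIR as in the example above.

```shell
#!/bin/sh
# check_solr_dirs is a hypothetical helper, not part of Solr.
# $1: path to the include file, for example /etc/default/solr.in.sh
# The body runs in a subshell (note the parentheses) so that sourcing
# the include file does not leak variables into the calling shell.
check_solr_dirs() (
  . "$1" || exit 1
  status=0
  for dir in "${SOLR_PID_DIR:-}" "${SOLR_HOME:-}" "${SOLR_LOGS_DIR:-}"; do
    # Skip unset variables, report configured paths that do not exist.
    if [ -n "$dir" ] && [ ! -d "$dir" ]; then
      echo "missing directory: $dir"
      status=1
    fi
  done
  exit "$status"
)

# Only run the check where an include file is actually installed.
if [ -f /etc/default/solr.in.sh ]; then
  check_solr_dirs /etc/default/solr.in.sh ||
    echo "create the directories listed above before starting Solr"
fi
```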

And it's done.

You have to configure all servers like this.

Start SolrCloud by using the script (-force is needed here only because I'm running as root):

$ bin/solr start -c -p 8983 -s /var/solr/data -z zk1:2181,zk2:2181,zk3:2181 -force
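
Solr can take a little while to come up, so in a script you may want to wait for it before creating collections. Here is a rough sketch that polls Solr's /solr/admin/info/system endpoint with curl. wait_for_solr and the SOLR_AUTH variable are names I made up for this example, and curl is assumed to be installed.

```shell
#!/bin/sh
# wait_for_solr is a hypothetical helper, not part of bin/solr.
# $1: host, $2: port, $3: maximum number of attempts (about one per second)
# Optional: set SOLR_AUTH=username:password (a variable name made up for
# this sketch) when basic authentication is enabled.
wait_for_solr() {
  url="http://$1:$2/solr/admin/info/system"
  tries=$3
  if [ -n "${SOLR_AUTH:-}" ]; then
    set -- -u "$SOLR_AUTH"   # extra curl arguments for basic auth
  else
    set --                   # no extra arguments
  fi
  attempt=0
  while [ "$attempt" -lt "$tries" ]; do
    # -f makes curl fail on HTTP errors such as 401 or 503
    if curl -fsS -o /dev/null --max-time 2 "$@" "$url" 2>/dev/null; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep 1
  done
  return 1
}
```

For example, wait_for_solr localhost 8983 30 gives the node roughly half a minute to answer. The bin/solr status command is a quick manual alternative on the server itself.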

To get help:

bin/solr start -help
bin/solr restart -help # the other commands take -help the same way

I hope you are able to get the SolrCloud running without any errors.

Use the bin/solr script to interact with Solr and Zookeeper. To learn more, see the official documentation.

To create a collection:

$ bin/solr create_collection -c col_name -d _default -shards 1 -replicationFactor 3 -p 8983 -V -force

To delete a collection:

$ bin/solr delete -c col_name -deleteConfig true -p 8983 -V
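
If you would rather automate this over HTTP, Solr's Collections API has CREATE, DELETE and LIST actions that roughly correspond to the commands above. The sketch below only builds and prints the curl commands, so nothing gets created or deleted by accident. collections_api_url is a hypothetical helper, the config set is left to Solr's defaults (which may differ from what -d _default does), and it makes no attempt to reproduce -deleteConfig.

```shell
#!/bin/sh
# collections_api_url is a hypothetical helper that only builds a URL.
# $1: host, $2: port, $3: action, $4: extra query string (optional)
collections_api_url() {
  printf 'http://%s:%s/solr/admin/collections?action=%s%s\n' \
    "$1" "$2" "$3" "${4:+&$4}"
}

# Same settings as the bin/solr commands above: 1 shard, 3 replicas.
create_url=$(collections_api_url localhost 8983 CREATE \
  'name=col_name&numShards=1&replicationFactor=3')
delete_url=$(collections_api_url localhost 8983 DELETE 'name=col_name')
list_url=$(collections_api_url localhost 8983 LIST)

# Print the curl commands instead of running them.
# Add -u username:password if basic authentication is enabled.
for url in "$create_url" "$delete_url" "$list_url"; do
  echo "curl '$url'"
done
```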

That's all. I will try to update this post from time to time. I'm also planning to include some useful commands later.

Hope things are working.

Happy searching!
