Extending and moving a ZooKeeper ensemble

Original author: Yersin Kakimishov

    Once upon a time our DBA team had a task: to move the ZooKeeper ensemble that we had been using for a ClickHouse cluster. The usual way to move an ensemble is to move its data files. That seems easy and obvious, but our ClickHouse cluster held more than 400 TB of replicated data, and all of its replication information had been stored in ZooKeeper from the very beginning. We could not afford to lose a single row of data. We looked for information on the internet. Unfortunately, the only good tutorial we found covered version 3.4.5 and did not fit our version, 3.6.2. So we decided to move the ensemble by extending it.

Work Plan

    We have a ZooKeeper ensemble consisting of 3 instances running on 3 independent servers. We had also been given 3 new servers to move the ensemble to. In addition, we found 1 more temporary server for our quorum. Why? Because a ClickHouse cluster with ReplicatedMergeTree tables accepts writes only while the ZooKeeper ensemble has a quorum; otherwise the tables are read-only. For better understanding, here is the scheme:

3 (id: 1, 2, 3) + 1 (id: 4) + 3 (id: 5, 6, 7)

    The original ensemble runs on 3 servers (id: 1, 2, 3). We add 1 temporary server (id: 4) and the new servers (id: 5, 6, 7). Once the ensemble is synchronised, we remove the instances we no longer need (id: 1, 2, 3, 4). In the end we have an ensemble with a quorum running on the new servers (id: 5, 6, 7).
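The quorum at each stage of this plan can be checked with a little arithmetic. The sketch below assumes ZooKeeper's default majority quorum (more than half of the voting members) and that every server joins as a participant; the function names are ours and purely illustrative.

```python
def quorum_size(voters: int) -> int:
    """Smallest majority of `voters` (assumes default majority quorums)."""
    if voters < 1:
        raise ValueError("an ensemble needs at least one voting member")
    return voters // 2 + 1


def tolerated_failures(voters: int) -> int:
    """How many voters may be down while a majority is still reachable."""
    return voters - quorum_size(voters)


if __name__ == "__main__":
    # Ensemble sizes while growing from 3 to 7 and shrinking back to 3.
    for voters in (3, 4, 5, 6, 7, 6, 5, 4, 3):
        print(f"{voters} voters: quorum={quorum_size(voters)}, "
              f"tolerated failures={tolerated_failures(voters)}")
```

With all 7 instances in the ensemble, a majority is 4, so the ensemble keeps its quorum as long as no more than 3 instances are unavailable at once.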

Extending the original ensemble

    The official documentation describes how to extend an ensemble (dynamic reconfiguration). However, it is reference material rather than a step-by-step tutorial.

The original ensemble runs on 3 independent servers with CentOS 7:

server.1=zk-1:2888:3888
server.2=zk-2:2888:3888
server.3=zk-3:2888:3888

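Check your configuration file: if reconfigEnabled=true is absent, you must add it, because dynamic reconfiguration has been disabled by default since version 3.5.3. The reconfig command is also protected by ACLs, so without extra setup it is rejected for lack of permissions. The workaround we used is to skip ACL checks by adding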

export SERVER_JVMFLAGS="$SERVER_JVMFLAGS -Dzookeeper.skipACL=yes"

to the $ZK_HOME/bin/zkEnv.sh file. Now you are ready to restart the ensemble. Make sure you stop the ClickHouse cluster first.
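Before the restart it can be worth confirming that the setting is really present on every server. This is a minimal sketch, assuming the static configuration is a plain key=value file; the function names and the sample text are illustrative and not part of ZooKeeper.

```python
def parse_properties(text: str) -> dict:
    """Parse key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def reconfig_enabled(text: str) -> bool:
    """True only if reconfigEnabled is explicitly set to true."""
    value = parse_properties(text).get("reconfigEnabled", "false")
    return value.lower() == "true"


if __name__ == "__main__":
    # Made-up sample standing in for the contents of the real config file.
    sample = "dataDir=/var/lib/zookeeper\nreconfigEnabled=true\n"
    print(reconfig_enabled(sample))
```

In practice the text would be read from the configuration file on each server of the ensemble.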

It’s time to prepare new servers:

  1. Upload the apache-zookeeper tarball to the new servers (in our case apache-zookeeper-3.6.2-bin.tar.gz);

  2. Create an OS user for ZooKeeper and extract the tarball in that user's home directory;

  3. Create a directory for the ZooKeeper data files and, inside it, a myid file that identifies the instance. The original ensemble already uses ids 1, 2 and 3 (server.1=zk-1:2888:3888, server.2=zk-2:2888:3888, server.3=zk-3:2888:3888), so the myid files of the new instances will contain 4, 5, 6 and 7 respectively;

  4. Add firewall rules for ports 2888 (quorum traffic), 3888 (leader election), 2181 (clients) and 7000 (Prometheus metrics);

  5. Create a service file (on CentOS 7, for example, a systemd unit).
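Step 3 above can be sketched as follows. The helper name and the directory used in the demo are hypothetical; the 1 to 255 range check reflects the id range recommended in the ZooKeeper administrator's guide.

```python
from pathlib import Path


def write_myid(data_dir: str, server_id: int) -> Path:
    """Create the data directory and write the instance id into `myid`."""
    # The admin guide asks for a unique id between 1 and 255.
    if not 1 <= server_id <= 255:
        raise ValueError("server id must be between 1 and 255")
    directory = Path(data_dir)
    directory.mkdir(parents=True, exist_ok=True)
    myid = directory / "myid"
    myid.write_text(f"{server_id}\n")
    return myid


if __name__ == "__main__":
    import tempfile

    # A temporary directory stands in for the real dataDir in this demo.
    with tempfile.TemporaryDirectory() as tmp:
        path = write_myid(f"{tmp}/zookeeper-data", 4)
        print(path.read_text().strip())
```

On a real server the first argument would be the directory configured as dataDir, and the script would run as the ZooKeeper OS user so that the file ownership is correct.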

    When everything is ready, run $ZK_HOME/bin/zkCli.sh against the running ensemble and enter

reconfig -add server.4=zk-4:2888:3888:participant;2181

    If zoo.conf.dynamic.100000000 appears in the $ZK_HOME/conf directory on all 3 servers, you are on the right path. Then start the ZooKeeper server on the 4th server; zoo.conf.dynamic.100000000 must appear in its $ZK_HOME/conf as well. After every subsequent reconfig the number after zoo.conf.dynamic (100000000) changes. Repeat this step for instances 5, 6 and 7. At the end the ensemble must contain all 7 instances, synchronised.
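To avoid typos when repeating the command, the lines can be generated. This is only a sketch: it assumes the zk-N host naming and the ports used in this article, and the helper name is ours.

```python
def add_command(server_id: int, host: str, quorum_port: int = 2888,
                election_port: int = 3888, client_port: int = 2181) -> str:
    """Format a zkCli `reconfig -add` line for one participant."""
    return (f"reconfig -add server.{server_id}={host}:{quorum_port}:"
            f"{election_port}:participant;{client_port}")


if __name__ == "__main__":
    # Host names follow the zk-N pattern used in this article.
    for server_id in (4, 5, 6, 7):
        print(add_command(server_id, f"zk-{server_id}"))
```

The generated lines are meant to be entered one at a time, starting each new instance and checking its dynamic configuration file before moving on to the next one.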

Removing unnecessary instances from the ensemble

    Instances are removed with the same $ZK_HOME/bin/zkCli.sh. Just execute

reconfig -remove 1
reconfig -remove 2
reconfig -remove 3
reconfig -remove 4

    After the removal, stop and disable the ZooKeeper services on servers 1, 2, 3 and 4, then check $ZK_HOME/conf/zoo.conf.dynamic.XXXXXXX on the remaining servers. The files must be identical.
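The comparison can be automated once the file contents have been collected from the remaining servers. The sketch below works on plain text; the sample contents are made up for illustration, and the exact lines in your files may differ (for example in how the client port is written).

```python
import re

SERVER_LINE = re.compile(r"^server\.(\d+)=(.+)$")


def parse_servers(text: str) -> dict:
    """Map server id -> specification from dynamic-config text."""
    servers = {}
    for raw in text.splitlines():
        match = SERVER_LINE.match(raw.strip())
        if match:
            servers[int(match.group(1))] = match.group(2)
    return servers


def configs_match(texts, expected_ids) -> bool:
    """True if every text lists exactly `expected_ids` with identical specs."""
    parsed = [parse_servers(t) for t in texts]
    if not parsed:
        return False
    first = parsed[0]
    return set(first) == set(expected_ids) and all(p == first for p in parsed)


if __name__ == "__main__":
    # Made-up contents standing in for the files read from zk-5, zk-6, zk-7.
    sample = "\n".join(
        f"server.{i}=zk-{i}:2888:3888:participant;2181" for i in (5, 6, 7)
    )
    print(configs_match([sample, sample, sample], {5, 6, 7}))
```

A mismatch means that at least one server still lists a removed instance or has not yet received the latest configuration.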

Summary

    In the end we moved the ensemble from servers zk-1, zk-2, zk-3 to zk-5, zk-6, zk-7. Additionally, we recommend adding

metricsProvider.className=org.apache.zookeeper.metrics.prometheus.PrometheusMetricsProvider

to zoo.conf and opening port 7000 in the firewall. This enables ZooKeeper's built-in metrics exporter for Prometheus, which listens on port 7000 by default.
