AkbarAhmed.com


Introduction

When working with ZooKeeper, you will need to know where its binaries, configuration files, and libraries are located.

ZooKeeper 3.4.3 is part of Cloudera's Distribution Including Apache Hadoop, version 4 (CDH4).

Directories

/etc/zookeeper/conf

/etc/zookeeper/conf is the location for all of ZooKeeper’s configuration files.

The ZooKeeper packages use the Debian alternatives system, so the configuration directory is reached through a chain of symlinks.

/etc/zookeeper/conf is a symlink to /etc/alternatives/zookeeper-conf.
/etc/alternatives/zookeeper-conf is a symlink to /etc/zookeeper/conf.dist.
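
To see the symlink chain on a particular machine, it can help to follow the links one hop at a time rather than jumping straight to the final target. The following is a minimal sketch; the function name symlink_chain and the max_hops guard are illustrative choices, not part of any ZooKeeper or CDH tooling.

```python
import os


def symlink_chain(path, max_hops=16):
    """Return [path, target1, target2, ...], following symlinks one hop at a time.

    max_hops is an arbitrary guard against symlink loops (an assumption of
    this sketch, not a system limit).
    """
    chain = [path]
    current = path
    for _ in range(max_hops):
        if not os.path.islink(current):
            break
        target = os.readlink(current)
        # A relative target is resolved against the directory holding the link.
        if not os.path.isabs(target):
            target = os.path.join(os.path.dirname(current), target)
        current = os.path.normpath(target)
        chain.append(current)
    return chain
```

Calling symlink_chain("/etc/zookeeper/conf") on a host laid out as described above should list the three paths in order; on a host where the path is a plain directory, or does not exist, it returns only the starting path.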

Files

Configuration Files

The following configuration files are located in /etc/zookeeper/conf:

configuration.xsl

log4j.properties

zoo.cfg

zoo.cfg is the main ZooKeeper configuration file.

dataDir

dataDir specifies the directory where ZooKeeper stores its znode snapshot files and, unless a separate dataLogDir is configured, its transaction logs. You will need these files to recover data.

Back up the files located in dataDir regularly.
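
A backup job first has to know which directory zoo.cfg actually points at. The sketch below reads the key=value lines of a zoo.cfg and returns the directories worth backing up. The function names are hypothetical, the sample path /var/lib/zookeeper is only an example value, and treating dataLogDir as a second directory to back up is an assumption about how the installation is configured.

```python
def read_zoo_cfg(text):
    """Parse key=value lines from zoo.cfg text, skipping blanks and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that have no '=' at all
            settings[key.strip()] = value.strip()
    return settings


def backup_dirs(settings):
    """Return dataDir, plus dataLogDir when it is set to a different path."""
    dirs = [settings["dataDir"]]
    log_dir = settings.get("dataLogDir")
    if log_dir and log_dir not in dirs:
        dirs.append(log_dir)
    return dirs


# Example input; the values here are illustrative only.
SAMPLE = """\
# The number of milliseconds of each tick
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
"""
```

In practice the text would come from open("/etc/zookeeper/conf/zoo.cfg").read(), and the returned directories would be handed to whatever backup tool the site already uses.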

zoo_sample.cfg

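A sample configuration file. Its comments are worth reading, in particular the note on autopurge.snapRetainCount, which sets how many of the most recent snapshots ZooKeeper retains in dataDir when automatic purging is enabled through autopurge.purgeInterval (http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance).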
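
To make the idea of a retain count concrete, the sketch below picks the newest N snapshot files from a directory listing. It illustrates the concept only and is not ZooKeeper's purge code. It assumes snapshot files are named snapshot.<hex zxid>, and the function name snapshots_to_keep is hypothetical.

```python
def snapshots_to_keep(filenames, retain_count=3):
    """Return the retain_count newest snapshot names, newest first.

    Assumes names of the form snapshot.<hex zxid>; anything else is ignored.
    """
    snaps = []
    for name in filenames:
        prefix, _, suffix = name.partition(".")
        if prefix != "snapshot":
            continue
        try:
            snaps.append((int(suffix, 16), name))
        except ValueError:
            continue  # suffix is not a hexadecimal zxid
    snaps.sort(reverse=True)  # highest zxid, i.e. most recent, first
    return [name for _, name in snaps[:retain_count]]
```

Any snapshot not in the returned list would be a candidate for removal under a retain count of N; transaction logs need separate handling, which this sketch does not attempt.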

Init Files

zookeeper-server

Use the init script to start, stop, restart, check the status of, and initialize ZooKeeper.
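
When the init script is driven from automation, a thin wrapper that validates the action before calling it can prevent typos from reaching the service. This is a sketch under assumptions: the action names below (in particular init) and the use of sudo with the service command reflect a typical CDH4 package install and should be checked against the script on your system. The function names are hypothetical.

```python
import subprocess

# Actions assumed to be accepted by the zookeeper-server init script.
ACTIONS = ("start", "stop", "restart", "status", "init")


def service_command(action, use_sudo=True):
    """Build the argument list for invoking the zookeeper-server init script."""
    if action not in ACTIONS:
        raise ValueError("unsupported action: %r" % (action,))
    cmd = ["service", "zookeeper-server", action]
    return ["sudo"] + cmd if use_sudo else cmd


def run_service(action):
    """Run the init script and return its exit status."""
    return subprocess.call(service_command(action))
```

For example, run_service("status") would execute sudo service zookeeper-server status and return the script's exit code.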

Binaries and Scripts

/usr/lib/zookeeper/bin/zkCleanup.sh

Script that cleans up the old snapshot files and transaction logs that accumulate in dataDir. Adjust it for each installation and run it from cron for periodic cleanup.
