Introduction
When working with ZooKeeper, you will need to know where its binaries, configuration files, and libraries are located.
ZooKeeper 3.4.3 ships as part of Cloudera's Distribution Including Apache Hadoop (CDH4).
Directories
/etc/zookeeper/conf
/etc/zookeeper/conf is the location for all of ZooKeeper’s configuration files.
ZooKeeper uses Debian Alternatives, so the configuration directory is reached through a chain of symlinks.
/etc/zookeeper/conf is a symlink to /etc/alternatives/zookeeper-conf.
/etc/alternatives/zookeeper-conf is a symlink to /etc/zookeeper/conf.dist.
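The symlink chain described above can be checked on a given machine with a short script. The following is a minimal sketch, not part of the ZooKeeper or CDH tooling; the helper name symlink_chain is made up for illustration.

```python
import os

def symlink_chain(path):
    """Return every path visited while following symlinks, starting at path."""
    chain = [path]
    seen = {path}
    while os.path.islink(path):
        target = os.readlink(path)
        # A relative link target is resolved against the link's own directory.
        if not os.path.isabs(target):
            target = os.path.join(os.path.dirname(path), target)
        path = os.path.normpath(target)
        if path in seen:
            raise OSError("symlink loop at " + path)
        seen.add(path)
        chain.append(path)
    return chain

if __name__ == "__main__":
    # Prints one line per hop; a path that is not a symlink prints only itself.
    for hop in symlink_chain("/etc/zookeeper/conf"):
        print(hop)
```

On an installation laid out as described here, the script would be expected to print the three paths in order, ending at /etc/zookeeper/conf.dist.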
Files
Configuration Files
The following configuration files are located in /etc/zookeeper/conf:
configuration.xsl
log4j.properties
zoo.cfg
zoo.cfg is the main ZooKeeper configuration file.
dataDir
dataDir specifies the directory where ZooKeeper stores its znode snapshot files and, unless dataLogDir is set to a separate location, its transaction logs. Both are needed to recover data after a failure.
Back up the files in dataDir regularly.
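As an illustration of a regular backup, the sketch below reads dataDir from zoo.cfg and archives that directory into a timestamped tarball. It rests on several assumptions: the function names, the archive naming scheme, and the backup_root argument are hypothetical, and the script does not coordinate with a running server in any way.

```python
import os
import tarfile
import time

def read_zoo_cfg(cfg_path):
    """Parse the key=value lines of a zoo.cfg-style file into a dict."""
    settings = {}
    with open(cfg_path) as cfg:
        for line in cfg:
            line = line.strip()
            # Skip blank lines, comments, and anything that is not key=value.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    return settings

def backup_data_dir(cfg_path, backup_root):
    """Archive the configured dataDir into backup_root; return the archive path."""
    data_dir = read_zoo_cfg(cfg_path)["dataDir"]
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = os.path.join(backup_root, "zookeeper-data-%s.tar.gz" % stamp)
    with tarfile.open(archive, "w:gz") as tar:
        # Store entries under the directory's own name, not its absolute path.
        tar.add(data_dir, arcname=os.path.basename(os.path.normpath(data_dir)))
    return archive
```

A call such as backup_data_dir("/etc/zookeeper/conf/zoo.cfg", "/some/backup/dir") could then be scheduled from cron; the backup directory is a placeholder and must already exist.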
zoo_sample.cfg
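zoo_sample.cfg is a sample configuration file. Its comments include a useful note on the autopurge.snapRetainCount setting, which controls how many of the most recent snapshots ZooKeeper keeps when automatic purging is enabled (http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance).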
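To see whether automatic purging is actually switched on, the two autopurge settings can be read out of the configuration text. The sketch below assumes the defaults given in the ZooKeeper administrator's guide (snapRetainCount of 3, purgeInterval of 0, meaning disabled); the function name and the sample excerpt, including its dataDir value, are hypothetical.

```python
def autopurge_settings(cfg_text):
    """Return (snap_retain_count, purge_interval_hours) from zoo.cfg text."""
    values = {}
    for line in cfg_text.splitlines():
        line = line.strip()
        # Commented-out settings start with '#', so they are ignored here.
        if line.startswith("autopurge.") and "=" in line:
            key, value = line.split("=", 1)
            values[key.strip()] = int(value.strip())
    # Assumed defaults: keep 3 snapshots; an interval of 0 disables purging.
    return (values.get("autopurge.snapRetainCount", 3),
            values.get("autopurge.purgeInterval", 0))

# Hypothetical excerpt of a zoo.cfg, for illustration only.
sample = """\
dataDir=/var/lib/zookeeper
#autopurge.snapRetainCount=3
autopurge.purgeInterval=24
"""

count, interval = autopurge_settings(sample)
print("snapshots retained:", count)
print("purge interval (hours):", interval)
```

In this excerpt the retain count is commented out, so the assumed default applies, while the purge interval is set explicitly to 24 hours.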
Init Files
zookeeper-server
Use the zookeeper-server init script to initialize, start, stop, and restart ZooKeeper, and to check its status.
Binaries and Scripts
/usr/lib/zookeeper/bin/zkCleanup.sh
zkCleanup.sh removes old files that ZooKeeper has created in dataDir. Adjust the script for each installation and run it from cron so that cleanup happens periodically.
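Before scheduling any cleanup, it can help to preview which snapshots are old enough to be removed. The sketch below is a dry run only and is not how zkCleanup.sh itself works. It assumes snapshot files are named snapshot.<hexadecimal zxid> and sit in a version-2 subdirectory of dataDir; the function names and the default of three retained snapshots are illustrative choices. It does not consider transaction logs and deletes nothing.

```python
import os

def snapshots_by_age(version_dir):
    """Return snapshot file names in version_dir, newest (highest zxid) first."""
    snaps = []
    for name in os.listdir(version_dir):
        prefix, _, suffix = name.partition(".")
        if prefix != "snapshot":
            continue
        try:
            # Assumption: the suffix is the zxid written in hexadecimal.
            zxid = int(suffix, 16)
        except ValueError:
            continue
        snaps.append((zxid, name))
    return [name for zxid, name in sorted(snaps, reverse=True)]

def purge_candidates(version_dir, retain=3):
    """List snapshots beyond the `retain` newest ones; nothing is deleted."""
    return snapshots_by_age(version_dir)[retain:]
```

Calling purge_candidates on the version-2 directory of a test installation shows which snapshot files fall outside the retained set, which is a cheap sanity check before handing the real work to zkCleanup.sh.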