Debugging HBase: org.apache.hadoop.hbase.master.AssignmentManager: Unable to find a viable location to assign region -ROOT

Introduction

I ran into an annoying error in HBase due to the localhost loopback. The solution was simple, but took some trial and error.

Error

I was following the HBase logs with the following command:

tail -1000f /var/log/hbase/hbase-hbase-master-freshstart.log

The following error kept poping up in the log file.

org.apache.hadoop.hbase.master.AssignmentManager: Unable to find a viable location to assign region -ROOT

Solution

sudo vi /etc/hosts

I changed:


127.0.0.1       localhost
127.0.1.1       freshstart

to:


#127.0.0.1      localhost
#127.0.1.1      freshstart
192.168.2.15   freshstart
127.0.0.1      localhost

192.168.2.15 is my internal IP address, and freshstart is my hostname.

At this point I rebooted as a quick and dirty way to restart all Hadoop / HBase services. Alternatively, you can start/stop all services.

Additional Thoughts

Updating the hosts file is an option for me currently since I have everything installed on a single machine. However, it seems that this error is a name resolution related issue, so a properly configured DNS server is likely necessary when deploying Hadoop / HBase in a production cluster.

Advertisements

Install Pyramid on Ubuntu 12.04 LTS in the Rackspace Cloud

Check the Installed Python Version

python --version

You should see the following output:

Python 2.7.3

Install Prerequisites

apt-get install python-setuptools python-pip python-virtualenv virtualenvwrapper

Install Prerequisites for Pyramid Speedups

apt-get install gcc cpp libc6-dev python2.7-dev

Install nginx

apt-get install nginx nginx-full nginx-common

Create a wwwuser that waitress (the web server) will run as

useradd wwwuser -d /home/wwwuser -k /etc/skel -m -s /bin/bash -U

Setup the Virtual Environment

mkdir -p /var/www/delixus.com
mkdir /var/www/environments
cd /var
chown -R wwwuser:wwwuser www

We are now going to change users to wwwuser user.

su - wwwuser
cd /var/www/environments
virtualenv env_delixus

Install Pyramid

You must perform the following steps as the wwwuser user.

cd /var/www/environments/env_delixus
source bin/activate

You should see the environment name as the prefix in the command prompt, such as:

(env_delixus)wwwuser@ws2:

easy_install Pyramid
pip install waitress

Checkout the Pyramid Project

cd /var/www/delixus.com

Change the SVN checkout command to something that matches your server. If you use git, then change appropriately.

svn checkout https://repo.company.com/source/delixus/tags/1.0 .

Install the delixus.com Pyramid project

cd /var/www/delixus.com/delixus
vi production.ini

Under [app:main], add a [server:main] configuration as follows:


# http://docs.pylonsproject.org/projects/waitress/en/latest/arguments.html
[server:main]
use = egg:waitress#main
host = 127.0.0.1
port = %(http_port)s
# default # of threads = 4
threads = 8
url_scheme = http

I don’t think you need to install the development version of the site, but it seems to be the only way that I get everything to work while debugging…go figure.

python setup.py develop
pserve development.ini

Then open the site in a text-based web browser.

links http://localhost:6543

You should be able to view your site at this point.

Now, let’s install the production version of the site.

python setup.py install

Start Waitress

First we’re going to start and test waitress, then we’ll start it as a deamon.

pserve production.ini http_port=5000
links http://localhost:5000

Again, you should be able to view your site.

pserve production.ini start --daemon --pid-file=/var/www/5000.pid \
--log-file=/var/www/5000.log --monitor-restart http_port=5000
pserve production.ini start --daemon --pid-file=/var/www/5001.pid \
--log-file=/var/www/5001.log --monitor-restart http_port=5001

Check the waitress process.

ps -ef | grep pserve

You should see the pserve process running.

Configure nginx as a Proxy for Waitress

The following steps must be performed as root.

cd /etc/nginx/sites-available
vi delixus

Paste the following into the delixus.conf file.


upstream delixus-site {
    server 127.0.0.1:5000;
    server 127.0.0.1:5001;
}

server {
    listen 80;
    server_name  localhost www.delixus.com delixus.com;

    access_log  /var/log/nginx/delixus.com-access.log;

    location / {
        proxy_set_header        Host $host;
        proxy_set_header        X-Real-IP $remote_addr;
        proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header        X-Forwarded-Proto $scheme;

        client_max_body_size    10m;
        client_body_buffer_size 128k;
        proxy_connect_timeout   60s;
        proxy_send_timeout      90s;
        proxy_read_timeout      90s;
        proxy_buffering         off;
        proxy_temp_file_write_size 64k;
        proxy_pass http://delixus-site;
        proxy_redirect          off;
    }

    location /static {
        root            /var/www/delixus.com/delixus/delixus;
        expires         30d;
        add_header      Cache-Control public;
        access_log      off;
    }
}

rm /etc/nginx/sites-enabled/default
ln -s /etc/nginx/sites-available/delixus /etc/nginx/sites-enabled/delixus
service nginx stop
service nginx start

A good next step at this point is to setup Supervisor to control pserve/waitress.

Install RazorSQL on Ubuntu 12.04 LTS x64

Introduction

RazorSQL is a GUI tool for working with Postgresql.

Install RazorSQL

First, create the razorsql directory.

mkdir ~/Downloads/razorsql

Download RazorSQL into the ~/Downloads/navicat directory.

  1. Open a web browser to http://www.razorsql.com/download_linux.html.
  2. Click Download next to Linux (64-bit).

After the download completes, open a command prompt and enter the following commands. I have assumed that you downloaded the .zip file to the Downloads/razorsql directory.

cd ~/Downloads/razorsql
unzip razorsql5_6_4_linux_x64.zip
mv razorsql ~/bin/
cd ~/bin/razorsql
chmod 755 razorsql.sh
./razorsql.sh

Connect to a Database

The following steps are performed in the RazorSQL GUI.

  • From the menu, select Connections, click Add Connection Profil.
  • In the Connection Wizard, select Postgresql, click Continue.
    Add Connection Profile 01
  • In the 2nd screen, enter the information shown below.
    Add Connection Profile 02
  • Click Connect.

Enable TCP/IP Connections to Postgresql 9.1

Introduction

The default installation of Postgresql does not allow TCP/IP connections, which also means JDBC connections wont’ work.

Enable TCP/IP Connections

cd /etc/postgresql/9.1/main

Let’s make a backup of the original config file.

sudo cp postgresql.conf postgresql.conf.org
sudo vi postgresql.conf

Find the line that contains

#listen_addresses = 'localhost'

and change it to:

listen_addresses = 'localhost'

If you want to allow remote connections, then you need to set listen_address to something like (assuming your machine’s IP Address is 192.168.1.2):
listen_addresses = ‘127.0.0.1 192.168.1.2’

You will also need to update the firewall to allow connections to this machine.

Restart the Postgresql Server

sudo service postgresql restart

Test the TCP/IP Connection

psql -h 127.0.0.1 -p 5433 -U postgres -W

Initial Configuration of Postgresql on Ubuntu 12.04 LTS

Introduction

After installing Postgresql on Ubuntu, you will need to complete a few basic setup steps.

Set the Password for the postgres User

The first step is to update the password for the postgres user.

sudo -u postgres psql postgres

Then enter the following command at the postgres=# prompt.

\password postgres

Create a Database

If you exited the postgres=# prompt, then enter the following command to connect to Postgresql. However, if your prompt still looks like postgres=#, then do not enter the psql command below.

sudo -u postgres psql postgres

Enter the following command at the postgres=# prompt.

First we’ll list all of the databases, which should be the following: postgres, template0, and template1.

\l
CREATE DATABASE practicedb;
\l

Now you should see the practicedb database.

Press Ctrl + D to exit the Postgres prompt.

Install Postgresql on Ubuntu 12.04 LTS

Introduction

Postgresql is an open source object relational database. It is often thought of as an alternative to MySQL.

In this post I’ll provide the steps required to install Postgresql on a developer laptop, assuming that work will be done in both SQL and Python.

You can learn more at:

Installation

Open a command prompt and enter the following commands.

sudo apt-get install postgresql postgresql-common postgresql-contrib \
postgresql-client postgresql-client-common \
pgsnap pgadmin3 pgpool2 ptop pgtune pgloader pgagent \
python-pygresql postgresql-plpython-9.1 python-psycopg2

You’ll see a list of additional packages that will be installed by default.

The following extra packages will be installed:
libossp-uuid16 libpgpool0 libpq5 libwxbase2.8-0 libwxgtk2.8-0 pgadmin3-data php5-cli php5-common php5-pgsql postgresql-9.1 postgresql-client-9.1 postgresql-contrib-9.1 python-support

Installed Packages

Each of the installed packages are described below.

postgresql
This is the core Postgresql object relational database server.

postgresql-common
Add the ability to setup a cluster of Postgresql database servers.

postgresql-client
Command line client for interacting with a Postgresql database, including the psql command.

postgresql-client-common
Allows multiple clients to be installed as part of a Postgresql cluster.

pgsnap
Generates a Postgresql performance report in HTML format.

pgadmin3
GUI tool to work with Postgresql. Personally, I prefer some of the commercial tools, such as Navicat on Linux or EMS SQL Manager on Windows.

pgpool2
Middleware between a Postgresql client and a Postgresql server that provides connection pooling, replication and load balancing, among other things.

ptop
CLI based performance monitoring tool that’s analogous to the Linux psql command

pgtune
Automatically tunes the Postgresql configuration file postgresql.conf based on the system’s hardware.

pgloader
Utility for loading flat files and CSV files into a Postgresql table.

pgagent
Job scheduler for Postgresql.

python-pygresql
Python module that allows you to query a Postgresql database from a python script. Basically, this is for python developers who need to query Postgresql.

postgresql-plpython-9.1
Allows SQL developers to extend their SQL script by writing procedural functions in python.

python-psycopg2
Similar to python-pygresql, except that python-psycopg2 is designed for heavily threaded python scripts that create and destroy a large number of cursors, and execute a high volume of INSERTs and UPDATEs.

How to write professional emails (Top 10 list)

I was going through my old notes as I’m cleaning our my closet. I came across the following notes, that I like a lot, on how to write a professional email.

  1. Use email macros to both improve your efficiency in replying to email, and to improve quality.
  2. End each sentence with the correct punctuation.
  3. Write short sentences.
  4. Each sentence should discuss only one topic.
  5. Use bulleted lists to simplify that information you are trying to convey.
    1. Bulleted lists are easy to read.
    2. Each point is separated from other points.
    3. Lists help to organize thoughts.
  6. Use the following correctly:
    1. Capitalization
    2. Spacing
    3. Formatting
  7. Do not use SMS fake words – ur, dk, lol, how ru, etc.
  8. Make the Subject descriptive.
  9. Limit the email to the subject matter described in the Subject.
  10. Attach files before writing the email. No body likes a forgotten attachments.

There are more, but I would put these as my top 10 list.

How to add numbers with Pig

Introduction

We’re going to start with a very simple Pig script that reads a file that contains 2 numbers per line separated by a comma. The Pig script will first read the line, store each of the 2 numbers in separate variables, and will then add the numbers together.

Create the Sample Input File

cd
vi pig-practice01.txt

Paste the following into pig-practice01.txt.

5	1
6	4
3	2
1	1
9	2
3	8

Create the Input and Output Directories in HDFS

We’re going to create 2 directories to store the input to and output from our first pig script.

hadoop fs -mkdir pig01-input
hadoop fs -mkdir pig01-output

Put Data File into HDFS

hadoop fs -put pig-practice01.txt pig01-input

Now, let’s check that our file was put from our local file system to HDFS correctly.

hadoop fs -ls pig01-input
hadoop fs -cat pig01-input/pig-practice01.txt

Write the Pig Latin Script

vi practice01.pig

Paste the following code into practice01.pig.

/*
Add 2 numbers together
*/

-- Load the practice file from HDFS
A = LOAD 'pig01-input/pig-practice01.txt' USING PigStorage() AS (x:int, y:int);

-- Add x and y 
B = FOREACH A GENERATE x + y;

-- Show the output
STORE B INTO 'pig01-output/results' USING PigStorage();

Run the Pig Script

pig practice01.pig

View the Results

hadoop fs -ls pig01-output/results

The results are stored in the part* file.

hadoop fs -cat pig01-output/results/part-m-0000

Additional Reading

Install Pig 0.9.2 for CDH4 on Ubuntu 12.04 LTS x64

Introduction

Installing Pig is drop dead simple.

Installation

sudo apt-get install pig

Check the Pig version.

pig --version

Setup the Environment

We’re going to set the environment variables system-wide for Pig programming.

sudo vi /etc/environment

Paste the following environment variables into the environment file.

HADOOP_MAPRED_HOME="/usr/lib/hadoop-mapreduce"
PIG_CONF_DIR="/etc/pig/conf"
source  /etc/environment

That’s it. You can now start to write and run pig jobs.

Install Pentaho Analysis Services (Mondrian) 3.4.1 with MySQL on Ubuntu 12.04 LTS

Introduction

Pentaho Analysis Services is called Mondrian, which is the name I’ll use for the remainder of this post. Mondrian is Pentaho’s OLAP server.

In this post I’ll provide step-by-step instructions on how to install Mondrian 3.4.1 Ubuntu Linux 12.04 LTS x64. We’ll use MySQL as the database.

Download

Let’s first create a directory to download Mondrian into.

mkdir -p ~/Downloads/pentaho/analysis
cd ~/Downloads/pentaho/analysis

To download the Mondrian either run the following command, or follow the 3 bulleted steps below.

wget http://hivelocity.dl.sourceforge.net/project/mondrian/mondrian/mondrian-3.4.1/mondrian-3.4.1.zip

Or follow the steps below if you don’t want to use the wget command shown above.

Unzip

cd ~/Downloads/pentaho/analysis
unzip mondrian-3.4.1.zip

Install the Mondrian Server

cd ~/Downloads/pentaho/analysis/mondrian-3.4.1/lib
sudo unzip mondrian.war -d /opt/pentaho/biserver-ce/tomcat/webapps/mondrian
sudo chown -R pentaho:pentaho /opt/pentaho/biserver-ce/tomcat/webapps/mondrian

Create the FoodMart Database

FoodMart is a sample database that can be used to learn Mondrian.

mysql -u root -p

Enter the following commands at the MySQL command prompt.

mysql> CREATE DATABASE foodmart;

Please use a real password by changing pass.word below.

mysql> GRANT ALL ON foodmart.* TO foodmart@localhost IDENTIFIED BY 'pass.word';
mysql> quit;

Import the Foodmart Schema and Data

cd /opt/pentaho/biserver-ce/tomcat/webapps/mondrian

Important
Change akbar to your user name in the inputFile option in the command below.

java -cp "\
/opt/pentaho/biserver-ce/tomcat/webapps/mondrian/WEB-INF/lib/mondrian.jar:\
/opt/pentaho/biserver-ce/tomcat/webapps/mondrian/WEB-INF/lib/log4j-1.2.8.jar:\
/opt/pentaho/biserver-ce/tomcat/webapps/mondrian/WEB-INF/lib/commons-collections-3.1.jar:\
/opt/pentaho/biserver-ce/tomcat/webapps/mondrian/WEB-INF/lib/olap4j.jar:\
/opt/pentaho/biserver-ce/tomcat/webapps/mondrian/WEB-INF/lib/commons-logging-1.0.4.jar:\
/opt/pentaho/biserver-ce/tomcat/webapps/mondrian/WEB-INF/lib/eigenbase-xom.jar:\
/opt/pentaho/biserver-ce/tomcat/webapps/mondrian/WEB-INF/lib/eigenbase-resgen.jar:\
/opt/pentaho/biserver-ce/tomcat/webapps/mondrian/WEB-INF/lib/eigenbase-properties.jar:\
/opt/pentaho/biserver-ce/tomcat/lib/mysql-connector-java-5.1.17.jar" \
mondrian.test.loader.MondrianFoodMartLoader -verbose -tables -data -indexes \
-jdbcDrivers=com.mysql.jdbc.Driver \
-inputFile=/home/akbar/Downloads/pentaho/analysis/mondrian-3.4.1/demo/FoodMartCreateData.sql \
-outputJdbcURL="jdbc:mysql://localhost/foodmart?user=foodmart&password=pass.word"

If you get an error, there is a 99% probability that the jar files cannot be found on the paths specified in the command above. So, the first thing you should do is check if the jar files can be found in the class path paths that are listed above, then update the paths as necessary.

Update the Sample MDX Query Files

cd /opt/pentaho/biserver-ce/tomcat/webapps/mondrian/WEB-INF/queries

Edit each of the following files and make the change shown in the From / To instructions below.

Change each of the JSP files from:


<jp:mondrianQuery id="query01" jdbcDriver="org.apache.derby.jdbc.EmbeddedDriver" jdbcUrl="jdbc:derby:classpath:/foodmart" catalogUri="/WEB-INF/queries/FoodMart.xml"
   jdbcUser="sa" jdbcPassword="sa" connectionPooling="false">

To…


<jp:mondrianQuery id="query01" jdbcDriver="com.mysql.jdbc.Driver" jdbcUrl="jdbc:mysql://localhost/foodmart?user=foodmart&password=pass.word" catalogUri="/WEB-INF/queries/FoodMart.xml">

sudo vi fourhier.jsp
sudo vi mondrian.jsp
sudo vi arrows.jsp
sudo vi colors.jsp
cd /opt/pentaho/biserver-ce/tomcat/bin
sudo ./shutdown.sh
sudo ./startup.sh

I have changed the default Pentaho BI Server port from 8080 to 8585.

Open a web browser to http://localhost:8585/mondrian.

Click each of the links below to view the same cube.

The cube will be populated for the first time when you click the first link below. In other words, the first page will take a long time to load while the cube is populating. On my system, I actually had to reboot and restart Tomcat before the pages displayed.

Mondrian Configuration Files

Nothing needs to be done to setup the FoodMart database, however you will need to edit the following configuration files when you create your own database.

mondrian.properties

cd /opt/pentaho/biserver-ce/tomcat/webapps/mondrian/WEB-INF
sudo cp mondrian.properties mondrian.properties.org
sudo vi mondrian.properties

datasources.xml

cd /opt/pentaho/biserver-ce/tomcat/webapps/mondrian/WEB-INF
sudo cp datasources.xml datasources.xml.org
sudo vi datasources.xml

Debugging

I was initially using the instructions at http://mondrian.pentaho.com/documentation/installation.php, however, I got the following error when I ran the jar command.

Exception in thread "main" java.lang.NoClassDefFoundError: org/olap4j/mdx/IdentifierSegment
	at mondrian.test.loader.MondrianFoodMartLoader.(MondrianFoodMartLoader.java:98)
Caused by: java.lang.ClassNotFoundException: org.olap4j.mdx.IdentifierSegment
	at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
	... 1 more
Could not find the main class: mondrian.test.loader.MondrianFoodMartLoader.  Program will exit.

The solution was to add the following missing .jar files to the class path:

  • commons-collections-3.1.jar
  • olap4j.jar