Install Pyramid on Ubuntu 12.04 LTS in the Rackspace Cloud

Check the Installed Python Version

python --version

You should see the following output:

Python 2.7.3
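
If this check needs to run inside a provisioning script rather than by eye, a small comparison helper can do it. The sketch below is only an illustration: the function name version_at_least is made up for this example, and it relies on the -V (version sort) option of GNU sort, which ships with Ubuntu.

```shell
# Hypothetical helper (not part of any tool used in this post):
# succeeds when dotted version $1 is greater than or equal to $2.
version_at_least() {
    # sort -V orders version strings numerically; if the smaller of
    # the two is the required version, the actual one is new enough.
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Example use. Python 2 prints its version to stderr, hence 2>&1:
#   version_at_least "$(python --version 2>&1 | awk '{print $2}')" 2.7 \
#       || echo "Python 2.7 or newer is required"
```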

Install Prerequisites

apt-get install python-setuptools python-pip python-virtualenv virtualenvwrapper

Install Prerequisites for Pyramid Speedups

apt-get install gcc cpp libc6-dev python2.7-dev

Install nginx

apt-get install nginx nginx-full nginx-common

Create a wwwuser that waitress (the web server) will run as

useradd wwwuser -d /home/wwwuser -k /etc/skel -m -s /bin/bash -U

Setup the Virtual Environment

mkdir -p /var/www/delixus.com
mkdir /var/www/environments
cd /var
chown -R wwwuser:wwwuser www

Now switch to the wwwuser user.

su - wwwuser
cd /var/www/environments
virtualenv env_delixus

Install Pyramid

You must perform the following steps as the wwwuser user.

cd /var/www/environments/env_delixus
source bin/activate

You should see the environment name as the prefix in the command prompt, such as:

(env_delixus)wwwuser@ws2:
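
The prompt prefix is a visual cue only. In a script, one way to make the same check is to look at the VIRTUAL_ENV variable, which the activate script exports. The function name below is hypothetical.

```shell
# Hypothetical guard: succeeds only when a virtualenv is active.
# bin/activate exports VIRTUAL_ENV with the environment's path.
in_virtualenv() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

if in_virtualenv; then
    echo "active virtualenv: $VIRTUAL_ENV"
else
    echo "no virtualenv active; run: source bin/activate"
fi
```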

easy_install Pyramid
pip install waitress

Check Out the Pyramid Project

cd /var/www/delixus.com

Change the SVN checkout command to something that matches your server. If you use git, then change appropriately.

svn checkout https://repo.company.com/source/delixus/tags/1.0 .

Install the delixus.com Pyramid project

cd /var/www/delixus.com/delixus
vi production.ini

Under [app:main], add a [server:main] configuration as follows:


# http://docs.pylonsproject.org/projects/waitress/en/latest/arguments.html
[server:main]
use = egg:waitress#main
host = 127.0.0.1
port = %(http_port)s
# default # of threads = 4
threads = 8
url_scheme = http

I don’t think you need to install the development version of the site, but it seems to be the only way that I can get everything to work while debugging. Go figure.

python setup.py develop
pserve development.ini

Then open the site in a text-based web browser.

links http://localhost:6543

You should be able to view your site at this point.

Now, let’s install the production version of the site.

python setup.py install

Start Waitress

First we’re going to start and test waitress, then we’ll start it as a daemon.

pserve production.ini http_port=5000
links http://localhost:5000

Again, you should be able to view your site.

pserve production.ini start --daemon --pid-file=/var/www/5000.pid \
--log-file=/var/www/5000.log --monitor-restart http_port=5000
pserve production.ini start --daemon --pid-file=/var/www/5001.pid \
--log-file=/var/www/5001.log --monitor-restart http_port=5001
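
The two commands differ only in the port number, so with more worker processes a loop may be easier to maintain. The sketch below only prints the commands so they can be reviewed first; the function name is made up for this example.

```shell
# Hypothetical helper: print one pserve daemon command per port given.
# Nothing is executed here; the output mirrors the commands above.
pserve_commands() {
    for port in "$@"; do
        printf 'pserve production.ini start --daemon --pid-file=/var/www/%s.pid --log-file=/var/www/%s.log --monitor-restart http_port=%s\n' \
            "$port" "$port" "$port"
    done
}

pserve_commands 5000 5001
```

Once the output looks right, piping it to sh from the project directory (with the virtual environment active) runs the commands.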

Check the waitress process.

ps -ef | grep pserve

You should see the pserve process running.

Configure nginx as a Proxy for Waitress

The following steps must be performed as root.

cd /etc/nginx/sites-available
vi delixus

Paste the following into the delixus file.


upstream delixus-site {
    server 127.0.0.1:5000;
    server 127.0.0.1:5001;
}

server {
    listen 80;
    server_name  localhost www.delixus.com delixus.com;

    access_log  /var/log/nginx/delixus.com-access.log;

    location / {
        proxy_set_header        Host $host;
        proxy_set_header        X-Real-IP $remote_addr;
        proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header        X-Forwarded-Proto $scheme;

        client_max_body_size    10m;
        client_body_buffer_size 128k;
        proxy_connect_timeout   60s;
        proxy_send_timeout      90s;
        proxy_read_timeout      90s;
        proxy_buffering         off;
        proxy_temp_file_write_size 64k;
        proxy_pass http://delixus-site;
        proxy_redirect          off;
    }

    location /static {
        root            /var/www/delixus.com/delixus/delixus;
        expires         30d;
        add_header      Cache-Control public;
        access_log      off;
    }
}

rm /etc/nginx/sites-enabled/default
ln -s /etc/nginx/sites-available/delixus /etc/nginx/sites-enabled/delixus
service nginx stop
service nginx start
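
The upstream block in the configuration needs one server line for every port that pserve is listening on. As a sketch, assuming the same 127.0.0.1 backends as above, a helper could generate that block from the list of ports so the two stay in sync. The name upstream_block is hypothetical.

```shell
# Hypothetical helper: print an nginx upstream block named $1 with one
# local backend per remaining argument (each argument is a port).
upstream_block() {
    name=$1
    shift
    printf 'upstream %s {\n' "$name"
    for port in "$@"; do
        printf '    server 127.0.0.1:%s;\n' "$port"
    done
    printf '}\n'
}

upstream_block delixus-site 5000 5001
```

Running nginx -t after editing the file checks the configuration syntax before the service is restarted.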

A good next step at this point is to set up Supervisor to control pserve/waitress.


Install Postgresql on Ubuntu 12.04 LTS

Introduction

Postgresql is an open source object relational database. It is often thought of as an alternative to MySQL.

In this post I’ll provide the steps required to install Postgresql on a developer laptop, assuming that work will be done in both SQL and Python.


Installation

Open a command prompt and enter the following commands.

sudo apt-get install postgresql postgresql-common postgresql-contrib \
postgresql-client postgresql-client-common \
pgsnap pgadmin3 pgpool2 ptop pgtune pgloader pgagent \
python-pygresql postgresql-plpython-9.1 python-psycopg2

You’ll see a list of additional packages that will be installed by default.

The following extra packages will be installed:
libossp-uuid16 libpgpool0 libpq5 libwxbase2.8-0 libwxgtk2.8-0 pgadmin3-data php5-cli php5-common php5-pgsql postgresql-9.1 postgresql-client-9.1 postgresql-contrib-9.1 python-support

Installed Packages

Each of the installed packages is described below.

postgresql
This is the core Postgresql object relational database server.

postgresql-common
Adds the ability to set up a cluster of Postgresql database servers.

postgresql-client
Command line client for interacting with a Postgresql database, including the psql command.

postgresql-client-common
Allows multiple clients to be installed as part of a Postgresql cluster.

pgsnap
Generates a Postgresql performance report in HTML format.

pgadmin3
GUI tool to work with Postgresql. Personally, I prefer some of the commercial tools, such as Navicat on Linux or EMS SQL Manager on Windows.

pgpool2
Middleware between a Postgresql client and a Postgresql server that provides connection pooling, replication and load balancing, among other things.

ptop
CLI based performance monitoring tool that’s analogous to the Linux top command.

pgtune
Automatically tunes the Postgresql configuration file postgresql.conf based on the system’s hardware.

pgloader
Utility for loading flat files and CSV files into a Postgresql table.

pgagent
Job scheduler for Postgresql.

python-pygresql
Python module that allows you to query a Postgresql database from a python script. Basically, this is for python developers who need to query Postgresql.

postgresql-plpython-9.1
Allows SQL developers to extend their SQL script by writing procedural functions in python.

python-psycopg2
Similar to python-pygresql, except that python-psycopg2 is designed for heavily threaded python scripts that create and destroy a large number of cursors, and execute a high volume of INSERTs and UPDATEs.

What is sed?

Introduction

sed is short for Stream EDitor, a utility that allows you to parse and transform text one line at a time. sed is a useful tool, along with grep and awk, when manipulating text files. It is also often overlooked when working with Hadoop, although sed, awk and grep can help speed up processing times by preprocessing text before sending it to a MapReduce job.
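
As a small illustration of that line-at-a-time model, the sketch below cleans a text stream before it would be handed to a later stage. The function name and the sample input are made up for this example.

```shell
# Hypothetical preprocessing step: strip trailing whitespace from each
# line, then drop lines that are empty as a result.
clean_lines() {
    sed -e 's/[[:space:]]*$//' -e '/^$/d'
}

# Clean the sample input, then replace every comma with a pipe character.
printf 'alpha,beta  \n\n   \ngamma,delta\n' | clean_lines | sed 's/,/|/g'
# prints:
#   alpha|beta
#   gamma|delta
```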

Install JDK 7 u5 on Ubuntu 12.04 LTS (as a secondary JDK)

Introduction

I had installed JDK 6.0 update 31 in an earlier post. However, I now need to write a Java application that requires the features available in JDK 7.

In this post, I will install JDK 7 update 5 as a secondary JDK, while JDK 6.0 u31 will be the primary JDK. It’s perfectly normal to have multiple JDKs on a single machine to support the requirements of different applications. Fortunately, it’s easy to use a different JDK on a per application basis.

Download

I have a 64 bit version of Ubuntu 12.04 LTS installed, so the instructions below only apply to this OS.

  1. Download the Java JDK from http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1637583.html.
  2. Click Accept License Agreement
  3. Click jdk-7u5-linux-x64.tar.gz
  4. Login to Oracle.com with your Oracle account
  5. Download the JDK to your ~/Downloads directory
  6. After downloading, open a terminal, then enter the following commands.

Installation

Open a terminal, then enter the following commands:

cd ~/Downloads
tar -xzf jdk-7u5-linux-x64.tar.gz

Note:
The jvm directory is used to organize all JDK/JVM versions in a single parent directory. As this is our 2nd JDK, we’ll assume that the jvm directory already exists.

sudo mv jdk1.7.0_05 /usr/lib/jvm

The next 3 commands are split across 2 lines per command due to width limits in the blog’s theme.

sudo update-alternatives --install "/usr/bin/java" "java"  \
	"/usr/lib/jvm/jdk1.7.0_05/bin/java" 2
sudo update-alternatives --install "/usr/bin/javac" "javac"  \
	"/usr/lib/jvm/jdk1.7.0_05/bin/javac" 2
sudo update-alternatives --install "/usr/bin/javaws" "javaws"  \
	"/usr/lib/jvm/jdk1.7.0_05/bin/javaws" 2
sudo update-alternatives --config java

You will see output similar to the following (although it’ll differ on your system). Read through the list and find the number for the Oracle JDK installation (/usr/lib/jvm/jdk1.7.0_05/bin/java)

There are 2 choices for the alternative java (providing /usr/bin/java).

  Selection    Path                               Priority   Status
------------------------------------------------------------
* 0            /usr/lib/jvm/jdk1.7.0_05/bin/java   2         auto mode
  1            /usr/lib/jvm/jdk1.6.0_31/bin/java   1         manual mode
  2            /usr/lib/jvm/jdk1.7.0_05/bin/java   2         manual mode

Press enter to keep the current choice[*], or type selection number:

On my system I entered 1 to keep JDK 1.6.0 u31 as my primary JDK (use the number that is appropriate for your system). To enter 1, press 1 on the keyboard, then press Enter.

sudo update-alternatives --config javac

There are 2 choices for the alternative javac (providing /usr/bin/javac).

  Selection    Path                                Priority   Status
------------------------------------------------------------
* 0            /usr/lib/jvm/jdk1.7.0_05/bin/javac   2         auto mode
  1            /usr/lib/jvm/jdk1.6.0_31/bin/javac   1         manual mode
  2            /usr/lib/jvm/jdk1.7.0_05/bin/javac   2         manual mode

Press enter to keep the current choice[*], or type selection number:

I entered 1 then pressed Enter to keep JDK 1.6.0 u31 as my primary javac command.

sudo update-alternatives --config javaws

There are 2 choices for the alternative javaws (providing /usr/bin/javaws).

  Selection    Path                                 Priority   Status
------------------------------------------------------------
* 0            /usr/lib/jvm/jdk1.7.0_05/bin/javaws   2         auto mode
  1            /usr/lib/jvm/jdk1.6.0_31/bin/javaws   1         manual mode
  2            /usr/lib/jvm/jdk1.7.0_05/bin/javaws   2         manual mode

Press enter to keep the current choice[*], or type selection number:

I entered 1 then pressed Enter to keep JDK 1.6.0 u31 as my primary javaws command.
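
The --install commands and the --config prompts above follow one pattern per tool. As a sketch, the helpers below (their names are made up for this example) print the equivalent commands for a given JDK directory, using update-alternatives --set in place of the interactive --config menu. They print rather than run the commands so the output can be reviewed first.

```shell
# Hypothetical helper: print the update-alternatives commands that
# register the JDK in $1 at priority $2 for java, javac and javaws.
print_install_commands() {
    for tool in java javac javaws; do
        printf 'sudo update-alternatives --install "/usr/bin/%s" "%s" "%s/bin/%s" %s\n' \
            "$tool" "$tool" "$1" "$tool" "$2"
    done
}

# Hypothetical helper: print the non-interactive equivalent of the
# --config menu, selecting the JDK in $1 for each tool.
print_set_commands() {
    for tool in java javac javaws; do
        printf 'sudo update-alternatives --set "%s" "%s/bin/%s"\n' \
            "$tool" "$1" "$tool"
    done
}

print_install_commands /usr/lib/jvm/jdk1.7.0_05 2
print_set_commands /usr/lib/jvm/jdk1.6.0_31
```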

As a final step, let’s test each of the commands to ensure everything is set up correctly.

java -version

The output should be:
java version "1.6.0_31"
Java(TM) SE Runtime Environment (build 1.6.0_31-b04)
Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode)

javac -version

The output should be:
javac 1.6.0_31

javaws -version

The output should be:
Java(TM) Web Start 1.6.0_31
which is followed by a long usage message.

That’s it, the JDK 7 u5 is installed.
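
To use JDK 7 for a single application while JDK 6 stays the system default, one approach is to set JAVA_HOME and PATH for just that command. This is a minimal sketch, assuming the install path used above; the function name with_jdk and the jar name in the comment are made up for this example.

```shell
# Hypothetical wrapper: run a command with the JDK in $1 first on PATH.
with_jdk() {
    jdk_home=$1
    shift
    # env applies the variables to this one command only and looks the
    # command up on the modified PATH.
    env JAVA_HOME="$jdk_home" PATH="$jdk_home/bin:$PATH" "$@"
}

# Example use (myapp.jar is a placeholder):
#   with_jdk /usr/lib/jvm/jdk1.7.0_05 java -jar myapp.jar
```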

Install Java JDK 6.0 update 31 on Ubuntu 12.04 LTS

Introduction

The first question is why are we installing an old JDK. The answer is that Oracle JDK 6.0 update 31 is the JDK recommended by Cloudera when installing CDH4 (Cloudera Distribution Hadoop v4).

This is an update to an older version of this post. Mainly I have changed the JDK from 1.6.0_26 to 1.6.0_31 as this is the recommended JDK for CDH4.

Install Java

I have a 64 bit version of Ubuntu 12.04 LTS installed, so the instructions below only apply to this OS.

  1. Download the Java JDK from http://www.oracle.com/technetwork/java/javasebusiness/downloads/java-archive-downloads-javase6-419409.html#jdk-6u31-oth-JPR.
  2. Click Accept License Agreement
  3. Click jdk-6u31-linux-x64.bin
  4. Login to Oracle.com with your Oracle account
  5. Download the JDK to your ~/Downloads directory
  6. After downloading, open a terminal, then enter the following commands.
cd ~/Downloads
chmod +x jdk-6u31-linux-x64.bin
./jdk-6u31-linux-x64.bin

Note:
The jvm directory is used to organize all JDK/JVM versions in a single parent directory.

sudo mkdir /usr/lib/jvm
sudo mv jdk1.6.0_31 /usr/lib/jvm

The next 3 commands are split across 2 lines per command due to width limits in the blog’s theme.

sudo update-alternatives --install "/usr/bin/java" "java" \
"/usr/lib/jvm/jdk1.6.0_31/bin/java" 1
sudo update-alternatives --install "/usr/bin/javac" "javac" \
"/usr/lib/jvm/jdk1.6.0_31/bin/javac" 1
sudo update-alternatives --install "/usr/bin/javaws" "javaws" \
"/usr/lib/jvm/jdk1.6.0_31/bin/javaws" 1
sudo update-alternatives --config java

You will see output similar to the following (although it’ll differ on your system). Read through the list and find the number for the Oracle JDK installation (/usr/lib/jvm/jdk1.6.0_31/bin/java).

There are 2 choices for the alternative java (providing /usr/bin/java).

  Selection    Path                                            Priority   Status
------------------------------------------------------------
* 0            /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java   1051      auto mode
  1            /usr/lib/jvm/jdk1.6.0_31/bin/java                1         manual mode
  2            /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java   1051      manual mode

On my system I did the following (change the number that is appropriate for your system):
Press 1 on your keyboard, then press Enter.

sudo update-alternatives --config javac

Follow steps similar to those listed above if you are presented with a list of options. In my case, I had not previously installed the OpenJDK javac binary, so my output looked like the following:

There is only one alternative in link group javac: /usr/lib/jvm/jdk1.6.0_31/bin/javac
Nothing to configure.

sudo update-alternatives --config javaws

As with javac, I did not have the OpenJDK version of javaws installed, so my output was simple. However, if you get a list of options, just type in the number of the path to the Oracle javaws command, and press Enter.

There is only one alternative in link group javaws: /usr/lib/jvm/jdk1.6.0_31/bin/javaws
Nothing to configure.

As a final step, let’s test each of the commands to ensure everything is set up correctly.

java -version

The output should be:
java version "1.6.0_31"
Java(TM) SE Runtime Environment (build 1.6.0_31-b04)
Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode)

javac -version

The output should be:
javac 1.6.0_31

javaws -version

The output should be:
Java(TM) Web Start 1.6.0_31
which is followed by a long usage message.

Create the JAVA_HOME environment variable

Open a terminal, then enter the following commands:

sudo vi /etc/environment

WARNING
WordPress displays the quotes around the JAVA_HOME value below as magic quotes. This will cause problems when you try to use your JVM in certain applications.

Do not copy/paste the JAVA_HOME value below. Or if you do, ensure that you change magic quotes to straight quotes in your editor.

Enter the following at the bottom of the file:
JAVA_HOME="/usr/lib/jvm/jdk1.6.0_31"

Type the following commands to finish the setup and verify that everything is set up correctly.

source /etc/environment
echo $JAVA_HOME

You should see the following output:

/usr/lib/jvm/jdk1.6.0_31

Lastly, verify that JAVA_HOME is set correctly for the sudo user:

sudo env | grep JAVA_HOME
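
Printing the variable shows that it is set, not that it points at a working JDK. As an extra check, a sketch like the following (the function name is made up for this example) confirms that the directory contains an executable bin/java.

```shell
# Hypothetical check: succeed if $1 looks like a JDK/JRE home, meaning
# it contains an executable bin/java.
check_java_home() {
    if [ -x "$1/bin/java" ]; then
        echo "ok: $1/bin/java is executable"
    else
        echo "problem: no executable at $1/bin/java" >&2
        return 1
    fi
}

# Example use:
#   check_java_home "$JAVA_HOME"
```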

That’s it, the JDK 6.0 update 31 is installed.

Install Pentaho Design Studio 4.0 on Ubuntu 12.04 LTS Desktop

Introduction

Pentaho Design Studio (PDS) is a BI plugin for Eclipse. I’m going to download the complete package as Pentaho was nice enough to integrate the plugin with Eclipse for us.

Download

To download the Pentaho Design Studio (PDS) either run the following command, or follow the bulleted steps below.

wget http://downloads.sourceforge.net/project/pentaho/Design%20Studio/4.0.0-stable/pds-ce-linux-64-4.0.0-stable.tar.gz

Or follow the steps below if you don’t want to use the wget command shown above.

Installation

Note:
I am going to assume that you have downloaded the file listed above into the Downloads directory in your Home directory.

Open a terminal and enter the following commands:

cd
mkdir bin
cd ~/Downloads
tar -xzf pds-ce-linux-64-4.0.0-stable.tar.gz
mv design-studio/ ~/bin/pds-ce-linux-64-4.0.0
cd ~/bin
ln -s pds-ce-linux-64-4.0.0 design-studio
vi ~/.profile

Near the bottom of the file you should see the PATH variable. Append :$HOME/bin/design-studio to the end of the PATH.

For example, my PATH was:
PATH="$HOME/bin:$PATH"

…which I updated to:
PATH="$HOME/bin:$PATH:$HOME/bin/design-studio"

It’s better to append :$HOME/bin/design-studio to the end of the PATH than the beginning so that we don’t accidentally step on another installation of Eclipse. Also, because we will launch PDS through a symlink named pds (created in the next step), another Eclipse installation that is earlier in the PATH is less likely to make PDS inaccessible.
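
Because ~/.profile can be sourced more than once in a session, a plain append can leave duplicate entries in PATH. One way to guard against that, sketched here with a hypothetical function name, is to append only when the directory is not already listed.

```shell
# Hypothetical helper: append $1 to PATH unless it is already there.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present: do nothing
        *) PATH="$PATH:$1" ;;        # otherwise add to the end
    esac
}

path_append "$HOME/bin/design-studio"
export PATH
```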

Next, we’ll create a symlink named pds so that we can type a shorter command to open Pentaho Design Studio.

cd ~/bin/design-studio
ln -s eclipse pds

Finally, source your profile to update your environment.

source ~/.profile

Now just type pds and press the Enter key:

pds

Install Pentaho BI Server 4.5 on Ubuntu 12.04 LTS Desktop

Overview: What is Pentaho?

Pentaho is an open source Business Intelligence (BI) Suite that comes with either commercial support (http://www.pentaho.com/) or community support (http://community.pentaho.com/).

This post provides instructions for the Pentaho community edition suite.

Create a pentaho user and group

Open a terminal and run the following commands:

sudo addgroup pentaho
sudo adduser --system --ingroup pentaho --disabled-login pentaho

Install Java

Follow the JDK installation instructions that are listed in the following post: Install Java JDK 6.0 update 31 on Ubuntu 12.04 LTS

Install the Pentaho BI Server

Now that we have Java installed we can get on with our main task of installing the Pentaho BI Server.

  1. Download the Pentaho BI Server from http://wiki.pentaho.com/display/COM/Latest+Stable+Builds. I’m using the current stable build for x64 Linux which is biserver-ce-4.5.0-stable.tar.gz.
  2. Open a terminal and enter the following commands:
sudo mkdir /opt/pentaho
cd ~/Downloads
gunzip biserver-ce-4.5.0-stable.tar.gz
tar xf biserver-ce-4.5.0-stable.tar
sudo mv biserver-ce /opt/pentaho/biserver-ce-4.5.0
sudo mv administration-console /opt/pentaho/administration-console-ce-4.5.0
cd /opt/pentaho
sudo ln -s biserver-ce-4.5.0 biserver-ce
sudo ln -s administration-console-ce-4.5.0 administration-console
sudo chown -R pentaho:pentaho /opt/pentaho
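
Installing into a versioned directory and pointing an unversioned symlink at it means a later upgrade only has to repoint the link. The sketch below shows the idea in a scratch directory; the function name and the directory names are placeholders, not Pentaho paths.

```shell
# Hypothetical helper: repoint symlink $2 at directory $1 in one step.
# -s: symbolic, -f: replace an existing link, -n: treat an existing
# link to a directory as the link itself rather than descending into it.
switch_version() {
    ln -sfn "$1" "$2"
}

# Demonstration in a scratch directory (names are placeholders).
demo=$(mktemp -d)
mkdir "$demo/app-1.0" "$demo/app-2.0"
switch_version app-1.0 "$demo/app"
switch_version app-2.0 "$demo/app"
readlink "$demo/app"    # prints: app-2.0
rm -rf "$demo"
```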

Start the Pentaho Server

Open a terminal, then enter the following commands:

Note:
The following command is only required if you downloaded a Windows .zip file by accident. If this is the case, then none of the .sh files will be executable.
sudo find /opt/pentaho/ -type f -name '*.sh' -exec chmod 744 '{}' \+

cd /opt/pentaho/biserver-ce
sudo -u pentaho ./start-pentaho.sh

Login to the Pentaho User Console

  1. Open a web browser to http://localhost:8080.
  2. Click Evaluation Login and select a user type to login as.

Login to the Pentaho Administration Console

Open a terminal, then enter the following commands:

cd /opt/pentaho/administration-console
sudo -u pentaho ./start-pac.sh

  1. Open a web browser to http://localhost:8099.
  2. Enter a User Name of admin.
  3. Enter a Password of password.
  4. Click Log In.

That’s it. The core Pentaho BI server is installed and ready for development. However, a good next step is to change the database that Pentaho uses and install the Pentaho Design Studio (PDS), but we’ll leave that for future posts.

Hadoop also uses port 8080, so you will either need to use a different port for the Pentaho User Console or change the Hadoop MapReduce ShuffleHandler port.
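
Before starting the server it can help to check whether something is already listening on port 8080. This is a sketch, assuming the usual column layout of netstat -tln or ss -ltn, where the fourth column is the local address and port; the function name is made up for this example.

```shell
# Hypothetical helper: read a listening-socket listing on stdin and
# succeed if any local address in column 4 ends with the given port.
port_listed() {
    awk -v suffix=":$1" '
        $4 ~ (suffix "$") { found = 1 }
        END { exit !found }
    '
}

# Example use:
#   netstat -tln | port_listed 8080 && echo "port 8080 is already in use"
```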

Change the default ssh port on Ubuntu

Changing the default port of ssh is not a huge improvement in security, but I’ve found it to be a useful tool in keeping log files free from failed login attempts with username root on port 22 (and I hope you do spend the time to review your log files!). A large number of scripts scan the default ssh port of 22 looking for known vulnerabilities. Of course, you should keep ssh fully patched; however, rapidly growing log files are a problem all their own.

One of the easiest ways to keep your log files from filling up with failed login attempts is to change the ssh port.

vi /etc/ssh/sshd_config

Update the port to a new value, such as:

Port 876

Once you’ve updated sshd, you may also wish to update ssh for convenience:

vi /etc/ssh/ssh_config

Uncomment the line with Port, and set it to the same value that you set in the sshd_config file:

Port 876

Lastly, reload the sshd daemon:

/etc/init.d/ssh reload

Open a 2nd ssh session to the server to ensure everything is working.

Note:
I recommend you keep the original session open in case you get something wrong in your configuration.
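
Before reloading, it is worth confirming what the configuration file actually says, since a commented-out Port line is easy to miss. The sketch below, with a made-up function name, prints the first active Port setting in a file.

```shell
# Hypothetical helper: print the first uncommented Port value in file $1.
configured_port() {
    # Lines beginning with "#Port" are comments: their first field is
    # "#Port", not "Port", so they are skipped.
    awk '$1 == "Port" { print $2; exit }' "$1"
}

# Example use (the host name is a placeholder):
#   configured_port /etc/ssh/sshd_config
#   ssh -p 876 user@server
```

Running sshd -t as root also checks the server configuration for errors without starting the daemon.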

Install Python 2.7.2 from source on Ubuntu 10.04 LTS

The first thing I did was to create a wwwuser that I plan to run pyramid under. As a result, I am intentionally installing Python 2.7.2 under only 1 user account, and am leaving the system wide python installation unchanged.

useradd wwwuser
passwd wwwuser
cd /home
mkdir wwwuser
chown wwwuser:wwwuser wwwuser

Copy all of the hidden files into the /home/wwwuser folder. I did this from my desktop files.

vi /etc/passwd
Update the shell to be:
/bin/bash

su - wwwuser
ln -s .profile .bash_profile
mkdir bin
mv Python-2.7.2.tgz bin
cd bin
tar -xzf Python-2.7.2.tgz
cd Python-2.7.2
./configure --prefix /home/wwwuser/bin/Python-2.7.2
make
make install

vi ~/.profile
Update path to:
PATH="$HOME/bin:/home/wwwuser/bin/Python-2.7.2/bin:$PATH"

source ~/.profile

which python
/home/wwwuser/bin/Python-2.7.2/bin/python

python --version

The output of the python --version command should now be Python 2.7.2.