Introduction sed provides a quick and easy way to find and replace text via its substitute command (‘s’). Sample File Copy and paste the following text into a file named practice01.txt. Author: Akbar S. Ahmed Date: July 1, 2012 Subject: Sed sed is an extremely useful Unix/Linux/*nix utility that allows you to manipulate a text stream. It is useful when working with Hadoop, as sed is often used to manipulate text prior to MapReduce. sed […]
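As a minimal sketch of the ‘s’ command, the snippet below runs a find and replace against a small stand-in file. The temporary file and its contents are assumptions made for illustration; they are not the practice01.txt contents from the post.

```shell
# Create a small stand-in file (hypothetical contents, for illustration only).
sample=$(mktemp)
printf 'Subject: Sed\nsed is a stream editor. sed edits text.\n' > "$sample"

# s/old/new/ replaces only the first match on each line (matching is case sensitive).
first=$(sed 's/sed/SED/' "$sample")

# Adding the g flag replaces every match on each line.
all=$(sed 's/sed/SED/g' "$sample")

printf '%s\n' "$first"
printf '%s\n' "$all"
rm -f "$sample"
```

Neither command changes the file itself; sed writes the result to standard output unless it is told to edit in place.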
Introduction sed is short for Stream EDitor, a utility that allows you to parse and transform text one line at a time. sed is a useful tool, along with grep and awk, when manipulating text files. It is also often overlooked when working with Hadoop, although the use of sed, awk and grep can help speed up processing times by preprocessing text before sending it to a MapReduce job.
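The kind of preprocessing described above can be sketched as a small pipeline. The record layout, the sample values, and the variable names are hypothetical; the point is only that grep filters lines, sed rewrites them, and awk selects fields before the data would reach a job.

```shell
# Hypothetical input: comma-separated records with stray spaces and a comment line.
input='# id, name, city
1,  alice , Boston
2,  bob , Austin'

cleaned=$(printf '%s\n' "$input" |
  grep -v '^#' |                    # drop comment lines
  sed 's/ *, */,/g' |               # remove spaces around commas
  awk -F, '{ print $1 "\t" $2 }')   # keep the first two fields, tab-separated

printf '%s\n' "$cleaned"
```

Each stage reads one line at a time from the previous stage, so the whole pipeline streams without loading the file into memory.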
Introduction I had installed JDK 6.0 update 31 in an earlier post. However, I now need to write a Java application that requires the features available in JDK 7. In this post, I will install JDK 7 update 5 as a secondary JDK, while JDK 6.0 u31 will be the primary JDK. It’s perfectly normal to have multiple JDKs on a single machine to support the requirements of different applications. Fortunately, it’s easy to use […]
Introduction Installing the MySQL package on Ubuntu is extremely simple. Installation Open a terminal and enter the following command. sudo apt-get install mysql-client mysql-navigator mysql-server Type Y to accept the additional packages. Press Enter. After downloading and during installation, the MySQL configuration dialogs will display in the terminal. In the first dialog, press Enter. Enter a password for the MySQL root user. Press Enter. Reenter the root password. Press Enter. That’s it; MySQL is now […]
Introduction This is my personal .bash_aliases file that is mainly used for Cloudera CDH4 (Hadoop) and Pentaho. As a result, many of my aliases are specific to these software packages. I plan to update this post as my .bash_aliases file expands. I will also push my .bash_aliases file into Git to make it easier to keep up with changes to the file. How to create a .bash_aliases file vi ~/.bash_aliases Paste the following into the […]
Introduction Pentaho Design Studio (PDS) is a BI plugin for Eclipse. I’m going to download the complete package as Pentaho was nice enough to integrate the plugin with Eclipse for us. Download To download the Pentaho Design Studio (PDS) either run the following command, or follow the bulleted steps below. wget http://downloads.sourceforge.net/project/pentaho/Design%20Studio/4.0.0-stable/pds-ce-linux-64-4.0.0-stable.tar.gz Or follow the steps below if you don’t want to use the wget command shown above. Open a web browser to http://wiki.pentaho.com/display/COM/Latest+Stable+Builds. Click […]
Overview: What is Pentaho? Pentaho is an open source Business Intelligence (BI) suite that comes with either commercial support (http://www.pentaho.com/) or community support (http://community.pentaho.com/). This post provides instructions for the Pentaho community edition suite. Create a pentaho user and group Open a terminal and run the following commands: sudo addgroup pentaho sudo adduser --system --ingroup pentaho --disabled-login pentaho Install Java Follow the JDK installation instructions that are listed in the following post: Install […]
I have updated an earlier post on how to install MySQL 5.5 on Ubuntu 10.04 LTS. Importantly, the new instructions include how to add the MySQL 5.5 libs to the loader path. This is very important as it’s highly likely that you’ll build software that depends on these MySQL libraries. The updated instructions can be viewed at: MySQL 5.5 on Ubuntu 10.04 LTS.
Changing the default port of ssh is not a huge improvement in security, but I’ve found it to be a useful tool in keeping log files free from failed login attempts with username root on port 22 (and I hope you do spend the time to review your log files!). A large number of scripts run scans on the default ssh port of 22 looking for known vulnerabilities. Of course, you should keep ssh fully […]
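As a hedged sketch, the port change can be rehearsed on a scratch copy of the configuration before touching the real file. The port number 2222 and the sample file contents are assumptions for illustration; on a typical OpenSSH install the real file is /etc/ssh/sshd_config, and the ssh service has to be restarted before a change takes effect.

```shell
# Scratch copy standing in for the ssh daemon config (hypothetical contents).
conf=$(mktemp)
printf '#Port 22\nPermitRootLogin no\n' > "$conf"

# Uncomment the Port line if needed and set the new port (2222 is an arbitrary example).
new_conf=$(sed -E 's/^#?Port [0-9]+$/Port 2222/' "$conf")

printf '%s\n' "$new_conf"
rm -f "$conf"
```

Working on a copy first makes it easy to check the result before applying the same substitution to the live configuration.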
The first thing I did was to create a user, wwwuser, that I plan to run Pyramid under. As a result, I am intentionally installing Python 2.7.2 under only one user account, and am leaving the system-wide Python installation unchanged. useradd wwwuser passwd wwwuser cd /home mkdir wwwuser chown wwwuser:wwwuser wwwuser Copy all of the hidden files into the /home/wwwuser folder. I did this from my desktop files. vi /etc/passwd Update the shell to be: […]