Prerequisites for Hadoop Installation:
Hadoop runs on Java 1.5 and above, but Java 1.6 is recommended.
So the first thing you need on your machine is Java 1.6.
Step 1. Check whether Java 1.6 is installed.
Run the command $ java -version and press Enter.
Step 2. If Java is not installed, install the OpenJDK 6 runtime:
$ sudo apt-get install openjdk-6-jre
After the installation, check that Java is installed properly by running the same
$ java -version command again.
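With OpenJDK 6 installed, the output should look roughly like this (the exact version and build strings will differ depending on your release):
java version "1.6.0_27"
OpenJDK Runtime Environment (IcedTea6 1.12.6)
OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode)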
If output like the above appears, Java is installed properly on your system.
You can also find the installed JVM under /usr/lib/jvm/.
Step 3. Adding a dedicated system user:
A dedicated user helps to separate the Hadoop installation from other software applications, and from the other user accounts running on the single node.
To create a separate user, use the commands below:
$ sudo addgroup hadoop
where hadoop is the group name.
Adding a user hduser to the hadoop group:
$ sudo adduser --ingroup hadoop hduser
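You can confirm the user and group were created correctly with the id command (the numeric IDs shown here will differ on your machine):
$ id hduser
uid=1001(hduser) gid=1001(hadoop) groups=1001(hadoop)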
Step 4. Configuring SSH (Secure Shell) access to localhost:
Hadoop requires SSH access to manage its nodes. So for this single node installation of Hadoop we need to configure the SSH access to localhost.
We will be creating this access for the hduser we created in the previous step.
Step 5. Install the SSH server and switch to the hduser account:
$ sudo apt-get install openssh-server
$ su - hduser
Step 6. After the SSH server installation, we have to generate an SSH key for the hduser:
$ ssh-keygen -t rsa -P ""
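The -P "" flag creates the key with an empty passphrase, so that Hadoop can open SSH connections without prompting for a password. The command writes the key pair into ~/.ssh; the output should look like this (fingerprint omitted, as it differs per machine):
Generating public/private rsa key pair.
Your identification has been saved in /home/hduser/.ssh/id_rsa.
Your public key has been saved in /home/hduser/.ssh/id_rsa.pub.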
Step 7. Now that the key pair is generated, we have to enable SSH access to the local machine with this newly created key. For that, run the command below.
hduser@ubuntu:~$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
Step 8. Finally, you can verify the setup using the command:
$ ssh localhost
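On the first connection, SSH will ask you to confirm the host fingerprint; answer yes and you should get a shell on localhost without being asked for a password (the fingerprint value is elided here, as it is unique to your machine):
The authenticity of host 'localhost (127.0.0.1)' can't be established.
RSA key fingerprint is ...
Are you sure you want to continue connecting (yes/no)? yes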
Hadoop Installation:
Download & Extract Hadoop:
If you have all the above prerequisites in place on your machine, you are good to go with the Hadoop installation.
First download Hadoop from http://www.apache.org/dyn/closer.cgi/hadoop/core and extract it at any location; I kept it at /usr/local. You also need to change the owner of all files to the user hduser and the group to hadoop.
$ cd /usr/local
$ sudo tar xzf hadoop-1.2.1.tar.gz
$ sudo mv hadoop-1.2.1 hadoop
$ sudo chown -R hduser:hadoop hadoop
Update $HOME/.bashrc
Add the following lines at the end of the $HOME/.bashrc file of user hduser. If you are using a shell other than bash, update that shell's configuration file instead.
# Set Hadoop-related environment variables
export HADOOP_HOME=/usr/local/hadoop
# Set JAVA_HOME (on newer Ubuntu releases this may be /usr/lib/jvm/java-6-openjdk-amd64)
export JAVA_HOME=/usr/lib/jvm/java-6-openjdk
# Convenient aliases for HDFS commands
unalias fs &> /dev/null
alias fs="hadoop fs"
unalias hls &> /dev/null
alias hls="fs -ls"
# View the head of an LZO-compressed file in HDFS (requires lzop)
lzohead () {
hadoop fs -cat "$1" | lzop -dc | head -1000 | less
}
# Add Hadoop's bin directory to PATH
export PATH=$PATH:$HADOOP_HOME/bin
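Reload the file so the changes take effect in your current session, then make sure the hadoop command is found; it should print the Hadoop version (1.2.1 here):
$ source $HOME/.bashrc
$ hadoop version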
Configuration File Setup
By now we are almost done with the Hadoop installation. What remains is to change a few properties in the configuration files provided in Hadoop's conf folder.
But before that we have to create a directory where Hadoop will store its data on the local single-node cluster; HDFS will keep its files here.
So let's create the directory and set the required ownership and permissions.
$ sudo mkdir /tmp/hadoop_data
$ sudo chown hduser:hadoop /tmp/hadoop_data
$ sudo chmod 777 /tmp/hadoop_data
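You can verify the ownership and permissions with ls (the size and date fields will differ on your machine):
$ ls -ld /tmp/hadoop_data
drwxrwxrwx 2 hduser hadoop 4096 ... /tmp/hadoop_data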
Now let's start changing the required configuration files.
Note: you will find all these configuration files inside the hadoop/conf directory, wherever you extracted Hadoop. In my case it is /usr/local/hadoop/conf.
hadoop-env.sh
Open the hadoop-env.sh file and change the only environment variable required for a local machine installation: JAVA_HOME. You just need to uncomment the line below and point JAVA_HOME to your JDK/JRE directory.
# The java implementation to use. Required.
export JAVA_HOME=/usr/lib/jvm/java-6-openjdk
core-site.xml
In between <configuration> ... </configuration> put the below code:
<property>
<name>hadoop.tmp.dir</name>
<value>/tmp/hadoop_data</value>
<description>directory for hadoop data</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
<description>The URI of the default file system (the HDFS NameNode).</description>
</property>
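For reference, after the edit the whole core-site.xml should look roughly like this (the two XML header lines are already present in the stock file):
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/tmp/hadoop_data</value>
    <description>directory for hadoop data</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
    <description>The URI of the default file system (the HDFS NameNode).</description>
  </property>
</configuration>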
mapred-site.xml
Again, in between <configuration> ... </configuration> put the below code:
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
<description>The host and port that the MapReduce JobTracker runs at.</description>
</property>
hdfs-site.xml
And in between <configuration> ... </configuration> put the below code:
<property>
<name>dfs.replication</name>
<value>1</value>
<description>Default block replication. Set to 1 because we have only one node.</description>
</property>
Formatting and Starting the Single Node Cluster:
If everything so far went through successfully, you are done with the installation part. Now we just have to format the NameNode and start the cluster.
hduser@ubuntu:~$ /usr/local/hadoop/bin/hadoop namenode -format
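The format command should finish with a message saying the storage directory has been successfully formatted. After that, start all the Hadoop daemons and check that they are running with jps; on a Hadoop 1.x single-node setup you should see the five daemons below (the process IDs will differ):
hduser@ubuntu:~$ /usr/local/hadoop/bin/start-all.sh
hduser@ubuntu:~$ jps
2287 NameNode
2349 DataNode
2471 SecondaryNameNode
2720 JobTracker
2839 TaskTracker
2901 Jps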