CONFIGURE AND RUN HADOOP AND HDFS

Hadoop can also be run on a single node in pseudo-distributed mode, where each Hadoop daemon runs in a separate Java process.

Procedure:

Step 1: Download Hadoop 0.20.2 from the following website:    http://hadoop.apache.org/common/releases.html

Step 2: Extract the hadoop tar.gz file as follows:

$tar -xvzf hadoop-0.20.2.tar.gz

Step 3: Set the PATH and CLASSPATH variables appropriately in your .profile file and load it.

Example:

export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.9
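A fuller .profile fragment might look like the following sketch; the HADOOP_HOME location and the core jar name are assumptions based on the 0.20.2 layout used in this post:

```shell
# Assumed install locations for this setup -- adjust to your machine
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.9
export HADOOP_HOME=/home/hadoop/hadoop

# Put the java and hadoop binaries on the PATH
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin

# Make the Hadoop core jar visible when compiling MapReduce jobs
export CLASSPATH=$CLASSPATH:$HADOOP_HOME/hadoop-0.20.2-core.jar
```

Reload the file after editing, e.g. with ". ~/.profile".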

Step 4: The following files will have to be modified to complete the Hadoop setup:

 

Setup NAMENODE service:

Open the core-site.xml file:

vi /home/hadoop/hadoop/conf/core-site.xml

and enter the following between the <configuration></configuration> tags:

<configuration>

 <property>

  <name>hadoop.tmp.dir</name>

  <value>/home/hadoop/hdfsdrive</value>

  <description>A base for other temporary directories.</description>

 </property>

 

 <property>

  <name>fs.default.name</name>

  <value>hdfs://ipaddress:port</value>

 </property>

</configuration>
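As a concrete sketch for a single-node setup, the placeholders might be filled in as follows; hdfs://localhost:9000 is the value used in the Apache pseudo-distributed quickstart, and the hdfsdrive path matches the directory created in Step 5:

<configuration>

 <property>

  <name>hadoop.tmp.dir</name>

  <value>/home/hadoop/hdfsdrive</value>

 </property>

 <property>

  <name>fs.default.name</name>

  <value>hdfs://localhost:9000</value>

 </property>

</configuration>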

 

 

Setup JOBTRACKER service:

By default, the conf folder already contains the /home/hadoop/hadoop/conf/mapred-site.xml file; open it:

vi /home/hadoop/hadoop/conf/mapred-site.xml

The mapred-site.xml file configures MapReduce; here it specifies the host and port of the JobTracker.
We need to enter the following content in between the <configuration></configuration> tags:

<configuration>

 <property>

  <name>mapred.job.tracker</name>

  <value>ipaddress:port</value>

 </property>

</configuration>
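As a concrete sketch, in a single-node setup the placeholder is typically the local host with a free port; localhost:9001 is the value used in the Apache pseudo-distributed quickstart:

<configuration>

 <property>

  <name>mapred.job.tracker</name>

  <value>localhost:9001</value>

 </property>

</configuration>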


Setup MASTERS service:

The masters file specifies the host that runs the secondary namenode:

vi /home/hadoop/hadoop/conf/masters

ipaddress

Setup SLAVES service:

The slaves file lists the hosts that run the datanode (and tasktracker) daemons on the cluster.

vi /home/hadoop/hadoop/conf/slaves

ipaddress (the same host as the namenode in a pseudo-distributed setup)

Step 5: Create the directory on the local filesystem that will be used for HDFS storage (the hadoop.tmp.dir path from Step 4):

            mkdir /home/hadoop/hdfsdrive

Step 6: Format the namenode:

            hadoop namenode -format

Step 7: Create a password-less SSH setup:

            ssh-keygen

            ssh-copy-id -i /home/hadoop/.ssh/id_rsa.pub hadoop@ipaddress

Step 8: Start the hadoop services:

            start-all.sh

Once started, the running daemons (NameNode, DataNode, SecondaryNameNode, JobTracker, TaskTracker) can be listed with the jps command.

Step 9: Stop the hadoop services:

            stop-all.sh
