CONFIGURE AND RUN HADOOP AND HDFS

Hadoop can also be run on a single node in pseudo-distributed mode, where each Hadoop daemon runs in a separate Java process.

Procedure:

Step 1: Download Hadoop 0.20.2 from the following website:    http://hadoop.apache.org/common/releases.html

Step 2: Extract the hadoop tar.gz file as follows:

$tar -xvzf hadoop-0.20.2.tar.gz

Step 3: Set the PATH and CLASSPATH variables appropriately in your .profile file and load it.

Example:

export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.9
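A fuller .profile fragment might look like the following sketch; the HADOOP_HOME location and the core jar name are assumptions based on the 0.20.2 layout used in this post:

```shell
# Assumed install locations for this setup -- adjust to your machine
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.9
export HADOOP_HOME=/home/hadoop/hadoop

# Put the java and hadoop binaries on the PATH
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin

# Make the Hadoop core jar visible when compiling MapReduce jobs
export CLASSPATH=$CLASSPATH:$HADOOP_HOME/hadoop-0.20.2-core.jar
```

Reload the file after editing, e.g. with ". ~/.profile".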

Step 4: The following files will have to be modified to complete the Hadoop setup:

 

Setup NAMENODE service:

Open the core-site.xml file:

vi /home/hadoop/hadoop/conf/core-site.xml

and enter the following between the <configuration></configuration> tags:

<configuration>

 <property>

  <name>hadoop.tmp.dir</name>

  <value>/home/hadoop/hdfsdrive</value>

  <description>A base for other temporary directories.</description>

 </property>

 

 <property>

  <name>fs.default.name</name>

  <value>hdfs://ipaddress:port</value>

 </property>

</configuration>
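As a concrete sketch for a single-node setup, the placeholders might be filled in as follows; hdfs://localhost:9000 is the value used in the Apache pseudo-distributed quickstart, and the hdfsdrive path matches the directory created in Step 5:

<configuration>

 <property>

  <name>hadoop.tmp.dir</name>

  <value>/home/hadoop/hdfsdrive</value>

 </property>

 <property>

  <name>fs.default.name</name>

  <value>hdfs://localhost:9000</value>

 </property>

</configuration>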

 

 

Setup JOBTRACKER service:

By default, the conf folder already contains the /home/hadoop/hadoop/conf/mapred-site.xml file; open it:

vi /home/hadoop/hadoop/conf/mapred-site.xml

The mapred-site.xml file configures MapReduce; here it specifies the host and port of the JobTracker.
We need to enter the following content in between the <configuration></configuration> tags:

<configuration>

 <property>

  <name>mapred.job.tracker</name>

  <value>ipaddress:port</value>

 </property>

</configuration>
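As a concrete sketch, in a single-node setup the placeholder is typically the local host with a free port; localhost:9001 is the value used in the Apache pseudo-distributed quickstart:

<configuration>

 <property>

  <name>mapred.job.tracker</name>

  <value>localhost:9001</value>

 </property>

</configuration>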


Setup MASTERS service:

The masters file specifies the host that runs the secondary namenode:

vi /home/hadoop/hadoop/conf/masters

ipaddress

Setup SLAVES service:

The slaves file lists the hosts that run the datanode (and tasktracker) daemons on the cluster.

vi /home/hadoop/hadoop/conf/slaves

ipaddress (the same host as the namenode in a pseudo-distributed setup)

Step 5: Create the directory on the local filesystem that will be used for HDFS storage (the hadoop.tmp.dir path from Step 4):

            mkdir /home/hadoop/hdfsdrive

Step 6: Format the namenode:

            hadoop namenode -format

Step 7: Create a password-less SSH setup:

            ssh-keygen

            ssh-copy-id -i /home/hadoop/.ssh/id_rsa.pub hadoop@ipaddress

Step 8: Start the hadoop services:

            start-all.sh

Once started, the running daemons (NameNode, DataNode, SecondaryNameNode, JobTracker, TaskTracker) can be listed with the jps command.

Step 9: Stop the hadoop services:

            stop-all.sh
