Intro
CDH (Cloudera's Distribution Including Apache Hadoop) is the most popular and best documented distribution of Apache Hadoop. While following the CDH4 Quick Start Guide, however, I recently found a gap in its documentation. I installed the Oracle Java Development Kit and set the JAVA_HOME environment variable according to the instructions, but when I attempted to start the HDFS nodes I received an error message stating that JAVA_HOME is not set and could not be found. After some quick research I found that the solution is simply to export JAVA_HOME inside the hadoop-env.sh configuration file in addition to the .bash_profile file. This fix is obvious to an experienced Hadoop administrator but can be tricky for a beginner, so in my opinion Cloudera should document it well. The following covers the detailed troubleshooting steps along with the solution.
Symptoms
1) You have the Oracle Java Development Kit installed and the JAVA_HOME environment variable exported in your shell:
[root@hadoop-standalone-mr1 ~]# env | grep JAVA_HOME
JAVA_HOME=/opt/jdk1.6.0_45
2) When attempting to start the HDFS nodes, you receive the following error messages:
[root@hadoop-standalone-mr1 ~]# for x in `cd /etc/init.d ; ls hadoop-hdfs-*` ; do service $x start ; done
Starting Hadoop datanode: [ OK ]
Error: JAVA_HOME is not set and could not be found.
Starting Hadoop namenode: [ OK ]
Error: JAVA_HOME is not set and could not be found.
Starting Hadoop secondarynamenode: [ OK ]
Error: JAVA_HOME is not set and could not be found.
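A likely explanation for these symptoms is that the service wrapper does not pass the caller's environment on to the init scripts, so a JAVA_HOME exported in the login shell never reaches the Hadoop start scripts. The following is only a minimal sketch of that effect, assuming the wrapper behaves roughly like env -i with PATH passed through; it does not call service itself, and the variable names inherited and stripped are made up for illustration.

```shell
# Sketch: compare what a child process sees with and without the caller's environment.
# Assumption: the service wrapper clears the environment in a similar way to env -i.
export JAVA_HOME=/opt/jdk1.6.0_45

# A normal child process inherits the exported variable.
inherited=$(sh -c 'echo "${JAVA_HOME:-unset}"')

# env -i starts the child with an empty environment; only PATH is passed back in,
# so JAVA_HOME is no longer visible to the child.
stripped=$(env -i PATH="$PATH" sh -c 'echo "${JAVA_HOME:-unset}"')

echo "inherited: $inherited"
echo "stripped:  $stripped"
```

If this is what happens on your system, it would explain why env | grep JAVA_HOME shows the variable while the init scripts still report it as not set.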
How to fix the issue
1) Export the JAVA_HOME environment variable in the hadoop-env.sh configuration file:
echo export `env | grep ^JAVA_HOME` >> /etc/alternatives/hadoop-conf/hadoop-env.sh
2) All HDFS nodes should now start up properly:
[root@hadoop-standalone-mr1 ~]# for x in `cd /etc/init.d ; ls hadoop-hdfs-*` ; do service $x start ; done
Starting Hadoop datanode: [ OK ]
starting datanode, logging to /var/log/hadoop-hdfs/hadoop-hdfs-datanode-hadoop-standalone-mr1.out
Starting Hadoop namenode: [ OK ]
starting namenode, logging to /var/log/hadoop-hdfs/hadoop-hdfs-namenode-hadoop-standalone-mr1.out
Starting Hadoop secondarynamenode: [ OK ]
starting secondarynamenode, logging to /var/log/hadoop-hdfs/hadoop-hdfs-secondarynamenode-hadoop-standalone-mr1.out
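Note that the append in step 1 adds another line to hadoop-env.sh every time it is run. The following is a hedged sketch of a variant that appends the export only once; set_java_home is a hypothetical helper name, not something shipped with CDH, and it assumes the file does not already contain an uncommented export JAVA_HOME line with a different value.

```shell
# Hypothetical helper: append "export JAVA_HOME=..." to a config file only once.
set_java_home() {
    conf_file=$1
    java_home=$2
    # Skip the append if an uncommented export line for JAVA_HOME already exists.
    if grep -q '^export JAVA_HOME=' "$conf_file" 2>/dev/null; then
        return 0
    fi
    echo "export JAVA_HOME=$java_home" >> "$conf_file"
}

# Example call, using the path from step 1 (run as root):
# set_java_home /etc/alternatives/hadoop-conf/hadoop-env.sh "$JAVA_HOME"
```

Running the helper a second time leaves the file unchanged, which makes it safer to use in provisioning scripts that may be executed repeatedly.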
Disclaimer
- The above has been tested with the CDH4 packages on a CentOS 6.4 x86_64 system in a Google Compute Engine environment.
- The above solution works for both MRv1 and YARN.