A Pseudo-Distributed Hadoop 0.23.7 Development Environment

April 26, 2013

All content on this site is original; please credit the source www.alexclouds.net when reposting.

       The difference between this release and the 2.X.X line is that it has no NameNode HA, and its directory layout changes quite a bit compared to the 0.20.X releases most of us are used to. The old conf/ configuration directory is gone; in this version the configuration lives under etc/hadoop/. In the old releases you mainly edited core-site.xml, hdfs-site.xml, mapred-site.xml, and hadoop-env.sh. In the new release you also edit .bashrc and yarn-env.sh to add the required environment variables, and configure Hadoop in core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml. start-all.sh and stop-all.sh are deprecated: start-dfs.sh and stop-dfs.sh start and stop the HDFS NameNode and DataNode, while start-yarn.sh and stop-yarn.sh start and stop the ResourceManager and NodeManager, as summarized below.
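
For quick reference, the replacement scripts sit under sbin/ in the unpacked distribution:

sbin/start-dfs.sh    # starts the HDFS NameNode and DataNode
sbin/stop-dfs.sh     # stops HDFS
sbin/start-yarn.sh   # starts the ResourceManager and NodeManager
sbin/stop-yarn.sh    # stops YARN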

       The base environment, Java included, comes with CentOS: OpenJDK is installed by default under /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64, which keeps things simple. Maven is already installed as well. Add the following environment variables to /etc/profile:

export JAVA_HOME=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64
export JRE_HOME=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre

export M2_HOME=/home/apache-maven-2.2.1
export M2=$M2_HOME/bin
export MAVEN_OPTS="-Xms256m -Xmx512m"

export HADOOP_HOME=/home/hadoop-0.23.7

export PATH=.:$PATH:$JAVA_HOME/bin:$JRE_HOME/bin:$HADOOP_HOME/bin:$M2:$FLUME_HOME:$HBASE_HOME/bin:$APACHE_NUTCH_HOME/bin

     Then run source /etc/profile, or log out and log back in.
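
A quick sanity check that the new variables are in effect (my own suggestion; the version strings will of course differ per machine):

echo $JAVA_HOME
java -version
mvn -version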

     Unpack hadoop-0.23.7 to /home/hadoop-0.23.7, as shown below.
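
Assuming the release tarball is named hadoop-0.23.7.tar.gz and was downloaded to /home, something like:

cd /home
tar -xzf hadoop-0.23.7.tar.gz    # unpacks to /home/hadoop-0.23.7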

     Add a hadoop group and a hadoop user (groupadd hadoop, then useradd hadoop -G hadoop), and finally hand the install directory over to that user with chown -R hadoop:hadoop /home/hadoop-0.23.7. The commands are laid out below.
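
Spelled out as commands (run as root; if useradd complains that group hadoop already exists, use -g hadoop instead of -G hadoop):

groupadd hadoop
useradd hadoop -G hadoop
chown -R hadoop:hadoop /home/hadoop-0.23.7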

     Set up passwordless SSH login to the local machine:

ssh-keygen -t rsa -P ""
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

ssh localhost
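
If ssh localhost still prompts for a password, the usual culprit is the permissions on ~/.ssh; tightening them normally fixes it:

chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys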

      Add the following lines to .bashrc:

export JAVA_HOME=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64
export JRE_HOME=${JAVA_HOME}/jre
export HADOOP_HOME=/home/hadoop-0.23.7
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$HADOOP_HOME/bin:$PATH
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
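
Then reload the file so the current shell picks up the variables:

source ~/.bashrc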

     Configure the Hadoop files (under $HADOOP_HOME/etc/hadoop):

1. yarn-env.sh

Append the following lines at the end of the file:

export HADOOP_PREFIX=/home/hadoop-0.23.7
export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
export PATH=$PATH:$HADOOP_PREFIX/bin
export PATH=$PATH:$HADOOP_PREFIX/sbin
export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
export YARN_HOME=${HADOOP_PREFIX}
export HADOOP_CONF_DIR=${HADOOP_PREFIX}/etc/hadoop
export YARN_CONF_DIR=${HADOOP_PREFIX}/etc/hadoop

 

2. core-site.xml

bash-4.1$ more core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:12200</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop-0.23.7/hadoop-root</value>
  </property>
  <property>
    <name>fs.arionfs.impl</name>
    <value>org.apache.hadoop.fs.pvfs2.Pvfs2FileSystem</value>
    <description>The FileSystem for arionfs.</description>
  </property>
</configuration>

    

3. hdfs-site.xml

bash-4.1$ more hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/hadoop-0.23.7/data/dfs/name</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/hadoop-0.23.7/data/dfs/data</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
</configuration>
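
The name and data directories referenced above do not exist yet. Creating them up front and handing them to the hadoop user (an extra step of mine, not strictly required since the format step below creates the name directory) avoids permission surprises:

mkdir -p /home/hadoop-0.23.7/data/dfs/name /home/hadoop-0.23.7/data/dfs/data
chown -R hadoop:hadoop /home/hadoop-0.23.7/data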

 

4. mapred-site.xml

bash-4.1$ more mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.job.tracker</name>
    <value>hdfs://localhost:9001</value>
    <final>true</final>
  </property>
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>1536</value>
  </property>
  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx1024M</value>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>3072</value>
  </property>
  <property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx2560M</value>
  </property>
  <property>
    <name>mapreduce.task.io.sort.mb</name>
    <value>512</value>
  </property>
  <property>
    <name>mapreduce.task.io.sort.factor</name>
    <value>100</value>
  </property>
  <property>
    <name>mapreduce.reduce.shuffle.parallelcopies</name>
    <value>50</value>
  </property>
  <property>
    <name>mapreduce.system.dir</name>
    <value>file:/home/hadoop-0.23.7/data/mapred/system</value>
  </property>
  <property>
    <name>mapreduce.local.dir</name>
    <value>file:/home/hadoop-0.23.7/data/mapred/local</value>
    <final>true</final>
  </property>
</configuration>

 

5. yarn-site.xml

bash-4.1$ more yarn-site.xml
<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>
<!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce.shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>user.name</name>
    <value>hadoop</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>localhost:54311</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>localhost:54312</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>localhost:54313</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>localhost:54314</value>
  </property>
  <property>
    <name>yarn.web-proxy.address</name>
    <value>localhost:54315</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost</value>
  </property>
</configuration>

 

6. libexec/hadoop-config.sh

Only one line needs to be added, right where the script attempts to set JAVA_HOME: export JAVA_HOME=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64

# Attempt to set JAVA_HOME if it is not set
export JAVA_HOME=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64
if [[ -z $JAVA_HOME ]]; then
  # On OSX use java_home (or /Library for older versions)
  if [ "Darwin" == "$(uname -s)" ]; then
    if [ -x /usr/libexec/java_home ]; then
      export JAVA_HOME=($(/usr/libexec/java_home))
    else
      export JAVA_HOME=(/Library/Java/Home)
    fi
  fi

  # Bail if we did not detect it
  if [[ -z $JAVA_HOME ]]; then
    echo "Error: JAVA_HOME is not set and could not be found." 1>&2
    exit 1
  fi
fi

 

7. Format the NameNode: hadoop namenode -format
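
Run the format as the hadoop user so the name directory ends up owned by it, for example:

su - hadoop
hadoop namenode -format    # in 0.23 the same thing is also available as: hdfs namenode -format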

 

8. Start Hadoop

bash-4.1$ more 0.sh
#!/bin/bash

/home/hadoop-0.23.7/sbin/start-dfs.sh
/home/hadoop-0.23.7/sbin/start-yarn.sh
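
A matching shutdown script (my own addition, simply mirroring 0.sh) might look like:

#!/bin/bash

/home/hadoop-0.23.7/sbin/stop-yarn.sh
/home/hadoop-0.23.7/sbin/stop-dfs.sh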

 

9. Run jps to check that the Hadoop daemons started correctly

jps
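
On a healthy pseudo-distributed node the output should list one of each daemon, roughly like this (the PIDs are illustrative only):

12001 NameNode
12102 DataNode
12203 SecondaryNameNode
12304 ResourceManager
12405 NodeManager
12506 Jps

The NameNode web UI (http://localhost:50070 by default) and the ResourceManager web UI (localhost:54313 as configured in yarn-site.xml above) are also good places to confirm everything is up.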
