HDFS parameter reference
(These are the parameters for an HDFS high-availability setup, listed here for reference; the walkthrough below builds a basic non-HA cluster.)
dfs.nameservices: sets the HDFS nameservice to ns1; must match the value used in core-site.xml
dfs.ha.namenodes.ns1: the two NameNodes under ns1, named nn1 and nn2
dfs.namenode.rpc-address.ns1.nn1: RPC address of nn1
dfs.namenode.http-address.ns1.nn1: HTTP address of nn1
dfs.namenode.shared.edits.dir: where the NameNode edit log is stored on the JournalNodes
dfs.journalnode.edits.dir: where each JournalNode stores its data on local disk
dfs.ha.automatic-failover.enabled: true enables automatic NameNode failover
dfs.client.failover.proxy.provider.ns1: the class clients use to find the active NameNode during failover
dfs.ha.fencing.ssh.private-key-files: private key files for the sshfence fencing mechanism (requires passwordless SSH)
fs.defaultFS: sets the default HDFS nameservice to ns1
hadoop.tmp.dir: the Hadoop temporary directory
ha.zookeeper.quorum: the ZooKeeper quorum addresses
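For orientation only, a minimal sketch of how these HA parameters fit together in hdfs-site.xml (the nn1-host and journal1/2/3 hostnames are placeholders, not machines from this walkthrough, which does not use HA):
<property>
<name>dfs.nameservices</name>
<value>ns1</value>
</property>
<property>
<name>dfs.ha.namenodes.ns1</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.ns1.nn1</name>
<value>nn1-host:9000</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://journal1:8485;journal2:8485;journal3:8485/ns1</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.ns1</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>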
Environment:
OS: CentOS 6.6 x86_64
Machines:
192.168.0.93 lsvr93
192.168.0.94 lsvr94
192.168.0.103 lsvr103
Operating account: root
1. Download address:
http://apache.fayea.com/hadoop/common/hadoop-2.7.2/hadoop-2.7.2.tar.gz
2. Install the JDK
a. Install the JDK RPM
rpm -ivh jdk-8u71-linux-x64.rpm
b. Configure environment variables
vi /etc/profile (affects all login users; use ~/.bash_profile to affect only the current user)
#set java environment
JAVA_HOME=/usr/java/jdk1.8.0_71
JRE_HOME=/usr/java/jdk1.8.0_71/jre
JAVA_BIN=/usr/java/jdk1.8.0_71/bin
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
PATH=$JAVA_BIN:$PATH
export PATH JAVA_HOME JRE_HOME CLASSPATH
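Reload the profile so the variables take effect in the current shell (or log out and back in):
source /etc/profile
echo $JAVA_HOME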
c. Verify the Java installation
java -version
java version "1.8.0_71"
Java(TM) SE Runtime Environment (build 1.8.0_71-b15)
Java HotSpot(TM) 64-Bit Server VM (build 25.71-b15, mixed mode)
3. Passwordless SSH login
a. Generate an SSH key
On node 93, run:
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys
scp authorized_keys root@192.168.0.94:/root/.ssh
scp authorized_keys root@192.168.0.103:/root/.ssh
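The scp commands assume /root/.ssh already exists on 94 and 103 with correct permissions; as an alternative, ssh-copy-id creates the directory and sets permissions in one step (password authentication must still be enabled at this point):
ssh-copy-id -i ~/.ssh/id_dsa.pub root@192.168.0.94
ssh-copy-id -i ~/.ssh/id_dsa.pub root@192.168.0.103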
b. Test the logins (each should connect without a password prompt)
ssh lsvr93
ssh lsvr94
ssh lsvr103
4. Install Hadoop 2.7.2
Perform the same steps on all three nodes (93, 94, 103).
a. Unpack the archive
cd /usr/local/src
tar zxvf hadoop-2.7.2.tar.gz
mv hadoop-2.7.2 /opt
b. Create directories
mkdir -p /opt/hadoop/tmp
mkdir -p /opt/hadoop/hdfs/datanode
mkdir -p /opt/hadoop/hdfs/namenode
c. Set environment variables
vi /etc/profile
#hadoop
export HADOOP_HOME=/opt/hadoop-2.7.2
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PATH=$HADOOP_HOME/bin:$PATH
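Reload the profile and confirm the hadoop binary resolves:
source /etc/profile
hadoop version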
d. Edit the /etc/hosts file
vi /etc/hosts
192.168.0.93 lsvr93
192.168.0.94 lsvr94
192.168.0.103 lsvr103
e. Edit the hadoop-env.sh configuration file
vi /opt/hadoop-2.7.2/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_71
f. Edit the yarn-env.sh configuration file
vi /opt/hadoop-2.7.2/etc/hadoop/yarn-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_71
g. Edit the core-site.xml configuration file
vi /opt/hadoop-2.7.2/etc/hadoop/core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://lsvr93:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
</configuration>
fs.defaultFS: URI (host and port) of the NameNode
io.file.buffer.size: buffer size for Hadoop file reads and writes (128 KB)
hadoop.tmp.dir: base directory for temporary files
h. Edit the hdfs-site.xml configuration file
vi /opt/hadoop-2.7.2/etc/hadoop/hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/opt/hadoop/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/opt/hadoop/hdfs/datanode</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>lsvr93:9001</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>
dfs.namenode.name.dir: local directory where the NameNode stores the namespace image and edit log
dfs.datanode.data.dir: local directory where the DataNode stores its blocks
dfs.replication: number of block replicas (2 here, matching the two DataNodes)
dfs.namenode.secondary.http-address: HTTP address (host:port) of the Secondary NameNode, a checkpointing helper rather than a standby
dfs.webhdfs.enabled: enable WebHDFS (REST API) on the NameNode and DataNodes
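After saving the file, hdfs getconf is a quick way to confirm that a value was picked up:
/opt/hadoop-2.7.2/bin/hdfs getconf -confKey dfs.replication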
i. Edit the mapred-site.xml configuration file (the distribution ships only a template, so copy it first)
cp /opt/hadoop-2.7.2/etc/hadoop/mapred-site.xml.template /opt/hadoop-2.7.2/etc/hadoop/mapred-site.xml
vi /opt/hadoop-2.7.2/etc/hadoop/mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
mapreduce.framework.name: run MapReduce jobs on the YARN framework
j. Edit the yarn-site.xml configuration file
vi /opt/hadoop-2.7.2/etc/hadoop/yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>lsvr93:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>lsvr93:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>lsvr93:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>lsvr93:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>lsvr93:8088</value>
</property>
</configuration>
yarn.nodemanager.aux-services: auxiliary service implementing the MapReduce shuffle
yarn.nodemanager.aux-services.mapreduce_shuffle.class: implementation class of the shuffle service
yarn.resourcemanager.address: ResourceManager host and port that clients use to submit jobs
yarn.resourcemanager.scheduler.address: host and port that ApplicationMasters use to request and release resources
yarn.resourcemanager.resource-tracker.address: host and port that NodeManagers report to
yarn.resourcemanager.admin.address: host and port for administrative commands
yarn.resourcemanager.webapp.address: host and port of the ResourceManager web UI
k. Edit the slaves configuration file
vi /opt/hadoop-2.7.2/etc/hadoop/slaves
lsvr94
lsvr103
l. Edit the masters configuration file
vi /opt/hadoop-2.7.2/etc/hadoop/masters
lsvr93
m. Format the NameNode
cd /opt/hadoop-2.7.2
bin/hdfs namenode -format
Steps a through l must be performed identically on every node; run this format command only on the NameNode (93)!
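Rather than editing every file three times, one shortcut is to configure node 93 and push the finished configuration directory to the other nodes (this assumes the archive was already unpacked to the same path on 94 and 103):
scp -r /opt/hadoop-2.7.2/etc/hadoop root@192.168.0.94:/opt/hadoop-2.7.2/etc/
scp -r /opt/hadoop-2.7.2/etc/hadoop root@192.168.0.103:/opt/hadoop-2.7.2/etc/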
n. Start the Hadoop daemons
On node 93, run:
cd /opt/hadoop-2.7.2
sbin/start-dfs.sh
sbin/start-yarn.sh
o. Verify the processes on each machine by running jps:
On node 93 (the master), expect: NameNode, SecondaryNameNode, ResourceManager
On node 94, expect: DataNode, NodeManager
On node 103, expect: DataNode, NodeManager
p. Check the HDFS cluster status in a browser
http://192.168.0.93:50070
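The same information is available on the command line:
bin/hdfs dfsadmin -report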
q. Check the YARN cluster status in a browser
http://192.168.0.93:8088
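From the command line, confirm that both NodeManagers have registered:
bin/yarn node -list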
r. Stop the cluster
sbin/stop-all.sh
(stop-all.sh is deprecated in 2.x; running sbin/stop-yarn.sh followed by sbin/stop-dfs.sh is equivalent)
s. Run a test job
cd /opt/hadoop-2.7.2
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar pi 20 10
This estimates the value of pi using 20 map tasks with 10 samples each.
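As an additional sanity check, a simple HDFS round trip (the /test path is an arbitrary example):
bin/hdfs dfs -mkdir -p /test
bin/hdfs dfs -put etc/hadoop/core-site.xml /test/
bin/hdfs dfs -ls /test
bin/hdfs dfs -cat /test/core-site.xml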