Hadoop Cluster Deployment
Environment
Host List
IP Address | Hostname | OS
---|---|---
192.168.159.130 | node01 | Debian 9.6
192.168.159.131 | node02 | Debian 9.6
192.168.159.132 | node03 | Debian 9.6
Note: the hostnames must not contain `-` or `_`.
JDK
Because Oracle changed Java's licensing model, OpenJDK is used instead:
openjdk version "11.0.1" 2018-10-16
OpenJDK Runtime Environment 18.9 (build 11.0.1+13)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.1+13, mixed mode)
Hadoop
Hadoop version: hadoop-2.9.2
Java Environment
Run the following on all three hosts:
mkdir -p ~/Downloads
cd ~/Downloads
wget https://download.java.net/java/GA/jdk11/13/GPL/openjdk-11.0.1_linux-x64_bin.tar.gz
sudo tar xf openjdk-11.0.1_linux-x64_bin.tar.gz -C /usr/local/
Edit the `/etc/profile` file, append the following, then log out and log back in:
export JAVA_HOME=/usr/local/jdk-11.0.1
export CLASSPATH=.:$JAVA_HOME/lib
export PATH=$JAVA_HOME/bin:$PATH
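To check the result without logging out, you can source the profile in the current shell (this only affects the current session):
source /etc/profile
java -version # should print the OpenJDK 11.0.1 banner shown above
echo $JAVA_HOME # should print /usr/local/jdk-11.0.1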
Edit hosts
Run the following on all three hosts:
sudo vim /etc/hosts
Append the following:
192.168.159.130 node01
192.168.159.131 node02
192.168.159.132 node03
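A quick optional check that the names now resolve:
ping -c 1 node01
ping -c 1 node02
ping -c 1 node03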
Install the SSH Service and Configure SSH Key Login
Install the SSH service
Run the following on all three hosts:
sudo apt-get update
sudo apt-get install openssh-server # usually already installed
ssh-keygen -t rsa -P "" # press Enter once to confirm the key path
Distribute the public keys
On each host, use `scp` to copy the generated public key to the other two machines. In this example:
node01
scp ~/.ssh/id_rsa.pub cloud@node02:~/.ssh/id_rsa_master.pub
scp ~/.ssh/id_rsa.pub cloud@node03:~/.ssh/id_rsa_master.pub
node02
scp ~/.ssh/id_rsa.pub cloud@node01:~/.ssh/id_rsa_slave_01.pub
scp ~/.ssh/id_rsa.pub cloud@node03:~/.ssh/id_rsa_slave_01.pub
node03
scp ~/.ssh/id_rsa.pub cloud@node01:~/.ssh/id_rsa_slave_02.pub
scp ~/.ssh/id_rsa.pub cloud@node02:~/.ssh/id_rsa_slave_02.pub
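As an alternative to the manual copy-and-append below, `ssh-copy-id` (shipped with OpenSSH on Debian) does both steps at once; for example, on node01:
ssh-copy-id cloud@node02 # appends your public key to the remote ~/.ssh/authorized_keys
ssh-copy-id cloud@node03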
Generate authorized_keys
Run the following on each host:
node01
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
cat ~/.ssh/id_rsa_slave_01.pub >> ~/.ssh/authorized_keys
cat ~/.ssh/id_rsa_slave_02.pub >> ~/.ssh/authorized_keys
node02
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
cat ~/.ssh/id_rsa_master.pub >> ~/.ssh/authorized_keys
cat ~/.ssh/id_rsa_slave_02.pub >> ~/.ssh/authorized_keys
node03
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
cat ~/.ssh/id_rsa_master.pub >> ~/.ssh/authorized_keys
cat ~/.ssh/id_rsa_slave_01.pub >> ~/.ssh/authorized_keys
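Note that sshd rejects keys whose permissions are too open (StrictModes is enabled by default), so tighten them on each host:
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys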
Test SSH key login
node01
ssh node01
ssh node02
ssh node03
node02
ssh node01
ssh node02
ssh node03
node03
ssh node01
ssh node02
ssh node03
Each successful login opens a shell on the target host; type `exit` to return before trying the next one. The first time you connect, you will typically see:
The authenticity of host 'node0x (192.168.159.13x)' can't be established.
ECDSA key fingerprint is SHA256:YX+xBXQ615FPJ/Fzn051BbaelveeLC5Y+B91AjeYgLk.
Are you sure you want to continue connecting (yes/no)?
Type yes and press Enter.
Download Hadoop and Configure Environment Variables
Run the following on all three hosts. (The download itself can be done once on one host and the tarball copied to the others with `scp`.)
cd ~/Downloads
wget http://mirrors.shu.edu.cn/apache/hadoop/common/hadoop-2.9.2/hadoop-2.9.2.tar.gz
sudo mv hadoop-2.9.2.tar.gz /opt/
sudo tar xf /opt/hadoop-2.9.2.tar.gz -C /opt/
sudo chown -R <user>:<group> /opt/hadoop-2.9.2/ # replace with your own user and group
Edit the `/etc/profile` file, append the following, then log out and log back in:
export HADOOP_HOME=/opt/hadoop-2.9.2
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
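After logging back in, a quick check that the PATH change took effect:
hadoop version # should report Hadoop 2.9.2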
Edit the Configuration Files
These changes are needed on all three hosts. You can make them on one host and push the files to the others with `scp` afterwards (see "Distribute the configuration" below).
core-site.xml
vim /opt/hadoop-2.9.2/etc/hadoop/core-site.xml
Add the following inside the `configuration` element:
<property>
    <name>hadoop.tmp.dir</name>
    <value>file:/data/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
</property>
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://node01:9000</value>
</property>
`fs.defaultFS` specifies the address of the HDFS NameNode; `hadoop.tmp.dir` specifies the base directory for files that Hadoop generates at runtime.
hadoop-env.sh
vim /opt/hadoop-2.9.2/etc/hadoop/hadoop-env.sh
Comment out `export JAVA_HOME=${JAVA_HOME}` (around line 26) and add `export JAVA_HOME=/usr/local/jdk-11.0.1` below it:
# The java implementation to use.
# export JAVA_HOME=${JAVA_HOME}
export JAVA_HOME=/usr/local/jdk-11.0.1
hdfs-site.xml
vim /opt/hadoop-2.9.2/etc/hadoop/hdfs-site.xml
Add the following inside the `configuration` element:
<property>
    <name>dfs.replication</name>
    <value>2</value>
</property>
<property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/data/hadoop/name</value>
</property>
<property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/data/hadoop/data</value>
</property>
`dfs.replication` sets the HDFS replication factor; if unset, the default is 3.
slaves
vim /opt/hadoop-2.9.2/etc/hadoop/slaves
Write the following:
node02
node03
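The `slaves` file lists the hosts that the start scripts will SSH into to launch the DataNode and NodeManager daemons.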
mapred-site.xml
cp /opt/hadoop-2.9.2/etc/hadoop/mapred-site.xml.template /opt/hadoop-2.9.2/etc/hadoop/mapred-site.xml
vim /opt/hadoop-2.9.2/etc/hadoop/mapred-site.xml
Add the following inside the `configuration` element:
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
`mapreduce.framework.name` makes MapReduce jobs run on YARN.
yarn-env.sh
vim /opt/hadoop-2.9.2/etc/hadoop/yarn-env.sh
Below `# export JAVA_HOME=/home/y/libexec/jdk1.6.0/` (around line 24), add `export JAVA_HOME=/usr/local/jdk-11.0.1`:
# some Java parameters
# export JAVA_HOME=/home/y/libexec/jdk1.6.0/
export JAVA_HOME=/usr/local/jdk-11.0.1
if [ "$JAVA_HOME" != "" ]; then
#echo "run java in $JAVA_HOME"
JAVA_HOME=$JAVA_HOME
fi
yarn-site.xml
vim /opt/hadoop-2.9.2/etc/hadoop/yarn-site.xml
Add the following inside the `configuration` element:
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>node01</value>
</property>
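`yarn.nodemanager.aux-services` enables the shuffle service that MapReduce depends on; `yarn.resourcemanager.hostname` tells every node that the ResourceManager runs on node01.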
Distribute the configuration
scp /opt/hadoop-2.9.2/etc/hadoop/core-site.xml cloud@node02:/opt/hadoop-2.9.2/etc/hadoop/core-site.xml
scp /opt/hadoop-2.9.2/etc/hadoop/core-site.xml cloud@node03:/opt/hadoop-2.9.2/etc/hadoop/core-site.xml
scp /opt/hadoop-2.9.2/etc/hadoop/hadoop-env.sh cloud@node02:/opt/hadoop-2.9.2/etc/hadoop/hadoop-env.sh
scp /opt/hadoop-2.9.2/etc/hadoop/hadoop-env.sh cloud@node03:/opt/hadoop-2.9.2/etc/hadoop/hadoop-env.sh
scp /opt/hadoop-2.9.2/etc/hadoop/hdfs-site.xml cloud@node02:/opt/hadoop-2.9.2/etc/hadoop/hdfs-site.xml
scp /opt/hadoop-2.9.2/etc/hadoop/hdfs-site.xml cloud@node03:/opt/hadoop-2.9.2/etc/hadoop/hdfs-site.xml
scp /opt/hadoop-2.9.2/etc/hadoop/slaves cloud@node02:/opt/hadoop-2.9.2/etc/hadoop/slaves
scp /opt/hadoop-2.9.2/etc/hadoop/slaves cloud@node03:/opt/hadoop-2.9.2/etc/hadoop/slaves
scp /opt/hadoop-2.9.2/etc/hadoop/mapred-site.xml cloud@node02:/opt/hadoop-2.9.2/etc/hadoop/mapred-site.xml
scp /opt/hadoop-2.9.2/etc/hadoop/mapred-site.xml cloud@node03:/opt/hadoop-2.9.2/etc/hadoop/mapred-site.xml
scp /opt/hadoop-2.9.2/etc/hadoop/yarn-env.sh cloud@node02:/opt/hadoop-2.9.2/etc/hadoop/yarn-env.sh
scp /opt/hadoop-2.9.2/etc/hadoop/yarn-env.sh cloud@node03:/opt/hadoop-2.9.2/etc/hadoop/yarn-env.sh
scp /opt/hadoop-2.9.2/etc/hadoop/yarn-site.xml cloud@node02:/opt/hadoop-2.9.2/etc/hadoop/yarn-site.xml
scp /opt/hadoop-2.9.2/etc/hadoop/yarn-site.xml cloud@node03:/opt/hadoop-2.9.2/etc/hadoop/yarn-site.xml
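The same distribution can also be written as a loop (a sketch assuming the same paths and the `cloud` user on every node):
for host in node02 node03; do
  for f in core-site.xml hadoop-env.sh hdfs-site.xml slaves mapred-site.xml yarn-env.sh yarn-site.xml; do
    scp /opt/hadoop-2.9.2/etc/hadoop/$f cloud@$host:/opt/hadoop-2.9.2/etc/hadoop/$f
  done
done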
Start the Cluster
Create the HDFS directories
Run the following on all three hosts:
mkdir -p /data/hadoop/tmp
mkdir -p /data/hadoop/data
mkdir -p /data/hadoop/name
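If /data does not exist or is owned by root, the mkdir calls above will fail; create it with sudo first and hand it to your user (replace cloud:cloud with your actual user and group):
sudo mkdir -p /data/hadoop
sudo chown -R cloud:cloud /data/hadoop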
Format the NameNode
Run the following on node01 (the NameNode):
hdfs namenode -format
If this throws an exception, track down and fix the error before continuing.
Then run the following on node01 (start-yarn.sh starts the ResourceManager on the host it is invoked from):
start-dfs.sh
start-yarn.sh
Output
start-dfs.sh
$ start-dfs.sh
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.hadoop.security.authentication.util.KerberosUtil (file:/opt/hadoop-2.9.2/share/hadoop/common/lib/hadoop-auth-2.9.2.jar) to method sun.security.krb5.Config.getInstance()
WARNING: Please consider reporting this to the maintainers of org.apache.hadoop.security.authentication.util.KerberosUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Starting namenodes on [node01]
node01: starting namenode, logging to /opt/hadoop-2.9.2/logs/hadoop-cloud-namenode-node01.out
node03: starting datanode, logging to /opt/hadoop-2.9.2/logs/hadoop-cloud-datanode-node03.out
node02: starting datanode, logging to /opt/hadoop-2.9.2/logs/hadoop-cloud-datanode-node02.out
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
ECDSA key fingerprint is SHA256:YX+xBXQ615FPJ/Fzn051BbaelveeLC5Y+B91AjeYgLk.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /opt/hadoop-2.9.2/logs/hadoop-cloud-secondarynamenode-node01.out
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.hadoop.security.authentication.util.KerberosUtil (file:/opt/hadoop-2.9.2/share/hadoop/common/lib/hadoop-auth-2.9.2.jar) to method sun.security.krb5.Config.getInstance()
WARNING: Please consider reporting this to the maintainers of org.apache.hadoop.security.authentication.util.KerberosUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
If a host-authenticity prompt like the one above appears, type yes.
start-yarn.sh
$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /opt/hadoop-2.9.2/logs/yarn-cloud-resourcemanager-node01.out
node02: starting nodemanager, logging to /opt/hadoop-2.9.2/logs/yarn-cloud-nodemanager-node02.out
node03: starting nodemanager, logging to /opt/hadoop-2.9.2/logs/yarn-cloud-nodemanager-node03.out
Test
You can check the cluster status by pointing a browser at port 50070 on the master, i.e. http://node01:50070 (192.168.159.130).
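The cluster can also be checked from the shell:
jps # on each host; node01 should list NameNode, SecondaryNameNode and ResourceManager, the slaves DataNode and NodeManager
hdfs dfsadmin -report # prints the live DataNodes and their capacity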
Stop the Cluster
Run the following on node01 (the same host where the daemons were started):
stop-yarn.sh
stop-dfs.sh
Output
stop-yarn.sh
$ stop-yarn.sh
stop-dfs.sh
$ stop-dfs.sh
The warnings above are caused by running a JDK that is much newer than Hadoop 2.9.2 expects. I have not dug into the exact mechanism; they do not affect normal operation, they just look unpleasant. If they bother you, downgrading the JDK to 1.8 makes them go away.