A Complete Guide to Setting Up a Hadoop 3.x Cluster

1. Environment Preparation

1.1 Software Version Requirements

  • JDK: JDK 8 or JDK 11 recommended (supported by Hadoop 3.3+)
  • Hadoop: any 3.x release (this guide uses 3.3.4)
  • OS: CentOS 7+ / Ubuntu 18.04+

1.2 System Configuration

Firewall configuration (recommended for production)

# Open the ports Hadoop needs (these match the configuration files later in this guide)

sudo firewall-cmd --permanent --add-port=9000/tcp # NameNode RPC (fs.defaultFS in this guide)

sudo firewall-cmd --permanent --add-port=8020/tcp # NameNode RPC (the Hadoop default port)

sudo firewall-cmd --permanent --add-port=9870/tcp # NameNode web UI

sudo firewall-cmd --permanent --add-port=9868/tcp # SecondaryNameNode HTTP

sudo firewall-cmd --permanent --add-port=9866/tcp # DataNode data transfer

sudo firewall-cmd --permanent --add-port=9864/tcp # DataNode HTTP

sudo firewall-cmd --permanent --add-port=8088/tcp # YARN ResourceManager web UI

sudo firewall-cmd --permanent --add-port=8030-8033/tcp # YARN ResourceManager RPC ports

sudo firewall-cmd --permanent --add-port=10020/tcp # MapReduce JobHistory RPC

sudo firewall-cmd --permanent --add-port=19888/tcp # MapReduce JobHistory web UI

sudo firewall-cmd --reload

# Or disable the firewall entirely (test environments only)

sudo systemctl stop firewalld

sudo systemctl disable firewalld

Hostnames and name resolution

# Configure /etc/hosts on every node

sudo vi /etc/hosts

# Append the following (adjust the IPs to your network)

192.168.1.10 master

192.168.1.11 slave1

192.168.1.12 slave2

# Set the hostname (run the matching command on each node)

sudo hostnamectl set-hostname master # on the master node

sudo hostnamectl set-hostname slave1 # on the slave1 node

sudo hostnamectl set-hostname slave2 # on the slave2 node
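Before moving on, it is worth confirming that every expected hostname is actually present in the hosts file; a missing or mistyped entry is one of the most common causes of cluster startup failures. A minimal sketch (the function name and the throwaway test file are illustrative, not part of Hadoop):

```shell
#!/bin/sh
# check_hosts FILE NAME...: verify that each NAME appears as a host entry in
# FILE (hosts format), ignoring comment lines. Prints any missing names and
# returns non-zero if at least one is absent.
check_hosts() {
    file=$1; shift
    missing=0
    for name in "$@"; do
        if ! grep -v '^[[:space:]]*#' "$file" | grep -qw "$name"; then
            echo "missing: $name"
            missing=1
        fi
    done
    return $missing
}

# Demo against a throwaway file; in real use, point it at /etc/hosts:
cat > /tmp/hosts.test <<'EOF'
192.168.1.10 master
192.168.1.11 slave1
192.168.1.12 slave2
EOF
check_hosts /tmp/hosts.test master slave1 slave2 && echo "all hosts present"
```

Running the same check on every node (e.g. in a loop over `master slave1 slave2`) catches asymmetric /etc/hosts files, where node A can resolve B but not vice versa.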

1.3 Create a Dedicated Hadoop User

# Create the hadoop group and user

sudo groupadd hadoop

sudo useradd -g hadoop hadoop

sudo passwd hadoop # set a password

# Grant sudo privileges

sudo visudo

# Append at the end of the file:

hadoop ALL=(ALL) NOPASSWD:ALL

2. Passwordless SSH Setup

2.1 Generate SSH Key Pairs

Run on every node:

su - hadoop

ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa

2.2 Establish Trust Between Nodes

On the master node:

# Push the master node's public key to every node (including itself)

ssh-copy-id hadoop@master

ssh-copy-id hadoop@slave1

ssh-copy-id hadoop@slave2

# Or, manually:

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

scp hadoop@slave1:~/.ssh/id_rsa.pub ~/.ssh/slave1.pub

scp hadoop@slave2:~/.ssh/id_rsa.pub ~/.ssh/slave2.pub

cat ~/.ssh/slave1.pub >> ~/.ssh/authorized_keys

cat ~/.ssh/slave2.pub >> ~/.ssh/authorized_keys

# Restrict permissions

chmod 600 ~/.ssh/authorized_keys

# Distribute the merged authorized_keys file to the slaves

scp ~/.ssh/authorized_keys hadoop@slave1:~/.ssh/

scp ~/.ssh/authorized_keys hadoop@slave2:~/.ssh/

2.3 Verify SSH Connectivity

ssh master "hostname" # should print: master

ssh slave1 "hostname" # should print: slave1

ssh slave2 "hostname" # should print: slave2
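If SSH still prompts for a password after the steps above, the usual culprit is overly permissive file modes: sshd silently ignores authorized_keys unless `~/.ssh` is 700 and `authorized_keys` is 600. A small sketch that checks this (the function name is illustrative; the demo uses a throwaway directory instead of the real `~/.ssh`):

```shell
#!/bin/sh
# check_ssh_perms DIR: succeed only if DIR has mode 700 and
# DIR/authorized_keys has mode 600 (the modes sshd requires by default).
check_ssh_perms() {
    dir=$1
    d=$(stat -c '%a' "$dir") || return 1
    f=$(stat -c '%a' "$dir/authorized_keys") || return 1
    [ "$d" = "700" ] && [ "$f" = "600" ]
}

# Demo with a throwaway directory; in real use, run check_ssh_perms ~/.ssh
mkdir -p /tmp/sshtest && chmod 700 /tmp/sshtest
touch /tmp/sshtest/authorized_keys && chmod 600 /tmp/sshtest/authorized_keys
check_ssh_perms /tmp/sshtest && echo "permissions OK"
```

Run it on each node after distributing authorized_keys, since scp preserves the source mode but a manually created `~/.ssh` directory may not have the right one.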

3. Installing and Configuring Hadoop 3.x

3.1 Prepare the Installation Directory

# Create the installation directory

sudo mkdir -p /opt/hadoop

sudo chown -R hadoop:hadoop /opt/hadoop

# Switch to the hadoop user

su - hadoop

cd /opt/hadoop

# Download Hadoop 3.3.4 (or use a tarball downloaded in advance)

wget https://archive.apache.org/dist/hadoop/common/hadoop-3.3.4/hadoop-3.3.4.tar.gz

tar -xzf hadoop-3.3.4.tar.gz

ln -s hadoop-3.3.4 hadoop # version-independent symlink

3.2 Environment Variables

Edit the ~/.bashrc file:

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk # adjust to your actual JDK path

export HADOOP_HOME=/opt/hadoop/hadoop

export HADOOP_INSTALL=$HADOOP_HOME

export HADOOP_MAPRED_HOME=$HADOOP_HOME

export HADOOP_COMMON_HOME=$HADOOP_HOME

export HADOOP_HDFS_HOME=$HADOOP_HOME

export YARN_HOME=$HADOOP_HOME

export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native

export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"

export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

Apply the changes: source ~/.bashrc
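After sourcing, it helps to confirm that HADOOP_HOME is set and that both of its executable directories actually landed on PATH, since a typo here produces confusing "command not found" errors much later. A minimal sanity check (the function name is an illustrative choice, not a Hadoop tool):

```shell
#!/bin/sh
# check_hadoop_env: succeed only if HADOOP_HOME is set and both
# $HADOOP_HOME/bin and $HADOOP_HOME/sbin appear as PATH components.
check_hadoop_env() {
    [ -n "$HADOOP_HOME" ] || { echo "HADOOP_HOME not set"; return 1; }
    case ":$PATH:" in
        *":$HADOOP_HOME/bin:"*) : ;;
        *) echo "HADOOP_HOME/bin not on PATH"; return 1 ;;
    esac
    case ":$PATH:" in
        *":$HADOOP_HOME/sbin:"*) : ;;
        *) echo "HADOOP_HOME/sbin not on PATH"; return 1 ;;
    esac
}

# Example with the paths used in this guide:
export HADOOP_HOME=/opt/hadoop/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
check_hadoop_env && echo "environment looks sane"
```

On a configured node, `hadoop version` is the definitive end-to-end check.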

3.3 Hadoop Configuration Files

core-site.xml

<?xml version="1.0" encoding="UTF-8"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>

<property>

<name>fs.defaultFS</name>

<value>hdfs://master:9000</value>

</property>

<property>

<name>hadoop.tmp.dir</name>

<value>/opt/hadoop/tmp</value>

</property>

<property>

<name>io.file.buffer.size</name>

<value>131072</value>

</property>

</configuration>

hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>

<property>

<name>dfs.replication</name>

<value>3</value>

</property>

<property>

<name>dfs.namenode.name.dir</name>

<value>file:///opt/hadoop/dfs/name</value>

</property>

<property>

<name>dfs.datanode.data.dir</name>

<value>file:///opt/hadoop/dfs/data</value>

</property>

<property>

<name>dfs.namenode.http-address</name>

<value>master:9870</value>

</property>

<property>

<name>dfs.namenode.secondary.http-address</name>

<value>master:9868</value>

</property>

<property>

<name>dfs.datanode.address</name>

<value>0.0.0.0:9866</value>

</property>

<property>

<name>dfs.datanode.http.address</name>

<value>0.0.0.0:9864</value>

</property>

</configuration>

mapred-site.xml

<?xml version="1.0" encoding="UTF-8"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>

<property>

<name>mapreduce.framework.name</name>

<value>yarn</value>

</property>

<property>

<name>mapreduce.jobhistory.address</name>

<value>master:10020</value>

</property>

<property>

<name>mapreduce.jobhistory.webapp.address</name>

<value>master:19888</value>

</property>

<property>

<name>yarn.app.mapreduce.am.env</name>

<value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>

</property>

<property>

<name>mapreduce.map.env</name>

<value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>

</property>

<property>

<name>mapreduce.reduce.env</name>

<value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>

</property>

</configuration>

yarn-site.xml

<?xml version="1.0" encoding="UTF-8"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>

<property>

<name>yarn.nodemanager.aux-services</name>

<value>mapreduce_shuffle</value>

</property>

<property>

<name>yarn.nodemanager.env-whitelist</name>

<value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_HOME,PATH,LANG,TZ</value>

</property>

<property>

<name>yarn.resourcemanager.hostname</name>

<value>master</value>

</property>

<property>

<name>yarn.resourcemanager.webapp.address</name>

<value>master:8088</value>

</property>

<property>

<name>yarn.nodemanager.resource.memory-mb</name>

<value>8192</value>

</property>

<property>

<name>yarn.scheduler.maximum-allocation-mb</name>

<value>8192</value>

</property>

<property>

<name>yarn.nodemanager.resource.cpu-vcores</name>

<value>4</value>

</property>

</configuration>
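The fixed values above (8192 MB, 4 vcores) should be adjusted to the actual hardware: YARN needs memory left over for the OS, the DataNode, and the NodeManager itself. A minimal sizing sketch; the heuristic (reserve 25%, capped at 8192 MB, floor 1024 MB) is an assumption for illustration, not an official Hadoop rule:

```shell
#!/bin/sh
# yarn_node_memory_mb TOTAL_MB: suggest a yarn.nodemanager.resource.memory-mb
# value by subtracting a reservation for the OS and Hadoop daemons.
yarn_node_memory_mb() {
    total_mb=$1
    reserve=$(( total_mb / 4 ))           # reserve 25% for OS + daemons...
    [ "$reserve" -gt 8192 ] && reserve=8192   # ...but never more than 8 GB
    [ "$reserve" -lt 1024 ] && reserve=1024   # ...and never less than 1 GB
    echo $(( total_mb - reserve ))
}

yarn_node_memory_mb 16384   # suggestion for a 16 GB node
```

Whatever value you choose, keep `yarn.scheduler.maximum-allocation-mb` less than or equal to `yarn.nodemanager.resource.memory-mb`, or single containers can request more memory than any node offers.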

workers file (in Hadoop 3.x this replaces the old slaves file)

# Edit $HADOOP_HOME/etc/hadoop/workers

master

slave1

slave2

hadoop-env.sh

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk

export HADOOP_HOME=/opt/hadoop/hadoop

export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop

export HADOOP_LOG_DIR=$HADOOP_HOME/logs

export HADOOP_PID_DIR=$HADOOP_HOME/pids

3.4 Create the Required Directory Structure

mkdir -p /opt/hadoop/{tmp,dfs/{name,data},logs,pids}

chmod -R 755 /opt/hadoop

4. Cluster Deployment

4.1 Distribute Hadoop to All Nodes

On the master node:

cd /opt

# /opt/hadoop must already exist on each slave and be owned by hadoop (as in 3.1),
# or the copy will fail with a permission error
scp -r hadoop hadoop@slave1:/opt/

scp -r hadoop hadoop@slave2:/opt/

# Distribute the environment settings

scp ~/.bashrc hadoop@slave1:~/

scp ~/.bashrc hadoop@slave2:~/

# Create the data directories on each node

ssh slave1 "mkdir -p /opt/hadoop/{tmp,dfs/{name,data},logs,pids}"

ssh slave2 "mkdir -p /opt/hadoop/{tmp,dfs/{name,data},logs,pids}"

4.2 Format HDFS

Run this only once, before the very first startup: hdfs namenode -format
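The "only once" warning matters because re-running the format generates a fresh clusterID in the NameNode's VERSION file, after which DataNodes holding the old ID refuse to start (the classic "Incompatible clusterIDs" error). A small sketch for comparing the IDs; the function name is illustrative, but the `clusterID=` line in `current/VERSION` is standard HDFS metadata:

```shell
#!/bin/sh
# same_cluster_id FILE1 FILE2: succeed only if both VERSION files carry
# the same non-empty clusterID= entry.
same_cluster_id() {
    a=$(sed -n 's/^clusterID=//p' "$1")
    b=$(sed -n 's/^clusterID=//p' "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}
# Real usage, with the dfs directories configured in this guide:
#   same_cluster_id /opt/hadoop/dfs/name/current/VERSION \
#                   /opt/hadoop/dfs/data/current/VERSION

# Demo with throwaway files:
printf 'clusterID=CID-1234\n' > /tmp/nn.VERSION
printf 'clusterID=CID-1234\n' > /tmp/dn.VERSION
same_cluster_id /tmp/nn.VERSION /tmp/dn.VERSION && echo "clusterIDs match"
```

If the IDs ever diverge on a test cluster, the blunt fix is to stop everything, wipe the name and data directories on all nodes, and format again (destroying all HDFS data).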

4.3 Start the Cluster

# Start HDFS

start-dfs.sh

# Start YARN

start-yarn.sh

# Start the MapReduce JobHistory server

mapred --daemon start historyserver

4.4 Verify Cluster Status

# Check the Java processes

jps

# master should show: NameNode, SecondaryNameNode, ResourceManager, JobHistoryServer
# (plus DataNode and NodeManager, since master is also listed in the workers file)

# slaves should show: DataNode, NodeManager

# Check HDFS status

hdfs dfsadmin -report

# Check YARN status

yarn node -list
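Eyeballing `jps` on each node gets tedious on anything bigger than three machines. A sketch of a reusable check that validates a jps listing against a list of required daemons (the function name and sample output are illustrative; on a live node you would feed it `$(jps)`):

```shell
#!/bin/sh
# check_daemons "JPS_OUTPUT" NAME...: print every daemon from NAME... that
# does not appear in the jps listing; return non-zero if any is missing.
check_daemons() {
    out=$1; shift
    rc=0
    for d in "$@"; do
        printf '%s\n' "$out" | grep -qw "$d" || { echo "missing: $d"; rc=1; }
    done
    return $rc
}

# Example with output shaped like jps on the master node:
sample='12001 NameNode
12002 SecondaryNameNode
12003 ResourceManager
12004 Jps'
check_daemons "$sample" NameNode SecondaryNameNode ResourceManager \
    && echo "master daemons OK"
```

Combined with the passwordless SSH from section 2, the same function can audit the slaves remotely, e.g. `check_daemons "$(ssh slave1 jps)" DataNode NodeManager`.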

5. Functional Tests

5.1 HDFS Operations

# Create a directory

hdfs dfs -mkdir -p /test/input

# Upload a test file

echo "Hello Hadoop 3.x World" > test.txt

hdfs dfs -put test.txt /test/input/

# Inspect the file

hdfs dfs -ls /test/input

hdfs dfs -cat /test/input/test.txt

5.2 MapReduce Job Test

# Run the WordCount example

hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.4.jar wordcount /test/input /test/output

# View the result

hdfs dfs -cat /test/output/part-r-00000

5.3 Web UIs

  • HDFS NameNode: http://master:9870
  • YARN ResourceManager: http://master:8088
  • MapReduce JobHistory: http://master:19888

6. Troubleshooting and Maintenance

6.1 Common Diagnostic Commands

# Tail the daemon logs

tail -f $HADOOP_HOME/logs/hadoop-*-namenode-*.log

tail -f $HADOOP_HOME/logs/hadoop-*-datanode-*.log

# Check which ports the Java processes are listening on

netstat -tulpn | grep java

# Check HDFS capacity and usage

hdfs dfs -df -h
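For scripted monitoring, the `hdfs dfsadmin -report` output from section 4.4 can be parsed rather than read by hand; its summary includes a `Live datanodes (N):` line in Hadoop 3.x. A sketch that extracts the count (the function name is illustrative, and the abbreviated report below is sample text, not captured output):

```shell
#!/bin/sh
# live_datanodes: read an `hdfs dfsadmin -report` listing on stdin and
# print the number from its "Live datanodes (N):" summary line.
live_datanodes() {
    sed -n 's/^Live datanodes (\([0-9]*\)).*/\1/p'
}

# Demo with an abbreviated report; real usage:
#   hdfs dfsadmin -report | live_datanodes
report='Configured Capacity: 161061273600 (150 GB)
Live datanodes (3):
Name: 192.168.1.10:9866 (master)'
printf '%s\n' "$report" | live_datanodes
```

Comparing the extracted count against the number of entries in the workers file is a quick cron-friendly health check.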

6.2 Service Management Scripts

# Start the whole cluster (deprecated in 3.x; prefer start-dfs.sh + start-yarn.sh)

start-all.sh

# Stop the whole cluster

stop-all.sh

# Start/stop individual daemons

hdfs --daemon start|stop namenode|datanode|secondarynamenode

yarn --daemon start|stop resourcemanager|nodemanager

6.3 Common Problems

  1. Permissions: make sure every Hadoop directory is owned by the hadoop user
  2. Port conflicts: check whether another service is already using the port
  3. Insufficient memory: tune the memory settings in yarn-site.xml to match the machine
  4. Hostname resolution: make sure every node can resolve every other node's hostname