---
title: Setting Up the Base Environment for a Hadoop Cluster
top_img: /img/site01.jpg
top_img_height: 800px
date: 2024-08-01 09:10:40
tags: hadoop
---

### Disable the firewall

```bash
# Run on all 6 hosts
systemctl stop firewalld
systemctl disable firewalld
```

### Configure the yum repository

- Download the repo file [Centos-7.repo](http://mirrors.aliyun.com/repo/Centos-7.repo), upload it to `/tmp`, then change into `/tmp`.
- Back up the system repo file and replace it:

```bash
cp Centos-7.repo /etc/yum.repos.d/
cd /etc/yum.repos.d/
mv CentOS-Base.repo CentOS-Base.repo.bak
mv Centos-7.repo CentOS-Base.repo
```

- Copy `CentOS-Base.repo` from `nn1` to the other hosts:

```bash
scp /etc/yum.repos.d/CentOS-Base.repo root@nn2:/etc/yum.repos.d
scp /etc/yum.repos.d/CentOS-Base.repo root@nn3:/etc/yum.repos.d
scp /etc/yum.repos.d/CentOS-Base.repo root@s1:/etc/yum.repos.d
scp /etc/yum.repos.d/CentOS-Base.repo root@s2:/etc/yum.repos.d
scp /etc/yum.repos.d/CentOS-Base.repo root@s3:/etc/yum.repos.d
```

- Refresh the yum cache and update the system:

```bash
yum clean all
yum makecache
yum update -y
```

- Install commonly used packages:

```bash
yum install -y openssh-server vim gcc gcc-c++ glibc-headers \
  bzip2-devel lzo-devel curl wget openssh-clients zlib-devel \
  autoconf automake cmake libtool openssl-devel fuse-devel \
  snappy-devel telnet unzip zip net-tools.x86_64 firewalld \
  systemd ntp unrar bzip2
```

### Install the JDK

> Note: run these steps on each of the six machines.

- Upload the RPM to `/tmp` and install it:

```bash
cd /tmp
rpm -ivh jdk-8u144-linux-x64.rpm
```

- Configure the environment variables:

```bash
ln -s /usr/java/jdk1.8.0_144/ /usr/java/jdk1.8
echo 'export JAVA_HOME=/usr/java/jdk1.8' >> /etc/profile.d/myEnv.sh
echo 'export PATH=$PATH:$JAVA_HOME/bin' >> /etc/profile.d/myEnv.sh
source /etc/profile
java -version
```

### Set the hostname and hostname mappings

```bash
vim /etc/hostname
```

Name the six machines nn1, nn2, nn3, s1, s2, and s3 respectively.

```bash
vim /etc/hosts
```

Change it to:

```text
192.168.1.30 nn1
192.168.1.31 nn2
192.168.1.32 nn3
192.168.1.33 s1
192.168.1.34 s2
192.168.1.35 s3
```

### Create the hadoop user

```bash
# Create the hadoop user
useradd hadoop
# Set the hadoop user's password: 12345678
passwd hadoop
```

### Restrict su to the wheel group and allow passwordless switching to root

- Edit the /etc/pam.d/su configuration:

```bash
sed -i 's/#auth\t\trequired\tpam_wheel.so/auth\t\trequired\tpam_wheel.so/g' '/etc/pam.d/su'
sed -i 's/#auth\t\tsufficient\tpam_wheel.so/auth\t\tsufficient\tpam_wheel.so/g' '/etc/pam.d/su'
```

- Edit the /etc/login.defs file:

```bash
echo "SU_WHEEL_ONLY yes" >> /etc/login.defs
```

- Add the user to the administrators group; ordinary users can then no longer su to root:

```bash
# Add the hadoop user to the wheel group
gpasswd -a hadoop wheel
# Check whether the wheel group contains the hadoop user
cat /etc/group | grep wheel
```

### Set up SSH keys for the hadoop user

#### Passwordless SSH from hadoop to hadoop

- Run this block only on `nn1`; however, `su - hadoop` and `mkdir ~/.ssh` also need to be executed on the other hosts.

```bash
# Switch to the hadoop user
su - hadoop
# Generate an SSH key pair
ssh-keygen -t rsa -f ~/.ssh/id_rsa -P ''
ssh-copy-id nn1
ssh-copy-id nn2
ssh-copy-id nn3
ssh-copy-id s1
ssh-copy-id s2
ssh-copy-id s3
scp /home/hadoop/.ssh/id_rsa hadoop@nn2:/home/hadoop/.ssh
scp /home/hadoop/.ssh/id_rsa hadoop@nn3:/home/hadoop/.ssh
scp /home/hadoop/.ssh/id_rsa hadoop@s1:/home/hadoop/.ssh
scp /home/hadoop/.ssh/id_rsa hadoop@s2:/home/hadoop/.ssh
scp /home/hadoop/.ssh/id_rsa hadoop@s3:/home/hadoop/.ssh
```

#### Passwordless SSH from hadoop to root

- Same as above:

```bash
ssh-copy-id root@nn1
ssh-copy-id root@nn2
ssh-copy-id root@nn3
ssh-copy-id root@s1
ssh-copy-id root@s2
ssh-copy-id root@s3
scp /home/hadoop/.ssh/id_rsa root@nn2:/root/.ssh
scp /home/hadoop/.ssh/id_rsa root@nn3:/root/.ssh
scp /home/hadoop/.ssh/id_rsa root@s1:/root/.ssh
scp /home/hadoop/.ssh/id_rsa root@s2:/root/.ssh
scp /home/hadoop/.ssh/id_rsa root@s3:/root/.ssh
```

### Helper scripts

- **ips**

```bash
vim /home/hadoop/bin/ips
```

```bash
nn1
nn2
nn3
s1
s2
s3
```

- **ssh_all.sh**

```bash
vim /home/hadoop/bin/ssh_all.sh
```

```bash
#! /bin/bash
# Change into the directory this script lives in
cd `dirname $0`
# Record that directory
dir_path=`pwd`
#echo $dir_path
# Read the ips file into an array of hostnames
ip_arr=(`cat $dir_path/ips`)
# Loop over the hostnames
for ip in ${ip_arr[*]}
do
    # Build the ssh command, e.g.: ssh hadoop@nn1 ls
    cmd_="ssh hadoop@${ip} \"$*\" "
    echo $cmd_
    # Run the assembled ssh command via eval
    if eval ${cmd_} ; then
        echo "OK"
    else
        echo "FAIL"
    fi
done
```

- **ssh_root.sh**

```bash
#! /bin/bash
# Change into the directory this script lives in
cd `dirname $0`
# Record that directory
dir_path=`pwd`
#echo $dir_path
# Read the ips file into an array of hostnames
ip_arr=(`cat $dir_path/ips`)
# Loop over the hostnames
for ip in ${ip_arr[*]}
do
    # Build the ssh command; exe.sh runs the given command as root
    cmd_="ssh hadoop@${ip} ~/bin/exe.sh \"$*\""
    echo $cmd_
    # Run the assembled ssh command via eval
    if eval ${cmd_} ; then
        echo "OK"
    else
        echo "FAIL"
    fi
done
```

- **scp_all.sh**

```bash
#! /bin/bash
# Change into the directory this script lives in
cd `dirname $0`
# Record that directory
dir_path=`pwd`
#echo $dir_path
# Read the ips file into an array of hostnames
ip_arr=(`cat $dir_path/ips`)
# Source path
source_=$1
# Destination path
target=$2
# Loop over the hostnames
for ip in ${ip_arr[*]}
do
    # Build the scp command: scp SRC hadoop@host:DEST
    cmd_="scp -r ${source_} hadoop@${ip}:${target}"
    echo $cmd_
    # Run the assembled scp command via eval
    if eval ${cmd_} ; then
        echo "OK"
    else
        echo "FAIL"
    fi
done
```

- **exe.sh**

```bash
#! /bin/bash
# Switch to root and run the given command
cmd=$*
su - << EOF
$cmd
EOF
```

- Make the scripts executable:

```bash
chmod +x ssh_all.sh
chmod +x scp_all.sh
chmod +x ssh_root.sh
chmod +x exe.sh
```

- Distribute them to the other hosts:

```bash
./ssh_all.sh mkdir /home/hadoop/bin
./scp_all.sh /home/hadoop/bin/ips /home/hadoop/bin/
./scp_all.sh /home/hadoop/bin/exe.sh /home/hadoop/bin/
./scp_all.sh /home/hadoop/bin/ssh_all.sh /home/hadoop/bin/
./scp_all.sh /home/hadoop/bin/scp_all.sh /home/hadoop/bin/
./scp_all.sh /home/hadoop/bin/ssh_root.sh /home/hadoop/bin/
```

- Add `/home/hadoop/bin` to the hadoop user's PATH (run as the `hadoop` user):

```bash
echo 'export PATH=$PATH:/home/hadoop/bin' >> ~/.bashrc && source ~/.bashrc
scp_all.sh /home/hadoop/.bashrc /home/hadoop/
ssh_all.sh source ~/.bashrc
```
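The ssh_all.sh, ssh_root.sh, and scp_all.sh scripts above all share one fan-out pattern: read hostnames from the `ips` file, build a command string per host, and run it with `eval`, reporting OK or FAIL. A minimal, locally runnable sketch of that pattern follows; the function name `fan_out`, the `/tmp/ips.demo` file, and the use of `echo` in place of `ssh` are all assumptions for illustration, so the sketch can run without a cluster:

```shell
#! /bin/bash
# fan_out HOSTFILE CMD... : run CMD once per host listed in HOSTFILE.
# `echo` stands in for ssh here (an assumption); ssh_all.sh would instead
# build: cmd_="ssh hadoop@${ip} \"$cmd\""
fan_out() {
    local hostfile=$1; shift
    local cmd="$*"
    local ip
    while read -r ip; do
        # Skip blank lines in the host list
        [ -z "$ip" ] && continue
        # Build the per-host command string, then run it via eval
        local cmd_="echo ${ip}: ${cmd}"
        if eval "$cmd_"; then
            echo "OK"
        else
            echo "FAIL"
        fi
    done < "$hostfile"
}

# Usage: write a demo host list, then fan a command out over it.
printf '%s\n' nn1 nn2 nn3 s1 s2 s3 > /tmp/ips.demo
fan_out /tmp/ips.demo uptime
```

The `eval` step is what lets a quoted command with arguments (captured as `"$*"`) be re-parsed and executed as a whole on each iteration.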
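exe.sh relies on a second pattern worth isolating: it collects all of its arguments into one string and feeds that string to a fresh login shell through a here-document, which is why (together with the `pam_wheel.so sufficient` line enabled earlier) a wheel-group user reaches root without typing a password. The sketch below shows just the here-document mechanics; the function name `run_via_heredoc` is hypothetical, and `bash` stands in for `su -` (an assumption) so it runs without root:

```shell
#! /bin/bash
# exe.sh in miniature: capture all arguments as one command string and
# pipe it into a fresh shell via a here-document. exe.sh uses `su -`
# where this sketch uses `bash` (an assumption, to avoid needing root).
run_via_heredoc() {
    local cmd="$*"
    bash << EOF
$cmd
EOF
}

# Usage: the whole argument list becomes one command in the child shell.
run_via_heredoc echo hello
```

Because `$cmd` is expanded inside an unquoted here-document before the child shell starts, the child executes the already-assembled command line, exactly as exe.sh does for root.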