Upload hadoop-3.1.4.tar.gz to /tmp and extract it.
Note: the archive must be uploaded to /tmp on all six machines.
sudo tar -zxvf /tmp/hadoop-3.1.4.tar.gz -C /usr/local/
ssh_root.sh chown -R hadoop:hadoop /usr/local/hadoop-3.1.4
ssh_root.sh ln -s /usr/local/hadoop-3.1.4/ /usr/local/hadoop
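As an optional sanity check, the directory owner and the symlink can be confirmed on every machine; this assumes ssh_root.sh, as used above, simply runs the given command on all six hosts:
ssh_root.sh ls -ld /usr/local/hadoop /usr/local/hadoop-3.1.4
Every host should show hadoop:hadoop as owner and /usr/local/hadoop pointing to /usr/local/hadoop-3.1.4/.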
Configure the environment variables
echo 'export HADOOP_HOME=/usr/local/hadoop' >> /etc/profile.d/myEnv.sh
echo 'export PATH=$PATH:$HADOOP_HOME/bin' >> /etc/profile.d/myEnv.sh
echo 'export PATH=$PATH:$HADOOP_HOME/sbin' >> /etc/profile.d/myEnv.sh
scp_all.sh /etc/profile.d/myEnv.sh /etc/profile.d/
ssh_root.sh source /etc/profile
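To confirm the variables are picked up on every node, a simple check (assuming each login shell sources the files under /etc/profile.d/) is:
ssh_root.sh hadoop version
Each host should report Hadoop 3.1.4.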
The /data directory also needs to exist. nn1, nn2, and nn3 already have /data from the earlier steps, so it only needs to be created on the other three machines.
sudo mkdir /data
sudo chown -R hadoop:hadoop /data
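Optionally verify that /data exists with the right owner everywhere (again using the ssh_root.sh helper):
ssh_root.sh ls -ld /data
Each line should show hadoop as both owner and group.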
Edit core-site.xml
vim /usr/local/hadoop/etc/hadoop/core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://ns1</value>
        <description>Protocol and logical nameservice name of the default file system; must match the nameservice defined in hdfs-site.xml. This setting replaces fs.default.name from Hadoop 1.0.</description>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/data/tmp</value>
        <description>Base directory for data storage.</description>
    </property>
    <property>
        <name>hadoop.proxyuser.root.groups</name>
        <value>hadoop</value>
        <description>
            Can be reloaded without a restart via
            hdfs dfsadmin -refreshSuperUserGroupsConfiguration and
            yarn rmadmin -refreshSuperUserGroupsConfiguration.
        </description>
    </property>
    <property>
        <name>hadoop.proxyuser.root.hosts</name>
        <value>localhost</value>
        <description>Hosts the proxy user may connect from (local proxy).</description>
    </property>
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>nn1:2181,nn2:2181,nn3:2181</value>
        <description>ZooKeeper quorum used for HA.</description>
    </property>
</configuration>
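Before moving on, the file can be checked for typos by asking Hadoop to read a key back (run on any node that already has this config; the expected values are shown as comments):
hdfs getconf -confKey fs.defaultFS          # hdfs://ns1
hdfs getconf -confKey ha.zookeeper.quorum   # nn1:2181,nn2:2181,nn3:2181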
Edit hdfs-site.xml
vim /usr/local/hadoop/etc/hadoop/hdfs-site.xml
<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/data/namenode</value>
        <description>Local directory where the NameNode stores its files.</description>
    </property>
    <property>
        <name>dfs.nameservices</name>
        <value>ns1</value>
        <description>Logical name of the nameservice; must match core-site.xml.</description>
    </property>
    <property>
        <name>dfs.ha.namenodes.ns1</name>
        <value>nn1,nn2,nn3</value>
        <description>Logical names of the NameNodes under this nameservice.</description>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.ns1.nn1</name>
        <value>nn1:9000</value>
        <description>RPC address of NameNode nn1.</description>
    </property>
    <property>
        <name>dfs.namenode.http-address.ns1.nn1</name>
        <value>nn1:50070</value>
        <description>Web server address of NameNode nn1.</description>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.ns1.nn2</name>
        <value>nn2:9000</value>
        <description>RPC address of NameNode nn2.</description>
    </property>
    <property>
        <name>dfs.namenode.http-address.ns1.nn2</name>
        <value>nn2:50070</value>
        <description>Web server address of NameNode nn2.</description>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.ns1.nn3</name>
        <value>nn3:9000</value>
        <description>RPC address of NameNode nn3.</description>
    </property>
    <property>
        <name>dfs.namenode.http-address.ns1.nn3</name>
        <value>nn3:50070</value>
        <description>Web server address of NameNode nn3.</description>
    </property>
    <property>
        <name>dfs.namenode.handler.count</name>
        <value>77</value>
        <description>Number of NameNode worker (handler) threads.</description>
    </property>
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://nn1:8485;nn2:8485;nn3:8485/ns1</value>
        <description>Shared storage for HA edit logs, usually on the NameNode machines.</description>
    </property>
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/data/journaldata/</value>
        <description>Local directory where the JournalNode service stores its files.</description>
    </property>
    <property>
        <name>ipc.client.connect.max.retries</name>
        <value>10</value>
        <description>Maximum number of connection retries between the NameNode and the JournalNodes (10).</description>
    </property>
    <property>
        <name>ipc.client.connect.retry.interval</name>
        <value>10000</value>
        <description>Retry interval, 10 s (10000 ms).</description>
    </property>
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
        <description>Fencing method used by HA; sshfence here, can also be set to shell (covered later).</description>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/hadoop/.ssh/id_rsa</value>
        <description>Private key for passwordless SSH, used by the fencing (kill) command.</description>
    </property>
    <property>
        <name>dfs.client.failover.proxy.provider.ns1</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
        <description>Proxy class clients use for HA failover; different nameservices can use different classes. This is the default class shipped since Hadoop 2.0.</description>
    </property>
    <property>
        <name>dfs.client.failover.proxy.provider.auto-ha</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/data/datanode</value>
        <description>Local directory where the DataNode stores its blocks.</description>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
        <description>Number of file replicas.</description>
    </property>
    <property>
        <name>dfs.namenode.datanode.registration.ip-hostname-check</name>
        <value>false</value>
    </property>
    <property>
        <name>dfs.client.use.datanode.hostname</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.datanode.use.datanode.hostname</name>
        <value>true</value>
    </property>
</configuration>
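The HA layout can be verified the same way; the expected output is shown as comments:
hdfs getconf -namenodes                      # nn1 nn2 nn3
hdfs getconf -confKey dfs.ha.namenodes.ns1   # nn1,nn2,nn3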
Edit hadoop-env.sh
vim /usr/local/hadoop/etc/hadoop/hadoop-env.sh
source /etc/profile
export HADOOP_HEAPSIZE_MAX=512
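HADOOP_HEAPSIZE_MAX without a unit is interpreted as megabytes, so this caps each Hadoop daemon's JVM heap at 512 MB. A quick grep confirms both lines were added:
grep -n -e 'source /etc/profile' -e 'HADOOP_HEAPSIZE_MAX' /usr/local/hadoop/etc/hadoop/hadoop-env.sh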
Distribute the configuration files
scp_all.sh /usr/local/hadoop/etc/hadoop/core-site.xml /usr/local/hadoop/etc/hadoop/
scp_all.sh /usr/local/hadoop/etc/hadoop/hdfs-site.xml /usr/local/hadoop/etc/hadoop/
scp_all.sh /usr/local/hadoop/etc/hadoop/hadoop-env.sh /usr/local/hadoop/etc/hadoop/
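To make sure every machine ended up with identical copies, compare checksums across the hosts (again assuming ssh_root.sh runs the command on all six machines); each file's checksum should be the same on every host:
ssh_root.sh md5sum /usr/local/hadoop/etc/hadoop/core-site.xml /usr/local/hadoop/etc/hadoop/hdfs-site.xml /usr/local/hadoop/etc/hadoop/hadoop-env.sh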
Cluster initialization
Cluster startup