
Hadoop Cluster Configuration

Published: 2017-08-17 17:28:59 | 魔據教育 (Moju Education) lecturer blog

 

This post covers the Hadoop parameter configuration for a Hadoop HA cluster, along with how to start, stop, and verify the environment.
I. Modifying the configuration files
On the experimental platform used here, six files are configured: hadoop-env.sh, core-site.xml, hdfs-site.xml, yarn-site.xml, mapred-site.xml, and slaves. Real deployments will differ as requirements grow, so the settings below are for reference and learning only.
The files live in the following directory:
/home/user/workspace/hadoop/etc/hadoop/
1. File hadoop-env.sh
export JAVA_HOME=/home/user/workspace/jdk

export HADOOP_CLASSPATH=.:$CLASSPATH:$HADOOP_CLASSPATH:$HADOOP_HOME/bin

export HADOOP_LOG_DIR=/home/user/yarn_data/hadoop/log
2. File core-site.xml
<configuration>
<property>
   <name>fs.defaultFS</name>
   <value>hdfs://cluster1</value>
</property>
<property>
   <name>io.file.buffer.size</name>
   <value>131072</value>
</property>
<property>
   <name>hadoop.tmp.dir</name>
   <value>/home/user/yarn_data/tmp</value>
   <description>A base for other temporary directories.</description>
</property>
<property>
   <name>hadoop.proxyuser.hduser.hosts</name>
   <value>*</value>
</property>
<property>
   <name>hadoop.proxyuser.hduser.groups</name>
   <value>*</value>
</property>
<property>
   <name>ha.zookeeper.quorum</name>
   <value>master:2181,master0:2181,slave1:2181,slave2:2181,slave3:2181</value>
</property>
</configuration>
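All Hadoop site files share the same property name/value layout, so it is easy to sanity-check a file programmatically before restarting the cluster. A minimal sketch (the path and the asserted value in the comment match the core-site.xml above; adapt as needed):

```python
import xml.etree.ElementTree as ET

def load_site_xml(path):
    """Flatten a Hadoop *-site.xml file into a {name: value} dict."""
    conf = {}
    for prop in ET.parse(path).getroot().findall("property"):
        name = prop.findtext("name")
        if name is not None:
            conf[name] = prop.findtext("value")
    return conf

# Example check against the core-site.xml shown above:
# conf = load_site_xml("/home/user/workspace/hadoop/etc/hadoop/core-site.xml")
# assert conf["fs.defaultFS"] == "hdfs://cluster1"
```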
3. File hdfs-site.xml
<configuration>
<property>
   <name>dfs.namenode.name.dir</name>
   <value>/home/user/dfs/name</value>
</property>
<property>
   <name>dfs.datanode.data.dir</name>
   <value>/home/user/dfs/data</value>
</property>
<property>
   <name>dfs.replication</name>
   <value>2</value>
</property>
<property>
   <name>dfs.permissions</name>
   <value>false</value>
</property>
<property>
   <name>dfs.permissions.enabled</name>
   <value>false</value>
</property>
<property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
</property>
<property>
   <name>dfs.datanode.max.xcievers</name>
   <value>4096</value>
</property>
<property>
   <name>dfs.nameservices</name>
   <value>cluster1</value>
</property>
<property>
   <name>dfs.ha.namenodes.cluster1</name>
   <value>hadoop1,hadoop2</value>
</property>
<property>
   <name>dfs.namenode.rpc-address.cluster1.hadoop1</name>
   <value>master:9000</value>
</property>
<property>
   <name>dfs.namenode.rpc-address.cluster1.hadoop2</name>
   <value>master0:9000</value>
</property>
<property>
   <name>dfs.namenode.http-address.cluster1.hadoop1</name>
   <value>master:50070</value>
</property>
<property>
   <name>dfs.namenode.http-address.cluster1.hadoop2</name>
   <value>master0:50070</value>
</property>
<property>
   <name>dfs.namenode.servicerpc-address.cluster1.hadoop1</name>
   <value>master:53310</value>
</property>
<property>
   <name>dfs.namenode.servicerpc-address.cluster1.hadoop2</name>
   <value>master0:53310</value>
</property>
<property>
   <name>dfs.namenode.shared.edits.dir</name>
   <value>qjournal://slave1:8485;slave2:8485;slave3:8485/cluster1</value>
</property>
<property>
   <name>dfs.journalnode.edits.dir</name>
   <value>/home/user/yarn_data/journal</value>
</property>
<property>
   <name>dfs.journalnode.http-address</name>
   <value>0.0.0.0:8480</value>
</property>
<property>
   <name>dfs.journalnode.rpc-address</name>
   <value>0.0.0.0:8485</value>
</property>
<property>
   <name>dfs.client.failover.proxy.provider.cluster1</name>
   <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
   <name>dfs.ha.automatic-failover.enabled.cluster1</name>
   <value>true</value>
</property>
<property>
   <name>ha.zookeeper.quorum</name>
   <value>slave1:2181,slave2:2181,slave3:2181</value>
</property>
<property>
   <name>dfs.ha.fencing.methods</name>
   <value>sshfence</value>
</property>
<property>
   <name>dfs.ha.fencing.ssh.private-key-files</name>
   <value>/home/user/.ssh/id_rsa</value>
</property>
<property>
   <name>dfs.ha.fencing.ssh.connect-timeout</name>
   <value>10000</value>
</property>
<property>
   <name>dfs.namenode.handler.count</name>
   <value>100</value>
</property>
</configuration>
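The HA properties above form a lookup chain: dfs.nameservices names the cluster, dfs.ha.namenodes.cluster1 lists the logical NameNode IDs, and the per-ID rpc-address keys map each ID to a host:port. The resolution an HDFS client performs can be illustrated like this (a simplified sketch, not the actual Hadoop client code):

```python
def resolve_namenode_rpc(conf, nameservice):
    """Follow dfs.ha.namenodes.<ns> and dfs.namenode.rpc-address.<ns>.<nn>
    to map logical NameNode IDs to their RPC host:port."""
    nn_ids = conf[f"dfs.ha.namenodes.{nameservice}"].split(",")
    return {nn: conf[f"dfs.namenode.rpc-address.{nameservice}.{nn}"]
            for nn in nn_ids}

# The relevant subset of the hdfs-site.xml above:
conf = {
    "dfs.nameservices": "cluster1",
    "dfs.ha.namenodes.cluster1": "hadoop1,hadoop2",
    "dfs.namenode.rpc-address.cluster1.hadoop1": "master:9000",
    "dfs.namenode.rpc-address.cluster1.hadoop2": "master0:9000",
}
print(resolve_namenode_rpc(conf, "cluster1"))
# {'hadoop1': 'master:9000', 'hadoop2': 'master0:9000'}
```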
 
4. File yarn-site.xml
This is the key file of the set:
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
   <name>yarn.resourcemanager.connect.retry-interval.ms</name>
   <value>2000</value>
</property>
<property>
   <name>yarn.resourcemanager.ha.enabled</name>
   <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.ha.rm-ids</name>
  <value>rm1,rm2</value>
</property>
<property>
  <name>ha.zookeeper.quorum</name>
  <value>slave1:2181,slave2:2181,slave3:2181</value>
</property>
<property>
   <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
   <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm1</name>
  <value>master</value>
</property>
<property>
   <name>yarn.resourcemanager.hostname.rm2</name>
   <value>master0</value>
</property>
<property>
  <name>yarn.resourcemanager.ha.id</name>
  <value>rm1</value><!-- rm1 on the active node; set this to rm2 on the standby node -->
</property>
<property>
  <name>yarn.resourcemanager.recovery.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.zk-state-store.address</name>
  <value>slave1:2181,slave2:2181,slave3:2181</value>
</property>
<property>
  <name>yarn.resourcemanager.store.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
<property>
  <name>yarn.resourcemanager.zk-address</name>
  <value>slave1:2181,slave2:2181,slave3:2181</value>
</property>
<property>
  <name>yarn.resourcemanager.cluster-id</name>
  <value>gagcluster-yarn</value>
</property>
<property>
  <name>yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms</name>
  <value>5000</value>  
</property>
<property>
  <name>yarn.resourcemanager.address.rm1</name>
  <value>master:8132</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address.rm1</name>
  <value>master:8130</value>
</property>
<property>
  <name>yarn.resourcemanager.webapp.address.rm1</name>
  <value>master:8188</value>
</property>
<property>
   <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
   <value>master:8131</value>
</property>
<property>
  <name>yarn.resourcemanager.admin.address.rm1</name>
  <value>master:8033</value>
</property>
<property>
  <name>yarn.resourcemanager.ha.admin.address.rm1</name>
  <value>master:23142</value>
</property>
<property>
  <name>yarn.resourcemanager.address.rm2</name>
  <value>master0:8132</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address.rm2</name>
  <value>master0:8130</value>
</property>
<property>
  <name>yarn.resourcemanager.webapp.address.rm2</name>
  <value>master0:8188</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
  <value>master0:8131</value>
</property>
<property>
  <name>yarn.resourcemanager.admin.address.rm2</name>
  <value>master0:8033</value>
</property>
<property>
  <name>yarn.resourcemanager.ha.admin.address.rm2</name>
  <value>master0:23142</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
  <name>yarn.nodemanager.local-dirs</name>
  <value>/home/user/yarn_data/local</value>
</property>
<property>
  <name>yarn.nodemanager.log-dirs</name>
  <value>/home/user/yarn_data/log/hadoop</value>
</property>
<property>
  <name>mapreduce.shuffle.port</name>
  <value>23080</value>
</property>
<property>
  <name>yarn.client.failover-proxy-provider</name>
  <value>org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider</value>
</property>
<property>
    <name>yarn.resourcemanager.ha.automatic-failover.zk-base-path</name>
    <value>/yarn-leader-election</value>
</property>
</configuration>
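Note that yarn.resourcemanager.ha.id is the one value that must differ between the two ResourceManager nodes (rm1 on master, rm2 on master0) while the rest of yarn-site.xml is identical, so it is convenient to render that property per host when distributing the file. A small sketch (the host-to-id mapping mirrors the hostname.rm1/rm2 settings above):

```python
# From yarn.resourcemanager.hostname.rm1 / .rm2 in the configuration above
RM_IDS = {"master": "rm1", "master0": "rm2"}

def ha_id_property(hostname):
    """Render the node-specific yarn.resourcemanager.ha.id property element."""
    return ("<property>\n"
            "  <name>yarn.resourcemanager.ha.id</name>\n"
            f"  <value>{RM_IDS[hostname]}</value>\n"
            "</property>")

print(ha_id_property("master0"))
```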
5. File mapred-site.xml
Another core file:
<configuration>
<property>
   <name>mapreduce.framework.name</name>
   <value>yarn</value>
</property>
<property>
   <name>mapreduce.jobhistory.address</name>
   <value>0.0.0.0:10020</value>
</property>
<property>
   <name>mapreduce.jobhistory.webapp.address</name>
   <value>0.0.0.0:19888</value>
</property>
</configuration>
6. File slaves
A core file: it lists the worker (DataNode) hostnames, one per line.

[Screenshot: contents of the slaves file]

II. Configure environment variables
Add the environment variables to /etc/profile on every machine:
# set hadoop environment
export HADOOP_PREFIX=/home/user/workspace/hadoop
export PATH=$HADOOP_PREFIX/bin:${PATH}
export PATH=$PATH:$HADOOP_PREFIX/sbin
III. Starting the Hadoop HA environment
Starting the environment for the first time differs from subsequent starts.
1. First start of the Hadoop HA environment
1) Start ZooKeeper on every machine:
     zkServer.sh start
2) On one NameNode (usually the one intended to be active), create the HA namespace in ZooKeeper:
    hdfs zkfc -formatZK
3) On each node, start the JournalNode daemon:
   ./sbin/hadoop-daemon.sh start journalnode
4) On the active NameNode, format the NameNode and JournalNode directories:
    hadoop namenode -format cluster1
5) On the active NameNode, start the NameNode process:
        ./sbin/hadoop-daemon.sh start namenode
6) On the standby NameNode, run the following commands in order. The first command formats the standby NameNode's directory and copies the metadata over from the active NameNode; note that it does not reformat the JournalNode directories.
    hdfs namenode -bootstrapStandby
   ./sbin/hadoop-daemon.sh start namenode
7) On both NameNode nodes, run:
   ./sbin/hadoop-daemon.sh start zkfc
8) On every DataNode node, start the DataNode:
   ./sbin/hadoop-daemon.sh start datanode
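The eight steps above are order-sensitive: ZooKeeper must be running before the ZKFC namespace is formatted, and the JournalNodes must be up before the NameNode format writes the shared edits. For scripting the sequence, the plan can be encoded as data, for example:

```python
# (where to run, command) pairs, in the required order
FIRST_START_PLAN = [
    ("every machine",    "zkServer.sh start"),
    ("active NameNode",  "hdfs zkfc -formatZK"),
    ("each node",        "./sbin/hadoop-daemon.sh start journalnode"),
    ("active NameNode",  "hadoop namenode -format cluster1"),
    ("active NameNode",  "./sbin/hadoop-daemon.sh start namenode"),
    ("standby NameNode", "hdfs namenode -bootstrapStandby"),
    ("standby NameNode", "./sbin/hadoop-daemon.sh start namenode"),
    ("both NameNodes",   "./sbin/hadoop-daemon.sh start zkfc"),
    ("every DataNode",   "./sbin/hadoop-daemon.sh start datanode"),
]

for where, cmd in FIRST_START_PLAN:
    print(f"{where:16s} $ {cmd}")
```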
2. Subsequent starts of the Hadoop HA environment
1) Start ZooKeeper on every machine:
    zkServer.sh start
2) On both NameNode nodes, run:
        ./sbin/hadoop-daemon.sh start zkfc
3) On the active NameNode, run:
./sbin/start-all.sh
4) On the standby NameNode, start the ResourceManager:
./sbin/yarn-daemon.sh start resourcemanager


3. Result of starting the Hadoop HA environment

[Screenshots: the running processes on each node after startup]

IV. Stopping the Hadoop HA environment
1) On the active NameNode, run:
./sbin/stop-dfs.sh
2) On both NameNode nodes, run:
./sbin/hadoop-daemon.sh stop zkfc
3) On every machine, run:
zkServer.sh stop
V. Verifying the Hadoop HA environment
After starting the environment, open the web addresses below: the Master machine will show Active and the Master0 machine Standby. If you kill the NameNode process on Master, or shut the Master machine down, Master0 switches from Standby to Active. Check this by entering the following addresses in a browser.

[Screenshots: NameNode web UIs at master:50070 (Active) and master0:50070 (Standby)]
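Beyond the browser, the HA state can also be checked from the command line with "hdfs haadmin -getServiceState hadoop1" (and hadoop2), or programmatically: each NameNode exposes a JMX servlet at http://<host>:50070/jmx, whose NameNodeStatus bean reports a State field of "active" or "standby". A sketch of parsing that response (the live fetch is commented out because it needs a running cluster; the JSON shape follows Hadoop's JMX servlet output):

```python
import json
# from urllib.request import urlopen  # for fetching from a live NameNode

def namenode_state(jmx_json_text):
    """Extract the HA state ('active' or 'standby') from a
    Hadoop:service=NameNode,name=NameNodeStatus JMX response."""
    return json.loads(jmx_json_text)["beans"][0]["State"]

# On a live cluster:
# url = "http://master:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus"
# print(namenode_state(urlopen(url).read().decode()))

sample = '{"beans":[{"name":"Hadoop:service=NameNode,name=NameNodeStatus","State":"active"}]}'
print(namenode_state(sample))  # active
```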

