Hadoop Installation Guide
Contents
1 Hadoop 1.1.2 Installation
1.1 Installation environment
1.2 Initial environment setup
1.2.1 Enable networking at boot on CentOS
1.2.2 Disable the firewall
1.2.3 Disable SELinux
1.2.4 Edit the hosts file
1.3 Install the JDK
1.3.1 Installation
1.3.2 Set environment variables
1.4 Add a user
1.5 Clone the virtual machines
1 Hadoop 1.1.2 Installation
1.1 Installation environment
Operating system: CentOS 6.5, 64-bit
Hadoop: hadoop-1.1.2
JDK: jdk-7u51
1.2 Initial environment setup
1.2.1 Enable networking at boot on CentOS
[root@localhost network-scripts]# cd /etc/sysconfig/network-scripts
[root@localhost network-scripts]# ls
ifcfg-eth0 ifdown-bnep ifdown-ipv6 ifdown-ppp ifdown-tunnel ifup-bnep ifup-ipv6 ifup-plusb ifup-routes ifup-wireless network-functions
ifcfg-lo ifdown-eth ifdown-isdn ifdown-routes ifup ifup-eth ifup-isdn ifup-post ifup-sit init.ipv6-global network-functions-ipv6
ifdown ifdown-ippp ifdown-post ifdown-sit ifup-aliases ifup-ippp ifup-plip ifup-ppp ifup-tunnel net.hotplug
[root@localhost network-scripts]# vi ifcfg-eth0
DEVICE=eth0
HWADDR=00:0C:29:09:8B:B0
TYPE=Ethernet
UUID=3192d1fe-8ff3-411f-81b7-8270e86f5959
ONBOOT=yes
NM_CONTROLLED=yes
BOOTPROTO=dhcp
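The hosts file later in this guide assumes fixed addresses (192.168.239.100-102), while BOOTPROTO=dhcp may hand out a different address after a reboot. A minimal static variant of the same file is sketched below; the netmask and gateway are assumptions for a typical VMware NAT network and must be adjusted to your own setup:
DEVICE=eth0
TYPE=Ethernet
ONBOOT=yes
NM_CONTROLLED=yes
BOOTPROTO=static
IPADDR=192.168.239.100
NETMASK=255.255.255.0
GATEWAY=192.168.239.2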
1.2.2 Disable the firewall
Stop the firewall:
[root@master01 /]# service iptables stop
iptables: Setting chains to policy ACCEPT: filter          [  OK  ]
iptables: Flushing firewall rules:                         [  OK  ]
iptables: Unloading modules:                               [  OK  ]
Disable it at boot:
[root@master01 /]# chkconfig iptables off
1.2.3 Disable SELinux
[root@master01 /]# vi /etc/selinux/config
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
# targeted - Targeted processes are protected,
# mls - Multi Level Security protection.
#SELINUXTYPE=targeted
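Editing /etc/selinux/config only takes effect after a reboot. To relax SELinux for the current session as well (an optional extra step, not part of the original procedure), you can run:
[root@master01 /]# setenforce 0
getenforce should then report Permissive.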
1.2.4 Edit the hosts file
[root@master01 etc]# vi /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.239.100 master01
192.168.239.101 slave01
192.168.239.102 slave02
1.3 Install the JDK
1.3.1 Installation
cd /home/nunchakus
[root@localhost nunchakus]# ls
jdk-7u51-linux-x64.rpm
[root@localhost nunchakus]# rpm -ivh jdk-7u51-linux-x64.rpm
Preparing...                ########################################### [100%]
   1:jdk                    ########################################### [100%]
Unpacking JAR files...
rt.jar...
jsse.jar...
charsets.jar...
tools.jar...
localedata.jar...
jfxrt.jar...
[root@localhost nunchakus]# ls
jdk-7u51-linux-x64.rpm
[root@localhost nunchakus]# cd /usr
[root@localhost usr]# ls
bin  etc  games  include  java  lib  lib64  libexec  local  sbin  share  src  tmp
[root@localhost usr]# cd java
[root@localhost java]# ls
default jdk1.7.0_51 latest
1.3.2 Set environment variables
[root@localhost ~]# vi /etc/profile
Append the following at the end:
export JAVA_HOME=/usr/java/jdk1.7.0_51
export JAVA_BIN=/usr/java/jdk1.7.0_51/bin
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export JAVA_HOME JAVA_BIN PATH CLASSPATH
To make the changes to /etc/profile take effect immediately, run:
# . /etc/profile
Note: there is a space between . and /etc/profile.
Reboot and test:
[root@localhost ~]# java -version
java version "1.7.0_45"
OpenJDK Runtime Environment (rhel-2.4.3.3.el6-x86_64 u45-b15)
OpenJDK 64-Bit Server VM (build 24.45-b08, mixed mode)
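If java -version still reports the stock OpenJDK, as above, the system OpenJDK simply comes earlier on the PATH. To check which JDK is actually being picked up and to invoke the newly installed one explicitly, you can run, for example:
[root@localhost ~]# which java
[root@localhost ~]# echo $JAVA_HOME
[root@localhost ~]# $JAVA_HOME/bin/java -version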
1.4 Add a user
[root@master01 /]# useradd hadoop
[root@master01 /]# passwd hadoop
Changing password for user hadoop.
New password:
BAD PASSWORD: it does not contain enough DIFFERENT characters
BAD PASSWORD: is a palindrome
Retype new password:
passwd: all authentication tokens updated successfully.
1.5 Clone the virtual machines
Clone slave01 and slave02 from master01.
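A CentOS 6 clone keeps the original MAC address in its udev rules and in ifcfg-eth0, so eth0 frequently fails to come up on the copies. A common cleanup on each clone (a sketch, not part of the original text) is:
[root@slave01 ~]# rm -f /etc/udev/rules.d/70-persistent-net.rules
[root@slave01 ~]# sed -i '/^HWADDR/d;/^UUID/d' /etc/sysconfig/network-scripts/ifcfg-eth0
[root@slave01 ~]# reboot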
1.6 Change the hostnames of the cloned virtual machines
On slave01:
[root@master01 ~]# vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=slave01
NTPSERVERARGS=iburst
On slave02:
[root@master01 ~]# vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=slave02
NTPSERVERARGS=iburst
1.7 Set up passwordless SSH login
Set up passwordless SSH login among master01, slave01, and slave02.
1.7.1 SSH configuration
Log in as the hadoop user and run the following on every node to generate a key pair:
[hadoop@master01 ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
cf:e8:82:9c:6e:22:6c:ae:6c:df:39:3f:19:9e:2f:1f hadoop@master01
The key's randomart image is:
+--[ RSA 2048]----+
| |
| |
| |
| |
| S |
| . + |
|. . o. =Eo |
|o+. *.oB . |
|*+.=.ooo*o |
+-----------------+
[hadoop@master01 ~]$ cd .ssh
[hadoop@master01 .ssh]$ cp id_rsa.pub authorized_keys
1.7.2 Distribute the SSH public keys
Copy the authorized_keys file generated on each node to all of the other nodes:
[hadoop@master01 .ssh]$ scp authorized_keys hadoop@slave01:/home/hadoop/.ssh/id_rsa.pub.master01
hadoop@slave01's password:
authorized_keys
[hadoop@master01 .ssh]$ scp authorized_keys hadoop@slave02:/home/hadoop/.ssh/id_rsa.pub.master01
The authenticity of host 'slave02 (192.168.239.102)' can't be established.
RSA key fingerprint is 4b:66:f7:1d:d2:cb:8d:c7:f4:fe:e3:cb:f4:7e:67:c3.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'slave02,192.168.239.102' (RSA) to the list of known hosts.
hadoop@slave02's password:
authorized_keys
Append the public key received from every other node to authorized_keys (do this on every node):
[hadoop@master01 .ssh]$ cat id_rsa.pub.slave01>>authorized_keys
[hadoop@master01 .ssh]$ cat id_rsa.pub.slave02>>authorized_keys
Fix the permissions on every node (do this on every node):
[root@master01 home]# chmod -R 700 hadoop
[root@master01 .ssh]# chmod 644 authorized_keys
Make sure this file can only be modified by the hadoop user itself.
Test:
[hadoop@master01 .ssh]$ ssh slave01
Last login: Tue Mar 18 16:28:41 2014 from 192.168.239.1
1.8 Download and extract Hadoop
[hadoop@master01 ~]$ tar xvf hadoop-1.1.2.tar.gz
1.9 Edit the configuration files
Configuration file location:
[hadoop@master01 conf]$ pwd
/home/hadoop/hadoop-1.1.2/conf
1.9.1 Edit core-site.xml
[hadoop@master01 conf]$ vi core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://master01:9000</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/home/hadoop/hdfs</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<!-- The default hadoop.tmp.dir is under /tmp, and the filesystem backing /tmp is often of a type Hadoop does not handle well.
If HDFS still will not start after repeated namenode format/restart cycles, try pointing hadoop.tmp.dir at a different directory. -->
<value>/home/hadoop/temp</value>
</property>
</configuration>
1.9.2 Edit mapred-site.xml
[hadoop@master01 conf]$ vi mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>master01:9001</value>
</property>
</configuration>
1.9.3 Edit hdfs-site.xml
[hadoop@master01 conf]$ vi hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
1.9.4 Edit masters
[hadoop@master01 conf]$ vi masters
master01
1.9.5 Edit slaves
[hadoop@master01 conf]$ vi slaves
slave01
slave02
1.9.6 Edit hadoop-env.sh
[hadoop@master01 conf]$ vi hadoop-env.sh
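At minimum, hadoop-env.sh should point JAVA_HOME at the JDK installed earlier:
export JAVA_HOME=/usr/java/jdk1.7.0_51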
1.10 Copy Hadoop to each node
Copy the extracted directory to slave01, and likewise to slave02:
[hadoop@master01 ~]$ scp -r hadoop-1.1.2 slave01:/home/hadoop
hadoop-examples-1.1.2.jar    100%  139KB 139.1KB/s   00:00
NOTICE.txt                   100%  101     0.1KB/s   00:00
unch                         100% 2920     2.9KB/s   00:00
1.11 Format the NameNode
[hadoop@master01 bin]$ ./hadoop namenode -format
14/03/18 17:32:30 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = master01/192.168.239.100
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 1.1.2
STARTUP_MSG: build = https:///repos/asf/hadoop/common/branches/branch-1.1 -r 1440782; compiled by 'hortonfo' on Thu Jan 31 02:03:24 UTC 2013
************************************************************/
14/03/18 17:32:30 INFO util.GSet: VM type = 64-bit
14/03/18 17:32:30 INFO util.GSet: 2% max memory = 19.33375 MB
14/03/18 17:32:30 INFO util.GSet: capacity = 2^21 = 2097152 entries
14/03/18 17:32:30 INFO util.GSet: recommended=2097152, actual=2097152
14/03/18 17:32:30 INFO namenode.FSNamesystem: fsOwner=hadoop
14/03/18 17:32:31 INFO namenode.FSNamesystem: supergroup=supergroup
14/03/18 17:32:31 INFO namenode.FSNamesystem: isPermissionEnabled=true
14/03/18 17:32:31 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
14/03/18 17:32:31 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
14/03/18 17:32:31 INFO namenode.NameNode: Caching file names occuring more than 10 times
14/03/18 17:32:31 INFO common.Storage: Image file of size 112 saved in 0 seconds.
14/03/18 17:32:31 INFO namenode.FSEditLog: closing edit log: position=4, editlog=/tmp/hadoop-hadoop/dfs/name/current/edits
14/03/18 17:32:31 INFO namenode.FSEditLog: close success: truncate to 4, editlog=/tmp/hadoop-hadoop/dfs/name/current/edits
14/03/18 17:32:32 INFO common.Storage: Storage directory /tmp/hadoop-hadoop/dfs/name has been successfully formatted.
14/03/18 17:32:32 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master01/192.168.239.100
************************************************************/
1.12 Start the daemons
[hadoop@master01 bin]$ ./start-all.sh
starting namenode, logging to /home/hadoop/hadoop-1.1.2/libexec/../logs/hadoop-hadoop-namenode-master01.out
slave02: starting datanode, logging to /home/hadoop/hadoop-1.1.2/libexec/../logs/hadoop-hadoop-datanode-slave02.out
slave01: starting datanode, logging to /home/hadoop/hadoop-1.1.2/libexec/../logs/hadoop-hadoop-datanode-slave01.out
master01: starting secondarynamenode, logging to /home/hadoop/hadoop-1.1.2/libexec/../logs/hadoop-hadoop-secondarynamenode-master01.out
starting jobtracker, logging to /home/hadoop/hadoop-1.1.2/libexec/../logs/hadoop-hadoop-jobtracker-master01.out
slave02: starting tasktracker, logging to /home/hadoop/hadoop-1.1.2/libexec/../logs/hadoop-hadoop-tasktracker-slave02.out
slave01: starting tasktracker, logging to /home/hadoop/hadoop-1.1.2/libexec/../logs/hadoop-hadoop-tasktracker-slave01.out
1.13 Check the processes
[hadoop@master01 bin]$ jps
27392 Jps
27206 SecondaryNameNode
27038 NameNode
27281 JobTracker
[hadoop@slave01 .ssh]$ jps
25456 Jps
25300 TaskTracker
25214 DataNode
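To confirm that both DataNodes have registered with the NameNode, a report can also be requested (a quick check, not part of the original steps):
[hadoop@master01 bin]$ ./hadoop dfsadmin -report
The NameNode web UI is available at http://master01:50070 and the JobTracker UI at http://master01:50030.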
2 Hadoop 2.2.0 Installation
2.1 Installation environment
Operating system: CentOS 6.5, 64-bit
Hadoop: hadoop-2.2.0
JDK: jdk-7u51
2.2 Initial environment setup
2.2.1 Enable networking at boot on CentOS
[root@localhost network-scripts]# cd /etc/sysconfig/network-scripts
[root@localhost network-scripts]# ls
ifcfg-eth0 ifdown-bnep ifdown-ipv6 ifdown-ppp ifdown-tunnel ifup-bnep ifup-ipv6 ifup-plusb ifup-routes ifup-wireless network-functions
ifcfg-lo ifdown-eth ifdown-isdn ifdown-routes ifup ifup-eth ifup-isdn ifup-post ifup-sit init.ipv6-global network-functions-ipv6
ifdown ifdown-ippp ifdown-post ifdown-sit ifup-aliases ifup-ippp ifup-plip ifup-ppp ifup-tunnel net.hotplug
[root@localhost network-scripts]# vi ifcfg-eth0
DEVICE=eth0
HWADDR=00:0C:29:09:8B:B0
TYPE=Ethernet
UUID=3192d1fe-8ff3-411f-81b7-8270e86f5959
ONBOOT=yes
NM_CONTROLLED=yes
BOOTPROTO=dhcp
2.2.2 Disable the firewall
Stop the firewall:
[root@master01 /]# service iptables stop
iptables: Setting chains to policy ACCEPT: filter          [  OK  ]
iptables: Flushing firewall rules:                         [  OK  ]
iptables: Unloading modules:                               [  OK  ]
Disable it at boot:
[root@master01 /]# chkconfig iptables off
2.2.3 Disable SELinux
[root@master01 /]# vi /etc/selinux/config
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
# targeted - Targeted processes are protected,
# mls - Multi Level Security protection.
#SELINUXTYPE=targeted
2.2.4 Edit the hosts file
[root@master01 etc]# vi /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.239.100 master01
192.168.239.101 slave01
192.168.239.102 slave02
2.3 Install the JDK
2.3.1 Installation
cd /home/nunchakus
[root@localhost nunchakus]# ls
jdk-7u51-linux-x64.rpm
[root@localhost nunchakus]# rpm -ivh jdk-7u51-linux-x64.rpm
Preparing...                ########################################### [100%]
   1:jdk                    ########################################### [100%]
Unpacking JAR files...
rt.jar...
jsse.jar...
charsets.jar...
tools.jar...
localedata.jar...
jfxrt.jar...
[root@localhost nunchakus]# ls
jdk-7u51-linux-x64.rpm
[root@localhost nunchakus]# cd /usr
[root@localhost usr]# ls
bin  etc  games  include  java  lib  lib64  libexec  local  sbin  share  src  tmp
[root@localhost usr]# cd java
[root@localhost java]# ls
default jdk1.7.0_51 latest
2.3.2 Set environment variables
[root@localhost ~]# vi /etc/profile
Append the following at the end:
export JAVA_HOME=/usr/java/jdk1.7.0_51
export JAVA_BIN=/usr/java/jdk1.7.0_51/bin
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export JAVA_HOME JAVA_BIN PATH CLASSPATH
To make the changes to /etc/profile take effect immediately, run:
# . /etc/profile
Note: there is a space between . and /etc/profile.
Reboot and test:
[root@localhost ~]# java -version
java version "1.7.0_45"
OpenJDK Runtime Environment (rhel-2.4.3.3.el6-x86_64 u45-b15)
OpenJDK 64-Bit Server VM (build 24.45-b08, mixed mode)
2.4 Add a user
[root@master01 /]# useradd hadoop
[root@master01 /]# passwd hadoop
Changing password for user hadoop.
New password:
BAD PASSWORD: it does not contain enough DIFFERENT characters
BAD PASSWORD: is a palindrome
Retype new password:
passwd: all authentication tokens updated successfully.
2.5 Clone the virtual machines
Clone slave01 and slave02 from master01.
2.6 Change the hostnames of the cloned virtual machines
On slave01:
[root@master01 ~]# vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=slave01
NTPSERVERARGS=iburst
On slave02:
[root@master01 ~]# vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=slave02
NTPSERVERARGS=iburst
2.7 Set up passwordless SSH login
Set up passwordless SSH login among master01, slave01, and slave02.
2.7.1 SSH configuration
Log in as the hadoop user and run the following on every node to generate a key pair:
[hadoop@master01 ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
cf:e8:82:9c:6e:22:6c:ae:6c:df:39:3f:19:9e:2f:1f hadoop@master01
The key's randomart image is:
+--[ RSA 2048]----+
| |
| |
| |
| |
| S |
| . + |
|. . o. =Eo |
|o+. *.oB . |
|*+.=.ooo*o |
+-----------------+
[hadoop@master01 ~]$ cd .ssh
[hadoop@master01 .ssh]$ cp id_rsa.pub authorized_keys
2.7.2 Distribute the SSH public keys
Copy the authorized_keys file generated on each node to all of the other nodes:
[hadoop@master01 .ssh]$ scp authorized_keys hadoop@slave01:/home/hadoop/.ssh/id_rsa.pub.master01
hadoop@slave01's password:
authorized_keys
[hadoop@master01 .ssh]$ scp authorized_keys hadoop@slave02:/home/hadoop/.ssh/id_rsa.pub.master01
The authenticity of host 'slave02 (192.168.239.102)' can't be established.
RSA key fingerprint is 4b:66:f7:1d:d2:cb:8d:c7:f4:fe:e3:cb:f4:7e:67:c3.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'slave02,192.168.239.102' (RSA) to the list of known hosts.
hadoop@slave02's password:
authorized_keys
Append the public key received from every other node to authorized_keys (do this on every node):
[hadoop@master01 .ssh]$ cat id_rsa.pub.slave01>>authorized_keys
[hadoop@master01 .ssh]$ cat id_rsa.pub.slave02>>authorized_keys
Fix the permissions on every node (do this on every node):
[root@master01 home]# chmod -R 700 hadoop
[root@master01 .ssh]# chmod 644 authorized_keys
Make sure this file can only be modified by the hadoop user itself.
Test:
[hadoop@master01 .ssh]$ ssh slave01
Last login: Tue Mar 18 16:28:41 2014 from 192.168.239.1
2.8 Download and build Hadoop
………………………………………………
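A sketch of the officially documented source build, assuming Maven, a C toolchain, and protobuf 2.5.0 are already installed (alternatively, the pre-built hadoop-2.2.0.tar.gz can simply be downloaded and extracted as in section 1.8):
[hadoop@master01 ~]$ tar xvf hadoop-2.2.0-src.tar.gz
[hadoop@master01 ~]$ cd hadoop-2.2.0-src
[hadoop@master01 hadoop-2.2.0-src]$ mvn package -Pdist,native -DskipTests -Dtar
The resulting distribution is written under hadoop-dist/target/.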
2.9 Edit the configuration files
Configuration file location:
[hadoop@master01 conf]$ pwd
/home/hadoop/hadoop-2.2.0/etc/hadoop
2.9.1 Edit core-site.xml
[hadoop@master01 conf]$ vi core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master01:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<!-- The default hadoop.tmp.dir is under /tmp, and the filesystem backing /tmp is often of a type Hadoop does not handle well.
If HDFS still will not start after repeated namenode format/restart cycles, try pointing hadoop.tmp.dir at a different directory. -->
<value>file:/home/hadoop/tmp</value>
</property>
<property>
<name>hadoop.proxyuser.hduser.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hduser.groups</name>
<value>*</value>
</property>
</configuration>
2.9.2 Edit mapred-site.xml
[hadoop@master01 conf]$ vi mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
2.9.3 Edit hdfs-site.xml
[hadoop@master01 conf]$ vi hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>master01:9001</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/hadoop/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/hadoop/dfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>
2.9.4 Edit slaves
[hadoop@master01 conf]$ vi slaves
slave01
slave02
2.9.5 Edit yarn-env.sh
[hadoop@master01 conf]$ vi yarn-env.sh
export JAVA_HOME=/usr/java/jdk1.7.0_51
2.9.6 Edit yarn-site.xml
[hadoop@master01 conf]$ vi yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>master01:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master01:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master01:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>master01:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>master01:8088</value>
</property>
</configuration>
2.9.7 Edit hadoop-env.sh
[hadoop@master01 conf]$ vi hadoop-env.sh
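As in section 1.9.6, the usual change is to set JAVA_HOME explicitly (in 2.x the file lives under etc/hadoop/):
export JAVA_HOME=/usr/java/jdk1.7.0_51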
2.10 Copy Hadoop to each node
Copy the hadoop-2.2.0 directory to slave01, and likewise to slave02:
[hadoop@master01 ~]$ scp -r hadoop-2.2.0 slave01:/home/hadoop
2.11 Format the NameNode
In 2.x the hdfs command replaces the deprecated hadoop namenode form:
[hadoop@master01 hadoop-2.2.0]$ bin/hdfs namenode -format
2.12 Start the daemons
In 2.x the start scripts live under sbin/; start-dfs.sh and start-yarn.sh replace the deprecated start-all.sh:
[hadoop@master01 sbin]$ ./start-dfs.sh
[hadoop@master01 sbin]$ ./start-yarn.sh
2.13 Check the processes
[hadoop@master01 bin]$ jps
On master01, jps should now list NameNode, SecondaryNameNode, and ResourceManager; on slave01 and slave02 it should list DataNode and NodeManager.
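To verify the cluster, an HDFS report and the web interfaces can be checked (standard Hadoop 2.x endpoints; the ResourceManager port matches the yarn.resourcemanager.webapp.address configured above):
[hadoop@master01 hadoop-2.2.0]$ bin/hdfs dfsadmin -report
The NameNode web UI is at http://master01:50070 and the ResourceManager UI at http://master01:8088.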
2.14 Building hadoop-eclipse-plugin-1.1.2.jar from hadoop-1.1.2
1. Extract hadoop-1.1.2.tar.gz, for example to D:\hadoop-1.1.2.
2. Go to D:\hadoop-1.1.2\src\contrib and copy build-contrib.xml into D:\hadoop-1.1.2\src\contrib\eclipse-plugin.
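A commonly used continuation (a sketch, assuming Ant is installed and Eclipse sits at D:\eclipse; adjust both paths, and note that classpath tweaks in the plugin's build.xml are often needed) is to build the jar with Ant from the eclipse-plugin directory:
cd /d D:\hadoop-1.1.2\src\contrib\eclipse-plugin
ant -Dversion=1.1.2 -Declipse.home=D:\eclipse jar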
3 Testing Hadoop
3.1 Prepare test data
[hadoop@master01 ~]$ mkdir input
[hadoop@master01 ~]$ cd input
[hadoop@master01 input]$ echo "hello world" >test11.txt
[hadoop@master01 input]$ echo "hello hadoop" >test22.txt
[hadoop@master01 input]$ cd ../hadoop-1.1.2/bin
[hadoop@master01 bin]$ ./hadoop dfs -ls
ls: Cannot access .: No such file or directory.
3.2 Load the data into HDFS
[hadoop@master01 bin]$ ./hadoop dfs -put /home/hadoop/input input
[hadoop@master01 bin]$ ./hadoop dfs -ls
Found 1 items
drwxr-xr-x - hadoop supergroup 0 2014-03-18 18:43 /user/hadoop/input
[hadoop@master01 bin]$ ./hadoop dfs -ls ./input/*
-rw-r--r-- 3 hadoop supergroup 12 2014-03-18 18:43 /user/hadoop/input/test11.txt
-rw-r--r-- 3 hadoop supergroup 13 2014-03-18 18:43 /user/hadoop/input/test22.txt
3.3 Run the WordCount example
[hadoop@master01 bin]$ ./hadoop jar ../hadoop-examples-1.1.2.jar wordcount input output
14/03/18 19:02:29 INFO input.FileInputFormat: Total input paths to process : 2
14/03/18 19:02:29 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/03/18 19:02:29 WARN snappy.LoadSnappy: Snappy native library not loaded
14/03/18 19:02:32 INFO mapred.JobClient: Running job: job_201403181836_0001
14/03/18 19:02:33 INFO mapred.JobClient: map 0% reduce 0%
14/03/18 19:05:15 INFO mapred.JobClient: map 50% reduce 0%
14/03/18 19:06:44 INFO mapred.JobClient: map 100% reduce 0%
14/03/18 19:08:08 INFO mapred.JobClient: map 100% reduce 100%
14/03/18 19:08:21 INFO mapred.JobClient: Job complete: job_201403181836_0001
14/03/18 19:08:21 INFO mapred.JobClient: Counters: 29
14/03/18 19:08:21 INFO mapred.JobClient: Job Counters
14/03/18 19:08:21 INFO mapred.JobClient: Launched reduce tasks=1
14/03/18 19:08:21 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=394999
14/03/18 19:08:21 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
14/03/18 19:08:21 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
14/03/18 19:08:21 INFO mapred.JobClient: Launched map tasks=3
14/03/18 19:08:21 INFO mapred.JobClient: Data-local map tasks=3
14/03/18 19:08:21 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=172094
14/03/18 19:08:21 INFO mapred.JobClient: File Output Format Counters
14/03/18 19:08:21 INFO mapred.JobClient: Bytes Written=25
14/03/18 19:08:21 INFO mapred.JobClient: FileSystemCounters
14/03/18 19:08:21 INFO mapred.JobClient: FILE_BYTES_READ=55
14/03/18 19:08:21 INFO mapred.JobClient: HDFS_BYTES_READ=253
14/03/18 19:08:21 INFO mapred.JobClient: FILE_BYTES_WRITTEN=155635
14/03/18 19:08:21 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=25
14/03/18 19:08:21 INFO mapred.JobClient: File Input Format Counters
14/03/18 19:08:21 INFO mapred.JobClient: Bytes Read=25
14/03/18 19:08:21 INFO mapred.JobClient: Map-Reduce Framework
14/03/18 19:08:21 INFO mapred.JobClient: Map output materialized bytes=61
14/03/18 19:08:21 INFO mapred.JobClient: Map input records=2
14/03/18 19:08:21 INFO mapred.JobClient: Reduce shuffle bytes=61
14/03/18 19:08:21 INFO mapred.JobClient: Spilled Records=8
14/03/18 19:08:21 INFO mapred.JobClient: Map output bytes=41
14/03/18 19:08:21 INFO mapred.JobClient: Total committed heap usage (bytes)=414679040
14/03/18 19:08:21 INFO mapred.JobClient: CPU time spent (ms)=66740
14/03/18 19:08:21 INFO mapred.JobClient: Combine input records=4
14/03/18 19:08:21 INFO mapred.JobClient: SPLIT_RAW_BYTES=228
14/03/18 19:08:21 INFO mapred.JobClient: Reduce input records=4
14/03/18 19:08:21 INFO mapred.JobClient: Reduce input groups=3
14/03/18 19:08:21 INFO mapred.JobClient: Combine output records=4
14/03/18 19:08:21 INFO mapred.JobClient: Physical memory (bytes) snapshot=175300608
14/03/18 19:08:21 INFO mapred.JobClient: Reduce output records=3
14/03/18 19:08:21 INFO mapred.JobClient: Virtual memory (bytes) snapshot=2178383872
14/03/18 19:08:21 INFO mapred.JobClient: Map output records=4
[hadoop@master01 bin]$ ./hadoop dfs -ls
Found 2 items
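To inspect the result, print the output file; with the two test files above, the expected counts are hadoop 1, hello 2, world 1 (consistent with the 25 bytes written reported by the job):
[hadoop@master01 bin]$ ./hadoop dfs -cat output/part-r-00000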