修改etc/hosts
给节点起个名:
127.0.0.1 HadoopMaster
安装ssh
$ sudo apt-get install ssh
$ sudo apt-get install pdsh
安装JDK,在bashrc添加:
export JAVA_HOME=/usr/java/jdk_xxxx
etc/hadoop/core-site.xml:
添加
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://HadoopMaster:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/hadoopdata</value>
<description>Abase for other temporary directories.</description>
</property>
</configuration>
etc/hadoop/HDFS-site.xml:
添加
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
ssh免密登录
$ ssh-keygen -t rsa -P''-f ~/.ssh/id_rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys
etc/hadoop/workers
HDFS操作
格式化:
$ bin / hdfs namenode -format
打开DFS服务:
$ sbin / start-dfs.sh
在HDFS中创建目录
$ bin/hdfs dfs -mkdir /user
$ bin/hdfs dfs -mkdir /user/<username>
$ bin/hdfs dfs -mkdir input
向HDFS导入文件
$ bin/hdfs dfs -put etc/hadoop/*.xml input
运行示例程序
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.0.jar grep input output 'dfs[a-z.]+'
从HDFS导出文件
$ bin/hdfs dfs -get output output
停止HDFS服务
$ sbin/stop-dfs.sh
## etc/hadoop/mapred-site.xml: 添加
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.application.classpath</name>
<value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
</property>
</configuration>
etc/hadoop/yarn-site.xml:
添加
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.env-whitelist</name>
<value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>
</configuration>
启动Yarn
$ sbin / start-yarn.sh
停止Yarn
$ stop-yarn.sh
BugShooting
- 遇到问题先参照Log文件
- www.bing.com
- Hadoop配置文件里面JDK路径不能有空格
- 直接关闭防火墙
- 建一个hadoop用户
- chown chmod 多用用
- hosts文件里面的主机名要注意不要有特殊字符