[Hadoop] Big Data Development Environment Setup
1 Set a static IP
-
Open the ifcfg-ens33 file:
vi /etc/sysconfig/network-scripts/ifcfg-ens33
-
First change the BOOTPROTO parameter from dhcp to static:
BOOTPROTO="static"
-
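In IPADDR, the prefix 192.168.152 is taken from the subnet address shown in the Virtual Network Editor of the virtual machine software. The final 100 is your own choice and can be any value from 3 to 254:
IPADDR=192.168.152.100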
-
Set both GATEWAY and DNS1 to the gateway IP:
GATEWAY=192.168.152.2
DNS1=192.168.152.2
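The edits above follow a fixed pattern, so they can be generated from the subnet prefix and the chosen host number. The following is only a minimal sketch and not part of the original procedure: the function name render_static_ip is hypothetical, and it assumes the gateway sits at .2 of the subnet, as in the values above.

```shell
# Print the static-IP lines for ifcfg-ens33.
# $1 = first three octets of the subnet (from the Virtual Network Editor)
# $2 = host number chosen for this VM (3..254, as described above)
# Assumption: the gateway and DNS server are at <subnet>.2.
render_static_ip() (
    subnet=$1
    host=$2
    # Reject anything that is not a plain non-negative integer.
    case $host in
        ''|*[!0-9]*) echo "host must be a number" >&2; exit 1 ;;
    esac
    if [ "$host" -lt 3 ] || [ "$host" -gt 254 ]; then
        echo "host must be between 3 and 254" >&2
        exit 1
    fi
    printf 'BOOTPROTO="static"\n'
    printf 'IPADDR=%s.%s\n' "$subnet" "$host"
    printf 'GATEWAY=%s.2\n' "$subnet"
    printf 'DNS1=%s.2\n' "$subnet"
)

render_static_ip 192.168.152 100
```

The printed lines can be compared against ifcfg-ens33 before editing the file by hand.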
2 Set the hostname
-
Set a temporary hostname first:
[root@bigdata01 ~]# hostname bigdata01
-
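Set the permanent hostname. Edit /etc/hostname so that it contains only the hostname. The file is read at boot and is not a shell script, so it should not be sourced; the temporary hostname set above covers the current session:
[root@bigdata01 ~]# vi /etc/hostname
bigdata01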
-
Verify the hostname:
[root@bigdata01 ~]# hostname
bigdata01
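The hostname command only reports the name of the running system, so it does not show whether the permanent setting was saved. A small check of the file itself can help; the sketch below is an illustration with a hypothetical function name, hostname_persisted, and is not part of the original steps.

```shell
# Succeed when the first line of a hostname file matches the expected name.
# $1 = expected hostname
# $2 = file to check (defaults to /etc/hostname)
hostname_persisted() (
    expected=$1
    file=${2:-/etc/hostname}
    [ -r "$file" ] || exit 1
    # read fails on a last line without a newline but still fills the variable
    read -r stored < "$file" || [ -n "$stored" ] || exit 1
    [ "$stored" = "$expected" ]
)

if hostname_persisted bigdata01; then
    echo "bigdata01 is stored in /etc/hostname"
else
    echo "bigdata01 is not stored in /etc/hostname"
fi
```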
3 Disable the firewall
-
Stop it for the current session:
[root@bigdata01 ~]# systemctl stop firewalld
-
Disable it permanently, so that it does not start at boot:
[root@bigdata01 ~]# systemctl disable firewalld
-
Verify the firewall state:
[root@bigdata01 ~]# systemctl list-unit-files | grep firewalld
firewalld.service disabled
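When the same check has to be repeated on several nodes, reading the grep output by eye becomes tedious. The sketch below turns it into a pass/fail test; firewalld_disabled is a hypothetical helper name, and it only inspects text piped into it, so it makes no changes to the system.

```shell
# Read `systemctl list-unit-files` output on stdin and succeed only when
# firewalld.service is listed with the state "disabled".
firewalld_disabled() {
    awk '
        $1 == "firewalld.service" { state = $2 }
        END { if (state == "disabled") exit 0; exit 1 }
    '
}

# Example use on a node (not executed here):
#   systemctl list-unit-files | firewalld_disabled && echo "firewalld is disabled"
```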
4 Passwordless SSH login
-
Generate a key pair:
[root@bigdata01 ~]# ssh-keygen -t rsa
[root@bigdata01 ~]# ll ~/.ssh/
-
Copy the public key to the machine that should accept passwordless logins (here, the local machine itself):
[root@bigdata01 ~]# cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
-
Verify the passwordless login:
[root@bigdata01 ~]# ssh bigdata01
Last login: Tue Apr 7 15:05:55 2020 from 192.168.182.1
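Running the cat >> command above more than once appends the same key again each time. A duplicate-safe variant could look like the sketch below. The function name add_key_once is hypothetical, and the chmod 600 reflects the common requirement that authorized_keys is not writable by other users.

```shell
# Append a public key to an authorized_keys file unless it is already there.
# $1 = public key file (for example ~/.ssh/id_rsa.pub)
# $2 = authorized_keys file (for example ~/.ssh/authorized_keys)
add_key_once() (
    pub=$1
    auth=$2
    key=$(cat "$pub") || exit 1
    touch "$auth" || exit 1
    chmod 600 "$auth"
    # -F: fixed string, -x: match the whole line, -q: no output
    grep -qxF -- "$key" "$auth" || printf '%s\n' "$key" >> "$auth"
)

# Example use (not executed here):
#   add_key_once ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
```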
5 Configure the JDK
-
Download the JDK package and transfer it to the installation directory (/data/soft in this example):
[root@bigdata01 ~]# mkdir -p /data/soft
[root@bigdata01 ~]# cd /data/soft
[root@bigdata01 soft]# ll
total 189496
-rw-r--r--. 1 root root 194042837 Apr 6 23:14 jdk-8u202-linux-x64.tar.gz
-
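Unpack the package. The archive unpacks to a directory named jdk1.8.0_202; rename it to jdk1.8 so that it matches the JAVA_HOME used in the next step:
[root@bigdata01 soft]# tar -zxvf jdk-8u202-linux-x64.tar.gz
[root@bigdata01 soft]# mv jdk1.8.0_202 jdk1.8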
-
Configure the environment variables by appending the export lines to /etc/profile, then reload the file:
[root@bigdata01 soft]# vi /etc/profile
.....
export JAVA_HOME=/data/soft/jdk1.8
export PATH=.:$JAVA_HOME/bin:$PATH
[root@bigdata01 soft]# source /etc/profile
-
Verify the installation:
[root@bigdata01 soft]# java -version
java version "1.8.0_202"
Java(TM) SE Runtime Environment (build 1.8.0_202-b08)
Java HotSpot(TM) 64-Bit Server VM (build 25.202-b08, mixed mode)
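JAVA_HOME above points at /data/soft/jdk1.8, but the directory that an archive unpacks to is decided by the archive itself and often carries a full version string. One way to find out before unpacking is to list the archive; the sketch below does that with a hypothetical helper, archive_top_dir, and assumes the archive stores its files under a single top-level directory.

```shell
# Print the top-level directory name stored in a .tar.gz archive.
# Assumption: the first entry of the archive is "<dir>/..." (no "./" prefix).
archive_top_dir() {
    # List the entries and keep the first path component of the first entry.
    tar -tzf "$1" | sed -n '1s|/.*||p'
}

# Example use (not executed here):
#   top=$(archive_top_dir jdk-8u202-linux-x64.tar.gz)
#   echo "archive unpacks to: $top"
```

If the printed name differs from the directory used in JAVA_HOME, either rename the directory after unpacking or adjust JAVA_HOME.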
6 Install and configure Hadoop
6.1 Synchronize time across the cluster nodes
[root@bigdata01 hadoop-3.3.5]# systemctl status crond
● crond.service - Command Scheduler
Loaded: loaded (/usr/lib/systemd/system/crond.service; enabled; vendor preset: enabled)
Active: active (running) since Thu 2023-06-15 09:47:16 CST; 12h ago
Main PID: 6559 (crond)
CGroup: /system.slice/crond.service
└─6559 /usr/sbin/crond -n
Jun 15 09:47:16 bigdata01 systemd[1]: Started Command Scheduler.
Jun 15 09:47:16 bigdata01 crond[6559]: (CRON) INFO (RANDOM_DELAY will be scaled with factor 34% if used.)
Jun 15 09:47:17 bigdata01 crond[6559]: (CRON) INFO (running with inotify support)
[root@bigdata01 hadoop-3.3.5]# crontab -l
no crontab for root
[root@bigdata01 hadoop-3.3.5]# crontab -e
no crontab for root - using an empty one
crontab: installing new crontab
[root@bigdata01 hadoop-3.3.5]# crontab -l
* * * * * /usr/sbin/ntpdate -u ntp.sjtu.edu.cn
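The five asterisks in the crontab entry above mean "every minute", so each node resynchronizes its clock once per minute. Adding the entry with crontab -e is interactive; for several nodes a scripted, repeat-safe variant may be more convenient. The sketch below is one possible way to do it and uses a hypothetical helper name, with_ntp_entry.

```shell
# Read an existing crontab on stdin and print it with the ntpdate entry
# appended, unless an identical line is already present.
with_ntp_entry() {
    entry='* * * * * /usr/sbin/ntpdate -u ntp.sjtu.edu.cn'
    awk -v entry="$entry" '
        { print }
        $0 == entry { found = 1 }
        END { if (!found) print entry }
    '
}

# Example use on a node (not executed here); "crontab -" installs from stdin:
#   crontab -l 2>/dev/null | with_ntp_entry | crontab -
```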
6.2 Complete the passwordless SSH setup
Run on bigdata01:
[root@bigdata01 ~]# scp ~/.ssh/authorized_keys bigdata02:~/
[root@bigdata01 ~]# scp ~/.ssh/authorized_keys bigdata03:~/
Run on bigdata02:
[root@bigdata02 ~]# cat ~/authorized_keys >> ~/.ssh/authorized_keys
Run on bigdata03:
[root@bigdata03 ~]# cat ~/authorized_keys >> ~/.ssh/authorized_keys
Verify the result from bigdata01:
[root@bigdata01 ~]# ssh bigdata02
Last login: Tue Apr 7 21:33:58 2020 from bigdata01
[root@bigdata02 ~]# exit
logout
Connection to bigdata02 closed.
[root@bigdata01 ~]# ssh bigdata03
Last login: Tue Apr 7 21:17:30 2020 from 192.168.182.1
[root@bigdata03 ~]# exit
logout
Connection to bigdata03 closed.
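Logging in to each node by hand works for two workers but does not scale. A non-interactive check is sketched below. It relies on the standard ssh options BatchMode (fail instead of prompting for a password) and ConnectTimeout; the function name check_passwordless and the SSH override variable are assumptions made for this illustration.

```shell
# Command used to connect; can be overridden, for example in a dry run.
SSH=${SSH:-ssh}

# Report for every host given as an argument whether a login works
# without a password prompt.
check_passwordless() {
    for node in "$@"; do
        # BatchMode=yes makes ssh fail instead of asking for a password.
        if $SSH -o BatchMode=yes -o ConnectTimeout=5 "$node" true 2>/dev/null; then
            echo "$node: ok"
        else
            echo "$node: FAILED"
        fi
    done
}

# Example use on bigdata01 (not executed here):
#   check_passwordless bigdata02 bigdata03
```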
6.3 Configure Hadoop
-
On bigdata01, upload the Hadoop package to the installation directory (/data/soft in this example):
[root@bigdata01 soft]# ll
total 527024
-rw-r--r--. 1 root root 345625475 Jul 19 2019 hadoop-3.2.0.tar.gz
drwxr-xr-x. 7 10 143 245 Dec 16 2018 jdk1.8
-rw-r--r--. 1 root root 194042837 Apr 6 23:14 jdk-8u202-linux-x64.tar.gz
-
On bigdata01, unpack the Hadoop package:
[root@bigdata01 soft]# tar -zxvf hadoop-3.2.0.tar.gz
-
On bigdata01, bigdata02, and bigdata03, configure the environment variables on each node:
[root@bigdata01 hadoop-3.2.0]# vi /etc/profile
.......
export JAVA_HOME=/data/soft/jdk1.8
export HADOOP_HOME=/data/soft/hadoop-3.2.0
export PATH=.:$JAVA_HOME/bin:$HADOOP_HOME/sbin:$HADOOP_HOME/bin:$PATH
[root@bigdata01 hadoop-3.2.0]# source /etc/profile
-
On bigdata01, edit the Hadoop configuration files:
-
Change to the directory that holds the configuration files (etc/hadoop/):
[root@bigdata01 hadoop-3.2.0]# cd etc/hadoop/
[root@bigdata01 hadoop]#
-
Edit hadoop-env.sh:
...
export JAVA_HOME=/data/soft/jdk1.8
export HADOOP_LOG_DIR=/data/hadoop_repo/logs/hadoop
-
Edit core-site.xml:
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://bigdata01:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/data/hadoop_repo</value>
    </property>
</configuration>
-
Edit hdfs-site.xml:
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
</configuration>
-
Edit mapred-site.xml:
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
-
Edit yarn-site.xml:
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>bigdata01</value>
    </property>
</configuration>
-
Edit the workers file so that it lists one worker hostname per line:
bigdata02
bigdata03
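All four *-site.xml files above share the same shape: a configuration element that wraps name/value properties. The sketch below generates that shape from argument pairs, which can help avoid unbalanced tags when typing the files by hand. The function names hadoop_property and hadoop_site_xml are hypothetical, and the sketch does not escape XML special characters, so it only suits simple values like the ones used here.

```shell
# Print one <property> element for a Hadoop *-site.xml file.
# $1 = property name, $2 = property value (no XML escaping is applied)
hadoop_property() {
    printf '    <property>\n'
    printf '        <name>%s</name>\n' "$1"
    printf '        <value>%s</value>\n' "$2"
    printf '    </property>\n'
}

# Print a complete <configuration> document from name/value argument pairs.
hadoop_site_xml() {
    printf '<configuration>\n'
    while [ "$#" -ge 2 ]; do
        hadoop_property "$1" "$2"
        shift 2
    done
    printf '</configuration>\n'
}

# Reproduces the core-site.xml content shown above.
hadoop_site_xml fs.defaultFS hdfs://bigdata01:9000 hadoop.tmp.dir /data/hadoop_repo
```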
-
Edit start-dfs.sh and stop-dfs.sh, adding the following lines near the top of each file (after the #! line):
[root@bigdata01 hadoop-3.2.0]# cd sbin/
[root@bigdata01 sbin]# vi start-dfs.sh
HDFS_DATANODE_USER=root
HDFS_DATANODE_SECURE_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
[root@bigdata01 sbin]# vi stop-dfs.sh
HDFS_DATANODE_USER=root
HDFS_DATANODE_SECURE_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
-
Edit start-yarn.sh and stop-yarn.sh, adding the following lines near the top of each file (after the #! line):
[root@bigdata01 sbin]# vi start-yarn.sh
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root
[root@bigdata01 sbin]# vi stop-yarn.sh
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root
-
Copy the configured installation directory from bigdata01 to the two worker nodes:
[root@bigdata01 sbin]# cd /data/soft/
[root@bigdata01 soft]# scp -rq hadoop-3.2.0 bigdata02:/data/soft/
[root@bigdata01 soft]# scp -rq hadoop-3.2.0 bigdata03:/data/soft/
-
Format HDFS, on bigdata01 only:
[root@bigdata01 soft]# cd /data/soft/hadoop-3.2.0
[root@bigdata01 hadoop-3.2.0]# bin/hdfs namenode -format
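The scp step above repeats one command per worker, and the worker names already live in the workers file. A sketch that derives the copy commands from that file follows. It only prints the commands (a dry run) so they can be reviewed first; the function name print_copy_commands is hypothetical, and the workers file is assumed to hold one hostname per line.

```shell
# Print the scp commands that copy a directory to every host in a workers file.
# $1 = directory to copy, $2 = workers file, $3 = destination path on each host
print_copy_commands() {
    src=$1
    workers=$2
    dest=$3
    # The extra test keeps a last line that lacks a trailing newline.
    while read -r host || [ -n "$host" ]; do
        # Skip blank lines.
        [ -n "$host" ] || continue
        printf 'scp -rq %s %s:%s\n' "$src" "$host" "$dest"
    done < "$workers"
}

# Example use from /data/soft on bigdata01 (not executed here):
#   print_copy_commands hadoop-3.2.0 hadoop-3.2.0/etc/hadoop/workers /data/soft/
# Pipe the output to "sh" once the printed commands look right.
```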
-
Start the cluster by running the following command on bigdata01. The prompt in the captured output shows a hadoop-3.3.5 directory; run the command from your own Hadoop installation directory:
[root@bigdata01 hadoop-3.3.5]# sbin/start-all.sh
Starting namenodes on [bigdata01]
Last login: Thu Jun 15 22:10:23 CST 2023 on pts/0
Starting datanodes
Last login: Thu Jun 15 23:31:46 CST 2023 on pts/0
bigdata02: WARNING: /data/hadoop_repo/logs/hadoop does not exist. Creating.
bigdata03: WARNING: /data/hadoop_repo/logs/hadoop does not exist. Creating.
Starting secondary namenodes [bigdata01]
Last login: Thu Jun 15 23:31:50 CST 2023 on pts/0
Starting resourcemanager
Last login: Thu Jun 15 23:31:54 CST 2023 on pts/0
Starting nodemanagers
Last login: Thu Jun 15 23:32:01 CST 2023 on pts/0
You have new mail in /var/spool/mail/root
-
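Verify the cluster by running jps on every node. The master should show NameNode, SecondaryNameNode, and ResourceManager; each worker should show DataNode and NodeManager:
[root@bigdata01 hadoop-3.3.5]# jps
26115 SecondaryNameNode
25812 NameNode
26372 ResourceManager
26716 Jps
[root@bigdata02 soft]# jps
17120 NodeManager
17251 Jps
17022 DataNode
[root@bigdata03 ~]# jps
9556 Jps
9401 NodeManager
9294 DataNode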
-
Stop the cluster:
[root@bigdata01 hadoop-3.3.5]# sbin/stop-all.sh
Stopping namenodes on [bigdata01]
Last login: Thu Jun 15 23:32:04 CST 2023 on pts/0
Stopping datanodes
Last login: Thu Jun 15 23:38:39 CST 2023 on pts/0
Stopping secondary namenodes [bigdata01]
Last login: Thu Jun 15 23:38:41 CST 2023 on pts/0
Stopping nodemanagers
Last login: Thu Jun 15 23:38:43 CST 2023 on pts/0
Stopping resourcemanager
Last login: Thu Jun 15 23:38:47 CST 2023 on pts/0
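The jps verification step earlier in this section compares process lists by eye. While the cluster is running, the same comparison can be scripted. The sketch below reads jps output from standard input and prints every expected daemon that is absent; missing_daemons is a hypothetical helper name, and the expected daemon names are taken from the jps listings in this section.

```shell
# Read `jps` output on stdin and print each expected daemon that is missing.
# Arguments: the daemon names that should be running on this node.
missing_daemons() {
    # The second column of jps output is the process name.
    running=$(awk '{ print $2 }')
    for daemon in "$@"; do
        # -x matches whole lines, so NameNode does not match SecondaryNameNode.
        printf '%s\n' "$running" | grep -qxF -- "$daemon" || printf '%s\n' "$daemon"
    done
}

# Example use (not executed here):
#   jps | missing_daemons NameNode SecondaryNameNode ResourceManager   # on bigdata01
#   jps | missing_daemons DataNode NodeManager                         # on a worker
```

Empty output means that every expected daemon was found.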
This completes the Hadoop big data development environment setup.