Spark---Environment Setup---Basic Concepts

蛇发女妖 2024-02-19

Environment Setup

1: Extract the Spark archive (here to /opt/module/spark);

2: Configure the Spark environment variables:

vim /etc/profile

# Spark installation root
export SPARK_HOME=/opt/module/spark
# Python interpreter used by PySpark (a conda env)
export PYSPARK_PYTHON=/opt/module/anacond3/envs/pyspark/bin/python3.8
# Hadoop configuration directory (assumes HADOOP_HOME is already set)
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop

export PATH=$PATH:$SPARK_HOME/bin

source /etc/profile

Also edit ~/.bashrc:

vim ~/.bashrc

export JAVA_HOME=/opt/module/jdk
# must point at the Python executable itself, not its bin directory
export PYSPARK_PYTHON=/opt/module/anacond3/envs/pyspark/bin/python3.8

Test:

spark-submit --version
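
To double-check that spark-submit really uses the interpreter from PYSPARK_PYTHON, a minimal sketch can be submitted locally (the file name env_check.py is illustrative, not part of the original setup):

# env_check.py -- print the interpreter and Spark version seen by the driver
import sys

from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[*]").appName("env-check").getOrCreate()
print("Python executable:", sys.executable)  # should match PYSPARK_PYTHON
print("Spark version:", spark.version)
spark.stop()

Run it with:

spark-submit env_check.py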

3: Configure Spark for YARN. YARN is part of Hadoop and is only available while Hadoop is running, so Spark's Hadoop-related settings must point at the Hadoop configuration:

cd $SPARK_HOME/conf
cp spark-env.sh.template spark-env.sh

Then set the Hadoop configuration directory in spark-env.sh:

HADOOP_CONF_DIR=/opt/module/hadoop/etc/hadoop

4: Test Spark:

Run:

spark-submit --master yarn --class org.apache.spark.examples.SparkPi $SPARK_HOME/examples/jars/spark-examples_2.12-3.1.1.jar
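
The same submission works for PySpark scripts; below is a sketch modeled on the python pi example that ships with Spark (the file name pi_yarn.py is illustrative):

# pi_yarn.py -- Monte Carlo estimate of pi, mirroring the SparkPi example
from operator import add
from random import random

from pyspark.sql import SparkSession

# master is taken from the --master flag at submit time
spark = SparkSession.builder.appName("PySparkPi").getOrCreate()

n = 100000

def sample(_):
    # sample a point in the unit square centered on the origin
    x, y = random() * 2 - 1, random() * 2 - 1
    return 1 if x * x + y * y <= 1 else 0

count = spark.sparkContext.parallelize(range(n), 10).map(sample).reduce(add)
print("Pi is roughly %f" % (4.0 * count / n))
spark.stop()

Submit it with:

spark-submit --master yarn pi_yarn.py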

YARN needs the following settings in yarn-site.xml:

<!-- disable the physical-memory check so test containers are not
     killed for exceeding their allocated physical memory -->
<property>
    <name>yarn.nodemanager.pmem-check-enabled</name>
    <value>false</value>
</property>
<!-- likewise disable the virtual-memory check -->
<property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
</property>

Hadoop must be running! Restart YARN after changing yarn-site.xml so the change takes effect.

If HDFS reports a safe mode error, take the NameNode out of safe mode:

hadoop dfsadmin -safemode leave

(equivalently hdfs dfsadmin -safemode leave; the hadoop dfsadmin form is deprecated)
