1. 打包
File -> Project Structure -> Artifact -> + -> JAR -> From module with dependencies
选择一个Module,之后再选择一个主类
选择要打包的依赖
Build -> Build Artifact -> 项目名称 ->Build
路径
2. Spark运行模式
Spark的运行模式多种多样,在单机上可以以Local和伪分布式模式运行;部署在集群上时有Spark内建的Standalone模式及对于外部框架的支持,有Mesos模式和Spark On YARN模式,这两者主要区别是资源管理交由谁负责。
Local模式
本地运行,不加任何配置,默认为Local模式。
Standalone模式
Standalone模式中Spark集群中Master和Worker节点构成。用户程序通过与Master节点交互,申请所需资源,Worker节点负责具体Executor的启动运行。
Local Cluster模式
Local Cluster即伪分布式模式,是基于Standalone模式实现的,启动Master(主进程)和Worker(工作进程)的位置全部在本地。
YARN Standalone/YARN Cluster模式
通过Hadoop YARN框架来调度Spark应用所需资源。
3. 运行
运行
spark-submit必至少提供--master和--name参数,这两个参数分别用来设置spark应用的运行模式和名称
spark-submit --master local --name demo_local aboutyun_log_analysis.jar
报错
Exception in thread "main" java.lang.SecurityException: Invalid signature file digest for Manifest main attributes
at sun.security.util.SignatureFileVerifier.processImpl(SignatureFileVerifier.java:314)
at sun.security.util.SignatureFileVerifier.process(SignatureFileVerifier.java:268)
at java.util.jar.JarVerifier.processEntry(JarVerifier.java:316)
at java.util.jar.JarVerifier.update(JarVerifier.java:228)
at java.util.jar.JarFile.initializeVerifier(JarFile.java:383)
at java.util.jar.JarFile.getInputStream(JarFile.java:450)
at sun.misc.JarIndex.getJarIndex(JarIndex.java:137)
at sun.misc.URLClassPath$JarLoader$1.run(URLClassPath.java:839)
at sun.misc.URLClassPath$JarLoader$1.run(URLClassPath.java:831)
at java.security.AccessController.doPrivileged(Native Method)
at sun.misc.URLClassPath$JarLoader.ensureOpen(URLClassPath.java:830)
at sun.misc.URLClassPath$JarLoader.<init>(URLClassPath.java:803)
at sun.misc.URLClassPath$3.run(URLClassPath.java:530)
at sun.misc.URLClassPath$3.run(URLClassPath.java:520)
at java.security.AccessController.doPrivileged(Native Method)
at sun.misc.URLClassPath.getLoader(URLClassPath.java:519)
at sun.misc.URLClassPath.getLoader(URLClassPath.java:492)
at sun.misc.URLClassPath.getNextLoader(URLClassPath.java:457)
at sun.misc.URLClassPath.getResource(URLClassPath.java:211)
at java.net.URLClassLoader$1.run(URLClassLoader.java:365)
at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.util.Utils$.classForName(Utils.scala:175)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:689)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
zip -d aboutyun_log_analysis.jar 'META-INF/.SF' 'META-INF/.RSA' 'META-INF/*SF'
官方文档