Setting Up Spark in Standalone Mode on Windows

Published: 2019-06-19


Java

Install Java 8, set JAVA_HOME, and add %JAVA_HOME%\bin to the PATH environment variable.
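For example, from a command prompt (a minimal sketch: the JDK path below is an assumption, so substitute your actual install location; setx only affects consoles opened afterwards):

E:\>setx JAVA_HOME "C:\Program Files\Java\jdk1.8.0_60"

Then add %JAVA_HOME%\bin to PATH through System Properties → Environment Variables (typing the entry there keeps the variable reference unexpanded) and open a new console. The same pattern applies to SCALA_HOME and SPARK_HOME below. Verify the install: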

E:\>java -version
java version "1.8.0_60"
Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode)

Scala

Download and unpack Scala, set SCALA_HOME, and add %SCALA_HOME%\bin to PATH, then verify:

E:\>scala -version
Scala code runner version 2.11.7 -- Copyright 2002-2013, LAMP/EPFL

Spark

Download and unpack Spark, set SPARK_HOME, and add %SPARK_HOME%\bin to PATH. Now try running spark-shell from the console; it fails with an error saying winutils.exe cannot be located:

E:\>spark-shell
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
17/06/05 21:34:43 ERROR Shell: Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
        at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:379)
        at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:394)
        at org.apache.hadoop.util.Shell.<clinit>(Shell.java:387)
        at org.apache.hadoop.hive.conf.HiveConf$ConfVars.findHadoopBinary(HiveConf.java:2327)
        at org.apache.hadoop.hive.conf.HiveConf$ConfVars.<clinit>(HiveConf.java:365)
        at org.apache.hadoop.hive.conf.HiveConf.<clinit>(HiveConf.java:105)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:348)
        at org.apache.spark.util.Utils$.classForName(Utils.scala:229)
        at org.apache.spark.sql.SparkSession$.hiveClassesArePresent(SparkSession.scala:991)
        at org.apache.spark.repl.Main$.createSparkSession(Main.scala:92)
        at $line3.$read$$iw$$iw.<init>(<console>:15)
        at $line3.$read$$iw.<init>(<console>:42)
        at $line3.$read.<init>(<console>:44)
        at $line3.$read$.<init>(<console>:48)
        at $line3.$read$.<clinit>(<console>)
        at $line3.$eval$.$print$lzycompute(<console>:7)
        at $line3.$eval$.$print(<console>:6)
        at $line3.$eval.$print(<console>)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:786)
        at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1047)
        at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:638)
        at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:637)
        at scala.reflect.internal.util.ScalaClassLoader$class.asContext(ScalaClassLoader.scala:31)
        at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:19)
        at scala.tools.nsc.interpreter.IMain$WrappedRequest.loadAndRunReq(IMain.scala:637)
        at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:569)
        at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:565)
        at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:807)
        at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:681)
        at scala.tools.nsc.interpreter.ILoop.processLine(ILoop.scala:395)
        at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply$mcV$sp(SparkILoop.scala:38)
        at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply(SparkILoop.scala:37)
        at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply(SparkILoop.scala:37)
        at scala.tools.nsc.interpreter.IMain.beQuietDuring(IMain.scala:214)
        at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:37)
        at org.apache.spark.repl.SparkILoop.loadFiles(SparkILoop.scala:105)
        at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply$mcZ$sp(ILoop.scala:920)
        at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
        at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
        at scala.reflect.internal.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:97)
        at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:909)
        at org.apache.spark.repl.Main$.doMain(Main.scala:69)
        at org.apache.spark.repl.Main$.main(Main.scala:52)
        at org.apache.spark.repl.Main.main(Main.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:743)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

The error message shows that Spark depends on some Hadoop libraries, which it locates through the HADOOP_HOME environment variable; since we never set that variable, the path is rendered as null\bin\winutils.exe. This does not mean we have to install Hadoop: we can simply download winutils.exe to any location on disk, for example C:\winutils\bin\winutils.exe, and set HADOOP_HOME=C:\winutils, as sketched below.
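A minimal sketch of that step (the download source is an assumption: prebuilt winutils.exe binaries matching your Hadoop version are commonly obtained from the steveloughran/winutils repository on GitHub):

E:\>mkdir C:\winutils\bin
(copy the downloaded winutils.exe into C:\winutils\bin)
E:\>setx HADOOP_HOME C:\winutils

Note that HADOOP_HOME must point at the directory containing bin\winutils.exe, not at bin itself, and that a new console is needed for the variable to take effect.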

Now running spark-shell again produces a new error:

java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveSessionState':
  at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:981)
  at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:110)
  at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:109)
  at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:878)
  at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:878)
  at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
  at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
  at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
  at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
  at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
  at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:878)
  at org.apache.spark.repl.Main$.createSparkSession(Main.scala:96)
  ... 47 elided
Caused by: java.lang.reflect.InvocationTargetException: java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveExternalCatalog':
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
  at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:978)
  ... 58 more
Caused by: java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveExternalCatalog':
  at org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:169)
  at org.apache.spark.sql.internal.SharedState.<init>(SharedState.scala:86)
  at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101)
  at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.sql.SparkSession.sharedState$lzycompute(SparkSession.scala:101)
  at org.apache.spark.sql.SparkSession.sharedState(SparkSession.scala:100)
  at org.apache.spark.sql.internal.SessionState.<init>(SessionState.scala:157)
  at org.apache.spark.sql.hive.HiveSessionState.<init>(HiveSessionState.scala:32)
  ... 63 more
Caused by: java.lang.reflect.InvocationTargetException: java.lang.reflect.InvocationTargetException: java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: ---------
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
  at org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:166)
  ... 71 more
Caused by: java.lang.reflect.InvocationTargetException: java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: ---------
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
  at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264)
  at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:358)
  at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:262)
  at org.apache.spark.sql.hive.HiveExternalCatalog.<init>(HiveExternalCatalog.scala:66)
  ... 76 more
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: ---------
  at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
  at org.apache.spark.sql.hive.client.HiveClientImpl.<init>(HiveClientImpl.scala:188)
  ... 84 more
Caused by: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: ---------
  at org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:612)
  at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:554)
  at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:508)
  ... 85 more
<console>:14: error: not found: value spark
       import spark.implicits._
              ^
<console>:14: error: not found: value spark
       import spark.sql
              ^
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.1.1
      /_/

Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_60)
Type in expressions to have them evaluated.
Type :help for more information.

scala>

The error message says the temporary scratch directory /tmp/hive is not writable:

The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: ---------

So we need to fix the permissions on E:\tmp\hive (I ran spark-shell from drive E:, so the directory was created there; if you run it from another drive, substitute that drive letter before \tmp\hive). Run the following command:

E:\>C:\winutils\bin\winutils.exe chmod 777 E:\tmp\hive
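To confirm the change took effect, winutils also provides an ls subcommand that mirrors the Unix tool (a quick check of my own, not shown in the original post):

E:\>C:\winutils\bin\winutils.exe ls E:\tmp\hive

The permissions column of the listing should now read drwxrwxrwx.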

Run spark-shell again, and Spark starts successfully. The Spark UI is now reachable in a browser at http://localhost:4040 (the default port).
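As a quick smoke test (a minimal sketch: the variable name is arbitrary, and the echoed lines show what spark-shell typically prints rather than output captured from the original post), run a small job and watch it appear under the Jobs tab of the UI:

scala> val nums = sc.parallelize(1 to 100)
nums: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[0] at parallelize at <console>:24

scala> nums.sum
res0: Double = 5050.0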

Reposted from: https://juejin.im/post/5a3dad9ff265da432f3151a1
