Problem description
After installing Spark 2.3 and setting the following environment variables in .bashrc (using Git Bash):
- HADOOP_HOME
- SPARK_HOME
- PYSPARK_PYTHON
- JDK_HOME
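For reference, a minimal sketch of what those .bashrc entries might look like; the install locations below are assumptions (Git Bash style paths) and must match your own layout:

# assumed install locations -- adjust to wherever you unpacked Hadoop/Spark and installed the JDK
export HADOOP_HOME=/c/hadoop
export SPARK_HOME=/c/spark-2.3.0-bin-hadoop2.7
export JDK_HOME=/c/Java/jdk1.8.0_161
export JAVA_HOME=$JDK_HOME   # Spark's launch scripts read JAVA_HOME to locate java
export PYSPARK_PYTHON=python
export PATH=$PATH:$SPARK_HOME/bin:$HADOOP_HOME/bin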
executing $SPARK_HOME/bin/spark-submit displays the following error:
Error: Could not find or load main class org.apache.spark.launcher.Main
I did some research on Stack Overflow and other sites, but could not figure out the problem.
Environment
- Windows 10 Enterprise
- Spark version: 2.3
- Python version: 3.6.4
Can you provide some guidance?
Recommended answer
I had that error message. It may have several root causes, but this is how I investigated and solved the problem (on Linux):
- Instead of launching spark-submit directly, try bash -x spark-submit to see which line fails.
- Repeat that (since spark-submit calls nested scripts) until you find the underlying process being called; in my case it was something like:
/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -cp '/opt/spark-2.2.0-bin-hadoop2.7/conf/:/opt/spark-2.2.0-bin-hadoop2.7/jars/*' -Xmx1g org.apache.spark.deploy.SparkSubmit --class org.apache.spark.repl.Main --name 'Spark shell' spark-shell
So, spark-submit launches a java process but cannot find the org.apache.spark.launcher.Main class using the files in /opt/spark-2.2.0-bin-hadoop2.7/jars/* (see the -cp option above). I did an ls in that jars folder and counted 4 files instead of the whole Spark distribution (~200 files). It was probably a problem during the installation process, so I reinstalled Spark, checked the jars folder, and it worked like a charm.
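As a concrete sketch of that tracing step (spark-submit and spark-class are the scripts shipped in the Spark distribution; --version is only used here as a harmless argument for illustration):

# step 1: trace spark-submit itself; each executed line is echoed with a leading '+',
# and the last one shows it exec'ing bin/spark-class
bash -x "$SPARK_HOME/bin/spark-submit" --version
# step 2: trace the nested script it called, to see the final java command and its -cp option
bash -x "$SPARK_HOME/bin/spark-class" org.apache.spark.deploy.SparkSubmit --version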
So, you should:
- check the java command (its -cp option)
- check your jars folder (does it contain at least all of the spark-*.jar files?); a quick sketch of this check follows below
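A quick way to do that second check, as a hedged sketch (a complete Spark 2.x distribution ships on the order of two hundred jars):

# count the jars shipped with the distribution -- expect ~200, not 4
ls "$SPARK_HOME/jars" | wc -l
# verify the Spark jars themselves are present (spark-core, spark-launcher, spark-sql, ...)
ls "$SPARK_HOME/jars" | grep '^spark-'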
Hope this helps.