[BigData] Building a Custom Apache Spark Distribution for ARM64
When building applications on top of Spark, most people simply download the pre-built distribution from the official Spark website, and those official binaries target the Intel x86 architecture. Because Java bytecode itself is architecture-independent, I had never run into this problem with Spark on Kubernetes. However, when starting Spark Standalone in a real deployment I hit the errors below, which come from the External Shuffle Service: the JNI native library it relies on is only shipped for x86_64. The error does not appear when applications are driven through Spark on Kubernetes, but it does appear once the external shuffle service is involved, which is why I ended up using the make-distribution.sh script that Spark provides to repackage Apache Spark for ARM64.
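For reference, a standalone Worker only starts the External Shuffle Service (the component that crashes below) when it is explicitly enabled; one common way to do that is via the worker's environment. This is an illustrative sketch only, and your deployment may set spark.shuffle.service.enabled in spark-defaults.conf instead:
# conf/spark-env.sh on each worker: ask the standalone Worker to start the
# External Shuffle Service, which is the code path that fails to load leveldbjni on ARM64.
export SPARK_WORKER_OPTS="-Dspark.shuffle.service.enabled=true"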
Error message on Spark 3.3.0:
01:52:12.422 ERROR SparkUncaughtExceptionHandler - Uncaught exception in thread Thread[main,5,main]
java.lang.UnsatisfiedLinkError: Could not load library. Reasons: [no leveldbjni64-1.8 in java.library.path, no leveldbjni-1.8 in java.library.path, no leveldbjni in java.library.path, /tmp/libleveldbjni-64-1-1645791443138296863.8: /tmp/libleveldbjni-64-1-1645791443138296863.8: cannot open shared object file: No such file or directory (Possible cause: can't load AMD 64-bit .so on a AARCH64-bit platform)]
at org.fusesource.hawtjni.runtime.Library.doLoad(Library.java:182) ~[jline-2.14.6.jar:?]
at org.fusesource.hawtjni.runtime.Library.load(Library.java:140) ~[jline-2.14.6.jar:?]
at org.fusesource.leveldbjni.JniDBFactory.<clinit>(JniDBFactory.java:48) ~[leveldbjni-all-1.8.jar:1.8]
at org.apache.spark.network.util.LevelDBProvider.initLevelDB(LevelDBProvider.java:48) ~[spark-network-common_2.12-3.3.0.jar:3.3.0]
at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.<init>(ExternalShuffleBlockResolver.java:126) ~[spark-network-shuffle_2.12-3.3.0.jar:3.3.0]
at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.<init>(ExternalShuffleBlockResolver.java:99) ~[spark-network-shuffle_2.12-3.3.0.jar:3.3.0]
at org.apache.spark.network.shuffle.ExternalBlockHandler.<init>(ExternalBlockHandler.java:81) ~[spark-network-shuffle_2.12-3.3.0.jar:3.3.0]
at org.apache.spark.deploy.ExternalShuffleService.newShuffleBlockHandler(ExternalShuffleService.scala:82) ~[spark-core_2.12-3.3.0.jar:3.3.0]
at org.apache.spark.deploy.ExternalShuffleService.<init>(ExternalShuffleService.scala:56) ~[spark-core_2.12-3.3.0.jar:3.3.0]
at org.apache.spark.deploy.worker.Worker.<init>(Worker.scala:183) ~[spark-core_2.12-3.3.0.jar:3.3.0]
at org.apache.spark.deploy.worker.Worker$.startRpcEnvAndEndpoint(Worker.scala:966) ~[spark-core_2.12-3.3.0.jar:3.3.0]
at org.apache.spark.deploy.worker.Worker$.main(Worker.scala:934) ~[spark-core_2.12-3.3.0.jar:3.3.0]
at org.apache.spark.deploy.worker.Worker.main(Worker.scala) ~[spark-core_2.12-3.3.0.jar:3.3.0]
01:52:12.428 INFO ShutdownHookManager - Shutdown hook called
Error message on Spark 3.5.3:
23:26:43.438 ERROR SparkUncaughtExceptionHandler - Uncaught exception in thread Thread[main,5,main]
java.lang.UnsatisfiedLinkError: Could not load library. Reasons: [no leveldbjni64-1.8 in java.library.path: /usr/java/packages/lib:/usr/lib/aarch64-linux-gnu/jni:/lib/aarch64-linux-gnu:/usr/lib/aarch64-linux-gnu:/usr/lib/jni:/lib:/usr/lib, no leveldbjni-1.8 in java.library.path: /usr/java/packages/lib:/usr/lib/aarch64-linux-gnu/jni:/lib/aarch64-linux-gnu:/usr/lib/aarch64-linux-gnu:/usr/lib/jni:/lib:/usr/lib, no leveldbjni in java.library.path: /usr/java/packages/lib:/usr/lib/aarch64-linux-gnu/jni:/lib/aarch64-linux-gnu:/usr/lib/aarch64-linux-gnu:/usr/lib/jni:/lib:/usr/lib, /tmp/libleveldbjni-64-1-5086456728708123234.8: /tmp/libleveldbjni-64-1-5086456728708123234.8: cannot open shared object file: No such file or directory (Possible cause: can't load AMD 64 .so on a AARCH64 platform)]
at org.fusesource.hawtjni.runtime.Library.doLoad(Library.java:182) ~[leveldbjni-all-1.8.jar:1.8]
at org.fusesource.hawtjni.runtime.Library.load(Library.java:140) ~[leveldbjni-all-1.8.jar:1.8]
at org.fusesource.leveldbjni.JniDBFactory.<clinit>(JniDBFactory.java:48) ~[leveldbjni-all-1.8.jar:1.8]
at org.apache.spark.network.util.LevelDBProvider.initLevelDB(LevelDBProvider.java:48) ~[spark-network-common_2.12-3.5.3.jar:3.5.3]
at org.apache.spark.network.util.DBProvider.initDB(DBProvider.java:40) ~[spark-network-common_2.12-3.5.3.jar:3.5.3]
at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.<init>(ExternalShuffleBlockResolver.java:131) ~[spark-network-shuffle_2.12-3.5.3.jar:3.5.3]
at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.<init>(ExternalShuffleBlockResolver.java:100) ~[spark-network-shuffle_2.12-3.5.3.jar:3.5.3]
at org.apache.spark.network.shuffle.ExternalBlockHandler.<init>(ExternalBlockHandler.java:81) ~[spark-network-shuffle_2.12-3.5.3.jar:3.5.3]
at org.apache.spark.deploy.ExternalShuffleService.newShuffleBlockHandler(ExternalShuffleService.scala:88) ~[spark-core_2.12-3.5.3.jar:3.5.3]
at org.apache.spark.deploy.ExternalShuffleService.<init>(ExternalShuffleService.scala:57) ~[spark-core_2.12-3.5.3.jar:3.5.3]
at org.apache.spark.deploy.worker.Worker.<init>(Worker.scala:183) ~[spark-core_2.12-3.5.3.jar:3.5.3]
at org.apache.spark.deploy.worker.Worker$.startRpcEnvAndEndpoint(Worker.scala:967) ~[spark-core_2.12-3.5.3.jar:3.5.3]
at org.apache.spark.deploy.worker.Worker$.main(Worker.scala:935) ~[spark-core_2.12-3.5.3.jar:3.5.3]
at org.apache.spark.deploy.worker.Worker.main(Worker.scala) ~[spark-core_2.12-3.5.3.jar:3.5.3]
23:26:43.452 INFO ShutdownHookManager - Shutdown hook called
Building the Spark source with make-distribution.sh
To repackage the Spark source for the aarch64 architecture, run the following command, making sure it is executed with an arm64 JDK (a quick way to verify this follows the command):
./dev/make-distribution.sh --name hadoop3 --pip --r --tgz -Psparkr -Phive -Phive-thriftserver -Pyarn -Pkubernetes -Phadoop-3 -Paarch64 -Dhadoop.version=3.3.4 -e -DskipTests
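Before kicking off the build, it is worth double-checking that the JVM on the PATH really is an arm64 one. This is only a quick sanity check; the exact property values depend on your JDK vendor and OS:
# os.arch should report aarch64 (uname -m prints aarch64 on Linux, arm64 on macOS);
# if it reports amd64/x86_64, Maven's os.detected.arch will not be aarch_64 and the
# build may keep pulling x86 native artifacts.
java -XshowSettings:properties -version 2>&1 | grep os.arch
uname -m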
Note: When building for an ARM64 environment you must add the -Paarch64 profile. I left it out at first, so the build kept using leveldbjni-1.8.jar from the original org.fusesource.leveldbjni group and the error persisted. Inspecting the Spark project's dependencies shows that, besides the root pom.xml, modules such as kvstore, network-common, and spark-network-common also depend on leveldbjni; a quick way to confirm which artifact actually resolves is shown below.
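As a sanity check, a minimal sketch using Spark's bundled Maven wrapper and the common/network-common module (adapt the module, or drop -pl to scan the whole reactor) shows which leveldbjni artifact the aarch64 profile resolves:
# With -Paarch64 the tree should list org.openlabtesting.leveldbjni:leveldbjni-all:1.8;
# without the profile, org.fusesource.leveldbjni:leveldbjni-all:1.8 is resolved instead.
./build/mvn -pl common/network-common -Paarch64 dependency:tree -Dincludes='*:leveldbjni-all'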

Inspecting the artifacts directly also confirms that the leveldbjni-1.8.jar published under org.openlabtesting.leveldbjni does ship an .so built for aarch64, unlike the org.fusesource one (see the check below).
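One way to verify this is to list the native libraries inside each jar. This is a small sketch that assumes both artifacts have already been downloaded into the local Maven repository under ~/.m2; the exact paths may differ on your machine:
# The openlabtesting jar should list a libleveldbjni .so built for aarch64,
# while the fusesource jar only ships x86/x86_64 builds.
unzip -l ~/.m2/repository/org/openlabtesting/leveldbjni/leveldbjni-all/1.8/leveldbjni-all-1.8.jar | grep '\.so'
unzip -l ~/.m2/repository/org/fusesource/leveldbjni/leveldbjni-all/1.8/leveldbjni-all-1.8.jar | grep '\.so'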

When the build starts, the following messages let you verify that the correct build environment is in use. Since the goal here is to package for ARM64, os.detected.arch must be aarch_64.
[INFO] Error stacktraces are turned on.
[INFO] Scanning for projects...
[INFO] ------------------------------------------------------------------------
[INFO] Detecting the operating system and CPU architecture
[INFO] ------------------------------------------------------------------------
[INFO] os.detected.name: osx
[INFO] os.detected.arch: aarch_64
[INFO] os.detected.version: 14.6
[INFO] os.detected.version.major: 14
[INFO] os.detected.version.minor: 6
[INFO] os.detected.classifier: osx-aarch_64
[INFO] ------------------------------------------------------------------------
If the Maven build succeeds, a summary like the following is printed:
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Spark Project Parent POM 3.5.3:
[INFO]
[INFO] Spark Project Parent POM ........................... SUCCESS [ 2.998 s]
[INFO] Spark Project Tags ................................. SUCCESS [ 4.224 s]
[INFO] Spark Project Sketch ............................... SUCCESS [ 2.017 s]
[INFO] Spark Project Local DB ............................. SUCCESS [ 3.241 s]
[INFO] Spark Project Common Utils ......................... SUCCESS [ 8.526 s]
[INFO] Spark Project Networking ........................... SUCCESS [ 4.579 s]
[INFO] Spark Project Shuffle Streaming Service ............ SUCCESS [ 3.266 s]
[INFO] Spark Project Unsafe ............................... SUCCESS [ 2.728 s]
[INFO] Spark Project Launcher ............................. SUCCESS [ 2.892 s]
[INFO] Spark Project Core ................................. SUCCESS [01:08 min]
[INFO] Spark Project ML Local Library ..................... SUCCESS [ 11.242 s]
[INFO] Spark Project GraphX ............................... SUCCESS [ 15.844 s]
[INFO] Spark Project Streaming ............................ SUCCESS [ 15.650 s]
[INFO] Spark Project SQL API .............................. SUCCESS [ 16.183 s]
[INFO] Spark Project Catalyst ............................. SUCCESS [ 43.525 s]
[INFO] Spark Project SQL .................................. SUCCESS [ 42.269 s]
[INFO] Spark Project ML Library ........................... SUCCESS [ 48.001 s]
[INFO] Spark Project Tools ................................ SUCCESS [ 4.290 s]
[INFO] Spark Project Hive ................................. SUCCESS [ 19.292 s]
[INFO] Spark Project REPL ................................. SUCCESS [ 7.000 s]
[INFO] Spark Project YARN Shuffle Service ................. SUCCESS [ 32.032 s]
[INFO] Spark Project YARN ................................. SUCCESS [ 11.904 s]
[INFO] Spark Project Kubernetes ........................... SUCCESS [ 11.906 s]
[INFO] Spark Project Hive Thrift Server ................... SUCCESS [ 14.014 s]
[INFO] Spark Project Assembly ............................. SUCCESS [ 4.154 s]
[INFO] Kafka 0.10+ Token Provider for Streaming ........... SUCCESS [ 6.091 s]
[INFO] Spark Integration for Kafka 0.10 ................... SUCCESS [ 9.020 s]
[INFO] Kafka 0.10+ Source for Structured Streaming ........ SUCCESS [ 11.082 s]
[INFO] Spark Project Examples ............................. SUCCESS [ 26.970 s]
[INFO] Spark Integration for Kafka 0.10 Assembly .......... SUCCESS [ 2.161 s]
[INFO] Spark Avro ......................................... SUCCESS [ 9.767 s]
[INFO] Spark Project Connect Common ....................... SUCCESS [ 24.701 s]
[INFO] Spark Protobuf ..................................... SUCCESS [ 8.417 s]
[INFO] Spark Project Connect Server ....................... SUCCESS [ 21.896 s]
[INFO] Spark Project Connect Client ....................... SUCCESS [ 25.447 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 09:06 min
[INFO] Finished at: 2025-07-18T23:20:09+08:00
[INFO] ------------------------------------------------------------------------
The script then packages the remaining components such as pyspark and the SparkR front end. My first build failed at this stage with the messages below; on the face of it they looked like syntax errors in a few Rd files.
Use inherits() (or maybe is()) instead.
* checking Rd files ... NOTE
checkRd: (-1) column_collection_functions.Rd:324: Lost braces
324 | \url{https://spark.apache.org/docs/latest/sql-data-sources-json.html#data-source-option}{Data Source Option}
| ^
checkRd: (-1) column_collection_functions.Rd:332: Lost braces
332 | \url{https://spark.apache.org/docs/latest/sql-data-sources-csv.html#data-source-option}{Data Source Option}
| ^
checkRd: (-1) read.jdbc.Rd:45: Lost braces
45 | \url{https://spark.apache.org/docs/latest/sql-data-sources-jdbc.html#data-source-option}{Data Source Option} in the version you use.
| ^
checkRd: (-1) read.json.Rd:14: Lost braces
14 | \url{https://spark.apache.org/docs/latest/sql-data-sources-json.html#data-source-option}{Data Source Option} in the version you use.}
| ^
checkRd: (-1) read.orc.Rd:14: Lost braces
14 | \url{https://spark.apache.org/docs/latest/sql-data-sources-orc.html#data-source-option}{Data Source Option} in the version you use.}
| ^
checkRd: (-1) read.parquet.Rd:14: Lost braces
14 | \url{https://spark.apache.org/docs/latest/sql-data-sources-parquet.html#data-source-option}{Data Source Option} in the version you use.}
| ^
checkRd: (-1) read.text.Rd:14: Lost braces
14 | \url{https://spark.apache.org/docs/latest/sql-data-sources-text.html#data-source-option}{Data Source Option} in the version you use.}
| ^
checkRd: (-1) repartition.Rd:24: Lost braces in \itemize; meant \describe ?
checkRd: (-1) repartition.Rd:25-26: Lost braces in \itemize; meant \describe ?
checkRd: (-1) repartition.Rd:27-28: Lost braces in \itemize; meant \describe ?
checkRd: (-1) repartitionByRange.Rd:24-25: Lost braces in \itemize; meant \describe ?
checkRd: (-1) repartitionByRange.Rd:26-27: Lost braces in \itemize; meant \describe ?
checkRd: (-1) spark.kmeans.Rd:69: Lost braces; missing escapes or markup?
69 | (cluster centers of the transformed data), {is.loaded} (whether the model is loaded
| ^
checkRd: (-1) write.jdbc.Rd:28-29: Lost braces
28 | \url{https://spark.apache.org/docs/latest/sql-data-sources-jdbc.html#data-source-option}{
| ^
checkRd: (-1) write.json.Rd:19: Lost braces
19 | \url{https://spark.apache.org/docs/latest/sql-data-sources-json.html#data-source-option}{Data Source Option} in the version you use.}
| ^
checkRd: (-1) write.orc.Rd:19: Lost braces
19 | \url{https://spark.apache.org/docs/latest/sql-data-sources-orc.html#data-source-option}{Data Source Option} in the version you use.}
| ^
checkRd: (-1) write.parquet.Rd:19: Lost braces
19 | \url{https://spark.apache.org/docs/latest/sql-data-sources-parquet.html#data-source-option}{Data Source Option} in the version you use.}
| ^
checkRd: (-1) write.text.Rd:19: Lost braces
19 | \url{https://spark.apache.org/docs/latest/sql-data-sources-text.html#data-source-option}{Data Source Option} in the version you use.}
|
* checking Rd metadata ... OK
* checking Rd line widths ... OK
* checking Rd cross-references ... OK
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... OK
* checking Rd contents ... OK
* checking for unstated dependencies in examples ... OK
* checking installed files from ‘inst/doc’ ... OK
* checking files in ‘vignettes’ ... OK
* checking examples ... OK
* checking for unstated dependencies in ‘tests’ ... OK
* checking tests ... SKIPPED
* checking for unstated dependencies in vignettes ... OK
* checking package vignettes ... OK
* checking re-building of vignette outputs ... [8s/47s] OK
* checking PDF version of manual ... WARNING
LaTeX errors when creating PDF version.
This typically indicates Rd problems.
* checking PDF version of manual without index ... ERROR
Re-running with no redirection of stdout/stderr.
Hmm ... looks like a package
Converting parsed Rd's to LaTeX ..........................
Creating pdf output from LaTeX ...
Error in texi2dvi(file = file, pdf = TRUE, clean = clean, quiet = quiet, :
pdflatex is not available
Error in texi2dvi(file = file, pdf = TRUE, clean = clean, quiet = quiet, :
pdflatex is not available
Error in running tools::texi2pdf()
...
* checking for non-standard things in the check directory ... NOTE
Found the following files/directories:
‘.DS_Store’ ‘SparkR-manual.tex’
* checking for detritus in the temp directory ... OK
* DONE
Status: 1 ERROR, 1 WARNING, 8 NOTEs
Solution: install TinyTeX
Install TinyTeX in the local R environment with the following commands:
install.packages("tinytex")
tinytex::install_tinytex()
Then run the following to link the TinyTeX executables onto the PATH and confirm the installation works:
tinytex::tlmgr_path()
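To confirm that pdflatex is now reachable by the build, either of the following should print a non-empty path before make-distribution.sh is re-run. This is a minimal check; the reported path depends on where TinyTeX installed its symlinks:
# Both should print the pdflatex linked by tlmgr_path(); empty output means the
# SparkR packaging step will still fail with "pdflatex is not available".
which pdflatex
Rscript -e 'cat(Sys.which("pdflatex"))'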