Setting SPARK_HOME in R Spark example #49

Open
shrutimantri opened this issue Jan 25, 2024 · 2 comments
Labels
bug Something isn't working
Milestone

Comments

@shrutimantri
Contributor

Expected Behavior

Actual Behaviour

In the R Spark flow example provided here: https://kestra.io/plugins/plugin-spark/tasks/io.kestra.plugin.spark.RSubmit
What should SPARK_HOME be set to in the env variable?

This runs in a Docker runner, so it's unclear what SPARK_HOME should be set to.
Once we know exactly what the flow should look like, I can update the documentation accordingly.

Steps To Reproduce

N/A

Environment Information

  • Kestra Version: Any version
  • Plugin version: Any version
  • Operating System (OS / Docker / Kubernetes): N/A
  • Java Version (If not docker): N/A

Example flow

Flow as provided here: https://kestra.io/plugins/plugin-spark/tasks/io.kestra.plugin.spark.RSubmit

@shrutimantri shrutimantri added the bug Something isn't working label Jan 25, 2024
@anna-geller
Member

Sys.getenv("SPARK_HOME") should normally resolve to /opt/bitnami/spark if you're running this script with a default bitnami/spark image
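One way to confirm this inside the container is a small shell check. This is only a sketch: the helper names `resolve_spark_home` and `sparkr_lib_dir` are hypothetical, and the fallback path is simply the bitnami default mentioned above, so other images may use a different location.

```shell
#!/bin/sh
# Hypothetical helper: print SPARK_HOME as the script would see it,
# falling back to the bitnami default when it is unset or empty.
# Assumption: /opt/bitnami/spark only applies to the bitnami/spark image.
resolve_spark_home() {
  printf '%s\n' "${SPARK_HOME:-/opt/bitnami/spark}"
}

# Hypothetical helper: print the directory that
# library(SparkR, lib.loc = ...) in the example flow points at.
sparkr_lib_dir() {
  printf '%s/R/lib\n' "$(resolve_spark_home)"
}

sparkr_lib_dir
```

If the printed directory does not exist in the image, `library(SparkR, ...)` would not be able to load SparkR from it, and SPARK_HOME would need to point elsewhere.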

@anna-geller anna-geller added this to the v0.20.0 milestone Jan 25, 2024
@anna-geller
Member

the default example

id: "r_submit"
type: "io.kestra.plugin.spark.RSubmit"
runner: DOCKER
docker:
  networkMode: host
  user: root
master: spark://localhost:7077
mainScript: |
  library(SparkR, lib.loc = c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib")))
  sparkR.session()

  print("The SparkR session has initialized successfully.")

  sparkR.stop()

from here fails with error:
Exception in thread "main" java.io.IOException: Cannot run program "Rscript": error=2, No such file or directory
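
`error=2` is the operating system's "No such file or directory" code, which suggests the `Rscript` executable is not on the PATH inside the container rather than a problem with SPARK_HOME itself. A minimal sketch for checking that from a shell in the same image (the helper names `have_cmd` and `check_rscript` are hypothetical):

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given command can be resolved on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: report whether Rscript is available in this environment.
check_rscript() {
  if have_cmd Rscript; then
    echo "Rscript found at: $(command -v Rscript)"
  else
    echo "Rscript not found on PATH"
  fi
}

check_rscript
```

If this reports that Rscript is missing, the image would presumably need R installed (or a different image that ships it) before the example can run.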


@github-project-automation github-project-automation bot moved this to Backlog in Issues Jun 10, 2024