Assigning multiple files' path in minio to app_path #72

Closed · Answered by hussein-awala
ArijitSinghEDA asked this question in Q&A

The easiest way to use custom modules is to add them to the Docker image and configure the Python path to use them (for example: #73).

If your files need to be in an S3 bucket and for some reason you must add them to the job at runtime (it is recommended to avoid this if possible), then you should store your modules in a zip file and use the --py-files argument of the spark-submit command. That argument is not currently supported by the spark-on-k8s package, but there are two alternative ways, sketched after the list below:

  • providing the zip path through the config spark.submit.pyFiles (first sketch below)
  • updating your job script to call pyspark.SparkContext.addPyFile() with the path of each file (second sketch below)
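
For the first alternative, the idea is to pass spark.submit.pyFiles as an ordinary Spark config when submitting the app. A minimal sketch, assuming the spark-on-k8s client accepts a spark_conf dict on submit_app (check the package docs for the exact parameter name; the image and bucket paths are placeholders):

```python
from spark_on_k8s.client import SparkOnK8S

client = SparkOnK8S()
client.submit_app(
    image="my-registry/my-spark-image:latest",  # placeholder image
    app_path="s3a://my-bucket/jobs/main.py",    # placeholder job script
    spark_conf={
        # Equivalent to spark-submit --py-files: ships the zip to the
        # driver and executors and adds it to their Python path.
        "spark.submit.pyFiles": "s3a://my-bucket/deps.zip",
    },
)
```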
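For the second alternative, the call goes inside the job script itself and must run before anything is imported from the archive. A minimal sketch (the bucket path and my_module/my_function are hypothetical placeholders):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("my-job").getOrCreate()

# Distribute the zip to the driver and every executor and add it to
# their Python path; must run before importing from the archive.
spark.sparkContext.addPyFile("s3a://my-bucket/deps.zip")

from my_module import my_function  # hypothetical module shipped in deps.zip

result = spark.range(10).rdd.map(lambda row: my_function(row.id)).collect()
```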

You can check the PySpark documentation for more details on both options.

Answer selected by ArijitSinghEDA