Assigning multiple files' path in minio to app_path #72
-
The following files are in my MinIO bucket.
I need to call only …
-
The easiest way to use custom modules is to add them to the Docker image and configure the Python path to use them (for example: #73).
If your files need to be in an S3 bucket and for some reason you need to add them to the job at runtime (it is recommended to avoid doing this if possible), then you should store your modules in a zip file and use the `--py-files` argument in the spark-submit command. That argument is currently not supported by the `spark-on-k8s` package, but there are two alternative ways: setting the `spark.submit.pyFiles` Spark configuration, or calling `pyspark.SparkContext.addPyFile()` and providing the files as arguments.
You can check the PySpark documentation for more information.
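As a minimal sketch of the zip-based approach: the snippet below packages a local module into a zip with the standard library's `zipfile`, then shows (as comments) how that archive could be referenced at runtime. The package name `mymodules`, the archive name `deps.zip`, and the bucket path `s3a://my-bucket/` are all hypothetical placeholders, not names from this thread.

```python
import pathlib
import zipfile

# Hypothetical layout: a local package "mymodules/" with an __init__.py.
pkg = pathlib.Path("mymodules")
pkg.mkdir(exist_ok=True)
(pkg / "__init__.py").write_text('def greet():\n    return "hello"\n')

# Bundle the package into a zip that Spark can distribute to executors.
with zipfile.ZipFile("deps.zip", "w") as zf:
    zf.write(pkg / "__init__.py", arcname="mymodules/__init__.py")

# The archive could then be uploaded to the bucket and shipped at runtime,
# for example via the Spark configuration:
#   spark.submit.pyFiles=s3a://my-bucket/deps.zip
# or from an active SparkContext:
#   sc.addPyFile("s3a://my-bucket/deps.zip")
print(zipfile.ZipFile("deps.zip").namelist())
```

After either of those calls, `import mymodules` would work on the executors because Spark adds the distributed zip to the Python path.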