We recently released an update that handles situations where the Spark context is not available for querying values such as the default parallelism. This should address the issue.
In general, the way to safeguard against this is to explicitly specify the number of partitions when generating the specification for your dataset. This avoids the query against the sparkContext.
While we have not tested against Spark Connect, we have tested against other environments where no sparkContext is available.
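As a rough illustration of the safeguard described above, the sketch below prefers an explicitly requested partition count and only touches sparkContext when none is given. The helper `resolve_partitions`, its `fallback` default, and the stub session class are hypothetical names introduced for illustration; this is not the library's actual implementation, and the exact error raised under Spark Connect is an assumption.

```python
def resolve_partitions(spark, requested=None, fallback=4):
    """Pick a partition count without requiring a sparkContext.

    Hypothetical helper for illustration only, not part of dbldatagen.
    """
    # An explicit request wins, so sparkContext is never touched.
    if requested is not None:
        return requested
    try:
        # Only reached when the caller did not specify a count.
        return spark.sparkContext.defaultParallelism
    except Exception:
        # Assumption: sessions without a sparkContext raise on access.
        return fallback


class ConnectLikeSession:
    """Stub that mimics a session with no usable sparkContext."""

    @property
    def sparkContext(self):
        raise RuntimeError("sparkContext is not available")


session = ConnectLikeSession()
explicit = resolve_partitions(session, requested=8)  # no sparkContext access
defaulted = resolve_partitions(session)              # falls back to 4
```

With the library itself, the equivalent step would be to pass the count when creating the generator specification, for example through the `partitions` argument of `DataGenerator`, so that the default parallelism is never queried.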
ronanstokes-db changed the title from "Upgrade this lib to be compatible with Spark Connect" to "Upgrade this lib to be compatible with Spark Connect / DB Connect" on Jul 16, 2024
Expected Behavior
This library works the same with Spark Connect.
Current Behavior
This library uses sparkSession.sparkContext, which doesn't work with Spark Connect. Here is an example: dbldatagen/dbldatagen/data_generator.py, line 251 in debb29f.
Steps to Reproduce (for bugs)
Run the test suite with Spark Connect enabled and fix all issues.