Spark: Relativize in-memory paths for data file and rewritable delete file locations #11525

Open
wants to merge 1 commit into main
Conversation

amogh-jahagirdar
Contributor

@amogh-jahagirdar amogh-jahagirdar commented Nov 12, 2024

This is a follow-up to https://github.com/apache/iceberg/pull/11273/files#

Instead of broadcasting a map keyed by absolute data file and delete file paths to executors, we can shrink the memory footprint by relativizing the in-memory mapping, and then, just prior to lookup on the executors, reconstructing the absolute paths for the relevant delete files.
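
Roughly, the idea looks something like the sketch below (illustrative helper names only, assuming paths live under the table location; this is not the actual code in this change):

```java
// Illustrative only: relativize a path against the table location before broadcasting,
// and rebuild the absolute path just before lookup on the executor.
static String relativize(String absolutePath, String tableLocation) {
  String prefix = tableLocation.endsWith("/") ? tableLocation : tableLocation + "/";
  // Only strip the prefix when the path actually lives under the table location;
  // anything else stays absolute so lookups still resolve correctly.
  return absolutePath.startsWith(prefix) ? absolutePath.substring(prefix.length()) : absolutePath;
}

static String absolutize(String storedPath, String tableLocation) {
  String prefix = tableLocation.endsWith("/") ? tableLocation : tableLocation + "/";
  // Paths that were kept absolute (e.g. they still contain a scheme) pass through unchanged.
  return storedPath.contains("://") ? storedPath : prefix + storedPath;
}
```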

There are a few ways to go about relativization; in the current implementation I did the simplest thing, which is to relativize against the table location. More sophisticated approaches could save even more memory on paths, such as relativizing against the data file location (requires surfacing more details from LocationProvider), or finding the longest common prefix across all data/delete files in the rewritable deletes (requires a double pass over the tasks: one to identify the longest common prefix via the lexicographically smallest/largest strings, and another to actually reconstruct the delete files). Patricia tries are another possibility, though the serialized representation seems to take about the same amount of memory; I'm not sure why that's the case.
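
For reference, the longest-common-prefix step mentioned above only needs a single pass to find the prefix, since the common prefix of a set of strings equals the common prefix of its lexicographically smallest and largest members. A rough sketch:

```java
// Sketch: find the longest common prefix of all paths by tracking the
// lexicographically smallest and largest strings and comparing just those two.
static String longestCommonPrefix(Iterable<String> paths) {
  String min = null;
  String max = null;
  for (String path : paths) {
    if (min == null || path.compareTo(min) < 0) {
      min = path;
    }
    if (max == null || path.compareTo(max) > 0) {
      max = path;
    }
  }

  if (min == null) {
    return ""; // no paths at all
  }

  int i = 0;
  while (i < min.length() && i < max.length() && min.charAt(i) == max.charAt(i)) {
    i++;
  }

  return min.substring(0, i);
}
```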

I'm also looking into whether Spark's off-heap BytesToBytesMap would save us even more memory, but in the meantime it made sense to at least land this improvement. This is all internal, so we can always remove it down the line if something better comes along.

@github-actions github-actions bot added the spark label Nov 12, 2024
Comment on lines +527 to +528
return deleteLoader.loadPositionDeletes(
rewritableDeletes.deletesFor(path.toString(), specs), path);
Contributor Author

@amogh-jahagirdar amogh-jahagirdar Nov 12, 2024

I'll go back to explicitly returning a null index in case there are no deletes for the given path; the internal implementation of the loader handles that regardless, but it isn't obvious without reading into it.

private DeleteFile relativizeDeleteFile(
DeleteFile deleteFile, Map<Integer, PartitionSpec> specs) {
return FileMetadata.deleteFileBuilder(specs.get(deleteFile.specId()))
.copy(deleteFile)
Contributor Author

@amogh-jahagirdar amogh-jahagirdar Nov 12, 2024

I think we'd have a slight perf hit from having to do this copy (particularly copying the partition data in memory). I'll see if there's any relevant benchmarking we can leverage to measure whether it's really significant or not.

Contributor Author

Maybe there's a way to extend the planning APIs so that we return all paths already relativized, but that seems like a bigger change, and it's not obvious we should do that until there's evidence that this copy is expensive.

@amogh-jahagirdar
Contributor Author

cc @singhpk234

@RussellSpitzer
Member

Just as a gut comment, if we just compressed them, shouldn't we get almost all the benefits we are looking for? They are just a bunch of strings, so the binary representation of all of them should be pretty compressible.

@amogh-jahagirdar
Contributor Author

amogh-jahagirdar commented Nov 13, 2024

> Just as a gut comment, if we just compressed them, shouldn't we get almost all the benefits we are looking for? They are just a bunch of strings, so the binary representation of all of them should be pretty compressible.

It's true the broadcast would be compressed by default via spark.broadcast.compress, and that would minimize the space pretty well. I think the concern is more that when we need to load the map broadcast variable on the executor side, we'd ultimately need to decompress all the chunks of the map. So the goal of relativization is to minimize the size of the in-memory representation of the map after decompression. Let me know if that makes sense.

@RussellSpitzer
Copy link
Member

> It's true the broadcast would be compressed by default via spark.broadcast.compress, and that would minimize the space pretty well. I think the concern is more that when we need to load the map broadcast variable on the executor side, we'd ultimately need to decompress all the chunks of the map. So the goal of relativization is to minimize the size of the in-memory representation of the map after decompression. Let me know if that makes sense.

Won't it all be in memory anyway? Wouldn't Java string interning handle the shared prefixes?
