HADOOP-19354. S3AInputStream to be created by factory under S3AStore. #7237

Open
wants to merge 3 commits into
base: trunk

@ahmarsuhail (Contributor) commented Dec 20, 2024

Description of PR

This PR makes some additional changes to the initial PR from Steve: #7214.

Not sure if I got all of this right, but here are a few callouts:

  • I thought about moving InputStreamCallbacks into S3AStore, but I don't think that is the right place for it: InputStreamCallbacks uses S3AFileSystemOperations, which in turn uses S3AStore, so the dependencies become circular. Instead, this PR moves them into a separate class, InputStreamCallbacksImpl, which keeps the code out of S3AFileSystem. There may be a better way to do this, but I couldn't think of one.

  • Adds a new config, fs.s3a.input.stream.type, which can be set to classic, prefetch, or analytics. I believe this is better than having multiple flags such as prefetch.enabled and analytics.enabled.

  • I could not figure out what was meant by "S3Store to implement the factory interface, completing final binding operations (callbacks, stats)" in Steve's original PR; let's discuss.

  • Let's merge this into trunk first, and then rebase https://github.com/apache/hadoop/tree/feature-HADOOP-19363-analytics-accelerator-s3 on top of it, and move the analytics stream creation code into the new factory.

  • This is a draft PR; I was just attempting to complete the original PR.
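
As a rough illustration of the single fs.s3a.input.stream.type option, here is a minimal sketch of resolving the value to an enum. The enum name InputStreamType, the use of a plain Map in place of Hadoop's Configuration, and the "classic" default are all assumptions made for this example; they are not taken from the patch.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical enum; only the three values come from the PR description. */
enum InputStreamType {
  CLASSIC, PREFETCH, ANALYTICS;

  /** Option name as given in the PR description. */
  static final String KEY = "fs.s3a.input.stream.type";

  /** Resolve the stream type from a config map; the default is an assumption. */
  static InputStreamType fromConfig(Map<String, String> conf) {
    String raw = conf.getOrDefault(KEY, "classic");
    String name = raw.trim().toLowerCase(Locale.ROOT);
    for (InputStreamType type : values()) {
      if (type.name().toLowerCase(Locale.ROOT).equals(name)) {
        return type;
      }
    }
    throw new IllegalArgumentException(
        "Unsupported value for " + KEY + ": " + raw);
  }
}

class InputStreamTypeDemo {
  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put(InputStreamType.KEY, "prefetch");
    System.out.println(InputStreamType.fromConfig(conf));
  }
}
```

A single enumerated option rejects unknown values in one place, whereas several independent boolean flags would also need a rule for the case where more than one is enabled.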

How was this patch tested?

Not tested. Whoever picks this up for completion can test!

steveloughran and others added 3 commits December 6, 2024 18:45
First iteration
* Factory interface with a parameter object creation method
* Base class AbstractS3AInputStream for all streams to create
* S3AInputStream subclasses that and has a factory
* Production and test code to use it
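
The shape described in this list (a factory interface taking a parameter object, an abstract base stream, and a concrete stream created through the factory) might look roughly like the sketch below. Every name and field here is hypothetical and the "object" is just an in-memory byte array; the real classes in the patch will differ.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Parameter object: bundles what a stream needs (fields are hypothetical). */
final class SketchReadParameters {
  final String key;
  final byte[] data; // stands in for the remote object

  SketchReadParameters(String key, byte[] data) {
    this.key = key;
    this.data = data;
  }
}

/** Base class that every stream created by a factory extends. */
abstract class AbstractSketchInputStream extends InputStream {
  private final SketchReadParameters parameters;

  protected AbstractSketchInputStream(SketchReadParameters parameters) {
    this.parameters = parameters;
  }

  String getKey() {
    return parameters.key;
  }
}

/** A "classic" stream: reads the bytes sequentially. */
final class ClassicSketchInputStream extends AbstractSketchInputStream {
  private final ByteArrayInputStream in;

  ClassicSketchInputStream(SketchReadParameters parameters) {
    super(parameters);
    this.in = new ByteArrayInputStream(parameters.data);
  }

  @Override
  public int read() {
    return in.read();
  }
}

/** Factory interface with a single creation method taking the parameter object. */
interface SketchInputStreamFactory {
  AbstractSketchInputStream readObject(SketchReadParameters parameters) throws IOException;
}

class FactorySketchDemo {
  public static void main(String[] args) throws IOException {
    SketchInputStreamFactory factory = ClassicSketchInputStream::new;
    byte[] data = "hello".getBytes(StandardCharsets.UTF_8);
    try (AbstractSketchInputStream in =
             factory.readObject(new SketchReadParameters("bucket/key", data))) {
      StringBuilder sb = new StringBuilder();
      int b;
      while ((b = in.read()) != -1) {
        sb.append((char) b);
      }
      System.out.println(in.getKey() + " -> " + sb);
    }
  }
}
```

Passing one parameter object instead of a long argument list means a new stream type can need extra inputs without changing the factory method's signature.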

Not done
* Input stream callbacks pushed down to S3Store
* S3Store to dynamically choose factory at startup, stop in close()
* S3Store to implement the factory interface, completing final binding
  operations (callbacks, stats)
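
One possible reading of "choose factory at startup, stop in close()" is sketched below: the store picks a factory once when it starts and stops it when the store itself is closed. The names SketchStore, LifecycleStreamFactory and SimpleFactory are invented for this example, and the factory does nothing except record its state.

```java
import java.util.Locale;

/** Hypothetical factory with a lifecycle; not the interface from the patch. */
interface LifecycleStreamFactory {
  String name();

  boolean isStopped();

  void stop();
}

final class SimpleFactory implements LifecycleStreamFactory {
  private final String name;
  private boolean stopped;

  SimpleFactory(String name) {
    this.name = name;
  }

  @Override
  public String name() {
    return name;
  }

  @Override
  public boolean isStopped() {
    return stopped;
  }

  @Override
  public void stop() {
    stopped = true;
  }
}

/** Stand-in for the store: owns the factory for its whole lifetime. */
final class SketchStore implements AutoCloseable {
  private LifecycleStreamFactory factory;

  /** Choose the factory once, at startup, from the configured stream type. */
  void start(String streamType) {
    String type = streamType.trim().toLowerCase(Locale.ROOT);
    switch (type) {
      case "classic":
      case "prefetch":
        factory = new SimpleFactory(type);
        break;
      default:
        throw new IllegalArgumentException("Unknown stream type: " + streamType);
    }
  }

  LifecycleStreamFactory getFactory() {
    if (factory == null) {
      throw new IllegalStateException("store not started");
    }
    return factory;
  }

  /** Stop the factory when the store is closed. */
  @Override
  public void close() {
    if (factory != null) {
      factory.stop();
      factory = null;
    }
  }
}
```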

Change-Id: I8d0f86ca1f3463d4987a43924f155ce0c0644180
Revision

API: Make clear this is part of the fundamental store Model:

* abstract stream class is now ObjectInputStream
* interface is ObjectInputStreamFactory
* move to package org.apache.hadoop.fs.s3a.impl.model

Implementation: Prefetching stream is created this way too;
adds one extra parameter.

Maybe we should pass conf down too

Change-Id: I5bbb5dfe585528b047a649b6c82a9d0318c7e91e
@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
| --- | --- | --- | --- | --- |
| +0 🆗 | reexec | 6m 42s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | test4tests | 0m 0s | | The patch appears to include 8 new or modified test files. |
| | | | | _ trunk Compile Tests _ |
| +1 💚 | mvninstall | 21m 56s | | trunk passed |
| +1 💚 | compile | 0m 27s | | trunk passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04 |
| +1 💚 | compile | 0m 23s | | trunk passed with JDK Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga |
| +1 💚 | checkstyle | 0m 21s | | trunk passed |
| +1 💚 | mvnsite | 0m 29s | | trunk passed |
| +1 💚 | javadoc | 0m 28s | | trunk passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04 |
| +1 💚 | javadoc | 0m 22s | | trunk passed with JDK Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga |
| +1 💚 | spotbugs | 0m 46s | | trunk passed |
| +1 💚 | shadedclient | 19m 8s | | branch has no errors when building and testing our client artifacts. |
| -0 ⚠️ | patch | 19m 21s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 0m 18s | | the patch passed |
| +1 💚 | compile | 0m 21s | | the patch passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04 |
| +1 💚 | javac | 0m 21s | | the patch passed |
| +1 💚 | compile | 0m 15s | | the patch passed with JDK Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga |
| +1 💚 | javac | 0m 15s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 ⚠️ | checkstyle | 0m 11s | /results-checkstyle-hadoop-tools_hadoop-aws.txt | hadoop-tools/hadoop-aws: The patch generated 26 new + 13 unchanged - 0 fixed = 39 total (was 13) |
| +1 💚 | mvnsite | 0m 21s | | the patch passed |
| -1 ❌ | javadoc | 0m 19s | /results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04.txt | hadoop-tools_hadoop-aws-jdkUbuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04 with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04 generated 8 new + 0 unchanged - 0 fixed = 8 total (was 0) |
| -1 ❌ | javadoc | 0m 17s | /results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga.txt | hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga with JDK Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga generated 8 new + 0 unchanged - 0 fixed = 8 total (was 0) |
| -1 ❌ | spotbugs | 0m 44s | /new-spotbugs-hadoop-tools_hadoop-aws.html | hadoop-tools/hadoop-aws generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 💚 | shadedclient | 18m 57s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| +1 💚 | unit | 2m 10s | | hadoop-aws in the patch passed. |
| -1 ❌ | asflicense | 0m 25s | /results-asflicense.txt | The patch generated 2 ASF License warnings. |
| | | 76m 5s | | |

| Reason | Tests |
| --- | --- |
| SpotBugs | module:hadoop-tools/hadoop-aws |
| | Exceptional return value of java.util.concurrent.ThreadPoolExecutor.submit(Callable) ignored in org.apache.hadoop.fs.s3a.impl.InputStreamCallbacksImpl.submit(CallableRaisingIOE) At InputStreamCallbacksImpl.java:[line 76] |

| Subsystem | Report/Notes |
| --- | --- |
| Docker | ClientAPI=1.47 ServerAPI=1.47 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7237/1/artifact/out/Dockerfile |
| GITHUB PR | #7237 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 416124a83a09 5.15.0-124-generic #134-Ubuntu SMP Fri Sep 27 20:20:17 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / e02791f |
| Default Java | Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7237/1/testReport/ |
| Max. process+thread count | 555 (vs. ulimit of 5500) |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7237/1/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.
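
The SpotBugs finding above concerns a Future returned by ThreadPoolExecutor.submit(Callable) being discarded, which means any exception thrown by the task is never observed. A minimal sketch of one common remedy, returning the Future to the caller, follows. The class SubmitSketch and its method signature are assumptions for illustration; the actual signature of InputStreamCallbacksImpl.submit is not shown in this conversation and may call for a different fix.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical wrapper; shows keeping the Future instead of dropping it. */
final class SubmitSketch {
  private final ExecutorService pool;

  SubmitSketch(ExecutorService pool) {
    this.pool = pool;
  }

  /** Hand the Future back so the caller can observe completion or failure. */
  <T> Future<T> submit(Callable<T> operation) {
    return pool.submit(operation);
  }

  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      SubmitSketch sketch = new SubmitSketch(pool);
      Future<Integer> ok = sketch.submit(() -> 42);
      System.out.println("result: " + ok.get());

      Future<Integer> bad = sketch.submit(() -> {
        throw new IOException("simulated failure");
      });
      try {
        bad.get();
      } catch (ExecutionException e) {
        // The task's exception surfaces only because the Future was kept.
        System.out.println("failed: " + e.getCause().getMessage());
      }
    } finally {
      pool.shutdown();
    }
  }
}
```

If the interface must stay fire-and-forget, alternatives include logging failures inside the task itself or suppressing the warning with a documented justification.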
