Stream simulation results in their own thread #1630
base: develop
Conversation
Force-pushed from ae938e6 to 0db8fbb.
...ker/src/main/java/gov/nasa/jpl/aerie/merlin/worker/postgres/PostgresProfileQueryHandler.java (outdated)
```java
for (final var realEntry : resourceProfiles.realProfiles().entrySet()) {
  addProfileSegmentsToBatch(realEntry.getKey(), realEntry.getValue(), realDynamicsP);
queryQueue.addToQueue(() -> {
  try (var transaction = new PostgresProfileQueryHandler(dataSource, datasetId)) {
```
Why make a new query handler per-transaction, instead of having a single handler instance that is closed in the `close` method? Asking as I'm concerned about running out of DB connections if the queue gets backed up, since the constructor for this class opens a new one.
Good question. Creating a single handler instance caused me to run into a `PSQLException: This statement has been closed` error. I believe this was due to the HikariCP lifecycle auto-closing the connection. The good news is that each `PostgresProfileQueryHandler` will be created as part of the queued task, so it won't be created until the previous one has completed. So we should only have one DB connection open at a time.

That said, I'm not sure of the overhead of creating "new" connections. I assumed that HikariCP maintaining a connection pool meant the overhead was low, but I could be wrong. There's likely a way of maintaining a single handler if we think the overhead is non-negligible.
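For what it's worth, here's a toy sketch of why per-task construction keeps at most one connection open. This is not the PR's code: `FakeHandler` stands in for `PostgresProfileQueryHandler`. A single-threaded executor runs tasks strictly one after another, so each handler is closed before the next is constructed:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SingleConnectionAtATime {
  // Stand-in for PostgresProfileQueryHandler: "opens a connection" on
  // construction and "closes" it via try-with-resources.
  static class FakeHandler implements AutoCloseable {
    final int id;
    FakeHandler(int id) {
      this.id = id;
      System.out.println("open connection for upload " + id);
    }
    void upload() {
      System.out.println("uploading " + id);
    }
    @Override
    public void close() {
      System.out.println("close connection for upload " + id);
    }
  }

  public static void main(String[] args) {
    final ExecutorService queue = Executors.newSingleThreadExecutor();
    for (int i = 0; i < 3; i++) {
      final int upload = i;
      queue.submit(() -> {
        try (var handler = new FakeHandler(upload)) {
          handler.upload();
        }
      });
    }
    queue.shutdown();
    // Prints open/uploading/close for 0, then 1, then 2: the tasks never
    // overlap, so at most one "connection" exists at a time.
  }
}
```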
Hikari does not auto-close the connection; otherwise we would have run into this issue with the `develop` implementation, alongside several other Hikari DB connections that have long lifespans with low activity.

Let me amend this: if Hikari auto-closes a connection, it recreates one to refill the pool.
Additionally, `this statement has been closed` isn't about the connection; it's about a `Statement` object, which has a lifespan independent of the `Connection` object. Basically, when you use the connection to make a new `Statement`, the prior statement (and its corresponding `ResultSet`) is closed, and trying to access it gives the error you received.
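To make the lifecycle concrete, here's a minimal sketch. The JDBC URL and credentials are placeholders, and since the exact auto-close behavior varies by driver, this triggers the same error by closing the statement explicitly:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class ClosedStatementDemo {
  public static void main(String[] args) throws Exception {
    // Placeholder connection details, not from this PR.
    try (Connection connection = DriverManager.getConnection(
        "jdbc:postgresql://localhost:5432/postgres", "user", "password")) {
      final PreparedStatement statement = connection.prepareStatement("SELECT 1");
      final ResultSet results = statement.executeQuery();
      statement.close(); // Closing the Statement also closes its ResultSet.

      // Both of these now fail; with pgjdbc the messages are
      // "This ResultSet is closed." / "This statement has been closed."
      results.next();           // throws PSQLException
      statement.executeQuery(); // would also throw
    }
  }
}
```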
```java
private PreparedStatement postSegmentsStatement;
private PreparedStatement updateDurationStatement;

private long datasetId;
```
This can be `final`.
```java
private PreparedStatement postProfileStatement;
private PreparedStatement postSegmentsStatement;
private PreparedStatement updateDurationStatement;
```
If `prepareStatements` were inlined into the constructor, these could be `final`.
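Something like the following sketch, where the class name and SQL strings are illustrative rather than taken from this PR:

```java
import javax.sql.DataSource;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public final class ProfileQueryHandlerSketch implements AutoCloseable {
  private final Connection connection;
  private final PreparedStatement postProfileStatement;
  private final PreparedStatement postSegmentsStatement;
  private final PreparedStatement updateDurationStatement;

  public ProfileQueryHandlerSketch(final DataSource dataSource) throws SQLException {
    this.connection = dataSource.getConnection();
    // Preparing the statements directly in the constructor, instead of in a
    // separate prepareStatements() helper, lets the fields above be final.
    this.postProfileStatement = connection.prepareStatement("INSERT INTO profile VALUES (?, ?)");
    this.postSegmentsStatement = connection.prepareStatement("INSERT INTO profile_segment VALUES (?, ?)");
    this.updateDurationStatement = connection.prepareStatement("UPDATE profile SET duration = ? WHERE id = ?");
  }

  @Override
  public void close() throws SQLException {
    postProfileStatement.close();
    postSegmentsStatement.close();
    updateDurationStatement.close();
    connection.close();
  }
}
```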
```java
 * Utility class to handle upload of resource profiles to the database. This class allows for each upload to have its
 * own Connection object.
```
Nonblocking suggestion: remove the second part of this class comment, since "each upload having its own Connection object" isn't an inherent property of the class (the same instance can work for multiple uploads):

```java
 * Utility class to handle upload of resource profiles to the database.
```
```java
private long datasetId;
private PostgresQueryQueue queryQueue;
```
These can be `final`.
```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PostgresQueryQueue {
```
This class can be inlined into `PostgresProfileStreamer`, since it's just a wrapper around an executor, albeit one with most methods hidden. I'd recommend doing this unless you see a use elsewhere or feel it's important to hide these methods.
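For reference, a wrapper of this kind is roughly the following sketch. It assumes `addToQueue` matches the call site in the diff above; the shutdown timeout and close semantics are illustrative, not necessarily the PR's:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PostgresQueryQueue implements AutoCloseable {
  private final ExecutorService executor = Executors.newSingleThreadExecutor();

  // A single-threaded executor runs submitted tasks one at a time, in
  // submission order, which preserves the ordering of uploads.
  public void addToQueue(final Runnable task) {
    executor.submit(task);
  }

  @Override
  public void close() {
    executor.shutdown(); // Already-queued tasks still run to completion.
    try {
      executor.awaitTermination(1, TimeUnit.MINUTES);
    } catch (final InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}
```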
Description

This should not be a breaking change.

This PR introduces threaded uploads of simulation results to the database. It is a follow-up to this PR, in which streaming simulation results were introduced.

Implementation Details

- Adds a `PostgresProfileQueryHandler` class, which abstracts out database query logic that was previously in `PostgresProfileStreamer`.
- Uploads are enqueued so that `PostgresProfileStreamer.accept()` calls will be in order.
- Adds a `PostgresQueryQueue`, which enqueues concurrent uploads using an `ExecutorService`.
- `PostgresProfileStreamer` creates a single instance of the query queue, which handles all uploads and is automatically closed.

Verification
Initially, I had hoped to profile against the state of the code prior to the original streaming PR, but some version mismatches in Clipper made this difficult. I instead used the `feature--aerie-3.2.0` branch of `clipper-aerie`. I then extended Clipper's `gnc-all-activities` plan by repeating its content every day for five days. I'm happy to make this plan available to anyone who would like to see it or use it for future reference, but I doubt I can upload it here in this public venue.

I ran profiling against this extended `gnc-all-activities` plan twice: once with high-overhead memory instrumentation and once with async sampling. I encountered errors when extracting mission model parameters, but this apparently did not impact simulation results, and the error message in the console was not helpful.

Here are the results from using async sampling:
[Profiling screenshots: develop vs. feature/streaming-thread]
Statistics for these results: [statistics table]
As expected, the primary gain here is in simulation duration, as we are no longer blocking simulation to upload results to the database. I'll attach some further sim results for anyone interested.
ProfilingResults.zip
Documentation
No documentation needs updating.
Future work