Pipeline logs should be easily accessible for developers to help debug issues they are having. We would like to show a list of previously executed pipelines, the summary of the pipeline that ran, and the logs generated by them.
Motivation
Currently, all logs are written to log files. This makes it difficult to track issues when multiple jobs are running concurrently. We would also like to start surfacing more information to users during model training about the accuracy of the models being trained.
Design Proposal
Showing Pipeline History
Displaying Formatted Logs for Pipeline History
Storing Log Files
Post-process the current log files and store the results in a single file after the pipeline finishes executing
Add additional logging that appends to a per-pipeline log file in a pipeline execution folder, which can be retrieved via the REST API
Add another table to the PostgreSQL database to store pipeline logs, indexed by task ID so we can quickly pull them up
Add another type of database better suited to logs, such as MongoDB, and write the logs there
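As a rough illustration of the database-table option, the sketch below stores log lines in a table indexed by task ID. It is only a sketch under assumptions: the table name `pipeline_logs`, its columns, and the helper functions are hypothetical, and it uses Python's built-in `sqlite3` as a stand-in so it runs without a server. The real change would target the existing PostgreSQL database and its own migration tooling.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions,
# not the project's actual schema. SQLite stands in for PostgreSQL here.
SCHEMA = """
CREATE TABLE IF NOT EXISTS pipeline_logs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    task_id TEXT NOT NULL,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    level TEXT NOT NULL,
    message TEXT NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_pipeline_logs_task_id
    ON pipeline_logs (task_id);
"""


def open_log_store(path=":memory:"):
    """Open the store and create the table and index if they are missing."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def append_log(conn, task_id, level, message):
    """Append one log line for a task; each call commits its own transaction."""
    with conn:
        conn.execute(
            "INSERT INTO pipeline_logs (task_id, level, message) VALUES (?, ?, ?)",
            (task_id, level, message),
        )


def fetch_logs(conn, task_id):
    """Return (level, message) pairs for one task in insertion order."""
    rows = conn.execute(
        "SELECT level, message FROM pipeline_logs WHERE task_id = ? ORDER BY id",
        (task_id,),
    )
    return rows.fetchall()
```

Because every row carries its task ID, log lines from concurrently running jobs can share one table and still be read back per pipeline.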
Retrieving Log Files
We already have an endpoint that returns the pipeline history on the server. We will need an additional endpoint to return the logs. The UI will then need to perform some formatting to display them.
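A minimal sketch of what the new logs endpoint and the UI-side formatting could look like is given here. Everything named in it is an assumption for illustration: the route `GET /pipelines/<task_id>/logs`, the JSON shape, and the functions `logs_response` and `format_entries` do not exist in the project. The formatting step is written in Python only to keep the example self-contained; in practice it would live in the front end.

```python
import json


def logs_response(log_rows, task_id):
    """Build (status, body) for a hypothetical GET /pipelines/<task_id>/logs.

    `log_rows` maps a task ID to a list of (timestamp, level, message)
    tuples; it stands in for whichever storage option is chosen.
    """
    if task_id not in log_rows:
        return 404, json.dumps({"error": f"no logs for task {task_id}"})
    entries = [
        {"timestamp": ts, "level": level, "message": message}
        for ts, level, message in log_rows[task_id]
    ]
    return 200, json.dumps({"task_id": task_id, "entries": entries})


def format_entries(body):
    """Turn the JSON body into display lines, one per log entry."""
    payload = json.loads(body)
    return [
        f"{e['timestamp']} [{e['level']:<5}] {e['message']}"
        for e in payload["entries"]
    ]
```

Returning structured entries rather than a raw text blob leaves the display decisions (filtering by level, highlighting errors) to the UI.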
Performance Implications
Potentially adds a high-write database table or additional log files. I don't expect any issues at the scale of a local server.
Adds a few extra API calls on the front end when loading the UI
Dependencies
No dependencies
User Impact
No response