Releases: MarquezProject/marquez
Marquez 0.50.0
Added
- Web: New Data Observability dashboard for stats on OpenLineage events (24hrs, past 7.days); views are also available for sources, datasets, and jobs; new job list view has also been introduced displaying the latest N runs (and duration) for a given job #2913 @phixMe
- Web: 404 page #2890 @phixMe
- Web: Display parent job (if present) in job panel #2868 @phixMe
- Web: Allow override of web.port via WEB_PORT environment variable #2838 @bidlako
- Web: Allow nullable columns for schema in dataset panel (use N/A) #2896 @phixMe
- Web: Better feedback when lineage events are loading #2916 @NisargChokshi45
- API: Job object will now return Job.latestRuns (latest N runs) and Job.latestRun (last run to execute) #2901 @phixMe
- API: Use io.openlineage.server.* pkg and class Metadata (utility class for OpenLineage.RunEvent) #2853 @wslulciuc
- API: Use TIMESTAMPTZ for timestamps in database; supports Data Observability dashboard with timezone of user #2924 @wslulciuc
- API: Set current_run_uuid in table jobs optimizing query for JobDao.findAll() #2929 @wslulciuc
- API: New GET /api/v1/jobs #2930 @wslulciuc
- CLI: New cmd args for cli.MetadataCommand #2923 @wslulciuc
  --jobs: limits OL jobs up to N (default: 5)
  --runs-per-job: limits OL run executions per job up to N (default: 10)
  --runs-active: limits OL run executions marked as active (='RUNNING') up to N
  --max-run-fails-per-job: maximum OL run fails per job (default: 2)
  --min-run-duration: minimum OL run duration (in seconds) per execution (default: 300)
  --run-start-time: specifies the OL run start time in UTC ISO ('YYYY-MM-DDTHH:MM:SSZ'); used for the initial OL run, with subsequent runs starting relative to the initial start time. (default: 2024-10-15T01:00:11.080828Z)
  --run-end-time: specifies the OL run end time in UTC ISO ('YYYY-MM-DDTHH:MM:SSZ'); used for the initial OL run, with subsequent runs ending relative to the initial end time. (default: 2024-10-15T01:07:25.080828Z)
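To make the new cli.MetadataCommand arguments easier to picture, here is a minimal Python sketch that mirrors the flags and defaults listed above with argparse. It is an illustration only, not the Marquez implementation (which is Java): the program name, the parse_utc helper, and the check_duration rule are hypothetical, and the two timestamp flags default to None here because the defaults shown in the notes appear to be generated at run time.

```python
import argparse
from datetime import datetime, timezone


def parse_utc(value: str) -> datetime:
    """Parse a UTC ISO timestamp ending in 'Z', with or without fractional seconds."""
    for fmt in ("%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ"):
        try:
            return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise argparse.ArgumentTypeError(f"not a UTC ISO timestamp: {value!r}")


def build_parser() -> argparse.ArgumentParser:
    # Flag names and numeric defaults come from the release notes above.
    parser = argparse.ArgumentParser(prog="metadata-sketch")  # hypothetical name
    parser.add_argument("--jobs", type=int, default=5)
    parser.add_argument("--runs-per-job", type=int, default=10)
    parser.add_argument("--runs-active", type=int, default=None)  # no default stated
    parser.add_argument("--max-run-fails-per-job", type=int, default=2)
    parser.add_argument("--min-run-duration", type=int, default=300)
    parser.add_argument("--run-start-time", type=parse_utc, default=None)
    parser.add_argument("--run-end-time", type=parse_utc, default=None)
    return parser


def check_duration(args: argparse.Namespace) -> bool:
    """Hypothetical rule: the initial run window is at least --min-run-duration seconds."""
    if args.run_start_time is None or args.run_end_time is None:
        return True
    elapsed = (args.run_end_time - args.run_start_time).total_seconds()
    return elapsed >= args.min_run_duration


args = build_parser().parse_args([
    "--jobs", "3",
    "--run-start-time", "2024-10-15T01:00:11Z",
    "--run-end-time", "2024-10-15T01:07:25Z",
])
print(args.jobs, args.runs_per_job, check_duration(args))
```

Whether the real command validates the run window against --min-run-duration is not stated in the notes; the check is included only to show how the flags relate to one another.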
Fixed
- Web: Better rendering of long text #2942 @phixMe
- Web: Display full runID and check icon when copied #2940 #2941 @wslulciuc @phixMe
- Web: Use DatasetVersionAPI to display latest schema and remove extra job facets API call in dataset panel #2938 @phixMe
- Web: Use DatasetAPI for data quality assertions in dataset panel #2937 @phixMe
- Web: Fill-in job node in lineage graph with correct color for JobEvents #2934 @phixMe
- Web: Fill-in job node in lineage graph with correct color for run states RUNNING, COMPLETED, etc #2897 @phixMe
- API: Pagination for DatasetVersion.findAll(); not all dataset versions were returned for GET /api/v1/namespaces/{namespace}/datasets/{dataset}/versions #2944 @inanalper
- API: null namespace and dataset name in view dataset_view for old versions; use table dataset_versions instead in column lineage query #2881 @sophiely
- API: Missing DELETE CASCADE on table job_facets #2878 @mattwparas
- API: Ensure Job.latestRun in Job object is set for runs in a RUNNING state; before Job.latestRun was set only for a run in a done state (COMPLETED/FAILED) #2933 @phixMe
- CLI: Repurpose cmd db-migrate to run all pending database migrations, no longer coupling migrations with HTTP server startup #2936 @davidjgoss
- Chart: Missing common labels for deployment.replicas #2877 @alaturqua
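The Job.latestRun fix above can be illustrated with a small Python sketch. It assumes a simplified Run record (the field names and the integer timestamps are hypothetical) and contrasts the old behavior, where only done runs were considered, with the new one, where a RUNNING run can also be the latest. It is not the actual Marquez query.

```python
from dataclasses import dataclass
from typing import List, Optional

# Done states as named in the release note above.
DONE_STATES = {"COMPLETED", "FAILED"}


@dataclass
class Run:
    id: str
    state: str
    started_at: int  # epoch seconds, kept as an int for simplicity


def latest_run(runs: List[Run], done_only: bool = False) -> Optional[Run]:
    """Return the most recently started run, optionally restricted to done runs."""
    candidates = [r for r in runs if not done_only or r.state in DONE_STATES]
    return max(candidates, key=lambda r: r.started_at, default=None)


runs = [Run("a", "COMPLETED", 100), Run("b", "RUNNING", 200)]
print(latest_run(runs, done_only=True).id)  # old behavior: the RUNNING run is ignored
print(latest_run(runs).id)                  # new behavior: the RUNNING run is returned
```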
Marquez 0.49.0
Added
- API: Job-to-Job lineage #2752 @yanlibert
Intended in part to spur a larger discussion of full parent/child hierarchy handling in Marquez. Changes only the backend API, adding the Job UUID along with the parent name to the Job metadata returned.
Fixed
Marquez 0.48.0
Added
- API: add endpoint method and path to metrics name #2850 @JDarDagran
In the metrics endpoint, there was information gathered containing the SQL Object name and method name. This introduces labels (DAO name, DAO method, endpoint method, endpoint path) and adds more information about endpoints.
- API: add paging to dataset versions panel #2855 @davidsharp7
Adds Datasets paging.
- API: add paging on Jobs panel #2852 @davidsharp7
Adds Job-level paging of Runs.
- API: add Dataset schema versions #2763 @davidjgoss
Adds Dataset schema versions to the model and enables writing to it.
- Docker: make db port configurable via POSTGRES_PORT #2751 @merobi-hub
Adds support for easy db port reassignment.
- Java: allow customization of Apache HTTP in Java client #2822 @davidjgoss
Allows customization of Apache HTTP in Java client.
- Web: add Job tagging to UI #2837 @davidsharp7
Adds Job tagging to the UI.
- Web: source code facets #2833 @phixMe
Adds typedef and rendering of the sourceCode facet for a Job if available.
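The paging entries above can be pictured with a small client-side sketch. Everything here is an assumption made for illustration: the limit and offset parameter names, the fetch_page callable, and the in-memory data standing in for an HTTP call. The release notes do not describe the paging contract.

```python
from typing import Callable, Iterator, List


def iter_pages(fetch_page: Callable[[int, int], List[str]], limit: int = 25) -> Iterator[str]:
    """Yield every item by requesting consecutive pages until a short or empty page."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        if not page:
            return
        yield from page
        if len(page) < limit:  # a short page means there is nothing left
            return
        offset += limit


# Hypothetical backing data in place of a real API response.
VERSIONS = [f"version-{i}" for i in range(7)]


def fake_fetch(limit: int, offset: int) -> List[str]:
    return VERSIONS[offset:offset + limit]


all_versions = list(iter_pages(fake_fetch, limit=3))
print(len(all_versions))
```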
Fixed
- API: Dataset query to get only the latest facet for each version #2859 @sophiely
The facet partition is ranked by Dataset version and facet name so that we can take only the most recent facet for each Dataset UUID and type.
- API: optimize column lineage query performance #2821 @vinhnemo
Adds a filter condition to the CTE dataset_fields_view in ColumnLineageDao.java.
- Web: deduplicate the versions displayed #2854 @namyyys
Excludes the symlinks from the result of the query displaying the version history in order to exclude duplicate versions.
- Web: clean up issues highlighted by some Spark Integration Data #2856 @phixMe
Fixes numerous issues in our interfaces related to some OpenLineage Spark events.
- Web: remove limit from assertion evaluation #2844 @phixMe
Fixes bug where our status indicator was the wrong color.
- Web: bring Dataset tags into line with Job Tags #2841 @davidsharp7
Brings Dataset tags into line with Job tags.
- Web: fix scroll issues for drawer and home pages #2820 @phixMe
Scrolling improvements for drawer and home pages.
- Web: fix search endpoint parameters #2818 @Nisarg-Chokshi
The search API parameters were not getting updated correctly on changing the filter and sort options.
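The "latest facet for each version" fix above describes a rank-within-partition selection. A minimal Python sketch of the same idea follows; it emulates keeping the top-ranked row per (dataset version, facet name) partition ordered by recency. The row keys (dataset_version_uuid, name, created_at) are hypothetical stand-ins, not the actual column names used by the query.

```python
from typing import Dict, List, Tuple


def latest_facets(rows: List[dict]) -> List[dict]:
    """Keep only the most recent row for each (dataset version, facet name) pair."""
    latest: Dict[Tuple[str, str], dict] = {}
    for row in rows:
        key = (row["dataset_version_uuid"], row["name"])
        # Replace the stored row only when this one is strictly newer.
        if key not in latest or row["created_at"] > latest[key]["created_at"]:
            latest[key] = row
    return list(latest.values())


rows = [
    {"dataset_version_uuid": "v1", "name": "schema", "created_at": 1, "facet": "old"},
    {"dataset_version_uuid": "v1", "name": "schema", "created_at": 2, "facet": "new"},
    {"dataset_version_uuid": "v1", "name": "ownership", "created_at": 1, "facet": "team-a"},
]
result = latest_facets(rows)
print(sorted(r["facet"] for r in result))
```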
Removed
Marquez 0.47.0
Added
Data Quality and Job Status Display in Marquez Web
- API: add job tagging to API #2774 @davidsharp7
Adds support for job tagging to the API.
- Chart: add serviceAccount and extraContainers to helm chart values #2766 @kostas-theo
To make the Kubernetes service account configurable, adds these values to the helm chart values with defaults set to maintain current functionality.
- Client/Java: add jobVersion field to Run in Java client #2808 @davidjgoss
Adds jobVersion field to Run in Java client.
- Docker: improve down.sh script #2778 @dolfinus
Adds new -v option and fixes down.sh script to rely on docker-compose down -v and make volume deletion optional.
- Web: tooltips and display updates #2809 @phixMe
Updates tooltips to be more modernized and custom.
- Web: update JSON theme #2807 @phixMe
Makes the JSON theme more in-line with the Marquez brand.
- Web: column lineage linking and sticky tab titles #2805 @phixMe
Adds sticky Titles and moves column lineage links to the table definition.
- Web: refine panel feature set #2798 @phixMe
Adds many refinements in response to user feedback.
- Web: update dataset/dataset field-tagging experience #2761 @davidsharp7
Adds support for adding multiple tags at once, introduces a switch to allow field-level tags to be exposed, and fixes refresh for an improved field-tagging experience.
- Web: web refresh + loading states #2779 @phixMe
Adds a refresh button for jobs, datasets, and lineage events pages. This also will work in empty states.
Removed
- Web: remove old files and dependencies #2801 @phixMe
Drops deps and removes unused React components no longer required by the new lineage graph.
Fixed
- API: adapt column lineage query for symlink dataset #2775 @sophiely
Changes the column lineage query in order to take only the 'main' dataset, not the dataset created via symlink.
- Web: resolve issue where data quality assertion facets are not displayed #2528 @sophiely
Fixes rendering of the DataQualityAssertion facet by adding support for dataset, unknown and input.
- Web: fix showTags refresh #2799 @davidsharp7
Adds showTags to the dependencies of fetchDatasetVersions and disables the show tags toggle until the latest version has been pulled.
- Web: various dataset tags improvements #2813 @davidsharp7
Various tag improvements including a caret for the dropdown.
- Web: use Webpack-bundled icon instead of GitHub-hosted content #2803 @dodo0822
For compliance with a strict CSP, replaces the icon with an SVG bundled by Webpack instead of linking to raw.githubusercontent.com.
Marquez 0.46.0
Changed
- Web: various revisions #2770 @phixMe
Includes clean up of issues in the UI and removal of non-useful elements.
Fixed
- Streaming API: fix behaviour for COMPLETE/FAIL events within streaming jobs #2768 @pawel-big-lebowski
A new job_version is not created for a streaming job terminal event with no dataset information, and the existing version is kept.
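The rule described above can be written down as a small predicate. This Python sketch models only that one rule; the function name, the "STREAMING" job-type label, and the argument shapes are assumptions for illustration and do not reflect the actual Marquez code.

```python
from typing import Sequence

# Terminal event types named in the release note above.
TERMINAL_EVENTS = {"COMPLETE", "FAIL"}


def keep_existing_job_version(
    job_type: str,
    event_type: str,
    inputs: Sequence[str],
    outputs: Sequence[str],
) -> bool:
    """True when a streaming job's terminal event carries no dataset information."""
    is_streaming = job_type == "STREAMING"  # assumed label for streaming jobs
    is_terminal = event_type in TERMINAL_EVENTS
    has_datasets = bool(inputs) or bool(outputs)
    return is_streaming and is_terminal and not has_datasets


print(keep_existing_job_version("STREAMING", "COMPLETE", [], []))
print(keep_existing_job_version("STREAMING", "COMPLETE", ["db.orders"], []))
```

Any other combination returns False, which here means only that this particular rule does not apply; the sketch says nothing about how versions are handled in those cases.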
Marquez 0.45.0
Added
Redesigned Web UI Featuring Column Lineage
- Web: updates to Table and Column Lineage #2725 @phixMe
A new page for column lineage and an updated view for lineage with a common set of shared principles.
- Web: quality of life updates for new lineage graph display #2750 @phixMe
Visual updates from early feedback on lineage graph navigation, including a zoom button to center on the selected node.
- Web: improve visual display of lineage #2753 @phixMe
Visual improvements to nodes including the addition of more detail and the ability to collapse dataset nodes manually.
- Web: add dataset field level tags to UI #2729 @davidsharp7
Updates to the DatasetTags component to allow for field-level tagging/deletion and addition of this to the DatasetInfo component.
- Web: update dataset tags to allow editing/addition of tags #2759 @davidsharp7
Updates to DatasetTags to include a split button menu and a new dialog/reducer for adding new tags.
- Web: minor dataset tags revisions #2754 @phixMe
Minor cleanup of the dataset tags feature including a pointer on the expandable row and a transition on row expansion, plus some new CSS elements.
Fixed
- Web: minor UI enhancements #2727 @phixMe
Hygienic cleanup of project as a follow-up to #2725, including a fix for #2747.
- Web: fix symlink display #2736 @sophiely
Changed behavior to display the symlink dataset in the previously empty namespace and link the symlink dataset lineage to the main dataset.
Marquez 0.45.0-rc.1
Added
- Web: updates for Table and Column Lineage #2725 @phixMe
Creates a new page for column lineage and an updated view for lineage with a common set of shared principles.
- Web: add dataset field level tags to UI #2729 @davidsharp7
Updates the DatasetTags component to allow for field-level tagging/deletion and adds this to the DatasetInfo component.
Fixed
- Web: minor UI enhancements #2727 @phixMe
Hygienic cleanup of project as a follow-up to #2725, including a fix for #2747.
- API: fill data in column lineage input nodes #2742 @JDarDagran @wslulciuc
Fixes the issue of null output nodes in the column lineage endpoint.
Marquez 0.44.0
Added
- Web: add dataset tags tabs for adding/deleting of tags #2714 @davidsharp7
Adds a dataset tags component so that datasets can have tags added/deleted.
- API: Add endpoint to delete field-level tags #2705 @davidsharp7
Adds delete endpoint to remove dataset field tags.
Fixed
- Web: fix dataset tag reducers bug #2716 @davidsharp7
Removes result from dataset tags reducer to fix a sidebar bug.
Marquez 0.43.1
Fixed
- API: fix broken lineage graph for multiple runs of the same job #2710 @pawel-big-lebowski
Problem: the lineage graph was not available for jobs that were run multiple times, as a result of a bug introduced in a recent release. In order to fix the inconsistent data, this UPDATE query should be run. This is not required when upgrading directly to 0.43.0.
Marquez 0.43.0
Added
- API: refactor the RunDao SQL query #2685 @sophiely
Improves the performance of the SQL query used for listing all runs.
- API: refactor dataset version query #2683 @sophiely
Improves the performance of the SQL query used for the dataset version.
- API: add support for a DatasetEvent #2641 #2654 @pawel-big-lebowski
Adds a feature for saving into the Marquez model datasets sent via the DatasetEvent event type. Includes optimization of the lineage query.
- API: add support for a JobEvent #2661 @pawel-big-lebowski
Adds a feature for saving into the Marquez model jobs and datasets sent via the JobEvent event type.
- API: add support for streaming jobs #2682 @pawel-big-lebowski
Creates job version and reference rows at the beginning of the job instead of on complete. Updates the job version within the run if anything changes.
- API/spec: implement upstream run-level lineage #2658 @julienledem
Returns the version of each job and dataset a run is depending on.
- API: add DELETE endpoint for dataset tags #2698 @davidsharp7
Creates a new endpoint for removing the linkage between a dataset and a tag in datasets_tag_mapping to supply a way to delete a tag from a dataset via the API.
- Web: add a dataset drawer #2672 @davidsharp7
Adds a drawer to the dataset column view in the GUI.
Fixed
- Client/Java: change url path encoding to match jersey decoding #2693 @davidjgoss
Swaps out the implementation of MarquezPathV1::encode to use the UrlEscapers path segment escaper, which does proper URI encoding.
- Web: fix pagination in the Jobs route #2655 @merobi-hub
Hides job pagination in the case of no jobs.
- Web: fix empty search experience #2679 @phixMe
Use of the previous search value was resulting in a bad request for the first character of a search.
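The path-encoding fix above hinges on the difference between percent-encoding a path segment and form-style encoding. The Python sketch below shows that difference with the standard library; it is an analogue of the idea, not the Java client code, and the helper name and sample value are hypothetical. Python's quote with safe="" also escapes a few characters that a path segment escaper may leave alone, but both forms decode to the same value.

```python
from urllib.parse import quote, quote_plus, unquote


def encode_path_segment(segment: str) -> str:
    """Percent-encode a single URL path segment, including '/' and spaces."""
    return quote(segment, safe="")


name = "my dataset/v1"  # hypothetical dataset name with a space and a slash

as_path = encode_path_segment(name)  # space becomes %20
as_form = quote_plus(name)           # form-style: space becomes '+'

print(as_path)
print(as_form)

# A path decoder does not turn '+' back into a space, so only the
# percent-encoded form round-trips to the original name.
print(unquote(as_path) == name)
print(unquote(as_form) == name)
```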
Removed
- Client/Java: remove maven-archiver dependency from the Java client #2695 @davidjgoss
Removes a dependency from build.gradle that was bringing some transitive vulnerabilities.