File metrics fixes #11189

Open
wants to merge 2 commits into
base: develop
2 changes: 2 additions & 0 deletions doc/release-notes/Metrics-FixFileAPIs.md
@@ -0,0 +1,2 @@
+ The /api/info/metrics/files/monthly API call had a bug that resulted in files being counted each time they were published in a new version if those publication events occurred in different months. This resulted in an over-count.
+ The /api/info/metrics/files and /api/info/metrics/files/toMonth API calls had a bug that resulted in files not being counted if they had been published but were no longer in the latest published version as of the specified date (now, or the date entered in the /toMonth variant). This resulted in an under-count.
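
The two counting rules described in this release note can be illustrated with a small in-memory sketch. This is not the code in the PR, which does the work in SQL. The class `FileMetricsSketch`, the record `ReleasedFile`, and both method names are hypothetical, and the cumulative rule shown here (count every distinct file that appeared in any released version up to the given month) is an assumption based on the wording of the note.

```java
import java.time.YearMonth;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class FileMetricsSketch {

    /**
     * One row per (file, released dataset version) pair.
     * Hypothetical stand-in for filemetadata joined to datasetversion.
     */
    public record ReleasedFile(long datafileId, YearMonth releaseMonth) {}

    /**
     * Monthly series without the over-count: each file is counted once,
     * in the month of its earliest release, however many later versions
     * also contain it.
     */
    public static Map<YearMonth, Long> newFilesPerMonth(List<ReleasedFile> rows) {
        // Keep only the earliest release month seen for each file.
        Map<Long, YearMonth> firstRelease = new HashMap<>();
        for (ReleasedFile row : rows) {
            firstRelease.merge(row.datafileId(), row.releaseMonth(),
                    (a, b) -> a.isBefore(b) ? a : b);
        }
        // Tally files by that first month, sorted chronologically.
        Map<YearMonth, Long> perMonth = new TreeMap<>();
        for (YearMonth month : firstRelease.values()) {
            perMonth.merge(month, 1L, Long::sum);
        }
        return perMonth;
    }

    /**
     * Cumulative count without the under-count (assumed rule): every
     * distinct file released on or before the given month is counted,
     * even if a later released version dropped it.
     */
    public static long filesToMonth(List<ReleasedFile> rows, YearMonth toMonth) {
        return rows.stream()
                .filter(r -> !r.releaseMonth().isAfter(toMonth))
                .map(ReleasedFile::datafileId)
                .distinct()
                .count();
    }
}
```

For example, suppose file 1 appears in versions released in 2024-01 and 2024-03, file 2 only in the 2024-01 version, and file 3 only in the 2024-03 version. `newFilesPerMonth` then counts two files in 2024-01 and one in 2024-03, and `filesToMonth` for 2024-03 returns 3, including file 2 even though the latest version no longer contains it.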
@@ -288,11 +288,18 @@ public JsonArray filesTimeSeries(Dataverse d) {
  + "from (\n"
  + "select min(to_char(COALESCE(releasetime, createtime), 'YYYY-MM')) as date, filemetadata.id as id\n"
  + "from datasetversion, filemetadata\n"
- + "where datasetversion.id=filemetadata.datasetversion_id\n"
- + "and versionstate='RELEASED' \n"
- + "and dataset_id in (select dataset.id from dataset, dvobject where dataset.id=dvobject.id\n"
+ + "where datasetversion.id = filemetadata.datasetversion_id\n"
+ + "and datasetversion.versionstate = 'RELEASED'\n"
+ + "and dataset_id in (select dataset.id from dataset, dvobject where dataset.id = dvobject.id\n"
  + "and dataset.harvestingclient_id IS NULL and publicationdate is not null\n "
  + ((d == null) ? ")" : "and dvobject.owner_id in (" + getCommaSeparatedIdStringForSubtree(d, "Dataverse") + "))\n ")
+ + "and filemetadata.id = (\n"
+ + " select min(fm.id)\n"
+ + " from filemetadata fm\n"
+ + " join datasetversion dv on dv.id = fm.datasetversion_id\n"
+ + " where fm.datafile_id = filemetadata.datafile_id\n"
+ + " and dv.versionstate = 'RELEASED'\n"
+ + ")\n"
  + "group by filemetadata.id) as subq group by subq.date order by date;");
  logger.log(Level.FINE, "Metric query: {0}", query);
  List<Object[]> results = query.getResultList();
@@ -314,8 +321,9 @@ public long filesToMonth(String yyyymm, Dataverse d) {
  + "select DISTINCT ON (datasetversion.dataset_id) datasetversion.id \n"
  + "from datasetversion\n"
  + "join dataset on dataset.id = datasetversion.dataset_id\n"
+ + "join filemetadata fm on fm.datasetversion_id = datasetversion.id\n"
  + ((d == null) ? "" : "join dvobject on dvobject.id = dataset.id\n")
- + "where versionstate='RELEASED'\n"
+ + "where versionstate='RELEASED' and filemetadata.datafile_id=fm.datafile_id\n"
  + ((d == null) ? "" : "and dvobject.owner_id in (" + getCommaSeparatedIdStringForSubtree(d, "Dataverse") + ")\n")
  + "and date_trunc('month', releasetime) <= to_date('" + yyyymm + "','YYYY-MM')\n"
  + "and dataset.harvestingclient_id is null\n"