Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add totalCount extension #102

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Conversation

lachlan-00
Copy link
Member

@lachlan-00 lachlan-00 commented Oct 9, 2024

As discussed in #16

Adda totalCount to the following

  • Endpoints

    • getalbumlist
    • getalbumlist2
    • getnewestpodcasts
    • getsimilarsongs
    • getsimilarsongs2
    • getsongsbygenre
    • gettopsongs
    • search3
  • Responses

    • albumList
    • albumList2
    • newestPodcasts
    • searchResult3
    • similarSongs
    • similarSongs2
    • songsByGenre
    • topSongs

note Search3 responses prefix totalCount for each object type (totalAlbumCount, totalArtistCount, totalSongCount)

Copy link

netlify bot commented Oct 9, 2024

Deploy Preview for opensubsonic ready!

Name Link
🔨 Latest commit b956a6f
🔍 Latest deploy log https://app.netlify.com/sites/opensubsonic/deploys/6710710f69bedc000838b82a
😎 Deploy Preview https://deploy-preview-102--opensubsonic.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@Tolriq
Copy link
Member

Tolriq commented Oct 9, 2024

This is just a new returned field there's no need for an extension.

But you must add the field description in the table at the bottom to say what it is and that it's a new OS field.

And as you proposed this can also be expanded to artists / songs too no ? (And probably any current API that supports pagination)

@lachlan-00
Copy link
Member Author

okay, that's set without extension.

i'll look at methods returning objects now

@jeffvli
Copy link
Member

jeffvli commented Oct 9, 2024

It would also be immensely helpful if the total count for each filtered entity (artist, album, song) is included in searchResult3 so we can get page counts when searching. The totalCount returned in this instance would be the total count of entries with the search filter applied.

@epoupon
Copy link
Member

epoupon commented Oct 9, 2024

It would also be immensely helpful if the total count for each filtered entity (artist, album, song) is included in searchResult3 so we can get page counts when searching. The totalCount returned in this instance would be the total count of entries with the search filter applied.

If this slows down the query (will make some tests on a 2M songs generated database), maybe we would need to fill in the info only when no filter is applied

@lachlan-00
Copy link
Member Author

It would also be immensely helpful if the total count for each filtered entity (artist, album, song) is included in searchResult3 so we can get page counts when searching. The totalCount returned in this instance would be the total count of entries with the search filter applied.

i looked at that, wasn't sure if i would but for all objects that probably works

@lachlan-00
Copy link
Member Author

This is added in all possible places now i can think of

@jeffvli
Copy link
Member

jeffvli commented Oct 10, 2024

For search3, you need to return a separate count for album, artist, and song since a user can search for multiple entities at once.

Maybe totalAlbumCount, totalArtistCount, and totalSongCount for that response?

@lachlan-00
Copy link
Member Author

For search3, you need to return a separate count for album, artist, and song since a user can search for multiple entities at once.

Maybe totalAlbumCount, totalArtistCount, and totalSongCount for that response?

I've set that now.

jeffvli
jeffvli previously approved these changes Oct 10, 2024
@Tolriq
Copy link
Member

Tolriq commented Oct 10, 2024

One last small detail, on the pages where there's specific OS changes like this one, the top linkTitle should be changed to add " [OS]" in it like https://github.com/opensubsonic/open-subsonic-api/blob/main/content/en/docs/Responses/ArtistID3.md?plain=1#L3

This helps to quickly identify all the endpoints and response that we have enhanced.

@lachlan-00
Copy link
Member Author

I'll change those and squashed again tomorrow when I'm back at my desk

@epoupon
Copy link
Member

epoupon commented Oct 10, 2024

It would also be immensely helpful if the total count for each filtered entity (artist, album, song) is included in searchResult3 so we can get page counts when searching. The totalCount returned in this instance would be the total count of entries with the search filter applied.

If this slows down the query (will make some tests on a 2M songs generated database), maybe we would need to fill in the info only when no filter is applied

Ok, this is definitely too slow.
Getting the first results of a basic search string is like instant, but counting all results is several seconds long.
Today, the query can already be long if all entries have to be scanned (searching for a non existing entry for example), but here we would pay the cost for all requests, and for each sub query if the client repeats the query with updated offset.

@lachlan-00
Copy link
Member Author

lachlan-00 commented Oct 10, 2024

[OS] is tagged on the file links for changes.

I don't get any difference in speed (rounded to the second just comparing log lines) using / not using counts in the Ampache api i'd expect the same for subsonic responses.

Search results of 600k tracks limited to 10 results. you could take the row count for most queries instead of counting results or cache certain calls using the total library counts for general browse calls.

@jeffvli
Copy link
Member

jeffvli commented Oct 10, 2024

What about adding a query parameter to optionally add the totalCount property for all applicable requests?

For example, Jellyfin's API uses a boolean parameter called enableTotalRecordCount to conditionally query for the total count. We could add the parameter toggled as false by default, so in cases where query speed is an issue (such as needing to paginate through the entire library to sync offline-first apps) then there won't be any performance implications.

An example of how I would use this feature would be:

  1. Making a single request to the search3 endpoint with the parameter enabled to get the total count
  2. After I get the count, I can calculate the page count from it and render it to the UI
  3. Every request after that would be the search3 endpoint with the parameter disabled so it doesn't need to re-query it every time

So either a boolean parameter or servers should just be advised to implement caching for counts, with cache invalidation whenever items are added/removed from the database.

Reposting this comment from the correct account

@lachlan-00
Copy link
Member Author

I would just make a preference on the server for enable/disable of extra features like this doesn't really need to change the api.

@Tolriq
Copy link
Member

Tolriq commented Oct 11, 2024

What about adding a query parameter to optionally add the totalCount property for all applicable requests?

It was explained in the original discussion. Those endpoints are to be deprecated to be able to add the new ones that offer proper filtering and sorting and pagination.

Tolriq
Tolriq previously approved these changes Oct 12, 2024
@epoupon
Copy link
Member

epoupon commented Oct 12, 2024

I am not against these additions, but to implement them, we will need a cache server-side to keep the totalCount for each set of parameters for each endpoint. This will always slow down the first request anyway (hopefully likely not noticeable for "small" databases).
Otherwise they are perf killers, and may not be needed for clients that have infinite scrolling views.

I have doubts about the similar songs. This may be costly server side to get similar songs/artists, and the more elements the user asks, the less accurate are the last results, there is no actual "limit". Then I would need to put a hardcoded limit for OS (like 50).
(same for getTopSongs, we usually want the first items, there is no need to know there are more as they are less relevant)

@dweymouth
Copy link
Member

Otherwise they are perf killers, and may not be needed for clients that have infinite scrolling views.

I agree with this. I think this needs a way for clients to request the totalCount to be included or not. Especially for the search3 endpoint - having to compute the total number of matching songs in a millions-of-songs database just to return the first 20 anyway is a huge perf penalty.

@Tolriq
Copy link
Member

Tolriq commented Oct 12, 2024

Then move on creating the new endpoints? :)

Search3 endpoint does not support sorting or any kind of filtering so it's usage is quite limited in all cases. Adding new parameter to that does not make sense in the goal of OS.

@dweymouth
Copy link
Member

I guess I would just suggest adding a new query parameter includeTotalCount=true to these endpoints (defaults to false), and the OpenSubsonic named extension to identify servers that support includeTotalCount. This would allow clients that need it to request it, and clients that don't to not have to suffer the performance impacts.

@Tolriq
Copy link
Member

Tolriq commented Oct 12, 2024

And then we add sorting and a new extension and then multiple filtering and a new one, and the client have to try to navigate between 4 or 6 extensions to build a query?

We already have a well known list of needs for those endpoints for sorting, filtering, field selection and pagination, this is simpler and faster and better for the future than adding this as an extension.

Of course it requires someone to start it from all the needs exposed in the discussions, I guess that it will again be on me to do it.

Everyone survived with the current API, let's try to do things correctly to improve the future and not make OS something unusable in the long term.

TL;DR; adding some non breaking optional fields to allow some improvements clients side is OK, creating extensions should be impactful.

Edit: Wrong click on the close sorry.

@Tolriq Tolriq closed this Oct 12, 2024
@Tolriq Tolriq reopened this Oct 12, 2024
@dweymouth
Copy link
Member

I am wary about adding something that will permanently make queries less performant though. If we add this as-is, I encourage server devs to add a server-side setting to disable this feature

@Tolriq
Copy link
Member

Tolriq commented Oct 12, 2024

It was discussed above, this is an optional field that servers returns or not at their will. This is still something temporary until there's new proper endpoints anyway and yes most servers will probably not return it for search3, this does not mean that they can't if their database or way of querying allows that efficiently or without additional cost.

My next OS work as soon as I have time for it, will be the transcoding as I want to play my hi res music on my Sonos devices :p
But after that I'll try to again do the proposal for the rest.

jeffvli
jeffvli previously approved these changes Oct 14, 2024
@lachlan-00
Copy link
Member Author

I really don't see the slowdowns but that might also the be db/php setup i have. I'll expand my database with fake data to try different ways of counting the data.

null / empty can also be fine right? you don't have to include a value just the field? I also don't really think search is an important one to keep compared to the other areas

@Tolriq
Copy link
Member

Tolriq commented Oct 15, 2024

null is not a valid result in OpenSubsonic, never ever.
And empty is also against the spec since we define it's a number it can't be empty as it could with a string.

So this leaves us with a default value of -1 to be documented.

But is there really cases where the server would for some endpoint sometimes calculate and sometimes not?
I would assume that most server would either calculate or not at all for search3 for example.

There's no problem and it still fit the spec that a server only return the field for the endpoints he support and not for the endpoints it does not support.

@epoupon ?

@lachlan-00
Copy link
Member Author

actually that's a god point. returning -1 can be the same for not counted / not supported then you can include the property and ignore

@epoupon
Copy link
Member

epoupon commented Oct 15, 2024

Well I was thinking about your use case in symfonium: it is trivial to know the number or albums/tracks/artists when syncing using search3 with no query.
But that's not the case when performing actual string searches.

@Tolriq
Copy link
Member

Tolriq commented Oct 15, 2024

Symfonium is a special case due to the lack of some others API and it does not require the totals so should no really be taken in account, with proper getallsong/artists and albums endpoints I will not use search3 anymore. (And probably no one will as the new endpoint should support better search anyway).

@epoupon
Copy link
Member

epoupon commented Oct 15, 2024

Then let's keep it simple.
For lms, I will likely implement totalCount only for a few endpoints of this list.

@dweymouth
Copy link
Member

I want to point out another one where returning -1 may be useful: getAlbumList2 with a year range filter. Though depending on the DB model maybe you can trivially (O(1) constant time) query how many albums are from a given year and sun it up for all the years in the range.

@Tolriq
Copy link
Member

Tolriq commented Oct 16, 2024

Ok so @lachlan-00 can you add -1 as unknow / not calculated as documented value to finish this. This does not change much and servers can still not return anything is they do not support the field for an endpoint.

@lachlan-00
Copy link
Member Author

Added for -1 value

Total item count for the request ignoring sizeandoffsetlimits. Use-1 to denote unsupported, unknown or uncounted values.

@lachlan-00 lachlan-00 requested review from Tolriq and jeffvli October 17, 2024 00:39
@deluan
Copy link
Member

deluan commented Oct 17, 2024

I have doubts about the similar songs. This may be costly server side to get similar songs/artists, and the more elements the user asks, the less accurate are the last results, there is no actual "limit". Then I would need to put a hardcoded limit for OS (like 50).
(same for getTopSongs, we usually want the first items, there is no need to know there are more as they are less relevant)

I agree that for these endpoints (similar songs, top songs) it does not make sense to return totalCount, specially when the server gets the info from external sources that does not provide counts, like Last.fm.

IIUC, in these cases the server can simply return -1, right.

Also, am I the only one who thinks making these counts optional will make online-first clients harder to implement? They will have to implement a fallback for when the info is not available, right?

add for `offset` and `count` parameters that make sense

* Endpoints
  * getalbumlist
  * getalbumlist2
  * getnewestpodcasts
  * getsimilarsongs
  * getsimilarsongs2
  * getsongsbygenre
  * gettopsongs
  * search3

* Responses
  * albumList
  * albumList2
  * newestPodcasts
  * searchResult3
  * similarSongs
  * similarSongs2
  * songsByGenre
  * topSongs

Search3 responses prefix totalCount for each object type

add note for unknown, uncalculated, unsupported values

tag [OS]

Update searchResult3.md

Update songsByGenre.md
@lachlan-00
Copy link
Member Author

I think it's easier to remove from endpoints over different people implementing it different ways so cutting some of them i would rather do than just let it all differ

@lachlan-00
Copy link
Member Author

i think it's better to cut methods that don't make enough sense for everyone to implement

@deluan
Copy link
Member

deluan commented Oct 17, 2024

i think it's better to cut methods that don't make enough sense for everyone to implement

I agree. Let's do a quick vote? Which servers would implement counts for SimilarSongs/TopSongs?

Vote with a 👍🏼 (yes) or 👎🏼 (no)

@lachlan-00
Copy link
Member Author

i've made an alternative without them. that's an easier way to approve/disprove

#107

I left search in as it's probably the best reason to include -1 as an option.

@epoupon
Copy link
Member

epoupon commented Oct 17, 2024

Also, am I the only one who thinks making these counts optional will make online-first clients harder to implement? They will have to implement a fallback for when the info is not available, right?

I have the same feeling, not sure how complicated this will be for clients to handle fallbacks for all these endpoints.

@Tolriq
Copy link
Member

Tolriq commented Oct 17, 2024

You do know that clients already do ? And that as optional field not all servers will implement that. (Including the non OS ones)

So in all cases for every endpoint the clients will have to handle both the value present or not.

The -1 vs not returning the field is more to respect the OS way of optional fields and leave more liberty to servers.

So the number of endpoints in API does not impact server choices, and removing some might just force a new PR if the need arise later.

@epoupon
Copy link
Member

epoupon commented Oct 17, 2024

Just saying that after this PR they will need to handle more combinations, and not sure how complicated it is for clients.
But what is the point of adding totalCount in endpoints that no server will implement?

@Tolriq
Copy link
Member

Tolriq commented Oct 17, 2024

I don't see any difference in difficulty between endpoint a and b.

Honestly I don't really care, but OS is about all servers and clients including future ones.

Since the servers have no obligations and clients usually do not implement anything before the server they prefer support it, not being consistent for all endpoints does not bring any benefits just lower the consistency.

So the question is what does it bring as a positive gain to not be consistent?

Remember that OS main purposes includes being consistent.

@deluan
Copy link
Member

deluan commented Oct 17, 2024

Remember that OS main purposes includes being consistent.

I think adding more and more optional fields is the opposite of consistency. IMHO, it will make servers diverge even more, and clients having to chose specific servers to support.

One things is having Composer or SampleRate as optional. The client can simple not display it. But counts and paging parameters are crucial for client implementations and it is not just a matter of displaying it or not: it drives completely different UX.

With all respect @Tolriq, I want to hear from other client devs on this subject, as you do not and probably will never use pagination.

Some examples of the lack totalCount/paging causing issues for clients:

@Tolriq
Copy link
Member

Tolriq commented Oct 17, 2024

I'm well aware of the need of the totals and it would improve sync too ;)

I'll ask again the same question what does it bring to not be consistent in all endPoints that support pagination?

This is OS you can't force servers to add support, so I still do not understand the divergence in the end points.

The direct example of what I said just above, you decided on a few endpoints and ignored some. And someone came and said why not search2? And that's exactly my point here.

Why not some endpoints that some servers and clients may want to support by on purpose excluding them from the spec for arbitrary decisions?

So can someone give me a reason why some endpoint should be excluded and what does it bring as a positive gain to exclude those end points ?

So let's take your sentence:

I think adding more and more optional fields is the opposite of consistency. IMHO, it will make servers diverge even more, and clients having to chose specific servers to support.

We can define a consistent API that improve paging for all endpoints that support paging. Or we can arbitrary only choose some of them.

Why and how being consistent with all the paging endpoint is the opposite of consistency ?

I'm not debating about clients or servers here, I'm debating the purpose of OS and how we decide what is consistent and what is not.

So again my logical OS POV: if totalCount is needed for some paging endpoints that support size and offset then it make sense and should be present in all end points that support those, I have still not yet see any argument against this.

Edit: And BTW
@memen45's SubMusic has to support Ampache API, as Garmin smartwatch has limitations on size of playlists (memen45/SubMusic#97)
is a second endpoint not covered and so another argument for consistency in all endpoints ...

@deluan
Copy link
Member

deluan commented Oct 17, 2024

The direct example of what I said just above, you decided on a few endpoints and ignored some. And someone came and said why not search2? And that's exactly my point here.
Why not some endpoints that some servers and clients may want to support by on purpose excluding them from the spec for arbitrary decisions?
So can someone give me a reason why some endpoint should be excluded and what does it bring as a positive gain to exclude those end points ?

The ones I asked to be removed from the PR (getSimilarSongs and getTopSongs) are NOT paginated endpoints (they don't accept offsets), so there's no point in returning counts for them. And by definition, their result degrades as more elements are being returned (as pointed out by @epoupon)

EDIT: The Garmin situation is just an example of missing pagination, and won't be solved by simply adding counts. We would need to add offsets and sizes to the Playlists endpoints. I think this is subject for another PR/Discussion

@Tolriq
Copy link
Member

Tolriq commented Oct 17, 2024

PR (getSimilarSongs and getTopSongs) are NOT paginated endpoints

Then there's no issue, all I keep saying from the start is consistency for all pagination endpoints, unfortunately half the discussion is on the discussion part and the rest here and people do not read all.

There's different mixes between the -1, the endpoints and actual pagination, so if all endpoints that support pagination are covered then everything is OK. (After a quick pass it seems at least search/search2 is missing and since getnewestpodcasts is not paginated it then should not return the total)

And by definition, their result degrades as more elements are being returned (as #102 (comment) by @epoupon)

And that is exactly why I keep talking about a proper pagination API with continuation tokens and the like.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants