-
Notifications
You must be signed in to change notification settings - Fork 158
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Inefficient Pagination Handling in Expanded Entities ($expand ) in Microsoft.AspNetCore.OData #1274
Comments
@andykelvinbrooks I tried this out and this is the IQueryable that OData generates after applying the $top and $skip: DbSet<Order>()
.OrderBy($it => $it.OrderId)
.Select($it => new SelectAllAndExpand<Order>{
Model = TypedLinqParameterContainer<IEdmModel>.TypedProperty,
Instance = $it,
UseInstanceForProperties = True,
Container = new NamedProperty<IEnumerable<SelectAll<OrderItem>>>{
Name = "OrderItems",
Value = $it.OrderItems
.Select($it => new SelectAll<OrderItem>{
Model = TypedLinqParameterContainer<IEdmModel>.TypedProperty,
Instance = $it,
UseInstanceForProperties = True
}
)
}
}
)
.Skip(TypedLinqParameterContainer<int>.TypedProperty)
.Take(TypedLinqParameterContainer<int>.TypedProperty)
And this is the query that EfCore generates: DECLARE @__TypedProperty_2 int = 1;
DECLARE @__TypedProperty_3 int = 2;
SELECT [t].[OrderId], [t].[CustomerName], [o0].[OrderItemId], [o0].[OrderId], [o0].[ProductName], [o0].[Quantity]
FROM (
SELECT [o].[OrderId], [o].[CustomerName]
FROM [Orders] AS [o]
ORDER BY [o].[OrderId]
OFFSET @__TypedProperty_2 ROWS FETCH NEXT @__TypedProperty_3 ROWS ONLY
) AS [t]
LEFT JOIN [OrderItems] AS [o0] ON [t].[OrderId] = [o0].[OrderId]
ORDER BY [t].[OrderId]
This shows that pagination is applied at the database level. I'm not sure whether the reason you're getting this is because you're using an in-memory database. Also You could try server side paging using |
Attached sample project which contains local DB with limited sample records. When we get the data from http://localhost:5209/invoices?$expand=invoicelines Hardcoded pagesize as 50 in example Generated query is :- Microsoft.EntityFrameworkCore.Database.Command: Information: Executed DbCommand (1ms) [Parameters=[@__TypedProperty_3='51', @__TypedProperty_1='51'], CommandType='Text', CommandTimeout='600'] Generated query which I marked as bold(Inner Query) is fetching all the records from child(InvoiceLines) . We have a huge performance in real time as there are millions of records InvoiceLines in real time. @ElizabethOkerio Kindly let me know your thoughts |
@Dhananjaya-Reddy Thanks for the repro. I'll take a look and get back to you. |
Hi @ElizabethOkerio , Any updates on this? |
@Dhananjaya-Reddy I looked at your repro and the shared query above. From the shared query above, it's not true that the inner query fetches all records from child (InvoiceLines). If you look at the query, this part The only issue I'm seeing is with the calculation of row numbers. This part to be exact: ROW_NUMBER() OVER (
PARTITION BY [v0].[InvoiceId] ORDER BY [v0].[InvoiceLineId]
) AS [row]
FROM [VIEW_InvoiceLineItem_2_MS] AS [v0]
) AS [t1]
If this is being done for millions of rows even if not fetching all of them can be computationally intensive and might affect performance. I think proper indexing on the columns used in the |
Hi @ElizabethOkerio , Thank you for your thorough analysis of the query and the repro I shared. Your insights are valuable, and I appreciate the time you've taken to examine the issue in detail. I'd like to address a few points and share some additional thoughts:
Your observations about the query limitation and PageSize behavior are spot-on. This helps clarify how the pagination is intended to work.
This is the core issue I'm worried about. As you pointed out, even though we're not fetching all records, the ROW_NUMBER() function is still processing all rows in the view before any filtering occurs. For a view with millions of rows, this could be a significant performance bottleneck.
The most critical optimization we need to implement is applying filters before the ROW_NUMBER() calculation. This would significantly reduce the number of rows processed, especially for large datasets. Is there a way to restructure the query or modify the OData implementation that don't require major changes to the overall OData implementation to achieve this? Thank you again for your insightful analysis, and I look forward to your thoughts on this critical optimization. |
@ElizabethOkerio any update on fixing this issue? |
Summary: When using the $expand query option in Microsoft.AspNetCore.OData, all data for the expanded entities is retrieved first, and then pagination is applied in memory. This leads to performance inefficiencies, especially with large datasets.
Steps to Reproduce:
Expected Behavior: Pagination should be applied at the database level when retrieving expanded entities to minimize data retrieval and processing overhead.
Actual Behavior: All data for the expanded entities is retrieved first, and then pagination is applied in memory, leading to potential performance issues and high memory usage.
Additional Context:
This issue becomes more significant with larger datasets where retrieving all data for expanded entities before applying pagination results in slow response times and high memory consumption. Optimizing this behavior to apply pagination at the database level would greatly enhance performance and efficiency.
Suggested Improvement:
Investigate and implement a way to apply pagination ($top and $skip) to expanded entities directly in the database query, ensuring that only the required subset of data is retrieved.
The text was updated successfully, but these errors were encountered: