Partition / RowKey Schema + Performance Efficiency #16

Open
aarondcoleman opened this issue Jun 16, 2015 · 3 comments

Comments

@aarondcoleman

The current version stores all errors in a single partition, so performance deteriorates as the row count grows. A better approach would be to keep each partition down to just a few hundred rows.

From the guidelines on designing a scalable table solution: https://msdn.microsoft.com/en-us/library/azure/hh508997.aspx

"A highly uneven distribution of entities across partitions may limit the performance of the larger and more active partitions"

A better solution might be either of the following:

  • Partition by day or by hour, using a numeric representation such as 20150616 that also works in range queries.
  • Use a fixed partition size with a counter (000001, 000002, and so on), plus an additional table holding pointer info that records which dates fall into which partition buckets.
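To make the two options concrete, here is a minimal Python sketch of how such keys could be generated. The function names are hypothetical and not part of this project, and the filter string assumes the OData-style comparison operators (`ge`, `le`) that Azure Table storage accepts in query filters. Timestamps are assumed to be timezone-aware.

```python
from datetime import datetime, timedelta, timezone


def day_partition_key(ts: datetime) -> str:
    """Return a fixed-width yyyyMMdd key (e.g. 20150616) for an aware timestamp."""
    # Normalise to UTC so every writer agrees on which day an error belongs to.
    return ts.astimezone(timezone.utc).strftime("%Y%m%d")


def partition_range_filter(start: datetime, end: datetime) -> str:
    """Build a filter covering every day partition from start to end, inclusive."""
    lo, hi = day_partition_key(start), day_partition_key(end)
    # Fixed-width digits compare the same way as strings and as dates.
    return f"PartitionKey ge '{lo}' and PartitionKey le '{hi}'"


def bucket_partition_key(counter: int, width: int = 6) -> str:
    """Option 2: zero-padded bucket counter, e.g. 1 -> 000001."""
    return str(counter).zfill(width)
```

Because the current scheme already uses the application name as the partition key, a real change would probably need to combine the two (for example, application name plus day); that combination is left out here to keep the sketch short.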

But those are just two ideas. We're using this in production, and now that our table has grown, performance has become dramatically slower (lookups of up to 30 seconds!).

Thoughts?

@MisinformedDNA
Owner

True. Can you create a PR and set the partition key to something like option 1?

@aarondcoleman
Author

I think the only problem with that is the paging functionality. There would have to be some internal logic that assembles a page and keeps track of a cursor window, depending on how large the partitions are and how many of them need to be pulled. Ideas?
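One possible shape for that paging logic, sketched in Python under some assumptions: partitions are keyed by fixed-width day strings, pages are read newest day first, and the cursor is a (partition key, offset) pair. The in-memory dict stands in for per-partition queries against the table, and `next_page` is a hypothetical helper, not something that exists in this project.

```python
from typing import Dict, List, Optional, Tuple

# (partition key, offset of the next unread row inside that partition)
Cursor = Tuple[str, int]


def next_page(
    partitions: Dict[str, List[str]],
    page_size: int,
    cursor: Optional[Cursor] = None,
) -> Tuple[List[str], Optional[Cursor]]:
    """Assemble one page by walking day partitions from newest to oldest."""
    keys = sorted(partitions, reverse=True)  # fixed-width keys sort by date
    if cursor is None:
        if not keys:
            return [], None
        cursor = (keys[0], 0)
    start_key, offset = cursor
    page: List[str] = []
    for key in keys:
        if key > start_key:
            continue  # newer partition, fully consumed by earlier pages
        rows = partitions[key]
        begin = offset if key == start_key else 0
        take = rows[begin : begin + (page_size - len(page))]
        page.extend(take)
        if len(page) == page_size:
            # Page is full: remember where to resume inside this partition.
            return page, (key, begin + len(take))
    return page, None  # ran out of partitions, so there is no further page
```

A caller would pass the returned cursor back in to fetch the following page. If a page fills up exactly at the end of the oldest partition, the next call returns an empty page with a `None` cursor, so callers should treat `None` rather than a short page as the end marker.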

@MisinformedDNA
Owner

Hmmm... the partition key is already being set to the application name. I think your best bet would be to either delete or migrate older errors elsewhere.

I'm surprised that it is taking that long, though, since Azure Tables has a default sort order that we rely on (PartitionKey ASC, RowKey ASC).
