-
Notifications
You must be signed in to change notification settings - Fork 120
Description
continuation of #959 specifically in regards to support for partitioning feature (MySQL | Postgres)
Are composite partition keys needed? Should partitionBy('RANGE', ['year', 'month']) be supported?
Recommendation: Not needed for v5
Our implementation plan uses PARTITION BY RANGE COLUMNS(created) on a single DATETIME column. MySQL 8's RANGE COLUMNS partitioning handles the date directly without needing to decompose into year/month components. The partition boundaries are defined as date strings (e.g., '2024-05-01'), not computed expressions on multiple columns.
If there's demand for composite keys from other users (e.g., partitioning by (region, date)), it could be a v6 feature, but it's not a blocker for our use case.
Subpartitioning (MySQL) - Worth including in v5?
Recommendation: No, defer to v6 or later
Our plan doesn't use subpartitioning. We're using a hybrid approach (yearly partitions for old data, monthly for recent data) but this is handled via separate partition definitions, not MySQL's native SUBPARTITION BY syntax.
Subpartitioning adds significant complexity (composite partition keys [see above], additional maintenance logic) and is an advanced use case. Most MySQL partitioning use cases work fine with a single level of partitioning. I'd suggest we keep v5 focused on the core partitioning types and adding subpartitioning only if there's community demand.
PostgreSQL partition naming - Auto-generate {table}_{partition} or require explicit names?
Recommendation: Support both, with auto-generation as default
We already have a plan for our partition names, but if it makes sense we could:
- Default behavior: Auto-generate names using a predictable convention like
{table}_{partition_key}(e.g.,posts_2024,posts_202405) - Override option: Allow explicit naming when needed
This mirrors what we're doing manually in MySQL - we explicitly name partitions like p2017, p202404, etc. for clarity. Having a predictable naming convention makes maintenance easier and partitions more intuitive.
Expressions Support for e.g. PARTITION BY RANGE (YEAR(article_date))?
Recommendation: Yes, this would be valuable.
While our implementation uses RANGE COLUMNS(created) (which operates directly on the DATETIME column without expressions), expression-based partitioning is a common pattern that will be needed e.g. in behaviors and plugins.
However, there's an important distinction:
PARTITION BY RANGE COLUMNS(col)- No expressions allowed, but works directly with date/string
comparisonsPARTITION BY RANGE (expr)- Supports expressions like YEAR(date), TO_DAYS(date), etc.
Which of these, or both, make sense for Migrations? For v5, I'd suggest supporting at minimum:
RANGEwith a simple expression (column reference or function call)RANGE COLUMNSwith one or more columns (no expressions)
API example (very sparse):
// Expression-based RANGE
->partitionBy('RANGE', 'YEAR(order_date)')
// Column-based RANGE COLUMNS (what we use)
->partitionByColumns('RANGE', ['created'])