Skip to content

Partitioning support - Questions for refinement #977

@jamisonbryant

Description

@jamisonbryant

continuation of #959 specifically in regards to support for partitioning feature (MySQL | Postgres)

Are composite partition keys needed? Should partitionBy('RANGE', ['year', 'month']) be supported?

Recommendation: Not needed for v5

Our implementation plan uses PARTITION BY RANGE COLUMNS(created) on a single DATETIME column. MySQL 8's RANGE COLUMNS partitioning handles the date directly without needing to decompose into year/month components. The partition boundaries are defined as date strings (e.g., '2024-05-01'), not computed expressions on multiple columns.

If there's demand for composite keys from other users (e.g., partitioning by (region, date)), it could be a v6 feature, but it's not a blocker for our use case.

Subpartitioning (MySQL) - Worth including in v5?

Recommendation: No, defer to v6 or later

Our plan doesn't use subpartitioning. We're using a hybrid approach (yearly partitions for old data, monthly for recent data) but this is handled via separate partition definitions, not MySQL's native SUBPARTITION BY syntax.

Subpartitioning adds significant complexity (composite partition keys [see above], additional maintenance logic) and is an advanced use case. Most MySQL partitioning use cases work fine with a single level of partitioning. I'd suggest we keep v5 focused on the core partitioning types and adding subpartitioning only if there's community demand.

PostgreSQL partition naming - Auto-generate {table}_{partition} or require explicit names?

Recommendation: Support both, with auto-generation as default

We already have a plan for our partition names, but if it makes sense we could:

  • Default behavior: Auto-generate names using a predictable convention like {table}_{partition_key} (e.g., posts_2024, posts_202405)
  • Override option: Allow explicit naming when needed

This mirrors what we're doing manually in MySQL - we explicitly name partitions like p2017, p202404, etc. for clarity. Having a predictable naming convention makes maintenance easier and partitions more intuitive.

Expressions Support for e.g. PARTITION BY RANGE (YEAR(article_date))?

Recommendation: Yes, this would be valuable.

While our implementation uses RANGE COLUMNS(created) (which operates directly on the DATETIME column without expressions), expression-based partitioning is a common pattern that will be needed e.g. in behaviors and plugins.

However, there's an important distinction:

  • PARTITION BY RANGE COLUMNS(col) - No expressions allowed, but works directly with date/string
    comparisons
  • PARTITION BY RANGE (expr) - Supports expressions like YEAR(date), TO_DAYS(date), etc.

Which of these, or both, make sense for Migrations? For v5, I'd suggest supporting at minimum:

  • RANGE with a simple expression (column reference or function call)
  • RANGE COLUMNS with one or more columns (no expressions)

API example (very sparse):

// Expression-based RANGE
->partitionBy('RANGE', 'YEAR(order_date)')

// Column-based RANGE COLUMNS (what we use)
->partitionByColumns('RANGE', ['created'])

Metadata

Metadata

Type

Projects

No projects

Relationships

None yet

Development

No branches or pull requests

Issue actions