Database schema and data migrations #25
DbUp is a great alternative solution to the EF migrations problem (schema, data, schema), since the migration scripts are executed in order and are only applied once (tracked in the `schemaversions` table). Integrating a migrations run at startup or in the build pipeline is as simple as:

```csharp
// Runs a single embedded migration script with DbUp. The scriptName and
// executionTimeoutMinutes variables are assumed to be defined by the caller;
// the connection string name is encoded in the script's file name.
using System;
using System.Configuration;
using System.Reflection;
using DbUp;

var internalScriptName = scriptName;
var db = scriptName.Split('-')[3];
var connectionString = ConfigurationManager.ConnectionStrings[db];

var upgrader = DeployChanges.To
    .PostgresqlDatabase(connectionString.ConnectionString)
    .WithExecutionTimeout(new TimeSpan(0, executionTimeoutMinutes, 0))
    .WithScriptsEmbeddedInAssembly(Assembly.GetExecutingAssembly(), s => s == internalScriptName)
    .LogToConsole()
    .Build();

var result = upgrader.PerformUpgrade();
```

Would this be a viable solution alternative?
Thanks for the feedback/suggestion. I've used DbUp extensively over the past 4 years. It is certainly an alternative, but it can be confusing to understand how your DB schema looks when you have made hundreds of migrations. It also is not trivial to work with when you do Disaster Recovery, as you need full control over which version your schema was in, and your backups might be different (unless you do a full database restore when doing Disaster Recovery, where the migration history table is in sync).

Like EF migrations, it requires a deep understanding by everyone using these migrations, but most people think they are trivial because they only test the happy path. I've seen people go back and change old scripts because they were too slow to run on staging, after which a slightly different version was deployed to production, essentially making the active schema ever so slightly different between staging and production (in terms of indexes). A bomb waiting to explode.

Personally, I like DacPac as it's declarative, and you have tooling to modify your database schema. Basically, you have a declarative definition of the desired schema, and the tooling calculates the change script for you.

I'm starting to lean towards doing database migrations as part of CI/CD. It will make it clear to everyone that all database migrations must be at least one version backward compatible (so you can roll back, and so rolling deployments do not cause problems). Also, database migrations are much more likely to cause production problems, in the same way as infrastructure changes (Bicep, Terraform). So, they need more governance than normal code deployments.

We are still trying to make up our minds about this. Until we find something noticeably better, we'll stick with EF migrations.
I would like to contribute on this topic, as I have done a lot of research on SQL schemas. I would like to discuss and get feedback, in particular on modular migrations that are reliable for different types of B2B/B2C SaaS systems.
Hi Casper,

Cool. Would you have time for a short introduction (online)? As written above, I have some ideas on how to build this with DacPac in CI/CD workflows, but it's not completely set in stone.

We are currently using EF Core migrations, which are fine for now, but they are run as part of startup, which makes them slow, fragile, and susceptible to race conditions. If a migration fails, there is an incident. I've used DacPac before, but not in CI/CD, so we will have to do research to ensure it is reliable both in continuous deployment scenarios and in disaster recovery (where you might restore a database that is several versions old).

A core part of our principles is that "application code" is deployed with continuous deployment (push on green), as opposed to continuous delivery (manual approval). See https://www.atlassian.com/continuous-delivery/principles/continuous-integration-vs-delivery-vs-deployment. Database code is infrastructure and is deployed with continuous delivery (manual approval). So, I fear the scenario where a developer makes a pull request with both database changes and code changes. In our setup, the application code will always hit production before the database changes. Somehow, we need to protect ourselves against this.

If you have experience with or ideas on how to set this up in a robust, clean, and developer-friendly way, I would love to talk. To be honest, I have not seen anyone solve this problem in a good way :)
Hey Thomas, I think I can find some time for an online meeting about this topic, as I think it is quite important, and I would like to understand your points of view on the matter a bit better. If you could connect with me, for example on LinkedIn via my GitHub profile, I think that would be the easiest way to schedule a meeting 😀.
### Summary & Motivation

Move the endpoints from the Core project to the Calendar-Assistant API project because `IEndpoints` are only picked up by Dependency Injection when they reside in the API project. This adjustment ensures that the Dependency Injection system properly registers and manages these endpoints.

### Atomic Changes

- Move endpoints to Calendar-Assistant API project

### Checklist

- [x] I have added a Label to the pull-request
- [x] I have added tests, and done manual regression tests
- [x] I have updated the documentation, if necessary

---------

Co-authored-by: Marco van Kimmenade <[email protected]>
@tjementum I want to comment on this from my perspective. I am using this template for my project, and personally, I am thinking of changing the database to PostgreSQL, instead of MsSql or Azure SQL, before going live. I think that adding DACPAC might limit this template, since it's a starter template, and there would be many people who would want to use Postgres instead of MsSql due to licensing costs etc. Currently, with EF Core and Azure, that's easy enough to do. However, with DACPAC, what would be the alternative if you didn't want to use SQL Server? DbUp works with both; however, I am not sure how it will work with the code-first approach. I think you still need to run EF Core migrations and keep those files even when using DbUp. But I know that you said you don't like EF Core migrations, so maybe you won't like that approach.
Hi Ali

Since PlatformPlatform is open source, you can stick to Entity Framework migrations, but when pulling changes from PlatformPlatform, you will run into problems no matter what if you change to Postgres. I'm very clear that PlatformPlatform is an opinionated platform, which provides the benefit of making choices that do not need to follow the lowest common denominator. I also receive questions about supporting AWS, Kubernetes, Azure Functions, etc., but this flexibility comes with constraints and tradeoffs.

Personally, a database is just for tables and columns, so I don't see the benefits of using Postgres, MongoDB, etc. over SQL Server, as the database should have no logic. On Azure, SQL Server is, in my opinion, the better database choice in many ways: stability, security, integration with other Azure services (like Microsoft Fabric, Purview, Power BI, etc.), documentation, and general support are all advantages. I'm not saying Postgres is a bad database, but like .NET, Microsoft SQL is a first-class citizen in Azure. Not to mention that you do not risk the Redis/FluentAssertions problem, where open-source projects stop allowing cloud providers to monetize their platform.

As for Entity Framework migrations, I really am not a fan, and they always get in the way (merge conflicts, static code analysis warnings, search and replace when refactoring, problems when changing between branches, bad workflow). I know this is what others use, and similar approaches exist in other ecosystems, like Prisma in TypeScript. I think it's because most engineers have not seen what a true declarative solution to a database schema looks like.

I'm not 100% settled on DACPAC, as there might be unknown problems. I've never used DACPAC in CI/CD, but if it plays out like I hope, this will be a much better choice.
Problem
In .NET Entity Framework, there can be several database schema migrations. However, between each database migration, there might be the need to do a data migration by code. Example:

1. Schema migration: add new nullable columns.
2. Data migration: populate the new columns by code.
3. Schema migration: make the columns required (remove the null check).
This is just a simple example. The solution should handle many complex migrations that must run in a certain order if a system has not been updated for some time (e.g., Schema, Data, Schema, Schema, Data, Schema, Data).
How can I create a good migration strategy that runs as part of the application startup?
Solution 1
Use SQL .dbproj and DacPac: Power your DevOps Development with SQL Server Containers
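To make Solution 1 concrete, here is a minimal sketch of a declarative DacPac deployment using the DacFx API (the `Microsoft.SqlServer.DacFx` NuGet package). The file path, database name, and connection string are illustrative assumptions, not part of this repository:

```csharp
// Deploys a prebuilt .dacpac declaratively: DacFx diffs the declared schema
// against the live database and generates/applies the delta script.
using Microsoft.SqlServer.Dac;

var dacServices = new DacServices(
    "Server=localhost;Database=master;Integrated Security=true;TrustServerCertificate=true");

using var package = DacPackage.Load("MyDatabase.dacpac"); // hypothetical path

var options = new DacDeployOptions
{
    BlockOnPossibleDataLoss = true // fail the deployment instead of dropping data
};

dacServices.Deploy(package, "MyDatabase", upgradeExisting: true, options);
```

The same deployment can also be run from a pipeline with the `SqlPackage` CLI (`/Action:Publish`), which is how it would typically be wired into CI/CD.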
Solution 2
Managing database migrations and code transformations in a sequential and reliable manner is indeed an important aspect of maintaining a robust application, especially in a complex domain-driven design project. Your problem is about orchestrating these operations in a way that ensures database integrity and data consistency.
For .NET and Entity Framework, you have a tool called EF Migrations which can handle database schema changes. However, EF Migrations doesn't inherently support code migrations that transform data between schema migrations.
Here is an approach you might consider:
1. Name your migrations with a sequential prefix (e.g., `001_AddColumns`, `002_PopulateColumns`, `003_RemoveNullCheck`). This will ensure that migrations are executed in the correct order.
2. Run the EF Migrations up to the point where a data migration is needed.
3. Run the corresponding data migration scripts.
4. Continue with the rest of the EF Migrations.
This could be achieved by using the `DbContext.Database.Migrate()` method to apply the EF Migrations and a custom method to execute the data migrations. The data migration scripts could be SQL scripts or C# scripts, depending on the complexity of the data transformations.
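As a hedged sketch of how this could look in code (assuming a hypothetical `AppDbContext`; the migration names and the `PopulateColumns` helper are illustrative, not part of EF itself):

```csharp
// Applies pending EF Core migrations one at a time, running a registered
// data migration immediately after the schema migration it depends on.
using System;
using System.Collections.Generic;
using Microsoft.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore.Infrastructure;
using Microsoft.EntityFrameworkCore.Migrations;

public static class MigrationRunner
{
    // Data migrations keyed by the schema migration they must follow.
    private static readonly Dictionary<string, Action<AppDbContext>> DataMigrations = new()
    {
        ["002_PopulateColumns"] = PopulateColumns // hypothetical data step
    };

    public static void MigrateWithDataSteps(AppDbContext context)
    {
        var migrator = context.Database.GetService<IMigrator>();

        // GetPendingMigrations returns migrations in order, so applying them
        // one by one lets a data step run between two schema steps.
        foreach (var migration in context.Database.GetPendingMigrations())
        {
            migrator.Migrate(migration);

            if (DataMigrations.TryGetValue(migration, out var dataMigration))
            {
                dataMigration(context);
            }
        }
    }

    private static void PopulateColumns(AppDbContext context)
    {
        // Example data transformation between two schema migrations.
        context.Database.ExecuteSqlRaw(
            "UPDATE Users SET FullName = FirstName + ' ' + LastName WHERE FullName IS NULL");
    }
}
```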
This solution allows you to manage both schema migrations and data transformations in a sequential manner. It also makes sure the order of operations is preserved when you deploy to different environments.
Remember that handling migrations this way could potentially cause longer downtimes during deployment, especially for large data migrations. So, you need to plan your deployments carefully.
Solution 3
A more naïve approach would be to run migrations as part of the application startup. If not done carefully, this might cause race conditions when several instances start at once. One way to avoid race conditions is to use "Init Containers".
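An in-process alternative (or complement) to init containers is to serialize startup migrations with a database application lock. A minimal sketch, assuming SQL Server's `sp_getapplock` and a hypothetical `AppDbContext`:

```csharp
// Guards startup migrations against race conditions between concurrently
// starting instances by taking an exclusive application lock first.
using System;
using Microsoft.EntityFrameworkCore;

public static class StartupMigrator
{
    public static void MigrateExclusively(AppDbContext context)
    {
        // EF Core reuses the context's connection, so the lock and the
        // migrations run in the same session.
        var connection = context.Database.GetDbConnection();
        connection.Open();

        using var acquire = connection.CreateCommand();
        // Only one instance gets the lock; the others block here (up to
        // 60 seconds) until the winner has finished migrating.
        // sp_getapplock returns a negative value if the lock was not granted.
        acquire.CommandText =
            "DECLARE @result int; " +
            "EXEC @result = sp_getapplock @Resource = 'database-migrations', " +
            "@LockMode = 'Exclusive', @LockOwner = 'Session', @LockTimeout = 60000; " +
            "SELECT @result;";
        if ((int)acquire.ExecuteScalar()! < 0)
        {
            throw new InvalidOperationException("Could not acquire the migration lock.");
        }

        try
        {
            context.Database.Migrate();
        }
        finally
        {
            using var release = connection.CreateCommand();
            release.CommandText =
                "EXEC sp_releaseapplock @Resource = 'database-migrations', @LockOwner = 'Session';";
            release.ExecuteNonQuery();
            connection.Close();
        }
    }
}
```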