This repository has been archived by the owner on Aug 8, 2024. It is now read-only.
Dave Abrahams edited this page May 23, 2013 · 4 revisions

Welcome to Boost's transition from Subversion to Modularized Git!

Current Status

  • Modularized results for branches/release and trunk, with full history, are being pushed here and here.
  • The rules for modularization are here.
  • The language that describes these rules is documented here.
  • Changes not otherwise accounted for by our rules end up in a fallback repository.

Vetting Period

When the fallback repository is empty, the rulesets that make up our modularization rules are “complete,” in that they intentionally cover every change. At that point we will have a vetting period during which Boosters can inspect the history of their projects and make sure they are happy with the results, submitting edits to the rulesets if they wish to make changes. Instructions for running modularization on your own machine (e.g. to test ruleset changes) are available upon request.

Missing Ancestry

Some Subversion merges are represented by diffs, but not by commit ancestry, in the modularized repositories. Representing all ancestry in Git is tricky because:

  • Subversion didn't always have a way to represent ancestry, so
    • in many early commits, the only clues to what was merged may be in log comments.
    • later commits may have stored ancestry information using the external svnmerge.py tool.
  • Where it is represented in Subversion, ancestry is described essentially as a series of cherry-picks. It may not have a perfectly accurate representation in Git, where an ancestor commit brings an entire history along with it.
  • When Subversion finally got native ancestry representation in version 1.5, it was semantically equivalent to what is represented by svnmerge.py, but used a different representation. Svn2git may not try to interpret the svnmerge.py information.
  • Even after version 1.5, Boosters continued to use Subversion in ways that didn't actually record ancestry, e.g. by applying diffs directly to the release branch from the trunk.
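
To make the "series of cherry-picks" point concrete, here is a minimal sketch that expands a merge-tracking property value into the individual revisions it names. It assumes the `path:revision-list` text format used by Subversion's `svn:mergeinfo` property; the sample value and the function names are hypothetical and are not taken from the Boost repository or from Svn2git.

```python
def parse_revision_list(text):
    """Expand a revision list such as '1-3,7,10-11*' into a sorted list of ints."""
    revisions = set()
    for item in text.split(","):
        # A trailing '*' marks a non-inheritable range; it is ignored here.
        item = item.strip().rstrip("*")
        if not item:
            continue
        if "-" in item:
            first, last = item.split("-", 1)
            revisions.update(range(int(first), int(last) + 1))
        else:
            revisions.add(int(item))
    return sorted(revisions)


def parse_mergeinfo(value):
    """Map each merge source path to the revisions recorded as merged from it."""
    merged = {}
    for line in value.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last ':' so that the path itself may contain colons.
        path, _, revs = line.rpartition(":")
        merged[path] = parse_revision_list(revs)
    return merged


# Hypothetical property value, for illustration only.
example = "/trunk:100-102,110\n/branches/release:95"
merged = parse_mergeinfo(example)
print(merged)
```

In this made-up value, trunk revisions 103 through 109 were never merged even though revision 110 was. A Git parent pointer to the commit for revision 110 would claim all of them as ancestors, which is why the mapping cannot always be exact.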

The behavior of future Git merges will be correct as long as the very latest merges to or from active branches are represented with ancestry. Therefore, the cost of all other lost ancestry information is low. Since only a few branches are active (probably just trunk and release), we should consider simply adding a parent map to represent the most recent merges between these active branches in each repository.
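
The reasoning above can be illustrated with a toy commit graph. This is a sketch under stated assumptions: the commit names, the `parent_map` dictionary layout, and the helper functions are all hypothetical, and the merge-base computation is a simplified model, not Git's implementation and not the conversion tool's actual parent-map format.

```python
def ancestors(commit, parents):
    """Return the commit and everything reachable through its parent links."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, []))
    return seen


def merge_bases(a, b, parents):
    """Common ancestors of a and b that are not ancestors of another common ancestor."""
    common = ancestors(a, parents) & ancestors(b, parents)
    return {
        c for c in common
        if not any(c != other and c in ancestors(other, parents) for other in common)
    }


# Toy history: trunk is t1..t4; release forks from t1 as r1..r3.
# r2 and r3 applied trunk changes (t2, t3) as plain diffs, so no ancestry was recorded.
parents = {
    "t1": [], "t2": ["t1"], "t3": ["t2"], "t4": ["t3"],
    "r1": ["t1"], "r2": ["r1"], "r3": ["r2"],
}

# Hypothetical parent map: record only the most recent merge (t3 into r3).
parent_map = {"r3": ["t3"]}
patched = {c: ps + parent_map.get(c, []) for c, ps in parents.items()}

before = merge_bases("t4", "r3", parents)
after = merge_bases("t4", "r3", patched)
print(before, after)
```

Without the extra parent, the merge base is the original fork point `t1`, so a future merge would try to reapply changes that are already on the release branch. With only the latest merge recorded, the merge base becomes `t3`, even though the earlier merge of `t2` into `r2` is still missing from the graph.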

Notes

  • We're using Bitbucket, at least temporarily, because it's able to show multiple branches/merges at once (see, e.g., wave), which we think may be useful for people who want to vet the conversion.