-
Notifications
You must be signed in to change notification settings - Fork 702
design: cornerstone for gitattributes features #7164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
89c01ae
to
813bbdf
Compare
e71fd26
to
c79dd7f
Compare
The example implementation is at 06393993/jj@gitattr-design...gitattr-example-impl. |
c79dd7f
to
609cc6d
Compare
@@ -0,0 +1,590 @@ | |||
# Git attributes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've only skimmed the document but what I'm missing is an overview of how we're going to support gitattributes. Specifically, will it only affect the working copy just like EOL conversion does (I think that's my preference)? I.e., I hope all the code can be in local_working_copy.rs
(helper elsewhere is fine as long as they're only used from local_working_copy.rs
). I may very well have missed reasons that that's not possible. If it's not possible to do it in the working copy backend, then where in the architecture will we insert the understanding of gitattributes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will it only affect the working copy just like EOL conversion does (I think that's my preference)?
For the only 3 features I know that we want:
- EOL conversion and smudge/clean filter will only interact with the
local_working_copy
module - The diff feature needs modification outside the
local_working_copy
. The changes should happen in thediff_util
module in thejj-cli
crate for built-in diff tools, and in themerge_tools::diff_working_copies
crate for external diff tools. The reason for the impossibility is obvious.
There are other git attributes features that may introduce modification outside local_working_copy.rs
, e.g., merge
, but we don't consider them in the current design.
I have already added that information to the "Context and Scope" section. Thanks for review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I read the "Context and Scope" section but I don't see how it's explained there.
- The diff feature needs modification outside the
local_working_copy
.
I was hoping that we would simply not support that. It seems too specific to Git for my taste. Or are you suggesting that we support gitattributes when using non-Git backends too? Also when running on a server? How will we find the smudge/clean tools? I guess you explain that somewhere. Sorry, I've only skimmed the doc so far.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No they are not included in the docs.
The diff features are all about the built-in diff tool and the external diff tool. I am Ok that we don't support it. ilyagr@ suggests that "For a minor example, jj diff could respect the linguist-generated attribute (or a differently named attribute) and not show generated files by default, like GitHub." so I include it.
are you suggesting that we support gitattributes when using non-Git backends too?
I am Ok with supporting gitattributes only for git backends, but on the code side, it should probably not behind the git Cargo feature flag, which may complicate the code. But I am also Ok with it.
Also when running on a server?
I am not sure how diff is relevant to a server. Am I missing anything?
How will we find the smudge/clean tools?
Same as git, defined in the user settings. If you don't define the filter, files should be checked out and checked in as is(Haven't checked the git behavior, but I assume so).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see, the conversion would be done just before displaying to the user. That at least reduces the scope. I'm still quite worried about the scope of this feature. I'm worried i will impact code in lots of different places. Hopefully I'm wrong.
I am not sure how diff is relevant to a server. Am I missing anything?
No. As long as we don't care about it when merging, I think it won't affect diffing code. I think we will just decide that we don't handle all cases exactly correctly. Git doesn't either, AFAIK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm worried i will impact code in lots of different places.
No, it won't. At least this design doesn't include such changes and doesn't suggest we need such changes for future features. If there is such impact, it will be in a different design and you will be notified then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here are a bunch of discussion points.
Beside that I like how hard this tries to map Git's .gitattribute
design onto Jujutsu with almost no friction. While this a good thing in some ways, it also misses what a rethought interface for jj
could or should be and to me this is important. I'm going to think hard about that now.
@@ -0,0 +1,590 @@ | |||
# Git attributes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other titles which could work: Git Attributes for Jujutsu, Supporting Git Attributes in Jujutsu
|
||
**Authors**: [Kaiyi Li](mailto:[email protected]) | ||
|
||
**Summary**: This design introduces an API internal to `jj-lib` that future git attributes related features could use to query the git attributes information associated with a file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: While this is a nice summary, I'd expect a briefer overview following it (it'd also would've answered Martin's question).
|
||
This design provides the cornerstone of all the previously mentioned features. | ||
|
||
In this design, we use the same terminology as the [git attributes document](https://git-scm.com/docs/gitattributes#_description). Some regularly used definitions include pattern, state, and `gitattributes` file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: While a link is fine, it forces you to keep a tab open to it. And since there are a bunch of newcomers to source control here it'd be also worthwhile to copy the relevant terms into this section (it'd also help if the link were unavailable).
* How the `gitattributes` file should be loaded from the `Store` or from the file system. | ||
* When query the attributes associated with a file, how to look up `gitattributes` in all ancestor folders. | ||
* [Macro](`https://git-scm.com/docs/gitattributes#_using_macro_attributes`) resolution. | ||
* The query interface should be thread-safe, because `TreeState::snapshot` currently snapshots files in multiple threads concurrently. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I'd reword this should as must, since I don't think this is negotiable longterm.
* The query interface should be thread-safe, because `TreeState::snapshot` currently snapshots files in multiple threads concurrently. | ||
* The design should make it easy to optimize the performance of the query interface without having to change the caller code too much, and the initial implementation should be reasonably fast, because any feature that makes use of `gitattributes` is likely to query the git attributes of all the changed files whenever `jj` snapshots files from the file system to the `Store` or updates files from the `Store` to the file system. | ||
* When `gitattributes` is not used, the design should introduce negligible runtime and memory overhead, if not 0, i.e. if none of git attributes related features is enabled through the user settings. | ||
* Include git built-in `gitattribute` macros. Currently there is only [`binary`](https://github.com/git/git/blob/master/attr.c#L639). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor wording nit: s/Include/Support all (there's only one and that's a detail)
|
||
`GitAttributes::search` may try to read the `gitattributes` file, and if the read fails for a reason other than file not found, or `Store` fails to retrieve a tree, `GitAttributes::search` fails, i.e., returns an `Err`. | ||
|
||
Note that we are using the `gix_attributes::State` type from the external `gix-attributes` crate in the return type, so we restrict the accessibility of the `GitAttributes::search` interface to `pub(crate)`. If we want to promote it to `pub`, we either need to define our own `gix_attributes::Assignment` type or re-export the `gix_attributes::Assignment` type. Promotion to `pub` is likely, because we will need `DiffRenderer` in the `jj-cli` crate to query the `text`, `linguist-generated`, and `diff` git attributes to tell whether a file is binary, generated or how to generate the diff texts respectively. But for this design, we still just keep it as `pub(crate)` for simplicity. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I'd prefer to export our own type since we're able to change the underlying implementation depending on the projects needs. Also I assume its highly likely that it will be pub
at some point anyway, so we can guard against misuses.
```rust | ||
#[async_trait] | ||
trait FileLoader: Send + Sync { | ||
async fn load(&self, path: &RepoPath) -> Result<Option<Box<dyn AsyncRead + Send + Unpin>>>; | ||
} | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Have you considered a design which allows you to nest FileLoaders
then fetching from Disk or a external cache could be a implementation detail?
```rust | ||
struct GitAttributesNode { | ||
disk_first: OnceCell< | ||
Arc<( | ||
gix_attributes::Search, | ||
gix_attributes::search::MetadataCollection, | ||
)>, | ||
>, | ||
store_first: OnceCell< | ||
Arc<( | ||
gix_attributes::Search, | ||
gix_attributes::search::MetadataCollection, | ||
)>, | ||
>, | ||
disk_file_loader: Arc<dyn FileLoader>, | ||
store_file_loader: Arc<dyn FileLoader>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do these need to be in a linked list and know where they're assigned from (i.e Disk or Store)? Isn't it simpler to just have a GitAttributesNode
which is lazily loaded and make the cache responsible for path nesting and management with something like a Vec
or LruCache
?
|
||
### Integration with EOL conversion | ||
|
||
We add 2 fields `TreeState::git_attributes: Arc<GitAttributes>` and `TreeState::git_attributes: Arc<GitAttributes>`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think one of these should be TreeState::disk_attributes: Arc<GitAttributes>
@@ -169,6 +169,7 @@ nav: # This lists all the files that become part of the documentation | |||
- Sparse patterns v2: 'design/sparse-v2.md' | |||
- Tracking branches: 'design/tracking-branches.md' | |||
- Copy tracking and tracing: 'design/copy-tracking.md' | |||
- gitattributes cornerstone: 'design/gitattributes.md' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this also could use a better title.
609cc6d
to
51b5f49
Compare
51b5f49
to
ff95760
Compare
Checklist
N/A:
CHANGELOG.md
README.md
,docs/
,demos/
)cli/src/config-schema.json
)