Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Records - a better way #3399

Closed
bboyle1234 opened this issue Apr 26, 2020 · 29 comments
Closed

Records - a better way #3399

bboyle1234 opened this issue Apr 26, 2020 · 29 comments

Comments

@bboyle1234
Copy link

bboyle1234 commented Apr 26, 2020

Edit: After thinking about this issue for a couple of hours actually writing about it instead of just giving it a few moments in my head every now and then, I think I'm going to change my mind and say that code-gen is NOT better than records as a language feature ... I'll leave this here undeleted though - someone may find the train of thought helpful. Here is what I wrote (which I now am starting to disagree with)

IMHO, Code-gen is a better way to achieve the goals of the much-discussed c# record feature.
A simple and very effective example of such a thing can be found here. https://github.com/amis92/RecordGenerator @amis92

What are the goals of the records proposal?

  • Brevity. Writing little to no boiler plate code.
  • Immutability and brevity.
  • Mutability by the construction of new instances with brevity and pleasing syntax.
  • Object initializer syntax as an alternative to complicated constructor parameter lists.
  • Memberwise equality and hash coding
  • Serialization
  • ToString

All of these goals can be better-met via code generation. I'll try to justify that as I work on this thread over time.

I think one of the main mistakes of the records proposals made so far is that they are confusing immutability and memberwise equality as though they are the same thing, to be solved together. They are not. They are separate problems and should be solved separately. With two separate code-gen utilities. I'll explain more on that later but for now note particularly that it is nice to have memberwise equality available in an object that is not immutable, and vice-versa.

The other major mistake is that it's almost impossible to find a succinct single language feature solution that fits 99% of memberwise equality scenarios.

For the c# records proposal to work in most cases, it needs to provide a very configurable set of options. But language features can get "stuck" and have very limited scope compared to what can be achieved through a living, continually growing open-source code-gen solution available on nuget that moves to meet demand and allows the development of new and special features that can be added to classes via attribute declaration.

Well that's my opening statement. I'll go ahead and add some of my uses of the RecordGenerator package in comments below and hopefully inspire the c# team to pick up on the idea of starting a living growing open source code gen project for our immutability and memberwise equality needs.

Another good example of code-gen for this purpose can be seen here. I have appended it as a footnote though because it is a bit distracting because of the complication it adds with lists and other object hierarchies. https://github.com/AArnott/ImmutableObjectGraph. @AArnott

@bboyle1234
Copy link
Author

bboyle1234 commented Apr 26, 2020

Here's an immutable class that didn't need the memberwise equality feature. Here's the code I wrote:

[ObfuscationAttribute(Exclude = true)]
[Record(Features.Default ^ (Features.Equality | Features.Deconstruct))]
internal sealed partial class SignInState : ISignInState {

    public static readonly SignInState NotInitialized = new Builder {
        State = SignInStates.NotInitialized,
        Username = null,
        Password = null,
        RememberMe = false,
        Session = null,
        LastSession = null,
        ErrorMessage = "Not initialized",
    }.ToImmutable();

    public string Username { get; }
    public string Password { get; }
    public bool RememberMe { get; }

    [JsonConverter(typeof(StringEnumConverter))]
    public SignInStates State { get; }

    public ISession Session { get; }
    public ISession LastSession { get; }
    public string ErrorMessage { get; }
}

And here's the code that got generated: So far, it's not better than what we get from the records spec. It's much the same. The partial OnConstructed method is a nice touch, but I believe we'll get that in the records spec also.

What we do have (so far) that is better, is that each of the features generated below is opt-in. By decorating the class above with attributes, we can opt-in and/or configure each of the features generated below. In the example above, we opted into all features except memberwise equality and deconstruction. If we leave it to a language feature, we will get VERY little configurability.

Keep reading below the code gen for some cool things, aside from configurability, that ARE better than language-spec records.

partial class SignInState {
    [System.CodeDom.Compiler.GeneratedCodeAttribute("Amadevus.RecordGenerator", "0.5.0.0")]
    public SignInState(string username, string password, bool rememberMe, SignInStates state, ISession session, ISession lastSession, string errorMessage) {
        this.Username = username;
        this.Password = password;
        this.RememberMe = rememberMe;
        this.State = state;
        this.Session = session;
        this.LastSession = lastSession;
        this.ErrorMessage = errorMessage;
        OnConstructed();
    }

    [System.CodeDom.Compiler.GeneratedCodeAttribute("Amadevus.RecordGenerator", "0.5.0.0")]
    partial void OnConstructed();

    [System.CodeDom.Compiler.GeneratedCodeAttribute("Amadevus.RecordGenerator", "0.5.0.0")]
    public SignInState Update(string username, string password, bool rememberMe, SignInStates state, ISession session, ISession lastSession, string errorMessage) {
        return new SignInState(username, password, rememberMe, state, session, lastSession, errorMessage);
    }

    [System.CodeDom.Compiler.GeneratedCodeAttribute("Amadevus.RecordGenerator", "0.5.0.0")]
    public SignInState WithUsername(string value) {
        return Update(value, Password, RememberMe, State, Session, LastSession, ErrorMessage);
    }

    [System.CodeDom.Compiler.GeneratedCodeAttribute("Amadevus.RecordGenerator", "0.5.0.0")]
    public SignInState WithPassword(string value) {
        return Update(Username, value, RememberMe, State, Session, LastSession, ErrorMessage);
    }

    [System.CodeDom.Compiler.GeneratedCodeAttribute("Amadevus.RecordGenerator", "0.5.0.0")]
    public SignInState WithRememberMe(bool value) {
        return Update(Username, Password, value, State, Session, LastSession, ErrorMessage);
    }

    [System.CodeDom.Compiler.GeneratedCodeAttribute("Amadevus.RecordGenerator", "0.5.0.0")]
    public SignInState WithState(SignInStates value) {
        return Update(Username, Password, RememberMe, value, Session, LastSession, ErrorMessage);
    }

    [System.CodeDom.Compiler.GeneratedCodeAttribute("Amadevus.RecordGenerator", "0.5.0.0")]
    public SignInState WithSession(ISession value) {
        return Update(Username, Password, RememberMe, State, value, LastSession, ErrorMessage);
    }

    [System.CodeDom.Compiler.GeneratedCodeAttribute("Amadevus.RecordGenerator", "0.5.0.0")]
    public SignInState WithLastSession(ISession value) {
        return Update(Username, Password, RememberMe, State, Session, value, ErrorMessage);
    }

    [System.CodeDom.Compiler.GeneratedCodeAttribute("Amadevus.RecordGenerator", "0.5.0.0")]
    public SignInState WithErrorMessage(string value) {
        return Update(Username, Password, RememberMe, State, Session, LastSession, value);
    }

    [System.CodeDom.Compiler.GeneratedCodeAttribute("Amadevus.RecordGenerator", "0.5.0.0")]
    public override string ToString() => new {
        Username,
        Password,
        RememberMe,
        State,
        Session,
        LastSession,
        ErrorMessage
    }

    .ToString();
}

partial class SignInState {
    [System.CodeDom.Compiler.GeneratedCodeAttribute("Amadevus.RecordGenerator", "0.5.0.0")]
    public Builder ToBuilder() {
        return new Builder { Username = Username, Password = Password, RememberMe = RememberMe, State = State, Session = Session, LastSession = LastSession, ErrorMessage = ErrorMessage };
    }

    [ObfuscationAttribute(Exclude = true)]
    public partial class Builder {
        [System.CodeDom.Compiler.GeneratedCodeAttribute("Amadevus.RecordGenerator", "0.5.0.0")]
        public string Username {
            get;
            set;
        }

        [System.CodeDom.Compiler.GeneratedCodeAttribute("Amadevus.RecordGenerator", "0.5.0.0")]
        public string Password {
            get;
            set;
        }

        [System.CodeDom.Compiler.GeneratedCodeAttribute("Amadevus.RecordGenerator", "0.5.0.0")]
        public bool RememberMe {
            get;
            set;
        }

        [System.CodeDom.Compiler.GeneratedCodeAttribute("Amadevus.RecordGenerator", "0.5.0.0")]
        public SignInStates State {
            get;
            set;
        }

        [System.CodeDom.Compiler.GeneratedCodeAttribute("Amadevus.RecordGenerator", "0.5.0.0")]
        public ISession Session {
            get;
            set;
        }

        [System.CodeDom.Compiler.GeneratedCodeAttribute("Amadevus.RecordGenerator", "0.5.0.0")]
        public ISession LastSession {
            get;
            set;
        }

        [System.CodeDom.Compiler.GeneratedCodeAttribute("Amadevus.RecordGenerator", "0.5.0.0")]
        public string ErrorMessage {
            get;
            set;
        }

        [System.CodeDom.Compiler.GeneratedCodeAttribute("Amadevus.RecordGenerator", "0.5.0.0")]
        public SignInState ToImmutable() {
            return new SignInState(Username, Password, RememberMe, State, Session, LastSession, ErrorMessage);
        }
    }
}

@bboyle1234
Copy link
Author

bboyle1234 commented Apr 26, 2020

The Builder object created by the code-gen above completely obliviates the need for init-only properties. I think the entire c# team can take a breath and stop work on that feature. No offense to the smarter people than me who are working on it for reasons I have not thought of far beyond this simple scope, but from my point of view with the Builder object created above, here's how to get the pleasing object initializer syntax without going all the way to town and creating init-only properties:

var signInState = new SignInState.Builder {
    Username = "immutability",
    Password = "is easy",
}.ToImmutable();

The Builder was code-genned so it was no extra effort.

There is a little bit of extra noise in what the developer needed to type, but we could find ways to structure the generated code to minimise it. Instead of spending effort creating init-only properties and the record-feature, I'd prefer the c# team to talk about making code-gen easier to build and deploy. At the moment it's cumbersome to write a nuget package that does code-gen without the user taking additional steps, and it's cumbersome to make it work from a nuget package in the CI build pipelines too.

(I've written in such an opinionated manner ... whilst it IS my opinion, I submit it very humbly to the people I admire and adore so much)

@bboyle1234
Copy link
Author

bboyle1234 commented Apr 26, 2020

What happens when the object you're writing needs a custom ToString()?
With the code-gen feature demonstrated above, we opt out of the code-gen ToString and implement our own. This is possible because the code-gen approach allows the use of attributes for advanced customization.

@HaloFour
Copy link
Contributor

I more or less agree with this discussion. Source generators are already on the roadmap and they enable all of these scenarios. However, the team has stated pretty explicitly that they feel that "records" shouldn't be implemented via source generators.

I've also argued for a builder-based approach to init-only properties and withers and the team decided against it.

I guess the one consolation prize is that if/when source generators do ship with the compiler that we can use them to accomplish records without using these record language features and with substantially more flexibility.

@bboyle1234
Copy link
Author

bboyle1234 commented Apr 26, 2020

---- Here's where I started to backtrack and disagree with myself. I was writing about memberwise equality, and the complexity of trying to solve every scenario using a single language feature. But I came to the point where I thought an attribute specifying the required IEqualityComparer for any property that doesn't use the standard comparer will satisfy 99% of scenarios, so my major concern about memberwise equality, the intended "mic drop" of this thread, doesn't really stand with me any longer .... so long as the records language feature allows the specification of the IEqualityComparer for each property where necessary.

@bboyle1234
Copy link
Author

bboyle1234 commented Apr 26, 2020

This leaves me with the following points:

Immutability and Memberwise equality are SEPARATE issues and it would be very good to have those features individually available. I want immutable classes that don't have memberwise equality forced upon, and vice versa, and I feel this is very important not only for me but for MANY users. The records spec still fails us in this regard. This is the contention for me that would prevent the record spec from satisfying the 99% (or even 90%) of use cases.

I see that the introduction of init-only properties, combined with the withers syntax defined by Mads Torgersen as With { X=1 Y=1 } will reduce allocations involved in mutating the objects, making the use of immutables in "hot paths" a little faster and more practical than without. Is this the reason for the design team's decision?

@iam3yal
Copy link
Contributor

iam3yal commented Apr 26, 2020

@bboyle1234

I want immutable classes that don't have memberwise equality forced upon, and vice versa

You can have value equality without immutability as far as I understood and if you need immutability without value equality why would you care about it? does it matter?

As far as I understood they are going with individual features and records just build on top of these features.

@bboyle1234
Copy link
Author

@eyalsk

As far as I understood they are going with individual features and records just build on top of these features.

Then I am a happy man!! I hope they have customizable equality for individual properties ... any word on that?

@iam3yal
Copy link
Contributor

iam3yal commented Apr 26, 2020

@bboyle1234 I read something about it but I'm not sure where it stands, I think they took a step back, could be wrong.

@bboyle1234
Copy link
Author

if you need immutability without value equality why would you care about it? does it matter?

Because sometimes you just wanna write your own equality checking algorithms or hash-coding algorithm. Most of the time it's just falling back to default reference-type equality, but other times you wanna add some custom detail. Examples I can think of immediately: Saying two double properties are equal (for the purpose of this particular object) if they are almost equal, except for some small arithmetic error that was introduced, or perhaps you want two string fields to be equal if they contain the same time but in different locales, etc, and so on, there are millions of possible reasons. (And don't judge me by the time string example haha)

@iam3yal
Copy link
Contributor

iam3yal commented Apr 26, 2020

Because sometimes you just wanna write your own equality checking algorithms or hash-coding algorithm

I'm pretty sure that the plan is to allow you to override any method including Equals just like you would today with any other class or struct.

@HaloFour
Copy link
Contributor

@bboyle1234

Then I am a happy man!! I hope they have customizable equality for individual properties ... any word on that?

I believe it's either an on-or-off proposition and you won't be permitted to handle specific properties or fields in custom ways without writing your own Equals or GetHashCode methods.

@bboyle1234
Copy link
Author

Is there a place to vote on the issue? It would mean a lot to me (read, save me from pages of boilerplate) and many other coders if we could customize the IEqualityComparer used for each property in a record, without re-writing the entire Equals and GetHashCode methods

@HaloFour
Copy link
Contributor

@bboyle1234

The issue is here: #3213

@bboyle1234
Copy link
Author

I commented there. Thank you. @HaloFour and @eyalsk

@333fred
Copy link
Member

333fred commented Apr 26, 2020

it's almost impossible to find a succinct single language feature solution that fits 99% of memberwise equality scenarios.

To be clear, we agree with this statement. We even talked about it here: https://github.com/dotnet/csharplang/blob/master/meetings/2020/LDM-2020-03-30.md#value-equality. Anything particularly complicated will need to use a source generator or write its own equality. However, we do think that shallow field-based equality will be be correct for at least a large plurality of records, and for those that it's not correct, it's no worse (and possibly a bit better) than default reference equality.

As to the parts of records, we do think that they can't just be syntax sugar with a generator. It may end up being that the generator is effectively built in to the compiler as we work on pulling record features apart into separate language features, but the point of records is to get very brief data holder declarations. We think the necessary attributes, using statement for the attributes, and all the rest that would be required for a source generator is too much, and that we can (and should) do better. It's very tempting to fall back to source generators for language design now that we're actually planning on shipping something in the space, but we want to make sure that we don't fall into the trap of letting generators dictate language features.

@HaloFour
Copy link
Contributor

@333fred

We think the necessary attributes, using statement for the attributes, and all the rest that would be required for a source generator is too much, and that we can (and should) do better.

IMO the team should take a bit of a look at Project Lombok. I think it does a good job of demonstrating how such a generator can compose over multiple individual features without requiring a spin-out of attributes. They use meta-annotations which combine together multiple other annotations.

For example, this is all you need to write in order to declare an immutable "record" in Java:

@Value class Person { String name; int age; }

That combines generator policies from five different annotations.

If I decided I wanted it to be mutable:

@Data class Person { String name; int age; }

Which combines together the generator policies from five other annotations.

The degree of flexibility I have to customize how records are generated is quite incredible, both project-wide and down to the individual record. With just equality I have control over which fields participate and how, as well as how inheritance plays into it. I can control how annotations applied to fields are "promoted" to the accessors or even the accessors of builders so that I can get fluent builder APIs that are completely compatible with GSON or Jackson serialization libraries.

This project works with every version of Java back to Java 6, when annotation processors were added to the compiler. And the tooling support, in IntelliJ at least, has been spot-on, even though that support is provided entirely through a plugin.

My opinion is that records, as they are being designed by the team now, are a massive pit of differing policy choices. I can understand wanting them to be more first-class, but I have a feeling that it doesn't matter how close you get to the majority case you're only going to end up with an avalanche of feature requests to allow tweaking and tuning those features to match the expectations of how POCOs are supposed to be designed based on the very different approaches that people have adopted over the past 20 years. Orders of magnitude worse than with auto-props. To me source generators are the perfect way to let the ecosystem decide.

@CyrusNajmabadi
Copy link
Member

@HaloFour We did bring it up in an LDM meeting (i was the one driving that). However, overall discussion was if this felt like it was a first class language feature for people, or an external library feature. Majority fell on this being first-class, and so the approach using generators was felt to be insufficient.

@CyrusNajmabadi
Copy link
Member

CyrusNajmabadi commented Apr 26, 2020

Then I am a happy man!! I hope they have customizable equality for individual properties ... any word on that?

It was heavily discussed. And the team is very cognizant of the desire to cutomize this stuff. We didn't land on any approach (naming/attributes/accessor/other) that felt satisfactory. So, initially, we're going with an all-or-nothing approach for equality. We supply a default impl, you can override and do whatever you want. In the future we may expand that to support different equality concepts.

@CyrusNajmabadi
Copy link
Member

I think one of the main mistakes of the records proposals made so far is that they are confusing immutability and memberwise equality as though they are the same thing, to be solved together. They are not.

As fred mentioned above, it's worth reiterating that this is not the case. Records is a very broad topic built out of many interesting pieces. We've explored approaches where everything is tied together and approaches where it's small individual pieces. Right now we're going in the latter direction, and we're already taking small pieces that we feel are sufficiently designed and are moving them to the prototype-impl phase (while also continuing to iterate on parts that we still feel need mental cycles attached to them).

Like with NRT, we don't view records a "this is a small feature that will be done in a single language release". Instead, it's part of a broader, and long-lived, approach around how we think the language should look and feel around both data production and data consumption. This is naturally an extremely complex and involved design process because the above is so varied for so much of the ecosystem. like with NRT, i do not expect our initial results here to be suitable or satisfactory for everyone. Instead, they will be used to get value and functionality out in a step-wide fashion, allowing those who can to adopt, while also allowing for updated feedback to roll into future developments in this space.

@CyrusNajmabadi
Copy link
Member

To me source generators are the perfect way to let the ecosystem decide.

I don't see why it's either/or. I imagine that with source-gens you'll have this happen anyways, and there will be some project with an enormous amount of fine-grained customization enabled that people can use if they want. But the presence of such a lib doesn't negate the value in having a (potentially opinionated) pre-baked solution in hte langauge itself.

If you find yourself running into cases hte language-approach is not suitable. Then by all means write your own type (or use the source-generator). That's very much part and parcel of the design. A 'record' is not some special entity in the system that only teh compiler could create. It's intended to be more about expectations and defaults to help a plurality out.

@333fred
Copy link
Member

333fred commented Apr 26, 2020

Perfectly said Cyrus.

@HaloFour
Copy link
Contributor

@eyalsk

My team is using Lombok in our current Spring project. There is a maven plugin which does the source generation at build time, but there's also a plugin for IntelliJ which reflects the results of the annotations immediately so there is no intermediate step between creating/modifying an annotated class and consuming it from elsewhere in the project. It is mostly driven by annotations although they also have configuration that can be applied at the project or package level that allows you to control some very fine-grained details, like being able to state that you want JSON-specific annotations copied to the accessor methods of the auto-generated builder class for a given immutable type. It feels very much like it's a part of the language experience and not something bolted onto the side.

@CyrusNajmabadi

However, overall discussion was if this felt like it was a first class language feature for people, or an external library feature. Majority fell on this being first-class, and so the approach using generators was felt to be insufficient.

I can understand this. My concern is that in order to make this a first-class citizen while trying to keep complexity under control you have to be fairly opinionated around the design. Every single one of these individual concerns is minefield of different policy choices, and the language is certainly not going to support all or even most of them. Even if you achieve the plurality of use cases out of the individual features I have severe doubts that, when combined, any such feature will be usable for majority of developers out of the box, at least not without some serious compromises.

I don't see why it's either/or.

It doesn't have to be, but if the feature can be solved better and with more flexibility through third-party libraries, why integrate it into the language? That's the argument that the team makes regarding tons of proposals that have been made for the language. IMO you don't get much more policy-riddled than trying to create shorthand for POCOs.

It's interesting that you compare it to NRTs, because I can see the amount of work being involved easily being equivalent, but you're not trying to solve a "billion dollar mistake" with it. The "war on boilerplate" is likely as much of a never-ending quagmire as most of those "wars" end up being.

Does the team feel that the effort is worth it if a large portion of the community still needs to reach out to third-party generators to fill in all of the gaps left behind by first-class records?

@YairHalberstadt
Copy link
Contributor

I don't see why it's either/or. I imagine that with source-gens you'll have this happen anyways, and there will be some project with an enormous amount of fine-grained customization enabled that people can use if they want. But the presence of such a lib doesn't negate the value in having a (potentially opinionated) pre-baked solution in hte langauge itself.

We've explored approaches where everything is tied together and approaches where it's small individual pieces. Right now we're going in the latter direction, and we're already taking small pieces that we feel are sufficiently designed and are moving them to the prototype-impl phase

I feel like these two approaches conflict. If there was no such thing as source generators, there would be an incentive for the language to provide as many customisable knobs as possible to cover as many use cases as possible. However once we're accepting that source generators will take up some of the slack, I reckon it should be easy for the language to cover the 90% use case with a simple first class language feature, and leave all the rest to source generator libraries.

@HaloFour
Copy link
Contributor

@YairHalberstadt

However once we're accepting that source generators will take up some of the slack, I reckon it should be easy for the language to cover the 90% use case with a simple first class language feature, and leave all the rest to source generator libraries.

I can imagine that will be the approach that the language team takes. The wider ecosystem will be able to provide significantly more flexibility either on top of the first-class citizen or entirely independently. To that end, the language team will probably "scratch the itch" to their satisfaction and move on. I can see the plurality of uses being a combination of first-class and third-party features.

@333fred
Copy link
Member

333fred commented Apr 26, 2020

That's certainly the use case we've been targeting so far. We've explicitly designed with the intention that we can have an IDE refactoring to spit out all the code a "record" will generate, and that if the user provides explicit implementations of parts of the feature we'll gracefully incorporate them. Equality is complex enough that I don't think we can solve it directly in the language, but if it's not one of the first source generators written I will buy a hat and eat it. It's one we should honestly consider shipping directly with the framework.

@HaloFour
Copy link
Contributor

@333fred

It's one we should honestly consider shipping directly with the framework.

Is it no longer a "first-class citizen" if the source generator is shipped by MS with the SDK? :)

I imagine that an important testing scenario of the combination of records with generators is that the generator can provide an implementation for features that would otherwise be provided out of the box.

@333fred
Copy link
Member

333fred commented Apr 26, 2020

Is it no longer a "first-class citizen" if the source generator is shipped by MS with the SDK? :)

In this context, no. It's first-class in the framework and tooling, but it's not a language feature, which is what mean when I say first-class on this repo.

I imagine that an important testing scenario of the combination of records with generators is that the generator can provide an implementation for features that would otherwise be provided out of the box.

The plan is that the should be able to do so.

@YairHalberstadt
Copy link
Contributor

Closing as records have already shipped as have source generators

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants