Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proper support for extending enums #818

Open
SimonCockx opened this issue Aug 29, 2024 · 9 comments · May be fixed by #833
Open

Proper support for extending enums #818

SimonCockx opened this issue Aug 29, 2024 · 9 comments · May be fixed by #833
Labels
enhancement New feature or request subject: code generation This issue is about code generation subject: model validation This issue is about validation of Rosetta models, such as the type system subject: syntax This issue is about the syntax of Rosetta

Comments

@SimonCockx
Copy link
Contributor

SimonCockx commented Aug 29, 2024

Problem description

Rune has rudimentary support for extending enums, e.g.,

enum ISOCurrencyCodeEnum: <"The enumerated values to specify standard currency codes according to the International Standards Organization (ISO).">
    AFN <"Afghani">
    EUR <"Euro">
    GBP <"Pound Sterling">
    USD <"US Dollar">
    ...

enum CurrencyCodeEnum extends ISOCurrencyCodeEnum: <"Union of the enumerated values defined by the International Standards Organization (ISO) and the FpML nonISOCurrencyScheme.">
    CNH <"Offshore Chinese Yuan traded in Hong Kong.">
    JEP <"Jersey Pound.">
    KID <"Tuvaluan Dollar.">
    VAL <"Vatican Lira.">
    ...

Semantically, this will "copy over" all enum values from ISOCurrencyCodeEnum into CurrencyCodeEnum, as if extending the list of ISO currency codes with other enumerated values. However, when used in expressions there are three use cases which are currently unsupported:

  1. Comparing a value of ISOCurrencyCodeEnum with a value of CurrencyCodeEnum. E.g., isoCurrencyCode = currencyCode. Since Rune does not define any subtype relation between these two enum types, this results in a model validation error saying that the two types are not comparable. However, they do share enum values, so it should be possible to compare them.

  2. Concatenating multiple enumerations together. Right now, one can only extend a list of enumerated values, but one cannot concatenate multiple lists into one list of enumerated values. See Support for merged enum #525 for an extended use case and discussion.

  3. Checking if a value of CurrencyCodeEnum belongs to ISOCurrencyCodeEnum. Given a currency code CurrencyCodeEnum -> GBP it should be possible to check that this is indeed an ISOCurrencyCodeEnum, and it should be possible to convert it. (see Support for merged enum #525 (comment))

These three use cases can not only be seen from the perspective of a modeller, but also from the perspective of a consumer of the generated code. For example, if we add support for comparing extended enumerations, then a Java implementer should be able to perform that same operation in a convenient way.

EDIT: the initial proposal below has been replaced by a different approach after discussion. To see the current proposal, see #818 (comment).

Proposal

I split up a proposed solution in three parts for different stakeholders: modellers (changes to the Rune syntax and type system), code generator implementers (how language changes should be handled in Java and other code generators), and Java code consumers (how consumers of the generated code can achieve the same in Java).

Changes for modellers

Changes to the subtype relation

To make extended enumerations comparable, there should be a subtype relation between the two types. However, whereas a type extension is a subtype of its parent, I argue that for enums it is the other way around, i.e., an enum extension is a supertype of its parent.

It is often convenient to consider the subtype relation as the "is a" relation. A Cat is an Animal because type Cat extends Animal. In this case, we consider Cat a subtype of Animal. However, consider again the example of currency codes:

enum ISOCurrencyCodeEnum:
    AFN
    EUR
    GBP
    USD
    ...

enum CurrencyCodeEnum extends ISOCurrencyCodeEnum:
    CNH
    JEP
    KID
    VAL
    ...

In this case, the statement that "a currency code is an ISO currency code" does not hold! For example, CNH is a currency code, but not an ISO currency code. The converse does hold: each ISO currency code is a currency code, so we can consider ISOCurrencyCodeEnum a subtype of CurrencyCodeEnum.

I propose to enhance the type system of Rune with this subtype relation. Consequences:

  • An ISOCurrencyCodeEnum can be compared with a CurrencyCodeEnum. (this addresses our first use case)
  • An ISOCurrencyCodeEnum can be assigned to an attribute of type CurrencyCodeEnum.
  • An ISOCurrencyCodeEnum can be passed to a function which expects a CurrencyCodeEnum.

Concatenating multiple enumerations into one

Taking the example from #525, a rate index can be either a floating rate index or an inflation rate index. Allowing enumerations to extend multiple others would handle this use case:

enum FloatingRateIndexEnum:
  ...

enum InflationRateIndexEnum:
  ...

enum RateIndexEnum extends FloatingRateIndexEnum, InflationRateIndexEnum:
  // Merge of `FloatingRateIndexEnum` and `InflationRateIndexEnum`

This would be a relatively easy change to the Rune language, which does not influence the implementation of code generators in any way. Note that, considering the proposed subtype relation, RateIndexEnum would be the supertype of both FloatingRateIndexEnum and InflationRateIndexEnum.

Checking if the value of an enum extension is a value of its parent enum

For the use case of converting one into the other, I propose to enhance the to-enum operation, so it also accepts an expression of an enum type as its left operand:

currencyCode to-enum ISOCurrencyCodeEnum

This will result in the corresponding ISOCurrencyCodeEnum value, or empty if no corresponding value exists. Combined with the exists operation, it would allow a modeller to check whether a currency code is an ISO currency code:

if currencyCode to-enum ISOCurrencyCodeEnum exists
then ... // currencyCode is an ISOCurrencyCodeEnum
else ... // currencyCode is not an ISOCurrencyCodeEnum

Changes for code generator implementers

The changes to the subtype relation imply type coercions in the generated Java code. When an expression of type ISOCurrencyCodeEnum is used in a place where a value of type CurrencyCodeEnum is expected, it will have to be coerced to a CurrencyCodeEnum. Similar changes will need to be made for the Python code generator.

Allowing to extend multiple enumerations does not influence the code generation for enumerations: code generators can still use the utility method RosettaExtensions#getAllEnumValues to iterate through all enumerated values, including those of all parent enumerations.

The semantics of the to-enum operation will need to change appropriately if its left operand is an enum.

Utilities for Java code consumers

Ideally, consumers of the generated Java code should be able to compare extended enums with a similar ease, as well as easily convert from/to extended enums. I propose to generate two methods for each enum extensions to perform these conversions:

@RosettaEnum("CurrencyCodeEnum", extends = { ISOCurrencyCodeEnum.class })
public enum CurrencyCodeEnum {
    ...
    
    public static CurrencyCodeEnum fromISOCurrencyCodeEnum(ISOCurrencyCodeEnum isoCurrencyCodeEnum) { ... }

    public ISOCurrencyCodeEnum toISOCurrencyCodeEnum() { ... }
}

Also notice the extends attribute inside the @RosettaEnum annotation.

As an example, consumers can use this to compare a value of CurrencyCodeEnum with ISOCurrencyCodeEnum:

if (currencyCode == CurrencyCodeEnum.from(isoCurrencyCode)) {
  // they are equal
} else {
  // they are not equal
}

Everything together, I think this proposal addresses all use cases, and will significantly improve the way enumerations can be used in the CDM. Feedback is welcome!

Related:

@SimonCockx SimonCockx added enhancement New feature or request subject: code generation This issue is about code generation subject: syntax This issue is about the syntax of Rosetta subject: model validation This issue is about validation of Rosetta models, such as the type system labels Aug 29, 2024
@hugohills-regnosys
Copy link
Contributor

hugohills-regnosys commented Aug 29, 2024

If the merged enum functionality was implemented, I think it would be better to represent currency codes like this:

enum ISOCurrencyCodeEnum:
    AFN
    EUR
    GBP
    USD
    ...

enum NonISOCurrencyCodeEnum:
    CNH
    JEP
    KID
    VAL
    ...

enum CurrencyCodeEnum extends ISOCurrencyCodeEnum, NonISOCurrencyCodeEnum:

@lolabeis
Copy link
Contributor

lolabeis commented Sep 3, 2024

Fully agree that in the case of enums, extends reverses the sub-type / super-type relationship, so should be implemented accordingly in the type system.

On this basis, I wonder if it's wise to use the same extends keyword? But I acknowledge that for non-developers who may not be versed into type-system arcane, it does look like they're "extending" the enum. Maybe we keep the same keyword but add this observation in the Rune documentation?

@SimonCockx
Copy link
Contributor Author

On this basis, I wonder if it's wise to use the same extends keyword? But I acknowledge that for non-developers who may not be versed into type-system arcane, it does look like they're "extending" the enum. Maybe we keep the same keyword but add this observation in the Rune documentation?

It's a good point. Especially when extending multiple enumerations, I think the keyword extends might be confusing. Perhaps another keyword is more appropriate? E.g., includes:

enum CurrencyCodeEnum includes ISOCurrencyCodeEnum, NonISOCurrencyCodeEnum:

@davidalk
Copy link
Contributor

On this basis, I wonder if it's wise to use the same extends keyword? But I acknowledge that for non-developers who may not be versed into type-system arcane, it does look like they're "extending" the enum. Maybe we keep the same keyword but add this observation in the Rune documentation?

It's a good point. Especially when extending multiple enumerations, I think the keyword extends might be confusing. Perhaps another keyword is more appropriate? E.g., includes:

enum CurrencyCodeEnum includes ISOCurrencyCodeEnum, NonISOCurrencyCodeEnum:

I think that reads better than extends or perhaps borrowing from other languages with union:

   enum CurrencyCodeEnum union ISOCurrencyCodeEnum, NonISOCurrencyCodeEnum:

Also do we intent to make the new super type immutable? So no adding enum values to CurrencyCodeEnum. I think that would be a good restriction to try and keep the types clean.

@minesh-s-patel
Copy link
Contributor

I prefer the includes rather than the extends.
In general, when we introduce new syntax, I think we should also think about applying it to basic types/record types/complex types, enums etc. Users of Rune should have less specialised keywords.

Another option is to introduce pure union types.

e.g. For enums

union enum CurrencyCodeEnum
    ISOCurrencyCodeEnum
    NonISOCurrencyCodeEnum

e.g. For types


type Name:
  firstName string (1..1)
  lastName string (1..1)

type Address:
  firstLine string (1..1)
  postCode string (1..1)
  city string (1..1)
  country CurrencyCodeEnum (1..1)

// A person has  firstName, lastName, firstLine, postCode, city and country (union enum)
union type Person
    Name
    Address

This looks a little similar to the choice syntax but one key difference is that there is not a one-of constraint on the union.

@hugohills-regnosys
Copy link
Contributor

hugohills-regnosys commented Sep 12, 2024

If we implemented the above simple union approach, without allowing new enum values to be added it would force changes to the model (i.e. backward incompatible). However, I think that would be good in terms of improving the structure of the model.

This existing code would become invalid:

enum ISOCurrencyCodeEnum:
    AFN
    EUR
    GBP
    USD
    ...

enum CurrencyCodeEnum extends ISOCurrencyCodeEnum:
    CNH
    JEP
    KID
    VAL
    ...

It would have to be replaced by this:

enum ISOCurrencyCodeEnum:
    AFN
    EUR
    GBP
    USD
    ...

enum NonISOCurrencyCodeEnum:
    CNH
    JEP
    KID
    VAL
    ...

enum CurrencyCodeEnum includes ISOCurrencyCodeEnum, NonISOCurrencyCodeEnum

@SimonCockx
Copy link
Contributor Author

SimonCockx commented Sep 13, 2024

Another option is to introduce pure union types.
[...]
This looks a little similar to the choice syntax but one key difference is that there is not a one-of constraint on the union.

I don't really follow what difference you see between "pure union" types and the choice types we already have. Assuming we do the last bits necessary on choice types (#797), perhaps we don't need enum extensions at all?

E.g.,

enum ISOCurrencyCodeEnum:
    AFN
    EUR
    GBP
    USD
    ...

enum NonISOCurrencyCodeEnum:
    CNH
    JEP
    KID
    VAL
    ...

choice CurrencyCode:
    ISOCurrencyCodeEnum
    NonISOCurrencyCodeEnum

This would tackle the first two use cases (comparing currency codes of different underlying enums and concatenating multiple enums together).

The third point (checking which enum a value belongs to) also becomes possible using the switch expression to switch over choice options as proposed in #797:

currencyCode switch
    ISOCurrencyCodeEnum then true, // it is an iso currency code
    NonISOCurrencyCodeEnum then false // it is not an iso currency code

However, what becomes more tedious is switching over enum values, since CurrencyCode is not an enum anymore. This results in a nested switch:

currencyCode switch
    ISOCurrencyCodeEnum then switch
        AFN then ...,
        EUR then ...,
        GBP then ...,
        USD then ...,
        default ...,
    NonISOCurrencyCodeEnum then switch
        CNH then ...,
        JEP then ...,
        KID then ...,
        VAL then ...,
        default ...

Perhaps it's not too bad. What do you think?

@SimonCockx
Copy link
Contributor Author

SimonCockx commented Oct 7, 2024

Proposal: use the existing feature of choice types to support "enum extensions", and enhance the switch operation.

Some of these changes will be backwards incompatible. Therefore the following steps are split in two sections:

  • The first three steps will deprecate the current way of extending enums, but still support it. At this point we can remove their usage from the models, and replace it with the syntax in this proposal.
  • The last step will remove the deprecated syntax, which is backwards incompatible.

1. Deprecate support for extending enums.

Modellers should choice types instead. E.g., the following would be deprecated:

enum ISOCurrencyCodeEnum extends CurrencyCode:
    CNH
    JEP
    KID
    VAL
    ...

It should be modelled using a choice type instead:

choice CurrencyCode:
    ISOCurrencyCodeEnum
    NonISOCurrencyCodeEnum

2. Implement a proper subtype relation for choice types. #797

Modelling CurrencyCode as a choice type in its current form would bring about some annoyances:

  • Need to wrap an enum value when passing it to a function, e.g., to pass in Euro: ProcessCurrencyCode(CurrencyCode { ISOCurrencyCode: EUR, ... }). The proposal in Dedicated choice types #797 would resolve that so you could write ProcessCurrencyCode(EUR) instead.
  • Need to unwrap an enum value when comparing it, e.g., currencyCode -> ISOCurrencyCode = EUR. The proposal in Dedicated choice types #797 would resolve that so you could write currencyCode = EUR instead.

3. Add support to directly switch on enum values of a choice type. -- Optional

Given a currencyCode modelled as a choice type, right now performing case analysis results in a nested switch, e.g.,

currencyCode switch
    ISOCurrencyCodeEnum then switch
        AFN then ...,
        EUR then ...,
        GBP then ...,
        USD then ...,
        default ..., // all unhandled ISO currency codes are handled here
    NonISOCurrencyCodeEnum then ... // all non-ISO currency codes are handled here

We could enhance the switch operator to allow direct enum value cases, e.g.,

currencyCode switch
    AFN then ...,
    EUR then ...,
    GBP then ...,
    USD then ...,
    NonISOCurrencyCodeEnum then ..., // all non-ISO currency codes are handled here
    default ... // all unhandled ISO currency codes are handled here

The advantage is that it would be a little more concise, and the default case expression would not have to be repeated for each choice option.

4. Remove support for the deprecated enum extension syntax and remove "handling choices as data types". -- This is backwards incompatible.

This would force modellers to use choice types instead.

@SimonCockx
Copy link
Contributor Author

A summary of the proposal and its impact on the models (CDM, DRR, ISDA Foundations) can be found in this deck:
Extending enums.pdf

For a full discussion of the problems and different proposals which were considered, see the comments above.

The estimated amount of DSL work sums up to 5 days.

@SimonCockx SimonCockx changed the title Proper support for enum extensions Proper support for extending enums Oct 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request subject: code generation This issue is about code generation subject: model validation This issue is about validation of Rosetta models, such as the type system subject: syntax This issue is about the syntax of Rosetta
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants