Refine single-value column to treat it as that single value #12120
base: develop
Looks like a great start.
...ribution/lib/Standard/Table/0.0.0-dev/src/Internal/Type_Refinements/Single_Value_Column.enso
Outdated
…e extra info - but currently it doesn't work...
I seem to be getting failures in some of the tests. I guess I am not the only one, right? Are there bugs you'd like me to fix, @radeusgd?
c1.value_type . should_equal Value_Type.Integer
c1.at 0 . should_equal 23

c2 = c1 + 100
I assume `100 + c1` would also work now, yielding an Integer...
Yes, worth adding a test, will do.
Added in 4820070
`100 + c1` won't work directly as it needs a cast, so we need `100 + c1:Integer`. I think this is consistent with other examples, like `my_integer_fn`. Or do you think that binary operators should cope without casts? If yes, let's file a ticket to fix it :)
If the type is `Column & Any`, then there will be no need for a cast to `Integer`. The `Integer` will be available. To verify, remove the `-> Column` check in the `from_vector` calls.
Yes, but I thought that for now we wanted to see how it all works with `-> Column`, not `Column & Any`?
I guess for now we agreed to try `-> Column`.
It is a bit more 'conservative' and should more easily allow 'loosening' the requirement to `Column & Any` and removing the need for casts. The eventual change in this direction is doable, but the reverse direction could break users' workflows (removing casts is a mostly compatible change, whereas starting to require them is breaking).
If we try out with `-> Column`, I think we can more easily migrate to `-> Column & Any` later. Doing a migration in the reverse direction will be a breaking change.
@@ -136,8 +137,9 @@ type Column
## PRIVATE
Creates a new column given a Java Column object.
from_java_column : Java_Column -> Column
from_java_column java_column =
Column.Value java_column
from_java_column java_column:Java_Column -> Column =
This is a PRIVATE function. It doesn't really need a signature. E.g. the `-> Column` can be dropped without any impact. Some publicly visible function needs a signature in order for the static analysis to use it. What's that function?
Then the question is: what should be the signature of such a function? @radeusgd has proposed `Column & Any`. What should it mean if a multi value gets into such a cast?
- if one of the intersection types of such a multi value is `Column`, then move it first and unhide all other hidden types
- if no type of `Column` is among the intersection types of such a multi value, but there is a conversion from one of its types, then just perform conversion to `Column` and return the result

Is that how you want `Column & Any` to behave, @radeusgd?
> This is a PRIVATE function. It doesn't really need signature. E.g. the `-> Column` can be dropped without any impact. Some publicly visible function needs a signature in order for the static analysis to use it. What's that function?

The idea was that adding the return type check here affects the semantics of all functions that rely on it. So even if I forget to update the signature in some case, all methods that return a `Column` will have consistent semantics regarding 'hiding' of the intersection type. Thus my plan was to first update only this method (to get the desired semantics and experiment with it) and to refactor all type signatures in a separate PR. That will be a pretty big and boring change, so I thought it would be easier to review if separated.
> - if one of the intersection types of such a multi value is `Column`, then move it first and unhide all other hidden types

I was initially thinking only of avoiding hiding any additional types that were intersected with `Column`. I was not thinking about additionally 'unhiding' any other types. I think it may be acceptable to unhide all types if `X & Any` is encountered, although I feel that the `instanceof` metaphor works better if hidden types are not unhidden, so that adding `& Any` just guarantees we don't hide any currently visible ones.
I'm happy with either.
But overall yes, that was my initial suggestion for how I think `Column & Any` should work. I'd be very happy if we can get it working that way :)
Even if we don't end up using `Column & Any` here and only have `Column`, requiring the casts to be inserted, I still think that the semantics of `X & Any` (if it's ever encountered in any code) should indeed be as you suggest above.
...ribution/lib/Standard/Table/0.0.0-dev/src/Internal/Type_Refinements/Single_Value_Column.enso
Outdated
...ribution/lib/Standard/Table/0.0.0-dev/src/Internal/Type_Refinements/Single_Value_Column.enso
Outdated
c5b7067 to 9bb73ae
@@ -120,10 +121,12 @@ type Column
case needs_polyglot_conversion of
True -> Java_Column.fromItems name (enso_to_java_maybe items) expected_storage_type java_problem_aggregator
False -> Java_Column.fromItemsNoDateConversion name items expected_storage_type java_problem_aggregator
result = Column.from_java_column java_column . throw_on_warning Conversion_Failure
Calling `Any.throw_on_warning` extracts the first value of an intersection type of a multi value that has such a method (e.g. `Column`) and invokes the method on that simple value, losing the additional types. Related issues:
- Let `EnsoMultiValue.to_text` delegate to first type `to_text` #11827
- `multi_value.to` doesn't work on second & further elements of intersection type #11935

The workaround is to avoid calling `Any` instance methods. Done in 9bb73ae. The only "proper solution" I can think of: if dispatching instance method of `Any`, send the whole multi value as `self`.
> The only "proper solution" I can think of: if dispatching instance method of `Any`, send the whole multi value as `self`.

I think that is the behaviour we need for the intersection type solution to make any sense. We can't make it so easy to 'lose' the intersected types. While in libraries we can try to rely on workarounds, our users will get confused if the column stops being an `Integer_Column` after removing warnings. `throw_on_warning` and the related functions are user-facing methods, so they must not break this.
_ : Float ->
key_as_float : Float ->
if no_warning then new_dict else
Warning.attach (Floating_Point_Equality.Used_As_Dictionary_Key key) new_dict
Warning.attach (Floating_Point_Equality.Used_As_Dictionary_Key key_as_float) new_dict
With intersection types, with

    case a of
        b : T -> ...

the `a` and `b` are no longer interchangeable. The `T` part could be hidden in `a` (so you cannot pass it into functions that expect `T`) and it can be un-hidden by the `b : T` check, at which point `b` has it 'visible' and `b` can be passed to `f (t : T)` whereas `a` cannot.
So we need to keep in mind that whenever we rely on `case of`, we should not do `_ : T` but name it and use the named component. At least anywhere where we may expect the intersection types to come up.
cc: @jdunkerley @GregoryTravis
This is actually an important change to Enso semantics that we kind of knew about in #11600, but I'm not sure we have appreciated its implications enough yet.
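A minimal sketch of the naming rule described above. It is not code from this PR: `describe` and `check` are hypothetical names, and the sketch assumes the hiding behaviour described in this comment.

```enso
from Standard.Base import all

# Hypothetical helper: its argument check requires a visible `Float`.
describe (x : Float) = "float: " + x.to_text

# Hypothetical function showing the pattern. The matched value is bound
# to `f` and only `f` is used in the branch. If `a` were a multi value
# with a hidden `Float` part, `f` is the one known to have it visible.
check a = case a of
    f : Float -> describe f
    _ -> "not a float"

main =
    IO.println (check 1.5)
    IO.println (check "text")
```

Writing the branch as `_ : Float -> describe a` would pass the original `a`, which is exactly the case the comment warns about.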
resolved = case value of
_ : Column -> value
_ : Text -> self.make_constant_column value
_ : Expression -> self.evaluate_expression value on_problems
_ : Column -> value
Again, intersection types have changed something fundamental about Enso semantics.
`case of` branches that used to be completely disjoint (a value was able to match only one of them) are now no longer disjoint: a multi-value can match multiple branches. So now we need to be more careful about the ordering of the branches. If a value can match more than one, which branch is the one that should be preferred?
E.g. here a single-text-value column would match both the `_ : Text` and the `_ : Column` branch (because it is `Column & Text`). We want it to still go to the `_ : Column` branch; when it went to the `_ : Text` branch, that led to errors.
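A small sketch of the ordering concern, assuming branches are tried top to bottom and the first match wins. `classify` is a hypothetical helper, not part of the PR, and a single-value text column only behaves as `Column & Text` with the refinement from this PR in place.

```enso
from Standard.Base import all
from Standard.Table import Column

# Hypothetical helper. `Column` is listed before `Text`, so a value that
# is both (`Column & Text`) is handled as a column rather than as text.
classify value = case value of
    _ : Column -> "column"
    _ : Text -> "text"
    _ -> "other"

main =
    IO.println (classify (Column.from_vector "c" ["a"]))
    IO.println (classify "a")
```

Swapping the first two branches would send the single-value text column down the `Text` path, which is the situation described above.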