Add a typed col function for creating column references #187
Also refs #164
Codecov Report
@@ Coverage Diff @@
## master #187 +/- ##
=========================================
Coverage ? 94.95%
=========================================
Files ? 35
Lines ? 674
Branches ? 11
=========================================
Hits ? 640
Misses ? 34
Partials ? 0
@@ -18,9 +18,10 @@ class SelectTests extends TypedDatasetSuite {
val A = dataset.col[A]('a)

val dataset2 = dataset.select(A).collect().run().toVector
val symDataset2 = dataset.select(functions.col('a)).collect().run().toVector
Interesting, this works because inference fixes T from the expected type of select. But does it scale to complex expressions?
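To make the inference question concrete, here is a minimal, self-contained sketch in plain Scala. It does not use frameless: `Col`, `DS`, and `col` are hypothetical stand-ins for `TypedColumn`, `TypedDataset`, and `functions.col`, and the only thing modelled is how the expected parameter type of `select` fixes `T`.

```scala
object InferenceDemo {
  final case class Person(name: String, age: Int)

  // Hypothetical stand-in for a typed column: invariant in the dataset type T.
  final case class Col[T](name: String)

  // Hypothetical stand-in for a typed dataset.
  final class DS[T] {
    def select(c: Col[T]): String = s"select ${c.name}"
  }

  // Like the proposed col function: T appears only in the result type.
  def col[T](name: String): Col[T] = Col[T](name)

  val ds = new DS[Person]

  // Used directly as an argument: the expected type Col[Person] fixes T.
  val direct: String = ds.select(col("a"))

  // Stored in a value first: there is no expected type, so T has to be
  // supplied by hand. A bare `val c = col("a")` would infer T as Nothing,
  // and select would then reject it because Col is invariant.
  val stored: Col[Person] = col[Person]("a")
  val viaValue: String = ds.select(stored)
}
```

In this toy model the call works only while `col` sits directly in the argument position; anything that hides the expected type needs an explicit `T`.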
Good point. Yes, this will probably only work when used directly in the args to select. I'm not sure whether it'll work with selectMany; that needs to be tested. I plan on adding a similar assertion to all of the tests here.
One way to deal with situations in which you have to specify T is the technique described in https://tpolecat.github.io/2015/07/30/infer.html.
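A rough sketch of what that could look like, assuming the linked post refers to the usual trick of splitting type parameters across an intermediate class with an `apply` method, so that one parameter is written explicitly and the rest are inferred. Everything here (`Col`, `ColOf`, the accessor argument) is hypothetical and independent of frameless; the real `col` would resolve the column type from implicit evidence, not from a function.

```scala
object PartialDemo {
  final case class Person(name: String, age: Int)

  // Hypothetical column carrying both the dataset type T and the column type A.
  final case class Col[T, A](name: String, get: T => A)

  // Intermediate class: T is already fixed, A is inferred when apply is called.
  final class ColOf[T] {
    def apply[A](name: String)(get: T => A): Col[T, A] = Col(name, get)
  }

  // Takes only the type parameter the caller has to spell out.
  def col[T]: ColOf[T] = new ColOf[T]

  // Only T is written; A is inferred as Int from the accessor.
  val age: Col[Person, Int] = col[Person]("age")(_.age)
}
```

The call site names `T` once and never mentions `A`, which is the part that would otherwise have to be written out alongside it.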
Hmm.. but if T is Tuple3[Int, String, Double] it's not gonna look pretty :P
True. I have another idea - will push in a few minutes.
@OlivierBlanvillain have another look; this approach should allow keeping the column reference in a value, for example, and then having the actual TypedColumn constructed when using select.
Ah, but this approach doesn't support expressions like
What's the reasoning behind carrying around the dataset type in the typed columns?
If it was just
That's true, but how about moving the column existence evidence to the methods on The reasoning is that a
We didn't consider that, but it sounds isomorphic given that we can already do something like
It is isomorphic up to the requirement of having I will open a separate PR to experiment with moving the evidence to
0be79f0 to 897e499
Resolves #186.
Would be a good idea to wait for #110 before merging this to avoid conflict on SelectTests.scala.