Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide a way to construct a TypedColumn reference from a Symbol #186

Open
iravid opened this issue Sep 21, 2017 · 8 comments · May be fixed by #187
Open

Provide a way to construct a TypedColumn reference from a Symbol #186

iravid opened this issue Sep 21, 2017 · 8 comments · May be fixed by #187

Comments

@iravid
Copy link
Contributor

iravid commented Sep 21, 2017

Currently, we cannot chain selects that reference columns introduced in a select without introducing intermediate vals. Here's a (contrived) example:

case class X2(i: Int, j: Int)

val ds = TypedDataset.create(Seq(X2(10, 10))

ds.select(ds('i), ds('j), ds('i) + 1).select(ds('_3))

One possible solution for this would be to provide an overload to TypedDataset.select that uses Symbol arguments and a similar frameless.functions.col function.

iravid pushed a commit to iravid/frameless that referenced this issue Sep 22, 2017
@imarios
Copy link
Contributor

imarios commented Sep 24, 2017

Hi @iravid, is this covered by #187?

@iravid
Copy link
Contributor Author

iravid commented Sep 24, 2017

Yes, with some limitations to type inference, discussed there

@imarios
Copy link
Contributor

imarios commented Sep 24, 2017

I think we tried this a couple of times over the past year and always type inference was stopping us from having something that works in a useful way. Probably the macro approach that @OlivierBlanvillain suggest is the best way to go. Also, @jeremyrsmith alternative select macro was kind of solving a similar problem.

@iravid
Copy link
Contributor Author

iravid commented Sep 24, 2017

Yes, there’s very little flexibility if T is a type parameter on the column.

I personally find this better than nothing currently but understood if you disagree :-)

Macros are very attractive usage wise, yes, but I have a feeling they’d present a maintenance challenge to the project compared to shapeless-based approaches.

@imarios
Copy link
Contributor

imarios commented Sep 24, 2017

If I remember correct, the issue with this approach is that it will work great under some context and not under something else; which makes the API kind of "clumsy".

I do get the issue with macros, and personally I barely understand what is going on there. At the same time, I do believe that not all macros are evil. Some macros are worst than others. If the macro is simple enough and solves an important problem, I would surely consider using it and spend the extra brain power to understand and maintain it.

@iravid
Copy link
Contributor Author

iravid commented Sep 24, 2017

Btw - can you point me to the macro suggestion you’re referring to?

@imarios
Copy link
Contributor

imarios commented Sep 25, 2017

Hi @iravid, that's #110. Also, what @OlivierBlanvillain suggest about intermediate case classes might work well. Not sure which one is easier, but I would with the easiest.

@iravid
Copy link
Contributor Author

iravid commented Sep 26, 2017

Ah thanks for pointing out @imarios. How would #110 solve this usecase though? You still need to name the intermediate dataset for this usecase.

For example, this doesn't work:

case class X2(i: Int, j: Int)

val ds = TypedDataset.create(Seq(X2(10, 10))

ds.select(ds(_.i), ds(_.j), ds(_.i) + 1)
  .select(ds(_._3))

But this does:

case class X2(i: Int, j: Int)

val ds = TypedDataset.create(Seq(X2(10, 10))

val intermediate = ds.select(ds(_.i), ds(_.j), ds(_.i) + 1)

intermediate.select(intermediate(_._3))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants