Is your feature request related to a problem? Please describe.
Currently, most of the transformation catalog is only applicable to top-level columns of a Spark DataFrame. Creating a base transformation class that operates on nested fields could resolve this problem and greatly widen the scope of the library.
Describe the solution you'd like
Create a base transformation class that can be used to extend the functionality of the library by transforming nested fields using only dot notation.
Describe alternatives you've considered
Raw Spark code implementations.
Additional context
@taquero-s can you give some insight into what you see this looking like from an API point of view? What I mean is: how would we call a Transformation class for a nested field?
Also, with nested data, the traditional route is to explode the data and then rebuild the nested structure afterwards (which is, of course, computationally expensive and inefficient). Do you have some insight on how to do this more effectively?
Additionally, we need to agree on the scope. I like doing this at the base class level, since it would then work essentially anywhere down the inheritance chain. Personally, I think we should do this for ColumnsTransformation and ColumnsTransformationsWithTarget based classes. Tests would also need to be added, of course.
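To make the API question above a bit more concrete, here is a minimal sketch of what "transform a nested field using only dot notation" could mean. It works on plain Python dicts as a stand-in for a nested row, not on Spark DataFrames, and the name `transform_nested` and its signature are hypothetical: they are assumptions for illustration, not part of the library.

```python
from copy import deepcopy
from typing import Any, Callable


def transform_nested(record: dict, path: str, func: Callable[[Any], Any]) -> dict:
    """Return a copy of `record` with `func` applied to the field at `path`.

    `path` uses dot notation, e.g. "address.city". Hypothetical helper,
    shown only to illustrate the calling convention being discussed.
    """
    result = deepcopy(record)  # leave the caller's record untouched
    *parents, leaf = path.split(".")
    node = result
    for key in parents:
        node = node[key]  # raises KeyError if the path does not exist
    node[leaf] = func(node[leaf])
    return result


# A plain dict standing in for one nested row
row = {"id": 1, "address": {"city": "amsterdam", "geo": {"lat": "52.37"}}}
updated = transform_nested(row, "address.city", str.upper)
```

The point of the sketch is the calling convention: the caller names the target with a single dotted path and passes the same function it would use for a top-level column, while the surrounding structure is kept as-is. How this would map onto Spark struct columns is left open here.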