Implemented dateframe.between_time #2111
Codecov Report
@@ Coverage Diff @@
## master #2111 +/- ##
=======================================
Coverage 95.35% 95.36%
=======================================
Files 60 60
Lines 13505 13514 +9
=======================================
+ Hits 12878 12887 +9
Misses 627 627
Continue to review full report at Codecov.
Thanks for your PR! Would you add
databricks/koalas/frame.py
Outdated
@@ -2977,6 +2977,86 @@ class locomotion
        ).resolved_copy
        return DataFrame(internal)

    def between_time(
        self, start_time, end_time, include_start=True, include_end=True, axis: Union[int, str] = 0,
Would you complete type annotations of parameters?
start_time: Union[datetime.time, str],
end_time: Union[datetime.time, str],
include_start: bool = True,
include_end: bool = True,
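As a rough illustration of the requested annotations, the sketch below puts them on a standalone function. The names `TimeLike`, `to_time`, and `between_time_signature` are hypothetical and not part of Koalas; the string parsing only assumes the `H:MM` form used in the examples in this thread.

```python
import datetime
from typing import Tuple, Union

# Hypothetical alias for the two accepted input types.
TimeLike = Union[datetime.time, str]


def to_time(value: TimeLike) -> datetime.time:
    """Hypothetical helper: pass through a datetime.time or parse an 'H:MM' string."""
    if isinstance(value, datetime.time):
        return value
    return datetime.datetime.strptime(value, "%H:%M").time()


def between_time_signature(
    start_time: TimeLike,
    end_time: TimeLike,
    include_start: bool = True,
    include_end: bool = True,
    axis: Union[int, str] = 0,
) -> Tuple[datetime.time, datetime.time, bool, bool, Union[int, str]]:
    """Stand-in carrying the annotated parameters; it only normalizes the bounds."""
    return to_time(start_time), to_time(end_time), include_start, include_end, axis
```

With the annotations in place, a type checker can flag calls that pass, say, an integer for `start_time`.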
databricks/koalas/frame.py
Outdated
Examples
--------
>>> i = pd.date_range('2018-04-09', periods=4, freq='1D20min')
>>> ts = pd.DataFrame({'A': [1, 2, 3, 4]}, index=i)
>>> ts = pd.DataFrame({'A': [1, 2, 3, 4]}, index=i)
>>> kts = ks.from_pandas(ts)
can be
>>> kts = ks.DataFrame({'A': [1, 2, 3, 4]}, index=i)
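For readers unfamiliar with the docstring data: the index steps by one day and twenty minutes, so the time of day drifts from 00:00 to 01:00 over the four rows. The pure-Python sketch below reproduces those timestamps and the time-of-day filter without pandas or Koalas. The function `in_window` is a hypothetical stand-in for the filtering rule, not the library implementation.

```python
from datetime import datetime, time, timedelta

# Same four timestamps as pd.date_range('2018-04-09', periods=4, freq='1D20min').
step = timedelta(days=1, minutes=20)
index = [datetime(2018, 4, 9) + k * step for k in range(4)]
values = [1, 2, 3, 4]


def in_window(ts, start, end, include_start=True, include_end=True):
    """Hypothetical predicate: is the time of day of ts between start and end?"""
    t = ts.time()
    after_start = t >= start if include_start else t > start
    before_end = t <= end if include_end else t < end
    if start <= end:
        return after_start and before_end
    # start later than end: the window wraps around midnight.
    return after_start or before_end


selected = [
    (ts, v) for ts, v in zip(index, values) if in_window(ts, time(0, 15), time(0, 45))
]
# Only the rows at 00:20 and 00:40 fall inside 0:15 to 0:45.
```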
databricks/koalas/frame.py
Outdated
indexer = index.indexer_between_time(
    start_time, end_time, include_start=include_start, include_end=include_end
).to_numpy()
return self.copy().take(indexer)
Would return self.iloc[indexer] be better?
databricks/koalas/frame.py
Outdated

indexer = index.indexer_between_time(
    start_time, end_time, include_start=include_start, include_end=include_end
).to_numpy()
I am not sure to_numpy() is a good idea since all the data will be loaded into the driver's memory.
Let me think about this.
I used to_numpy() because without it I got this not-implemented error:
databricks.koalas.exceptions.PandasNotImplementedError: The method pd.Index.__iter__() is not implemented. If you want to collect your data as an NumPy array, use 'to_numpy()' instead.
How about
...
def pandas_between_time(pdf):
    return pdf.between_time(start_time, end_time, include_start, include_end)

return self.koalas.apply_batch(pandas_between_time)
?
Then we might also want to add a test
def test_between_time_no_shortcut(self):
    with ks.option_context("compute.shortcut_limit", 0):
        i = pd.date_range("2018-04-09", periods=4, freq="1D20min")
        ts = pd.DataFrame({"A": [1, 2, 3, 4]}, index=i)
        kts = ks.DataFrame({"A": [1, 2, 3, 4]}, index=i)
        self.assert_eq(
            ts.between_time("0:15", "0:45"), kts.between_time("0:15", "0:45"), almost=True
        )
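The batch-wise suggestion works because the time-of-day filter is decided one row at a time, so filtering each batch on its own and concatenating the results should select the same rows as filtering everything at once. The pure-Python sketch below illustrates that property on the test data. The names `keep` and `filter_in_batches` are hypothetical, plain lists stand in for pandas batches, and nothing here is Koalas code.

```python
from datetime import datetime, time, timedelta


def keep(ts, start, end):
    """Hypothetical row-level predicate: inclusive window that does not wrap midnight."""
    return start <= ts.time() <= end


def filter_in_batches(rows, batch_size, start, end):
    """Filter each slice of rows independently, then concatenate the survivors."""
    out = []
    for i in range(0, len(rows), batch_size):
        batch = rows[i:i + batch_size]
        out.extend(row for row in batch if keep(row[0], start, end))
    return out


# Same timestamps and values as the test above.
step = timedelta(days=1, minutes=20)
rows = [(datetime(2018, 4, 9) + k * step, k + 1) for k in range(4)]

whole = [row for row in rows if keep(row[0], time(0, 15), time(0, 45))]
batched = filter_in_batches(rows, 1, time(0, 15), time(0, 45))
```

Because no batch needs to see another batch's rows, nothing has to be collected on the driver, which is the concern raised about to_numpy() earlier in the thread.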
@LSturtew Thanks for working on this!
I left comments, most of them are nits. Otherwise, LGTM so far.
LGTM, pending tests.
@LSturtew Thanks! Let me merge this now.
ref #1929
Implement DataFrame.between_time