Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resolve external warnings #1208

Closed
wants to merge 4 commits into from
Closed

Conversation

EwoutH
Copy link
Contributor

@EwoutH EwoutH commented Aug 17, 2024

This PR addresses resolves several 3 issues caused by external dependencies, that were present one or multiple times in osmnx CI test runs.

  • Upgrade scikit-learn to >=1.0:

    • Updated the dependency to scikit-learn>=1.0, resolving deprecation warnings related to np.float when using NumPy > 1.20.
  • Resolve SettingWithCopyWarning in Pandas:

    • Fixed a SettingWithCopyWarning in features.py by ensuring a copy of the DataFrame slice is made before modifications.
  • Fix VisibleDeprecationWarning in NumPy:

    • Addressed a VisibleDeprecationWarning in the _verify_edge_attribute function by explicitly setting dtype=object for a NumPy array to handle ragged nested sequences, maintaining future compatibility.

Some more details are in the commit messages of each commit.

CI is passing on my test branch, where I triggered the CI explicitly.

There's one external issue left, but I think that has to be fixed upstream (tracking: geopandas/geopandas#3403). The other issues look to be triggered by OSMnx (or its tests) itself.

Resolves an large amount of warnings of the type DeprecationWarning: `np.float` is a deprecated alias for the builtin `float` that were present in scikit-learn when using NumPy > 1.20.

scikit-learn>=1.0 is also ABI stable and follows semantic versioning, making maintaining a range of versions easier.
Resolves the following warning:

tests/test_osmnx.py::test_features
  /home/runner/work/osmnx/osmnx/osmnx/features.py:684: SettingWithCopyWarning: 
  A value is trying to be set on a copy of a slice from a DataFrame.
  Try using .loc[row_indexer,col_indexer] = value instead
  
  See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
    gdf.loc[:, "geometry"] = gdf["geometry"].make_valid()
Resolved a VisibleDeprecationWarning by specifying dtype=object when creating a NumPy array from potentially ragged nested sequences in the _verify_edge_attribute function. This ensures that the array can handle irregular shapes without raising deprecation warnings, maintaining compatibility with future NumPy versions.


tests/test_osmnx.py::test_routing
  /home/runner/work/osmnx/osmnx/osmnx/routing.py:473: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
    values_float = (np.array(tuple(G.edges(data=attr)))[:, 2]).astype(float)
Copy link

codecov bot commented Aug 17, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.31%. Comparing base (94b7e73) to head (7cb893e).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1208   +/-   ##
=======================================
  Coverage   98.31%   98.31%           
=======================================
  Files          24       24           
  Lines        2369     2370    +1     
=======================================
+ Hits         2329     2330    +1     
  Misses         40       40           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@@ -680,7 +680,7 @@ def _filter_features(
Filtered GeoDataFrame of features.
"""
# remove any null or empty geometries then fix any invalid geometries
gdf = gdf[~(gdf["geometry"].isna() | gdf["geometry"].is_empty)]
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the SettingWithCopyWarning here a false positive? That is, we don't care about writing back to the original dataframe, because we're overwriting the variable itself. Given the expensiveness of .copy(), it may be better to filter/disable the warning on this line instead (assuming this is indeed a false positive).

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, regarding:

Addressed a VisibleDeprecationWarning in the _verify_edge_attribute function by explicitly setting dtype=object for a NumPy array to handle ragged nested sequences, maintaining future compatibility.

How did you produce this warning? I'm using numpy 2 and haven't seen it. I'm trying to figure it out, but dtype=object doesn't make sense to me. This shouldn't be a ragged array, but rather always a $m \times 3$ array created from the tuples of (u, v, attr_value).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All warnings were from a CI run at that time on the main branch.

Unfortunately, at this time I don't have the capacity to follow up on this, so feel free to close or modify this PR.

@gboeing gboeing closed this Sep 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants