ValueError: could not convert string to float: 'Rural Adj' #128

sbpatel2009 · 2023-08-15T22:47:36Z

I am attempting to fit the Preprocessor on training data that only includes categorical variables. It appears that when I pass an empty list to the num_feats parameter, the Preprocessor attempts to convert all columns to floats and returns an error.

This code:

cat_feats = ['Rural']
num_feats = []

preprocessor = Preprocessor()
preprocessor.fit(data=X, cat_feats=cat_feats, num_feats=num_feats)

...returns this error:

ValueError: could not convert string to float: 'Rural Adj'

I may be missing it, but I don't see how to handle this issue in the documentation. Does this class require there to be numerical features?


matteo4diani · 2023-11-09T15:40:56Z

Hi @sbpatel2009, sorry for the late reply and thanks for contributing to auton-survival 🙂

You may already have figured it out by yourself, but the reason for that error lies in this bit of code:

auton-survival/auton_survival/preprocessing.py

Lines 209 to 212 in 5dde465

    
    if self._num_feats:
      self.scaler = scaler.fit(df[self._num_feats])
    else:
      self.scaler = scaler.fit(df)

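As you can see from the code, an empty num_feats list is falsy in Python, so the else branch runs and the scaler is fit on the whole DataFrame. In other words, the preprocessor assumes that if no num_feats are provided, all feats are num_feats, and scaling the string column 'Rural' raises the ValueError.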
I'm going to work on this and other bugs in the next week(s). If you want to contribute a fix, you are more than welcome 😄
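To make the failure mode concrete, here is a minimal sketch that reproduces it with scikit-learn's StandardScaler alone, plus one possible guard. The helper name fit_scaler is hypothetical and is not part of auton-survival, and this is not necessarily how the library will fix it; it only illustrates treating "no list given" and "empty list given" as different cases.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler


def fit_scaler(df, num_feats=None):
    """Hypothetical helper (not auton-survival code): scale only numeric columns."""
    if num_feats is None:
        # No list passed at all: fall back to every numeric-dtype column.
        num_feats = df.select_dtypes(include="number").columns.tolist()
    if len(num_feats) == 0:
        # An explicit empty list means there is nothing to scale.
        return None
    return StandardScaler().fit(df[num_feats])


df = pd.DataFrame({"Rural": ["Rural Adj", "Urban", "Rural Adj"]})

# Fitting a scaler on a string column is what triggers the reported error.
try:
    StandardScaler().fit(df)
    raised = False
except ValueError:
    raised = True

# With the guard, an empty num_feats list is simply skipped.
scaler = fit_scaler(df, num_feats=[])
```

Checking `num_feats is None` separately from `len(num_feats) == 0` keeps the "scale everything numeric" default while letting callers opt out of scaling entirely.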

A workaround can be to add a dummy numerical column to the input DataFrame:

from auton_survival.preprocessing import Preprocessor
import pandas as pd


cat_feats = ['Rural']
# Constant placeholder column so that num_feats is not empty
num_feats = ['Dummy']

X = pd.DataFrame({'Rural': ['yes', 'no', 'maybe'], 'Dummy': [0, 0, 0]})

preprocessor = Preprocessor()
X = preprocessor.fit_transform(data=X, cat_feats=cat_feats, num_feats=num_feats)

# Remove the placeholder column once preprocessing is done
X = X.drop(columns=['Dummy'])

print(X)

sbpatel2009 · 2023-11-10T00:23:44Z

Thank you, Matteo! That is a clever workaround! I just used the OneHotEncoder in scikit-learn. Best, Snehal


matteo4diani mentioned this issue Nov 9, 2023

opened by mistake #133

Closed
