Problem with extract_feature.py in /wepon/season one #12

Open
fannn1217 opened this issue Mar 8, 2018 · 11 comments

Comments

@fannn1217

I ran into the following error when running extract_feature.py:
Traceback (most recent call last):
File "extract_feature.py", line 60, in <module>
feature3 = off_train[((off_train.date>='20160315')&(off_train.date<='20160630'))|((off_train.date=='null')&(off_train.date_received>='20160315')&(off_train.date_received<='20160630'))]
File "/Library/Python/2.7/site-packages/pandas/core/ops.py", line 879, in wrapper
res = na_op(values, other)
File "/Library/Python/2.7/site-packages/pandas/core/ops.py", line 818, in na_op
raise TypeError("invalid type comparison")
TypeError: invalid type comparison
How can I fix it?

@yunxinan

yunxinan commented Mar 8, 2018 via email

@JackeYou

JackeYou commented May 24, 2018

Hi, may I ask how this was resolved?

Solved: set keep_default_na=False when reading the train file. I'm on Python 3. It is probably a NaN issue!
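
A minimal sketch of why that setting matters. pandas treats the literal string "null" as a missing value by default, so a column the script later compares against 'null' ends up holding NaN instead. The two rows below are made up for illustration, and their column order is an assumption, not taken from the real file.

```python
import io

import pandas as pd

# Two made-up rows; the real training file and its column layout are
# assumptions here.
csv_text = (
    "1439408,2632,null,null,0,null,20160217\n"
    "1439408,4663,11002,150:20,1,20160528,null\n"
)

# Default parsing: "null" is on pandas' default NA list, so it becomes NaN
# and the otherwise numeric column turns into floats.
default_df = pd.read_csv(io.StringIO(csv_text), header=None)

# keep_default_na=False: "null" stays a plain string, so a check such as
# df[6] == 'null' can match again.
raw_df = pd.read_csv(io.StringIO(csv_text), header=None, keep_default_na=False)

print(default_df[6].isna().tolist())  # [False, True]
print(raw_df[6].tolist())             # ['20160217', 'null']
```

The setting applies per read_csv call, so each file whose columns are later compared with 'null' would need it.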

@iimmortall

@WorldBestGaming Hello, how did you solve your NaN problem? I ran into this error: "ValueError: invalid literal for int() with base 10: 'nan'".

@JackeYou

JackeYou commented Jul 8, 2018

@iimmortall When you read the file with pandas, there is a parameter called keep_default_na. Set it to False and it should be fine.

@RobertMarton

There are so many pd.read_csv calls, so when I trace back through the code, which one should get 'keep_default_na=False'? I've changed some of them, but the result is negative.

@JackeYou

@RobertMarton I can't understand what you mean by "the result is negative". = =! emmm

@RobertMarton

Emmm, I added 'keep_default_na=False' to the code:
"off_train = pd.read_csv('data/ccf_offline_stage1_train.csv',header=None,keep_default_na=False)" and
"on_train = pd.read_csv('data/ccf_online_stage1_train.csv',header=None,keep_default_na=False)"
But the error still occurs, as shown below =-= :
t3['user_merchant_any'] = 1
(112803, 52)
(257126, 53)
Traceback (most recent call last):
File "extract_feature.py", line 1031, in <module>
dataset2.label = dataset2.label.apply(get_label)
File "/usr/local/lib/python2.7/dist-packages/pandas/core/series.py", line 3194, in apply
mapped = lib.map_infer(values, f, convert=convert_dtype)
File "pandas/_libs/src/inference.pyx", line 1472, in pandas._libs.lib.map_infer
File "extract_feature.py", line 981, in get_label
elif (date(int(s[0][0:4]),int(s[0][4:6]),int(s[0][6:8]))-date(int(s[1][0:4]),int(s[1][4:6]),int(s[1][6:8]))).days<=15:
ValueError: invalid literal for int() with base 10: 'nan'

@RobertMarton

I then changed some code, and running xgb.py now fails with the error below:
"Check failed: !auc_error AUC: the dataset only contains pos or neg samples"
The label column in dataset1.csv is all 0, so is this normal?
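
AUC is only defined when the evaluated labels contain both positive and negative examples, so a label column that is entirely 0 would explain this message. A small sketch of a check to run before training; the Series below is a made-up stand-in for the label column of dataset1.csv, and has_both_classes is a hypothetical helper name.

```python
import pandas as pd


def has_both_classes(labels):
    """Return True when the labels contain at least two distinct values."""
    return pd.Series(labels).nunique(dropna=True) >= 2


# Made-up stand-in for the label column described above.
labels = pd.Series([0, 0, 0, 0])

print(labels.value_counts().to_dict())  # {0: 4}
print(has_both_classes(labels))         # False
```

If the check fails, the label-building step is probably the place to look, for instance whether the date strings still have the form that get_label expects.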

@gutsttt

gutsttt commented Jul 19, 2018

@RobertMarton How did you solve this?
elif (date(int(s[0][0:4]),int(s[0][4:6]),int(s[0][6:8]))-date(int(s[1][0:4]),int(s[1][4:6]),int(s[1][6:8]))).days<=15:
ValueError: invalid literal for int() with base 10: 'nan'
I also see this in my code. Please help.
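
One way to see where the 'nan' comes from is to guard the date parsing. The sketch below is not the repository's get_label. It assumes the input is a string of the form 'date:date_received', and the return values 1, -1, and 0 are assumptions chosen for illustration; parse_yyyymmdd is a hypothetical helper.

```python
from datetime import date


def parse_yyyymmdd(text):
    """Return a date for an 8-digit YYYYMMDD string, or None otherwise."""
    if len(text) != 8 or not text.isdigit():
        return None
    try:
        return date(int(text[0:4]), int(text[4:6]), int(text[6:8]))
    except ValueError:
        # Eight digits that do not form a real calendar date.
        return None


def get_label(s):
    """Guarded sketch; assumes s looks like 'date:date_received'."""
    parts = s.split(':')
    used = parse_yyyymmdd(parts[0])
    received = parse_yyyymmdd(parts[1]) if len(parts) > 1 else None
    if used is None or received is None:
        return 0  # assumed placeholder for "no usable date pair"
    return 1 if (used - received).days <= 15 else -1


print(get_label('nan:20160528'))       # 0
print(get_label('20160601:20160528'))  # 1
print(get_label('20160701:20160528'))  # -1
```

A guard like this only hides the symptom. If the dates were read as floats, strings such as '20160601.0' would also fail the 8-digit check and every row would get 0, so the read step still needs to keep the dates as strings.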

@JackeYou

@RobertMarton emmmm. My solution is based on Python 3, and your environment is Python 2. I guess the data format is the problem. You can try changing it to utf-8. = =!

@ChenKevin0123

For the NaN problem, call fillna('null') on that column before doing the comparison.
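
A short sketch of that suggestion, applied to the filter from the first traceback. The frame is made up, and it assumes the date columns already hold strings with NaN in place of "null". If pandas parsed a column as float, it would need converting to strings first.

```python
import numpy as np
import pandas as pd

# Made-up frame: string dates, with NaN where the CSV said "null".
off_train = pd.DataFrame({
    'date_received': ['20160320', '20160401', np.nan],
    'date': [np.nan, '20160405', '20160701'],
})

# Put the 'null' marker back so every value in the column is a string.
for col in ['date', 'date_received']:
    off_train[col] = off_train[col].fillna('null')

mask = (
    ((off_train.date >= '20160315') & (off_train.date <= '20160630'))
    | ((off_train.date == 'null')
       & (off_train.date_received >= '20160315')
       & (off_train.date_received <= '20160630'))
)
feature3 = off_train[mask]

print(len(feature3))  # 2
```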

7 participants