Issue while running vis_corex #15
Comments
Oh, that's disappointing. That error is caused by the "nan" in the output for TC (it's trying to find the best TC value, but "nan" is not comparable). If you pass --verbose=2 you can see the TCs while it runs; then you might be able to spot a nan arising earlier and stop the run.
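To illustrate why a nan leads to this kind of failure, here is a minimal sketch of a "keep the best run" loop. It is not the actual corex.py source; `pick_best` and its variables are hypothetical names used only for illustration. The point is that every comparison against nan is False, so the "best" slot is never filled.

```python
import math

def pick_best(tcs):
    """Return the index of the largest score, or None if nothing compared greater."""
    best_tc = -math.inf
    best_index = None  # stays None when no score ever wins the comparison
    for i, tc in enumerate(tcs):
        if tc > best_tc:  # any comparison involving nan evaluates to False
            best_tc = tc
            best_index = i
    return best_index

print(pick_best([0.5, 1.2, 0.9]))  # 1
print(pick_best([math.nan]))       # None
```

If the result were only assigned inside the `if` branch instead of being initialised up front, the second call would hit an unassigned local variable, which is the shape of the error reported in this issue.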
Not an issue, but you should add the option --no_row_names, since your first column is not an index. Another possibility for your dataset is to "bin" the data and treat it as discrete. For instance, you might map 0 to 0, 1 to 1, and any number greater than 1 to 2. Then run without the -c option (-c treats the data as continuous).
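As a rough sketch of that binning, assuming the values are non-negative integer counts with no missing-value markers, each pipe-delimited row could be rewritten like this. The helper names `bin_count` and `bin_row` are made up for the example and are not part of bio_corex.

```python
def bin_count(value, cap=2):
    """Map a non-negative count to 0, 1, or `cap` (anything greater than 1)."""
    return min(int(value), cap)

def bin_row(line, delimiter="|"):
    """Apply the binning to every field of one delimited data row."""
    return delimiter.join(str(bin_count(v)) for v in line.split(delimiter))

row = "8|0|1|6|0|453"
print(bin_row(row))  # 2|0|1|2|0|2
```

The header line should be left untouched, and the binned file would then be passed to vis_corex.py without -c.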
One other suggestion. This looks like count data. I've always meant to include a specific handling of count data, but haven't yet. One thing that works well for count data is to transform each value to log_2(1+x). The 0's and 1's stay the same, but the long tail of high counts is compressed inward. This makes the numerical modeling easier by reducing outliers.
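A small sketch of the log_2(1+x) transform, again assuming non-negative counts. `log_count` is a hypothetical helper name; the transformed values would still be run as continuous data.

```python
import math

def log_count(value):
    """Compress a non-negative count with log2(1 + x)."""
    return math.log2(1 + value)

counts = [0, 1, 3]
print([log_count(c) for c in counts])  # [0.0, 1.0, 2.0]

# A large count such as 453 from the sample file lands between 8 and 9.
print(8 < log_count(453) < 9)  # True
```

Zeros and ones map to themselves, while the large counts that dominate the tail are pulled into a much narrower range.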
Thanks for your quick response. We will try the suggestions you have outlined here.
When I run vis_corex.py on a file with the following arguments, I get an error after 24 hours of run time.
Command:
python3 vis_corex.py /home/ppandey/dx_desc.csv --delimiter="|" --layers=32,16,8,1 --dim_hidden=3 --missing=-1e6 -c -b -v -o dxm --ram=72 --cpu=36
Sample File:
DX101|DX110|DX115|DX118|DX142|DX143|DX155|DX160|DX166|DX169|DX175|DX184|DX196|DX212|DX215|DX218|DX222|DX223|DX234|DX235|DX239|DX253|DX254|DX267|DX271|DX275|DX277|DX278|DX279|DX295|DX298|DX310|DX315|DX332|DX335|DX342|DX343|DX344|DX356|DX385|DX386|DX399|DX404
8|0|1|6|0|0|0|0|0|0|0|0|5|0|3|0|0|6|0|453|0|0|0|2|0|0|6|0|0|0|9|4|6|0|0|1|1|0|9|0|0|41|81
0|4|0|0|0|4|1|0|53|0|0|2|0|0|1|0|0|0|0|0|0|4|0|0|0|0|3|0|0|0|0|0|11|0|4|0|0|0|0|0|7|0|0
0|0|0|0|0|0|0|0|0|0|1|0|0|0|0|0|0|0|0|0|0|1|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0
0|0|0|0|1|0|0|0|0|0|0|0|0|0|9|0|0|3|0|0|0|0|0|0|0|0|0|2|0|0|2|0|25|0|0|0|0|0|0|0|2|0|0
Output:
`[-0. -0. -0. 0. 0. -0. 0. -0. -0. 0. 0. -0. 0. -0. nan -0.]
[ 0. 0. -0. 0. -0. -0. 0. -0. 0. 0. -0. -0. 0. 0. nan -0.]
[ 0. 0. 0. 0. 0. -0. -0. 0. 0. -0. -0. 0. -0. 0. nan -0.]
Overall tc: nan
Traceback (most recent call last):
  File "vis_corex.py", line 777, in <module>
    n_cpu=options.cpu, ram=options.ram).fit(X_prev))
  File "/home/usr/bio_corex/corex.py", line 171, in fit
    self.fit_transform(X)
  File "/home/usr/bio_corex/corex.py", line 220, in fit_transform
    self.dict = best_dict
UnboundLocalError: local variable 'best_dict' referenced before assignment`