[New Transformer] ChiMergeDiscretisor #459
Hi @Morgan-Sell, thanks for kicking this off. I reckon this one is not ready for review, right? It would be great to have some tests with the expected result for the transformer. Thank you!
hola @solegalli, Correct, it's not ready for review. I'm still working on it. And, yes, I'll create tests.
Should we avoid using dataframes and instead use dictionaries and numpy arrays? I suspect iterating through dataframes increases computational costs. I'm going to keep the question open, but I think the answer is "yes". Dictionaries and numpy arrays simplify the merging of frequency distributions ;)
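To illustrate the point about numpy arrays, here is a minimal sketch of merging two adjacent frequency distributions. It is not the code in this PR: the function name `merge_adjacent`, the layout of `freq` (one row per interval, one column per target class), and the use of `boundaries` as the lower bound of each interval are all assumptions made for the example.

```python
import numpy as np


def merge_adjacent(freq, boundaries, i):
    """Merge interval i with interval i + 1 (hypothetical helper).

    freq:       2-d array, shape (n_intervals, n_classes), class counts per interval
    boundaries: 1-d array, shape (n_intervals,), assumed lower bound of each interval
    """
    # The merged interval's class counts are the element-wise sum of the two rows.
    merged_row = freq[i] + freq[i + 1]
    new_freq = np.vstack([freq[:i], merged_row, freq[i + 2:]])
    # The merged interval keeps the lower bound of interval i,
    # so the lower bound of interval i + 1 is dropped.
    new_boundaries = np.delete(boundaries, i + 1)
    return new_freq, new_boundaries


freq = np.array([[1, 0], [0, 2], [3, 3]])
boundaries = np.array([1.0, 2.0, 3.0])
new_freq, new_boundaries = merge_adjacent(freq, boundaries, 0)
```

With this layout a merge is one row addition and one deletion, with no row-wise iteration over a dataframe.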
…hod now returns a 2-d numpy array and 1-d numpy array instead of a dictionary.
… New method is incomplete. Issue with some of the chi-square calculations. It only happens w/ certain distributions
…the first 2 and last 2 chi-square values do not match expected results. meanwhile, the other 9 chi-square values match. unsure what is the cause of the discrepancy
hola @solegalli, I think I need an extra set of eyes. I'm struggling to identify what is causing the error for test_chi_merge(). I believe the root cause is in _calc_chi_square(); however, I cannot identify where.

In test_chi_merge(), the expected results are:

The transformer returns the following results:

The above values are the results from the chi-square tests of the consecutive distributions. 5 of the 12 results do not match the expected values. Indices of the values that don't reconcile: 0, 1, 6, 10, and 11. Do you see the bug?
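For reference when comparing against expected values, a chi-square statistic for two adjacent intervals can be sketched as below. This is a standalone illustration, not the PR's _calc_chi_square(): the function name `chi_square_adjacent` and the `zero_expected` parameter are hypothetical, and replacing zero expected frequencies with 0.1 is an assumed convention. Implementations handle that case differently, so it is one place where results may diverge between a transformer and a hand-computed reference.

```python
import numpy as np


def chi_square_adjacent(freq_a, freq_b, zero_expected=0.1):
    """Chi-square statistic for two adjacent intervals (hypothetical helper).

    freq_a, freq_b: class counts of the two intervals, same length.
    zero_expected:  assumed replacement for expected frequencies equal to 0,
                    used only to avoid division by zero.
    """
    observed = np.array([freq_a, freq_b], dtype=float)  # shape (2, n_classes)
    row_totals = observed.sum(axis=1, keepdims=True)    # samples per interval
    col_totals = observed.sum(axis=0, keepdims=True)    # samples per class
    total = observed.sum()
    if total == 0:
        return 0.0
    # Expected count under independence: row total * column total / grand total.
    expected = row_totals * col_totals / total
    expected[expected == 0] = zero_expected
    return float(((observed - expected) ** 2 / expected).sum())


same = chi_square_adjacent([5, 5], [5, 5])
separated = chi_square_adjacent([10, 0], [0, 10])
```

Identical class distributions give a statistic of 0, while fully separated classes give a large value, which is what makes low-scoring adjacent intervals candidates for merging.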
hola @solegalli, Did you have a chance to look at this bug? I'm stumped.
hi @solegalli, Are you still getting around to reviewing this discretizer? I think it's super cool! I know you're quite busy. I'm trying to organize myself.
Still pending. Shall I send you an email? Would that work?
Closes #450.
Notes from #450:
Existing implementations:
https://github.com/lisette-espin/pychimerge
https://github.com/night18/ChiMerge
https://github.com/raiyan1102006/ChiMerge
https://gist.github.com/alanzchen/17d0c4a45d59b79052b1cd07f531689e?short_path=f2e54c6
Reference to the original article can be found in the first link