Problem Description
There is some randomness in fitting the ClusterBasedNormalizer. This also causes reproducibility issues in the other sdv libraries, e.g., sdv-dev/CTGAN#213.
Expected behavior
The BayesianGaussianMixture used to fit the distribution has a random_state argument that could be used for reproducibility purposes (see https://scikit-learn.org/stable/modules/generated/sklearn.mixture.BayesianGaussianMixture.html).
Additional context
I have only looked at the ClusterBasedNormalizer, but it may be that other methods could use the same approach for reproducibility purposes.
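
To illustrate the random_state suggestion, here is a minimal sketch that uses scikit-learn's BayesianGaussianMixture directly rather than the ClusterBasedNormalizer itself. The helper fit_means, the toy data, and the choice of n_components=3 are assumptions made up for this example and are not taken from the library's code. It only shows that fixing random_state makes two fits on the same data agree.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture


def fit_means(data, random_state=None):
    """Fit a small mixture model and return its component means.

    Hypothetical helper for illustration only; the number of components
    is an arbitrary choice, not the transformer's actual configuration.
    """
    model = BayesianGaussianMixture(n_components=3, random_state=random_state)
    model.fit(data.reshape(-1, 1))
    return model.means_.ravel()


# Toy bimodal column (made-up data).
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-5, 1, 200), rng.normal(5, 1, 200)])

# Two fits with the same seed should produce matching parameters.
means_a = fit_means(data, random_state=42)
means_b = fit_means(data, random_state=42)
same_seed_matches = np.allclose(means_a, means_b)
print("Same seed gives matching means:", same_seed_matches)
```

Without a fixed random_state, the initialization is drawn from NumPy's global random state, so repeated fits may differ unless that state is seeded elsewhere.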