All cells work fine until the synthesizer.fit() step.
I got the following error
Expected behavior
Context
Operating System and version: macOS 14.7.1
Which version are you using: 0.2.4
Error message
{
"name": "TypeError",
"message": "Could not convert string '2015-06-012010-10-012016-08-012013-05-012017-04-012016-08-012015-07-012016-07-012012-08-012017-02-012016-11-012015-04-012015-03-012015-08-012017-04-012015-08-012015-11-012016-10-012016-11-012016-01-012016-02-012015-03-012014-06-012017-08-012014-05-012015-08-012011-05-012016-09-012012-10-012015-01-012016-06-012015-08-012013-03-012016-03-012018-06-012017-11-012018-03-012011-10-012016-07-012014-07-012016-04-012013-05-012016-02-012015-03-012014-09-012015-09-012015-04-012016-01-012013-12-012014-10-012017-05-012016-06-012016-07-012016-01-012017-08-012016-03-012018-09-012015-11-012015-03-012015-02-012017-08-012016-07-012016-01-01......' to numeric",
"stack": "---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[13], line 2
1 # Fit the model
----> 2 synthesizer.fit()
File ~/.pyenv/versions/3.10.15/envs/adhoc/lib/python3.10/site-packages/sdgx/synthesizer.py:327, in Synthesizer.fit(self, metadata, inspector_max_chunk, metadata_include_inspectors, metadata_exclude_inspectors, inspector_init_kwargs, model_fit_kwargs)
325 try:
326 logger.info("Model fit Started...")
--> 327 self.model.fit(metadata, processed_dataloader, **(model_fit_kwargs or {}))
328 logger.info("Model fit... Finished")
329 finally:
File ~/.pyenv/versions/3.10.15/envs/adhoc/lib/python3.10/site-packages/sdgx/models/ml/single_table/ctgan.py:220, in CTGANSynthesizerModel.fit(self, metadata, dataloader, epochs, *args, **kwargs)
218 if epochs is not None:
219 self._epochs = epochs
--> 220 self._pre_fit(dataloader, discrete_columns, metadata)
221 if self.fit_data_empty:
222 logger.info("CTGAN fit finished because of empty df detected.")
File ~/.pyenv/versions/3.10.15/envs/adhoc/lib/python3.10/site-packages/sdgx/models/components/sdv_rdt/transformers/numerical.py:479, in ClusterBasedNormalizer._fit(self, data)
466 """Fit the transformer to the data.
467
468 Args:
469 data (pandas.Series):
470 Data to fit to.
471 """
472 self._bgm_transformer = BayesianGaussianMixture(
473 n_components=self.max_clusters,
474 weight_concentration_prior_type="dirichlet_process",
475 weight_concentration_prior=0.001,
476 n_init=1,
477 )
--> 479 super()._fit(data)
480 data = super()._transform(data)
481 if data.ndim > 1:
File ~/.pyenv/versions/3.10.15/envs/adhoc/lib/python3.10/site-packages/pandas/core/series.py:6457, in Series._reduce(self, op, name, axis, skipna, numeric_only, filter_type, **kwds)
6452 # GH#47500 - change to TypeError to match other methods
6453 raise TypeError(
6454 f"Series.{name} does not allow {kwd_name}={numeric_only} "
6455 "with non-numeric dtypes."
6456 )
-> 6457 return op(delegate, skipna=skipna, **kwds)
File ~/.pyenv/versions/3.10.15/envs/adhoc/lib/python3.10/site-packages/pandas/core/nanops.py:147, in bottleneck_switch.__call__.<locals>.f(values, axis, skipna, **kwds)
145 result = alt(values, axis=axis, skipna=skipna, **kwds)
146 else:
--> 147 result = alt(values, axis=axis, skipna=skipna, **kwds)
149 return result
File ~/.pyenv/versions/3.10.15/envs/adhoc/lib/python3.10/site-packages/pandas/core/nanops.py:404, in _datetimelike_compat.<locals>.new_func(values, axis, skipna, mask, **kwargs)
401 if datetimelike and mask is None:
402 mask = isna(values)
--> 404 result = func(values, axis=axis, skipna=skipna, mask=mask, **kwargs)
406 if datetimelike:
407 result = _wrap_results(result, orig_values.dtype, fill_value=iNaT)
File ~/.pyenv/versions/3.10.15/envs/adhoc/lib/python3.10/site-packages/pandas/core/nanops.py:720, in nanmean(values, axis, skipna, mask)
718 count = _get_counts(values.shape, mask, axis, dtype=dtype_count)
719 the_sum = values.sum(axis, dtype=dtype_sum)
--> 720 the_sum = _ensure_numeric(the_sum)
722 if axis is not None and getattr(the_sum, "ndim", False):
723 count = cast(np.ndarray, count)
File ~/.pyenv/versions/3.10.15/envs/adhoc/lib/python3.10/site-packages/pandas/core/nanops.py:1701, in _ensure_numeric(x)
1698 elif not (is_float(x) or is_integer(x) or is_complex(x)):
1699 if isinstance(x, str):
1700 # GH#44008, GH#36703 avoid casting e.g. strings to numeric
-> 1701 raise TypeError(f"Could not convert string '{x}' to numeric")
1702 try:
1703 x = float(x)
TypeError: Could not convert string '2015-06-012010-10-012016-08-012013-05-012017-04-012016-08-012015-07-012016-07-012012-08-012017-02-012016-11-012015-04-012015-03-012015-08-012017-04-012015-08-012015-11-012016-10-012016-11-012016-01-012016-02-012015-03-012014-06-012017-08-012014-05-012015-08-012011-05-012016-09-012012-10-012015-01-012016-06-012015-08-012013-03-012016-03-012018-06-012017-11-012018-03-012011-10-012016-07-012014-07-012016-04-012013-05-012016-02-012015-03-012014-09-012015-09-012015-04-012016-01-012013-12-012014-10-012017-05-012016-06-012016-07-012016-01-012017-08-012016-03-012018-09-012015-11-012015-03-012015-02-012017-08-012016-07-012016-01-012015-07-012016-03-012014-07-012013-02-012014-06-012014-06-012014-10-012015-11-012015-01-012015-08-012015......' to numeric"
}
Configuration
Paste the contents of your configuration file here.
Additional context
The string in the error message is too long to fit in a GitHub issue, so I shortened the date string a bit.
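The run-together dates in the message match what the traceback shows in `nanmean`: the values are summed first, and the sum is then rejected by `_ensure_numeric` because it is a string. This suggests a column of date strings reached a numeric transformer. A minimal pure-Python sketch of that mechanism follows; `mean_like_nanmean` is a hypothetical stand-in written for illustration, not pandas code, and the three dates are the first three from the error message.

```python
from functools import reduce
import operator


def mean_like_nanmean(values):
    """Simplified stand-in for a sum-then-divide mean (not pandas code)."""
    # Adding strings concatenates them, so the "sum" of date strings
    # is one long string rather than a number.
    total = reduce(operator.add, values)
    if isinstance(total, str):
        # Mirrors the check visible in the traceback's _ensure_numeric frame.
        raise TypeError(f"Could not convert string '{total}' to numeric")
    return total / len(values)


dates = ["2015-06-01", "2010-10-01", "2016-08-01"]
try:
    mean_like_nanmean(dates)
except TypeError as exc:
    message = str(exc)
    print(message)
```

If this is what happens in the notebook, the date column is probably being treated as numeric rather than as a datetime column at fit time.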
Hi! I tried this but could not reproduce the bug.
I think it may be the same problem as issue #248.
Could you try again after excluding FixedCombinationInspector?
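A minimal sketch of how that exclusion might be passed, assuming the `metadata_exclude_inspectors` parameter visible in the `Synthesizer.fit` signature in the traceback accepts a list of inspector class names. That value format is an assumption and has not been checked against the sdgx documentation. `fit_excluding_inspector` is a hypothetical helper, not part of sdgx.

```python
def fit_excluding_inspector(synthesizer, inspector_name="FixedCombinationInspector"):
    """Call fit() while asking metadata inspection to skip one inspector.

    Assumption: metadata_exclude_inspectors takes a list of inspector
    class names as strings; adjust if sdgx expects a different form.
    """
    return synthesizer.fit(metadata_exclude_inspectors=[inspector_name])
```

In the notebook, this would replace the plain `synthesizer.fit()` call with `fit_excluding_inspector(synthesizer)`.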
Description
When running the example notebook sdgx_example_ctgan.ipynb, I ran into an error at the fit step.
Reproduce
Follow the example notebook at https://github.com/hitsz-ids/synthetic-data-generator/blob/main/example/sdgx_example_ctgan.ipynb
All cells work fine until the synthesizer.fit() step, which raised the error above.