Issue in PrivBayes data encoding (continuous/categorical) #350

@symdec

Description

If you fit the PrivBayes plugin on a dataset with continuous columns and enough rows, you will hit an out-of-memory system error (the program simply stops).
Digging a bit, what happens is that the continuous variables are encoded as categorical. Because the values are floats, each such column ends up with a very large number of categories, which takes up too much memory and leads to the crash.
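
To give a rough sense of scale, here is a minimal sketch in plain Python. It is not synthcity code: `joint_table_cells` is a hypothetical helper, and it assumes the model materializes a dense table over a pair of encoded columns, which may not match what the plugin does internally.

```python
import random
from math import prod


def joint_table_cells(columns):
    """Cells in a dense joint table over the given encoded columns.

    Hypothetical helper for illustration only (not part of synthcity).
    """
    return prod(len(set(col)) for col in columns)


random.seed(0)
n_rows = 10_000
floats_a = [random.random() for _ in range(n_rows)]
floats_b = [random.random() for _ in range(n_rows)]

# Floats treated as categorical: almost every value is its own category.
cells_categorical = joint_table_cells([floats_a, floats_b])

# Same data discretized into 10 equal-width bins (values lie in [0, 1)).
binned_a = [int(v * 10) for v in floats_a]
binned_b = [int(v * 10) for v in floats_b]
cells_binned = joint_table_cells([binned_a, binned_b])

print(cells_categorical, cells_binned)
```

With two float columns the categorical table is already several orders of magnitude larger than the binned one, and the gap grows with every extra column involved.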

There seems to be an error in the control logic of the _encode function in the file: src/synthcity/plugins/privacy/plugin_privbayes.py
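
For reference, this is roughly the branching I would expect from such a function. It is only a sketch under my own assumptions, not the actual _encode implementation: `encode_column`, `max_categories` and `n_bins` are hypothetical names, and high-cardinality columns are assumed to be numeric.

```python
def encode_column(values, max_categories=20, n_bins=10):
    """Return (integer codes, kind) for one column.

    Hypothetical helper, NOT the synthcity `_encode` function.
    Columns with few distinct values are label-encoded; columns with
    many distinct values (assumed numeric) are cut into equal-width bins.
    """
    uniques = sorted(set(values))

    if len(uniques) <= max(1, max_categories):
        # Low cardinality: one integer code per distinct value.
        mapping = {v: i for i, v in enumerate(uniques)}
        return [mapping[v] for v in values], "categorical"

    # High cardinality: discretize so the number of codes is at most n_bins.
    lo, hi = uniques[0], uniques[-1]
    width = (hi - lo) / n_bins
    codes = [min(int((v - lo) / width), n_bins - 1) for v in values]
    return codes, "continuous"


codes, kind = encode_column([i / 1000 for i in range(1000)])
print(kind, len(set(codes)))
```

The point is that the continuous branch caps the number of codes at `n_bins`, whereas the current behaviour seems to send float columns down the categorical path.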
