Failure serializing the model to disk #790

Open
ipapa opened this issue Oct 22, 2024 · 1 comment

ipapa commented Oct 22, 2024

Describe the bug
I am training a model with the Random Forest classification algorithm. Training works fine and produces the model, but serializing the model to disk fails. I have tried multiple JDK versions (11, 17, 22) and the result is the same.

Expected behavior
I expect the model to be serialized to disk using Java serialization. I want to load that model later to run queries against it.

Actual behavior
Exception in thread "main" java.io.NotSerializableException: smile.data.type.ObjectType$$Lambda$18/0x00007e9cd8cef840
at java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1175)
at java.base/java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1543)
at java.base/java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1500)
at java.base/java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1423)
at java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1169)
at java.base/java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1543)
at java.base/java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1500)
at java.base/java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1423)
at java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1169)
at java.base/java.io.ObjectOutputStream.writeArray(ObjectOutputStream.java:1369)
at java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1165)
at java.base/java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1543)
at java.base/java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1500)
at java.base/java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1423)
at java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1169)
at java.base/java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1543)
at java.base/java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1500)
at java.base/java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1423)
at java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1169)
at java.base/java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1543)
at java.base/java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1500)
at java.base/java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1423)
at java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1169)
at java.base/java.io.ObjectOutputStream.writeArray(ObjectOutputStream.java:1369)
at java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1165)
at java.base/java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1543)
at java.base/java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1500)
at java.base/java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1423)
at java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1169)
at java.base/java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:345)
at smile.io.Write.object(Write.java:58)
at com.*.machinelearning.SmileRandomForestTrainer.main(SmileRandomForestTrainer.java:74)
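
Editor's note: the class name in the exception, `smile.data.type.ObjectType$$Lambda$18/...`, suggests the object that could not be written is a lambda created inside `ObjectType`. A Java lambda is only serializable when its target type also implements `Serializable`. The sketch below reproduces that general JDK behavior in isolation. It is not Smile's code: `LambdaSerializationDemo`, `Holder`, and `canSerialize` are hypothetical names introduced for illustration, and whether this is the actual cause in Smile 3.1.1 is an assumption.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.function.Function;

public class LambdaSerializationDemo {
    // Hypothetical holder: the class itself is Serializable,
    // but it carries a lambda in a non-transient field.
    public static class Holder implements Serializable {
        private static final long serialVersionUID = 1L;
        final Function<String, Object> parser;

        Holder(Function<String, Object> parser) {
            this.parser = parser;
        }
    }

    // Returns true if Java serialization of obj succeeds,
    // false if it fails with NotSerializableException.
    public static boolean canSerialize(Object obj) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Plain lambda: java.util.function.Function is not Serializable,
    // so the generated lambda class is not serializable either.
    public static Holder plain() {
        return new Holder(s -> s);
    }

    // Intersection cast: the lambda's target type now includes Serializable,
    // so the compiler emits a serializable lambda.
    public static Holder serializable() {
        return new Holder((Function<String, Object> & Serializable) s -> s);
    }

    public static void main(String[] args) {
        System.out.println("plain lambda: " + canSerialize(plain()));               // prints "plain lambda: false"
        System.out.println("serializable lambda: " + canSerialize(serializable())); // prints "serializable lambda: true"
    }
}
```

If the failing field in the model is of this kind, the fix would have to come from the library side (a serializable functional interface or a `transient` field), which is consistent with checking whether a newer build behaves differently.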

Code snippet
// Load the training data from a CSV file
DataFrame data = Read.csv(trainingDataPath, CSVFormat.DEFAULT.withFirstRecordAsHeader().withRecordSeparator(','));

// Define formula for the model
Formula formula = Formula.lhs("target_hit");

// Set up properties for Random Forest
Properties props = new Properties();
props.setProperty("numTrees", "100");
props.setProperty("maxDepth", "23");
props.setProperty("minSamplesSplit", "2");
props.setProperty("minSamplesLeaf", "1");
props.setProperty("randomState", "11");

// Train the Random Forest model
RandomForest rf = RandomForest.fit(formula, data, props);

// Write the model to the filesystem
Write.object(rf, Path.of(outputModelPath));
LOG.info("Model trained and saved to: {}", outputModelPath);

Input data

target_hit,device_make,device_model,device_language,device_name,carrier,total_unique_sessions,unique_sessions_with_target_hit
0,70,3177,207,24634,1,113,9
0,70,1553,207,24634,1,70,12
1,70,2361,207,24634,1,33,16
0,376,798,1,36706,910,7150,2
0,70,800,207,24634,1,11,1
0,70,3184,207,24634,1,101,3
0,376,2,1278,36706,1495,43,2
0,70,2358,207,24634,1,82,0
0,376,1554,1278,36706,1493,1426,11

Additional context
openjdk version "22.0.2" 2024-07-16
OpenJDK Runtime Environment Corretto-22.0.2.9.1 (build 22.0.2+9-FR)
OpenJDK 64-Bit Server VM Corretto-22.0.2.9.1 (build 22.0.2+9-FR, mixed mode, sharing)

Amazon Linux - EC2 in AWS.

Running with Smile version 3.1.1.

ipapa changed the title from "Failure saving the model to disk" to "Failure serializing the model to disk" on Oct 22, 2024

haifengl (Owner) commented Nov 5, 2024

Thanks for reporting. I cannot reproduce it, though. Can you please try the master branch?