org.apache.avro.SchemaParseException: Can't redefine: test #182

@lockwobr

I'm having issues writing data with avro_rs and reading it with Apache Avro for Java. I was able to create one example that is close to what I am experiencing. My real schema is pretty complicated, so I have tried to boil it down to the problematic bits.

This code runs just fine, but when its output is read with avro-tools I get an error.

use avro_rs::{Error, Schema, Writer};
use serde::{Deserialize, Serialize};

#[derive(Debug, Deserialize, Serialize)]
struct Test {
    a: i64,
    b: String,
    test: Test2,
}

#[derive(Debug, Deserialize, Serialize)]
struct Test2 {
    a: i64,
    b: String,
}


fn main() -> Result<(), Error> {
    let raw_schema = r#"
        {
            "type": "record",
            "name": "test",
            "fields": [
                {"name": "a", "type": "long", "default": 42},
                {"name": "b", "type": "string"},
                {"name": "test", "type": {
                    "type": "record",
                    "name": "test",
                    "fields": [
                        {"name": "a", "type": "long", "default": 42},
                        {"name": "b", "type": "string"}
                    ]
                }}
            ]
        }
    "#;

    let schema = Schema::parse_str(raw_schema)?;

    // println!("{:?}", schema);

    let mut writer = Writer::new(&schema, std::io::stdout());

    let test = Test {
        a: 27,
        b: "foo".to_owned(),
        test: Test2 {
            a: 23,
            b: "bar".to_owned(),
        }
    };

    writer.append_ser(test)?;
    writer.flush()?;

    Ok(())
}
❯ ./target/debug/example > avro.out
❯ java -jar ~/bin/avro-tools-1.10.1.jar tojson avro.out
21/02/17 10:38:11 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" org.apache.avro.SchemaParseException: Can't redefine: test
        at org.apache.avro.Schema$Names.put(Schema.java:1542)
        at org.apache.avro.Schema$Names.add(Schema.java:1536)
        at org.apache.avro.Schema.parse(Schema.java:1655)
        at org.apache.avro.Schema.parse(Schema.java:1668)
        at org.apache.avro.Schema$Parser.parse(Schema.java:1425)
        at org.apache.avro.Schema$Parser.parse(Schema.java:1413)
        at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:131)
        at org.apache.avro.file.DataFileStream.<init>(DataFileStream.java:90)
        at org.apache.avro.tool.DataFileReadTool.run(DataFileReadTool.java:93)
        at org.apache.avro.tool.Main.run(Main.java:67)
        at org.apache.avro.tool.Main.main(Main.java:56)

It seems Apache Avro performs a validation that avro_rs does not. I originally hit this error when using parse_list to load a directory of schema files. I have a record type that is used more than once inside a parent record type, and because avro_rs inlines the full child schema at every use, I get an error like the one above. When Apache Avro inlines child schemas into a parent, it defines the child record type only once and refers to it by name on subsequent uses. My example shows roughly the same problem: the record type name "test" is defined twice, and avro_rs is OK with that, but Apache Avro is not.
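To make the missing check concrete, here is a minimal sketch of the kind of "can't redefine" validation described above. It uses a hypothetical, heavily simplified `MiniSchema` type (not the avro_rs `Schema` type and not Apache Avro's implementation), and `find_redefinition`, `issue_schema`, and `reuse_by_name_schema` are illustrative names only. It assumes a record's name is registered before its fields are visited, which is what the nested `Schema.parse` frames in the stack trace suggest.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified stand-in for an Avro schema tree.
#[derive(Debug, Clone)]
enum MiniSchema {
    Long,
    Str,
    /// Reference by name to a record that was defined earlier.
    Ref(String),
    /// Full record definition: a name plus (field name, field type) pairs.
    Record { name: String, fields: Vec<(String, MiniSchema)> },
}

/// Walks the schema depth-first and returns the first record name that is
/// defined more than once, or `None` if every definition is unique.
fn find_redefinition(schema: &MiniSchema, seen: &mut HashSet<String>) -> Option<String> {
    match schema {
        MiniSchema::Record { name, fields } => {
            // Register the name before visiting the fields, so a nested
            // record that reuses the enclosing record's name is caught.
            if !seen.insert(name.clone()) {
                return Some(name.clone());
            }
            for (_, field_schema) in fields {
                if let Some(duplicate) = find_redefinition(field_schema, seen) {
                    return Some(duplicate);
                }
            }
            None
        }
        // Primitives and by-name references define nothing new.
        MiniSchema::Long | MiniSchema::Str | MiniSchema::Ref(_) => None,
    }
}

/// Same shape as the schema in this issue: record "test" with a field whose
/// type is a second record that is also named "test".
fn issue_schema() -> MiniSchema {
    let inner = MiniSchema::Record {
        name: "test".to_owned(),
        fields: vec![("a".to_owned(), MiniSchema::Long), ("b".to_owned(), MiniSchema::Str)],
    };
    MiniSchema::Record {
        name: "test".to_owned(),
        fields: vec![
            ("a".to_owned(), MiniSchema::Long),
            ("b".to_owned(), MiniSchema::Str),
            ("test".to_owned(), inner),
        ],
    }
}

/// A parent that uses a child record twice: defined in full the first time,
/// then referenced by name.
fn reuse_by_name_schema() -> MiniSchema {
    let child = MiniSchema::Record {
        name: "child".to_owned(),
        fields: vec![("a".to_owned(), MiniSchema::Long)],
    };
    MiniSchema::Record {
        name: "parent".to_owned(),
        fields: vec![
            ("first".to_owned(), child),
            ("second".to_owned(), MiniSchema::Ref("child".to_owned())),
        ],
    }
}

fn main() {
    println!("{:?}", find_redefinition(&issue_schema(), &mut HashSet::new()));
    println!("{:?}", find_redefinition(&reuse_by_name_schema(), &mut HashSet::new()));
}
```

In this sketch the schema from the issue is flagged because "test" is defined twice, while the parent that defines its child once and then refers to it by name passes. Running a check like this at parse time or before writing the file header would be one way for a writer to reject schemas that the Java reader later refuses to load.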
