RDF/XML parser is silently failing to load ontology containing XMLLiteral #496

Closed
rsgoncalves opened this issue Mar 3, 2016 · 6 comments

@rsgoncalves
Member

Loading the ontology below yields no errors/exceptions, yet the resulting OWLOntology object contains 0 axioms and 0 entities. I guess it's failing due to the XMLLiteral, but I expected an exception.

<?xml version="1.0"?>
<rdf:RDF xmlns="http://www.w3.org/2002/07/owl#"
     xml:base="http://www.w3.org/2002/07/owl"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:owl="http://www.w3.org/2002/07/owl#"
     xmlns:xml="http://www.w3.org/XML/1998/namespace"
     xmlns:protege="http://protege.stanford.edu/"
     xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
     xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
    <Ontology/>

    <AnnotationProperty rdf:about="http://protege.stanford.edu/code"/>

    <Class rdf:about="http://protege.stanford.edu/A">
        <rdfs:subClassOf rdf:resource="http://www.w3.org/2002/07/owl#Thing"/>
        <protege:code rdf:datatype="http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral"><test>xxx</test></protege:code>
    </Class>

</rdf:RDF>

<!-- Generated by the OWL API (version 4.2.0.20160228-1947) https://github.com/owlcs/owlapi -->

By replacing rdf:datatype="http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral" with rdf:parseType="Literal", the ontology loads fine and Protégé correctly interprets the literal as an XMLLiteral, though this is perhaps a separate issue, maybe related to #412 or #439.
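
For reference, the variant reported above to load correctly can be sketched as follows. Only the attribute on protege:code changes. The inner element is renamed to `example` here purely for illustration, and namespace declarations not used by the snippet are dropped; both are assumptions of this sketch, not part of the original file.

```xml
<?xml version="1.0"?>
<rdf:RDF xmlns="http://www.w3.org/2002/07/owl#"
     xml:base="http://www.w3.org/2002/07/owl"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:protege="http://protege.stanford.edu/"
     xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
    <Ontology/>

    <AnnotationProperty rdf:about="http://protege.stanford.edu/code"/>

    <Class rdf:about="http://protege.stanford.edu/A">
        <rdfs:subClassOf rdf:resource="http://www.w3.org/2002/07/owl#Thing"/>
        <!-- rdf:parseType="Literal" replaces rdf:datatype="...#XMLLiteral";
             the child element name is a placeholder for this sketch -->
        <protege:code rdf:parseType="Literal"><example>xxx</example></protege:code>
    </Class>

</rdf:RDF>
```

This is the form the RDF/XML syntax specification describes for XML literals, so it should be the safer one to emit from tools.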

@ignazio1977
Contributor

The datatype IRI is correct for XMLLiterals, but it does not seem to be used in RDF/XML; only rdf:parseType="Literal" is mentioned in the spec: https://www.w3.org/TR/2004/REC-rdf-syntax-grammar-20040210/#section-Syntax-XML-literals

It stands to reason that the datatype declaration should be usable as well, but I can't find references saying that it's an allowed behaviour.

XMLLiteral is defined in RDF, so I took a look, but the definition seems almost circular, in that it refers to RDF/XML, and is quite terse: https://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#dfn-rdf-XMLLiteral

I'd be tempted to allow both forms under lax parsing, although if possible it would be better to change the tools that created these literals.

Regarding the lack of thrown exceptions: the default parsing mode is lax, and under lax mode any remaining unparsed triples do not cause exceptions to be thrown. A conforming parser really shouldn't do that, but the whole lax parsing functionality is designed to handle data with syntactic errors, to avoid throwing errors on so many real-world ontologies.

@ignazio1977
Contributor

Incidentally, the example in the specs:

<ex:prop rdf:parseType="Literal"
         xmlns:a="http://example.org/a#"><a:Box required="true">
     <a:widget size="10" />
     <a:grommit id="23" /></a:Box>
</ex:prop>

is not great, as the namespace a is declared on the property element rather than on the literal. It should have been declared on the Box element for the literal to be self-contained.
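
A minimal sketch of the self-contained form suggested here, with the namespace declaration moved onto the Box element. It is a fragment: the ex and rdf prefixes are assumed to be declared on an enclosing rdf:RDF element, as in the spec example.

```xml
<!-- xmlns:a now sits on a:Box, so the literal carries its own namespace declaration -->
<ex:prop rdf:parseType="Literal"><a:Box xmlns:a="http://example.org/a#" required="true">
     <a:widget size="10" />
     <a:grommit id="23" /></a:Box>
</ex:prop>
```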

@ignazio1977
Contributor

/shame
The tool that created those literals might well be the OWL API; I found at least one way of making that happen.

@ignazio1977
Contributor

Speaking of which, I got exceptions attempting to parse this with RDF/XML - I suspect the lack of exceptions in this case is due to not specifying a format, and another parser not throwing an error. In the past, this has happened because of the OBO parser (e.g., it accepts a preamble in Manchester syntax).
Can you have a look at what format the empty ontology is reported to be?

@ignazio1977 ignazio1977 added the bug label Mar 5, 2016
@rsgoncalves
Member Author

Oh, I guess I did forget to mention that I produced the ontology programmatically with the OWL API 4.2.0. I didn't specify a format when parsing it, so that explains it. The empty ontology is reported to be in RDF/XML syntax.

@ignazio1977
Contributor

I fixed the incorrect output in 4.2.0 (4.2.1 has just been released).

The empty ontology still needs investigation. However, you should no longer get into this situation with 4.2.1.

ignazio1977 added a commit that referenced this issue Mar 6, 2016
OWLAPI bug caused unparseable output. Allow it to be read and saved correctly.