Why quoted triples, when we already have named graphs? #46

Open
lars-hellstrom opened this issue Jun 7, 2023 · 22 comments
Labels
wr:pending Wide review management

Comments

@lars-hellstrom

This is not meant to suggest that RDF shouldn't have quoted triples, but rather to point out that this is likely an objection that some people will raise, and the specification should have some answer. Concepts seems the most natural document in the collection.

I'm quite the beginner, so I don't know what the answer might be. Quoted triples definitely seems an obvious way of making claims about other claims, but when suggesting an encoding in terms of quoted triples to people more experienced with semantic web things, I got the reply that "nah, you do that with named graphs instead". It took me a while to grok that — much literature still paints named graphs mainly as a tool for keeping track of remote datasources — but of course a single edge named graph is a way of speaking about that edge, and in a quadstore it is more feasible to search all graphs for a triple than it is to search for a blank node reifying that triple, so named graphs overcome problems with classic reification. Then what is the point of quoted triples as a completely different mechanism?

It is possible that there is some aspect of having just one term that is the quoted triple (s,p,o), instead of any number of size 1 graphs which contain exactly that triple, that is the important difference (at least that is one thing I cannot easily see how it would be simulated), but then that should probably be spelt out. This is the kind of requirement one throws in when aiming to introduce a quoting operation, but it is not entirely clear what application would benefit from it.
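The "one term versus any number of size 1 graphs" difference can be made concrete with a minimal Python sketch. This is only an illustration under assumptions that are not part of any spec: RDF terms are modelled as plain strings, a quoted triple as a tuple, and the names `quads` and `graphs_containing` are hypothetical.

```python
# Hypothetical model: RDF terms are strings, a quoted triple is a tuple.
quoted_a = (":s", ":p", ":o")
quoted_b = (":s", ":p", ":o")

# Quads are (graph_name, subject, predicate, object).
# Two singleton named graphs hold the very same triple under different names.
quads = {
    ("_:g1", ":s", ":p", ":o"),
    ("_:g2", ":s", ":p", ":o"),
}


def graphs_containing(quads, triple):
    """Return the sorted names of all graphs that contain the given triple."""
    return sorted(g for (g, s, p, o) in quads if (s, p, o) == triple)


# Quoting the same triple twice gives one and the same term ...
same_term = quoted_a == quoted_b
# ... whereas the singleton graphs remain two separately named things.
singleton_names = graphs_containing(quads, quoted_a)
```

In this sketch, anything said about `quoted_a` is automatically said about `quoted_b`, while statements about `_:g1` and `_:g2` stay separate unless something outside the data model merges them.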

The counterargument against quoted triples is conceptual and implementation simplicity: quadstores are nicely flat data structures, whereas quoting brings the complexities of nesting. Of course, representing information that is inherently nested within a flat store only means that the nesting happens at a higher level of the representation, but since this could make it Someone Else's Problem it would probably have its proponents. Why not regard << >> and {| |} merely as syntactic shorthands for constructing a named graph? (It doesn't meet the spec, but why is the spec that way?)

@rat10

rat10 commented Jun 9, 2023

As long as named graphs can't be nested in the syntax, they will always be used mainly for one primary purpose. What that purpose is depends very much on the application. It need not always be the same purpose, but managing provenance is a natural default in an integration-focused project like the semantic web.
Quoted triples won't solve that because nesting them is verbose and quickly gets quite unreadable, and the semantics the CG report proposed doesn't help either. But they do, at least syntactically, open the door to Property Graph style statement annotation and qualification, and that is where they will be successful. That is a modelling idiom very different from named graphs, focused more on qualification than on integration, providing detail instead of context.
But what is considered detail and what context is again an intuition that is very application specific. And syntactically both intuitions do overlap in the case of single triples.
Maybe nested named graphs would provide the best of both worlds and bridge them quite naturally (and in one and only one modelling primitive), but syntactically I'd rather go for nesting {{{ ... }}} than << << << ... >> >> >> or {| {| {| ... |} |} |}.

@afs
Contributor

afs commented Jun 9, 2023

I'm sure an NGS design could be done. (NGS = "Named Graph Singleton")

But as already noted, named graphs are already in use for e.g. data management - a non semantic usage pattern - indeed the concepts behind Named Graphs existed before SPARQL.

Some systems have the default as the union of named graphs.
This would result in the triple in the NGS also being asserted.
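A minimal Python sketch of that effect, assuming a toy dataset modelled as a dict from graph name to a set of triples (the function name `union_default_graph` and the example data are hypothetical, not taken from any particular store):

```python
def union_default_graph(named_graphs):
    """Compute a default graph as the union of all named graphs."""
    default = set()
    for triples in named_graphs.values():
        default |= triples
    return default


named_graphs = {
    # An ordinary named graph used for data management.
    ":data": {(":alice", ":knows", ":bob")},
    # A singleton graph meant only to *mention* a claim, not to assert it.
    "_:ngs1": {(":bob", ":age", "42")},
}

default_graph = union_default_graph(named_graphs)

# Under the union default, the merely mentioned triple is now asserted too.
mentioned_is_asserted = (":bob", ":age", "42") in default_graph
```

A store with this default would need some way to exclude singleton graphs from the union if they are to behave like unasserted quotations.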

One use case for quoted triples is referring to triples as triples - the "syntactic case". "Triple was added to database", as used in bitemporal modelling for "as it was recorded", is different from when a fact was originally asserted.

The challenge that I see is what is the reasonably minimal building block. Encoding abstractions with triples (cf. RDF Lists, Reification) has practical usage problems.

@rat10

rat10 commented Jun 10, 2023

I'm sure an NGS design could be done. (NGS = "Named Graph Singleton")

The point is nesting. Then different levels can have different purposes. The outermost named graph could still be used for data management.

But as already noted, named graphs are already in use for e.g. data management - a non semantic usage pattern - indeed the concepts behind Named Graphs existed before SPARQL.

Defined in Named Graphs, Carroll et al 2005, with a very definitive semantics, which was largely ignored in practice. My take on that history is: if a semantics doesn't match the predominant intuition and needs of users, it won't survive. RDF-Star semantics as defined in the CG report will face the same fate.

Some systems have the default as the union of named graphs. This would result in the triple in the NGS also being asserted.

You will always find some system that has some default that doesn't match a certain proposal. We have to be pragmatic. A pragmatic approach to named graph semantics IMO would be: formalize the "data management" semantics as the default - names address graphs, graphs are just ordinary chunks of RDF data - and add a facility to specify other semantics on demand. Plain and simple IMHO.

One use case for quoted triples is referring to triples as triples - the "syntactic case". "Triple was added to database", as used in bitemporal modelling for "as it was recorded", is different from when a fact was originally asserted.

The challenge that I see is what is the reasonably minimal building block.

Define reasonably. The quoted triple as defined by the CG report is indeed minimal. The downside however is that the overwhelming majority of use cases will have to add another triple to refer to an occurrence, and two more triples to implement the TEP. All that for the sake of a very specific use case and an unwillingness to tackle the type/instance problem (which will not go away, ever). This is a design that is minimal in what it achieves, but maximal in the trouble it causes. Reasonable? Hardly.

I can only repeat that I find the graph literal datatype as proposed by Antoine Z. and before by Ivan H. much more convincing as a way to satisfy the need for syntactically faithful representations of triples: it has clear and intuitive semantics that even a novice user will immediately understand, it represents a minimal extension to RDF, it perfectly captures the meaning of blank nodes if the graph captures the whole CBD. What more could you want?

@TallTed
Member

TallTed commented Jun 12, 2023

@rat10 (and others) — Please avoid the temptation to use every acronym/abbreviation available to you without linking to the meaning you intend. I stumble over many of these, even when I have known their (typical) meaning for years, significantly but not only because they are not always used for that meaning.

TEP? Probably but not certain to have been intended to mean rdf-star:TransparencyEnablingProperty, as defined in § 7.2 Extended vocabulary of the RDF-star and SPARQL-star Final Community Group Report 17 December 2021.

CBD? Probably but not certain to have been intended to mean Concise Bounded Description, which has very limited validation as a W3C Member Submission ("This document is not the product of a chartered W3C group, but is published as potential input to the W3C Process.")

Even the recent citation of the Named Graphs paper was challenging to run down. I think this link gets what was intended. I note that there is no official W3C Standard or Draft for Named Graph, though there does exist a W3C Editor's Draft.

@afs
Contributor

afs commented Jun 12, 2023

"Named Graph" was introduced in SPARQL in the definitions section

It is in RDF Concepts where it was generalized: https://www.w3.org/TR/rdf-concepts/#dfn-named-graphs

@rat10

rat10 commented Jun 13, 2023

@TallTed This is a discussion board. I don't need to treat entries here with the same diligence as a mail to the list. TEP and CBD have already been discussed in meetings and on the mailing list, and the Named Graph paper is easy to find with the information I gave. The TEP is the only bridge the proposed semantics provides to implement referentially transparent embedded triples, so the acronym should be familiar to anyone with a passing interest. The concept of CBDs has been around for more than 20 years now, and I even write it out half of the time I use it. I think that if you are a member of an RDF 1.2 WG you should feel compelled to look these acronyms up if you don't know them already, and you should be able to disambiguate the correct reference from any other Google result on "TEP" or "CBD" (if there are any). That much I think can be expected from participants in this discussion.

@lars-hellstrom
Author

@rat10

This is a discussion board. I don't need to treat entries here with the same diligence as a mail to the list.

Actually, I'd say it's the other way around. A mailing list, even if it does get archived, is an ongoing conversation which the subscribers are following; if you don't understand, you can ask. An issue tracker is far more something that will see renewed interest much later, when a poster need not be around anymore. In particular, due diligence before posting an issue is that one at least scans old issues to check if one's own issue is a duplicate of an old issue. That's hard to do if the comments are too obscure.

I think that if you are a member of an RDF 1.2 WG you should […] That much I think can be expected from participants in this discussion.

The published public draft for RDF 1.2 Concepts refers to this tracker for feedback. It would be wrong to assume it is only visited by WG members.

@rat10

rat10 commented Jun 13, 2023

@lars-hellstrom Okay, I'll take that into account in future posts. Still, my feeling is that too many of these formal demands stifle discussions rather than enable them. And is the mailing list no longer considered part of the conversation that I can rely on readers knowing about? I did recently write a really exhaustive review of the TEP after Olaf asked me to. Do I have to link to that every time I mention the TEP in this issue tracker?

@doerthe

doerthe commented Jun 13, 2023

To come back to the discussion: As far as I know there are multiple semantics for named graphs, all of which are implemented. So, if we want to bind rdf-star to named graphs we would need to fix their semantics. Given the discussions we already have here and the fact that the multiple semantics are captured in a WG report (that is: there was a discussion), I am not as optimistic as @rat10 that we would come to an agreement.

I'd like to be wrong here :)

I think we first need to fully understand the use cases and decide based on these where (not) to go.

@lars-hellstrom
Author

@afs wrote:

I'm sure an NGS design could be done. (NGS = "Named Graph Singleton")

Do you mean "implementing quoted triples in terms of named graph singletons"? (Feels like 'singleton' should be qualifying 'named graph' rather than the other way around, but whatever.)

Effectively that would mean that in Turtle-star, the quoting <<>> is like [], except that the implicit blank node becomes the name of a size 1 graph rather than the subject of some number of triples.

That violates the spec in that quoted triples are no longer distinct from blank nodes. (I'm not sure whether thinking of them as being drawn from a separate namespace makes sense, especially not when considering dataset isomorphism.)

A naive implementation also violates the spec in that multiple quotations of the same triple might generate new blank nodes representing the same quoted triple, but I suppose you could get around that by keeping a table of named graph singletons and always picking the first match from that table. (What if someone explicitly created a named graph singleton, though — would that be appropriated to represent the quoted triple?)
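The "table of named graph singletons" workaround could look roughly like the following Python sketch. Everything here is an assumption for illustration: the class name `SingletonTable`, the `_:qN` blank node labels, and the tuple encoding of triples and quads.

```python
import itertools


class SingletonTable:
    """Hypothetical interning table: one singleton graph name per quoted triple."""

    def __init__(self):
        self._names = {}                   # triple -> blank node graph name
        self._counter = itertools.count(1)
        self.quads = set()                 # (graph_name, s, p, o)

    def quote(self, triple):
        """Return the graph name standing for the triple, creating it on first use."""
        name = self._names.get(triple)
        if name is None:
            name = f"_:q{next(self._counter)}"
            self._names[triple] = name
            s, p, o = triple
            self.quads.add((name, s, p, o))
        return name


table = SingletonTable()
first = table.quote((":s", ":p", ":o"))
second = table.quote((":s", ":p", ":o"))   # repeated quotation reuses the name
other = table.quote((":x", ":y", ":z"))
```

The table only knows about singletons it created itself, so the question of what to do with a singleton graph that someone created explicitly remains open in this sketch.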

Finally, this would allow for creating data structures that are indistinguishable from cyclic quotations:

_:cycle {
   :foo :bar << :baz :bar _:cycle >> .
}
_:cycle2 {
   :baz :bar << :foo :bar _:cycle2 >> .
}

So it seems such a design has some issues to nail down.
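One of those issues, detecting such cycles, can be sketched in Python. The sketch assumes the singleton reading described above, in which each `<< ... >>` has already been replaced by a blank node naming a size 1 graph; the dict encoding, the `_:q1` label and the function name `has_quotation_cycle` are hypothetical.

```python
def has_quotation_cycle(dataset):
    """Report whether graph names refer to each other in a cycle.

    `dataset` maps a graph name to the set of (s, p, o) triples in that graph.
    A graph refers to another graph when that graph's name occurs as the
    subject or object of one of its triples.
    """
    def references(name):
        return {t for (s, _p, o) in dataset.get(name, ())
                for t in (s, o) if t in dataset}

    visiting, done = set(), set()

    def visit(name):
        if name in done:
            return False
        if name in visiting:
            return True                    # reached a graph still being explored
        visiting.add(name)
        found = any(visit(n) for n in references(name))
        visiting.discard(name)
        done.add(name)
        return found

    return any(visit(name) for name in dataset)


# First graph of the example above, with the quotation turned into _:q1.
cyclic = {
    "_:cycle": {(":foo", ":bar", "_:q1")},
    "_:q1": {(":baz", ":bar", "_:cycle")},
}
acyclic = {
    "_:g": {(":foo", ":bar", "_:q")},
    "_:q": {(":baz", ":bar", ":qux")},
}
```

Here `has_quotation_cycle(cyclic)` is true and `has_quotation_cycle(acyclic)` is false; whether an implementation should reject, tolerate or ignore such cycles is exactly the kind of detail a design would have to settle.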

But the point of my original question was: What do you (the WG) say to an implementor (not me) who feels quoted triples ought to work in that way? Why shouldn't it be correct?

@rat10

The point is nesting. Then different levels can have different purposes. The outermost named graph could still be used for data management.

Is this the data structures versus structured data distinction? (And it's probably called something else in the CS literature, sigh.) The programming language mainstream doesn't bother to support composite values, because there are tools (such as cons cells) for building data structures into which you can store as many atomic values as you wish, by allocating dynamic memory as needed. This is not quite the same as proper composite values (a difference that becomes apparent when you work with computer algebra), but the opinion which seems to be dominant among language designers is that it should be enough.

Just like on a heap you can build data structures nested to whatever depth you want, named graphs with blank node names can in principle represent nesting to whatever depth one wants. The Turtle serialisation may not show that nesting, but you could regard <<>> as syntactic sugar, for the inline construction of a separate named graph while you're asserting triples of some other graph. Logical RAM being flat does not prevent nesting, and nor would quadstores being flat.

But does the WG want to do it that way, or do you want it to not be done that way?

@rat10

rat10 commented Jun 16, 2023

@lars-hellstrom

@rat10

The point is nesting. Then different levels can have different purposes. The outermost named graph could still be used for data management.

Is this the data structures versus structured data distinction? [...]

Without properly understanding what you refer to: no.

Just like on a heap you can build data structures nested to whatever depth you want, named graphs with blank node names can in principle represent nesting to whatever depth one wants. The Turtle serialisation may not show that nesting, but you could regard <<>> as syntactic sugar, for the inline construction of a separate named graph while you're asserting triples of some other graph. Logical RAM being flat does not prevent nesting, and nor would quadstores being flat.

I was alluding to the syntactic issue. For one, replacing <<...>> with nested {...} makes for a much saner syntax. It also makes the question obsolete what the difference between << :s :p :o >> and { :s :p :o } might be - there shouldn't be one IMO. It also solves the question of quoted graphs, i.e. "why don't we have << :s :p :o. :x :y :z>>?" because we would have {:s :p :o. :x :y :z} just as well. This makes navigating and querying much more straightforward, as there would be no need to e.g. check for the provenance of a statement twice: as an annotation on a quoted triple or on the named graph containing it. Blank nodes would be local to graphs, as in Pat Hayes' BLogic proposal (see this PDF for an introduction) and that way able to avoid some of the contortions in the CG semantics.
There is some experience already with Notation3 (N3) nested graphs - 2 decades and quite some research, however no large scale usage. Still, compared to RDF-star the Notation3 proposal seems like well explored territory. I would however very strongly suggest that the default semantics for such nested graphs don't follow any special use case, like in N3 or the original Named Graph proposal by Carroll et al 2005, but content themselves with the standard RDF semantics of referential transparency. Also graph naming semantics should default to what SPARQL suggests: the name addresses the graph. Anything more - like overloading the graph name with references to a thing in the real world or its closure under some entailment regime, or specifying a different semantics for the graph - should be possible to express (and supported by syntactic shortcuts, like {:opaque | :s :p :o} just as an example), but as an extra, in 80/20-style. I never understood why the RDF 1.1 WG wasn't able to agree on such a bare-bones common denominator, but it's never too late, is it? ;-)
Implementation would still be based on quad stores and the details - if each statement has an identifier and for each identifier there is a second statement, or a separate entry in an extra table, that records its graph membership(s), or the nesting of graphs is recorded via extra triples - don't need to be discussed here. Amazon Neptune seems to go this route already and there is certainly more than one way to do it.

[EDIT] The reasons why we are discussing RDF-star today and not Notation3 or other approaches to Named Graphs are IMHO largely social and "political". The discussions in the RDF 1.1 WG must have been intense, judging from the mailing list archive, and it seems that nobody wants to get burned again with that topic. But OTOH a few years ago also only very few people would have thought that a proposal like RDF-star, which requires some major tweaking in the installed base (specs, serializations and code), would garner such wide support. IMO that support is foremost an expression of a desire to get a sound and concise meta-modelling mechanism, but not very specific to the RDF-star proposal itself. When I ask people why they like RDF-star, or hope for it, they mostly are not aware of its pitfalls and consequences. They just hope for some rubber-stamped solution that they can use for what they actually care about.
There is some reluctance to jeopardize the momentum of support with a switch away from RDF-star. That's what I mean with "political", but my political instincts tell me that it is possible and that in the end everybody will be very happy if this WG arrives at a well-rounded solution, not another lukewarm compromise.

@niklasl
Contributor

niklasl commented Jun 22, 2023

In trying to understand the possible relationships between quoted triples and named graphs (as I elaborated upon here), I think the notion of "graph literals", denoting themselves, helps. It appears to me that the reason for not defining semantics of named graphs has been about the relation between a graph and its name, not "what a graph is". The graph is already defined as a set of triples.

A named graph could be just (in Notation 3, where formulae are "literals which are graphs themselves"):

<g1> rdfg:nameOf { <s> :p "o" . } .

(Note: In an imagined future where TriG 1.X acquired such a syntax it would be nigh ambiguous with named graphs, so perhaps some "quotation" marker, such as %{ .... } would be necessary.)

This also makes it possible to define subproperties of rdfg:nameOf, e.g. :denotes rdfs:subPropertyOf rdfg:nameOf, owl:sameAs ., or :believes rdfs:subPropertyOf rdfg:nameOf, rdfs:domain foaf:Person ., to represent the various notions of "naming".

Such "graphs themselves" would be opaque until explicitly linked together, with "transparency enabling properties" between named graphs. This is exactly like how named graphs in a dataset are often managed in practice, just with explicit semantics.

But what would such graph literals mean in the default graph of an implicit or explicit dataset, or "within" a named graph? This is where opacity is an issue. Are graph literals "visible" there? Named graphs are not, as I understand it. So I would define them as not, until "enabled". Within a graph that "enabling" could be done with rdf:type, e.g. rdfg:NeutralGraph or even rdfg:NegativeGraph. (This is what RDF surfaces is about, and has a precursor in the old RDF Graph Literals and Named Graphs Note.)

Another question is whether a quoted triple is such a singleton graph (one statement), or still contained by it? As I mentioned in the referenced email, RDF 1.1 semantics state:

A subgraph of an RDF graph is a subset of the triples in the graph. A triple is identified with the singleton set containing it, so that each triple in a graph is considered to be a subgraph. A proper subgraph is a proper subset of the triples in the graph.

Given that plus a notion of graph literals gives us a yes, they are identical. This sets it even further apart from a reified statement though, which could, IIUC, be defined as unique, given OWL, with rdf:Statement owl:key (rdf:subject rdf:predicate rdf:object). And you can "point into" such triples (x:nameOfTheSubject owl:propertyChainAxiom (rdf:subject foaf:name)), since they are fully transparent but still "neutral". I haven't seen any need for that expressed in use cases for RDF-star triples though; I have seen more need for being able to map old reified forms into qualified descriptions though (e.g. for mapping a bf:Contribution to dc:contributor). (I personally wouldn't mind a graph literal typed as rdf:Statement to magically turn into such though, but I'd have to work to argue for it.)

There are practical questions in play too. Given that named graphs are what is currently managed as "units of description" in quad stores, it would be necessary to make some room for these self-denoting graphs. Given that room is already being required for quoted triples I think it would be beneficial to consider this larger question of "nested graphs". Technically, a "special URI" made by uniquely hashing its canonicalized N-Triples representation (e.g. like <urn:tdb:2014:urn:sha256:fed610a12857801b8fed465951e8dbb3d3a4a03f2933dd146a897cd0ded87ea9>), can be to conceptual graphs what skolemization is to bnodes (and I believe quoted triples are often implemented something like that in practice). This has the upshot of everything in RDF still being possible to flatten to just RDF 1.1 N-Quads, with no new syntax for that (unless the special URI is desired to be very special, and not just a protocol).
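As a rough illustration of such content-derived names, here is a minimal Python sketch. It is not how any particular store does it: the function name `graph_urn` and the `urn:sha256:` layout are hypothetical (and simpler than the `urn:tdb:` example above), and sorting the lines only counts as canonicalization for graphs without blank nodes.

```python
import hashlib


def graph_urn(ntriples_lines):
    """Derive a content-based name from ground N-Triples lines."""
    # Sorting and de-duplicating is a canonical form only for graphs
    # without blank nodes; real graph canonicalization has to do more.
    canonical = "".join(sorted({line.strip() + "\n" for line in ntriples_lines}))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"urn:sha256:{digest}"          # hypothetical URN layout


triple_a = '<http://example.org/s> <http://example.org/p> "o" .'
triple_b = '<http://example.org/x> <http://example.org/y> <http://example.org/z> .'

# The same set of triples yields the same name regardless of line order.
name_1 = graph_urn([triple_a, triple_b])
name_2 = graph_urn([triple_b, triple_a])
# A singleton graph (one quoted triple) gets a name of its own.
name_single = graph_urn([triple_a])
```

Because the name is a plain IRI, quads that use it as a graph name or as a term stay within RDF 1.1 syntax, which is the flattening property mentioned above.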

Storing such as actual specially-named graphs in separate "documents" or "contexts" could work in existing implementations as is (with some code wrangling, cf. how we handle bnodes, especially RDF lists). But it may be untenable for lots of small graphs (see e.g. this performance assessment). This is also a lot of overhead for the case of annotated triples (both asserted and quoted). This may be another practical reason for keeping single triples within the same "graph storage unit". But I think that's an implementation detail. (At the National Library of Sweden we store RDF as just JSON-LD, so we basically side-step all of that (at a cost). This of course has a lot of bearing upon my view of RDF data as "raw", with all entailment happening upon that.)

Some more motivation to think along these lines is that quoted triples quickly blow up syntactically when you need to talk about, i.e. quote, facts like:

<bob> :knows [ a :Person ; :name "Mary" ] .

or:

<abc> :hasOrderedParts (<a> <b> <c>) .

And we still need separate graphs to capture things like:

<ng1> { <mary> a :Person ; :name "Mary" . }
<ng2> { <mary> :name "Mary" . }
<ng2> rdfg:subGraphOf <ng1> .

With all this said, I must stress that we have a need for quoted triples, often also asserted, i.e. annotated triples. But as we use named graphs for all provenance management (including data from other sources), I strongly believe these ought to be explicitly co-defined in RDF 1.2, and not just being "different" in an undefined way. See e.g. json-ld/json-ld-star#45 for more thoughts and needs. (Also, see this wikidata experiment for related use of annotations for provenance.)

@rat10

rat10 commented Jun 27, 2023

In trying to understand the possible relationships between quoted triples and named graphs (as I elaborated upon here),

A post that IMO didn't get the attention it deserves. However, to answer the main question: many people just don't want to get "there" again - "there" being named graphs and all the stressful disagreement around them. A truce was called in RDF 1.1 that left many people unsatisfied, but is a truce nonetheless. IMO this is a social issue rather than a technical one. A technical solution that just adds to what we have now, without disturbing anything that exists, is perfectly possible. But there is a very outspoken reluctance to go there. Just check recent WG minutes, 15.6.23 and 22.6.23. The RDF*/star CG didn't engage in discussing the issue, instead taking Olaf's dictum that the two approaches are "orthogonal" as the last word on it.

A named graph could be just (in Notation 3, where formulae are "literals which are graphs themselves"):

<g1> rdfg:nameOf { <s> :p "o" . } .

(Note: In an imagined future where TriG 1.X acquired such a syntax it would be nigh ambiguous with named graphs, so perhaps some "quotation" marker, such as %{ .... } would be necessary.)

This also makes it possible to define subproperties of rdfg:nameOf, e.g. :denotes rdfs:subPropertyOf rdfg:nameOf, owl:sameAs ., or :believes rdfs:subPropertyOf rdfg:nameOf, rdfs:domain foaf:Person ., to represent the various notions of "naming".

Basically: yes. The graph name "identifies" a graph. It doesn't necessarily denote it, but may mean something else. That is in line with the muddled way identification works on the semantic web in general (some background). It just gets more apparent here because the thing addressed is not a thing in the real world or a resource on the web (e.g. an HTML document) but another piece of RDF data.

Such "graphs themselves" would be opaque until explicitly linked together, with "transparency enabling properties" between named graphs. This is exactly like how named graphs in a dataset are often managed in practice, just with explicit semantics.

This I can't agree with. IMO an RDF 1.1 named graph, lacking any further specification, can only be assumed to be referentially transparent, just like any other set of triples, because that is the way RDF is defined. Can you explain how you come to the conclusion that in practice they are managed to be referentially opaque?

But what would such graph literals mean in the default graph of an implicit or explicit dataset, or "within" a named graph? This is where opacity is an issue. Are graph literals "visible" there? Named graphs are not, as I understand it. So I would define them as not, until "enabled". Within a graph that "enabling" could be done with rdf:type, e.g. rdfg:NeutralGraph or even rdfg:NegativeGraph. (This is what RDF surfaces is about, and has a precursor in the old RDF Graph Literals and Named Graphs Note.)

I'm pondering a design in which RDF 1.1 named graphs are left as they are (of course a vocabulary to describe their intended or application-specific semantics should be added), but nested graphs - using curly brackets as well - would be defined with a clear (and configurable) semantics.

[...]

Technically, a "special URI" made by uniquely hashing its canonicalized N-Triples representation (e.g. like <urn:tdb:2014:urn:sha256:fed610a12857801b8fed465951e8dbb3d3a4a03f2933dd146a897cd0ded87ea9>), can be to conceptual graphs what skolemization is to bnodes (and I believe quoted triples are often implemented something like that in practice).

From what I heard they are rather implemented as triples with an identifier and a special marker. Maybe that "special URI" is the identifier? Well, implementation detail...

[...]

With all this said, I must stress that we have a need for quoted triples, often also asserted, i.e. annotated triples. But as we use named graphs for all provenance management (including data from other sources), I strongly believe these ought to be explicitly co-defined in RDF 1.2, and not just being "different" in an undefined way. See e.g. json-ld/json-ld-star#45 for more thoughts and needs. (Also, see this wikidata experiment for related use of annotations for provenance.)

My proposal to the WG would be to say that RDF 1.1 named graphs are meant to implement application-specific semantics. RDF currently provides no means to explicitly describe their meaning, but such a facility could (and IMO should) easily be added. Nonetheless there are things that RDF is too weak to do and that can only be achieved through out-of-band means. That's what RDF 1.1 named graphs are for, and should remain to be used for (in the absence of any other grouping mechanism they also are and will be used as an optimization technique for purposes that RDF could do, albeit only on a singleton level - one of the downsides of the proposed singleton <<...>> approach and the reason why I would favor nested graphs).

Any solution - quoted triples, or nested graphs or a singleton property-based triple identification mechanism or what have you - should have clearly defined default semantics (asserted referentially transparent occurrences IMO) and a syntactically concise mechanism to define other semantics. And any solution has to be able to be mapped to an n-ary relation in an unambiguous way - flattened in your words - to guarantee compatibility with RDF 1.1.

@lars-hellstrom
Author

@rat10 wrote

I was alluding to the syntactic issue. For one, replacing <<...>> with nested {...} makes for a much saner syntax.

I agree it's nicer, but wouldn't braces for quoting get into trouble with SPARQL's use of braces for grouping? I'm thinking specifically about group patterns. Even if the context of a group pattern can be proved sufficiently distinct from the context of an RDF term that there is no confusion, one would still have to parse a lot of text before being able to determine which it is, if both use braces.

Blank nodes would be local to graphs, as in Pat Hayes' BLogic proposal (see this PDF for an introduction) and that way able to avoid some of the contortions in the CG semantics.

That PDF rather seems to suggest the blank nodes are local to graph surfaces, where a graph surface may contain multiple graphs, so no change from the current state of affairs in that respect. (I've come across a scheme for signatures of graphs that would put the blank node naming the graph to sign inside the signature graph, but also the blank node naming that signature graph inside the signature graph, so merely nesting graphs would not suffice for that scheme. Then again, I don't know if it was a good scheme.)

@lars-hellstrom
Author

@niklasl wrote:

Another question is whether a quoted triple is such a singleton graph (one statement), or still contained by it? As I mentioned in the referenced email, RDF 1.1 semantics state:

A subgraph of an RDF graph is a subset of the triples in the graph. A triple is identified with the singleton set containing it, so that each triple in a graph is considered to be a subgraph. A proper subgraph is a proper subset of the triples in the graph.

I think this is a red herring. That 'is identified' has the smell of something an author writes because they don't want to burden the presentation with extra formalia (or themselves with having to write out those formalia), not something that logically means anything. The classical example is to identify the letter a with the length 1 word/string a: it simplifies your notation, but you don't want to do that in your formal construction of words on a given alphabet.

Storing such as actual specially-named graphs in separate "documents" or "contexts" could work in existing implementations as is (with some code wrangling, cf. how we handle bnodes, especially RDF lists). But it may be untenable for lots of small graphs (see e.g. this performance assessment).

That assessment is annoyingly short on details about how they actually store the quoted triples. For the other approaches they refer to Reifying RDF: What works well with wikidata, whose authors give plenty of details, but for quoted triples there is just "our product does this today". Also the dataset appears not to have any nesting of quoting, which could be skewing the assessment.

This is also a lot of overhead for the case of annotated triples (both asserted and quoted). This may be another practical reason for keeping single triples within the same "graph storage unit".

You're thinking the asserted and the quoted forms of a triple should share storage? Oh well, implementation details, I suppose.

One interesting example mentioned in that Reifying RDF paper is the presidencies of Grover Cleveland. In Wikidata there were (at least at the time) two edges asserting Grover Cleveland was president of the United States, corresponding to his two non-consecutive terms as president, and these could be distinct by virtue of having different start time and end time annotations. In RDF, an edge is just a triple, no matter what assertions might have been made about that triple. This corresponds neatly to the RDF-star spec that << :s :p :o >> is always the same RDF term with the same value, no matter how many times you say it. There's probably an illustrative example that can be made out of this, which would be well placed in rdf-concepts.
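
A minimal sketch of that Cleveland example, in plain Python rather than any RDF library. Triples are modelled as tuples and a graph as a set; the identifiers (`:GroverCleveland`, `:positionHeld`, `:startTime`, `:endTime`) are made up for illustration and are not Wikidata's actual vocabulary.

```python
# Toy model: a triple is a tuple, a graph is a set of triples.
# All names below are hypothetical, chosen only for illustration.
presidency = (":GroverCleveland", ":positionHeld", ":PresidentOfTheUS")

graph = set()
# Asserting the "same edge" twice still yields a single triple.
graph.add(presidency)
graph.add(presidency)

# Annotations use the quoted triple itself as subject. Since a quoted
# triple is a term determined entirely by (s, p, o), the annotations for
# both terms of office land on the same subject.
graph.add((presidency, ":startTime", "1885"))
graph.add((presidency, ":endTime", "1889"))
graph.add((presidency, ":startTime", "1893"))
graph.add((presidency, ":endTime", "1897"))


def annotations(g, quoted):
    """Collect predicate -> sorted values for annotations on a quoted triple."""
    out = {}
    for s, p, o in g:
        if s == quoted:
            out.setdefault(p, []).append(o)
    return {p: sorted(v) for p, v in out.items()}


cleveland = annotations(graph, presidency)
```

In this toy model `cleveland` ends up with two start years and two end years attached to one term, and nothing says which start belongs with which end. Keeping the two presidencies apart would need some extra node per occurrence, which is the point the Wikidata edges illustrate.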

@niklasl
Contributor

niklasl commented Jul 9, 2023

@rat10 wrote:

In trying to understand the possible relationships between quoted triples and named graphs (as I elaborated upon here),

A post that IMO didn't get the attention it deserves. However, to answer the main question: many people just don't want to get "there" again - "there" being named graphs and all the stressful disagreement around them. A truce was called in RDF 1.1 that left many people unsatisfied, but is a truce nonetheless. IMO this is a social issue rather than a technical one. A technical solution that just adds to what we have now, without disturbing anything that exists, is perfectly possible. But there is a very outspoken reluctance to go there. Just check recent WG minutes, 15.6.23 and 22.6.23. The RDF*/star CG didn't engage in discussing the issue, instead taking Olaf's dictum that the two approaches are "orthogonal" as the last word on it.

I can understand the reluctance, and the desire to do things step by step. I don't read too much reluctance in those minutes though (albeit signs thereof), I mostly see the trouble stemming from the known issue that named graphs aren't enough. They are occurrences of graph literals, or graph terms, but the latter, which do appear to almost(?) equate with quoted triples, are beyond the charter of the WG. It is this limitation that is somewhat concerning, but the WG can still address questions thereof (and appears to want to).

It appears to me that the old "graph literals", now named graph terms in the latest Notation 3 CG draft, are about the same kind of quoting as quoted triples. Perhaps it is an issue of triples versus triple sets? (The Axiom of regularity, stating that "no set is an element of itself", may provide the key difference; since as @lars-hellstrom just pointed out, the phrasing "A triple is identified with the singleton set containing it" in RDF 1.1 Concepts may indeed be a red herring.) Given that the members of these groups overlap, and following the minutes, I have high hope that this can be clarified further. I hope issues such as this provide a sample of perspectives to be taken into account, but these are indeed both technical, social and cognitive issues, all at once. (Standardization requires diligence to eliminate unnecessary differences, whilst being pragmatic enough to yield working implementations and adoption, all the while balancing predictable long-term consequences.)

A named graph could be just (in Notation 3, where formulae are "literals which are graphs themselves"):
(Note: In an imagined future where TriG 1.X acquired such a syntax it would be nigh ambiguous with named graphs, so perhaps some "quotation" marker, such as %{ .... } would be necessary.)
This also makes it possible to define subproperties of rdfg:nameOf, e.g. :denotes rdfs:subPropertyOf rdfg:nameOf, owl:sameAs ., or :believes rdfs:subPropertyOf rdfg:nameOf, rdfs:domain foaf:Person ., to represent the various notions of "naming".

Basically: yes. The graph name "identifies" a graph. It doesn't necessarily denote it, but may mean something else. That is in line with the muddled way identification works on the semantic web in general (some background). It just gets more apparent here because the thing addressed is not a thing in the real world or a resource on the web (e.g. an HTML document) but another piece of RDF data.

I have hopes that Notation 3 is aiming for defining precisely this definition of named graphs. I believe that we can strive towards convergence in thinking, design and implementation here. RDF 1.2 won't standardize Notation 3, but it might set the stage for clarification and interoperability with named graphs, quoted triple "constituents" and graph terms as extensions of those. It is not ideal, but they need not diverge.

Such "graphs themselves" would be opaque until explicitly linked together, with "transparency enabling" properties between named graphs. This is exactly like how named graphs in a dataset are often managed in practice, just with explicit semantics.

This I can't agree with. IMO an RDF 1.1 named graph, lacking any further specification, can only be assumed to be referentially transparent, just like any other set of triples, because that is the way RDF is defined. Can you explain how you come to the conclusion that in practice they are managed to be referentially opaque?

Two named graphs in a dataset can contain contradictions, and differences can be preserved that have perhaps in its default graph been asserted as owl:sameAs. I believe this has to be the "default". Implementations may differ, and as we know this is not standardized.

This is supported by wording in the RDF 1.1 W3C Working Group Note: On Semantics of RDF Datasets (quoting that document):

Named graphs in RDF datasets are sometimes used to delimit a context in which the triples of the named graphs are true. From the truth of these triples according to the graph semantics, follows the truth of the named graph pair. An example of such situation occurs when one wants to keep track of the evolution of facts with time. Another example is when one wants to allow different viewpoints to be expressed and reasoned with, without creating a conflict or inconsistency. By having inferences done at the named graph level, one can prevent for instance that triples coming from untrusted parties are influencing trusted knowledge. Yet it does not disallow reasoning with and drawing conclusions from untrusted information.

(As a note this has no official bearing, but I hope that it states intent for further convergence. I think that the RDF-star WG can follow this intent, if only for quoted triples, to avoid creating differences that become future obstacles when standardizing semantics for named graphs or graph terms. Even if nothing normative can be defined, a Note with advice can help a lot, which I also believe there is will to make.)

Also, the ongoing Notation 3 CG work currently states:

Essentially, a graph term represents an occurrence of an RDF graph — i.e., a quoting or citing of the graph. Importantly, a graph term does not assert the contents of the RDF graph as being true (e.g., :cervantes dc:wrote :moby_dick). In fact, the graph term is interpreted as a resource on its own.

(I am not sure what "occurrence" means here as I am quite sure that a graph term ought to "denote itself"...) It also states:

As they represent a quoting of RDF graphs, graph terms are not "referentially transparent".

(followed by a rendering of the classical Superman problem).

To elaborate on that, I would say that <superman> obviously becomes identical to <clarkkent> if you "believe" <lexluthor>, as in "interpret" the named graph in question, or "unquote" it in Notation 3 terminology. But until that happens, they are isolated beliefs (descriptions of different worlds, if you will).

(Also, we might say that the Superman problem lacks an essential <loislane> :believes << <superman> owl:differentFrom <clarkkent> >>, as otherwise that "belief" is just ambiguous, and clarified by <lexluthor>. Also, in the <lexluthor> belief, there is no difference, meaning there is no way to say what <loislane> believes (other than as an absurd quote, stating _:x owl:differentFrom _:x). But that is in the belief. We can look from the "outside", and state these facts. These are cartoon worlds both in the literal and the epistemological sense, after all.)
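
A minimal sketch of that "isolated until interpreted" reading, in plain Python. The graph contents and the `interpret` helper are hypothetical, and the owl:sameAs handling is a naive rename rather than real OWL reasoning.

```python
# Hypothetical dataset: each named graph holds one agent's beliefs.
dataset = {
    "<loislane>": {
        ("<superman>", ":canFly", "true"),
        ("<clarkkent>", ":worksAt", "<dailyplanet>"),
    },
    "<lexluthor>": {
        ("<superman>", "owl:sameAs", "<clarkkent>"),
    },
}


def interpret(dataset, believed):
    """Merge only the graphs we choose to believe, then apply owl:sameAs.

    The sameAs handling is a naive one-step rename of object to subject,
    not real OWL reasoning.
    """
    merged = set()
    for name in believed:
        merged |= dataset[name]
    alias = {o: s for s, p, o in merged if p == "owl:sameAs"}
    return {(alias.get(s, s), p, alias.get(o, o)) for s, p, o in merged}


lois_only = interpret(dataset, ["<loislane>"])
with_lex = interpret(dataset, ["<loislane>", "<lexluthor>"])
```

With only `<loislane>` believed, nothing about `<clarkkent>` carries over to `<superman>`; once `<lexluthor>` is "unquoted" as well, the two names collapse and `<superman>` is inferred to work at `<dailyplanet>`.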

Back to quoting and replying to @rat10:

[...]

I'm pondering a design in which RDF 1.1 named graphs are left as they are (of course a vocabulary to describe their intended or application-specific semantics should be added), but nested graphs - using curly brackets as well - would be defined with a clear (and configurable) semantics.

This would need to be aligned with TriG, and with SPARQL. It might be hard, but I believe there is a desire to do so, given e.g. this thread on the semantic web mailing list. I think the core question is whether RDF-star introduces a divergent concept from graph terms, or if they can converge. A semantic difference for quoting may confuse and complicate practices, whereas one of granularity or grouping, which is more about data ergonomics (cf. RDF lists in concrete syntax vs. cons cell form), may be more palatable. Is the notion of "quotation" the same, and the differences lie in granularity? In use cases combining data sources with augmentation (through e.g. editorial work, inference or ML) using quotation or annotation on one or more subsets of facts, this is rather crucial.

(I also wonder if there really is any semantic "nesting" going on, more than relations between terms (though a relation may be named :nestedIn, or :subsetOf). From a technical implementation point of view, it is crucial, but that appears to be more of a lexicographical detail than a mathematical or semantic one? Cf. how a concrete syntax for multiple graphs like JSON-LD can be very nested, but still unambiguously flattened into N-Quads. In a non-standard generalized graph store, you can have one named graph "carrying" a default graph together with two local graphs named using blank nodes, all the while logically these are three named graphs. I've been wary about using that of course, and both RDF-star and Notation 3 appear to help out here, but I am not sure about the implications yet.)

From what I heard they are rather implemented as triples with an identifier and a special marker. Maybe that "special URI" is the identifier? Well, implementation detail...

Well, they cannot be asserted triples in a graph, so they need to be "disconnected" but still both referable and conditionally interpreted (so we can find all quoted facts about e.g. <superman>). These appear to be the same characteristics that named graphs have, and (perhaps even more so) graph terms. But I am glossing over some things to point out the technical necessity of ensuring the predictable identifier. The triple is not denoted by a URI, just as neither a literal nor a bnode is; but they might be reduced to special kinds of URIs internally (and if so, even graph terms can be). That is akin to the case of BNode skolemization, used as a last resort for predictable serialization of bnodes. Anyway, this specific anything-as-a-URI detail may be off-topic for this issue (albeit related); it appears more relevant to w3c/rdf-star#23.

To summarize: my current interpretation and hope is that the definition of graph terms, currently as part of the ongoing work on Notation 3, might shed clarity upon the question of the possible semantics of named graphs, and that the former (graph terms) have a much closer relationship to quoted triples. The challenge is, I believe, to "pave the path" for graph terms by defining quoted triples first, rather than in conjunction.

@rat10

rat10 commented Jul 11, 2023

@niklasl wrote:

@rat10 wrote:

In trying to understand the possible relationships between quoted triples and named graphs (as I elaborated upon here),

A post that IMO didn't get the attention it deserves. However, to answer the main question: many people just don't want to get "there" again - "there" being named graphs and all the stressful disagreement around them. A truce was called in RDF 1.1 that left many people unsatisfied, but is a truce nonetheless. IMO this is a social issue rather than a technical one. A technical solution that just adds to what we have now, without disturbing anything that exists, is perfectly possible. But there is a very outspoken reluctance to go there. Just check recent WG minutes, 15.6.23 and 22.6.23. The RDF*/star CG didn't engage in discussing the issue, instead taking Olaf's dictum that the two approaches are "orthogonal" as the last word on it.

I can understand the reluctance, and the desire to do things step by step.

Standardizing RDF-star quoted triples now, Notation3 formulae next, without a coherent vision of how the pieces fit together into a greater whole, may very well lead to more confusion and divergence in modelling idioms. "Step by step" doesn't cut it when such a basic feature as meta-modelling is concerned. RDF semantic extensions are free to do what they want, and indeed encouraged to explore new areas. I would not mind too much if RDF-star became such an extension (although I still find it dangerously disconnected from reality and prone to misuse in practice in more than one way). I would welcome Notation3 as a semantic extension to RDF, as it seems to be a really well-thought-out and rounded concept (however, Notation3 doesn't fit the bill either when it comes to e.g. Property Graph compatibility). But integrating one or both into the core of RDF without having a concept of how they should interact with each other, how they could provide solutions to the pressing issues of statement qualification, without even making a discussion of such topics a requirement - that's just unwarranted hoping for the best.

I don't read too much reluctance in those minutes though (albeit signs thereof), I mostly see the trouble stemming from the known issue that named graphs aren't enough. They are occurrences of graph literals, or graph terms, but the latter, which do appear to almost(?) equate with quoted triples, are beyond the charter of the WG. It is this limitation that is somewhat concerning, but the WG can still address questions thereof (and appears to want to).

In the Community Group I tried to discuss the whole problem space, including e.g. named graphs, statement qualification, etc., to which RDF-star claims to provide a solution, but the argument against such initiative was that "we are only working on RDF-star here, we are not trying to solve all the problems of RDF". Then the WG was chartered in a way very much reflecting the view of the CG report, but nonetheless is aiming to become "RDF 1.2" (or even "RDF 2.0" per one of the very same editors that blocked all wider ranging effort in the CG). And now "the narrow charter prevents us...". This is a circular argument, and therefore should be rejected as useless. It also doesn't hold: the WG is free to come to the conclusion that its charter is too narrow to produce a useful result and demand an extension, or dissolve without producing a spec. The WG is now responsible for what it produces, nobody else.

It appears to me that the old "graph literals", now named graph terms in the latest Notation 3 CG draft, are about the same kind of quoting as quoted triples. Perhaps it is an issue of triples versus triple sets? (The Axiom of regularity, stating that "no set is an element of itself", may provide the key difference; since as @lars-hellstrom just pointed out, the phrasing "A triple is identified with the singleton set containing it" in RDF 1.1 Concepts may indeed be a red herring.)

IMO this is angels dancing on the head of a pin. An RDF graph is a set of RDF triples. A set is not required to contain multiple triples. It can contain only one triple just as well, or even be empty. We also know what a triple is. So, what more is there to know?

Given that the members of these groups overlap, and following the minutes, I have high hope that this can be clarified further. I hope issues such as this provide a sample of perspectives to be taken into account, but these are indeed both technical, social and cognitive issues, all at once. (Standardization requires diligence to eliminate unnecessary differences, whilst being pragmatic enough to yield working implementations and adoption, all the while balancing predictable long-term consequences.)

Hope alone is not gonna cut it. Some people claim that annotating a triple is completely different to annotating a graph. Some people try to justify not discussing named graphs in the context of the work in RDF-star with that argument. Some people claim that resorting to the safety of mathematical abstractions is a "prudent" approach, as if practice would care about such reserve. Some people claim that named graphs can and will never have an agreed upon semantics, as if there was no way to formalize and manage such an out-of-band arrangement. IMO these are all just lame excuses to not be bothered with the complexities of knowledge representation in the real world. Notation3 nested graphs come with a very specific semantics that doesn't help modelling of complex facts either, but is optimized for reasoning - powerful and interesting indeed, but not what was asked for in the W3C workshop on improving compatibility between RDF and LPG in Berlin 2019. Souri Das has proposed a syntax to the WG (called RDFn) and singleton properties provide a semantics; combined, they would meet the request from the Berlin workshop. The WG doesn't take them up and I can find no justification for that decision anywhere. Instead, the pseudo-simple approach of quoted triples is pursued, favoring very specific demands of - again - out-of-band issues like versioning of triples, as if we didn't already have the named graph mechanism for such purposes. That doesn't give me hope.

[...]

I have hopes that Notation 3 is aiming for defining precisely this definition of named graphs. I believe that we can strive towards convergence in thinking, design and implementation here. RDF 1.2 won't standardize Notation 3, but it might set the stage for clarification and interoperability with named graphs, quoted triple "constituents" and graph terms as extensions of those. It is not ideal, but they need not diverge.

Notation3 has a very specific interpretation of nested graphs and I don't see how that helps with much more basic needs like qualification of statements, grouping statements (as a very basic KR activity) etc. IIUC the semantics of RDF-star quoted triples per the CG report and of Notation3 formulae is very similar, modulo the treatment of blank nodes in RDF-star (which again is motivated mainly by the need to overcome the limitation of RDF-star to single triples). In my interpretation the semantics grafted on RDF-star by the CG is a reflection of the Notation3 people seeing this as a chance to introduce formulae into core RDF through the backdoor of RDF-star, despite the obvious problems (like what to do with blank nodes, or mimicking formulae as lists of quoted triples) - because they don't believe in sane W3C processes either.

No, I don't share your hopes. I see tactical manoeuvring that reflects a W3C unable to provide vision and guidance and organize support. All I can see from the W3C process seems rather dysfunctional to me, not to mention that it is disturbingly opaque. The whole scenery also seems to reflect an unwillingness to face some fundamental problems of RDF - the issues that arise with meta-modelling and with application-specific intuitions clashing with the integration-focused set semantics of RDF. This is of course no easy terrain, but it is an illusion to think it can be avoided by restricting oneself to "safe" areas of mathematical abstractions. It's easy to get lost in rabbit holes, but trying to ignore them has consequences too.

Such "graphs themselves" would be opaque until explicitly linked together, with "transparency enabling" properties between named graphs. This is exactly like how named graphs in a dataset are often managed in practice, just with explicit semantics.

This I can't agree with. IMO an RDF 1.1 named graph, lacking any further specification, can only be assumed to be referentially transparent, just like any other set of triples, because that is the way RDF is defined. Can you explain how you come to the conclusion that in practice they are managed to be referentially opaque?

Two named graphs in a dataset can contain contradictions, and differences can be preserved that have perhaps in its default graph been asserted as owl:sameAs. I believe this has to be the "default". Implementations may differ, and as we know this is not standardized.

Okay, I missed the "between" part. But IIUC there is still an important difference between such graphs internally - where IMO they can by default only be understood as referentially transparent - and the CG semantics for quoted triples which makes each term denote, but only in the specific syntactic form provided.

[...]

Cf. how a concrete syntax for multiple graphs like JSON-LD can be very nested, but still unambiguously flattened into N-Quads.

Can you give or point to an example? I think you mean a tree-like nesting, where curly brackets stand in for blank nodes. What I'm talking about is different: whole (sets of) triples nested as vertices within triples (and those within other graphs). That can be flattened to N-Triples, but it admittedly is ugly because very heavy on blank nodes AFAICT.

[...]

To summarize: my current interpretation and hope is that the definition of graph terms, currently as part of the ongoing work on Notation 3, might shed clarity upon the question of the possible semantics of named graphs,

The RDF 1.1 WG Note on dataset semantics gives an example how SPARQL service descriptions can be used to describe the semantics of named graphs in a dataset, as default semantics and/or specifically per named graph. Add to that an IRI referring to a quotation semantics of your liking and you're all set, aren't you?

SPARQL/RDF 1.1 named graphs are designed to facilitate out-of-band means, to do things with RDF that the spec doesn't specify how to do, because e.g. they are outside the realm of the mathematical abstractions that RDF operates on. I've come to the conclusion that this is a sensible arrangement. We should stop lamenting that "named graphs have no semantics", but embrace the fact that we need another instrument, and this time with semantics, to do all the things that are within the scope of RDF, but presently can resort to no other modelling primitive than RDF standard reification and n-ary relations: nested graphs as a way to ease modelling, inside RDF 1.1 named graphs. If well designed they can also provide hooks to configure semantics, asserted-ness, instantiation etc, meeting all the advanced needs that people currently try to stuff into only one syntactic instrument.

and that the former (graph terms) have a much closer relationship to quoted triples. The challenge is, I believe, to "pave the path" for graph terms by defining quoted triples first, rather than in conjunction.

But who would still need the quoted triples if we had nested graphs with the same semantics? Or should they have different semantics? Then how to sensibly decide on the semantics of quoted triples without knowing what the semantics of nested graphs will be, and without being sure that they indeed will come? This needs a comprehensive architecture, not some narrowly chartered 2-years-let's-just-do-it-WG. Of course this also needs engagement and leadership. I have to confess I know nobody among the well-established members of this community who would be willing to take on such an endeavor, but maybe if its necessity becomes apparent...

@lars-hellstrom
Author

@rat10 wrote:

Cf. how a concrete syntax for multiple graphs like JSON-LD can be very nested, but still unambiguously flattened into N-Quads.

Can you give or point to an example? I think you mean a tree-like nesting, where curly brackets stand in for blank nodes. What I'm talking about is different: whole (sets of) triples nested as vertices within triples (and those within other graphs).

Could you explain what you mean by non-tree-like nesting? Having multiple triples in a nested graph is no stranger syntactically than a variadic function in a mathematical formula, and certainly fits within the tree-like paradigm. (There are cases where you have to go beyond treelike structure—it's something I've worked on research-wise—but I would be surprised if it would arise in an RDF context.)

That can be flattened to N-Triples, but it admittedly is ugly because very heavy on blank nodes AFAICT.

With quads, you get one blank node per graph literal. Thus some blank nodes, but way less per atomic term than you get for the RDF encoding of lists. If insisting on triples — assigning a separate identifier / bnode for each distinct triple in the dataset, then separately collecting these identifiers into graphs — the count goes up, but probably not by that much.

With <<>> as quoting a set of triples (rather than exactly one triple), one could write

<<
  :a :b :c .
  :d :e << :f :g "h" >>
>> :i << >> .

for what flattens to quads (in order graph–subject–predicate–object) as

_:1 :a :b :c .
_:1 :d :e _:2 .
_:2 :f :g "h" .
_:1 :i _:3 .

(Since _:3 doesn't occur in the graph position of any quad, it refers to an empty graph.)
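
The flattening above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a quoted graph is modelled as a Python list of triples, plain terms are strings, and the `flatten` helper is hypothetical. Each occurrence of a quoted graph gets one fresh blank node, and `None` in the graph position stands for the default graph.

```python
from itertools import count


def flatten(triples):
    """Flatten triples whose terms may be quoted graphs into quads.

    Returns a list of (graph, s, p, o) tuples, with graph=None for the
    default graph. A quoted graph is a list of triples (an assumption of
    this sketch); each one is replaced by a fresh blank node.
    """
    labels = count(1)
    quads = []

    def term_of(term):
        # Plain terms pass through unchanged. A quoted graph is replaced
        # by a fresh blank node and its contents are emitted under it.
        if isinstance(term, list):
            bnode = f"_:{next(labels)}"
            emit(term, bnode)
            return bnode
        return term

    def emit(ts, graph):
        for s, p, o in ts:
            s_term = term_of(s)
            o_term = term_of(o)
            quads.append((graph, s_term, p, o_term))

    emit(triples, None)
    return quads


# The example from the post: << :a :b :c . :d :e << :f :g "h" >> >> :i << >> .
doc = [
    (
        [(":a", ":b", ":c"),
         (":d", ":e", [(":f", ":g", '"h"')])],
        ":i",
        [],
    ),
]
quads = flatten(doc)
```

The empty quoted graph still consumes a blank node (`_:3` here) but contributes no quads, matching the remark that a name which never occurs in graph position refers to an empty graph. Note that this sketch names each occurrence separately, so two textually identical quoted graphs would get different blank nodes.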

@rat10

rat10 commented Jul 12, 2023

@lars-hellstrom

@rat10 wrote:

Cf. how a concrete syntax for multiple graphs like JSON-LD can be very nested, but still unambiguously flattened into N-Quads.

Can you give or point to an example? I think you mean a tree-like nesting, where curly brackets stand in for blank nodes. What I'm talking about is different: whole (sets of) triples nested as vertices within triples (and those within other graphs).

Could you explain what you mean by non-tree-like nesting? Having multiple triples in a nested graph is no stranger syntactically than a variadic function in a mathematical formula, and certainly fits within the tree-like paradigm. (There are cases where you have to go beyond treelike structure—it's something I've worked on research-wise—but I would be surprised if it would arise in an RDF context.)

Sorry, I was distracted by the reference to JSON-LD which I don't know well enough. I meant to refer to flattening to triples. More specifically, IMO it is crucial to define how annotated nested graphs can be mapped/flattened to mere triples, because for those we know what they mean. And AFAICT that is possible, but would for example involve a lot of blank nodes and a special property - possibly a subproperty of rdf:value - like rdfx:primaryTopicOf. And "tree-like nesting" is indeed not a useful term to put what I wanted to refer to, which is the way that nested n-ary relations with intermediate blank nodes are expressed in Turtle.

@rat10

rat10 commented Jul 12, 2023

@rat10 wrote

I was alluding to the syntactic issue. For one, replacing <<...>> with nested {...} makes for a much saner syntax.

I agree it's nicer, but wouldn't braces for quoting get into trouble with SPARQL's use of braces for grouping? I'm thinking specifically about group patterns. Even if the context of a group pattern can be proved sufficiently distinct from the context of an RDF term that there is no confusion, one would still have to parse a lot of text before being able to determine which it is, if both use braces.

One thing at a time ;-) I admit I haven't put much thought into how things are queried. But a) it's complicated enough without and b) I guess some formatting (indentation and line breaks) should cover a lot of terrain. If not, then, well, maybe we have to invent a new kind of braces :-/

Blank nodes would be local to graphs, as in Pat Hayes' BLogic proposal (see this PDF for an introduction) and that way able to avoid some of the contortions in the CG semantics.

That PDF rather seems to suggest the blank nodes are local to graph surfaces, where a graph surface may contain multiple graphs, so no change from the current state of affairs in that respect. (I've come across a scheme for signatures of graphs that would put the blank node naming the graph to sign inside the signature graph, but also the blank node naming that signature graph inside the signature graph, so merely nesting graphs would not suffice for that scheme. Then again, I don't know if it was a good scheme.)

You got me there. My point was meant to be that with BLogic one can define a boundary in which blank nodes mean the same - there called a surface - and that boundary can contain multiple triples and even nested triples. But I admit that I don't really understand why it is so hard for RDF-star to get the semantics of blank nodes right, so I'll leave it at that.

@lars-hellstrom
Author

lars-hellstrom commented Jul 13, 2023 via email

@rat10

rat10 commented Jul 17, 2023

@rat10 wrote:

IMO it is crucial to define how annotated nested graphs can be mapped/flattened to mere triples, because for those we know what they mean. And AFAICT that is possible, but would for example involve a lot of blank nodes and a special property - possibly a subproperty of rdf:value - like rdfx:primaryTopicOf.

So on the one hand you insist

I 'insist' that the solution should do what it is expected to do, that it should be easy to use and should put the mainstream needs of non-logicians first.

on regarding the relation between quoted triple and its parts as an N-ary relation — which in a way it necessarily is, but how explicit[*] that relation has to be in the formalism is another matter — and on the other hand you also insist that this relation gets encoded in terms of triples. How is that different from reinventing RDF 1.0 reification (with rdf:Statement as the sought special property)? (Edit: I got confused about the terminology. "property"=verb, so I suppose the properties of the scheme would be rdf:subject, rdf:predicate, and rdf:object.)

You probably are aware that the CG report defines an unstar-mapping that maps quoted triples to n-ary relations. Still the result of that mapping is not RDF standard reification: those n-ary relations describe different things. The same would be true for a mapping of nested graphs to n-ary relations: they would describe yet another thing. The ways in which those things differ may often seem arcane, but they have very practical consequences: RDF standard reification describes a statement instance without asserting it. RDF-star per CG report describes a statement type without asserting it (and syntactically constrained). Singleton properties instantiate subtypes of statements, and assert them (and the supertype statement can be entailed). Consequently singleton properties can assert qualified relations, but RDF standard reification and RDF-star can't. IMO if singleton properties had chosen a syntactic extension that puts the (familiar) supertype first, but keeps the annotations/qualifications "nearby", they might have succeeded (despite the lack of a grouping mechanism).
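
To make the asserted-versus-described distinction concrete, here is a rough Python sketch of the three encodings for one annotated statement. It is a toy model, not a semantics: triples are tuples, a graph is a set, the `:alice`/`:worksFor`/`:acme`/`:since` names are hypothetical, and `rdf:singletonPropertyOf` is used as I understand the singleton property proposal to name it. The `asserts` helper is a naive stand-in for entailment and does not capture the instance-versus-type difference discussed above.

```python
def standard_reification(s, p, o, node="_:stmt"):
    """RDF reification vocabulary: describes a statement without asserting it."""
    return {
        (node, "rdf:type", "rdf:Statement"),
        (node, "rdf:subject", s),
        (node, "rdf:predicate", p),
        (node, "rdf:object", o),
    }


def quoted_triple(s, p, o):
    """RDF-star style: the triple itself is the term; nothing is asserted."""
    return (s, p, o)


def singleton_property(s, p, o, sp):
    """Singleton property style: asserts (s, sp, o) and links sp to p."""
    return {
        (s, sp, o),
        (sp, "rdf:singletonPropertyOf", p),
    }


def asserts(graph, s, p, o):
    """True if graph relates s to o by p or by a singleton property of p."""
    props = {p} | {sp for sp, link, sup in graph
                   if link == "rdf:singletonPropertyOf" and sup == p}
    return any((s, q, o) in graph for q in props)


fact = (":alice", ":worksFor", ":acme")

reified = standard_reification(*fact) | {("_:stmt", ":since", "2020")}
starred = {(quoted_triple(*fact), ":since", "2020")}
singleton = singleton_property(*fact, sp=":worksFor#1") | {
    (":worksFor#1", ":since", "2020"),
}

results = {
    "reification": asserts(reified, *fact),
    "rdf-star": asserts(starred, *fact),
    "singleton": asserts(singleton, *fact),
}
```

Only the singleton property graph makes `asserts` return true; in the other two the qualified relation is described but would have to be asserted separately.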

What does this have to do with named graphs? Probably nothing. It is an orthogonal issue if the syntax is optimized for triples or graphs. IMO the syntax should be optimized for graphs (nested within RDF named graphs, as those are application-specific devices without a semantics), because the distinction between annotations on one or multiple triples is arbitrary and optimizing the proposed solution for single triples will only encourage people to use RDF named graphs for annotations on multiple triples - no matter the lack of sound semantics, and no matter the resulting cacophony of modelling styles. But RDF named graphs are probably best left to issues that can only be handled out-of-band, that are specific to an application, that don't need to be shared or that are in other ways out of scope of RDF.
