RDF* and SPARQL*

Draft Community Group Report 09

Latest editor's draft:
https://w3c.github.io/rdf-star/
Editors:
Olaf Hartig ( Linköping University )
Pierre-Antoine Champin ( ERCIM )
Authors:
Dörthe Arndt (Ghent University)
Bryan Thompson (Amazon)
Participate:
GitHub w3c/rdf-star
File a bug
Commit history
Pull requests

Abstract

TODO

Status of This Document

This is a preview

Do not attempt to implement this version of the specification. Do not reference this version as authoritative in any way. Instead, see https://w3c.github.io/rdf-star/ for the Editor's draft.

This specification was published by the RDF-DEV Community Group . It is not a W3C Standard nor is it on the W3C Standards Track. Please note that under the W3C Community Contributor License Agreement (CLA) there is a limited opt-out and other conditions apply. Learn more about W3C Community and Business Groups .

1. Introduction

1.1 Background and Motivation

This section is non-normative.

TODO, citing [ RDF-STAR-FOUNDATION ]

1.2 Overview

This section is non-normative.

TODO (the purpose of this section will be to provide an informal introduction to the approach for practitioners)

The syntax of RDF is defined in two layers:

Similarly, this document defines the abstract syntax of RDF* in §  2. Concepts and Abstract Syntax , and one concrete syntax based on Turtle [ TURTLE ] in §  3. Turtle* .

TODO list the prefix definitions implicitly used in all examples

1.3 Conformance

As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.

The key words MAY , MUST , and MUST NOT in this document are to be interpreted as described in BCP 14 [ RFC2119 ] [ RFC8174 ] when, and only when, they appear in all capitals, as shown here.

Issue 3 : Do we need more things in the 'conformance' section? later process

For the moment, we only have the boilerplate text generated by respec.

2. Concepts and Abstract Syntax

An RDF* graph is a set of RDF* triples .

An RDF* triple is a 3-tuple defined recursively as follows:

As for RDF triples , we call the 3 components of an RDF* triple its subject , predicate and object , respectively. From the definitions above, it follows that any RDF graph is also an RDF* graph . Note also that, by definition, an RDF* triple cannot contain itself and cannot be nested infinitely.

The definition relies on the notions of IRI , literal , blank node , RDF triple , and RDF graph , introduced by RDF 1.1 Concepts and Abstract Syntax  [ RDF11-CONCEPTS ].

IRIs , literals , blank nodes and RDF* triples are collectively known as RDF* terms .

For every RDF* triple t , we define its constituent terms (or simply constituents) as the set containing its subject , its predicate , its object , plus all the constituent terms of its subject and/or its object if they are themselves RDF* triples . By extension, we define the constituent terms of an RDF* graph to be the union set of the constituent terms of all its triples.

Consider the following RDF* triple (represented in Turtle*):
<< _:a :name "Alice" >> :statedBy :bob.
Its set of constituent terms comprises the IRIs :name , :statedBy , :bob , the blank node _:a , the literal "Alice" , and the triple << _:a :name "Alice" >> .
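
As an informal illustration (not part of the specification), the recursive definition of constituent terms can be sketched in Python. The sketch assumes a hypothetical encoding in which IRIs, blank nodes and literals are plain strings and an RDF* triple is a 3-tuple whose subject or object may itself be a 3-tuple; the function name constituent_terms is likewise only illustrative.

```python
def constituent_terms(triple):
    """Return the set of constituent terms of an RDF* triple (a nested 3-tuple)."""
    s, p, o = triple
    terms = {s, p, o}
    # Only the subject and the object may be embedded triples.
    for component in (s, o):
        if isinstance(component, tuple):
            terms |= constituent_terms(component)
    return terms

# The example triple above, in the assumed tuple encoding.
embedded = ("_:a", ":name", '"Alice"')
triple = (embedded, ":statedBy", ":bob")
terms = constituent_terms(triple)
```

Under this encoding, terms holds the three IRIs, the blank node, the literal and the embedded triple listed above.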

An RDF* triple used as the subject or object of another RDF* triple is called an embedded triple . An RDF* triple that is an element of an RDF* graph is called an asserted triple . Note that, in a given RDF* graph , the same triple MAY be both embedded and asserted .
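
The distinction can be made concrete with a small Python sketch. It assumes a hypothetical encoding (terms as strings, RDF* triples as nested 3-tuples, an RDF* graph as a Python set of such tuples); the helper name embedded_triples is not defined by this document.

```python
def embedded_triples(graph):
    """Collect every triple that occurs as subject or object of another triple."""
    found = set()

    def visit(triple):
        s, _, o = triple
        for component in (s, o):
            if isinstance(component, tuple):
                found.add(component)
                visit(component)  # embedded triples may be nested

    for t in graph:
        visit(t)
    return found

claim = (":alice", ":knows", ":bob")
graph = {claim, (claim, ":statedBy", ":carol")}

asserted = set(graph)                 # asserted triples are the elements of the graph
embedded = embedded_triples(graph)
both = asserted & embedded            # triples that are embedded and asserted
```

Here claim is both embedded and asserted, while the :statedBy triple is asserted only.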

An RDF* dataset is a collection of RDF* graphs , and comprises:

Again, this definition is an extension of the notion of RDF dataset , hence it follows that any RDF dataset is also an RDF* dataset .

3. Turtle*

In this section, we present Turtle*, an extension of the Turtle format [ TURTLE ] allowing the representation of RDF* graphs . For the sake of conciseness, we only describe here the differences between Turtle* and Turtle.

3.1 Grammar

Turtle* is defined to follow the same grammar as Turtle, except for the EBNF productions specified below, which replace the productions having the same number (if any) in the original grammar.

[8] objectList ::= object annotation ? ( ',' object annotation ? )*
[10] subject ::= iri | BlankNode | collection | embTriple
[12] object ::= iri | BlankNode | collection | blankNodePropertyList | literal | embTriple
[27] embTriple ::= '<<' embSubject verb embObject '>>'
[28] embSubject ::= iri | BlankNode | embTriple
[29] embObject ::= iri | BlankNode | literal | embTriple
[31] annotation ::= '{|' predicateObjectList '|}'
Note

The only changes are that subject and object productions have been extended to accept embedded triples , which are described by the new productions 27 to 29 . Note that embedded triples accept a more restricted range of subject and object expressions than asserted triples . Additionally, the objectList production now accepts an optional annotation after each object.

Issue 9 : Include Annotation syntax in Turtle* and SPARQL* concrete-syntax sparql*

This has already been discussed on the mailing list .

The idea would be to have a notation like

:bob :age 42 {| :source <http://example.org/~bob/> |}.

as shortcut for

:bob :age 42.
<< :bob :age 42 >> :source <http://example.org/~bob/>.
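
The intended expansion can be sketched as follows. This is an informal Python illustration under an assumed encoding (terms as strings, triples as nested 3-tuples); the function expand_annotation and its parameters are hypothetical and do not correspond to any parser API.

```python
def expand_annotation(subject, predicate, obj, annotations):
    """Expand 's p o {| p1 o1 ; ... |}' into the asserted triple plus one
    triple about it per annotation pair."""
    base = (subject, predicate, obj)
    triples = [base]                      # the annotated triple is asserted
    for ann_predicate, ann_object in annotations:
        triples.append((base, ann_predicate, ann_object))
    return triples

triples = expand_annotation(
    ":bob", ":age", "42",
    [(":source", "<http://example.org/~bob/>")],
)
```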

3.2 Parsing

A Turtle* parser is similar to a Turtle parser as defined in Section 7 of the Turtle specification  [ TURTLE ], with an additional item in its state :

Additionally, the curSubject can be bound to any RDF* term (including an embedded triple ).

A Turtle* document defines an RDF* graph composed of a set of RDF* triples . The subject and embSubject productions set the curSubject . The verb production sets the curPredicate . The object and embObject productions set the curObject . Finishing the object production, an RDF* triple curSubject curPredicate curObject is generated and added to the RDF* graph .

Beginning the embTriple production records the curSubject and curPredicate . Finishing the embTriple production yields the RDF* triple curSubject curPredicate curObject and restores the recorded values of curSubject and curPredicate .

Beginning the annotation production records the curSubject and curPredicate , and sets the curSubject to the RDF* triple curSubject curPredicate curObject . Finishing the annotation production restores the recorded values of curSubject and curPredicate .

All other productions MUST be handled as specified by Section 7 of the Turtle specification  [ TURTLE ], while still applying the changes above recursively.

3.3 Other Concrete Syntaxes

This section is non-normative.

While this document specifies only one concrete syntax , nothing prevents other concrete syntaxes of RDF* from being proposed. For instance, other existing concrete syntaxes for RDF, such as RDF/XML [ RDF-SYNTAX-GRAMMAR ], could be extended to support RDF*. Likewise, since the N-Triples syntax [ N-TRIPLES ] is a subset of Turtle, an appropriate subset of Turtle* could be defined to extend N-Triples accordingly.

4. SPARQL* Query Language

This section introduces SPARQL*, which is an RDF*-aware extension of the RDF query language SPARQL [ SPARQL11-QUERY ]; i.e., SPARQL* can be used to query RDF* graphs.

4.1 Initial Definitions

In the following, we introduce a number of SPARQL*-specific definitions, which rely on the following notions, defined in SPARQL 1.1 Query Language  [ SPARQL11-QUERY ]: RDF term , query variable , triple pattern , property path pattern , property path expression , and solution mapping .

A SPARQL* triple pattern is a 3-tuple that is defined recursively as follows:

  1. Every SPARQL triple pattern is a SPARQL* triple pattern;
  2. If t and t' are SPARQL* triple patterns, x is an RDF term or a query variable , and p is an IRI or a query variable , then ( t p x ), ( x p t ), and ( t p t' ) are SPARQL* triple patterns.

As for RDF* triples , a SPARQL* triple pattern MUST NOT contain itself.

A SPARQL* basic graph pattern  ( BGP *) is a set of SPARQL* triple patterns .

A SPARQL* property path pattern is a 3-tuple ( s , p , o ) where

Issue 7 : Property path patterns in SPARQL* sparql*

I have added the definition of a SPARQL* property path pattern into the draft just for the sake of having such a definition. We need to think about whether it is useful to add this to SPARQL*, in which case we need to define the semantics of such SPARQL* property path patterns.

In fact, no matter what we decide, even for standard property path patterns , the semantics may have to be extended to use them over RDF* graphs .

A SPARQL* solution mapping  μ is a partial function from the set of all query variables to the set of all RDF* terms . The domain of μ, denoted by dom(μ), is the set of query variables for which μ is defined.

Note

The notion of a SPARQL* solution mapping extends the notion of a standard SPARQL solution mapping ; that is, every SPARQL solution mapping is a SPARQL* solution mapping . However, in contrast to SPARQL solution mappings , SPARQL* solution mappings may map variables also to RDF* triples .

All notions related to SPARQL solution mappings carry over naturally to SPARQL* solution mappings. In particular, the definition of compatibility extends naturally to SPARQL* solution mappings: two SPARQL* solution mappings μ₁ and μ₂ are compatible if, for every variable v that is both in dom(μ₁) and in dom(μ₂), μ₁(v) and μ₂(v) are the same RDF* term . In this case, μ₁ ∪ μ₂ is also a SPARQL* solution mapping. Moreover, for any SPARQL* solution mapping μ we write card[Ω](μ) to denote the cardinality of μ in a multiset Ω of such mappings. Finally, given a BGP *   B and a SPARQL* solution mapping μ, we write μ( B ) to denote the result of replacing every variable  v in B for which μ is defined with μ(v).
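
The following Python sketch illustrates compatibility, the union of compatible mappings, and the application of a mapping to a pattern. It is an informal illustration under assumed conventions: a solution mapping is a dict from variable names (strings starting with "?") to terms, other terms are strings, and triples or triple patterns are nested 3-tuples. The function names are hypothetical.

```python
def compatible(mu1, mu2):
    """True if the mappings agree on every variable they share."""
    return all(mu1[v] == mu2[v] for v in mu1.keys() & mu2.keys())

def union(mu1, mu2):
    """Return the union of two compatible mappings."""
    if not compatible(mu1, mu2):
        raise ValueError("solution mappings are not compatible")
    return {**mu1, **mu2}

def substitute(mu, term):
    """Replace every bound variable in term, descending into nested patterns."""
    if isinstance(term, tuple):
        return tuple(substitute(mu, c) for c in term)
    return mu.get(term, term)

# A variable may be bound to an RDF* triple.
mu1 = {"?t": (":bob", ":age", "42"), "?x": ":a"}
mu2 = {"?t": (":bob", ":age", "42"), "?y": ":b"}
mu3 = {"?t": (":bob", ":age", "43")}

merged = union(mu1, mu2)
pattern = (("?s", ":age", "?o"), ":source", "?src")
applied = substitute({"?s": ":bob"}, pattern)
```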

Next, we aim to carry over the notion of solutions for BGPs to BGP * . To this end, we first define an auxiliary concept that carries over the notion of an RDF instance mapping  [ RDF11-MT ] to RDF*.

An RDF* instance mapping  σ is a partial function from the set of all blank nodes to the set of all RDF* terms . The domain of σ, denoted by dom(σ), is the set of blank nodes for which σ is defined.

Similar to the corresponding notation for solution mappings, for an RDF* instance mapping σ and a BGP *   B we write σ( B ) to denote the result of replacing every blank node  b in B for which σ is defined with σ(b).

Now we are ready to define the notion of solution for BGP *.

Given a BGP *   B and an RDF* graph   G , a SPARQL* solution mapping  μ is a solution for the BGP *   B over   G if it has the following two properties

4.2 Grammar

SPARQL* is defined to follow the same grammar as SPARQL, except for the EBNF productions specified below, which replace the productions having the same number (if any) in the original grammar.

[60] Bind ::= 'BIND' '(' ( Expression | EmbTP ) 'AS' Var ')'
[75] TriplesSameSubject ::= VarOrTermOrEmbTP PropertyListNotEmpty | TriplesNode PropertyList
[80] Object ::= GraphNode | EmbTP
[81] TriplesSameSubjectPath ::= VarOrTermOrEmbTP PropertyListPathNotEmpty | TriplesNode PropertyListPath
[105] GraphNodePath ::= VarOrTermOrEmbTP | TriplesNodePath
[174] EmbTP ::= '<<' EmbSubjectOrObject Verb EmbSubjectOrObject '>>'
[175] EmbSubjectOrObject ::= Var | BlankNode | iri | RDFLiteral | NumericLiteral | BooleanLiteral | EmbTP
[176] VarOrTermOrEmbTP ::= Var | GraphTerm | EmbTP

This introduces a notation for embedded triple patterns (productions [174] and following), which is similar to the one defined for embedded triples in §  3. Turtle* , but accepting also variables . These embedded triple patterns are allowed in the subject ( [75] , [81] ) and object ( [80] , [105] ) positions of SPARQL* triple patterns , as well as in BIND statements ( [60] ).

Issue 6 : FIND instead of BIND sparql*

Instead of reusing the keyword BIND for SPARQL* (as in my original proposal), we may want to consider using a different keyword for this functionality because the behavior is a bit different. For instance, @klinovp has mentioned this issue in an email on the mailing list . In another email , @afs has proposed to use the keyword FIND instead.

4.3 Translation to the Algebra

Based on the SPARQL grammar, the SPARQL specification defines the process of converting graph patterns and solution modifiers in a SPARQL query string into a SPARQL algebra expression  [ SPARQL11-QUERY, Section 18.2 ]. This process must be adjusted to consider the extended grammar introduced above . In the following, any step of the conversion process that requires adjustment is discussed.

4.3.1 Variable Scope

As a basis of the translation, the SPARQL specification introduces a notion of in-scope variables . To cover the new syntax elements introduced in §  4.2 Grammar this notion MUST be extended as follows.

4.3.2 Expand Syntax Forms

The translation process starts with expanding abbreviations for IRIs and triple patterns  [ SPARQL11-QUERY, Section 18.2.2.1 ]. This step MUST be extended in two ways:

  1. Abbreviations for triple patterns with embedded triple patterns MUST be expanded as if each embedded triple pattern was a variable (or an RDF term ).

    For instance, the following syntax expression:
    <<?c a rdfs:Class>> dct:source ?src ;
                        prov:wasDerivedFrom <<?c a owl:Class>> .
    
    must be expanded to
    <<?c a rdfs:Class>> dct:source ?src .
    <<?c a rdfs:Class>> prov:wasDerivedFrom <<?c a owl:Class>> .
    
  2. Abbreviations for IRIs in all embedded triple patterns MUST be expanded.

    For instance, the embedded triple pattern
    <<?c a rdfs:Class>>
    
    must be expanded to
    <<?c <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2000/01/rdf-schema#Class>>>
    

4.3.3 Translate Property Path Patterns

The translation of property path patterns has to be adjusted because the extended grammar allows for property path patterns whose subject or object is an embedded triple pattern (cf. §  4.2 Grammar ).

The translation as specified in the W3C specification distinguishes four cases. The first three of these cases do not require adjustment because they are taken care of either by recursion or by the adjusted translation of basic graph patterns (as defined in §  4.3.4 Translate Basic Graph Patterns below). However, the fourth case MUST be adjusted as follows.

Let X P Y be a string that corresponds to the fourth case in [ SPARQL11-QUERY, Section 18.2.2.4 ]. Given the grammar introduced in §  4.2 Grammar , X and Y may be an RDF term , a variable , or an embedded triple pattern , respectively (and P is a property path expression ). The string X P Y is translated to the algebra expression Path ( X’ , P , Y’ ) where X’ and Y’ are the result of calling a function named Lift for X and Y , respectively. For some input string Z  (such as X or Y ) that can be an RDF term , a variable , or an embedded triple pattern , the function Lift is defined as follows:

  1. If Z is an embedded triple pattern << S , P , O >> then return the SPARQL* triple pattern ( Lift ( S ), P , Lift ( O ) );
  2. Otherwise, return Z .
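
The function Lift can be sketched in Python as follows. This is an informal illustration, not a normative definition: the class EmbTP is a hypothetical stand-in for a parsed embedded triple pattern, SPARQL* triple patterns are represented as plain 3-tuples, and RDF terms and variables are strings.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EmbTP:
    """Hypothetical parse-tree node for an embedded triple pattern << s p o >>."""
    s: object
    p: object
    o: object

def lift(z):
    """Turn embedded triple patterns into (nested) SPARQL* triple patterns."""
    if isinstance(z, EmbTP):
        # Case 1: lift the subject and the object recursively; the predicate
        # is an IRI or a variable and is kept as is.
        return (lift(z.s), z.p, lift(z.o))
    # Case 2: RDF terms and variables are returned unchanged.
    return z

nested = EmbTP(EmbTP("?c", "rdf:type", "rdfs:Class"), "dct:source", "?src")
lifted = lift(nested)
```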

4.3.4 Translate Basic Graph Patterns

After translating property path patterns , the translation process collects any adjacent triple patterns [...] to form a basic graph pattern  [ SPARQL11-QUERY, Section 18.2.2.5 ]. This step has to be adjusted because triple patterns in the extended syntax may have an embedded triple pattern in their subject position or in their object position (or in both). To ensure that every result of this step is a BGP * , before adding a triple pattern to its corresponding collection, its subject and object MUST be replaced by the result of calling function Lift for the subject and the object, respectively.

4.3.5 Translate BIND Clauses with an Embedded Triple Pattern

The extended grammar in §  4.2 Grammar allows for BIND clauses with an embedded triple pattern . The translation of such a BIND clause to a SPARQL algebra expression requires a new algebra symbol:

Then, any string of the form BIND( T AS v ) with T being an embedded triple pattern (i.e., not a standard BIND expression) is translated to the algebra expression TR ( T’ , v ) where T’ is the result of the function Lift for T .

Note that the translation of BIND clauses with an embedded triple pattern as defined in this section is used during the translation of group graph patterns . The case of BIND clauses with an embedded triple pattern is covered in this translation of group graph patterns by the last, “catch all other” IF statement (i.e., the IF statement with the condition E is any other form ) and not by the IF statement for BIND clauses with an expression.

4.4 Evaluation Semantics

The SPARQL specification defines a function eval( D ( G ), algebra expression) as the evaluation of an algebra expression with respect to a dataset D having active graph G  [ SPARQL11-QUERY, Section 18.6 ]. Recall that the dataset D in the context of SPARQL* is an RDF* dataset and, thus, the active graph G is an RDF* graph , and so is any other graph in dataset D . The definition of the eval function is recursive; the two base cases of this definition for SPARQL* are given as follows:

  1. For every BGP * B , eval( D ( G ),  B ) is a multiset Ω that consists of all SPARQL* solution mappings that are a solution for the BGP *   B over  G . For every such mapping μ, card[Ω](μ) is the number of distinct RDF* instance mappings  σ such that dom(σ) is equivalent to the set of blank nodes in  B and μ(σ( B )) is a subgraph of G . (For any SPARQL* solution mapping μ' that is not a solution for  B over  G , we have that card[Ω](μ')=0; i.e., μ' is not in Ω.)
  2. For any algebra expression E of the form TR( tp , ?v ) where tp is a SPARQL* triple pattern and ?v is a variable (as introduced in §  4.3.5 Translate BIND Clauses with an Embedded Triple Pattern ), eval( D ( G ),  E ) is a multiset Ω that consists of as many SPARQL* solution mappings as there are solution mappings in Ω', where Ω'=eval( D ( G ),{ tp }), such that for every μ' in Ω' there exists a μ in Ω that has the following four properties:
    1. dom(μ) = dom(μ') ∪ { ?v }
    2. μ and μ' are compatible
    3. μ( ?v ) = μ'( tp )
    4. card[Ω](μ) = card[Ω'](μ')
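
The second base case can be illustrated with a short Python sketch. It is informal and rests on several assumptions: a multiset of solution mappings is modelled as a list of dicts (duplicates represent cardinality), variables are strings starting with "?", triple patterns are nested 3-tuples, Ω' has already been computed, and ?v does not occur in tp. The names substitute and eval_tr are hypothetical.

```python
def substitute(mu, term):
    """Compute mu(term): replace bound variables, descending into nested patterns."""
    if isinstance(term, tuple):
        return tuple(substitute(mu, c) for c in term)
    return mu.get(term, term)

def eval_tr(omega_prime, tp, var):
    """Extend every mapping mu' of omega_prime with var bound to mu'(tp)."""
    omega = []
    for mu_prime in omega_prime:      # duplicates are kept, so cardinalities carry over
        mu = dict(mu_prime)           # mu agrees with mu' on dom(mu')
        mu[var] = substitute(mu_prime, tp)
        omega.append(mu)
    return omega

tp = ("?s", ":age", "?o")
omega_prime = [
    {"?s": ":bob", "?o": "42"},
    {"?s": ":bob", "?o": "42"},
    {"?s": ":eve", "?o": "7"},
]
omega = eval_tr(omega_prime, tp, "?t")
```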
Issue 8 : Add evaluation semantics for BGP* and BIND*/FIND into the draft action sparql*

Currently, the Evaluation Semantics section of the draft is just a copy from the corresponding subsection of the original tech report . The actual definition of the formal semantics of a BGP* and of the SPARQL* version of BIND is in a different part of that tech report (where it defines ⟦B⟧_G and ⟦(tp AS ?v)⟧_G). These definitions still need to be adapted and moved into the Evaluation Semantics section of the draft.

For any other algebra expression, the SPARQL specification defines algebra operators [ SPARQL11-QUERY ]. These definitions can be extended naturally to operate over multisets of SPARQL* solution mappings (instead of ordinary solution mappings ). Given this extension, the recursive steps of the definition of the eval function for SPARQL* are the same as in the SPARQL specification.

4.5 Query Result Formats

TODO: brief introduction paragraph (including a note that result of a CONSTRUCT query or a DESCRIBE query can be serialized using Turtle* )

4.5.1 SPARQL* Query Results JSON Format

Issue 13 : Add a standardized extension of SPARQL 1.1 Query Results JSON format help wanted sparql*

(related issue: #12 )

The result of a SPARQL SELECT query is serialized in JSON using the SPARQL 1.1 Query Results JSON format . This format will need to be extended to deal with the RDF* triple being a new possible value type for a binding. For example, the result of a query where variable ?a is bound to an RDF* triple:

?a | ?b | ?c
<<<http://example.org/bob> <http://xmlns.com/foaf/0.1/age> 23>> | <http://example.org/certainty> | 0.9

Currently, different implementations all have their own, slightly diverging, extensions. For example, in Eclipse RDF4J , the extension looks as follows:

{
  "head" : {
    "vars" : [
      "a",
      "b",
      "c"
    ]
  },
  "results" : {
    "bindings": [
      { "a" : {
          "type" : "triple",
          "value" : {
            "s" : {
              "type" : "uri",
              "value" : "http://example.org/bob"
            },
            "p" : {
              "type" : "uri",
              "value" : "http://xmlns.com/foaf/0.1/age"
            },
            "o" : {
              "datatype" : "http://www.w3.org/2001/XMLSchema#integer",
              "type" : "literal",
              "value" : "23"
            }
          }
        },
        "b": {
          "type": "uri",
          "value": "http://example.org/certainty"
        },
        "c" : {
          "datatype" : "http://www.w3.org/2001/XMLSchema#decimal",
          "type" : "literal",
          "value" : "0.9"
        }
      }
    ]
  }
}

In Apache Jena , the extension looks as follows:

{
  "head" : {
    "vars" : [
      "a",
      "b",
      "c"
    ]
  },
  "results" : {
    "bindings": [
      { "a" : {
          "type" : "triple",
          "value" : {
            "subject" : {
              "type" : "uri",
              "value" : "http://example.org/bob"
            },
            "property" : {
              "type" : "uri",
              "value" : "http://xmlns.com/foaf/0.1/age"
            },
            "object" : {
              "datatype" : "http://www.w3.org/2001/XMLSchema#integer",
              "type" : "literal",
              "value" : "23"
            }
          }
        },
        "b": { 
          "type": "uri",
          "value": "http://example.org/certainty"
        },
        "c" : {
          "datatype" : "http://www.w3.org/2001/XMLSchema#decimal",
          "type" : "literal",
          "value" : "0.9"
        }
      }
    ]
  }
}

In Stardog , the format extension as currently implemented is as follows:

{
  "head" : {
    "vars" : [
      "a",
      "b",
      "c"
    ]
  },
  "results" : {
    "bindings": [
      { "a" : {
          "type" : "statement",
          "s" : {
            "type" : "uri",
            "value" : "http://example.org/bob"
          },
          "p" : {
            "type" : "uri",
            "value" : "http://xmlns.com/foaf/0.1/age"
          },
          "o" : {
            "datatype" : "http://www.w3.org/2001/XMLSchema#integer",
            "type" : "literal",
            "value" : "23"
          }
        },
        "b": { 
          "type": "uri",
          "value": "http://example.org/certainty"
        },
        "c" : {
          "datatype" : "http://www.w3.org/2001/XMLSchema#decimal",
          "type" : "literal",
          "value" : "0.9"
        }
      }
    ]
  }
}

In summary, Jena and RDF4J differ only by the names of the keys inside the new RDF* triple type ( s vs subject , p vs property , etc). Stardog deviates slightly more in that it does not wrap the individual components of the RDF* triple into a value .

Other implementations may use yet other, slightly different variants. This makes it difficult to process query results from different endpoint implementations. A single recommended extension would benefit parser implementors and users alike.
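
To illustrate the cost of this divergence, the following Python sketch shows the kind of normalization a client currently has to perform. It handles only the three variants shown above; the function name normalize_term and the output key names ("subject", "predicate", "object") are arbitrary choices made for this sketch, not a proposed standard.

```python
def normalize_term(term):
    """Map the RDF4J, Jena and Stardog encodings shown above to one shape."""
    kind = term.get("type")
    if kind == "triple":        # RDF4J / Jena: components wrapped in "value"
        parts = term["value"]
    elif kind == "statement":   # Stardog: components directly on the binding
        parts = term
    else:
        return term             # uri, literal, bnode: standard SPARQL 1.1 JSON

    def pick(*keys):
        for key in keys:
            if key in parts:
                return normalize_term(parts[key])  # components may be triples too
        raise KeyError(f"none of {keys} found in triple term")

    return {
        "type": "triple",
        "value": {
            "subject": pick("subject", "s"),
            "predicate": pick("property", "predicate", "p"),
            "object": pick("object", "o"),
        },
    }

bob = {"type": "uri", "value": "http://example.org/bob"}
age = {"type": "uri", "value": "http://xmlns.com/foaf/0.1/age"}
num = {"type": "literal", "value": "23",
       "datatype": "http://www.w3.org/2001/XMLSchema#integer"}

rdf4j = {"type": "triple", "value": {"s": bob, "p": age, "o": num}}
jena = {"type": "triple", "value": {"subject": bob, "property": age, "object": num}}
stardog = {"type": "statement", "s": bob, "p": age, "o": num}

normalized = [normalize_term(t) for t in (rdf4j, jena, stardog)]
```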

4.5.2 SPARQL* Query Results XML Format

Issue 12 : Add a standardized extension of SPARQL Query Results XML format help wanted sparql*

The result of a SPARQL SELECT query is serialized in XML using the SPARQL Query Results XML format . This format will need to be extended to deal with the RDF* triple being a new possible value type for a binding. For example, the result of a query where variable ?a is bound to an RDF* triple:

?a | ?b | ?c
<<<http://example.org/bob> <http://xmlns.com/foaf/0.1/age> 23>> | <http://example.org/certainty> | 0.9

Currently, different implementations all have their own, slightly diverging, extensions. For example, in Eclipse RDF4J , the extension looks as follows:

<?xml version='1.0' encoding='UTF-8'?>
<sparql xmlns='http://www.w3.org/2005/sparql-results#'>
	<head>
		<variable name='a'/>
		<variable name='b'/>
		<variable name='c'/>
	</head>
	<results>
		<result>
			<binding name='a'>
				<triple>
					<subject>
						<uri>http://example.org/bob</uri>
					</subject>
					<predicate>
						<uri>http://xmlns.com/foaf/0.1/age</uri>
					</predicate>
					<object>
						<literal datatype='http://www.w3.org/2001/XMLSchema#integer'>23</literal>
					</object>
				</triple>
			</binding>
			<binding name='b'>
				<uri>http://example.org/certainty</uri>
			</binding>
			<binding name='c'>
				<literal datatype='http://www.w3.org/2001/XMLSchema#decimal'>0.9</literal>
			</binding>
		</result>
	</results>

</sparql>


In Apache Jena , the extension is almost identical, except for the choice to name the middle element property (where RDF4J uses predicate ):

<?xml version='1.0' encoding='UTF-8'?>
<sparql xmlns='http://www.w3.org/2005/sparql-results#'>
	<head>
		<variable name='a'/>
		<variable name='b'/>
		<variable name='c'/>
	</head>
	<results>
		<result>
			<binding name='a'>
				<triple>
					<subject>
						<uri>http://example.org/bob</uri>
					</subject>
					<property>
						<uri>http://xmlns.com/foaf/0.1/age</uri>
					</property>
					<object>
						<literal datatype='http://www.w3.org/2001/XMLSchema#integer'>23</literal>
					</object>
				</triple>
			</binding>
			<binding name='b'>
				<uri>http://example.org/certainty</uri>
			</binding>
			<binding name='c'>
				<literal datatype='http://www.w3.org/2001/XMLSchema#decimal'>0.9</literal>
			</binding>
		</result>
	</results>

</sparql>


In Stardog , the implemented extension currently looks as follows:

<?xml version='1.0' encoding='UTF-8'?>
<sparql xmlns='http://www.w3.org/2005/sparql-results#'>
	<head>
		<variable name='a'/>
		<variable name='b'/>
		<variable name='c'/>
	</head>
	<results>
		<result>
			<binding name='a'>
				<statement>
					<s>
						<uri>http://example.org/bob</uri>
					</s>
					<p>
						<uri>http://xmlns.com/foaf/0.1/age</uri>
					</p>
					<o>
						<literal datatype='http://www.w3.org/2001/XMLSchema#integer'>23</literal>
					</o>
				</statement>
			</binding>
			<binding name='b'>
				<uri>http://example.org/certainty</uri>
			</binding>
			<binding name='c'>
				<literal datatype='http://www.w3.org/2001/XMLSchema#decimal'>0.9</literal>
			</binding>
		</result>
	</results>

</sparql>


In summary, RDF4J, Jena and Stardog all differ in the element names used ( triple vs statement , property vs predicate , subject vs s , etc).

Other implementations may use yet other, slightly different variants. This makes it difficult to process query results from different endpoint implementations. A single recommended extension would benefit parser implementors and users alike.
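
As with the JSON format, a client that wants to consume all three variants has to normalize them itself. The Python sketch below does so with the standard library's xml.etree.ElementTree. It covers only the element names shown above; the ROLES table, the function names and the dict-based output shape are assumptions of this sketch, and the sample document is abbreviated to a single binding.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.w3.org/2005/sparql-results#}"
# Element names seen in the three variants, mapped to one role name each.
ROLES = {"subject": "s", "s": "s",
         "predicate": "p", "property": "p", "p": "p",
         "object": "o", "o": "o"}

def local(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]

def parse_term(elem):
    name = local(elem.tag)
    if name in ("triple", "statement"):
        term = {"type": "triple"}
        for child in elem:  # <subject>/<s>, <predicate>/<property>/<p>, <object>/<o>
            term[ROLES[local(child.tag)]] = parse_term(child[0])
        return term
    term = {"type": name, "value": elem.text}
    if elem.get("datatype"):
        term["datatype"] = elem.get("datatype")
    return term

def parse_results(xml_text):
    root = ET.fromstring(xml_text)
    return [{b.get("name"): parse_term(b[0]) for b in result}
            for result in root.iter(NS + "result")]

TEMPLATE = """<sparql xmlns='http://www.w3.org/2005/sparql-results#'>
  <results><result><binding name='a'>
    <{t}>
      <{s}><uri>http://example.org/bob</uri></{s}>
      <{p}><uri>http://xmlns.com/foaf/0.1/age</uri></{p}>
      <{o}><literal datatype='http://www.w3.org/2001/XMLSchema#integer'>23</literal></{o}>
    </{t}>
  </binding></result></results>
</sparql>"""

rdf4j = parse_results(TEMPLATE.format(t="triple", s="subject", p="predicate", o="object"))
jena = parse_results(TEMPLATE.format(t="triple", s="subject", p="property", o="object"))
stardog = parse_results(TEMPLATE.format(t="statement", s="s", p="p", o="o"))
```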

5. SPARQL* Update

Issue 14 : Section that defines SPARQL* Update sparql*

We need a section that defines SPARQL* Update. The text for this section can be taken from the following document: https://blog.liu.se/olafhartig/documents/sparql-update/

6. RDF* Semantics

In this section, we provide a model-theoretic semantics for RDF*, by extending the one defined in RDF 1.1 Semantics  [ RDF11-MT ].

6.1 Definitions

An RDF* triple is said to be ground if it has no blank node in its constituent terms . An RDF* graph is ground if all its triples are ground . This definition generalizes the notion of ground RDF graph . IRIs , literals and ground RDF* triples are collectively known as ground RDF* terms .
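
A groundness check can be sketched in a few lines of Python. The sketch is informal and assumes a hypothetical encoding: blank nodes are strings starting with "_:", other atomic terms are strings, RDF* triples are nested 3-tuples, and a graph is a set of such tuples.

```python
def is_ground(term):
    """True if term is an IRI, a literal, or a triple without blank nodes."""
    if isinstance(term, tuple):
        return all(is_ground(component) for component in term)
    return not term.startswith("_:")

def is_ground_graph(graph):
    return all(is_ground(triple) for triple in graph)

ground_triple = ((":alice", ":name", '"Alice"'), ":statedBy", ":bob")
open_triple = (("_:a", ":name", '"Alice"'), ":statedBy", ":bob")
```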

An RDF* simple interpretation I is a structure consisting of:

  1. A non-empty set IR of resources, called the domain or universe of I .
  2. A set IP , called the set of properties of I .
  3. A mapping IEXT from IP into the powerset of IR × IR i.e. the set of sets of pairs ( x , y ) with x and y in IR .
  4. A mapping IS from IRIs into IR ∪ IP .
  5. A partial mapping IL from literals into IR .
  6. A partial mapping IT from ground RDF* triples into IR , such that for any two triples ( s1 , p1 , o1 ) and ( s2 , p2 , o2 ), IT ( s1 , p1 , o1 )= IT ( s2 , p2 , o2 ) implies IG ( s1 )= IG ( s2 ), IG ( p1 )= IG ( p2 ) and IG ( o1 )= IG ( o2 ), where IG = IS ∪ IL ∪ IT .

This definition is identical to the definition of simple interpretation  [ RDF11-MT ] up to and including item 5. Item 6 extends it to support RDF* triples . Any RDF simple interpretation can be considered as an RDF* simple interpretation with IT =∅.

6.1.1 Semantic condition for ground graphs

The denotation of a ground RDF* graph in an RDF* simple interpretation I is then given by the following rules, where the interpretation is also treated as a function from expressions (terms, triples and graphs) to elements of the universe and truth values:

Since IL and IT are partial mappings, I ( E ) may be undefined for some literal or triple E . In that case, E has no semantic value in I , so any asserted triple having E as subject or object will fail to satisfy the condition above; hence any graph containing such an asserted triple will be false.

Note
In the original condition for simple entailment  [ RDF11-MT ], the denotation of an RDF triple is a boolean, while above we define the denotation of an RDF* triple to be an element of the universe. However, the denotation of any RDF graph is always the same under the original condition and the condition above. Therefore, the condition above can be considered as an extension of the original one to support any RDF* graph .

6.1.2 Semantic condition with blank nodes

Given an RDF* graph E , we call the embedded blank nodes ( ebn ) of E the set of blank nodes appearing in subject or object position of some embedded triple in E ; we call the open blank nodes ( obn ) of E all the other blank nodes appearing in E .
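
The partition of blank nodes into embedded and open ones can be sketched in Python. This is an informal illustration under an assumed encoding (blank nodes are strings starting with "_:", RDF* triples are nested 3-tuples, a graph is a set of tuples); the function names mirror the abbreviations ebn and obn but are otherwise hypothetical.

```python
def is_blank(term):
    return isinstance(term, str) and term.startswith("_:")

def all_blank_nodes(graph):
    found = set()

    def visit(triple):
        for component in triple:
            if isinstance(component, tuple):
                visit(component)
            elif is_blank(component):
                found.add(component)

    for t in graph:
        visit(t)
    return found

def ebn(graph):
    """Blank nodes in subject or object position of some embedded triple."""
    found = set()

    def visit(triple, embedded):
        s, _, o = triple
        for component in (s, o):
            if isinstance(component, tuple):
                visit(component, True)
            elif embedded and is_blank(component):
                found.add(component)

    for t in graph:
        visit(t, False)
    return found

def obn(graph):
    """All blank nodes of the graph that are not embedded blank nodes."""
    return all_blank_nodes(graph) - ebn(graph)

graph = {
    (("_:a", ":name", '"Alice"'), ":statedBy", "_:b"),
    ("_:c", ":knows", "_:b"),
}
```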

A mapping from a set of blank nodes into a set of ground RDF* terms is called a grounding function . We define the extended application of a grounding function   Γ to other RDF* terms and to RDF* graphs as follows:

  • if E is a blank node then Γ *( E ) = Γ ( E ) if it is defined, E otherwise
  • if E is an IRI or a literal , then Γ *( E ) = E
  • if E is an RDF* triple ( s , p , o ), then Γ *( E ) = ( Γ *( s ), Γ *( p ), Γ *( o ))
  • if E is an RDF* graph , then Γ *( E ) = { Γ *( t ) for all t ∈ E }
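
The four cases of the extended application can be sketched in Python as follows. The sketch is informal and assumes a hypothetical encoding: blank nodes are strings starting with "_:", other atomic terms are strings, RDF* triples are nested 3-tuples, an RDF* graph is a set of tuples, and the grounding function Γ is a dict. The name extended_apply is illustrative only.

```python
def is_blank(term):
    return isinstance(term, str) and term.startswith("_:")

def extended_apply(gamma, e):
    """Apply the grounding function gamma (a dict) to a term, triple or graph."""
    if isinstance(e, (set, frozenset)):      # RDF* graph: apply to every triple
        return {extended_apply(gamma, t) for t in e}
    if isinstance(e, tuple):                 # RDF* triple: apply component-wise
        return tuple(extended_apply(gamma, c) for c in e)
    if is_blank(e):                          # blank node: mapped if gamma is defined
        return gamma.get(e, e)
    return e                                 # IRI or literal: unchanged

gamma = {"_:a": ":alice"}
graph = {(("_:a", ":name", '"Alice"'), ":statedBy", "_:b")}
grounded = extended_apply(gamma, graph)
```

In this example _:a is replaced inside the embedded triple, while _:b, for which the grounding function is undefined, is left in place.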

Suppose I is an RDF* simple interpretation and A is a mapping from a set of blank nodes to the universe IR of I . Define the mapping [ I + A ] of RDF* terms into IR to be A on blank nodes of the set, and I on any other term; and extend this mapping to RDF* triples and RDF* graphs using the rules given above for ground graphs . Then the denotation of any RDF* graph in I is given by:

Note
For the special case of a standard RDF graph and an RDF simple interpretation , the condition above is equivalent to the one defined in RDF 1.1 Semantics , which only requires a mapping A , but no grounding function . Indeed, since RDF graphs have no embedded triples , the empty mapping is the only possible choice for Γ , and Γ * maps the graph to itself.

6.1.3 Entailment

Following RDF 1.1 Semantics , we extend the notions of satisfiability and entailment. An RDF* simple interpretation I satisfies E when I ( E )=true. E is (simply) satisfiable when an RDF* simple interpretation exists which satisfies it, otherwise (simply) unsatisfiable . An RDF* graph G simply entails an RDF* graph H when every interpretation which satisfies G also satisfies H . If two RDF* graphs G and H each entail the other then they are logically equivalent .

Any semantic extension of RDF MAY be extended to RDF* by replacing the semantic conditions, the notion of satisfiability and the notion of entailment, defined in RDF 1.1 Semantics , by their corresponding extension defined above. This is notably the case for Datatype entailment and RDFS entailment .

6.2 Design Rationale

This section is non-normative.

In this section, we discuss a number of desired features of RDF* semantics in order to shed light on the design choices made in the previous section.

6.2.1 Quoting without asserting

RDF* must be able to quote a triple without asserting it, so that we can represent people's beliefs or claims without endorsing them, or represent facts that are no longer or not yet true. This is ensured by the fact that only asserted triples are considered to determine if the denotation of a graph is true or false.

For example, the following graph:
<< :alice foaf:knows :bob >> dc:creator :alice.
does not entail :alice foaf:knows :bob , and the SPARQL* query below executed against the graph above would return no result.
SELECT ?who {
  :alice foaf:knows ?who
}
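
The behaviour described above can be mimicked with a small, non-normative Python sketch. It assumes a toy data model that is not part of this specification: terms are strings, triples are 3-tuples, and a graph is the set of its asserted triples, so an embedded triple only ever appears nested inside another tuple. The helper match_asserted is hypothetical, and None stands for a variable.

```python
def match_asserted(graph, pattern):
    """Return the asserted triples matching `pattern` (None is a wildcard)."""
    return [
        triple
        for triple in graph
        if all(q is None or q == v for q, v in zip(pattern, triple))
    ]


# << :alice foaf:knows :bob >> dc:creator :alice.
graph = {
    ((":alice", "foaf:knows", ":bob"), "dc:creator", ":alice"),
}

# :alice foaf:knows ?who
results = match_asserted(graph, (":alice", "foaf:knows", None))
```

Here results is empty: the triple about Alice knowing Bob occurs only as the subject of the dc:creator triple, never as an element of the graph itself.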

6.2.2 Referential opacity

Embedded triples are referentially opaque, meaning that triples using different terms can be considered different, even if their terms can be inferred to be synonyms. Although RDF* simple entailment has no means to entail any kind of synonymy, it is possible in some semantic extensions , such as OWL [ OWL2-RDF-BASED-SEMANTICS ].

A well-known example is the superman problem :

:loisLane :believes << :superman :can :fly >>.
:superman owl:sameAs :clarkKent.
:superman :can :fly.

Intuitively: this graph states that Superman and Clark Kent are the same person, so if Superman can fly, then it follows that Clark Kent can as well. So, under OWL2-entailment, this graph entails :clarkKent :can :fly . However, Lois Lane does not know that Superman and Clark Kent are the same person. So from her point of view, the two triples are not equivalent, and she can believe one without believing the other.

Referential opacity is ensured by differentiating the intension of embedded triples (represented by the IT mapping) from their extension (the denotations of their subject, predicate and object). Since IT is based solely on the syntax of triples, two syntactically different triples can always have different intensions, even if their subjects, predicates and objects are semantically equivalent.

On the other hand, all triples with the same intension are required to have the same extension. So if two RDF* triples denote the same resource T, their subjects, predicates and objects, respectively, are constrained to also denote the same thing.
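
As a rough, non-normative illustration of the superman example, the Python sketch below copies asserted triples across owl:sameAs pairs but never rewrites terms inside embedded triples. This is a deliberately naive stand-in, not an implementation of OWL 2 entailment; the tuple-based data model and the name same_as_closure are assumptions of the sketch.

```python
def same_as_closure(graph):
    """Naively propagate owl:sameAs over top-level terms of asserted triples."""
    pairs = {(s, o) for s, p, o in graph if p == "owl:sameAs"}
    pairs |= {(o, s) for s, o in pairs}  # treat sameAs as symmetric
    result = set(graph)
    changed = True
    while changed:
        changed = False
        for a, b in pairs:
            for triple in list(result):
                # Embedded triples are tuples, so `x == a` never matches
                # inside them: they are left syntactically untouched.
                new = tuple(b if x == a else x for x in triple)
                if new not in result:
                    result.add(new)
                    changed = True
    return result


graph = {
    (":loisLane", ":believes", (":superman", ":can", ":fly")),
    (":superman", "owl:sameAs", ":clarkKent"),
    (":superman", ":can", ":fly"),
}
closure = same_as_closure(graph)
```

In this sketch the closure contains :clarkKent :can :fly , but Lois Lane's belief still refers only to << :superman :can :fly >> .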

6.2.3 Blank node scope

Blank nodes in embedded triples have the same scope as blank nodes used in the subject or object position of asserted triples (usually the whole graph or the whole dataset in which they appear). This means that the same blank node identifier used in different embedded triples , or at different levels of nesting, will refer to the same thing.

For example, in the following graph:

:alice :knows _:x.
<< _:x :name "Bob" >> dc:creator :alice.
<< _:x :workingFor :acme >> dc:creator :alice.

the three occurrences of _:x must refer to the same resource in every interpretation of the graph. In other words, it must be the same resource that Alice knows, that she claims is named "Bob", and that she claims works for ACME.

As a consequence, the following query will return "Bob" :

SELECT ?name {
  :alice :knows ?x.
  << ?x :name ?name >> dc:creator :alice.
  << ?x :workingFor :acme >> dc:creator :alice.
}
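
The way a single binding of ?x is shared between asserted and embedded positions can be sketched as follows. This is a minimal, non-normative Python illustration, not the SPARQL* evaluation semantics: it assumes triples are 3-tuples, variables are strings starting with "?", and blank nodes are matched by their label within one graph. The functions unify and evaluate are hypothetical.

```python
def is_var(term):
    return isinstance(term, str) and term.startswith("?")


def unify(pattern, term, binding):
    """Match a pattern term against a graph term; return a binding or None."""
    if is_var(pattern):
        if pattern in binding:
            return binding if binding[pattern] == term else None
        return {**binding, pattern: term}
    if isinstance(pattern, tuple):
        if not isinstance(term, tuple):
            return None
        # Recurse into the (possibly embedded) triple, threading the binding.
        for p, t in zip(pattern, term):
            binding = unify(p, t, binding)
            if binding is None:
                return None
        return binding
    return binding if pattern == term else None


def evaluate(patterns, graph):
    """Return every binding that satisfies all triple patterns."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in graph:
                candidate = unify(pattern, triple, binding)
                if candidate is not None:
                    extended.append(candidate)
        bindings = extended
    return bindings


graph = {
    (":alice", ":knows", "_:x"),
    (("_:x", ":name", '"Bob"'), "dc:creator", ":alice"),
    (("_:x", ":workingFor", ":acme"), "dc:creator", ":alice"),
}
query = [
    (":alice", ":knows", "?x"),
    (("?x", ":name", "?name"), "dc:creator", ":alice"),
    (("?x", ":workingFor", ":acme"), "dc:creator", ":alice"),
]
names = [b["?name"] for b in evaluate(query, graph)]
```

With this data, names holds the single literal "Bob", because the same blank node satisfies ?x in all three patterns.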

As another consequence, the following graph does not entail the graph above (because the graph below allows the resource known by Alice to be different from the one about which she makes claims).

:alice :knows _:y.
<< _:x :name "Bob" >> dc:creator :alice.
<< _:x :workingFor :acme >> dc:creator :alice.

Formally, the second graph is satisfied by an interpretation having:

  • IS : :alice →A, :knows →K, dc:creator →C, :bob →B
  • IT : << :bob :name "Bob" >> →T1, << :bob :workingFor :acme >> →T2
  • IEXT (K) = {(A,Y)}
  • IEXT (C) = {(T1, A), (T2, A)}

with Γ: _:x → :bob , and A: _:y → Y. But this interpretation cannot satisfy the first graph , because it would require a grounding function Γ' such that

  • I(Γ'( _:x )) = Y in order to satisfy the first triple,
  • Γ'( _:x ) = :bob in order to satisfy the second and third triple,
  • and both cannot be true at the same time in the interpretation above.

6.2.4 Interpolation lemma

The interpolation lemma  [ RDF11-MT ] states that an RDF graph G simply entails an RDF graph E if and only if a subgraph of G is an instance of E. Intuitively, this means that all graphs simply entailed by G can be constructed by:

  • removing triples,
  • replacing a term with a fresh blank node,
  • splitting a blank node into two (or more) fresh blank nodes.
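
For standard RDF graphs (without embedded triples), the lemma suggests a brute-force entailment check: look for a mapping of the blank nodes of E to terms of G that turns E into a subgraph of G. The Python sketch below is non-normative and makes assumptions of its own: terms are strings, blank nodes start with "_:", graphs are sets of 3-tuples, and the name simply_entails is hypothetical. It says nothing about RDF* graphs.

```python
from itertools import product


def is_blank(term):
    return term.startswith("_:")


def simply_entails(g, e):
    """True if some subgraph of g is an instance of e (standard RDF only)."""
    blanks = sorted({t for triple in e for t in triple if is_blank(t)})
    terms = sorted({t for triple in g for t in triple})
    # Try every assignment of the blank nodes of e to terms occurring in g.
    for choice in product(terms, repeat=len(blanks)):
        mapping = dict(zip(blanks, choice))
        instance = {tuple(mapping.get(t, t) for t in triple) for triple in e}
        if instance <= g:
            return True
    return False


g = {
    (":alice", ":knows", ":bob"),
    (":bob", ":name", '"Bob"'),
}

removed = {(":alice", ":knows", ":bob")}
replaced = {(":alice", ":knows", "_:b"), ("_:b", ":name", '"Bob"')}
split = {(":alice", ":knows", "_:b1"), ("_:b2", ":name", '"Bob"')}
unrelated = {(":bob", ":knows", "_:b")}
```

The graphs removed, replaced and split correspond to the three constructions listed above and are all accepted by simply_entails(g, ...), whereas unrelated is rejected. The search is exponential in the number of blank nodes of E, so it is only meant to make the statement of the lemma concrete.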

A design goal of the RDF* semantics was to preserve that property.

Issue

This property has not been proved yet.

A. Historical remarks

A.1 SA-mode and PG-mode

Many discussions on the RDF* mailing list and GitHub repository refer to SA-mode and PG-mode. Those abbreviations stand for "Separate Assertion mode" and "Property Graph mode". They originate in the fact that different versions of RDF* have been published over the years, with different designs. In PG-mode, any embedded triple was also considered asserted . SA-mode, on the other hand, allowed embedded triples to be mentioned without asserting them, requiring them to be asserted separately if that was the intent. SA-mode was more flexible, but induced redundancy in the use cases that PG-mode was designed to address.

The notion of annotations in the Turtle* syntax was introduced to remove the need for different modes. Instead of interpreting the same syntax differently (which would have caused interoperability problems), it was decided to provide two different syntaxes, one for each use case.

B. Issue Summary

C. References

C.1 Normative references

[RDF11-CONCEPTS]
RDF 1.1 Concepts and Abstract Syntax . Richard Cyganiak; David Wood; Markus Lanthaler. W3C. 25 February 2014. W3C Recommendation. URL: https://www.w3.org/TR/rdf11-concepts/
[RDF11-MT]
RDF 1.1 Semantics . Patrick Hayes; Peter Patel-Schneider. W3C. 25 February 2014. W3C Recommendation. URL: https://www.w3.org/TR/rdf11-mt/
[RFC2119]
Key words for use in RFCs to Indicate Requirement Levels . S. Bradner. IETF. March 1997. Best Current Practice. URL: https://tools.ietf.org/html/rfc2119
[RFC8174]
Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words . B. Leiba. IETF. May 2017. Best Current Practice. URL: https://tools.ietf.org/html/rfc8174
[SPARQL11-QUERY]
SPARQL 1.1 Query Language . Steven Harris; Andy Seaborne. W3C. 21 March 2013. W3C Recommendation. URL: https://www.w3.org/TR/sparql11-query/
[TURTLE]
RDF 1.1 Turtle . Eric Prud'hommeaux; Gavin Carothers. W3C. 25 February 2014. W3C Recommendation. URL: https://www.w3.org/TR/turtle/
[XML]
Extensible Markup Language (XML) 1.0 (Fifth Edition) . Tim Bray; Jean Paoli; Michael Sperberg-McQueen; Eve Maler; François Yergeau et al. W3C. 26 November 2008. W3C Recommendation. URL: https://www.w3.org/TR/xml/

C.2 Informative references

[JSON-LD11]
JSON-LD 1.1 . Gregg Kellogg; Pierre-Antoine Champin; Dave Longley. W3C. 16 July 2020. W3C Recommendation. URL: https://www.w3.org/TR/json-ld11/
[N-TRIPLES]
RDF 1.1 N-Triples . Gavin Carothers; Andy Seaborne. W3C. 25 February 2014. W3C Recommendation. URL: https://www.w3.org/TR/n-triples/
[OWL2-RDF-BASED-SEMANTICS]
OWL 2 Web Ontology Language RDF-Based Semantics (Second Edition) . Michael Schneider. W3C. 11 December 2012. W3C Recommendation. URL: https://www.w3.org/TR/owl2-rdf-based-semantics/
[RDF-STAR-FOUNDATION]
Foundations of RDF* and SPARQL* - An Alternative Approach to Statement-Level Metadata in RDF . Olaf Hartig. In Proceedings of the 11th Alberto Mendelzon International Workshop on Foundations of Data Management (AMW), Montevideo, Uruguay. June 2017. URL: http://ceur-ws.org/Vol-1912/paper12.pdf
[RDF-SYNTAX-GRAMMAR]
RDF 1.1 XML Syntax . Fabien Gandon; Guus Schreiber. W3C. 25 February 2014. W3C Recommendation. URL: https://www.w3.org/TR/rdf-syntax-grammar/