Copyright © 2020 the Contributors to the RDF* and SPARQL* Specification, published by the RDF-DEV Community Group under the W3C Community Contributor License Agreement (CLA) . A human-readable summary is available.
TODO
This specification was published by the RDF-DEV Community Group . It is not a W3C Standard nor is it on the W3C Standards Track. Please note that under the W3C Community Contributor License Agreement (CLA) there is a limited opt-out and other conditions apply. Learn more about W3C Community and Business Groups .
This section is non-normative.
TODO, citing [ RDF-STAR-FOUNDATION ]
This section is non-normative.
TODO (the purpose of this section will be to provide an informal introduction to the approach for practitioners)
The syntax of RDF is defined in two layers:
Similarly, this document defines the abstract syntax of RDF* in § 2. Concepts and Abstract Syntax , and one concrete syntax based on Turtle [ TURTLE ] in § 3. Turtle* .
TODO list the prefix definitions implicitly used in all examples
As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key words MAY , MUST , and MUST NOT in this document are to be interpreted as described in BCP 14 [ RFC2119 ] [ RFC8174 ] when, and only when, they appear in all capitals, as shown here.
An RDF* graph is a set of RDF* triples .
An RDF* triple is a 3-tuple defined recursively as follows:
As for RDF triples , we call the 3 components of an RDF* triple its subject , predicate and object , respectively. From the definitions above, it follows that any RDF graph is also an RDF* graph . Note also that, by definition, an RDF* triple cannot contain itself and cannot be nested infinitely.
IRIs , literals , blank nodes and RDF* triples are collectively known as RDF* terms .
For every RDF* triple t , we define its constituent terms (or simply constituents) as the set containing its subject , its predicate , its object , plus all the constituent terms of its subject and/or its object if they are themselves RDF* triples . By extension, we define the constituent terms of an RDF* graph to be the union set of the constituent terms of all its triples.
<< _:a :name "Alice" >> :statedBy :bob.
Its constituent terms are :name, :statedBy, :bob, the blank node _:a, the literal "Alice", and the triple << _:a :name "Alice" >>.
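The recursive definition of constituent terms can be sketched in Python. This is only an illustration under stated assumptions: the classes `BlankNode`, `Literal` and `Triple` and the function `constituents` are hypothetical names introduced here, and IRIs are modelled as plain strings.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class BlankNode:
    label: str


@dataclass(frozen=True)
class Literal:
    lexical: str


@dataclass(frozen=True)
class Triple:
    subject: "Term"
    predicate: str
    object: "Term"


# Assumption of this sketch: IRIs are represented as plain strings.
Term = Union[str, BlankNode, Literal, Triple]


def constituents(t: Triple) -> set:
    """Return the constituent terms of an RDF* triple."""
    result = {t.subject, t.predicate, t.object}
    # Only the subject and the object can themselves be RDF* triples.
    for component in (t.subject, t.object):
        if isinstance(component, Triple):
            result |= constituents(component)
    return result


inner = Triple(BlankNode("a"), ":name", Literal("Alice"))
outer = Triple(inner, ":statedBy", ":bob")
terms = constituents(outer)
```

For the example triple, `terms` contains six elements: the three components of the outer triple and the three components of the embedded one.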
An RDF* triple used as the subject or object of another RDF* triple is called an embedded triple . An RDF* triple that is an element of an RDF* graph is called an asserted triple . Note that, in a given RDF* graph , the same triple MAY be both embedded and asserted .
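The distinction between embedded and asserted triples can be made concrete with a small Python sketch. It assumes, purely for illustration, that a triple is a 3-tuple, that an embedded triple is a nested tuple, and that all other terms are strings; `embedded_triples` is a hypothetical helper.

```python
def embedded_triples(graph):
    """Collect every triple used as subject or object of another triple."""
    found = set()

    def visit(triple):
        subject, _, obj = triple
        for component in (subject, obj):
            if isinstance(component, tuple):
                found.add(component)
                visit(component)  # embedded triples may be nested

    for triple in graph:
        visit(triple)
    return found


graph = {
    (":bob", ":age", "42"),
    ((":bob", ":age", "42"), ":source", "<http://example.org/~bob/>"),
    ((":alice", ":knows", ":bob"), ":statedBy", ":carol"),
}

asserted = set(graph)              # the elements of the graph
embedded = embedded_triples(graph)
both = asserted & embedded         # triples that are embedded and asserted
```

Here `:bob :age 42` is both embedded and asserted, whereas `:alice :knows :bob` is embedded only.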
An RDF* dataset is a collection of RDF* graphs , and comprises:
Again, this definition is an extension of the notion of RDF dataset , hence it follows that any RDF dataset is also an RDF* dataset .
In this section, we present Turtle*, an extension of the Turtle format [ TURTLE ] allowing the representation of RDF* graphs . For the sake of conciseness, we only describe here the differences between Turtle* and Turtle.
Turtle* is defined to follow the same grammar as Turtle, except for the EBNF productions specified below, which replace the productions having the same number (if any) in the original grammar.
[8] objectList ::= object annotation? ( ',' object annotation? )*
[10] subject ::= iri | BlankNode | collection | embTriple
[12] object ::= iri | BlankNode | collection | blankNodePropertyList | literal | embTriple
[27] embTriple ::= '<<' embSubject verb embObject '>>'
[28] embSubject ::= iri | BlankNode | embTriple
[29] embObject ::= iri | BlankNode | literal | embTriple
[31] annotation ::= '{|' predicateObjectList '|}'
The only changes are that subject and object productions have been extended to accept embedded triples, which are described by the new productions 27 to 29. Note that embedded triples accept a more restricted range of subject and object expressions than asserted triples. Additionally, the objectList production now accepts an optional annotation after each object.
This has already been discussed on the mailing list .
The idea would be to have a notation like
:bob :age 42 {| :source <http://example.org/~bob/> |}.
as shortcut for
:bob :age 42.
<< :bob :age 42 >> :source <http://example.org/~bob/>.
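The expansion of the annotation shortcut can be sketched as follows. This is a minimal illustration, not a parser: `expand_annotation` is a hypothetical helper, and triples are modelled as Python tuples with nested tuples standing for embedded triples.

```python
def expand_annotation(subject, predicate, obj, annotations):
    """Expand `s p o {| p1 o1 ; ... |}` into the triples it abbreviates."""
    asserted = (subject, predicate, obj)
    triples = [asserted]
    # Each annotation becomes a triple whose subject is the embedded triple.
    for ann_predicate, ann_object in annotations:
        triples.append((asserted, ann_predicate, ann_object))
    return triples


expanded = expand_annotation(
    ":bob", ":age", 42,
    [(":source", "<http://example.org/~bob/>")],
)
```

The result holds the asserted triple `:bob :age 42` followed by the triple that uses it as an embedded subject, matching the two-statement form shown above.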
A Turtle* parser is similar to a Turtle parser as defined in Section 7 of the Turtle specification [ TURTLE ], with an additional item in its state :
Additionally, the curSubject can be bound to any RDF* term (including an embedded triple ).
A Turtle* document defines an RDF* graph composed of a set of RDF* triples. The subject and embSubject productions set the curSubject. The verb production sets the curPredicate. The object and embObject productions set the curObject. Finishing the object production, an RDF* triple curSubject curPredicate curObject is generated and added to the RDF* graph.
Beginning the embTriple production records the curSubject and curPredicate. Finishing the embTriple production yields the RDF* triple curSubject curPredicate curObject and restores the recorded values of curSubject and curPredicate.
Beginning the annotation production records the curSubject and curPredicate, and sets the curSubject to the RDF* triple curSubject curPredicate curObject. Finishing the annotation production restores the recorded values of curSubject and curPredicate.
All other productions MUST be handled as specified by Section 7 of the Turtle specification [ TURTLE ], while still applying the changes above recursively.
This section is non-normative.
While this document specifies only one concrete syntax, nothing prevents other concrete syntaxes of RDF* from being proposed. In particular, other existing concrete syntaxes for RDF, such as RDF/XML [ RDF-SYNTAX-GRAMMAR ], could be extended to support RDF*. Likewise, since the N-Triples syntax [ N-TRIPLES ] is a subset of Turtle, an appropriate subset of Turtle* could be defined to extend N-Triples accordingly.
This section introduces SPARQL*, which is an RDF*-aware extension of the RDF query language SPARQL [ SPARQL11-QUERY ]; i.e., SPARQL* can be used to query RDF* graphs.
In the following, we introduce a number of SPARQL*-specific definitions, which rely on the following notions, defined in SPARQL 1.1 Query Language [ SPARQL11-QUERY ]: RDF term , query variable , triple pattern , property path pattern , property path expression , and solution mapping .
A SPARQL* triple pattern is a 3-tuple that is defined recursively as follows:
As for RDF* triples , a SPARQL* triple pattern MUST NOT contain itself.
A SPARQL* basic graph pattern ( BGP *) is a set of SPARQL* triple patterns .
A SPARQL* property path pattern is a 3-tuple ( s , p , o ) where
I have added the definition of a SPARQL* property path pattern into the draft just for the sake of having such a definition. We need to think about whether it is useful to add this to SPARQL*, in which case we need to define the semantics of such SPARQL* property path patterns.
In fact, no matter what we decide, even for standard property path patterns , the semantics may have to be extended to use them over RDF* graphs .
A SPARQL* solution mapping μ is a partial function from the set of all query variables to the set of all RDF* terms . The domain of μ, denoted by dom(μ), is the set of query variables for which μ is defined.
The notion of a SPARQL* solution mapping extends the notion of a standard SPARQL solution mapping ; that is, every SPARQL solution mapping is a SPARQL* solution mapping . However, in contrast to SPARQL solution mappings , SPARQL* solution mappings may map variables also to RDF* triples .
All notions related to SPARQL solution mappings carry over naturally to SPARQL* solution mappings. In particular, the definition of compatibility extends naturally to SPARQL* solution mappings: two SPARQL* solution mappings μ 1 and μ 2 are compatible if, for every variable v that is both in dom(μ 1 ) and in dom(μ 2 ), μ 1 (v) and μ 2 (v) are the same RDF* term . In this case, μ 1 ∪ μ 2 is also a SPARQL* solution mapping. Moreover, for any SPARQL* solution mapping μ we write card[Ω](μ) to denote the cardinality of μ in a multiset Ω of such mappings. Finally, given a BGP * B and a SPARQL* solution mapping μ, we write μ( B ) to denote the result of replacing every variable v in B for which μ is defined with μ(v).
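The compatibility test, the union of compatible mappings, and the substitution μ(B) can be sketched in Python. The representation is an assumption of this sketch: a solution mapping is a `dict` from variable names (strings such as `"?x"`) to terms, triple patterns are tuples, and embedded triples or triple patterns are nested tuples. The function names are hypothetical.

```python
def compatible(mu1, mu2):
    """True if mu1 and mu2 agree on every variable they share."""
    return all(mu1[v] == mu2[v] for v in mu1.keys() & mu2.keys())


def merge(mu1, mu2):
    """Return the union of two compatible solution mappings."""
    if not compatible(mu1, mu2):
        raise ValueError("solution mappings are not compatible")
    return {**mu1, **mu2}


def apply(mu, term):
    """Replace every variable in term for which mu is defined."""
    if isinstance(term, tuple):  # a (possibly embedded) triple pattern
        return tuple(apply(mu, part) for part in term)
    return mu.get(term, term)


def apply_bgp(mu, bgp):
    """Compute mu(B) for a BGP* given as a set of triple patterns."""
    return {apply(mu, pattern) for pattern in bgp}


mu1 = {"?x": ":bob", "?t": (":bob", ":age", "42")}
mu2 = {"?x": ":bob", "?y": ":alice"}
mu3 = {"?x": ":carol"}
bgp = {(("?x", ":age", "42"), ":source", "?s")}
```

In this example `mu1` and `mu2` are compatible, `mu1` and `mu3` are not, and `apply_bgp(mu1, bgp)` replaces `?x` inside the embedded triple pattern while leaving the unbound `?s` in place. Note that `mu1` maps `?t` to an RDF* triple, which an ordinary SPARQL solution mapping could not do.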
Next, we aim to carry over the notion of solutions for BGPs to BGP * . To this end, we first define an auxiliary concept that carries over the notion of an RDF instance mapping [ RDF11-MT ] to RDF*.
An RDF* instance mapping σ is a partial function from the set of all blank nodes to the set of all RDF* terms . The domain of σ, denoted by dom(σ), is the set of blank nodes for which σ is defined.
Similar to the corresponding notation for solution mappings, for an RDF* instance mapping σ and a BGP * B we write σ( B ) to denote the result of replacing every blank node b in B for which σ is defined with σ(b).
Now we are ready to define the notion of solution for BGP *.
Given a BGP * B and an RDF* graph G , a SPARQL* solution mapping μ is a solution for the BGP * B over G if it has the following two properties
SPARQL* is defined to follow the same grammar as SPARQL, except for the EBNF productions specified below, which replace the productions having the same number (if any) in the original grammar.
[60] Bind ::= 'BIND' '(' ( Expression | EmbTP ) 'AS' Var ')'
[75] TriplesSameSubject ::= VarOrTermOrEmbTP PropertyListNotEmpty | TriplesNode PropertyList
[80] Object ::= GraphNode | EmbTP
[81] TriplesSameSubjectPath ::= VarOrTermOrEmbTP PropertyListPathNotEmpty | TriplesNode PropertyListPath
[105] GraphNodePath ::= VarOrTermOrEmbTP | TriplesNodePath
[174] EmbTP ::= '<<' EmbSubjectOrObject Verb EmbSubjectOrObject '>>'
[175] EmbSubjectOrObject ::= Var | BlankNode | iri | RDFLiteral | NumericLiteral | BooleanLiteral | EmbTP
[176] VarOrTermOrEmbTP ::= Var | GraphTerm | EmbTP
This introduces a notation for embedded triple patterns (productions [174] and following), which is similar to the one defined for embedded triples in § 3. Turtle* , but accepting also variables . These embedded triple patterns are allowed in the subject ( [75] , [81] ) and object ( [80] , [105] ) positions of SPARQL* triple patterns , as well as in BIND statements ( [60] ).
Instead of reusing the keyword BIND for SPARQL* (as in my original proposal), we may want to consider using a different keyword for this functionality because the behavior is a bit different. For instance, @klinovp has mentioned this issue in an email on the mailing list . In another email , @afs has proposed to use the keyword FIND instead.
Based on the SPARQL grammar, the SPARQL specification defines the process of converting graph patterns and solution modifiers in a SPARQL query string into a SPARQL algebra expression [ SPARQL11-QUERY, Section 18.2 ]. This process must be adjusted to consider the extended grammar introduced above. In the following, any step of the conversion process that requires adjustment is discussed.
As a basis of the translation, the SPARQL specification introduces a notion of in-scope variables . To cover the new syntax elements introduced in § 4.2 Grammar this notion MUST be extended as follows.
A variable is in-scope in BIND ( T AS v ) (where T is an embedded triple pattern) if the variable is variable v or the variable occurs in the embedded triple pattern T. As for standard BIND clauses with expressions, variable v must not [be] in-scope from the preceding elements in the group graph pattern in which [the BIND clause] is used [ SPARQL11-QUERY, Section 18.2.1 ].
The translation process starts with expanding abbreviations for IRIs and triple patterns [ SPARQL11-QUERY, Section 18.2.2.1 ]. This step MUST be extended in two ways:
Abbreviations for triple patterns with embedded triple patterns MUST be expanded as if each embedded triple pattern were a variable (or an RDF term ).
Abbreviations for IRIs in all embedded triple patterns MUST be expanded.
The translation of property path patterns has to be adjusted because the extended grammar allows for property path patterns whose subject or object is an embedded triple pattern (cf. § 4.2 Grammar ).
The translation as specified in the W3C specification distinguishes four cases. The first three of these cases do not require adjustment because they are taken care of either by recursion or by the adjusted translation of basic graph patterns (as defined in § 4.3.4 Translate Basic Graph Patterns below). However, the fourth case MUST be adjusted as follows.
Let X P Y be a string that corresponds to the fourth case in [ SPARQL11-QUERY, Section 18.2.2.4 ]. Given the grammar introduced in § 4.2 Grammar, X and Y may be an RDF term, a variable, or an embedded triple pattern, respectively (and P is a property path expression). The string X P Y is translated to the algebra expression Path(X’, P, Y’) where X’ and Y’ are the result of calling a function named Lift for X and Y, respectively.
For some input string Z (such as X or Y) that can be an RDF term, a variable, or an embedded triple pattern, the function Lift is defined as follows:
Lift ( S , P , Lift ( O ));
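The following Python sketch illustrates one plausible reading of Lift. It is an assumption, not the normative definition: the cases of the definition are not fully reproduced above, so the sketch supposes that an embedded triple pattern with components S, P and O is lifted to the SPARQL* triple pattern (Lift(S), P, Lift(O)), and that RDF terms and variables are returned unchanged. The class `EmbTP` and the tuple encoding of `Path` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbTP:
    """Syntactic embedded triple pattern << s p o >> (hypothetical class)."""
    s: object
    p: object
    o: object


def lift(z):
    """Assumed reading of Lift: recurse into embedded triple patterns."""
    if isinstance(z, EmbTP):
        # Assumption: subject and object are lifted, the predicate is kept.
        return (lift(z.s), z.p, lift(z.o))
    return z  # RDF term or variable: returned unchanged


def translate_path(x, p, y):
    """Translate the string X P Y to Path(X', P, Y'), encoded as a tuple."""
    return ("Path", lift(x), p, lift(y))


nested = EmbTP(EmbTP(":bob", ":age", "?a"), ":source", "?s")
lifted = lift(nested)
```

Under this reading, `lifted` is a nested SPARQL* triple pattern in which every level of embedding has been converted from syntax to a 3-tuple.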
After translating property path patterns, the translation process collects any adjacent triple patterns [...] to form a basic graph pattern [ SPARQL11-QUERY, Section 18.2.2.5 ]. This step has to be adjusted because triple patterns in the extended syntax may have an embedded triple pattern in their subject position or in their object position (or in both). To ensure that every result of this step is a BGP*, before adding a triple pattern to its corresponding collection, its subject and object MUST be replaced by the result of calling function Lift for the subject and the object, respectively.
The extended grammar in § 4.2 Grammar allows for BIND clauses with an embedded triple pattern . The translation of such a BIND clause to a SPARQL algebra expression requires a new algebra symbol:
Then, any string of the form BIND( T AS v ) with T being an embedded triple pattern (i.e., not a standard BIND expression) is translated to the algebra expression TR(T’, v) where T’ is the result of the function Lift for T.
Note that the translation of BIND clauses with an embedded triple pattern as defined in this section is used during the translation of group graph patterns. The case of BIND clauses with an embedded triple pattern is covered in this translation of group graph patterns by the last, “catch all other” IF statement (i.e., the IF statement with the condition E is any other form) and not by the IF statement for BIND clauses with an expression.
The SPARQL specification defines a function eval(D(G), algebra expression) as the evaluation of an algebra expression with respect to a dataset D having active graph G [ SPARQL11-QUERY, Section 18.6 ]. Recall that the dataset D in the context of SPARQL* is an RDF* dataset and, thus, the active graph G is an RDF* graph, and so is any other graph in dataset D.
The definition of the eval function is recursive; the two base cases of this definition for SPARQL* are given as follows:
Currently, the Evaluation Semantics section of the draft is just a copy from the corresponding subsection of the original tech report . The actual definition of the formal semantics of a BGP* and of the SPARQL* version of BIND is in a different part of that tech report (where it defines ⟦B⟧_G and ⟦(tp AS ?v)⟧_G). These definitions still need to be adapted and moved into the Evaluation Semantics section of the draft.
For any other algebra expression, the SPARQL specification defines algebra operators [ SPARQL11-QUERY ]. These definitions can be extended naturally to operate over multisets of SPARQL* solution mappings (instead of ordinary solution mappings ). Given this extension, the recursive steps of the definition of the eval function for SPARQL* are the same as in the SPARQL specification.
TODO: brief introduction paragraph (including a note that result of a CONSTRUCT query or a DESCRIBE query can be serialized using Turtle* )
(related issue: #12 )
The result of a SPARQL SELECT query is serialized in JSON using the SPARQL 1.1 Query Results JSON format. This format will need to be extended to deal with the RDF* triple being a new possible value type for a binding.
For example, the result of a query where variable ?a is bound to an RDF* triple:
?a | ?b | ?c |
---|---|---|
<<<http://example.org/bob> <http://xmlns.com/foaf/0.1/age> 23>> | <http://example.org/certainty> | 0.9 |
Currently, different implementations all have their own, slightly diverging, extensions. For example, in Eclipse RDF4J , the extension looks as follows:
{
"head" : {
"vars" : [
"a",
"b",
"c"
]
},
"results" : {
"bindings": [
{ "a" : {
"type" : "triple",
"value" : {
"s" : {
"type" : "uri",
"value" : "http://example.org/bob"
},
"p" : {
"type" : "uri",
"value" : "http://xmlns.com/foaf/0.1/age"
},
"o" : {
"datatype" : "http://www.w3.org/2001/XMLSchema#integer",
"type" : "literal",
"value" : "23"
}
}
},
"b": {
"type": "uri",
"value": "http://example.org/certainty"
},
"c" : {
"datatype" : "http://www.w3.org/2001/XMLSchema#decimal",
"type" : "literal",
"value" : "0.9"
}
}
]
}
}
In Apache Jena , the extension looks as follows:
{
"head" : {
"vars" : [
"a",
"b",
"c"
]
},
"results" : {
"bindings": [
{ "a" : {
"type" : "triple",
"value" : {
"subject" : {
"type" : "uri",
"value" : "http://example.org/bob"
},
"property" : {
"type" : "uri",
"value" : "http://xmlns.com/foaf/0.1/age"
},
"object" : {
"datatype" : "http://www.w3.org/2001/XMLSchema#integer",
"type" : "literal",
"value" : "23"
}
}
},
"b": {
"type": "uri",
"value": "http://example.org/certainty"
},
"c" : {
"datatype" : "http://www.w3.org/2001/XMLSchema#decimal",
"type" : "literal",
"value" : "0.9"
}
}
]
}
}
In Stardog , the format extension as currently implemented is as follows:
{
"head" : {
"vars" : [
"a",
"b",
"c"
]
},
"results" : {
"bindings": [
{ "a" : {
"type" : "statement",
"s" : {
"type" : "uri",
"value" : "http://example.org/bob"
},
"p" : {
"type" : "uri",
"value" : "http://xmlns.com/foaf/0.1/age"
},
"o" : {
"datatype" : "http://www.w3.org/2001/XMLSchema#integer",
"type" : "literal",
"value" : "23"
}
},
"b": {
"type": "uri",
"value": "http://example.org/certainty"
},
"c" : {
"datatype" : "http://www.w3.org/2001/XMLSchema#decimal",
"type" : "literal",
"value" : "0.9"
}
}
]
}
}
In summary, Jena and RDF4J differ only by the names of the keys inside the new RDF* triple type (s vs subject, p vs property, etc). Stardog deviates slightly more in that it does not wrap the individual components of the RDF* triple into a value.
Other implementations may have yet other, slightly deviant variants. This makes it difficult to process query results from different endpoint implementations. A single recommended extension would be a benefit for parser implementors and users alike.
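To illustrate the cost of this divergence, the following Python sketch shows the kind of normalization a client currently has to perform to read all three variants shown above. The canonical shape it produces (type `triple` with `subject`, `predicate` and `object` keys under `value`) is an arbitrary choice made for this example, not a proposed standard, and `normalize_term` and `ROLE` are hypothetical names.

```python
# Assumption of this sketch: map each implementation's key names onto one
# arbitrary set of names.
ROLE = {
    "s": "subject", "subject": "subject",
    "p": "predicate", "property": "predicate", "predicate": "predicate",
    "o": "object", "object": "object",
}


def normalize_term(term):
    """Rewrite one binding value into a single illustrative shape."""
    if term.get("type") not in ("triple", "statement"):
        return dict(term)  # uri, literal, bnode: copied unchanged
    # RDF4J and Jena wrap the components in "value"; Stardog does not.
    components = term["value"] if "value" in term else term
    parts = {
        ROLE[key]: normalize_term(value)
        for key, value in components.items()
        if key in ROLE
    }
    return {"type": "triple", "value": parts}


BOB = {"type": "uri", "value": "http://example.org/bob"}
AGE = {"type": "uri", "value": "http://xmlns.com/foaf/0.1/age"}
N23 = {
    "type": "literal",
    "value": "23",
    "datatype": "http://www.w3.org/2001/XMLSchema#integer",
}

rdf4j = {"type": "triple", "value": {"s": BOB, "p": AGE, "o": N23}}
jena = {"type": "triple", "value": {"subject": BOB, "property": AGE, "object": N23}}
stardog = {"type": "statement", "s": BOB, "p": AGE, "o": N23}

canonical = [normalize_term(t) for t in (rdf4j, jena, stardog)]
```

After normalization the three entries of `canonical` are equal, which is what a single recommended extension would give clients without any such adapter.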
The result of a SPARQL SELECT query is serialized in XML using the SPARQL Query Results XML format. This format will need to be extended to deal with the RDF* triple being a new possible value type for a binding.
For example, the result of a query where variable ?a is bound to an RDF* triple:
?a | ?b | ?c |
---|---|---|
<<<http://example.org/bob> <http://xmlns.com/foaf/0.1/age> 23>> | <http://example.org/certainty> | 0.9 |
Currently, different implementations all have their own, slightly diverging, extensions. For example, in Eclipse RDF4J , the extension looks as follows:
<?xml version='1.0' encoding='UTF-8'?>
<sparql xmlns='http://www.w3.org/2005/sparql-results#'>
<head>
<variable name='a'/>
<variable name='b'/>
<variable name='c'/>
</head>
<results>
<result>
<binding name='a'>
<triple>
<subject>
<uri>http://example.org/bob</uri>
</subject>
<predicate>
<uri>http://xmlns.com/foaf/0.1/age</uri>
</predicate>
<object>
<literal datatype='http://www.w3.org/2001/XMLSchema#integer'>23</literal>
</object>
</triple>
</binding>
<binding name='b'>
<uri>http://example.org/certainty</uri>
</binding>
<binding name='c'>
<literal datatype='http://www.w3.org/2001/XMLSchema#decimal'>0.9</literal>
</binding>
</result>
</results>
</sparql>
In Apache Jena, the extension is almost identical, except for the choice to name the middle element property (where RDF4J uses predicate):
<?xml version='1.0' encoding='UTF-8'?>
<sparql xmlns='http://www.w3.org/2005/sparql-results#'>
<head>
<variable name='a'/>
<variable name='b'/>
<variable name='c'/>
</head>
<results>
<result>
<binding name='a'>
<triple>
<subject>
<uri>http://example.org/bob</uri>
</subject>
<property>
<uri>http://xmlns.com/foaf/0.1/age</uri>
</property>
<object>
<literal datatype='http://www.w3.org/2001/XMLSchema#integer'>23</literal>
</object>
</triple>
</binding>
<binding name='b'>
<uri>http://example.org/certainty</uri>
</binding>
<binding name='c'>
<literal datatype='http://www.w3.org/2001/XMLSchema#decimal'>0.9</literal>
</binding>
</result>
</results>
</sparql>
In Stardog , the implemented extension currently looks as follows:
<?xml version='1.0' encoding='UTF-8'?>
<sparql xmlns='http://www.w3.org/2005/sparql-results#'>
<head>
<variable name='a'/>
<variable name='b'/>
<variable name='c'/>
</head>
<results>
<result>
<binding name='a'>
<statement>
<s>
<uri>http://example.org/bob</uri>
</s>
<p>
<uri>http://xmlns.com/foaf/0.1/age</uri>
</p>
<o>
<literal datatype='http://www.w3.org/2001/XMLSchema#integer'>23</literal>
</o>
</statement>
</binding>
<binding name='b'>
<uri>http://example.org/certainty</uri>
</binding>
<binding name='c'>
<literal datatype='http://www.w3.org/2001/XMLSchema#decimal'>0.9</literal>
</binding>
</result>
</results>
</sparql>
In summary, RDF4J, Jena and Stardog all differ in the element names used (triple vs statement, property vs predicate, subject vs s, etc).
Other implementations may have yet other, slightly deviant variants. This makes it difficult to process query results from different endpoint implementations. A single recommended extension would be a benefit for parser implementors and users alike.
We need a section that defines SPARQL* Update. The text for this section can be taken from the following document: https://blog.liu.se/olafhartig/documents/sparql-update/
In this section, we provide a model-theoretic semantics for RDF*, by extending the one defined in RDF 1.1 Semantics [ RDF11-MT ].
An RDF* triple is said to be ground if it has no blank node in its constituent terms . An RDF* graph is ground if all its triples are ground . This definition generalizes the notion of ground RDF graph . IRIs , literals and ground RDF* triples are collectively known as ground RDF* terms .
An RDF* simple interpretation I is a structure consisting of:
This definition is identical to the definition of simple interpretation [ RDF11-MT ] up to item 5 included. Item 6 extends it to support RDF* triples . Any RDF simple interpretation can be considered as an RDF* simple interpretation with IT =∅.
The denotation of a ground RDF* graph in an RDF* simple interpretation I is then given by the following rules, where the interpretation is also treated as a function from expressions (terms, triples and graphs) to elements of the universe and truth values:
Since IL and IT are partial mappings, I ( E ) may be undefined for some literal or triple E . In that case, E has no semantic value in I , so any asserted triple having E as subject or object will fail to satisfy the condition above; hence any graph containing such an asserted triple will be false.
Given an RDF* graph E , we call the embedded blank nodes ( ebn ) of E the set of blank nodes appearing in subject or object position of some embedded triple in E ; we call the open blank nodes ( obn ) of E all the other blank nodes appearing in E .
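The partition of blank nodes into embedded and open ones can be computed with a short traversal. The Python sketch below is illustrative only: it assumes triples are 3-tuples, embedded triples are nested tuples, and blank nodes are strings starting with `_:`; `blank_node_partition` is a hypothetical helper.

```python
def is_blank(term):
    """Assumption of this sketch: blank nodes are strings like '_:x'."""
    return isinstance(term, str) and term.startswith("_:")


def blank_node_partition(graph):
    """Return the pair (ebn, obn) of a graph given as a set of tuples."""
    all_blank, ebn = set(), set()

    def visit(triple, embedded):
        subject, _, obj = triple
        for component in (subject, obj):
            if isinstance(component, tuple):
                visit(component, True)  # everything below is embedded
            elif is_blank(component):
                all_blank.add(component)
                if embedded:
                    ebn.add(component)

    for triple in graph:
        visit(triple, False)
    # Open blank nodes are all the other blank nodes of the graph.
    return ebn, all_blank - ebn


graph = {
    (":alice", ":knows", "_:y"),
    (("_:x", ":name", '"Bob"'), "dc:creator", ":alice"),
    (("_:x", ":workingFor", ":acme"), "dc:creator", ":alice"),
}
ebn, obn = blank_node_partition(graph)
```

In this graph `_:x` is an embedded blank node and `_:y` is an open one. A blank node that occurs both inside an embedded triple and directly in an asserted triple counts as embedded, because the open blank nodes are defined as the remainder.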
A mapping from a set of blank nodes into a set of ground RDF* terms is called a grounding function . We define the extended application of a grounding function Γ to other RDF* terms and to RDF* graphs as follows:
Suppose I is an RDF* simple interpretation and A is a mapping from a set of blank nodes to the universe IR of I . Define the mapping [ I + A ] of RDF* terms into IR to be A on blank nodes of the set, and I on any other term; and extend this mapping to RDF* triples and RDF* graphs using the rules given above for ground graphs . Then the denotation of any RDF* graph in I is given by:
Following RDF 1.1 Semantics , we extend the notions of satisfiability and entailment. An RDF* simple interpretation I satisfies E when I ( E )=true. E is (simply) satisfiable when an RDF* simple interpretation exists which satisfies it, and (simply) unsatisfiable otherwise. An RDF* graph G simply entails an RDF* graph H when every interpretation which satisfies G also satisfies H . If two RDF* graphs G and H each entail the other then they are logically equivalent .
Any semantic extension of RDF MAY be extended to RDF* by replacing the semantic conditions, the notion of satisfiability and the notion of entailment, defined in RDF 1.1 Semantics , by their corresponding extension defined above. This is notably the case for Datatype entailment and RDFS entailment .
This section is non-normative.
In this section, we discuss a number of desired features of RDF* semantics in order to shed light on the design choices made in the previous section.
RDF* must be able to quote a triple without asserting it, so that we can represent people's beliefs or claims without endorsing them, or represent facts that are no longer or not yet true. This is ensured by the fact that only asserted triples are considered to determine if the denotation of a graph is true or false.
For example, the following graph:
<< :alice foaf:knows :bob >> dc:creator :alice.
does not entail :alice foaf:knows :bob, and the SPARQL* query below, executed against the graph above, would return no result.
SELECT ?who { :alice foaf:knows ?who }
Embedded triples are referentially opaque, meaning that triples using different terms can be considered different, even if their terms can be inferred to be synonyms. Although RDF* simple entailment has no means to entail any kind of synonymy, it is possible in some semantic extensions , such as OWL [ OWL2-RDF-BASED-SEMANTICS ].
A well known example is the superman problem :
:loisLane :believes << :superman :can :fly >>.
:superman owl:sameAs :clarkKent.
:superman :can :fly.
Intuitively, this graph states that Superman and Clark Kent are the same person, so if Superman can fly, then it follows that Clark Kent can as well. So, under OWL2-entailment, this graph entails :clarkKent :can :fly.
However, Lois Lane does not know that Superman and Clark Kent are the same person. So from her point of view, the two triples are not equivalent, and she can believe one without believing the other.
Referential opacity is ensured by differentiating the intension of embedded triples (represented by the IT mapping) from their extension (the denotations of their subject, predicate and object). Since IT is based solely on the syntax of triples, two syntactically different triples can always have different intensions, even if their subjects, predicates and objects are semantically equivalent.
On the other hand, all triples with the same intension are required to have the same extension. So if two RDF* triples denote the same resource T, their subjects, predicates and objects, respectively, are constrained to also denote the same thing.
Blank nodes in embedded triples have the same scope as blank nodes used in the subject or object position of asserted triples (usually the whole graph or the whole dataset in which they appear). This means that the same blank node identifier used in different embedded triples , or at different levels of nesting, will refer to the same thing.
For example, in the following graph:
:alice :knows _:x.
<< _:x :name "Bob" >> dc:creator :alice.
<< _:x :workingFor :acme >> dc:creator :alice.
the three occurrences of _:x must refer to the same resource in every interpretation of the graph. In other words, it must be the same resource that Alice knows, that she claims is named "Bob", and that she claims works for ACME. As a consequence, the following query will return "Bob":
SELECT ?name { :alice :knows ?x. << ?x :name ?name >> dc:creator :alice. << ?x :workingFor :acme >> dc:creator :alice. }
As another consequence, the following graph does not entail the graph above (because the graph below allows the resource known by Alice to be different from the one about which she makes claims).
:alice :knows _:y. << _:x :name "Bob" >> dc:creator :alice. << _:x :workingFor :acme >> dc:creator :alice.
Formally, the second graph is satisfied by an interpretation having:
:alice → A, :knows → K, dc:creator → C, :bob → B,
<< :bob :name "Bob" >> → T1, << :bob :workingFor :acme >> → T2,
with Γ: _:x → :bob, and A: _:y → Y.
But this interpretation cannot satisfy the first graph, because it would require a grounding function Γ' such that I(Γ'(_:x)) = Y in order to satisfy the first triple, and Γ'(_:x) = :bob in order to satisfy the second and third triples.
The interpolation lemma [ RDF11-MT ] states that an RDF graph G simply entails an RDF graph E if and only if a subgraph of G is an instance of E. Intuitively, this means that all graphs simply entailed by G can be constructed by:
A design goal of the RDF* semantics was to preserve that property.
We didn't prove it yet...
Many discussions on the RDF* mailing list and GitHub repository refer to SA-mode and PG-mode. Those abbreviations stand for "Separate Assertion mode" and "Property Graph mode". They originate in the fact that different versions of RDF* have been published over the years, with different designs. In PG-mode, any embedded triple was also considered asserted . SA-mode, on the other hand, allowed embedded triples to be mentioned without asserting them, requiring them to be asserted separately if that was the intent. SA-mode was more flexible, but induced redundancy in the use cases that PG-mode was designed to address.
The notion of annotations in the Turtle* syntax was introduced to remove the need for different modes. Instead of interpreting the same syntax differently (which would have caused interoperability problems), it was decided to provide a different syntax for each use case.
The << ... >> syntax represents an embedded triple without asserting it (thus answering the needs of the former SA-mode).
The {| ... |} syntax creates triples where the subject is an embedded version of the previously asserted triple, without the need to repeat it (thus answering the needs of the former PG-mode).