Copyright © 2021 the Contributors to the RDF* and SPARQL* Specification, published by the RDF-DEV Community Group under the W3C Community Contributor License Agreement (CLA) . A human-readable summary is available.
TODO
This specification was published by the RDF-DEV Community Group . It is not a W3C Standard nor is it on the W3C Standards Track. Please note that under the W3C Community Contributor License Agreement (CLA) there is a limited opt-out and other conditions apply. Learn more about W3C Community and Business Groups .
This section is non-normative.
TODO, citing [ RDF-STAR-FOUNDATION ]
This section is non-normative.
The RDF data model lets you state facts in three-part subject-predicate-object statements known as triples. For example, with a single RDF triple you can say that employee38 has a familyName of "Smith". A triple's predicate is a property specified with an IRI (an Internationalized version of a URI) to identify the namespace of the property name. A triple's subject and object can each be an IRI referencing any entity, and the object can also be a literal value such as "Smith" or data of other types such as dates, numbers, or Boolean values.
The subject and object of a triple can themselves reference triples. In the statement "employee22 claims that employee38 has a jobTitle of 'Assistant Designer'", the object of the triple that has employee22 as its subject references the statement "employee38 has a jobTitle of 'Assistant Designer'". This use of a triple as the subject or object resource of another triple so that we can say things about that triple is known as reification .
The concept of reification has always been part of RDF, but expressing it in RDF concrete syntaxes such as Turtle, N-Triples, and RDF/XML has been verbose and cumbersome. This specification describes a new, more compact conceptual data model and Turtle concrete syntax for reification known as RDF* (pronounced "RDF star") and Turtle* (pronounced "Turtle star"), respectively. This model and syntax enable the creation of concise triples that reference other triples as subject and object resources.
Triples that include a triple as a subject or an object are known as RDF* triples. The following dataset shows the example RDF* triples from above using the Turtle* syntax, which uses double angle brackets to enclose a triple serving as a subject or object resource:
@prefix : <http://www.example.org/> .
:employee38 :familyName "Smith" .
:employee22 :claims << :employee38 :jobTitle "Assistant Designer" >> .
After declaring a prefix so that IRIs can be abbreviated, the first triple in this example asserts that employee38 has a familyName of "Smith". Note that this dataset does not assert that employee38 has a jobTitle of "Assistant Designer"; it says that employee22 has made that claim. In other words, the triple "employee38 has a jobTitle of 'Assistant Designer'" is not what we call an asserted triple, like "employee38 has a familyName of 'Smith'" above; it is known as an embedded triple. (If we added the triple :employee38 :jobTitle "Assistant Designer" below the triple about employee22's claim in the example above, then this triple about employee38's jobTitle would be both an embedded triple and an asserted one.)
This specification also describes an extension to the SPARQL Protocol and Query Language known as SPARQL* (pronounced "SPARQL star") for the querying of RDF* triples. For example, the following SPARQL* query asks "who has made any claims about employee38?"
PREFIX : <http://www.example.org/>
SELECT ?claimer WHERE {
?claimer :claims << :employee38 ?property ?value >>
}
SPARQL query triple patterns that include a triple pattern as a subject or object are known as SPARQL* triple patterns.
For the remainder of this document, examples will assume that the following prefixes have been declared to represent the IRIs shown with them here:
| : | <http://www.example.org/> |
| rdfs: | <http://www.w3.org/2000/01/rdf-schema#> |
| owl: | <http://www.w3.org/2002/07/owl#> |
| prov: | <http://www.w3.org/ns/prov#> |
| dc: | <http://purl.org/dc/elements/1.1/> |
| dct: | <http://purl.org/dc/terms/> |
As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key words MAY , MUST , and MUST NOT in this document are to be interpreted as described in BCP 14 [ RFC2119 ] [ RFC8174 ] when, and only when, they appear in all capitals, as shown here.
In the following, we introduce a number of RDF*-specific definitions, which rely on the following notions (extending some of them) defined in RDF 1.1 Concepts and Abstract Syntax [ RDF11-CONCEPTS ]: blank node , default graph , graph name , IRI , literal , named graphs , object , predicate , RDF dataset , RDF graph , RDF triple , and subject .
An RDF* graph is a set of RDF* triples .
An RDF* triple is a 3-tuple defined recursively as follows:
As for RDF triples , we call the 3 components of an RDF* triple its subject , predicate and object , respectively. From the definitions above, it follows that any RDF graph is also an RDF* graph . Note also that, by definition, an RDF* triple cannot contain itself and cannot be nested infinitely.
IRIs , literals , blank nodes and RDF* triples are collectively known as RDF* terms .
For every RDF* triple t , we define its constituent terms (or simply constituents) as the set containing its subject , its predicate , its object , plus all the constituent terms of its subject and/or its object if they are themselves RDF* triples . By extension, we define the constituent terms of an RDF* graph to be the union set of the constituent terms of all its triples.
<< _:a :name "Alice" >> :statedBy :bob.
The constituent terms of this triple are :name, :statedBy, :bob, the blank node _:a, the literal "Alice", and the triple << _:a :name "Alice" >>.
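The definition of constituent terms can be illustrated with a short, non-normative Python sketch. The encoding used here is an assumption made only for this illustration and is not part of the specification: IRIs, blank nodes and literals are plain strings, and an RDF* triple is a Python 3-tuple of terms.

```python
from typing import Iterable, Set, Union

# Hypothetical encoding (not defined by the specification): IRIs, blank nodes
# and literals are plain strings; an RDF* triple is a 3-tuple of terms.
Term = Union[str, tuple]


def constituent_terms(triple: tuple) -> Set[Term]:
    """Return the constituent terms of an RDF* triple, recursing into embedded triples."""
    subject, predicate, obj = triple
    terms: Set[Term] = {subject, predicate, obj}
    for term in (subject, obj):
        if isinstance(term, tuple):  # the term is itself an embedded triple
            terms |= constituent_terms(term)
    return terms


def graph_constituent_terms(graph: Iterable[tuple]) -> Set[Term]:
    """Return the union of the constituent terms of all triples in an RDF* graph."""
    result: Set[Term] = set()
    for triple in graph:
        result |= constituent_terms(triple)
    return result


# << _:a :name "Alice" >> :statedBy :bob .
embedded = ("_:a", ":name", '"Alice"')
asserted = (embedded, ":statedBy", ":bob")
terms = constituent_terms(asserted)
```

With this encoding, `terms` contains six elements: the three components of the asserted triple (one of which is the embedded triple) and the three components of the embedded triple.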
An RDF* triple used as the subject or object of another RDF* triple is called an embedded triple . An RDF* triple that is an element of an RDF* graph is called an asserted triple . Note that, in a given RDF* graph , the same triple MAY be both embedded and asserted .
An RDF* dataset is a collection of RDF* graphs , and comprises:
Again, this definition is an extension of the notion of RDF dataset , hence it follows that any RDF dataset is also an RDF* dataset .
This section is non-normative.
According to the definitions above, an RDF* triple is an abstract entity whose identity is entirely defined by its subject, predicate and object. Conversely, given three RDF* terms s , p , and o , there is exactly and only one RDF* triple with subject s , predicate p and object o . This unique triple ( s , p , o ) can be embedded as the subject or object of multiple other triples, but must be assumed to represent the same thing everywhere it occurs, just like the same IRI p is assumed to represent the same thing everywhere it occurs.
In some situations, however, it might be necessary to distinguish the occurrences of a triple in different graphs. Consider the following sentence: "The triple <http://example.org/s> <http://example.org/p> <http://example.org/o> in (the graph represented by) file1.ttl was added by Alice, and the same triple in file2.ttl was added by Bob." Note that the words "same triple" in this sentence may be confusing, because although the triple (as an abstract entity) is the same, its respective occurrences are different things, each within a different file and with a different author (this is known, in philosophy and linguistics, as the type-token distinction).
As the embedded triple represents a unique thing, adequately conveying the meaning of the sentence above requires additional nodes for representing the two distinct occurrences. One possible solution is illustrated in the following example (using the Turtle* concrete syntax described in the next section).
_:a :occurenceOf << :s :p :o >> ;
    :in <file1.ttl> ;
    dct:creator :alice.
_:b :occurenceOf << :s :p :o >> ;
    :in <file2.ttl> ;
    dct:creator :bob.
In this section, we present Turtle*, an extension of the Turtle format [ TURTLE ] allowing the representation of RDF* graphs . For the sake of conciseness, we only describe here the differences between Turtle* and Turtle.
Turtle* is defined to follow the same grammar as Turtle, except for the EBNF productions specified below, which replace the productions having the same number (if any) in the original grammar.
[8]  objectList ::= object annotation? ( ',' object annotation? )*
[10] subject    ::= iri | BlankNode | collection | embTriple
[12] object     ::= iri | BlankNode | collection | blankNodePropertyList | literal | embTriple
[27] embTriple  ::= '<<' embSubject verb embObject '>>'
[28] embSubject ::= iri | BlankNode | embTriple
[29] embObject  ::= iri | BlankNode | literal | embTriple
[31] annotation ::= '{|' predicateObjectList '|}'
The changes are that subject and object productions have been extended to accept embedded triples, which are described by the new productions 27 to 29. Note that embedded triples accept a more restricted range of subject and object expressions than asserted triples. Additionally, the objectList production now accepts an optional annotation after each object.
This has already been discussed on the mailing list .
The idea would be to have a notation like
:bob :age 42 {| :source <http://example.org/~bob/> |}.
as shortcut for
:bob :age 42.
<< :bob :age 42 >> :source <http://example.org/~bob/>.
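The following non-normative Python sketch illustrates the shortcut described above. The function name `expand_annotation` and the encoding of terms as strings and of triples as 3-tuples are assumptions made for this illustration only.

```python
def expand_annotation(subject, predicate, obj, annotations):
    """Expand `s p o {| p1 o1 ; p2 o2 ... |}` into the triples it abbreviates.

    `annotations` is a list of (predicate, object) pairs taken from the
    annotation block. Triples are encoded as 3-tuples (an assumption of
    this sketch, not something the specification prescribes).
    """
    asserted = (subject, predicate, obj)
    triples = [asserted]  # the annotated triple is itself asserted
    for ann_predicate, ann_object in annotations:
        # each annotation yields a triple whose subject is the embedded triple
        triples.append((asserted, ann_predicate, ann_object))
    return triples


# :bob :age 42 {| :source <http://example.org/~bob/> |} .
expanded = expand_annotation(
    ":bob", ":age", "42", [(":source", "<http://example.org/~bob/>")]
)
```

Here `expanded` holds the asserted triple about Bob's age followed by one triple that uses it, in embedded form, as its subject.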
A Turtle* parser is similar to a Turtle parser as defined in Section 7 of the Turtle specification [ TURTLE ], with an additional item in its state :
Additionally, the curSubject can be bound to any RDF* term (including an embedded triple ).
A Turtle* document defines an RDF* graph composed of a set of RDF* triples. The subject and embSubject productions set the curSubject. The verb production sets the curPredicate. The object and embObject productions set the curObject. Finishing the object production, an RDF* triple curSubject curPredicate curObject is generated and added to the RDF* graph.
Beginning the embTriple production records the curSubject and curPredicate. Finishing the embTriple production yields the RDF* triple curSubject curPredicate curObject and restores the recorded values of curSubject and curPredicate.
Beginning the annotation production records the curSubject and curPredicate, and sets the curSubject to the RDF* triple curSubject curPredicate curObject. Finishing the annotation production restores the recorded values of curSubject and curPredicate.
All other productions MUST be handled as specified by Section 7 of the Turtle specification [ TURTLE ], while still applying the changes above recursively.
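As a non-normative illustration, the record and restore behaviour of the embTriple and annotation productions can be modelled with a small stack. The class below is a hypothetical sketch: its name, its method names and the tuple encoding of triples are assumptions, and it only mimics the state changes, not the tokenizing or the rest of the Turtle grammar.

```python
class TurtleStarParserState:
    """Hypothetical model of the parser state items named in the text."""

    def __init__(self):
        self.graph = set()       # the RDF* graph being built
        self.cur_subject = None
        self.cur_predicate = None
        self.cur_object = None
        self._recorded = []      # stack of recorded (curSubject, curPredicate)

    def finish_object(self):
        """Finishing the object production adds a triple to the graph."""
        self.graph.add((self.cur_subject, self.cur_predicate, self.cur_object))

    def begin_emb_triple(self):
        self._recorded.append((self.cur_subject, self.cur_predicate))

    def finish_emb_triple(self):
        """Yield the embedded triple and restore the recorded values."""
        triple = (self.cur_subject, self.cur_predicate, self.cur_object)
        self.cur_subject, self.cur_predicate = self._recorded.pop()
        return triple

    def begin_annotation(self):
        self._recorded.append((self.cur_subject, self.cur_predicate))
        self.cur_subject = (self.cur_subject, self.cur_predicate, self.cur_object)

    def finish_annotation(self):
        self.cur_subject, self.cur_predicate = self._recorded.pop()


# Simulated events for:  :bob :claims << :alice :age 42 >> {| :source :web |} .
state = TurtleStarParserState()
state.cur_subject = ":bob"                     # subject production
state.cur_predicate = ":claims"                # verb production
state.begin_emb_triple()
state.cur_subject = ":alice"                   # embSubject production
state.cur_predicate = ":age"                   # verb production
state.cur_object = "42"                        # embObject production
state.cur_object = state.finish_emb_triple()   # object production sets curObject
state.finish_object()
state.begin_annotation()
state.cur_predicate = ":source"
state.cur_object = ":web"
state.finish_object()
state.finish_annotation()
```

After these events the graph contains two asserted triples, and the embedded triple about Alice's age is not itself asserted.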
This section is non-normative.
While this document specifies only one concrete syntax, nothing prevents other concrete syntaxes of RDF* from being proposed. In particular, other existing concrete syntaxes for RDF, such as RDF/XML [ RDF-SYNTAX-GRAMMAR ], could be extended to support RDF*. Likewise, since the N-Triples syntax [ N-TRIPLES ] is a subset of Turtle, an appropriate subset of Turtle* could be defined to extend N-Triples accordingly.
This section introduces SPARQL*, which is an RDF*-aware extension of the RDF query language SPARQL [ SPARQL11-QUERY ]; i.e., SPARQL* can be used to query RDF* graphs.
In the following, we introduce a number of SPARQL*-specific definitions, which rely on the following notions, defined in SPARQL 1.1 Query Language [ SPARQL11-QUERY ]: RDF term , query variable , triple pattern , property path pattern , property path expression , and solution mapping .
A SPARQL* triple pattern is a 3-tuple that is defined recursively as follows:
As for RDF* triples , a SPARQL* triple pattern MUST NOT contain itself.
A SPARQL* basic graph pattern ( BGP *) is a set of SPARQL* triple patterns .
A SPARQL* property path pattern is a 3-tuple ( s , p , o ) where
I have added the definition of a SPARQL* property path pattern into the draft just for the sake of having such a definition. We need to think about whether it is useful to add this to SPARQL*, in which case we need to define the semantics of such SPARQL* property path patterns.
In fact, no matter what we decide, even for standard property path patterns , the semantics may have to be extended to use them over RDF* graphs .
A SPARQL* solution mapping μ is a partial function from the set of all query variables to the set of all RDF* terms . The domain of μ, denoted by dom(μ), is the set of query variables for which μ is defined.
The notion of a SPARQL* solution mapping extends the notion of a standard SPARQL solution mapping ; that is, every SPARQL solution mapping is a SPARQL* solution mapping . However, in contrast to SPARQL solution mappings , SPARQL* solution mappings may map variables also to RDF* triples .
All notions related to SPARQL solution mappings carry over naturally to SPARQL* solution mappings. In particular, the definition of compatibility extends naturally to SPARQL* solution mappings: two SPARQL* solution mappings μ₁ and μ₂ are compatible if, for every variable v that is both in dom(μ₁) and in dom(μ₂), μ₁(v) and μ₂(v) are the same RDF* term . In this case, μ₁ ∪ μ₂ is also a SPARQL* solution mapping. Moreover, for any SPARQL* solution mapping μ we write card[Ω](μ) to denote the cardinality of μ in a multiset Ω of such mappings. Finally, given a BGP * B and a SPARQL* solution mapping μ, we write μ( B ) to denote the result of replacing every variable v in B for which μ is defined with μ(v).
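The notions of compatibility, union and application of solution mappings can be sketched in Python as follows. This is a non-normative illustration under stated assumptions: a solution mapping is a `dict` from variable names to terms, simple terms are strings, RDF* triples and triple patterns are 3-tuples, and the function names are hypothetical.

```python
def compatible(mu1: dict, mu2: dict) -> bool:
    """True if both mappings agree on every variable in both domains."""
    shared = mu1.keys() & mu2.keys()
    return all(mu1[v] == mu2[v] for v in shared)


def merge(mu1: dict, mu2: dict) -> dict:
    """Return the union of two compatible solution mappings."""
    if not compatible(mu1, mu2):
        raise ValueError("solution mappings are not compatible")
    return {**mu1, **mu2}


def substitute(mu: dict, term):
    """Replace every variable for which mu is defined, recursing into nested patterns."""
    if isinstance(term, tuple):
        return tuple(substitute(mu, component) for component in term)
    return mu.get(term, term)


def apply_to_bgp(mu: dict, bgp: set) -> set:
    """Compute mu(B) for a basic graph pattern B given as a set of triple patterns."""
    return {substitute(mu, pattern) for pattern in bgp}


mu1 = {"?x": ":alice", "?t": (":alice", ":knows", ":bob")}
mu2 = {"?t": (":alice", ":knows", ":bob"), "?y": ":bob"}
mu3 = {"?x": ":carol"}
merged = merge(mu1, mu2)
instantiated = apply_to_bgp(mu1, {("?x", ":claims", "?t")})
```

Because tuples compare structurally, two bindings to an RDF* triple are "the same RDF* term" in this sketch exactly when their subjects, predicates and objects are equal.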
Next, we aim to carry over the notion of solutions for BGPs to BGP * . To this end, we first define an auxiliary concept that carries over the notion of an RDF instance mapping [ RDF11-MT ] to RDF*.
An RDF* instance mapping σ is a partial function from the set of all blank nodes to the set of all RDF* terms . The domain of σ, denoted by dom(σ), is the set of blank nodes for which σ is defined.
Similar to the corresponding notation for solution mappings, for an RDF* instance mapping σ and a BGP * B we write σ( B ) to denote the result of replacing every blank node b in B for which σ is defined with σ(b).
Now we are ready to define the notion of solution for BGP *.
Given a BGP * B and an RDF* graph G , a SPARQL* solution mapping μ is a solution for the BGP * B over G if it has the following two properties
SPARQL* is defined to follow the same grammar as SPARQL, except for the EBNF productions specified below, which replace the productions having the same number (if any) in the original grammar.
[60]  Bind                   ::= 'BIND' '(' ( Expression | EmbTP ) 'AS' Var ')'
[75]  TriplesSameSubject     ::= VarOrTermOrEmbTP PropertyListNotEmpty | TriplesNode PropertyList
[80]  Object                 ::= GraphNode | EmbTP
[81]  TriplesSameSubjectPath ::= VarOrTermOrEmbTP PropertyListPathNotEmpty | TriplesNode PropertyListPath
[105] GraphNodePath          ::= VarOrTermOrEmbTP | TriplesNodePath
[174] EmbTP                  ::= '<<' EmbSubjectOrObject Verb EmbSubjectOrObject '>>'
[175] EmbSubjectOrObject     ::= Var | BlankNode | iri | RDFLiteral | NumericLiteral | BooleanLiteral | EmbTP
[176] VarOrTermOrEmbTP       ::= Var | GraphTerm | EmbTP
This introduces a notation for embedded triple patterns (productions [174] and following), which is similar to the one defined for embedded triples in § 3. Turtle* , but accepting also variables . These embedded triple patterns are allowed in the subject ( [75] , [81] ) and object ( [80] , [105] ) positions of SPARQL* triple patterns , as well as in BIND statements ( [60] ).
Instead of reusing the keyword BIND for SPARQL* (as in my original proposal), we may want to consider using a different keyword for this functionality because the behavior is a bit different. For instance, @klinovp has mentioned this issue in an email on the mailing list . In another email , @afs has proposed to use the keyword FIND instead.
Based on the SPARQL grammar, the SPARQL specification defines the process of converting graph patterns and solution modifiers in a SPARQL query string into a SPARQL algebra expression [ SPARQL11-QUERY, Section 18.2 ]. This process must be adjusted to consider the extended grammar introduced above. In the following, any step of the conversion process that requires adjustment is discussed.
As a basis of the translation, the SPARQL specification introduces a notion of in-scope variables . To cover the new syntax elements introduced in § 4.2 Grammar , this notion MUST be extended as follows.
A variable is in-scope in a clause of the form BIND ( T AS v ) (where T is an embedded triple pattern) if the variable is variable v or the variable occurs in the embedded triple pattern T. As for standard BIND clauses with expressions, variable v must not [be] in-scope from the preceding elements in the group graph pattern in which [the BIND clause] is used [ SPARQL11-QUERY, Section 18.2.1 ].
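The extended notion of in-scope variables for such BIND clauses can be sketched as follows. This is a non-normative Python illustration: embedded triple patterns are encoded as nested 3-tuples, variables are strings starting with "?", and both function names are hypothetical.

```python
def variables_in(pattern) -> set:
    """Collect the variables occurring in a (possibly nested) triple pattern."""
    if isinstance(pattern, tuple):
        found = set()
        for component in pattern:
            found |= variables_in(component)
        return found
    if isinstance(pattern, str) and pattern.startswith("?"):
        return {pattern}
    return set()


def in_scope_after_bind(preceding_in_scope: set, embedded_pattern, target_var: str) -> set:
    """Return the in-scope variables after BIND( T AS v ), with T an embedded triple pattern.

    Raises ValueError if v is already in-scope from the preceding elements,
    mirroring the restriction on standard BIND clauses.
    """
    if target_var in preceding_in_scope:
        raise ValueError(f"{target_var} is already in-scope before the BIND clause")
    return set(preceding_in_scope) | variables_in(embedded_pattern) | {target_var}


# BIND( << <<?c a owl:Class>> :entailing ?d >> AS ?t ), with ?c already in-scope
pattern = (("?c", "a", "owl:Class"), ":entailing", "?d")
scope = in_scope_after_bind({"?c"}, pattern, "?t")
```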
The translation process starts with expanding abbreviations for IRIs and triple patterns [ SPARQL11-QUERY, Section 18.2.2.1 ]. This step MUST be extended in two ways:
Abbreviations for triple patterns with embedded triple patterns MUST be expanded as if each embedded triple pattern was a variable (or an RDF term ).
For example, the abbreviated pattern
<<?c a owl:Class>> dct:source ?src ;
    :entailing <<?c a rdfs:Class>> .
is expanded to
<<?c a owl:Class>> dct:source ?src .
<<?c a owl:Class>> :entailing <<?c a rdfs:Class>> .
Abbreviations for IRIs in all embedded triple patterns MUST be expanded.
<<?c a rdfs:Class>>
<<?c <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2000/01/rdf-schema#Class>>>
The translation of property path patterns has to be adjusted because the extended grammar allows for SPARQL* property path patterns whose subject or object is a SPARQL* triple pattern .
The translation as specified in the W3C specification distinguishes four cases. The first three of these cases do not require adjustment because they are taken care of either by recursion or by the adjusted translation of basic graph patterns (as defined in § 4.3.4 Translate Basic Graph Patterns below). However, the fourth case MUST be adjusted as follows.
Let X P Y be a string that corresponds to the fourth case in [ SPARQL11-QUERY, Section 18.2.2.4 ]. Given the grammar introduced in § 4.2 Grammar, X and Y may be an RDF term, a variable, or an embedded triple pattern, respectively (and P is a property path expression). The string X P Y is translated to the algebra expression Path( X’, P, Y’ ) where X’ and Y’ are the result of calling a function named Lift for X and Y, respectively. For some input string Z (such as X or Y) that can be an RDF term, a variable, or an embedded triple pattern, the function Lift is defined recursively as follows:
If Z is an embedded triple pattern << S P O >>, return the SPARQL* triple pattern ( Lift( S ), P, Lift( O ));
Lift translates every embedded triple pattern that can be written in the SPARQL* syntax into a SPARQL* triple pattern.
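A recursive function of this kind can be sketched in Python as follows. The sketch is non-normative and rests on assumptions: `EmbTP` is a hypothetical parse-tree node for an embedded triple pattern, SPARQL* triple patterns are encoded as 3-tuples, and any input that is not an embedded triple pattern (an RDF term or a variable) is assumed to be returned unchanged.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class EmbTP:
    """Hypothetical parse-tree node for an embedded triple pattern << S P O >>."""
    subject: Any
    predicate: Any
    obj: Any


def lift(z):
    """Translate an embedded triple pattern into a SPARQL* triple pattern (a 3-tuple)."""
    if isinstance(z, EmbTP):
        return (lift(z.subject), z.predicate, lift(z.obj))
    return z  # assumption: RDF terms and variables are returned as they are


def translate_path(x, p, y):
    """Translate the string X P Y into the algebra expression Path(X', P, Y')."""
    return ("Path", lift(x), p, lift(y))


# << <<?c a owl:Class>> :entailing ?d >> used as the subject of a path pattern
nested = EmbTP(EmbTP("?c", "a", "owl:Class"), ":entailing", "?d")
lifted = lift(nested)
path = translate_path(nested, "dct:source+", "?src")
```

The recursion terminates because an embedded triple pattern cannot contain itself, so every call descends into strictly smaller patterns.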
After translating property path patterns, the translation process collects any adjacent triple patterns [...] to form a basic graph pattern [ SPARQL11-QUERY, Section 18.2.2.5 ]. This step has to be adjusted because triple patterns in the extended syntax may have an embedded triple pattern in their subject position or in their object position (or in both).
To ensure that every result of this step is a BGP*, before adding a triple pattern to its corresponding collection, its subject and object MUST be replaced by the result of calling function Lift for the subject and the object, respectively.
The extended grammar in § 4.2 Grammar allows for BIND clauses with an embedded triple pattern . The translation of such a BIND clause to a SPARQL algebra expression requires a new algebra symbol:
Then, any string of the form BIND( T AS v ) with T being an embedded triple pattern (i.e., not a standard BIND expression) is translated to the algebra expression TR( T’, v ) where T’ is the result of the function Lift for T.
Note that the translation of BIND clauses with an embedded triple pattern as defined in this section is used during the translation of group graph patterns. The case of BIND clauses with an embedded triple pattern is covered in this translation of group graph patterns by the last, “catch all other” IF statement (i.e., the IF statement with the condition E is any other form) and not by the IF statement for BIND clauses with an expression.
The SPARQL specification defines a function eval( D ( G ), algebra expression) as the evaluation of an algebra expression with respect to a dataset D having active graph G [ SPARQL11-QUERY, Section 18.6 ]. Recall that the dataset D in the context of SPARQL* is an RDF* dataset and, thus, the active graph G is an RDF* graph, and so is any other graph in dataset D.
The definition of the eval function is recursive; the two base cases of this definition for SPARQL* are given as follows:
For any other algebra expression, the SPARQL specification defines algebra operators [ SPARQL11-QUERY ]. These definitions can be extended naturally to operate over multisets of SPARQL* solution mappings (instead of ordinary solution mappings ). Given this extension, the recursive steps of the definition of the eval function for SPARQL* are the same as in the SPARQL specification.
In SPARQL, queries can take four forms: SELECT , CONSTRUCT , DESCRIBE , and ASK (see SPARQL 1.1 Query, Section 16 [ SPARQL11-QUERY ]). The first of these returns a sequence of solution mappings that contain variable bindings. The second and third both return an RDF graph, and the last returns a boolean value.
The result of the ASK query form is not changed by the introduction of RDF*, and the result of the CONSTRUCT and DESCRIBE forms can be represented by Turtle* . However, since the SELECT form deals with returning individual RDF terms, the specific serialization formats for representing such query results need to be extended so that the new embedded triple RDF term can be represented. In this section, we propose extensions for the two most common formats for this purpose: SPARQL 1.1 Query Results JSON Format , and SPARQL Query Results XML Format (Second Edition) .
In addition to defining the extended formats for serializing the result of a SPARQL* SELECT query ( #12 and #13 ), we have to decide whether we need/want new mime types for these extended formats? Similarly, do we need/want to introduce another namespace for the extended XML result format?
The result of a SPARQL SELECT query is serialized in JSON as defined in SPARQL 1.1 Query Results JSON Format , which specifies a JSON representation of variable bindings to RDF terms (see [ sparql11-results-json, Section 3.2 ]). To accommodate the new RDF term for embedded triples that RDF* introduces, the table of RDF term JSON representations in sparql11-results-json, Section 3.2.2 is extended with the following entry:
An embedded triple with subject RDF term S, predicate RDF term P and object RDF term O
{
  "type": "triple",
  "value": {
    "subject": S,
    "predicate": P,
    "object": O
  }
}
where S, P and O are encoded using the same format, recursively.
<< <http://example.org/alice> <http://example.org/name> "Alice" >>
{
  "type": "triple",
  "value": {
    "subject": {
      "type": "uri",
      "value": "http://example.org/alice"
    },
    "predicate": {
      "type": "uri",
      "value": "http://example.org/name"
    },
    "object": {
      "type": "literal",
      "value": "Alice",
      "datatype": "http://www.w3.org/2001/XMLSchema#string"
    }
  }
}
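The recursive encoding can be illustrated with the following non-normative Python sketch. It assumes that an embedded triple is given as a 3-tuple and that every other term is already a `dict` in the JSON form defined for it by SPARQL 1.1 Query Results JSON Format; the function name `encode_term` is hypothetical.

```python
import json


def encode_term(term):
    """Encode an RDF* term for the extended JSON results format.

    Assumption of this sketch: embedded triples are 3-tuples, and all other
    terms are already dicts in the standard JSON results representation.
    """
    if isinstance(term, tuple):
        subject, predicate, obj = term
        return {
            "type": "triple",
            "value": {
                "subject": encode_term(subject),
                "predicate": encode_term(predicate),
                "object": encode_term(obj),
            },
        }
    return term


alice = {"type": "uri", "value": "http://example.org/alice"}
name = {"type": "uri", "value": "http://example.org/name"}
literal = {
    "type": "literal",
    "value": "Alice",
    "datatype": "http://www.w3.org/2001/XMLSchema#string",
}

encoded = encode_term((alice, name, literal))
serialized = json.dumps(encoded, indent=2)
```

A nested embedded triple, such as `((alice, name, literal), name, literal)`, is encoded by the same function with a "triple" entry in the "subject" position.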
The result of a SPARQL SELECT query is serialized in XML as defined in SPARQL Query Results XML Format (Second Edition) . This format proposes an XML representation of variable bindings to RDF terms.
To accommodate the new RDF term for embedded triples that RDF* introduces, the list of RDF terms and their XML representations in [ rdf-sparql-XMLres, Section 2.3.1 ] is extended as follows:
An embedded triple with subject term S, predicate term P, and object term O
<binding>
  <triple>
    <subject>S</subject>
    <predicate>P</predicate>
    <object>O</object>
  </triple>
</binding>
where S, P and O are encoded recursively, using the same format, without the enclosing <binding> tag.
<< <http://example.org/alice> <http://example.org/name> "Alice" >>
<triple>
  <subject>
    <uri>http://example.org/alice</uri>
  </subject>
  <predicate>
    <uri>http://example.org/name</uri>
  </predicate>
  <object>
    <literal datatype='http://www.w3.org/2001/XMLSchema#string'>Alice</literal>
  </object>
</triple>
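The XML encoding can be sketched in the same way, here with Python's standard `xml.etree.ElementTree` module. The sketch is non-normative; the tagged-tuple encoding of terms (`("uri", value)`, `("bnode", label)`, `("literal", value, datatype)` and `("triple", s, p, o)`) and the function name `term_to_xml` are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET


def term_to_xml(term):
    """Build the XML element for an RDF* term given as a tagged tuple."""
    kind = term[0]
    if kind == "triple":
        _, subject, predicate, obj = term
        element = ET.Element("triple")
        for name, part in (("subject", subject), ("predicate", predicate), ("object", obj)):
            # each component is encoded recursively, without a <binding> wrapper
            ET.SubElement(element, name).append(term_to_xml(part))
        return element
    if kind in ("uri", "bnode"):
        element = ET.Element(kind)
        element.text = term[1]
        return element
    if kind == "literal":
        element = ET.Element("literal")
        element.text = term[1]
        if len(term) > 2 and term[2] is not None:
            element.set("datatype", term[2])
        return element
    raise ValueError(f"unknown term kind: {kind!r}")


triple = (
    "triple",
    ("uri", "http://example.org/alice"),
    ("uri", "http://example.org/name"),
    ("literal", "Alice", "http://www.w3.org/2001/XMLSchema#string"),
)
root = term_to_xml(triple)
xml_text = ET.tostring(root, encoding="unicode")
```

To produce a complete variable binding, the returned element would be appended to a `<binding>` element carrying the variable name, as in the standard XML results format.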
We need a section that defines SPARQL* Update. The text for this section can be taken from the following document: https://blog.liu.se/olafhartig/documents/sparql-update/
In this section, we provide a model-theoretic semantics for RDF*, by extending the one defined in RDF 1.1 Semantics [ RDF11-MT ].
An RDF* triple is said to be ground if it has no blank node in its constituent terms . An RDF* graph is ground if all its triples are ground . This definition generalizes the notion of ground RDF graph . IRIs , literals and ground RDF* triples are collectively known as ground RDF* terms .
An RDF* simple interpretation I is a structure consisting of:
This definition is identical to the definition of simple interpretation [ RDF11-MT ] up to item 5 included. Item 6 extends it to support RDF* triples . Any RDF simple interpretation can be considered as an RDF* simple interpretation with IT =∅.
The denotation of a ground RDF* graph in an RDF* simple interpretation I is then given by the following rules, where the interpretation is also treated as a function from expressions (terms, triples and graphs) to elements of the universe and truth values:
Since IL and IT are partial mappings, I ( E ) may be undefined for some literal or triple E . In that case, E has no semantic value in I , so any asserted triple having E as subject or object will fail to satisfy the condition above, hence any graph containing such an asserted triple will be false.
Given an RDF* graph E , we call the embedded blank nodes ( ebn ) of E the set of blank nodes appearing in subject or object position of some embedded triple in E ; we call the open blank nodes ( obn ) of E all the other blank nodes appearing in E .
A mapping from a set of blank nodes into a set of ground RDF* terms is called a grounding function . We define the extended application of a grounding function Γ to other RDF* terms and to RDF* graphs as follows:
Suppose I is an RDF* simple interpretation and A is a mapping from a set of blank nodes to the universe IR of I . Define the mapping [ I + A ] of RDF* terms into IR to be A on blank nodes of the set, and I on any other term; and extend this mapping to RDF* triples and RDF* graphs using the rules given above for ground graphs . Then the denotation of any RDF* graph in I is given by:
Following RDF 1.1 Semantics , we extend the notions of satisfiability and entailment. An RDF* simple interpretation I satisfies E when I ( E )=true. E is (simply) satisfiable when an RDF* simple interpretation exists which satisfies it, otherwise (simply) unsatisfiable . An RDF* graph G simply entails an RDF* graph H when every interpretation which satisfies G also satisfies H . If two RDF* graphs G and H each entail the other then they are logically equivalent .
Any semantic extension of RDF MAY be extended to RDF* by replacing the semantic conditions, the notion of satisfiability and the notion of entailment, defined in RDF 1.1 Semantics , by their corresponding extension defined above. This is notably the case for Datatype entailment and RDFS entailment .
This section is non-normative.
In this section, we discuss a number of desired features of RDF* semantics in order to shed light on the design choices made in the previous section.
RDF* must be able to quote a triple without asserting it, so that we can represent people's beliefs or claims without endorsing them, or represent facts that are no longer or not yet true. This is ensured by the fact that only asserted triples are considered to determine if the denotation of a graph is true or false.
For example, the following graph:
<< :alice foaf:knows :bob >> dc:creator :alice.
does not entail the triple :alice foaf:knows :bob, and the SPARQL* query below executed against the graph above would return no result.
SELECT ?who {
  :alice foaf:knows ?who
}
Embedded triples are referentially opaque, meaning that triples using different terms can be considered different, even if their terms can be inferred to be synonyms. Although RDF* simple entailment has no means to entail any kind of synonymy, it is possible in some semantic extensions , such as OWL [ OWL2-RDF-BASED-SEMANTICS ].
A well known example is the superman problem :
:loisLane :believes << :superman :can :fly >>.
:superman owl:sameAs :clarkKent.
:superman :can :fly.
Intuitively, this graph states that Superman and Clark Kent are the same person, so if Superman can fly, then it follows that Clark Kent can as well. So, under OWL2-entailment, this graph entails :clarkKent :can :fly. However, Lois Lane does not know that Superman and Clark Kent are the same person. So from her point of view, the two triples are not equivalent, and she can believe one without believing the other.
Referential opacity is ensured by differentiating the intension of embedded triples (represented by the IT mapping) from their extension (the denotations of their subject, predicate and object). Since IT is based solely on the syntax of triples, two syntactically different triples can always have different intensions, even if their subjects, predicates and objects are semantically equivalent.
On the other hand, all triples with the same intension are required to have the same extension. So if two RDF* triples denote the same resource T, their subjects, predicates and objects, respectively, are constrained to also denote the same thing.
Blank nodes in embedded triples have the same scope as blank nodes used in the subject or object position of asserted triples (usually the whole graph or the whole dataset in which they appear). This means that the same blank node identifier used in different embedded triples , or at different levels of nesting, will refer to the same thing.
For example, in the following graph:
:alice :knows _:x.
<< _:x :name "Bob" >> dc:creator :alice.
<< _:x :workingFor :acme >> dc:creator :alice.
the three occurrences of _:x must refer to the same resource in every interpretation of the graph. In other words, it must be the same resource that Alice knows, that she claims is named "Bob", and that she claims works for ACME. As a consequence, the following query will return "Bob":
SELECT ?name {
:alice :knows ?x.
<< ?x :name ?name >> dc:creator :alice.
<< ?x :workingFor :acme >> dc:creator :alice.
}
As another consequence, the following graph does not entail the graph above (because the graph below allows the resource known by Alice to be different from the one about which she makes claims).
:alice :knows _:y.
<< _:x :name "Bob" >> dc:creator :alice.
<< _:x :workingFor :acme >> dc:creator :alice.
Formally, the second graph is satisfied by an interpretation having:
:alice →A, :knows →K, dc:creator →C, :bob →B
<< :bob :name "Bob" >> →T1, << :bob :workingFor :acme >> →T2
with Γ: _:x → :bob, and A: _:y →Y.
But this interpretation cannot satisfy the first graph, because it would require a grounding function Γ' such that I(Γ'(_:x)) = Y in order to satisfy the first triple, and Γ'(_:x) = :bob in order to satisfy the second and third triples.
The interpolation lemma [ RDF11-MT ] states that an RDF graph G simply entails an RDF graph E if and only if a subgraph of G is an instance of E. Intuitively, this means that all graphs simply entailed by G can be constructed by:
A design goal of the RDF* semantics was to preserve that property.
We didn't prove it yet...
This section is non-normative.
A lot of discussions on the RDF* mailing list and GitHub repository refer to SA-mode and PG-mode. Those abbreviations stand for "Separate Assertion mode" and "Property Graph mode". They originate in the fact that different versions of RDF* have been published over the years, with different designs. In PG-mode, any embedded triple was also considered asserted . SA-mode, on the other hand, allowed the use of embedded triples without those triples being automatically asserted , requiring that they be asserted separately when that was intended. SA-mode was more flexible, but induced redundancy in the use-cases that PG-mode was designed to address.
The notion of annotations in the Turtle* syntax was introduced to remove the need for different modes. Rather than interpret the same syntax differently in each mode, which would have caused interoperability problems and required a switch for those modes, it was decided to provide a different syntax for each use case.
The << ... >> syntax represents a triple that is embedded without being asserted, satisfying the need formerly filled by SA-mode.
The :a :b :c {| :p :o ... |} annotation syntax creates triples where the subject is an embedded version of the triple asserted just before the annotation (here, :a :b :c), without the need to repeat it, satisfying the need formerly filled by PG-mode.
The motivating example in the original RDF* paper [ RDF-STAR-FOUNDATION ] was on a provenance use-case, and is repeated below.
# the controversial seminal example
:bob foaf:name "Bob".
<<:bob foaf:age 23>> dct:creator <http://example.com/crawlers#c1> ;
                     dct:source <http://example.net/listing.html> .
This example was further debated on the RDF* mailing list, as it appears to have set wrong expectations about what RDF* embedded triples represent. More precisely, from this example, one may wrongly assume that <<:bob foaf:age 23>> represents the occurrence of the given triple at the address http://example.net/listing.html (see § 2.1 Triples and occurrences). This impression may be reinforced by the use of dct:creator: arguably, a triple (as a unique abstract entity) is not "created" by anyone, while an occurrence thereof can be said to be created or authored.
Another serious issue with this example is that it does not allow the addition of other creators and sources for the triple: one could not tell which source corresponds to which creator. Correctly capturing this information would require additional nodes to explicitly represent triple occurrences, as in Example 4 . In retrospect, the provenance use-case, although a valid use-case for RDF*, was not the most suitable choice for an introductory example.
This section is non-normative.