Copyright © 2020-2022 the Contributors to the YAML-LD Specification, published by the JSON for Linking Data Community Group under the W3C Community Contributor License Agreement (CLA). A human-readable summary is available.
In recent years, [YAML] has emerged as a more concise format
to represent information that had previously been serialized as JSON,
including Linked Data.
This document defines how to serialize linked data
in YAML.
Moreover, it registers the application/ld+yaml media type.
This specification was published by the JSON for Linking Data Community Group. It is not a W3C Standard nor is it on the W3C Standards Track. Please note that under the W3C Community Contributor License Agreement (CLA) there is a limited opt-out and other conditions apply. Learn more about W3C Community and Business Groups.
This document has been developed by the JSON-LD Community Group.
GitHub Issues are preferred for discussion of this specification. Alternatively, you can send comments to our mailing list. Please send them to public-linked-json@w3.org (subscribe, archives).
[JSON-LD11] is a JSON-based format to serialize Linked Data. In recent years, [YAML] has emerged as a more concise format to represent information that had previously been serialized as [JSON], including API specifications, data schemas, and Linked Data.
This document defines YAML-LD as a set of conventions on top of YAML which specify how to serialize Linked Data [LINKED-DATA] as [YAML] based on JSON-LD syntax, semantics, and APIs.
Since YAML is more expressive than JSON, both in the available data types and in the document structure (see [I-D.ietf-httpapi-yaml-mediatypes]), this document identifies constraints on YAML such that any YAML-LD document can be represented in JSON-LD.
This section is non-normative.
This document uses the following terms as defined in external specifications and defines terms specific to YAML-LD.
A YAML-LD stream is a YAML stream of YAML-LD documents.
For interoperability considerations on YAML streams, see the relevant section in YAML Media Type.
A YAML-LD document is any YAML document from which a conversion to [JSON] produces a valid JSON-LD document which can be interpreted as [LINKED-DATA].
The term media type is imported from [RFC6838].
The term JSON is imported from [JSON].
The term JSON document represents a serialization of a resource conforming to the [JSON] grammar.
The term JSON-LD document is imported from [JSON-LD11].
The term internal representation is imported from [JSON-LD11-API]. The term documentLoader is imported from [JSON-LD11-API].
The terms array, boolean, map, map entry, null, and string are imported from [INFRA].
The term number is imported from [ECMASCRIPT].
The terms YAML, YAML representation graph, YAML stream, YAML directive, YAML document, YAML sequence (either block sequence or flow sequence), YAML mapping (either block mapping or flow mapping), node, scalar, node anchor, and alias node, are imported from [YAML].
The term content negotiation is imported from [RFC9110].
The terms fragment and fragment identifier in this document are to be interpreted as in [URI].
The term Linked Data is imported from [LINKED-DATA].
This section is non-normative.
This specification makes use of the following namespace prefixes:
| Prefix | IRI |
|---|---|
| ex | http://example.org/ |
| rdf | http://www.w3.org/1999/02/22-rdf-syntax-ns# |
| xsd | http://www.w3.org/2001/XMLSchema# |
These are used within this document as part of a compact IRI
as a shorthand for the resulting IRI, such as xsd:string
used to represent http://www.w3.org/2001/XMLSchema#string.
As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key words MAY, MUST, MUST NOT, RECOMMENDED, and SHOULD in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.
A YAML-LD document complies with this specification if it follows the normative statements from this specification and can be interpreted as [JSON-LD11] after transformation into the internal representation. For convenience, normative statements for documents are often phrased as statements on the properties of the document.
A YAML-LD document complies with the YAML-LD-JSON profile of this specification if it follows the normative statements from this specification and can be transformed into a JSON-LD representation, then back to a conforming YAML-LD document, without loss of semantic information.
This section is non-normative.
To ease writing and collaborating on [JSON-LD11] documents, it is becoming common practice to serialize them as [YAML]. This requires a registered media type, not only to enable content negotiation of linked data documents in YAML, but also to define the expected behavior of applications that process these documents, including fragment identifiers and interoperability considerations.
This is because YAML is more flexible than [JSON]:
The first goal of this specification is to allow a JSON-LD document to be processed and serialized into YAML, and then back into JSON-LD, without losing any semantic information.
This is always possible, because a YAML representation graph can always represent a tree, because JSON data types are a subset of YAML's, and because JSON encoding is UTF-8.
The subset of YAML-LD which supports serialization of JSON-LD documents is defined as the YAML-LD-JSON profile of YAML-LD.
Example: The JSON-LD document below
{
  "@context": "http://example.org/context.jsonld",
  "@graph": [
    {"@id": "http://example.org/1", "title": "Example 1"},
    {"@id": "http://example.org/2", "title": "Example 2"},
    {"@id": "http://example.org/3", "title": "Example 3"}
  ]
}
can be serialized as YAML as follows.
Note that keys starting with @ need to be enclosed in quotes
(as shown in this example),
because @ is a reserved character in YAML.
%YAML 1.2
---
'@context': http://example.org/context.jsonld
'@graph':
  - '@id': http://example.org/1
    title: Example 1
  - '@id': http://example.org/2
    title: Example 2
  - '@id': http://example.org/3
    title: Example 3
This document is based on YAML 1.2.2,
but YAML-LD is not tied to a specific version of YAML.
Implementers concerned about features related to a specific YAML version
can specify it in documents using the %YAML directive
(see 7. Interoperability Considerations).
FIXME.
This section is non-normative.
As JSON-LD keywords cannot be represented in YAML without being quoted, publishers may find it useful to declare term aliases for these keywords.
To make this easier, a convenience context which pre-defines
these aliases is available at https://www.w3.org/ns/yaml-ld/v1, and
partially shown below:
{
  "@context": {
    "$always": "@always",
    "$base": "@base",
    "$container": "@container",
    "$direction": "@direction",
    "$embed": "@embed",
    "$explicit": "@explicit",
    "$graph": "@graph",
    "$id": "@id",
    "$import": "@import",
    "$included": "@included",
    "$index": "@index",
    "$json": "@json",
    "$language": "@language",
    "$list": "@list",
    "$nest": "@nest",
    "$never": "@never",
    "$none": "@none",
    "$null": "@null",
    "$omitDefault": "@omitDefault",
    "$once": "@once",
    "$prefix": "@prefix",
    "$propagate": "@propagate",
    "$protected": "@protected",
    "$requireAll": "@requireAll",
    "$reverse": "@reverse",
    "$set": "@set",
    "$type": "@type",
    "$value": "@value",
    "$version": "@version",
    "$vocab": "@vocab"
  }
}
The convenience context contains an alias for every JSON-LD keyword
permitted to be aliased by the JSON-LD 1.1 specifications;
JSON-LD does not allow @context to be aliased.
This allows YAML-LD to be created with minimal use of quoted keys.
The YAML-reserved "@" character is replaced by "$",
which is not reserved and therefore does not require quoting.
Consider Example 12 reformatted using the convenience context:
%YAML 1.2
---
'@context':
  - http://example.org/context.jsonld
  - https://www.w3.org/ns/yaml-ld/v1
$graph:
  - $id: http://example.org/1
    title: Example 1
  - $id: http://example.org/2
    title: Example 2
  - $id: http://example.org/3
    title: Example 3
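As a rough illustration of what the aliases achieve, the sketch below rewrites `$`-prefixed keys back to their `@` keywords in an already-parsed document. The `ALIASES` subset and the `unalias` helper are hypothetical names introduced here. A JSON-LD processor resolves aliases through the context rather than by string rewriting, and this sketch ignores aliases that occur as values (such as `$language`).

```python
import json

# Hypothetical subset of the alias table from the convenience context.
ALIASES = {"$graph": "@graph", "$id": "@id", "$type": "@type", "$value": "@value"}


def unalias(node):
    """Recursively replace aliased map keys with their JSON-LD keywords."""
    if isinstance(node, dict):
        return {ALIASES.get(key, key): unalias(value) for key, value in node.items()}
    if isinstance(node, list):
        return [unalias(item) for item in node]
    return node  # scalars are returned unchanged


# In-memory form of part of the example document, written out by hand.
document = {
    "$graph": [
        {"$id": "http://example.org/1", "title": "Example 1"},
        {"$id": "http://example.org/2", "title": "Example 2"},
    ]
}

print(json.dumps(unalias(document), indent=2))
```

The rewriting only shows the correspondence between the two spellings; it is not a substitute for context processing.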
A YAML-LD document MUST be encoded in UTF-8, to ensure interoperability with [JSON].
Since anchor names are a serialization detail, such anchors MUST NOT be used to convey relevant information, MAY be altered when processing the document, and MAY be dropped when interpreting the document as JSON-LD.
A YAML-LD document MAY contain anchored nodes and alias nodes, but its representation graph MUST NOT contain cycles. When interpreting the document as JSON-LD, alias nodes MUST be resolved by value to their target nodes.
The YAML-LD document in the following example
contains alias nodes for the {"@id": "countries:ITA"} object:
%YAML 1.2
---
"@context":
  "@vocab": "http://schema.org/"
  "countries": "http://publication.europa.eu/resource/authority/country/"
"@graph":
  - &ITA
    "@id": countries:ITA
  - "@id": http://people.example/Homer
    name: Homer Simpson
    nationality: *ITA
  - "@id": http://people.example/Lisa
    name: Lisa Simpson
    nationality: *ITA
While the representation graph (and eventually the in-memory representation of the data structure, e.g., a Python dictionary or a Java hashmap) will still contain references between nodes, the JSON-LD serialization will not, as shown below:
{
  "@context": {
    "@vocab": "http://schema.org/",
    "countries": "http://publication.europa.eu/resource/authority/country/"
  },
  "@graph": [
    {
      "@id": "countries:ITA"
    },
    {
      "@id": "http://people.example/Homer",
      "name": "Homer Simpson",
      "nationality": {
        "@id": "countries:ITA"
      }
    },
    {
      "@id": "http://people.example/Lisa",
      "name": "Lisa Simpson",
      "nationality": {
        "@id": "countries:ITA"
      }
    }
  ]
}
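The by-value resolution and the prohibition of cycles can be sketched on the in-memory structure that a YAML loader would typically produce. This is a minimal sketch under stated assumptions: `resolve_by_value` is a hypothetical helper, the shared `ita` dict stands in for the anchored node, and map keys are assumed to be strings.

```python
import json


def resolve_by_value(node, _active=None):
    """Copy a parsed document, expanding shared nodes and rejecting cycles."""
    active = set() if _active is None else _active
    if not isinstance(node, (dict, list)):
        return node  # scalars need no copying
    if id(node) in active:
        # The node is its own ancestor, so the graph is cyclic.
        raise ValueError("representation graph contains a cycle")
    active.add(id(node))
    try:
        if isinstance(node, dict):
            return {key: resolve_by_value(value, active) for key, value in node.items()}
        return [resolve_by_value(item, active) for item in node]
    finally:
        active.discard(id(node))


ita = {"@id": "countries:ITA"}  # stands in for the node anchored as &ITA
graph = [
    ita,
    {"@id": "http://people.example/Homer", "nationality": ita},
    {"@id": "http://people.example/Lisa", "nationality": ita},
]

resolved = resolve_by_value(graph)
print(json.dumps(resolved, indent=2))
```

After resolution every occurrence of the aliased node is an independent copy, which matches what a JSON serialization can express.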
Every YAML-LD file is a YAML-LD stream and might contain multiple YAML-LD documents, as shown in the example below.
"@id": ex:Ray
"@type": ex:Cat
name:
  en: Ray
---
"@id": ex:Smoke
"@type": ex:Cat
name:
  en: Smoke
Each of the individual YAML documents in the stream has to be converted into a separate JSON-LD document and processed separately.
This is inconsistent with the processing description in 5.1 Converting a YAML stream.
To reduce the demand for loading static documents, implementations SHOULD maintain a locally cached version of the following documents that will be satisfied by the default documentLoader.
https://www.w3.org/ns/yaml-ld/v1
YAML-LD processing is defined by converting YAML to the internal representation and using JSON-LD 1.1 Processing Algorithms and API to process that representation, after which the representation is converted back to YAML. Information specific to a given YAML document structure is lost in this transformation, which limits the ability to fully round-trip a YAML-LD document back to an equivalent representation. Consequently, round-tripping in this context is limited to preservation of the semantic representation of a document, rather than a specific syntactic representation.
The conversion process represented here is compatible with the description of Constructing Native Data Structures in [YAML].
A YAML stream is composed of zero or more YAML documents.
YAML streams may correspond more directly to JavaScript Object Notation (JSON) Text Sequences, which are not presently part of the JSON-LD representation model. The description here more closely aligns with how JSON-LD interprets HTML Scripts.
Any error reported in a recursive processing step results in the failure of this processing step.
From the YAML grammar, a YAML document MAY be preceded by a Document Prefix and/or a set of directives followed by a YAML bare document, which is composed of a single node.
Any error reported in a recursive processing step results in the failure of this processing step.
Both block sequences and flow sequences are directly aligned with an array in the internal representation.
Any error reported in a recursive processing step results in the failure of this processing step.
Both block mappings and flow mappings are directly aligned with a map in the internal representation.
Any error reported in a recursive processing step results in the failure of this processing step.
The conversion result is the value of the entry in the named nodes map having the node entry. If none exist, the document is invalid, and processing MUST end in failure.
An invalid YAML document may have an alias node used within the node anchor definition which it references. In some implementations, this may lead to a processing error.
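The node conversions described above can be sketched as a single recursive function. Everything in this sketch is an assumption made for illustration: the `Scalar`, `Sequence`, `Mapping`, and `Alias` classes are simplified stand-ins for nodes of a representation graph, `named_nodes` plays the role of the named nodes map, and mapping keys are assumed to be scalars. It is not the normative algorithm.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Scalar:
    value: object
    anchor: Optional[str] = None


@dataclass
class Sequence:
    items: list
    anchor: Optional[str] = None


@dataclass
class Mapping:
    entries: list  # list of (key node, value node) pairs
    anchor: Optional[str] = None


@dataclass
class Alias:
    name: str


def convert(node, named_nodes):
    """Convert a node to a dict, list, or scalar; fail on unknown aliases."""
    if isinstance(node, Alias):
        if node.name not in named_nodes:
            raise ValueError(f"alias *{node.name} has no matching anchor")
        return named_nodes[node.name]
    if isinstance(node, Sequence):
        result = [convert(item, named_nodes) for item in node.items]
    elif isinstance(node, Mapping):
        result = {convert(k, named_nodes): convert(v, named_nodes) for k, v in node.entries}
    else:
        result = node.value
    if node.anchor is not None:
        # Recorded only after conversion, so an alias nested inside
        # its own anchor definition is reported as an error.
        named_nodes[node.anchor] = result
    return result


doc = Mapping([
    (Scalar("@graph"), Sequence([
        Mapping([(Scalar("@id"), Scalar("countries:ITA"))], anchor="ITA"),
        Mapping([(Scalar("nationality"), Alias("ITA"))]),
    ])),
])
print(convert(doc, {}))
```

Because the anchor is registered only once its node has been fully converted, a self-referencing anchor fails with the same error as a missing one, which is one way an implementation might surface the invalid case mentioned above.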
This section is non-normative.
See Security considerations in JSON-LD 1.1. Also, see the YAML media type registration.
This section is non-normative.
For general interoperability considerations on the serialization of JSON documents in [YAML], see YAML and the Interoperability consideration of application/yaml [I-D.ietf-httpapi-yaml-mediatypes].
The YAML-LD format and the media type registration are not restricted to a specific version of YAML, but implementers that want to use YAML-LD with YAML versions other than 1.2.2 need to be aware that the considerations and analysis provided here, including interoperability and security considerations, are based on the YAML 1.2.2 specification.
This section has been submitted to the Internet Engineering Steering Group (IESG) for review, approval, and registration with IANA.
This section describes the information required to register the above media type according to [RFC6838].
profile
A non-empty list of space-separated URIs identifying specific
constraints or conventions that apply to a YAML-LD document according to [RFC6906].
A profile does not change the semantics of the resource representation
when processed without profile knowledge, so that clients both with
and without knowledge of a profiled resource can safely use the same
representation. The profile parameter MAY be used by
clients to express their preferences in the content negotiation process.
If the profile parameter is given, a server SHOULD return a document that
honors the profiles in the list which it recognizes,
and MUST ignore the profiles in the list which it does not recognize.
It is RECOMMENDED that profile URIs are dereferenceable and provide
useful documentation at that URI. For more information and background
please refer to [RFC6906].
This specification allows the use of the profile parameters listed in
and additionally defines the following:
http://www.w3.org/ns/json-ld#extended
When used as a media type parameter [RFC4288]
in an HTTP Accept header field [RFC9110],
the value of the profile parameter MUST be enclosed in quotes (") if it contains
special characters such as whitespace, which is required when multiple profile URIs are combined.
When processing the "profile" media type parameter, it is important to note that its value contains one or more URIs and not IRIs. In some cases it might therefore be necessary to convert between IRIs and URIs as specified in section 3 Relationship between IRIs and URIs of [RFC3987].
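The quoting rule can be sketched with a small helper that builds an Accept header value. The `accept_header` function is hypothetical, and the second profile URI (`#expanded`, taken from JSON-LD 1.1) is used only to show how several URIs are combined. The sketch always quotes the value, which is permitted for a media type parameter and becomes necessary once the list contains a space.

```python
def accept_header(profiles=()):
    """Build an Accept value for application/ld+yaml with optional profile URIs."""
    if not profiles:
        return "application/ld+yaml"
    for uri in profiles:
        # A profile is a URI, so it cannot itself contain whitespace or quotes.
        if any(ch.isspace() for ch in uri) or '"' in uri:
            raise ValueError(f"not a valid profile URI: {uri!r}")
    return 'application/ld+yaml;profile="{}"'.format(" ".join(profiles))


print(accept_header([
    "http://www.w3.org/ns/json-ld#extended",
    "http://www.w3.org/ns/json-ld#expanded",
]))
```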
This section is non-normative.
Fragment identifiers used with application/ld+yaml are treated as in RDF syntaxes, as per RDF 1.1 Concepts and Abstract Syntax [RDF11-CONCEPTS] and do not follow the process defined for application/yaml.
This section is non-normative.
FIXME
This section is non-normative.
REMOVE THIS SECTION BEFORE PUBLICATION.
While YAML-LD could define a specific predicate for comments, that is insufficient because, for example, the order of keywords is not preserved in JSON, so the comments could be displaced. This specification does not provide a means for preserving [YAML] comments after a JSON serialization.
# First comment
"@context": "http://schema.org"
# Second comment
givenName: John
Transforming the above entry into a JSON-LD document results in:
{
  "@context": "http://schema.org",
  "givenName": "John"
}
The above structures cannot be preserved when serializing to JSON-LD and - with respect to cycles - the serialization will fail.
Programming languages such as Java and Python already support
YAML representation graphs, but these implementations may behave
differently.
In the following example, &value references the value
of the keyword value.
value: &value 100
valve1:
  temperature: &temp100C
    value: *value
    unit: degC
valve2:
  temperature: *temp100C
Processing this entry in Python yields the following
structure, which preserves the references to
mutable objects (e.g., the temperature dict)
but not to scalars (e.g., the value keyword).
temperature = {"value": 100, "unit": "degC"}
document = {
    "value": 100,
    "valve1": {"temperature": temperature},
    "valve2": {"temperature": temperature},
}
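The difference between a shared reference and a by-value copy can be checked directly. The sketch below is only an illustration using the standard `json` module: it rebuilds the structure above, then shows that a JSON round trip produces two independent copies of the temperature mapping.

```python
import json

temperature = {"value": 100, "unit": "degC"}
document = {
    "value": 100,
    "valve1": {"temperature": temperature},
    "valve2": {"temperature": temperature},
}

# Both valves point at the very same dict object.
shared_before = document["valve1"]["temperature"] is document["valve2"]["temperature"]

# JSON has no notion of references: the dict is written out twice
# and read back as two separate objects with equal content.
restored = json.loads(json.dumps(document))
shared_after = restored["valve1"]["temperature"] is restored["valve2"]["temperature"]

print(shared_before, shared_after)  # prints: True False
```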
Since all these implementations pre-date this specification, some more interoperable choices include the following:
This section is non-normative.
This section offers YAML-LD users some optional advice that may nonetheless prove useful.
…in order to achieve a greater level of reusability, performance, and human friendliness among YAML-LD aware systems. The [json-ld-bp] document is as relevant to YAML-LD as it is to [JSON-LD11].
Instead, provide pre-built contexts that the user can reference by URL for a majority of common use cases.
YAML-LD is intended to simplify the authoring of Linked Data for a wide range of domain experts; its target audience does not consist solely of IT professionals. [YAML] is chosen as a medium to minimize syntactic noise, and to keep the authored documents concise and clear. A [JSON-LD11] (and hence YAML-LD) context is a special language of its own. A requirement to author such a context would make the domain expert's job much harder, which we, as system architects and developers, should try to avoid.
If most, or all, of a user's documents are based on one particular context, try to make it the default in order to rescue the user from copy-pasting the same technical incantation from one document to another.
For instance, according to [JSON-LD11-API], the expand() method of a JSON-LD processor accepts an
expandContext argument which can be used to provide a default system context.
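Where a processor's expandContext option is not at hand, a similar effect can be approximated before processing by placing the default context ahead of whatever the document declares. This is a sketch under stated assumptions: `with_default_context` and the context URL are hypothetical, and prepending to `@context` is only an approximation of what expandContext does inside a processor.

```python
def with_default_context(document, default_context):
    """Return a copy of `document` whose @context starts with `default_context`."""
    local = document.get("@context", [])
    if not isinstance(local, list):
        local = [local]
    # Later contexts override earlier ones, so the document's own
    # declarations still take precedence over the default.
    return {**document, "@context": [default_context, *local]}


DEFAULT = "https://example.org/default-context.jsonld"  # hypothetical URL

bare = {"@id": "ex:Ray", "name": {"en": "Ray"}}
print(with_default_context(bare, DEFAULT))
```

Authors can then omit the context from their documents entirely, while documents that do declare one keep their local definitions.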
If possible, map JSON-LD keywords containing the @ character to keywords that do not contain it.
The @ character is reserved in YAML, and thus requires quoting (or escaping), as in the following
example:
"@context":
  - https://prefix.cc/context
  - ex: https://example.org/
    name:
      "@id": rdfs:label
      "@container": "@language"
"@id": ex:Ray
"@type": ex:Cat
name:
  en: Ray
  ua: Промiнчик
  ru: Лучик
The need to quote these keywords has to be learnt, and introduces one more little irregularity to the document
author's life. Further, on most keyboard layouts, typing quotes will require Shift, which reduces typing speed,
albeit slightly.
In order to avoid this, the context might introduce custom mappings for the necessary keywords. For instance,
[schema-org] context redefines @id as just id — which seems to be much more convenient to type, and
no more difficult to remember.
YAML-LD users may use a JSON-LD context provided as part of this specification, henceforth known as the
convenience context, which defines a standardized mapping of every @-keyword to a $-keyword, except @context.
Consider Example 12 reformatted using the convenience context:
"@context":
  - https://prefix.cc/context
  - https://yaml-ld.dev/context
  - ex: https://example.org/
    name:
      $id: rdfs:label
      $container: $language
$id: ex:Ray
$type: ex:Cat
name:
  en: Ray
  ua: Промiнчик
  ru: Лучик
The applicability of this context depends on the domain and is left to the architect's best judgement.
See 3.1 Convenience Context for more information.
Comments in YAML-LD documents are treated as white space. This behavior is consistent with other Linked Data serializations like [TURTLE]. See Interoperability considerations of [I-D.ietf-httpapi-yaml-mediatypes] for more details.