YAML-LD

Draft Community Group Report

Latest published version:
https://www.w3.org/yaml-ld/
Latest editor's draft:
https://json-ld.github.io/yaml-ld/
Test suite:
https://json-ld.github.io/yaml-ld/tests/
Editor:
JSON-LD Community
Feedback:
GitHub json-ld/yaml-ld (pull requests, new issue, open issues)
public-linked-json@w3.org with subject line [yaml-ld] … message topic … (archives)

Abstract

In recent years, [YAML] has emerged as a more concise format to represent information that had previously been serialized as JSON, including Linked Data. This document defines how to serialize linked data in YAML. Moreover, it registers the application/ld+yaml media type.

Status of This Document

This is a preview

Do not attempt to implement this version of the specification. Do not reference this version as authoritative in any way. Instead, see https://json-ld.github.io/yaml-ld/ for the Editor's draft.

This specification was published by the JSON for Linking Data Community Group. It is not a W3C Standard nor is it on the W3C Standards Track. Please note that under the W3C Community Contributor License Agreement (CLA) there is a limited opt-out and other conditions apply. Learn more about W3C Community and Business Groups.

This document has been developed by the JSON-LD Community Group.

GitHub Issues are preferred for discussion of this specification. Alternatively, you can send comments to our mailing list. Please send them to public-linked-json@w3.org (subscribe, archives).

1. Introduction

[JSON-LD11] is a JSON-based format to serialize Linked Data. In recent years, [YAML] has emerged as a more concise format to represent information that had previously been serialized as [JSON], including API specifications, data schemas, and Linked Data.

This document defines YAML-LD as a set of conventions on top of YAML which specify how to serialize Linked Data [LINKED-DATA] as [YAML] based on JSON-LD syntax, semantics, and APIs.

Since YAML is more expressive than JSON, both in the available data types and in the document structure (see [I-D.ietf-httpapi-yaml-mediatypes]), this document identifies constraints on YAML such that any YAML-LD document can be represented in JSON-LD.

1.1 How to read this document

This section is non-normative.

To understand the basics of this specification, one must be familiar with the following:

This document is intended primarily for two main audiences, comprised of software developers and IT and non-IT professionals, as described below.

1.2 Terminology

This section is non-normative.

This document uses the following terms as defined in external specifications and defines terms specific to YAML-LD.

A YAML-LD stream is a YAML stream of YAML-LD documents.

Note: Interoperability considerations on YAML streams

For interoperability considerations on YAML streams, see the relevant section in YAML Media Type.

A YAML-LD document is any YAML document from which a conversion to [JSON] produces a valid JSON-LD document which can be interpreted as [LINKED-DATA].

The term media type is imported from [RFC6838].

The term JSON is imported from [JSON].

The term JSON document represents a serialization of a resource conforming to the [JSON] grammar.

The terms JSON-LD document and value object are imported from [JSON-LD11].

The terms internal representation and documentLoader are imported from [JSON-LD11-API].

The terms array, boolean, map, map entry, null, and string are imported from [INFRA].

The term number is imported from [ECMASCRIPT].

The terms YAML, YAML representation graph, YAML stream, YAML directive, TAG directive, YAML document, YAML sequence (either block sequence or flow sequence), YAML mapping (either block mapping or flow mapping), node, scalar, node anchor, node tags, and alias node, are imported from [YAML].

The term content negotiation is imported from [RFC9110].

The terms RDF literal, language-tagged string, datatype IRI, and language tag are imported from [RDF11-CONCEPTS].

The terms fragment and fragment identifier in this document are to be interpreted as in [URI].

The term Linked Data is imported from [LINKED-DATA].

1.3 Namespace Prefixes

This section is non-normative.

This specification makes use of the following namespace prefixes:

Prefix IRI
ex http://example.org/
i18n https://www.w3.org/ns/i18n#
rdf http://www.w3.org/1999/02/22-rdf-syntax-ns#
xsd http://www.w3.org/2001/XMLSchema#

These are used within this document as part of a compact IRI as a shorthand for the resulting IRI, such as xsd:string used to represent http://www.w3.org/2001/XMLSchema#string.

2. Conformance

As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.

The key words MAY, MUST, MUST NOT, RECOMMENDED, and SHOULD in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

A YAML-LD document complies with this specification if it follows the normative statements from this specification and can be interpreted as [JSON-LD11] after transformation into the internal representation. For convenience, normative statements for documents are often phrased as statements on the properties of the document.

A YAML-LD document complies with the YAML-LD JSON profile of this specification if it follows the normative statements from this specification and can be transformed into a JSON-LD representation, then back to a conforming YAML-LD document, without loss of semantic information.

3. Basic Concepts

This section is non-normative.

To ease writing and collaborating on [JSON-LD11] documents, it is becoming common practice to serialize them as [YAML]. This requires a registered media type, not only to enable content negotiation of linked data documents in YAML, but also to define the expected behavior of applications that process these documents, including fragment identifiers and interoperability considerations.

This is because YAML is more flexible than [JSON]:

The first goal of this specification is to allow a JSON-LD document to be processed and serialized into YAML, and then back into JSON-LD, without losing any semantic information.

This is always possible, because a YAML representation graph can always represent a tree, because JSON data types are a subset of YAML's, and because JSON encoding is UTF-8.

The subset of YAML-LD which supports serialization of JSON-LD documents is defined as the YAML-LD JSON profile of YAML-LD.

Example: The JSON-LD document below

Example 1: Basic JSON-LD document
{
  "@context": "https://json-ld.org/contexts/person.jsonld",
  "name": "Joe Hacker",
  "homepage": "https://example.org/joe.hacker/",
  "image": "https://example.org/joe.hacker/image.png"
}

This document can be serialized as YAML as follows. Note that keys starting with @ need to be enclosed in quotes (as shown in this example), because @ is a reserved character in YAML.

Example 2: Basic YAML-LD document
%YAML 1.2
---
'@context': https://json-ld.org/contexts/person.jsonld
name:     Joe Hacker
homepage: https://example.org/joe.hacker/
image:    https://example.org/joe.hacker/image.png

This document is based on YAML 1.2.2, but YAML-LD is not tied to a specific version of YAML. Implementers concerned about features related to a specific YAML version can specify it in documents using the %YAML directive (see 6. Interoperability Considerations).
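As a rough illustration of the conversion from Example 1 to Example 2, the sketch below emits JSON-compatible data as block-style YAML using only the Python standard library. The to_yaml helper is hypothetical and not part of this specification. It assumes that JSON scalars are also valid YAML 1.2 flow scalars, so every key and string is double-quoted rather than written as a plain scalar.

```python
import json


def to_yaml(value, indent=0):
    """Return block-style YAML lines for JSON-compatible data (illustrative only)."""
    pad = "  " * indent

    def scalar(item):
        # JSON scalars double as YAML flow scalars; empty containers become {} or [].
        return json.dumps(item, ensure_ascii=False)

    def is_block(item):
        return isinstance(item, (dict, list)) and len(item) > 0

    lines = []
    if is_block(value) and isinstance(value, dict):
        for key, item in value.items():
            head = f"{pad}{scalar(key)}:"  # quoting keeps keys such as "@context" safe
            if is_block(item):
                lines.append(head)
                lines.extend(to_yaml(item, indent + 1))
            else:
                lines.append(f"{head} {scalar(item)}")
    elif is_block(value):
        for item in value:
            if is_block(item):
                nested = to_yaml(item, indent + 1)
                # The first nested line shares the line that carries the dash.
                lines.append(f"{pad}- {nested[0].lstrip()}")
                lines.extend(nested[1:])
            else:
                lines.append(f"{pad}- {scalar(item)}")
    else:
        lines.append(f"{pad}{scalar(value)}")
    return lines


document = json.loads("""
{
  "@context": "https://json-ld.org/contexts/person.jsonld",
  "name": "Joe Hacker",
  "homepage": "https://example.org/joe.hacker/"
}
""")
yaml_text = "\n".join(to_yaml(document))
```

Quoting everything is less concise than Example 2, but it sidesteps the question of which plain scalars a given YAML version would reinterpret as numbers, booleans, or null.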

4. Core Requirements

4.1 Encoding

A YAML-LD document MUST be encoded in UTF-8, to ensure interoperability with [JSON]; otherwise, an invalid-encoding error has been detected and processing is aborted.
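A minimal sketch of such a check, assuming the document arrives as raw bytes: strict UTF-8 decoding rejects malformed input. The names decode_yaml_ld and InvalidEncodingError are illustrative only. Note that this is a necessary check rather than a sufficient one, since some byte sequences in other encodings also happen to be valid UTF-8.

```python
class InvalidEncodingError(ValueError):
    """Stands in for the invalid-encoding error named in the text."""


def decode_yaml_ld(raw: bytes) -> str:
    """Decode a YAML-LD byte stream, aborting on anything that is not UTF-8."""
    try:
        # Strict decoding (the default) raises on malformed byte sequences.
        return raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise InvalidEncodingError("invalid-encoding") from exc


text = decode_yaml_ld("name: Лучик".encode("utf-8"))
```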

4.2 Comments

Comments in YAML-LD documents are treated as white space. This behavior is consistent with other Linked Data serializations such as [TURTLE]. See Interoperability considerations of [I-D.ietf-httpapi-yaml-mediatypes] for more details.

4.3 Anchors and aliases

Since anchor names are a serialization detail, such anchors MUST NOT be used to convey relevant information, MAY be altered when processing the document, and MAY be dropped when interpreting the document as JSON-LD.

Editor's note

Not sure how to test that anchors are not used to convey information. As the Internal Representation does not have a way of expressing anchors, also not sure how to test for this.

A YAML-LD document MAY contain anchored nodes and alias nodes, but its representation graph MUST NOT contain cycles; otherwise, a loading-document-failed error has been detected and processing is aborted. When interpreting the document as JSON-LD, alias nodes MUST be resolved by value to their target nodes.

The YAML-LD document in the following example contains alias nodes for the {"@id": "countries:ITA"} object:

Example 3: YAML-LD with node anchors
%YAML 1.2
---
"@context":
  "@vocab": "http://schema.org/"
  "countries": "http://publication.europa.eu/resource/authority/country/"
"@graph":
- &ITA
  "@id": countries:ITA
- "@id": http://people.example/Homer
  name: Homer Simpson
  nationality: *ITA
- "@id": http://people.example/Lisa
  name: Lisa Simpson
  nationality: *ITA

While the representation graph (and eventually the in-memory representation of the data structure, e.g., a Python dictionary or a Java hashmap) will still contain references between nodes, the JSON-LD serialization will not, as shown below:

Example 4: JSON-LD resulting from YAML with node anchors
{
  "@context": {
    "@vocab": "http://schema.org/",
    "countries": "http://publication.europa.eu/resource/authority/country/"
  },
  "@graph": [
    {
      "@id": "countries:ITA"
    },
    {
      "@id": "http://people.example/Homer",
      "name": "Homer Simpson",
      "nationality": {
        "@id": "countries:ITA"
      }
    },
    {
      "@id": "http://people.example/Lisa",
      "name": "Lisa Simpson",
      "nationality": {
        "@id": "countries:ITA"
      }
    }
  ]
}
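The sketch below illustrates one possible reading of "resolved by value", under the assumption that a YAML library has already loaded the document into Python lists and dicts in which aliased nodes are shared references. The expand_aliases helper is hypothetical: it copies every shared node into an independent value and reports a cycle when a node turns out to be its own ancestor.

```python
def expand_aliases(node, _path=None):
    """Copy a loaded structure so that shared (aliased) nodes become independent values."""
    path = set() if _path is None else _path
    if not isinstance(node, (dict, list)):
        return node  # scalars are already plain values
    if id(node) in path:
        # The node is one of its own ancestors: the representation graph has a cycle.
        raise ValueError("loading-document-failed")
    path.add(id(node))
    try:
        if isinstance(node, dict):
            return {key: expand_aliases(value, path) for key, value in node.items()}
        return [expand_aliases(value, path) for value in node]
    finally:
        path.discard(id(node))


# Mirrors Example 3: the anchored node is referenced twice.
ita = {"@id": "countries:ITA"}
graph = [
    ita,
    {"@id": "http://people.example/Homer", "nationality": ita},
    {"@id": "http://people.example/Lisa", "nationality": ita},
]
expanded = expand_aliases(graph)
```

After expansion, each nationality entry is an equal but separate object, which matches the repeated objects in Example 4.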

4.4 Streams

Every YAML-LD file is a YAML-LD stream and might contain multiple YAML-LD documents, as shown in the example below.

Example 5: YAML-LD with several documents in one file
"@id": ex:Ray
"@type": ex:Cat
name:
  en: Ray
---
"@id": ex:Smoke
"@type": ex:Cat
name:
  en: Smoke

Each of the individual YAML documents in the stream is converted into a separate JSON-LD document and processed separately.

Issue 63: YAML Streams and JSON Sequences spec

The current text does not support this, and only supports a single YAML document. This is inconsistent with the processing description in D.1.1 Converting a YAML stream.

5. Security Considerations

This section is non-normative.

See Security considerations in JSON-LD 1.1. Also, see the YAML media type registration.

6. Interoperability Considerations

This section is non-normative.

For general interoperability considerations on the serialization of JSON documents in [YAML], see YAML and the Interoperability consideration of application/yaml [I-D.ietf-httpapi-yaml-mediatypes].

The YAML-LD format and the media type registration are not restricted to a specific version of YAML, but implementers that want to use YAML-LD with YAML versions other than 1.2.2 need to be aware that the considerations and analysis provided here, including interoperability and security considerations, are based on the YAML 1.2.2 specification.

A. IANA Considerations

This section has been submitted to the Internet Engineering Steering Group (IESG) for review, approval, and registration with IANA.

This section describes the information required to register the above media type according to [RFC6838].

A.1 application/ld+yaml

Type name:
application
Subtype name:
ld+yaml
Required parameters:
N/A
Optional parameters:
profile

A non-empty list of space-separated URIs identifying specific constraints or conventions that apply to a YAML-LD document according to [RFC6906]. A profile does not change the semantics of the resource representation when processed without profile knowledge, so that clients both with and without knowledge of a profiled resource can safely use the same representation. The profile parameter MAY be used by clients to express their preferences in the content negotiation process. If the profile parameter is given, a server SHOULD return a document that honors the profiles in the list which it recognizes, and MUST ignore the profiles in the list which it does not recognize. It is RECOMMENDED that profile URIs are dereferenceable and provide useful documentation at that URI. For more information and background please refer to [RFC6906].

This specification allows the use of the profile parameters listed in and additionally defines the following:

http://www.w3.org/ns/json-ld#extended
To request or specify extended YAML-LD document form.
Editor's note
This is a placeholder for specifying something like an extended YAML-LD document form making use of YAML-specific features.

When used as a media type parameter [RFC4288] in an HTTP Accept header field [RFC9110], the value of the profile parameter MUST be enclosed in quotes (") if it contains special characters such as whitespace, which is required when multiple profile URIs are combined.

When processing the "profile" media type parameter, it is important to note that its value contains one or more URIs and not IRIs. In some cases it might therefore be necessary to convert between IRIs and URIs as specified in section 3 Relationship between IRIs and URIs of [RFC3987].
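To make the quoting rule concrete, the following sketch builds an Accept header value carrying the profile parameter and filters a received parameter down to the profiles a server recognizes. The helper names are hypothetical. The sketch always quotes the value, on the assumption that a quoted string is acceptable even when no special characters are present.

```python
EXTENDED = "http://www.w3.org/ns/json-ld#extended"


def accept_header(profiles):
    """Build an Accept value for application/ld+yaml with an optional profile list."""
    if not profiles:
        return "application/ld+yaml"
    # Multiple URIs are space-separated, so the value has to be quoted.
    return 'application/ld+yaml;profile="' + " ".join(profiles) + '"'


def recognized_profiles(parameter, known):
    """Keep the profiles a server knows about and ignore the rest."""
    return [uri for uri in parameter.split() if uri in known]


header = accept_header([EXTENDED])
honored = recognized_profiles(EXTENDED + " http://example.org/unknown", {EXTENDED})
```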

Encoding considerations:
See YAML media type.
Security considerations:
See 5. Security Considerations.
Interoperability considerations:
See 6. Interoperability Considerations.
Published specification:
http://www.w3.org/TR/yaml-ld
Applications that use this media type:
Any programming environment that requires the exchange of directed graphs. Implementations of YAML-LD have been created for FIXME.
Additional information:
Magic number(s):
See application/yaml
File extension(s):
.yamlld
Macintosh file type code(s):
TEXT
Person & email address to contact for further information:
Philippe Le Hégaret <plh@w3.org>
Intended usage:
Common
Restrictions on usage:
N/A
Author(s):
Roberto Polli, Gregg Kellogg
Change controller:
W3C

A.2 Fragment identifiers

This section is non-normative.

Fragment identifiers used with application/ld+yaml are treated as in RDF syntaxes, as per RDF 1.1 Concepts and Abstract Syntax [RDF11-CONCEPTS] and do not follow the process defined for application/yaml.

Editor's note
Perhaps more on fragment identifiers from Issue 13

A.3 Examples

This section is non-normative.

Editor's note

FIXME

B. FAQ

This section is non-normative.

Editor's note

REMOVE THIS SECTION BEFORE PUBLICATION.

B.1 Why does YAML-LD not preserve comments?

Editor's note
According to [YAML], information that does not reflect into the representation graph (e.g., comments, directives, mapping keys order, anchor names, ...) must not be used to convey application level information. Moreover [JSON] (and hence [JSON-LD11]) does not support comments, and other Linked Data serialization formats that support comments (such as [TURTLE]) do not provide a means to preserve them when processing and serializing the document in other formats. The proposed behavior is thus consistent with other implementations.

While YAML-LD could define a specific predicate for comments, that is insufficient because, for example, the order of keywords is not preserved in JSON, so the comments could be displaced. This specification does not provide a means for preserving [YAML] comments after a JSON serialization.

Example 6: YAML-LD with comments
# First comment
"@context": "http://schema.org"

# Second comment
givenName: John

Transforming the above entry into a JSON-LD document results in:

Example 7: Result of parsing YAML-LD with comments to JSON-LD
{
  "@context": "http://schema.org",
  "givenName": "John"
}

B.2 Why does YAML-LD not extend the JSON-LD data model?

Editor's note
[JSON] only represents simple trees while [YAML] can support rooted, directed graphs with references and cycles.

The above structures cannot be preserved when serializing to JSON-LD and, in the case of cycles, the serialization will fail.

Programming languages such as Java and Python already support YAML representation graphs, but these implementations may behave differently. In the following example, &value references the value of the keyword value.

Example 8: YAML-LD with references
value: &value 100
valve1:
  temperature: &temp100C
    value: *value
    unit: degC
valve2:
  temperature: *temp100C

Processing this entry in Python yields the following structure, which preserves the references to mutable objects (e.g., the temperature dict) but not to scalars (e.g., the value keyword).

Example 9: Result of parsing YAML-LD with references to Python
temperature = { "value": 100, "unit": "degC" }
document = {
  "value": 100,
  "valve1": { "temperature": temperature },
  "valve2": { "temperature": temperature }
}

Since all these implementations pre-date this specification, some more interoperable choices include the following:

  • forbidding cycles in YAML-LD documents
  • considering all references in YAML-LD as static, i.e., a shorthand way to repeat specific patterns
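The second choice can be pictured with the structure from Example 9. In the sketch below, a JSON round trip is used as a simple stand-in for "static" references: the shared temperature dict becomes two equal but independent copies. This is only an illustration of the idea and not a prescribed algorithm.

```python
import json

temperature = {"value": 100, "unit": "degC"}
document = {
    "value": 100,
    "valve1": {"temperature": temperature},
    "valve2": {"temperature": temperature},
}

# As loaded, both valves point at the very same dict object.
shared = document["valve1"]["temperature"] is document["valve2"]["temperature"]

# A JSON round trip repeats the pattern instead of sharing it.
static = json.loads(json.dumps(document))
static["valve1"]["temperature"]["value"] = 90  # only valve1 changes
```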

C. Best Practices

This section is non-normative.

Here, we propose to YAML-LD users a bit of advice which, although optional, might suggest one or two useful thoughts.

Best Practice 1: Follow JSON-LD best practices

…in order to achieve a greater level of reusability, performance, and human friendliness among YAML-LD aware systems. The [json-ld-bp] document is as relevant to YAML-LD as it is to [JSON-LD11].

Best Practice 2: Do not force users to author contexts

Instead, provide pre-built contexts that the user can reference by URL for a majority of common use cases.

YAML-LD is intended to simplify the authoring of Linked Data for a wide range of domain experts; its target audience is not comprised solely of IT professionals. [YAML] is chosen as a medium to minimize syntactic noise, and to keep the authored documents concise and clear. [JSON-LD11] (and hence YAML-LD) Context comprises a special language of its own. A requirement to author such a context would make the domain expert's job much harder, which we, as system architects and developers, should try to avoid.

Best Practice 3: Use a default context

If most, or all, of a user's documents are based on one particular context, try to make it the default in order to rescue the user from copy-pasting the same technical incantation from one document to another.

For instance, according to [JSON-LD11-API], the expand() method of a JSON-LD processor accepts an expandContext argument which can be used to provide a default system context.

Best Practice 4: Alias JSON-LD keywords

If possible, map JSON-LD keywords containing the @ character to keywords that do not contain it.

The @ character is reserved in YAML, and thus requires quoting (or escaping), as in the following example:

Example 10: Example YAML-LD document without Convenience Context
"@context":
  - https://prefix.cc/context
  - ex: https://example.org/
    name:
      "@id": rdfs:label
      "@container": "@language"
"@id": ex:Ray
"@type": ex:Cat
name:
  en: Ray
  ua: Промiнчик
  ru: Лучик

The need to quote these keywords has to be learnt, and introduces one more little irregularity to the document author's life. Further, on most keyboard layouts, typing quotes will require Shift, which reduces typing speed, albeit slightly.

In order to avoid this, the context might introduce custom mappings for the necessary keywords. For instance, [schema-org] context redefines @id as just id — which seems to be much more convenient to type, and no more difficult to remember.

Best Practice 5: Use YAML-LD Convenience Context

YAML-LD users may use a JSON-LD context provided as part of this specification, henceforth known as the convenience context, which defines a standardized mapping of every @-keyword to a $-keyword, except @context.

The convenience context contains an alias to every JSON-LD keyword which the JSON-LD 1.1 specification permits aliasing — which means all the keywords except @context. The reserved @ character is replaced by $, which is not reserved and therefore does not require quoting. Consider Example 10 reformatted using the convenience context:

Example 12: Example YAML-LD document with Convenience Context
"@context":
  - https://prefix.cc/context
  - https://yaml-ld.dev/context
  - ex: https://example.org/
    name:
      $id: rdfs:label
      $container: $language
$id: ex:Ray
$type: ex:Cat
name:
  en: Ray
  ua: Промiнчик
  ru: Лучик

The applicability of this context depends on the domain and is left to the architect's best judgement.
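For illustration, a context of this kind could be generated mechanically. The sketch below builds $-aliases for a small subset of JSON-LD keywords; the subset and the dollar_aliases helper are assumptions made for this example, and the published convenience context may differ.

```python
# Illustrative subset of JSON-LD keywords, not the full list.
KEYWORDS = ["@context", "@id", "@type", "@value", "@language", "@graph", "@list", "@set"]


def dollar_aliases(keywords):
    """Map $-prefixed terms to their @-keywords, skipping @context."""
    return {"$" + keyword[1:]: keyword for keyword in keywords if keyword != "@context"}


convenience_context = {"@context": dollar_aliases(KEYWORDS)}
```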

D. Extended YAML-LD Profile

This section is non-normative.

To take better advantage of the broader expressivity of YAML, this document also defines a means of extending the JSON-LD internal representation to allow a more complete expression of native data types within YAML-LD, and allows use of the complete JSON-LD 1.1 Processing Algorithms and API [JSON-LD11-API] to manipulate extended YAML-LD documents.

A YAML-LD document complies with the YAML-LD extended profile of this specification if it follows the normative statements from this specification and can be transformed into the JSON-LD extended internal representation, then back to a conforming YAML-LD document, without loss of semantic information.

As [YAML] has well-defined representation requirements, every YAML-LD stream MUST be well-formed, and every alias node MUST refer to a previous node with a corresponding anchor; otherwise, a loading-document-failed error has been detected and processing is aborted.

The YAML-LD extended profile allows full use of anchor names and alias nodes subject to the requirements described above in this section.

If the extendedYAML API flag is true, the processing result will be in the extended internal representation.

When processing using the YAML-LD JSON profile, documents MUST NOT contain alias nodes; otherwise, a profile-error error has been detected and processing is aborted.

D.1 Conversion to the Internal Representation

YAML-LD processing is defined by converting YAML to the internal representation and using JSON-LD 1.1 Processing Algorithms and API to operate on that representation, after which the representation is converted back to YAML. Because details specific to the original YAML document structure are lost in this conversion, the ability to fully round-trip a YAML-LD document back to an equivalent representation is limited. Consequently, round-tripping in this context means preserving the semantic representation of a document, rather than a specific syntactic representation.

The conversion process represented here is compatible with the description of "Composing the Representation Graph" from the 3.1.2 Load section of [YAML]. The steps described below for converting to the internal representation operate upon that representation graph, as specified in YAML version 1.2.2.

When operating using the YAML-LD JSON profile, it is intended that the common feature provided by most YAML libraries of transforming YAML directly to JSON satisfies the requirements for parsing a YAML-LD file.

Issue 12: Convert JSON-LD to YAML-LD using standard YAML libraries UCR

As a developer, I want to be able to convert JSON-LD documents to YAML-LD by simply serializing the document using any standard YAML library, So that the resulting YAML is valid YAML-LD, resolving to the same graph as the original JSON-LD.

D.1.1 Converting a YAML stream

A YAML stream is composed of zero or more YAML documents.

Issue 63: YAML Streams and JSON Sequences spec

YAML streams may correspond more directly to JavaScript Object Notation (JSON) Text Sequences, which are not presently part of the JSON-LD representation model. The description here more closely aligns with how JSON-LD interprets HTML Scripts.

  1. Set stream content to an empty array.
  2. If the stream is empty, set stream content to an empty array.
  3. Otherwise, if the stream contains a single YAML document, set stream content to the result of D.1.2 Converting a YAML document.
  4. Otherwise: for each document in the stream:
    1. Set doc to the result of D.1.2 Converting a YAML document for document.
    2. If doc is an array, merge it to the end of stream content.
    3. Otherwise, append doc to stream content.
    Editor's note
    This step is inconsistent with other statements about processing each document separately, resulting in some other stream of JSON-LD output (i.e., something like NDJSON-LD). Also, presumably an empty stream would result in either an empty NDJSON-LD document, or an empty [JSON-LD] document.
  5. The conversion result is stream content.

Any error reported in a recursive processing step MUST result in the failure of this processing step.
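The steps above can be sketched as follows, assuming each YAML document in the stream has already been converted (per D.1.2) into a Python list or dict. The convert_stream name is illustrative.

```python
def convert_stream(documents):
    """Combine already-converted YAML documents into the stream content."""
    if not documents:
        return []  # empty stream: empty array
    if len(documents) == 1:
        return documents[0]  # single document: its conversion result as-is
    stream_content = []
    for doc in documents:
        if isinstance(doc, list):
            stream_content.extend(doc)  # arrays are merged onto the end
        else:
            stream_content.append(doc)
    return stream_content


ray = {"@id": "ex:Ray"}
smoke = {"@id": "ex:Smoke"}
merged = convert_stream([ray, [smoke]])
```

Note that a single-document stream yields whatever that document converts to, which may be a map rather than an array.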

D.1.2 Converting a YAML document

From the YAML grammar, a YAML document MAY be preceded by a Document Prefix and/or a set of directives followed by a YAML bare document, which is composed of a single node.

  1. Create an empty named nodes map which will be used to associate each alias node with the node having the corresponding node anchor.
  2. Set document content to the result of processing the node associated with the YAML bare document, using the appropriate conversion step defined in this section. If that node is neither a YAML sequence nor a YAML mapping, a loading-document-failed error has been detected and processing is aborted.
    Note
    A node may be of another type, but this is incompatible with JSON-LD, where the top-most node must be either an array or map.
  3. The conversion result is document content.

Any error reported in a recursive processing step MUST result in the failure of this processing step.

D.1.3 Converting a YAML sequence

Both block sequences and flow sequences are directly aligned with an array in the internal representation.

  1. Set sequence content to an empty array.
  2. If the sequence has a node anchor, add a reference from the anchor name to the sequence in the named nodes map.
  3. For each node n in the sequence, append the result of processing n to sequence content using the appropriate conversion step.
  4. The conversion result is sequence content.

Any error reported in a recursive processing step MUST result in the failure of this processing step.

D.1.4 Converting a YAML mapping

Both block mappings and flow mappings are directly aligned with a map in the internal representation.

  1. Set mapping content to an empty map.
  2. If the mapping has a node anchor, add a reference from the anchor name to the mapping in the named nodes map.
  3. For each entry in the mapping composed of a key/value pair:
    1. Set key and value to the result of processing entry using the appropriate conversion step.
    2. If key is not a string, a mapping-key-error error has been detected and processing MUST be aborted.
    3. Add a new entry to mapping content using key and value.
  4. The conversion result is mapping content.

Any error reported in a recursive processing step MUST result in the failure of this processing step.

D.1.5 Converting a YAML scalar

  1. If the extendedYAML flag is true, and node n has a node tag t, n is mapped as follows:
    1. If t resolves with a prefix of tag:yaml.org,2002:, the conversion result is mapped through the YAML Core Schema.
    2. Otherwise, if t resolves with a prefix of https://www.w3.org/ns/i18n#, and the suffix does not contain an underscore ("_"), the conversion result is a language-tagged string with value taken from n, and a language tag taken from the suffix of t.
      Note
      Node tags including an underscore ("_"), such as i18n:ar-eg_rtl describe a combination of language and text direction. See The i18n Namespace in [JSON-LD11].
    3. Otherwise, the conversion result is an RDF literal with value taken from n and datatype IRI taken from t.
  2. Otherwise, the conversion result is mapped through the YAML Core Schema.
Note

Implementations may retain the representation as a YAML Integer, or YAML Floating Point, but a JSON-LD processor must treat them uniformly as a number, although the specific type of number value SHOULD be retained for round-tripping.
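The scalar steps can be sketched as below. The Literal class is a hypothetical stand-in for an RDF literal in the extended internal representation, and core_schema is a deliberately incomplete approximation of the YAML Core Schema (decimal integers, floats, booleans, and null only). The sketch also assumes that text is the content of a plain, unquoted scalar.

```python
import re
from dataclasses import dataclass
from typing import Optional

YAML_PREFIX = "tag:yaml.org,2002:"
I18N_PREFIX = "https://www.w3.org/ns/i18n#"
FLOAT = r"[-+]?(\.[0-9]+|[0-9]+(\.[0-9]*)?)([eE][-+]?[0-9]+)?"


@dataclass(frozen=True)
class Literal:
    """Hypothetical stand-in for an RDF literal."""
    value: str
    datatype: Optional[str] = None
    language: Optional[str] = None


def core_schema(text, tag=None):
    """Simplified Core Schema mapping; an explicit yaml.org tag overrides detection."""
    kind = tag[len(YAML_PREFIX):] if tag else None
    if kind == "str":
        return text
    if kind == "null" or (kind is None and text in ("", "~", "null", "Null", "NULL")):
        return None
    if kind == "bool" or (kind is None and text.lower() in ("true", "false")
                          and text in ("true", "True", "TRUE", "false", "False", "FALSE")):
        return text.lower() == "true"
    if kind == "int" or (kind is None and re.fullmatch(r"[-+]?[0-9]+", text)):
        return int(text)
    if kind == "float" or (kind is None and re.fullmatch(FLOAT, text)):
        return float(text)
    return text


def convert_scalar(text, tag=None, extended=False):
    """Follow the steps above for a scalar with an optional resolved node tag."""
    if extended and tag:
        if tag.startswith(YAML_PREFIX):
            return core_schema(text, tag)
        suffix = tag[len(I18N_PREFIX):]
        if tag.startswith(I18N_PREFIX) and "_" not in suffix:
            return Literal(text, language=suffix)
        return Literal(text, datatype=tag)
    return core_schema(text)
```

For example, convert_scalar("Ray", "https://www.w3.org/ns/i18n#en", extended=True) produces a language-tagged literal, while the same call with extended=False ignores the tag and returns the plain string.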

D.1.6 Converting a YAML alias node

The conversion result is the value of the entry in the named nodes map whose key is the anchor name referenced by the alias node. If no such entry exists, the document is invalid, and processing MUST end in failure.

If an alias node is encountered when processing the YAML representation graph and the extendedYAML flag is false, the YAML-LD JSON profile has been selected. A profile-error error has been detected and processing MUST be aborted.

If a cycle is detected, a processing error MUST be returned, and processing aborted.
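The conversion steps in D.1.2 through D.1.6 can be tied together in a small sketch over a toy node model. The Node class, YamlLdError, and the function names are hypothetical and do not correspond to any particular YAML library. Two assumptions go beyond the text: scalar values arrive already resolved, and anchors on scalars are registered in the named nodes map so that aliases to scalars resolve. Cycles are reported with loading-document-failed, following 4.3 Anchors and aliases.

```python
from dataclasses import dataclass
from typing import Optional


class YamlLdError(Exception):
    """Carries an error code named in the text, such as profile-error."""


@dataclass
class Node:
    kind: str                     # "scalar", "sequence", "mapping" or "alias"
    value: object = None          # scalar value, child list, (key, value) pairs, or anchor name
    anchor: Optional[str] = None


def convert_document(root, extended=False):
    # The top-most node must map to an array or a map.
    if root.kind not in ("sequence", "mapping"):
        raise YamlLdError("loading-document-failed")
    return convert(root, {}, set(), extended)


def convert(node, named, open_anchors, extended):
    if node.kind == "alias":
        if not extended:
            raise YamlLdError("profile-error")  # JSON profile forbids alias nodes
        if node.value in open_anchors or node.value not in named:
            # Either a cycle (target still being built) or an unknown anchor.
            raise YamlLdError("loading-document-failed")
        return named[node.value]
    if node.kind == "scalar":
        if node.anchor:
            named[node.anchor] = node.value  # assumption: scalars may be anchored too
        return node.value
    content = [] if node.kind == "sequence" else {}
    if node.anchor:
        named[node.anchor] = content
        open_anchors.add(node.anchor)
    if node.kind == "sequence":
        for child in node.value:
            content.append(convert(child, named, open_anchors, extended))
    else:
        for key_node, value_node in node.value:
            key = convert(key_node, named, open_anchors, extended)
            if not isinstance(key, str):
                raise YamlLdError("mapping-key-error")
            content[key] = convert(value_node, named, open_anchors, extended)
    open_anchors.discard(node.anchor)
    return content


def scalar(value):
    return Node("scalar", value)


# Toy model of Example 8's valve1/valve2 structure.
temp = Node("mapping", [(scalar("value"), scalar(100)), (scalar("unit"), scalar("degC"))],
            anchor="temp100C")
root = Node("mapping", [
    (scalar("valve1"), Node("mapping", [(scalar("temperature"), temp)])),
    (scalar("valve2"), Node("mapping", [(scalar("temperature"), Node("alias", "temp100C"))])),
])
result = convert_document(root, extended=True)
```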

D.2 Conversion to YAML

The conversion process from the internal representation involves turning that representation back into a YAML representation graph and relies on the description of "Serializing the Representation Graph" from the 3.1.1 Dump section of [YAML] for the final serialization.

As the internal representation is rooted by either an array or a map, the process of transforming the internal representation to YAML begins by preparing an empty representation graph which will be rooted with either a YAML mapping or YAML sequence.

Although outside of the scope of this specification, processors MAY use YAML directives, including TAG directives, and Document markers, as appropriate for best results. Specifically, if the extendedYAML API flag is true, the document SHOULD use the %YAML directive with version set to at least 1.2. To improve readability and reduce document size, the document MAY use a %TAG directive appropriate for RDF literals contained within the representation.

Note

The use of %TAG directives in YAML-LD is similar to the use of the PREFIX directive in [TURTLE] or the general use of terms as prefixes to create Compact IRIs in [JSON-LD11]: they do not change the meaning of the encoded scalars.

Example 14: Serialized representation of the extended internal representation
%YAML 1.2
%TAG !xsd! http://www.w3.org/2001/XMLSchema%23
---
"@context":
  "@vocab": http://xmlns.com/foaf/0.1/
name: !xsd!string Gregg Kellogg
homepage: https://greggkellogg.net/
depiction: http://www.gravatar.com/avatar/42f948...
date: !xsd!date "2022-08-08"
Note

Although allowed within the YAML Grammar, some current YAML parsers do not allow the use of "#" within a tag URI. Substituting the "%23" escape is a workaround for this problem, that will hopefully become unnecessary as implementations are updated.
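As a rough picture of how the tag shorthands in Example 14 expand, the sketch below joins a %TAG prefix with the shorthand suffix and percent-decodes the result. It assumes that the parser decodes URI escapes in resolved tags, which is common but should be verified for the library in use. The resolve_tag helper is hypothetical.

```python
from urllib.parse import unquote


def resolve_tag(shorthand, handles):
    """Expand a tag shorthand such as !xsd!date using %TAG handle prefixes."""
    # Try longer handles first so "!xsd!" wins over a bare "!" handle.
    for handle in sorted(handles, key=len, reverse=True):
        if shorthand.startswith(handle):
            return unquote(handles[handle] + shorthand[len(handle):])
    return shorthand


handles = {"!xsd!": "http://www.w3.org/2001/XMLSchema%23"}
date_tag = resolve_tag("!xsd!date", handles)
```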

Issue 6: Use tags to distinguish "plain" YAML-LD from "idiomatic" YAML-LD UCR spec

A concrete proposal in that direction would be to use a tag at the top-level of any "idiomatic" YAML-LD document, applying to the whole object/array that makes the document.

It might also include a version to identify the specification that it relates to, allowing for version announcement that could be used for future-proofing.

The following block is one example:

!yaml-ld
$context: http://schema.org/
$type: Person
name: Pierre-Antoine Champin

See Example 14 for an example of serializing the extended internal representation.

D.2.1 Converting From the Internal Representation

This algorithm describes the steps to convert each element from the internal representation into corresponding YAML nodes by recursively processing each element n.

  1. If n is an array, the conversion result is a YAML sequence with child nodes of the sequence taken by converting each value of n using this algorithm.
  2. Otherwise, if n is a map, the conversion result is a YAML mapping with keys and values taken by converting each key/value pair of n using this algorithm.
  3. Otherwise, if n is an RDF literal:
    1. If the datatype IRI of n is xsd:string, the conversion result is a YAML scalar with the value taken from the value of n.
    2. Otherwise, if n is a language-tagged string, the conversion result is a YAML scalar with the value taken from the value of n and a node tag constructed by appending the language tag of n to https://www.w3.org/ns/i18n#.
    3. Otherwise, the conversion result is a YAML scalar with the value taken from the value of n and a node tag taken from the datatype IRI of n.
  4. Otherwise, if n is a number, the conversion result is a YAML scalar with the value taken from n.
  5. Otherwise, if n is a boolean, the conversion result is a YAML scalar with the value either true or false based on the value of n.
  6. Otherwise, if n is null, the conversion result is a YAML scalar with the value null.
  7. Otherwise, the conversion result is a YAML scalar with the value taken from n.
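The steps above can be sketched in Python as follows. This is a minimal, non-normative illustration: the RDFLiteral, Scalar, Sequence, and Mapping classes are hypothetical stand-ins for the extended internal representation and for YAML nodes, and they are not part of any YAML library. Scalar values are kept as plain text here, which is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
I18N_NS = "https://www.w3.org/ns/i18n#"


@dataclass
class RDFLiteral:
    """Hypothetical model of an RDF literal in the extended internal representation."""
    value: str
    datatype: str
    language: Optional[str] = None


@dataclass
class Scalar:
    value: str
    tag: Optional[str] = None  # node tag, if any


@dataclass
class Sequence:
    items: List[Any]


@dataclass
class Mapping:
    pairs: List[Tuple[Any, Any]]  # (key node, value node) pairs


def to_yaml_node(n: Any) -> Any:
    if isinstance(n, list):                      # step 1: array
        return Sequence([to_yaml_node(v) for v in n])
    if isinstance(n, dict):                      # step 2: map
        return Mapping([(to_yaml_node(k), to_yaml_node(v)) for k, v in n.items()])
    if isinstance(n, RDFLiteral):                # step 3: RDF literal
        if n.datatype == XSD_STRING:             # 3.1: plain string
            return Scalar(n.value)
        if n.language is not None:               # 3.2: language-tagged string
            return Scalar(n.value, I18N_NS + n.language)
        return Scalar(n.value, n.datatype)       # 3.3: any other datatype
    # In Python, bool is a subclass of int, so the boolean test (step 5)
    # has to run before the number test (step 4).
    if isinstance(n, bool):
        return Scalar("true" if n else "false")
    if isinstance(n, (int, float)):              # step 4: number
        return Scalar(str(n))
    if n is None:                                # step 6: null
        return Scalar("null")
    return Scalar(n)                             # step 7: string


node = to_yaml_node({
    "name": RDFLiteral("Gregg Kellogg", XSD_STRING),
    "date": RDFLiteral("2022-08-08", "http://www.w3.org/2001/XMLSchema#date"),
    "values": [1, True, None],
})
```

The resulting node tree mirrors Example 14: the name becomes an untagged scalar, while the date carries its datatype IRI as a node tag.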

D.3 Application Profiles

This section identifies two application profiles for operating with YAML-LD: the YAML-LD JSON profile and the YAML-LD extended profile.

Application profiles allow publishers to use YAML-LD either for maximum interoperability, or for maximum expressivity. The YAML-LD JSON profile provides for complete round-tripping between YAML-LD documents and JSON-LD documents. The YAML-LD extended profile allows for fuller use of YAML features to enhance the ability to represent a larger number of native datatypes and reduce document redundancy.

Application profiles can be set using the JsonLdProcessor API interface, as well as an HTTP request profile (see A. IANA Considerations).

D.3.1 YAML-LD JSON Profile

The YAML-LD JSON profile is based on the YAML Core Schema, which interprets only a limited set of node tags. YAML scalars with node tags outside of the defined range SHOULD be avoided and MUST be converted to the closest scalar type from the YAML Core Schema, if found. See D.1.5 Converting a YAML scalar for specifics.

Although YAML supports several additional encodings, YAML-LD documents in the YAML-LD JSON Profile MUST NOT use encodings other than UTF-8.
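As a rough sketch of how a processor might enforce the UTF-8 requirement, the Python function below rejects input that starts with a UTF-16 or UTF-32 byte order mark and input that is not valid UTF-8. The function and exception names are hypothetical; the "invalid-encoding" code is the one listed in D.4.4 YamlLdErrorCode. The check is deliberately simple and does not detect UTF-16 or UTF-32 input that lacks a byte order mark.

```python
import codecs


class InvalidEncodingError(ValueError):
    """Hypothetical exception carrying the invalid-encoding error code."""
    code = "invalid-encoding"


# Byte order marks of the encodings that YAML allows but YAML-LD does not.
_NON_UTF8_BOMS = (
    codecs.BOM_UTF32_LE,
    codecs.BOM_UTF32_BE,
    codecs.BOM_UTF16_LE,
    codecs.BOM_UTF16_BE,
)


def decode_yaml_ld(data: bytes) -> str:
    """Decode a YAML-LD byte stream, accepting only UTF-8."""
    if data.startswith(_NON_UTF8_BOMS):
        raise InvalidEncodingError("YAML-LD documents must be encoded as UTF-8")
    try:
        # 'utf-8-sig' also strips an optional UTF-8 byte order mark.
        return data.decode("utf-8-sig")
    except UnicodeDecodeError as exc:
        raise InvalidEncodingError(str(exc)) from exc


text = decode_yaml_ld("name: Zoë".encode("utf-8"))
```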

Keys used in a YAML mapping MUST be strings.

Although YAML-LD documents MAY include node anchors, documents MUST NOT use alias nodes.

A YAML stream MUST include only a single YAML document, as the JSON-LD internal representation only supports a single document model.

D.3.2 YAML-LD Extended Profile

The YAML-LD extended profile extends the YAML Core Schema, allowing node tags to specify RDF literals by using a JSON-LD extended internal representation capable of directly representing RDF literals.

As with the YAML-LD JSON profile, YAML-LD documents in the YAML-LD extended profile MUST NOT use encodings other than UTF-8.

As with the YAML-LD JSON profile, keys used in a YAML mapping MUST be strings.

YAML-LD documents MAY use alias nodes, as long as dereferencing these aliases does not result in a loop.

As with the YAML-LD JSON profile, a YAML stream MUST include only a single YAML document, as the JSON-LD extended internal representation only supports a single document model.
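To illustrate two of these constraints, the following Python sketch walks an already loaded document, modeled as nested dict and list objects, and checks that mapping keys are strings and that shared (aliased) structures do not form a loop. It assumes that the YAML loader represents an alias as a second reference to the same Python object; the function name is hypothetical and the use of ValueError is a simplification.

```python
from typing import Any, Optional, Set


def check_extended_profile(value: Any, _path: Optional[Set[int]] = None) -> None:
    """Raise ValueError on non-string mapping keys or on alias loops."""
    # _path holds the ids of the containers between the root and this node.
    path = set() if _path is None else _path
    if not isinstance(value, (dict, list)):
        return
    if id(value) in path:
        raise ValueError("dereferencing an alias results in a loop")
    path.add(id(value))
    if isinstance(value, dict):
        for key, child in value.items():
            if not isinstance(key, str):
                raise ValueError(f"mapping-key-error: key {key!r} is not a string")
            check_extended_profile(child, path)
    else:
        for child in value:
            check_extended_profile(child, path)
    # Leaving this container: a later reference to it is sharing, not a loop.
    path.remove(id(value))


# An anchored node that is reused twice is acceptable.
person = {"name": "Alice"}
check_extended_profile({"author": person, "editor": person})
```

Tracking only the containers on the current path, rather than every container seen so far, is what lets the sketch accept repeated aliases while still rejecting a structure that contains itself.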

Issue 79: YAML-LD IRI tags

Consider something like !id as a local tag to denote IRIs.

D.3.2.1 The JSON-LD Extended Internal Representation

This specification defines the JSON-LD extended internal representation, an extension of the JSON-LD internal representation.

In addition to maps, arrays, and strings, the internal representation allows native representation of numbers, boolean values, and nulls. The extended internal representation allows for native representation of RDF literals, both with a datatype IRI, and language-tagged strings.

When transforming from the extended internal representation to the internal representation — for example when serializing to JSON or to the YAML-LD JSON profile — implementations MUST transform RDF literals to the closest native representation of the internal representation:

Editor's note

An alternative would be to transform such literals to JSON-LD value objects, and we may want to provide a means of transforming between the internal representation and extended internal representation using value objects, but this treatment is consistent with [YAML] Core Schema Tag Resolution.

D.4 The Application Programming Interface

This specification extends the JSON-LD 1.1 Processing Algorithms and API [JSON-LD11-API] Application Programming Interface and the JSON-LD 1.1 Framing [JSON-LD11-FRAMING] Application Programming Interface to manage the serialization and deserialization of [YAML] and to enable an option for setting the YAML-LD extended profile.

D.4.1 JsonLdProcessor

The JSON-LD Processor interface is the high-level programming structure that developers use to access the JSON-LD transformation methods. The updates below are an experimental extension of the JsonLdProcessor interface defined in the JSON-LD 1.1 API [JSON-LD11-API] to serialize output as YAML rather than JSON.

compact()
Updates step 10 of the compact() algorithm to serialize the result as YAML rather than JSON as defined in D.2 Conversion to YAML.
expand()
Updates step 9 of the expand() algorithm to serialize the result as YAML rather than JSON as defined in D.2 Conversion to YAML.
flatten()
Updates step 7 of the flatten() algorithm to serialize the result as YAML rather than JSON as defined in D.2 Conversion to YAML.
frame()
Updates step 22 of the frame() algorithm to serialize the result as YAML rather than JSON as defined in D.2 Conversion to YAML.
fromRdf()
Updates step 3 of the fromRdf() algorithm to serialize the result as YAML rather than JSON as defined in D.2 Conversion to YAML.
Updates the RDF to Object Conversion algorithm before step 2.6 as follows:
Otherwise, if both the useNativeTypes and extendedYAML flags are set and the datatype IRI of value is not xsd:string:
  1. If value is a language-tagged string, set converted value to a new RDF literal composed of the lexical form of value and a datatype IRI composed of https://www.w3.org/ns/i18n# followed by the language tag of value.
  2. Otherwise, set converted value to value.
toRdf()
Updates the Object to RDF Conversion algorithm before step 10 as follows:
  1. Otherwise, if value is an RDF literal, value is left unmodified. This will only be the case when processing a value from an extended internal representation.
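The step added to the RDF to Object Conversion algorithm for fromRdf() can be sketched in Python as follows. The RDFLiteral class and the convert_literal function are hypothetical names introduced for illustration; returning None stands for "this step does not apply", in which case the remaining steps of the original algorithm would run.

```python
from dataclasses import dataclass
from typing import Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
RDF_LANGSTRING = "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString"
I18N_NS = "https://www.w3.org/ns/i18n#"


@dataclass(frozen=True)
class RDFLiteral:
    """Hypothetical model of an RDF literal."""
    lexical_form: str
    datatype: str
    language: Optional[str] = None


def convert_literal(value: RDFLiteral, use_native_types: bool,
                    extended_yaml: bool) -> Optional[RDFLiteral]:
    """Return the converted value, or None when the added step does not apply."""
    if not (use_native_types and extended_yaml) or value.datatype == XSD_STRING:
        return None
    if value.language is not None:
        # Language-tagged string: fold the language tag into an i18n datatype IRI.
        return RDFLiteral(value.lexical_form, I18N_NS + value.language)
    # Any other datatype: keep the RDF literal unchanged.
    return value


french = convert_literal(RDFLiteral("chat", RDF_LANGSTRING, "fr"), True, True)
```

Here french is an RDF literal with the lexical form "chat" and the datatype IRI https://www.w3.org/ns/i18n#fr.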

D.4.2 JsonLdOptions

The JsonLdOptions type is used to pass various options to the JsonLdProcessor methods.

WebIDL
partial dictionary JsonLdOptions {
  boolean extendedYAML = false;
};

In addition to those options defined in the JSON-LD 1.1 API [JSON-LD11-API] and JSON-LD 1.1 Framing [JSON-LD11-FRAMING], this specification defines these additional options:

extendedYAML
When used for serializing the internal representation (or extended internal representation) into a YAML representation graph:
When used for the documentLoader, it causes documents of type application/ld+yaml to be parsed into a YAML representation graph and generates an internal representation (or extended internal representation):

D.4.3 Remote Document and Context Retrieval

This section describes an update to the built-in LoadDocumentCallback to load YAML streams and documents into the internal representation, or into the extended internal representation if the extendedYAML API flag is true.

The LoadDocumentCallback algorithm in [JSON-LD11-API] is updated as follows:

Note

These updates are intended to be compatible with other updates to the LoadDocumentCallback, such as Process HTML as defined in [JSON-LD11-API].

D.4.4 YamlLdErrorCode

The YamlLdErrorCode represents the collection of valid YAML-LD error codes, which extends the JsonLdErrorCode definitions.

WebIDL
enum YamlLdErrorCode {
  "invalid-encoding",
  "mapping-key-error",
  "profile-error"
};
invalid-encoding
The character encoding of an input is invalid.
mapping-key-error
A YAML mapping key was found that was not a string.
profile-error
The parsed YAML document contains features incompatible with the specified profile.
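
For implementations written in Python, one possible mapping of these error codes is an enumeration paired with an exception type. This is only a sketch: the YamlLdError class and its constructor are hypothetical and are not defined by this specification.

```python
from enum import Enum


class YamlLdErrorCode(str, Enum):
    """The three error codes defined above, keyed by Python-style names."""
    INVALID_ENCODING = "invalid-encoding"
    MAPPING_KEY_ERROR = "mapping-key-error"
    PROFILE_ERROR = "profile-error"


class YamlLdError(Exception):
    """Hypothetical exception that carries a YamlLdErrorCode."""

    def __init__(self, code: YamlLdErrorCode, message: str = "") -> None:
        super().__init__(f"{code.value}: {message}" if message else code.value)
        self.code = code


error = YamlLdError(YamlLdErrorCode.MAPPING_KEY_ERROR, "key 1 is not a string")
```

Looking a code up by its string value, for example YamlLdErrorCode("profile-error"), returns the matching enumeration member.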

E. References

E.1 Normative references

[I-D.ietf-httpapi-yaml-mediatypes]
YAML Media Type. Roberto Polli; Erik Wilde; Eemeli Aro. IETF. 2022-08-05. WG Document. URL: https://www.ietf.org/archive/id/draft-ietf-httpapi-yaml-mediatypes-03.html
[JSON]
The JavaScript Object Notation (JSON) Data Interchange Format. T. Bray, Ed.. IETF. December 2017. Internet Standard. URL: https://www.rfc-editor.org/rfc/rfc8259
[JSON-LD11]
JSON-LD 1.1. Gregg Kellogg; Pierre-Antoine Champin; Dave Longley. W3C. 16 July 2020. W3C Recommendation. URL: https://www.w3.org/TR/json-ld11/
[JSON-LD11-API]
JSON-LD 1.1 Processing Algorithms and API. Gregg Kellogg; Dave Longley; Pierre-Antoine Champin. W3C. 16 July 2020. W3C Recommendation. URL: https://www.w3.org/TR/json-ld11-api/
[LINKED-DATA]
Linked Data Design Issues. Tim Berners-Lee. W3C. 27 July 2006. W3C-Internal Document. URL: https://www.w3.org/DesignIssues/LinkedData.html
[RFC2119]
Key words for use in RFCs to Indicate Requirement Levels. S. Bradner. IETF. March 1997. Best Current Practice. URL: https://www.rfc-editor.org/rfc/rfc2119
[RFC3986]
Uniform Resource Identifier (URI): Generic Syntax. T. Berners-Lee; R. Fielding; L. Masinter. IETF. January 2005. Internet Standard. URL: https://www.rfc-editor.org/rfc/rfc3986
[RFC3987]
Internationalized Resource Identifiers (IRIs). M. Duerst; M. Suignard. IETF. January 2005. Proposed Standard. URL: https://www.rfc-editor.org/rfc/rfc3987
[RFC4288]
Media Type Specifications and Registration Procedures. N. Freed; J. Klensin. IETF. December 2005. Best Current Practice. URL: https://www.rfc-editor.org/rfc/rfc4288
[RFC6838]
Media Type Specifications and Registration Procedures. N. Freed; J. Klensin; T. Hansen. IETF. January 2013. Best Current Practice. URL: https://www.rfc-editor.org/rfc/rfc6838
[RFC6906]
The 'profile' Link Relation Type. E. Wilde. IETF. March 2013. Informational. URL: https://www.rfc-editor.org/rfc/rfc6906
[RFC8174]
Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words. B. Leiba. IETF. May 2017. Best Current Practice. URL: https://www.rfc-editor.org/rfc/rfc8174
[RFC9110]
HTTP Semantics. R. Fielding, Ed.; M. Nottingham, Ed.; J. Reschke, Ed.. IETF. June 2022. Internet Standard. URL: https://www.rfc-editor.org/rfc/rfc9110
[YAML]
YAML Ain’t Markup Language (YAML™) version 1.2.2. Oren Ben-Kiki; Clark Evans; Ingy döt Net. 2021-10-01. URL: https://yaml.org/spec/1.2.2/

E.2 Informative references

[ECMASCRIPT]
ECMAScript Language Specification. Ecma International. URL: https://tc39.es/ecma262/multipage/
[INFRA]
Infra Standard. Anne van Kesteren; Domenic Denicola. WHATWG. Living Standard. URL: https://infra.spec.whatwg.org/
[JSON-LD]
JSON-LD 1.0. Manu Sporny; Gregg Kellogg; Markus Lanthaler. W3C. 3 November 2020. W3C Recommendation. URL: https://www.w3.org/TR/json-ld/
[json-ld-bp]
JSON-LD Best Practices. Gregg Kellogg; Ivan Herman; BigBlueHat; A. Soroka; Ruben Taelman; David I. Lehn; Philippe Le Hegaret. W3C. 2022-05-24. W3C Group Note. URL: https://w3c.github.io/json-ld-bp/
[JSON-LD11-FRAMING]
JSON-LD 1.1 Framing. Dave Longley; Gregg Kellogg; Pierre-Antoine Champin. W3C. 16 July 2020. W3C Recommendation. URL: https://www.w3.org/TR/json-ld11-framing/
[RDF11-CONCEPTS]
RDF 1.1 Concepts and Abstract Syntax. Richard Cyganiak; David Wood; Markus Lanthaler. W3C. 25 February 2014. W3C Recommendation. URL: https://www.w3.org/TR/rdf11-concepts/
[rfc2045]
Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies. N. Freed; N. Borenstein. IETF. November 1996. Draft Standard. URL: https://www.rfc-editor.org/rfc/rfc2045
[RFC6839]
Additional Media Type Structured Syntax Suffixes. T. Hansen; A. Melnikov. IETF. January 2013. Informational. URL: https://www.rfc-editor.org/rfc/rfc6839
[RFC7464]
JavaScript Object Notation (JSON) Text Sequences. N. Williams. IETF. February 2015. Proposed Standard. URL: https://www.rfc-editor.org/rfc/rfc7464
[schema-org]
Schema.org. W3C Schema.org Community Group. W3C. 6.0. URL: https://schema.org/
[TURTLE]
RDF 1.1 Turtle. Eric Prud'hommeaux; Gavin Carothers. W3C. 25 February 2014. W3C Recommendation. URL: https://www.w3.org/TR/turtle/
[URI]
Uniform Resource Identifier (URI): Generic Syntax. T. Berners-Lee; R. Fielding; L. Masinter. IETF. January 2005. Internet Standard. URL: https://www.rfc-editor.org/rfc/rfc3986
[xmlschema11-2]
W3C XML Schema Definition Language (XSD) 1.1 Part 2: Datatypes. David Peterson; Sandy Gao; Ashok Malhotra; Michael Sperberg-McQueen; Henry Thompson; Paul V. Biron et al. W3C. 5 April 2012. W3C Recommendation. URL: https://www.w3.org/TR/xmlschema11-2/
[YAML-LD-PRIMER]
YAML-LD Primer. JSON-LD Community Group. 2023-04-01. URL: https://github.com/json-ld/yaml-ld-primer/