Publication Manifest

W3C Editor's Draft

This version:
https://w3c.github.io/pub-manifest/
Latest published version:
https://www.w3.org/TR/pub-manifest/
Latest editor's draft:
https://w3c.github.io/pub-manifest/
Editors:
Matt Garrish (DAISY Consortium)
Ivan Herman (W3C)
Participate:
GitHub w3c/pub-manifest
File a bug
Commit history
Pull requests

Abstract

This specification defines a general manifest format for expressing information about a digital publication. It uses [schema.org] metadata augmented to include various structural properties about publications, serialized in [json-ld11], to enable interoperability between publishing formats while accommodating variances in the information that needs to be expressed.

Status of This Document

This is a preview

Do not attempt to implement this version of the specification. Do not reference this version as authoritative in any way. Instead, see https://w3c.github.io/pub-manifest/ for the Editor's draft.

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.

This document was published by the Publishing Working Group as an Editor's Draft.

GitHub Issues are preferred for discussion of this specification. Alternatively, you can send comments to our mailing list. Please send them to public-publ-wg@w3.org (archives).

Publication as an Editor's Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

This document is governed by the 1 March 2019 W3C Process Document.

1. Introduction

1.1 Scope

This specification defines a general manifest format to describe publications. It does not attempt to constrain the nature of the publications that use the manifest. Rather, it is designed to be adaptable to the needs of specific areas of publishing, such as audiobook production, by specifying a modular approach for creating specializations.

This specification is also intended to facilitate different user agent architectures. While it is expected that traditional Web user agents (browsers) will be able to consume a publication manifest, this should not limit the capabilities of any other possible type of user agent (e.g., applications, whether standalone or running within a user agent, or even publications that include their own user interface).

This specification does not define how user agents are expected to render publications that use the manifest format.

1.2 Terminology

Bounds

A digital publication consists of a finite set of resources that represent its content. This extent is known as its bounds and is defined within its manifest — it is obtained from the union of resources listed in the default reading order and resource list.
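The union described above can be sketched in a few lines of Python. This is only an illustration, not part of the manifest format: the function name `publication_bounds` is hypothetical, and the sketch assumes that `readingOrder` and `resources` are already arrays whose entries are either URL strings or objects with a `url` member. URLs are compared as authored, without resolving them against a base URL.

```python
def publication_bounds(manifest: dict) -> set:
    """Return the set of resource URLs listed in readingOrder and resources."""
    bounds = set()
    for key in ("readingOrder", "resources"):
        for item in manifest.get(key, []):
            # Assumption: each entry is a URL string or a dict with a "url" key.
            url = item if isinstance(item, str) else item["url"]
            bounds.add(url)
    return bounds


# Hypothetical manifest fragment used only for illustration.
manifest = {
    "readingOrder": ["cover.html", {"url": "chapter1.html"}],
    "resources": ["style.css", {"url": "chapter1.html"}],
}
print(sorted(publication_bounds(manifest)))
```

A resource listed in both places is counted once, because the bounds form a set.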

Digital Publication

The term digital publication refers to a publication authored in a format that uses a profile of the manifest.

Internal Representation

The internal representation of a manifest is the data structure that user agents create when they process the manifest, removing all possible ambiguities and incorporating any missing values that can be inferred from another source.

It is possible for the information expressed in the manifest to be the equivalent of the internal representation created by user agents if there are no ambiguities or missing information.

Manifest

A manifest represents structured information about a publication, such as informative metadata, a list of resources, and a default reading order.

Non-empty

For the purposes of this specification, non-empty is used to refer to an element, attribute or property whose text content or value consists of one or more characters after whitespace normalization, where whitespace normalization rules are defined per the host format.

Profile

Profiles are publication formats (e.g., audiobooks) that use the manifest format defined in this specification to describe their bounds and content. These formats can extend the core definition in this specification with profile-specific terms and/or new requirements.

Although profiles can differ in their structural and content requirements, such variances are restricted to maintain a high degree of predictability between formats. (See § 5. Modular Extensions.)

1.3 Conformance

As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.

The key words MAY, MUST, MUST NOT, OPTIONAL, RECOMMENDED, REQUIRED, SHOULD, and SHOULD NOT in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

2. Publication Manifest

2.1 Introduction

2.1.1 Manifest Format

This section is non-normative.

A digital publication is described by its manifest, which provides a set of properties expressed using a specific shape of JSON-LD [json-ld11] (a variant of JSON [ecma-404] for linked data).

The manifest is what enables user agents to understand the bounds of a digital publication and the connection between its resources. It includes metadata that describes the digital publication, as a publication has an identity and nature beyond its constituent resources. The manifest also provides a list of all the resources that belong to the digital publication and a default reading order, which is how it connects resources into a single contiguous work.

The properties of the manifest describe the basic information a user agent requires to process and render a publication. For ease of understanding, these properties are categorized as follows:

Descriptive properties

Descriptive properties describe aspects of a digital publication, such as its title, creator, and language.

Resource categorization properties

Resource categorization properties describe or identify common sets of resources, such as the resource list and default reading order. These properties refer to one or more resources, such as HTML documents, images, scripts, and metadata records.

The manifest also identifies key resources of a digital publication through the use of link relations. These relations are applied to the rel property of LinkedResource objects (e.g., the links found in the table of contents and resource list).

The types of resources these relations identify are categorized as follows:

Informative resources

Informative resources are resources that contain additional information about the publication, such as its privacy policy, accessibility report, or preview.

Structural resources

Structural resources are key meta structures of the publication, such as the cover image, table of contents, and page list.

Note

These categorizations have no relevance outside this specification (i.e., the properties are not actually grouped together in the manifest).

2.1.2 JSON-LD Authoring and Processing

This specification defines the publication manifest as a specific "shape" of [json-ld11]. This means that the manifest SHOULD be expressed using only the syntactic constructions defined in this specification, as opposed to all the possibilities offered by the JSON-LD syntax.

Note

This shape is also defined, informally, through a JSON schema [json-schema] that expresses the constraints defined in this specification. This schema is maintained at https://github.com/w3c/pub-manifest/blob/master/schema/publication.schema.json.

The publication manifest also has a number of authoring flexibilities and compact authoring expressions. For example, it is not always required that object types be explicitly authored, as these are automatically generated during processing when missing (see § 2.4.4 Explicit and Implied Objects for more information). The internal representation of the manifest data is defined separately; see § 2.3 Web IDL for further details.

As a consequence, a user agent does not have to be a full JSON-LD processor. User agents only need to be able to read the manifest's specific shape and internalize the data.

2.1.3 Relationship to Schema.org

Manifest properties, in particular those categorized as descriptive properties, are primarily drawn from schema.org and its hosted extensions [schema.org]. As a consequence, these properties inherit their syntax and semantics from schema.org, making manifest authoring compatible with schema.org authoring.

When a manifest item corresponds to a schema.org property, its property definition identifies its mapping and includes the defining type (e.g., CreativeWork or Book) in parentheses.

Schema.org additionally includes a large number of properties that, though relevant for publishing, are not mentioned in this specification. These properties MAY be used in a manifest as this document defines only the minimal set of manifest items.

When using additional schema.org properties, ensure that they are valid for the type of publication specified in the manifest. Properties are often available in many schema.org types, as a result of the inheritance model used by the vocabulary, but not all properties are available for all types. For more detailed information about which types accept which properties, refer to [schema.org].

More information about using additional schema.org properties is also available in § 2.7 Publication Types and § 2.9.3.2 Additional Manifest Properties.

2.2 Requirements

The following properties MUST be set in the manifest:

The priority of all other properties and resource relations is OPTIONAL, but MAY be modified by implementations of the manifest format.

Note

Some properties are implicitly required, as they are compiled from alternative information when not explicitly authored. See § 2.3 Web IDL for more information.

2.3 Web IDL

Although a digital publication's manifest is authored as [json-ld11], a user agent processes this information into the internal representation, which can be in any language, in order to utilize the properties.

To simplify the manifest format for developers, this specification defines an abstract representation of the data structures employed by the manifest using the Web Interface Definition Language (Web IDL) [webidl-1] — the PublicationManifest dictionary.

This definition expresses the expected names, datatypes, and possible restrictions for each member of the manifest. Unlike a typical Web IDL definition, however, user agents are not expected to expose the information in the manifest as an API. The Web IDL language is chosen solely to provide an abstraction of the data model.

The required members of the PublicationManifest dictionary do not exactly match the required members of the manifest as user agents are expected to compile information from other sources in the absence of an explicit declaration for some properties (e.g., for the title and reading order).

Note

It is not necessary to understand the Web IDL definition in order to create digital publications. Authoring requirements are defined in the following sections.

2.3.1 The PublicationManifest Dictionary

dictionary PublicationManifest {
    required sequence<DOMString>         type;
    required sequence<DOMString>         conformsTo;
             DOMString                   id;
             boolean                     abridged;
             sequence<DOMString>         accessMode;
             sequence<DOMString>         accessModeSufficient;
             sequence<DOMString>         accessibilityFeature;
             sequence<DOMString>         accessibilityHazard;
             LocalizableString           accessibilitySummary;
             sequence<Entity>            artist;
             sequence<Entity>            author;
             sequence<Entity>            colorist;
             sequence<Entity>            contributor;
             sequence<Entity>            creator;
             sequence<Entity>            editor;
             sequence<Entity>            illustrator;
             sequence<Entity>            inker;
             sequence<Entity>            letterer;
             sequence<Entity>            penciler;
             sequence<Entity>            publisher;
             sequence<Entity>            readBy;
             sequence<Entity>            translator;
             sequence<DOMString>         url;
             DOMString                   duration;
             sequence<DOMString>         inLanguage;
             DOMString                   dateModified;
             DOMString                   datePublished;
             TextDirection               readingProgression = "ltr";
    required sequence<LocalizableString> name;
    required sequence<LinkedResource>    readingOrder;
             sequence<LinkedResource>    resources;
             sequence<LinkedResource>    links;
};

enum TextDirection {
    "ltr",
    "rtl"
};

2.3.1.1 The LinkedResource Dictionary

dictionary LinkedResource {
    required DOMString                           url;
             DOMString                           encodingFormat;
             sequence<LocalizableString>         name;
             LocalizableString                   description;
             sequence<DOMString>                 rel;
             DOMString                           integrity;
             double                              length;
             sequence<LinkedResource>            alternate;
};

2.3.1.2 The Entity Dictionary

dictionary Entity {
             sequence<DOMString>         type;
    required sequence<LocalizableString> name;
             DOMString                   id;
             DOMString                   url;
             sequence<DOMString>         identifier;
};

2.3.1.3 The LocalizableString Dictionary

dictionary LocalizableString {
    required DOMString                   value;
             DOMString                   language;
             TextDirection               direction;
};
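
Because the internal representation can be held in any language, the three helper dictionaries could be mirrored, for instance, as Python dataclasses. This is a non-normative sketch: the class layout, the use of `None` for absent members, and the empty-list defaults are assumptions made for illustration, not requirements of this specification.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LocalizableString:
    value: str                       # required member
    language: Optional[str] = None
    direction: Optional[str] = None  # "ltr" or "rtl" when set


@dataclass
class Entity:
    name: List[LocalizableString]    # required member
    type: List[str] = field(default_factory=list)
    id: Optional[str] = None
    url: Optional[str] = None
    identifier: List[str] = field(default_factory=list)


@dataclass
class LinkedResource:
    url: str                         # required member
    encodingFormat: Optional[str] = None
    name: List[LocalizableString] = field(default_factory=list)
    description: Optional[LocalizableString] = None
    rel: List[str] = field(default_factory=list)
    integrity: Optional[str] = None
    length: Optional[float] = None
    alternate: List["LinkedResource"] = field(default_factory=list)


chapter = LinkedResource(
    url="chapter1.mp3",
    encodingFormat="audio/mpeg",
    alternate=[LinkedResource(url="chapter1.html")],
)
```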

2.4 Value Categories

This section describes the categories of values that can be used with properties of the publication manifest.

2.4.1 Literals

When a manifest property expects a literal text string as its value — one that is not language-dependent, such as a code value or date — its value MUST be expressed as a [json] string.

Literal values are not changed during processing of the manifest, unlike other values which might be, for example, converted to objects.

2.4.2 Numbers

When a manifest property expects a number as its value, the value MUST be expressed as a [json] number.

2.4.3 Booleans

When a manifest property expects a boolean as its value, the value MUST be expressed as an [ecmascript] Boolean value (true or false).

2.4.4 Explicit and Implied Objects

Various manifest properties are expected to be expressed as [json] objects. Although the use of objects is usually recommended, the following sections identify cases where it is also acceptable to use string values that are interpreted as objects depending on the context. The exact mapping of text values to objects is part of the property or object definitions.

2.4.4.1 Localizable Strings

When a manifest property expects a localizable text string as its value, the value MUST be expressed either as:

  1. a single string; or
  2. an instance of a LocalizableString object.

A single string value represents an implied object whose value property is the string's text and whose language is determined from other information in the manifest.

A LocalizableString is a [json] object consisting of the following properties:

Term | Description | Required Value | Value Type | [schema.org] Mapping
value | The value of the localizable string. REQUIRED. | Text. | Literal | (None)
language | The language of the value. OPTIONAL. | A well-formed language tag [bcp47]. | Literal | (None)
direction | The base direction of the value. OPTIONAL. | ltr or rtl | Literal | (None)

The meanings of the base direction values are:

  • ltr: indicates that the textual value is explicitly directionally set to left-to-right text; and
  • rtl: indicates that the textual value is explicitly directionally set to right-to-left text.

A missing base direction value means that the textual value is explicitly directionally set to the direction of the first character with a strong directionality, following the rules of the Unicode Bidirectional Algorithm [bidi].

Note

If the base direction value were not set in the last example, the text would be displayed, following the Unicode Bidirectional Algorithm [bidi] and due to the presence of a Latin character starting the string, as:

HTML היא שפת סימון.

However, that would be incorrect. The extra direction value is necessary to control the display to yield:

HTML היא שפת סימון.

See also the [string-meta] document for further explanations and examples.
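
The two accepted forms of a localizable string can be brought into a single shape during processing. The following Python sketch is one way to do so, under stated assumptions: the function name `expand_localizable` is hypothetical, the objects are modelled as plain dictionaries, the `he` language tag on the sample value is an illustrative choice, and the handling of a global language or direction is deliberately left out.

```python
VALID_DIRECTIONS = {"ltr", "rtl"}


def expand_localizable(value):
    """Return a dict with a 'value' key for a string or LocalizableString-like dict."""
    if isinstance(value, str):
        # A single string is an implied object whose value is the string's text.
        return {"value": value}
    if not isinstance(value, dict) or "value" not in value:
        raise ValueError("a localizable string requires a 'value'")
    direction = value.get("direction")
    if direction is not None and direction not in VALID_DIRECTIONS:
        raise ValueError(f"unsupported base direction: {direction!r}")
    return dict(value)


plain = expand_localizable("Moby Dick")
explicit = expand_localizable(
    {"value": "HTML היא שפת סימון.", "language": "he", "direction": "rtl"}
)
```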

2.4.4.2 Entities

When a manifest property expects an entity (i.e., an individual or organization responsible for the various aspects of creation), its value MUST be expressed either as:

  1. a single string; or
  2. an instance of an Entity object.

A single string value represents an instance of an Entity object whose name property is the string's text and whose type is assumed to be Person [schema.org].

An Entity is defined as an instance of either the [schema.org] Person or Organization type with the following minimal property set:

Term | Description | Required Value | Value Type | [schema.org] Mapping
type | The type of creator. OPTIONAL. | One or more Text. Sequence MUST include "Person" or "Organization". | Array of Literals | (None)
name | Name of the creator. REQUIRED. | One or more Text. | Array of Localizable Strings | name
id | A canonical identifier associated with the creator. OPTIONAL. | A URL record [url]. | Identifier | (None)
url | An address associated with the creator. OPTIONAL. | A valid URL string [url]. | URL | url
identifier | An identifier associated with the creator (e.g., ORCID). OPTIONAL. | One or more text(s). | Array of Literals | identifier

Note that user agents MAY interpret a wider range of creator properties defined by Schema.org than the ones in the preceding table.
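
As an illustration of the string shorthand described in this section, the following Python sketch expands an authored value into an Entity-like dictionary. It is not a normative algorithm: the function name `expand_entity` is hypothetical, and the sketch only applies the Person default to the string form, leaving an authored object's `type` untouched.

```python
def expand_entity(value):
    """Expand a string or Entity-like dict into a dict with a list of names."""
    if isinstance(value, str):
        # A single string is an Entity whose name is the string and whose
        # type is assumed to be Person.
        return {"type": ["Person"], "name": [{"value": value}]}
    entity = dict(value)
    names = entity.get("name")
    if not names:
        raise ValueError("an Entity requires a name")
    if not isinstance(names, list):
        names = [names]
    # Each name is itself a localizable string, so plain strings are wrapped.
    entity["name"] = [{"value": n} if isinstance(n, str) else n for n in names]
    return entity


author = expand_entity("Herman Melville")
publisher = expand_entity({"type": ["Organization"], "name": "W3C"})
```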

2.4.4.3 Linked Resources

When a manifest property links to one or more resources, it MUST be expressed either as:

  1. a [json] string encoding the URL of the resource; or
  2. an instance of a LinkedResource.

A string value represents an implied LinkedResource object whose url property is set to the string value.

Term | Description | Required Value | Value Type | [schema.org] Mapping
type | The type of resource. OPTIONAL. | One or more Text. Sequence MUST include "LinkedResource". | Array of Literals | (None)
url | Location of the resource. REQUIRED. | A valid URL string [url]. Refer to the property definitions that accept this type for additional restrictions. | URL | url
encodingFormat | Media type of the resource (e.g., text/html). OPTIONAL. | MIME Media Type [rfc2046]. | Literal | encodingFormat
name | Name of the item. OPTIONAL. | One or more Text items. | Array of Localizable Strings | name
description | Description of the item. OPTIONAL. | Text. | Localizable String | description
rel | The relation of the resource to the publication. OPTIONAL. | One or more relations. The values are either the relevant relation terms of the IANA link registry [iana-link-relations], or specially-defined URLs if no suitable link registry item exists. | Array of Literals | (None)
integrity | A cryptographic hashing of the resource that allows its integrity to be verified. OPTIONAL. | One or more whitespace-separated sets of integrity metadata [sri]. The value MUST conform to the metadata definition [sri]. Refer to [sri] for the list of cryptographic hashing functions that user agents are expected to support. | Literal | (None)
length | The total length of a time-based media resource in (possibly fractional) seconds. OPTIONAL. | Number | Number | (None)
alternate | References to one or more reformulation(s) of the resource in alternative formats. When specified, encodingFormat indicates the format of the reformulation. Order is not significant. OPTIONAL. | One or more of: a string, representing the URL of the resource reformulation in an alternative format; or an instance of a LinkedResource object. The order of items is not significant. Non-HTML resources SHOULD be expressed as LinkedResource objects with their encodingFormat values set. One or more instances of LinkedResource objects. | Array of Linked Resources | (None)

Although user agent support for the integrity property is OPTIONAL, user agents that support cryptographic hashing comparisons using this property MUST do so in accordance with [sri].
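
A simplified check of an `integrity` value might look like the following Python sketch. It relies on the fact that integrity metadata in [sri] has the form of an algorithm name, a hyphen, and a base64-encoded digest, optionally followed by `?` and options. The function name `matches_integrity` is hypothetical, and the sketch accepts a match against any listed digest rather than implementing the full [sri] rule of preferring the strongest algorithm, so it should be read as an approximation only.

```python
import base64
import hashlib

SUPPORTED_ALGORITHMS = ("sha256", "sha384", "sha512")


def matches_integrity(content: bytes, integrity: str) -> bool:
    """Return True if content matches any supported digest in the integrity value."""
    for metadata in integrity.split():
        algorithm, _, rest = metadata.partition("-")
        if algorithm not in SUPPORTED_ALGORITHMS:
            continue  # unknown algorithms are skipped in this sketch
        expected = rest.split("?", 1)[0]  # drop any trailing options
        digest = hashlib.new(algorithm, content).digest()
        actual = base64.b64encode(digest).decode("ascii")
        if actual == expected:
            return True
    return False


# Build a sample value from known content instead of hard-coding a digest.
sample = b"<html>Chapter 1</html>"
sample_integrity = "sha256-" + base64.b64encode(hashlib.sha256(sample).digest()).decode("ascii")
```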

Example 4: A resource with a SHA-256 hashing of its content.
{
    "type"           : "LinkedResource",
    "url"            : "chapter1.html",
    "encodingFormat" : "text/html",
    "name"           : "Chapter 1 - Loomings",
    "integrity"      : "sha256-13AE04E21177BABEDFDE721577615A638341F963731EA936BBB8C3862F57CDFC"
}
Example 5: A resource with its alternate formats.
{
    "type"           : "LinkedResource",
    "url"            : "chapter1.mp3",
    "encodingFormat" : "audio/mpeg",
    "name"           : "Chapter 1 - Loomings",
    "alternate"      : [
        "chapter1.html",
        {
            "type": "LinkedResource",
            "url": "chapter1.json",
            "encodingFormat": "application/vnd.wp-sync-media+json",
            "length": 1669
        }
    ]
}
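
The mix of strings and objects allowed in `alternate` can be normalized recursively. The Python sketch below shows one way to do this; the function name `expand_linked_resource` is hypothetical, resources are modelled as plain dictionaries, and no URL resolution or validation is attempted.

```python
def expand_linked_resource(value):
    """Expand a URL string or LinkedResource-like dict, including its alternates."""
    if isinstance(value, str):
        # A string is an implied LinkedResource whose url is the string value.
        return {"url": value}
    resource = dict(value)
    if "url" not in resource:
        raise ValueError("a LinkedResource requires a url")
    alternates = resource.get("alternate")
    if alternates is not None:
        if not isinstance(alternates, list):
            alternates = [alternates]
        resource["alternate"] = [expand_linked_resource(a) for a in alternates]
    return resource


chapter = expand_linked_resource({
    "type": "LinkedResource",
    "url": "chapter1.mp3",
    "encodingFormat": "audio/mpeg",
    "alternate": [
        "chapter1.html",
        {"type": "LinkedResource", "url": "chapter1.json", "length": 1669},
    ],
})
```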

2.4.5 URLs

URLs are used to identify resources associated with a digital publication. When a property expects a URL value, it MUST be a valid URL string [url].

Manifest URLs are restricted to only the http and https schemes [url].

In the case of relative-URL strings, these are resolved to absolute-URL strings using a base URL [url].

The base URL for relative-URL strings is determined as follows:

By consequence, relative-URL strings in embedded manifests are resolved against the URL of the document that references the manifest unless the document declares a base URL (i.e., in a <base> element in its header).
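
Resolving a relative-URL string against a known base URL, and rejecting schemes other than http and https, can be approximated as in the Python sketch below. The function name `resolve_manifest_url` and the sample base URL are hypothetical. Note also that Python's `urllib.parse` follows RFC 3986-style resolution rather than the parser defined in [url], so results can differ in edge cases.

```python
from urllib.parse import urljoin, urlsplit

ALLOWED_SCHEMES = ("http", "https")


def resolve_manifest_url(value: str, base_url: str) -> str:
    """Resolve value against base_url and enforce the http/https restriction."""
    absolute = urljoin(base_url, value)
    if urlsplit(absolute).scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"unsupported URL scheme in {absolute!r}")
    return absolute


# Hypothetical base URL, standing in for whichever base applies to the manifest.
base = "https://publisher.example.org/book/manifest.jsonld"
resolved = resolve_manifest_url("chapter1.html", base)
```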

Note

URLs allow for the usage of characters from Unicode following [rfc3987]. See the note in the HTML5 specification for further details.

2.4.6 Identifiers

Identifiers are used to refer to a digital publication and the entities responsible for its creation in a persistent and unambiguous manner. URLs, URNs, DOIs, ISBNs, and PURLs are all examples of persistent identifiers frequently used in publishing.

Identifiers MUST be expressed as URL records [url].

2.4.7 Arrays

When a manifest property allows one or more values of a given type (e.g., literal, object, or URL), these values are expressed as [json] arrays. When a property has only a single value, however, the array syntax MAY be omitted.
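
A processor that accepts the omitted array syntax typically wraps single values before further handling. A minimal Python sketch follows; the helper name `as_array` is hypothetical, as is the choice to treat a missing value as an empty list.

```python
def as_array(value):
    """Wrap a single value in a list; leave lists as they are."""
    if value is None:
        return []  # assumption: an absent property becomes an empty list
    return value if isinstance(value, list) else [value]


single = as_array("https://publisher.example.org/frankenstein")
several = as_array(["Book", "VisualArtwork"])
```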

2.5 Manifest Contexts

A manifest MUST set its JSON-LD context [json-ld11] with the following two components, in the specified order:

  1. the [schema.org] context: https://schema.org
  2. the publication context: https://www.w3.org/ns/pub-context
Example 8: Setting the context declaration.
{
    "@context" : [
        "https://schema.org",
        "https://www.w3.org/ns/pub-context"
    ],
    …
}

The publication context document adds features to the properties defined in Schema.org (e.g., the requirement for the creator property to be order preserving).

Although Schema.org is often referenced using the http URI scheme, the vocabulary is being migrated to use the secure https scheme as its default. The use of https when referencing Schema.org in the manifest is REQUIRED by this specification.

The context can be extended by including additional parameters, such as the global language and direction declarations, in an object following the publication context.

{
    "@context" : [
        "https://schema.org",
        "https://www.w3.org/ns/pub-context",
        {
            "language" : "es"
        }
    ],
    …
}
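
The ordering requirement on the context, together with the optional trailing object, can be checked as in the following Python sketch. It is an illustration under assumptions: the function name `read_context` is hypothetical, and only the `language` and `direction` keys of the trailing object are collected.

```python
REQUIRED_CONTEXT = ["https://schema.org", "https://www.w3.org/ns/pub-context"]


def read_context(manifest: dict) -> dict:
    """Validate the context order and return any global language/direction."""
    context = manifest.get("@context")
    if not isinstance(context, list) or context[:2] != REQUIRED_CONTEXT:
        raise ValueError("@context must start with the schema.org and publication contexts")
    declarations = {}
    for extra in context[2:]:
        if isinstance(extra, dict):
            for key in ("language", "direction"):
                if key in extra:
                    declarations[key] = extra[key]
    return declarations


declared = read_context({
    "@context": [
        "https://schema.org",
        "https://www.w3.org/ns/pub-context",
        {"language": "es"},
    ]
})
```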

2.6 Manifest Language and Direction

Each natural language property value in a manifest (e.g., title, creators) has a default natural language, which is the language that it is expressed in (e.g., English, French, Chinese). It also has a natural base direction in which it is written — the display direction, either left-to-right or right-to-left.

The digital publication manifest provides the ability to set both these concepts globally as well as on individual items to aid user agents in interpreting and presenting the metadata.

2.6.1 Global Declarations

The default language and base direction for natural language manifest properties are set by including a global language and/or global base direction declaration in the context, using the language and direction keywords [json-ld11], respectively. These declarations are used to expand simple string values into localizable strings during the processing of the manifest, as well as to provide a language and base direction for localizable strings that omit one.

The value of language MUST be a well-formed language tag [bcp47].

The value of direction MUST have one of the following values:

  • "ltr": indicates that the textual values are explicitly directionally set to left-to-right text; and
  • "rtl": indicates that the textual values are explicitly directionally set to right-to-left text.

The global language and base direction declaration, when present, MUST follow the publication context.

Default values are not specified for the global language or base direction.

2.6.2 Item-Specific Declarations

It is possible to set the language or a base direction locally for any natural language value in the manifest using a localizable string:

Note

The extra base direction setting for the Arabic title is necessary to yield the correct display, i.e.,:

HTML و CSS: تصميم و إنشاء مواقع الويب

The possible values of the language and direction keywords [json-ld11] are the same as for the global declarations. Furthermore, either value can also be the JSON value null, indicating that no explicit language or direction, respectively, is set.

Note

Setting the value of language to null can be useful if a value (e.g., the name of an organization) is commonly used without any associated language (e.g., "Google").

A local declaration of the language or base direction takes precedence over the corresponding global declaration.
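
The interaction between global and item-specific declarations, including an explicit null, can be sketched in Python as follows. The function name `apply_global_defaults` is hypothetical and values are modelled as plain dictionaries. The sketch relies on the difference between a key that is absent (the global value applies) and a key that is present with the value `None` (no language or direction is set).

```python
def apply_global_defaults(value, global_language=None, global_direction=None):
    """Return a localizable-string dict with global defaults applied."""
    if isinstance(value, str):
        value = {"value": value}
    # dict.get() returns the default only when the key is absent, so an
    # explicit None (JSON null) overrides the global declaration.
    language = value.get("language", global_language)
    direction = value.get("direction", global_direction)
    result = {"value": value["value"]}
    if language is not None:
        result["language"] = language
    if direction is not None:
        result["direction"] = direction
    return result


inherited = apply_global_defaults("Bonjour", global_language="fr")
overridden = apply_global_defaults({"value": "Hello", "language": "en"}, global_language="fr")
unset = apply_global_defaults({"value": "Google", "language": None}, global_language="fr")
```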

2.7 Publication Types

A digital publication's manifest defines its Publication Type using the type keyword [json-ld11]. The type MAY be mapped onto any [schema.org] type, but CreativeWork is assumed as the default when no type is specified.

Example 14: Setting a publication's type to CreativeWork.
{
    "@context" : ["https://schema.org", "https://www.w3.org/ns/pub-context"],
    "type"     : "CreativeWork",
    …
}

More specific subtypes of CreativeWork, such as Article, Book, TechArticle, and Course can be used instead of, or in addition to, CreativeWork.

Example 15: Setting a publication's type to Book.
{
    "@context" : ["https://schema.org", "https://www.w3.org/ns/pub-context"],
    "type"     : "Book",
    …
}

Each schema.org type defines a set of properties that are valid for use with it. To ensure that the manifest can be validated and processed by schema.org-aware processors, the manifest SHOULD contain only the properties associated with the selected type.

If properties from more than one type are needed, the manifest MAY include multiple type declarations.

Example 16: Setting the type property for a publication that combines properties from Book and VisualArtwork.
{
    "@context" : ["https://schema.org", "https://www.w3.org/ns/pub-context"],
    "type"     : ["Book", "VisualArtwork"],
    …
}

User agents SHOULD NOT fail to process manifests that are not valid to their declared schema.org type(s).
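
Reading the publication type with the CreativeWork default can be sketched as below. The function name `publication_types` is hypothetical, and treating an empty array the same as a missing declaration is an assumption of this sketch.

```python
def publication_types(manifest: dict) -> list:
    """Return the declared types as a list, defaulting to CreativeWork."""
    declared = manifest.get("type")
    if not declared:
        # CreativeWork is assumed when no type is specified.
        return ["CreativeWork"]
    return declared if isinstance(declared, list) else [declared]


default_type = publication_types({})
combined = publication_types({"type": ["Book", "VisualArtwork"]})
```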

Note

Refer to the schema.org site for the complete list of CreativeWork subtypes.

2.8 Profile Conformance

A digital publication indicates the profile its manifest and content are in conformance with using the conformsTo property.

Term | Description | Required Value | Value Type | [dcterms] Mapping
conformsTo | URL of the profile. | A valid URL string [url]. | Array of URLs | conformsTo

The URL to use for each profile is defined in its respective specification.

Note

The conformsTo property can also be used to indicate conformance to other specifications and standards (e.g., to [wcag21]).

Example 17: Identify that a digital publication conforms to the W3C Audiobooks specification.
{
    …
    "conformsTo" : "https://www.w3.org/TR/audiobooks",
    …
}

2.9 Properties

2.9.1 Descriptive Properties

2.9.1.1 Abridged

The abridged property provides information on whether or not a digital publication has been shortened from its original form.

Term | Description | Required Value | Value Type | [schema.org] Mapping
abridged | Indicates whether the book is an abridged edition. | Either true or false. | Boolean | abridged (Book)
Example 18: Indicating that a publication is abridged.
{
    …
    "abridged" : true,
    …
}

2.9.1.2 Accessibility

The accessibility properties provide information about the suitability of a digital publication for consumption by users with different preferred reading modalities. These properties typically supplement an evaluation against established accessibility criteria, such as those provided in [wcag21].

The following properties are categorized as accessibility properties:

Term | Description | Required Value | Value Category | [schema.org] Mapping
accessMode | The human sensory perceptual system or cognitive faculty through which a person may process or perceive information. | One or more text(s). | Array of Literals | accessMode (CreativeWork)
accessModeSufficient | A list of single or combined access modes that are sufficient to understand all the intellectual content of a resource. | One or more ItemList. | Array of Literals | accessModeSufficient (CreativeWork)
accessibilityFeature | Content features of the resource, such as accessible media, alternatives and supported enhancements for accessibility. | One or more text(s). | Array of Literals | accessibilityFeature (CreativeWork)
accessibilityHazard | A characteristic of the described resource that is physiologically dangerous to some users. | One or more text(s). | Array of Literals | accessibilityHazard (CreativeWork)
accessibilitySummary | A human-readable summary of specific accessibility features or deficiencies that is consistent with the other accessibility metadata. | Text. | Localizable String | accessibilitySummary (CreativeWork)
Note

Detailed descriptions of these properties, including the expected values to use with them, are available at [webschemas-a11y].

Note

A reference to a detailed accessibility report can also be provided if more information is needed than can be expressed by these properties.

Example 19: Setting accessibility metadata for a publication that provides alternative text and long descriptions appropriate for each image, enabling it to be read in purely textual form.
{
    …
    "accessMode"              : ["textual", "visual"],
    "accessibilityFeature"    : ["alternativeText", "longDescription"],
    "accessModeSufficient"    : [
        {
            "type"            : "ItemList",
            "itemListElement" : ["textual", "visual"]
        },
        {
            "type"            : "ItemList",
            "itemListElement" : ["textual"]
        }
    ],
    …
}

2.9.1.3 Address

An address is a URL that identifies the source location of a digital publication. It is expressed using the url property.

Term | Description | Required Value | Value Type | [schema.org] Mapping
url | URL of the publication. | A valid URL string [url]. | Array of URLs | url (Thing)

A digital publication MAY have more than one address, but all the addresses MUST resolve to the same document.

Note
The publication's address can also be used as the value for an identifier link relation [link-relation].
Example 20: Setting the address of the publication.
{
    …
    "url" : "https://publisher.example.org/frankenstein",
    …
}
2.9.1.4 Canonical Identifier

The canonical identifier provides a unique identifier for a digital publication. It is expressed using the id property.

Term Description Required Value Value Type [schema.org] Mapping
id Preferred version of the publication. A URL record [url]. Identifier (None)
Note

Ensuring uniqueness of canonical identifiers is outside the scope of this specification. The actual achievable uniqueness depends on such factors as the conventions of the identifier scheme used and the degree of control over assignment of identifiers.

If a canonical identifier is not provided in the manifest, or the value is an invalid URL, the digital publication does not have a canonical identifier. User agents MUST NOT attempt to construct a canonical identifier from any other identifiers provided in the manifest.

The specification of the canonical identifier MAY be complemented by the inclusion of additional types of identifiers using the identifier property [schema.org] and/or its subtypes.

Example 21: Setting the canonical identifier and the address as URLs.
{
    …
    "id"  : "http://www.w3.org/TR/tabular-data-model/",
    "url" : "http://www.w3.org/TR/2015/REC-tabular-data-model-20151217/",
    …
}
Example 22: Using a URN for the canonical identifier.
{
    …
    "id"  : "urn:isbn:9780123456789",
    "url" : "https://publisher.example.org/wuthering-heights",
    …
}
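The rule that a missing or invalid id leaves the publication without a canonical identifier can be sketched in Python as follows. This is a minimal, non-normative illustration: the function name canonical_identifier is hypothetical, and checking for a URL scheme with urllib.parse is only a rough stand-in for the URL validation defined in [url].

```python
from typing import Optional
from urllib.parse import urlsplit


def canonical_identifier(manifest: dict) -> Optional[str]:
    """Return the manifest's canonical identifier, or None if it has none."""
    value = manifest.get("id")
    if not isinstance(value, str) or not value.strip():
        return None
    try:
        parts = urlsplit(value)
    except ValueError:
        # urlsplit rejects some malformed inputs, e.g. unbalanced IPv6 brackets.
        return None
    # Assumption: a usable identifier is an absolute URL, i.e. it has a scheme.
    if not parts.scheme:
        return None
    # Deliberately no fallback to "identifier" or "url": constructing a
    # canonical identifier from other properties is not permitted.
    return value
```

With this sketch, a manifest that only carries an identifier property yields None rather than a constructed value.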
2.9.1.5 Creators

A creator is an individual or organization responsible for the creation of a digital publication.

The following properties are categorized as creators:

Term Description Required Value Value Category [schema.org] Mapping
artist The primary artist for the publication, in a medium other than pencils or digital line art. One or more Person. Array of Entities artist (VisualArtwork)
author The author of the publication. One or more Person and/or Organization. Array of Entities author (CreativeWork)
colorist The individual who adds color to inked drawings. One or more Person. Array of Entities colorist (VisualArtwork)
contributor Contributor whose role does not fit to one of the other roles in this table. One or more Person and/or Organization. Array of Entities contributor (CreativeWork)
creator The creator of the publication. One or more Person and/or Organization. Array of Entities creator (CreativeWork)
editor The editor of the publication. One or more Person. Array of Entities editor (CreativeWork)
illustrator The illustrator of the publication. One or more Person. Array of Entities illustrator (Book)
inker The individual who traces over the pencil drawings in ink. One or more Person. Array of Entities inker (VisualArtwork)
letterer The individual who adds lettering, including speech balloons and sound effects, to artwork. One or more Person. Array of Entities letterer (VisualArtwork)
penciler The individual who draws the primary narrative artwork. One or more Person. Array of Entities penciler (VisualArtwork)
publisher The publisher of the publication. One or more Person and/or Organization. Array of Entities publisher (CreativeWork)
readBy A person who reads (performs) the publication (for audiobooks). One or more Person. Array of Entities readBy (Audiobook)
translator The translator of the publication. One or more Person and/or Organization. Array of Entities translator (CreativeWork)

Creators MUST be represented either as:

  1. a [json] string encoding the name of a Person [schema.org]; or
  2. an instance of a Person or Organization [schema.org].

A single string value is a shorthand for a [schema.org] Person whose name property is set to that string value. (See also § 2.4.4.2 Entities.)

When compiling each set of creator information from a [schema.org] Person or Organization type, user agents MUST retain the properties defined for the Entity type, when available.

The manifest MAY include more than one of each type of creator.

Example 23: Setting the author of a book.
{
    …
    "url"      : "https://publisher.example.org/alice-in-wonderland",
    "author"   : {
        "type"  : "Person",
        "name"  : "Lewis Carroll"
    }
}
Example 24: Separating editors, authors, and publisher. Some persons are expressed as simple strings instead of objects.
{
    …
    "author"     : [
        "Jeni Tennison",
        {
            "type"       : "Person",
            "name"       : "Gregg Kellogg"
        },
        {
            "type"       : "Person",
            "name"       : "Ivan Herman",
            "id"         : "https://www.w3.org/People/Ivan/",
            "identifier" : "0000-0003-0782-2704"
        }
    ],
    "editor"    : [
        "Jeni Tennison",
        {
            "type" : "Person",
            "name" : "Gregg Kellogg"
        }
    ],
    "publisher" : {
        "type" : "Organization",
        "name" : "World Wide Web Consortium",
        "id"   : "https://www.w3.org/"
    },
    …
}
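The string shorthand for creators can be illustrated with a short Python sketch. It is non-normative: the function name normalize_creators is hypothetical, the array form of type follows the expansion used by the processing algorithm in this specification, and silently dropping values that are neither strings nor objects is an assumption of this sketch.

```python
def normalize_creators(value):
    """Expand a creator value into a list of Person/Organization objects."""
    # Authors may give a single value instead of a one-element array.
    items = value if isinstance(value, list) else [value]
    creators = []
    for item in items:
        if isinstance(item, str):
            # A bare string is shorthand for a Person with that name.
            creators.append({"type": ["Person"], "name": item})
        elif isinstance(item, dict):
            # Person and Organization objects are kept as provided.
            creators.append(item)
        # Assumption: values of any other kind are dropped.
    return creators
```

Applied to the author array of Example 24, the first entry becomes a Person object named "Jeni Tennison" while the two objects are passed through unchanged.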
2.9.1.6 Duration

The global duration indicates the overall length of a time-based digital publication (e.g., an audiobook or a book consisting of a series of video clips). It is expressed using the duration property.

Term Description Required Value Value Category [schema.org] Mapping
duration Overall duration of a time-based publication. Duration value as defined by [iso8601]. Literal duration (Property)
Example 25: Setting the global duration in seconds.
{
    …
    "type"     : "Audiobook",
    "id"       : "https://example.org/flatland-a-romance-of-many-dimensions/",
    "url"      : "https://w3c.github.io/pub-manifest/experiments/audiobook/",
    "name"     : "Flatland: A Romance of Many Dimensions",
    …
    "duration" : "PT15153S",
    …
}
Note

The relevant Wikipedia page gives a concise description of the ISO duration syntax.
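
As an illustration of how a duration value might be read, the following Python sketch parses a subset of the ISO 8601 duration syntax. It is non-normative: the name parse_duration is hypothetical, and the sketch assumes that only the day, hour, minute and second designators are used (years, months and weeks are deliberately left out).

```python
import re
from datetime import timedelta
from typing import Optional

# Subset of the ISO 8601 duration syntax: PnDTnHnMnS.
_DURATION = re.compile(
    r"P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?"
    r"(?:(?P<seconds>\d+(?:\.\d+)?)S)?)?"
)


def parse_duration(text: str) -> Optional[timedelta]:
    """Return the duration as a timedelta, or None if it cannot be parsed."""
    match = _DURATION.fullmatch(text)
    if match is None:
        return None
    parts = {k: float(v) for k, v in match.groupdict().items() if v is not None}
    # Reject "P", "PT" and values with a dangling "T" such as "P1DT".
    if not parts or text.endswith("T"):
        return None
    return timedelta(**parts)
```

For the value used in Example 25, parse_duration("PT15153S") gives a duration of 15153 seconds, that is, 4 hours, 12 minutes and 33 seconds.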

2.9.1.7 Last Modification Date

The last modification date is the date when a digital publication was last updated (i.e., whenever changes were last made to any of the resources of the publication, including the manifest). It is expressed using the dateModified property.

Term Description Required Value Value Category [schema.org] Mapping
dateModified Last modification date of the publication. A Date or DateTime value [schema.org], expressed in the ISO 8601 Date or Date Time format, respectively [iso8601]. Literal dateModified (CreativeWork)

The last modification date does not necessarily reflect all changes to a publication (e.g., if a digital publication format allows references to third-party content). User agents SHOULD check the last modification date of individual resources to determine if they have changed and need updating.

Example 26: Setting the last modification date of the publication.
{
    …
    "dateModified" : "2015-12-17",
    …
}
2.9.1.8 Publication Date

The publication date is the date on which a digital publication was originally published. It represents a static event in the lifecycle of a publication and allows subsequent revisions to be identified and compared. It is expressed using the datePublished property.

Term Description Required Value Value Category [schema.org] Mapping
datePublished Creation date of the publication. A Date or DateTime, expressed in the ISO 8601 Date or Date Time format, respectively [iso8601]. Literal datePublished (CreativeWork)

The exact moment of publication is intentionally left open to interpretation: it could be when the publication is first made available or could be a point in time before publication when the publication is considered final.

Example 27: Setting the creation and modification date of the publication.
{
    …
    "datePublished" : "2015-12-17",
    "dateModified"  : "2016-01-30",
    …
}
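Reading the datePublished and dateModified values can be sketched in Python as follows. This is a non-normative illustration that relies on the standard library's ISO 8601 support, which covers only common forms of the format; the name parse_date_value is hypothetical.

```python
from datetime import date, datetime
from typing import Optional, Union


def parse_date_value(text) -> Optional[Union[date, datetime]]:
    """Parse an ISO 8601 Date or Date Time string; return None if invalid."""
    if not isinstance(text, str):
        return None
    try:
        return date.fromisoformat(text)
    except ValueError:
        pass
    # Older Python versions do not accept a trailing "Z" for UTC.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    try:
        return datetime.fromisoformat(text)
    except ValueError:
        return None
```

Two values of the same kind can then be compared directly, for example to confirm that the modification date in Example 27 is later than the publication date. Comparing a Date with a Date Time needs an explicit conversion first.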
2.9.1.9 Publication Language

A digital publication has at least one natural language, which is the language that the content is expressed in (e.g., English, French, Chinese). The manifest includes the following property to set this concept, which can influence, for example, the behavior of a user agent (e.g., to preload a dictionary or text-to-speech engine).

Term Description Required Value Value Category [schema.org] Mapping
inLanguage Default language for the publication. One or more well-formed language tags [bcp47]. Array of Literals inLanguage (Property)

The natural language MUST be a well-formed language tag [bcp47].

If a user agent requires the publication language and it is not available in the manifest, or the obtained value is not well-formed [bcp47], the user agent MAY attempt to determine the publication language when generating its internal representation. This specification does not mandate how such a language tag is created. The user agent might:

If a user agent requires a primary language for the publication and more than one language is specified, the first entry in the inLanguage array MUST be recognized as the primary.

Note

It is important to differentiate the language of the publication from the language of the individual resources that compose it. If such resources are, for example, in HTML, the language needs to be set in those resources, too. The language of the publication is not inherited.
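
Selecting the primary language from inLanguage can be sketched in Python as follows. The sketch is non-normative and rests on two assumptions: the regular expression is only a rough approximation of well-formedness under [bcp47] (it does not cover private-use or grandfathered tags), and entries that fail the check are skipped before the first entry is taken. The name primary_language is hypothetical.

```python
import re
from typing import Optional

# Rough approximation of a well-formed language tag: a 2-8 letter primary
# subtag followed by any number of 1-8 character alphanumeric subtags.
_LANGUAGE_TAG = re.compile(r"[A-Za-z]{2,8}(?:-[A-Za-z0-9]{1,8})*")


def primary_language(manifest: dict) -> Optional[str]:
    """Return the first usable entry of inLanguage, or None."""
    value = manifest.get("inLanguage")
    if value is None:
        return None
    # A single tag is allowed in place of a one-element array.
    tags = value if isinstance(value, list) else [value]
    well_formed = [
        tag for tag in tags
        if isinstance(tag, str) and _LANGUAGE_TAG.fullmatch(tag)
    ]
    return well_formed[0] if well_formed else None
```

A user agent that needs a language when this function returns None would fall back to its own means of determining one.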

2.9.1.10 Reading Progression Direction

The reading progression direction establishes the reading direction from one resource to the next within a digital publication. It is used to adapt such publication-level interactions as menu position, touch gestures, swipe direction, and tap zones for next and previous page. The reading progression is expressed using the readingProgression property.

Term Description Required Value Value Category [schema.org] Mapping
readingProgression Reading progression direction from one resource to the other. One of: ltr or rtl. Literal (None)

The value of this property MUST be either:

  • ltr: left-to-right;
  • rtl: right-to-left.

The default value is ltr. If the readingProgression is not set, user agents MUST use the default value when generating their internal representation.

This property has no effect on the rendering of the individual primary resources; it is only relevant for the progression direction from one resource to the other.

Example 28: Setting the reading progression explicitly to ltr (left-to-right).
{
    …
    "readingProgression" : "ltr",
    …
}
2.9.1.11 Title

The title provides the human-readable name of a digital publication. It is expressed using the name property.

Term Description Required Value Value Category [schema.org] Mapping
name Human-readable title of the publication. One or more text items for the title. Array of Localizable Strings name (Thing)

If a title is not included in the manifest, and a digital publication does not define alternative rules for obtaining one, the user agent MUST create one. This specification does not specify what heuristics to use to generate such a title.

Note

A user agent is not expected to produce a meaningful title [wcag21] for a publication when one is not specified.

Example 29: Setting the title of a book explicitly.
{
    …
    "name" : "Heart of Darkness",
    …
}

2.9.2 Resource Categorization Properties

Publication resources are specified via the default reading order, the resource list, and the links, as defined in this section. These lists contain references to informative resources like the privacy policy, and structural resources like the table of contents.

Note that a particular resource's URL MUST NOT appear in more than one of these lists, and a URL MUST NOT be repeated within a list.

The manifest MUST NOT include a reference to itself within any of these lists.
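
A check for repeated URLs across the three lists can be sketched in Python as follows. The sketch is non-normative: the name find_repeated_urls is hypothetical, and it compares URL strings exactly as written, which assumes they have not yet been resolved against a base URL.

```python
def find_repeated_urls(manifest: dict):
    """Return (url, first_list, repeated_in) for every repeated URL."""
    first_seen = {}
    repeats = []
    for list_name in ("readingOrder", "resources", "links"):
        entries = manifest.get(list_name, [])
        if not isinstance(entries, list):
            entries = [entries]
        for entry in entries:
            # Entries are either URL strings or LinkedResource objects.
            if isinstance(entry, str):
                url = entry
            elif isinstance(entry, dict):
                url = entry.get("url")
            else:
                url = None
            if not isinstance(url, str):
                continue
            if url in first_seen:
                repeats.append((url, first_seen[url], list_name))
            else:
                first_seen[url] = list_name
    return repeats
```

An empty result means that no URL appears in more than one list or more than once within a list.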

2.9.2.1 Default Reading Order

The default reading order is a specific progression through a set of digital publication resources. A user might follow alternative pathways through the content, but in the absence of such interaction the default reading order defines the expected progression from one resource to the next.

The default reading order is expressed using the readingOrder property.

Term Description Required Value Value Category [schema.org] Mapping
readingOrder Order of progression through the resources of a digital publication. One or more LinkedResource. Array of Linked Resources (None)

Each element of the readingOrder property MUST be expressed either as:

  1. a string representing the URL of the resource; or
  2. an instance of a LinkedResource object.

A single string value represents an instance of a LinkedResource object whose url property is the string's text.

The order of items is significant.

The URLs expressed in the reading order MUST NOT include fragment identifiers. Non-HTML resources SHOULD be expressed as LinkedResource objects with their encodingFormat values set.

The default reading order MUST include at least one resource.

Example 30: Expressing the reading order as a simple list of URLs.
{
    …
    "readingOrder" : [
        "html/title.html",
        "html/copyright.html",
        "html/introduction.html",
        "html/epigraph.html",
        "html/c001.html",
        …
    ],
    …
}
Example 31: Expressing the reading order as LinkedResource objects to provide more information.
{
    …
    "readingOrder" : [
        {
            "type"           : "LinkedResource",
            "url"            : "html/title.html",
            "encodingFormat" : "text/html",
            "name"           : "Title page"
        },
        {
            "type"           : "LinkedResource",
            "url"            : "html/copyright.html",
            "encodingFormat" : "text/html",
            "name"           : "Copyright page"
        },
        …
    ],
    …
}
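The requirements on the default reading order can be sketched in Python as follows. This is a non-normative illustration: the name normalize_reading_order is hypothetical, the array form of type mirrors the expansion used for entities in the processing algorithm, and dropping entries that fail a check (rather than, say, stripping the fragment) is an assumption of this sketch.

```python
from urllib.parse import urldefrag


def normalize_reading_order(value):
    """Return (resources, errors) for a readingOrder value."""
    items = value if isinstance(value, list) else [value]
    resources, errors = [], []
    for item in items:
        if isinstance(item, str):
            # A bare string stands for a LinkedResource with that URL.
            item = {"type": ["LinkedResource"], "url": item}
        if not isinstance(item, dict) or not isinstance(item.get("url"), str):
            errors.append(f"not a URL string or LinkedResource: {item!r}")
            continue
        if urldefrag(item["url"]).fragment:
            errors.append(f"fragment identifier not allowed: {item['url']}")
            continue
        resources.append(item)  # The original order is preserved.
    if not resources:
        errors.append("the default reading order must include at least one resource")
    return resources, errors
```

Because the order of items is significant, the sketch never sorts or deduplicates the list.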
2.9.2.2 Resource List

The resource list enumerates any additional resources used in the processing or rendering of a digital publication that are not already listed in the default reading order. It is expressed using the resources property.

Term Description Required Value Value Category [schema.org] Mapping
resources List of additional publication resources used in the processing or rendering of a publication. One or more LinkedResource. Array of Linked Resources (None)

Each element of the resources property MUST be expressed either as:

  1. a string representing the URL of the resource; or
  2. an instance of a LinkedResource object.

A single string value represents an instance of a LinkedResource object whose url property is the string's text.

The order of items is not significant.

The URLs MUST NOT include fragment identifiers. It is RECOMMENDED to use LinkedResource objects with their encodingFormat values set.

The completeness of the resource list can affect the usability of a digital publication in certain reading scenarios (e.g., the ability to read it offline). For this reason, it is strongly RECOMMENDED to provide a comprehensive list of all of the publication's constituent resources beyond those listed in the default reading order.

In some cases, a comprehensive list of these resources might not be easily achieved (e.g., third-party scripts that reference resources from deep within their source), but a user agent SHOULD still be able to render a publication even if some of these resources are not identified as belonging to the publication (e.g., if it is taken offline without them).

Example 32: Expressing the list of resources via a combination of simple URL strings and LinkedResource objects.
{
    …
    "resources"  : [
        "datatypes.html",
        "datatypes.svg",
        "datatypes.png",
        "diff.html",
        {
            "type"           : "LinkedResource",
            "url"            : "test-utf8.csv",
            "encodingFormat" : "text/csv"
        },
        {
            "type"           : "LinkedResource",
            "url"            : "test-utf8-bom.csv",
            "encodingFormat" : "text/csv"
        },
        …
    ],
    …
}

2.9.3 Extensibility

The manifest is designed to provide a basic set of properties for use by user agents in presenting and rendering a digital publication, but MAY be extended in the following ways:

  1. by the provision of linked metadata records; or
  2. through the inclusion of additional properties in the manifest.

Although both methods are valid, the use of linked records is RECOMMENDED.

This specification does not define how such additional properties are compiled, stored or exposed by user agents in their internal representation of the manifest. A user agent MAY ignore some or all extended properties.

2.9.3.1 Linked records

Extending the manifest through links to a record, such as an ONIX [onix] or BibTeX [bibtex] file, MUST be expressed using a LinkedResource object, where:

  • the rel value of the LinkedResource SHOULD include a relevant identifier defined by IANA or by other organizations; if the link record contains descriptive metadata it MUST include the describedby (IANA) identifier;
  • the value of the encodingFormat in the link MUST use the MIME media type [rfc2046] defined for that particular type of record, if applicable.

Linked records MUST be included in the resource list when they are part of the publication (i.e., are needed for more than just manifest extensibility). Otherwise, they MUST be included in the links list.

Example 33: Linking to an external ONIX for Books metadata record.
{
    …
    "links"  : [
        {
            "type"            : "LinkedResource",
            "url"             : "https://www.publisher.example.org/time-machine/onix.xml",
            "encodingFormat"  : "application/onix+xml",
            "rel"             : "describedby"
        },
        …
    ],
    …
}
Editor's note

The application/onix+xml MIME type has not yet been registered by IANA at the time of writing this document, and is included in the example for illustrative purposes only.

2.9.3.2 Additional Manifest Properties

Additional properties MAY be included directly in the manifest. It is RECOMMENDED that these properties be taken from public schemes like [schema.org] or [dcterms] and use values from controlled vocabularies whenever possible. Proprietary terms MAY be used, but it is RECOMMENDED that such terms be included using Compact IRIs [json-ld11], with prefixes defined as part of the context.

Example 34: Extending the basic data set using a vocabulary prefix declaration.
{
    "@context" : [
        "https://schema.org",
        "https://www.w3.org/ns/pub-context",
        {
            "language" : "en",
            "ex"       : "https://example.org/vocab#"
        }
    ],
    …
    "ex:region" : "North America",
    …
}
Note

The schema.org context file [schema.org] defines a number of prefixes for commonly used vocabularies, such as the Dublin Core Terms (dcterms) [dcterms] and Element Set (dc) [dc11], the FOAF vocabulary (foaf) [foaf], and the Bibliographic Ontology (bibo) [bibo]. Properties from these vocabularies can be used without their prefixes having to be declared.

Example 35: Extending the basic data using the schema.org 'copyrightYear' and 'copyrightHolder' terms.
{
    …
    "copyrightYear"   : "2015",
    "copyrightHolder" : "World Wide Web Consortium",
    …
}
Example 36: Extending the basic data set using the Dublin Core 'subject' term with the 2012 ACM Classification terms.
{
    …
    "dcterms:subject" : ["Web data description languages","Data integration","Data Exchange"],
    …
}

2.10 Resource Relations

2.10.1 Informative Resources

2.10.1.1 Accessibility Report

An accessibility report provides information about the suitability of a digital publication for consumption by users with varying preferred reading modalities. These reports typically identify the result of an evaluation against established accessibility criteria, such as those provided in [wcag21], and are an important source of information in determining the usability of a publication.

An accessibility report is identified using the accessibility-report link relation.

Editor's note

The accessibility-report term is not currently registered in the IANA link relations but the Working Group expects to add it.

The manifest SHOULD include a link to an accessibility report when one is available for a publication. It is RECOMMENDED that the report be included as a resource of the publication.

It is also RECOMMENDED that the accessibility report be provided in a human-readable format, such as HTML [html]. Augmenting these reports with machine-processable metadata, such as provided in schema.org [schema.org], is also RECOMMENDED.

2.10.1.2 Preview

Not all digital publications will be available to all users (e.g., they might be restricted to registered users of a site). In such cases, the publisher might wish to provide a preview of the content in order to entice users to access the full version.

A preview is identified using the preview link relation [iana-link-relations].

Previews MAY be located externally or included as resources of digital publications.

Example 38: Identifying a preview as an audio resource of a digital publication.
{
    …
    "links" : [
        {
            "type"           : "LinkedResource",
            "url"            : "preview.mp3",
            "encodingFormat" : "audio/mpeg",
            "rel"            : "preview"
        },
        …
    ],
    …
}
2.10.1.3 Privacy Policy

Users often have the legal right to know and control what information is collected about them, how such information is stored and for how long, whether it is personally identifiable, and how it can be expunged. Including a statement that addresses such privacy concerns is consequently an important part of publishing digital publications. Even if no information is collected, such a declaration increases the trust users have in the content.

A link to a privacy policy can be included in the manifest for this purpose. It is RECOMMENDED that the privacy policy be included as a resource of the publication.

A privacy policy is identified using the privacy-policy link relation [iana-link-relations].

2.10.2 Structural Resources

2.10.2.1 Cover

The cover is a resource that user agents can use to present a digital publication (e.g., in a library or bookshelf, or when initially loading the publication).

The cover is identified by the cover link relation. The URL expressed in the url term MUST NOT include a fragment identifier.

Editor's note

The cover term is not currently registered in the IANA link relations but the Working Group expects to add it.

If the cover is in an image format, a title and description SHOULD be provided. User agents can use these properties to provide alternative text and descriptions when necessary for accessibility.

More than one cover MAY be referenced from the manifest (e.g., to provide alternative formats and sizes for different device screens). If multiple covers are specified, each instance MUST define at least one unique property to allow user agents to determine its usability (e.g., a different format, height, width or relation).

Example 41: Identifying an HTML cover page.
{
    …
    "resources" : [
        {
            "type"           : "LinkedResource",
            "url"            : "cover.html",
            "encodingFormat" : "text/html",
            "rel"            : "cover"
        },
        …
    ],
    …
}
Example 42: Identifying a cover image. Alternative text and a description are provided in the name and description properties, respectively.
{
    …
    "resources" : [
        {
            "type"           : "LinkedResource",
            "url"            : "whale-image.jpg",
            "encodingFormat" : "image/jpeg",
            "rel"            : "cover",
            "name"           : "Moby Dick attacking hunters",
            "description"    : "A white whale is seen surfacing from the water to attack a small whaling boat"
        },
        …
    ],
    …
}
Example 43: Providing a cover image in JPEG and SVG formats.
{
    …
    "resources" : [
        {
            "type"           : "LinkedResource",
            "url"            : "lilliput.jpg",
            "encodingFormat" : "image/jpeg",
            "rel"            : "cover"
        },
        {
            "type"           : "LinkedResource",
            "url"            : "lilliput.svg",
            "encodingFormat" : "image/svg+xml",
            "rel"            : "cover"
        },
        …
    ],
    …
}
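One possible reading of the requirement that multiple covers be distinguishable can be sketched in Python as follows. The sketch is non-normative: the name indistinguishable_covers is hypothetical, and treating encodingFormat, height, width and rel as the distinguishing properties is an assumption based on the examples given in the text above.

```python
def indistinguishable_covers(resources):
    """Return pairs of cover URLs that share all distinguishing properties."""
    seen = {}
    clashes = []
    for resource in resources:
        if not isinstance(resource, dict):
            continue
        rel = resource.get("rel")
        rels = [rel] if isinstance(rel, str) else list(rel or [])
        if "cover" not in rels:
            continue
        # Assumption: these four properties decide whether covers differ.
        signature = (
            resource.get("encodingFormat"),
            resource.get("height"),
            resource.get("width"),
            tuple(sorted(rels)),
        )
        if signature in seen:
            clashes.append((seen[signature], resource.get("url")))
        else:
            seen[signature] = resource.get("url")
    return clashes
```

The two covers in Example 43 differ in encodingFormat, so the sketch reports no clash for them.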
2.10.2.2 Page List

The page list is a navigational aid that contains a list of static page demarcation points within a digital publication.

The page list is identified by the pagelist link relation. The URL expressed in the url term of the resource that identifies the page list MUST NOT include a fragment identifier.

Editor's note

The pagelist term is not currently registered in the IANA link relations but the Working Group expects to add it.

The link to the page list MUST NOT be specified in the links list.

Example 44: Identifying the resource that contains the page list.
{
    …
    "resources" : [
        {
            "type" : "LinkedResource",
            "url"  : "toc_file.html",
            "rel"  : "pagelist"
        },
        …
    ],
    …
}
2.10.2.3 Table of Contents

The table of contents is a navigational aid that provides links to the major structural sections of a digital publication.

The table of contents is identified by the contents link relation [iana-link-relations]. The URL expressed in the url term of the resource that identifies the table of contents MUST NOT include a fragment identifier.

The link to the table of contents MUST NOT be specified in the links list.

The RECOMMENDED structure and processing model for the table of contents is defined in § A. Machine-Processable Table of Contents.

Example 45: Identifying the resource that contains the table of contents.
{
    …
    "resources" : [
        {
            "type" : "LinkedResource",
            "url"  : "toc_file.html",
            "rel"  : "contents"
        },
        …
    ],
    …
}

2.10.3 Extensions

If additional relations beyond those defined in this specification need to be expressed, the rel property can be extended in one of the following ways:

Use of relations from [iana-relations] is RECOMMENDED.

3. Manifest Discovery

3.2 Embedding

When a digital publication format allows manifests to be embedded within an HTML document, the manifest MUST be included in a script element [html] whose type attribute is set to application/ld+json [json-ld11].

Example 49: Script tag for a publication manifest embedded in an HTML document.
<script type="application/ld+json">
    {
        "@context" : ["https://schema.org", "https://www.w3.org/ns/pub-context"],
        …
    }
</script>
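Retrieving an embedded manifest from an HTML document can be sketched in Python as follows. This is a non-normative illustration that uses only the standard library; the names extract_embedded_manifest and _ManifestScriptFinder are hypothetical, and taking the first script element of type application/ld+json is an assumption, since a digital publication format may define how the correct element is identified.

```python
import json
from html.parser import HTMLParser


class _ManifestScriptFinder(HTMLParser):
    """Collects the text of the first application/ld+json script element."""

    def __init__(self):
        super().__init__()
        self._in_script = False
        self.found = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and not self.found:
            script_type = (dict(attrs).get("type") or "").strip().lower()
            self._in_script = script_type == "application/ld+json"

    def handle_data(self, data):
        if self._in_script:
            self.chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_script:
            self._in_script = False
            self.found = True


def extract_embedded_manifest(html_text: str):
    """Return the parsed manifest, or None if no matching script exists."""
    finder = _ManifestScriptFinder()
    finder.feed(html_text)
    finder.close()
    if not finder.found:
        return None
    # json.loads raises json.JSONDecodeError if the content is not JSON.
    return json.loads("".join(finder.chunks))
```

The returned object would then be handed to the processing steps; a JSON parsing failure is left to propagate so that the caller can treat it as an error.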

3.3 Other Discovery Methods

Digital publication formats MAY define alternative methods of discovering a manifest that do not involve linking to, or embedding, a manifest (e.g., the manifest could be discovered through the use of a restricted name and/or location). This specification does not add any restrictions on such methods.

4. Processing a Manifest

The steps for processing a manifest into its internal representation are given by the following algorithm.

The following error types are used in this algorithm:

User agents SHOULD NOT include in their internal representation any value that generates a validation error.

User agents SHOULD expose both validation and fatal errors, but this specification does not prescribe the manner in which this has to be done.

The algorithm takes the following arguments:

If successful, the algorithm returns an object containing the internal representation of the data.

Note

This algorithm does not define how the manifest is discovered and obtained. The steps by which to do so are defined by each digital publication format.

Note

For illustrative purposes, the examples in this section show the structure of the internal representation as JavaScript objects. User agents can process and internalize the resulting structure in whatever language and form is appropriate.
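
The context, profile and global declaration checks in steps 4 to 7 of the algorithm below can be sketched in Python as follows. The sketch is non-normative and makes several assumptions: the names FatalError, check_context, select_profile and global_declarations are hypothetical, the language check is only a rough approximation of well-formedness under [bcp47], and the first recognized profile is given precedence when several are listed.

```python
import re

REQUIRED_CONTEXT = ("https://schema.org", "https://www.w3.org/ns/pub-context")
# Rough approximation of a well-formed language tag.
_LANGUAGE_TAG = re.compile(r"[A-Za-z]{2,8}(?:-[A-Za-z0-9]{1,8})*")


class FatalError(Exception):
    """Raised when processing has to stop and return failure."""


def check_context(manifest: dict) -> list:
    """Step 4: both context URLs must be present, in the stated order."""
    context = manifest.get("@context")
    if not isinstance(context, list):
        raise FatalError("@context is missing or is not an array")
    try:
        in_order = context.index(REQUIRED_CONTEXT[0]) < context.index(REQUIRED_CONTEXT[1])
    except ValueError:
        in_order = False
    if not in_order:
        raise FatalError("required context URLs are missing or out of order")
    return context


def select_profile(manifest: dict, recognized):
    """Step 5: return (profile URL, validation errors)."""
    value = manifest.get("conformsTo")
    if value is None:
        raise FatalError("conformsTo is not set")
    urls = value if isinstance(value, list) else [value]
    matches = [u for u in urls if isinstance(u, str) and u in recognized]
    if not matches:
        raise FatalError("no recognized profile in conformsTo")
    errors = ["more than one recognized profile"] if len(matches) > 1 else []
    return matches[0], errors  # Assumption: the first match takes precedence.


def global_declarations(context: list):
    """Steps 6 and 7: return (lang, direction, validation errors)."""
    lang = direction = None  # None plays the role of "undefined".
    for entry in context:
        if isinstance(entry, dict):
            # Later objects override earlier ones, so the last instance wins.
            lang = entry.get("language", lang)
            direction = entry.get("direction", direction)
    errors = []
    if lang is not None and not (isinstance(lang, str) and _LANGUAGE_TAG.fullmatch(lang)):
        errors.append("language is not a well-formed language tag")
        lang = None
    if direction is not None and direction not in ("ltr", "rtl"):
        errors.append("direction is neither ltr nor rtl")
        direction = None
    return lang, direction, errors
```

Fatal errors are modelled as exceptions so that processing stops immediately, while validation errors are collected and returned so that the caller can expose them.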

  1. Let processed be an object containing the internal representation of the manifest.

  2. Let manifest be the result of parsing text as JSON [ecmascript].

    If parsing throws an error, this is a fatal error. Return failure.

  3. If typeof(manifest) is not Object [ecmascript], this is a fatal error. Return failure.

  4. (§ 2.5 Manifest Contexts) If manifest["@context"] does not contain the values "https://schema.org" and "https://www.w3.org/ns/pub-context" (in this order), this is a fatal error. Return failure.

  5. (§ 2.8 Profile Conformance) Determine the profile the manifest conforms to as follows:

    1. If manifest["conformsTo"] is not set, this is a fatal error. Return failure.
    2. If manifest["conformsTo"] does not include a URL of a recognized profile, this is a fatal error. Return failure.
    3. If manifest["conformsTo"] includes the URLs of more than one recognized profile, this is a validation error. The user agent can select which to give precedence to.
    4. Otherwise, the publication is an instance of the profile defined by the recognized URL.
    Explanation

    The profile the publication conforms to is used to determine any additional extension steps that have to be performed during processing.

  6. (§ 2.6.1 Global Declarations) Let lang be the value of the last instance of the language keyword in an object in the manifest["@context"] array. If no instances of language exist, set lang to undefined.

    If the value of lang is not a well-formed [bcp47] language tag, this is a validation error. Set lang to undefined.

    Explanation

    The global language declaration obtained here is used to set the language for localizable strings without a declaration.

  7. (§ 2.6.1 Global Declarations) Let dir be the value of the last instance of the direction keyword in an object in the manifest["@context"] array. If no instances of direction exist, set dir to undefined.

    If the value of dir is not "ltr" or "rtl", this is a validation error. Set dir to undefined.

    Explanation

    The global direction declaration obtained here is used to set the base direction for localizable strings without a declaration.

  8. Iterate over each property term in manifest and expand its value, as necessary, as follows. More than one step MAY apply.

    Note

    The steps below depend on the expected value category of a term. Note that the value category can depend not only on the name of the term, but also on the object in which it is used. For example, the value category for the term url is an Array (of URLs) when used for the PublicationManifest object, but a single URL literal otherwise.

    For this step, the variable current is set to the value of manifest[term] and is updated by each applicable transformation.

    1. (§ 2.4.7 Arrays) If term expects an array but current contains a string or object value, set current to an array whose only element is that value.

      Explanation

      A number of terms require their values to be arrays, but, for the sake of convenience, authors are allowed to use a single value instead of a one element array. For example,

      {
          …
          "name"   : "Et dukkehjem",
          "author" : "Henrik Ibsen",
          …
      }

      yields:

      {
          …
          "name"   : ["Et dukkehjem"],
          "author" : ["Henrik Ibsen"],
          …
      }
    2. (§ 2.4.4.2 Entities) If term expects an entity but the value of current is a string str, convert current to an object with the following properties:

      • type – set to an array with the value "Person";

      • name – set to str.

      If typeof(current) is Array [ecmascript], perform this step on each element of the array.

      Explanation

      Authors, editors, etc., are expected to be explicitly designed as an object of type Person, but, for the sake of convenience, only their name has to be specified. For example:

      {
          …
          "author" : ["Ralph Ellison"],
          …
      }

      This rule converts the string values to objects, yielding the following for the preceding example:

      {
          …
          "author" : [
              {
                  "type" : ["Person"],
                  "name" : "Ralph Ellison"
              }
          ],
          …
      }
      Note

      For simplicity, the conversion of name to a localizable string is described by a later step.
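
A minimal sketch of the entity conversion, assuming the manifest has already been parsed into Python dictionaries and lists. ENTITY_TERMS and both helper names are hypothetical, and the set is only an illustrative subset of the terms that expect entities.

```python
# Hypothetical, incomplete set of terms whose value category is an entity.
ENTITY_TERMS = {"author", "editor", "publisher"}

def expand_entity(value):
    """Turn a bare name string into a Person object; leave objects untouched."""
    if isinstance(value, str):
        return {"type": ["Person"], "name": value}
    return value

def expand_entities(term, current):
    """Apply the conversion to a single value or to each element of an array."""
    if term not in ENTITY_TERMS:
        return current
    if isinstance(current, list):
        return [expand_entity(item) for item in current]
    return expand_entity(current)

authors = expand_entities("author", ["Ralph Ellison"])
print(authors)
```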

    3. (§ 2.4.4.1 Localizable Strings) If term expects a localizable string and current contains a string str, convert current to an object with the following properties:

      • value – set to str;

      • language – set to the value of lang when not undefined; otherwise, omitted; and

      • direction – set to the value of dir when not undefined; otherwise, omitted.

      If typeof(current) is Array [ecmascript], perform this step on each element of the array.

      If typeof(current) is Object [ecmascript], perform this step on each property of the object.

      Explanation

      Natural language text values are expected to be explicitly defined as localizable string objects, but, for the sake of convenience, can be simple strings. For example, if no language information has been provided via the global language declaration then:

      {
          "@context" : ["https://schema.org","https://www.w3.org/ns/pub-context"],
          "name"     : ["La Comédie humaine"],
          …
      }

      yields:

      {
          "@context" : ["https://schema.org","https://www.w3.org/ns/pub-context"],
          "name"     : [
              {
                  "value" : "La Comédie humaine"
              }
          ],
          …
      }

      If, however, an explicit language has been provided in the manifest, that language is added to the localizable string object. For example,

      {
          "@context" : [
              "https://schema.org",
              "https://www.w3.org/ns/pub-context",
              {"language": "fr"}
          ],
          "name"     : ["La Comédie humaine"],
          …
      }

      yields:

      {
          "@context" : [
              "https://schema.org",
              "https://www.w3.org/ns/pub-context",
              {"language": "fr"}
          ],
          "name"     : [
              {
                  "value"    : "La Comédie humaine",
                  "language" : "fr"
              }
          ],
          …
      }
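
The string-to-object conversion can be sketched as below. This is an illustration under stated assumptions: Python's None stands in for "undefined", the helper names are hypothetical, and only the plain-string and array cases are covered (the per-property object case is omitted).

```python
def to_localizable(value, lang=None, direction=None):
    """Convert a plain string to a localizable string object."""
    if not isinstance(value, str):
        return value  # already an object; handled by the following step
    result = {"value": value}
    if lang is not None:
        result["language"] = lang
    if direction is not None:
        result["direction"] = direction
    return result

def expand_localizable(current, lang=None, direction=None):
    """Apply the conversion to a single value or to each element of an array."""
    if isinstance(current, list):
        return [to_localizable(item, lang, direction) for item in current]
    return to_localizable(current, lang, direction)

without_lang = expand_localizable(["La Comédie humaine"])
with_lang = expand_localizable(["La Comédie humaine"], lang="fr")
print(without_lang, with_lang)
```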
    4. (§ 2.6.2 Item-Specific Declarations) If term expects a localizable string, update the language and direction values of current as follows:

      • Set the final language tag:
        • if current[language] is not defined set current[language] to lang unless that value is undefined.
        • otherwise, if current[language] is null, remove current[language].
      • Set the final base direction:
        • if current[direction] is not defined set current[direction] to dir unless that value is undefined.
        • otherwise, if current[direction] is null, remove current[direction].

      If typeof(current) is Array [ecmascript], perform this step on each element of the array.

      Explanation

      This step uses the global language/direction values to set the language/direction of each LocalizableString, unless a local setting is present or a local null value prevents the global value from taking effect. For example:

      {
          "@context" : [
              "https://schema.org",
              "https://www.w3.org/ns/pub-context", 
              {"language":"fr"}
          ],
          …
          "name" : [{
              "value" : "La Comédie humaine"
          }],
          "publisher" : [{
              "type":["Organization"],
              "name":[{
                  "value": "Hachette",
                  "language": null
              }]
          }],
          …
      }

      yields:

      {
          "@context" : [
              "https://schema.org",
              "https://www.w3.org/ns/pub-context", 
              {"language":"fr"}
          ],
          …
          "name" : [{
              "value" : "La Comédie humaine",
              "language": "fr"
          }],
          "publisher" : [{
              "type":["Organization"],
              "name":[{
                  "value": "Hachette"
              }]
          }],
          …
      }
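
A minimal sketch of how the global defaults and local null values interact. It assumes that an absent key models "not defined" and that JSON null has been parsed to Python's None; apply_declarations is a hypothetical helper name.

```python
def apply_declarations(item, lang=None, direction=None):
    """Return a copy of a localizable string object with final language/direction."""
    result = dict(item)
    for key, default in (("language", lang), ("direction", direction)):
        if key not in result:
            # No local declaration: inherit the global value, if there is one.
            if default is not None:
                result[key] = default
        elif result[key] is None:
            # A local null blocks the global value and is itself removed.
            del result[key]
    return result

title = apply_declarations({"value": "La Comédie humaine"}, lang="fr")
publisher = apply_declarations({"value": "Hachette", "language": None}, lang="fr")
print(title, publisher)
```

An explicit local value, such as "language": "en", is left as it is.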
    5. (§ 2.4.5 URLs) If term expects a URL and the value of current is a string that is not an absolute URL string, resolve the value using the value of base [rfc1808].

      If typeof(current) is Array [ecmascript], perform this step on each element of the array.

      If typeof(current) is Object [ecmascript], perform this step on each property of the object.

      Explanation

      All relative URLs in the publication manifest must be resolved against the base value to yield absolute URLs.
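
As an illustration, relative URL resolution can be sketched with Python's standard urllib.parse module. The resolve_url helper and the base value are assumptions made for this example, and testing for a scheme is a simplified stand-in for the absolute URL string check.

```python
from urllib.parse import urljoin, urlsplit

def resolve_url(value, base):
    """Resolve a relative URL string against base; leave absolute URLs untouched."""
    if isinstance(value, str) and not urlsplit(value).scheme:
        return urljoin(base, value)
    return value

# Hypothetical base URL; note the trailing slash, which affects resolution.
base = "https://publisher.example.org/mobydick/"
reading_order = [resolve_url(u, base) for u in ["html/title.html", "html/c001.html"]]
print(reading_order)
```

Without the trailing slash on base, the last path segment would be replaced rather than extended, which is standard relative-reference behavior.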

    6. (§ 5. Modular Extensions, extension point) If a profile defines processing steps for profile-specific terms, those steps are executed at this point.

    Set processed[term] to the value of current.

  9. Add defaults from document for the following properties, when applicable:

    1. (§ 2.9.1.11 Title) If processed["name"] is not set, or its value is an empty string:

      • Create a new object title:

        • if the title element [html] of document is set and its content is not empty, set title["value"] to the text content of the title element. Set title["language"] to the language [html], if available, and title["direction"] to the base direction [html] if that value is available and is either "ltr" or "rtl".

        • otherwise, generate a value for title["value"] (see the separate note for details) and issue a validation error.

      • Set processed["name"] to an empty array and add title to it.
      Explanation

      This step adds the content of the title element of document when the name property is not specified in the manifest. For example:

      <html>
      <head lang="en">
          <title>The Golden Bough</title>
          <script type="application/ld+json">
          {
              "@context" : ["https://schema.org","https://www.w3.org/ns/pub-context"],
              …
          }
          </script>

      yields:

      {
          …
          "name" : [
              {
                  "value" : "The Golden Bough",
                  "language" : "en"
              }
          ],
          …
      }
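
The title fallback can be sketched as follows, assuming the title text, language and base direction have already been read from the document by other means. The function name, its parameters and the placeholder text are all hypothetical.

```python
def add_default_title(processed, title_text, language=None, direction=None,
                      placeholder="Untitled publication"):
    """Fill processed["name"] from the document title; return validation errors."""
    errors = []
    name = processed.get("name")
    if name is None or name == "":
        title = {}
        if title_text and title_text.strip():
            title["value"] = title_text
            if language:
                title["language"] = language
            if direction in ("ltr", "rtl"):
                title["direction"] = direction
        else:
            # No usable title element: generate a value and report the problem.
            title["value"] = placeholder
            errors.append("validation error: no title found, placeholder generated")
        processed["name"] = [title]
    return errors

processed = {}
errors = add_default_title(processed, "The Golden Bough", language="en")
print(processed, errors)
```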
    2. (§ 2.9.2.1 Default Reading Order) If processed["readingOrder"] is not set or its value is an empty string or array:

      • if document.URL is not set, this is a fatal error. Return failure.

      • otherwise, create an object and set its url property to the value of document.URL. Set processed["readingOrder"] to an empty array and add the object to it.

      Explanation

      If the Digital Publication consists only of the referencing document, the default reading order can be omitted; it will consist, automatically, of that single resource.
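
A minimal sketch of the reading order fallback. The function name is hypothetical, and the fatal error is modeled here as a Python exception, which is only one possible way to "return failure".

```python
def add_default_reading_order(processed, document_url):
    """Default the reading order to the referencing document when none is given."""
    if processed.get("readingOrder") in (None, "", []):
        if not document_url:
            raise ValueError("fatal error: document.URL is not set")
        processed["readingOrder"] = [{"url": document_url}]
    return processed

# Hypothetical document URL for illustration.
result = add_default_reading_order({}, "https://example.org/article.html")
print(result)
```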

    3. If a profile specifies document fallbacks, those steps are executed at this point.

  10. Perform data integrity checks on the following properties in processed:

    1. (§ 2.4 Value Categories) For each property with a known value type, check that the actual value matches the expected value. If not, this is a validation error.

    2. (§ 2.4.4.2 Entities) For each property term in processed that expects an array of entities, check whether each value entry in processed[term] has its name property set. If not, this is a validation error.

      Repeat this step recursively for all properties of current that expect an object in which one or more properties expect entities.

    3. (§ 2.6.1 Global Declarations and § 2.6.2 Item-Specific Declarations) Recursively check that every language property in processed is well-formed [bcp47]. Each value found that is not well-formed is a validation error.

    4. (§ 2.6.1 Global Declarations and § 2.6.2 Item-Specific Declarations) Recursively check that every direction property in processed has the value of "ltr" or "rtl". If not, this is a validation error.

    5. (§ 2.7 Publication Types) If processed["type"] is not set or contains an empty value, this is a validation error. Set its value to an array with the value "CreativeWork".

    6. (§ 2.9.1.1 Abridged) If processed["abridged"] is set and it is not a boolean value, this is a validation error.

    7. (§ 2.9.1.6 Duration) If processed["duration"] is set and its value is not a valid duration value, per [iso8601], this is a validation error.

    8. (§ 2.9.1.7 Last Modification Date) If processed["dateModified"] is set and its value is not a valid date or date-time per [iso8601], this is a validation error.

    9. (§ 2.9.1.8 Publication Date) If processed["datePublished"] is set and its value is not a valid date or date-time per [iso8601], this is a validation error.

    10. (§ 2.9.1.9 Publication Language) If processed["inLanguage"] is set, check that each value in its array is well-formed [bcp47]. Each value found that is not well-formed is a validation error.

    11. (§ 2.9.1.10 Reading Progression Direction) Verify processed["readingProgression"] as follows:

    12. (§ 2.9.2 Resource Categorization Properties) For each resource categorization property term in processed, check whether each object P in processed[term] has P["url"] set:

    13. (§ 2.3.1.1 The LinkedResource Dictionary) For each property term in processed that expects type LinkedResource, if term["length"] is set and is not a valid number this is a validation error.

    14. If a profile specifies data validation checks, those steps are executed at this point.

  11. Return processed.
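
A few of the data integrity checks in step 10 can be sketched as below. This covers only the publication type default, the abridged check and the recursive direction check. The function names and error messages are hypothetical, and validation errors are collected in a list rather than reported in any particular way.

```python
def check_directions(node, path="processed"):
    """Recursively collect errors for direction values other than "ltr" or "rtl"."""
    errors = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "direction" and value not in ("ltr", "rtl"):
                errors.append(f"{path}.direction has invalid value {value!r}")
            else:
                errors.extend(check_directions(value, f"{path}.{key}"))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            errors.extend(check_directions(item, f"{path}[{index}]"))
    return errors

def check_integrity(processed):
    """Run a subset of the step 10 checks; return the validation errors found."""
    errors = []
    if not processed.get("type"):
        errors.append("type is not set or is empty; defaulting to CreativeWork")
        processed["type"] = ["CreativeWork"]
    if "abridged" in processed and not isinstance(processed["abridged"], bool):
        errors.append("abridged is not a boolean")
    errors.extend(check_directions(processed))
    return errors

processed = {"name": [{"value": "Title", "direction": "up"}], "abridged": "no"}
errors = check_integrity(processed)
print(errors)
```

Validation errors do not stop processing here, so processed is still returned to the caller with the default type applied.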

5. Modular Extensions

The manifest format defined in this specification is designed to be implemented and extended by publishing communities in the production of new profiles (e.g., audiobooks and scholarly publications). The flexibility the manifest format offers allows it to be tailored to each community's specific needs while also providing a common base for user agents that need to process the profiles (i.e., minimizing the differences between each profile and simplifying interoperability).

In order for a profile to be compatible with this specification, the following conditions MUST be met:

  1. It MUST adhere to the requirements for constructing a manifest as defined in § 2. Publication Manifest.
  2. It MUST define a unique conformance URL and require that conforming publications include this URL in their conformsTo property.
  3. One or both linking methods MUST be used in the discovery of the manifest.
  4. The generic processing steps described in § 4. Processing a Manifest MUST remain valid for the extended manifest. To achieve this, if new terms are added to the general manifest:
    • Each new term SHOULD be assigned, if applicable, to one or more of the general term categories used in the algorithm (e.g., array or localizable string). This means the relevant processing steps will be automatically executed for those terms.
    • If necessary, the profile MAY define its own processing step(s), to be executed at the designated extension points within the processing algorithm. Such extra steps MUST NOT invalidate the results of any of the steps defined for the processing algorithm in general.
Editor's note

Adding an example of a term added by, e.g., the audiobook profile would be a good idea, when available.

6. Security and Privacy Considerations

As the manifest is expressed using JSON-LD, the privacy and security considerations [json-ld11] detailed in that specification are applicable to all profiles of the manifest.

Some additional general considerations for profiles include:

More specific security and privacy considerations are left to each profile to detail, as these will vary depending on the nature of the digital publication format.

A. Machine-Processable Table of Contents

A.1 Introduction

This section is non-normative.

To facilitate navigation within pages and across sites, HTML uses the nav element [html] to express lists of links. Although generic in nature by default, the purpose of a nav element can be more specifically identified by use of the role attribute [html]. In particular, the doc-toc role from the [dpub-aria-1.0] vocabulary identifies the nav element as the digital publication's table of contents.

Including an identifiable table of contents is an accessible way to produce any digital publication, but due to the flexibility of HTML markup, it also presents challenges for user agents trying to extract a meaningful hierarchy of links (e.g., to provide a custom view available from any page). To avoid duplicating the tables of contents for different uses, this section defines a syntax that is both human friendly and commonly used while still providing enough structure for user agent extraction.

Authors have a choice of lists (ordered or unordered) to construct their table of contents. By wrapping each link within these lists in an anchor tag (a element), authors let user agents easily differentiate the information they need from any peripheral content (asides) or stylistic tagging that has also been added. The table of contents can consist of both active links (with an href attribute) and inactive links (without an href attribute), providing additional flexibility in how the table of contents is constructed (e.g., to omit links to certain headings or only link to certain content in a preview).

A.2 HTML Structure

The table of contents is expressed via an [html] element (typically a nav element). This element MUST be identified by the role attribute [html] value "doc-toc" [dpub-aria-1.0], and MUST be the first element in the document in document tree order [dom] with that role value.

The manifest SHOULD identify the resource that contains the table of contents.

Although the content model of the nav element is not restricted, user agents will only be able to extract a usable table of contents when the following markup guidelines are followed:

Table of Contents Title

Although a title for the table of contents is optional, to avoid having a user agent generate a placeholder title when one is needed, it is advised to add one. Titles are specified using any of the [html] h1 through h6 elements. Note that only the first such element is recognized as the title. If a heading element is not found before the list of links, user agents will assume that one has not been specified.

The first [html] ol or ul list element encountered in the nav element is assumed to contain the list that defines the links into the content. This list will be found even if it is nested inside of div elements, for example, as the algorithm ignores elements that are not relevant to its processing. The list cannot occur inside of any skipped elements, however, since their internal contents are not evaluated.

If the nav element does not contain one of these elements, then user agents will not register the digital publication as containing a usable table of contents (e.g., a machine-rendered option will not be available).

Branches

If the table of contents is considered as a tree of links, then each list item (li element) inside of the list of links represents one branch. Each of these branches has to have a name and, optionally, a destination in order to be presented to users, and this information is obtained from the first a element found within the list item, wherever it is nested (again, excluding any a elements inside of skipped elements).

The link destination for the branch is obtained from the a element's href attribute, when specified. This attribute can be omitted if a link is not available (e.g., in a preview) or not relevant (e.g., a grouping header). When providing a link into the content, it is also possible to specify the relation of the linked document (in a rel attribute) and the media type of the linked resource (in a type attribute).

After finding the a element that labels the branch, user agents will continue to inspect the markup for another list element (i.e., sub-branches). If a list is found, it is similarly processed to extract its links, and so on, until there are no more nested branches left to process.

Skipped Elements

A small set of elements is skipped when parsing the table of contents to avoid misinterpretation. These are the [html] sectioning content elements and sectioning root elements. They are skipped because they can define their own outlines (i.e., they can represent embedded content that is self-contained and not necessarily related to the structure of content links).

Any element that has its hidden attribute set is also skipped, since hidden elements are not intended to be directly accessed by users.

Although these elements can be included in the nav element, care has to be taken not to embed important content within them (e.g., do not wrap a section element around the list item that contains all the links into the content).

Ignored Elements

All elements that are not relevant to extracting the table of contents, and are not skipped, are ignored. Unlike skipped elements, ignoring means that user agents will continue to search inside them for relevant content, allowing greater flexibility in terms of the tagging that can be used.

A.2.1 Examples

This section is non-normative.

A.3 User Agent Processing

This section defines an algorithm for extracting a table of contents from a nav element. It is defined in terms of a walk over the nodes of a DOM tree, in tree order, with each node being visited when it is entered and when it is exited during the walk. Each time a node is visited, it can be seen as triggering an enter or exit event. In some steps, user agents are provided a choice in how to process the content to provide flexibility for different presentation models.

Note

For illustrative purposes, the examples in this section show the structure of the table of contents as JavaScript objects. User agents can process and internalize the resulting structure in whatever language and form is appropriate.

For the purposes of this algorithm, a list element is defined as either an [html] ol or ul element.

The following algorithm MUST be applied to a walk of a DOM subtree rooted at the first nav element in document order with the role attribute value doc-toc. All explanations are informative.

  1. Let toc be an object that represents the table of contents and initialize it as follows:

    1. Create a name property for toc that represents the title of the table of contents and set it to an empty string.
    2. Create an entries property for toc that represents all the branches of the table of contents and set it to an empty array.
    Explanation

    This step initializes the toc object that will store the title and the branches of the table of contents.

    Example 68: Visualization of the default toc object.
    {
        "name"    : '',
        "entries" : []
    }
  2. Initialize a stack.

    Explanation

    The stack is used to hold branches that are not yet complete. As a new sub-branch is encountered, the parent gets pushed onto the stack so it can be retrieved later.

  3. Let current toc branch be a variable set to null.

    Explanation

    current toc branch is used to hold the object that represents the branch of the table of contents that is currently being processed.

  4. Walk over the DOM in tree order, starting with the nav element the table of contents is being built from, and trigger the first relevant step below for each element as the walk enters and exits it.

    1. When entering a heading content element:

      Run these steps:

      1. If the stack is empty, and the name property of toc is an empty string, set the name property to one of the following:

        • the descendant content of the element (to preserve any HTML tags);
        • the text string obtained from the descendant content (e.g., by calculating the accessible name [accname-1.1] of the element).

        If the resulting value of name is an empty string (e.g., after removing any presentational elements and trimming all leading and trailing whitespace), set the name property either to a placeholder value or to null.

      2. Exit the element and continue processing with the next element.
      Explanation

      This step identifies the heading for the table of contents. A heading is only processed if the value of the toc name property is an empty string (i.e., no headings have yet been encountered).

      Whether a user agent sets the name to the descendant content of the heading element, or generates a text string from it, depends on whether it will re-use any descendant tagging in the presentation (e.g., to retain images, MathML, ruby and other content that does not translate to text easily).

      Example 69: Visualization of the toc object with a heading.
      {
          "name"    : "Contents",
          "entries" : []
      }

      If the name is not an empty string, or is null, then a previous heading has already been encountered or content has been encountered that indicates the nav element does not have a heading (e.g., a list has already been processed, since the heading would not follow the list of links).

      Example 70: Visualization of the toc object without a heading.
      {
          "name"    : null,
          "entries" : []
      }

      If a heading is not specified, the user agent can provide its own for later use.

    2. When entering a list element:

      Run these steps:

      1. If the name property of toc is an empty string, set name to null.

      2. If current toc branch is not null:

        1. If the entries property of current toc branch is null or a non-empty array, exit the element and continue processing with the next element.
        2. Otherwise, push the object in current toc branch onto the stack and set current toc branch to null.
      3. Otherwise, if the stack is empty:

        1. If the entries property of toc is null or a non-empty array, exit the element and continue processing with the next element.
        2. Otherwise, do nothing.
      Explanation

      This algorithm does not process multiple lists in a single branch or at the root of the nav element, so if a list has already been encountered (the entries property contains one or more branches or is set to null), this list is skipped.

      If a list is encountered and the table of contents (toc) still does not have a name (i.e., no heading element has been encountered), the table of contents is assumed to not have a heading (i.e., the heading for the table of contents cannot appear after the first list of entries). The value of the name property is changed from an empty string to null as no further headings encountered apply, either.

    3. When exiting a list element:

      If the stack is not empty, pop the top object off the stack and set current toc branch to it.

      Explanation

      This resets current toc branch back to the parent object after all of its child branches have been processed.

    4. When entering a list item element:

      Run these steps:

      1. Set current toc branch to a new object.
      2. Create name, url, type, and rel properties for the object and set them to empty strings.
      3. Create an entries property for the object and set it to an empty array.
      Explanation

      Each list item represents a possible new branch in the table of contents, so whenever one is encountered a new blank object is created in current toc branch.

      Example 71: Visualization of a new branch object.
      {
          "name"    : '',
          "url"     : '',
          "type"    : '',
          "rel"     : '',
          "entries" : []
      }

      This object gets populated with information as a descendant a element and list are encountered.

    5. When exiting a list item element:

      Run these steps:

      1. If the entries property of current toc branch contains an empty array, set its value to null.

      2. If the stack contains one or more entries:

        1. If the entries property of current toc branch contains a non-empty array, and its name property is an empty string, set its name to a placeholder value or null;
        2. If the entries property of current toc branch contains an empty array, and its name property is an empty string, set current toc branch to null and exit this processing step.

        Add current toc branch to the array in the entries property of the object at the top of the stack.

      3. Otherwise, add the object in current toc branch to the entries array of toc.

      4. Set current toc branch to null.

      Explanation

      Exiting a list item indicates that processing of the current branch is complete. Before adding this branch to its parent's entries array, the branch needs to be tested to see if it has a name and/or any sub-branches. If it does not have a name but has sub-branches, the branch is kept. The user agent can either supply a placeholder value of its own creation or set the value to null. If it does not have a name or any branches, it is invalid and is discarded.

      To determine where to merge the branch, the stack is checked. If there are no objects in the stack, it is added into the entries property of the root toc object (i.e., it is a top-level branch). Otherwise, it gets added into the entries property of the object immediately preceding it in the stack.

      As a final step, current toc branch is reset back to null.

    6. When entering an anchor element and current toc branch is not null:

      Run these steps:

      1. If the name property of current toc branch is not an empty string, do nothing.

      2. Otherwise:

        1. Set the name property of current toc branch to one of the following:

          • the descendant content of the anchor element (to preserve any HTML tags);
          • the text string obtained from the descendant content (e.g., by calculating the accessible name [accname-1.1] of the element).

          If the resulting value of name is an empty string (e.g., after removing any presentational elements and trimming all leading and trailing whitespace), set the name property to null.

        2. If the element has an href attribute and the URL in the attribute resolves to a resource in the default reading order or resource list, set the url property of current toc branch to the value. Otherwise, set the property to null.
        3. If the element has a type attribute, and the value of the attribute is not an empty string after trimming leading and trailing white space, set the type property of current toc branch to its value. Otherwise, set the property to null.
        4. If the element has a rel attribute, and the value of the attribute is not an empty string after trimming leading and trailing white space, set the rel property of current toc branch to its value. Otherwise, set the property to null.

        Exit the element and continue processing with the next element.

      Explanation

      This step processes anchor tags to obtain values for the name and url properties of a branch.

      If the name of the current branch is already defined, then processing of this element is terminated (i.e., to avoid processing multiple links for a single branch).

      Whether a user agent sets the name of the entry to the descendant content of the a element, or generates a text string from it, depends on whether it will re-use any descendant tagging in the presentation (e.g., to retain images, MathML, ruby and other content that does not translate to text easily).

      In addition to having an href attribute specified, it is necessary that it resolve to a resource that belongs to the digital publication to meet the requirements of this specification. If not, the branch is retained but the entry will not be linkable.

      Additional information about the target of the link — the type of resource and its relation — is also retained.

    7. When entering a sectioning content element, a sectioning root element, or an element with a hidden attribute:

      Exit the element and continue processing with the next element.

      Explanation

      As sectioning and sectioning root elements can define their own outlines, descending into them poses problems for generating the table of contents (i.e., they may contain content that is not directly related). As a result, they are skipped over when encountered to prevent their child content from being processed.

    8. Otherwise: do nothing.

      Explanation

      For all other elements, this step allows their descendant elements to continue to be processed.

  5. After completing the DOM walk, if the entries property of toc contains a non-empty array, toc represents the machine-processed table of contents.

    Otherwise, the digital publication does not have a table of contents that can be used for machine rendering purposes.

    Explanation

    If the entries array in the root toc object does not contain any branches (either because no list was found in the nav element or the list did not contain any conforming list items), then the algorithm did not produce a usable table of contents.
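
The walk described in this section can be sketched as follows. This is an illustration under several assumptions, not a reference implementation: the markup is well-formed enough to be parsed with Python's xml.etree.ElementTree, names are taken from the text content only, known_urls is a hypothetical set standing in for the default reading order and resource list, and null is always chosen where a placeholder is allowed. Following the explanation for exiting a list item, the sketch discards any branch that has neither a name nor sub-branches.

```python
import xml.etree.ElementTree as ET

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
LISTS = {"ol", "ul"}
SKIPPED = {"article", "aside", "nav", "section",                      # sectioning content
           "blockquote", "body", "details", "dialog", "fieldset", "figure", "td"}

def text_of(el):
    """Whitespace-normalized text content of an element."""
    return " ".join("".join(el.itertext()).split())

def attr_or_none(el, name):
    return (el.get(name) or "").strip() or None

def extract_toc(nav, known_urls):
    toc = {"name": "", "entries": []}
    stack = []
    current = None

    def enter(el):
        """Run the first relevant 'entering' step; return False to skip the element."""
        nonlocal current
        tag = el.tag
        if tag in HEADINGS:
            if not stack and toc["name"] == "":
                toc["name"] = text_of(el) or None
            return False
        if tag in LISTS:
            if toc["name"] == "":
                toc["name"] = None
            if current is not None:
                if current["entries"] is None or current["entries"]:
                    return False
                stack.append(current)
                current = None
            elif not stack and (toc["entries"] is None or toc["entries"]):
                return False
            return True
        if tag == "li":
            current = {"name": "", "url": "", "type": "", "rel": "", "entries": []}
            return True
        if tag == "a" and current is not None:
            if current["name"] != "":
                return True
            current["name"] = text_of(el) or None
            href = el.get("href")
            current["url"] = href if href and href.split("#")[0] in known_urls else None
            current["type"] = attr_or_none(el, "type")
            current["rel"] = attr_or_none(el, "rel")
            return False
        if tag in SKIPPED or el.get("hidden") is not None:
            return False
        return True

    def leave(el):
        nonlocal current
        if el.tag in LISTS:
            if stack:
                current = stack.pop()
        elif el.tag == "li" and current is not None:
            if current["entries"] == []:
                current["entries"] = None
            if current["name"] == "":
                if current["entries"] is None:
                    current = None          # no name and no sub-branches: discard
                    return
                current["name"] = None      # unnamed branch with sub-branches is kept
            parent = stack[-1] if stack else toc
            parent["entries"].append(current)
            current = None

    def walk(el):
        if enter(el):
            for child in el:
                walk(child)
            leave(el)

    for child in nav:   # the root is not tested against the skipped elements
        walk(child)
    return toc if toc["entries"] else None

NAV = """<nav role="doc-toc"><h2>Contents</h2><ol>
  <li><a href="c001.html">Chapter 1</a>
    <ol><li><a href="c001.html#s1">Section 1.1</a></li></ol></li>
  <li><a>Chapter 2 (preview)</a></li>
</ol></nav>"""
toc = extract_toc(ET.fromstring(NAV), {"c001.html"})
print(toc)
```

The function returns None when no usable table of contents is found, mirroring the final step of the algorithm.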

B. Manifest Examples

This section is non-normative.

B.1 Basic Manifest

The following is a manifest with a basic set of metadata for an example book profile.

A JSON encoding of the internal representation of this manifest is also available.

{
    "@context": [
        "https://schema.org",
        "https://www.w3.org/ns/pub-context",
        {"language" : "en"}
    ],
    "conformsTo": "https://example.com/publication",
    "type": "Book",
    "url": "https://publisher.example.org/mobydick",
    "author": "Herman Melville",
    "dateModified": "2018-02-10T17:00:00Z",

    "readingOrder": [
        "html/title.html",
        "html/copyright.html",
        "html/introduction.html",
        "html/epigraph.html",
        "html/c001.html",
        "html/c002.html",
        "html/c003.html",
        "html/c004.html",
        "html/c005.html",
        "html/c006.html"
    ],

    "resources": [
        "css/mobydick.css",
        {
            "type": "LinkedResource",
            "rel": "cover",
            "url": "images/cover.jpg",
            "encodingFormat": "image/jpeg"
        },{
            "type": "LinkedResource",
            "url": "html/toc.html",
            "rel": "contents"
        },{
            "type": "LinkedResource",
            "url": "fonts/STIXGeneral.otf",
            "encodingFormat": "application/vnd.ms-opentype"
        },{
            "type": "LinkedResource",
            "url": "fonts/STIXGeneralBol.otf",
            "encodingFormat": "application/vnd.ms-opentype"
        },{
            "type": "LinkedResource",
            "url": "fonts/STIXGeneralBolIta.otf",
            "encodingFormat": "application/vnd.ms-opentype"
        },{
            "type": "LinkedResource",
            "url": "fonts/STIXGeneralItalic.otf",
            "encodingFormat": "application/vnd.ms-opentype"
        }
    ]
}

B.2 Single-Document Publication

The following is a manifest for an example article profile. The article consists only of the document the manifest is embedded in. The title and reading order are omitted from the manifest, as these properties are automatically generated during processing from the title and URL of the containing document, respectively.

A JSON encoding of the internal representation of the manifest is also available, as well as a more elaborate version for the same document.

<!DOCTYPE html>
<html lang="en-US">
<head>
    <title>Model for Tabular Data and Metadata on the Web</title>
    <link href="#wpm" rel="publication" />
    ...
    <script id="wpm" type="application/ld+json">
    {
        "@context"        : [
            "https://schema.org",
            "https://www.w3.org/ns/pub-context",
            {"language" : "en-US"}
        ],
        "conformsTo"      : "https://example.com/article",
        "type"            : "TechArticle",
        "id"              : "http://www.w3.org/TR/tabular-data-model/",
        "url"             : "http://www.w3.org/TR/2015/REC-tabular-data-model-20151217/",
        "copyrightYear"   : "2015",
        "copyrightHolder" : "World Wide Web Consortium",    
        "creator"         : ["Jeni Tennison", "Gregg Kellogg", "Ivan Herman"],
        "publisher" : {
            "type" : "Organization",
            "name" : "World Wide Web Consortium",
            "id"   : "https://www.w3.org/"
        },
        "datePublished"         : "2015-12-17",
        "resources"             : [
            "datatypes.html",
            "datatypes.svg",
            "datatypes.png",
            "diff.html",
            {
                "type"           : "LinkedResource",
                "url"            : "test-utf8.csv",
                "encodingFormat" : "text/csv"

            },
            {
                "type"           : "LinkedResource",
                "url"            : "test.xlsx",
                "encodingFormat" : "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
            }
        ]
    }
    </script>
</head>
<body>
    ....

    <section id="toc" role="doc-toc">
        <h2 resource="#h-toc" id="h-toc" class="introductory">Table of Contents</h2>
        <ul class="toc">
            <li class="tocline"><a class="tocxref" href="#intro">
                <span class="secno">1. </span>Introduction</a>
            </li>
            ...
        </ul>
    </section>
    ...

</body>
</html>
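
The defaulting behaviour described at the start of this section (title and reading order derived from the containing document) can be illustrated with a minimal, non-normative sketch. The function name `apply_embedded_defaults` and the exact shape of the generated values are assumptions made for illustration only; the processing rules of this specification govern the actual behaviour.

```python
def apply_embedded_defaults(manifest, document_title, document_url):
    """Return a copy of `manifest` with a missing title or reading order filled in.

    Assumption: the generated values take the simple shapes used below.
    """
    result = dict(manifest)  # shallow copy so the authored manifest is untouched

    # A missing title falls back to the title of the containing document.
    if not result.get("name"):
        result["name"] = document_title

    # A missing reading order falls back to the containing document itself.
    if not result.get("readingOrder"):
        result["readingOrder"] = [
            {"type": "LinkedResource", "url": document_url}
        ]

    return result


# Usage with values taken from the example above.
authored = {
    "type": "TechArticle",
    "url": "http://www.w3.org/TR/2015/REC-tabular-data-model-20151217/",
}
processed = apply_embedded_defaults(
    authored,
    document_title="Model for Tabular Data and Metadata on the Web",
    document_url="http://www.w3.org/TR/2015/REC-tabular-data-model-20151217/",
)
```

Values that the author does supply are left unchanged; only absent or empty properties receive a generated value.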

B.3 Audiobook

The following example shows a manifest that conforms to the Audiobooks profile [audiobooks].

A JSON encoding of the internal representation of this manifest is also available.

{
  "@context": [
      "https://schema.org",
      "https://www.w3.org/ns/pub-context",
      {"language": "en"}
  ],
  "conformsTo": "https://www.w3.org/audiobooks",
  "type": "Audiobook",
  "id": "https://librivox.org/flatland-a-romance-of-many-dimensions-by-edwin-abbott-abbott/",
  "url": "https://w3c.github.io/wpub/experiments/audiobook/",
  "name": "Flatland: A Romance of Many Dimensions",
  "author": "Edwin Abbott Abbott",
  "readBy": "Ruth Golding",
  "publisher": "Librivox",
  "inLanguage": "en",
  "dateModified": "2018-06-14T19:32:18Z",
  "datePublished": "2008-10-12",
  "duration": "PT15153S",
  "license": "https://creativecommons.org/publicdomain/zero/1.0/",

  "resources": [
    {
      "rel": "cover",
      "url": "http://ia800704.us.archive.org/9/items/LibrivoxCdCoverArt12/Flatland_1109.jpg",
      "encodingFormat": "image/jpeg"
    },{
      "rel": "contents", 
      "url": "toc.html", 
      "encodingFormat": "text/html"
    }
  ],

  "readingOrder": [
    {
      "url": "http://www.archive.org/download/flatland_rg_librivox/flatland_1_abbott.mp3",
      "encodingFormat": "audio/mpeg",
      "length": 1371,
      "name": "Part 1, Sections 1 - 3"
    },{
      "url": "http://www.archive.org/download/flatland_rg_librivox/flatland_2_abbott.mp3",
      "encodingFormat": "audio/mpeg", 
      "length": 1669,
      "name": "Part 1, Sections 4 - 5"
    },{
      "url": "http://www.archive.org/download/flatland_rg_librivox/flatland_3_abbott.mp3",
      "encodingFormat": "audio/mpeg",
      "length": 1506,
      "name": "Part 1, Sections 6 - 7"
    },{
      "url": "http://www.archive.org/download/flatland_rg_librivox/flatland_4_abbott.mp3",
      "encodingFormat": "audio/mpeg",
      "length": 1669,
      "name": "Part 1, Sections 8 - 10"
    },{
      "url": "http://www.archive.org/download/flatland_rg_librivox/flatland_5_abbott.mp3",
      "encodingFormat": "audio/mpeg",
      "length": 1506,
      "name": "Part 1, Sections 11 - 12"
    },{
      "url": "http://www.archive.org/download/flatland_rg_librivox/flatland_6_abbott.mp3",
      "encodingFormat": "audio/mpeg",
      "length": 1798,
      "name": "Part 2, Sections 13 - 14"
    },{
      "url": "http://www.archive.org/download/flatland_rg_librivox/flatland_7_abbott.mp3",
      "encodingFormat": "audio/mpeg",
      "length": 1225,
      "name": "Part 2, Sections 15 - 17"
    },{
      "url": "http://www.archive.org/download/flatland_rg_librivox/flatland_8_abbott.mp3",
      "encodingFormat": "audio/mpeg",
      "length": 1371,
      "name": "Part 2, Sections 18 - 20"
    },{
      "url": "http://www.archive.org/download/flatland_rg_librivox/flatland_9_abbott.mp3", 
      "encodingFormat": "audio/mpeg",
      "length": 1659,
      "name": "Part 2, Sections 21 - 22"
    }
  ]
}
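
The `duration` value in the manifest above, `PT15153S`, uses the ISO 8601 duration syntax [iso8601]. As a non-normative illustration, the following sketch converts such a value into seconds and a clock-style string. The helper names `duration_seconds` and `format_clock` are hypothetical, and the pattern deliberately handles only the time-only subset (`PT` followed by whole hours, minutes, and seconds); it is not a complete ISO 8601 parser.

```python
import re

# Assumption: only time-only durations with whole numbers are supported here.
_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?$")


def duration_seconds(value):
    """Convert a time-only ISO 8601 duration such as "PT15153S" to seconds."""
    match = _DURATION.match(value)
    # Reject strings that do not match, and "PT" with no components at all.
    if match is None or not any(match.groups()):
        raise ValueError(f"unsupported duration: {value!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def format_clock(total_seconds):
    """Format a number of seconds as H:MM:SS."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


total = duration_seconds("PT15153S")
clock = format_clock(total)  # "4:12:33"
```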

C. Properties Index

This section is non-normative.

The following table identifies where manifest properties are defined and extended.

Name Publication Manifest
abridged § 2.9.1.1 Abridged
accessMode § 2.9.1.2 Accessibility
accessModeSufficient § 2.9.1.2 Accessibility
accessibilityFeature § 2.9.1.2 Accessibility
accessibilityHazard § 2.9.1.2 Accessibility
accessibilitySummary § 2.9.1.2 Accessibility
artist § 2.9.1.5 Creators
author § 2.9.1.5 Creators
conformsTo § 2.8 Profile Conformance
@context § 2.5 Manifest Contexts
contributor § 2.9.1.5 Creators
creator § 2.9.1.5 Creators
dateModified § 2.9.1.7 Last Modification Date
datePublished § 2.9.1.8 Publication Date
direction § 2.6.1 Global Declarations
duration § 2.9.1.6 Duration
editor § 2.9.1.5 Creators
id § 2.9.1.4 Canonical Identifier
illustrator § 2.9.1.5 Creators
inker § 2.9.1.5 Creators
inLanguage § 2.9.1.9 Publication Language
language § 2.6.1 Global Declarations
letterer § 2.9.1.5 Creators
link § 2.9.2.3 Links
name § 2.9.1.11 Title
penciler § 2.9.1.5 Creators
publisher § 2.9.1.5 Creators
readBy § 2.9.1.5 Creators
readingOrder § 2.9.2.1 Default Reading Order
readingProgression § 2.9.1.10 Reading Progression Direction
resources § 2.9.2.2 Resource List
translator § 2.9.1.5 Creators
type § 2.7 Publication Types
url § 2.9.1.3 Address

D. Resource Relations Index

This section is non-normative.

The following table identifies where the use of resource relations is defined.

Name Publication Manifest
accessibility-report § 2.10.1.1 Accessibility Report
contents § 2.10.2.3 Table of Contents
cover § 2.10.2.1 Cover
pagelist § 2.10.2.2 Page List
privacy-policy § 2.10.1.3 Privacy Policy
preview § 2.10.1.2 Preview

E. JSON-LD 1.1 Dependency

This section is non-normative.

This specification depends on [json-ld11], which is, at the time of finalizing this specification, the latest version of JSON-LD. Consequently, a JSON-LD 1.0 [json-ld10] processor raises errors when it processes a manifest as defined in this document.

However, the 1.1 dependency is restricted to the usage of the direction term, defined in § 2.4.4.1 Localizable Strings. If the manifest does not use this feature, it can be processed by a JSON-LD 1.0 processor provided the context reference https://www.w3.org/ns/pub-context (see § 2.5 Manifest Contexts) is replaced by https://www.w3.org/ns/pub-context-jsonld10.

Note that this is relevant if and only if the manifest is indeed processed by a JSON-LD processor. As emphasized in § 2.1.2 JSON-LD Authoring and Processing, a user agent is not required to be based on such a processor, in which case this dependency does not create any problems.
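
The context substitution described above can be sketched as follows. This is a non-normative illustration: the function names are hypothetical, and detecting use of the feature by scanning for any object key named `direction` is a simplifying assumption rather than a rule defined by this specification. The two context URLs are the ones given in this section.

```python
import copy

PUB_CONTEXT = "https://www.w3.org/ns/pub-context"
PUB_CONTEXT_JSONLD10 = "https://www.w3.org/ns/pub-context-jsonld10"


def uses_direction(node):
    """Return True if any object in the parsed manifest has a "direction" key."""
    if isinstance(node, dict):
        return "direction" in node or any(uses_direction(v) for v in node.values())
    if isinstance(node, list):
        return any(uses_direction(item) for item in node)
    return False


def downgrade_context(manifest):
    """Return a copy of the manifest that references the JSON-LD 1.0 context.

    Raises ValueError if the manifest uses `direction`, since that term is
    the reason for the JSON-LD 1.1 dependency.
    """
    if uses_direction(manifest):
        raise ValueError("manifest uses 'direction'; JSON-LD 1.1 is needed")

    result = copy.deepcopy(manifest)
    context = result.get("@context", [])
    if isinstance(context, str):
        context = [context]
    # Swap only the publication context; leave other entries untouched.
    result["@context"] = [
        PUB_CONTEXT_JSONLD10 if entry == PUB_CONTEXT else entry
        for entry in context
    ]
    return result


manifest = {
    "@context": ["https://schema.org", PUB_CONTEXT, {"language": "en"}],
    "name": "Flatland: A Romance of Many Dimensions",
}
downgraded = downgrade_context(manifest)
```

The original manifest is left unmodified, so the same parsed data can still be handed to a JSON-LD 1.1 processor.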

F. Acknowledgements

This section is non-normative.

The editors would like to thank the members of the Publishing Working Group for their contributions to this specification:

The Working Group would also like to thank the members of the Digital Publishing Interest Group for all the hard work they did paving the road for this specification.

G. References

G.1 Normative references

[accname-1.1]
Accessible Name and Description Computation 1.1. Joanmarie Diggs; Bryan Garaventa; Michael Cooper. W3C. 18 December 2018. W3C Recommendation. URL: https://www.w3.org/TR/accname-1.1/
[bcp47]
Tags for Identifying Languages. A. Phillips; M. Davis. IETF. September 2009. IETF Best Current Practice. URL: https://tools.ietf.org/html/bcp47
[bibtex]
BibTeX Format Description. URL: http://www.bibtex.org/Format/
[bidi]
Unicode Bidirectional Algorithm. Mark Davis; Aharon Lanin; Andrew Glass. Unicode Consortium. 4 February 2019. Unicode Standard Annex #9. URL: https://www.unicode.org/reports/tr9/tr9-41.html
[dcterms]
DCMI Metadata Terms. DCMI Usage Board. DCMI. 14 June 2012. DCMI Recommendation. URL: http://dublincore.org/documents/dcmi-terms/
[dom]
DOM Standard. Anne van Kesteren. WHATWG. Living Standard. URL: https://dom.spec.whatwg.org/
[dpub-aria-1.0]
Digital Publishing WAI-ARIA Module 1.0. Matt Garrish; Tzviya Siegman; Markus Gylling; Shane McCarron. W3C. 14 December 2017. W3C Recommendation. URL: https://www.w3.org/TR/dpub-aria-1.0/
[ecmascript]
ECMAScript Language Specification. Ecma International. URL: https://tc39.github.io/ecma262/
[html]
HTML Standard. Anne van Kesteren; Domenic Denicola; Ian Hickson; Philip Jägenstedt; Simon Pieters. WHATWG. Living Standard. URL: https://html.spec.whatwg.org/multipage/
Link Relations. URL: https://www.iana.org/assignments/link-relations/link-relations.xhtml
[iana-relations]
Link Relations. IANA. URL: https://www.iana.org/assignments/link-relations/
[iso8601]
Representation of dates and times. ISO 8601:2004. International Organization for Standardization (ISO). 2004. ISO 8601:2004. URL: http://www.iso.org/iso/catalogue_detail?csnumber=40874
[json]
The application/json Media Type for JavaScript Object Notation (JSON). D. Crockford. IETF. July 2006. Informational. URL: https://tools.ietf.org/html/rfc4627
[json-ld11]
JSON-LD 1.1. Gregg Kellogg; Pierre-Antoine Champin. W3C. 9 September 2019. W3C Working Draft. URL: https://www.w3.org/TR/json-ld11/
[onix]
ONIX for Books. URL: http://www.editeur.org/83/Overview
[rfc1808]
Relative Uniform Resource Locators. R. Fielding. IETF. June 1995. Proposed Standard. URL: https://tools.ietf.org/html/rfc1808
[rfc2046]
Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types. N. Freed; N. Borenstein. IETF. November 1996. Draft Standard. URL: https://tools.ietf.org/html/rfc2046
[RFC2119]
Key words for use in RFCs to Indicate Requirement Levels. S. Bradner. IETF. March 1997. Best Current Practice. URL: https://tools.ietf.org/html/rfc2119
[rfc5988]
Web Linking. M. Nottingham. IETF. October 2010. Proposed Standard. URL: https://tools.ietf.org/html/rfc5988
[RFC8174]
Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words. B. Leiba. IETF. May 2017. Best Current Practice. URL: https://tools.ietf.org/html/rfc8174
[rfc8288]
Web Linking. M. Nottingham. IETF. October 2017. Proposed Standard. URL: https://httpwg.org/specs/rfc8288.html
[schema.org]
Schema.org. URL: https://schema.org
[sri]
Subresource Integrity. Devdatta Akhawe; Frederik Braun; Francois Marier; Joel Weinberger. W3C. 23 June 2016. W3C Recommendation. URL: https://www.w3.org/TR/SRI/
[url]
URL Standard. Anne van Kesteren. WHATWG. Living Standard. URL: https://url.spec.whatwg.org/
[wcag21]
Web Content Accessibility Guidelines (WCAG) 2.1. Andrew Kirkpatrick; Joshue O Connor; Alastair Campbell; Michael Cooper. W3C. 5 June 2018. W3C Recommendation. URL: https://www.w3.org/TR/WCAG21/
[WebIDL]
Web IDL. Boris Zbarsky. W3C. 15 December 2016. W3C Editor's Draft. URL: https://heycam.github.io/webidl/
[webidl-1]
WebIDL Level 1. Cameron McCormack. W3C. 15 December 2016. W3C Recommendation. URL: https://www.w3.org/TR/2016/REC-WebIDL-1-20161215/

G.2 Informative references

[audiobooks]
Audiobook Profile for Publication Manifest. Wendy Reid. W3C. 11 September 2019. W3C Working Draft. URL: https://www.w3.org/TR/audiobooks/
[bibo]
Bibliographic Ontology Specification. Bruce D'Arcus; Frédérick Giasson. Structured Dynamics LLC. 11 May 2016. URL: http://bibliographic-ontology.org/specification
[dc11]
Dublin Core Metadata Element Set, Version 1.1. DCMI. 14 June 2012. DCMI Recommendation. URL: http://dublincore.org/documents/dces/
[ecma-404]
The JSON Data Interchange Format. Ecma International. 1 October 2013. Standard. URL: https://www.ecma-international.org/publications/files/ECMA-ST/ECMA-404.pdf
[foaf]
FOAF Vocabulary Specification 0.99 (Paddington Edition). Dan Brickley; Libby Miller. FOAF project. 14 January 2014. URL: http://xmlns.com/foaf/spec
[json-ld10]
JSON-LD 1.0. Manu Sporny; Gregg Kellogg; Markus Lanthaler. W3C. 16 January 2014. W3C Recommendation. URL: https://www.w3.org/TR/2014/REC-json-ld-20140116/
[json-schema]
JSON Schema: core definitions and terminology. K. Zyp. Internet Engineering Task Force (IETF). 31 January 2013. Internet-Draft. URL: https://tools.ietf.org/html/draft-zyp-json-schema
Identifier: A Link Relation to Convey a Preferred URI for Referencing. H. Van de Sompel; M. Nelson; G. Bilder; J. Kunze; S. Warner. IETF. URL: https://tools.ietf.org/html/draft-vandesompel-identifier-00
[rfc3987]
Internationalized Resource Identifiers (IRIs). M. Duerst; M. Suignard. IETF. January 2005. Proposed Standard. URL: https://tools.ietf.org/html/rfc3987
[string-meta]
Requirements for Language and Direction Metadata in Data Formats. Addison Phillips; Richard Ishida. 1 December 2017. URL: https://w3c.github.io/string-meta/
[webschemas-a11y]
WebSchemas Accessibility. URL: http://www.w3.org/wiki/WebSchemas/Accessibility