Copyright © 2020 W3C ® ( MIT , ERCIM , Keio , Beihang ). W3C liability , trademark and permissive document license rules apply.
This specification defines a general manifest format for expressing information about a digital publication. It uses [ schema.org ] metadata augmented to include various structural properties about publications, serialized in [ json-ld11 ], to enable interoperability between publishing formats while accommodating variances in the information that needs to be expressed.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.
This document was published by the Publishing Working Group as an Editor's Draft.
GitHub Issues are preferred for discussion of this specification. Alternatively, you can send comments to our mailing list. Please send them to public-publ-wg@w3.org ( archives ).
Please see the Working Group's implementation report .
Publication as an Editor's Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the W3C Patent Policy . W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy .
This document is governed by the 1 March 2019 W3C Process Document .
This specification defines a general manifest format to describe publications. It is designed to be adaptable to the needs of specific areas of publishing, such as audiobook production, by specifying a modular approach for creating specializations.
This specification is also intended to facilitate different user agent architectures. While it is expected that traditional Web user agents (browsers) will be able to consume a publication manifest, this should not limit the capabilities of any other possible type of user agent (e.g., applications, whether standalone or running within a user agent, or even publications that include their own user interface).
This specification does not define how user agents are expected to render publications that use the manifest format.
This section is non-normative.
A digital publication is described by its manifest , which provides a set of properties expressed using a specific shape of JSON-LD [ json-ld11 ] (a variant of JSON [ ecma-404 ] for linked data).
The manifest is what enables user agents to understand the bounds of digital publication and the connection between its resources. It includes metadata that describes the digital publication, as a publication has an identity and nature beyond its constituent resources. The manifest also provides a list of resources that belong to the digital publication and a default reading order , which is how it connects resources into a single contiguous work.
The properties of the manifest describe the basic information a user agent requires to process and render a publication. For ease of understanding, these properties are categorized as follows:
Descriptive properties describe aspects of a digital publication, such as its title , creator , and language .
Resource categorization properties describe or identify common sets of resources, such as the resource list and default reading order . These properties refer to one or more resources, such as HTML documents, images, scripts, and metadata records.
The manifest also identifies key resources of a digital publication through the use of link relations. These relations are defined in the rel property of LinkedResource objects (i.e., the JSON objects that represent each resource in the default reading order, resource list, and links sections).
The types of resources these relations identify are categorized as follows:
Informative resources are resources that contain additional information about the publication, such as its privacy policy , accessibility report , or preview .
Structural resources are key meta structures of the publication, such as the cover image , table of contents , and page list .
This specification defines the publication manifest as a specific "shape" of [ json-ld11 ]. This means that the manifest SHOULD be expressed using only the syntactic constructions defined in this specification, as opposed to all the possibilities offered by the JSON-LD syntax.
This shape is also defined, informally, through a JSON schema [ json-schema ] that expresses the constraints defined in this specification. This schema is maintained at https://github.com/w3c/pub-manifest/blob/master/schema/ .
The publication manifest also has a number of authoring flexibilities and compact authoring expressions. For example, it is not always required that object types be explicitly authored, as these are automatically generated during processing when missing (see § 4.2.4 Explicit and Implied Objects for more information). An internal representation of the manifest data is defined separately; see § A. Internal Representation Data Models for further details.
As a consequence, a user agent does not have to be a full JSON-LD processor. User agents only need to be able to read the manifest's specific shape and internalize the data.
This section is non-normative.
Manifest properties, in particular those categorized as descriptive properties , are primarily drawn from Schema.org and its hosted extensions [ schema.org ]. As a consequence, these properties inherit their syntax and semantics from Schema.org, making manifest authoring compatible with Schema.org authoring.
When a manifest item corresponds to a Schema.org property, its property definition identifies its mapping and includes the defining type (e.g., CreativeWork or Book ) in parentheses.
Schema.org additionally includes a large number of properties that, though relevant for publishing, are not mentioned in this specification. These properties can be used in a manifest as this document defines only the minimal set of manifest items (see § 4.7.3.2 Additional Manifest Properties ).
When using additional Schema.org properties, ensure that they are valid for the type of publication specified in the manifest. Properties are often available in many Schema.org types, as a result of the inheritance model used by the vocabulary, but not all properties are available for all types. For more detailed information about which types accept which properties, refer to [ schema.org ].
More information about using additional Schema.org properties is also available in § 4.5 Publication Types and § 4.7.3.2 Additional Manifest Properties .
This specification depends on the Infra Standard [ infra ].
A digital publication consists of a finite set of resources that represent its content. This extent is known as its bounds and is defined within its manifest as described in § 5. Publication Resources .
A digital publication is any publication authored in a format that uses a profile of the manifest .
The internal representation of a manifest is the data structure created by user agents when they process the manifest and remove all possible ambiguities and incorporate any missing values that can be inferred from another source.
It is possible for the information expressed in the manifest to be the equivalent of the internal representation created by user agents if there are no ambiguities or missing information.
A manifest represents structured information about a publication, such as informative metadata, a list of resources , and a default reading order .
Profiles are publication formats (e.g., audiobooks) that use the manifest format defined in this specification to describe their bounds and content. These formats can extend the core definition in this specification with profile-specific terms and/or new requirements.
Although profiles can differ in their structural and content requirements, such variances are restricted to maintain a high degree of predictability between formats. (See § 8. Modular Extensions .)
As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key words MAY , MUST , MUST NOT , OPTIONAL , RECOMMENDED , REQUIRED , SHOULD , and SHOULD NOT in this document are to be interpreted as described in BCP 14 [ RFC2119 ] [ RFC8174 ] when, and only when, they appear in all capitals, as shown here.
All algorithm explanations are informative .
The following properties MUST be set in the manifest:
The following properties are RECOMMENDED :
The priority of all other properties and resource relations is OPTIONAL , but MAY be modified by implementations of the manifest format.
Some properties are implicitly required, as they are compiled from alternative information when not explicitly authored. See § A. Internal Representation Data Models for more information.
This section describes the categories of values that can be used with properties of the publication manifest.
When a manifest property expects a literal text string — one that is not language-dependent, such as a code value or date — as its value, the value MUST be expressed as a [ json ] string .
Literal values are not changed during processing of the manifest , unlike other values which might be, for example, converted to objects .
When a manifest property expects a number as its value, the value MUST be expressed as a [ json ] number .
When a manifest property expects a boolean as its value, the value MUST be expressed as an [ ecmascript ] Boolean value (true or false).
Various manifest properties are expected to be expressed as [ json ] objects . Although the use of explicit objects is usually advised, the following sections identify cases where it is also acceptable to use string values. These strings are automatically translated into objects during processing of the manifest by a user agent (the exact mapping of text values to objects is included in each definition).
When a manifest property expects a localizable text string as its value, the value MUST be expressed as one of:
LocalizableString; or LocalizableStrings.
A single string value represents an implied object whose value property is the string's text and whose language and base direction are determined from other information in the manifest.
As localizable strings are intended to facilitate multiple language representations of a value, properties that accept a localizable string always accept an array of these values. For this reason, although only a single string or object has to be authored, such values are converted to arrays for consistency of processing.
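The conversion to arrays described above can be sketched as follows. This is a minimal illustration rather than the processing algorithm of this specification: the function name `to_localizable_array` is hypothetical, and the inheritance of language and base direction from other information in the manifest is intentionally left out.

```python
def to_localizable_array(value):
    """Normalize a localizable value to a list of objects with a "value" key.

    Accepts a single string, a single object, or a list mixing both.
    """
    items = value if isinstance(value, list) else [value]
    result = []
    for item in items:
        if isinstance(item, str):
            # A plain string is an implied object whose "value" is the text.
            result.append({"value": item})
        elif isinstance(item, dict) and "value" in item:
            # Copy explicit objects so the authored manifest is not mutated.
            result.append(dict(item))
        else:
            raise ValueError(f"not a localizable string: {item!r}")
    return result
```

With this sketch, a single authored string and a one-element array produce the same result, which is the consistency the conversion is meant to provide.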
A LocalizableString is a [ json ] object consisting of the following properties:
Term | Description | Required Value | Value Category | [ schema.org ] Mapping |
---|---|---|---|---|
value | The value of the localizable string. REQUIRED. | Text. | Literal | (None) |
language | The language of the value. OPTIONAL. | A well-formed language tag [ bcp47 ]. | Literal | (None) |
direction | The base direction of the value. OPTIONAL. | ltr or rtl | Literal | (None) |
The meanings of the base direction values are:
ltr: indicates that the textual value is explicitly directionally set to left-to-right text.
rtl: indicates that the textual value is explicitly directionally set to right-to-left text.
A missing base direction value means that the textual value is explicitly directionally set to the direction of the first character with a strong directionality, following the rules of the Unicode Bidirectional Algorithm [ bidi ].
If the base direction value were not set in the last example, the text would be displayed, following the Unicode Bidirectional Algorithm [ bidi ] and due to the presence of a Latin character starting the string, as:
HTML היא שפת סימון.
However, that would be incorrect. The extra direction value is necessary to control the display to yield:
HTML היא שפת סימון.
See also the [ string-meta ] document for further explanations and examples.
When a manifest property expects an entity (i.e., an individual or organization responsible for the various aspects of creation), its value MUST be expressed either as:
A single string value represents an instance of an Entity object whose name property is the string's text and whose type is assumed to be Person [ schema.org ].
An Entity is defined as an instance of either the [ schema.org ] Person or Organization type with the following minimal property set:
Term | Description | Required Value | Value Category | [ schema.org ] Mapping |
---|---|---|---|---|
type | The type of creator. OPTIONAL. | One or more Text. Sequence MUST include "Person" or "Organization". | Array of Literals | (None) |
name | Name of the creator. REQUIRED. | One or more Text. | Array of Localizable Strings | name |
id | A canonical identifier associated with the creator. OPTIONAL. | A URL record [ url ]. | Identifier | (None) |
url | An address associated with the creator. OPTIONAL. | A valid URL string [ url ]. | URL | url |
identifier | An identifier associated with the creator (e.g., ORCID). OPTIONAL. | One or more Text. | Array of Literals | identifier |
This minimal set of properties is not restrictive. Authors can include any additional properties defined for the [ schema.org ] Person or Organization types, as appropriate. User agents are similarly not limited to interpreting only the preceding properties.
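The mapping of string values to Entity objects can be sketched as follows. This is an illustration under stated assumptions, not a normative algorithm: the function name `to_entities` is hypothetical, a string becomes a Person as described above, and explicit objects are copied unchanged apart from wrapping a single type string in an array.

```python
def to_entities(value):
    """Normalize an entity value (string, object, or list of both) to a list of objects."""
    items = value if isinstance(value, list) else [value]
    entities = []
    for item in items:
        if isinstance(item, str):
            # A plain string is shorthand for a Person with that name.
            entities.append({"type": ["Person"], "name": item})
        elif isinstance(item, dict):
            entity = dict(item)  # keep any additional schema.org properties
            if isinstance(entity.get("type"), str):
                # The type is an array of literals, so wrap a single string.
                entity["type"] = [entity["type"]]
            entities.append(entity)
        else:
            raise ValueError(f"not an entity: {item!r}")
    return entities
```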
When a manifest property links to one or more resources, it MUST be expressed either as:
LinkedResource.
A string value represents an implied LinkedResource object whose url property is set to the string value.
A LinkedResource object is defined as follows:
Term | Description | Required Value | Value Category | [ schema.org ] Mapping |
---|---|---|---|---|
type | The type of resource. OPTIONAL. | One or more Text. Sequence MUST include "LinkedResource". | Array of Literals | (None) |
url | Location of the resource. REQUIRED. | A valid URL string [ url ]. Refer to the property definitions that accept this type for additional restrictions. | URL | url |
encodingFormat | Media type of the resource (e.g., text/html). OPTIONAL. | MIME Media Type [ rfc2046 ]. | Literal | encodingFormat |
name | Name of the item. OPTIONAL. | One or more Text. | Array of Localizable Strings | name |
description | Description of the item. OPTIONAL. | One or more Text. | Array of Localizable Strings | description |
rel | The relation of the resource to the publication. OPTIONAL. | One or more relations. | Array of Literals | (None) |
integrity | A cryptographic hashing of the resource that allows its integrity to be verified. OPTIONAL. | One or more whitespace-separated sets of integrity metadata [ sri ]. The value MUST conform to the metadata definition [ sri ]. Refer to [ sri ] for the list of cryptographic hashing functions that user agents are expected to support. | Literal | (None) |
duration | Overall duration of a time-based media resource. OPTIONAL. | Duration value as defined by [ iso8601 ]. | Literal | duration (Property) |
alternate | References to one or more reformulation(s) of the resource in alternative formats, where the | One or more of: A string value represents an implied | Array of Linked Resources | (None) |
Although user agent support for the integrity property is OPTIONAL, user agents that support cryptographic hashing comparisons using this property MUST do so in accordance with [ sri ].
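As a rough illustration of the integrity metadata format, the sketch below builds and checks values of the form algorithm name, hyphen, base64-encoded digest. It is a simplification and an assumption-laden example, not an implementation of [ sri ]: the function names are hypothetical, option suffixes are not handled, and the selection of the strongest algorithm among several values is omitted.

```python
import base64
import hashlib

# Hash functions accepted by this sketch (an assumption; see [sri] for the real list).
SUPPORTED = ("sha256", "sha384", "sha512")


def integrity_metadata(data: bytes, algorithm: str = "sha256") -> str:
    """Return '<algorithm>-<base64 digest>' for the given resource bytes."""
    digest = hashlib.new(algorithm, data).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def matches_integrity(data: bytes, integrity: str) -> bool:
    """Return True if any supported whitespace-separated value matches the data."""
    for token in integrity.split():
        algorithm = token.partition("-")[0]
        if algorithm in SUPPORTED and integrity_metadata(data, algorithm) == token:
            return True
    return False
```

A user agent following this pattern would fetch the resource, pass its bytes and the authored integrity value to `matches_integrity`, and treat a `False` result as a failed check.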
This specification only defines the alternate property for selecting from alternative formats (i.e., based on encodingFormat or by inspecting URLs). Profiles MAY extend this behaviour to allow selection based on other criteria.
The process for selecting an alternate is described in § B. Selecting an Alternate Resource.
When defining a LinkedResource object, it is advised to always specify the media type of the resource using the encodingFormat property. Doing so allows user agents to more readily determine the usability of the resource.
{
"type" : "LinkedResource",
"url" : "chapter1.html",
"encodingFormat" : "text/html",
"name" : "Chapter 1 - Loomings",
"integrity" : "sha256-13AE04E21177BABEDFDE721577615A638341F963731EA936BBB8C3862F57CDFC"
}
{
"type" : "LinkedResource",
"url" : "chapter1.mp3",
"encodingFormat" : "audio/mpeg",
"name" : "Chapter 1 - Loomings",
"alternate" : [
"chapter1.html",
{
"type": "LinkedResource",
"url": "chapter1.json",
"encodingFormat": "application/vnd.syncnarr+json",
"duration": "PT1669S"
}
]
}
{
…
"resources" : [
"datatypes.svg",
{
"type" : "LinkedResource",
"url" : "test-utf8.csv",
"encodingFormat" : "text/csv",
"name" : "Test Results",
"description" : "CSV file containing the full data set used."
},
{
"type" : "LinkedResource",
"url" : "terminology.html",
"encodingFormat" : "text/html",
"rel" : "glossary"
}
],
…
}
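The expansion of string values into implied LinkedResource objects, as used in the resources example above, might be sketched like this. The function name `to_linked_resources` is hypothetical, and setting the type of an implied object to "LinkedResource" is an assumption of this sketch rather than a quoted processing rule.

```python
def to_linked_resources(value):
    """Normalize a link value (string, object, or list of both) to a list of objects."""
    items = value if isinstance(value, list) else [value]
    resources = []
    for item in items:
        if isinstance(item, str):
            # A plain string is an implied LinkedResource whose url is the string.
            resources.append({"type": ["LinkedResource"], "url": item})
        elif isinstance(item, dict) and "url" in item:
            resource = dict(item)
            if "alternate" in resource:
                # Alternates are themselves strings and/or LinkedResource objects.
                resource["alternate"] = to_linked_resources(resource["alternate"])
            resources.append(resource)
        else:
            raise ValueError(f"not a linked resource: {item!r}")
    return resources
```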
When a manifest property expects a type of object not defined in this section, or by a profile , it MUST be expressed as a [ json ] object (i.e., the property's value will not be processed to create an object).
URLs are used to identify resources associated with a digital publication . When a property expects a URL value, it MUST be a valid URL string [ url ].
In the case of relative-URL strings , these are resolved to absolute-URL strings using a base URL [ url ].
The base URL for relative-URL strings is determined as follows:
As a consequence, relative-URL strings in embedded manifests are resolved against the URL of the document that references the manifest unless the document declares a base URL (i.e., in a <base> element in its header).
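The resolution of relative-URL strings for an embedded manifest can be illustrated as follows. This sketch uses Python's `urllib.parse.urljoin`, which follows RFC 3986 rather than the [ url ] standard, so edge cases may differ; the function names and example URLs are hypothetical.

```python
from urllib.parse import urljoin


def base_url_for_embedded(document_url, declared_base=None):
    """Return the base URL: the declared <base> value if present, else the document URL."""
    if declared_base:
        # A declared base may itself be relative to the document URL.
        return urljoin(document_url, declared_base)
    return document_url


def resolve(relative_url, base_url):
    """Resolve a relative-URL string from the manifest against the base URL."""
    return urljoin(base_url, relative_url)
```

For example, `resolve("chapter1.html", base_url_for_embedded("https://publisher.example.org/moby-dick/index.html"))` resolves against the referencing document, while passing a declared base redirects the resolution to that location.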
Identifiers are used to refer to a digital publication and the entities responsible for its creation in a persistent and unambiguous manner. URLs , URNs , DOIs , ISBNs , and PURLs are all examples of persistent identifiers frequently used in publishing.
Identifiers MUST be expressed as URL records [ url ].
When a manifest property allows one or more values of its respective type (e.g., literal , object , or URL ), these values are expressed as [ json ] arrays . When a property value is a single element, however, the array syntax MAY be omitted.
A manifest MUST set its JSON-LD context [ json-ld11 ] with the following two components, in the specified order:
https://schema.org
https://www.w3.org/ns/pub-context
Although Schema.org is often referenced using the http URI scheme, the vocabulary is being migrated to use the secure https scheme as its default. As a result, only the https scheme is recognized in the publication manifest context.
{
"@context" : [
"https://schema.org",
"https://www.w3.org/ns/pub-context"
],
…
}
The publication context document adds features to the properties defined in Schema.org (e.g., the requirement for the creator property to be order preserving).
Profiles of this specification MAY require additional context URLs , but such URLs MUST be ordered after these two components.
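A check for the required context components and their order could look like the following sketch. The function name `has_required_context` is hypothetical, and the check only covers the two components named above; it does not validate any profile-specific URLs that follow them.

```python
REQUIRED_CONTEXT = ["https://schema.org", "https://www.w3.org/ns/pub-context"]


def has_required_context(manifest):
    """Return True if @context is an array starting with the two required URLs, in order."""
    context = manifest.get("@context")
    return isinstance(context, list) and context[:2] == REQUIRED_CONTEXT
```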
The context can be extended by including additional parameters, such as the global language and direction declarations, in an object following the publication context.
{
"@context" : [
"https://schema.org",
"https://www.w3.org/ns/pub-context",
{
"language" : "es"
}
],
…
}
Each natural language property value in a manifest (e.g., title , creators ) has a default natural language , which is the language that it is expressed in (e.g., English, French, Chinese). It also has a natural base direction in which it is written — the display direction, either left-to-right or right-to-left.
The digital publication manifest provides the ability to set both these concepts globally as well as on individual items to aid user agents in interpreting and presenting the metadata.
The ability to set the base direction is a JSON-LD 1.1 [ json-ld11 ] feature. In other words, the Publication Manifest has a dependency on that version of the JSON-LD specification (as opposed to the earlier 1.0 [ json-ld10 ] version).
The global language and base direction declarations for natural language manifest properties are set in the context using the language and direction keywords [ json-ld11 ], respectively. These values are used to expand simple string values into localizable strings during the processing of the manifest, as well as to provide a language and the base direction for localizable strings that omit one.
The value of language MUST be a well-formed language tag [ bcp47 ].
The value of direction MUST have one of the following values:
"ltr": indicates that the textual values are explicitly directionally set to left-to-right text.
"rtl": indicates that the textual values are explicitly directionally set to right-to-left text.
The global language and base direction declaration, when present, MUST follow the publication context .
Default values are not specified for the global language or base direction.
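Reading the global declarations out of the context might be sketched as follows. The function name is hypothetical, and two behaviours are assumptions of this sketch rather than rules stated here: a later object overrides an earlier one, and an unrecognized direction value is ignored. Because no defaults are specified, missing declarations are reported as `None`.

```python
PUB_CONTEXT = "https://www.w3.org/ns/pub-context"


def global_language_and_direction(manifest):
    """Return (language, direction) declared in objects after the publication context."""
    context = manifest.get("@context", [])
    if not isinstance(context, list) or PUB_CONTEXT not in context:
        return None, None
    language = direction = None
    # Only objects that follow the publication context are considered.
    for entry in context[context.index(PUB_CONTEXT) + 1:]:
        if isinstance(entry, dict):
            language = entry.get("language", language)
            direction = entry.get("direction", direction)
    if direction not in (None, "ltr", "rtl"):
        direction = None  # assumption: ignore values other than "ltr" and "rtl"
    return language, direction
```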
It is possible to set the language or a base direction locally for any natural language value in the manifest using a localizable string:
The extra base direction setting for the Arabic title is necessary to yield the correct display, i.e.:
HTML و CSS: تصميم و إنشاء مواقع الويب
The possible values of the language and direction keywords [ json-ld11 ] are the same as for the global declaration. Furthermore, both values can also be the (JSON) value of null, indicating that no explicit language, respectively direction, is set.
Setting the value of language to null can be useful if a value (e.g., the name of an organization) is commonly used without any associated language (e.g., "Google").
A local declaration of the language, respectively the base direction, takes precedence over a global declaration .
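The precedence rule can be sketched as below. The point worth illustrating is the difference between a key that is absent (the global declaration applies) and a key explicitly set to null (no language or direction is set). The function name `effective_setting` is hypothetical.

```python
def effective_setting(localizable, key, global_value):
    """Return the local value of `key` ("language" or "direction") if declared, else the global one.

    An explicit null (None) is a local declaration and therefore wins over the global value.
    """
    if key in localizable:
        return localizable[key]
    return global_value
```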
A digital publication's manifest defines its Publication Type using the type keyword [ json-ld11 ]. The type MAY be mapped onto any [ schema.org ] type, but CreativeWork is assumed as the default when no type is specified.
{
"@context" : ["https://schema.org", "https://www.w3.org/ns/pub-context"],
"type" : "CreativeWork",
…
}
More specific subtypes of CreativeWork, such as Article, Book, TechArticle, and Course, can be used instead of, or in addition to, CreativeWork.
{
"@context" : ["https://schema.org", "https://www.w3.org/ns/pub-context"],
"type" : "Book",
…
}
Each Schema.org type defines a set of properties that are valid for use with it. To ensure that the manifest can be validated and processed by Schema.org-aware processors, the manifest SHOULD contain only the properties associated with the selected type.
If properties from more than one type are needed, the manifest MAY include multiple type declarations.
{
"@context" : ["https://schema.org", "https://www.w3.org/ns/pub-context"],
"type" : ["Book", "VisualArtwork"],
…
}
User agents SHOULD NOT fail to process manifests that are not valid to their declared Schema.org type(s).
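Determining the publication type, including the CreativeWork default and multiple declarations, can be sketched as follows. The function name `publication_types` is hypothetical, and treating an empty array like a missing type is an assumption of this sketch.

```python
def publication_types(manifest):
    """Return the declared type(s) as a list, defaulting to ["CreativeWork"]."""
    declared = manifest.get("type")
    if declared is None or declared == []:
        # No type specified: CreativeWork is assumed.
        return ["CreativeWork"]
    return declared if isinstance(declared, list) else [declared]
```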
Refer to the Schema.org site for the complete list of CreativeWork subtypes.
A digital publication indicates the profile its manifest and content conform to using the conformsTo property.
Term | Description | Required Value | Value Category | [ dcterms ] Mapping |
---|---|---|---|---|
conformsTo | URL of the profile. | An absolute-URL-with-fragment string [ url ]. | Array of Literals | conformsTo |
The URL to use for a profile is defined in its respective specification.
The conformsTo property can also be used to indicate conformance to other specifications and standards (e.g., to [ wcag21 ]).
{
…
"conformsTo" : "https://www.w3.org/TR/audiobooks/",
…
}
The abridged property provides information on whether or not a digital publication has been shortened from its original form.
Term | Description | Required Value | Value Category | [ schema.org ] Mapping |
---|---|---|---|---|
abridged | Indicates whether the book is an abridged edition. | Either true or false. | Boolean | abridged (Book) |
{
…
"abridged" : true,
…
}
The accessibility properties provide information about the suitability of a digital publication for consumption by users with different preferred reading modalities. These properties typically supplement an evaluation against established accessibility criteria, such as those provided in [ wcag21 ].
The following properties are categorized as accessibility properties:
Term | Description | Required Value | Value Category | [ schema.org ] Mapping |
---|---|---|---|---|
accessMode | The human sensory perceptual system or cognitive faculty through which a person may process or perceive information. | One or more Text. | Array of Literals | accessMode (CreativeWork) |
accessModeSufficient | A list of single or combined access modes that are sufficient to understand all the intellectual content of a resource. | One or more ItemList. | Array of Object | accessModeSufficient (CreativeWork) |
accessibilityFeature | Content features of the resource, such as accessible media, alternatives and supported enhancements for accessibility. | One or more Text. | Array of Literals | accessibilityFeature (CreativeWork) |
accessibilityHazard | A characteristic of the described resource that is physiologically dangerous to some users. | One or more Text. | Array of Literals | accessibilityHazard (CreativeWork) |
accessibilitySummary | A human-readable summary of specific accessibility features or deficiencies that is consistent with the other accessibility metadata. | Text. | Array of Localizable Strings | accessibilitySummary (CreativeWork) |
Detailed descriptions of these properties, including the expected values to use with them, are available at [ webschemas-a11y ].
A reference to a detailed accessibility report can also be provided if more information is needed than can be expressed by these properties.
{
…
"accessMode" : ["textual", "visual"],
"accessibilityFeature" : ["alternativeText", "longDescription"],
"accessModeSufficient" : [
{
"type" : "ItemList",
"itemListElement" : ["textual", "visual"]
},
{
"type" : "ItemList",
"itemListElement" : ["textual"]
}
],
…
}
An address is a URL that identifies the source location of a digital publication. It is expressed using the url property.
Term | Description | Required Value | Value Category | [ schema.org ] Mapping |
---|---|---|---|---|
url | URL of the publication. | A valid URL string [ url ]. | Array of URLs | url (Thing) |
A digital publication MAY have more than one address, but all the addresses MUST resolve to the same document.
{
…
"url" : "https://publisher.example.org/frankenstein",
…
}
A digital publication's canonical identifier property provides a unique identifier for a digital publication. It is expressed using the id property.
Term | Description | Required Value | Value Category | [ schema.org ] Mapping |
---|---|---|---|---|
id | Preferred version of the publication. | A URL record [ url ]. | Identifier | (None) |
Ensuring uniqueness of canonical identifiers is outside the scope of this specification. The actual achievable uniqueness depends on such factors as the conventions of the identifier scheme used and the degree of control over assignment of identifiers.
If a canonical identifier is not provided in the manifest, or the value is an invalid URL, the digital publication does not have a canonical identifier. User agents MUST NOT attempt to construct a canonical identifier from any other identifiers provided in the manifest.
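The rule above can be sketched as follows. The function name `canonical_identifier` is hypothetical, and the validity test is a deliberately rough stand-in: it only checks that the value is a string with a scheme, which is an assumption of this sketch and not the parsing defined by [ url ]. Other identifiers in the manifest are never consulted.

```python
from urllib.parse import urlsplit


def canonical_identifier(manifest):
    """Return the value of "id" if it looks like a URL, otherwise None."""
    value = manifest.get("id")
    if not isinstance(value, str):
        return None
    try:
        scheme = urlsplit(value).scheme
    except ValueError:
        return None
    # No fallback: "url" and "identifier" are deliberately ignored here.
    return value if scheme else None
```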
The specification of the canonical identifier MAY be complemented by the inclusion of additional types of identifiers using the identifier property [ schema.org ] and/or its subtypes.
{
…
"id" : "http://www.w3.org/TR/tabular-data-model/",
"url" : "http://www.w3.org/TR/2015/REC-tabular-data-model-20151217/",
…
}
{
…
"id" : "urn:isbn:9780123456789",
"url" : "https://publisher.example.org/wuthering-heights",
…
}
A creator is an individual or organization responsible for the creation of a digital publication .
The following properties are categorized as creators:
Term | Description | Required Value | Value Category | [ schema.org ] Mapping |
---|---|---|---|---|
artist | The primary artist for the publication, in a medium other than pencils or digital line art. | One or more Person. | Array of Entities | artist (VisualArtwork) |
author | The author of the publication. | One or more Person and/or Organization. | Array of Entities | author (CreativeWork) |
colorist | The individual who adds color to inked drawings. | One or more Person. | Array of Entities | colorist (VisualArtwork) |
contributor | Contributor whose role does not fit to one of the other roles in this table. | One or more Person and/or Organization. | Array of Entities | contributor (CreativeWork) |
creator | The creator of the publication. | One or more Person and/or Organization. | Array of Entities | creator (CreativeWork) |
editor | The editor of the publication. | One or more Person. | Array of Entities | editor (CreativeWork) |
illustrator | The illustrator of the publication. | One or more Person. | Array of Entities | illustrator (Book) |
inker | The individual who traces over the pencil drawings in ink. | One or more Person. | Array of Entities | inker (VisualArtwork) |
letterer | The individual who adds lettering, including speech balloons and sound effects, to artwork. | One or more Person. | Array of Entities | letterer (VisualArtwork) |
penciler | The individual who draws the primary narrative artwork. | One or more Person. | Array of Entities | penciler (VisualArtwork) |
publisher | The publisher of the publication. | One or more Person and/or Organization. | Array of Entities | publisher (CreativeWork) |
readBy | A person who reads (performs) the publication (for audiobooks). | One or more Person. | Array of Entities | readBy (Audiobook) |
translator | The translator of the publication. | One or more Person and/or Organization. | Array of Entities | translator (CreativeWork) |
Creators MUST be represented either as:
Person [ schema.org ]; or Person or Organization [ schema.org ].
A single string value is a shorthand for a [ schema.org ] Person whose name property is set to that string value. (See also § 4.2.4.2 Entities.)
The manifest MAY include more than one of each type of creator.
{
…
"url" : "https://publisher.example.org/alice-in-wonderland",
"author" : {
"type" : "Person",
"name" : "Lewis Carroll"
}
}
{
…
"author" : [
"Jeni Tennison",
{
"type" : "Person",
"name" : "Gregg Kellogg"
},
{
"type" : "Person",
"name" : "Ivan Herman",
"id" : "https://www.w3.org/People/Ivan/",
"identifier" : "0000-0003-0782-2704"
}
],
"editor" : [
"Jeni Tennison",
{
"type" : "Person",
"name" : "Gregg Kellogg"
}
],
"publisher" : {
"type" : "Organization",
"name" : "World Wide Web Consortium",
"id" : "https://www.w3.org/"
}
…
}
The global duration indicates the overall length of a time-based digital publication (e.g., an audiobook or a book consisting of a series of video clips). It is expressed using the duration property.
Term | Description | Required Value | Value Category | [ schema.org ] Mapping |
---|---|---|---|---|
duration | Overall duration of a time-based publication. | Duration value as defined by [ iso8601 ]. | Literal | duration (Property) |
{
…
"type" : "Audiobook",
"id" : "https://example.org/flatland-a-romance-of-many-dimensions/",
"url" : "https://w3c.github.io/pub-manifest/experiments/audiobook/",
"name" : "Flatland: A Romance of Many Dimensions",
…
"duration" : "PT15153S",
…
}
The relevant Wikipedia page gives a concise description of the ISO duration syntax.
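To make the duration syntax concrete, the sketch below converts a subset of ISO 8601 durations to seconds. It is an illustration only: the function name `duration_seconds` is hypothetical, and only day, hour, minute, and second components are handled (years, months, and weeks are rejected, since their length in seconds is not fixed or is outside this sketch).

```python
import re

# Matches PnDTnHnMnS with every component optional; seconds may be fractional.
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+(?:\.\d+)?)S)?)?$"
)


def duration_seconds(value):
    """Convert a duration such as "PT15153S" or "PT4H12M33S" to seconds."""
    match = _DURATION.match(value)
    # Reject strings with no components at all, or with a dangling "T".
    if not match or value == "P" or value.endswith("T"):
        raise ValueError(f"unsupported duration: {value!r}")
    parts = {name: float(text) if text else 0.0 for name, text in match.groupdict().items()}
    return (parts["days"] * 86400 + parts["hours"] * 3600
            + parts["minutes"] * 60 + parts["seconds"])
```

Under this sketch, the value `PT15153S` from the example above and the equivalent `PT4H12M33S` both evaluate to 15153 seconds.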
The last modification date is the date when a digital publication was last updated (i.e., whenever changes were last made to any of the resources of the publication, including the manifest). It is expressed using the dateModified property.
Term | Description | Required Value | Value Category | [ schema.org ] Mapping |
---|---|---|---|---|
dateModified | Last modification date of the publication. | A Date or DateTime value [ schema.org ], expressed in the ISO 8601 Date or Date Time format, respectively [ iso8601 ]. | Literal | dateModified (CreativeWork) |
The last modification date does not necessarily reflect all changes to a publication (e.g., if a digital publication format allows references to third-party content). User agents SHOULD check the last modification date of individual resources to determine if they have changed and need updating.
{
…
"dateModified" : "2015-12-17",
…
}
The publication date is the date on which a digital publication was originally published. It represents a static event in the lifecycle of a publication and allows subsequent revisions to be identified and compared. It is expressed using the datePublished property.
Term | Description | Required Value | Value Category | [ schema.org ] Mapping |
---|---|---|---|---|
datePublished | Creation date of the publication. | A Date or DateTime value [ schema.org ], expressed in the ISO 8601 Date or Date Time format, respectively [ iso8601 ]. | Literal | datePublished (CreativeWork) |
The exact moment of publication is intentionally left open to interpretation: it could be when the publication is first made available or could be a point in time before publication when the publication is considered final.
{
…
"datePublished" : "2015-12-17",
"dateModified" : "2016-01-30",
…
}
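The following sketch suggests one way a date value could be checked against the two permitted forms. The helper name `parse_manifest_date` is hypothetical, and Python's `fromisoformat` methods are assumed to be an acceptable stand-in for full [ iso8601 ] parsing, which they only approximate.

```python
from datetime import date, datetime


def parse_manifest_date(value):
    """Return a date or datetime for value, or None if it is neither."""
    if not isinstance(value, str):
        return None
    try:
        # Plain calendar dates such as "2015-12-17".
        return date.fromisoformat(value)
    except ValueError:
        pass
    try:
        # Older Python versions reject a trailing "Z", so map it to an offset.
        return datetime.fromisoformat(value.replace("Z", "+00:00"))
    except ValueError:
        return None


published = parse_manifest_date("2015-12-17")
modified = parse_manifest_date("2016-01-30")
print(published <= modified)
```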
A digital publication has at least one natural language, which is the language that the content is expressed in (e.g., English, French, Chinese). The manifest includes the following property to set this concept, which can influence, for example, the behavior of a user agent (e.g., to preload a dictionary or text-to-speech engine).
Term | Description | Required Value | Value Category | [ schema.org ] Mapping |
---|---|---|---|---|
inLanguage | Default language for the publication. | One or more well-formed language tags [ bcp47 ]. | Array of Literals | inLanguage (Property) |
The natural language MUST be a well-formed language tag [ bcp47 ].
If a user agent requires the publication language and it is not available in the manifest, or the obtained value is not well-formed [ bcp47 ], the user agent MAY attempt to determine the publication language when generating its internal representation . This specification does not mandate how such a language tag is created. The user agent might:
If a user agent requires a primary language for the publication and more than one language is specified, the first entry in the inLanguage array MUST be recognized as the primary.
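A minimal sketch of selecting the primary language is given below. The function name `primary_language` is hypothetical, and the regular expression is a deliberately loose shape check that is assumed to be sufficient for illustration; it is not a complete [ bcp47 ] well-formedness test.

```python
import re

# Loose shape check: a 2-8 letter primary subtag followed by
# hyphen-separated alphanumeric subtags of 1-8 characters.
_TAG_SHAPE = re.compile(r"^[A-Za-z]{2,8}(?:-[A-Za-z0-9]{1,8})*$")


def primary_language(in_language):
    """Return the first inLanguage entry if it looks like a language tag."""
    if isinstance(in_language, str):
        # A single value is treated as a one-element array.
        in_language = [in_language]
    if not in_language:
        return None
    first = in_language[0]
    if isinstance(first, str) and _TAG_SHAPE.match(first):
        return first
    return None


print(primary_language(["fr-CA", "en"]))
```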
It is important to differentiate the language of the publication from the language of the individual resources that compose it. If such resources are, for example, in HTML, the language needs to be set in those resources, too. The language of the publication is not inherited.
The reading progression direction establishes the reading direction from one resource to the next within a digital publication. It is used to adapt such publication-level interactions as menu position, touch gestures, swipe direction, and tap zones for next and previous page.
The reading progression is expressed using the readingProgression property.
Term | Description | Required Value | Value Category | [ schema.org ] Mapping |
---|---|---|---|---|
readingProgression | Reading progression direction from one resource to the other. | One of: ltr or rtl. | Literal | (None) |
The value of this property MUST be either:
ltr : left-to-right; or
rtl : right-to-left.
The default value is ltr. If the readingProgression is not set, user agents MUST use the default value when generating their internal representation.
This property has no effect on the rendering of the individual primary resources; it is only relevant for the progression direction from one resource to the other.
{
…
"readingProgression" : "ltr",
…
}
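The default-value rule can be sketched as follows. The function name `reading_progression` is hypothetical, and treating an unrecognized value the same way as a missing one is an assumption of this sketch rather than something stated above.

```python
def reading_progression(manifest):
    """Return "ltr" or "rtl", falling back to the default of "ltr"."""
    value = manifest.get("readingProgression")
    # Assumption: unrecognized values are handled like a missing property.
    return value if value in ("ltr", "rtl") else "ltr"


print(reading_progression({"readingProgression": "rtl"}))
print(reading_progression({}))
```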
The title provides the human-readable name of a digital publication. It is expressed using the name property.
Term | Description | Required Value | Value Category | [ schema.org ] Mapping |
---|---|---|---|---|
name | Human-readable title of the publication. | One or more Text. | Array of Localizable Strings | name (Thing) |
If a title is not included in the manifest, the user agent MUST create one. The process for obtaining the title is defined in § 7.4.3 Add Default Values .
A user agent is not expected to produce a meaningful title [ wcag21 ] for a publication when one is not specified.
{
…
"name" : "Heart of Darkness",
…
}
Publication resources are specified via the default reading order , the resource list , and the links , as defined in this section. These lists contain references to informative resources like the privacy policy , and structural resources like the table of contents .
It is not necessary to include a reference to the manifest in any of these lists.
The default reading order is a specific progression through a set of digital publication resources. A user might follow alternative pathways through the content, but in the absence of such interaction the default reading order defines the expected progression from one resource to the next.
The default reading order is expressed using the readingOrder property.
Term | Description | Required Value | Value Category | [ schema.org ] Mapping |
---|---|---|---|---|
readingOrder | Order of progression through the resources of a digital publication. | One or more strings or LinkedResource objects. | Array of Linked Resources | (None) |
Each element of the readingOrder property MUST be expressed either as:
a string; or
a LinkedResource object.
A single string value represents an instance of a LinkedResource object whose url property is the string's text.
The order of items is significant .
The URLs expressed in the reading order MAY include fragment identifiers, although profiles of this specification MAY restrict both their use as well as what schemes and features are supported. Fragment identifiers are to be interpreted as defined by their respective specifications (e.g., the start location to move the user to, or the range of content to render before moving to the next item in the reading order).
Resources SHOULD NOT be listed more than once in the reading order, as this can lead to unexpected results in user agents (e.g., links to the resource might not resolve to the right instance in the reading order).
The default reading order MUST include at least one resource after processing of the manifest . Depending on the discovery method a profile uses, the default reading order might not need to be explicitly specified in the manifest (i.e., a default document might be automatically included). See § 7.4.3 Add Default Values for more information.
{
…
"readingOrder" : [
"html/title.html",
"html/copyright.html",
"html/introduction.html",
"html/epigraph.html",
"html/c001.html",
…
],
…
}
{
…
"readingOrder" : [
{
"type" : "LinkedResource",
"url" : "html/title.html",
"encodingFormat" : "text/html",
"name" : "Title page"
},
{
"type" : "LinkedResource",
"url" : "html/copyright.html",
"encodingFormat" : "text/html",
"name" : "Copyright page"
},
…
],
…
}
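The equivalence between the string and object forms shown in the two examples above can be sketched as a small normalization routine. The function name `normalize_linked_resources` is hypothetical, and silently skipping other value types is a simplification; a user agent would be expected to report such values.

```python
def normalize_linked_resources(items):
    """Expand URL strings into LinkedResource-style dictionaries."""
    normalized = []
    for item in items:
        if isinstance(item, str):
            # A string is shorthand for a LinkedResource with that url.
            normalized.append({"type": ["LinkedResource"], "url": item})
        elif isinstance(item, dict):
            normalized.append(item)
        # Other value types are skipped in this sketch.
    return normalized


print(normalize_linked_resources(["html/title.html", "html/copyright.html"]))
```

The order of the input is preserved, which matters for the reading order.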
The resource list enumerates any additional resources used in the processing or rendering of a digital publication that are not already listed in the default reading order. It is expressed using the resources property.
Term | Description | Required Value | Value Category | [ schema.org ] Mapping |
---|---|---|---|---|
resources | List of additional publication resources used in the processing or rendering of a publication. | One or more strings or LinkedResource objects. | Array of Linked Resources | (None) |
Each element of the resources property MUST be expressed either as:
a string; or
a LinkedResource object.
A single string value represents an instance of a LinkedResource object whose url property is the string's text.
The order of items is not significant .
To avoid conflicting information about a resource, a particular resource's URL SHOULD NOT be repeated within the resource list.
The URLs expressed in the resource list SHOULD NOT include fragment identifiers.
The completeness of the resource list can affect the usability of a digital publication in certain reading scenarios (e.g., the ability to read it offline). For this reason, it is strongly advised to provide a comprehensive list of all of the publication's constituent resources beyond those listed in the default reading order .
In some cases, a comprehensive list of these resources might not be easily achieved (e.g., third-party scripts that reference resources from deep within their source), but a user agent SHOULD still be able to render a publication even if some of these resources are not identified as belonging to the publication (e.g., if it is taken offline without them).
{
…
"resources" : [
"datatypes.html",
"datatypes.svg",
"datatypes.png",
"diff.html",
{
"type" : "LinkedResource",
"url" : "test-utf8.csv",
"encodingFormat" : "text/csv"
},
{
"type" : "LinkedResource",
"url" : "test-utf8-bom.csv",
"encodingFormat" : "text/csv"
},
…
],
…
}
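An authoring tool might flag the two practices discouraged above (repeated URLs and fragment identifiers) with a check such as the following. This is a sketch only: the function name `resource_list_warnings` and the message wording are hypothetical, and URLs are compared as written, without resolution against a base URL.

```python
from urllib.parse import urldefrag


def resource_list_warnings(resources):
    """Return warning messages for a resource list."""
    warnings = []
    seen = set()
    for item in resources:
        url = item if isinstance(item, str) else item.get("url", "")
        if urldefrag(url).fragment:
            warnings.append(f"fragment identifier in {url}")
        if url in seen:
            warnings.append(f"repeated URL {url}")
        seen.add(url)
    return warnings


print(resource_list_warnings(["datatypes.html", "datatypes.html#intro"]))
```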
The links list is used to provide a list of resources that are not required for the processing and rendering of a digital publication (i.e., the content of the publication remains unaffected even if these resources are not available). Links are expressed using the links property.
Term | Description | Required Value | Value Category | [ schema.org ] Mapping |
---|---|---|---|---|
links | List of resources associated with a publication but not required for its processing or rendering. | One or more strings or LinkedResource objects. | Array of Linked Resources | (None) |
Each element of the links property MUST be expressed either as:
a string; or
a LinkedResource object.
A single string value represents an instance of a LinkedResource object whose url property is the string's text.
The order of items is not significant .
It is RECOMMENDED to use LinkedResource objects with their rel values set.
Linked resources are typically made available to user agents to augment or enhance the processing or rendering, such as:
Links can also be used to identify resources used in the online rendering of a publication, but that are not essential to include when the publication is taken offline or packaged (e.g., to minimize the size). These include:
The links list SHOULD include resources necessary to render a linked resource (e.g., scripts, images, style sheets).
Resources listed in the links list MUST NOT be listed in the default reading order or resource list.
User agents MAY ignore linked resources, and are not required to take them offline with a publication. These resources SHOULD NOT be included when packaging a publication.
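The requirement that linked resources not also appear in the default reading order or resource list could be checked as sketched below. The function name `misplaced_links` is hypothetical, and URLs are assumed to be compared as written, without resolution against a base URL.

```python
def misplaced_links(manifest):
    """Return URLs in links that also appear in readingOrder or resources."""
    def urls(term):
        # Accept both the string form and the object form of each entry.
        return {item if isinstance(item, str) else item.get("url")
                for item in manifest.get(term, [])}

    inside = urls("readingOrder") | urls("resources")
    return sorted(url for url in urls("links")
                  if url is not None and url in inside)


manifest = {
    "readingOrder": ["chapter1.html"],
    "resources": [{"url": "style.css"}],
    "links": ["style.css", {"url": "https://example.org/policy.html"}],
}
print(misplaced_links(manifest))
```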
The manifest is designed to provide a basic set of properties for use by user agents in presenting and rendering a digital publication , but MAY be extended in the following ways:
This specification does not define how such additional properties are compiled, stored or exposed by user agents in their internal representation of the manifest. A user agent MAY ignore some or all extended properties.
The manifest MAY be extended through links to metadata records, such as an ONIX [ onix ] or BibTeX [ bibtex ] record, using a LinkedResource object, where:
the rel property of the LinkedResource includes a relevant identifier (e.g., if the linked record contains descriptive metadata, the describedby identifier [ iana-link-relations ] can be used);
encodingFormat identifies the MIME media type [ rfc2046 ] defined for that particular type of record, if applicable.
Linked records are included in the resource list when they are part of the publication (i.e., are needed for more than just manifest extensibility). Otherwise, they are included in the links list .
{
…
"links" : [
{
"type" : "LinkedResource",
"url" : "https://www.publisher.example.org/time-machine/onix.xml",
"encodingFormat" : "application/onix+xml",
"rel" : "describedby"
},
…
],
…
}
The application/onix+xml MIME type has not yet been registered by IANA at the time of writing this document, and is included in the example for illustrative purposes only.
Additional properties MAY be included directly in the manifest using public schemes like [ schema.org ] or [ dcterms ]. Proprietary terms MAY be used, but it is RECOMMENDED that such terms be included using Compact IRIs [ json-ld11 ], with prefixes defined as part of the context.
Proper use of prefixes and compact IRIs is necessary to use a manifest with a full JSON-LD processor, but is not a requirement for the processing algorithm defined by this specification. Validation of prefixed terms has to be carried out separately if full JSON-LD processing is expected.
{
"@context" : [
"https://schema.org",
"https://www.w3.org/ns/pub-context",
{
"language" : "en",
"ex" : "https://example.org/vocab#"
}
],
…
"ex:region" : "North America",
…
}
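To illustrate what a prefix declaration implies, the sketch below expands a compact IRI by simple concatenation. It is not a JSON-LD processor: the function name `expand_compact_iri` is hypothetical, and the namespace `https://example.org/vocab#` is an assumed value that ends in `#` so that concatenation yields a usable IRI.

```python
def expand_compact_iri(term, prefixes):
    """Expand prefix:suffix using the given prefix map, if the prefix is known."""
    prefix, separator, suffix = term.partition(":")
    # A suffix starting with "//" indicates an absolute IRI, not a compact one.
    if separator and prefix in prefixes and not suffix.startswith("//"):
        return prefixes[prefix] + suffix
    return term


prefixes = {"ex": "https://example.org/vocab#"}
print(expand_compact_iri("ex:region", prefixes))
```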
The Schema.org context file [ schema.org ] defines a number of prefixes for commonly used vocabularies, such as the Dublin Core Terms ( dcterms ) [ dcterms ] and Element Set ( dc ) [ dc11 ], the FOAF vocabulary ( foaf ) [ foaf ], and the Bibliographic Ontology ( bibo ) [ bibo ]. Properties from these vocabularies can be used without their prefixes having to be declared.
{
…
"copyrightYear" : "2015",
"copyrightHolder" : "World Wide Web Consortium",
…
}
{
…
"dcterms:subject" : ["Web data description languages","Data integration","Data Exchange"],
…
}
The cover is a resource that user agents can use to present a digital publication (e.g., in a library or bookshelf, or when initially loading the publication).
The cover is identified by the cover link relation.
The link to the cover MUST NOT be specified in the links list .
The cover term is not currently registered in the IANA link relations but the Working Group expects to add it.
{
…
"resources" : [
{
"type" : "LinkedResource",
"url" : "cover.html",
"encodingFormat" : "text/html",
"rel" : "cover"
},
…
],
…
}
If the cover is an image (whether embedded in an HTML resource or not), it is strongly advised to follow Success Criterion 1.1.1 [ wcag21 ] for the provision of alternative text and extended descriptions.
For image formats that do not provide the ability to embed this information, the name and description properties of LinkedResource can be used to provide alternative text and extended descriptions, respectively. In these cases, the name property SHOULD always be set; the property can be left empty for decorative images.
{
…
"resources" : [
{
"type" : "LinkedResource",
"url" : "whale-image.jpg",
"encodingFormat" : "image/jpeg",
"rel" : "cover",
"name" : "Moby Dick attacking hunters",
"description" : "A white whale is seen surfacing from the water to attack a small whaling boat"
},
…
],
…
}
{
…
"resources" : [
{
"type" : "LinkedResource",
"url" : "cover.jpg",
"encodingFormat" : "image/jpeg",
"rel" : "cover",
"name" : ""
},
…
],
…
}
If a user agent requires alternative text for a cover image to make an interface accessible, and the name property is not specified, it MAY attempt to construct the alternative text from the publication metadata. This specification does not mandate how such alternative text is created. One method is to construct the alternative text as a string that identifies the image as the cover, followed by the publication title.
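The method mentioned above might look like the following sketch. The function name `cover_alt_text` and the wording "Cover of" are hypothetical, and the cover's name is assumed to be a plain string rather than a localizable string object.

```python
def cover_alt_text(cover, title):
    """Return alternative text for a cover resource."""
    name = cover.get("name")
    if name is not None:
        # An explicitly empty name marks a decorative image and is kept as is.
        return name
    # Fall back to text built from the publication title, when there is one.
    return f"Cover of {title}" if title else "Cover"


print(cover_alt_text({"url": "cover.jpg", "rel": "cover"}, "Moby Dick"))
```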
Only one resource MAY be identified as the cover, but additional covers MAY be specified using the alternate property (e.g., to provide alternative dimensions or resolution).
{
…
"resources" : [
{
"type" : "LinkedResource",
"url" : "lilliput.jpg",
"encodingFormat" : "image/jpeg",
"rel" : "cover",
"alternate" : [
{
"type" : "LinkedResource",
"url" : "lilliput.svg",
"encodingFormat" : "image/svg+xml",
"rel" : "cover"
}
]
},
…
],
…
}
The page list is a navigational aid that contains a list of static page demarcation points within a digital publication .
The page list is identified by the pagelist link relation.
The pagelist term is not currently registered in the IANA link relations but the Working Group expects to add it.
Only one resource MAY be identified as containing a page list. If multiple instances are specified, user agents MUST use the first instance encountered, with precedence given to the reading order .
The link to the page list MUST NOT be specified in the links list .
{
…
"resources" : [
{
"type" : "LinkedResource",
"url" : "toc_file.html",
"rel" : "pagelist"
},
…
],
…
}
The table of contents is a navigational aid that provides links to the major structural sections of a digital publication .
The resource that contains the table of contents is identified by the contents link relation [ iana-link-relations ]. The table of contents proper is the first element inside that resource with the role value doc-toc, as defined in § C.2 HTML Structure.
Only one resource MAY be identified as containing the table of contents. If multiple instances are specified, user agents MUST use the first instance encountered, with precedence given to resources in the reading order .
Profiles of this specification MAY define how to locate a resource containing the table of contents when no resource is identified by the contents relation.
The link to the table of contents MUST NOT be specified in the links list .
The RECOMMENDED structure and processing model for the table of contents is defined in § C. Machine-Processable Table of Contents .
{
…
"resources" : [
{
"type" : "LinkedResource",
"url" : "toc_file.html",
"rel" : "contents"
},
…
],
…
}
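The "first instance encountered, with precedence given to the reading order" rule can be sketched as a lookup that scans the reading order before the resource list. The function name `find_by_rel` is hypothetical, and the sketch assumes entries have not yet been normalized, so rel may be a string or a list.

```python
def find_by_rel(manifest, rel):
    """Return the first resource carrying rel, checking readingOrder first."""
    for term in ("readingOrder", "resources"):
        for item in manifest.get(term, []):
            if not isinstance(item, dict):
                continue  # Plain URL strings carry no rel value.
            rels = item.get("rel", [])
            if isinstance(rels, str):
                rels = [rels]
            if rel in rels:
                return item
    return None


manifest = {
    "readingOrder": ["c001.html", {"url": "toc_file.html", "rel": "contents"}],
    "resources": [{"url": "other-toc.html", "rel": ["contents"]}],
}
print(find_by_rel(manifest, "contents")["url"])
```

The same lookup would apply to the pagelist relation.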
An accessibility report provides information about the suitability of a digital publication for consumption by users with varying preferred reading modalities. These reports typically identify the result of an evaluation against established accessibility criteria, such as those provided in [ wcag21 ], and are an important source of information in determining the usability of a publication.
An accessibility report is identified using the accessibility-report link relation.
The accessibility-report term is not currently registered in the IANA link relations but the Working Group expects to add it.
It is RECOMMENDED that the report be included as a resource of the publication so that it is available, for example, when a publication is read offline.
Providing the accessibility report in a human-readable format, such as HTML [ html ], helps ensure that it can be accessed and understood by users. Augmenting the report with machine-processable metadata, such as provided in Schema.org [ schema.org ], will additionally aid in machine processing.
{
…
"links" : [
{
"type" : "LinkedResource",
"url" : "https://www.publisher.example.org/sherlock-holmes-accessibility.html",
"rel" : "accessibility-report"
},
…
],
…
}
Not all digital publications will be available to all users (e.g., they might be restricted to registered users of a site). In such cases, the publisher might wish to provide a preview of the content in order to entice users to access the full version.
A preview is identified using the preview link relation [ iana-link-relations ].
Previews MAY be located externally or included as resources of digital publications.
{
…
"links" : [
{
"type" : "LinkedResource",
"url" : "preview.mp3",
"encodingFormat" : "audio/mpeg",
"rel" : "preview"
},
…
],
…
}
{
…
"links" : [
{
"type" : "LinkedResource",
"url" : "https://publisher.example.org/jekyll-hyde-preview.html",
"encodingFormat" : "text/html",
"rel" : "preview"
},
…
],
…
}
Users often have the legal right to know and control what information is collected about them, how such information is stored and for how long, whether it is personally identifiable, and how it can be expunged. Including a statement that addresses such privacy concerns is consequently an important part of publishing digital publications . Even if no information is collected, such a declaration increases the trust users have in the content.
A link to a privacy policy can be included in the manifest for this purpose. It is RECOMMENDED that the privacy policy be included as a resource of the publication so that it is available, for example, when a publication is read offline.
A privacy policy is identified using the privacy-policy link relation [ iana-link-relations ].
{
…
"links" : [
{
"type" : "LinkedResource",
"url" : "https://www.w3.org/Consortium/Legal/privacy-statement-20140324",
"encodingFormat" : "text/html",
"rel" : "privacy-policy"
},
…
],
…
}
If additional relations beyond those defined in this specification need to be expressed, the rel property can be extended in one of the following ways:
The list of unique resources belonging to a digital publication (its bounds) is obtained from the union of resources listed in the readingOrder and resources, including any alternate resources. The exact process for creating this list is described in the manifest processing algorithm.
All other resources are outside the bounds of the digital publication (e.g., resources listed in the links section and hyperlinks in the content to external resources on the Web).
This specification does not place any restrictions on publication resources, but profiles of this specification MAY restrict both the content type and location of resources.
User agents MAY opt to process and render resources differently depending on whether or not they are within the bounds of a digital publication (e.g., exclude external resources from an offline or packaged version of a publication).
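As a rough approximation of how the bounds could be collected, the sketch below gathers the unique URLs from readingOrder and resources together with their alternate entries. The function name `publication_bounds` is hypothetical, URLs are compared as written rather than resolved against a base URL, and the processing algorithm remains the authoritative description.

```python
def publication_bounds(manifest):
    """Return the unique URLs within the bounds, in order of first appearance."""
    bounds = []

    def visit(item):
        if isinstance(item, str):
            url, alternates = item, []
        elif isinstance(item, dict):
            url, alternates = item.get("url"), item.get("alternate", [])
        else:
            return
        if url and url not in bounds:
            bounds.append(url)
        for alternate in alternates:
            visit(alternate)

    # Resources in links are deliberately not visited.
    for term in ("readingOrder", "resources"):
        for item in manifest.get(term, []):
            visit(item)
    return bounds


manifest = {
    "readingOrder": ["a.html", {"url": "b.html", "alternate": [{"url": "b.mp3"}]}],
    "resources": ["a.html", "style.css"],
    "links": ["https://example.org/policy.html"],
}
print(publication_bounds(manifest))
```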
Links to the manifest MUST take one or both of the following forms:
An HTTP Link header field [ rfc5988 ] with its rel parameter set to the value "publication".
Link: <https://example.com/webpub/manifest>; rel=publication
A link element [ html ] with its rel attribute set to the value "publication".
<link href="https://example.com/webpub/manifest" rel="publication" />
When a manifest is embedded within an HTML document, the link MUST include a fragment identifier that references the script element that contains the manifest (see § 6.2 Embedding).
<link href="#example_manifest" rel="publication">
…
<script id="example_manifest" type="application/ld+json">
{
"@context" : ["https://schema.org", "https://www.w3.org/ns/pub-context"],
…
}
</script>
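Locating the manifest link in an HTML document could be sketched with Python's standard `html.parser` module, as below. The class name `ManifestLinkFinder` and the function name `find_manifest_href` are hypothetical, and resolving the returned value (a URL or a fragment identifier) is left out.

```python
from html.parser import HTMLParser


class ManifestLinkFinder(HTMLParser):
    """Record the href of the first link element with rel="publication"."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.href is not None:
            return
        attributes = dict(attrs)
        # rel is a space-separated list of link types.
        rels = (attributes.get("rel") or "").lower().split()
        if "publication" in rels:
            self.href = attributes.get("href")


def find_manifest_href(html_text):
    finder = ManifestLinkFinder()
    finder.feed(html_text)
    return finder.href


print(find_manifest_href('<link href="#example_manifest" rel="publication">'))
```

A value beginning with `#` points at a script element in the same document, as in the example above.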
When a digital publication format allows manifests to be embedded within an HTML document, the manifest MUST be included in a script element [ html ] whose type attribute is set to application/ld+json [ json-ld11 ].
<script type="application/ld+json">
{
"@context" : ["https://schema.org", "https://www.w3.org/ns/pub-context"],
…
}
</script>
Digital publication formats MAY define alternative methods of discovering a manifest that do not involve linking to, or embedding, a manifest (e.g., the manifest could be discovered through the use of a restricted name and/or location). This specification does not add any restrictions on such methods.
This section depends on the Infra Standard [ infra ].
This section is non-normative.
Although a digital publication's manifest is authored as [ json-ld11 ], the steps for processing a manifest described in this section detail how a user agent transforms the manifest into its internal representation of the data. The algorithm describes the process using the terminology and data types defined in [ infra ], and, if successful, results in an [ infra ] map of the data being returned.
An actual implementation of this algorithm will use the corresponding constructs and data types of whatever language is used.
The following error types are used in the processing algorithm :
User agents SHOULD expose both validation and fatal errors, but this specification does not prescribe the manner in which this is done.
For validation errors, user agents SHOULD differentiate the severity of the error (i.e., whether a required or recommended practice has been violated).
Some steps in the processing algorithm depend on the expected value category of a term, so the context in which a term is used can affect processing (e.g., url expects an Array of URLs only when it is a direct property of the Publication Manifest). To differentiate these uses, a context is provided to certain function calls. This context is set to the type of object that initiates the processing call.
The default list of recognized types includes Person, Organization and LinkedResource. Profiles MAY extend this list to include additional object types.
If a context is not provided to a function, the term being processed is considered part of the global context (i.e., it is a direct child of the manifest).
When extending the list of recognized types, the normalize data function might also need to be extended to ensure that all objects have their type specified (e.g., when string values are automatically expanded to objects).
This algorithm takes the following arguments:
This algorithm does not describe how the manifest is discovered and obtained. The steps by which to do so are defined by each digital publication format.
To generate the internal representation , run the following steps:
Let processed be an empty map that will contain the internal representation of the manifest.
Let manifest be the result of parsing JSON into Infra values given text . If manifest is not a map , fatal error , return failure.
( § 4.3 Manifest Contexts ) If manifest["@context"] is not set to a list, or the first and second items in manifest["@context"] are not the string values "https://schema.org" and "https://www.w3.org/ns/pub-context", in this order, fatal error, return failure.
If the context URLs are not set as expected, the JSON data does not represent a publication manifest.
( § 4.6 Profile Conformance ) Let processed["profile"] be the profile the manifest conforms to. Set processed["profile"] as follows:
If manifest["conformsTo"] is not set, or does not include a profile the user agent recognizes as capable of processing and/or rendering, the user agent SHOULD inspect the media type(s) of the resources in the reading order to determine if the publication matches a profile it is capable of processing or rendering. If so, validation error , set processed["profile"] to the matching profile. Otherwise, fatal error , return failure.
Otherwise, set processed["profile"] to the first URL in manifest["conformsTo"] the user agent is capable of processing and/or rendering.
The profile the publication conforms to determines any additional extension steps that have to be performed during processing. These steps are defined by their respective specifications.
The new term profile is created because conformsTo is not restricted to profile identifiers (i.e., the new term provides a persistent identifier of the profile within the internal representation).
( § 4.4.1 Global Declarations ) Let lang be the global language and dir be the global direction obtained from this step. Set each initially to an empty string .
For each context of manifest["@context"] , moving from the last item to the first, if context is a map :
If lang is neither an empty string nor a well-formed [ bcp47 ] language tag, validation error , set lang to an empty string.
If dir is neither an empty string nor one of the values "ltr" or "rtl", validation error, set dir to an empty string.
The global language and direction declarations obtained here are used to set the language and base direction, respectively, for localizable strings without a declaration.
The iterator moves backwards through @context as the last language and direction declarations override any earlier ones.
( § 4.3 Manifest Contexts ) If a profile requires additional validation of the manifest context, those steps are performed here.
This extension step allows verification of any information a profile requires be present in the manifest context (e.g., additional context URLs or parameters). These steps have to be performed at this point, as @context terms are removed as part of the data normalization in the next step.
A more general step for processing profile data is provided at a later step.
For each term → value of manifest , set processed[term] to the result, when successful, of calling normalize data given term , value , lang , dir and base . If failure is returned, do not add term to processed .
The data normalization steps standardize the incoming manifest data to remove any authoring conveniences, such as the ability to use strings where objects or arrays are expected. The resulting processed data are added to the processed variable and are operated on in subsequent steps.
Set processed to the result of running data validation given processed .
The data validation checks ensure that the incoming data matches its expected value categories. Any restrictions on the expected values are also enforced at this step, and any invalid data is removed from the final representation.
Set processed to the result of running add default values , when successful, given processed and document , when specified. Otherwise, terminate processing, return failure.
This step checks if any information missing from the manifest can be obtained from the HTML document that links to the manifest, or from other sources.
If a profile specifies additional processing functions that need to be run, those steps are executed at this point.
Return processed .
For a visualization of the resulting structure, see § A. Internal Representation Data Models .
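The opening steps of the algorithm (parsing, the context check, and collecting the global language and direction) might be sketched as follows. The function name `start_processing` is hypothetical. Taking the `language` and `direction` entries from each context map is an assumption of this sketch, the well-formedness check on the language tag is omitted, and Python dictionaries and lists stand in for [ infra ] maps and lists.

```python
import json

REQUIRED_CONTEXT = ["https://schema.org", "https://www.w3.org/ns/pub-context"]


def start_processing(text):
    """Return (manifest, lang, direction), or None where a fatal error occurs."""
    try:
        manifest = json.loads(text)
    except json.JSONDecodeError:
        return None
    if not isinstance(manifest, dict):
        return None  # The manifest has to be a map.

    context = manifest.get("@context")
    if not isinstance(context, list) or context[:2] != REQUIRED_CONTEXT:
        return None  # Not a publication manifest.

    lang, direction = "", ""
    # Walk backwards so that the last declaration takes precedence.
    for entry in reversed(context):
        if isinstance(entry, dict):
            if not lang and isinstance(entry.get("language"), str):
                lang = entry["language"]
            if not direction and isinstance(entry.get("direction"), str):
                direction = entry["direction"]
    if direction not in ("", "ltr", "rtl"):
        direction = ""  # Treated as a validation error in the algorithm.
    return manifest, lang, direction
```

The remaining steps (normalization, validation, default values) would operate on the returned values.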
To normalize data for a property term 's value , with the global language lang , global direction dir , base URL base , and optional context context run these steps:
Let normalized be the value of value .
The data normalization steps are performed on the copy of the incoming value held in the normalized variable defined in this step. This variable is returned at the end of a successful normalization process.
( § 4.3 Manifest Contexts ) If term is @context , return failure.
@context provides information for the initial processing of the manifest, but is not retained in the internal data representation. Returning a failure signals to remove the term.
( § 4.2.7 Arrays ) If, depending on context , term expects an array and value is not a list , set normalized to the list : « value ».
A number of terms require their values to be arrays, but, for the sake of convenience, authors are allowed to use a single value instead of a one element array. For example,
{
…
"name" : "Et dukkehjem",
"author" : "Henrik Ibsen",
…
}
yields:
«[
…
"name" → « "Et dukkehjem" »,
"author" → « "Henrik Ibsen" »,
…
]»
( § 4.2.4.2 Entities ) If, depending on context , term expects an array of entities , for each entity of normalized :
if entity is a string , set entity to the map :
«[
"type" → « "Person" »,
"name" → entity
]»
otherwise, if entity is not a map , validation error , remove entity from normalized .
otherwise, if entity["type"] is not set, set it to the list : « "Person" ». If entity["type"] is set but does not include the value Person or Organization, append the value Person to the list.
Creators (authors, editors, etc.), are expected to be explicitly defined as an object, but, for the sake of convenience, only their name has to be specified in the manifest. For example:
{
…
"author": "Ralph Ellison",
…
}
This rule converts such string values to maps with a default type of Person, yielding the following for the preceding example:
«[
…
"author" → «
«[
"type" → « "Person" »,
"name" → "Ralph Ellison"
]»
»,
…
]»
For simplicity, the conversion of name to a localizable string is described by a later step.
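The entity rule could be expressed in code roughly as follows. The function name `normalize_entities` is hypothetical, Python dictionaries and lists stand in for [ infra ] maps and lists, and wrapping a single type string in a list is assumed to have been handled by array normalization.

```python
def normalize_entities(values):
    """Expand name strings to Person maps and ensure each entity has a type."""
    normalized = []
    for entity in values:
        if isinstance(entity, str):
            normalized.append({"type": ["Person"], "name": entity})
            continue
        if not isinstance(entity, dict):
            continue  # Validation error: the value is dropped.
        entity = dict(entity)  # Work on a copy of the incoming value.
        types = entity.get("type")
        if types is None:
            entity["type"] = ["Person"]
        else:
            types = [types] if isinstance(types, str) else list(types)
            if "Person" not in types and "Organization" not in types:
                types.append("Person")
            entity["type"] = types
        normalized.append(entity)
    return normalized


print(normalize_entities(["Ralph Ellison"]))
```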
( § 4.2.4.1 Localizable Strings ) If, depending on context , term expects an array of localizable strings , for each item of normalized :
if item is a string , set item to the map :
«[
"value" → item,
"language" → lang,
"direction" → dir
]»
if lang or dir is not set, or is an empty string , remove item["language"] or item["direction"] , respectively.
otherwise, if item is not a map , validation error , remove item from normalized .
otherwise, process the map in item as follows:
If item["language"] is not set, set it to the value of lang when lang is set and is not an empty string .
Otherwise, if item["language"] is null , remove item["language"] .
If item["direction"] is not set, set it to the value of dir when dir is set and is not an empty string .
Otherwise, if item["direction"] is null , remove item["direction"] .
Natural language text values are expected to be explicitly defined as localizable string objects, but, for the sake of convenience, can be simple strings in the manifest. For example, if no language information has been provided via the global language declaration then:
{
"@context" : ["https://schema.org", "https://www.w3.org/ns/pub-context"],
"name" : ["La Comédie humaine"],
…
}
yields:
«[
"name" → «
«[
"value" → "La Comédie humaine"
]»
»,
…
]»
If, however, an explicit language has been provided in the manifest, that language is added to the localizable string object. For example,
{
"@context" : [
"https://schema.org",
"https://www.w3.org/ns/pub-context",
{"language": "fr"}
],
"name" : ["La Comédie humaine"],
…
}
yields:
«[
"name" → «
«[
"value" → "La Comédie humaine",
"language" → "fr"
]»
»,
…
]»
A local setting or a local null value prevents the global value from taking effect.
{
"@context" : [
"https://schema.org",
"https://www.w3.org/ns/pub-context",
{"language":"fr"}
],
…
"name" : [{
"value" : "La Comédie humaine"
}],
"publisher" : [{
"type":["Organization"],
"name":[{
"value": "Hachette",
"language": null
}]
}],
…
}
yields:
«[
"name" → «
«[
"value" → "La Comédie humaine",
"language" → "fr"
]»
»,
"publisher" → «
«[
"type" → « "Organization" »,
"name" → «
«[
"value" → "Hachette"
]»
»
]»
»,
…
]»
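The interaction between global and local language settings can be sketched in Python. The function name `normalize_localizable` and its parameters are hypothetical; `lang` and `direction` stand in for the global values taken from the context, and Python's `None` stands in for a JSON null.

```python
def normalize_localizable(items, lang=None, direction=None):
    """Turn strings into localizable string maps and apply global defaults."""
    result = []
    for item in items:
        if isinstance(item, str):
            item = {"value": item}
            if lang:
                item["language"] = lang
            if direction:
                item["direction"] = direction
        elif isinstance(item, dict):
            item = dict(item)  # work on a copy of the input map
            for key, default in (("language", lang), ("direction", direction)):
                if key not in item:
                    if default:
                        item[key] = default
                elif item[key] is None:
                    # A local null blocks the global value and is then removed.
                    del item[key]
        else:
            # Validation error in the specification: the entry is dropped.
            continue
        result.append(item)
    return result


title = normalize_localizable(["La Comédie humaine"], lang="fr")
publisher_name = normalize_localizable([{"value": "Hachette", "language": None}], lang="fr")
```

Here `title` receives the global language, while `publisher_name` ends up with no language at all because of the local null.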
( § 4.2.4.3 Linked Resources ) If, depending on context , term expects an array of LinkedResources , for each resource of normalized :
if resource is a string, convert resource to the map :
«[
"type" → « "LinkedResource" »,
"url" → resource
]»
otherwise, if resource is not a map , validation error , remove resource from normalized .
otherwise,
if resource["type"] is not set, set it to the list: « "LinkedResource" ».
If resource["type"] is set but does not include the value LinkedResource, append that value to the list.
Resource links are expected to be explicitly defined as an object of type LinkedResource, but, for the sake of convenience, only their absolute or relative URL has to be specified in the manifest. For example,
{
…
"resources" : [
"css/book.css",
…
],
…
}
This step converts the string values to objects, yielding the following for the preceding example:
«[
…
"resources" → «
«[
"type" → « "LinkedResource" »,
"url" → "css/book.css"
]»,
…
»,
…
]»
For simplicity, the conversion of relative paths to absolute is described by a later step.
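A minimal sketch of the linked resource rule, under the same assumptions as the other normalization steps: the function name `normalize_linked_resources` is hypothetical, and `type` is assumed to be a list already.

```python
def normalize_linked_resources(resources):
    """Expand URL strings into LinkedResource maps and ensure the type is present."""
    result = []
    for resource in resources:
        if isinstance(resource, str):
            resource = {"type": ["LinkedResource"], "url": resource}
        elif not isinstance(resource, dict):
            # Validation error in the specification: the entry is dropped.
            continue
        elif "type" not in resource:
            resource = {**resource, "type": ["LinkedResource"]}
        elif "LinkedResource" not in resource["type"]:
            resource = {**resource, "type": [*resource["type"], "LinkedResource"]}
        result.append(resource)
    return result


resources = normalize_linked_resources(["css/book.css"])
```

The URL is left relative at this point; resolving it against the base is a separate step.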
( § 4.2.5 URLs ) If, depending on context , term expects a URL or array of URLs :
if normalized is a string , set normalized to the result of running convert to absolute URL , when successful, given normalized and base . If failure is returned, return failure.
otherwise, if normalized is a list , for each item of normalized , set item to the result of running convert to absolute URL , when successful, given item and base . If failure is returned, remove item from normalized .
otherwise, validation error , return failure.
Relative URLs in the manifest are resolved against the base value to obtain absolute URLs . For example:
"url" : "chapter01.html"
for a publication hosted at https://example.org/publications/wuthering-heights/ would yield:
"url" → "https://example.org/publications/wuthering-heights/chapter01.html"
( § 8. Modular Extensions , extension point) If a profile defines processing steps for profile-specific terms, those steps are executed at this point.
Recursively check normalized as follows to ensure that all properties get normalized:
if normalized is a list , for each item of normalized that is a map :
if item["type"] is set and includes a recognized type , for each key → keyValue of item , set item[key] to the result of running normalize data , when successful, given key , keyValue , lang , dir , base and using item["type"] as the context. If failure is returned, remove key from item .
otherwise, do nothing.
otherwise, if normalized is a map :
if normalized["type"] is set and includes a recognized type , for each key → keyValue of normalized , set normalized[key] to the result of running normalize data , when successful, given key , keyValue , lang , dir , base and using normalized["type"] as the context. If failure is returned, remove key from normalized .
otherwise, do nothing.
otherwise, do nothing.
In order to ensure that all the properties in the manifest get processed, this step recursively checks normalized for additional map entries to process. If normalized is a list, each item is inspected to determine if it is a map that can be processed.
If a failure is returned, the item is removed from the map.
return normalized .
To convert to absolute URL url , with a base URL base , run the following steps:
If url or base is not a string , or is an empty string, validation error , return failure.
This step checks that both url and base are non-empty strings before attempting to use them.
Set url to the result of running the URL parser [ url ], when successful, with url as input and base as the base URL. If failure is returned, validation error , return failure.
This step calls the URL parser function on the url to be processed. If the url is not an absolute URL, the parser converts it to one using the base URL.
If parsing returns a failure, a failure is returned to the caller to indicate to remove the URL.
Return url .
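The conversion steps can be approximated in Python as shown below. This is only a sketch: it uses `urllib.parse.urljoin`, which follows RFC 3986 resolution rather than the URL parser of [ url ], so edge cases can differ. The function name `convert_to_absolute_url` and the use of `None` to signal failure are assumptions of this example.

```python
from urllib.parse import urljoin, urlsplit


def convert_to_absolute_url(url, base):
    """Return an absolute URL string, or None to signal failure."""
    if not isinstance(url, str) or not isinstance(base, str) or not url or not base:
        return None  # validation error: both inputs must be non-empty strings
    try:
        absolute = urljoin(base, url)
        has_scheme = bool(urlsplit(absolute).scheme)
    except ValueError:
        return None  # the input could not be parsed
    # A result without a scheme is still relative, so treat it as a failure.
    return absolute if has_scheme else None


resolved = convert_to_absolute_url(
    "chapter01.html", "https://example.org/publications/wuthering-heights/"
)
```

Note that the trailing slash on the base matters: without it, the last path segment of the base is replaced rather than extended.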
To perform data validation on map data , run the following steps:
For each term → value of data , set term to the result of running the global data checks , when successful, given term and value . If failure is returned, remove data[term] .
This step passes each entry to a set of global validation checks that need to be run on the value and recursively on any properties within the value.
A failure is returned if the property is invalid and has to be removed.
If a profile specifies data validation checks, those steps are executed at this point.
Profile validation steps are prioritized over the default steps so that if profiles have, for example, different default values to apply, those values get applied.
( § 4.5 Publication Types ) If data["type"] is not set or is an empty list, validation error, set to « "CreativeWork" ».
( § 4.7.1.2 Accessibility ) If data["accessModeSufficient"] is set, for each item of data["accessModeSufficient"], if item["type"] is not set or does not contain "ItemList", remove item from data["accessModeSufficient"].
( § 4.7.1.4 Canonical Identifier ) If data["id"] is not set or is an empty string , validation error .
( § 4.7.1.6 Duration ) If data["duration"] is set and is not a valid duration value, per [ iso8601 ], validation error , remove data["duration"] .
( § 4.7.1.7 Last Modification Date ) If data["dateModified"] is set and is not a valid date or date-time per [ iso8601 ], validation error , remove data["dateModified"] .
( § 4.7.1.8 Publication Date ) If data["datePublished"] is set and is not a valid date or date-time per [ iso8601 ], validation error , remove data["datePublished"] .
( § 4.7.1.9 Publication Language ) If data["inLanguage"] is set, for each item of data["inLanguage"] , if item is not well-formed [ bcp47 ], validation error , remove item from data["inLanguage"] .
( § 4.7.1.10 Reading Progression Direction ) If data["readingProgression"] is not set, set to "ltr". Otherwise, if it is not one of the required directional values, validation error, set to "ltr".
( § 5. Publication Resources ) Obtain and verify the unique URLs within the publication bounds as follows:
If readingOrder is set, let readingOrderURLs be the result of running get unique URLs given readingOrder . Otherwise, let readingOrderURLs be an empty ordered set .
If resources is set, let resourcesURLs be the result of running get unique URLs given resources . Otherwise, let resourcesURLs be an empty ordered set .
Set data["uniqueResources"] to the union of readingOrderURLs and resourcesURLs .
This step gets the list of unique URLs within the reading order and the resource list. It then sets data["uniqueResources"] to the union of these two sets, which represents the complete list of unique resources within the bounds of the publication .
( § 4.7.2.3 Links ) If data["links"] is set, for each link in data["links"] :
let url be the result of running URL serializer [ url ] on link["url"] with the exclude fragment flag set.
if data["uniqueResources"] contains url , validation error , remove link from data["links"] , then continue .
if link["rel"] is not set or is an empty list , validation error , then continue .
if link["rel"] contains any of the values "contents", "pagelist" or "cover", validation error, remove link from data["links"].
After obtaining the list of unique publication resources in the previous step, the links property is checked to ensure that any linked resources are not also listed as publication resources.
If the link does not specify a rel value, a warning is raised. If its rel property specifies a structural resource, the link is removed, as structural resources have to be within the publication bounds.
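The link checks might look like the following in Python. The function name `check_links` and the message strings are hypothetical, and `urllib.parse.urldefrag` is used here as a stand-in for serializing a URL with the exclude fragment flag set.

```python
from urllib.parse import urldefrag

STRUCTURAL_RELS = {"contents", "pagelist", "cover"}


def check_links(links, unique_resources):
    """Return the links that survive the checks, plus any messages raised."""
    kept, messages = [], []
    for link in links:
        url = urldefrag(link["url"]).url  # compare without the fragment
        if url in unique_resources:
            messages.append(f"removed {url}: also a publication resource")
            continue
        rel = link.get("rel") or []
        if not rel:
            # The link is kept, but a warning is raised.
            messages.append(f"warning for {url}: no rel value")
        elif STRUCTURAL_RELS & set(rel):
            messages.append(f"removed {url}: structural rel on a link")
            continue
        kept.append(link)
    return kept, messages
```

A link without a rel value is reported but retained, whereas a link that duplicates a publication resource or claims a structural relation is dropped.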
( § 4.8.1 Structural Resources ) Verify the use of structural relations as follows:
Set resources to the value of data["readingOrder"] , when defined, otherwise to an empty list . Extend resources with data["resources"] , when defined.
If more than one item in resources has a rel entry that contains the value "contents", validation error.
If more than one item in resources has a rel entry that contains the value "pagelist", validation error.
If more than one item in resources has a rel entry that contains the value "cover", validation error.
If the cover(s) have an encodingFormat entry that specifies an image media type (image/*), and do not have a name entry, validation error.
This checks the resources specified in the reading order and resource list to verify that only one instance each of a table of contents, page list and cover has been specified.
For covers, it also checks that a name has been set on image-based formats for accessibility purposes.
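A sketch of the structural relation checks, assuming normalized data in which `rel` is a list and `name` is a list of localizable strings. The function name `check_structural_resources` and the error strings are hypothetical.

```python
def check_structural_resources(data):
    """Collect validation errors for duplicated or unlabeled structural resources."""
    resources = list(data.get("readingOrder", [])) + list(data.get("resources", []))
    errors = []
    for rel in ("contents", "pagelist", "cover"):
        matches = [r for r in resources if rel in r.get("rel", [])]
        if len(matches) > 1:
            errors.append(f"more than one '{rel}' resource")
        if rel == "cover":
            for cover in matches:
                is_image = cover.get("encodingFormat", "").startswith("image/")
                if is_image and not cover.get("name"):
                    errors.append(f"cover image without a name: {cover.get('url')}")
    return errors
```

Nothing is removed by these checks; they only report problems, which matches the steps above where each condition raises a validation error without a removal.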
For each term → value of data , if running remove empty arrays given the variables term and value returns failure, remove data[term] .
As the processing of the manifest involves removing invalid values at various stages, the final data structure might end up with some lists that no longer contain any values. This step iterates back over the data and removes any such empty lists.
Return data .
To process the global data checks on a property term 's value with an optional context context , run these steps:
( § 4.2 Value Categories ) If term has a known value category , set value to the result of calling verify value category , when successful, given the variables term , value and context . If failure is returned, return failure.
Otherwise, return value .
This step verifies that the value of the term matches the expected category required for the term. For example, the abridged term requires a boolean value, so any other value used with the term will result in a failure.
If a failure occurs calling the function, this step also returns a failure so that the property is removed from the final data set.
Terms without a known value category are not processed, so the incoming value is returned.
Recursively descend into value as follows to check any sub-properties first:
if value is a map :
if value["type"] includes a recognized type , for each key → keyValue of value , set value[key] to the result of running global data checks , when successful, given key , keyValue and using value["type"] as the context. If failure is returned, remove value[key] .
otherwise, do nothing.
otherwise, if value is a list , for each item of value , if item is a map :
if item["type"] includes a recognized type , for each key → keyValue of item , set item[key] to the result of running global data checks , when successful, given key , keyValue and using item["type"] as the context. If failure is returned, remove item[key] .
otherwise, do nothing.
otherwise, do nothing.
In order to ensure that all the properties in the manifest get processed, this step recursively checks each entry for additional map entries to process. If the value is a list, each item is inspected to determine if it is a map that can be processed.
Its placement also ensures that all subproperties are checked first, so that the higher-level checks later in the step are tested after any invalid values are removed.
( § 4.4.1 Global Declarations and § 4.4.2 Item-Specific Declarations ) If term expects an array of LocalizableStrings, for each item of value:
if item["value"] is not set, remove item from value .
if item["language"] is set and its value is not well-formed [ bcp47 ], validation error , remove item["language"] .
if item["direction"] is set and its value is not one of "ltr" or "rtl", validation error, remove item["direction"].
This step checks that localizable strings have values, that their language declarations are well formed, and that their direction declarations have either the value "ltr" or "rtl".
( § 4.2.4.2 Entities ) If term expects an array of entities , for each item of value , check whether item["name"] is set:
If not, validation error , remove item from value .
This step ensures that all entities have a name. Entities without a name are removed.
( § 4.2.4.3 Linked Resources ) If term expects an array of LinkedResources, for each resource of value:
if resource["url"] is not set, or its value is an empty string, validation error , remove resource from value , then continue .
Otherwise, if resource["url"] is not a valid URL [ url ], validation error , remove resource from value , then continue .
if resource["duration"] is set and is not a valid duration value, per [ iso8601 ], validation error , remove resource["duration"] .
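This step performs the following two checks on the terms of a LinkedResource: that its url is set and is a valid URL (if not, the LinkedResource is removed), and that its duration, when set, is a valid duration value (if not, the duration is removed).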
Return value .
To verify value category of a property term 's value with a context context , run these steps:
If, depending on the context , term expects an array :
if value is not a list , validation error , return failure.
otherwise, for each item of value :
if item does not match the expected value category of the array, validation error , remove item from value , then continue .
if item is a map , for each key → keyValue of item , if key has an expected value category, set key to the result of running verify value category given key , keyValue , and using item["type"] as the context. If the result of processing item is an empty map , validation error , remove item from value .
If the result of processing value is an empty array , validation error , return failure.
Otherwise, if, depending on the context , term expects a map :
if value is not a map , validation error , return failure.
otherwise, for each key → keyValue of value , if key has an expected value category, set key to the result of running verify value category given key , keyValue and using value["type"] as the context. If the result of processing value is an empty map , validation error , return failure.
Otherwise, if, depending on the context , value does not match the expected value category of term , validation error , return failure.
Return value .
This function checks that the value of the term being processed matches its expected value category. The function is recursively called when the value is a list or map to ensure that all properties in the manifest get checked.
To get unique URLs from resources , run the following steps:
Let uniqueURLs be an empty ordered set .
For each resource of resources :
let url be the result of running URL serializer [ url ] on resource["url"] with exclude fragment flag set.
if uniqueURLs contains url , validation error . Otherwise, append url to uniqueURLs .
if resource["alternate"] is set, for each alternate of resource["alternate"] :
let alt_url be the result of running URL serializer [ url ] on alternate["url"] with exclude fragment flag set.
if uniqueURLs contains alt_url , validation error .
otherwise, append alt_url to uniqueURLs .
Return uniqueURLs .
This function takes a list of LinkedResource objects, from either the reading order or resource list, and returns the set of unique URLs. If duplicates are encountered, warnings are issued.
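The following Python sketch mirrors these steps. The function name `get_unique_urls` is hypothetical, a plain list stands in for the ordered set, and `urllib.parse.urldefrag` approximates the URL serializer with the exclude fragment flag set.

```python
from urllib.parse import urldefrag


def get_unique_urls(resources):
    """Return (unique URLs in first-seen order, warnings for duplicates)."""
    unique = []    # used as an ordered set
    warnings = []

    def add(url):
        stripped = urldefrag(url).url  # drop the fragment before comparing
        if stripped in unique:
            warnings.append(f"duplicate resource: {stripped}")
        else:
            unique.append(stripped)

    for resource in resources:
        add(resource["url"])
        for alternate in resource.get("alternate", []):
            add(alternate["url"])
    return unique, warnings
```

Because fragments are removed before comparison, two entries that point to different parts of the same document count as one resource.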
To remove empty arrays from a property term 's value , run these steps:
If value is an empty list , return failure.
Otherwise, if value is a map , for each key → keyValue of value , if running remove empty arrays given key and keyValue returns failure, remove value[key] .
This function checks that the value of the term being processed is not an empty list. A term that initially has a list can lose entries as it gets processed (i.e., when the list items are invalid).
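A minimal Python sketch of this function and of the step that calls it. The names `remove_empty_arrays` and `prune` are hypothetical, and returning `False` stands in for returning failure. As in the steps above, only maps are descended into.

```python
def remove_empty_arrays(value):
    """Return False (failure) for an empty list; prune empty lists nested in maps."""
    if isinstance(value, list) and not value:
        return False
    if isinstance(value, dict):
        for key in list(value):  # copy the keys, since entries may be deleted
            if not remove_empty_arrays(value[key]):
                del value[key]
    return True


def prune(data):
    """Apply the check to every top-level term of the processed manifest."""
    for term in list(data):
        if not remove_empty_arrays(data[term]):
            del data[term]
    return data
```

For instance, a `links` list that lost all of its entries during validation would be dropped from the final data rather than retained as an empty list.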
To add default values for missing properties in map data with an optional HTML Document (DOM) Node [ html ] document , run the following steps:
( § 4.7.1.11 Title ) If data["name"] is not set:
Let title be an empty map . Set its values as follows:
if document is set, if the title element [html] of document is set and is not empty, set title["value"] to the text content of the title element. Set title["language"] to the language [html], if available, and title["direction"] to the base direction [html] if that value is available and its value is either "ltr" or "rtl".
otherwise, validation error , generate a value for title["value"] (see the separate note for details ). Set title["language"] and title["direction"] as appropriate for the generated title.
Set data["name"] to the list: « title ».
This step adds the content of the title element of document when the name property is not specified in the manifest. For example:
<html>
<head lang="en">
<title>The Golden Bough</title>
…
<script type="application/ld+json">
{
"@context" : ["https://schema.org","https://www.w3.org/ns/pub-context"],
…
}
</script>
yields:
«[
…
"name" → «
«[
"value" → "The Golden Bough",
"language" → "en"
]»
»,
…
]»
( § 4.7.2.1 Default Reading Order ) If data["readingOrder"] is not set:
if either document or document.URL is not set, fatal error , return failure.
set data["readingOrder"] to an empty list and append the map «[ "url" → document.URL ]».
append document.URL to data["uniqueResources"] .
If the Digital Publication consists only of the referencing document, the default reading order can be omitted; it will consist, automatically, of that single resource.
If a profile specifies default values the user agent has to generate, those steps are executed at this point.
Return data .
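The default value steps can be sketched as follows. In this illustration the DOM document is replaced by a hypothetical plain dictionary with `title`, `lang`, `dir` and `url` keys, the placeholder title text is an arbitrary choice, and `None` stands in for returning failure. Profile-specific defaults are not covered.

```python
def add_default_values(data, document=None):
    """Fill in a missing title and reading order; return None on a fatal error."""
    if "name" not in data:
        title = {}
        if document and document.get("title"):
            title["value"] = document["title"]
            if document.get("lang"):
                title["language"] = document["lang"]
            if document.get("dir") in ("ltr", "rtl"):
                title["direction"] = document["dir"]
        else:
            # Validation error: how a title is generated is up to the user agent.
            title["value"] = "Untitled publication"
        data["name"] = [title]
    if "readingOrder" not in data:
        if not document or not document.get("url"):
            return None  # fatal error: nothing to build a reading order from
        data["readingOrder"] = [{"url": document["url"]}]
        data.setdefault("uniqueResources", []).append(document["url"])
    return data


document = {"title": "The Golden Bough", "lang": "en", "url": "https://example.org/gb.html"}
result = add_default_values({}, document)
```

With an empty manifest and the document above, both the title and a single-entry reading order are taken from the referencing document.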
The manifest format defined in this specification is designed to be implemented and extended by publishing communities in the production of new profiles (e.g., audiobooks and scholarly publications). The flexibility the manifest format offers allows it to be tailored to each community's specific needs while also providing a common base for user agents that need to process the profiles (i.e., minimizing the differences between each profile and simplifying interoperability).
In order for a profile to be compatible with this specification, the following conditions MUST be met:
conformsTo
property
.
Adding an example of a term added by, e.g., the audiobook profile would be a good idea, when available.
As the manifest is expressed using JSON-LD, the privacy and security considerations [ json-ld11 ] detailed in that specification are applicable to all profiles of the manifest.
Some additional general considerations for profiles include:
More specific security and privacy considerations are left to each profile to detail, as these will vary depending on the nature of the digital publication format.
This section is non-normative.
The manifest includes a number of authoring conveniences, such as default values, the ability to use strings where objects would normally be required, and the automatic compilation of information from other sources (e.g., for the title and reading order ). The processing of the manifest normalizes these conveniences and results in a consistent data set for user agents (the internal representation ), but this set is not easily visualized from the processing algorithm.
This appendix provides informative abstract data models that describe the resulting data structure. The choice of representations is only for illustrative purposes. User agents are not required to use the associated technologies.
This definition expresses the expected names, datatypes, and possible restrictions for each member of the manifest after processing using [ WebIDL ].
This specification does not define an API for exposing the manifest data.
PublicationManifest Dictionary
dictionary PublicationManifest {
sequence<DOMString> type = "CreativeWork";
required DOMString profile;
sequence<DOMString> conformsTo;
DOMString id;
boolean abridged;
sequence<DOMString> accessMode;
sequence<DOMString> accessModeSufficient;
sequence<DOMString> accessibilityFeature;
sequence<DOMString> accessibilityHazard;
sequence<LocalizableString> accessibilitySummary;
sequence<Entity> artist;
sequence<Entity> author;
sequence<Entity> colorist;
sequence<Entity> contributor;
sequence<Entity> creator;
sequence<Entity> editor;
sequence<Entity> illustrator;
sequence<Entity> inker;
sequence<Entity> letterer;
sequence<Entity> penciler;
sequence<Entity> publisher;
sequence<Entity> readBy;
sequence<Entity> translator;
sequence<DOMString> url;
DOMString duration;
sequence<DOMString> inLanguage;
DOMString dateModified;
DOMString datePublished;
TextDirection readingProgression = "ltr";
required sequence<LocalizableString> name;
required sequence<LinkedResource> readingOrder;
sequence<LinkedResource> resources;
sequence<LinkedResource> links;
sequence<DOMString> uniqueResources;
};
enum TextDirection {
"ltr",
"rtl"
};
LinkedResource Dictionary
dictionary LinkedResource {
required DOMString url;
DOMString encodingFormat;
sequence<LocalizableString> name;
sequence<LocalizableString> description;
sequence<DOMString> rel;
DOMString integrity;
DOMString duration;
sequence<LinkedResource> alternate;
};
Entity Dictionary
dictionary Entity {
sequence<DOMString> type;
required sequence<LocalizableString> name;
DOMString id;
DOMString url;
sequence<DOMString> identifier;
};
LocalizableString Dictionary
dictionary LocalizableString {
required DOMString value;
DOMString language;
TextDirection direction;
};
This appendix depends on the Infra Standard [ infra ].
To select an alternate resource for a LinkedResource resource, run the following steps.
If successful, this algorithm returns an alternate resource. Otherwise, it returns failure.
Let possibleAlternates be an empty list .
If resource["alternate"] is not set, return failure.
For each alternate of resource["alternate"] :
if alternate["encodingFormat"] is set and the user agent supports the specified media type, append alternate to possibleAlternates .
otherwise, if a profile defines additional selection criteria, evaluate alternate against them in this extension step.
otherwise, optionally inspect alternate["url"] for clues about the media type. If the resource appears to be supported, append alternate to possibleAlternates .
If possibleAlternates is an empty list , return failure.
Otherwise, if the size of possibleAlternates is 1, return the resource from possibleAlternates .
Otherwise, return a resource from possibleAlternates as determined by the user agent.
This function iterates the alternative formats for a resource and compiles a list of possibilities. If more than one possibility is found, the user agent determines how to prioritize and select the best alternative.
User agents are not required to add alternatives to the list of possibilities if they do not specify an explicit media type.
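One possible reading of this selection algorithm is sketched below. Several choices are assumptions of the example and not requirements: the function name `select_alternate`, the `supported_types` set describing what the user agent can render, guessing a media type from the URL with the standard `mimetypes` module only when no `encodingFormat` is given, and preferring the first match when several alternates qualify. The profile extension step is omitted.

```python
import mimetypes


def select_alternate(resource, supported_types):
    """Return a supported alternate of resource, or None to signal failure."""
    alternates = resource.get("alternate")
    if not alternates:
        return None
    possible = []
    for alternate in alternates:
        media_type = alternate.get("encodingFormat")
        if media_type is None:
            # Optional step: look at the URL for clues about the media type.
            media_type, _ = mimetypes.guess_type(alternate["url"])
        if media_type in supported_types:
            possible.append(alternate)
    if not possible:
        return None
    # Assumption: this user agent simply prefers the first supported alternate.
    return possible[0]
```

A user agent with richer preferences (for example, favoring audio over text) would replace the final selection with its own ranking.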
This section is non-normative.
To facilitate navigation within pages and across sites, HTML uses the nav element [html] to express lists of links. Although generic in nature by default, the purpose of a nav element can be more specifically identified by use of the role attribute [html]. In particular, the doc-toc role from the [dpub-aria-1.0] vocabulary identifies the nav element as the digital publication's table of contents.
Including an identifiable table of contents is an accessible way to produce any digital publication , but due to the flexibility of HTML markup, it also presents challenges for user agents trying to extract a meaningful hierarchy of links (e.g., to provide a custom view available from any page). To avoid duplicating the tables of contents for different uses, this section defines a syntax that is both human friendly and commonly used while still providing enough structure for user agent extraction.
Authors have a choice of lists (ordered or unordered) to construct their table of contents. By tagging each link within these lists in anchor tags (a elements), user agents can easily differentiate the information they need from any peripheral content (asides) or stylistic tagging that has also been added. The table of contents can consist of both active links (with an href attribute) and inactive links (excluding the href attribute), providing additional flexibility in how the table of contents is constructed (e.g., to omit links to certain headings or only link to certain content in a preview).
Note, however, that user agents are not required to preserve the presentational aspects of the table of contents (i.e., the user agent is typically extracting the information in order to present it in a common way across all publications). User agents are only expected to retain the text content of the link elements, for example, so text styling, inline images and other non-text content might be lost. Similarly, list styling and even how many levels deep of linking to display are at the discretion of the user agent. For this reason, linking to the presentational table of contents so that users are not limited to the machine-processed one is advised.
The table of contents is expressed via an [html] element (typically a nav element). This element MUST be identified by the role attribute [html] value "doc-toc" [dpub-aria-1.0], and MUST be the first element in the document in document tree order [dom] with that role value. The element MAY be hidden from users.
The manifest SHOULD identify the resource that contains the table of contents.
Although the content model of the nav element is not restricted, user agents will only be able to extract a usable table of contents when the following markup guidelines are followed:
Although a title for the table of contents is optional, to avoid having a user agent generate a placeholder title when one is needed, it is advised to add one. Titles are specified using any of the [html] h1 through h6 elements. Note that only the first such element is recognized as the title. If a heading element is not found before the list of links, user agents will assume that one has not been specified.
The first [html] ol or ul list element encountered in the nav element is assumed to contain the list that defines the links into the content. This list will be found even if it is nested inside of div elements, for example, as the algorithm ignores elements that are not relevant to its processing. The list cannot occur inside of any skipped elements, however, since their internal contents are not evaluated.
If the nav element does not contain one of these elements, then user agents will not register the digital publication as containing a usable table of contents (e.g., a machine-rendered option will not be available).
If the table of contents is considered as a tree of links, then each list item (li element) inside of the list of links represents one branch. Each of these branches has to have a name and optional destination in order to be presented to users, and this information is obtained from the first a element found within the list item, wherever it is nested (again, excluding any a elements inside of skipped elements).
The link destination for the branch is obtained from the a element's href attribute, when specified. This attribute can be omitted if a link is not available (e.g., in a preview) or not relevant (e.g., a grouping header). When providing a link into the content, it is also possible to specify the relation of the linked document (in a rel attribute) and the media type of the linked resource (in a type attribute).
After finding the a element that labels the branch, user agents will continue to inspect the markup for another list element (i.e., sub-branches). If a list is found, it is similarly processed to extract its links, and so on, until there are no more nested branches left to process.
A small set of elements is skipped when parsing the table of contents to avoid misinterpretation. These are the [ html ] sectioning content elements and sectioning root elements . They are skipped because they can define their own outlines (i.e., they can represent embedded content that is self-contained and not necessarily related to the structure of content links).
Any element that has its hidden attribute set is also skipped, since hidden elements are not intended to be directly accessed by users.
Although these elements can be included in the nav element, care has to be taken not to embed important content within them (e.g., do not wrap a section element around the list item that contains all the links into the content).
All elements that are not relevant to extracting the table of contents, and are not skipped , are ignored. Unlike skipped elements, ignoring means that user agents will continue to search inside them for relevant content, allowing greater flexibility in terms of the tagging that can be used.
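The markup guidelines above can be illustrated with a simplified Python sketch. It is not the extraction algorithm itself: it operates on a well-formed XML serialization of the nav element using `xml.etree.ElementTree`, the function names are hypothetical, the `SKIPPED` set reflects an assumed list of sectioning content and sectioning root elements, list items without an a element are simply left out, and only text content is retained for names.

```python
import xml.etree.ElementTree as ET

# Assumed sectioning content and sectioning root element names.
SKIPPED = {"article", "aside", "nav", "section", "blockquote", "body",
           "details", "dialog", "fieldset", "figure", "td"}
LISTS = {"ol", "ul"}
HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


def is_skipped(el):
    return el.tag in SKIPPED or el.get("hidden") is not None


def walk(el):
    """Yield descendants in tree order without entering skipped elements."""
    for child in el:
        if is_skipped(child):
            continue
        yield child
        yield from walk(child)


def first(root, tags, stop_at=frozenset()):
    """Return the first descendant in tags, giving up once a stop_at tag is met."""
    for node in walk(root):
        if node.tag in tags:
            return node
        if node.tag in stop_at:
            break
    return None


def parse_list(list_el):
    entries = []
    for li in list_el:
        if li.tag != "li" or is_skipped(li):
            continue
        link = first(li, {"a"}, stop_at=LISTS)
        if link is None:
            continue  # simplification: unlabeled branches are dropped
        sub = first(li, LISTS)
        entries.append({
            "name": "".join(link.itertext()).strip(),
            "url": link.get("href"),  # None for inactive links
            "entries": parse_list(sub) if sub is not None else [],
        })
    return entries


def extract_toc(nav):
    heading = first(nav, HEADINGS, stop_at=LISTS)  # only headings before the list
    top = first(nav, LISTS)
    return {
        "name": "".join(heading.itertext()).strip() if heading is not None else None,
        "entries": parse_list(top) if top is not None else [],
    }
```

Calling `extract_toc(ET.fromstring(markup))` on a nav element yields a nested structure of names, optional URLs and sub-entries; wrapper elements such as div or span are searched through, while skipped and hidden elements are not entered.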
This section is non-normative.
This section depends on the Infra Standard [ infra ].
This section defines an algorithm for extracting a table of contents from a nav element. It is defined in terms of a walk over the nodes of a DOM tree, in tree order [dom], with each node being visited when it is entered and when it is exited during the walk. Each time a node is visited, it can be seen as triggering an enter or exit event. In some steps, user agents are provided a choice in how to process the content to provide flexibility for different presentation models.
This algorithm is not defined in purely event-driven terms, as inspecting all descendant nodes is not always necessary to obtain the needed information from the DOM. In some cases, an element, and all its descendants, is skipped immediately after it is processed on enter . An event approach could be applied, but would require modifying the algorithm to process/ignore the skipped nodes.
User agents can process and internalize the resulting structure using any language that can represent the final form of the data.
For the purposes of this algorithm, a list element is defined as either an [html] ol or ul element.
The following algorithm MUST be applied to a walk of a DOM subtree rooted at the first element in document order with the role attribute value doc-toc, regardless of whether the element has been declaratively hidden [html] or styled by CSS not to be visible:
The rules for locating the resource containing the table of contents element are defined in § 4.8.1.3 Table of Contents .
If a table of contents element is not found, the publication does not have a table of contents that can be used for machine rendering purposes.
Let toc be the map «[ "name" → "", "entries" → « » ]» representing the table of contents.
This step initializes the map that will store the title and the branches of the table of contents. In this map:
Initialize the stack branches to hold branches of the table of contents as they are created.
The stack is used to hold branches that are not yet complete. As a new sub-branch is encountered, the parent gets pushed onto the stack so it can be retrieved later.
Let current_toc_node be a variable set to null .
current_toc_node is used to hold the map that represents the branch of the table of contents that is currently being processed.
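As a non-normative illustration, the three pieces of state set up above could be represented in Python as follows. Using a dict for the map, a list for the stack, and None for null is an assumption of this sketch, as is the helper name new_state .

```python
def new_state():
    """Return the initial (toc, branches, current_toc_node) triple."""
    toc = {"name": "", "entries": []}  # "" means no heading or list has been seen yet
    branches = []                      # stack of parent branches that are not yet complete
    current_toc_node = None            # no branch is being processed at the start
    return toc, branches, current_toc_node

toc, branches, current_toc_node = new_state()
```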
Walk over the DOM in tree order [ dom ], starting with the element the table of contents is being built from, and trigger the first relevant step below for each element as the walk enters and exits it.
When entering a heading content element:
Run these steps:
If branches is empty, and toc["name"] is an empty string , set toc["name"] to one of the following:
If the resulting value of toc["name"] is an empty string (e.g., after removing any presentational elements and trimming all leading and trailing whitespace), set toc["name"] either to a placeholder value or to null .
This step identifies the heading for the table of contents. A heading is only processed if the value of toc["name"] is an empty string (i.e., no headings have yet been encountered).
Whether a user agent sets name to the descendant content of the heading element, or generates a text string from it, depends on whether it will re-use any descendant tagging in the presentation (e.g., to retain images, MathML, ruby and other content that does not translate to text easily).
«[
"name" → "Contents",
"entries" → « »
]»
If name is not an empty string, or is null , then a previous heading has already been encountered or content has been encountered that indicates the nav element does not have a heading (e.g., a list has already been processed, since the heading would not follow the list of links).
«[
"name" → null,
"entries" → « »
]»
If a heading is not specified, the user agent can provide its own for later use.
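The heading step can be sketched as follows (non-normative). This sketch takes the option of generating a plain text string from the heading rather than retaining its descendant markup. The helper names text_of and enter_heading , and the placeholder parameter, are hypothetical.

```python
from xml.dom import minidom

def text_of(node):
    """Concatenate the descendant text nodes of a node."""
    if node.nodeType == node.TEXT_NODE:
        return node.data
    return "".join(text_of(child) for child in node.childNodes)

def enter_heading(toc, branches, heading, placeholder=None):
    """Set toc["name"] from the first heading seen before any list."""
    if branches or toc["name"] != "":
        return  # a heading or a list has already been encountered
    name = text_of(heading).strip()
    # An empty result falls back to a placeholder value or to None (null).
    toc["name"] = name if name else placeholder

toc = {"name": "", "entries": []}
first = minidom.parseString("<h2> Contents </h2>").documentElement
second = minidom.parseString("<h2>Other</h2>").documentElement
enter_heading(toc, [], first)
enter_heading(toc, [], second)  # ignored: the name is already set

image_only = {"name": "", "entries": []}
picture = minidom.parseString("<h2><img src='title.png'/></h2>").documentElement
enter_heading(image_only, [], picture)  # no text content, so the name becomes None
```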
When entering a list element :
Run these steps:
If toc["name"] is an empty string , set toc["name"] to null .
If current_toc_node is not null :
Otherwise, if branches is empty:
This algorithm does not process multiple lists in a single branch or at the root of the nav element, so if a list has already been encountered (the entries property contains one or more branches or is set to null ), this list is skipped.
If a list is encountered and the table of contents ( toc ) still does not have a name (i.e., no heading element has been encountered), the table of contents is assumed to not have a heading (i.e., the heading for the table of contents cannot appear after the first list of entries). The value of the name property is changed from an empty string to null as no further headings encountered apply, either.
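A non-normative sketch of the list-entry behaviour described in these notes follows. The skip conditions and the push onto the stack are inferred from the explanatory text above and are assumptions of this sketch, as are the names enter_list , SKIP and DESCEND .

```python
SKIP = "skip"        # hypothetical result: do not descend into this list
DESCEND = "descend"  # hypothetical result: continue the walk into this list

def enter_list(toc, branches, current_toc_node):
    """Return (action, current_toc_node) for a list element that is being entered."""
    if toc["name"] == "":
        toc["name"] = None  # a heading can no longer apply once a list is seen
    if current_toc_node is not None:
        # Assumption: a branch whose entries are populated or null ignores further lists.
        if current_toc_node["entries"] is None or current_toc_node["entries"]:
            return SKIP, current_toc_node
        branches.append(current_toc_node)  # park the parent until the list is exited
        return DESCEND, None
    if not branches:
        # Assumption: only the first list at the root is processed.
        if toc["entries"] is None or toc["entries"]:
            return SKIP, current_toc_node
    return DESCEND, current_toc_node

toc = {"name": "", "entries": []}
branches = []
root_action, current = enter_list(toc, branches, None)

item = {"name": "Part 1", "url": None, "type": None, "rel": None, "entries": []}
nested_action, current = enter_list(toc, branches, item)

seen = {"name": "Contents", "entries": [{"name": "Chapter 1"}]}
second_action, _ = enter_list(seen, [], None)
```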
When exiting a list element :
If branches is not empty, pop the top map from branches and set current_toc_node to it.
Otherwise, if toc["entries"] contains an empty list , set it to null .
This step resets current_toc_node back to the parent object after all of its child branches have been processed.
If there are no branches in the stack, toc["entries"] is set to null if it does not contain any items (to avoid processing any further lists at the root level).
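The list-exit step could be expressed as in the following non-normative sketch. The function name exit_list is hypothetical, and the state is assumed to be held in a dict and a list.

```python
def exit_list(toc, branches, current_toc_node):
    """Return the new current_toc_node after a list element has been exited."""
    if branches:
        return branches.pop()  # resume the parent branch of the finished list
    if toc["entries"] == []:
        toc["entries"] = None  # marks that a root-level list has been processed
    return current_toc_node

toc = {"name": None, "entries": []}
parent = {"name": "Part 1", "url": None, "type": None, "rel": None, "entries": []}
branches = [parent]

resumed = exit_list(toc, branches, None)      # pops the parent back off the stack
entries_after_nested = toc["entries"]
after_root = exit_list(toc, branches, None)   # the stack is now empty
```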
When entering a list item element, set current_toc_node to the following map :
«[
"name" → null,
"url" → null,
"type" → null,
"rel" → null,
"entries" → « »
]»
Each list item represents a possible new branch in the table of contents, so whenever one is encountered a new blank object is created in current_toc_node .
This object gets populated with information as a descendant a element and list are encountered.
When exiting a list item element:
Run these steps:
If current_toc_node["entries"] contains an empty list , set it to null .
If current_toc_node["name"] is an empty string:
If branches is not empty, append current_toc_node to the entries property of the map at the top of branches . Otherwise, append current_toc_node to toc["entries"] .
Set current_toc_node to null .
Exiting a list item indicates that processing of the current branch is complete. Before adding this branch to its parent's entries array, the branch needs to be tested to see if it has a name and/or any sub-branches. If it does not have a name but has sub-branches, the branch is kept. The user agent can either supply a placeholder value of its own creation or set the value to null. If it does not have a name or any branches, it is invalid and is discarded.
To determine where to merge the branch, the stack is checked. If there are no items in the stack, it is added into the entries property of the root toc object (i.e., it is a top-level branch). Otherwise, it gets added into the entries property of the object immediately preceding it in the stack.
As a final step, current_toc_node is reset back to null .
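The list-item exit step can be sketched as follows (non-normative). The function name exit_list_item and the placeholder parameter are hypothetical. The sketch treats both null and an empty string as "no name", which is an assumption based on the explanation above.

```python
def exit_list_item(toc, branches, current_toc_node, placeholder=None):
    """Merge a finished branch into its parent; always returns None (null)."""
    node = current_toc_node
    if node["entries"] == []:
        node["entries"] = None
    if not node["name"]:  # assumption: None and "" both mean the branch has no name
        if node["entries"] is None:
            return None  # neither a name nor sub-branches: the branch is discarded
        node["name"] = placeholder  # kept for its sub-branches
    parent = branches[-1] if branches else toc
    parent["entries"].append(node)
    return None

def blank(name=None):
    return {"name": name, "url": None, "type": None, "rel": None, "entries": []}

toc = {"name": "Contents", "entries": []}
leaf = blank("Chapter 1")
exit_list_item(toc, [], leaf)            # top-level branch
exit_list_item(toc, [], blank())         # no name, no sub-branches: dropped

parent = blank("Part 1")
child = blank("Chapter 2")
exit_list_item(toc, [parent], child)     # merged into the branch on top of the stack
```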
When entering an anchor element and current_toc_node is not null :
Run these steps:
If current_toc_node["name"] is not null , do nothing.
Otherwise:
Set current_toc_node["name"] to one of the following:
If the element has an href attribute and the URL in the attribute resolves to a resource in uniqueResources , set current_toc_node["url"] to the value.
If the element has a type attribute, and the value of the attribute is not an empty string after trimming leading and trailing whitespace, set current_toc_node["type"] to the trimmed value.
If the element has a rel attribute, and the value of the attribute is not an empty string after trimming leading and trailing whitespace, split the trimmed value on whitespace and set current_toc_node["rel"] to the resulting list of tokens.
Skip further processing of the element and continue to the next.
This step processes anchor tags to obtain values for the name and url properties of a branch.
If the name of the current branch is already defined, then processing of this element is terminated (i.e., to avoid processing multiple links for a single branch).
Whether a user agent sets the name of the entry to the descendant content of the a element, or generates a text string from it, depends on whether it will re-use any descendant tagging in the presentation (e.g., to retain images, MathML, ruby and other content that does not translate to text easily).
In addition to having an href attribute specified, it is necessary that it resolve to a resource that belongs to the digital publication to meet the requirements of this specification. If not, the branch is retained but the entry will not be linkable.
Additional information about the target of the link — the type of resource and its relation — is also retained.
«[
"name" → "In the Beginning",
"url" → "http://example.com/page1.svg",
"type" → "image/svg+xml",
"rel" → null,
"entries" → « »
]»
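The anchor step can be sketched as follows (non-normative). The function name enter_anchor and its parameters are hypothetical. The sketch assumes that the name is generated as a text string, that unique_resources is a set of absolute URLs, that the stored url is the resolved absolute URL, and that fragments are ignored when testing membership.

```python
from urllib.parse import urljoin, urldefrag

def enter_anchor(current_toc_node, attrs, text, base_url, unique_resources):
    """Populate the current branch from an a element's text and attributes."""
    if current_toc_node is None or current_toc_node["name"] is not None:
        return  # only the first link in a branch is used
    current_toc_node["name"] = text.strip()
    href = attrs.get("href")
    if href is not None:
        absolute = urljoin(base_url, href)
        # Assumption: the fragment is dropped before looking up the resource.
        if urldefrag(absolute).url in unique_resources:
            current_toc_node["url"] = absolute
    media_type = attrs.get("type", "").strip()
    if media_type:
        current_toc_node["type"] = media_type
    rel = attrs.get("rel", "").strip()
    if rel:
        current_toc_node["rel"] = rel.split()  # split on runs of whitespace

node = {"name": None, "url": None, "type": None, "rel": None, "entries": []}
enter_anchor(
    node,
    {"href": "page1.svg", "type": " image/svg+xml ", "rel": " next  chapter "},
    " In the Beginning ",
    "http://example.com/toc.html",
    {"http://example.com/page1.svg"},
)

external = {"name": None, "url": None, "type": None, "rel": None, "entries": []}
enter_anchor(external, {"href": "http://other.example/"}, "Elsewhere",
             "http://example.com/toc.html", {"http://example.com/page1.svg"})
```

In the second call the link target is not one of the publication's resources, so the branch keeps its name but its url stays null and the entry is not linkable.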
When entering a sectioning content element, a sectioning root element, or an element with a hidden attribute:
Skip further processing of the element and continue to the next.
As sectioning and sectioning root elements can define their own outlines, descending into them poses problems for generating the table of contents (i.e., they may contain content that is not directly related). As a result, they are skipped over when encountered to prevent their child content from being processed.
Otherwise: do nothing.
For all other elements, this step allows their descendant elements to continue to be processed.
After completing the DOM walk, if toc["entries"] contains a non-empty list , return toc . Otherwise, return null .
If the entries array in the root toc object does not contain any branches (either because no list was found in the nav element or the list did not contain any conforming list items), then the algorithm did not produce a usable table of contents.
The following substantive changes have been made since the last release:
For a complete list of issues addressed, refer to the GitHub tracker .
This section is non-normative.
The following is a manifest with a basic set of metadata for an example book profile.
A JSON encoding of the internal representation of this manifest is also available.
{
"@context": [
"https://schema.org",
"https://www.w3.org/ns/pub-context",
{"language" : "en"}
],
"conformsTo": "https://example.com/publication",
"type": "Book",
"url": "https://publisher.example.org/mobydick",
"author": "Herman Melville",
"dateModified": "2018-02-10T17:00:00Z",
"readingOrder": [
"html/title.html",
"html/copyright.html",
"html/introduction.html",
"html/epigraph.html",
"html/c001.html",
"html/c002.html",
"html/c003.html",
"html/c004.html",
"html/c005.html",
"html/c006.html"
],
"resources": [
"css/mobydick.css",
{
"type": "LinkedResource",
"rel": "cover",
"url": "images/cover.jpg",
"encodingFormat": "image/jpeg"
},{
"type": "LinkedResource",
"url": "html/toc.html",
"rel": "contents"
},{
"type": "LinkedResource",
"url": "fonts/STIXGeneral.otf",
"encodingFormat": "application/vnd.ms-opentype"
},{
"type": "LinkedResource",
"url": "fonts/STIXGeneralBol.otf",
"encodingFormat": "application/vnd.ms-opentype"
},{
"type": "LinkedResource",
"url": "fonts/STIXGeneralBolIta.otf",
"encodingFormat": "application/vnd.ms-opentype"
},{
"type": "LinkedResource",
"url": "fonts/STIXGeneralItalic.otf",
"encodingFormat": "application/vnd.ms-opentype"
}
]
}
The following is a manifest for an example article profile. The article consists only of the document the manifest is embedded in. The title and reading order are omitted from the manifest, as these properties are automatically generated during processing from the title and URL of the containing document, respectively.
A JSON encoding of the internal representation of the manifest is also available, as well as a more elaborate version for the same document.
<!DOCTYPE html>
<html lang="en-US">
<head>
<title>Model for Tabular Data and Metadata on the Web</title>
<link href="#wpm" rel="publication" />
...
<script id="wpm" type="application/ld+json">
{
"@context" : [
"https://schema.org",
"https://www.w3.org/ns/pub-context",
{"language" : "en-US"}
],
"conformsTo" : "https://example.com/article",
"type" : "TechArticle",
"id" : "http://www.w3.org/TR/tabular-data-model/",
"url" : "http://www.w3.org/TR/2015/REC-tabular-data-model-20151217/",
"copyrightYear" : "2015",
"copyrightHolder" : "World Wide Web Consortium",
"creator" : ["Jeni Tennison", "Gregg Kellogg", "Ivan Herman"],
"publisher" : {
"type" : "Organization",
"name" : "World Wide Web Consortium",
"id" : "https://www.w3.org/"
},
"datePublished" : "2015-12-17",
"resources" : [
"datatypes.html",
"datatypes.svg",
"datatypes.png",
"diff.html",
{
"type" : "LinkedResource",
"url" : "test-utf8.csv",
"encodingFormat" : "text/csv"
},
{
"type" : "LinkedResource",
"url" : "test.xlsx",
"encodingFormat" : "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
}
]
}
</script>
</head>
<body>
....
<section id="toc" role="doc-toc">
<h2 resource="#h-toc" id="h-toc" class="introductory">Table of Contents</h2>
<ul class="toc">
<li class="tocline"><a class="tocxref" href="#intro">
<span class="secno">1. </span>Introduction</a>
</li>
...
</ul>
</section>
...
</body>
</html>
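As a non-normative illustration of how an embedded manifest like the one above might be located, the following Python sketch follows the link element with the publication relation to the script element it references and parses the JSON content. The class name ManifestFinder is hypothetical, the sample document is abbreviated, and the sketch assumes the link element appears before the script element.

```python
import json
from html.parser import HTMLParser

class ManifestFinder(HTMLParser):
    """Collect the text of the script element referenced by <link rel="publication">."""

    def __init__(self):
        super().__init__()
        self.target_id = None
        self.manifest_text = None
        self._capturing = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and "publication" in (attrs.get("rel") or "").split():
            href = attrs.get("href") or ""
            if href.startswith("#"):
                self.target_id = href[1:]  # same-document reference
        elif tag == "script" and attrs.get("type") == "application/ld+json":
            if self.target_id is not None and attrs.get("id") == self.target_id:
                self._capturing = True
                self.manifest_text = ""

    def handle_data(self, data):
        if self._capturing:
            self.manifest_text += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._capturing = False

html = """<!DOCTYPE html>
<html lang="en-US">
<head>
<title>Model for Tabular Data and Metadata on the Web</title>
<link href="#wpm" rel="publication" />
<script id="wpm" type="application/ld+json">
{
  "@context": ["https://schema.org", "https://www.w3.org/ns/pub-context", {"language": "en-US"}],
  "conformsTo": "https://example.com/article",
  "type": "TechArticle",
  "resources": ["datatypes.html", "datatypes.svg"]
}
</script>
</head>
<body></body>
</html>"""

finder = ManifestFinder()
finder.feed(html)
manifest = json.loads(finder.manifest_text)
```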
The following example shows a manifest that conforms to the Audiobooks profile [ audiobooks ].
A JSON encoding of the internal representation of this manifest is also available.
{
"@context": [
"https://schema.org",
"https://www.w3.org/ns/pub-context",
{"language": "en"}
],
"conformsTo": "https://www.w3.org/TR/audiobooks/",
"type": "Audiobook",
"id": "https://librivox.org/flatland-a-romance-of-many-dimensions-by-edwin-abbott-abbott/",
"url": "https://w3c.github.io/pub-manifest/experiments/audiobook/",
"name": "Flatland: A Romance of Many Dimensions",
"author": "Edwin Abbott Abbott",
"readBy": "Ruth Golding",
"publisher": "Librivox",
"inLanguage": "en",
"dateModified": "2019-11-14",
"datePublished": "2008-10-12",
"duration": "PT13774S",
"license": "https://creativecommons.org/publicdomain/zero/1.0/",
"abridged": false,
"accessMode": "auditory",
"accessModeSufficient": [{
"type": "ItemList",
"itemListElement": ["auditory"],
"description": "Audio"
}],
"accessibilityFeature": ["readingOrder", "unlocked"],
"accessibilityHazard": "noSoundHazard",
"accessibilitySummary": "This is just a test summary",
"readingProgression": "ltr",
"resources": [
{
"rel": "cover",
"url": "http://ia800704.us.archive.org/9/items/LibrivoxCdCoverArt12/Flatland_1109.jpg",
"encodingFormat": "image/jpeg",
"name": "Cover page with title and author"
},{
"rel": "contents",
"url": "toc.html",
"encodingFormat": "text/html"
},{
"rel": "accessibility-report",
"url": "a11y.html",
"encodingFormat": "text/html"
},{
"rel": "privacy-policy",
"url": "privacy.html",
"encodingFormat": "text/html"
}
],
"readingOrder": [
{
"url": "http://www.archive.org/download/flatland_rg_librivox/flatland_1_abbott.mp3",
"encodingFormat": "audio/mpeg",
"duration": "PT1371S",
"name": "Part 1, Sections 1 - 3"
},{
"url": "http://www.archive.org/download/flatland_rg_librivox/flatland_2_abbott.mp3",
"encodingFormat": "audio/mpeg",
"duration": "PT1669S",
"name": "Part 1, Sections 4 - 5"
},{
"url": "http://www.archive.org/download/flatland_rg_librivox/flatland_3_abbott.mp3",
"encodingFormat": "audio/mpeg",
"duration": "PT1506S",
"name": "Part 1, Sections 6 - 7"
},{
"url": "http://www.archive.org/download/flatland_rg_librivox/flatland_4_abbott.mp3",
"encodingFormat": "audio/mpeg",
"duration": "PT1669S",
"name": "Part 1, Sections 8 - 10"
},{
"url": "http://www.archive.org/download/flatland_rg_librivox/flatland_5_abbott.mp3",
"encodingFormat": "audio/mpeg",
"duration": "PT1506S",
"name": "Part 1, Sections 11 - 12"
},{
"url": "http://www.archive.org/download/flatland_rg_librivox/flatland_6_abbott.mp3",
"encodingFormat": "audio/mpeg",
"duration": "PT1798S",
"name": "Part 2, Sections 13 - 14"
},{
"url": "http://www.archive.org/download/flatland_rg_librivox/flatland_7_abbott.mp3",
"encodingFormat": "audio/mpeg",
"duration": "PT1225S",
"name": "Part 2, Sections 15 - 17"
},{
"url": "http://www.archive.org/download/flatland_rg_librivox/flatland_8_abbott.mp3",
"encodingFormat": "audio/mpeg",
"duration": "PT1371S",
"name": "Part 2, Sections 18 - 20"
},{
"url": "http://www.archive.org/download/flatland_rg_librivox/flatland_9_abbott.mp3",
"encodingFormat": "audio/mpeg",
"duration": "PT1659S",
"name": "Part 2, Sections 21 - 22"
}
]
}
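In this example, the duration of the publication equals the sum of the durations of the resources in the reading order. The following non-normative Python sketch checks that arithmetic. The helper name seconds is hypothetical, and it only handles the seconds-only duration form used in this example, not the full ISO 8601 duration syntax.

```python
import re

def seconds(duration):
    """Parse a duration of the form PT<n>S into a number of seconds."""
    match = re.fullmatch(r"PT(\d+)S", duration)
    if match is None:
        raise ValueError(f"unsupported duration: {duration!r}")
    return int(match.group(1))

# Durations copied from the manifest above, in reading order.
manifest = {
    "duration": "PT13774S",
    "readingOrder": [
        {"duration": "PT1371S"},
        {"duration": "PT1669S"},
        {"duration": "PT1506S"},
        {"duration": "PT1669S"},
        {"duration": "PT1506S"},
        {"duration": "PT1798S"},
        {"duration": "PT1225S"},
        {"duration": "PT1371S"},
        {"duration": "PT1659S"},
    ],
}

total = sum(seconds(item["duration"]) for item in manifest["readingOrder"])
matches = total == seconds(manifest["duration"])
```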
This section is non-normative.
The following table identifies where manifest properties are defined and extended.
This section is non-normative.
The following table identifies where the use of resource relations is defined.
Name | Publication Manifest |
---|---|
accessibility-report | § 4.8.2.1 Accessibility Report |
contents | § 4.8.1.3 Table of Contents |
cover | § 4.8.1.1 Cover |
pagelist | § 4.8.1.2 Page List |
privacy-policy | § 4.8.2.3 Privacy Policy |
preview | § 4.8.2.2 Preview |
This section is non-normative.
The editors would like to thank the members of the Publishing Working Group for their contributions to this specification:
The Working Group would also like to thank the members of the Digital Publishing Interest Group for all the hard work they did paving the road for this specification.