This document is also available in this non-normative format: EPUB
Copyright © 2019-2022 W3C ® ( MIT , ERCIM , Keio , Beihang ). W3C liability , trademark and permissive document license rules apply.
This specification describes the requirements for the creation of audiobooks, using a profile of the Publication Manifest specification.
This section describes the status of this document at the time of its publication. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.
This document was published by the Audiobooks Working Group as an Editor's Draft.
Publication as an Editor's Draft does not imply endorsement by W3C and its Members.
This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress. Future updates to this specification may incorporate new features .
This document was produced by a group operating under the W3C Patent Policy . W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy .
This document is governed by the 2 November 2021 W3C Process Document .
This section is non-normative.
An Audiobook is a collection of audio resources grouped together by a reading order, metadata, and resources, all contained in a manifest. This Audiobook can live on the Open Web Platform, or as a packaged entity.
This specification is intended to standardize the audiobooks distribution model on the web and between businesses. It should facilitate different user agent architectures for the consumption of Audiobooks. The primary goal is to bring clarity to a part of the publishing industry currently underserved by standards, while opening Audiobooks to the Open Web Platform and new user agents. This specification does not outline what file types or formats should be used by content creators, only a manifest format for delivering them.
This specification does not define how user agents are expected to render Audiobooks. Details about the types of affordances that user agents can provide to enhance the reading experience for users are instead defined in [ pwp-ucr ].
Terms with meanings specific to the publishing industry are capitalized in this document (e.g., "Reading System"). A complete list of these terms and definitions is provided in [ pub-manifest ].
Only the first instance of a term in a section is linked to its definition.
In addition, the following terminology is defined for use in this specification:
Supplemental content is any content relating to the audiobook content but not required for the full experience of the publication. Examples of supplemental content include photographs, charts, or data relating to topics mentioned in the audiobook.
As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key words MAY , MUST , MUST NOT , RECOMMENDED , REQUIRED , and SHOULD in this document are to be interpreted as described in BCP 14 [ RFC2119 ] [ RFC8174 ] when, and only when, they appear in all capitals, as shown here.
The primary entry page is an HTML resource that represents the preferred starting resource of an Audiobook and enables discovery of its manifest. It typically introduces the audiobook and provides access to the content.
The primary entry page MUST include either a link to the manifest or embed the manifest [ pub-manifest ]. It also SHOULD contain the table of contents .
An Audiobook MUST include a primary entry page except when packaging allows alternative discovery of the manifest. When present, the page MUST be included in the resource list .
The table of contents provides a hierarchical list of links that reflects the structural outline of the major sections of the Audiobook and any supplemental content it may contain.
The table of contents is expressed via an [ html ] element (typically a nav element) in one of the resources. This element MUST be identified by the role attribute [ html ] value "doc-toc" [ dpub-aria-1.0 ].
If the table of contents is located in the primary entry page , the table of contents MUST be the first element in the document (in document tree order [ dom ]) with that role value.
If the table of contents is not located in the primary entry page , the manifest SHOULD identify the resource that contains the structure.
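As a rough illustration of the "first element in document tree order" rule, the following Python sketch scans an HTML document and reports the first element whose role attribute includes doc-toc. It relies only on the standard-library html.parser module. The names TocFinder and find_toc_tag and the sample markup are hypothetical, and a real user agent would more likely work on a parsed DOM than on a token stream.

```python
from html.parser import HTMLParser


class TocFinder(HTMLParser):
    """Records the first start tag whose role attribute includes 'doc-toc'."""

    def __init__(self):
        super().__init__()
        self.toc_tag = None  # tag name of the first matching element, if any

    def handle_starttag(self, tag, attrs):
        if self.toc_tag is not None:
            return  # only the first match in document order counts
        role = dict(attrs).get("role") or ""
        # The role attribute is a space-separated token list.
        if "doc-toc" in role.split():
            self.toc_tag = tag


def find_toc_tag(html_text):
    """Return the tag name of the first doc-toc element, or None."""
    finder = TocFinder()
    finder.feed(html_text)
    return finder.toc_tag


sample = """
<body>
  <section role="doc-abstract"><p>About this recording</p></section>
  <nav role="doc-toc"><ol><li><a href="audio/c1.mp3">Chapter 1</a></li></ol></nav>
  <div role="doc-toc"><p>ignored: not the first match</p></div>
</body>
"""
print(find_toc_tag(sample))
```

Start tags are reported in the order they appear in the source, which corresponds to tree order for well-formed markup, so the second doc-toc element in the sample is never considered.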
When an Audiobook contains additional audio resources (i.e., that include multiple chapters or parts), or if the audiobook contains supplemental content:
a table of contents SHOULD be included;
the table of contents SHOULD include a link to any of the resources ; and
all links SHOULD refer to publication resources [ pub-manifest ].
Some Audiobooks may have audio resources that contain more than one chapter or section of the book content. It is strongly recommended that content creators provide a table of contents using media fragments [ media-frags ] to provide the user with access to the structure of the Audiobook.
When including supplemental content, be aware that users might not have access to this content unless it is linked to from the table of contents. It is strongly advised to provide links to all content that is not in the default reading order.
For more guidance on the structure and formatting for tables of contents, consult Publication Manifest - Machine-Processable Table of Contents [ pub-manifest ].
This section is non-normative.
The Audiobook manifest is defined by a set of properties that describe the basic information a user agent requires to process and render an Audiobook. These properties are categorized in the Publication Manifest [ pub-manifest ]. This section specifies where the Audiobooks profile extends those properties.
The Audiobook manifest is defined as a specific "shape" of [ json-ld11 ]. This shape is also defined, informally, through a JSON schema [ json-schema ] that expresses the constraints defined in this specification. This schema is maintained at https://www.w3.org/ns/pub-schema/audiobooks/ .
The requirements for the expression of Audiobook properties and resource relations are defined as follows:
The list of properties uses the formal names for each property as described in [ schema.org ] and [ pub-manifest ]. A descriptive label is included in parentheses where the purpose of these properties might be unclear.
conformsTo
@context
readingOrder
name (publication title)
abridged
accessibilityFeature
accessibilityHazard
accessibilitySummary
accessMode
accessModeSufficient
author
cover
duration
dateModified
datePublished
id (canonical identifier)
inLanguage (publication language)
readBy
readingProgression
resources
type
url (address)
Some properties are implicitly required, as they are compiled from alternative information when not explicitly authored. Refer to the internal representation data models [ pub-manifest ] for more information (the Audiobooks representation only differs in the default value for the type term).
An Audiobook manifest has to start by setting the JSON-LD context [ json-ld ]. The context has the following two major components:
https://schema.org
https://www.w3.org/ns/pub-context
{
"@context" : ["https://schema.org", "https://www.w3.org/ns/pub-context"],
…
}
To add the global language and direction of the manifest metadata, language and direction declaration [ pub-manifest ] can also be added to the context:
{
"@context" : [
"https://schema.org",
"https://www.w3.org/ns/pub-context",
{"language":"fr"}
],
…
}
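The following Python sketch shows one way a processor might inspect the context of a manifest. It assumes, based on the examples above, that the two context URLs appear first and in the order shown, and that an optional object carrying a language declaration can follow them. The function names context_is_valid and manifest_language are hypothetical.

```python
import json

# The two context URLs listed in this section, in the order used by the examples.
REQUIRED_CONTEXT = ["https://schema.org", "https://www.w3.org/ns/pub-context"]


def context_is_valid(manifest):
    """Check that @context is a list starting with the two required URLs."""
    context = manifest.get("@context")
    if not isinstance(context, list):
        return False
    return context[:2] == REQUIRED_CONTEXT


def manifest_language(manifest):
    """Return the global language declared in the context, or None."""
    context = manifest.get("@context")
    if not isinstance(context, list):
        return None
    for entry in context:
        if isinstance(entry, dict) and "language" in entry:
            return entry["language"]
    return None


manifest = json.loads("""
{
  "@context": [
    "https://schema.org",
    "https://www.w3.org/ns/pub-context",
    {"language": "fr"}
  ]
}
""")
print(context_is_valid(manifest), manifest_language(manifest))
```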
The conformance URL expressed in the conformsTo term [ pub-manifest ] MUST be "https://www.w3.org/TR/audiobooks/".
{
"@context" : ["https://schema.org", "https://www.w3.org/ns/pub-context"],
"conformsTo" : "https://www.w3.org/TR/audiobooks/",
…
}
The Publication Type is defined using the type term [ pub-manifest ].
{
"@context" : ["https://schema.org", "https://www.w3.org/ns/pub-context"],
"type" : "Audiobook",
…
}
If a type is not specified, Audiobook [ schema.org ] is assumed as the default.
A creator is an individual or entity responsible for the creation of the Audiobook. The Audiobooks profile can use the full list of creators defined in [ pub-manifest ].
The creators list includes two recommended creators for Audiobooks, author and readBy, as shown in the following examples:
{
"conformsTo" : "https://www.w3.org/TR/audiobooks/",
"@context" : ["https://schema.org","https://www.w3.org/ns/pub-context"],
"type" : "Audiobook",
…
"url" : "https://publisher.example.org/janeeyre",
"author" : {
"type" : "Person",
"name" : "Charlotte Bronte"
}
}
{
"conformsTo" : "https://www.w3.org/TR/audiobooks/",
"@context": ["https://schema.org", "https://www.w3.org/ns/pub-context"],
…
"url" : "https://publisher.example.org/janeeyre",
"author" : {
"type": "Person",
"name": "Charlotte Bronte"
},
"readBy" : {
"type": "Person",
"name": "Ivan Herman",
"id" : "https://www.w3.org/People/Ivan/"
}
}
A duration is the length of the audio resources in an Audiobook. The duration property is fully defined in Publication Manifest [ pub-manifest ].
Duration SHOULD be expressed for the entirety of the audiobook as part of the manifest, and MUST be present at the item level in the default reading order .
The normative requirement for the duration property at the item level has been updated to address implementer feedback about facilitating streaming. This requirement is now a MUST .
When a content creator specifies both the duration for the audiobook and item-level durations in the default reading order, the duration for the audiobook SHOULD be equal to the sum of the durations of the items in the reading order.
{
"conformsTo" : "https://www.w3.org/TR/audiobooks/",
"@context" : ["https://schema.org","https://www.w3.org/ns/pub-context"],
…
"url" : "https://publisher.example.org/janeeyre",
"author" : {
"type" : "Person",
"name" : "Charlotte Bronte"
},
"duration" : "PT12345.235S"
}
The default reading order [ pub-manifest ] is a specific progression through the audio resources in the audiobook.
The default reading order MUST contain at least one audio resource, which MAY be identified by the type of LinkedResource [ pub-manifest ].
The default reading order MUST NOT contain non-audio resources.
An audio resource MUST be referenced in its entirety via a URL [ url ].
It is important to note that a resource cannot be referenced more than once in the reading order.
In the case where an audio file represents the content of multiple chapters or sections of the book, the table of contents should be used to specify the starting and ending points of those chapters in the larger audio file, as demonstrated in this example .
Update previously non-normative statement about the use of media fragments to allow for referencing of audio files that span multiple chapters. The previous statement contradicted the note on referencing an audio file more than once in the reading order. The new normative statement will require that media fragments not be used on the url property in the Reading Order.
Annotations can also use media fragments to identify the location of the annotation in the resource, and are compatible with the Web Annotations model. This method will only apply to audiobook manifests that are not packaged.
{
"@context" : ["https://schema.org", "https://www.w3.org/ns/pub-context"],
"conformsTo" : "https://www.w3.org/TR/audiobooks/",
"url" : "https://publisher.example.org/janeeyre",
"name" : "Jane Eyre",
"readingOrder" : [{
"type" : "LinkedResource",
"url" : "audio/janeeyre.mp3",
"encodingFormat" : "audio/mpeg",
"name" : "Jane Eyre",
"duration" : "PT124503.123S"
}]
}
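The two reading-order constraints described above (no media fragments on the url property, and no resource referenced more than once) can be checked mechanically. The Python sketch below is one possible approach and not part of this specification. The function name check_reading_order_urls is hypothetical, and treating two entries as "the same resource" when their URLs match after the fragment is removed is an assumption.

```python
from urllib.parse import urldefrag


def check_reading_order_urls(reading_order):
    """Return a list of problems found in the reading-order URLs."""
    problems = []
    seen = set()
    for item in reading_order:
        # Entries may be plain URL strings or LinkedResource-style objects.
        url = item if isinstance(item, str) else item.get("url", "")
        base, fragment = urldefrag(url)
        if fragment:
            problems.append(f"{url}: fragment not allowed in reading order")
        # Assumption: identical URLs (ignoring fragments) are the same resource.
        if base in seen:
            problems.append(f"{url}: resource referenced more than once")
        seen.add(base)
    return problems


order = [
    {"url": "audio/part001.mp3#t=0,456"},
    {"url": "audio/part001.mp3#t=456"},
    "audio/part002.mp3",
]
for problem in check_reading_order_urls(order):
    print(problem)
```

In the sample, both entries for part001.mp3 are flagged for carrying a fragment, and the second is also flagged as a repeat of the first.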
The resource list enumerates any additional resources used in the processing and rendering of an audiobook that are not listed in the reading order. It is expressed using the resources property.
If an audiobook includes supplemental content it MUST be referenced in the resource list.
{
"@context" : ["https://schema.org", "https://www.w3.org/ns/pub-context"],
"conformsTo" : "https://www.w3.org/TR/audiobooks/",
"url" : "https://publisher.example.org/janeeyre",
"name" : "Jane Eyre",
"resources" : [
"cover.jpg",
"portrait_CB.jpg",
"supplement.pdf"
]
}
Previews are a common way to give users a sample of the content before they purchase or download the full audiobook.
A preview is identified using the preview link relation, as defined in [ pub-manifest ].
Previews MAY be located externally or included as a resource of the audiobook.
{
"@context" : ["https://schema.org", "https://www.w3.org/ns/pub-context"],
"conformsTo" : "https://www.w3.org/TR/audiobooks/",
"url" : "https://publisher.example.org/janeeyre",
"name" : "Jane Eyre",
"resources" : [{
"type" : "LinkedResource",
"url" : "https://publisher.example.org/jane-eyre-preview.wav",
"encodingFormat" : "audio/wav",
"rel" : "preview"
}]
}
{
"@context" : ["https://schema.org", "https://www.w3.org/ns/pub-context"],
"conformsTo" : "https://www.w3.org/TR/audiobooks/",
"url" : "https://publisher.example.org/janeeyre",
"name" : "Jane Eyre",
"resources" : [{
"type" : "LinkedResource",
"url" : "preview.wav",
"encodingFormat" : "audio/wav",
"rel" : "preview"
}]
}
This section is non-normative.
Audiobooks will be packaged using the method described in the Lightweight Packaging Format [ lpf ] note.
This section is non-normative.
The history of the audiobook is rooted in the world of accessibility. Both purely audio publications and publications that synchronize text and audio playback have long been used to assist users with alternative reading needs and preferences.
An approach for accessible synchronized media in publications is currently being developed by the Synchronized Multimedia for Publications Community Group . Refer to the work of that group for more information about creating such content and incorporating it into an Audiobook.
Alternatively, a content creator can provide the text equivalent as HTML [ html ] resources in the resource list .
{
"@context" : ["https://schema.org", "https://www.w3.org/ns/pub-context"],
"conformsTo" : "https://www.w3.org/TR/audiobooks/",
"url" : "https://publisher.example.org/janeeyre",
"name" : "Jane Eyre",
"readingOrder" : {
"type" : "LinkedResource",
"url" : "audio/part001.wav#t=0",
"encodingFormat" : "audio/vnd-wav",
"name" : "Chapter 1",
"duration" : "PT457.931S",
"alternate" : {
"type" : "LinkedResource",
"url" : "text/part001-1.html",
"encodingFormat" : "text/html"}
},
"resources" : [{
"type": "LinkedResource",
"url": "text/part001-1.html",
"encodingFormat" : "text/html"
},
…
]
}
This section depends on the Infra Standard [ infra ].
The specification extends the Publication Manifest processing algorithms [ pub-manifest ] as follows:
The following extension steps are added for Audiobook manifests:
( 5.6.2 Duration ) Check the duration of the publication as follows:
Let resourceDuration hold the total duration of individual resources.
For each resource of data["readingOrder"] :
if resource["duration"] is not defined, validation error .
otherwise, add resource["duration"] to resourceDuration .
If the values cannot be compared because data["duration"] is not set, validation error .
Otherwise, if resourceDuration does not specify the same total duration as data["duration"] , validation error .
This step checks both that all resources in the reading order specify a duration and that the sum of all those durations matches the total duration for the publication.
A validation error is only emitted while checking each resource if the resource does not specify a duration. The validity of the durations [ pub-manifest ] is already checked in the publication manifest algorithm so does not need to be repeated.
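The duration step can be sketched in Python as follows. This is an illustration under stated assumptions, not the normative algorithm: reading-order entries are assumed to be objects already (as they would be after manifest normalization), only the day and time subset of ISO 8601 durations is parsed, and the names to_seconds and check_duration are hypothetical. Decimal arithmetic is used so that fractional seconds sum exactly.

```python
import re
from decimal import Decimal

# Assumption: only the day/time subset of ISO 8601 durations (e.g. "PT457.931S",
# "PT1H2M3S") is handled; years, months, and weeks are out of scope here.
_DURATION = re.compile(
    r"^P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?)?$"
)


def to_seconds(value):
    """Convert a supported ISO 8601 duration string to seconds."""
    match = _DURATION.match(value)
    if match is None or value in ("P", "PT"):
        raise ValueError(f"unsupported duration: {value!r}")
    days, hours, minutes, seconds = (Decimal(g or "0") for g in match.groups())
    return days * 86400 + hours * 3600 + minutes * 60 + seconds


def check_duration(data):
    """Return the validation errors produced by the duration step."""
    errors = []
    resource_duration = Decimal(0)
    for resource in data.get("readingOrder", []):
        if "duration" not in resource:
            errors.append(f"no duration on {resource.get('url')}")
        else:
            resource_duration += to_seconds(resource["duration"])
    if "duration" not in data:
        errors.append("publication duration is not set")
    elif to_seconds(data["duration"]) != resource_duration:
        errors.append("publication duration differs from the sum of the items")
    return errors


manifest = {
    "duration": "PT30S",
    "readingOrder": [
        {"url": "a.mp3", "duration": "PT10.5S"},
        {"url": "b.mp3", "duration": "PT19.5S"},
    ],
}
print(check_duration(manifest))
```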
The following extension steps are added for Audiobook manifests:
( 5.7 Default Reading Order ) Check the reading order as follows:
If data["readingOrder"] is not set, fatal error .
For each resource in data["readingOrder"] , if resource is not an audio resource, validation error , remove resource from data["readingOrder"] .
If data["readingOrder"] is an empty list , fatal error .
This step ensures that only audio resources are listed in the reading order and removes any that are not.
If the reading order does not contain any entries after checking each resource, a fatal error is returned as the publication is not a valid audiobook.
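A possible shape for the reading-order step is sketched below in Python. The test used to decide whether an entry is an audio resource (an encodingFormat beginning with audio/) is an assumption made for this example, as are the FatalError class and the function name check_reading_order. Entries are assumed to be objects, as they would be after manifest normalization.

```python
class FatalError(Exception):
    """Hypothetical signal that processing of the manifest must stop."""


def check_reading_order(data):
    """Filter non-audio entries in place and return validation errors."""
    if "readingOrder" not in data:
        raise FatalError("readingOrder is not set")
    errors = []
    kept = []
    for resource in data["readingOrder"]:
        # Assumption: an entry counts as audio when its encodingFormat is an
        # audio/* media type; the specification does not mandate this test.
        if str(resource.get("encodingFormat", "")).startswith("audio/"):
            kept.append(resource)
        else:
            errors.append(f"non-audio resource removed: {resource.get('url')}")
    data["readingOrder"] = kept
    if not kept:
        raise FatalError("readingOrder contains no audio resources")
    return errors


data = {
    "readingOrder": [
        {"url": "audio/chapter001.aac", "encodingFormat": "audio/aac"},
        {"url": "toc.html", "encodingFormat": "text/html"},
    ]
}
print(check_reading_order(data))
```

Raising an exception for the fatal cases and returning a list for validation errors keeps the two severities apart, so a caller can continue processing after collecting validation errors.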
( 5.5 Publication Type ) If data["type"] is not set or is an empty list , validation error , set to « "Audiobook" ».
This step sets the default type of the publication to Audiobook when a type property has not been specified.
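In Python, the type-defaulting step might look like the sketch below. The function name apply_default_type is hypothetical, and the error message wording is illustrative only.

```python
def apply_default_type(data):
    """Set data["type"] to ["Audiobook"] when it is missing or empty."""
    errors = []
    # "not set" and "empty list" are both falsy, so one test covers both cases.
    if not data.get("type"):
        errors.append("type is not set; defaulting to Audiobook")
        data["type"] = ["Audiobook"]
    return errors


data = {"name": "Jane Eyre", "type": []}
print(apply_default_type(data), data["type"])
```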
( 5.2 Requirements ) Check that each of the following properties is set. If not, issue a validation error for each one.
This step checks that all the recommended properties have been set. For more information about these, refer to 5.2 Requirements .
( 5.2 Requirements ) If no resource in data["readingOrder"] or data["resources"] has a rel entry that contains the relation cover , validation error .
This step checks the reading order and resource list to verify that a cover has been specified (i.e., a resource has the value cover in its rel property).
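The cover check can be sketched in Python as shown below. The function name has_cover is hypothetical. The sketch tolerates rel given as either a single string or a list, and skips entries that are plain URL strings, since those carry no rel information.

```python
def has_cover(data):
    """Return True if any reading-order or resource entry has rel 'cover'."""
    for key in ("readingOrder", "resources"):
        for resource in data.get(key, []):
            if not isinstance(resource, dict):
                continue  # plain URL strings carry no rel information
            rel = resource.get("rel", [])
            # rel may be a single string or a list of strings.
            rels = [rel] if isinstance(rel, str) else rel
            if "cover" in rels:
                return True
    return False


manifest = {
    "resources": [
        {"rel": "cover", "url": "images/cover.jpg", "encodingFormat": "image/jpeg"},
        {"rel": "contents", "url": "toc.html", "encodingFormat": "text/html"},
    ]
}
print(has_cover(manifest))
```

A caller would issue a validation error when this function returns False.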
This section is non-normative.
This specification extends the Publication Manifest's User Agent Processing Algorithm for Machine-Processable Table of Contents [ pub-manifest ] to locate a table of contents element as follows:
See also 4.2 Table of Contents for further details.
As Audiobooks is a profile of Publication Manifest [ pub-manifest ], all security and privacy considerations detailed in that specification are applicable to this profile.
This profile acknowledges the following considerations:
This section is non-normative.
This section outlines the expected user agent behaviors for implementation of audiobooks. For processing instructions, user agents should refer to the Processing a Manifest section of the Publication Manifest [ pub-manifest ] specification, and conform to any behavior described there.
All user agent behaviors described in this section are intended to provide implementors with guidance, not strict requirements. Behaviors in this document are taken mainly from the Use Cases and Requirements [ pwp-ucr ] note published by the working group.
As outlined in the Use Cases and Requirements [ pwp-ucr ] note, an audiobook must be navigable in the User Agent. This means that a User Agent must provide methods for the user to move through the audiobook in a linear or non-linear fashion by either moving through the Reading Order seamlessly or by accessing the Table of Contents . The User Agent should also allow the user to move through individual audio files in short time increments.
For an audiobook, the User Agent should provide a player interface [ pwp-ucr ] that will allow the user to navigate, play, or pause the audiobook. This interface can be presented to the user in any way (e.g., physical buttons, visual interface, keyboard input, or voice commands), but should be accessible at any point in the listening experience.
The Use Cases and Requirements [ pwp-ucr ] note recommends that content be available offline and that any packaged formats should not affect the iterations of the publications. This means that even if the content is copied many times to many users via multiple User Agents, the core manifest and its identifier are never changed.
This specification recommends the Lightweight Packaging Format [ lpf ] for packaging audiobook content, but this is not a requirement. Audiobook User Agents should be able to ingest LPF files for play, and should display content according to the requirements and recommendations in this document.
If a User Agent is serving the content directly from its service (e.g., as a retailer or repository of content), it is recommended that it provide a method for the user to download the content or make it available offline. This can be in any format the User Agent chooses, but the audiobook should be complete and valid and the contents listed in the manifest should be served in their entirety. Even if a User Agent does not support the display of a certain resource (e.g., an image file or data table), it should still be available to the user for download.
This specification does not provide a method for content creators to protect or watermark their content, as there are existing methods available in the market today. User Agents that work with content creators who wish to protect or limit the distribution of their content can choose a method that works best for their requirements.
This specification recommends and provides a method for content creators to create fully accessible audiobooks . User Agents should use this information, in the section on Accessibility, to implement accessible audiobook interfaces. It is recommended that User Agents provide accessible player interfaces, as well as a method for content creators who have provided alternate content to have that content displayed.
Substantive changes since Recommendation publication.
Media fragments are no longer allowed on the url property in the Reading Order.
Substantive changes since the First Public Working Draft :
For a complete list of issues addressed, refer to the GitHub tracker .
This section is non-normative.
A manifest for an audiobook. The canonical version of this manifest is also available.
A manifest for an audiobook with supplemental content.
{
"@context" : ["https://schema.org", "https://www.w3.org/ns/pub-context"],
"conformsTo" : "https://www.w3.org/TR/audiobooks/",
"id" : "https://publisher.example.com/janeeyre",
"url" : "https://publisher.example.com/janeeyre",
"name" : "Jane Eyre",
"author" : "Charlotte Bronte",
"readBy" : "Jane Doe",
"duration" : "PT3891.12S",
"abridged" : false,
"inLanguage" : "en",
"dateModified" : "2019-03-29T15:59:00Z",
"datePublished" : "2019-03-29",
"readingOrder": [
{"url": "audio/chapter001.aac", "encodingFormat": "audio/aac", "name": "Chapter 1", "duration": "PT1234.567S"},
{"url": "audio/chapter002.aac", "encodingFormat": "audio/aac", "name": "Chapter 2", "duration": "PT890.123S"},
{"url": "audio/chapter003.aac", "encodingFormat": "audio/aac", "name": "Chapter 3", "duration": "PT456.789S"},
{"url": "audio/chapter004.aac", "encodingFormat": "audio/aac", "name": "Chapter 4", "duration": "PT987.654S"},
{"url": "audio/chapter005.aac", "encodingFormat": "audio/aac", "name": "Chapter 5", "duration": "PT321.987S"}
],
"resources": [
{"rel": "cover", "url": "images/cover.jpg", "encodingFormat": "image/jpeg"},
{"rel": "contents", "url": "toc.html", "encodingFormat": "text/html"},
{"url": "haworth_house.pdf", "encodingFormat": "application/pdf"}
]
}
This section is non-normative.
A primary entry page with a simple table of contents for an audiobook.
<head>
…
<script type="application/ld+json">
{
"@context" : ["https://schema.org","https://www.w3.org/ns/pub-context"],
"conformsTo" : "https://www.w3.org/TR/audiobooks/",
…
"url" : "https://publisher.example.org/janeeyre",
…
}
</script>
…
</head>
<body>
…
<nav role="doc-toc">
<ol>
<li><a href="audio/chapter001.wav">Chapter 1. There was no possibility of taking a walk that day...</a></li>
<li><a href="audio/chapter002.wav">Chapter 2. I resisted all the way:...</a></li>
<li><a href="audio/chapter003.wav">Chapter 3. The next thing I remember is,...</a></li>
…
</ol>
</nav>
…
</body>
A table of contents for a simple audiobook.
<nav role="doc-toc">
<h2>JANE EYRE</h2>
<ol>
<li><a href="audio/chapter001.mp3">Chapter 1. There was no possibility of taking a walk that day...</a></li>
<li><a href="audio/chapter002.mp3">Chapter 2. I resisted all the way:...</a></li>
<li><a href="audio/chapter003.mp3">Chapter 3. The next thing I remember is,...</a></li>
…
</ol>
</nav>
A table of contents using media fragment references to locations in a single audio track.
<nav role="doc-toc">
<h2>JANE EYRE</h2>
<ol>
<li><a href="https://example.publisher.org/janeeyre/part001.mp3#t=0,456.788">Chapter 1</a></li>
<li><a href="https://example.publisher.org/janeeyre/part001.mp3#t=456.789,1234.566">Chapter 2</a></li>
<li><a href="https://example.publisher.org/janeeyre/part001.mp3#t=1234.567">Chapter 3</a></li>
</ol>
</nav>
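To show how a user agent might interpret table of contents links like those in the example above, the Python sketch below splits an href into the audio resource URL and the start and end offsets given by its temporal media fragment. The function name parse_time_fragment is hypothetical. Only plain seconds are handled; the hh:mm:ss and "npt:"-prefixed forms that Media Fragments [ media-frags ] also allow are deliberately left out.

```python
from urllib.parse import urldefrag


def parse_time_fragment(href):
    """Return (resource_url, start, end) for an href with a #t= fragment.

    Assumption: only plain seconds (e.g. "t=456.789,1234.566") are handled,
    not the hh:mm:ss or "npt:"-prefixed forms that Media Fragments also allow.
    """
    url, fragment = urldefrag(href)
    start, end = 0.0, None  # no fragment means "whole resource"
    for part in fragment.split("&"):
        name, _, value = part.partition("=")
        if name != "t":
            continue
        begin, _, finish = value.partition(",")
        start = float(begin) if begin else 0.0
        end = float(finish) if finish else None
    return url, start, end


href = "https://example.publisher.org/janeeyre/part001.mp3#t=456.789,1234.566"
print(parse_time_fragment(href))
```

An end value of None means playback continues to the end of the resource, as in the Chapter 3 link above.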