Copyright © 2018 W3C ® ( MIT , ERCIM , Keio , Beihang ). W3C liability , trademark and permissive document license rules apply.
This specification defines a collection of information that describes the structure of Web Publications so that user agents can provide user experiences tailored to reading publications, such as sequential navigation and offline reading. This information includes the default reading order, a list of resources, and publication-wide metadata.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.
This draft provides a preliminary outline of a Web Publication. Many details are under active consideration within the Publishing Working Group and are subject to change. The most prominent known issues have been identified in this document and links provided to comment on them.
This document was published by the Publishing Working Group as an Editor's Draft. Comments regarding this document are welcome. Please send them to public-publ-wg@w3.org ( subscribe , archives ).
Publication as an Editor's Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the W3C Patent Policy . W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy .
This document is governed by the 1 February 2018 W3C Process Document .
This section is non-normative.
A Web Publication is a discoverable and identifiable collection of resources. Information about the Web Publication is expressed in a machine-readable document called a manifest , which is what enables user agents to understand the bounds of the Web Publication and the connection between its resources.
The manifest includes metadata that describe the Web Publication, as a publication has an identity and nature beyond its constituent resources. The manifest also provides a list of all the resources that belong to the Web Publication and a default reading order, which is how it connects resources into a single contiguous work.
A Web Publication is discoverable in one of two ways: resources either include a link to the manifest (via an HTTP Link header or an HTML link element [ html ]), or the manifest can be loaded directly by a compatible user agent.
With the establishment of Web Publications, user agents can build new experiences tailored specifically for their unique reading needs.
This section is non-normative.
This specification only defines requirements for the production and rendering of valid Web Publications . As much as possible, it leverages existing Open Web Platform technologies to achieve its goal—that being to allow for a measure of boundedness on the Web without changing the way that the Web itself operates.
Moreover, the specification is designed to adapt automatically to updates to Open Web Platform technologies in order to ensure that Web Publications continue to interoperate seamlessly as the Web evolves (e.g., by referencing the latest published versions instead of specific dated versions).
Further, this specification does not attempt to constrain the nature of a Web Publication: any type of work that can be represented on the Web constitutes a potential Web Publication.
The specification is also intended to facilitate different user agent architectures for the consumption of Web Publications. While a primary goal is that traditional Web user agents (browsers) will be able to consume Web Publications, this should not limit the capabilities of any other possible type of user agent (e.g., applications, whether standalone or running within a user agent, or even Web Publications that include their own user interface). As a result, the specification does not attempt to architect required solutions for situations whose expected outcome will vary depending on the nature of the user agent and the expectations of the user (e.g., how to prompt to initiate a Web Publication, or at what point or how much of a Web Publication to cache for offline use).
We may want to write something here on the relationships…
This document uses terminology defined by the W3C Note "Publishing and Linking on the Web" [ publishing-linking ], including, in particular, user , user agent , browser , and address .
An identifier is metadata that can be used to refer to Web Content in a persistent and unambiguous manner. URLs , URNs , DOIs , ISBNs , and PURLs are all examples of persistent identifiers frequently used in publishing.
A manifest represents structured information about a Web Publication , such as informative metadata, a list of all resources , and a default reading order .
For the purposes of this specification, non-empty is used to refer to an element, attribute or property whose text content or value consists of one or more characters after whitespace normalization, where whitespace normalization rules are defined per the host format.
The general term URL is defined by the URL Standard [ url ]. It is used as in other W3C specifications, like HTML [ html ]. In particular, a URL allows for the usage of characters from Unicode following [ rfc3987 ]. See the note in the HTML5 specification for further details.
A Web Publication is a collection of one or more resources, organized together through a manifest into a single logical work with a default reading order . The Web Publication is uniquely identifiable and presentable using Open Web Platform technologies.
As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key words MAY , MUST , MUST NOT , OPTIONAL , RECOMMENDED , REQUIRED , SHOULD , and SHOULD NOT are to be interpreted as described in [ RFC2119 ].
This specification defines two conformance classes: one for Web Publications and one for user agents that process them.
A Web Publication conforms to this specification if it meets the following criteria:
A user agent conforms to this specification if it meets the following criteria:
This section is non-normative.
A Web Publication is defined by a set of items known as its information set ( infoset ). The infoset is both abstract and concrete. It is abstract in the sense that it represents a set of information that a user agent has to compile about the Web Publication, but it also becomes concrete when the user agent creates an internal representation of that information.
A manifest , on the other hand, is a serialization of an infoset created by the author of a Web Publication . The manifest is expressed using the JSON -LD [ json-ld ] format — a variant of JSON [ ecma-404 ] for expressing linked data. The manifest can be created as a standalone resource or it can be embedded within an HTML document.
Although the infoset is primarily compiled from a Web Publication's manifest , some information is obtained outside the manifest. The table of contents, for example, may be referenced from the manifest but is serialized in an HTML document.
This specification describes the requirements for creating both the infoset and manifest. This section, in particular, details how to create a manifest, and the next lists the various properties common to infosets and manifests.
A Web Publication Manifest MUST start by setting the JSON -LD context [ json-ld ]. The context has the following two major components:
https://schema.org
https://www.w3.org/ns/wp-context
{
"@context" : ["https://schema.org", "https://www.w3.org/ns/wp-context"],
…
}
The Web Publication context file MAY add features to the properties defined in Schema.org (e.g., the requirement for the creator property to be order preserving).
As part of the continuous contacts with Schema.org the additional features defined in the Web Publication context file could migrate to the core Schema.org vocabulary.
Although Schema.org is often referenced using the http URI scheme, the vocabulary is being migrated to use the secure https scheme as its default. This specification requires the use of https when referencing Schema.org in the manifest.
In some cases, the context MAY be extended by additional, local information. For example, see 4.4.5.2 Manifest Expression .
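As an illustration of the context requirement, a processor could check the start of a parsed manifest along the following lines. This is a minimal sketch: the function name is hypothetical, and it assumes that the two components appear first and in the order used by the examples, with any local extensions following them.

```python
SCHEMA_ORG = "https://schema.org"
WP_CONTEXT = "https://www.w3.org/ns/wp-context"


def has_required_context(manifest: dict) -> bool:
    """Return True if @context starts with the two required components.

    Assumption: the components are listed first and in this order;
    additional (local) context entries may follow them.
    """
    context = manifest.get("@context")
    if not isinstance(context, list):
        return False
    return context[:2] == [SCHEMA_ORG, WP_CONTEXT]
```

Note that a manifest referencing Schema.org through the http scheme fails this check, in line with the https requirement.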
The Web Publication Manifest MUST include a Publication Type using the @type keyword [ json-ld ]. The type MAY be mapped onto the CreativeWork type [ schema.org ].
{
"@context" : ["https://schema.org", "https://www.w3.org/ns/wp-context"],
"@type" : "CreativeWork"
…
}
Schema.org also includes a number of more specific types, all subtypes of CreativeWork, such as Article, Book, and Course. These MAY be used instead of CreativeWork.
{
"@context" : ["https://schema.org", "https://www.w3.org/ns/wp-context"],
"@type" : "Book"
…
}
Refer to the Schema.org site for the complete list of CreativeWork subtypes.
The naming, syntax, and requirements for manifest properties are defined in 4. Web Publication Properties .
Although authors only have to understand the serialization requirements for manifest terms, they are encouraged to read through the infoset definitions for each property, as well. The infoset definitions describe, in some cases, how items are compiled in the absence of explicit information in the manifest.
A manifest can be embedded within an HTML document using the script element [ html ]. When embedding a manifest, the type attribute of the containing script element MUST be set to application/ld+json. Additionally, the script element MUST include a unique identifier in an id attribute [ html ]. This identifier ensures that the manifest can be referenced.
<script id="example_manifest" type="application/ld+json">
{
…
}
</script>
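A processor that receives an HTML document might locate an embedded manifest roughly as follows. This is a sketch only, built on Python's standard html.parser module; the class and function names are hypothetical and error handling for malformed JSON is left out.

```python
import json
from html.parser import HTMLParser


class ManifestExtractor(HTMLParser):
    """Collect the text of the first matching application/ld+json script element."""

    def __init__(self, script_id):
        super().__init__()
        self.script_id = script_id
        self._capturing = False
        self.raw = None  # stays None if no matching script element is found

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if (tag == "script"
                and attributes.get("type") == "application/ld+json"
                and attributes.get("id") == self.script_id
                and self.raw is None):
            self._capturing = True
            self.raw = ""

    def handle_data(self, data):
        if self._capturing:
            self.raw += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._capturing = False


def extract_embedded_manifest(html, script_id):
    """Return the embedded manifest as a dict, or None if it is not present."""
    parser = ManifestExtractor(script_id)
    parser.feed(html)
    parser.close()
    return json.loads(parser.raw) if parser.raw is not None else None
```

Looking the script element up by its id mirrors the reason the id attribute is required: it is what allows a link to reference the manifest.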
Resources SHOULD provide a link to the manifest of the Web Publication to which they belong to enable discovery. Links MUST take one or both of the following forms:
An HTTP Link header field [ rfc5988 ] with its rel parameter set to the value "publication".
Link: <https://example.com/webpub/manifest>; rel=publication
A link element [ html ] with its rel attribute set to the value "publication".
<link href="https://example.com/webpub/manifest" rel="publication" />
When a manifest is embedded within an HTML document, the link MUST include a fragment identifier that references the script element that contains the manifest (see 3.3 Embedding a Manifest).
<link href="#example_manifest" rel="publication">
…
<script id="example_manifest" type="application/ld+json">
{
"@context" : ["https://schema.org", "https://www.w3.org/ns/wp-context"],
…
}
</script>
The exact value of rel is still to be agreed upon and should be registered by IANA.
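To illustrate discovery through the HTTP header form, the following sketch pulls manifest URLs out of a Link header value. It is deliberately simplified and is not a complete [ rfc5988 ] parser (for instance, it does not handle commas inside quoted parameter values); the function name is hypothetical, and the relation name "publication" is the provisional value used in this section.

```python
import re

# Matches one link-value: a <target> followed by zero or more ;param segments.
_LINK_VALUE = re.compile(r'<([^>]*)>((?:\s*;\s*[^;,]+)*)')


def manifest_urls_from_link_header(value, rel="publication"):
    """Return the targets of all links whose rel parameter contains `rel`."""
    urls = []
    for target, params in _LINK_VALUE.findall(value):
        for param in params.split(";"):
            name, _, raw = param.strip().partition("=")
            if name.lower() == "rel":
                relations = raw.strip().strip('"').split()
                if rel in relations:
                    urls.append(target)
    return urls
```

The targets are returned in header order, so a caller with no other preference can simply take the first entry.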
The following details might be moved to the lifecycle section in a future draft.
When a resource links to multiple manifests, a user agent MAY choose to present one or more alternatives to the end user, or choose a single alternative on its own. The user agent MAY choose to present any manifest based upon information that it possesses, even one that is not explicitly listed as a parent (e.g., based upon information it calculates or acquires out of band). In the absence of a preference by user agent implementers, selection of the first manifest listed is suggested as a default.
A Web Publication MUST include at least one HTML document [ html ] that links to the manifest . This page is referred to as the primary entry page of the Web Publication.
The manifest may be embedded into the primary entry page ; in this case the link element MUST use a relative URL to refer to the manifest.
The manifest itself MUST NOT include a reference to itself, i.e., the reference to the manifest MUST NOT appear as part of the 4.7 Resource Categorization Properties .
There are no restrictions on a Web Publication beyond this requirement. The Web Publication MAY include references to resources of any media type, both in the default reading order and as dependencies of other resources.
When adding resources to a Web Publication, consider support in user agents. The use of progressive enhancement techniques and the provision of fallback content, as appropriate, will ensure a more consistent reading experience for users regardless of their preferred user agent.
The table of contents provides a hierarchical list of links that reflects the structural outline of the major sections of the Web Publication.
The table of contents is expressed via an HTML element in one of the resources (typically a nav element [ html ]). This element MUST be identified by the role attribute [ html ] value "doc-toc" [ dpub-aria-1.0 ], and MUST be the first element in the document so designated.
The table of contents SHOULD be located in the primary entry page of the Web Publication. If not, the manifest SHOULD identify the resource that contains the structure.
There are no requirements on the table of contents itself, except that, when specified, it MUST include a link to at least one resource .
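The identification rule can be sketched as follows: find the first element whose role attribute contains "doc-toc" and gather the links inside it. This is an illustrative sketch, not a conformance checker; the class and function names are hypothetical, and it assumes markup in which every non-void element has an explicit end tag.

```python
from html.parser import HTMLParser

# Elements that never have an end tag, so they must not affect nesting depth.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
                 "link", "meta", "source", "track", "wbr"}


class TocFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.depth = 0       # nesting depth inside the table of contents (0 = outside)
        self.found = False   # True once the first doc-toc element has been seen
        self.links = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if self.depth == 0:
            roles = (attributes.get("role") or "").split()
            if not self.found and "doc-toc" in roles and tag not in VOID_ELEMENTS:
                self.found = True
                self.depth = 1
            return
        if tag == "a" and attributes.get("href"):
            self.links.append(attributes["href"])
        if tag not in VOID_ELEMENTS:
            self.depth += 1

    def handle_startendtag(self, tag, attrs):
        pass  # self-closing syntax (e.g. <br/>) opens nothing, so depth is unchanged

    def handle_endtag(self, tag):
        if self.depth > 0 and tag not in VOID_ELEMENTS:
            self.depth -= 1


def toc_links(html):
    """Return the hrefs in the first doc-toc element, or None if there is none."""
    finder = TocFinder()
    finder.feed(html)
    finder.close()
    return finder.links if finder.found else None
```

A result of None means no table of contents was designated, while an empty list means one was found but it does not satisfy the requirement to link to at least one resource.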
Refer to the table of contents property definition for more information on how to identify in the infoset and manifest which resource contains the table of contents.
This question arises only if the table of contents is accepted: can a table of contents navigation element refer, via links, to any resource that is not listed in the default reading order ?
This section is non-normative.
Both the Web Publication infoset and manifest are defined by a common set of properties that describe the basic information a user agent requires to process and render a Web Publication. These properties are categorized as follows:
Descriptive properties describe aspects of a Web Publication, such as its title , creator , and language . These properties are primarily drawn from Schema.org and its hosted extensions [ schema.org ], so they map to one or several Schema.org properties and inherit their syntax and semantics. (The following property categories typically do not have Schema.org equivalents, so are defined specifically for Web Publications.)
Informative properties identify resources that contain additional information about the Web Publication, such as its privacy policy or an accessibility report .
Structural properties identify key meta structures of the Web Publication, such as the cover image or the location of the table of contents .
Resource categorization properties describe or identify common sets of resources, such as the resource list and default reading order . These properties refer to one or more external resources (images, script files, separate metadata files, etc.).
The categorization of properties is done to simplify comprehension of their purpose; the groupings have no relevance outside this specification (i.e., the groupings do not exist in the infoset or manifest).
Schema.org includes a large number of properties that, though relevant for publishing, are not mentioned in this specification; Web Publication authors can use any of those properties. This document defines only the minimal set of infoset items, and their mapping to Schema.org when appropriate.
There is discussion on whether a best practices document that refers to more schema.org terms should be created. If so, it should be linked from here.
(Extracting a discussion on #232 into a separate Issue.)
Schema.org has a (currently "pending") type LinkRole which may be a good alternative to the (publication specific) PublicationLink . Maybe worth considering using the schema.org type.
Ref: #232 (review) , #232 (comment) , #232 (comment)
The requirements for the expression of Web Publication properties are defined by the infoset as follows:
As the infoset properties do not all have to be serialized in the manifest, the requirements for the manifest will differ in some cases. Refer to each property's definition to determine whether it is required in the manifest or can be compiled from other information.
With the exception of the descriptive properties , the Web Publication properties typically link to one or more resources. When a property requires a link value, the link MUST be expressed in one of the following two ways:
A PublicationLink object that can be used to express the URL, the media type, and other characteristics of the target resource.
{
…
"resources" : [
"datatypes.svg",
{
"@type" : "PublicationLink",
"url" : "test-utf8.csv",
"encodingFormat" : "text/csv",
"name" : "Test Results",
"description" : "CSV file containing the full data set used in this research."
},
{
"@type" : "PublicationLink",
"url" : "terminology.html",
"encodingFormat" : "text/html",
"rel" : "glossary"
}
]
}
PublicationLink Definition
This specification defines a new type for links called PublicationLink. It consists of the following properties:
| Term | Description | Required Value | [ schema.org ] Mapping | Optionality |
|---|---|---|---|---|
| url | Location of the resource. | A URL [ url ]. Refer to the property definitions that accept this type for additional restrictions. | url | REQUIRED |
| encodingFormat | Media type of the resource (e.g., text/html). | MIME Media Type [ rfc2046 ]. | encodingFormat | OPTIONAL |
| name | Name of the item. | Text. | name | OPTIONAL |
| description | Description of the item. | Text. | description | OPTIONAL |
| rel | The relationship of the resource to the Web Publication. | One or more relations. The values are either the relevant relationship terms of the IANA link registry [ iana-link-relations ], or specially-defined URLs if no suitable link registry item exists. | (None) | OPTIONAL |
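Because the "resources" example in this section mixes bare URL strings with PublicationLink objects, a processor will probably want to bring both into one shape before use. The sketch below is one possible approach under stated assumptions: the function name is hypothetical, and representing rel as a list is a choice made here for convenience, not something this specification requires.

```python
def normalize_link(value):
    """Return a PublicationLink-shaped dict for a string or object link value."""
    if isinstance(value, str):
        return {"@type": "PublicationLink", "url": value}
    if not isinstance(value, dict):
        raise TypeError("a link must be a string or a PublicationLink object")

    url = value.get("url")
    if not isinstance(url, str) or not url.strip():
        # url is the only REQUIRED term of PublicationLink
        raise ValueError("PublicationLink requires a non-empty url")

    link = {"@type": "PublicationLink", "url": url}
    for term in ("encodingFormat", "name", "description"):
        if term in value:
            link[term] = value[term]
    if "rel" in value:
        rel = value["rel"]
        # "One or more relations": accept a single string or a sequence
        link["rel"] = [rel] if isinstance(rel, str) else list(rel)
    return link
```

Applied to each entry of a resources array, this yields a uniform list in which every item has at least a url.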
The accessibility properties provide information about the suitability of a Web Publication for consumption by users with varying preferred reading modalities. These properties typically supplement an evaluation against established accessibility criteria, such as those provided in [ WCAG20 ]. (For linking to a detailed accessibility report, see 4.5.1 Accessibility Report.)
The following infoset items are categorized as accessibility properties:
More detailed descriptions of these properties, as well as their possible values, are available on the WebSchemas Wiki site.
| Term | Description | Required Value | [ schema.org ] Mapping |
|---|---|---|---|
| accessMode | The human sensory perceptual system or cognitive faculty through which a person may process or perceive information. | Text. Expected values. | accessMode |
| accessModeSufficient | A list of single or combined accessModes that are sufficient to understand all the intellectual content of a resource. | Comma-separated values. Expected values. | accessModeSufficient |
| accessibilityAPI | Indicates that the resource is compatible with the referenced accessibility API. | Text. Expected values. | accessibilityAPI |
| accessibilityControl | Identifies input methods that are sufficient to fully control the described resource. | Text. Expected values. | accessibilityControl |
| accessibilityFeature | Content features of the resource, such as accessible media, alternatives and supported enhancements for accessibility. | Text. Expected values. | accessibilityFeature |
| accessibilityHazard | A characteristic of the described resource that is physiologically dangerous to some users. | Text. Expected values. | accessibilityHazard |
| accessibilitySummary | A human-readable summary of specific accessibility features or deficiencies, consistent with the other accessibility metadata but expressing subtleties such as “short descriptions are present but long descriptions will be needed for non-visual users” or “short descriptions are present and no long descriptions are needed.” | Text. | accessibilitySummary |
Note that the author MAY also provide a reference to a more detailed Accessibility Report , beyond the accessibility information expressed by these properties.
{
"@context" : ["https://schema.org","https://www.w3.org/ns/wp-context"],
"@type" : "CreativeWork",
…
"accessMode" : ["textual", "visual"],
"accessModeSufficient" : ["textual"],
…
}
A Web Publication's address is a URL [ url ] that represents the primary entry page for the Web Publication.
If the address does not resolve to an HTML document [ html ], user agents SHOULD NOT provide access to it to users. A Web Publication MAY have more than one address, but all the addresses MUST resolve to the same document.
The referenced document SHOULD be a resource of the Web Publication. It can be any resource, including one that is not listed in the default reading order . This document MUST include a link to the manifest to ensure a bidirectional linking relationship (i.e., that user agents can also locate the manifest from the document at the address).
If the document is not a Web Publication resource, user agents SHOULD load the first document in the default reading order when initiating the Web Publication.
To improve the usability of Web Publications, particularly in user agents that do not support Web Publications, include navigation aids in the referenced document that facilitate consumption of the content (e.g., provide a table of contents or a link to one).
| Term | Description | Required Value | [ schema.org ] Mapping |
|---|---|---|---|
| url | URL of the primary entry page. | A URL [ url ]. | url |
{
"@context" : ["https://schema.org","https://www.w3.org/ns/wp-context"],
"@type" : "Book",
…
"url" : "https://publisher.example.org/mobydick",
…
}
A Web Publication's canonical identifier is a unique identifier that resolves to the preferred version of the Web Publication.
Ensuring uniqueness of canonical identifiers is outside the scope of this specification. The actual uniqueness achievable depends on such factors as the conventions of the identifier scheme used and the degree of control over assignment of identifiers.
The canonical identifier is intended to provide a measure of permanence above and beyond the Web Publication's address(es) . If a Web Publication is permanently relocated to a new URL , for example, the canonical identifier provides a way of discovering the new location (e.g., a DOI registry could be updated with the new URL , or a redirect could be added to the URL of the canonical identifier). It is also intended to provide a means of identifying instances of the same Web Publication hosted at different URLs .
The canonical identifier MUST be a URL [ url ].
If a URL is not provided in the manifest, or the value is an invalid URL , the Web Publication does not have a canonical identifier. User agents MUST NOT attempt to construct a canonical identifier from any other identifiers provided in the manifest.
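The rule that a missing or invalid value leaves the publication without a canonical identifier can be sketched as below. Several assumptions apply: the function name is hypothetical, the identifier is read from the manifest's @id term, and "valid URL" is approximated as "parses and has a scheme", which is much looser than the parsing rules of the URL Standard [ url ].

```python
from urllib.parse import urlsplit


def canonical_identifier(manifest):
    """Return the canonical identifier, or None if the manifest provides none.

    Deliberately does NOT fall back to other identifiers (isbn, url, ...),
    since user agents must not construct a canonical identifier from them.
    """
    value = manifest.get("@id")
    if not isinstance(value, str):
        return None
    value = value.strip()
    try:
        parts = urlsplit(value)
    except ValueError:
        return None  # e.g. a malformed bracketed host
    if not parts.scheme or not (parts.netloc or parts.path):
        return None
    return value
```

A manifest that carries only an isbn and a url therefore yields None, even though both values could be turned into something resembling an identifier.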
The canonical identifier can be used as the target of a "canonical" link [ rfc6596 ] (e.g., a link element [ html ] whose rel attribute has the value canonical or an HTTP Link header field [ rfc5988 ] similarly identified).
Is it necessary to call out a canonical identifier explicitly in the infoset, or can it be handled by other metadata?
| Term | Description | Required Value | [ schema.org ] Mapping |
|---|---|---|---|
| @id | Preferred version of the Web Publication. | A URL [ url ]. | (None) |
The specification of the canonical identifier MAY be complemented by the inclusion of additional types of identifiers for the Web Publication using the identifier property [ schema.org ] and/or its subtypes.
{
"@context" : ["https://schema.org","https://www.w3.org/ns/wp-context"],
"@type" : "CreativeWork",
…
"@id" : "http://www.w3.org/TR/tabular-data-model/",
"url" : "http://www.w3.org/TR/2015/REC-tabular-data-model-20151217/",
…
}
{
"@context" : ["https://schema.org","https://www.w3.org/ns/wp-context"],
"@type" : "Book",
…
"isbn" : "9780123456789",
"url" : "https://publisher.example.org/mobydick",
…
}
A creator is an individual or entity responsible for the creation of the Web Publication .
The following infoset items are categorized as creators:
A Web Publication MAY have more than one of each of these types of creators.
| Term | Description | Required Value | [ schema.org ] Mapping |
|---|---|---|---|
| artist | The primary artist for the publication, in a medium other than pencils or digital line art. | One or more Person. | artist |
| author | The author of the publication. | One or more Person or Organization. | author |
| colorist | The individual who adds color to inked drawings. | One or more Person. | colorist |
| contributor | Contributor whose role does not fit to one of the other roles in this table. | One or more Person or Organization. | contributor |
| creator | The creator of the publication. | One or more Person or Organization. | creator |
| editor | The editor of the publication. | One or more Person. | editor |
| illustrator | The illustrator of the publication. | One or more Person. | illustrator |
| letterer | The individual who adds lettering, including speech balloons and sound effects, to artwork. | One or more Person. | letterer |
| penciler | The individual who draws the primary narrative artwork. | One or more Person. | penciler |
| publisher | The publisher of the publication. | One or more Person or Organization. | publisher |
| readBy | A person who reads (performs) the publication (for audiobooks). | One or more Person. | readBy |
| translator | The translator of the publication. | One or more Person or Organization. | translator |
{
"@context" : ["https://schema.org","https://www.w3.org/ns/wp-context"],
"@type" : "Book",
…
"url" : "https://publisher.example.org/mobydick",
"author" : {
"@type" : "Person",
"name" : "Herman Melville"
}
}
{
"@context" : ["https://schema.org","https://www.w3.org/ns/wp-context"],
"@type" : "CreativeWork",
…
"@id" : "http://www.w3.org/TR/tabular-data-model/",
"url" : "http://www.w3.org/TR/2015/REC-tabular-data-model-20151217/",
"author" : [{
"@type" : "Person",
"name" : "Jeni Tennison"
},{
"@type" : "Person",
"name" : "Gregg Kellogg"
},{
"@type" : "Person",
"name" : "Ivan Herman",
"url" : "https://www.w3.org/People/Ivan/"
}],
"editor" : [{
"@type" : "Person",
"name" : "Jeni Tennison"
},{
"@type" : "Person",
"name" : "Gregg Kellogg"
}],
"publisher" : {
"@type" : "Organization",
"name" : "World Wide Web Consortium",
"url" : "https://www.w3.org/"
}
…
}
The Web Publication has a natural language (e.g., English, French, Chinese), as well as a natural base writing direction (left-to-right or right-to-left). The infoset has entries to set these values, which can influence, for example, the behavior of a user agent (e.g., it might place a pop-up for a table of contents on the right hand side for publications whose natural base direction is right-to-left).
Similarly, each natural language property value in the Web Publication's infoset (e.g., title, creators) is localizable [ string-meta ], meaning that the same information is available for each. As a result, the infoset has entries to set the language and the base direction of both the Web Publication and the natural language property values of the infoset.
This is a completely open issue at this moment, both for JSON-LD and Schema.org. The only (incomplete) approach would be to rely on, and base everything on, the UTF-encoding of the text.
The Web Publication's infoset MAY contain global language and base direction declarations for the Web Publication. The natural language MUST be a tag that conforms to [ bcp47 ], while the base direction MUST have one of the following values:
ltr : indicates that the textual values are explicitly directionally set to left-to-right text.
rtl : indicates that the textual values are explicitly directionally set to right-to-left text.
auto : indicates that the textual values are explicitly directionally set to the direction of the first character with a strong directionality.
When specified, these properties are also used as defaults for textual values in the infoset, but they MAY be overridden by individual properties.
These features make it possible to add the title of a publication in different languages, or to repeat the creators' names using different scripts.
It is important to differentiate the language of a publication from the language and the base direction of the individual resources that compose it. If such resources are, for example, in HTML, the language and direction need to be set in those resources, too. The language and base direction of the publication are not inherited.
The global language information MAY be overridden by individual values.
When using Web Publication manifests with bidirectional text, user agents SHOULD identify the base direction of any given natural language value by scanning the text for the first strong directional character. Once the base direction has been identified, user agents MUST determine the appropriate rendering and display of natural language values according to the Unicode Bidirectional Algorithm [ bidi ]. This could require wrapping additional control characters or markup around the string prior to display, in order to apply the base direction. (See C. Examples for bidirectional texts.)
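Scanning for the first strong directional character can be sketched with Python's standard unicodedata module, which exposes each character's bidirectional class. The function name is hypothetical, and the sketch covers only the detection step, not the subsequent rendering according to the Unicode Bidirectional Algorithm [ bidi ].

```python
import unicodedata


def first_strong_direction(text):
    """Return "ltr" or "rtl" based on the first strong character, else None.

    Strong classes: L (left-to-right), R and AL (right-to-left).
    Digits, whitespace, and punctuation are weak or neutral and are skipped.
    """
    for character in text:
        bidi_class = unicodedata.bidirectional(character)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    return None
```

A return value of None (for example, for a string of digits only) leaves the decision to the caller, which might then fall back to the global base direction.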
This section, in particular the features related to text directions, must be reviewed by I18N experts.
If the manifest is embedded in the primary entry page via a script element, and the manifest does not set the global language and/or the base direction (see 4.4.5.2.1 Global Language and Direction), the lang and the dir attributes of the script element are used as the global language and base direction, respectively (see the details on handling the lang and dir attributes in [ html ]).
It is to be discussed whether this last paragraph, i.e., inheriting values from script, should be kept.
If a user agent requires the language and one is not available in the infoset (globally, or specifically for that property), or the obtained value is invalid, the user agent MAY attempt to determine the language. This specification does not mandate how such a language tag is created. The user agent might:
If a language tag cannot be determined, user agents MUST use the value "und" (undetermined).
If the base direction is not specified, or is an invalid value, the value of this item MUST be set to "ltr".
As this infoset item refers to several aspects of setting language and direction, these are treated separately.
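The fallback values "und" and "ltr" can be illustrated with two small helpers. This is a sketch under assumptions: the function names are hypothetical, the regular expression is only a rough well-formedness check rather than a [ bcp47 ] validator, the optional step of attempting to determine the language is skipped, and falling back from an invalid property-level value to the global value is an interpretation made for this example.

```python
import re

# Rough shape check only (assumption); real BCP 47 validation is stricter.
_LANGUAGE_TAG = re.compile(r"^[A-Za-z]{2,8}(-[A-Za-z0-9]{1,8})*$")
_DIRECTIONS = ("ltr", "rtl", "auto")


def resolve_language(property_language, global_language):
    """Pick the property-level tag, then the global one, else "und"."""
    for candidate in (property_language, global_language):
        if isinstance(candidate, str) and _LANGUAGE_TAG.match(candidate):
            return candidate
    return "und"


def resolve_direction(value):
    """Return a valid base direction, using "ltr" when missing or invalid."""
    if isinstance(value, str) and value in _DIRECTIONS:
        return value
    return "ltr"
```

A user agent that does try to determine the language itself would slot that step in before returning "und".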
| Term | Description | Required Value | [ schema.org ] Mapping |
|---|---|---|---|
| inLanguage | Default language. | A language code as defined in [ bcp47 ]. | inLanguage |
| inDirection | Default base direction. | ltr, rtl, or auto | (None) |
If authors intend to use a manifest, or a manifest template, both as embedded manifest and as a separate resource, they are strongly encouraged to set these properties explicitly to avoid interference of the containing script element in case of embedding.
It is possible to set the language for any textual value in the manifest. This information MUST be set for each item separately using the @value and @language keywords (instead of a simple string) [ json-ld ]:
{
"@context" : ["https://schema.org","https://www.w3.org/ns/wp-context"],
"@type" : "Book",
…
"author" : {
"@type" : "Person",
"name" : {
"@value" : "Marcel Proust",
"@language" : "fr"
}
}
}
The value of the language tag MUST be set to a language code as defined in [ bcp47 ]. If not set, the default value is the default language of the manifest.
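A minimal Python sketch of how a consumer might read such a value, whether it is a simple string or an @value/@language object. The helper name text_and_language is hypothetical and not defined by this specification.

```python
from typing import Tuple, Union


def text_and_language(value: Union[str, dict], default_language: str) -> Tuple[str, str]:
    """Return (text, language) for a plain string or an @value/@language object."""
    if isinstance(value, str):
        # A simple string inherits the manifest's default language.
        return value, default_language
    return value["@value"], value.get("@language", default_language)
```

For the author name in the example above, this would yield the text "Marcel Proust" with the language "fr", regardless of the manifest default.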
It is not possible to set the base direction explicitly for a value.
Setting the direction for a natural text value is currently not possible in JSON-LD [ json-ld ]. In case the JSON-LD community, as well as the schema.org community, introduces such a feature, future versions of this specification may extend the ability of Web Publication Manifests to include this.
Setting the language of an item, say, "name": the requirement is that it should be possible to express the language of, say, the title or name of the author individually. This does not seem to work in Schema.org...
The last modification date is the date when the Web Publication was last updated (i.e., whenever changes were last made to any of the resources of the Web Publication, including the manifest ).
The last modification date does not necessarily reflect all changes to the Web Publication (e.g., third-party content could change without the author being aware). User agents SHOULD check the last modification date of individual resources to determine if they have changed and need updating.
| Term | Description | Required Value | [ schema.org ] Mapping |
|---|---|---|---|
| dateModified | Last modification date of the publication. | A Date or DateTime value [ schema.org ], expressed in ISO 8601 Date or DateTime format, respectively [ iso8601 ]. | dateModified |
{
"@context" : ["https://schema.org","https://www.w3.org/ns/wp-context"],
"@type" : "CreativeWork",
…
"@id" : "http://www.w3.org/TR/tabular-data-model/",
"url" : "http://www.w3.org/TR/2015/REC-tabular-data-model-20151217/",
"dateModified" : "2015-12-17",
…
}
The publication date is the date on which the Web Publication was originally published. It represents a static event in the lifecycle of a Web Publication and allows subsequent revisions to be identified and compared.
The exact moment of publication is intentionally left open to interpretation: it could be when the Web Publication is first made available online or could be a point in time before publication when the Web Publication is considered final.
| Term | Description | Required Value | [ schema.org ] Mapping |
|---|---|---|---|
| datePublished | Creation date of the publication. | A Date or DateTime value, expressed in ISO 8601 Date or DateTime format, respectively [ iso8601 ]. | datePublished |
{
"@context" : ["https://schema.org","https://www.w3.org/ns/wp-context"],
"@type" : "CreativeWork",
…
"@id" : "http://www.w3.org/TR/tabular-data-model/",
"url" : "http://www.w3.org/TR/2015/REC-tabular-data-model-20151217/",
"datePublished" : "2015-12-17",
"dateModified" : "2016-01-30",
…
}
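As an illustration of how the two date properties might be consumed, the following Python sketch parses both values and checks whether the publication was revised after its publication date. The function names are hypothetical, and datetime.fromisoformat covers only a subset of ISO 8601 on older Python versions, so treat this as a sketch rather than a complete parser.

```python
from datetime import datetime


def parse_publication_date(value: str) -> datetime:
    """Parse an ISO 8601 date ("2015-12-17") or date-time string.

    A bare date is interpreted as midnight on that day.
    """
    return datetime.fromisoformat(value)


def modified_after_publication(manifest: dict) -> bool:
    """True when dateModified falls on a later calendar day than datePublished."""
    published = parse_publication_date(manifest["datePublished"]).date()
    modified = parse_publication_date(manifest["dateModified"]).date()
    return modified > published
```

Comparing calendar days rather than full timestamps avoids mixing values with and without time zone information.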
The reading progression establishes the reading direction from one resource to the next within a Web Publication .
The value of this property may be:
- ltr: left-to-right;
- rtl: right-to-left;
- auto: the user agent chooses the direction.
This infoset item has no effect on the rendering of the individual primary resources; it is only relevant for the progression direction from one resource to the other.
The reading progression of a Web Publication is used to adapt such publication-level interactions as menu position, swipe direction, tap zones that lead the user to the next and previous pages, touch gestures, etc.
| Term | Description | Required Value | [ schema.org ] Mapping |
|---|---|---|---|
| readingProgression | Reading direction from one resource to the other. | ltr, rtl, or auto | (None) |
If this value is not set, its default value is ltr.
{
"@context" : ["https://schema.org","https://www.w3.org/ns/wp-context"],
"@type" : "Book",
…
"url" : "https://publisher.example.org/mobydick",
"readingProgression" : "ltr"
}
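A minimal Python sketch of how a user agent might apply the ltr default and derive a tap zone from the result. Both function names are hypothetical; treating an unrecognized value like a missing one, and handling auto like ltr for tap zones, are assumptions of this sketch rather than requirements of this specification.

```python
def reading_progression(manifest: dict) -> str:
    """Return the reading progression, defaulting to "ltr" when not set."""
    value = manifest.get("readingProgression")
    # Assumption: an unrecognized value is handled like a missing one.
    return value if value in ("ltr", "rtl", "auto") else "ltr"


def next_resource_tap_zone(progression: str) -> str:
    """Side of the viewport whose tap zone leads to the next resource."""
    # Assumption: "auto" is handled like "ltr" in this sketch.
    return "left" if progression == "rtl" else "right"
```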
The title provides the human-readable name of the Web Publication .
The title is specified by the manifest expression, when present. If not included in the manifest, user agents MAY use the value of the title element [ html ] of the Web Publication’s primary entry page (if present).
Relying on the title element could be semantically problematic if the Web Publication consists of several HTML resources (e.g., one per chapter of a book), because the HTML definition defines this element as "metadata" for the enclosing HTML document, not for a collection of resources. Using this element is, on the other hand, preferred in the case of a publication consisting of a single HTML document (e.g., a scholarly journal article).
When specified in the infoset , the title MUST be non-empty .
If a user agent requires a title and one is not available in the infoset , it MAY create one (e.g., provide a language-specific placeholder title or use the URL of the manifest).
A user agent is not expected to produce a meaningful title [ wcag20 ] for a Web Publication when one is not specified.
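The fallback chain described above could be sketched in Python as follows. The function name, its parameters, and the "Untitled publication" placeholder are assumptions made for illustration; the specification leaves the exact placeholder to the user agent.

```python
from typing import Optional


def publication_title(manifest: dict,
                      entry_page_title: Optional[str] = None,
                      manifest_url: str = "") -> str:
    """Pick a title: manifest name, then the entry page title, then a placeholder."""
    name = manifest.get("name")
    if isinstance(name, dict):
        # Localized form using the @value/@language keywords.
        name = name.get("@value")
    if isinstance(name, str) and name.strip():
        return name
    if entry_page_title and entry_page_title.strip():
        return entry_page_title.strip()
    # Hypothetical last resort: the manifest URL or a fixed placeholder.
    return manifest_url or "Untitled publication"
```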
| Term | Description | Required Value | [ schema.org ] Mapping |
|---|---|---|---|
| name | Human-readable name of the Web Publication. | Text. | name |
{
"@context" : ["https://schema.org","https://www.w3.org/ns/wp-context"],
"@type" : "Book",
…
"url" : "https://publisher.example.org/mobydick",
"name" : "Moby Dick"
}
An accessibility report provides information about the suitability of a Web Publication for consumption by users with varying preferred reading modalities. These reports typically identify the result of an evaluation against established accessibility criteria, such as those provided in [ WCAG21 ], and are an important source of information in determining the usability of a Web Publication.
The infoset SHOULD include a link to an accessibility report when one is available for a Web Publication. It is RECOMMENDED that the report be included as a resource of the Web Publication.
It is also RECOMMENDED that the accessibility report be provided in a human-readable format, such as HTML [ html ]. Augmenting these reports with machine-processable metadata, such as provided in Schema.org [ schema.org ], is also RECOMMENDED .
Machine-readable accessibility metadata may be recommended in whatever format is used to externalize publication metadata (e.g., to ensure availability for search). Depending on how this externalizing is done, adding machine-processable accessibility metadata to such a record could take precedence over, or complement, the accessibility record.
If present in the manifest, the accessibility report MUST be expressed as a PublicationLink. The rel value of the PublicationLink MUST include the https://www.w3.org/ns/wp#accessibility-report identifier.
The Working Group will attempt to have the accessibility-report term defined by IANA, to avoid using a URL.
{
"@context" : ["https://schema.org","https://www.w3.org/ns/wp-context"],
"@type" : "Book",
…
"url" : "https://publisher.example.org/mobydick",
"name" : "Moby Dick",
"links" : [{
"@type" : "PublicationLink",
"url" : "https://www.publisher.example.org/mobydick-accessibility.html",
"rel" : "https://www.w3.org/ns/wp#accessibility-report"
},{
…
}],
…
}
Users often have the legal right to know and control what information is collected about them, how such information is stored and for how long, whether it is personally identifiable, and how it can be expunged. Including a statement that addresses such privacy concerns is consequently an important part of publishing Web Publications . Even if no information is collected, such a declaration increases the trust users have in the content.
A link to a privacy policy can be included in the infoset . It is RECOMMENDED that the privacy policy be included as a resource of the Web Publication.
It is RECOMMENDED that the privacy policy be provided in a human-readable format, such as HTML [ html ].
Refer to 9. Privacy for more information about privacy considerations in Web Publications.
https://w3c.github.io/wpub/#wp-privacy needs more clarity and should not be so general. Most of the privacy policy collection and enforcement is upstream from the document markup, except where the markup explicitly collects data.
If present in the manifest, the privacy policy MUST be expressed as a PublicationLink. The rel value of the PublicationLink MUST include the privacy-policy identifier [ iana-link-relations ].
{
"@context" : ["https://schema.org","https://www.w3.org/ns/wp-context"],
"@type" : "CreativeWork",
…
"@id" : "http://www.w3.org/TR/tabular-data-model/",
"url" : "http://www.w3.org/TR/2015/REC-tabular-data-model-20151217/",
…
"links" : [{
"@type" : "PublicationLink",
"url" : "https://www.w3.org/Consortium/Legal/privacy-statement-20140324",
"encodingFormat" : "text/html",
"rel" : "privacy-policy"
},{
…
}],
…
}
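Because informative properties such as the accessibility report and the privacy policy are identified only by their rel values, a consumer needs a way to look them up. The following Python sketch shows one possible approach; the helper name links_with_rel is hypothetical, and accepting rel as either a single string or a list is an assumption of this sketch.

```python
from typing import List

ACCESSIBILITY_REPORT = "https://www.w3.org/ns/wp#accessibility-report"
PRIVACY_POLICY = "privacy-policy"


def links_with_rel(manifest: dict, rel: str,
                   lists=("links", "resources", "readingOrder")) -> List[dict]:
    """Return every PublicationLink-like object whose rel value includes `rel`."""
    found = []
    for key in lists:
        for item in manifest.get(key, []):
            if not isinstance(item, dict):
                continue  # plain URL strings carry no rel value
            rels = item.get("rel", [])
            if isinstance(rels, str):
                rels = [rels]
            if rel in rels:
                found.append(item)
    return found
```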
The cover is an image that user agents can use to present the Web Publication to users (e.g., in a library or bookshelf, or when initially loading the Web Publication).
The infoset SHOULD include a reference to a cover image.
User agents SHOULD NOT use the cover image as the sole means of selecting or accessing Web Publications. A user agent SHOULD use the Web Publication's title and creators as text alternatives for such interfaces.
More than one cover image MAY be referenced from the infoset to provide alternative sizes and resolutions for different device screens.
A user agent MAY create a cover for a Web Publication if one is not present. This specification does not define requirements for the creation of such cover images (e.g., the user agent could use a placeholder image, generate an image dynamically, or incorporate properties of the infoset into a graphic, such as the title or creators).
If present in the manifest, the cover MUST be expressed as a PublicationLink. The URL expressed in the url term MUST NOT include a fragment identifier. The rel value of the PublicationLink MUST include the https://www.w3.org/ns/wp#cover-page identifier.
The Working Group will attempt to have the cover-page term defined by IANA, to avoid using a URL.
{
"@context" : ["https://schema.org","https://www.w3.org/ns/wp-context"],
"@type" : "Book",
…
"url" : "https://publisher.example.org/mobydick",
"name" : "Moby Dick",
"resources" : [{
"@type" : "PublicationLink",
"url" : "whale-image.jpg",
"encodingFormat" : "image/jpeg",
"rel" : "https://www.w3.org/ns/wp#cover-page"
},{
…
}],
…
}
The table of contents property identifies the resource that contains the Web Publication's table of contents .
User agents MUST compute the table-of-contents as follows:
1. If a resource is specified with a rel value of contents [ iana-link-relations ], use the url value of the specified resource as the link leading to the table of contents.
2. If an element is identified by the role [ html ] value doc-toc [ dpub-aria-1.0 ], use that element as the table of contents.
If neither of the above cases results in a link to the table of contents, the Web Publication does not have a table of contents and this property MUST NOT be included in the infoset .
If the table of contents resource contains more than one element identified by the role "doc-toc", user agents MUST use the first element in document order as the table of contents.
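The two lookups described above, a manifest link with a rel value of contents and the first element with the doc-toc role, could be sketched in Python as follows. The function and class names are hypothetical, and searching both the readingOrder and resources lists is an assumption of this sketch. The standard library html.parser module is used as a lightweight stand-in for a real HTML parser.

```python
from html.parser import HTMLParser
from typing import Optional, Tuple


class _TocFinder(HTMLParser):
    """Record the first start tag, in document order, whose role includes doc-toc."""

    def __init__(self) -> None:
        super().__init__()
        self.found: Optional[Tuple[str, dict]] = None

    def handle_starttag(self, tag, attrs):
        if self.found is None:
            attributes = dict(attrs)
            roles = (attributes.get("role") or "").split()
            if "doc-toc" in roles:
                self.found = (tag, attributes)


def toc_link_from_manifest(manifest: dict) -> Optional[str]:
    """Return the url of the first link whose rel value includes "contents"."""
    for key in ("readingOrder", "resources"):
        for item in manifest.get(key, []):
            if isinstance(item, dict):
                rels = item.get("rel", [])
                rels = [rels] if isinstance(rels, str) else rels
                if "contents" in rels:
                    return item.get("url")
    return None


def first_toc_element(html_text: str) -> Optional[Tuple[str, dict]]:
    """Return (tag, attributes) of the first doc-toc element, or None."""
    finder = _TocFinder()
    finder.feed(html_text)
    return finder.found
```

Stopping at the first match reflects the rule that only the first doc-toc element in document order is used.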
If present in the manifest, the table of contents MUST be expressed as a PublicationLink. The URL expressed in the url term MUST NOT include a fragment identifier. The rel value of the PublicationLink MUST include the contents identifier [ iana-link-relations ].
The link to the table of contents MAY be specified in either the default reading order or resource-list , but MUST NOT be specified in both.
{
"@context" : ["https://schema.org","https://www.w3.org/ns/wp-context"],
"@type" : "Book",
…
"url" : "https://publisher.example.org/mobydick",
"name" : "Moby Dick",
"resources" : [{
"@type" : "PublicationLink",
"url" : "toc_file.html",
"rel" : "contents"
},{
…
}],
…
}
<head>
…
<script type="application/ld+json">
{
"@context" : ["https://schema.org","https://www.w3.org/ns/wp-context"],
"@type" : "CreativeWork",
…
"@id" : "http://www.w3.org/TR/tabular-data-model/",
"url" : "http://www.w3.org/TR/2015/REC-tabular-data-model-20151217/",
…
}
</script>
…
</head>
<body>
…
<section role="doc-toc">
…
</section>
…
</body>
Web Publication resources are specified via the default reading order , the resource list , and the links , as defined in this section. These lists contain references to informative properties like the privacy policy , and structural properties like the table of contents .
Note that a particular resource's URL MUST NOT appear in more than one of these lists, and a URL MUST NOT be repeated within a list.
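A validator could check this uniqueness constraint along the following lines. This Python sketch uses a hypothetical function name and compares URLs as raw strings; a real implementation would presumably resolve and normalize them first.

```python
from typing import List


def duplicate_urls(manifest: dict) -> List[str]:
    """Return URLs that appear more than once within or across the three lists."""
    seen = set()
    duplicates = []
    for key in ("readingOrder", "resources", "links"):
        for item in manifest.get(key, []):
            url = item if isinstance(item, str) else item.get("url")
            if url is None:
                continue  # nothing to compare for entries without a url
            if url in seen and url not in duplicates:
                duplicates.append(url)
            seen.add(url)
    return duplicates
```

An empty result means that no URL is repeated in any of the lists.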
The default reading order is a specific progression through a set of Web Publication resources.
A user might follow alternative pathways through the content, but in the absence of such interaction the default reading order defines the expected progression from one resource to the next.
The default reading order MUST include at least one resource.
The default reading order is specified directly in the manifest. However, if the reading order consists of only a single resource, namely the primary entry page of the Web Publication, the default reading order need not be specified.
If present in the Web Publication Manifest, this item MUST be mapped on the readingOrder term, defined specifically for Web Publications.
| Term | Description | Required Value | [ schema.org ] Mapping |
|---|---|---|---|
readingOrder
|
An array of:
The order in the array is significant . The URLs MUST NOT include fragment identifiers. |
(None) |
{
"@context" : ["https://schema.org","https://www.w3.org/ns/wp-context"],
"@type" : "Book",
…
"url" : "https://publisher.example.org/mobydick",
"name" : "Moby Dick",
"readingOrder" : [
"html/title.html",
"html/copyright.html",
"html/introduction.html",
"html/epigraph.html",
"html/c001.html",
…
]
}
{
"@context" : ["https://schema.org","https://www.w3.org/ns/wp-context"],
"@type" : "Book",
…
"url" : "https://publisher.example.org/mobydick",
"name" : "Moby Dick",
"readingOrder" : [{
"@type" : "PublicationList",
"url" : "html/title.html",
"encodingFormat" : "text/html",
"name" : "Title page"
},{
"@type" : "PublicationList",
"url" : "html/copyright.html",
"encodingFormat" : "text/html",
"name" : "Copyright page"
},{
…
}]
}
The resource list enumerates all resources that are used in the processing and rendering of a Web Publication (i.e., that are within its bounds) and that are not listed in the default reading order .
The union of the resource list and default reading order represents the definitive list of resources that belong to the Web Publication. All other resources are external to the Web Publication.
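The union described above can be computed directly from the manifest. In this Python sketch the function names are hypothetical, and resolving relative URLs against the manifest URL is an assumption (an embedded manifest would presumably use the document's base URL instead).

```python
from typing import Set
from urllib.parse import urldefrag, urljoin


def publication_bounds(manifest: dict, manifest_url: str) -> Set[str]:
    """Absolute URLs of all resources in the reading order and the resource list."""
    bounds = set()
    for key in ("readingOrder", "resources"):
        for item in manifest.get(key, []):
            url = item if isinstance(item, str) else item.get("url")
            if url:
                bounds.add(urljoin(manifest_url, url))
    return bounds


def is_within_bounds(url: str, bounds: Set[str]) -> bool:
    """True when the URL, ignoring any fragment, belongs to the publication."""
    return urldefrag(url).url in bounds
```

Fragments are stripped before the comparison because the URLs listed in the manifest do not include fragment identifiers.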
The completeness of the resource list will affect the usability of the Web Publication in certain reading scenarios (e.g., the ability to read the Web Publication offline). For this reason, it is strongly RECOMMENDED to provide a comprehensive list of all of the Web Publication's constituent resources beyond those listed in the default reading order .
In some cases, a comprehensive list of these resources might not be easily achieved (e.g., third-party scripts that reference resources from deep within their source), but a user agent SHOULD still be able to render a Web Publication even if some of these resources are not identified as belonging to the Web Publication (e.g., when it is taken offline without them).
If a user agent encounters a resource that it cannot locate in the resource list, it MUST treat the resource as external to the Web Publication (e.g., it might alert the user before loading, open the resource in a new window, or unload the current Web Publication and resume normal Web browsing).
This was not decided at the Toronto F2F, and is still open.
If present in the Web Publication Manifest, this item MUST be mapped on the resources term, defined specifically for Web Publications.
| Term | Description | Required Value | [ schema.org ] Mapping |
|---|---|---|---|
resources
|
An array of:
The order in the array is not significant . The URLs MUST NOT include fragment identifiers. |
(None) |
{
"@context" : ["https://schema.org","https://www.w3.org/ns/wp-context"],
"@type" : "CreativeWork",
…
"@id" : "http://www.w3.org/TR/tabular-data-model/",
"url" : "http://www.w3.org/TR/2015/REC-tabular-data-model-20151217/",
…
"resources" : [
"datatypes.html",
"datatypes.svg",
"datatypes.png",
"diff.html",
{
"@type" : "PublicationLink",
"url" : "test-utf8.csv",
"encodingFormat" : "text/csv"
},{
"@type" : "PublicationLink",
"url" : "test-utf8-bom.csv",
"encodingFormat" : "text/csv"
},{
…
}
],
…
}
The links property enumerates resources that are significant to a Web Publication but are not essential to the processing and rendering of it (i.e., the content of the Web Publication remains readable even if the resources are not available). Examples of linked resources might include:
The completeness of the links list might affect the usability of the Web Publication in certain scenarios (e.g., a user agent might choose to omit these resources when taking a publication offline, or they could be omitted when a packaged version of the Web Publication is created). For this reason, it is strongly RECOMMENDED to provide a comprehensive list of all of the Web Publication's non-critical resources.
The links list SHOULD include resources necessary to render a linked resource (e.g., scripts, images, style sheets).
Resources listed in the links list MUST NOT be listed in the default reading order or resource list.
| Term | Description | Required Value | [ schema.org ] Mapping |
|---|---|---|---|
links
|
An array of:
The order in the array is not significant . |
(None) |
The infoset is designed to provide a basic set of properties for use by user agents in presenting and rendering a Web Publication , but MAY be extended in the following ways:
Although both methods are valid, the use of linked records to extend the infoset is RECOMMENDED .
This specification does not define how such additional properties are compiled, stored or exposed by user agents in their internal representation of the infoset . A user agent MAY ignore some or all extended properties.
Extending the manifest through links to a record, such as an ONIX [ onix ] or BibTeX [ bibtex ] file, MUST be expressed using a PublicationLink object, where:
- the rel value of the PublicationLink SHOULD include a relevant identifier defined by IANA or by other organizations; if the linked record contains descriptive metadata it MUST include the describedby (IANA) identifier;
- the encodingFormat in the link MUST use the MIME media type [ rfc2046 ] defined for that particular type of record, if applicable.
Linked records MUST be included in the resource list when they are part of the Web Publication (i.e., are needed for more than just infoset extensibility). Otherwise, they MUST be included in the links list .
{
"@context" : ["https://schema.org","https://www.w3.org/ns/wp-context"],
"@type" : "Book",
…
"url" : "https://publisher.example.org/mobydick",
"name" : "Moby Dick",
"links" : [{
"@type" : "PublicationLink",
"url" : "https://www.publisher.example.org/mobydick-onix.xml",
"encodingFormat" : "application/onix+xml",
"rel" : "describedby"
},{
…
}],
…
}
The application/onix+xml MIME type has not yet been registered by IANA at the time of writing this document, and is included in the example for illustrative purposes only.
Additional properties can be included directly in the manifest. It is RECOMMENDED that these properties be taken from public schemes like [ schema.org ] or [ dcterms ] and use values from controlled vocabularies whenever possible. Proprietary terms MAY be used, but it is RECOMMENDED that such terms be included using Compact IRIs [ json-ld ], with prefixes defined as part of the context.
{
"@context" : ["https://schema.org","https://www.w3.org/ns/wp-context"],
"@type" : "CreativeWork",
…
"@id" : "http://www.w3.org/TR/tabular-data-model/",
"url" : "http://www.w3.org/TR/2015/REC-tabular-data-model-20151217/",
"copyrightYear" : "2015",
"copyrightHolder" : "World Wide Web Consortium",
…
}
{
"@context" : ["https://schema.org","https://www.w3.org/ns/wp-context"],
"@type" : "CreativeWork",
…
"@id" : "http://www.w3.org/TR/tabular-data-model/",
"url" : "http://www.w3.org/TR/2015/REC-tabular-data-model-20151217/",
"dc:subject" : ["Document Structures","Resource Description Framework (RDF)"],
…
}
A prefix definition dc for [ dcterms ] is included in the context file of [ schema.org ]. This means that it is not necessary to add the prefix explicitly. The same is true for a number of other public vocabularies; see the schema.org context file for further details.
The Publishing Working Group is currently evaluating the best approach for implementing Web Publications in user agents. This note is intended to provide an overview of the current thinking and of the issues under consideration.
The development of Web Publications is not viewed as a separate forking of the Web, but an enhancement layer that can be supported by user agents. To that end, the primary constraints on any solution for Web Publications are that:
While this specification will provide implementation flexibility for user agents, there are still a number of areas that have been identified as potentially needing to be detailed. These include:
initialization expectations for a Web Publication:
the creation of a "publication state":
tracking the extent of a Web Publication:
establishing the bounds of a Web Publication:
The working group intends to flesh out the lifecycle in later revisions once it is clearer what models are viable and what solutions can be standardized. Input on the feasibility and challenges of these approaches is welcome at any time.
The steps for obtaining a manifest are given by the following algorithm. The algorithm, if successful, returns a processed manifest and the manifest URL ; otherwise, it terminates prematurely and returns nothing. In the case of nothing being returned, the user agent MUST ignore the manifest declaration.
- From the Document of the top-level browsing context, let origin be the Document's origin, and manifest link be the first link element in tree order whose rel attribute contains the token publication.
- If manifest link is null, terminate this algorithm.
- If manifest link's href attribute's value is the empty string, then abort these steps.
- Let manifest URL be the result of parsing the value of the href attribute, relative to the element's base URL. If parsing fails, then abort these steps.
- … Document.
- If the manifest link's crossOrigin attribute's value is 'use-credentials', then set request's credentials to 'include'.
See the diagram in the appendix for a visual representation of the algorithm.
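The link discovery part of the steps above could be sketched in Python as follows. This is not the normative algorithm: the names are hypothetical, the request is modeled as a plain dictionary, fetching is left out entirely, and the document URL is used as the base URL (a base element, if present, is ignored).

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urljoin


class _ManifestLinkFinder(HTMLParser):
    """Record the first link element whose rel attribute contains "publication"."""

    def __init__(self) -> None:
        super().__init__()
        self.link: Optional[dict] = None

    def handle_starttag(self, tag, attrs):
        if tag == "link" and self.link is None:
            attributes = dict(attrs)
            tokens = (attributes.get("rel") or "").lower().split()
            if "publication" in tokens:
                self.link = attributes


def manifest_request(html_text: str, document_url: str) -> Optional[dict]:
    """Describe the request for the manifest, or return None when the steps abort."""
    finder = _ManifestLinkFinder()
    finder.feed(html_text)
    link = finder.link
    if link is None:
        return None  # no manifest link: terminate
    href = link.get("href") or ""
    if href == "":
        return None  # empty href: abort
    request = {"url": urljoin(document_url, href)}
    if (link.get("crossorigin") or "").lower() == "use-credentials":
        request["credentials"] = "include"
    return request
```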
This section will require additional work if we also decide to allow JSON -LD embedded in HTML .
The steps for processing a manifest are given by the following algorithm. The algorithm takes a text string as an argument, which represents a manifest , and a manifest URL [ url ], which represents the location of the manifest, and a document URL [ url ]. The output from inputting a JSON document into this algorithm is a processed manifest .
- … "{}".
- … Object: … "{}".
- … WebPublicationManifest dictionary.
See the diagram in the appendix for a visual representation of the algorithm.
The new JSON -LD based approach will require additional processing from the client. Due to the flexible nature of JSON -LD and schema.org, a simple conversion from JSON to WebIDL won't be enough.
This section contains placeholders for possible reading enhancements/features the user agent may/should/must provide. The list is subject to addition, modification and removal as the enhancements get discussed in more detail.
Before starting a discussion on the individual affordances' issues , the WG should have a consensus on what exactly is to be defined for each of those.
When a user agent obtains a manifest it SHOULD provide the option to switch the display to publication mode .
This feature has the following requirements:
Publication mode is a display mode implemented by the user agent that follows the conventions listed in presentation and navigation .
The layout and rendering of Web Publications is governed by the same rules that apply to all Web content: HTML documents are styled and laid out according to the rules of CSS , SVG documents are rendered as defined by that format, etc. This specification requires no particular profile or subset of CSS , HTML , or SVG to be supported, other than the expectations set for these technologies by their respective specifications.
This specification intentionally avoids introducing any new layout features. Any shortcoming of the Web platform in terms of layout needs to be addressed for the whole Web platform, which means via CSS .
This working group will work with other relevant groups of the W3C to address platform-wide limitations that negatively impact Web Publications .
For the purposes of layout, each resource of a Web Publication is treated as a separate document. User agents MUST NOT mix content from multiple resources in the same rendering (e.g., CSS floats or absolutely positioned elements from one resource cannot intrude or overlap with content from another resource).
Despite this general requirement that each resource should be treated as a separate document for the purpose of layout, there are some places where CSS specifications should be amended to be able to deal more intelligently with collections of resources like Web Publications .
One instance is the definition of cross-references , which are currently restricted to work only within a single document. This restriction should be relaxed to allow for cross-references between separate resources of a single Web Publication.
Another related change would be to allow counters to accumulate across multiple resources of a single Web Publication (e.g., so that figures in multiple sections may be numbered in a single sequence).
When a user agent renders a Web Publication , it SHOULD provide user settings to customize the experience.
User settings MAY include:
This specification does not cover how user agents override author styles to offer user settings.
To provide user settings in their reader mode, browsers usually get rid of most of the author styles. There is always a tension in reading environments between author styles and the user's preference, which is very hard to balance.
2.1.11 Personalization: The user must have the possibility to personalize his or her reading experience.
Picking up on #52
This section is non-normative.
Publications have historically been presented via paged media, whereas Web pages almost always scroll. As the preferences of individual readers vary, and as different types of publications are better suited for one or the other, this specification encourages user agents to support both, and to offer a choice to their users.
It might be useful for authors to be able to specify a preference between scrolling and pagination, even if a strict requirement is not possible.
This should most likely be addressed through an extension of @viewport or of the viewport meta tag (see [ css-device-adapt ]), or possibly through an extension of @page (see [ css-page-3 ]). This should be discussed with the relevant working groups (CSSWG, WebPlatformWG, WHATWG).
2.1.10 Pagination: It should be possible to see the Web Publication in a “paginated” view.
Picking up on #52. See also https://w3c.github.io/wpub/#aff-presentation
When a user agent renders a Web Publication in a paginated layout, it MUST lay out each document in the default reading order sequentially, with the last page of a resource being followed by the first page of the subsequent one.
To avoid blank pages, if a resource ends on a left page (resp. right page), the subsequent one should start on a right page (resp. left page) even if the page progression (see [ css-page-3 ]) would otherwise lead to it starting on the opposite page. It should also be possible to use the break-before property (see [ css-break-3 ]) to force the content to resume on the opposite side if that was desired by the author.
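The side selection rule above can be expressed as a small Python sketch. The function names are hypothetical, and modeling the effect of break-before as an optional forced side is an assumption made for illustration.

```python
from typing import Optional


def starting_side(previous_end_side: str, forced_side: Optional[str] = None) -> str:
    """Side on which the next resource starts, given where the previous one ended."""
    if forced_side in ("left", "right"):
        # The author asked for a specific side (e.g., through break-before).
        return forced_side
    return "right" if previous_end_side == "left" else "left"


def blank_pages_needed(previous_end_side: str, start_side: str) -> int:
    """A blank page is only needed when the next resource starts on the same side."""
    return 1 if start_side == previous_end_side else 0
```

Without a forced side, the next resource always starts on the opposite page, so no blank page is inserted.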
[ css-page-3 ] needs to be amended to describe this exception to the general behavior when dealing with collections of documents instead of individual documents.
How is pagination supposed to work when subsequent resources have opposite page progression directions (see [ css-page-3 ]), for example, due to a different writing mode? This is not necessarily a problem from a layout point of view, as each page is independent, but it is from a UI point of view. If swiping left means next page until the end of one chapter, and starts meaning previous page in the next chapter because the language has switched from English to Hebrew, this is going to be confusing.
[ css-page-3 ] needs to be amended so that page counters are not automatically reset at the beginning of each new resource belonging to the same Web Publication.
A WP can be read in a browser offline with no change in fidelity from the online experience
Detail on inter-publication search across multiple resources will be included in a future draft.
User agents should provide an affordance that saves the reading position in the publication and returns the user to that location the next time they open the publication.
The user must be able to leave the Web Publication and return to it at the last position they left from. The User Agent must retain the reading position, based on the last known position of the reader in the web publication. The position should be based on the reader's position in the file, within the reading order.
The user agent may retain reading state if the web publication is revised.
The navigation of the web publication should be defined in the Default Reading Order required by the Information Set.
User Agents should not have to set the reading state in the following type of resources:
Reading state should only apply to content documents listed as being within the bounds of the Web Publication.
Example 1: Sarah is reading a long article on her way to work. She arrives before she has finished, but wants to continue from the place she left off. The user agent should remember her reading state for the next time she opens the publication.
If a tester opens a web publication in a WP-aware UA, moves ahead in the publication, closes the reader, then reopens it, they should be returned to the last known reading state.
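A user agent could retain the reading state along the following lines. This Python sketch is an illustration only: the class names, the in-memory storage, the publication identifier used as a key, and the fractional progress measure are all assumptions, since this specification does not define how the position is recorded.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ReadingPosition:
    resource_url: str  # a resource from the default reading order
    progress: float    # hypothetical measure: fraction of the resource read, 0.0 to 1.0


class ReadingStateStore:
    """In-memory store keyed by a publication identifier."""

    def __init__(self) -> None:
        self._positions: Dict[str, ReadingPosition] = {}

    def save(self, publication_id: str, position: ReadingPosition,
             reading_order: List[str]) -> bool:
        """Remember the position only if its resource is in the reading order."""
        if position.resource_url not in reading_order:
            return False
        self._positions[publication_id] = position
        return True

    def restore(self, publication_id: str) -> Optional[ReadingPosition]:
        """Return the last saved position, or None if nothing was saved."""
        return self._positions.get(publication_id)
```

Rejecting positions outside the reading order reflects the statement that reading state only applies to content within the bounds of the Web Publication.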
This section is non-normative.
The document referred to from this section, i.e., Web Annotation Extensions for Web Publications [ wpub-ann ], has recently been renamed. Its previous title was "Locators for Web Publication". The terminology used in this section has to be realigned with the name change.
Locators are used to identify, locate, retrieve, and/or reference locations and content fragments within Web Publications (e.g., for address(es), bookmarks, and annotations). Locators traditionally take the form of fragment identifiers [ rfc3986 ], where the portion of a URL preceded by a number sign character (#) identifies a specific position within the referenced resource.
For some use cases, it is essential to identify and reference a Web Publication resource—or a location in or a segment of a resource—in the scope or context of the Web Publication to which it belongs. A traditional fragment identifier cannot satisfy this requirement, since only the URL of the constituent resource containing the location or content fragment of interest is expressed. The Web Annotation Extensions for Web Publications [ wpub-ann ] document, based on the Web Annotation Model [ annotation-model ], addresses this issue by providing the means to express both the URL of the resource and the URL of the Web Publication.
Web Publication Locators also address the problem of referencing into a resource that was not authored with such a need in mind. A fragment identifier can only reference elements with explicit identifiers and locations with explicit anchor points. Web Publication Locators include a variety of selectors that work with the general structures and content of a resource (e.g., text selectors, CSS selectors).
As Web Publication Locators currently rely on a JSON-based expression syntax, it is not yet clear how much of this syntax can be translated to a fragment identifier. This may limit the usefulness beyond expressions that are also JSON-based (e.g., outside of annotations or bookmarks).
Illustrate with example of an easy to understand Web Publication Locator, such as might be used in annotating a simple Web Publication.
The semantics of Web Publication Locators are a mapping and extension of the Web Annotation Data Model [ annotation-model ] and Vocabulary [ annotation-vocab ] for describing and referencing a segment of a Web resource. As a result, Web Publication Locators provide the expressiveness needed for a broad range of annotation and bookmarking use cases. Additionally, Web Publication Locators provide a way to identify and reference a location within a Web Publication (i.e., as distinct from identifying and referencing a content fragment consisting of a span of characters or bytes). A Web Publication Locator can be used to identify, retrieve and/or reference a fragment of a Web Publication that spans multiple resources.
In composing a Web Publication Locator, use the canonical identifier of the Web Publication in preference to any alternative addresses. Such use facilitates the collation of Web Publication Locators associated with a particular Web Publication. URLs of Web Publication resources appearing in a Web Publication Locator should match the URL of the resource provided in the infoset.
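As an illustration only, the sketch below builds a locator-like structure in Python using terms from the Web Annotation Data Model [ annotation-model ] (`SpecificResource`, `source`, `selector`, `TextQuoteSelector`, `scope`). Using `scope` to carry the URL of the Web Publication is an assumption made for this sketch, not syntax confirmed by [ wpub-ann ]. The helper name `make_locator`, the URLs, and the quoted text are likewise hypothetical.

```python
import json

def make_locator(publication_url, resource_url, exact, prefix="", suffix=""):
    """Build a locator-like dict (hypothetical helper, not defined by any spec)."""
    # A text selector works on the content itself, so the resource needs no
    # explicit identifiers or anchor points.
    selector = {"type": "TextQuoteSelector", "exact": exact}
    if prefix:
        selector["prefix"] = prefix
    if suffix:
        selector["suffix"] = suffix
    return {
        "type": "SpecificResource",
        "source": resource_url,    # URL of the constituent resource
        "selector": selector,
        "scope": publication_url,  # assumption: URL of the Web Publication
    }

locator = make_locator(
    "https://publisher.example.org/mobydick",
    "https://publisher.example.org/mobydick/html/c001.html",
    exact="Call me Ishmael.",
)
print(json.dumps(locator, indent=2))
```

Unlike a plain fragment identifier, the resulting structure carries both the URL of the resource and the URL of the publication.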
This section is non-normative.
Although a Web Publication manifest is authored as [ json-ld ], user agents process this information into an internal data structure representing the infoset in order to utilize the properties. The exact manner in which this processing occurs, and how the data is used internally, is user agent-dependent. To ensure interoperability when exposing the infoset items, however, this appendix defines a common, abstract representation of the data structures using the standard formalism of the Web Interface Definition Language [ webidl-1 ], which can express the expected names, datatypes, and possible restrictions for each member of the infoset. (A WebIDL representation can be mapped onto ECMAScript, C, or other programming languages.)
WebPublicationManifest dictionary
dictionary WebPublicationManifest {
    required DOMString url;
    DOMString lang;
    TextDirection direction = "auto";
    TextDirection readingProgression = "auto";
    sequence<LocalizableString> name;
    DOMString id;
    sequence<Contributor> authors;
    DOMString dateModified;
    DOMString datePublished;
    sequence<PublicationLink> links;
    sequence<PublicationLink> readingOrder;
    sequence<PublicationLink> resources;
    sequence<PublicationLink> toc;
};
The WebPublicationManifest has the following members:
url
lang
direction
readingProgression
name
id
authors
dateModified
datePublished
links
readingOrder
resources
toc
authors member
The current infoset for creators is not fully defined; this dictionary might be further improved once there is agreement on how they should be handled.
dictionary Contributor {
    required LocalizableString name;
    DOMString id;
};
The authors member is a sequence of Contributor dictionaries where each dictionary has the following members:
name
id
LocalizableString dictionary
This definition includes a slightly tweaked version of the i18n recommendation that also includes a string value in addition to a language and a direction.
Some metadata in the infoset have strong requirements for internationalization. For those members, this specification relies on the best practices established by the i18n WG and on the LocalizableString dictionary.
dictionary LocalizableString {
    required DOMString value;
    DOMString lang;
    TextDirection dir = "auto";
};
When lang or dir is specified in LocalizableString, its value overrides the default language or base direction specified in WebPublicationManifest.
LocalizableString has the following members:
value
lang
dir
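The override behaviour described above can be sketched in Python as follows. This is a minimal illustration using plain dicts, not a normative algorithm. The helper name `resolve_localizable` is hypothetical, and the sketch assumes that a missing `lang` or `dir` inherits the manifest-level `lang` or `direction` rather than taking the WebIDL default of "auto".

```python
def resolve_localizable(string, manifest):
    """Return (value, lang, dir) for a LocalizableString-like dict.

    Assumption: an absent `lang` or `dir` falls back to the manifest-level
    `lang` and `direction`; a present one overrides the manifest default.
    """
    lang = string.get("lang", manifest.get("lang"))
    direction = string.get("dir", manifest.get("direction", "auto"))
    return string["value"], lang, direction

# Manifest-level defaults (values are illustrative).
manifest = {
    "url": "https://publisher.example.org/mobydick",
    "lang": "en",
    "direction": "ltr",
}

plain = {"value": "Moby Dick"}                                        # inherits defaults
hebrew = {"value": "פעילות הבינאום, W3C", "lang": "he", "dir": "rtl"}  # overrides both

print(resolve_localizable(plain, manifest))
print(resolve_localizable(hebrew, manifest))
```

Note that the manifest names its base direction `direction` while LocalizableString names it `dir`, so a user agent has to map between the two.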
PublicationLink dictionary
dictionary PublicationLink {
    required DOMString url;
    DOMString encodingFormat;
    DOMString name;
    sequence<DOMString> rel;
    sequence<PublicationLink> children;
};
The PublicationLink dictionary contains the following members:
url
encodingFormat
name
rel
children
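Manifest examples in this document list resources either as bare strings or as objects with a single `rel` string, whereas the dictionary above has a required `url` and a sequence-valued `rel`. The Python sketch below shows one way a user agent might bridge the two. The normalization rules (a string becomes `{"url": ...}`, a single `rel` becomes a one-item list) and the helper name `to_publication_link` are assumptions for illustration, not processing steps defined here.

```python
def to_publication_link(entry):
    """Normalize a manifest entry into a PublicationLink-like dict (a sketch)."""
    if isinstance(entry, str):
        # Assumption: a bare string is shorthand for a link with only a url.
        return {"url": entry}
    if "url" not in entry:
        raise ValueError("a PublicationLink requires a url member")
    link = dict(entry)  # copy so the input is left untouched
    if isinstance(link.get("rel"), str):
        link["rel"] = [link["rel"]]  # rel is a sequence in the dictionary
    if "children" in link:
        # children is itself a sequence of PublicationLink entries
        link["children"] = [to_publication_link(c) for c in link["children"]]
    return link

resources = [
    "css/mobydick.css",
    {"url": "html/toc.html", "rel": "contents"},
]
links = [to_publication_link(r) for r in resources]
print(links)
```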
TextDirection enum
enum TextDirection {
    "ltr",
    "rtl",
    "auto"
};
The TextDirection enum can contain the following values:
ltr
rtl
auto
This section is non-normative.
This is the simple version of a book manifest example. A somewhat more elaborate version for the same publication is also available as an example.
{
    "@context": ["https://schema.org", "https://www.w3.org/ns/wp-context"],
    "@type": "Book",
    "url": "https://publisher.example.org/mobydick",
    "author": {
        "@type": "Person",
        "name": "Herman Melville"
    },
    "dateModified": "2018-02-10T17:00:00Z",
    "readingOrder": [
        "html/title.html",
        "html/copyright.html",
        "html/introduction.html",
        "html/epigraph.html",
        "html/c001.html",
        "html/c002.html",
        "html/c003.html",
        "html/c004.html",
        "html/c005.html",
        "html/c006.html"
    ],
    "resources": [
        "css/mobydick.css",
        {
            "@type": "PublicationLink",
            "rel": "https://www.w3.org/ns/wp#cover-page",
            "url": "images/cover.jpg",
            "encodingFormat": "image/jpeg"
        },{
            "@type": "PublicationLink",
            "url": "html/toc.html",
            "rel": "contents"
        },{
            "@type": "PublicationLink",
            "url": "fonts/STIXGeneral.otf",
            "encodingFormat": "application/vnd.ms-opentype"
        },{
            "@type": "PublicationLink",
            "url": "fonts/STIXGeneralBol.otf",
            "encodingFormat": "application/vnd.ms-opentype"
        },{
            "@type": "PublicationLink",
            "url": "fonts/STIXGeneralBolIta.otf",
            "encodingFormat": "application/vnd.ms-opentype"
        },{
            "@type": "PublicationLink",
            "url": "fonts/STIXGeneralItalic.otf",
            "encodingFormat": "application/vnd.ms-opentype"
        }
    ]
}
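The reading order in this example uses relative URLs, which a user agent has to resolve before fetching. The Python sketch below shows the mechanics with the standard library. Which base URL applies is not stated in this section, so the base used here (with a trailing slash) is a hypothetical assumption. The second call shows how the result changes when the base lacks that slash.

```python
from urllib.parse import urljoin

# A few entries copied from the readingOrder of the example manifest.
reading_order = ["html/title.html", "html/copyright.html", "html/c001.html"]

# Hypothetical base URL (note the trailing slash); not given by the manifest.
base = "https://publisher.example.org/mobydick/"
resolved = [urljoin(base, item) for item in reading_order]

# Without the trailing slash, the last path segment ("mobydick") is replaced.
no_slash = urljoin("https://publisher.example.org/mobydick", "html/title.html")

for url in resolved:
    print(url)
print(no_slash)
```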
This is the simple version of an embedded manifest example. A more elaborate version for the same document is also available as an example.
<!DOCTYPE html>
<html lang="en-US">
<head>
    <title>Model for Tabular Data and Metadata on the Web</title>
    <link href="#wpm" rel="publication" />
    ...
    <script id="wpm" type="application/ld+json">
    {
        "@context" : ["https://schema.org", "https://www.w3.org/ns/wp-context"],
        "@type" : "CreativeWork",
        "@id" : "http://www.w3.org/TR/tabular-data-model/",
        "url" : "http://www.w3.org/TR/2015/REC-tabular-data-model-20151217/",
        "copyrightYear" : "2015",
        "copyrightHolder" : "World Wide Web Consortium",
        "creator" : [
            {
                "@type" : "Person",
                "name" : "Jeni Tennison"
            },
            {
                "@type" : "Person",
                "name" : "Gregg Kellogg"
            },
            {
                "@type" : "Person",
                "name" : "Ivan Herman"
            }
        ],
        "datePublished" : "2015-12-17",
        "resources" : [
            "datatypes.html",
            "datatypes.svg",
            "datatypes.png",
            "diff.html",
            "test-utf8.csv",
            "test-utf8-bom.csv",
            "test-utf16.csv",
            "test-utf16-bom.csv",
            "test.xls",
            "test.xlsx"
        ]
    }
    </script>
</head>
<body>
    ....
    <section id="toc" role="doc-toc">
        <h2 resource="#h-toc" id="h-toc" class="introductory">Table of Contents</h2>
        <ul class="toc">
            <li class="tocline"><a class="tocxref" href="#intro"><span class="secno">1. </span>Introduction</a></li>
            ...
        </ul>
    </section>
    ...
</body>
</html>
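As a rough sketch of how the embedded manifest above could be located, the following Python code follows the `link` element with `rel="publication"` to the `script` element whose `id` matches its fragment, then parses that element's content as JSON. The class name `ManifestExtractor` is hypothetical, the page is a shortened copy of the example, and the lookup rule is an assumption based only on the markup shown, not a processing algorithm defined here.

```python
import json
from html.parser import HTMLParser

class ManifestExtractor(HTMLParser):
    """Find the JSON-LD script referenced by <link rel="publication" href="#id">."""

    def __init__(self):
        super().__init__()
        self.target_id = None  # id taken from the link's fragment
        self.scripts = {}      # script id -> accumulated text content
        self._current = None   # id of the script being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and "publication" in (attrs.get("rel") or "").split():
            href = attrs.get("href") or ""
            if href.startswith("#"):
                self.target_id = href[1:]
        elif (tag == "script" and attrs.get("type") == "application/ld+json"
              and attrs.get("id")):
            self._current = attrs["id"]
            self.scripts[self._current] = ""

    def handle_endtag(self, tag):
        if tag == "script":
            self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self.scripts[self._current] += data

    def manifest(self):
        """Return the parsed manifest, or None if no matching script was found."""
        if self.target_id is None or self.target_id not in self.scripts:
            return None
        return json.loads(self.scripts[self.target_id])

page = """<!DOCTYPE html>
<html lang="en-US">
<head>
<title>Model for Tabular Data and Metadata on the Web</title>
<link href="#wpm" rel="publication" />
<script id="wpm" type="application/ld+json">
{
  "@context": ["https://schema.org", "https://www.w3.org/ns/wp-context"],
  "@type": "CreativeWork",
  "url": "http://www.w3.org/TR/2015/REC-tabular-data-model-20151217/",
  "resources": ["datatypes.html", "datatypes.svg"]
}
</script>
</head>
<body></body>
</html>"""

parser = ManifestExtractor()
parser.feed(page)
parser.close()
manifest = parser.manifest()
print(manifest["url"])
```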
{
    "@context": ["https://schema.org", "https://www.w3.org/ns/wp-context"],
    "@type": "Audiobook",
    "@id": "https://librivox.org/flatland-a-romance-of-many-dimensions-by-edwin-abbott-abbott/",
    "url": "https://w3c.github.io/wpub/experiments/audiobook/",
    "name": "Flatland: A Romance of Many Dimensions",
    "author": "Edwin Abbott Abbott",
    "readBy": "Ruth Golding",
    "publisher": "Librivox",
    "dateModified": "2018-06-14T19:32:18Z",
    "datePublished": "2008-10-12",
    "license": "https://creativecommons.org/publicdomain/zero/1.0/",
    "resources": [
        {"rel": "cover", "url": "http://ia800704.us.archive.org/9/items/LibrivoxCdCoverArt12/Flatland_1109.jpg", "encodingFormat": "image/jpeg"},
        {"rel": "contents", "url": "toc.html", "encodingFormat": "text/html"}
    ],
    "readingOrder": [
        {"url": "http://www.archive.org/download/flatland_rg_librivox/flatland_1_abbott.mp3", "encodingFormat": "audio/mpeg", "name": "Part 1, Sections 1 - 3"},
        {"url": "http://www.archive.org/download/flatland_rg_librivox/flatland_2_abbott.mp3", "encodingFormat": "audio/mpeg", "name": "Part 1, Sections 4 - 5"},
        {"url": "http://www.archive.org/download/flatland_rg_librivox/flatland_3_abbott.mp3", "encodingFormat": "audio/mpeg", "name": "Part 1, Sections 6 - 7"},
        {"url": "http://www.archive.org/download/flatland_rg_librivox/flatland_4_abbott.mp3", "encodingFormat": "audio/mpeg", "name": "Part 1, Sections 8 - 10"},
        {"url": "http://www.archive.org/download/flatland_rg_librivox/flatland_5_abbott.mp3", "encodingFormat": "audio/mpeg", "name": "Part 1, Sections 11 - 12"},
        {"url": "http://www.archive.org/download/flatland_rg_librivox/flatland_6_abbott.mp3", "encodingFormat": "audio/mpeg", "name": "Part 2, Sections 13 - 14"},
        {"url": "http://www.archive.org/download/flatland_rg_librivox/flatland_7_abbott.mp3", "encodingFormat": "audio/mpeg", "name": "Part 2, Sections 15 - 17"},
        {"url": "http://www.archive.org/download/flatland_rg_librivox/flatland_8_abbott.mp3", "encodingFormat": "audio/mpeg", "name": "Part 2, Sections 18 - 20"},
        {"url": "http://www.archive.org/download/flatland_rg_librivox/flatland_9_abbott.mp3", "encodingFormat": "audio/mpeg", "name": "Part 2, Sections 21 - 22"}
    ]
}
This section is non-normative.
(These examples were originally published in the Activity Streams Recommendation [ activitystreams-core ].)
| Character order in memory | Dir | Method | Expected display |
|---|---|---|---|
| פעילות הבינאום, W3C | rtl | First strong directional character | פעילות הבינאום, W3C |
| The document is titled, "⁧פעילות הבינאום, W3C⁩" | ltr | First strong directional character | The document is titled, " פעילות הבינאום, W3C " |
| ‏HTML היא שפת סימון | rtl | Bidi Control Character | HTML היא שפת סימון |
| ‎'سلام' is hello in Persian | ltr | Bidi Control Character | 'سلام' is hello in Persian |
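The "first strong directional character" method named in the table can be approximated in Python with the standard `unicodedata` module. The function name `first_strong_direction` is hypothetical. This is a simplified sketch: it skips text inside directional isolates and returns "auto" when no strong character is found, and that fallback is an assumption. Because the bidi control characters U+200F (RLM) and U+200E (LRM) are themselves strong characters, prefixing a string with one of them steers the same heuristic, as in the last two rows.

```python
import unicodedata

ISOLATE_INITIATORS = {"\u2066", "\u2067", "\u2068"}  # LRI, RLI, FSI
POP_ISOLATE = "\u2069"                               # PDI

def first_strong_direction(text):
    """Return "ltr" or "rtl" from the first strong character, else "auto"."""
    depth = 0  # nesting level of directional isolates
    for ch in text:
        if ch in ISOLATE_INITIATORS:
            depth += 1
        elif ch == POP_ISOLATE:
            depth = max(depth - 1, 0)
        elif depth == 0:
            bidi_class = unicodedata.bidirectional(ch)
            if bidi_class == "L":
                return "ltr"
            if bidi_class in ("R", "AL"):
                return "rtl"
    return "auto"

# Strings from the table, with invisible characters written as escapes.
rows = [
    "פעילות הבינאום, W3C",
    'The document is titled, "\u2067פעילות הבינאום, W3C\u2069"',
    "\u200fHTML היא שפת סימון",
    "\u200e'سلام' is hello in Persian",
]
for row in rows:
    print(first_strong_direction(row))
```

Without the leading control characters, the last two strings would be detected as "ltr" and "rtl" respectively, the opposite of the intended base direction.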
This section is non-normative.
These diagrams provide a visual view of the lifecycle steps, as specified in 5. Web Publication Lifecycle.
This section is non-normative.
This section is non-normative.
The following people contributed to the development of this specification:
The Working Group would also like to thank the members of the Digital Publishing Interest Group for all the hard work they did paving the road for this specification.