1. Introduction

Note: This section is not normative.

Web applications often need to work with strings of HTML on the client side, perhaps as part of a client-side templating solution, perhaps as part of rendering user-generated content, etc. It is difficult to do so in a safe way. The naive approach of joining strings together and stuffing them into an Element’s innerHTML is fraught with risk, as it can cause JavaScript execution in a number of unexpected ways.

Libraries like [DOMPURIFY] attempt to manage this problem by carefully parsing and sanitizing strings before insertion, by constructing a DOM and filtering its members through an allow-list. This has proven to be a fragile approach, as the parsing APIs exposed to the web don’t always map in reasonable ways to the browser’s behavior when actually rendering a string as HTML in the "real" DOM. Moreover, the libraries need to keep on top of browsers' changing behavior over time; things that once were safe may turn into time-bombs based on new platform-level features.

The browser has a fairly good idea of when it is going to execute code. We can improve upon the user-space libraries by teaching the browser how to render HTML from an arbitrary string in a safe manner, and do so in a way that is much more likely to be maintained and updated along with the browser’s own changing parser implementation. This document outlines an API which aims to do just that.

1.1. Goals

Mitigate the risk of DOM-based cross-site scripting attacks by providing developers with mechanisms for handling user-controlled HTML which prevent direct script execution upon injection.

Make HTML output safe for use within the current user agent, taking into account its current understanding of HTML.

Allow developers to override the default set of elements and attributes. Adding certain elements and attributes can prevent script gadget attacks.

1.2. API Summary

The Sanitizer API offers functionality to parse a string containing HTML into a DOM tree, and to filter the resulting tree according to a user-supplied configuration. The methods come in two by two flavours:

Safe and unsafe: The "safe" methods will not generate any markup that executes script. That is, they should be safe from XSS. The "unsafe" methods will parse and filter whatever they’re supposed to. See also: § 4 Security Considerations.

Context: Methods are defined on Element and ShadowRoot and will replace these Node’s children, and are largely analogous to innerHTML. There are also static methods on the Document, which parse an entire document and are largely analogous to DOMParser.parseFromString().

2. Framework

2.1. Sanitizer API

The Element interface defines two methods, setHTML() and setHTMLUnsafe(). Both of these take a DOMString with HTML markup, and an optional configuration.

{
[ = {});
};
{
[ = {});
};
Element’s setHTMLUnsafe(html, options) method steps are:

1. Let compliantHTML be the result of invoking the get trusted type compliant string algorithm with TrustedHTML, this’s relevant global object, html, "Element setHTMLUnsafe", and "script".
2. Let target be this’s template contents if this is a template element; otherwise this.
3. Set and filter HTML given target, this, compliantHTML, options, and false.

Element’s setHTML(html, options) method steps are:

1. Let target be this’s template contents if this is a template; otherwise this.
2. Set and filter HTML given target, this, html, options, and true.

{
[ = {});
};
{
[ = {});
};
These methods are mirrored on the ShadowRoot:

ShadowRoot’s setHTMLUnsafe(html, options) method steps are:

1. Let compliantHTML be the result of invoking the get trusted type compliant string algorithm with TrustedHTML, this’s relevant global object, html, "ShadowRoot setHTMLUnsafe", and "script".
2. Set and filter HTML using this, this’s shadow host (as context element), compliantHTML, options, and false.

ShadowRoot’s setHTML(html, options) method steps are:

1. Set and filter HTML using this (as target), this (as context element), html, options, and true.

The Document interface gains two new methods which parse an entire Document:

{
= {});
};
{
= {});
};
The parseHTMLUnsafe(html, options) method steps are:

1. Let compliantHTML be the result of invoking the get trusted type compliant string algorithm with TrustedHTML, the current global object, html, "Document parseHTMLUnsafe", and "script".
2. Let document be a new Document, whose content type is "text/html".
   Note: Since document does not have a browsing context, scripting is disabled.
3. Set document’s allow declarative shadow roots to true.
4. Parse HTML from a string given document and compliantHTML.
5. Let sanitizer be the result of calling get a sanitizer instance from options with options and false.
6. Call sanitize on document with sanitizer and false.
7. Return document.

The parseHTML(html, options) method steps are:

1. Let document be a new Document, whose content type is "text/html".
   Note: Since document does not have a browsing context, scripting is disabled.
2. Set document’s allow declarative shadow roots to true.
3. Parse HTML from a string given document and html.
4. Let sanitizer be the result of calling get a sanitizer instance from options with options and true.
5. Call sanitize on document with sanitizer and true.
6. Return document.

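As a usage illustration of the "safe" method described above, the following is a minimal sketch of inserting untrusted markup. The helper name `renderUntrusted` and the text fallback are assumptions made for this example and are not part of this specification; only the `setHTML(html, options)` call itself comes from the text above.

```javascript
// Hypothetical helper (not defined by this specification): insert untrusted
// markup with setHTML() when the target supports it, and otherwise fall back
// to treating the input as plain text.
function renderUntrusted(target, html, options = {}) {
  if (typeof target.setHTML === "function") {
    // "Safe" method: parses, filters, and replaces the target's children.
    target.setHTML(html, options);
    return "sanitized";
  }
  // Fallback: no markup is parsed at all, so nothing can execute.
  target.textContent = html;
  return "text";
}
```

In a page this might be called as `renderUntrusted(document.querySelector("#comment"), userInput)`. Falling back to `textContent` loses formatting, which is a deliberate choice in this sketch: it avoids parsing untrusted markup where no sanitizing method is available.
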
2.2. SetHTML options and the configuration object.

The family of setHTML()-like methods all accept an options dictionary. Right now, only one member of this dictionary is defined:

};
{
( = "default";
};
{
( = {};
};
The Sanitizer configuration object encapsulates a filter configuration. The same configuration can be used with both "safe" and "unsafe" methods, where the "safe" methods perform an implicit removeUnsafe operation on the passed-in configuration and have a default configuration when none is passed. The default differs between "safe" and "unsafe" methods: The "safe" methods are aiming to be safe by default and have a restrictive default, while the "unsafe" methods are unrestricted by default. The intent for configuration use is that one (or a few) configurations will be built up early on in a page’s lifetime, and can then be used whenever needed. This allows implementations to pre-process configurations.

The configuration object can be queried to return a configuration dictionary. It can also be modified directly.

]
{
= "default");
// Query configuration:
();
// Modify a Sanitizer's lists and fields:
);
);
);
);
);
);
);
);
);
// Remove markup that executes script.
();
};
A Sanitizer has an associated SanitizerConfig configuration.

The constructor(configuration) method steps are:

1. If configuration is a SanitizerPresets string, then:
   1. Assert: configuration is default.
   2. Set configuration to the built-in safe default configuration.
2. Let valid be the return value of set a configuration with configuration and true on this.
3. If valid is false, then throw a TypeError.

The get() method steps are:

Note: Outside of the get() method, the order of the Sanitizer’s elements and attributes is unobservable. By explicitly sorting the result of this method, we give implementations the opportunity to optimize by, for example, using unordered sets internally.

1. Let config be this’s configuration.
2. Assert: config is valid.
3. If config["elements"] exists:
   1. For any element of config["elements"]:
      1. If element["attributes"] exists:
         1. Set element["attributes"] to the result of sort in ascending order element["attributes"], with attrA being less than item attrB.
      2. If element["removeAttributes"] exists:
         1. Set element["removeAttributes"] to the result of sort in ascending order element["removeAttributes"], with attrA being less than item attrB.
   2. Set config["elements"] to the result of sort in ascending order config["elements"], with elementA being less than item elementB.
4. Otherwise:
   1. Set config["removeElements"] to the result of sort in ascending order config["removeElements"], with elementA being less than item elementB.
5. If config["replaceWithChildrenElements"] exists:
   1. Set config["replaceWithChildrenElements"] to the result of sort in ascending order config["replaceWithChildrenElements"], with elementA being less than item elementB.
6. If config["processingInstructions"] exists:
   1. Set config["processingInstructions"] to the result of sort in ascending order config["processingInstructions"], with piA["target"] being code unit less than piB["target"].
7. Otherwise:
   1. Set config["removeProcessingInstructions"] to the result of sort in ascending order config["removeProcessingInstructions"], with piA["target"] being code unit less than piB["target"].
8. If config["attributes"] exists:
   1. Set config["attributes"] to the result of sort in ascending order config["attributes"], with attrA being less than item attrB.
9. Otherwise:
   1. Set config["removeAttributes"] to the result of sort in ascending order config["removeAttributes"], with attrA being less than item attrB.
10. Return config.

The allowElement(element) method steps are:

Note: This algorithm is relatively involved, because the element allow list may specify per-element allow- or remove-lists for attributes. This requires that we distinguish 4 cases: Whether we have a global allow- or remove-list, and whether these lists already contain element or not.

1. Let configuration be this’s configuration.
2. Assert: configuration is valid.
3. Set element to the result of canonicalize a sanitizer element with attributes with element.
4. If configuration["elements"] exists:
   1. Set modified to the result of remove element from configuration["replaceWithChildrenElements"].
   2. Comment: We need to make sure the per-element attributes do not overlap with global attributes.
   3. If configuration["attributes"] exists:
      1. If element["attributes"] exists:
         1. Set element["attributes"] to remove duplicates from element["attributes"].
         2. Set element["attributes"] to the difference of element["attributes"] and configuration["attributes"].
         3. If configuration["dataAttributes"] is true:
            1. Remove all items item from element["attributes"] where item is a custom data attribute.
      2. If element["removeAttributes"] exists:
         1. Set element["removeAttributes"] to remove duplicates from element["removeAttributes"].
         2. Set element["removeAttributes"] to the intersection of element["removeAttributes"] and configuration["attributes"].
   4. Otherwise:
      1. If element["attributes"] exists:
         1. Set element["attributes"] to remove duplicates from element["attributes"].
         2. Set element["attributes"] to the difference of element["attributes"] and element["removeAttributes"] with default « ».
         3. Remove element["removeAttributes"].
         4. Set element["attributes"] to the difference of element["attributes"] and configuration["removeAttributes"].
      2. If element["removeAttributes"] exists:
         1. Set element["removeAttributes"] to remove duplicates from element["removeAttributes"].
         2. Set element["removeAttributes"] to the difference of element["removeAttributes"] and configuration["removeAttributes"].
   5. If configuration["elements"] does not contain element:
      1. Comment: This is the case with a global allow-list that does not yet contain element.
      2. Append element to configuration["elements"].
      3. Return true.
   6. Comment: This is the case with a global allow-list that already contains element.
   7. Let current element be the item in configuration["elements"] where item["name"] equals element["name"] and item["namespace"] equals element["namespace"].
   8. If element equals current element then return modified.
   9. Remove element from configuration["elements"].
   10. Append element to configuration["elements"].
   11. Return true.
5. Otherwise:
   1. If element["attributes"] exists or element["removeAttributes"] with default « » is not empty:
      1. The user agent may report a warning to the console that this operation is not supported.
      2. Return false.
   2. Set modified to the result of remove element from configuration["replaceWithChildrenElements"].
   3. If configuration["removeElements"] does not contain element:
      1. Comment: This is the case with a global remove-list that does not contain element.
      2. Return modified.
   4. Comment: This is the case with a global remove-list that contains element.
   5. Remove element from configuration["removeElements"].
   6. Return true.

The removeElement(element) method steps are to remove an element with element and this’s configuration.

The replaceElementWithChildren(element) method steps are:

1. Let configuration be this’s configuration.
2. Assert: configuration is valid.
3. Set element to the result of canonicalize a sanitizer element with element.
4. If the built-in non-replaceable elements list contains element:
   1. Return false.
5. If configuration["replaceWithChildrenElements"] contains element:
   1. Return false.
6. Remove element from configuration["removeElements"].
7. Remove element from configuration["elements"] list.
8. Add element to configuration["replaceWithChildrenElements"].
9. Return true.

The allowProcessingInstruction(pi) method steps are:

1. Let configuration be this’s configuration.
2. Assert: configuration is valid.
3. Set pi to the result of canonicalize a sanitizer processing instruction with pi.
4. If configuration["processingInstructions"] exists:
   1. If configuration["processingInstructions"] contains pi:
      1. Return false.
   2. Append pi to configuration["processingInstructions"].
   3. Return true.
5. Otherwise:
   1. If configuration["removeProcessingInstructions"] contains pi:
      1. Remove the item from configuration["removeProcessingInstructions"] whose "target" is pi["target"].
      2. Return true.
   2. Return false.

The removeProcessingInstruction(pi) method steps are:

1. Let configuration be this’s configuration.
2. Assert: configuration is valid.
3. Set pi to the result of canonicalize a sanitizer processing instruction with pi.
4. If configuration["processingInstructions"] exists:
   1. If configuration["processingInstructions"] contains pi:
      1. Remove the item from configuration["processingInstructions"] whose "target" is pi["target"].
      2. Return true.
   2. Return false.
5. Otherwise:
   1. If configuration["removeProcessingInstructions"] contains pi:
      1. Return false.
   2. Append pi to configuration["removeProcessingInstructions"].
   3. Return true.

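The two processing-instruction methods above mirror each other, which is easier to see in code. The following is a minimal sketch over plain configuration objects rather than a real Sanitizer. It assumes that a canonical processing instruction is an object of the form `{ target }` and that a string is shorthand for its target; the helpers `canonicalizePI` and `indexOfTarget` are hypothetical names introduced only for this example.

```javascript
// Assumption: the canonical form of a processing instruction is { target }.
function canonicalizePI(pi) {
  return typeof pi === "string" ? { target: pi } : { target: pi.target };
}

// Position of the entry with the same target, or -1 if there is none.
function indexOfTarget(list, pi) {
  return list.findIndex((item) => item.target === pi.target);
}

// Returns true if the configuration was modified, false otherwise.
function allowProcessingInstruction(config, pi) {
  pi = canonicalizePI(pi);
  if (config.processingInstructions !== undefined) {
    // Allow-list case: add the entry unless it is already present.
    if (indexOfTarget(config.processingInstructions, pi) !== -1) return false;
    config.processingInstructions.push(pi);
    return true;
  }
  // Remove-list case: allowing means dropping the entry from the remove-list.
  const index = indexOfTarget(config.removeProcessingInstructions, pi);
  if (index === -1) return false;
  config.removeProcessingInstructions.splice(index, 1);
  return true;
}

// Returns true if the configuration was modified, false otherwise.
function removeProcessingInstruction(config, pi) {
  pi = canonicalizePI(pi);
  if (config.processingInstructions !== undefined) {
    // Allow-list case: removing means dropping the entry from the allow-list.
    const index = indexOfTarget(config.processingInstructions, pi);
    if (index === -1) return false;
    config.processingInstructions.splice(index, 1);
    return true;
  }
  // Remove-list case: add the entry unless it is already present.
  if (indexOfTarget(config.removeProcessingInstructions, pi) !== -1) return false;
  config.removeProcessingInstructions.push(pi);
  return true;
}
```

In both functions the return value reports whether the configuration changed, so a second identical call returns false.
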
The allowAttribute(attribute) method steps are:

Note: This method distinguishes two cases, namely whether we have a global allow- or a global remove-list. If we add attribute to a global allow-list, we may need to do additional work to fix up per-element allow- or remove-lists to maintain our validity criteria.

1. Let configuration be this’s configuration.
2. Assert: configuration is valid.
3. Set attribute to the result of canonicalize a sanitizer attribute with attribute.
4. If configuration["attributes"] exists:
   1. Comment: If we have a global allow-list, we need to add attribute.
   2. If configuration["dataAttributes"] is true and attribute is a custom data attribute, then return false.
   3. If configuration["attributes"] contains attribute, then return false.
   4. Comment: Fix-up per-element allow and remove lists.
   5. If configuration["elements"] exists:
      1. For each element in configuration["elements"]:
         1. If element["attributes"] with default « » contains attribute:
            1. Remove attribute from element["attributes"].
         2. Assert: element["removeAttributes"] with default « » does not contain attribute.
   6. Append attribute to configuration["attributes"].
   7. Return true.
5. Otherwise:
   1. Comment: If we have a global remove-list, we need to remove attribute.
   2. If configuration["removeAttributes"] does not contain attribute:
      1. Return false.
   3. Remove attribute from configuration["removeAttributes"].
   4. Return true.

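The fix-up of per-element lists in the allowAttribute() steps above can be sketched as follows. This is an illustration over a plain configuration object, not the normative algorithm: it assumes `attribute` is already canonical, i.e. an object `{ name, namespace }`, and it approximates "custom data attribute" as a null namespace plus a name starting with "data-". The helper `sameName` is a hypothetical name used only here.

```javascript
// Two canonical names are equal when both name and namespace match.
function sameName(a, b) {
  return a.name === b.name && a.namespace === b.namespace;
}

// Assumption: `attribute` is already canonical ({ name, namespace }).
// Returns true if the configuration was modified, false otherwise.
function allowAttribute(config, attribute) {
  if (config.attributes !== undefined) {
    // Global allow-list: data-* attributes are already covered by dataAttributes.
    const isData = attribute.namespace === null && attribute.name.startsWith("data-");
    if (config.dataAttributes === true && isData) return false;
    if (config.attributes.some((a) => sameName(a, attribute))) return false;
    // Fix-up: a per-element allow entry would now duplicate the global one.
    for (const element of config.elements ?? []) {
      if (element.attributes !== undefined) {
        element.attributes = element.attributes.filter((a) => !sameName(a, attribute));
      }
    }
    config.attributes.push(attribute);
    return true;
  }
  // Global remove-list: allowing means dropping the entry from that list.
  const index = config.removeAttributes.findIndex((a) => sameName(a, attribute));
  if (index === -1) return false;
  config.removeAttributes.splice(index, 1);
  return true;
}
```

The fix-up loop is what keeps the "no duplicates between global and local allow-lists" invariant intact after the global list grows.
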
The removeAttribute(attribute) method steps are to remove an attribute with attribute and this’s configuration.

The setComments(allow) method steps are:

1. Let configuration be this’s configuration.
2. Assert: configuration is valid.
3. If configuration["comments"] exists and configuration["comments"] equals allow, then return false.
4. Set configuration["comments"] to allow.
5. Return true.

The setDataAttributes(allow) method steps are:

1. Let configuration be this’s configuration.
2. Assert: configuration is valid.
3. If configuration["attributes"] does not exist, then return false.
4. If configuration["dataAttributes"] equals allow, then return false.
5. If allow is true:
   1. Remove any items attr from configuration["attributes"] where attr is a custom data attribute.
   2. If configuration["elements"] exists:
      1. For each element in configuration["elements"]:
         1. If element["attributes"] exists:
            1. Remove any items attr from element["attributes"] where attr is a custom data attribute.
6. Set configuration["dataAttributes"] to allow.
7. Return true.

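To illustrate why setDataAttributes() prunes the allow-lists when it is switched on, here is a minimal sketch over a plain configuration object. It assumes canonical attributes of the form `{ name, namespace }` and approximates "custom data attribute" as a null namespace plus a name starting with "data-"; `isCustomDataAttribute` is a hypothetical helper, not something this specification defines.

```javascript
// Assumption: a custom data attribute has no namespace and a "data-" prefix.
function isCustomDataAttribute(attr) {
  return attr.namespace === null && attr.name.startsWith("data-");
}

// Returns true if the configuration was modified, false otherwise.
function setDataAttributes(config, allow) {
  // dataAttributes only makes sense alongside a global attribute allow-list.
  if (config.attributes === undefined) return false;
  if (config.dataAttributes === allow) return false;
  if (allow) {
    // Explicit data-* entries would become duplicates of the blanket setting.
    config.attributes = config.attributes.filter((a) => !isCustomDataAttribute(a));
    for (const element of config.elements ?? []) {
      if (element.attributes !== undefined) {
        element.attributes = element.attributes.filter((a) => !isCustomDataAttribute(a));
      }
    }
  }
  config.dataAttributes = allow;
  return true;
}
```
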
The removeUnsafe() method steps are to update this’s configuration with the result of calling remove unsafe on this’s configuration.

2.3. The Configuration Dictionary

{
;
= "http://www.w3.org/1999/xhtml";
};
// Used by "elements"
{
;
;
};
;
;
{
;
};
;
{
;
;
};
;
{
;
;
;
;
;
;
;
;
;
};
2.4. Configuration Invariants

Configurations can and ought to be modified by developers to suit their purposes. Options are to write a new configuration dictionary from scratch, to modify an existing Sanitizer’s configuration by using the modifier methods, or to get() an existing Sanitizer’s configuration as a dictionary and modify the dictionary and then create a new Sanitizer with it.

An empty configuration allows everything (when called with the "unsafe" methods like setHTMLUnsafe). A configuration "default" contains a built-in safe default configuration. Note that "safe" and "unsafe" sanitizer methods have different defaults.

Not all configuration dictionaries are valid. A valid configuration avoids redundancy (like specifying the same element to be allowed twice) and contradictions (like specifying an element to be both removed and allowed).

Several conditions need to hold for a configuration to be valid:

- Mixing global allow- and remove-lists:
  - elements or removeElements can exist, but not both. If both are missing, this is equivalent to removeElements set to « ».
  - attributes or removeAttributes can exist, but not both. If both are missing, this is equivalent to removeAttributes set to « ».
  - dataAttributes is conceptually an extension of the attributes allow-list. The dataAttributes attribute is only allowed when an attributes list is used.
- Duplicate entries between different global lists:
  - There are no duplicate entries (i.e., no same elements) between elements, removeElements, or replaceWithChildrenElements.
  - There are no duplicate entries (i.e., no same attributes) between attributes or removeAttributes.
- Mixing local allow- and remove-lists on the same element:
  - When an attributes list exists, both, either or none of the attributes and removeAttributes lists are allowed on the same element.
  - When a removeAttributes list exists, either or none of the attributes and removeAttributes lists are allowed on the same element, but not both.
- Duplicate entries on the same element:
  - There are no duplicate entries between attributes and removeAttributes on the same element.
- No element from the built-in non-replaceable elements list appears in replaceWithChildrenElements, since replacing these elements with their children could lead to re-parsing issues or invalid node trees.

The elements element allow-list can also specify allowing or removing attributes for a given element. This is meant to mirror [HTML]’s structure, which knows both global attributes as well as local attributes that apply to a specific element. Global and local attributes can be mixed, but note that ambiguous configurations where a particular attribute would be allowed by one list and forbidden by another, are generally invalid.

| | global attributes | global removeAttributes |
|---|---|---|
| local attributes | An attribute is allowed if it matches either list. No duplicates are allowed. | An attribute is only allowed if it’s in the local allow list. No duplicate entries between global remove and local allow lists are allowed. Note that the global remove list has no function for this particular element, but may well apply to other elements that do not have a local allow list. |
| local removeAttributes | An attribute is allowed if it’s in the global allow-list, but not in the local remove-list. Local remove must be a subset of the global allow lists. | An attribute is allowed if it is in neither list. No duplicate entries between global remove and local remove lists are allowed. |

Please note the asymmetry where mostly no duplicates between global and per-element lists are permitted, but in the case of a global allow-list and a per-element remove-list the latter must be a subset of the former. An excerpt of the table above, only focusing on duplicates, is as follows:

| | global attributes | global removeAttributes |
|---|---|---|
| local attributes | No duplicates are allowed. | No duplicates are allowed. |
| local removeAttributes | Local remove must be a subset of the global allow lists. | No duplicates are allowed. |

The dataAttributes setting allows custom data attributes. The rules above easily extend to custom data attributes if one considers dataAttributes to be an allow-list:

| | global attributes and dataAttributes set |
|---|---|
| local attributes | All custom data attributes are allowed. No custom data attributes may be listed in any allow-list, as that would mean a duplicate entry. |
| local removeAttributes | A custom data attribute is allowed, unless it’s listed in the local remove-list. No custom data attribute may be listed in the global allow-list, as that would mean a duplicate entry. |

Putting these rules in words:

- Duplicates and interactions between global and local lists:
  - If a global attributes allow list exists, then for each element’s local lists:
    - If a local attributes allow list exists, there may be no duplicate entries between these lists.
    - If a local removeAttributes remove list exists, then all its entries must also be listed in the global attributes allow list.
    - If dataAttributes is true, then no custom data attributes may be listed in any of the allow-lists.
  - If a global removeAttributes remove list exists, then:
    - If a local attributes allow list exists, there may be no duplicate entries between these lists.
    - If a local removeAttributes remove list exists, there may be no duplicate entries between these lists.
    - A local attributes allow list and a local removeAttributes remove list may not both exist.
    - dataAttributes must be absent.

To determine whether a canonical SanitizerConfig config is valid:

NOTE: It’s expected that the configuration being passed in has previously been run through the canonicalize the configuration steps. We will simply assert conditions that that algorithm should have guaranteed to hold.

1. Assert: config["elements"] exists or config["removeElements"] exists.
2. If config["elements"] exists and config["removeElements"] exists, then return false.
3. Assert: Either config["processingInstructions"] exists or config["removeProcessingInstructions"] exists.
4. If config["processingInstructions"] exists and config["removeProcessingInstructions"] exists, then return false.
5. Assert: Either config["attributes"] exists or config["removeAttributes"] exists.
6. If config["attributes"] exists and config["removeAttributes"] exists, then return false.
7. Assert: All SanitizerElementNamespaceWithAttributes, SanitizerElementNamespace, SanitizerProcessingInstruction, and SanitizerAttributeNamespace items in config are canonical, meaning they have been run through canonicalize a sanitizer element, canonicalize a sanitizer processing instruction, or canonicalize a sanitizer attribute, as appropriate.
8. If config["elements"] exists:
   1. If config["elements"] has duplicates, then return false.
9. Otherwise:
   1. If config["removeElements"] has duplicates, then return false.
10. If config["replaceWithChildrenElements"] exists and has duplicates, then return false.
11. If config["processingInstructions"] exists:
    1. If config["processingInstructions"] has duplicate targets, then return false.
12. Otherwise:
    1. If config["removeProcessingInstructions"] has duplicate targets, then return false.
13. If config["attributes"] exists:
    1. If config["attributes"] has duplicates, then return false.
14. Otherwise:
    1. If config["removeAttributes"] has duplicates, then return false.
15. If config["replaceWithChildrenElements"] exists:
    1. For each element of config["replaceWithChildrenElements"]:
       1. If the built-in non-replaceable elements list contains element, then return false.
    2. If config["elements"] exists:
       1. If the intersection of config["elements"] and config["replaceWithChildrenElements"] is not empty, then return false.
    3. Otherwise:
       1. If the intersection of config["removeElements"] and config["replaceWithChildrenElements"] is not empty, then return false.
16. If config["attributes"] exists:
    1. Assert: config["dataAttributes"] exists.
    2. If config["elements"] exists:
       1. For each element of config["elements"]:
          1. If element["attributes"] exists and element["attributes"] has duplicates, then return false.
          2. If element["removeAttributes"] exists and element["removeAttributes"] has duplicates, then return false.
          3. If the intersection of config["attributes"] and element["attributes"] with default « » is not empty, then return false.
          4. If element["removeAttributes"] with default « » is not a subset of config["attributes"], then return false.
          5. If config["dataAttributes"] is true and element["attributes"] contains a custom data attribute, then return false.
    3. If config["dataAttributes"] is true and config["attributes"] contains a custom data attribute, then return false.
17. Otherwise:
    1. If config["elements"] exists:
       1. For each element of config["elements"]:
          1. If element["attributes"] exists and element["removeAttributes"] exists, then return false.
          2. If element["attributes"] exists and element["attributes"] has duplicates, then return false.
          3. If element["removeAttributes"] exists and element["removeAttributes"] has duplicates, then return false.
          4. If the intersection of config["removeAttributes"] and element["attributes"] with default « » is not empty, then return false.
          5. If the intersection of config["removeAttributes"] and element["removeAttributes"] with default « » is not empty, then return false.
    2. If config["dataAttributes"] exists, then return false.
18. Return true.

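The attribute-related part of the validity check (the final "If config["attributes"] exists" / "Otherwise" branches above) is where the global and per-element lists interact. The following is a partial sketch of only those branches, written against plain objects. It is an illustration under stated assumptions, not the full algorithm: names are canonical `{ name, namespace }` objects, "custom data attribute" is approximated as a null namespace plus a "data-" prefix, and all helper names are hypothetical.

```javascript
// Helpers compare canonical names by namespace and name.
function key(item) { return `${item.namespace}|${item.name}`; }
function hasDuplicates(list) { return new Set(list.map(key)).size !== list.length; }
function intersects(a, b) {
  const keys = new Set(a.map(key));
  return b.some((item) => keys.has(key(item)));
}
function isSubset(a, b) {
  const keys = new Set(b.map(key));
  return a.every((item) => keys.has(key(item)));
}
function isCustomDataAttribute(attr) {
  return attr.namespace === null && attr.name.startsWith("data-");
}

// Partial check: only the attribute-list invariants, not the whole algorithm.
function attributeListsAreValid(config) {
  const elements = config.elements ?? [];
  if (config.attributes !== undefined) {
    for (const element of elements) {
      const allow = element.attributes ?? [];
      const remove = element.removeAttributes ?? [];
      if (hasDuplicates(allow) || hasDuplicates(remove)) return false;
      // No duplicates between the global and a local allow-list.
      if (intersects(config.attributes, allow)) return false;
      // A local remove-list must be a subset of the global allow-list.
      if (!isSubset(remove, config.attributes)) return false;
      if (config.dataAttributes === true && allow.some(isCustomDataAttribute)) return false;
    }
    if (config.dataAttributes === true && config.attributes.some(isCustomDataAttribute)) return false;
    return true;
  }
  const globalRemove = config.removeAttributes ?? [];
  for (const element of elements) {
    // With a global remove-list, an element may have one local list, not both.
    if (element.attributes !== undefined && element.removeAttributes !== undefined) return false;
    const allow = element.attributes ?? [];
    const remove = element.removeAttributes ?? [];
    if (hasDuplicates(allow) || hasDuplicates(remove)) return false;
    if (intersects(globalRemove, allow)) return false;
    if (intersects(globalRemove, remove)) return false;
  }
  // dataAttributes is only meaningful next to a global allow-list.
  return config.dataAttributes === undefined;
}
```

The subset test in the allow-list branch is the asymmetry noted earlier in this section: every other combination forbids overlap, while a local remove-list is required to overlap with the global allow-list.
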
Note: Setting a configuration from a dictionary will do a bit of normalization. In particular, if both allow- and remove-lists are missing, it will interpret this as an empty remove-list. So {} itself is not a valid configuration, but it will be normalized to {removeElements:[],removeAttributes:[]}, which is. This normalization step was chosen in order to have a missing dictionary be consistent with an empty one, i.e., to have setHTMLUnsafe(txt) be consistent with setHTMLUnsafe(txt, {sanitizer: {}}).

3. Algorithms

To set and filter HTML, given an Element or DocumentFragment target, an Element contextElement, a string html, a dictionary options, and a boolean safe:

1. If safe and contextElement’s local name is "script" and contextElement’s namespace is the HTML namespace or the SVG namespace, then return.
2. Let sanitizer be the result of calling get a sanitizer instance from options with options and safe.
3. Let newChildren be the result of the HTML fragment parsing algorithm given contextElement, html, and true.
4. Let fragment be a new DocumentFragment whose node document is contextElement’s node document.
5. For each node in newChildren, append node to fragment.
6. Run sanitize on fragment using sanitizer and safe.
7. Replace all with fragment within target.

To get a sanitizer instance from options, given a dictionary options and a boolean safe:

Note: This algorithm works for both SetHTMLOptions and SetHTMLUnsafeOptions. They only differ in the defaults.

1. Let sanitizerSpec be "default".
2. If options["sanitizer"] exists, then:
   1. Set sanitizerSpec to options["sanitizer"].
3. Assert: sanitizerSpec is either a Sanitizer instance, a string which is a SanitizerPresets member, or a dictionary.
4. If sanitizerSpec is a string:
   1. Assert: sanitizerSpec is "default".
   2. Set sanitizerSpec to the built-in safe default configuration.
5. Assert: sanitizerSpec is either a Sanitizer instance, or a dictionary.
6. If sanitizerSpec is a dictionary:
   1. Let sanitizer be a new Sanitizer instance.
   2. Let setConfigurationResult be the result of set a configuration with sanitizerSpec and not safe on sanitizer.
   3. If setConfigurationResult is false, throw a TypeError.
   4. Set sanitizerSpec to sanitizer.
7. Assert: sanitizerSpec is a Sanitizer instance.
8. Return sanitizerSpec.

3.1. Sanitize

For the main sanitize operation, using a ParentNode node, a Sanitizer sanitizer, and a boolean safe, run these steps:

1. Let configuration be the value of sanitizer’s configuration.
2. Assert: configuration is valid.
3. If safe is true, then set configuration to the result of calling remove unsafe on configuration.
4. Call sanitize core on node, configuration, and with handleJavascriptNavigationUrls set to safe.

The
sanitize
core
operation,
using
a
ParentNode
node
,
a
SanitizerConfig
configuration
,
and
a
boolean
handleJavascriptNavigationUrls
,
recurses
over
the
DOM
tree
beginning
with
node
.
It
consistes
of
these
steps:
For
each
child
of
node
’s
children
:
Assert
:
child
implements
Text
,
Comment
,
Element
,
ProcessingInstruction
or
DocumentType
.
Note:
Currently,
this
algorithm
is
only
called
on
output
of
the
HTML
parser
for
which
this
assertion
should
hold.
DocumentType
should
only
occur
for
parseHTML
and
parseHTMLUnsafe
.
If
in
the
future
this
algorithm
will
be
used
in
different
contexts,
this
assumption
needs
to
be
re-examined.
If
child
implements
DocumentType
,
then
continue
.
If
child
implements
Text
,
then
continue
.
If
child
implements
Comment
:
If
configuration
["
comments
"]
is
not
true,
then
remove
child
.
If
child
implements
ProcessingInstruction
:
Let
piTarget
be
child
’s
target
.
If
configuration
["
processingInstructions
"]
exists
:
If
configuration
["
processingInstructions
"]
does
not
contain
piTarget
:
Remove
child
.
Otherwise:
If
configuration
["
removeProcessingInstructions
"]
contains
piTarget
:
Remove
child
.
Otherwise:
Let
elementName
be
a
SanitizerElementNamespace
with
child
’s
local
name
and
namespace
.
If
configuration
["
replaceWithChildrenElements
"]
exists
and
if
configuration
["
replaceWithChildrenElements
"]
contains
elementName
:
Assert
:
node
does
not
implement
Document
.
Call
sanitize
core
on
child
with
configuration
and
handleJavascriptNavigationUrls
.
Let
fragment
be
a
new
DocumentFragment
whose
node
document
is
node
’s
node
document
.
For
each
innerChild
of
child
’s
children
,
append
innerChild
to
fragment
.
Replace
child
with
fragment
within
node
.
NOTE:
Replace
shouldn’t
throw
here,
since
the
structural
preconditions
for
successful
execution
of
the
algorithm
should
be
met.
Continue
.
If
configuration
["
elements
"]
exists
:
If
configuration
["
elements
"]
does
not
contain
elementName
:
Remove
child
.
Continue
.
Otherwise:
If
configuration
["
removeElements
"]
contains
elementName
:
Remove
child
.
Continue
.
If
elementName
equals
«[
"
name
"
→
"
template
",
"
namespace
"
→
HTML
namespace
]»,
then
call
sanitize
core
on
child
’s
template
contents
with
configuration
and
handleJavascriptNavigationUrls
.
If
child
is
a
shadow
host
,
then
call
sanitize
core
on
child
’s
shadow
root
with
configuration
and
handleJavascriptNavigationUrls
.
For
each
attribute
in
child
’s
attribute
list
:
Let
attrName
be
a
SanitizerAttributeNamespace
with
attribute
’s
local
name
and
namespace
.
If
is
attribute
allowed
for
attrName
given
configuration
,
and
elementName
is
blocked
,
then
remove
attribute
.
If
handleJavascriptNavigationUrls
:
If
«[
elementName
,
attrName
]»
matches
an
entry
in
the
built-in
navigating
URL
attributes
list
,
and
if
attribute
contains
a
javascript:
URL
,
then
remove
attribute
.
If
child
’s
namespace
is
the
MathML
Namespace
and
attribute
’s
local
name
is
"
href
"
and
attribute
’s
namespace
is
null
or
the
XLink
namespace
and
attribute
contains
a
javascript:
URL
,
then
remove
attribute
.
If
the
built-in
animating
URL
attributes
list
contains
«[
elementName
,
attrName
]»
and
attribute
’s
value
is
"
href
"
or
"
xlink:href
",
then
remove
attribute
.
Call
sanitize
core
on
child
with
configuration
and
handleJavascriptNavigationUrls
.
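The per-element decision in the steps above (replace with children, remove, or keep) can be sketched in Python. This is a minimal illustration, not the specified algorithm: the helper names `contains` and `element_disposition`, the string return values, and the use of plain dicts for canonical names are all assumptions made for this example.

```python
HTML_NS = "http://www.w3.org/1999/xhtml"


def contains(name_list, item):
    """Sanitizer name list membership: match on both name and namespace."""
    return any(
        entry["name"] == item["name"] and entry["namespace"] == item["namespace"]
        for entry in name_list
    )


def element_disposition(configuration, element_name):
    """Return 'unwrap', 'remove', or 'keep' for a canonical element name.

    Hypothetical helper; the labels are illustrative only.
    """
    # replaceWithChildrenElements is consulted first, and only if it exists.
    if contains(configuration.get("replaceWithChildrenElements", []), element_name):
        return "unwrap"
    if "elements" in configuration:
        # Allow-list mode: anything not listed is removed.
        if not contains(configuration["elements"], element_name):
            return "remove"
    elif contains(configuration.get("removeElements", []), element_name):
        # Remove-list mode: only listed elements are removed.
        return "remove"
    return "keep"
```

The sketch leaves out the recursion into template contents, shadow roots, and children, as well as all attribute handling.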
To
determine
is
attribute
allowed
for
a
SanitizerAttributeNamespace
attrName
,
given
a
SanitizerConfig
configuration
,
and
a
SanitizerElementNamespace
elementName
:
Let
elementWithLocalAttributes
be
an
empty
ordered
map
.
If
configuration
["
elements
"]
exists
and
configuration
["
elements
"]
contains
elementName
:
Set
elementWithLocalAttributes
to
the
item
in
configuration
["
elements
"]
where
elementName
["
name
"]
is
item
["
name
"]
and
elementName
["
namespace
"]
is
item
["
namespace
"].
If
elementWithLocalAttributes
["
removeAttributes
"]
with
default
«
»
contains
attrName
:
Return
blocked
.
If
configuration
["
attributes
"]
exists
:
Let
the
boolean
globallyAllowed
be
whether
configuration
["
attributes
"]
contains
attrName
.
Let
the
boolean
locallyAllowed
be
whether
elementWithLocalAttributes
["
attributes
"]
with
default
«
»
contains
attrName
.
Let
the
boolean
isDataAttributeAllowed
be
whether
both,
"data-"
is
a
code
unit
prefix
of
attrName
[
name
]
and
attrName
[
namespace
]
is
null
,
and
configuration
["
dataAttributes
"]
is
true.
If
neither
globallyAllowed
nor
locallyAllowed
nor
isDataAttributeAllowed
,
return
blocked
.
Otherwise:
If
elementWithLocalAttributes
["
attributes
"]
exists
and
elementWithLocalAttributes
["
attributes
"]
does
not
contain
attrName
:
Return
blocked
.
If
configuration
["
removeAttributes
"]
contains
attrName
:
Return
blocked
.
Return
allowed
.
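As an illustration of the steps above, the following Python sketch models the attribute check over plain dicts. It assumes the configuration is already canonicalized; the function names and the `"allowed"`/`"blocked"` string results are choices made for this example rather than part of the API.

```python
HTML_NS = "http://www.w3.org/1999/xhtml"


def contains(name_list, item):
    """Sanitizer name list membership: match on both name and namespace."""
    return any(
        entry["name"] == item["name"] and entry["namespace"] == item["namespace"]
        for entry in name_list
    )


def is_attribute_allowed(attr_name, configuration, element_name):
    # Look up the per-element entry, if the element allow-list has one.
    element_entry = {}
    for item in configuration.get("elements", []):
        if (item["name"] == element_name["name"]
                and item["namespace"] == element_name["namespace"]):
            element_entry = item
            break

    # A per-element remove-list always wins.
    if contains(element_entry.get("removeAttributes", []), attr_name):
        return "blocked"

    if "attributes" in configuration:
        # Global allow-list mode.
        globally_allowed = contains(configuration["attributes"], attr_name)
        locally_allowed = contains(element_entry.get("attributes", []), attr_name)
        is_data_attribute_allowed = (
            attr_name["name"].startswith("data-")
            and attr_name["namespace"] is None
            and configuration.get("dataAttributes") is True
        )
        if not (globally_allowed or locally_allowed or is_data_attribute_allowed):
            return "blocked"
    else:
        # Global remove-list mode.
        if ("attributes" in element_entry
                and not contains(element_entry["attributes"], attr_name)):
            return "blocked"
        if contains(configuration.get("removeAttributes", []), attr_name):
            return "blocked"
    return "allowed"
```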
Note:
Current
browsers
support
javascript:
URLs
only
when
navigating.
Since
navigation
itself
is
not
an
XSS
threat
we
handle
navigation
to
javascript:
URLs,
but
not
navigations
in
general.
Declarative
navigation
falls
into
a
handful
of
categories:
Anchor
elements.
(
<a>
in
HTML
and
SVG
namespaces)
Form
elements
that
trigger
navigation
as
part
of
the
form
action.
[MathML]
allows
any
element
to
act
as
an
anchor
.
[SVG11]
animation.
The
first
two
are
covered
by
the
built-in
navigating
URL
attributes
list
.
The
MathML
case
is
covered
by
a
separate
rule,
because
there
is
no
formalism
in
this
spec
to
cover
a
"per-namespace
global"
rule.
The
SVG
animation
case
is
covered
by
the
built-in
animating
URL
attributes
list
.
But
since
the
interpretation
of
SVG
animation
elements
depends
on
the
animation
target,
and
since
during
sanitization
we
cannot
know
what
the
final
target
will
be,
the
sanitize
algorithm
blocks
any
animation
of
href
attributes.
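The javascript: URL check that the sanitize steps rely on can be approximated in Python as follows. This is only a rough sketch: it does not implement the basic URL parser, it merely mimics the parts that matter for scheme detection (stripping leading and trailing C0 control characters and spaces, removing ASCII tabs and newlines, and comparing the scheme case-insensitively). The function name `contains_javascript_url` is an assumption for this example.

```python
import re

# C0 control characters and space (U+0000 to U+0020), stripped from both ends.
_C0_CONTROL_OR_SPACE = "".join(chr(c) for c in range(0x21))
# ASCII tab and newlines, removed from anywhere in the input.
_TAB_OR_NEWLINE = {0x09: None, 0x0A: None, 0x0D: None}
# A scheme is an ASCII letter followed by letters, digits, "+", "-", or ".".
_SCHEME = re.compile(r"([A-Za-z][A-Za-z0-9+\-.]*):")


def contains_javascript_url(value: str) -> bool:
    """Approximate check: does the attribute value parse to a javascript: URL?"""
    cleaned = value.strip(_C0_CONTROL_OR_SPACE).translate(_TAB_OR_NEWLINE)
    match = _SCHEME.match(cleaned)
    if match is None:
        # No scheme and no base URL: treated like a parse failure.
        return False
    return match.group(1).lower() == "javascript"
```

A real implementation should call the user agent's URL parser rather than a regular expression, since the two can disagree on edge cases.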
To
determine
whether
an
attribute
contains
a
javascript:
URL
:
Let
url
be
the
result
of
running
the
basic
URL
parser
on
attribute
’s
value
.
If
url
is
failure
,
then
return
false.
Return
whether
url
’s
scheme
is
"
javascript
".
3.2.
Modify
the
Configuration
The
configuration
modifier
methods
are
methods
on
Sanitizer
that
modify
its
configuration.
They
will
maintain
the
validity
criteria.
They
return
a
boolean
which
informs
the
caller
whether
the
configuration
was
modified
or
not.
To
remove
an
element
SanitizerElement
element
from
a
SanitizerConfig
configuration
:
Note:
This
method
requires
that
we
distinguish
4
cases:
Whether
we
have
a
global
allow-
or
remove-list,
whether
they
already
contain
element
or
not.
Assert
:
configuration
is
valid
.
Set
element
to
the
result
of
canonicalize
a
sanitizer
element
with
element
.
Set
modified
to
the
result
of
remove
element
from
configuration
["
replaceWithChildrenElements
"].
If
configuration
["
elements
"]
exists
:
If
configuration
["
elements
"]
contains
element
:
Comment
:
We
have
a
global
allow
list
and
it
contains
element
.
Remove
element
from
configuration
["
elements
"].
Return
true.
Comment
:
We
have
a
global
allow
list
and
it
does
not
contain
element
.
Return
modified
.
Otherwise:
If
configuration
["
removeElements
"]
contains
element
:
Comment
:
We
have
a
global
remove
list
and
it
already
contains
element
.
Return
modified
.
Comment
:
We
have
a
global
remove
list
and
it
does
not
contain
element
.
Add
element
to
configuration
["
removeElements
"].
Return
true.
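The four cases distinguished above can be sketched in Python. The sketch assumes `configuration` is a valid, canonicalized dict and `element` is already a canonical name; `contains`, `remove_item`, and `remove_element` are names invented for this illustration.

```python
HTML_NS = "http://www.w3.org/1999/xhtml"


def _same(a, b):
    return a["name"] == b["name"] and a["namespace"] == b["namespace"]


def contains(name_list, item):
    return any(_same(entry, item) for entry in name_list)


def remove_item(name_list, item):
    """Remove all matching entries in place; report whether any was removed."""
    before = len(name_list)
    name_list[:] = [entry for entry in name_list if not _same(entry, item)]
    return len(name_list) != before


def remove_element(configuration, element):
    modified = remove_item(
        configuration.get("replaceWithChildrenElements", []), element
    )
    if "elements" in configuration:
        if contains(configuration["elements"], element):
            # Global allow-list that contains the element.
            remove_item(configuration["elements"], element)
            return True
        # Global allow-list that does not contain the element.
        return modified
    if contains(configuration["removeElements"], element):
        # Global remove-list that already contains the element.
        return modified
    # Global remove-list that does not contain the element yet.
    configuration["removeElements"].append(
        {"name": element["name"], "namespace": element["namespace"]}
    )
    return True
```

Calling the function a second time with the same element returns false, which matches the intent that the return value reports whether the configuration changed.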
To
remove
an
attribute
SanitizerAttribute
attribute
from
a
SanitizerConfig
configuration
:
Note:
This
method
distinguishes
two
cases,
namely
whether
we
have
a
global
allow-
or
a
global
remove-list.
If
we
add
attribute
to
the
global
remove-list,
we
may
need
to
do
additional
work
to
fix
up
per-element
allow-
or
remove-lists
to
maintain
our
validity
criteria.
If
we
remove
attribute
from
a
global
allow-list,
we
may
also
have
to
remove
it
from
local
remove-lists.
Assert
:
configuration
is
valid
.
Set
attribute
to
the
result
of
canonicalize
a
sanitizer
attribute
with
attribute
.
If
configuration
["
attributes
"]
exists
:
Comment
:
If
we
have
a
global
allow-list,
we
need
to
remove
attribute
.
Set
modified
to
the
result
of
remove
attribute
from
configuration
["
attributes
"].
Comment
:
Fix-up
per-element
allow
and
remove
lists.
If
configuration
["
elements
"]
exists
:
For
each
element
of
configuration
["
elements
"]:
If
element
["
attributes
"]
with
default
«
»
contains
attribute
:
Set
modified
to
true.
Remove
attribute
from
element
["
attributes
"].
If
element
["
removeAttributes
"]
with
default
«
»
contains
attribute
:
Assert
:
modified
is
true.
Remove
attribute
from
element
["
removeAttributes
"].
Return
modified
.
Otherwise:
Comment
:
If
we
have
a
global
remove-list,
we
need
to
add
attribute
.
If
configuration
["
removeAttributes
"]
contains
attribute
,
then
return
false.
Comment
:
Fix-up
per-element
allow
and
remove
lists.
If
configuration
["
elements
"]
exists
:
For
each
element
in
configuration
["
elements
"]:
If
element
["
attributes
"]
with
default
«
»
contains
attribute
:
Remove
attribute
from
element
["
attributes
"].
If
element
["
removeAttributes
"]
with
default
«
»
contains
attribute
:
Remove
attribute
from
element
["
removeAttributes
"].
Append
attribute
to
configuration
["
removeAttributes
"]
Return
true.
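A Python sketch of the attribute removal steps above, including the per-element fix-ups, might look like this. It assumes a valid, canonicalized configuration held in plain dicts; all function names are illustrative.

```python
HTML_NS = "http://www.w3.org/1999/xhtml"


def _same(a, b):
    return a["name"] == b["name"] and a["namespace"] == b["namespace"]


def contains(name_list, item):
    return any(_same(entry, item) for entry in name_list)


def remove_item(name_list, item):
    """Remove all matching entries in place; report whether any was removed."""
    before = len(name_list)
    name_list[:] = [entry for entry in name_list if not _same(entry, item)]
    return len(name_list) != before


def remove_attribute(configuration, attribute):
    if "attributes" in configuration:
        # Global allow-list: drop the attribute from it.
        modified = remove_item(configuration["attributes"], attribute)
        # Fix up per-element allow- and remove-lists.
        for element in configuration.get("elements", []):
            if contains(element.get("attributes", []), attribute):
                modified = True
                remove_item(element["attributes"], attribute)
            if contains(element.get("removeAttributes", []), attribute):
                # In a valid configuration this implies a global removal happened.
                assert modified
                remove_item(element["removeAttributes"], attribute)
        return modified

    # Global remove-list: add the attribute unless it is already there.
    if contains(configuration["removeAttributes"], attribute):
        return False
    for element in configuration.get("elements", []):
        remove_item(element.get("attributes", []), attribute)
        remove_item(element.get("removeAttributes", []), attribute)
    configuration["removeAttributes"].append(attribute)
    return True
```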
To
remove
unsafe
from
a
SanitizerConfig
configuration
,
do
this:
Note:
While
this
algorithm
is
called
remove
unsafe
,
we
use
the
term
"unsafe"
strictly
in
the
sense
of
this
spec
,
to
denote
content
that
will
execute
JavaScript
when
inserted
into
the
document.
In
other
words,
this
method
will
remove
opportunities
for
XSS.
Assert
:
The
key
set
of
built-in
safe
baseline
configuration
equals
«
[
"
removeElements
",
"
removeAttributes
"
]
».
Assert
:
configuration
is
valid
.
Let
result
be
false.
For
each
element
in
built-in
safe
baseline
configuration
["
removeElements
"]:
Call
remove
an
element
element
from
configuration
.
If
the
call
returned
true,
set
result
to
true.
For
each
attribute
in
built-in
safe
baseline
configuration
["
removeAttributes
"]:
Call
remove
an
attribute
attribute
from
configuration
.
If
the
call
returned
true,
set
result
to
true.
For
each
attribute
listed
in
event
handler
content
attributes
:
Call
remove
an
attribute
attribute
from
configuration
.
If
the
call
returned
true,
set
result
to
true.
Return
result
.
3.3.
Set
the
Configuration
To
set
a
configuration
,
given
a
dictionary
configuration
,
a
boolean
allowCommentsPIsAndDataAttributes
,
and
a
Sanitizer
sanitizer
:
Canonicalize
configuration
with
allowCommentsPIsAndDataAttributes
.
If
configuration
is
not
valid
,
then
return
false.
Set
sanitizer
’s
configuration
to
configuration
.
Return
true.
3.4.
Canonicalize
the
Configuration
The
Sanitizer
stores
the
configuration
in
a
canonical
form,
as
this
makes
a
number
of
processing
steps
easier.
An
elements
list
{elements:
["div"]}
gets
stored
as
{elements:
[{name:
"div",
namespace:
"http://www.w3.org/1999/xhtml"}]
.
To
canonicalize
the
configuration
SanitizerConfig
configuration
with
a
boolean
allowCommentsPIsAndDataAttributes
:
Note:
We
assume
that
configuration
is
the
result
of
[WebIDL]
converting
a
JavaScript
value
to
a
SanitizerConfig
.
If
neither
configuration
["
elements
"]
nor
configuration
["
removeElements
"]
exist
,
then
set
configuration
["
removeElements
"]
to
«
».
If
neither
configuration
["
processingInstructions
"]
nor
configuration
["
removeProcessingInstructions
"]
exist
:
If
allowCommentsPIsAndDataAttributes
is
true,
then
set
configuration
["
removeProcessingInstructions
"]
to
«
».
Otherwise,
set
configuration
["
processingInstructions
"]
to
«
».
If
neither
configuration
["
attributes
"]
nor
configuration
["
removeAttributes
"]
exist
,
then
set
configuration
["
removeAttributes
"]
to
«
».
If
configuration
["
elements
"]
exists
:
Let
elements
be
«
».
For
each
element
of
configuration
["
elements
"]
do:
Append
the
result
of
canonicalize
a
sanitizer
element
with
attributes
element
to
elements
.
Set
configuration
["
elements
"]
to
elements
.
If
configuration
["
removeElements
"]
exists
:
Let
elements
be
«
».
For
each
element
of
configuration
["
removeElements
"]
do:
Append
the
result
of
canonicalize
a
sanitizer
element
element
to
elements
.
Set
configuration
["
removeElements
"]
to
elements
.
If
configuration
["
replaceWithChildrenElements
"]
exists
:
Let
elements
be
«
».
For
each
element
of
configuration
["
replaceWithChildrenElements
"]
do:
Append
the
result
of
canonicalize
a
sanitizer
element
element
to
elements
.
Set
configuration
["
replaceWithChildrenElements
"]
to
elements
.
If
configuration
["
processingInstructions
"]
exists
:
Let
processingInstructions
be
«
».
For
each
pi
of
configuration
["
processingInstructions
"]:
Append
the
result
of
canonicalize
a
sanitizer
processing
instruction
pi
to
processingInstructions
.
Set
configuration
["
processingInstructions
"]
to
processingInstructions
.
If
configuration
["
removeProcessingInstructions
"]
exists
:
Let
processingInstructions
be
«
».
For
each
pi
of
configuration
["
removeProcessingInstructions
"]:
Append
the
result
of
canonicalize
a
sanitizer
processing
instruction
pi
to
processingInstructions
.
Set
configuration
["
removeProcessingInstructions
"]
to
processingInstructions
.
If
configuration
["
attributes
"]
exists
:
Let
attributes
be
«
».
For
each
attribute
of
configuration
["
attributes
"]
do:
Append
the
result
of
canonicalize
a
sanitizer
attribute
attribute
to
attributes
.
Set
configuration
["
attributes
"]
to
attributes
.
If
configuration
["
removeAttributes
"]
exists
:
Let
attributes
be
«
».
For
each
attribute
of
configuration
["
removeAttributes
"]
do:
Append
the
result
of
canonicalize
a
sanitizer
attribute
attribute
to
attributes
.
Set
configuration
["
removeAttributes
"]
to
attributes
.
If
configuration
["
comments
"]
does
not
exist
,
then
set
configuration
["
comments
"]
to
allowCommentsPIsAndDataAttributes
.
If
configuration
["
attributes
"]
exists
and
configuration
["
dataAttributes
"]
does
not
exist
,
then
set
configuration
["
dataAttributes
"]
to
allowCommentsPIsAndDataAttributes
.
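The defaulting behaviour in the steps above can be illustrated with a short Python sketch. It covers only the steps that fill in missing keys and leaves out the canonicalization of the individual list entries. The function name `apply_defaults` and the choice to return a shallow copy are assumptions for this example.

```python
def apply_defaults(configuration, allow_comments_pis_and_data_attributes):
    """Fill in the keys that canonicalization guarantees to exist."""
    config = dict(configuration)  # shallow copy; the input is left untouched
    allow = allow_comments_pis_and_data_attributes

    if "elements" not in config and "removeElements" not in config:
        config["removeElements"] = []

    if ("processingInstructions" not in config
            and "removeProcessingInstructions" not in config):
        if allow:
            # Empty remove-list: every processing instruction is kept.
            config["removeProcessingInstructions"] = []
        else:
            # Empty allow-list: every processing instruction is dropped.
            config["processingInstructions"] = []

    if "attributes" not in config and "removeAttributes" not in config:
        config["removeAttributes"] = []

    if "comments" not in config:
        config["comments"] = allow

    # dataAttributes only gets a default alongside an attribute allow-list.
    if "attributes" in config and "dataAttributes" not in config:
        config["dataAttributes"] = allow

    return config
```

Note that an empty input yields remove-lists for elements and attributes, so a sanitizer built from an empty dictionary starts from "remove nothing" for those two and applies the boolean only to comments and processing instructions.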
To
canonicalize
a
sanitizer
element
with
attributes
a
SanitizerElementWithAttributes
element
:
Let
result
be
the
result
of
canonicalize
a
sanitizer
element
with
element
.
If
element
is
a
dictionary
:
If
element
["
attributes
"]
exists
:
Let
attributes
be
«
».
For
each
attribute
of
element
["
attributes
"]:
Append
the
result
of
canonicalize
a
sanitizer
attribute
with
attribute
to
attributes
.
Set
result
["
attributes
"]
to
attributes
.
If
element
["
removeAttributes
"]
exists
:
Let
attributes
be
«
».
For
each
attribute
of
element
["
removeAttributes
"]:
Append
the
result
of
canonicalize
a
sanitizer
attribute
with
attribute
to
attributes
.
Set
result
["
removeAttributes
"]
to
attributes
.
If
neither
result
["
attributes
"]
nor
result
["
removeAttributes
"]
exist
:
Set
result
["
removeAttributes
"]
to
«
».
Return
result
.
In
order
to
canonicalize
a
sanitizer
element
a
SanitizerElement
element
,
return
the
result
of
canonicalize
a
sanitizer
name
with
element
and
the
HTML
namespace
as
the
default
namespace.
In
order
to
canonicalize
a
sanitizer
processing
instruction
pi
,
run
the
following
steps:
Assert
:
pi
is
either
a
DOMString
or
a
dictionary
.
If
pi
is
a
DOMString
,
then
return
«[
"
target
"
→
pi
]».
Assert
:
pi
is
a
dictionary
and
pi
["target"]
exists
.
Return
«[
"
target
"
→
pi
["target"]
]».
In
order
to
canonicalize
a
sanitizer
attribute
a
SanitizerAttribute
attribute
,
return
the
result
of
canonicalize
a
sanitizer
name
with
attribute
and
null
as
the
default
namespace.
In
order
to
canonicalize
a
sanitizer
name
name
,
with
a
default
namespace
defaultNamespace
,
run
the
following
steps:
Assert
:
name
is
either
a
DOMString
or
a
dictionary
.
If
name
is
a
DOMString
,
then
return
«[
"
name
"
→
name
,
"
namespace
"
→
defaultNamespace
]».
Assert
:
name
is
a
dictionary
and
both
name
["name"]
and
name
["namespace"]
exist
.
If
name
["namespace"]
is
the
empty
string,
then
set
it
to
null.
Return
«[
"
name
"
→
name
["name"],
"
namespace
"
→
name
["namespace"]
]».
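The name canonicalization steps above translate almost directly into Python. In this sketch a DOMString is modelled as `str` and a dictionary as `dict`; the function names are invented for the illustration.

```python
HTML_NS = "http://www.w3.org/1999/xhtml"


def canonicalize_name(name, default_namespace):
    assert isinstance(name, (str, dict))
    if isinstance(name, str):
        return {"name": name, "namespace": default_namespace}
    assert "name" in name and "namespace" in name
    namespace = name["namespace"]
    if namespace == "":
        # The empty string is normalized to null (None here).
        namespace = None
    return {"name": name["name"], "namespace": namespace}


def canonicalize_element(element):
    # Elements default to the HTML namespace.
    return canonicalize_name(element, HTML_NS)


def canonicalize_attribute(attribute):
    # Attributes default to the null namespace.
    return canonicalize_name(attribute, None)


def canonicalize_processing_instruction(pi):
    assert isinstance(pi, (str, dict))
    if isinstance(pi, str):
        return {"target": pi}
    assert "target" in pi
    return {"target": pi["target"]}
```

For example, `canonicalize_element("div")` produces the stored form shown earlier for `{elements: ["div"]}`.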
3.5.
Supporting
Algorithms
For
the
canonicalized
element
and
attribute
name
lists
used
in
this
spec,
list
membership
is
based
on
matching
both
"
name
"
and
"
namespace
"
entries:
A
Sanitizer
name
list
contains
an
item
if
there
exists
an
entry
of
list
that
is
an
ordered
map
,
and
where
item
["name"]
equals
entry
["name"]
and
item
["namespace"]
equals
entry
["namespace"].
A
Sanitizer
target
list
contains
a
target
target
if
there
exists
an
entry
of
list
that
is
an
ordered
map
,
and
where
target
equals
entry
["target"].
To
remove
an
item
from
a
list
list
:
Set
removed
to
false.
For
each
entry
of
list
:
If
item
["name"]
equals
entry
["name"]
and
item
["namespace"]
equals
entry
["namespace"]:
Remove
entry
from
list
.
Set
removed
to
true.
Return
removed
.
To
add
a
name
to
a
list
,
where
name
is
canonicalized
and
list
is
an
ordered
map
:
If
list
contains
name
,
then
return.
Append
name
to
list
.
An
item
itemA
is
less
than
item
itemB
if:
If
itemA
["namespace"]
is
null:
If
itemB
["namespace"]
is
not
null,
then
return
true.
Otherwise:
If
itemB
["namespace"]
is
null,
then
return
false.
If
itemA
["namespace"]
is
code
unit
less
than
itemB
["namespace"],
then
return
true.
If
itemA
["namespace"]
is
not
itemB
["namespace"],
then
return
false.
Return
itemA
["name"]
is
code
unit
less
than
itemB
["name"].
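The ordering above can be sketched in Python. Python strings compare by code point, so this example compares big-endian UTF-16 bytes to obtain a code unit ordering; both function names are assumptions made for the illustration.

```python
def code_unit_less_than(a: str, b: str) -> bool:
    # Big-endian UTF-16 bytes sort in the same order as UTF-16 code units.
    return (a.encode("utf-16-be", "surrogatepass")
            < b.encode("utf-16-be", "surrogatepass"))


def is_less_than(item_a, item_b):
    ns_a, ns_b = item_a["namespace"], item_b["namespace"]
    if ns_a is None:
        # A null namespace sorts before any non-null namespace.
        if ns_b is not None:
            return True
    else:
        if ns_b is None:
            return False
        if code_unit_less_than(ns_a, ns_b):
            return True
        if ns_a != ns_b:
            return False
    # Same namespace (or both null): fall back to comparing names.
    return code_unit_less_than(item_a["name"], item_b["name"])
```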
Equality
for
ordered
sets
is
equality
of
its
members,
but
without
regard
to
order:
Ordered
sets
A
and
B
are
equal
if
both
A
is
a
superset
of
B
and
B
is
a
superset
of
A
.
An
ordered
map
is
a
sequence
of
key
and
value
tuples
.
Equality
of
ordered
maps
is
equality
of
this
sequence
of
tuples,
when
treated
as
an
ordered
set.
Ordered
maps
A
and
B
are
equal
if
the
ordered
set
consisting
of
A
’s
entries
and
the
ordered
set
of
B
’s
entries
are
equal
.
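A small Python sketch of these two equality notions, modelling ordered sets as lists and ordered maps as dicts. The names `sets_equal` and `maps_equal` are hypothetical.

```python
def sets_equal(a, b):
    """Order-insensitive equality: each list is a superset of the other."""
    return all(item in b for item in a) and all(item in a for item in b)


def maps_equal(a, b):
    """Compare two maps by their (key, value) entries, ignoring entry order."""
    return sets_equal(list(a.items()), list(b.items()))
```

Python's built-in `dict` equality already ignores insertion order, so `maps_equal(a, b)` agrees with `a == b` for plain dicts; the explicit version is shown only to mirror the definition.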
A
list
list
has
duplicates
,
if
for
any
item
of
list
,
there
is
more
than
one
entry
in
list
where
item
["name"]
is
entry
["name"]
and
item
["namespace"]
is
entry
["namespace"].
A
list
list
has
duplicate
targets
,
if
for
any
item
of
list
,
there
is
more
than
one
entry
in
list
where
item
["target"]
is
entry
["target"].
To
remove
duplicates
from
a
list
list
:
Let
result
be
«
».
For
each
entry
of
list
,
add
entry
to
result
.
Return
result
.
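Duplicate detection and removal for canonical name lists can be sketched as follows. The deduplication depends on "add" appending only when no entry with the same name and namespace is already present. Function names are illustrative.

```python
HTML_NS = "http://www.w3.org/1999/xhtml"


def contains(name_list, item):
    return any(
        entry["name"] == item["name"] and entry["namespace"] == item["namespace"]
        for entry in name_list
    )


def add(name_list, name):
    """Append name unless an entry with the same name and namespace exists."""
    if contains(name_list, name):
        return
    name_list.append(name)


def remove_duplicates(name_list):
    result = []
    for entry in name_list:
        add(result, entry)
    return result


def has_duplicates(name_list):
    # A list has duplicates exactly when deduplication shortens it.
    return len(remove_duplicates(name_list)) != len(name_list)
```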
The
intersection
of
two
lists
A
and
B
containing
SanitizerElement
is
the
same
as
set
intersection
,
but
with
the
set
entries
previously
canonicalized
:
Let
set
A
be
«
[]
».
Let
set
B
be
«
[]
».
For
each
entry
of
A
,
append
the
result
of
canonicalize
a
sanitizer
name
entry
to
set
A
.
3.6.
Builtins
4. Security Considerations
The Sanitizer API is intended to prevent DOM-based Cross-Site Scripting by traversing supplied HTML content and removing elements and attributes according to a configuration. The specified API must not support the construction of a Sanitizer object that leaves script-capable markup in; doing so would be a bug in the threat model. That being said, there are security issues which the correct usage of the Sanitizer API will not be able to protect against, and these scenarios are laid out in the following sections.
4.1. Server-Side Reflected and Stored XSS
This section is not normative.
The Sanitizer API operates solely in the DOM and adds a capability to traverse and filter an existing DocumentFragment. The Sanitizer does not address server-side reflected or stored XSS.
4.2. DOM clobbering
This section is not normative.
DOM clobbering describes an attack in which malicious HTML confuses an application by naming elements through id or name attributes such that properties like children of an HTML element in the DOM are overshadowed by the malicious content. The Sanitizer API does not protect against DOM clobbering attacks in its default state, but can be configured to remove id and name attributes.
4.3. XSS with Script gadgets
This section is not normative.
Script gadgets are a technique in which an attacker uses existing application code from popular JavaScript libraries to cause their own code to execute. This is often done by injecting innocent-looking code or seemingly inert DOM nodes that are only parsed and interpreted by a framework, which then performs the execution of JavaScript based on that input. The Sanitizer API cannot prevent these attacks, but requires page authors to explicitly allow unknown elements in general, and authors must additionally explicitly configure unknown attributes and elements and markup that is known to be widely used for templating and framework-specific code, like data- and slot attributes and elements like <slot> and <template>. We believe that these restrictions are not exhaustive and encourage page authors to examine their third party libraries for this behavior.
4.4. Mutated XSS
This section is not normative.
Mutated XSS or mXSS describes an attack based on parser context mismatches when parsing an HTML snippet without the correct context. In particular, when a parsed HTML fragment has been serialized to a string, the string is not guaranteed to be parsed and interpreted exactly the same when inserted into a different parent element. An example of carrying out such an attack is relying on the change of parsing behavior for foreign content or mis-nested tags. The Sanitizer API offers only functions that turn a string into a node tree. The context is supplied implicitly by all sanitizer functions: Element.setHTML() uses the current element; Document.parseHTML() creates a new document. Therefore the Sanitizer API is not directly affected by mutated XSS. If a developer were to retrieve a sanitized node tree as a string, e.g. via .innerHTML, and then parse it again, mutated XSS may occur. We discourage this practice.
If processing or passing of HTML as a string should be necessary after all, then any string should be considered untrusted and should be sanitized (again) when inserting it into the DOM. In other words, a sanitized and then serialized HTML tree can no longer be considered as sanitized. A more complete treatment of mXSS can be found in [MXSS].
5. Acknowledgements
This work is informed and inspired by [DOMPURIFY] from cure53, Internet Explorer’s window.toStaticHTML() as well as the original [HTMLSanitizer] from Ben Bucksch. Thanks to Anne van Kesteren, Krzysztof Kotowicz, Andrew C. H. Mc Millan, Tom Schuster, Luke Warlow, Guillaume Weghsteen, and Mike West for their valuable feedback.