1. Introduction
This section is not normative.
Web applications often need to work with strings of HTML on the client side, perhaps as part of a client-side templating solution, perhaps as part of rendering user generated content, etc. It is difficult to do so in a safe way. The naive approach of joining strings together and stuffing them into an Element’s innerHTML is fraught with risk, as it can cause JavaScript execution in a number of unexpected ways.
Libraries like [DOMPURIFY] attempt to manage this problem by carefully parsing and sanitizing strings before insertion, by constructing a DOM and filtering its members through an allow-list. This has proven to be a fragile approach, as the parsing APIs exposed to the web don’t always map in reasonable ways to the browser’s behavior when actually rendering a string as HTML in the "real" DOM. Moreover, the libraries need to keep on top of browsers' changing behavior over time; things that once were safe may turn into time-bombs based on new platform-level features.
The browser has a fairly good idea of when it is going to execute code. We can improve upon the user-space libraries by teaching the browser how to render HTML from an arbitrary string in a safe manner, and do so in a way that is much more likely to be maintained and updated along with the browser’s own changing parser implementation. This document outlines an API which aims to do just that.
1.1. Goals
- Mitigate the risk of DOM-based cross-site scripting attacks by providing developers with mechanisms for handling user-controlled HTML which prevent direct script execution upon injection.
- Make HTML output safe for use within the current user agent, taking into account its current understanding of HTML.
- Allow developers to override the default set of elements and attributes. Adding certain elements and attributes can prevent script gadget attacks.
1.2. API Summary
The Sanitizer API offers functionality to parse a string containing HTML into a DOM tree, and to filter the resulting tree according to a user-supplied configuration. The methods come in two by two flavours:
- Safe and unsafe: The "safe" methods will not generate any markup that executes script. That is, they should be safe from XSS. The "unsafe" methods will parse and filter whatever they’re supposed to. See also: § 4 Security Considerations.
- Context: Methods are defined on Element and ShadowRoot and will replace these Nodes’ children, and are largely analogous to innerHTML. There are also static methods on the Document, which parse an entire document and are largely analogous to DOMParser.parseFromString().
2. Framework
2.1. Sanitizer API
The Element interface defines two methods, setHTML() and setHTMLUnsafe(). Both of these take a DOMString with HTML markup, and an optional configuration.

partial interface Element {
  [CEReactions] undefined setHTMLUnsafe((TrustedHTML or DOMString) html, optional SetHTMLUnsafeOptions options = {});
  [CEReactions] undefined setHTML(DOMString html, optional SetHTMLOptions options = {});
};

Element’s setHTMLUnsafe(html, options) method steps are:
- Let compliantHTML be the result of invoking the Get Trusted Type compliant string algorithm with TrustedHTML, this’s relevant global object, html, "Element setHTMLUnsafe", and "script".
- Let target be this’s template contents if this is a template element; otherwise this.
- Set and filter HTML given target, this, compliantHTML, options, and false.

Element’s setHTML(html, options) method steps are:
- Let target be this’s template contents if this is a template; otherwise this.
- Set and filter HTML given target, this, html, options, and true.
These methods are mirrored on the ShadowRoot:

partial interface ShadowRoot {
  [CEReactions] undefined setHTMLUnsafe((TrustedHTML or DOMString) html, optional SetHTMLUnsafeOptions options = {});
  [CEReactions] undefined setHTML(DOMString html, optional SetHTMLOptions options = {});
};

ShadowRoot’s setHTMLUnsafe(html, options) method steps are:
- Let compliantHTML be the result of invoking the Get Trusted Type compliant string algorithm with TrustedHTML, this’s relevant global object, html, "ShadowRoot setHTMLUnsafe", and "script".
- Set and filter HTML using this, this’s shadow host (as context element), compliantHTML, options, and false.

ShadowRoot’s setHTML(html, options) method steps are:
- Set and filter HTML using this (as target), this (as context element), html, options, and true.
The Document interface gains two new methods which parse an entire Document:

partial interface Document {
  static Document parseHTMLUnsafe((TrustedHTML or DOMString) html, optional SetHTMLUnsafeOptions options = {});
  static Document parseHTML(DOMString html, optional SetHTMLOptions options = {});
};

The parseHTMLUnsafe(html, options) method steps are:
- Let compliantHTML be the result of invoking the Get Trusted Type compliant string algorithm with TrustedHTML, the current global object, html, "Document parseHTMLUnsafe", and "script".
- Let document be a new Document, whose content type is "text/html".
  Note: Since document does not have a browsing context, scripting is disabled.
- Set document’s allow declarative shadow roots to true.
- Parse HTML from a string given document and compliantHTML.
- Let sanitizer be the result of calling get a sanitizer instance from options with options and false.
- Call sanitize on document with sanitizer and false.
- Return document.

The parseHTML(html, options) method steps are:
- Let document be a new Document, whose content type is "text/html".
  Note: Since document does not have a browsing context, scripting is disabled.
- Set document’s allow declarative shadow roots to true.
- Parse HTML from a string given document and html.
- Let sanitizer be the result of calling get a sanitizer instance from options with options and true.
- Call sanitize on document with sanitizer and true.
- Return document.
2.2. SetHTML options and the configuration object.
The family of setHTML()-like methods all accept an options dictionary. Right now, only one member of this dictionary is defined:

enum SanitizerPresets { "default" };
dictionary SetHTMLOptions {
  (Sanitizer or SanitizerConfig or SanitizerPresets) sanitizer = "default";
};
dictionary SetHTMLUnsafeOptions {
  (Sanitizer or SanitizerConfig or SanitizerPresets) sanitizer = {};
};
The Sanitizer configuration object encapsulates a filter configuration. The same configuration can be used with both "safe" and "unsafe" methods, where the "safe" methods perform an implicit removeUnsafe operation on the passed-in configuration and have a default configuration when none is passed. The default differs between "safe" and "unsafe" methods: the "safe" methods aim to be safe by default and have a restrictive default, while the "unsafe" methods are unrestricted by default.

The intent for configuration use is that one (or a few) configurations will be built up early on in a page’s lifetime, and can then be used whenever needed. This allows implementations to pre-process configurations.
The configuration object can be queried to return a configuration dictionary. It can also be modified directly.
[Exposed=Window]
interface Sanitizer {
  constructor(optional (SanitizerConfig or SanitizerPresets) configuration = "default");

  // Query configuration:
  SanitizerConfig get();

  // Modify a Sanitizer’s lists and fields:
  boolean allowElement(SanitizerElementWithAttributes element);
  boolean removeElement(SanitizerElement element);
  boolean replaceElementWithChildren(SanitizerElement element);
  boolean allowAttribute(SanitizerAttribute attribute);
  boolean removeAttribute(SanitizerAttribute attribute);
  boolean setComments(boolean allow);
  boolean setDataAttributes(boolean allow);

  // Remove markup that executes script.
  boolean removeUnsafe();
};

A Sanitizer has an associated SanitizerConfig configuration.

The constructor(configuration) method steps are:
- If configuration is a SanitizerPresets string, then:
  - Set configuration to the built-in safe default configuration.
- Let valid be the return value of set a configuration with configuration and true on this.
- If valid is false, then throw a TypeError.
The get() method steps are:
- Let config be this’s configuration.
- If config["elements"] exists:
  - For any element of config["elements"]:
    - If element["attributes"] exists:
      - Set element["attributes"] to the result of sort in ascending order element["attributes"], with attrA being less than item attrB.
    - If element["removeAttributes"] exists:
      - Set element["removeAttributes"] to the result of sort in ascending order element["removeAttributes"], with attrA being less than item attrB.
  - Set config["elements"] to the result of sort in ascending order config["elements"], with elementA being less than item elementB.
- If config["removeElements"] exists:
  - Set config["removeElements"] to the result of sort in ascending order config["removeElements"], with elementA being less than item elementB.
- If config["replaceWithChildrenElements"] exists:
  - Set config["replaceWithChildrenElements"] to the result of sort in ascending order config["replaceWithChildrenElements"], with elementA being less than item elementB.
- If config["attributes"] exists:
  - Set config["attributes"] to the result of sort in ascending order config["attributes"], with attrA being less than item attrB.
- If config["removeAttributes"] exists:
  - Set config["removeAttributes"] to the result of sort in ascending order config["removeAttributes"], with attrA being less than item attrB.
- Return config.
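The sorting that get() performs can be illustrated in plain JavaScript. This is a minimal sketch rather than the normative algorithm: it assumes entries are already canonical objects of the form { name, namespace }, and the comparison used here (namespace first, then name) is an assumption, since the steps above do not spell out how attrA and attrB are compared. The function names compareEntries and sortedConfig are hypothetical.

```javascript
// Assumed ordering: by namespace (null sorts first), then by name.
function compareEntries(a, b) {
  const nsA = a.namespace || "";
  const nsB = b.namespace || "";
  if (nsA !== nsB) return nsA < nsB ? -1 : 1;
  if (a.name !== b.name) return a.name < b.name ? -1 : 1;
  return 0;
}

// Returns a copy of `config` in which every list that exists is sorted.
function sortedConfig(config) {
  const result = { ...config };
  const listKeys = [
    "elements", "removeElements", "replaceWithChildrenElements",
    "attributes", "removeAttributes",
  ];
  for (const key of listKeys) {
    if (result[key] === undefined) continue;
    result[key] = result[key]
      .map((entry) => {
        const copy = { ...entry };
        // Only entries of "elements" carry per-element lists; sort them if present.
        for (const sub of ["attributes", "removeAttributes"]) {
          if (copy[sub] !== undefined) copy[sub] = [...copy[sub]].sort(compareEntries);
        }
        return copy;
      })
      .sort(compareEntries);
  }
  return result;
}
```

Lists that do not exist stay absent in the result, matching the "exists" guards in the steps.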
The allowElement(element) method steps are to allow an element with element and this’s configuration.

The removeElement(element) method steps are to remove an element with element and this’s configuration.

The replaceElementWithChildren(element) method steps are to replace an element with its children with element and this’s configuration.

The allowAttribute(attribute) method steps are to allow an attribute with attribute and this’s configuration.

The removeAttribute(attribute) method steps are to remove an attribute with attribute and this’s configuration.

The setDataAttributes(allow) method steps are to set data attributes with allow and this’s configuration.

The removeUnsafe() method steps are to update this’s configuration with the result of calling remove unsafe on this’s configuration.
2.3. The Configuration Dictionary
dictionary SanitizerElementNamespace {
  required DOMString name;
  DOMString? _namespace = "http://www.w3.org/1999/xhtml";
};

// Used by "elements"
dictionary SanitizerElementNamespaceWithAttributes : SanitizerElementNamespace {
  sequence<SanitizerAttribute> attributes;
  sequence<SanitizerAttribute> removeAttributes;
};

typedef (DOMString or SanitizerElementNamespace) SanitizerElement;
typedef (DOMString or SanitizerElementNamespaceWithAttributes) SanitizerElementWithAttributes;

dictionary SanitizerAttributeNamespace {
  required DOMString name;
  DOMString? _namespace = null;
};
typedef (DOMString or SanitizerAttributeNamespace) SanitizerAttribute;

dictionary SanitizerConfig {
  sequence<SanitizerElementWithAttributes> elements;
  sequence<SanitizerElement> removeElements;
  sequence<SanitizerElement> replaceWithChildrenElements;

  sequence<SanitizerAttribute> attributes;
  sequence<SanitizerAttribute> removeAttributes;

  boolean comments;
  boolean dataAttributes;
};
2.4. Configuration Invariants
Configurations can and ought to be modified by developers to suit their purposes. Options are to write a new configuration dictionary from scratch, to modify an existing Sanitizer’s configuration by using the modifier methods, or to get() an existing Sanitizer’s configuration as a dictionary and modify the dictionary and then create a new Sanitizer with it.

An empty configuration allows everything (when called with the "unsafe" methods like setHTMLUnsafe). A configuration "default" contains a built-in safe default configuration. Note that "safe" and "unsafe" sanitizer methods have different defaults.
Not all configuration dictionaries are valid. A valid configuration avoids redundancy (like specifying the same element to be allowed twice) and contradictions (like specifying an element to be both removed and allowed.)
Several conditions need to hold for a configuration to be valid:
- Mixing global allow- and remove-lists:
  - elements or removeElements can exist, but not both. If both are missing, this is equivalent to removeElements set to « ».
  - attributes or removeAttributes can exist, but not both. If both are missing, this is equivalent to removeAttributes set to « ».
  - dataAttributes is conceptually an extension of the attributes allow-list. The dataAttributes attribute is only allowed when an attributes list is used.
- Duplicate entries between different global lists:
  - There are no duplicate entries (i.e., no same elements) between elements, removeElements, or replaceWithChildrenElements.
  - There are no duplicate entries (i.e., no same attributes) between attributes or removeAttributes.
- Duplicate entries on the same element:
  - There are no duplicate entries between attributes and removeAttributes on the same element.
The elements element allow-list can also specify allowing or removing attributes for a given element. This is meant to mirror [HTML]’s structure, which knows both global attributes as well as local attributes that apply to a specific element. Global and local attributes can be mixed, but note that ambiguous configurations, where a particular attribute would be allowed by one list and forbidden by another, are generally invalid.

| | global attributes | global removeAttributes |
|---|---|---|
| local attributes | An attribute is allowed if it matches either list. No duplicates are allowed. | An attribute is only allowed if it’s in the local allow list. No duplicate entries between global remove and local allow lists are allowed. Note that the global remove list has no function for this particular element, but may well apply to other elements that do not have a local allow list. |
| local removeAttributes | An attribute is allowed if it’s in the global allow-list, but not in the local remove-list. Local remove must be a subset of the global allow lists. | An attribute is allowed if it is in neither list. No duplicate entries between global remove and local remove lists are allowed. |
Please note the asymmetry where mostly no duplicates between global and per-element lists are permitted, but in the case of a global allow-list and a per-element remove-list the latter must be a subset of the former. An excerpt of the table above, only focusing on duplicates, is as follows:
| | global attributes | global removeAttributes |
|---|---|---|
| local attributes | No duplicates are allowed. | No duplicates are allowed. |
| local removeAttributes | Local remove must be a subset of the global allow lists. | No duplicates are allowed. |
The dataAttributes setting allows custom data attributes. The rules above easily extend to custom data attributes if one considers dataAttributes to be an allow-list:

| | global attributes and dataAttributes set |
|---|---|
| local attributes | All custom data attributes are allowed. No custom data attributes may be listed in any allow-list, as that would mean a duplicate entry. |
| local removeAttributes | A custom data attribute is allowed, unless it’s listed in the local remove-list. No custom data attribute may be listed in the global allow-list, as that would mean a duplicate entry. |
Putting these rules in words:
- Duplicates and interactions between global and local lists:
  - If a global attributes allow list exists, then for each element’s local lists:
    - If a local attributes allow list exists, there may be no duplicate entries between these lists.
    - If a local removeAttributes remove list exists, then all its entries must also be listed in the global attributes allow list.
    - If dataAttributes is true, then no custom data attributes may be listed in any of the allow-lists.
  - If a global removeAttributes remove list exists, then:
    - If a local attributes allow list exists, there may be no duplicate entries between these lists.
    - If a local removeAttributes remove list exists, there may be no duplicate entries between these lists.
    - dataAttributes must be absent.
A SanitizerConfig config is valid if all of the following conditions hold:
- The config has either an elements or a removeElements key, but not both.
- The config has either an attributes or a removeAttributes key, but not both.
- Assert: All SanitizerElementNamespaceWithAttributes, SanitizerElementNamespace, and SanitizerAttributeNamespace items in config are canonical, meaning they have been run through canonicalize a sanitizer element or canonicalize a sanitizer attribute, as appropriate.
- None of config[elements], config[removeElements], config[replaceWithChildrenElements], config[attributes], or config[removeAttributes], if they exist, has duplicates.
- If both config[elements] and config[replaceWithChildrenElements] exist, then the intersection of config[elements] and config[replaceWithChildrenElements] is empty.
- If both config[removeElements] and config[replaceWithChildrenElements] exist, then the intersection of config[removeElements] and config[replaceWithChildrenElements] is empty.
- If config[attributes] exists:
  - If config[elements] exists:
    - For each element of config[elements]:
      - Neither element[attributes] nor element[removeAttributes], if they exist, has duplicates.
      - The intersection of config[attributes] and element[attributes] with default « » is empty.
      - element[removeAttributes] with default « » is a subset of config[attributes].
      - If dataAttributes exists and dataAttributes is true:
        - element[attributes] does not contain a custom data attribute.
  - If dataAttributes is true:
    - config[attributes] does not contain a custom data attribute.
- If config[removeAttributes] exists:
  - If config[elements] exists, then for each element of config[elements]:
    - Neither element[attributes] nor element[removeAttributes], if they exist, has duplicates.
    - The intersection of config[removeAttributes] and element[attributes] with default « » is empty.
    - The intersection of config[removeAttributes] and element[removeAttributes] with default « » is empty.
  - config[dataAttributes] does not exist.
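The validity conditions above can be read as a predicate over a configuration dictionary. The following is a minimal sketch under stated assumptions, not the normative check: it assumes every entry is already canonical ({ name, namespace }), and it assumes a custom data attribute is one with a null namespace whose name starts with "data-". The function name isValidConfig and its helpers are hypothetical.

```javascript
function sameEntry(a, b) {
  return a.name === b.name && a.namespace === b.namespace;
}
function containsEntry(list, item) {
  return list.some((entry) => sameEntry(entry, item));
}
function hasDuplicates(list) {
  return list.some((item, i) => list.findIndex((o) => sameEntry(o, item)) !== i);
}
function intersects(a, b) {
  return a.some((item) => containsEntry(b, item));
}
// Assumption: null namespace and a "data-" name prefix.
function isCustomDataAttribute(attr) {
  return attr.namespace === null && attr.name.startsWith("data-");
}

function isValidConfig(config) {
  const has = (key) => config[key] !== undefined;
  // Exactly one list of each allow-/remove-list pair must be present.
  if (has("elements") === has("removeElements")) return false;
  if (has("attributes") === has("removeAttributes")) return false;
  // No global list may contain duplicates.
  const listKeys = ["elements", "removeElements", "replaceWithChildrenElements",
                    "attributes", "removeAttributes"];
  for (const key of listKeys) {
    if (has(key) && hasDuplicates(config[key])) return false;
  }
  // replaceWithChildrenElements must not overlap the other element lists.
  const replace = config.replaceWithChildrenElements || [];
  if (intersects(config.elements || [], replace)) return false;
  if (intersects(config.removeElements || [], replace)) return false;
  // Per-element lists versus the global attribute list.
  for (const element of config.elements || []) {
    const allow = element.attributes || [];
    const remove = element.removeAttributes || [];
    if (hasDuplicates(allow) || hasDuplicates(remove)) return false;
    if (has("attributes")) {
      if (intersects(config.attributes, allow)) return false;
      if (!remove.every((attr) => containsEntry(config.attributes, attr))) return false;
      if (config.dataAttributes === true && allow.some(isCustomDataAttribute)) return false;
    } else {
      if (intersects(config.removeAttributes, allow)) return false;
      if (intersects(config.removeAttributes, remove)) return false;
    }
  }
  if (has("attributes")) {
    if (config.dataAttributes === true && config.attributes.some(isCustomDataAttribute)) return false;
  } else if (has("dataAttributes")) {
    return false; // dataAttributes is only allowed alongside a global allow-list.
  }
  return true;
}
```

Note the asymmetry from the tables: with a global allow-list, a local remove-list must be a subset of it, whereas every other combination forbids overlap.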
Note: Setting a configuration from a dictionary will do a bit of normalization. In particular, if both allow- and remove-lists are missing, it will interpret this as an empty remove-list. So {} itself is not a valid configuration, but it will be normalized to {removeElements:[],removeAttributes:[]}, which is. This normalization step was chosen in order to have a missing dictionary be consistent with an empty one, i.e., to have setHTMLUnsafe(txt) be consistent with setHTMLUnsafe(txt, {sanitizer: {}}).
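The normalization described in this note can be sketched as a small helper. This is an illustration only: the real set a configuration algorithm also canonicalizes entries and checks validity, which is not shown here, and the name normalizeConfig is hypothetical.

```javascript
// Fills in an empty remove-list wherever both lists of a pair are missing.
function normalizeConfig(config) {
  const result = { ...config };
  if (result.elements === undefined && result.removeElements === undefined) {
    result.removeElements = [];
  }
  if (result.attributes === undefined && result.removeAttributes === undefined) {
    result.removeAttributes = [];
  }
  return result;
}
```

With this helper, normalizeConfig({}) yields a dictionary with empty removeElements and removeAttributes lists, while a dictionary that already has an allow-list keeps it and gains no remove-list for that pair.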
3. Algorithms
To set and filter HTML, given an Element or DocumentFragment target, an Element contextElement, a string html, a dictionary options, and a boolean safe:
- If safe and contextElement’s local name is "script" and contextElement’s namespace is the HTML namespace or the SVG namespace, then return.
- Let sanitizer be the result of calling get a sanitizer instance from options with options and safe.
- Let newChildren be the result of the HTML fragment parsing algorithm given contextElement, html, and true.
- Let fragment be a new DocumentFragment whose node document is contextElement’s node document.
- Run sanitize on fragment using sanitizer and safe.
- Replace all with fragment within target.
To get a sanitizer instance from options, given a dictionary options and a boolean safe:

Note: This algorithm works for both SetHTMLOptions and SetHTMLUnsafeOptions. They only differ in the defaults.
- Let sanitizerSpec be "default".
- If options["sanitizer"] exists, then:
  - Set sanitizerSpec to options["sanitizer"].
- Assert: sanitizerSpec is either a Sanitizer instance, a string which is a SanitizerPresets member, or a dictionary.
- If sanitizerSpec is a string:
  - Set sanitizerSpec to the built-in safe default configuration.
- Assert: sanitizerSpec is either a Sanitizer instance, or a dictionary.
- If sanitizerSpec is a dictionary:
  - Let sanitizer be a new Sanitizer instance.
  - Let setConfigurationResult be the result of set a configuration with sanitizerSpec and not safe on sanitizer.
  - Set sanitizerSpec to sanitizer.
- Return sanitizerSpec.
3.1. Sanitize
To sanitize, given a ParentNode node, a Sanitizer sanitizer, and a boolean safe, run these steps:
- Let configuration be the value of sanitizer’s configuration.
- If safe is true, then set configuration to the result of calling remove unsafe on configuration.
- Call sanitize core on node, configuration, and with handleJavascriptNavigationUrls set to safe.
The sanitize core operation, using a ParentNode node, a SanitizerConfig configuration, and a boolean handleJavascriptNavigationUrls, recurses over the DOM tree beginning with node. It consists of these steps:
- For each child of node’s children:
  - Assert: child implements Text, Comment, Element, or DocumentType.
    Note: Currently, this algorithm is only called on output of the HTML parser for which this assertion should hold. DocumentType should only occur for parseHTML and parseHTMLUnsafe. If in the future this algorithm will be used in different contexts, this assumption needs to be re-examined.
  - If child implements DocumentType, then continue.
  - If child implements Text, then continue.
  - If child implements Comment:
  - Otherwise:
    - Let elementName be a SanitizerElementNamespace with child’s local name and namespace.
    - If configuration["replaceWithChildrenElements"] exists and if configuration["replaceWithChildrenElements"] contains elementName:
      - Call sanitize core on child with configuration and handleJavascriptNavigationUrls.
      - Call replace all with child’s children within child.
      - Continue.
    - If configuration["removeElements"] exists and configuration["removeElements"] contains elementName:
    - If configuration["elements"] exists and configuration["elements"] does not contain elementName:
    - If elementName equals «[ "name" → "template", "namespace" → HTML namespace ]», then call sanitize core on child’s template contents with configuration and handleJavascriptNavigationUrls.
    - If child is a shadow host, then call sanitize core on child’s shadow root with configuration and handleJavascriptNavigationUrls.
    - Let elementWithLocalAttributes be « [] ».
    - If configuration["elements"] exists and configuration["elements"] contains elementName:
      - Set elementWithLocalAttributes to configuration["elements"][elementName].
    - For each attribute in child’s attribute list:
      - Let attrName be a SanitizerAttributeNamespace with attribute’s local name and namespace.
      - If elementWithLocalAttributes["removeAttributes"] with default « » contains attrName:
        - Remove attribute.
      - Otherwise, if configuration["attributes"] exists:
        - If configuration["attributes"] does not contain attrName and elementWithLocalAttributes["attributes"] with default « » does not contain attrName, and if "data-" is not a code unit prefix of attribute’s local name and namespace is not null or configuration["dataAttributes"] is not true:
          - Remove attribute.
      - Otherwise:
        - If elementWithLocalAttributes["attributes"] exists and elementWithLocalAttributes["attributes"] does not contain attrName:
          - Remove attribute.
        - Otherwise, if configuration["removeAttributes"] contains attrName:
          - Remove attribute.
      - If handleJavascriptNavigationUrls:
        - If «[ elementName, attrName ]» matches an entry in the built-in navigating URL attributes list, and if attribute contains a javascript: URL, then remove attribute.
        - If child’s namespace is the MathML Namespace and attribute’s local name is "href" and attribute’s namespace is null or the XLink namespace and attribute contains a javascript: URL, then remove attribute.
        - If the built-in animating URL attributes list contains «[ elementName, attrName ]» and attribute’s value is "href" or "xlink:href", then remove attribute.
    - Call sanitize core on child with configuration and handleJavascriptNavigationUrls.
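The per-attribute decision inside sanitize core can be isolated as a pure function. This is a minimal sketch under stated assumptions: entries are canonical { name, namespace } objects; the custom data attribute condition is read as "the attribute has a null namespace, a local name starting with "data-", and dataAttributes is true"; and the javascript: URL handling is deliberately left out. The name keepsAttribute is hypothetical.

```javascript
function sameEntry(a, b) {
  return a.name === b.name && a.namespace === b.namespace;
}
function containsEntry(list, item) {
  return (list || []).some((entry) => sameEntry(entry, item));
}
// Assumption: null namespace and a "data-" name prefix.
function isCustomDataAttribute(attr) {
  return attr.namespace === null && attr.name.startsWith("data-");
}

// Returns true if the attribute survives the allow-/remove-list checks.
// `elementEntry` is the matching entry of configuration.elements, or {} if none.
function keepsAttribute(configuration, elementEntry, attrName) {
  // A local remove-list always wins.
  if (containsEntry(elementEntry.removeAttributes, attrName)) return false;
  if (configuration.attributes !== undefined) {
    // Global allow-list case: keep if any allow mechanism matches.
    const allowedByList =
      containsEntry(configuration.attributes, attrName) ||
      containsEntry(elementEntry.attributes, attrName);
    const allowedAsData =
      isCustomDataAttribute(attrName) && configuration.dataAttributes === true;
    return allowedByList || allowedAsData;
  }
  // Global remove-list case: a local allow-list, if present, is exclusive.
  if (elementEntry.attributes !== undefined &&
      !containsEntry(elementEntry.attributes, attrName)) {
    return false;
  }
  return !containsEntry(configuration.removeAttributes, attrName);
}
```

Separating the decision from the tree walk makes the four global/local combinations from the tables in § 2.4 easy to check one at a time.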
Note: javascript: URLs are executed only when navigating. Since navigation itself is not an XSS threat, we handle navigation to javascript: URLs, but not navigations in general.
Declarative navigation falls into a handful of categories:
- Anchor elements (<a> in HTML and SVG namespaces).
- Form elements that trigger navigation as part of the form action.
- [MathML] allows any element to act as an anchor.
- [SVG11] animation.
The first two are covered by the built-in navigating URL attributes list .
The MathML case is covered by a separate rule, because there is no formalism in this spec to cover a "per-namespace global" rule.
The SVG animation case is covered by the built-in animating URL attributes list. But since the interpretation of SVG animation elements depends on the animation target, and since during sanitization we cannot know what the final target will be, the sanitize algorithm blocks any animation of href attributes.
To determine whether an attribute contains a javascript: URL:
- Let url be the result of running the basic URL parser on attribute’s value.
- If url is failure, then return false.
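The javascript: URL check can be approximated with the WHATWG URL constructor. This is a sketch, not the normative steps: the steps above stop after the failure case, so the final comparison of the parsed scheme against "javascript" is an assumption, and the name containsJavascriptUrl is hypothetical.

```javascript
// `value` is the attribute's value as a string.
function containsJavascriptUrl(value) {
  let url;
  try {
    // Parses without a base URL; throws a TypeError on failure.
    url = new URL(value);
  } catch {
    return false;
  }
  // Assumption: the remaining step compares the parsed scheme.
  return url.protocol === "javascript:";
}
```

Parsing first and comparing the scheme afterwards avoids string-prefix checks, which tend to miss case variations that the URL parser normalizes.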
3.2. Modify the Configuration
The configuration modifier methods are methods on Sanitizer that modify its configuration. They will maintain the validity criteria. They return a boolean which informs the caller whether the configuration was modified or not.

let s = new Sanitizer({ elements: ["div"] });
s.allowElement("p"); // Returns true.
div.setHTML("<div><p>", { sanitizer: s }); // Allows `<div>` and `<p>`.

let s = new Sanitizer({ elements: ["div"] });
s.removeElement("p"); // Returns false, as <p> was not previously allowed.
div.setHTML("<div><p>", { sanitizer: s }); // Allows `<div>`. `<p>` is removed.
To allow an element with a SanitizerElementWithAttributes element with a SanitizerConfig configuration:

Note: The cases distinguished are:
- Whether we have a global allow- or remove-list, and
- whether these lists already contain element or not.

- Set element to the result of canonicalize a sanitizer element with attributes with element.
- If configuration["elements"] exists:
  - Set modified to the result of remove element from configuration["replaceWithChildrenElements"].
  - Comment: We need to make sure the per-element attributes do not overlap with global attributes.
  - If element["attributes"] exists:
    - Set element["attributes"] to remove duplicates from element["attributes"].
    - If configuration["attributes"] exists:
      - Set element["attributes"] to the difference of element["attributes"] and configuration["attributes"].
      - If configuration["dataAttributes"] is true:
        - Remove all items item from element["attributes"] where item is a custom data attribute.
    - If configuration["removeAttributes"] exists:
      - Set element["attributes"] to the difference of element["attributes"] and configuration["removeAttributes"].
  - If element["removeAttributes"] exists:
    - Set element["removeAttributes"] to remove duplicates from element["removeAttributes"].
    - If configuration["attributes"] exists:
      - Set element["removeAttributes"] to the intersection of element["removeAttributes"] and configuration["attributes"].
    - If configuration["removeAttributes"] exists:
      - Set element["removeAttributes"] to the difference of element["removeAttributes"] and configuration["removeAttributes"].
  - Comment: This is the case with a global allow-list that already contains element.
  - Let current element be the item in configuration["elements"] where item[name] equals element[name] and item[namespace] equals element[namespace].
  - If element equals current element then return modified.
  - Return true.
- Otherwise:
  - If element["attributes"] exists or element["removeAttributes"] with default « » is not empty:
    - The user agent may report a warning to the console that this operation is not supported.
    - Return false.
  - Set modified to the result of remove element from configuration["replaceWithChildrenElements"].
  - If configuration["removeElements"] does not contain element:
    - Comment: This is the case with a global remove-list that does not contain element.
    - Return modified.
  - Comment: This is the case with a global remove-list that contains element.
  - Remove element from configuration["removeElements"].
  - Return true.
To remove an element with a SanitizerElement element from a SanitizerConfig configuration:

Note: The cases distinguished are:
- Whether we have a global allow- or remove-list,
- whether they already contain element or not.

- Set element to the result of canonicalize a sanitizer element with element.
- Set modified to the result of remove element from configuration["replaceWithChildrenElements"].
- Otherwise:
  - If configuration["removeElements"] contains element:
    - Comment: We have a global remove list and it already contains element.
    - Return modified.
  - Comment: We have a global remove list and it does not contain element.
  - Add element to configuration["removeElements"].
  - Return true.
To replace an element with its children with a SanitizerElement element from a SanitizerConfig configuration:
- Set element to the result of canonicalize a sanitizer element with element.
- If configuration["replaceWithChildrenElements"] contains element:
  - Return false.
- Remove element from configuration["removeElements"].
- Add element to configuration["replaceWithChildrenElements"].
- Return true.
To allow an attribute with a SanitizerAttribute attribute on a SanitizerConfig configuration:

Note: This method distinguishes two cases, namely whether we have a global allow- or a global remove-list. If we add attribute to a global allow-list, we may need to do additional work to fix up per-element allow- or remove-lists to maintain our validity criteria.
- Set attribute to the result of canonicalize a sanitizer attribute with attribute.
- If configuration["attributes"] exists:
  - Comment: If we have a global allow-list, we need to add attribute.
  - If configuration["dataAttributes"] is true and attribute is a custom data attribute, then return false.
  - If configuration["attributes"] contains attribute, return false.
  - Comment: Fix-up per-element allow and remove lists.
  - If configuration["elements"] exists:
    - For each element in configuration["elements"]:
      - If element["attributes"] with default « » contains attribute:
        - Remove attribute from element["attributes"].
      - Assert: element["removeAttributes"] with default « » does not contain attribute.
  - Append attribute to configuration["attributes"].
  - Return true.
- Otherwise:
  - Comment: If we have a global remove-list, we need to remove attribute.
  - If configuration["removeAttributes"] does not contain attribute:
    - Return false.
  - Remove attribute from configuration["removeAttributes"].
  - Return true.
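The allow an attribute steps can be modelled on plain objects. This is a minimal sketch, assuming the attribute is already canonical ({ name, namespace }), the configuration is valid (so removeAttributes exists whenever attributes does not), and a custom data attribute has a null namespace and a "data-" name prefix. The function name allowAttribute mirrors the method but the standalone signature is hypothetical.

```javascript
function sameEntry(a, b) {
  return a.name === b.name && a.namespace === b.namespace;
}
function containsEntry(list, item) {
  return (list || []).some((entry) => sameEntry(entry, item));
}
// Assumption: null namespace and a "data-" name prefix.
function isCustomDataAttribute(attr) {
  return attr.namespace === null && attr.name.startsWith("data-");
}

// Mutates `configuration`; returns whether it was modified.
function allowAttribute(configuration, attribute) {
  if (configuration.attributes !== undefined) {
    // Global allow-list: add the attribute unless it is already covered.
    if (configuration.dataAttributes === true && isCustomDataAttribute(attribute)) return false;
    if (containsEntry(configuration.attributes, attribute)) return false;
    // Fix up per-element allow-lists so they do not duplicate the global entry.
    for (const element of configuration.elements || []) {
      if (element.attributes !== undefined) {
        element.attributes = element.attributes.filter((a) => !sameEntry(a, attribute));
      }
    }
    configuration.attributes.push(attribute);
    return true;
  }
  // Global remove-list: allowing means taking the attribute off that list.
  if (!containsEntry(configuration.removeAttributes, attribute)) return false;
  configuration.removeAttributes =
    configuration.removeAttributes.filter((a) => !sameEntry(a, attribute));
  return true;
}
```

The boolean result follows the convention stated for all modifier methods: true only if the configuration actually changed.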
SanitizerConfig
configuration
:
Note: This method distinguishes two cases, namely whether we have a global allow- or a global remove-list. If we add attribute to the global remove-list, we may need to do additional work to fix up per-element allow- or remove-lists to maintain our validity criteria. If we remove attribute from a global allow-list, we may also have to remove it from local remove-lists.
-
Set attribute to the result of canonicalize a sanitizer attribute with attribute .
-
If configuration ["
attributes"] exists :-
Comment : If we have a global allow-list, we need to add attribute .
-
If configuration ["
attributes"] does not contain attribute :-
Return false.
-
-
Comment : Fix-up per-element allow and remove lists.
-
If configuration ["
elements"] exists :-
For each element in configuration ["
elements"]:-
If element ["
removeAttributes"] with default « » contains attribute :-
Remove attribute from element ["
removeAttributes"].
-
-
-
-
Remove attribute from configuration ["
attributes"]. -
Return true.
-
-
Otherwise:
-
Comment : If we have a global remove-list, we need to add attribute .
-
If configuration ["
removeAttributes"] contains attribute return false. -
Comment : Fix-up per-element allow and remove lists.
-
If configuration ["
elements"] exists :-
For each element in configuration ["
elements"]:-
If element ["
attributes"] with default « » contains attribute :-
Remove attribute from element ["
attributes"].
-
-
If element ["
removeAttributes"] with default « » contains attribute :-
Remove attribute from element ["
removeAttributes"].
-
-
-
-
Append attribute to configuration ["
removeAttributes"] -
Return true.
-
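The steps above can be sketched on a plain JavaScript object that stands in for a canonicalized SanitizerConfig. This is an illustration only, not the user agent's implementation; the helper names (sameName, contains, without) are hypothetical, and the sketch assumes the configuration already carries either "attributes" or "removeAttributes".

```javascript
// Canonical names match on both "name" and "namespace".
const sameName = (a, b) => a.name === b.name && a.namespace === b.namespace;
const contains = (list, item) => list.some((entry) => sameName(entry, item));
const without = (list, item) => list.filter((entry) => !sameName(entry, item));

function removeAttribute(config, attribute) {
  // Canonicalize: a bare string gets the null default namespace,
  // and an empty-string namespace is treated as null.
  const attr = typeof attribute === "string"
    ? { name: attribute, namespace: null }
    : { name: attribute.name, namespace: attribute.namespace || null };

  if (config.attributes !== undefined) {
    // Global allow-list: take the attribute out of it.
    if (!contains(config.attributes, attr)) return false;
    for (const element of config.elements ?? []) {
      if (element.removeAttributes) {
        element.removeAttributes = without(element.removeAttributes, attr);
      }
    }
    config.attributes = without(config.attributes, attr);
    return true;
  }

  // Global remove-list (assumed to exist): add the attribute to it.
  if (contains(config.removeAttributes, attr)) return false;
  for (const element of config.elements ?? []) {
    if (element.attributes) {
      element.attributes = without(element.attributes, attr);
    }
    if (element.removeAttributes) {
      element.removeAttributes = without(element.removeAttributes, attr);
    }
  }
  config.removeAttributes.push(attr);
  return true;
}
```

The return value mirrors the steps: true when the configuration changed, false when the call was a no-op.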
SanitizerConfig
configuration
:
To set data attributes boolean allow on a SanitizerConfig configuration:
-
If configuration ["
attributes"] does not exist , then return false. -
If configuration ["
dataAttributes"] equals allow , then return false. -
If allow is true:
-
Remove any items attr from configuration ["
attributes"] where attr is a custom data attribute . -
If configuration ["
elements"] exists :-
For each element in configuration ["
elements"]:-
If element [
attributes] exists :-
Remove any items attr from element [
attributes] where attr is a custom data attribute .
-
-
-
-
-
Set configuration ["
dataAttributes"] to allow . -
Return true.
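As a rough illustration, the steps above might look as follows on a plain object. The isDataAttribute test is a simplification assumed for this sketch (null namespace and a name starting with "data-"); the function name setDataAttributes is likewise only illustrative.

```javascript
// Simplified stand-in for "custom data attribute" (an assumption of this sketch).
const isDataAttribute = (attr) =>
  attr.namespace === null && attr.name.startsWith("data-") && attr.name.length > 5;

function setDataAttributes(config, allow) {
  // Only meaningful when a global attribute allow-list is present.
  if (config.attributes === undefined) return false;
  // Nothing to do if the flag already has the requested value.
  if (config.dataAttributes === allow) return false;

  if (allow) {
    // Explicit data-* entries become redundant once all of them are allowed.
    config.attributes = config.attributes.filter((a) => !isDataAttribute(a));
    for (const element of config.elements ?? []) {
      if (element.attributes) {
        element.attributes = element.attributes.filter((a) => !isDataAttribute(a));
      }
    }
  }
  config.dataAttributes = allow;
  return true;
}
```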
To remove unsafe from a SanitizerConfig configuration, do this:
Note: While this algorithm is called remove unsafe, we use the term "unsafe" strictly in the sense of this spec, to denote content that will execute JavaScript when inserted into the document. In other words, this method will remove opportunities for XSS.
-
Assert : The key set of built-in safe baseline configuration equals «[ "
removeElements", "removeAttributes" ] ». -
Let result be false.
-
For each element in built-in safe baseline configuration [
removeElements]:-
Call remove an element element from configuration .
-
If the call returned true, set result to true.
-
-
For each attribute in built-in safe baseline configuration [
removeAttributes]:-
Call remove an attribute attribute from configuration .
-
If the call returned true, set result to true.
-
-
For each attribute listed in event handler content attributes :
-
Call remove an attribute attribute from configuration .
-
If the call returned true, set result to true.
-
-
Return result .
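The control flow of these steps can be sketched as below. The element and attribute removal operations are defined elsewhere in the spec, so this sketch takes them as a hypothetical ops parameter rather than reimplementing them; the baseline configuration and the event handler attribute names are passed in as data.

```javascript
function removeUnsafe(config, baseline, eventHandlerAttributes, ops) {
  // The baseline is expected to consist of exactly two remove-lists.
  const keys = Object.keys(baseline).sort().join(",");
  if (keys !== "removeAttributes,removeElements") {
    throw new Error("unexpected baseline configuration shape");
  }

  let result = false;
  for (const element of baseline.removeElements) {
    if (ops.removeElement(config, element)) result = true;
  }
  for (const attribute of baseline.removeAttributes) {
    if (ops.removeAttribute(config, attribute)) result = true;
  }
  // Event handler content attributes are removed in addition to the baseline.
  for (const attribute of eventHandlerAttributes) {
    if (ops.removeAttribute(config, attribute)) result = true;
  }
  return result;
}
```

Every removal is attempted even after result has become true, so the return value only reports whether anything changed.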
3.3. Set the Configuration
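To set a configuration, given a dictionary configuration, a boolean allowCommentsAndDataAttributes, and a Sanitizer sanitizer: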
-
Canonicalize configuration with allowCommentsAndDataAttributes .
-
If configuration is not valid , then return false.
-
Set sanitizer ’s configuration to configuration .
-
Return true.
3.4. Canonicalize the Configuration
The Sanitizer stores the configuration in a canonical form, as this makes a number of processing steps easier. For example, an elements list {elements: ["div"]} gets stored as {elements: [{name: "div", namespace: "http://www.w3.org/1999/xhtml"}]}.
To canonicalize the configuration SanitizerConfig configuration with a boolean allowCommentsAndDataAttributes:
Note: We assume that configuration is the result of [WebIDL] converting a JavaScript value to a SanitizerConfig.
-
If neither configuration ["
elements"] nor configuration ["removeElements"] exist , then set configuration ["removeElements"] to « ». -
If neither configuration ["
attributes"] nor configuration ["removeAttributes"] exist , then set configuration ["removeAttributes"] to « ». -
If configuration ["
elements"] exists :-
Let elements be « ».
-
For each element of configuration ["
elements"] do:-
Append the result of canonicalize a sanitizer element with attributes element to elements .
-
-
Set configuration ["
elements"] to elements .
-
-
If configuration ["
removeElements"] exists :-
Let elements be « ».
-
For each element of configuration ["
removeElements"] do:-
Append the result of canonicalize a sanitizer element element to elements .
-
-
Set configuration ["
removeElements"] to elements .
-
-
If configuration ["
replaceWithChildrenElements"] exists :-
Let elements be « ».
-
For each element of configuration ["
replaceWithChildrenElements"] do:-
Append the result of canonicalize a sanitizer element element to elements .
-
-
Set configuration ["
replaceWithChildrenElements"] to elements .
-
-
If configuration ["
attributes"] exists :-
Let attributes be « ».
-
For each attribute of configuration ["
attributes"] do:-
Append the result of canonicalize a sanitizer attribute attribute to attributes .
-
-
Set configuration ["
attributes"] to attributes .
-
-
If configuration ["
removeAttributes"] exists :-
Let attributes be « ».
-
For each attribute of configuration ["
removeAttributes"] do:-
Append the result of canonicalize a sanitizer attribute attribute to attributes .
-
-
Set configuration ["
removeAttributes"] to attributes .
-
-
If configuration ["
comments"] does not exist , then set configuration ["comments"] to allowCommentsAndDataAttributes . -
If configuration ["
attributes"] exists and configuration ["dataAttributes"] does not exist , then set configuration ["dataAttributes"] to allowCommentsAndDataAttributes .
To canonicalize a sanitizer element with attributes a SanitizerElementWithAttributes element:
-
Let result be the result of canonicalize a sanitizer element with element .
-
If element is a dictionary :
-
If element ["
attributes"] exists :-
Let attributes be « ».
-
For each attribute of element ["
attributes"]:-
Append the result of canonicalize a sanitizer attribute with attribute to attributes .
-
-
Set result ["
attributes"] to attributes .
-
-
If element ["
removeAttributes"] exists :-
Let attributes be « ».
-
For each attribute of element ["
removeAttributes"]:-
Append the result of canonicalize a sanitizer attribute with attribute to attributes .
-
-
Set result ["
removeAttributes"] to attributes .
-
-
-
If neither result ["
attributes"] nor result ["removeAttributes"] exist :-
Set result ["
removeAttributes"] to « ».
-
-
Return result .
To canonicalize a sanitizer element a SanitizerElement element, return the result of canonicalize a sanitizer name with element and the HTML namespace as the default namespace.
To canonicalize a sanitizer attribute a SanitizerAttribute attribute, return the result of canonicalize a sanitizer name with attribute and null as the default namespace.
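To canonicalize a sanitizer name name, with a default namespace defaultNamespace, run the following steps:
-
Assert: name is either a DOMString or a dictionary.
-
If name is a DOMString, then return «[ "name" → name, "namespace" → defaultNamespace ]».
-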
Assert : name is a dictionary and both name ["name"] and name ["namespace"] exist .
-
If name ["namespace"] is the empty string, then set it to null.
-
Return «[
"name" → name ["name"],
"namespace" → name ["namespace"]
]».
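The canonicalization steps in this section can be sketched on plain JavaScript values as follows. This is a minimal illustration, not a normative implementation. One assumption is made explicit: where the spec relies on [WebIDL] conversion to fill in a missing "namespace", the sketch falls back to the default namespace itself.

```javascript
const HTML_NS = "http://www.w3.org/1999/xhtml";

function canonicalizeName(name, defaultNamespace) {
  if (typeof name === "string") return { name, namespace: defaultNamespace };
  // Assumption: stand in for WebIDL defaults when "namespace" is absent.
  const ns = name.namespace === undefined ? defaultNamespace : name.namespace;
  return { name: name.name, namespace: ns === "" ? null : ns };
}

const canonicalizeElement = (element) => canonicalizeName(element, HTML_NS);
const canonicalizeAttribute = (attribute) => canonicalizeName(attribute, null);

function canonicalizeElementWithAttributes(element) {
  const result = canonicalizeElement(element);
  if (typeof element !== "string") {
    if (element.attributes !== undefined) {
      result.attributes = element.attributes.map(canonicalizeAttribute);
    }
    if (element.removeAttributes !== undefined) {
      result.removeAttributes = element.removeAttributes.map(canonicalizeAttribute);
    }
  }
  if (result.attributes === undefined && result.removeAttributes === undefined) {
    result.removeAttributes = [];
  }
  return result;
}

function canonicalizeConfig(config, allowCommentsAndDataAttributes) {
  if (config.elements === undefined && config.removeElements === undefined) {
    config.removeElements = [];
  }
  if (config.attributes === undefined && config.removeAttributes === undefined) {
    config.removeAttributes = [];
  }
  if (config.elements !== undefined) {
    config.elements = config.elements.map(canonicalizeElementWithAttributes);
  }
  for (const key of ["removeElements", "replaceWithChildrenElements"]) {
    if (config[key] !== undefined) config[key] = config[key].map(canonicalizeElement);
  }
  for (const key of ["attributes", "removeAttributes"]) {
    if (config[key] !== undefined) config[key] = config[key].map(canonicalizeAttribute);
  }
  if (config.comments === undefined) config.comments = allowCommentsAndDataAttributes;
  if (config.attributes !== undefined && config.dataAttributes === undefined) {
    config.dataAttributes = allowCommentsAndDataAttributes;
  }
  return config;
}
```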
3.5. Supporting Algorithms
For the canonicalized element and attribute name lists used in this spec, list membership is based on matching both "name" and "namespace" entries:
-
If itemA ["namespace"] is null:
-
If itemB ["namespace"] is not null, return true.
-
-
Otherwise:
-
If itemB ["namespace"] is null, return false.
-
If itemA ["namespace"] is code unit less than itemB ["namespace"], return true.
-
-
Return itemA ["name"] is code unit less than itemB ["name"].
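The matching rule and the ordering steps above can be transcribed directly. The function names sameItem and lessThan are hypothetical. JavaScript's < on strings compares UTF-16 code units, which is what "code unit less than" asks for. The sketch follows the steps exactly as written, including the fall-through to the name comparison.

```javascript
// Membership: two canonical items match when both name and namespace are equal.
const sameItem = (a, b) => a.name === b.name && a.namespace === b.namespace;

function lessThan(itemA, itemB) {
  if (itemA.namespace === null) {
    // A null namespace sorts before any non-null namespace.
    if (itemB.namespace !== null) return true;
  } else {
    if (itemB.namespace === null) return false;
    if (itemA.namespace < itemB.namespace) return true;
  }
  return itemA.name < itemB.name;
}
```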
The intersection of two lists A and B containing SanitizerElement entries is the same as set intersection, but with the set entries previously canonicalized:
-
Let set A be « [] »
-
Let set B be « [] »
-
For each entry of A , append the result of canonicalize a sanitizer name entry to set A .
-
For each entry of B , append the result of canonicalize a sanitizer name entry to set B .
-
Return the intersection of set A and set B .
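A compact sketch of this intersection is given below. The steps do not say which default namespace the canonicalization uses, so the defaultNamespace parameter is an assumption of this sketch, as is the inline canon helper.

```javascript
function intersection(listA, listB, defaultNamespace) {
  // Inline name canonicalization (strings get the default namespace).
  const canon = (entry) =>
    typeof entry === "string"
      ? { name: entry, namespace: defaultNamespace }
      : { name: entry.name, namespace: entry.namespace === "" ? null : entry.namespace };

  const setA = listA.map(canon);
  const setB = listB.map(canon);
  // Keep the entries of A that have a matching name and namespace in B.
  return setA.filter((a) =>
    setB.some((b) => a.name === b.name && a.namespace === b.namespace)
  );
}
```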
3.6. Builtins
There are four builtins:
-
the built-in safe default configuration,
-
the built-in safe baseline configuration,
-
the built-in navigating URL attributes list, and
-
the built-in animating URL attributes list.
The built-in safe default configuration is as follows:
{ "elements" : [ { "name" : "math" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "merror" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "mfrac" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "mi" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "mmultiscripts" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "mn" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "mo" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [ { "name" : "fence" , "namespace" : null }, { "name" : "form" , "namespace" : null }, { "name" : "largeop" , "namespace" : null }, { "name" : "lspace" , "namespace" : null }, { "name" : "maxsize" , "namespace" : null }, { "name" : "minsize" , "namespace" : null }, { "name" : "movablelimits" , "namespace" : null }, { "name" : "rspace" , "namespace" : null }, { "name" : "separator" , "namespace" : null }, { "name" : "stretchy" , "namespace" : null }, { "name" : "symmetric" , "namespace" : null } ] }, { "name" : "mover" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [ { "name" : "accent" , "namespace" : null } ] }, { "name" : "mpadded" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [ { "name" : "depth" , "namespace" : null }, { "name" : "height" , "namespace" : null }, { "name" : "lspace" , "namespace" : null }, { "name" : "voffset" , "namespace" : null }, { "name" : "width" , "namespace" : null } ] }, { "name" : "mphantom" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "mprescripts" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "mroot" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "mrow" , "namespace" : "http://www.w3.org/1998/Math/MathML" 
, "attributes" : [] }, { "name" : "ms" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "mspace" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [ { "name" : "depth" , "namespace" : null }, { "name" : "height" , "namespace" : null }, { "name" : "width" , "namespace" : null } ] }, { "name" : "msqrt" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "mstyle" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "msub" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "msubsup" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "msup" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "mtable" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "mtd" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [ { "name" : "columnspan" , "namespace" : null }, { "name" : "rowspan" , "namespace" : null } ] }, { "name" : "mtext" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "mtr" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "munder" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [ { "name" : "accentunder" , "namespace" : null } ] }, { "name" : "munderover" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [ { "name" : "accent" , "namespace" : null }, { "name" : "accentunder" , "namespace" : null } ] }, { "name" : "semantics" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "a" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "href" , "namespace" : null }, { "name" : "hreflang" , "namespace" : null }, { "name" : "rel" , "namespace" : null }, { "name" : "type" , "namespace" : null } ] }, { "name" : "abbr" , 
"namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "address" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "article" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "aside" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "b" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "bdi" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "bdo" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "blockquote" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "cite" , "namespace" : null } ] }, { "name" : "body" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "br" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "caption" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "cite" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "code" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "col" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "span" , "namespace" : null } ] }, { "name" : "colgroup" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "span" , "namespace" : null } ] }, { "name" : "data" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "value" , "namespace" : null } ] }, { "name" : "dd" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "del" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "cite" , "namespace" : null }, { "name" : "datetime" , "namespace" : null } ] }, { "name" : "dfn" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "div" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : 
[] }, { "name" : "dl" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "dt" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "em" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "figcaption" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "figure" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "footer" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "h1" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "h2" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "h3" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "h4" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "h5" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "h6" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "head" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "header" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "hgroup" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "hr" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "html" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "i" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "ins" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "cite" , "namespace" : null }, { "name" : "datetime" , "namespace" : null } ] }, { "name" : "kbd" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "li" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "value" , "namespace" : null } ] }, { "name" : "main" , 
"namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "mark" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "menu" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "nav" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "ol" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "reversed" , "namespace" : null }, { "name" : "start" , "namespace" : null }, { "name" : "type" , "namespace" : null } ] }, { "name" : "p" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "pre" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "q" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "rp" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "rt" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "ruby" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "s" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "samp" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "search" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "section" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "small" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "span" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "strong" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "sub" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "sup" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "table" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "tbody" , "namespace" : 
"http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "td" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "colspan" , "namespace" : null }, { "name" : "headers" , "namespace" : null }, { "name" : "rowspan" , "namespace" : null } ] }, { "name" : "tfoot" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "th" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "abbr" , "namespace" : null }, { "name" : "colspan" , "namespace" : null }, { "name" : "headers" , "namespace" : null }, { "name" : "rowspan" , "namespace" : null }, { "name" : "scope" , "namespace" : null } ] }, { "name" : "thead" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "time" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "datetime" , "namespace" : null } ] }, { "name" : "title" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "tr" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "u" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "ul" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "var" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "wbr" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "circle" , "namespace" : "http://www.w3.org/2000/svg" , "attributes" : [ { "name" : "cx" , "namespace" : null }, { "name" : "cy" , "namespace" : null }, { "name" : "pathLength" , "namespace" : null }, { "name" : "r" , "namespace" : null } ] }, { "name" : "defs" , "namespace" : "http://www.w3.org/2000/svg" , "attributes" : [] }, { "name" : "desc" , "namespace" : "http://www.w3.org/2000/svg" , "attributes" : [] }, { "name" : "ellipse" , "namespace" : "http://www.w3.org/2000/svg" , "attributes" : [ { "name" : "cx" , "namespace" : null }, { "name" : "cy" , "namespace" : 
null }, { "name" : "pathLength" , "namespace" : null }, { "name" : "rx" , "namespace" : null }, { "name" : "ry" , "namespace" : null } ] }, { "name" : "foreignObject" , "namespace" : "http://www.w3.org/2000/svg" , "attributes" : [ { "name" : "height" , "namespace" : null }, { "name" : "width" , "namespace" : null }, { "name" : "x" , "namespace" : null }, { "name" : "y" , "namespace" : null } ] }, { "name" : "g" , "namespace" : "http://www.w3.org/2000/svg" , "attributes" : [] }, { "name" : "line" , "namespace" : "http://www.w3.org/2000/svg" , "attributes" : [ { "name" : "pathLength" , "namespace" : null }, { "name" : "x1" , "namespace" : null }, { "name" : "x2" , "namespace" : null }, { "name" : "y1" , "namespace" : null }, { "name" : "y2" , "namespace" : null } ] }, { "name" : "marker" , "namespace" : "http://www.w3.org/2000/svg" , "attributes" : [ { "name" : "markerHeight" , "namespace" : null }, { "name" : "markerUnits" , "namespace" : null }, { "name" : "markerWidth" , "namespace" : null }, { "name" : "orient" , "namespace" : null }, { "name" : "preserveAspectRatio" , "namespace" : null }, { "name" : "refX" , "namespace" : null }, { "name" : "refY" , "namespace" : null }, { "name" : "viewBox" , "namespace" : null } ] }, { "name" : "metadata" , "namespace" : "http://www.w3.org/2000/svg" , "attributes" : [] }, { "name" : "path" , "namespace" : "http://www.w3.org/2000/svg" , "attributes" : [ { "name" : "d" , "namespace" : null }, { "name" : "pathLength" , "namespace" : null } ] }, { "name" : "polygon" , "namespace" : "http://www.w3.org/2000/svg" , "attributes" : [ { "name" : "pathLength" , "namespace" : null }, { "name" : "points" , "namespace" : null } ] }, { "name" : "polyline" , "namespace" : "http://www.w3.org/2000/svg" , "attributes" : [ { "name" : "pathLength" , "namespace" : null }, { "name" : "points" , "namespace" : null } ] }, { "name" : "rect" , "namespace" : "http://www.w3.org/2000/svg" , "attributes" : [ { "name" : "height" , "namespace" : null }, { 
"name" : "pathLength" , "namespace" : null }, { "name" : "rx" , "namespace" : null }, { "name" : "ry" , "namespace" : null }, { "name" : "width" , "namespace" : null }, { "name" : "x" , "namespace" : null }, { "name" : "y" , "namespace" : null } ] }, { "name" : "svg" , "namespace" : "http://www.w3.org/2000/svg" , "attributes" : [ { "name" : "height" , "namespace" : null }, { "name" : "preserveAspectRatio" , "namespace" : null }, { "name" : "viewBox" , "namespace" : null }, { "name" : "width" , "namespace" : null }, { "name" : "x" , "namespace" : null }, { "name" : "y" , "namespace" : null } ] }, { "name" : "text" , "namespace" : "http://www.w3.org/2000/svg" , "attributes" : [ { "name" : "dx" , "namespace" : null }, { "name" : "dy" , "namespace" : null }, { "name" : "lengthAdjust" , "namespace" : null }, { "name" : "rotate" , "namespace" : null }, { "name" : "textLength" , "namespace" : null }, { "name" : "x" , "namespace" : null }, { "name" : "y" , "namespace" : null } ] }, { "name" : "textPath" , "namespace" : "http://www.w3.org/2000/svg" , "attributes" : [ { "name" : "lengthAdjust" , "namespace" : null }, { "name" : "method" , "namespace" : null }, { "name" : "path" , "namespace" : null }, { "name" : "side" , "namespace" : null }, { "name" : "spacing" , "namespace" : null }, { "name" : "startOffset" , "namespace" : null }, { "name" : "textLength" , "namespace" : null } ] }, { "name" : "title" , "namespace" : "http://www.w3.org/2000/svg" , "attributes" : [] }, { "name" : "tspan" , "namespace" : "http://www.w3.org/2000/svg" , "attributes" : [ { "name" : "dx" , "namespace" : null }, { "name" : "dy" , "namespace" : null }, { "name" : "lengthAdjust" , "namespace" : null }, { "name" : "rotate" , "namespace" : null }, { "name" : "textLength" , "namespace" : null }, { "name" : "x" , "namespace" : null }, { "name" : "y" , "namespace" : null } ] } ], "attributes" : [ { "name" : "alignment-baseline" , "namespace" : null }, { "name" : "baseline-shift" , "namespace" : null }, 
{ "name" : "clip-path" , "namespace" : null }, { "name" : "clip-rule" , "namespace" : null }, { "name" : "color" , "namespace" : null }, { "name" : "color-interpolation" , "namespace" : null }, { "name" : "cursor" , "namespace" : null }, { "name" : "dir" , "namespace" : null }, { "name" : "direction" , "namespace" : null }, { "name" : "display" , "namespace" : null }, { "name" : "displaystyle" , "namespace" : null }, { "name" : "dominant-baseline" , "namespace" : null }, { "name" : "fill" , "namespace" : null }, { "name" : "fill-opacity" , "namespace" : null }, { "name" : "fill-rule" , "namespace" : null }, { "name" : "font-family" , "namespace" : null }, { "name" : "font-size" , "namespace" : null }, { "name" : "font-size-adjust" , "namespace" : null }, { "name" : "font-stretch" , "namespace" : null }, { "name" : "font-style" , "namespace" : null }, { "name" : "font-variant" , "namespace" : null }, { "name" : "font-weight" , "namespace" : null }, { "name" : "lang" , "namespace" : null }, { "name" : "letter-spacing" , "namespace" : null }, { "name" : "marker-end" , "namespace" : null }, { "name" : "marker-mid" , "namespace" : null }, { "name" : "marker-start" , "namespace" : null }, { "name" : "mathbackground" , "namespace" : null }, { "name" : "mathcolor" , "namespace" : null }, { "name" : "mathsize" , "namespace" : null }, { "name" : "opacity" , "namespace" : null }, { "name" : "paint-order" , "namespace" : null }, { "name" : "pointer-events" , "namespace" : null }, { "name" : "scriptlevel" , "namespace" : null }, { "name" : "shape-rendering" , "namespace" : null }, { "name" : "stop-color" , "namespace" : null }, { "name" : "stop-opacity" , "namespace" : null }, { "name" : "stroke" , "namespace" : null }, { "name" : "stroke-dasharray" , "namespace" : null }, { "name" : "stroke-dashoffset" , "namespace" : null }, { "name" : "stroke-linecap" , "namespace" : null }, { "name" : "stroke-linejoin" , "namespace" : null }, { "name" : "stroke-miterlimit" , "namespace" : 
null }, { "name" : "stroke-opacity" , "namespace" : null }, { "name" : "stroke-width" , "namespace" : null }, { "name" : "text-anchor" , "namespace" : null }, { "name" : "text-decoration" , "namespace" : null }, { "name" : "text-overflow" , "namespace" : null }, { "name" : "text-rendering" , "namespace" : null }, { "name" : "title" , "namespace" : null }, { "name" : "transform" , "namespace" : null }, { "name" : "transform-origin" , "namespace" : null }, { "name" : "unicode-bidi" , "namespace" : null }, { "name" : "vector-effect" , "namespace" : null }, { "name" : "visibility" , "namespace" : null }, { "name" : "white-space" , "namespace" : null }, { "name" : "word-spacing" , "namespace" : null }, { "name" : "writing-mode" , "namespace" : null } ], "comments" : false , "dataAttributes" : false }
Note: Included [MathML] markup is based on [SafeMathML] .
The built-in safe baseline configuration is meant to block only script-content. It is as follows:
{ "removeElements" : [ { "namespace" : "http://www.w3.org/1999/xhtml" , "name" : "embed" }, { "namespace" : "http://www.w3.org/1999/xhtml" , "name" : "frame" }, { "namespace" : "http://www.w3.org/1999/xhtml" , "name" : "iframe" }, { "namespace" : "http://www.w3.org/1999/xhtml" , "name" : "object" }, { "namespace" : "http://www.w3.org/1999/xhtml" , "name" : "script" }, { "namespace" : "http://www.w3.org/2000/svg" , "name" : "script" }, { "namespace" : "http://www.w3.org/2000/svg" , "name" : "use" } ], "removeAttributes" : [] }
Warning: The remove unsafe algorithm specifies to additionally remove any event handler content attributes , as defined in [HTML] . If a user agent defines extensions to the [HTML] spec with additional event handler content attributes , it is its responsibility to decide how to handle them. Using the current event handler content attributes list, the safe baseline configuration looks effectively like so:
{ "removeElements" : [ { "namespace" : "http://www.w3.org/1999/xhtml" , "name" : "embed" }, { "namespace" : "http://www.w3.org/1999/xhtml" , "name" : "frame" }, { "namespace" : "http://www.w3.org/1999/xhtml" , "name" : "iframe" }, { "namespace" : "http://www.w3.org/1999/xhtml" , "name" : "object" }, { "namespace" : "http://www.w3.org/1999/xhtml" , "name" : "script" }, { "namespace" : "http://www.w3.org/2000/svg" , "name" : "script" }, { "namespace" : "http://www.w3.org/2000/svg" , "name" : "use" } ], "removeAttributes" : [ "onafterprint" , "onauxclick" , "onbeforeinput" , "onbeforematch" , "onbeforeprint" , "onbeforeunload" , "onbeforetoggle" , "onblur" , "oncancel" , "oncanplay" , "oncanplaythrough" , "onchange" , "onclick" , "onclose" , "oncontextlost" , "oncontextmenu" , "oncontextrestored" , "oncopy" , "oncuechange" , "oncut" , "ondblclick" , "ondrag" , "ondragend" , "ondragenter" , "ondragleave" , "ondragover" , "ondragstart" , "ondrop" , "ondurationchange" , "onemptied" , "onended" , "onerror" , "onfocus" , "onformdata" , "onhashchange" , "oninput" , "oninvalid" , "onkeydown" , "onkeypress" , "onkeyup" , "onlanguagechange" , "onload" , "onloadeddata" , "onloadedmetadata" , "onloadstart" , "onmessage" , "onmessageerror" , "onmousedown" , "onmouseenter" , "onmouseleave" , "onmousemove" , "onmouseout" , "onmouseover" , "onmouseup" , "onoffline" , "ononline" , "onpagehide" , "onpagereveal" , "onpageshow" , "onpageswap" , "onpaste" , "onpause" , "onplay" , "onplaying" , "onpopstate" , "onprogress" , "onratechange" , "onreset" , "onresize" , "onrejectionhandled" , "onscroll" , "onscrollend" , "onsecuritypolicyviolation" , "onseeked" , "onseeking" , "onselect" , "onslotchange" , "onstalled" , "onstorage" , "onsubmit" , "onsuspend" , "ontimeupdate" , "ontoggle" , "onunhandledrejection" , "onunload" , "onvolumechange" , "onwaiting" , "onwheel" ] }
The built-in navigating URL attributes list, for which "javascript:" navigations are "unsafe", is as follows:
«[
[ { "name" → "a", "namespace" → HTML namespace }, { "name" → "href", "namespace" → null } ],
[ { "name" → "area", "namespace" → HTML namespace }, { "name" → "href", "namespace" → null } ],
[ { "name" → "base", "namespace" → HTML namespace }, { "name" → "href", "namespace" → null } ],
[ { "name" → "button", "namespace" → HTML namespace }, { "name" → "formaction", "namespace" → null } ],
[ { "name" → "form", "namespace" → HTML namespace }, { "name" → "action", "namespace" → null } ],
[ { "name" → "iframe", "namespace" → HTML namespace }, { "name" → "src", "namespace" → null } ],
[ { "name" → "input", "namespace" → HTML namespace }, { "name" → "formaction", "namespace" → null } ],
[ { "name" → "a", "namespace" → SVG namespace }, { "name" → "href", "namespace" → null } ],
[ { "name" → "a", "namespace" → SVG namespace }, { "name" → "href", "namespace" → XLink namespace } ]
]»
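For illustration, the list above can be held as data and queried with a simple lookup. How the sanitizer consults the list is specified elsewhere; the function names here are hypothetical, and the scheme check relies only on the standard URL constructor, treating unparsable (for example relative) values as not being "javascript:" URLs.

```javascript
const HTML_NS = "http://www.w3.org/1999/xhtml";
const SVG_NS = "http://www.w3.org/2000/svg";
const XLINK_NS = "http://www.w3.org/1999/xlink";

// Pairs of [element, attribute], mirroring the list above.
const NAVIGATING_URL_ATTRIBUTES = [
  [{ name: "a", namespace: HTML_NS }, { name: "href", namespace: null }],
  [{ name: "area", namespace: HTML_NS }, { name: "href", namespace: null }],
  [{ name: "base", namespace: HTML_NS }, { name: "href", namespace: null }],
  [{ name: "button", namespace: HTML_NS }, { name: "formaction", namespace: null }],
  [{ name: "form", namespace: HTML_NS }, { name: "action", namespace: null }],
  [{ name: "iframe", namespace: HTML_NS }, { name: "src", namespace: null }],
  [{ name: "input", namespace: HTML_NS }, { name: "formaction", namespace: null }],
  [{ name: "a", namespace: SVG_NS }, { name: "href", namespace: null }],
  [{ name: "a", namespace: SVG_NS }, { name: "href", namespace: XLINK_NS }],
];

const same = (a, b) => a.name === b.name && a.namespace === b.namespace;

function isNavigatingUrlAttribute(element, attribute) {
  return NAVIGATING_URL_ATTRIBUTES.some(
    ([el, attr]) => same(el, element) && same(attr, attribute)
  );
}

function hasJavascriptScheme(value) {
  try {
    return new URL(value).protocol === "javascript:";
  } catch {
    return false; // not an absolute URL
  }
}
```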
The built-in animating URL attributes list, which can be used in [SVG11] to declaratively modify navigation elements to use "javascript:" URLs, is as follows:
«[
[ { "name" → "animate", "namespace" → SVG namespace }, { "name" → "attributeName", "namespace" → null } ],
[ { "name" → "animateMotion", "namespace" → SVG namespace }, { "name" → "attributeName", "namespace" → null } ],
[ { "name" → "animateTransform", "namespace" → SVG namespace }, { "name" → "attributeName", "namespace" → null } ],
[ { "name" → "set", "namespace" → SVG namespace }, { "name" → "attributeName", "namespace" → null } ]
]»
4. Security Considerations
The Sanitizer API is intended to prevent DOM-based Cross-Site Scripting by traversing supplied HTML content and removing elements and attributes according to a configuration. The specified API must not support the construction of a Sanitizer object that leaves script-capable markup in place; any way of doing so would be a bug in the threat model.
That being said, there are security issues that even correct usage of the Sanitizer API cannot protect against. These scenarios are laid out in the following sections.
4.1. Server-Side Reflected and Stored XSS
This section is not normative.
The Sanitizer API operates solely in the DOM and adds a capability to traverse and filter an existing DocumentFragment. The Sanitizer does not address server-side reflected or stored XSS.
4.2. DOM clobbering
This section is not normative.
DOM clobbering describes an attack in which malicious HTML confuses an application by naming elements through id or name attributes such that properties like children of an HTML element in the DOM are overshadowed by the malicious content.
The Sanitizer API does not protect against DOM clobbering attacks in its default state, but can be configured to remove id and name attributes.
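A configuration along these lines might look like the fragment below. The variable name is illustrative, and the usage is guarded so that it only runs where the Sanitizer API and a document are actually available.

```javascript
// Remove the attributes that DOM clobbering relies on.
const noClobberingConfig = { removeAttributes: ["id", "name"] };

// Browser-only usage; inert in environments without the Sanitizer API.
if (typeof Sanitizer !== "undefined" && typeof document !== "undefined") {
  const target = document.createElement("div");
  target.setHTML('<form id="children">hi</form>', {
    sanitizer: new Sanitizer(noClobberingConfig),
  });
}
```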
4.3. XSS with Script gadgets
This section is not normative.
Script gadgets are a technique in which an attacker uses existing application code from popular JavaScript libraries to cause their own code to execute. This is often done by injecting innocent-looking code or seemingly inert DOM nodes that are only parsed and interpreted by a framework, which then performs the execution of JavaScript based on that input.
The Sanitizer API cannot prevent these attacks. It does, however, require page authors to explicitly allow unknown elements in general, and to explicitly configure unknown attributes as well as elements and markup known to be widely used for templating and framework-specific code, such as data- and slot attributes and elements like <slot> and <template>. We believe that these restrictions are not exhaustive and encourage page authors to examine their third-party libraries for this behavior.
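To make such a review concrete, the following non-normative sketch inspects a configuration dictionary and reports entries that re-enable the templating-related markup named above. The function findGadgetProneEntries and its helpers are hypothetical and not part of the API. The sketch assumes the configuration shape defined earlier in this document (elements, attributes, dataAttributes) and checks only the global lists, not per-element attribute lists.

```javascript
// Markup the text above singles out as widely used for templating.
const GADGET_PRONE_ELEMENTS = new Set(["slot", "template"]);
const GADGET_PRONE_ATTRIBUTES = new Set(["slot"]);

// Entries may be plain strings or dictionaries with a `name` member.
function entryName(entry) {
  return typeof entry === "string" ? entry : entry.name;
}

function isGadgetProneAttribute(name) {
  return GADGET_PRONE_ATTRIBUTES.has(name) || name.startsWith("data-");
}

// Hypothetical review helper: returns one warning per risky entry found
// in the global `elements` and `attributes` lists of a configuration.
function findGadgetProneEntries(config) {
  const warnings = [];

  for (const element of config.elements ?? []) {
    const name = entryName(element);
    if (GADGET_PRONE_ELEMENTS.has(name)) {
      warnings.push(`element <${name}> is allowed`);
    }
  }

  for (const attribute of config.attributes ?? []) {
    const name = entryName(attribute);
    if (isGadgetProneAttribute(name)) {
      warnings.push(`attribute ${name} is allowed`);
    }
  }

  if (config.dataAttributes === true) {
    warnings.push("all data-* attributes are allowed");
  }

  return warnings;
}
```

An empty result does not mean a configuration is free of script gadgets. As noted above, the list of risky markup is not exhaustive and depends on the libraries a page loads.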
4.4. Mutated XSS
This section is not normative.
Mutated XSS or mXSS describes an attack based on parser context mismatches when parsing an HTML snippet without the correct context. In particular, when a parsed HTML fragment has been serialized to a string, the string is not guaranteed to be parsed and interpreted exactly the same when inserted into a different parent element. One way to carry out such an attack is to rely on the change of parsing behavior for foreign content or mis-nested tags.
The Sanitizer API offers only functions that turn a string into a node tree. The context is supplied implicitly by all sanitizer functions: Element.setHTML() uses the current element; Document.parseHTML() creates a new document. Therefore, the Sanitizer API is not directly affected by mutated XSS.
If a developer were to retrieve a sanitized node tree as a string, e.g. via .innerHTML, and then parse it again, mutated XSS may occur. We discourage this practice. If processing or passing HTML as a string is necessary after all, then any string should be considered untrusted and should be sanitized (again) when inserting it into the DOM. In other words, a sanitized and then serialized HTML tree can no longer be considered sanitized.
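The recommendation above can be sketched as follows. The helper name copyMarkupSafely is hypothetical. The sketch assumes only Element.setHTML() as described in this document and the standard innerHTML getter, and it uses the default sanitizer configuration.

```javascript
// Hypothetical helper: copies markup from one element into another.
// The serialized string is treated as untrusted, even though `source`
// may itself have been filled through the Sanitizer API.
function copyMarkupSafely(source, destination) {
  // Serialization turns the node tree back into a string; from this point
  // on it can no longer be considered sanitized.
  const serialized = source.innerHTML;

  // Re-parse and re-sanitize in the context of `destination` instead of
  // assigning the string to destination.innerHTML.
  destination.setHTML(serialized);
}
```

The important design choice is that the string never reaches an unsanitized sink: the second parse happens inside setHTML(), with the destination element as its context.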
A more complete treatment of mXSS can be found in [MXSS].
5. Acknowledgements
This work is informed and inspired by [DOMPURIFY] from cure53, Internet Explorer’s window.toStaticHTML(), as well as the original [HTMLSanitizer] from Ben Bucksch. We also thank Anne van Kesteren, Krzysztof Kotowicz, Tom Schuster, Luke Warlow, Guillaume Weghsteen, and Mike West for their valuable feedback.