1. Introduction
This section is not normative.
Web applications often need to work with strings of HTML on the client side, perhaps as part of a client-side templating solution, perhaps as part of rendering user generated content, etc. It is difficult to do so in a safe way. The naive approach of joining strings together and stuffing them into an Element's innerHTML is fraught with risk, as it can cause JavaScript execution in a number of unexpected ways.
Libraries like [DOMPURIFY] attempt to manage this problem by carefully parsing and sanitizing strings before insertion, by constructing a DOM and filtering its members through an allow-list. This has proven to be a fragile approach, as the parsing APIs exposed to the web don’t always map in reasonable ways to the browser’s behavior when actually rendering a string as HTML in the "real" DOM. Moreover, the libraries need to keep on top of browsers' changing behavior over time; things that once were safe may turn into time-bombs based on new platform-level features.
The browser has a fairly good idea of when it is going to execute code. We can improve upon the user-space libraries by teaching the browser how to render HTML from an arbitrary string in a safe manner, and do so in a way that is much more likely to be maintained and updated along with the browser’s own changing parser implementation. This document outlines an API which aims to do just that.
1.1. Goals
- Mitigate the risk of DOM-based cross-site scripting attacks by providing developers with mechanisms for handling user-controlled HTML which prevent direct script execution upon injection.
- Make HTML output safe for use within the current user agent, taking into account its current understanding of HTML.
- Allow developers to override the default set of elements and attributes. Adding certain elements and attributes can prevent script gadget attacks.
1.2. API Summary
The Sanitizer API offers functionality to parse a string containing HTML into a DOM tree, and to filter the resulting tree according to a user-supplied configuration. The methods come in two by two flavours:
- Safe and unsafe: The "safe" methods will not generate any markup that executes script. That is, they should be safe from XSS. The "unsafe" methods will parse and filter whatever they’re supposed to. See also: § 4 Security Considerations.
- Context: Methods are defined on Element and ShadowRoot and will replace these Node's children, and are largely analogous to innerHTML. There are also static methods on the Document, which parse an entire document and are largely analogous to DOMParser.parseFromString().
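As a rough illustration of how a page might prefer the "safe" flavour, the sketch below feature-detects setHTML() and falls back to plain text where it is missing. The type HtmlSinkSketch and the helper renderUntrustedSketch are hypothetical names introduced for this example only; a real Element would satisfy the structural type in a supporting browser.

```typescript
// Structural stand-in for an Element; only the members this sketch touches.
interface HtmlSinkSketch {
  textContent: string | null;
  setHTML?: (html: string, options?: { sanitizer?: unknown }) => void;
}

// Hypothetical helper: prefer the "safe" setter, otherwise fall back to text.
function renderUntrustedSketch(
  target: HtmlSinkSketch,
  html: string
): "sanitized" | "text" {
  if (typeof target.setHTML === "function") {
    // No options passed, so the default configuration applies.
    target.setHTML(html);
    return "sanitized";
  }
  // Without the API the string is shown literally; nothing is parsed as markup.
  target.textContent = html;
  return "text";
}
```

The fallback deliberately avoids innerHTML, so an unsupported browser shows the markup as text rather than executing it.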
2. Framework
2.1. Sanitizer API
The Element interface defines two methods, setHTML() and setHTMLUnsafe(). Both of these take a DOMString with HTML markup, and an optional configuration.
partial interface Element {
  [CEReactions] undefined setHTMLUnsafe((TrustedHTML or DOMString) html, optional SetHTMLUnsafeOptions options = {});
  [CEReactions] undefined setHTML(DOMString html, optional SetHTMLOptions options = {});
};
Element's setHTMLUnsafe(html, options) method steps are:
- Let compliantHTML be the result of invoking the Get Trusted Type compliant string algorithm with TrustedHTML, this's relevant global object, html, "Element setHTMLUnsafe", and "script".
- Let target be this's template contents if this is a template element; otherwise this.
- Set and filter HTML given target, this, compliantHTML, options, and false.
Element's setHTML(html, options) method steps are:
- Let target be this's template contents if this is a template; otherwise this.
- Set and filter HTML given target, this, html, options, and true.
These methods are mirrored on the ShadowRoot:
partial interface ShadowRoot {
  [CEReactions] undefined setHTMLUnsafe((TrustedHTML or DOMString) html, optional SetHTMLUnsafeOptions options = {});
  [CEReactions] undefined setHTML(DOMString html, optional SetHTMLOptions options = {});
};
ShadowRoot's setHTMLUnsafe(html, options) method steps are:
- Let compliantHTML be the result of invoking the Get Trusted Type compliant string algorithm with TrustedHTML, this's relevant global object, html, "ShadowRoot setHTMLUnsafe", and "script".
- Set and filter HTML using this, this's shadow host (as context element), compliantHTML, options, and false.
ShadowRoot's setHTML(html, options) method steps are:
- Set and filter HTML using this (as target), this (as context element), html, options, and true.
The Document interface gains two new methods which parse an entire Document:
partial interface Document {
  static Document parseHTMLUnsafe((TrustedHTML or DOMString) html, optional SetHTMLUnsafeOptions options = {});
  static Document parseHTML(DOMString html, optional SetHTMLOptions options = {});
};
The parseHTMLUnsafe(html, options) method steps are:
- Let compliantHTML be the result of invoking the Get Trusted Type compliant string algorithm with TrustedHTML, this's relevant global object, html, "Document parseHTMLUnsafe", and "script".
- Let document be a new Document, whose content type is "text/html".
  Note: Since document does not have a browsing context, scripting is disabled.
- Set document’s allow declarative shadow roots to true.
- Parse HTML from a string given document and compliantHTML.
- Let sanitizer be the result of calling get a sanitizer instance from options with options.
- Call sanitize on document’s root node with sanitizer and false.
- Return document.
The parseHTML(html, options) method steps are:
- Let document be a new Document, whose content type is "text/html".
  Note: Since document does not have a browsing context, scripting is disabled.
- Set document’s allow declarative shadow roots to true.
- Parse HTML from a string given document and html.
- Let sanitizer be the result of calling get a sanitizer instance from options with options.
- Call sanitize on document’s root node with sanitizer and true.
- Return document.
2.2. SetHTML options and the configuration object.
The family of setHTML()-like methods all accept an options dictionary. Right now, only one member of this dictionary is defined:
enum SanitizerPresets { "default" };
dictionary SetHTMLOptions {
  (Sanitizer or SanitizerConfig or SanitizerPresets) sanitizer = "default";
};
dictionary SetHTMLUnsafeOptions {
  (Sanitizer or SanitizerConfig or SanitizerPresets) sanitizer = {};
};
The Sanitizer configuration object encapsulates a filter configuration. The same configuration can be used with both "safe" and "unsafe" methods, where the "safe" methods perform an implicit removeUnsafe operation on the passed-in configuration and have a default configuration when none is passed. The intent is that one (or a few) configurations will be built up early on in a page’s lifetime, and can then be used whenever needed. This allows implementations to pre-process configurations.
The configuration object can be queried to return a configuration dictionary. It can also be modified directly.
[Exposed=(Window,Worker)]
interface Sanitizer {
  constructor(optional (SanitizerConfig or SanitizerPresets) configuration = "default");

  // Query configuration:
  SanitizerConfig get();

  // Modify a Sanitizer’s lists and fields:
  undefined allowElement(SanitizerElementWithAttributes element);
  undefined removeElement(SanitizerElement element);
  undefined replaceElementWithChildren(SanitizerElement element);
  undefined allowAttribute(SanitizerAttribute attribute);
  undefined removeAttribute(SanitizerAttribute attribute);
  undefined setComments(boolean allow);
  undefined setDataAttributes(boolean allow);

  // Remove markup that executes script. May modify multiple lists:
  undefined removeUnsafe();
};
A Sanitizer has an associated configuration, a SanitizerConfig.
The constructor(configuration) method steps are:
- If configuration is a SanitizerPresets string, then:
  - Set configuration to the built-in safe default configuration.
- Let valid be the return value of setting configuration on this.
- If valid is false, then throw a TypeError.
2.3. The Configuration Dictionary
dictionary SanitizerElementNamespace {
  required DOMString name;
  DOMString? _namespace = "http://www.w3.org/1999/xhtml";
};

// Used by "elements"
dictionary SanitizerElementNamespaceWithAttributes : SanitizerElementNamespace {
  sequence<SanitizerAttribute> attributes;
  sequence<SanitizerAttribute> removeAttributes;
};

typedef (DOMString or SanitizerElementNamespace) SanitizerElement;
typedef (DOMString or SanitizerElementNamespaceWithAttributes) SanitizerElementWithAttributes;

dictionary SanitizerAttributeNamespace {
  required DOMString name;
  DOMString? _namespace = null;
};
typedef (DOMString or SanitizerAttributeNamespace) SanitizerAttribute;

dictionary SanitizerConfig {
  sequence<SanitizerElementWithAttributes> elements;
  sequence<SanitizerElement> removeElements;
  sequence<SanitizerElement> replaceWithChildrenElements;

  sequence<SanitizerAttribute> attributes;
  sequence<SanitizerAttribute> removeAttributes;

  boolean comments;
  boolean dataAttributes;
};
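To make the shape of the dictionary more concrete, the following sketch mirrors it as TypeScript types and builds one example value. The type names (ConfigSketch and friends) and the example element choices are illustrative assumptions, not part of the API; the IDL member _namespace is written as namespace, which is how it would appear in script.

```typescript
// Hand-written mirror of the dictionaries above (a sketch, not generated from the IDL).
type AttrNameSketch = string | { name: string; namespace?: string | null };
type ElemNameSketch = string | { name: string; namespace?: string | null };
type ElemWithAttrsSketch =
  | string
  | {
      name: string;
      namespace?: string | null;
      attributes?: AttrNameSketch[];
      removeAttributes?: AttrNameSketch[];
    };

interface ConfigSketch {
  elements?: ElemWithAttrsSketch[];
  removeElements?: ElemNameSketch[];
  replaceWithChildrenElements?: ElemNameSketch[];
  attributes?: AttrNameSketch[];
  removeAttributes?: AttrNameSketch[];
  comments?: boolean;
  dataAttributes?: boolean;
}

// Illustrative allow-list configuration for a comment box.
const commentBoxConfigSketch: ConfigSketch = {
  // Plain strings default to the HTML namespace for elements.
  elements: ["p", "em", "strong", { name: "a", attributes: ["href", "rel"] }],
  // <span> wrappers are dropped but their children are kept.
  replaceWithChildrenElements: ["span"],
  // Attributes allowed on every element.
  attributes: ["lang", "dir"],
  comments: false,
  dataAttributes: false,
};
```

The example uses an allow-list ("elements", "attributes") and leaves the corresponding remove-lists unset.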
3. Algorithms
To set and filter HTML, given an Element or DocumentFragment target, an Element contextElement, a string html, a dictionary options, and a boolean safe:
- If safe and contextElement’s local name is "script" and contextElement’s namespace is the HTML namespace or the SVG namespace, then return.
- Let sanitizer be the result of calling get a sanitizer instance from options with options.
- Let newChildren be the result of the HTML fragment parsing algorithm steps given contextElement, html, and true.
- Let fragment be a new DocumentFragment whose node document is contextElement’s node document.
- Run sanitize on fragment using sanitizer and safe.
- Replace all with fragment within target.
To get a sanitizer instance from options for a dictionary options, do:
Note: This algorithm works for both SetHTMLOptions and SetHTMLUnsafeOptions. They only differ in the defaults.
- Let sanitizerSpec be "default".
- If options["sanitizer"] exists, then:
  - Set sanitizerSpec to options["sanitizer"].
- Assert: sanitizerSpec is either a Sanitizer instance, a string which is a SanitizerPresets member, or a dictionary.
- If sanitizerSpec is a string:
  - Set sanitizerSpec to the built-in safe default configuration.
- Assert: sanitizerSpec is either a Sanitizer instance, or a dictionary.
- If sanitizerSpec is a dictionary:
  - Let sanitizer be a new Sanitizer instance.
  - Let setConfigurationResult be the result of set a configuration with sanitizerSpec on sanitizer.
  - Set sanitizerSpec to sanitizer.
- Return sanitizerSpec.
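The resolution steps above can be sketched roughly as follows. SanitizerSketch, PlainConfigSketch and SAFE_DEFAULT_SKETCH are stand-ins invented for this example (the real default configuration is much longer), and the sketch skips the validity result of setting a configuration.

```typescript
// Simplified stand-in for a configuration dictionary.
interface PlainConfigSketch {
  elements?: string[];
  attributes?: string[];
}

// Stand-in for the Sanitizer interface; it only stores its configuration.
class SanitizerSketch {
  constructor(public readonly configuration: PlainConfigSketch) {}
}

// Abbreviated placeholder for the built-in safe default configuration.
const SAFE_DEFAULT_SKETCH: PlainConfigSketch = {
  elements: ["p", "em", "strong"],
  attributes: ["dir", "lang", "title"],
};

function getSanitizerFromOptionsSketch(options: {
  sanitizer?: SanitizerSketch | "default" | PlainConfigSketch;
}): SanitizerSketch {
  // Start from "default" unless the caller supplied a sanitizer member.
  const spec = options.sanitizer ?? "default";
  // An existing instance is returned unchanged.
  if (spec instanceof SanitizerSketch) return spec;
  // A preset string resolves to the built-in default; a dictionary is used as given.
  const config = typeof spec === "string" ? SAFE_DEFAULT_SKETCH : spec;
  return new SanitizerSketch(config);
}
```

Reusing an instance when one is passed matches the intent that a configuration is built once and applied many times.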
3.1. Sanitization Algorithms
To sanitize, given a ParentNode node, a Sanitizer sanitizer, and a boolean safe, run these steps:
- Let configuration be the value of sanitizer’s configuration.
- If safe is true, then set configuration to the result of calling remove unsafe on configuration.
- Call sanitize core on node, configuration, and with handleJavascriptNavigationUrls set to safe.
The sanitize core operation, using a ParentNode node, a SanitizerConfig configuration, and a boolean handleJavascriptNavigationUrls, iterates over the DOM tree beginning with node, and may recurse to handle some special cases (e.g. template contents). It consists of these steps:
- Let current be node.
- For each child in current’s children:
  - Assert: child implements Text, Comment, or Element.
    Note: Currently, this algorithm is only called on output of the HTML parser for which this assertion should hold. If in the future this algorithm will be used in different contexts, this assumption needs to be re-examined.
  - If child implements Text, then continue.
  - If child implements Comment:
    - If configuration["comments"] is not true, then remove child.
  - Otherwise:
    - Let elementName be a SanitizerElementNamespace with child’s local name and namespace.
    - If configuration["removeElements"] contains elementName, or if configuration["elements"] is not empty and does not contain elementName, then remove child.
    - If configuration["replaceWithChildrenElements"] contains elementName:
      - Call sanitize core on child with configuration and handleJavascriptNavigationUrls.
      - Call replace all with child’s children within child.
    - If elementName equals «[ "name" → "template", "namespace" → HTML namespace ]»:
      - Call sanitize core on child’s template contents with configuration and handleJavascriptNavigationUrls.
    - If child is a shadow host, then call sanitize core on child’s shadow root with configuration and handleJavascriptNavigationUrls.
    - For each attribute in child’s attribute list:
      - Let attrName be a SanitizerAttributeNamespace with attribute’s local name and namespace.
      - If configuration["removeAttributes"] contains attrName, then remove attribute from child.
      - If configuration["elements"]["removeAttributes"] contains attrName, then remove attribute from child.
      - If all of the following are false, then remove attribute from child.
        - configuration["attributes"] exists and contains attrName
        - configuration["elements"]["attributes"] contains attrName
        - "data-" is a code unit prefix of local name and namespace is null and configuration["dataAttributes"] is true
      - If handleJavascriptNavigationUrls:
        - If «[ elementName, attrName ]» matches an entry in the built-in navigating URL attributes list, and if attribute contains a javascript: URL, then remove attribute from child.
        - If child’s namespace is the MathML Namespace and attribute’s local name is "href" and attribute’s namespace is null or the XLink namespace and attribute contains a javascript: URL, then remove attribute.
        - If the built-in animating URL attributes list contains «[ elementName, attrName ]» and attribute’s value is "href" or "xlink:href", then remove attribute.
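The traversal can be illustrated on a toy tree model. The sketch below is a deliberate simplification and not the algorithm itself: it assumes HTML-namespace names only (plain strings), ignores per-element attribute lists, template contents, shadow roots and javascript: URL handling, assumes comment nodes are dropped unless configuration["comments"] is true, and returns a filtered copy instead of mutating a DOM. All type and function names are hypothetical.

```typescript
type NodeSketch =
  | { kind: "text"; data: string }
  | { kind: "comment"; data: string }
  | { kind: "element"; name: string; attrs: Record<string, string>; children: NodeSketch[] };

interface CoreConfigSketch {
  elements?: string[];
  removeElements?: string[];
  replaceWithChildrenElements?: string[];
  attributes?: string[];
  removeAttributes?: string[];
  comments?: boolean;
  dataAttributes?: boolean;
}

function sanitizeCoreSketch(children: NodeSketch[], cfg: CoreConfigSketch): NodeSketch[] {
  const out: NodeSketch[] = [];
  for (const child of children) {
    // Text nodes always pass through.
    if (child.kind === "text") { out.push(child); continue; }
    // Comments survive only when explicitly enabled (assumption of this sketch).
    if (child.kind === "comment") { if (cfg.comments === true) out.push(child); continue; }
    const elementName = child.name;
    // Drop the element if it is on the remove-list, or if a non-empty
    // allow-list exists and does not mention it.
    const blocked =
      (cfg.removeElements ?? []).includes(elementName) ||
      (cfg.elements !== undefined && cfg.elements.length > 0 && !cfg.elements.includes(elementName));
    if (blocked) continue;
    const kept = sanitizeCoreSketch(child.children, cfg);
    // Unwrap: keep the sanitized children, discard the element itself.
    if ((cfg.replaceWithChildrenElements ?? []).includes(elementName)) { out.push(...kept); continue; }
    const attrs: Record<string, string> = {};
    for (const [attrName, value] of Object.entries(child.attrs)) {
      if ((cfg.removeAttributes ?? []).includes(attrName)) continue;
      const allowed =
        (cfg.attributes ?? []).includes(attrName) ||
        (cfg.dataAttributes === true && attrName.startsWith("data-"));
      if (allowed) attrs[attrName] = value;
    }
    out.push({ kind: "element", name: elementName, attrs, children: kept });
  }
  return out;
}
```

Building a new tree sidesteps the question of mutating a child list while iterating over it, which a real implementation has to handle explicitly.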
Note: Current browsers support javascript: URLs only when navigating. Since navigation itself is not an XSS threat we handle navigation to javascript: URLs, but not navigations in general.
Declarative navigation falls into a handful of categories:
- Anchor elements. (<a> in HTML and SVG namespaces)
- Form elements that trigger navigation as part of the form action.
- [MathML] allows any element to act as an anchor.
- [SVG11] animation.
The first two are covered by the built-in navigating URL attributes list.
The MathML case is covered by a separate rule, because there is no formalism in this spec to cover a "per-namespace global" rule.
The SVG animation case is covered by the built-in animating URL attributes list. But since the interpretation of SVG animation elements depends on the animation target, and since during sanitization we cannot know what the final target will be, the sanitize algorithm blocks any animation of href attributes.
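To determine whether an attribute contains a javascript: URL:
- Let url be the result of running the basic URL parser on attribute’s value.
- If url is failure, then return false.
- Return whether url’s scheme is "javascript".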
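As a sketch of this check, the WHATWG URL class can stand in for the basic URL parser. The helper name containsJavascriptUrlSketch is hypothetical, and the final comparison of the parsed scheme against "javascript" is an assumption of this example about how the check concludes.

```typescript
// Returns true only when the value parses as an absolute URL with the javascript: scheme.
function containsJavascriptUrlSketch(attributeValue: string): boolean {
  let url: URL;
  try {
    // Parsing without a base URL: relative references fail, like a parser failure.
    url = new URL(attributeValue);
  } catch {
    return false;
  }
  // URL.protocol includes the trailing colon and is already lower-cased.
  return url.protocol === "javascript:";
}
```

Parsing first, rather than matching the raw string against a prefix, lets the parser deal with leading whitespace and case variations in the scheme.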
3.2. Configuration Processing
To allow an element element with a SanitizerConfig configuration, do:
- Set element to the result of canonicalize a sanitizer element with attributes with element.
- Remove element from configuration["elements"].
- Add element to configuration["elements"].
- Remove element from configuration["removeElements"].
- Remove element from configuration["replaceWithChildrenElements"].
NOTE: Handling of allowElement is a little more complicated than the other methods, because the element allow list can have per-element allow- and remove-attribute lists. We first remove the given element from the list before then adding it, which has the effect of re-setting (rather than merging or elsehow modifying) the per-element list to whatever is passed in. In other words, the per-element allow- and remove-lists can only be set as a whole.
NOTE: Remove matches on name and namespace, so adding an element with attributes would still remove the matching element from the removeElements and replaceWithChildrenElements lists.
To remove an element element with a SanitizerConfig configuration, do:
- Set element to the result of canonicalize a sanitizer element with element.
- Add element to configuration["removeElements"].
- Remove element from configuration["replaceWithChildrenElements"].
To replace an element with its children element with a SanitizerConfig configuration, do:
- Set element to the result of canonicalize a sanitizer element with element.
- Add element to configuration["replaceWithChildrenElements"].
- Remove element from configuration["removeElements"].
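The two list updates directly above can be sketched over already-canonicalized entries. This is an illustration only: the names are hypothetical, "add" is treated as set-like (no duplicates), and the functions return new lists instead of modifying the configuration in place.

```typescript
interface EntrySketch { name: string; namespace: string | null }

interface ElementListsSketch {
  removeElements: EntrySketch[];
  replaceWithChildrenElements: EntrySketch[];
}

// Entries match when both name and namespace are equal.
const sameEntrySketch = (a: EntrySketch, b: EntrySketch): boolean =>
  a.name === b.name && a.namespace === b.namespace;

// Set-like add: an entry that is already present is not added twice (assumption).
function addEntrySketch(list: EntrySketch[], entry: EntrySketch): EntrySketch[] {
  return list.some((e) => sameEntrySketch(e, entry)) ? list : [...list, entry];
}

function dropEntrySketch(list: EntrySketch[], entry: EntrySketch): EntrySketch[] {
  return list.filter((e) => !sameEntrySketch(e, entry));
}

// "Remove an element": put it on the remove-list, take it off the unwrap-list.
function removeAnElementSketch(cfg: ElementListsSketch, element: EntrySketch): ElementListsSketch {
  return {
    removeElements: addEntrySketch(cfg.removeElements, element),
    replaceWithChildrenElements: dropEntrySketch(cfg.replaceWithChildrenElements, element),
  };
}

// "Replace an element with its children": the mirror image of the above.
function replaceWithChildrenSketch(cfg: ElementListsSketch, element: EntrySketch): ElementListsSketch {
  return {
    removeElements: dropEntrySketch(cfg.removeElements, element),
    replaceWithChildrenElements: addEntrySketch(cfg.replaceWithChildrenElements, element),
  };
}
```

Because each operation clears the entry from the competing list, an element ends up in at most one of the two lists.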
To allow an attribute attribute with a SanitizerConfig configuration, do:
- Set attribute to the result of canonicalize a sanitizer attribute with attribute.
- Add attribute to configuration["attributes"].
- Remove attribute from configuration["removeAttributes"].
To remove an attribute attribute with a SanitizerConfig configuration, do:
- Set attribute to the result of canonicalize a sanitizer attribute with attribute.
- Add attribute to configuration["removeAttributes"].
- Remove attribute from configuration["attributes"].
To set comments with a boolean allow and a SanitizerConfig configuration, do:
- Set configuration["comments"] to allow.
To set data attributes with a boolean allow and a SanitizerConfig configuration, do:
- Set configuration["dataAttributes"] to allow.
Note: While this algorithm is called remove unsafe, we use the term "unsafe" strictly in the sense of this spec, to denote content that will execute JavaScript when inserted into the document. In other words, this method will remove opportunities for XSS.
To remove unsafe from a configuration , do this:
- Assert: The built-in safe baseline configuration has removeElements and removeAttributes keys set, but not elements, replaceWithChildrenElements, or attributes.
- Let result be a copy of configuration.
- For each element in built-in safe baseline configuration[removeElements]:
  - Call remove an element with element and result.
- For each attribute in built-in safe baseline configuration[removeAttributes]:
  - Call remove an attribute with attribute and result.
- For each attribute listed in event handler content attributes:
  - Call remove an attribute with attribute and result.
- Return result.
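A compact sketch of these steps follows. It works on plain string names in the HTML namespace, and the two constant lists are abbreviated stand-ins for the built-in safe baseline configuration and the event handler content attributes; all names here are hypothetical.

```typescript
interface UnsafeConfigSketch {
  removeElements: string[];
  replaceWithChildrenElements: string[];
  attributes: string[];
  removeAttributes: string[];
}

// Abbreviated, illustrative stand-ins for the real built-in lists.
const BASELINE_REMOVE_ELEMENTS_SKETCH = ["script", "frame", "iframe", "object", "embed"];
const EVENT_HANDLER_ATTRIBUTES_SKETCH = ["onclick", "onerror", "onload"];

function removeUnsafeSketch(cfg: UnsafeConfigSketch): UnsafeConfigSketch {
  // Work on a copy so the caller's configuration is left untouched.
  const result: UnsafeConfigSketch = {
    removeElements: [...cfg.removeElements],
    replaceWithChildrenElements: [...cfg.replaceWithChildrenElements],
    attributes: [...cfg.attributes],
    removeAttributes: [...cfg.removeAttributes],
  };
  for (const element of BASELINE_REMOVE_ELEMENTS_SKETCH) {
    // "Remove an element": add to the remove-list, drop from the unwrap-list.
    if (!result.removeElements.includes(element)) result.removeElements.push(element);
    result.replaceWithChildrenElements = result.replaceWithChildrenElements.filter((e) => e !== element);
  }
  for (const attribute of EVENT_HANDLER_ATTRIBUTES_SKETCH) {
    // "Remove an attribute": add to the remove-list, drop from the allow-list.
    if (!result.removeAttributes.includes(attribute)) result.removeAttributes.push(attribute);
    result.attributes = result.attributes.filter((a) => a !== attribute);
  }
  return result;
}
```

Returning a copy means the same configuration can be reused by both the "safe" and the "unsafe" methods.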
To set a configuration, given a dictionary configuration and a Sanitizer sanitizer:
- For each element of configuration["elements"] do:
  - Call allow an element with element and sanitizer.
- For each element of configuration["removeElements"] do:
  - Call remove an element with element and sanitizer.
- For each element of configuration["replaceWithChildrenElements"] do:
  - Call replace an element with its children with element and sanitizer.
- For each attribute of configuration["attributes"] do:
  - Call allow an attribute with attribute and sanitizer.
- For each attribute of configuration["removeAttributes"] do:
  - Call remove an attribute with attribute and sanitizer.
- Call set comments with configuration["comments"] and sanitizer.
- Call set data attributes with configuration["dataAttributes"] and sanitizer.
- Return whether all of the following are true:
  - size of configuration["elements"] equals size of this's configuration["elements"].
  - size of configuration["removeElements"] equals size of this's configuration["removeElements"].
  - size of configuration["replaceWithChildrenElements"] equals size of this's configuration["replaceWithChildrenElements"].
  - size of configuration["attributes"] equals size of this's configuration["attributes"].
  - size of configuration["removeAttributes"] equals size of this's configuration["removeAttributes"].
  - Either configuration["elements"] or configuration["removeElements"] exist, or neither, but not both.
  - Either configuration["attributes"] or configuration["removeAttributes"] exist, or neither, but not both.
Note: Previous versions of this spec had elaborate definitions of how to canonicalize a config. This has now effectively been moved into the method definitions.
Note: This operation is defined in terms of the manipulation methods on the Sanitizer. Those methods remove matching entries from other lists. The size equality steps in the last step would then catch this. For example: { elements: ["div", "div"] } would create a Sanitizer with one element in the allow list. The final test would then return false, which would cause the caller to throw an exception.
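The effect of the size comparison can be sketched as follows. This is a simplified model with hypothetical names: it uses plain string names, checks only duplicates within a single list and the two "not both" rules, and does not cover entries that appear in two different lists.

```typescript
interface ValidatedConfigSketch {
  elements?: string[];
  removeElements?: string[];
  attributes?: string[];
  removeAttributes?: string[];
}

// A list with duplicates shrinks when turned into a set; that is the size mismatch.
function hasDuplicatesSketch(names: string[]): boolean {
  return new Set(names).size !== names.length;
}

function isValidConfigSketch(cfg: ValidatedConfigSketch): boolean {
  const lists = [cfg.elements, cfg.removeElements, cfg.attributes, cfg.removeAttributes];
  for (const list of lists) {
    if (list !== undefined && hasDuplicatesSketch(list)) return false;
  }
  // An allow-list and its matching remove-list must not both be present.
  if (cfg.elements !== undefined && cfg.removeElements !== undefined) return false;
  if (cfg.attributes !== undefined && cfg.removeAttributes !== undefined) return false;
  return true;
}
```

A caller modelled on the constructor would throw a TypeError when this returns false.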
This is still missing error checks for the per-element attribute lists and syntax errors.
To canonicalize a sanitizer element with attributes, given a SanitizerElementWithAttributes element, do this:
- Let result be the result of canonicalize a sanitizer element with element.
- If element is a dictionary:
  - For each attribute in element["attributes"]:
    - Add the result of canonicalize a sanitizer attribute with attribute to result["attributes"].
  - For each attribute in element["removeAttributes"]:
    - Add the result of canonicalize a sanitizer attribute with attribute to result["removeAttributes"].
- Return result.
To canonicalize a sanitizer element, given a SanitizerElement element, return the result of canonicalize a sanitizer name with element and the HTML namespace as the default namespace.
To canonicalize a sanitizer attribute, given a SanitizerAttribute attribute, return the result of canonicalize a sanitizer name with attribute and null as the default namespace.
To canonicalize a sanitizer name name, with a default namespace defaultNamespace, run the following steps:
- Assert: name is either a DOMString or a dictionary.
- If name is a DOMString, then return «[ "name" → name, "namespace" → defaultNamespace ]».
- Assert: name is a dictionary and name["name"] exists.
- Return «[ "name" → name["name"], "namespace" → ( name["namespace"] if it exists, otherwise defaultNamespace ) ]».
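The name canonicalization can be sketched in a few lines. The type and function names are hypothetical, and "exists" is modelled as "is not undefined", so an explicit null namespace is preserved.

```typescript
interface CanonicalNameSketch { name: string; namespace: string | null }
type NameInputSketch = string | { name: string; namespace?: string | null };

const HTML_NAMESPACE_SKETCH = "http://www.w3.org/1999/xhtml";

function canonicalizeNameSketch(
  input: NameInputSketch,
  defaultNamespace: string | null
): CanonicalNameSketch {
  // A bare string takes the default namespace.
  if (typeof input === "string") return { name: input, namespace: defaultNamespace };
  // A dictionary keeps its own namespace when one is given, including null.
  return {
    name: input.name,
    namespace: input.namespace !== undefined ? input.namespace : defaultNamespace,
  };
}

// Elements default to the HTML namespace, attributes to null.
const canonicalizeElementSketch = (e: NameInputSketch): CanonicalNameSketch =>
  canonicalizeNameSketch(e, HTML_NAMESPACE_SKETCH);
const canonicalizeAttributeSketch = (a: NameInputSketch): CanonicalNameSketch =>
  canonicalizeNameSketch(a, null);
```

After this step every list entry has the same two-key shape, which is what makes comparing entries straightforward.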
3.3. Supporting Algorithms
For the canonicalized element and attribute name lists used in this spec, list membership is based on matching both "name" and "namespace" entries.
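A membership test of this kind might look like the following minimal sketch. The names are hypothetical and the entries are assumed to be already canonicalized.

```typescript
interface ListEntrySketch { name: string; namespace: string | null }

// An item is in the list only if some entry agrees on both name and namespace.
function listContainsSketch(list: ListEntrySketch[], item: ListEntrySketch): boolean {
  return list.some(
    (entry) => entry.name === item.name && entry.namespace === item.namespace
  );
}
```

Under this rule an SVG "a" and an HTML "a" are different entries, even though their names match.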
3.4. Builtins
There are four builtins:
- the built-in safe default configuration,
- the built-in safe baseline configuration,
- the built-in navigating URL attributes list, and
- the built-in animating URL attributes list.
The built-in safe default configuration is as follows:
{ "elements" : [ { "name" : "html" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "head" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "title" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "body" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "article" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "section" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "nav" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "aside" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "h1" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "h2" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "h3" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "h4" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "h5" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "h6" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "hgroup" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "header" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "footer" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "address" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "p" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "hr" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "pre" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "blockquote" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "cite" , "namespace" : null } ] }, 
{ "name" : "ol" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "reversed" , "namespace" : null }, { "name" : "start" , "namespace" : null }, { "name" : "type" , "namespace" : null } ] }, { "name" : "ul" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "menu" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "li" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "value" , "namespace" : null } ] }, { "name" : "dl" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "dt" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "dd" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "figure" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "figcaption" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "main" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "search" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "div" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "a" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "href" , "namespace" : null }, { "name" : "rel" , "namespace" : null }, { "name" : "hreflang" , "namespace" : null }, { "name" : "type" , "namespace" : null } ] }, { "name" : "em" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "strong" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "small" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "s" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "cite" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "q" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : 
[] }, { "name" : "dfn" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "title" , "namespace" : null } ] }, { "name" : "abbr" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "title" , "namespace" : null } ] }, { "name" : "ruby" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "rt" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "rp" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "data" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "value" , "namespace" : null } ] }, { "name" : "time" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "datetime" , "namespace" : null } ] }, { "name" : "code" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "var" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "samp" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "kbd" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "sub" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "sup" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "i" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "b" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "u" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "mark" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "bdi" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "dir" , "namespace" : null } ] }, { "name" : "bdo" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "dir" , "namespace" : null } ] }, { "name" : "span" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] 
}, { "name" : "br" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "wbr" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "ins" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "cite" , "namespace" : null }, { "name" : "datetime" , "namespace" : null } ] }, { "name" : "del" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "cite" , "namespace" : null }, { "name" : "datetime" , "namespace" : null } ] }, { "name" : "table" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "caption" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "colgroup" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "span" , "namespace" : null } ] }, { "name" : "col" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "span" , "namespace" : null } ] }, { "name" : "tbody" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "thead" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "tfoot" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "tr" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "td" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "colspan" , "namespace" : null }, { "name" : "rowspan" , "namespace" : null }, { "name" : "headers" , "namespace" : null } ] }, { "name" : "th" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "colspan" , "namespace" : null }, { "name" : "rowspan" , "namespace" : null }, { "name" : "headers" , "namespace" : null }, { "name" : "scope" , "namespace" : null }, { "name" : "abbr" , "namespace" : null } ] } ], "attributes" : [ { "name" : "dir" , "namespace" : null }, { "name" : "lang" , "namespace" : null }, { "name" : "title" , "namespace" : null } ] }
The built-in safe baseline configuration is meant to block only script-content. It is as follows:
{ "removeElements" : [ { "namespace" : "http://www.w3.org/1999/xhtml" , "name" : "script" }, { "namespace" : "http://www.w3.org/1999/xhtml" , "name" : "frame" }, { "namespace" : "http://www.w3.org/1999/xhtml" , "name" : "iframe" }, { "namespace" : "http://www.w3.org/1999/xhtml" , "name" : "object" }, { "namespace" : "http://www.w3.org/1999/xhtml" , "name" : "embed" }, { "namespace" : "http://www.w3.org/2000/svg" , "name" : "script" }, { "namespace" : "http://www.w3.org/2000/svg" , "name" : "use" } ], "removeAttributes" : [] }
Warning: The remove unsafe algorithm specifies to additionally remove any event handler content attributes , as defined in [HTML] . If a user agent defines extensions to the [HTML] spec with additional event handler content attributes , it is its responsibility to decide how to handle them. Using the current event handler content attributes list, the safe baseline configuration looks effectively like so:
{ "removeElements" : [ { "namespace" : "http://www.w3.org/1999/xhtml" , "name" : "script" }, { "namespace" : "http://www.w3.org/1999/xhtml" , "name" : "frame" }, { "namespace" : "http://www.w3.org/1999/xhtml" , "name" : "iframe" }, { "namespace" : "http://www.w3.org/1999/xhtml" , "name" : "object" }, { "namespace" : "http://www.w3.org/1999/xhtml" , "name" : "embed" }, { "namespace" : "http://www.w3.org/2000/svg" , "name" : "script" }, { "namespace" : "http://www.w3.org/2000/svg" , "name" : "use" } ], "removeAttributes" : [ "onafterprint" , "onauxclick" , "onbeforeinput" , "onbeforematch" , "onbeforeprint" , "onbeforeunload" , "onbeforetoggle" , "onblur" , "oncancel" , "oncanplay" , "oncanplaythrough" , "onchange" , "onclick" , "onclose" , "oncontextlost" , "oncontextmenu" , "oncontextrestored" , "oncopy" , "oncuechange" , "oncut" , "ondblclick" , "ondrag" , "ondragend" , "ondragenter" , "ondragleave" , "ondragover" , "ondragstart" , "ondrop" , "ondurationchange" , "onemptied" , "onended" , "onerror" , "onfocus" , "onformdata" , "onhashchange" , "oninput" , "oninvalid" , "onkeydown" , "onkeypress" , "onkeyup" , "onlanguagechange" , "onload" , "onloadeddata" , "onloadedmetadata" , "onloadstart" , "onmessage" , "onmessageerror" , "onmousedown" , "onmouseenter" , "onmouseleave" , "onmousemove" , "onmouseout" , "onmouseover" , "onmouseup" , "onoffline" , "ononline" , "onpagehide" , "onpagereveal" , "onpageshow" , "onpageswap" , "onpaste" , "onpause" , "onplay" , "onplaying" , "onpopstate" , "onprogress" , "onratechange" , "onreset" , "onresize" , "onrejectionhandled" , "onscroll" , "onscrollend" , "onsecuritypolicyviolation" , "onseeked" , "onseeking" , "onselect" , "onslotchange" , "onstalled" , "onstorage" , "onsubmit" , "onsuspend" , "ontimeupdate" , "ontoggle" , "onunhandledrejection" , "onunload" , "onvolumechange" , "onwaiting" , "onwheel" ] }
The built-in navigating URL attributes list, for which "javascript:" navigations are "unsafe", is as follows:
«[
[ { "name" → "a", "namespace" → HTML namespace }, { "name" → "href", "namespace" → null } ],
[ { "name" → "area", "namespace" → HTML namespace }, { "name" → "href", "namespace" → null } ],
[ { "name" → "button", "namespace" → HTML namespace }, { "name" → "formaction", "namespace" → null } ],
[ { "name" → "form", "namespace" → HTML namespace }, { "name" → "action", "namespace" → null } ],
[ { "name" → "iframe", "namespace" → HTML namespace }, { "name" → "src", "namespace" → null } ],
[ { "name" → "input", "namespace" → HTML namespace }, { "name" → "formaction", "namespace" → null } ],
[ { "name" → "a", "namespace" → SVG namespace }, { "name" → "href", "namespace" → null } ],
[ { "name" → "a", "namespace" → SVG namespace }, { "name" → "href", "namespace" → XLink namespace } ],
[ { "name" → "base", "namespace" → HTML namespace }, { "name" → "href", "namespace" → null } ]
]»
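As a rough illustration of how such a list might be consulted, the sketch below checks whether an element and attribute pair appears in the list and whether the attribute value parses as a "javascript:" URL. It assumes plain objects in place of DOM nodes, and the helper names isJavascriptUrl and isUnsafeNavigation, as well as the placeholder base URL, are hypothetical; the normative check is defined by the sanitization algorithm, not by this example.

```javascript
const HTML_NS = "http://www.w3.org/1999/xhtml";
const SVG_NS = "http://www.w3.org/2000/svg";
const XLINK_NS = "http://www.w3.org/1999/xlink";

// Each entry: [element namespace, element name, attribute namespace, attribute name]
const NAVIGATING_URL_ATTRIBUTES = [
  [HTML_NS, "a", null, "href"],
  [HTML_NS, "area", null, "href"],
  [HTML_NS, "button", null, "formaction"],
  [HTML_NS, "form", null, "action"],
  [HTML_NS, "iframe", null, "src"],
  [HTML_NS, "input", null, "formaction"],
  [SVG_NS, "a", null, "href"],
  [SVG_NS, "a", XLINK_NS, "href"],
  [HTML_NS, "base", null, "href"],
];

// Hypothetical helper: lets the URL parser decide the scheme, so that leading
// whitespace and mixed-case schemes are handled the way a navigation would see them.
function isJavascriptUrl(value) {
  try {
    // The base URL is a placeholder so that relative references still parse.
    return new URL(value, "https://example.invalid/").protocol === "javascript:";
  } catch {
    return false;
  }
}

// Hypothetical helper: true when the pair is listed AND the value is a javascript: URL.
function isUnsafeNavigation(element, attribute) {
  const listed = NAVIGATING_URL_ATTRIBUTES.some(
    ([elNs, elName, attrNs, attrName]) =>
      elNs === element.namespace &&
      elName === element.name &&
      attrNs === attribute.namespace &&
      attrName === attribute.name
  );
  return listed && isJavascriptUrl(attribute.value);
}

const link = { namespace: HTML_NS, name: "a" };
console.log(isUnsafeNavigation(link, { namespace: null, name: "href", value: " JavaScript:alert(1)" })); // true
console.log(isUnsafeNavigation(link, { namespace: null, name: "href", value: "/profile" })); // false
```

Parsing the value instead of matching a string prefix is a deliberate choice in this sketch: a naive prefix test would miss values with leading whitespace or unusual casing.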
The built-in animating URL attributes list, which can be used in [SVG11] to declaratively modify navigation elements to use "javascript:" URLs, is as follows:
«[
[ { "name" → "animate", "namespace" → SVG namespace }, { "name" → "attributeName", "namespace" → null } ],
[ { "name" → "animateMotion", "namespace" → SVG namespace }, { "name" → "attributeName", "namespace" → null } ],
[ { "name" → "animateTransform", "namespace" → SVG namespace }, { "name" → "attributeName", "namespace" → null } ],
[ { "name" → "set", "namespace" → SVG namespace }, { "name" → "attributeName", "namespace" → null } ]
]»
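A minimal sketch of how this list could be used follows. It assumes that the dangerous case is an attributeName whose value targets "href" or "xlink:href" on the animated element; that condition is an assumption of this example rather than something stated in this section, and the helper name animatesNavigationUrl is hypothetical. Elements and attributes are again modeled as plain objects.

```javascript
const SVG_NS = "http://www.w3.org/2000/svg";

// Element names from the built-in animating URL attributes list (all in the SVG namespace).
const ANIMATING_ELEMENTS = ["animate", "animateMotion", "animateTransform", "set"];

// Assumption of this sketch: these are the attribute names whose animation
// could turn a link into a javascript: navigation.
const NAVIGATION_TARGETS = ["href", "xlink:href"];

// Hypothetical helper: true when an SVG animation element declares, via its
// null-namespace attributeName attribute, that it animates a link target.
function animatesNavigationUrl(element, attribute) {
  const listed =
    element.namespace === SVG_NS &&
    ANIMATING_ELEMENTS.includes(element.name) &&
    attribute.namespace === null &&
    attribute.name === "attributeName";
  return listed && NAVIGATION_TARGETS.includes(attribute.value);
}

const setElement = { namespace: SVG_NS, name: "set" };
console.log(animatesNavigationUrl(setElement, { namespace: null, name: "attributeName", value: "href" })); // true
console.log(animatesNavigationUrl(setElement, { namespace: null, name: "attributeName", value: "fill" })); // false
```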
4. Security Considerations
The Sanitizer API is intended to prevent DOM-based cross-site scripting by traversing supplied HTML content and removing elements and attributes according to a configuration. The specified API must not support the construction of a Sanitizer object that leaves script-capable markup in place; any way of doing so would be a bug in the threat model.
That being said, there are security issues that even correct usage of the Sanitizer API cannot protect against. The following sections lay out these scenarios.
4.1. Server-Side Reflected and Stored XSS
This section is not normative.
The Sanitizer API operates solely in the DOM and adds a capability to traverse and filter an existing DocumentFragment. The Sanitizer does not address server-side reflected or stored XSS.
4.2. DOM clobbering
This section is not normative.
DOM clobbering describes an attack in which malicious HTML confuses an application by naming elements through id or name attributes such that properties like children of an HTML element in the DOM are overshadowed by the malicious content.
The Sanitizer API does not protect against DOM clobbering attacks in its default state, but can be configured to remove id and name attributes.
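One way such a configuration might look is sketched below. It reuses the removeAttributes key shown in the baseline configuration and the setHTML() method named in this document; passing the configuration as a sanitizer member of an options argument is an assumption of this example, and the helper name setHTMLWithoutNaming is hypothetical.

```javascript
// Configuration that strips the attributes DOM clobbering relies on.
const noNamingConfig = { removeAttributes: ["id", "name"] };

// Hypothetical helper: inserts untrusted markup into `target` with id and name
// attributes removed. `target` is expected to be an Element exposing setHTML();
// the `{ sanitizer: ... }` options shape is an assumption of this sketch.
function setHTMLWithoutNaming(target, html) {
  target.setHTML(html, { sanitizer: noNamingConfig });
}

// In a browser that supports the API, usage might look like:
//   setHTMLWithoutNaming(document.querySelector("#comments"), untrustedMarkup);
```

Removing id and name wholesale can break fragment links and form submission for the inserted content, so whether this trade-off is acceptable depends on the application.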
4.3. XSS with Script gadgets
This section is not normative.
Script gadgets are a technique in which an attacker uses existing application code from popular JavaScript libraries to cause their own code to execute. This is often done by injecting innocent-looking code or seemingly inert DOM nodes that are parsed and interpreted only by a framework, which then executes JavaScript based on that input.
The Sanitizer API cannot prevent these attacks, but it requires page authors to explicitly allow unknown elements in general. Authors must additionally explicitly configure unknown attributes and elements, as well as markup that is known to be widely used for templating and framework-specific code, such as data- and slot attributes and elements like <slot> and <template>. We believe that these restrictions are not exhaustive and encourage page authors to examine their third-party libraries for this behavior.
4.4. Mutated XSS
This section is not normative.
Mutated XSS or mXSS describes an attack based on parser context mismatches when parsing an HTML snippet without the correct context. In particular, when a parsed HTML fragment has been serialized to a string, the string is not guaranteed to be parsed and interpreted exactly the same way when inserted into a different parent element. One way to carry out such an attack is to rely on the change in parsing behavior for foreign content or mis-nested tags.
The Sanitizer API offers only functions that turn a string into a node tree. The context is supplied implicitly by all sanitizer functions: Element.setHTML() uses the current element; Document.parseHTML() creates a new document. Therefore the Sanitizer API is not directly affected by mutated XSS.
If a developer were to retrieve a sanitized node tree as a string, e.g. via .innerHTML, and then parse it again, mutated XSS may occur. We discourage this practice. If processing or passing HTML as a string is necessary after all, then any string should be considered untrusted and should be sanitized (again) when inserting it into the DOM. In other words, a sanitized and then serialized HTML tree can no longer be considered sanitized.
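The recommendation above can be sketched as follows. The example assumes both arguments are Elements exposing innerHTML and setHTML(); the helper name reinsertSanitized is hypothetical.

```javascript
// Discouraged: target.innerHTML = source.innerHTML;
// The serialized string is re-parsed in a new context without any sanitization.

// Hypothetical helper: treats the serialized markup as untrusted and lets
// setHTML() parse it in the target's context and sanitize it again.
function reinsertSanitized(source, target) {
  const serialized = source.innerHTML; // a string, no longer considered sanitized
  target.setHTML(serialized);
}

// In a browser that supports the API, usage might look like:
//   reinsertSanitized(previewPane, publishedPost);
```

Where possible, moving the already sanitized nodes themselves, rather than a string serialization of them, avoids the second parse altogether.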
A more complete treatment of mXSS can be found in [MXSS] .
5. Acknowledgements
Cure53’s [DOMPURIFY] is a clear inspiration for the API this document describes, as is Internet Explorer’s window.toStaticHTML().