1. Introduction
This section is not normative.
Web applications often need to work with strings of HTML on the client side,
perhaps as part of a client-side templating solution, perhaps as part of
rendering user generated content, etc. It is difficult to do so in a safe way.
The naive approach of joining strings together and stuffing them into an Element’s innerHTML is fraught with risk, as it can cause JavaScript execution in a number of unexpected ways.
Libraries like [DOMPURIFY] attempt to manage this problem by carefully parsing and sanitizing strings before insertion, by constructing a DOM and filtering its members through an allow-list. This has proven to be a fragile approach, as the parsing APIs exposed to the web don’t always map in reasonable ways to the browser’s behavior when actually rendering a string as HTML in the "real" DOM. Moreover, the libraries need to keep on top of browsers' changing behavior over time; things that once were safe may turn into time-bombs based on new platform-level features.
The browser has a fairly good idea of when it is going to execute code. We can improve upon the user-space libraries by teaching the browser how to render HTML from an arbitrary string in a safe manner, and do so in a way that is much more likely to be maintained and updated along with the browser’s own changing parser implementation. This document outlines an API which aims to do just that.
1.1. Goals
- Mitigate the risk of DOM-based cross-site scripting attacks by providing developers with mechanisms for handling user-controlled HTML which prevent direct script execution upon injection.
- Make HTML output safe for use within the current user agent, taking into account its current understanding of HTML.
- Allow developers to override the default set of elements and attributes. Adding certain elements and attributes can prevent script gadget attacks.
1.2. API Summary
The Sanitizer API offers functionality to parse a string containing HTML into a DOM tree, and to filter the resulting tree according to a user-supplied configuration. The methods come in two by two flavours:
- Safe and unsafe: The "safe" methods will not generate any markup that executes script. That is, they should be safe from XSS. The "unsafe" methods will parse and filter whatever they’re supposed to. See also: § 4 Security Considerations.
- Context: Methods are defined on Element and ShadowRoot and will replace these Node’s children, and are largely analogous to innerHTML. There are also static methods on the Document, which parse an entire document and are largely analogous to DOMParser.parseFromString().
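As a rough illustration of how a page might call the safe method, the sketch below prefers setHTML() and falls back to inert text when the method is not available. The helper name `renderUntrusted` and the textContent fallback are assumptions made for this example and are not part of the API.

```javascript
// Hypothetical helper: insert untrusted markup into `target`.
// `target` is expected to be an Element or ShadowRoot.
function renderUntrusted(target, html) {
  if (typeof target.setHTML === "function") {
    // Safe method: parses `html` and filters out script-executing markup.
    target.setHTML(html);
    return "sanitized";
  }
  // Fallback (an assumption of this sketch): show the markup as inert text.
  target.textContent = html;
  return "text";
}
```

Feature detection keeps the call site identical on user agents with and without the API. Whether a plain-text fallback is acceptable depends on the application.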
2. Framework
2.1. Sanitizer API
The Element interface defines two methods, setHTML() and setHTMLUnsafe(). Both of these take a DOMString with HTML markup, and an optional configuration.
partial interface Element {
  [CEReactions] undefined setHTMLUnsafe((TrustedHTML or DOMString) html, optional SetHTMLUnsafeOptions options = {});
  [CEReactions] undefined setHTML(DOMString html, optional SetHTMLOptions options = {});
};
Element’s setHTMLUnsafe(html, options) method steps are:
- Let compliantHTML be the result of invoking the Get Trusted Type compliant string algorithm with TrustedHTML, this’s relevant global object, html, "Element setHTMLUnsafe", and "script".
- Let target be this’s template contents if this is a template element; otherwise this.
- Set and filter HTML given target, this, compliantHTML, options, and false.
Element’s setHTML(html, options) method steps are:
- Let target be this’s template contents if this is a template element; otherwise this.
- Set and filter HTML given target, this, html, options, and true.
These methods are mirrored on the ShadowRoot:
partial interface ShadowRoot {
  [CEReactions] undefined setHTMLUnsafe((TrustedHTML or DOMString) html, optional SetHTMLUnsafeOptions options = {});
  [CEReactions] undefined setHTML(DOMString html, optional SetHTMLOptions options = {});
};
ShadowRoot’s setHTMLUnsafe(html, options) method steps are:
- Let compliantHTML be the result of invoking the Get Trusted Type compliant string algorithm with TrustedHTML, this’s relevant global object, html, "ShadowRoot setHTMLUnsafe", and "script".
- Set and filter HTML using this, this’s shadow host (as context element), compliantHTML, options, and false.
ShadowRoot’s setHTML(html, options) method steps are:
- Set and filter HTML using this (as target), this’s shadow host (as context element), html, options, and true.
The Document interface gains two new methods which parse an entire Document:
partial interface Document {
  static Document parseHTMLUnsafe((TrustedHTML or DOMString) html, optional SetHTMLUnsafeOptions options = {});
  static Document parseHTML(DOMString html, optional SetHTMLOptions options = {});
};
The parseHTMLUnsafe(html, options) method steps are:
- Let compliantHTML be the result of invoking the Get Trusted Type compliant string algorithm with TrustedHTML, this’s relevant global object, html, "Document parseHTMLUnsafe", and "script".
- Let document be a new Document, whose content type is "text/html".
  Note: Since document does not have a browsing context, scripting is disabled.
- Set document’s allow declarative shadow roots to true.
- Parse HTML from a string given document and compliantHTML.
- Let sanitizer be the result of calling get a sanitizer instance from options with options and false.
- Call sanitize on document with sanitizer and false.
- Return document.
The parseHTML(html, options) method steps are:
- Let document be a new Document, whose content type is "text/html".
  Note: Since document does not have a browsing context, scripting is disabled.
- Set document’s allow declarative shadow roots to true.
- Parse HTML from a string given document and html.
- Let sanitizer be the result of calling get a sanitizer instance from options with options and true.
- Call sanitize on document with sanitizer and true.
- Return document.
2.2. SetHTML options and the configuration object.
The family of setHTML()-like methods all accept an options dictionary. Right now, only one member of this dictionary is defined:
enum SanitizerPresets { "default" };
dictionary SetHTMLOptions {
  (Sanitizer or SanitizerConfig or SanitizerPresets) sanitizer = "default";
};
dictionary SetHTMLUnsafeOptions {
  (Sanitizer or SanitizerConfig or SanitizerPresets) sanitizer = {};
};
The Sanitizer configuration object encapsulates a filter configuration. The same configuration can be used with both "safe" and "unsafe" methods, where the "safe" methods perform an implicit removeUnsafe operation on the passed-in configuration and have a default configuration when none is passed. The default differs between "safe" and "unsafe" methods: the "safe" methods aim to be safe by default and have a restrictive default, while the "unsafe" methods are unrestricted by default. The intent for configuration use is that one (or a few) configurations will be built up early on in a page’s lifetime, and can then be used whenever needed. This allows implementations to pre-process configurations.
The configuration object can be queried to return a configuration dictionary. It can also be modified directly.
[Exposed=Window]
interface Sanitizer {
  constructor(optional (SanitizerConfig or SanitizerPresets) configuration = "default");

  // Query configuration:
  SanitizerConfig get();

  // Modify a Sanitizer’s lists and fields:
  boolean allowElement(SanitizerElementWithAttributes element);
  boolean removeElement(SanitizerElement element);
  boolean replaceElementWithChildren(SanitizerElement element);
  boolean allowAttribute(SanitizerAttribute attribute);
  boolean removeAttribute(SanitizerAttribute attribute);
  boolean setComments(boolean allow);
  boolean setDataAttributes(boolean allow);

  // Remove markup that executes script.
  boolean removeUnsafe();
};
A Sanitizer has an associated SanitizerConfig configuration.
The constructor(configuration) method steps are:
- If configuration is a SanitizerPresets string, then:
  - Set configuration to the built-in safe default configuration.
- Let valid be the return value of set a configuration with configuration and true on this.
- If valid is false, then throw a TypeError.
2.3. The Configuration Dictionary
dictionary SanitizerElementNamespace {
  required DOMString name;
  DOMString? _namespace = "http://www.w3.org/1999/xhtml";
};

// Used by "elements"
dictionary SanitizerElementNamespaceWithAttributes : SanitizerElementNamespace {
  sequence<SanitizerAttribute> attributes;
  sequence<SanitizerAttribute> removeAttributes;
};

typedef (DOMString or SanitizerElementNamespace) SanitizerElement;
typedef (DOMString or SanitizerElementNamespaceWithAttributes) SanitizerElementWithAttributes;

dictionary SanitizerAttributeNamespace {
  required DOMString name;
  DOMString? _namespace = null;
};
typedef (DOMString or SanitizerAttributeNamespace) SanitizerAttribute;

dictionary SanitizerConfig {
  sequence<SanitizerElementWithAttributes> elements;
  sequence<SanitizerElement> removeElements;
  sequence<SanitizerElement> replaceWithChildrenElements;

  sequence<SanitizerAttribute> attributes;
  sequence<SanitizerAttribute> removeAttributes;

  boolean comments;
  boolean dataAttributes;
};
2.4. Configuration Invariants
Configurations can and ought to be modified by developers to suit their purposes. Options are to write a new configuration dictionary from scratch, to modify an existing Sanitizer’s configuration by using the modifier methods, or to get() an existing Sanitizer’s configuration as a dictionary, modify the dictionary, and then create a new Sanitizer with it.
An empty configuration allows everything (when called with the "unsafe" methods like setHTMLUnsafe).
A configuration "default" contains a built-in safe default configuration. Note that "safe" and "unsafe" sanitizer methods have different defaults.
Not all configuration dictionaries are valid. A valid configuration avoids redundancy (like specifying the same element to be allowed twice) and contradictions (like specifying an element to be both removed and allowed.)
Several conditions need to hold for a configuration to be valid:
- Mixing global allow- and remove-lists:
  - elements or removeElements can exist, but not both. If both are missing, this is equivalent to removeElements set to « [] ».
  - attributes or removeAttributes can exist, but not both. If both are missing, this is equivalent to removeAttributes set to « [] ».
  - dataAttributes is conceptually an extension of the attributes allow-list. The dataAttributes attribute is only allowed when an attributes list is used.
- Duplicate entries between different global lists:
  - There are no duplicate entries (i.e., no same elements) between elements, removeElements, or replaceWithChildrenElements.
  - There are no duplicate entries (i.e., no same attributes) between attributes or removeAttributes.
- Duplicate entries on the same element:
  - There are no duplicate entries between attributes and removeAttributes on the same element.
The elements element allow-list can also specify allowing or removing attributes for a given element. This is meant to mirror [HTML]’s structure, which knows both global attributes as well as local attributes that apply to a specific element. Global and local attributes can be mixed, but note that ambiguous configurations, where a particular attribute would be allowed by one list and forbidden by another, are generally invalid.
| | global attributes | global removeAttributes |
|---|---|---|
| local attributes | An attribute is allowed if it matches either list. No duplicates are allowed. | An attribute is only allowed if it’s in the local allow list. No duplicate entries between global remove and local allow lists are allowed. Note that the global remove list has no function for this particular element, but may well apply to other elements that do not have a local allow list. |
| local removeAttributes | An attribute is allowed if it’s in the global allow-list, but not in the local remove-list. Local remove must be a subset of the global allow lists. | An attribute is allowed if it is in neither list. No duplicate entries between global remove and local remove lists are allowed. |
Please note the asymmetry where mostly no duplicates between global and per-element lists are permitted, but in the case of a global allow-list and a per-element remove-list the latter must be a subset of the former. An excerpt of the table above, only focusing on duplicates, is as follows:
| | global attributes | global removeAttributes |
|---|---|---|
| local attributes | No duplicates are allowed. | No duplicates are allowed. |
| local removeAttributes | Local remove must be a subset of the global allow lists. | No duplicates are allowed. |
The dataAttributes setting allows custom data attributes. The rules above easily extend to custom data attributes if one considers dataAttributes to be an allow-list:
| | global attributes and dataAttributes set |
|---|---|
| local attributes | All custom data attributes are allowed. No custom data attributes may be listed in any allow-list, as that would mean a duplicate entry. |
| local removeAttributes | A custom data attribute is allowed, unless it’s listed in the local remove-list. No custom data attribute may be listed in the global allow-list, as that would mean a duplicate entry. |
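The tables above can be read as a single decision function. The following is a minimal sketch, assuming attribute names are plain strings (namespaces are ignored) and that `local` holds the per-element lists of the element being inspected. The function name `isAttributeAllowed` is hypothetical.

```javascript
// Decide whether attribute `name` survives for one element.
// `config` carries the global lists, `local` the per-element lists.
function isAttributeAllowed(config, local, name) {
  // A local remove-list entry always removes the attribute.
  if ((local.removeAttributes ?? []).includes(name)) return false;
  if (config.attributes) {
    // Global allow-list: allowed if either allow-list matches, or if it is
    // a data-* attribute and dataAttributes is set.
    const isData = name.startsWith("data-") && config.dataAttributes === true;
    return config.attributes.includes(name) ||
           (local.attributes ?? []).includes(name) || isData;
  }
  // Global remove-list: a local allow-list, if present, takes over.
  if (local.attributes && !local.attributes.includes(name)) return false;
  return !(config.removeAttributes ?? []).includes(name);
}
```

For example, with a global allow-list of `["class", "id"]` and a local remove-list of `["id"]`, `class` is kept and `id` is dropped for that element.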
Putting these rules in words:
- Duplicates and interactions between global and local lists:
  - If a global attributes allow-list exists, then for each element’s local lists:
    - If a local attributes allow-list exists, there may be no duplicate entries between these lists.
    - If a local removeAttributes remove-list exists, then all its entries must also be listed in the global attributes allow-list.
    - If dataAttributes is true, then no custom data attributes may be listed in any of the allow-lists.
  - If a global removeAttributes remove-list exists, then:
    - If a local attributes allow-list exists, there may be no duplicate entries between these lists.
    - If a local removeAttributes remove-list exists, there may be no duplicate entries between these lists.
    - dataAttributes must be absent.
A SanitizerConfig config is valid if all of the following conditions hold:
- The config has either an elements or a removeElements key, but not both.
- The config has either an attributes or a removeAttributes key, but not both.
- Assert: All SanitizerElementNamespaceWithAttributes, SanitizerElementNamespace, and SanitizerAttributeNamespace items in config are canonical, meaning they have been run through canonicalize a sanitizer element or canonicalize a sanitizer attribute, as appropriate.
- None of config[elements], config[removeElements], config[replaceWithChildrenElements], config[attributes], or config[removeAttributes], if they exist, has duplicates.
- If both config[elements] and config[replaceWithChildrenElements] exist, then the intersection of config[elements] and config[replaceWithChildrenElements] is empty.
- If both config[removeElements] and config[replaceWithChildrenElements] exist, then the intersection of config[removeElements] and config[replaceWithChildrenElements] is empty.
- If config[attributes] exists:
  - If config[elements] exists, then for any element in config[elements]:
    - The intersection of config[attributes] and element[attributes] with default « [] » is empty.
    - element[removeAttributes] with default « [] » is a subset of config[attributes].
    - If dataAttributes exists and dataAttributes is true:
      - element[attributes] does not contain a custom data attribute.
  - If dataAttributes is true:
    - config[attributes] does not contain a custom data attribute.
- If config[removeAttributes] exists:
  - If config[elements] exists, then for any element in config[elements]:
    - The intersection of config[removeAttributes] and element[attributes] with default « [] » is empty.
    - The intersection of config[removeAttributes] and element[removeAttributes] with default « [] » is empty.
  - config[dataAttributes] does not exist.
Note: Setting a configuration from a dictionary will do a bit of normalization. In particular, if both allow- and remove-lists are missing, it will interpret this as an empty remove-list. So {} itself is not a valid configuration, but it will be normalized to {removeElements:[],removeAttributes:[]}, which is. This normalization step was chosen in order to have a missing dictionary be consistent with an empty one, i.e., to have setHTMLUnsafe(txt) be consistent with setHTMLUnsafe(txt, {sanitizer: {}}).
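To make the normalization and the global validity rules concrete, here is a simplified sketch. It assumes every list entry is a plain string and deliberately ignores per-element attribute lists. The names `normalize` and `isValidGlobal` are hypothetical.

```javascript
// Fill in the empty remove-lists described in the note above.
function normalize(config) {
  const out = { ...config };
  if (!("elements" in out) && !("removeElements" in out)) out.removeElements = [];
  if (!("attributes" in out) && !("removeAttributes" in out)) out.removeAttributes = [];
  return out;
}

const hasDuplicates = (list) => new Set(list).size !== list.length;
const overlaps = (a, b) => a.some((x) => b.includes(x));

// Checks only the global-list rules; per-element lists are out of scope here.
function isValidGlobal(config) {
  // Exactly one of each allow/remove pair must be present.
  if (("elements" in config) === ("removeElements" in config)) return false;
  if (("attributes" in config) === ("removeAttributes" in config)) return false;
  const keys = ["elements", "removeElements", "replaceWithChildrenElements",
                "attributes", "removeAttributes"];
  for (const key of keys) {
    if (key in config && hasDuplicates(config[key])) return false;
  }
  const replace = config.replaceWithChildrenElements ?? [];
  if (overlaps(config.elements ?? [], replace)) return false;
  if (overlaps(config.removeElements ?? [], replace)) return false;
  // dataAttributes only makes sense next to a global attributes allow-list.
  if ("removeAttributes" in config && "dataAttributes" in config) return false;
  if (config.attributes && config.dataAttributes === true &&
      config.attributes.some((a) => a.startsWith("data-"))) return false;
  return true;
}
```

In this model `isValidGlobal({})` is false, while `isValidGlobal(normalize({}))` is true, which mirrors the note above.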
3. Algorithms
To set and filter HTML, given an Element or DocumentFragment target, an Element contextElement, a string html, a dictionary options, and a boolean safe:
- If safe and contextElement’s local name is "script" and contextElement’s namespace is the HTML namespace or the SVG namespace, then return.
- Let sanitizer be the result of calling get a sanitizer instance from options with options and safe.
- Let newChildren be the result of the HTML fragment parsing algorithm given contextElement, html, and true.
- Let fragment be a new DocumentFragment whose node document is contextElement’s node document.
- For each node in newChildren, append node to fragment.
- Run sanitize on fragment using sanitizer and safe.
- Replace all with fragment within target.
To get a sanitizer instance from options, given a dictionary options and a boolean safe:
Note: This algorithm works for both SetHTMLOptions and SetHTMLUnsafeOptions. They only differ in the defaults.
- Let sanitizerSpec be "default".
- If options["sanitizer"] exists, then:
  - Set sanitizerSpec to options["sanitizer"].
- Assert: sanitizerSpec is either a Sanitizer instance, a string which is a SanitizerPresets member, or a dictionary.
- If sanitizerSpec is a string:
  - Set sanitizerSpec to the built-in safe default configuration.
- Assert: sanitizerSpec is either a Sanitizer instance, or a dictionary.
- If sanitizerSpec is a dictionary:
  - Let sanitizer be a new Sanitizer instance.
  - Let setConfigurationResult be the result of set a configuration with sanitizerSpec and not safe on sanitizer.
  - Set sanitizerSpec to sanitizer.
- Return sanitizerSpec.
3.1. Sanitize
To sanitize, given a ParentNode node, a Sanitizer sanitizer, and a boolean safe, run these steps:
-
Let configuration be the value of sanitizer’s configuration.
-
If safe is true, then set configuration to the result of calling remove unsafe on configuration.
-
Call sanitize core on node, configuration, and with handleJavascriptNavigationUrls set to safe.
The sanitize core operation, given a ParentNode node, a SanitizerConfig configuration, and a boolean handleJavascriptNavigationUrls, recurses over the DOM tree beginning with node. It consists of these steps:
-
For each child of node’s children:
-
Assert: child implements
Text
,Comment
,Element
, orDocumentType
.Note: Currently, this algorithm is only called on output of the HTML parser for which this assertion should hold.
DocumentType
should only occur forparseHTML
andparseHTMLUnsafe
. If in the future this algorithm will be used in different contexts, this assumption needs to be re-examined. -
If child implements
DocumentType
, then continue. -
If child implements
Text
, then continue. -
If child implements
Comment
: -
Otherwise:
-
Let elementName be a
SanitizerElementNamespace
with child’s local name and namespace. -
If configuration["
replaceWithChildrenElements
"] exists and if configuration["replaceWithChildrenElements
"] contains elementName:-
Call sanitize core on child with configuration and handleJavascriptNavigationUrls.
-
Call replace all with child’s children within child.
-
-
If configuration["
removeElements
"] exists and configuration["removeElements
"] contains elementName: -
If configuration["
elements
"] exists and configuration["elements
"] does not contain elementName: -
If elementName equals «[ "
name
" → "template
", "namespace
" → HTML namespace ]», then call sanitize core on child’s template contents with configuration and handleJavascriptNavigationUrls. -
If child is a shadow host, then call sanitize core on child’s shadow root with configuration and handleJavascriptNavigationUrls.
-
Let elementWithLocalAttributes be « [] ».
-
If configuration["
elements
"] exists and configuration["elements
"] contains elementName:-
Set elementWithLocalAttributes to configuration["
elements
"][elementName].
-
-
For each attribute in child’s attribute list:
-
Let attrName be a
SanitizerAttributeNamespace
with attribute’s local name and namespace. -
If elementWithLocalAttributes["
removeAttributes
"] with default « [] » contains attrName:-
Remove attribute.
-
-
Otherwise, if configuration["
attributes
"] exists:-
If configuration["attributes"] does not contain attrName and elementWithLocalAttributes["attributes"] with default « [] » does not contain attrName, and if "data-" is not a code unit prefix of attribute’s local name and namespace is not null or configuration["dataAttributes"] is not true:
-
Remove attribute.
-
-
-
Otherwise:
-
If elementWithLocalAttributes["
attributes
"] exists and elementWithLocalAttributes["attributes
"] does not contain attrName:-
Remove attribute.
-
-
Otherwise, if configuration["
removeAttributes
"] contains attrName:-
Remove attribute.
-
-
-
If handleJavascriptNavigationUrls:
-
If «[elementName, attrName]» matches an entry in the built-in navigating URL attributes list, and if attribute contains a javascript: URL, then remove attribute.
-
If child’s namespace is the MathML Namespace and attribute’s local name is "href" and attribute’s namespace is null or the XLink namespace and attribute contains a javascript: URL, then remove attribute.
-
If the built-in animating URL attributes list contains «[elementName, attrName]» and attribute’s value is "href" or "xlink:href", then remove attribute.
-
-
-
Call sanitize core on child with configuration and handleJavascriptNavigationUrls.
-
-
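The recursion above can be illustrated on a toy tree of plain objects. This is a simplified sketch under several stated assumptions: names are plain strings, only the global lists are consulted, template contents, shadow roots and javascript: URL handling are left out, comments are kept only when `comments` is true, and a replaced element is modelled by splicing its sanitized children into its parent. The node shape and the function name `sanitizeCore` are hypothetical.

```javascript
// Toy node shape (an assumption of this sketch):
//   { type: "element", name, attrs: {...}, children: [...] }
//   { type: "text" | "comment", data }
function sanitizeCore(node, config) {
  const kept = [];
  for (const child of node.children) {
    if (child.type === "text") { kept.push(child); continue; }
    if (child.type === "comment") {
      // Assumed behaviour: comments survive only when `comments` is true.
      if (config.comments === true) kept.push(child);
      continue;
    }
    if ((config.replaceWithChildrenElements ?? []).includes(child.name)) {
      // Sanitize the subtree, then put its children in place of `child`.
      sanitizeCore(child, config);
      kept.push(...child.children);
      continue;
    }
    // Drop elements that are removed or not on the allow-list.
    if ((config.removeElements ?? []).includes(child.name)) continue;
    if (config.elements && !config.elements.includes(child.name)) continue;
    // Filter attributes against the global allow- or remove-list.
    for (const attr of Object.keys(child.attrs)) {
      const dropped = config.attributes
        ? !config.attributes.includes(attr)
        : (config.removeAttributes ?? []).includes(attr);
      if (dropped) delete child.attrs[attr];
    }
    sanitizeCore(child, config);
    kept.push(child);
  }
  node.children = kept;
}
```

Building a new child list rather than removing nodes during iteration keeps the sketch easy to follow. A real implementation operates on live DOM nodes instead.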
Note: Current browsers support javascript: URLs only when navigating. Since navigation itself is not an XSS threat, we handle navigation to javascript: URLs, but not navigations in general.
Declarative navigation falls into a handful of categories:
- Anchor elements (<a> in HTML and SVG namespaces).
- Form elements that trigger navigation as part of the form action.
- [SVG11] animation.
The first two are covered by the built-in navigating URL attributes list.
The MathML case is covered by a separate rule, because there is no formalism in this spec to cover a "per-namespace global" rule.
The SVG animation case is covered by the built-in animating URL attributes list. But since the interpretation of SVG animation elements depends on the animation target, and since during sanitization we cannot know what the final target will be, the sanitize algorithm blocks any animation of href attributes.
-
Let url be the result of running the basic URL parser on attribute’s value.
-
If url is
failure
, then return false.
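The check above can be approximated with the URL class found in browsers and Node.js, which applies URL Standard parsing. This is a sketch only: the final scheme comparison is an assumption, since that step is not shown above, and the function name `containsJavascriptUrl` is hypothetical.

```javascript
// Returns true if `value` parses as an absolute URL with the javascript: scheme.
function containsJavascriptUrl(value) {
  let url;
  try {
    url = new URL(value); // parse without a base URL
  } catch {
    return false; // parse failure
  }
  // Assumed final step: compare the parsed scheme.
  return url.protocol === "javascript:";
}
```

Going through the parser rather than a string prefix test matters because the parser lowercases the scheme and strips leading whitespace as well as embedded tabs and newlines, so values such as `" JavaScript:alert(1)"` are still recognized.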
3.2. Modify the Configuration
The configuration modifier methods are methods on Sanitizer that modify its configuration.
They will maintain the validity criteria.
They return a boolean which informs the caller whether the configuration was modified or not.
let s = new Sanitizer({ elements: ["div"] });
s.allowElement("p"); // Returns true.
div.setHTML("<div><p>", { sanitizer: s }); // Allows `<div>` and `<p>`.

let s = new Sanitizer({ elements: ["div"] });
s.removeElement("p"); // Returns false, as <p> was not previously allowed.
div.setHTML("<div><p>", { sanitizer: s }); // Allows `<div>`. `<p>` is removed.
SanitizerElementWithAttributes
element with a SanitizerConfig
configuration:
-
Whether we have a global allow- or remove-list, and
-
whether these lists already contain element or not.
-
Set element to the result of canonicalize a sanitizer element with attributes with element.
-
Set modified to the result of remove element from configuration["
replaceWithChildrenElements
"]. -
If configuration["
elements
"] exists:-
Comment: We need to make sure the per-element attributes do not overlap with global attributes.
-
If element["
attributes
"] exists:-
If configuration["
attributes
"] exists:-
Set element["
attributes
"] to the difference of element["attributes
"] and configuration["attributes
"]. -
If configuration["
dataAttributes
"] exists and configuration["dataAttributes
"] is true:-
Remove all items item from element["
attributes
"] where item is a custom data attribute.
-
-
-
If configuration["
removeAttributes
"] exists:-
Set element["
attributes
"] to the difference of element["attributes
"] and configuration["removeAttributes
"].
-
-
-
Otherwise if element["
removeAttributes
"] exists:-
If configuration["
attributes
"] exists:-
Set element["
removeAttributes
"] to the intersection of element["removeAttributes
"] and configuration["attributes
"].
-
-
If configuration["
removeAttributes
"] exists:-
Set element["
removeAttributes
"] to the difference of element["removeAttributes
"] and configuration["removeAttributes
"].
-
-
-
Comment: This is the case with a global allow-list that already contains element.
-
Let current element be the item in configuration["
elements
"] where item[name
] equals element[name
] and item[namespace
] equals element[namespace
]. -
If element equals current element then return modified.
-
Return true.
-
-
Otherwise:
-
Comment: If we have a global remove-list, the per-element attributes of element get ignored.
-
If configuration["
removeElements
"] does not contain element:-
Comment: This is the case with a global remove-list that does not contain element.
-
Return modified.
-
-
Comment: This is the case with a global remove-list that contains element.
-
Remove element from configuration["
removeElements
"]. -
Return true.
-
To remove an element SanitizerElement element from a SanitizerConfig configuration:
Note: The behavior depends on two things:
- whether we have a global allow- or remove-list, and
- whether they already contain element or not.
-
Set element to the result of canonicalize a sanitizer element with element.
-
Set modified to the result of remove element from configuration["
replaceWithChildrenElements
"]. -
Otherwise:
-
If configuration["
removeElements
"] contains element:-
Comment: We have a global remove list and it already contains element.
-
Return modified.
-
-
Comment: We have a global remove list and it does not contain element.
-
Add element to configuration["
removeElements
"]. -
Return true.
-
SanitizerElement
element from a SanitizerConfig
configuration:
-
Set element to the result of canonicalize a sanitizer element with element.
-
If configuration["
replaceWithChildrenElements
"] contains element:-
Return false.
-
-
Add element to configuration["
replaceWithChildrenElements
"]. -
Remove element from configuration["
removeElements
"]. -
Return true.
SanitizerAttribute
attribute on a SanitizerConfig
configuration:
Note: This method distinguishes two cases, namely whether we have a global allow- or a global remove-list. If we add attribute to a global allow-list, we may need to do additional work to fix up per-element allow- or remove-lists to maintain our validity criteria.
-
Set attribute to the result of canonicalize a sanitizer attribute with attribute.
-
If configuration["
attributes
"] exists:-
Comment: If we have a global allow-list, we need to add attribute.
-
If configuration["
dataAttributes
"] exists and configuration["dataAttributes
"] is true and attribute is a custom data attribute, then return false. -
If configuration["
attributes
"] contains attribute return false. -
Append attribute to configuration["
attributes
"] -
Comment: Fix-up per-element allow and remove lists.
-
If configuration["
elements
"] exists:-
For each element in configuration["
elements
"]:-
If element["
attributes
"] with default « [] » contains attribute:-
Remove attribute from element["
attributes
"].
-
-
Assert: element["
removeAttributes
"] with default « [] » does not contain attribute.
-
-
-
Return true.
-
-
Otherwise:
-
Comment: If we have a global remove-list, we need to remove attribute.
-
If configuration["
removeAttributes
"] does not contain attribute:-
Return false.
-
-
Remove attribute from configuration["
removeAttributes
"]. -
Return true.
-
To remove an attribute SanitizerAttribute attribute from a SanitizerConfig configuration:
Note: This method distinguishes two cases, namely whether we have a global allow- or a global remove-list. If we add attribute to the global remove-list, we may need to do additional work to fix up per-element allow- or remove-lists to maintain our validity criteria. If we remove attribute from a global allow-list, we may also have to remove it from local remove-lists.
-
Set attribute to the result of canonicalize a sanitizer attribute with attribute.
-
If configuration["
attributes
"] exists:-
Comment: If we have a global allow-list, we need to remove attribute.
-
If configuration["
attributes
"] does not contain attribute:-
Return false.
-
-
Remove attribute from configuration["
attributes
"]. -
Comment: Fix-up per-element allow and remove lists.
-
If configuration["
elements
"] exists:-
For each element in configuration["
elements
"]:-
If element["
removeAttributes
"] with default « [] » contains attribute:-
Remove attribute from element["
removeAttributes
"].
-
-
-
-
Return true.
-
-
Otherwise:
-
Comment: If we have a global remove-list, we need to add attribute.
-
If configuration["
removeAttributes
"] contains attribute return false. -
Append attribute to configuration["
removeAttributes
"] -
Comment: Fix-up per-element allow and remove lists.
-
If configuration["
elements
"] exists:-
For each element in configuration["
elements
"]:-
If element["
attributes
"] with default « [] » contains attribute:-
Remove attribute from element["
attributes
"].
-
-
If element["
removeAttributes
"] with default « [] » contains attribute:-
Remove attribute from element["
removeAttributes
"].
-
-
-
-
Return true.
-
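The two branches above, together with their fix-up loops, can be modelled on plain objects. This is a minimal sketch, assuming attribute names are plain strings and that `config.elements` entries are already objects of the form `{ name, attributes?, removeAttributes? }`. The function name mirrors the method but is otherwise hypothetical.

```javascript
// Mutates `config`; returns whether anything changed.
function removeAttribute(config, attribute) {
  // Remove `attribute` from `list` if the list exists and contains it.
  const drop = (list) => {
    const i = list ? list.indexOf(attribute) : -1;
    if (i !== -1) list.splice(i, 1);
  };
  if (config.attributes) {
    // Global allow-list: take the attribute out of it.
    if (!config.attributes.includes(attribute)) return false;
    drop(config.attributes);
    // Local remove-lists must stay a subset of the global allow-list.
    for (const el of config.elements ?? []) drop(el.removeAttributes);
    return true;
  }
  // Global remove-list: add the attribute to it.
  if (config.removeAttributes.includes(attribute)) return false;
  config.removeAttributes.push(attribute);
  // A globally removed attribute may not appear in any per-element list.
  for (const el of config.elements ?? []) {
    drop(el.attributes);
    drop(el.removeAttributes);
  }
  return true;
}
```

The fix-up loops are what keep the configuration valid after the change, so callers never observe a duplicate between a global and a per-element list.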
SanitizerConfig
configuration:
SanitizerConfig
configuration:
-
If configuration["
attributes
"] does not exist, then return false. -
If configuration["
dataAttributes
"] exists and configuration["dataAttributes
"] equals allow, then return false. -
Set configuration["
dataAttributes
"] to allow. -
If allow is true:
-
Remove any items attr from configuration["
attributes
"] where attr is a custom data attribute. -
If configuration["
elements
"] exists:-
For each element in configuration["
elements
"]:-
If element[
attributes
] exists:-
Remove any items attr from element[
attributes
] where attr is a custom data attribute.
-
-
-
-
-
Return true.
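As an illustration of the steps above, the following sketch models the data-attribute toggle on a plain object. It assumes attribute names are plain strings, treats any name starting with `data-` as a custom data attribute (the namespace check is omitted), and uses hypothetical names throughout.

```javascript
const isDataAttribute = (name) => name.startsWith("data-");

// Mutates `config`; returns whether anything changed.
function setDataAttributes(config, allow) {
  // The setting only applies next to a global attributes allow-list.
  if (!config.attributes) return false;
  if (config.dataAttributes === allow) return false;
  config.dataAttributes = allow;
  if (allow) {
    // Explicit data-* entries would now be duplicates, so drop them.
    config.attributes = config.attributes.filter((a) => !isDataAttribute(a));
    for (const el of config.elements ?? []) {
      if (el.attributes) el.attributes = el.attributes.filter((a) => !isDataAttribute(a));
    }
  }
  return true;
}
```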
To remove unsafe from a SanitizerConfig configuration, do this:
Note: While this algorithm is called remove unsafe, we use the term "unsafe" strictly in the sense of this spec, to denote content that will execute JavaScript when inserted into the document. In other words, this method will remove opportunities for XSS.
-
Assert: The key set of built-in safe baseline configuration equals «[ "
removeElements
", "removeAttributes
" ] ». -
Let result be false.
-
For each element in built-in safe baseline configuration[
removeElements
]:-
Call remove an element element from configuration.
-
If the call returned true, set result to true.
-
-
For each attribute in built-in safe baseline configuration[
removeAttributes
]:-
Call remove an attribute attribute from configuration.
-
If the call returned true, set result to true.
-
-
For each attribute listed in event handler content attributes:
-
Call remove an attribute attribute from configuration.
-
If the call returned true, set result to true.
-
-
Return result.
3.3. Set the Configuration
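To set a configuration, given a dictionary configuration, a boolean allowCommentsAndDataAttributes, and a Sanitizer sanitizer: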
-
Canonicalize configuration with allowCommentsAndDataAttributes.
-
If configuration is not valid, then return false.
-
Set sanitizer’s configuration to configuration.
-
Return true.
3.4. Canonicalize the Configuration
The Sanitizer stores the configuration in a canonical form, as this makes a number of processing steps easier.
For example, the elements list {elements: ["div"]} gets stored as {elements: [{name: "div", namespace: "http://www.w3.org/1999/xhtml"}]}.
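The canonical form and the canonicalization steps in this section can be sketched on plain objects. This is a simplified model, not the normative algorithm: per-element attribute lists are not canonicalized, the function returns a copy instead of modifying its argument, and the names `canonicalizeName` and `canonicalizeConfig` are hypothetical. The default namespaces follow the dictionary defaults given in the configuration dictionary section.

```javascript
const HTML_NS = "http://www.w3.org/1999/xhtml";

// Turn a string or {name, namespace} item into canonical {name, namespace} form.
function canonicalizeName(item, defaultNamespace) {
  if (typeof item === "string") return { name: item, namespace: defaultNamespace };
  const namespace = item.namespace === undefined ? defaultNamespace : item.namespace;
  return { name: item.name, namespace };
}

function canonicalizeConfig(config, allowCommentsAndDataAttributes) {
  const out = { ...config };
  // Missing allow- and remove-lists are read as an empty remove-list.
  if (!("elements" in out) && !("removeElements" in out)) out.removeElements = [];
  if (!("attributes" in out) && !("removeAttributes" in out)) out.removeAttributes = [];
  // Elements default to the HTML namespace, attributes to the null namespace.
  for (const key of ["elements", "removeElements", "replaceWithChildrenElements"]) {
    if (key in out) out[key] = out[key].map((e) => canonicalizeName(e, HTML_NS));
  }
  for (const key of ["attributes", "removeAttributes"]) {
    if (key in out) out[key] = out[key].map((a) => canonicalizeName(a, null));
  }
  if (!("comments" in out)) out.comments = allowCommentsAndDataAttributes;
  if ("attributes" in out && !("dataAttributes" in out)) {
    out.dataAttributes = allowCommentsAndDataAttributes;
  }
  return out;
}
```

Under these assumptions, `canonicalizeConfig({ elements: ["div"] }, true)` yields an `elements` list whose single entry has the name `div` and the HTML namespace, plus an empty `removeAttributes` list.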
To canonicalize the configuration SanitizerConfig configuration with a boolean allowCommentsAndDataAttributes:
Note: We assume that configuration is the result of [WebIDL] converting a JavaScript value to a SanitizerConfig.
-
If neither configuration["elements"] nor configuration["removeElements"] exist, then set configuration["removeElements"] to « [] ».
-
If neither configuration["attributes"] nor configuration["removeAttributes"] exist, then set configuration["removeAttributes"] to « [] ».
-
If configuration["elements"] exists:
  -
  Let elements be « [] ».
  -
  For each element of configuration["elements"] do:
    -
    Append the result of canonicalize a sanitizer element with attributes element to elements.
  -
  Set configuration["elements"] to elements.
-
If configuration["removeElements"] exists:
  -
  Let elements be « [] ».
  -
  For each element of configuration["removeElements"] do:
    -
    Append the result of canonicalize a sanitizer element element to elements.
  -
  Set configuration["removeElements"] to elements.
-
If configuration["replaceWithChildrenElements"] exists:
  -
  Let elements be « [] ».
  -
  For each element of configuration["replaceWithChildrenElements"] do:
    -
    Append the result of canonicalize a sanitizer element element to elements.
  -
  Set configuration["replaceWithChildrenElements"] to elements.
-
If configuration["attributes"] exists:
  -
  Let attributes be « [] ».
  -
  For each attribute of configuration["attributes"] do:
    -
    Append the result of canonicalize a sanitizer attribute attribute to attributes.
  -
  Set configuration["attributes"] to attributes.
-
If configuration["removeAttributes"] exists:
  -
  Let attributes be « [] ».
  -
  For each attribute of configuration["removeAttributes"] do:
    -
    Append the result of canonicalize a sanitizer attribute attribute to attributes.
  -
  Set configuration["removeAttributes"] to attributes.
-
If configuration["comments"] does not exist, then set configuration["comments"] to allowCommentsAndDataAttributes.
-
If configuration["attributes"] exists and configuration["dataAttributes"] does not exist, then set configuration["dataAttributes"] to allowCommentsAndDataAttributes.
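The canonicalization steps above can be modelled on plain objects. The following is a minimal sketch, not the normative algorithm: the function names are illustrative, "exists" is approximated by comparing against undefined, and the per-element attribute lists are only copied when they are present.

```javascript
const HTML_NAMESPACE = "http://www.w3.org/1999/xhtml";

// Turn a string or {name, namespace} dictionary into {name, namespace}.
function canonicalizeName(name, defaultNamespace) {
  if (typeof name === "string") return { name, namespace: defaultNamespace };
  const namespace = name.namespace !== undefined ? name.namespace : defaultNamespace;
  return { name: name.name, namespace: namespace === "" ? null : namespace };
}

// Elements in the "elements" list may carry their own attribute lists.
function canonicalizeElementWithAttributes(element) {
  const result = canonicalizeName(element, HTML_NAMESPACE);
  if (typeof element === "object") {
    for (const key of ["attributes", "removeAttributes"]) {
      if (element[key] !== undefined) {
        result[key] = element[key].map((a) => canonicalizeName(a, null));
      }
    }
  }
  return result;
}

// Sketch of "canonicalize the configuration"; mutates and returns config.
function canonicalizeConfiguration(config, allowCommentsAndDataAttributes) {
  if (config.elements === undefined && config.removeElements === undefined) {
    config.removeElements = [];
  }
  if (config.attributes === undefined && config.removeAttributes === undefined) {
    config.removeAttributes = [];
  }
  if (config.elements !== undefined) {
    config.elements = config.elements.map((e) => canonicalizeElementWithAttributes(e));
  }
  for (const key of ["removeElements", "replaceWithChildrenElements"]) {
    if (config[key] !== undefined) {
      config[key] = config[key].map((e) => canonicalizeName(e, HTML_NAMESPACE));
    }
  }
  for (const key of ["attributes", "removeAttributes"]) {
    if (config[key] !== undefined) {
      config[key] = config[key].map((a) => canonicalizeName(a, null));
    }
  }
  if (config.comments === undefined) {
    config.comments = allowCommentsAndDataAttributes;
  }
  if (config.attributes !== undefined && config.dataAttributes === undefined) {
    config.dataAttributes = allowCommentsAndDataAttributes;
  }
  return config;
}

// Example: a short-hand allow-list expands to fully qualified names.
const canonical = canonicalizeConfiguration({ elements: ["div"] }, true);
```

Note that in this example no "dataAttributes" entry is added, because the input has no "attributes" allow-list; the last step only applies when one exists.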
To canonicalize a sanitizer element with attributes, given a SanitizerElementWithAttributes element:
-
Let result be the result of canonicalize a sanitizer element with element.
-
If element is a dictionary:
  -
  For each attribute in element["attributes"]:
    -
    Add the result of canonicalize a sanitizer attribute with attribute to result["attributes"].
  -
  For each attribute in element["removeAttributes"]:
    -
    Add the result of canonicalize a sanitizer attribute with attribute to result["removeAttributes"].
-
Return result.
To canonicalize a sanitizer element, given a SanitizerElement element, return the result of canonicalize a sanitizer name with element and the HTML namespace as the default namespace.
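To canonicalize a sanitizer attribute, given a SanitizerAttribute attribute, return the result of canonicalize a sanitizer name with attribute and null as the default namespace.
To canonicalize a sanitizer name name, with a default namespace defaultNamespace, run the following steps: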
-
Assert: name is either a DOMString or a dictionary.
-
If name is a DOMString, then return «[ "name" → name, "namespace" → defaultNamespace ]».
-
Assert: name is a dictionary and name["name"] exists.
-
Let namespace be name["namespace"] if it exists, otherwise defaultNamespace.
-
If namespace is the empty string, then set it to null.
-
Return «[ "name" → name["name"], "namespace" → namespace ]».
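As a rough illustration, the name canonicalization steps map onto a few lines of JavaScript. The function and constant names below are illustrative only, and plain objects stand in for the ordered maps used in the steps.

```javascript
const HTML_NAMESPACE = "http://www.w3.org/1999/xhtml";

// Sketch of "canonicalize a sanitizer name".
function canonicalizeSanitizerName(name, defaultNamespace) {
  // A bare string gets the default namespace.
  if (typeof name === "string") {
    return { name, namespace: defaultNamespace };
  }
  // Otherwise name is a dictionary with a "name" entry.
  let namespace = name.namespace !== undefined ? name.namespace : defaultNamespace;
  // The empty string is normalized to null.
  if (namespace === "") namespace = null;
  return { name: name.name, namespace };
}

// Elements default to the HTML namespace, attributes to null.
const canonicalizeSanitizerElement = (element) =>
  canonicalizeSanitizerName(element, HTML_NAMESPACE);
const canonicalizeSanitizerAttribute = (attribute) =>
  canonicalizeSanitizerName(attribute, null);

// Examples
const div = canonicalizeSanitizerElement("div");
const href = canonicalizeSanitizerAttribute("href");
const svgCircle = canonicalizeSanitizerElement({
  name: "circle",
  namespace: "http://www.w3.org/2000/svg",
});
const noNamespace = canonicalizeSanitizerElement({ name: "custom", namespace: "" });
```

Here div carries the HTML namespace, href carries a null namespace, svgCircle keeps its explicit SVG namespace, and noNamespace ends up with a null namespace because the empty string is normalized.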
3.5. Supporting Algorithms
For the canonicalized element and attribute name lists used in this spec, list membership is based on matching both "name" and "namespace" entries.
The intersection of two lists A and B containing SanitizerElement entries is the same as set intersection, but with the set entries previously canonicalized:
-
Let set A be « [] ».
-
Let set B be « [] ».
-
For each entry of A, append the result of canonicalize a sanitizer name entry to set A.
-
For each entry of B, append the result of canonicalize a sanitizer name entry to set B.
-
Return the intersection of set A and set B.
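A minimal sketch of this intersection, assuming plain arrays and objects. The defaultNamespace parameter is an addition made for this sketch, since the steps above do not say which default namespace is used when canonicalizing each entry.

```javascript
// Turn a string or {name, namespace} dictionary into {name, namespace}.
function canonicalizeName(name, defaultNamespace) {
  if (typeof name === "string") return { name, namespace: defaultNamespace };
  const namespace = name.namespace !== undefined ? name.namespace : defaultNamespace;
  return { name: name.name, namespace: namespace === "" ? null : namespace };
}

// Intersection of two name lists: canonicalize both sides first, then keep
// the entries of A whose "name" and "namespace" both match an entry of B.
function intersectNameLists(a, b, defaultNamespace) {
  const setA = a.map((entry) => canonicalizeName(entry, defaultNamespace));
  const setB = b.map((entry) => canonicalizeName(entry, defaultNamespace));
  return setA.filter((x) =>
    setB.some((y) => x.name === y.name && x.namespace === y.namespace)
  );
}

// Example: the string "p" and the dictionary form of <p> are the same entry.
const HTML_NS = "http://www.w3.org/1999/xhtml";
const common = intersectNameLists(
  ["div", { name: "p", namespace: HTML_NS }],
  ["p", "span"],
  HTML_NS
);
```

Canonicalizing first is what lets the short-hand string and the explicit dictionary compare as equal; a naive comparison of the raw entries would miss the match.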
3.6. Builtins
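There are four builtins:
-
the built-in safe default configuration,
-
the built-in safe baseline configuration,
-
the built-in navigating URL attributes list, and
-
the built-in animating URL attributes list.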
The built-in safe default configuration is as follows:
{ "elements" : [ { "name" : "html" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "head" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "title" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "body" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "article" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "section" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "nav" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "aside" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "h1" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "h2" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "h3" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "h4" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "h5" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "h6" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "hgroup" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "header" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "footer" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "address" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "p" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "hr" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "pre" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "blockquote" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "cite" , "namespace" : null } ] }, 
{ "name" : "ol" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "reversed" , "namespace" : null }, { "name" : "start" , "namespace" : null }, { "name" : "type" , "namespace" : null } ] }, { "name" : "ul" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "menu" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "li" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "value" , "namespace" : null } ] }, { "name" : "dl" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "dt" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "dd" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "figure" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "figcaption" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "main" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "search" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "div" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "a" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "href" , "namespace" : null }, { "name" : "rel" , "namespace" : null }, { "name" : "hreflang" , "namespace" : null }, { "name" : "type" , "namespace" : null } ] }, { "name" : "em" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "strong" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "small" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "s" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "cite" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "q" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : 
[] }, { "name" : "dfn" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "abbr" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "ruby" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "rt" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "rp" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "data" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "value" , "namespace" : null } ] }, { "name" : "time" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "datetime" , "namespace" : null } ] }, { "name" : "code" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "var" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "samp" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "kbd" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "sub" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "sup" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "i" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "b" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "u" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "mark" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "bdi" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "bdo" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "span" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "br" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "wbr" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" 
: [] }, { "name" : "ins" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "cite" , "namespace" : null }, { "name" : "datetime" , "namespace" : null } ] }, { "name" : "del" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "cite" , "namespace" : null }, { "name" : "datetime" , "namespace" : null } ] }, { "name" : "table" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "caption" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "colgroup" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "span" , "namespace" : null } ] }, { "name" : "col" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "span" , "namespace" : null } ] }, { "name" : "tbody" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "thead" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "tfoot" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "tr" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [] }, { "name" : "td" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "colspan" , "namespace" : null }, { "name" : "rowspan" , "namespace" : null }, { "name" : "headers" , "namespace" : null } ] }, { "name" : "th" , "namespace" : "http://www.w3.org/1999/xhtml" , "attributes" : [ { "name" : "colspan" , "namespace" : null }, { "name" : "rowspan" , "namespace" : null }, { "name" : "headers" , "namespace" : null }, { "name" : "scope" , "namespace" : null }, { "name" : "abbr" , "namespace" : null } ] }, { "name" : "math" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "merror" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "mfrac" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "mi" , "namespace" : 
"http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "mmultiscripts" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "mn" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "mo" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [ { "name" : "form" , "namespace" : null }, { "name" : "fence" , "namespace" : null }, { "name" : "separator" , "namespace" : null }, { "name" : "lspace" , "namespace" : null }, { "name" : "rspace" , "namespace" : null }, { "name" : "stretchy" , "namespace" : null }, { "name" : "symmetric" , "namespace" : null }, { "name" : "maxsize" , "namespace" : null }, { "name" : "minsize" , "namespace" : null }, { "name" : "largeop" , "namespace" : null }, { "name" : "movablelimits" , "namespace" : null } ] }, { "name" : "mover" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [ { "name" : "accent" , "namespace" : null } ] }, { "name" : "mpadded" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [ { "name" : "width" , "namespace" : null }, { "name" : "height" , "namespace" : null }, { "name" : "depth" , "namespace" : null }, { "name" : "lspace" , "namespace" : null }, { "name" : "voffset" , "namespace" : null } ] }, { "name" : "mphantom" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "mprescripts" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "mroot" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "mrow" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "ms" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "mspace" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [ { "name" : "width" , "namespace" : null }, { "name" : "height" , "namespace" : null }, { "name" : "depth" , "namespace" : null 
} ] }, { "name" : "msqrt" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "mstyle" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "msub" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "msubsup" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "msup" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "mtable" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "mtd" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [ { "name" : "columnspan" , "namespace" : null }, { "name" : "rowspan" , "namespace" : null } ] }, { "name" : "mtext" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "mtr" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] }, { "name" : "munder" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [ { "name" : "accentunder" , "namespace" : null } ] }, { "name" : "munderover" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [ { "name" : "accent" , "namespace" : null }, { "name" : "accentunder" , "namespace" : null } ] }, { "name" : "semantics" , "namespace" : "http://www.w3.org/1998/Math/MathML" , "attributes" : [] } ], "attributes" : [ { "name" : "dir" , "namespace" : null }, { "name" : "lang" , "namespace" : null }, { "name" : "title" , "namespace" : null }, { "name" : "displaystyle" , "namespace" : null }, { "name" : "mathbackground" , "namespace" : null }, { "name" : "mathcolor" , "namespace" : null }, { "name" : "mathsize" , "namespace" : null }, { "name" : "scriptlevel" , "namespace" : null } ], "comments" : false , "dataAttributes" : false }
Note: Included [MathML] markup is based on [SafeMathML].
The built-in safe baseline configuration is meant to block only script-content. It is as follows:
{ "removeElements" : [ { "namespace" : "http://www.w3.org/1999/xhtml" , "name" : "script" }, { "namespace" : "http://www.w3.org/1999/xhtml" , "name" : "frame" }, { "namespace" : "http://www.w3.org/1999/xhtml" , "name" : "iframe" }, { "namespace" : "http://www.w3.org/1999/xhtml" , "name" : "object" }, { "namespace" : "http://www.w3.org/1999/xhtml" , "name" : "embed" }, { "namespace" : "http://www.w3.org/2000/svg" , "name" : "script" }, { "namespace" : "http://www.w3.org/2000/svg" , "name" : "use" } ], "removeAttributes" : [] }
Warning: The remove unsafe algorithm specifies to additionally remove any event handler content attributes, as defined in [HTML]. If a user agent defines extensions to the [HTML] spec with additional event handler content attributes, it is its responsibility to decide how to handle them. Using the current event handler content attributes list, the safe baseline configuration looks effectively like so:
{ "removeElements" : [ { "namespace" : "http://www.w3.org/1999/xhtml" , "name" : "script" }, { "namespace" : "http://www.w3.org/1999/xhtml" , "name" : "frame" }, { "namespace" : "http://www.w3.org/1999/xhtml" , "name" : "iframe" }, { "namespace" : "http://www.w3.org/1999/xhtml" , "name" : "object" }, { "namespace" : "http://www.w3.org/1999/xhtml" , "name" : "embed" }, { "namespace" : "http://www.w3.org/2000/svg" , "name" : "script" }, { "namespace" : "http://www.w3.org/2000/svg" , "name" : "use" } ], "removeAttributes" : [ "onafterprint" , "onauxclick" , "onbeforeinput" , "onbeforematch" , "onbeforeprint" , "onbeforeunload" , "onbeforetoggle" , "onblur" , "oncancel" , "oncanplay" , "oncanplaythrough" , "onchange" , "onclick" , "onclose" , "oncontextlost" , "oncontextmenu" , "oncontextrestored" , "oncopy" , "oncuechange" , "oncut" , "ondblclick" , "ondrag" , "ondragend" , "ondragenter" , "ondragleave" , "ondragover" , "ondragstart" , "ondrop" , "ondurationchange" , "onemptied" , "onended" , "onerror" , "onfocus" , "onformdata" , "onhashchange" , "oninput" , "oninvalid" , "onkeydown" , "onkeypress" , "onkeyup" , "onlanguagechange" , "onload" , "onloadeddata" , "onloadedmetadata" , "onloadstart" , "onmessage" , "onmessageerror" , "onmousedown" , "onmouseenter" , "onmouseleave" , "onmousemove" , "onmouseout" , "onmouseover" , "onmouseup" , "onoffline" , "ononline" , "onpagehide" , "onpagereveal" , "onpageshow" , "onpageswap" , "onpaste" , "onpause" , "onplay" , "onplaying" , "onpopstate" , "onprogress" , "onratechange" , "onreset" , "onresize" , "onrejectionhandled" , "onscroll" , "onscrollend" , "onsecuritypolicyviolation" , "onseeked" , "onseeking" , "onselect" , "onslotchange" , "onstalled" , "onstorage" , "onsubmit" , "onsuspend" , "ontimeupdate" , "ontoggle" , "onunhandledrejection" , "onunload" , "onvolumechange" , "onwaiting" , "onwheel" ] }
The built-in navigating URL attributes list, for which "javascript:" navigations are "unsafe", is as follows:
«[
  [
    { "name" → "a", "namespace" → HTML namespace },
    { "name" → "href", "namespace" → null }
  ],
  [
    { "name" → "area", "namespace" → HTML namespace },
    { "name" → "href", "namespace" → null }
  ],
  [
    { "name" → "base", "namespace" → HTML namespace },
    { "name" → "href", "namespace" → null }
  ],
  [
    { "name" → "button", "namespace" → HTML namespace },
    { "name" → "formaction", "namespace" → null }
  ],
  [
    { "name" → "form", "namespace" → HTML namespace },
    { "name" → "action", "namespace" → null }
  ],
  [
    { "name" → "iframe", "namespace" → HTML namespace },
    { "name" → "src", "namespace" → null }
  ],
  [
    { "name" → "input", "namespace" → HTML namespace },
    { "name" → "formaction", "namespace" → null }
  ],
  [
    { "name" → "a", "namespace" → SVG namespace },
    { "name" → "href", "namespace" → null }
  ],
  [
    { "name" → "a", "namespace" → SVG namespace },
    { "name" → "href", "namespace" → XLink namespace }
  ]
]»
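The list above pairs an element name with an attribute name. One way such a list might be consulted is sketched below. The helper names are hypothetical, and the javascript: check relies on the standard URL constructor as an approximation of URL parsing without a base URL; it is not taken from this section.

```javascript
const HTML_NS = "http://www.w3.org/1999/xhtml";
const SVG_NS = "http://www.w3.org/2000/svg";
const XLINK_NS = "http://www.w3.org/1999/xlink";

// Build one [element, attribute] pair of canonical names.
const pair = (element, elementNs, attribute, attributeNs) => [
  { name: element, namespace: elementNs },
  { name: attribute, namespace: attributeNs },
];

// The entries of the list above, transcribed as plain objects.
const navigatingUrlAttributes = [
  pair("a", HTML_NS, "href", null),
  pair("area", HTML_NS, "href", null),
  pair("base", HTML_NS, "href", null),
  pair("button", HTML_NS, "formaction", null),
  pair("form", HTML_NS, "action", null),
  pair("iframe", HTML_NS, "src", null),
  pair("input", HTML_NS, "formaction", null),
  pair("a", SVG_NS, "href", null),
  pair("a", SVG_NS, "href", XLINK_NS),
];

const sameName = (a, b) => a.name === b.name && a.namespace === b.namespace;

// True if the (element, attribute) combination appears in the list.
function isNavigatingUrlAttribute(element, attribute) {
  return navigatingUrlAttributes.some(
    ([e, a]) => sameName(e, element) && sameName(a, attribute)
  );
}

// True if value parses as an absolute URL whose scheme is "javascript".
// Values that fail to parse without a base URL (e.g. relative paths) yield false.
function isJavascriptUrl(value) {
  try {
    return new URL(value).protocol === "javascript:";
  } catch {
    return false;
  }
}
```

In this sketch, a sanitizer could combine the two checks and drop an attribute only when it is listed for its element and its value is a javascript: URL, which leaves ordinary links untouched.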
The built-in animating URL attributes list, which can be used in [SVG11] to declaratively modify navigation elements to use "javascript:" URLs, is as follows:
«[
  [
    { "name" → "animate", "namespace" → SVG namespace },
    { "name" → "attributeName", "namespace" → null }
  ],
  [
    { "name" → "animateMotion", "namespace" → SVG namespace },
    { "name" → "attributeName", "namespace" → null }
  ],
  [
    { "name" → "animateTransform", "namespace" → SVG namespace },
    { "name" → "attributeName", "namespace" → null }
  ],
  [
    { "name" → "set", "namespace" → SVG namespace },
    { "name" → "attributeName", "namespace" → null }
  ]
]»
4. Security Considerations
The Sanitizer API is intended to prevent DOM-based Cross-Site Scripting by traversing supplied HTML content and removing elements and attributes according to a configuration. The specified API must not support the construction of a Sanitizer object that leaves script-capable markup in place; doing so would be a bug in the threat model.
That said, there are security issues that even correct usage of the Sanitizer API cannot protect against. These scenarios are laid out in the following sections.
4.1. Server-Side Reflected and Stored XSS
This section is not normative.
The Sanitizer API operates solely in the DOM and adds a capability to traverse and filter an existing DocumentFragment. The Sanitizer does not address server-side reflected or stored XSS.
4.2. DOM clobbering
This section is not normative.
DOM clobbering describes an attack in which malicious HTML confuses an application by naming elements through id or name attributes such that properties like children of an HTML element in the DOM are overshadowed by the malicious content.
The Sanitizer API does not protect against DOM clobbering attacks in its default state, but can be configured to remove id and name attributes.
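As a sketch of what such a configuration might look like, the dictionary below lists both attributes under removeAttributes, using the short-hand string form that canonicalization expands to name and namespace pairs. The variable name is illustrative, and how the dictionary is handed to a Sanitizer is left to the API sections of this specification.

```javascript
// Configuration fragment (illustrative): strip the attributes that are
// typically abused for DOM clobbering.
const noClobberingConfig = {
  removeAttributes: ["id", "name"],
};
```

Because this is a remove-list configuration, it does not restrict which elements are kept; authors who also want an element allow-list would need to configure that separately.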
4.3. XSS with Script gadgets
This section is not normative.
Script gadgets are a technique in which an attacker uses existing application code from popular JavaScript libraries to cause their own code to execute. This is often done by injecting innocent-looking code or seemingly inert DOM nodes that are only parsed and interpreted by a framework, which then executes JavaScript based on that input.
The Sanitizer API cannot prevent these attacks, but requires page authors to explicitly allow unknown elements in general. Authors must additionally explicitly configure unknown attributes and elements, as well as markup that is known to be widely used for templating and framework-specific code, like data- and slot attributes and elements like <slot> and <template>.
We believe that these restrictions are not exhaustive and encourage page
authors to examine their third party libraries for this behavior.
4.4. Mutated XSS
This section is not normative.
Mutated XSS or mXSS describes an attack based on parser context mismatches when parsing an HTML snippet without the correct context. In particular, when a parsed HTML fragment has been serialized to a string, the string is not guaranteed to be parsed and interpreted exactly the same when inserted into a different parent element. One way to carry out such an attack is to rely on the change of parsing behavior for foreign content or mis-nested tags.
The Sanitizer API offers only functions that turn a string into a node tree. The context is supplied implicitly by all sanitizer functions: Element.setHTML() uses the current element; Document.parseHTML() creates a new document. Therefore, the Sanitizer API is not directly affected by mutated XSS.
If a developer were to retrieve a sanitized node tree as a string, e.g. via .innerHTML, and then parse it again, mutated XSS may occur.
We discourage this practice. If processing or passing of HTML as a
string should be necessary after all, then any string should be considered
untrusted and should be sanitized (again) when inserting it into the DOM. In
other words, a sanitized and then serialized HTML tree can no
longer be considered as sanitized.
A more complete treatment of mXSS can be found in [MXSS].
5. Acknowledgements
This work is informed and inspired by [DOMPURIFY] from cure53, Internet Explorer's window.toStaticHTML(), as well as the original [HTMLSanitizer] from Ben Bucksch.
Thanks also to Anne van Kesteren, Krzysztof Kotowicz, Tom Schuster, Luke Warlow, Guillaume Weghsteen, and Mike West for their valuable feedback.