Target Privacy Threat Model

Editor’s Draft

This version:
https://w3cping.github.io/privacy-threat-model
Feedback:
public-privacy@w3.org with subject line “[privacy-threat-model] … message topic …” (archives)
Issue Tracking:
GitHub
Inline In Spec
Editors:
(Google Inc.)
(Brave)
Not Ready For Implementation

This spec is not yet ready for implementation. It exists in this repository to record the ideas and promote discussion.

Before attempting to implement this spec, please contact the editors.


Abstract

A privacy threat model we should migrate the web toward.

Status of this document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index.

This document was published by the Privacy Interest Group as an Editor’s Draft. This document is intended to become a W3C Recommendation. Publication as an Editor’s Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

Feedback and comments on this specification are welcome, either as GitHub issues or on the public-privacy@w3.org mailing list. When sending e-mail, please put the text “privacy-threat-model” in the subject, preferably like this: “[privacy-threat-model] …summary of comment…”

This document was produced by a group operating under the W3C Patent Policy. The W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

This document is governed by the 1 March 2019 W3C Process Document.

This document is at a very early stage. Many things in it are wrong and/or incomplete. Please take it as a rough shape for how we might document the target threat model, rather than as definite statements about what should be in the target threat model.

1. Introduction

As a threat model, this specification describes attacker capabilities and attacker goals, and says which capabilities should and should not enable which goals.

As a privacy threat model, it considers attacker goals that compromise the privacy of users, rather than their security.

As a target threat model, it describes not the current state of the Web including all current maybe-unwise APIs, but rather an end state that we hope to migrate to, and that new APIs should be held to. This is meant to be a plausible threat model: it doesn’t expect to remove any APIs or browser behavior that is deemed essential to the viability of the Web.

Since people are likely to disagree about which APIs are essential to the Web, whenever this document says that an attacker can achieve a goal, it describes how the attacker does so using particular "essential" APIs. It also provides an index of those APIs so readers can point out ones that they don’t consider essential.

2. Terminology

[HTML] defines an origin as the tuple of a scheme, hostname, and port that provides the main security boundary on the web.
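
As a rough illustration of the tuple described above, the following Python sketch derives a (scheme, hostname, port) triple from a URL and compares two URLs by it. The Origin type, the origin_of and same_origin helpers, and the default-port table are assumptions made for this example. The sketch ignores opaque origins and other details that [HTML] specifies.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlsplit

# Default ports for the schemes this sketch handles (an assumption of the example).
DEFAULT_PORTS = {"http": 80, "https": 443}


class Origin(NamedTuple):
    scheme: str
    hostname: str
    port: Optional[int]


def origin_of(url: str) -> Origin:
    """Return the (scheme, hostname, port) tuple for a URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    hostname = (parts.hostname or "").lower()
    # Fill in the scheme's default port when the URL does not name one.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return Origin(scheme, hostname, port)


def same_origin(url_a: str, url_b: str) -> bool:
    """Two URLs share an origin only if all three components match."""
    return origin_of(url_a) == origin_of(url_b)
```

Under these assumptions, https://example.com/a and https://example.com:443/b share an origin, while https://example.com/ and https://www.example.com/ do not.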

A site is a set of origins that are all same site with each other. Note that there are problems ([PSL-PROBLEMS]) with using registrable domains as a logical boundary.
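
To make the grouping of origins into sites more concrete, here is a minimal Python sketch. It assumes that two origins are same site when they share a scheme and a registrable domain, and it uses a tiny hard-coded suffix set in place of the real Public Suffix List. The names PUBLIC_SUFFIXES, registrable_domain, and same_site are hypothetical, and the wildcard and exception rules of the real list are not handled.

```python
from typing import Optional
from urllib.parse import urlsplit

# Tiny stand-in for the Public Suffix List; a real implementation would load the full list.
PUBLIC_SUFFIXES = {"com", "org", "co.uk", "github.io"}


def registrable_domain(hostname: str) -> Optional[str]:
    """Return the longest matching public suffix plus one label, or None if there is none."""
    labels = hostname.lower().rstrip(".").split(".")
    # Scan from the longest candidate suffix to the shortest so the longest match wins.
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in PUBLIC_SUFFIXES:
            if i == 0:
                return None  # the hostname is itself a public suffix
            return ".".join(labels[i - 1:])
    return None  # no known suffix matched


def same_site(url_a: str, url_b: str) -> bool:
    """Compare two URLs by scheme and registrable domain (the assumption of this sketch)."""
    a, b = urlsplit(url_a), urlsplit(url_b)
    domain_a = registrable_domain(a.hostname or "")
    domain_b = registrable_domain(b.hostname or "")
    return domain_a is not None and a.scheme == b.scheme and domain_a == domain_b
```

Because github.io is in the suffix set, w3cping.github.io and other.github.io come out as different sites, whereas a.example.com and b.example.com are the same site. The result depends entirely on the contents of the suffix list, which is one source of the problems noted above.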

A party is defined by [TRACKING-DNT] as "a natural person, a legal entity, or a set of legal entities that share common owner(s), common controller(s), and a group identity that is easily discoverable by a user."

The first party for a user action is the party that controls the origin of the top-level browsing context under which the action happened. Intuitively, this is the owner of the domain in the browser’s URL bar. This differs from Mozilla’s definition, which also treats another party as a first party if the user can easily discover which party it is and intends to interact with that party, for example to allow sign-in widgets to be first-party.

A third party for a user action is any party that isn’t the first party or the user (the second party).
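
The first-party and third-party definitions can be sketched as a small classifier. The ownership table PARTY_OF_HOST, the host names, and the party names below are all hypothetical. No such registry exists in practice, so this only illustrates how the definitions relate once ownership is assumed to be known.

```python
from urllib.parse import urlsplit

# Hypothetical ownership table: which party controls which host.
PARTY_OF_HOST = {
    "news.example": "Example News Ltd",
    "static.news-cdn.example": "Example News Ltd",
    "tracker.example": "Example Ads Inc",
}


def party_of(url: str) -> str:
    """Look up the party that controls the host of a URL."""
    host = urlsplit(url).hostname or ""
    # Hosts missing from the table are treated as distinct unknown parties.
    return PARTY_OF_HOST.get(host, "unknown party: " + host)


def classify(top_level_url: str, resource_url: str) -> str:
    """Label the party behind resource_url relative to the top-level browsing context."""
    first_party = party_of(top_level_url)
    return "first party" if party_of(resource_url) == first_party else "third party"
```

In this sketch a script loaded from static.news-cdn.example on a news.example page is first-party because both hosts map to the same party, even though they are different sites. A pixel from tracker.example on the same page is third-party.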

A user is a human or program that controls a user agent.

A user ID is a pair of a site and a (potentially-large) integer allocated by that site that is used to identify a user on that site. A single user will generally have many user IDs that refer to them, and a single site may or may not know that multiple user IDs refer to the same user.

A global identifier is a string that identifies a particular user independent of which site they’re visiting. Users generally have relatively few global identifiers and can usually list and recognize them. A goal of anti-tracking policy is to prevent user IDs from becoming global identifiers.
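
The relationship between user IDs and global identifiers can be sketched in Python as follows. The UserID type mirrors the (site, integer) pair defined above. The link_by_shared_key helper and the sample data in the usage note are hypothetical, and using an email address as the shared key is an assumption chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Set


class UserID(NamedTuple):
    site: str   # the site that allocated the identifier
    value: int  # the integer that site uses for the user


def link_by_shared_key(observations: Dict[UserID, str]) -> List[Set[UserID]]:
    """Group user IDs from different sites that were observed with the same key.

    `observations` maps each user ID to a value seen alongside it (for example
    an email address). Any returned group spans more than one site, which means
    the key is acting as a global identifier for that user.
    """
    groups: Dict[str, Set[UserID]] = defaultdict(set)
    for user_id, key in observations.items():
        groups[key].add(user_id)
    # Keep only the groups that join user IDs across at least two sites.
    return [ids for ids in groups.values() if len({i.site for i in ids}) > 1]
```

For instance, if UserID("news.example", 17) and UserID("shop.example", 90210) are both observed with the same email address, the function returns them as one linked group. That cross-site join is what an anti-tracking policy tries to prevent.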

An attacker is any entity trying to get information that a user might not want them to get. Attackers are often entities that a user intends to interact with in other ways, as both first and third parties, and some users may not mind their collection of this information.

This document uses the terms publisher and tracker colloquially to refer to particular kinds of sites and the parties that operate them. They are not rigorously defined.

3. High-level threats

User agents should attempt to defend their users from a variety of high-level threats, or attacker goals, described in this section. A separate section describes the low-level steps an attacker would use to achieve these high-level goals.

This section is not complete. It lists a lot of potential privacy threats, but needs editing to pick which kinds of threats belong in this threat model and to unify the multiple lists of suggestions.

The following threats were brainstormed in the 2019 TPAC PING meeting:

The following threats are copied from Self-Review Questionnaire: Security and Privacy §threats. They are not all addressed in this document.

Surveillance

Surveillance is the observation or monitoring of an individual’s communications or activities.

Stored Data Compromise

Stored data compromise occurs when end systems do not take adequate measures to secure stored data from unauthorized or inappropriate access.

Intrusion

Intrusion consists of invasive acts that disturb or interrupt one’s life or activities.

Misattribution

Misattribution occurs when data or communications related to one individual are attributed to another.

Correlation

Correlation is the combination of various pieces of information related to an individual or that obtain that characteristic when combined.

Identification

Identification is the linking of information to a particular individual to infer an individual’s identity or to allow the inference of an individual’s identity.

Secondary Use

Secondary use is the use of collected information about an individual without the individual’s consent for a purpose different from that for which the information was collected.

Disclosure

Disclosure is the revelation of information about an individual that affects the way others judge the individual.

Exclusion

Exclusion is the failure to allow individuals to know about the data that others have about them and to participate in its handling and use.

4. Acknowledgements

Safari did the first work to prove that a more privacy-preserving web was possible, by blocking third-party cookies by default and then shipping ITP 1.0, without breaking the world. They eventually published their policy for Tracking Prevention, which heavily influenced this document.

Mozilla wrote the first concrete anti-tracking policy, which inspired Safari’s policy.

Michael Kleber on the Chrome team proposed a Privacy Model for the Web, which suggests blocking the transfer of user IDs between top-level sites and suggests a few ways that information could flow between sites without compromising user privacy.

Conformance

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

Index

Terms defined by this specification

Terms defined by reference

References

Normative References

[HTML]
Anne van Kesteren; et al. HTML Standard. Living Standard. URL: https://html.spec.whatwg.org/multipage/
[PSL]
Public Suffix List. Mozilla Foundation.
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://tools.ietf.org/html/rfc2119

Informative References

[PSL-PROBLEMS]
Ryan Sleevi. Public Suffix List Problems. URL: https://github.com/sleevi/psl-problems
[TRACKING-DNT]
Roy Fielding; David Singer. Tracking Preference Expression (DNT). 17 January 2019. NOTE. URL: https://www.w3.org/TR/tracking-dnt/
