Verifiable Credential Data Integrity 1.0

Securing the Integrity of Verifiable Credential Data

W3C Editor's Draft

This version:
https://w3c.github.io/vc-data-integrity/
Latest published version:
https://www.w3.org/TR/vc-data-integrity/
Latest editor's draft:
https://w3c.github.io/vc-data-integrity/
History:
Commit history
Editors:
Manu Sporny (Digital Bazaar)
Dave Longley (Digital Bazaar) (2014-2022)
Mike Prorock (Mesur.io)
Authors:
Dave Longley (Digital Bazaar)
Manu Sporny (Digital Bazaar)
Feedback:
GitHub w3c/vc-data-integrity (pull requests, new issue, open issues)
public-credentials@w3.org with subject line [vc-data-integrity] … message topic … (archives)

Abstract

This specification describes mechanisms for ensuring the authenticity and integrity of Verifiable Credentials and similar types of constrained digital documents using cryptography, especially through the use of digital signatures and related mathematical proofs.

Status of This Document

This is a preview

Do not attempt to implement this version of the specification. Do not reference this version as authoritative in any way. Instead, see https://w3c.github.io/vc-data-integrity/ for the Editor's draft.

This section describes the status of this document at the time of its publication. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.

This is an experimental specification and is undergoing regular revisions. It is not fit for production deployment.

This document was published by the Verifiable Credentials Working Group as an Editor's Draft.

Publication as an Editor's Draft does not imply endorsement by W3C and its Members.

This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

This document is governed by the 2 November 2021 W3C Process Document.

1. Introduction

This specification describes mechanisms for ensuring the authenticity and integrity of Verifiable Credentials and similar types of constrained digital documents using cryptography, especially through the use of digital signatures and related mathematical proofs. Such cryptographic proofs enable functionality that is useful to implementors of distributed systems; for example, they allow data to be shared and verified across systems without loss of trust in its authorship or integrity.

1.1 How it Works

The operation of Data Integrity is conceptually simple. To create a cryptographic proof, the following steps are performed: 1) Transformation, 2) Hashing, and 3) Proof Generation.


Figure 1: To create a cryptographic proof, data is transformed, hashed, and cryptographically protected. (Diagram: a box labeled 'Data' passes, left to right, through three arrows labeled 'Transform Data', 'Hash Data', and 'Generate Proof', producing a box labeled 'Data with Proof'.)

Transformation is a process that takes input data and prepares it for the hashing process. One example of a possible transformation is to take a record of people's names that attended a meeting, sort the list alphabetically by the individual's family name, and rewrite the names on a piece of paper, one per line, in sorted order. Possible transformations include canonicalization and binary-to-text encoding.

Hashing is a process that calculates an identifier for the transformed data using a cryptographic hash function. This process is conceptually similar to how a phone address book functions, where one takes a person's name (the input data) and maps that name to that individual's phone number (the hash). Possible cryptographic hash functions include SHA-3 and BLAKE-3.

Proof Generation is a process that calculates a value that protects the integrity of the input data from modification or otherwise proves a certain desired threshold of trust. This process is conceptually similar to the way a wax seal can be used on an envelope containing a letter to establish trust in the sender and show that the letter has not been tampered with in transit. Possible proof generation functions include digital signatures and proofs of stake.

To verify a cryptographic proof, the following steps are performed: 1) Transformation, 2) Hashing, and 3) Proof Verification.


Figure 2: To verify a cryptographic proof, data is transformed, hashed, and checked for correctness. (Diagram: a box labeled 'Data with Proof' passes, left to right, through three arrows labeled 'Transform Data', 'Hash Data', and 'Verify Proof'.)

During verification, the transformation and hashing steps are conceptually the same as described above.

Proof Verification is a process that applies a cryptographic proof verification function to see if the input data can be trusted. Possible proof verification functions include digital signatures and proofs of stake.

This specification details how cryptographic software architects and implementers can package these processes together into things called cryptographic suites and provide them to application developers for the purposes of protecting the integrity of application data in transit and at rest.
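
As a concrete illustration, the following non-normative sketch wires the three processes together in Python: JSON key-sorting stands in for the transformation, SHA-256 for the hashing, and an Ed25519 signature (via the pyca/cryptography package) for the proof generation. The function names and the hex encoding of the signature are illustrative assumptions, not part of any registered cryptographic suite.

import hashlib
import json

from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

def transform(document):
    # Transformation: a deterministic byte serialization
    # (sorted keys, no insignificant whitespace).
    return json.dumps(document, separators=(",", ":"),
                      sort_keys=True).encode("utf-8")

def secure_document(document, private_key):
    digest = hashlib.sha256(transform(document)).digest()  # Hashing
    signature = private_key.sign(digest)                   # Proof Generation
    secured = dict(document)
    secured["proof"] = {
        "type": "DataIntegrityProof",
        "proofValue": signature.hex()  # hex for brevity; real suites use multibase
    }
    return secured

key = Ed25519PrivateKey.generate()
print(secure_document({"title": "Hello world!"}, key))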

1.2 Design Goals and Rationale

This specification optimizes for the following design goals:

Simplicity
The technology is designed to be easy to use for application developers, without requiring significant training in cryptography. It optimizes for the following priority of constituencies: application developers over cryptographic suite implementers, over cryptographic suite designers, over cryptographic algorithm specification authors. The solution focuses on sensible defaults to prevent the selection of ineffective protection mechanisms. See section 5.2 Protecting Application Developers and 5.1 Versioning Cryptography Suites for further details.
Composability
A number of historical digital signature mechanisms have had monolithic designs which limited use cases by combining data transformation, syntax, digital signature, and serialization into a single specification. This specification layers each component such that a broader range of use cases are enabled, including generalized selective disclosure and serialization-agnostic signatures. See section 5.4 Transformations, section 5.5 Data Opacity, and 5.1 Versioning Cryptography Suites for further rationale.
Resilience
Since digital proof mechanisms might be compromised without warning due to technological advancements, it is important that cryptographic suites provide multiple layers of protection and can be rapidly upgraded. This specification provides for both algorithmic agility and cryptographic layering, while still keeping the digital proof format easy for developers to understand and use. See section 5.3 Agility and Layering to understand the particulars.
Progressive Extensibility
Creating and deploying new cryptographic protection mechanisms is designed to be a deliberate, iterative, and careful process that acknowledges that extension happens in phases from experimentation, to implementation, to standardization. This specification strives to balance the need for an increase in the rate of innovation in cryptography with the need for stable production-grade cryptography suites. See section 3. Cryptographic Suites for instructions on establishing new types of cryptographic proofs.
Serialization Flexibility
Cryptographic proofs can be serialized in many different but equivalent ways and have often been tightly bound to the original document syntax. This specification enables one to create cryptographic proofs that are not bound to the original document syntax, which enables more advanced use cases such as being able to use a single digital signature across a variety of serialization syntaxes such as JSON and CBOR without the need to regenerate the cryptographic proof. See section 5.4 Transformations for an explanation of the benefits of such an approach.
Note: Application of technology to broader use cases

While this specification primarily focuses on Verifiable Credentials, the design of this technology is generalized, such that it can be used for non-Verifiable Credential use cases. In these instances, implementers are expected to perform their own due diligence and expert review as to the applicability of the technology to their use case.

1.3 Conformance

As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.

The key words MAY, MUST, MUST NOT, OPTIONAL, RECOMMENDED, and SHOULD in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

1.4 Terminology


2. Data Model

This section specifies the data model that is used for expressing data integrity proofs and verification methods.

2.1 Proofs

A data integrity proof provides information about the proof mechanism, parameters required to verify that proof, and the proof value itself. All of this information is provided using Linked Data vocabularies such as the [SECURITY-VOCABULARY].

The following attributes are defined for use in a data integrity proof:

type
The specific proof type used for the cryptographic proof MUST be specified as a string that maps to a URL [URL]. Examples of proof types include DataIntegrityProof and Ed25519Signature2020. Proof types determine what other fields are required to secure and verify the proof.
proofPurpose
The reason the proof was created MUST be specified as a string that maps to a URL [URL]. The proof purpose acts as a safeguard to prevent the proof from being misused by being applied to a purpose other than the one that was intended. For example, without this value, the creator of a proof could be tricked into using cryptographic material typically used to create a Verifiable Credential (assertionMethod) during a login process (authentication), which would result in the creation of a Verifiable Credential they never meant to create instead of the intended action, which was merely to log into a website.
verificationMethod
The means and information needed to verify the proof MUST be specified as a string that maps to a [URL]. An example of a verification method is a link to a public key which includes cryptographic material that is used by a verifier during the verification process.
created
The date and time the proof was created MUST be specified as an [XML-SCHEMA11] combined date and time string.
domain
The creator of a proof SHOULD include a string value that indicates the security domain in which the proof is intended to be used, and a verifier SHOULD use this value to ensure that the proof was intended for them. The domain parameter is useful in challenge-response protocols where the verifier is operating from within a security domain known to the creator of the proof. Examples of a domain parameter include: domain.example (DNS domain), https://domain.example:8443 (full Web origin), mycorp-intranet (well-known text string), and b31d37d4-dd59-47d3-9dd8-c973da43b63a (UUID).
challenge
A string value that SHOULD be included in a proof if a domain is specified. The value is used once for a particular domain and window of time. This value is used to mitigate replay attacks. Examples of a challenge value include: 1235abcd6789, 79d34551-ae81-44ae-823b-6dadbab9ebd4, and ruby.
proofValue
A string value that contains the data necessary to verify the digital proof using the verificationMethod specified. The contents of the value MUST be a multibase-encoded binary value. Alternative fields with different encodings MAY be used to encode the data necessary to verify the digital proof instead of this value. The contents of this value are determined by a cryptosuite and set to the proof value generated by the Proof Algorithm.
Note

All of the terms above map to URLs. The vocabulary where these terms are defined is the [SECURITY-VOCABULARY].
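
As a rough, non-normative sanity check of the attributes defined above, a hypothetical validator might confirm the presence of the MUST-level fields before attempting any cryptographic verification. Note that proofValue is included here for simplicity, even though this section permits alternative encoding fields in its place.

REQUIRED_PROOF_FIELDS = ("type", "proofPurpose", "verificationMethod",
                         "created", "proofValue")

def has_required_proof_fields(proof):
    # Structural check only; performs no cryptographic verification.
    return all(isinstance(proof.get(field), str)
               for field in REQUIRED_PROOF_FIELDS)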

A proof can be added to a JSON document like the following:

Example 1: A simple JSON data document
{
  "title": "Hello world!"
}

by adding the parameters outlined in this section:

Example 2: A simple signed JSON data document
{
  "title": "Hello world!",
  "proof": {
    "type": "DataIntegrityProof",
    "cryptosuite": "example-signature-2022",
    "created": "2020-11-05T19:23:24Z",
    "verificationMethod": "https://di.example/issuer#z6MkjLrk3gKS2nnkeWcmcxi
      ZPGskmesDpuwRBorgHxUXfxnG",
    "proofPurpose": "assertionMethod",
    "proofValue": "zQeVbY4oey5q2M3XKaxup3tmzN4DRFTLVqpLMweBrSxMY2xHX5XTYV8nQA
      pmEcqaqA3Q1gVHMrXFkXJeV6doDwLWx"
  }
}

The example proof above uses a fictitious example-signature-2022 cryptography suite, which would produce a verifiable digital proof by transforming the input data using the JSON Canonicalization Scheme [RFC8785] and then digitally signing it using an Ed25519 elliptic curve signature.
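
The effect of such a transformation can be approximated in a few lines of Python. json.dumps with sorted keys and no insignificant whitespace is only a rough stand-in for JCS [RFC8785], which additionally fixes number serialization and sorts keys by UTF-16 code units, but it demonstrates the point: two serializations of the same data canonicalize to identical bytes, so a signature over one verifies against the other.

import json

def canonicalize(document):
    # Rough JCS stand-in: sorted keys, no insignificant whitespace.
    return json.dumps(document, separators=(",", ":"), sort_keys=True,
                      ensure_ascii=False).encode("utf-8")

a = {"title": "Hello world!", "author": "A. N. Other"}
b = {"author": "A. N. Other", "title": "Hello world!"}
assert canonicalize(a) == canonicalize(b)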

Similarly, a proof can be added to a JSON-LD data document like the following:

Example 3: A simple JSON-LD data document
{
  "@context": {"title": "https://schema.org/title"},
  "title": "Hello world!"
}

by adding the parameters outlined in this section:

Example 4: A simple signed JSON-LD data document
{
  "@context": [
    {"title": "https://schema.org/title"},
    "https://w3id.org/security/data-integrity/v1"
  ],
  "title": "Hello world!",
  "proof": {
    "type": "DataIntegrityProof",
    "cryptosuite": "eddsa-2022",
    "created": "2020-11-05T19:23:24Z",
    "verificationMethod": "https://ldi.example/issuer#z6MkjLrk3gKS2nnkeWcmcxi
      ZPGskmesDpuwRBorgHxUXfxnG",
    "proofPurpose": "assertionMethod",
    "proofValue": "z4oey5q2M3XKaxup3tmzN4DRFTLVqpLMweBrSxMY2xHX5XTYVQeVbY8nQA
      VHMrXFkXJpmEcqdoDwLWxaqA3Q1geV6"
  }
}

The proof example above uses the DataIntegrityProof type with the eddsa-2022 cryptosuite to produce a verifiable digital proof by canonicalizing the input data using the RDF Dataset Canonicalization algorithm [RDF-DATASET-C14N] and then digitally signing it using an Ed25519 elliptic curve signature.

Issue
Create a separate section detailing an optional mechanism for authenticating public key control via bi-directional links. How to establish trust in controllers is out of scope but examples can be given.
Issue
Specify algorithm agility mechanisms (additional attributes from the security vocab can be used to indicate other signing and hash algorithms). Rewrite algorithms to be parameterized on this basis and move Ed25519Signature2020 definition to a single supported mechanism; specify its identifier as a URL. In order to make it easy to specify a variety of combinations of algorithms, introduce a core type DataIntegrityProof that allows for easy filtering/discover of proof nodes, but that type on its own doesn't specify any default proof or hash algorithms, those need to be given via other properties in the nodes.
Issue: Avoid signature format proliferation by using text-based suite value

The pattern that Data Integrity Signatures use presently leads to a proliferation in signature types and JSON-LD Contexts. This proliferation can be avoided without any loss of the security characteristics of tightly binding a cryptography suite version to one or more acceptable public keys. The following signature suites are currently being contemplated: eddsa-2022, nist-ecdsa-2022, koblitz-ecdsa-2022, rsa-2022, pgp-2022, bbs-2022, eascdsa-2022, ibsa-2022, and jws-2022.

Example 5: A DataIntegrityProof example using a NIST ECDSA 2022 Cryptosuite
{
  "@context": ["https://w3id.org/security/data-integrity/v1"],
  "type": "DataIntegrityProof",
  "cryptosuite": "ecdsa-2022",
  "created": "2022-11-29T20:35:38Z",
  "verificationMethod": "did:example:123456789abcdefghi#keys-1",
  "proofPurpose": "assertionMethod",
  "proofValue": "z2rb7doJxczUFBTdV5F5pehtbUXPDUgKVugZZ99jniVXCUpojJ9PqLYV
                 evMeB1gCyJ4HqpnTyQwaoRPWaD3afEZboXCBTdV5F5pehtbUXPDUgKVugUpoj"
}
Issue
Add an explicit check on key type to prevent an attacker from selecting an algorithm that could abuse how the key is used/interpreted.
Issue
Add a note indicating that selective disclosure proof mechanisms can be compatible with Data Integrity; for example, an algorithm could produce a merkle tree from a canonicalized set of N-Quads and then sign the root hash. Disclosure would involve including the merkle paths for each N-Quad that is to be revealed. This mechanism would merely consume the normalized output differently (this, and the proof mechanism would be modifications to this core spec). It might also be necessary to generate proof parameters such as a private key/seed that can be used along with an algorithm to deterministically generate nonces that are concatenated with each N-Quad to prevent rainbow table or similar attacks.
Issue
Add a note indicating that this specification should not be construed to indicate that public key controllers should be restricted to a single public key or that systems that use this spec and involve real people should identify each person as only ever being a single entity rather than perhaps N entities with M keys. There are no such restrictions and in many cases those kinds of restrictions are ill-advised due to privacy considerations.

The Data Integrity specification supports the concept of multiple proofs in a single document. Two types of multi-proof approach are identified: Proof Sets (unordered) and Proof Chains (ordered).

2.1.1 Proof Sets

A proof set is useful when the same data needs to be secured by multiple entities, but where the order of proofs does not matter, such as in the case of a set of signatures on a contract. A proof set, which has no order, is represented by associating a set of proofs with the proof key in a document.

Example 6: A proof set in a data document
{
  "@context": [
    {"title": "https://schema.org/title"},
    "https://w3id.org/security/data-integrity/v1"
  ],
  "title": "Hello world!",
  "proof": [{
    "type": "DataIntegrityProof",
    "cryptosuite": "eddsa-2022",
    "created": "2020-11-05T19:23:24Z",
    "verificationMethod": "https://ldi.example/issuer/1#z6MkjLrk3gKS2nnkeWcmcxi
      ZPGskmesDpuwRBorgHxUXfxnG",
    "proofPurpose": "assertionMethod",
    "proofValue": "z4oey5q2M3XKaxup3tmzN4DRFTLVqpLMweBrSxMY2xHX5XTYVQeVbY8nQA
      VHMrXFkXJpmEcqdoDwLWxaqA3Q1geV6"
  }, {
    "type": "DataIntegrityProof",
    "cryptosuite": "eddsa-2022",
    "created": "2020-11-05T13:08:49Z",
    "verificationMethod": "https://pfps.example/issuer/2#z6MkGskxnGjLrk3gKS2mes
      DpuwRBokeWcmrgHxUXfnncxiZP",
    "proofPurpose": "assertionMethod",
    "proofValue": "z5QLBrp19KiWXerb8ByPnAZ9wujVFN8PDsxxXeMoyvDqhZ6Qnzr5CG9876
      zNht8BpStWi8H2Mi7XCY3inbLrZrm95"
  }]
}
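
Because the proofs in a set are independent, verification order is irrelevant. The following non-normative sketch assumes a hypothetical verify_single function implementing the relevant cryptosuite's verification algorithm:

def verify_proof_set(signed_document, verify_single):
    # Each proof in the set must verify independently against the
    # document with the "proof" member removed.
    document = {k: v for k, v in signed_document.items() if k != "proof"}
    proofs = signed_document["proof"]
    if not isinstance(proofs, list):
        proofs = [proofs]
    return all(verify_single(document, proof) for proof in proofs)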

2.1.2 Proof Chains

A proof chain is useful when the same data needs to be signed by multiple entities and the order of when the proofs occurred matters, such as in the case of a notary counter-signing a proof that had been created on a document. A proof chain, where order needs to be preserved, is represented by associating an ordered list of proofs with the proofChain key in a document.

Example 7: A proof chain in a data document
{
  "@context": [
    {"title": "https://schema.org/title"},
    "https://w3id.org/security/data-integrity/v1"
  ],
  "title": "Hello world!",
  "proofChain": [{
    "type": "DataIntegrityProof",
    "cryptosuite": "eddsa-2022",
    "created": "2020-11-05T19:23:42Z",
    "verificationMethod": "https://ldi.example/issuer/1#z6MkjLrk3gKS2nnkeWcmcxi
      ZPGskmesDpuwRBorgHxUXfxnG",
    "proofPurpose": "assertionMethod",
    "proofValue": "zVbY8nQAVHMrXFkXJpmEcqdoDwLWxaqA3Q1geV64oey5q2M3XKaxup3tmzN4
      DRFTLVqpLMweBrSxMY2xHX5XTYVQe"
  }, {
    "type": "DataIntegrityProof",
    "cryptosuite": "eddsa-2022",
    "created": "2020-11-05T21:28:14Z",
    "verificationMethod": "https://pfps.example/issuer/2#z6MkGskxnGjLrk3gKS2mes
      DpuwRBokeWcmrgHxUXfnncxiZP",
    "proofPurpose": "assertionMethod",
    "proofValue": "z6Qnzr5CG9876zNht8BpStWi8H2Mi7XCY3inbLrZrm955QLBrp19KiWXerb8
      ByPnAZ9wujVFN8PDsxxXeMoyvDqhZ"
  }]
}
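
This section does not define a chain-verification algorithm, but one plausible approach, sketched below under that assumption with the same hypothetical verify_single function as before, is to check each proof against the document combined with the proofs that precede it, so that later proofs counter-sign earlier ones:

def verify_proof_chain(signed_document, verify_single):
    document = {k: v for k, v in signed_document.items()
                if k != "proofChain"}
    chain = signed_document["proofChain"]
    for i, proof in enumerate(chain):
        covered = dict(document)
        if i > 0:
            covered["proofChain"] = chain[:i]  # earlier proofs are covered
        if not verify_single(covered, proof):
            return False
    return True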

2.2 Proof Purposes

A proof that describes its purpose helps prevent it from being misused for some other purpose.

Issue
Add a mention of JWK's key_ops parameter and WebCrypto's KeyUsage restrictions; explain that Proof Purpose serves a similar goal but allows for finer-grained restrictions.

The following is a list of commonly used proof purpose values.

authentication
Indicates that a given proof is only to be used for the purposes of an authentication protocol.
assertionMethod
Indicates that a proof can only be used for making assertions, for example signing a Verifiable Credential.
keyAgreement
Indicates that a proof is used for key agreement protocols, such as Elliptic Curve Diffie-Hellman key agreement used by popular encryption libraries.
capabilityDelegation
Indicates that the proof can only be used for delegating capabilities. See the Authorization Capabilities [ZCAP] specification for more detail.
capabilityInvocation
Indicates that the proof can only be used for invoking capabilities. See the Authorization Capabilities [ZCAP] specification for more detail.

Note: The Authorization Capabilities [ZCAP] specification defines the capabilityInvocation and capabilityDelegation proof purposes in greater detail.
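
A sketch of the safeguard these values provide: a verifier can require that the verification method named by a proof appears under the verification relationship matching the proof's proofPurpose in the corresponding controller document (see 2.3.2). This is non-normative and assumes the controller document has already been retrieved.

def method_allowed_for_purpose(controller_document, proof):
    # Entries under a verification relationship are either embedded
    # maps or URL string references (see 2.3.1.3).
    for entry in controller_document.get(proof["proofPurpose"], []):
        entry_id = entry["id"] if isinstance(entry, dict) else entry
        if entry_id == proof["verificationMethod"]:
            return True
    return False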

2.3 Controller Documents

A controller document is a set of data that specifies one or more relationships between a controller and a set of data, such as a set of public cryptographic keys. The controller document SHOULD contain verification relationships that explicitly permit the use of certain verification methods for specific purposes.

Issue
Add examples of common Controller documents, such as controller documents published on a ledger-based registry, or on a mutable medium in combination with an integrity protection mechanism such as Hashlinks.

2.3.1 Verification Methods

A controller document can express verification methods, such as cryptographic public keys, which can be used to authenticate or authorize interactions with the controller or associated parties. For example, a cryptographic public key can be used as a verification method with respect to a digital signature; in such usage, it verifies that the signer could use the associated cryptographic private key. Verification methods might take many parameters. An example of this is a set of five cryptographic keys from which any three are required to contribute to a cryptographic threshold signature.

verificationMethod

The verificationMethod property is OPTIONAL. If present, the value MUST be a set of verification methods, where each verification method is expressed using a map. The verification method map MUST include the id, type, controller, and specific verification material properties that are determined by the value of type and are defined in 2.3.1.1 Verification Material. A verification method MAY include additional properties. Verification methods SHOULD be registered in the Data Integrity Specification Registries [TBD - DIS-REGISTRIES].

id

The value of the id property for a verification method MUST be a string that conforms to the [URL] syntax.

type
The value of the type property MUST be a string that references exactly one verification method type. In order to maximize global interoperability, the verification method type SHOULD be registered in the Data Integrity Specification Registries [TBD -- DIS-REGISTRIES].
controller
The value of the controller property MUST be a string that conforms to the [URL] syntax.
Example 8: Example verification method structure
{
  "@context": [
    "https://www.w3.org/ns/did/v1",
    "https://w3id.org/security/data-integrity/v1"
  ],
  "id": "did:example:123456789abcdefghi",
  ...
  "verificationMethod": [{
    "id": ...,
    "type": ...,
    "controller": ...,
    "publicKeyJwk": ...
  }, {
    "id": ...,
    "type": ...,
    "controller": ...,
    "publicKeyMultibase": ...
  }]
}
Note: Verification method controller(s) and controller document controller(s)

The semantics of the controller property are the same when the subject of the relationship is the controller document as when the subject of the relationship is a verification method, such as a cryptographic public key. Since a key can't control itself, and the key controller cannot be inferred from the controller document, it is necessary to explicitly express the identity of the controller of the key. The difference is that the value of controller for a verification method is not necessarily a controller of the controller document; controllers of the controller document are expressed using the controller property at the highest level of the controller document.

2.3.1.1 Verification Material

Verification material is any information that is used by a process that applies a verification method. The type of a verification method is expected to be used to determine its compatibility with such processes. Examples of verification material properties are publicKeyJwk or publicKeyMultibase. A cryptographic suite specification is responsible for specifying the verification method type and its associated verification material. For example, see JSON Web Signature 2020 and Ed25519 Signature 2020. For all registered verification method types and associated verification material available for controllers, please see the Data Integrity Specification Registries [TBD - DIS-REGISTRIES].

Issue

Ensuring that cryptographic suites are versioned and tightly scoped to a very small set of possible key types and signature schemes (ideally one key type and size and one signature output type) is a design goal for most Data Integrity cryptographic suites. Historically, this has been done by defining both the key type and the cryptographic suite that uses the key type in the same specification. The downside of doing so, however, is that there might be a proliferation of different key types in multikey that result in different cryptosuites defining the same key material differently. For example, one cryptosuite might use compressed Curve P-256 keys while another uses uncompressed values. If that occurs, it will harm interoperability. It will be important in the coming months to years to ensure that this does not happen by fully defining the multikey format in a separate specification so cryptosuite specifications, such as this one, can refer to the multikey specification, thus reducing the chances of multikey type proliferation and improving the chances of maximum interoperability for the multikey format.

To increase the likelihood of interoperable implementations, this specification limits the number of formats for expressing verification material in a controller document. The fewer formats that implementers have to implement, the more likely it will be that they will support all of them. This approach attempts to strike a delicate balance between ease of implementation and supporting formats that have historically had broad deployment. Two supported verification material properties are listed below:

publicKeyJwk

The publicKeyJwk property is OPTIONAL. If present, the value MUST be a map representing a JSON Web Key that conforms to [RFC7517]. The map MUST NOT contain "d", or any other members of the private information class as described in Registration Template. It is RECOMMENDED that verification methods that use JWKs [RFC7517] to represent their public keys use the value of kid as their fragment identifier. It is RECOMMENDED that JWK kid values are set to the public key fingerprint [RFC7638]. See the first key in Example 9 for an example of a public key with a compound key identifier.

publicKeyMultibase

The publicKeyMultibase property is OPTIONAL. This feature is non-normative. If present, the value MUST be a string representation of a [MULTIBASE] encoded public key.

Note that the [MULTIBASE] specification is not yet a standard and is subject to change. There might be some use cases for this data format where publicKeyMultibase is defined, to allow for expression of public keys, but privateKeyMultibase is not defined, to protect against accidental leakage of secret keys.

A verification method MUST NOT contain multiple verification material properties for the same material. For example, expressing key material in a verification method using both publicKeyJwk and publicKeyMultibase at the same time is prohibited.
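
A minimal check for this constraint, assuming (for illustration only) that the two properties above are the only verification material properties in play, might look like:

def has_exactly_one_material(verification_method):
    present = [name for name in ("publicKeyJwk", "publicKeyMultibase")
               if name in verification_method]
    return len(present) == 1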

An example of a controller document containing verification methods using both properties above is shown below.

Example 9: Verification methods using publicKeyJwk and publicKeyMultibase
{
  "@context": [
    "https://www.w3.org/ns/did/v1",
    "https://w3id.org/security/suites/jws-2020/v1",
    "https://w3id.org/security/multikey/v1"
  ],
  "id": "did:example:123456789abcdefghi",
  ...
  "verificationMethod": [{
    "id": "did:example:123#_Qq0UL2Fq651Q0Fjd6TvnYE-faHiOpRlPVQcY_-tA4A",
    "type": "JsonWebKey2020", // external (property value)
    "controller": "did:example:123",
    "publicKeyJwk": {
      "crv": "Ed25519", // external (property name)
      "x": "VCpo2LMLhn6iWku8MKvSLg2ZAoC-nlOyPVQaO3FxVeQ", // external (property name)
      "kty": "OKP", // external (property name)
      "kid": "_Qq0UL2Fq651Q0Fjd6TvnYE-faHiOpRlPVQcY_-tA4A" // external (property name)
    }
  }, {
    "id": "did:example:123456789abcdefghi#keys-1",
    "type": "Multikey", // external (property value)
    "controller": "did:example:pqrstuvwxyz0987654321",
    "publicKeyMultibase": "z6MkmM42vxfqZQsv4ehtTjFFxQ4sQKS2w6WR7emozFAn5cxu"
  }],
  ...
}
2.3.1.2 Multikey

The Multikey data model is a specific type of verification method that utilizes the [MULTICODEC] specification to encode key types into a single binary stream that is then encoded using the [MULTIBASE] specification. To encode a Multikey, the verification method type MUST be set to Multikey and the publicKeyMultibase value MUST be a [MULTIBASE] encoded [MULTICODEC] value. An example of a Multikey is provided below:

Example 10: Multikey encoding of an Ed25519 public key
{
  "@context": ["https://w3id.org/security/multikey/v1"],
  "id": "did:example:123456789abcdefghi#keys-1",
  "type": "Multikey",
  "controller": "did:example:123456789abcdefghi",
  "publicKeyMultibase": "z6MkmM42vxfqZQsv4ehtTjFFxQ4sQKS2w6WR7emozFAn5cxu"
}

In the example above, the publicKeyMultibase value starts with the letter z, which is the [MULTIBASE] header that conveys that the binary data is base58-encoded using the Bitcoin base-encoding alphabet. The decoded binary data [MULTICODEC] header is 0xed, which specifies that the remaining data is a 32-byte raw Ed25519 public key.

The Multikey data model is also capable of encoding secret keys, sometimes referred to as private keys.

Example 11: Multikey encoding of an Ed25519 secret key
{
  "@context": ["https://w3id.org/security/suites/secrets/v1"],
  "id": "did:example:123456789abcdefghi#keys-1",
  "type": "Multikey",
  "controller": "did:example:123456789abcdefghi",
  "secretKeyMultibase": "z3u2fprgdREFtGakrHr6zLyTeTEZtivDnYCPZmcSt16EYCER"
}

In the example above, the secretKeyMultibase value starts with the letter z, which is the [MULTIBASE] header that conveys that the binary data is base58-encoded using the Bitcoin base-encoding alphabet. The decoded binary data [MULTICODEC] header is 0x1300, which specifies that the remaining data is a 32-byte raw Ed25519 private key.
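
The decoding described above can be sketched without any dependencies: strip the multibase header, base58btc-decode, then read the unsigned-varint multicodec header (0xed for an Ed25519 public key, 0x1300 for an Ed25519 secret key, per the preceding paragraphs). The helper below is illustrative and handles only the base58btc ('z') case, not the full [MULTIBASE] specification.

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58btc_decode(text):
    number = 0
    for char in text:
        number = number * 58 + B58_ALPHABET.index(char)
    pad = len(text) - len(text.lstrip("1"))  # leading '1's encode zero bytes
    return b"\x00" * pad + number.to_bytes((number.bit_length() + 7) // 8, "big")

def decode_multikey(value):
    assert value[0] == "z", "only base58btc is handled in this sketch"
    data = base58btc_decode(value[1:])
    code, shift, index = 0, 0, 0
    while True:  # read the unsigned-varint multicodec header
        byte = data[index]
        code |= (byte & 0x7F) << shift
        index += 1
        if not byte & 0x80:
            break
        shift += 7
    return code, data[index:]

code, raw_key = decode_multikey("z6MkmM42vxfqZQsv4ehtTjFFxQ4sQKS2w6WR7emozFAn5cxu")
print(hex(code), len(raw_key))  # expect 0xed and a 32-byte raw Ed25519 key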

2.3.1.3 Referring to Verification Methods

Verification methods can be embedded in or referenced from properties associated with various verification relationships as described in 2.3.2 Verification Relationships. Referencing verification methods allows them to be used by more than one verification relationship.

If the value of a verification method property is a map, the verification method has been embedded and its properties can be accessed directly. However, if the value is a URL string, the verification method has been included by reference and its properties will need to be retrieved from elsewhere in the controller document or from another controller document. This is done by dereferencing the URL and searching the resulting resource for a verification method map with an id property whose value matches the URL.
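
A sketch of this dual handling follows; for simplicity it only searches a controller document already in hand, leaving full URL dereferencing (which may involve fetching another controller document) out of scope:

def resolve_verification_method(entry, controller_document):
    if isinstance(entry, dict):
        return entry  # embedded: properties are available directly
    # referenced: search the controller document for a matching id
    for method in controller_document.get("verificationMethod", []):
        if method.get("id") == entry:
            return method
    raise ValueError("verification method not found: " + entry)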

Example 12: Embedding and referencing verification methods
    {
...

      "authentication": [
        // this key is referenced and might be used by
        // more than one verification relationship
        "did:example:123456789abcdefghi#keys-1",
        // this key is embedded and may *only* be used for authentication
        {
          "id": "did:example:123456789abcdefghi#keys-2",
          "type": "Multikey", // external (property value)
          "controller": "did:example:123456789abcdefghi",
          "publicKeyMultibase": "z6MkmM42vxfqZQsv4ehtTjFFxQ4sQKS2w6WR7emozFAn5cxu"
        }
      ],

...
    }

2.3.2 Verification Relationships

A verification relationship expresses the relationship between the controller and a verification method.

Different verification relationships enable the associated verification methods to be used for different purposes. It is up to a verifier to ascertain the validity of a verification attempt by checking that the verification method used is contained in the appropriate verification relationship property of the controller document.

The verification relationship between the controller and the verification method is explicit in the controller document. Verification methods that are not associated with a particular verification relationship cannot be used for that verification relationship. For example, a verification method in the value of the authentication property cannot be used to engage in key agreement protocols with the controller—the value of the keyAgreement property needs to be used for that.

The controller document does not express revoked keys using a verification relationship. If a referenced verification method is not in the latest controller document used to dereference it, then that verification method is considered invalid or revoked.

The following sections define several useful verification relationships. A controller document MAY include any of these, or other properties, to express a specific verification relationship. In order to maximize global interoperability, any such properties used SHOULD be registered in the Data Integrity Specification Registries [TBD: DIS-REGISTRIES].

2.3.2.1 Authentication

The authentication verification relationship is used to specify how the controller is expected to be authenticated, for purposes such as logging into a website or engaging in any sort of challenge-response protocol.

authentication
The authentication property is OPTIONAL. If present, the associated value MUST be a set of one or more verification methods. Each verification method MAY be embedded or referenced.
Example 13: Authentication property containing three verification methods
{
  "@context": [
    "https://www.w3.org/ns/did/v1",
    "https://w3id.org/security/multikey/v1"
  ],
  "id": "did:example:123456789abcdefghi",
  ...
  "authentication": [
    // this method can be used to authenticate as did:...fghi
    "did:example:123456789abcdefghi#keys-1",
    // this method is *only* approved for authentication, it may not
    // be used for any other proof purpose, so its full description is
    // embedded here rather than using only a reference
    {
      "id": "did:example:123456789abcdefghi#keys-2",
      "type": "Multikey",
      "controller": "did:example:123456789abcdefghi",
      "publicKeyMultibase": "z6MkmM42vxfqZQsv4ehtTjFFxQ4sQKS2w6WR7emozFAn5cxu"
    }
  ],
  ...
}

If authentication is established, it is up to the application to decide what to do with that information.

This is useful to any authentication verifier that needs to check to see if an entity that is attempting to authenticate is, in fact, presenting a valid proof of authentication. When a verifier receives some data (in some protocol-specific format) that contains a proof that was made for the purpose of "authentication", and that says that an entity is identified by the id, then that verifier checks to ensure that the proof can be verified using a verification method (e.g., public key) listed under authentication in the controller document.

Note that the verification method indicated by the authentication property of a controller document can only be used to authenticate the controller. To authenticate a different controller, the entity associated with the value of controller needs to authenticate with its own controller document and associated authentication verification relationship.

2.3.2.2 Assertion

The assertionMethod verification relationship is used to specify how the controller is expected to express claims, such as for the purposes of issuing a Verifiable Credential [VC-DATA-MODEL].

assertionMethod
The assertionMethod property is OPTIONAL. If present, the associated value MUST be a set of one or more verification methods. Each verification method MAY be embedded or referenced.

This property is useful, for example, during the processing of a verifiable credential by a verifier. During verification, a verifier checks to see if a verifiable credential contains a proof created by the controller by checking that the verification method used to assert the proof is associated with the assertionMethod property in the corresponding controller document.

Example 14: Assertion method property containing two verification methods
{
  "@context": [
    "https://www.w3.org/ns/did/v1",
    "https://w3id.org/security/multikey/v1"
  ],
  "id": "did:example:123456789abcdefghi",
  ...
  "assertionMethod": [
    // this method can be used to assert statements as did:...fghi
    "did:example:123456789abcdefghi#keys-1",
    // this method is *only* approved for assertion of statements, it is not
    // used for any other verification relationship, so its full description is
    // embedded here rather than using a reference
    {
      "id": "did:example:123456789abcdefghi#keys-2",
      "type": "Multikey", // external (property value)
      "controller": "did:example:123456789abcdefghi",
      "publicKeyMultibase": "z6MkmM42vxfqZQsv4ehtTjFFxQ4sQKS2w6WR7emozFAn5cxu"
    }
  ],
  ...
}
2.3.2.3 Key Agreement

The keyAgreement verification relationship is used to specify how an entity can generate encryption material in order to transmit confidential information intended for the controller, such as for the purposes of establishing a secure communication channel with the recipient.

keyAgreement
The keyAgreement property is OPTIONAL. If present, the associated value MUST be a set of one or more verification methods. Each verification method MAY be embedded or referenced.

An example of when this property is useful is when encrypting a message intended for the controller. In this case, the counterparty uses the cryptographic public key information in the verification method to wrap a decryption key for the recipient.

Example 15: Key agreement property containing two verification methods
{
  "@context": "https://www.w3.org/ns/did/v1",
  "id": "did:example:123456789abcdefghi",
  ...
  "keyAgreement": [
    // this method can be used to perform key agreement as did:...fghi
    "did:example:123456789abcdefghi#keys-1",
    // this method is *only* approved for key agreement usage, it will not
    // be used for any other verification relationship, so its full description is
    // embedded here rather than using only a reference
    {
      "id": "did:example:123#zC9ByQ8aJs8vrNXyDhPHHNNMSHPcaSgNpjjsBYpMMjsTdS",
      "type": "X25519KeyAgreementKey2019", // external (property value)
      "controller": "did:example:123",
      "publicKeyMultibase": "z6LSn6p3HRxx1ZZk1dT9VwcfTBCYgtNWdzdDMKPZjShLNWG7"
    }
  ],
  ...
}
2.3.2.4 Capability Invocation

The capabilityInvocation verification relationship is used to specify a verification method that might be used by the controller to invoke a cryptographic capability, such as the authorization to update the controller document.

capabilityInvocation
The capabilityInvocation property is OPTIONAL. If present, the associated value MUST be a set of one or more verification methods. Each verification method MAY be embedded or referenced.

An example of when this property is useful is when a controller needs to access a protected HTTP API that requires authorization in order to use it. In order to authorize when using the HTTP API, the controller uses a capability that is associated with a particular URL that is exposed via the HTTP API. The invocation of the capability could be expressed in a number of ways, e.g., as a digitally signed message that is placed into the HTTP Headers.

The server providing the HTTP API is the verifier of the capability and it would need to verify that the verification method referred to by the invoked capability exists in the capabilityInvocation property of the controller document. The verifier would also check to make sure that the action being performed is valid and the capability is appropriate for the resource being accessed. If the verification is successful, the server has cryptographically determined that the invoker is authorized to access the protected resource.

Example 16: Capability invocation property containing two verification methods
{
  "@context": [
    "https://www.w3.org/ns/did/v1",
    "https://w3id.org/security/multikey/v1"
  ],
  "id": "did:example:123456789abcdefghi",
  ...
  "capabilityInvocation": [
    // this method can be used to invoke capabilities as did:...fghi
    "did:example:123456789abcdefghi#keys-1",
    // this method is *only* approved for capability invocation usage, it will not
    // be used for any other verification relationship, so its full description is
    // embedded here rather than using only a reference
    {
    "id": "did:example:123456789abcdefghi#keys-2",
    "type": "Multikey", // external (property value)
    "controller": "did:example:123456789abcdefghi",
    "publicKeyMultibase": "z6MkmM42vxfqZQsv4ehtTjFFxQ4sQKS2w6WR7emozFAn5cxu"
    }
  ],
  ...
}
2.3.2.5 Capability Delegation

The capabilityDelegation verification relationship is used to specify a mechanism that might be used by the controller to delegate a cryptographic capability to another party, such as delegating the authority to access a specific HTTP API to a subordinate.

capabilityDelegation
The capabilityDelegation property is OPTIONAL. If present, the associated value MUST be a set of one or more verification methods. Each verification method MAY be embedded or referenced.

An example of when this property is useful is when a controller chooses to delegate their capability to access a protected HTTP API to a party other than themselves. In order to delegate the capability, the controller would use a verification method associated with the capabilityDelegation verification relationship to cryptographically sign the capability over to another controller. The delegate would then use the capability in a manner that is similar to the example described in 2.3.2.4 Capability Invocation.

Example 17: Capability Delegation property containing two verification methods
{
  "@context": [
    "https://www.w3.org/ns/did/v1",
    "https://w3id.org/security/multikey/v1"
  ],
  "id": "did:example:123456789abcdefghi",
  ...
  "capabilityDelegation": [
    // this method can be used to perform capability delegation as did:...fghi
    "did:example:123456789abcdefghi#keys-1",
    // this method is *only* approved for granting capabilities; it will not
    // be used for any other verification relationship, so its full description is
    // embedded here rather than using only a reference
    {
    "id": "did:example:123456789abcdefghi#keys-2",
    "type": "Multikey", // external (property value)
    "controller": "did:example:123456789abcdefghi",
    "publicKeyMultibase": "z6MkmM42vxfqZQsv4ehtTjFFxQ4sQKS2w6WR7emozFAn5cxu"
    }
  ],
  ...
}

2.4 Relationship to Linked Data

The term Linked Data is used to describe a recommended best practice for exposing, sharing, and connecting information on the Web using standards, such as URLs, to identify things and their properties. When information is presented as Linked Data, other related information can be easily discovered and new information can be easily linked to it. Linked Data is extensible in a decentralized way, greatly reducing barriers to large scale integration.

With the increase in usage of Linked Data for a variety of applications, there is a need to be able to verify the authenticity and integrity of Linked Data documents. This specification adds authentication and integrity protection to data documents through the use of mathematical proofs without sacrificing Linked Data features such as extensibility and composability.

Note: Use of Linked Data is an optional feature

While this specification provides mechanisms to digitally sign Linked Data, the use of Linked Data is not necessary to gain some of the advantages provided by this specification.

3. Cryptographic Suites

A data integrity proof is designed to be easy to use by developers and therefore strives to minimize the amount of information one has to remember to generate a proof. Often, just the cryptographic suite name (e.g. eddsa-2022) is required from developers to initiate the creation of a proof. These cryptographic suites are often created or reviewed by people that have the requisite cryptographic training to ensure that safe combinations of cryptographic primitives are used. This section specifies the requirements for authoring cryptographic suite specifications.

The requirements for all data integrity cryptographic suite specifications are as follows:

Issue: Require interoperability report?

The following language was deemed to be contentious: The specification MUST provide a link to an interoperability test report to document which implementations are conformant with the cryptographic suite specification.

The Working Group is seeking feedback on whether or not this is desired given the important role that cryptographic suite specifications play in ensuring data integrity.

3.1 DataIntegrityProof

A number of cryptographic suites follow the same basic pattern when expressing a data integrity proof. This section specifies that general design pattern, a cryptographic suite type called a DataIntegrityProof, which reduces the burden of writing and implementing cryptographic suites through the reuse of design primitives and source code.

When specifying a cryptographic suite that utilizes this design pattern, the proof value takes the following form:

type
The type property MUST contain the string DataIntegrityProof.
cryptosuite
The cryptosuite property MUST contain a string specifying the name of the cryptosuite.
proofValue
The proofValue property MUST be used, as specified in 2.1 Proofs.

Cryptographic suite designers MUST use mandatory proof value properties defined in Section 2.1 Proofs, and MAY define other properties specific to their cryptographic suite.
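
Under this pattern, assembling the proof envelope is mechanical. The non-normative sketch below uses the eddsa-2022 cryptosuite name from the earlier examples and leaves computation of proofValue to the cryptosuite's proof algorithm:

from datetime import datetime, timezone

def data_integrity_proof(cryptosuite, verification_method, purpose, proof_value):
    return {
        "type": "DataIntegrityProof",   # fixed by this section
        "cryptosuite": cryptosuite,     # e.g. "eddsa-2022"
        "created": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "verificationMethod": verification_method,
        "proofPurpose": purpose,
        "proofValue": proof_value,
    }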

4. Algorithms

The algorithms defined below are generalized in that they require a specific canonicalization algorithm, message digest algorithm, and proof algorithm to be used to achieve the algorithm's intended outcome.

4.1 Create Proof Algorithm

Issue
The proof parameters should be included as headers and values in the data to be signed.

The following algorithm specifies how to create a digital proof that can be later used to verify the authenticity and integrity of an unsigned data document. An unsigned data document, document, proof options, options, and a private key, privateKey, are required inputs. The proof options MUST contain an identifier for the public/private key pair and an [ISO8601] combined date and time string, created, containing the current date and time, accurate to at least one second, in Coordinated Universal Time (UTC) format. A domain might also be specified in the options. A signed data document is produced as output. Whenever this algorithm encodes strings, it MUST use UTF-8 encoding. A sketch of these steps in code follows the numbered list.

  1. Create a copy of document, hereafter referred to as output.
  2. Generate a canonicalized document by canonicalizing document according to a canonicalization algorithm (e.g. the URDNA2015 [RDF-DATASET-C14N] algorithm).
  3. Create a value tbs that represents the data to be signed, and set it to the result of running the Create Verify Hash Algorithm, passing the information in options.
  4. Digitally sign tbs using the privateKey and the digital proof algorithm. The resulting string is the proofValue.
  5. Add a proof node to output containing a data integrity proof using the appropriate type and proofValue values as well as all of the data in the proof options (e.g. created, and if given, any additional proof options such as domain).
  6. Return output as the signed data document.
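
A non-normative sketch of the six steps, parameterized on the canonicalization function as this section requires; create_verify_hash is the algorithm from 4.3, and privateKey is assumed to be an Ed25519 private key object (for example, from the pyca/cryptography package) whose sign method takes bytes and returns bytes. The hex encoding of proofValue is a simplification; real suites define their own encoding.

def create_proof(document, options, private_key, canonicalize,
                 create_verify_hash):
    output = dict(document)                                 # step 1
    canonical_document = canonicalize(document)             # step 2
    tbs = create_verify_hash(canonical_document, options,
                             canonicalize)                  # step 3
    proof_value = private_key.sign(tbs)                     # step 4
    proof = dict(options)                                   # step 5
    proof["proofValue"] = proof_value.hex()
    output["proof"] = proof
    return output                                           # step 6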

4.2 Verify Proof Algorithm

Issue

This algorithm is highly specific to digital signatures and needs to be generalized to other proof mechanisms such as Equihash.

The following algorithm specifies how to check the authenticity and integrity of a signed data document by verifying its digital proof. This algorithm takes a signed data document, signed document, and outputs a true or false value based on whether or not the digital proof on signed document was verified. Whenever this algorithm encodes strings, it MUST use UTF-8 encoding. A sketch of these steps in code follows the numbered list.

Issue
Specify how the public key can be obtained: through some out-of-band process and passed in, or retrieved by dereferencing its URL identifier, etc.
  1. Get the public key by dereferencing its URL identifier in the proof node of the default graph of signed document. Confirm that the unsigned data document that describes the public key specifies its controller and that its controller's URL identifier can be dereferenced to reveal a bi-directional link back to the key. Ensure that the key's controller is a trusted entity before proceeding to the next step.
  2. Let document be a copy of signed document.
  3. Remove any proof nodes from the default graph in document and save it as proof.
  4. Generate a canonicalized document by canonicalizing document according to the canonicalization algorithm (e.g. the URDNA2015 [RDF-DATASET-C14N] algorithm).
  5. Create a value tbv that represents the data to be verified, and set it to the result of running the Create Verify Hash Algorithm, passing the information in proof.
  6. Pass the proofValue, tbv, and the public key to the proof algorithm. Return the resulting boolean value.
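
A matching non-normative sketch of steps 2 through 6 (step 1, establishing trust in the key's controller, is an out-of-band process per the open issue above); public_key is assumed to be an Ed25519 public key object from the pyca/cryptography package, paired with the create_proof sketch in 4.1:

from cryptography.exceptions import InvalidSignature

def verify_proof(signed_document, public_key, canonicalize,
                 create_verify_hash):
    document = dict(signed_document)                        # step 2
    proof = document.pop("proof")                           # step 3
    canonical_document = canonicalize(document)             # step 4
    tbv = create_verify_hash(canonical_document, proof,
                             canonicalize)                  # step 5
    try:                                                    # step 6
        public_key.verify(bytes.fromhex(proof["proofValue"]), tbv)
        return True
    except InvalidSignature:
        return False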

4.3 Create Verify Hash Algorithm

Issue

This algorithm is too specific to digital signatures and needs to be generalized for algorithms such as Equihash.

The following algorithm specifies how to create the data that is used to generate or verify a digital proof. It takes a canonicalized unsigned data document, canonicalized document, a canonicalization algorithm, a message digest algorithm, and proof options, input options (by reference). The proof options MUST contain an identifier for the public/private key pair and an [ISO8601] combined date and time string, created, containing the current date and time, accurate to at least one second, in Coordinated Universal Time (UTC) format. A domain might also be specified in the options. Its output is data that can be used to generate or verify a digital proof (it is usually further hashed as part of the verification or signing process). A sketch of these steps in code follows the numbered list.

  1. Let options be a copy of input options.
  2. If the proofValue parameter, such as jws, exists in options, remove the entry.
  3. If created does not exist in options, add an entry with a value that is an [ISO8601] combined date and time string containing the current date and time, accurate to at least one second, in Coordinated Universal Time (UTC) format. For example: 2017-11-13T20:21:34Z.
  4. Generate output by:
    1. Creating a canonicalized options document by canonicalizing options according to the canonicalization algorithm (e.g. the URDNA2015 [RDF-DATASET-C14N] algorithm).
    2. Hash canonicalized options document using the message digest algorithm (e.g. SHA-256) and set output to the result.
    3. Hash canonicalized document using the message digest algorithm (e.g. SHA-256) and append it to output.
  5. Issue
    This last step needs further clarification. Signing implementations usually automatically perform their own integrated hashing of an input message, i.e. signing algorithms are a combination of a raw signing mechanism and a hashing mechanism such as RS256 (RSA + SHA-256). Current implementations of RSA-based data integrity proof suites therefore do not perform this last step before passing the data to a signing algorithm as it will be performed internally. The Ed25519Proof2018 algorithm also does not perform this last step -- and, in fact, uses SHA-512 internally. In short, this last step should better communicate that the 64 bytes produced from concatenating the SHA-256 of the canonicalized options with the SHA-256 of the canonicalized document are passed into the signing algorithm with a presumption that the signing algorithm will include hashing of its own.
    Note: It is presumed that the 64-byte output will be used in a signing algorithm that includes its own hashing algorithm, such as RS256 (RSA + SHA-256) or EdDSA (Ed25519, which uses SHA-512 internally).
  6. Return output.
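
A non-normative sketch of this algorithm with SHA-256 assumed as the message digest: the canonicalized proof options and the canonicalized document are hashed separately, and the two 32-byte digests are concatenated into the 64-byte value discussed in the note above.

import hashlib
from datetime import datetime, timezone

def create_verify_hash(canonical_document, input_options, canonicalize):
    options = dict(input_options)                           # step 1
    options.pop("proofValue", None)                         # step 2
    options.pop("jws", None)
    options.setdefault("created", datetime.now(timezone.utc)
                       .strftime("%Y-%m-%dT%H:%M:%SZ"))     # step 3
    output = hashlib.sha256(canonicalize(options)).digest() # steps 4.1-4.2
    output += hashlib.sha256(canonical_document).digest()   # step 4.3
    return output                                           # step 6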

5. Security Considerations

The following section describes security considerations that developers implementing this specification should be aware of in order to create secure software.

Issue
TODO: We need to add a complete list of security considerations.

5.1 Versioning Cryptography Suites

Cryptography secures information through the use of secrets. Knowledge of the necessary secret makes it computationally easy to access certain information. The same information can be accessed if a computationally difficult, brute-force effort successfully guesses the secret. All modern cryptography requires the computationally difficult approach to remain difficult throughout time, which does not always hold due to breakthroughs in science and mathematics. That is to say, cryptography has a shelf life.

This specification plans for the obsolescence of all cryptographic approaches by asserting that whatever cryptography is in use today is highly likely to be broken over time. Software systems have to be able to change the cryptography in use over time in order to continue to secure information. Such changes might involve increasing required secret sizes or modifications to the cryptographic primitives used. However, some combinations of cryptographic parameters might actually reduce security. Given these assumptions, systems need to be able to distinguish different combinations of safe cryptographic parameters, also known as cryptographic suites, from one another. When identifying or versioning cryptographic suites, there are several approaches that can be taken which include: parameters, numbers, and dates.

Parametric versioning specifies the particular cryptographic parameters that are employed in a cryptographic suite. For example, one could use an identifier such as RSASSA-PKCS1-v1_5-SHA1. The benefit to this scheme is that a well-trained cryptographer will be able to determine all of the parameters in play by the identifier. The drawback to this scheme is that most of the population that uses these sorts of identifiers are not well trained and thus will not understand that the previously mentioned identifier is a cryptographic suite that is no longer safe to use. Additionally, this lack of knowledge might lead software developers to generalize the parsing of cryptographic suite identifiers such that any combination of cryptographic primitives becomes acceptable, resulting in reduced security. Ideally, cryptographic suites are implemented in software as specific, acceptable profiles of cryptographic parameters instead.

Numbered versioning might specify a major and minor version number such as 1.0 or 2.1. Numbered versioning conveys a specific order and suggests that higher version numbers are more capable than lower version numbers. The benefit of this approach is that it replaces complex parameters, which less expert developers might not understand, with a simpler model that conveys that an upgrade might be appropriate. The drawback of this approach is that it's not clear whether an upgrade is necessary, as software version number increases often don't require an upgrade for the software to continue functioning. This can lead to developers thinking their usage of a particular version is safe, when it is not. Ideally, additional signals would be given to developers that use cryptographic suites in their software that periodic reviews of those suites for continued security are required.

Date-based versioning specifies a particular release date for a specific cryptographic suite. The benefit of a date, such as a year, is that it is immediately clear to a developer whether the date is relatively old or new. Seeing an old date might prompt the developer to go searching for a newer cryptographic suite, whereas a parametric or number-based versioning scheme might not. The downside of a date-based version is that some cryptographic suites might not expire for 5-10 years, prompting the developer to go searching for a newer cryptographic suite only to find that none exists. While this might be an inconvenience, it is one that results in safer ecosystem behavior.
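
As a rough, non-normative illustration of the signal that date-based versioning provides, the TypeScript sketch below extracts the year from a hypothetical identifier of the form "<algorithm>-<year>" and warns when it looks stale. The five-year threshold is an arbitrary example, not a recommendation of this specification.

// Hypothetical identifier format: "<algorithm>-<year>", e.g., "ecdsa-2022".
function warnIfStale(cryptosuiteId: string, maxAgeYears = 5): void {
  const match = /-(\d{4})$/.exec(cryptosuiteId);
  if (!match) return; // not date-versioned; no signal to extract
  const age = new Date().getFullYear() - Number(match[1]);
  if (age > maxAgeYears) {
    console.warn(`${cryptosuiteId} is ${age} years old; check for a newer cryptographic suite.`);
  }
}

warnIfStale("ecdsa-2022"); // warns once the identifier is older than the threshold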

Issue 38: Determine how cryptographic suites are versioned

The following text is currently under debate:

It is highly encouraged that cryptographic suite identifiers are versioned using a year designation. For example, the cryptographic suite identifier ecdsa-2022 implies that the suite is probably an acceptable version of ECDSA in the year 2022, but might not be a safe choice in the year 2042. A date-based versioning mechanism, however, is not enough by itself. All cryptographic suites that follow this specification are intended to be registered [VC-EXTENSION-REGISTRY] in a way that clearly signals which cryptosuites are deprecated, standardized, or experimental. Cryptosuite registration will follow CFRG, IETF, NIST, FIPS, and safecurves guidance. Use of deprecated suites is expected to throw errors in implementations unless a useUnsafeCryptosuites option is used specifying exactly the unsafe cryptosuite to use. Use of experimental suites is expected to throw errors in implementations unless a useExperimentalCryptosuites option is used specifying exactly the experimental cryptosuite to use.
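
If the debated text above is adopted, an implementation might gate deprecated and experimental suites roughly as sketched below in TypeScript. The registry contents, status values, and error messages are illustrative stand-ins for whatever the registry ultimately defines.

type SuiteStatus = "standardized" | "deprecated" | "experimental";

// Illustrative stand-in for the cryptosuite registry described above.
const registry: Record<string, SuiteStatus> = {
  "eddsa-2022": "standardized",
  "rsa-sha1-2010": "deprecated",    // hypothetical deprecated suite
  "pqc-draft-2023": "experimental", // hypothetical experimental suite
};

interface VerifyOptions {
  useUnsafeCryptosuites?: string[];       // exact deprecated suites to allow
  useExperimentalCryptosuites?: string[]; // exact experimental suites to allow
}

function checkSuiteAllowed(suite: string, options: VerifyOptions = {}): void {
  const status = registry[suite];
  if (status === "deprecated" && !options.useUnsafeCryptosuites?.includes(suite)) {
    throw new Error(`Deprecated cryptosuite "${suite}" was not explicitly allowed.`);
  }
  if (status === "experimental" && !options.useExperimentalCryptosuites?.includes(suite)) {
    throw new Error(`Experimental cryptosuite "${suite}" was not explicitly allowed.`);
  }
}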

Issue: Resolve VC Extension Registry Governance

The following text is debated due to the governance rules of the VC Extension Registry not being finalized:

Developers are urged to always refer to the [VC-EXTENSION-REGISTRY] for the latest standardized cryptography suites, as well as to cross-check their usage against, at least, the COSE Algorithms Registry, the JOSE Algorithms Registry, and the latest FIPS Digital Signature Standard guidance. Depending on your jurisdiction, you may also need to consider additional guidance from ETSI or other regional regulatory and standards bodies.

5.2 Protecting Application Developers

Modern cryptographic algorithms provide a number of tunable parameters and options to ensure that the algorithms can meet the varied requirements of different use cases. For example, embedded systems have limited processing and memory environments and might not have the resources to generate the strongest digital signatures for a given algorithm. Other environments, like financial trading systems, might only need to protect data for a day while the trade is occurring, while other environments might need to protect data for multiple decades. To meet these needs, cryptographic algorithm designers often provide multiple ways to configure a cryptographic algorithm.

Cryptographic library implementers often take the specifications created by cryptographic algorithm designers and specification authors and implement them such that all options are available to the application developers that use their libraries. This can be due to not knowing which combination of features a particular application developer might need for a given cryptographic deployment. All options are often exposed to application developers.

Application developers that use cryptographic libraries often do not have the requisite cryptographic expertise and knowledge necessary to appropriately select cryptographic parameters and options for a given application. This lack of expertise can lead to an inappropriate selection of cryptographic parameters and options for a particular application.

This specification sets the priority of constituencies such that application developers are protected over cryptographic library implementers, who are protected over cryptographic specification authors, who are in turn protected over cryptographic algorithm designers. Given these priorities, useful cryptographic options and parameters are to be provided at the lower layers of the architecture and progressively constrained at the layers above.

The guidance above is meant to ensure that useful cryptographic options and parameters are provided at the lower layers of the architecture while not exposing those options and parameters to application developers, who may not fully understand how to balance the benefits and drawbacks of each option.
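
For example, a library author might expose a small set of vetted profiles rather than raw parameters. The TypeScript sketch below uses entirely hypothetical names to show curve, hash, and algorithm choices hidden behind a single profile identifier.

// Hypothetical library surface: application developers pick a named profile;
// the vetted parameter combinations live inside the library.
const PROFILES = {
  "recommended-2022": { algorithm: "EdDSA", curve: "Ed25519", hash: "SHA-512" },
} as const;

type ProfileName = keyof typeof PROFILES;

function createSigner(profile: ProfileName) {
  const params = PROFILES[profile];
  // ... construct a signer from the vetted parameter set ...
  return { description: `${params.algorithm} over ${params.curve} with ${params.hash}` };
}

// Application code never selects (or mis-combines) individual parameters:
const signer = createSigner("recommended-2022");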

Issue: Use of experimental and deprecated cryptography

The VCWG is seeking guidance on adding language to allow the use of experimental or deprecated cryptography. By default, those features will be disabled and will require the application developer to specifically allow use on a per-cryptographic suite basis. There will be requirements for all implementing libraries to throw errors or warnings when deprecated or experimental options are selected without the appropriate override flags.

5.3 Agility and Layering

Cryptographic agility is a practice by which one designs frequently connected information security systems to support switching between multiple cryptographic primitives and/or algorithms. The primary goal of cryptographic agility is to enable systems to rapidly adapt to new cryptographic primitives and algorithms without making disruptive changes to the systems' infrastructure. Thus, when a particular cryptographic primitive, such as the SHA-1 algorithm, is determined to be no longer safe to use, systems can be reconfigured to use a newer primitive via a simple configuration file change.

Cryptographic agility is most effective when the client and the server in the information security system are in regular contact. However, when the messages protected by a particular cryptographic algorithm are long-lived, as with Verifiable Credentials, and/or when the client (holder) might not be able to easily recontact the server (issuer), then cryptographic agility does not provide the desired protections.

Cryptographic layering is a practice where one designs rarely connected information security systems to employ multiple primitives and/or algorithms at the same time. The primary goal of cryptographic layering is to enable systems to survive the failure of one or more cryptographic algorithms or primitives without losing cryptographic protection on the payload. For example, digitally signing a single piece of information using the RSA, ECDSA, and Falcon algorithms in parallel would provide a mechanism that could survive the failure of two of these three digital signature algorithms. When a particular cryptographic protection is compromised, such as an RSA digital signature using 768-bit keys, systems can still utilize the non-compromised cryptographic protections to continue to protect the information. Developers are urged to take advantage of this feature for all signed content that might need to be protected for a year or longer.

This specification provides for both practices. It provides for cryptographic agility, which allows one to easily switch from one algorithm to another. It also provides for cryptographic layering, which allows one to simultaneously use multiple cryptographic algorithms, typically in parallel, such that any one of them can be used to protect the information without reliance on or requirement of the others, all while keeping the digital proof format easy for developers to use.
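
The following non-normative TypeScript sketch illustrates cryptographic layering: a payload carries several parallel proofs, and verification succeeds as long as at least one still-trusted algorithm validates. The proof shape and verifier map are illustrative, not the normative data model.

interface Proof { cryptosuite: string; proofValue: string; }

// Hypothetical per-suite verifier functions; a compromised suite (for
// example, one using RSA with 768-bit keys) is simply removed from the map.
type Verifier = (payload: string, proof: Proof) => boolean;
const trustedVerifiers: Record<string, Verifier> = {
  "ecdsa-2022": (payload, proof) => true,  // placeholder for real verification
  "falcon-2022": (payload, proof) => true, // placeholder for real verification
};

// The payload remains protected as long as one layered proof verifies.
function verifyLayered(payload: string, proofs: Proof[]): boolean {
  return proofs.some((p) => trustedVerifiers[p.cryptosuite]?.(payload, p) === true);
}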

5.4 Transformations

At times, it is beneficial to transform the data being protected during the cryptographic protection process. Such "in-line" transformation can enable a particular type of cryptographic protection to be agnostic to the data format it is carried in. For example, some Data Integrity cryptographic suites utilize RDF Dataset Canonicalization [URDNA2015], which transforms the initial representation into a canonical form [N-QUADS] that is then serialized, hashed, and digitally signed. As long as any syntax expressing the protected data can be transformed into this canonical form, the digital signature can be verified. This enables the same digital signature over the information to be expressed in JSON, CBOR, YAML, and other compatible syntaxes without having to create a cryptographic proof for every syntax.
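
The TypeScript sketch below illustrates the idea with a toy key-sorting canonicalizer loosely in the spirit of the JSON Canonicalization Scheme [RFC8785]; the real scheme also specifies number and string serialization rules. Two representations that differ only in member order produce the same canonical form, and therefore the same hash and signature.

import { createHash } from "node:crypto";

// Toy canonicalizer: recursively serialize with object members sorted by key.
function canonicalize(value: unknown): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map(canonicalize).join(",")}]`;
  const members = Object.entries(value as Record<string, unknown>)
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .map(([k, v]) => `${JSON.stringify(k)}:${canonicalize(v)}`);
  return `{${members.join(",")}}`;
}

const a = canonicalize({ name: "Pat", degree: "BSc" });
const b = canonicalize({ degree: "BSc", name: "Pat" });
console.log(a === b); // true: same canonical form, same hash
console.log(createHash("sha256").update(a, "utf8").digest("hex"));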

Being able to express the same digital signature across a variety of syntaxes is beneficial because systems often have native data formats with which they operate. For example, some systems are written against JSON data, while others are written against CBOR data. Without transformation, systems that process their data internally as CBOR are required to store the digitally signed data structures as JSON (or vice versa). This leads to storing the data twice and can increase the security attack surface if the unsigned representation stored in databases accidentally deviates from the signed representation. By using transformations, the digital proof can live in the native data format to help prevent otherwise undetectable database drift over time.

This specification is designed to avoid requiring the duplication of signed information by utilizing "in-line" data transformations. Application developers are urged to work with cryptographically protected data in the native data format for their application and not to separate the storage of cryptographic proofs from the data being protected. Developers are also urged to regularly confirm that the cryptographically protected data has not been tampered with as it is written to and read from application storage.

Issue: Collision-resistant canonicalization requirements

The VCWG is seeking feedback on normative language that cryptographic suite implementers need to follow to ensure that they do not utilize data transformation mechanisms that can map different inputs to the same output. That is, given different inputs, canonicalization scheme #1 and canonicalization scheme #2 must not produce the same output value. As an analogy, this is the same requirement placed on cryptographic hashing mechanisms, and is why those schemes are designed to be collision resistant. Cryptographic canonicalization mechanisms have the same requirement. At present, this is not a problem because the three expected canonicalization schemes (the Universal RDF Dataset Canonicalization Algorithm 2015 [URDNA2015], the JSON Canonicalization Scheme [RFC8785], and a theoretical future base-encoding canonicalization) have entirely different outputs.

Issue: Avoiding the pitfalls of XML Canonicalization

The VCWG is seeking feedback on whether to explain why modern canonicalization schemes are simpler than the far more complex XML Canonicalization schemes of the early 2000s. Some readers seem to be under the impression that all canonicalization is difficult and has to be avoided at all costs (including costs to application developers). The WG would like to understand if it would be helpful to include a section explaining why some simpler data syntaxes (such as JSON) are easier to canonicalize than more complex data syntaxes (such as XML).

5.5 Data Opacity

The inspectability of application data affects system efficiency and developer productivity. When cryptographically protected application data, such as base-encoded binary data, is not easily processed by application subsystems, such as databases, it increases the effort of working with the cryptographically protected information. For example, a cryptographically protected payload that can be natively stored and indexed by a database results in a simpler system than one that must first decode the payload before storing, indexing, or querying it.

Similarly, a cryptographically protected payload that can be processed by multiple upstream networked systems increases the ability to properly layer security architectures. For example, if upstream systems do not have to repeatedly decode the incoming payload, it increases the ability for a system to distribute processing load by specializing upstream subsystems to actively combat attacks. While a digital signature needs to always be checked before taking substantive action, other upstream checks can be performed on transparent payloads — such as identifier-based rate limiting, signature expiration checking, or nonce/challenge checking — to reject obviously bad requests.

Additionally, if a developer is not able to easily view data in a system, the ability to easily audit or debug system correctness is hampered. For example, requiring application developers to cut-and-paste base-encoded application data makes development more challenging and increases the chances that obvious bugs will be missed because every message needs to go through a manually operated base-decoding tool.

There are times, however, when the correct design decision is to make data opaque. Data that does not need to be processed by other application subsystems, as well as data that does not need to be modified or accessed by an application developer, can be serialized into opaque formats. Examples include digital signature values, cryptographic key parameters, and other data fields that only need to be accessed by a cryptographic library and need not be modified by the application developer. Data opacity is also appropriate when the underlying subsystem shields the application developer from the complexity of the opaque data, such as databases that perform encryption at rest. In these cases, the application developer continues to develop against transparent application data formats while the database manages the complexity of encrypting and decrypting the application data to and from long-term storage.

This specification strives to provide an architecture where application data remains in its native format and is not made opaque, while other cryptographic data, such as digital signatures, are kept in their opaque binary encoded form. Cryptographic suite implementers are urged to consider appropriate use of data opacity when designing their suites, and to weigh the design trade-offs when making application data opaque versus providing access to cryptographic data at the application layer.
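
As a non-normative illustration of that split, the hypothetical record below keeps application data transparent and queryable while only the signature value remains opaque. The field names loosely mirror common Data Integrity usage but are not normative here.

// Transparent application data: directly storable, indexable, and debuggable.
const protectedRecord = {
  id: "https://example.example/records/1234", // illustrative identifier
  inspectionStatus: "pass",
  inspectionDate: "2022-06-01",
  proof: {
    cryptosuite: "eddsa-2022",
    // Opaque: only the cryptographic library needs to read this value.
    proofValue: "z3FXQjecWufY46yg892J...", // illustrative, truncated value
  },
};

// A database can index inspectionStatus or inspectionDate without decoding
// anything; only signature verification touches proofValue.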

5.6 Verification Method Binding

Issue

Implementers must ensure that a verification method is bound to a particular controller by going from the verification method to the controller document, and then ensuring that the controller document also contains the verification method.
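
A minimal, non-normative sketch of that check in TypeScript, assuming a hypothetical resolver and a controller document shape with an id and a verificationMethod array:

interface ControllerDocument {
  id: string;
  verificationMethod?: { id: string }[];
}

// Hypothetical resolver; in practice this might be an HTTP or DID resolver.
type Resolver = (controllerId: string) => Promise<ControllerDocument>;

async function assertMethodBoundToController(
  verificationMethodId: string,
  controllerId: string,
  resolve: Resolver
): Promise<void> {
  // Go from the verification method's stated controller to the controller
  // document, then confirm the document lists the method in return.
  const doc = await resolve(controllerId);
  const listed = (doc.verificationMethod ?? []).some((vm) => vm.id === verificationMethodId);
  if (doc.id !== controllerId || !listed) {
    throw new Error("Verification method is not bound to the stated controller.");
  }
}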

5.7 Verification Relationship Validation

Issue

Implementers need to ensure that, when a verification method is used, it matches the verification relationship associated with it and lines up with the proof purpose.
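
A companion non-normative sketch, assuming verification relationships appear as properties of the controller document (such as assertionMethod or authentication) whose values reference verification method IDs:

// Relationship entries may be ID strings or embedded verification methods.
type RelationshipEntry = string | { id: string };

function assertRelationshipMatchesPurpose(
  controllerDoc: Record<string, unknown>,
  verificationMethodId: string,
  proofPurpose: string // e.g., "assertionMethod"
): void {
  const entries = controllerDoc[proofPurpose];
  const ids = Array.isArray(entries)
    ? entries.map((e: RelationshipEntry) => (typeof e === "string" ? e : e.id))
    : [];
  if (!ids.includes(verificationMethodId)) {
    throw new Error(`Method is not authorized for proof purpose "${proofPurpose}".`);
  }
}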

5.8 Canonicalization Method Correctness

Issue

Canonicalization mechanisms utilized for normalizing input to hashing functions need to have vetted mathematical proofs associated with them. Canonicalization mechanisms that create collisions in hash functions can be used to attack digital signatures.

6. Privacy Considerations

The following section describes privacy considerations that developers implementing this specification should be aware of in order to create privacy enhancing software.

Issue
TODO: We need to add a complete list of privacy considerations.

6.1 Unlinkability

When a digitally-signed payload contains data that is seen by multiple verifiers, it becomes a point of correlation. An example of such data is a shopping loyalty card number. Correlatable data can be used for tracking purposes by verifiers, which can sometimes violate privacy expectations. The fact that some data can be used for tracking might not be immediately apparent. Examples of such correlatable data include, but are not limited to, a static digital signature or a cryptographic hash of an image.

It is possible to create a digitally-signed payload that does not have any correlatable tracking data while also providing some level of assurance that the payload is trustworthy for a given interaction. This characteristic is called unlinkability, which ensures that no correlatable data are used in a digitally-signed payload while still providing some level of trust, the sufficiency of which must be determined by each verifier.

It is important to understand that not all use cases require or even permit unlinkability. There are use cases where linkability and correlation are required due to regulatory or safety reasons, such as correlating organizations and individuals that are shipping and storing hazardous materials. Unlinkability is useful when there is an expectation of privacy for a particular interaction.

There are at least two mechanisms that can provide some level of unlinkability. The first method is to ensure that no data value used in the message is ever repeated in a future message. The second is to ensure that any repeated data value provides adequate herd privacy such that it becomes practically impossible to correlate the entity that expects some level of privacy in the interaction.

A variety of methods can be used to achieve unlinkability. These methods include ensuring that a message is a single-use bearer token with no information that can be used for the purposes of correlation, using attributes that ensure an adequate level of herd privacy, and using cryptosuites that enable the entity presenting a message to regenerate new signatures while not compromising trust in the message being presented.

6.2 Selective Disclosure

Selective disclosure is a technique that enables the recipient of a previously-signed message (that is, a message signed by its creator) to reveal only parts of the message without disturbing the verifiability of those parts. For example, one might selectively disclose a digital driver's license for the purpose of renting a car. This could involve revealing only the issuing authority, license number, birthday, and authorized motor vehicle class from the license. Note that in this case, the license number is correlatable information, but some amount of privacy is preserved because the driver's full name and address are not shared.

Not all software or cryptosuites are capable of providing selective disclosure. If the author of a message wishes it to be selectively disclosable by its recipient, then they need to enable selective disclosure on the specific message, and both need to use a capable cryptosuite. The author might also make it mandatory to disclose certain parts of the message. A recipient that wants to selectively disclose partial content of the message needs to utilize software that is able to perform the technique. An example of a cryptosuite that supports selective disclosure is bbs-2022.
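
The non-normative TypeScript sketch below shows only the data-selection half of the technique; deriving a verifiable proof over the disclosed subset requires a capable cryptosuite (such as bbs-2022) and is not shown. All field names and values are illustrative.

// Illustrative driver's license attributes.
const licenseSubject: Record<string, string> = {
  issuingAuthority: "Example DMV",
  licenseNumber: "T21387-3343",
  birthday: "1990-01-01",
  vehicleClass: "C",
  name: "Pat Example",
  address: "123 Example St",
};

// Reveal only the fields needed to rent a car; name and address stay private.
function selectDisclosure(subject: Record<string, string>, revealed: string[]) {
  return Object.fromEntries(
    Object.entries(subject).filter(([key]) => revealed.includes(key))
  );
}

const disclosed = selectDisclosure(licenseSubject, [
  "issuingAuthority", "licenseNumber", "birthday", "vehicleClass",
]);
// A selective-disclosure cryptosuite would now derive a proof covering
// exactly these disclosed claims.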

It is possible to selectively disclose information in a way that does not preserve unlinkability. For example, one might want to disclose the inspection results related to a shipment, which include the shipment identifier or lot number, which might have to be correlatable due to regulatory requirements. However, disclosure of the entire inspection result might not be required as selectively disclosing just the pass/fail status could be deemed adequate. For more information on disclosing information while preserving privacy, see Section 6.1 Unlinkability.

A. References

A.1 Normative references

[DID-CORE]
Decentralized Identifiers (DIDs) v1.0. Manu Sporny; Amy Guy; Markus Sabadello; Drummond Reed. W3C. 19 July 2022. W3C Recommendation. URL: https://www.w3.org/TR/did-core/
[INFRA]
Infra Standard. Anne van Kesteren; Domenic Denicola. WHATWG. Living Standard. URL: https://infra.spec.whatwg.org/
[ISO8601]
Representation of dates and times. ISO 8601:2004. International Organization for Standardization (ISO). 2004. URL: http://www.iso.org/iso/catalogue_detail?csnumber=40874
[MULTIBASE]
The Multibase Encoding Scheme. Juan Benet; Manu Sporny. IETF. February 2021. Internet-Draft. URL: https://datatracker.ietf.org/doc/html/draft-multiformats-multibase-03
[MULTICODEC]
The Multi Codec Encoding Scheme. The Multiformats Community. IETF. February 2022. Internet-Draft. URL: https://github.com/multiformats/multicodec/blob/master/table.csv
[RDF-DATASET-C14N]
RDF Dataset Normalization. Dave Longley; Manu Sporny. JSON-LD Community Group. Draft Community Group Report. URL: https://json-ld.github.io/normalization/spec/
[RFC2119]
Key words for use in RFCs to Indicate Requirement Levels. S. Bradner. IETF. March 1997. Best Current Practice. URL: https://www.rfc-editor.org/rfc/rfc2119
[RFC7517]
JSON Web Key (JWK). M. Jones. IETF. May 2015. Proposed Standard. URL: https://www.rfc-editor.org/rfc/rfc7517
[RFC7638]
JSON Web Key (JWK) Thumbprint. M. Jones; N. Sakimura. IETF. September 2015. Proposed Standard. URL: https://www.rfc-editor.org/rfc/rfc7638
[RFC8174]
Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words. B. Leiba. IETF. May 2017. Best Current Practice. URL: https://www.rfc-editor.org/rfc/rfc8174
[RFC8785]
JSON Canonicalization Scheme (JCS). A. Rundgren; B. Jordan; S. Erdtman. IETF. June 2020. Informational. URL: https://www.rfc-editor.org/rfc/rfc8785
[SECURITY-VOCABULARY]
The Security Vocabulary. Manu Sporny; Dave Longley. Credentials Community Group. Draft Community Group Report. URL: https://w3c-ccg.github.io/security-vocab/
[URL]
URL Standard. Anne van Kesteren. WHATWG. Living Standard. URL: https://url.spec.whatwg.org/
[VC-DATA-MODEL]
Verifiable Credentials Data Model v1.1. Manu Sporny; Grant Noble; Dave Longley; Daniel Burnett; Brent Zundel; Kyle Den Hartog. W3C. 3 March 2022. W3C Recommendation. URL: https://www.w3.org/TR/vc-data-model/
[XML-SCHEMA11]
W3C XML Schema Definition Language (XSD) 1.1 Part 2: Datatypes. David Peterson; Sandy Gao; Ashok Malhotra; Michael Sperberg-McQueen; Henry Thompson; Paul V. Biron. W3C. 5 April 2012. W3C Recommendation. URL: https://www.w3.org/TR/xmlschema11-2/
[ZCAP]
Authorization Capabilities for Linked Data. Credentials Community Group. Draft Community Group Report. URL: https://w3c-ccg.github.io/zcap-spec

A.2 Informative references

[N-QUADS]
RDF 1.1 N-Quads. Gavin Carothers. W3C. 25 February 2014. W3C Recommendation. URL: https://www.w3.org/TR/n-quads/
[RFC7696]
Guidelines for Cryptographic Algorithm Agility and Selecting Mandatory-to-Implement Algorithms. R. Housley. IETF. November 2015. Best Current Practice. URL: https://www.rfc-editor.org/rfc/rfc7696
[URDNA2015]
Universal RDF Dataset Canonicalization Algorithm 2015. Credentials Community Group. Draft Community Group Report. URL: https://json-ld.github.io/rdf-dataset-canonicalization/spec/
[VC-EXTENSION-REGISTRY]
Verifiable Credentials Extension Registry. Manu Sporny. Credentials Community Group. Draft Community Group Report. URL: https://w3c-ccg.github.io/vc-extension-registry/