Trace Context Level 3

W3C Editor's Draft

This version:
https://w3c.github.io/trace-context/
Latest published version:
https://www.w3.org/TR/trace-context-3/
Latest editor's draft:
https://w3c.github.io/trace-context/
History:
Commit history
Implementation report:
https://w3c.github.io/trace-context/implementations
Editors:
Sergey Kanzhelev (Google)
Daniel Dyla (Dynatrace)
Yuri Shkuro (Meta)
J. Kalyana Sundaram (Microsoft)
Bastian Krol (Dash0)
Former editors:
Nik Molnar
Alois Reitbauer
Morgan McLean
Bogdan Drutu
Daniel Khan
Feedback:
GitHub w3c/trace-context (pull requests, new issue, open issues)
public-trace-context@w3.org with subject line trace-context (archives)
Discussions
We are on Slack.

Abstract

Status of This Document

This is a preview

Do not attempt to implement this version of the specification. Do not reference this version as authoritative in any way. Instead, see https://w3c.github.io/trace-context/ for the Editor's draft.

This section describes the status of this document at the time of its publication. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.

This document was published by the Distributed Tracing Working Group as an Editor's Draft.

Publication as an Editor's Draft does not imply endorsement by W3C and its Members.

This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

This document is governed by the 03 November 2023 W3C Process Document.

1. Conformance

As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.

The key words MAY, MUST, MUST NOT, SHOULD, and SHOULD NOT in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

2. Trace Context HTTP Request Headers Format

This section describes the binding of the distributed trace context to traceparent and tracestate HTTP headers.

2.1 Relationship Between the Headers

The traceparent request header represents the incoming request in a tracing system in a common format, understood by all vendors. Here’s an example of a traceparent header.

traceparent: 00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01

The tracestate request header includes the parent in a potentially vendor-specific format:

tracestate: congo=t61rcWkgMzE

For example, say a client and server in a system use different tracing vendors: Congo and Rojo. A client traced in the Congo system adds the following headers to an outbound HTTP request.

traceparent: 00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01
tracestate: congo=t61rcWkgMzE

Note: In this case, the tracestate value t61rcWkgMzE is the result of Base64 encoding the parent ID (b7ad6b7169203331), though such manipulations are not required.
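
This example is non-normative. The following sketch shows one way the Base64 values in this section could be derived from a hex parent ID. The class and method names are hypothetical, and the padding-free standard Base64 alphabet is an assumption inferred from the example values; vendors are free to encode their tracestate entries differently.

```java
import java.util.Base64;

final class ParentIdBase64 {
    // Decodes an even-length hex string into raw bytes.
    static byte[] hexToBytes(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // Base64-encodes the raw bytes of a hex parent-id, omitting "=" padding.
    static String encode(String parentIdHex) {
        return Base64.getEncoder().withoutPadding().encodeToString(hexToBytes(parentIdHex));
    }
}
```

With this sketch, encode("b7ad6b7169203331") yields t61rcWkgMzE.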

The receiving server, traced in the Rojo tracing system, carries over the tracestate it received and adds a new entry to the left.

traceparent: 00-0af7651916cd43dd8448eb211c80319c-00f067aa0ba902b7-01
tracestate: rojo=00f067aa0ba902b7,congo=t61rcWkgMzE

You'll notice that the Rojo system reuses the value of its traceparent for its entry in tracestate. This means it is a generic tracing system (no proprietary information is being passed). Otherwise, tracestate entries are opaque and can be vendor-specific.

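If the next receiving server uses Congo, it carries over the tracestate from Rojo and adds a new entry for the parent to the left of the previous entry.

traceparent: 00-0af7651916cd43dd8448eb211c80319c-b9c7c989f97918e1-01
tracestate: congo=ucfJifl5GOE,rojo=00f067aa0ba902b7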

Note: ucfJifl5GOE is the Base64 encoded parent ID b9c7c989f97918e1.

Notice that when Congo wrote its traceparent entry, the value was not encoded, which helps consistency for those doing correlation. However, the value of its tracestate entry is encoded and differs from traceparent. This is OK.

Finally, you'll see tracestate retains an entry for Rojo exactly as it was, except pushed to the right. The left-most position lets the next server know which tracing system corresponds with traceparent. In this case, since Congo wrote traceparent, its tracestate entry should be left-most.

2.2 Traceparent Header

The traceparent HTTP header field identifies the incoming request in a tracing system. It has four fields:

  • version
  • trace-id
  • parent-id
  • trace-flags

2.2.1 Header Name

Header name: traceparent

The header name is ASCII case-insensitive. That is, TRACEPARENT, TraceParent, and traceparent are considered the same header. The header name is a single word; it does not contain any delimiters such as a hyphen.

In order to increase interoperability across multiple protocols and encourage successful integration, tracing systems SHOULD encode the header name as ASCII lowercase.

2.2.2 traceparent Header Field Values

This section uses the Augmented Backus-Naur Form (ABNF) notation of [RFC5234], including the DIGIT rule from that document. The DIGIT rule defines a single number character 0-9.

HEXDIGLC = DIGIT / "a" / "b" / "c" / "d" / "e" / "f" ; lowercase hex character
value           = version "-" version-format

The dash (-) character is used as a delimiter between fields.

2.2.2.1 version

version         = 2HEXDIGLC   ; this document assumes version 00. Version ff is forbidden

Version (version) is an 8-bit unsigned integer value, serialized as an ASCII string with two characters. Version 255 ("ff") is invalid. This document specifies version 0 ("00") of the traceparent header.

2.2.2.2 version-format

The following version-format definition is used for version 00.

version-format   = trace-id "-" parent-id "-" trace-flags
trace-id         = 32HEXDIGLC  ; 16 bytes array identifier. All zeroes forbidden
parent-id        = 16HEXDIGLC  ; 8 bytes array identifier. All zeroes forbidden
trace-flags      = 2HEXDIGLC   ; 8 bit flags.

2.2.2.3 trace-id

This is the ID of the whole trace forest and is used to uniquely identify a distributed trace through a system. It is represented as a 16-byte array, for example, 4bf92f3577b34da6a3ce929d0e0e4736. All bytes as zero (00000000000000000000000000000000) is considered an invalid value.

The value of trace-id SHOULD be globally unique. One recommended method to ensure global uniqueness, as well as to address some privacy and security considerations, to a satisfactory degree of certainty is to randomly (or pseudo-randomly) generate the trace-id. Implementers SHOULD use a trace-id generation method which randomly (or pseudo-randomly) generates at least the right-most 7 bytes of the ID. If the right-most 7 bytes are randomly (or pseudo-randomly) generated, the corresponding 2.2.2.5.2 Random Trace ID Flag SHOULD be set. For more details, see #considerations-for-trace-id-field-generation.

If the trace-id value is invalid (for example if it contains non-allowed characters or all zeros), vendors MUST ignore the entire header.
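
This example is non-normative. As a minimal sketch of the generation guidance above, the following code draws all 16 bytes of the trace-id from a random source and rejects the all-zero value. The class name, the method names, and the choice of SecureRandom are assumptions made for illustration; any source that randomly (or pseudo-randomly) generates at least the right-most 7 bytes would satisfy the recommendation.

```java
import java.security.SecureRandom;

final class TraceIdGenerator {
    private static final SecureRandom RANDOM = new SecureRandom();
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    // Returns a new trace-id: 16 random bytes rendered as 32 lowercase hex characters.
    static String newTraceId() {
        byte[] bytes = new byte[16];
        do {
            RANDOM.nextBytes(bytes);
        } while (isAllZero(bytes)); // the all-zero value is invalid, so draw again
        return toHex(bytes);
    }

    static boolean isAllZero(byte[] bytes) {
        for (byte b : bytes) {
            if (b != 0) {
                return false;
            }
        }
        return true;
    }

    // Renders bytes as lowercase hex, two characters per byte.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(HEX[(b >> 4) & 0x0f]).append(HEX[b & 0x0f]);
        }
        return sb.toString();
    }
}
```

A trace-id produced this way qualifies for the random-trace-id flag, because every byte, including the right-most 7, is random.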

2.2.2.4 parent-id

This is the ID of this request as known by the caller (in some tracing systems, this is known as the span-id, where a span is the execution of a client request). It is represented as an 8-byte array, for example, 00f067aa0ba902b7. All bytes as zero (0000000000000000) is considered an invalid value.

Vendors MUST ignore the traceparent when the parent-id is invalid (for example, if it contains non-lowercase hex characters).

2.2.2.5 trace-flags

This is an 8-bit field that controls tracing flags such as sampling, trace level, etc. These flags are recommendations given by the caller rather than strict rules to follow for three reasons:

  1. An untrusted caller may be able to abuse a tracing system by setting these flags maliciously.
  2. A caller may have a bug which causes the tracing system to have a problem.
  3. Different load between caller service and callee service might force callee to downsample.

You can find more in the section #security-considerations of this specification.

Like other fields, trace-flags is hex-encoded. For example, all 8 flags set would be ff and no flags set would be 00.

As this is a bit field, the flags cannot be interpreted by a simple equality comparison. For example, both 01 (00000001) and 03 (00000011) represent that the trace has been sampled because the sampled flag (00000001) is set, and 03 and 02 (00000010) both represent that at least the right-most 7 bytes of the trace-id are randomly (or pseudo-randomly) generated because the random bit (00000010) is set. A common mistake when interpreting bit-fields is using a comparison of the whole number rather than interpreting a single bit.

Here is an example of properly handling trace flags:

static final byte FLAG_SAMPLED = 1; // 00000001
static final byte FLAG_RANDOM = 2; // 00000010
...
boolean sampled = (traceFlags & FLAG_SAMPLED) == FLAG_SAMPLED;
boolean random = (traceFlags & FLAG_RANDOM) == FLAG_RANDOM;

2.2.2.5.1 Sampled Flag

When set, the least significant bit (right-most) denotes that the caller may have recorded trace data. When unset, the caller did not record trace data out-of-band.

There are a number of recording scenarios that may break distributed tracing:

  • Only recording a subset of requests results in broken traces.
  • Recording information about all incoming and outgoing requests becomes prohibitively expensive under load.
  • Making random or component-specific data collection decisions leads to fragmented data in all traces.

Because of these issues, tracing vendors make their own recording decisions, and there is no consensus on what is the best algorithm for this job.

Various techniques include:

  • Probability sampling (sample 1 out of 100 distributed traces by flipping a coin)
  • Delayed decision (make collection decision based on duration or a result of a request)
  • Deferred sampling (let the callee decide whether information about this request needs to be collected)

How these techniques are implemented can be tracing vendor-specific or application-defined.

The tracestate field is designed to handle the variety of techniques for making recording decisions (or other specific information) specific for a given vendor. The sampled flag provides better interoperability between vendors. It allows vendors to communicate recording decisions and enable a better experience for the customer.

For example, when a SaaS service participates in a distributed trace, this service has no knowledge of the tracing vendor used by its caller. This service may produce records of incoming requests for monitoring or troubleshooting purposes. The sampled flag can be used to ensure that information about requests that were marked for recording by the caller will also be recorded by SaaS service downstream so that the caller can troubleshoot the behavior of every recorded request.

The sampled flag has no restriction on its mutations except that it can only be mutated when 2.2.2.4 parent-id is updated.

The following are a set of suggestions that vendors SHOULD use to increase vendor interoperability.

  • If a component made a definitive recording decision, this decision SHOULD be reflected in the sampled flag.
  • If a component needs to make a recording decision, it SHOULD respect the sampled flag value. #security-considerations SHOULD be applied to protect from abusive or malicious use of this flag.
  • If a component deferred or delayed the decision and only a subset of telemetry will be recorded, the sampled flag should be propagated unchanged. It should be set to 0 as the default option when the trace is initiated by this component.

There are two additional options that vendors MAY follow:

  • A component that makes a deferred or delayed recording decision may communicate the priority of a recording by setting sampled flag to 1 for a subset of requests.
  • A component may also fall back to probability sampling and set the sampled flag to 1 for the subset of requests.

2.2.2.5.2 Random Trace ID Flag

The second least significant bit of the trace-flags field denotes the random-trace-id flag.

When starting or restarting a trace (that is, when the participant generates a new trace-id), the following rules apply:

  • If that flag is set, at least the right-most 7 bytes of the trace-id MUST be selected randomly (or pseudo-randomly) with uniform distribution over the interval [0..2^56-1].
  • If the flag is not set, the trace-id MAY still be randomly (or pseudo-randomly) generated.
  • When unset, the trace-id MAY be generated in any way that satisfies the requirements of the trace ID format (see 2.2.2.3 trace-id).
  • When at least the right-most 7 bytes of the trace-id are randomly (or pseudo-randomly) generated, the random-trace-id flag SHOULD be set to 1.

When continuing a trace (that is, the incoming HTTP request had the traceparent header and the participant uses the same trace-id in the traceparent header on outgoing requests), the following rules apply:

  • If the flag is set in the incoming traceparent header, it MUST also be set in all outgoing traceparent headers which use the same trace-id.
  • If the flag is unset in the incoming traceparent header, it MUST also be unset in any outgoing traceparent headers which use the same trace-id.

This allows downstream consumers to implement features such as trace sampling or database sharding based on these bytes. For additional information, see #considerations-for-trace-id-field-generation.
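
This example is non-normative. The following sketch illustrates how a downstream consumer might base a sampling decision on the right-most 7 bytes of the trace-id when the random-trace-id flag is set. The class name, the method names, the threshold comparison, and the fallback parameter are assumptions made for illustration; this specification does not define a sampling algorithm.

```java
final class TraceIdSampler {
    static final int FLAG_RANDOM = 0x02; // 00000010

    // Interprets the right-most 7 bytes (14 hex characters) of a trace-id
    // as a number in the range [0, 2^56 - 1].
    static long randomPart(String traceId) {
        return Long.parseLong(traceId.substring(traceId.length() - 14), 16);
    }

    // Samples roughly the given fraction of traces (probability in [0.0, 1.0]).
    // When the random flag is unset the bytes cannot be relied upon, so the
    // caller-supplied fallback decision is returned instead (hypothetical policy).
    static boolean shouldSample(String traceId, int traceFlags, double probability, boolean fallback) {
        if ((traceFlags & FLAG_RANDOM) == 0) {
            return fallback;
        }
        long threshold = (long) (probability * (1L << 56));
        return randomPart(traceId) < threshold;
    }
}
```

Because every participant that applies the same threshold to the same bytes reaches the same decision, a scheme like this could keep sampling consistent across components without extra coordination.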

2.2.2.5.3 Other Flags

The behavior of other flags, such as 00000100, is not defined and is reserved for future use. Vendors MUST set those to zero.
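
This example is non-normative. As a sketch of how the flag rules in this section combine when a trace is continued, the following code keeps the random-trace-id bit of the incoming flags, applies a local recording decision to the sampled bit, and leaves all other bits at zero. The class and method names are hypothetical, and the code assumes that the parent-id is updated in the same step, since the sampled flag can only be mutated together with the parent-id.

```java
final class OutgoingFlags {
    static final int FLAG_SAMPLED = 0x01; // 00000001
    static final int FLAG_RANDOM = 0x02;  // 00000010

    // Computes the hex-encoded trace-flags for an outgoing request that
    // continues the incoming trace.
    static String continueTrace(int incomingFlags, boolean sampled) {
        int flags = incomingFlags & FLAG_RANDOM; // random bit is carried over unchanged
        if (sampled) {
            flags |= FLAG_SAMPLED;               // reflect the local recording decision
        }
        return String.format("%02x", flags);     // all other bits stay zero
    }
}
```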

2.2.3 Examples of HTTP traceparent Headers

Valid traceparent when caller sampled this request:

Value = 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
base16(version) = 00
base16(trace-id) = 4bf92f3577b34da6a3ce929d0e0e4736
base16(parent-id) = 00f067aa0ba902b7
base16(trace-flags) = 01  // sampled

Valid traceparent when caller didn’t sample this request:

Value = 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-00
base16(version) = 00
base16(trace-id) = 4bf92f3577b34da6a3ce929d0e0e4736
base16(parent-id) = 00f067aa0ba902b7
base16(trace-flags) = 00  // not sampled
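
This example is non-normative. The following sketch parses a version 00 traceparent value such as the ones above, using a regular expression that mirrors the ABNF in 2.2.2.2 version-format. The class and method names are hypothetical, and returning null to signal "ignore the header" is an assumption made for brevity.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TraceparentV00 {
    // Version 00, then trace-id, parent-id, and trace-flags in lowercase hex.
    private static final Pattern V00 =
        Pattern.compile("00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})");

    // Returns {trace-id, parent-id, trace-flags}, or null if the header is to be ignored.
    static String[] parse(String header) {
        if (header == null) {
            return null;
        }
        Matcher m = V00.matcher(header);
        if (!m.matches()) {
            return null;
        }
        String traceId = m.group(1);
        String parentId = m.group(2);
        if (traceId.matches("0+") || parentId.matches("0+")) {
            return null; // all-zero identifiers are invalid
        }
        return new String[] { traceId, parentId, m.group(3) };
    }
}
```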

2.2.4 Versioning of Traceparent

This specification is opinionated about future versions of trace context. The current version of this specification assumes that future versions of the traceparent header will be additive to the current one.

Vendors MUST follow these rules when parsing headers with an unexpected format:

  • Pass-through services should not analyze the version. They should expect that headers may have larger size limits in the future and only disallow prohibitively large headers.
  • When the version prefix cannot be parsed (it's not 2 hex characters followed by a dash (-)), the implementation should restart the trace.
  • If a higher version is detected, the implementation SHOULD try to parse it by trying the following:

    • If the size of the header is shorter than 55 characters, the vendor should not parse the header and should restart the trace.
    • Parse trace-id (from the first dash through the next 32 characters). Vendors MUST check that the 32 characters are hex, and that they are followed by a dash (-).
    • Parse parent-id (from the second dash, at zero-based index 35, through the next 16 characters). Vendors MUST check that the 16 characters are hex and followed by a dash.
    • Parse the sampled bit of flags (2 characters from the third dash). Vendors MUST check that the 2 characters are either at the end of the string or followed by a dash.

    If all three values were parsed successfully, the vendor should use them.
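
This example is non-normative. The steps above for a header with a higher version can be sketched as follows. The class and method names are hypothetical; the sketch assumes that "hex" means the lowercase HEXDIGLC rule, returns null where the trace should be restarted, and performs only the checks listed above (it does not, for instance, reject all-zero identifiers).

```java
final class TraceparentFutureVersion {
    // True if every character of s in [from, to) is a lowercase hex digit.
    static boolean isLowerHex(String s, int from, int to) {
        for (int i = from; i < to; i++) {
            char c = s.charAt(i);
            boolean ok = (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f');
            if (!ok) {
                return false;
            }
        }
        return true;
    }

    // Returns {trace-id, parent-id, trace-flags}, or null to restart the trace.
    static String[] parse(String header) {
        if (header == null || header.length() < 55) {
            return null; // shorter than the version 00 layout
        }
        if (!isLowerHex(header, 0, 2) || header.charAt(2) != '-' || header.startsWith("ff")) {
            return null; // version prefix cannot be parsed, or is the forbidden ff
        }
        if (!isLowerHex(header, 3, 35) || header.charAt(35) != '-') {
            return null; // trace-id: 32 hex characters followed by a dash
        }
        if (!isLowerHex(header, 36, 52) || header.charAt(52) != '-') {
            return null; // parent-id: 16 hex characters followed by a dash
        }
        if (!isLowerHex(header, 53, 55)) {
            return null; // trace-flags: 2 hex characters
        }
        if (header.length() > 55 && header.charAt(55) != '-') {
            return null; // flags must end the string or be followed by a dash
        }
        return new String[] {
            header.substring(3, 35), header.substring(36, 52), header.substring(53, 55)
        };
    }
}
```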

2.3 Tracestate Header

The main purpose of the tracestate HTTP header is to provide additional vendor-specific trace identification information across different distributed tracing systems and is a companion header for the traceparent field. It also conveys information about the request’s position in multiple distributed tracing graphs.

If the vendor failed to parse traceparent, it MUST NOT attempt to parse tracestate. Note that the opposite is not true: failure to parse tracestate MUST NOT affect the parsing of traceparent.

The tracestate HTTP header MUST NOT be used for any properties that are not defined by a tracing system. [BAGGAGE] MAY be used for defining and propagating such application level properties.

2.3.1 Header Name

Header name: tracestate

The header name is ASCII case-insensitive. That is, TRACESTATE, TraceState, and tracestate are considered the same header. The header name is a single word; it does not contain any delimiters such as a hyphen.

In order to increase interoperability across multiple protocols and encourage successful integration, tracing systems SHOULD encode the header name as ASCII lowercase.

2.3.2 tracestate Header Field Values

The tracestate field may contain any opaque value in any of the keys. Tracestate MAY be sent or received as multiple header fields. Multiple tracestate header fields MUST be handled as specified by RFC9110 Section 5.3 Field Order. The tracestate header SHOULD be sent as a single field when possible, but MAY be split into multiple header fields. When sending tracestate as multiple header fields, it MUST be split according to RFC9110. When receiving multiple tracestate header fields, they MUST be combined into a single header according to RFC9110.
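
This example is non-normative. A minimal sketch of combining multiple received tracestate header fields into a single value, in the order received and separated by commas as RFC9110 describes for list-based fields. The class and method names are hypothetical.

```java
import java.util.List;

final class TracestateCombiner {
    // Joins the received field values, first to last, into one comma-separated list.
    static String combine(List<String> fieldValues) {
        return String.join(",", fieldValues);
    }
}
```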

This section uses the Augmented Backus-Naur Form (ABNF) notation of [RFC5234], including the DIGIT rule in appendix B.1 of RFC5234. It also includes the OWS rule from RFC9110 section 5.6.3.

The DIGIT rule defines numbers 0-9.

The OWS rule defines an optional whitespace character. To improve readability, it is used where zero or more whitespace characters might appear.

Where optional whitespace improves readability, the caller SHOULD generate it as a single space; otherwise, a caller SHOULD NOT generate optional whitespace. See details in the corresponding RFC.

The tracestate field value is a list of list-members separated by commas (,). A list-member is a key/value pair separated by an equals sign (=). Spaces and horizontal tabs surrounding list-members are ignored. There can be a maximum of 32 list-members in a list. If adding an entry would cause the tracestate list to contain more than 32 list-members, the right-most list-member should be removed from the list.

Empty and whitespace-only list members are allowed. Vendors MUST accept empty tracestate headers but SHOULD avoid sending them. Empty list members are allowed in tracestate because it is difficult for a vendor to recognize the empty value when multiple tracestate headers are sent. Whitespace characters are allowed for a similar reason, as some vendors automatically inject whitespace after a comma separator, even in the case of an empty header.

2.3.2.1 list

A simple example of a list with two list-members might look like: vendorname1=opaqueValue1,vendorname2=opaqueValue2.

list  = list-member 0*31( OWS "," OWS list-member )
list-member = (key "=" value) / OWS

Identifiers for a list are short (up to 256 characters) textual identifiers.
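
This example is non-normative. The following sketch splits a combined tracestate value into ordered key/value pairs, skipping empty and whitespace-only list-members. The class and method names are hypothetical. Returning null for a malformed list, keeping the left-most entry when a key repeats, and rejecting lists with more than 32 list-members are assumptions made for illustration; character-level validation of keys and values is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class TracestateParser {
    static boolean isOws(char c) {
        return c == ' ' || c == '\t';
    }

    // Removes spaces and horizontal tabs surrounding a list-member.
    static String stripOws(String s) {
        int start = 0;
        int end = s.length();
        while (start < end && isOws(s.charAt(start))) {
            start++;
        }
        while (end > start && isOws(s.charAt(end - 1))) {
            end--;
        }
        return s.substring(start, end);
    }

    // Returns the list-members in header order, or null if the list is malformed.
    static Map<String, String> parse(String headerValue) {
        Map<String, String> entries = new LinkedHashMap<>();
        for (String member : headerValue.split(",", -1)) {
            String trimmed = stripOws(member);
            if (trimmed.isEmpty()) {
                continue; // empty and whitespace-only members are allowed
            }
            int eq = trimmed.indexOf('=');
            if (eq <= 0 || eq == trimmed.length() - 1) {
                return null; // missing key, missing "=", or missing value
            }
            entries.putIfAbsent(trimmed.substring(0, eq), trimmed.substring(eq + 1));
        }
        return entries.size() <= 32 ? entries : null;
    }
}
```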

2.3.2.2 list-members

A list-member contains a key/value pair.

2.3.2.2.1 Key

The key is an identifier that describes the vendor.

key = ( lcalpha / DIGIT ) 0*255 ( keychar )
keychar    = lcalpha / DIGIT / "_" / "-" / "*" / "/" / "@"
lcalpha    = %x61-7A ; a-z

A key MUST begin with a lowercase letter or a digit and contain up to 256 characters including lowercase letters (a-z), digits (0-9), underscores (_), dashes (-), asterisks (*), forward slashes (/), and at signs (@).
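
This example is non-normative. The key rule above can be checked with a regular expression, as in the following sketch. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

final class TracestateKey {
    // One lowercase letter or digit, then up to 255 further key characters.
    private static final Pattern KEY = Pattern.compile("[a-z0-9][a-z0-9_\\-*/@]{0,255}");

    static boolean isValid(String key) {
        return key != null && KEY.matcher(key).matches();
    }
}
```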

2.3.2.2.2 Value

The value is an opaque string containing up to 256 printable ASCII [RFC0020] characters (i.e., the range 0x20 to 0x7E) except comma (,) and equals sign (=). The string must end with a character which is not a space (0x20). Note that this also excludes tabs, newlines, carriage returns, etc. All leading spaces MUST be preserved as part of the value. All trailing spaces are considered to be optional whitespace characters not part of the value. Optional trailing whitespace MAY be excluded when propagating the header.

value    = 0*255(chr) nblk-chr
nblk-chr = %x21-2B / %x2D-3C / %x3E-7E
chr      = %x20 / nblk-chr
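
This example is non-normative. A minimal sketch of a check that follows the value rule above: 1 to 256 printable ASCII characters, no comma or equals sign, and no trailing space. The class and method names are hypothetical.

```java
final class TracestateValue {
    static boolean isValid(String value) {
        if (value == null || value.isEmpty() || value.length() > 256) {
            return false;
        }
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c < 0x20 || c > 0x7E || c == ',' || c == '=') {
                return false; // outside printable ASCII, or a reserved delimiter
            }
        }
        // Leading spaces are part of the value; a trailing space is not allowed.
        return value.charAt(value.length() - 1) != ' ';
    }
}
```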

2.3.3 Combined Header Values

The tracestate value is the concatenation of trace graph key/value pairs.

Example: vendorname1=opaqueValue1,vendorname2=opaqueValue2

Tracing tools are not supposed to add more than one entry for the same key. For example, if a vendor name is Congo and a trace started in their system and then went through a system named Rojo and later returned to Congo, the tracestate value would not be:

congo=congosFirstPosition,rojo=rojosFirstPosition,congo=congosSecondPosition

Instead, the entry would be rewritten to only include the most recent position: congo=congosSecondPosition,rojo=rojosFirstPosition

See 2.5 Mutating the tracestate Field for details.
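
This example is non-normative. The rewrite described above can be sketched as follows: the vendor's new entry is written left-most, any earlier entry with the same key is dropped, and the remaining entries keep their order. The class and method names are hypothetical, and keys and values are assumed to have been validated already.

```java
import java.util.ArrayList;
import java.util.List;

final class TracestateUpdater {
    // Returns a tracestate value with key=value as the left-most list-member.
    static String upsert(String tracestate, String key, String value) {
        List<String> members = new ArrayList<>();
        members.add(key + "=" + value);
        if (tracestate != null) {
            for (String member : tracestate.split(",")) {
                String m = member.trim();
                if (m.isEmpty() || m.startsWith(key + "=")) {
                    continue; // skip empty members and the entry being replaced
                }
                members.add(m);
            }
        }
        while (members.size() > 32) {
            members.remove(members.size() - 1); // drop right-most entries beyond 32
        }
        return String.join(",", members);
    }
}
```

For the Congo and Rojo scenario, upsert("rojo=rojosFirstPosition,congo=congosFirstPosition", "congo", "congosSecondPosition") returns congo=congosSecondPosition,rojo=rojosFirstPosition.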

2.3.3.1 tracestate Limits

Vendors SHOULD propagate at least 512 characters of a combined header. This length includes commas required to separate list items and optional white space (OWS) characters.

There are systems where propagating 512 characters of tracestate may be expensive. In this case, the maximum size of the propagated tracestate header SHOULD be documented and explained. The cost of propagating tracestate SHOULD be weighed against the value of monitoring scenarios enabled for the end users.

In a situation where tracestate is truncated due to the total size of the header value, the vendor MUST truncate whole entries. Entries longer than 128 characters SHOULD be removed first. Then entries SHOULD be removed starting from the end of tracestate. Other truncation strategies like safe list entries, blocked list entries, or size-based truncation SHOULD NOT be used.
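
This example is non-normative. The truncation order above can be sketched as follows: whole entries longer than 128 characters are removed first, then entries are removed from the end until the value fits. The class and method names and the maxLength parameter are hypothetical. Removing oversized entries from right to left is an assumption, since the order among them is not specified here.

```java
import java.util.ArrayList;
import java.util.List;

final class TracestateTruncator {
    static int joinedLength(List<String> members) {
        return String.join(",", members).length();
    }

    // Returns tracestate reduced to at most maxLength characters by removing whole entries.
    static String truncate(String tracestate, int maxLength) {
        List<String> members = new ArrayList<>();
        for (String member : tracestate.split(",")) {
            String m = member.trim();
            if (!m.isEmpty()) {
                members.add(m);
            }
        }
        // Step 1: remove entries longer than 128 characters while still over the limit.
        for (int i = members.size() - 1; i >= 0 && joinedLength(members) > maxLength; i--) {
            if (members.get(i).length() > 128) {
                members.remove(i);
            }
        }
        // Step 2: remove entries from the end until the value fits.
        while (!members.isEmpty() && joinedLength(members) > maxLength) {
            members.remove(members.size() - 1);
        }
        return String.join(",", members);
    }
}
```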

2.3.4 Examples of tracestate HTTP Headers

Single tracing system (generic format):

tracestate: rojo=00f067aa0ba902b7

Multiple tracing systems (with different formatting):

tracestate: rojo=00f067aa0ba902b7,congo=t61rcWkgMzE

2.3.5 Versioning of tracestate

The version of tracestate is defined by the version prefix of the traceparent header. If a higher version is detected, vendors need to attempt to parse tracestate to the best of their ability. It is the vendor’s decision whether to use partially-parsed tracestate key/value pairs or not.

2.4 Mutating the traceparent Field

A vendor receiving a request without a traceparent header SHOULD generate traceparent headers for outbound requests, effectively starting a new trace. A possible reason for not doing this could be a performance-sensitive scenario in which the vendor decides not to sample a request. Note that for most scenarios, vendors are expected to generate the header even when not sampling, to propagate the sampling decision downstream.

A vendor receiving a traceparent request header MUST send it to outgoing requests. It MAY mutate the value of this header before passing it to outgoing requests.

If the value of the traceparent field wasn't changed before propagation, tracestate MUST NOT be modified either. Unmodified header propagation is typically implemented in pass-through services like proxies. This behavior may also be implemented in a service which currently does not collect distributed tracing information.

Following is the list of allowed mutations:

Vendors MUST NOT make any other mutations to the traceparent header.

2.5 Mutating the tracestate Field

Vendors receiving a tracestate request header MUST send it to outgoing requests. They MAY mutate the value of this header before passing it to outgoing requests. When mutating tracestate, the order of unmodified key/value pairs MUST be preserved. Modified keys MUST be moved to the beginning (left) of the list.

Following are allowed mutations:

3. Glossary

This section is non-normative.

Distributed trace
A distributed trace is a set of events, triggered as a result of a single logical operation, consolidated across various components of an application. A distributed trace contains events that cross process, network and security boundaries. A distributed trace may be initiated when someone presses a button to start an action on a website - in this example, the trace will represent calls made between the downstream services that handled the chain of requests initiated by this button being pressed.
Opaque value
An opaque value refers to a value that can only be understood or processed in any way by the distributed trace participant that generated this value. Any other participant must treat it as a blob of bytes.

A. References

A.1 Normative references

[BAGGAGE]
Propagation format for distributed context: Baggage. Sergey Kanzhelev; Yuri Shkuro; Daniel Dyla; J. Kalyana Sundaram. W3C. 28 February 2024. W3C Working Draft. URL: https://www.w3.org/TR/baggage/
[BIT-FIELD]
8-bit field. Wikipedia. URL: https://en.wikipedia.org/wiki/Bit_field
[infra]
Infra Standard. Anne van Kesteren; Domenic Denicola. WHATWG. Living Standard. URL: https://infra.spec.whatwg.org/
[RFC0020]
ASCII format for network interchange. V.G. Cerf. IETF. October 1969. Internet Standard. URL: https://www.rfc-editor.org/rfc/rfc20
[RFC2119]
Key words for use in RFCs to Indicate Requirement Levels. S. Bradner. IETF. March 1997. Best Current Practice. URL: https://www.rfc-editor.org/rfc/rfc2119
[RFC5234]
Augmented BNF for Syntax Specifications: ABNF. D. Crocker, Ed.; P. Overell. IETF. January 2008. Internet Standard. URL: https://www.rfc-editor.org/rfc/rfc5234
[RFC8174]
Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words. B. Leiba. IETF. May 2017. Best Current Practice. URL: https://www.rfc-editor.org/rfc/rfc8174
[RFC9110]
HTTP Semantics. R. Fielding, Ed.; M. Nottingham, Ed.; J. Reschke, Ed. IETF. June 2022. Internet Standard. URL: https://httpwg.org/specs/rfc9110.html