Trace Context Level 2

Abstract

This specification defines standard HTTP headers and a value format to propagate context information that enables distributed tracing scenarios. The specification standardizes how context information is sent and modified between services. Context information uniquely identifies individual requests in a distributed system and also defines a means to add and propagate provider-specific context information.

3. Trace Context HTTP Request Headers Format

This section describes the binding of the distributed trace context to traceparent and tracestate HTTP headers.

3.1 Relationship Between the Headers

The traceparent request header represents the incoming request in a tracing system in a common format, understood by all vendors. Here’s an example of a traceparent header.

traceparent: 01-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01-0602

The tracestate request header includes the parent in a potentially vendor-specific format:


tracestate: congo=t61rcWkgMzE

For example, say a client and server in a system use different tracing vendors: Congo and Rojo. A client traced in the Congo system adds the following headers to an outbound HTTP request.


traceparent: 01-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01-0100
tracestate: congo=t61rcWkgMzE

Note: In this case, the tracestate value t61rcWkgMzE is the result of Base64 encoding the parent ID (b7ad6b7169203331), though such manipulations are not required.
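The encoding in the note can be reproduced with a short sketch. The class and method names below are hypothetical and not part of this specification. The sketch assumes the standard Base64 alphabet with the trailing = padding dropped, since the equals sign is not permitted inside a tracestate value.

```java
import java.util.Base64;

final class ParentIdEncoding {
    /** Decodes an even-length hex string into raw bytes. */
    static byte[] hexToBytes(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    /** Base64-encodes the raw parent-id bytes without '=' padding (assumption of this sketch). */
    static String encodeParentId(String parentIdHex) {
        return Base64.getEncoder().withoutPadding().encodeToString(hexToBytes(parentIdHex));
    }
}
```

A vendor is free to use any other opaque representation; this only illustrates how the Congo value in the example could have been produced.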

The receiving server, traced in the Rojo tracing system, carries over the tracestate it received and adds a new entry to the left.


traceparent: 01-0af7651916cd43dd8448eb211c80319c-00f067aa0ba902b7-01-0100
tracestate: rojo=00f067aa0ba902b7,congo=t61rcWkgMzE

Notice that the Rojo system reuses the parent-id from its traceparent as the value of its entry in tracestate. This means it is a generic tracing system (no proprietary information is being passed). Otherwise, tracestate entries are opaque and can be vendor-specific.

If the next receiving server uses Congo, it carries over the tracestate from Rojo and adds a new entry for the parent to the left of the previous entry.


traceparent: 01-0af7651916cd43dd8448eb211c80319c-b9c7c989f97918e1-01-0100
tracestate: congo=ucfJifl5GOE,rojo=00f067aa0ba902b7

Note: ucfJifl5GOE is the Base64 encoded parent ID b9c7c989f97918e1.

Notice that when Congo wrote its traceparent entry, the parent-id is not encoded, which keeps it consistent for those doing correlation. However, the value of its tracestate entry is encoded and differs from traceparent. This is acceptable.

Finally, you'll see tracestate retains an entry for Rojo exactly as it was, except pushed to the right. The left-most position lets the next server know which tracing system corresponds with traceparent. In this case, since Congo wrote traceparent, its tracestate entry should be left-most.

3.2 Traceparent Header

The traceparent HTTP header field identifies the incoming request in a tracing system. It has four required fields and one optional field:

version
trace-id
parent-id
trace-flags
sampling-constant (optional)

3.2.1 Header Name

Header name: traceparent

In order to increase interoperability across multiple protocols and encourage successful integration, by default vendors SHOULD keep the header name lowercase. The header name is a single word without any delimiters, for example, a hyphen (-).

Vendors MUST expect the header name in any case (upper, lower, mixed), and SHOULD send the header name in lowercase.

3.2.2 traceparent Header Field Values

This section uses the Augmented Backus-Naur Form (ABNF) notation of [RFC5234], including the DIGIT rule from that document. The DIGIT rule defines a single number character 0-9.

HEXDIGLC = DIGIT / "a" / "b" / "c" / "d" / "e" / "f" ; lowercase hex character
value           = version "-" version-format

The dash ( - ) character is used as a delimiter between fields.

3.2.2.1 version



version         = 2HEXDIGLC   ; this document assumes version 01. Version ff is forbidden

The value is US-ASCII encoded (which is UTF-8 compliant).

Version (version) is 1 byte representing an 8-bit unsigned integer. Version ff is invalid. The current specification assumes the version is set to 01.

3.2.2.2 version-format

The following version-format definition is used for version 01.



version-format           = trace-id "-" parent-id "-" trace-flags [ "-" sampling-constant ]
trace-id                 = 32HEXDIGLC  ; 16 bytes array identifier. All zeroes forbidden
parent-id                = 16HEXDIGLC  ; 8 bytes array identifier. All zeroes forbidden
trace-flags              = 2HEXDIGLC   ; 8 bit flags. Currently, only one bit is used. See below for details
sampling-constant        = sampling-random-value parent-sampling-constant
sampling-random-value    = 2HEXDIGLC   ; geometrically distributed 8-bit random value used for consistent sampling
parent-sampling-constant = 2HEXDIGLC   ; 8-bit value used to represent the head sampling probability.

3.2.2.3 trace-id

This is the ID of the whole trace forest and is used to uniquely identify a distributed trace through a system. It is represented as a 16-byte array, for example, 4bf92f3577b34da6a3ce929d0e0e4736. All bytes as zero ( 00000000000000000000000000000000 ) is considered an invalid value.

If the trace-id value is invalid (for example if it contains non-allowed characters or all zeros), vendors MUST ignore the traceparent.

See considerations for trace-id field generation for recommendations on how to operate with trace-id.

3.2.2.4 parent-id

This is the ID of this request as known by the caller (in some tracing systems, this is known as the span-id, where a span is the execution of a client request). It is represented as an 8-byte array, for example, 00f067aa0ba902b7. All bytes as zero ( 0000000000000000 ) is considered an invalid value.

Vendors MUST ignore the traceparent when the parent-id is invalid (for example, if it contains non-lowercase hex characters).
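The validity rules for trace-id and parent-id (fixed length, lowercase hex only, not all zeros) can be checked with a small helper. This is a minimal sketch; the class and method names are hypothetical.

```java
final class TraceIds {
    /** True if s has exactly `length` characters, all lowercase hex, and is not all zeros. */
    static boolean isValidId(String s, int length) {
        if (s == null || s.length() != length) {
            return false;
        }
        boolean nonZero = false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean lowerHex = (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f');
            if (!lowerHex) {
                return false; // uppercase hex and any other character are rejected
            }
            if (c != '0') {
                nonZero = true;
            }
        }
        return nonZero;
    }

    static boolean isValidTraceId(String s) {
        return isValidId(s, 32); // 16 bytes, hex-encoded
    }

    static boolean isValidParentId(String s) {
        return isValidId(s, 16); // 8 bytes, hex-encoded
    }
}
```

When either check fails, the whole traceparent is ignored rather than repaired.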

3.2.2.5 trace-flags

An 8-bit field that controls tracing flags such as sampling, trace level, etc. These flags are recommendations given by the caller rather than strict rules to follow for three reasons:

An untrusted caller may be able to abuse a tracing system by setting these flags maliciously.
A caller may have a bug which causes the tracing system to have a problem.
Different load between caller service and callee service might force callee to downsample.

You can find more in the section Security considerations of this specification.

Like other fields, trace-flags is hex-encoded. For example, all 8 flags set would be ff and no flags set would be 00.

As this is a bit field, you cannot interpret flags by decoding the hex value and looking at the resulting number. For example, a flag 00000001 could be encoded as 01 in hex, or 09 in hex if the flag 00001000 was also present (00001001 is 09). A common mistake in bit fields is forgetting to mask when interpreting flags.

Here is an example of properly handling trace flags:

static final byte FLAG_SAMPLED = 1; // 00000001
...
boolean sampled = (traceFlags & FLAG_SAMPLED) == FLAG_SAMPLED;

3.2.2.5.1 Sampled flag

The current version of this specification ( 00 ) only supports a single flag called sampled.

When set, the least significant bit (right-most) denotes that the caller may have recorded trace data. When unset, the caller did not record trace data out-of-band.

There are a number of recording scenarios that may break distributed tracing:

Only recording a subset of requests results in broken traces.
Recording information about all incoming and outgoing requests becomes prohibitively expensive under load.
Making random or component-specific data collection decisions leads to fragmented data in all traces.

Because of these issues, tracing vendors make their own recording decisions, and there is no consensus on what is the best algorithm for this job.

Various techniques include:

Probability sampling (sample 1 out of 100 distributed traces by flipping a coin)
Delayed decision (make collection decision based on duration or a result of a request)
Deferred sampling (let the callee decide whether information about this request needs to be collected)

How these techniques are implemented can be tracing vendor-specific or application-defined.

The tracestate field is designed to handle the variety of techniques for making recording decisions (or other specific information) specific for a given vendor. The sampled flag provides better interoperability between vendors. It allows vendors to communicate recording decisions and enable a better experience for the customer.

For example, when a SaaS service participates in a distributed trace, this service has no knowledge of the tracing vendor used by its caller. This service may produce records of incoming requests for monitoring or troubleshooting purposes. The sampled flag can be used to ensure that information about requests that were marked for recording by the caller will also be recorded by the SaaS service downstream so that the caller can troubleshoot the behavior of every recorded request.

The sampled flag has no restriction on its mutations except that it can only be mutated when parent-id is updated.

The following are a set of suggestions that vendors SHOULD use to increase vendor interoperability.

If a component made a definitive recording decision, this decision SHOULD be reflected in the sampled flag.
If a component needs to make a recording decision, it SHOULD respect the sampled flag value. Security considerations SHOULD be applied to protect from abusive or malicious use of this flag.
If a component deferred or delayed the decision and only a subset of telemetry will be recorded, the sampled flag should be propagated unchanged. It should be set to 0 as the default option when the trace is initiated by this component.

There are two additional options that vendors MAY follow:

A component that makes a deferred or delayed recording decision may communicate the priority of a recording by setting sampled flag to 1 for a subset of requests.
A component may also fall back to probability sampling and set the sampled flag to 1 for the subset of requests.

3.2.2.5.2 Other Flags

The behavior of other flags, such as 00000100, is not defined and is reserved for future use. Vendors MUST set those to zero.

3.2.2.6 sampling-constant

This optional field is represented by 4 lowercase hex digits. It consists of a geometrically distributed random value used to guarantee consistent sampling and a constant which can be used to calculate the sampling probability used by the parent. In order to optimize space, sampling probabilities are restricted to values 2^-x where x <= 61 and a special case which represents the probability 0.

3.2.2.6.1 sampling-random-value

This field is a random value less than or equal to 62 taken from the truncated geometric distribution with success probability a = 1/2. It is represented as an integer using 2 lowercase hex digits. It is used to ensure distributed traces are sampled consistently. For example, if two components in the same distributed trace have different sampling probability, this value may be used to ensure that the component with the lesser probability samples a subset of the traces sampled by the component with the greater probability. This ensures the maximum number of possible complete traces are sampled for a given set of sampling probabilities.
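One way to read the subset guarantee above is as a threshold rule: a component that samples with probability 2^-x keeps a trace when the propagated sampling-random-value r satisfies r >= x. This rule is an assumption of the sketch below, not a normative statement of this section. It is consistent with the example comments in section 3.2.3, where a value of 01 is sampled by all components using probability 0.5 or greater. The names are hypothetical.

```java
final class ConsistentSampling {
    /**
     * Assumed decision rule: sample when samplingRandomValue >= x,
     * where the component's sampling probability is 2^-x and 0 <= x <= 61.
     */
    static boolean shouldSample(int samplingRandomValue, int x) {
        if (x < 0 || x > 61) {
            throw new IllegalArgumentException("x must be in 0..61, got " + x);
        }
        return samplingRandomValue >= x;
    }
}
```

Under this rule, any trace kept by a component with a larger x (lower probability) is also kept by every component with a smaller x, which is the subset property described above.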

It can be efficiently calculated by counting the number of leading zeros in a random binary integer. For example, given the random integer 169797692638565684 represented by the 64-bit unsigned binary 0000001001011011001111100001101000010001011001110001010100110100, there are 6 leading zeros giving a sampling random value of 06. Samplers generating the sampling random value SHOULD use 62 bits of uniformly random binary to calculate the random value.
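The leading-zero calculation can be sketched as follows. The class and method names are hypothetical. The sketch draws a 64-bit random integer and caps the count at 62, so only the top 62 bits influence the result; this is one possible way to satisfy the 62-bit recommendation, not the only one.

```java
import java.security.SecureRandom;

final class SamplingRandomValue {
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Counts leading zeros of a 64-bit word, capped at 62 (the largest allowed value). */
    static int fromRandomLong(long random64) {
        return Math.min(62, Long.numberOfLeadingZeros(random64));
    }

    /** Draws a fresh sampling-random-value in the range 0..62. */
    static int next() {
        return fromRandomLong(RANDOM.nextLong());
    }

    /** Encodes the value as 2 lowercase hex digits for the traceparent header. */
    static String toHex(int value) {
        return String.format("%02x", value);
    }
}
```

For the integer used in the example above, `fromRandomLong(169797692638565684L)` yields 6, which `toHex` renders as 06.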

3.2.2.6.2 parent-sampling-constant

This value represents the sampling probability used by the parent of the current operation. It is represented as a lowercase hex-encoded 8-bit integer using 2 hex digits. The probability can be calculated using 2^-x where x is the value of this field. The special value of 3e (62) is taken to represent a head sampling probability of 0. The special value of 3f (63) is taken to represent an unknown head sampling probability. Any value greater than 3f (63) is invalid.
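Decoding this field into a probability can be sketched as below. The names are hypothetical, and representing the unknown case as an empty OptionalDouble is a choice made for this sketch.

```java
import java.util.OptionalDouble;

final class ParentSamplingConstant {
    /**
     * Maps the decoded constant to a probability:
     * 0..61 gives 2^-x, 62 gives 0, 63 is unknown (empty), anything else is invalid.
     */
    static OptionalDouble probability(int constant) {
        if (constant < 0 || constant > 63) {
            throw new IllegalArgumentException("invalid parent-sampling-constant: " + constant);
        }
        if (constant == 63) {
            return OptionalDouble.empty();
        }
        if (constant == 62) {
            return OptionalDouble.of(0.0);
        }
        return OptionalDouble.of(Math.scalb(1.0, -constant)); // exact power of two
    }
}
```

For example, the constant 00 decodes to probability 1.0 and 01 decodes to 0.5.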

3.2.3 Examples of HTTP traceparent Headers

Valid traceparent when caller sampled this request:

Value = 01-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01-0100

base16(version) = 01
base16(trace-id) = 4bf92f3577b34da6a3ce929d0e0e4736
base16(parent-id) = 00f067aa0ba902b7
base16(trace-flags) = 01  // sampled
base16(sampling-random-value) = 01 // trace sampled by all components using sampling probability 0.5 or greater
base16(parent-sampling-constant) = 00 // all traces sampled by head of trace

Valid traceparent when caller didn’t sample this request:


Value = 01-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-00-0100

base16(version) = 01
base16(trace-id) = 4bf92f3577b34da6a3ce929d0e0e4736
base16(parent-id) = 00f067aa0ba902b7
base16(trace-flags) = 00  // not sampled
base16(sampling-random-value) = 01 // trace sampled by all components using sampling probability 0.5 or greater
base16(parent-sampling-constant) = 00 // all traces sampled by head of trace

3.2.4 Versioning of traceparent

This specification is opinionated about future versions of trace context. The current version of this specification assumes that future versions of the traceparent header will be additive to the current one.

Vendors MUST follow these rules when parsing headers with an unexpected format:

Pass-through services should not analyze the version. They should expect that headers may have larger size limits in the future and only disallow prohibitively large headers.
When the version prefix cannot be parsed (it's not 2 hex characters followed by a dash ( - )), the implementation should restart the trace.
If a higher version is detected, the implementation SHOULD try to parse it by trying the following:
- If the size of the header is shorter than 55 characters, the vendor should not parse the header and should restart the trace.
- Parse trace-id (from the first dash through the next 32 characters). Vendors MUST check that the 32 characters are hex, and that they are followed by a dash ( - ).
- Parse parent-id (from the second dash at the 35th position through the next 16 characters). Vendors MUST check that the 16 characters are hex and followed by a dash.
- Parse the sampled bit of flags (2 characters from the third dash). Vendors MUST check that the 2 characters are either at the end of the string or followed by a dash.
- Parse the sampling-constant (from the fourth dash at the 55th position through the next 4 characters). Vendors MUST check that the 4 characters are either at the end of the string or followed by a dash.
If all three required values were parsed successfully, the vendor should use them. If the sampling constant was parsed successfully, the vendor should use it.
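The parsing steps above can be sketched as a single lenient parser. This is an illustration under stated assumptions, not a reference implementation: the class, its fields, and the choice to return null for "restart the trace" are all hypothetical, and the sketch does not enforce the exact length of known versions or validate the contents of the sampling-constant.

```java
final class TraceparentParser {
    /** Parsed fields; samplingConstant is null when absent or not parseable. */
    static final class Parsed {
        final String traceId;
        final String parentId;
        final boolean sampled;
        final String samplingConstant;

        Parsed(String traceId, String parentId, boolean sampled, String samplingConstant) {
            this.traceId = traceId;
            this.parentId = parentId;
            this.sampled = sampled;
            this.samplingConstant = samplingConstant;
        }
    }

    /** Returns null when the header cannot be used and the trace should be restarted. */
    static Parsed parse(String header) {
        if (header == null || header.length() < 55) return null;
        // version: 2 lowercase hex characters followed by a dash; ff is forbidden
        if (!isHex(header, 0, 2) || header.charAt(2) != '-' || header.startsWith("ff")) return null;
        // trace-id: 32 hex characters followed by a dash
        if (!isHex(header, 3, 35) || header.charAt(35) != '-') return null;
        // parent-id: 16 hex characters followed by a dash
        if (!isHex(header, 36, 52) || header.charAt(52) != '-') return null;
        // trace-flags: 2 hex characters at the end of the string or followed by a dash
        if (!isHex(header, 53, 55)) return null;
        if (header.length() > 55 && header.charAt(55) != '-') return null;

        String traceId = header.substring(3, 35);
        String parentId = header.substring(36, 52);
        if (isAllZeros(traceId) || isAllZeros(parentId)) return null;
        boolean sampled = (Integer.parseInt(header.substring(53, 55), 16) & 0x01) == 0x01;

        // sampling-constant (optional): 4 hex characters at the end or followed by a dash
        String samplingConstant = null;
        if (header.length() >= 60 && isHex(header, 56, 60)
                && (header.length() == 60 || header.charAt(60) == '-')) {
            samplingConstant = header.substring(56, 60);
        }
        return new Parsed(traceId, parentId, sampled, samplingConstant);
    }

    private static boolean isHex(String s, int from, int to) {
        for (int i = from; i < to; i++) {
            char c = s.charAt(i);
            if (!((c >= '0' && c <= '9') || (c >= 'a' && c <= 'f'))) return false;
        }
        return true;
    }

    private static boolean isAllZeros(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) != '0') return false;
        }
        return true;
    }
}
```

Only the sampled bit is extracted from trace-flags, and anything after the sampling-constant is ignored, which matches the rule that unknown fields must not be interpreted.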

Vendors MUST NOT parse or assume anything about unknown fields for this version. Vendors MUST use these fields to construct the new traceparent field according to the highest version of the specification known to the implementation (in this specification it is 00 ).

3.3 Tracestate Header

The tracestate HTTP header is a companion header for the traceparent field. Its main purpose is to provide additional vendor-specific trace identification information across different distributed tracing systems. It also conveys information about the request’s position in multiple distributed tracing graphs.

If the vendor failed to parse traceparent, it MUST NOT attempt to parse tracestate. Note that the opposite is not true: failure to parse tracestate MUST NOT affect the parsing of traceparent.

The tracestate HTTP header MUST NOT be used for any properties that are not defined by a tracing system. [BAGGAGE] MAY be used for defining and propagating such application level properties.

3.3.1 Header Name

Header name: tracestate

In order to increase interoperability across multiple protocols and encourage successful integration, by default you SHOULD keep the header name lowercase. The header name is a single word without any delimiters, for example, a hyphen ( - ).

Vendors MUST expect the header name in any case (upper, lower, mixed), and SHOULD send the header name in lowercase.

3.3.2 tracestate Header Field Values

The tracestate field may contain any opaque value in any of the keys. Tracestate MAY be sent or received as multiple header fields. Multiple tracestate header fields MUST be handled as specified by RFC7230 Section 3.2.2 Field Order. The tracestate header SHOULD be sent as a single field when possible, but MAY be split into multiple header fields. When sending tracestate as multiple header fields, it MUST be split according to RFC7230. When receiving multiple tracestate header fields, they MUST be combined into a single header according to RFC7230.

This section uses the Augmented Backus-Naur Form (ABNF) notation of [RFC5234], including the DIGIT rule in appendix B.1 for RFC5234. It also includes the OWS rule from RFC7230 section 3.2.3.

The DIGIT rule defines numbers 0 - 9.

The OWS rule defines an optional whitespace character. To improve readability, it is used where zero or more whitespace characters might appear.

Where optional whitespace is preferred to improve readability, the caller SHOULD generate it as a single space; otherwise, a caller SHOULD NOT generate optional whitespace. See details in the corresponding RFC.

The tracestate field value is a list of list-members separated by commas (,). A list-member is a key/value pair separated by an equals sign (=). Spaces and horizontal tabs surrounding list-members are ignored. There can be a maximum of 32 list-members in a list. If adding an entry would cause the tracestate list to contain more than 32 list-members, the right-most list-member should be removed from the list.

Empty and whitespace-only list members are allowed. Vendors MUST accept empty tracestate headers but SHOULD avoid sending them. Empty list members are allowed in tracestate because it is difficult for a vendor to recognize the empty value when multiple tracestate headers are sent. Whitespace characters are allowed for a similar reason, as some vendors automatically inject whitespace after a comma separator, even in the case of an empty header.
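The rules above (combine multiple header fields, ignore surrounding spaces and tabs, tolerate empty members) can be sketched as a small parser. The names are hypothetical. Several behaviors are assumptions of this sketch rather than requirements stated here: a malformed member or more than 32 members causes the whole header to be rejected with an exception, and when a key repeats, the left-most entry is kept. Key and value character sets are not validated.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TracestateParser {
    static final int MAX_MEMBERS = 32;

    /** Parses one or more tracestate field values into ordered key/value pairs. */
    static Map<String, String> parse(List<String> fieldValues) {
        // Multiple header fields are combined in order, separated by commas.
        String combined = String.join(",", fieldValues);
        Map<String, String> members = new LinkedHashMap<>();
        for (String raw : combined.split(",", -1)) {
            String member = stripOws(raw);
            if (member.isEmpty()) {
                continue; // empty and whitespace-only members are allowed and skipped
            }
            int eq = member.indexOf('=');
            if (eq <= 0 || eq == member.length() - 1) {
                throw new IllegalArgumentException("malformed list-member: " + member);
            }
            // Assumption: keep the left-most entry when a key appears more than once.
            members.putIfAbsent(member.substring(0, eq), member.substring(eq + 1));
        }
        if (members.size() > MAX_MEMBERS) {
            throw new IllegalArgumentException("more than " + MAX_MEMBERS + " list-members");
        }
        return members;
    }

    /** Removes spaces and horizontal tabs around a list-member. */
    private static String stripOws(String s) {
        int start = 0;
        int end = s.length();
        while (start < end && isOws(s.charAt(start))) start++;
        while (end > start && isOws(s.charAt(end - 1))) end--;
        return s.substring(start, end);
    }

    private static boolean isOws(char c) {
        return c == ' ' || c == '\t';
    }
}
```

Because only the whitespace around the whole member is stripped, spaces that directly follow the equals sign remain part of the value.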

3.3.2.1 list

A simple example of a list with two list-members might look like: vendorname1=opaqueValue1,vendorname2=opaqueValue2.

list  = list-member 0*31( OWS "," OWS list-member )
list-member = (key "=" value) / OWS

Identifiers for a list are short (up to 256 characters) textual identifiers.

3.3.2.2 list-members

A list-member contains a key/value pair.

3.3.2.2.1 Key

The key is an identifier that describes the vendor.



key     = ( lcalpha / DIGIT ) 0*255 ( keychar )
keychar = lcalpha / DIGIT / "_" / "-" / "*" / "/" / "@"
lcalpha = %x61-7A ; a-z

A key MUST begin with a lowercase letter or a digit and contain up to 256 characters including lowercase letters ( a - z ), digits ( 0 - 9 ), underscores ( _ ), dashes ( - ), asterisks ( * ), forward slashes ( / ), and at signs ( @ ).
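The key grammar translates directly into a regular expression. A minimal sketch, with hypothetical names:

```java
import java.util.regex.Pattern;

final class TracestateKey {
    // First character: lowercase letter or digit; then up to 255 keychar characters.
    private static final Pattern KEY = Pattern.compile("[a-z0-9][a-z0-9_\\-*/@]{0,255}");

    static boolean isValid(String key) {
        return key != null && KEY.matcher(key).matches();
    }
}
```

`matches()` anchors the pattern to the whole string, so no explicit `^` or `$` is needed.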

3.3.2.2.2 Value

The value is an opaque string containing up to 256 printable ASCII [RFC0020] characters (i.e., the range 0x20 to 0x7E) except comma (,) and equals sign (=). The string must end with a character which is not a space (0x20). Note that this also excludes tabs, newlines, carriage returns, etc. All leading spaces MUST be preserved as part of the value. All trailing spaces are considered to be optional whitespace characters not part of the value. Optional trailing whitespace MAY be excluded when propagating the header.



value    = 0*255(chr) nblk-chr
nblk-chr = %x21-2B / %x2D-3C / %x3E-7E
chr      = %x20 / nblk-chr
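The value grammar can likewise be checked with a regular expression: up to 255 characters from chr followed by one final non-blank character. A minimal sketch, with hypothetical names:

```java
import java.util.regex.Pattern;

final class TracestateValue {
    // chr = 0x20..0x7E without ',' (0x2C) and '=' (0x3D); the last character must not be a space.
    private static final Pattern VALUE = Pattern.compile(
            "[\\x20-\\x2B\\x2D-\\x3C\\x3E-\\x7E]{0,255}[\\x21-\\x2B\\x2D-\\x3C\\x3E-\\x7E]");

    static boolean isValid(String value) {
        return value != null && VALUE.matcher(value).matches();
    }
}
```

Leading spaces pass the check because they are part of the value, while a trailing space fails it because trailing spaces are treated as optional whitespace outside the value.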

3.3.3 Combined Header Value

The tracestate value is the concatenation of trace graph key/value pairs.

Example: vendorname1=opaqueValue1,vendorname2=opaqueValue2

Only one entry per key is allowed because the entry represents the last position in the trace. Hence vendors must overwrite their entry upon reentry to their tracing system.

For example, if a vendor name is Congo and a trace started in their system and then went through a system named Rojo and later returned to Congo, the tracestate value would not be:

congo=congosFirstPosition,rojo=rojosFirstPosition,congo=congosSecondPosition

Instead, the entry would be rewritten to only include the most recent position: congo=congosSecondPosition,rojo=rojosFirstPosition
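The overwrite-on-reentry rule can be sketched as an "upsert" that removes any existing entry for the vendor's key, places the new entry left-most, and drops right-most entries beyond the 32-member limit. The names are hypothetical, and the sketch assumes the incoming tracestate is already a single, well-formed header value.

```java
import java.util.ArrayList;
import java.util.List;

final class TracestateUpdate {
    static final int MAX_MEMBERS = 32;

    /** Returns a new tracestate with key=value left-most and any older entry for key removed. */
    static String upsert(String tracestate, String key, String value) {
        List<String> out = new ArrayList<>();
        out.add(key + "=" + value);
        for (String raw : tracestate.split(",")) {
            String member = raw.trim();
            if (member.isEmpty() || member.startsWith(key + "=")) {
                continue; // skip empty members and the vendor's previous position
            }
            out.add(member);
        }
        while (out.size() > MAX_MEMBERS) {
            out.remove(out.size() - 1); // drop right-most entries first
        }
        return String.join(",", out);
    }
}
```

Applied to the Congo and Rojo example, `upsert("congo=congosFirstPosition,rojo=rojosFirstPosition", "congo", "congosSecondPosition")` produces the rewritten value shown above.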

3.3.3.1 tracestate Limits:

Vendors SHOULD propagate at least 512 characters of a combined header. This length includes commas required to separate list items and optional white space (OWS) characters.

There are systems where propagating 512 characters of tracestate may be expensive. In this case, the maximum size of the propagated tracestate header SHOULD be documented and explained. The cost of propagating tracestate SHOULD be weighed against the value of monitoring scenarios enabled for the end users.

In a situation where tracestate is truncated due to the total size of the header value, the vendor MUST truncate whole entries. Entries longer than 128 characters SHOULD be removed first. Then entries SHOULD be removed starting from the end of tracestate. Other truncation strategies like safe list entries, blocked list entries, or size-based truncation SHOULD NOT be used.
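The truncation order can be sketched in two passes. The names are hypothetical. The text does not say whether all oversized entries are removed or only as many as needed, nor in which order; this sketch assumes they are removed one at a time from the right until the header fits, then falls back to removing entries from the end.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class TracestateTruncation {
    /** Removes whole entries until the combined header is at most maxLength characters. */
    static String truncate(String tracestate, int maxLength) {
        List<String> members = new ArrayList<>(Arrays.asList(tracestate.split(",")));
        // Pass 1: remove entries longer than 128 characters (right to left, an assumption).
        for (int i = members.size() - 1; i >= 0 && joinedLength(members) > maxLength; i--) {
            if (members.get(i).length() > 128) {
                members.remove(i);
            }
        }
        // Pass 2: remove entries from the end of the list.
        while (!members.isEmpty() && joinedLength(members) > maxLength) {
            members.remove(members.size() - 1);
        }
        return String.join(",", members);
    }

    /** Length of the header value including the separating commas. */
    private static int joinedLength(List<String> members) {
        return String.join(",", members).length();
    }
}
```

Entries are never cut in the middle, so every surviving list-member is still a complete key/value pair.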

3.3.4 Examples of tracestate HTTP Headers

Single tracing system (generic format):

tracestate: rojo=00f067aa0ba902b7

Multiple tracing systems (with different formatting):


tracestate: rojo=00f067aa0ba902b7,congo=t61rcWkgMzE

3.3.5 Versioning of tracestate

The version of tracestate is defined by the version prefix of the traceparent header. If a higher version is detected, vendors need to attempt to parse tracestate to the best of their ability. It is the vendor’s decision whether to use partially-parsed tracestate key/value pairs or not.

3.4 Mutating the traceparent Field

A vendor receiving a traceparent request header MUST send it to outgoing requests. It MAY mutate the value of this header before passing it to outgoing requests.

If the value of the traceparent field wasn't changed before propagation, tracestate MUST NOT be modified either. Unmodified header propagation is typically implemented in pass-through services like proxies. This behavior may also be implemented in a service which currently does not collect distributed tracing information.

Following is the list of allowed mutations:

Update parent-id: The value of the parent-id field can be set to the new value representing the ID of the current operation. This is the most typical mutation and should be considered a default.
Update sampled: The value of the sampled field reflects the caller's recording behavior: either trace data was dropped or may have been recorded out-of-band. This can be indicated by toggling the flag in both directions. This mutation gives the downstream vendor information about the likelihood that its parent's information was recorded. The parent-id field MUST be set to a new value with the sampled flag update.
Restart trace: All properties (trace-id, parent-id, trace-flags) are regenerated. This mutation is used in services that are defined as a front gate into secure networks and eliminates a potential denial-of-service attack surface. Vendors SHOULD clean up tracestate collection on traceparent restart. There are rare cases when the original tracestate entries must be preserved after a restart. This typically happens when the trace-id is reverted back at some point of the trace flow, for instance, when it leaves the secure network. However, it SHOULD be an explicit decision, and not the default behavior.
Downgrade the version: This version of the specification (00) defines the behavior for a vendor that receives a traceparent header of a higher version. In this case, the first mutation is to downgrade the version of the header. Other mutations are allowed in combination with this one.

Vendors MUST NOT make any other mutations to the traceparent header.
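The two most common mutations, updating parent-id and updating sampled together with it, can be sketched as follows. The names are hypothetical. The sketch assumes the incoming header has already been validated, writes all flags other than sampled as zero, and carries over anything after the trace-flags field unchanged; that last choice is an assumption, since this section does not describe how trailing fields are mutated.

```java
import java.security.SecureRandom;

final class TraceparentMutation {
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Generates a random, non-zero parent-id as 16 lowercase hex characters. */
    static String newParentId() {
        long id;
        do {
            id = RANDOM.nextLong();
        } while (id == 0L); // all zeros is an invalid parent-id
        return String.format("%016x", id);
    }

    /**
     * Replaces parent-id (characters 36..51) and rewrites trace-flags (characters 53..54).
     * The sampled flag is only changed together with parent-id, as required.
     */
    static String withNewParentId(String traceparent, String newParentId, boolean sampled) {
        String flags = sampled ? "01" : "00"; // other flags are reserved and set to zero
        return traceparent.substring(0, 36) + newParentId + "-" + flags + traceparent.substring(55);
    }
}
```

A typical call site would be `withNewParentId(incoming, newParentId(), sampledDecision)`, where `incoming` and `sampledDecision` stand for the received header value and the component's own recording decision.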

3.5 Mutating the tracestate Field

Vendors receiving a tracestate request header MUST send it to outgoing requests. It MAY mutate the value of this header before passing to outgoing requests. When mutating tracestate, the order of unmodified key/value pairs MUST be preserved. Modified keys MUST be moved to the beginning (left) of the list.

Following are allowed mutations:

Add a new key/value pair. The new key/value pair SHOULD be added to the beginning of the list.
Update an existing value. The value for any given key can be updated. Modified keys SHOULD be moved to the beginning (left) of the list.
Delete a key/value pair. Any key/value pair MAY be deleted. Vendors SHOULD NOT delete keys that were not generated by them. The deletion of an unknown key/value pair will break correlation in other systems. This mutation enables two scenarios. The first is that proxies can block certain tracestate keys for privacy and security concerns. The second scenario is a truncation of long tracestates.

4. Trace Context HTTP Response Headers Format

This section describes the binding of the distributed trace context to the traceresponse HTTP header.

4.1 Traceresponse Header

The traceresponse HTTP response header field identifies a completed request in a tracing system. It has four fields:

version
trace-id
child-id
trace-flags

4.1.1 Header Name

Header name: traceresponse

In order to increase interoperability across multiple protocols and encourage successful integration, the header name SHOULD be lowercase. The header name is a single word without any delimiters, for example, a hyphen (-).

Tracing systems MUST expect the header name in any case (upper, lower, mixed), and SHOULD send the header name in lowercase.

4.1.2 traceresponse Header Field Values

This section uses the Augmented Backus-Naur Form (ABNF) notation of [RFC5234], including the DIGIT rule from that document. The DIGIT rule defines a single number character 0-9.

HEXDIGLC = DIGIT / "a" / "b" / "c" / "d" / "e" / "f" ; lowercase hex character
value           = version "-" version-format

The dash ( - ) character is used as a delimiter between fields.

4.1.2.1 version



version         = 2HEXDIGLC   ; this document assumes version 00. Version 255 is forbidden

The value is US-ASCII encoded (which is UTF-8 compliant).

Version ( version ) is 1 byte representing an 8-bit unsigned integer. Version 255 is invalid. The current specification assumes the version is set to 00.

4.1.2.2 version-format

The following version-format definition is used for version 00.



version-format  = trace-id "-" child-id "-" trace-flags
trace-id        = 32HEXDIGLC  ; 16 bytes array identifier. All zeroes forbidden
child-id        = 16HEXDIGLC  ; 8 bytes array identifier. All zeroes forbidden
trace-flags     = 2HEXDIGLC   ; 8 bit flags. Currently, only one bit is used. See below for details

4.1.2.3 trace-id

If the trace-id value is invalid (for example if it contains non-allowed characters or all zeros), tracing systems MUST ignore the traceresponse.

See considerations for trace-id field generation for recommendations on how to operate with trace-id.

4.1.2.4 child-id

This is the ID of the operation of the callee (in some systems known as the span id) and is used to uniquely identify an operation within a trace. It is represented as an 8-byte array, for example, 00f067aa0ba902b7. All bytes as zero ( 0000000000000000 ) is considered an invalid value.

4.1.2.5 trace-flags

An 8-bit field that provides additional information about how the callee handled the trace such as sampling, trace level, etc. These flags are recommendations given by the callee rather than strict rules to follow for three reasons:

An untrusted callee may be able to abuse a tracing system by setting these flags maliciously.
A callee may have a bug which causes the tracing system to have a problem.
Different load between calling and called services might force one or more participants to discard part or all of a trace.

You can find more in the section Security considerations of this specification.

Like other fields, trace-flags is hex-encoded. For example, all 8 flags set would be ff and no flags set would be 00.

As this is a bit field, you cannot interpret flags by decoding the hex value and looking at the resulting number. For example, a flag 00000001 could be encoded as 01 in hex, or 09 in hex if the flag 00001000 was also present (00001001 is 09). A common mistake in bit fields is forgetting to mask when interpreting flags.

Here is an example of properly handling trace flags:

static final byte FLAG_SAMPLED = 1; // 00000001
...
boolean sampled = (traceFlags & FLAG_SAMPLED) == FLAG_SAMPLED;

4.1.2.5.1 Sampled flag

The current version of this specification ( 00 ) only supports a single flag called sampled.

When set, the least significant bit (right-most) denotes that the callee may have recorded trace data. When unset, the callee did not record trace data out-of-band.

The sampled flag provides interoperability between tracing systems. It allows tracing systems to communicate recording decisions and enable a better experience for the customer. For example, when a SaaS load balancer service participates in a distributed trace, this service has no knowledge of the tracing system used by its callee. This service may produce records of incoming requests for monitoring or troubleshooting purposes. The sampled flag can be used to ensure that information about requests that were marked for recording by the callee will also be recorded by the SaaS load balancer service upstream so that the callee can troubleshoot the behavior of every recorded request.

The sampled flag has no restrictions.

The following are a set of suggestions that tracing systems SHOULD use to increase interoperability.

If a component made a definitive recording decision, this decision SHOULD be reflected in the sampled flag.
If a component needs to make a recording decision, it SHOULD respect the sampled flag value. Security considerations SHOULD be applied to protect from abusive or malicious use of this flag.
If a component deferred or delayed the decision and only a subset of telemetry will be recorded, the sampled flag from the incoming traceparent header should be used if it is available. It should be set to 0 as the default option when the trace is initiated by this component.
If a component receives a 0 for the sampled flag on an incoming request, it may still decide to record a trace. In this case it SHOULD return a sampled flag 1 on the response so that the caller can update its sampling decision if required.

There are two additional options that tracing systems MAY follow:

A component that makes a deferred or delayed recording decision may communicate the priority of a recording by setting sampled flag to 1 for a subset of requests.
A component may also fall back to probability sampling and set the sampled flag to 1 for the subset of requests.

4.1.2.5.2 Other Flags

The behavior of other flags, such as 00000100, is not defined and is reserved for future use. Tracing systems MUST set those to zero.

5. Processing Model

This section is non-normative.

This section provides a step-by-step example of a tracing vendor receiving a request with trace context headers, processing the request and then potentially forwarding it. This description can be used as a reference when implementing a trace context-compliant tracing system, middleware (like a proxy or messaging bus), or a cloud service.

5.1 Processing Model for Working with Trace Context Request Header

This processing model describes the behavior of a vendor that modifies and forwards trace context headers. How the model works depends on whether or not a traceparent header is received.

5.1.1 No traceparent Received

If no traceparent header is received:

The vendor checks an incoming request for a traceparent and a tracestate header.
Because the traceparent header is not received, the vendor creates a new trace-id and parent-id that represents the current request. (Note: If the vendor does not sample this request and wants to communicate that sampling decision downstream via the sampled flag, the vendor MAY create a trace-id and parent-id that are not associated with any actual trace data. The vendor MAY also decide to not communicate the sampling decision downstream.)
If a tracestate header is received without an accompanying traceparent header, it is invalid and MUST be discarded.
The vendor SHOULD create a new tracestate header and add a new key/value pair.
The vendor sets the traceparent and tracestate header for the outgoing request.
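
The steps above might look as follows in code. This is a sketch under stated assumptions: the function names are hypothetical, version `00` is used as in the examples of this section, all-zero identifiers are treated as invalid, and the vendor's tracestate value simply repeats the new parent-id (a real vendor may store any opaque value).

```python
import secrets


def new_traceparent(sampled: bool) -> str:
    """Build a traceparent for a request that arrived without one."""
    trace_id = secrets.token_hex(16)   # 16 random bytes -> 32 hex characters
    parent_id = secrets.token_hex(8)   # 8 random bytes -> 16 hex characters
    # Regenerate in the (extremely unlikely) case of an all-zero identifier.
    while int(trace_id, 16) == 0:
        trace_id = secrets.token_hex(16)
    while int(parent_id, 16) == 0:
        parent_id = secrets.token_hex(8)
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{parent_id}-{flags}"


def headers_for_new_trace(vendor_key: str, sampled: bool) -> dict:
    """Return outgoing headers; any incoming tracestate is discarded."""
    traceparent = new_traceparent(sampled)
    parent_id = traceparent.split("-")[2]
    # Start a fresh tracestate holding only this vendor's entry.
    return {
        "traceparent": traceparent,
        "tracestate": f"{vendor_key}={parent_id}",
    }
```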

5.1.2 A traceparent is Received

If a traceparent header is received:

The vendor checks an incoming request for a traceparent and a tracestate header.
Because the traceparent header is present, the vendor tries to parse the version of the traceparent header.
1. If the version cannot be parsed, the vendor creates a new traceparent header and deletes tracestate.
2. If the version number is higher than supported by the tracer, the vendor uses the format defined in this specification (00) to parse trace-id and parent-id. The vendor will only parse the trace-flags values supported by this version of this specification and ignore all other values. If parsing fails, the vendor creates a new traceparent header and deletes the tracestate. Vendors will set all unparsed or unknown trace-flags to 0 on outgoing requests.
3. If the vendor supports the version number, it validates trace-id and parent-id. If any of trace-id, parent-id, or trace-flags is invalid, the vendor creates a new traceparent header and deletes tracestate.
The vendor MAY validate the tracestate header. If the tracestate header cannot be parsed the vendor MAY discard the entire header. Invalid tracestate entries MAY also be discarded.
For each outgoing request the vendor performs the following steps:
1. The vendor MUST modify the traceparent header:
  - Update parent-id: The value of property parent-id MUST be set to a value representing the ID of the current operation.
  - Update sampled: The value of sampled reflects the caller's recording behavior. The value of the sampled flag of trace-flags MAY be set to 1 if the trace data is likely to be recorded or to 0 otherwise. Setting the flag is no guarantee that the trace will be recorded but increases the likelihood of end-to-end recorded traces.
2. The vendor MAY modify the tracestate header:
  - Update a key value: The value of any key can be updated. Modified keys MUST be moved to the beginning (left) of the list.
  - Add a new key/value pair: The new key-value pair MUST be added to the beginning (left) of the list.
  - Delete a key/value pair: Any key/value pair MAY be deleted. Vendors SHOULD NOT delete keys that weren't generated by themselves. Deletion of any key/value pair MAY break correlation in other systems.
3. The vendor sets the traceparent and tracestate header for the outgoing request.
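
The parsing and forwarding steps above can be sketched as shown here. The function names are hypothetical, and the validation rules encoded in the sketch (lowercase hex only, version `ff` rejected, all-zero identifiers rejected, no trailing fields for version `00`) are assumptions drawn from the traceparent field definitions rather than from this section. A rejected header causes a new trace to be started and tells the caller to drop tracestate.

```python
import re
import secrets
from typing import Optional, Tuple

# Four fields in the version 00 layout; higher versions may append more fields.
_TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})(-.*)?$"
)


def parse_traceparent(value: str) -> Optional[Tuple[str, str, int]]:
    """Return (trace_id, parent_id, flags), or None if the header must be replaced."""
    match = _TRACEPARENT_RE.match(value.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags, extra = match.groups()
    if version == "ff":
        return None  # assumed invalid version value
    if version == "00" and extra is not None:
        return None  # version 00 has exactly four fields
    if int(trace_id, 16) == 0 or int(parent_id, 16) == 0:
        return None  # all-zero identifiers are treated as invalid
    return trace_id, parent_id, int(flags, 16)


def outgoing_traceparent(incoming: Optional[str], sampled: bool) -> Tuple[str, bool]:
    """Return (traceparent, keep_tracestate) for an outgoing request."""
    parsed = parse_traceparent(incoming) if incoming else None
    # New parent-id for the current operation; the negligible chance of an
    # all-zero random value is ignored in this sketch.
    new_parent_id = secrets.token_hex(8)
    flags = "01" if sampled else "00"  # unknown flag bits are sent as 0
    if parsed is None:
        return f"00-{secrets.token_hex(16)}-{new_parent_id}-{flags}", False
    trace_id, _, _ = parsed
    return f"00-{trace_id}-{new_parent_id}-{flags}", True
```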

5.1.3 Alternative Processing

The processing model above describes the complete set of steps for processing trace context headers. There are, however, situations when a vendor might only support a subset of the steps described above. Proxies or messaging middleware MAY decide not to modify the traceparent headers but remove invalid headers or add additional information to tracestate.
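
A component that only adds information to tracestate still needs to follow the ordering rule from section 5.1.2: new or modified entries go to the beginning (left) of the list. The following sketch illustrates that rule; the function names are hypothetical, and the parser is deliberately simplified (it skips members without a key or value but does not check the allowed character set or length limits).

```python
from typing import List, Tuple


def parse_tracestate(header: str) -> List[Tuple[str, str]]:
    """Split a tracestate header into (key, value) pairs, skipping malformed members."""
    entries = []
    for member in header.split(","):
        key, sep, value = member.strip().partition("=")
        if sep and key and value:
            entries.append((key, value))
    return entries


def upsert_tracestate(header: str, key: str, value: str) -> str:
    """Add or update key and place it at the leftmost position."""
    others = [(k, v) for k, v in parse_tracestate(header) if k != key]
    return ",".join(f"{k}={v}" for k, v in [(key, value)] + others)
```

With the Congo and Rojo example from section 3.1, `upsert_tracestate("congo=t61rcWkgMzE", "rojo", "00f067aa0ba902b7")` yields `rojo=00f067aa0ba902b7,congo=t61rcWkgMzE`.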

5.2 Processing Model for Working with Trace Context Response Header

This processing model describes the behavior of a tracing system that returns trace context headers. Behavior depends on the configuration of the tracing system and what information it wishes to return to the caller.

5.2.1 Restarted Trace

When a service is called by an untrusted third party, it may decide to restart the trace. In this case, the called service MAY return a traceresponse field indicating its internal trace-id, span-id, and sampling decision.

Example request and response:

Request

traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-d75597dee50b0cac-01

Response


traceresponse: 00-1baad25c36c11c1e7fbd6d122bd85db6-cab70b47728a8a99-01

In this example, a participant in a trace with ID 4bf92f3577b34da6a3ce929d0e0e4736 calls a third party system that collects their own internal telemetry using a new trace ID 1baad25c36c11c1e7fbd6d122bd85db6. When the third party completes its request, it returns the new trace ID, the ID of the operation, and internal sampling decision to the caller. If there is an error with the request, the caller can include the third party's internal trace ID in a support request.

5.2.2 Load Balancer Deferred Sampling

When a service that made a negative sampling decision makes a call to another service, there may be some event during the processing of that request that causes the called service to decide to sample the request. In this case, it may return its updated sampling decision to the caller, the caller may also return the updated sampling decision to its caller, and so on. In this way, as much of a trace as possible may be recovered for debugging purposes even if the original sampling decision was negative.

One example of this might be a load balancer which samples a random subset of requests. If the destination service encounters a problem, it may indicate that the request should be sampled by the load balancer anyway by returning a traceresponse with the sampled flag set.

Example request and response:

Request


traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-d75597dee50b0cac-00

Response


traceresponse: 00-4bf92f3577b34da6a3ce929d0e0e4736-828c5d0d435ba505-01

In this example, a caller (the load balancer) in a trace with ID 4bf92f3577b34da6a3ce929d0e0e4736 wishes to defer a sampling decision to its callee. When the callee completes the request, it returns the internal sampling decision to the caller.
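
One way a caller might act on a traceresponse, covering both the restarted-trace and deferred-sampling examples, is sketched here. The function name `apply_traceresponse` and its return shape are hypothetical, and the sketch assumes the header carries all four fields in the same layout as the examples above.

```python
from typing import Optional, Tuple


def apply_traceresponse(own_trace_id: str,
                        own_sampled: bool,
                        traceresponse: Optional[str]) -> Tuple[bool, Optional[str]]:
    """Return (sampled, foreign_trace_id) after inspecting a traceresponse.

    foreign_trace_id is set when the callee restarted the trace.
    """
    if not traceresponse:
        return own_sampled, None
    parts = traceresponse.strip().split("-")
    if len(parts) != 4:
        return own_sampled, None  # unexpected layout: ignore the header
    _, trace_id, _, flags = parts
    try:
        callee_sampled = bool(int(flags, 16) & 0x01)
    except ValueError:
        return own_sampled, None
    if trace_id != own_trace_id:
        # Restarted trace: keep the local decision, remember the callee's trace-id.
        return own_sampled, trace_id
    # Same trace: upgrade the local decision if the callee decided to sample.
    return own_sampled or callee_sampled, None
```

In the load balancer example, a caller that sent a sampled flag of 0 would switch to recording after receiving the response shown above, while in the restarted-trace example it would keep its own decision and could log the third party's trace ID for support requests.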

9. Considerations for trace-id field generation

This section is non-normative.

This section suggests some best practices to consider when a platform or tracing vendor implements trace-id generation and propagation algorithms. These practices will ensure better interoperability of different systems.

9.1 Uniqueness of `trace-id`

The value of trace-id SHOULD be globally unique. This field is typically used for unique identification of a distributed trace. It is common for distributed traces to span various components, including, for example, cloud services. Cloud services tend to serve a variety of clients and have a very high throughput of requests. So global uniqueness of trace-id is important, even when local uniqueness might seem like a good solution.

9.2 Randomness of `trace-id`

A randomly generated value of trace-id SHOULD be preferred over other algorithms for generating a globally unique identifier. Randomness of trace-id addresses some security and privacy concerns of exposing unwanted information. Randomness also allows tracing vendors to base sampling decisions on the trace-id field value and avoid propagating an additional sampling context.

As shown in the next section, it is important for trace-id to carry "uniqueness" and "randomness" in the right part of the trace-id, for better interoperability with some existing systems.
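
To illustrate how a random trace-id can drive a sampling decision without extra context, the following sketch compares the rightmost 8 bytes of the trace-id against a threshold. This scheme is an assumption made for illustration and is not defined by this specification; the function name and the `rate` parameter are hypothetical.

```python
def sample_by_trace_id(trace_id: str, rate: float) -> bool:
    """Deterministically sample using the rightmost 8 bytes of a hex trace-id."""
    low = int(trace_id[-16:], 16)          # value in the range [0, 2**64)
    threshold = int(rate * (1 << 64))      # rate is expected to be in [0.0, 1.0]
    return low < threshold
```

Because the decision depends only on the trace-id, every component applying the same rate reaches the same decision for a given trace, provided the right part of the trace-id is in fact random.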

9.3 Handling `trace-id` for compliant platforms with shorter internal identifiers

Some tracing systems use a trace-id that is shorter than 16 bytes but are still willing to adopt this specification.

If such a system is capable of propagating a fully compliant trace-id, even while still requiring a shorter, non-compliant identifier for internal purposes, the system is encouraged to utilize the tracestate header to propagate the additional internal identifier. However, if a system would instead prefer to use the internal identifier as the basis for a fully compliant trace-id, it SHOULD be incorporated as the rightmost part of a trace-id. For example, a tracing system may receive 234a5bcd543ef3fa53ce929d0e0e4736 as a trace-id; however, internally it will use 53ce929d0e0e4736 as an identifier.

9.4 Interoperating with existing systems which use shorter identifiers

There are tracing systems which are not capable of propagating the entire 16 bytes of a trace-id. For better interoperability between fully compliant systems and these existing systems, the following practices are recommended:

When a system creates an outbound message and needs to generate a fully compliant 16-byte trace-id from a shorter identifier, it SHOULD left pad the original identifier with zeroes. For example, the identifier 53ce929d0e0e4736 SHOULD be converted to trace-id value 000000000000000053ce929d0e0e4736.
When a system receives an inbound message and needs to convert the 16-byte trace-id to a shorter identifier, the rightmost part of trace-id SHOULD be used as this identifier. For instance, if the value of trace-id was 234a5bcd543ef3fa53ce929d0e0e4736 on an incoming request, the tracing system SHOULD use an identifier with the value of 53ce929d0e0e4736.
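
Both conversions are simple string operations on the hex representation, as this sketch shows. The function names are hypothetical, and the default of 16 hex characters (8 bytes) for the short identifier is an assumption matching the example values above.

```python
def to_trace_id(short_id: str) -> str:
    """Left pad a shorter hex identifier with zeroes to a 32-character trace-id."""
    if len(short_id) > 32:
        raise ValueError("identifier is longer than 16 bytes")
    return short_id.rjust(32, "0")


def to_short_id(trace_id: str, hex_len: int = 16) -> str:
    """Use the rightmost hex_len characters of trace-id as the shorter identifier."""
    return trace_id[-hex_len:]
```

Applying `to_short_id` to the result of `to_trace_id` returns the original identifier, so a short identifier survives a round trip through a fully compliant system.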

Similar transformations are expected when a tracing system converts other distributed trace context propagation formats to W3C Trace Context. Shorter identifiers SHOULD be left padded with zeros when converted to a 16-byte trace-id, and the rightmost part of trace-id SHOULD be used as a shorter identifier.

Note that many existing systems that are not capable of propagating the whole trace-id will not propagate the tracestate header either. However, such a system can still use the tracestate header to propagate additional data that is known by this system. For example, some systems use two flags indicating whether a distributed trace needs to be recorded or not. In this case one flag can be sent as the sampled flag of the traceparent header, and tracestate can be used to send and receive an additional flag. Compliant systems will propagate this flag along with all other key/value pairs. Existing systems which are not capable of tracestate propagation will truncate all additional values from tracestate and only pass along that flag.

Trace Context Level 2

W3C Editor's Draft 28 September 2021

Abstract

Status of This Document

1. Conformance

2. Overview

2.1 Problem Statement

2.2 Solution

2.3 Design Overview