Internet-Draft | Probabilistic Reveal Tokens | July 2025 |
Pfeiffenberger, et al. | Expires 22 January 2026 |
Fraud detection often relies on high-entropy signals that can also be used to track users across sites. Probabilistic Reveal Tokens (PRTs) attempt to balance the needs of fraud detection and tracking prevention by sampling at a rate that is too low for scaled cross-site tracking, but sufficient for fraud detection in aggregate scenarios. This document describes the PRT protocol, which allows browsers to reveal sensitive signals (e.g., IP address) on a per-site basis with provable probability p_reveal, while websites can use PRTs to measure traffic quality and update denylists.¶
This note is to be removed before publishing as an RFC.¶
The latest revision of this draft can be found at https://philippp.github.io/id-template/draft-pfeiffenberger-prtokens.html. Status information for this document may be found at https://datatracker.ietf.org/doc/draft-pfeiffenberger-prtokens/.¶
Source for this draft and an issue tracker can be found at https://github.com/philippp/id-template.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 22 January 2026.¶
Copyright (c) 2025 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document.¶
Cross-site tracking threatens user privacy, yet websites require access to certain signals for legitimate fraud detection and security purposes. IP addresses, in particular, serve as critical signals for identifying and mitigating fraudulent activity, but their universal availability enables large-scale tracking across websites.¶
Probabilistic Reveal Tokens (PRTs) provide a solution that balances these competing needs. PRTs allow browsers to share sensitive signals with websites at a controlled probability rate that is too low for effective cross-site tracking but sufficient for aggregate fraud detection and analysis.¶
The PRT protocol enables:¶
Browsers to reveal sensitive signals on a per-site basis with provable probability p_reveal¶
Websites to measure traffic quality for a sample of entities or combinations of entities¶
Users to validate that the reveal rate is as expected after tokens have been spent¶
Prevention of scaled cross-site tracking while preserving fraud detection capabilities¶
PRTs utilize ElGamal encryption [ELGAMAL] [RFC3526] with re-randomization properties to ensure that tokens remain unlinkable beyond the probabilistically-included high-entropy signals. Tokens are unforgeable while eligible to be spent and become refutable after the spending period (epoch) ends through the publication of decryption keys.¶
This document specifies the PRT protocol, including token generation, transmission, validation, and the associated cryptographic operations. The protocol is designed to provide verifiable privacy properties while enabling legitimate fraud detection use cases.¶
Key Coordinator: The entity that generates the cryptographic keypair necessary for token encryption and decryption, and is responsible for keeping the secret key material secret while tokens are eligible to be spent. This may be implemented as part of the Issuer.¶
Epoch: A time period during which tokens are eligible to be spent by a browser. After the epoch ends, the Key Coordinator reveals the key material needed to decrypt and verify the tokens. This key material also allows anyone to generate tokens of this now-revealed epoch, creating deniability for token bearers.¶
Embargo Period: An additional duration after the epoch has ended and before keys are released. Without an embargo period, tokens received at the exact end of the epoch are at risk of having been minted with recently released keys.¶
Issuer: An internet-facing service from which the browser fetches PRTs. The issuer computes an array of N = N_signal + N_NULL tokens (where N_signal tokens contain the signal, and N_NULL tokens do not) so that p_reveal = N_signal/N. The issuer shuffles these tokens and passes them to the browser.¶
Browser: The client software that fetches tokens from the issuer, re-randomizes them to prevent linkability, and sends the tokens to websites. After the key material is published, the browser helps the user validate the privacy properties of the tokens.¶
Website: The recipient of tokens from the browser. After the key material is published, websites validate the legitimacy of the tokens and leverage sampled signals.¶
Signal: The sensitive data (such as an IP address) that is probabilistically included in tokens.¶
p_reveal: The probability that any given token contains the signal rather than a NULL value.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
The PRT protocol involves four main participants: the Key Coordinator, Issuer, Browser, and Websites. The protocol operates in epochs, with each epoch having a defined start and end time.¶
During each epoch:¶
The Key Coordinator generates an asymmetric keypair for ElGamal encryption¶
The Issuer generates plaintext values that contain the signal with probability p_reveal, authenticates each value with an HMAC key, and encrypts it using the public key to create a token¶
Browsers fetch token batches, re-randomize individual tokens, and send them to websites in HTTP headers¶
Websites store received tokens for later decryption¶
After the epoch ends and the embargo period has expired:¶
The Key Coordinator publishes the private key and HMAC secret¶
All parties can decrypt tokens and validate their contents¶
Users can verify the actual reveal rate matches the expected p_reveal¶
Websites can extract signals for fraud analysis¶
The PRT protocol relies on the following trust relationships:¶
The protocol satisfies the following requirements:¶
Ratio Inspection: Users MUST be able to verify the token reveal rate (p_reveal) after epoch completion.¶
Content Inspection: Users MUST be able to verify after-the-fact that the tokens do not contain additional identifying information beyond the expected signal.¶
Unlinkability: The website and issuer MUST NOT be able to re-identify the user (e.g., by colluding and matching tokens).¶
Robustness: Browsers MUST NOT be able to intentionally reduce their aggregate token reveal rate below p_reveal, and websites MUST NOT be able to increase the likelihood of a browser revealing its signal.¶
PRTs reduce access to re-identifiable information in the absence of other cross-visit identifiers. PRT implementations MUST allocate only one token to each website during a visit, where a visit is defined as a period during which the user expects to be re-identified by a website (e.g., a sequence of navigations or a cookied session).¶
If this requirement is not met, a website could collect multiple tokens for the same user and the same signal, and increase their chance of recovering the signal beyond p_reveal.¶
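The effect of collecting several tokens can be quantified with a short calculation. The sketch below assumes, purely for illustration, that all k tokens a website collects come from the same shuffled batch of N tokens, N_signal of which carry the signal; the function name is hypothetical and not part of the protocol.

```python
from math import comb


def p_signal_seen(n_total: int, n_signal: int, k: int) -> float:
    """Probability that at least one of k tokens carries the signal.

    Assumes the k tokens are drawn without replacement from one batch of
    n_total tokens, n_signal of which contain the signal.
    """
    if not 0 <= k <= n_total or not 0 <= n_signal <= n_total:
        raise ValueError("need 0 <= k <= n_total and 0 <= n_signal <= n_total")
    n_null = n_total - n_signal
    # comb(n_null, k) is 0 when k > n_null, so the result is then exactly 1.
    return 1 - comb(n_null, k) / comb(n_total, k)
```

With one signal token in a batch of ten, a single token reveals the signal with probability p_reveal = 0.1, whereas two tokens from that batch reveal it with probability 0.2, which is why a visit is limited to one token.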
Epochs SHOULD be approximately one day in length, but this is not required by the protocol. In practice, epochs SHOULD be long enough to avoid leakage through user re-identification within a browsing session, and short enough that the signal in the token is still likely to be valid. A minimum epoch length of four hours is RECOMMENDED.¶
Each epoch MUST end after the following epoch has started. This allows browsers to start using tokens from the new epoch before the current epoch ends, avoiding both a lapse in the availability of valid tokens and a thundering herd problem when new tokens become available.¶
The Key Coordinator SHOULD wait an additional embargo period after the epoch ends before revealing the keys. This ensures that tokens sent at the very end of an epoch cannot be decrypted immediately after the epoch changeover. It is RECOMMENDED that the embargo duration equal the epoch length, so that no token can be decrypted sooner than one epoch length after it was sent, regardless of when during the epoch it reached the website.¶
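The timing argument can be checked with simple date arithmetic. This is a minimal sketch; the function names are hypothetical, and it assumes keys are released exactly when the embargo expires.

```python
from datetime import datetime, timedelta


def key_release_time(epoch_end: datetime, embargo: timedelta) -> datetime:
    """Earliest time at which the Key Coordinator may publish the keys."""
    return epoch_end + embargo


def sealed_duration(sent_at: datetime, epoch_end: datetime,
                    embargo: timedelta) -> timedelta:
    """How long a token sent at sent_at stays undecryptable."""
    if sent_at > epoch_end:
        raise ValueError("token was sent after its epoch ended")
    return key_release_time(epoch_end, embargo) - sent_at
```

A token sent at the last instant of the epoch stays sealed for exactly the embargo duration, so choosing an embargo equal to the epoch length gives every token at least one epoch length of secrecy.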
The following cryptographic parameters MUST be used:¶
The Key Coordinator generates an asymmetric keypair (pk_e, sk_e) to be rotated every epoch E, with sk_e published to all participants after the epoch has ended. Key generation MUST follow these steps:¶
Generate a random private key x uniformly from [1, n-1] where n is the order of the secp256r1 base point¶
Compute the public key point Y = x * G where G is the secp256r1 generator point¶
The ElGamal public key pk_e consists of (G, Y)¶
The ElGamal private key sk_e is x¶
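The key generation steps above can be sketched in pure Python as follows. The curve arithmetic is a textbook affine implementation written for illustration only: it is not constant-time and should not be used in production. The function names are hypothetical; the constants are the standard secp256r1 (P-256) domain parameters.

```python
import secrets

# secp256r1 (P-256) domain parameters.
P = 0xffffffff_00000001_00000000_00000000_00000000_ffffffff_ffffffff_ffffffff
A = P - 3
B = 0x5ac635d8_aa3a93e7_b3ebbd55_769886bc_651d06b0_cc53b0f6_3bce3c3e_27d2604b
N = 0xffffffff_00000000_ffffffff_ffffffff_bce6faad_a7179e84_f3b9cac2_fc632551
G = (0x6b17d1f2_e12c4247_f8bce6e5_63a440f2_77037d81_2deb33a0_f4a13945_d898c296,
     0x4fe342e2_fe1a7f9b_8ee7eb4a_7c0f9e16_2bce3357_6b315ece_cbb64068_37bf51f5)


def ec_add(p1, p2):
    """Add two affine points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)


def ec_mul(k, point):
    """Double-and-add scalar multiplication (illustrative, not constant-time)."""
    result = None
    while k:
        if k & 1:
            result = ec_add(result, point)
        point = ec_add(point, point)
        k >>= 1
    return result


def generate_epoch_keypair():
    """Return (sk_e, pk_e) = (x, (G, Y)) with x uniform in [1, n-1]."""
    x = secrets.randbelow(N - 1) + 1
    Y = ec_mul(x, G)
    return x, (G, Y)
```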
The public key pk_e is shared as a JSON Web Key (JWK) [JWK]:¶
{
  "kty": "EC",
  "crv": "P-256",
  "x": "DROn9TZojl70_6lhtcLxItT2qskNDGk97wjz0N5qdiE",
  "y": "k6EtdGm_jW3b7Le9zM2LgcO7b9Q_qwjS2jL0MFn6V4"
}¶
The Issuer receives pk_e and generates an epoch-scoped secret S_e (32 bytes) that ensures only this issuer can mint tokens.¶
When a browser requests a batch of N tokens, the Issuer constructs each token message with the following fixed structure:¶
+----------+----------+----------+----------+----------+
| Field    | Version  | t_ord    | signal   | H        |
+----------+----------+----------+----------+----------+
| Size     | 1 byte   | 1 byte   | 16 bytes | 8 bytes  |
+----------+----------+----------+----------+----------+
| Offset   | 0        | 1        | 2        | 18       |
+----------+----------+----------+----------+----------+¶
Field definitions:¶
When a browser requests a batch of N tokens, the Issuer:¶
Observes the signal (e.g., IP address) from the browser's connection¶
Determines N_signal = floor(N * p_reveal) and N_NULL = N - N_signal¶
For i = 1 to N:
   a. Set t_ord = i
   b. If i <= N_signal: set signal to the actual signal value
   c. If i > N_signal: set signal to a number of zero bytes which matches the signal size
   d. Construct message M_i as defined in Section 4.3.3
   e. Compute HMAC-SHA256(S_e, Version || t_ord || signal)
   f. Set H to the first 8 bytes of the HMAC result¶
For each message M_i:
   a. Interpret the 26-byte message as a big-endian integer
   b. Left-shift the message by three bytes and increment the value of the three-byte padding until the (29-byte) padded message is a valid x-coordinate on the curve.
   c. Encrypt the point using ElGamal: (u, e) = (rG, M_point + rY) where r is a random scalar
   d. Serialize u and e using compressed point encoding¶
Shuffle the encrypted tokens with cryptographically secure randomness.¶
Provide the shuffled tokens to the browser along with pk_e, t_epoch_end, and t_next_epoch_start¶
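The message construction and point-encoding steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the signal is passed in as an already-encoded 16-byte value, the version byte is taken to be 1, and the y-coordinate chosen for the encoded point is simply whichever square root the computation yields, since the text does not say which one to use. ElGamal encryption of the resulting point is omitted here.

```python
import hashlib
import hmac
import math
import secrets

# secp256r1 field prime and curve coefficient b (a = -3).
P = 0xffffffff_00000001_00000000_00000000_00000000_ffffffff_ffffffff_ffffffff
B = 0x5ac635d8_aa3a93e7_b3ebbd55_769886bc_651d06b0_cc53b0f6_3bce3c3e_27d2604b
VERSION = 1        # assumed version byte
SIGNAL_LEN = 16    # size of the signal field


def build_messages(signal: bytes, n_tokens: int, p_reveal: float, s_e: bytes):
    """Build the N 26-byte messages: Version || t_ord || signal || H."""
    if len(signal) != SIGNAL_LEN or not 1 <= n_tokens <= 255:
        raise ValueError("signal must be 16 bytes; t_ord must fit in one byte")
    # Note: floating-point rounding can affect floor() for some p_reveal values.
    n_signal = math.floor(n_tokens * p_reveal)
    messages = []
    for t_ord in range(1, n_tokens + 1):
        payload = signal if t_ord <= n_signal else bytes(SIGNAL_LEN)
        body = bytes([VERSION, t_ord]) + payload
        tag = hmac.new(s_e, body, hashlib.sha256).digest()[:8]
        messages.append(body + tag)
    return messages


def message_to_point(message: bytes):
    """Pad the message with three bytes until it is a valid x-coordinate."""
    base = int.from_bytes(message, "big") << 24
    for padding in range(1 << 24):
        x = base | padding
        rhs = (pow(x, 3, P) - 3 * x + B) % P
        y = pow(rhs, (P + 1) // 4, P)   # square root, valid since P % 4 == 3
        if y * y % P == rhs:
            return (x, y)
    raise ValueError("no valid x-coordinate found")


def shuffle_tokens(tokens: list) -> list:
    """Shuffle in place using the operating system's CSPRNG."""
    secrets.SystemRandom().shuffle(tokens)
    return tokens
```

Shifting the x-coordinate right by three bytes recovers the original 26-byte message, which is what allows decryption to undo the padding.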
To prevent linkability attacks, the browser MUST re-randomize each token before use. Re-randomization exploits the malleable property of ElGamal encryption to produce a new ciphertext that decrypts to the same plaintext but is unlinkable to the original ciphertext.¶
Given an ElGamal ciphertext (u, e) where u = rG and e = M_point + rY for the Issuer's random scalar r:¶
The browser re-randomizes by:¶
Parse u and e from their compressed encodings to curve points U and E¶
Generate a random scalar z uniformly from [1, n-1] where n is the order of the secp256r1 curve¶
Compute the re-randomized ciphertext: U' = U + zG and E' = E + zY, where Y is the public key point from pk_e¶
Serialize U' and E' using compressed point encoding¶
The browser MUST re-randomize each token before it is used and each time the token is re-used to defend against linkability attacks.¶
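A minimal sketch of re-randomization follows, using the standard ElGamal update U' = U + zG and E' = E + zY, which leaves the plaintext point unchanged because E' - xU' = E - xU. The curve arithmetic is a textbook affine implementation for illustration only (not constant-time), and all function names are hypothetical.

```python
import secrets

# secp256r1 (P-256) domain parameters.
P = 0xffffffff_00000001_00000000_00000000_00000000_ffffffff_ffffffff_ffffffff
A = P - 3
B = 0x5ac635d8_aa3a93e7_b3ebbd55_769886bc_651d06b0_cc53b0f6_3bce3c3e_27d2604b
N = 0xffffffff_00000000_ffffffff_ffffffff_bce6faad_a7179e84_f3b9cac2_fc632551
G = (0x6b17d1f2_e12c4247_f8bce6e5_63a440f2_77037d81_2deb33a0_f4a13945_d898c296,
     0x4fe342e2_fe1a7f9b_8ee7eb4a_7c0f9e16_2bce3357_6b315ece_cbb64068_37bf51f5)


def ec_add(p1, p2):
    """Add two affine points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)


def ec_mul(k, point):
    """Double-and-add scalar multiplication (illustrative, not constant-time)."""
    result = None
    while k:
        if k & 1:
            result = ec_add(result, point)
        point = ec_add(point, point)
        k >>= 1
    return result


def decompress(data: bytes):
    """Parse a 33-byte compressed point (0x02/0x03 prefix plus x)."""
    if len(data) != 33 or data[0] not in (2, 3):
        raise ValueError("expected a 33-byte compressed point")
    x = int.from_bytes(data[1:], "big")
    rhs = (pow(x, 3, P) + A * x + B) % P
    y = pow(rhs, (P + 1) // 4, P)
    if x >= P or y * y % P != rhs:
        raise ValueError("x is not on the curve")
    return (x, y if (y & 1) == (data[0] & 1) else P - y)


def compress(point) -> bytes:
    return bytes([2 + (point[1] & 1)]) + point[0].to_bytes(32, "big")


def rerandomize(u: bytes, e: bytes, Y):
    """Return a fresh ciphertext (u', e') for the same plaintext point."""
    U, E = decompress(u), decompress(e)
    while True:
        z = secrets.randbelow(N - 1) + 1          # uniform in [1, n-1]
        U2 = ec_add(U, ec_mul(z, G))
        E2 = ec_add(E, ec_mul(z, Y))
        if U2 is not None and E2 is not None:     # retry on the negligible edge case
            return compress(U2), compress(E2)
```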
The browser maintains a local database with the following schema:¶
+------------+------------+------------+------------+------------+
| Field Name | u          | e          | t_epoch_   | epoch_id   |
|            |            |            | end        |            |
+------------+------------+------------+------------+------------+
| Field Type | bytes      | bytes      | timestamp  | bytes      |
+------------+------------+------------+------------+------------+¶
Additional implementation-specific fields for token management:¶
+------------+------------+------------+------------+------------+
| Field Name | version    | public_key | num_signal | context_id |
|            |            |            | _tokens    |            |
+------------+------------+------------+------------+------------+
| Field Type | integer    | text       | integer    | string     |
+------------+------------+------------+------------+------------+¶
The context_id field defaults to NULL for new tokens and is set to an identifier of the context (e.g., the top-level domain name) when the token is spent.¶
When the browser wishes to assert its willingness to probabilistically reveal a signal:¶
It checks for any token already assigned to the requesting context whose t_epoch_end has not passed¶
If such a token exists, the browser reuses it for this connection¶
If no such token exists, the browser assigns a non-spent, current-epoch token to the context by setting the context_id field¶
The assigned token is re-randomized and sent to the relevant party¶
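The selection logic above might look like the following sketch. The class and function names are hypothetical, the store is a plain in-memory list rather than a real database, and a token is treated as belonging to the current epoch whenever its t_epoch_end has not passed, which is a simplification. Re-randomization before sending is left to the caller.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StoredToken:
    u: bytes
    e: bytes
    t_epoch_end: float                  # seconds since the Unix epoch
    epoch_id: bytes
    context_id: Optional[str] = None    # None until the token is spent


def assign_token(db: List[StoredToken], context: str,
                 now: float) -> Optional[StoredToken]:
    """Return the token for this context, assigning an unspent one if needed."""
    # Steps 1 and 2: reuse a still-valid token already bound to this context.
    for token in db:
        if token.context_id == context and token.t_epoch_end > now:
            return token
    # Step 3: otherwise bind an unspent, unexpired token to the context.
    for token in db:
        if token.context_id is None and token.t_epoch_end > now:
            token.context_id = context
            return token
    return None                         # no valid token available
```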
At the end of each epoch and after the embargo period has ended, the Key Coordinator MUST publish (pk_e, sk_e, S_e) as JSON Web Keys (JWKs). The ElGamal key is stored under the "eg" member, where "x" and "y" hold the public key (pk_e) and "d" holds the secret key (sk_e). The HMAC secret S_e is stored in the "k" field of the "hmac" member. All values are converted to big-endian byte arrays and base64url-encoded [RFC4648].¶
Example key disclosure format:¶
{
  "epoch_id": 12,
  "epoch_start_time": "20241125T10:00:00",
  "epoch_end_time": "20241126T12:00:00",
  "invalidated_at": null,
  "eg": {
    "kty": "EC",
    "crv": "P-256",
    "x": "DROn9TZojl70_6lhtcLxItT2qskNDGk97wjz0N5qdiE",
    "y": "k6EtdGm_jW3b7Le9zM2LgcO7b9Q_qwjS2jL0MFn6V4",
    "d": "S7_oLScyL_W2ob71hx6kHFv5nTmAt2CvqzmKeF7lLGA"
  },
  "hmac": {
    "kty": "oct",
    "k": "AyM1SysPpbyDfgZld3umj1qzKObwVMkoqQ-EstJQLr_T-1qS0gZH75aKtMN3Yj0iPS4hcgUuTwjAzZr1Z9CAow",
    "alg": "HS256"
  }
}¶
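Once the keys are disclosed, a stored token can be decrypted and checked against the HMAC secret. The sketch below is one possible way to do this, assuming the 26-byte message layout and three-byte padding used during token generation, and assuming U and E have already been parsed from their compressed encodings into affine points. The function names are hypothetical and the curve arithmetic is illustrative only.

```python
import hashlib
import hmac

# secp256r1 (P-256) domain parameters.
P = 0xffffffff_00000001_00000000_00000000_00000000_ffffffff_ffffffff_ffffffff
A = P - 3
B = 0x5ac635d8_aa3a93e7_b3ebbd55_769886bc_651d06b0_cc53b0f6_3bce3c3e_27d2604b
G = (0x6b17d1f2_e12c4247_f8bce6e5_63a440f2_77037d81_2deb33a0_f4a13945_d898c296,
     0x4fe342e2_fe1a7f9b_8ee7eb4a_7c0f9e16_2bce3357_6b315ece_cbb64068_37bf51f5)


def ec_add(p1, p2):
    """Add two affine points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)


def ec_mul(k, point):
    """Double-and-add scalar multiplication (illustrative, not constant-time)."""
    result = None
    while k:
        if k & 1:
            result = ec_add(result, point)
        point = ec_add(point, point)
        k >>= 1
    return result


def decrypt_and_check(U, E, sk: int, s_e: bytes):
    """Decrypt (U, E) with sk and verify the truncated HMAC; None if invalid."""
    shared = ec_mul(sk, U)
    M = ec_add(E, (shared[0], (-shared[1]) % P))     # M = E - sk * U
    if M is None or M[0] >> 232:                     # x must fit in 29 bytes
        return None
    message = (M[0] >> 24).to_bytes(26, "big")       # drop the 3 padding bytes
    body, tag = message[:18], message[18:]
    expected = hmac.new(s_e, body, hashlib.sha256).digest()[:8]
    if not hmac.compare_digest(tag, expected):
        return None
    signal = body[2:18]
    return {"version": body[0], "t_ord": body[1],
            "signal": None if signal == bytes(16) else signal}


def observed_reveal_rate(results) -> float:
    """Fraction of valid tokens that carried a signal (compare with p_reveal)."""
    valid = [r for r in results if r is not None]
    if not valid:
        raise ValueError("no valid tokens to measure")
    return sum(r["signal"] is not None for r in valid) / len(valid)
```

A user who decrypts every token in a batch can compare observed_reveal_rate with the advertised p_reveal, and can confirm from the fixed layout that no bytes beyond the version, ordinal, signal, and HMAC are present.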
Users and websites can then:¶
PRTs are transmitted in the "Sec-Probabilistic-Reveal-Token" HTTP header [RFC7231]. The header value is a Structured Header Byte Sequence [RFC8941] containing a TLS Presentation Language [RFC8446] serialized PRTStruct:¶
struct {
    uint8 version;
    uint16 u_length;
    opaque u[u_length];
    uint16 e_length;
    opaque e[e_length];
    opaque epoch_id[8];
} PRTStruct;¶
Where:¶
version identifies the token format version¶
u and e are the ElGamal ciphertext components¶
epoch_id identifies the corresponding key material for decryption¶
For version 1, u and e are each 33 bytes in length (compressed secp256r1 points).¶
Example HTTP header:¶
Sec-Probabilistic-Reveal-Token: :AQAhA0YcSOPXwN8JkGJz2Rxe349sEOzwLcXnrU0/e5P1QUEEACECjvPnzEReeDlIkrDocZA5ZtiIptiG02YOOaNMJKyKZTdIXbE63QJtYA==:¶
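A receiving server could parse the header value along the following lines. This is a minimal sketch with a hypothetical function name: it handles only a bare Byte Sequence item (colon-delimited base64) and does not implement general Structured Field parsing such as parameters.

```python
import base64
import struct


def parse_prt_header(value: str) -> dict:
    """Decode a Sec-Probabilistic-Reveal-Token value into PRTStruct fields."""
    value = value.strip()
    if len(value) < 2 or not (value.startswith(":") and value.endswith(":")):
        raise ValueError("expected a colon-delimited Byte Sequence")
    raw = base64.b64decode(value[1:-1], validate=True)
    if len(raw) < 13:                       # 1 + 2 + 2 + 8 bytes of fixed fields
        raise ValueError("PRTStruct too short")

    version = raw[0]
    offset = 1
    (u_length,) = struct.unpack_from(">H", raw, offset)   # uint16, big-endian
    offset += 2
    u = raw[offset:offset + u_length]
    offset += u_length
    if len(u) != u_length or offset + 2 > len(raw):
        raise ValueError("truncated u")
    (e_length,) = struct.unpack_from(">H", raw, offset)
    offset += 2
    e = raw[offset:offset + e_length]
    offset += e_length
    epoch_id = raw[offset:offset + 8]
    if len(e) != e_length or len(epoch_id) != 8 or offset + 8 != len(raw):
        raise ValueError("malformed PRTStruct")
    if version == 1 and (u_length != 33 or e_length != 33):
        raise ValueError("version 1 requires 33-byte u and e")
    return {"version": version, "u": u, "e": e, "epoch_id": epoch_id}
```

Applied to the example header above, this yields version 1, two 33-byte compressed points, and an 8-byte epoch_id.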
Browsers SHOULD include this header on requests where:¶
The request is being sent through a proxy for privacy purposes¶
The destination origin has registered to receive PRTs¶
A valid PRT is available for the current epoch¶
If no valid PRTs are available when composing a proxied request, the browser SHOULD make the request without the header. Well-behaved clients SHOULD only fail to attach a PRT in exceptional circumstances (e.g., Issuer unavailability).¶
The epoch length affects both privacy and utility:¶
Epochs SHOULD be long enough to prevent leakage through user re-identification within browsing sessions¶
Epochs SHOULD be short enough that signals remain valid and users can verify issuer behavior in reasonable time¶
A minimum epoch length of four hours is RECOMMENDED¶
Typical implementations MAY use epoch lengths of approximately one day¶
Tokens from the same browser MUST NOT be joinable by the website. If the receiving party can link multiple tokens from the same browser (e.g., through storing them in partitioned storage), the website's chance of recovering the signal increases beyond p_reveal.¶
Browsers MUST enforce reasonable bounds on the epoch length to ensure privacy requirements are met.¶
Implementations MUST track context assignments to ensure that only one token per context per epoch is allocated. The context_id field in the browser database is REQUIRED to prevent websites from obtaining multiple tokens and exceeding the intended p_reveal rate.¶
Token ordinals prevent a malicious client from choosing one token and using it for every session. Without ordinals, such a client could re-randomize that token into an unbounded set of tokens carrying the same message. Although the attacker does not know whether that message contains the signal or a NULL value, this re-use would make it highly likely that the signal is never revealed.¶
For each requested batch of tokens, the issuer records the order of the token in the generation sequence as the token's "ordinal" and then shuffles the tokens. This field does not convey information about the client requesting tokens, since each batch of tokens has the full set of ordinal IDs.¶
Websites can monitor the distribution of token ordinal values and detect spikes that may indicate an attacker re-randomizing the same token across different sessions.¶
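One simple heuristic for such monitoring is sketched below. It assumes that ordinals recovered after key disclosure should be roughly uniform over 1..N for honest traffic, and it flags any ordinal seen more often than a chosen multiple of the uniform expectation. The function name and the default threshold factor are hypothetical and would need tuning against real traffic.

```python
from collections import Counter
from typing import Iterable, List


def suspicious_ordinals(ordinals: Iterable[int], n_batch: int,
                        factor: float = 3.0) -> List[int]:
    """Return ordinals observed more than factor times the uniform expectation."""
    counts = Counter(ordinals)
    total = sum(counts.values())
    if total == 0 or n_batch <= 0:
        return []
    expected = total / n_batch          # per-ordinal count under uniformity
    return sorted(o for o, c in counts.items() if c > factor * expected)
```

For example, if ordinals 1 through 10 each appear 100 times but ordinal 7 appears an extra 500 times, only ordinal 7 exceeds three times the expected count and is flagged.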
The ability to re-randomize a token's ciphertext without changing the underlying contents (a property of the ElGamal encryption scheme) underpins many of the security and privacy properties of PRTs. The issuer cannot link a re-randomized token's ciphertext back to any tokens it issued, and origins that receive re-randomized versions of the same token cannot link them together.¶
Before sending a token on a connection to a particular site for the first time, the client re-randomizes the ciphertext for that token to make it unlinkable to any use of the same token on other sites. The client caches the ciphertext for each (token, site) pairing and reuses it on subsequent requests while the underlying token remains valid.¶
The client does not re-randomize the token when the token is re-used on the same site. This allows the site to distinguish between a client with many visits using the same token and a client that is re-randomizing the same token across multiple visits (see the discussion of token ordinals above).¶
The IP address a client uses to fetch tokens may differ from the IP address used to connect to websites later. This can occur due to network changes, dynamic IP assignment, or other factors. This drift may result in a PRT revealing an IP address that is unexpected from the user's perspective.¶
Browsers can reduce the chance of IP address mismatch by fetching tokens close to when they are spent and aligned with user expectations.¶
This document has no IANA actions.¶
We thank Scott Hendrickson for his thoughtful review and constructive criticism of earlier drafts of this proposal.¶