tor-browser

The Tor Browser
git clone https://git.dasho.dev/tor-browser.git

commit 84935df73c98e1e2fb1e1ee10a297562cc2d4bfb
parent a5e07594e9b0c108e48cfe11fc664761df7958cd
Author: Randell Jesup <rjesup@mozilla.com>
Date:   Tue,  7 Oct 2025 17:56:00 +0000

Bug 1917965: Add design doc r=necko-reviewers,kershaw

Differential Revision: https://phabricator.services.mozilla.com/D258947

Diffstat:
M netwerk/docs/cache2/doc.rst | 67 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 66 insertions(+), 1 deletion(-)

diff --git a/netwerk/docs/cache2/doc.rst b/netwerk/docs/cache2/doc.rst
@@ -16,7 +16,7 @@ be clear directly from the `IDL files <https://searchfox.org/mozilla-central/sea
 - The cache API is **completely thread-safe** and **non-blocking**.
 - There is **no IPC support**. It's only accessible on the default chrome process.
-- When there is no profile the new HTTP cache works, but everything is
+- When there is no profile the HTTP cache works, but everything is
   stored only in memory not obeying any particular limits.
 
 .. _nsICacheStorageService:
@@ -567,3 +567,68 @@ checking - the memory cache pool is controlled by
 ``browser.cache.memory.capacity``, the disk entries pool is already
 described above. The pool can be accessed and modified only on the cache
 background thread.
+
+Compression Dictionaries
+------------------------
+
+Compression Dictionaries are specified by the IETF:
+https://datatracker.ietf.org/doc/draft-ietf-httpbis-compression-dictionary/
+
+See also: https://developer.chrome.com/blog/shared-dictionary-compression
+and https://github.com/WICG/compression-dictionary-transport
+
+Gecko's design for compression dictionary support:
+
+We keep special ``dict:<origin>`` entries whose metadata lists all
+dictionaries stored for that origin.
+
+When a fetch is made, we check whether a ``dict:<origin>`` cache entry
+exists. If not, we know there are no dictionaries for the origin. If
+there is an entry and we have not previously loaded it into memory, we
+read and parse the metadata and create in-memory structures for all of
+``<origin>``'s dictionaries. These include the data needed to match the
+request and decide whether to send an ``Available-Dictionary:`` header
+with it.
+
+If a response to any request carries a ``Use-As-Dictionary`` header, we
+create a new dictionary entry in memory and flag it for saving to the
+``dict:<origin>`` metadata. We set the stream up to decompress before
+storing into the cache (see the future options below for alternatives),
+so that we can be sure we will be able to decompress it later. We start
+accumulating a hash value for the metadata entry; once the resource is
+fully received, we finalize the hash and the metadata can be written.
+
+When a response arrives with ``dcb`` or ``dcz`` content coding
+(dictionary-compressed Brotli or Zstandard), we use the cache entry for
+the dictionary we sent in ``Available-Dictionary`` to decompress the
+resource. This means reading the dictionary into memory and then
+letting the decompression proceed.
+
+Several of these actions require some asynchrony (waiting for a cache
+entry to be loaded for use as a dictionary, or waiting for a
+``dict:<origin>`` entry to be loaded). This is generally handled via
+lambdas.
+
+The metadata and in-memory entries are kept in sync with the cache by
+clearing entries out when cache entries are doomed. This also interacts
+with Clear-Site-Data and cookie-clearing headers (see the IETF spec).
+
+Dictionary loading can also be triggered via ``<link
+rel="compression-dictionary" ...>`` elements and ``Link`` headers.
+These cause prefetches of the dictionaries.
+
+Things to watch on landing:
+
+- Cache hit rate
+- Dictionary utilization
+
+  - Add probes
+
+- Pageload metrics
+
+  - Would require OHTTP-based collection
+
+Future optimizations:
+
+- Compressing dictionaries with zstd in the cache
+
+  - Trades CPU use and some latency decoding dictionary-encoded files
+    for hit rate
+  - Perhaps only above some size
+
+- Compressing dictionary-encoded files with zstd in the cache
+
+  - Trades CPU use for hit rate
+  - Perhaps only above some size
+
+- Preemptively reading ``dict:<origin>`` entries into memory in the
+  background at startup
+
+  - Up to some limit
+
+- LRU-ing ``dict:<origin>`` entries and dropping old ones from memory
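The request-time matching step in the design above can be sketched roughly as follows. This is a minimal Python sketch, not Gecko code: `DictionaryRecord` and `available_dictionary_header` are hypothetical names, and the IETF draft's URLPattern matching is simplified here to an fnmatch-style glob. The header value shape (the dictionary's SHA-256 hash as a colon-delimited base64 byte sequence) follows the draft.

```python
import base64
import hashlib
from dataclasses import dataclass
from fnmatch import fnmatch

@dataclass
class DictionaryRecord:
    # Hypothetical in-memory structure parsed from dict:<origin> metadata.
    match: str      # simplified glob; the spec uses a URLPattern
    sha256: bytes   # hash of the (decompressed) dictionary bytes

def available_dictionary_header(records, request_path):
    """Pick a matching dictionary and build the Available-Dictionary
    header value: the SHA-256 hash as a structured-field byte sequence."""
    for rec in records:
        if fnmatch(request_path, rec.match):
            b64 = base64.b64encode(rec.sha256).decode("ascii")
            return f":{b64}:"
    return None  # no dictionary matches; send no header

# Example: one stored dictionary covering /app/*.js
dict_bytes = b"common JS boilerplate"
records = [DictionaryRecord("/app/*.js", hashlib.sha256(dict_bytes).digest())]
print(available_dictionary_header(records, "/app/main.js"))
print(available_dictionary_header(records, "/styles/site.css"))  # None
```

The second lookup returns `None`, modeling the "no dictionaries for this request" fast path described above.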
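The hash-accumulation flow for a ``Use-As-Dictionary`` response (decompress before storing, hash incrementally, finalize only when the resource is complete) can be modeled like this. The class and method names are illustrative only, loosely echoing Gecko's stream-listener callbacks; this is not the actual implementation.

```python
import hashlib

class DictionaryWriter:
    """Toy model of storing a Use-As-Dictionary response: the stream is
    decompressed before caching, and a SHA-256 hash is accumulated over
    the stored bytes, finalized only when the resource is fully received."""

    def __init__(self):
        self._hash = hashlib.sha256()
        self._stored = bytearray()   # stands in for the cache entry body
        self.finalized = None

    def on_data_available(self, decompressed_chunk: bytes):
        # Each decompressed chunk is written to the cache entry and
        # folded into the running hash at the same time.
        self._stored += decompressed_chunk
        self._hash.update(decompressed_chunk)

    def on_stop_request(self):
        # Resource fully received: finalize the hash so the
        # dict:<origin> metadata entry can be written.
        self.finalized = self._hash.digest()
        return self.finalized

w = DictionaryWriter()
for chunk in (b"chunk-one ", b"chunk-two"):
    w.on_data_available(chunk)
digest = w.on_stop_request()
assert digest == hashlib.sha256(b"chunk-one chunk-two").digest()
```

The assertion illustrates the invariant the design relies on: the incrementally accumulated hash equals the hash of the complete decompressed resource, so it can safely serve as the dictionary's identity in the metadata.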
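The last future optimization (LRU-ing ``dict:<origin>`` entries in memory) could look something like the following sketch, assuming a simple per-origin map with a fixed entry limit. `OriginDictionaryCache` is a hypothetical name; Gecko's actual data structure may differ.

```python
from collections import OrderedDict

class OriginDictionaryCache:
    """Hypothetical LRU for in-memory dict:<origin> entries: touching an
    origin marks it most recently used, and inserting past the limit
    drops the least recently used origin from memory."""

    def __init__(self, limit: int):
        self.limit = limit
        self._entries = OrderedDict()   # origin -> parsed metadata

    def get(self, origin):
        if origin in self._entries:
            self._entries.move_to_end(origin)  # mark as recently used
            return self._entries[origin]
        return None

    def put(self, origin, metadata):
        self._entries[origin] = metadata
        self._entries.move_to_end(origin)
        while len(self._entries) > self.limit:
            self._entries.popitem(last=False)  # evict least recently used

cache = OriginDictionaryCache(limit=2)
cache.put("https://a.example", {"dicts": []})
cache.put("https://b.example", {"dicts": []})
cache.get("https://a.example")                 # a is now most recently used
cache.put("https://c.example", {"dicts": []})  # evicts b
print(sorted(cache._entries))  # ['https://a.example', 'https://c.example']
```

Dropping an origin here only discards the parsed in-memory structures; the backing ``dict:<origin>`` cache entry remains on disk and can be reloaded on the next fetch for that origin.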