commit cf09ea9257deacb3468106cc47864ce61ea5c659
parent 69720d6199ba4319dbc4486e51c1a42f077d394b
Author: Randell Jesup <rjesup@mozilla.com>
Date: Tue, 7 Oct 2025 14:07:08 +0000
Bug 1917965: Add design doc r=necko-reviewers,kershaw
Differential Revision: https://phabricator.services.mozilla.com/D258947
Diffstat:
1 file changed, 66 insertions(+), 1 deletion(-)
diff --git a/netwerk/docs/cache2/doc.rst b/netwerk/docs/cache2/doc.rst
@@ -16,7 +16,7 @@ be clear directly from the `IDL files <https://searchfox.org/mozilla-central/sea
- The cache API is **completely thread-safe** and **non-blocking**.
- There is **no IPC support**. It's only accessible on the default
chrome process.
-- When there is no profile the new HTTP cache works, but everything is
+- When there is no profile the HTTP cache works, but everything is
stored only in memory not obeying any particular limits.
.. _nsICacheStorageService:
@@ -567,3 +567,68 @@ checking - the memory cache pool is controlled by
``browser.cache.memory.capacity``, the disk entries pool is already
described above. The pool can be accessed and modified only on the cache
background thread.
+
+Compression Dictionaries
+---------------------------
+
+Compression Dictionaries are specified by the IETF:
+https://datatracker.ietf.org/doc/draft-ietf-httpbis-compression-dictionary/
+
+See also: https://developer.chrome.com/blog/shared-dictionary-compression
+and https://github.com/WICG/compression-dictionary-transport
+
+Gecko's design for compression dictionary support:
+
+We keep a special ``dict:<origin>`` cache entry per origin whose metadata
+lists all dictionaries stored for that origin.
+
+When a fetch is made, we check whether there is a ``dict:<origin>`` cache
+entry. If not, we know there are no dictionaries. If there is an entry,
+and we haven't previously loaded it into memory, we read and parse the
+metadata and create in-memory structures for all dictionaries for that
+origin. This includes the data needed to match the request and decide
+whether to send an ``Available-Dictionary`` header with it.
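The match-and-advertise step can be sketched as follows. This is a simplification (glob matching stands in for the spec's URLPattern semantics, and the helper name is hypothetical); per the spec, ``Available-Dictionary`` carries the SHA-256 of the dictionary as a structured-field byte sequence:

```python
import base64
from fnmatch import fnmatch

def available_dictionary_header(path, dictionaries):
    # Pick the longest matching pattern; the spec uses URLPattern
    # matching, which fnmatch only approximates here.
    best = None
    for d in dictionaries:
        if fnmatch(path, d["match"]):
            if best is None or len(d["match"]) > len(best["match"]):
                best = d
    if best is None:
        return None
    # Structured-field byte sequence: base64 between colons.
    return ":" + base64.b64encode(best["sha256"]).decode("ascii") + ":"
```

If no dictionary matches, no header is sent and the request proceeds normally.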
+
+If a response to any request carries a ``Use-As-Dictionary`` header, we
+create a new dictionary entry in memory and flag it for saving to the
+``dict:<origin>`` metadata. We set the stream up to decompress before
+storing into the cache (see the future optimizations below for
+alternatives), so that we are guaranteed to be able to decompress it
+later. We accumulate a hash of the body as it streams in; once the
+resource is fully received, we finalize the hash and the metadata can be
+written.
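The header parse and the streaming hash can be sketched like this (a simplification: a real implementation parses ``Use-As-Dictionary`` as an RFC 8941 structured field, and the names here are hypothetical):

```python
import hashlib

def parse_use_as_dictionary(value):
    # Naive split; assumes no commas inside quoted values.
    fields = {}
    for part in value.split(","):
        key, _, val = part.strip().partition("=")
        fields[key] = val.strip('"')
    return fields

class DictionaryHasher:
    # Accumulates the SHA-256 of the decompressed body as it streams
    # into the cache; finalized once the resource is complete.
    def __init__(self):
        self._hash = hashlib.sha256()

    def write(self, chunk):
        self._hash.update(chunk)

    def finish(self):
        return self._hash.digest()
```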
+
+When a response arrives with ``dcb`` or ``dcz`` content encoding
+(dictionary-compressed brotli or zstd), we use the cache entry for the
+dictionary we advertised in ``Available-Dictionary`` to decompress the
+resource. This means reading the dictionary into memory and then letting
+the decompression proceed.
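Before decompressing, the framing has to be checked. Per my reading of the draft, a ``dcz`` body starts with a zstd skippable frame (magic ``0x184D2A5E`` and a length of 32, both little-endian) carrying the SHA-256 of the dictionary, which must equal the hash we sent in ``Available-Dictionary``. A sketch of that validation:

```python
# Little-endian skippable-frame magic (0x184D2A5E) plus a 4-byte
# little-endian length of 32, per the compression-dictionary draft.
DCZ_PREFIX = b"\x5e\x2a\x4d\x18\x20\x00\x00\x00"

def dcz_payload_offset(body, expected_hash):
    # Returns the offset of the actual zstd stream, or raises if the
    # framing or the embedded dictionary hash is wrong.
    if body[:8] != DCZ_PREFIX:
        raise ValueError("not a dcz stream")
    if body[8:40] != expected_hash:
        raise ValueError("dictionary hash mismatch")
    return 40
```

The ``dcb`` (brotli) framing is analogous, with a different magic prefix.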
+
+Several of these actions are asynchronous (waiting for a cache entry to
+be loaded for use as a dictionary, or waiting for a ``dict:<origin>``
+entry to be loaded). This is generally handled via lambdas.
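The callback pattern can be illustrated as below, with Python standing in for the C++ lambdas (``AsyncCache`` is a hypothetical stand-in; the continuation is deferred until the entry has been read):

```python
class AsyncCache:
    # Hypothetical stand-in for asynchronous cache-entry opening: the
    # continuation is queued and runs only once the entry is available.
    def __init__(self, entries):
        self._entries = entries
        self._pending = []

    def async_open(self, key, on_ready):
        self._pending.append((key, on_ready))

    def pump(self):
        # In Gecko this would happen on the cache background thread.
        for key, on_ready in self._pending:
            on_ready(self._entries.get(key))
        self._pending.clear()
```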
+
+The metadata and in-memory entries are kept in sync with the cache by
+clearing entries out when cache entries are doomed. This also interacts
+with ``Clear-Site-Data`` and cookie-clearing headers (see the IETF spec).
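A sketch of the clearing decision, assuming (as the note above suggests) that both cache- and cookie-clearing directives drop dictionaries:

```python
def should_clear_dictionaries(clear_site_data):
    # Clear-Site-Data carries quoted directives, e.g.: "cache", "cookies".
    # Assumption for this sketch: any of cache, cookies, or * drops the
    # stored dictionaries for the origin.
    directives = {d.strip().strip('"') for d in clear_site_data.split(",")}
    return bool(directives & {"cache", "cookies", "*"})
```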
+
+Dictionary loading can also be triggered via ``<link
+rel="compression-dictionary">`` elements and ``Link`` response headers.
+These cause prefetches of the dictionaries.
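Per the transport spec, the markup form looks like this (the ``href`` is illustrative):

```html
<!-- Declarative prefetch of a dictionary from markup -->
<link rel="compression-dictionary" href="/dictionaries/app-v1.dat">
```

The equivalent response header is ``Link: </dictionaries/app-v1.dat>; rel="compression-dictionary"``.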
+
+Things to watch on landing:
+
+- Cache hit rate
+- Dictionary utilization
+
+  - Add probes
+
+- Pageload metrics
+
+  - Would require OHTTP-based collection
+
+Future optimizations:
+
+- Compressing dictionaries with zstd in the cache
+
+  - Trades CPU use and some latency decoding dictionary-encoded files
+    for hit rate
+  - Perhaps only above some size
+
+- Compressing dictionary-encoded files with zstd in the cache
+
+  - Trades CPU use for hit rate
+  - Perhaps only above some size
+
+- Preemptively reading ``dict:<origin>`` entries into memory in the
+  background at startup
+
+  - Up to some limit
+
+- LRU-ing ``dict:<origin>`` entries and dropping old ones from memory