============
SpiderMonkey
============

*SpiderMonkey* is the *JavaScript* and *WebAssembly* implementation library of
the *Mozilla Firefox* web browser. The implementation behaviour is defined by
the `ECMAScript <https://tc39.es/ecma262/>`_ and `WebAssembly
<https://webassembly.org/>`_ specifications.

Much of the internal technical documentation of the engine can be found
throughout the source files themselves by looking for comments labelled with
`[SMDOC]`_. Information about the team, our processes, and about embedding
*SpiderMonkey* in your own projects can be found at https://spidermonkey.dev.

Specific documentation on a few topics is available at:

.. toctree::
   :maxdepth: 1

   build
   test
   hacking_tips
   Debugger/index
   SavedFrame/index
   feature_checklist
   bytecode_checklist
   use_counter


Components of SpiderMonkey
##########################

🧹 Garbage Collector
*********************

.. toctree::
   :maxdepth: 2
   :hidden:

   Overview <gc>
   Rooting Hazard Analysis <HazardAnalysis/index>
   Running the Analysis <HazardAnalysis/running>

*JavaScript* is a garbage collected language and at the core of *SpiderMonkey*
we manage a garbage-collected memory heap. Elements of this heap have a base
C++ type of `gc::Cell`_. Each round of garbage collection will free up any
*Cell* that is not referenced by a *root* or another live *Cell* in turn.

See :doc:`GC overview<gc>` for more details.


📦 JS::Value and JSObject
**************************

*JavaScript* values are divided into either objects or primitives
(*Undefined*, *Null*, *Boolean*, *Number*, *BigInt*, *String*, or *Symbol*).
Values are represented with the `JS::Value`_ type which may in turn point to
an object that extends from the `JSObject`_ type.
Objects include both plain
*JavaScript* objects and exotic objects representing various things from
functions to *ArrayBuffers* to *HTML Elements* and more.

Most objects extend ``NativeObject`` (which is a subtype of ``JSObject``)
which provides a way to store properties as key-value pairs similar to a hash
table. These objects hold their *values* and point to a *Shape* that
represents the set of *keys*. Similar objects point to the same *Shape* which
saves memory and allows the JITs to quickly work with objects similar to ones
they have seen before. See the `[SMDOC] Shapes`_ comment for more details.

C++ (and Rust) code may create and manipulate these objects using the
collection of interfaces we traditionally call the **JSAPI**.


🗃️ JavaScript Parser
*********************

In order to evaluate script text, we parse it using the *Parser* into an
`Abstract Syntax Tree`_ (AST) temporarily and then run the *BytecodeEmitter*
(BCE) to generate `Bytecode`_ and associated metadata. We refer to this
resulting format as `Stencil`_ and it has the helpful characteristic that it
does not utilize the Garbage Collector. The *Stencil* can then be
instantiated into a series of GC *Cells* that can be mutated and understood
by the execution engines described below.

Each function as well as the top-level itself generates a distinct script.
This is the unit of execution granularity since functions may be set as
callbacks that the host runs at a later time. There are both
``ScriptStencil`` and ``js::BaseScript`` forms of scripts.

By default, the parser runs in a mode called *syntax* or *lazy* parsing where
we avoid generating full bytecode for functions within the source that we are
parsing. This lazy parsing is still required to check for all *early errors*
that the specification describes.
When such a lazily compiled inner function
is first executed, we recompile just that function in a process called
*delazification*. Lazy parsing avoids allocating the AST and bytecode which
saves both CPU time and memory. In practice, many functions are never
executed during a given load of a webpage so this delayed parsing can be
quite beneficial.


⚙️ JavaScript Interpreter
**************************

The *bytecode* generated by the parser may be executed by an interpreter
written in C++ that manipulates objects in the GC heap and invokes native
code of the host (e.g. web browser). See `[SMDOC] Bytecode Definitions`_ for
descriptions of each bytecode opcode and ``js/src/vm/Interpreter.cpp`` for
their implementation.


⚡ JavaScript JITs
*******************

.. toctree::
   :maxdepth: 1
   :hidden:

   MIR-optimizations/index

In order to speed up execution of *bytecode*, we use a series of Just-In-Time
(JIT) compilers to generate specialized machine code (e.g. x86, ARM, etc.)
tailored to the *JavaScript* that is run and the data that is processed.

As an individual script runs more times (or has a loop that runs many times)
we describe it as getting *hotter* and at certain thresholds we *tier-up* by
JIT-compiling it. Each subsequent JIT tier spends more time compiling but
aims for better execution performance.

Baseline Interpreter
--------------------

The *Baseline Interpreter* is a hybrid interpreter/JIT that interprets the
*bytecode* one opcode at a time, but attaches small fragments of code called
*Inline Caches* (ICs) that rapidly speed up executing the same opcode the next
time (if the data is similar enough). See the `[SMDOC] JIT Inline Caches`_
comment for more details.

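
As a rough illustration of how an *Inline Cache* can take advantage of shared
*Shapes*, the toy model below caches the slot index for a single
property-access site and guards it with a pointer comparison on the shape.
All names here (``ToyShape``, ``ToyObject``, ``ToyGetPropIC``) are
hypothetical and are not part of SpiderMonkey. The real mechanism is
considerably more involved, so treat this as a sketch of the idea only.

.. code-block:: cpp

   #include <cassert>
   #include <cstddef>
   #include <memory>
   #include <optional>
   #include <string>
   #include <utility>
   #include <vector>

   // Toy model only: far simpler than the engine's real Shape and IC code.
   struct ToyShape {
     std::vector<std::string> keys;  // property names, in slot order

     // Linear search standing in for a full property lookup.
     std::optional<std::size_t> lookup(const std::string& key) const {
       for (std::size_t i = 0; i < keys.size(); i++) {
         if (keys[i] == key) return i;
       }
       return std::nullopt;
     }
   };

   struct ToyObject {
     std::shared_ptr<const ToyShape> shape;  // shared by similar objects
     std::vector<double> slots;              // this object's own values
   };

   // One cache per property-access site; the property name is fixed per site.
   struct ToyGetPropIC {
     std::string key;
     const ToyShape* cachedShape = nullptr;
     std::size_t cachedSlot = 0;
     int hits = 0;
     int misses = 0;

     explicit ToyGetPropIC(std::string k) : key(std::move(k)) {}

     // Assumes obj.shape is non-null.
     std::optional<double> get(const ToyObject& obj) {
       // Fast path: one pointer comparison guards the cached slot index.
       if (obj.shape.get() == cachedShape) {
         hits++;
         return obj.slots[cachedSlot];
       }
       // Slow path: do the full lookup, then remember the result.
       misses++;
       std::optional<std::size_t> slot = obj.shape->lookup(key);
       if (!slot) return std::nullopt;
       assert(*slot < obj.slots.size());
       cachedShape = obj.shape.get();
       cachedSlot = *slot;
       return obj.slots[*slot];
     }
   };

When two ``ToyObject`` values share one ``ToyShape``, the first ``get`` call
takes the slow path and fills the cache, and the second call is served by the
fast path.
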

Baseline Compiler
-----------------

The *Baseline Compiler* uses the same *Inline Caches* mechanism as the
*Baseline Interpreter* but additionally translates the entire bytecode to
native machine code. This removes dispatch overhead and does minor local
optimizations. This machine code still calls back into C++ for complex
operations. The translation is very fast but the ``BaselineScript`` uses
memory and requires ``mprotect`` and flushing CPU caches.

WarpMonkey
----------

The *WarpMonkey* JIT replaces the former *IonMonkey* engine and is the
highest level of optimization for the most frequently run scripts. It is able
to inline other scripts and specialize code based on the data and arguments
being processed.

We translate the *bytecode* and *Inline Cache* data into a Mid-level
`Intermediate Representation`_ (Ion MIR). This graph is transformed and
optimized before being *lowered* to a Low-level Intermediate Representation
(Ion LIR). Register allocation is performed on this *LIR*, and native machine
code is then generated from it in a process called *Code Generation*.

See `MIR Optimizations`_ for an overview of MIR optimizations.

The optimizations here assume that a script continues to see data similar to
what has been seen before. The *Baseline* JITs are essential to success here
because they generate *ICs* that match observed data. If a script that has
been compiled with *Warp* encounters data that it is not prepared to handle,
it performs a *bailout*. The *bailout* mechanism reconstructs the native
machine stack frame to match the layout used by the *Baseline Interpreter* and
then branches to that interpreter as though we were running it all along.
Building this stack frame may use a special side-table saved by *Warp* to
reconstruct values that are not otherwise available.

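
The speculate-then-bail-out pattern described above can be sketched with a
toy model: a specialized path that is only valid for the kind of data seen so
far, guarded by checks, with a generic path to fall back on. Everything here
(``ToyValue``, ``specializedAdd``, ``genericAdd``, ``ToyScript``) is
hypothetical. Unlike a real *bailout*, this sketch simply re-runs the whole
operation on the generic path and does not reconstruct a stack frame or resume
in the middle of a script.

.. code-block:: cpp

   #include <cstdint>
   #include <limits>
   #include <optional>
   #include <variant>

   // Hypothetical toy value: either an int32 or a double.
   using ToyValue = std::variant<std::int32_t, double>;

   // Generic path: handles every combination of inputs.
   double genericAdd(const ToyValue& a, const ToyValue& b) {
     auto toDouble = [](const ToyValue& v) {
       return std::visit([](auto n) { return static_cast<double>(n); }, v);
     };
     return toDouble(a) + toDouble(b);
   }

   // Specialized path: assumes both inputs are int32 and the sum fits in
   // int32. Returns std::nullopt when a guard fails.
   std::optional<std::int32_t> specializedAdd(const ToyValue& a,
                                              const ToyValue& b) {
     const std::int32_t* x = std::get_if<std::int32_t>(&a);
     const std::int32_t* y = std::get_if<std::int32_t>(&b);
     if (!x || !y) return std::nullopt;  // type guard failed

     std::int64_t sum = std::int64_t{*x} + std::int64_t{*y};
     if (sum < std::numeric_limits<std::int32_t>::min() ||
         sum > std::numeric_limits<std::int32_t>::max()) {
       return std::nullopt;  // overflow guard failed
     }
     return static_cast<std::int32_t>(sum);
   }

   struct ToyScript {
     int bailouts = 0;

     double add(const ToyValue& a, const ToyValue& b) {
       // Try the specialized code first; fall back if any guard fails.
       if (std::optional<std::int32_t> fast = specializedAdd(a, b)) {
         return *fast;
       }
       bailouts++;
       return genericAdd(a, b);
     }
   };

Adding two int32 values stays on the specialized path, while passing a double
trips the type guard, increments ``bailouts``, and produces the result through
``genericAdd``.
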


🟪 WebAssembly
***************

In addition to *JavaScript*, the engine is also able to execute *WebAssembly*
(WASM) sources.

WASM-Baseline (RabaldrMonkey)
-----------------------------

This engine performs fast translation to machine code in order to minimize
latency to first execution.

WASM-Ion (BaldrMonkey)
----------------------

This engine translates the WASM input into the same *MIR* form that
*WarpMonkey* uses and uses the *IonBackend* to optimize. These optimizations
(and in particular, the register allocation) generate very fast native machine
code.


.. _gc::Cell: https://searchfox.org/mozilla-central/search?q=[SMDOC]+GC+Cell
.. _JSObject: https://searchfox.org/mozilla-central/search?q=[SMDOC]+JSObject+layout
.. _JS::Value: https://searchfox.org/mozilla-central/search?q=[SMDOC]+JS%3A%3AValue+type&path=js%2F
.. _[SMDOC]: https://searchfox.org/mozilla-central/search?q=[SMDOC]&path=js%2F
.. _[SMDOC] Shapes: https://searchfox.org/mozilla-central/search?q=[SMDOC]+Shapes
.. _[SMDOC] Bytecode Definitions: https://searchfox.org/mozilla-central/search?q=[SMDOC]+Bytecode+Definitions&path=js%2F
.. _[SMDOC] JIT Inline Caches: https://searchfox.org/mozilla-central/search?q=[SMDOC]+JIT+Inline+Caches
.. _Stencil: https://searchfox.org/mozilla-central/search?q=[SMDOC]+Script+Stencil
.. _Bytecode: https://en.wikipedia.org/wiki/Bytecode
.. _Abstract Syntax Tree: https://en.wikipedia.org/wiki/Abstract_syntax_tree
.. _Intermediate Representation: https://en.wikipedia.org/wiki/Intermediate_representation
.. _MIR Optimizations: ./MIR-optimizations/index.html