Sharing Rust libraries across the Firefox (for Android) stack
=============================================================

`Agi Sferro <agi@sferro.dev>`

March 20th, 2021

The problem
-----------

We don’t have a good story for integrating a Rust library so that it’s
available to use in Gecko, GeckoView, AC and Fenix and also in a way that Rust
can call Rust directly, avoiding a C FFI layer.

Goals
-----

- Being able to integrate a Rust library that can be called from Gecko,
  GeckoView, AC, Fenix, including having singleton-like instances that are
  shared across the stack, per-process.
- The Rust library should be able to call and be called by other Rust libraries
  or Rust code in Gecko directly (i.e. without a C FFI layer).
- A build-time assurance that all components in the stack compile against the
  same version of the Rust library.
- Painless, quick and automated updates. Should be able to produce chemspill
  updates for the Rust library in under 24h with little manual intervention
  (besides security checks / code review / QA).
- Support for non-Gecko consumers of the Rust library is essential. I.e.
  providing a version of Gecko that does not include any of the libraries.
- (optional?) Provide an easy way to create bundles of Rust libraries depending
  on consumers’ needs.

Proposal
--------

1. Rename libmegazord.so to librustcomponents.so to clarify what the purpose of
   this artifact is.
2. Every Rust library that wants to be called or wants to call Rust code
   directly will be included in libxul.so (which contains most of Gecko native
   code), and vendored in mozilla-central. This includes, among others, Glean
   and Nimbus.
3. libxul.so will expose the necessary FFI symbols for the Kotlin wrappers
   needed by the libraries vendored in mozilla-central in step (2).
4. At every nightly / beta / release build of Gecko, we will generate one (or
   possibly several) additional librustcomponents.so artifacts that will be
   published as an AAR on maven.mozilla.org. This will also publish all the
   vendored libraries in mozilla-central to maven, which will have a dependency
   on the librustcomponents.so produced as part of this step. Doing this will
   ensure that both libxul.so and librustcomponents.so contain the exact same
   code and can be swapped freely in the dependency graph.
5. Provide a new GeckoView build with artifactId geckoview-omni which will
   depend on all the Rust libraries. The existing geckoview will not have such
   a dependency and will be kept for third-party users of GeckoView.
6. In its .pom file, GeckoView will depend on the Kotlin wrappers of all the
   libraries that depend on the librustcomponents.so built in step (4). For
   example:

   .. code:: xml

     <dependency>
       <groupId>org.mozilla.telemetry</groupId>
       <artifactId>glean</artifactId>
       <version>33.1.2</version>
       <scope>compile</scope>
     </dependency>

   It will also exclude the org.mozilla.telemetry.glean dependency on
   librustcomponents.so, as the native code is now included in libxul.so as
   part of step (2). Presumably Glean will discover where its native code lives
   by either trying librustcomponents.so or libxul.so (or some other better
   methods, suggestions welcome).

7. Android Components and Fenix will remove their explicit dependency on Glean,
   Nimbus and all other libraries provided by GeckoView, and instead consume the
   one provided by GeckoView (this step is optional, note that any version
   conflict would cause a build error).


The Good
--------

- We get automated integration with AC for free. When an update for a library
  is pushed to mozilla-central, a new nightly build for GeckoView will be
  produced which is already consumed by AC automatically (and indirectly by
  Fenix).
- Publishing infrastructure to maven is already figured out, and we can reuse
  the existing process for GeckoView to publish all the dependencies.
- If a consumer (say AC) uses a mismatched version for a dependency, a
  compile-time error will be thrown.
- All consumers of the Rust libraries packaged this way are on the same version
  (provided they stay up to date with releases).
- Non-Mozilla consumers get first-class visibility into what is packaged into
  GeckoView, and can independently discover Glean, Nimbus, etc, since we define
  our dependencies in the pom file.
- Gecko Desktop and Gecko Mobile consume Glean and other libraries in the same
  way, removing unnecessary deviation.

Worth Noting
------------

- Uplifts to beta / release versions of Fenix will involve more checks as they
  impact Gecko too.

The Bad
-------

- Libraries need to be vendored in mozilla-central. Dependencies will follow
  the Gecko train which might not be right for them, as some dependencies don’t
  really have a nightly vs stable version.

  - This could change in the future, as the integration gets deeper and updates
    to the library become more frequent / at every commit.

- Locally testing a change in a Rust library involves rebuilding all of Gecko.
  This is a side effect of statically linking Rust libraries to Gecko.
- All Rust libraries that are used by both Android and Gecko will need to be
  updated together, and we cannot have separate versions on Desktop/Mobile.
  This can be mitigated by providing a flexible dependency on the library side
  (e.g. Nimbus doesn’t need to depend on a specific version of Glean and can
  accept whatever is in Gecko).
- Code that doesn’t natively live in mozilla-central has double the work to get
  into a product: first a release process is needed from the native repo, then
  a phabricator process for the vendoring.

Alternatives Considered
-----------------------

Telemetry delegate
^^^^^^^^^^^^^^^^^^

GeckoView provides a Java Telemetry delegate interface that Glean can implement
on the AC layer to provide Glean functionality to consumers. Glean would offer
a Rust wrapper to the Java delegate API to transparently call either the
delegate (when built for mobile) or the Glean instance directly (when built for
Desktop).

Drawbacks
"""""""""

- This involves a lot of work on the Glean side to build and maintain the
  delegate.
- A large section of the Glean API is embedded in the GeckoView API without a
  direct dependency.
- We don’t expect the telemetry delegate to have any implementations other
  than Glean itself, despite the apparent generic nature of the telemetry
  delegate.
- Glean and GeckoView engineers need to coordinate for every API update, as an
  update to the Glean API likely triggers an update to the GV API.
- Gecko Desktop and Gecko Mobile use Glean in a meaningfully different way.
- Doesn’t solve the dependency problem: even though in theory this would allow
  Gecko to work with multiple Glean versions, in practice the GV Telemetry
  delegate is going to track Glean so closely that it will inevitably require
  pretty specific Glean versions to work.

Advantages
""""""""""

- Explicit code dependency, an uninformed observer can understand how telemetry
  is extracted from GeckoView by just looking at the API.
- No hard Glean version requirement, AC can be (in theory) built with a
  different Glean version than Gecko and things would still work.

Why we decided against
""""""""""""""""""""""

The amount of ongoing maintenance work involved on the Glean side far outweighs
the small advantages, namely not tying AC to a specific Glean version. It also
significantly complicates the stack.
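
To make the telemetry delegate alternative more concrete, here is a minimal
Rust sketch of the kind of wrapper it describes: a single entry point that
forwards either to a delegate (the mobile case) or straight to an in-process
Glean instance (the Desktop case). All names in it (``TelemetryDelegate``,
``TelemetrySink``, ``DirectGlean``, ``RecordingDelegate``, ``record_event``)
are hypothetical and are not part of the Glean or GeckoView APIs; the in-memory
vectors only stand in for the real backends.

.. code:: rust

  /// Hypothetical stand-in for the Java Telemetry delegate exposed by GeckoView.
  pub trait TelemetryDelegate {
      fn on_event(&mut self, name: &str);
  }

  /// Hypothetical stand-in for a Glean instance linked directly into Gecko.
  #[derive(Default)]
  pub struct DirectGlean {
      pub recorded: Vec<String>,
  }

  /// Where the wrapper sends telemetry: the delegate on mobile builds,
  /// the in-process Glean instance on Desktop builds.
  pub enum TelemetrySink<D: TelemetryDelegate> {
      Delegate(D),
      Direct(DirectGlean),
  }

  impl<D: TelemetryDelegate> TelemetrySink<D> {
      /// Callers use this single entry point and never learn which path is taken.
      pub fn record_event(&mut self, name: &str) {
          match self {
              TelemetrySink::Delegate(delegate) => delegate.on_event(name),
              TelemetrySink::Direct(glean) => glean.recorded.push(name.to_owned()),
          }
      }
  }

  /// Test double that remembers what was forwarded to it.
  #[derive(Default)]
  pub struct RecordingDelegate {
      pub forwarded: Vec<String>,
  }

  impl TelemetryDelegate for RecordingDelegate {
      fn on_event(&mut self, name: &str) {
          self.forwarded.push(name.to_owned());
      }
  }

Even in this toy form, every new operation on the Glean side needs a matching
method on the delegate trait and a matching arm in the wrapper, which is the
coordination cost listed in the drawbacks.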

Dynamic Discovery
^^^^^^^^^^^^^^^^^

Gecko discovers when it’s being loaded as part of Fenix (or some other
Gecko-powered browser) by calling dlsym on the Glean library. When the
discovery is successful, and the Glean version matches, Gecko will directly use
the Glean provided by Fenix.

Drawbacks
"""""""""

- Non-standard: non-Mozilla apps will not expect this to work the way it does.
- “Magic”: there’s no way to know that the discovery is happening (or what
  version of Glean is provided with Gecko) unless you know it’s there.
- The standard failure mode is at runtime, as there’s no built-in way to check
  that the version provided by Gecko is the same as the one provided by Fenix
  at build time.
- Doesn’t solve the synchronization problem: Gecko and Fenix will have to be on
  the same Glean version for this to work.
- Gecko Mobile deviates meaningfully from Desktop in the way it uses Glean for
  no intrinsic reason.

Advantages
""""""""""

- This system is transparent to consuming apps, e.g. Nimbus can use Glean as
  is, with no significant modifications needed.

Why we decided against
""""""""""""""""""""""

- This alternative does not provide substantial benefits over the proposal
  outlined in this doc and has significant drawbacks like the runtime failure
  case and the non-standard linking process.

Hybrid Dynamic Discovery
^^^^^^^^^^^^^^^^^^^^^^^^

This is a variation of the Dynamic Discovery where Gecko and GeckoView include
Glean directly and consumers get Glean from Gecko dynamically (i.e. they dlsym
libxul.so).

Drawbacks
"""""""""

- Glean still needs to build a wrapper for libraries not included in Gecko
  (like Nimbus) that want to call Glean directly.

Advantages
""""""""""

- The dependency on Glean is explicit and clear from an uninformed observer’s
  point of view.
- Smaller scope, only Glean would need to be moved to mozilla-central.

Why we decided against
""""""""""""""""""""""

Not enough advantages over the proposal, significant ongoing maintenance work
required from the Glean side.

Open Questions
--------------

- How does iOS consume megazord today? Do they have a maven-like dependency
  system we can use to publish the iOS megazord?
- How do we deal with licenses in about:license? Application-services has a
  build step that extracts Rust dependencies and puts them in the pom file.
- What would be the process for coordinating a-c breaking changes?
- Would the desire to vendor apply even if this were not Rust code?

Common Questions
----------------

- **How do we make sure GV/AC/Gecko consume the same version of the native
  libraries?** The pom dependency in GeckoView ensures that any GeckoView
  consumers depend on the same version of a given library, this includes AC and
  Fenix.
- **What happens to non-Gecko consumers of megazord?** This plan is transparent
  to a non-Gecko consumer of megazord, as they will still consume the native
  libraries through the megazord dependency in Glean/Nimbus/etc. An added
  benefit is that, if the consumer stays up to date with the megazord
  dependency, they will use the same version that Gecko uses.
- **What’s the process to publish an update to the megazord?** When a team
  wants to publish an update to the megazord it will need to commit the update
  to mozilla-central. A new build will be generated in the next nightly cycle,
  producing an updated version of the megazord. My understanding is that current
  megazord releases are stable (and don’t have beta/nightly cycles) so for
  external consumers, consuming the nightly build could be adequate, and provide
  the fastest turnaround on updates. For Gecko consumers the turnaround will be
  the same as for Firefox Desktop (i.e. roughly 6-8 weeks from commit to release
  build).
- **How do we handle security uplifts?** If you have a security release for one
  Rust library you would need to request uplift to beta/release branches
  (depending on impact) like all other Gecko changes. The process in itself can
  be expedited and have a fast turnaround when needed (below 24h). We have been
  using this process for all Gecko changes so I would not expect particular
  problems with it.
- **What about OOP cases? E.g. GeckoView as a service?** We briefly discussed
  this in the email chain, there are ways we could make that work (e.g.
  providing an IPC shim). The details are fuzzy but since we don’t have any
  immediate need for such support, knowing that it’s doable with a reasonable
  amount of work is enough for now.
- **Vendoring in mozilla-central seems excessive.** I agree. This is an
  unfortunate requirement stemming from a few assumptions (which could be
  challenged! We are choosing not to):

  - Gecko wants to vendor whatever it consumes for Rust
  - We want Rust to call Rust directly (without a C FFI layer)
  - We want adding new libraries to be a painless experience

  Because of the above, vendoring in mozilla-central seems to be the best if not
  the only way to achieve our goals.
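
The "Rust calls Rust directly" assumption can be illustrated with a short
sketch. It shows a per-process singleton that other Rust code linked into the
same binary reaches through an ordinary function call, next to the kind of
``extern "C"`` symbol that step (3) of the proposal would expose for Kotlin
wrappers. The names (``Metrics``, ``metrics``, ``rustcomponents_record``) are
hypothetical and are not real Glean or Gecko symbols; this is only meant to
show the difference between the two call paths, not how Gecko implements them.

.. code:: rust

  use std::ffi::{c_char, CStr};
  use std::sync::{Mutex, OnceLock};

  /// Hypothetical component state; one instance per process.
  #[derive(Default)]
  pub struct Metrics {
      events: Mutex<Vec<String>>,
  }

  impl Metrics {
      pub fn record(&self, name: &str) {
          self.events.lock().unwrap().push(name.to_owned());
      }

      pub fn count(&self) -> usize {
          self.events.lock().unwrap().len()
      }
  }

  /// Rust-to-Rust path: any crate linked into the same binary gets the same
  /// instance through an ordinary, type-checked function call.
  pub fn metrics() -> &'static Metrics {
      static INSTANCE: OnceLock<Metrics> = OnceLock::new();
      INSTANCE.get_or_init(Metrics::default)
  }

  /// FFI path: a C-ABI symbol for non-Rust callers such as a Kotlin wrapper.
  /// Returns false for a null pointer or a name that is not valid UTF-8.
  ///
  /// # Safety
  /// `name` must be null or point to a NUL-terminated string.
  #[no_mangle]
  pub unsafe extern "C" fn rustcomponents_record(name: *const c_char) -> bool {
      if name.is_null() {
          return false;
      }
      // The caller promises a valid NUL-terminated string (see Safety above).
      match unsafe { CStr::from_ptr(name) }.to_str() {
          Ok(name) => {
              metrics().record(name);
              true
          }
          Err(_) => false,
      }
  }

Both paths end up in the same ``Metrics`` instance as long as the code is
linked into one binary, which is the property the proposal is after by putting
the libraries in libxul.so. The FFI function needs pointer checks and string
conversion that the direct call does not, which is the overhead the second
assumption wants to avoid for Rust callers.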