.. role:: js(code)
   :language: javascript

=================
Locale management
=================

A locale is a combination of language, region, script, and regional preferences
that the user wants their data formatted in.

There are multiple models of locale data structures in the industry, with varying
degrees of compatibility with each other. Historically, each major platform has used
its own, and many standards bodies have provided conflicting proposals.

Mozilla, along with most modern platforms, follows the Unicode and W3C recommendations
and conforms to a standard known as `BCP 47`_, which describes a low-level textual
representation of a locale known as a `language tag`.

A few examples of language tags: *en-US*, *de*, *ar*, *zh-Hans*, *es-CL*.

Locales and Language Tags
=========================

The locale data structure consists of four primary fields.

- Language (Example: English - *en*, French - *fr*, Serbian - *sr*)
- Script (Example: Latin - *Latn*, Cyrillic - *Cyrl*)
- Region (Example: United States - *US*, Canada - *CA*, Russia - *RU*)
- Variants (Example: Mac OS - *macos*, Windows - *windows*, Linux - *linux*)

`BCP 47`_ specifies the syntax for each of those fields (called subtags) when
represented as a string. The syntax defines the allowed selection of characters,
their capitalization, and the order in which the fields should be defined.

Most of the base subtags are valid ISO codes, such as `ISO 639`_ for the
language subtag, or `ISO 3166-1`_ for the region.

The examples above present language tags with several fields omitted, which is allowed
by the standard.

On top of that, a locale may contain:

- extensions and private fields
    These fields can be used to carry additional information about a locale.
    Mozilla currently has partial support for them in the JS implementation and plans to
    extend support to all APIs.
- extkeys and "grandfathered" tags (unfortunate language, but part of the spec)
    Mozilla does not support these yet.


An example locale can be visualized as:

.. code-block:: javascript

    {
      "language": "sr",
      "script": "Cyrl",
      "region": "RU",
      "variants": [],
      "extensions": {},
      "privateuse": [],
    }

which can then be serialized into a string: **"sr-Cyrl-RU"**.

.. important::

    Since locales are often stored and passed around the codebase as
    language tag strings, it is important to always use an appropriate
    API to parse, manipulate and serialize them.
    Avoid `Do-It-Yourself` solutions, which leave your code fragile and may
    break on unexpected language tag structures.

Locale Fallback Chains
======================

Locale-sensitive operations are always considered "best-effort". That means that it
cannot be assumed that a perfect match will exist between what the user requested and what
the API can provide.

As a result, the best practice is to *always* operate on locale fallback chains -
lists of locales ordered according to the user's preference.

An example of a locale fallback chain may be: :js:`["es-CL", "es-ES", "es", "fr", "en"]`.

The above is a request to format the data according to Chilean Spanish if possible,
then fall back to Spanish Spanish, then any (generic) Spanish, then French, and
eventually English.

.. important::

    It is *always* better to use a locale fallback chain over a single locale.
    In case there's only one locale available, a list with one element will work
    while allowing for future extensions without a costly refactor.
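
As an illustration of the advice above, here is a minimal sketch of code that always
works with a fallback chain, even when the caller passes a single language tag. The
helpers :js:`toFallbackChain` and :js:`pickFirstAvailable` are hypothetical names chosen
for this example and are not Gecko APIs; only :js:`Intl.getCanonicalLocales` is a
standard ECMAScript function. The sketch performs exact matching only, so it is not a
substitute for the language negotiation described in the next section.

.. code-block:: javascript

    // Hypothetical helper: accept a single language tag or a list of tags and
    // always return a canonicalized fallback chain (an array).
    function toFallbackChain(locales) {
      // Intl.getCanonicalLocales accepts a string or an array, fixes casing
      // ("es-cl" -> "es-CL"), drops duplicates, and throws a RangeError on
      // structurally invalid tags.
      return Intl.getCanonicalLocales(locales);
    }

    // Hypothetical helper: return the first locale of the chain that is present
    // in `available`, or undefined when nothing matches exactly.
    function pickFirstAvailable(requested, available) {
      const availableSet = new Set(toFallbackChain(available));
      return toFallbackChain(requested).find(locale => availableSet.has(locale));
    }

    const chain = toFallbackChain(["es-CL", "es-ES", "es", "fr", "en"]);
    const chosen = pickFirstAvailable(chain, ["fr", "en"]);

Because both helpers take and return lists, a caller that starts with one locale can
later grow its chain without changing any signatures.
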

Language Negotiation
====================

Due to the imperfections in data matching, all operations on locales should always
use a language negotiation algorithm to resolve the best available set of locales,
based on the list of all available locales and an ordered list of requested locales.

Such algorithms may vary in sophistication and number of strategies. Mozilla's
solution is based on modified logic from `RFC 5656`_.

The three lists of locales used in negotiation:

- **Available** - locales that are locally installed
- **Requested** - locales that the user selected, in decreasing order of preference
- **Resolved** - result of the negotiation

The result of a negotiation is an ordered list of locales that are available to
the system, and the consumer is expected to attempt using the locales in the
resolved order.

Negotiation should be used in all scenarios, such as selecting language resources,
calendar, number formatting, etc.

Single Locale Matching
----------------------

Every negotiation strategy goes through a list of steps in an attempt to find the
best possible match between locales.

The exact algorithm is custom, and consists of a six-level strategy:

::

    1) Attempt to find an exact match for each requested locale in available
       locales.
       Example: ['en-US'] * ['en-US'] = ['en-US']

    2) Attempt to match a requested locale to an available locale treated
       as a locale range.
       Example: ['en-US'] * ['en'] = ['en']
                              ^^
                              |-- becomes 'en-*-*-*'

    3) Attempt to use the maximized version of the requested locale, to
       find the best match in available locales.
       Example: ['en'] * ['en-GB', 'en-US'] = ['en-US']
                  ^^
                  |-- ICU likelySubtags expands it to 'en-Latn-US'

    4) Attempt to look for a different variant of the same locale.
       Example: ['ja-JP-win'] * ['ja-JP-mac'] = ['ja-JP-mac']
                  ^^^^^^^^^
                  |----------- replace variant with range: 'ja-JP-*'

    5) Attempt to look for a maximized version of the requested locale,
       stripped of the region code.
       Example: ['en-CA'] * ['en-ZA', 'en-US'] = ['en-US', 'en-ZA']
                  ^^^^^
                  |----------- look for likelySubtag of 'en': 'en-Latn-US'

    6) Attempt to look for a different region of the same locale.
       Example: ['en-GB'] * ['en-AU'] = ['en-AU']
                  ^^^^^
                  |----- replace region with range: 'en-*'

Filtering / Matching / Lookup
-----------------------------

When negotiating between lists of locales, Mozilla's :js:`LocaleService` API
offers three language negotiation strategies:

Filtering
^^^^^^^^^

This is the most common scenario, where there is an advantage in creating the
largest possible list of locales that the user may benefit from.

An example of this scenario:

.. code-block:: javascript

    let requested = ["fr-CA", "en-US"];
    let available = ["en-GB", "it", "en-ZA", "fr", "de-DE", "fr-CA", "fr-CH"];

    let result = Services.locale.negotiateLanguages(requested, available);

    // result: ["fr-CA", "fr", "fr-CH", "en-GB", "en-ZA"]

In the example above the algorithm was able to match *"fr-CA"* as a perfect match,
but then was able to find other matches as well - generic French is a very
good match, and Swiss French is also very close to the top requested language.

For the second requested locale, American English is unfortunately
not available, but British English and South African English are.

The algorithm is greedy and attempts to match as many locales
as possible. This is usually what the developer wants.

Matching
^^^^^^^^

In less common scenarios the code needs to match a single, best available locale for
each of the requested locales.


An example of this scenario:

.. code-block:: javascript

    let requested = ["fr-CA", "en-US"];
    let available = ["en-GB", "it", "en-ZA", "fr", "de-DE", "fr-CA", "fr-CH"];

    let result = Services.locale.negotiateLanguages(
      requested,
      available,
      undefined,
      Services.locale.langNegStrategyMatching);

    // result: ["fr-CA", "en-GB"]

The best available locale for *"fr-CA"* is a perfect match, and for *"en-US"* the
algorithm selected British English.

Lookup
^^^^^^

The third strategy should be used in cases where, no matter what, only one locale
can ever be used. Some third-party APIs don't support fallback, and it doesn't make
sense to continue resolving after finding the first locale.

It is still advised to treat the result of this API as a fallback chain list, just in
this case with a single element.

.. code-block:: javascript

    let requested = ["fr-CA", "en-US"];
    let available = ["en-GB", "it", "en-ZA", "fr", "de-DE", "fr-CA", "fr-CH"];

    let result = Services.locale.negotiateLanguages(
      requested,
      available,
      Services.locale.defaultLocale,
      Services.locale.langNegStrategyLookup);

    // result: ["fr-CA"]

Default Locale
--------------

Besides the *Available*, *Requested* and *Resolved* locale lists, there's also a concept
of *DefaultLocale*, which is a single locale out of the list of available ones that
should be used in case there is no match to be found between available and
requested locales.

Every Firefox build has a single default locale - for example
**Firefox zh-CN** has *DefaultLocale* set to *zh-CN*, since this locale is guaranteed
to be packaged in, to have all the resources, and should be used if the negotiation fails
to return any matches.

.. code-block:: javascript

    let requested = ["fr-CA", "en-US"];
    let available = ["it", "de", "zh-CN", "pl", "sr-RU"];
    let defaultLocale = "zh-CN";

    let result = Services.locale.negotiateLanguages(requested, available, defaultLocale);

    // result: ["zh-CN"]

Chained Language Negotiation
----------------------------

In some cases the user may want to link a language selection to another component.

For example, a Firefox extension may come with its own list of available locales, which
may include locales that Firefox doesn't have.

In that case, negotiation between the user's requested locales and the add-on's list may
result in a selection of locales superseding that of Firefox itself.


.. code-block:: none

         Fx Available
        +-------------+
        |  it, fr, ar |
        +-------------+                Fx Locales
                      |                +--------+
                      +--------------> | fr, ar |
                      |                +--------+
         Requested    |
     +----------------+
     | es, fr, pl, ar |
     +----------------+                Add-on Locales
                      |                +------------+
                      +--------------> | es, fr, ar |
     Add-on Available |                +------------+
    +-----------------+
    |  de, es, fr, ar |
    +-----------------+


In that case, an add-on may end up being displayed in Spanish, while the Firefox UI will
use French. In most cases this results in a bad UX.

In order to avoid that, one can chain the add-on negotiation: take Firefox's resolved
locales as the `requested` list, and negotiate that against the add-on's `available` list.

.. code-block:: none

         Fx Available
        +-------------+
        |  it, ar, fr |
        +-------------+      Fx Locales (as Add-on Requested)
                      |                +--------+
                      +--------------> | fr, ar |
                      |                +--------+
         Requested    |                    |               Add-on Locales
     +----------------+                    |               +--------+
     | es, fr, pl, ar |                    +-------------> | fr, ar |
     +----------------+                    |               +--------+
                                           |
     Add-on Available                      |
    +-----------------+
    |  de, es, ar, fr |
    +-----------------+

Available Locales
=================

In Gecko, available locales come from the `Packaged Locales` and the installed
`language packs`. Language packs are a variant of WebExtensions providing just
localized resources for one or more languages.

The primary notion of which locales are available is based on which locales Gecko has
UI localization resources for. Other datasets, such as internationalization data, may
carry different lists of available locales.

Requested Locales
=================

The list of requested locales can be read and set using the
:js:`LocaleService::requestedLocales` API.

Using the API will perform the necessary sanity checks and canonicalize the values.

After the sanitization, the value will be stored in the pref :js:`intl.locale.requested`.
The pref will usually store a comma-separated list of valid BCP 47 locale
codes, but it can also have two special meanings:

- If the pref is not set at all, Gecko will use the default locale as the requested one.
- If the pref is set to an empty string, Gecko will look into OS app locales as the requested.

The former is the current default setting for Firefox Desktop, and the latter is the
default setting for Firefox for Android.

If the developer wants to programmatically request the app to follow OS locales,
they can assign :js:`null` to :js:`requestedLocales`.
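
The three possible states of the pref can be modeled with a small, self-contained
sketch. This is only an illustration of the rules listed above, not Gecko's
implementation: :js:`requestedFromPref` and its parameters are hypothetical names, and
the sketch skips the sanity checks and canonicalization that the real API performs.

.. code-block:: javascript

    // Illustrative model only: `requestedFromPref` is a hypothetical function,
    // not a Gecko API.
    //   prefValue     - raw value of intl.locale.requested, or undefined if unset
    //   defaultLocale - the default locale of the build
    //   osLocales     - the OS app locales, in order of preference
    function requestedFromPref(prefValue, defaultLocale, osLocales) {
      if (prefValue === undefined || prefValue === null) {
        // Pref not set at all: the default locale is the requested one.
        return [defaultLocale];
      }
      if (prefValue === "") {
        // Pref set to an empty string: follow the OS app locales.
        return [...osLocales];
      }
      // Otherwise: a comma-separated list of language tags.
      return prefValue
        .split(",")
        .map(tag => tag.trim())
        .filter(tag => tag.length > 0);
    }

    const fromUnset = requestedFromPref(undefined, "en-US", ["pl", "en-GB"]);
    const fromEmpty = requestedFromPref("", "en-US", ["pl", "en-GB"]);
    const fromList = requestedFromPref("fr-CA, en-US", "en-US", ["pl", "en-GB"]);

Keeping the unset and empty-string cases distinct in the model mirrors why the two
platforms mentioned above can ship different defaults with the same pref.
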

Regional Preferences
====================

Every locale comes with a set of default preferences that are specific to a culture
and region. These include the calendar system, the way time is displayed
(24h vs 12h clock), which day the week starts on, which days constitute a weekend,
and which numbering system and date-time formatting a given locale uses
(for example "MM/DD" in en-US vs "DD/MM" in en-AU).

For all such preferences Gecko has a list of default settings for every region,
but there's also a degree of customization every user may want to make.

All major operating systems have a Settings UI for selecting those preferences,
and since Firefox does not provide its own, Gecko looks into the OS for them.

A special API, :js:`mozilla::intl::OSPreferences`, handles communication with the
host operating system, retrieving regional preferences and altering
internationalization formatting with user preferences.

One thing to notice is that the boundary between regional preferences and language
selection is not strong. In many cases the internationalization formats
will contain language-specific terms and literals. For example, a date formatted
in Japanese may look like this - *"2018年3月24日"*, or the date format
may contain names of months or weekdays to be translated
("April", "Tuesday" etc.).

For that reason it is tricky to follow regional preferences in a scenario where the
operating system locale selection does not match the Firefox UI locales.

Such behavior might lead to a UI case like "Today is 24 października" in an English Firefox
with Polish date formats.

For that reason, by default, Gecko will *only* look into OS preferences if the *language*
portion of the locale of the OS and Firefox match.
That means that if Windows is in "**en**-AU" and Firefox is in "**en**-US", Gecko will look
into Windows Regional Preferences, but if Windows is in "**de**-CH" and Firefox
is in "**fr**-FR", it won't.
In order to force Gecko to look into OS preferences regardless of the language match,
set the flag :js:`intl.regional_prefs.use_os_locales` to :js:`true`.

UI Direction
------------

Since the UI direction is so tightly coupled with the locale selection, the
main method of testing the directionality of the Gecko app lives in LocaleService.

:js:`LocaleService::IsAppLocaleRTL` returns a boolean indicating if the current
direction of the app UI is right-to-left.

Default and Last Fallback Locales
=================================

Every Gecko application is built with a single locale as the default one. This locale
is guaranteed to have all linguistic resources available, should be used
as the default locale in case language negotiation cannot find any match, and should
also be the last locale to look for in a fallback chain.

If all else fails, Gecko also supports a notion of a last fallback locale, which is
currently hardcoded to *"en-US"*, and is the very final locale to try in case
nothing else (including the default locale) works.
Notice that Unicode and ICU use *"en-GB"* in that role because more English-speaking
people around the world recognize British regional preferences than American ones
(metric vs. imperial, Fahrenheit vs. Celsius etc.).
Mozilla may switch to *"en-GB"* in the future.

Packaged Locales
================

When a Gecko application is packaged, it bundles a selection of locale resources
to be available within it. At the moment, for example, most Firefox for Android
builds come with almost 100 locales packaged into them, while Desktop Firefox usually
comes with just one packaged locale.


There is currently work being done on enabling more flexibility in how
the locales are packaged, to allow for bundling applications with different
sets of locales in different areas - dictionaries, hyphenations, product language resources,
installer language resources, etc.

Web Exposed Locales
===================

For anti-tracking and other reasons, we tend to expose spoofed locales to web content instead
of the default locales. This can be done by setting the pref :js:`intl.locale.privacy.web_exposed`.
The pref is a comma-separated list of locales; an empty string implies the default locales.

The pref has no effect while :js:`privacy.spoof_english` is set to 2, in which case *"en-US"*
will always be returned.

Multi-Process
=============

Locale management can operate in a client/server model. This allows a Gecko process
to manage locales (server mode) or just receive the locale selection from a parent
process (client mode).

The client mode is currently used by all child processes of Desktop Firefox, and
may be used by, for example, GeckoView to follow locale selection from a parent
process.

To check the mode the process is operating in, the :js:`LocaleService::IsServer` method is available.

Note that :js:`L10nRegistry.registerSources`, :js:`L10nRegistry.updateSources`, and
:js:`L10nRegistry.removeSources` each trigger an IPC synchronization between the parent
process and any extant content processes, which is expensive. If you need to change the
registration of multiple sources, the best way to do so is to coalesce multiple requests
into a single array and then call the method once.

Mozilla Exceptions
==================

There is currently only a single exception to BCP 47 in use, and that's
the legacy "ja-JP-mac" locale. The "mac" is a variant, and BCP 47 requires all variants
to be 5-8 characters long.


Gecko works around this limitation by accepting the 3-letter variant in its APIs, and also
provides a special :js:`appLocalesAsLangTags` method which returns this locale in that form.
(:js:`appLocalesAsBCP47` will canonicalize it and turn it into `"ja-JP-macos"`.)

Usage of language negotiation etc. shouldn't rely on this behavior.

Events
======

:js:`LocaleService` emits two events: :js:`intl:app-locales-changed` and
:js:`intl:requested-locales-changed`, which all code can listen to.

Those events may be broadcast in response to language packs being installed or
uninstalled, or to the user's selection of languages changing.

In most cases, the code should observe :js:`intl:app-locales-changed`
and react only to that event, since this is the one indicating a change
in the currently used language settings that the components should follow.

Testing
=======

Many components may have logic encoded to react to changes in requested, available
or resolved locales.

In order to test the component's behavior, it is important to replicate
the environment in which such a change may happen.

Since in most cases it is advised for a component to tie its
language negotiation to the main application (see `Chained Language Negotiation`),
it is not enough to add a new locale to trigger the language change.

First, it is necessary to add a new locale to the available ones, then change
the requested ones, and only that will result in a new negotiation and a language
change happening.

There are two primary ways to add a locale to the available ones.

Testing Localization
--------------------

If the goal is to test that the correct localization ends up in the correct place,
the developer needs to register a new :js:`L10nFileSource` in :js:`L10nRegistry` and
provide mock cached data to be returned by the API.


It may look like this:

.. code-block:: javascript

    let source = L10nFileSource.createMock(
      "mock-source", "app",
      ["ko-KR", "ar"],
      "resource://mock-addon/localization/{locale}",
      [
        {
          path: "resource://mock-addon/localization/ko-KR/test.ftl",
          source: "key = Value in Korean"
        },
        {
          path: "resource://mock-addon/localization/ar/test.ftl",
          source: "key = Value in Arabic"
        }
      ]
    );

    L10nRegistry.getInstance().registerSources([source]);

    let availableLocales = Services.locale.availableLocales;

    assert(availableLocales.includes("ko-KR"));
    assert(availableLocales.includes("ar"));

    Services.locale.requestedLocales = ["ko-KR"];

    let appLocales = Services.locale.appLocalesAsBCP47;
    assert(appLocales[0] === "ko-KR");

From here, a resource :js:`test.ftl` can be added to a `Localization` and for ID :js:`key`
the correct value from the mocked cache will be returned.

Testing Locale Switching
------------------------

The second method is much more limited, as it only mocks the locale availability,
but it is also simpler:

.. code-block:: javascript

    Services.locale.availableLocales = ["ko-KR", "ar"];
    Services.locale.requestedLocales = ["ko-KR"];

    let appLocales = Services.locale.appLocalesAsBCP47;
    assert(appLocales[0] === "ko-KR");

In the future, Mozilla plans to add a third way for add-ons (`bug 1440969`_)
that allows disconnecting their locales from those of the main application, for
either manual or automated testing purposes.

Testing the outcome
-------------------

Except when testing reactions to locale changes, it is advised to avoid writing
tests that expect a certain locale to be selected, or certain internationalization
or localization data to be used.


Doing so locks down the test infrastructure to be only usable when launched in
a single locale environment and requires those tests to be updated whenever the underlying
data changes.

In the case of testing locale selection it is best to use a fake locale like :js:`x-test`, that
will not be present at the beginning of the test.

In the case of testing for internationalization data it is best to use :js:`resolvedOptions()`,
to verify the right data is being used, rather than comparing the output string.

In the case of localization, it is best to test against the correct :js:`data-l10n-id`
being set or, in edge cases, verify that a given variable is present in the string using
:js:`String.prototype.includes`.

Deep Dive
=========

Below is a list of articles with additional
details on selected subjects:

.. toctree::
    :maxdepth: 1

    locale_env
    locale_startup

Feedback
========

In case of questions, please consult Intl module peers.


.. _RFC 5656: https://tools.ietf.org/html/rfc5656
.. _BCP 47: https://tools.ietf.org/html/bcp47#section-2.1
.. _ISO 639: http://www.loc.gov/standards/iso639-2/php/code_list.php
.. _ISO 3166-1: https://www.iso.org/iso-3166-country-codes.html
.. _Intl.Locale: https://bugzilla.mozilla.org/show_bug.cgi?id=1433303
.. _fluent-locale: https://docs.rs/fluent-locale/
.. _bug 1440969: https://bugzilla.mozilla.org/show_bug.cgi?id=1440969