#############
Mach Try Perf
#############

.. contents::
   :depth: 2
   :local:

To make it easier for developers to find the tests they need to run, we built a perf-specific try selector called ``./mach try perf``. With this tool, you no longer need to remember the obfuscated platform and test names that you need to target for your tests. Instead, the new interface shows test categories along with a simplified name of the platform that they will run on.

When you trigger a try run from the perf selector, two try runs will be created. One with your changes, and one without. The push without your changes will be done on the revision that your patches are based on (which we call the base revision). In your console, after you trigger the try runs, you'll find a PerfCompare link that will bring you directly to a comparison of the two pushes when they have completed.

The tool is built to be conservative about the number of tests to run, so if you are looking for something that is not listed, it's likely hidden behind a flag found in the ``--help``. Here's a list of what you'll find there::

    $ ./mach try perf --help

    optional arguments:
      -h, --help            show this help message and exit
    perf arguments:
      --show-all            Show all available tasks. Alternatively, --full may be used.
      --chrome              Show tests available for Chrome-based browsers (disabled by
                            default).
      --custom-car          Show tests available for Custom Chromium-as-Release (disabled by
                            default)
      --safari              Show tests available for Safari (disabled by default).
      --safari-tp           Show tests available for Safari Technology Preview(disabled by
                            default).
      --live-sites          Run tasks with live sites (if possible). You can also use the
                            `live-sites` variant.
      --profile             Run tasks with profiling (if possible). You can also use the
                            `profiling` variant.
      --single-run          Run tasks without a comparison
      -q QUERY, --query QUERY
                            Query to run in either the perf-category selector, or the fuzzy
                            selector if --show-all/--full is provided.
      --browsertime-upload-apk BROWSERTIME_UPLOAD_APK
                            Path to an APK to upload. Note that this will replace the APK
                            installed in all Android Performance tests. If the Activity,
                            Binary Path, or Intents required change at all relative to the
                            existing GeckoView, and Fenix tasks, then you will need to make
                            fixes in the associated taskcluster files (e.g.
                            taskcluster/kinds/browsertime/mobile.yml). Alternatively, set
                            MOZ_FIREFOX_ANDROID_APK_OUTPUT to a path to an APK, and then run
                            the command with --browsertime-upload-apk firefox-android. This
                            option will only copy the APK for browsertime, see
                            --mozperftest-upload-apk to upload APKs for startup tests.
      --mozperftest-upload-apk MOZPERFTEST_UPLOAD_APK
                            See --browsertime-upload-apk. This option does the same thing
                            except it's for mozperftest tests such as the startup ones. Note
                            that those tests only exist through --show-all/--full as they aren't
                            contained in any existing categories.
      --detect-changes      Adds a task that detects performance changes using MWU.
      --comparator COMPARATOR
                            Either a path to a file to setup a custom comparison, or a builtin
                            name. See the Firefox source docs for mach try perf for examples
                            of how to build your own, along with the interface.
      --comparator-args [ARG=VALUE [ARG=VALUE ...]]
                            Arguments provided to the base, and new revision setup stages of
                            the comparator.
      --variants [ [ ...]]  Select variants to display in the selector from: fission,
                            bytecode-cached, live-sites, profiling, swr
      --platforms [ [ ...]]
                            Select specific platforms to target.
                            Available platforms: android-a55, android, windows,
                            linux, macosx, desktop
      --apps [ [ ...]]      Select specific applications to target from: firefox, chrome,
                            geckoview, fenix, chrome-m, safari, safari-tp, custom-car,
                            cstm-car-m
      --clear-cache         Deletes the try_perf_revision_cache file
      --alert ALERT         Run all tests that produced this alert summary ID based on the
                            alert summary table in either the alerts view or the regression
                            bug. The comparison that is produced will be based on the base
                            revision in your local repository (i.e. the base revision your
                            patches, if any, are based on). If only specific tests need to
                            run, use --tests to specify them (e.g. --tests webaudio).
      --extra-args [ [ ...]]
                            Set the extra args (e.x, --extra-args verbose
                            post-startup-delay=1)
      --non-pgo             Use opt/non-pgo builds instead of shippable/pgo builds. Setting
                            this flag will result in faster try runs.
      --tests [TESTS [TESTS ...]], -t [TESTS [TESTS ...]]
                            Select from all tasks that run these specific tests (e.g. amazon, or
                            speedometer3).

    task configuration arguments:
      --artifact            Force artifact builds where possible.
      --no-artifact         Disable artifact builds even if being used locally.
      --browsertime         Use browsertime during Raptor tasks.
      --disable-pgo         Don't run PGO builds
      --env ENV             Set an environment variable, of the form FOO=BAR. Can
                            be passed in multiple times.
      --gecko-profile       Create and upload a gecko profile during talos/raptor
                            tasks. Copy paste the parameters used in this profiling
                            run directly from about:profiling in Nightly.
      --gecko-profile-interval GECKO_PROFILE_INTERVAL
                            How frequently to take samples (ms)
      --gecko-profile-entries GECKO_PROFILE_ENTRIES
                            How many samples to take with the profiler
      --gecko-profile-features GECKO_PROFILE_FEATURES
                            Set the features enabled for the profiler.
      --gecko-profile-threads GECKO_PROFILE_THREADS
                            Comma-separated list of threads to sample.
      paths                 Run tasks containing tests under the specified
                            path(s).
      --rebuild [2-20]      Rebuild all selected tasks the specified number of
                            times.



Workflow
--------

Below, you'll find an overview of the features available in ``./mach try perf``. If you'd like to learn more about how to use this tool to enhance your development process, see the :ref:`Standard Workflow with Mach Try Perf` page.

Standard Usage
--------------

To use mach try perf simply call ``./mach try perf``. This will open an interface for test selection like so:


.. image:: ./standard-try-perf.png
   :alt: Mach try perf with default options
   :scale: 75%
   :align: center


Select the categories you'd like to run, hit enter, and wait for the tool to finish the pushes. **Note that it can take some time to do both pushes, and you might not see logging for some time.**


Retrigger
---------
After the push is done, you will receive a Treeherder link that you can open to view your push. Access the Treeherder link to see all your tests.

To launch a retrigger, first select the task that you want to retrigger:

.. image:: ./th_select_task.png
   :width: 300


Then, click the rotating arrow icon in the task action bar, or press 'r' on your keyboard:

.. image:: ./th_retrigger.png
   :width: 300


Additionally, you can add the ``--rebuild`` flag with a value from 2 to 20 (e.g. ``--rebuild 3``) to the try perf command to specify how many times you want to run the tests. If you want to learn more about retriggering please `visit this page <../treeherder-try/index.html#retrigger-r>`__.


Add new jobs (mass retriggers)
------------------------------

The add new job function can be used to retrigger many tasks multiple times. To add a new job, follow these steps:

* Navigate to the push you want to add jobs on Treeherder.
* Click on the arrow drop-down on the top right of the push.
* Select the ``Custom push action`` from the menu.

.. image:: ./th_custom_push_action.png
   :width: 500

You can copy the values from the ``target-tasks.json`` file from your ``Decision`` task and paste them into the ``task`` option. This method is useful for mass retriggers if needed.
After you have pasted the json values, press the ``Trigger`` button.

.. image:: ./th_custom_job_action.png
   :width: 500

Ideally, you should be able to use compare view to be more specific in the retriggers you do for tasks/tests that show a difference that you want to double-check.


Add extra-arguments/options to try run
--------------------------------------

To add additional arguments to a try run, there are several approaches you can consider:


Use Treeherder
^^^^^^^^^^^^^^

This method assumes that the job has already been run and that you want to run it again with extra options. First select the task to which you want to add extra options:

.. image:: ./th_select_task.png
   :width: 300

Then, click the three dots icon in the task action bar and select ``Custom Action``:

.. image:: ./th_custom_action.png
   :width: 300

A window will open where you need to select ``raptor-extra-options``. There you can add all the options you need (e.g. ``extra_options: 'verbose browser-cycles=3'``). After finishing, press the ``Trigger`` button.

.. image:: ./th_raptor_extra_option.png
   :width: 500

Modify the yml file
^^^^^^^^^^^^^^^^^^^

This method involves identifying the YML file that contains the test you are interested in and modifying or adding the extra-options key. Under this key you can add all the parameters you desire.

.. image:: ./extra-options.png
   :width: 500

Use extra-args option
^^^^^^^^^^^^^^^^^^^^^

An alternative method is to pass the ``--extra-args`` argument to the try perf command (e.g. ``--extra-args verbose post-startup-delay=1``).


.. _Running Alert Tests:

Running Alert Tests
-------------------

To run all the tests that triggered a given alert, use ``./mach try perf --alert <ALERT-NUMBER>``. Using this command will run all the tests that generated the alert summary ID provided in the regression bug. **It's recommended to use this when working with performance alerts.** The alert number can be found in comment 0 on any alert bug, `such as this one <https://bugzilla.mozilla.org/show_bug.cgi?id=1844510>`_. As seen in the image below, the alert number can be found just above the summary table. The comparison that is produced will be based on the base revision in your local repository (i.e. the base revision your patches, if any, are based on).

.. image:: ./comment-zero-alert-number.png
   :alt: Comment 0 containing an alert number just above the table.
   :scale: 50%
   :align: center


Running Tasks of a Specific Test
--------------------------------

Using the ``--tests`` option, you can run all tasks that run a specific test. This is based on the test name that is used in the command that runs in the task. For raptor, this is the test specified by ``--test``. For talos, it can either be a specific test in a suite like ``tp5n`` from ``xperf``, or the suite ``xperf`` can be specified. For AWSY though, there are no specific tests that can be selected so the only option to select AWSY tests is to specify ``AWSY`` as the test.

If it's used with ``--alert <NUM>``, only the tasks that run the specific test will be run on try. If it's used with ``--show-all`` or ``--full``, you will only see the tasks that run the specific test in the fuzzy interface.
Finally, if it's used without either of those, then categories of the tests that were specified will be displayed in the fuzzy interface. For example, if ``--tests amazon`` is used, then categories like ``amazon linux firefox`` or ``amazon desktop`` will be displayed.

Chrome
------

Chrome tests are disabled by default as they are often unneeded and waste our limited resources. If you need Chrome tests, you can add ``--chrome`` to the command like so: ``./mach try perf --chrome``.


.. image:: ./android-chrome-try-perf.png
   :alt: Mach try perf with android, and chrome options
   :scale: 75%
   :align: center


Variants
--------

If you are looking for any variants (e.g. no-fission, bytecode-cached, live-sites), use the ``--variants`` option like so: ``./mach try perf --variants live-sites``. This will select all possible categories that could have live-sites tests.


.. image:: ./variants-try-perf.png
   :alt: Mach try perf with variants
   :scale: 75%
   :align: center


Note that it is expected that the offered categories have extra variants (such as bytecode-cached) as we are showing all possible combinations that can include live-sites.

Platforms
---------

To target a particular platform you can use ``--platforms`` to only show categories with the given platforms.

Categories
----------

In the future, this section will be populated dynamically.
If you are wondering what the categories you selected will run, you can use ``--no-push`` to print out a list of tasks that will run like so::

    $ ./mach try perf --no-push

    Artifact builds enabled, pass --no-artifact to disable
    Gathering tasks for Benchmarks desktop category
    Executing queries: 'browsertime 'benchmark, !android 'shippable !-32 !clang, !live, !profil, !chrom
    estimates: Runs 66 tasks (54 selected, 12 dependencies)
    estimates: Total task duration 8:45:58
    estimates: In the shortest 38% of durations (thanks!)
    estimates: Should take about 1:04:58 (Finished around 2022-11-22 15:08)
    Commit message:
    Perf selections=Benchmarks desktop (queries='browsertime 'benchmark&!android 'shippable !-32 !clang&!live&!profil&!chrom)
    Pushed via `mach try perf`
    Calculated try_task_config.json:
    {
        "env": {
            "TRY_SELECTOR": "fuzzy"
        },
        "tasks": [
            "test-linux1804-64-shippable-qr/opt-browsertime-benchmark-firefox-ares6",
            "test-linux1804-64-shippable-qr/opt-browsertime-benchmark-firefox-assorted-dom",
            "test-linux1804-64-shippable-qr/opt-browsertime-benchmark-firefox-jetstream2",
            "test-linux1804-64-shippable-qr/opt-browsertime-benchmark-firefox-matrix-react-bench",
            "test-linux1804-64-shippable-qr/opt-browsertime-benchmark-firefox-motionmark-animometer",
            "test-linux1804-64-shippable-qr/opt-browsertime-benchmark-firefox-motionmark-htmlsuite",
            "test-linux1804-64-shippable-qr/opt-browsertime-benchmark-firefox-speedometer",
            "test-linux1804-64-shippable-qr/opt-browsertime-benchmark-firefox-stylebench",
            "test-linux1804-64-shippable-qr/opt-browsertime-benchmark-firefox-sunspider",
            "test-linux1804-64-shippable-qr/opt-browsertime-benchmark-firefox-twitch-animation",
            "test-linux1804-64-shippable-qr/opt-browsertime-benchmark-firefox-unity-webgl",
            "test-linux1804-64-shippable-qr/opt-browsertime-benchmark-firefox-webaudio",
            "test-linux1804-64-shippable-qr/opt-browsertime-benchmark-wasm-firefox-wasm-godot",
            "test-linux1804-64-shippable-qr/opt-browsertime-benchmark-wasm-firefox-wasm-godot-baseline",
            "test-linux1804-64-shippable-qr/opt-browsertime-benchmark-wasm-firefox-wasm-godot-optimizing",
            "test-linux1804-64-shippable-qr/opt-browsertime-benchmark-wasm-firefox-wasm-misc",
            "test-linux1804-64-shippable-qr/opt-browsertime-benchmark-wasm-firefox-wasm-misc-baseline",
            "test-linux1804-64-shippable-qr/opt-browsertime-benchmark-wasm-firefox-wasm-misc-optimizing",
            "test-macosx1015-64-shippable-qr/opt-browsertime-benchmark-firefox-ares6",
            "test-macosx1015-64-shippable-qr/opt-browsertime-benchmark-firefox-assorted-dom",
            "test-macosx1015-64-shippable-qr/opt-browsertime-benchmark-firefox-jetstream2",
            "test-macosx1015-64-shippable-qr/opt-browsertime-benchmark-firefox-matrix-react-bench",
            "test-macosx1015-64-shippable-qr/opt-browsertime-benchmark-firefox-motionmark-animometer",
            "test-macosx1015-64-shippable-qr/opt-browsertime-benchmark-firefox-motionmark-htmlsuite",
            "test-macosx1015-64-shippable-qr/opt-browsertime-benchmark-firefox-speedometer",
            "test-macosx1015-64-shippable-qr/opt-browsertime-benchmark-firefox-stylebench",
            "test-macosx1015-64-shippable-qr/opt-browsertime-benchmark-firefox-sunspider",
            "test-macosx1015-64-shippable-qr/opt-browsertime-benchmark-firefox-twitch-animation",
            "test-macosx1015-64-shippable-qr/opt-browsertime-benchmark-firefox-unity-webgl",
            "test-macosx1015-64-shippable-qr/opt-browsertime-benchmark-firefox-webaudio",
            "test-macosx1015-64-shippable-qr/opt-browsertime-benchmark-wasm-firefox-wasm-godot",
            "test-macosx1015-64-shippable-qr/opt-browsertime-benchmark-wasm-firefox-wasm-godot-baseline",
            "test-macosx1015-64-shippable-qr/opt-browsertime-benchmark-wasm-firefox-wasm-godot-optimizing",
            "test-macosx1015-64-shippable-qr/opt-browsertime-benchmark-wasm-firefox-wasm-misc",
            "test-macosx1015-64-shippable-qr/opt-browsertime-benchmark-wasm-firefox-wasm-misc-baseline",
            "test-macosx1015-64-shippable-qr/opt-browsertime-benchmark-wasm-firefox-wasm-misc-optimizing",
            "test-windows10-64-shippable-qr/opt-browsertime-benchmark-firefox-ares6",
            "test-windows10-64-shippable-qr/opt-browsertime-benchmark-firefox-assorted-dom",
            "test-windows10-64-shippable-qr/opt-browsertime-benchmark-firefox-jetstream2",
            "test-windows10-64-shippable-qr/opt-browsertime-benchmark-firefox-matrix-react-bench",
            "test-windows10-64-shippable-qr/opt-browsertime-benchmark-firefox-motionmark-animometer",
            "test-windows10-64-shippable-qr/opt-browsertime-benchmark-firefox-motionmark-htmlsuite",
            "test-windows10-64-shippable-qr/opt-browsertime-benchmark-firefox-speedometer",
            "test-windows10-64-shippable-qr/opt-browsertime-benchmark-firefox-stylebench",
            "test-windows10-64-shippable-qr/opt-browsertime-benchmark-firefox-sunspider",
            "test-windows10-64-shippable-qr/opt-browsertime-benchmark-firefox-twitch-animation",
            "test-windows10-64-shippable-qr/opt-browsertime-benchmark-firefox-unity-webgl",
            "test-windows10-64-shippable-qr/opt-browsertime-benchmark-firefox-webaudio",
            "test-windows10-64-shippable-qr/opt-browsertime-benchmark-wasm-firefox-wasm-godot",
            "test-windows10-64-shippable-qr/opt-browsertime-benchmark-wasm-firefox-wasm-godot-baseline",
            "test-windows10-64-shippable-qr/opt-browsertime-benchmark-wasm-firefox-wasm-godot-optimizing",
            "test-windows10-64-shippable-qr/opt-browsertime-benchmark-wasm-firefox-wasm-misc",
            "test-windows10-64-shippable-qr/opt-browsertime-benchmark-wasm-firefox-wasm-misc-baseline",
            "test-windows10-64-shippable-qr/opt-browsertime-benchmark-wasm-firefox-wasm-misc-optimizing"
        ],
        "use-artifact-builds": true,
        "version": 1
    }


Adding a New Category
---------------------

It's very easy to add a new category if needed, and you can do so by modifying the
`PerfParser categories attribute here <https://searchfox.org/mozilla-central/source/tools/tryselect/selectors/perf.py#179>`_. The following is an example of a complex category that gives a good idea of what you have available::

    "Resource Usage": {
        "query": {
            "talos": ["'talos 'xperf | 'tp5"],
            "raptor": ["'power 'osx"],
            "awsy": ["'awsy"],
        },
        "suites": ["talos", "raptor", "awsy"],
        "platform-restrictions": ["desktop"],
        "variant-restrictions": {
            "raptor": [],
            "talos": [],
        },
        "app-restrictions": {
            "raptor": ["firefox"],
            "talos": ["firefox"],
        },
        "tasks": [],
    },

The following fields are available:

* **query**: Set the queries to use for each suite you need.
* **suites**: The suites that are needed for this category.
* **tasks**: A hard-coded list of tasks to select.
* **platform-restrictions**: The platforms that it can run on.
* **app-restrictions**: A list of apps that the category can run.
* **variant-restrictions**: A list of variants available for each suite.

Note that the App/Variant-Restriction fields should be used to restrict the available apps and variants, not expand them, as the suites, apps, and platforms combined already provide the largest coverage. The restrictions should be used when you know certain things definitely won't work, or will never be implemented for this category of tests. For instance, our ``Resource Usage`` tests only work on Firefox even though they may exist in Raptor which can run tests with Chrome.

Comparators
-----------

If the standard/default push-to-try comparison is not enough, you can build your own "comparator" that can set up the base, and new revisions. The default comparator ``BasePerfComparator`` runs the standard mach-try-perf comparison, and there also exists a custom comparator called ``BenchmarkComparator`` for running custom benchmark comparisons on try (using GitHub PR links).
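
As a rough illustration of the comparator idea, the sketch below defines a hypothetical ``ExtraArgsComparator`` that gives the base and new revisions different extra arguments. Because the real base class lives in the tree and cannot be imported on its own, the example uses a small stand-in ``BasePerfComparator`` whose method names follow the interface documented in this section. The ``ExtraArgsComparator`` name, the ``base-args`` and ``new-args`` argument names, the comma-separated value format, and the in-place update of ``extra_args`` are all assumptions made for illustration, not part of the tool::

    class BasePerfComparator:
        """Stand-in for the in-tree base class (assumption: same method names)."""

        def __init__(self, vcs, compare_commit, current_revision_ref, comparator_args):
            self.vcs = vcs
            self.compare_commit = compare_commit
            self.current_revision_ref = current_revision_ref
            self.comparator_args = comparator_args

        def setup_base_revision(self, extra_args):
            pass

        def teardown_base_revision(self):
            pass

        def setup_new_revision(self, extra_args):
            pass

        def teardown_new_revision(self):
            pass

        def teardown(self):
            pass


    def parse_comparator_args(comparator_args):
        """Turn ["NAME=VALUE", ...] into {"NAME": "VALUE", ...}."""
        parsed = {}
        for arg in comparator_args or []:
            name, sep, value = arg.partition("=")
            if not sep or not name:
                raise ValueError(f"Expected NAME=VALUE, got: {arg!r}")
            parsed[name] = value
        return parsed


    class ExtraArgsComparator(BasePerfComparator):
        """Hypothetical comparator: different extra args for each revision."""

        def __init__(self, vcs, compare_commit, current_revision_ref, comparator_args):
            super().__init__(vcs, compare_commit, current_revision_ref, comparator_args)
            self.options = parse_comparator_args(comparator_args)

        def _args_for(self, key):
            # Hypothetical format: a comma-separated list stored under `key`.
            value = self.options.get(key, "")
            return [item for item in value.split(",") if item]

        def setup_base_revision(self, extra_args):
            # Assumption: the caller's list is updated in place.
            extra_args.extend(self._args_for("base-args"))

        def setup_new_revision(self, extra_args):
            extra_args.extend(self._args_for("new-args"))


    if __name__ == "__main__":
        comparator = ExtraArgsComparator(
            None,
            "base-rev",
            "new-rev",
            ["base-args=verbose", "new-args=verbose,post-startup-delay=1"],
        )
        base_args, new_args = [], []
        comparator.setup_base_revision(base_args)
        comparator.setup_new_revision(new_args)
        print(base_args, new_args)

Under these assumptions, the base run receives ``verbose`` while the new run receives ``verbose`` and ``post-startup-delay=1``. In the tree, the stand-in class would be replaced by the real ``BasePerfComparator``, and a builtin comparator would also need the ``@comparator`` decorator.
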

If you'd like to add a custom comparator, you can either create it in a separate file and pass it in the ``--comparator``, or add it to the ``tools/tryselect/selectors/perfselector/perfcomparators.py`` and use the name of the class as the ``--comparator`` argument (e.g. ``--comparator BenchmarkComparator``). You can pass additional arguments to it using the ``--comparator-args`` option that accepts arguments in the format ``NAME=VALUE``.

The custom comparator needs to be a subclass of ``BasePerfComparator``, and can optionally override its methods. See the comparators file for more information about the interface available. Here's the general interface for it (subject to change). Note that the ``@comparator`` decorator is required when making a builtin comparator::

    @comparator
    class BasePerfComparator:
        def __init__(self, vcs, compare_commit, current_revision_ref, comparator_args):
            """Initialize the standard/default settings for Comparators.

            :param vcs object: Used for updating the local repo.
            :param compare_commit str: The base revision found for the local repo.
            :param current_revision_ref str: The current revision of the local repo.
            :param comparator_args list: List of comparator args in the format NAME=VALUE.
            """

        def setup_base_revision(self, extra_args):
            """Setup the base try run/revision.

            The extra_args can be used to set additional
            arguments for Raptor (not available for other harnesses).

            :param extra_args list: A list of extra arguments to pass to the try tasks.
            """

        def teardown_base_revision(self):
            """Teardown the setup for the base revision."""

        def setup_new_revision(self, extra_args):
            """Setup the new try run/revision.

            Note that the extra_args are reset between the base, and new revision runs.

            :param extra_args list: A list of extra arguments to pass to the try tasks.
            """

        def teardown_new_revision(self):
            """Teardown the new run/revision setup."""

        def teardown(self):
            """Teardown for failures.

            This method can be used for ensuring that the repo is cleaned up
            when a failure is hit at any point in the process of doing the
            new/base revision setups, or the pushes to try.
            """

Frequently Asked Questions (FAQ)
--------------------------------

If you have any questions which aren't already answered below please reach out to us in the `perftest matrix channel <https://matrix.to/#/#perftest:mozilla.org>`_.

* **How can I tell what a category or a set of selections will run?**

  At the moment, you need to run your command with an additional option to see what will be run: ``./mach try perf --no-push``. See the `Categories`_ section for more information about this. In the future, we plan on having a dynamically updated list for the tasks in the `Categories`_ section of this document.

* **What's the difference between ``Pageload desktop``, and ``Pageload desktop firefox``?**

  If you simply ran ``./mach try perf`` with no additional options, then there is no difference. If you start adding additional browsers to the try run with commands like ``./mach try perf --chrome``, then ``Pageload desktop`` will select all tests available for ALL browsers available, and ``Pageload desktop firefox`` will only select Firefox tests. When ``--chrome`` is provided, you'll also see a ``Pageload desktop chrome`` option.

* **Help! I can't find a test in any of the categories. What should I do?**

  Use the option ``--show-all`` or ``--full``. This will let you select tests from the ``./mach try fuzzy --full`` interface directly instead of the categories. You will always be able to find your tests this way. Please be careful with your task selections though as it's easy to run far too many tests in this way!

Future Work
-----------

The future work for this tool can be `found in this bug <https://bugzilla.mozilla.org/show_bug.cgi?id=1799178>`_. Feel free to file improvements, and bugs against it.