
      1 ======================
      2 Performance Sheriffing
      3 ======================
      4 
      5 .. contents::
      6    :depth: 3
      7 
      8 1 Overview
      9 ----------
     10 
     11 Performance sheriffs are responsible for making sure that performance changes in Firefox are detected
     12 and dealt with. They look at data and performance metrics produced by the performance testing frameworks
     13 and find regressions, determine the root cause, and file bugs to track all issues. The workflow we
     14 follow is shown below in our flowchart.
     15 
     16 1.1 Flowchart
     17 ~~~~~~~~~~~~~
     18 
     19 .. image:: ./flowchart.png
     20   :alt: Sheriffing Workflow Flowchart
     21   :align: center
     22 
     23 The workflow of a sheriff is backfilling jobs to get the data, investigating that data, filing
     24 bugs/linking improvements based on the data, and following up with developers if needed.
     25 
     26 1.2 Contacts and the Team
     27 ~~~~~~~~~~~~~~~~~~~~~~~~~
If you have an urgent issue and need help, what can you do?
     29 
If you have a question about a bug that was filed and assigned to you, reach out on Matrix to the sheriff who filed it. If a performance sheriff is not responsive, or you have a question about a bug,
send a message to the `Performance Sheriffs Matrix channel <https://chat.mozilla.org/#/room/#perfsheriffs:mozilla.org>`_
and tag the sheriff. If no one responds, you can message any of the following people directly
on Slack or Matrix:
     35 
     36 - `@afinder <https://people.mozilla.org/p/afinder>`_
     37 - `@andra <https://people.mozilla.org/p/andraesanu>`_
     38 - `@beatrice <https://people.mozilla.org/p/bacasandrei>`_
     39 - `@florin.bilt <https://people.mozilla.org/p/fbilt>`_
- `@sparky <https://people.mozilla.org/p/sparky>`_ (reach out only if all others are unreachable)
     41 
     42 All of the team is in EET (Eastern European Time) except for @sparky who is in EST (Eastern Standard Time).
     43 
     44 1.3 Regression and Improvement Definition
     45 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     46 Whenever we get a performance change we classify it as one of two things, either a regression (worse performance) or
     47 an improvement (better performance).
     48 
     49 2 How to Investigate Alerts
     50 ---------------------------
     51 In this section we will go over how performance sheriffs investigate alerts.
     52 
     53 2.1 Filtering and Reading Alerts
     54 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     55 On the `Perfherder page <https://treeherder.mozilla.org/perfherder/alerts>`_ you should see something like below:
     56 
     57 .. image:: ./Alerts_view.png
     58  :alt: Alerts View Toolbar
     59  :align: center
     60 
After accessing the Perfherder alerts page, make sure the filter (located in the top middle of the screenshot)
is set to show the correct alerts for sheriffing. New alerts can be found by selecting
the **untriaged** option from the left-most dropdown, as shown in the screenshot below:
     64 
     65 .. image:: ./Alerts_view_toolbar.png
     66  :alt: Alerts View Toolbar
     67  :align: center
     68 
     69 The rest of the dropdowns from left to right are as follows:
     70 
     71 - **Testing harness**: altering this will take you to alerts generated on different harnesses
     72 - **The filter input**, where you can type some text and press enter to narrow down the alerts view
     73 - **"Hide downstream / reassigned to / invalid"**: enable this (recommended) to reduce clutter on the page
     74 - **"My alerts"**: only shows alerts assigned to you.
     75 
     76 Below is a screenshot of an alert:
     77 
     78 .. image:: ./single_alert.png
     79  :alt: Alerts View Toolbar
     80  :align: center
     81 
You can identify an alert by its bold text, which reads "Alert #XXXXX". Each alert contains groupings of
test summaries, and those tests:
     84 
     85 - Can run on different platforms
     86 - Can share suite name (like tp5o)
     87 - Measure various metrics
     88 - Share the same framework
     89 
Going from left to right through the columns inside an alert, starting with the test, we have:
     91 
     92 - A blue hyperlink that links to the test documentation (if available)
     93 - The **platform's** operating system
- **Information** about the historical data distribution of that test
     95 - Tags and options related to the test
     96 
     97 2.2 Regressions vs Improvements
     98 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The first thing to note about how we investigate alerts is that **we prioritize handling regressions**! Unlike
improvements, regressions ship bugs to users which, if not addressed, make our products worse and drive users away.
    101 After acknowledging an alert:
    102 
    103 - Regressions go through multiple status changes (TODO: link to sections with multiple status changes) until they are finally resolved
    104 - An improvement has a single status of improvement
    105 
    106 2.3 Framework Thresholds
    107 ~~~~~~~~~~~~~~~~~~~~~~~~
    108 Different frameworks test different things, and the thresholds for triggering alerts and considering
    109 performance changes differ based on the harness:
    110 
    111 - AWSY >= 0.25%
    112 - Build metrics installer size >= 100kb
    113 - Talos, Browsertime, Build Metrics >= 2%
    114 
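The thresholds above can be sketched as a simple lookup. This is an illustrative helper, not Perfherder's actual implementation; the framework keys are assumptions, and the absolute 100 KB installer-size check is omitted since it is not a percentage:

```python
# Hypothetical sketch of the per-harness alerting thresholds listed above.
# Framework keys and the helper name are illustrative, not Perfherder's API.
THRESHOLDS_PCT = {
    "awsy": 0.25,
    "talos": 2.0,
    "browsertime": 2.0,
    "build_metrics": 2.0,
}

def exceeds_threshold(framework: str, old_value: float, new_value: float) -> bool:
    """Return True if the percent change is large enough to alert on."""
    pct_change = abs(new_value - old_value) / old_value * 100.0
    return pct_change >= THRESHOLDS_PCT[framework]

print(exceeds_threshold("talos", 100.0, 101.0))  # 1% change -> False
print(exceeds_threshold("awsy", 400.0, 402.0))   # 0.5% change -> True
```
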
    115 3 How to Handle Inactive Alerts
    116 -------------------------------
    117 
    118 Inactive performance alerts are those alerts which have had no activity in 1 week. This section covers how performance sheriffs should handle inactive performance alerts that are found in the daily email sent to the `perfalert-activity group <https://groups.google.com/a/mozilla.com/g/perfalert-activity/about>`_.
    119 
    120 3.1 Process
    121 ~~~~~~~~~~~
    122 
    123 The following is the general process that needs to be taken for the alerts in the email:
    124 
    125 #. Open the email titled ``[bugbot][autofix] PerfAlert regressions with 1 week(s) of inactivity for the DATE`` to find bugs that are inactive.
    126 
    127    - These occur at most daily.
    128 
    129 #. Open one of the bugs mentioned in the email.
    130 
    131 #. Check if the developer has previously responded to the bug.
    132 
    133 #. Find the developer (regression author) being needinfo’ed by the BugBot.
    134 
    135 #. (Optional) Check on `people.mozilla.org <https://people.mozilla.org>`_ to find the person’s Matrix/Slack information if needed.
    136 
    137 #. Find the developer in a public channel.
    138 
    139    - ``#developers`` on Matrix is the most likely place you can find them.
    140 
    141 #. Reach out to them with a message like the following:
    142 
    143    - **If the patch has had a response from the regressor author:**
    144 
    145      ::
    146 
    147       Hello, could you provide an update on this performance regression or close it if it makes sense to (with a follow-up bug if needed)? <PERFORMANCE-ALERT-BUG-LINK>
    148 
    149    - **If the patch has never had a response from the regressor author:**
    150 
    151      ::
    152 
    153       Hello, could you provide an update on this performance regression or close it if it makes sense to (with a follow-up bug if needed)? In accordance with our `regression policy <https://www.mozilla.org/en-US/about/governance/policies/regressions/>`_, we're considering backing out your patch due to a lack of comments/activity: <PERFORMANCE-ALERT-BUG-LINK>
    154 
    155 3.2 Handling Responses
    156 ~~~~~~~~~~~~~~~~~~~~~~
    157 
    158 For Bugs with a Response from the Regressor Author
    159 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    160 
    161 Depending on the developer's response, one of four things may happen:
    162 
    163 #. **Developer provides an update on the alert bug:**
    164 
    165    - No other action is needed. If this has happened multiple times on the bug, you can add the ``backlog-deferred`` keyword to prevent the BugBot rule from triggering again on the alert.
    166 
    167 #. **Developer asks for clarification on the process or isn’t sure what to do:**
    168 
    169    - Point them to this documentation. Explain the possible resolutions and what we expect of them.
    170 
    171 #. **Developer does not respond:**
    172 
    173    - Wait for 1 full business day for the response. If there is still no response, find and ping their manager (can be in private) from `people.mozilla.org <https://people.mozilla.org>`_.
    174 
    175      - If there is a response from the manager, you can proceed with one of the other options.
    176 
    177 #. **Developer does not want to close the bug and needs time to investigate:**
    178 
    179    - Add the ``backlog-deferred`` keyword to prevent BugBot from triggering on this bug again in the future.
    180 
    181 For Bugs with No Previous Response from the Regressor Author
    182 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    183 
    184 Depending on the developer's response, one of five things may happen:
    185 
    186 #. **Developer agrees to a backout:**
    187 
    188    - Reach out to a sheriff in ``#sheriffs`` on Matrix to request the backout.
    189 
    190      - Ensure that they understand that if they’re actively working on it, they can provide an update on the alert bug to prevent a backout.
    191      - Ensure that they understand that they can close the bug with ``WONTFIX``/``INCOMPLETE`` if they aren’t actively working on it, or they think it isn’t a big issue. They can file a follow-up bug to look into the issue further in the future. If it's been determined that there is no actual performance issue but there was a detection, they could close the bug as ``WORKSFORME``.
    192 
    193 #. **Developer provides an update on the alert bug:**
    194 
    195    - No other action is needed. If this has happened multiple times on the bug, you can add the ``backlog-deferred`` keyword to prevent the BugBot rule from triggering again on the alert.
    196 
    197 #. **Developer asks for clarification on the process or isn’t sure what to do:**
    198 
    199    - Point them to this documentation. Explain the possible resolutions and what we expect of them.
    200 
    201 #. **Developer does not respond:**
    202 
    203    - Wait for 1 full business day for the response. If there is still no response, find and ping their manager (can be in private) from `people.mozilla.org <https://people.mozilla.org>`_.
    204 
    205      - If there is a response from the manager/developer, you can proceed with one of the other options. If not, request a backout.
    206 
    207 #. **Developer does not want to close the bug and needs time to investigate:**
    208 
    209    - Ask them to provide a comment in the bug stating this. Add the ``backlog-deferred`` keyword to prevent the BugBot from triggering on this bug again in the future.
    210 
    211 4 FAQ
    212 -----
    213 
    214 What is Perfherder?
    215 ~~~~~~~~~~~~~~~~~~~
    216 
    217 `Perfherder <https://treeherder.mozilla.org/perf.html#/graphs>`_ is a tool that takes data points from log files and graphs them over time.
    218 Primarily this is used for performance data from `Talos <https://wiki.mozilla.org/TestEngineering/Performance/Talos>`_, but also from `AWSY <https://firefox-source-docs.mozilla.org/testing/perfdocs/awsy.html>`_, build_metrics, `Autophone <https://wiki.mozilla.org/EngineeringProductivity/Autophone>`_ and platform_microbenchmarks.
    219 All these are test harnesses and you can find more about them `here <https://wiki.mozilla.org/TestEngineering/Performance/Sheriffing/Alerts>`_.
    220 
    221 The code for Perfherder can be found inside Treeherder `on GitHub <https://github.com/mozilla/treeherder/>`_.
    222 
    223 How can I view details on a graph?
    224 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    225 
    226 When viewing Perfherder Graph details, in many cases it is obvious where the regression is. If you mouse over the data points (not click on them) you can see some raw data values.
    227 
While looking for the specific changeset that caused the regression, you have to determine where the values changed. By moving the mouse over the values you can see the historical high/low values and work out the normal 'range'. When the values change, it should be obvious that the high/low values have a different 'range'.
    229 
If this is hard to see, it helps to zoom in to reduce the 'y' axis. Zooming into the 'x' axis for a smaller range of revisions yields fewer data points, and an easier way to see the regression.
    231 
Once you find the regression point, you can click on the data point and it will lock the information as a popup. Then you can click on the revision to investigate the raw changes that were part of that push.
    233 
    234 .. image:: ./Ph_Details.png
    235   :alt: Ph_Details
    236   :align: center
    237 
    238 Note, here you can get the date, revision, and value. These are all useful data points to be aware of while viewing graphs.
    239 
    240 Keep in mind, graph server doesn't show if there is missing data or a range of changesets.
    241 
    242 
    243 How can I zoom on a perfherder graph?
    244 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    245 
Perfherder graphs have the ability to adjust the date range from a drop-down box. We default to 14 days, but it can be changed to the last 1/2/7/14/30/90/365 days from the UI drop-down.
    247 
    248 It is usually a good idea to zoom out to a 30 day view on integration branches. This allows us to see recent history as well as what the longer term trend is.
    249 
    250 There are two parts in the Perfherder graph, the top box with the trendline and the bottom viewing area with the raw data points. If you select an area in the trendline box, it will zoom to that. This is useful for adjusting the Y-axis.
    251 
    252 Here is an example of zooming in on an area:
    253 
    254 .. image:: ./Ph_Zooming.png
    255   :alt: Ph_Zooming
    256   :align: center
    257 
    258 How can I add more test series to a graph?
    259 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    260 
    261 One feature of Perfherder graphs is the ability to add up to 7 sets of data points at once and compare them on the same graph. In fact when clicking on a graph for an alert, we do this automatically when we add multiple branches at once.
    262 
While looking at a graph, it is a good idea to look at that test/platform across multiple branches to see where the regression originally started and whether different branches are affected. There are 3 primary needs for adding data:
    264 
    265 - investigating branches
    266 - investigating platforms
    267 - comparing pgo/non pgo/e10s for the same test
    268 
For investigating branches, click the branch name in the UI and it will pop up the "Add more test data" dialog pre-populated with the other branches that have data for this exact platform/test. All you have to do is hit add.
    270 
    271 .. image:: ./Ph_Addbranch.png
    272   :alt: Ph_Addbranch
    273   :align: center
    274 
For investigating platforms, click the platform name in the UI and it will pop up the "Add more test data" dialog pre-populated with the other platforms that have data for this exact test. All you have to do is hit add.
    276 
    277 .. image:: ./Ph_Addplatform.png
    278   :alt: Ph_Addplatform
    279   :align: center
    280 
To do this, find the "+ Add more test data" link on the left-hand side, where the data series are located:
    282 
    283 .. image:: ./Ph_Addmoredata.png
    284   :alt: Ph_Addmoredata
    285   :align: center
    286 
How can a test series be muted/hidden?
    288 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    289 
    290 A test series from a perfherder graph can be muted/hidden by toggling on the checkbox on the lower right of the data series from the left side panel.
    291 
    292 .. image:: ./Ph_Muting.png
  :alt: Ph_Muting
    294   :align: center
    295 
    296 
    297 What makes branches different from one another?
    298 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    299 
We have a variety of branches at Mozilla; here are the main ones that we see alerts on:
    301 
    302 - Mozilla-Inbound (PGO, Non-PGO)
    303 - Autoland (PGO, Non-PGO)
    304 - Mozilla-Beta (all PGO)
    305 
    306 Linux and Windows builds have `PGO <#what-is-pgo>`_, OSX does not.
    307 
When investigating alerts, always look for the Non-PGO branch first. Usually you can expect to find changes on Mozilla-Inbound (about 50%) and Autoland (about 50%).
    309 
The volume on the branches is something to be aware of: we have higher volume on Mozilla-Inbound and Autoland, which means that alerts will be generated faster and it will be easier to track down the offending revision.
    311 
    312 A final note, Mozilla-Beta is a branch where little development takes place. The volume is really low and alerts come 5 days (or more) later. It is important to address Mozilla-Beta alerts ASAP because that is what we are shipping to customers.
    313 
    314 What is coalescing?
    315 ~~~~~~~~~~~~~~~~~~~
    316 
Coalescing is a term we use for when we schedule jobs to run on a given machine. When the load is high, these jobs are placed in a queue, and the longer the queue grows, the more jobs we skip over. This allows us to get results on more recent changesets faster.
    318 
This affects Talos numbers because we see regressions that show up across more than one pushed changeset. We have to manually fill in the coalesced jobs (sometimes including builds) to ensure we blame the right changeset for the regression.
    320 
    321 Some things to be aware of:
    322 
- missing test jobs - this could be as easy as waiting for jobs to finish, or scheduling the missing job assuming it was coalesced; otherwise, it could be a missing build
- missing builds - we would have to generate builds, which automatically schedules test jobs; sometimes these test jobs are coalesced and not run
- results might not be possible due to build failures or test failures
- `PGO builds <#what-is-pgo>`_ are not coalesced, they just run much less frequently; most likely a PGO build isn't the root cause
    327 
    328 Here is a view on treeherder of missing data (usually coalescing):
    329 
    330 .. image:: ./Coalescing_markedup.png
    331   :alt: Coalescing_markedup
    332   :align: center
    333 
    334 Note the two pushes that have no data (circled in red). If the regression happened around here, we might want to backfill those two jobs so we can ensure we are looking at the push which caused the regression instead of >1 push.
    335 
    336 What is an uplift?
    337 ~~~~~~~~~~~~~~~~~~
    338 
Every `6 weeks <https://whattrainisitnow.com/calendar/>`_ we release a new version of Firefox. When we do that, the code which developers check into the nightly branch gets uplifted (think of this as a large `merge <#what-is-a-merge>`_) to the Beta branch. Now all the code, features, and Talos regressions are on Beta.
    340 
    341 This affects the Performance Sheriffs because we will get a big pile of alerts for Mozilla-Beta. These need to be addressed rapidly. Luckily almost all the regressions seen on Mozilla-Beta will already have been tracked on Mozilla-Inbound or Autoland.
    342 
    345 
    346 What is a merge?
    347 ~~~~~~~~~~~~~~~~
    348 
    349 Many times each day we merge code from the integration branches into the main branch and back. This is a common process in large projects. At Mozilla, this means that the majority of the code for Firefox is checked into Mozilla-Inbound and Autoland, then it is merged into Mozilla-Central (also referred to as Firefox) and then once merged, it gets merged back into the other branches. If you want to read more about this merge procedure, here are `the details <https://wiki.mozilla.org/Sheriffing/How_To/Merges>`_.
    350 
    351 .. image:: ./Merge.png
    352   :alt: Merge
    353   :align: center
    354 
Note that the topmost revision has the commit message "merge m-c to m-i". This is pretty standard, and you can see that there are a series of `changesets <https://hg-edge.mozilla.org/integration/mozilla-inbound/pushloghtml?changeset=126a1ec5c7c5>`_, not just a few related patches.
    356 
    357 How this affects alerts is that when a regression lands on Mozilla-Inbound, it will be merged into Firefox, then Autoland. Most likely this means that you will see duplicate alerts on the other integration branch.
    358 
    359 - note: we do not generate alerts for the Firefox (Mozilla-Central) branch.
    360 
    361 What is a backout?
    362 ~~~~~~~~~~~~~~~~~~
    363 
Many times we back out or hotfix code because it is causing a build failure or unittest failure. The `Sheriff team <https://wiki.mozilla.org/Sheriffing/Sheriff_Duty>`_ handles this process in general, and backouts/hotfixes are usually done within 3 hours of the original fix (i.e. we won't have `12 future changesets <#why-do-we-need-12-future-data-points>`_). As you can imagine, we could get an alert 6 hours later, look at the graph, and see that there is no regression; instead there is a temporary spike for a few data points.
    365 
    366 While looking on TreeHerder for a backout, they all mention a backout in the commit message:
    367 
    368 .. image:: ./Backout_tree.png
    369   :alt: Backout_tree
    370   :align: center
    371 
- note: the above image mentions the bug that was backed out; sometimes it is the revision.
    373 
    374 Backouts which affect `Perfherder alerts <https://wiki.mozilla.org/TestEngineering/Performance/Sheriffing/Alerts>`_ always generate a set of improvements and regressions. These are usually easy to spot on the graph server and we just need to annotate the set of alerts for the given revision to be a 'backout' with the bug to track what took place.
    375 
    376 Here is a view on graph server of what appears to be a backout (it could be a fix that landed quickly also):
    377 
    378 .. image:: ./Backout_graph.png
    379   :alt: Backout_graph
    380   :align: center
    381 
    382 What is PGO?
    383 ~~~~~~~~~~~~
    384 
PGO is `Profile Guided Optimization <https://wiki.mozilla.org/TestEngineering/Performance/Sheriffing/Alerts>`_: we do a build, run it to collect metrics, and optimize based on the output of those metrics. We only release PGO builds, and for the integration branches we do these periodically (every 6 hours) or as needed. For Mozilla-Central we follow the same pattern. As the builds take considerably longer (2+ times as long), we don't do this for every commit on our integration branches.
    386 
How does this affect alerts? We care most about PGO alerts: that is what we ship! Most of the time an alert will be generated for a Non-PGO build, and then a few hours or a day later we will see alerts for the PGO build.
    388 
Pay close attention to the branch the alerts are on: most likely you will see it on the Non-PGO branch first (i.e. Mozilla-Inbound-Non-PGO), then roughly a day later you will see a similar alert show up on the PGO branch (i.e. Mozilla-Inbound).
    390 
    391 Caveats:
    392 
    393 - OSX does not do PGO builds, so we do not have -Non-PGO branches for those platforms. (i.e. we only have Mozilla-Inbound)
    394 - PGO alerts will probably have different regression percentages, but the overall list of platforms/tests for a given revision will be almost identical
    395 
    396 What alerts are displayed in Alert Manager?
    397 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    398 
    399 Perfherder `alerts <https://treeherder.mozilla.org/perf.html#/alerts>`_ defaults to `multiple types of alerts <https://wiki.mozilla.org/TestEngineering/Performance/Sheriffing/Alerts>`_ that are untriaged. It is a goal to keep these lists empty! You can view alerts that are improvements or in any other state (i.e. investigating, fixed, etc.) by using the drop down at the top of the page.
    400 
    401 Do we care about all alerts/tests?
    402 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    403 
Yes, we do. Some tests are more commonly invalid, mostly due to noise in the tests. We also adjust the threshold per test; the default is 2%, but for Dromaeo it is 5%. If a test is too noisy, we consider removing it entirely.
    405 
    406 Here are some platforms/tests which are exceptions about what we run:
    407 
    408 - Linux 64bit - the only platform which we run dromaeo_dom
    409 - Linux 32/64bit - the only platform in which no `platform_microbench <https://wiki.mozilla.org/TestEngineering/Performance/Sheriffing/Alerts#platform_microbench>`_ test runs, due to high noise levels
    410 - Windows 7 - the only platform that supports xperf (toolchain is only installed there)
    411 - Windows 7/10 - heavy profiles don't run here, because they take too long while cloning the big profiles; these are tp6 tests that use heavy user profiles
    412 
    413 Lastly, we should prioritize alerts on the Mozilla-Beta branch since those are affecting more people.
    414 
    415 What does a regression look like on the graph?
    416 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    417 
On almost all of our tests, we are measuring time. This means that the lower the score, the better. Whenever the graph increases in value, that is a regression.
    419 
    420 Here is a view of a regression:
    421 
    422 .. image:: ./Regression.png
    423   :alt: Regression
    424   :align: center
    425 
We have some tests which measure internal metrics. A few of those are reported such that a higher score is better. This is confusing, but we refer to these as reverse tests. The reverse tests are:
    427 
    428 - canvasmark
    429 - dromaeo_css
    430 - dromaeo_dom
    431 - rasterflood_gradient
    432 - speedometer
    433 - tcanvasmark
    434 - v8 version 7
    435 
    436 Here is a view of a reverse regression:
    437 
    438 .. image:: ./Reverse_regression.png
    439   :alt: Reverse_regression
    440   :align: center
    441 
    442 Why does Alert Manager print -xx% ?
    443 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    444 
The alert will either be a regression or an improvement; for the alerts we show by default, it is regressions only. It is important to know the severity of an alert: for example, a 3% regression is important to understand, but a 30% regression probably needs to be fixed ASAP. This is annotated as XX% in the UI. There is no + or - to indicate improvement or regression; this is an absolute number. Use the bar graph to the side to determine which type of alert this is.
    446 
    447 NOTE: for the reverse tests we take that into account, so the bar graph will know to look in the correct direction.
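
Putting the last two sections together, here is a hedged sketch of deciding direction and magnitude while taking the reverse (higher-is-better) tests into account; the set and helper below are illustrative only, not Perfherder's actual code:

```python
# Illustrative sketch only: classify a change as a regression or an
# improvement, accounting for the "reverse" tests listed above.
REVERSE_TESTS = {
    "canvasmark", "dromaeo_css", "dromaeo_dom", "rasterflood_gradient",
    "speedometer", "tcanvasmark", "v8_version_7",
}

def classify(test_name: str, old_value: float, new_value: float):
    """Return ('regression' | 'improvement', magnitude as an absolute %)."""
    magnitude_pct = abs(new_value - old_value) / old_value * 100.0
    higher_is_better = test_name in REVERSE_TESTS
    got_worse = (new_value < old_value) if higher_is_better else (new_value > old_value)
    return ("regression" if got_worse else "improvement"), magnitude_pct

print(classify("tp5o", 100.0, 125.0))        # time went up -> ('regression', 25.0)
print(classify("speedometer", 80.0, 100.0))  # score went up -> ('improvement', 25.0)
```
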
    448 
    449 What is noise?
    450 ~~~~~~~~~~~~~~
    451 
    452 Generally a test reports values that are in a range instead of a consistent value. The larger the range of 'normal' results, the more noise we have.
    453 
    454 Some tests will post results in a small range, and when we get a data point significantly outside the range, it is easy to identify.
    455 
The problem is that many tests have a large range of expected results (we call them unstable). It is hard to determine what a regression is when we might have a range of ±4% from the median and a 3% regression. It is obvious in the graph over time, but hard to tell until you have many future data points.
    457 
    458 .. image:: ./Noisy_graph.png
    459   :alt: Noisy_graph
    460   :align: center
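
A minimal sketch of this idea, assuming a made-up ±4% noise band around the median (Perfherder's real analysis is more sophisticated than this):

```python
import statistics

def is_outside_noise(history, new_value, tolerance_pct=4.0):
    """Flag new_value if it falls outside median +/- tolerance_pct percent."""
    median = statistics.median(history)
    band = median * tolerance_pct / 100.0
    return abs(new_value - median) > band

noisy_history = [100, 104, 97, 102, 99, 103, 98]  # median is 100
print(is_outside_noise(noisy_history, 103))  # inside the +/-4% band -> False
print(is_outside_noise(noisy_history, 110))  # well outside the band -> True
```
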
    461 
    462 What are low value tests?
    463 ~~~~~~~~~~~~~~~~~~~~~~~~~
    464 
In the context of noise, "low value" means that the regression magnitude is too small relative to the noise of the test, so it is hard to attribute the change to a particular bug/commit rather than a range of commits.
From a sheriffing perspective, these often end up as WONTFIX/INVALID, or they are tests that are considered unreliable, not relevant to the current Firefox revision, etc.
    467 
    468 .. image:: ./Noisy_low_value_graph.png
    469   :alt: Noisy_low_value_graph
    470   :align: center
    471 
    472 Why can we not trust a single data point?
    473 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    474 
    475 This is a problem we have dealt with for years with no perfect answer. Some reasons we do know are:
    476 
    477 - the test is noisy due to timing, diskIO, etc.
    478 - the specific machine might have slight differences
    479 - sometimes we have longer waits starting the browser or a pageload hang for a couple extra seconds
    480 
    481 The short answer is we don't know and have to work within the constraints we do know.
    482 
    483 Why do we need 12 future data points?
    484 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    485 
    486 We are re-evaluating our assertions here, but the more data points we have, the more confidence we have in the analysis of the raw data to point out a specific change.
    487 
This causes problems when we land code on Mozilla-Beta, where it can take 10 days to get 12 data points. We sometimes rerun tests; just retriggering a job will provide more data points to help us generate an alert if needed.
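
To see why more data points increase confidence, here is a rough sketch (not the analysis Perfherder actually uses) showing that, with similar means and spread, a two-sample t statistic grows as the after-change window fills up; all numbers are made up:

```python
import statistics

def t_statistic(before, after):
    """Welch-style t statistic; larger means a more confident detection."""
    m1, m2 = statistics.mean(before), statistics.mean(after)
    v1, v2 = statistics.variance(before), statistics.variance(after)
    return abs(m2 - m1) / ((v1 / len(before) + v2 / len(after)) ** 0.5)

before = [100, 102, 98, 101, 99, 103]
after_small = [104, 106, 102]                   # only 3 future data points
after_full = [104, 106, 102, 105, 103, 107,
              104, 106, 103, 105, 104, 106]     # 12 future data points

# The same shift is detected with more confidence given 12 points:
print(t_statistic(before, after_small) < t_statistic(before, after_full))  # True
```
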
    489 
    490 Can't we do smarter analysis to reduce noise?
    491 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    492 
Yes, we can. We have other projects, and a `master's thesis <https://wiki.mozilla.org/images/c/c0/Larres-thesis.pdf>`_ has been written on this subject. The reality is we will still need future data points to show a trend, and depending on the source of data we will need different algorithms to analyze it.
    494 
How can duplicate alerts be identified?
    496 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    497 
    498 One problem with `coalescing <#what-is-coalescing>`_ is that we sometimes generate an original alert on a range of changes, then when we fill in the data (backfilling/retriggering) we generate new alerts. This causes confusion while looking at the alerts.
    499 
Here are some scenarios in which duplication will be seen:

- backfilling data from `coalescing <#what-is-coalescing>`_: you will see a similar alert on the same branch/platform/test but a different revision

  - action: reassign the alerts to the original alert summary so all related alerts are in one place!

- we merge changesets between branches

  - action: find the original alert summary on the upstream branch and mark the specific alert as downstream to that alert summary

- `PGO <#what-is-pgo>`_ builds

  - action: reassign these to the non-PGO alert summary (if one exists), or downstream to the correct alert summary if this originally happened on another branch
    508 
    509 In Alert Manager it is good to acknowledge the alert and use the reassign or downstream actions. This helps us keep track of alerts across branches whenever we need to investigate in the future.
    510 
What are weekend spikes?
~~~~~~~~~~~~~~~~~~~~~~~~

On weekends (Saturday/Sunday) and many holidays, we find that the volume of pushes is much smaller. This results in far fewer tests being run. For many tests, especially the noisier ones, we find that the few data points we collect on a `weekend are much less noisy <https://elvis314.wordpress.com/2014/10/30/a-case-of-the-weekends/>`_ (either falling to the top or bottom of the noise range).

Here is an example view of data that behaves differently on weekends:

.. image:: ./Weekends_example.png
  :alt: Weekends_example
  :align: center

This affects the Talos Sheriff because on Monday, when our volume of pushes picks up, we get a larger range of values. Due to the way we calculate a regression, we see a shift in our expected range on Monday. Usually these alerts are generated Monday evening/Tuesday morning. They are typically small regressions (<3%) on the noisier tests.

What is a multi-modal test?
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Many tests are bi-modal or multi-modal. This means that they have consistent values, but 2 or 3 distinct sets of them. Instead of a bunch of scattered values between the low and the high, you will see two clusters of values: a lower one and a higher one.

Here is an example of a graph that has two sets of values (with random ones scattered in between):

.. image:: ./Modal_example.png
  :alt: Modal_example
  :align: center

This affects the alerts and results because sometimes we get a series of results that are less modal than the original. Of course this generates an alert, and a day later you will probably see that we are back to the original x-modal pattern that we see historically. Some of this is affected by the weekends.

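To make the idea concrete, here is a rough sketch of one way to spot a bi-modal series: sort the values and look for a single dominant gap between neighbours. The function name and the gap heuristic are invented for illustration and are not anything Perfherder actually does.

```python
def split_modes(values, gap_ratio=3.0):
    """Guess whether a series is bi-modal: sort the values, find the
    largest gap between neighbours, and call the series bi-modal when
    that gap dwarfs the next-largest gap."""
    s = sorted(values)
    gaps = [s[i + 1] - s[i] for i in range(len(s) - 1)]
    biggest = max(gaps)
    idx = gaps.index(biggest)
    runner_up = sorted(gaps)[-2] if len(gaps) > 1 else 0.0
    if biggest > 0 and (runner_up == 0 or biggest / runner_up >= gap_ratio):
        return s[:idx + 1], s[idx + 1:]   # lower mode, higher mode
    return s, []                          # looks uni-modal
```

A series like ``[10, 10.1, 9.9, 20, 20.2, 19.8]`` splits cleanly into a lower and a higher mode, while an evenly spread series stays in a single group, which is exactly the distinction that matters when an alert fires on a mode shift.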
What is random noise?
~~~~~~~~~~~~~~~~~~~~~

Random noise refers to data points that do not fit the graph trend of the test. They occur because of various uncontrollable factors (this is assumed) or because the test is unstable.

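One common, robust way to flag such points is to measure each point's distance from the median in units of the median absolute deviation (MAD). The sketch below is illustrative only; the function name and the 3.5 cutoff are general conventions, not part of any sheriffing tool:

```python
from statistics import median

def filter_noise(values, cutoff=3.5):
    """Split a series into (typical, noise) by flagging points whose
    distance from the median exceeds `cutoff` times the median absolute
    deviation (MAD), a robust spread estimate."""
    med = median(values)
    mad = median(abs(v - med) for v in values) or 1e-9
    typical, noise = [], []
    for v in values:
        (noise if abs(v - med) / mad > cutoff else typical).append(v)
    return typical, noise
```

The median and MAD are used instead of the mean and standard deviation because a single wild outlier would otherwise inflate the very spread estimate used to detect it.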
How do I identify the current Firefox release meta-bug?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To easily track all open regressions, a meta-bug that depends on the open regression bugs is created for every Firefox release.

.. image:: ./Advanced_search.png
  :alt: Advanced_search
  :align: center

To find all the Firefox release meta-bugs you just have to search in Advanced search for bugs with:

.. image:: ./Firefox_70_meta.png
  :alt: Firefox_70_meta
  :align: center

**Product:** Testing

**Component:** Performance

**Summary:** Contains all of the strings: [meta] Firefox, Perfherder Regression Tracking Bug

You can leave the rest of the fields as they are.

.. image:: ./Advanced_search_filter.png
  :alt: Advanced_search_filter
  :align: center

**Result:**

.. image:: ./Firefox_metabugs.png
  :alt: Firefox_metabugs
  :align: center

How do I search for an already open regression?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Sometimes Treeherder includes alerts related to a test in the same summary, and sometimes it doesn't. To make sure that the regression you found doesn't already have a bug open, search the current Firefox release meta-bug for open regressions whose summary is similar to your alert's. Usually, if the test name matches, it might be what you're looking for; however, a matching test name alone doesn't guarantee it, so you need to check it thoroughly.

These situations occur because a regression appears first on one repo (e.g. autoland) and it takes a few days until the culprit commit gets merged to the other repos (inbound, beta, central).

How do I follow up on open regressions filed by me?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can follow up on all the open regression bugs created by you by searching in `Advanced search <https://bugzilla.mozilla.org/query.cgi?format=advanced>`_ for bugs with:

**Summary:** contains all of the strings > regression on push

**Status:** NEW, ASSIGNED, REOPENED

.. image:: ./Advanced_search_for_perf_regressions.png
  :alt: Advanced_search_for_perf_regressions
  :align: center

**Keywords:** perf, perf-alert, regression

**Type:** defect

.. image:: ./Advanced_search_for_perf_regressions_type.png
  :alt: Advanced_search_for_perf_regressions_type
  :align: center

**Search by People:** The reporter is > [your email]

.. image:: ./Advanced_search_for_perf_regressions_by_people.png
  :alt: Advanced_search_for_perf_regressions_by_people
  :align: center

And you will get the list of all open regressions reported by you:

.. image:: ./Advanced_search_results.png
  :alt: Advanced_search_results
  :align: center

How can I do a bisection?
~~~~~~~~~~~~~~~~~~~~~~~~~

If you're investigating a regression/improvement but for some reason it happened in a revision interval where the jobs aren't able to run or the revision contains multiple commits (this happens more often on mozilla-beta), you need to do a bisection in order to find the exact culprit. We usually adopt the binary search method. Say you have the revisions:

- abcde1 - first regressed/improved value
- abcde2
- abcde3
- abcde4
- abcde5 - last good value

Bisection steps:

1. checkout the repository you're investigating:
   1. ``hg checkout autoland`` (if you don't have it locally you need to do ``hg pull autoland && hg update autoland``)
2. ``hg checkout abcde5``
   1. ``./mach try fuzzy --full -q=^investigated-test-signature -m=baseline_abcde5_alert_######`` (you will know that the baseline contains the reference value)
3. ``hg checkout abcde3``
   1. let's assume that build abcde4 broke the tests; you need to back it out in order to get the values of your investigated test on try:
       1. ``hg backout -r abcde4``
   2. ``./mach try fuzzy --full -q=^investigated-test-signature -m=abcde4_alert_######`` (the baseline keyword is included just in the reference push message)
   3. use `PerfCompare <https://perf.compare/>`_ to compare the two pushes.
4. If the try values between abcde5 and abcde3 don't include the delta, then you'll know that abcde1 or abcde2 is the culprit, so repeat the step you did for abcde3 to find out.
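The binary-search logic behind these steps can be sketched as follows. This is only an illustration: ``is_regressed`` stands in for pushing a revision to try and comparing its results against the baseline, and all names here are invented, not part of any sheriffing tool.

```python
def bisect_culprit(revisions, is_regressed):
    """Find the first regressed revision in a list ordered from the
    last known-good revision to the first known-regressed one.
    `is_regressed(rev)` stands in for a try push plus a perf comparison
    against the baseline."""
    lo, hi = 0, len(revisions) - 1   # lo: known good, hi: known regressed
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_regressed(revisions[mid]):
            hi = mid                 # culprit is at mid or earlier
        else:
            lo = mid                 # culprit landed after mid
    return revisions[hi]

# Ordered from last good (abcde5) to first regressed (abcde1), as above.
ordered = ["abcde5", "abcde4", "abcde3", "abcde2", "abcde1"]
culprit = bisect_culprit(ordered,
                         lambda rev: rev in {"abcde3", "abcde2", "abcde1"})
```

With five revisions this needs at most two try pushes instead of testing every intermediate revision, which is why the binary search method is preferred when the suspect range is large.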