Avoiding intermittent tests
===========================

Intermittent oranges are test failures that happen intermittently, in a
seemingly random way. Many such failures can be avoided by following
good test-writing principles. This page explains some of those
principles, both for people who contribute tests and for those who
review them for inclusion in mozilla-central.

Other projects call these flaky tests.

What follows is a list of patterns known to cause intermittent
failures, with a description of why each one causes test failures and
how to avoid it.

After writing a test, make sure to run it locally, preferably in a
debug build. A test may depend on the state left behind by another
test, or on some future test or browser operation to clean up what it
leaves over. This is a common problem in browser-chrome; here are
things to try:

- debug mode, run the test standalone: ``./mach <test> <path>/<to>/<test>/test.html|js``
- debug mode, run the test's directory: ``./mach <test> <path>/<to>/<test>``
- debug mode, run a larger enclosing directory: ``./mach <test> <path>/<to>``


Accessing DOM elements too soon
-------------------------------

``data:`` URLs load asynchronously. You should wait for the load event
of an `<iframe> <https://developer.mozilla.org/docs/Web/HTML/Element/iframe>`__ that is
loading a ``data:`` URL before trying to access the DOM of the
subdocument.

For example, the following code pattern is bad:

.. code:: html

   <html>
   <body>
     <iframe id="x" src="data:text/html,<div id='y'>"></iframe>
     <script>
       var elem = document.getElementById("x").
                    contentDocument.
                    getElementById("y"); // might fail
       // ...
     </script>
   </body>
   </html>

Instead, write the code like this (note that the iframe is passed to
the handler explicitly, since ``this`` inside a plain function call
would be the window, not the element):

.. code:: html

   <html>
   <body>
     <script>
       function onLoad(frame) {
         var elem = frame.contentDocument.
                      getElementById("y");
         // ...
       }
     </script>
     <iframe src="data:text/html,<div id='y'>"
             onload="onLoad(this)"></iframe>
   </body>
   </html>


Using script functions before they're defined
---------------------------------------------

This is mostly relevant to event handlers. Let's say that you have an
`<iframe>` and you want to do something after it has loaded, so you
might write code like this:

.. code:: html

   <iframe src="..." onload="onLoad()"></iframe>
   <script>
     function onLoad() { // oops, too late!
       // ...
     }
   </script>

This is bad, because the `<iframe>`'s load may complete before the
script is parsed, and therefore before the ``onLoad`` function comes
into existence. This will cause you to miss the `<iframe>` load, which
may, for example, cause your test to time out. The best way to fix this
is to move the function definition before the place where it is used in
the DOM, like this:

.. code:: html

   <script>
     function onLoad() {
       // ...
     }
   </script>
   <iframe src="..." onload="onLoad()"></iframe>


Relying on the order of asynchronous operations
-----------------------------------------------

In general, when you have two asynchronous operations, you cannot
assume any order between them. For example, let's say you have two
`<iframe>`\ s like this:

.. code:: html

   <script>
     var f1Doc;
     function f1Loaded() {
       f1Doc = document.getElementById("f1").contentDocument;
     }
     function f2Loaded() {
       var elem = f1Doc.getElementById("foo"); // oops, f1Doc might not be set yet!
     }
   </script>
   <iframe id="f1" src="..." onload="f1Loaded()"></iframe>
   <iframe id="f2" src="..." onload="f2Loaded()"></iframe>

This code implicitly assumes that ``f1`` will be loaded before ``f2``,
but that assumption is incorrect.
A simple fix is to
detect when all of the asynchronous operations have finished, and only
then do what you need to do, like this:

.. code:: html

   <script>
     var f1Doc, loadCounter = 0;
     function process() {
       var elem = f1Doc.getElementById("foo");
     }
     function f1Loaded() {
       f1Doc = document.getElementById("f1").contentDocument;
       if (++loadCounter == 2) process();
     }
     function f2Loaded() {
       if (++loadCounter == 2) process();
     }
   </script>
   <iframe id="f1" src="..." onload="f1Loaded()"></iframe>
   <iframe id="f2" src="..." onload="f2Loaded()"></iframe>


Using magical timeouts to cause delays
--------------------------------------

When an asynchronous operation is in flight, it may be tempting to use
a timeout to wait a while, hoping that the operation has finished by
then and that it is safe to continue. Such code uses patterns like
this:

.. code:: js

   setTimeout(handler, 500);

This should raise an alarm in your head. As soon as you see such code,
you should ask yourself: "Why 500, and not 100? Why not 1000? Why not
328, for that matter?" You can never answer this question, so you
should always avoid code like this!

What's wrong with this code is that it assumes 500ms is enough for
whatever operation you're waiting for. That may stop being true
depending on the platform, whether a debug or optimized build of
Firefox is running the code, the machine load, whether the test is run
on a VM, and so on. Sooner or later, the test will start failing.

Instead, wait for the operation to complete explicitly. Most of the
time this can be done by listening for an event. Occasionally there is
no good event to listen for, in which case you can add one to the code
responsible for completing the task at hand.
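
The explicit wait described above can be sketched as a small
promise-based helper. The name ``waitForEvent`` is ours for
illustration (modern test harnesses ship similar utilities); it
resolves exactly when the event fires, however long that takes:

.. code:: js

   // Resolve a promise the first time `type` fires on `target`.
   // (Illustrative helper, not a harness API.)
   function waitForEvent(target, type) {
     return new Promise(resolve => {
       target.addEventListener(type, resolve, { once: true });
     });
   }

   // Usage sketch: instead of setTimeout(handler, 500), wait for the
   // event that marks completion, then continue.
   async function afterLoad(frame) {
     await waitForEvent(frame, "load");
     // ... it is now safe to touch frame.contentDocument ...
   }

Because the promise resolves precisely when the event fires, the test
neither wastes time nor races the operation it is waiting on.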
Ideally, magical timeouts are never necessary, but there are a couple
of cases, in particular when writing web-platform-tests, where you
might need them. In such cases, consider documenting why a timer was
used, so that it can be removed if it turns out to be no longer needed
in the future.


Using objects without accounting for the possibility of their death
-------------------------------------------------------------------

This is a very common pattern in our test suite which has been found
responsible for many intermittent failures:

.. code:: js

   function runLater(func) {
     var timer = Cc["@mozilla.org/timer;1"].createInstance(Ci.nsITimer);
     timer.initWithCallback(func, 0, Ci.nsITimer.TYPE_ONE_SHOT);
   }

The problem with this code is that it assumes the ``timer`` object will
live long enough for the timer to fire. That may not be the case if a
garbage collection is performed first: the ``timer`` object will be
collected and go away before the timer has had a chance to fire. A
simple way to fix this is to make the ``timer`` object global, so that
an outstanding reference to it still exists when the garbage collector
looks at it:

.. code:: js

   var timer;
   function runLater(func) {
     timer = Cc["@mozilla.org/timer;1"].createInstance(Ci.nsITimer);
     timer.initWithCallback(func, 0, Ci.nsITimer.TYPE_ONE_SHOT);
   }

A similar problem may happen with ``nsIWebProgressListener`` objects
passed to the ``nsIWebProgress.addProgressListener()`` method, because
the web progress object stores a weak reference to the
``nsIWebProgressListener`` object, which does not prevent it from being
garbage collected.
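
The weakly-held listener problem can be worked around the same way as
the timer: keep a strong reference for the lifetime of the test. A
minimal plain-JavaScript sketch (the helper names here are
hypothetical, not a Gecko API) parks such listeners in a global
container and releases them during cleanup:

.. code:: js

   // Anything in this set is strongly referenced, so the garbage
   // collector cannot collect it even if the observed object only
   // holds it weakly. (Hypothetical helpers, for illustration.)
   const gLiveListeners = new Set();

   function addStrongListener(listener) {
     gLiveListeners.add(listener);
     return listener;
   }

   function releaseListener(listener) {
     gLiveListeners.delete(listener);
   }

Releasing the listener in the test's cleanup phase keeps it alive for
exactly as long as it is needed, without leaking it into later tests.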
Tests which require focus
-------------------------

Some tests require the application window to be focused in order to
function properly.

For example, if you're writing a crashtest or reftest which tests an
element that is focused, you need to specify that in the manifest file,
like this:

::

   needs-focus load my-crashtest.html
   needs-focus == my-reftest.html my-reftest-ref.html

Also, if you're writing a mochitest which synthesizes keyboard events
using ``synthesizeKey()``, the window needs to be focused, otherwise
the test will fail intermittently on Linux. You can ensure this by
using ``SimpleTest.waitForFocus()`` and starting your test from inside
the callback for that function, as below:

.. code:: js

   SimpleTest.waitForFocus(function() {
     synthesizeKey("x", {});
     // ...
   });

Tests which require mouse interaction, open context menus, etc. may
also require focus. Note that ``waitForFocus`` implicitly waits for a
load event as well, so it's safe to call it for a window which has not
finished loading yet.


Tests which take too long
-------------------------

Sometimes what happens in a single unit test is simply too much, which
will cause the test to time out at random points during its execution
when the machine running it is under heavy load. This is a sign that
the test needs more time to execute, and it may happen only in debug
builds, which are slower in general. There are two ways to solve this
problem: split the test into multiple smaller tests (which can have
other advantages too, such as better readability), or ask the test
runner framework to give the test more time to finish correctly. The
latter can be done using the ``requestLongerTimeout`` function.
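
The ``requestLongerTimeout`` call is typically placed at the top of the
test file; its argument is an integer multiplier applied to the default
per-test timeout, so the sketch below (in a browser-chrome mochitest)
allows twice the usual time:

.. code:: js

   // Provided by the mochitest harness, not defined by the test itself.
   requestLongerTimeout(2); // allow 2x the default timeout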
Tests that do not clean up properly
-----------------------------------

Sometimes tests register event handlers for various events but don't
clean up after themselves correctly. Sometimes tests do things which
have persistent effects on the browser running the test suite, such as
opening a new window, adding a bookmark, or changing the value of a
preference.

In these situations, the problem is sometimes caught as soon as the
test is checked into the tree. But it's also possible for the thing
that was not cleaned up properly to have an intermittent effect on
future (and perhaps seemingly unrelated) tests. These types of
intermittent failures can be extremely hard to debug, and they are not
obvious at first, because most people only look at the test in which
the failure happens instead of the previous tests. How the failure
looks varies case by case, but one example is `bug
612625 <https://bugzilla.mozilla.org/show_bug.cgi?id=612625>`__.


Not waiting on the specific event that you need
-----------------------------------------------

Sometimes, instead of waiting for event A, tests wait on event B,
implicitly hoping that B occurring means that A has occurred too. `Bug
626168 <https://bugzilla.mozilla.org/show_bug.cgi?id=626168>`__ was an
example of this. The test really needed to wait for a paint in the
middle of its execution, but instead it waited for an event loop hit,
hoping that by the time the event loop was hit, a paint had also
occurred. While these types of assumptions may hold true when the test
is being developed, they're not guaranteed to be true every time the
test is run.
When writing a test, if you have to wait for an event, take note of why
you're waiting and what exactly you're waiting on, and then make sure
that you're really waiting on the correct event.


Tests that rely on external sites
---------------------------------

Even if an external site is not actually down, the variable performance
of the site and of external networks can add enough variation to a
test's duration to easily make it fail intermittently.

External sites should NOT be used for testing.


Tests that rely on Math.random() to create unique values
--------------------------------------------------------

Sometimes you need unique values in your test. Using ``Math.random()``
works most of the time, but this function doesn't actually guarantee
that its return values are unique, so your test may get repeated values
and therefore fail intermittently. If you need values that must be
unique for your test, you can use the following pattern instead of
calling ``Math.random()``:

.. code:: js

   var gUniqueCounter = 0;
   function generateUniqueValues() {
     return Date.now() + "-" + (++gUniqueCounter);
   }


Tests that depend on the current time
-------------------------------------

When writing a test which depends on the current time, pay extra
attention to how the test behaves depending on when it runs. For
example, how does your test handle the case where the daylight saving
time (DST) settings change while it's running?
If you're testing a
time concept relative to now (like today, yesterday, or tomorrow), does
your test handle the case where these concepts change their meaning in
the middle of the test (for example, if your test starts at 23:59:36 on
a given day and finishes at 00:01:13 the next)?


Tests that depend on time differences or comparisons
----------------------------------------------------

When computing time differences, the resolution of the operating
system's timers should be taken into account. For example, consecutive
calls to
`Date() <https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Date>`__
are not guaranteed to return different values. Also, when crossing
XPCOM, different time implementations can give surprising results: when
comparing a timestamp obtained through :ref:`PR_Now` with one obtained
through a JavaScript date, the later call can appear to lie in the past
relative to the earlier one! These differences are more pronounced on
Windows, where the skew can be up to 16ms. In general, timer
resolutions are guesses that are not guaranteed (also due to bogus
resolutions on virtual machines), so it's better to use larger brackets
when a comparison is really needed.


Tests that destroy the original tab
-----------------------------------

Tests that remove the original tab from the browser chrome test window
can cause intermittent oranges, or can, in and of themselves, be
intermittent oranges. Obviously, both of these outcomes are
undesirable. You should neither write tests that do this nor r+ tests
that do this. As a general rule, if you call ``addTab`` or other
tab-opening methods in your test cleanup code, you're probably doing
something you shouldn't be.
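
One way to stay out of this trap entirely in browser-chrome tests is to
let the harness own the tab lifetime. A sketch using
``BrowserTestUtils.withNewTab`` (the URL below is just a placeholder):
it opens a fresh tab, runs your task, and then closes only that tab,
leaving the original tab untouched.

.. code:: js

   add_task(async function test_in_new_tab() {
     await BrowserTestUtils.withNewTab("https://example.com/", async browser => {
       // Run assertions against `browser` here; the new tab is closed
       // automatically when this callback returns.
     });
   });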