Test Verification
=================

When a changeset adds a new test, or modifies an existing test, the test
verification (TV) test suite performs additional testing to help find
intermittent failures in the modified test as quickly as possible. TV
uses other test harnesses to run the test multiple times, sometimes in a
variety of configurations. For instance, when a mochitest is
modified, TV runs the mochitest harness in a verify mode on the modified
mochitest. That test will be run 10 times, then the same test will be
run another 5 times, each time in a new browser instance. Once this is
done, the whole sequence will be repeated in the test chaos mode
(setting MOZ_CHAOSMODE). If any test run fails, testing ends and the
test suite reports the failure in the normal way.

Initially, there are some limitations:

- TV only applies to mochitests (all flavors and subsuites), reftests
  (including crashtests and js-reftests) and xpcshell tests; a separate
  job, TVw, handles web-platform tests.
- Only some of the test chaos mode features are enabled.

.. _Running_test_verification_with_mach:

Running test verification with mach
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Supported test harnesses accept the --verify option:

::

   mach web-platform-test <test> --verify

   mach mochitest <test> --verify

   mach reftest <test> --verify

   mach xpcshell-test <test> --verify

Multiple tests, even manifests or directories, can be verified at once,
but this is generally not recommended. Verification is easier to
understand one test at a time!

.. _Verification_steps:

Verification steps
~~~~~~~~~~~~~~~~~~

Each test harness implements --verify behavior in one or more "steps".
Each step uses a different strategy for finding intermittent failures.
For instance, the first step in mochitest verification is running the
test with --repeat=20; the second step is running the test just once in
a separate browser session, closing the browser, and repeating that
sequence several times. If a failure is found in one step, later steps
are skipped.

.. _Verification_summary:

Verification summary
~~~~~~~~~~~~~~~~~~~~

Test verification can produce a lot of output, much of it repetitive.
To help communicate what verification has found, each test harness
prints a summary for each file which has been verified. The summary
gives a pass or fail status for each verification step, plus an overall
verification status, such as:

::

   :::
   ::: Test verification summary for:
   :::
   ::: dom/base/test/test_data_uri.html
   :::
   ::: 1. Run each test 20 times in one browser. : FAIL
   ::: 2. Run each test 10 times in a new browser each time. : not run / incomplete
   :::
   ::: Test verification FAILED!
   :::

.. _Long-running_tests_and_verification_duration:

Long-running tests and verification duration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Test verification is intended to be quick: determine whether a test
fails intermittently as soon as possible, so that a pass or fail result
is communicated quickly and test resources are not wasted.

Tests have a wide range of run-times, from milliseconds up to many
minutes. Of course, a test that takes 5 minutes to run may take a very
long time to verify. There may also be cases where many tests are being
verified at one time. For instance, in automation a changeset might make
a trivial change to hundreds of tests at once, or a merge might result
in a similar situation. Even if each test is reasonably quick to verify,
the time required to verify all these files may be considerable.
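
To get a feel for how quickly verification time adds up, the following
rough sketch estimates the time needed to verify a test that never fails.
The helper ``estimate_verify_seconds`` is hypothetical and is not part of
any harness. The run counts are an assumption taken from the example
summary above, and actual counts may differ between harnesses. Browser
start-up and shutdown time is ignored.

```python
# Hypothetical helper for illustration only; not part of mach or any harness.
def estimate_verify_seconds(test_seconds, runs_per_step=(20, 10), chaos_mode=True):
    """Rough time to verify one test, assuming every run passes.

    test_seconds:  time for a single run of the test.
    runs_per_step: number of runs in each verification step (assumed values,
                   taken from the example summary).
    chaos_mode:    whether the whole sequence is repeated in chaos mode.
    """
    passes = 2 if chaos_mode else 1
    return test_seconds * sum(runs_per_step) * passes


# A single 5-minute test: 300 s * 30 runs * 2 passes.
print(estimate_verify_seconds(300))      # 18000 seconds, i.e. 5 hours

# Two hundred 2-second tests modified in one changeset.
print(200 * estimate_verify_seconds(2))  # 24000 seconds, i.e. over 6 hours
```

Under these assumptions, both a single slow test and a large batch of
fast tests can take hours to verify fully.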

Each test harness which supports the --verify option also supports the
--max-verify-time option:

::

   mach mochitest <test> --verify --max-verify-time=7200

The default max-verify-time is 3600 seconds (1 hour). If a verification
step exceeds the max-verify-time, later steps are not run.

In automation, the TV task uses --max-verify-time to try to limit
verification to about 1 hour, regardless of how many tests are to be
verified or how long each one runs. If verification is incomplete, the
task does not fail. It reports success and is green in Treeherder; in
addition, the Treeherder "Job Status" pane reports
"Verification too long! Not all tests were verified."

.. _Test_Verification_in_Automation:

Test Verification in Automation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In automation, the TV and TVw tasks run whenever a changeset contains
modifications to a .js, .html, .xhtml or .xul file. The TV/TVw task
itself checks test manifests to determine if any of the modified files
are test files; if any of the files are tests, TV/TVw will verify those
tests.

Treeherder status is:

- **Green**: All modified tests in supported suites were verified with
  no test failures, or test verification did not have enough time to
  verify one or more tests.
- **Orange**: One or more tests modified by this changeset failed
  verification. **Backout should be considered (but is not
  mandatory)**, to avoid future intermittent failures in these tests.

There are some limitations:

- Pre-existing conditions: A test may already be failing, then be
  updated on a push in a net-positive way, but continue failing
  intermittently. If the author is aware of the remaining issues, it is
  probably best not to back out.
- Failures due to test-verify conditions: In some cases, a test may
  fail because test-verify runs a test with --repeat, or because
  test-verify uses chaos mode, but those failures might not arise in
  "normal" runs of the test. Ideally, all tests should be able to run
  successfully in test-verify, but there may be exceptions.

.. _Test_Verification_on_try:

Test Verification on try
~~~~~~~~~~~~~~~~~~~~~~~~

To use test verification on try, use something like:

::

   mach try -b do -p linux64 -u test-verify-e10s --artifact

Tests modified in the push will be verified.

For TVw, use something like:

::

   mach try -b do -p linux64 -u test-verify-wpt-e10s --artifact

Web-platform tests modified in the push will be verified.

You can also run test verification on a test without modifying the test
using something like:

::

   mach try fuzzy <path-to-test>


.. _Skipping_Verification:

Skipping Verification
~~~~~~~~~~~~~~~~~~~~~

In the great majority of cases, test-verify failures indicate test
weaknesses that should be addressed.

In unusual cases, where test-verify failures do not provide value,
test-verify may be "skipped" on a test: in subsequent pushes where the
test is modified, the test-verify job will not try to verify the skipped
test.

For mochitests, xpcshell tests, and other tests using the .ini manifest
format, use something like:

::

   [sometest.html]
   skip-if = verify

For reftests (including crashtests and js-reftests), use something like:

::

   skip-if(verify) == sometest.html ...

At this time, there is no corresponding support for skipping
web-platform tests in verify mode.

.. _FAQ:

FAQ
~~~

**Why is there a "spike" of test-verify failures for my test?
Why did it stop?**

Bug reports for test-verify failures usually show a "spike" of failures
on one day. That's because TV only runs against a particular test when
that test is modified. A particular push modifies the test, TV runs and
the test fails, and then TV doesn't run again for that test on
subsequent pushes (until/unless the test files are modified again). Of
course, when that push is merged to other trees, TV is triggered again,
so the same failure is usually noted on multiple trees in succession:
say, one revision on mozilla-inbound, then again when that revision is
merged to mozilla-central and again on autoland.

**When TV fails, is it worth retriggering?**

No, usually not. TV runs specific tests over and over again, sometimes
50 times in a single run. Retriggering on Treeherder is generally
unnecessary and will very likely produce the same pass/fail result as
the original run. In this sense, TV failures are almost always
"perma-fail".

.. _Contact_information:

Contact information
~~~~~~~~~~~~~~~~~~~

Test verification is maintained by :**gbrown** and :**jmaher**. Bugs
should be filed in **Testing :: General**. You may want to reference
`bug 1357513 <https://bugzilla.mozilla.org/show_bug.cgi?id=1357513>`__.