XPCShell JSON Data Format Documentation
This document describes the JSON file formats created by fetch-xpcshell-data.js.
Overview
The script generates two types of JSON files for each date or try commit:
- Test timing data (per-test runs, statuses, and durations)
- Resource usage data (per-job CPU and memory usage)
Both formats use string tables and index-based lookups to minimize file size.
Test Timing Data Format
Top-Level Structure
{
"metadata": { ... },
"tables": { ... },
"taskInfo": { ... },
"testInfo": { ... },
"testRuns": [ ... ]
}
metadata
Contains information about the data collection:
{
"date": "2025-10-14", // Date of the data (for date-based queries)
"revision": "abc123...", // Try commit revision (for try-based queries)
"pushId": 12345, // Treeherder push ID (for try-based queries)
"startTime": 1760400000, // Unix timestamp (seconds) used as base for relative timestamps
"generatedAt": "2025-10-15T14:24:33.451Z", // ISO timestamp when file was created
"jobCount": 3481, // Number of jobs fetched
"processedJobCount": 3481 // Number of jobs successfully processed
}
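A small sketch of how a loader might distinguish the two query types, assuming (per the field descriptions above) that date-based files carry date while try-based files carry revision and pushId:

```javascript
// Sketch: classify a timing-data file by its metadata.
// Assumption: `revision` is only present for try-based queries.
function dataSource(metadata) {
  if (metadata.revision != null) {
    return { kind: "try", revision: metadata.revision, pushId: metadata.pushId };
  }
  return { kind: "date", date: metadata.date };
}
```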
tables
String tables for efficient storage. All strings are deduplicated and stored once, sorted by frequency (most frequently used first for better compression):
{
"jobNames": [ // Job names (e.g., "test-linux1804-64/opt-xpcshell")
"test-linux1804-64/opt-xpcshell",
"test-macosx1015-64/debug-xpcshell",
...
],
"testPaths": [ // Test file paths (e.g., "dom/indexedDB/test/unit")
"dom/indexedDB/test/unit",
"toolkit/components/extensions/test/xpcshell",
...
],
"testNames": [ // Test filenames (e.g., "test_foo.js")
"test_foo.js",
"test_bar.js",
...
],
"repositories": [ // Repository names
"mozilla-central",
"autoland",
"try",
...
],
"statuses": [ // Test run statuses
"PASS-PARALLEL",
"PASS-SEQUENTIAL",
"SKIP",
"FAIL-PARALLEL",
"TIMEOUT-SEQUENTIAL",
"CRASH",
"EXPECTED-FAIL",
...
],
"taskIds": [ // TaskCluster task IDs with retry (always includes .retryId)
"YJJe4a0CRIqbAmcCo8n63w.0", // Retry 0
"XPPf5b1DRJrcBndDp9o74x.1", // Retry 1
...
],
"messages": [ // Test messages (for SKIP and FAIL statuses)
"skip-if: os == 'linux'",
"disabled due to bug 123456",
"Expected 5, got 10", // Failure message
...
],
"crashSignatures": [ // Crash signatures (only for crashed tests)
"mozilla::dom::Something::Crash",
"EMPTY: no crashing thread identified",
...
],
"components": [ // Bugzilla components (Product :: Component format)
"Core :: Storage: IndexedDB",
"Testing :: XPCShell Harness",
"Firefox :: General",
...
],
"commitIds": [ // Commit IDs from repository (extracted from profile.meta.sourceURL)
"f37a6863f87aeeb870b16223045ea7614b1ba0a7",
"abc123def456789012345678901234567890abcd",
...
]
}
taskInfo
Maps task IDs to their associated job names, repositories, and commit IDs. These are parallel arrays indexed by taskIdId:
{
"repositoryIds": [0, 1, 0, 2, ...], // Index into tables.repositories
"jobNameIds": [0, 0, 1, 1, ...], // Index into tables.jobNames
"commitIds": [0, 1, 0, null, ...] // Index into tables.commitIds (null if not available)
}
Example lookup:
const taskIdId = 5;
const taskId = tables.taskIds[taskIdId]; // "YJJe4a0CRIqbAmcCo8n63w.0"
const repository = tables.repositories[taskInfo.repositoryIds[taskIdId]]; // "mozilla-central"
const jobName = tables.jobNames[taskInfo.jobNameIds[taskIdId]]; // "test-linux1804-64/opt-xpcshell"
const commitIdIdx = taskInfo.commitIds[taskIdId];
const commitId = commitIdIdx !== null ? tables.commitIds[commitIdIdx] : null; // "f37a6863f87a..." or null
testInfo
Maps test IDs to their test paths, names, and components. These are parallel arrays indexed by testId:
{
"testPathIds": [0, 0, 1, 2, ...], // Index into tables.testPaths
"testNameIds": [0, 1, 2, 3, ...], // Index into tables.testNames
"componentIds": [5, 5, 12, null, ...] // Index into tables.components (null if unknown)
}
Example lookup:
const testId = 10;
const testPath = tables.testPaths[testInfo.testPathIds[testId]]; // "dom/indexedDB/test/unit"
const testName = tables.testNames[testInfo.testNameIds[testId]]; // "test_foo.js"
const fullPath = testPath ? `${testPath}/${testName}` : testName;
const componentId = testInfo.componentIds[testId];
const component = componentId !== null ? tables.components[componentId] : "Unknown"; // "Core :: Storage: IndexedDB"
testRuns
A 2D sparse array structure: testRuns[testId][statusId]
- First dimension: testId (index into testInfo arrays)
- Second dimension: statusId (index into tables.statuses)
Each testRuns[testId][statusId] contains data for all runs of that test with that specific status. If a test never had a particular status, that array position contains null:
[
// testId 0
[
// statusId 0 (e.g., "PASS-PARALLEL")
{
"taskIdIds": [5, 12, 18, ...], // Indices into tables.taskIds
"durations": [1234, 1456, 1289, ...], // Test durations in milliseconds
"timestamps": [0, 15, 23, ...] // Differential compressed timestamps (seconds relative to metadata.startTime)
},
// statusId 1 - this test never had that status
null,
// statusId 2 (e.g., "SKIP")
{
"taskIdIds": [45, 67, ...],
"durations": [0, 0, ...],
"timestamps": [100, 200, ...],
"messageIds": [5, 5, ...] // Present for SKIP and FAIL statuses - indices into tables.messages (null if no message)
},
// statusId 3 (e.g., "FAIL-PARALLEL")
{
"taskIdIds": [78, ...],
"durations": [1234, ...],
"timestamps": [250, ...],
"messageIds": [12, ...] // Present for SKIP and FAIL statuses - indices into tables.messages (null if no message)
},
// statusId 4 (e.g., "CRASH")
{
"taskIdIds": [89, ...],
"durations": [5678, ...],
"timestamps": [300, ...],
"crashSignatureIds": [2, ...], // Only present for CRASH status - indices into tables.crashSignatures (null if none)
"minidumps": ["12345678-abcd-1234-abcd-1234567890ab", ...] // Only present for CRASH status - minidump IDs or null
}
],
// testId 1
[ ... ],
...
]
Timestamp decompression:
// Timestamps are differentially compressed
let currentTime = metadata.startTime; // Base timestamp in seconds
const decompressedTimestamps = statusGroup.timestamps.map(diff => {
currentTime += diff;
return currentTime;
});
Example: Get all runs of a specific test:
const testId = 10;
const testGroup = testRuns[testId];
for (let statusId = 0; statusId < testGroup.length; statusId++) {
const statusGroup = testGroup[statusId];
if (!statusGroup) continue; // This test never had this status
const status = tables.statuses[statusId];
console.log(`Status: ${status}, Runs: ${statusGroup.taskIdIds.length}`);
// Decompress timestamps
let currentTime = metadata.startTime;
for (let i = 0; i < statusGroup.taskIdIds.length; i++) {
currentTime += statusGroup.timestamps[i];
const taskId = tables.taskIds[statusGroup.taskIdIds[i]];
const duration = statusGroup.durations[i];
console.log(` Task: ${taskId}, Duration: ${duration}ms, Time: ${currentTime}`);
}
}
Resource Usage Data Format
Top-Level Structure
{
"jobNames": [ ... ],
"repositories": [ ... ],
"machineInfos": [ ... ],
"jobs": { ... }
}
Lookup Tables
{
"jobNames": [ // Base job names without chunk numbers
"test-linux1804-64/opt-xpcshell",
"test-macosx1015-64/debug-xpcshell",
...
],
"repositories": [ // Repository names
"mozilla-central",
"autoland",
...
],
"machineInfos": [ // Machine specifications (memory in GB, rounded to 1 decimal)
{
"logicalCPUs": 8,
"physicalCPUs": 4,
"mainMemory": 15.6 // GB
},
{
"logicalCPUs": 16,
"physicalCPUs": 8,
"mainMemory": 31.4
},
...
]
}
jobs
Parallel arrays containing resource usage data for each job, sorted by start time:
{
"jobNameIds": [0, 0, 1, 1, ...], // Indices into jobNames array
"chunks": [1, 2, 1, 2, ...], // Chunk numbers (null if job name has no chunk)
"taskIds": ["YJJe4a0CRIqbAmcCo8n63w", "XPPf5b1DRJrcBndDp9o74x.1", ...], // Task IDs (format: "taskId" for retry 0, "taskId.retryId" for retry > 0)
"repositoryIds": [0, 0, 1, 1, ...], // Indices into repositories array
"startTimes": [0, 150, 23, 45, ...], // Differential compressed timestamps (seconds)
"machineInfoIds": [0, 0, 1, 1, ...], // Indices into machineInfos array
"maxMemories": [1234567890, ...], // Maximum memory used (bytes)
"idleTimes": [12345, ...], // Time with <50% of one core used (milliseconds)
"singleCoreTimes": [45678, ...], // Time using ~1 core (0.75-1.25 cores, milliseconds)
"cpuBuckets": [ // CPU usage time distribution (milliseconds per bucket)
[100, 200, 300, 400, 500, 600, 700, 800, 900, 1000], // Job 0: [0-10%, 10-20%, ..., 90-100%]
[150, 250, 350, 450, 550, 650, 750, 850, 950, 1050], // Job 1
...
]
}
CPU Buckets Explanation:
- Array of 10 values representing time spent in each CPU usage range
- Bucket 0: 0-10% CPU usage
- Bucket 1: 10-20% CPU usage
- ...
- Bucket 9: 90-100% CPU usage
- Values are in milliseconds
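The bucket layout above can be turned into an approximate average utilization figure. This is a sketch (the helper name is ours, not part of the format): it weights each bucket by its midpoint (5%, 15%, ..., 95%).

```javascript
// Sketch: approximate average CPU utilization (%) for one job from its
// 10 cpuBuckets values (milliseconds spent in each 10%-wide usage range).
function averageCpuPercent(cpuBuckets) {
  const totalMs = cpuBuckets.reduce((sum, ms) => sum + ms, 0);
  if (totalMs === 0) return 0;
  const weighted = cpuBuckets.reduce(
    (sum, ms, i) => sum + ms * (i * 10 + 5), // bucket i midpoint in percent
    0
  );
  return weighted / totalMs;
}
```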
Idle Time Calculation:
- Idle = CPU usage < (50% of one core)
- For 8-core machine: idle = CPU usage < 6.25%
- For 16-core machine: idle = CPU usage < 3.125%
Single Core Time Calculation:
- Single core = CPU usage between 0.75 and 1.25 cores
- For 8-core machine: 9.375% - 15.625%
- For 16-core machine: 4.6875% - 7.8125%
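The per-core-count percentages listed above follow directly from the core-based definitions. A minimal sketch that reproduces them for any logical CPU count:

```javascript
// Sketch: idle and single-core thresholds, expressed as total-machine CPU %,
// for a machine with `cores` logical CPUs (per the definitions above).
function usageThresholds(cores) {
  return {
    idleBelowPct: (0.5 / cores) * 100,      // < 50% of one core
    singleCoreMinPct: (0.75 / cores) * 100, // 0.75 cores
    singleCoreMaxPct: (1.25 / cores) * 100, // 1.25 cores
  };
}
```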
Start Time Decompression:
let currentTime = 0; // Start times are relative to each other
const decompressedStartTimes = jobs.startTimes.map(diff => {
currentTime += diff;
return currentTime;
});
Example: Get full information for a job:
const jobIndex = 5;
const jobName = jobNames[jobs.jobNameIds[jobIndex]];
const chunk = jobs.chunks[jobIndex]; // May be null
const fullJobName = chunk !== null ? `${jobName}-${chunk}` : jobName;
const taskId = jobs.taskIds[jobIndex];
const repository = repositories[jobs.repositoryIds[jobIndex]];
const machineInfo = machineInfos[jobs.machineInfoIds[jobIndex]];
// Decompress start time
let currentTime = 0;
for (let i = 0; i <= jobIndex; i++) {
currentTime += jobs.startTimes[i];
}
const startTime = currentTime; // seconds since epoch
const maxMemoryGB = jobs.maxMemories[jobIndex] / (1024 * 1024 * 1024);
const idleTimeSeconds = jobs.idleTimes[jobIndex] / 1000;
const singleCoreTimeSeconds = jobs.singleCoreTimes[jobIndex] / 1000;
const cpuDistribution = jobs.cpuBuckets[jobIndex];
const totalTime = cpuDistribution.reduce((sum, val) => sum + val, 0);
const idlePercent = (idleTimeSeconds * 1000 / totalTime) * 100;
Data Compression Techniques
The format uses several compression techniques to minimize file size:
- String tables: repeated strings (job names, test paths, statuses, ...) are stored once and referenced by index
- Frequency sorting: string tables are sorted most-frequent-first for better compression
- Parallel arrays: per-task and per-test attributes are stored as parallel arrays of indices rather than objects
- Differential compression: timestamps store only the delta from the previous value
- Sparse arrays: testRuns[testId][statusId] is null when a test never had that status
- Field omission: optional fields (messages, crash signatures, minidumps) are only present for the statuses that use them
Index File Format
The index.json file lists all available dates:
{
"dates": [
"2025-10-15",
"2025-10-14",
"2025-10-13",
...
]
}
Dates are sorted in descending order (newest first).
Notes
- All timestamps in test timing data are in seconds
- All durations are in milliseconds
- Memory values in machineInfos are in GB (rounded to 1 decimal place)
- Memory values in jobs.maxMemories are in bytes
- The testRuns array is sparse - testRuns[testId][statusId] may be null if that test never had that status
- Task ID formats differ between files:
  - Test timing data: Always includes retry suffix (e.g., "YJJe4a0CRIqbAmcCo8n63w.0")
  - Resource usage data: Omits .0 for retry 0 (e.g., "YJJe4a0CRIqbAmcCo8n63w"), includes suffix for retries > 0 (e.g., "YJJe4a0CRIqbAmcCo8n63w.1")
- Component mapping: Components are fetched from the TaskCluster index gecko.v2.mozilla-central.latest.source.source-bugzilla-info and mapped to test paths. The component ID in testInfo.componentIds may be null if the test path is not found in the mapping
- Components are formatted as "Product :: Component" (e.g., "Core :: Storage: IndexedDB")
- The data structure is optimized for sequential access patterns used by the dashboards
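Because the two files format task IDs differently, code that correlates them needs to normalize the retry suffix. A sketch (the helper name is ours):

```javascript
// Sketch: split a task ID from either file into its base ID and retry number.
// Handles "YJJe4a0CRIqbAmcCo8n63w" (retry 0, resource usage data) as well as
// "YJJe4a0CRIqbAmcCo8n63w.0" / "....1" (test timing data, and retries > 0).
function parseTaskId(taskId) {
  const dot = taskId.lastIndexOf(".");
  if (dot === -1) return { base: taskId, retry: 0 };
  return { base: taskId.slice(0, dot), retry: Number(taskId.slice(dot + 1)) };
}
```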
Aggregated Files Format
When running with --days N where N > 1, two aggregated files are generated:
- A detailed file (xpcshell-issues-with-taskids.json) that preserves task IDs and minidumps for non-passing runs
- A small file (xpcshell-issues.json) that stores only counts, for fast dashboard loading
Detailed File (xpcshell-issues-with-taskids.json)
Differences from Daily Files
1. Metadata Changes
{
"metadata": {
"startDate": "2025-11-12", // First date in the range (earliest)
"endDate": "2025-12-02", // Last date in the range (most recent)
"days": 21, // Number of days aggregated
"startTime": 1731456000, // Unix timestamp for startDate at 00:00:00 UTC
"generatedAt": "...",
"totalTestCount": 4506, // Total number of unique tests
"testsWithFailures": 3614, // Number of tests that had at least one non-passing run
"aggregatedFrom": [...] // Array of source filenames
}
}
Additional fields:
- startDate, endDate, days indicate the date range
- startTime is the base timestamp for the entire aggregated period (00:00:00 UTC on startDate)
- testsWithFailures counts tests with any non-passing status
- aggregatedFrom lists all source files that were merged
2. Passing Test Runs Are Aggregated
Daily files store individual runs for all statuses:
{
"taskIdIds": [123, 456, 789],
"durations": [1500, 1600, 1550],
"timestamps": [3600, 3600, 7200]
}
Aggregated file stores only counts per hour for passing statuses (status starts with "PASS"):
{
"counts": [150, 200, 180, 145, ...],
"hours": [0, 5, 1, 2, 8, ...]
}
Where:
- counts[i] = total number of passing runs in that hour
- hours[i] = differential compressed hour offset (hours since previous bucket)
- No taskIdIds or durations arrays
- Typically sparse - only hours with passing runs are included
Decompressing hours:
let currentHour = 0;
const absoluteHours = [];
for (const delta of hours) {
currentHour += delta;
absoluteHours.push(currentHour);
}
// absoluteHours[i] is now the hour number (0 = startTime, 1 = startTime + 1 hour, etc.)
Example: Calculate pass rate for a test on day 5:
const testId = 0;
const day = 5; // 5 days after startDate
// Find pass status
const passStatusId = data.tables.statuses.findIndex(s => s.startsWith("PASS"));
const passGroup = data.testRuns[testId]?.[passStatusId];
// Count passes in day 5 (hours 120-143)
const dayStartHour = day * 24;
const dayEndHour = (day + 1) * 24;
let passCount = 0;
let currentHour = 0;
if (passGroup) {
for (let i = 0; i < passGroup.hours.length; i++) {
currentHour += passGroup.hours[i];
if (currentHour >= dayStartHour && currentHour < dayEndHour) {
passCount += passGroup.counts[i];
}
}
}
// For the fail count, apply the same hour-bucket logic to the non-passing
// status group, summing taskIdIds[i].length instead of counts[i]
3. All Test Runs Aggregated by Hour
Both passing and non-passing test runs are aggregated by hour. The difference is in what data is preserved:
Passing tests (status starts with "PASS"):
{
"counts": [150, 200, 180],
"hours": [0, 5, 1]
}
Non-passing tests (FAIL, CRASH, TIMEOUT, SKIP, etc.):
{
"taskIdIds": [
[45, 67], // Task IDs that failed in hour 0 with message 23
[89, 12, 56], // Task IDs that failed in hour 5 with message 23
[34] // Task IDs that failed in hour 6 with message 24
],
"hours": [0, 5, 1],
"messageIds": [23, 23, 24],
"crashSignatureIds": [5, 5, 6],
"minidumps": [
["abc123", "def456"], // Minidumps for crashes in hour 0
["ghi789", null, "jkl"], // Minidumps for crashes in hour 5
[null] // Minidumps for crashes in hour 6
]
}
Key differences from daily files:
- taskIdIds is an array of arrays - one array per (hour, message, crashSignature) bucket
- minidumps is an array of arrays - parallel to taskIdIds, preserving the minidump for each task
- hours provides differentially compressed hour offsets
- Durations are removed
- Individual timestamps are removed - only the hour bucket is preserved
- Failures with different messages or crash signatures are in separate buckets
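Putting these rules together, a non-passing status group can be expanded back into per-bucket records. A sketch (function name is ours; group is one testRuns[testId][statusId] entry and tables is the file's string-table object):

```javascript
// Sketch: expand one non-passing status group from the aggregated detailed
// file into per-bucket records with resolved task IDs and messages.
function listFailureBuckets(group, tables) {
  const buckets = [];
  let hour = 0;
  for (let i = 0; i < group.hours.length; i++) {
    hour += group.hours[i]; // undo differential compression
    buckets.push({
      hour, // hours since metadata.startTime
      taskIds: group.taskIdIds[i].map(id => tables.taskIds[id]),
      message: group.messageIds?.[i] != null ? tables.messages[group.messageIds[i]] : null,
    });
  }
  return buckets;
}
```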
4. String Tables Are Merged
All string tables are merged and deduplicated across all input days. A string that appears in multiple daily files will only appear once in the aggregated file.
5. TaskInfo Only Contains Failed Tasks
Since passing runs don't store taskIdIds, the taskInfo object only contains mappings for tasks that appear in non-passing test runs. This significantly reduces the size of these arrays.
6. Platform-Irrelevant Tests Are Filtered
SKIP tests with messages starting with "run-if" are filtered out during aggregation. These represent tests that are not relevant on certain platforms (e.g., "run-if = os == 'win'") and are not actual issues. The dashboard would filter these out anyway, so excluding them reduces file size.
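The filter described above amounts to a simple predicate (the name is ours, not part of the script):

```javascript
// Sketch: a SKIP bucket is dropped during aggregation when its message
// starts with "run-if" (platform-irrelevant, not an actual issue).
const isPlatformIrrelevantSkip = (status, message) =>
  status === "SKIP" && typeof message === "string" && message.startsWith("run-if");
```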
Use Cases
Show pass/fail trends over time:
- Passing runs: Use the counts and hours arrays
- Failing runs: Count task IDs in buckets within day ranges using hours
Investigate specific failures:
- Task IDs preserved for all non-passing runs
- Can identify which tasks/jobs/repos had failures
- Can see error messages, crash signatures, and minidumps
Calculate overall pass rate:
const testId = 0;
const passStatusId = data.tables.statuses.findIndex(s => s.startsWith("PASS"));
const failStatusId = data.tables.statuses.indexOf("FAIL");
// Total passes
const totalPasses = data.testRuns[testId]?.[passStatusId]?.counts.reduce((a, b) => a + b, 0) ?? 0;
// Total fails - count all taskIds across all buckets
const failGroup = data.testRuns[testId]?.[failStatusId];
const totalFails = failGroup?.taskIdIds.reduce((sum, arr) => sum + arr.length, 0) ?? 0;
const passRate = totalPasses / (totalPasses + totalFails);
Small File (xpcshell-issues.json)
This file omits task IDs and minidumps to minimize file size for fast dashboard loading.
Differences from xpcshell-issues-with-taskids.json
1. No taskInfo or taskIds
The taskInfo object and tables.taskIds array are completely omitted since all runs are aggregated.
2. Reduced String Tables
Only includes tables needed for aggregated data:
{
"tables": {
"testPaths": [...],
"testNames": [...],
"statuses": [...],
"messages": [...], // Kept for failure details
"crashSignatures": [...], // Kept for crash details
"components": [...]
// No jobNames, repositories, or taskIds
}
}
3. No Task IDs - Only Counts
All status groups use counts instead of task ID arrays:
{
"counts": [5, 12, 8, 3],
"hours": [0, 5, 1, 2],
"messageIds": [23, 23, 24, 24], // For failures with different messages
"crashSignatureIds": [5, 6, 5, 6] // For crashes with different signatures
// Note: taskIdIds and minidumps are NOT included in this file
}
Failures with different messages or crash signatures are bucketed separately, preserving distinct failure modes.
Task IDs and minidumps are omitted to reduce size. They are available in the detailed file.
Example: A test that fails 5 times in hour 10 with message A and 3 times with message B will have two entries:
{
"counts": [5, 3],
"hours": [10, 0], // Both in same hour, so second delta is 0
"messageIds": [23, 24]
}
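Since the small file keeps distinct failure modes in separate buckets, totals per message can be recovered by grouping counts over messageIds. A sketch (function name is ours):

```javascript
// Sketch: total counts per message for one status group from the small
// file (counts/hours/messageIds only, no task IDs).
function countsByMessage(group, messagesTable) {
  const totals = new Map();
  for (let i = 0; i < group.counts.length; i++) {
    const idx = group.messageIds?.[i];
    const msg = idx != null ? messagesTable[idx] : "(no message)";
    totals.set(msg, (totals.get(msg) ?? 0) + group.counts[i]);
  }
  return totals;
}
```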