================
Ranking (Legacy)
================

.. NOTE:: This documentation is kept for historical purposes.
  The frecency algorithm was changed in Firefox 147. For current documentation,
  see `ranking.rst <https://firefox-source-docs.mozilla.org/browser/urlbar/ranking.html>`_


Before results appear in the UrlbarView, they are fetched from providers.

Each `UrlbarProvider <https://firefox-source-docs.mozilla.org/browser/urlbar/overview.html#urlbarprovider>`_
implements its own internal ranking and returns sorted results.

Externally, all the results are ranked by the `UrlbarMuxer <https://searchfox.org/mozilla-central/source/browser/components/urlbar/UrlbarMuxerStandard.sys.mjs>`_
according to a hardcoded list of groups and sub-groups.

.. NOTE:: Preferences can influence the order of the groups, for example by
  putting Firefox Suggest before Search Suggestions.

The Places provider, responsible for returning history and bookmark results,
uses an internal ranking algorithm called Frecency.

Frecency implementation
=======================

Frecency is a term derived from `frequency` and `recency`. Its purpose is to
provide a ranking algorithm that gives importance both to how often a page is
accessed and to when it was last visited.
Additionally, it accounts for the type of each visit through a bonus system.

To account for `recency`, a bucketing system is implemented.
If a page has been visited more recently than a bucket cutoff, the visit gets
the weight associated with that bucket:

- Up to 4 days old - weight 100 - ``places.frecency.firstBucketCutoff/Weight``
- Up to 14 days old - weight 70 - ``places.frecency.secondBucketCutoff/Weight``
- Up to 31 days old - weight 50 - ``places.frecency.thirdBucketCutoff/Weight``
- Up to 90 days old - weight 30 - ``places.frecency.fourthBucketCutoff/Weight``
- Anything else - weight 10 - ``places.frecency.defaultBucketWeight``

To account for `frequency`, the total number of visits to a page is used to
calculate the final score.

The type of each visit is taken into account using specific bonuses:

Default bonus
  Any unknown type gets a default bonus. This is expected to be unused.
  Pref ``places.frecency.defaultVisitBonus`` current value: 0.
Embed
  Used for embedded/framed visits not due to user actions. These visits are
  currently stored only in memory and never take part in the frecency
  calculation, so this bonus is currently unused.
  Pref ``places.frecency.embedVisitBonus`` current value: 0.
Framed Link
  Used for cross-frame visits due to user action.
  Pref ``places.frecency.framedLinkVisitBonus`` current value: 0.
Download
  Used for download visits. It’s important to support link coloring for these
  visits, but they are not necessarily useful address bar results (the Downloads
  view can do a better job with these), so their frecency can be low.
  Pref ``places.frecency.downloadVisitBonus`` current value: 0.
Reload
  Used for reload visits (refreshing the same page). Low because it should not
  be possible to influence frecency through multiple reloads.
  Pref ``places.frecency.reloadVisitBonus`` current value: 0.
Redirect Source
  Used when the page redirects to another one.
  It’s a low value because we give more importance to the final destination,
  which is what the user actually visits, especially for permanent redirects.
  Pref ``places.frecency.redirectSourceVisitBonus`` current value: 25.
Temporary Redirect
  Used for visits resulting from a temporary redirect (HTTP 307).
  Pref ``places.frecency.tempRedirectVisitBonus`` current value: 40.
Permanent Redirect
  Used for visits resulting from a permanent redirect (HTTP 301). This is the
  new supposed destination for a URL, thus the bonus is higher than temporary.
  In this case it may be advisable to just pick the bonus for the source visit.
  Pref ``places.frecency.permRedirectVisitBonus`` current value: 50.
Bookmark
  Used for visits generated from bookmark views.
  Pref ``places.frecency.bookmarkVisitBonus`` current value: 75.
Link
  Used for normal visits, for example when clicking on a link.
  Pref ``places.frecency.linkVisitBonus`` current value: 100.
Typed
  Intended for pages typed by the user; in reality it is used when
  the user picks a URL from the UI (history views or the Address Bar).
  Pref ``places.frecency.typedVisitBonus`` current value: 2000.

The above bonuses are applied to visits. In addition, a few bonuses are applied
when a page has not been visited at all; both of these bonuses can be applied
at the same time:

Unvisited bookmarked page
  Used for pages that are bookmarked but unvisited.
  Pref ``places.frecency.unvisitedBookmarkBonus`` current value: 140.
Unvisited typed page
  Used for pages that were typed and now are bookmarked (otherwise they would
  be orphans).
  Pref ``places.frecency.unvisitedTypedBonus`` current value: 200.

Two special frecency values are also defined:

- ``-1`` represents a just inserted entry in the database, whose score has not
  been calculated yet.
- ``0`` represents an entry for which a new value should not be calculated,
  because it has poor value to the user among search results (e.g. ``place:``
  queries).

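The bucket weights and visit bonuses listed above amount to two lookup tables.
The following is a minimal Python sketch of those lookups, not the Places
implementation. The names ``BUCKETS``, ``VISIT_BONUS``, ``bucket_weight`` and
``visit_bonus`` and the short visit-type keys are hypothetical, and treating
"up to N days old" as an inclusive comparison is an assumption.

```python
# Recency buckets as (cutoff_days, weight) pairs, values from the list above.
# Prefs: places.frecency.{first,second,third,fourth}BucketCutoff/Weight.
BUCKETS = [(4, 100), (14, 70), (31, 50), (90, 30)]
DEFAULT_BUCKET_WEIGHT = 10  # places.frecency.defaultBucketWeight

# Visit-type bonuses, keyed by hypothetical short names for each visit type.
VISIT_BONUS = {
    "embed": 0,
    "framed_link": 0,
    "download": 0,
    "reload": 0,
    "redirect_source": 25,
    "temp_redirect": 40,
    "perm_redirect": 50,
    "bookmark": 75,
    "link": 100,
    "typed": 2000,
}
DEFAULT_VISIT_BONUS = 0  # places.frecency.defaultVisitBonus


def bucket_weight(age_days: float) -> int:
    """Return the weight of the first bucket whose cutoff covers the visit age."""
    for cutoff_days, weight in BUCKETS:
        if age_days <= cutoff_days:  # assumption: the cutoff is inclusive
            return weight
    return DEFAULT_BUCKET_WEIGHT


def visit_bonus(visit_type: str) -> int:
    """Return the bonus for a visit type, or the default for unknown types."""
    return VISIT_BONUS.get(visit_type, DEFAULT_VISIT_BONUS)
```

For example, under these assumptions a visit made 10 days ago falls in the
second bucket (weight 70), and a ``typed`` visit carries a bonus of 2000.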

Finally, because calculating a score from all of the visits every time a new
visit is added would be expensive, only a sample of the last 10
(pref ``places.frecency.numVisits``) visits is used.

How frecency for a page is calculated
-------------------------------------

.. mermaid::
    :align: center
    :caption: Frecency calculation flow

    flowchart TD
        start[URL]
        a0{Has visits?}
        a1[Get last 10 visits]
        a2[bonus = unvisited_bonus + bookmarked + typed]
        a3{bonus > 0?}
        end0[Frecency = 0]
        end1["frecency = age_bucket_weight * (bonus / 100)"]
        a4[Sum points of all sampled visits]
        a5{points > 0?}
        end2[frecency = -1]
        end3["Frecency = visit_count * (points / sample_size)"]
        subgraph sub [Per each visit]
            sub0[bonus = visit_type_bonus]
            sub1{bookmarked?}
            sub2[add bookmark bonus]
            sub3["score = age_bucket_weight * (bonus / 100)"]
            sub0 --> sub1
            sub1 -- yes --> sub2
            sub1 -- no --> sub3
            sub2 --> sub3;
        end
        start --> a0
        a0 -- no --> a2
        a2 --> a3
        a3 -- no --> end0
        a3 -- yes --> end1
        a0 -- yes --> a1
        a1 --> sub
        sub --> a4
        a4 --> a5
        a5 -- no --> end2
        a5 -- yes --> end3

1. If the page is visited, get a sample of the ``NUM_VISITS`` most recent visits.
2. For each visit, get a transition bonus depending on the visit type.
3. If the page is bookmarked, add an additional bookmark bonus to the bonus.
4. If the bonus is positive, get a bucket weight depending on the visit date.
5. Calculate points for the visit as ``age_bucket_weight * (bonus / 100)``.
6. Sum points for all the sampled visits.
7. If the points sum is zero, return a ``-1`` frecency; the page will still
   appear in the UI.
   Otherwise, frecency is ``visitCount * points / NUM_VISITS``.
8. If the page is unvisited and not bookmarked, or it’s a bookmarked place-query,
   return a ``0`` frecency, to hide it from the UI.
9. If the unvisited page is bookmarked, add the unvisited bookmark bonus.
10. If it’s also a typed page, add the unvisited typed bonus.
11. Frecency is ``age_bucket_weight * (bonus / 100)``.

When frecency for a page is calculated
--------------------------------------

Operations that may influence the frecency score are:

* Adding visits
* Removing visits
* Adding bookmarks
* Removing bookmarks
* Changing the URL of a bookmark

Frecency is recalculated:

* Immediately, when a new visit is added. The user expectation here is that the
  page appears in search results after being visited. This is also valid for
  any History API that allows adding visits.
* In the background during idle time, in any other case. In most cases a
  temporarily stale value is not a problem. The main concern would be privacy
  when removing the history of a page, but removing the whole history will
  either completely remove the page or, if it's bookmarked, leave it still
  relevant.
  In this case, when a change influencing frecency happens, the ``recalc_frecency``
  database field for the page is set to ``1``.

Recalculation is done by the `PlacesFrecencyRecalculator <https://searchfox.org/mozilla-central/source/toolkit/components/places/PlacesFrecencyRecalculator.sys.mjs>`_ module.
The Recalculator is notified when the ``PlacesUtils.history.shouldStartFrecencyRecalculation``
value changes from false to true, which means there are values to recalculate.
A DeferredTask is armed that looks for a user idle opportunity
in the next 5 minutes; otherwise it runs when that time elapses.
Once all the outdated values have been recalculated,
``PlacesUtils.history.shouldStartFrecencyRecalculation`` is set back to false
until the next operation invalidating a frecency.
The recalculation task is also armed on the ``idle-daily`` notification.

When the task is executed, it recalculates the frecency of a chunk of pages. If
there are more pages left to recalculate, the task is re-armed. After the
frecency of a page is recalculated, its ``recalc_frecency`` field is set back
to ``0``.

Frecency is also decayed daily during the ``idle-daily`` notification, by
multiplying all the scores by a decay rate of ``0.975`` (a half-life of about
28 days). This guarantees that entries not receiving new visits or bookmarks
lose relevancy.
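
The calculation steps and the daily decay described in this document can be
sketched in Python as follows. This is an illustration under stated
assumptions, not the Places code: ``Visit``, ``frecency`` and ``decay`` are
hypothetical names; the bonus added to visits of a bookmarked page is assumed
to be ``bookmarkVisitBonus``; unvisited pages are assumed to use the first
bucket weight; no rounding is applied. The sketch divides by the number of
sampled visits, as in the flowchart, which matches ``NUM_VISITS`` whenever a
page has at least that many visits.

```python
from dataclasses import dataclass

NUM_VISITS = 10  # places.frecency.numVisits
BUCKETS = [(4, 100), (14, 70), (31, 50), (90, 30)]  # (cutoff_days, weight)
DEFAULT_BUCKET_WEIGHT = 10
# Non-zero visit bonuses from this document; every other type scores 0.
VISIT_BONUS = {"redirect_source": 25, "temp_redirect": 40, "perm_redirect": 50,
               "bookmark": 75, "link": 100, "typed": 2000}
UNVISITED_BOOKMARK_BONUS = 140
UNVISITED_TYPED_BONUS = 200
DECAY_RATE = 0.975


@dataclass
class Visit:
    age_days: float   # how long ago the visit happened
    visit_type: str   # hypothetical short name, e.g. "link" or "typed"


def bucket_weight(age_days):
    """Weight of the first bucket covering the age (inclusive cutoff assumed)."""
    for cutoff_days, weight in BUCKETS:
        if age_days <= cutoff_days:
            return weight
    return DEFAULT_BUCKET_WEIGHT


def frecency(visits, visit_count, bookmarked=False, typed=False,
             place_query=False):
    """Follow steps 1 to 11 for one page and return its frecency score."""
    if visits:
        # Steps 1-6: score a sample of the most recent visits.
        sample = sorted(visits, key=lambda v: v.age_days)[:NUM_VISITS]
        points = 0.0
        for visit in sample:
            bonus = VISIT_BONUS.get(visit.visit_type, 0)
            if bookmarked:
                bonus += VISIT_BONUS["bookmark"]  # assumption: bookmarkVisitBonus
            if bonus > 0:
                points += bucket_weight(visit.age_days) * (bonus / 100)
        # Step 7: -1 when nothing scored, else scale by the total visit count.
        if points == 0:
            return -1
        return visit_count * points / len(sample)
    # Step 8: unvisited pages that should be hidden from the UI.
    if not bookmarked or place_query:
        return 0
    # Steps 9-11: unvisited bookmarked (and possibly typed) pages.
    bonus = UNVISITED_BOOKMARK_BONUS
    if typed:
        bonus += UNVISITED_TYPED_BONUS
    return BUCKETS[0][1] * (bonus / 100)  # assumption: first bucket weight


def decay(score, days):
    """Apply the daily decay rate for the given number of days."""
    return score * DECAY_RATE ** days
```

Under these assumptions, a page with two ``typed`` visits made 1 and 10 days
ago scores ``2 * (100 * 20 + 70 * 20) / 2 = 3400``, while a page that was only
ever reloaded collects no points and gets ``-1``. Applying ``decay`` for 28
days leaves a little under half of the original score.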