Advanced Usage
==============

Priority
--------

.. versionadded:: 2.0.0

`RFC 7540`_ has a fairly substantial and complex section describing how to
build an HTTP/2 priority tree, and the effect that should have on sending data
from a server.

h2 does not enforce any priority logic by default for servers. This is
because scheduling data sends is outside the scope of this library, as it
likely requires fairly substantial understanding of the scheduler being used.

However, for servers that *do* want to follow the priority recommendations
given by clients, the Hyper project provides `an implementation`_ of the
`RFC 7540`_ priority tree that will be useful to plug into a server. That,
combined with the :class:`PriorityUpdated <h2.events.PriorityUpdated>` event from
this library, can be used to build a server that conforms to RFC 7540's
recommendations for priority handling.

Related Events
--------------

.. versionadded:: 2.4.0

In the 2.4.0 release h2 added support for signaling "related events".
These are an HTTP/2-only construct that exists because certain HTTP/2 events
can occur simultaneously: that is, one HTTP/2 frame can cause multiple state
transitions to occur at the same time. One example of this is a HEADERS frame
that contains priority information and carries the END_STREAM flag: this would
cause three events to fire (one of the various request/response received
events, a :class:`PriorityUpdated <h2.events.PriorityUpdated>` event, and a
:class:`StreamEnded <h2.events.StreamEnded>` event).

Ordinarily h2's logic will emit those events to you one at a time. This
means that you may attempt to process, for example, a
:class:`DataReceived <h2.events.DataReceived>` event, not knowing that the next
event out will be a :class:`StreamEnded <h2.events.StreamEnded>` event.
h2 *does* know this, however, and so will forbid you from taking certain
actions that are a violation of the HTTP/2 protocol.

To avoid this asymmetry of information, events that can occur simultaneously
now carry properties for their "related events". These properties let users
discover the events that occurred at the same time as the current one before
h2 emits them in the event stream. The following objects have "related
events":

- :class:`RequestReceived <h2.events.RequestReceived>`:

   - :data:`stream_ended <h2.events.RequestReceived.stream_ended>`: any
     :class:`StreamEnded <h2.events.StreamEnded>` event that occurred at the
     same time as receiving this request.

   - :data:`priority_updated
     <h2.events.RequestReceived.priority_updated>`: any
     :class:`PriorityUpdated <h2.events.PriorityUpdated>` event that occurred
     at the same time as receiving this request.

- :class:`ResponseReceived <h2.events.ResponseReceived>`:

   - :data:`stream_ended <h2.events.ResponseReceived.stream_ended>`: any
     :class:`StreamEnded <h2.events.StreamEnded>` event that occurred at the
     same time as receiving this response.

   - :data:`priority_updated
     <h2.events.ResponseReceived.priority_updated>`: any
     :class:`PriorityUpdated <h2.events.PriorityUpdated>` event that occurred
     at the same time as receiving this response.

- :class:`TrailersReceived <h2.events.TrailersReceived>`:

   - :data:`stream_ended <h2.events.TrailersReceived.stream_ended>`: any
     :class:`StreamEnded <h2.events.StreamEnded>` event that occurred at the
     same time as receiving this set of trailers. This will **always** be
     present for trailers, as they must terminate streams.

   - :data:`priority_updated
     <h2.events.TrailersReceived.priority_updated>`: any
     :class:`PriorityUpdated <h2.events.PriorityUpdated>` event that occurred
     at the same time as receiving this set of trailers.

- :class:`InformationalResponseReceived
  <h2.events.InformationalResponseReceived>`:

   - :data:`priority_updated
     <h2.events.InformationalResponseReceived.priority_updated>`: any
     :class:`PriorityUpdated <h2.events.PriorityUpdated>` event that occurred
     at the same time as receiving this informational response.

- :class:`DataReceived <h2.events.DataReceived>`:

   - :data:`stream_ended <h2.events.DataReceived.stream_ended>`: any
     :class:`StreamEnded <h2.events.StreamEnded>` event that occurred at the
     same time as receiving this data.


.. warning:: h2 does not know whether you are looking for related events or
   expecting to find events in the event stream. Therefore, it will
   always emit "related events" in the event stream. If you are using
   the "related events" event pattern, you will want to be careful to
   avoid double-processing related events.
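
One possible way to avoid that double-processing is sketched below. The
``dispatch`` function, the ``handlers`` mapping and the ``RELATED_ATTRIBUTES``
tuple are hypothetical names introduced for this illustration; they are not
part of h2. The sketch also assumes that the object stored in a related-event
property is the very same object that later appears in the event stream, which
you should confirm for the h2 version you use.

.. code-block:: python

   # Property names taken from the list of related events above.
   RELATED_ATTRIBUTES = ("stream_ended", "priority_updated")


   def dispatch(events, handlers):
       """Pass each event to ``handlers[type(event)]`` at most once.

       Handlers are expected to inspect the related-event properties of the
       event they receive, so related events are not dispatched a second time.
       """
       consumed = []  # related events already exposed to a handler

       for event in events:
           # Identity check: assumes h2 re-emits the same object (see above).
           if any(event is seen for seen in consumed):
               continue

           handler = handlers.get(type(event))
           if handler is None:
               continue  # nobody handled it, so its related events stay live

           for name in RELATED_ATTRIBUTES:
               related = getattr(event, name, None)
               if related is not None:
                   consumed.append(related)

           handler(event)

With real h2 events, ``handlers`` would be keyed by classes such as
``h2.events.DataReceived`` and ``h2.events.StreamEnded``.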

.. _h2-connection-advanced:

Connections: Advanced
---------------------

Thread Safety
~~~~~~~~~~~~~

``H2Connection`` objects are *not* thread-safe. They cannot safely be accessed
from multiple threads at once. This is a deliberate design decision: it is not
trivially possible to design the ``H2Connection`` object in a way that would
be either lock-free or have the locks at a fine granularity.

Your implementations should bear this in mind, and handle it appropriately. It
should be simple enough to use locking alongside the ``H2Connection``: simply
lock around the connection object itself. Because the ``H2Connection`` object
does no I/O it should be entirely safe to do that. Alternatively, have a single
thread take ownership of the ``H2Connection`` and use a message-passing
interface to serialize access to the ``H2Connection``.

If you are using a non-threaded concurrency approach (e.g. Twisted), this
should not affect you.
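
For threaded programs, the "lock around the connection object" approach could
look roughly like the sketch below. ``LockedConnection`` and its ``acquire``
method are hypothetical names invented for this example; h2 itself provides no
such wrapper.

.. code-block:: python

   import threading
   from contextlib import contextmanager


   class LockedConnection:
       """Serialize access to a connection object shared between threads."""

       def __init__(self, conn):
           self._conn = conn
           self._lock = threading.Lock()

       @contextmanager
       def acquire(self):
           """Yield the wrapped connection while holding the lock."""
           with self._lock:
               yield self._conn

A thread would then wrap its ``send_*`` calls and the following
``data_to_send()`` call in ``with shared.acquire() as conn:``. One simple
option is to write the returned bytes to the socket before leaving the
``with`` block, so that bytes pulled by different threads reach the wire in
the order h2 produced them.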

Internal Buffers
~~~~~~~~~~~~~~~~

In order to avoid doing I/O, the ``H2Connection`` employs an internal buffer.
This buffer is *unbounded* in size: it can grow without limit. This means
that, if you do not empty it regularly, you risk exceeding the memory limit of
a single process and crashing your program.

It is highly recommended that you send data at regular intervals, ideally as
soon as possible.

.. _advanced-sending-data:

Sending Data
~~~~~~~~~~~~

When sending data on the network, it's important to remember that you may not
be able to send an unbounded amount of data at once. Particularly when using
TCP, it is often the case that there are limits on how much data may be in
flight at any one time. These limits can be very low, and your operating system
will only buffer so much data in memory before it starts to complain.

For this reason, it is possible to consume only a subset of the data available
when you call :meth:`data_to_send <h2.connection.H2Connection.data_to_send>`.
However, once you have pulled the data out of the ``H2Connection`` internal
buffer, it is *not* possible to put it back again. For that reason, it is
advisable that you confirm how much space is available in the OS buffer before
sending.

Alternatively, use tools made available by your framework. For example, the
Python standard library :mod:`socket <python:socket>` module provides a
:meth:`sendall <python:socket.socket.sendall>` method that will automatically
block until all the data has been sent. This will enable you to always use the
unbounded form of
:meth:`data_to_send <h2.connection.H2Connection.data_to_send>`, and will help
you avoid subtle bugs.
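
A minimal sketch of the ``sendall`` approach, assuming ``conn`` is an
``H2Connection`` and ``sock`` is a connected, blocking socket. The ``flush``
helper is a hypothetical name used only for this example.

.. code-block:: python

   def flush(conn, sock):
       """Write everything h2 has buffered to a blocking socket."""
       data = conn.data_to_send()  # no amount given: drain the whole buffer
       if data:
           sock.sendall(data)  # returns only once all bytes have been sent
       return len(data)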

When To Send
~~~~~~~~~~~~

In addition to knowing how much data to send (see :ref:`advanced-sending-data`)
it is important to know when to send data. For h2, this amounts to
knowing when to call :meth:`data_to_send
<h2.connection.H2Connection.data_to_send>`.

h2 may write data into its send buffer at two times. The first is
whenever :meth:`receive_data <h2.connection.H2Connection.receive_data>` is
called. This data is sent in response to some control frames that require no
user input: for example, responding to PING frames. The second time is in
response to user action: whenever a user calls a method like
:meth:`send_headers <h2.connection.H2Connection.send_headers>`, data may be
written into the buffer.

In a standard design for an h2 consumer, then, that means there are two
places where you'll potentially want to send data. The first is in your
"receive data" loop. This is where you take the data you receive, pass it into
:meth:`receive_data <h2.connection.H2Connection.receive_data>`, and then
dispatch events. For this loop, it is usually best to save sending data until
the loop is complete: that allows you to empty the buffer only once.

The other place you'll want to send the data is when initiating requests or
taking any other active, unprompted action on the connection. In this instance,
you'll want to make all the relevant ``send_*`` calls, and *then* call
:meth:`data_to_send <h2.connection.H2Connection.data_to_send>`.
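
One iteration of such a "receive data" loop might be structured as in the
sketch below. It assumes a blocking socket; ``pump_once``, ``handle_event``
and the ``bufsize`` default are hypothetical names and values chosen for
illustration.

.. code-block:: python

   def pump_once(conn, sock, handle_event, bufsize=65535):
       """Read once from ``sock``, dispatch events, then flush a single time.

       Returns ``False`` when the peer has closed the socket.
       """
       data = sock.recv(bufsize)
       if not data:
           return False

       for event in conn.receive_data(data):
           handle_event(event)  # may call send_* methods on ``conn``

       # Empty the buffer once, after all events have been dispatched.
       outgoing = conn.data_to_send()
       if outgoing:
           sock.sendall(outgoing)
       return True

Calling it as ``while pump_once(conn, sock, handle_event): pass`` gives the
full loop.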

Headers
-------

HTTP/2 defines several "special header fields" which are used to encode data
that was previously sent in either the request or status line of HTTP/1.1.
These header fields are distinguished from ordinary header fields because their
field name begins with a ``:`` character. The special header fields defined in
`RFC 7540`_ are:

- ``:status``
- ``:path``
- ``:method``
- ``:scheme``
- ``:authority``

`RFC 7540`_ **mandates** that all of these header fields appear *first* in the
header block, before the ordinary header fields. This could cause difficulty if
the :meth:`send_headers <h2.connection.H2Connection.send_headers>` method
accepted a plain ``dict`` for the ``headers`` argument, because ``dict``
objects have historically been unordered (insertion order is only guaranteed
from Python 3.7 onwards). For this reason, we require that you provide a list
of two-tuples.
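
As an illustration, a request header list in the required shape might look
like the one below. The host name and the ``user-agent`` value are
placeholders, and ``pseudo_headers_first`` is a hypothetical helper (not an h2
function) that checks the ordering rule for header names given as ``str``.

.. code-block:: python

   def pseudo_headers_first(headers):
       """Return True if no ``:``-prefixed field follows an ordinary field."""
       seen_ordinary = False
       for name, _value in headers:
           if name.startswith(":"):
               if seen_ordinary:
                   return False
           else:
               seen_ordinary = True
       return True


   request_headers = [
       (":method", "GET"),
       (":path", "/"),
       (":scheme", "https"),
       (":authority", "example.com"),         # placeholder host
       ("user-agent", "example-client/1.0"),  # placeholder value
   ]

A list like ``request_headers`` is what would be passed as the ``headers``
argument of ``send_headers``.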

.. _RFC 7540: https://tools.ietf.org/html/rfc7540
.. _an implementation: http://python-hyper.org/projects/priority/en/latest/

Flow Control
------------

HTTP/2 defines a complex flow control system that uses a sliding window of
data on both a per-stream and per-connection basis. Essentially, each
implementation allows its peer to send a specific amount of data at any time
(the "flow control window") before it must stop. Each stream has a separate
window, and the connection as a whole has a window. Each window can be opened
by an implementation by sending a ``WINDOW_UPDATE`` frame, either on a specific
stream (causing the window for that stream to be opened), or on stream ``0``,
which causes the window for the entire connection to be opened.

In HTTP/2, only data in ``DATA`` frames is flow controlled. All other frames
are exempt from flow control. Each ``DATA`` frame consumes both stream and
connection flow control window bytes. This means that the maximum amount of
data that can be sent on any one stream before a ``WINDOW_UPDATE`` frame is
received is the *lower* of the stream and connection windows. The maximum
amount of data that can be sent on *all* streams before a ``WINDOW_UPDATE``
frame is received is the size of the connection flow control window.

Working With Flow Control
~~~~~~~~~~~~~~~~~~~~~~~~~

The amount of flow control window a ``DATA`` frame consumes is the sum of both
its contained application data *and* the amount of padding used. h2 shows
this to the user in a :class:`DataReceived <h2.events.DataReceived>` event by
using the :data:`flow_controlled_length
<h2.events.DataReceived.flow_controlled_length>` field. When working with flow
control in h2, users *must* use this field: simply using
``len(datareceived.data)`` can eventually lead to deadlock.

When data has been received and given to the user in a :class:`DataReceived
<h2.events.DataReceived>`, it is the responsibility of the user to re-open the
flow control window when the user is ready for more data. h2 does not do
this automatically to avoid flooding the user with data: if we did, the remote
peer could send unbounded amounts of data that the user would need to buffer
before processing.

To re-open the flow control window, then, the user must call
:meth:`increment_flow_control_window
<h2.connection.H2Connection.increment_flow_control_window>` with the
:data:`flow_controlled_length <h2.events.DataReceived.flow_controlled_length>`
of the received data. h2 requires that you manage both the connection
and the stream flow control windows separately, so you may need to increment
both the stream the data was received on and stream ``0``.
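
A sketch of that manual bookkeeping for a single
:class:`DataReceived <h2.events.DataReceived>` event follows. The
``reopen_windows`` helper is a hypothetical name, and skipping the stream
increment once the peer has ended the stream is a design choice made for this
example, not something the text above requires.

.. code-block:: python

   def reopen_windows(conn, event):
       """Credit back the bytes consumed by one ``DataReceived`` event."""
       size = event.flow_controlled_length  # payload *and* padding
       if size <= 0:
           return  # empty DATA frame: nothing to credit back

       # Connection-level window (stream 0).
       conn.increment_flow_control_window(size)

       # Stream-level window. The peer sends no more data on a stream it
       # has ended, so there is nothing to gain from re-opening it.
       if event.stream_ended is None:
           conn.increment_flow_control_window(size, stream_id=event.stream_id)

As with any other ``H2Connection`` call that queues frames, the resulting
``WINDOW_UPDATE`` frames only reach the peer once ``data_to_send()`` has been
called and its result written to the network.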

When sending data, an HTTP/2 implementation must not send more than the flow
control window available for that stream. As noted above, the maximum amount
of data that can be sent on the stream is the minimum of the stream and the
connection flow control windows. You can find out how much data you can send
on a given stream by using the :meth:`local_flow_control_window
<h2.connection.H2Connection.local_flow_control_window>` method, which will do
all of these calculations for you. If you attempt to send more than this amount
of data on a stream, h2 will raise a :class:`ProtocolError
<h2.exceptions.ProtocolError>` and refuse to send the data.
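
The check can be folded into a small sending helper, sketched below under a
few assumptions. ``send_what_fits`` is a hypothetical name. The sketch also
caps each call at ``conn.max_outbound_frame_size``, a limit that is separate
from flow control and not discussed in this section, so that one call emits
at most one ``DATA`` frame.

.. code-block:: python

   def send_what_fits(conn, stream_id, data):
       """Send as much of ``data`` as is allowed now; return the remainder."""
       window = conn.local_flow_control_window(stream_id)
       limit = min(window, conn.max_outbound_frame_size)
       chunk = data[:limit]
       if chunk:
           conn.send_data(stream_id, chunk)
       return data[len(chunk):]

The caller keeps the returned remainder and tries again later, for example
after the peer has re-opened the window.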

In h2, receiving a ``WINDOW_UPDATE`` frame causes a :class:`WindowUpdated
<h2.events.WindowUpdated>` event to fire. This will notify you that there is
potentially more room in a flow control window. Note that, just because an
increment of a given size was received *does not* mean that that much more data
can be sent: remember that both the connection and stream flow control windows
constrain how much data can be sent.

As a result, when a :class:`WindowUpdated <h2.events.WindowUpdated>` event
fires with a non-zero stream ID, and the user has more data to send on that
stream, the user should call :meth:`local_flow_control_window
<h2.connection.H2Connection.local_flow_control_window>` to check if there
really is more room to send data on that stream.

When a :class:`WindowUpdated <h2.events.WindowUpdated>` event fires with a
stream ID of ``0``, that may have unblocked *all* streams that are currently
blocked. The user should use :meth:`local_flow_control_window
<h2.connection.H2Connection.local_flow_control_window>` to check all blocked
streams to see if more data is available.
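
The two cases above can be handled by one small function, as in the following
sketch. ``streams_to_resume`` and the ``blocked`` collection of stream IDs are
hypothetical: h2 does not track which of your streams are waiting for window
space, so the application is assumed to maintain that collection itself.

.. code-block:: python

   def streams_to_resume(conn, event, blocked):
       """Return the blocked stream IDs that can send again after ``event``."""
       if event.stream_id == 0:
           # Connection-level update: every blocked stream is a candidate.
           candidates = list(blocked)
       elif event.stream_id in blocked:
           candidates = [event.stream_id]
       else:
           candidates = []

       # An update does not guarantee room: ask h2 for the effective window.
       return [
           stream_id
           for stream_id in candidates
           if conn.local_flow_control_window(stream_id) > 0
       ]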

Auto Flow Control
~~~~~~~~~~~~~~~~~

.. versionadded:: 2.5.0

In most cases, there is no advantage for users in managing their own flow
control strategies. While particular high performance or specific-use-case
applications may gain value from directly controlling the emission of
``WINDOW_UPDATE`` frames, the average application can use a
lowest-common-denominator strategy to emit those frames. As of version 2.5.0,
h2 provides this automatic strategy for users who want to use it.

This automatic strategy is built around a single method:
:meth:`acknowledge_received_data
<h2.connection.H2Connection.acknowledge_received_data>`. This method
flags to the connection object that your application has dealt with a certain
number of flow controlled bytes, and that the window should be incremented in
some way. Whenever your application has "processed" some received bytes, this
method should be called to signal that they have been processed.

The key difference between this method and :meth:`increment_flow_control_window
<h2.connection.H2Connection.increment_flow_control_window>` is that the method
:meth:`acknowledge_received_data
<h2.connection.H2Connection.acknowledge_received_data>` does not guarantee that
it will emit a ``WINDOW_UPDATE`` frame, and if it does it will not necessarily
emit them for *only* the stream or *only* the frame. Instead, the
``WINDOW_UPDATE`` frames will be *coalesced*: they will be emitted only when
a certain number of bytes have been freed up.

For most applications, this method should be preferred to the manual flow
control mechanism.
    317 :meth:`acknowledge_received_data
    318 <h2.connection.H2Connection.acknowledge_received_data>` does not guarantee that
    319 it will emit a ``WINDOW_UPDATE`` frame, and if it does it will not necessarily
    320 emit them for *only* the stream or *only* the frame. Instead, the
    321 ``WINDOW_UPDATE`` frames will be *coalesced*: they will be emitted only when
    322 a certain number of bytes have been freed up.
    323 
    324 For most applications, this method should be preferred to the manual flow
    325 control mechanism.