
Introduction to NSPR
====================

The Netscape Portable Runtime (NSPR) API allows compliant applications
to use system facilities such as threads, thread synchronization, I/O,
interval timing, atomic operations, and several other low-level services
in a platform-independent manner. This chapter introduces key NSPR
programming concepts and illustrates them with sample code.

NSPR does not provide a platform for porting existing code. It must be
used from the beginning of a software project.

.. _NSPR_Naming_Conventions:

NSPR Naming Conventions
-----------------------

Naming of NSPR types, functions, and macros follows these conventions:

-  Types exported by NSPR begin with ``PR`` and are followed by
   intercap-style declarations, like this: :ref:`PRInt`, :ref:`PRFileDesc`
-  Function definitions begin with ``PR_`` and are followed by
   intercap-style declarations, like this: :ref:`PR_Read`,
   :ref:`PR_JoinThread`
-  Preprocessor macros begin with the letters ``PR`` and are followed by
   all-uppercase characters separated with the underscore character
   (``_``), like this: :ref:`PR_BYTES_PER_SHORT`, :ref:`PR_EXTERN`

.. _NSPR_Threads:

NSPR Threads
------------

NSPR provides an execution environment that promotes the use of
lightweight threads. Each thread is an execution entity that is
scheduled independently from other threads in the same process. A thread
has a limited number of resources that it truly owns. These resources
include the thread stack and the CPU register set (including the PC).

To an NSPR client, a thread is represented by a pointer to an opaque
structure of type :ref:`PRThread`. A thread is created by an explicit
client request and remains a valid, independent execution entity until
it returns from its root function or the process abnormally terminates.
(:ref:`PRThread` and functions for creating and manipulating threads are
described in detail in `Threads <Threads>`__.)
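
The life cycle just described can be sketched in code. This is an
illustrative fragment under common assumptions (the conventional
``nspr.h`` header and default creation parameters); the ``worker``
name is hypothetical:

```c
#include "nspr.h"

static void worker(void *arg)            /* the thread's root function */
{
    /* ... do work on arg; the thread ends when this returns ... */
}

int main(void)
{
    PRThread *thread = PR_CreateThread(
        PR_USER_THREAD,      /* user thread: PR_Cleanup waits for it to exit */
        worker, NULL,        /* root function and its argument */
        PR_PRIORITY_NORMAL,  /* priorities are hints (see Thread Scheduling) */
        PR_LOCAL_THREAD,     /* scheduled by NSPR; PR_GLOBAL_THREAD uses the host OS */
        PR_JOINABLE_THREAD,  /* so PR_JoinThread can wait for it */
        0);                  /* 0 requests the default stack size */
    if (thread == NULL)
        return 1;
    PR_JoinThread(thread);   /* blocks until worker returns */
    return PR_Cleanup() == PR_SUCCESS ? 0 : 1;
}
```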

NSPR threads are lightweight in the sense that they are cheaper than
full-blown processes, but they are not free. They achieve the cost
reduction by relying on their containing process to manage most of the
resources that they access. This, and the fact that threads share an
address space with other threads in the same process, makes it important
to remember that *threads are not processes*.

NSPR threads are scheduled in two separate domains:

-  **Local threads** are scheduled within a process only and are handled
   entirely by NSPR, either by completely emulating threads on each host
   operating system (OS) that doesn't support threads, or by using the
   threading facilities of each host OS that does support threads to
   emulate a relatively large number of local threads by using a
   relatively small number of native threads.

-  **Global threads** are scheduled by the host OS--not by NSPR--either
   within a process or across processes on the entire host. Global
   threads correspond to native threads on the host OS.

NSPR threads can also be either user threads or system threads. NSPR
provides a function, :ref:`PR_Cleanup`, that synchronizes process
termination. :ref:`PR_Cleanup` waits for the last user thread to exit
before returning, whereas it ignores system threads when determining
when a process should exit. This arrangement implies that a system
thread should not have volatile data that needs to be safely stored
away.

Priorities for NSPR threads are based loosely on hints provided by the
client and are sometimes constrained by the underlying operating system.
Therefore, priorities are not rigidly defined. For more information, see
`Thread Scheduling <#Thread_Scheduling>`__.

In general, it's preferable to create local user threads with normal
priority and let NSPR take care of the details as appropriate for each
host OS. It's usually not necessary to create a global thread explicitly
unless you are planning to port your code only to platforms that provide
threading services with which you are familiar, or unless the thread will
be executing code that might directly call blocking OS functions.

Threads can also have "per-thread data" attached to them. Each thread
has a built-in per-thread error number and error string that are updated
when NSPR operations fail. It's also possible for NSPR clients to define
their own per-thread data. For details, see `Controlling Per-Thread
Private Data <Threads#Controlling_Per-Thread_Private_Data>`__.
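
Both kinds of per-thread state can be sketched as follows. This is a
fragment for illustration only; the ``key``, ``destructor``, and
``example`` names are hypothetical, while the ``PR_*`` calls are the
documented NSPR entry points:

```c
#include "nspr.h"

static PRUintn key;                  /* per-thread-data index, shared by all threads */

static void destructor(void *priv)   /* invoked for a thread's value when it exits */
{
    PR_Free(priv);
}

void example(void)
{
    /* One-time setup: allocate an index for client-defined per-thread data. */
    PR_NewThreadPrivateIndex(&key, destructor);

    /* Each thread stores and retrieves its own value under the same index. */
    PR_SetThreadPrivate(key, PR_Malloc(64));
    void *mine = PR_GetThreadPrivate(key);

    /* The built-in per-thread error number is consulted the same way: */
    if (mine == NULL) {
        PRErrorCode code = PR_GetError();  /* last NSPR failure on this thread */
        (void)code;
    }
}
```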

.. _Thread_Scheduling:

Thread Scheduling
~~~~~~~~~~~~~~~~~

NSPR threads are scheduled by priority and can be preempted or
interrupted. The sections that follow briefly introduce the NSPR
approach to these three aspects of thread scheduling.

-  `Setting Thread Priorities <#Setting_Thread_Priorities>`__
-  `Preempting Threads <#Preempting_Threads>`__
-  `Interrupting Threads <#Interrupting_Threads>`__

For reference information on the NSPR API used for thread scheduling,
see `Threads <Threads>`__.

.. _Setting_Thread_Priorities:

Setting Thread Priorities
^^^^^^^^^^^^^^^^^^^^^^^^^

The host operating systems supported by NSPR differ widely in the
mechanisms they use to support thread priorities. In general, an NSPR
thread of higher priority has a statistically better chance of running
relative to threads of lower priority. However, because of the multiple
strategies used to provide execution vehicles for threads on various host
platforms, priorities are not a clearly defined abstraction in NSPR. At
best, they are intended to specify a preference with respect to the
amount of CPU time that a higher-priority thread might expect relative
to a lower-priority thread. This preference is still subject to resource
availability and must not be used in place of proper synchronization.
For more information on thread synchronization, see `NSPR Thread
Synchronization <#NSPR_Thread_Synchronization>`__.

The issue is further muddied by inconsistent offerings from OS vendors
regarding the priority of their kernel-supported threads. NSPR assumes
that the priorities of global threads are not manageable, but that the
host OS will perform some sort of fair scheduling. It's usually
preferable to create local user threads with normal priority and let
NSPR and the host take care of the details.

In some NSPR configurations, there may be an arbitrary (and perhaps
large) number of local threads supported by a more limited number
of **virtual processors** (an internal application of global threads).
In such situations, each virtual processor has some number of
local threads associated with it, though exactly which local threads and
how many may vary over time. NSPR guarantees that for each virtual
processor the highest-priority, schedulable local thread is the one
executing. This thread implementation strategy is referred to as the
**M x N model**.

.. _Preempting_Threads:

Preempting Threads
^^^^^^^^^^^^^^^^^^

Preemption is the act of taking control away from a ready thread at an
arbitrary point and giving control to another appropriate thread. It
might be viewed as taking the executing thread and adding it to the end
of the ready queue for its appropriate priority, then simply running the
scheduling algorithm to find the most appropriate thread. The chosen
thread may be of higher priority, of the same priority, or even the same
thread. It will not be a thread of lower priority.

Some operating systems cannot be made preemptible (for example, Mac OS
and Win16). This puts them at some risk in supporting arbitrary code,
even if the code is interpreted (Java). Other systems are not
thread-aware, and their runtime libraries are not thread-safe (most
versions of Unix). These systems can support local-level thread
abstractions that can be made preemptible, but they run the risk of
library corruption (``libc``). Still other operating systems have a
native notion of threads, and their libraries are thread-aware and
support locking. However, if local threads are also present, and they
are preemptible, they are subject to deadlock. At this time, the only
safe solutions are to turn off preemption (a runtime decision) or to
preempt global threads only.

.. _Interrupting_Threads:

Interrupting Threads
^^^^^^^^^^^^^^^^^^^^

NSPR threads are interruptible, with some constraints and
inconsistencies.

To interrupt a thread, the caller of :ref:`PR_Interrupt` must have the NSPR
reference to the target thread (:ref:`PRThread`). When the target is
interrupted, it is rescheduled from the point at which it was blocked,
with a status error indicating that it was interrupted. NSPR recognizes
only two areas where a thread may be interrupted: waiting on a condition
variable and waiting on I/O. In the latter case, interruption does
cancel the I/O operation. In neither case does being interrupted imply
the demise of the thread.
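
The two sides of an interruption might look like the following sketch.
The function names are hypothetical; the ``PR_*`` calls and the
``PR_PENDING_INTERRUPT_ERROR`` code are the documented NSPR interfaces:

```c
#include "nspr.h"

/* In the target thread: a blocking wait that may be interrupted. */
void interruptible_wait(PRCondVar *cv)
{
    if (PR_WaitCondVar(cv, PR_INTERVAL_NO_TIMEOUT) == PR_FAILURE &&
        PR_GetError() == PR_PENDING_INTERRUPT_ERROR) {
        /* Resumed by PR_Interrupt, not by a notification; the
           thread itself continues to exist. */
    }
}

/* In another thread, holding the target's PRThread pointer: */
void poke(PRThread *target)
{
    PR_Interrupt(target);   /* reschedules target with an interrupt status */
}
```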

.. _NSPR_Thread_Synchronization:

NSPR Thread Synchronization
---------------------------

Thread synchronization has two aspects: locking and notification.
Locking prevents access to some resource, such as a piece of shared
data; that is, it enforces mutual exclusion. Notification involves
passing synchronization information among cooperating threads.

In NSPR, a **mutual exclusion lock** (or **mutex**) of type :ref:`PRLock`
controls locking, and associated **condition variables** of type
:ref:`PRCondVar` communicate changes in state among threads. When a
programmer associates a mutex with an arbitrary collection of data, the
mutex provides a protective **monitor** around the data.

.. _Locks_and_Monitors:

Locks and Monitors
~~~~~~~~~~~~~~~~~~

In general, a monitor is a conceptual entity composed of a mutex, one or
more condition variables, and the monitored data. Monitors in this
generic sense should not be confused with the monitor type used in Java
programming. In addition to :ref:`PRLock`, NSPR provides another mutex
type, :ref:`PRMonitor`, which is reentrant and can have only one associated
condition variable. :ref:`PRMonitor` is intended for use with Java and
reflects the Java approach to thread synchronization.

To access the data in the monitor, the thread performing the access must
hold the mutex, also described as being "in the monitor." Mutual
exclusion guarantees that only one thread can be in the monitor at a
time and that no thread may observe or modify the monitored data without
being in the monitor.

Monitoring is about protecting data, not code. A **monitored invariant**
is a Boolean expression over the monitored data. The expression may be
false only when a thread is in the monitor (holding the monitor's
mutex). This requirement implies that when a thread first enters the
monitor, an evaluation of the invariant expression must yield
``true``. The thread must also reinstate the monitored invariant before
exiting the monitor. Therefore, evaluation of the expression must also
yield ``true`` at that point in execution.

A trivial example might be as follows. Suppose an object has three
values, ``v1``, ``v2``, and ``sum``. The invariant is that the third
value is the sum of the other two. Expressed mathematically, the
invariant is ``sum = v1 + v2``. Any modification of ``v1`` or ``v2``
requires modification of ``sum``. Since that is a complex operation, it
must be monitored. Furthermore, any type of access to ``sum`` must also
be monitored to ensure that neither ``v1`` nor ``v2`` is in flux.
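
The ``sum = v1 + v2`` invariant can be made concrete in code. The
sketch below uses POSIX threads rather than NSPR so that it is
self-contained; the ``pair_t`` type and its functions are hypothetical
names chosen for this illustration:

```c
#include <assert.h>
#include <pthread.h>

/* Monitored data: the invariant is sum == v1 + v2. */
typedef struct {
    pthread_mutex_t lock;   /* the monitor's mutex */
    int v1, v2, sum;
} pair_t;

void pair_init(pair_t *p)
{
    pthread_mutex_init(&p->lock, NULL);
    p->v1 = p->v2 = p->sum = 0;       /* invariant holds: 0 == 0 + 0 */
}

/* Any modification of v1 or v2 also updates sum, inside the monitor. */
void pair_set(pair_t *p, int v1, int v2)
{
    pthread_mutex_lock(&p->lock);
    assert(p->sum == p->v1 + p->v2);  /* invariant true on entry */
    p->v1 = v1;
    p->v2 = v2;
    p->sum = v1 + v2;                 /* reinstated before exiting */
    assert(p->sum == p->v1 + p->v2);  /* invariant true on exit */
    pthread_mutex_unlock(&p->lock);
}

/* Reads of sum are monitored too, so v1 and v2 cannot be in flux. */
int pair_sum(pair_t *p)
{
    pthread_mutex_lock(&p->lock);
    int s = p->sum;
    pthread_mutex_unlock(&p->lock);
    return s;
}
```

Note that the asserts on entry and exit are exactly the "absolute
assertion of the expression's evaluation" described in the note below.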

.. note::

   Evaluation of the invariant expression is a conceptual
   requirement and is rarely done in practice. It is valuable to
   formally define the expression during design, write it down, and
   adhere to it. It is also useful to implement the expression during
   development and test it where appropriate. The thread makes an
   absolute assertion of the expression's evaluation both on entering
   and on exiting the monitor.

Acquiring a lock is a synchronous operation. Once the lock primitive is
called, the thread returns only when it has acquired the lock. Should
another thread (or the same thread) already hold the lock, the
calling thread blocks, waiting for the situation to improve. That
blocked state is not interruptible, nor is it timed.
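
A minimal sketch of the lock life cycle follows. The ``setup``,
``critical_section``, and ``teardown`` names are hypothetical; the
``PR_*`` calls are the documented NSPR lock interface:

```c
#include "nspr.h"

static PRLock *lock;          /* protects some monitored data */

void setup(void)
{
    lock = PR_NewLock();      /* create the mutex once */
}

void critical_section(void)
{
    PR_Lock(lock);    /* synchronous: returns only with the lock held */
    /* ... observe or modify the protected data ... */
    PR_Unlock(lock);  /* a blocked thread may now acquire the lock */
}

void teardown(void)
{
    PR_DestroyLock(lock);     /* once no thread can still use it */
}
```

As the text above notes, the blocked state covers the same thread as
well: :ref:`PRLock` is not reentrant, so a thread that calls ``PR_Lock``
twice on the same lock blocks itself. :ref:`PRMonitor` is the reentrant
alternative.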

.. _Condition_Variables:

Condition Variables
~~~~~~~~~~~~~~~~~~~

Condition variables facilitate communication between threads. The
communication available is a semantic-free notification whose context
must be supplied by the programmer. Conditions are closely associated
with a single monitor.

The association between a condition and a monitor is established when a
condition variable is created, and the association persists for the life
of the condition variable. In addition, a static association exists
between the condition and some data within the monitor. This data is
what will be manipulated by the program under the protection of the
monitor. A thread may wait on notification of a condition that signals
changes in the state of the associated data. Other threads may notify
the condition when changes occur.

Condition variables are always monitored. The relevant operations on
conditions are always performed from within the monitor. They are used
to communicate changes in the state of the monitored data (though still
preserving the monitored invariant). Condition variables allow one or
more threads to wait for a predetermined condition to exist, and they
allow another thread to notify them when the condition occurs. Condition
variables themselves do not carry the semantics of the state change, but
simply provide a mechanism for indicating that something has changed. It
is the programmer's responsibility to associate a condition with the
state of the data.

A thread may be designed to wait for a particular situation to exist in
some monitored data. Since the nature of the situation is not an
attribute of the condition, the program must test that itself. Since
this testing involves the monitored data, it must be done from within
the monitor. The wait operation atomically exits the monitor and blocks
the calling thread in a waiting condition state. When the thread is
resumed after the wait, it will have reentered the monitor, making
operations on the data safe.

There is a subtle interaction between the thread(s) waiting on a
condition and those notifying it. The notification must take place
within a monitor--the same monitor that protects the data being
manipulated by the notifier. In pseudocode, the sequence looks like
this:

.. code::

   enter(monitor);
   ... manipulate the monitored data
   notify(condition);
   exit(monitor);

Notifications to a condition do not accumulate, nor is it required that
any thread be waiting on a condition when the notification occurs. The
design of the code that waits on a condition must take these facts into
account. Therefore, the pseudocode for the waiting thread might look
like this:

.. code::

   enter(monitor);
   while (!expression) wait(condition);
   ... manipulate monitored data
   exit(monitor);

The need to evaluate the Boolean expression again after rescheduling
from a wait may appear unnecessary, but it is vital to the correct
execution of the program. The notification promotes a thread waiting on
a condition to a ready state. When that thread actually gets scheduled
is determined by the thread scheduler and cannot be predicted. If
multiple threads are actually processing the notifications, one or more
of them could be scheduled ahead of the one explicitly promoted by the
notification. One such thread could enter the monitor, perform the
work indicated by the notification, and exit. In this case the promoted
thread would resume from the wait only to find that there's nothing to do.

For example, suppose the defined rule of a function is that it should
wait until there is an object available and that it should return a
reference to that object. Writing the code as follows could potentially
return a null reference, violating the invariant of the function:

.. code::

   void *dequeue()
   {
      void *db;
      enter(monitor);
      if ((db = delink()) == null)
      {
         wait(condition);
         db = delink();
      }
      exit(monitor);
      return db;
   }

The same function would be more appropriately written as follows:

.. code::

   void *dequeue()
   {
      void *db;
      enter(monitor);
      while ((db = delink()) == null)
         wait(condition);
      exit(monitor);
      return db;
   }

.. note::

   **Caution**: The semantics of :ref:`PR_WaitCondVar` assume that the
   monitor is about to be exited. This assumption implies that the
   monitored invariant must be reinstated before calling
   :ref:`PR_WaitCondVar`. Failure to do this will cause subtle but painful
   bugs.

To modify monitored data safely, a thread must be in the monitor. Since
no other thread may modify or (in most cases) even observe the protected
data from outside the monitor, the thread can safely make any
modifications needed. When the changes have been completed, the thread
notifies the condition associated with the data using
:ref:`PR_NotifyCondVar` and exits the monitor. Logically, each such
notification promotes one thread that was waiting on the condition to a
ready state. An alternate form of notification
(:ref:`PR_NotifyAllCondVar`) promotes all threads waiting on a condition
to the ready state. If no threads were waiting, the notification is a
no-op.

Waiting on a condition variable is an interruptible operation. Another
thread could target the waiting thread and issue a :ref:`PR_Interrupt`,
causing the waiting thread to resume. In such cases the return from the
wait operation indicates a failure and definitively identifies the
cause of the failure as an interrupt.

A call to :ref:`PR_WaitCondVar` may also resume because the interval
specified on the wait call has expired. However, this fact cannot be
unambiguously delivered, so no attempt is made to do so. If the logic of
a program allows for timing of waits on conditions, then the clock must
be treated as part of the monitored data and the amount of time elapsed
re-asserted when the call returns. Philosophically, timeouts should be
treated as explicit notifications, and therefore require the testing of
the monitored data upon resumption.
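
Treating the clock as monitored data might be sketched as follows. The
``wait_until_ready`` function is a hypothetical example built on the
documented :ref:`PR_WaitCondVar` and interval-time interfaces: elapsed
time is recomputed after every resumption, and a timeout is handled like
any other notification by re-testing the predicate:

```c
#include "nspr.h"

/* Wait until *ready becomes true or `timeout` elapses.  Returns the
   final value of the predicate, whatever caused the resumption. */
PRBool wait_until_ready(PRLock *lock, PRCondVar *cv,
                        PRBool *ready, PRIntervalTime timeout)
{
    PR_Lock(lock);
    PRIntervalTime start = PR_IntervalNow();
    while (!*ready) {
        /* The clock is part of the monitored data: re-assert elapsed time. */
        PRIntervalTime elapsed = (PRIntervalTime)(PR_IntervalNow() - start);
        if (elapsed >= timeout)
            break;                              /* timed out: predicate re-tested above */
        PR_WaitCondVar(cv, timeout - elapsed);  /* may resume early or without cause */
    }
    PRBool result = *ready;
    PR_Unlock(lock);
    return result;
}
```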

.. _NSPR_Sample_Code:

NSPR Sample Code
----------------

The documents linked here present two sample programs, including
detailed annotations: ``layer.html`` and ``switch.html``. In addition to
these annotated HTML versions, the same samples are available in pure
source form.