doc: Update removal of RCU-bh/sched update machinery
The RCU-bh update API is now defined in terms of that of RCU-preempt and RCU-sched, so this commit updates the documentation accordingly. In addition, although RCU-sched persists in !PREEMPT kernels, in the PREEMPT case its update API is now defined in terms of that of RCU-preempt, so this commit also updates the documentation accordingly. While in the area, this commit removes the documentation for the now-obsolete synchronize_rcu_mult() and clarifies the Tasks RCU documentation.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

hifive-unleashed-5.1
parent
ea24c125fe
commit
77095901b8
|
@ -1374,8 +1374,7 @@ that is, if the CPU is currently idle.
|
||||||
Accessor Functions</a></h3>
|
Accessor Functions</a></h3>
|
||||||
|
|
||||||
<p>The following listing shows the
|
<p>The following listing shows the
|
||||||
<tt>rcu_get_root()</tt>, <tt>rcu_for_each_node_breadth_first</tt>,
|
<tt>rcu_get_root()</tt>, <tt>rcu_for_each_node_breadth_first</tt> and
|
||||||
<tt>rcu_for_each_nonleaf_node_breadth_first()</tt>, and
|
|
||||||
<tt>rcu_for_each_leaf_node()</tt> function and macros:
|
<tt>rcu_for_each_leaf_node()</tt> function and macros:
|
||||||
|
|
||||||
<pre>
|
<pre>
|
||||||
|
@ -1388,13 +1387,9 @@ Accessor Functions</a></h3>
|
||||||
7 for ((rnp) = &(rsp)->node[0]; \
|
7 for ((rnp) = &(rsp)->node[0]; \
|
||||||
8 (rnp) < &(rsp)->node[NUM_RCU_NODES]; (rnp)++)
|
8 (rnp) < &(rsp)->node[NUM_RCU_NODES]; (rnp)++)
|
||||||
9
|
9
|
||||||
10 #define rcu_for_each_nonleaf_node_breadth_first(rsp, rnp) \
|
10 #define rcu_for_each_leaf_node(rsp, rnp) \
|
||||||
11 for ((rnp) = &(rsp)->node[0]; \
|
11 for ((rnp) = (rsp)->level[NUM_RCU_LVLS - 1]; \
|
||||||
12 (rnp) < (rsp)->level[NUM_RCU_LVLS - 1]; (rnp)++)
|
12 (rnp) < &(rsp)->node[NUM_RCU_NODES]; (rnp)++)
|
||||||
13
|
|
||||||
14 #define rcu_for_each_leaf_node(rsp, rnp) \
|
|
||||||
15 for ((rnp) = (rsp)->level[NUM_RCU_LVLS - 1]; \
|
|
||||||
16 (rnp) < &(rsp)->node[NUM_RCU_NODES]; (rnp)++)
|
|
||||||
</pre>
|
</pre>
|
||||||
|
|
||||||
<p>The <tt>rcu_get_root()</tt> simply returns a pointer to the
|
<p>The <tt>rcu_get_root()</tt> simply returns a pointer to the
|
||||||
|
@ -1407,10 +1402,7 @@ macro takes advantage of the layout of the <tt>rcu_node</tt>
|
||||||
structures in the <tt>rcu_state</tt> structure's
|
structures in the <tt>rcu_state</tt> structure's
|
||||||
<tt>->node[]</tt> array, performing a breadth-first traversal by
|
<tt>->node[]</tt> array, performing a breadth-first traversal by
|
||||||
simply traversing the array in order.
|
simply traversing the array in order.
|
||||||
The <tt>rcu_for_each_nonleaf_node_breadth_first()</tt> macro operates
|
Similarly, the <tt>rcu_for_each_leaf_node()</tt> macro traverses only
|
||||||
similarly, but traverses only the first part of the array, thus excluding
|
|
||||||
the leaf <tt>rcu_node</tt> structures.
|
|
||||||
Finally, the <tt>rcu_for_each_leaf_node()</tt> macro traverses only
|
|
||||||
the last part of the array, thus traversing only the leaf
|
the last part of the array, thus traversing only the leaf
|
||||||
<tt>rcu_node</tt> structures.
|
<tt>rcu_node</tt> structures.
|
||||||
|
|
||||||
|
@ -1418,15 +1410,14 @@ the last part of the array, thus traversing only the leaf
|
||||||
<tr><th> </th></tr>
|
<tr><th> </th></tr>
|
||||||
<tr><th align="left">Quick Quiz:</th></tr>
|
<tr><th align="left">Quick Quiz:</th></tr>
|
||||||
<tr><td>
|
<tr><td>
|
||||||
What do <tt>rcu_for_each_nonleaf_node_breadth_first()</tt> and
|
What does
|
||||||
<tt>rcu_for_each_leaf_node()</tt> do if the <tt>rcu_node</tt> tree
|
<tt>rcu_for_each_leaf_node()</tt> do if the <tt>rcu_node</tt> tree
|
||||||
contains only a single node?
|
contains only a single node?
|
||||||
</td></tr>
|
</td></tr>
|
||||||
<tr><th align="left">Answer:</th></tr>
|
<tr><th align="left">Answer:</th></tr>
|
||||||
<tr><td bgcolor="#ffffff"><font color="ffffff">
|
<tr><td bgcolor="#ffffff"><font color="ffffff">
|
||||||
In the single-node case,
|
In the single-node case,
|
||||||
<tt>rcu_for_each_nonleaf_node_breadth_first()</tt> is a no-op
|
<tt>rcu_for_each_leaf_node()</tt> traverses the single node.
|
||||||
and <tt>rcu_for_each_leaf_node()</tt> traverses the single node.
|
|
||||||
</font></td></tr>
|
</font></td></tr>
|
||||||
<tr><td> </td></tr>
|
<tr><td> </td></tr>
|
||||||
</table>
|
</table>
|
||||||
|
|
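The array-layout argument in the hunks above (breadth-first order means the leaves occupy the tail of `->node[]`) can be checked with a small user-space sketch. This is not kernel code: the two macros are copied from the listing, but the `rcu_node` and `rcu_state` structures are reduced to hypothetical stand-ins, the tree shape (one root plus four leaves) is an assumption chosen for illustration, and the `toy_*` helpers are invented names.

```c
#include <assert.h>

/* Assumed toy geometry: a root level plus one leaf level. */
#define NUM_RCU_LVLS   2
#define NUM_RCU_NODES  5   /* 1 root + 4 leaves */

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct rcu_node {
	int level;   /* 0 for the root, NUM_RCU_LVLS - 1 for a leaf */
};

struct rcu_state {
	struct rcu_node node[NUM_RCU_NODES];    /* stored in breadth-first order */
	struct rcu_node *level[NUM_RCU_LVLS];   /* first node of each level */
};

/* The two traversal macros that remain after this commit, as in the listing. */
#define rcu_for_each_node_breadth_first(rsp, rnp) \
	for ((rnp) = &(rsp)->node[0]; \
	     (rnp) < &(rsp)->node[NUM_RCU_NODES]; (rnp)++)

#define rcu_for_each_leaf_node(rsp, rnp) \
	for ((rnp) = (rsp)->level[NUM_RCU_LVLS - 1]; \
	     (rnp) < &(rsp)->node[NUM_RCU_NODES]; (rnp)++)

/* Lay out the toy tree: node[0] is the root, node[1..4] are the leaves. */
static void toy_rcu_init(struct rcu_state *rsp)
{
	int i;

	rsp->level[0] = &rsp->node[0];
	rsp->level[1] = &rsp->node[1];
	for (i = 0; i < NUM_RCU_NODES; i++)
		rsp->node[i].level = (i == 0) ? 0 : 1;
}

/* Count every node by walking the array in order. */
static int toy_count_all(struct rcu_state *rsp)
{
	struct rcu_node *rnp;
	int n = 0;

	rcu_for_each_node_breadth_first(rsp, rnp)
		n++;
	return n;
}

/* Count only the leaves, checking that no non-leaf node is visited. */
static int toy_count_leaves(struct rcu_state *rsp)
{
	struct rcu_node *rnp;
	int n = 0;

	rcu_for_each_leaf_node(rsp, rnp) {
		assert(rnp->level == NUM_RCU_LVLS - 1);
		n++;
	}
	return n;
}
```

After calling `toy_rcu_init()` on a `struct rcu_state`, `toy_count_all()` walks all five nodes while `toy_count_leaves()` starts at `level[NUM_RCU_LVLS - 1]` and therefore visits only the four trailing leaf entries.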
|
@ -12,10 +12,9 @@ high efficiency and minimal disturbance, expedited grace periods accept
|
||||||
lower efficiency and significant disturbance to attain shorter latencies.
|
lower efficiency and significant disturbance to attain shorter latencies.
|
||||||
|
|
||||||
<p>
|
<p>
|
||||||
There are three flavors of RCU (RCU-bh, RCU-preempt, and RCU-sched),
|
There are two flavors of RCU (RCU-preempt and RCU-sched), with an earlier
|
||||||
but only two flavors of expedited grace periods because the RCU-bh
|
third RCU-bh flavor having been implemented in terms of the other two.
|
||||||
expedited grace period maps onto the RCU-sched expedited grace period.
|
Each of the two implementations is covered in its own section.
|
||||||
Each of the remaining two implementations is covered in its own section.
|
|
||||||
|
|
||||||
<ol>
|
<ol>
|
||||||
<li> <a href="#Expedited Grace Period Design">
|
<li> <a href="#Expedited Grace Period Design">
|
||||||
|
|
|
@ -1306,8 +1306,6 @@ doing so would degrade real-time response.
|
||||||
|
|
||||||
<p>
|
<p>
|
||||||
This non-requirement appeared with preemptible RCU.
|
This non-requirement appeared with preemptible RCU.
|
||||||
If you need a grace period that waits on non-preemptible code regions, use
|
|
||||||
<a href="#Sched Flavor">RCU-sched</a>.
|
|
||||||
|
|
||||||
<h2><a name="Parallelism Facts of Life">Parallelism Facts of Life</a></h2>
|
<h2><a name="Parallelism Facts of Life">Parallelism Facts of Life</a></h2>
|
||||||
|
|
||||||
|
@ -2165,14 +2163,9 @@ however, this is not a panacea because there would be severe restrictions
|
||||||
on what operations those callbacks could invoke.
|
on what operations those callbacks could invoke.
|
||||||
|
|
||||||
<p>
|
<p>
|
||||||
Perhaps surprisingly, <tt>synchronize_rcu()</tt>,
|
Perhaps surprisingly, <tt>synchronize_rcu()</tt> and
|
||||||
<a href="#Bottom-Half Flavor"><tt>synchronize_rcu_bh()</tt></a>
|
|
||||||
(<a href="#Bottom-Half Flavor">discussed below</a>),
|
|
||||||
<a href="#Sched Flavor"><tt>synchronize_sched()</tt></a>,
|
|
||||||
<tt>synchronize_rcu_expedited()</tt>,
|
<tt>synchronize_rcu_expedited()</tt>,
|
||||||
<tt>synchronize_rcu_bh_expedited()</tt>, and
|
will operate normally
|
||||||
<tt>synchronize_sched_expedited()</tt>
|
|
||||||
will all operate normally
|
|
||||||
during very early boot, the reason being that there is only one CPU
|
during very early boot, the reason being that there is only one CPU
|
||||||
and preemption is disabled.
|
and preemption is disabled.
|
||||||
This means that the call <tt>synchronize_rcu()</tt> (or friends)
|
This means that the call <tt>synchronize_rcu()</tt> (or friends)
|
||||||
|
@ -2861,15 +2854,22 @@ The other four flavors are listed below, with requirements for each
|
||||||
described in a separate section.
|
described in a separate section.
|
||||||
|
|
||||||
<ol>
|
<ol>
|
||||||
<li> <a href="#Bottom-Half Flavor">Bottom-Half Flavor</a>
|
<li> <a href="#Bottom-Half Flavor">Bottom-Half Flavor (Historical)</a>
|
||||||
<li> <a href="#Sched Flavor">Sched Flavor</a>
|
<li> <a href="#Sched Flavor">Sched Flavor (Historical)</a>
|
||||||
<li> <a href="#Sleepable RCU">Sleepable RCU</a>
|
<li> <a href="#Sleepable RCU">Sleepable RCU</a>
|
||||||
<li> <a href="#Tasks RCU">Tasks RCU</a>
|
<li> <a href="#Tasks RCU">Tasks RCU</a>
|
||||||
<li> <a href="#Waiting for Multiple Grace Periods">
|
|
||||||
Waiting for Multiple Grace Periods</a>
|
|
||||||
</ol>
|
</ol>
|
||||||
|
|
||||||
<h3><a name="Bottom-Half Flavor">Bottom-Half Flavor</a></h3>
|
<h3><a name="Bottom-Half Flavor">Bottom-Half Flavor (Historical)</a></h3>
|
||||||
|
|
||||||
|
<p>
|
||||||
|
The RCU-bh flavor of RCU has since been expressed in terms of
|
||||||
|
the other RCU flavors as part of a consolidation of the three
|
||||||
|
flavors into a single flavor.
|
||||||
|
The read-side API remains, and continues to disable softirq and to
|
||||||
|
be accounted for by lockdep.
|
||||||
|
Much of the material in this section is therefore strictly historical
|
||||||
|
in nature.
|
||||||
|
|
||||||
<p>
|
<p>
|
||||||
The softirq-disable (AKA “bottom-half”,
|
The softirq-disable (AKA “bottom-half”,
|
||||||
|
@ -2929,8 +2929,20 @@ includes
|
||||||
<tt>call_rcu_bh()</tt>,
|
<tt>call_rcu_bh()</tt>,
|
||||||
<tt>rcu_barrier_bh()</tt>, and
|
<tt>rcu_barrier_bh()</tt>, and
|
||||||
<tt>rcu_read_lock_bh_held()</tt>.
|
<tt>rcu_read_lock_bh_held()</tt>.
|
||||||
|
However, the update-side APIs are now simple wrappers for other RCU
|
||||||
|
flavors, namely RCU-sched in CONFIG_PREEMPT=n kernels and RCU-preempt
|
||||||
|
otherwise.
|
||||||
|
|
||||||
<h3><a name="Sched Flavor">Sched Flavor</a></h3>
|
<h3><a name="Sched Flavor">Sched Flavor (Historical)</a></h3>
|
||||||
|
|
||||||
|
<p>
|
||||||
|
The RCU-sched flavor of RCU has since been expressed in terms of
|
||||||
|
the other RCU flavors as part of a consolidation of the three
|
||||||
|
flavors into a single flavor.
|
||||||
|
The read-side API remains, and continues to disable preemption and to
|
||||||
|
be accounted for by lockdep.
|
||||||
|
Much of the material in this section is therefore strictly historical
|
||||||
|
in nature.
|
||||||
|
|
||||||
<p>
|
<p>
|
||||||
Before preemptible RCU, waiting for an RCU grace period had the
|
Before preemptible RCU, waiting for an RCU grace period had the
|
||||||
|
@ -3150,94 +3162,14 @@ The tasks-RCU API is quite compact, consisting only of
|
||||||
<tt>call_rcu_tasks()</tt>,
|
<tt>call_rcu_tasks()</tt>,
|
||||||
<tt>synchronize_rcu_tasks()</tt>, and
|
<tt>synchronize_rcu_tasks()</tt>, and
|
||||||
<tt>rcu_barrier_tasks()</tt>.
|
<tt>rcu_barrier_tasks()</tt>.
|
||||||
|
In <tt>CONFIG_PREEMPT=n</tt> kernels, trampolines cannot be preempted,
|
||||||
<h3><a name="Waiting for Multiple Grace Periods">
|
so these APIs map to
|
||||||
Waiting for Multiple Grace Periods</a></h3>
|
<tt>call_rcu()</tt>,
|
||||||
|
<tt>synchronize_rcu()</tt>, and
|
||||||
<p>
|
<tt>rcu_barrier()</tt>, respectively.
|
||||||
Perhaps you have an RCU protected data structure that is accessed from
|
In <tt>CONFIG_PREEMPT=y</tt> kernels, trampolines can be preempted,
|
||||||
RCU read-side critical sections, from softirq handlers, and from
|
and these three APIs are therefore implemented by separate functions
|
||||||
hardware interrupt handlers.
|
that check for voluntary context switches.
|
||||||
That is three flavors of RCU, the normal flavor, the bottom-half flavor,
|
|
||||||
and the sched flavor.
|
|
||||||
How to wait for a compound grace period?
|
|
||||||
|
|
||||||
<p>
|
|
||||||
The best approach is usually to “just say no!” and
|
|
||||||
insert <tt>rcu_read_lock()</tt> and <tt>rcu_read_unlock()</tt>
|
|
||||||
around each RCU read-side critical section, regardless of what
|
|
||||||
environment it happens to be in.
|
|
||||||
But suppose that some of the RCU read-side critical sections are
|
|
||||||
on extremely hot code paths, and that use of <tt>CONFIG_PREEMPT=n</tt>
|
|
||||||
is not a viable option, so that <tt>rcu_read_lock()</tt> and
|
|
||||||
<tt>rcu_read_unlock()</tt> are not free.
|
|
||||||
What then?
|
|
||||||
|
|
||||||
<p>
|
|
||||||
You <i>could</i> wait on all three grace periods in succession, as follows:
|
|
||||||
|
|
||||||
<blockquote>
|
|
||||||
<pre>
|
|
||||||
1 synchronize_rcu();
|
|
||||||
2 synchronize_rcu_bh();
|
|
||||||
3 synchronize_sched();
|
|
||||||
</pre>
|
|
||||||
</blockquote>
|
|
||||||
|
|
||||||
<p>
|
|
||||||
This works, but triples the update-side latency penalty.
|
|
||||||
In cases where this is not acceptable, <tt>synchronize_rcu_mult()</tt>
|
|
||||||
may be used to wait on all three flavors of grace period concurrently:
|
|
||||||
|
|
||||||
<blockquote>
|
|
||||||
<pre>
|
|
||||||
1 synchronize_rcu_mult(call_rcu, call_rcu_bh, call_rcu_sched);
|
|
||||||
</pre>
|
|
||||||
</blockquote>
|
|
||||||
|
|
||||||
<p>
|
|
||||||
But what if it is necessary to also wait on SRCU?
|
|
||||||
This can be done as follows:
|
|
||||||
|
|
||||||
<blockquote>
|
|
||||||
<pre>
|
|
||||||
1 static void call_my_srcu(struct rcu_head *head,
|
|
||||||
2 void (*func)(struct rcu_head *head))
|
|
||||||
3 {
|
|
||||||
4 call_srcu(&my_srcu, head, func);
|
|
||||||
5 }
|
|
||||||
6
|
|
||||||
7 synchronize_rcu_mult(call_rcu, call_rcu_bh, call_rcu_sched, call_my_srcu);
|
|
||||||
</pre>
|
|
||||||
</blockquote>
|
|
||||||
|
|
||||||
<p>
|
|
||||||
If you needed to wait on multiple different flavors of SRCU
|
|
||||||
(but why???), you would need to create a wrapper function resembling
|
|
||||||
<tt>call_my_srcu()</tt> for each SRCU flavor.
|
|
||||||
|
|
||||||
<table>
|
|
||||||
<tr><th> </th></tr>
|
|
||||||
<tr><th align="left">Quick Quiz:</th></tr>
|
|
||||||
<tr><td>
|
|
||||||
But what if I need to wait for multiple RCU flavors, but I also need
|
|
||||||
the grace periods to be expedited?
|
|
||||||
</td></tr>
|
|
||||||
<tr><th align="left">Answer:</th></tr>
|
|
||||||
<tr><td bgcolor="#ffffff"><font color="ffffff">
|
|
||||||
If you are using expedited grace periods, there should be less penalty
|
|
||||||
for waiting on them in succession.
|
|
||||||
But if that is nevertheless a problem, you can use workqueues
|
|
||||||
or multiple kthreads to wait on the various expedited grace
|
|
||||||
periods concurrently.
|
|
||||||
</font></td></tr>
|
|
||||||
<tr><td> </td></tr>
|
|
||||||
</table>
|
|
||||||
|
|
||||||
<p>
|
|
||||||
Again, it is usually better to adjust the RCU read-side critical sections
|
|
||||||
to use a single flavor of RCU, but when this is not feasible, you can use
|
|
||||||
<tt>synchronize_rcu_mult()</tt>.
|
|
||||||
|
|
||||||
<h2><a name="Possible Future Changes">Possible Future Changes</a></h2>
|
<h2><a name="Possible Future Changes">Possible Future Changes</a></h2>
|
||||||
|
|
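The new Tasks RCU text in the hunk above says that in `CONFIG_PREEMPT=n` kernels the three Tasks RCU APIs map to `call_rcu()`, `synchronize_rcu()`, and `rcu_barrier()`. One way such a mapping could be expressed is with preprocessor aliases, sketched below in user-space C. Everything here is an assumption made for illustration: the `TOY_CONFIG_PREEMPT` switch, the counters, and the stub bodies are hypothetical, the stubs do not wait for any grace period, and the kernel's real headers may arrange this differently.

```c
#include <assert.h>

/* Minimal stand-in for the kernel's callback head. */
struct rcu_head {
	struct rcu_head *next;
	void (*func)(struct rcu_head *head);
};
typedef void (*rcu_callback_t)(struct rcu_head *head);

/* Toy counters recording which family of functions was actually invoked. */
int toy_rcu_calls;
int toy_tasks_calls;

/* Toy stubs: a real implementation would wait for a grace period. */
static void call_rcu(struct rcu_head *head, rcu_callback_t func)
{
	toy_rcu_calls++;
	func(head);   /* toy: invoke immediately */
}
static void synchronize_rcu(void) { toy_rcu_calls++; }
static void rcu_barrier(void)     { toy_rcu_calls++; }

#ifdef TOY_CONFIG_PREEMPT
/* Preemptible case: separate functions (here only counted separately). */
static void call_rcu_tasks(struct rcu_head *head, rcu_callback_t func)
{
	toy_tasks_calls++;
	func(head);
}
static void synchronize_rcu_tasks(void) { toy_tasks_calls++; }
static void rcu_barrier_tasks(void)     { toy_tasks_calls++; }
#else
/* Non-preemptible case: the Tasks RCU names are aliases for vanilla RCU. */
#define call_rcu_tasks        call_rcu
#define synchronize_rcu_tasks synchronize_rcu
#define rcu_barrier_tasks     rcu_barrier
#endif

static void toy_cb(struct rcu_head *head)
{
	(void)head;   /* nothing to free in this sketch */
}

/* Exercise all three Tasks RCU entry points; return the vanilla-RCU count. */
static int toy_demo_rcu_calls(void)
{
	struct rcu_head rh = { 0 };

	call_rcu_tasks(&rh, toy_cb);
	synchronize_rcu_tasks();
	rcu_barrier_tasks();
	return toy_rcu_calls;
}
```

Built without `TOY_CONFIG_PREEMPT`, all three Tasks RCU calls land in the vanilla RCU stubs, which is the behavior the documentation describes for non-preemptible kernels.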
||||||
|
@ -3248,12 +3180,6 @@ If this becomes a serious problem, it will be necessary to rework the
|
||||||
grace-period state machine so as to avoid the need for the additional
|
grace-period state machine so as to avoid the need for the additional
|
||||||
latency.
|
latency.
|
||||||
|
|
||||||
<p>
|
|
||||||
Expedited grace periods scan the CPUs, so their latency and overhead
|
|
||||||
increases with increasing numbers of CPUs.
|
|
||||||
If this becomes a serious problem on large systems, it will be necessary
|
|
||||||
to do some redesign to avoid this scalability problem.
|
|
||||||
|
|
||||||
<p>
|
<p>
|
||||||
RCU disables CPU hotplug in a few places, perhaps most notably in the
|
RCU disables CPU hotplug in a few places, perhaps most notably in the
|
||||||
<tt>rcu_barrier()</tt> operations.
|
<tt>rcu_barrier()</tt> operations.
|
||||||
|
@ -3298,11 +3224,6 @@ Please note that arrangements that require RCU to remap CPU numbers will
|
||||||
require extremely good demonstration of need and full exploration of
|
require extremely good demonstration of need and full exploration of
|
||||||
alternatives.
|
alternatives.
|
||||||
|
|
||||||
<p>
|
|
||||||
There is an embarrassingly large number of flavors of RCU, and this
|
|
||||||
number has been increasing over time.
|
|
||||||
Perhaps it will be possible to combine some at some future date.
|
|
||||||
|
|
||||||
<p>
|
<p>
|
||||||
RCU's various kthreads are reasonably recent additions.
|
RCU's various kthreads are reasonably recent additions.
|
||||||
It is quite likely that adjustments will be required to more gracefully
|
It is quite likely that adjustments will be required to more gracefully
|
||||||
|
|
|
@ -16,12 +16,9 @@ o A CPU looping in an RCU read-side critical section.
|
||||||
|
|
||||||
o A CPU looping with interrupts disabled.
|
o A CPU looping with interrupts disabled.
|
||||||
|
|
||||||
o A CPU looping with preemption disabled. This condition can
|
o A CPU looping with preemption disabled.
|
||||||
result in RCU-sched stalls and, if ksoftirqd is in use, RCU-bh
|
|
||||||
stalls.
|
|
||||||
|
|
||||||
o A CPU looping with bottom halves disabled. This condition can
|
o A CPU looping with bottom halves disabled.
|
||||||
result in RCU-sched and RCU-bh stalls.
|
|
||||||
|
|
||||||
o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel
|
o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel
|
||||||
without invoking schedule(). If the looping in the kernel is
|
without invoking schedule(). If the looping in the kernel is
|
||||||
|
@ -87,9 +84,9 @@ o A hardware failure. This is quite unlikely, but has occurred
|
||||||
This resulted in a series of RCU CPU stall warnings, eventually
|
This resulted in a series of RCU CPU stall warnings, eventually
|
||||||
leading the realization that the CPU had failed.
|
leading the realization that the CPU had failed.
|
||||||
|
|
||||||
The RCU, RCU-sched, RCU-bh, and RCU-tasks implementations have CPU stall
|
The RCU, RCU-sched, and RCU-tasks implementations have CPU stall warning.
|
||||||
warning. Note that SRCU does -not- have CPU stall warnings. Please note
|
Note that SRCU does -not- have CPU stall warnings. Please note that
|
||||||
that RCU only detects CPU stalls when there is a grace period in progress.
|
RCU only detects CPU stalls when there is a grace period in progress.
|
||||||
No grace period, no CPU stall warnings.
|
No grace period, no CPU stall warnings.
|
||||||
|
|
||||||
To diagnose the cause of the stall, inspect the stack traces.
|
To diagnose the cause of the stall, inspect the stack traces.
|
||||||
|
|
|
@ -934,7 +934,8 @@ c. Do you need to treat NMI handlers, hardirq handlers,
|
||||||
d. Do you need RCU grace periods to complete even in the face
|
d. Do you need RCU grace periods to complete even in the face
|
||||||
of softirq monopolization of one or more of the CPUs? For
|
of softirq monopolization of one or more of the CPUs? For
|
||||||
example, is your code subject to network-based denial-of-service
|
example, is your code subject to network-based denial-of-service
|
||||||
attacks? If so, you need RCU-bh.
|
attacks? If so, you should disable softirq across your readers,
|
||||||
|
for example, by using rcu_read_lock_bh().
|
||||||
|
|
||||||
e. Is your workload too update-intensive for normal use of
|
e. Is your workload too update-intensive for normal use of
|
||||||
RCU, but inappropriate for other synchronization mechanisms?
|
RCU, but inappropriate for other synchronization mechanisms?
|
||||||
|
|
|
@ -3534,14 +3534,14 @@
|
||||||
|
|
||||||
In kernels built with CONFIG_RCU_NOCB_CPU=y, set
|
In kernels built with CONFIG_RCU_NOCB_CPU=y, set
|
||||||
the specified list of CPUs to be no-callback CPUs.
|
the specified list of CPUs to be no-callback CPUs.
|
||||||
Invocation of these CPUs' RCU callbacks will
|
Invocation of these CPUs' RCU callbacks will be
|
||||||
be offloaded to "rcuox/N" kthreads created for
|
offloaded to "rcuox/N" kthreads created for that
|
||||||
that purpose, where "x" is "b" for RCU-bh, "p"
|
purpose, where "x" is "p" for RCU-preempt, and
|
||||||
for RCU-preempt, and "s" for RCU-sched, and "N"
|
"s" for RCU-sched, and "N" is the CPU number.
|
||||||
is the CPU number. This reduces OS jitter on the
|
This reduces OS jitter on the offloaded CPUs,
|
||||||
offloaded CPUs, which can be useful for HPC and
|
which can be useful for HPC and real-time
|
||||||
real-time workloads. It can also improve energy
|
workloads. It can also improve energy efficiency
|
||||||
efficiency for asymmetric multiprocessors.
|
for asymmetric multiprocessors.
|
||||||
|
|
||||||
rcu_nocb_poll [KNL]
|
rcu_nocb_poll [KNL]
|
||||||
Rather than requiring that offloaded CPUs
|
Rather than requiring that offloaded CPUs
|
||||||
|
|
|
@ -321,7 +321,7 @@ To reduce its OS jitter, do at least one of the following:
|
||||||
to do.
|
to do.
|
||||||
|
|
||||||
Name:
|
Name:
|
||||||
rcuob/%d, rcuop/%d, and rcuos/%d
|
rcuop/%d and rcuos/%d
|
||||||
|
|
||||||
Purpose:
|
Purpose:
|
||||||
Offload RCU callbacks from the corresponding CPU.
|
Offload RCU callbacks from the corresponding CPU.
|
||||||
|
|