1
0
Fork 0

Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull RCU updates from Ingo Molnar:
 "This cycles's RCU changes include:

   - a couple of straggling RCU flavor consolidation updates

   - SRCU updates

   - RCU CPU stall-warning updates

   - torture-test updates

   - an LKMM commit adding support for synchronize_srcu_expedited()

   - documentation updates

   - miscellaneous fixes"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (57 commits)
  net/ipv4/netfilter: Update comment from call_rcu_bh() to call_rcu()
  tools/memory-model: Add support for synchronize_srcu_expedited()
  doc/kprobes: Update obsolete RCU update functions
  torture: Suppress false-positive CONFIG_INITRAMFS_SOURCE complaint
  locktorture: NULL cxt.lwsa and cxt.lrsa to allow bad-arg detection
  rcuperf: Fix cleanup path for invalid perf_type strings
  rcutorture: Fix cleanup path for invalid torture_type strings
  rcutorture: Fix expected forward progress duration in OOM notifier
  rcutorture: Remove ->ext_irq_conflict field
  rcutorture: Make rcutorture_extend_mask() comment match the code
  tools/.../rcutorture: Convert to SPDX license identifier
  torture: Don't try to offline the last CPU
  rcu: Fix nohz status in stall warning
  rcu: Move forward-progress checkers into tree_stall.h
  rcu: Move irq-disabled stall-warning checking to tree_stall.h
  rcu: Organize functions in tree_stall.h
  rcu: Move FAST_NO_HZ stall-warning code to tree_stall.h
  rcu: Inline RCU stall-warning info helper functions
  rcu: Move rcu_print_task_exp_stall() to tree_exp.h
  rcu: Inline RCU task stall-warning helper functions
  ...
hifive-unleashed-5.2
Linus Torvalds 2019-05-06 12:04:02 -07:00
commit 5ba2a4b12f
61 changed files with 1373 additions and 1457 deletions

View File

@ -155,8 +155,7 @@ keeping lock contention under control at all tree levels regardless
of the level of loading on the system.
</p><p>RCU updaters wait for normal grace periods by registering
RCU callbacks, either directly via <tt>call_rcu()</tt> and
friends (namely <tt>call_rcu_bh()</tt> and <tt>call_rcu_sched()</tt>),
RCU callbacks, either directly via <tt>call_rcu()</tt>
or indirectly via <tt>synchronize_rcu()</tt> and friends.
RCU callbacks are represented by <tt>rcu_head</tt> structures,
which are queued on <tt>rcu_data</tt> structures while they are

View File

@ -56,6 +56,7 @@ sections.
RCU-preempt Expedited Grace Periods</a></h2>
<p>
<tt>CONFIG_PREEMPT=y</tt> kernels implement RCU-preempt.
The overall flow of the handling of a given CPU by an RCU-preempt
expedited grace period is shown in the following diagram:
@ -139,6 +140,7 @@ or offline, among other things.
RCU-sched Expedited Grace Periods</a></h2>
<p>
<tt>CONFIG_PREEMPT=n</tt> kernels implement RCU-sched.
The overall flow of the handling of a given CPU by an RCU-sched
expedited grace period is shown in the following diagram:
@ -146,7 +148,7 @@ expedited grace period is shown in the following diagram:
<p>
As with RCU-preempt, RCU-sched's
<tt>synchronize_sched_expedited()</tt> ignores offline and
<tt>synchronize_rcu_expedited()</tt> ignores offline and
idle CPUs, again because they are in remotely detectable
quiescent states.
However, because the

View File

@ -34,12 +34,11 @@ Similarly, any code that happens before the beginning of a given RCU grace
period is guaranteed to see the effects of all accesses following the end
of that grace period that are within RCU read-side critical sections.
<p>This guarantee is particularly pervasive for <tt>synchronize_sched()</tt>,
for which RCU-sched read-side critical sections include any region
<p>Note well that RCU-sched read-side critical sections include any region
of code for which preemption is disabled.
Given that each individual machine instruction can be thought of as
an extremely small region of preemption-disabled code, one can think of
<tt>synchronize_sched()</tt> as <tt>smp_mb()</tt> on steroids.
<tt>synchronize_rcu()</tt> as <tt>smp_mb()</tt> on steroids.
<p>RCU updaters use this guarantee by splitting their updates into
two phases, one of which is executed before the grace period and

View File

@ -81,18 +81,19 @@ currently executing on some other CPU. We therefore cannot free
up any data structures used by the old NMI handler until execution
of it completes on all other CPUs.
One way to accomplish this is via synchronize_sched(), perhaps as
One way to accomplish this is via synchronize_rcu(), perhaps as
follows:
unset_nmi_callback();
synchronize_sched();
synchronize_rcu();
kfree(my_nmi_data);
This works because synchronize_sched() blocks until all CPUs complete
any preemption-disabled segments of code that they were executing.
Since NMI handlers disable preemption, synchronize_sched() is guaranteed
This works because (as of v4.20) synchronize_rcu() blocks until all
CPUs complete any preemption-disabled segments of code that they were
executing.
Since NMI handlers disable preemption, synchronize_rcu() is guaranteed
not to return until all ongoing NMI handlers exit. It is therefore safe
to free up the handler's data as soon as synchronize_sched() returns.
to free up the handler's data as soon as synchronize_rcu() returns.
Important note: for this to work, the architecture in question must
invoke nmi_enter() and nmi_exit() on NMI entry and exit, respectively.

View File

@ -86,10 +86,8 @@ even on a UP system. So do not do it! Even on a UP system, the RCU
infrastructure -must- respect grace periods, and -must- invoke callbacks
from a known environment in which no locks are held.
It -is- safe for synchronize_sched() and synchronize_rcu_bh() to return
immediately on an UP system. It is also safe for synchronize_rcu()
to return immediately on UP systems, except when running preemptable
RCU.
Note that it -is- safe for synchronize_rcu() to return immediately on
UP systems, including !PREEMPT SMP builds running on UP systems.
Quick Quiz #3: Why can't synchronize_rcu() return immediately on
UP systems running preemptable RCU?

View File

@ -182,16 +182,13 @@ over a rather long period of time, but improvements are always welcome!
when publicizing a pointer to a structure that can
be traversed by an RCU read-side critical section.
5. If call_rcu(), or a related primitive such as call_rcu_bh(),
call_rcu_sched(), or call_srcu() is used, the callback function
will be called from softirq context. In particular, it cannot
block.
5. If call_rcu() or call_srcu() is used, the callback function will
be called from softirq context. In particular, it cannot block.
6. Since synchronize_rcu() can block, it cannot be called from
any sort of irq context. The same rule applies for
synchronize_rcu_bh(), synchronize_sched(), synchronize_srcu(),
synchronize_rcu_expedited(), synchronize_rcu_bh_expedited(),
synchronize_sched_expedite(), and synchronize_srcu_expedited().
6. Since synchronize_rcu() can block, it cannot be called
from any sort of irq context. The same rule applies
for synchronize_srcu(), synchronize_rcu_expedited(), and
synchronize_srcu_expedited().
The expedited forms of these primitives have the same semantics
as the non-expedited forms, but expediting is both expensive and
@ -212,20 +209,20 @@ over a rather long period of time, but improvements are always welcome!
of the system, especially to real-time workloads running on
the rest of the system.
7. If the updater uses call_rcu() or synchronize_rcu(), then the
corresponding readers must use rcu_read_lock() and
rcu_read_unlock(). If the updater uses call_rcu_bh() or
synchronize_rcu_bh(), then the corresponding readers must
use rcu_read_lock_bh() and rcu_read_unlock_bh(). If the
updater uses call_rcu_sched() or synchronize_sched(), then
the corresponding readers must disable preemption, possibly
by calling rcu_read_lock_sched() and rcu_read_unlock_sched().
If the updater uses synchronize_srcu() or call_srcu(), then
the corresponding readers must use srcu_read_lock() and
7. As of v4.20, a given kernel implements only one RCU flavor,
which is RCU-sched for PREEMPT=n and RCU-preempt for PREEMPT=y.
If the updater uses call_rcu() or synchronize_rcu(),
then the corresponding readers my use rcu_read_lock() and
rcu_read_unlock(), rcu_read_lock_bh() and rcu_read_unlock_bh(),
or any pair of primitives that disables and re-enables preemption,
for example, rcu_read_lock_sched() and rcu_read_unlock_sched().
If the updater uses synchronize_srcu() or call_srcu(),
then the corresponding readers must use srcu_read_lock() and
srcu_read_unlock(), and with the same srcu_struct. The rules for
the expedited primitives are the same as for their non-expedited
counterparts. Mixing things up will result in confusion and
broken kernels.
broken kernels, and has even resulted in an exploitable security
issue.
One exception to this rule: rcu_read_lock() and rcu_read_unlock()
may be substituted for rcu_read_lock_bh() and rcu_read_unlock_bh()
@ -288,8 +285,7 @@ over a rather long period of time, but improvements are always welcome!
d. Periodically invoke synchronize_rcu(), permitting a limited
number of updates per grace period.
The same cautions apply to call_rcu_bh(), call_rcu_sched(),
call_srcu(), and kfree_rcu().
The same cautions apply to call_srcu() and kfree_rcu().
Note that although these primitives do take action to avoid memory
exhaustion when any given CPU has too many callbacks, a determined
@ -322,7 +318,7 @@ over a rather long period of time, but improvements are always welcome!
11. Any lock acquired by an RCU callback must be acquired elsewhere
with softirq disabled, e.g., via spin_lock_irqsave(),
spin_lock_bh(), etc. Failing to disable irq on a given
spin_lock_bh(), etc. Failing to disable softirq on a given
acquisition of that lock will result in deadlock as soon as
the RCU softirq handler happens to run your RCU callback while
interrupting that acquisition's critical section.
@ -335,13 +331,16 @@ over a rather long period of time, but improvements are always welcome!
must use whatever locking or other synchronization is required
to safely access and/or modify that data structure.
RCU callbacks are -usually- executed on the same CPU that executed
the corresponding call_rcu(), call_rcu_bh(), or call_rcu_sched(),
but are by -no- means guaranteed to be. For example, if a given
CPU goes offline while having an RCU callback pending, then that
RCU callback will execute on some surviving CPU. (If this was
not the case, a self-spawning RCU callback would prevent the
victim CPU from ever going offline.)
Do not assume that RCU callbacks will be executed on the same
CPU that executed the corresponding call_rcu() or call_srcu().
For example, if a given CPU goes offline while having an RCU
callback pending, then that RCU callback will execute on some
surviving CPU. (If this was not the case, a self-spawning RCU
callback would prevent the victim CPU from ever going offline.)
Furthermore, CPUs designated by rcu_nocbs= might well -always-
have their RCU callbacks executed on some other CPUs, in fact,
for some real-time workloads, this is the whole point of using
the rcu_nocbs= kernel boot parameter.
13. Unlike other forms of RCU, it -is- permissible to block in an
SRCU read-side critical section (demarked by srcu_read_lock()
@ -381,11 +380,11 @@ over a rather long period of time, but improvements are always welcome!
SRCU's expedited primitive (synchronize_srcu_expedited())
never sends IPIs to other CPUs, so it is easier on
real-time workloads than is synchronize_rcu_expedited(),
synchronize_rcu_bh_expedited() or synchronize_sched_expedited().
real-time workloads than is synchronize_rcu_expedited().
Note that rcu_dereference() and rcu_assign_pointer() relate to
SRCU just as they do to other forms of RCU.
Note that rcu_assign_pointer() relates to SRCU just as it does to
other forms of RCU, but instead of rcu_dereference() you should
use srcu_dereference() in order to avoid lockdep splats.
14. The whole point of call_rcu(), synchronize_rcu(), and friends
is to wait until all pre-existing readers have finished before
@ -405,6 +404,9 @@ over a rather long period of time, but improvements are always welcome!
read-side critical sections. It is the responsibility of the
RCU update-side primitives to deal with this.
For SRCU readers, you can use smp_mb__after_srcu_read_unlock()
immediately after an srcu_read_unlock() to get a full barrier.
16. Use CONFIG_PROVE_LOCKING, CONFIG_DEBUG_OBJECTS_RCU_HEAD, and the
__rcu sparse checks to validate your RCU code. These can help
find problems as follows:
@ -428,22 +430,19 @@ over a rather long period of time, but improvements are always welcome!
These debugging aids can help you find problems that are
otherwise extremely difficult to spot.
17. If you register a callback using call_rcu(), call_rcu_bh(),
call_rcu_sched(), or call_srcu(), and pass in a function defined
within a loadable module, then it in necessary to wait for
all pending callbacks to be invoked after the last invocation
and before unloading that module. Note that it is absolutely
-not- sufficient to wait for a grace period! The current (say)
synchronize_rcu() implementation waits only for all previous
callbacks registered on the CPU that synchronize_rcu() is running
on, but it is -not- guaranteed to wait for callbacks registered
on other CPUs.
17. If you register a callback using call_rcu() or call_srcu(), and
pass in a function defined within a loadable module, then it in
necessary to wait for all pending callbacks to be invoked after
the last invocation and before unloading that module. Note that
it is absolutely -not- sufficient to wait for a grace period!
The current (say) synchronize_rcu() implementation is -not-
guaranteed to wait for callbacks registered on other CPUs.
Or even on the current CPU if that CPU recently went offline
and came back online.
You instead need to use one of the barrier functions:
o call_rcu() -> rcu_barrier()
o call_rcu_bh() -> rcu_barrier()
o call_rcu_sched() -> rcu_barrier()
o call_srcu() -> srcu_barrier()
However, these barrier functions are absolutely -not- guaranteed

View File

@ -52,10 +52,10 @@ o If I am running on a uniprocessor kernel, which can only do one
o How can I see where RCU is currently used in the Linux kernel?
Search for "rcu_read_lock", "rcu_read_unlock", "call_rcu",
"rcu_read_lock_bh", "rcu_read_unlock_bh", "call_rcu_bh",
"srcu_read_lock", "srcu_read_unlock", "synchronize_rcu",
"synchronize_net", "synchronize_srcu", and the other RCU
primitives. Or grab one of the cscope databases from:
"rcu_read_lock_bh", "rcu_read_unlock_bh", "srcu_read_lock",
"srcu_read_unlock", "synchronize_rcu", "synchronize_net",
"synchronize_srcu", and the other RCU primitives. Or grab one
of the cscope databases from:
http://www.rdrop.com/users/paulmck/RCU/linuxusage/rculocktab.html

View File

@ -351,3 +351,106 @@ garbage values.
In short, rcu_dereference() is -not- optional when you are going to
dereference the resulting pointer.
WHICH MEMBER OF THE rcu_dereference() FAMILY SHOULD YOU USE?
First, please avoid using rcu_dereference_raw() and also please avoid
using rcu_dereference_check() and rcu_dereference_protected() with a
second argument with a constant value of 1 (or true, for that matter).
With that caution out of the way, here is some guidance for which
member of the rcu_dereference() to use in various situations:
1. If the access needs to be within an RCU read-side critical
section, use rcu_dereference(). With the new consolidated
RCU flavors, an RCU read-side critical section is entered
using rcu_read_lock(), anything that disables bottom halves,
anything that disables interrupts, or anything that disables
preemption.
2. If the access might be within an RCU read-side critical section
on the one hand, or protected by (say) my_lock on the other,
use rcu_dereference_check(), for example:
p1 = rcu_dereference_check(p->rcu_protected_pointer,
lockdep_is_held(&my_lock));
3. If the access might be within an RCU read-side critical section
on the one hand, or protected by either my_lock or your_lock on
the other, again use rcu_dereference_check(), for example:
p1 = rcu_dereference_check(p->rcu_protected_pointer,
lockdep_is_held(&my_lock) ||
lockdep_is_held(&your_lock));
4. If the access is on the update side, so that it is always protected
by my_lock, use rcu_dereference_protected():
p1 = rcu_dereference_protected(p->rcu_protected_pointer,
lockdep_is_held(&my_lock));
This can be extended to handle multiple locks as in #3 above,
and both can be extended to check other conditions as well.
5. If the protection is supplied by the caller, and is thus unknown
to this code, that is the rare case when rcu_dereference_raw()
is appropriate. In addition, rcu_dereference_raw() might be
appropriate when the lockdep expression would be excessively
complex, except that a better approach in that case might be to
take a long hard look at your synchronization design. Still,
there are data-locking cases where any one of a very large number
of locks or reference counters suffices to protect the pointer,
so rcu_dereference_raw() does have its place.
However, its place is probably quite a bit smaller than one
might expect given the number of uses in the current kernel.
Ditto for its synonym, rcu_dereference_check( ... , 1), and
its close relative, rcu_dereference_protected(... , 1).
SPARSE CHECKING OF RCU-PROTECTED POINTERS
The sparse static-analysis tool checks for direct access to RCU-protected
pointers, which can result in "interesting" bugs due to compiler
optimizations involving invented loads and perhaps also load tearing.
For example, suppose someone mistakenly does something like this:
p = q->rcu_protected_pointer;
do_something_with(p->a);
do_something_else_with(p->b);
If register pressure is high, the compiler might optimize "p" out
of existence, transforming the code to something like this:
do_something_with(q->rcu_protected_pointer->a);
do_something_else_with(q->rcu_protected_pointer->b);
This could fatally disappoint your code if q->rcu_protected_pointer
changed in the meantime. Nor is this a theoretical problem: Exactly
this sort of bug cost Paul E. McKenney (and several of his innocent
colleagues) a three-day weekend back in the early 1990s.
Load tearing could of course result in dereferencing a mashup of a pair
of pointers, which also might fatally disappoint your code.
These problems could have been avoided simply by making the code instead
read as follows:
p = rcu_dereference(q->rcu_protected_pointer);
do_something_with(p->a);
do_something_else_with(p->b);
Unfortunately, these sorts of bugs can be extremely hard to spot during
review. This is where the sparse tool comes into play, along with the
"__rcu" marker. If you mark a pointer declaration, whether in a structure
or as a formal parameter, with "__rcu", which tells sparse to complain if
this pointer is accessed directly. It will also cause sparse to complain
if a pointer not marked with "__rcu" is accessed using rcu_dereference()
and friends. For example, ->rcu_protected_pointer might be declared as
follows:
struct foo __rcu *rcu_protected_pointer;
Use of "__rcu" is opt-in. If you choose not to use it, then you should
ignore the sparse warnings.

View File

@ -83,16 +83,15 @@ Pseudo-code using rcu_barrier() is as follows:
2. Execute rcu_barrier().
3. Allow the module to be unloaded.
There are also rcu_barrier_bh(), rcu_barrier_sched(), and srcu_barrier()
functions for the other flavors of RCU, and you of course must match
the flavor of rcu_barrier() with that of call_rcu(). If your module
uses multiple flavors of call_rcu(), then it must also use multiple
There is also an srcu_barrier() function for SRCU, and you of course
must match the flavor of rcu_barrier() with that of call_rcu(). If your
module uses multiple flavors of call_rcu(), then it must also use multiple
flavors of rcu_barrier() when unloading that module. For example, if
it uses call_rcu_bh(), call_srcu() on srcu_struct_1, and call_srcu() on
it uses call_rcu(), call_srcu() on srcu_struct_1, and call_srcu() on
srcu_struct_2(), then the following three lines of code will be required
when unloading:
1 rcu_barrier_bh();
1 rcu_barrier();
2 srcu_barrier(&srcu_struct_1);
3 srcu_barrier(&srcu_struct_2);
@ -185,12 +184,12 @@ module invokes call_rcu() from timers, you will need to first cancel all
the timers, and only then invoke rcu_barrier() to wait for any remaining
RCU callbacks to complete.
Of course, if you module uses call_rcu_bh(), you will need to invoke
rcu_barrier_bh() before unloading. Similarly, if your module uses
call_rcu_sched(), you will need to invoke rcu_barrier_sched() before
unloading. If your module uses call_rcu(), call_rcu_bh(), -and-
call_rcu_sched(), then you will need to invoke each of rcu_barrier(),
rcu_barrier_bh(), and rcu_barrier_sched().
Of course, if you module uses call_rcu(), you will need to invoke
rcu_barrier() before unloading. Similarly, if your module uses
call_srcu(), you will need to invoke srcu_barrier() before unloading,
and on the same srcu_struct structure. If your module uses call_rcu()
-and- call_srcu(), then you will need to invoke rcu_barrier() -and-
srcu_barrier().
Implementing rcu_barrier()
@ -223,8 +222,8 @@ shown below. Note that the final "1" in on_each_cpu()'s argument list
ensures that all the calls to rcu_barrier_func() will have completed
before on_each_cpu() returns. Line 9 then waits for the completion.
This code was rewritten in 2008 to support rcu_barrier_bh() and
rcu_barrier_sched() in addition to the original rcu_barrier().
This code was rewritten in 2008 and several times thereafter, but this
still gives the general idea.
The rcu_barrier_func() runs on each CPU, where it invokes call_rcu()
to post an RCU callback, as follows:

View File

@ -310,7 +310,7 @@ reader, updater, and reclaimer.
rcu_assign_pointer()
+--------+
+--------+
+---------------------->| reader |---------+
| +--------+ |
| | |
@ -318,12 +318,12 @@ reader, updater, and reclaimer.
| | | rcu_read_lock()
| | | rcu_read_unlock()
| rcu_dereference() | |
+---------+ | |
| updater |<---------------------+ |
+---------+ V
+---------+ | |
| updater |<----------------+ |
+---------+ V
| +-----------+
+----------------------------------->| reclaimer |
+-----------+
+-----------+
Defer:
synchronize_rcu() & call_rcu()

View File

@ -3623,7 +3623,9 @@
see CONFIG_RAS_CEC help text.
rcu_nocbs= [KNL]
The argument is a cpu list, as described above.
The argument is a cpu list, as described above,
except that the string "all" can be used to
specify every CPU on the system.
In kernels built with CONFIG_RCU_NOCB_CPU=y, set
the specified list of CPUs to be no-callback CPUs.

View File

@ -56,6 +56,23 @@ Barriers:
smp_mb__{before,after}_atomic()
TYPES (signed vs unsigned)
-----
While atomic_t, atomic_long_t and atomic64_t use int, long and s64
respectively (for hysterical raisins), the kernel uses -fno-strict-overflow
(which implies -fwrapv) and defines signed overflow to behave like
2s-complement.
Therefore, an explicitly unsigned variant of the atomic ops is strictly
unnecessary and we can simply cast, there is no UB.
There was a bug in UBSAN prior to GCC-8 that would generate UB warnings for
signed types.
With this we also conform to the C/C++ _Atomic behaviour and things like
P1236R1.
SEMANTICS
---------

View File

@ -243,10 +243,10 @@ Optimization
^^^^^^^^^^^^
The Kprobe-optimizer doesn't insert the jump instruction immediately;
rather, it calls synchronize_sched() for safety first, because it's
rather, it calls synchronize_rcu() for safety first, because it's
possible for a CPU to be interrupted in the middle of executing the
optimized region [3]_. As you know, synchronize_sched() can ensure
that all interruptions that were active when synchronize_sched()
optimized region [3]_. As you know, synchronize_rcu() can ensure
that all interruptions that were active when synchronize_rcu()
was called are done, but only if CONFIG_PREEMPT=n. So, this version
of kprobe optimization supports only kernels with CONFIG_PREEMPT=n [4]_.

View File

@ -493,10 +493,8 @@ CPU 에게 기대할 수 있는 최소한의 보장사항 몇가지가 있습니
이 타입의 오퍼레이션은 단방향의 투과성 배리어처럼 동작합니다. ACQUIRE
오퍼레이션 뒤의 모든 메모리 오퍼레이션들이 ACQUIRE 오퍼레이션 후에
일어난 것으로 시스템의 나머지 컴포넌트들에 보이게 될 것이 보장됩니다.
LOCK 오퍼레이션과 smp_load_acquire(), smp_cond_acquire() 오퍼레이션도
ACQUIRE 오퍼레이션에 포함됩니다. smp_cond_acquire() 오퍼레이션은 컨트롤
의존성과 smp_rmb() 를 사용해서 ACQUIRE 의 의미적 요구사항(semantic)을
충족시킵니다.
LOCK 오퍼레이션과 smp_load_acquire(), smp_cond_load_acquire() 오퍼레이션도
ACQUIRE 오퍼레이션에 포함됩니다.
ACQUIRE 오퍼레이션 앞의 메모리 오퍼레이션들은 ACQUIRE 오퍼레이션 완료 후에
수행된 것처럼 보일 수 있습니다.
@ -2146,33 +2144,40 @@ set_current_state() 는 다음의 것들로 감싸질 수도 있습니다:
event_indicated = 1;
wake_up_process(event_daemon);
wake_up() 류에 의해 쓰기 메모리 배리어가 내포됩니다. 만약 그것들이 뭔가를
깨운다면요. 이 배리어는 태스크 상태가 지워지기 전에 수행되므로, 이벤트를
알리기 위한 STORE 와 태스크 상태를 TASK_RUNNING 으로 설정하는 STORE 사이에
위치하게 됩니다.
wake_up() 이 무언가를 깨우게 되면, 이 함수는 범용 메모리 배리어를 수행합니다.
이 함수가 아무것도 깨우지 않는다면 메모리 배리어는 수행될 수도, 수행되지 않을
수도 있습니다; 이 경우에 메모리 배리어를 수행할 거라 오해해선 안됩니다. 이
배리어는 태스크 상태가 접근되기 전에 수행되는데, 자세히 말하면 이 이벤트를
알리기 위한 STORE 와 TASK_RUNNING 으로 상태를 쓰는 STORE 사이에 수행됩니다:
CPU 1 CPU 2
CPU 1 (Sleeper) CPU 2 (Waker)
=============================== ===============================
set_current_state(); STORE event_indicated
smp_store_mb(); wake_up();
STORE current->state <쓰기 배리어>
<범용 배리어> STORE current->state
LOAD event_indicated
STORE current->state ...
<범용 배리어> <범용 배리어>
LOAD event_indicated if ((LOAD task->state) & TASK_NORMAL)
STORE task->state
한번더 말합니다만, 이 쓰기 메모리 배리어는 이 코드가 정말로 뭔가를 깨울 때에만
실행됩니다. 이걸 설명하기 위해, X 와 Y 는 모두 0 으로 초기화 되어 있다는 가정
하에 아래의 이벤트 시퀀스를 생각해 봅시다:
여기서 "task" 는 깨어나지는 쓰레드이고 CPU 1 의 "current" 와 같습니다.
반복하지만, wake_up() 이 무언가를 정말 깨운다면 범용 메모리 배리어가 수행될
것이 보장되지만, 그렇지 않다면 그런 보장이 없습니다. 이걸 이해하기 위해, X 와
Y 는 모두 0 으로 초기화 되어 있다는 가정 하에 아래의 이벤트 시퀀스를 생각해
봅시다:
CPU 1 CPU 2
=============================== ===============================
X = 1; STORE event_indicated
X = 1; Y = 1;
smp_mb(); wake_up();
Y = 1; wait_event(wq, Y == 1);
wake_up(); load from Y sees 1, no memory barrier
load from X might see 0
LOAD Y LOAD X
위 예제에서의 경우와 달리 깨우기가 정말로 행해졌다면, CPU 2 의 X 로드는 1 을
본다고 보장될 수 있을 겁니다.
정말로 깨우기가 행해졌다면, 두 로드 중 (최소한) 하나는 1 을 보게 됩니다.
반면에, 실제 깨우기가 행해지지 않았다면, 두 로드 모두 0을 볼 수도 있습니다.
wake_up_process() 는 항상 범용 메모리 배리어를 수행합니다. 이 배리어 역시
태스크 상태가 접근되기 전에 수행됩니다. 특히, 앞의 예제 코드에서 wake_up() 이
wake_up_process() 로 대체된다면 두 로드 중 하나는 1을 볼 것이 보장됩니다.
사용 가능한 깨우기류 함수들로 다음과 같은 것들이 있습니다:
@ -2192,6 +2197,8 @@ wake_up() 류에 의해 쓰기 메모리 배리어가 내포됩니다. 만약
wake_up_poll();
wake_up_process();
메모리 순서규칙 관점에서, 이 함수들은 모두 wake_up() 과 같거나 보다 강한 순서
보장을 제공합니다.
[!] 잠재우는 코드와 깨우는 코드에 내포되는 메모리 배리어들은 깨우기 전에
이루어진 스토어를 잠재우는 코드가 set_current_state() 를 호출한 후에 행하는

View File

@ -8994,7 +8994,7 @@ R: Daniel Lustig <dlustig@nvidia.com>
L: linux-kernel@vger.kernel.org
L: linux-arch@vger.kernel.org
S: Supported
T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git dev
F: tools/memory-model/
F: Documentation/atomic_bitops.txt
F: Documentation/atomic_t.txt
@ -13043,9 +13043,9 @@ M: Josh Triplett <josh@joshtriplett.org>
R: Steven Rostedt <rostedt@goodmis.org>
R: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
R: Lai Jiangshan <jiangshanlai@gmail.com>
L: linux-kernel@vger.kernel.org
L: rcu@vger.kernel.org
S: Supported
T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git dev
F: tools/testing/selftests/rcutorture
RDC R-321X SoC
@ -13091,10 +13091,10 @@ R: Steven Rostedt <rostedt@goodmis.org>
R: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
R: Lai Jiangshan <jiangshanlai@gmail.com>
R: Joel Fernandes <joel@joelfernandes.org>
L: linux-kernel@vger.kernel.org
L: rcu@vger.kernel.org
W: http://www.rdrop.com/users/paulmck/RCU/
S: Supported
T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git dev
F: Documentation/RCU/
X: Documentation/RCU/torture.txt
F: include/linux/rcu*
@ -14246,10 +14246,10 @@ M: "Paul E. McKenney" <paulmck@linux.ibm.com>
M: Josh Triplett <josh@joshtriplett.org>
R: Steven Rostedt <rostedt@goodmis.org>
R: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
L: linux-kernel@vger.kernel.org
L: rcu@vger.kernel.org
W: http://www.rdrop.com/users/paulmck/RCU/
S: Supported
T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git dev
F: include/linux/srcu*.h
F: kernel/rcu/srcu*.c
@ -15696,7 +15696,7 @@ M: "Paul E. McKenney" <paulmck@linux.ibm.com>
M: Josh Triplett <josh@joshtriplett.org>
L: linux-kernel@vger.kernel.org
S: Supported
T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git dev
F: Documentation/RCU/torture.txt
F: kernel/torture.c
F: kernel/rcu/rcutorture.c

View File

@ -388,7 +388,7 @@ static void nvme_free_ns_head(struct kref *ref)
nvme_mpath_remove_disk(head);
ida_simple_remove(&head->subsys->ns_ida, head->instance);
list_del_init(&head->entry);
cleanup_srcu_struct_quiesced(&head->srcu);
cleanup_srcu_struct(&head->srcu);
nvme_put_subsystem(head->subsys);
kfree(head);
}

View File

@ -878,9 +878,11 @@ static inline void rcu_head_init(struct rcu_head *rhp)
static inline bool
rcu_head_after_call_rcu(struct rcu_head *rhp, rcu_callback_t f)
{
if (READ_ONCE(rhp->func) == f)
rcu_callback_t func = READ_ONCE(rhp->func);
if (func == f)
return true;
WARN_ON_ONCE(READ_ONCE(rhp->func) != (rcu_callback_t)~0L);
WARN_ON_ONCE(func != (rcu_callback_t)~0L);
return false;
}

View File

@ -56,45 +56,11 @@ struct srcu_struct { };
void call_srcu(struct srcu_struct *ssp, struct rcu_head *head,
void (*func)(struct rcu_head *head));
void _cleanup_srcu_struct(struct srcu_struct *ssp, bool quiesced);
void cleanup_srcu_struct(struct srcu_struct *ssp);
int __srcu_read_lock(struct srcu_struct *ssp) __acquires(ssp);
void __srcu_read_unlock(struct srcu_struct *ssp, int idx) __releases(ssp);
void synchronize_srcu(struct srcu_struct *ssp);
/**
* cleanup_srcu_struct - deconstruct a sleep-RCU structure
* @ssp: structure to clean up.
*
* Must invoke this after you are finished using a given srcu_struct that
* was initialized via init_srcu_struct(), else you leak memory.
*/
static inline void cleanup_srcu_struct(struct srcu_struct *ssp)
{
_cleanup_srcu_struct(ssp, false);
}
/**
* cleanup_srcu_struct_quiesced - deconstruct a quiesced sleep-RCU structure
* @ssp: structure to clean up.
*
* Must invoke this after you are finished using a given srcu_struct that
* was initialized via init_srcu_struct(), else you leak memory. Also,
* all grace-period processing must have completed.
*
* "Completed" means that the last synchronize_srcu() and
* synchronize_srcu_expedited() calls must have returned before the call
* to cleanup_srcu_struct_quiesced(). It also means that the callback
* from the last call_srcu() must have been invoked before the call to
* cleanup_srcu_struct_quiesced(), but you can use srcu_barrier() to help
* with this last. Violating these rules will get you a WARN_ON() splat
* (with high probability, anyway), and will also cause the srcu_struct
* to be leaked.
*/
static inline void cleanup_srcu_struct_quiesced(struct srcu_struct *ssp)
{
_cleanup_srcu_struct(ssp, true);
}
#ifdef CONFIG_DEBUG_LOCK_ALLOC
/**

View File

@ -829,7 +829,9 @@ static void lock_torture_cleanup(void)
"End of test: SUCCESS");
kfree(cxt.lwsa);
cxt.lwsa = NULL;
kfree(cxt.lrsa);
cxt.lrsa = NULL;
end:
torture_cleanup_end();

View File

@ -233,6 +233,7 @@ static inline bool __rcu_reclaim(const char *rn, struct rcu_head *head)
#ifdef CONFIG_RCU_STALL_COMMON
extern int rcu_cpu_stall_suppress;
extern int rcu_cpu_stall_timeout;
int rcu_jiffies_till_stall_check(void);
#define rcu_ftrace_dump_stall_suppress() \

View File

@ -494,6 +494,10 @@ rcu_perf_cleanup(void)
if (torture_cleanup_begin())
return;
if (!cur_ops) {
torture_cleanup_end();
return;
}
if (reader_tasks) {
for (i = 0; i < nrealreaders; i++)
@ -614,6 +618,7 @@ rcu_perf_init(void)
pr_cont("\n");
WARN_ON(!IS_MODULE(CONFIG_RCU_PERF_TEST));
firsterr = -EINVAL;
cur_ops = NULL;
goto unwind;
}
if (cur_ops->init)

View File

@ -299,7 +299,6 @@ struct rcu_torture_ops {
int irq_capable;
int can_boost;
int extendables;
int ext_irq_conflict;
const char *name;
};
@ -592,12 +591,7 @@ static void srcu_torture_init(void)
static void srcu_torture_cleanup(void)
{
static DEFINE_TORTURE_RANDOM(rand);
if (torture_random(&rand) & 0x800)
cleanup_srcu_struct(&srcu_ctld);
else
cleanup_srcu_struct_quiesced(&srcu_ctld);
cleanup_srcu_struct(&srcu_ctld);
srcu_ctlp = &srcu_ctl; /* In case of a later rcutorture run. */
}
@ -1160,7 +1154,7 @@ rcutorture_extend_mask(int oldmask, struct torture_random_state *trsp)
unsigned long randmask2 = randmask1 >> 3;
WARN_ON_ONCE(mask >> RCUTORTURE_RDR_SHIFT);
/* Most of the time lots of bits, half the time only one bit. */
/* Mostly only one bit (need preemption!), sometimes lots of bits. */
if (!(randmask1 & 0x7))
mask = mask & randmask2;
else
@ -1170,10 +1164,6 @@ rcutorture_extend_mask(int oldmask, struct torture_random_state *trsp)
((!(mask & RCUTORTURE_RDR_BH) && (oldmask & RCUTORTURE_RDR_BH)) ||
(!(mask & RCUTORTURE_RDR_RBH) && (oldmask & RCUTORTURE_RDR_RBH))))
mask |= RCUTORTURE_RDR_BH | RCUTORTURE_RDR_RBH;
if ((mask & RCUTORTURE_RDR_IRQ) &&
!(mask & cur_ops->ext_irq_conflict) &&
(oldmask & cur_ops->ext_irq_conflict))
mask |= cur_ops->ext_irq_conflict; /* Or if readers object. */
return mask ?: RCUTORTURE_RDR_RCU;
}
@ -1848,7 +1838,7 @@ static int rcutorture_oom_notify(struct notifier_block *self,
WARN(1, "%s invoked upon OOM during forward-progress testing.\n",
__func__);
rcu_torture_fwd_cb_hist();
rcu_fwd_progress_check(1 + (jiffies - READ_ONCE(rcu_fwd_startat) / 2));
rcu_fwd_progress_check(1 + (jiffies - READ_ONCE(rcu_fwd_startat)) / 2);
WRITE_ONCE(rcu_fwd_emergency_stop, true);
smp_mb(); /* Emergency stop before free and wait to avoid hangs. */
pr_info("%s: Freed %lu RCU callbacks.\n",
@ -2094,6 +2084,10 @@ rcu_torture_cleanup(void)
cur_ops->cb_barrier();
return;
}
if (!cur_ops) {
torture_cleanup_end();
return;
}
rcu_torture_barrier_cleanup();
torture_stop_kthread(rcu_torture_fwd_prog, fwd_prog_task);
@ -2267,6 +2261,7 @@ rcu_torture_init(void)
pr_cont("\n");
WARN_ON(!IS_MODULE(CONFIG_RCU_TORTURE_TEST));
firsterr = -EINVAL;
cur_ops = NULL;
goto unwind;
}
if (cur_ops->fqs == NULL && fqs_duration != 0) {

View File

@ -76,19 +76,16 @@ EXPORT_SYMBOL_GPL(init_srcu_struct);
* Must invoke this after you are finished using a given srcu_struct that
* was initialized via init_srcu_struct(), else you leak memory.
*/
void _cleanup_srcu_struct(struct srcu_struct *ssp, bool quiesced)
void cleanup_srcu_struct(struct srcu_struct *ssp)
{
WARN_ON(ssp->srcu_lock_nesting[0] || ssp->srcu_lock_nesting[1]);
if (quiesced)
WARN_ON(work_pending(&ssp->srcu_work));
else
flush_work(&ssp->srcu_work);
flush_work(&ssp->srcu_work);
WARN_ON(ssp->srcu_gp_running);
WARN_ON(ssp->srcu_gp_waiting);
WARN_ON(ssp->srcu_cb_head);
WARN_ON(&ssp->srcu_cb_head != ssp->srcu_cb_tail);
}
EXPORT_SYMBOL_GPL(_cleanup_srcu_struct);
EXPORT_SYMBOL_GPL(cleanup_srcu_struct);
/*
* Removes the count for the old reader from the appropriate element of

View File

@ -360,8 +360,14 @@ static unsigned long srcu_get_delay(struct srcu_struct *ssp)
return SRCU_INTERVAL;
}
/* Helper for cleanup_srcu_struct() and cleanup_srcu_struct_quiesced(). */
void _cleanup_srcu_struct(struct srcu_struct *ssp, bool quiesced)
/**
* cleanup_srcu_struct - deconstruct a sleep-RCU structure
* @ssp: structure to clean up.
*
* Must invoke this after you are finished using a given srcu_struct that
* was initialized via init_srcu_struct(), else you leak memory.
*/
void cleanup_srcu_struct(struct srcu_struct *ssp)
{
int cpu;
@ -369,24 +375,14 @@ void _cleanup_srcu_struct(struct srcu_struct *ssp, bool quiesced)
return; /* Just leak it! */
if (WARN_ON(srcu_readers_active(ssp)))
return; /* Just leak it! */
if (quiesced) {
if (WARN_ON(delayed_work_pending(&ssp->work)))
return; /* Just leak it! */
} else {
flush_delayed_work(&ssp->work);
}
flush_delayed_work(&ssp->work);
for_each_possible_cpu(cpu) {
struct srcu_data *sdp = per_cpu_ptr(ssp->sda, cpu);
if (quiesced) {
if (WARN_ON(timer_pending(&sdp->delay_work)))
return; /* Just leak it! */
if (WARN_ON(work_pending(&sdp->work)))
return; /* Just leak it! */
} else {
del_timer_sync(&sdp->delay_work);
flush_work(&sdp->work);
}
del_timer_sync(&sdp->delay_work);
flush_work(&sdp->work);
if (WARN_ON(rcu_segcblist_n_cbs(&sdp->srcu_cblist)))
return; /* Forgot srcu_barrier(), so just leak it! */
}
if (WARN_ON(rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)) != SRCU_STATE_IDLE) ||
WARN_ON(srcu_readers_active(ssp))) {
@ -397,7 +393,7 @@ void _cleanup_srcu_struct(struct srcu_struct *ssp, bool quiesced)
free_percpu(ssp->sda);
ssp->sda = NULL;
}
EXPORT_SYMBOL_GPL(_cleanup_srcu_struct);
EXPORT_SYMBOL_GPL(cleanup_srcu_struct);
/*
* Counts the new reader in the appropriate per-CPU element of the

View File

@ -52,7 +52,7 @@ void rcu_qs(void)
local_irq_save(flags);
if (rcu_ctrlblk.donetail != rcu_ctrlblk.curtail) {
rcu_ctrlblk.donetail = rcu_ctrlblk.curtail;
raise_softirq(RCU_SOFTIRQ);
raise_softirq_irqoff(RCU_SOFTIRQ);
}
local_irq_restore(flags);
}

View File

@ -102,11 +102,6 @@ int rcu_num_lvls __read_mostly = RCU_NUM_LVLS;
/* Number of rcu_nodes at specified level. */
int num_rcu_lvl[] = NUM_RCU_LVL_INIT;
int rcu_num_nodes __read_mostly = NUM_RCU_NODES; /* Total # rcu_nodes in use. */
/* panic() on RCU Stall sysctl. */
int sysctl_panic_on_rcu_stall __read_mostly;
/* Commandeer a sysrq key to dump RCU's tree. */
static bool sysrq_rcu;
module_param(sysrq_rcu, bool, 0444);
/*
* The rcu_scheduler_active variable is initialized to the value
@ -149,7 +144,7 @@ static void sync_sched_exp_online_cleanup(int cpu);
/* rcuc/rcub kthread realtime priority */
static int kthread_prio = IS_ENABLED(CONFIG_RCU_BOOST) ? 1 : 0;
module_param(kthread_prio, int, 0644);
module_param(kthread_prio, int, 0444);
/* Delay in jiffies for grace-period initialization delays, debug only. */
@ -406,7 +401,7 @@ static bool rcu_kick_kthreads;
*/
static ulong jiffies_till_sched_qs = ULONG_MAX;
module_param(jiffies_till_sched_qs, ulong, 0444);
static ulong jiffies_to_sched_qs; /* Adjusted version of above if not default */
static ulong jiffies_to_sched_qs; /* See adjust_jiffies_till_sched_qs(). */
module_param(jiffies_to_sched_qs, ulong, 0444); /* Display only! */
/*
@ -424,6 +419,7 @@ static void adjust_jiffies_till_sched_qs(void)
WRITE_ONCE(jiffies_to_sched_qs, jiffies_till_sched_qs);
return;
}
/* Otherwise, set to third fqs scan, but bound below on large system. */
j = READ_ONCE(jiffies_till_first_fqs) +
2 * READ_ONCE(jiffies_till_next_fqs);
if (j < HZ / 10 + nr_cpu_ids / RCU_JIFFIES_FQS_DIV)
@ -512,74 +508,6 @@ static const char *gp_state_getname(short gs)
return gp_state_names[gs];
}
/*
* Show the state of the grace-period kthreads.
*/
void show_rcu_gp_kthreads(void)
{
int cpu;
unsigned long j;
unsigned long ja;
unsigned long jr;
unsigned long jw;
struct rcu_data *rdp;
struct rcu_node *rnp;
j = jiffies;
ja = j - READ_ONCE(rcu_state.gp_activity);
jr = j - READ_ONCE(rcu_state.gp_req_activity);
jw = j - READ_ONCE(rcu_state.gp_wake_time);
pr_info("%s: wait state: %s(%d) ->state: %#lx delta ->gp_activity %lu ->gp_req_activity %lu ->gp_wake_time %lu ->gp_wake_seq %ld ->gp_seq %ld ->gp_seq_needed %ld ->gp_flags %#x\n",
rcu_state.name, gp_state_getname(rcu_state.gp_state),
rcu_state.gp_state,
rcu_state.gp_kthread ? rcu_state.gp_kthread->state : 0x1ffffL,
ja, jr, jw, (long)READ_ONCE(rcu_state.gp_wake_seq),
(long)READ_ONCE(rcu_state.gp_seq),
(long)READ_ONCE(rcu_get_root()->gp_seq_needed),
READ_ONCE(rcu_state.gp_flags));
rcu_for_each_node_breadth_first(rnp) {
if (ULONG_CMP_GE(rcu_state.gp_seq, rnp->gp_seq_needed))
continue;
pr_info("\trcu_node %d:%d ->gp_seq %ld ->gp_seq_needed %ld\n",
rnp->grplo, rnp->grphi, (long)rnp->gp_seq,
(long)rnp->gp_seq_needed);
if (!rcu_is_leaf_node(rnp))
continue;
for_each_leaf_node_possible_cpu(rnp, cpu) {
rdp = per_cpu_ptr(&rcu_data, cpu);
if (rdp->gpwrap ||
ULONG_CMP_GE(rcu_state.gp_seq,
rdp->gp_seq_needed))
continue;
pr_info("\tcpu %d ->gp_seq_needed %ld\n",
cpu, (long)rdp->gp_seq_needed);
}
}
/* sched_show_task(rcu_state.gp_kthread); */
}
EXPORT_SYMBOL_GPL(show_rcu_gp_kthreads);
/* Dump grace-period-request information due to commandeered sysrq. */
static void sysrq_show_rcu(int key)
{
show_rcu_gp_kthreads();
}
static struct sysrq_key_op sysrq_rcudump_op = {
.handler = sysrq_show_rcu,
.help_msg = "show-rcu(y)",
.action_msg = "Show RCU tree",
.enable_mask = SYSRQ_ENABLE_DUMP,
};
static int __init rcu_sysrq_init(void)
{
if (sysrq_rcu)
return register_sysrq_key('y', &sysrq_rcudump_op);
return 0;
}
early_initcall(rcu_sysrq_init);
/*
* Send along grace-period-related data for rcutorture diagnostics.
*/
@ -1033,27 +961,6 @@ static int dyntick_save_progress_counter(struct rcu_data *rdp)
return 0;
}
/*
* Handler for the irq_work request posted when a grace period has
* gone on for too long, but not yet long enough for an RCU CPU
* stall warning. Set state appropriately, but just complain if
* there is unexpected state on entry.
*/
static void rcu_iw_handler(struct irq_work *iwp)
{
struct rcu_data *rdp;
struct rcu_node *rnp;
rdp = container_of(iwp, struct rcu_data, rcu_iw);
rnp = rdp->mynode;
raw_spin_lock_rcu_node(rnp);
if (!WARN_ON_ONCE(!rdp->rcu_iw_pending)) {
rdp->rcu_iw_gp_seq = rnp->gp_seq;
rdp->rcu_iw_pending = false;
}
raw_spin_unlock_rcu_node(rnp);
}
/*
* Return true if the specified CPU has passed through a quiescent
* state by virtue of being in or having passed through an dynticks
@ -1167,295 +1074,6 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp)
return 0;
}
static void record_gp_stall_check_time(void)
{
unsigned long j = jiffies;
unsigned long j1;
rcu_state.gp_start = j;
j1 = rcu_jiffies_till_stall_check();
/* Record ->gp_start before ->jiffies_stall. */
smp_store_release(&rcu_state.jiffies_stall, j + j1); /* ^^^ */
rcu_state.jiffies_resched = j + j1 / 2;
rcu_state.n_force_qs_gpstart = READ_ONCE(rcu_state.n_force_qs);
}
/*
* Complain about starvation of grace-period kthread.
*/
static void rcu_check_gp_kthread_starvation(void)
{
struct task_struct *gpk = rcu_state.gp_kthread;
unsigned long j;
j = jiffies - READ_ONCE(rcu_state.gp_activity);
if (j > 2 * HZ) {
pr_err("%s kthread starved for %ld jiffies! g%ld f%#x %s(%d) ->state=%#lx ->cpu=%d\n",
rcu_state.name, j,
(long)rcu_seq_current(&rcu_state.gp_seq),
READ_ONCE(rcu_state.gp_flags),
gp_state_getname(rcu_state.gp_state), rcu_state.gp_state,
gpk ? gpk->state : ~0, gpk ? task_cpu(gpk) : -1);
if (gpk) {
pr_err("RCU grace-period kthread stack dump:\n");
sched_show_task(gpk);
wake_up_process(gpk);
}
}
}
/*
* Dump stacks of all tasks running on stalled CPUs. First try using
* NMIs, but fall back to manual remote stack tracing on architectures
* that don't support NMI-based stack dumps. The NMI-triggered stack
* traces are more accurate because they are printed by the target CPU.
*/
static void rcu_dump_cpu_stacks(void)
{
int cpu;
unsigned long flags;
struct rcu_node *rnp;
rcu_for_each_leaf_node(rnp) {
raw_spin_lock_irqsave_rcu_node(rnp, flags);
for_each_leaf_node_possible_cpu(rnp, cpu)
if (rnp->qsmask & leaf_node_cpu_bit(rnp, cpu))
if (!trigger_single_cpu_backtrace(cpu))
dump_cpu_task(cpu);
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
}
}
/*
* If too much time has passed in the current grace period, and if
* so configured, go kick the relevant kthreads.
*/
static void rcu_stall_kick_kthreads(void)
{
unsigned long j;
if (!rcu_kick_kthreads)
return;
j = READ_ONCE(rcu_state.jiffies_kick_kthreads);
if (time_after(jiffies, j) && rcu_state.gp_kthread &&
(rcu_gp_in_progress() || READ_ONCE(rcu_state.gp_flags))) {
WARN_ONCE(1, "Kicking %s grace-period kthread\n",
rcu_state.name);
rcu_ftrace_dump(DUMP_ALL);
wake_up_process(rcu_state.gp_kthread);
WRITE_ONCE(rcu_state.jiffies_kick_kthreads, j + HZ);
}
}
static void panic_on_rcu_stall(void)
{
if (sysctl_panic_on_rcu_stall)
panic("RCU Stall\n");
}
static void print_other_cpu_stall(unsigned long gp_seq)
{
int cpu;
unsigned long flags;
unsigned long gpa;
unsigned long j;
int ndetected = 0;
struct rcu_node *rnp = rcu_get_root();
long totqlen = 0;
/* Kick and suppress, if so configured. */
rcu_stall_kick_kthreads();
if (rcu_cpu_stall_suppress)
return;
/*
* OK, time to rat on our buddy...
* See Documentation/RCU/stallwarn.txt for info on how to debug
* RCU CPU stall warnings.
*/
pr_err("INFO: %s detected stalls on CPUs/tasks:", rcu_state.name);
print_cpu_stall_info_begin();
rcu_for_each_leaf_node(rnp) {
raw_spin_lock_irqsave_rcu_node(rnp, flags);
ndetected += rcu_print_task_stall(rnp);
if (rnp->qsmask != 0) {
for_each_leaf_node_possible_cpu(rnp, cpu)
if (rnp->qsmask & leaf_node_cpu_bit(rnp, cpu)) {
print_cpu_stall_info(cpu);
ndetected++;
}
}
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
}
print_cpu_stall_info_end();
for_each_possible_cpu(cpu)
totqlen += rcu_get_n_cbs_cpu(cpu);
pr_cont("(detected by %d, t=%ld jiffies, g=%ld, q=%lu)\n",
smp_processor_id(), (long)(jiffies - rcu_state.gp_start),
(long)rcu_seq_current(&rcu_state.gp_seq), totqlen);
if (ndetected) {
rcu_dump_cpu_stacks();
/* Complain about tasks blocking the grace period. */
rcu_print_detail_task_stall();
} else {
if (rcu_seq_current(&rcu_state.gp_seq) != gp_seq) {
pr_err("INFO: Stall ended before state dump start\n");
} else {
j = jiffies;
gpa = READ_ONCE(rcu_state.gp_activity);
pr_err("All QSes seen, last %s kthread activity %ld (%ld-%ld), jiffies_till_next_fqs=%ld, root ->qsmask %#lx\n",
rcu_state.name, j - gpa, j, gpa,
READ_ONCE(jiffies_till_next_fqs),
rcu_get_root()->qsmask);
/* In this case, the current CPU might be at fault. */
sched_show_task(current);
}
}
/* Rewrite if needed in case of slow consoles. */
if (ULONG_CMP_GE(jiffies, READ_ONCE(rcu_state.jiffies_stall)))
WRITE_ONCE(rcu_state.jiffies_stall,
jiffies + 3 * rcu_jiffies_till_stall_check() + 3);
rcu_check_gp_kthread_starvation();
panic_on_rcu_stall();
rcu_force_quiescent_state(); /* Kick them all. */
}
static void print_cpu_stall(void)
{
int cpu;
unsigned long flags;
struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
struct rcu_node *rnp = rcu_get_root();
long totqlen = 0;
/* Kick and suppress, if so configured. */
rcu_stall_kick_kthreads();
if (rcu_cpu_stall_suppress)
return;
/*
* OK, time to rat on ourselves...
* See Documentation/RCU/stallwarn.txt for info on how to debug
* RCU CPU stall warnings.
*/
pr_err("INFO: %s self-detected stall on CPU", rcu_state.name);
print_cpu_stall_info_begin();
raw_spin_lock_irqsave_rcu_node(rdp->mynode, flags);
print_cpu_stall_info(smp_processor_id());
raw_spin_unlock_irqrestore_rcu_node(rdp->mynode, flags);
print_cpu_stall_info_end();
for_each_possible_cpu(cpu)
totqlen += rcu_get_n_cbs_cpu(cpu);
pr_cont(" (t=%lu jiffies g=%ld q=%lu)\n",
jiffies - rcu_state.gp_start,
(long)rcu_seq_current(&rcu_state.gp_seq), totqlen);
rcu_check_gp_kthread_starvation();
rcu_dump_cpu_stacks();
raw_spin_lock_irqsave_rcu_node(rnp, flags);
/* Rewrite if needed in case of slow consoles. */
if (ULONG_CMP_GE(jiffies, READ_ONCE(rcu_state.jiffies_stall)))
WRITE_ONCE(rcu_state.jiffies_stall,
jiffies + 3 * rcu_jiffies_till_stall_check() + 3);
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
panic_on_rcu_stall();
/*
* Attempt to revive the RCU machinery by forcing a context switch.
*
* A context switch would normally allow the RCU state machine to make
* progress and it could be we're stuck in kernel space without context
* switches for an entirely unreasonable amount of time.
*/
set_tsk_need_resched(current);
set_preempt_need_resched();
}
static void check_cpu_stall(struct rcu_data *rdp)
{
unsigned long gs1;
unsigned long gs2;
unsigned long gps;
unsigned long j;
unsigned long jn;
unsigned long js;
struct rcu_node *rnp;
if ((rcu_cpu_stall_suppress && !rcu_kick_kthreads) ||
!rcu_gp_in_progress())
return;
rcu_stall_kick_kthreads();
j = jiffies;
/*
* Lots of memory barriers to reject false positives.
*
* The idea is to pick up rcu_state.gp_seq, then
* rcu_state.jiffies_stall, then rcu_state.gp_start, and finally
* another copy of rcu_state.gp_seq. These values are updated in
* the opposite order with memory barriers (or equivalent) during
* grace-period initialization and cleanup. Now, a false positive
* can occur if we get an new value of rcu_state.gp_start and a old
* value of rcu_state.jiffies_stall. But given the memory barriers,
* the only way that this can happen is if one grace period ends
* and another starts between these two fetches. This is detected
* by comparing the second fetch of rcu_state.gp_seq with the
* previous fetch from rcu_state.gp_seq.
*
* Given this check, comparisons of jiffies, rcu_state.jiffies_stall,
* and rcu_state.gp_start suffice to forestall false positives.
*/
gs1 = READ_ONCE(rcu_state.gp_seq);
smp_rmb(); /* Pick up ->gp_seq first... */
js = READ_ONCE(rcu_state.jiffies_stall);
smp_rmb(); /* ...then ->jiffies_stall before the rest... */
gps = READ_ONCE(rcu_state.gp_start);
smp_rmb(); /* ...and finally ->gp_start before ->gp_seq again. */
gs2 = READ_ONCE(rcu_state.gp_seq);
if (gs1 != gs2 ||
ULONG_CMP_LT(j, js) ||
ULONG_CMP_GE(gps, js))
return; /* No stall or GP completed since entering function. */
rnp = rdp->mynode;
jn = jiffies + 3 * rcu_jiffies_till_stall_check() + 3;
if (rcu_gp_in_progress() &&
(READ_ONCE(rnp->qsmask) & rdp->grpmask) &&
cmpxchg(&rcu_state.jiffies_stall, js, jn) == js) {
/* We haven't checked in, so go dump stack. */
print_cpu_stall();
} else if (rcu_gp_in_progress() &&
ULONG_CMP_GE(j, js + RCU_STALL_RAT_DELAY) &&
cmpxchg(&rcu_state.jiffies_stall, js, jn) == js) {
/* They had a few time units to dump stack, so complain. */
print_other_cpu_stall(gs2);
}
}
/**
* rcu_cpu_stall_reset - prevent further stall warnings in current grace period
*
* Set the stall-warning timeout way off into the future, thus preventing
* any RCU CPU stall-warning messages from appearing in the current set of
* RCU grace periods.
*
* The caller must disable hard irqs.
*/
void rcu_cpu_stall_reset(void)
{
WRITE_ONCE(rcu_state.jiffies_stall, jiffies + ULONG_MAX / 2);
}
/* Trace-event wrapper function for trace_rcu_future_grace_period. */
static void trace_rcu_this_gp(struct rcu_node *rnp, struct rcu_data *rdp,
unsigned long gp_seq_req, const char *s)
@ -1585,7 +1203,7 @@ static bool rcu_future_gp_cleanup(struct rcu_node *rnp)
static void rcu_gp_kthread_wake(void)
{
if ((current == rcu_state.gp_kthread &&
!in_interrupt() && !in_serving_softirq()) ||
!in_irq() && !in_serving_softirq()) ||
!READ_ONCE(rcu_state.gp_flags) ||
!rcu_state.gp_kthread)
return;
@ -2295,11 +1913,10 @@ rcu_report_qs_rdp(int cpu, struct rcu_data *rdp)
return;
}
mask = rdp->grpmask;
rdp->core_needs_qs = false;
if ((rnp->qsmask & mask) == 0) {
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
} else {
rdp->core_needs_qs = false;
/*
* This GP can't end until cpu checks in, so all of our
* callbacks can be processed during the next GP.
@ -2548,11 +2165,11 @@ void rcu_sched_clock_irq(int user)
}
/*
* Scan the leaf rcu_node structures, processing dyntick state for any that
* have not yet encountered a quiescent state, using the function specified.
* Also initiate boosting for any threads blocked on the root rcu_node.
*
* The caller must have suppressed start of new grace periods.
* Scan the leaf rcu_node structures. For each structure on which all
* CPUs have reported a quiescent state and on which there are tasks
* blocking the current grace period, initiate RCU priority boosting.
* Otherwise, invoke the specified function to check dyntick state for
* each CPU that has not yet reported a quiescent state.
*/
static void force_qs_rnp(int (*f)(struct rcu_data *rdp))
{
@ -2635,101 +2252,6 @@ void rcu_force_quiescent_state(void)
}
EXPORT_SYMBOL_GPL(rcu_force_quiescent_state);
/*
* This function checks for grace-period requests that fail to motivate
* RCU to come out of its idle mode.
*/
void
rcu_check_gp_start_stall(struct rcu_node *rnp, struct rcu_data *rdp,
const unsigned long gpssdelay)
{
unsigned long flags;
unsigned long j;
struct rcu_node *rnp_root = rcu_get_root();
static atomic_t warned = ATOMIC_INIT(0);
if (!IS_ENABLED(CONFIG_PROVE_RCU) || rcu_gp_in_progress() ||
ULONG_CMP_GE(rnp_root->gp_seq, rnp_root->gp_seq_needed))
return;
j = jiffies; /* Expensive access, and in common case don't get here. */
if (time_before(j, READ_ONCE(rcu_state.gp_req_activity) + gpssdelay) ||
time_before(j, READ_ONCE(rcu_state.gp_activity) + gpssdelay) ||
atomic_read(&warned))
return;
raw_spin_lock_irqsave_rcu_node(rnp, flags);
j = jiffies;
if (rcu_gp_in_progress() ||
ULONG_CMP_GE(rnp_root->gp_seq, rnp_root->gp_seq_needed) ||
time_before(j, READ_ONCE(rcu_state.gp_req_activity) + gpssdelay) ||
time_before(j, READ_ONCE(rcu_state.gp_activity) + gpssdelay) ||
atomic_read(&warned)) {
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
return;
}
/* Hold onto the leaf lock to make others see warned==1. */
if (rnp_root != rnp)
raw_spin_lock_rcu_node(rnp_root); /* irqs already disabled. */
j = jiffies;
if (rcu_gp_in_progress() ||
ULONG_CMP_GE(rnp_root->gp_seq, rnp_root->gp_seq_needed) ||
time_before(j, rcu_state.gp_req_activity + gpssdelay) ||
time_before(j, rcu_state.gp_activity + gpssdelay) ||
atomic_xchg(&warned, 1)) {
raw_spin_unlock_rcu_node(rnp_root); /* irqs remain disabled. */
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
return;
}
WARN_ON(1);
if (rnp_root != rnp)
raw_spin_unlock_rcu_node(rnp_root);
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
show_rcu_gp_kthreads();
}
/*
* Do a forward-progress check for rcutorture. This is normally invoked
* due to an OOM event. The argument "j" gives the time period during
* which rcutorture would like progress to have been made.
*/
void rcu_fwd_progress_check(unsigned long j)
{
unsigned long cbs;
int cpu;
unsigned long max_cbs = 0;
int max_cpu = -1;
struct rcu_data *rdp;
if (rcu_gp_in_progress()) {
pr_info("%s: GP age %lu jiffies\n",
__func__, jiffies - rcu_state.gp_start);
show_rcu_gp_kthreads();
} else {
pr_info("%s: Last GP end %lu jiffies ago\n",
__func__, jiffies - rcu_state.gp_end);
preempt_disable();
rdp = this_cpu_ptr(&rcu_data);
rcu_check_gp_start_stall(rdp->mynode, rdp, j);
preempt_enable();
}
for_each_possible_cpu(cpu) {
cbs = rcu_get_n_cbs_cpu(cpu);
if (!cbs)
continue;
if (max_cpu < 0)
pr_info("%s: callbacks", __func__);
pr_cont(" %d: %lu", cpu, cbs);
if (cbs <= max_cbs)
continue;
max_cbs = cbs;
max_cpu = cpu;
}
if (max_cpu >= 0)
pr_cont("\n");
}
EXPORT_SYMBOL_GPL(rcu_fwd_progress_check);
/* Perform RCU core processing work for the current CPU. */
static __latent_entropy void rcu_core(struct softirq_action *unused)
{
@ -3559,13 +3081,11 @@ static int rcu_pm_notify(struct notifier_block *self,
switch (action) {
case PM_HIBERNATION_PREPARE:
case PM_SUSPEND_PREPARE:
if (nr_cpu_ids <= 256) /* Expediting bad for large systems. */
rcu_expedite_gp();
rcu_expedite_gp();
break;
case PM_POST_HIBERNATION:
case PM_POST_SUSPEND:
if (nr_cpu_ids <= 256) /* Expediting bad for large systems. */
rcu_unexpedite_gp();
rcu_unexpedite_gp();
break;
default:
break;
@ -3742,8 +3262,7 @@ static void __init rcu_init_geometry(void)
jiffies_till_first_fqs = d;
if (jiffies_till_next_fqs == ULONG_MAX)
jiffies_till_next_fqs = d;
if (jiffies_till_sched_qs == ULONG_MAX)
adjust_jiffies_till_sched_qs();
adjust_jiffies_till_sched_qs();
/* If the compile-time values are accurate, just leave. */
if (rcu_fanout_leaf == RCU_FANOUT_LEAF &&
@ -3858,5 +3377,6 @@ void __init rcu_init(void)
srcu_init();
}
#include "tree_stall.h"
#include "tree_exp.h"
#include "tree_plugin.h"

View File

@ -393,15 +393,13 @@ static const char *tp_rcu_varname __used __tracepoint_string = rcu_name;
int rcu_dynticks_snap(struct rcu_data *rdp);
/* Forward declarations for rcutree_plugin.h */
/* Forward declarations for tree_plugin.h */
static void rcu_bootup_announce(void);
static void rcu_qs(void);
static int rcu_preempt_blocked_readers_cgp(struct rcu_node *rnp);
#ifdef CONFIG_HOTPLUG_CPU
static bool rcu_preempt_has_tasks(struct rcu_node *rnp);
#endif /* #ifdef CONFIG_HOTPLUG_CPU */
static void rcu_print_detail_task_stall(void);
static int rcu_print_task_stall(struct rcu_node *rnp);
static int rcu_print_task_exp_stall(struct rcu_node *rnp);
static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp);
static void rcu_flavor_sched_clock_irq(int user);
@ -418,9 +416,6 @@ static void rcu_prepare_for_idle(void);
static bool rcu_preempt_has_tasks(struct rcu_node *rnp);
static bool rcu_preempt_need_deferred_qs(struct task_struct *t);
static void rcu_preempt_deferred_qs(struct task_struct *t);
static void print_cpu_stall_info_begin(void);
static void print_cpu_stall_info(int cpu);
static void print_cpu_stall_info_end(void);
static void zero_cpu_stall_ticks(struct rcu_data *rdp);
static bool rcu_nocb_cpu_needs_barrier(int cpu);
static struct swait_queue_head *rcu_nocb_gp_get(struct rcu_node *rnp);
@ -445,3 +440,10 @@ static void rcu_bind_gp_kthread(void);
static bool rcu_nohz_full_cpu(void);
static void rcu_dynticks_task_enter(void);
static void rcu_dynticks_task_exit(void);
/* Forward declarations for tree_stall.h */
static void record_gp_stall_check_time(void);
static void rcu_iw_handler(struct irq_work *iwp);
static void check_cpu_stall(struct rcu_data *rdp);
static void rcu_check_gp_start_stall(struct rcu_node *rnp, struct rcu_data *rdp,
const unsigned long gpssdelay);

View File

@ -10,6 +10,7 @@
#include <linux/lockdep.h>
static void rcu_exp_handler(void *unused);
static int rcu_print_task_exp_stall(struct rcu_node *rnp);
/*
* Record the start of an expedited grace period.
@ -633,7 +634,7 @@ static void rcu_exp_handler(void *unused)
raw_spin_lock_irqsave_rcu_node(rnp, flags);
if (rnp->expmask & rdp->grpmask) {
rdp->deferred_qs = true;
WRITE_ONCE(t->rcu_read_unlock_special.b.exp_hint, true);
t->rcu_read_unlock_special.b.exp_hint = true;
}
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
return;
@ -648,7 +649,7 @@ static void rcu_exp_handler(void *unused)
*
* If the CPU is fully enabled (or if some buggy RCU-preempt
* read-side critical section is being used from idle), just
* invoke rcu_preempt_defer_qs() to immediately report the
* invoke rcu_preempt_deferred_qs() to immediately report the
* quiescent state. We cannot use rcu_read_unlock_special()
* because we are in an interrupt handler, which will cause that
* function to take an early exit without doing anything.
@ -670,6 +671,27 @@ static void sync_sched_exp_online_cleanup(int cpu)
{
}
/*
* Scan the current list of tasks blocked within RCU read-side critical
* sections, printing out the tid of each that is blocking the current
* expedited grace period.
*/
static int rcu_print_task_exp_stall(struct rcu_node *rnp)
{
struct task_struct *t;
int ndetected = 0;
if (!rnp->exp_tasks)
return 0;
t = list_entry(rnp->exp_tasks->prev,
struct task_struct, rcu_node_entry);
list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry) {
pr_cont(" P%d", t->pid);
ndetected++;
}
return ndetected;
}
#else /* #ifdef CONFIG_PREEMPT_RCU */
/* Invoked on each online non-idle CPU for expedited quiescent state. */
@ -709,6 +731,16 @@ static void sync_sched_exp_online_cleanup(int cpu)
WARN_ON_ONCE(ret);
}
/*
* Because preemptible RCU does not exist, we never have to check for
* tasks blocked within RCU read-side critical sections that are
* blocking the current expedited grace period.
*/
static int rcu_print_task_exp_stall(struct rcu_node *rnp)
{
return 0;
}
#endif /* #else #ifdef CONFIG_PREEMPT_RCU */
/**

View File

@ -285,7 +285,7 @@ static void rcu_qs(void)
TPS("cpuqs"));
__this_cpu_write(rcu_data.cpu_no_qs.b.norm, false);
barrier(); /* Coordinate with rcu_flavor_sched_clock_irq(). */
current->rcu_read_unlock_special.b.need_qs = false;
WRITE_ONCE(current->rcu_read_unlock_special.b.need_qs, false);
}
}
@ -642,100 +642,6 @@ static void rcu_read_unlock_special(struct task_struct *t)
rcu_preempt_deferred_qs_irqrestore(t, flags);
}
/*
* Dump detailed information for all tasks blocking the current RCU
* grace period on the specified rcu_node structure.
*/
static void rcu_print_detail_task_stall_rnp(struct rcu_node *rnp)
{
unsigned long flags;
struct task_struct *t;
raw_spin_lock_irqsave_rcu_node(rnp, flags);
if (!rcu_preempt_blocked_readers_cgp(rnp)) {
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
return;
}
t = list_entry(rnp->gp_tasks->prev,
struct task_struct, rcu_node_entry);
list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry) {
/*
* We could be printing a lot while holding a spinlock.
* Avoid triggering hard lockup.
*/
touch_nmi_watchdog();
sched_show_task(t);
}
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
}
/*
* Dump detailed information for all tasks blocking the current RCU
* grace period.
*/
static void rcu_print_detail_task_stall(void)
{
struct rcu_node *rnp = rcu_get_root();
rcu_print_detail_task_stall_rnp(rnp);
rcu_for_each_leaf_node(rnp)
rcu_print_detail_task_stall_rnp(rnp);
}
static void rcu_print_task_stall_begin(struct rcu_node *rnp)
{
pr_err("\tTasks blocked on level-%d rcu_node (CPUs %d-%d):",
rnp->level, rnp->grplo, rnp->grphi);
}
static void rcu_print_task_stall_end(void)
{
pr_cont("\n");
}
/*
* Scan the current list of tasks blocked within RCU read-side critical
* sections, printing out the tid of each.
*/
static int rcu_print_task_stall(struct rcu_node *rnp)
{
struct task_struct *t;
int ndetected = 0;
if (!rcu_preempt_blocked_readers_cgp(rnp))
return 0;
rcu_print_task_stall_begin(rnp);
t = list_entry(rnp->gp_tasks->prev,
struct task_struct, rcu_node_entry);
list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry) {
pr_cont(" P%d", t->pid);
ndetected++;
}
rcu_print_task_stall_end();
return ndetected;
}
/*
* Scan the current list of tasks blocked within RCU read-side critical
* sections, printing out the tid of each that is blocking the current
* expedited grace period.
*/
static int rcu_print_task_exp_stall(struct rcu_node *rnp)
{
struct task_struct *t;
int ndetected = 0;
if (!rnp->exp_tasks)
return 0;
t = list_entry(rnp->exp_tasks->prev,
struct task_struct, rcu_node_entry);
list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry) {
pr_cont(" P%d", t->pid);
ndetected++;
}
return ndetected;
}
/*
* Check that the list of blocked tasks for the newly completed grace
* period is in fact empty. It is a serious bug to complete a grace
@ -804,19 +710,25 @@ static void rcu_flavor_sched_clock_irq(int user)
/*
* Check for a task exiting while in a preemptible-RCU read-side
* critical section, clean up if so. No need to issue warnings,
* as debug_check_no_locks_held() already does this if lockdep
* is enabled.
* critical section, clean up if so. No need to issue warnings, as
* debug_check_no_locks_held() already does this if lockdep is enabled.
* Besides, if this function does anything other than just immediately
* return, there was a bug of some sort. Spewing warnings from this
* function is like as not to simply obscure important prior warnings.
*/
void exit_rcu(void)
{
struct task_struct *t = current;
if (likely(list_empty(&current->rcu_node_entry)))
if (unlikely(!list_empty(&current->rcu_node_entry))) {
t->rcu_read_lock_nesting = 1;
barrier();
WRITE_ONCE(t->rcu_read_unlock_special.b.blocked, true);
} else if (unlikely(t->rcu_read_lock_nesting)) {
t->rcu_read_lock_nesting = 1;
} else {
return;
t->rcu_read_lock_nesting = 1;
barrier();
t->rcu_read_unlock_special.b.blocked = true;
}
__rcu_read_unlock();
rcu_preempt_deferred_qs(current);
}
@ -979,33 +891,6 @@ static bool rcu_preempt_need_deferred_qs(struct task_struct *t)
}
static void rcu_preempt_deferred_qs(struct task_struct *t) { }
/*
* Because preemptible RCU does not exist, we never have to check for
* tasks blocked within RCU read-side critical sections.
*/
static void rcu_print_detail_task_stall(void)
{
}
/*
* Because preemptible RCU does not exist, we never have to check for
* tasks blocked within RCU read-side critical sections.
*/
static int rcu_print_task_stall(struct rcu_node *rnp)
{
return 0;
}
/*
* Because preemptible RCU does not exist, we never have to check for
* tasks blocked within RCU read-side critical sections that are
* blocking the current expedited grace period.
*/
static int rcu_print_task_exp_stall(struct rcu_node *rnp)
{
return 0;
}
/*
* Because there is no preemptible RCU, there can be no readers blocked,
* so there is no need to check for blocked tasks. So check only for
@ -1185,8 +1070,6 @@ static int rcu_boost_kthread(void *arg)
static void rcu_initiate_boost(struct rcu_node *rnp, unsigned long flags)
__releases(rnp->lock)
{
struct task_struct *t;
raw_lockdep_assert_held_rcu_node(rnp);
if (!rcu_preempt_blocked_readers_cgp(rnp) && rnp->exp_tasks == NULL) {
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
@ -1200,9 +1083,8 @@ static void rcu_initiate_boost(struct rcu_node *rnp, unsigned long flags)
if (rnp->exp_tasks == NULL)
rnp->boost_tasks = rnp->gp_tasks;
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
t = rnp->boost_kthread_task;
if (t)
rcu_wake_cond(t, rnp->boost_kthread_status);
rcu_wake_cond(rnp->boost_kthread_task,
rnp->boost_kthread_status);
} else {
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
}
@ -1649,98 +1531,6 @@ static void rcu_cleanup_after_idle(void)
#endif /* #else #if !defined(CONFIG_RCU_FAST_NO_HZ) */
#ifdef CONFIG_RCU_FAST_NO_HZ
static void print_cpu_stall_fast_no_hz(char *cp, int cpu)
{
struct rcu_data *rdp = &per_cpu(rcu_data, cpu);
sprintf(cp, "last_accelerate: %04lx/%04lx, Nonlazy posted: %c%c%c",
rdp->last_accelerate & 0xffff, jiffies & 0xffff,
".l"[rdp->all_lazy],
".L"[!rcu_segcblist_n_nonlazy_cbs(&rdp->cblist)],
".D"[!rdp->tick_nohz_enabled_snap]);
}
#else /* #ifdef CONFIG_RCU_FAST_NO_HZ */
static void print_cpu_stall_fast_no_hz(char *cp, int cpu)
{
*cp = '\0';
}
#endif /* #else #ifdef CONFIG_RCU_FAST_NO_HZ */
/* Initiate the stall-info list. */
static void print_cpu_stall_info_begin(void)
{
pr_cont("\n");
}
/*
* Print out diagnostic information for the specified stalled CPU.
*
* If the specified CPU is aware of the current RCU grace period, then
* print the number of scheduling clock interrupts the CPU has taken
* during the time that it has been aware. Otherwise, print the number
* of RCU grace periods that this CPU is ignorant of, for example, "1"
* if the CPU was aware of the previous grace period.
*
* Also print out idle and (if CONFIG_RCU_FAST_NO_HZ) idle-entry info.
*/
static void print_cpu_stall_info(int cpu)
{
unsigned long delta;
char fast_no_hz[72];
struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
char *ticks_title;
unsigned long ticks_value;
/*
* We could be printing a lot while holding a spinlock. Avoid
* triggering hard lockup.
*/
touch_nmi_watchdog();
ticks_value = rcu_seq_ctr(rcu_state.gp_seq - rdp->gp_seq);
if (ticks_value) {
ticks_title = "GPs behind";
} else {
ticks_title = "ticks this GP";
ticks_value = rdp->ticks_this_gp;
}
print_cpu_stall_fast_no_hz(fast_no_hz, cpu);
delta = rcu_seq_ctr(rdp->mynode->gp_seq - rdp->rcu_iw_gp_seq);
pr_err("\t%d-%c%c%c%c: (%lu %s) idle=%03x/%ld/%#lx softirq=%u/%u fqs=%ld %s\n",
cpu,
"O."[!!cpu_online(cpu)],
"o."[!!(rdp->grpmask & rdp->mynode->qsmaskinit)],
"N."[!!(rdp->grpmask & rdp->mynode->qsmaskinitnext)],
!IS_ENABLED(CONFIG_IRQ_WORK) ? '?' :
rdp->rcu_iw_pending ? (int)min(delta, 9UL) + '0' :
"!."[!delta],
ticks_value, ticks_title,
rcu_dynticks_snap(rdp) & 0xfff,
rdp->dynticks_nesting, rdp->dynticks_nmi_nesting,
rdp->softirq_snap, kstat_softirqs_cpu(RCU_SOFTIRQ, cpu),
READ_ONCE(rcu_state.n_force_qs) - rcu_state.n_force_qs_gpstart,
fast_no_hz);
}
/* Terminate the stall-info list. */
static void print_cpu_stall_info_end(void)
{
pr_err("\t");
}
/* Zero ->ticks_this_gp and snapshot the number of RCU softirq handlers. */
static void zero_cpu_stall_ticks(struct rcu_data *rdp)
{
rdp->ticks_this_gp = 0;
rdp->softirq_snap = kstat_softirqs_cpu(RCU_SOFTIRQ, smp_processor_id());
WRITE_ONCE(rdp->last_fqs_resched, jiffies);
}
#ifdef CONFIG_RCU_NOCB_CPU
/*
@ -1766,11 +1556,22 @@ static void zero_cpu_stall_ticks(struct rcu_data *rdp)
*/
/* Parse the boot-time rcu_nocb_mask CPU list from the kernel parameters. */
/*
* Parse the boot-time rcu_nocb_mask CPU list from the kernel parameters.
* The string after the "rcu_nocbs=" is either "all" for all CPUs, or a
* comma-separated list of CPUs and/or CPU ranges. If an invalid list is
* given, a warning is emitted and all CPUs are offloaded.
*/
static int __init rcu_nocb_setup(char *str)
{
alloc_bootmem_cpumask_var(&rcu_nocb_mask);
cpulist_parse(str, rcu_nocb_mask);
if (!strcasecmp(str, "all"))
cpumask_setall(rcu_nocb_mask);
else
if (cpulist_parse(str, rcu_nocb_mask)) {
pr_warn("rcu_nocbs= bad CPU range, all CPUs set\n");
cpumask_setall(rcu_nocb_mask);
}
return 1;
}
__setup("rcu_nocbs=", rcu_nocb_setup);

View File

@ -0,0 +1,709 @@
// SPDX-License-Identifier: GPL-2.0+
/*
* RCU CPU stall warnings for normal RCU grace periods
*
* Copyright IBM Corporation, 2019
*
* Author: Paul E. McKenney <paulmck@linux.ibm.com>
*/
//////////////////////////////////////////////////////////////////////////////
//
// Controlling CPU stall warnings, including delay calculation.
/* panic() on RCU Stall sysctl. */
int sysctl_panic_on_rcu_stall __read_mostly;
#ifdef CONFIG_PROVE_RCU
#define RCU_STALL_DELAY_DELTA (5 * HZ)
#else
#define RCU_STALL_DELAY_DELTA 0
#endif
/* Limit-check stall timeouts specified at boottime and runtime. */
int rcu_jiffies_till_stall_check(void)
{
int till_stall_check = READ_ONCE(rcu_cpu_stall_timeout);
/*
* Limit check must be consistent with the Kconfig limits
* for CONFIG_RCU_CPU_STALL_TIMEOUT.
*/
if (till_stall_check < 3) {
WRITE_ONCE(rcu_cpu_stall_timeout, 3);
till_stall_check = 3;
} else if (till_stall_check > 300) {
WRITE_ONCE(rcu_cpu_stall_timeout, 300);
till_stall_check = 300;
}
return till_stall_check * HZ + RCU_STALL_DELAY_DELTA;
}
EXPORT_SYMBOL_GPL(rcu_jiffies_till_stall_check);
/* Don't do RCU CPU stall warnings during long sysrq printouts. */
void rcu_sysrq_start(void)
{
if (!rcu_cpu_stall_suppress)
rcu_cpu_stall_suppress = 2;
}
void rcu_sysrq_end(void)
{
if (rcu_cpu_stall_suppress == 2)
rcu_cpu_stall_suppress = 0;
}
/* Don't print RCU CPU stall warnings during a kernel panic. */
static int rcu_panic(struct notifier_block *this, unsigned long ev, void *ptr)
{
rcu_cpu_stall_suppress = 1;
return NOTIFY_DONE;
}
static struct notifier_block rcu_panic_block = {
.notifier_call = rcu_panic,
};
static int __init check_cpu_stall_init(void)
{
atomic_notifier_chain_register(&panic_notifier_list, &rcu_panic_block);
return 0;
}
early_initcall(check_cpu_stall_init);
/* If so specified via sysctl, panic, yielding cleaner stall-warning output. */
static void panic_on_rcu_stall(void)
{
if (sysctl_panic_on_rcu_stall)
panic("RCU Stall\n");
}
/**
* rcu_cpu_stall_reset - prevent further stall warnings in current grace period
*
* Set the stall-warning timeout way off into the future, thus preventing
* any RCU CPU stall-warning messages from appearing in the current set of
* RCU grace periods.
*
* The caller must disable hard irqs.
*/
void rcu_cpu_stall_reset(void)
{
WRITE_ONCE(rcu_state.jiffies_stall, jiffies + ULONG_MAX / 2);
}
//////////////////////////////////////////////////////////////////////////////
//
// Interaction with RCU grace periods
/* Start of new grace period, so record stall time (and forcing times). */
static void record_gp_stall_check_time(void)
{
unsigned long j = jiffies;
unsigned long j1;
rcu_state.gp_start = j;
j1 = rcu_jiffies_till_stall_check();
/* Record ->gp_start before ->jiffies_stall. */
smp_store_release(&rcu_state.jiffies_stall, j + j1); /* ^^^ */
rcu_state.jiffies_resched = j + j1 / 2;
rcu_state.n_force_qs_gpstart = READ_ONCE(rcu_state.n_force_qs);
}
/* Zero ->ticks_this_gp and snapshot the number of RCU softirq handlers. */
static void zero_cpu_stall_ticks(struct rcu_data *rdp)
{
rdp->ticks_this_gp = 0;
rdp->softirq_snap = kstat_softirqs_cpu(RCU_SOFTIRQ, smp_processor_id());
WRITE_ONCE(rdp->last_fqs_resched, jiffies);
}
/*
* If too much time has passed in the current grace period, and if
* so configured, go kick the relevant kthreads.
*/
static void rcu_stall_kick_kthreads(void)
{
unsigned long j;
if (!rcu_kick_kthreads)
return;
j = READ_ONCE(rcu_state.jiffies_kick_kthreads);
if (time_after(jiffies, j) && rcu_state.gp_kthread &&
(rcu_gp_in_progress() || READ_ONCE(rcu_state.gp_flags))) {
WARN_ONCE(1, "Kicking %s grace-period kthread\n",
rcu_state.name);
rcu_ftrace_dump(DUMP_ALL);
wake_up_process(rcu_state.gp_kthread);
WRITE_ONCE(rcu_state.jiffies_kick_kthreads, j + HZ);
}
}
/*
* Handler for the irq_work request posted about halfway into the RCU CPU
* stall timeout, and used to detect excessive irq disabling. Set state
* appropriately, but just complain if there is unexpected state on entry.
*/
static void rcu_iw_handler(struct irq_work *iwp)
{
struct rcu_data *rdp;
struct rcu_node *rnp;
rdp = container_of(iwp, struct rcu_data, rcu_iw);
rnp = rdp->mynode;
raw_spin_lock_rcu_node(rnp);
if (!WARN_ON_ONCE(!rdp->rcu_iw_pending)) {
rdp->rcu_iw_gp_seq = rnp->gp_seq;
rdp->rcu_iw_pending = false;
}
raw_spin_unlock_rcu_node(rnp);
}
//////////////////////////////////////////////////////////////////////////////
//
// Printing RCU CPU stall warnings
#ifdef CONFIG_PREEMPT
/*
* Dump detailed information for all tasks blocking the current RCU
* grace period on the specified rcu_node structure.
*/
static void rcu_print_detail_task_stall_rnp(struct rcu_node *rnp)
{
unsigned long flags;
struct task_struct *t;
raw_spin_lock_irqsave_rcu_node(rnp, flags);
if (!rcu_preempt_blocked_readers_cgp(rnp)) {
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
return;
}
t = list_entry(rnp->gp_tasks->prev,
struct task_struct, rcu_node_entry);
list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry) {
/*
* We could be printing a lot while holding a spinlock.
* Avoid triggering hard lockup.
*/
touch_nmi_watchdog();
sched_show_task(t);
}
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
}
/*
* Scan the current list of tasks blocked within RCU read-side critical
* sections, printing out the tid of each.
*/
static int rcu_print_task_stall(struct rcu_node *rnp)
{
struct task_struct *t;
int ndetected = 0;
if (!rcu_preempt_blocked_readers_cgp(rnp))
return 0;
pr_err("\tTasks blocked on level-%d rcu_node (CPUs %d-%d):",
rnp->level, rnp->grplo, rnp->grphi);
t = list_entry(rnp->gp_tasks->prev,
struct task_struct, rcu_node_entry);
list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry) {
pr_cont(" P%d", t->pid);
ndetected++;
}
pr_cont("\n");
return ndetected;
}
#else /* #ifdef CONFIG_PREEMPT */
/*
* Because preemptible RCU does not exist, we never have to check for
* tasks blocked within RCU read-side critical sections.
*/
static void rcu_print_detail_task_stall_rnp(struct rcu_node *rnp)
{
}
/*
* Because preemptible RCU does not exist, we never have to check for
* tasks blocked within RCU read-side critical sections.
*/
static int rcu_print_task_stall(struct rcu_node *rnp)
{
return 0;
}
#endif /* #else #ifdef CONFIG_PREEMPT */
/*
* Dump stacks of all tasks running on stalled CPUs. First try using
* NMIs, but fall back to manual remote stack tracing on architectures
* that don't support NMI-based stack dumps. The NMI-triggered stack
* traces are more accurate because they are printed by the target CPU.
*/
static void rcu_dump_cpu_stacks(void)
{
int cpu;
unsigned long flags;
struct rcu_node *rnp;
rcu_for_each_leaf_node(rnp) {
raw_spin_lock_irqsave_rcu_node(rnp, flags);
for_each_leaf_node_possible_cpu(rnp, cpu)
if (rnp->qsmask & leaf_node_cpu_bit(rnp, cpu))
if (!trigger_single_cpu_backtrace(cpu))
dump_cpu_task(cpu);
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
}
}
#ifdef CONFIG_RCU_FAST_NO_HZ
static void print_cpu_stall_fast_no_hz(char *cp, int cpu)
{
struct rcu_data *rdp = &per_cpu(rcu_data, cpu);
sprintf(cp, "last_accelerate: %04lx/%04lx, Nonlazy posted: %c%c%c",
rdp->last_accelerate & 0xffff, jiffies & 0xffff,
".l"[rdp->all_lazy],
".L"[!rcu_segcblist_n_nonlazy_cbs(&rdp->cblist)],
".D"[!!rdp->tick_nohz_enabled_snap]);
}
#else /* #ifdef CONFIG_RCU_FAST_NO_HZ */
static void print_cpu_stall_fast_no_hz(char *cp, int cpu)
{
*cp = '\0';
}
#endif /* #else #ifdef CONFIG_RCU_FAST_NO_HZ */
/*
* Print out diagnostic information for the specified stalled CPU.
*
* If the specified CPU is aware of the current RCU grace period, then
* print the number of scheduling clock interrupts the CPU has taken
* during the time that it has been aware. Otherwise, print the number
* of RCU grace periods that this CPU is ignorant of, for example, "1"
* if the CPU was aware of the previous grace period.
*
* Also print out idle and (if CONFIG_RCU_FAST_NO_HZ) idle-entry info.
*/
static void print_cpu_stall_info(int cpu)
{
unsigned long delta;
char fast_no_hz[72];
struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
char *ticks_title;
unsigned long ticks_value;
/*
* We could be printing a lot while holding a spinlock. Avoid
* triggering hard lockup.
*/
touch_nmi_watchdog();
ticks_value = rcu_seq_ctr(rcu_state.gp_seq - rdp->gp_seq);
if (ticks_value) {
ticks_title = "GPs behind";
} else {
ticks_title = "ticks this GP";
ticks_value = rdp->ticks_this_gp;
}
print_cpu_stall_fast_no_hz(fast_no_hz, cpu);
delta = rcu_seq_ctr(rdp->mynode->gp_seq - rdp->rcu_iw_gp_seq);
pr_err("\t%d-%c%c%c%c: (%lu %s) idle=%03x/%ld/%#lx softirq=%u/%u fqs=%ld %s\n",
cpu,
"O."[!!cpu_online(cpu)],
"o."[!!(rdp->grpmask & rdp->mynode->qsmaskinit)],
"N."[!!(rdp->grpmask & rdp->mynode->qsmaskinitnext)],
!IS_ENABLED(CONFIG_IRQ_WORK) ? '?' :
rdp->rcu_iw_pending ? (int)min(delta, 9UL) + '0' :
"!."[!delta],
ticks_value, ticks_title,
rcu_dynticks_snap(rdp) & 0xfff,
rdp->dynticks_nesting, rdp->dynticks_nmi_nesting,
rdp->softirq_snap, kstat_softirqs_cpu(RCU_SOFTIRQ, cpu),
READ_ONCE(rcu_state.n_force_qs) - rcu_state.n_force_qs_gpstart,
fast_no_hz);
}
/* Complain about starvation of grace-period kthread. */
static void rcu_check_gp_kthread_starvation(void)
{
struct task_struct *gpk = rcu_state.gp_kthread;
unsigned long j;
j = jiffies - READ_ONCE(rcu_state.gp_activity);
if (j > 2 * HZ) {
pr_err("%s kthread starved for %ld jiffies! g%ld f%#x %s(%d) ->state=%#lx ->cpu=%d\n",
rcu_state.name, j,
(long)rcu_seq_current(&rcu_state.gp_seq),
READ_ONCE(rcu_state.gp_flags),
gp_state_getname(rcu_state.gp_state), rcu_state.gp_state,
gpk ? gpk->state : ~0, gpk ? task_cpu(gpk) : -1);
if (gpk) {
pr_err("RCU grace-period kthread stack dump:\n");
sched_show_task(gpk);
wake_up_process(gpk);
}
}
}
static void print_other_cpu_stall(unsigned long gp_seq)
{
int cpu;
unsigned long flags;
unsigned long gpa;
unsigned long j;
int ndetected = 0;
struct rcu_node *rnp;
long totqlen = 0;
/* Kick and suppress, if so configured. */
rcu_stall_kick_kthreads();
if (rcu_cpu_stall_suppress)
return;
/*
* OK, time to rat on our buddy...
* See Documentation/RCU/stallwarn.txt for info on how to debug
* RCU CPU stall warnings.
*/
pr_err("INFO: %s detected stalls on CPUs/tasks:\n", rcu_state.name);
rcu_for_each_leaf_node(rnp) {
raw_spin_lock_irqsave_rcu_node(rnp, flags);
ndetected += rcu_print_task_stall(rnp);
if (rnp->qsmask != 0) {
for_each_leaf_node_possible_cpu(rnp, cpu)
if (rnp->qsmask & leaf_node_cpu_bit(rnp, cpu)) {
print_cpu_stall_info(cpu);
ndetected++;
}
}
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
}
for_each_possible_cpu(cpu)
totqlen += rcu_get_n_cbs_cpu(cpu);
pr_cont("\t(detected by %d, t=%ld jiffies, g=%ld, q=%lu)\n",
smp_processor_id(), (long)(jiffies - rcu_state.gp_start),
(long)rcu_seq_current(&rcu_state.gp_seq), totqlen);
if (ndetected) {
rcu_dump_cpu_stacks();
/* Complain about tasks blocking the grace period. */
rcu_for_each_leaf_node(rnp)
rcu_print_detail_task_stall_rnp(rnp);
} else {
if (rcu_seq_current(&rcu_state.gp_seq) != gp_seq) {
pr_err("INFO: Stall ended before state dump start\n");
} else {
j = jiffies;
gpa = READ_ONCE(rcu_state.gp_activity);
pr_err("All QSes seen, last %s kthread activity %ld (%ld-%ld), jiffies_till_next_fqs=%ld, root ->qsmask %#lx\n",
rcu_state.name, j - gpa, j, gpa,
READ_ONCE(jiffies_till_next_fqs),
rcu_get_root()->qsmask);
/* In this case, the current CPU might be at fault. */
sched_show_task(current);
}
}
/* Rewrite if needed in case of slow consoles. */
if (ULONG_CMP_GE(jiffies, READ_ONCE(rcu_state.jiffies_stall)))
WRITE_ONCE(rcu_state.jiffies_stall,
jiffies + 3 * rcu_jiffies_till_stall_check() + 3);
rcu_check_gp_kthread_starvation();
panic_on_rcu_stall();
rcu_force_quiescent_state(); /* Kick them all. */
}
static void print_cpu_stall(void)
{
int cpu;
unsigned long flags;
struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
struct rcu_node *rnp = rcu_get_root();
long totqlen = 0;
/* Kick and suppress, if so configured. */
rcu_stall_kick_kthreads();
if (rcu_cpu_stall_suppress)
return;
/*
* OK, time to rat on ourselves...
* See Documentation/RCU/stallwarn.txt for info on how to debug
* RCU CPU stall warnings.
*/
pr_err("INFO: %s self-detected stall on CPU\n", rcu_state.name);
raw_spin_lock_irqsave_rcu_node(rdp->mynode, flags);
print_cpu_stall_info(smp_processor_id());
raw_spin_unlock_irqrestore_rcu_node(rdp->mynode, flags);
for_each_possible_cpu(cpu)
totqlen += rcu_get_n_cbs_cpu(cpu);
pr_cont("\t(t=%lu jiffies g=%ld q=%lu)\n",
jiffies - rcu_state.gp_start,
(long)rcu_seq_current(&rcu_state.gp_seq), totqlen);
rcu_check_gp_kthread_starvation();
rcu_dump_cpu_stacks();
raw_spin_lock_irqsave_rcu_node(rnp, flags);
/* Rewrite if needed in case of slow consoles. */
if (ULONG_CMP_GE(jiffies, READ_ONCE(rcu_state.jiffies_stall)))
WRITE_ONCE(rcu_state.jiffies_stall,
jiffies + 3 * rcu_jiffies_till_stall_check() + 3);
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
panic_on_rcu_stall();
/*
* Attempt to revive the RCU machinery by forcing a context switch.
*
* A context switch would normally allow the RCU state machine to make
* progress and it could be we're stuck in kernel space without context
* switches for an entirely unreasonable amount of time.
*/
set_tsk_need_resched(current);
set_preempt_need_resched();
}
static void check_cpu_stall(struct rcu_data *rdp)
{
unsigned long gs1;
unsigned long gs2;
unsigned long gps;
unsigned long j;
unsigned long jn;
unsigned long js;
struct rcu_node *rnp;
if ((rcu_cpu_stall_suppress && !rcu_kick_kthreads) ||
!rcu_gp_in_progress())
return;
rcu_stall_kick_kthreads();
j = jiffies;
/*
* Lots of memory barriers to reject false positives.
*
* The idea is to pick up rcu_state.gp_seq, then
* rcu_state.jiffies_stall, then rcu_state.gp_start, and finally
* another copy of rcu_state.gp_seq. These values are updated in
* the opposite order with memory barriers (or equivalent) during
* grace-period initialization and cleanup. Now, a false positive
* can occur if we get an new value of rcu_state.gp_start and a old
* value of rcu_state.jiffies_stall. But given the memory barriers,
* the only way that this can happen is if one grace period ends
* and another starts between these two fetches. This is detected
* by comparing the second fetch of rcu_state.gp_seq with the
* previous fetch from rcu_state.gp_seq.
*
* Given this check, comparisons of jiffies, rcu_state.jiffies_stall,
* and rcu_state.gp_start suffice to forestall false positives.
*/
gs1 = READ_ONCE(rcu_state.gp_seq);
smp_rmb(); /* Pick up ->gp_seq first... */
js = READ_ONCE(rcu_state.jiffies_stall);
smp_rmb(); /* ...then ->jiffies_stall before the rest... */
gps = READ_ONCE(rcu_state.gp_start);
smp_rmb(); /* ...and finally ->gp_start before ->gp_seq again. */
gs2 = READ_ONCE(rcu_state.gp_seq);
if (gs1 != gs2 ||
ULONG_CMP_LT(j, js) ||
ULONG_CMP_GE(gps, js))
return; /* No stall or GP completed since entering function. */
rnp = rdp->mynode;
jn = jiffies + 3 * rcu_jiffies_till_stall_check() + 3;
if (rcu_gp_in_progress() &&
(READ_ONCE(rnp->qsmask) & rdp->grpmask) &&
cmpxchg(&rcu_state.jiffies_stall, js, jn) == js) {
/* We haven't checked in, so go dump stack. */
print_cpu_stall();
} else if (rcu_gp_in_progress() &&
ULONG_CMP_GE(j, js + RCU_STALL_RAT_DELAY) &&
cmpxchg(&rcu_state.jiffies_stall, js, jn) == js) {
/* They had a few time units to dump stack, so complain. */
print_other_cpu_stall(gs2);
}
}
//////////////////////////////////////////////////////////////////////////////
//
// RCU forward-progress mechanisms, including of callback invocation.
/*
* Show the state of the grace-period kthreads.
*/
void show_rcu_gp_kthreads(void)
{
int cpu;
unsigned long j;
unsigned long ja;
unsigned long jr;
unsigned long jw;
struct rcu_data *rdp;
struct rcu_node *rnp;
j = jiffies;
ja = j - READ_ONCE(rcu_state.gp_activity);
jr = j - READ_ONCE(rcu_state.gp_req_activity);
jw = j - READ_ONCE(rcu_state.gp_wake_time);
pr_info("%s: wait state: %s(%d) ->state: %#lx delta ->gp_activity %lu ->gp_req_activity %lu ->gp_wake_time %lu ->gp_wake_seq %ld ->gp_seq %ld ->gp_seq_needed %ld ->gp_flags %#x\n",
rcu_state.name, gp_state_getname(rcu_state.gp_state),
rcu_state.gp_state,
rcu_state.gp_kthread ? rcu_state.gp_kthread->state : 0x1ffffL,
ja, jr, jw, (long)READ_ONCE(rcu_state.gp_wake_seq),
(long)READ_ONCE(rcu_state.gp_seq),
(long)READ_ONCE(rcu_get_root()->gp_seq_needed),
READ_ONCE(rcu_state.gp_flags));
rcu_for_each_node_breadth_first(rnp) {
if (ULONG_CMP_GE(rcu_state.gp_seq, rnp->gp_seq_needed))
continue;
pr_info("\trcu_node %d:%d ->gp_seq %ld ->gp_seq_needed %ld\n",
rnp->grplo, rnp->grphi, (long)rnp->gp_seq,
(long)rnp->gp_seq_needed);
if (!rcu_is_leaf_node(rnp))
continue;
for_each_leaf_node_possible_cpu(rnp, cpu) {
rdp = per_cpu_ptr(&rcu_data, cpu);
if (rdp->gpwrap ||
ULONG_CMP_GE(rcu_state.gp_seq,
rdp->gp_seq_needed))
continue;
pr_info("\tcpu %d ->gp_seq_needed %ld\n",
cpu, (long)rdp->gp_seq_needed);
}
}
/* sched_show_task(rcu_state.gp_kthread); */
}
EXPORT_SYMBOL_GPL(show_rcu_gp_kthreads);
/*
* This function checks for grace-period requests that fail to motivate
* RCU to come out of its idle mode.
*/
static void rcu_check_gp_start_stall(struct rcu_node *rnp, struct rcu_data *rdp,
const unsigned long gpssdelay)
{
unsigned long flags;
unsigned long j;
struct rcu_node *rnp_root = rcu_get_root();
static atomic_t warned = ATOMIC_INIT(0);
if (!IS_ENABLED(CONFIG_PROVE_RCU) || rcu_gp_in_progress() ||
ULONG_CMP_GE(rnp_root->gp_seq, rnp_root->gp_seq_needed))
return;
j = jiffies; /* Expensive access, and in common case don't get here. */
if (time_before(j, READ_ONCE(rcu_state.gp_req_activity) + gpssdelay) ||
time_before(j, READ_ONCE(rcu_state.gp_activity) + gpssdelay) ||
atomic_read(&warned))
return;
raw_spin_lock_irqsave_rcu_node(rnp, flags);
j = jiffies;
if (rcu_gp_in_progress() ||
ULONG_CMP_GE(rnp_root->gp_seq, rnp_root->gp_seq_needed) ||
time_before(j, READ_ONCE(rcu_state.gp_req_activity) + gpssdelay) ||
time_before(j, READ_ONCE(rcu_state.gp_activity) + gpssdelay) ||
atomic_read(&warned)) {
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
return;
}
/* Hold onto the leaf lock to make others see warned==1. */
if (rnp_root != rnp)
raw_spin_lock_rcu_node(rnp_root); /* irqs already disabled. */
j = jiffies;
if (rcu_gp_in_progress() ||
ULONG_CMP_GE(rnp_root->gp_seq, rnp_root->gp_seq_needed) ||
time_before(j, rcu_state.gp_req_activity + gpssdelay) ||
time_before(j, rcu_state.gp_activity + gpssdelay) ||
atomic_xchg(&warned, 1)) {
raw_spin_unlock_rcu_node(rnp_root); /* irqs remain disabled. */
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
return;
}
WARN_ON(1);
if (rnp_root != rnp)
raw_spin_unlock_rcu_node(rnp_root);
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
show_rcu_gp_kthreads();
}
/*
* Do a forward-progress check for rcutorture. This is normally invoked
* due to an OOM event. The argument "j" gives the time period during
* which rcutorture would like progress to have been made.
*/
void rcu_fwd_progress_check(unsigned long j)
{
unsigned long cbs;
int cpu;
unsigned long max_cbs = 0;
int max_cpu = -1;
struct rcu_data *rdp;
if (rcu_gp_in_progress()) {
pr_info("%s: GP age %lu jiffies\n",
__func__, jiffies - rcu_state.gp_start);
show_rcu_gp_kthreads();
} else {
pr_info("%s: Last GP end %lu jiffies ago\n",
__func__, jiffies - rcu_state.gp_end);
preempt_disable();
rdp = this_cpu_ptr(&rcu_data);
rcu_check_gp_start_stall(rdp->mynode, rdp, j);
preempt_enable();
}
for_each_possible_cpu(cpu) {
cbs = rcu_get_n_cbs_cpu(cpu);
if (!cbs)
continue;
if (max_cpu < 0)
pr_info("%s: callbacks", __func__);
pr_cont(" %d: %lu", cpu, cbs);
if (cbs <= max_cbs)
continue;
max_cbs = cbs;
max_cpu = cpu;
}
if (max_cpu >= 0)
pr_cont("\n");
}
EXPORT_SYMBOL_GPL(rcu_fwd_progress_check);
/* Commandeer a sysrq key to dump RCU's tree. */
static bool sysrq_rcu;
module_param(sysrq_rcu, bool, 0444);
/* Dump grace-period-request information due to commandeered sysrq. */
static void sysrq_show_rcu(int key)
{
show_rcu_gp_kthreads();
}
static struct sysrq_key_op sysrq_rcudump_op = {
.handler = sysrq_show_rcu,
.help_msg = "show-rcu(y)",
.action_msg = "Show RCU tree",
.enable_mask = SYSRQ_ENABLE_DUMP,
};
static int __init rcu_sysrq_init(void)
{
if (sysrq_rcu)
return register_sysrq_key('y', &sysrq_rcudump_op);
return 0;
}
early_initcall(rcu_sysrq_init);

View File

@ -424,68 +424,11 @@ EXPORT_SYMBOL_GPL(do_trace_rcu_torture_read);
#endif
#ifdef CONFIG_RCU_STALL_COMMON
#ifdef CONFIG_PROVE_RCU
#define RCU_STALL_DELAY_DELTA (5 * HZ)
#else
#define RCU_STALL_DELAY_DELTA 0
#endif
int rcu_cpu_stall_suppress __read_mostly; /* 1 = suppress stall warnings. */
EXPORT_SYMBOL_GPL(rcu_cpu_stall_suppress);
static int rcu_cpu_stall_timeout __read_mostly = CONFIG_RCU_CPU_STALL_TIMEOUT;
module_param(rcu_cpu_stall_suppress, int, 0644);
int rcu_cpu_stall_timeout __read_mostly = CONFIG_RCU_CPU_STALL_TIMEOUT;
module_param(rcu_cpu_stall_timeout, int, 0644);
int rcu_jiffies_till_stall_check(void)
{
int till_stall_check = READ_ONCE(rcu_cpu_stall_timeout);
/*
* Limit check must be consistent with the Kconfig limits
* for CONFIG_RCU_CPU_STALL_TIMEOUT.
*/
if (till_stall_check < 3) {
WRITE_ONCE(rcu_cpu_stall_timeout, 3);
till_stall_check = 3;
} else if (till_stall_check > 300) {
WRITE_ONCE(rcu_cpu_stall_timeout, 300);
till_stall_check = 300;
}
return till_stall_check * HZ + RCU_STALL_DELAY_DELTA;
}
EXPORT_SYMBOL_GPL(rcu_jiffies_till_stall_check);
void rcu_sysrq_start(void)
{
if (!rcu_cpu_stall_suppress)
rcu_cpu_stall_suppress = 2;
}
void rcu_sysrq_end(void)
{
if (rcu_cpu_stall_suppress == 2)
rcu_cpu_stall_suppress = 0;
}
static int rcu_panic(struct notifier_block *this, unsigned long ev, void *ptr)
{
rcu_cpu_stall_suppress = 1;
return NOTIFY_DONE;
}
static struct notifier_block rcu_panic_block = {
.notifier_call = rcu_panic,
};
static int __init check_cpu_stall_init(void)
{
atomic_notifier_chain_register(&panic_notifier_list, &rcu_panic_block);
return 0;
}
early_initcall(check_cpu_stall_init);
#endif /* #ifdef CONFIG_RCU_STALL_COMMON */
#ifdef CONFIG_TASKS_RCU

View File

@ -88,6 +88,8 @@ bool torture_offline(int cpu, long *n_offl_attempts, long *n_offl_successes,
if (!cpu_online(cpu) || !cpu_is_hotpluggable(cpu))
return false;
if (num_online_cpus() <= 1)
return false; /* Can't offline the last CPU. */
if (verbose > 1)
pr_alert("%s" TORTURE_FLAG

View File

@ -56,7 +56,7 @@ struct clusterip_config {
#endif
enum clusterip_hashmode hash_mode; /* which hashing mode */
u_int32_t hash_initval; /* hash initialization */
struct rcu_head rcu; /* for call_rcu_bh */
struct rcu_head rcu; /* for call_rcu */
struct net *net; /* netns for pernet list */
char ifname[IFNAMSIZ]; /* device ifname */
};

View File

@ -27,7 +27,7 @@ Explanation of the Linux-Kernel Memory Consistency Model
19. AND THEN THERE WAS ALPHA
20. THE HAPPENS-BEFORE RELATION: hb
21. THE PROPAGATES-BEFORE RELATION: pb
22. RCU RELATIONS: rcu-link, gp, rscs, rcu-fence, and rb
22. RCU RELATIONS: rcu-link, rcu-gp, rcu-rscsi, rcu-fence, and rb
23. LOCKING
24. ODDS AND ENDS
@ -1430,8 +1430,8 @@ they execute means that it cannot have cycles. This requirement is
the content of the LKMM's "propagation" axiom.
RCU RELATIONS: rcu-link, gp, rscs, rcu-fence, and rb
----------------------------------------------------
RCU RELATIONS: rcu-link, rcu-gp, rcu-rscsi, rcu-fence, and rb
-------------------------------------------------------------
RCU (Read-Copy-Update) is a powerful synchronization mechanism. It
rests on two concepts: grace periods and read-side critical sections.
@ -1446,17 +1446,19 @@ As far as memory models are concerned, RCU's main feature is its
Grace-Period Guarantee, which states that a critical section can never
span a full grace period. In more detail, the Guarantee says:
If a critical section starts before a grace period then it
must end before the grace period does. In addition, every
store that propagates to the critical section's CPU before the
end of the critical section must propagate to every CPU before
the end of the grace period.
For any critical section C and any grace period G, at least
one of the following statements must hold:
If a critical section ends after a grace period ends then it
must start after the grace period does. In addition, every
store that propagates to the grace period's CPU before the
start of the grace period must propagate to every CPU before
the start of the critical section.
(1) C ends before G does, and in addition, every store that
propagates to C's CPU before the end of C must propagate to
every CPU before G ends.
(2) G starts before C does, and in addition, every store that
propagates to G's CPU before the start of G must propagate
to every CPU before C starts.
In particular, it is not possible for a critical section to both start
before and end after a grace period.
Here is a simple example of RCU in action:
@ -1483,10 +1485,11 @@ The Grace Period Guarantee tells us that when this code runs, it will
never end with r1 = 1 and r2 = 0. The reasoning is as follows. r1 = 1
means that P0's store to x propagated to P1 before P1 called
synchronize_rcu(), so P0's critical section must have started before
P1's grace period. On the other hand, r2 = 0 means that P0's store to
y, which occurs before the end of the critical section, did not
propagate to P1 before the end of the grace period, violating the
Guarantee.
P1's grace period, contrary to part (2) of the Guarantee. On the
other hand, r2 = 0 means that P0's store to y, which occurs before the
end of the critical section, did not propagate to P1 before the end of
the grace period, contrary to part (1). Together the results violate
the Guarantee.
In the kernel's implementations of RCU, the requirements for stores
to propagate to every CPU are fulfilled by placing strong fences at
@ -1504,11 +1507,11 @@ before" or "ends after" a grace period? Some aspects of the meaning
are pretty obvious, as in the example above, but the details aren't
entirely clear. The LKMM formalizes this notion by means of the
rcu-link relation. rcu-link encompasses a very general notion of
"before": Among other things, X ->rcu-link Z includes cases where X
happens-before or is equal to some event Y which is equal to or comes
before Z in the coherence order. When Y = Z this says that X ->rfe Z
implies X ->rcu-link Z. In addition, when Y = X it says that X ->fr Z
and X ->co Z each imply X ->rcu-link Z.
"before": If E and F are RCU fence events (i.e., rcu_read_lock(),
rcu_read_unlock(), or synchronize_rcu()) then among other things,
E ->rcu-link F includes cases where E is po-before some memory-access
event X, F is po-after some memory-access event Y, and we have any of
X ->rfe Y, X ->co Y, or X ->fr Y.
The formal definition of the rcu-link relation is more than a little
obscure, and we won't give it here. It is closely related to the pb
@ -1516,171 +1519,173 @@ relation, and the details don't matter unless you want to comb through
a somewhat lengthy formal proof. Pretty much all you need to know
about rcu-link is the information in the preceding paragraph.
The LKMM also defines the gp and rscs relations. They bring grace
periods and read-side critical sections into the picture, in the
The LKMM also defines the rcu-gp and rcu-rscsi relations. They bring
grace periods and read-side critical sections into the picture, in the
following way:
E ->gp F means there is a synchronize_rcu() fence event S such
that E ->po S and either S ->po F or S = F. In simple terms,
there is a grace period po-between E and F.
E ->rcu-gp F means that E and F are in fact the same event,
and that event is a synchronize_rcu() fence (i.e., a grace
period).
E ->rscs F means there is a critical section delimited by an
rcu_read_lock() fence L and an rcu_read_unlock() fence U, such
that E ->po U and either L ->po F or L = F. You can think of
this as saying that E and F are in the same critical section
(in fact, it also allows E to be po-before the start of the
critical section and F to be po-after the end).
E ->rcu-rscsi F means that E and F are the rcu_read_unlock()
and rcu_read_lock() fence events delimiting some read-side
critical section. (The 'i' at the end of the name emphasizes
that this relation is "inverted": It links the end of the
critical section to the start.)
If we think of the rcu-link relation as standing for an extended
"before", then X ->gp Y ->rcu-link Z says that X executes before a
grace period which ends before Z executes. (In fact it covers more
than this, because it also includes cases where X executes before a
grace period and some store propagates to Z's CPU before Z executes
but doesn't propagate to some other CPU until after the grace period
ends.) Similarly, X ->rscs Y ->rcu-link Z says that X is part of (or
before the start of) a critical section which starts before Z
executes.
"before", then X ->rcu-gp Y ->rcu-link Z roughly says that X is a
grace period which ends before Z begins. (In fact it covers more than
this, because it also includes cases where some store propagates to
Z's CPU before Z begins but doesn't propagate to some other CPU until
after X ends.) Similarly, X ->rcu-rscsi Y ->rcu-link Z says that X is
the end of a critical section which starts before Z begins.
The LKMM goes on to define the rcu-fence relation as a sequence of gp
and rscs links separated by rcu-link links, in which the number of gp
links is >= the number of rscs links. For example:
The LKMM goes on to define the rcu-fence relation as a sequence of
rcu-gp and rcu-rscsi links separated by rcu-link links, in which the
number of rcu-gp links is >= the number of rcu-rscsi links. For
example:
X ->gp Y ->rcu-link Z ->rscs T ->rcu-link U ->gp V
X ->rcu-gp Y ->rcu-link Z ->rcu-rscsi T ->rcu-link U ->rcu-gp V
would imply that X ->rcu-fence V, because this sequence contains two
gp links and only one rscs link. (It also implies that X ->rcu-fence T
and Z ->rcu-fence V.) On the other hand:
rcu-gp links and one rcu-rscsi link. (It also implies that
X ->rcu-fence T and Z ->rcu-fence V.) On the other hand:
X ->rscs Y ->rcu-link Z ->rscs T ->rcu-link U ->gp V
X ->rcu-rscsi Y ->rcu-link Z ->rcu-rscsi T ->rcu-link U ->rcu-gp V
does not imply X ->rcu-fence V, because the sequence contains only
one gp link but two rscs links.
one rcu-gp link but two rcu-rscsi links.
The rcu-fence relation is important because the Grace Period Guarantee
means that rcu-fence acts kind of like a strong fence. In particular,
if W is a write and we have W ->rcu-fence Z, the Guarantee says that W
will propagate to every CPU before Z executes.
E ->rcu-fence F implies not only that E begins before F ends, but also
that any write po-before E will propagate to every CPU before any
instruction po-after F can execute. (However, it does not imply that
E must execute before F; in fact, each synchronize_rcu() fence event
is linked to itself by rcu-fence as a degenerate case.)
To prove this in full generality requires some intellectual effort.
We'll consider just a very simple case:
W ->gp X ->rcu-link Y ->rscs Z.
G ->rcu-gp W ->rcu-link Z ->rcu-rscsi F.
This formula means that there is a grace period G and a critical
section C such that:
This formula means that G and W are the same event (a grace period),
and there are events X, Y and a read-side critical section C such that:
1. W is po-before G;
1. G = W is po-before or equal to X;
2. X is equal to or po-after G;
2. X comes "before" Y in some sense (including rfe, co and fr);
3. X comes "before" Y in some sense;
2. Y is po-before Z;
4. Y is po-before the end of C;
4. Z is the rcu_read_unlock() event marking the end of C;
5. Z is equal to or po-after the start of C.
5. F is the rcu_read_lock() event marking the start of C.
From 2 - 4 we deduce that the grace period G ends before the critical
section C. Then the second part of the Grace Period Guarantee says
not only that G starts before C does, but also that W (which executes
on G's CPU before G starts) must propagate to every CPU before C
starts. In particular, W propagates to every CPU before Z executes
(or finishes executing, in the case where Z is equal to the
rcu_read_lock() fence event which starts C.) This sort of reasoning
can be expanded to handle all the situations covered by rcu-fence.
From 1 - 4 we deduce that the grace period G ends before the critical
section C. Then part (2) of the Grace Period Guarantee says not only
that G starts before C does, but also that any write which executes on
G's CPU before G starts must propagate to every CPU before C starts.
In particular, the write propagates to every CPU before F finishes
executing and hence before any instruction po-after F can execute.
This sort of reasoning can be extended to handle all the situations
covered by rcu-fence.
Finally, the LKMM defines the RCU-before (rb) relation in terms of
rcu-fence. This is done in essentially the same way as the pb
relation was defined in terms of strong-fence. We will omit the
details; the end result is that E ->rb F implies E must execute before
F, just as E ->pb F does (and for much the same reasons).
details; the end result is that E ->rb F implies E must execute
before F, just as E ->pb F does (and for much the same reasons).
Putting this all together, the LKMM expresses the Grace Period
Guarantee by requiring that the rb relation does not contain a cycle.
Equivalently, this "rcu" axiom requires that there are no events E and
F with E ->rcu-link F ->rcu-fence E. Or to put it a third way, the
axiom requires that there are no cycles consisting of gp and rscs
alternating with rcu-link, where the number of gp links is >= the
number of rscs links.
Equivalently, this "rcu" axiom requires that there are no events E
and F with E ->rcu-link F ->rcu-fence E. Or to put it a third way,
the axiom requires that there are no cycles consisting of rcu-gp and
rcu-rscsi alternating with rcu-link, where the number of rcu-gp links
is >= the number of rcu-rscsi links.
Justifying the axiom isn't easy, but it is in fact a valid
formalization of the Grace Period Guarantee. We won't attempt to go
through the detailed argument, but the following analysis gives a
taste of what is involved. Suppose we have a violation of the first
part of the Guarantee: A critical section starts before a grace
period, and some store propagates to the critical section's CPU before
the end of the critical section but doesn't propagate to some other
CPU until after the end of the grace period.
taste of what is involved. Suppose both parts of the Guarantee are
violated: A critical section starts before a grace period, and some
store propagates to the critical section's CPU before the end of the
critical section but doesn't propagate to some other CPU until after
the end of the grace period.
Putting symbols to these ideas, let L and U be the rcu_read_lock() and
rcu_read_unlock() fence events delimiting the critical section in
question, and let S be the synchronize_rcu() fence event for the grace
period. Saying that the critical section starts before S means there
are events E and F where E is po-after L (which marks the start of the
critical section), E is "before" F in the sense of the rcu-link
relation, and F is po-before the grace period S:
are events Q and R where Q is po-after L (which marks the start of the
critical section), Q is "before" R in the sense used by the rcu-link
relation, and R is po-before the grace period S. Thus we have:
L ->po E ->rcu-link F ->po S.
L ->rcu-link S.
Let W be the store mentioned above, let Z come before the end of the
Let W be the store mentioned above, let Y come before the end of the
critical section and witness that W propagates to the critical
section's CPU by reading from W, and let Y on some arbitrary CPU be a
witness that W has not propagated to that CPU, where Y happens after
section's CPU by reading from W, and let Z on some arbitrary CPU be a
witness that W has not propagated to that CPU, where Z happens after
some event X which is po-after S. Symbolically, this amounts to:
S ->po X ->hb* Y ->fr W ->rf Z ->po U.
S ->po X ->hb* Z ->fr W ->rf Y ->po U.
The fr link from Y to W indicates that W has not propagated to Y's CPU
at the time that Y executes. From this, it can be shown (see the
discussion of the rcu-link relation earlier) that X and Z are related
by rcu-link, yielding:
The fr link from Z to W indicates that W has not propagated to Z's CPU
at the time that Z executes. From this, it can be shown (see the
discussion of the rcu-link relation earlier) that S and U are related
by rcu-link:
S ->po X ->rcu-link Z ->po U.
S ->rcu-link U.
The formulas say that S is po-between F and X, hence F ->gp X. They
also say that Z comes before the end of the critical section and E
comes after its start, hence Z ->rscs E. From all this we obtain:
Since S is a grace period we have S ->rcu-gp S, and since L and U are
the start and end of the critical section C we have U ->rcu-rscsi L.
From this we obtain:
F ->gp X ->rcu-link Z ->rscs E ->rcu-link F,
S ->rcu-gp S ->rcu-link U ->rcu-rscsi L ->rcu-link S,
a forbidden cycle. Thus the "rcu" axiom rules out this violation of
the Grace Period Guarantee.
For something a little more down-to-earth, let's see how the axiom
works out in practice. Consider the RCU code example from above, this
time with statement labels added to the memory access instructions:
time with statement labels added:
int x, y;
P0()
{
rcu_read_lock();
W: WRITE_ONCE(x, 1);
X: WRITE_ONCE(y, 1);
rcu_read_unlock();
L: rcu_read_lock();
X: WRITE_ONCE(x, 1);
Y: WRITE_ONCE(y, 1);
U: rcu_read_unlock();
}
P1()
{
int r1, r2;
Y: r1 = READ_ONCE(x);
synchronize_rcu();
Z: r2 = READ_ONCE(y);
Z: r1 = READ_ONCE(x);
S: synchronize_rcu();
W: r2 = READ_ONCE(y);
}
If r2 = 0 at the end then P0's store at X overwrites the value that
P1's load at Z reads from, so we have Z ->fre X and thus Z ->rcu-link X.
In addition, there is a synchronize_rcu() between Y and Z, so therefore
we have Y ->gp Z.
If r2 = 0 at the end then P0's store at Y overwrites the value that
P1's load at W reads from, so we have W ->fre Y. Since S ->po W and
also Y ->po U, we get S ->rcu-link U. In addition, S ->rcu-gp S
because S is a grace period.
If r1 = 1 at the end then P1's load at Y reads from P0's store at W,
so we have W ->rcu-link Y. In addition, W and X are in the same critical
section, so therefore we have X ->rscs W.
If r1 = 1 at the end then P1's load at Z reads from P0's store at X,
so we have X ->rfe Z. Together with L ->po X and Z ->po S, this
yields L ->rcu-link S. And since L and U are the start and end of a
critical section, we have U ->rcu-rscsi L.
Then X ->rscs W ->rcu-link Y ->gp Z ->rcu-link X is a forbidden cycle,
violating the "rcu" axiom. Hence the outcome is not allowed by the
LKMM, as we would expect.
Then U ->rcu-rscsi L ->rcu-link S ->rcu-gp S ->rcu-link U is a
forbidden cycle, violating the "rcu" axiom. Hence the outcome is not
allowed by the LKMM, as we would expect.
For contrast, let's see what can happen in a more complicated example:
@ -1690,51 +1695,52 @@ For contrast, let's see what can happen in a more complicated example:
{
int r0;
rcu_read_lock();
W: r0 = READ_ONCE(x);
X: WRITE_ONCE(y, 1);
rcu_read_unlock();
L0: rcu_read_lock();
r0 = READ_ONCE(x);
WRITE_ONCE(y, 1);
U0: rcu_read_unlock();
}
P1()
{
int r1;
Y: r1 = READ_ONCE(y);
synchronize_rcu();
Z: WRITE_ONCE(z, 1);
r1 = READ_ONCE(y);
S1: synchronize_rcu();
WRITE_ONCE(z, 1);
}
P2()
{
int r2;
rcu_read_lock();
U: r2 = READ_ONCE(z);
V: WRITE_ONCE(x, 1);
rcu_read_unlock();
L2: rcu_read_lock();
r2 = READ_ONCE(z);
WRITE_ONCE(x, 1);
U2: rcu_read_unlock();
}
If r0 = r1 = r2 = 1 at the end, then similar reasoning to before shows
that W ->rscs X ->rcu-link Y ->gp Z ->rcu-link U ->rscs V ->rcu-link W.
However this cycle is not forbidden, because the sequence of relations
contains fewer instances of gp (one) than of rscs (two). Consequently
the outcome is allowed by the LKMM. The following instruction timing
diagram shows how it might actually occur:
that U0 ->rcu-rscsi L0 ->rcu-link S1 ->rcu-gp S1 ->rcu-link U2 ->rcu-rscsi
L2 ->rcu-link U0. However this cycle is not forbidden, because the
sequence of relations contains fewer instances of rcu-gp (one) than of
rcu-rscsi (two). Consequently the outcome is allowed by the LKMM.
The following instruction timing diagram shows how it might actually
occur:
P0 P1 P2
-------------------- -------------------- --------------------
rcu_read_lock()
X: WRITE_ONCE(y, 1)
Y: r1 = READ_ONCE(y)
WRITE_ONCE(y, 1)
r1 = READ_ONCE(y)
synchronize_rcu() starts
. rcu_read_lock()
. V: WRITE_ONCE(x, 1)
W: r0 = READ_ONCE(x) .
. WRITE_ONCE(x, 1)
r0 = READ_ONCE(x) .
rcu_read_unlock() .
synchronize_rcu() ends
Z: WRITE_ONCE(z, 1)
U: r2 = READ_ONCE(z)
WRITE_ONCE(z, 1)
r2 = READ_ONCE(z)
rcu_read_unlock()
This requires P0 and P2 to execute their loads and stores out of
@ -1744,6 +1750,15 @@ section in P0 both starts before P1's grace period does and ends
before it does, and the critical section in P2 both starts after P1's
grace period does and ends after it does.
Addendum: The LKMM now supports SRCU (Sleepable Read-Copy-Update) in
addition to normal RCU. The ideas involved are much the same as
above, with new relations srcu-gp and srcu-rscsi added to represent
SRCU grace periods and read-side critical sections. There is a
restriction on the srcu-gp and srcu-rscsi links that can appear in an
rcu-fence sequence (the srcu-rscsi links must be paired with srcu-gp
links having the same SRCU domain with proper nesting); the details
are relatively unimportant.
LOCKING
-------

View File

@ -20,13 +20,17 @@ that litmus test to be exercised within the Linux kernel.
REQUIREMENTS
============
Version 7.49 of the "herd7" and "klitmus7" tools must be downloaded
separately:
Version 7.52 or higher of the "herd7" and "klitmus7" tools must be
downloaded separately:
https://github.com/herd/herdtools7
See "herdtools7/INSTALL.md" for installation instructions.
Note that although these tools usually provide backwards compatibility,
this is not absolutely guaranteed. Therefore, if a later version does
not work, please try using the exact version called out above.
==================
BASIC USAGE: HERD7
@ -221,8 +225,29 @@ The Linux-kernel memory model has the following limitations:
additional call_rcu() process to the site of the
emulated rcu-barrier().
e. Sleepable RCU (SRCU) is not modeled. It can be
emulated, but perhaps not simply.
e. Although sleepable RCU (SRCU) is now modeled, there
are some subtle differences between its semantics and
those in the Linux kernel. For example, the kernel
might interpret the following sequence as two partially
overlapping SRCU read-side critical sections:
1 r1 = srcu_read_lock(&my_srcu);
2 do_something_1();
3 r2 = srcu_read_lock(&my_srcu);
4 do_something_2();
5 srcu_read_unlock(&my_srcu, r1);
6 do_something_3();
7 srcu_read_unlock(&my_srcu, r2);
In contrast, LKMM will interpret this as a nested pair of
SRCU read-side critical sections, with the outer critical
section spanning lines 1-7 and the inner critical section
spanning lines 3-5.
This difference would be more of a concern had anyone
identified a reasonable use case for partially overlapping
SRCU read-side critical sections. For more information,
please see: https://paulmck.livejournal.com/40593.html
f. Reader-writer locking is not modeled. It can be
emulated in litmus tests using atomic read-modify-write

View File

@ -33,8 +33,14 @@ enum Barriers = 'wmb (*smp_wmb*) ||
'after-unlock-lock (*smp_mb__after_unlock_lock*)
instructions F[Barriers]
(* SRCU *)
enum SRCU = 'srcu-lock || 'srcu-unlock || 'sync-srcu
instructions SRCU[SRCU]
(* All srcu events *)
let Srcu = Srcu-lock | Srcu-unlock | Sync-srcu
(* Compute matching pairs of nested Rcu-lock and Rcu-unlock *)
let matched = let rec
let rcu-rscs = let rec
unmatched-locks = Rcu-lock \ domain(matched)
and unmatched-unlocks = Rcu-unlock \ range(matched)
and unmatched = unmatched-locks | unmatched-unlocks
@ -46,8 +52,27 @@ let matched = let rec
in matched
(* Validate nesting *)
flag ~empty Rcu-lock \ domain(matched) as unbalanced-rcu-locking
flag ~empty Rcu-unlock \ range(matched) as unbalanced-rcu-locking
flag ~empty Rcu-lock \ domain(rcu-rscs) as unbalanced-rcu-locking
flag ~empty Rcu-unlock \ range(rcu-rscs) as unbalanced-rcu-locking
(* Outermost level of nesting only *)
let crit = matched \ (po^-1 ; matched ; po^-1)
(* Compute matching pairs of nested Srcu-lock and Srcu-unlock *)
let srcu-rscs = let rec
unmatched-locks = Srcu-lock \ domain(matched)
and unmatched-unlocks = Srcu-unlock \ range(matched)
and unmatched = unmatched-locks | unmatched-unlocks
and unmatched-po = ([unmatched] ; po ; [unmatched]) & loc
and unmatched-locks-to-unlocks =
([unmatched-locks] ; po ; [unmatched-unlocks]) & loc
and matched = matched | (unmatched-locks-to-unlocks \
(unmatched-po ; unmatched-po))
in matched
(* Validate nesting *)
flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-locking
flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-locking
(* Check for use of synchronize_srcu() inside an RCU critical section *)
flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep
(* Validate SRCU dynamic match *)
flag ~empty different-values(srcu-rscs) as srcu-bad-nesting

View File

@ -33,7 +33,7 @@ let mb = ([M] ; fencerel(Mb) ; [M]) |
([M] ; po? ; [LKW] ; fencerel(After-spinlock) ; [M]) |
([M] ; po ; [UL] ; (co | po) ; [LKW] ;
fencerel(After-unlock-lock) ; [M])
let gp = po ; [Sync-rcu] ; po?
let gp = po ; [Sync-rcu | Sync-srcu] ; po?
let strong-fence = mb | gp
@ -91,32 +91,47 @@ acyclic pb as propagation
(*******)
(*
* Effect of read-side critical section proceeds from the rcu_read_lock()
* onward on the one hand and from the rcu_read_unlock() backwards on the
* other hand.
* Effects of read-side critical sections proceed from the rcu_read_unlock()
* or srcu_read_unlock() backwards on the one hand, and from the
* rcu_read_lock() or srcu_read_lock() forwards on the other hand.
*
* In the definition of rcu-fence below, the po term at the left-hand side
* of each disjunct and the po? term at the right-hand end have been factored
* out. They have been moved into the definitions of rcu-link and rb.
* This was necessary in order to apply the "& loc" tests correctly.
*)
let rscs = po ; crit^-1 ; po?
let rcu-gp = [Sync-rcu] (* Compare with gp *)
let srcu-gp = [Sync-srcu]
let rcu-rscsi = rcu-rscs^-1
let srcu-rscsi = srcu-rscs^-1
(*
* The synchronize_rcu() strong fence is special in that it can order not
* one but two non-rf relations, but only in conjunction with an RCU
* read-side critical section.
*)
let rcu-link = hb* ; pb* ; prop
let rcu-link = po? ; hb* ; pb* ; prop ; po
(*
* Any sequence containing at least as many grace periods as RCU read-side
* critical sections (joined by rcu-link) acts as a generalized strong fence.
* Likewise for SRCU grace periods and read-side critical sections, provided
* the synchronize_srcu() and srcu_read_[un]lock() calls refer to the same
* struct srcu_struct location.
*)
let rec rcu-fence = gp |
(gp ; rcu-link ; rscs) |
(rscs ; rcu-link ; gp) |
(gp ; rcu-link ; rcu-fence ; rcu-link ; rscs) |
(rscs ; rcu-link ; rcu-fence ; rcu-link ; gp) |
let rec rcu-fence = rcu-gp | srcu-gp |
(rcu-gp ; rcu-link ; rcu-rscsi) |
((srcu-gp ; rcu-link ; srcu-rscsi) & loc) |
(rcu-rscsi ; rcu-link ; rcu-gp) |
((srcu-rscsi ; rcu-link ; srcu-gp) & loc) |
(rcu-gp ; rcu-link ; rcu-fence ; rcu-link ; rcu-rscsi) |
((srcu-gp ; rcu-link ; rcu-fence ; rcu-link ; srcu-rscsi) & loc) |
(rcu-rscsi ; rcu-link ; rcu-fence ; rcu-link ; rcu-gp) |
((srcu-rscsi ; rcu-link ; rcu-fence ; rcu-link ; srcu-gp) & loc) |
(rcu-fence ; rcu-link ; rcu-fence)
(* rb orders instructions just as pb does *)
let rb = prop ; rcu-fence ; hb* ; pb*
let rb = prop ; po ; rcu-fence ; po? ; hb* ; pb*
irreflexive rb as rcu

View File

@ -47,6 +47,12 @@ rcu_read_unlock() { __fence{rcu-unlock}; }
synchronize_rcu() { __fence{sync-rcu}; }
synchronize_rcu_expedited() { __fence{sync-rcu}; }
// SRCU
srcu_read_lock(X) __srcu{srcu-lock}(X)
srcu_read_unlock(X,Y) { __srcu{srcu-unlock}(X,Y); }
synchronize_srcu(X) { __srcu{sync-srcu}(X); }
synchronize_srcu_expedited(X) { __srcu{sync-srcu}(X); }
// Atomic
atomic_read(X) READ_ONCE(*X)
atomic_set(X,V) { WRITE_ONCE(*X,V); }

View File

@ -6,9 +6,6 @@
(*
* Generate coherence orders and handle lock operations
*
* Warning: spin_is_locked() crashes herd7 versions strictly before 7.48.
* spin_is_locked() is functional from herd7 version 7.49.
*)
include "cross.cat"

View File

@ -1,4 +1,5 @@
#!/bin/bash
# SPDX-License-Identifier: GPL-2.0+
#
# Extract the number of CPUs expected from the specified Kconfig-file
# fragment by checking CONFIG_SMP and CONFIG_NR_CPUS. If the specified
@ -7,23 +8,9 @@
#
# Usage: configNR_CPUS.sh config-frag
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
# Copyright (C) IBM Corporation, 2013
#
# Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
# Authors: Paul E. McKenney <paulmck@linux.ibm.com>
cf=$1
if test ! -r $cf

View File

@ -1,4 +1,5 @@
#!/bin/bash
# SPDX-License-Identifier: GPL-2.0+
#
# config_override.sh base override
#
@ -6,23 +7,9 @@
# that conflict with any in override, concatenating what remains and
# sending the result to standard output.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
# Copyright (C) IBM Corporation, 2017
#
# Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
# Authors: Paul E. McKenney <paulmck@linux.ibm.com>
base=$1
if test -r $base

View File

@ -1,23 +1,11 @@
#!/bin/bash
# SPDX-License-Identifier: GPL-2.0+
#
# Usage: configcheck.sh .config .config-template
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
# Copyright (C) IBM Corporation, 2011
#
# Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
# Authors: Paul E. McKenney <paulmck@linux.ibm.com>
T=${TMPDIR-/tmp}/abat-chk-config.sh.$$
trap 'rm -rf $T' 0
@ -26,6 +14,7 @@ mkdir $T
cat $1 > $T/.config
cat $2 | sed -e 's/\(.*\)=n/# \1 is not set/' -e 's/^#CHECK#//' |
grep -v '^CONFIG_INITRAMFS_SOURCE' |
awk '
{
print "if grep -q \"" $0 "\" < '"$T/.config"'";

View File

@ -1,4 +1,5 @@
#!/bin/bash
# SPDX-License-Identifier: GPL-2.0+
#
# Usage: configinit.sh config-spec-file build-output-dir results-dir
#
@ -14,23 +15,9 @@
# for example, "O=/tmp/foo". If this argument is omitted, the .config
# file will be generated directly in the current directory.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
# Copyright (C) IBM Corporation, 2013
#
# Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
# Authors: Paul E. McKenney <paulmck@linux.ibm.com>
T=${TMPDIR-/tmp}/configinit.sh.$$
trap 'rm -rf $T' 0

View File

@ -1,26 +1,13 @@
#!/bin/bash
# SPDX-License-Identifier: GPL-2.0+
#
# Get an estimate of how CPU-hoggy to be.
#
# Usage: cpus2use.sh
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
# Copyright (C) IBM Corporation, 2013
#
# Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
# Authors: Paul E. McKenney <paulmck@linux.ibm.com>
ncpus=`grep '^processor' /proc/cpuinfo | wc -l`
idlecpus=`mpstat | tail -1 | \

View File

@ -1,24 +1,11 @@
#!/bin/bash
# SPDX-License-Identifier: GPL-2.0+
#
# Shell functions for the rest of the scripts.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
# Copyright (C) IBM Corporation, 2013
#
# Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
# Authors: Paul E. McKenney <paulmck@linux.ibm.com>
# bootparam_hotplug_cpu bootparam-string
#

View File

@ -1,4 +1,5 @@
#!/bin/bash
# SPDX-License-Identifier: GPL-2.0+
#
# Alternate sleeping and spinning on randomly selected CPUs. The purpose
# of this script is to inflict random OS jitter on a concurrently running
@ -11,23 +12,9 @@
# sleepmax: Maximum microseconds to sleep, defaults to one second.
# spinmax: Maximum microseconds to spin, defaults to one millisecond.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
# Copyright (C) IBM Corporation, 2016
#
# Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
# Authors: Paul E. McKenney <paulmck@linux.ibm.com>
me=$(($1 * 1000))
duration=$2

View File

@ -1,26 +1,13 @@
#!/bin/bash
# SPDX-License-Identifier: GPL-2.0+
#
# Build a kvm-ready Linux kernel from the tree in the current directory.
#
# Usage: kvm-build.sh config-template build-dir resdir
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
# Copyright (C) IBM Corporation, 2011
#
# Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
# Authors: Paul E. McKenney <paulmck@linux.ibm.com>
config_template=${1}
if test -z "$config_template" -o ! -f "$config_template" -o ! -r "$config_template"

View File

@ -1,4 +1,5 @@
#!/bin/sh
# SPDX-License-Identifier: GPL-2.0+
#
# Invoke a text editor on all console.log files for all runs with diagnostics,
# that is, on all such files having a console.log.diags counterpart.
@ -10,6 +11,10 @@
#
# The "directory" above should end with the date/time directory, for example,
# "tools/testing/selftests/rcutorture/res/2018.02.25-14:27:27".
#
# Copyright (C) IBM Corporation, 2018
#
# Author: Paul E. McKenney <paulmck@linux.ibm.com>
rundir="${1}"
if test -z "$rundir" -o ! -d "$rundir"

View File

@ -1,26 +1,13 @@
#!/bin/bash
# SPDX-License-Identifier: GPL-2.0+
#
# Analyze a given results directory for locktorture progress.
#
# Usage: kvm-recheck-lock.sh resdir
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
# Copyright (C) IBM Corporation, 2014
#
# Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
# Authors: Paul E. McKenney <paulmck@linux.ibm.com>
i="$1"
if test -d "$i" -a -r "$i"

View File

@ -1,26 +1,13 @@
#!/bin/bash
# SPDX-License-Identifier: GPL-2.0+
#
# Analyze a given results directory for rcutorture progress.
#
# Usage: kvm-recheck-rcu.sh resdir
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
# Copyright (C) IBM Corporation, 2014
#
# Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
# Authors: Paul E. McKenney <paulmck@linux.ibm.com>
i="$1"
if test -d "$i" -a -r "$i"

View File

@ -1,4 +1,5 @@
#!/bin/bash
# SPDX-License-Identifier: GPL-2.0+
#
# Analyze a given results directory for rcuperf performance measurements,
# looking for ftrace data. Exits with 0 if data was found, analyzed, and
@ -7,23 +8,9 @@
#
# Usage: kvm-recheck-rcuperf-ftrace.sh resdir
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
# Copyright (C) IBM Corporation, 2016
#
# Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
# Authors: Paul E. McKenney <paulmck@linux.ibm.com>
i="$1"
. functions.sh

View File

@ -1,26 +1,13 @@
#!/bin/bash
# SPDX-License-Identifier: GPL-2.0+
#
# Analyze a given results directory for rcuperf performance measurements.
#
# Usage: kvm-recheck-rcuperf.sh resdir
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
# Copyright (C) IBM Corporation, 2016
#
# Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
# Authors: Paul E. McKenney <paulmck@linux.ibm.com>
i="$1"
if test -d "$i" -a -r "$i"

View File

@ -1,4 +1,5 @@
#!/bin/bash
# SPDX-License-Identifier: GPL-2.0+
#
# Given the results directories for previous KVM-based torture runs,
# check the build and console output for errors. Given a directory
@ -6,23 +7,9 @@
#
# Usage: kvm-recheck.sh resdir ...
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
# Copyright (C) IBM Corporation, 2011
#
# Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
# Authors: Paul E. McKenney <paulmck@linux.ibm.com>
PATH=`pwd`/tools/testing/selftests/rcutorture/bin:$PATH; export PATH
. functions.sh

View File

@ -1,4 +1,5 @@
#!/bin/bash
# SPDX-License-Identifier: GPL-2.0+
#
# Run a kvm-based test of the specified tree on the specified configs.
# Fully automated run and error checking, no graphics console.
@ -20,23 +21,9 @@
#
# More sophisticated argument parsing is clearly needed.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
# Copyright (C) IBM Corporation, 2011
#
# Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
# Authors: Paul E. McKenney <paulmck@linux.ibm.com>
T=${TMPDIR-/tmp}/kvm-test-1-run.sh.$$
trap 'rm -rf $T' 0

View File

@ -1,4 +1,5 @@
#!/bin/bash
# SPDX-License-Identifier: GPL-2.0+
#
# Run a series of tests under KVM. By default, this series is specified
# by the relevant CFLIST file, but can be overridden by the --configs
@ -6,23 +7,9 @@
#
# Usage: kvm.sh [ options ]
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
# Copyright (C) IBM Corporation, 2011
#
# Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
# Authors: Paul E. McKenney <paulmck@linux.ibm.com>
scriptname=$0
args="$*"

View File

@ -1,21 +1,8 @@
#!/bin/bash
# SPDX-License-Identifier: GPL-2.0+
#
# Create an initrd directory if one does not already exist.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
# Copyright (C) IBM Corporation, 2013
#
# Author: Connor Shu <Connor.Shu@ibm.com>

View File

@ -1,4 +1,5 @@
#!/bin/bash
# SPDX-License-Identifier: GPL-2.0+
#
# Check the build output from an rcutorture run for goodness.
# The "file" is a pathname on the local system, and "title" is
@ -8,23 +9,9 @@
#
# Usage: parse-build.sh file title
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
# Copyright (C) IBM Corporation, 2011
#
# Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
# Authors: Paul E. McKenney <paulmck@linux.ibm.com>
F=$1
title=$2

View File

@ -1,4 +1,5 @@
#!/bin/bash
# SPDX-License-Identifier: GPL-2.0+
#
# Check the console output from an rcutorture run for oopses.
# The "file" is a pathname on the local system, and "title" is
@ -6,23 +7,9 @@
#
# Usage: parse-console.sh file title
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
# Copyright (C) IBM Corporation, 2011
#
# Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
# Authors: Paul E. McKenney <paulmck@linux.ibm.com>
T=${TMPDIR-/tmp}/parse-console.sh.$$
file="$1"

View File

@ -1,24 +1,11 @@
#!/bin/bash
# SPDX-License-Identifier: GPL-2.0+
#
# Kernel-version-dependent shell functions for the rest of the scripts.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
# Copyright (C) IBM Corporation, 2014
#
# Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
# Authors: Paul E. McKenney <paulmck@linux.ibm.com>
# locktorture_param_onoff bootparam-string config-file
#

View File

@ -1,24 +1,11 @@
#!/bin/bash
# SPDX-License-Identifier: GPL-2.0+
#
# Kernel-version-dependent shell functions for the rest of the scripts.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
# Copyright (C) IBM Corporation, 2013
#
# Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
# Authors: Paul E. McKenney <paulmck@linux.ibm.com>
# rcutorture_param_n_barrier_cbs bootparam-string
#

View File

@ -1,24 +1,11 @@
#!/bin/bash
# SPDX-License-Identifier: GPL-2.0+
#
# Torture-suite-dependent shell functions for the rest of the scripts.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
# Copyright (C) IBM Corporation, 2015
#
# Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
# Authors: Paul E. McKenney <paulmck@linux.ibm.com>
# per_version_boot_params bootparam-string config-file seconds
#