1
0
Fork 0
Fork of reMarkable kernel https://github.com/reMarkable/linux
 
 
 
 
Go to file
Petr Mladek 842c088464 livepatch: Fix stacking of patches with respect to RCU
rcu_read_(un)lock(), list_*_rcu(), and synchronize_rcu() are used for a secure
access and manipulation of the list of patches that modify the same function.
In particular, it is the variable func_stack that is accessible from the ftrace
handler via struct ftrace_ops and klp_ops.

Of course, it synchronizes also some states of the patch on the top of the
stack, e.g. func->transition in klp_ftrace_handler.

At the same time, this mechanism guards also the manipulation of
task->patch_state. It is modified according to the state of the transition and
the state of the process.

Now, all this works well as long as RCU works well. Sadly livepatching might
get into some corner cases when this is not true. For example, RCU is not
watching when rcu_read_lock() is taken in idle threads.  It is because they
might sleep and prevent reaching the grace period for too long.

There are ways how to make RCU watching even in idle threads, see
rcu_irq_enter(). But there is a small location inside RCU infrastructure when
even this does not work.

This small problematic location can be detected either before calling
rcu_irq_enter() by rcu_irq_enter_disabled() or later by rcu_is_watching().
Sadly, there is no safe way how to handle it.  Once we detect that RCU was not
watching, we might see inconsistent state of the function stack and the related
variables in klp_ftrace_handler(). Then we could do a wrong decision, use an
incompatible implementation of the function and break the consistency of the
system. We could warn but we could not avoid the damage.

Fortunately, ftrace has similar problems and they seem to be solved well there.
It uses a heavy weight implementation of some RCU operations. In particular, it
replaces:

  + rcu_read_lock() with preempt_disable_notrace()
  + rcu_read_unlock() with preempt_enable_notrace()
  + synchronize_rcu() with schedule_on_each_cpu(sync_work)

My understanding is that this is RCU implementation from a stone age. It meets
the core RCU requirements but it is rather ineffective. Especially, it does not
allow to batch or speed up the synchronize calls.

On the other hand, it is very trivial. It allows to safely trace and/or
livepatch even the RCU core infrastructure.  And the effectiveness is a not a
big issue because using ftrace or livepatches on productive systems is a rare
operation.  The safety is much more important than a negligible extra load.

Note that the alternative implementation follows the RCU principles. Therefore,
     we could and actually must use list_*_rcu() variants when manipulating the
     func_stack.  These functions allow to access the pointers in the right
     order and with the right barriers. But they do not use any other
     information that would be set only by rcu_read_lock().

Also note that there are actually two problems solved in ftrace:

First, it cares about the consistency of RCU read sections.  It is being solved
the way as described and used in this patch.

Second, ftrace needs to make sure that nobody is inside the dynamic trampoline
when it is being freed. For this, it also calls synchronize_rcu_tasks() in
preemptive kernel in ftrace_shutdown().

Livepatch has similar problem but it is solved by ftrace for free.
klp_ftrace_handler() is a good guy and never sleeps. In addition, it is
registered with FTRACE_OPS_FL_DYNAMIC. It causes that
unregister_ftrace_function() calls:

	* schedule_on_each_cpu(ftrace_sync) - always
	* synchronize_rcu_tasks() - in preemptive kernel

The effect is that nobody is neither inside the dynamic trampoline nor inside
the ftrace handler after unregister_ftrace_function() returns.

[jkosina@suse.cz: reformat changelog, fix comment]
Signed-off-by: Petr Mladek <pmladek@suse.com>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2017-06-20 10:42:19 +02:00
Documentation Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching 2017-05-02 18:24:16 -07:00
arch Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching 2017-05-02 18:24:16 -07:00
block A reasonably busy cycle for documentation this time around. There is a new 2017-05-02 10:21:17 -07:00
certs certs: Add a secondary system keyring that can be added to dynamically 2016-04-11 22:48:09 +01:00
crypto Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2017-05-02 16:40:27 -07:00
drivers Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid 2017-05-02 18:07:04 -07:00
firmware WHENCE: use https://linuxtv.org for LinuxTV URLs 2015-12-04 10:35:11 -02:00
fs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching 2017-05-02 18:24:16 -07:00
include Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching 2017-05-02 18:24:16 -07:00
init mm: move mm_percpu_wq initialization earlier 2017-03-31 17:13:30 -07:00
ipc kernel-api.rst: fix a series of errors when parsing C files 2017-04-02 14:31:49 -06:00
kernel livepatch: Fix stacking of patches with respect to RCU 2017-06-20 10:42:19 +02:00
lib Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2017-05-02 16:40:27 -07:00
mm Merge branch 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-05-02 11:18:50 -07:00
net Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2017-05-02 16:40:27 -07:00
samples Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching 2017-05-02 18:24:16 -07:00
scripts A reasonably busy cycle for documentation this time around. There is a new 2017-05-02 10:21:17 -07:00
security Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2017-05-02 16:40:27 -07:00
sound LED updates for 4.12 2017-05-01 13:12:49 -07:00
tools Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2017-05-02 16:40:27 -07:00
usr kbuild: initramfs cleanup, set target from Kconfig 2017-01-05 09:40:16 -08:00
virt KVM/ARM Fixes for v4.11-rc6 2017-04-05 16:27:47 +02:00
.cocciconfig scripts: add Linux .cocciconfig for coccinelle 2016-07-22 12:13:39 +02:00
.get_maintainer.ignore Add hch to .get_maintainer.ignore 2015-08-21 14:30:10 -07:00
.gitattributes .gitattributes: set git diff driver for C source code files 2016-10-07 18:46:30 -07:00
.gitignore Merge branch 'misc' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild 2016-08-02 16:48:52 -04:00
.mailmap mailmap: add Martin Kepplinger's email 2017-04-13 18:24:21 -07:00
COPYING
CREDITS avr32: remove support for AVR32 architecture 2017-05-01 09:27:15 +02:00
Kbuild scripts/gdb: provide linux constants 2016-05-23 17:04:14 -07:00
Kconfig kbuild: migrate all arch to the kconfig mainmenu upgrade 2010-09-19 22:54:11 -04:00
MAINTAINERS This is the bulk of pin control changes for the v4.12 cycle: 2017-05-02 17:59:33 -07:00
Makefile A reasonably busy cycle for documentation this time around. There is a new 2017-05-02 10:21:17 -07:00
README README: add a new README file, pointing to the Documentation/ 2016-10-24 08:12:35 -02:00

README

Linux kernel
============

This file was moved to Documentation/admin-guide/README.rst

Please notice that there are several guides for kernel developers and users.
These guides can be rendered in a number of formats, like HTML and PDF.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.
See Documentation/00-INDEX for a list of what is contained in each file.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.