1
0
Fork 0
Fork of alistair23 Linux kernel for reMarkable from https://github.com/alistair23/linux
 
 
 
 
 
 
Go to file
Johannes Weiner 173fa52f7f mm: drop mmap_sem before calling balance_dirty_pages() in write fault
[ Upstream commit 89b15332af ]

One of our services is observing hanging ps/top/etc under heavy write
IO, and the task states show this is an mmap_sem priority inversion:

A write fault is holding the mmap_sem in read-mode and waiting for
(heavily cgroup-limited) IO in balance_dirty_pages():

    balance_dirty_pages+0x724/0x905
    balance_dirty_pages_ratelimited+0x254/0x390
    fault_dirty_shared_page.isra.96+0x4a/0x90
    do_wp_page+0x33e/0x400
    __handle_mm_fault+0x6f0/0xfa0
    handle_mm_fault+0xe4/0x200
    __do_page_fault+0x22b/0x4a0
    page_fault+0x45/0x50

Somebody tries to change the address space, contending for the mmap_sem in
write-mode:

    call_rwsem_down_write_failed_killable+0x13/0x20
    do_mprotect_pkey+0xa8/0x330
    SyS_mprotect+0xf/0x20
    do_syscall_64+0x5b/0x100
    entry_SYSCALL_64_after_hwframe+0x3d/0xa2

The waiting writer locks out all subsequent readers to avoid lock
starvation, and several threads can be seen hanging like this:

    call_rwsem_down_read_failed+0x14/0x30
    proc_pid_cmdline_read+0xa0/0x480
    __vfs_read+0x23/0x140
    vfs_read+0x87/0x130
    SyS_read+0x42/0x90
    do_syscall_64+0x5b/0x100
    entry_SYSCALL_64_after_hwframe+0x3d/0xa2

To fix this, do what we do for cache read faults already: drop the
mmap_sem before calling into anything IO bound, in this case the
balance_dirty_pages() function, and return VM_FAULT_RETRY.

Link: http://lkml.kernel.org/r/20190924194238.GA29030@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-09 10:19:55 +01:00
Documentation dt-bindings: Improve validation build error handling 2020-01-04 19:18:11 +01:00
LICENSES LICENSES: Rename other to deprecated 2019-05-03 06:34:32 -06:00
arch s390/cpum_sf: Avoid SBD overflow condition in irq handler 2020-01-09 10:19:49 +01:00
block block: add bio_truncate to fix guard_bio_eod 2020-01-09 10:19:54 +01:00
certs PKCS#7: Refactor verify_pkcs7_signature() 2019-08-05 18:40:18 -04:00
crypto KEYS: asymmetric: return ENOMEM if akcipher_request_alloc() fails 2019-12-31 16:46:07 +01:00
drivers PCI: Add a helper to check Power Resource Requirements _PR3 existence 2020-01-09 10:19:52 +01:00
fs block: add bio_truncate to fix guard_bio_eod 2020-01-09 10:19:54 +01:00
include block: add bio_truncate to fix guard_bio_eod 2020-01-09 10:19:54 +01:00
init Merge branch 'next-lockdown' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2019-09-28 08:14:15 -07:00
ipc ipc/sem.c: convert to use built-in RCU list checking 2019-09-25 17:51:41 -07:00
kernel taskstats: fix data-race 2020-01-09 10:19:54 +01:00
lib ubsan, x86: Annotate and allow __ubsan_handle_shift_out_of_bounds() in uaccess regions 2019-12-31 16:44:25 +01:00
mm mm: drop mmap_sem before calling balance_dirty_pages() in write fault 2020-01-09 10:19:55 +01:00
net netfilter: nft_tproxy: Fix port selector on Big Endian 2020-01-09 10:19:54 +01:00
samples samples: pktgen: fix proc_cmd command result check logic 2019-12-31 16:43:45 +01:00
scripts scripts/kallsyms: fix definitely-lost memory leak 2020-01-04 19:18:23 +01:00
security tomoyo: Don't use nifty names on sockets. 2020-01-04 19:18:42 +01:00
sound ALSA: hda - Downgrade error message for single-cmd fallback 2020-01-09 10:19:54 +01:00
tools selftests: vm: add fragment CONFIG_TEST_VMALLOC 2020-01-04 19:18:31 +01:00
usr kbuild: update compile-test header list for v5.4-rc2 2019-10-05 15:29:49 +09:00
virt KVM: arm/arm64: Properly handle faulting of device mappings 2019-12-31 16:46:24 +01:00
.clang-format clang-format: Update with the latest for_each macro list 2019-08-31 10:00:51 +02:00
.cocciconfig scripts: add Linux .cocciconfig for coccinelle 2016-07-22 12:13:39 +02:00
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: set git diff driver for C source code files 2016-10-07 18:46:30 -07:00
.gitignore Modules updates for v5.4 2019-09-22 10:34:46 -07:00
.mailmap ARM: SoC fixes 2019-11-10 13:41:59 -08:00
COPYING COPYING: use the new text with points to the license files 2018-03-23 12:41:45 -06:00
CREDITS MAINTAINERS: Remove Simon as Renesas SoC Co-Maintainer 2019-10-10 08:12:51 -07:00
Kbuild kbuild: do not descend to ./Kbuild when cleaning 2019-08-21 21:03:58 +09:00
Kconfig docs: kbuild: convert docs to ReST and rename to *.rst 2019-06-14 14:21:21 -06:00
MAINTAINERS Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2019-11-22 14:28:14 -08:00
Makefile Linux 5.4.8 2020-01-04 19:19:19 +01:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.