1
0
Fork 0
Fork of alistair23 Linux kernel for reMarkable from https://github.com/alistair23/linux
 
 
 
 
 
 
Go to file
Mel Gorman 4b62bbcc86 mm, numa: fix bad pmd by atomically check for pmd_trans_huge when marking page tables prot_numa
commit 8b272b3cbb upstream.

: A user reported a bug against a distribution kernel while running a
: proprietary workload described as "memory intensive that is not swapping"
: that is expected to apply to mainline kernels.  The workload is
: read/write/modifying ranges of memory and checking the contents.  They
: reported that within a few hours that a bad PMD would be reported followed
: by a memory corruption where expected data was all zeros.  A partial
: report of the bad PMD looked like
:
:   [ 5195.338482] ../mm/pgtable-generic.c:33: bad pmd ffff8888157ba008(000002e0396009e2)
:   [ 5195.341184] ------------[ cut here ]------------
:   [ 5195.356880] kernel BUG at ../mm/pgtable-generic.c:35!
:   ....
:   [ 5195.410033] Call Trace:
:   [ 5195.410471]  [<ffffffff811bc75d>] change_protection_range+0x7dd/0x930
:   [ 5195.410716]  [<ffffffff811d4be8>] change_prot_numa+0x18/0x30
:   [ 5195.410918]  [<ffffffff810adefe>] task_numa_work+0x1fe/0x310
:   [ 5195.411200]  [<ffffffff81098322>] task_work_run+0x72/0x90
:   [ 5195.411246]  [<ffffffff81077139>] exit_to_usermode_loop+0x91/0xc2
:   [ 5195.411494]  [<ffffffff81003a51>] prepare_exit_to_usermode+0x31/0x40
:   [ 5195.411739]  [<ffffffff815e56af>] retint_user+0x8/0x10
:
: Decoding revealed that the PMD was a valid prot_numa PMD and the bad PMD
: was a false detection.  The bug does not trigger if automatic NUMA
: balancing or transparent huge pages is disabled.
:
: The bug is due a race in change_pmd_range between a pmd_trans_huge and
: pmd_nond_or_clear_bad check without any locks held.  During the
: pmd_trans_huge check, a parallel protection update under lock can have
: cleared the PMD and filled it with a prot_numa entry between the transhuge
: check and the pmd_none_or_clear_bad check.
:
: While this could be fixed with heavy locking, it's only necessary to make
: a copy of the PMD on the stack during change_pmd_range and avoid races.  A
: new helper is created for this as the check if quite subtle and the
: existing similar helpful is not suitable.  This passed 154 hours of
: testing (usually triggers between 20 minutes and 24 hours) without
: detecting bad PMDs or corruption.  A basic test of an autonuma-intensive
: workload showed no significant change in behaviour.

Although Mel withdrew the patch on the face of LKML comment
https://lkml.org/lkml/2017/4/10/922 the race window aforementioned is
still open, and we have reports of Linpack test reporting bad residuals
after the bad PMD warning is observed.  In addition to that, bad
rss-counter and non-zero pgtables assertions are triggered on mm teardown
for the task hitting the bad PMD.

 host kernel: mm/pgtable-generic.c:40: bad pmd 00000000b3152f68(8000000d2d2008e7)
 ....
 host kernel: BUG: Bad rss-counter state mm:00000000b583043d idx:1 val:512
 host kernel: BUG: non-zero pgtables_bytes on freeing mm: 4096

The issue is observed on a v4.18-based distribution kernel, but the race
window is expected to be applicable to mainline kernels, as well.

[akpm@linux-foundation.org: fix comment typo, per Rafael]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Rafael Aquini <aquini@redhat.com>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Cc: <stable@vger.kernel.org>
Cc: Zi Yan <zi.yan@cs.rutgers.edu>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@suse.com>
Link: http://lkml.kernel.org/r/20200216191800.22423-1-aquini@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-12 13:00:19 +01:00
Documentation netfilter: nf_flowtable: fix documentation 2020-03-05 16:43:51 +01:00
LICENSES LICENSES: Rename other to deprecated 2019-05-03 06:34:32 -06:00
arch arch/csky: fix some Kconfig typos 2020-03-12 13:00:17 +01:00
block block, bfq: do not insert oom queue into position tree 2020-03-12 13:00:08 +01:00
certs PKCS#7: Refactor verify_pkcs7_signature() 2019-08-05 18:40:18 -04:00
crypto crypto: rename sm3-256 to sm3 in hash_algo_name 2020-02-28 17:22:26 +01:00
drivers vgacon: Fix a UAF in vgacon_invert_region 2020-03-12 13:00:19 +01:00
fs cifs: fix rename() by ensuring source handle opened with DELETE bit 2020-03-12 13:00:18 +01:00
include blktrace: Protect q->blk_trace with RCU 2020-03-05 16:43:52 +01:00
init kbuild: remove header compile test 2020-03-05 16:43:47 +01:00
ipc Revert "ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()" 2020-02-28 17:22:20 +01:00
kernel blktrace: fix dereference after null check 2020-03-12 13:00:10 +01:00
lib kbuild: move headers_check rule to usr/include/Makefile 2020-03-05 16:43:47 +01:00
mm mm, numa: fix bad pmd by atomically check for pmd_trans_huge when marking page tables prot_numa 2020-03-12 13:00:19 +01:00
net netfilter: nft_tunnel: no need to call htons() when dumping ports 2020-03-05 16:43:50 +01:00
samples samples/bpf: Set -fno-stack-protector when building BPF programs 2020-02-24 08:36:36 +01:00
scripts kbuild: move headers_check rule to usr/include/Makefile 2020-03-05 16:43:47 +01:00
security efi: Only print errors about failing to get certs if EFI vars are found 2020-03-12 13:00:14 +01:00
sound ALSA: hda/realtek - Enable the headset of ASUS B9450FA with ALC294 2020-03-12 13:00:18 +01:00
tools selftests: forwarding: vxlan_bridge_1d: use more proper tos value 2020-03-12 13:00:17 +01:00
usr kbuild: fix 'No such file or directory' warning when cleaning 2020-03-12 13:00:09 +01:00
virt KVM: Check for a bad hva before dropping into the ghc slow path 2020-03-05 16:43:48 +01:00
.clang-format clang-format: Update with the latest for_each macro list 2019-08-31 10:00:51 +02:00
.cocciconfig scripts: add Linux .cocciconfig for coccinelle 2016-07-22 12:13:39 +02:00
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: set git diff driver for C source code files 2016-10-07 18:46:30 -07:00
.gitignore Modules updates for v5.4 2019-09-22 10:34:46 -07:00
.mailmap ARM: SoC fixes 2019-11-10 13:41:59 -08:00
COPYING COPYING: use the new text with points to the license files 2018-03-23 12:41:45 -06:00
CREDITS MAINTAINERS: Remove Simon as Renesas SoC Co-Maintainer 2019-10-10 08:12:51 -07:00
Kbuild kbuild: do not descend to ./Kbuild when cleaning 2019-08-21 21:03:58 +09:00
Kconfig docs: kbuild: convert docs to ReST and rename to *.rst 2019-06-14 14:21:21 -06:00
MAINTAINERS MAINTAINERS: Update drm/i915 bug filing URL 2020-02-28 17:22:19 +01:00
Makefile Linux 5.4.24 2020-03-05 16:43:52 +01:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.