remarkable-linux/mm
zhong jiang 5b398e416e mm,ksm: fix endless looping in allocating memory when ksm enable
I hit the following hung task when runing a OOM LTP test case with 4.1
kernel.

Call trace:
[<ffffffc000086a88>] __switch_to+0x74/0x8c
[<ffffffc000a1bae0>] __schedule+0x23c/0x7bc
[<ffffffc000a1c09c>] schedule+0x3c/0x94
[<ffffffc000a1eb84>] rwsem_down_write_failed+0x214/0x350
[<ffffffc000a1e32c>] down_write+0x64/0x80
[<ffffffc00021f794>] __ksm_exit+0x90/0x19c
[<ffffffc0000be650>] mmput+0x118/0x11c
[<ffffffc0000c3ec4>] do_exit+0x2dc/0xa74
[<ffffffc0000c46f8>] do_group_exit+0x4c/0xe4
[<ffffffc0000d0f34>] get_signal+0x444/0x5e0
[<ffffffc000089fcc>] do_signal+0x1d8/0x450
[<ffffffc00008a35c>] do_notify_resume+0x70/0x78

The oom victim cannot terminate because it needs to take mmap_sem for
write while the lock is held by ksmd for read which loops in the page
allocator

ksm_do_scan
	scan_get_next_rmap_item
		down_read
		get_next_rmap_item
			alloc_rmap_item   #ksmd will loop permanently.

There is no way forward because the oom victim cannot release any memory
in 4.1 based kernel.  Since 4.6 we have the oom reaper which would solve
this problem because it would release the memory asynchronously.
Nevertheless we can relax alloc_rmap_item requirements and use
__GFP_NORETRY because the allocation failure is acceptable as ksm_do_scan
would just retry later after the lock got dropped.

Such a patch would be also easy to backport to older stable kernels which
do not have oom_reaper.

While we are at it add GFP_NOWARN so the admin doesn't have to be alarmed
by the allocation failure.

Link: http://lkml.kernel.org/r/1474165570-44398-1-git-send-email-zhongjiang@huawei.com
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Suggested-by: Hugh Dickins <hughd@google.com>
Suggested-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-28 16:19:01 -07:00
..
kasan kasan: remove the unnecessary WARN_ONCE from quarantine.c 2016-08-11 16:58:14 -07:00
backing-dev.c block: fix bdi vs gendisk lifetime mismatch 2016-08-04 14:19:16 -06:00
balloon_compaction.c mm: balloon: use general non-lru movable page feature 2016-07-26 16:19:19 -07:00
bootmem.c
cleancache.c
cma.c
cma.h
cma_debug.c
compaction.c mm, compaction: simplify contended compaction handling 2016-07-28 16:07:41 -07:00
debug.c mm: avoid endless recursion in dump_page() 2016-09-19 15:36:16 -07:00
debug_page_ref.c
dmapool.c
early_ioremap.c
fadvise.c
failslab.c
filemap.c block/mm: make bdev_ops->rw_page() take a bool for read/write 2016-08-07 14:41:02 -06:00
frame_vector.c
frontswap.c mm, frontswap: convert frontswap_enabled to static key 2016-07-26 16:19:19 -07:00
gup.c - ARM: GICv3 ITS emulation and various fixes. Removal of the old 2016-08-02 16:11:27 -04:00
highmem.c
huge_memory.c mm: check VMA flags to avoid invalid PROT_NONE NUMA balancing 2016-09-25 15:43:42 -07:00
hugetlb.c mm/hugetlb: fix incorrect hugepages count during mem hotplug 2016-08-11 16:58:13 -07:00
hugetlb_cgroup.c
hwpoison-inject.c
init-mm.c
internal.h mm, compaction: simplify contended compaction handling 2016-07-28 16:07:41 -07:00
interval_tree.c
Kconfig mm: clarify COMPACTION Kconfig text 2016-08-26 17:39:35 -07:00
Kconfig.debug
khugepaged.c mm, thp: fix leaking mapped pte in __collapse_huge_page_swapin() 2016-09-19 15:36:16 -07:00
kmemcheck.c
kmemleak-test.c
kmemleak.c kmemleak: don't hang if user disables scanning early 2016-07-28 16:07:41 -07:00
ksm.c mm,ksm: fix endless looping in allocating memory when ksm enable 2016-09-28 16:19:01 -07:00
list_lru.c
maccess.c
madvise.c
Makefile Implements HARDENED_USERCOPY verification of copy_to_user/copy_from_user 2016-08-08 14:48:14 -07:00
memblock.c mm/memblock.c: fix NULL dereference error 2016-08-04 20:02:09 -04:00
memcontrol.c mm: memcontrol: make per-cpu charge cache IRQ-safe for socket accounting 2016-09-19 15:36:17 -07:00
memory-failure.c mm: hwpoison: remove incorrect comments 2016-07-28 16:07:41 -07:00
memory.c mm: check VMA flags to avoid invalid PROT_NONE NUMA balancing 2016-09-25 15:43:42 -07:00
memory_hotplug.c mem-hotplug: don't clear the only node in new_node_page() 2016-09-19 15:36:16 -07:00
mempolicy.c mm, mempolicy: task->mempolicy must be NULL before dropping final reference 2016-09-01 17:52:01 -07:00
mempool.c Revert "mm, mempool: only set __GFP_NOMEMALLOC if there are free elements" 2016-07-28 16:07:41 -07:00
memtest.c
migrate.c mm, thp: remove __GFP_NORETRY from khugepaged and madvised allocations 2016-07-28 16:07:41 -07:00
mincore.c
mlock.c mm, vmscan: move LRU lists to node 2016-07-28 16:07:41 -07:00
mm_init.c
mmap.c mm: refuse wrapped vm_brk requests 2016-08-02 19:35:15 -04:00
mmu_context.c
mmu_notifier.c
mmzone.c
mprotect.c mm: thp: check pmd_trans_unstable() after split_huge_pmd() 2016-07-26 16:19:19 -07:00
mremap.c mm: thp: check pmd_trans_unstable() after split_huge_pmd() 2016-07-26 16:19:19 -07:00
msync.c
nobootmem.c
nommu.c mm: introduce fault_env 2016-07-26 16:19:19 -07:00
oom_kill.c mm, oom: fix uninitialized ret in task_will_free_mem() 2016-08-11 16:58:14 -07:00
page-writeback.c mm: remove reclaim and compaction retry approximations 2016-07-28 16:07:41 -07:00
page_alloc.c mm, vmscan: only allocate and reclaim from zones with pages managed by the buddy allocator 2016-09-01 17:52:01 -07:00
page_counter.c
page_ext.c
page_idle.c mm, vmscan: move lru_lock to the node 2016-07-28 16:07:41 -07:00
page_io.c mm: fix the page_swap_info() BUG_ON check 2016-09-19 15:36:17 -07:00
page_isolation.c mm/page_isolation: clean up confused code 2016-07-26 16:19:19 -07:00
page_owner.c mm/page_owner: use stackdepot to store stacktrace 2016-07-26 16:19:19 -07:00
page_poison.c
pagewalk.c
percpu-km.c
percpu-vm.c
percpu.c
pgtable-generic.c
process_vm_access.c
quicklist.c
readahead.c mm: silently skip readahead for DAX inodes 2016-08-26 17:39:35 -07:00
rmap.c rmap: fix compound check logic in page_remove_file_rmap 2016-08-10 16:40:56 -07:00
shmem.c huge tmpfs: fix Committed_AS leak 2016-09-24 11:20:01 -07:00
slab.c Implements HARDENED_USERCOPY verification of copy_to_user/copy_from_user 2016-08-08 14:48:14 -07:00
slab.h mm, kasan: switch SLUB to stackdepot, enable memory quarantine for SLUB 2016-07-28 16:07:41 -07:00
slab_common.c mm: charge/uncharge kmemcg from generic page allocator paths 2016-07-26 16:19:19 -07:00
slob.c
slub.c mm/slub.c: run free_partial() outside of the kmem_cache_node->list_lock 2016-08-10 16:40:56 -07:00
sparse-vmemmap.c treewide: replace obsolete _refok by __ref 2016-08-02 17:31:41 -04:00
sparse.c treewide: replace obsolete _refok by __ref 2016-08-02 17:31:41 -04:00
swap.c mm, pagevec: release/reacquire lru_lock on pgdat change 2016-07-28 16:07:41 -07:00
swap_cgroup.c
swap_state.c mm: move most file-based accounting to the node 2016-07-28 16:07:41 -07:00
swapfile.c mm: fix the page_swap_info() BUG_ON check 2016-09-19 15:36:17 -07:00
truncate.c truncate: handle file thp 2016-07-26 16:19:19 -07:00
usercopy.c mm: usercopy: Check for module addresses 2016-09-20 16:07:39 -07:00
userfaultfd.c
util.c mm: move most file-based accounting to the node 2016-07-28 16:07:41 -07:00
vmacache.c
vmalloc.c mm: charge/uncharge kmemcg from generic page allocator paths 2016-07-26 16:19:19 -07:00
vmpressure.c
vmscan.c mm: delete unnecessary and unsafe init_tlb_ubc() 2016-09-24 11:20:01 -07:00
vmstat.c mm: remove reclaim and compaction retry approximations 2016-07-28 16:07:41 -07:00
workingset.c mm, workingset: make working set detection node-aware 2016-07-28 16:07:41 -07:00
z3fold.c
zbud.c
zpool.c
zsmalloc.c zsmalloc: Delete an unnecessary check before the function call "iput" 2016-07-28 16:07:41 -07:00
zswap.c