remarkable-linux


Dmitry Vyukov 64abdcb243 kasan: eliminate long stalls during quarantine reduction

Currently we dedicate 1/32 of RAM for quarantine and then reduce it by 1/4 of total quarantine size. This can be a significant amount of memory. For example, with 4GB of RAM total quarantine size is 128MB and it is reduced by 32MB at a time. With 128GB of RAM total quarantine size is 4GB and it is reduced by 1GB. This leads to several problems:

- freeing 1GB can take tens of seconds, causes rcu stall warnings and just introduces unexpected long delays at random places
- if kmalloc() is called under a mutex, other threads stall on that mutex while a thread reduces quarantine
- threads wait on quarantine_lock while one thread grabs a large batch of objects to evict
- we walk the uncached list of objects to free twice which makes all of the above worse
- when a thread frees objects, they are already not accounted against global_quarantine.bytes; as a result we can have quarantine_size bytes in quarantine + unbounded amount of memory in large batches in threads that are in process of freeing

Reduce size of quarantine in smaller batches to reduce the delays. The only reason to reduce it in batches is amortization of overheads, the new batch size of 1MB should be well enough to amortize spinlock lock/unlock and few function calls.

Plus organize quarantine as a FIFO array of batches. This allows to not walk the list in quarantine_reduce() under quarantine_lock, which in turn reduces contention and is just faster.

This improves performance of heavy load (syzkaller fuzzing) by ~20% with 4 CPUs and 32GB of RAM. Also this eliminates frequent (every 5 sec) drops of CPU consumption from ~400% to ~100% (one thread reduces quarantine while others are waiting on a mutex).

Some reference numbers:

1. Machine with 4 CPUs and 4GB of memory. Quarantine size 128MB. Currently we free 32MB at a time. With new code we free 1MB at a time (1024 batches, ~128 are used).
2. Machine with 32 CPUs and 128GB of memory. Quarantine size 4GB. Currently we free 1GB at a time. With new code we free 8MB at a time (1024 batches, ~512 are used).
3. Machine with 4096 CPUs and 1TB of memory. Quarantine size 32GB. Currently we free 8GB at a time. With new code we free 4MB at a time (16K batches, ~8K are used).

Link: http://lkml.kernel.org/r/1478756952-18695-1-git-send-email-dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12 18:55:09 -08:00
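The three reference machines in the commit message are consistent with a simple sizing rule, sketched here in userspace C. Everything named in this block is an assumption made for illustration: `plan_quarantine`, `struct quarantine_plan`, the `max(1024, 4 * CPUs)` batch count, and the `2 * total / batches` batch size with a 1MB floor were chosen because they reproduce the numbers quoted in the message. They are not quoted from the patch, and the sketch ignores any per-CPU queues the real quarantine may keep.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed constants, picked to match the commit message's figures. */
#define QUARANTINE_FRACTION 32                    /* quarantine = RAM / 32 */
#define MIN_BATCH_BYTES     ((uint64_t)1 << 20)   /* 1MB floor per batch   */
#define MIN_BATCHES         ((uint64_t)1024)

/* Hypothetical result type: how a FIFO array of batches would be sized. */
struct quarantine_plan {
	uint64_t total_bytes;   /* total quarantine size              */
	uint64_t nr_batches;    /* slots in the FIFO array            */
	uint64_t batch_bytes;   /* bytes freed per reduction step     */
	uint64_t batches_used;  /* slots needed to hold total_bytes   */
};

static struct quarantine_plan plan_quarantine(uint64_t ram_bytes,
					      uint64_t nr_cpus)
{
	struct quarantine_plan p;

	p.total_bytes = ram_bytes / QUARANTINE_FRACTION;

	/* Array length grows with CPU count but never drops below 1024. */
	p.nr_batches = 4 * nr_cpus > MIN_BATCHES ? 4 * nr_cpus : MIN_BATCHES;

	/* Aim to fill about half of the array, with a 1MB minimum batch. */
	p.batch_bytes = 2 * p.total_bytes / p.nr_batches;
	if (p.batch_bytes < MIN_BATCH_BYTES)
		p.batch_bytes = MIN_BATCH_BYTES;

	p.batches_used = p.total_bytes / p.batch_bytes;
	return p;
}

/* Checks the sketch against the three machines listed in the message. */
static void check_reference_numbers(void)
{
	struct quarantine_plan p;

	p = plan_quarantine((uint64_t)4 << 30, 4);       /* 4GB, 4 CPUs */
	assert(p.batch_bytes == (uint64_t)1 << 20 && p.batches_used == 128);

	p = plan_quarantine((uint64_t)128 << 30, 32);    /* 128GB, 32 CPUs */
	assert(p.batch_bytes == (uint64_t)8 << 20 && p.batches_used == 512);

	p = plan_quarantine((uint64_t)1 << 40, 4096);    /* 1TB, 4096 CPUs */
	assert(p.nr_batches == 16384);
	assert(p.batch_bytes == (uint64_t)4 << 20 && p.batches_used == 8192);
}
```

Calling `check_reference_numbers()` from a `main()` exercises all three cases. The point of the rule is that each reduction step frees megabytes rather than a quarter of the whole quarantine, so the time spent under the lock stays short regardless of RAM size.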
kasan	kasan: eliminate long stalls during quarantine reduction	2016-12-12 18:55:09 -08:00
backing-dev.c	block: fix bdi vs gendisk lifetime mismatch	2016-08-04 14:19:16 -06:00
balloon_compaction.c	mm: balloon: use general non-lru movable page feature	2016-07-26 16:19:19 -07:00
bootmem.c	mm: kmemleak: avoid using __va() on addresses that don't have a lowmem mapping	2016-10-11 15:06:33 -07:00
cleancache.c
cma.c	mm/cma.c: check the max limit for cma allocation	2016-11-11 08:12:37 -08:00
cma.h
cma_debug.c
compaction.c	mm, compaction: fix NR_ISOLATED_* stats for pfn based migration	2016-12-12 18:55:07 -08:00
debug.c	mm, debug: print raw struct page data in __dump_page()	2016-12-12 18:55:08 -08:00
debug_page_ref.c
dmapool.c
early_ioremap.c
fadvise.c	mm/fadvise.c: do not discard partial pages with POSIX_FADV_DONTNEED	2016-06-09 14:23:11 -07:00
failslab.c
filemap.c	mm: workingset: restore refault tracking for single-page files	2016-12-12 18:55:08 -08:00
frame_vector.c	mm: replace get_vaddr_frames() write/force parameters with gup_flags	2016-10-19 08:11:24 -07:00
frontswap.c	mm, frontswap: convert frontswap_enabled to static key	2016-07-26 16:19:19 -07:00
gup.c	mm: fix up get_user_pages* comments	2016-12-12 18:55:07 -08:00
highmem.c	mm/highmem: make nr_free_highpages() handles all highmem zones by itself	2016-05-19 19:12:14 -07:00
huge_memory.c	mm: make transparent hugepage size public	2016-12-12 18:55:09 -08:00
hugetlb.c	mm: add tlb_remove_check_page_size_change to track page size change	2016-12-12 18:55:07 -08:00
hugetlb_cgroup.c	mm, hugetlb_cgroup: round limit_in_bytes down to hugepage size	2016-05-20 17:58:30 -07:00
hwpoison-inject.c
init-mm.c
internal.h	mm, compaction: make full priority ignore pageblock suitability	2016-10-07 18:46:29 -07:00
interval_tree.c
Kconfig	mm: THP page cache support for ppc64	2016-12-12 18:55:08 -08:00
Kconfig.debug	PM / Hibernate: allow hibernation with PAGE_POISONING_ZERO	2016-09-13 02:35:27 +02:00
khugepaged.c	mm: THP page cache support for ppc64	2016-12-12 18:55:08 -08:00
kmemcheck.c
kmemleak-test.c
kmemleak.c	kmemleak: fix reference to Documentation	2016-12-12 18:55:07 -08:00
ksm.c	mm,ksm: add __GFP_HIGH to the allocation in alloc_stable_node()	2016-10-07 18:46:29 -07:00
list_lru.c	mm/list_lru.c: avoid error-path NULL pointer deref	2016-10-27 18:43:42 -07:00
maccess.c	x86: remove more uaccess_32.h complexity	2016-05-22 17:21:27 -07:00
madvise.c	mm: add tlb_remove_check_page_size_change to track page size change	2016-12-12 18:55:07 -08:00
Makefile	Disable the __builtin_return_address() warning globally after all	2016-10-12 10:23:41 -07:00
memblock.c	mm: kmemleak: avoid using __va() on addresses that don't have a lowmem mapping	2016-10-11 15:06:33 -07:00
memcontrol.c	mm: memcontrol: use special workqueue for creating per-memcg caches	2016-12-12 18:55:06 -08:00
memory-failure.c	mm: hwpoison: fix thp split handling in memory_failure()	2016-11-11 08:12:37 -08:00
memory.c	mm: THP page cache support for ppc64	2016-12-12 18:55:08 -08:00
memory_hotplug.c	mm: remove x86-only restriction of movable_node	2016-12-12 18:55:07 -08:00
mempolicy.c	mm/mempolicy.c: forbid static or relative flags for local NUMA mode	2016-12-12 18:55:07 -08:00
mempool.c	Revert "mm, mempool: only set __GFP_NOMEMALLOC if there are free elements"	2016-07-28 16:07:41 -07:00
memtest.c
migrate.c	lib: radix-tree: check accounting of existing slot replacement users	2016-12-12 18:55:08 -08:00
mincore.c	mm, swap: use offset of swap entry as key of swap cache	2016-10-07 18:46:28 -07:00
mlock.c	thp: fix corner case of munlock() of PTE-mapped THPs	2016-11-30 16:32:52 -08:00
mm_init.c
mmap.c	mm: vma_merge: correct false positive from __vma_unlink->validate_mm_rb	2016-10-07 18:46:29 -07:00
mmu_context.c	mm/mmu_context, sched/core: Fix mmu_context.h assumption	2016-04-28 11:44:19 +02:00
mmu_notifier.c
mmzone.c	mm, page_alloc: inline the fast path of the zonelist iterator	2016-05-19 19:12:14 -07:00
mprotect.c	mm/pkeys: generate pkey system call code only if ARCH_HAS_PKEYS is selected	2016-12-12 18:55:07 -08:00
mremap.c	mremap: move_ptes: check pte dirty after its removal	2016-11-29 08:20:24 -08:00
msync.c
nobootmem.c	mm: kmemleak: avoid using __va() on addresses that don't have a lowmem mapping	2016-10-11 15:06:33 -07:00
nommu.c	mm: unexport __get_user_pages()	2016-10-24 19:13:20 -07:00
oom_kill.c	oom: print nodemask in the oom report	2016-10-07 18:46:29 -07:00
page-writeback.c	mm: don't use radix tree writeback tags for pages in swap cache	2016-10-07 18:46:28 -07:00
page_alloc.c	mm, page_alloc: keep pcp count and list contents in sync if struct page is corrupted	2016-12-12 18:55:08 -08:00
page_counter.c
page_ext.c	mm/page_ext: support extra space allocation by page_ext user	2016-10-07 18:46:27 -07:00
page_idle.c	mm, vmscan: move lru_lock to the node	2016-07-28 16:07:41 -07:00
page_io.c	mm/page_io.c: replace some BUG_ON()s with VM_BUG_ON_PAGE()	2016-10-07 18:46:29 -07:00
page_isolation.c	mm/page_isolation: fix typo: "paes" -> "pages"	2016-10-07 18:46:29 -07:00
page_owner.c	mm/page_owner: don't define fields on struct page_ext by hard-coding	2016-10-07 18:46:27 -07:00
page_poison.c	mm: check the return value of lookup_page_ext for all call sites	2016-06-03 15:06:22 -07:00
pagewalk.c
percpu-km.c
percpu-vm.c
percpu.c	mm/percpu.c: fix potential memory leakage for pcpu_embed_first_chunk()	2016-10-05 11:52:55 -04:00
pgtable-generic.c
process_vm_access.c	mm: remove write/force parameters from __get_user_pages_unlocked()	2016-10-18 14:13:37 -07:00
quicklist.c
readahead.c	mm: don't cap request size based on read-ahead setting	2016-12-12 18:55:08 -08:00
rmap.c	mm, rmap: handle anon_vma_prepare() common case inline	2016-12-12 18:55:08 -08:00
shmem.c	lib: radix-tree: update callback for changing leaf nodes	2016-12-12 18:55:08 -08:00
slab.c	mm, slab: maintain total slab count instead of active count	2016-12-12 18:55:07 -08:00
slab.h	mm, slab: maintain total slab count instead of active count	2016-12-12 18:55:07 -08:00
slab_common.c	mm/slab_common.c: check kmem_create_cache flags are common	2016-12-12 18:55:06 -08:00
slob.c	slub: move synchronize_sched out of slab_mutex on shrink	2016-12-12 18:55:06 -08:00
slub.c	slub: avoid false-postive warning	2016-12-12 18:55:06 -08:00
sparse-vmemmap.c	treewide: replace obsolete _refok by __ref	2016-08-02 17:31:41 -04:00
sparse.c	treewide: replace obsolete _refok by __ref	2016-08-02 17:31:41 -04:00
swap.c	thp: reduce usage of huge zero page's atomic counter	2016-10-07 18:46:28 -07:00
swap_cgroup.c
swap_state.c	mm, swap: use offset of swap entry as key of swap cache	2016-10-07 18:46:28 -07:00
swapfile.c	mm: add three more cond_resched() in swapoff	2016-12-12 18:55:08 -08:00
truncate.c	mm: workingset: move shadow entry tracking to radix tree exceptional tracking	2016-12-12 18:55:08 -08:00
usercopy.c	mm: usercopy: Check for module addresses	2016-09-20 16:07:39 -07:00
userfaultfd.c	mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros	2016-04-04 10:41:08 -07:00
util.c	Merge branch 'mm-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2016-10-22 09:39:10 -07:00
vmacache.c	mm: unrig VMA cache hit ratio	2016-10-07 18:46:27 -07:00
vmalloc.c	mm: add preempt points into __purge_vmap_area_lazy()	2016-12-12 18:55:08 -08:00
vmpressure.c
vmscan.c	mm/vmscan.c: set correct defer count for shrinker	2016-12-12 18:55:07 -08:00
vmstat.c	seq/proc: modify seq_put_decimal_[u]ll to take a const char *, not char	2016-10-07 18:46:30 -07:00
workingset.c	mm: workingset: update shadow limit to reflect bigger active list	2016-12-12 18:55:08 -08:00
z3fold.c	mm/z3fold.c: avoid modifying HEADLESS page and minor cleanup	2016-06-03 16:02:55 -07:00
zbud.c
zpool.c
zsmalloc.c	zsmalloc: Delete an unnecessary check before the function call "iput"	2016-07-28 16:07:41 -07:00
zswap.c	mm/zswap: use workqueue to destroy pool	2016-05-20 17:58:30 -07:00