remarkable-linux/mm
Andrea Arcangeli 4106f83a9f make swappiness safer to use
Swappiness isn't a safe sysctl.  Setting it to 0 for example can hang a
system.  That's a corner case but even setting it to 10 or lower can waste
enormous amounts of cpu without making much progress.  We've customers who
wants to use swappiness but they can't because of the current
implementation (if you change it so the system stops swapping it really
stops swapping and nothing works sane anymore if you really had to swap
something to make progress).

This patch from Kurt Garloff makes swappiness safer to use (no more huge
cpu usage or hangs with low swappiness values).

I think the prev_priority can also be nuked since it wastes 4 bytes per
zone (that would be an incremental patch but I wait the nr_scan_[in]active
to be nuked first for similar reasons).  Clearly somebody at some point
noticed how broken that thing was and they had to add min(priority,
prev_priority) to give it some reliability, but they didn't go the last
mile to nuke prev_priority too.  Calculating distress only in function of
not-racy priority is correct and sure more than enough without having to
add randomness into the equation.

Patch is tested on older kernels but it compiles and it's quite simple
so...

Overall I'm not very satisified by the swappiness tweak, since it doesn't
rally do anything with the dirty pagecache that may be inactive.  We need
another kind of tweak that controls the inactive scan and tunes the
can_writepage feature (not yet in mainline despite having submitted it a
few times), not only the active one.  That new tweak will tell the kernel
how hard to scan the inactive list for pure clean pagecache (something the
mainline kernel isn't capable of yet).  We already have that feature
working in all our enterprise kernels with the default reasonable tune, or
they can't even run a readonly backup with tar without triggering huge
write I/O.  I think it should be available also in mainline later.

Cc: Nick Piggin <npiggin@suse.de>
Signed-off-by: Kurt Garloff <garloff@suse.de>
Signed-off-by: Andrea Arcangeli <andrea@suse.de>
Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:59 -07:00
..
allocpercpu.c Slab allocators: Replace explicit zeroing with __GFP_ZERO 2007-07-17 10:23:02 -07:00
backing-dev.c remove mm/backing-dev.c:congestion_wait_interruptible() 2007-07-16 09:05:52 -07:00
bootmem.c
bounce.c Drop 'size' argument from bio_endio and bi_end_io 2007-10-10 09:25:57 +02:00
fadvise.c
filemap.c fs: remove some AOP_TRUNCATED_PAGE 2007-10-16 09:42:58 -07:00
filemap_xip.c mm: write iovec cleanup 2007-10-16 09:42:54 -07:00
fremap.c fix VM_CAN_NONLINEAR check in sys_remap_file_pages 2007-10-08 12:58:14 -07:00
highmem.c Create the ZONE_MOVABLE zone 2007-07-17 10:22:59 -07:00
hugetlb.c hugetlb: fix clear_user_highpage arguments 2007-10-01 07:52:23 -07:00
internal.h
Kconfig vmemmap: generify initialisation via helpers 2007-10-16 09:42:51 -07:00
madvise.c speed up madvise_need_mmap_write() usage 2007-07-16 09:05:36 -07:00
Makefile Generic Virtual Memmap support for SPARSEMEM 2007-10-16 09:42:51 -07:00
memory.c calculation of pgoff in do_linear_fault() uses mixed units 2007-10-16 09:42:53 -07:00
memory_hotplug.c Memoryless nodes: introduce mask of nodes with memory 2007-10-16 09:42:58 -07:00
mempolicy.c memoryless nodes: fixup uses of node_online_map in generic code 2007-10-16 09:42:59 -07:00
mempool.c Slab allocators: Replace explicit zeroing with __GFP_ZERO 2007-07-17 10:23:02 -07:00
migrate.c Memoryless nodes: Update memory policy and page migration 2007-10-16 09:42:58 -07:00
mincore.c
mlock.c do not limit locked memory when RLIMIT_MEMLOCK is RLIM_INFINITY 2007-07-16 09:05:37 -07:00
mmap.c fix NULL pointer dereference in __vm_enough_memory() 2007-08-22 19:52:45 -07:00
mmzone.c
mprotect.c mm: variable length argument support 2007-07-19 10:04:45 -07:00
mremap.c mm: variable length argument support 2007-07-19 10:04:45 -07:00
msync.c
nommu.c fix NULL pointer dereference in __vm_enough_memory() 2007-08-22 19:52:45 -07:00
oom_kill.c Memoryless nodes: OOM: use N_HIGH_MEMORY map instead of constructing one on the fly 2007-10-16 09:42:58 -07:00
page-writeback.c memoryless nodes: fixup uses of node_online_map in generic code 2007-10-16 09:42:59 -07:00
page_alloc.c memoryless nodes: fixup uses of node_online_map in generic code 2007-10-16 09:42:59 -07:00
page_io.c Drop 'size' argument from bio_endio and bi_end_io 2007-10-10 09:25:57 +02:00
pdflush.c Freezer: make kernel threads nonfreezable by default 2007-07-17 10:23:02 -07:00
prio_tree.c
quicklist.c
readahead.c mm: buffered write cleanup 2007-10-16 09:42:54 -07:00
rmap.c mm: Remove slab destructors from kmem_cache_create(). 2007-07-20 10:11:58 +09:00
shmem.c memoryless nodes: fixup uses of node_online_map in generic code 2007-10-16 09:42:59 -07:00
shmem_acl.c
slab.c Categorize GFP flags 2007-10-16 09:42:59 -07:00
slob.c Slab allocators: fail if ksize is called with a NULL parameter 2007-10-16 09:42:53 -07:00
slub.c Categorize GFP flags 2007-10-16 09:42:59 -07:00
sparse-vmemmap.c vmemmap: generify initialisation via helpers 2007-10-16 09:42:51 -07:00
sparse.c Generic Virtual Memmap support for SPARSEMEM 2007-10-16 09:42:51 -07:00
swap.c mm: use pagevec to rotate reclaimable page 2007-10-16 09:42:54 -07:00
swap_state.c mm: clarify __add_to_swap_cache locking 2007-10-16 09:42:53 -07:00
swapfile.c Replace CONFIG_SOFTWARE_SUSPEND with CONFIG_HIBERNATION 2007-07-29 16:45:38 -07:00
thrash.c
tiny-shmem.c
truncate.c mm: merge populate and nopage into fault (fixes nonlinear) 2007-07-19 10:04:41 -07:00
util.c Slab allocators: fail if ksize is called with a NULL parameter 2007-10-16 09:42:53 -07:00
vmalloc.c Categorize GFP flags 2007-10-16 09:42:59 -07:00
vmscan.c make swappiness safer to use 2007-10-16 09:42:59 -07:00
vmstat.c Remove fs.h from mm.h 2007-07-29 17:09:29 -07:00