alistair23-linux/fs/proc
David Rientjes a7f638f999 mm, oom: normalize oom scores to oom_score_adj scale only for userspace
The oom_score_adj scale ranges from -1000 to 1000 and represents the
proportion of memory available to the process at allocation time.  This
means an oom_score_adj value of 300, for example, will bias a process as
though it was using an extra 30.0% of available memory and a value of
-350 will discount 35.0% of available memory from its usage.

The oom killer badness heuristic also uses this scale to report the oom
score for each eligible process in determining the "best" process to
kill.  Thus, it can only differentiate each process's memory usage by
0.1% of system RAM.

On large systems, this can end up being a large amount of memory: 256MB
on 256GB systems, for example.

This can be fixed by having the badness heuristic to use the actual
memory usage in scoring threads and then normalizing it to the
oom_score_adj scale for userspace.  This results in better comparison
between eligible threads for kill and no change from the userspace
perspective.

Suggested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Tested-by: Dave Jones <davej@redhat.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29 16:22:24 -07:00
..
array.c userns: Convert proc to use kuid/kgid where appropriate 2012-05-15 14:59:28 -07:00
base.c mm, oom: normalize oom scores to oom_score_adj scale only for userspace 2012-05-29 16:22:24 -07:00
cmdline.c proc: switch /proc/cmdline to seq_file 2008-10-23 14:29:04 +04:00
consoles.c console: rename acquire/release_console_sem() to console_lock/unlock() 2011-01-26 10:50:06 +10:00
cpuinfo.c proc: move /proc/cpuinfo code to fs/proc/cpuinfo.c 2008-10-23 15:05:11 +04:00
devices.c proc: use seq_puts()/seq_putc() where possible 2011-01-13 08:03:16 -08:00
generic.c switch procfs to umode_t use 2012-01-03 22:54:56 -05:00
inode.c avoid iput() from flusher thread 2012-05-28 09:54:45 -07:00
internal.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/sysctl 2012-03-23 18:08:58 -07:00
interrupts.c proc: move /proc/interrupts boilerplate code to fs/proc/interrupts.c 2008-10-23 15:15:46 +04:00
Kconfig kconfig: rename CONFIG_EMBEDDED to CONFIG_EXPERT 2011-01-20 17:02:05 -08:00
kcore.c fs/proc/kcore.c: make get_sparsemem_vmemmap_info() static 2012-03-23 16:58:42 -07:00
kmsg.c procfs: Use generic_file_llseek in /proc/kmsg 2010-04-09 16:35:41 +02:00
loadavg.c sched, timers: cleanup avenrun users 2009-05-15 15:32:45 +02:00
Makefile ns: proc files for namespace naming policy. 2011-05-10 14:31:44 -07:00
meminfo.c fs/proc/meminfo.c: fix compilation error 2011-12-09 07:50:28 -08:00
mmu.c fs/proc/mmu.c: headers butchery 2007-10-17 08:42:48 -07:00
namespaces.c fs/proc/namespaces.c: prevent crash when ns_entries[] is empty 2012-03-28 17:14:37 -07:00
nommu.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
page.c pagemap: export KPF_THP 2012-03-21 17:54:57 -07:00
proc_devtree.c of/flattree: Drop an uninteresting message to pr_debug level 2011-03-02 13:45:18 -07:00
proc_net.c switch procfs to umode_t use 2012-01-03 22:54:56 -05:00
proc_sysctl.c userns: Convert sysctl permission checks to use kuid and kgids. 2012-05-15 14:59:28 -07:00
proc_tty.c proc: use seq_puts()/seq_putc() where possible 2011-01-13 08:03:16 -08:00
root.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2012-05-23 17:42:39 -07:00
softirqs.c proc: use seq_puts()/seq_putc() where possible 2011-01-13 08:03:16 -08:00
stat.c proc: stats: Use arch_idle_time for idle and iowait times if available 2012-03-30 15:43:33 +02:00
task_mmu.c mm: fix NULL ptr deref when walking hugepages 2012-05-29 16:22:18 -07:00
task_nommu.c procfs: mark thread stack correctly in proc/<pid>/maps 2012-03-21 17:54:58 -07:00
uptime.c Merge branch 'sched/core' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into cputime-tip 2011-12-19 19:23:15 +01:00
version.c proc: switch /proc/version to seq_file 2008-10-23 14:19:58 +04:00
vmcore.c fadump: Introduce cleanup routine to invalidate /proc/vmcore. 2012-02-23 10:50:02 +11:00