alistair23-linux/arch
Shaohua Li ec8acf20af swap: add per-partition lock for swapfile
swap_lock is heavily contended when I test swapping to 3 fast SSDs (it
is even slightly slower than swapping to 2 such SSDs).  The main
contention comes from swap_info_get().  This patch tries to close the
gap by adding a new per-partition lock.

Global data like nr_swapfiles, total_swap_pages, least_priority and
swap_list are still protected by swap_lock.

nr_swap_pages is now an atomic, so it can be changed without swap_lock.
In theory, get_swap_page() may find no swap pages even though free swap
pages actually exist, but that does not sound like a big problem.
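
A minimal userspace sketch of this idea, using C11 atomics rather than
the kernel's own atomic types.  The names nr_swap_pages_sketch,
swap_pages_add() and reserve_swap_page() are hypothetical and are not
taken from the patch; the point is only that the counter is read and
updated with no lock held, so the check can race with a concurrent free:

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the global free-page counter. */
static atomic_long nr_swap_pages_sketch;

/* Called when pages become free; no lock is taken. */
static void swap_pages_add(long n)
{
	assert(n > 0);
	atomic_fetch_add(&nr_swap_pages_sketch, n);
}

/*
 * Lock-free reservation attempt.  The load may observe a stale value,
 * so this can return 0 even though another thread is just about to
 * free pages; that is the benign race described above.
 */
static int reserve_swap_page(void)
{
	if (atomic_load(&nr_swap_pages_sketch) <= 0)
		return 0;
	atomic_fetch_sub(&nr_swap_pages_sketch, 1);
	return 1;
}
```

A caller that gets 0 back simply treats swap as full for this attempt
and retries on a later reclaim pass.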

Accessing partition-specific data (as scan_swap_map() and similar
functions do) is protected only by swap_info_struct.lock.

Changing swap_info_struct.flags requires holding both swap_lock and
swap_info_struct.lock, because scan_swap_map() will check it.  Reading
the flags is OK with either of the locks held.

If both swap_lock and swap_info_struct.lock must be held, we always take
the former first to avoid deadlock.
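
The ordering rule can be sketched in userspace with pthread mutexes
standing in for the kernel spinlocks.  struct swap_info_sketch,
SWP_WRITEOK_SKETCH, set_flags() and is_writable() are hypothetical names
used only for illustration, not the kernel definitions:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Stand-in for the global lock. */
static pthread_mutex_t swap_lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for the per-partition structure. */
struct swap_info_sketch {
	pthread_mutex_t lock;	/* per-partition lock */
	unsigned long flags;	/* writers hold both locks */
};

#define SWP_WRITEOK_SKETCH 0x2UL	/* illustrative bit value */

/* Writers take the global lock first, then the per-partition lock. */
static void set_flags(struct swap_info_sketch *si, unsigned long bits)
{
	assert(si != NULL);
	pthread_mutex_lock(&swap_lock);
	pthread_mutex_lock(&si->lock);
	si->flags |= bits;
	pthread_mutex_unlock(&si->lock);
	pthread_mutex_unlock(&swap_lock);
}

/* Readers may hold either lock; here only the per-partition one. */
static int is_writable(struct swap_info_sketch *si)
{
	int ok;

	pthread_mutex_lock(&si->lock);
	ok = (si->flags & SWP_WRITEOK_SKETCH) != 0;
	pthread_mutex_unlock(&si->lock);
	return ok;
}
```

Because every path that needs both locks acquires them in the same
order, two threads can never each hold one lock while waiting for the
other.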

swap_entry_free() can change swap_list.  To delete that code, we add a
new highest_priority_index.  Whenever get_swap_page() is called, we
check it.  If it's valid, we use it.
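
One way to picture the hint, as a sketch only: the free path records the
index of a partition that just gained free space, and the allocation
path consumes the hint once.  The names highest_priority_hint,
prio_sketch, hint_on_free() and pick_partition(), and the comparison
against the current candidate, are assumptions made for illustration and
are not copied from the patch:

```c
#include <assert.h>
#include <stdatomic.h>

#define NO_HINT (-1)

/* Hypothetical priorities of two partitions, indexed by type. */
static const int prio_sketch[] = { 5, 10 };

/* Index of the best partition known to have freed space, or NO_HINT. */
static atomic_int highest_priority_hint = NO_HINT;

/* Free path: publish 'type' unless a higher-priority hint is pending. */
static void hint_on_free(int type)
{
	int old = atomic_load(&highest_priority_hint);

	while ((old == NO_HINT || prio_sketch[type] > prio_sketch[old]) &&
	       !atomic_compare_exchange_weak(&highest_priority_hint,
					     &old, type))
		;	/* 'old' was refreshed by the failed exchange; retry */
}

/* Allocation path: consume the hint and prefer it if it beats 'next'. */
static int pick_partition(int next)
{
	int hint = atomic_exchange(&highest_priority_hint, NO_HINT);

	assert(next >= 0);
	if (hint != NO_HINT && hint != next &&
	    prio_sketch[hint] > prio_sketch[next])
		return hint;
	return next;
}
```

The free path never touches the shared list here; it only leaves a hint
that the next allocation validates before using.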

It's a pity get_swap_page() still holds swap_lock.  But in practice,
swap_lock isn't heavily contended in my test with this patch (or I can
say there are other much heavier bottlenecks like TLB flush).  And BTW,
it looks like get_swap_page() doesn't really need the lock.  We never
free swap_info[] and we check the SWAP_WRITEOK flag.  The only risk
without the lock is that we could swap out to some low priority swap,
but we can quickly recover after several rounds of swap, so it doesn't
sound like a big deal to me.  But I'd prefer to fix this if it's a real
problem.

"swap: make each swap partition have one address_space" improved the
swapout speed from 1.7G/s to 2G/s.  This patch further improves the
speed to 2.3G/s, so around 15% improvement.  It's a multi-process test,
so TLB flush isn't the biggest bottleneck before the patches.

[arnd@arndb.de: fix it for nommu]
[hughd@google.com: add missing unlock]
[minchan@kernel.org: get rid of lockdep whinge on sys_swapon]
Signed-off-by: Shaohua Li <shli@fusionio.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-23 17:50:17 -08:00
..
alpha Merge branch 'akpm' (incoming from Andrew) 2013-02-21 17:38:49 -08:00
arm Merge branch 'akpm' (incoming from Andrew) 2013-02-21 17:38:49 -08:00
arm64 memory-hotplug: remove memmap of sparse-vmemmap 2013-02-23 17:50:12 -08:00
avr32 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-02-08 18:02:14 -05:00
blackfin arm-soc: cleanups 2013-02-21 14:58:40 -08:00
c6x c6x: Provide dummy dma_mmap_coherent() and dma_get_sgtable() 2013-01-29 08:11:14 +01:00
cris Merge branch 'akpm' (incoming from Andrew) 2013-02-21 17:38:49 -08:00
frv Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2013-02-20 18:58:50 -08:00
h8300 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2013-02-21 17:40:58 -08:00
hexagon arch Kconfig: Remove references to IRQ_PER_CPU 2013-02-04 18:53:20 +01:00
ia64 memory-hotplug: remove memmap of sparse-vmemmap 2013-02-23 17:50:12 -08:00
m32r arm-soc: cleanups 2013-02-21 14:58:40 -08:00
m68k arm-soc: cleanups 2013-02-21 14:58:40 -08:00
microblaze Driver core patches for 3.9-rc1 2013-02-21 12:05:51 -08:00
mips Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-02-21 18:06:55 -08:00
mn10300 Merge branch 'akpm' (incoming from Andrew) 2013-02-21 17:38:49 -08:00
openrisc openrisc idle: delete pm_idle 2013-02-17 23:37:08 -05:00
parisc tty/serial patches for 3.9-rc1 2013-02-21 13:41:04 -08:00
powerpc memory-hotplug: remove memmap of sparse-vmemmap 2013-02-23 17:50:12 -08:00
s390 memory-hotplug: remove memmap of sparse-vmemmap 2013-02-23 17:50:12 -08:00
score ARCH: drivers remove __dev* attributes. 2013-01-03 15:57:13 -08:00
sh memory-hotplug: introduce new arch_remove_memory() for removing page table 2013-02-23 17:50:12 -08:00
sparc swap: add per-partition lock for swapfile 2013-02-23 17:50:17 -08:00
tile swap: add per-partition lock for swapfile 2013-02-23 17:50:17 -08:00
um tty/serial patches for 3.9-rc1 2013-02-21 13:41:04 -08:00
unicore32 unicore32 idle: delete stray pm_idle comment 2013-02-17 23:37:08 -05:00
x86 acpi, memory-hotplug: support getting hotplug info from SRAT 2013-02-23 17:50:14 -08:00
xtensa Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2013-02-21 17:40:58 -08:00
.gitignore
Kconfig kprobes/x86: Move ftrace-based kprobe code into kprobes-ftrace.c 2013-01-21 13:22:36 -05:00