alistair23-linux/drivers/block
Ying Huang 966a967116 smp: Avoid using two cache lines for struct call_single_data
struct call_single_data is used in IPIs to transfer information between
CPUs.  Its size is bigger than sizeof(unsigned long) and less than
cache line size.  Currently it is not allocated with any explicit alignment
requirements.  This makes it possible for an allocated call_single_data to
straddle two cache lines, which doubles the number of cache lines that
need to be transferred among CPUs.
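
To make the straddling condition concrete, here is a minimal sketch of the
arithmetic. The 64-byte line size and the helper name crosses_cache_line()
are assumptions for illustration only; they are not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size for illustration; real hardware varies. */
#define CACHE_LINE_SIZE 64u

/*
 * Hypothetical helper: returns true if an object of `size` bytes
 * (size must be > 0) starting at address `addr` touches two
 * different cache lines.
 */
static bool crosses_cache_line(uintptr_t addr, size_t size)
{
	uintptr_t first_line = addr / CACHE_LINE_SIZE;
	uintptr_t last_line = (addr + size - 1) / CACHE_LINE_SIZE;

	return first_line != last_line;
}
```

With these assumptions, a 32-byte object at offset 48 spans bytes 48..79 and
so touches two lines, while the same object at offset 32 or 64 stays within
one line.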

This can be fixed by requiring call_single_data to be aligned to the
size of call_single_data.  Currently the size of call_single_data is a
power of 2.  If we add new fields to call_single_data, we may need to
add padding to make sure the size of the new definition is a power of 2
as well.

Fortunately, this is enforced by GCC, which rejects an alignment that is
not a power of 2 and will therefore report bad sizes at compile time.

To set the alignment requirement of call_single_data to the size of
call_single_data, a struct definition and a typedef are used.
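
The struct-plus-typedef pattern can be sketched in userspace C roughly as
follows.  This is an illustration under assumptions, not the kernel
definition: struct csd_example, csd_example_t, example_csd and
csd_is_aligned() are hypothetical names, the fields are stand-ins, and the
GCC/Clang aligned attribute is spelled out directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the real structure; fields are illustrative. */
struct csd_example {
	void *next;
	void (*func)(void *info);
	void *info;
	unsigned int flags;
};

/*
 * The typedef carries the alignment requirement: every object declared
 * as csd_example_t is aligned to the size of the underlying struct, so
 * it cannot straddle a cache line as long as that size is a power of 2
 * no larger than the line size.
 */
typedef struct csd_example csd_example_t
	__attribute__((aligned(sizeof(struct csd_example))));

/* Documents the power-of-2 size assumption; fails the build otherwise. */
_Static_assert((sizeof(struct csd_example) &
		(sizeof(struct csd_example) - 1)) == 0,
	       "size of struct csd_example must be a power of 2");

/* An object declared through the aligned typedef. */
static csd_example_t example_csd;

/* Hypothetical check: is the object aligned to the struct size? */
static int csd_is_aligned(const csd_example_t *csd)
{
	return ((uintptr_t)csd % sizeof(struct csd_example)) == 0;
}
```

Keeping the plain struct separate from the aligned typedef means the
alignment can be expressed in terms of sizeof() of an already complete type.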

To test the effect of the patch, I used the vm-scalability
multi-threaded swap test case (swap-w-seq-mt).  The test creates multiple
threads, and each thread eats memory until all RAM and part of swap
is used, so that a huge number of IPIs are triggered when unmapping
memory.  In the test, the throughput of memory writing improves by ~5%
compared with misaligned call_single_data, because of faster IPIs.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Huang, Ying <ying.huang@intel.com>
[ Add call_single_data_t and align with size of call_single_data. ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Aaron Lu <aaron.lu@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/87bmnqd6lz.fsf@yhuang-mobile.sh.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-29 15:14:38 +02:00
..
aoe block: don't set bounce limit in blk_init_queue 2017-06-27 12:13:45 -06:00
drbd Merge branch 'work.misc-set_fs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-07-05 13:13:32 -07:00
mtip32xx Merge branch 'nvme-4.13' of git://git.infradead.org/nvme into for-linus 2017-07-10 11:44:34 -06:00
paride block: don't set bounce limit in blk_init_queue 2017-06-27 12:13:45 -06:00
rsxx block: don't bother with bounce limits for make_request drivers 2017-06-27 12:13:45 -06:00
xen-blkback Merge commit '8e8320c9315c' into for-4.13/block 2017-06-22 21:55:24 -06:00
zram zram: rework copy of compressor name in comp_algorithm_store() 2017-08-10 15:54:07 -07:00
DAC960.c block: introduce new block status code type 2017-06-09 09:27:32 -06:00
DAC960.h
Kconfig libnvdimm for 4.12 2017-05-05 18:49:20 -07:00
Makefile block: remove the osdblk driver 2017-04-19 09:10:51 -06:00
amiflop.c block: introduce new block status code type 2017-06-09 09:27:32 -06:00
ataflop.c block: introduce new block status code type 2017-06-09 09:27:32 -06:00
brd.c ARM: fix rd_size declaration 2017-07-10 16:32:34 -07:00
cciss.c cciss: initialize struct scsi_req 2017-07-06 12:23:51 -06:00
cciss.h SCSI misc on 20170220 2017-02-21 11:51:42 -08:00
cciss_cmd.h cciss: use new doorbell-bit-5 reset method 2011-05-06 08:23:55 -06:00
cciss_scsi.c cciss: Remove kmalloc cast 2017-02-22 11:54:49 -07:00
cciss_scsi.h cciss: add cciss_tape_cmds module paramter 2011-05-06 08:23:59 -06:00
cryptoloop.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
floppy.c Merge branch 'misc.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-07-06 20:57:13 -07:00
loop.c Merge branch 'work.read_write' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-07-05 14:35:57 -07:00
loop.h loop: support 4k physical blocksize 2017-06-08 08:40:00 -06:00
nbd.c nbd: clear disconnected on reconnect 2017-07-25 13:58:34 -06:00
null_blk.c smp: Avoid using two cache lines for struct call_single_data 2017-08-29 15:14:38 +02:00
pktcdvd.c driver core patches for 4.13-rc1 2017-07-03 20:27:48 -07:00
ps3disk.c block: introduce new block status code type 2017-06-09 09:27:32 -06:00
ps3vram.c blk: remove bio_set arg from blk_queue_split() 2017-06-18 12:40:59 -06:00
rbd.c rbd: use bio_clone_fast() instead of bio_clone() 2017-06-18 12:40:59 -06:00
rbd_types.h rbd: RBD_V{1,2}_DATA_FORMAT macros 2017-02-20 12:16:15 +01:00
skd_main.c block: don't set bounce limit in blk_init_queue 2017-06-27 12:13:45 -06:00
skd_s1120.h skd: fix formatting in skd_s1120.h 2013-11-08 09:10:30 -07:00
smart1,2.h fix typos 'comamnd' -> 'command' in comments 2011-02-02 11:31:21 +01:00
sunvdc.c sunvdc: prevent sunvdc panic when mpgroup disk added to guest domain 2017-08-09 22:22:32 -07:00
swim.c block: don't set bounce limit in blk_init_queue 2017-06-27 12:13:45 -06:00
swim3.c block: don't set bounce limit in blk_init_queue 2017-06-27 12:13:45 -06:00
swim_asm.S m68k: mac - Add SWIM floppy support 2009-03-26 21:15:27 +01:00
sx8.c block: introduce new block status code type 2017-06-09 09:27:32 -06:00
umem.c blk: remove bio_set arg from blk_queue_split() 2017-06-18 12:40:59 -06:00
umem.h
virtio_blk.c virtio_blk: Use sysfs_match_string() helper 2017-07-25 16:37:34 +03:00
xen-blkfront.c xen-blkfront: use a right index when checking requests 2017-08-15 10:34:04 -04:00
xsysace.c block: don't set bounce limit in blk_init_queue 2017-06-27 12:13:45 -06:00
z2ram.c block: introduce new block status code type 2017-06-09 09:27:32 -06:00