alistair23-linux/block
Tahsin Erdogan 7fc6b87a9f blkcg: allocate struct blkcg_gq outside request queue spinlock
blkg_conf_prep() currently calls blkg_lookup_create() while holding
the request queue spinlock. This means the memory allocation for
struct blkcg_gq has to be non-blocking, which causes occasional
-ENOMEM failures in call paths like the one below:

  pcpu_alloc+0x68f/0x710
  __alloc_percpu_gfp+0xd/0x10
  __percpu_counter_init+0x55/0xc0
  cfq_pd_alloc+0x3b2/0x4e0
  blkg_alloc+0x187/0x230
  blkg_create+0x489/0x670
  blkg_lookup_create+0x9a/0x230
  blkg_conf_prep+0x1fb/0x240
  __cfqg_set_weight_device.isra.105+0x5c/0x180
  cfq_set_weight_on_dfl+0x69/0xc0
  cgroup_file_write+0x39/0x1c0
  kernfs_fop_write+0x13f/0x1d0
  __vfs_write+0x23/0x120
  vfs_write+0xc2/0x1f0
  SyS_write+0x44/0xb0
  entry_SYSCALL_64_fastpath+0x18/0xad

In the code path above, the percpu allocator cannot call vmalloc()
because the queue spinlock is held.

A failure in this call path gives grief to tools that are trying to
configure io weights. We see occasional failures shortly after
reboots even when the system is not under any memory pressure.
Machines with a lot of CPUs are more vulnerable to this condition.

Update blkg_create() to temporarily drop the RCU and queue locks
when the gfp mask allows it.
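
The general shape of "drop the lock, allocate, retake the lock, then
recheck" can be sketched in user space. This is only an analogy and
not the kernel patch: struct group, struct queue, group_lookup() and
group_lookup_create() are hypothetical names, a pthread mutex stands in
for the queue spinlock, calloc() stands in for the blocking allocation,
and the can_block flag stands in for what the gfp mask permits.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct blkcg_gq; not the kernel structure. */
struct group {
	int id;
	struct group *next;
};

/* Hypothetical stand-in for a request queue and its lock. */
struct queue {
	pthread_mutex_t lock;
	struct group *groups;
};

/* Caller must hold q->lock. Returns the group with this id, or NULL. */
struct group *group_lookup(struct queue *q, int id)
{
	assert(q != NULL);
	for (struct group *g = q->groups; g; g = g->next)
		if (g->id == id)
			return g;
	return NULL;
}

/*
 * Called with q->lock held and returns with q->lock held.
 * When can_block is true, the lock is dropped around the allocation,
 * so another thread may create the same group in the meantime.
 */
struct group *group_lookup_create(struct queue *q, int id, bool can_block)
{
	struct group *g = group_lookup(q, id);
	struct group *new_g, *raced;

	if (g)
		return g;

	if (can_block) {
		pthread_mutex_unlock(&q->lock);
		new_g = calloc(1, sizeof(*new_g));	/* may block freely */
		pthread_mutex_lock(&q->lock);
	} else {
		/* Stand-in for an atomic allocation made under the lock. */
		new_g = calloc(1, sizeof(*new_g));
	}
	if (!new_g)
		return NULL;

	/* The list may have changed while the lock was dropped: recheck. */
	raced = group_lookup(q, id);
	if (raced) {
		free(new_g);
		return raced;
	}

	new_g->id = id;
	new_g->next = q->groups;
	q->groups = new_g;
	return new_g;
}
```

The recheck after retaking the lock is the important part of the
sketch: any state observed before the lock was dropped may be stale,
so the lookup has to be repeated before the new object is inserted.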

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Tahsin Erdogan <tahsin@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-03-28 15:59:04 -06:00
partitions partitions/efi: Fix integer overflow in GPT size calculation 2017-01-17 09:02:31 -07:00
Kconfig blk-throttle: add configure option for new .low interface 2017-03-28 08:02:20 -06:00
Kconfig.iosched block: get rid of blk-mq default scheduler choice Kconfig entries 2017-02-22 13:19:45 -07:00
Makefile virtio, vhost: optimizations, fixes 2017-03-02 13:53:13 -08:00
badblocks.c badblocks: badblocks_set/clear update unacked_exist 2016-10-21 15:45:47 -06:00
bio-integrity.c block: remove bio_is_rw 2016-10-28 08:45:17 -06:00
bio.c blk-throttle: add a simple idle detection 2017-03-28 08:02:20 -06:00
blk-cgroup.c blkcg: allocate struct blkcg_gq outside request queue spinlock 2017-03-28 15:59:04 -06:00
blk-core.c block: track request size in blk_issue_stat 2017-03-28 08:02:20 -06:00
blk-exec.c block: introduce blk_rq_is_passthrough 2017-01-31 14:00:34 -07:00
blk-flush.c block: remove outdated part of blkdev_issue_flush() comment 2017-03-24 15:41:30 -06:00
blk-integrity.c block: constify struct blk_integrity_profile 2017-03-24 20:34:39 -06:00
blk-ioc.c Merge branch 'for-linus' of git://git.kernel.dk/linux-block 2017-03-03 10:53:35 -08:00
blk-lib.c block: correct documentation for blkdev_issue_discard() flags 2017-03-24 15:41:28 -06:00
blk-map.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task_stack.h> 2017-03-02 08:42:36 +01:00
blk-merge.c block: optionally merge discontiguous discard bios into a single request 2017-02-08 13:43:08 -07:00
blk-mq-cpumap.c blk-mq: export blk_mq_map_queues 2016-11-08 17:30:00 -05:00
blk-mq-debugfs.c blk-stat: convert to callback-based statistics reporting 2017-03-21 10:03:11 -06:00
blk-mq-pci.c blk_mq: linux/blk-mq.h does not include all the headers it depends on 2016-09-19 08:21:51 -06:00
blk-mq-sched.c blk-mq: move update of tags->rqs to __blk_mq_alloc_request() 2017-03-02 08:56:04 -07:00
blk-mq-sched.h blk-mq-sched: separate mark hctx and queue restart operations 2017-02-23 11:55:47 -07:00
blk-mq-sysfs.c blk-mq: free hctx->cpumask in release handler of hctx's kobject 2017-03-08 09:56:12 -07:00
blk-mq-tag.c blk-mq: Fix tagset reinit in the presence of cpu hot-unplug 2017-03-13 08:14:23 -06:00
blk-mq-tag.h blk-mq-sched: Allocate sched reserved tags as specified in the original queue tagset 2017-03-02 08:56:04 -07:00
blk-mq-virtio.c blk-mq: provide a default queue mapping for virtio device 2017-02-27 20:54:05 +02:00
blk-mq.c block: track request size in blk_issue_stat 2017-03-28 08:02:20 -06:00
blk-mq.h blk-stat: convert to callback-based statistics reporting 2017-03-21 10:03:11 -06:00
blk-settings.c block: optionally merge discontiguous discard bios into a single request 2017-02-08 13:43:08 -07:00
blk-softirq.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/topology.h> 2017-03-02 08:42:26 +01:00
blk-stat.c blk-throttle: add a mechanism to estimate IO latency 2017-03-28 08:02:20 -06:00
blk-stat.h blk-throttle: add a mechanism to estimate IO latency 2017-03-28 08:02:20 -06:00
blk-sysfs.c blk-throttle: choose a small throtl_slice for SSD 2017-03-28 08:02:20 -06:00
blk-tag.c blk-mq-sched: add framework for MQ capable IO schedulers 2017-01-17 10:04:20 -07:00
blk-throttle.c blk-throttle: add latency target support 2017-03-28 08:02:20 -06:00
blk-timeout.c block: remove REQ_NO_TIMEOUT flag 2015-12-22 09:38:34 -07:00
blk-wbt.c blk-stat: convert to callback-based statistics reporting 2017-03-21 10:03:11 -06:00
blk-wbt.h block: track request size in blk_issue_stat 2017-03-28 08:02:20 -06:00
blk-zoned.c block: Rename blk_queue_zone_size and bdev_zone_size 2017-01-12 07:58:32 -07:00
blk.h blk-throttle: add a mechanism to estimate IO latency 2017-03-28 08:02:20 -06:00
bounce.c Merge branch 'for-linus' of git://git.kernel.dk/linux-block 2015-09-19 18:57:09 -07:00
bsg-lib.c block: split scsi_request out of struct request 2017-01-27 15:08:35 -07:00
bsg.c lib/vsprintf.c: remove %Z support 2017-02-27 18:43:47 -08:00
cfq-iosched.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/clock.h> 2017-03-02 08:42:27 +01:00
cmdline-parser.c block: remove unrelated header files and export symbol 2014-01-21 20:18:26 -08:00
compat_ioctl.c block: Get rid of blk_get_backing_dev_info() 2017-02-02 08:21:32 -07:00
deadline-iosched.c block: enumify ELEVATOR_*_MERGE 2017-02-08 13:43:06 -07:00
elevator.c block: don't call ioc_exit_icq() with the queue lock held for blk-mq 2017-03-02 13:59:08 -07:00
genhd.c block: Fix oops scsi_disk_get() 2017-03-22 20:11:37 -06:00
ioctl.c block: Get rid of blk_get_backing_dev_info() 2017-02-02 08:21:32 -07:00
ioprio.c sched/headers: Prepare to move the task_lock()/unlock() APIs to <linux/sched/task.h> 2017-03-02 08:42:38 +01:00
mq-deadline.c block: enumify ELEVATOR_*_MERGE 2017-02-08 13:43:06 -07:00
noop-iosched.c block: move existing elevator ops to union 2017-01-17 10:03:33 -07:00
opal_proto.h block/sed-opal: allocate struct opal_dev dynamically 2017-02-17 12:41:47 -07:00
partition-generic.c block: Rename blk_queue_zone_size and bdev_zone_size 2017-01-12 07:58:32 -07:00
scsi_ioctl.c block: fold cmd_type into the REQ_OP_ space 2017-01-31 14:00:44 -07:00
sed-opal.c block/sed: Fix opal user range check and unused variables 2017-03-08 09:56:12 -07:00
t10-pi.c block: constify struct blk_integrity_profile 2017-03-24 20:34:39 -06:00