1
0
Fork 0
alistair23-linux/fs
Theodore Ts'o 68f23b8906 memcg: fix a crash in wb_workfn when a device disappears
Without memcg, there is a one-to-one mapping between the bdi and
bdi_writeback structures.  In this world, things are fairly
straightforward; the first thing bdi_unregister() does is to shutdown
the bdi_writeback structure (or wb), and part of that writeback ensures
that no other work queued against the wb, and that the wb is fully
drained.

With memcg, however, there is a one-to-many relationship between the bdi
and bdi_writeback structures; that is, there are multiple wb objects
which can all point to a single bdi.  There is a refcount which prevents
the bdi object from being released (and hence, unregistered).  So in
theory, the bdi_unregister() *should* only get called once its refcount
goes to zero (bdi_put will drop the refcount, and when it is zero,
release_bdi gets called, which calls bdi_unregister).

Unfortunately, del_gendisk() in block/gen_hd.c never got the memo about
the Brave New memcg World, and calls bdi_unregister directly.  It does
this without informing the file system, or the memcg code, or anything
else.  This causes the root wb associated with the bdi to be
unregistered, but none of the memcg-specific wb's are shutdown.  So when
one of these wb's are woken up to do delayed work, they try to
dereference their wb->bdi->dev to fetch the device name, but
unfortunately bdi->dev is now NULL, thanks to the bdi_unregister()
called by del_gendisk().  As a result, *boom*.

Fortunately, it looks like the rest of the writeback path is perfectly
happy with bdi->dev and bdi->owner being NULL, so the simplest fix is to
create a bdi_dev_name() function which can handle bdi->dev being NULL.
This also allows us to bulletproof the writeback tracepoints to prevent
them from dereferencing a NULL pointer and crashing the kernel if one is
tracing with memcg's enabled, and an iSCSI device dies or a USB storage
stick is pulled.

The most common way of triggering this will be hotremoval of a device
while writeback with memcg enabled is going on.  It was triggering
several times a day in a heavily loaded production environment.

Google Bug Id: 145475544

Link: https://lore.kernel.org/r/20191227194829.150110-1-tytso@mit.edu
Link: http://lkml.kernel.org/r/20191228005211.163952-1-tytso@mit.edu
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: Chris Mason <clm@fb.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31 10:30:36 -08:00
..
9p 9p pull request for inclusion in 5.4 2019-09-27 15:10:34 -07:00
adfs fs/adfs: bigdir: Fix an error code in adfs_fplus_read() 2020-01-25 11:31:59 -05:00
affs affs: fix a memory leak in affs_remount 2019-11-18 14:26:43 +01:00
afs afs: Fix characters allowed into cell names 2020-01-26 08:54:04 -08:00
autofs Merge branch 'next.autofs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-12-05 17:11:48 -08:00
befs fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
bfs fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
btrfs fs-dedupe-last-block-tag 2020-01-28 15:18:23 -08:00
cachefiles treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 36 2019-05-24 17:27:11 +02:00
ceph ceph: hold extra reference to r_parent over life of request 2020-01-21 19:02:37 +01:00
cifs CIFS: Fix task struct use-after-free on reconnect 2020-01-26 19:24:17 -06:00
coda y2038: add inode timestamp clamping 2019-09-19 09:42:37 -07:00
configfs configfs: calculate the depth of parent item 2019-11-06 18:36:01 +01:00
cramfs cramfs: fix usage on non-MTD device 2019-11-23 21:44:49 -05:00
crypto fscrypt: improve format of no-key names 2020-01-22 14:50:03 -08:00
debugfs debugfs: Return -EPERM when locked down 2020-01-14 16:14:48 +01:00
devpts devpts_pty_kill(): don't bother with d_delete() 2019-09-03 09:30:56 -04:00
dlm dlm: use SO_SNDTIMEO_NEW instead of SO_SNDTIMEO_OLD 2019-12-18 18:07:31 +01:00
ecryptfs crypto: skcipher - remove crypto_skcipher::keysize 2019-12-11 16:36:56 +08:00
efivarfs Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
efs fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
erofs erofs: clean up z_erofs_submit_queue() 2020-01-21 16:46:23 +08:00
exportfs race in exportfs_decode_fh() 2019-11-11 09:21:59 -05:00
ext2 \n 2019-11-30 11:16:07 -08:00
ext4 linux-kselftest-5.6-rc1-kunit 2020-01-29 15:25:34 -08:00
f2fs fsverity updates for 5.6 2020-01-28 15:31:03 -08:00
fat fat: use prandom_u32() for i_generation 2019-12-18 18:07:31 +01:00
freevxfs fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
fscache Revert "Merge tag 'keys-acl-20190703' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs" 2019-07-10 18:43:43 -07:00
fuse fuse: fix fuse_send_readpages() in the syncronous read case 2020-01-16 11:09:36 +01:00
gfs2 GFS2 changes for this merge window: 2019-12-05 13:20:11 -08:00
hfs hfs/hfsplus: use 64-bit inode timestamps 2019-12-18 18:07:32 +01:00
hfsplus hfs/hfsplus: use 64-bit inode timestamps 2019-12-18 18:07:32 +01:00
hostfs hostfs: pass 64-bit timestamps to/from user space 2019-12-18 18:07:32 +01:00
hpfs fs: compat_ioctl: move FITRIM emulation into file systems 2019-10-23 17:23:46 +02:00
hugetlbfs mm/hugetlbfs: fix for_each_hstate() loop in init_hugetlbfs_fs() 2020-01-03 10:39:08 -08:00
iomap iomap: stop using ioend after it's been freed in iomap_finish_ioend() 2019-12-05 07:41:16 -08:00
isofs y2038: add inode timestamp clamping 2019-09-19 09:42:37 -07:00
jbd2 This merge window saw the the following new featuers added to ext4: 2019-11-30 10:53:02 -08:00
jffs2 Revert "jffs2: Fix possible null-pointer dereferences in jffs2_add_frag_to_fragtree()" 2019-11-29 11:29:58 +01:00
jfs y2038: add inode timestamp clamping 2019-09-19 09:42:37 -07:00
kernfs fs/kernfs/dir.c: Clean code by removing always true condition 2020-01-14 16:14:47 +01:00
lockd NFSv4.1: Don't rebind to the same source port when reconnecting to the server 2019-11-03 21:28:45 -05:00
minix fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
nfs y2038: core, driver and file system changes 2020-01-29 14:55:47 -08:00
nfs_common treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
nfsd This is a relatively quiet cycle for nfsd, mainly various bugfixes. 2019-12-07 16:56:00 -08:00
nilfs2 fs: compat_ioctl: move FITRIM emulation into file systems 2019-10-23 17:23:46 +02:00
nls treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
notify fs: call fsnotify_sb_delete after evict_inodes 2019-12-18 00:03:01 -05:00
ntfs ntfs: remove (un)?likely() from IS_ERR() conditions 2019-09-26 10:10:44 -07:00
ocfs2 ocfs2: fix the crash due to call ocfs2_get_dlm_debug once less 2020-01-04 13:55:09 -08:00
omfs fs: omfs: Initialize filesystem timestamp ranges 2019-08-30 08:11:25 -07:00
openpromfs Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
orangefs orangefs: posix open permission checking... 2019-12-04 08:52:55 -05:00
overlayfs overlayfs fixes for 5.5-rc2 2019-12-14 11:13:54 -08:00
proc Merge branch 'work.openat2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-01-29 11:20:24 -08:00
pstore pstore/ram: Regularize prz label allocation lifetime 2020-01-08 17:05:45 -08:00
qnx4 fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
qnx6 fs: Fill in max and min timestamps in superblock 2019-08-30 07:27:17 -07:00
quota fs: avoid softlockups in s_inodes iterators 2019-12-18 00:03:01 -05:00
ramfs vfs: Convert ramfs, shmem, tmpfs, devtmpfs, rootfs to use the new mount API 2019-09-12 21:05:34 -04:00
reiserfs reiserfs: fix handling of -EOPNOTSUPP in reiserfs_for_each_xattr 2020-01-16 16:49:29 +01:00
romfs Merge branch 'work.mount2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-09-19 10:06:57 -07:00
squashfs Merge branch 'work.mount2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-09-19 10:06:57 -07:00
sysfs Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
sysv fs: sysv: Initialize filesystem timestamp ranges 2019-08-30 07:27:18 -07:00
tracefs tracing: Do not create tracefs files if tracefs lockdown is in effect 2019-10-12 20:49:07 -04:00
ubifs ubifs: allow both hash and disk name to be provided in no-key names 2020-01-22 14:49:56 -08:00
udf fs-udf: Delete an unnecessary check before brelse() 2019-09-04 18:19:43 +02:00
ufs y2038: add inode timestamp clamping 2019-09-19 09:42:37 -07:00
unicode unicode: make array 'token' static const, makes object smaller 2019-09-17 11:48:24 -04:00
verity fs-verity: use u64_to_user_ptr() 2020-01-14 13:28:28 -08:00
xfs Merge branch 'core/kprobes' into perf/core, to pick up a completed branch 2019-12-25 10:43:08 +01:00
Kconfig io-wq: small threadpool implementation for io_uring 2019-10-29 12:43:00 -06:00
Kconfig.binfmt binfmt_flat: make support for old format binaries optional 2019-06-24 09:16:47 +10:00
Makefile compat_ioctl: move sys_compat_ioctl() to ioctl.c 2020-01-03 09:42:52 +01:00
aio.c y2038: syscall implementation cleanups 2019-12-01 14:00:59 -08:00
anon_inodes.c Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
attr.c timestamp_truncate: Replace users of timespec64_trunc 2019-08-30 07:27:17 -07:00
bad_inode.c
…
binfmt_aout.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
binfmt_elf.c fs/binfmt_elf.c: extract elf_read() function 2019-12-04 19:44:13 -08:00
binfmt_elf_fdpic.c y2038: elfcore: Use __kernel_old_timeval for process times 2019-11-15 14:38:29 +01:00
binfmt_em86.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
binfmt_flat.c fs/binfmt_flat.c: remove set but not used variable 'inode' 2019-07-16 19:23:22 -07:00
binfmt_misc.c Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
binfmt_script.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
block_dev.c block: don't send uevent for empty disk when not invalidating 2019-12-02 18:49:30 -07:00
buffer.c smp: Remove allocation mask from on_each_cpu_cond.*() 2020-01-24 20:40:09 +01:00
char_dev.c chardev: Avoid potential use-after-free in 'chrdev_open()' 2020-01-06 20:10:26 +01:00
compat.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
compat_binfmt_elf.c y2038: elfcore: Use __kernel_old_timeval for process times 2019-11-15 14:38:29 +01:00
coredump.c coredump: split pipe command whitespace before expanding template 2019-08-03 07:02:01 -07:00
d_path.c [PATCH] fix d_absolute_path() interplay with fsmount() 2019-08-30 19:31:09 -04:00
dax.c New code for 5.5: 2019-11-30 10:44:49 -08:00
dcache.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-12-08 11:08:28 -08:00
dcookies.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
direct-io.c fs/direct-io.c: include fs/internal.h for missing prototype 2020-01-04 13:55:09 -08:00
drop_caches.c fs: avoid softlockups in s_inodes iterators 2019-12-18 00:03:01 -05:00
eventfd.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
eventpoll.c eventpoll: support non-blocking do_epoll_ctl() calls 2020-01-29 15:45:47 -07:00
exec.c Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-12-03 12:20:25 -08:00
fcntl.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-12-08 11:08:28 -08:00
fhandle.c fs/handle.c - fix up kerneldoc 2019-08-07 21:51:47 -04:00
file.c threads-v5.6 2020-01-29 19:38:34 -08:00
file_table.c vfs: Export flush_delayed_fput for use by knfsd. 2019-08-19 11:00:39 -04:00
filesystems.c vfs: Implement a filesystem superblock creation/configuration context 2019-02-28 03:29:26 -05:00
fs-writeback.c memcg: fix a crash in wb_workfn when a device disappears 2020-01-31 10:30:36 -08:00
fs_context.c vfs: subtype handling moved to fuse 2019-09-06 21:28:49 +02:00
fs_parser.c vfs: Make fs_parse() handle fs_param_is_fd-type params better 2019-09-12 21:06:14 -04:00
fs_pin.c switch the remnants of releasing the mountpoint away from fs_pin 2019-07-16 22:52:37 -04:00
fs_struct.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
fs_types.c fs: common implementation of file type 2019-01-21 17:48:13 +01:00
fsopen.c Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
inode.c fscrypt: don't allow v1 policies with casefolding 2020-01-22 14:47:15 -08:00
internal.h for-5.6/io_uring-vfs-2020-01-29 2020-01-29 18:53:37 -08:00
io-wq.c io_uring: fix linked command file table usage 2020-01-29 13:46:44 -07:00
io-wq.h io_uring: fix linked command file table usage 2020-01-29 13:46:44 -07:00
io_uring.c for-5.6/io_uring-vfs-2020-01-29 2020-01-29 18:53:37 -08:00
ioctl.c compat_ioctl: simplify the implementation 2020-01-03 09:42:52 +01:00
libfs.c fs/libfs.c: fix kernel-doc warning 2019-10-14 15:04:01 -07:00
locks.c locks: print unsigned ino in /proc/locks 2019-12-29 09:00:58 -05:00
mbcache.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
mount.h switch the remnants of releasing the mountpoint away from fs_pin 2019-07-16 22:52:37 -04:00
mpage.c fs: move guard_bio_eod() after bio_set_op_attrs 2020-01-09 08:16:12 -07:00
namei.c Merge branch 'work.openat2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-01-29 11:20:24 -08:00
namespace.c fs/namespace.c: make to_mnt_ns() static 2020-01-04 13:55:09 -08:00
no-block.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
nsfs.c Merge branch 'work.openat2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-01-29 11:20:24 -08:00
open.c fs: make build_open_flags() available internally 2020-01-20 17:01:53 -07:00
pipe.c pipe: fix empty pipe check in pipe_write() 2019-12-22 09:47:47 -08:00
pnode.c fs/namespace: fix unprivileged mount propagation 2019-06-17 17:36:09 -04:00
pnode.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 209 2019-05-30 11:29:53 -07:00
posix_acl.c fs/posix_acl.c: fix kernel-doc warnings 2020-01-04 13:55:09 -08:00
proc_namespace.c vfs: subtype handling moved to fuse 2019-09-06 21:28:49 +02:00
read_write.c fs: allow deduplication of eof block into the end of the destination file 2020-01-23 18:20:48 +01:00
readdir.c readdir: make user_access_begin() use the real access range 2020-01-23 10:15:28 -08:00
select.c y2038: syscalls: change remaining timeval to __kernel_old_timeval 2019-11-15 14:38:29 +01:00
seq_file.c seq_file: fix problem when seeking mid-record 2019-08-13 16:06:52 -07:00
signalfd.c fs: mark expected switch fall-throughs 2019-04-08 18:21:02 -05:00
splice.c pipe: remove 'waiting_writers' merging logic 2019-12-07 13:21:01 -08:00
stack.c sched/rt, fs: Use CONFIG_PREEMPTION 2019-12-08 14:37:36 +01:00
stat.c fs: make two stat prep helpers available 2020-01-20 17:03:54 -07:00
statfs.c vfs: Fix EOVERFLOW testing in put_compat_statfs64 2019-10-03 14:21:35 -07:00
super.c fs: call fsnotify_sb_delete after evict_inodes 2019-12-18 00:03:01 -05:00
sync.c fs/sync.c: sync_file_range(2) may use WB_SYNC_ALL writeback 2019-05-14 09:47:50 -07:00
timerfd.c timerfd: Make timerfd_settime() time namespace aware 2020-01-14 12:20:53 +01:00
userfaultfd.c Merge branch 'akpm' (patches from Andrew) 2019-12-01 20:36:41 -08:00
utimes.c y2038: syscalls: change remaining timeval to __kernel_old_timeval 2019-11-15 14:38:29 +01:00
xattr.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00