remarkable-linux/fs
Brian Foster 63db7c815b xfs: use ->b_state to fix buffer I/O accounting release race
We've had user reports of unmount hangs in xfs_wait_buftarg() that
analysis shows is due to btp->bt_io_count == -1. bt_io_count
represents the count of in-flight asynchronous buffers and thus
should always be >= 0. xfs_wait_buftarg() waits for this value to
stabilize to zero in order to ensure that all untracked (with
respect to the lru) buffers have completed I/O processing before
unmount proceeds to tear down in-core data structures.

The value of -1 implies an I/O accounting decrement race. Indeed,
the fact that xfs_buf_ioacct_dec() is called from xfs_buf_rele()
(where the buffer lock is no longer held) means that bp->b_flags can
be updated from an unsafe context. While a user-level reproducer is
currently not available, some intrusive hacks to run racing buffer
lookups/ioacct/releases from multiple threads was used to
successfully manufacture this problem.

Existing callers do not expect to acquire the buffer lock from
xfs_buf_rele(). Therefore, we can not safely update ->b_flags from
this context. It turns out that we already have separate buffer
state bits and associated serialization for dealing with buffer LRU
state in the form of ->b_state and ->b_lock. Therefore, replace the
_XBF_IN_FLIGHT flag with a ->b_state variant, update the I/O
accounting wrappers appropriately and make sure they are used with
the correct locking. This ensures that buffer in-flight state can be
modified at buffer release time without racing with modifications
from a buffer lock holder.

Fixes: 9c7504aa72 ("xfs: track and serialize in-flight async buffers against unmount")
Cc: <stable@vger.kernel.org> # v4.8+
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Tested-by: Libor Pechacek <lpechacek@suse.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-05-31 08:22:52 -07:00
..
9p 9p: Convert to separately allocated bdi 2017-04-20 12:09:55 -06:00
adfs
affs fs/affs: add rename exchange 2017-05-05 15:24:52 -04:00
afs Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2017-05-02 16:40:27 -07:00
autofs4 Fix dead URLs to ftp.kernel.org 2017-03-28 16:16:52 +02:00
befs befs: make export work with cold dcache 2017-05-05 11:35:35 +01:00
bfs Tigran has moved 2017-05-12 15:57:15 -07:00
btrfs Merge branch 'for-linus-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2017-05-10 08:33:17 -07:00
cachefiles sched/headers: Prepare to remove <linux/cred.h> inclusion from <linux/sched.h> 2017-03-02 08:42:31 +01:00
ceph The two main items are support for disabling automatic rbd exclusive 2017-05-10 08:42:33 -07:00
cifs fs: cifs: replace CURRENT_TIME by other appropriate apis 2017-05-08 17:15:15 -07:00
coda coda: Convert to separately allocated bdi 2017-04-20 12:09:55 -06:00
configfs
cramfs
crypto fscrypt: introduce helper function for filename matching 2017-05-04 11:44:37 -04:00
debugfs fs: constify tree_descr arrays passed to simple_fill_super() 2017-04-26 23:54:06 -04:00
devpts
dlm net: Work around lockdep limitation in sockets that use sockets 2017-03-09 18:23:27 -08:00
ecryptfs ecryptfs: Convert to separately allocated bdi 2017-04-20 12:09:55 -06:00
efivarfs
efs
exofs fs: semove set but not checked AOP_FLAG_UNINTERRUPTIBLE flag 2017-05-08 17:15:14 -07:00
exportfs Merge branch 'rebased-statx' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-03-03 11:38:56 -08:00
ext2 Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm 2017-05-12 15:43:10 -07:00
ext4 Merge branch 'akpm' (patches from Andrew) 2017-05-13 09:49:35 -07:00
f2fs Merge branch 'akpm' (patches from Andrew) 2017-05-08 18:17:56 -07:00
fat fat: fix using uninitialized fields of fat_inode/fsinfo_inode 2017-03-09 17:01:10 -08:00
freevxfs
fscache KEYS: Differentiate uses of rcu_dereference_key() and user_key_payload() 2017-03-02 10:09:00 +11:00
fuse NFS client updates for Linux 4.12 2017-05-10 13:03:38 -07:00
gfs2 gfs2: replace CURRENT_TIME with current_time 2017-05-08 17:15:15 -07:00
hfs fs: semove set but not checked AOP_FLAG_UNINTERRUPTIBLE flag 2017-05-08 17:15:14 -07:00
hfsplus fs: semove set but not checked AOP_FLAG_UNINTERRUPTIBLE flag 2017-05-08 17:15:14 -07:00
hostfs
hpfs sched/headers: Prepare to move signal wakeup & sigpending methods from <linux/sched.h> into <linux/sched/signal.h> 2017-03-02 08:42:32 +01:00
hugetlbfs hugetlbfs: fix offset overflow in hugetlbfs mmap 2017-04-13 18:24:21 -07:00
isofs sched/headers: Prepare to remove <linux/cred.h> inclusion from <linux/sched.h> 2017-03-02 08:42:31 +01:00
jbd2 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-05-10 10:30:46 -07:00
jffs2 jffs2: fix spelling mistake: "requestied" -> "requested" 2017-04-19 11:35:55 -07:00
jfs jfs: Remove jfs_get_inode_flags() 2017-04-19 14:21:23 +02:00
kernfs kernfs: Check KERNFS_HAS_RELEASE before calling kernfs_release_file() 2017-03-17 10:25:59 +09:00
lockd Another RDMA update from Chuck Lever, and a bunch of miscellaneous 2017-05-10 13:29:23 -07:00
minix statx: Add a system call to make enhanced file info available 2017-03-02 20:51:15 -05:00
ncpfs ncpfs: Convert to separately allocated bdi 2017-04-20 12:09:55 -06:00
nfs Another RDMA update from Chuck Lever, and a bunch of miscellaneous 2017-05-10 13:29:23 -07:00
nfs_common
nfsd Another RDMA update from Chuck Lever, and a bunch of miscellaneous 2017-05-10 13:29:23 -07:00
nilfs2 fs: Remove SB_I_DYNBDI flag 2017-04-20 12:09:55 -06:00
nls
notify fanotify: don't expose EOPENSTALE to userspace 2017-04-25 15:48:06 +02:00
ntfs sched/headers: Prepare to move signal wakeup & sigpending methods from <linux/sched.h> into <linux/sched/signal.h> 2017-03-02 08:42:32 +01:00
ocfs2 fs/ocfs2/cluster: use offset_in_page() macro 2017-05-03 15:52:07 -07:00
omfs sched/headers: Prepare to remove <linux/cred.h> inclusion from <linux/sched.h> 2017-03-02 08:42:31 +01:00
openpromfs
orangefs Some cleanups: 2017-05-05 13:36:10 -07:00
overlayfs Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs 2017-05-10 09:03:48 -07:00
proc proc: try to remove use of FOLL_FORCE entirely 2017-05-09 08:45:16 -07:00
pstore Annotation of module parameters that specify device settings 2017-05-10 19:13:03 -07:00
qnx4
qnx6
quota quota: Remove dquot_quotactl_ops 2017-04-19 14:21:23 +02:00
ramfs
reiserfs reiserfs: use designated initializers 2017-05-08 17:15:11 -07:00
romfs
squashfs fs/pstore: fs/squashfs: change usage of LZ4 to work with new LZ4 version 2017-02-24 17:46:57 -08:00
sysfs sysfs: be careful of error returns from ops->show() 2017-04-08 17:33:32 +02:00
sysv statx: Add a system call to make enhanced file info available 2017-03-02 20:51:15 -05:00
tracefs fs: constify tree_descr arrays passed to simple_fill_super() 2017-04-26 23:54:06 -04:00
ubifs This pull request contains updates for both UBI and UBIFS: 2017-05-13 10:23:12 -07:00
udf udf: use kmap_atomic for memcpy copying 2017-04-24 16:28:02 +02:00
ufs fs: ufs: use ktime_get_real_ts64() for birthtime 2017-05-08 17:15:15 -07:00
xfs xfs: use ->b_state to fix buffer I/O accounting release race 2017-05-31 08:22:52 -07:00
aio.c Merge branch 'WIP.sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-03-03 10:16:38 -08:00
anon_inodes.c
attr.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/signal.h> 2017-03-02 08:42:29 +01:00
bad_inode.c statx: Add a system call to make enhanced file info available 2017-03-02 20:51:15 -05:00
binfmt_aout.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task_stack.h> 2017-03-02 08:42:36 +01:00
binfmt_elf.c sched/headers: Prepare to move cputime functionality from <linux/sched.h> into <linux/sched/cputime.h> 2017-03-02 08:42:39 +01:00
binfmt_elf_fdpic.c sched/headers: Prepare to move cputime functionality from <linux/sched.h> into <linux/sched/cputime.h> 2017-03-02 08:42:39 +01:00
binfmt_em86.c
binfmt_flat.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task_stack.h> 2017-03-02 08:42:36 +01:00
binfmt_misc.c fs: constify tree_descr arrays passed to simple_fill_super() 2017-04-26 23:54:06 -04:00
binfmt_script.c
block_dev.c Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm 2017-05-12 15:43:10 -07:00
buffer.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-05-09 09:12:53 -07:00
char_dev.c chardev: add helper function to register char devs with a struct device 2017-03-21 06:44:32 +01:00
compat.c fs/compat.c: trim unused includes 2017-04-17 12:52:27 -04:00
compat_binfmt_elf.c
compat_ioctl.c fs: compat: Remove warning from COMPATIBLE_IOCTL 2017-04-29 17:47:19 -04:00
coredump.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task_stack.h> 2017-03-02 08:42:36 +01:00
dax.c Merge branch 'akpm' (patches from Andrew) 2017-05-13 09:49:35 -07:00
dcache.c fs: don't set *REFERENCED on single use objects 2017-05-03 11:47:05 -04:00
dcookies.c
direct-io.c fs: add i_blocksize() 2017-02-27 18:43:46 -08:00
drop_caches.c
eventfd.c sched/headers: Prepare to move signal wakeup & sigpending methods from <linux/sched.h> into <linux/sched/signal.h> 2017-03-02 08:42:32 +01:00
eventpoll.c epoll: Add busy poll support to epoll with socket fds. 2017-03-24 20:49:31 -07:00
exec.c x86/arch_prctl: Add ARCH_[GET|SET]_CPUID 2017-03-20 16:10:34 +01:00
fcntl.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-05-09 09:12:53 -07:00
fhandle.c fhandle: move compat syscalls from compat.c 2017-04-17 12:52:26 -04:00
file.c mm, vmalloc: use __GFP_HIGHMEM implicitly 2017-05-08 17:15:13 -07:00
file_table.c sched/headers: Prepare to remove <linux/cred.h> inclusion from <linux/sched.h> 2017-03-02 08:42:31 +01:00
filesystems.c
fs-writeback.c writeback: fix memory leak in wb_queue_work() 2017-03-13 08:27:34 -06:00
fs_pin.c
fs_struct.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task.h> 2017-03-02 08:42:35 +01:00
inode.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-05-09 09:12:53 -07:00
internal.h Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-05-09 09:12:53 -07:00
ioctl.c sched/headers: Prepare for the reduction of <linux/sched.h>'s signal API dependency 2017-03-02 08:42:37 +01:00
iomap.c fs: semove set but not checked AOP_FLAG_UNINTERRUPTIBLE flag 2017-05-08 17:15:14 -07:00
Kconfig block, dax: move "select DAX" from BLOCK to FS_DAX 2017-05-08 10:55:27 -07:00
Kconfig.binfmt
libfs.c fs: constify tree_descr arrays passed to simple_fill_super() 2017-04-26 23:54:06 -04:00
locks.c locks: Set FL_CLOSE when removing flock locks on close() 2017-04-21 10:45:01 -04:00
Makefile
mbcache.c
mount.h fsnotify: Free fsnotify_mark_connector when there is no mark attached 2017-04-10 17:37:35 +02:00
mpage.c fs: add i_blocksize() 2017-02-27 18:43:46 -08:00
namei.c Merge branch 'work.sane_pwd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-05-12 11:39:59 -07:00
namespace.c Merge branch 'work.sane_pwd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-05-12 11:39:59 -07:00
no-block.c
nsfs.c ns: allow ns_entries to have custom symlink content 2017-05-08 17:15:12 -07:00
open.c Merge branch 'work.sane_pwd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-05-12 11:39:59 -07:00
pipe.c
pnode.c
pnode.h
posix_acl.c sched/headers: Prepare to remove <linux/cred.h> inclusion from <linux/sched.h> 2017-03-02 08:42:31 +01:00
proc_namespace.c sched/headers: Prepare to move the task_lock()/unlock() APIs to <linux/sched/task.h> 2017-03-02 08:42:38 +01:00
read_write.c move compat_rw_copy_check_uvector() over to fs/read_write.c 2017-04-17 12:52:26 -04:00
readdir.c readdir: move compat syscalls from compat.c 2017-04-17 12:52:24 -04:00
select.c treewide: use kv[mz]alloc* rather than opencoded variants 2017-05-08 17:15:13 -07:00
seq_file.c mm: introduce kv[mz]alloc helpers 2017-05-08 17:15:12 -07:00
signalfd.c mm: Rename SLAB_DESTROY_BY_RCU to SLAB_TYPESAFE_BY_RCU 2017-04-18 11:42:36 -07:00
splice.c Merge branch 'work.splice' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-05-02 11:38:06 -07:00
stack.c
stat.c Merge branch 'work.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-05-02 11:54:26 -07:00
statfs.c statfs: move compat syscalls from compat.c 2017-04-17 12:52:23 -04:00
super.c bdi: Drop 'parent' argument from bdi_register[_va]() 2017-04-20 12:09:55 -06:00
sync.c vfs: use helper for calling f_op->fsync() 2017-02-20 16:51:23 +01:00
timerfd.c timerfd: Only check CAP_WAKE_ALARM when it is needed 2017-03-01 12:53:44 +01:00
userfaultfd.c userfaultfd: report actual registered features in fdinfo 2017-04-08 00:47:48 -07:00
utimes.c utimes: move compat syscalls from compat.c 2017-04-17 12:52:23 -04:00
xattr.c treewide: use kv[mz]alloc* rather than opencoded variants 2017-05-08 17:15:13 -07:00