alistair23-linux/fs
Wang Xiaoguang 1d57ee9416 btrfs: improve delayed refs iterations
This issue was found when I tried to delete a heavily reflinked file,
when deleting such files, other transaction operation will not have a
chance to make progress, for example, start_transaction() will blocked
in wait_current_trans(root) for long time, sometimes it even triggers
soft lockups, and the time taken to delete such heavily reflinked file
is also very large, often hundreds of seconds. Using perf top, it reports
that:

PerfTop:    7416 irqs/sec  kernel:99.8%  exact:  0.0% [4000Hz cpu-clock],  (all, 4 CPUs)
---------------------------------------------------------------------------------------
    84.37%  [btrfs]             [k] __btrfs_run_delayed_refs.constprop.80
    11.02%  [kernel]            [k] delay_tsc
     0.79%  [kernel]            [k] _raw_spin_unlock_irq
     0.78%  [kernel]            [k] _raw_spin_unlock_irqrestore
     0.45%  [kernel]            [k] do_raw_spin_lock
     0.18%  [kernel]            [k] __slab_alloc
It seems __btrfs_run_delayed_refs() took most cpu time, after some debug
work, I found it's select_delayed_ref() causing this issue, for a delayed
head, in our case, it'll be full of BTRFS_DROP_DELAYED_REF nodes, but
select_delayed_ref() will firstly try to iterate node list to find
BTRFS_ADD_DELAYED_REF nodes, obviously it's a disaster in this case, and
waste much time.

To fix this issue, we introduce a new ref_add_list in struct btrfs_delayed_ref_head,
then in select_delayed_ref(), if this list is not empty, we can directly use
nodes in this list. With this patch, it just took about 10~15 seconds to
delte the same file. Now using perf top, it reports that:

PerfTop:    2734 irqs/sec  kernel:99.5%  exact:  0.0% [4000Hz cpu-clock],  (all, 4 CPUs)
----------------------------------------------------------------------------------------

    20.74%  [kernel]          [k] _raw_spin_unlock_irqrestore
    16.33%  [kernel]          [k] __slab_alloc
     5.41%  [kernel]          [k] lock_acquired
     4.42%  [kernel]          [k] lock_acquire
     4.05%  [kernel]          [k] lock_release
     3.37%  [kernel]          [k] _raw_spin_unlock_irq

For normal files, this patch also gives help, at least we do not need to
iterate whole list to found BTRFS_ADD_DELAYED_REF nodes.

Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2016-11-30 13:45:21 +01:00
..
9p Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
adfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
affs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
afs afs: call->operation_ID sometimes used as __be32 sometimes as u32 2016-10-13 17:03:52 +01:00
autofs4 autofs: refactor ioctl fn vector in iookup_dev_ioctl() 2016-10-11 15:06:31 -07:00
befs befs fixes for 4.9-rc1 2016-10-15 12:09:13 -07:00
bfs Merge remote-tracking branch 'ovl/rename2' into for-linus 2016-10-10 23:02:51 -04:00
btrfs btrfs: improve delayed refs iterations 2016-11-30 13:45:21 +01:00
cachefiles Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
ceph ceph: use default file splice read callback 2016-11-10 20:13:04 +01:00
cifs CIFS: Retrieve uid and gid from special sid if enabled 2016-10-14 14:22:16 -05:00
coda Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
configfs Merge remote-tracking branch 'ovl/rename2' into for-linus 2016-10-10 23:02:51 -04:00
cramfs
crypto fscrypto: don't use on-stack buffer for key derivation 2016-11-19 20:56:13 -05:00
debugfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
devpts Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
dlm dlm: free workqueues after the connections 2016-10-10 09:54:00 -05:00
ecryptfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
efivarfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
efs
exofs fs: exofs: print a hex number after a 0x prefix 2016-10-27 18:43:43 -07:00
exportfs exportfs: be careful to only return expected errors. 2016-10-06 09:07:44 -04:00
ext2 ext2: avoid bogus -Wmaybe-uninitialized warning 2016-10-18 11:29:35 +02:00
ext4 ext4: sanity check the block and cluster size at mount time 2016-11-19 20:58:15 -05:00
f2fs This includes fixing a bug which references a wrong pointer, sum_page, in 2016-10-18 14:15:23 -07:00
fat Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
freevxfs
fscache
fuse fuse: fix fuse_write_end() if zero bytes were copied 2016-11-15 12:34:21 +01:00
gfs2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
hfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
hfsplus Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
hostfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
hpfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
hugetlbfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
isofs isofs: Do not return EACCES for unknown filesystems 2016-10-18 11:28:21 +02:00
jbd2 jbd2: fix incorrect unlock on j_list_lock 2016-10-12 23:19:18 -04:00
jffs2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
jfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
kernfs kernfs: Add noop_fsync to supported kernfs_file_fops 2016-10-27 17:47:11 +02:00
lockd treewide: remove redundant #include <linux/kconfig.h> 2016-10-11 15:06:33 -07:00
logfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
minix Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
ncpfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
nfs NFS client bugfixes for Linux 4.9 part 4 2016-11-23 14:43:40 -08:00
nfs_common
nfsd nfsd: Fix general protection fault in release_lock_stateid() 2016-11-01 15:24:43 -04:00
nilfs2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
nls
notify fsnotify: clean up spinlock assertions 2016-10-07 18:46:26 -07:00
ntfs fs: remove the never implemented aio_fsync file operation 2016-10-30 13:09:42 -04:00
ocfs2 ocfs2: fix not enough credit panic 2016-11-11 08:12:37 -08:00
omfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
openpromfs fs: Replace CURRENT_TIME with current_time() for inode timestamps 2016-09-27 21:06:21 -04:00
orangefs orangefs: add .owner to debugfs file_operations 2016-11-16 11:52:19 -05:00
overlayfs ovl: fsync after copy-up 2016-10-31 14:42:14 +01:00
proc proc: fix NULL dereference when reading /proc/<pid>/auxv 2016-10-27 18:43:43 -07:00
pstore Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
qnx4
qnx6
quota
ramfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
reiserfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
romfs
squashfs vfs: Remove {get,set,remove}xattr inode operations 2016-10-07 21:48:36 -04:00
sysfs Merge branch 'for-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2016-10-14 12:18:50 -07:00
sysv Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
tracefs fs: Replace CURRENT_TIME with current_time() for inode timestamps 2016-09-27 21:06:21 -04:00
ubifs ubifs: Fix regression in ubifs_readdir() 2016-10-28 14:48:31 +02:00
udf Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
ufs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
xfs xfs: defer should abort intent items if the trans roll fails 2016-10-24 14:21:18 +11:00
aio.c aio: fix freeze protection of aio writes 2016-10-30 13:09:42 -04:00
anon_inodes.c
attr.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
bad_inode.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
binfmt_aout.c
binfmt_elf.c x86/coredump: Use pr_reg size, rather that TIF_IA32 flag 2016-09-14 21:28:10 +02:00
binfmt_elf_fdpic.c
binfmt_em86.c
binfmt_flat.c
binfmt_misc.c fs: Replace current_fs_time() with current_time() 2016-09-27 21:06:22 -04:00
binfmt_script.c
block_dev.c block: implement (some of) fallocate for block devices 2016-10-11 15:06:30 -07:00
buffer.c fs: use mapping_set_error instead of opencoded set_bit 2016-10-11 15:06:33 -07:00
char_dev.c
compat.c compat: remove compat_printk() 2016-09-27 21:20:53 -04:00
compat_binfmt_elf.c
compat_ioctl.c fs: compat_ioctl: add pretimeout functions for watchdogs 2016-09-24 09:27:18 +02:00
coredump.c coredump: fix unfreezable coredumping task 2016-11-11 08:12:37 -08:00
dax.c thp: reduce usage of huge zero page's atomic counter 2016-10-07 18:46:28 -07:00
dcache.c
dcookies.c
direct-io.c consistent treatment of EFAULT on O_DIRECT read/write 2016-10-03 20:38:55 -04:00
drop_caches.c
eventfd.c
eventpoll.c
exec.c mm: replace get_user_pages_remote() write/force parameters with gup_flags 2016-10-19 08:12:02 -07:00
fcntl.c
fhandle.c
file.c fs/file: more unsigned file descriptors 2016-09-27 18:47:38 -04:00
file_table.c
filesystems.c
fs-writeback.c
fs_pin.c
fs_struct.c
inode.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
internal.h Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 13:04:49 -07:00
ioctl.c vfs: cap dedupe request structure size at PAGE_SIZE 2016-09-15 13:29:52 -07:00
iomap.c fs: Do to trim high file position bits in iomap_page_mkwrite_actor 2016-10-24 14:20:25 +11:00
Kconfig mm/hugetlb: introduce ARCH_HAS_GIGANTIC_PAGE 2016-10-07 18:46:29 -07:00
Kconfig.binfmt
libfs.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
locks.c locking, fs/locks: Add missing file_sem locks 2016-10-18 12:21:28 +02:00
Makefile
mbcache.c
mount.h mnt: Add a per mount namespace limit on the number of mounts 2016-09-30 12:46:48 -05:00
mpage.c
namei.c Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs 2016-10-14 17:23:33 -07:00
namespace.c This adds a new gcc plugin named "latent_entropy". It is designed to 2016-10-15 10:03:15 -07:00
no-block.c
nsfs.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
open.c xfs: reflink update for 4.9-rc1 2016-10-13 20:28:22 -07:00
pipe.c pipe: cap initial pipe capacity according to pipe-max-size limit 2016-10-11 15:06:32 -07:00
pnode.c mnt: Add a per mount namespace limit on the number of mounts 2016-09-30 12:46:48 -05:00
pnode.h mnt: Add a per mount namespace limit on the number of mounts 2016-09-30 12:46:48 -05:00
posix_acl.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
proc_namespace.c
read_write.c iov_iter: kernel-doc import_iovec() and rw_copy_check_uvector() 2016-10-14 20:00:34 -04:00
readdir.c
select.c fs/select: add vmalloc fallback for select(2) 2016-10-11 15:06:30 -07:00
seq_file.c seq/proc: modify seq_put_decimal_[u]ll to take a const char *, not char 2016-10-07 18:46:30 -07:00
signalfd.c
splice.c fix default_file_splice_read() 2016-11-26 20:05:42 -05:00
stack.c
stat.c
statfs.c
super.c fs/super.c: don't fool lockdep in freeze_super() and thaw_super() paths 2016-10-14 20:41:59 -04:00
sync.c
timerfd.c
userfaultfd.c
utimes.c Merge remote-tracking branch 'jk/vfs' into work.misc 2016-10-08 11:06:08 -04:00
xattr.c xattr: Fix setting security xattrs on sockfs 2016-11-17 00:00:23 -05:00