remarkable-linux/fs
Chris Mason b4ce94de9b Btrfs: Change btree locking to use explicit blocking points
Most of the btrfs metadata operations can be protected by a spinlock,
but some operations still need to schedule.

So far, btrfs has been using a mutex along with a trylock loop,
most of the time it is able to avoid going for the full mutex, so
the trylock loop is a big performance gain.

This commit is step one for getting rid of the blocking locks entirely.
btrfs_tree_lock takes a spinlock, and the code explicitly switches
to a blocking lock when it starts an operation that can schedule.

We'll be able get rid of the blocking locks in smaller pieces over time.
Tracing allows us to find the most common cause of blocking, so we
can start with the hot spots first.

The basic idea is:

btrfs_tree_lock() returns with the spin lock held

btrfs_set_lock_blocking() sets the EXTENT_BUFFER_BLOCKING bit in
the extent buffer flags, and then drops the spin lock.  The buffer is
still considered locked by all of the btrfs code.

If btrfs_tree_lock gets the spinlock but finds the blocking bit set, it drops
the spin lock and waits on a wait queue for the blocking bit to go away.

Much of the code that needs to set the blocking bit finishes without actually
blocking a good percentage of the time.  So, an adaptive spin is still
used against the blocking bit to avoid very high context switch rates.

btrfs_clear_lock_blocking() clears the blocking bit and returns
with the spinlock held again.

btrfs_tree_unlock() can be called on either blocking or spinning locks,
it does the right thing based on the blocking bit.

ctree.c has a helper function to set/clear all the locked buffers in a
path as blocking.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-02-04 09:25:08 -05:00
..
9p fs/Kconfig: move 9p out 2009-01-22 13:16:01 +03:00
adfs fs/Kconfig: move adfs out 2009-01-22 13:15:56 +03:00
affs fs/Kconfig: move affs out 2009-01-22 13:15:56 +03:00
afs fs/Kconfig: move afs out 2009-01-22 13:16:01 +03:00
autofs fs/Kconfig: move autofs, autofs4 out 2009-01-22 13:15:54 +03:00
autofs4 fs/Kconfig: move autofs, autofs4 out 2009-01-22 13:15:54 +03:00
befs fs/Kconfig: move befs out 2009-01-22 13:15:57 +03:00
bfs fs/Kconfig: move bfs out 2009-01-22 13:15:57 +03:00
btrfs Btrfs: Change btree locking to use explicit blocking points 2009-02-04 09:25:08 -05:00
cifs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2009-01-05 18:32:06 -08:00
coda fs/Kconfig: move coda out 2009-01-22 13:16:01 +03:00
configfs fs/Kconfig: move configfs out 2009-01-22 13:15:56 +03:00
cramfs fs/Kconfig: move cramfs out 2009-01-22 13:15:58 +03:00
debugfs debugfs: add helpers for exporting a size_t simple value 2009-01-07 10:00:16 -08:00
devpts zero i_uid/i_gid on inode allocation 2009-01-05 11:54:28 -05:00
dlm dlm: initialize file_lock struct in GETLK before copying conflicting lock 2009-01-21 15:28:45 -06:00
ecryptfs fs/Kconfig: move ecryptfs out 2009-01-22 13:15:56 +03:00
efs fs/Kconfig: move efs out 2009-01-22 13:15:57 +03:00
exportfs Merge branch 'next' into for-linus 2008-12-25 11:40:09 +11:00
ext2 ext2: also update the inode on disk when dir is IS_DIRSYNC 2009-01-15 16:39:42 -08:00
ext3 filesystem freeze: add error handling of write_super_lockfs/unlockfs 2009-01-09 16:54:42 -08:00
ext4 filesystem freeze: add error handling of write_super_lockfs/unlockfs 2009-01-09 16:54:42 -08:00
fat fs/Kconfig: move fat out 2009-01-22 13:15:55 +03:00
freevxfs fs/Kconfig: move vxfs out 2009-01-22 13:15:58 +03:00
fuse Merge branch 'Kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/misc 2009-01-26 10:08:50 -08:00
gfs2 filesystem freeze: add error handling of write_super_lockfs/unlockfs 2009-01-09 16:54:42 -08:00
hfs fs/Kconfig: move hfs, hfsplus out 2009-01-22 13:15:57 +03:00
hfsplus fs/Kconfig: move hfs, hfsplus out 2009-01-22 13:15:57 +03:00
hostfs fs: symlink write_begin allocation context fix 2009-01-04 13:33:20 -08:00
hpfs fs/Kconfig: move hpfs out 2009-01-22 13:15:59 +03:00
hppfs CRED: Use creds in file structs 2008-11-14 10:39:25 +11:00
hugetlbfs hugetlb: unsigned ret cannot be negative 2009-01-06 15:59:08 -08:00
isofs fs/Kconfig: move iso9660, udf out 2009-01-22 13:15:55 +03:00
jbd jbd: remove excess kernel-doc notation 2009-01-08 08:31:01 -08:00
jbd2 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 2009-01-08 17:14:59 -08:00
jffs2 [JFFS2] remove junk prototypes 2009-01-09 21:05:21 +00:00
jfs fs/Kconfig: move jfs out 2009-01-22 13:15:54 +03:00
lockd NLM: Clean up flow of control in make_socks() function 2009-01-07 15:40:44 -05:00
minix fs/Kconfig: move minix out 2009-01-22 13:15:58 +03:00
ncpfs fs/Kconfig: move the rest of ncpfs out 2009-01-22 13:16:01 +03:00
nfs fs/Kconfig: move nfs out 2009-01-22 13:16:00 +03:00
nfs_common SUNRPC: nfsacl_encode/nfsacl_decode should be exported as GPL-only 2008-12-23 15:21:32 -05:00
nfsd nfsd: only set file_lock.fl_lmops in nfsd4_lockt if a stateowner is found 2009-01-27 17:26:59 -05:00
nls remove CONFIG_KMOD from fs 2008-10-17 02:38:36 +11:00
notify inotify: clean up inotify_read and fix locking problems 2009-01-26 10:08:05 -08:00
ntfs fs/Kconfig: move ntfs out 2009-01-22 13:15:55 +03:00
ocfs2 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-quota-2.6 2009-01-26 10:41:00 -08:00
omfs fs/Kconfig: move omfs out 2009-01-22 13:15:58 +03:00
openpromfs zero i_uid/i_gid on inode allocation 2009-01-05 11:54:28 -05:00
partitions block: fix bug in ptbl lookup cache 2009-01-09 21:46:13 +01:00
proc Merge git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-nommu 2009-01-09 14:00:58 -08:00
qnx4 fs/Kconfig: move qnx4 out 2009-01-22 13:15:59 +03:00
ramfs NOMMU: Fix cleanup handling in ramfs_nommu_get_umapped_area() 2009-01-08 12:04:46 +00:00
reiserfs fs/Kconfig: move reiserfs out 2009-01-22 13:15:53 +03:00
romfs fs/Kconfig: move romfs out 2009-01-22 13:15:59 +03:00
smbfs fs/Kconfig: move smbfs out 2009-01-22 13:16:01 +03:00
squashfs fs/Kconfig: move squashfs out 2009-01-22 13:15:58 +03:00
sysfs Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6 2009-01-26 10:40:28 -08:00
sysv fs/Kconfig: move sysv out 2009-01-22 13:15:59 +03:00
ubifs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2009-01-07 11:31:52 -08:00
udf fs/Kconfig: move iso9660, udf out 2009-01-22 13:15:55 +03:00
ufs fs/Kconfig: move ufs out 2009-01-22 13:16:00 +03:00
xfs Long btree pointers are still 64 bit on disk 2009-01-22 01:23:11 -06:00
aio.c [CVE-2009-0029] System call wrappers part 16 2009-01-14 14:15:25 +01:00
anon_inodes.c anon_inodes: use fops->owner for module refcount 2008-12-31 16:55:44 +02:00
attr.c CRED: Wrap task credential accesses in the filesystem subsystem 2008-11-14 10:39:05 +11:00
bad_inode.c kill ->dir_notify() 2008-12-31 18:07:43 -05:00
binfmt_aout.c sanitize ifdefs in binfmt_aout 2009-01-03 11:45:54 -08:00
binfmt_elf.c ELF: implement AT_RANDOM for glibc PRNG seeding 2009-01-08 08:31:12 -08:00
binfmt_elf_fdpic.c FDPIC: Don't attempt to expand the userspace stack to fill the space allocated 2009-01-08 12:04:47 +00:00
binfmt_em86.c Allow recursion in binfmt_script and binfmt_misc 2008-10-16 11:21:38 -07:00
binfmt_flat.c FLAT: Don't attempt to expand the userspace stack to fill the space allocated 2009-01-08 12:04:47 +00:00
binfmt_misc.c fs/binfmt_misc.c: add terminating newline to /proc/sys/fs/binfmt_misc/status 2009-01-06 15:59:19 -08:00
binfmt_script.c Allow recursion in binfmt_script and binfmt_misc 2008-10-16 11:21:38 -07:00
binfmt_som.c CRED: Make execve() take advantage of copy-on-write credentials 2008-11-14 10:39:24 +11:00
bio-integrity.c bio: allow individual slabs in the bio_set 2008-12-29 08:29:23 +01:00
bio.c [SCSI] block: make blk_rq_map_user take a NULL user-space buffer for WRITE 2009-01-02 11:10:35 -06:00
block_dev.c filesystem freeze: implement generic freeze feature 2009-01-09 16:54:42 -08:00
buffer.c [CVE-2009-0029] System call wrappers part 10 2009-01-14 14:15:22 +01:00
char_dev.c fs: fix name overwrite in __register_chrdev_region() 2009-01-06 15:59:13 -08:00
compat.c [CVE-2009-0029] Make sys_pselect7 static 2009-01-14 14:15:16 +01:00
compat_binfmt_elf.c
compat_ioctl.c
dcache.c [CVE-2009-0029] System call wrappers part 20 2009-01-14 14:15:26 +01:00
dcookies.c [CVE-2009-0029] System call wrapper special cases 2009-01-14 14:15:18 +01:00
direct-io.c fs: truncate blocks outside i_size after O_DIRECT write error 2009-01-06 15:59:06 -08:00
dquot.c quota: Improve locking 2009-01-16 18:02:10 +01:00
drop_caches.c
eventfd.c [CVE-2009-0029] System call wrappers part 32 2009-01-14 14:15:31 +01:00
eventpoll.c [CVE-2009-0029] System call wrappers part 23 2009-01-14 14:15:28 +01:00
exec.c [CVE-2009-0029] System call wrappers part 27 2009-01-14 14:15:29 +01:00
fcntl.c [CVE-2009-0029] System call wrappers part 15 2009-01-14 14:15:24 +01:00
fifo.c [PATCH] introduce fmode_t, do annotations 2008-10-21 07:47:06 -04:00
file.c [PATCH] merge locate_fd() and get_unused_fd() 2008-08-01 11:25:23 -04:00
file_table.c filp_cachep can be static in fs/file_table.c 2008-12-31 18:07:42 -05:00
filesystems.c [CVE-2009-0029] System call wrappers part 27 2009-01-14 14:15:29 +01:00
fs-writeback.c fs: sys_sync fix 2009-01-06 15:59:09 -08:00
generic_acl.c
inode.c partial revert of asynchronous inode delete 2009-01-09 13:15:49 -08:00
internal.h CRED: Make execve() take advantage of copy-on-write credentials 2008-11-14 10:39:24 +11:00
ioctl.c [CVE-2009-0029] System call wrappers part 15 2009-01-14 14:15:24 +01:00
ioprio.c [CVE-2009-0029] System call wrappers part 28 2009-01-14 14:15:30 +01:00
Kconfig fs/Kconfig: move 9p out 2009-01-22 13:16:01 +03:00
Kconfig.binfmt CORE_DUMP_DEFAULT_ELF_HEADERS depends on ELF_CORE 2009-01-09 16:54:41 -08:00
libfs.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2009-01-05 18:32:06 -08:00
locks.c [CVE-2009-0029] System call wrappers part 16 2009-01-14 14:15:25 +01:00
Makefile Merge git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-linus 2009-01-09 15:18:49 -08:00
mbcache.c
mpage.c do_mpage_readpage(): remove useless clear_buffer_mapped() call 2009-01-06 15:59:01 -08:00
namei.c [CVE-2009-0029] System call wrappers part 29 2009-01-14 14:15:30 +01:00
namespace.c [CVE-2009-0029] System call wrappers part 14 2009-01-14 14:15:24 +01:00
nfsctl.c [CVE-2009-0029] System call wrappers part 27 2009-01-14 14:15:29 +01:00
no-block.c
open.c [CVE-2009-0029] System call wrappers part 30 2009-01-14 14:15:30 +01:00
pipe.c [CVE-2009-0029] System call wrappers part 33 2009-01-14 14:15:32 +01:00
pnode.c
pnode.h
posix_acl.c CRED: Wrap task credential accesses in the filesystem subsystem 2008-11-14 10:39:05 +11:00
quota.c [CVE-2009-0029] System call wrappers part 20 2009-01-14 14:15:26 +01:00
quota_tree.c quota: Split off quota tree handling into a separate file 2009-01-05 08:40:21 -08:00
quota_tree.h quota: Split off quota tree handling into a separate file 2009-01-05 08:40:21 -08:00
quota_v1.c quota: Move quotaio_v[12].h from include/linux/ to fs/ 2009-01-05 08:36:58 -08:00
quota_v2.c quota: Convert union in mem_dqinfo to a pointer 2009-01-05 08:40:21 -08:00
quotaio_v1.h quota: Move quotaio_v[12].h from include/linux/ to fs/ 2009-01-05 08:36:58 -08:00
quotaio_v2.h quota: Split off quota tree handling into a separate file 2009-01-05 08:40:21 -08:00
read_write.c [CVE-2009-0029] System call wrappers part 20 2009-01-14 14:15:26 +01:00
read_write.h
readdir.c [CVE-2009-0029] System call wrappers part 32 2009-01-14 14:15:31 +01:00
select.c [CVE-2009-0029] System call wrappers part 32 2009-01-14 14:15:31 +01:00
seq_file.c Merge branch 'cpus4096-for-linus-3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-01-03 12:04:39 -08:00
signalfd.c [CVE-2009-0029] System call wrappers part 31 2009-01-14 14:15:31 +01:00
splice.c [CVE-2009-0029] System call wrappers part 31 2009-01-14 14:15:31 +01:00
stack.c
stat.c [CVE-2009-0029] System call wrappers part 30 2009-01-14 14:15:30 +01:00
super.c [CVE-2009-0029] System call wrappers part 11 2009-01-14 14:15:23 +01:00
sync.c [CVE-2009-0029] System call wrappers part 09 2009-01-14 14:15:21 +01:00
timerfd.c [CVE-2009-0029] System call wrappers part 32 2009-01-14 14:15:31 +01:00
utimes.c [CVE-2009-0029] System call wrappers part 30 2009-01-14 14:15:30 +01:00
xattr.c [CVE-2009-0029] System call wrappers part 13 2009-01-14 14:15:23 +01:00
xattr_acl.c