remarkable-linux/fs
Filipe Manana 348a0013d5 Btrfs: fix unprotected list move from unused_bgs to deleted_bgs list
As of my previous change titled "Btrfs: fix scrub preventing unused block
groups from being deleted", the following warning at
extent-tree.c:btrfs_delete_unused_bgs() can be hit when we mount a
filesystem with "-o discard":

 void btrfs_delete_unused_bgs(struct btrfs_fs_info *fs_info)
 {
 (...)
                 if (trimming) {
                         WARN_ON(!list_empty(&block_group->bg_list));
                         spin_lock(&trans->transaction->deleted_bgs_lock);
                         list_move(&block_group->bg_list,
                                   &trans->transaction->deleted_bgs);
                         spin_unlock(&trans->transaction->deleted_bgs_lock);
                         btrfs_get_block_group(block_group);
                 }
 (...)

This happens because scrub can now add back the block group to the list of
unused block groups (fs_info->unused_bgs). This is dangerous because we
are moving the block group from the unused block groups list to the list
of deleted block groups without holding the lock that protects the source
list (fs_info->unused_bgs_lock).

The following diagram illustrates how this happens:

            CPU 1                                     CPU 2

 cleaner_kthread()
   btrfs_delete_unused_bgs()

     sees bg X in list
      fs_info->unused_bgs

     deletes bg X from list
      fs_info->unused_bgs

                                            scrub_enumerate_chunks()

                                              searches device tree using
                                              its commit root

                                              finds device extent for
                                              block group X

                                              gets block group X from the tree
                                              fs_info->block_group_cache_tree
                                              (via btrfs_lookup_block_group())

                                              sets bg X to RO (again)

                                              scrub_chunk(bg X)

                                              sets bg X back to RW mode

                                              adds bg X to the list
                                              fs_info->unused_bgs again,
                                              since it's still unused and
                                              currently not in that list

     sets bg X to RO mode

     btrfs_remove_chunk(bg X)

     --> discard is enabled and bg X
         is in the fs_info->unused_bgs
         list again so the warning is
         triggered
     --> we move it from that list into
         the transaction's deleted_bgs
         list, but we can have another
         task currently manipulating
         the first list (fs_info->unused_bgs)
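
The interleaving above can be replayed deterministically in userspace. The
following is a minimal sketch, not kernel code: the list helpers only mimic
the kernel's list_head semantics, and struct toy_block_group and
replay_interleaving() are hypothetical names introduced for illustration.
It shows only why the list_empty() check behind the WARN_ON fails, not the
concurrent corruption risk.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_node {
	struct list_node *prev, *next;
};

static void list_init(struct list_node *n)
{
	n->prev = n;
	n->next = n;
}

/* A node (or head) that points to itself is on no list / has no entries. */
static bool list_is_empty(const struct list_node *n)
{
	return n->next == n;
}

static void list_del_init_node(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

static void list_add_tail_node(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Hypothetical stand-in for a block group: only the list linkage matters. */
struct toy_block_group {
	struct list_node bg_list;
};

/*
 * Replays the diagram on a single thread. Returns true if bg X is linked
 * into a list again when the cleaner reaches the trimming branch, which is
 * the condition the WARN_ON tests.
 */
bool replay_interleaving(void)
{
	struct list_node unused_bgs;
	struct toy_block_group x;

	list_init(&unused_bgs);
	list_init(&x.bg_list);
	list_add_tail_node(&x.bg_list, &unused_bgs);

	/* CPU 1: the cleaner sees bg X and deletes it from unused_bgs. */
	list_del_init_node(&x.bg_list);
	assert(list_is_empty(&unused_bgs));

	/* CPU 2: scrub is done; bg X is unused and on no list, so re-add it. */
	if (list_is_empty(&x.bg_list))
		list_add_tail_node(&x.bg_list, &unused_bgs);

	/* CPU 1: trimming branch, same test as !list_empty(&bg->bg_list). */
	return !list_is_empty(&x.bg_list);
}
```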

Fix this by using the same lock (fs_info->unused_bgs_lock) to protect both
the list of unused block groups and the list of deleted block groups. This
makes the move safe, and extra lock contention is not a concern, as this
lock is seldom used and only the cleaner kthread adds elements to the list
of deleted block groups. The warning goes away too: it previously guarded
an impossible case (and would have been better as a BUG_ON/ASSERT), but
that case is no longer impossible.

Reproduced with fstest btrfs/073 (using MOUNT_OPTIONS="-o discard").
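
The single-lock scheme can be sketched in userspace as follows. This is an
illustration under stated assumptions, not the patch itself: struct toy_fs,
toy_mark_unused() and toy_move_to_deleted() are hypothetical names, the list
helpers only imitate the kernel's list_head API, and a C11 atomic_flag
spinlock stands in for the kernel spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct list_node {
	struct list_node *prev, *next;
};

static void list_init(struct list_node *n)
{
	n->prev = n;
	n->next = n;
}

static bool list_is_empty(const struct list_node *n)
{
	return n->next == n;
}

/* Unlink n from whatever list it is on (a no-op if on none), add at head. */
static void list_move_node(struct list_node *n, struct list_node *head)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

struct toy_block_group {
	struct list_node bg_list;
};

/* One lock protects both lists, mirroring the approach described above. */
struct toy_fs {
	atomic_flag unused_bgs_lock;
	struct list_node unused_bgs;
	struct list_node deleted_bgs;
};

static void toy_lock(struct toy_fs *fs)
{
	while (atomic_flag_test_and_set_explicit(&fs->unused_bgs_lock,
						 memory_order_acquire))
		; /* spin until the flag is released */
}

static void toy_unlock(struct toy_fs *fs)
{
	atomic_flag_clear_explicit(&fs->unused_bgs_lock, memory_order_release);
}

void toy_fs_init(struct toy_fs *fs)
{
	atomic_flag_clear(&fs->unused_bgs_lock);
	list_init(&fs->unused_bgs);
	list_init(&fs->deleted_bgs);
}

/* Scrub side: re-add the block group only if it is on no list. */
void toy_mark_unused(struct toy_fs *fs, struct toy_block_group *bg)
{
	toy_lock(fs);
	if (list_is_empty(&bg->bg_list))
		list_move_node(&bg->bg_list, &fs->unused_bgs);
	toy_unlock(fs);
}

/* Cleaner side: the move is safe even if bg sits on unused_bgs again. */
void toy_move_to_deleted(struct toy_fs *fs, struct toy_block_group *bg)
{
	toy_lock(fs);
	list_move_node(&bg->bg_list, &fs->deleted_bgs);
	toy_unlock(fs);
}
```

Because both functions take the same lock, the move in
toy_move_to_deleted() cannot interleave with another task walking or
modifying the unused list, whichever list the block group was on.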

Signed-off-by: Filipe Manana <fdmanana@suse.com>
2015-12-10 11:22:38 +00:00
..
9p 9p: fix return code of read() when count is 0 2015-08-23 14:21:36 -05:00
adfs fs/adfs: remove unneeded cast 2015-06-30 19:44:57 -07:00
affs fs/affs: make root lookup from blkdev logical size 2015-09-10 13:29:01 -07:00
afs
autofs4 make simple_positive() public 2015-06-23 18:02:01 -04:00
befs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-07-04 19:36:06 -07:00
bfs
btrfs Btrfs: fix unprotected list move from unused_bgs to deleted_bgs list 2015-12-10 11:22:38 +00:00
cachefiles Merge branch 'fscache-fixes' into for-next 2015-06-23 18:01:30 -04:00
ceph Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client 2015-09-11 12:33:03 -07:00
cifs [CIFS] Update cifs version number 2015-10-03 16:54:17 -05:00
coda fs/coda: fix readlink buffer overflow 2015-09-10 13:29:01 -07:00
configfs configfs: fix kernel infoleak through user-controlled format string 2015-07-17 16:39:53 -07:00
cramfs
debugfs debugfs: Export bool read/write functions 2015-07-20 18:44:50 +01:00
devpts devpts: if initialization failed, don't crash when opening /dev/ptmx 2015-06-30 19:44:58 -07:00
dlm dlm for 4.3 2015-09-03 12:57:48 -07:00
ecryptfs Invalidate stale eCryptfs dcache entries caused by unlinked lower inodes 2015-09-08 11:26:17 -07:00
efivarfs
efs fs/efs: femove unneeded cast 2015-06-25 17:00:42 -07:00
exofs pagemap.h: move dir_pages() over there 2015-06-23 18:02:00 -04:00
exportfs
ext2 ext2: huge page fault support 2015-09-08 15:35:28 -07:00
ext4 ext4: start transaction before calling into DAX 2015-09-08 15:35:28 -07:00
f2fs Merge tag 'for-f2fs-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs 2015-09-03 13:10:22 -07:00
fat
freevxfs freevxfs: Grammar s/an negative/a negative/ 2015-08-07 13:59:24 +02:00
fscache
fuse fs/fuse: fix ioctl type confusion 2015-08-16 12:35:44 -07:00
gfs2 GFS2: merge window 2015-09-11 12:23:51 -07:00
hfs hfs: fix B-tree corruption after insertion at position 0 2015-09-10 13:29:01 -07:00
hfsplus hfs,hfsplus: cache pages correctly between bnode_create and bnode_free 2015-09-10 13:29:01 -07:00
hostfs fs: create and use seq_show_option for escaping 2015-09-04 16:54:41 -07:00
hpfs hpfs: update ctime and mtime on directory modification 2015-09-03 11:55:30 -07:00
hugetlbfs hugetlbfs: add hugetlbfs_fallocate() 2015-09-08 15:35:28 -07:00
isofs
jbd2 jbd2: limit number of reserved credits 2015-08-04 11:21:52 -04:00
jffs2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-07-04 19:36:06 -07:00
jfs Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2015-09-03 12:28:30 -07:00
kernfs kernfs: implement kernfs_path_len() 2015-08-18 15:49:15 -07:00
lockd lockd: NLM grace period shouldn't block NFSv4 opens 2015-08-13 10:22:06 -04:00
logfs block: remove bio_get_nr_vecs() 2015-08-13 12:32:04 -06:00
minix Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-07-04 19:36:06 -07:00
ncpfs
nfs NFS: Fix a tracepoint NULL-pointer dereference 2015-10-06 18:56:25 -04:00
nfs_common lockd: NLM grace period shouldn't block NFSv4 opens 2015-08-13 10:22:06 -04:00
nfsd NFS client updates for Linux 4.3 2015-09-07 14:02:24 -07:00
nilfs2 block: remove bio_get_nr_vecs() 2015-08-13 12:32:04 -06:00
nls
notify Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-09-05 20:34:28 -07:00
ntfs ntfs: delete unnecessary checks before calling iput() 2015-09-04 16:54:41 -07:00
ocfs2 ocfs2/dlm: fix deadlock when dispatch assert master 2015-09-22 15:09:53 -07:00
omfs
openpromfs
overlayfs fs: create and use seq_show_option for escaping 2015-09-04 16:54:41 -07:00
proc proc: convert to kstrto*()/kstrto*_from_user() 2015-09-10 13:29:01 -07:00
pstore Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2015-07-03 15:20:57 -07:00
qnx4
qnx6 pagemap.h: move dir_pages() over there 2015-06-23 18:02:00 -04:00
quota Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-09-05 20:34:28 -07:00
ramfs
reiserfs fs: create and use seq_show_option for escaping 2015-09-04 16:54:41 -07:00
romfs
squashfs fs: cleanup slight list_entry abuse 2015-06-23 18:01:59 -04:00
sysfs vfs: Commit to never having exectuables on proc and sysfs. 2015-07-10 10:39:25 -05:00
sysv pagemap.h: move dir_pages() over there 2015-06-23 18:02:00 -04:00
tracefs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-07-04 19:36:06 -07:00
ubifs UBIFS: Kill unneeded locking in ubifs_init_security 2015-09-29 12:45:42 +02:00
udf udf: Don't modify filesystem for read-only mounts 2015-08-20 14:58:35 +02:00
ufs fix ufs write vs readpage race when writing into a hole 2015-09-09 10:43:12 -07:00
xfs xfs: huge page fault support 2015-09-08 15:35:28 -07:00
aio.c mm: move ->mremap() from file_operations to vm_operations_struct 2015-09-04 16:54:41 -07:00
anon_inodes.c
attr.c
bad_inode.c
binfmt_aout.c
binfmt_elf.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-07-04 19:36:06 -07:00
binfmt_elf_fdpic.c
binfmt_em86.c
binfmt_flat.c
binfmt_misc.c
binfmt_script.c
block_dev.c blockdev: don't set S_DAX for misaligned partitions 2015-09-15 20:08:05 -04:00
buffer.c fs: use helper bio_add_page() instead of open coding on bi_io_vec 2015-08-13 12:32:00 -06:00
char_dev.c fs/char_dev.c: fix incorrect documentation for unregister_chrdev_region 2015-08-05 13:49:35 -07:00
compat.c
compat_binfmt_elf.c
compat_ioctl.c ioctl_compat: handle FITRIM 2015-07-09 11:42:21 -07:00
coredump.c fs: Don't dump core if the corefile would become world-readable. 2015-09-10 13:29:01 -07:00
dax.c dax: fix NULL pointer in __dax_pmd_fault() 2015-10-01 21:42:35 -04:00
dcache.c dcache: Reduce the scope of i_lock in d_splice_alias 2015-08-21 02:34:37 -04:00
dcookies.c
direct-io.c block: remove bio_get_nr_vecs() 2015-08-13 12:32:04 -06:00
drop_caches.c inode: convert inode_sb_list_lock to per-sb 2015-08-17 18:39:46 -04:00
eventfd.c
eventpoll.c
exec.c vfs: Commit to never having exectuables on proc and sysfs. 2015-07-10 10:39:25 -05:00
fcntl.c
fhandle.c
file.c fs/file.c: __fget() and dup2() atomicity rules 2015-07-01 02:31:08 -04:00
file_table.c fs, file table: reinit files_stat.max_files after deferred memory initialisation 2015-08-07 04:39:40 +03:00
filesystems.c
fs-writeback.c fs-writeback: unplug before cond_resched in writeback_sb_inodes 2015-09-19 18:50:19 -07:00
fs_pin.c
fs_struct.c
inode.c inode: don't softlockup when evicting inodes 2015-08-18 10:20:09 -07:00
internal.h inode: rename i_wb_list to i_io_list 2015-08-17 23:38:10 -04:00
ioctl.c
Kconfig fs: Remove ext3 filesystem driver 2015-07-23 20:59:40 +02:00
Kconfig.binfmt
libfs.c fs: Set the size of empty dirs to 0. 2015-08-12 15:28:45 -05:00
locks.c fs: fix fs/locks.c kernel-doc warning 2015-08-31 16:27:25 -04:00
Makefile userfaultfd: buildsystem activation 2015-09-04 16:54:41 -07:00
mbcache.c
mount.h fs: use seq_open_private() for proc_mounts 2015-06-30 19:44:56 -07:00
mpage.c block: remove bio_get_nr_vecs() 2015-08-13 12:32:04 -06:00
namei.c namei: results of d_is_negative() should be checked after dentry revalidation 2015-10-10 10:17:27 -07:00
namespace.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2015-09-01 16:13:25 -07:00
no-block.c
nsfs.c fs/seq_file: convert int seq_vprint/seq_printf/etc... returns to void 2015-09-11 15:21:34 -07:00
open.c vfs: Commit to never having exectuables on proc and sysfs. 2015-07-10 10:39:25 -05:00
pipe.c
pnode.c
pnode.h mnt: Clarify and correct the disconnect logic in umount_tree 2015-07-22 20:33:27 -05:00
posix_acl.c fs/posix_acl.c: make posix_acl_create() safer and cleaner 2015-06-23 18:01:07 -04:00
proc_namespace.c fs: use seq_open_private() for proc_mounts 2015-06-30 19:44:56 -07:00
read_write.c
readdir.c
select.c
seq_file.c fs/seq_file: convert int seq_vprint/seq_printf/etc... returns to void 2015-09-11 15:21:34 -07:00
signalfd.c signalfd: fix information leak in signalfd_copyinfo 2015-08-07 04:39:40 +03:00
splice.c Merge branch 'akpm' (patches from Andrew) 2015-06-24 20:47:21 -07:00
stack.c
stat.c
statfs.c
super.c Merge branch 'superblock-scaling' of git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next into for-next 2015-08-21 02:31:20 -04:00
sync.c
timerfd.c
userfaultfd.c userfaultfd: revert "userfaultfd: waitqueue: add nr wake parameter to __wake_up_locked_key" 2015-09-22 15:09:53 -07:00
utimes.c
xattr.c