remarkable-linux/fs/btrfs
Wang Xiaoguang 9e7cc91a6d btrfs: fix fsfreeze hang caused by delayed iputs deal
When running fstests generic/068, sometimes we got below deadlock:
  xfs_io          D ffff8800331dbb20     0  6697   6693 0x00000080
  ffff8800331dbb20 ffff88007acfc140 ffff880034d895c0 ffff8800331dc000
  ffff880032d243e8 fffffffeffffffff ffff880032d24400 0000000000000001
  ffff8800331dbb38 ffffffff816a9045 ffff880034d895c0 ffff8800331dbba8
  Call Trace:
  [<ffffffff816a9045>] schedule+0x35/0x80
  [<ffffffff816abab2>] rwsem_down_read_failed+0xf2/0x140
  [<ffffffff8118f5e1>] ? __filemap_fdatawrite_range+0xd1/0x100
  [<ffffffff8134f978>] call_rwsem_down_read_failed+0x18/0x30
  [<ffffffffa06631fc>] ? btrfs_alloc_block_rsv+0x2c/0xb0 [btrfs]
  [<ffffffff810d32b5>] percpu_down_read+0x35/0x50
  [<ffffffff81217dfc>] __sb_start_write+0x2c/0x40
  [<ffffffffa067f5d5>] start_transaction+0x2a5/0x4d0 [btrfs]
  [<ffffffffa067f857>] btrfs_join_transaction+0x17/0x20 [btrfs]
  [<ffffffffa068ba34>] btrfs_evict_inode+0x3c4/0x5d0 [btrfs]
  [<ffffffff81230a1a>] evict+0xba/0x1a0
  [<ffffffff812316b6>] iput+0x196/0x200
  [<ffffffffa06851d0>] btrfs_run_delayed_iputs+0x70/0xc0 [btrfs]
  [<ffffffffa067f1d8>] btrfs_commit_transaction+0x928/0xa80 [btrfs]
  [<ffffffffa0646df0>] btrfs_freeze+0x30/0x40 [btrfs]
  [<ffffffff81218040>] freeze_super+0xf0/0x190
  [<ffffffff81229275>] do_vfs_ioctl+0x4a5/0x5c0
  [<ffffffff81003176>] ? do_audit_syscall_entry+0x66/0x70
  [<ffffffff810038cf>] ? syscall_trace_enter_phase1+0x11f/0x140
  [<ffffffff81229409>] SyS_ioctl+0x79/0x90
  [<ffffffff81003c12>] do_syscall_64+0x62/0x110
  [<ffffffff816acbe1>] entry_SYSCALL64_slow_path+0x25/0x25

>From this warning, freeze_super() already holds SB_FREEZE_FS, but
btrfs_freeze() will call btrfs_commit_transaction() again, if
btrfs_commit_transaction() finds that it has delayed iputs to handle,
it'll start_transaction(), which will try to get SB_FREEZE_FS lock
again, then deadlock occurs.

The root cause is that in btrfs, sync_filesystem(sb) does not make
sure all metadata is updated. There still maybe some codes adding
delayed iputs, see below sample race window:

         CPU1                                  |         CPU2
|-> freeze_super()                             |
    |-> sync_filesystem(sb);                   |
    |                                          |-> cleaner_kthread()
    |                                          |   |-> btrfs_delete_unused_bgs()
    |                                          |       |-> btrfs_remove_chunk()
    |                                          |           |-> btrfs_remove_block_group()
    |                                          |               |-> btrfs_add_delayed_iput()
    |                                          |
    |-> sb->s_writers.frozen = SB_FREEZE_FS;   |
    |-> sb_wait_write(sb, SB_FREEZE_FS);       |
    |   acquire SB_FREEZE_FS lock.             |
    |                                          |
    |-> btrfs_freeze()                         |
        |-> btrfs_commit_transaction()         |
            |-> btrfs_run_delayed_iputs()      |
            |   will handle delayed iputs,     |
            |   that means start_transaction() |
            |   will be called, which will try |
            |   to get SB_FREEZE_FS lock.      |

To fix this issue, introduce a "int fs_frozen" to record internally whether
fs has been frozen. If fs has been frozen, we can not handle delayed iputs.

Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ add comment to btrfs_freeze ]
Signed-off-by: David Sterba <dsterba@suse.com>

Signed-off-by: Chris Mason <clm@fb.com>
2016-08-25 03:58:26 -07:00
..
tests btrfs: tests, require fs_info for root 2016-07-26 13:53:18 +02:00
acl.c btrfs: Replace -ENOENT by -ERANGE in btrfs_get_acl() 2016-07-26 13:52:25 +02:00
async-thread.c btrfs: plumb fs_info into btrfs_work 2016-07-26 13:53:15 +02:00
async-thread.h btrfs: plumb fs_info into btrfs_work 2016-07-26 13:53:15 +02:00
backref.c btrfs: backref: Fix soft lockup in __merge_refs function 2016-08-25 03:58:16 -07:00
backref.h btrfs: cleanup, remove inode_item_info helper 2015-01-14 19:23:47 +01:00
btrfs_inode.h Merge branch 'cleanups-4.7' into for-chris-4.7-20160525 2016-05-25 22:51:03 +02:00
check-integrity.c btrfs: Use correct format specifier 2016-06-17 18:32:40 +02:00
check-integrity.h block: submit_bio_wait() conversions 2013-11-24 16:33:41 -07:00
compression.c Btrfs: fix BUG_ON in btrfs_submit_compressed_write 2016-07-26 13:52:25 +02:00
compression.h btrfs: move btrfs_compression_type to compression.h 2016-03-11 17:12:46 +01:00
ctree.c btrfs: btrfs_abort_transaction, drop root parameter 2016-07-26 13:54:26 +02:00
ctree.h btrfs: fix fsfreeze hang caused by delayed iputs deal 2016-08-25 03:58:26 -07:00
dedupe.h btrfs: expand cow_file_range() to support in-band dedup and subpage-blocksize 2016-07-26 13:52:25 +02:00
delayed-inode.c btrfs: btrfs_abort_transaction, drop root parameter 2016-07-26 13:54:26 +02:00
delayed-inode.h Btrfs: fix ->iterate_shared() by upgrading i_rwsem for delayed nodes 2016-06-25 06:20:10 -07:00
delayed-ref.c btrfs: qgroup: Refactor btrfs_qgroup_insert_dirty_extent() 2016-08-25 03:58:21 -07:00
delayed-ref.h Btrfs: remove unused function btrfs_add_delayed_qgroup_reserve() 2016-08-03 11:02:51 +01:00
dev-replace.c btrfs: btrfs_test_opt and friends should take a btrfs_fs_info 2016-07-26 13:53:16 +02:00
dev-replace.h btrfs: refactor btrfs_dev_replace_start for reuse 2016-04-28 10:59:13 +02:00
dir-item.c Btrfs: make xattr replace operations atomic 2014-11-20 17:20:07 -08:00
disk-io.c btrfs: fix fsfreeze hang caused by delayed iputs deal 2016-08-25 03:58:26 -07:00
disk-io.h btrfs: tests, require fs_info for root 2016-07-26 13:53:18 +02:00
export.c BTRFS: support NFSv2 export 2015-10-06 06:55:23 -07:00
export.h
extent-tree.c btrfs: update btrfs_space_info's bytes_may_use timely 2016-08-25 03:58:26 -07:00
extent_io.c Btrfs: fix eb memory leak due to readpage failure 2016-07-26 13:52:25 +02:00
extent_io.h btrfs: update btrfs_space_info's bytes_may_use timely 2016-08-25 03:58:26 -07:00
extent_map.c btrfs: Fix slab accounting flags 2016-07-26 13:52:25 +02:00
extent_map.h btrfs: cleanup, stop casting for extent_map->lookup everywhere 2016-01-15 19:22:28 +01:00
file-item.c Btrfs: fix __MAX_CSUM_ITEMS 2016-08-03 14:08:37 -07:00
file.c btrfs: update btrfs_space_info's bytes_may_use timely 2016-08-25 03:58:26 -07:00
free-space-cache.c btrfs: btrfs_abort_transaction, drop root parameter 2016-07-26 13:54:26 +02:00
free-space-cache.h btrfs: fix string and comment grammatical issues and typos 2016-05-25 22:35:14 +02:00
free-space-tree.c btrfs: btrfs_abort_transaction, drop root parameter 2016-07-26 13:54:26 +02:00
free-space-tree.h Btrfs: implement the free space B-tree 2015-12-17 12:16:47 -08:00
hash.c btrfs: advertise which crc32c implementation is being used at module load 2016-06-06 14:08:28 +02:00
hash.h btrfs: advertise which crc32c implementation is being used at module load 2016-06-06 14:08:28 +02:00
inode-item.c btrfs: rename btrfs_std_error to btrfs_handle_fs_error 2016-04-28 10:36:54 +02:00
inode-map.c btrfs: update btrfs_space_info's bytes_may_use timely 2016-08-25 03:58:26 -07:00
inode-map.h Btrfs: Initialize btrfs_root->highest_objectid when loading tree root and subvolume roots 2016-01-15 19:25:02 +01:00
inode.c btrfs: update btrfs_space_info's bytes_may_use timely 2016-08-25 03:58:26 -07:00
ioctl.c btrfs: waiting on qgroup rescan should not always be interruptible 2016-08-25 03:58:20 -07:00
Kconfig rcu: Make SRCU optional by using CONFIG_SRCU 2015-01-06 11:04:29 -08:00
locking.c btrfs: cleanup, remove stray return statements 2016-01-07 14:30:52 +01:00
locking.h btrfs: fix lockups from btrfs_clear_path_blocking 2014-11-19 10:34:35 -08:00
lzo.c mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros 2016-04-04 10:41:08 -07:00
Makefile Btrfs: add free space tree sanity tests 2015-12-17 12:16:47 -08:00
math.h btrfs: cleanup 64bit/32bit divs, compile time constants 2015-03-03 17:23:57 +01:00
ordered-data.c btrfs: Fix slab accounting flags 2016-07-26 13:52:25 +02:00
ordered-data.h Btrfs: fix race setting block group readonly during device replace 2016-05-30 12:58:21 +01:00
orphan.c btrfs: kill the key type accessor helpers 2014-09-17 13:37:12 -07:00
print-tree.c btrfs: teach print_leaf about temporary item subtypes 2016-02-11 16:15:43 +01:00
print-tree.h btrfs: make static code static & remove dead code 2013-05-06 15:55:23 -04:00
props.c btrfs: simpilify btrfs_subvol_inherit_props 2016-07-26 13:54:22 +02:00
props.h Btrfs: add support for inode properties 2014-01-28 13:20:24 -08:00
qgroup.c btrfs: qgroup: Refactor btrfs_qgroup_insert_dirty_extent() 2016-08-25 03:58:21 -07:00
qgroup.h btrfs: qgroup: Refactor btrfs_qgroup_insert_dirty_extent() 2016-08-25 03:58:21 -07:00
raid56.c btrfs: fix string and comment grammatical issues and typos 2016-05-25 22:35:14 +02:00
raid56.h Btrfs: add RAID 5/6 BTRFS_RBIO_REBUILD_MISSING operation 2015-08-09 07:34:26 -07:00
rcu-string.h
reada.c Btrfs: fix race between readahead and device replace/removal 2016-05-30 12:58:18 +01:00
relocation.c btrfs: update btrfs_space_info's bytes_may_use timely 2016-08-25 03:58:26 -07:00
root-tree.c btrfs: btrfs_abort_transaction, drop root parameter 2016-07-26 13:54:26 +02:00
scrub.c btrfs: plumb fs_info into btrfs_work 2016-07-26 13:53:15 +02:00
send.c Btrfs: send, don't bug on inconsistent snapshots 2016-08-01 07:31:41 +01:00
send.h Btrfs: use linux/sizes.h to represent constants 2016-01-07 14:38:02 +01:00
struct-funcs.c btrfs: fix string and comment grammatical issues and typos 2016-05-25 22:35:14 +02:00
super.c btrfs: fix fsfreeze hang caused by delayed iputs deal 2016-08-25 03:58:26 -07:00
sysfs.c btrfs: add missing bytes_readonly attribute file in sysfs 2016-07-26 13:52:25 +02:00
sysfs.h btrfs: sysfs: introduce helper for syncing bits with sysfs files 2016-01-21 18:50:40 +01:00
transaction.c btrfs: fix fsfreeze hang caused by delayed iputs deal 2016-08-25 03:58:26 -07:00
transaction.h btrfs: add btrfs_trans_handle->fs_info pointer 2016-07-26 13:54:26 +02:00
tree-defrag.c Btrfs: fix locking bugs when defragging leaves 2015-12-18 02:51:32 +00:00
tree-log.c btrfs: qgroup: Fix qgroup incorrectness caused by log replay 2016-08-25 03:58:23 -07:00
tree-log.h Btrfs: fix unreplayable log after snapshot delete + parent dir fsync 2016-03-01 08:23:25 -08:00
ulist.c btrfs: fix string and comment grammatical issues and typos 2016-05-25 22:35:14 +02:00
ulist.h btrfs: ulist: Add ulist_del() function. 2015-06-10 09:26:17 -07:00
uuid-tree.c Btrfs: make btrfs_search_forward return with nodes unlocked 2014-09-17 13:38:02 -07:00
volumes.c btrfs: btrfs_abort_transaction, drop root parameter 2016-07-26 13:54:26 +02:00
volumes.h Merge branch 'foreign/jeffm/uapi' into for-chris-4.7-20160516 2016-05-16 15:46:29 +02:00
xattr.c switch xattr_handler->set() to passing dentry and inode separately 2016-05-27 15:39:43 -04:00
xattr.h btrfs: Switch to generic xattr handlers 2016-05-17 19:17:09 -04:00
zlib.c mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros 2016-04-04 10:41:08 -07:00