remarkable-linux/fs
Frederic Weisbecker 9b467e6ebe reiserfs: don't lock root inode searching
Nothing requires that we lock the filesystem until the root inode is
provided.

Also iget5_locked() triggers a warning because we are holding the
filesystem lock while allocating the inode, which result in a lockdep
suspicion that we have a lock inversion against the reclaim path:

[ 1986.896979] =================================
[ 1986.896990] [ INFO: inconsistent lock state ]
[ 1986.896997] 3.1.1-main #8
[ 1986.897001] ---------------------------------
[ 1986.897007] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
[ 1986.897016] kswapd0/16 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ 1986.897023]  (&REISERFS_SB(s)->lock){+.+.?.}, at: [<c01f8bd4>] reiserfs_write_lock+0x20/0x2a
[ 1986.897044] {RECLAIM_FS-ON-W} state was registered at:
[ 1986.897050]   [<c014a5b9>] mark_held_locks+0xae/0xd0
[ 1986.897060]   [<c014aab3>] lockdep_trace_alloc+0x7d/0x91
[ 1986.897068]   [<c0190ee0>] kmem_cache_alloc+0x1a/0x93
[ 1986.897078]   [<c01e7728>] reiserfs_alloc_inode+0x13/0x3d
[ 1986.897088]   [<c01a5b06>] alloc_inode+0x14/0x5f
[ 1986.897097]   [<c01a5cb9>] iget5_locked+0x62/0x13a
[ 1986.897106]   [<c01e99e0>] reiserfs_fill_super+0x410/0x8b9
[ 1986.897114]   [<c01953da>] mount_bdev+0x10b/0x159
[ 1986.897123]   [<c01e764d>] get_super_block+0x10/0x12
[ 1986.897131]   [<c0195b38>] mount_fs+0x59/0x12d
[ 1986.897138]   [<c01a80d1>] vfs_kern_mount+0x45/0x7a
[ 1986.897147]   [<c01a83e3>] do_kern_mount+0x2f/0xb0
[ 1986.897155]   [<c01a987a>] do_mount+0x5c2/0x612
[ 1986.897163]   [<c01a9a72>] sys_mount+0x61/0x8f
[ 1986.897170]   [<c044060c>] sysenter_do_call+0x12/0x32
[ 1986.897181] irq event stamp: 7509691
[ 1986.897186] hardirqs last  enabled at (7509691): [<c0190f34>] kmem_cache_alloc+0x6e/0x93
[ 1986.897197] hardirqs last disabled at (7509690): [<c0190eea>] kmem_cache_alloc+0x24/0x93
[ 1986.897209] softirqs last  enabled at (7508896): [<c01294bd>] __do_softirq+0xee/0xfd
[ 1986.897222] softirqs last disabled at (7508859): [<c01030ed>] do_softirq+0x50/0x9d
[ 1986.897234]
[ 1986.897235] other info that might help us debug this:
[ 1986.897242]  Possible unsafe locking scenario:
[ 1986.897244]
[ 1986.897250]        CPU0
[ 1986.897254]        ----
[ 1986.897257]   lock(&REISERFS_SB(s)->lock);
[ 1986.897265] <Interrupt>
[ 1986.897269]     lock(&REISERFS_SB(s)->lock);
[ 1986.897276]
[ 1986.897277]  *** DEADLOCK ***
[ 1986.897278]
[ 1986.897286] no locks held by kswapd0/16.
[ 1986.897291]
[ 1986.897292] stack backtrace:
[ 1986.897299] Pid: 16, comm: kswapd0 Not tainted 3.1.1-main #8
[ 1986.897306] Call Trace:
[ 1986.897314]  [<c0439e76>] ? printk+0xf/0x11
[ 1986.897324]  [<c01482d1>] print_usage_bug+0x20e/0x21a
[ 1986.897332]  [<c01479b8>] ? print_irq_inversion_bug+0x172/0x172
[ 1986.897341]  [<c014855c>] mark_lock+0x27f/0x483
[ 1986.897349]  [<c0148d88>] __lock_acquire+0x628/0x1472
[ 1986.897358]  [<c0149fae>] lock_acquire+0x47/0x5e
[ 1986.897366]  [<c01f8bd4>] ? reiserfs_write_lock+0x20/0x2a
[ 1986.897384]  [<c01f8bd4>] ? reiserfs_write_lock+0x20/0x2a
[ 1986.897397]  [<c043b5ef>] mutex_lock_nested+0x35/0x26f
[ 1986.897409]  [<c01f8bd4>] ? reiserfs_write_lock+0x20/0x2a
[ 1986.897421]  [<c01f8bd4>] reiserfs_write_lock+0x20/0x2a
[ 1986.897433]  [<c01e2edd>] map_block_for_writepage+0xc9/0x590
[ 1986.897448]  [<c01b1706>] ? create_empty_buffers+0x33/0x8f
[ 1986.897461]  [<c0121124>] ? get_parent_ip+0xb/0x31
[ 1986.897472]  [<c043ef7f>] ? sub_preempt_count+0x81/0x8e
[ 1986.897485]  [<c043cae0>] ? _raw_spin_unlock+0x27/0x3d
[ 1986.897496]  [<c0121124>] ? get_parent_ip+0xb/0x31
[ 1986.897508]  [<c01e355d>] reiserfs_writepage+0x1b9/0x3e7
[ 1986.897521]  [<c0173b40>] ? clear_page_dirty_for_io+0xcb/0xde
[ 1986.897533]  [<c014a6e3>] ? trace_hardirqs_on_caller+0x108/0x138
[ 1986.897546]  [<c014a71e>] ? trace_hardirqs_on+0xb/0xd
[ 1986.897559]  [<c0177b38>] shrink_page_list+0x34f/0x5e2
[ 1986.897572]  [<c01780a7>] shrink_inactive_list+0x172/0x22c
[ 1986.897585]  [<c0178464>] shrink_zone+0x303/0x3b1
[ 1986.897597]  [<c043cae0>] ? _raw_spin_unlock+0x27/0x3d
[ 1986.897611]  [<c01788c9>] kswapd+0x3b7/0x5f2

The deadlock shouldn't happen since we are doing that allocation in the
mount path, the filesystem is not available for any reclaim.  Still the
warning is annoying.

To solve this, acquire the lock later only where we need it, right before
calling reiserfs_read_locked_inode() that wants to lock to walk the tree.

Reported-by: Knut Petersen <Knut_Petersen@t-online.de>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-10 16:30:54 -08:00
..
9p 9p: propagate umode_t 2012-01-03 22:55:01 -05:00
adfs vfs: switch ->show_options() to struct dentry * 2012-01-06 23:19:54 -05:00
affs affs: propagate umode_t 2012-01-03 22:55:04 -05:00
afs switch ->create() to umode_t 2012-01-03 22:54:53 -05:00
autofs4 vfs: switch ->show_options() to struct dentry * 2012-01-06 23:19:54 -05:00
befs vfs: fix the stupidity with i_dentry in inode destructors 2012-01-03 22:52:40 -05:00
bfs switch ->create() to umode_t 2012-01-03 22:54:53 -05:00
btrfs btrfs: pass __GFP_WRITE for buffered write page allocations 2012-01-10 16:30:44 -08:00
cachefiles fs: move code out of buffer.c 2012-01-03 22:54:07 -05:00
ceph ceph: d_alloc_root() may fail 2012-01-09 16:36:12 -05:00
cifs Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-01-08 12:19:57 -08:00
coda switch ->create() to umode_t 2012-01-03 22:54:53 -05:00
configfs configfs: convert to umode_t 2012-01-03 22:54:57 -05:00
cramfs Merge branches 'vfsmount-guts', 'umode_t' and 'partitions' into Z 2012-01-06 23:15:54 -05:00
debugfs Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-01-08 12:19:57 -08:00
devpts devpts: fix double-free on mount failure 2012-01-08 20:19:03 -05:00
dlm net: remove ipv6_addr_copy() 2011-11-22 16:43:32 -05:00
ecryptfs vfs: switch ->show_options() to struct dentry * 2012-01-06 23:19:54 -05:00
efs vfs: fix the stupidity with i_dentry in inode destructors 2012-01-03 22:52:40 -05:00
exofs Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osd 2012-01-09 12:51:01 -08:00
exportfs
ext2 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2012-01-09 12:51:21 -08:00
ext3 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2012-01-09 12:51:21 -08:00
ext4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-01-09 17:37:37 -08:00
fat Merge branch 'usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb 2012-01-09 12:09:47 -08:00
freevxfs fs: propagate umode_t, misc bits 2012-01-03 22:55:10 -05:00
fscache
fuse vfs: switch ->show_options() to struct dentry * 2012-01-06 23:19:54 -05:00
gfs2 Merge branch 'pm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm 2012-01-08 13:10:57 -08:00
hfs vfs: switch ->show_options() to struct dentry * 2012-01-06 23:19:54 -05:00
hfsplus vfs: switch ->show_options() to struct dentry * 2012-01-06 23:19:54 -05:00
hostfs vfs: switch ->show_options() to struct dentry * 2012-01-06 23:19:54 -05:00
hpfs switch ->mknod() to umode_t 2012-01-03 22:54:54 -05:00
hppfs vfs: for usbfs, etc. internal vfsmounts ->mnt_sb->s_root == ->mnt_root 2012-01-03 22:52:41 -05:00
hugetlbfs hugetlbfs: propagate umode_t 2012-01-03 22:55:05 -05:00
isofs isofs: inode leak on mount failure 2012-01-09 10:48:11 -05:00
jbd Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2012-01-09 12:51:21 -08:00
jbd2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2012-01-08 13:21:22 -08:00
jffs2 vfs: switch ->show_options() to struct dentry * 2012-01-06 23:19:54 -05:00
jfs Merge branch 'pm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm 2012-01-08 13:10:57 -08:00
lockd vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb 2012-01-06 23:16:53 -05:00
logfs logfs: propagate umode_t 2012-01-03 22:55:06 -05:00
minix Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-01-08 12:19:57 -08:00
ncpfs vfs: switch ->show_options() to struct dentry * 2012-01-06 23:19:54 -05:00
nfs Merge branch 'pm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm 2012-01-08 13:10:57 -08:00
nfs_common
nfsd Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2012-01-08 13:21:22 -08:00
nilfs2 Merge branch 'pm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm 2012-01-08 13:10:57 -08:00
nls NLS: raname "maxlen" to "maxout" in UTF conversion routines 2011-11-26 19:58:47 -08:00
notify vfs: move fsnotify junk to struct mount 2012-01-03 22:57:12 -05:00
ntfs vfs: switch ->show_options() to struct dentry * 2012-01-06 23:19:54 -05:00
ocfs2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2012-01-08 13:21:22 -08:00
omfs omfs: propagate umode_t 2012-01-03 22:55:01 -05:00
openpromfs vfs: fix the stupidity with i_dentry in inode destructors 2012-01-03 22:52:40 -05:00
proc tracepoint: add tracepoints for debugging oom_score_adj 2012-01-10 16:30:44 -08:00
pstore pstore: gracefully handle NULL pstore_info functions 2011-11-18 13:49:00 -08:00
qnx4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2012-01-08 13:21:22 -08:00
quota vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb 2012-01-06 23:16:53 -05:00
ramfs pohmelfs: propagate umode_t 2012-01-03 22:55:07 -05:00
reiserfs reiserfs: don't lock root inode searching 2012-01-10 16:30:54 -08:00
romfs vfs: fix the stupidity with i_dentry in inode destructors 2012-01-03 22:52:40 -05:00
squashfs vfs: fix the stupidity with i_dentry in inode destructors 2012-01-03 22:52:40 -05:00
sysfs sysfs: propagate umode_t 2012-01-03 22:55:03 -05:00
sysv vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb 2012-01-06 23:16:53 -05:00
ubifs vfs: switch ->show_options() to struct dentry * 2012-01-06 23:19:54 -05:00
udf Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2012-01-09 12:51:21 -08:00
ufs vfs: switch ->show_options() to struct dentry * 2012-01-06 23:19:54 -05:00
xfs Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs 2012-01-09 12:50:15 -08:00
aio.c aio: allocate kiocbs in batches 2011-11-02 16:07:03 -07:00
anon_inodes.c
attr.c switch is_sxid() to umode_t 2012-01-03 22:55:11 -05:00
bad_inode.c switch ->mknod() to umode_t 2012-01-03 22:54:54 -05:00
binfmt_aout.c
binfmt_elf.c fs: binfmt_elf: create Kconfig variable for PIE randomization 2012-01-10 16:30:51 -08:00
binfmt_elf_fdpic.c
binfmt_em86.c
binfmt_flat.c
binfmt_misc.c vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb 2012-01-06 23:16:53 -05:00
binfmt_script.c
binfmt_som.c
bio-integrity.c
bio.c bio: change some signed vars to unsigned 2011-11-16 09:21:50 +01:00
block_dev.c fs: move code out of buffer.c 2012-01-03 22:54:07 -05:00
buffer.c fs: move code out of buffer.c 2012-01-03 22:54:07 -05:00
char_dev.c char_dev.c: fix up some whitespace errors 2011-12-13 11:18:17 -08:00
compat.c switch open and mkdir syscalls to umode_t 2012-01-03 22:55:19 -05:00
compat_binfmt_elf.c
compat_ioctl.c vfs: fix up ENOIOCTLCMD error handling 2012-01-05 15:40:12 -08:00
dcache.c vfs: new helper - d_make_root() 2012-01-09 19:23:45 -05:00
dcookies.c
direct-io.c
drop_caches.c
eventfd.c
eventpoll.c
exec.c tracepoint: add tracepoints for debugging oom_score_adj 2012-01-10 16:30:44 -08:00
fcntl.c
fhandle.c vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb 2012-01-06 23:16:53 -05:00
fifo.c
file.c
file_table.c vfs: prevent remount read-only if pending removes 2012-01-06 23:20:13 -05:00
filesystems.c vfs: convert fs_supers to hlist 2012-01-03 22:52:39 -05:00
fs-writeback.c Merge branch 'pm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm 2012-01-08 13:10:57 -08:00
fs_struct.c
generic_acl.c
inode.c mm: account reaped page cache on inode cache pruning 2012-01-10 16:30:42 -08:00
internal.h vfs: protect remounting superblock read-only 2012-01-06 23:20:12 -05:00
ioctl.c vfs: fix up ENOIOCTLCMD error handling 2012-01-05 15:40:12 -08:00
ioprio.c
Kconfig Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osd 2012-01-09 12:51:01 -08:00
Kconfig.binfmt fs: binfmt_elf: create Kconfig variable for PIE randomization 2012-01-10 16:30:51 -08:00
libfs.c fs: move code out of buffer.c 2012-01-03 22:54:07 -05:00
locks.c vfs: fix handling of lock allocation failure in lease-break case 2011-12-26 10:25:26 -08:00
Makefile Merge branches 'vfsmount-guts', 'umode_t' and 'partitions' into Z 2012-01-06 23:15:54 -05:00
mbcache.c
mount.h vfs: keep list of mounts for each superblock 2012-01-06 23:20:12 -05:00
mpage.c
namei.c Merge branches 'vfsmount-guts', 'umode_t' and 'partitions' into Z 2012-01-06 23:15:54 -05:00
namespace.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2012-01-08 13:21:22 -08:00
no-block.c
open.c switch security_path_chmod() to struct path * 2012-01-06 23:16:53 -05:00
pipe.c vfs: pipe.c is really non-modular 2012-01-03 22:52:41 -05:00
pnode.c vfs: switch pnode.h macros to struct mount * 2012-01-03 22:57:11 -05:00
pnode.h vfs: switch pnode.h macros to struct mount * 2012-01-03 22:57:11 -05:00
posix_acl.c
proc_namespace.c vfs: switch ->show_options() to struct dentry * 2012-01-06 23:19:54 -05:00
read_write.c
read_write.h
readdir.c
select.c
seq_file.c constify seq_file stuff 2012-01-03 22:52:40 -05:00
signalfd.c
splice.c fs: move code out of buffer.c 2012-01-03 22:54:07 -05:00
stack.c filesystems: add set_nlink() 2011-11-02 12:53:43 +01:00
stat.c
statfs.c vfs: new helper - vfs_ustat() 2012-01-03 22:53:07 -05:00
super.c vfs: prevent remount read-only if pending removes 2012-01-06 23:20:13 -05:00
sync.c fs: move code out of buffer.c 2012-01-03 22:54:07 -05:00
timerfd.c
utimes.c
xattr.c vfs: mnt_drop_write_file() 2012-01-03 22:52:40 -05:00
xattr_acl.c