Commit graph

171 commits

Author SHA1 Message Date
Nick Piggin 54566b2c15 fs: symlink write_begin allocation context fix
With the write_begin/write_end aops, page_symlink was broken because it
could no longer pass a GFP_NOFS type mask into the point where the
allocations happened.  They are done in write_begin, which would always
assume that the filesystem can be entered from reclaim.  This bug could
cause filesystem deadlocks.

The funny thing with having a gfp_t mask there is that it doesn't really
allow the caller to arbitrarily tinker with the context in which it can be
called.  It couldn't ever be GFP_ATOMIC, for example, because it needs to
take the page lock.  The only thing any callers care about is __GFP_FS
anyway, so turn that into a single flag.

Add a new flag for write_begin, AOP_FLAG_NOFS.  Filesystems can now act on
this flag in their write_begin function.  Change __grab_cache_page to
accept a nofs argument as well, to honour that flag (while we're there,
change the name to grab_cache_page_write_begin which is more instructive
and does away with random leading underscores).

This is really a more flexible way to go in the end anyway -- if a
filesystem happens to want any extra allocations aside from the pagecache
ones in ints write_begin function, it may now use GFP_KERNEL (rather than
GFP_NOFS) for common case allocations (eg.  ocfs2_alloc_write_ctxt, for a
random example).

[kosaki.motohiro@jp.fujitsu.com: fix ubifs]
[kosaki.motohiro@jp.fujitsu.com: fix fuse]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: <stable@kernel.org>		[2.6.28.x]
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ Cleaned up the calling convention: just pass in the AOP flags
  untouched to the grab_cache_page_write_begin() function.  That
  just simplifies everybody, and may even allow future expansion of the
  logic.   - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-04 13:33:20 -08:00
Steven Rostedt 1239f26c05 make INIT_FS use the __RW_LOCK_UNLOCKED initialization
[AV: rediffed on top of unification of init_fs]
Initialization of init_fs still uses the deprecated RW_LOCK_UNLOCKED macro.
This patch updates it to use the __RW_LOCK_UNLOCKED(lock) macro.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-12-31 18:07:42 -05:00
Al Viro 18d8fda7c3 take init_fs to saner place
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-12-31 18:07:42 -05:00
Christoph Hellwig cb23beb551 kill vfs_permission
With all the nameidata removal there's no point anymore for this helper.
Of the three callers left two will go away with the next lookup series
anyway.

Also add proper kerneldoc to inode_permission as this is the main
permission check routine now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-12-31 18:07:41 -05:00
Christoph Hellwig 3fb64190aa pass a struct path * to may_open
No need for the nameidata in may_open - a struct path is enough.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-12-31 18:07:41 -05:00
Christoph Hellwig b4091d5f6f kill walk_init_root
walk_init_root is a tiny helper that is marked __always_inline, has just
one caller and an unused argument.  Just merge it into the caller.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-12-31 18:07:41 -05:00
Christoph Hellwig 66f221875d remove incorrect comment in inode_permission
We now pass on all MAY_ flags to the filesystems permission routines,
so remove the comment stating the contrary.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-12-31 18:07:41 -05:00
Duane Griffin ebd09abbd9 vfs: ensure page symlinks are NUL-terminated
On-disk data corruption could cause a page link to have its i_size set
to PAGE_SIZE (or a multiple thereof) and its contents all non-NUL.
NUL-terminate the link name to ensure this doesn't cause further
problems for the kernel.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Duane Griffin <duaneg@dghda.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-12-31 18:07:39 -05:00
Kentaro Takeda be6d3e56a6 introduce new LSM hooks where vfsmount is available.
Add new LSM hooks for path-based checks.  Call them on directory-modifying
operations at the points where we still know the vfsmount involved.

Signed-off-by: Kentaro Takeda <takedakn@nttdata.co.jp>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Toshiharu Harada <haradats@nttdata.co.jp>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-12-31 18:07:37 -05:00
James Morris ec98ce480a Merge branch 'master' into next
Conflicts:
	fs/nfsd/nfs4recover.c

Manually fixed above to use new creds API functions, e.g.
nfs4_save_creds().

Signed-off-by: James Morris <jmorris@namei.org>
2008-12-04 17:16:36 +11:00
Hugh Dickins f9454548e1 don't unlink an active swapfile
Peter Cordes is sorry that he rm'ed his swapfiles while they were in use,
he then had no pathname to swapoff.  It's a curious little oversight, but
not one worth a lot of hackery.  Kudos to Willy Tarreau for turning this
around from a discussion of synthetic pathnames to how to prevent unlink.
Mimic immutable: prohibit unlinking an active swapfile in may_delete()
(and don't worry my little head over the tiny race window).

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: Willy Tarreau <w@1wt.eu>
Acked-by: Christoph Hellwig <hch@infradead.org>
Cc: Peter Cordes <peter@cordes.ca>
Cc: Bodo Eggert <7eggert@gmx.de>
Cc: David Newall <davidn@davidnewall.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-19 18:49:59 -08:00
David Howells da9592edeb CRED: Wrap task credential accesses in the filesystem subsystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:05 +11:00
Miklos Szeredi f696a3659f [PATCH] move executable checking into ->permission()
For execute permission on a regular files we need to check if file has
any execute bits at all, regardless of capabilites.

This check is normally performed by generic_permission() but was also
added to the case when the filesystem defines its own ->permission()
method.  In the latter case the filesystem should be responsible for
performing this check.

Move the check from inode_permission() inside filesystems which are
not calling generic_permission().

Create a helper function execute_ok() that returns true if the inode
is a directory or if any execute bits are present in i_mode.

Also fix up the following code:

 - coda control file is never executable
 - sysctl files are never executable
 - hfs_permission seems broken on MAY_EXEC, remove
 - hfsplus_permission is eqivalent to generic_permission(), remove

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2008-10-23 05:13:25 -04:00
OGAWA Hirofumi 4e9ed2f85a [PATCH vfs-2.6 6/6] vfs: add LOOKUP_RENAME_TARGET intent
This adds LOOKUP_RENAME_TARGET intent for lookup of rename destination.

LOOKUP_RENAME_TARGET is going to be used like LOOKUP_CREATE. But since
the destination of rename() can be existing directory entry, so it has a
difference. Although that difference doesn't matter in my usage, this
tells it to user of this intent.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
2008-10-23 05:13:20 -04:00
OGAWA Hirofumi 0612d9fb27 [PATCH vfs-2.6 5/6] vfs: remove LOOKUP_PARENT from non LOOKUP_PARENT lookup
lookup_hash() with LOOKUP_PARENT is bogus. And this prepares to add
new intent on those path.

The user of LOOKUP_PARENT intent is nfs only, and it checks whether
nd->flags has LOOKUP_CREATE or LOOKUP_OPEN, so the result is same.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
2008-10-23 05:13:19 -04:00
OGAWA Hirofumi e2761a1167 [PATCH vfs-2.6 2/6] vfs: add d_ancestor()
This adds d_ancestor() instead of d_isparent(), then use it.

If new_dentry == old_dentry, is_subdir() returns 1, looks strange.
"new_dentry == old_dentry" is not subdir obviously. But I'm not
checking callers for now, so this keeps current behavior.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
2008-10-23 05:13:16 -04:00
OGAWA Hirofumi 871c0067d5 [PATCH vfs-2.6 1/6] vfs: replace parent == dentry->d_parent by IS_ROOT()
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
2008-10-23 05:13:16 -04:00
Al Viro 3516586a42 [PATCH] make O_EXCL in nd->intent.flags visible in nd->flags
New flag: LOOKUP_EXCL.  Set before doing the final step of pathname
resolution on the paths that have LOOKUP_CREATE and O_EXCL.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23 05:12:56 -04:00
Al Viro 8737f3a1b3 [PATCH] get rid of path_lookup_create()
... and don't pass bogus flags when we are just looking for parent.
Fold __path_lookup_intent_open() into path_lookup_open() while we
are at it; that's the only remaining caller.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23 05:12:54 -04:00
Al Viro d181146572 [PATCH] new helper - kern_path()
Analog of lookup_path(), takes struct path *.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23 03:34:19 -04:00
Miklos Szeredi a95164d979 [patch 3/4] vfs: remove unused nameidata argument of may_create()
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-08-01 11:25:30 -04:00
Stephen Smalley f418b00607 Re: BUG at security/selinux/avc.c:883 (was: Re: linux-next: Tree
for July 17: early crash on x86-64)

SELinux needs MAY_APPEND to be passed down to the security hook.
Otherwise, we get permission denials when only append permission is
granted by policy even if the opening process specified O_APPEND.
Shows up as a regression in the ltp selinux testsuite, fixed by
this patch.

Signed-off-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-08-01 11:25:21 -04:00
Al Viro 964bd18362 [PATCH] get rid of __user_path_lookup_open
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:41 -04:00
Al Viro 2ad94ae654 [PATCH] new (local) helper: user_path_parent()
Preparation to untangling intents mess: reduce the number of do_path_lookup()
callers.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:35 -04:00
Al Viro 2d8f30380a [PATCH] sanitize __user_walk_fd() et.al.
* do not pass nameidata; struct path is all the callers want.
* switch to new helpers:
	user_path_at(dfd, pathname, flags, &path)
	user_path(pathname, &path)
	user_lpath(pathname, &path)
	user_path_dir(pathname, &path)  (fail if not a directory)
  The last 3 are trivial macro wrappers for the first one.
* remove nameidata in callers.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:34 -04:00
Al Viro f419a2e3b6 [PATCH] kill nameidata passing to permission(), rename to inode_permission()
Incidentally, the name that gives hundreds of false positives on grep
is not a good idea...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:31 -04:00
Al Viro 30524472c2 [PATCH] take noexec checks to very few callers that care
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:30 -04:00
Al Viro 672b16b2f6 [PATCH] more nameidata removal: exec_permission_lite() doesn't need it
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:23 -04:00
Al Viro b77b0646ef [PATCH] pass MAY_OPEN to vfs_permission() explicitly
... and get rid of the last "let's deduce mask from nameidata->flags"
bit.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:22 -04:00
Al Viro a110343f0d [PATCH] fix MAY_CHDIR/MAY_ACCESS/LOOKUP_ACCESS mess
* MAY_CHDIR is redundant - it's an equivalent of MAY_ACCESS
* MAY_ACCESS on fuse should affect only the last step of pathname resolution
* fchdir() and chroot() should pass MAY_ACCESS, for the same reason why
  chdir() needs that.
* now that we pass MAY_ACCESS explicitly in all cases, LOOKUP_ACCESS can be
  removed; it has no business being in nameidata.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:21 -04:00
Al Viro 7f2da1e7d0 [PATCH] kill altroot
long overdue...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:20 -04:00
Al Viro 8bb79224b8 [PATCH] permission checks for chdir need special treatment only on the last step
... so we ought to pass MAY_CHDIR to vfs_permission() instead of having
it triggered on every step of preceding pathname resolution.  LOOKUP_CHDIR
is killed by that.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:19 -04:00
Miklos Szeredi db2e747b14 [patch 5/5] vfs: remove mode parameter from vfs_symlink()
Remove the unused mode parameter from vfs_symlink and callers.

Thanks to Tetsuo Handa for noticing.

CC: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2008-07-26 20:53:18 -04:00
Tetsuo Handa 7e79eedb3b [patch 4/5] vfs: reuse local variable in vfs_link()
Why not reuse "inode" which is assigned as

  struct inode *inode = old_dentry->d_inode;

in the beginning of vfs_link() ?

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2008-07-26 20:53:17 -04:00
Al Viro e6305c43ed [PATCH] sanitize ->permission() prototype
* kill nameidata * argument; map the 3 bits in ->flags anybody cares
  about to new MAY_... ones and pass with the mask.
* kill redundant gfs2_iop_permission()
* sanitize ecryptfs_permission()
* fix remaining places where ->permission() instances might barf on new
  MAY_... found in mask.

The obvious next target in that direction is permission(9)

folded fix for nfs_permission() breakage from Miklos Szeredi <mszeredi@suse.cz>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:14 -04:00
Miklos Szeredi d70b67c8bc [patch] vfs: fix lookup on deleted directory
Lookup can install a child dentry for a deleted directory.  This keeps
the directory dentry alive, and the inode pinned in the cache and on
disk, even after all external references have gone away.

This isn't a big problem normally, since memory pressure or umount
will clear out the directory dentry and its children, releasing the
inode.  But for UBIFS this causes problems because its orphan area can
overflow.

Fix this by returning ENOENT for all lookups on a S_DEAD directory
before creating a child dentry.

Thanks to Zoltan Sogor for noticing this while testing UBIFS, and
Artem for the excellent analysis of the problem and testing.

Reported-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Tested-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:05 -04:00
Marcin Slusarz 694a1764d6 [patch 3/4] vfs: fix ERR_PTR abuse in generic_readlink
generic_readlink calls ERR_PTR for negative and positive values
(vfs_readlink returns length of "link"), but it should not
(not an errno) and does not need to.

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Acked-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-06-23 11:52:30 -04:00
Jan Blunck c8e7f449b2 [patch 1/4] vfs: path_{get,put}() cleanups
Here are some more places where path_{get,put}() can be used instead of
dput()/mntput() pair.

Signed-off-by: Jan Blunck <jblunck@suse.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-06-23 11:52:29 -04:00
Al Viro e9baf6e598 [PATCH] return to old errno choice in mkdir() et.al.
In case when both EEXIST and EROFS would apply we used to
return the former in mkdir(2) and friends.  Lest anyone suspects
us of being consistent, in the same situation knfsd gave clients
nfs_erofs...

	ro-bind series had switched the syscall side of things to
returning -EROFS and immediately broke an application - namely,
mkdir -p.  Patch restores the original behaviour...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-05-16 17:23:18 -04:00
Serge E. Hallyn 08ce5f16ee cgroups: implement device whitelist
Implement a cgroup to track and enforce open and mknod restrictions on device
files.  A device cgroup associates a device access whitelist with each cgroup.
 A whitelist entry has 4 fields.  'type' is a (all), c (char), or b (block).
'all' means it applies to all types and all major and minor numbers.  Major
and minor are either an integer or * for all.  Access is a composition of r
(read), w (write), and m (mknod).

The root device cgroup starts with rwm to 'all'.  A child devcg gets a copy of
the parent.  Admins can then remove devices from the whitelist or add new
entries.  A child cgroup can never receive a device access which is denied its
parent.  However when a device access is removed from a parent it will not
also be removed from the child(ren).

An entry is added using devices.allow, and removed using
devices.deny.  For instance

	echo 'c 1:3 mr' > /cgroups/1/devices.allow

allows cgroup 1 to read and mknod the device usually known as
/dev/null.  Doing

	echo a > /cgroups/1/devices.deny

will remove the default 'a *:* mrw' entry.

CAP_SYS_ADMIN is needed to change permissions or move another task to a new
cgroup.  A cgroup may not be granted more permissions than the cgroup's parent
has.  Any task can move itself between cgroups.  This won't be sufficient, but
we can decide the best way to adequately restrict movement later.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix may-be-used-uninitialized warning]
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Acked-by: James Morris <jmorris@namei.org>
Looks-good-to: Pavel Emelyanov <xemul@openvz.org>
Cc: Daniel Hokka Zakrisson <daniel@hozac.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-29 08:06:09 -07:00
Dave Hansen 4a3fd211cc [PATCH] r/o bind mounts: elevate write count for open()s
This is the first really tricky patch in the series.  It elevates the writer
count on a mount each time a non-special file is opened for write.

We used to do this in may_open(), but Miklos pointed out that __dentry_open()
is used as well to create filps.  This will cover even those cases, while a
call in may_open() would not have.

There is also an elevated count around the vfs_create() call in open_namei().
See the comments for more details, but we need this to fix a 'create, remount,
fail r/w open()' race.

Some filesystems forego the use of normal vfs calls to create
struct files.   Make sure that these users elevate the mnt
writer count because they will get __fput(), and we need
to make sure they're balanced.

Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-04-19 00:29:25 -04:00
Dave Hansen 9079b1eb17 [PATCH] r/o bind mounts: get write access for vfs_rename() callers
This also uses the little helper in the NFS code to make an if() a little bit
less ugly.  We introduced the helper at the beginning of the series.

Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-04-19 00:25:34 -04:00
Dave Hansen 75c3f29de7 [PATCH] r/o bind mounts: write counts for link/symlink
[AV: add missing nfsd pieces]

Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-04-19 00:25:34 -04:00
Dave Hansen 463c319726 [PATCH] r/o bind mounts: get callers of vfs_mknod/create/mkdir()
This takes care of all of the direct callers of vfs_mknod().
Since a few of these cases also handle normal file creation
as well, this also covers some calls to vfs_create().

So that we don't have to make three mnt_want/drop_write()
calls inside of the switch statement, we move some of its
logic outside of the switch and into a helper function
suggested by Christoph.

This also encapsulates a fix for mknod(S_IFREG) that Miklos
found.

[AV: merged mkdir handling, added missing nfsd pieces]

Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-04-19 00:25:34 -04:00
Dave Hansen 0622753b80 [PATCH] r/o bind mounts: elevate write count for rmdir and unlink.
Elevate the write count during the vfs_rmdir() and vfs_unlink().

[AV: merged rmdir and unlink parts, added missing pieces in nfsd]

Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-04-19 00:25:33 -04:00
Christoph Hellwig a70e65df88 [PATCH] merge open_namei() and do_filp_open()
open_namei() will, in the future, need to take mount write counts
over its creation and truncation (via may_open()) operations.  It
needs to keep these write counts until any potential filp that is
created gets __fput()'d.

This gets complicated in the error handling and becomes very murky
as to how far open_namei() actually got, and whether or not that
mount write count was taken.  That makes it a bad interface.

All that the current do_filp_open() really does is allocate the
nameidata on the stack, then call open_namei().

So, this merges those two functions and moves filp_open() over
to namei.c so it can be close to its buddy: do_filp_open().  It
also gets a kerneldoc comment in the process.

Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-04-19 00:25:32 -04:00
Dave Hansen d57999e152 [PATCH] do namei_flags calculation inside open_namei()
My end goal here is to make sure all users of may_open()
return filps.  This will ensure that we properly release
mount write counts which were taken for the filp in
may_open().

This patch moves the sys_open flags to namei flags
calculation into fs/namei.c.  We'll shortly be moving
the nameidata_to_filp() calls into namei.c, and this
gets the sys_open flags to a place where we can get
at them when we need them.

Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-04-19 00:25:31 -04:00
Linus Torvalds 7ed7fe5e82 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  [PATCH] get stack footprint of pathname resolution back to relative sanity
  [PATCH] double iput() on failure exit in hugetlb
  [PATCH] double dput() on failure exit in tiny-shmem
  [PATCH] fix up new filp allocators
  [PATCH] check for null vfsmount in dentry_open()
  [PATCH] reiserfs: eliminate private use of struct file in xattr
  [PATCH] sanitize hppfs
  hppfs pass vfsmount to dentry_open()
  [PATCH] restore export of do_kern_mount()
2008-03-25 08:57:47 -07:00
Randy Dunlap a6b91919e0 fs: fix kernel-doc notation warnings
Fix kernel-doc notation warnings in fs/.

Warning(mmotm-2008-0314-1449//fs/super.c:560): missing initial short description on line:
 *	mark_files_ro
Warning(mmotm-2008-0314-1449//fs/locks.c:1277): missing initial short description on line:
 *	lease_get_mtime
Warning(mmotm-2008-0314-1449//fs/locks.c:1277): missing initial short description on line:
 *	lease_get_mtime
Warning(mmotm-2008-0314-1449//fs/namei.c:1368): missing initial short description on line:
 * lookup_one_len:  filesystem helper to lookup single pathname component
Warning(mmotm-2008-0314-1449//fs/buffer.c:3221): missing initial short description on line:
 * bh_uptodate_or_lock: Test whether the buffer is uptodate
Warning(mmotm-2008-0314-1449//fs/buffer.c:3240): missing initial short description on line:
 * bh_submit_read: Submit a locked buffer for reading
Warning(mmotm-2008-0314-1449//fs/fs-writeback.c:30): missing initial short description on line:
 * writeback_acquire: attempt to get exclusive writeback access to a device
Warning(mmotm-2008-0314-1449//fs/fs-writeback.c:47): missing initial short description on line:
 * writeback_in_progress: determine whether there is writeback in progress
Warning(mmotm-2008-0314-1449//fs/fs-writeback.c:58): missing initial short description on line:
 * writeback_release: relinquish exclusive writeback access against a device.
Warning(mmotm-2008-0314-1449//include/linux/jbd.h:351): contents before sections
Warning(mmotm-2008-0314-1449//include/linux/jbd.h:561): contents before sections
Warning(mmotm-2008-0314-1449//fs/jbd/transaction.c:1935): missing initial short description on line:
 * void journal_invalidatepage()

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-19 18:53:36 -07:00
Al Viro a02f76c34d [PATCH] get stack footprint of pathname resolution back to relative sanity
Somebody had put struct nameidata in stack frame of link_path_walk().
Unfortunately, there are certain realities to deal with:
	* It's in the middle of recursion.  Depth is equal to the nesting
depth of symlinks, i.e. up to 8.
	* struct namiedata is, even if one discards the intent junk,
at least 12 pointers + 5 ints.
	* moreover, adding a stack frame is not free in that situation.
	* there are fs methods called on top of that, and they also have
stack footprint.
	* kernel stack is not infinite.

The thing is, even if one chooses to deal with -ESTALE that way (and it's
one hell of an overkill), the only thing that needs to be preserved is
vfsmount + dentry, not the entire struct nameidata.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-19 06:55:46 -04:00