Commit graph

124 commits

Author SHA1 Message Date
Vivek Goyal c0ca3d70e8 ovl: modify ovl_permission() to do checks on two inodes
Right now ovl_permission() calls __inode_permission(realinode), to do
permission checks on real inode and no checks are done on overlay inode.

Modify it to do checks both on overlay inode as well as underlying inode.
Checks on overlay inode will be done with the creds of calling task while
checks on underlying inode will be done with the creds of mounter.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29 12:05:23 +02:00
Vivek Goyal 39a25b2b37 ovl: define ->get_acl() for overlay inodes
Now we are planning to do DAC permission checks on overlay inode
itself. And to make it work, we will need to make sure we can get acls from
underlying inode. So define ->get_acl() for overlay inodes and this in turn
calls into underlying filesystem to get acls, if any.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29 12:05:23 +02:00
Vivek Goyal 72e4848181 ovl: move some common code in a function
ovl_create_upper() and ovl_create_over_whiteout() seem to be sharing some
common code which can be moved into a separate function.  No functionality
change.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29 12:05:23 +02:00
Andreas Gruenbacher 58ed4e70f2 ovl: store ovl_entry in inode->i_private for all inodes
Previously this was only done for directory inodes.  Doing so for all
inodes makes for a nice cleanup in ovl_permission at zero cost.

Inodes are not shared for hard links on the overlay, so this works fine.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29 12:05:22 +02:00
Miklos Szeredi eead4f2dc4 ovl: use generic_delete_inode
No point in keeping overlay inodes around since they will never be reused.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29 12:05:22 +02:00
Miklos Szeredi c1b2cc1a76 ovl: check mounter creds on underlying lookup
The hash salting changes meant that we can no longer reuse the hash in the
overlay dentry to look up the underlying dentry.

Instead of lookup_hash(), use lookup_one_len_unlocked() and swith to
mounter's creds (like we do for all other operations later in the series).

Now the lookup_hash() export introduced in 4.6 by 3c9fe8cdff ("vfs: add
lookup_hash() helper") is unused and can possibly be removed; its
usefulness negated by the hash salting and the idea that mounter's creds
should be used on operations on underlying filesystems.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 8387ff2577 ("vfs: make the string hashes salt the hash")
2016-07-29 12:05:22 +02:00
Miklos Szeredi 1b91dbdd29 Merge branch 'd_real' into overlayfs-next 2016-07-27 11:36:03 +02:00
Maxim Patlasov cfc9fde0b0 ovl: verify upper dentry in ovl_remove_and_whiteout()
The upper dentry may become stale before we call ovl_lock_rename_workdir.
For example, someone could (mistakenly or maliciously) manually unlink(2)
it directly from upperdir.

To ensure it is not stale, let's lookup it after ovl_lock_rename_workdir
and and check if it matches the upper dentry.

Essentially, it is the same problem and similar solution as in
commit 11f3710417 ("ovl: verify upper dentry before unlink and rename").

Signed-off-by: Maxim Patlasov <mpatlasov@virtuozzo.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org>
2016-07-22 10:54:20 +02:00
Vivek Goyal 07a2daab49 ovl: Copy up underlying inode's ->i_mode to overlay inode
Right now when a new overlay inode is created, we initialize overlay
inode's ->i_mode from underlying inode ->i_mode but we retain only
file type bits (S_IFMT) and discard permission bits.

This patch changes it and retains permission bits too. This should allow
overlay to do permission checks on overlay inode itself in task context.

[SzM] It also fixes clearing suid/sgid bits on write.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reported-by: Eryu Guan <eguan@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 4bacc9c923 ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay")
Cc: <stable@vger.kernel.org>
2016-07-04 16:49:48 +02:00
Miklos Szeredi b99c2d9138 ovl: handle ATTR_KILL*
Before 4bacc9c923 ("overlayfs: Make f_path...") file->f_path pointed to
the underlying file, hence suid/sgid removal on write worked fine.

After that patch file->f_path pointed to the overlay file, and the file
mode bits weren't copied to overlay_inode->i_mode.  So the suid/sgid
removal simply stopped working.

The fix is to copy the mode bits, but then ovl_setattr() needs to clear
ATTR_MODE to avoid the BUG() in notify_change().  So do this first, then in
the next patch copy the mode.

Reported-by: Eryu Guan <eguan@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 4bacc9c923 ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay")
Cc: <stable@vger.kernel.org>
2016-07-04 16:49:48 +02:00
Vivek Goyal e7c0b5991d ovl: warn instead of error if d_type is not supported
overlay needs underlying fs to support d_type. Recently I put in a
patch in to detect this condition and started failing mount if
underlying fs did not support d_type.

But this breaks existing configurations over kernel upgrade. Those who
are running docker (partially broken configuration) with xfs not
supporting d_type, are surprised that after kernel upgrade docker does
not run anymore.

https://github.com/docker/docker/issues/22937#issuecomment-229881315

So instead of erroring out, detect broken configuration and warn
about it. This should allow existing docker setups to continue
working after kernel upgrade.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 45aebeaf4f ("ovl: Ensure upper filesystem supports d_type")
Cc: <stable@vger.kernel.org> 4.6
2016-07-03 09:39:31 +02:00
Miklos Szeredi 2d902671ce vfs: merge .d_select_inode() into .d_real()
The two methods essentially do the same: find the real dentry/inode
belonging to an overlay dentry.  The difference is in the usage:

vfs_open() uses ->d_select_inode() and expects the function to perform
copy-up if necessary based on the open flags argument.

file_dentry() uses ->d_real() passing in the overlay dentry as well as the
underlying inode.

vfs_rename() uses ->d_select_inode() but passes zero flags.  ->d_real()
with a zero inode would have worked just as well here.

This patch merges the functionality of ->d_select_inode() into ->d_real()
by adding an 'open_flags' argument to the latter.

[Al Viro] Make the signature of d_real() match that of ->d_real() again.
And constify the inode argument, while we are at it.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-06-30 08:53:27 +02:00
Miklos Szeredi 03bea60409 ovl: get_write_access() in truncate
When truncating a file we should check write access on the underlying
inode.  And we should do so on the lower file as well (before copy-up) for
consistency.

Original patch and test case by Aihua Zhang.

 - - >o >o - - test.c - - >o >o - -
#include <stdio.h>
#include <errno.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
	int ret;

	ret = truncate(argv[0], 4096);
	if (ret != -1) {
		fprintf(stderr, "truncate(argv[0]) should have failed\n");
		return 1;
	}
	if (errno != ETXTBSY) {
		perror("truncate(argv[0])");
		return 1;
	}

	return 0;
}
 - - >o >o - - >o >o - - >o >o - -

Reported-by: Aihua Zhang <zhangaihua1@huawei.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org>
2016-06-29 16:03:55 +02:00
Miklos Szeredi a4859d7594 ovl: fix dentry leak for default_permissions
When using the 'default_permissions' mount option, ovl_permission() on
non-directories was missing a dput(alias), resulting in "BUG Dentry still
in use".

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 8d3095f4ad ("ovl: default permissions")
Cc: <stable@vger.kernel.org> # v4.5+
2016-06-29 08:26:59 +02:00
Miklos Szeredi d0e13f5bbe ovl: fix uid/gid when creating over whiteout
Fix a regression when creating a file over a whiteout.  The new
file/directory needs to use the current fsuid/fsgid, not the ones from the
mounter's credentials.

The refcounting is a bit tricky: prepare_creds() sets an original refcount,
override_creds() gets one more, which revert_cred() drops.  So

  1) we need to expicitly put the mounter's credentials when overriding
     with the updated one

  2) we need to put the original ref to the updated creds (and this can
     safely be done before revert_creds(), since we'll still have the ref
     from override_creds()).

Reported-by: Stephen Smalley <sds@tycho.nsa.gov>
Fixes: 3fe6e52f06 ("ovl: override creds with the ones from the superblock mounter")
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-06-15 14:18:59 +02:00
Miklos Szeredi b581755b1c ovl: xattr filter fix
a) ovl_need_xattr_filter() is wrong, we can have multiple lower layers
overlaid, all of which (except the lowest one) honouring the
"trusted.overlay.opaque" xattr.  So need to filter everything except the
bottom and the pure-upper layer.

b) we no longer can assume that inode is attached to dentry in
get/setxattr.

This patch unconditionally filters private xattrs to fix both of the above.
Performance impact for get/removexattrs is likely in the noise.

For listxattrs it might be measurable in pathological cases, but I very
much hope nobody cares.  If they do, we'll fix it then.

Reported-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: b96809173e ("security_d_instantiate(): move to the point prior to attaching dentry to inode")
2016-06-06 16:21:37 +02:00
Linus Torvalds d102a56edb Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
 "Followups to the parallel lookup work:

   - update docs

   - restore killability of the places that used to take ->i_mutex
     killably now that we have down_write_killable() merged

   - Additionally, it turns out that I missed a prerequisite for
     security_d_instantiate() stuff - ->getxattr() wasn't the only thing
     that could be called before dentry is attached to inode; with smack
     we needed the same treatment applied to ->setxattr() as well"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  switch ->setxattr() to passing dentry and inode separately
  switch xattr_handler->set() to passing dentry and inode separately
  restore killability of old mutex_lock_killable(&inode->i_mutex) users
  add down_write_killable_nested()
  update D/f/directory-locking
2016-05-27 17:14:05 -07:00
Al Viro 3767e255b3 switch ->setxattr() to passing dentry and inode separately
smack ->d_instantiate() uses ->setxattr(), so to be able to call it before
we'd hashed the new dentry and attached it to inode, we need ->setxattr()
instances getting the inode as an explicit argument rather than obtaining
it from dentry.

Similar change for ->getxattr() had been done in commit ce23e64.  Unlike
->getxattr() (which is used by both selinux and smack instances of
->d_instantiate()) ->setxattr() is used only by smack one and unfortunately
it got missed back then.

Reported-by: Seung-Woo Kim <sw0312.kim@samsung.com>
Tested-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-27 20:09:16 -04:00
Linus Torvalds 0121a32201 Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull overlayfs update from Miklos Szeredi:
 "The meat of this is a change to use the mounter's credentials for
  operations that require elevated privileges (such as whiteout
  creation).  This fixes behavior under user namespaces as well as being
  a nice cleanup"

* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: Do d_type check only if work dir creation was successful
  ovl: update documentation
  ovl: override creds with the ones from the superblock mounter
2016-05-27 16:44:39 -07:00
Vivek Goyal 21765194ce ovl: Do d_type check only if work dir creation was successful
d_type check requires successful creation of workdir as iterates
through work dir and expects work dir to be present in it. If that's
not the case, this check will always return d_type not supported even
if underlying filesystem might be supporting it.

So don't do this check if work dir creation failed in previous step.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-05-27 10:18:56 +02:00
Antonio Murdaca 3fe6e52f06 ovl: override creds with the ones from the superblock mounter
In user namespace the whiteout creation fails with -EPERM because the
current process isn't capable(CAP_SYS_ADMIN) when setting xattr.

A simple reproducer:

$ mkdir upper lower work merged lower/dir
$ sudo mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir=work merged
$ unshare -m -p -f -U -r bash

Now as root in the user namespace:

\# touch merged/dir/{1,2,3} # this will force a copy up of lower/dir
\# rm -fR merged/*

This ends up failing with -EPERM after the files in dir has been
correctly deleted:

unlinkat(4, "2", 0)                     = 0
unlinkat(4, "1", 0)                     = 0
unlinkat(4, "3", 0)                     = 0
close(4)                                = 0
unlinkat(AT_FDCWD, "merged/dir", AT_REMOVEDIR) = -1 EPERM (Operation not
permitted)

Interestingly, if you don't place files in merged/dir you can remove it,
meaning if upper/dir does not exist, creating the char device file works
properly in that same location.

This patch uses ovl_sb_creator_cred() to get the cred struct from the
superblock mounter and override the old cred with these new ones so that
the whiteout creation is possible because overlay is wrong in assuming that
the creds it will get with prepare_creds will be in the initial user
namespace.  The old cap_raise game is removed in favor of just overriding
the old cred struct.

This patch also drops from ovl_copy_up_one() the following two lines:

override_cred->fsuid = stat->uid;
override_cred->fsgid = stat->gid;

This is because the correct uid and gid are taken directly with the stat
struct and correctly set with ovl_set_attr().

Signed-off-by: Antonio Murdaca <runcom@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-05-27 08:55:26 +02:00
Al Viro 002354112f restore killability of old mutex_lock_killable(&inode->i_mutex) users
The ones that are taking it exclusive, that is...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-26 00:13:25 -04:00
Al Viro 0e0162bb8c Merge branch 'ovl-fixes' into for-linus
Backmerge to resolve a conflict in ovl_lookup_real();
"ovl_lookup_real(): use lookup_one_len_unlocked()" instead,
but it was too late in the cycle to rebase.
2016-05-17 02:17:59 -04:00
Miklos Szeredi 38b78a5f18 ovl: ignore permissions on underlying lookup
Generally permission checking is not necessary when overlayfs looks up a
dentry on one of the underlying layers, since search permission on base
directory was already checked in ovl_permission().

More specifically using lookup_one_len() causes a problem when the lower
directory lacks search permission for a specific user while the upper
directory does have search permission.  Since lookups are cached, this
causes inconsistency in behavior: success depends on who did the first
lookup.

So instead use lookup_hash() which doesn't do the permission check.

Reported-by: Ignacy Gawędzki <ignacy.gawedzki@green-communications.fr>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-05-10 23:58:18 -04:00
Al Viro 9902af79c0 parallel lookups: actual switch to rwsem
ta-da!

The main issue is the lack of down_write_killable(), so the places
like readdir.c switched to plain inode_lock(); once killable
variants of rwsem primitives appear, that'll be dealt with.

lockdep side also might need more work

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02 19:49:28 -04:00
Al Viro b9e1d435fd ovl_lookup_real(): use lookup_one_len_unlocked()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-02 19:47:24 -04:00
Al Viro 84695ffee7 Merge getxattr prototype change into work.lookups
The rest of work.xattr stuff isn't needed for this branch
2016-05-02 19:45:47 -04:00
Al Viro ce23e64013 ->getxattr(): pass dentry and inode as separate arguments
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-04-11 00:48:00 -04:00
Miklos Szeredi d101a12595 fs: add file_dentry()
This series fixes bugs in nfs and ext4 due to 4bacc9c923 ("overlayfs:
Make f_path always point to the overlay and f_inode to the underlay").

Regular files opened on overlayfs will result in the file being opened on
the underlying filesystem, while f_path points to the overlayfs
mount/dentry.

This confuses filesystems which get the dentry from struct file and assume
it's theirs.

Add a new helper, file_dentry() [*], to get the filesystem's own dentry
from the file.  This checks file->f_path.dentry->d_flags against
DCACHE_OP_REAL, and returns file->f_path.dentry if DCACHE_OP_REAL is not
set (this is the common, non-overlayfs case).

In the uncommon case it will call into overlayfs's ->d_real() to get the
underlying dentry, matching file_inode(file).

The reason we need to check against the inode is that if the file is copied
up while being open, d_real() would return the upper dentry, while the open
file comes from the lower dentry.

[*] If possible, it's better simply to use file_inode() instead.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Tested-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Reviewed-by: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: <stable@vger.kernel.org> # v4.2
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Daniel Axtens <dja@axtens.net>
2016-03-26 16:14:37 -04:00
Miklos Szeredi 6986c012fa ovl: cleanup unused var in rename2
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-03-21 17:31:46 +01:00
Miklos Szeredi 56656e960b ovl: rename is_merge to is_lowest
The 'is_merge' is an historical naming from when only a single lower layer
could exist.  With the introduction of multiple lower layers the meaning of
this flag was changed to mean only the "lowest layer" (while all lower
layers were being merged).

So now 'is_merge' is inaccurate and hence renaming to 'is_lowest'

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-03-21 17:31:46 +01:00
Sohom Bhattacharjee f134f24465 ovl: fixed coding style warning
This patch fixes a newline warning found by the checkpatch.pl tool

Signed-off-by: Sohom-Bhattacharjee <soham.bhattacharjee15@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-03-21 17:31:45 +01:00
Vivek Goyal 45aebeaf4f ovl: Ensure upper filesystem supports d_type
In some instances xfs has been created with ftype=0 and there if a file
on lower fs is removed, overlay leaves a whiteout in upper fs but that
whiteout does not get filtered out and is visible to overlayfs users.

And reason it does not get filtered out because upper filesystem does
not report file type of whiteout as DT_CHR during iterate_dir().

So it seems to be a requirement that upper filesystem support d_type for
overlayfs to work properly. Do this check during mount and fail if d_type
is not supported.

Suggested-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-03-21 17:31:45 +01:00
David Howells fb5bb2c3b7 ovl: Warn on copy up if a process has a R/O fd open to the lower file
Print a warning when overlayfs copies up a file if the process that
triggered the copy up has a R/O fd open to the lower file being copied up.

This can help catch applications that do things like the following:

	fd1 = open("foo", O_RDONLY);
	fd2 = open("foo", O_RDWR);

where they expect fd1 and fd2 to refer to the same file - which will no
longer be the case post-copy up.

With this patch, the following commands:

	bash 5</mnt/a/foo128
	6<>/mnt/a/foo128

assuming /mnt/a/foo128 to be an un-copied up file on an overlay will
produce the following warning in the kernel log:

	overlayfs: Copying up foo129, but open R/O on fd 5 which will cease
	to be coherent [pid=3818 bash]

This is enabled by setting:

	/sys/module/overlay/parameters/check_copy_up

to 1.

The warnings are ratelimited and are also limited to one warning per file -
assuming the copy up completes in each case.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-03-21 17:31:45 +01:00
Konstantin Khlebnikov 07f2af7bfd ovl: honor flag MS_SILENT at mount
This patch hides error about missing lowerdir if MS_SILENT is set.

We use mount(NULL, "/", "overlay", MS_SILENT, NULL) for testing support of
overlayfs: syscall returns -ENODEV if it's not supported. Otherwise kernel
automatically loads module and returns -EINVAL because lowerdir is missing.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-03-21 17:31:45 +01:00
Miklos Szeredi 11f3710417 ovl: verify upper dentry before unlink and rename
Unlink and rename in overlayfs checked the upper dentry for staleness by
verifying upper->d_parent against upperdir.  However the dentry can go
stale also by being unhashed, for example.

Expand the verification to actually look up the name again (under parent
lock) and check if it matches the upper dentry.  This matches what the VFS
does before passing the dentry to filesytem's unlink/rename methods, which
excludes any inconsistency caused by overlayfs.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-03-21 17:31:44 +01:00
Konstantin Khlebnikov b81de061fa ovl: copy new uid/gid into overlayfs runtime inode
Overlayfs must update uid/gid after chown, otherwise functions
like inode_owner_or_capable() will check user against stale uid.
Catched by xfstests generic/087, it chowns file and calls utimes.

Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <stable@vger.kernel.org>
2016-03-03 17:17:46 +01:00
Konstantin Khlebnikov 45d1173896 ovl: ignore lower entries when checking purity of non-directory entries
After rename file dentry still holds reference to lower dentry from
previous location. This doesn't matter for data access because data comes
from upper dentry. But this stale lower dentry taints dentry at new
location and turns it into non-pure upper. Such file leaves visible
whiteout entry after remove in directory which shouldn't have whiteouts at
all.

Overlayfs already tracks pureness of file location in oe->opaque.  This
patch just uses that for detecting actual path type.

Comment from Vivek Goyal's patch:

Here are the details of the problem. Do following.

$ mkdir upper lower work merged upper/dir/
$ touch lower/test
$ sudo mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir=
work merged
$ mv merged/test merged/dir/
$ rm merged/dir/test
$ ls -l merged/dir/
/usr/bin/ls: cannot access merged/dir/test: No such file or directory
total 0
c????????? ? ? ? ?            ? test

Basic problem seems to be that once a file has been unlinked, a whiteout
has been left behind which was not needed and hence it becomes visible.

Whiteout is visible because parent dir is of not type MERGE, hence
od->is_real is set during ovl_dir_open(). And that means ovl_iterate()
passes on iterate handling directly to underlying fs. Underlying fs does
not know/filter whiteouts so it becomes visible to user.

Why did we leave a whiteout to begin with when we should not have.
ovl_do_remove() checks for OVL_TYPE_PURE_UPPER() and does not leave
whiteout if file is pure upper. In this case file is not found to be pure
upper hence whiteout is left.

So why file was not PURE_UPPER in this case? I think because dentry is
still carrying some leftover state which was valid before rename. For
example, od->numlower was set to 1 as it was a lower file. After rename,
this state is not valid anymore as there is no such file in lower.

Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Reported-by: Viktor Stanchev <me@viktorstanchev.com>
Suggested-by: Vivek Goyal <vgoyal@redhat.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=109611
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <stable@vger.kernel.org>
2016-03-03 17:17:45 +01:00
Rui Wang ce9113bbcb ovl: fix getcwd() failure after unsuccessful rmdir
ovl_remove_upper() should do d_drop() only after it successfully
removes the dir, otherwise a subsequent getcwd() system call will
fail, breaking userspace programs.

This is to fix: https://bugzilla.kernel.org/show_bug.cgi?id=110491

Signed-off-by: Rui Wang <rui.y.wang@intel.com>
Reviewed-by: Konstantin Khlebnikov <koct9i@gmail.com>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <stable@vger.kernel.org>
2016-03-03 17:17:45 +01:00
Konstantin Khlebnikov b5891cfab0 ovl: fix working on distributed fs as lower layer
This adds missing .d_select_inode into alternative dentry_operations.

Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Fixes: 7c03b5d45b ("ovl: allow distributed fs as lower layer")
Fixes: 4bacc9c923 ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay")
Reviewed-by: Nikolay Borisov <kernel@kyup.com>
Tested-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <stable@vger.kernel.org> # 4.2+
2016-03-03 17:17:45 +01:00
Al Viro 5955102c99 wrappers for ->i_mutex access
parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
inode_foo(inode) being mutex_foo(&inode->i_mutex).

Please, use those for access to ->i_mutex; over the coming cycle
->i_mutex will become rwsem, with ->lookup() done with it held
only shared.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-01-22 18:04:28 -05:00
Linus Torvalds eae21770b4 Merge branch 'akpm' (patches from Andrew)
Merge third patch-bomb from Andrew Morton:
 "I'm pretty much done for -rc1 now:

   - the rest of MM, basically

   - lib/ updates

   - checkpatch, epoll, hfs, fatfs, ptrace, coredump, exit

   - cpu_mask simplifications

   - kexec, rapidio, MAINTAINERS etc, etc.

   - more dma-mapping cleanups/simplifications from hch"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (109 commits)
  MAINTAINERS: add/fix git URLs for various subsystems
  mm: memcontrol: add "sock" to cgroup2 memory.stat
  mm: memcontrol: basic memory statistics in cgroup2 memory controller
  mm: memcontrol: do not uncharge old page in page cache replacement
  Documentation: cgroup: add memory.swap.{current,max} description
  mm: free swap cache aggressively if memcg swap is full
  mm: vmscan: do not scan anon pages if memcg swap limit is hit
  swap.h: move memcg related stuff to the end of the file
  mm: memcontrol: replace mem_cgroup_lruvec_online with mem_cgroup_online
  mm: vmscan: pass memcg to get_scan_count()
  mm: memcontrol: charge swap to cgroup2
  mm: memcontrol: clean up alloc, online, offline, free functions
  mm: memcontrol: flatten struct cg_proto
  mm: memcontrol: rein in the CONFIG space madness
  net: drop tcp_memcontrol.c
  mm: memcontrol: introduce CONFIG_MEMCG_LEGACY_KMEM
  mm: memcontrol: allow to disable kmem accounting for cgroup2
  mm: memcontrol: account "kmem" consumers in cgroup2 memory controller
  mm: memcontrol: move kmem accounting code to CONFIG_MEMCG
  mm: memcontrol: separate kmem code from legacy tcp accounting code
  ...
2016-01-21 12:32:08 -08:00
Linus Torvalds e9f57ebcba Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull overlayfs updates from Miklos Szeredi:
 "This contains several bug fixes and a new mount option
  'default_permissions' that allows read-only exported NFS
  filesystems to be used as lower layer"

* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: check dentry positiveness in ovl_cleanup_whiteouts()
  ovl: setattr: check permissions before copy-up
  ovl: root: copy attr
  ovl: move super block magic number to magic.h
  ovl: use a minimal buffer in ovl_copy_xattr
  ovl: allow zero size xattr
  ovl: default permissions
2016-01-21 12:20:46 -08:00
Andrew Morton e458bcd16f fs/overlayfs/super.c needs pagemap.h
i386 allmodconfig:

  In file included from fs/overlayfs/super.c:10:0:
  fs/overlayfs/super.c: In function 'ovl_fill_super':
  include/linux/fs.h:898:36: error: 'PAGE_CACHE_SIZE' undeclared (first use in this function)
   #define MAX_LFS_FILESIZE (((loff_t)PAGE_CACHE_SIZE << (BITS_PER_LONG-1))-1)
                                      ^
  fs/overlayfs/super.c:939:19: note: in expansion of macro 'MAX_LFS_FILESIZE'
    sb->s_maxbytes = MAX_LFS_FILESIZE;
                     ^
  include/linux/fs.h:898:36: note: each undeclared identifier is reported only once for each function it appears in
   #define MAX_LFS_FILESIZE (((loff_t)PAGE_CACHE_SIZE << (BITS_PER_LONG-1))-1)
                                      ^
  fs/overlayfs/super.c:939:19: note: in expansion of macro 'MAX_LFS_FILESIZE'
    sb->s_maxbytes = MAX_LFS_FILESIZE;
                     ^

Cc: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Al Viro fceef393a5 switch ->get_link() to delayed_call, kill ->put_link()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-30 13:01:03 -05:00
Konstantin Khlebnikov 84889d4933 ovl: check dentry positiveness in ovl_cleanup_whiteouts()
This patch fixes kernel crash at removing directory which contains
whiteouts from lower layers.

Cache of directory content passed as "list" contains entries from all
layers, including whiteouts from lower layers. So, lookup in upper dir
(moved into work at this stage) will return negative entry. Plus this
cache is filled long before and we can race with external removal.

Example:
 mkdir -p lower0/dir lower1/dir upper work overlay
 touch lower0/dir/a lower0/dir/b
 mknod lower1/dir/a c 0 0
 mount -t overlay none overlay -o lowerdir=lower1:lower0,upperdir=upper,workdir=work
 rm -fr overlay/dir

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <stable@vger.kernel.org> # 3.18+
2015-12-11 16:30:49 +01:00
Miklos Szeredi cf9a6784f7 ovl: setattr: check permissions before copy-up
Without this copy-up of a file can be forced, even without actually being
allowed to do anything on the file.

[Arnd Bergmann] include <linux/pagemap.h> for PAGE_CACHE_SIZE (used by
MAX_LFS_FILESIZE definition).

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <stable@vger.kernel.org>
2015-12-11 16:30:49 +01:00
Miklos Szeredi ed06e06977 ovl: root: copy attr
We copy i_uid and i_gid of underlying inode into overlayfs inode.  Except
for the root inode.

Fix this omission.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <stable@vger.kernel.org>
2015-12-09 16:11:59 +01:00
Al Viro 6b2553918d replace ->follow_link() with new method that could stay in RCU mode
new method: ->get_link(); replacement of ->follow_link().  The differences
are:
	* inode and dentry are passed separately
	* might be called both in RCU and non-RCU mode;
the former is indicated by passing it a NULL dentry.
	* when called that way it isn't allowed to block
and should return ERR_PTR(-ECHILD) if it needs to be called
in non-RCU mode.

It's a flagday change - the old method is gone, all in-tree instances
converted.  Conversion isn't hard; said that, so far very few instances
do not immediately bail out when called in RCU mode.  That'll change
in the next commits.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-08 22:41:54 -05:00
Al Viro 0f7ff2dabb ovl: get rid of the dead code left from broken (and disabled) optimizations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06 12:31:07 -05:00