1
0
Fork 0
Commit Graph

126 Commits (a528d35e8bfcc521d7cb70aaf03e1bd296c8493f)

Author SHA1 Message Date
David Howells a528d35e8b statx: Add a system call to make enhanced file info available
Add a system call to make extended file information available, including
file creation and some attribute flags where available through the
underlying filesystem.

The getattr inode operation is altered to take two additional arguments: a
u32 request_mask and an unsigned int flags that indicate the
synchronisation mode.  This change is propagated to the vfs_getattr*()
function.

Functions like vfs_stat() are now inline wrappers around new functions
vfs_statx() and vfs_statx_fd() to reduce stack usage.

========
OVERVIEW
========

The idea was initially proposed as a set of xattrs that could be retrieved
with getxattr(), but the general preference proved to be for a new syscall
with an extended stat structure.

A number of requests were gathered for features to be included.  The
following have been included:

 (1) Make the fields a consistent size on all arches and make them large.

 (2) Spare space, request flags and information flags are provided for
     future expansion.

 (3) Better support for the y2038 problem [Arnd Bergmann] (tv_sec is an
     __s64).

 (4) Creation time: The SMB protocol carries the creation time, which could
     be exported by Samba, which will in turn help CIFS make use of
     FS-Cache as that can be used for coherency data (stx_btime).

     This is also specified in NFSv4 as a recommended attribute and could
     be exported by NFSD [Steve French].

 (5) Lightweight stat: Ask for just those details of interest, and allow a
     netfs (such as NFS) to approximate anything not of interest, possibly
     without going to the server [Trond Myklebust, Ulrich Drepper, Andreas
     Dilger] (AT_STATX_DONT_SYNC).

 (6) Heavyweight stat: Force a netfs to go to the server, even if it thinks
     its cached attributes are up to date [Trond Myklebust]
     (AT_STATX_FORCE_SYNC).

And the following have been left out for future extension:

 (7) Data version number: Could be used by userspace NFS servers [Aneesh
     Kumar].

     Can also be used to modify fill_post_wcc() in NFSD which retrieves
     i_version directly, but has just called vfs_getattr().  It could get
     it from the kstat struct if it used vfs_xgetattr() instead.

     (There's disagreement on the exact semantics of a single field, since
     not all filesystems do this the same way).

 (8) BSD stat compatibility: Including more fields from the BSD stat such
     as creation time (st_btime) and inode generation number (st_gen)
     [Jeremy Allison, Bernd Schubert].

 (9) Inode generation number: Useful for FUSE and userspace NFS servers
     [Bernd Schubert].

     (This was asked for but later deemed unnecessary with the
     open-by-handle capability available and caused disagreement as to
     whether it's a security hole or not).

(10) Extra coherency data may be useful in making backups [Andreas Dilger].

     (No particular data were offered, but things like last backup
     timestamp, the data version number and the DOS archive bit would come
     into this category).

(11) Allow the filesystem to indicate what it can/cannot provide: A
     filesystem can now say it doesn't support a standard stat feature if
     that isn't available, so if, for instance, inode numbers or UIDs don't
     exist or are fabricated locally...

     (This requires a separate system call - I have an fsinfo() call idea
     for this).

(12) Store a 16-byte volume ID in the superblock that can be returned in
     struct xstat [Steve French].

     (Deferred to fsinfo).

(13) Include granularity fields in the time data to indicate the
     granularity of each of the times (NFSv4 time_delta) [Steve French].

     (Deferred to fsinfo).

(14) FS_IOC_GETFLAGS value.  These could be translated to BSD's st_flags.
     Note that the Linux IOC flags are a mess and filesystems such as Ext4
     define flags that aren't in linux/fs.h, so translation in the kernel
     may be a necessity (or, possibly, we provide the filesystem type too).

     (Some attributes are made available in stx_attributes, but the general
     feeling was that the IOC flags were to ext[234]-specific and shouldn't
     be exposed through statx this way).

(15) Mask of features available on file (eg: ACLs, seclabel) [Brad Boyer,
     Michael Kerrisk].

     (Deferred, probably to fsinfo.  Finding out if there's an ACL or
     seclabal might require extra filesystem operations).

(16) Femtosecond-resolution timestamps [Dave Chinner].

     (A __reserved field has been left in the statx_timestamp struct for
     this - if there proves to be a need).

(17) A set multiple attributes syscall to go with this.

===============
NEW SYSTEM CALL
===============

The new system call is:

	int ret = statx(int dfd,
			const char *filename,
			unsigned int flags,
			unsigned int mask,
			struct statx *buffer);

The dfd, filename and flags parameters indicate the file to query, in a
similar way to fstatat().  There is no equivalent of lstat() as that can be
emulated with statx() by passing AT_SYMLINK_NOFOLLOW in flags.  There is
also no equivalent of fstat() as that can be emulated by passing a NULL
filename to statx() with the fd of interest in dfd.

Whether or not statx() synchronises the attributes with the backing store
can be controlled by OR'ing a value into the flags argument (this typically
only affects network filesystems):

 (1) AT_STATX_SYNC_AS_STAT tells statx() to behave as stat() does in this
     respect.

 (2) AT_STATX_FORCE_SYNC will require a network filesystem to synchronise
     its attributes with the server - which might require data writeback to
     occur to get the timestamps correct.

 (3) AT_STATX_DONT_SYNC will suppress synchronisation with the server in a
     network filesystem.  The resulting values should be considered
     approximate.

mask is a bitmask indicating the fields in struct statx that are of
interest to the caller.  The user should set this to STATX_BASIC_STATS to
get the basic set returned by stat().  It should be noted that asking for
more information may entail extra I/O operations.

buffer points to the destination for the data.  This must be 256 bytes in
size.

======================
MAIN ATTRIBUTES RECORD
======================

The following structures are defined in which to return the main attribute
set:

	struct statx_timestamp {
		__s64	tv_sec;
		__s32	tv_nsec;
		__s32	__reserved;
	};

	struct statx {
		__u32	stx_mask;
		__u32	stx_blksize;
		__u64	stx_attributes;
		__u32	stx_nlink;
		__u32	stx_uid;
		__u32	stx_gid;
		__u16	stx_mode;
		__u16	__spare0[1];
		__u64	stx_ino;
		__u64	stx_size;
		__u64	stx_blocks;
		__u64	__spare1[1];
		struct statx_timestamp	stx_atime;
		struct statx_timestamp	stx_btime;
		struct statx_timestamp	stx_ctime;
		struct statx_timestamp	stx_mtime;
		__u32	stx_rdev_major;
		__u32	stx_rdev_minor;
		__u32	stx_dev_major;
		__u32	stx_dev_minor;
		__u64	__spare2[14];
	};

The defined bits in request_mask and stx_mask are:

	STATX_TYPE		Want/got stx_mode & S_IFMT
	STATX_MODE		Want/got stx_mode & ~S_IFMT
	STATX_NLINK		Want/got stx_nlink
	STATX_UID		Want/got stx_uid
	STATX_GID		Want/got stx_gid
	STATX_ATIME		Want/got stx_atime{,_ns}
	STATX_MTIME		Want/got stx_mtime{,_ns}
	STATX_CTIME		Want/got stx_ctime{,_ns}
	STATX_INO		Want/got stx_ino
	STATX_SIZE		Want/got stx_size
	STATX_BLOCKS		Want/got stx_blocks
	STATX_BASIC_STATS	[The stuff in the normal stat struct]
	STATX_BTIME		Want/got stx_btime{,_ns}
	STATX_ALL		[All currently available stuff]

stx_btime is the file creation time, stx_mask is a bitmask indicating the
data provided and __spares*[] are where as-yet undefined fields can be
placed.

Time fields are structures with separate seconds and nanoseconds fields
plus a reserved field in case we want to add even finer resolution.  Note
that times will be negative if before 1970; in such a case, the nanosecond
fields will also be negative if not zero.

The bits defined in the stx_attributes field convey information about a
file, how it is accessed, where it is and what it does.  The following
attributes map to FS_*_FL flags and are the same numerical value:

	STATX_ATTR_COMPRESSED		File is compressed by the fs
	STATX_ATTR_IMMUTABLE		File is marked immutable
	STATX_ATTR_APPEND		File is append-only
	STATX_ATTR_NODUMP		File is not to be dumped
	STATX_ATTR_ENCRYPTED		File requires key to decrypt in fs

Within the kernel, the supported flags are listed by:

	KSTAT_ATTR_FS_IOC_FLAGS

[Are any other IOC flags of sufficient general interest to be exposed
through this interface?]

New flags include:

	STATX_ATTR_AUTOMOUNT		Object is an automount trigger

These are for the use of GUI tools that might want to mark files specially,
depending on what they are.

Fields in struct statx come in a number of classes:

 (0) stx_dev_*, stx_blksize.

     These are local system information and are always available.

 (1) stx_mode, stx_nlinks, stx_uid, stx_gid, stx_[amc]time, stx_ino,
     stx_size, stx_blocks.

     These will be returned whether the caller asks for them or not.  The
     corresponding bits in stx_mask will be set to indicate whether they
     actually have valid values.

     If the caller didn't ask for them, then they may be approximated.  For
     example, NFS won't waste any time updating them from the server,
     unless as a byproduct of updating something requested.

     If the values don't actually exist for the underlying object (such as
     UID or GID on a DOS file), then the bit won't be set in the stx_mask,
     even if the caller asked for the value.  In such a case, the returned
     value will be a fabrication.

     Note that there are instances where the type might not be valid, for
     instance Windows reparse points.

 (2) stx_rdev_*.

     This will be set only if stx_mode indicates we're looking at a
     blockdev or a chardev, otherwise will be 0.

 (3) stx_btime.

     Similar to (1), except this will be set to 0 if it doesn't exist.

=======
TESTING
=======

The following test program can be used to test the statx system call:

	samples/statx/test-statx.c

Just compile and run, passing it paths to the files you want to examine.
The file is built automatically if CONFIG_SAMPLES is enabled.

Here's some example output.  Firstly, an NFS directory that crosses to
another FSID.  Note that the AUTOMOUNT attribute is set because transiting
this directory will cause d_automount to be invoked by the VFS.

	[root@andromeda ~]# /tmp/test-statx -A /warthog/data
	statx(/warthog/data) = 0
	results=7ff
	  Size: 4096            Blocks: 8          IO Block: 1048576  directory
	Device: 00:26           Inode: 1703937     Links: 125
	Access: (3777/drwxrwxrwx)  Uid:     0   Gid:  4041
	Access: 2016-11-24 09:02:12.219699527+0000
	Modify: 2016-11-17 10:44:36.225653653+0000
	Change: 2016-11-17 10:44:36.225653653+0000
	Attributes: 0000000000001000 (-------- -------- -------- -------- -------- -------- ---m---- --------)

Secondly, the result of automounting on that directory.

	[root@andromeda ~]# /tmp/test-statx /warthog/data
	statx(/warthog/data) = 0
	results=7ff
	  Size: 4096            Blocks: 8          IO Block: 1048576  directory
	Device: 00:27           Inode: 2           Links: 125
	Access: (3777/drwxrwxrwx)  Uid:     0   Gid:  4041
	Access: 2016-11-24 09:02:12.219699527+0000
	Modify: 2016-11-17 10:44:36.225653653+0000
	Change: 2016-11-17 10:44:36.225653653+0000

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-03-02 20:51:15 -05:00
Eric Biggers 6f69f0ed61 fscrypt: constify struct fscrypt_operations
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Richard Weinberger <richard@nod.at>
2017-02-08 10:59:57 -05:00
Eric Biggers 46f47e4800 fscrypt: split supp and notsupp declarations into their own headers
Previously, each filesystem configured without encryption support would
define all the public fscrypt functions to their notsupp_* stubs.  This
list of #defines had to be updated in every filesystem whenever a change
was made to the public fscrypt functions.  To make things more
maintainable now that we have three filesystems using fscrypt, split the
old header fscrypto.h into several new headers.  fscrypt_supp.h contains
the real declarations and is included by filesystems when configured
with encryption support, whereas fscrypt_notsupp.h contains the inline
stubs and is included by filesystems when configured without encryption
support.  fscrypt_common.h contains common declarations needed by both.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2017-02-06 23:26:43 -05:00
Richard Weinberger ec9160dacd ubifs: Use fscrypt ioctl() helpers
Commit db717d8e26 ("fscrypto: move ioctl processing more fully into
common code") moved ioctl() related functions into fscrypt and offers
us now a set of helper functions.

Signed-off-by: Richard Weinberger <richard@nod.at>
Reviewed-by: David Gstir <david@sigma-star.at>
2016-12-13 19:54:52 +01:00
Richard Weinberger e021986ee4 ubifs: Implement UBIFS_FLG_ENCRYPTION
This feature flag indicates that the filesystem contains encrypted
files.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger d63d61c169 ubifs: Implement UBIFS_FLG_DOUBLE_HASH
This feature flag indicates that all directory entry nodes have a 32bit
cookie set and therefore UBIFS is allowed to perform lookups by hash.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger cc41a53652 ubifs: Use a random number for cookies
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger 528e3d178f ubifs: Add full hash lookup support
UBIFS stores a 32bit hash of every file, for traditional lookups by name
this scheme is fine since UBIFS can first try to find the file by the
hash of the filename and upon collisions it can walk through all entries
with the same hash and do a string compare.
When filesnames are encrypted fscrypto will ask the filesystem for a
unique cookie, based on this cookie the filesystem has to be able to
locate the target file again. With 32bit hashes this is impossible
because the chance for collisions is very high. Do deal with that we
store a 32bit cookie directly in the UBIFS directory entry node such
that we get a 64bit cookie (32bit from filename hash and the dent
cookie). For a lookup by hash UBIFS finds the entry by the first 32bit
and then compares the dent cookie. If it does not match, it has to do a
linear search of the whole directory and compares all dent cookies until
the correct entry is found.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger f4f61d2cc6 ubifs: Implement encrypted filenames
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: David Gstir <david@sigma-star.at>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger 7799953b34 ubifs: Implement encrypt/decrypt for all IO
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: David Gstir <david@sigma-star.at>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger 1ee77870c9 ubifs: Constify struct inode pointer in ubifs_crypt_is_encrypted()
...and provide a non const variant for fscrypto

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger d475a50745 ubifs: Add skeleton for fscrypto
This is the first building block to provide file level
encryption on UBIFS.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger ade46c3a60 ubifs: Export xattr get and set functions
For fscrypto we need this function outside of xattr.c.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger f6337d8426 ubifs: Export ubifs_check_dir_empty()
fscrypto will need this function too. Also get struct ubifs_info
from the provided inode. Not all callers will have a reference to
struct ubifs_info.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Rafał Miłecki 1b7fc2c006 ubifs: Use dirty_writeback_interval value for wbuf timer
Right now wbuf timer has hardcoded timeouts and there is no place for
manual adjustments. Some projects / cases many need that though. Few
file systems allow doing that by respecting dirty_writeback_interval
that can be set using sysctl (dirty_writeback_centisecs).

Lowering dirty_writeback_interval could be some way of dealing with user
space apps lacking proper fsyncs. This is definitely *not* a perfect
solution but we don't have ideal (user space) world. There were already
advanced discussions on this matter, mostly when ext4 was introduced and
it wasn't behaving as ext3. Anyway, the final decision was to add some
hacks to the ext4, as trying to fix whole user space or adding new API
was pointless.

We can't (and shouldn't?) just follow ext4. We can't e.g. sync on close
as this would cause too many commits and flash wearing. On the other
hand we still should allow some trade-off between -o sync and default
wbuf timeout. Respecting dirty_writeback_interval should allow some sane
cutomizations if used warily.

Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:06:24 +01:00
Rafał Miłecki 854826c9d5 ubifs: Drop softlimit and delta fields from struct ubifs_wbuf
Values of these fields are set during init and never modified. They are
used (read) in a single function only. There isn't really any reason to
keep them in a struct. It only makes struct just a bit bigger without
any visible gain.

Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:06:11 +01:00
Richard Weinberger 9ec64962af ubifs: Implement RENAME_EXCHANGE
Adds RENAME_EXCHANGE to UBIFS, the operation itself
is completely disjunct from a regular rename() that's
why we dispatch very early in ubifs_reaname().

RENAME_EXCHANGE used by the renameat2() system call
allows the caller to exchange two paths atomically.
Both paths have to exist and have to be on the same
filesystem.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-02 22:55:02 +02:00
Richard Weinberger 9e0a1fff8d ubifs: Implement RENAME_WHITEOUT
Adds RENAME_WHITEOUT support to UBIFS, we implement
it in the same way as ext4 and xfs do.
For an overview of other ways to implement it please
refere to commit 7dcf5c3e45 ("xfs: add RENAME_WHITEOUT support").

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-02 22:55:02 +02:00
Daniel Golle 380bc8b710 ubifs: Update comment for ubifs_errc
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-07-29 23:30:26 +02:00
Linus Torvalds ba5a2655c2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull remaining vfs xattr work from Al Viro:
 "The rest of work.xattr (non-cifs conversions)"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  btrfs: Switch to generic xattr handlers
  ubifs: Switch to generic xattr handlers
  jfs: Switch to generic xattr handlers
  jfs: Clean up xattr name mapping
  gfs2: Switch to generic xattr handlers
  ceph: kill __ceph_removexattr()
  ceph: Switch to generic xattr handlers
  ceph: Get rid of d_find_alias in ceph_set_acl
2016-05-18 10:08:45 -07:00
Andreas Gruenbacher 2b88fc21ca ubifs: Switch to generic xattr handlers
Ubifs internally uses special inodes for storing xattrs. Those inodes
had NULL {get,set,remove}xattr inode operations before this change, so
xattr operations on them would fail. The super block's s_xattr field
would also apply to those special inodes. However, the inodes are not
visible outside of ubifs, and so no xattr operations will ever be
carried out on them anyway.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-17 19:16:23 -04:00
Al Viro 84695ffee7 Merge getxattr prototype change into work.lookups
The rest of work.xattr stuff isn't needed for this branch
2016-05-02 19:45:47 -04:00
Al Viro ce23e64013 ->getxattr(): pass dentry and inode as separate arguments
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-04-11 00:48:00 -04:00
Kirill A. Shutemov 09cbfeaf1a mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
ago with promise that one day it will be possible to implement page
cache with bigger chunks than PAGE_SIZE.

This promise never materialized.  And unlikely will.

We have many places where PAGE_CACHE_SIZE assumed to be equal to
PAGE_SIZE.  And it's constant source of confusion on whether
PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
especially on the border between fs and mm.

Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
breakage to be doable.

Let's stop pretending that pages in page cache are special.  They are
not.

The changes are pretty straight-forward:

 - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

 - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

 - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};

 - page_cache_get() -> get_page();

 - page_cache_release() -> put_page();

This patch contains automated changes generated with coccinelle using
script below.  For some reason, coccinelle doesn't patch header files.
I've called spatch for them manually.

The only adjustment after coccinelle is revert of changes to
PAGE_CAHCE_ALIGN definition: we are going to drop it later.

There are few places in the code where coccinelle didn't reach.  I'll
fix them manually in a separate patch.  Comments and documentation also
will be addressed with the separate patch.

virtual patch

@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E

@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E

@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT

@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE

@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK

@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)

@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)

@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-04 10:41:08 -07:00
Joe Perches 3e7f2c5104 ubifs: Add logging functions for ubifs_msg, ubifs_err and ubifs_warn
The existing logging macros are fairly large and converting the
macros to functions make the object code smaller.

Use %pV and __builtin_return_address(0) as appropriate.

$ size fs/ubifs/built-in.o*
   text	   data	    bss	    dec	    hex	filename
 575831	 309688	 161312	1046831	  ff92f	fs/ubifs/built-in.o.allyesconfig.new
 622457	 312872	 161120	1096449	 10bb01	fs/ubifs/built-in.o.allyesconfig.old
 223785	    640	    644	 225069	  36f2d	fs/ubifs/built-in.o.defconfig.new
 251873	    640	    644	 253157	  3dce5	fs/ubifs/built-in.o.defconfig.old

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-03-20 21:36:05 +01:00
Linus Torvalds 5d2eb548b3 Merge branch 'for-linus-3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs xattr cleanups from Al Viro.

* 'for-linus-3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  f2fs: xattr simplifications
  squashfs: xattr simplifications
  9p: xattr simplifications
  xattr handlers: Pass handler to operations instead of flags
  jffs2: Add missing capability check for listing trusted xattrs
  hfsplus: Remove unused xattr handler list operations
  ubifs: Remove unused security xattr handler
  vfs: Fix the posix_acl_xattr_list return value
  vfs: Check attribute names in posix acl xattr handers
2015-11-13 18:02:30 -08:00
Andreas Gruenbacher 13d3408f10 ubifs: Remove unused security xattr handler
Ubifs installs a security xattr handler in sb->s_xattr but doesn't use the
generic_{get,set,list,remove}xattr inode operations needed for processing
this list of attribute handlers; the handler is never called.  Instead,
ubifs uses its own xattr handlers which also process security xattrs.

Remove the dead code.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Richard Weinberger <richard@nod.at>
Cc: Artem Bityutskiy <dedekind1@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: linux-mtd@lists.infradead.org
Cc: Subodh Nijsure <snijsure@grid-net.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-11-13 20:34:29 -05:00
Dongsheng Yang 8c1c5f2638 ubifs: introduce UBIFS_ATIME_SUPPORT to ubifs
To make ubifs support atime flexily, this commit introduces
a Kconfig option named as UBIFS_ATIME_SUPPORT.

With UBIFS_ATIME_SUPPORT=n:
	ubifs keeps the full compatibility to no_atime from
the start of ubifs.

=================UBIFS_ATIME_SUPPORT=n=======================
-o - no atime
-o atime - no atime
-o noatime - no atime
-o relatime - no atime
-o strictatime - no atime
-o lazyatime - no atime

With UBIFS_ATIME_SUPPORT=y:
	ubifs supports the atime same with other main stream
file systems.
=================UBIFS_ATIME_SUPPORT=y=======================
-o - default behavior (relatime currently)
-o atime - atime support
-o noatime - no atime support
-o relatime - relative atime support
-o strictatime - strict atime support
-o lazyatime - lazy atime support

Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
Reviewed-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-11-07 11:35:08 +01:00
Dongsheng Yang 7d25b361a2 UBIFS: fix a typo in comment of ubifs_budget_req
s/now/how

Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
Reviewed-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2015-10-03 20:03:25 +02:00
Sheng Yong 235c362bd0 UBIFS: extend debug/message capabilities
In the case where we have more than one volumes on different UBI
devices, it may be not that easy to tell which volume prints the
messages.  Add ubi number and volume id in ubifs_msg/warn/error
to help debug. These two values are passed by struct ubifs_info.

For those where ubifs_info is not initialized yet, ubifs_* is
replaced by pr_*. For those where ubifs_info is not avaliable,
ubifs_info is passed to the calling function as a const parameter.

The output looks like,

[   95.444879] UBIFS (ubi0:1): background thread "ubifs_bgt0_1" started, PID 696
[   95.484688] UBIFS (ubi0:1): UBIFS: mounted UBI device 0, volume 1, name "test1"
[   95.484694] UBIFS (ubi0:1): LEB size: 126976 bytes (124 KiB), min./max. I/O unit sizes: 2048 bytes/2048 bytes
[   95.484699] UBIFS (ubi0:1): FS size: 30220288 bytes (28 MiB, 238 LEBs), journal size 1523712 bytes (1 MiB, 12 LEBs)
[   95.484703] UBIFS (ubi0:1): reserved for root: 1427378 bytes (1393 KiB)
[   95.484709] UBIFS (ubi0:1): media format: w4/r0 (latest is w4/r0), UUID 40DFFC0E-70BE-4193-8905-F7D6DFE60B17, small LPT model
[   95.489875] UBIFS (ubi1:0): background thread "ubifs_bgt1_0" started, PID 699
[   95.529713] UBIFS (ubi1:0): UBIFS: mounted UBI device 1, volume 0, name "test2"
[   95.529718] UBIFS (ubi1:0): LEB size: 126976 bytes (124 KiB), min./max. I/O unit sizes: 2048 bytes/2048 bytes
[   95.529724] UBIFS (ubi1:0): FS size: 19808256 bytes (18 MiB, 156 LEBs), journal size 1015809 bytes (0 MiB, 8 LEBs)
[   95.529727] UBIFS (ubi1:0): reserved for root: 935592 bytes (913 KiB)
[   95.529733] UBIFS (ubi1:0): media format: w4/r0 (latest is w4/r0), UUID EEB7779D-F419-4CA9-811B-831CAC7233D4, small LPT model

[  954.264767] UBIFS error (ubi1:0 pid 756): ubifs_read_node: bad node type (255 but expected 6)
[  954.367030] UBIFS error (ubi1:0 pid 756): ubifs_read_node: bad node at LEB 0:0, LEB mapping status 1

Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2015-03-25 11:08:41 +02:00
Yannick Guerrini d3f9db00d0 UBIFS: Fix trivial typos in comments
Change 'comress' to 'compress'
Change 'inteval' to 'interval'
Change 'disabe' to 'disable'
Change 'nenver' to 'never'

Signed-off-by: Yannick Guerrini <yguerrini@tomshardware.fr>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2015-03-25 11:08:41 +02:00
Subodh Nijsure d7f0b70d30 UBIFS: Add security.* XATTR support for the UBIFS
Artem: rename static functions so that they do not use the "ubifs_" prefix - we
       only use this prefix for non-static functions.
Artem: remove few junk white-space changes in file.c

Signed-off-by: Subodh Nijsure <snijsure@grid-net.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Ben Shelton <ben.shelton@ni.com>
Acked-by: Brad Mouring <brad.mouring@ni.com>
Acked-by: Terry Wilcox <terry.wilcox@ni.com>
Acked-by: Gratian Crisan <gratian.crisan@ni.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2015-01-28 16:08:54 +01:00
Artem Bityutskiy 07e19dff63 UBIFS: remove mst_mutex
The 'mst_mutex' is not needed since because 'ubifs_write_master()' is only
called on the mount path and commit path. The mount path is sequential and
there is no parallelism, and the commit path is also serialized - there is only
one commit going on at a time.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-07-19 09:53:52 +03:00
hujianyang c7b5bb0beb UBIFS: remove useless @ecc in struct ubifs_scan_leb
We set @ecc in ubifs_scan_leb only if leb_read returns EBADMSG and
do not use it any more. This patch removes this variable and adds
comments about EBADMSG handling.

Artem: re-phrase commentaries

Signed-off-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-07-19 09:53:51 +03:00
Daniel Golle 90bea5a3f0 UBIFS: respect MS_SILENT mount flag
When attempting to mount a non-ubifs formatted volume, lots of error
messages (including a stack dump) are thrown to the kernel log even if
the MS_SILENT mount flag is set.
Fix this by introducing adding an additional state-variable in
struct ubifs_info and suppress error messages in ubifs_read_node if
MS_SILENT is set.

Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2014-06-02 18:00:52 +03:00
Dave Chinner 1ab6c4997e fs: convert fs shrinkers to new scan/count API
Convert the filesystem shrinkers to use the new API, and standardise some
of the behaviours of the shrinkers at the same time.  For example,
nr_to_scan means the number of objects to scan, not the number of objects
to free.

I refactored the CIFS idmap shrinker a little - it really needs to be
broken up into a shrinker per tree and keep an item count with the tree
root so that we don't need to walk the tree every time the shrinker needs
to count the number of objects in the tree (i.e.  all the time under
memory pressure).

[glommer@openvz.org: fixes for ext4, ubifs, nfs, cifs and glock. Fixes are needed mainly due to new code merged in the tree]
[assorted fixes folded in]
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Glauber Costa <glommer@openvz.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Acked-by: Jan Kara <jack@suse.cz>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: Arve Hjønnevåg <arve@android.com>
Cc: Carlos Maiolino <cmaiolino@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Rientjes <rientjes@google.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: J. Bruce Fields <bfields@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Kent Overstreet <koverstreet@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-09-10 18:56:31 -04:00
Adam Thomas 8afd500cb5 UBIFS: fix double free of ubifs_orphan objects
The last orphan in the dnext list has its dnext set to NULL. Because
of that, ubifs_delete_orphan assumes that it is not on the dnext list
and frees it immediately instead ignoring it as a second delete. The
orphan is later freed again by erase_deleted.

This change adds an explicit flag to ubifs_orphan indicating whether
it is pending delete.

Signed-off-by: Adam Thomas <adamthomas1111@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: stable@vger.kernel.org
2013-02-04 12:31:48 +02:00
Adam Thomas 2928f0d0c5 UBIFS: fix use of freed ubifs_orphan objects
The last orphan in the cnext list has its cnext set to NULL. Because
of that, ubifs_delete_orphan assumes that it is not on the cnext list
and frees it immediately instead of adding it to the dnext list. The
freed orphan is later modified by write_orph_node.

This can cause various inconsistencies including directory entries
that cannot be removed and this error:

UBIFS error (pid 20685): layout_cnodes: LPT out of space at LEB 14:129009 needing 17, done_ltab 1, done_lsave 1

This is a regression introduced by
"7074e5eb UBIFS: remove invalid reference to list iterator variable".

This change adds an explicit flag to ubifs_orphan indicating whether
it is pending commit.

Signed-off-by: Adam Thomas <adamthomas1111@gmail.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org # v3.6+
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2013-02-04 12:31:00 +02:00
Artem Bityutskiy 98a1eebda3 UBIFS: introduce categorized lprops counter
This commit is a preparation for a subsequent bugfix. We introduce a
counter for categorized lprops.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: stable@vger.kernel.org
2012-10-26 16:00:26 +03:00
Linus Torvalds 782c3fb22b No big changes for 3.7 in UBIFS:
* Error reporting and debug printing improvements
 * Power cut emulation fixes
 * Minor cleanups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJQaumRAAoJECmIfjd9wqK016UQAImRrzhPoegy2Pr7j/CpzUST
 wJru3QlgdHgjukWer/s4CGTTjmTLMgLAllUnJLO3yn+Po4o67RJxI4SjhMf5lwDG
 TSmB/ttQpRmYxaaS2EOC6Ow+jdMM2MhAsPrD8nAfLvOm/XduuSyquPY/xFa6U4rJ
 oE4w23b5t2BZI/UJmsCaYs6Rj1h8w1aZUukii+J9BN8DGygn/vJuI9EDNKdtQmxe
 vmoT2uWKPyL859kY/+lONH048NWMkEB3BhNmuAsFqGY/tdQRdmeyhgWEKNjbmc/M
 DXZe7ovy5umUyw6iDpfhEhmOBdV9xorDySsmKNP1/q60rUGD6R/tynC4y1q1IueB
 c6fbry6xpwKIccMS/w7x9cbZxMnG++iQoj71pr2FV9Bg5Xd7XduM/XjzI5rMHuqv
 EnmcwT/LIu5Lw1vMU8pfW2yIdTkotNR/2aca1bs0f4WtS3AhngdqmNbFmddcpY37
 qvTYDrSMOW/IaegiDQzucVXNhIs8ufIULZzmj9CtmgeyvNVJboTJRik+HJDWFqeC
 04TaM8k+Yjp/isbzTmWEyBG3G06cNpMwLtT5OKcpRVQ1ZYu62AbnR+Dbep6n0qt+
 gBORW8TTY32GudeSiHLKCrOsrkx3zNSli6T4Sw9YD5e29Dee62KRTgjKVplezQfB
 Djh70vUyX2tHdYlW88gf
 =gM4M
 -----END PGP SIGNATURE-----

Merge tag 'upstream-3.7-rc1' of git://git.infradead.org/linux-ubifs

Pull ubifs changes from Artem Bityutskiy:
 "No big changes for 3.7 in UBIFS:
   - Error reporting and debug printing improvements
   - Power cut emulation fixes
   - Minor cleanups"

Fix trivial conflict in fs/ubifs/debug.c due to the user namespace
changes.

* tag 'upstream-3.7-rc1' of git://git.infradead.org/linux-ubifs:
  UBIFS: print less
  UBIFS: use pr_ helper instead of printk
  UBIFS: comply with coding style
  UBIFS: use __aligned() attribute
  UBIFS: remove __DATE__ and __TIME__
  UBIFS: fix power cut emulation for mtdram
  UBIFS: improve scanning debug output
  UBIFS: always print full error reports
  UBIFS: print PID in debug messages
2012-10-02 20:47:48 -07:00
Eric W. Biederman 39241beb78 userns: Convert ubifs to use kuid/kgid
Cc: Artem Bityutskiy <dedekind1@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-09-21 03:13:36 -07:00
Artem Bityutskiy 6b38d03f48 UBIFS: use pr_ helper instead of printk
Use 'pr_err()' instead of 'printk(KERN_ERR', etc.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2012-08-31 17:32:57 +03:00
Richard Weinberger b36a261e8c UBI: Kill data type hint
We do not need this feature and to our shame it even was not working
and there was a bug found very recently.
	-- Artem Bityutskiy

Without the data type hint UBI2 (fastmap) will be easier to implement.

Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2012-05-20 20:25:59 +03:00
Artem Bityutskiy f70b7e52aa UBIFS: remove Kconfig debugging option
Have the debugging stuff always compiled-in instead. It simplifies maintanance
a lot.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2012-05-16 19:53:46 +03:00
Artem Bityutskiy 7ca58bad69 UBIFS: kill CUR_MAX_KEY_LEN macro
It is useless and confusing and may make people believe they may just
change it, which is not true, because this will also change the on-flash
format.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2012-02-29 18:43:01 +02:00
Al Viro ad44be5c78 ubifs: propagate umode_t
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:55:06 -05:00
Linus Torvalds bbd9d6f7fb Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (107 commits)
  vfs: use ERR_CAST for err-ptr tossing in lookup_instantiate_filp
  isofs: Remove global fs lock
  jffs2: fix IN_DELETE_SELF on overwriting rename() killing a directory
  fix IN_DELETE_SELF on overwriting rename() on ramfs et.al.
  mm/truncate.c: fix build for CONFIG_BLOCK not enabled
  fs:update the NOTE of the file_operations structure
  Remove dead code in dget_parent()
  AFS: Fix silly characters in a comment
  switch d_add_ci() to d_splice_alias() in "found negative" case as well
  simplify gfs2_lookup()
  jfs_lookup(): don't bother with . or ..
  get rid of useless dget_parent() in btrfs rename() and link()
  get rid of useless dget_parent() in fs/btrfs/ioctl.c
  fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers
  drivers: fix up various ->llseek() implementations
  fs: handle SEEK_HOLE/SEEK_DATA properly in all fs's that define their own llseek
  Ext4: handle SEEK_HOLE/SEEK_DATA generically
  Btrfs: implement our own ->llseek
  fs: add SEEK_HOLE and SEEK_DATA flags
  reiserfs: make reiserfs default to barrier=flush
  ...

Fix up trivial conflicts in fs/xfs/linux-2.6/xfs_super.c due to the new
shrinker callout for the inode cache, that clashed with the xfs code to
start the periodic workers later.
2011-07-22 19:02:39 -07:00
Josef Bacik 02c24a8218 fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers
Btrfs needs to be able to control how filemap_write_and_wait_range() is called
in fsync to make it less of a painful operation, so push down taking i_mutex and
the calling of filemap_write_and_wait() down into the ->fsync() handlers.  Some
file systems can drop taking the i_mutex altogether it seems, like ext3 and
ocfs2.  For correctness sake I just pushed everything down in all cases to make
sure that we keep the current behavior the same for everybody, and then each
individual fs maintainer can make up their mind about what to do from there.
Thanks,

Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20 20:47:59 -04:00
Artem Bityutskiy 83cef708c6 UBIFS: introduce more I/O helpers
Introduce the following I/O helper functions: 'ubifs_leb_read()',
'ubifs_leb_write()', 'ubifs_leb_change()', 'ubifs_leb_unmap()',
'ubifs_leb_map()', 'ubifs_is_mapped().

The idea is to wrap all UBI I/O functions in order to encapsulate various
assertions and error path handling (error message, stack dump, switching to R/O
mode). And there are some other benefits of this which will be used in the
following patches.

This patch does not switch whole UBIFS to use these functions yet.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-07-04 10:54:33 +03:00
Artem Bityutskiy 3766244769 UBIFS: use correct flags in lprops
The UBIFS lpt tree is in many aspects similar to the TNC tree, and we have
similar flags for these trees. And by mistake we use the COW_ZNODE flag for
LPT in some places, instead of the right flag COW_CNODE. And this works
only because these two constants have the same value.

This patch makes all the LPT code to use COW_CNODE and also changes COW_CNODE
constant value to make sure we do not misuse the flags any more.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-07-04 10:54:27 +03:00