1
0
Fork 0
Commit Graph

222 Commits (2e6883bdf49abd0e7f0d9b6297fc3be7ebb2250b)

Author SHA1 Message Date
Fengguang Wu 08d8e9749e writeback: fix ntfs with sb_has_dirty_inodes()
NTFS's if-condition on dirty inodes is not complete.  Fix it with
sb_has_dirty_inodes().

Cc: Anton Altaparmakov <aia21@cantab.net>
Cc: Ken Chen <kenchen@google.com>
Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:43:02 -07:00
Ken Chen 0e0f4fc22e writeback: fix periodic superblock dirty inode flushing
Current -mm tree has bucketful of bug fixes in periodic writeback path.
However, we still hit a glitch where dirty pages on a given inode aren't
completely flushed to the disk, and system will accumulate large amount of
dirty pages beyond what dirty_expire_interval is designed for.

The problem is __sync_single_inode() will move an inode to sb->s_dirty list
even when there are more pending dirty pages on that inode.  If there is
another inode with a small number of dirty pages, we hit a case where the loop
iteration in wb_kupdate() terminates prematurely because wbc.nr_to_write > 0.
Thus leaving the inode that has large amount of dirty pages behind and it has
to wait for another dirty_writeback_interval before we flush it again.  We
effectively only write out MAX_WRITEBACK_PAGES every dirty_writeback_interval.
If the rate of dirtying is sufficiently high, the system will start
accumulate a large number of dirty pages.

So fix it by having another sb->s_more_io list on which to park the inode
while we iterate through sb->s_io and to allow each dirty inode which resides
on that sb to have an equal chance of flushing some amount of dirty pages.

Signed-off-by: Ken Chen <kenchen@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:43:02 -07:00
Mathieu Desnoyers 2b47c3611d Fix f_version type: should be u64 instead of unsigned long
Fix f_version type: should be u64 instead of long

There is a type inconsistency between struct inode i_version and struct file
f_version.

fs.h:

struct inode
  u64                     i_version;

and

struct file
  unsigned long           f_version;

Users do:

fs/ext3/dir.c:

if (filp->f_version != inode->i_version) {

So why isn't f_version a u64 ? It becomes a problem if versions gets
higher than 2^32 and we are on an architecture where longs are 32 bits.

This patch changes the f_version type to u64, and updates the users accordingly.

It applies to 2.6.23-rc2-mm2.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Martin Bligh <mbligh@google.com>
Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: <linux-ext4@vger.kernel.org>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:53 -07:00
Adrian Bunk 4a239427f2 make fs/libfs.c:simple_commit_write() static
simple_commit_write() can now become static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:53 -07:00
Denis Cheng 74bf17cffc fs: remove the unused mempages parameter
Since the mempages parameter is actually not used, they should be removed.

Now there is only files_init use the mempages parameter,

 	files_init(mempages);

but I don't think the adaptation to mempages in files_init is really
useful; and if files_init also changed to the prototype void (*func)(void),
the wrapper vfs_caches_init would also not need the mempages parameter.

Signed-off-by: Denis Cheng <crquan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:49 -07:00
Alexey Dobriyan 4be28540ee Remove sysctl.h from fs.h
Rrrr, addition of sysctl.h to fs.h was't very smart, because simple
editing of the former will buy you big recompile, where it shouldn't
have to.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:48 -07:00
Nick Piggin 55144768e1 fs: remove some AOP_TRUNCATED_PAGE
prepare/commit_write no longer returns AOP_TRUNCATED_PAGE since OCFS2 and
GFS2 were converted to the new aops, so we can make some simplifications
for that.

[michal.k.k.piotrowski@gmail.com: fix warning]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Michael Halcrow <mhalcrow@us.ibm.com>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:58 -07:00
Nick Piggin 89e107877b fs: new cont helpers
Rework the generic block "cont" routines to handle the new aops.  Supporting
cont_prepare_write would take quite a lot of code to support, so remove it
instead (and we later convert all filesystems to use it).

write_begin gets passed AOP_FLAG_CONT_EXPAND when called from
generic_cont_expand, so filesystems can avoid the old hacks they used.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:55 -07:00
Nick Piggin afddba49d1 fs: introduce write_begin, write_end, and perform_write aops
These are intended to replace prepare_write and commit_write with more
flexible alternatives that are also able to avoid the buffered write
deadlock problems efficiently (which prepare_write is unable to do).

[mark.fasheh@oracle.com: API design contributions, code review and fixes]
[akpm@linux-foundation.org: various fixes]
[dmonakhov@sw.ru: new aop block_write_begin fix]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Dmitriy Monakhov <dmonakhov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:55 -07:00
Nick Piggin 2f718ffc16 mm: buffered write iterator
Add an iterator data structure to operate over an iovec.  Add usercopy
operators needed by generic_file_buffered_write, and convert that function
over.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:55 -07:00
Fengguang Wu f4e6b498d6 readahead: combine file_ra_state.prev_index/prev_offset into prev_pos
Combine the file_ra_state members
				unsigned long prev_index
				unsigned int prev_offset
into
				loff_t prev_pos

It is more consistent and better supports huge files.

Thanks to Peter for the nice proposal!

[akpm@linux-foundation.org: fix shift overflow]
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:52 -07:00
Fengguang Wu 0bb7ba6b9c readahead: mmap read-around simplification
Fold file_ra_state.mmap_hit into file_ra_state.mmap_miss and make it an int.

Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:52 -07:00
Fengguang Wu 937085aa35 readahead: compacting file_ra_state
Use 'unsigned int' instead of 'unsigned long' for readahead sizes.

This helps reduce memory consumption on 64bit CPU when a lot of files are
opened.

CC: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:52 -07:00
Linus Torvalds 541010e4b8 Merge branch 'locks' of git://linux-nfs.org/~bfields/linux
* 'locks' of git://linux-nfs.org/~bfields/linux:
  nfsd: remove IS_ISMNDLCK macro
  Rework /proc/locks via seq_files and seq_list helpers
  fs/locks.c: use list_for_each_entry() instead of list_for_each()
  NFS: clean up explicit check for mandatory locks
  AFS: clean up explicit check for mandatory locks
  9PFS: clean up explicit check for mandatory locks
  GFS2: clean up explicit check for mandatory locks
  Cleanup macros for distinguishing mandatory locks
  Documentation: move locks.txt in filesystems/
  locks: add warning about mandatory locking races
  Documentation: move mandatory locking documentation to filesystems/
  locks: Fix potential OOPS in generic_setlease()
  Use list_first_entry in locks_wake_up_blocks
  locks: fix flock_lock_file() comment
  Memory shortage can result in inconsistent flocks state
  locks: kill redundant local variable
  locks: reverse order of posix_locks_conflict() arguments
2007-10-15 16:07:40 -07:00
Peter Zijlstra 14358e6dda lockdep: annotate dir vs file i_mutex
On Mon, 2007-09-24 at 22:13 -0400, Steven Rostedt wrote:
> The circular lock seems to be this:
> 
> #1:
> 
>   sys_mmap2:              down_write(&mm->mmap_sem);
>   nfs_revalidate_mapping: mutex_lock(&inode->i_mutex);
> 
> 
> #0:
> 
>   vfs_readdir:     mutex_lock(&inode->i_mutex);
>    - during the readdir (filldir64), we take a user fault (missing page?)
>     and call do_page_fault -
>   do_page_fault:   down_read(&mm->mmap_sem);
> 
> 
> So it does indeed look like a circular locking. Now the question is, "is
> this a bug?".  Looking like the inode of #1 must be a file or something
> else that you can mmap and the inode of #0 seems it must be a directory.
> I would say "no".
> 
> Now if you can readdir on a file or mmap a directory, then this could be
> an issue.
> 
> Otherwise, I'd love to see someone teach lockdep about this issue! ;-)

Make a distinction between file and dir usage of i_mutex.
The inode should be complete and unused at unlock_new_inode(), re-init
i_mutex depending on its type.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
2007-10-14 01:38:33 +02:00
Peter Zijlstra d475fd428c lockdep: per filesystem inode lock class
Give each filesystem its own inode lock class. The various filesystems have
different locking order wrt the inode locks; esp. the pseudo filesystems differ
from the rest.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
2007-10-15 14:51:31 +02:00
Pavel Emelyanov 7f8ada98d9 Rework /proc/locks via seq_files and seq_list helpers
Currently /proc/locks is shown with a proc_read function, but its behavior
is rather complex as it has to manually handle current offset and buffer
length.  On the other hand, files that show objects from lists can be
easily reimplemented using the sequential files and the seq_list_XXX()
helpers.

This saves (as usually) 16 lines of code and more than 200 from
the .text section.

[akpm@linux-foundation.org: no externs in C]
[akpm@linux-foundation.org: warning fixes]
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2007-10-09 18:32:46 -04:00
Pavel Emelyanov a16877ca9c Cleanup macros for distinguishing mandatory locks
The combination of S_ISGID bit set and S_IXGRP bit unset is used to mark the
inode as "mandatory lockable" and there's a macro for this check called
MANDATORY_LOCK(inode).  However, fs/locks.c and some filesystems still perform
the explicit i_mode checking.  Besides, Andrew pointed out, that this macro is
buggy itself, as it dereferences the inode arg twice.

Convert this macro into static inline function and switch its users to it,
making the code shorter and more readable.

The __mandatory_lock() helper is to be used in places where the IS_MANDLOCK()
for superblock is already known to be true.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2007-10-09 18:32:46 -04:00
Adrian Bunk ec05b297f9 [PATCH] remove mm/filemap.c:file_send_actor()
This patch removes the no longer used file_send_actor().

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-08-11 22:34:47 +02:00
Christoph Hellwig 0af1a45046 rename setlease to generic_setlease
Make it a little more clear that this is the default implementation for
the setleast operation.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Acked-by: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-31 15:39:43 -07:00
Fengguang Wu f9acc8c7b3 readahead: sanify file_ra_state names
Rename some file_ra_state variables and remove some accessors.

It results in much simpler code.
Kudos to Rusty!

Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:44 -07:00
Fengguang Wu c743d96b6d readahead: remove the old algorithm
Remove the old readahead algorithm.

Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
Cc: Steven Pratt <slpratt@austin.ibm.com>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:44 -07:00
Fengguang Wu 5ce1110b92 readahead: data structure and routines
Extend struct file_ra_state to support the on-demand readahead logic.  Also
define some helpers for it.

Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
Cc: Steven Pratt <slpratt@austin.ibm.com>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:44 -07:00
Akinobu Mita e53252d97e unregister_chrdev() return void
unregister_chrdev() does not return meaningful value.  This patch makes it
return void like most unregister_* functions.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:43 -07:00
Linus Torvalds a8dcf12f9e Merge branch 'for-linus' of git://linux-nfs.org/~bfields/linux
* 'for-linus' of git://linux-nfs.org/~bfields/linux:
  locks: fix vfs_test_lock() comment
  locks: make posix_test_lock() interface more consistent
  nfs: disable leases over NFS
  gfs2: stop giving out non-cluster-coherent leases
  locks: export setlease to filesystems
  locks: provide a file lease method enabling cluster-coherent leases
  locks: rename lease functions to reflect locks.c conventions
  locks: share more common lease code
  locks: clean up lease_alloc()
  locks: convert an -EINVAL return to a BUG
  leases: minor break_lease() comment clarification
2007-07-18 18:27:00 -07:00
J. Bruce Fields 6d34ac199a locks: make posix_test_lock() interface more consistent
Since posix_test_lock(), like fcntl() and ->lock(), indicates absence or
presence of a conflict lock by setting fl_type to, respectively, F_UNLCK
or something other than F_UNLCK, the return value is no longer needed.

Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
2007-07-18 19:17:19 -04:00
J. Bruce Fields 4698afe8e3 locks: export setlease to filesystems
Export setlease so it can used by filesystems to implement their lease
methods.

Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
2007-07-18 19:17:06 -04:00
J. Bruce Fields f9ffed26d6 locks: provide a file lease method enabling cluster-coherent leases
Currently leases are only kept locally, so there's no way for a distributed
filesystem to enforce them against multiple clients.  We're particularly
interested in the case of nfsd exporting a cluster filesystem, in which
case nfsd needs cluster-coherent leases in order to implement delegations
correctly.

Also add some documentation.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2007-07-18 19:14:47 -04:00
J. Bruce Fields a9933cea7a locks: rename lease functions to reflect locks.c conventions
We've been using the convention that vfs_foo is the function that calls
a filesystem-specific foo method if it exists, or falls back on a
generic method if it doesn't; thus vfs_foo is what is called when some
other part of the kernel (normally lockd or nfsd) wants to get a lock,
whereas foo is what filesystems call to use the underlying local
functionality as part of their lock implementation.

So rename setlease to vfs_setlease (which will call a
filesystem-specific setlease after a later patch) and __setlease to
setlease.

Also, vfs_setlease need only be GPL-exported as long as it's only needed
by lockd and nfsd.

Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
2007-07-18 19:14:12 -04:00
Amit Arora 97ac73506c sys_fallocate() implementation on i386, x86_64 and powerpc
fallocate() is a new system call being proposed here which will allow
applications to preallocate space to any file(s) in a file system.
Each file system implementation that wants to use this feature will need
to support an inode operation called ->fallocate().
Applications can use this feature to avoid fragmentation to certain
level and thus get faster access speed. With preallocation, applications
also get a guarantee of space for particular file(s) - even if later the
the system becomes full.

Currently, glibc provides an interface called posix_fallocate() which
can be used for similar cause. Though this has the advantage of working
on all file systems, but it is quite slow (since it writes zeroes to
each block that has to be preallocated). Without a doubt, file systems
can do this more efficiently within the kernel, by implementing
the proposed fallocate() system call. It is expected that
posix_fallocate() will be modified to call this new system call first
and incase the kernel/filesystem does not implement it, it should fall
back to the current implementation of writing zeroes to the new blocks.
ToDos:
1. Implementation on other architectures (other than i386, x86_64,
   and ppc). Patches for s390(x) and ia64 are already available from
   previous posts, but it was decided that they should be added later
   once fallocate is in the mainline. Hence not including those patches
   in this take.
2. Changes to glibc,
   a) to support fallocate() system call
   b) to make posix_fallocate() and posix_fallocate64() call fallocate()

Signed-off-by: Amit Arora <aarora@in.ibm.com>
2007-07-17 21:42:44 -04:00
Satyam Sharma 3bd858ab1c Introduce is_owner_or_cap() to wrap CAP_FOWNER use with fsuid check
Introduce is_owner_or_cap() macro in fs.h, and convert over relevant
users to it. This is done because we want to avoid bugs in the future
where we check for only effective fsuid of the current task against a
file's owning uid, without simultaneously checking for CAP_FOWNER as
well, thus violating its semantics.
[ XFS uses special macros and structures, and in general looked ...
untouchable, so we leave it alone -- but it has been looked over. ]

The (current->fsuid != inode->i_uid) check in generic_permission() and
exec_permission_lite() is left alone, because those operations are
covered by CAP_DAC_OVERRIDE and CAP_DAC_READ_SEARCH. Similarly operations
falling under the purview of CAP_CHOWN and CAP_LEASE are also left alone.

Signed-off-by: Satyam Sharma <ssatyam@cse.iitk.ac.in>
Cc: Al Viro <viro@ftp.linux.org.uk>
Acked-by: Serge E. Hallyn <serge@hallyn.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 12:00:03 -07:00
Christoph Hellwig a569425512 knfsd: exportfs: add exportfs.h header
currently the export_operation structure and helpers related to it are in
fs.h.  fs.h is already far too large and there are very few places needing the
export bits, so split them off into a separate header.

[akpm@linux-foundation.org: fix cifs build]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Cc: Steven French <sfrench@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:23:06 -07:00
Akinobu Mita f4480240f7 unregister_blkdev(): return void
Put WARN_ON and fixed all callers of unregister_blkdev().  Now we can make
unregister_blkdev return void.

Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:23:03 -07:00
Adrian Bunk 62239ac2b3 proper prototype for proc_nr_files()
Add a proper prototype for proc_nr_files() in include/linux/fs.h

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:23:03 -07:00
Stefan Richter 9e7bf24b1b fs: clarify "dummy" member in struct inodes_stat_t
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:45 -07:00
David Howells e8d6c55412 AFS: implement file locking
Implement file locking for AFS.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:43 -07:00
Andrew Morton fc9a07e7bf invalidate_mapping_pages(): add cond_resched
invalidate_mapping_pages() can sometimes take a long time (millions of pages
to free).  Long enough for the softlockup detector to trigger.

We used to have a cond_resched() in there but I took it out because the
drop_caches code calls invalidate_mapping_pages() under inode_lock.

The patch adds a nasty flag and puts the cond_resched() back.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:36 -07:00
Jens Axboe d96e6e7164 Remove remnants of sendfile()
There are now zero users of .sendfile() in the kernel, so kill
it from the file_operations structure and in do_sendfile().

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-07-10 08:04:15 +02:00
Carsten Otte d054fe3d10 xip sendfile removal
This patch removes xip_file_sendfile, the sendfile implementation for
xip without replacement. Those customers that use xip on s390 are not
using sendfile() as far as we know, and so far s390 is the only platform
this could potentially be used on so far.
Having sendfile is not a popular feature for execute in place file
systems, however we have a working implementation of splice_read() based
on fs/splice.c if anyone asks for it.
At this point in time, it does not seem preferable to merge
splice_read() for xip because it causes extra maintenence effort due to
code duplication and it requires struct page behind the xip memory
segment. We'd like to get rid of that in favor of supporting flash based
embedded platforms (Monta Vista work) soon.

Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-07-10 08:04:15 +02:00
Jens Axboe 0452a4e5d0 sendfile: kill generic_file_sendfile()
It's no longer used.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-07-10 08:04:14 +02:00
Dave Hansen 71c4215790 document nlink function
These should have been documented from the beginning.  Fix it.

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-06-24 08:59:11 -07:00
Linus Torvalds ec4883b015 Merge git://git.infradead.org/mtd-2.6
* git://git.infradead.org/mtd-2.6:
  [JFFS2] Fix obsoletion of metadata nodes in jffs2_add_tn_to_tree()
  [MTD] Fix error checking after get_mtd_device() in get_sb_mtd functions
  [JFFS2] Fix buffer length calculations in jffs2_get_inode_nodes()
  [JFFS2] Fix potential memory leak of dead xattrs on unmount.
  [JFFS2] Fix BUG() caused by failing to discard xattrs on deleted files.
  [MTD] generalise the handling of MTD-specific superblocks
  [MTD] [MAPS] don't force uclinux mtd map to be root dev
2007-06-04 17:54:09 -07:00
David Howells acaebfd8a7 [MTD] generalise the handling of MTD-specific superblocks
Generalise the handling of MTD-specific superblocks so that JFFS2 and ROMFS
can both share it.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-05-11 12:14:15 +01:00
Mark Fasheh ef51c97623 Remove do_sync_file_range()
Remove do_sync_file_range() and convert callers to just use
do_sync_mapping_range().

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:04 -07:00
Miklos Szeredi 79c0b2df79 add filesystem subtype support
There's a slight problem with filesystem type representation in fuse
based filesystems.

From the kernel's view, there are just two filesystem types: fuse and
fuseblk.  From the user's view there are lots of different filesystem
types.  The user is not even much concerned if the filesystem is fuse based
or not.  So there's a conflict of interest in how this should be
represented in fstab, mtab and /proc/mounts.

The current scheme is to encode the real filesystem type in the mount
source.  So an sshfs mount looks like this:

  sshfs#user@server:/   /mnt/server    fuse   rw,nosuid,nodev,...

This url-ish syntax works OK for sshfs and similar filesystems.  However
for block device based filesystems (ntfs-3g, zfs) it doesn't work, since
the kernel expects the mount source to be a real device name.

A possibly better scheme would be to encode the real type in the type
field as "type.subtype".  So fuse mounts would look like this:

  /dev/hda1       /mnt/windows   fuseblk.ntfs-3g   rw,...
  user@server:/   /mnt/server    fuse.sshfs        rw,nosuid,nodev,...

This patch adds the necessary code to the kernel so that this can be
correctly displayed in /proc/mounts.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:01 -07:00
Chris Snook 1ae7075bcd use use SEEK_MAX to validate user lseek arguments
Add SEEK_MAX and use it to validate lseek arguments from userspace.

Signed-off-by: Chris Snook <csnook@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:14:59 -07:00
Dmitriy Monakhov 0ceb331433 mm: move common segment checks to separate helper function
[akpm@linux-foundation.org: cleanup]
Signed-off-by: Monakhov Dmitriy <dmonakhov@openvz.org>
Cc: Christoph Hellwig <hch@lst.de>
Acked-by: Anton Altaparmakov <aia21@cam.ac.uk>
Acked-by: David Chinner <dgc@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:14:57 -07:00
Linus Torvalds 2d56d3c43c Merge branch 'server-cluster-locking-api' of git://linux-nfs.org/~bfields/linux
* 'server-cluster-locking-api' of git://linux-nfs.org/~bfields/linux:
  gfs2: nfs lock support for gfs2
  lockd: add code to handle deferred lock requests
  lockd: always preallocate block in nlmsvc_lock()
  lockd: handle test_lock deferrals
  lockd: pass cookie in nlmsvc_testlock
  lockd: handle fl_grant callbacks
  lockd: save lock state on deferral
  locks: add fl_grant callback for asynchronous lock return
  nfsd4: Convert NFSv4 to new lock interface
  locks: add lock cancel command
  locks: allow {vfs,posix}_lock_file to return conflicting lock
  locks: factor out generic/filesystem switch from setlock code
  locks: factor out generic/filesystem switch from test_lock
  locks: give posix_test_lock same interface as ->lock
  locks: make ->lock release private data before returning in GETLK case
  locks: create posix-to-flock helper functions
  locks: trivial removal of unnecessary parentheses
2007-05-07 12:34:24 -07:00
Jan Kara 6ce745ed39 readahead: code cleanup
Rename file_ra_state.prev_page to prev_index and file_ra_state.offset to
prev_offset.  Also update of prev_index in do_generic_mapping_read() is now
moved close to the update of prev_offset.

[wfg@mail.ustc.edu.cn: fix it]
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: WU Fengguang <wfg@mail.ustc.edu.cn>
Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:12:52 -07:00
Jan Kara ec0f163722 readahead: improve heuristic detecting sequential reads
Introduce ra.offset and store in it an offset where the previous read
ended.  This way we can detect whether reads are really sequential (and
thus we should not mark the page as accessed repeatedly) or whether they
are random and just happen to be in the same page (and the page should
really be marked accessed again).

Signed-off-by: Jan Kara <jack@suse.cz>
Acked-by: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: WU Fengguang <wfg@mail.ustc.edu.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:12:52 -07:00