alistair23-linux/fs
David Howells 952efe7b78 FS-Cache: Add and document asynchronous operation handling
Add and document asynchronous operation handling for use by FS-Cache's data
storage and retrieval routines.

The following documentation is added to:

	Documentation/filesystems/caching/operations.txt

		       ================================
		       ASYNCHRONOUS OPERATIONS HANDLING
		       ================================

========
OVERVIEW
========

FS-Cache has an asynchronous operations handling facility that it uses for its
data storage and retrieval routines.  Its operations are represented by
fscache_operation structs, though these are usually embedded into some other
structure.

This facility is available to and expected to be be used by the cache backends,
and FS-Cache will create operations and pass them off to the appropriate cache
backend for completion.

To make use of this facility, <linux/fscache-cache.h> should be #included.

===============================
OPERATION RECORD INITIALISATION
===============================

An operation is recorded in an fscache_operation struct:

	struct fscache_operation {
		union {
			struct work_struct fast_work;
			struct slow_work slow_work;
		};
		unsigned long		flags;
		fscache_operation_processor_t processor;
		...
	};

Someone wanting to issue an operation should allocate something with this
struct embedded in it.  They should initialise it by calling:

	void fscache_operation_init(struct fscache_operation *op,
				    fscache_operation_release_t release);

with the operation to be initialised and the release function to use.

The op->flags parameter should be set to indicate the CPU time provision and
the exclusivity (see the Parameters section).

The op->fast_work, op->slow_work and op->processor flags should be set as
appropriate for the CPU time provision (see the Parameters section).

FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the
operation and waited for afterwards.

==========
PARAMETERS
==========

There are a number of parameters that can be set in the operation record's flag
parameter.  There are three options for the provision of CPU time in these
operations:

 (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD).  A thread
     may decide it wants to handle an operation itself without deferring it to
     another thread.

     This is, for example, used in read operations for calling readpages() on
     the backing filesystem in CacheFiles.  Although readpages() does an
     asynchronous data fetch, the determination of whether pages exist is done
     synchronously - and the netfs does not proceed until this has been
     determined.

     If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags
     before submitting the operation, and the operating thread must wait for it
     to be cleared before proceeding:

		wait_on_bit(&op->flags, FSCACHE_OP_WAITING,
			    fscache_wait_bit, TASK_UNINTERRUPTIBLE);

 (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it
     will be given to keventd to process.  Such an operation is not permitted
     to sleep on I/O.

     This is, for example, used by CacheFiles to copy data from a backing fs
     page to a netfs page after the backing fs has read the page in.

     If this option is used, op->fast_work and op->processor must be
     initialised before submitting the operation:

		INIT_WORK(&op->fast_work, do_some_work);

 (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it
     will be given to the slow work facility to process.  Such an operation is
     permitted to sleep on I/O.

     This is, for example, used by FS-Cache to handle background writes of
     pages that have just been fetched from a remote server.

     If this option is used, op->slow_work and op->processor must be
     initialised before submitting the operation:

		fscache_operation_init_slow(op, processor)

Furthermore, operations may be one of two types:

 (1) Exclusive (FSCACHE_OP_EXCLUSIVE).  Operations of this type may not run in
     conjunction with any other operation on the object being operated upon.

     An example of this is the attribute change operation, in which the file
     being written to may need truncation.

 (2) Shareable.  Operations of this type may be running simultaneously.  It's
     up to the operation implementation to prevent interference between other
     operations running at the same time.

=========
PROCEDURE
=========

Operations are used through the following procedure:

 (1) The submitting thread must allocate the operation and initialise it
     itself.  Normally this would be part of a more specific structure with the
     generic op embedded within.

 (2) The submitting thread must then submit the operation for processing using
     one of the following two functions:

	int fscache_submit_op(struct fscache_object *object,
			      struct fscache_operation *op);

	int fscache_submit_exclusive_op(struct fscache_object *object,
					struct fscache_operation *op);

     The first function should be used to submit non-exclusive ops and the
     second to submit exclusive ones.  The caller must still set the
     FSCACHE_OP_EXCLUSIVE flag.

     If successful, both functions will assign the operation to the specified
     object and return 0.  -ENOBUFS will be returned if the object specified is
     permanently unavailable.

     The operation manager will defer operations on an object that is still
     undergoing lookup or creation.  The operation will also be deferred if an
     operation of conflicting exclusivity is in progress on the object.

     If the operation is asynchronous, the manager will retain a reference to
     it, so the caller should put their reference to it by passing it to:

	void fscache_put_operation(struct fscache_operation *op);

 (3) If the submitting thread wants to do the work itself, and has marked the
     operation with FSCACHE_OP_MYTHREAD, then it should monitor
     FSCACHE_OP_WAITING as described above and check the state of the object if
     necessary (the object might have died whilst the thread was waiting).

     When it has finished doing its processing, it should call
     fscache_put_operation() on it.

 (4) The operation holds an effective lock upon the object, preventing other
     exclusive ops conflicting until it is released.  The operation can be
     enqueued for further immediate asynchronous processing by adjusting the
     CPU time provisioning option if necessary, eg:

	op->flags &= ~FSCACHE_OP_TYPE;
	op->flags |= ~FSCACHE_OP_FAST;

     and calling:

	void fscache_enqueue_operation(struct fscache_operation *op)

     This can be used to allow other things to have use of the worker thread
     pools.

=====================
ASYNCHRONOUS CALLBACK
=====================

When used in asynchronous mode, the worker thread pool will invoke the
processor method with a pointer to the operation.  This should then get at the
container struct by using container_of():

	static void fscache_write_op(struct fscache_operation *_op)
	{
		struct fscache_storage *op =
			container_of(_op, struct fscache_storage, op);
	...
	}

The caller holds a reference on the operation, and will invoke
fscache_put_operation() when the processor function returns.  The processor
function is at liberty to call fscache_enqueue_operation() or to take extra
references.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 16:42:39 +01:00
..
9p vfs: simple_set_mnt() should return void 2009-03-27 14:44:03 -04:00
adfs fs/adfs: return f_fsid for statfs(2) 2009-04-02 19:05:08 -07:00
affs fs/affs: return f_fsid for statfs(2) 2009-04-02 19:05:08 -07:00
afs proc 2/2: remove struct proc_dir_entry::owner 2009-03-31 01:14:44 +04:00
autofs constify dentry_operations: autofs, autofs4 2009-03-27 14:44:00 -04:00
autofs4 autofs4: fix lookup deadlock 2009-04-01 08:59:23 -07:00
befs fs/befs: return f_fsid for statfs(2) 2009-04-02 19:05:08 -07:00
bfs
btrfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2009-04-02 21:09:10 -07:00
cifs New helper - current_umask() 2009-03-31 23:00:26 -04:00
coda constify dentry_operations: misc filesystems 2009-03-27 14:44:00 -04:00
configfs constify dentry_operations: configfs 2009-03-27 14:44:03 -04:00
cramfs fs/cramfs: return f_fsid for statfs(2) 2009-04-02 19:05:08 -07:00
debugfs
devpts Merge code for single and multiple-instance mounts 2009-03-27 14:44:04 -04:00
dlm
ecryptfs ecryptfs: use kzfree() 2009-04-01 08:59:23 -07:00
efs fs/efs: return f_fsid for statfs(2) 2009-04-02 19:05:09 -07:00
exportfs
ext2 New helper - current_umask() 2009-03-31 23:00:26 -04:00
ext3 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2009-04-02 21:09:10 -07:00
ext4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2009-04-02 21:09:10 -07:00
fat Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2009-04-02 21:09:10 -07:00
freevxfs
fscache FS-Cache: Add and document asynchronous operation handling 2009-04-03 16:42:39 +01:00
fuse mm: page_mkwrite change prototype to match fault 2009-04-01 08:59:14 -07:00
gfs2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2009-04-02 21:09:10 -07:00
hfs fs/hfs: return f_fsid for statfs(2) 2009-04-02 19:05:09 -07:00
hfsplus Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2009-04-02 21:09:10 -07:00
hostfs constify dentry_operations: misc filesystems 2009-03-27 14:44:00 -04:00
hpfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2009-04-02 21:09:10 -07:00
hppfs hppfs: hppfs_read_file() may return -ERROR 2009-04-02 19:04:53 -07:00
hugetlbfs mm: reintroduce and deprecate rlimit based access for SHM_HUGETLB 2009-04-01 08:59:12 -07:00
isofs fs/isofs: return f_fsid for statfs(2) 2009-04-02 19:05:09 -07:00
jbd jbd: fix oops in jbd_journal_init_inode() on corrupted fs 2009-04-02 19:04:52 -07:00
jbd2 jbd2: Update locking coments 2009-03-27 17:20:40 -04:00
jffs2 New helper - current_umask() 2009-03-31 23:00:26 -04:00
jfs New helper - current_umask() 2009-03-31 23:00:26 -04:00
lockd NSM: Fix unaligned accesses in nsm_init_private() 2009-04-01 13:24:14 -04:00
minix fs/minix: return f_fsid for statfs(2) 2009-04-02 19:05:09 -07:00
ncpfs constify dentry_operations: misc filesystems 2009-03-27 14:44:00 -04:00
nfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2009-04-02 21:09:10 -07:00
nfs_common
nfsd Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2009-04-02 21:09:10 -07:00
nls
notify fs: avoid I_NEW inodes 2009-03-27 14:44:05 -04:00
ntfs ntfs: remove private wrapper of endian helpers 2009-04-01 08:59:18 -07:00
ocfs2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2009-04-02 21:09:10 -07:00
omfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2009-04-02 21:09:10 -07:00
openpromfs
partitions Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6 2009-03-26 16:04:22 -07:00
proc Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2009-04-02 21:09:10 -07:00
qnx4 fs/qnx4: return f_fsid for statfs(2) 2009-04-02 19:05:10 -07:00
quota vfs: skip I_CLEAR state inodes 2009-04-02 19:04:48 -07:00
ramfs ramfs: add support for "mode=" mount option 2009-04-01 08:59:22 -07:00
reiserfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2009-04-02 21:09:10 -07:00
romfs
smbfs constify dentry_operations: misc filesystems 2009-03-27 14:44:00 -04:00
squashfs fs/squashfs: return f_fsid for statfs(2) 2009-04-02 19:05:10 -07:00
sysfs mm: page_mkwrite change prototype to match fault: fix sysfs 2009-04-01 08:59:14 -07:00
sysv fs/sysv: return f_fsid for statfs(2) 2009-04-02 19:05:10 -07:00
ubifs mm: page_mkwrite change prototype to match fault 2009-04-01 08:59:14 -07:00
udf udf: Use lowercase names of quota functions 2009-03-26 02:18:36 +01:00
ufs fs/ufs: return f_fsid for statfs(2) 2009-04-02 19:05:10 -07:00
xfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2009-04-02 21:09:10 -07:00
aio.c aio: lookup_ioctx can return the wrong value when looking up a bogus context 2009-03-19 15:57:18 -07:00
anon_inodes.c constify dentry_operations: rest 2009-03-27 14:44:03 -04:00
attr.c vfs: Use lowercase names of quota functions 2009-03-26 02:18:35 +01:00
bad_inode.c
binfmt_aout.c
binfmt_elf.c Trim includes in binfmt_elf 2009-03-31 23:00:27 -04:00
binfmt_elf_fdpic.c bin_elf_fdpic: check the return value of clear_user 2009-04-02 19:05:01 -07:00
binfmt_em86.c
binfmt_flat.c
binfmt_misc.c
binfmt_script.c
binfmt_som.c Don't crap into descriptor table in binfmt_som 2009-03-31 23:00:28 -04:00
bio-integrity.c block: add private bio_set for bio integrity allocations 2009-03-24 12:35:17 +01:00
bio.c block: add private bio_set for bio integrity allocations 2009-03-24 12:35:17 +01:00
block_dev.c Cleanup after commit 585d3bc06f 2009-04-01 07:07:16 -04:00
buffer.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2009-04-02 21:09:10 -07:00
char_dev.c
compat.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2009-04-02 21:09:10 -07:00
compat_binfmt_elf.c
compat_ioctl.c
dcache.c Trim includes of fdtable.h 2009-03-31 23:00:28 -04:00
dcookies.c
direct-io.c
drop_caches.c vfs: skip I_CLEAR state inodes 2009-04-02 19:04:48 -07:00
eventfd.c epoll keyed wakeups: make eventfd use keyed wakeups 2009-04-01 08:59:20 -07:00
eventpoll.c epoll keyed wakeups: teach epoll about hints coming with the wakeup key 2009-04-01 08:59:20 -07:00
exec.c Get rid of indirect include of fs_struct.h 2009-03-31 23:00:27 -04:00
fcntl.c Fix a lockdep warning in fasync_helper() 2009-03-30 08:00:24 -06:00
fifo.c
file.c
file_table.c Merge branch 'bkl-removal' of git://git.lwn.net/linux-2.6 2009-03-26 16:14:02 -07:00
filesystems.c
fs-writeback.c writeback: guard against jiffies wraparound on inode->dirtied_when checks (try #3) 2009-04-02 19:04:48 -07:00
fs_struct.c Get rid of indirect include of fs_struct.h 2009-03-31 23:00:27 -04:00
generic_acl.c New helper - current_umask() 2009-03-31 23:00:26 -04:00
inode.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2009-03-27 16:23:12 -07:00
internal.h New locking/refcounting for fs_struct 2009-03-31 23:00:26 -04:00
ioctl.c
ioprio.c
Kconfig FS-Cache: Add main configuration option, module entry points and debugging 2009-04-03 16:42:36 +01:00
Kconfig.binfmt
libfs.c vfs: simple_set_mnt() should return void 2009-03-27 14:44:03 -04:00
locks.c
Makefile FS-Cache: Add main configuration option, module entry points and debugging 2009-04-03 16:42:36 +01:00
mbcache.c
mpage.c Remove two unneeded exports and make two symbols static in fs/mpage.c 2009-04-01 07:38:54 -04:00
namei.c Get rid of indirect include of fs_struct.h 2009-03-31 23:00:27 -04:00
namespace.c Get rid of indirect include of fs_struct.h 2009-03-31 23:00:27 -04:00
nfsctl.c
no-block.c
open.c Get rid of indirect include of fs_struct.h 2009-03-31 23:00:27 -04:00
pipe.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2009-03-27 16:23:12 -07:00
pnode.c
pnode.h
posix_acl.c
read_write.c preadv/pwritev: Add preadv and pwritev system calls. 2009-04-02 19:05:08 -07:00
read_write.h
readdir.c
select.c
seq_file.c cpumask: fix seq_bitmap_*() functions. 2009-03-30 22:05:11 +10:30
signalfd.c
splice.c FS-Cache: Recruit a page flags for cache management 2009-04-03 16:42:36 +01:00
stack.c
stat.c
super.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2009-03-27 16:23:12 -07:00
sync.c Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-quota-2.6 2009-03-27 14:48:34 -07:00
timerfd.c
utimes.c
xattr.c
xattr_acl.c