alistair23-linux/Documentation/filesystems
David Howells 2d6fff6370 FS-Cache: Add the FS-Cache netfs API and documentation
Add the API for a generic facility (FS-Cache) by which filesystems (such as AFS
or NFS) may call on local caching capabilities without having to know anything
about how the cache works, or even if there is a cache:

	+---------+
	|         |                        +--------------+
	|   NFS   |--+                     |              |
	|         |  |                 +-->|   CacheFS    |
	+---------+  |   +----------+  |   |  /dev/hda5   |
	             |   |          |  |   +--------------+
	+---------+  +-->|          |  |
	|         |      |          |--+
	|   AFS   |----->| FS-Cache |
	|         |      |          |--+
	+---------+  +-->|          |  |
	             |   |          |  |   +--------------+
	+---------+  |   +----------+  |   |              |
	|         |  |                 +-->|  CacheFiles  |
	|  ISOFS  |--+                     |  /var/cache  |
	|         |                        +--------------+
	+---------+

General documentation and documentation of the netfs-specific API are provided
in addition to the header files.

As this patch stands, it is possible to build a filesystem against the facility
and attempt to use it.  All that will happen is that all requests will be
immediately denied as if no cache is present.

Further patches will implement the core of the facility.  The facility will
transfer requests from networking filesystems to appropriate caches if
possible, or else gracefully deny them.

If this facility is disabled in the kernel configuration, then all its
operations will trivially reduce to nothing during compilation.
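
One common way to get this compile-time elimination is to wrap every entry
point in a static inline function guarded by the configuration option.  The
following is a minimal sketch only: DEMO_CONFIG_FSCACHE, struct demo_cookie and
all demo_* functions are hypothetical names used for illustration and are not
taken from this patch, and -ENOBUFS is just an example of a "denied" result.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_cookie;	/* opaque handle; hypothetical name */

#ifdef DEMO_CONFIG_FSCACHE
/* With the option enabled, these would be provided by the caching core. */
struct demo_cookie *demo_core_acquire(const char *key);
int demo_core_read_page(struct demo_cookie *cookie, unsigned long index);
#endif

/*
 * Wrappers called by the netfs.  With the option disabled each body is a
 * constant, so the compiler can fold the call away entirely.
 */
static inline struct demo_cookie *demo_acquire_cookie(const char *key)
{
#ifdef DEMO_CONFIG_FSCACHE
	return demo_core_acquire(key);
#else
	(void)key;
	return NULL;		/* NULL cookie: nothing is cached */
#endif
}

static inline int demo_read_page(struct demo_cookie *cookie,
				 unsigned long index)
{
#ifdef DEMO_CONFIG_FSCACHE
	return demo_core_read_page(cookie, index);
#else
	(void)cookie;
	(void)index;
	return -ENOBUFS;	/* request denied, as if no cache were present */
#endif
}
```

A netfs written against such wrappers needs no #ifdefs of its own: it simply
sees a NULL cookie and falls back to fetching from the server.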

WHY NOT I_MAPPING?
==================

I have added my own API to implement caching rather than using i_mapping to do
this for a number of reasons.  These have been discussed a lot on the LKML and
CacheFS mailing lists, but to summarise the basics:

 (1) Most filesystems don't do hole reportage.  Holes in files are treated as
     blocks of zeros and can't be distinguished otherwise, making it difficult
     to distinguish blocks that have been read from the network and cached from
     those that haven't.

 (2) The backing inode must be fully populated before being exposed to
     userspace through the main inode because the VM/VFS goes directly to the
     backing inode and does not interrogate the front inode's VM ops.

     Therefore:

     (a) The backing inode must fit entirely within the cache.

     (b) All backed files currently open must fit entirely within the cache at
     	 the same time.

     (c) A working set of files in total larger than the cache may not be
     	 cached.

     (d) A file may not grow larger than the available space in the cache.

     (e) A file that's open and cached, and remotely grows larger than the
     	 cache is potentially stuffed.

 (3) Writes go to the backing filesystem, and can only be transferred to the
     network when the file is closed.

 (4) There's no record of what changes have been made, so the whole file must
     be written back.

 (5) The pages belong to the backing filesystem, and all metadata associated
     with those pages is relevant only to the backing filesystem, and not to
     anything stacked atop it.
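
Point (1) can be illustrated with a small user-space model.  This is a sketch
under stated assumptions, not code from the patch: struct demo_backing_file,
the block sizes and the demo_* helpers are all hypothetical.  It shows that a
block of zeros fetched from the network and a block never fetched at all look
identical by content, and that only separate bookkeeping (here a bitmap) can
tell them apart.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define DEMO_BLOCK_SIZE	16
#define DEMO_NR_BLOCKS	8

struct demo_backing_file {
	/* Blocks never written (holes) read back as zeros. */
	unsigned char data[DEMO_NR_BLOCKS][DEMO_BLOCK_SIZE];
	/* One bit per block, set once the block has been cached. */
	unsigned int present;
};

/* Store a block fetched from the network and record that it is cached. */
static void demo_store_block(struct demo_backing_file *f, unsigned int blk,
			     const unsigned char *buf)
{
	assert(blk < DEMO_NR_BLOCKS);
	memcpy(f->data[blk], buf, DEMO_BLOCK_SIZE);
	f->present |= 1u << blk;
}

/* What the backing filesystem alone can tell us: the block's content. */
static bool demo_block_is_zero(const struct demo_backing_file *f,
			       unsigned int blk)
{
	static const unsigned char zeros[DEMO_BLOCK_SIZE];

	assert(blk < DEMO_NR_BLOCKS);
	return memcmp(f->data[blk], zeros, DEMO_BLOCK_SIZE) == 0;
}

/* What the extra bookkeeping tells us: whether the block was ever cached. */
static bool demo_block_is_cached(const struct demo_backing_file *f,
				 unsigned int blk)
{
	assert(blk < DEMO_NR_BLOCKS);
	return (f->present >> blk) & 1u;
}
```

After storing an all-zero block at position 2, blocks 2 and 3 both read as
zeros, yet only block 2 is reported as cached.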

OVERVIEW
========

FS-Cache provides (or will provide) the following facilities:

 (1) Caches can be added / removed at any time, even whilst in use.

 (2) Tags can be used to refer to caches, even if the caches are not yet
     available.

 (3) More than one cache can be used at once.  Caches can be selected
     explicitly by use of tags.

 (4) The netfs is provided with an interface that allows either party to
     withdraw caching facilities from a file (required for (1)).

 (5) A netfs may annotate cache objects that belong to it.  This permits the
     storage of coherency maintenance data.

 (6) Cache objects will be pinnable and space reservations will be possible.

 (7) The interface to the netfs returns as few errors as possible, preferring
     rather to let the netfs remain oblivious.

 (8) Cookies are used to represent indices, files and other objects to the
     netfs.  The simplest cookie is just a NULL pointer - indicating nothing
     cached there.

 (9) The netfs is allowed to propose - dynamically - any index hierarchy it
     desires, though it must be aware that the index search function is
     recursive, stack space is limited, and indices can only be children of
     indices.

(10) Indices can be used to group files together to reduce key size and to make
     group invalidation easier.  The use of indices may make lookup quicker,
     but that's cache dependent.

(11) Data I/O is effectively done directly to and from the netfs's pages.  The
     netfs indicates that page A is at index B of the data-file represented by
     cookie C, and that it should be read or written.  The cache backend may or
     may not start I/O on that page, but if it does, a netfs callback will be
     invoked to indicate completion.  The I/O may be either synchronous or
     asynchronous.

(12) Cookies can be "retired" upon release.  At this point FS-Cache will mark
     them as obsolete and the index hierarchy rooted at that point will get
     recycled.

(13) The netfs provides a "match" function for index searches.  In addition to
     saying whether a match was made or not, this can also specify that an
     entry should be updated or deleted.
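
Points (8) and (13) can be sketched together as follows.  This is an
illustrative user-space model, not the interface added by this patch: the
enum, the structures, the data_version and lease fields, and the policy for
choosing between update and delete are all assumptions.  The search returns
NULL when nothing usable is cached, mirroring the NULL cookie.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum demo_match {
	DEMO_NO_MATCH,		/* different object; keep searching */
	DEMO_MATCH,		/* entry is usable as it stands */
	DEMO_MATCH_UPDATE,	/* right object; refresh its annotation */
	DEMO_MATCH_DELETE,	/* right object, but its data is stale */
};

struct demo_entry {
	char key[16];
	unsigned int data_version;	/* coherency data kept by the netfs */
	unsigned int lease;		/* hypothetical annotation */
	int in_use;
};

struct demo_target {
	const char *key;
	unsigned int data_version;
	unsigned int lease;
};

/* The netfs-supplied match function (hypothetical policy). */
static enum demo_match demo_netfs_match(const struct demo_entry *e,
					const struct demo_target *t)
{
	if (strcmp(e->key, t->key) != 0)
		return DEMO_NO_MATCH;
	if (e->data_version != t->data_version)
		return DEMO_MATCH_DELETE;
	if (e->lease != t->lease)
		return DEMO_MATCH_UPDATE;
	return DEMO_MATCH;
}

/* Cache-side index search acting on the verdict of the match function. */
static struct demo_entry *demo_index_search(struct demo_entry *index,
					    size_t n,
					    const struct demo_target *t)
{
	for (size_t i = 0; i < n; i++) {
		struct demo_entry *e = &index[i];

		if (!e->in_use)
			continue;
		switch (demo_netfs_match(e, t)) {
		case DEMO_NO_MATCH:
			break;
		case DEMO_MATCH:
			return e;
		case DEMO_MATCH_UPDATE:
			e->lease = t->lease;
			return e;
		case DEMO_MATCH_DELETE:
			e->in_use = 0;	/* discard the stale entry */
			return NULL;
		}
	}
	return NULL;		/* nothing cached there */
}
```

Keeping the coherency decision in the netfs means the cache never has to
understand what the annotation data means; it only acts on the verdict.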

FS-Cache maintains a virtual index tree in which all indices, files, objects
and pages are kept.  Bits of this tree may actually reside in one or more
caches.

                                           FSDEF
                                             |
                        +------------------------------------+
                        |                                    |
                       NFS                                  AFS
                        |                                    |
           +--------------------------+                +-----------+
           |                          |                |           |
        homedir                     mirror          afs.org   redhat.com
           |                          |                            |
     +------------+           +---------------+              +----------+
     |            |           |               |              |          |
   00001        00002       00007           00125        vol00001   vol00002
     |            |           |               |                         |
 +---+---+     +-----+      +---+      +------+------+            +-----+----+
 |   |   |     |     |      |   |      |      |      |            |     |    |
PG0 PG1 PG2   PG0  XATTR   PG0 PG1   DIRENT DIRENT DIRENT        R/W   R/O  Bak
                     |                                            |
                    PG0                                       +-------+
                                                              |       |
                                                            00001   00003
                                                              |
                                                          +---+---+
                                                          |   |   |
                                                         PG0 PG1 PG2

In the example above, two netfs's can be seen to be backed: NFS and AFS.  These
have different index hierarchies:

 (*) The NFS primary index will probably contain per-server indices.  Each
     server index is indexed by NFS file handles to get data file objects.
     Each data file object can have an array of pages, but may also have
     further child objects, such as extended attributes and directory entries.
     Extended attribute objects themselves have page-array contents.

 (*) The AFS primary index contains per-cell indices.  Each cell index contains
     per-logical-volume indices.  Each volume index contains up to three
     indices for the read-write, read-only and backup mirrors of those volumes.
     Each of these contains vnode data file objects, each of which contains an
     array of pages.

The very top index is the FS-Cache master index in which individual netfs's
have entries.

Any index object may reside in more than one cache, provided it only has index
children.  Any index with non-index object children will be assumed to only
reside in one cache.
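
The placement rule above can be expressed as a simple check over the tree.
The following is a sketch under assumptions: the flat array representation,
the object types and demo_may_span_caches() are hypothetical and do not
describe how FS-Cache stores its tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_type { DEMO_INDEX, DEMO_DATAFILE };

struct demo_object {
	enum demo_type type;
	int parent;	/* array position of the parent, or -1 for the root */
};

/*
 * An object may reside in more than one cache only if it is an index and
 * every one of its children is also an index.
 */
static bool demo_may_span_caches(const struct demo_object *objs, size_t n,
				 size_t which)
{
	assert(which < n);
	if (objs[which].type != DEMO_INDEX)
		return false;
	for (size_t i = 0; i < n; i++) {
		if (objs[i].parent == (int)which &&
		    objs[i].type != DEMO_INDEX)
			return false;
	}
	return true;
}
```

Applied to the NFS branch of the example tree, FSDEF and NFS may span caches,
whereas homedir may not, because it has data file 00001 as a child.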

The FS-Cache overview can be found in:

	Documentation/filesystems/caching/fscache.txt

The netfs API to FS-Cache can be found in:

	Documentation/filesystems/caching/netfs-api.txt

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 16:42:36 +01:00
caching FS-Cache: Add the FS-Cache netfs API and documentation 2009-04-03 16:42:36 +01:00
configfs docsrc: build Documentation/ sources 2008-08-12 16:07:30 -07:00
00-INDEX
9p.txt
adfs.txt
affs.txt
afs.txt
autofs4-mount-control.txt autofs4: device node ioctl documentation 2008-10-16 11:21:39 -07:00
automount-support.txt
befs.txt
bfs.txt remove mention of CONFIG_KMOD from documentation 2008-07-22 19:24:29 +10:00
btrfs.txt Btrfs: Add Documentation/filesystem/btrfs.txt, remove old COPYING 2009-01-07 09:54:24 -05:00
cifs.txt
coda.txt
cramfs.txt
dentry-locking.txt
devpts.txt Document usage of multiple-instances of devpts 2009-01-02 10:19:36 -08:00
directory-locking
dlmfs.txt
dnotify.txt
ecryptfs.txt
Exporting
ext2.txt trivial: fix orphan dates in ext2 documentation 2009-03-23 14:21:26 -07:00
ext3.txt trivial: fix bad links in the ext2 and ext3 documentation 2009-03-12 16:24:25 -07:00
ext4.txt ext4: Regularize mount options 2009-03-28 10:59:57 -04:00
fiemap.txt vfs: vfs-level fiemap interface 2008-10-08 19:44:18 -04:00
files.txt fix f_count description in Documentation/filesystems/files.txt 2008-12-31 18:07:42 -05:00
fuse.txt
gfs2-glocks.txt [GFS2] Glock documentation 2008-06-27 09:39:53 +01:00
gfs2.txt
hfs.txt
hfsplus.txt
hpfs.txt
inotify.txt
isofs.txt
jfs.txt
Locking mm: page_mkwrite change prototype to match fault 2009-04-01 08:59:14 -07:00
locks.txt
mandatory-locking.txt
ncpfs.txt
nfs-rdma.txt update port number in NFS/RDMA documentation 2009-01-27 17:20:14 -05:00
nfsroot.txt doc: typo in Documentation/filesystems/nfsroot.txt 2008-10-16 11:21:31 -07:00
ntfs.txt NTFS: update homepage 2008-09-02 19:21:37 -07:00
ocfs2.txt ocfs2: add mount option and Kconfig option for acl 2009-01-05 08:36:52 -08:00
omfs.txt omfs: add filesystem documentation 2008-07-26 12:00:05 -07:00
porting
proc.txt documentation: update Documentation/filesystem/proc.txt and Documentation/sysctls 2009-04-02 19:04:53 -07:00
quota.txt quota: documentation for sending "below quota" messages via netlink and tiny doc update 2008-08-12 16:07:27 -07:00
ramfs-rootfs-initramfs.txt Trivial Documentation/filesystems/ramfs-rootfs-initramfs.txt fix 2008-11-30 11:40:56 -08:00
relay.txt relay: add buffer-only channels; useful for early logging 2008-07-26 12:00:04 -07:00
romfs.txt
rpc-cache.txt
seq_file.txt
sharedsubtree.txt
smbfs.txt
spufs.txt
squashfs.txt Squashfs: fix documentation typo, Cramfs filesystem limit is 256 MiB 2009-03-05 00:40:13 +00:00
sysfs-pci.txt PCI: Introduce /sys/bus/pci/devices/.../remove 2009-03-20 14:58:48 -07:00
sysfs.txt PATCH [2/2] Documentation/filesystems/sysfs.txt: fix descriptions of device attributes 2009-02-22 09:28:15 -08:00
sysv-fs.txt
tmpfs.txt
ubifs.txt UBIFS: remove fast unmounting 2009-01-29 16:34:30 +02:00
udf.txt
ufs.txt
vfat.txt fat: Fix ATTR_RO for directory 2008-11-06 15:41:21 -08:00
vfs.txt filesystem freeze: add error handling of write_super_lockfs/unlockfs 2009-01-09 16:54:42 -08:00
xfs.txt [XFS] remove restricted chown parameter from xfs linux 2008-10-30 18:30:09 +11:00
xip.txt DOC: update xip method info 2008-11-12 17:17:17 -08:00