alistair23-linux/mm
Linus Torvalds 7658cc2892 VM: Fix nasty and subtle race in shared mmap'ed page writeback
The VM layer (on the face of it, fairly reasonably) expected that when
it does a ->writepage() call to the filesystem, it would write out the
full page at that point in time.  Especially since it had earlier marked
the whole page dirty with "set_page_dirty()".

But that isn't actually the case: ->writepage() does not actually write
a page, it writes the parts of the page that have been explicitly marked
dirty before, *and* that had not got written out for other reasons since
the last time we told it they were dirty.

That last caveat is the important one.

Which _most_ of the time ends up being the whole page (since we had
called "set_page_dirty()" on the page earlier), but if the filesystem
had done any dirty flushing of its own (for example, to honor some
internal write ordering guarantees), it might end up doing only a
partial page IO (or none at all) when ->writepage() is actually called.

That is the correct thing in general (since we actually often _want_
only the known-dirty parts of the page to be written out), but the
shared dirty page handling had implicitly forgotten about these details,
and had a number of cases where it was doing just the "->writepage()"
part, without telling the low-level filesystem that the whole page might
have been re-dirtied as part of being mapped writably into user space.

Since most of the time the FS did actually write out the full page, we
didn't notice this for a loong time, and this needed some really odd
patterns to trigger.  But it caused occasional corruption with rtorrent
and with the Debian "apt" database, because both use shared mmaps to
update the end result.

This fixes it. Finally. After way too much hair-pulling.

Acked-by: Nick Piggin <nickpiggin@yahoo.com.au>
Acked-by: Martin J. Bligh <mbligh@google.com>
Acked-by: Martin Michlmayr <tbm@cyrius.com>
Acked-by: Martin Johansson <martin@fatbob.nu>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Andrei Popa <andrei.popa@i-neo.ro>
Cc: High Dickins <hugh@veritas.com>
Cc: Andrew Morton <akpm@osdl.org>,
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Segher Boessenkool <segher@kernel.crashing.org>
Cc: David Miller <davem@davemloft.net>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Gordon Farquharson <gordonfarquharson@gmail.com>
Cc: Guillaume Chazarain <guichaz@yahoo.fr>
Cc: Theodore Tso <tytso@mit.edu>
Cc: Kenneth Cheng <kenneth.w.chen@intel.com>
Cc: Tobias Diedrich <ranma@tdiedrich.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-29 10:00:58 -08:00
..
allocpercpu.c [PATCH] Allow NULL pointers in percpu_free 2006-12-07 08:39:22 -08:00
backing-dev.c [PATCH] separate bdi congestion functions from queue congestion functions 2006-10-20 10:26:35 -07:00
bootmem.c [PATCH] remove EXPORT_UNUSED_SYMBOL'ed symbols 2006-12-07 08:39:44 -08:00
bounce.c [PATCH] BLOCK: Separate the bounce buffering code from the highmem code [try #6] 2006-09-30 20:32:11 +02:00
fadvise.c [PATCH] mm: change uses of f_{dentry,vfsmnt} to use f_path 2006-12-08 08:28:43 -08:00
filemap.c [PATCH] dio: only call aio_complete() after returning -EIOCBQUEUED 2006-12-10 09:57:21 -08:00
filemap.h Remove all inclusions of <linux/config.h> 2006-10-04 03:38:54 -04:00
filemap_xip.c [PATCH] mm: more rmap debugging 2006-12-22 08:55:49 -08:00
fremap.c [PATCH] mm: more rmap debugging 2006-12-22 08:55:49 -08:00
highmem.c [PATCH] BLOCK: Separate the bounce buffering code from the highmem code [try #6] 2006-09-30 20:32:11 +02:00
hugetlb.c [PATCH] Pass vma argument to copy_user_highpage(). 2006-12-13 09:27:08 -08:00
internal.h [PATCH] mm: VM_BUG_ON 2006-09-26 08:48:44 -07:00
Kconfig Fix "can not" in Documentation and Kconfig 2006-10-03 22:53:09 +02:00
madvise.c [PATCH] Fix MADV_REMOVE protection checking 2006-04-17 18:22:18 -07:00
Makefile [PATCH] separate bdi congestion functions from queue congestion functions 2006-10-20 10:26:35 -07:00
memory.c [PATCH] mm: more rmap debugging 2006-12-22 08:55:49 -08:00
memory_hotplug.c [PATCH] Get rid of zone_table[] 2006-12-07 08:39:20 -08:00
mempolicy.c [PATCH] struct path: convert mm 2006-12-08 08:28:47 -08:00
mempool.c [PATCH] dm: work around mempool_alloc, bio_alloc_bioset deadlocks 2006-09-01 11:39:09 -07:00
migrate.c [PATCH] radix-tree: RCU lockless readside 2006-12-07 08:39:25 -08:00
mincore.c [PATCH] sys_mincore: s/max/min/ 2006-12-17 10:21:53 -08:00
mlock.c [PATCH] mlock cleanup 2006-12-07 08:39:22 -08:00
mmap.c [PATCH] mm: change uses of f_{dentry,vfsmnt} to use f_path 2006-12-08 08:28:43 -08:00
mmzone.c [PATCH] remove EXPORT_UNUSED_SYMBOL'ed symbols 2006-12-07 08:39:44 -08:00
mprotect.c [PATCH] paravirt: lazy mmu mode hooks.patch 2006-10-01 00:39:33 -07:00
mremap.c [PATCH] paravirt: lazy mmu mode hooks.patch 2006-10-01 00:39:33 -07:00
msync.c [PATCH] mm: msync() cleanup 2006-09-26 08:48:45 -07:00
nommu.c [PATCH] struct path: convert mm 2006-12-08 08:28:47 -08:00
oom_kill.c [PATCH] cpuset: rework cpuset_zone_allowed api 2006-12-13 09:05:49 -08:00
page-writeback.c VM: Fix nasty and subtle race in shared mmap'ed page writeback 2006-12-29 10:00:58 -08:00
page_alloc.c [PATCH] cpuset: rework cpuset_zone_allowed api 2006-12-13 09:05:49 -08:00
page_io.c [PATCH] swsusp: use block device offsets to identify swap locations 2006-12-07 08:39:27 -08:00
pdflush.c [PATCH] Add include/linux/freezer.h and move definitions from sched.h 2006-12-07 08:39:27 -08:00
prio_tree.c
readahead.c [PATCH] io-accounting-read-accounting nfs fix 2006-12-10 09:55:41 -08:00
rmap.c [PATCH] Fix up page_mkclean_one(): virtual caches, s390 2006-12-22 10:39:35 -08:00
shmem.c [PATCH] Fix for shmem_truncate_range() BUG_ON() 2006-12-22 08:55:47 -08:00
shmem_acl.c [PATCH] Fix typos in mm/shmem_acl.c 2006-10-11 11:14:23 -07:00
slab.c [PATCH] fix kernel-doc warnings in 2.6.20-rc1 2006-12-22 08:55:47 -08:00
slob.c [PATCH] More slab.h cleanups 2006-12-13 09:05:49 -08:00
sparse.c [PATCH] numa node ids are int, page_to_nid and zone_to_nid should return int 2006-12-07 08:39:23 -08:00
swap.c [PATCH] hotplug CPU: clean up hotcpu_notifier() use 2006-12-07 08:39:39 -08:00
swap_state.c [PATCH] lockdep: locking init debugging improvement 2006-07-03 15:27:02 -07:00
swapfile.c [PATCH] mm: change uses of f_{dentry,vfsmnt} to use f_path 2006-12-08 08:28:43 -08:00
thrash.c [PATCH] make mm/thrash.c:global_faults static 2006-12-07 08:39:22 -08:00
tiny-shmem.c [PATCH] struct path: convert mm 2006-12-08 08:28:47 -08:00
truncate.c Clean up and export cancel_dirty_page() to modules 2006-12-23 09:25:04 -08:00
util.c [PATCH] slab: clean up leak tracking ifdefs a little bit 2006-10-04 07:55:13 -07:00
vmalloc.c [PATCH] Fix strange size check in __get_vm_area_node() 2006-11-16 11:43:38 -08:00
vmscan.c [PATCH] Fix swapped parameters in mm/vmscan.c 2006-12-22 08:55:47 -08:00
vmstat.c [PATCH] struct seq_operations and struct file_operations constification 2006-12-07 08:39:46 -08:00