alistair23-linux/mm
Johannes Weiner 2d6d7f9828 mm: protect set_page_dirty() from ongoing truncation
Tejun, while reviewing the code, spotted the following race condition
between the dirtying and truncation of a page:

__set_page_dirty_nobuffers()       __delete_from_page_cache()
  if (TestSetPageDirty(page))
                                     page->mapping = NULL
                                     if (PageDirty())
                                       dec_zone_page_state(page, NR_FILE_DIRTY);
                                       dec_bdi_stat(mapping->backing_dev_info, BDI_RECLAIMABLE);
    if (page->mapping)
      account_page_dirtied(page)
        __inc_zone_page_state(page, NR_FILE_DIRTY);
        __inc_bdi_stat(mapping->backing_dev_info, BDI_RECLAIMABLE);

which results in an imbalance of NR_FILE_DIRTY and BDI_RECLAIMABLE.

Dirtiers usually lock out truncation, either by holding the page lock
directly, or in case of zap_pte_range(), by pinning the mapcount with
the page table lock held.  The notable exception to this rule, though,
is do_wp_page(), for which this race exists.  However, do_wp_page()
already waits for a locked page to unlock before setting the dirty bit,
in order to prevent a race where clear_page_dirty() misses the page bit
in the presence of dirty ptes.  Upgrade that wait to a fully locked
set_page_dirty() to also cover the situation explained above.

Afterwards, the code in set_page_dirty() dealing with a truncation race
is no longer needed.  Remove it.
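
The interleaving described in this message can be replayed in a small single-threaded userspace model. This is only a sketch and not kernel code: every `toy_*` name and both structs are hypothetical stand-ins, a single counter stands in for NR_FILE_DIRTY and BDI_RECLAIMABLE, and the page lock is modeled simply by running the two operations back to back rather than interleaved.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct address_space and struct page. */
struct toy_mapping { long nr_file_dirty; };
struct toy_page { bool dirty; struct toy_mapping *mapping; };

/* Models TestSetPageDirty(): set the dirty bit, return its old value. */
static bool toy_test_set_dirty(struct toy_page *page)
{
    bool was_dirty = page->dirty;
    page->dirty = true;
    return was_dirty;
}

/* Models the "if (page->mapping) account_page_dirtied()" step. */
static void toy_account_if_mapped(struct toy_page *page)
{
    if (page->mapping)
        page->mapping->nr_file_dirty++;
}

/* Models the truncation side: detach the page, unaccount it if dirty. */
static void toy_delete_from_page_cache(struct toy_page *page)
{
    struct toy_mapping *mapping = page->mapping;

    page->mapping = NULL;
    if (page->dirty)
        mapping->nr_file_dirty--;
}

/* Truncation slips in between the two halves of the dirtier. */
long toy_racy_interleaving(void)
{
    struct toy_mapping mapping = { 0 };
    struct toy_page page = { false, &mapping };

    if (!toy_test_set_dirty(&page)) {
        toy_delete_from_page_cache(&page); /* sees the bit, decrements */
        toy_account_if_mapped(&page);      /* mapping is gone, no increment */
    }
    return mapping.nr_file_dirty;
}

/* With the dirtier and truncation serialized, either order balances. */
long toy_serialized(bool dirty_first)
{
    struct toy_mapping mapping = { 0 };
    struct toy_page page = { false, &mapping };

    if (!dirty_first)
        toy_delete_from_page_cache(&page);
    if (!toy_test_set_dirty(&page))
        toy_account_if_mapped(&page);
    if (dirty_first)
        toy_delete_from_page_cache(&page);
    return mapping.nr_file_dirty;
}
```

In this model, `toy_racy_interleaving()` leaves the counter at -1, a decrement with no matching increment, while `toy_serialized(true)` and `toy_serialized(false)` both leave it at 0.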

Reported-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-01-08 15:10:51 -08:00
Kconfig mm/balloon_compaction: add vmstat counters and kpageflags bit 2014-10-09 22:26:01 -04:00
Kconfig.debug mm/debug-pagealloc: prepare boottime configurable on/off 2014-12-13 12:42:48 -08:00
Makefile mm/page_owner: keep track of page owners 2014-12-13 12:42:48 -08:00
backing-dev.c Merge branch 'for-3.18/core' of git://git.kernel.dk/linux-block 2014-10-18 11:53:51 -07:00
balloon_compaction.c mm/balloon_compaction: fix deflation when compaction is disabled 2014-10-29 16:33:15 -07:00
bootmem.c mem-hotplug: reset node managed pages when hot-adding a new pgdat 2014-11-13 16:17:06 -08:00
cleancache.c mm: dump page when hitting a VM_BUG_ON using VM_BUG_ON_PAGE 2014-01-23 16:36:50 -08:00
cma.c mm: cma: split cma-reserved in dmesg log 2014-12-18 19:08:10 -08:00
compaction.c mm, compaction: more focused lru and pcplists draining 2014-12-10 17:41:06 -08:00
debug-pagealloc.c mm/debug-pagealloc: make debug-pagealloc boottime configurable 2014-12-13 12:42:48 -08:00
debug.c mm: move page->mem_cgroup bad page handling into generic code 2014-12-10 17:41:09 -08:00
dmapool.c mm/dmapool.c: fixed a brace coding style issue 2014-10-09 22:26:00 -04:00
early_ioremap.c mm: create generic early_ioremap() support 2014-04-07 16:36:15 -07:00
fadvise.c mm: fadvise: document the fadvise(FADV_DONTNEED) behaviour for partial pages 2014-12-13 12:42:49 -08:00
failslab.c switch debugfs to umode_t 2012-01-03 22:54:56 -05:00
filemap.c mm: get rid of radix tree gfp mask for pagecache_get_page 2014-12-29 12:45:45 -08:00
filemap_xip.c mm/xip: share the i_mmap_rwsem 2014-12-13 12:42:45 -08:00
fremap.c Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux 2014-12-15 15:52:01 -08:00
frontswap.c mm/frontswap.c: fix the condition in BUG_ON 2014-12-10 17:41:08 -08:00
gup.c kernel: Provide READ_ONCE and ASSIGN_ONCE 2014-12-20 16:48:59 -08:00
highmem.c mm/highmem: make kmap cache coloring aware 2014-08-06 18:01:22 -07:00
huge_memory.c Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux 2014-12-15 15:52:01 -08:00
hugetlb.c Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux 2014-12-15 15:52:01 -08:00
hugetlb_cgroup.c mm: hugetlb_cgroup: convert to lockless page counters 2014-12-10 17:41:04 -08:00
hwpoison-inject.c mm/hwpoison-inject.c: remove unnecessary null test before debugfs_remove_recursive 2014-08-06 18:01:19 -07:00
init-mm.c atomic: use <linux/atomic.h> 2011-07-26 16:49:47 -07:00
internal.h mm, compaction: always update cached scanner positions 2014-12-10 17:41:06 -08:00
interval_tree.c mm: convert a few VM_BUG_ON callers to VM_BUG_ON_VMA 2014-10-09 22:25:57 -04:00
iov_iter.c copy_from_iter_nocache() 2014-12-08 20:25:23 -05:00
kmemcheck.c mm/slab_common: move kmem_cache definition to internal header 2014-10-09 22:25:50 -04:00
kmemleak-test.c mm/kmemleak-test.c: use pr_fmt for logging 2014-06-06 16:08:18 -07:00
kmemleak.c mm: introduce kmemleak_update_trace() 2014-06-06 16:08:17 -07:00
ksm.c mmu_notifier: call mmu_notifier_invalidate_range() from VMM 2014-11-13 13:46:09 +11:00
list_lru.c mm: keep page cache radix tree nodes in check 2014-04-03 16:21:01 -07:00
maccess.c mm: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
madvise.c VFS: Rename do_fallocate() to vfs_fallocate() 2014-11-07 16:17:44 -05:00
memblock.c mm/memblock.c: refactor functions to set/clear MEMBLOCK_HOTPLUG 2014-12-13 12:42:46 -08:00
memcontrol.c mm/memcontrol.c: remove unused mem_cgroup_lru_names_not_uptodate() 2014-12-13 12:42:49 -08:00
memory-failure.c mm: vmscan: invoke slab shrinkers from shrink_zone() 2014-12-13 12:42:48 -08:00
memory.c mm: protect set_page_dirty() from ongoing truncation 2015-01-08 15:10:51 -08:00
memory_hotplug.c mm, memory_hotplug/failure: drain single zone pcplists 2014-12-10 17:41:05 -08:00
mempolicy.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-12-19 18:19:19 -08:00
mempool.c mm/mempool.c: update the kmemleak stack trace for mempool allocations 2014-06-06 16:08:17 -07:00
migrate.c vm_area_operations: kill ->migrate() 2014-12-17 08:26:51 -05:00
mincore.c mm: mincore: add hwpoison page handle 2014-12-13 12:42:46 -08:00
mlock.c Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-10-13 15:44:12 +02:00
mm_init.c mm: bring back /sys/kernel/mm 2014-01-27 21:02:39 -08:00
mmap.c mm: export find_extend_vma() and handle_mm_fault() for driver use 2014-12-13 12:42:47 -08:00
mmu_context.c sched/mm: call finish_arch_post_lock_switch in idle_task_exit and use_mm 2014-02-21 08:50:17 +01:00
mmu_notifier.c mmu_notifier: add the callback for mmu_notifier_invalidate_range() 2014-11-13 13:46:09 +11:00
mmzone.c mm: numa: Change page last {nid,pid} into {cpu,pid} 2013-10-09 14:47:45 +02:00
mprotect.c mm: softdirty: enable write notifications on VMAs after VM_SOFTDIRTY cleared 2014-10-14 02:18:28 +02:00
mremap.c Merge git://git.kvack.org/~bcrl/aio-next 2014-12-14 13:36:57 -08:00
msync.c msync: fix incorrect fstart calculation 2014-07-03 09:21:53 -07:00
nobootmem.c mem-hotplug: reset node managed pages when hot-adding a new pgdat 2014-11-13 16:17:06 -08:00
nommu.c mm/nommu: use alloc_pages_exact() rather than its own implementation 2014-12-13 12:42:48 -08:00
oom_kill.c oom: kill the insufficient and no longer needed PT_TRACE_EXIT check 2014-12-13 12:42:49 -08:00
page-writeback.c mm: protect set_page_dirty() from ongoing truncation 2015-01-08 15:10:51 -08:00
page_alloc.c mm: cma: split cma-reserved in dmesg log 2014-12-18 19:08:10 -08:00
page_counter.c mm: memcontrol: remove obsolete kmemcg pinning tricks 2014-12-10 17:41:05 -08:00
page_ext.c mm/page_owner: keep track of page owners 2014-12-13 12:42:48 -08:00
page_io.c fix __swap_writepage() compile failure on old gcc versions 2014-06-14 19:30:48 -05:00
page_isolation.c mm, page_isolation: drain single zone pcplists 2014-12-10 17:41:05 -08:00
page_owner.c mm/page_owner: correct owner information for early allocated pages 2014-12-13 12:42:48 -08:00
pagewalk.c mm: use VM_BUG_ON_MM where possible 2014-10-09 22:25:58 -04:00
percpu-km.c percpu: implmeent pcpu_nr_empty_pop_pages and chunk->nr_populated 2014-09-02 14:46:05 -04:00
percpu-vm.c percpu: move region iterations out of pcpu_[de]populate_chunk() 2014-09-02 14:46:02 -04:00
percpu.c percpu: off by one in BUG_ON() 2014-10-29 10:34:34 -04:00
pgtable-generic.c mm: actually clear pmd_numa before invalidating 2014-08-29 16:28:15 -07:00
process_vm_access.c start adding the tag to iov_iter 2014-05-06 17:32:49 -04:00
quicklist.c mm: delete various needless include <linux/module.h> 2011-10-31 09:20:11 -04:00
readahead.c mm/readahead.c: remove unused file_ra_state from count_history_pages 2014-08-06 18:01:15 -07:00
rmap.c mm: prevent endless growth of anon_vma hierarchy 2015-01-08 15:10:51 -08:00
shmem.c new helper: iter_is_iovec() 2014-12-17 06:43:56 -05:00
slab.c slab: fix cpuset check in fallback_alloc 2014-12-13 12:42:53 -08:00
slab.h memcg: use generic slab iterators for showing slabinfo 2014-12-10 17:41:07 -08:00
slab_common.c memcg: use generic slab iterators for showing slabinfo 2014-12-10 17:41:07 -08:00
slob.c mm/sl[ao]b: always track caller in kmalloc_(node_)track_caller() 2014-10-09 22:25:50 -04:00
slub.c slub: fix cpuset check in get_any_partial 2014-12-13 12:42:53 -08:00
sparse-vmemmap.c mm/sparse: use memblock apis for early memory allocations 2014-01-21 16:19:47 -08:00
sparse.c mm: use macros from compiler.h instead of __attribute__((...)) 2014-04-07 16:35:54 -07:00
swap.c mm: memcontrol: do not kill uncharge batching in free_pages_and_swap_cache 2014-10-09 22:25:59 -04:00
swap_cgroup.c mm: page_cgroup: rename file to mm/swap_cgroup.c 2014-12-10 17:41:09 -08:00
swap_state.c mm: page_cgroup: rename file to mm/swap_cgroup.c 2014-12-10 17:41:09 -08:00
swapfile.c mm: page_cgroup: rename file to mm/swap_cgroup.c 2014-12-10 17:41:09 -08:00
truncate.c mm: Fix comment before truncate_setsize() 2014-11-07 08:29:25 +11:00
util.c proc/maps: make vm_is_stack() logic namespace-friendly 2014-10-09 22:25:50 -04:00
vmacache.c mm,vmacache: count number of system-wide flushes 2014-12-13 12:42:48 -08:00
vmalloc.c mm/vmalloc.c: fix memory ordering bug 2014-12-13 12:42:49 -08:00
vmpressure.c mm/vmpressure.c: fix race in vmpressure_work_fn() 2014-12-02 17:32:07 -08:00
vmscan.c mm: vmscan: invoke slab shrinkers from shrink_zone() 2014-12-13 12:42:48 -08:00
vmstat.c mm,vmacache: count number of system-wide flushes 2014-12-13 12:42:48 -08:00
workingset.c mm: keep page cache radix tree nodes in check 2014-04-03 16:21:01 -07:00
zbud.c mm/zbud: init user ops only when it is needed 2014-12-13 12:42:51 -08:00
zpool.c mm/zpool: use prefixed module loading 2014-08-29 16:28:16 -07:00
zsmalloc.c mm/zsmalloc: adjust order of functions 2014-12-18 19:08:11 -08:00
zswap.c mm/zswap: delete unnecessary check before calling free_percpu() 2014-12-13 12:42:50 -08:00