From 68bdc8d64742ccc5e340c5d122ebbab3f0cf2a74 Mon Sep 17 00:00:00 2001 From: Hugh Dickins Date: Tue, 6 Jan 2009 14:39:37 -0800 Subject: [PATCH] mm: try_to_unuse check removing right swap There's a possible race in try_to_unuse() which Nick Piggin led me to two years ago. Where it does lock_page() after read_swap_cache_async(), what if another task removed that page from swapcache just before we locked it? It would sail though the (*swap_map > 1) tests doing nothing (because it could not have been removed from swapcache before its swap references were gone), until it reaches the delete_from_swap_cache(page) near the bottom. Now imagine that this page has been allocated to swap on a different swap area while we dropped page lock (perhaps at the top, perhaps in unuse_mm): we could wrongly remove from swap cache before the page has been written to swap, so a subsequent do_swap_page() would read in stale data from swap. I think this case could not happen before: remove_exclusive_swap_page() refused while page count was raised. But now with reuse_swap_page() and try_to_free_swap() removing from swap cache without minding page count, I think it could happen - the previous patch argued that it was safe because try_to_unuse() already ignored page count, but overlooked that it might be breaking the assumptions in try_to_unuse() itself. Signed-off-by: Hugh Dickins Cc: Lee Schermerhorn Cc: Rik van Riel Cc: Nick Piggin Cc: KAMEZAWA Hiroyuki Cc: Robin Holt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- mm/swapfile.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/mm/swapfile.c b/mm/swapfile.c index f43601827607..9ce7f81c8abc 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -889,7 +889,16 @@ static int try_to_unuse(unsigned int type) lock_page(page); wait_on_page_writeback(page); } - if (PageSwapCache(page)) + + /* + * It is conceivable that a racing task removed this page from + * swap cache just before we acquired the page lock at the top, + * or while we dropped it in unuse_mm(). The page might even + * be back in swap cache on another swap area: that we must not + * delete, since it may not have been written out to swap yet. + */ + if (PageSwapCache(page) && + likely(page_private(page) == entry.val)) delete_from_swap_cache(page); /*