mm: gup persist for write permission

do_wp_page()'s VM_FAULT_WRITE return value tells __get_user_pages() that
COW has been done if necessary, though it may be leaving the pte without
write permission - for the odd case of forced writing to a readonly vma
for ptrace.  At present GUP then retries the follow_page() without asking
for write permission, to escape an endless loop when forced.

But an application may be relying on GUP to guarantee a writable page
which won't be COWed again when written from userspace, whereas a race
here might leave a readonly pte in place?  Change the VM_FAULT_WRITE
handling to ask follow_page() for write permission again, except in that
odd case of forced writing to a readonly vma.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Robin Holt <holt@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This commit is contained in:
Hugh Dickins 2009-01-06 14:39:32 -08:00 committed by Linus Torvalds
parent 2da02997e0
commit 878b63ac88

View file

@ -1264,9 +1264,15 @@ int __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
* do_wp_page has broken COW when necessary, * do_wp_page has broken COW when necessary,
* even if maybe_mkwrite decided not to set * even if maybe_mkwrite decided not to set
* pte_write. We can thus safely do subsequent * pte_write. We can thus safely do subsequent
* page lookups as if they were reads. * page lookups as if they were reads. But only
* do so when looping for pte_write is futile:
* in some cases userspace may also be wanting
* to write to the gotten user page, which a
* read fault here might prevent (a readonly
* page might get reCOWed by userspace write).
*/ */
if (ret & VM_FAULT_WRITE) if ((ret & VM_FAULT_WRITE) &&
!(vma->vm_flags & VM_WRITE))
foll_flags &= ~FOLL_WRITE; foll_flags &= ~FOLL_WRITE;
cond_resched(); cond_resched();