All of lore.kernel.org
 help / color / mirror / Atom feed
* mm: do_wp_page recheck PageKsm after obtaining the page_lock, pte_same not enough
@ 2011-07-12 16:50 Andrea Arcangeli
  2011-07-12 17:48 ` Nai Xia
  2011-07-12 17:53 ` Andrea Arcangeli
  0 siblings, 2 replies; 4+ messages in thread
From: Andrea Arcangeli @ 2011-07-12 16:50 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: linux-mm, Marcelo Tosatti, Johannes Weiner

Hi Hugh,

what do you think about this?

===
Subject: mm: do_wp_page recheck PageKsm after obtaining the page_lock, pte_same not enough

From: Andrea Arcangeli <aarcange@redhat.com>

There seems to be a bug in do_wp_page that if not fixed, it would
lead to a Ksm shared page to be mapped read-write into some process pte leading
to random memory corruption in userland MADV_MEARGEABLE vmas.

If the orig_pte value was read by do_wp_page after
write_protect_page() (likely as if the pte wasn't originally read as
readonly by handle_pte_fault, do_wp_page wouldn't be called in the
first place), but if we reach lock_page() in the !PageKsm path (before
reuse_swap_page is called), but before set_page_stable_node() run (the
kpage == NULL case), the orig_pte wouldn't have changed (after
write_protect_page returned the pte doesn't change anymore and then we
release the page lock), and the pte_same() check would succeed, but
the old_page would have become a PageKsm already before releasing the
page lock in try_to_merge_one_page, so we shouldn't go ahead with
reuse_swap_page in do_wp_page in that case. But we do, and then we
reuse the wrprotected PageKsm in the stable tree allowing userland to
map it read-write. The PageKsm check I introduced below in memory.c
should close this race, it is enough to check the page is not Ksm to
know if we can takeover it or not after we obtain the page lock.

To say it in another way, the current and only PageKsm check in
do_wp_page in short is racy because it's run before trying to obtain
the page lock, so it could run before set_page_stable_node() had a
chance to run yet.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---

diff --git a/mm/memory.c b/mm/memory.c
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2454,7 +2454,8 @@ static int do_wp_page(struct mm_struct *
 			lock_page(old_page);
 			page_table = pte_offset_map_lock(mm, pmd, address,
 							 &ptl);
-			if (!pte_same(*page_table, orig_pte)) {
+			if (!pte_same(*page_table, orig_pte) ||
+			    PageKsm(old_page)) {
 				unlock_page(old_page);
 				goto unlock;
 			}

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: mm: do_wp_page recheck PageKsm after obtaining the page_lock, pte_same not enough
  2011-07-12 16:50 mm: do_wp_page recheck PageKsm after obtaining the page_lock, pte_same not enough Andrea Arcangeli
@ 2011-07-12 17:48 ` Nai Xia
  2011-07-12 18:01   ` Andrea Arcangeli
  2011-07-12 17:53 ` Andrea Arcangeli
  1 sibling, 1 reply; 4+ messages in thread
From: Nai Xia @ 2011-07-12 17:48 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Hugh Dickins, linux-mm, Marcelo Tosatti, Johannes Weiner

On 7/13/11, Andrea Arcangeli <aarcange@redhat.com> wrote:
> Hi Hugh,
>
> what do you think about this?
>
> ===
> Subject: mm: do_wp_page recheck PageKsm after obtaining the page_lock,
> pte_same not enough
>
> From: Andrea Arcangeli <aarcange@redhat.com>
>
> There seems to be a bug in do_wp_page that if not fixed, it would
> lead to a Ksm shared page to be mapped read-write into some process pte
> leading
> to random memory corruption in userland MADV_MEARGEABLE vmas.
>
> If the orig_pte value was read by do_wp_page after
> write_protect_page() (likely as if the pte wasn't originally read as
> readonly by handle_pte_fault, do_wp_page wouldn't be called in the
> first place), but if we reach lock_page() in the !PageKsm path (before
> reuse_swap_page is called), but before set_page_stable_node() run (the
> kpage == NULL case), the orig_pte wouldn't have changed (after
> write_protect_page returned the pte doesn't change anymore and then we
> release the page lock), and the pte_same() check would succeed, but
> the old_page would have become a PageKsm already before releasing the
> page lock in try_to_merge_one_page, so we shouldn't go ahead with
> reuse_swap_page in do_wp_page in that case. But we do, and then we
> reuse the wrprotected PageKsm in the stable tree allowing userland to
> map it read-write. The PageKsm check I introduced below in memory.c
> should close this race, it is enough to check the page is not Ksm to
> know if we can takeover it or not after we obtain the page lock.
>
> To say it in another way, the current and only PageKsm check in
> do_wp_page in short is racy because it's run before trying to obtain
> the page lock, so it could run before set_page_stable_node() had a
> chance to run yet.
>
> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
> ---
>
> diff --git a/mm/memory.c b/mm/memory.c
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2454,7 +2454,8 @@ static int do_wp_page(struct mm_struct *
>  			lock_page(old_page);
>  			page_table = pte_offset_map_lock(mm, pmd, address,
>  							 &ptl);
> -			if (!pte_same(*page_table, orig_pte)) {
> +			if (!pte_same(*page_table, orig_pte) ||
> +			    PageKsm(old_page)) {

I think in this case we should copy the page instead of going to unlock.

And I think reuse_swap_page() has checked the PageKsm(page) inside and
in this case it will go to the copy path already?

>  				unlock_page(old_page);
>  				goto unlock;
>  			}
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Fight unfair telecom internet charges in Canada: sign
> http://stopthemeter.ca/
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: mm: do_wp_page recheck PageKsm after obtaining the page_lock, pte_same not enough
  2011-07-12 16:50 mm: do_wp_page recheck PageKsm after obtaining the page_lock, pte_same not enough Andrea Arcangeli
  2011-07-12 17:48 ` Nai Xia
@ 2011-07-12 17:53 ` Andrea Arcangeli
  1 sibling, 0 replies; 4+ messages in thread
From: Andrea Arcangeli @ 2011-07-12 17:53 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: linux-mm, Marcelo Tosatti, Johannes Weiner

On Tue, Jul 12, 2011 at 06:50:03PM +0200, Andrea Arcangeli wrote:
> Hi Hugh,
> 
> what do you think about this?

Ah I just noticed it couldn't happen so you can ignore the previous
patch... a second check is indeed inside reuse_swap_page.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: mm: do_wp_page recheck PageKsm after obtaining the page_lock, pte_same not enough
  2011-07-12 17:48 ` Nai Xia
@ 2011-07-12 18:01   ` Andrea Arcangeli
  0 siblings, 0 replies; 4+ messages in thread
From: Andrea Arcangeli @ 2011-07-12 18:01 UTC (permalink / raw)
  To: Nai Xia; +Cc: Hugh Dickins, linux-mm, Marcelo Tosatti, Johannes Weiner

On Wed, Jul 13, 2011 at 01:48:10AM +0800, Nai Xia wrote:
> I think in this case we should copy the page instead of going to unlock.
> 
> And I think reuse_swap_page() has checked the PageKsm(page) inside and
> in this case it will go to the copy path already?

yes this is why it's unnecessary, I've been a bit in a paranoid mode
on this code lately, one more check wouldn't have hurted but it's
definitely unnecessary so please ignore...

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2011-07-12 18:01 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-07-12 16:50 mm: do_wp_page recheck PageKsm after obtaining the page_lock, pte_same not enough Andrea Arcangeli
2011-07-12 17:48 ` Nai Xia
2011-07-12 18:01   ` Andrea Arcangeli
2011-07-12 17:53 ` Andrea Arcangeli

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.