* Re: [PATCH] x86/mm: fix a potential race condition in map_pages_to_xen().
  2017-11-09 15:29 [PATCH] x86/mm: fix a potential race condition in map_pages_to_xen() Yu Zhang
@ 2017-11-09  9:19 ` Jan Beulich
  2017-11-09 10:24   ` Yu Zhang
  2017-11-09  9:22 ` Jan Beulich
  1 sibling, 1 reply; 6+ messages in thread
From: Jan Beulich @ 2017-11-09  9:19 UTC (permalink / raw)
  To: Yu Zhang; +Cc: Andrew Cooper, min.he, xen-devel, yi.z.zhang

>>> On 09.11.17 at 16:29, <yu.c.zhang@linux.intel.com> wrote:
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -4844,9 +4844,10 @@ int map_pages_to_xen(
>              {
>                  unsigned long base_mfn;
>  
> -                pl1e = l2e_to_l1e(*pl2e);
>                  if ( locking )
>                      spin_lock(&map_pgdir_lock);
> +
> +                pl1e = l2e_to_l1e(*pl2e);
>                  base_mfn = l1e_get_pfn(*pl1e) & ~(L1_PAGETABLE_ENTRIES - 1);
>                  for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++, pl1e++ )
>                      if ( (l1e_get_pfn(*pl1e) != (base_mfn + i)) ||

I agree with the general observation, but there are three things I'd
like to see considered:

1) Please extend the change slightly such that the L2E
re-consolidation code matches the L3E one (i.e. latch into ol2e
earlier and pass that one to l2e_to_l1e()). Personally I would even
prefer if the presence/absence of blank lines matched between
the two pieces of code.

2) Is your change actually enough to take care of all forms of the
race you describe? In particular, isn't it necessary to re-check PSE
after having taken the lock, in case another CPU has just finished
doing the re-consolidation?

3) What about the empty&free checks in modify_xen_mappings()?
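
For point 1, a minimal, self-contained sketch of the latching pattern
may help. It is not the Xen code: model_l2e_t, model_lock,
model_spin_lock(), model_spin_unlock(), model_l1_contiguous() and
MODEL_ENTRIES are hypothetical stand-ins (for the L2 entry type,
map_pgdir_lock, spin_lock()/spin_unlock(), the re-consolidation scan and
L1_PAGETABLE_ENTRIES), and the mapping-flag comparison is left out. The
only thing it is meant to show is that the shared entry is copied into
ol2e once the lock is held, and that the L1 pointer is derived from that
copy.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_ENTRIES 8u   /* hypothetical; stands in for L1_PAGETABLE_ENTRIES */

/* Hypothetical model of an L2 entry that references an L1 table. */
typedef struct {
    uint64_t *l1;          /* L1 table, modelled as an array of frame numbers */
} model_l2e_t;

/* Hypothetical spinlock standing in for map_pgdir_lock. */
static atomic_flag model_lock = ATOMIC_FLAG_INIT;

static void model_spin_lock(atomic_flag *lock)
{
    while ( atomic_flag_test_and_set_explicit(lock, memory_order_acquire) )
        ; /* spin until the flag was previously clear */
}

static void model_spin_unlock(atomic_flag *lock)
{
    atomic_flag_clear_explicit(lock, memory_order_release);
}

/*
 * Return true when the L1 table behind *pl2e maps MODEL_ENTRIES consecutive
 * frames starting at an aligned base.
 */
static bool model_l1_contiguous(const model_l2e_t *pl2e)
{
    bool contiguous = true;

    model_spin_lock(&model_lock);

    /* Latch the shared entry only once the lock is held ... */
    model_l2e_t ol2e = *pl2e;
    /* ... and derive the L1 pointer from the latched copy, not from *pl2e. */
    const uint64_t *pl1e = ol2e.l1;

    assert(pl1e != NULL);  /* this model assumes the entry references a table */

    uint64_t base = pl1e[0] & ~(uint64_t)(MODEL_ENTRIES - 1);

    for ( unsigned int i = 0; i < MODEL_ENTRIES; i++ )
        if ( pl1e[i] != base + i )
        {
            contiguous = false;
            break;
        }

    model_spin_unlock(&model_lock);

    return contiguous;
}
```

Reading the entry before model_spin_lock() would reintroduce the window
described in the patch: the copy could go stale before the scan starts.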

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* Re: [PATCH] x86/mm: fix a potential race condition in map_pages_to_xen().
  2017-11-09 15:29 [PATCH] x86/mm: fix a potential race condition in map_pages_to_xen() Yu Zhang
  2017-11-09  9:19 ` Jan Beulich
@ 2017-11-09  9:22 ` Jan Beulich
  2017-11-09 10:32   ` Yu Zhang
  1 sibling, 1 reply; 6+ messages in thread
From: Jan Beulich @ 2017-11-09  9:22 UTC (permalink / raw)
  To: Yu Zhang; +Cc: Andrew Cooper, min.he, xen-devel, yi.z.zhang

>>> On 09.11.17 at 16:29, <yu.c.zhang@linux.intel.com> wrote:
> In map_pages_to_xen(), an L2 page table entry may be reset to point to
> a superpage when all entries of its L1 page table map consecutive page
> frames with the same mapping flags. The L1 page table then needs to be
> freed.
>
> However, `pl1e` is read before the lock is taken, so it is not protected
> while the L1 page table is enumerated. A race condition may occur if this
> code path is invoked simultaneously on different CPUs.
>
> For example, `pl1e` on CPU0 may hold a stale value, pointing to a page
> which has just been freed on CPU1. Until this page is reused, it still
> holds the old PTEs, which reference consecutive page frames. Consequently
> `free_xen_pagetable(l2e_to_l1e(ol2e))` is triggered on CPU0, resulting in
> the unexpected freeing of a normal page.
>
> Reading `pl1e` with the lock held fixes this race condition.
> 
> Signed-off-by: Min He <min.he@intel.com>
> Signed-off-by: Yi Zhang <yi.z.zhang@intel.com>
> Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>

Oh, one more thing: Is it really the case that all three of you
contributed to the patch? We don't use the Linux model of
everyone through whose hands a patch passes adding an
S-o-b of their own - that would rather be Reviewed-by then (if
applicable).

Also generally I would consider the first S-o-b to be that of the
original author, yet the absence of an explicit From: tag makes
authorship ambiguous here. Please clarify this in v2.

Jan



* Re: [PATCH] x86/mm: fix a potential race condition in map_pages_to_xen().
  2017-11-09  9:19 ` Jan Beulich
@ 2017-11-09 10:24   ` Yu Zhang
  2017-11-09 12:49     ` Jan Beulich
  0 siblings, 1 reply; 6+ messages in thread
From: Yu Zhang @ 2017-11-09 10:24 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, min.he, yi.z.zhang, xen-devel



On 11/9/2017 5:19 PM, Jan Beulich wrote:
>>>> On 09.11.17 at 16:29, <yu.c.zhang@linux.intel.com> wrote:
>> --- a/xen/arch/x86/mm.c
>> +++ b/xen/arch/x86/mm.c
>> @@ -4844,9 +4844,10 @@ int map_pages_to_xen(
>>               {
>>                   unsigned long base_mfn;
>>   
>> -                pl1e = l2e_to_l1e(*pl2e);
>>                   if ( locking )
>>                       spin_lock(&map_pgdir_lock);
>> +
>> +                pl1e = l2e_to_l1e(*pl2e);
>>                   base_mfn = l1e_get_pfn(*pl1e) & ~(L1_PAGETABLE_ENTRIES - 1);
>>                   for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++, pl1e++ )
>>                       if ( (l1e_get_pfn(*pl1e) != (base_mfn + i)) ||
> I agree with the general observation, but there are three things I'd
> like to see considered:
>
> 1) Please extend the change slightly such that the L2E
> re-consolidation code matches the L3E one (i.e. latch into ol2e
> earlier and pass that one to l2e_to_l1e()). Personally I would even
> prefer if the presence/absence of blank lines matched between
> the two pieces of code.

Got it. Thanks.

>
> 2) Is your change actually enough to take care of all forms of the
> race you describe? In particular, isn't it necessary to re-check PSE
> after having taken the lock, in case another CPU has just finished
> doing the re-consolidation?

Good question. :-)

I'd thought of checking the PSE for pl2e, and dropped that. My understanding
was as follows:
After the lock is taken, pl2e will point either to an L1 page table (the
normal case), or to a superpage if another CPU has just finished the
re-consolidation and released the lock. In the latter scenario,
l1e_get_pfn(*pl1e) should not be equal to (base_mfn + i), so the scan
would stop early without re-consolidating.

But on second thought, that understanding relies on an assumption about the
contents of the target superpage. No matter how small the chance is, we
cannot make such an assumption.

So my suggestion is that we add a check of the PSE flag and, if it is set,
"goto check_l3".
Does this sound reasonable to you?
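
A minimal sketch of that re-check, under stated assumptions rather than
as the actual change: model_l2e_t, model_lock, model_reconsolidate(),
the model_result_t values and MODEL_ENTRIES are all hypothetical,
malloc()/free() stand in for the Xen page table allocator, and flag
comparison and TLB flushing are omitted. Returning MODEL_SKIPPED plays
the role of the "goto check_l3".

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define MODEL_ENTRIES 8u   /* hypothetical; stands in for L1_PAGETABLE_ENTRIES */

/* Hypothetical model of an L2 entry: a superpage, or a pointer to an L1 table. */
typedef struct {
    bool      pse;         /* set when the entry maps a superpage */
    uint64_t  base;        /* first frame of the superpage when pse is set */
    uint64_t *l1;          /* malloc()ed L1 table when pse is clear */
} model_l2e_t;

typedef enum { MODEL_SKIPPED, MODEL_KEPT, MODEL_MERGED } model_result_t;

/* Hypothetical spinlock standing in for map_pgdir_lock. */
static atomic_flag model_lock = ATOMIC_FLAG_INIT;

static void model_spin_lock(atomic_flag *lock)
{
    while ( atomic_flag_test_and_set_explicit(lock, memory_order_acquire) )
        ; /* spin until the flag was previously clear */
}

static void model_spin_unlock(atomic_flag *lock)
{
    atomic_flag_clear_explicit(lock, memory_order_release);
}

static model_result_t model_reconsolidate(model_l2e_t *pl2e)
{
    model_result_t res = MODEL_MERGED;

    model_spin_lock(&model_lock);

    model_l2e_t ol2e = *pl2e;          /* latch with the lock held */

    if ( ol2e.pse )
    {
        /* Another CPU merged already: there is no L1 table to scan or free. */
        model_spin_unlock(&model_lock);
        return MODEL_SKIPPED;
    }

    assert(ol2e.l1 != NULL);           /* a non-PSE entry must reference a table */

    uint64_t base = ol2e.l1[0] & ~(uint64_t)(MODEL_ENTRIES - 1);

    for ( unsigned int i = 0; i < MODEL_ENTRIES; i++ )
        if ( ol2e.l1[i] != base + i )
        {
            res = MODEL_KEPT;          /* not contiguous: leave the entry alone */
            break;
        }

    if ( res == MODEL_MERGED )
    {
        /* Replace the table reference by a superpage mapping. */
        pl2e->pse  = true;
        pl2e->base = base;
        pl2e->l1   = NULL;
    }

    model_spin_unlock(&model_lock);

    if ( res == MODEL_MERGED )
        free(ol2e.l1);                 /* freed exactly once, via the latched copy */

    return res;
}
```

Calling it twice on the same entry models two CPUs arriving back to
back: the first call merges and frees the L1 table, the second sees PSE
set and returns MODEL_SKIPPED without looking at the freed table.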

>
> 3) What about the empty&free checks in modify_xen_mappings()?

Oh. Thanks for the reminder.
Just had a look. It seems pl1e or pl2e may be freed more than once by the
empty & free checks, due to the lack of protection.
So we'd better add a lock there too, right?
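
One way to picture it, as a sketch only and not modify_xen_mappings()
itself: the emptiness check and the clearing of the parent entry happen
under the same lock, so only one caller can end up owning the table to
free. model_l2e_t, model_lock, model_free_if_empty() and MODEL_ENTRIES
are hypothetical names, an all-zero array models an empty L1 table, and
calloc()/free() stand in for the Xen page table allocator.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define MODEL_ENTRIES 8u   /* hypothetical; stands in for L1_PAGETABLE_ENTRIES */

/* Hypothetical model of an L2 entry; l1 == NULL models a not-present entry. */
typedef struct {
    uint64_t *l1;
} model_l2e_t;

/* Hypothetical spinlock standing in for map_pgdir_lock. */
static atomic_flag model_lock = ATOMIC_FLAG_INIT;

static void model_spin_lock(atomic_flag *lock)
{
    while ( atomic_flag_test_and_set_explicit(lock, memory_order_acquire) )
        ; /* spin until the flag was previously clear */
}

static void model_spin_unlock(atomic_flag *lock)
{
    atomic_flag_clear_explicit(lock, memory_order_release);
}

/* Free the L1 table behind *pl2e if every entry is empty; true if freed. */
static bool model_free_if_empty(model_l2e_t *pl2e)
{
    uint64_t *victim = NULL;

    assert(pl2e != NULL);

    model_spin_lock(&model_lock);

    /* Re-read the entry under the lock: another CPU may have freed the table. */
    uint64_t *pl1e = pl2e->l1;

    if ( pl1e != NULL )
    {
        unsigned int i;

        for ( i = 0; i < MODEL_ENTRIES; i++ )
            if ( pl1e[i] != 0 )
                break;

        if ( i == MODEL_ENTRIES )
        {
            pl2e->l1 = NULL;   /* unhook the table while still holding the lock */
            victim = pl1e;
        }
    }

    model_spin_unlock(&model_lock);

    bool freed = (victim != NULL);

    free(victim);              /* free(NULL) is a no-op */

    return freed;
}
```

A second call on the same entry finds it already cleared and returns
false, which is the double free the lock is meant to rule out.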

Yu



* Re: [PATCH] x86/mm: fix a potential race condition in map_pages_to_xen().
  2017-11-09  9:22 ` Jan Beulich
@ 2017-11-09 10:32   ` Yu Zhang
  0 siblings, 0 replies; 6+ messages in thread
From: Yu Zhang @ 2017-11-09 10:32 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, min.he, xen-devel, yi.z.zhang



On 11/9/2017 5:22 PM, Jan Beulich wrote:
>>>> On 09.11.17 at 16:29, <yu.c.zhang@linux.intel.com> wrote:
>> In map_pages_to_xen(), an L2 page table entry may be reset to point to
>> a superpage when all entries of its L1 page table map consecutive page
>> frames with the same mapping flags. The L1 page table then needs to be
>> freed.
>>
>> However, `pl1e` is read before the lock is taken, so it is not protected
>> while the L1 page table is enumerated. A race condition may occur if this
>> code path is invoked simultaneously on different CPUs.
>>
>> For example, `pl1e` on CPU0 may hold a stale value, pointing to a page
>> which has just been freed on CPU1. Until this page is reused, it still
>> holds the old PTEs, which reference consecutive page frames. Consequently
>> `free_xen_pagetable(l2e_to_l1e(ol2e))` is triggered on CPU0, resulting in
>> the unexpected freeing of a normal page.
>>
>> Reading `pl1e` with the lock held fixes this race condition.
>>
>> Signed-off-by: Min He <min.he@intel.com>
>> Signed-off-by: Yi Zhang <yi.z.zhang@intel.com>
>> Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
> Oh, one more thing: Is it really the case that all three of you
> contributed to the patch? We don't use the Linux model of
> everyone through whose hands a patch passes adding an
> S-o-b of their own - that would rather be Reviewed-by then (if
> applicable).
>
> Also generally I would consider the first S-o-b to be that of the
> original author, yet the absence of an explicit From: tag makes
> authorship ambiguous here. Please clarify this in v2.

Oh, the three of us found this issue while debugging a bug together, and
Min is the author of this patch. So I'd like to add

"From: Min He <min.he@intel.com>"

at the beginning of the commit message in v2. :-)

Yu
> Jan
>
>



* Re: [PATCH] x86/mm: fix a potential race condition in map_pages_to_xen().
  2017-11-09 10:24   ` Yu Zhang
@ 2017-11-09 12:49     ` Jan Beulich
  0 siblings, 0 replies; 6+ messages in thread
From: Jan Beulich @ 2017-11-09 12:49 UTC (permalink / raw)
  To: Yu Zhang; +Cc: Andrew Cooper, min.he, xen-devel, yi.z.zhang

>>> On 09.11.17 at 11:24, <yu.c.zhang@linux.intel.com> wrote:
> On 11/9/2017 5:19 PM, Jan Beulich wrote:
>> 2) Is your change actually enough to take care of all forms of the
>> race you describe? In particular, isn't it necessary to re-check PSE
>> after having taken the lock, in case another CPU has just finished
>> doing the re-consolidation?
> 
> Good question. :-)
> 
> I'd thought of checking the PSE for pl2e, and dropped that. My understanding
> was as follows:
> After the lock is taken, pl2e will point either to an L1 page table (the
> normal case), or to a superpage if another CPU has just finished the
> re-consolidation and released the lock. In the latter scenario,
> l1e_get_pfn(*pl1e) should not be equal to (base_mfn + i), so the scan
> would stop early without re-consolidating.
>
> But on second thought, that understanding relies on an assumption about the
> contents of the target superpage. No matter how small the chance is, we
> cannot make such an assumption.
>
> So my suggestion is that we add a check of the PSE flag and, if it is set,
> "goto check_l3".
> Does this sound reasonable to you?

Yes; for the L3 case it'll be a simple "continue" afaict.

>> 3) What about the empty&free checks in modify_xen_mappings()?
> 
> Oh. Thanks for the reminder.
> Just had a look. It seems pl1e or pl2e may be freed more than once by the
> empty & free checks, due to the lack of protection.
> So we'd better add a lock there too, right?

Yes, I think so.

Jan



* [PATCH] x86/mm: fix a potential race condition in map_pages_to_xen().
@ 2017-11-09 15:29 Yu Zhang
  2017-11-09  9:19 ` Jan Beulich
  2017-11-09  9:22 ` Jan Beulich
  0 siblings, 2 replies; 6+ messages in thread
From: Yu Zhang @ 2017-11-09 15:29 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, min.he, Jan Beulich, yi.z.zhang

In map_pages_to_xen(), an L2 page table entry may be reset to point to
a superpage when all entries of its L1 page table map consecutive page
frames with the same mapping flags. The L1 page table then needs to be
freed.

However, `pl1e` is read before the lock is taken, so it is not protected
while the L1 page table is enumerated. A race condition may occur if this
code path is invoked simultaneously on different CPUs.

For example, `pl1e` on CPU0 may hold a stale value, pointing to a page
which has just been freed on CPU1. Until this page is reused, it still
holds the old PTEs, which reference consecutive page frames. Consequently
`free_xen_pagetable(l2e_to_l1e(ol2e))` is triggered on CPU0, resulting in
the unexpected freeing of a normal page.

Reading `pl1e` with the lock held fixes this race condition.

Signed-off-by: Min He <min.he@intel.com>
Signed-off-by: Yi Zhang <yi.z.zhang@intel.com>
Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/mm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index a20fdca..9c9afa1 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4844,9 +4844,10 @@ int map_pages_to_xen(
             {
                 unsigned long base_mfn;
 
-                pl1e = l2e_to_l1e(*pl2e);
                 if ( locking )
                     spin_lock(&map_pgdir_lock);
+
+                pl1e = l2e_to_l1e(*pl2e);
                 base_mfn = l1e_get_pfn(*pl1e) & ~(L1_PAGETABLE_ENTRIES - 1);
                 for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++, pl1e++ )
                     if ( (l1e_get_pfn(*pl1e) != (base_mfn + i)) ||
-- 
2.5.0


