All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] x86/mm: drop bogus assertion
@ 2017-11-30  9:10 Jan Beulich
  2017-11-30  9:17 ` Jan Beulich
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Jan Beulich @ 2017-11-30  9:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Olaf Hering, Tim Deegan

Olaf has observed this assertion to trigger after an aborted migration
of a PV guest (it remains to be determined why there is a page fault in
the first place here:

(XEN) Xen call trace:
(XEN)    [<ffff82d0802a85dc>] do_page_fault+0x39f/0x55c
(XEN)    [<ffff82d08036b7d8>] x86_64/entry.S#handle_exception_saved+0x66/0xa4
(XEN)    [<ffff82d0802a9274>] __copy_to_user_ll+0x22/0x30
(XEN)    [<ffff82d0802772d4>] update_runstate_area+0x19c/0x228
(XEN)    [<ffff82d080277371>] domain.c#_update_runstate_area+0x11/0x39
(XEN)    [<ffff82d080277596>] context_switch+0x1fd/0xf25
(XEN)    [<ffff82d0802395c5>] schedule.c#schedule+0x303/0x6a8
(XEN)    [<ffff82d08023d067>] softirq.c#__do_softirq+0x6c/0x95
(XEN)    [<ffff82d08023d0da>] do_softirq+0x13/0x15
(XEN)    [<ffff82d08036b2f1>] x86_64/entry.S#process_softirqs+0x21/0x30

i.e. the guest specified a runstate area address which has a non-present
mapping in the page tables [EC=0002 CR2=ffff88003d405220], but that's
not something the hypervisor needs to be concerned about.) Release
builds work fine, which is a first indication that the assertion isn't
really needed.

What's worse though - there appears to be a timing window where the
guest runs in shadow mode, but not in log-dirty mode, and that is what
triggers the assertion (the same could, afaict, be achieved by test-
enabling shadow mode on a PV guest). This is because turing off log-
dirty mode is being performed in two steps: First the log-dirty bit gets
cleared (paging_log_dirty_disable() [having paused the domain] ->
sh_disable_log_dirty() -> shadow_one_bit_disable()), followed by
unpausing the domain and only then clearing shadow mode (via
shadow_test_disable(), which pauses the domain a second time).

Hence besides removing the ASSERT() here (or optionally replacing it by
explicit translate and refcounts mode checks, but this seems rather
pointless now that the three are tied together) I wonder whether either
shadow_one_bit_disable() should turn off shadow mode if no other bit
besides PG_SH_enable remains set (just like shadow_one_bit_enable()
enables it if not already set), or the domain pausing scope should be
extended so that both steps occur without the domain getting a chance to
run in between.

Reported-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1338,12 +1338,8 @@ static int fixup_page_fault(unsigned lon
      */
     if ( paging_mode_enabled(d) && !paging_mode_external(d) )
     {
-        int ret;
+        int ret = paging_fault(addr, regs);
 
-        /* Logdirty mode is the only expected paging mode for PV guests. */
-        ASSERT(paging_mode_only_log_dirty(d));
-
-        ret = paging_fault(addr, regs);
         if ( ret == EXCRET_fault_fixed )
             trace_trap_two_addr(TRC_PV_PAGING_FIXUP, regs->rip, addr);
         return ret;
--- a/xen/include/asm-x86/paging.h
+++ b/xen/include/asm-x86/paging.h
@@ -69,9 +69,6 @@
 #define paging_mode_translate(_d) (!!((_d)->arch.paging.mode & PG_translate))
 #define paging_mode_external(_d)  (!!((_d)->arch.paging.mode & PG_external))
 
-#define paging_mode_only_log_dirty(_d)                  \
-    (((_d)->arch.paging.mode & PG_MASK) == PG_log_dirty)
-
 /* flags used for paging debug */
 #define PAGING_DEBUG_LOGDIRTY 0
 




_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] x86/mm: drop bogus assertion
  2017-11-30  9:10 [PATCH] x86/mm: drop bogus assertion Jan Beulich
@ 2017-11-30  9:17 ` Jan Beulich
  2017-11-30 11:33 ` Andrew Cooper
  2017-12-06  9:31 ` Ping: " Jan Beulich
  2 siblings, 0 replies; 7+ messages in thread
From: Jan Beulich @ 2017-11-30  9:17 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Olaf Hering, Tim Deegan

>>> On 30.11.17 at 10:10, <JBeulich@suse.com> wrote:
> Olaf has observed this assertion to trigger after an aborted migration
> of a PV guest (it remains to be determined why there is a page fault in
> the first place here:
> 
> (XEN) Xen call trace:
> (XEN)    [<ffff82d0802a85dc>] do_page_fault+0x39f/0x55c
> (XEN)    [<ffff82d08036b7d8>] x86_64/entry.S#handle_exception_saved+0x66/0xa4
> (XEN)    [<ffff82d0802a9274>] __copy_to_user_ll+0x22/0x30
> (XEN)    [<ffff82d0802772d4>] update_runstate_area+0x19c/0x228
> (XEN)    [<ffff82d080277371>] domain.c#_update_runstate_area+0x11/0x39
> (XEN)    [<ffff82d080277596>] context_switch+0x1fd/0xf25
> (XEN)    [<ffff82d0802395c5>] schedule.c#schedule+0x303/0x6a8
> (XEN)    [<ffff82d08023d067>] softirq.c#__do_softirq+0x6c/0x95
> (XEN)    [<ffff82d08023d0da>] do_softirq+0x13/0x15
> (XEN)    [<ffff82d08036b2f1>] x86_64/entry.S#process_softirqs+0x21/0x30
> 
> i.e. the guest specified a runstate area address which has a non-present
> mapping in the page tables [EC=0002 CR2=ffff88003d405220], but that's
> not something the hypervisor needs to be concerned about.)

It occurred to me only after sending that this would legitimately
happen when the vCPU is in user mode, i.e. likely nothing to be
concerned about. I guess I'll drop this part of the description.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] x86/mm: drop bogus assertion
  2017-11-30  9:10 [PATCH] x86/mm: drop bogus assertion Jan Beulich
  2017-11-30  9:17 ` Jan Beulich
@ 2017-11-30 11:33 ` Andrew Cooper
  2017-11-30 11:57   ` Jan Beulich
  2017-12-06  9:31 ` Ping: " Jan Beulich
  2 siblings, 1 reply; 7+ messages in thread
From: Andrew Cooper @ 2017-11-30 11:33 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Olaf Hering, Tim Deegan

On 30/11/17 09:10, Jan Beulich wrote:
> Olaf has observed this assertion to trigger after an aborted migration
> of a PV guest (it remains to be determined why there is a page fault in
> the first place here:
>
> (XEN) Xen call trace:
> (XEN)    [<ffff82d0802a85dc>] do_page_fault+0x39f/0x55c
> (XEN)    [<ffff82d08036b7d8>] x86_64/entry.S#handle_exception_saved+0x66/0xa4
> (XEN)    [<ffff82d0802a9274>] __copy_to_user_ll+0x22/0x30
> (XEN)    [<ffff82d0802772d4>] update_runstate_area+0x19c/0x228
> (XEN)    [<ffff82d080277371>] domain.c#_update_runstate_area+0x11/0x39
> (XEN)    [<ffff82d080277596>] context_switch+0x1fd/0xf25
> (XEN)    [<ffff82d0802395c5>] schedule.c#schedule+0x303/0x6a8
> (XEN)    [<ffff82d08023d067>] softirq.c#__do_softirq+0x6c/0x95
> (XEN)    [<ffff82d08023d0da>] do_softirq+0x13/0x15
> (XEN)    [<ffff82d08036b2f1>] x86_64/entry.S#process_softirqs+0x21/0x30
>
> i.e. the guest specified a runstate area address which has a non-present
> mapping in the page tables [EC=0002 CR2=ffff88003d405220], but that's
> not something the hypervisor needs to be concerned about.) Release
> builds work fine, which is a first indication that the assertion isn't
> really needed.
>
> What's worse though - there appears to be a timing window where the
> guest runs in shadow mode, but not in log-dirty mode, and that is what
> triggers the assertion (the same could, afaict, be achieved by test-
> enabling shadow mode on a PV guest). This is because turing off log-
> dirty mode is being performed in two steps: First the log-dirty bit gets
> cleared (paging_log_dirty_disable() [having paused the domain] ->
> sh_disable_log_dirty() -> shadow_one_bit_disable()), followed by
> unpausing the domain and only then clearing shadow mode (via
> shadow_test_disable(), which pauses the domain a second time).
>
> Hence besides removing the ASSERT() here (or optionally replacing it by
> explicit translate and refcounts mode checks, but this seems rather
> pointless now that the three are tied together) I wonder whether either
> shadow_one_bit_disable() should turn off shadow mode if no other bit
> besides PG_SH_enable remains set (just like shadow_one_bit_enable()
> enables it if not already set), or the domain pausing scope should be
> extended so that both steps occur without the domain getting a chance to
> run in between.
>
> Reported-by: Olaf Hering <olaf@aepfle.de>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -1338,12 +1338,8 @@ static int fixup_page_fault(unsigned lon
>       */
>      if ( paging_mode_enabled(d) && !paging_mode_external(d) )
>      {
> -        int ret;
> +        int ret = paging_fault(addr, regs);
>  
> -        /* Logdirty mode is the only expected paging mode for PV guests. */
> -        ASSERT(paging_mode_only_log_dirty(d));

This assert was introduced for a specific reason, as a result of
d5c251c22b7, although the later bugfix 7b9f21cabc complicated things
somewhat.

The observation about shadow/logdirty happening in two phases does
invalidate the check in this form.  Its original purpose was to check
that we hadn't slipped into PV autotranslate mode, and I think that is
still worth keeping around.

How about reducing it to just a simple ASSERT(!paging_mode_translate(d)) ?

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] x86/mm: drop bogus assertion
  2017-11-30 11:33 ` Andrew Cooper
@ 2017-11-30 11:57   ` Jan Beulich
  0 siblings, 0 replies; 7+ messages in thread
From: Jan Beulich @ 2017-11-30 11:57 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Olaf Hering, Tim Deegan

>>> On 30.11.17 at 12:33, <andrew.cooper3@citrix.com> wrote:
> On 30/11/17 09:10, Jan Beulich wrote:
>> Olaf has observed this assertion to trigger after an aborted migration
>> of a PV guest (it remains to be determined why there is a page fault in
>> the first place here:
>>
>> (XEN) Xen call trace:
>> (XEN)    [<ffff82d0802a85dc>] do_page_fault+0x39f/0x55c
>> (XEN)    [<ffff82d08036b7d8>] x86_64/entry.S#handle_exception_saved+0x66/0xa4
>> (XEN)    [<ffff82d0802a9274>] __copy_to_user_ll+0x22/0x30
>> (XEN)    [<ffff82d0802772d4>] update_runstate_area+0x19c/0x228
>> (XEN)    [<ffff82d080277371>] domain.c#_update_runstate_area+0x11/0x39
>> (XEN)    [<ffff82d080277596>] context_switch+0x1fd/0xf25
>> (XEN)    [<ffff82d0802395c5>] schedule.c#schedule+0x303/0x6a8
>> (XEN)    [<ffff82d08023d067>] softirq.c#__do_softirq+0x6c/0x95
>> (XEN)    [<ffff82d08023d0da>] do_softirq+0x13/0x15
>> (XEN)    [<ffff82d08036b2f1>] x86_64/entry.S#process_softirqs+0x21/0x30
>>
>> i.e. the guest specified a runstate area address which has a non-present
>> mapping in the page tables [EC=0002 CR2=ffff88003d405220], but that's
>> not something the hypervisor needs to be concerned about.) Release
>> builds work fine, which is a first indication that the assertion isn't
>> really needed.
>>
>> What's worse though - there appears to be a timing window where the
>> guest runs in shadow mode, but not in log-dirty mode, and that is what
>> triggers the assertion (the same could, afaict, be achieved by test-
>> enabling shadow mode on a PV guest). This is because turing off log-
>> dirty mode is being performed in two steps: First the log-dirty bit gets
>> cleared (paging_log_dirty_disable() [having paused the domain] ->
>> sh_disable_log_dirty() -> shadow_one_bit_disable()), followed by
>> unpausing the domain and only then clearing shadow mode (via
>> shadow_test_disable(), which pauses the domain a second time).
>>
>> Hence besides removing the ASSERT() here (or optionally replacing it by
>> explicit translate and refcounts mode checks, but this seems rather
>> pointless now that the three are tied together) I wonder whether either
>> shadow_one_bit_disable() should turn off shadow mode if no other bit
>> besides PG_SH_enable remains set (just like shadow_one_bit_enable()
>> enables it if not already set), or the domain pausing scope should be
>> extended so that both steps occur without the domain getting a chance to
>> run in between.
>>
>> Reported-by: Olaf Hering <olaf@aepfle.de>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>
>> --- a/xen/arch/x86/traps.c
>> +++ b/xen/arch/x86/traps.c
>> @@ -1338,12 +1338,8 @@ static int fixup_page_fault(unsigned lon
>>       */
>>      if ( paging_mode_enabled(d) && !paging_mode_external(d) )
>>      {
>> -        int ret;
>> +        int ret = paging_fault(addr, regs);
>>  
>> -        /* Logdirty mode is the only expected paging mode for PV guests. */
>> -        ASSERT(paging_mode_only_log_dirty(d));
> 
> This assert was introduced for a specific reason, as a result of
> d5c251c22b7, although the later bugfix 7b9f21cabc complicated things
> somewhat.

I don't see how this later change complicated things; see also
below.

> The observation about shadow/logdirty happening in two phases does
> invalidate the check in this form.  Its original purpose was to check
> that we hadn't slipped into PV autotranslate mode, and I think that is
> still worth keeping around.
> 
> How about reducing it to just a simple ASSERT(!paging_mode_translate(d)) ?

Well, I've mentioned this as an option, so I don't mind; the reason
I didn't do it this way is because, as said, we now have uniformly
external == refcounts == translate anyway, and this is in a
!external conditional already.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Ping: [PATCH] x86/mm: drop bogus assertion
  2017-11-30  9:10 [PATCH] x86/mm: drop bogus assertion Jan Beulich
  2017-11-30  9:17 ` Jan Beulich
  2017-11-30 11:33 ` Andrew Cooper
@ 2017-12-06  9:31 ` Jan Beulich
  2017-12-08  8:19   ` Tim Deegan
  2 siblings, 1 reply; 7+ messages in thread
From: Jan Beulich @ 2017-12-06  9:31 UTC (permalink / raw)
  To: Tim Deegan; +Cc: Andrew Cooper, Olaf Hering, xen-devel

>>> On 30.11.17 at 10:10, <JBeulich@suse.com> wrote:
> Olaf has observed this assertion to trigger after an aborted migration
> of a PV guest (it remains to be determined why there is a page fault in
> the first place here:
> 
> (XEN) Xen call trace:
> (XEN)    [<ffff82d0802a85dc>] do_page_fault+0x39f/0x55c
> (XEN)    [<ffff82d08036b7d8>] x86_64/entry.S#handle_exception_saved+0x66/0xa4
> (XEN)    [<ffff82d0802a9274>] __copy_to_user_ll+0x22/0x30
> (XEN)    [<ffff82d0802772d4>] update_runstate_area+0x19c/0x228
> (XEN)    [<ffff82d080277371>] domain.c#_update_runstate_area+0x11/0x39
> (XEN)    [<ffff82d080277596>] context_switch+0x1fd/0xf25
> (XEN)    [<ffff82d0802395c5>] schedule.c#schedule+0x303/0x6a8
> (XEN)    [<ffff82d08023d067>] softirq.c#__do_softirq+0x6c/0x95
> (XEN)    [<ffff82d08023d0da>] do_softirq+0x13/0x15
> (XEN)    [<ffff82d08036b2f1>] x86_64/entry.S#process_softirqs+0x21/0x30
> 
> i.e. the guest specified a runstate area address which has a non-present
> mapping in the page tables [EC=0002 CR2=ffff88003d405220], but that's
> not something the hypervisor needs to be concerned about.) Release
> builds work fine, which is a first indication that the assertion isn't
> really needed.
> 
> What's worse though - there appears to be a timing window where the
> guest runs in shadow mode, but not in log-dirty mode, and that is what
> triggers the assertion (the same could, afaict, be achieved by test-
> enabling shadow mode on a PV guest). This is because turing off log-
> dirty mode is being performed in two steps: First the log-dirty bit gets
> cleared (paging_log_dirty_disable() [having paused the domain] ->
> sh_disable_log_dirty() -> shadow_one_bit_disable()), followed by
> unpausing the domain and only then clearing shadow mode (via
> shadow_test_disable(), which pauses the domain a second time).
> 
> Hence besides removing the ASSERT() here (or optionally replacing it by
> explicit translate and refcounts mode checks, but this seems rather
> pointless now that the three are tied together) I wonder whether either
> shadow_one_bit_disable() should turn off shadow mode if no other bit
> besides PG_SH_enable remains set (just like shadow_one_bit_enable()
> enables it if not already set), or the domain pausing scope should be
> extended so that both steps occur without the domain getting a chance to
> run in between.
> 
> Reported-by: Olaf Hering <olaf@aepfle.de>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -1338,12 +1338,8 @@ static int fixup_page_fault(unsigned lon
>       */
>      if ( paging_mode_enabled(d) && !paging_mode_external(d) )
>      {
> -        int ret;
> +        int ret = paging_fault(addr, regs);
>  
> -        /* Logdirty mode is the only expected paging mode for PV guests. */
> -        ASSERT(paging_mode_only_log_dirty(d));
> -
> -        ret = paging_fault(addr, regs);

Andrew had responded that he'd prefer the assertion to be
converted to !paging_mode_translate(d), but before possibly
going ahead to do so I'd really like to have your opinion here as
well as on the wider shadow mode related aspect explained in
the description, since there other options to address the issue
(perhaps even leaving the original assertion in place, if test-
enabling of shadow mode was meant to be a development/
debug-only feature).

Thanks, Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Ping: [PATCH] x86/mm: drop bogus assertion
  2017-12-06  9:31 ` Ping: " Jan Beulich
@ 2017-12-08  8:19   ` Tim Deegan
  2017-12-12 15:32     ` Andrew Cooper
  0 siblings, 1 reply; 7+ messages in thread
From: Tim Deegan @ 2017-12-08  8:19 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Olaf Hering, xen-devel

Hi,

At 02:31 -0700 on 06 Dec (1512527481), Jan Beulich wrote:
> >>> On 30.11.17 at 10:10, <JBeulich@suse.com> wrote:
> > i.e. the guest specified a runstate area address which has a non-present
> > mapping in the page tables [EC=0002 CR2=ffff88003d405220], but that's
> > not something the hypervisor needs to be concerned about.) Release
> > builds work fine, which is a first indication that the assertion isn't
> > really needed.

Yep, this assertion should just go away, so:

Reviewed-by: Tim Deegan <tim@xen.org>

> > What's worse though - there appears to be a timing window where the
> > guest runs in shadow mode, but not in log-dirty mode, and that is what
> > triggers the assertion (the same could, afaict, be achieved by test-
> > enabling shadow mode on a PV guest). This is because turing off log-
> > dirty mode is being performed in two steps: First the log-dirty bit gets
> > cleared (paging_log_dirty_disable() [having paused the domain] ->
> > sh_disable_log_dirty() -> shadow_one_bit_disable()), followed by
> > unpausing the domain and only then clearing shadow mode (via
> > shadow_test_disable(), which pauses the domain a second time).
> > 
> > Hence besides removing the ASSERT() here (or optionally replacing it by
> > explicit translate and refcounts mode checks, but this seems rather
> > pointless now that the three are tied together) I wonder whether either
> > shadow_one_bit_disable() should turn off shadow mode if no other bit
> > besides PG_SH_enable remains set (just like shadow_one_bit_enable()
> > enables it if not already set), or the domain pausing scope should be
> > extended so that both steps occur without the domain getting a chance to
> > run in between.

I'd be fine with either of those.  It would avoid unexpected surprises
in the relatively untested pv-shadow-without-logdirty paths.  

Tim.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Ping: [PATCH] x86/mm: drop bogus assertion
  2017-12-08  8:19   ` Tim Deegan
@ 2017-12-12 15:32     ` Andrew Cooper
  0 siblings, 0 replies; 7+ messages in thread
From: Andrew Cooper @ 2017-12-12 15:32 UTC (permalink / raw)
  To: Tim Deegan, Jan Beulich; +Cc: xen-devel, Olaf Hering

On 08/12/17 08:19, Tim Deegan wrote:
> Hi,
>
> At 02:31 -0700 on 06 Dec (1512527481), Jan Beulich wrote:
>>>>> On 30.11.17 at 10:10, <JBeulich@suse.com> wrote:
>>> i.e. the guest specified a runstate area address which has a non-present
>>> mapping in the page tables [EC=0002 CR2=ffff88003d405220], but that's
>>> not something the hypervisor needs to be concerned about.) Release
>>> builds work fine, which is a first indication that the assertion isn't
>>> really needed.
> Yep, this assertion should just go away, so:
>
> Reviewed-by: Tim Deegan <tim@xen.org>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2017-12-12 15:34 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-30  9:10 [PATCH] x86/mm: drop bogus assertion Jan Beulich
2017-11-30  9:17 ` Jan Beulich
2017-11-30 11:33 ` Andrew Cooper
2017-11-30 11:57   ` Jan Beulich
2017-12-06  9:31 ` Ping: " Jan Beulich
2017-12-08  8:19   ` Tim Deegan
2017-12-12 15:32     ` Andrew Cooper

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.