linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] x86/xen: Return from panic notifier
@ 2019-10-01 15:16 Boris Ostrovsky
  2019-10-02  4:36 ` Jürgen Groß
  2019-10-02  7:40 ` [Xen-devel] " Jan Beulich
  0 siblings, 2 replies; 7+ messages in thread
From: Boris Ostrovsky @ 2019-10-01 15:16 UTC (permalink / raw)
  To: xen-devel, linux-kernel; +Cc: jgross, james, Boris Ostrovsky

Currently execution of panic() continues until Xen's panic notifier
(xen_panic_event()) is called at which point we make a hypercall that
never returns.

This means that any notifier that is supposed to be called later as
well as significant part of panic() code (such as pstore writes from
kmsg_dump()) is never executed.

There is no reason for xen_panic_event() to be this last point in
execution since panic()'s emergency_restart() will call into
xen_emergency_restart() from where we can perform our hypercall.

Reported-by: James Dingwall <james@dingwall.me.uk>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 arch/x86/xen/enlighten.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 750f46ad018a..fd4f58cf51dc 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -269,16 +269,27 @@ void xen_reboot(int reason)
 		BUG();
 }
 
+static int reboot_reason = SHUTDOWN_reboot;
 void xen_emergency_restart(void)
 {
-	xen_reboot(SHUTDOWN_reboot);
+	xen_reboot(reboot_reason);
 }
 
 static int
 xen_panic_event(struct notifier_block *this, unsigned long event, void *ptr)
 {
-	if (!kexec_crash_loaded())
-		xen_reboot(SHUTDOWN_crash);
+	if (!kexec_crash_loaded()) {
+		reboot_reason = SHUTDOWN_crash;
+
+		/*
+		 * If panic_timeout==0 then we are supposed to wait forever.
+		 * However, to preserve original dom0 behavior we have to drop
+		 * into hypervisor. (domU behavior is controlled by its
+		 * config file)
+		 */
+		if (panic_timeout == 0)
+			panic_timeout = -1;
+	}
 	return NOTIFY_DONE;
 }
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] x86/xen: Return from panic notifier
  2019-10-01 15:16 [PATCH] x86/xen: Return from panic notifier Boris Ostrovsky
@ 2019-10-02  4:36 ` Jürgen Groß
  2019-10-02  7:40 ` [Xen-devel] " Jan Beulich
  1 sibling, 0 replies; 7+ messages in thread
From: Jürgen Groß @ 2019-10-02  4:36 UTC (permalink / raw)
  To: Boris Ostrovsky, xen-devel, linux-kernel; +Cc: james

On 01.10.19 17:16, Boris Ostrovsky wrote:
> Currently execution of panic() continues until Xen's panic notifier
> (xen_panic_event()) is called at which point we make a hypercall that
> never returns.
> 
> This means that any notifier that is supposed to be called later as
> well as significant part of panic() code (such as pstore writes from
> kmsg_dump()) is never executed.
> 
> There is no reason for xen_panic_event() to be this last point in
> execution since panic()'s emergency_restart() will call into
> xen_emergency_restart() from where we can perform our hypercall.
> 
> Reported-by: James Dingwall <james@dingwall.me.uk>
> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

Reviewed-by: Juergen Gross <jgross@suse.com>


Juergen

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Xen-devel] [PATCH] x86/xen: Return from panic notifier
  2019-10-01 15:16 [PATCH] x86/xen: Return from panic notifier Boris Ostrovsky
  2019-10-02  4:36 ` Jürgen Groß
@ 2019-10-02  7:40 ` Jan Beulich
  2019-10-02 13:24   ` Boris Ostrovsky
  1 sibling, 1 reply; 7+ messages in thread
From: Jan Beulich @ 2019-10-02  7:40 UTC (permalink / raw)
  To: Boris Ostrovsky; +Cc: xen-devel, linux-kernel, james, Juergen Gross

On 01.10.2019 17:16, Boris Ostrovsky wrote:
> Currently execution of panic() continues until Xen's panic notifier
> (xen_panic_event()) is called at which point we make a hypercall that
> never returns.
> 
> This means that any notifier that is supposed to be called later as
> well as significant part of panic() code (such as pstore writes from
> kmsg_dump()) is never executed.

Back at the time when this was introduced into the XenoLinux tree,
I think this behavior was intentional for at least DomU-s. I wonder
whether you wouldn't want your change to further distinguish Dom0
and DomU behavior.

> There is no reason for xen_panic_event() to be this last point in
> execution since panic()'s emergency_restart() will call into
> xen_emergency_restart() from where we can perform our hypercall.

Did you consider, as an alternative, to lower the notifier's
priority?

Jan

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Xen-devel] [PATCH] x86/xen: Return from panic notifier
  2019-10-02  7:40 ` [Xen-devel] " Jan Beulich
@ 2019-10-02 13:24   ` Boris Ostrovsky
  2019-10-02 13:42     ` Jan Beulich
  0 siblings, 1 reply; 7+ messages in thread
From: Boris Ostrovsky @ 2019-10-02 13:24 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, linux-kernel, james, Juergen Gross

On 10/2/19 3:40 AM, Jan Beulich wrote:
> On 01.10.2019 17:16, Boris Ostrovsky wrote:
>> Currently execution of panic() continues until Xen's panic notifier
>> (xen_panic_event()) is called at which point we make a hypercall that
>> never returns.
>>
>> This means that any notifier that is supposed to be called later as
>> well as significant part of panic() code (such as pstore writes from
>> kmsg_dump()) is never executed.
> Back at the time when this was introduced into the XenoLinux tree,
> I think this behavior was intentional for at least DomU-s. I wonder
> whether you wouldn't want your change to further distinguish Dom0
> and DomU behavior.

Do you remember what the reason for that was?

I think having ability to call kmsg_dump() on a panic is a useful thing
to have for domUs as well. Besides, there may be other functionality
added post-notifiers in panic() in the future. Or another notifier may
be registered later with the same lowest priority.

Is there a downside in allowing domUs to fall through panic() all the
way to emergency_restart()?

>
>> There is no reason for xen_panic_event() to be this last point in
>> execution since panic()'s emergency_restart() will call into
>> xen_emergency_restart() from where we can perform our hypercall.
> Did you consider, as an alternative, to lower the notifier's
> priority?

I didn't but that wouldn't help with the original problem that James
reported --- we'd still not get to kmsg_dump() call.


-boris

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Xen-devel] [PATCH] x86/xen: Return from panic notifier
  2019-10-02 13:24   ` Boris Ostrovsky
@ 2019-10-02 13:42     ` Jan Beulich
  2019-10-02 14:14       ` Boris Ostrovsky
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Beulich @ 2019-10-02 13:42 UTC (permalink / raw)
  To: Boris Ostrovsky; +Cc: james, xen-devel, Juergen Gross, linux-kernel

On 02.10.2019 15:24, Boris Ostrovsky wrote:
> On 10/2/19 3:40 AM, Jan Beulich wrote:
>> On 01.10.2019 17:16, Boris Ostrovsky wrote:
>>> Currently execution of panic() continues until Xen's panic notifier
>>> (xen_panic_event()) is called at which point we make a hypercall that
>>> never returns.
>>>
>>> This means that any notifier that is supposed to be called later as
>>> well as significant part of panic() code (such as pstore writes from
>>> kmsg_dump()) is never executed.
>> Back at the time when this was introduced into the XenoLinux tree,
>> I think this behavior was intentional for at least DomU-s. I wonder
>> whether you wouldn't want your change to further distinguish Dom0
>> and DomU behavior.
> 
> Do you remember what the reason for that was?

I can only guess that the thinking probably was that e.g. external
dumping (by the tool stack) would be more reliable (including but
not limited to this meaning less change of state from when the
original crash reason was detected) than having the domain dump
itself.

> I think having ability to call kmsg_dump() on a panic is a useful thing
> to have for domUs as well. Besides, there may be other functionality
> added post-notifiers in panic() in the future. Or another notifier may
> be registered later with the same lowest priority.
> 
> Is there a downside in allowing domUs to fall through panic() all the
> way to emergency_restart()?

See above.

>>> There is no reason for xen_panic_event() to be this last point in
>>> execution since panic()'s emergency_restart() will call into
>>> xen_emergency_restart() from where we can perform our hypercall.
>> Did you consider, as an alternative, to lower the notifier's
>> priority?
> 
> I didn't but that wouldn't help with the original problem that James
> reported --- we'd still not get to kmsg_dump() call.

True. I guess more control over the behavior needs to be given to
the admin, as either approach has its up- and downsides

Jan

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Xen-devel] [PATCH] x86/xen: Return from panic notifier
  2019-10-02 13:42     ` Jan Beulich
@ 2019-10-02 14:14       ` Boris Ostrovsky
  2019-10-02 14:59         ` Jan Beulich
  0 siblings, 1 reply; 7+ messages in thread
From: Boris Ostrovsky @ 2019-10-02 14:14 UTC (permalink / raw)
  To: Jan Beulich; +Cc: james, xen-devel, Juergen Gross, linux-kernel

On 10/2/19 9:42 AM, Jan Beulich wrote:
>
> I can only guess that the thinking probably was that e.g. external
> dumping (by the tool stack) would be more reliable (including but
> not limited to this meaning less change of state from when the
> original crash reason was detected) than having the domain dump
> itself.


We could register an external dumper (controlled by a boot option
perhaps, off by default) that will call directly into hypervisor with
SHUTDOWN_crash. That will guarantee that we will complete the notifier
chain without relying on priorities. (Of course this still won't address
a possible new feature in panic() that might be called post-dumping)

If you think it's worth doing this can be easily added.

-boris

> True. I guess more control over the behavior needs to be given to
> the admin, as either approach has its up- and downsides
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Xen-devel] [PATCH] x86/xen: Return from panic notifier
  2019-10-02 14:14       ` Boris Ostrovsky
@ 2019-10-02 14:59         ` Jan Beulich
  0 siblings, 0 replies; 7+ messages in thread
From: Jan Beulich @ 2019-10-02 14:59 UTC (permalink / raw)
  To: Boris Ostrovsky; +Cc: james, xen-devel, Juergen Gross, linux-kernel

On 02.10.2019 16:14, Boris Ostrovsky wrote:
> On 10/2/19 9:42 AM, Jan Beulich wrote:
>>
>> I can only guess that the thinking probably was that e.g. external
>> dumping (by the tool stack) would be more reliable (including but
>> not limited to this meaning less change of state from when the
>> original crash reason was detected) than having the domain dump
>> itself.
> 
> 
> We could register an external dumper (controlled by a boot option
> perhaps, off by default) that will call directly into hypervisor with
> SHUTDOWN_crash. That will guarantee that we will complete the notifier
> chain without relying on priorities. (Of course this still won't address
> a possible new feature in panic() that might be called post-dumping)
> 
> If you think it's worth doing this can be easily added.

Well, I think this is the better option than potentially
pingponging between the two extremes, because one wants it the
way it currently is and another wants it the way this patch
changes things to.

Jan

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-10-02 14:59 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-01 15:16 [PATCH] x86/xen: Return from panic notifier Boris Ostrovsky
2019-10-02  4:36 ` Jürgen Groß
2019-10-02  7:40 ` [Xen-devel] " Jan Beulich
2019-10-02 13:24   ` Boris Ostrovsky
2019-10-02 13:42     ` Jan Beulich
2019-10-02 14:14       ` Boris Ostrovsky
2019-10-02 14:59         ` Jan Beulich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).