* [PATCH 0/1] Do not stop guest when panic event is received
From: Alejandro Jimenez @ 2020-10-02  2:41 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini

The following patch adds an option that tells QEMU not to stop the VM when a panic event is received.
This allows guests in cloud environments to report the panic condition to the control plane and still
proceed to collect a crash dump and reboot automatically, without waiting to receive one or more 'cont'
monitor commands.

I am aware of a previous discussion regarding the decision to stop the guest on a panic event:
https://lore.kernel.org/qemu-devel/52148F88.5000509@redhat.com/
That is why I propose an explicit parameter that changes the default behavior only when necessary.

The PVPANIC_CRASHLOADED event was introduced in the v5.6 kernel to tell QEMU that the guest will handle
the panic condition by itself. Unfortunately, older kernels only support sending the
PVPANIC_PANICKED event, for which the default behavior is to pause the VM.

Having a '-no-panicstop' option allows for older guest kernels that do not support the PVPANIC_CRASHLOADED event
to behave in the same way as newer kernels, simplifying control plane code. It also provides the same advantage
when launching Windows guests with the hv-crash enlightenment, since the hv-crash MSR writes are ultimately
handled by QEMU as if the guest had sent a PVPANIC_PANICKED event.
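
The behavior described above can be modeled with a small, standalone
sketch. It is not code from the patch: panic_event_stops_vm() is a
hypothetical helper, the bit values are assumed to follow the pvpanic
device specification, and the separate -no-shutdown handling is left
out on purpose.

```c
#include <stdbool.h>
#include <stdint.h>

/* Event bits the guest writes to the pvpanic device (assumed values). */
#define PVPANIC_PANICKED    (1u << 0)
#define PVPANIC_CRASHLOADED (1u << 1)

/*
 * Hypothetical helper: returns true when the panic path would pause the
 * VM. hv-crash MSR writes are treated like PVPANIC_PANICKED, as described
 * in the cover letter.
 */
static bool panic_event_stops_vm(uint8_t event, bool no_panicstop)
{
    if (event & PVPANIC_PANICKED) {
        /* Default is to pause; -no-panicstop lets the guest keep running. */
        return !no_panicstop;
    }
    /* PVPANIC_CRASHLOADED: the guest handles the panic by itself. */
    return false;
}
```

With no_panicstop set, an older guest that can only send
PVPANIC_PANICKED gets the same outcome as a newer guest sending
PVPANIC_CRASHLOADED.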

The fact that the behavior of hv-crash is also affected is why I chose to implement this change as an independent
option, as opposed to making it a property of the pvpanic device (e.g. -device pvpanic,no-panicstop).

Please let me know if you have any comments or suggestions.

Regards,
Alejandro

Alejandro Jimenez (1):
  vl: Add -no-panicstop option

 qemu-options.hx | 11 +++++++++++
 softmmu/vl.c    | 17 ++++++++++++++---
 2 files changed, 25 insertions(+), 3 deletions(-)

-- 
1.8.3.1




* [PATCH 1/1] vl: Add -no-panicstop option
From: Alejandro Jimenez @ 2020-10-02  2:41 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini

The current default action of pausing a guest after a panic event
is received leaves the responsibility to resume guest execution to the
management layer. The reasons for this behavior are discussed here:
https://lore.kernel.org/qemu-devel/52148F88.5000509@redhat.com/

However, in some cases it is desirable to allow the guest to continue
running after sending a PVPANIC_PANICKED event. Examples are older
guests (Linux and Windows) that use a pvpanic device but lack support
for the PVPANIC_CRASHLOADED event, and Windows guests that use the
hv-crash enlightenment. This allows such guests to proceed to capture
a crash dump and automatically reboot without intervention of a
management layer.

Add an option to avoid stopping a VM after a panic event is received.

Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Reviewed-by: Mark Kanda <mark.kanda@oracle.com>
---
 qemu-options.hx | 11 +++++++++++
 softmmu/vl.c    | 17 ++++++++++++++---
 2 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index 3564c23..cbaf947 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3882,6 +3882,17 @@ SRST
     specified domain id (XEN only).
 ERST
 
+DEF("no-panicstop", 0, QEMU_OPTION_no_panicstop, \
+    "-no-panicstop   do not stop the VM on panic\n", QEMU_ARCH_ALL)
+SRST
+``-no-panicstop``
+    Don't stop the VM when a panic event is received. This allows older guests
+    using a pvpanic device but without support for the PVPANIC_CRASHLOADED
+    event, and Windows guests using the hv-crash enlightenment to continue
+    running and capture a crash dump or reboot without intervention of a
+    management layer.
+ERST
+
 DEF("no-reboot", 0, QEMU_OPTION_no_reboot, \
     "-no-reboot      exit instead of rebooting\n", QEMU_ARCH_ALL)
 SRST
diff --git a/softmmu/vl.c b/softmmu/vl.c
index 22bc570..a939b18 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -147,6 +147,7 @@ int win2k_install_hack = 0;
 int singlestep = 0;
 int no_hpet = 0;
 int fd_bootchk = 1;
+static int no_panicstop;
 static int no_reboot;
 int no_shutdown = 0;
 int graphic_rotate = 0;
@@ -1431,9 +1432,16 @@ void qemu_system_guest_panicked(GuestPanicInformation *info)
     if (current_cpu) {
         current_cpu->crash_occurred = true;
     }
-    qapi_event_send_guest_panicked(GUEST_PANIC_ACTION_PAUSE,
-                                   !!info, info);
-    vm_stop(RUN_STATE_GUEST_PANICKED);
+
+    if (no_panicstop) {
+        qapi_event_send_guest_panicked(GUEST_PANIC_ACTION_RUN,
+                                        !!info, info);
+    } else {
+        qapi_event_send_guest_panicked(GUEST_PANIC_ACTION_PAUSE,
+                                        !!info, info);
+        vm_stop(RUN_STATE_GUEST_PANICKED);
+    }
+
     if (!no_shutdown) {
         qapi_event_send_guest_panicked(GUEST_PANIC_ACTION_POWEROFF,
                                        !!info, info);
@@ -3558,6 +3566,9 @@ void qemu_init(int argc, char **argv, char **envp)
             case QEMU_OPTION_no_hpet:
                 no_hpet = 1;
                 break;
+            case QEMU_OPTION_no_panicstop:
+                no_panicstop = 1;
+                break;
             case QEMU_OPTION_no_reboot:
                 no_reboot = 1;
                 break;
-- 
1.8.3.1




* Re: [PATCH 0/1] Do not stop guest when panic event is received
From: Paolo Bonzini @ 2020-10-20 17:14 UTC (permalink / raw)
  To: Alejandro Jimenez, qemu-devel

On 02/10/20 04:41, Alejandro Jimenez wrote:
> The fact that the behavior of hv-crash is also affected is why I chose to implement this change as an independent
> option, as opposed to making it a property of the pvpanic device (e.g. -device pvpanic,no-panicstop).
> 
> Please let me know if you have any comments or suggestions.

Hi Alejandro, sorry for the delayed response.

The concept is fine, and I agree this should not be a device property.

On the other hand, we already have many similar options: -no-reboot,
-no-shutdown, -watchdog-action and now -no-panicstop.

I think it's time to group them into a single option:

* -action reboot=pause|shutdown|none
* -action shutdown=pause|poweroff|none
* -action panic=pause|shutdown|none
* -action watchdog=reset|shutdown|poweroff|pause|debug|none|inject-nmi

where the existing options would translate to the new option, like:

* -no-reboot "-action reboot=shutdown"
* -no-shutdown "-action shutdown=pause"
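
As a rough illustration of that translation, here is a minimal sketch
with a lookup table. The names LegacyAlias and legacy_to_action_arg()
are hypothetical and not part of QEMU; only the two mappings listed
above are included, since the thread does not spell out the others.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: a legacy flag and its -action equivalent. */
typedef struct {
    const char *legacy;
    const char *action_arg;
} LegacyAlias;

static const LegacyAlias legacy_aliases[] = {
    { "-no-reboot",   "reboot=shutdown" },
    { "-no-shutdown", "shutdown=pause"  },
};

/* Returns the -action argument for a legacy flag, or NULL if unknown. */
static const char *legacy_to_action_arg(const char *option)
{
    assert(option != NULL);
    for (size_t i = 0;
         i < sizeof(legacy_aliases) / sizeof(legacy_aliases[0]); i++) {
        if (strcmp(option, legacy_aliases[i].legacy) == 0) {
            return legacy_aliases[i].action_arg;
        }
    }
    return NULL;
}
```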

The implementation should be relatively easy too; there's already an
enum WatchdogAction (that can be renamed to e.g. RunstateAction) and a
parsing function select_watchdog_action that can be changed to just
return the RunstateAction.
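
To make the idea concrete, here is a standalone sketch of parsing an
"event=action" argument into a single enum. It is an assumption-laden
illustration and not QEMU code: RunstateAction only borrows the name
suggested above, and ACTION_*, EventRule and parse_action_arg() are
invented for this example. The allowed values per event are taken from
the list earlier in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical merged enum, following the RunstateAction name above. */
typedef enum {
    ACTION_NONE,
    ACTION_PAUSE,
    ACTION_SHUTDOWN,
    ACTION_POWEROFF,
    ACTION_RESET,
    ACTION_DEBUG,
    ACTION_INJECT_NMI,
    ACTION_COUNT
} RunstateAction;

static const char *const action_names[ACTION_COUNT] = {
    [ACTION_NONE]       = "none",
    [ACTION_PAUSE]      = "pause",
    [ACTION_SHUTDOWN]   = "shutdown",
    [ACTION_POWEROFF]   = "poweroff",
    [ACTION_RESET]      = "reset",
    [ACTION_DEBUG]      = "debug",
    [ACTION_INJECT_NMI] = "inject-nmi",
};

#define ACT(a) (1u << (a))

/* Each event accepts only a subset of actions, stored as a bitmask. */
typedef struct {
    const char *event;
    unsigned allowed;
} EventRule;

static const EventRule event_rules[] = {
    { "reboot",   ACT(ACTION_PAUSE) | ACT(ACTION_SHUTDOWN) | ACT(ACTION_NONE) },
    { "shutdown", ACT(ACTION_PAUSE) | ACT(ACTION_POWEROFF) | ACT(ACTION_NONE) },
    { "panic",    ACT(ACTION_PAUSE) | ACT(ACTION_SHUTDOWN) | ACT(ACTION_NONE) },
    { "watchdog", ACT(ACTION_RESET) | ACT(ACTION_SHUTDOWN) |
                  ACT(ACTION_POWEROFF) | ACT(ACTION_PAUSE) |
                  ACT(ACTION_DEBUG) | ACT(ACTION_NONE) |
                  ACT(ACTION_INJECT_NMI) },
};

/* Parses "event=action"; fills *event and *action and returns true on success. */
static bool parse_action_arg(const char *arg, const char **event,
                             RunstateAction *action)
{
    assert(arg != NULL && event != NULL && action != NULL);

    const char *eq = strchr(arg, '=');
    if (eq == NULL) {
        return false;
    }
    size_t event_len = (size_t)(eq - arg);

    for (size_t i = 0; i < sizeof(event_rules) / sizeof(event_rules[0]); i++) {
        const EventRule *rule = &event_rules[i];
        if (strlen(rule->event) != event_len ||
            strncmp(arg, rule->event, event_len) != 0) {
            continue;
        }
        for (int a = 0; a < ACTION_COUNT; a++) {
            if ((rule->allowed & ACT(a)) &&
                strcmp(eq + 1, action_names[a]) == 0) {
                *event = rule->event;
                *action = (RunstateAction)a;
                return true;
            }
        }
        return false; /* known event, but the action is not valid for it */
    }
    return false; /* unknown event */
}
```

Keeping the per-event bitmask next to the event name means an argument
such as "panic=poweroff" is rejected even though "poweroff" is a valid
action for other events.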

Would you like to take a look at this?

Thanks,

Paolo




* Re: [PATCH 0/1] Do not stop guest when panic event is received
From: Alejandro Jimenez @ 2020-10-21 13:26 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel



On 10/20/2020 1:14 PM, Paolo Bonzini wrote:
> On 02/10/20 04:41, Alejandro Jimenez wrote:
>> The fact that the behavior of hv-crash is also affected is why I chose to implement this change as an independent
>> option, as opposed to making it a property of the pvpanic device (e.g. -device pvpanic,no-panicstop).
>>
>> Please let me know if you have any comments or suggestions.
> Hi Alejandro, sorry for the delayed response.
>
> The concept is fine, and I agree this should not be a device property.
>
> On the other hand, we already have many similar options: -no-reboot,
> -no-shutdown, -watchdog-action and now -no-panicstop.
>
> I think it's time to group them into a single option:
>
> * -action reboot=pause|shutdown|none
> * -action shutdown=pause|poweroff|none
> * -action panic=pause|shutdown|none
> * -action watchdog=reset|shutdown|poweroff|pause|debug|none|inject-nmi
>
> where the existing options would translate to the new option, like:
>
> * -no-reboot "-action reboot=shutdown"
> * -no-shutdown "-action shutdown=pause"
>
> The implementation should be relatively easy too; there's already an
> enum WatchdogAction (that can be renamed to e.g. RunstateAction) and a
> parsing function select_watchdog_action that can be changed to just
> return the RunstateAction.
>
> Would you like to take a look at this?
Hi Paolo,

Thank you for your reply and the advice/hints above. I'll take a look 
and try to implement what you propose.

Regards,
Alejandro
>
> Thanks,
>
> Paolo
>




* Re: [PATCH 0/1] Do not stop guest when panic event is received
From: Paolo Bonzini @ 2020-10-21 13:33 UTC (permalink / raw)
  To: Alejandro Jimenez, qemu-devel

On 21/10/20 15:26, Alejandro Jimenez wrote:
> 
> 
> On 10/20/2020 1:14 PM, Paolo Bonzini wrote:
>> On 02/10/20 04:41, Alejandro Jimenez wrote:
>>> The fact that the behavior of hv-crash is also affected is why I
>>> chose to implement this change as an independent
>>> option, as opposed to making it a property of the pvpanic device
>>> (e.g. -device pvpanic,no-panicstop).
>>>
>>> Please let me know if you have any comments or suggestions.
>> Hi Alejandro, sorry for the delayed response.
>>
>> The concept is fine, and I agree this should not be a device property.
>>
>> On the other hand, we already have many similar options: -no-reboot,
>> -no-shutdown, -watchdog-action and now -no-panicstop.
>>
>> I think it's time to group them into a single option:
>>
>> * -action reboot=pause|shutdown|none
>> * -action shutdown=pause|poweroff|none
>> * -action panic=pause|shutdown|none
>> * -action watchdog=reset|shutdown|poweroff|pause|debug|none|inject-nmi
>>
>> where the existing options would translate to the new option, like:
>>
>> * -no-reboot "-action reboot=shutdown"
>> * -no-shutdown "-action shutdown=pause"
>>
>> The implementation should be relatively easy too; there's already an
>> enum WatchdogAction (that can be renamed to e.g. RunstateAction) and a
>> parsing function select_watchdog_action that can be changed to just
>> return the RunstateAction.
>>
>> Would you like to take a look at this?
> Hi Paolo,
> 
> Thank you for your reply and the advice/hints above. I'll take a look
> and try to implement what you propose.

Just one thing: for the parsing, you can place it close to the existing

    qemu_opts_foreach(qemu_find_opts("name"),
                      parse_name, NULL, &error_fatal);

Paolo



