* [PATCH] x86/hyperv: Handle unknown NMIs on one CPU when unknown_nmi_panic
@ 2016-11-30 17:55 Vitaly Kuznetsov
From: Vitaly Kuznetsov @ 2016-11-30 17:55 UTC (permalink / raw)
  To: x86, devel
  Cc: linux-kernel, K. Y. Srinivasan, Haiyang Zhang, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin

Hyper-V has a feature (Debug-VM --InjectNonMaskableInterrupt) which
injects an NMI into the guest. Prior to WS2016 the NMI is injected into
all CPUs of the guest; WS2016 injects it into CPU0 only.

When unknown_nmi_panic is enabled and we'd like to do kdump, we need to
perform some minimal cleanup so the kdump kernel will be able to
initialize VMBus devices. This cleanup includes sending
CHANNELMSG_UNLOAD to the host and waiting for
CHANNELMSG_UNLOAD_RESPONSE to arrive. WS2012R2 always sends the response
to the CPU which was used to send CHANNELMSG_REQUESTOFFERS on VMBus
module load and not to the CPU which is sending CHANNELMSG_UNLOAD. As we
can't do any cross-CPU work reliably on crash, we have the
vmbus_wait_for_unload() function, which tries to read
CHANNELMSG_UNLOAD_RESPONSE on all CPUs' message pages, and this
sometimes works.

It was discovered that when the host wants to send more than one message
to a secondary CPU (not the CPU running vmbus_wait_for_unload()), we're
unable to get the later ones: after reading the first message we're
supposed to signal end-of-message with wrmsrl(HV_X64_MSR_EOM, 0), but
this write is per-CPU. I have a feeling that this was working some time
ago when I implemented vmbus_wait_for_unload(): the host was retrying
delivery even without wrmsrl(), but apparently this doesn't work any
more.

Unfortunately there is not much we can do when all CPUs get the NMI, as
all but the first one end up blocked with interrupts disabled. What we
can do is limit processing of unknown NMIs to the first CPU which gets
one when we're about to crash.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kernel/cpu/mshyperv.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 8f44c5a..6e4181ff 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -31,6 +31,7 @@
 #include <asm/apic.h>
 #include <asm/timer.h>
 #include <asm/reboot.h>
+#include <asm/nmi.h>
 
 struct ms_hyperv_info ms_hyperv;
 EXPORT_SYMBOL_GPL(ms_hyperv);
@@ -158,6 +159,24 @@ static unsigned char hv_get_nmi_reason(void)
 	return 0;
 }
 
+/*
+ * Prior to WS2016 Debug-VM sends NMIs to all CPUs which makes
+ * it difficult to process CHANNELMSG_UNLOAD in case of crash. Handle
+ * unknown NMI on the first CPU which gets it.
+ */
+static int hv_nmi_unknown(unsigned int val, struct pt_regs *regs)
+{
+	static atomic_t nmi_cpu = ATOMIC_INIT(-1);
+
+	if (!unknown_nmi_panic)
+		return NMI_DONE;
+
+	if (atomic_cmpxchg(&nmi_cpu, -1, raw_smp_processor_id()) != -1)
+		return NMI_HANDLED;
+
+	return NMI_DONE;
+}
+
 static void __init ms_hyperv_init_platform(void)
 {
 	/*
@@ -204,6 +223,9 @@ static void __init ms_hyperv_init_platform(void)
 	 */
 	if (efi_enabled(EFI_BOOT))
 		x86_platform.get_nmi_reason = hv_get_nmi_reason;
+
+	register_nmi_handler(NMI_LOCAL, hv_nmi_unknown, NMI_FLAG_FIRST,
+			     "hv_nmi_unknown");
 }
 
 const __refconst struct hypervisor_x86 x86_hyper_ms_hyperv = {
-- 
2.9.3
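
The election performed by hv_nmi_unknown() can be sketched in userspace
as follows. This is only an illustration under stated assumptions, not
the kernel code: it uses C11 atomic_compare_exchange_strong() in place
of the kernel's atomic_cmpxchg(), passes the CPU id as a parameter
instead of calling raw_smp_processor_id(), and every sketch_* name is
hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Userspace stand-ins for the kernel's NMI return codes (illustrative). */
enum { SKETCH_NMI_DONE = 0, SKETCH_NMI_HANDLED = 1 };

/* -1 means that no CPU has claimed the unknown NMI yet. */
static atomic_int sketch_nmi_cpu = -1;

/* Hypothetical stand-in for the unknown_nmi_panic setting. */
static int sketch_unknown_nmi_panic = 1;

static int sketch_nmi_unknown(int cpu)
{
	int expected = -1;

	/* CPU ids are non-negative; -1 is reserved as the sentinel. */
	assert(cpu >= 0);

	/* Not going to panic: leave the NMI to the other handlers. */
	if (!sketch_unknown_nmi_panic)
		return SKETCH_NMI_DONE;

	/* Only the first caller manages to swap -1 for its own id. */
	if (!atomic_compare_exchange_strong(&sketch_nmi_cpu, &expected, cpu))
		return SKETCH_NMI_HANDLED;	/* later caller: swallow the NMI */

	return SKETCH_NMI_DONE;		/* first caller: default path goes on */
}
```

Calling sketch_nmi_unknown() for several CPU ids in turn returns
SKETCH_NMI_DONE only for the first call. Every later call, including a
repeat call from the winning CPU, returns SKETCH_NMI_HANDLED, which
mirrors the "!= -1" test in the patch.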


* RE: [PATCH] x86/hyperv: Handle unknown NMIs on one CPU when unknown_nmi_panic
@ 2016-11-30 19:30 ` KY Srinivasan
From: KY Srinivasan @ 2016-11-30 19:30 UTC (permalink / raw)
  To: Vitaly Kuznetsov, x86, devel
  Cc: linux-kernel, Haiyang Zhang, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin



> -----Original Message-----
> From: Vitaly Kuznetsov [mailto:vkuznets@redhat.com]
> Sent: Wednesday, November 30, 2016 9:55 AM
> To: x86@kernel.org; devel@linuxdriverproject.org
> Cc: linux-kernel@vger.kernel.org; KY Srinivasan <kys@microsoft.com>;
> Haiyang Zhang <haiyangz@microsoft.com>; Thomas Gleixner
> <tglx@linutronix.de>; Ingo Molnar <mingo@redhat.com>; H. Peter Anvin
> <hpa@zytor.com>
> Subject: [PATCH] x86/hyperv: Handle unknown NMIs on one CPU when
> unknown_nmi_panic
> 
> There is a feature in Hyper-V (Debug-VM --InjectNonMaskableInterrupt)
> which
> injects NMI to the guest. Prior to WS2016 the NMI is injected to all CPUs
> of the guest and WS2016 injects it to CPU0 only. When unknown_nmi_panic
> is
> enabled and we'd like to do kdump we need to perform some minimal
> cleanup
> so the kdump kernel will be able to initialize VMBus devices, this cleanup
> includes sending CHANNELMSG_UNLOAD to the host waiting for
> CHANNELMSG_UNLOAD_RESPONSE to arrive. WS2012R2 always sends the
> response
> to the CPU which was used to send CHANNELMSG_REQUESTOFFERS on
> VMBus module
> load and not to the CPU which is sending CHANNELMSG_UNLOAD. As we
> can't do
> any cross-CPU work reliably on crash we have vmbus_wait_for_unload()
> function which tries to read CHANNELMSG_UNLOAD_RESPONSE on all CPUs
> message
> pages and this sometimes works. It was discovered that in case the host
> wants to send more than one message to a secondary CPU (not the CPU
> running
> vmbus_wait_for_unload()) we're unable to get it as after reading the first
> message we're supposed to do EOMing by doing
> wrmsrl(HV_X64_MSR_EOM, 0) but
> this is per-CPU. I have a feeling that this was working some time ago when
> I implemented vmbus_wait_for_unload(), the host was re-trying to deliver a
> message even without wrmsrl() but apparently this doesn't work any more.
> Unfortunately there is not that much we can do when all CPUs get NMI as
> all but the first one are getting blocked with interrupts disabled. What we
> can do is limit processing unknown interrupts to the first CPU which gets
> it in case we're about to crash.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Thanks Vitaly.

Acked-by: K. Y. Srinivasan <kys@microsoft.com>




* Re: [PATCH] x86/hyperv: Handle unknown NMIs on one CPU when unknown_nmi_panic
@ 2016-12-01 11:36 ` kbuild test robot
From: kbuild test robot @ 2016-12-01 11:36 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kbuild-all, x86, devel, Haiyang Zhang, linux-kernel, Ingo Molnar,
	H. Peter Anvin, Thomas Gleixner

[-- Attachment #1: Type: text/plain, Size: 1511 bytes --]

Hi Vitaly,

[auto build test ERROR on tip/x86/core]
[also build test ERROR on v4.9-rc7 next-20161130]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Vitaly-Kuznetsov/x86-hyperv-Handle-unknown-NMIs-on-one-CPU-when-unknown_nmi_panic/20161201-171219
config: i386-randconfig-x014-201648 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All errors (new ones prefixed by >>):

   arch/x86/kernel/cpu/mshyperv.c: In function 'hv_nmi_unknown':
>> arch/x86/kernel/cpu/mshyperv.c:171:7: error: 'unknown_nmi_panic' undeclared (first use in this function)
     if (!unknown_nmi_panic)
          ^~~~~~~~~~~~~~~~~
   arch/x86/kernel/cpu/mshyperv.c:171:7: note: each undeclared identifier is reported only once for each function it appears in

vim +/unknown_nmi_panic +171 arch/x86/kernel/cpu/mshyperv.c

   165	 * unknown NMI on the first CPU which gets it.
   166	 */
   167	static int hv_nmi_unknown(unsigned int val, struct pt_regs *regs)
   168	{
   169		static atomic_t nmi_cpu = ATOMIC_INIT(-1);
   170	
 > 171		if (!unknown_nmi_panic)
   172			return NMI_DONE;
   173	
   174		if (atomic_cmpxchg(&nmi_cpu, -1, raw_smp_processor_id()) != -1)

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
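
The report does not say why the identifier is undeclared. One plausible
reading, stated here as an assumption rather than something the thread
confirms, is that the declaration of unknown_nmi_panic in <asm/nmi.h>
is guarded by a config option (for example CONFIG_X86_LOCAL_APIC) which
this i386 randconfig leaves disabled. If so, the usual remedy is to put
the handler and its registration under the same guard. A minimal
userspace sketch of that pattern, with SKETCH_HAVE_LOCAL_APIC and all
sketch_* names being hypothetical stand-ins:

```c
#include <assert.h>
#include <stddef.h>

typedef int (*sketch_handler_t)(void);

/* Stays NULL until a handler is registered. */
static sketch_handler_t sketch_registered;

void sketch_register(sketch_handler_t fn)
{
	assert(fn != NULL);
	sketch_registered = fn;
}

#ifdef SKETCH_HAVE_LOCAL_APIC
/* The flag and the handler that reads it exist only under the guard. */
static int sketch_unknown_nmi_panic = 1;

static int sketch_nmi_unknown(void)
{
	return sketch_unknown_nmi_panic ? 0 : 1;
}
#endif

void sketch_init_platform(void)
{
#ifdef SKETCH_HAVE_LOCAL_APIC
	/* The registration sits under the same guard as the handler. */
	sketch_register(sketch_nmi_unknown);
#endif
}
```

Built without -DSKETCH_HAVE_LOCAL_APIC, the file still compiles and no
handler is registered. Built with it, sketch_init_platform() installs
sketch_nmi_unknown().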

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 24405 bytes --]


* Re: [PATCH] x86/hyperv: Handle unknown NMIs on one CPU when unknown_nmi_panic
@ 2016-12-01 13:01 ` kbuild test robot
From: kbuild test robot @ 2016-12-01 13:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kbuild-all, x86, devel, Haiyang Zhang, linux-kernel, Ingo Molnar,
	H. Peter Anvin, Thomas Gleixner

[-- Attachment #1: Type: text/plain, Size: 7309 bytes --]

Hi Vitaly,

[auto build test WARNING on tip/x86/core]
[also build test WARNING on v4.9-rc7 next-20161130]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Vitaly-Kuznetsov/x86-hyperv-Handle-unknown-NMIs-on-one-CPU-when-unknown_nmi_panic/20161201-171219
config: i386-randconfig-x0-12011945 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All warnings (new ones prefixed by >>):

   In file included from include/uapi/linux/stddef.h:1:0,
                    from include/linux/stddef.h:4,
                    from include/uapi/linux/posix_types.h:4,
                    from include/uapi/linux/types.h:13,
                    from include/linux/types.h:5,
                    from arch/x86/kernel/cpu/mshyperv.c:13:
   arch/x86/kernel/cpu/mshyperv.c: In function 'hv_nmi_unknown':
   arch/x86/kernel/cpu/mshyperv.c:171:7: error: 'unknown_nmi_panic' undeclared (first use in this function)
     if (!unknown_nmi_panic)
          ^
   include/linux/compiler.h:149:30: note: in definition of macro '__trace_if'
     if (__builtin_constant_p(!!(cond)) ? !!(cond) :   \
                                 ^~~~
>> arch/x86/kernel/cpu/mshyperv.c:171:2: note: in expansion of macro 'if'
     if (!unknown_nmi_panic)
     ^~
   arch/x86/kernel/cpu/mshyperv.c:171:7: note: each undeclared identifier is reported only once for each function it appears in
     if (!unknown_nmi_panic)
          ^
   include/linux/compiler.h:149:30: note: in definition of macro '__trace_if'
     if (__builtin_constant_p(!!(cond)) ? !!(cond) :   \
                                 ^~~~
>> arch/x86/kernel/cpu/mshyperv.c:171:2: note: in expansion of macro 'if'
     if (!unknown_nmi_panic)
     ^~

vim +/if +171 arch/x86/kernel/cpu/mshyperv.c

     7	 * This program is free software; you can redistribute it and/or modify
     8	 * it under the terms of the GNU General Public License as published by
     9	 * the Free Software Foundation; version 2 of the License.
    10	 *
    11	 */
    12	
  > 13	#include <linux/types.h>
    14	#include <linux/time.h>
    15	#include <linux/clocksource.h>
    16	#include <linux/init.h>
    17	#include <linux/export.h>
    18	#include <linux/hardirq.h>
    19	#include <linux/efi.h>
    20	#include <linux/interrupt.h>
    21	#include <linux/irq.h>
    22	#include <linux/kexec.h>
    23	#include <asm/processor.h>
    24	#include <asm/hypervisor.h>
    25	#include <asm/hyperv.h>
    26	#include <asm/mshyperv.h>
    27	#include <asm/desc.h>
    28	#include <asm/idle.h>
    29	#include <asm/irq_regs.h>
    30	#include <asm/i8259.h>
    31	#include <asm/apic.h>
    32	#include <asm/timer.h>
    33	#include <asm/reboot.h>
    34	#include <asm/nmi.h>
    35	
    36	struct ms_hyperv_info ms_hyperv;
    37	EXPORT_SYMBOL_GPL(ms_hyperv);
    38	
    39	#if IS_ENABLED(CONFIG_HYPERV)
    40	static void (*vmbus_handler)(void);
    41	static void (*hv_kexec_handler)(void);
    42	static void (*hv_crash_handler)(struct pt_regs *regs);
    43	
    44	void hyperv_vector_handler(struct pt_regs *regs)
    45	{
    46		struct pt_regs *old_regs = set_irq_regs(regs);
    47	
    48		entering_irq();
    49		inc_irq_stat(irq_hv_callback_count);
    50		if (vmbus_handler)
    51			vmbus_handler();
    52	
    53		exiting_irq();
    54		set_irq_regs(old_regs);
    55	}
    56	
    57	void hv_setup_vmbus_irq(void (*handler)(void))
    58	{
    59		vmbus_handler = handler;
    60		/*
    61		 * Setup the IDT for hypervisor callback. Prevent reallocation
    62		 * at module reload.
    63		 */
    64		if (!test_bit(HYPERVISOR_CALLBACK_VECTOR, used_vectors))
    65			alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR,
    66					hyperv_callback_vector);
    67	}
    68	
    69	void hv_remove_vmbus_irq(void)
    70	{
    71		/* We have no way to deallocate the interrupt gate */
    72		vmbus_handler = NULL;
    73	}
    74	EXPORT_SYMBOL_GPL(hv_setup_vmbus_irq);
    75	EXPORT_SYMBOL_GPL(hv_remove_vmbus_irq);
    76	
    77	void hv_setup_kexec_handler(void (*handler)(void))
    78	{
    79		hv_kexec_handler = handler;
    80	}
    81	EXPORT_SYMBOL_GPL(hv_setup_kexec_handler);
    82	
    83	void hv_remove_kexec_handler(void)
    84	{
    85		hv_kexec_handler = NULL;
    86	}
    87	EXPORT_SYMBOL_GPL(hv_remove_kexec_handler);
    88	
    89	void hv_setup_crash_handler(void (*handler)(struct pt_regs *regs))
    90	{
    91		hv_crash_handler = handler;
    92	}
    93	EXPORT_SYMBOL_GPL(hv_setup_crash_handler);
    94	
    95	void hv_remove_crash_handler(void)
    96	{
    97		hv_crash_handler = NULL;
    98	}
    99	EXPORT_SYMBOL_GPL(hv_remove_crash_handler);
   100	
   101	#ifdef CONFIG_KEXEC_CORE
   102	static void hv_machine_shutdown(void)
   103	{
   104		if (kexec_in_progress && hv_kexec_handler)
   105			hv_kexec_handler();
   106		native_machine_shutdown();
   107	}
   108	
   109	static void hv_machine_crash_shutdown(struct pt_regs *regs)
   110	{
   111		if (hv_crash_handler)
   112			hv_crash_handler(regs);
   113		native_machine_crash_shutdown(regs);
   114	}
   115	#endif /* CONFIG_KEXEC_CORE */
   116	#endif /* CONFIG_HYPERV */
   117	
   118	static uint32_t  __init ms_hyperv_platform(void)
   119	{
   120		u32 eax;
   121		u32 hyp_signature[3];
   122	
   123		if (!boot_cpu_has(X86_FEATURE_HYPERVISOR))
   124			return 0;
   125	
   126		cpuid(HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS,
   127		      &eax, &hyp_signature[0], &hyp_signature[1], &hyp_signature[2]);
   128	
   129		if (eax >= HYPERV_CPUID_MIN &&
   130		    eax <= HYPERV_CPUID_MAX &&
   131		    !memcmp("Microsoft Hv", hyp_signature, 12))
   132			return HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS;
   133	
   134		return 0;
   135	}
   136	
   137	static cycle_t read_hv_clock(struct clocksource *arg)
   138	{
   139		cycle_t current_tick;
   140		/*
   141		 * Read the partition counter to get the current tick count. This count
   142		 * is set to 0 when the partition is created and is incremented in
   143		 * 100 nanosecond units.
   144		 */
   145		rdmsrl(HV_X64_MSR_TIME_REF_COUNT, current_tick);
   146		return current_tick;
   147	}
   148	
   149	static struct clocksource hyperv_cs = {
   150		.name		= "hyperv_clocksource",
   151		.rating		= 400, /* use this when running on Hyperv*/
   152		.read		= read_hv_clock,
   153		.mask		= CLOCKSOURCE_MASK(64),
   154		.flags		= CLOCK_SOURCE_IS_CONTINUOUS,
   155	};
   156	
   157	static unsigned char hv_get_nmi_reason(void)
   158	{
   159		return 0;
   160	}
   161	
   162	/*
   163	 * Prior to WS2016 Debug-VM sends NMIs to all CPUs which makes
   164	 * it dificult to process CHANNELMSG_UNLOAD in case of crash. Handle
   165	 * unknown NMI on the first CPU which gets it.
   166	 */
   167	static int hv_nmi_unknown(unsigned int val, struct pt_regs *regs)
   168	{
   169		static atomic_t nmi_cpu = ATOMIC_INIT(-1);
   170	
 > 171		if (!unknown_nmi_panic)
   172			return NMI_DONE;
   173	
   174		if (atomic_cmpxchg(&nmi_cpu, -1, raw_smp_processor_id()) != -1)

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 23753 bytes --]
