* [PATCH] x86/hyperv: Handle unknown NMIs on one CPU when unknown_nmi_panic
From: Vitaly Kuznetsov @ 2016-11-30 17:55 UTC (permalink / raw)
To: x86, devel
Cc: linux-kernel, K. Y. Srinivasan, Haiyang Zhang, Thomas Gleixner,
Ingo Molnar, H. Peter Anvin
Hyper-V has a feature (Debug-VM --InjectNonMaskableInterrupt) which injects
an NMI into the guest. Prior to WS2016 the NMI is injected into all CPUs of
the guest; WS2016 injects it into CPU0 only.

When unknown_nmi_panic is enabled and we want to do kdump, we need to
perform some minimal cleanup so that the kdump kernel is able to initialize
VMBus devices. This cleanup includes sending CHANNELMSG_UNLOAD to the host
and waiting for CHANNELMSG_UNLOAD_RESPONSE to arrive. WS2012R2 always sends
the response to the CPU which was used to send CHANNELMSG_REQUESTOFFERS on
VMBus module load, not to the CPU which is sending CHANNELMSG_UNLOAD.

As we can't do any cross-CPU work reliably on crash, we have the
vmbus_wait_for_unload() function, which tries to read
CHANNELMSG_UNLOAD_RESPONSE from the message pages of all CPUs, and this
sometimes works. It was discovered that when the host wants to send more
than one message to a secondary CPU (not the CPU running
vmbus_wait_for_unload()), we are unable to get the later messages: after
reading the first message we are supposed to signal end-of-message with
wrmsrl(HV_X64_MSR_EOM, 0), but this MSR is per-CPU. I have a feeling that
this was working some time ago when I implemented vmbus_wait_for_unload():
the host was retrying delivery of a message even without wrmsrl(), but
apparently this doesn't work any more.

Unfortunately, there is not much we can do when all CPUs get the NMI, as
all but the first one are blocked with interrupts disabled. What we can do
is limit processing of unknown NMIs to the first CPU which gets one when
we're about to crash.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
arch/x86/kernel/cpu/mshyperv.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 8f44c5a..6e4181ff 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -31,6 +31,7 @@
#include <asm/apic.h>
#include <asm/timer.h>
#include <asm/reboot.h>
+#include <asm/nmi.h>
struct ms_hyperv_info ms_hyperv;
EXPORT_SYMBOL_GPL(ms_hyperv);
@@ -158,6 +159,24 @@ static unsigned char hv_get_nmi_reason(void)
return 0;
}
+/*
+ * Prior to WS2016 Debug-VM sends NMIs to all CPUs which makes
+ * it difficult to process CHANNELMSG_UNLOAD in case of crash. Handle
+ * unknown NMI on the first CPU which gets it.
+ */
+static int hv_nmi_unknown(unsigned int val, struct pt_regs *regs)
+{
+ static atomic_t nmi_cpu = ATOMIC_INIT(-1);
+
+ if (!unknown_nmi_panic)
+ return NMI_DONE;
+
+ if (atomic_cmpxchg(&nmi_cpu, -1, raw_smp_processor_id()) != -1)
+ return NMI_HANDLED;
+
+ return NMI_DONE;
+}
+
static void __init ms_hyperv_init_platform(void)
{
/*
@@ -204,6 +223,9 @@ static void __init ms_hyperv_init_platform(void)
*/
if (efi_enabled(EFI_BOOT))
x86_platform.get_nmi_reason = hv_get_nmi_reason;
+
+ register_nmi_handler(NMI_LOCAL, hv_nmi_unknown, NMI_FLAG_FIRST,
+ "hv_nmi_unknown");
}
const __refconst struct hypervisor_x86 x86_hyper_ms_hyperv = {
--
2.9.3
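
The "first CPU wins" check in hv_nmi_unknown() can be tried outside the kernel. The following is a minimal userspace sketch, not the kernel code: it uses C11 atomic_compare_exchange_strong() (which returns a bool) in place of the kernel's atomic_cmpxchg() (which returns the old value), and the names demo_hv_nmi_unknown, demo_nmi_cpu, cpu_id and panic_enabled are hypothetical stand-ins introduced for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative return codes named after NMI_DONE and NMI_HANDLED. */
enum demo_nmi_ret { DEMO_NMI_DONE = 0, DEMO_NMI_HANDLED = 1 };

/* -1 means "no CPU has claimed the NMI yet", like ATOMIC_INIT(-1). */
static atomic_int demo_nmi_cpu = -1;

/*
 * Userspace model of hv_nmi_unknown(). cpu_id stands in for
 * raw_smp_processor_id() and panic_enabled for unknown_nmi_panic;
 * both parameters are assumptions made for this sketch.
 */
static enum demo_nmi_ret demo_hv_nmi_unknown(int cpu_id, bool panic_enabled)
{
	int expected = -1;

	assert(cpu_id >= 0);	/* -1 is reserved for "unclaimed" */

	if (!panic_enabled)
		return DEMO_NMI_DONE;	/* not about to crash: do not filter */

	/* Only the first caller finds -1 and stores its own id. */
	if (!atomic_compare_exchange_strong(&demo_nmi_cpu, &expected, cpu_id))
		return DEMO_NMI_HANDLED;	/* late CPU: swallow the NMI */

	return DEMO_NMI_DONE;	/* winner: let later handlers run */
}
```

Calling demo_hv_nmi_unknown(3, true) and then demo_hv_nmi_unknown(0, true) returns DEMO_NMI_DONE for the first call and DEMO_NMI_HANDLED for the second, so only the first caller would go on to the panic path. A repeated NMI on the winning CPU is also swallowed, because the stored id is no longer -1.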
* RE: [PATCH] x86/hyperv: Handle unknown NMIs on one CPU when unknown_nmi_panic
From: KY Srinivasan @ 2016-11-30 19:30 UTC (permalink / raw)
To: Vitaly Kuznetsov, x86, devel
Cc: linux-kernel, Haiyang Zhang, Thomas Gleixner, Ingo Molnar,
H. Peter Anvin
> -----Original Message-----
> From: Vitaly Kuznetsov [mailto:vkuznets@redhat.com]
> Sent: Wednesday, November 30, 2016 9:55 AM
> To: x86@kernel.org; devel@linuxdriverproject.org
> Cc: linux-kernel@vger.kernel.org; KY Srinivasan <kys@microsoft.com>;
> Haiyang Zhang <haiyangz@microsoft.com>; Thomas Gleixner
> <tglx@linutronix.de>; Ingo Molnar <mingo@redhat.com>; H. Peter Anvin
> <hpa@zytor.com>
> Subject: [PATCH] x86/hyperv: Handle unknown NMIs on one CPU when
> unknown_nmi_panic
>
> There is a feature in Hyper-V (Debug-VM --InjectNonMaskableInterrupt) which
> injects NMI to the guest. Prior to WS2016 the NMI is injected to all CPUs
> of the guest and WS2016 injects it to CPU0 only. When unknown_nmi_panic is
> enabled and we'd like to do kdump we need to perform some minimal cleanup
> so the kdump kernel will be able to initialize VMBus devices, this cleanup
> includes sending CHANNELMSG_UNLOAD to the host waiting for
> CHANNELMSG_UNLOAD_RESPONSE to arrive. WS2012R2 always sends the response
> to the CPU which was used to send CHANNELMSG_REQUESTOFFERS on VMBus module
> load and not to the CPU which is sending CHANNELMSG_UNLOAD. As we can't do
> any cross-CPU work reliably on crash we have vmbus_wait_for_unload()
> function which tries to read CHANNELMSG_UNLOAD_RESPONSE on all CPUs message
> pages and this sometimes works. It was discovered that in case the host
> wants to send more than one message to a secondary CPU (not the CPU running
> vmbus_wait_for_unload()) we're unable to get it as after reading the first
> message we're supposed to do EOMing by doing wrmsrl(HV_X64_MSR_EOM, 0) but
> this is per-CPU. I have a feeling that this was working some time ago when
> I implemented vmbus_wait_for_unload(), the host was re-trying to deliver a
> message even without wrmsrl() but apparently this doesn't work any more.
> Unfortunately there is not that much we can do when all CPUs get NMI as
> all but the first one are getting blocked with interrupts disabled. What we
> can do is limit processing unknown interrupts to the first CPU which gets
> it in case we're about to crash.
>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Thanks Vitaly.
Acked-by: K. Y. Srinivasan <kys@microsoft.com>
> ---
> arch/x86/kernel/cpu/mshyperv.c | 22 ++++++++++++++++++++++
> 1 file changed, 22 insertions(+)
>
> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> index 8f44c5a..6e4181ff 100644
> --- a/arch/x86/kernel/cpu/mshyperv.c
> +++ b/arch/x86/kernel/cpu/mshyperv.c
> @@ -31,6 +31,7 @@
> #include <asm/apic.h>
> #include <asm/timer.h>
> #include <asm/reboot.h>
> +#include <asm/nmi.h>
>
> struct ms_hyperv_info ms_hyperv;
> EXPORT_SYMBOL_GPL(ms_hyperv);
> @@ -158,6 +159,24 @@ static unsigned char hv_get_nmi_reason(void)
> return 0;
> }
>
> +/*
> + * Prior to WS2016 Debug-VM sends NMIs to all CPUs which makes
> + * it difficult to process CHANNELMSG_UNLOAD in case of crash. Handle
> + * unknown NMI on the first CPU which gets it.
> + */
> +static int hv_nmi_unknown(unsigned int val, struct pt_regs *regs)
> +{
> + static atomic_t nmi_cpu = ATOMIC_INIT(-1);
> +
> + if (!unknown_nmi_panic)
> + return NMI_DONE;
> +
> + if (atomic_cmpxchg(&nmi_cpu, -1, raw_smp_processor_id()) != -1)
> + return NMI_HANDLED;
> +
> + return NMI_DONE;
> +}
> +
> static void __init ms_hyperv_init_platform(void)
> {
> /*
> @@ -204,6 +223,9 @@ static void __init ms_hyperv_init_platform(void)
> */
> if (efi_enabled(EFI_BOOT))
> x86_platform.get_nmi_reason = hv_get_nmi_reason;
> +
> + register_nmi_handler(NMI_LOCAL, hv_nmi_unknown, NMI_FLAG_FIRST,
> + "hv_nmi_unknown");
> }
>
> const __refconst struct hypervisor_x86 x86_hyper_ms_hyperv = {
> --
> 2.9.3
* Re: [PATCH] x86/hyperv: Handle unknown NMIs on one CPU when unknown_nmi_panic
From: kbuild test robot @ 2016-12-01 11:36 UTC (permalink / raw)
To: Vitaly Kuznetsov
Cc: kbuild-all, x86, devel, Haiyang Zhang, linux-kernel, Ingo Molnar,
H. Peter Anvin, Thomas Gleixner
[-- Attachment #1: Type: text/plain, Size: 1511 bytes --]
Hi Vitaly,
[auto build test ERROR on tip/x86/core]
[also build test ERROR on v4.9-rc7 next-20161130]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Vitaly-Kuznetsov/x86-hyperv-Handle-unknown-NMIs-on-one-CPU-when-unknown_nmi_panic/20161201-171219
config: i386-randconfig-x014-201648 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
# save the attached .config to linux build tree
make ARCH=i386
All errors (new ones prefixed by >>):
arch/x86/kernel/cpu/mshyperv.c: In function 'hv_nmi_unknown':
>> arch/x86/kernel/cpu/mshyperv.c:171:7: error: 'unknown_nmi_panic' undeclared (first use in this function)
if (!unknown_nmi_panic)
^~~~~~~~~~~~~~~~~
arch/x86/kernel/cpu/mshyperv.c:171:7: note: each undeclared identifier is reported only once for each function it appears in
vim +/unknown_nmi_panic +171 arch/x86/kernel/cpu/mshyperv.c
165 * unknown NMI on the first CPU which gets it.
166 */
167 static int hv_nmi_unknown(unsigned int val, struct pt_regs *regs)
168 {
169 static atomic_t nmi_cpu = ATOMIC_INIT(-1);
170
> 171 if (!unknown_nmi_panic)
172 return NMI_DONE;
173
174 if (atomic_cmpxchg(&nmi_cpu, -1, raw_smp_processor_id()) != -1)
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 24405 bytes --]
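
The error above means that unknown_nmi_panic has no visible declaration in this i386 randconfig even though <asm/nmi.h> is included. One plausible explanation, which this thread does not confirm, is that the declaration is only visible when a config option such as CONFIG_X86_LOCAL_APIC is enabled. The userspace sketch below mimics that situation and shows the usual remedy of guarding every user of the symbol with the same condition as its declaration. DEMO_LOCAL_APIC, demo_unknown_nmi_panic and demo_should_filter_nmi are hypothetical names made up for this illustration, not kernel API.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for a Kconfig option; an assumption for
 * illustration. Remove this define to model the failing randconfig.
 */
#define DEMO_LOCAL_APIC 1

#ifdef DEMO_LOCAL_APIC
/* Stand-in for a declaration that only exists when the option is set. */
static int demo_unknown_nmi_panic = 1;
#endif

/*
 * Every use of the guarded symbol sits under the same #ifdef, so the
 * file still compiles when DEMO_LOCAL_APIC is undefined.
 */
static bool demo_should_filter_nmi(void)
{
#ifdef DEMO_LOCAL_APIC
	return demo_unknown_nmi_panic != 0;
#else
	return false;	/* feature compiled out: nothing to filter */
#endif
}
```

With the define in place the function reports that filtering is active; with the define removed the same file still builds and the function simply returns false, which is the behaviour one would want from a handler that cannot exist in that configuration.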
* Re: [PATCH] x86/hyperv: Handle unknown NMIs on one CPU when unknown_nmi_panic
From: kbuild test robot @ 2016-12-01 13:01 UTC (permalink / raw)
To: Vitaly Kuznetsov
Cc: kbuild-all, x86, devel, Haiyang Zhang, linux-kernel, Ingo Molnar,
H. Peter Anvin, Thomas Gleixner
[-- Attachment #1: Type: text/plain, Size: 7309 bytes --]
Hi Vitaly,
[auto build test WARNING on tip/x86/core]
[also build test WARNING on v4.9-rc7 next-20161130]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Vitaly-Kuznetsov/x86-hyperv-Handle-unknown-NMIs-on-one-CPU-when-unknown_nmi_panic/20161201-171219
config: i386-randconfig-x0-12011945 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
# save the attached .config to linux build tree
make ARCH=i386
All warnings (new ones prefixed by >>):
In file included from include/uapi/linux/stddef.h:1:0,
from include/linux/stddef.h:4,
from include/uapi/linux/posix_types.h:4,
from include/uapi/linux/types.h:13,
from include/linux/types.h:5,
from arch/x86/kernel/cpu/mshyperv.c:13:
arch/x86/kernel/cpu/mshyperv.c: In function 'hv_nmi_unknown':
arch/x86/kernel/cpu/mshyperv.c:171:7: error: 'unknown_nmi_panic' undeclared (first use in this function)
if (!unknown_nmi_panic)
^
include/linux/compiler.h:149:30: note: in definition of macro '__trace_if'
if (__builtin_constant_p(!!(cond)) ? !!(cond) : \
^~~~
>> arch/x86/kernel/cpu/mshyperv.c:171:2: note: in expansion of macro 'if'
if (!unknown_nmi_panic)
^~
arch/x86/kernel/cpu/mshyperv.c:171:7: note: each undeclared identifier is reported only once for each function it appears in
if (!unknown_nmi_panic)
^
include/linux/compiler.h:149:30: note: in definition of macro '__trace_if'
if (__builtin_constant_p(!!(cond)) ? !!(cond) : \
^~~~
>> arch/x86/kernel/cpu/mshyperv.c:171:2: note: in expansion of macro 'if'
if (!unknown_nmi_panic)
^~
vim +/if +171 arch/x86/kernel/cpu/mshyperv.c
7 * This program is free software; you can redistribute it and/or modify
8 * it under the terms of the GNU General Public License as published by
9 * the Free Software Foundation; version 2 of the License.
10 *
11 */
12
> 13 #include <linux/types.h>
14 #include <linux/time.h>
15 #include <linux/clocksource.h>
16 #include <linux/init.h>
17 #include <linux/export.h>
18 #include <linux/hardirq.h>
19 #include <linux/efi.h>
20 #include <linux/interrupt.h>
21 #include <linux/irq.h>
22 #include <linux/kexec.h>
23 #include <asm/processor.h>
24 #include <asm/hypervisor.h>
25 #include <asm/hyperv.h>
26 #include <asm/mshyperv.h>
27 #include <asm/desc.h>
28 #include <asm/idle.h>
29 #include <asm/irq_regs.h>
30 #include <asm/i8259.h>
31 #include <asm/apic.h>
32 #include <asm/timer.h>
33 #include <asm/reboot.h>
34 #include <asm/nmi.h>
35
36 struct ms_hyperv_info ms_hyperv;
37 EXPORT_SYMBOL_GPL(ms_hyperv);
38
39 #if IS_ENABLED(CONFIG_HYPERV)
40 static void (*vmbus_handler)(void);
41 static void (*hv_kexec_handler)(void);
42 static void (*hv_crash_handler)(struct pt_regs *regs);
43
44 void hyperv_vector_handler(struct pt_regs *regs)
45 {
46 struct pt_regs *old_regs = set_irq_regs(regs);
47
48 entering_irq();
49 inc_irq_stat(irq_hv_callback_count);
50 if (vmbus_handler)
51 vmbus_handler();
52
53 exiting_irq();
54 set_irq_regs(old_regs);
55 }
56
57 void hv_setup_vmbus_irq(void (*handler)(void))
58 {
59 vmbus_handler = handler;
60 /*
61 * Setup the IDT for hypervisor callback. Prevent reallocation
62 * at module reload.
63 */
64 if (!test_bit(HYPERVISOR_CALLBACK_VECTOR, used_vectors))
65 alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR,
66 hyperv_callback_vector);
67 }
68
69 void hv_remove_vmbus_irq(void)
70 {
71 /* We have no way to deallocate the interrupt gate */
72 vmbus_handler = NULL;
73 }
74 EXPORT_SYMBOL_GPL(hv_setup_vmbus_irq);
75 EXPORT_SYMBOL_GPL(hv_remove_vmbus_irq);
76
77 void hv_setup_kexec_handler(void (*handler)(void))
78 {
79 hv_kexec_handler = handler;
80 }
81 EXPORT_SYMBOL_GPL(hv_setup_kexec_handler);
82
83 void hv_remove_kexec_handler(void)
84 {
85 hv_kexec_handler = NULL;
86 }
87 EXPORT_SYMBOL_GPL(hv_remove_kexec_handler);
88
89 void hv_setup_crash_handler(void (*handler)(struct pt_regs *regs))
90 {
91 hv_crash_handler = handler;
92 }
93 EXPORT_SYMBOL_GPL(hv_setup_crash_handler);
94
95 void hv_remove_crash_handler(void)
96 {
97 hv_crash_handler = NULL;
98 }
99 EXPORT_SYMBOL_GPL(hv_remove_crash_handler);
100
101 #ifdef CONFIG_KEXEC_CORE
102 static void hv_machine_shutdown(void)
103 {
104 if (kexec_in_progress && hv_kexec_handler)
105 hv_kexec_handler();
106 native_machine_shutdown();
107 }
108
109 static void hv_machine_crash_shutdown(struct pt_regs *regs)
110 {
111 if (hv_crash_handler)
112 hv_crash_handler(regs);
113 native_machine_crash_shutdown(regs);
114 }
115 #endif /* CONFIG_KEXEC_CORE */
116 #endif /* CONFIG_HYPERV */
117
118 static uint32_t __init ms_hyperv_platform(void)
119 {
120 u32 eax;
121 u32 hyp_signature[3];
122
123 if (!boot_cpu_has(X86_FEATURE_HYPERVISOR))
124 return 0;
125
126 cpuid(HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS,
127 &eax, &hyp_signature[0], &hyp_signature[1], &hyp_signature[2]);
128
129 if (eax >= HYPERV_CPUID_MIN &&
130 eax <= HYPERV_CPUID_MAX &&
131 !memcmp("Microsoft Hv", hyp_signature, 12))
132 return HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS;
133
134 return 0;
135 }
136
137 static cycle_t read_hv_clock(struct clocksource *arg)
138 {
139 cycle_t current_tick;
140 /*
141 * Read the partition counter to get the current tick count. This count
142 * is set to 0 when the partition is created and is incremented in
143 * 100 nanosecond units.
144 */
145 rdmsrl(HV_X64_MSR_TIME_REF_COUNT, current_tick);
146 return current_tick;
147 }
148
149 static struct clocksource hyperv_cs = {
150 .name = "hyperv_clocksource",
151 .rating = 400, /* use this when running on Hyperv*/
152 .read = read_hv_clock,
153 .mask = CLOCKSOURCE_MASK(64),
154 .flags = CLOCK_SOURCE_IS_CONTINUOUS,
155 };
156
157 static unsigned char hv_get_nmi_reason(void)
158 {
159 return 0;
160 }
161
162 /*
163 * Prior to WS2016 Debug-VM sends NMIs to all CPUs which makes
164 * it difficult to process CHANNELMSG_UNLOAD in case of crash. Handle
165 * unknown NMI on the first CPU which gets it.
166 */
167 static int hv_nmi_unknown(unsigned int val, struct pt_regs *regs)
168 {
169 static atomic_t nmi_cpu = ATOMIC_INIT(-1);
170
> 171 if (!unknown_nmi_panic)
172 return NMI_DONE;
173
174 if (atomic_cmpxchg(&nmi_cpu, -1, raw_smp_processor_id()) != -1)
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 23753 bytes --]