KVM support may be compiled as dynamic module, which triggers the
following splat on modprobe:

KVM: vmx: using Hyper-V Enlightened VMCS
BUG: using smp_processor_id() in preemptible [00000000] code: modprobe/466 caller is debug_smp_processor_id+0x17/0x19
CPU: 0 PID: 466 Comm: modprobe Kdump: loaded Not tainted 4.19.43 #1
Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007 06/02/2017
Call Trace:
dump_stack+0x61/0x7e
check_preemption_disabled+0xd4/0xe6
debug_smp_processor_id+0x17/0x19
set_hv_tscchange_cb+0x1b/0x89
kvm_arch_init+0x14a/0x163 [kvm]
kvm_init+0x30/0x259 [kvm]
vmx_init+0xed/0x3db [kvm_intel]
do_one_initcall+0x89/0x1bc
do_init_module+0x5f/0x207
load_module+0x1b34/0x209b
__ia32_sys_init_module+0x17/0x19
do_fast_syscall_32+0x121/0x1fa
entry_SYSENTER_compat+0x7f/0x91

The easiest solution seems to be disabling preemption while setting up
reenlightenment MSRs. While at it, fix hv_cpu_*() callbacks.

Fixes: 93286261de1b4 ("x86/hyperv: Reenlightenment notifications support")

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Cathy Avery <cavery@redhat.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: "Michael Kelley (EOSG)" <Michael.H.Kelley@microsoft.com>
Cc: Mohammed Gamal <mmorsy@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Roman Kagan <rkagan@virtuozzo.com>
Cc: Sasha Levin <sashal@kernel.org>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>

Cc: devel@linuxdriverproject.org
Cc: kvm@vger.kernel.org
Cc: linux-hyperv@vger.kernel.org
Cc: x86@kernel.org
Reported-by: Prasanna Panchamukhi <panchamukhi@arista.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
---
 arch/x86/hyperv/hv_init.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index 1608050e9df9..0bdd79ecbff8 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -91,7 +91,7 @@ EXPORT_SYMBOL_GPL(hv_max_vp_index);
 static int hv_cpu_init(unsigned int cpu)
 {
 	u64 msr_vp_index;
-	struct hv_vp_assist_page **hvp = &hv_vp_assist_page[smp_processor_id()];
+	struct hv_vp_assist_page **hvp = &hv_vp_assist_page[cpu];
 	void **input_arg;
 	struct page *pg;
 
@@ -103,7 +103,7 @@ static int hv_cpu_init(unsigned int cpu)
 
 	hv_get_vp_index(msr_vp_index);
 
-	hv_vp_index[smp_processor_id()] = msr_vp_index;
+	hv_vp_index[cpu] = msr_vp_index;
 
 	if (msr_vp_index > hv_max_vp_index)
 		hv_max_vp_index = msr_vp_index;
@@ -182,7 +182,6 @@ void set_hv_tscchange_cb(void (*cb)(void))
 	struct hv_reenlightenment_control re_ctrl = {
 		.vector = HYPERV_REENLIGHTENMENT_VECTOR,
 		.enabled = 1,
-		.target_vp = hv_vp_index[smp_processor_id()]
 	};
 	struct hv_tsc_emulation_control emu_ctrl = {.enabled = 1};
 
@@ -196,7 +195,11 @@ void set_hv_tscchange_cb(void (*cb)(void))
 	/* Make sure callback is registered before we write to MSRs */
 	wmb();
 
+	preempt_disable();
+	re_ctrl.target_vp = hv_vp_index[smp_processor_id()];
 	wrmsrl(HV_X64_MSR_REENLIGHTENMENT_CONTROL, *((u64 *)&re_ctrl));
+	preempt_enable();
+
 	wrmsrl(HV_X64_MSR_TSC_EMULATION_CONTROL, *((u64 *)&emu_ctrl));
 }
 EXPORT_SYMBOL_GPL(set_hv_tscchange_cb);
--
2.22.0
On Tue, Jun 11, 2019 at 10:20:03PM +0100, Dmitry Safonov wrote:
> KVM support may be compiled as dynamic module, which triggers the
> following splat on modprobe:
>
> KVM: vmx: using Hyper-V Enlightened VMCS
> BUG: using smp_processor_id() in preemptible [00000000] code: modprobe/466 caller is debug_smp_processor_id+0x17/0x19
> CPU: 0 PID: 466 Comm: modprobe Kdump: loaded Not tainted 4.19.43 #1
> Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007 06/02/2017
> Call Trace:
> dump_stack+0x61/0x7e
> check_preemption_disabled+0xd4/0xe6
> debug_smp_processor_id+0x17/0x19
> set_hv_tscchange_cb+0x1b/0x89
> kvm_arch_init+0x14a/0x163 [kvm]
> kvm_init+0x30/0x259 [kvm]
> vmx_init+0xed/0x3db [kvm_intel]
> do_one_initcall+0x89/0x1bc
> do_init_module+0x5f/0x207
> load_module+0x1b34/0x209b
> __ia32_sys_init_module+0x17/0x19
> do_fast_syscall_32+0x121/0x1fa
> entry_SYSENTER_compat+0x7f/0x91
>
> The easiest solution seems to be disabling preemption while setting up
> reenlightenment MSRs. While at it, fix hv_cpu_*() callbacks.
>
> Fixes: 93286261de1b4 ("x86/hyperv: Reenlightenment notifications support")
>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Cathy Avery <cavery@redhat.com>
> Cc: Haiyang Zhang <haiyangz@microsoft.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "K. Y. Srinivasan" <kys@microsoft.com>
> Cc: "Michael Kelley (EOSG)" <Michael.H.Kelley@microsoft.com>
> Cc: Mohammed Gamal <mmorsy@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Cc: Roman Kagan <rkagan@virtuozzo.com>
> Cc: Sasha Levin <sashal@kernel.org>
> Cc: Stephen Hemminger <sthemmin@microsoft.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
>
> Cc: devel@linuxdriverproject.org
> Cc: kvm@vger.kernel.org
> Cc: linux-hyperv@vger.kernel.org
> Cc: x86@kernel.org
> Reported-by: Prasanna Panchamukhi <panchamukhi@arista.com>
> Signed-off-by: Dmitry Safonov <dima@arista.com>
> ---
> arch/x86/hyperv/hv_init.c | 9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index 1608050e9df9..0bdd79ecbff8 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -91,7 +91,7 @@ EXPORT_SYMBOL_GPL(hv_max_vp_index);
> static int hv_cpu_init(unsigned int cpu)
> {
> u64 msr_vp_index;
> - struct hv_vp_assist_page **hvp = &hv_vp_assist_page[smp_processor_id()];
> + struct hv_vp_assist_page **hvp = &hv_vp_assist_page[cpu];
> void **input_arg;
> struct page *pg;
>
> @@ -103,7 +103,7 @@ static int hv_cpu_init(unsigned int cpu)
>
> hv_get_vp_index(msr_vp_index);
>
> - hv_vp_index[smp_processor_id()] = msr_vp_index;
> + hv_vp_index[cpu] = msr_vp_index;
>
> if (msr_vp_index > hv_max_vp_index)
> hv_max_vp_index = msr_vp_index;
> @@ -182,7 +182,6 @@ void set_hv_tscchange_cb(void (*cb)(void))
> struct hv_reenlightenment_control re_ctrl = {
> .vector = HYPERV_REENLIGHTENMENT_VECTOR,
> .enabled = 1,
> - .target_vp = hv_vp_index[smp_processor_id()]
> };
> struct hv_tsc_emulation_control emu_ctrl = {.enabled = 1};
>
> @@ -196,7 +195,11 @@ void set_hv_tscchange_cb(void (*cb)(void))
> /* Make sure callback is registered before we write to MSRs */
> wmb();
>
> + preempt_disable();
> + re_ctrl.target_vp = hv_vp_index[smp_processor_id()];
> wrmsrl(HV_X64_MSR_REENLIGHTENMENT_CONTROL, *((u64 *)&re_ctrl));
> + preempt_enable();
> +
> wrmsrl(HV_X64_MSR_TSC_EMULATION_CONTROL, *((u64 *)&emu_ctrl));
> }
> EXPORT_SYMBOL_GPL(set_hv_tscchange_cb);
This looks bogus, MSRs are a per-cpu resource, you had better know what
CPUs you're on and be stuck to it when you do wrmsr. This just fudges
the code to make the warning go away and doesn't fix the actual problem
afaict.
Dmitry Safonov <dima@arista.com> writes:

> KVM support may be compiled as dynamic module, which triggers the
> following splat on modprobe:
>
> KVM: vmx: using Hyper-V Enlightened VMCS
> BUG: using smp_processor_id() in preemptible [00000000] code: modprobe/466 caller is debug_smp_processor_id+0x17/0x19
> CPU: 0 PID: 466 Comm: modprobe Kdump: loaded Not tainted 4.19.43 #1
> Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007 06/02/2017
> Call Trace:
> dump_stack+0x61/0x7e
> check_preemption_disabled+0xd4/0xe6
> debug_smp_processor_id+0x17/0x19
> set_hv_tscchange_cb+0x1b/0x89
> kvm_arch_init+0x14a/0x163 [kvm]
> kvm_init+0x30/0x259 [kvm]
> vmx_init+0xed/0x3db [kvm_intel]
> do_one_initcall+0x89/0x1bc
> do_init_module+0x5f/0x207
> load_module+0x1b34/0x209b
> __ia32_sys_init_module+0x17/0x19
> do_fast_syscall_32+0x121/0x1fa
> entry_SYSENTER_compat+0x7f/0x91

Hm, I never noticed this one, you probably need something like
CONFIG_PREEMPT enabled to see it.

> The easiest solution seems to be disabling preemption while setting up
> reenlightenment MSRs. While at it, fix hv_cpu_*() callbacks.
>
> Fixes: 93286261de1b4 ("x86/hyperv: Reenlightenment notifications support")
>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Cathy Avery <cavery@redhat.com>
> Cc: Haiyang Zhang <haiyangz@microsoft.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "K. Y. Srinivasan" <kys@microsoft.com>
> Cc: "Michael Kelley (EOSG)" <Michael.H.Kelley@microsoft.com>
> Cc: Mohammed Gamal <mmorsy@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Cc: Roman Kagan <rkagan@virtuozzo.com>
> Cc: Sasha Levin <sashal@kernel.org>
> Cc: Stephen Hemminger <sthemmin@microsoft.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
>
> Cc: devel@linuxdriverproject.org
> Cc: kvm@vger.kernel.org
> Cc: linux-hyperv@vger.kernel.org
> Cc: x86@kernel.org
> Reported-by: Prasanna Panchamukhi <panchamukhi@arista.com>
> Signed-off-by: Dmitry Safonov <dima@arista.com>
> ---
>  arch/x86/hyperv/hv_init.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index 1608050e9df9..0bdd79ecbff8 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -91,7 +91,7 @@ EXPORT_SYMBOL_GPL(hv_max_vp_index);
>  static int hv_cpu_init(unsigned int cpu)
>  {
>  	u64 msr_vp_index;
> -	struct hv_vp_assist_page **hvp = &hv_vp_assist_page[smp_processor_id()];
> +	struct hv_vp_assist_page **hvp = &hv_vp_assist_page[cpu];
>  	void **input_arg;
>  	struct page *pg;
>
> @@ -103,7 +103,7 @@ static int hv_cpu_init(unsigned int cpu)
>
>  	hv_get_vp_index(msr_vp_index);
>
> -	hv_vp_index[smp_processor_id()] = msr_vp_index;
> +	hv_vp_index[cpu] = msr_vp_index;
>
>  	if (msr_vp_index > hv_max_vp_index)
>  		hv_max_vp_index = msr_vp_index;

The above is unrelated cleanup (as cpu == smp_processor_id() for
CPUHP_AP_ONLINE_DYN callbacks), right? As I'm pretty sure these can't be
preempted.

> @@ -182,7 -182,6 @@ void set_hv_tscchange_cb(void (*cb)(void))
>  	struct hv_reenlightenment_control re_ctrl = {
>  		.vector = HYPERV_REENLIGHTENMENT_VECTOR,
>  		.enabled = 1,
> -		.target_vp = hv_vp_index[smp_processor_id()]
>  	};
>  	struct hv_tsc_emulation_control emu_ctrl = {.enabled = 1};
>
> @@ -196,7 +195,11 @@ void set_hv_tscchange_cb(void (*cb)(void))
>  	/* Make sure callback is registered before we write to MSRs */
>  	wmb();
>
> +	preempt_disable();
> +	re_ctrl.target_vp = hv_vp_index[smp_processor_id()];
>  	wrmsrl(HV_X64_MSR_REENLIGHTENMENT_CONTROL, *((u64 *)&re_ctrl));
> +	preempt_enable();
> +

My personal preference would be to do something like

	int cpu = get_cpu();

	... set things up ...

	put_cpu();

instead, there are no long-running things in the whole function. But
what you've done should work too, so

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>

>  	wrmsrl(HV_X64_MSR_TSC_EMULATION_CONTROL, *((u64 *)&emu_ctrl));
>  }
>  EXPORT_SYMBOL_GPL(set_hv_tscchange_cb);

--
Vitaly
Peter Zijlstra <peterz@infradead.org> writes:
> On Tue, Jun 11, 2019 at 10:20:03PM +0100, Dmitry Safonov wrote:
>> KVM support may be compiled as dynamic module, which triggers the
>> following splat on modprobe:
>>
>> KVM: vmx: using Hyper-V Enlightened VMCS
>> BUG: using smp_processor_id() in preemptible [00000000] code: modprobe/466 caller is debug_smp_processor_id+0x17/0x19
>> CPU: 0 PID: 466 Comm: modprobe Kdump: loaded Not tainted 4.19.43 #1
>> Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007 06/02/2017
>> Call Trace:
>> dump_stack+0x61/0x7e
>> check_preemption_disabled+0xd4/0xe6
>> debug_smp_processor_id+0x17/0x19
>> set_hv_tscchange_cb+0x1b/0x89
>> kvm_arch_init+0x14a/0x163 [kvm]
>> kvm_init+0x30/0x259 [kvm]
>> vmx_init+0xed/0x3db [kvm_intel]
>> do_one_initcall+0x89/0x1bc
>> do_init_module+0x5f/0x207
>> load_module+0x1b34/0x209b
>> __ia32_sys_init_module+0x17/0x19
>> do_fast_syscall_32+0x121/0x1fa
>> entry_SYSENTER_compat+0x7f/0x91
>>
>> The easiest solution seems to be disabling preemption while setting up
>> reenlightenment MSRs. While at it, fix hv_cpu_*() callbacks.
>>
>> Fixes: 93286261de1b4 ("x86/hyperv: Reenlightenment notifications support")
>>
>> Cc: Andy Lutomirski <luto@kernel.org>
>> Cc: Borislav Petkov <bp@alien8.de>
>> Cc: Cathy Avery <cavery@redhat.com>
>> Cc: Haiyang Zhang <haiyangz@microsoft.com>
>> Cc: "H. Peter Anvin" <hpa@zytor.com>
>> Cc: Ingo Molnar <mingo@redhat.com>
>> Cc: "K. Y. Srinivasan" <kys@microsoft.com>
>> Cc: "Michael Kelley (EOSG)" <Michael.H.Kelley@microsoft.com>
>> Cc: Mohammed Gamal <mmorsy@redhat.com>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: Radim Krčmář <rkrcmar@redhat.com>
>> Cc: Roman Kagan <rkagan@virtuozzo.com>
>> Cc: Sasha Levin <sashal@kernel.org>
>> Cc: Stephen Hemminger <sthemmin@microsoft.com>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
>>
>> Cc: devel@linuxdriverproject.org
>> Cc: kvm@vger.kernel.org
>> Cc: linux-hyperv@vger.kernel.org
>> Cc: x86@kernel.org
>> Reported-by: Prasanna Panchamukhi <panchamukhi@arista.com>
>> Signed-off-by: Dmitry Safonov <dima@arista.com>
>> ---
>> arch/x86/hyperv/hv_init.c | 9 ++++++---
>> 1 file changed, 6 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
>> index 1608050e9df9..0bdd79ecbff8 100644
>> --- a/arch/x86/hyperv/hv_init.c
>> +++ b/arch/x86/hyperv/hv_init.c
>> @@ -91,7 +91,7 @@ EXPORT_SYMBOL_GPL(hv_max_vp_index);
>> static int hv_cpu_init(unsigned int cpu)
>> {
>> u64 msr_vp_index;
>> - struct hv_vp_assist_page **hvp = &hv_vp_assist_page[smp_processor_id()];
>> + struct hv_vp_assist_page **hvp = &hv_vp_assist_page[cpu];
>> void **input_arg;
>> struct page *pg;
>>
>> @@ -103,7 +103,7 @@ static int hv_cpu_init(unsigned int cpu)
>>
>> hv_get_vp_index(msr_vp_index);
>>
>> - hv_vp_index[smp_processor_id()] = msr_vp_index;
>> + hv_vp_index[cpu] = msr_vp_index;
>>
>> if (msr_vp_index > hv_max_vp_index)
>> hv_max_vp_index = msr_vp_index;
>> @@ -182,7 +182,6 @@ void set_hv_tscchange_cb(void (*cb)(void))
>> struct hv_reenlightenment_control re_ctrl = {
>> .vector = HYPERV_REENLIGHTENMENT_VECTOR,
>> .enabled = 1,
>> - .target_vp = hv_vp_index[smp_processor_id()]
>> };
>> struct hv_tsc_emulation_control emu_ctrl = {.enabled = 1};
>>
>> @@ -196,7 +195,11 @@ void set_hv_tscchange_cb(void (*cb)(void))
>> /* Make sure callback is registered before we write to MSRs */
>> wmb();
>>
>> + preempt_disable();
>> + re_ctrl.target_vp = hv_vp_index[smp_processor_id()];
>> wrmsrl(HV_X64_MSR_REENLIGHTENMENT_CONTROL, *((u64 *)&re_ctrl));
>> + preempt_enable();
>> +
>> wrmsrl(HV_X64_MSR_TSC_EMULATION_CONTROL, *((u64 *)&emu_ctrl));
>> }
>> EXPORT_SYMBOL_GPL(set_hv_tscchange_cb);
>
> This looks bogus, MSRs are a per-cpu resource, you had better know what
> CPUs you're on and be stuck to it when you do wrmsr. This just fudges
> the code to make the warning go away and doesn't fix the actual problem
> afaict.
Actually, we don't care which CPU will receive the reenlightenment
notification and TSC Emulation in Hyper-V is, of course, global. We have
code which re-assigns the notification to some other CPU in case the
one it's currently assigned to goes away (see hv_cpu_die()).
--
Vitaly
On Wed, 12 Jun 2019, Vitaly Kuznetsov wrote:
> Dmitry Safonov <dima@arista.com> writes:
> > diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> > index 1608050e9df9..0bdd79ecbff8 100644
> > --- a/arch/x86/hyperv/hv_init.c
> > +++ b/arch/x86/hyperv/hv_init.c
> > @@ -91,7 +91,7 @@ EXPORT_SYMBOL_GPL(hv_max_vp_index);
> > static int hv_cpu_init(unsigned int cpu)
> > {
> > u64 msr_vp_index;
> > - struct hv_vp_assist_page **hvp = &hv_vp_assist_page[smp_processor_id()];
> > + struct hv_vp_assist_page **hvp = &hv_vp_assist_page[cpu];
> > void **input_arg;
> > struct page *pg;
> >
> > @@ -103,7 +103,7 @@ static int hv_cpu_init(unsigned int cpu)
> >
> > hv_get_vp_index(msr_vp_index);
> >
> > - hv_vp_index[smp_processor_id()] = msr_vp_index;
> > + hv_vp_index[cpu] = msr_vp_index;
> >
> > if (msr_vp_index > hv_max_vp_index)
> > hv_max_vp_index = msr_vp_index;
>
> The above is unrelated cleanup (as cpu == smp_processor_id() for
> CPUHP_AP_ONLINE_DYN callbacks), right? As I'm pretty sure these can't be
> preempted.
They can be preempted, but they are guaranteed to run on the upcoming CPU,
i.e. smp_processor_id() is allowed even in preemptible context as the task
cannot migrate.
Thanks,
tglx
Thomas Gleixner <tglx@linutronix.de> writes:
> On Wed, 12 Jun 2019, Vitaly Kuznetsov wrote:
>> Dmitry Safonov <dima@arista.com> writes:
>> > diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
>> > index 1608050e9df9..0bdd79ecbff8 100644
>> > --- a/arch/x86/hyperv/hv_init.c
>> > +++ b/arch/x86/hyperv/hv_init.c
>> > @@ -91,7 +91,7 @@ EXPORT_SYMBOL_GPL(hv_max_vp_index);
>> > static int hv_cpu_init(unsigned int cpu)
>> > {
>> > u64 msr_vp_index;
>> > - struct hv_vp_assist_page **hvp = &hv_vp_assist_page[smp_processor_id()];
>> > + struct hv_vp_assist_page **hvp = &hv_vp_assist_page[cpu];
>> > void **input_arg;
>> > struct page *pg;
>> >
>> > @@ -103,7 +103,7 @@ static int hv_cpu_init(unsigned int cpu)
>> >
>> > hv_get_vp_index(msr_vp_index);
>> >
>> > - hv_vp_index[smp_processor_id()] = msr_vp_index;
>> > + hv_vp_index[cpu] = msr_vp_index;
>> >
>> > if (msr_vp_index > hv_max_vp_index)
>> > hv_max_vp_index = msr_vp_index;
>>
>> The above is unrelated cleanup (as cpu == smp_processor_id() for
>> CPUHP_AP_ONLINE_DYN callbacks), right? As I'm pretty sure these can't be
>> preempted.
>
> They can be preempted, but they are guaranteed to run on the upcoming CPU,
> i.e. smp_processor_id() is allowed even in preemptible context as the task
> cannot migrate.
>
Ah, right, thanks! The guarantee that they don't migrate should be enough.
--
Vitaly
On Wed, Jun 12, 2019 at 12:17:24PM +0200, Vitaly Kuznetsov wrote:
> Dmitry Safonov <dima@arista.com> writes:
>
> > KVM support may be compiled as dynamic module, which triggers the
> > following splat on modprobe:
> >
> > KVM: vmx: using Hyper-V Enlightened VMCS
> > BUG: using smp_processor_id() in preemptible [00000000] code: modprobe/466 caller is debug_smp_processor_id+0x17/0x19
> > CPU: 0 PID: 466 Comm: modprobe Kdump: loaded Not tainted 4.19.43 #1
> > Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007 06/02/2017
> > Call Trace:
> > dump_stack+0x61/0x7e
> > check_preemption_disabled+0xd4/0xe6
> > debug_smp_processor_id+0x17/0x19
> > set_hv_tscchange_cb+0x1b/0x89
> > kvm_arch_init+0x14a/0x163 [kvm]
> > kvm_init+0x30/0x259 [kvm]
> > vmx_init+0xed/0x3db [kvm_intel]
> > do_one_initcall+0x89/0x1bc
> > do_init_module+0x5f/0x207
> > load_module+0x1b34/0x209b
> > __ia32_sys_init_module+0x17/0x19
> > do_fast_syscall_32+0x121/0x1fa
> > entry_SYSENTER_compat+0x7f/0x91
>
> Hm, I never noticed this one, you probably need something like
> CONFIG_PREEMPT enabled to see it.

CONFIG_DEBUG_PREEMPT

> > @@ -91,7 +91,7 @@ EXPORT_SYMBOL_GPL(hv_max_vp_index);
> >  static int hv_cpu_init(unsigned int cpu)
> >  {
> >  	u64 msr_vp_index;
> > -	struct hv_vp_assist_page **hvp = &hv_vp_assist_page[smp_processor_id()];
> > +	struct hv_vp_assist_page **hvp = &hv_vp_assist_page[cpu];
> >  	void **input_arg;
> >  	struct page *pg;
> >
> > @@ -103,7 +103,7 @@ static int hv_cpu_init(unsigned int cpu)
> >
> >  	hv_get_vp_index(msr_vp_index);
> >
> > -	hv_vp_index[smp_processor_id()] = msr_vp_index;
> > +	hv_vp_index[cpu] = msr_vp_index;
> >
> >  	if (msr_vp_index > hv_max_vp_index)
> >  		hv_max_vp_index = msr_vp_index;
>
> The above is unrelated cleanup (as cpu == smp_processor_id() for
> CPUHP_AP_ONLINE_DYN callbacks), right? As I'm pretty sure these can't be
> preempted.

Yeah, makes sense though.

> > @@ -182,7 -182,6 @@ void set_hv_tscchange_cb(void (*cb)(void))
> >  	struct hv_reenlightenment_control re_ctrl = {
> >  		.vector = HYPERV_REENLIGHTENMENT_VECTOR,
> >  		.enabled = 1,
> > -		.target_vp = hv_vp_index[smp_processor_id()]
> >  	};
> >  	struct hv_tsc_emulation_control emu_ctrl = {.enabled = 1};
> >
> > @@ -196,7 +195,11 @@ void set_hv_tscchange_cb(void (*cb)(void))
> >  	/* Make sure callback is registered before we write to MSRs */
> >  	wmb();
> >
> > +	preempt_disable();
> > +	re_ctrl.target_vp = hv_vp_index[smp_processor_id()];
> >  	wrmsrl(HV_X64_MSR_REENLIGHTENMENT_CONTROL, *((u64 *)&re_ctrl));
> > +	preempt_enable();
> > +
>
> My personal preference would be to do something like
>
> 	int cpu = get_cpu();
>
> 	... set things up ...
>
> 	put_cpu();

If it doesn't matter, how about this then?

---
diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index 1608050e9df9..e58c693a9fce 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -91,7 +91,7 @@ EXPORT_SYMBOL_GPL(hv_max_vp_index);
 static int hv_cpu_init(unsigned int cpu)
 {
 	u64 msr_vp_index;
-	struct hv_vp_assist_page **hvp = &hv_vp_assist_page[smp_processor_id()];
+	struct hv_vp_assist_page **hvp = &hv_vp_assist_page[cpu];
 	void **input_arg;
 	struct page *pg;
 
@@ -103,7 +103,7 @@ static int hv_cpu_init(unsigned int cpu)
 
 	hv_get_vp_index(msr_vp_index);
 
-	hv_vp_index[smp_processor_id()] = msr_vp_index;
+	hv_vp_index[cpu] = msr_vp_index;
 
 	if (msr_vp_index > hv_max_vp_index)
 		hv_max_vp_index = msr_vp_index;
@@ -182,7 +182,7 @@ void set_hv_tscchange_cb(void (*cb)(void))
 	struct hv_reenlightenment_control re_ctrl = {
 		.vector = HYPERV_REENLIGHTENMENT_VECTOR,
 		.enabled = 1,
-		.target_vp = hv_vp_index[smp_processor_id()]
+		.target_vp = hv_vp_index[raw_smp_processor_id()]
 	};
 	struct hv_tsc_emulation_control emu_ctrl = {.enabled = 1};
Peter Zijlstra <peterz@infradead.org> writes:
> @@ -182,7 +182,7 @@ void set_hv_tscchange_cb(void (*cb)(void))
> struct hv_reenlightenment_control re_ctrl = {
> .vector = HYPERV_REENLIGHTENMENT_VECTOR,
> .enabled = 1,
> - .target_vp = hv_vp_index[smp_processor_id()]
> + .target_vp = hv_vp_index[raw_smp_processor_id()]
> };
> struct hv_tsc_emulation_control emu_ctrl = {.enabled = 1};
>
Yes, this should do, thanks! I'd also suggest to leave a comment like
/*
* This function can get preempted and migrate to a different CPU
* but this doesn't matter. We just need to assign
* reenlightenment notification to some online CPU. In case this
* CPU goes offline, hv_cpu_die() will re-assign it to some
* other online CPU.
*/
--
Vitaly
On 6/14/19 11:08 AM, Vitaly Kuznetsov wrote:
> Peter Zijlstra <peterz@infradead.org> writes:
>
>> @@ -182,7 +182,7 @@ void set_hv_tscchange_cb(void (*cb)(void))
>> struct hv_reenlightenment_control re_ctrl = {
>> .vector = HYPERV_REENLIGHTENMENT_VECTOR,
>> .enabled = 1,
>> - .target_vp = hv_vp_index[smp_processor_id()]
>> + .target_vp = hv_vp_index[raw_smp_processor_id()]
>> };
>> struct hv_tsc_emulation_control emu_ctrl = {.enabled = 1};
>>
>
> Yes, this should do, thanks! I'd also suggest to leave a comment like
> /*
> * This function can get preempted and migrate to a different CPU
> * but this doesn't matter. We just need to assign
> * reenlightenment notification to some online CPU. In case this
> * CPU goes offline, hv_cpu_die() will re-assign it to some
> * other online CPU.
> */
What if the cpu goes down just before wrmsrl()?
I mean, hv_cpu_die() will reassign another cpu, but this thread will be
resumed on some other cpu and will write cpu number which is at that
moment already down?
(probably I miss something)
And I presume it's guaranteed that during hv_cpu_die() no other cpu may
go down:
: new_cpu = cpumask_any_but(cpu_online_mask, cpu);
: re_ctrl.target_vp = hv_vp_index[new_cpu];
: wrmsrl(HV_X64_MSR_REENLIGHTENMENT_CONTROL, *((u64 *)&re_ctrl));
--
Dima
On Fri, Jun 14, 2019 at 12:50:51PM +0100, Dmitry Safonov wrote:
> On 6/14/19 11:08 AM, Vitaly Kuznetsov wrote:
> > Peter Zijlstra <peterz@infradead.org> writes:
> >
> >> @@ -182,7 +182,7 @@ void set_hv_tscchange_cb(void (*cb)(void))
> >> struct hv_reenlightenment_control re_ctrl = {
> >> .vector = HYPERV_REENLIGHTENMENT_VECTOR,
> >> .enabled = 1,
> >> - .target_vp = hv_vp_index[smp_processor_id()]
> >> + .target_vp = hv_vp_index[raw_smp_processor_id()]
> >> };
> >> struct hv_tsc_emulation_control emu_ctrl = {.enabled = 1};
> >>
> >
> > Yes, this should do, thanks! I'd also suggest to leave a comment like
> > /*
> > * This function can get preempted and migrate to a different CPU
> > * but this doesn't matter. We just need to assign
> > * reenlightenment notification to some online CPU. In case this
> > * CPU goes offline, hv_cpu_die() will re-assign it to some
> > * other online CPU.
> > */
>
> What if the cpu goes down just before wrmsrl()?
> I mean, hv_cpu_die() will reassign another cpu, but this thread will be
> resumed on some other cpu and will write cpu number which is at that
> moment already down?
>
> (probably I miss something)
>
> And I presume it's guaranteed that during hv_cpu_die() no other cpu may
> go down:
> : new_cpu = cpumask_any_but(cpu_online_mask, cpu);
> : re_ctrl.target_vp = hv_vp_index[new_cpu];
> : wrmsrl(HV_X64_MSR_REENLIGHTENMENT_CONTROL, *((u64 *)&re_ctrl));
Then cpus_read_lock() is the right interface, not preempt_disable().
I know you probably can't change the HV interface, but I'm thinking its
rather daft you have to specify a CPU at all for this. The HV can just
pick one and send the notification there, who cares.
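A hedged sketch of how `cpus_read_lock()` might slot into `set_hv_tscchange_cb()` follows. This is a guess at the shape of a v2, not an actual posted patch: holding the hotplug read lock guarantees that whichever CPU `raw_smp_processor_id()` names stays online until after the MSR write, closing the race discussed above, while still allowing preemption and migration among online CPUs.

```c
void set_hv_tscchange_cb(void (*cb)(void))
{
	/* ... re_ctrl setup, callback registration, wmb() as before ... */

	/*
	 * Block CPU hotplug: the CPU whose VP index we write below
	 * cannot go offline until cpus_read_unlock(), so hv_cpu_die()
	 * cannot race with us and leave the MSR pointing at a VP
	 * that is already down.
	 */
	cpus_read_lock();
	re_ctrl.target_vp = hv_vp_index[raw_smp_processor_id()];
	wrmsrl(HV_X64_MSR_REENLIGHTENMENT_CONTROL, *((u64 *)&re_ctrl));
	cpus_read_unlock();

	/* ... TSC emulation MSR write as before ... */
}
```

Note that migration between the index read and the `wrmsrl()` is still possible here, but harmless: any online CPU's VP index is an acceptable target, and the lock keeps all of them online until the write lands.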
On 6/14/19 1:27 PM, Peter Zijlstra wrote:
> On Fri, Jun 14, 2019 at 12:50:51PM +0100, Dmitry Safonov wrote:
>> On 6/14/19 11:08 AM, Vitaly Kuznetsov wrote:
>>> Peter Zijlstra <peterz@infradead.org> writes:
>>>
>>>> @@ -182,7 +182,7 @@ void set_hv_tscchange_cb(void (*cb)(void))
>>>> struct hv_reenlightenment_control re_ctrl = {
>>>> .vector = HYPERV_REENLIGHTENMENT_VECTOR,
>>>> .enabled = 1,
>>>> - .target_vp = hv_vp_index[smp_processor_id()]
>>>> + .target_vp = hv_vp_index[raw_smp_processor_id()]
>>>> };
>>>> struct hv_tsc_emulation_control emu_ctrl = {.enabled = 1};
>>>>
>>>
>>> Yes, this should do, thanks! I'd also suggest to leave a comment like
>>> /*
>>> * This function can get preempted and migrate to a different CPU
>>> * but this doesn't matter. We just need to assign
>>> * reenlightenment notification to some online CPU. In case this
>>> * CPU goes offline, hv_cpu_die() will re-assign it to some
>>> * other online CPU.
>>> */
>>
>> What if the cpu goes down just before wrmsrl()?
>> I mean, hv_cpu_die() will reassign another cpu, but this thread will be
>> resumed on some other cpu and will write cpu number which is at that
>> moment already down?
>>
>> (probably I miss something)
>>
>> And I presume it's guaranteed that during hv_cpu_die() no other cpu may
>> go down:
>> : new_cpu = cpumask_any_but(cpu_online_mask, cpu);
>> : re_ctrl.target_vp = hv_vp_index[new_cpu];
>> : wrmsrl(HV_X64_MSR_REENLIGHTENMENT_CONTROL, *((u64 *)&re_ctrl));
>
> Then cpus_read_lock() is the right interface, not preempt_disable().
>
> I know you probably can't change the HV interface, but I'm thinking its
> rather daft you have to specify a CPU at all for this. The HV can just
> pick one and send the notification there, who cares.
Heh, I thought cpus_read_lock() is a more "internal" API and
preempt_disable() is preferred ;-)
Will send v2 with the suggested comment and cpus_read_lock().
--
Dima
Dmitry Safonov <dima@arista.com> writes:

> On 6/14/19 11:08 AM, Vitaly Kuznetsov wrote:
>> Peter Zijlstra <peterz@infradead.org> writes:
>>
>>> @@ -182,7 +182,7 @@ void set_hv_tscchange_cb(void (*cb)(void))
>>>  	struct hv_reenlightenment_control re_ctrl = {
>>>  		.vector = HYPERV_REENLIGHTENMENT_VECTOR,
>>>  		.enabled = 1,
>>> -		.target_vp = hv_vp_index[smp_processor_id()]
>>> +		.target_vp = hv_vp_index[raw_smp_processor_id()]
>>>  	};
>>>  	struct hv_tsc_emulation_control emu_ctrl = {.enabled = 1};
>>>
>>
>> Yes, this should do, thanks! I'd also suggest to leave a comment like
>> /*
>>  * This function can get preempted and migrate to a different CPU
>>  * but this doesn't matter. We just need to assign
>>  * reenlightenment notification to some online CPU. In case this
>>  * CPU goes offline, hv_cpu_die() will re-assign it to some
>>  * other online CPU.
>>  */
>
> What if the cpu goes down just before wrmsrl()?
> I mean, hv_cpu_die() will reassign another cpu, but this thread will be
> resumed on some other cpu and will write cpu number which is at that
> moment already down?

Right you are, we need to guarantee wrmsr() happens before the CPU gets
a chance to go offline: we don't save the cpu number anywhere, we just
read it with rdmsr() in hv_cpu_die().

> And I presume it's guaranteed that during hv_cpu_die() no other cpu may
> go down:
> : new_cpu = cpumask_any_but(cpu_online_mask, cpu);
> : re_ctrl.target_vp = hv_vp_index[new_cpu];
> : wrmsrl(HV_X64_MSR_REENLIGHTENMENT_CONTROL, *((u64 *)&re_ctrl));

I *think* I got convinced that CPUs don't go offline simultaneously when
I was writing this.

--
Vitaly
Peter Zijlstra <peterz@infradead.org> writes:
>
> I know you probably can't change the HV interface, but I'm thinking its
> rather daft you have to specify a CPU at all for this. The HV can just
> pick one and send the notification there, who cares.
Generally speaking, the hypervisor can't know whether a CPU is offline
(or e.g. 'isolated') from the guest's perspective, so I think having an
option to specify affinity for the reenlightenment notification is a
good thing, not a bad one.
(Actually, I don't remember if I tried specifying 'HV_ANY' (U32_MAX-1)
here to see what happens. But then I doubt it'll notice the fact that we
offlined some CPU so we may get a totally unexpected IRQ there).
--
Vitaly