* [PATCH v6 0/2] x86/apic: x2apic write eoi msr notrace
@ 2016-10-27 4:38 Wanpeng Li
2016-10-27 4:38 ` [PATCH v6 1/2] x86/msr: Add write " Wanpeng Li
2016-10-27 4:38 ` [PATCH v6 2/2] x86/apic: x2apic write eoi " Wanpeng Li
0 siblings, 2 replies; 8+ messages in thread
From: Wanpeng Li @ 2016-10-27 4:38 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: Ingo Molnar, Mike Galbraith, Peter Zijlstra, Thomas Gleixner,
Paolo Bonzini, Borislav Petkov, Wanpeng Li
| RCU used illegally from idle CPU!
| rcu_scheduler_active = 1, debug_locks = 0
| RCU used illegally from extended quiescent state!
| no locks held by swapper/1/0.
|
| [<ffffffff9d492b95>] do_trace_write_msr+0x135/0x140
| [<ffffffff9d06f860>] native_write_msr+0x20/0x30
| [<ffffffff9d065fad>] native_apic_msr_eoi_write+0x1d/0x30
| [<ffffffff9d05bd1d>] smp_reschedule_interrupt+0x1d/0x30
| [<ffffffff9d8daec6>] reschedule_interrupt+0x96/0xa0
The reschedule interrupt may arrive while the CPU is idle. This triggers
the lockdep warning above.
As Peterz pointed out:
| The thing is, many many smp_reschedule_interrupt() invocations don't
| actually execute anything much at all and are only sent to tickle the
| return to user path (which does the actual preemption).
|
| Having to do the whole irq_enter/irq_exit dance just for this unlikely
| debug case totally blows.
This patchset adds a notrace variant of the x2apic EOI MSR write to avoid
the debug splat, and reverts the irq_enter/irq_exit dance so that a very
frequent interrupt is not slowed down by debug code.
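To illustrate the idea, here is a small user-space model, not kernel code.
Every model_* name is a hypothetical stand-in: model_trace_write_msr() plays
the role of the tracepoint hook, which is only legal while RCU is watching,
and model_cpu_idle plays the role of the idle state. The sketch assumes the
only difference between the two write paths is whether the hook runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; none of these are kernel APIs. */
static bool model_cpu_idle;       /* models "RCU is not watching this CPU" */
static unsigned int model_splats; /* counts would-be "RCU used illegally" warnings */
static uint64_t model_msr;        /* models the register being written */

/* Models the tracepoint hook: it relies on RCU, so firing while idle is illegal. */
static void model_trace_write_msr(uint32_t msr, uint64_t val)
{
	(void)msr;
	(void)val;
	if (model_cpu_idle)
		model_splats++;
}

/* Raw write: touches the register and nothing else. */
static void model_wrmsr_notrace(uint32_t msr, uint32_t low, uint32_t high)
{
	(void)msr;
	model_msr = ((uint64_t)high << 32) | low;
}

/* Traced write: the raw write followed by the tracepoint hook. */
static void model_wrmsr(uint32_t msr, uint32_t low, uint32_t high)
{
	model_wrmsr_notrace(msr, low, high);
	model_trace_write_msr(msr, ((uint64_t)high << 32) | low);
}

/*
 * Acknowledge an interrupt on an idle CPU and return how many warnings
 * the chosen write path produced.
 */
static unsigned int model_eoi_from_idle(bool traced)
{
	const uint32_t eoi_msr = 0x800 + (0xB0 >> 4); /* x2APIC EOI register */

	model_cpu_idle = true;
	model_splats = 0;
	if (traced)
		model_wrmsr(eoi_msr, 0, 0);
	else
		model_wrmsr_notrace(eoi_msr, 0, 0);
	assert(model_msr == 0); /* the ack value was written either way */
	return model_splats;
}
```

In this model, model_eoi_from_idle(true) reports one warning and
model_eoi_from_idle(false) reports none, which is the behaviour the series
is after without wrapping the handler in irq_enter()/irq_exit().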
v5 -> v6:
* split the patch
* don't duplicate the inline asm
v4 -> v5:
* add notrace mark
v3 -> v4:
* add notrace mark
v2 -> v3:
* revert irq_enter/irq_exit() since it is merged
v1 -> v2:
* add a notrace MSR write to avoid the debug splat instead of slowing down
a very frequent interrupt
Wanpeng Li (2):
x86/msr: Add write msr notrace
x86/apic: x2apic write eoi msr notrace
arch/x86/include/asm/apic.h | 3 ++-
arch/x86/include/asm/msr.h | 14 +++++++++++++-
arch/x86/kernel/apic/apic.c | 1 +
arch/x86/kernel/kvm.c | 4 ++--
arch/x86/kernel/smp.c | 2 --
5 files changed, 18 insertions(+), 6 deletions(-)
--
1.9.1
* [PATCH v6 1/2] x86/msr: Add write msr notrace
2016-10-27 4:38 [PATCH v6 0/2] x86/apic: x2apic write eoi msr notrace Wanpeng Li
@ 2016-10-27 4:38 ` Wanpeng Li
2016-10-28 16:47 ` Borislav Petkov
2016-10-27 4:38 ` [PATCH v6 2/2] x86/apic: x2apic write eoi " Wanpeng Li
1 sibling, 1 reply; 8+ messages in thread
From: Wanpeng Li @ 2016-10-27 4:38 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: Ingo Molnar, Mike Galbraith, Peter Zijlstra, Thomas Gleixner,
Paolo Bonzini, Borislav Petkov, Wanpeng Li
From: Wanpeng Li <wanpeng.li@hotmail.com>
Add a notrace variant of the MSR write; it will be used by a later patch.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
---
arch/x86/include/asm/msr.h | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index b5fee97..eec29a7 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -115,17 +115,29 @@ static inline unsigned long long native_read_msr_safe(unsigned int msr,
}
/* Can be uninlined because referenced by paravirt */
-notrace static inline void native_write_msr(unsigned int msr,
+notrace static inline void __native_write_msr_notrace(unsigned int msr,
unsigned low, unsigned high)
{
asm volatile("1: wrmsr\n"
"2:\n"
_ASM_EXTABLE_HANDLE(1b, 2b, ex_handler_wrmsr_unsafe)
: : "c" (msr), "a"(low), "d" (high) : "memory");
+}
+
+/* Can be uninlined because referenced by paravirt */
+notrace static inline void native_write_msr(unsigned int msr,
+ unsigned low, unsigned high)
+{
+ __native_write_msr_notrace(msr, low, high);
if (msr_tracepoint_active(__tracepoint_write_msr))
do_trace_write_msr(msr, ((u64)high << 32 | low), 0);
}
+static inline void wrmsr_notrace(unsigned msr, unsigned low, unsigned high)
+{
+ __native_write_msr_notrace(msr, low, high);
+}
+
/* Can be uninlined because referenced by paravirt */
notrace static inline int native_write_msr_safe(unsigned int msr,
unsigned low, unsigned high)
--
1.9.1
* [PATCH v6 2/2] x86/apic: x2apic write eoi msr notrace
2016-10-27 4:38 [PATCH v6 0/2] x86/apic: x2apic write eoi msr notrace Wanpeng Li
2016-10-27 4:38 ` [PATCH v6 1/2] x86/msr: Add write " Wanpeng Li
@ 2016-10-27 4:38 ` Wanpeng Li
2016-10-28 16:42 ` Borislav Petkov
1 sibling, 1 reply; 8+ messages in thread
From: Wanpeng Li @ 2016-10-27 4:38 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: Ingo Molnar, Mike Galbraith, Peter Zijlstra, Thomas Gleixner,
Paolo Bonzini, Borislav Petkov, Wanpeng Li
From: Wanpeng Li <wanpeng.li@hotmail.com>
| RCU used illegally from idle CPU!
| rcu_scheduler_active = 1, debug_locks = 0
| RCU used illegally from extended quiescent state!
| no locks held by swapper/1/0.
|
| [<ffffffff9d492b95>] do_trace_write_msr+0x135/0x140
| [<ffffffff9d06f860>] native_write_msr+0x20/0x30
| [<ffffffff9d065fad>] native_apic_msr_eoi_write+0x1d/0x30
| [<ffffffff9d05bd1d>] smp_reschedule_interrupt+0x1d/0x30
| [<ffffffff9d8daec6>] reschedule_interrupt+0x96/0xa0
The reschedule interrupt may arrive while the CPU is idle. This triggers
the lockdep warning above.
As Peterz pointed out:
| So now we're making a very frequent interrupt slower because of debug
| code.
|
| The thing is, many many smp_reschedule_interrupt() invocations don't
| actually execute anything much at all and are only sent to tickle the
| return to user path (which does the actual preemption).
|
| Having to do the whole irq_enter/irq_exit dance just for this unlikely
| debug case totally blows.
This patch converts the x2apic EOI MSR write to notrace to avoid the debug
splat, and reverts the irq_enter/irq_exit dance so that a very frequent
interrupt is not slowed down by debug code.
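The diff below also makes apic_set_eoi_write() remember the original
callback in a new native_eoi_write field, so that the paravirt EOI handler
can fall back to it instead of going through apic_write(). A minimal
user-space sketch of that save-then-override pattern follows. All model_*
names are hypothetical, and struct model_apic is a cut-down stand-in for
struct apic, not its real layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, cut-down model of struct apic; only the two callbacks matter. */
struct model_apic {
	void (*eoi_write)(uint32_t reg, uint32_t v);
	void (*native_eoi_write)(uint32_t reg, uint32_t v);
};

static unsigned int model_native_calls; /* how often the native path ran */
static int model_pv_eoi_pending;        /* models the paravirt EOI bit */

/* Stand-in for the native (untraced) EOI write. */
static void model_native_eoi(uint32_t reg, uint32_t v)
{
	(void)reg;
	(void)v;
	model_native_calls++;
}

static struct model_apic model_apic_drv = { .eoi_write = model_native_eoi };

/* Paravirt callback: return early if the bit was set, else use the saved native path. */
static void model_pv_eoi(uint32_t reg, uint32_t v)
{
	if (model_pv_eoi_pending) {
		model_pv_eoi_pending = 0;
		return;
	}
	model_apic_drv.native_eoi_write(reg, v);
}

/* Remember the original callback before installing the override. */
static void model_set_eoi_write(struct model_apic *a,
				void (*fn)(uint32_t reg, uint32_t v))
{
	assert(a->eoi_write != fn); /* should happen only once */
	a->native_eoi_write = a->eoi_write;
	a->eoi_write = fn;
}
```

After model_set_eoi_write(&model_apic_drv, model_pv_eoi), a call through
eoi_write with the pending bit set returns without touching the native path,
while the next call reaches model_native_eoi() via the saved pointer.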
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
---
arch/x86/include/asm/apic.h | 3 ++-
arch/x86/kernel/apic/apic.c | 1 +
arch/x86/kernel/kvm.c | 4 ++--
arch/x86/kernel/smp.c | 2 --
4 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index f5aaf6c..a5a0bcf 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -196,7 +196,7 @@ static inline void native_apic_msr_write(u32 reg, u32 v)
static inline void native_apic_msr_eoi_write(u32 reg, u32 v)
{
- wrmsr(APIC_BASE_MSR + (APIC_EOI >> 4), APIC_EOI_ACK, 0);
+ wrmsr_notrace(APIC_BASE_MSR + (APIC_EOI >> 4), APIC_EOI_ACK, 0);
}
static inline u32 native_apic_msr_read(u32 reg)
@@ -332,6 +332,7 @@ struct apic {
* on write for EOI.
*/
void (*eoi_write)(u32 reg, u32 v);
+ void (*native_eoi_write)(u32 reg, u32 v);
u64 (*icr_read)(void);
void (*icr_write)(u32 low, u32 high);
void (*wait_icr_idle)(void);
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 88c657b..2686894 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2263,6 +2263,7 @@ void __init apic_set_eoi_write(void (*eoi_write)(u32 reg, u32 v))
for (drv = __apicdrivers; drv < __apicdrivers_end; drv++) {
/* Should happen once for each apic */
WARN_ON((*drv)->eoi_write == eoi_write);
+ (*drv)->native_eoi_write = (*drv)->eoi_write;
(*drv)->eoi_write = eoi_write;
}
}
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index edbbfc8..d230513 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -308,7 +308,7 @@ static void kvm_register_steal_time(void)
static DEFINE_PER_CPU(unsigned long, kvm_apic_eoi) = KVM_PV_EOI_DISABLED;
-static void kvm_guest_apic_eoi_write(u32 reg, u32 val)
+notrace static void kvm_guest_apic_eoi_write(u32 reg, u32 val)
{
/**
* This relies on __test_and_clear_bit to modify the memory
@@ -319,7 +319,7 @@ static void kvm_guest_apic_eoi_write(u32 reg, u32 val)
*/
if (__test_and_clear_bit(KVM_PV_EOI_BIT, this_cpu_ptr(&kvm_apic_eoi)))
return;
- apic_write(APIC_EOI, APIC_EOI_ACK);
+ apic->native_eoi_write(APIC_EOI, APIC_EOI_ACK);
}
static void kvm_guest_cpu_init(void)
diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
index c00cb64..68f8cc2 100644
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -261,10 +261,8 @@ static inline void __smp_reschedule_interrupt(void)
__visible void smp_reschedule_interrupt(struct pt_regs *regs)
{
- irq_enter();
ack_APIC_irq();
__smp_reschedule_interrupt();
- irq_exit();
/*
* KVM uses this interrupt to force a cpu out of guest mode
*/
--
1.9.1
* Re: [PATCH v6 2/2] x86/apic: x2apic write eoi msr notrace
2016-10-27 4:38 ` [PATCH v6 2/2] x86/apic: x2apic write eoi " Wanpeng Li
@ 2016-10-28 16:42 ` Borislav Petkov
0 siblings, 0 replies; 8+ messages in thread
From: Borislav Petkov @ 2016-10-28 16:42 UTC (permalink / raw)
To: Wanpeng Li
Cc: linux-kernel, kvm, Ingo Molnar, Mike Galbraith, Peter Zijlstra,
Thomas Gleixner, Paolo Bonzini, Wanpeng Li
On Thu, Oct 27, 2016 at 12:38:42PM +0800, Wanpeng Li wrote:
...
> This patch converts the x2apic EOI MSR write to notrace to avoid the debug
> splat, and reverts the irq_enter/irq_exit dance so that a very frequent
> interrupt is not slowed down by debug code.
>
> Suggested-by: Peter Zijlstra <peterz@infradead.org>
> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Mike Galbraith <efault@gmx.de>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Borislav Petkov <bp@alien8.de>
> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> ---
...
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index edbbfc8..d230513 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -308,7 +308,7 @@ static void kvm_register_steal_time(void)
>
> static DEFINE_PER_CPU(unsigned long, kvm_apic_eoi) = KVM_PV_EOI_DISABLED;
>
> -static void kvm_guest_apic_eoi_write(u32 reg, u32 val)
> +notrace static void kvm_guest_apic_eoi_write(u32 reg, u32 val)
WARNING: storage class should be at the beginning of the declaration
#107: FILE: arch/x86/kernel/kvm.c:311:
+notrace static void kvm_guest_apic_eoi_write(u32 reg, u32 val)
Make sure you integrate checkpatch.pl into your patch creation workflow...
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
* Re: [PATCH v6 1/2] x86/msr: Add write msr notrace
2016-10-27 4:38 ` [PATCH v6 1/2] x86/msr: Add write " Wanpeng Li
@ 2016-10-28 16:47 ` Borislav Petkov
2016-10-30 23:30 ` Wanpeng Li
0 siblings, 1 reply; 8+ messages in thread
From: Borislav Petkov @ 2016-10-28 16:47 UTC (permalink / raw)
To: Wanpeng Li
Cc: linux-kernel, kvm, Ingo Molnar, Mike Galbraith, Peter Zijlstra,
Thomas Gleixner, Paolo Bonzini, Wanpeng Li
On Thu, Oct 27, 2016 at 12:38:41PM +0800, Wanpeng Li wrote:
> From: Wanpeng Li <wanpeng.li@hotmail.com>
>
> Add write msr notrace, it will be used by later patch.
>
> Suggested-by: Peter Zijlstra <peterz@infradead.org>
> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Mike Galbraith <efault@gmx.de>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Borislav Petkov <bp@alien8.de>
> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> ---
> arch/x86/include/asm/msr.h | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
> index b5fee97..eec29a7 100644
> --- a/arch/x86/include/asm/msr.h
> +++ b/arch/x86/include/asm/msr.h
> @@ -115,17 +115,29 @@ static inline unsigned long long native_read_msr_safe(unsigned int msr,
> }
>
> /* Can be uninlined because referenced by paravirt */
> -notrace static inline void native_write_msr(unsigned int msr,
> +notrace static inline void __native_write_msr_notrace(unsigned int msr,
> unsigned low, unsigned high)
^^^^^^^
Align arguments on an opening brace.
Also, please fix that in a patch on top of this one:
WARNING: storage class should be at the beginning of the declaration
#43: FILE: arch/x86/include/asm/msr.h:118:
+notrace static inline void __native_write_msr_notrace(unsigned int msr,
WARNING: Prefer 'unsigned int' to bare use of 'unsigned'
#54: FILE: arch/x86/include/asm/msr.h:129:
+ unsigned low, unsigned high)
And because we know what those are, you can convert them directly to u32.
IOW, the end result should be something like this:
static inline void notrace
__native_write_msr_notrace(unsigned int msr, u32 low, u32 high)
And yes, I suggested using the "_notrace" suffix for the name but then
it would look funny if we end up using it in code.
So maybe we should make that lower-level helper simply:
static inline void notrace
__native_write_msr(unsigned int msr, u32 low, u32 high)
to denote that it does purely the WRMSR operation and nothing else.
Yap, that looks the cleanest to me.
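As a rough user-space sketch of the layering proposed above: the low-level
helper does only the write, and the existing wrapper adds tracing on top.
The notrace macro here is an approximation built on the GCC/Clang
no_instrument_function attribute, and the model_* variables are hypothetical
stand-ins for the hardware register and the tracepoint; the real helper
would issue WRMSR instead.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t u32;
typedef uint64_t u64;

/* User-space approximation of the kernel's notrace annotation. */
#define notrace __attribute__((no_instrument_function))

static u64 model_msr_value;           /* hypothetical stand-in for the register */
static u64 model_traced_value;        /* last value seen by the model trace hook */
static unsigned int model_trace_hits; /* how often the model trace hook ran */

/*
 * Storage class first, then the return type, then notrace.
 * The leading double underscore follows kernel convention; such names
 * are reserved in portable user-space C.
 */
static inline void notrace
__native_write_msr(unsigned int msr, u32 low, u32 high)
{
	(void)msr;
	model_msr_value = ((u64)high << 32) | low; /* stands in for WRMSR */
}

static inline void notrace
native_write_msr(unsigned int msr, u32 low, u32 high)
{
	__native_write_msr(msr, low, high);
	/* Stands in for the msr_tracepoint_active()/do_trace_write_msr() pair. */
	model_traced_value = ((u64)high << 32) | low;
	model_trace_hits++;
}
```

Callers that must not trace, such as the EOI write, would use
__native_write_msr() directly; everything else keeps calling
native_write_msr() and gets the tracepoint as before.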
Thanks.
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
* Re: [PATCH v6 1/2] x86/msr: Add write msr notrace
2016-10-28 16:47 ` Borislav Petkov
@ 2016-10-30 23:30 ` Wanpeng Li
2016-10-30 23:46 ` Borislav Petkov
0 siblings, 1 reply; 8+ messages in thread
From: Wanpeng Li @ 2016-10-30 23:30 UTC (permalink / raw)
To: Borislav Petkov
Cc: linux-kernel, kvm, Ingo Molnar, Mike Galbraith, Peter Zijlstra,
Thomas Gleixner, Paolo Bonzini, Wanpeng Li
2016-10-29 0:47 GMT+08:00 Borislav Petkov <bp@alien8.de>:
[...]
>>
>> /* Can be uninlined because referenced by paravirt */
>> -notrace static inline void native_write_msr(unsigned int msr,
>> +notrace static inline void __native_write_msr_notrace(unsigned int msr,
>> unsigned low, unsigned high)
> ^^^^^^^
> Align arguments on an opening brace.
Other functions such as native_write_msr() and native_write_msr_safe()
are also not aligned, so your suggestion may result in
inconsistency.
>
> Also, please fix that in a patch on top of this one:
>
> WARNING: storage class should be at the beginning of the declaration
> #43: FILE: arch/x86/include/asm/msr.h:118:
> +notrace static inline void __native_write_msr_notrace(unsigned int msr,
>
> WARNING: Prefer 'unsigned int' to bare use of 'unsigned'
> #54: FILE: arch/x86/include/asm/msr.h:129:
> + unsigned low, unsigned high)
>
> And because we know what those are, you can convert them directly to u32.
Ditto.
>
> IOW, the end result should be something like this:
>
> static inline void notrace
> __native_write_msr_notrace(unsigned int msr, u32 low, u32 high)
>
> And yes, I suggested using the "_notrace" suffix for the name but then
> it would look funny if we end up using it in code.
>
> So maybe we should make that lower-level helper simply:
>
> static inline void notrace
> __native_write_msr(unsigned int msr, u32 low, u32 high)
>
> to denote that it does purely the WRMSR operation and nothing else.
>
> Yap, that looks the cleanest to me.
Agreed.
Regards,
Wanpeng Li
* Re: [PATCH v6 1/2] x86/msr: Add write msr notrace
2016-10-30 23:30 ` Wanpeng Li
@ 2016-10-30 23:46 ` Borislav Petkov
2016-10-31 1:41 ` Wanpeng Li
0 siblings, 1 reply; 8+ messages in thread
From: Borislav Petkov @ 2016-10-30 23:46 UTC (permalink / raw)
To: Wanpeng Li
Cc: linux-kernel, kvm, Ingo Molnar, Mike Galbraith, Peter Zijlstra,
Thomas Gleixner, Paolo Bonzini, Wanpeng Li
On Mon, Oct 31, 2016 at 07:30:33AM +0800, Wanpeng Li wrote:
> Other functions such as native_write_msr() and native_write_msr_safe()
> are also not aligned, so your suggestion may result in
> inconsistency.
So align them too, while you're at it.
> > And because we know what those are, you can convert them directly to u32.
>
> Ditto.
You could convert the rest to u32/u64 in another patch on top, or if you
don't feel like it, I can take care of it.
Thanks.
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
* Re: [PATCH v6 1/2] x86/msr: Add write msr notrace
2016-10-30 23:46 ` Borislav Petkov
@ 2016-10-31 1:41 ` Wanpeng Li
0 siblings, 0 replies; 8+ messages in thread
From: Wanpeng Li @ 2016-10-31 1:41 UTC (permalink / raw)
To: Borislav Petkov
Cc: linux-kernel, kvm, Ingo Molnar, Mike Galbraith, Peter Zijlstra,
Thomas Gleixner, Paolo Bonzini, Wanpeng Li
2016-10-31 7:46 GMT+08:00 Borislav Petkov <bp@alien8.de>:
> On Mon, Oct 31, 2016 at 07:30:33AM +0800, Wanpeng Li wrote:
>> Other functions such as native_write_msr() and native_write_msr_safe()
>> are also not aligned, so your suggestion may result in
>> inconsistency.
>
> So align them too, while you're at it.
>
>> > And because we know what those are, you can convert them directly to u32.
>>
>> Ditto.
>
> You could convert the rest to u32/u64 in another patch on top, or if you
> don't feel like it, I can take care of it.
I have only fixed my own "storage class should be at the beginning of the
declaration" coding style issue. Please feel free to handle the alignment/u32
issues that existed before this patchset, if you like. :)
Regards,
Wanpeng Li