linux-kernel.vger.kernel.org archive mirror
* [PATCH v8 0/2] x86/apic: x2apic write eoi msr notrace
@ 2016-11-07  3:13 Wanpeng Li
  2016-11-07  3:13 ` [PATCH v8 1/2] x86/msr: Add write " Wanpeng Li
  2016-11-07  3:13 ` [PATCH v8 2/2] x86/apic: x2apic write eoi msr notrace Wanpeng Li
  0 siblings, 2 replies; 5+ messages in thread
From: Wanpeng Li @ 2016-11-07  3:13 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Ingo Molnar, Mike Galbraith, Peter Zijlstra, Thomas Gleixner,
	Paolo Bonzini, Borislav Petkov, Wanpeng Li

 RCU used illegally from idle CPU!
 rcu_scheduler_active = 1, debug_locks = 0
 RCU used illegally from extended quiescent state!
 no locks held by swapper/1/0.
 
  do_trace_write_msr
  native_write_msr
  native_apic_msr_eoi_write
  smp_reschedule_interrupt
  reschedule_interrupt

The reschedule interrupt may arrive while the CPU is idle. The EOI write then
reaches the write_msr tracepoint, which uses RCU, and this triggers the lockdep
warning above.

As Peterz pointed out:

| The thing is, many many smp_reschedule_interrupt() invocations don't
| actually execute anything much at all and are only sent to tickle the
| return to user path (which does the actual preemption).
| 
| Having to do the whole irq_enter/irq_exit dance just for this unlikely
| debug case totally blows.

This patchset adds an untraced (notrace) MSR write for the x2apic EOI to avoid
the debug splat, and reverts the irq_enter()/irq_exit() pair in
smp_reschedule_interrupt() so that a very frequent interrupt is not slowed
down by debug code.
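
The split between a traced and an untraced write can be sketched in user-space
C. This is only an illustrative model, not kernel code: fake_msr_value,
trace_enabled, trace_hits, raw_write(), traced_write(), untraced_write() and
demo() are hypothetical stand-ins for the MSR, the tracepoint check, and the
helpers added in patch 1/2.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins; none of these are kernel APIs. */
static uint64_t fake_msr_value;	/* models the register written by wrmsr */
static int trace_enabled;	/* models the "is the tracepoint active" check */
static int trace_hits;		/* counts calls into the trace hook */

/* Raw write: the only place that touches the modelled register. */
static inline void raw_write(uint32_t low, uint32_t high)
{
	fake_msr_value = ((uint64_t)high << 32) | low;
}

/* Traced variant: raw write followed by an optional trace hook. */
static inline void traced_write(uint32_t low, uint32_t high)
{
	raw_write(low, high);
	if (trace_enabled)
		trace_hits++;
}

/* Untraced variant: same write, but it never reaches the trace hook. */
static inline void untraced_write(uint32_t low, uint32_t high)
{
	raw_write(low, high);
}

/* Usage: with tracing enabled, only the traced path hits the hook. */
static void demo(void)
{
	trace_enabled = 1;
	traced_write(1, 0);
	untraced_write(2, 1);
	assert(trace_hits == 1);
}
```

Keeping the raw write in one helper means the instruction sequence is not
duplicated; the two wrappers differ only in whether the trace hook is reachable.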

v7 -> v8:
 * cleanup patch 2/2 descriptions

v6 -> v7:
 * cleanup: put the storage class at the beginning of the declaration

v5 -> v6:
 * split the patch 
 * don't duplicate the inline asm

v4 -> v5:
 * add notrace mark 

v3 -> v4: 
 * add notrace mark 

v2 -> v3:
 * revert the irq_enter()/irq_exit() change, since it has already been merged

v1 -> v2:
 * add a notrace MSR write to avoid the debug splat instead of slowing down
   a very frequent interrupt


Wanpeng Li (2):
  x86/msr: Add write msr notrace
  x86/apic: x2apic write eoi msr notrace

 arch/x86/include/asm/apic.h |  3 ++-
 arch/x86/include/asm/msr.h  | 14 +++++++++++++-
 arch/x86/kernel/apic/apic.c |  1 +
 arch/x86/kernel/kvm.c       |  4 ++--
 arch/x86/kernel/smp.c       |  2 --
 5 files changed, 18 insertions(+), 6 deletions(-)

-- 
1.9.1


* [PATCH v8 1/2] x86/msr: Add write msr notrace
  2016-11-07  3:13 [PATCH v8 0/2] x86/apic: x2apic write eoi msr notrace Wanpeng Li
@ 2016-11-07  3:13 ` Wanpeng Li
  2016-11-09 21:09   ` [tip:x86/apic] x86/msr: Add wrmsr_notrace() tip-bot for Wanpeng Li
  2016-11-07  3:13 ` [PATCH v8 2/2] x86/apic: x2apic write eoi msr notrace Wanpeng Li
  1 sibling, 1 reply; 5+ messages in thread
From: Wanpeng Li @ 2016-11-07  3:13 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Ingo Molnar, Mike Galbraith, Peter Zijlstra, Thomas Gleixner,
	Paolo Bonzini, Borislav Petkov, Wanpeng Li

From: Wanpeng Li <wanpeng.li@hotmail.com>

Add wrmsr_notrace(), an MSR write that bypasses the write_msr tracepoint. It
will be used by the next patch in this series.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
---
 arch/x86/include/asm/msr.h | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index b5fee97..eec29a7 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -115,17 +115,29 @@ static inline unsigned long long native_read_msr_safe(unsigned int msr,
 }
 
 /* Can be uninlined because referenced by paravirt */
-notrace static inline void native_write_msr(unsigned int msr,
+static notrace inline void __native_write_msr_notrace(unsigned int msr,
 					    unsigned low, unsigned high)
 {
 	asm volatile("1: wrmsr\n"
 		     "2:\n"
 		     _ASM_EXTABLE_HANDLE(1b, 2b, ex_handler_wrmsr_unsafe)
 		     : : "c" (msr), "a"(low), "d" (high) : "memory");
+}
+
+/* Can be uninlined because referenced by paravirt */
+static notrace inline void native_write_msr(unsigned int msr,
+					    unsigned low, unsigned high)
+{
+	__native_write_msr_notrace(msr, low, high);
 	if (msr_tracepoint_active(__tracepoint_write_msr))
 		do_trace_write_msr(msr, ((u64)high << 32 | low), 0);
 }
 
+static inline void wrmsr_notrace(unsigned msr, unsigned low, unsigned high)
+{
+	__native_write_msr_notrace(msr, low, high);
+}
+
 /* Can be uninlined because referenced by paravirt */
 notrace static inline int native_write_msr_safe(unsigned int msr,
 					unsigned low, unsigned high)
-- 
1.9.1


* [PATCH v8 2/2] x86/apic: x2apic write eoi msr notrace
  2016-11-07  3:13 [PATCH v8 0/2] x86/apic: x2apic write eoi msr notrace Wanpeng Li
  2016-11-07  3:13 ` [PATCH v8 1/2] x86/msr: Add write " Wanpeng Li
@ 2016-11-07  3:13 ` Wanpeng Li
  2016-11-09 21:10   ` [tip:x86/apic] x86/apic: Prevent tracing on apic_msr_write_eoi() tip-bot for Wanpeng Li
  1 sibling, 1 reply; 5+ messages in thread
From: Wanpeng Li @ 2016-11-07  3:13 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Ingo Molnar, Mike Galbraith, Peter Zijlstra, Thomas Gleixner,
	Paolo Bonzini, Borislav Petkov, Wanpeng Li

From: Wanpeng Li <wanpeng.li@hotmail.com>

 RCU used illegally from idle CPU!
 rcu_scheduler_active = 1, debug_locks = 0
 RCU used illegally from extended quiescent state!
 no locks held by swapper/1/0.
 
  do_trace_write_msr
  native_write_msr
  native_apic_msr_eoi_write
  smp_reschedule_interrupt
  reschedule_interrupt

The reschedule interrupt may arrive while the CPU is idle. The EOI write then
reaches the write_msr tracepoint, which uses RCU, and this triggers the lockdep
warning above.

As Peterz pointed out:

| So now we're making a very frequent interrupt slower because of debug 
| code.
|
| The thing is, many many smp_reschedule_interrupt() invocations don't
| actually execute anything much at all and are only sent to tickle the
| return to user path (which does the actual preemption).
| 
| Having to do the whole irq_enter/irq_exit dance just for this unlikely
| debug case totally blows.

This patch converts the x2apic EOI MSR write to the notrace variant to avoid
the debug splat, and reverts the irq_enter()/irq_exit() pair in
smp_reschedule_interrupt() so that a very frequent interrupt is not slowed
down by debug code.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
---
 arch/x86/include/asm/apic.h | 3 ++-
 arch/x86/kernel/apic/apic.c | 1 +
 arch/x86/kernel/kvm.c       | 4 ++--
 arch/x86/kernel/smp.c       | 2 --
 4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index f5aaf6c..a5a0bcf 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -196,7 +196,7 @@ static inline void native_apic_msr_write(u32 reg, u32 v)
 
 static inline void native_apic_msr_eoi_write(u32 reg, u32 v)
 {
-	wrmsr(APIC_BASE_MSR + (APIC_EOI >> 4), APIC_EOI_ACK, 0);
+	wrmsr_notrace(APIC_BASE_MSR + (APIC_EOI >> 4), APIC_EOI_ACK, 0);
 }
 
 static inline u32 native_apic_msr_read(u32 reg)
@@ -332,6 +332,7 @@ struct apic {
 	 * on write for EOI.
 	 */
 	void (*eoi_write)(u32 reg, u32 v);
+	void (*native_eoi_write)(u32 reg, u32 v);
 	u64 (*icr_read)(void);
 	void (*icr_write)(u32 low, u32 high);
 	void (*wait_icr_idle)(void);
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 88c657b..2686894 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2263,6 +2263,7 @@ void __init apic_set_eoi_write(void (*eoi_write)(u32 reg, u32 v))
 	for (drv = __apicdrivers; drv < __apicdrivers_end; drv++) {
 		/* Should happen once for each apic */
 		WARN_ON((*drv)->eoi_write == eoi_write);
+		(*drv)->native_eoi_write = (*drv)->eoi_write;
 		(*drv)->eoi_write = eoi_write;
 	}
 }
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index edbbfc8..d230513 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -308,7 +308,7 @@ static void kvm_register_steal_time(void)
 
 static DEFINE_PER_CPU(unsigned long, kvm_apic_eoi) = KVM_PV_EOI_DISABLED;
 
-static void kvm_guest_apic_eoi_write(u32 reg, u32 val)
+static notrace void kvm_guest_apic_eoi_write(u32 reg, u32 val)
 {
 	/**
 	 * This relies on __test_and_clear_bit to modify the memory
@@ -319,7 +319,7 @@ static void kvm_guest_apic_eoi_write(u32 reg, u32 val)
 	 */
 	if (__test_and_clear_bit(KVM_PV_EOI_BIT, this_cpu_ptr(&kvm_apic_eoi)))
 		return;
-	apic_write(APIC_EOI, APIC_EOI_ACK);
+	apic->native_eoi_write(APIC_EOI, APIC_EOI_ACK);
 }
 
 static void kvm_guest_cpu_init(void)
diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
index c00cb64..68f8cc2 100644
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -261,10 +261,8 @@ static inline void __smp_reschedule_interrupt(void)
 
 __visible void smp_reschedule_interrupt(struct pt_regs *regs)
 {
-	irq_enter();
 	ack_APIC_irq();
 	__smp_reschedule_interrupt();
-	irq_exit();
 	/*
 	 * KVM uses this interrupt to force a cpu out of guest mode
 	 */
-- 
1.9.1


* [tip:x86/apic] x86/msr: Add wrmsr_notrace()
  2016-11-07  3:13 ` [PATCH v8 1/2] x86/msr: Add write " Wanpeng Li
@ 2016-11-09 21:09   ` tip-bot for Wanpeng Li
  0 siblings, 0 replies; 5+ messages in thread
From: tip-bot for Wanpeng Li @ 2016-11-09 21:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: pbonzini, mingo, linux-kernel, efault, tglx, bp, peterz, wanpeng.li, hpa

Commit-ID:  b2c5ea4f759190ee9f75687a00035c1a66d0d743
Gitweb:     http://git.kernel.org/tip/b2c5ea4f759190ee9f75687a00035c1a66d0d743
Author:     Wanpeng Li <wanpeng.li@hotmail.com>
AuthorDate: Mon, 7 Nov 2016 11:13:39 +0800
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 9 Nov 2016 22:03:14 +0100

x86/msr: Add wrmsr_notrace()

Required to remove the extra irq_enter()/irq_exit() in
smp_reschedule_interrupt().

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: kvm@vger.kernel.org
Cc: Mike Galbraith <efault@gmx.de>
Link: http://lkml.kernel.org/r/1478488420-5982-2-git-send-email-wanpeng.li@hotmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 arch/x86/include/asm/msr.h | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index b5fee97..9b0a232 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -115,17 +115,29 @@ static inline unsigned long long native_read_msr_safe(unsigned int msr,
 }
 
 /* Can be uninlined because referenced by paravirt */
-notrace static inline void native_write_msr(unsigned int msr,
+static notrace inline void __native_write_msr_notrace(unsigned int msr,
 					    unsigned low, unsigned high)
 {
 	asm volatile("1: wrmsr\n"
 		     "2:\n"
 		     _ASM_EXTABLE_HANDLE(1b, 2b, ex_handler_wrmsr_unsafe)
 		     : : "c" (msr), "a"(low), "d" (high) : "memory");
+}
+
+/* Can be uninlined because referenced by paravirt */
+static notrace inline void native_write_msr(unsigned int msr,
+					    unsigned low, unsigned high)
+{
+	__native_write_msr_notrace(msr, low, high);
 	if (msr_tracepoint_active(__tracepoint_write_msr))
 		do_trace_write_msr(msr, ((u64)high << 32 | low), 0);
 }
 
+static inline void wrmsr_notrace(unsigned msr, unsigned low, unsigned high)
+{
+	__native_write_msr_notrace(msr, low, high);
+}
+
 /* Can be uninlined because referenced by paravirt */
 notrace static inline int native_write_msr_safe(unsigned int msr,
 					unsigned low, unsigned high)


* [tip:x86/apic] x86/apic: Prevent tracing on apic_msr_write_eoi()
  2016-11-07  3:13 ` [PATCH v8 2/2] x86/apic: x2apic write eoi msr notrace Wanpeng Li
@ 2016-11-09 21:10   ` tip-bot for Wanpeng Li
  0 siblings, 0 replies; 5+ messages in thread
From: tip-bot for Wanpeng Li @ 2016-11-09 21:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: peterz, pbonzini, efault, wanpeng.li, linux-kernel, bp, mingo, hpa, tglx

Commit-ID:  8ca225520e278e41396dab0524989f4848626f83
Gitweb:     http://git.kernel.org/tip/8ca225520e278e41396dab0524989f4848626f83
Author:     Wanpeng Li <wanpeng.li@hotmail.com>
AuthorDate: Mon, 7 Nov 2016 11:13:40 +0800
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 9 Nov 2016 22:03:14 +0100

x86/apic: Prevent tracing on apic_msr_write_eoi()

The following RCU lockdep warning led to adding irq_enter()/irq_exit() into
smp_reschedule_interrupt():

 RCU used illegally from idle CPU!
 rcu_scheduler_active = 1, debug_locks = 0
 RCU used illegally from extended quiescent state!
 no locks held by swapper/1/0.
 
  do_trace_write_msr
  native_write_msr
  native_apic_msr_eoi_write
  smp_reschedule_interrupt
  reschedule_interrupt

As Peterz pointed out:

| So now we're making a very frequent interrupt slower because of debug 
| code.
|
| The thing is, many many smp_reschedule_interrupt() invocations don't
| actually execute anything much at all and are only sent to tickle the
| return to user path (which does the actual preemption).
| 
| Having to do the whole irq_enter/irq_exit dance just for this unlikely
| debug case totally blows.

Use the wrmsr_notrace() variant in native_apic_msr_eoi_write(), annotate the
kvm variant with notrace and add a native_eoi_write callback to the apic
structure so KVM guests are covered as well.

This allows reverting the irq_enter()/irq_exit() dance in
smp_reschedule_interrupt().
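
The callback handling described above can be modelled in user-space C. This is
a rough sketch under stated assumptions, not the kernel implementation: struct
demo_apic, demo_native_eoi(), demo_pv_eoi(), demo_set_eoi_write(), pv_pending,
native_calls and cur are hypothetical names that only mirror the shape of the
change in the diff below.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*eoi_fn)(unsigned int reg, unsigned int v);

/* Hypothetical reduced driver: only the two callbacks of interest. */
struct demo_apic {
	eoi_fn eoi_write;		/* active EOI callback, may be overridden */
	eoi_fn native_eoi_write;	/* saved original callback */
};

static int native_calls;	/* how often the native EOI path ran */
static int pv_pending;		/* models the per-CPU paravirt EOI bit */
static struct demo_apic *cur;	/* models the global "apic" pointer */

/* Stand-in for the untraced native EOI write. */
static void demo_native_eoi(unsigned int reg, unsigned int v)
{
	(void)reg;
	(void)v;
	native_calls++;
}

/* Paravirt override: consume the pending bit, else use the saved native path. */
static void demo_pv_eoi(unsigned int reg, unsigned int v)
{
	if (pv_pending) {
		pv_pending = 0;
		return;
	}
	cur->native_eoi_write(reg, v);
}

/* Save the original callback before installing the override. */
static void demo_set_eoi_write(struct demo_apic *drv, eoi_fn fn)
{
	assert(drv->eoi_write != fn);	/* should happen only once per driver */
	drv->native_eoi_write = drv->eoi_write;
	drv->eoi_write = fn;
}
```

In this model the override calls the saved native callback directly rather than
going through a generic, possibly traced, write path.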

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
Cc: Mike Galbraith <efault@gmx.de>
Cc: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/1478488420-5982-3-git-send-email-wanpeng.li@hotmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 arch/x86/include/asm/apic.h | 3 ++-
 arch/x86/kernel/apic/apic.c | 1 +
 arch/x86/kernel/kvm.c       | 4 ++--
 arch/x86/kernel/smp.c       | 2 --
 4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index f5aaf6c..a5a0bcf 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -196,7 +196,7 @@ static inline void native_apic_msr_write(u32 reg, u32 v)
 
 static inline void native_apic_msr_eoi_write(u32 reg, u32 v)
 {
-	wrmsr(APIC_BASE_MSR + (APIC_EOI >> 4), APIC_EOI_ACK, 0);
+	wrmsr_notrace(APIC_BASE_MSR + (APIC_EOI >> 4), APIC_EOI_ACK, 0);
 }
 
 static inline u32 native_apic_msr_read(u32 reg)
@@ -332,6 +332,7 @@ struct apic {
 	 * on write for EOI.
 	 */
 	void (*eoi_write)(u32 reg, u32 v);
+	void (*native_eoi_write)(u32 reg, u32 v);
 	u64 (*icr_read)(void);
 	void (*icr_write)(u32 low, u32 high);
 	void (*wait_icr_idle)(void);
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 88c657b..2686894 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2263,6 +2263,7 @@ void __init apic_set_eoi_write(void (*eoi_write)(u32 reg, u32 v))
 	for (drv = __apicdrivers; drv < __apicdrivers_end; drv++) {
 		/* Should happen once for each apic */
 		WARN_ON((*drv)->eoi_write == eoi_write);
+		(*drv)->native_eoi_write = (*drv)->eoi_write;
 		(*drv)->eoi_write = eoi_write;
 	}
 }
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index edbbfc8..aad52f1 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -308,7 +308,7 @@ static void kvm_register_steal_time(void)
 
 static DEFINE_PER_CPU(unsigned long, kvm_apic_eoi) = KVM_PV_EOI_DISABLED;
 
-static void kvm_guest_apic_eoi_write(u32 reg, u32 val)
+static notrace void kvm_guest_apic_eoi_write(u32 reg, u32 val)
 {
 	/**
 	 * This relies on __test_and_clear_bit to modify the memory
@@ -319,7 +319,7 @@ static void kvm_guest_apic_eoi_write(u32 reg, u32 val)
 	 */
 	if (__test_and_clear_bit(KVM_PV_EOI_BIT, this_cpu_ptr(&kvm_apic_eoi)))
 		return;
-	apic_write(APIC_EOI, APIC_EOI_ACK);
+	apic->native_eoi_write(APIC_EOI, APIC_EOI_ACK);
 }
 
 static void kvm_guest_cpu_init(void)
diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
index c00cb64..68f8cc2 100644
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -261,10 +261,8 @@ static inline void __smp_reschedule_interrupt(void)
 
 __visible void smp_reschedule_interrupt(struct pt_regs *regs)
 {
-	irq_enter();
 	ack_APIC_irq();
 	__smp_reschedule_interrupt();
-	irq_exit();
 	/*
 	 * KVM uses this interrupt to force a cpu out of guest mode
 	 */

