All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC 0/5] apic: eoi optimization support
@ 2012-04-23 14:03 Michael S. Tsirkin
  2012-04-23 14:04 ` [PATCH RFC 1/5] apic: fix typo EIO_ACK -> EOI_ACK and document Michael S. Tsirkin
                   ` (5 more replies)
  0 siblings, 6 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2012-04-23 14:03 UTC (permalink / raw)
  To: x86, kvm
  Cc: Ingo Molnar, H. Peter Anvin, Avi Kivity, Marcelo Tosatti, gleb,
	Linus Torvalds, linux-kernel

I'm looking at reducing the interrupt overhead for virtualized guests:
some workloads spend a large part of their time processing interrupts.
This patchset supplies infrastructure to reduce the IRQ ack overhead on
x86: the idea is to add an eoi_write callback that we can then optimize
without touching other apic functionality.

The main user will be kvm: on kvm, an EOI write from the guest causes an
expensive exit to host; we can avoid this using shared memory as the
last patch in the series demonstrates.

But I also wrote a micro-optimized version for the regular x2apic: this
shaves off a branch and about 9 instructions from EOI when x2apic is
used, and a comment in ack_APIC_irq implies that someone counted
instructions there, at some point.

Also included in the patchset are a couple of trivial macro fixes.

The patches work fine on my boxes and I did look at the
objdump output to verify that the generated code
for the micro-optimization patch looks right
and actually is shorter.

Some benchmark results below (not sure what kind of
testing is the most appropriate) show a tiny
but measureable improvement. The tests were run on
an AMD box with 24 cpus.

- A clean kernel build after reboot shows
a tiny but measureable improvement in system time
which means lower CPU overhead (though not measureable
in total time - that is dominated by user time and fluctuates
too much):

linux# reboot -f
...
linux# make clean
linux# time make -j 64 LOCALVERSION= 2>&1 > /dev/null

Before:

real    2m52.244s
user    35m53.833s
sys     6m7.194s

After:

real    2m52.827s
user    35m48.916s
sys     6m2.305s

- perf micro-benchmarks seem to consistently show
  a tiny improvement in total time as well but it's below
  the confidence level of 3 std deviations:

# ./tools/perf/perf   stat --sync --repeat 100 --null perf bench sched messaging
...
       0.414666797 seconds time elapsed ( +-  1.29% )

Performance counter stats for 'perf bench sched messaging' (100 runs):

       0.395370891 seconds time elapsed
( +-  1.04% )


# ./tools/perf/perf   stat --sync --repeat 100 --null perf bench sched pipe -l 10000
       0.307019664 seconds time elapsed
( +-  0.10% )

       0.304738024 seconds time elapsed
( +-  0.08% )

The patches are against 3.4-rc3 - let me know if
I need to rebase.

I think patches 1-2 are definitely a good idea,
and patches 3-4 might be a good idea.
Please review, and consider patches 1-4 for linux 3.5.

Thanks,
MST

Michael S. Tsirkin (5):
  apic: fix typo EIO_ACK -> EOI_ACK and document
  apic: use symbolic APIC_EOI_ACK
  x86: add apic->eoi_write callback
  x86: eoi micro-optimization
  kvm_para: guest side for eoi avoidance

 arch/x86/include/asm/apic.h            |   22 ++++++++++++--
 arch/x86/include/asm/apicdef.h         |    2 +-
 arch/x86/include/asm/bitops.h          |    6 ++-
 arch/x86/include/asm/kvm_para.h        |    2 +
 arch/x86/kernel/apic/apic_flat_64.c    |    2 +
 arch/x86/kernel/apic/apic_noop.c       |    1 +
 arch/x86/kernel/apic/apic_numachip.c   |    1 +
 arch/x86/kernel/apic/bigsmp_32.c       |    1 +
 arch/x86/kernel/apic/es7000_32.c       |    2 +
 arch/x86/kernel/apic/numaq_32.c        |    1 +
 arch/x86/kernel/apic/probe_32.c        |    1 +
 arch/x86/kernel/apic/summit_32.c       |    1 +
 arch/x86/kernel/apic/x2apic_cluster.c  |    1 +
 arch/x86/kernel/apic/x2apic_phys.c     |    1 +
 arch/x86/kernel/apic/x2apic_uv_x.c     |    1 +
 arch/x86/kernel/kvm.c                  |   51 ++++++++++++++++++++++++++++++--
 arch/x86/platform/visws/visws_quirks.c |    2 +-
 17 files changed, 88 insertions(+), 10 deletions(-)

-- 
1.7.9.111.gf3fb0

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH RFC 1/5] apic: fix typo EIO_ACK -> EOI_ACK and document
  2012-04-23 14:03 [PATCH RFC 0/5] apic: eoi optimization support Michael S. Tsirkin
@ 2012-04-23 14:04 ` Michael S. Tsirkin
  2012-04-23 14:04 ` [PATCH RFC 2/5] apic: use symbolic APIC_EOI_ACK Michael S. Tsirkin
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2012-04-23 14:04 UTC (permalink / raw)
  To: x86, kvm
  Cc: Ingo Molnar, H. Peter Anvin, Avi Kivity, Marcelo Tosatti, gleb,
	Linus Torvalds, linux-kernel

Fix typo in the macro name and document the
reason it has this value. Update users.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 arch/x86/include/asm/apicdef.h         |    2 +-
 arch/x86/platform/visws/visws_quirks.c |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/apicdef.h b/arch/x86/include/asm/apicdef.h
index 134bba0..c46bb99 100644
--- a/arch/x86/include/asm/apicdef.h
+++ b/arch/x86/include/asm/apicdef.h
@@ -37,7 +37,7 @@
 #define		APIC_ARBPRI_MASK	0xFFu
 #define	APIC_PROCPRI	0xA0
 #define	APIC_EOI	0xB0
-#define		APIC_EIO_ACK		0x0
+#define		APIC_EOI_ACK		0x0 /* Docs say 0 for future compat. */
 #define	APIC_RRR	0xC0
 #define	APIC_LDR	0xD0
 #define		APIC_LDR_MASK		(0xFFu << 24)
diff --git a/arch/x86/platform/visws/visws_quirks.c b/arch/x86/platform/visws/visws_quirks.c
index c7abf13..94d8a39 100644
--- a/arch/x86/platform/visws/visws_quirks.c
+++ b/arch/x86/platform/visws/visws_quirks.c
@@ -445,7 +445,7 @@ static void ack_cobalt_irq(struct irq_data *data)
 
 	spin_lock_irqsave(&cobalt_lock, flags);
 	disable_cobalt_irq(data);
-	apic_write(APIC_EOI, APIC_EIO_ACK);
+	apic_write(APIC_EOI, APIC_EOI_ACK);
 	spin_unlock_irqrestore(&cobalt_lock, flags);
 }
 
-- 
1.7.9.111.gf3fb0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH RFC 2/5] apic: use symbolic APIC_EOI_ACK
  2012-04-23 14:03 [PATCH RFC 0/5] apic: eoi optimization support Michael S. Tsirkin
  2012-04-23 14:04 ` [PATCH RFC 1/5] apic: fix typo EIO_ACK -> EOI_ACK and document Michael S. Tsirkin
@ 2012-04-23 14:04 ` Michael S. Tsirkin
  2012-04-23 14:04 ` [PATCH RFC 3/5] x86: add apic->eoi_write callback Michael S. Tsirkin
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2012-04-23 14:04 UTC (permalink / raw)
  To: x86, kvm
  Cc: Ingo Molnar, H. Peter Anvin, Avi Kivity, Marcelo Tosatti, gleb,
	Linus Torvalds, linux-kernel

Use the symbol instead of hard-coded numbers,
now that the reason for the value is documented
where the constant is defined we don't need to
duplicate this explanation in code.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 arch/x86/include/asm/apic.h |    4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index d854101..a09e9ab 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -463,9 +463,7 @@ static inline void ack_APIC_irq(void)
 	 * ack_APIC_irq() actually gets compiled as a single instruction
 	 * ... yummie.
 	 */
-
-	/* Docs say use 0 for future compatibility */
-	apic_write(APIC_EOI, 0);
+	apic_write(APIC_EOI, APIC_EOI_ACK);
 }
 
 static inline unsigned default_get_apic_id(unsigned long x)
-- 
1.7.9.111.gf3fb0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH RFC 3/5] x86: add apic->eoi_write callback
  2012-04-23 14:03 [PATCH RFC 0/5] apic: eoi optimization support Michael S. Tsirkin
  2012-04-23 14:04 ` [PATCH RFC 1/5] apic: fix typo EIO_ACK -> EOI_ACK and document Michael S. Tsirkin
  2012-04-23 14:04 ` [PATCH RFC 2/5] apic: use symbolic APIC_EOI_ACK Michael S. Tsirkin
@ 2012-04-23 14:04 ` Michael S. Tsirkin
  2012-04-23 14:04 ` [PATCH RFC 4/5] x86: eoi micro-optimization Michael S. Tsirkin
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2012-04-23 14:04 UTC (permalink / raw)
  To: x86, kvm
  Cc: Ingo Molnar, H. Peter Anvin, Avi Kivity, Marcelo Tosatti, gleb,
	Linus Torvalds, linux-kernel

Add eoi_write callback so that kvm can override
eoi accesses without touching the rest of the apic.
As a side-effect, this will enable a micro-optimization
for apics using msr.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 arch/x86/include/asm/apic.h           |   15 ++++++++++++++-
 arch/x86/kernel/apic/apic_flat_64.c   |    2 ++
 arch/x86/kernel/apic/apic_noop.c      |    1 +
 arch/x86/kernel/apic/apic_numachip.c  |    1 +
 arch/x86/kernel/apic/bigsmp_32.c      |    1 +
 arch/x86/kernel/apic/es7000_32.c      |    2 ++
 arch/x86/kernel/apic/numaq_32.c       |    1 +
 arch/x86/kernel/apic/probe_32.c       |    1 +
 arch/x86/kernel/apic/summit_32.c      |    1 +
 arch/x86/kernel/apic/x2apic_cluster.c |    1 +
 arch/x86/kernel/apic/x2apic_phys.c    |    1 +
 arch/x86/kernel/apic/x2apic_uv_x.c    |    1 +
 12 files changed, 27 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index a09e9ab..74efb8d 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -351,6 +351,13 @@ struct apic {
 	/* apic ops */
 	u32 (*read)(u32 reg);
 	void (*write)(u32 reg, u32 v);
+        /*
+	 * eoi_write has the same signature as write.
+	 * Drivers can support both eoi_write and write by passing the same
+	 * callback value. Kernel can override eoi_write and fall back
+	 * on write for EOI.
+	 */
+	void (*eoi_write)(u32 reg, u32 v);
 	u64 (*icr_read)(void);
 	void (*icr_write)(u32 low, u32 high);
 	void (*wait_icr_idle)(void);
@@ -426,6 +433,11 @@ static inline void apic_write(u32 reg, u32 val)
 	apic->write(reg, val);
 }
 
+static inline void apic_eoi(void)
+{
+	apic->eoi_write(APIC_EOI, APIC_EOI_ACK);
+}
+
 static inline u64 apic_icr_read(void)
 {
 	return apic->icr_read();
@@ -450,6 +462,7 @@ static inline u32 safe_apic_wait_icr_idle(void)
 
 static inline u32 apic_read(u32 reg) { return 0; }
 static inline void apic_write(u32 reg, u32 val) { }
+static inline void apic_eoi(void) { }
 static inline u64 apic_icr_read(void) { return 0; }
 static inline void apic_icr_write(u32 low, u32 high) { }
 static inline void apic_wait_icr_idle(void) { }
@@ -463,7 +476,7 @@ static inline void ack_APIC_irq(void)
 	 * ack_APIC_irq() actually gets compiled as a single instruction
 	 * ... yummie.
 	 */
-	apic_write(APIC_EOI, APIC_EOI_ACK);
+	apic_eoi();
 }
 
 static inline unsigned default_get_apic_id(unsigned long x)
diff --git a/arch/x86/kernel/apic/apic_flat_64.c b/arch/x86/kernel/apic/apic_flat_64.c
index 359b689..0e881c4 100644
--- a/arch/x86/kernel/apic/apic_flat_64.c
+++ b/arch/x86/kernel/apic/apic_flat_64.c
@@ -227,6 +227,7 @@ static struct apic apic_flat =  {
 
 	.read				= native_apic_mem_read,
 	.write				= native_apic_mem_write,
+	.eoi_write			= native_apic_mem_write,
 	.icr_read			= native_apic_icr_read,
 	.icr_write			= native_apic_icr_write,
 	.wait_icr_idle			= native_apic_wait_icr_idle,
@@ -386,6 +387,7 @@ static struct apic apic_physflat =  {
 
 	.read				= native_apic_mem_read,
 	.write				= native_apic_mem_write,
+	.eoi_write			= native_apic_mem_write,
 	.icr_read			= native_apic_icr_read,
 	.icr_write			= native_apic_icr_write,
 	.wait_icr_idle			= native_apic_wait_icr_idle,
diff --git a/arch/x86/kernel/apic/apic_noop.c b/arch/x86/kernel/apic/apic_noop.c
index 634ae6c..a6e4c6e 100644
--- a/arch/x86/kernel/apic/apic_noop.c
+++ b/arch/x86/kernel/apic/apic_noop.c
@@ -181,6 +181,7 @@ struct apic apic_noop = {
 
 	.read				= noop_apic_read,
 	.write				= noop_apic_write,
+	.eoi_write			= noop_apic_write,
 	.icr_read			= noop_apic_icr_read,
 	.icr_write			= noop_apic_icr_write,
 	.wait_icr_idle			= noop_apic_wait_icr_idle,
diff --git a/arch/x86/kernel/apic/apic_numachip.c b/arch/x86/kernel/apic/apic_numachip.c
index 899803e..4b03357 100644
--- a/arch/x86/kernel/apic/apic_numachip.c
+++ b/arch/x86/kernel/apic/apic_numachip.c
@@ -292,6 +292,7 @@ static struct apic apic_numachip __refconst = {
 
 	.read				= native_apic_mem_read,
 	.write				= native_apic_mem_write,
+	.eoi_write			= native_apic_mem_write,
 	.icr_read			= native_apic_icr_read,
 	.icr_write			= native_apic_icr_write,
 	.wait_icr_idle			= native_apic_wait_icr_idle,
diff --git a/arch/x86/kernel/apic/bigsmp_32.c b/arch/x86/kernel/apic/bigsmp_32.c
index 0cdec70..31fbdbf 100644
--- a/arch/x86/kernel/apic/bigsmp_32.c
+++ b/arch/x86/kernel/apic/bigsmp_32.c
@@ -248,6 +248,7 @@ static struct apic apic_bigsmp = {
 
 	.read				= native_apic_mem_read,
 	.write				= native_apic_mem_write,
+	.eoi_write			= native_apic_mem_write,
 	.icr_read			= native_apic_icr_read,
 	.icr_write			= native_apic_icr_write,
 	.wait_icr_idle			= native_apic_wait_icr_idle,
diff --git a/arch/x86/kernel/apic/es7000_32.c b/arch/x86/kernel/apic/es7000_32.c
index e42d1d3b9..db4ab1b 100644
--- a/arch/x86/kernel/apic/es7000_32.c
+++ b/arch/x86/kernel/apic/es7000_32.c
@@ -678,6 +678,7 @@ static struct apic __refdata apic_es7000_cluster = {
 
 	.read				= native_apic_mem_read,
 	.write				= native_apic_mem_write,
+	.eoi_write			= native_apic_mem_write,
 	.icr_read			= native_apic_icr_read,
 	.icr_write			= native_apic_icr_write,
 	.wait_icr_idle			= native_apic_wait_icr_idle,
@@ -742,6 +743,7 @@ static struct apic __refdata apic_es7000 = {
 
 	.read				= native_apic_mem_read,
 	.write				= native_apic_mem_write,
+	.eoi_write			= native_apic_mem_write,
 	.icr_read			= native_apic_icr_read,
 	.icr_write			= native_apic_icr_write,
 	.wait_icr_idle			= native_apic_wait_icr_idle,
diff --git a/arch/x86/kernel/apic/numaq_32.c b/arch/x86/kernel/apic/numaq_32.c
index 00d2422..f00a68c 100644
--- a/arch/x86/kernel/apic/numaq_32.c
+++ b/arch/x86/kernel/apic/numaq_32.c
@@ -530,6 +530,7 @@ static struct apic __refdata apic_numaq = {
 
 	.read				= native_apic_mem_read,
 	.write				= native_apic_mem_write,
+	.eoi_write			= native_apic_mem_write,
 	.icr_read			= native_apic_icr_read,
 	.icr_write			= native_apic_icr_write,
 	.wait_icr_idle			= native_apic_wait_icr_idle,
diff --git a/arch/x86/kernel/apic/probe_32.c b/arch/x86/kernel/apic/probe_32.c
index ff2c1b9..1b291da 100644
--- a/arch/x86/kernel/apic/probe_32.c
+++ b/arch/x86/kernel/apic/probe_32.c
@@ -142,6 +142,7 @@ static struct apic apic_default = {
 
 	.read				= native_apic_mem_read,
 	.write				= native_apic_mem_write,
+	.eoi_write			= native_apic_mem_write,
 	.icr_read			= native_apic_icr_read,
 	.icr_write			= native_apic_icr_write,
 	.wait_icr_idle			= native_apic_wait_icr_idle,
diff --git a/arch/x86/kernel/apic/summit_32.c b/arch/x86/kernel/apic/summit_32.c
index fea000b..659897c 100644
--- a/arch/x86/kernel/apic/summit_32.c
+++ b/arch/x86/kernel/apic/summit_32.c
@@ -546,6 +546,7 @@ static struct apic apic_summit = {
 
 	.read				= native_apic_mem_read,
 	.write				= native_apic_mem_write,
+	.eoi_write			= native_apic_mem_write,
 	.icr_read			= native_apic_icr_read,
 	.icr_write			= native_apic_icr_write,
 	.wait_icr_idle			= native_apic_wait_icr_idle,
diff --git a/arch/x86/kernel/apic/x2apic_cluster.c b/arch/x86/kernel/apic/x2apic_cluster.c
index 48f3103..a5baa78 100644
--- a/arch/x86/kernel/apic/x2apic_cluster.c
+++ b/arch/x86/kernel/apic/x2apic_cluster.c
@@ -260,6 +260,7 @@ static struct apic apic_x2apic_cluster = {
 
 	.read				= native_apic_msr_read,
 	.write				= native_apic_msr_write,
+	.eoi_write			= native_apic_msr_write,
 	.icr_read			= native_x2apic_icr_read,
 	.icr_write			= native_x2apic_icr_write,
 	.wait_icr_idle			= native_x2apic_wait_icr_idle,
diff --git a/arch/x86/kernel/apic/x2apic_phys.c b/arch/x86/kernel/apic/x2apic_phys.c
index 8a778db..aac165b 100644
--- a/arch/x86/kernel/apic/x2apic_phys.c
+++ b/arch/x86/kernel/apic/x2apic_phys.c
@@ -166,6 +166,7 @@ static struct apic apic_x2apic_phys = {
 
 	.read				= native_apic_msr_read,
 	.write				= native_apic_msr_write,
+	.eoi_write			= native_apic_msr_write,
 	.icr_read			= native_x2apic_icr_read,
 	.icr_write			= native_x2apic_icr_write,
 	.wait_icr_idle			= native_x2apic_wait_icr_idle,
diff --git a/arch/x86/kernel/apic/x2apic_uv_x.c b/arch/x86/kernel/apic/x2apic_uv_x.c
index 87bfa69..5b0e3d0 100644
--- a/arch/x86/kernel/apic/x2apic_uv_x.c
+++ b/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -404,6 +404,7 @@ static struct apic __refdata apic_x2apic_uv_x = {
 
 	.read				= native_apic_msr_read,
 	.write				= native_apic_msr_write,
+	.eoi_write			= native_apic_msr_write,
 	.icr_read			= native_x2apic_icr_read,
 	.icr_write			= native_x2apic_icr_write,
 	.wait_icr_idle			= native_x2apic_wait_icr_idle,
-- 
1.7.9.111.gf3fb0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH RFC 4/5] x86: eoi micro-optimization
  2012-04-23 14:03 [PATCH RFC 0/5] apic: eoi optimization support Michael S. Tsirkin
                   ` (2 preceding siblings ...)
  2012-04-23 14:04 ` [PATCH RFC 3/5] x86: add apic->eoi_write callback Michael S. Tsirkin
@ 2012-04-23 14:04 ` Michael S. Tsirkin
  2012-04-23 14:04 ` [PATCH RFC dontapply 5/5] kvm_para: guest side for eoi avoidance Michael S. Tsirkin
  2012-05-07 10:35 ` [PATCH RFC 0/5] apic: eoi optimization support Ingo Molnar
  5 siblings, 0 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2012-04-23 14:04 UTC (permalink / raw)
  To: x86, kvm
  Cc: Ingo Molnar, H. Peter Anvin, Avi Kivity, Marcelo Tosatti, gleb,
	Linus Torvalds, linux-kernel

We know both register and value for eoi beforehand,
so there's no need to check it and no need to do math
to calculate the msr. Saves instructions/branches
on each EOI when using x2apic.

I checked the generated code to verify that it
actually is shorter, but haven't done any
performance testing. What's appropriate here
and what kind of test makes the most sense?

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 arch/x86/include/asm/apic.h           |    5 +++++
 arch/x86/kernel/apic/x2apic_cluster.c |    2 +-
 arch/x86/kernel/apic/x2apic_phys.c    |    2 +-
 arch/x86/kernel/apic/x2apic_uv_x.c    |    2 +-
 4 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 74efb8d..5eb6d56 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -138,6 +138,11 @@ static inline void native_apic_msr_write(u32 reg, u32 v)
 	wrmsr(APIC_BASE_MSR + (reg >> 4), v, 0);
 }
 
+static inline void native_apic_msr_eoi_write(u32 reg, u32 v)
+{
+	wrmsr(APIC_BASE_MSR + (APIC_EOI >> 4), APIC_EOI_ACK, 0);
+}
+
 static inline u32 native_apic_msr_read(u32 reg)
 {
 	u64 msr;
diff --git a/arch/x86/kernel/apic/x2apic_cluster.c b/arch/x86/kernel/apic/x2apic_cluster.c
index a5baa78..ff35cff 100644
--- a/arch/x86/kernel/apic/x2apic_cluster.c
+++ b/arch/x86/kernel/apic/x2apic_cluster.c
@@ -260,7 +260,7 @@ static struct apic apic_x2apic_cluster = {
 
 	.read				= native_apic_msr_read,
 	.write				= native_apic_msr_write,
-	.eoi_write			= native_apic_msr_write,
+	.eoi_write			= native_apic_msr_eoi_write,
 	.icr_read			= native_x2apic_icr_read,
 	.icr_write			= native_x2apic_icr_write,
 	.wait_icr_idle			= native_x2apic_wait_icr_idle,
diff --git a/arch/x86/kernel/apic/x2apic_phys.c b/arch/x86/kernel/apic/x2apic_phys.c
index aac165b..06a0812 100644
--- a/arch/x86/kernel/apic/x2apic_phys.c
+++ b/arch/x86/kernel/apic/x2apic_phys.c
@@ -166,7 +166,7 @@ static struct apic apic_x2apic_phys = {
 
 	.read				= native_apic_msr_read,
 	.write				= native_apic_msr_write,
-	.eoi_write			= native_apic_msr_write,
+	.eoi_write			= native_apic_msr_eoi_write,
 	.icr_read			= native_x2apic_icr_read,
 	.icr_write			= native_x2apic_icr_write,
 	.wait_icr_idle			= native_x2apic_wait_icr_idle,
diff --git a/arch/x86/kernel/apic/x2apic_uv_x.c b/arch/x86/kernel/apic/x2apic_uv_x.c
index 5b0e3d0..c6d03f7 100644
--- a/arch/x86/kernel/apic/x2apic_uv_x.c
+++ b/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -404,7 +404,7 @@ static struct apic __refdata apic_x2apic_uv_x = {
 
 	.read				= native_apic_msr_read,
 	.write				= native_apic_msr_write,
-	.eoi_write			= native_apic_msr_write,
+	.eoi_write			= native_apic_msr_eoi_write,
 	.icr_read			= native_x2apic_icr_read,
 	.icr_write			= native_x2apic_icr_write,
 	.wait_icr_idle			= native_x2apic_wait_icr_idle,
-- 
1.7.9.111.gf3fb0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH RFC dontapply 5/5] kvm_para: guest side for eoi avoidance
  2012-04-23 14:03 [PATCH RFC 0/5] apic: eoi optimization support Michael S. Tsirkin
                   ` (3 preceding siblings ...)
  2012-04-23 14:04 ` [PATCH RFC 4/5] x86: eoi micro-optimization Michael S. Tsirkin
@ 2012-04-23 14:04 ` Michael S. Tsirkin
  2012-04-24  6:50   ` Gleb Natapov
  2012-05-08 15:26   ` Paolo Bonzini
  2012-05-07 10:35 ` [PATCH RFC 0/5] apic: eoi optimization support Ingo Molnar
  5 siblings, 2 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2012-04-23 14:04 UTC (permalink / raw)
  To: x86, kvm
  Cc: Ingo Molnar, H. Peter Anvin, Avi Kivity, Marcelo Tosatti, gleb,
	Linus Torvalds, linux-kernel

The idea is simple: there's a bit, per APIC, in guest memory,
that tells the guest that it does not need EOI.
Guest tests it using a single est and clear operation - this is
necessary so that host can detect interrupt nesting - and if set, it can
skip the EOI MSR.

This patch is not final yet, pls don't apply
until host side is also finalized. Including
it here for completeness to show another user
of the new eoi_write interface.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 arch/x86/include/asm/bitops.h   |    6 +++-
 arch/x86/include/asm/kvm_para.h |    2 +
 arch/x86/kernel/kvm.c           |   51 ++++++++++++++++++++++++++++++++++++--
 3 files changed, 54 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index b97596e..c9c70ea 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -26,11 +26,13 @@
 #if __GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 1)
 /* Technically wrong, but this avoids compilation errors on some gcc
    versions. */
-#define BITOP_ADDR(x) "=m" (*(volatile long *) (x))
+#define BITOP_ADDR_CONSTRAINT "=m"
 #else
-#define BITOP_ADDR(x) "+m" (*(volatile long *) (x))
+#define BITOP_ADDR_CONSTRAINT "+m"
 #endif
 
+#define BITOP_ADDR(x) BITOP_ADDR_CONSTRAINT (*(volatile long *) (x))
+
 #define ADDR				BITOP_ADDR(addr)
 
 /*
diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
index 734c376..195840d 100644
--- a/arch/x86/include/asm/kvm_para.h
+++ b/arch/x86/include/asm/kvm_para.h
@@ -22,6 +22,7 @@
 #define KVM_FEATURE_CLOCKSOURCE2        3
 #define KVM_FEATURE_ASYNC_PF		4
 #define KVM_FEATURE_STEAL_TIME		5
+#define KVM_FEATURE_EOI			6
 
 /* The last 8 bits are used to indicate how to interpret the flags field
  * in pvclock structure. If no bits are set, all flags are ignored.
@@ -37,6 +38,7 @@
 #define MSR_KVM_SYSTEM_TIME_NEW 0x4b564d01
 #define MSR_KVM_ASYNC_PF_EN 0x4b564d02
 #define MSR_KVM_STEAL_TIME  0x4b564d03
+#define MSR_KVM_EOI_EN      0x4b564d04
 
 struct kvm_steal_time {
 	__u64 steal;
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index b8ba6e4..ad66cdd 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -39,6 +39,8 @@
 #include <asm/desc.h>
 #include <asm/tlbflush.h>
 #include <asm/idle.h>
+#include <asm/apic.h>
+#include <asm/apicdef.h>
 
 static int kvmapf = 1;
 
@@ -290,6 +292,27 @@ static void kvm_register_steal_time(void)
 		cpu, __pa(st));
 }
 
+static DEFINE_PER_CPU(u16, kvm_apic_eoi) __aligned(2) = 0;
+
+/* Our own copy of __test_and_clear_bit to make sure
+ * it is done with a single instruction */
+static inline int kvm_test_and_clear_bit(int nr, volatile u16* addr)
+{
+	int oldbit;
+
+	asm volatile("btr %2,%1\n\t"
+		     "sbb %0,%0"
+		     : "=r" (oldbit), BITOP_ADDR_CONSTRAINT(*addr) : "Ir" (nr));
+	return oldbit;
+}
+
+static void kvm_guest_apic_eoi_write(u32 reg, u32 val)
+{
+	if (kvm_test_and_clear_bit(0, &__get_cpu_var(kvm_apic_eoi)))
+		return;
+	apic->write(APIC_EOI, APIC_EOI_ACK);
+}
+
 void __cpuinit kvm_guest_cpu_init(void)
 {
 	if (!kvm_para_available())
@@ -307,11 +330,17 @@ void __cpuinit kvm_guest_cpu_init(void)
 		       smp_processor_id());
 	}
 
+	if (kvm_para_has_feature(KVM_FEATURE_EOI)) {
+		__get_cpu_var(kvm_apic_eoi) = 0;
+		wrmsrl(MSR_KVM_EOI_EN, __pa(&__get_cpu_var(kvm_apic_eoi)) |
+		       KVM_MSR_ENABLED);
+	}
+
 	if (has_steal_clock)
 		kvm_register_steal_time();
 }
 
-static void kvm_pv_disable_apf(void *unused)
+static void kvm_pv_disable_apf(void)
 {
 	if (!__get_cpu_var(apf_reason).enabled)
 		return;
@@ -323,11 +352,18 @@ static void kvm_pv_disable_apf(void *unused)
 	       smp_processor_id());
 }
 
+static void kvm_pv_guest_cpu_reboot(void *unused)
+{
+	if (kvm_para_has_feature(KVM_FEATURE_EOI))
+		wrmsrl(MSR_KVM_EOI_EN, 0);
+	kvm_pv_disable_apf();
+}
+
 static int kvm_pv_reboot_notify(struct notifier_block *nb,
 				unsigned long code, void *unused)
 {
 	if (code == SYS_RESTART)
-		on_each_cpu(kvm_pv_disable_apf, NULL, 1);
+		on_each_cpu(kvm_pv_guest_cpu_reboot, NULL, 1);
 	return NOTIFY_DONE;
 }
 
@@ -378,7 +414,9 @@ static void __cpuinit kvm_guest_cpu_online(void *dummy)
 static void kvm_guest_cpu_offline(void *dummy)
 {
 	kvm_disable_steal_time();
-	kvm_pv_disable_apf(NULL);
+	if (kvm_para_has_feature(KVM_FEATURE_EOI))
+		wrmsrl(MSR_KVM_EOI_EN, 0);
+	kvm_pv_disable_apf();
 	apf_task_wake_all();
 }
 
@@ -431,6 +469,13 @@ void __init kvm_guest_init(void)
 		pv_time_ops.steal_clock = kvm_steal_clock;
 	}
 
+	if (kvm_para_has_feature(KVM_FEATURE_EOI)) {
+		/* Should happen once after apic is initialized */
+		BUG_ON(!apic);
+		BUG_ON(apic->eoi_write == kvm_guest_apic_eoi_write);
+		apic->eoi_write = kvm_guest_apic_eoi_write;
+	}
+
 #ifdef CONFIG_SMP
 	smp_ops.smp_prepare_boot_cpu = kvm_smp_prepare_boot_cpu;
 	register_cpu_notifier(&kvm_cpu_notifier);
-- 
1.7.9.111.gf3fb0

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC dontapply 5/5] kvm_para: guest side for eoi avoidance
  2012-04-23 14:04 ` [PATCH RFC dontapply 5/5] kvm_para: guest side for eoi avoidance Michael S. Tsirkin
@ 2012-04-24  6:50   ` Gleb Natapov
  2012-04-24  6:58     ` Michael S. Tsirkin
  2012-05-08 15:26   ` Paolo Bonzini
  1 sibling, 1 reply; 21+ messages in thread
From: Gleb Natapov @ 2012-04-24  6:50 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: x86, kvm, Ingo Molnar, H. Peter Anvin, Avi Kivity,
	Marcelo Tosatti, Linus Torvalds, linux-kernel

On Mon, Apr 23, 2012 at 05:04:40PM +0300, Michael S. Tsirkin wrote:
> The idea is simple: there's a bit, per APIC, in guest memory,
> that tells the guest that it does not need EOI.
> Guest tests it using a single est and clear operation - this is
> necessary so that host can detect interrupt nesting - and if set, it can
> skip the EOI MSR.
> 
> This patch is not final yet, pls don't apply
> until host side is also finalized. Including
> it here for completeness to show another user
> of the new eoi_write interface.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>  arch/x86/include/asm/bitops.h   |    6 +++-
>  arch/x86/include/asm/kvm_para.h |    2 +
>  arch/x86/kernel/kvm.c           |   51 ++++++++++++++++++++++++++++++++++++--
>  3 files changed, 54 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
> index b97596e..c9c70ea 100644
> --- a/arch/x86/include/asm/bitops.h
> +++ b/arch/x86/include/asm/bitops.h
> @@ -26,11 +26,13 @@
>  #if __GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 1)
>  /* Technically wrong, but this avoids compilation errors on some gcc
>     versions. */
> -#define BITOP_ADDR(x) "=m" (*(volatile long *) (x))
> +#define BITOP_ADDR_CONSTRAINT "=m"
>  #else
> -#define BITOP_ADDR(x) "+m" (*(volatile long *) (x))
> +#define BITOP_ADDR_CONSTRAINT "+m"
>  #endif
>  
> +#define BITOP_ADDR(x) BITOP_ADDR_CONSTRAINT (*(volatile long *) (x))
> +
>  #define ADDR				BITOP_ADDR(addr)
>  
>  /*
> diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
> index 734c376..195840d 100644
> --- a/arch/x86/include/asm/kvm_para.h
> +++ b/arch/x86/include/asm/kvm_para.h
> @@ -22,6 +22,7 @@
>  #define KVM_FEATURE_CLOCKSOURCE2        3
>  #define KVM_FEATURE_ASYNC_PF		4
>  #define KVM_FEATURE_STEAL_TIME		5
> +#define KVM_FEATURE_EOI			6
>  
>  /* The last 8 bits are used to indicate how to interpret the flags field
>   * in pvclock structure. If no bits are set, all flags are ignored.
> @@ -37,6 +38,7 @@
>  #define MSR_KVM_SYSTEM_TIME_NEW 0x4b564d01
>  #define MSR_KVM_ASYNC_PF_EN 0x4b564d02
>  #define MSR_KVM_STEAL_TIME  0x4b564d03
> +#define MSR_KVM_EOI_EN      0x4b564d04
>  
>  struct kvm_steal_time {
>  	__u64 steal;
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index b8ba6e4..ad66cdd 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -39,6 +39,8 @@
>  #include <asm/desc.h>
>  #include <asm/tlbflush.h>
>  #include <asm/idle.h>
> +#include <asm/apic.h>
> +#include <asm/apicdef.h>
>  
>  static int kvmapf = 1;
>  
> @@ -290,6 +292,27 @@ static void kvm_register_steal_time(void)
>  		cpu, __pa(st));
>  }
>  
> +static DEFINE_PER_CPU(u16, kvm_apic_eoi) __aligned(2) = 0;
> +
> +/* Our own copy of __test_and_clear_bit to make sure
> + * it is done with a single instruction */
> +static inline int kvm_test_and_clear_bit(int nr, volatile u16* addr)
> +{
> +	int oldbit;
> +
> +	asm volatile("btr %2,%1\n\t"
> +		     "sbb %0,%0"
> +		     : "=r" (oldbit), BITOP_ADDR_CONSTRAINT(*addr) : "Ir" (nr));
> +	return oldbit;
> +}
> +
> +static void kvm_guest_apic_eoi_write(u32 reg, u32 val)
> +{
> +	if (kvm_test_and_clear_bit(0, &__get_cpu_var(kvm_apic_eoi)))
> +		return;
> +	apic->write(APIC_EOI, APIC_EOI_ACK);
> +}
> +
>  void __cpuinit kvm_guest_cpu_init(void)
>  {
>  	if (!kvm_para_available())
> @@ -307,11 +330,17 @@ void __cpuinit kvm_guest_cpu_init(void)
>  		       smp_processor_id());
>  	}
>  
> +	if (kvm_para_has_feature(KVM_FEATURE_EOI)) {
> +		__get_cpu_var(kvm_apic_eoi) = 0;
> +		wrmsrl(MSR_KVM_EOI_EN, __pa(&__get_cpu_var(kvm_apic_eoi)) |
> +		       KVM_MSR_ENABLED);
> +	}
> +
>  	if (has_steal_clock)
>  		kvm_register_steal_time();
>  }
>  
> -static void kvm_pv_disable_apf(void *unused)
> +static void kvm_pv_disable_apf(void)
>  {
>  	if (!__get_cpu_var(apf_reason).enabled)
>  		return;
> @@ -323,11 +352,18 @@ static void kvm_pv_disable_apf(void *unused)
>  	       smp_processor_id());
>  }
>  
> +static void kvm_pv_guest_cpu_reboot(void *unused)
> +{
> +	if (kvm_para_has_feature(KVM_FEATURE_EOI))
> +		wrmsrl(MSR_KVM_EOI_EN, 0);
> +	kvm_pv_disable_apf();
> +}
> +
>  static int kvm_pv_reboot_notify(struct notifier_block *nb,
>  				unsigned long code, void *unused)
>  {
>  	if (code == SYS_RESTART)
> -		on_each_cpu(kvm_pv_disable_apf, NULL, 1);
> +		on_each_cpu(kvm_pv_guest_cpu_reboot, NULL, 1);
>  	return NOTIFY_DONE;
>  }
>  
> @@ -378,7 +414,9 @@ static void __cpuinit kvm_guest_cpu_online(void *dummy)
>  static void kvm_guest_cpu_offline(void *dummy)
>  {
>  	kvm_disable_steal_time();
> -	kvm_pv_disable_apf(NULL);
> +	if (kvm_para_has_feature(KVM_FEATURE_EOI))
> +		wrmsrl(MSR_KVM_EOI_EN, 0);
> +	kvm_pv_disable_apf();
>  	apf_task_wake_all();
>  }
>  
> @@ -431,6 +469,13 @@ void __init kvm_guest_init(void)
>  		pv_time_ops.steal_clock = kvm_steal_clock;
>  	}
>  
> +	if (kvm_para_has_feature(KVM_FEATURE_EOI)) {
> +		/* Should happen once after apic is initialized */
> +		BUG_ON(!apic);
What if kernel was started with noapic?

> +		BUG_ON(apic->eoi_write == kvm_guest_apic_eoi_write);
> +		apic->eoi_write = kvm_guest_apic_eoi_write;
> +	}
> +
>  #ifdef CONFIG_SMP
>  	smp_ops.smp_prepare_boot_cpu = kvm_smp_prepare_boot_cpu;
>  	register_cpu_notifier(&kvm_cpu_notifier);
> -- 
> 1.7.9.111.gf3fb0

--
			Gleb.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC dontapply 5/5] kvm_para: guest side for eoi avoidance
  2012-04-24  6:50   ` Gleb Natapov
@ 2012-04-24  6:58     ` Michael S. Tsirkin
  2012-04-24  7:07       ` Gleb Natapov
  0 siblings, 1 reply; 21+ messages in thread
From: Michael S. Tsirkin @ 2012-04-24  6:58 UTC (permalink / raw)
  To: Gleb Natapov
  Cc: x86, kvm, Ingo Molnar, H. Peter Anvin, Avi Kivity,
	Marcelo Tosatti, Linus Torvalds, linux-kernel

On Tue, Apr 24, 2012 at 09:50:02AM +0300, Gleb Natapov wrote:
> On Mon, Apr 23, 2012 at 05:04:40PM +0300, Michael S. Tsirkin wrote:
> > The idea is simple: there's a bit, per APIC, in guest memory,
> > that tells the guest that it does not need EOI.
> > Guest tests it using a single est and clear operation - this is
> > necessary so that host can detect interrupt nesting - and if set, it can
> > skip the EOI MSR.
> > 
> > This patch is not final yet, pls don't apply
> > until host side is also finalized. Including
> > it here for completeness to show another user
> > of the new eoi_write interface.
> > 
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Just pointing out the above: this last patch in the
series is for illustrative purposes at this point.

> > ---
> >  arch/x86/include/asm/bitops.h   |    6 +++-
> >  arch/x86/include/asm/kvm_para.h |    2 +
> >  arch/x86/kernel/kvm.c           |   51 ++++++++++++++++++++++++++++++++++++--
> >  3 files changed, 54 insertions(+), 5 deletions(-)
> > 
> > diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
> > index b97596e..c9c70ea 100644
> > --- a/arch/x86/include/asm/bitops.h
> > +++ b/arch/x86/include/asm/bitops.h
> > @@ -26,11 +26,13 @@
> >  #if __GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 1)
> >  /* Technically wrong, but this avoids compilation errors on some gcc
> >     versions. */
> > -#define BITOP_ADDR(x) "=m" (*(volatile long *) (x))
> > +#define BITOP_ADDR_CONSTRAINT "=m"
> >  #else
> > -#define BITOP_ADDR(x) "+m" (*(volatile long *) (x))
> > +#define BITOP_ADDR_CONSTRAINT "+m"
> >  #endif
> >  
> > +#define BITOP_ADDR(x) BITOP_ADDR_CONSTRAINT (*(volatile long *) (x))
> > +
> >  #define ADDR				BITOP_ADDR(addr)
> >  
> >  /*
> > diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
> > index 734c376..195840d 100644
> > --- a/arch/x86/include/asm/kvm_para.h
> > +++ b/arch/x86/include/asm/kvm_para.h
> > @@ -22,6 +22,7 @@
> >  #define KVM_FEATURE_CLOCKSOURCE2        3
> >  #define KVM_FEATURE_ASYNC_PF		4
> >  #define KVM_FEATURE_STEAL_TIME		5
> > +#define KVM_FEATURE_EOI			6
> >  
> >  /* The last 8 bits are used to indicate how to interpret the flags field
> >   * in pvclock structure. If no bits are set, all flags are ignored.
> > @@ -37,6 +38,7 @@
> >  #define MSR_KVM_SYSTEM_TIME_NEW 0x4b564d01
> >  #define MSR_KVM_ASYNC_PF_EN 0x4b564d02
> >  #define MSR_KVM_STEAL_TIME  0x4b564d03
> > +#define MSR_KVM_EOI_EN      0x4b564d04
> >  
> >  struct kvm_steal_time {
> >  	__u64 steal;
> > diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> > index b8ba6e4..ad66cdd 100644
> > --- a/arch/x86/kernel/kvm.c
> > +++ b/arch/x86/kernel/kvm.c
> > @@ -39,6 +39,8 @@
> >  #include <asm/desc.h>
> >  #include <asm/tlbflush.h>
> >  #include <asm/idle.h>
> > +#include <asm/apic.h>
> > +#include <asm/apicdef.h>
> >  
> >  static int kvmapf = 1;
> >  
> > @@ -290,6 +292,27 @@ static void kvm_register_steal_time(void)
> >  		cpu, __pa(st));
> >  }
> >  
> > +static DEFINE_PER_CPU(u16, kvm_apic_eoi) __aligned(2) = 0;
> > +
> > +/* Our own copy of __test_and_clear_bit to make sure
> > + * it is done with a single instruction */
> > +static inline int kvm_test_and_clear_bit(int nr, volatile u16* addr)
> > +{
> > +	int oldbit;
> > +
> > +	asm volatile("btr %2,%1\n\t"
> > +		     "sbb %0,%0"
> > +		     : "=r" (oldbit), BITOP_ADDR_CONSTRAINT(*addr) : "Ir" (nr));
> > +	return oldbit;
> > +}
> > +
> > +static void kvm_guest_apic_eoi_write(u32 reg, u32 val)
> > +{
> > +	if (kvm_test_and_clear_bit(0, &__get_cpu_var(kvm_apic_eoi)))
> > +		return;
> > +	apic->write(APIC_EOI, APIC_EOI_ACK);
> > +}
> > +
> >  void __cpuinit kvm_guest_cpu_init(void)
> >  {
> >  	if (!kvm_para_available())
> > @@ -307,11 +330,17 @@ void __cpuinit kvm_guest_cpu_init(void)
> >  		       smp_processor_id());
> >  	}
> >  
> > +	if (kvm_para_has_feature(KVM_FEATURE_EOI)) {
> > +		__get_cpu_var(kvm_apic_eoi) = 0;
> > +		wrmsrl(MSR_KVM_EOI_EN, __pa(&__get_cpu_var(kvm_apic_eoi)) |
> > +		       KVM_MSR_ENABLED);
> > +	}
> > +
> >  	if (has_steal_clock)
> >  		kvm_register_steal_time();
> >  }
> >  
> > -static void kvm_pv_disable_apf(void *unused)
> > +static void kvm_pv_disable_apf(void)
> >  {
> >  	if (!__get_cpu_var(apf_reason).enabled)
> >  		return;
> > @@ -323,11 +352,18 @@ static void kvm_pv_disable_apf(void *unused)
> >  	       smp_processor_id());
> >  }
> >  
> > +static void kvm_pv_guest_cpu_reboot(void *unused)
> > +{
> > +	if (kvm_para_has_feature(KVM_FEATURE_EOI))
> > +		wrmsrl(MSR_KVM_EOI_EN, 0);
> > +	kvm_pv_disable_apf();
> > +}
> > +
> >  static int kvm_pv_reboot_notify(struct notifier_block *nb,
> >  				unsigned long code, void *unused)
> >  {
> >  	if (code == SYS_RESTART)
> > -		on_each_cpu(kvm_pv_disable_apf, NULL, 1);
> > +		on_each_cpu(kvm_pv_guest_cpu_reboot, NULL, 1);
> >  	return NOTIFY_DONE;
> >  }
> >  
> > @@ -378,7 +414,9 @@ static void __cpuinit kvm_guest_cpu_online(void *dummy)
> >  static void kvm_guest_cpu_offline(void *dummy)
> >  {
> >  	kvm_disable_steal_time();
> > -	kvm_pv_disable_apf(NULL);
> > +	if (kvm_para_has_feature(KVM_FEATURE_EOI))
> > +		wrmsrl(MSR_KVM_EOI_EN, 0);
> > +	kvm_pv_disable_apf();
> >  	apf_task_wake_all();
> >  }
> >  
> > @@ -431,6 +469,13 @@ void __init kvm_guest_init(void)
> >  		pv_time_ops.steal_clock = kvm_steal_clock;
> >  	}
> >  
> > +	if (kvm_para_has_feature(KVM_FEATURE_EOI)) {
> > +		/* Should happen once after apic is initialized */
> > +		BUG_ON(!apic);
> What if kernel was started with noapic?

I didn't test this yet. I thought noapic sets the apic pointer to
apic_noop, not to NULL.

So this case will get slowed down slightly if it
actually runs on kvm. It does not look like noapic
guest is worth optimising for.

> > +		BUG_ON(apic->eoi_write == kvm_guest_apic_eoi_write);
> > +		apic->eoi_write = kvm_guest_apic_eoi_write;
> > +	}
> > +
> >  #ifdef CONFIG_SMP
> >  	smp_ops.smp_prepare_boot_cpu = kvm_smp_prepare_boot_cpu;
> >  	register_cpu_notifier(&kvm_cpu_notifier);
> > -- 
> > 1.7.9.111.gf3fb0
> 
> --
> 			Gleb.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC dontapply 5/5] kvm_para: guest side for eoi avoidance
  2012-04-24  6:58     ` Michael S. Tsirkin
@ 2012-04-24  7:07       ` Gleb Natapov
  0 siblings, 0 replies; 21+ messages in thread
From: Gleb Natapov @ 2012-04-24  7:07 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: x86, kvm, Ingo Molnar, H. Peter Anvin, Avi Kivity,
	Marcelo Tosatti, Linus Torvalds, linux-kernel

On Tue, Apr 24, 2012 at 09:58:35AM +0300, Michael S. Tsirkin wrote:
> On Tue, Apr 24, 2012 at 09:50:02AM +0300, Gleb Natapov wrote:
> > On Mon, Apr 23, 2012 at 05:04:40PM +0300, Michael S. Tsirkin wrote:
> > > The idea is simple: there's a bit, per APIC, in guest memory,
> > > that tells the guest that it does not need EOI.
> > > Guest tests it using a single est and clear operation - this is
> > > necessary so that host can detect interrupt nesting - and if set, it can
> > > skip the EOI MSR.
> > > 
> > > This patch is not final yet, pls don't apply
> > > until host side is also finalized. Including
> > > it here for completeness to show another user
> > > of the new eoi_write interface.
> > > 
> > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> 
> Just pointing out the above: this last patch in the
> series is for illustrative purposes at this point.
> 
> > > ---
> > >  arch/x86/include/asm/bitops.h   |    6 +++-
> > >  arch/x86/include/asm/kvm_para.h |    2 +
> > >  arch/x86/kernel/kvm.c           |   51 ++++++++++++++++++++++++++++++++++++--
> > >  3 files changed, 54 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
> > > index b97596e..c9c70ea 100644
> > > --- a/arch/x86/include/asm/bitops.h
> > > +++ b/arch/x86/include/asm/bitops.h
> > > @@ -26,11 +26,13 @@
> > >  #if __GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 1)
> > >  /* Technically wrong, but this avoids compilation errors on some gcc
> > >     versions. */
> > > -#define BITOP_ADDR(x) "=m" (*(volatile long *) (x))
> > > +#define BITOP_ADDR_CONSTRAINT "=m"
> > >  #else
> > > -#define BITOP_ADDR(x) "+m" (*(volatile long *) (x))
> > > +#define BITOP_ADDR_CONSTRAINT "+m"
> > >  #endif
> > >  
> > > +#define BITOP_ADDR(x) BITOP_ADDR_CONSTRAINT (*(volatile long *) (x))
> > > +
> > >  #define ADDR				BITOP_ADDR(addr)
> > >  
> > >  /*
> > > diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
> > > index 734c376..195840d 100644
> > > --- a/arch/x86/include/asm/kvm_para.h
> > > +++ b/arch/x86/include/asm/kvm_para.h
> > > @@ -22,6 +22,7 @@
> > >  #define KVM_FEATURE_CLOCKSOURCE2        3
> > >  #define KVM_FEATURE_ASYNC_PF		4
> > >  #define KVM_FEATURE_STEAL_TIME		5
> > > +#define KVM_FEATURE_EOI			6
> > >  
> > >  /* The last 8 bits are used to indicate how to interpret the flags field
> > >   * in pvclock structure. If no bits are set, all flags are ignored.
> > > @@ -37,6 +38,7 @@
> > >  #define MSR_KVM_SYSTEM_TIME_NEW 0x4b564d01
> > >  #define MSR_KVM_ASYNC_PF_EN 0x4b564d02
> > >  #define MSR_KVM_STEAL_TIME  0x4b564d03
> > > +#define MSR_KVM_EOI_EN      0x4b564d04
> > >  
> > >  struct kvm_steal_time {
> > >  	__u64 steal;
> > > diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> > > index b8ba6e4..ad66cdd 100644
> > > --- a/arch/x86/kernel/kvm.c
> > > +++ b/arch/x86/kernel/kvm.c
> > > @@ -39,6 +39,8 @@
> > >  #include <asm/desc.h>
> > >  #include <asm/tlbflush.h>
> > >  #include <asm/idle.h>
> > > +#include <asm/apic.h>
> > > +#include <asm/apicdef.h>
> > >  
> > >  static int kvmapf = 1;
> > >  
> > > @@ -290,6 +292,27 @@ static void kvm_register_steal_time(void)
> > >  		cpu, __pa(st));
> > >  }
> > >  
> > > +static DEFINE_PER_CPU(u16, kvm_apic_eoi) __aligned(2) = 0;
> > > +
> > > +/* Our own copy of __test_and_clear_bit to make sure
> > > + * it is done with a single instruction */
> > > +static inline int kvm_test_and_clear_bit(int nr, volatile u16* addr)
> > > +{
> > > +	int oldbit;
> > > +
> > > +	asm volatile("btr %2,%1\n\t"
> > > +		     "sbb %0,%0"
> > > +		     : "=r" (oldbit), BITOP_ADDR_CONSTRAINT(*addr) : "Ir" (nr));
> > > +	return oldbit;
> > > +}
> > > +
> > > +static void kvm_guest_apic_eoi_write(u32 reg, u32 val)
> > > +{
> > > +	if (kvm_test_and_clear_bit(0, &__get_cpu_var(kvm_apic_eoi)))
> > > +		return;
> > > +	apic->write(APIC_EOI, APIC_EOI_ACK);
> > > +}
> > > +
> > >  void __cpuinit kvm_guest_cpu_init(void)
> > >  {
> > >  	if (!kvm_para_available())
> > > @@ -307,11 +330,17 @@ void __cpuinit kvm_guest_cpu_init(void)
> > >  		       smp_processor_id());
> > >  	}
> > >  
> > > +	if (kvm_para_has_feature(KVM_FEATURE_EOI)) {
> > > +		__get_cpu_var(kvm_apic_eoi) = 0;
> > > +		wrmsrl(MSR_KVM_EOI_EN, __pa(&__get_cpu_var(kvm_apic_eoi)) |
> > > +		       KVM_MSR_ENABLED);
> > > +	}
> > > +
> > >  	if (has_steal_clock)
> > >  		kvm_register_steal_time();
> > >  }
> > >  
> > > -static void kvm_pv_disable_apf(void *unused)
> > > +static void kvm_pv_disable_apf(void)
> > >  {
> > >  	if (!__get_cpu_var(apf_reason).enabled)
> > >  		return;
> > > @@ -323,11 +352,18 @@ static void kvm_pv_disable_apf(void *unused)
> > >  	       smp_processor_id());
> > >  }
> > >  
> > > +static void kvm_pv_guest_cpu_reboot(void *unused)
> > > +{
> > > +	if (kvm_para_has_feature(KVM_FEATURE_EOI))
> > > +		wrmsrl(MSR_KVM_EOI_EN, 0);
> > > +	kvm_pv_disable_apf();
> > > +}
> > > +
> > >  static int kvm_pv_reboot_notify(struct notifier_block *nb,
> > >  				unsigned long code, void *unused)
> > >  {
> > >  	if (code == SYS_RESTART)
> > > -		on_each_cpu(kvm_pv_disable_apf, NULL, 1);
> > > +		on_each_cpu(kvm_pv_guest_cpu_reboot, NULL, 1);
> > >  	return NOTIFY_DONE;
> > >  }
> > >  
> > > @@ -378,7 +414,9 @@ static void __cpuinit kvm_guest_cpu_online(void *dummy)
> > >  static void kvm_guest_cpu_offline(void *dummy)
> > >  {
> > >  	kvm_disable_steal_time();
> > > -	kvm_pv_disable_apf(NULL);
> > > +	if (kvm_para_has_feature(KVM_FEATURE_EOI))
> > > +		wrmsrl(MSR_KVM_EOI_EN, 0);
> > > +	kvm_pv_disable_apf();
> > >  	apf_task_wake_all();
> > >  }
> > >  
> > > @@ -431,6 +469,13 @@ void __init kvm_guest_init(void)
> > >  		pv_time_ops.steal_clock = kvm_steal_clock;
> > >  	}
> > >  
> > > +	if (kvm_para_has_feature(KVM_FEATURE_EOI)) {
> > > +		/* Should happen once after apic is initialized */
> > > +		BUG_ON(!apic);
> > What if kernel was started with noapic?
> 
> I didn't test this yet. I thought noapic sets the apic pointer to
> apic_noop, not to NULL.
Actually it is nolapic and it exists only for 32bit kernels.

> 
> So this case will get slowed down slightly if it
> actually runs on kvm. It does not look like noapic
> guest is worth optimising for.
> 
It is not. Just make sure it does not trigger the assertion.

> > > +		BUG_ON(apic->eoi_write == kvm_guest_apic_eoi_write);
> > > +		apic->eoi_write = kvm_guest_apic_eoi_write;
> > > +	}
> > > +
> > >  #ifdef CONFIG_SMP
> > >  	smp_ops.smp_prepare_boot_cpu = kvm_smp_prepare_boot_cpu;
> > >  	register_cpu_notifier(&kvm_cpu_notifier);
> > > -- 
> > > 1.7.9.111.gf3fb0
> > 
> > --
> > 			Gleb.

--
			Gleb.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 0/5] apic: eoi optimization support
  2012-04-23 14:03 [PATCH RFC 0/5] apic: eoi optimization support Michael S. Tsirkin
                   ` (4 preceding siblings ...)
  2012-04-23 14:04 ` [PATCH RFC dontapply 5/5] kvm_para: guest side for eoi avoidance Michael S. Tsirkin
@ 2012-05-07 10:35 ` Ingo Molnar
  2012-05-07 10:59   ` Michael S. Tsirkin
  5 siblings, 1 reply; 21+ messages in thread
From: Ingo Molnar @ 2012-05-07 10:35 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: x86, kvm, Ingo Molnar, H. Peter Anvin, Avi Kivity,
	Marcelo Tosatti, gleb, Linus Torvalds, linux-kernel


* Michael S. Tsirkin <mst@redhat.com> wrote:

> I'm looking at reducing the interrupt overhead for virtualized guests:
> some workloads spend a large part of their time processing interrupts.
> This patchset supplies infrastructure to reduce the IRQ ack overhead on
> x86: the idea is to add an eoi_write callback that we can then optimize
> without touching other apic functionality.
> 
> The main user will be kvm: on kvm, an EOI write from the guest causes an
> expensive exit to host; we can avoid this using shared memory as the
> last patch in the series demonstrates.
> 
> But I also wrote a micro-optimized version for the regular x2apic: this
> shaves off a branch and about 9 instructions from EOI when x2apic is
> used, and a comment in ack_APIC_irq implies that someone counted
> instructions there, at some point.
> 
> Also included in the patchset are a couple of trivial macro fixes.
> 
> The patches work fine on my boxes and I did look at the
> objdump output to verify that the generated code
> for the micro-optimization patch looks right
> and actually is shorter.
> 
> Some benchmark results below (not sure what kind of
> testing is the most appropriate) show a tiny
> but measureable improvement. The tests were run on
> an AMD box with 24 cpus.
> 
> - A clean kernel build after reboot shows
> a tiny but measureable improvement in system time
> which means lower CPU overhead (though not measureable
> in total time - that is dominated by user time and fluctuates
> too much):
> 
> linux# reboot -f
> ...
> linux# make clean
> linux# time make -j 64 LOCALVERSION= 2>&1 > /dev/null
> 
> Before:
> 
> real    2m52.244s
> user    35m53.833s
> sys     6m7.194s
> 
> After:
> 
> real    2m52.827s
> user    35m48.916s
> sys     6m2.305s
> 
> - perf micro-benchmarks seem to consistently show
>   a tiny improvement in total time as well but it's below
>   the confidence level of 3 std deviations:
> 
> # ./tools/perf/perf   stat --sync --repeat 100 --null perf bench sched messaging
> ...
>        0.414666797 seconds time elapsed ( +-  1.29% )
> 
> Performance counter stats for 'perf bench sched messaging' (100 runs):
> 
>        0.395370891 seconds time elapsed
> ( +-  1.04% )
> 
> 
> # ./tools/perf/perf   stat --sync --repeat 100 --null perf bench sched pipe -l 10000
>        0.307019664 seconds time elapsed
> ( +-  0.10% )
> 
>        0.304738024 seconds time elapsed
> ( +-  0.08% )
> 
> The patches are against 3.4-rc3 - let me know if
> I need to rebase.
> 
> I think patches 1-2 are definitely a good idea,
> and patches 3-4 might be a good idea.
> Please review, and consider patches 1-4 for linux 3.5.
> 
> Thanks,
> MST
> 
> Michael S. Tsirkin (5):
>   apic: fix typo EIO_ACK -> EOI_ACK and document
>   apic: use symbolic APIC_EOI_ACK
>   x86: add apic->eoi_write callback
>   x86: eoi micro-optimization
>   kvm_para: guest side for eoi avoidance
> 
>  arch/x86/include/asm/apic.h            |   22 ++++++++++++--
>  arch/x86/include/asm/apicdef.h         |    2 +-
>  arch/x86/include/asm/bitops.h          |    6 ++-
>  arch/x86/include/asm/kvm_para.h        |    2 +
>  arch/x86/kernel/apic/apic_flat_64.c    |    2 +
>  arch/x86/kernel/apic/apic_noop.c       |    1 +
>  arch/x86/kernel/apic/apic_numachip.c   |    1 +
>  arch/x86/kernel/apic/bigsmp_32.c       |    1 +
>  arch/x86/kernel/apic/es7000_32.c       |    2 +
>  arch/x86/kernel/apic/numaq_32.c        |    1 +
>  arch/x86/kernel/apic/probe_32.c        |    1 +
>  arch/x86/kernel/apic/summit_32.c       |    1 +
>  arch/x86/kernel/apic/x2apic_cluster.c  |    1 +
>  arch/x86/kernel/apic/x2apic_phys.c     |    1 +
>  arch/x86/kernel/apic/x2apic_uv_x.c     |    1 +
>  arch/x86/kernel/kvm.c                  |   51 ++++++++++++++++++++++++++++++--
>  arch/x86/platform/visws/visws_quirks.c |    2 +-
>  17 files changed, 88 insertions(+), 10 deletions(-)

No objections from the x86 side.

In terms of advantages, could you please create perf stat runs 
that counts the number of MMIOs or so? That should show a pretty 
obvious improvement - and that is enough as proof, no need to 
try to reproduce the performance win in a noisy benchmark.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 0/5] apic: eoi optimization support
  2012-05-07 10:35 ` [PATCH RFC 0/5] apic: eoi optimization support Ingo Molnar
@ 2012-05-07 10:59   ` Michael S. Tsirkin
  2012-05-07 11:40     ` Ingo Molnar
  0 siblings, 1 reply; 21+ messages in thread
From: Michael S. Tsirkin @ 2012-05-07 10:59 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: x86, kvm, Ingo Molnar, H. Peter Anvin, Avi Kivity,
	Marcelo Tosatti, gleb, Linus Torvalds, linux-kernel

On Mon, May 07, 2012 at 12:35:12PM +0200, Ingo Molnar wrote:
> > Michael S. Tsirkin (5):
> >   apic: fix typo EIO_ACK -> EOI_ACK and document
> >   apic: use symbolic APIC_EOI_ACK
> >   x86: add apic->eoi_write callback
> >   x86: eoi micro-optimization
> >   kvm_para: guest side for eoi avoidance
> > 
> >  arch/x86/include/asm/apic.h            |   22 ++++++++++++--
> >  arch/x86/include/asm/apicdef.h         |    2 +-
> >  arch/x86/include/asm/bitops.h          |    6 ++-
> >  arch/x86/include/asm/kvm_para.h        |    2 +
> >  arch/x86/kernel/apic/apic_flat_64.c    |    2 +
> >  arch/x86/kernel/apic/apic_noop.c       |    1 +
> >  arch/x86/kernel/apic/apic_numachip.c   |    1 +
> >  arch/x86/kernel/apic/bigsmp_32.c       |    1 +
> >  arch/x86/kernel/apic/es7000_32.c       |    2 +
> >  arch/x86/kernel/apic/numaq_32.c        |    1 +
> >  arch/x86/kernel/apic/probe_32.c        |    1 +
> >  arch/x86/kernel/apic/summit_32.c       |    1 +
> >  arch/x86/kernel/apic/x2apic_cluster.c  |    1 +
> >  arch/x86/kernel/apic/x2apic_phys.c     |    1 +
> >  arch/x86/kernel/apic/x2apic_uv_x.c     |    1 +
> >  arch/x86/kernel/kvm.c                  |   51 ++++++++++++++++++++++++++++++--
> >  arch/x86/platform/visws/visws_quirks.c |    2 +-
> >  17 files changed, 88 insertions(+), 10 deletions(-)
> 
> No objections from the x86 side.

Is kvm.git a good tree to merge this through?

> In terms of advantages, could you please create perf stat runs 
> that counts the number of MMIOs or so? That should show a pretty 
> obvious improvement - and that is enough as proof, no need to 
> try to reproduce the performance win in a noisy benchmark.

You mean with kvm PV, right? On real hardware the micro-optimization
removes branches and maybe cache-misses but I don't see why would it
reduce MMIOs.

-- 
MST

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 0/5] apic: eoi optimization support
  2012-05-07 10:59   ` Michael S. Tsirkin
@ 2012-05-07 11:40     ` Ingo Molnar
  2012-05-07 11:47       ` Avi Kivity
  0 siblings, 1 reply; 21+ messages in thread
From: Ingo Molnar @ 2012-05-07 11:40 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: x86, kvm, Ingo Molnar, H. Peter Anvin, Avi Kivity,
	Marcelo Tosatti, gleb, Linus Torvalds, linux-kernel


* Michael S. Tsirkin <mst@redhat.com> wrote:

> On Mon, May 07, 2012 at 12:35:12PM +0200, Ingo Molnar wrote:
> > > Michael S. Tsirkin (5):
> > >   apic: fix typo EIO_ACK -> EOI_ACK and document
> > >   apic: use symbolic APIC_EOI_ACK
> > >   x86: add apic->eoi_write callback
> > >   x86: eoi micro-optimization
> > >   kvm_para: guest side for eoi avoidance
> > > 
> > >  arch/x86/include/asm/apic.h            |   22 ++++++++++++--
> > >  arch/x86/include/asm/apicdef.h         |    2 +-
> > >  arch/x86/include/asm/bitops.h          |    6 ++-
> > >  arch/x86/include/asm/kvm_para.h        |    2 +
> > >  arch/x86/kernel/apic/apic_flat_64.c    |    2 +
> > >  arch/x86/kernel/apic/apic_noop.c       |    1 +
> > >  arch/x86/kernel/apic/apic_numachip.c   |    1 +
> > >  arch/x86/kernel/apic/bigsmp_32.c       |    1 +
> > >  arch/x86/kernel/apic/es7000_32.c       |    2 +
> > >  arch/x86/kernel/apic/numaq_32.c        |    1 +
> > >  arch/x86/kernel/apic/probe_32.c        |    1 +
> > >  arch/x86/kernel/apic/summit_32.c       |    1 +
> > >  arch/x86/kernel/apic/x2apic_cluster.c  |    1 +
> > >  arch/x86/kernel/apic/x2apic_phys.c     |    1 +
> > >  arch/x86/kernel/apic/x2apic_uv_x.c     |    1 +
> > >  arch/x86/kernel/kvm.c                  |   51 ++++++++++++++++++++++++++++++--
> > >  arch/x86/platform/visws/visws_quirks.c |    2 +-
> > >  17 files changed, 88 insertions(+), 10 deletions(-)
> > 
> > No objections from the x86 side.
> 
> Is kvm.git a good tree to merge this through?

Fine to me, but I haven't checked how widely it conflicts with 
existing bits: by the looks of it most of the linecount is on 
the core x86 side, while the kvm change is well concentrated.

> > In terms of advantages, could you please create perf stat 
> > runs that counts the number of MMIOs or so? That should show 
> > a pretty obvious improvement - and that is enough as proof, 
> > no need to try to reproduce the performance win in a noisy 
> > benchmark.
> 
> You mean with kvm PV, right? On real hardware the 
> micro-optimization removes branches and maybe cache-misses but 
> I don't see why would it reduce MMIOs.

Yeah, on KVM. On real hw I doubt it's measurable.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 0/5] apic: eoi optimization support
  2012-05-07 11:40     ` Ingo Molnar
@ 2012-05-07 11:47       ` Avi Kivity
  2012-05-07 11:57         ` Ingo Molnar
  0 siblings, 1 reply; 21+ messages in thread
From: Avi Kivity @ 2012-05-07 11:47 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Michael S. Tsirkin, x86, kvm, Ingo Molnar, H. Peter Anvin,
	Marcelo Tosatti, gleb, Linus Torvalds, linux-kernel

On 05/07/2012 02:40 PM, Ingo Molnar wrote:
> > > 
> > > No objections from the x86 side.
> > 
> > Is kvm.git a good tree to merge this through?
>
> Fine to me, but I haven't checked how widely it conflicts with 
> existing bits: by the looks of it most of the linecount is on 
> the core x86 side, while the kvm change is well concentrated.

I don't see a problem with merging though tip.git - we're close to the
next merge window, and the guest side rarely causes conflicts.  But
please don't apply the last patch yet, I want to review it more closely
(esp. with the host side).

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC 0/5] apic: eoi optimization support
  2012-05-07 11:47       ` Avi Kivity
@ 2012-05-07 11:57         ` Ingo Molnar
  0 siblings, 0 replies; 21+ messages in thread
From: Ingo Molnar @ 2012-05-07 11:57 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Michael S. Tsirkin, x86, kvm, Ingo Molnar, H. Peter Anvin,
	Marcelo Tosatti, gleb, Linus Torvalds, linux-kernel


* Avi Kivity <avi@redhat.com> wrote:

> On 05/07/2012 02:40 PM, Ingo Molnar wrote:
> > > > 
> > > > No objections from the x86 side.
> > > 
> > > Is kvm.git a good tree to merge this through?
> >
> > Fine to me, but I haven't checked how widely it conflicts with 
> > existing bits: by the looks of it most of the linecount is on 
> > the core x86 side, while the kvm change is well concentrated.
> 
> I don't see a problem with merging though tip.git - we're close to the
> next merge window, and the guest side rarely causes conflicts.  But
> please don't apply the last patch yet, I want to review it more closely
> (esp. with the host side).

That last patch was marked "don't apply yet", so I definitely 
planned on another iteration that incorporates all the feedback 
that has been given.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC dontapply 5/5] kvm_para: guest side for eoi avoidance
  2012-04-23 14:04 ` [PATCH RFC dontapply 5/5] kvm_para: guest side for eoi avoidance Michael S. Tsirkin
  2012-04-24  6:50   ` Gleb Natapov
@ 2012-05-08 15:26   ` Paolo Bonzini
  2012-05-08 15:28     ` Gleb Natapov
  1 sibling, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2012-05-08 15:26 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: x86, kvm, Ingo Molnar, H. Peter Anvin, Avi Kivity,
	Marcelo Tosatti, gleb, Linus Torvalds, linux-kernel

Il 23/04/2012 16:04, Michael S. Tsirkin ha scritto:
> +/* Our own copy of __test_and_clear_bit to make sure
> + * it is done with a single instruction */

Is this for microoptimization or correctness?  If the latter, it does
not ensure anything without a "lock" prefix.

Paolo

> +static inline int kvm_test_and_clear_bit(int nr, volatile u16* addr)
> +{
> +	int oldbit;
> +
> +	asm volatile("btr %2,%1\n\t"
> +		     "sbb %0,%0"
> +		     : "=r" (oldbit), BITOP_ADDR_CONSTRAINT(*addr) : "Ir" (nr));
> +	return oldbit;


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC dontapply 5/5] kvm_para: guest side for eoi avoidance
  2012-05-08 15:26   ` Paolo Bonzini
@ 2012-05-08 15:28     ` Gleb Natapov
  2012-05-08 15:45       ` H. Peter Anvin
  0 siblings, 1 reply; 21+ messages in thread
From: Gleb Natapov @ 2012-05-08 15:28 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Michael S. Tsirkin, x86, kvm, Ingo Molnar, H. Peter Anvin,
	Avi Kivity, Marcelo Tosatti, Linus Torvalds, linux-kernel

On Tue, May 08, 2012 at 05:26:55PM +0200, Paolo Bonzini wrote:
> Il 23/04/2012 16:04, Michael S. Tsirkin ha scritto:
> > +/* Our own copy of __test_and_clear_bit to make sure
> > + * it is done with a single instruction */
> 
> Is this for microoptimization or correctness?  If the latter, it does
> not ensure anything without a "lock" prefix.
> 
It can't race with other vcpus, only with vmexit on the same vcpu.

> Paolo
> 
> > +static inline int kvm_test_and_clear_bit(int nr, volatile u16* addr)
> > +{
> > +	int oldbit;
> > +
> > +	asm volatile("btr %2,%1\n\t"
> > +		     "sbb %0,%0"
> > +		     : "=r" (oldbit), BITOP_ADDR_CONSTRAINT(*addr) : "Ir" (nr));
> > +	return oldbit;

--
			Gleb.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC dontapply 5/5] kvm_para: guest side for eoi avoidance
  2012-05-08 15:28     ` Gleb Natapov
@ 2012-05-08 15:45       ` H. Peter Anvin
  2012-05-08 16:32         ` Gleb Natapov
  2012-05-08 16:57         ` Michael S. Tsirkin
  0 siblings, 2 replies; 21+ messages in thread
From: H. Peter Anvin @ 2012-05-08 15:45 UTC (permalink / raw)
  To: Gleb Natapov
  Cc: Paolo Bonzini, Michael S. Tsirkin, x86, kvm, Ingo Molnar,
	Avi Kivity, Marcelo Tosatti, Linus Torvalds, linux-kernel

On 05/08/2012 08:28 AM, Gleb Natapov wrote:
> On Tue, May 08, 2012 at 05:26:55PM +0200, Paolo Bonzini wrote:
>> Il 23/04/2012 16:04, Michael S. Tsirkin ha scritto:
>>> +/* Our own copy of __test_and_clear_bit to make sure
>>> + * it is done with a single instruction */
>>
>> Is this for microoptimization or correctness?  If the latter, it does
>> not ensure anything without a "lock" prefix.
>>
> It can't race with other vcpus, only with vmexit on the same vcpu.
> 

That doesn't answer the question very well... I really don't understand
the point of having a private copy here.

I really, really don't want a bunch of private interfaces around.  It
would be a lot better to define a test_and_{set,clear}_bit_local() in
<asm/bitops.h> which is defined to be local CPU atomic.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC dontapply 5/5] kvm_para: guest side for eoi avoidance
  2012-05-08 15:45       ` H. Peter Anvin
@ 2012-05-08 16:32         ` Gleb Natapov
  2012-05-08 16:57         ` Michael S. Tsirkin
  1 sibling, 0 replies; 21+ messages in thread
From: Gleb Natapov @ 2012-05-08 16:32 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Paolo Bonzini, Michael S. Tsirkin, x86, kvm, Ingo Molnar,
	Avi Kivity, Marcelo Tosatti, Linus Torvalds, linux-kernel

On Tue, May 08, 2012 at 08:45:39AM -0700, H. Peter Anvin wrote:
> On 05/08/2012 08:28 AM, Gleb Natapov wrote:
> > On Tue, May 08, 2012 at 05:26:55PM +0200, Paolo Bonzini wrote:
> >> Il 23/04/2012 16:04, Michael S. Tsirkin ha scritto:
> >>> +/* Our own copy of __test_and_clear_bit to make sure
> >>> + * it is done with a single instruction */
> >>
> >> Is this for microoptimization or correctness?  If the latter, it does
> >> not ensure anything without a "lock" prefix.
> >>
> > It can't race with other vcpus, only with vmexit on the same vcpu.
> > 
> 
> That doesn't answer the question very well... I really don't understand
> the point of having a private copy here.
__test_and_clear_bit() is not guarantied to be local CPU atomic an that
is what we need here.

> 
> I really, really don't want a bunch of private interfaces around.  It
> would be a lot better to define a test_and_{set,clear}_bit_local() in
> <asm/bitops.h> which is defined to be local CPU atomic.
> 
Yes, this will be definitely better.

--
			Gleb.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC dontapply 5/5] kvm_para: guest side for eoi avoidance
  2012-05-08 15:45       ` H. Peter Anvin
  2012-05-08 16:32         ` Gleb Natapov
@ 2012-05-08 16:57         ` Michael S. Tsirkin
  2012-05-08 18:06           ` H. Peter Anvin
  1 sibling, 1 reply; 21+ messages in thread
From: Michael S. Tsirkin @ 2012-05-08 16:57 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Gleb Natapov, Paolo Bonzini, x86, kvm, Ingo Molnar, Avi Kivity,
	Marcelo Tosatti, Linus Torvalds, linux-kernel

On Tue, May 08, 2012 at 08:45:39AM -0700, H. Peter Anvin wrote:
> On 05/08/2012 08:28 AM, Gleb Natapov wrote:
> > On Tue, May 08, 2012 at 05:26:55PM +0200, Paolo Bonzini wrote:
> >> Il 23/04/2012 16:04, Michael S. Tsirkin ha scritto:
> >>> +/* Our own copy of __test_and_clear_bit to make sure
> >>> + * it is done with a single instruction */
> >>
> >> Is this for microoptimization or correctness?  If the latter, it does
> >> not ensure anything without a "lock" prefix.
> >>
> > It can't race with other vcpus, only with vmexit on the same vcpu.
> > 
> 
> That doesn't answer the question very well... I really don't understand
> the point of having a private copy here.
> 
> I really, really don't want a bunch of private interfaces around.  It
> would be a lot better to define a test_and_{set,clear}_bit_local() in
> <asm/bitops.h> which is defined to be local CPU atomic.
> 
> 	-hpa

Will do. OK to only define them for x86 for now?
They would be unused on other arches even if I defined them ...

> -- 
> H. Peter Anvin, Intel Open Source Technology Center
> I work for Intel.  I don't speak on their behalf.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC dontapply 5/5] kvm_para: guest side for eoi avoidance
  2012-05-08 16:57         ` Michael S. Tsirkin
@ 2012-05-08 18:06           ` H. Peter Anvin
  2012-05-08 19:36             ` Michael S. Tsirkin
  0 siblings, 1 reply; 21+ messages in thread
From: H. Peter Anvin @ 2012-05-08 18:06 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Gleb Natapov, Paolo Bonzini, x86, kvm, Ingo Molnar, Avi Kivity,
	Marcelo Tosatti, Linus Torvalds, linux-kernel

On 05/08/2012 09:57 AM, Michael S. Tsirkin wrote:
> 
> Will do. OK to only define them for x86 for now?
> They would be unused on other arches even if I defined them ...
> 

I don't know if it's possible/practical to define them so they fall back
to the fully atomic versions unless arch-overridden, but that would be
the best, I would think.

	-hpa


-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH RFC dontapply 5/5] kvm_para: guest side for eoi avoidance
  2012-05-08 18:06           ` H. Peter Anvin
@ 2012-05-08 19:36             ` Michael S. Tsirkin
  0 siblings, 0 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2012-05-08 19:36 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Gleb Natapov, Paolo Bonzini, x86, kvm, Ingo Molnar, Avi Kivity,
	Marcelo Tosatti, Linus Torvalds, linux-kernel

On Tue, May 08, 2012 at 11:06:10AM -0700, H. Peter Anvin wrote:
> On 05/08/2012 09:57 AM, Michael S. Tsirkin wrote:
> > 
> > Will do. OK to only define them for x86 for now?
> > They would be unused on other arches even if I defined them ...
> > 
> 
> I don't know if it's possible/practical to define them so they fall back
> to the fully atomic versions unless arch-overridden, but that would be
> the best, I would think.

Ah, good idea, will do.


^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2012-05-08 19:36 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-04-23 14:03 [PATCH RFC 0/5] apic: eoi optimization support Michael S. Tsirkin
2012-04-23 14:04 ` [PATCH RFC 1/5] apic: fix typo EIO_ACK -> EOI_ACK and document Michael S. Tsirkin
2012-04-23 14:04 ` [PATCH RFC 2/5] apic: use symbolic APIC_EOI_ACK Michael S. Tsirkin
2012-04-23 14:04 ` [PATCH RFC 3/5] x86: add apic->eoi_write callback Michael S. Tsirkin
2012-04-23 14:04 ` [PATCH RFC 4/5] x86: eoi micro-optimization Michael S. Tsirkin
2012-04-23 14:04 ` [PATCH RFC dontapply 5/5] kvm_para: guest side for eoi avoidance Michael S. Tsirkin
2012-04-24  6:50   ` Gleb Natapov
2012-04-24  6:58     ` Michael S. Tsirkin
2012-04-24  7:07       ` Gleb Natapov
2012-05-08 15:26   ` Paolo Bonzini
2012-05-08 15:28     ` Gleb Natapov
2012-05-08 15:45       ` H. Peter Anvin
2012-05-08 16:32         ` Gleb Natapov
2012-05-08 16:57         ` Michael S. Tsirkin
2012-05-08 18:06           ` H. Peter Anvin
2012-05-08 19:36             ` Michael S. Tsirkin
2012-05-07 10:35 ` [PATCH RFC 0/5] apic: eoi optimization support Ingo Molnar
2012-05-07 10:59   ` Michael S. Tsirkin
2012-05-07 11:40     ` Ingo Molnar
2012-05-07 11:47       ` Avi Kivity
2012-05-07 11:57         ` Ingo Molnar

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.