* [kvm-unit-tests PATCH 00/10] GIC fixes and improvements
@ 2020-11-25 15:51 ` Alexandru Elisei
  0 siblings, 0 replies; 78+ messages in thread
From: Alexandru Elisei @ 2020-11-25 15:51 UTC (permalink / raw)
  To: kvm, kvmarm, drjones; +Cc: eric.auger, andre.przywara

What started this series is Andre's SPI and group interrupts tests [1],
which prompted me to attempt to rewrite check_acked() so it's more flexible
and not so complicated to review. When I was doing that I noticed that the
message passing pattern for accesses to the acked, bad_irq and bad_sender
arrays didn't look quite right, and that turned into the first 7 patches of
the series. Even though the diffs are relatively small, they are not
trivial and the reviewer can skip them for the more palatable patches that
follow. I would still appreciate someone having a look at the memory
ordering fixes.

Patch #8 ("Split check_acked() into two functions") is where check_acked()
is reworked with an eye towards supporting different timeout values or
silent reporting without adding too many arguments to check_acked().

After changing the IPI tests, I turned my attention to the LPI tests, which
followed the same memory synchronization patterns, but invented their own
interrupt handler and testing functions. Instead of redoing the work that I
did for the IPI tests, I decided to convert the LPI tests to use the same
infrastructure. It turns out that was a good idea, because it uncovered a
test inconsistency that was hidden before. I am not familiar with the ITS,
so I'm not sure whether this is a real problem or expected behaviour; the
details are in the last patch.

[1] https://lists.cs.columbia.edu/pipermail/kvmarm/2019-November/037853.html

Alexandru Elisei (10):
  lib: arm/arm64: gicv3: Add missing barrier when sending IPIs
  lib: arm/arm64: gicv2: Add missing barrier when sending IPIs
  arm/arm64: gic: Remove memory synchronization from
    ipi_clear_active_handler()
  arm/arm64: gic: Remove unnecessary synchronization with stats_reset()
  arm/arm64: gic: Use correct memory ordering for the IPI test
  arm/arm64: gic: Check spurious and bad_sender in the active test
  arm/arm64: gic: Wait for writes to acked or spurious to complete
  arm/arm64: gic: Split check_acked() into two functions
  arm/arm64: gic: Make check_acked() more generic
  arm64: gic: Use IPI test checking for the LPI tests

 lib/arm/gic-v2.c |   4 +
 lib/arm/gic-v3.c |   3 +
 arm/gic.c        | 334 +++++++++++++++++++++++++----------------------
 3 files changed, 185 insertions(+), 156 deletions(-)

-- 
2.29.2


* [kvm-unit-tests PATCH 01/10] lib: arm/arm64: gicv3: Add missing barrier when sending IPIs
  2020-11-25 15:51 ` Alexandru Elisei
@ 2020-11-25 15:51   ` Alexandru Elisei
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexandru Elisei @ 2020-11-25 15:51 UTC (permalink / raw)
  To: kvm, kvmarm, drjones; +Cc: eric.auger, andre.przywara

One common usage for IPIs is for one CPU to write to a shared memory
location, send the IPI to kick another CPU, and the receiver to read from
the same location. Proper synchronization is needed to make sure that the
IPI receiver reads the most recent value and not stale data (for example,
the write from the sender CPU might still be in a store buffer).

For GICv3, IPIs are generated with a write to the ICC_SGI1R_EL1 register.
To make sure the memory stores are observable by other CPUs, we need a
wmb() barrier (DSB ST), which waits for stores to complete.

From the definition of DSB from ARM DDI 0487F.b, page B2-139:

"In addition, no instruction that appears in program order after the DSB
instruction can alter any state of the system or perform any part of its
functionality until the DSB completes other than:

- Being fetched from memory and decoded.
- Reading the general-purpose, SIMD and floating-point, Special-purpose, or
System registers that are directly or indirectly read without causing
side-effects."

Similar definition for armv7 (ARM DDI 0406C.d, page A3-150).

The DSB instruction is enough to prevent reordering of the GIC register
write which comes in program order after the memory access.
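
As a rough illustration of the pattern described above (this is not code
from the series), the sketch below publishes a value, executes a store
barrier and only then rings a doorbell. The names shared_data, doorbell,
store_barrier(), publish_then_kick() and receive() are hypothetical. On
arm64 the barrier expands to DSB ST, as in the commit message; on other
hosts a C11 sequentially consistent fence is used purely as a stand-in so
that the sketch builds. The doorbell variable stands in for the
ICC_SGI1R_EL1 write.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins: data read by the IPI receiver and the "IPI" itself. */
static uint64_t shared_data;
static atomic_uint doorbell;

/* DSB ST on arm64; a full C11 fence elsewhere, only so the sketch builds. */
static inline void store_barrier(void)
{
#if defined(__aarch64__)
	__asm__ __volatile__("dsb st" ::: "memory");
#else
	atomic_thread_fence(memory_order_seq_cst);
#endif
}

/* Write the payload, execute the barrier, then kick the receiver. */
static void publish_then_kick(uint64_t value)
{
	shared_data = value;
	store_barrier();
	atomic_store_explicit(&doorbell, 1, memory_order_relaxed);
}

/* Receiver side: returns 1 and fills *out if the doorbell was rung. */
static int receive(uint64_t *out)
{
	if (!atomic_load_explicit(&doorbell, memory_order_acquire))
		return 0;
	*out = shared_data;
	return 1;
}
```

Without the barrier in publish_then_kick(), nothing would stop the
doorbell write from becoming visible before the payload, which is the
stale-read case the patch guards against.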

This also matches what the Linux GICv3 irqchip driver does (commit
21ec30c0ef52 ("irqchip/gic-v3: Use wmb() instead of smb_wmb() in
gic_raise_softirq()")).

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 lib/arm/gic-v3.c | 3 +++
 arm/gic.c        | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/lib/arm/gic-v3.c b/lib/arm/gic-v3.c
index a7e2cb819746..a6afa42d5fbe 100644
--- a/lib/arm/gic-v3.c
+++ b/lib/arm/gic-v3.c
@@ -77,6 +77,9 @@ void gicv3_ipi_send_mask(int irq, const cpumask_t *dest)
 
 	assert(irq < 16);
 
+	/* Ensure stores are visible to other CPUs before sending the IPI */
+	wmb();
+
 	/*
 	 * For each cpu in the mask collect its peers, which are also in
 	 * the mask, in order to form target lists.
diff --git a/arm/gic.c b/arm/gic.c
index acb060585fae..512c83636a2e 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -275,6 +275,8 @@ static void gicv3_ipi_send_self(void)
 
 static void gicv3_ipi_send_broadcast(void)
 {
+	/* Ensure stores are visible to other CPUs before sending the IPI */
+	wmb();
 	gicv3_write_sgi1r(1ULL << 40 | IPI_IRQ << 24);
 	isb();
 }
-- 
2.29.2


* [kvm-unit-tests PATCH 02/10] lib: arm/arm64: gicv2: Add missing barrier when sending IPIs
  2020-11-25 15:51 ` Alexandru Elisei
@ 2020-11-25 15:51   ` Alexandru Elisei
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexandru Elisei @ 2020-11-25 15:51 UTC (permalink / raw)
  To: kvm, kvmarm, drjones; +Cc: eric.auger, andre.przywara

GICv2 generates IPIs with an MMIO write to the GICD_SGIR register. A common
pattern for IPI usage is for the IPI receiver to read data written to
memory by the sender. The armv7 and armv8 architectures implement a
weakly-ordered memory model, which means that barriers are required to make
sure that the expected values are observed.

It turns out that because the receiver CPU must observe the write to memory
that generated the IPI when reading the GICC_IAR MMIO register, we only
need to ensure ordering of memory accesses, and not completion. Use a
smp_wmb (DMB ISHST) barrier before sending the IPI.
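
For readers unfamiliar with the register, the values that the diff writes
to GICD_SGIR can be decomposed as in the following sketch. The field
layout is the one given by the GICv2 architecture specification; the
sgir_value() helper and the SGIR_FILTER_* names are hypothetical and are
not part of kvm-unit-tests.

```c
#include <stdint.h>

/* GICD_SGIR fields (GICv2): filter [25:24], CPU target list [23:16], SGI ID [3:0]. */
#define SGIR_FILTER_LIST	0u	/* forward to the CPUs in the target list */
#define SGIR_FILTER_OTHERS	1u	/* forward to all CPUs except the sender */
#define SGIR_FILTER_SELF	2u	/* forward only to the sender */

/* Hypothetical helper: compose a GICD_SGIR value from its fields. */
static uint32_t sgir_value(uint32_t filter, uint8_t target_list, uint32_t irq)
{
	return (filter & 0x3u) << 24 | (uint32_t)target_list << 16 | (irq & 0xfu);
}
```

With this helper, sgir_value(SGIR_FILTER_LIST, 1 << cpu, irq) corresponds
to the value written by gicv2_ipi_send_single(), while the SELF and OTHERS
filters correspond to the self and broadcast variants in arm/gic.c.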

This also matches what the Linux GICv2 irqchip driver does (more details
in commit 8adbf57fc429 ("irqchip: gic: use dmb ishst instead of dsb when
raising a softirq")).

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 lib/arm/gic-v2.c | 4 ++++
 arm/gic.c        | 2 ++
 2 files changed, 6 insertions(+)

diff --git a/lib/arm/gic-v2.c b/lib/arm/gic-v2.c
index dc6a97c600ec..da244c82de34 100644
--- a/lib/arm/gic-v2.c
+++ b/lib/arm/gic-v2.c
@@ -45,6 +45,8 @@ void gicv2_ipi_send_single(int irq, int cpu)
 {
 	assert(cpu < 8);
 	assert(irq < 16);
+
+	smp_wmb();
 	writel(1 << (cpu + 16) | irq, gicv2_dist_base() + GICD_SGIR);
 }
 
@@ -53,5 +55,7 @@ void gicv2_ipi_send_mask(int irq, const cpumask_t *dest)
 	u8 tlist = (u8)cpumask_bits(dest)[0];
 
 	assert(irq < 16);
+
+	smp_wmb();
 	writel(tlist << 16 | irq, gicv2_dist_base() + GICD_SGIR);
 }
diff --git a/arm/gic.c b/arm/gic.c
index 512c83636a2e..401ffafe4299 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -260,11 +260,13 @@ static void check_lpi_hits(int *expected, const char *msg)
 
 static void gicv2_ipi_send_self(void)
 {
+	smp_wmb();
 	writel(2 << 24 | IPI_IRQ, gicv2_dist_base() + GICD_SGIR);
 }
 
 static void gicv2_ipi_send_broadcast(void)
 {
+	smp_wmb();
 	writel(1 << 24 | IPI_IRQ, gicv2_dist_base() + GICD_SGIR);
 }
 
-- 
2.29.2


* [kvm-unit-tests PATCH 03/10] arm/arm64: gic: Remove memory synchronization from ipi_clear_active_handler()
  2020-11-25 15:51 ` Alexandru Elisei
@ 2020-11-25 15:51   ` Alexandru Elisei
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexandru Elisei @ 2020-11-25 15:51 UTC (permalink / raw)
  To: kvm, kvmarm, drjones; +Cc: eric.auger, andre.przywara

The gicv{2,3}-active test sends an IPI from the boot CPU to itself, then
checks that the interrupt has been received as expected. There is no need
to use inter-processor memory synchronization primitives on code that runs
on the same CPU, so remove the unneeded memory barriers.

The arrays are modified asynchronously (in the interrupt handler) and it is
possible for the compiler to infer that they won't be changed during normal
program flow and try to perform harmful optimizations (like stashing a
previous read in a register and reusing it). To prevent this, for GICv2,
the smp_wmb() in gicv2_ipi_send_self() is replaced with a compiler barrier.
For GICv3, the wmb() barrier in gic_ipi_send_single() already implies a
compiler barrier.
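
The effect of a compiler-only barrier can be tried on any host with the
sketch below, which uses a signal raised by the running thread as a loose
analogy for a self-IPI. This is an illustration under assumptions, not the
test's code: self_ipi_handler() and send_self_and_count() are hypothetical
names, and barrier() is defined here with the GCC/Clang empty-asm idiom,
which is assumed to match what <linux/compiler.h> provides.

```c
#include <signal.h>

/* Compiler-only barrier (GCC/Clang syntax): emits no instruction. */
#define barrier() __asm__ __volatile__("" ::: "memory")

static int acked;	/* deliberately neither volatile nor atomic */

/* Plays the role of the interrupt handler; runs on the sending thread. */
static void self_ipi_handler(int sig)
{
	(void)sig;
	++acked;
}

/* Hypothetical helper: "send an IPI to self" and report how many were acked. */
static int send_self_and_count(void)
{
	signal(SIGTERM, self_ipi_handler);
	acked = 0;
	barrier();		/* the store to acked is emitted before the kick */
	raise(SIGTERM);		/* the handler runs before raise() returns */
	barrier();		/* acked is reloaded from memory, not a register */
	signal(SIGTERM, SIG_DFL);
	return acked;
}
```

No CPU barrier is involved: sender and handler run on the same CPU, so
only the compiler has to be prevented from caching or reordering the
accesses to acked.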

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 arm/gic.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arm/gic.c b/arm/gic.c
index 401ffafe4299..4e947e8516a2 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -12,6 +12,7 @@
  * This work is licensed under the terms of the GNU LGPL, version 2.
  */
 #include <libcflat.h>
+#include <linux/compiler.h>
 #include <errata.h>
 #include <asm/setup.h>
 #include <asm/processor.h>
@@ -260,7 +261,8 @@ static void check_lpi_hits(int *expected, const char *msg)
 
 static void gicv2_ipi_send_self(void)
 {
-	smp_wmb();
+	/* Prevent the compiler from optimizing memory accesses */
+	barrier();
 	writel(2 << 24 | IPI_IRQ, gicv2_dist_base() + GICD_SGIR);
 }
 
@@ -359,6 +361,7 @@ static struct gic gicv3 = {
 	},
 };
 
+/* Runs on the same CPU as the sender, no need for memory synchronization */
 static void ipi_clear_active_handler(struct pt_regs *regs __unused)
 {
 	u32 irqstat = gic_read_iar();
@@ -375,13 +378,10 @@ static void ipi_clear_active_handler(struct pt_regs *regs __unused)
 
 		writel(val, base + GICD_ICACTIVER);
 
-		smp_rmb(); /* pairs with wmb in stats_reset */
 		++acked[smp_processor_id()];
 		check_irqnr(irqnr);
-		smp_wmb(); /* pairs with rmb in check_acked */
 	} else {
 		++spurious[smp_processor_id()];
-		smp_wmb();
 	}
 }
 
-- 
2.29.2


* [kvm-unit-tests PATCH 04/10] arm/arm64: gic: Remove unnecessary synchronization with stats_reset()
  2020-11-25 15:51 ` Alexandru Elisei
@ 2020-11-25 15:51   ` Alexandru Elisei
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexandru Elisei @ 2020-11-25 15:51 UTC (permalink / raw)
  To: kvm, kvmarm, drjones; +Cc: eric.auger, andre.przywara

The GICv3 driver executes a DSB barrier before sending an IPI, which
ensures that memory accesses have completed. This removes the need to
enforce ordering with respect to stats_reset() in the IPI handler.

For GICv2, we still need the DMB to ensure ordering between the read of the
GICC_IAR MMIO register and the read from the acked array. It also matches
what the Linux GICv2 driver does in gic_handle_irq().

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 arm/gic.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arm/gic.c b/arm/gic.c
index 4e947e8516a2..7befda2a8673 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -60,7 +60,6 @@ static void stats_reset(void)
 		bad_sender[i] = -1;
 		bad_irq[i] = -1;
 	}
-	smp_wmb();
 }
 
 static void check_acked(const char *testname, cpumask_t *mask)
@@ -150,7 +149,13 @@ static void ipi_handler(struct pt_regs *regs __unused)
 
 	if (irqnr != GICC_INT_SPURIOUS) {
 		gic_write_eoir(irqstat);
-		smp_rmb(); /* pairs with wmb in stats_reset */
+		/*
+		 * Make sure data written before the IPI was triggered is
+		 * observed after the IAR is read. Pairs with the smp_wmb
+		 * when sending the IPI.
+		 */
+		if (gic_version() == 2)
+			smp_rmb();
 		++acked[smp_processor_id()];
 		check_ipi_sender(irqstat);
 		check_irqnr(irqnr);
-- 
2.29.2


* [kvm-unit-tests PATCH 05/10] arm/arm64: gic: Use correct memory ordering for the IPI test
  2020-11-25 15:51 ` Alexandru Elisei
@ 2020-11-25 15:51   ` Alexandru Elisei
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexandru Elisei @ 2020-11-25 15:51 UTC (permalink / raw)
  To: kvm, kvmarm, drjones; +Cc: eric.auger, andre.przywara

The IPI test works by sending IPIs to even numbered CPUs from the
IPI_SENDER CPU (CPU1), and then checking that the other CPUs received the
interrupts as expected. The check is done in check_acked() by the
IPI_SENDER CPU with the help of three arrays:

- acked, where acked[i] == 1 means that CPU i received the interrupt.
- bad_irq, where bad_irq[i] == -1 means that the interrupt received by CPU
  i had the expected interrupt number (IPI_IRQ).
- bad_sender, where bad_sender[i] == -1 means that the interrupt received
  by CPU i was from the expected sender (IPI_SENDER, GICv2 only).

The assumption made by check_acked() is that if a CPU acked an interrupt,
then bad_sender and bad_irq have also been updated. This is a common
inter-thread communication pattern called message passing.  For message
passing to work correctly on weakly consistent memory model architectures,
like arm and arm64, barriers or address dependencies are required. This is
described in ARM DDI 0487F.b, in "Armv7 compatible approaches for ordering,
using DMB and DSB barriers" (page K11-7993), in the section with a single
observer, which is in our case the IPI_SENDER CPU.

The IPI test attempts to enforce the correct ordering using memory
barriers, but it's not enough. For example, the program execution below is
valid from an architectural point of view:

3 online CPUs, initial state (from stats_reset()):

acked[2] = 0;
bad_sender[2] = -1;
bad_irq[2] = -1;

CPU1 (in check_acked())		| CPU2 (in ipi_handler())
				|
smp_rmb() // DMB ISHLD		| acked[2]++;
read 1 from acked[2]		|
nr_pass++ // nr_pass = 3	|
read -1 from bad_sender[2]	|
read -1 from bad_irq[2]		|
				| // in check_ipi_sender()
				| bad_sender[2] = <bad ipi sender>
				| // in check_irqnr()
				| bad_irq[2] = <bad irq number>
				| smp_wmb() // DMB ISHST
nr_pass == nr_cpus, return	|

In this scenario, CPU1 will read the updated acked value, but it will read
the initial bad_sender and bad_irq values. This is permitted because the
memory barriers, as placed, do not order the read from acked before the
reads from bad_irq and bad_sender on CPU1, nor the writes to bad_sender and
bad_irq before the write to acked on CPU2.

To avoid this situation, let's reorder the barriers and accesses to the
arrays to create the needed dependencies that ensure that message passing
behaves as expected.

In the interrupt handler, the writes to bad_sender and bad_irq are
reordered before the write to acked and a smp_wmb() barrier is added. This
ensures that if other PEs observe the write to acked, then they will also
observe the writes to the other two arrays.

In check_acked(), put the smp_rmb() barrier after the read from acked to
ensure that the subsequent reads from bad_sender and bad_irq aren't
reordered locally by the PE.
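
The resulting message passing pattern can be sketched in portable C11 as
below. This is a minimal illustration, not the code from arm/gic.c: the
release and acquire fences are used as stand-ins for smp_wmb() and
smp_rmb() (they are not exact equivalents), the arrays are made atomic so
that the sketch is free of data races in C11 terms, and NR_CPUS,
handler_publish() and checker_read() are hypothetical names.

```c
#include <stdatomic.h>

#define NR_CPUS 8	/* arbitrary size for the sketch */

static atomic_int acked[NR_CPUS];
static atomic_int bad_sender[NR_CPUS];
static atomic_int bad_irq[NR_CPUS];

/* Handler side: write the payload, then the barrier, then the flag. */
static void handler_publish(int cpu, int sender, int irq)
{
	atomic_store_explicit(&bad_sender[cpu], sender, memory_order_relaxed);
	atomic_store_explicit(&bad_irq[cpu], irq, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);	/* stand-in for smp_wmb() */
	atomic_fetch_add_explicit(&acked[cpu], 1, memory_order_relaxed);
}

/* Checker side: read the flag, then the barrier, then the payload. */
static int checker_read(int cpu, int *sender, int *irq)
{
	int seen = atomic_load_explicit(&acked[cpu], memory_order_relaxed);

	atomic_thread_fence(memory_order_acquire);	/* stand-in for smp_rmb() */
	*sender = atomic_load_explicit(&bad_sender[cpu], memory_order_relaxed);
	*irq = atomic_load_explicit(&bad_irq[cpu], memory_order_relaxed);
	return seen;
}
```

If checker_read() returns a non-zero count for a CPU, the two fences
guarantee that the values it stored in *sender and *irq are at least as
new as the ones written by the matching handler_publish() call.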

With these changes, the expected ordering of accesses is respected and we
end up with the pattern described in the Arm ARM and also in the Linux
litmus test MP+fencewmbonceonce+fencermbonceonce.litmus from
tools/memory-model/litmus-tests. More examples and explanations can be
found in the Linux source tree, in Documentation/memory-barriers.txt, in
the sections "SMP BARRIER PAIRING" and "READ MEMORY BARRIERS VS LOAD
SPECULATION".

For consistency with ipi_handler(), the array accesses in
ipi_clear_active_handler() have also been reordered. This shouldn't affect
the functionality of that test.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 arm/gic.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/arm/gic.c b/arm/gic.c
index 7befda2a8673..bcb834406d23 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -73,9 +73,9 @@ static void check_acked(const char *testname, cpumask_t *mask)
 		mdelay(100);
 		nr_pass = 0;
 		for_each_present_cpu(cpu) {
-			smp_rmb();
 			nr_pass += cpumask_test_cpu(cpu, mask) ?
 				acked[cpu] == 1 : acked[cpu] == 0;
+			smp_rmb(); /* pairs with smp_wmb in ipi_handler */
 
 			if (bad_sender[cpu] != -1) {
 				printf("cpu%d received IPI from wrong sender %d\n",
@@ -118,7 +118,6 @@ static void check_spurious(void)
 {
 	int cpu;
 
-	smp_rmb();
 	for_each_present_cpu(cpu) {
 		if (spurious[cpu])
 			report_info("WARN: cpu%d got %d spurious interrupts",
@@ -156,10 +155,10 @@ static void ipi_handler(struct pt_regs *regs __unused)
 		 */
 		if (gic_version() == 2)
 			smp_rmb();
-		++acked[smp_processor_id()];
 		check_ipi_sender(irqstat);
 		check_irqnr(irqnr);
-		smp_wmb(); /* pairs with rmb in check_acked */
+		smp_wmb(); /* pairs with smp_rmb in check_acked */
+		++acked[smp_processor_id()];
 	} else {
 		++spurious[smp_processor_id()];
 		smp_wmb();
@@ -383,8 +382,8 @@ static void ipi_clear_active_handler(struct pt_regs *regs __unused)
 
 		writel(val, base + GICD_ICACTIVER);
 
-		++acked[smp_processor_id()];
 		check_irqnr(irqnr);
+		++acked[smp_processor_id()];
 	} else {
 		++spurious[smp_processor_id()];
 	}
-- 
2.29.2


* [kvm-unit-tests PATCH 05/10] arm/arm64: gic: Use correct memory ordering for the IPI test
@ 2020-11-25 15:51   ` Alexandru Elisei
  0 siblings, 0 replies; 78+ messages in thread
From: Alexandru Elisei @ 2020-11-25 15:51 UTC (permalink / raw)
  To: kvm, kvmarm, drjones; +Cc: andre.przywara

The IPI test works by sending IPIs to even numbered CPUs from the
IPI_SENDER CPU (CPU1), and then checking that the other CPUs received the
interrupts as expected. The check is done in check_acked() by the
IPI_SENDER CPU with the help of three arrays:

- acked, where acked[i] == 1 means that CPU i received the interrupt.
- bad_irq, where bad_irq[i] == -1 means that the interrupt received by CPU
  i had the expected interrupt number (IPI_IRQ).
- bad_sender, where bad_sender[i] == -1 means that the interrupt received
  by CPU i was from the expected sender (IPI_SENDER, GICv2 only).

The assumption made by check_acked() is that if a CPU acked an interrupt,
then bad_sender and bad_irq have also been updated. This is a common
inter-thread communication pattern called message passing.  For message
passing to work correctly on weakly consistent memory model architectures,
like arm and arm64, barriers or address dependencies are required. This is
described in ARM DDI 0487F.b, in "Armv7 compatible approaches for ordering,
using DMB and DSB barriers" (page K11-7993), in the section with a single
observer, which is in our case the IPI_SENDER CPU.

The IPI test attempts to enforce the correct ordering using memory
barriers, but it's not enough. For example, the program execution below is
valid from an architectural point of view:

3 online CPUs, initial state (from stats_reset()):

acked[2] = 0;
bad_sender[2] = -1;
bad_irq[2] = -1;

CPU1 (in check_acked())		| CPU2 (in ipi_handler())
				|
smp_rmb() // DMB ISHLD		| acked[2]++;
read 1 from acked[2]		|
nr_pass++ // nr_pass = 3	|
read -1 from bad_sender[2]	|
read -1 from bad_irq[2]		|
				| // in check_ipi_sender()
				| bad_sender[2] = <bad ipi sender>
				| // in check_irqnr()
				| bad_irq[2] = <bad irq number>
				| smp_wmb() // DMB ISHST
nr_pass == nr_cpus, return	|

In this scenario, CPU1 will read the updated acked value, but it will read
the initial bad_sender and bad_irq values. This is permitted because the
memory barriers, as placed, do not order the read from acked against the
reads from bad_irq and bad_sender on CPU1, nor the write to acked against
the writes to bad_sender and bad_irq on CPU2.

To avoid this situation, let's reorder the barriers and accesses to the
arrays to create the needed dependencies that ensure that message passing
behaves as expected.

In the interrupt handler, the writes to bad_sender and bad_irq are
reordered before the write to acked and a smp_wmb() barrier is added. This
ensures that if other PEs observe the write to acked, then they will also
observe the writes to the other two arrays.

In check_acked(), put the smp_rmb() barrier after the read from acked to
ensure that the subsequent reads from bad_sender and bad_irq aren't
reordered before it by the PE.

With these changes, the expected ordering of accesses is respected and we
end up with the pattern described in the Arm ARM and also in the Linux
litmus test MP+fencewmbonceonce+fencermbonceonce.litmus from
tools/memory-model/litmus-tests. More examples and explanations can be
found in the Linux source tree, in Documentation/memory-barriers.txt, in
the sections "SMP BARRIER PAIRING" and "READ MEMORY BARRIERS VS LOAD
SPECULATION".
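
As a rough illustration of the pattern, and not the code from this patch,
here is a userspace sketch using C11 atomics and pthreads. The names
payload, flag, writer(), reader() and run_message_passing() are made up for
the example. The release and acquire fences stand in for smp_wmb() and
smp_rmb(); they are at least as strong as those barriers, so this is only
an approximation of what the test does.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-ins: payload plays bad_irq[cpu], flag plays acked[cpu]. */
static int payload = -1;
static atomic_int flag;

/* Writer side, in the order used by the reworked ipi_handler(). */
static void *writer(void *arg)
{
	(void)arg;
	payload = 42;					/* write the data */
	atomic_thread_fence(memory_order_release);	/* roughly smp_wmb() */
	atomic_store_explicit(&flag, 1, memory_order_relaxed); /* publish */
	return NULL;
}

/* Reader side, in the order used by check_acked(). */
static int reader(void)
{
	while (!atomic_load_explicit(&flag, memory_order_relaxed))
		;					/* read the flag */
	atomic_thread_fence(memory_order_acquire);	/* roughly smp_rmb() */
	return payload;					/* read the data */
}

/* Runs one writer thread against the calling thread as reader. */
static int run_message_passing(void)
{
	pthread_t thread;
	int rc, seen;

	payload = -1;
	atomic_store(&flag, 0);
	rc = pthread_create(&thread, NULL, writer, NULL);
	assert(rc == 0);
	seen = reader();
	pthread_join(thread, NULL);
	return seen;
}
```

If the flag were written before the payload, or the reader's fence came
before the flag read, the reader could legitimately return -1, which is the
failure described above.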

For consistency with ipi_handler(), the array accesses in
ipi_clear_active_handler() have also been reordered. This shouldn't affect
the functionality of that test.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 arm/gic.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/arm/gic.c b/arm/gic.c
index 7befda2a8673..bcb834406d23 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -73,9 +73,9 @@ static void check_acked(const char *testname, cpumask_t *mask)
 		mdelay(100);
 		nr_pass = 0;
 		for_each_present_cpu(cpu) {
-			smp_rmb();
 			nr_pass += cpumask_test_cpu(cpu, mask) ?
 				acked[cpu] == 1 : acked[cpu] == 0;
+			smp_rmb(); /* pairs with smp_wmb in ipi_handler */
 
 			if (bad_sender[cpu] != -1) {
 				printf("cpu%d received IPI from wrong sender %d\n",
@@ -118,7 +118,6 @@ static void check_spurious(void)
 {
 	int cpu;
 
-	smp_rmb();
 	for_each_present_cpu(cpu) {
 		if (spurious[cpu])
 			report_info("WARN: cpu%d got %d spurious interrupts",
@@ -156,10 +155,10 @@ static void ipi_handler(struct pt_regs *regs __unused)
 		 */
 		if (gic_version() == 2)
 			smp_rmb();
-		++acked[smp_processor_id()];
 		check_ipi_sender(irqstat);
 		check_irqnr(irqnr);
-		smp_wmb(); /* pairs with rmb in check_acked */
+		smp_wmb(); /* pairs with smp_rmb in check_acked */
+		++acked[smp_processor_id()];
 	} else {
 		++spurious[smp_processor_id()];
 		smp_wmb();
@@ -383,8 +382,8 @@ static void ipi_clear_active_handler(struct pt_regs *regs __unused)
 
 		writel(val, base + GICD_ICACTIVER);
 
-		++acked[smp_processor_id()];
 		check_irqnr(irqnr);
+		++acked[smp_processor_id()];
 	} else {
 		++spurious[smp_processor_id()];
 	}
-- 
2.29.2

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [kvm-unit-tests PATCH 06/10] arm/arm64: gic: Check spurious and bad_sender in the active test
  2020-11-25 15:51 ` Alexandru Elisei
@ 2020-11-25 15:51   ` Alexandru Elisei
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexandru Elisei @ 2020-11-25 15:51 UTC (permalink / raw)
  To: kvm, kvmarm, drjones; +Cc: eric.auger, andre.przywara

The gicv{2,3}-active test sends an IPI from the boot CPU to itself, then
checks that the interrupt has been received as expected. The
ipi_clear_active_handler() clears the active state of the interrupt with a
write to the GICD_ICACTIVER register instead of writing to the EOI
register.

When acknowledging the interrupt it is possible to get back a spurious
interrupt ID (ID 1023), and the interrupt handler increments the number of
spurious interrupts received on the current processor. However, this is not
checked at the end of the test. Let's also check for spurious interrupts,
like the IPI test does.

For IPIs on GICv2, the value returned by a read of the GICC_IAR register
performed when acknowledging the interrupt also contains the sender CPU
ID. Add a check for that too.
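
For reference, a small sketch of how the sender and the interrupt ID can be
pulled out of a GICv2 GICC_IAR value. The shift and mask for the sender
match the (irqstat >> 10) & 7 expression used in check_ipi_sender(); the
macro and function names below are invented for the example and are not
the kvm-unit-tests helpers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * GICv2 GICC_IAR layout: bits [9:0] hold the interrupt ID and, for SGIs,
 * bits [12:10] hold the ID of the CPU that sent the interrupt.
 * All names here are hypothetical.
 */
#define IAR_INTID_MASK	0x3ffu
#define IAR_CPUID_SHIFT	10
#define IAR_CPUID_MASK	0x7u
#define INTID_SPURIOUS	1023u

static uint32_t iar_irqnr(uint32_t iar)
{
	return iar & IAR_INTID_MASK;
}

static uint32_t iar_sender(uint32_t iar)
{
	return (iar >> IAR_CPUID_SHIFT) & IAR_CPUID_MASK;
}

static int iar_is_spurious(uint32_t iar)
{
	return iar_irqnr(iar) == INTID_SPURIOUS;
}

/* Builds an IAR value the way the GIC would report it; illustration only. */
static uint32_t make_iar(uint32_t sender, uint32_t irqnr)
{
	assert(sender <= IAR_CPUID_MASK && irqnr <= IAR_INTID_MASK);
	return (sender << IAR_CPUID_SHIFT) | irqnr;
}
```

In the active test the IPI is sent by the boot CPU to itself, so the
decoded sender is expected to equal smp_processor_id().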

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 arm/gic.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arm/gic.c b/arm/gic.c
index bcb834406d23..5727d72a0ef3 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -125,12 +125,12 @@ static void check_spurious(void)
 	}
 }
 
-static void check_ipi_sender(u32 irqstat)
+static void check_ipi_sender(u32 irqstat, int sender)
 {
 	if (gic_version() == 2) {
 		int src = (irqstat >> 10) & 7;
 
-		if (src != IPI_SENDER)
+		if (src != sender)
 			bad_sender[smp_processor_id()] = src;
 	}
 }
@@ -155,7 +155,7 @@ static void ipi_handler(struct pt_regs *regs __unused)
 		 */
 		if (gic_version() == 2)
 			smp_rmb();
-		check_ipi_sender(irqstat);
+		check_ipi_sender(irqstat, IPI_SENDER);
 		check_irqnr(irqnr);
 		smp_wmb(); /* pairs with smp_rmb in check_acked */
 		++acked[smp_processor_id()];
@@ -382,6 +382,7 @@ static void ipi_clear_active_handler(struct pt_regs *regs __unused)
 
 		writel(val, base + GICD_ICACTIVER);
 
+		check_ipi_sender(irqstat, smp_processor_id());
 		check_irqnr(irqnr);
 		++acked[smp_processor_id()];
 	} else {
@@ -394,6 +395,7 @@ static void run_active_clear_test(void)
 	report_prefix_push("active");
 	setup_irq(ipi_clear_active_handler);
 	ipi_test_self();
+	check_spurious();
 	report_prefix_pop();
 }
 
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [kvm-unit-tests PATCH 07/10] arm/arm64: gic: Wait for writes to acked or spurious to complete
  2020-11-25 15:51 ` Alexandru Elisei
@ 2020-11-25 15:51   ` Alexandru Elisei
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexandru Elisei @ 2020-11-25 15:51 UTC (permalink / raw)
  To: kvm, kvmarm, drjones; +Cc: eric.auger, andre.przywara

The IPI test has two parts: in the first part, it tests that the sender CPU
can send an IPI to itself (ipi_test_self()), and in the second part it
sends interrupts to even-numbered CPUs (ipi_test_smp()). When acknowledging
an interrupt, if we read back a spurious interrupt ID (1023), the handler
increments the index in the static array spurious corresponding to the CPU
ID that the handler is running on; if we get the expected interrupt ID, we
increment the same index in the acked array.

Reads of the spurious and acked arrays are synchronized with writes
performed before sending the IPI. The synchronization is done either in the
IPI sender function (GICv3) or by creating a data dependency (GICv2).

At the end of the test, the sender CPU reads from the acked and spurious
arrays to check against the expected behaviour. We need to make sure that
the writes in ipi_handler() are observable by the sender CPU. Use a DSB
ISHST to make sure that the writes have completed.

One might rightfully argue that there are no guarantees regarding when the
DSB instruction completes, just like there are no guarantees regarding when
the value is observed by the other CPUs. However, let's do our best and
instruct the CPU to complete the memory access when we know that it will be
needed.

We still need to follow the message passing pattern between acked on one
side and bad_irq and bad_sender on the other, because DSB guarantees that
all memory accesses that come before the barrier have completed, not that
they have completed in program order.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 arm/gic.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arm/gic.c b/arm/gic.c
index 5727d72a0ef3..544c283f5f47 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -161,8 +161,10 @@ static void ipi_handler(struct pt_regs *regs __unused)
 		++acked[smp_processor_id()];
 	} else {
 		++spurious[smp_processor_id()];
-		smp_wmb();
 	}
+
+	/* Wait for writes to acked/spurious to complete */
+	dsb(ishst);
 }
 
 static void setup_irq(irq_handler_fn handler)
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [kvm-unit-tests PATCH 08/10] arm/arm64: gic: Split check_acked() into two functions
  2020-11-25 15:51 ` Alexandru Elisei
@ 2020-11-25 15:51   ` Alexandru Elisei
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexandru Elisei @ 2020-11-25 15:51 UTC (permalink / raw)
  To: kvm, kvmarm, drjones; +Cc: eric.auger, andre.przywara

check_acked() has several peculiarities: it is the only function among the
check_* functions which calls report() directly, it does two things
(waits for interrupts and checks for misfired interrupts) and it also
mixes printf, report_info and report calls.

check_acked() also reports a pass and returns as soon as all the target CPUs
have received interrupts. However, a CPU not having received an interrupt
*now* does not guarantee that it won't receive an erroneous interrupt if we
wait long enough.

Rework the function by splitting it into two separate functions, each with
a single responsibility: wait_for_interrupts(), which waits for the
expected interrupts to fire, and check_acked(), which checks that interrupts
have been received as expected.

wait_for_interrupts() also waits an extra 100 milliseconds after the
expected interrupts have been received in an effort to make sure we don't
miss misfiring interrupts.

Splitting check_acked() into two functions will also allow us to
customize the behavior of each function in the future more easily
without using an unnecessarily long list of arguments for check_acked().
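
To make the split concrete, here is a standalone sketch of the two
questions being asked. all_delivered() mirrors the per-iteration test in
wait_for_interrupts(), and classify_acks() mirrors the counting done in
check_acked(). The function and struct names are hypothetical, a plain
bitmask replaces cpumask_t, and the "extra means more than one interrupt"
rule is my reading of the part of check_acked() that the diff context does
not show.

```c
#include <assert.h>
#include <stdbool.h>

struct ack_stats {
	int missing;	/* targeted CPU that saw no interrupt */
	int extra;	/* targeted CPU that saw more than one interrupt */
	int unexpected;	/* untargeted CPU that saw an interrupt */
};

/* Bit i of mask set means CPU i is a target; nr_cpus must be below 32. */
static bool is_target(unsigned int mask, int cpu)
{
	return (mask >> cpu) & 1u;
}

/* Wait-loop question: can we stop polling? */
static bool all_delivered(const int *acked, int nr_cpus, unsigned int mask)
{
	int cpu, nr_pass = 0;

	for (cpu = 0; cpu < nr_cpus; cpu++)
		nr_pass += is_target(mask, cpu) ? acked[cpu] >= 1
						: acked[cpu] == 0;
	return nr_pass == nr_cpus;
}

/* Check question: what exactly went wrong, if anything? */
static struct ack_stats classify_acks(const int *acked, int nr_cpus,
				      unsigned int mask)
{
	struct ack_stats s = { 0, 0, 0 };
	int cpu;

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		if (is_target(mask, cpu)) {
			if (!acked[cpu])
				s.missing++;
			else if (acked[cpu] > 1)
				s.extra++;
		} else if (acked[cpu]) {
			s.unexpected++;
		}
	}
	return s;
}

static bool acks_ok(struct ack_stats s)
{
	return !s.missing && !s.extra && !s.unexpected;
}
```

Note that a target CPU with two interrupts lets the wait loop finish
(all_delivered() is true) but still fails the check, which is the behaviour
the comment in wait_for_interrupts() describes.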

CC: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 arm/gic.c | 73 +++++++++++++++++++++++++++++++++++--------------------
 1 file changed, 47 insertions(+), 26 deletions(-)

diff --git a/arm/gic.c b/arm/gic.c
index 544c283f5f47..dcdab7d5f39a 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -62,41 +62,42 @@ static void stats_reset(void)
 	}
 }
 
-static void check_acked(const char *testname, cpumask_t *mask)
+static void wait_for_interrupts(cpumask_t *mask)
 {
-	int missing = 0, extra = 0, unexpected = 0;
 	int nr_pass, cpu, i;
-	bool bad = false;
 
 	/* Wait up to 5s for all interrupts to be delivered */
-	for (i = 0; i < 50; ++i) {
+	for (i = 0; i < 50; i++) {
 		mdelay(100);
 		nr_pass = 0;
 		for_each_present_cpu(cpu) {
+			/*
+			 * A CPU having received more than one interrupt will
+			 * show up in check_acked(), and no matter how long we
+			 * wait it cannot un-receive it. Consider at least one
+			 * interrupt as a pass.
+			 */
 			nr_pass += cpumask_test_cpu(cpu, mask) ?
-				acked[cpu] == 1 : acked[cpu] == 0;
-			smp_rmb(); /* pairs with smp_wmb in ipi_handler */
-
-			if (bad_sender[cpu] != -1) {
-				printf("cpu%d received IPI from wrong sender %d\n",
-					cpu, bad_sender[cpu]);
-				bad = true;
-			}
-
-			if (bad_irq[cpu] != -1) {
-				printf("cpu%d received wrong irq %d\n",
-					cpu, bad_irq[cpu]);
-				bad = true;
-			}
+				acked[cpu] >= 1 : acked[cpu] == 0;
 		}
+
 		if (nr_pass == nr_cpus) {
-			report(!bad, "%s", testname);
 			if (i)
-				report_info("took more than %d ms", i * 100);
+				report_info("interrupts took more than %d ms", i * 100);
+			mdelay(100);
 			return;
 		}
 	}
 
+	report_info("interrupts timed-out (5s)");
+}
+
+static bool check_acked(cpumask_t *mask)
+{
+	int missing = 0, extra = 0, unexpected = 0;
+	bool pass = true;
+	int cpu;
+
 	for_each_present_cpu(cpu) {
 		if (cpumask_test_cpu(cpu, mask)) {
 			if (!acked[cpu])
@@ -107,11 +108,28 @@ static void check_acked(const char *testname, cpumask_t *mask)
 			if (acked[cpu])
 				++unexpected;
 		}
+		smp_rmb(); /* pairs with smp_wmb in ipi_handler */
+
+		if (bad_sender[cpu] != -1) {
+			report_info("cpu%d received IPI from wrong sender %d",
+					cpu, bad_sender[cpu]);
+			pass = false;
+		}
+
+		if (bad_irq[cpu] != -1) {
+			report_info("cpu%d received wrong irq %d",
+					cpu, bad_irq[cpu]);
+			pass = false;
+		}
+	}
+
+	if (missing || extra || unexpected) {
+		report_info("ACKS: missing=%d extra=%d unexpected=%d",
+				missing, extra, unexpected);
+		pass = false;
 	}
 
-	report(false, "%s", testname);
-	report_info("Timed-out (5s). ACKS: missing=%d extra=%d unexpected=%d",
-		    missing, extra, unexpected);
+	return pass;
 }
 
 static void check_spurious(void)
@@ -300,7 +318,8 @@ static void ipi_test_self(void)
 	cpumask_clear(&mask);
 	cpumask_set_cpu(smp_processor_id(), &mask);
 	gic->ipi.send_self();
-	check_acked("IPI: self", &mask);
+	wait_for_interrupts(&mask);
+	report(check_acked(&mask), "Interrupts received");
 	report_prefix_pop();
 }
 
@@ -315,7 +334,8 @@ static void ipi_test_smp(void)
 	for (i = smp_processor_id() & 1; i < nr_cpus; i += 2)
 		cpumask_clear_cpu(i, &mask);
 	gic_ipi_send_mask(IPI_IRQ, &mask);
-	check_acked("IPI: directed", &mask);
+	wait_for_interrupts(&mask);
+	report(check_acked(&mask), "Interrupts received");
 	report_prefix_pop();
 
 	report_prefix_push("broadcast");
@@ -323,7 +343,8 @@ static void ipi_test_smp(void)
 	cpumask_copy(&mask, &cpu_present_mask);
 	cpumask_clear_cpu(smp_processor_id(), &mask);
 	gic->ipi.send_broadcast();
-	check_acked("IPI: broadcast", &mask);
+	wait_for_interrupts(&mask);
+	report(check_acked(&mask), "Interrupts received");
 	report_prefix_pop();
 }
 
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [kvm-unit-tests PATCH 09/10] arm/arm64: gic: Make check_acked() more generic
  2020-11-25 15:51 ` Alexandru Elisei
@ 2020-11-25 15:51   ` Alexandru Elisei
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexandru Elisei @ 2020-11-25 15:51 UTC (permalink / raw)
  To: kvm, kvmarm, drjones; +Cc: eric.auger, andre.przywara

Testing that an interrupt is received as expected is done in three places:
in check_ipi_sender(), check_irqnr() and check_acked(). check_irqnr()
compares the interrupt ID with IPI_IRQ and records a failure in bad_irq,
and check_ipi_sender() compares the sender with IPI_SENDER and writes to
bad_sender when they don't match.

Let's move all the checks to check_acked() by renaming
bad_sender->irq_sender and bad_irq->irq_number and changing their semantics
so they record the interrupt sender and the irq number, respectively.
check_acked() now takes two new parameters: the expected sender and the
expected interrupt number.

This has two distinct advantages:

1. check_acked() and ipi_handler() can now be used for interrupts other
   than IPIs.
2. Correctness checks are consolidated in one function.
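
The idea can be sketched in isolation as "the handler records, the checker
judges". The struct and function names below are hypothetical, the sketch
is single-threaded, and it leaves out the barriers that the real handler
and check_acked() need.

```c
#include <assert.h>
#include <stdbool.h>

/* Per-CPU record; the patch keeps these as separate arrays instead. */
struct irq_record {
	int acked;	/* number of interrupts handled */
	int sender;	/* last observed sender, -1 if not available */
	int number;	/* last observed interrupt ID, -1 if none */
};

/* Handler side: store what was observed, without judging it. */
static void record_irq(struct irq_record *r, int sender, int number)
{
	r->sender = sender;
	r->number = number;
	r->acked++;
}

/*
 * Checker side: compare against the caller's expectations. check_sender
 * plays the role of has_gicv2, since only GICv2 reports the sender.
 */
static bool record_matches(const struct irq_record *r, bool check_sender,
			   int sender, int number)
{
	if (!r->acked)
		return true;	/* nothing received, nothing to compare */
	if (check_sender && r->sender != sender)
		return false;
	return r->number == number;
}
```

Because the expected values are arguments rather than IPI_SENDER and
IPI_IRQ baked into the handler, the same record can be checked against any
interrupt number a test cares about.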

CC: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 arm/gic.c | 68 +++++++++++++++++++++++++++----------------------------
 1 file changed, 33 insertions(+), 35 deletions(-)

diff --git a/arm/gic.c b/arm/gic.c
index dcdab7d5f39a..da7b42da5449 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -35,7 +35,7 @@ struct gic {
 
 static struct gic *gic;
 static int acked[NR_CPUS], spurious[NR_CPUS];
-static int bad_sender[NR_CPUS], bad_irq[NR_CPUS];
+static int irq_sender[NR_CPUS], irq_number[NR_CPUS];
 static cpumask_t ready;
 
 static void nr_cpu_check(int nr)
@@ -57,8 +57,8 @@ static void stats_reset(void)
 
 	for (i = 0; i < nr_cpus; ++i) {
 		acked[i] = 0;
-		bad_sender[i] = -1;
-		bad_irq[i] = -1;
+		irq_sender[i] = -1;
+		irq_number[i] = -1;
 	}
 }
 
@@ -92,9 +92,10 @@ static void wait_for_interrupts(cpumask_t *mask)
 	report_info("interrupts timed-out (5s)");
 }
 
-static bool check_acked(cpumask_t *mask)
+static bool check_acked(cpumask_t *mask, int sender, int irqnum)
 {
 	int missing = 0, extra = 0, unexpected = 0;
+	bool has_gicv2 = (gic_version() == 2);
 	bool pass = true;
 	int cpu;
 
@@ -108,17 +109,19 @@ static bool check_acked(cpumask_t *mask)
 			if (acked[cpu])
 				++unexpected;
 		}
+		if (!acked[cpu])
+			continue;
 		smp_rmb(); /* pairs with smp_wmb in ipi_handler */
 
-		if (bad_sender[cpu] != -1) {
+		if (has_gicv2 && irq_sender[cpu] != sender) {
 			report_info("cpu%d received IPI from wrong sender %d",
-					cpu, bad_sender[cpu]);
+					cpu, irq_sender[cpu]);
 			pass = false;
 		}
 
-		if (bad_irq[cpu] != -1) {
+		if (irq_number[cpu] != irqnum) {
 			report_info("cpu%d received wrong irq %d",
-					cpu, bad_irq[cpu]);
+					cpu, irq_number[cpu]);
 			pass = false;
 		}
 	}
@@ -143,26 +146,18 @@ static void check_spurious(void)
 	}
 }
 
-static void check_ipi_sender(u32 irqstat, int sender)
+static int gic_get_sender(int irqstat)
 {
-	if (gic_version() == 2) {
-		int src = (irqstat >> 10) & 7;
-
-		if (src != sender)
-			bad_sender[smp_processor_id()] = src;
-	}
-}
-
-static void check_irqnr(u32 irqnr)
-{
-	if (irqnr != IPI_IRQ)
-		bad_irq[smp_processor_id()] = irqnr;
+	if (gic_version() == 2)
+		return (irqstat >> 10) & 7;
+	return -1;
 }
 
 static void ipi_handler(struct pt_regs *regs __unused)
 {
 	u32 irqstat = gic_read_iar();
 	u32 irqnr = gic_iar_irqnr(irqstat);
+	int this_cpu = smp_processor_id();
 
 	if (irqnr != GICC_INT_SPURIOUS) {
 		gic_write_eoir(irqstat);
@@ -173,12 +168,12 @@ static void ipi_handler(struct pt_regs *regs __unused)
 		 */
 		if (gic_version() == 2)
 			smp_rmb();
-		check_ipi_sender(irqstat, IPI_SENDER);
-		check_irqnr(irqnr);
+		irq_sender[this_cpu] = gic_get_sender(irqstat);
+		irq_number[this_cpu] = irqnr;
 		smp_wmb(); /* pairs with smp_rmb in check_acked */
-		++acked[smp_processor_id()];
+		++acked[this_cpu];
 	} else {
-		++spurious[smp_processor_id()];
+		++spurious[this_cpu];
 	}
 
 	/* Wait for writes to acked/spurious to complete */
@@ -311,40 +306,42 @@ static void gicv3_ipi_send_broadcast(void)
 
 static void ipi_test_self(void)
 {
+	int this_cpu = smp_processor_id();
 	cpumask_t mask;
 
 	report_prefix_push("self");
 	stats_reset();
 	cpumask_clear(&mask);
-	cpumask_set_cpu(smp_processor_id(), &mask);
+	cpumask_set_cpu(this_cpu, &mask);
 	gic->ipi.send_self();
 	wait_for_interrupts(&mask);
-	report(check_acked(&mask), "Interrupts received");
+	report(check_acked(&mask, this_cpu, IPI_IRQ), "Interrupts received");
 	report_prefix_pop();
 }
 
 static void ipi_test_smp(void)
 {
+	int this_cpu = smp_processor_id();
 	cpumask_t mask;
 	int i;
 
 	report_prefix_push("target-list");
 	stats_reset();
 	cpumask_copy(&mask, &cpu_present_mask);
-	for (i = smp_processor_id() & 1; i < nr_cpus; i += 2)
+	for (i = this_cpu & 1; i < nr_cpus; i += 2)
 		cpumask_clear_cpu(i, &mask);
 	gic_ipi_send_mask(IPI_IRQ, &mask);
 	wait_for_interrupts(&mask);
-	report(check_acked(&mask), "Interrupts received");
+	report(check_acked(&mask, this_cpu, IPI_IRQ), "Interrupts received");
 	report_prefix_pop();
 
 	report_prefix_push("broadcast");
 	stats_reset();
 	cpumask_copy(&mask, &cpu_present_mask);
-	cpumask_clear_cpu(smp_processor_id(), &mask);
+	cpumask_clear_cpu(this_cpu, &mask);
 	gic->ipi.send_broadcast();
 	wait_for_interrupts(&mask);
-	report(check_acked(&mask), "Interrupts received");
+	report(check_acked(&mask, this_cpu, IPI_IRQ), "Interrupts received");
 	report_prefix_pop();
 }
 
@@ -393,6 +390,7 @@ static void ipi_clear_active_handler(struct pt_regs *regs __unused)
 {
 	u32 irqstat = gic_read_iar();
 	u32 irqnr = gic_iar_irqnr(irqstat);
+	int this_cpu = smp_processor_id();
 
 	if (irqnr != GICC_INT_SPURIOUS) {
 		void *base;
@@ -405,11 +403,11 @@ static void ipi_clear_active_handler(struct pt_regs *regs __unused)
 
 		writel(val, base + GICD_ICACTIVER);
 
-		check_ipi_sender(irqstat, smp_processor_id());
-		check_irqnr(irqnr);
-		++acked[smp_processor_id()];
+		irq_sender[this_cpu] = gic_get_sender(irqstat);
+		irq_number[this_cpu] = irqnr;
+		++acked[this_cpu];
 	} else {
-		++spurious[smp_processor_id()];
+		++spurious[this_cpu];
 	}
 }
 
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [kvm-unit-tests PATCH 09/10] arm/arm64: gic: Make check_acked() more generic
@ 2020-11-25 15:51   ` Alexandru Elisei
  0 siblings, 0 replies; 78+ messages in thread
From: Alexandru Elisei @ 2020-11-25 15:51 UTC (permalink / raw)
  To: kvm, kvmarm, drjones; +Cc: andre.przywara

Testing that an interrupt is received as expected is done in three places:
in check_ipi_sender(), check_irqnr() and check_acked(). check_irqnr()
compares the interrupt ID with IPI_IRQ and records a failure in bad_irq,
and check_ipi_sender() compares the sender with IPI_SENDER and writes to
bad_sender when they don't match.

Let's move all the checks to check_acked() by renaming
bad_sender->irq_sender and bad_irq->irq_number and changing their semantics
so they record the interrupt sender and the irq number, respectively.
check_acked() now takes two new parameters: the expected sender and the
expected interrupt number.

This has two distinct advantages:

1. check_acked() and ipi_handler() can now be used for interrupts other
   than IPIs.
2. Correctness checks are consolidated in one function.

CC: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 arm/gic.c | 68 +++++++++++++++++++++++++++----------------------------
 1 file changed, 33 insertions(+), 35 deletions(-)

diff --git a/arm/gic.c b/arm/gic.c
index dcdab7d5f39a..da7b42da5449 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -35,7 +35,7 @@ struct gic {
 
 static struct gic *gic;
 static int acked[NR_CPUS], spurious[NR_CPUS];
-static int bad_sender[NR_CPUS], bad_irq[NR_CPUS];
+static int irq_sender[NR_CPUS], irq_number[NR_CPUS];
 static cpumask_t ready;
 
 static void nr_cpu_check(int nr)
@@ -57,8 +57,8 @@ static void stats_reset(void)
 
 	for (i = 0; i < nr_cpus; ++i) {
 		acked[i] = 0;
-		bad_sender[i] = -1;
-		bad_irq[i] = -1;
+		irq_sender[i] = -1;
+		irq_number[i] = -1;
 	}
 }
 
@@ -92,9 +92,10 @@ static void wait_for_interrupts(cpumask_t *mask)
 	report_info("interrupts timed-out (5s)");
 }
 
-static bool check_acked(cpumask_t *mask)
+static bool check_acked(cpumask_t *mask, int sender, int irqnum)
 {
 	int missing = 0, extra = 0, unexpected = 0;
+	bool has_gicv2 = (gic_version() == 2);
 	bool pass = true;
 	int cpu;
 
@@ -108,17 +109,19 @@ static bool check_acked(cpumask_t *mask)
 			if (acked[cpu])
 				++unexpected;
 		}
+		if (!acked[cpu])
+			continue;
 		smp_rmb(); /* pairs with smp_wmb in ipi_handler */
 
-		if (bad_sender[cpu] != -1) {
+		if (has_gicv2 && irq_sender[cpu] != sender) {
 			report_info("cpu%d received IPI from wrong sender %d",
-					cpu, bad_sender[cpu]);
+					cpu, irq_sender[cpu]);
 			pass = false;
 		}
 
-		if (bad_irq[cpu] != -1) {
+		if (irq_number[cpu] != irqnum) {
 			report_info("cpu%d received wrong irq %d",
-					cpu, bad_irq[cpu]);
+					cpu, irq_number[cpu]);
 			pass = false;
 		}
 	}
@@ -143,26 +146,18 @@ static void check_spurious(void)
 	}
 }
 
-static void check_ipi_sender(u32 irqstat, int sender)
+static int gic_get_sender(int irqstat)
 {
-	if (gic_version() == 2) {
-		int src = (irqstat >> 10) & 7;
-
-		if (src != sender)
-			bad_sender[smp_processor_id()] = src;
-	}
-}
-
-static void check_irqnr(u32 irqnr)
-{
-	if (irqnr != IPI_IRQ)
-		bad_irq[smp_processor_id()] = irqnr;
+	if (gic_version() == 2)
+		return (irqstat >> 10) & 7;
+	return -1;
 }
 
 static void ipi_handler(struct pt_regs *regs __unused)
 {
 	u32 irqstat = gic_read_iar();
 	u32 irqnr = gic_iar_irqnr(irqstat);
+	int this_cpu = smp_processor_id();
 
 	if (irqnr != GICC_INT_SPURIOUS) {
 		gic_write_eoir(irqstat);
@@ -173,12 +168,12 @@ static void ipi_handler(struct pt_regs *regs __unused)
 		 */
 		if (gic_version() == 2)
 			smp_rmb();
-		check_ipi_sender(irqstat, IPI_SENDER);
-		check_irqnr(irqnr);
+		irq_sender[this_cpu] = gic_get_sender(irqstat);
+		irq_number[this_cpu] = irqnr;
 		smp_wmb(); /* pairs with smp_rmb in check_acked */
-		++acked[smp_processor_id()];
+		++acked[this_cpu];
 	} else {
-		++spurious[smp_processor_id()];
+		++spurious[this_cpu];
 	}
 
 	/* Wait for writes to acked/spurious to complete */
@@ -311,40 +306,42 @@ static void gicv3_ipi_send_broadcast(void)
 
 static void ipi_test_self(void)
 {
+	int this_cpu = smp_processor_id();
 	cpumask_t mask;
 
 	report_prefix_push("self");
 	stats_reset();
 	cpumask_clear(&mask);
-	cpumask_set_cpu(smp_processor_id(), &mask);
+	cpumask_set_cpu(this_cpu, &mask);
 	gic->ipi.send_self();
 	wait_for_interrupts(&mask);
-	report(check_acked(&mask), "Interrupts received");
+	report(check_acked(&mask, this_cpu, IPI_IRQ), "Interrupts received");
 	report_prefix_pop();
 }
 
 static void ipi_test_smp(void)
 {
+	int this_cpu = smp_processor_id();
 	cpumask_t mask;
 	int i;
 
 	report_prefix_push("target-list");
 	stats_reset();
 	cpumask_copy(&mask, &cpu_present_mask);
-	for (i = smp_processor_id() & 1; i < nr_cpus; i += 2)
+	for (i = this_cpu & 1; i < nr_cpus; i += 2)
 		cpumask_clear_cpu(i, &mask);
 	gic_ipi_send_mask(IPI_IRQ, &mask);
 	wait_for_interrupts(&mask);
-	report(check_acked(&mask), "Interrupts received");
+	report(check_acked(&mask, this_cpu, IPI_IRQ), "Interrupts received");
 	report_prefix_pop();
 
 	report_prefix_push("broadcast");
 	stats_reset();
 	cpumask_copy(&mask, &cpu_present_mask);
-	cpumask_clear_cpu(smp_processor_id(), &mask);
+	cpumask_clear_cpu(this_cpu, &mask);
 	gic->ipi.send_broadcast();
 	wait_for_interrupts(&mask);
-	report(check_acked(&mask), "Interrupts received");
+	report(check_acked(&mask, this_cpu, IPI_IRQ), "Interrupts received");
 	report_prefix_pop();
 }
 
@@ -393,6 +390,7 @@ static void ipi_clear_active_handler(struct pt_regs *regs __unused)
 {
 	u32 irqstat = gic_read_iar();
 	u32 irqnr = gic_iar_irqnr(irqstat);
+	int this_cpu = smp_processor_id();
 
 	if (irqnr != GICC_INT_SPURIOUS) {
 		void *base;
@@ -405,11 +403,11 @@ static void ipi_clear_active_handler(struct pt_regs *regs __unused)
 
 		writel(val, base + GICD_ICACTIVER);
 
-		check_ipi_sender(irqstat, smp_processor_id());
-		check_irqnr(irqnr);
-		++acked[smp_processor_id()];
+		irq_sender[this_cpu] = gic_get_sender(irqstat);
+		irq_number[this_cpu] = irqnr;
+		++acked[this_cpu];
 	} else {
-		++spurious[smp_processor_id()];
+		++spurious[this_cpu];
 	}
 }
 
-- 
2.29.2

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [kvm-unit-tests PATCH 10/10] arm64: gic: Use IPI test checking for the LPI tests
  2020-11-25 15:51 ` Alexandru Elisei
@ 2020-11-25 15:51   ` Alexandru Elisei
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexandru Elisei @ 2020-11-25 15:51 UTC (permalink / raw)
  To: kvm, kvmarm, drjones; +Cc: eric.auger, andre.przywara

The LPI code validates a result similarly to the IPI tests, by checking if
the target CPU received the interrupt with the expected interrupt number.
However, the LPI tests invent their own way of checking the test results by
creating a global struct (lpi_stats), using a separate interrupt handler
(lpi_handler) and test function (check_lpi_stats).

There are several areas that can be improved in the LPI code, which are
already covered by the IPI tests:

- check_lpi_stats() doesn't take into account that the target CPU can
  receive the correct interrupt multiple times.
- check_lpi_stats() doesn't take into account the scenarios where all
  online CPUs can receive the interrupt, but the target CPU is the last CPU
  that touches lpi_stats.observed.
- Insufficient or missing memory synchronization.
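
The first two points can be reproduced with a small standalone sketch. The
names below (deliver(), last_event_ok(), counters_ok()) are hypothetical
and only model the two bookkeeping schemes; they are not the functions
from arm/gic.c, and no memory synchronization is modelled.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical size, for illustration only */

/* The only state check_lpi_stats() looked at: the last delivery. */
struct last_event {
	int cpu_id;
	int lpi_id;
};

static struct last_event observed = { -1, -1 };
static int acked[NR_CPUS];	/* per-CPU delivery counters */

static void deliver(int cpu, int lpi)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	observed.cpu_id = cpu;	/* overwritten on every delivery */
	observed.lpi_id = lpi;
	++acked[cpu];		/* every delivery is counted */
}

/* Last-writer-wins check: blind to duplicates and to earlier CPUs. */
static bool last_event_ok(int cpu, int lpi)
{
	return observed.cpu_id == cpu && observed.lpi_id == lpi;
}

/* Counter check: exactly one delivery on the target, none elsewhere. */
static bool counters_ok(int cpu)
{
	for (int i = 0; i < NR_CPUS; ++i)
		if (acked[i] != (i == cpu ? 1 : 0))
			return false;
	return true;
}
```

After two calls to deliver(3, 8195), last_event_ok(3, 8195) still holds,
while counters_ok(3) does not.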

Instead of duplicating code, let's convert the LPI tests to use
check_acked() and the same interrupt handler as the IPI tests, which has
been renamed to irq_handler() to avoid any confusion.

check_lpi_stats() has been replaced with check_acked() which, together with
using irq_handler(), instantly gives us more correctness checks and proper
memory synchronization between threads. lpi_stats.expected has been
replaced by the CPU mask and the expected interrupt number arguments to
check_acked(), with no change in semantics.

lpi_handler() aborted the test if the interrupt number was not an LPI. This
was changed in favor of allowing the test to continue, as it will fail in
check_acked(), but possibly print information useful for debugging. If the
test receives spurious interrupts, those are reported via report_info() at
the end of the test for consistency with the IPI tests, which don't treat
spurious interrupts as critical errors.

In the spirit of code reuse, secondary_lpi_test() has been replaced with
ipi_recv() because the two are now identical; ipi_recv() has been renamed
to irq_recv(), similarly to irq_handler(), to avoid confusion.

CC: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
With this change, I get the following failure for its-trigger on a
rockpro64 (running on the little cores):

$ taskset -c 0-3 arm/run arm/gic.flat -smp 4 -machine gic-version=3 -append its-trigger
/usr/bin/qemu-system-aarch64 -nodefaults -machine virt,gic-version=host,accel=kvm -cpu host -device virtio-serial-device -device virtconsole,chardev=ctd -chardev testdev,id=ctd -device pci-testdev -display none -serial stdio -kernel arm/gic.flat -smp 4 -machine gic-version=3 -append its-trigger # -initrd /tmp/tmp.wWW0iJY6DS
ITS: MAPD devid=2 size = 0x8 itt=0x403a0000 valid=1
ITS: MAPD devid=7 size = 0x8 itt=0x403b0000 valid=1
MAPC col_id=3 target_addr = 0x30000 valid=1
MAPC col_id=2 target_addr = 0x20000 valid=1
INVALL col_id=2
INVALL col_id=3
MAPTI dev_id=2 event_id=20 -> phys_id=8195, col_id=3
MAPTI dev_id=7 event_id=255 -> phys_id=8196, col_id=2
INT dev_id=2 event_id=20
PASS: gicv3: its-trigger: int: dev=2, eventid=20  -> lpi= 8195, col=3
INT dev_id=7 event_id=255
PASS: gicv3: its-trigger: int: dev=7, eventid=255 -> lpi= 8196, col=2
INV dev_id=2 event_id=20
INT dev_id=2 event_id=20
PASS: gicv3: its-trigger: inv/invall: dev2/eventid=20 does not trigger any LPI
INT dev_id=2 event_id=20
PASS: gicv3: its-trigger: inv/invall: dev2/eventid=20 still does not trigger any LPI
INVALL col_id=3
INT dev_id=2 event_id=20
INFO: gicv3: its-trigger: inv/invall: ACKS: missing=0 extra=1 unexpected=0
FAIL: gicv3: its-trigger: inv/invall: dev2/eventid=20 now triggers an LPI
ITS: MAPD devid=2 size = 0x8 itt=0x403a0000 valid=0
INT dev_id=2 event_id=20
PASS: gicv3: its-trigger: mapd valid=false: no LPI after device unmap
SUMMARY: 6 tests, 1 unexpected failures

The reason for the failure is that the test "dev2/eventid=20 now triggers
an LPI" triggers 2 LPIs, not one. This behavior was present before this
patch, but it was ignored because check_lpi_stats() wasn't looking at the
acked array.
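
For readers trying to decode the "ACKS: missing=0 extra=1 unexpected=0"
line, here is a rough sketch of how I understand the three counters. It is
a standalone approximation, not a copy of check_acked(): classify(),
struct ack_summary and the plain bitmask standing in for cpumask_t are
assumptions made for the example.

```c
#include <assert.h>

#define NR_CPUS 4	/* hypothetical size, for illustration only */

struct ack_summary {
	int missing;	/* targeted CPU, no interrupt seen */
	int extra;	/* targeted CPU, more than one interrupt seen */
	int unexpected;	/* untargeted CPU, at least one interrupt seen */
};

/* Bit n of mask set means CPU n is expected to receive the interrupt. */
static struct ack_summary classify(const int acked[NR_CPUS], unsigned int mask)
{
	struct ack_summary s = { 0, 0, 0 };

	assert(acked);
	for (int cpu = 0; cpu < NR_CPUS; ++cpu) {
		if (mask & (1u << cpu)) {
			if (!acked[cpu])
				++s.missing;
			else if (acked[cpu] > 1)
				++s.extra;
		} else if (acked[cpu]) {
			++s.unexpected;
		}
	}
	return s;
}
```

With CPU 3 as the only target and acked[3] == 2, this yields missing=0,
extra=1, unexpected=0, which matches the INFO line in the log above.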

I'm not familiar with the ITS so I'm not sure if this is expected, if the
test is incorrect or if there is something wrong with KVM emulation.

Did some more testing on an Ampere eMAG (fast out-of-order cores) using
qemu and kvmtool and Linux v5.8, here's what I found:

- Using qemu and gic.flat built from *master*: error encountered 864 times
  out of 1088 runs.
- Using qemu: error encountered 852 times out of 1027 runs.
- Using kvmtool: error encountered 8164 times out of 10602 runs.

Looks to me like it's consistent between master and this series, and
between qemu and kvmtool.
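
As percentages, those counts work out to roughly 79% (qemu, master), 83%
(qemu, this series) and 77% (kvmtool). A throwaway helper for the
arithmetic, with hypothetical names, could look like this:

```c
#include <assert.h>
#include <stdio.h>

/* Failure rate in percent for a given number of failing runs. */
static double failure_rate_pct(unsigned int failures, unsigned int runs)
{
	assert(runs > 0);
	return 100.0 * failures / runs;
}

/* Print one rate per configuration, using the counts listed above. */
static void print_failure_rates(void)
{
	printf("qemu, master: %.1f%%\n", failure_rate_pct(864, 1088));
	printf("qemu, series: %.1f%%\n", failure_rate_pct(852, 1027));
	printf("kvmtool:      %.1f%%\n", failure_rate_pct(8164, 10602));
}
```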

Here's the diff that I used for testing master (I removed the diff line
because it causes trouble when applying the main patch):

@@ -772,8 +772,12 @@ static void test_its_trigger(void)
        /* Now call the invall and check the LPI hits */
        its_send_invall(col3);
        lpi_stats_expect(3, 8195);
+       acked[3] = 0;
+       dsb(ishst);
        its_send_int(dev2, 20);
        check_lpi_stats("dev2/eventid=20 now triggers an LPI");
+       report_info("acked[3] = %d", acked[3]);
+       report(acked[3] == 1, "dev2/eventid=20 received one interrupt");
 
        report_prefix_pop();
 

 arm/gic.c | 185 ++++++++++++++++++++++++++----------------------------
 1 file changed, 88 insertions(+), 97 deletions(-)

diff --git a/arm/gic.c b/arm/gic.c
index da7b42da5449..6e93da80fe0d 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -111,7 +111,7 @@ static bool check_acked(cpumask_t *mask, int sender, int irqnum)
 		}
 		if (!acked[cpu])
 			continue;
-		smp_rmb(); /* pairs with smp_wmb in ipi_handler */
+		smp_rmb(); /* pairs with smp_wmb in irq_handler */
 
 		if (has_gicv2 && irq_sender[cpu] != sender) {
 			report_info("cpu%d received IPI from wrong sender %d",
@@ -149,11 +149,12 @@ static void check_spurious(void)
 static int gic_get_sender(int irqstat)
 {
 	if (gic_version() == 2)
+		/* GICC_IAR.CPUID is RAZ for non-SGIs */
 		return (irqstat >> 10) & 7;
 	return -1;
 }
 
-static void ipi_handler(struct pt_regs *regs __unused)
+static void irq_handler(struct pt_regs *regs __unused)
 {
 	u32 irqstat = gic_read_iar();
 	u32 irqnr = gic_iar_irqnr(irqstat);
@@ -192,75 +193,6 @@ static void setup_irq(irq_handler_fn handler)
 }
 
 #if defined(__aarch64__)
-struct its_event {
-	int cpu_id;
-	int lpi_id;
-};
-
-struct its_stats {
-	struct its_event expected;
-	struct its_event observed;
-};
-
-static struct its_stats lpi_stats;
-
-static void lpi_handler(struct pt_regs *regs __unused)
-{
-	u32 irqstat = gic_read_iar();
-	int irqnr = gic_iar_irqnr(irqstat);
-
-	gic_write_eoir(irqstat);
-	assert(irqnr >= 8192);
-	smp_rmb(); /* pairs with wmb in lpi_stats_expect */
-	lpi_stats.observed.cpu_id = smp_processor_id();
-	lpi_stats.observed.lpi_id = irqnr;
-	acked[lpi_stats.observed.cpu_id]++;
-	smp_wmb(); /* pairs with rmb in check_lpi_stats */
-}
-
-static void lpi_stats_expect(int exp_cpu_id, int exp_lpi_id)
-{
-	lpi_stats.expected.cpu_id = exp_cpu_id;
-	lpi_stats.expected.lpi_id = exp_lpi_id;
-	lpi_stats.observed.cpu_id = -1;
-	lpi_stats.observed.lpi_id = -1;
-	smp_wmb(); /* pairs with rmb in handler */
-}
-
-static void check_lpi_stats(const char *msg)
-{
-	int i;
-
-	for (i = 0; i < 50; i++) {
-		mdelay(100);
-		smp_rmb(); /* pairs with wmb in lpi_handler */
-		if (lpi_stats.observed.cpu_id == lpi_stats.expected.cpu_id &&
-		    lpi_stats.observed.lpi_id == lpi_stats.expected.lpi_id) {
-			report(true, "%s", msg);
-			return;
-		}
-	}
-
-	if (lpi_stats.observed.cpu_id == -1 && lpi_stats.observed.lpi_id == -1) {
-		report_info("No LPI received whereas (cpuid=%d, intid=%d) "
-			    "was expected", lpi_stats.expected.cpu_id,
-			    lpi_stats.expected.lpi_id);
-	} else {
-		report_info("Unexpected LPI (cpuid=%d, intid=%d)",
-			    lpi_stats.observed.cpu_id,
-			    lpi_stats.observed.lpi_id);
-	}
-	report(false, "%s", msg);
-}
-
-static void secondary_lpi_test(void)
-{
-	setup_irq(lpi_handler);
-	cpumask_set_cpu(smp_processor_id(), &ready);
-	while (1)
-		wfi();
-}
-
 static void check_lpi_hits(int *expected, const char *msg)
 {
 	bool pass = true;
@@ -347,7 +279,7 @@ static void ipi_test_smp(void)
 
 static void ipi_send(void)
 {
-	setup_irq(ipi_handler);
+	setup_irq(irq_handler);
 	wait_on_ready();
 	ipi_test_self();
 	ipi_test_smp();
@@ -355,9 +287,9 @@ static void ipi_send(void)
 	exit(report_summary());
 }
 
-static void ipi_recv(void)
+static void irq_recv(void)
 {
-	setup_irq(ipi_handler);
+	setup_irq(irq_handler);
 	cpumask_set_cpu(smp_processor_id(), &ready);
 	while (1)
 		wfi();
@@ -368,7 +300,7 @@ static void ipi_test(void *data __unused)
 	if (smp_processor_id() == IPI_SENDER)
 		ipi_send();
 	else
-		ipi_recv();
+		irq_recv();
 }
 
 static struct gic gicv2 = {
@@ -698,12 +630,12 @@ static int its_prerequisites(int nb_cpus)
 
 	stats_reset();
 
-	setup_irq(lpi_handler);
+	setup_irq(irq_handler);
 
 	for_each_present_cpu(cpu) {
 		if (cpu == 0)
 			continue;
-		smp_boot_secondary(cpu, secondary_lpi_test);
+		smp_boot_secondary(cpu, irq_recv);
 	}
 	wait_on_ready();
 
@@ -757,6 +689,7 @@ static void test_its_trigger(void)
 {
 	struct its_collection *col3;
 	struct its_device *dev2, *dev7;
+	cpumask_t mask;
 
 	if (its_setup1())
 		return;
@@ -767,13 +700,27 @@ static void test_its_trigger(void)
 
 	report_prefix_push("int");
 
-	lpi_stats_expect(3, 8195);
+	stats_reset();
+	/*
+	 * its_send_int() is missing the synchronization from the GICv3 IPI
+	 * trigger functions.
+	 */
+	wmb();
+	cpumask_clear(&mask);
+	cpumask_set_cpu(3, &mask);
 	its_send_int(dev2, 20);
-	check_lpi_stats("dev=2, eventid=20  -> lpi= 8195, col=3");
+	wait_for_interrupts(&mask);
+	report(check_acked(&mask, 0, 8195),
+			"dev=2, eventid=20  -> lpi= 8195, col=3");
 
-	lpi_stats_expect(2, 8196);
+	stats_reset();
+	wmb();
+	cpumask_clear(&mask);
+	cpumask_set_cpu(2, &mask);
 	its_send_int(dev7, 255);
-	check_lpi_stats("dev=7, eventid=255 -> lpi= 8196, col=2");
+	wait_for_interrupts(&mask);
+	report(check_acked(&mask, 0, 8196),
+			"dev=7, eventid=255 -> lpi= 8196, col=2");
 
 	report_prefix_pop();
 
@@ -786,9 +733,13 @@ static void test_its_trigger(void)
 	gicv3_lpi_set_config(8195, LPI_PROP_DEFAULT & ~LPI_PROP_ENABLED);
 	its_send_inv(dev2, 20);
 
-	lpi_stats_expect(-1, -1);
+	stats_reset();
+	wmb();
+	cpumask_clear(&mask);
 	its_send_int(dev2, 20);
-	check_lpi_stats("dev2/eventid=20 does not trigger any LPI");
+	wait_for_interrupts(&mask);
+	report(check_acked(&mask, -1, -1),
+			"dev2/eventid=20 does not trigger any LPI");
 
 	/*
 	 * re-enable the LPI but willingly do not call invall
@@ -796,15 +747,24 @@ static void test_its_trigger(void)
 	 * The LPI should not hit
 	 */
 	gicv3_lpi_set_config(8195, LPI_PROP_DEFAULT);
-	lpi_stats_expect(-1, -1);
+	stats_reset();
+	wmb();
+	cpumask_clear(&mask);
 	its_send_int(dev2, 20);
-	check_lpi_stats("dev2/eventid=20 still does not trigger any LPI");
+	wait_for_interrupts(&mask);
+	report(check_acked(&mask, -1, -1),
+			"dev2/eventid=20 still does not trigger any LPI");
 
 	/* Now call the invall and check the LPI hits */
 	its_send_invall(col3);
-	lpi_stats_expect(3, 8195);
+	stats_reset();
+	wmb();
+	cpumask_clear(&mask);
+	cpumask_set_cpu(3, &mask);
 	its_send_int(dev2, 20);
-	check_lpi_stats("dev2/eventid=20 now triggers an LPI");
+	wait_for_interrupts(&mask);
+	report(check_acked(&mask, 0, 8195),
+			"dev2/eventid=20 now triggers an LPI");
 
 	report_prefix_pop();
 
@@ -815,9 +775,14 @@ static void test_its_trigger(void)
 	 */
 
 	its_send_mapd(dev2, false);
-	lpi_stats_expect(-1, -1);
+	stats_reset();
+	wmb();
+	cpumask_clear(&mask);
 	its_send_int(dev2, 20);
-	check_lpi_stats("no LPI after device unmap");
+	wait_for_interrupts(&mask);
+	report(check_acked(&mask, -1, -1), "no LPI after device unmap");
+
+	check_spurious();
 	report_prefix_pop();
 }
 
@@ -825,6 +790,7 @@ static void test_its_migration(void)
 {
 	struct its_device *dev2, *dev7;
 	bool test_skipped = false;
+	cpumask_t mask;
 
 	if (its_setup1()) {
 		test_skipped = true;
@@ -841,13 +807,25 @@ do_migrate:
 	if (test_skipped)
 		return;
 
-	lpi_stats_expect(3, 8195);
+	stats_reset();
+	wmb();
+	cpumask_clear(&mask);
+	cpumask_set_cpu(3, &mask);
 	its_send_int(dev2, 20);
-	check_lpi_stats("dev2/eventid=20 triggers LPI 8195 on PE #3 after migration");
+	wait_for_interrupts(&mask);
+	report(check_acked(&mask, 0, 8195),
+			"dev2/eventid=20 triggers LPI 8195 on PE #3 after migration");
 
-	lpi_stats_expect(2, 8196);
+	stats_reset();
+	wmb();
+	cpumask_clear(&mask);
+	cpumask_set_cpu(2, &mask);
 	its_send_int(dev7, 255);
-	check_lpi_stats("dev7/eventid=255 triggers LPI 8196 on PE #2 after migration");
+	wait_for_interrupts(&mask);
+	report(check_acked(&mask, 0, 8196),
+			"dev7/eventid=255 triggers LPI 8196 on PE #2 after migration");
+
+	check_spurious();
 }
 
 #define ERRATA_UNMAPPED_COLLECTIONS "ERRATA_8c58be34494b"
@@ -857,6 +835,7 @@ static void test_migrate_unmapped_collection(void)
 	struct its_collection *col = NULL;
 	struct its_device *dev2 = NULL, *dev7 = NULL;
 	bool test_skipped = false;
+	cpumask_t mask;
 	int pe0 = 0;
 	u8 config;
 
@@ -891,17 +870,29 @@ do_migrate:
 	its_send_mapc(col, true);
 	its_send_invall(col);
 
-	lpi_stats_expect(2, 8196);
+	stats_reset();
+	wmb();
+	cpumask_clear(&mask);
+	cpumask_set_cpu(2, &mask);
 	its_send_int(dev7, 255);
-	check_lpi_stats("dev7/eventid= 255 triggered LPI 8196 on PE #2");
+	wait_for_interrupts(&mask);
+	report(check_acked(&mask, 0, 8196),
+			"dev7/eventid= 255 triggered LPI 8196 on PE #2");
 
 	config = gicv3_lpi_get_config(8192);
 	report(config == LPI_PROP_DEFAULT,
 	       "Config of LPI 8192 was properly migrated");
 
-	lpi_stats_expect(pe0, 8192);
+	stats_reset();
+	wmb();
+	cpumask_clear(&mask);
+	cpumask_set_cpu(pe0, &mask);
 	its_send_int(dev2, 0);
-	check_lpi_stats("dev2/eventid = 0 triggered LPI 8192 on PE0");
+	wait_for_interrupts(&mask);
+	report(check_acked(&mask, 0, 8192),
+			"dev2/eventid = 0 triggered LPI 8192 on PE0");
+
+	check_spurious();
 }
 
 static void test_its_pending_migration(void)
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 78+ messages in thread

* Re: [kvm-unit-tests PATCH 10/10] arm64: gic: Use IPI test checking for the LPI tests
  2020-11-25 15:51   ` Alexandru Elisei
@ 2020-11-26  9:30     ` Zenghui Yu
  -1 siblings, 0 replies; 78+ messages in thread
From: Zenghui Yu @ 2020-11-26  9:30 UTC (permalink / raw)
  To: Alexandru Elisei, kvm, kvmarm, drjones; +Cc: andre.przywara, Eric Auger

On 2020/11/25 23:51, Alexandru Elisei wrote:
> The reason for the failure is that the test "dev2/eventid=20 now triggers
> an LPI" triggers 2 LPIs, not one. This behavior was present before this
> patch, but it was ignored because check_lpi_stats() wasn't looking at the
> acked array.
> 
> I'm not familiar with the ITS so I'm not sure if this is expected, if the
> test is incorrect or if there is something wrong with KVM emulation.

I think this is expected, or not.

Before INVALL, the LPI-8195 was already pending but disabled. On
receiving INVALL, VGIC will reload configuration for all LPIs targeting
collection-3 and deliver the now enabled LPI-8195. We'll therefore see
and handle it before sending the following INT (which will set the
LPI-8195 pending again).

> Did some more testing on an Ampere eMAG (fast out-of-order cores) using
> qemu and kvmtool and Linux v5.8, here's what I found:
> 
> - Using qemu and gic.flat built from *master*: error encountered 864 times
>    out of 1088 runs.
> - Using qemu: error encountered 852 times out of 1027 runs.
> - Using kvmtool: error encountered 8164 times out of 10602 runs.

If vcpu-3 hadn't seen and handled LPI-8195 as quickly as possible (e.g.,
vcpu-3 hadn't been scheduled), the following INT will set the already
pending LPI-8195 pending again and we'll receive it *once* on vcpu-3.
And we won't see the mentioned failure.

I think we can just drop the (meaningless and confusing?) INT.
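
The sequence described above can be sketched as a small state model. This
is only an illustration: the struct and the model_*() helpers are
hypothetical names, not the KVM VGIC code, and the model assumes that the
pending state is a single latch (setting it twice is the same as setting
it once) and that the enable bit is only re-read on INVALL.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one LPI as seen by its target vCPU. */
struct lpi_model {
	bool cfg_enabled;	/* enable bit in the in-memory config table */
	bool cached_enabled;	/* enable bit as last loaded on INV/INVALL */
	bool pending;		/* pending latch, not a counter */
	int delivered;		/* how many times the handler ran */
};

/* INT: mark the LPI pending. */
static void model_int(struct lpi_model *l)
{
	l->pending = true;
}

/* INVALL: reload the cached configuration from memory. */
static void model_invall(struct lpi_model *l)
{
	l->cached_enabled = l->cfg_enabled;
}

/* The target vCPU gets to run and takes the LPI if pending and enabled. */
static void model_vcpu_runs(struct lpi_model *l)
{
	if (l->pending && l->cached_enabled) {
		l->pending = false;
		l->delivered++;
	}
}

/*
 * Replay the tail of test_its_trigger() and return the number of
 * deliveries. vcpu_runs_before_int selects whether the vCPU handles the
 * LPI between the INVALL and the following INT.
 */
static int model_sequence(bool vcpu_runs_before_int)
{
	struct lpi_model l = { .cfg_enabled = false, .cached_enabled = false };

	model_int(&l);			/* INT while disabled: latched only */
	model_vcpu_runs(&l);
	assert(l.delivered == 0);

	l.cfg_enabled = true;		/* re-enable, but no INVALL yet */
	model_int(&l);
	model_vcpu_runs(&l);
	assert(l.delivered == 0);

	model_invall(&l);		/* the pending LPI is now deliverable */
	if (vcpu_runs_before_int)
		model_vcpu_runs(&l);
	model_int(&l);
	model_vcpu_runs(&l);

	return l.delivered;
}
```

In this model, model_sequence(true) counts two deliveries (the failure
reported in the patch) and model_sequence(false) counts one, which matches
the timing dependence described above.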


Thanks,
Zenghui

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [kvm-unit-tests PATCH 10/10] arm64: gic: Use IPI test checking for the LPI tests
  2020-11-26  9:30     ` Zenghui Yu
@ 2020-11-27 14:50       ` Alexandru Elisei
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexandru Elisei @ 2020-11-27 14:50 UTC (permalink / raw)
  To: Zenghui Yu, kvm, kvmarm, drjones; +Cc: andre.przywara, Eric Auger

Hi Zenghui,

Thank you for having a look at this!

On 11/26/20 9:30 AM, Zenghui Yu wrote:
> On 2020/11/25 23:51, Alexandru Elisei wrote:
>> The reason for the failure is that the test "dev2/eventid=20 now triggers
>> an LPI" triggers 2 LPIs, not one. This behavior was present before this
>> patch, but it was ignored because check_lpi_stats() wasn't looking at the
>> acked array.
>>
>> I'm not familiar with the ITS so I'm not sure if this is expected, if the
>> test is incorrect or if there is something wrong with KVM emulation.
>
> I think this is expected, or not.
>
> Before INVALL, the LPI-8195 was already pending but disabled. On
> receiving INVALL, VGIC will reload configuration for all LPIs targeting
> collection-3 and deliver the now enabled LPI-8195. We'll therefore see
> and handle it before sending the following INT (which will set the
> LPI-8195 pending again).
>
>> Did some more testing on an Ampere eMAG (fast out-of-order cores) using
>> qemu and kvmtool and Linux v5.8, here's what I found:
>>
>> - Using qemu and gic.flat built from *master*: error encountered 864 times
>>    out of 1088 runs.
>> - Using qemu: error encountered 852 times out of 1027 runs.
>> - Using kvmtool: error encountered 8164 times out of 10602 runs.
>
> If vcpu-3 hadn't seen and handled LPI-8195 as quickly as possible (e.g.,
> vcpu-3 hadn't been scheduled), the following INT will set the already
> pending LPI-8195 pending again and we'll receive it *once* on vcpu-3.
> And we won't see the mentioned failure.
>
> I think we can just drop the (meaningless and confusing?) INT.

I think I understand your explanation, the VCPU takes the interrupt immediately
after the INVALL and before the INT, and the second interrupt that I am seeing is
the one caused by the INT command.

I tried modifying the test like this:

diff --git a/arm/gic.c b/arm/gic.c
index 6e93da80fe0d..0ef8c12ea234 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -761,10 +761,17 @@ static void test_its_trigger(void)
        wmb();
        cpumask_clear(&mask);
        cpumask_set_cpu(3, &mask);
-       its_send_int(dev2, 20);
        wait_for_interrupts(&mask);
        report(check_acked(&mask, 0, 8195),
-                       "dev2/eventid=20 now triggers an LPI");
+                       "dev2/eventid=20 pending LPI is received");
+
+       stats_reset();
+       wmb();
+       cpumask_clear(&mask);
+       cpumask_set_cpu(3, &mask);
+       its_send_int(dev2, 20);
+       wait_for_interrupts(&mask);
+       report(check_acked(&mask, 0, 8195), "dev2/eventid=20 triggers an LPI");
 
        report_prefix_pop();
 
I removed the INT from the initial test, and added a separate one to check that
the INT command still works. That looks to me like it preserves the spirit of the
original test. After doing stress testing this is what I got:

- with kvmtool, 47,709 iterations, 27 times the test timed out when waiting for
the interrupt after INVALL.
- with qemu, 15,511 iterations, 258 times the test timed out when waiting for the
interrupt after INVALL, just like with kvmtool.

Judging from the fact that there are an order of magnitude fewer failures with
kvmtool than with qemu, I'm leaning towards some random timing issue. I will try
increasing the timeout for wait_for_interrupts() and see if the results improve
over the weekend.
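
For reference, the failure rates behind that comparison can be worked out
from the counts above. This is a minimal sketch; failure_rate() and
rate_ratio() are hypothetical helper names used only for this arithmetic.

```c
#include <assert.h>

/* Failure rate as a fraction of runs; runs must be positive. */
static double failure_rate(long failures, long runs)
{
	assert(runs > 0);
	return (double)failures / (double)runs;
}

/* How many times higher the failure rate of A is compared to B. */
static double rate_ratio(long fail_a, long runs_a, long fail_b, long runs_b)
{
	assert(fail_b > 0);
	return failure_rate(fail_a, runs_a) / failure_rate(fail_b, runs_b);
}
```

With the counts above, failure_rate(27, 47709) is roughly 0.06% for kvmtool
and failure_rate(258, 15511) is roughly 1.7% for qemu, so
rate_ratio(258, 15511, 27, 47709) comes out at about 29.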

Thanks,
Alex
>
>
> Thanks,
> Zenghui

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* Re: [kvm-unit-tests PATCH 10/10] arm64: gic: Use IPI test checking for the LPI tests
  2020-11-27 14:50       ` Alexandru Elisei
@ 2020-11-30 13:59         ` Zenghui Yu
  -1 siblings, 0 replies; 78+ messages in thread
From: Zenghui Yu @ 2020-11-30 13:59 UTC (permalink / raw)
  To: Alexandru Elisei, kvm, kvmarm, drjones; +Cc: andre.przywara, Eric Auger

Hi Alex,

On 2020/11/27 22:50, Alexandru Elisei wrote:
> Hi Zenghui,
> 
> Thank you for having a look at this!
> 
> On 11/26/20 9:30 AM, Zenghui Yu wrote:
>> On 2020/11/25 23:51, Alexandru Elisei wrote:
>>> The reason for the failure is that the test "dev2/eventid=20 now triggers
>>> an LPI" triggers 2 LPIs, not one. This behavior was present before this
>>> patch, but it was ignored because check_lpi_stats() wasn't looking at the
>>> acked array.
>>>
>>> I'm not familiar with the ITS so I'm not sure if this is expected, if the
>>> test is incorrect or if there is something wrong with KVM emulation.
>>
>> I think this is expected, or not.
>>
>> Before INVALL, the LPI-8195 was already pending but disabled. On
>> receiving INVALL, VGIC will reload configuration for all LPIs targeting
>> collection-3 and deliver the now enabled LPI-8195. We'll therefore see
>> and handle it before sending the following INT (which will set the
>> LPI-8195 pending again).
>>
>>> Did some more testing on an Ampere eMAG (fast out-of-order cores) using
>>> qemu and kvmtool and Linux v5.8, here's what I found:
>>>
>>> - Using qemu and gic.flat built from *master*: error encountered 864 times
>>>     out of 1088 runs.
>>> - Using qemu: error encountered 852 times out of 1027 runs.
>>> - Using kvmtool: error encountered 8164 times out of 10602 runs.
>>
>> If vcpu-3 hadn't seen and handled LPI-8195 as quickly as possible (e.g.,
>> vcpu-3 hadn't been scheduled), the following INT will set the already
>> pending LPI-8195 pending again and we'll receive it *once* on vcpu-3.
>> And we won't see the mentioned failure.
>>
>> I think we can just drop the (meaningless and confusing?) INT.
> 
> I think I understand your explanation, the VCPU takes the interrupt immediately
> after the INVALL and before the INT, and the second interrupt that I am seeing is
> the one caused by the INT command.

Yes.

> I tried modifying the test like this:
> 
> diff --git a/arm/gic.c b/arm/gic.c
> index 6e93da80fe0d..0ef8c12ea234 100644
> --- a/arm/gic.c
> +++ b/arm/gic.c
> @@ -761,10 +761,17 @@ static void test_its_trigger(void)
>          wmb();
>          cpumask_clear(&mask);
>          cpumask_set_cpu(3, &mask);
> -       its_send_int(dev2, 20);

Shouldn't its_send_invall(col3) be moved down here? See below.

>          wait_for_interrupts(&mask);
>          report(check_acked(&mask, 0, 8195),
> -                       "dev2/eventid=20 now triggers an LPI");
> +                       "dev2/eventid=20 pending LPI is received");
> +
> +       stats_reset();
> +       wmb();
> +       cpumask_clear(&mask);
> +       cpumask_set_cpu(3, &mask);
> +       its_send_int(dev2, 20);
> +       wait_for_interrupts(&mask);
> +       report(check_acked(&mask, 0, 8195), "dev2/eventid=20 triggers an LPI");
>   
>          report_prefix_pop();
>   
> I removed the INT from the initial test, and added a separate one to check that
> the INT command still works. That looks to me like it preserves the spirit of the
> original test. After doing stress testing this is what I got:
> 
> - with kvmtool, 47,709 iterations, 27 times the test timed out when waiting for
> the interrupt after INVALL.
> - with qemu, 15,511 iterations, 258 times the test timed out when waiting for the
> interrupt after INVALL, just like with kvmtool.

I guess the reason for the failure is that the LPI is taken *immediately*
after the INVALL?

	/* Now call the invall and check the LPI hits */
	its_send_invall(col3);
		<- LPI is taken, acked[]++
	stats_reset();
		<- acked[] is cleared unexpectedly
	wmb();
	cpumask_clear(&mask);
	cpumask_set_cpu(3, &mask);
	wait_for_interrupts(&mask);
		<- we'll time out since acked[] is 0
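
The ordering problem annotated above can be reproduced with a
single-threaded model. This is only a sketch: acked_model and the
model_*() functions are hypothetical stand-ins for acked[],
stats_reset(), the LPI handler and wait_for_interrupts(), and the
interrupt is assumed to arrive exactly once, right after the INVALL.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the per-CPU acked[] counter of the target CPU. */
static int acked_model;

/* Stand-in for stats_reset(): forget every ack seen so far. */
static void model_stats_reset(void)
{
	acked_model = 0;
	assert(acked_model == 0);
}

/* Stand-in for the LPI handler running on the target CPU. */
static void model_lpi_handler(void)
{
	acked_model++;
}

/* Stand-in for wait_for_interrupts(): true if exactly one ack is seen. */
static bool model_wait(void)
{
	return acked_model == 1;
}

/* Trigger first, reset afterwards: the ack is wiped before the wait. */
static bool model_reset_after_trigger(void)
{
	acked_model = 0;
	model_lpi_handler();	/* LPI taken right after INVALL */
	model_stats_reset();	/* ack cleared unexpectedly */
	return model_wait();	/* would time out in the real test */
}

/* Reset first, trigger afterwards: the ack survives until the wait. */
static bool model_reset_before_trigger(void)
{
	model_stats_reset();
	model_lpi_handler();	/* LPI taken right after INVALL */
	return model_wait();
}
```

In this model only the reset-before-trigger ordering observes the ack,
which is why moving its_send_invall(col3) below the reset is suggested.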


Thanks,
Zenghui

> Judging from the fact that there are an order of magnitude fewer failures with
> kvmtool than with qemu, I'm leaning towards some random timing issue. I will try
> increasing the timeout for wait_for_interrupts() and see if the results improve
> over the weekend.
> 
> Thanks,
> Alex
>>
>>
>> Thanks,
>> Zenghui
> .
> 

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [kvm-unit-tests PATCH 10/10] arm64: gic: Use IPI test checking for the LPI tests
  2020-11-30 13:59         ` Zenghui Yu
@ 2020-11-30 14:19           ` Alexandru Elisei
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexandru Elisei @ 2020-11-30 14:19 UTC (permalink / raw)
  To: Zenghui Yu, kvm, kvmarm, drjones; +Cc: andre.przywara, Eric Auger

Hi Zenghui,

On 11/30/20 1:59 PM, Zenghui Yu wrote:
> Hi Alex,
>
> On 2020/11/27 22:50, Alexandru Elisei wrote:
>> Hi Zenghui,
>>
>> Thank you for having a look at this!
>>
>> On 11/26/20 9:30 AM, Zenghui Yu wrote:
>>> On 2020/11/25 23:51, Alexandru Elisei wrote:
>>>> The reason for the failure is that the test "dev2/eventid=20 now triggers
>>>> an LPI" triggers 2 LPIs, not one. This behavior was present before this
>>>> patch, but it was ignored because check_lpi_stats() wasn't looking at the
>>>> acked array.
>>>>
>>>> I'm not familiar with the ITS so I'm not sure if this is expected, if the
>>>> test is incorrect or if there is something wrong with KVM emulation.
>>>
>>> I think this is expected, or not.
>>>
>>> Before INVALL, the LPI-8195 was already pending but disabled. On
>>> receiving INVALL, VGIC will reload configuration for all LPIs targeting
>>> collection-3 and deliver the now enabled LPI-8195. We'll therefore see
>>> and handle it before sending the following INT (which will set the
>>> LPI-8195 pending again).
>>>
>>>> Did some more testing on an Ampere eMAG (fast out-of-order cores) using
>>>> qemu and kvmtool and Linux v5.8, here's what I found:
>>>>
>>>> - Using qemu and gic.flat built from *master*: error encountered 864 times
>>>>     out of 1088 runs.
>>>> - Using qemu: error encountered 852 times out of 1027 runs.
>>>> - Using kvmtool: error encountered 8164 times out of 10602 runs.
>>>
>>> If vcpu-3 hadn't seen and handled LPI-8195 as quickly as possible (e.g.,
>>> vcpu-3 hadn't been scheduled), the following INT will set the already
>>> pending LPI-8195 pending again and we'll receive it *once* on vcpu-3.
>>> And we won't see the mentioned failure.
>>>
>>> I think we can just drop the (meaningless and confusing?) INT.
>>
>> I think I understand your explanation, the VCPU takes the interrupt immediately
>> after the INVALL and before the INT, and the second interrupt that I am seeing is
>> the one caused by the INT command.
>
> Yes.
>
>> I tried modifying the test like this:
>>
>> diff --git a/arm/gic.c b/arm/gic.c
>> index 6e93da80fe0d..0ef8c12ea234 100644
>> --- a/arm/gic.c
>> +++ b/arm/gic.c
>> @@ -761,10 +761,17 @@ static void test_its_trigger(void)
>>          wmb();
>>          cpumask_clear(&mask);
>>          cpumask_set_cpu(3, &mask);
>> -       its_send_int(dev2, 20);
>
> Shouldn't its_send_invall(col3) be moved down here? See below.
>
>>          wait_for_interrupts(&mask);
>>          report(check_acked(&mask, 0, 8195),
>> -                       "dev2/eventid=20 now triggers an LPI");
>> +                       "dev2/eventid=20 pending LPI is received");
>> +
>> +       stats_reset();
>> +       wmb();
>> +       cpumask_clear(&mask);
>> +       cpumask_set_cpu(3, &mask);
>> +       its_send_int(dev2, 20);
>> +       wait_for_interrupts(&mask);
>> +       report(check_acked(&mask, 0, 8195), "dev2/eventid=20 triggers an LPI");
>>  
>>         report_prefix_pop();
>>  
>> I removed the INT from the initial test, and added a separate one to check that
>> the INT command still works. That looks to me like it preserves the spirit of the
>> original test. After doing stress testing this is what I got:
>>
>> - with kvmtool, 47,709 iterations, 27 times the test timed out when waiting for
>> the interrupt after INVALL.
>> - with qemu, 15,511 iterations, 258 times the test timed out when waiting for the
>> interrupt after INVALL, just like with kvmtool.
>
> I guess the reason for the failure is that the LPI is taken *immediately*
> after the INVALL?
>
>     /* Now call the invall and check the LPI hits */
>     its_send_invall(col3);
>         <- LPI is taken, acked[]++
>     stats_reset();
>         <- acked[] is cleared unexpectedly
>     wmb();
>     cpumask_clear(&mask);
>     cpumask_set_cpu(3, &mask);
>     wait_for_interrupts(&mask);
>         <- we'll time out since acked[] is 0

Yes, of course, you're right: I didn't realize that I was resetting the stats
*after* the interrupt was enabled. This also explains why I was still seeing
timeouts even when the timeout duration was set to 50 seconds. I'll retest with
the fix:

diff --git a/arm/gic.c b/arm/gic.c
index 6e93da80fe0d..c4240f5aba39 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -756,15 +756,22 @@ static void test_its_trigger(void)
                        "dev2/eventid=20 still does not trigger any LPI");
 
        /* Now call the invall and check the LPI hits */
+       stats_reset();
+       wmb();
+       cpumask_clear(&mask);
+       cpumask_set_cpu(3, &mask);
        its_send_invall(col3);
+       wait_for_interrupts(&mask);
+       report(check_acked(&mask, 0, 8195),
+                       "dev2/eventid=20 pending LPI is received");
+
        stats_reset();
        wmb();
        cpumask_clear(&mask);
        cpumask_set_cpu(3, &mask);
        its_send_int(dev2, 20);
        wait_for_interrupts(&mask);
-       report(check_acked(&mask, 0, 8195),
-                       "dev2/eventid=20 now triggers an LPI");
+       report(check_acked(&mask, 0, 8195), "dev2/eventid=20 triggers an LPI");
 
        report_prefix_pop();
 
I also pushed a branch at [1].

Thank you so much for spotting this! You've saved me (and probably others) a lot
of time debugging.

[1] https://gitlab.arm.com/linux-arm/kvm-unit-tests-ae/-/tree/fixes1-v2

Thanks,
Alex

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* Re: [kvm-unit-tests PATCH 10/10] arm64: gic: Use IPI test checking for the LPI tests
@ 2020-11-30 14:19           ` Alexandru Elisei
  0 siblings, 0 replies; 78+ messages in thread
From: Alexandru Elisei @ 2020-11-30 14:19 UTC (permalink / raw)
  To: Zenghui Yu, kvm, kvmarm, drjones; +Cc: andre.przywara

Hi Zenghui,

On 11/30/20 1:59 PM, Zenghui Yu wrote:
> Hi Alex,
>
> On 2020/11/27 22:50, Alexandru Elisei wrote:
>> Hi Zhenghui,
>>
>> Thank you for having a look at this!
>>
>> On 11/26/20 9:30 AM, Zenghui Yu wrote:
>>> On 2020/11/25 23:51, Alexandru Elisei wrote:
>>>> The reason for the failure is that the test "dev2/eventid=20 now triggers
>>>> an LPI" triggers 2 LPIs, not one. This behavior was present before this
>>>> patch, but it was ignored because check_lpi_stats() wasn't looking at the
>>>> acked array.
>>>>
>>>> I'm not familiar with the ITS so I'm not sure if this is expected, if the
>>>> test is incorrect or if there is something wrong with KVM emulation.
>>>
>>> I think this is expected, or not.
>>>
>>> Before INVALL, the LPI-8195 was already pending but disabled. On
>>> receiving INVALL, VGIC will reload configuration for all LPIs targeting
>>> collection-3 and deliver the now enabled LPI-8195. We'll therefore see
>>> and handle it before sending the following INT (which will set the
>>> LPI-8195 pending again).
>>>
>>>> Did some more testing on an Ampere eMAG (fast out-of-order cores) using
>>>> qemu and kvmtool and Linux v5.8, here's what I found:
>>>>
>>>> - Using qemu and gic.flat built from*master*: error encountered 864 times
>>>>     out of 1088 runs.
>>>> - Using qemu: error encountered 852 times out of 1027 runs.
>>>> - Using kvmtool: error encountered 8164 times out of 10602 runs.
>>>
>>> If vcpu-3 hadn't seen and handled LPI-8195 as quickly as possible (e.g.,
>>> vcpu-3 hadn't been scheduled), the following INT will set the already
>>> pending LPI-8195 pending again and we'll receive it *once* on vcpu-3.
>>> And we won't see the mentioned failure.
>>>
>>> I think we can just drop the (meaningless and confusing?) INT.
>>
>> I think I understand your explanation, the VCPU takes the interrupt immediately
>> after the INVALL and before the INT, and the second interrupt that I am seeing is
>> the one caused by the INT command.
>
> Yes.
>
>> I tried modifying the test like this:
>>
>> diff --git a/arm/gic.c b/arm/gic.c
>> index 6e93da80fe0d..0ef8c12ea234 100644
>> --- a/arm/gic.c
>> +++ b/arm/gic.c
>> @@ -761,10 +761,17 @@ static void test_its_trigger(void)
>>          wmb();
>>          cpumask_clear(&mask);
>>          cpumask_set_cpu(3, &mask);
>> -       its_send_int(dev2, 20);
>
> Shouldn't its_send_invall(col3) be moved down here? See below.
>
>>          wait_for_interrupts(&mask);
>>          report(check_acked(&mask, 0, 8195),
>> -                       "dev2/eventid=20 now triggers an LPI");
>> +                       "dev2/eventid=20 pending LPI is received");
>> +
>> +       stats_reset();
>> +       wmb();
>> +       cpumask_clear(&mask);
>> +       cpumask_set_cpu(3, &mask);
>> +       its_send_int(dev2, 20);
>> +       wait_for_interrupts(&mask);
>> +       report(check_acked(&mask, 0, 8195), "dev2/eventid=20 triggers an LPI");
>>            report_prefix_pop();
>>   I removed the INT from the initial test, and added a separate one to check that
>> the INT command still works. That looks to me that preserves the spirit of the
>> original test. After doing stress testing this is what I got:
>>
>> - with kvmtool, 47,709 iterations, 27 times the test timed out when waiting for
>> the interrupt after INVALL.
>> - with qemu, 15,511 iterations, 258 times the test timed out when waiting for the
>> interrupt after INVALL, just like with kvmtool.
>
> I guess the reason of failure is that the LPI is taken *immediately*
> after the INVALL?
>
>     /* Now call the invall and check the LPI hits */
>     its_send_invall(col3);
>         <- LPI is taken, acked[]++
>     stats_reset();
>         <- acked[] is cleared unexpectedly
>     wmb();
>     cpumask_clear(&mask);
>     cpumask_set_cpu(3, &mask);
>     wait_for_interrupts(&mask);
>         <- we'll time out since acked[] is 0

Yes, of course, you're right, I didn't realize that I was resetting the stats
*after* the interrupt was enabled. This also explains why I was still seeing
timeouts even when the timeout duration was set to 50 seconds. I'll retest with
the fix:

diff --git a/arm/gic.c b/arm/gic.c
index 6e93da80fe0d..c4240f5aba39 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -756,15 +756,22 @@ static void test_its_trigger(void)
                        "dev2/eventid=20 still does not trigger any LPI");
 
        /* Now call the invall and check the LPI hits */
+       stats_reset();
+       wmb();
+       cpumask_clear(&mask);
+       cpumask_set_cpu(3, &mask);
        its_send_invall(col3);
+       wait_for_interrupts(&mask);
+       report(check_acked(&mask, 0, 8195),
+                       "dev2/eventid=20 pending LPI is received");
+
        stats_reset();
        wmb();
        cpumask_clear(&mask);
        cpumask_set_cpu(3, &mask);
        its_send_int(dev2, 20);
        wait_for_interrupts(&mask);
-       report(check_acked(&mask, 0, 8195),
-                       "dev2/eventid=20 now triggers an LPI");
+       report(check_acked(&mask, 0, 8195), "dev2/eventid=20 triggers an LPI");
 
        report_prefix_pop();
 
I also pushed a branch at [1].
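
To make the ordering problem concrete, here is a minimal sketch that models it
in plain C. Everything in it is hypothetical: the fake_* helpers and
acked_count stand in for its_send_invall(), stats_reset() and the acked[]
array, and the simulated INVALL always delivers the pending LPI immediately,
whereas in the real test that only happens some of the time.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the acked[] bookkeeping; not the real test code. */
static int acked_count;

/* Simulated interrupt handler: records that the LPI was taken. */
void fake_lpi_handler(void)
{
	acked_count++;
}

/* Simulated INVALL: the already-pending LPI is delivered straight away. */
void fake_send_invall(void)
{
	fake_lpi_handler();
}

/* Simulated stats_reset(): clears the bookkeeping. */
void fake_stats_reset(void)
{
	acked_count = 0;
}

/* Buggy order: trigger first, reset afterwards, so the ack is wiped. */
bool lpi_seen_reset_after_trigger(void)
{
	fake_send_invall();
	fake_stats_reset();
	return acked_count == 1;
}

/* Fixed order: reset first, then trigger, so the ack survives. */
bool lpi_seen_reset_before_trigger(void)
{
	fake_stats_reset();
	fake_send_invall();
	return acked_count == 1;
}
```

With these stand-ins, lpi_seen_reset_after_trigger() reports the interrupt as
missing and lpi_seen_reset_before_trigger() reports it as received, which
mirrors the timeouts before the fix and the expected behaviour after it.
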

Thank you so much for spotting this! You've saved me (and probably others) a lot
of time debugging.

[1] https://gitlab.arm.com/linux-arm/kvm-unit-tests-ae/-/tree/fixes1-v2

Thanks,
Alex
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* Re: [kvm-unit-tests PATCH 10/10] arm64: gic: Use IPI test checking for the LPI tests
  2020-11-26  9:30     ` Zenghui Yu
@ 2020-11-30 17:48       ` Auger Eric
  -1 siblings, 0 replies; 78+ messages in thread
From: Auger Eric @ 2020-11-30 17:48 UTC (permalink / raw)
  To: Zenghui Yu, Alexandru Elisei, kvm, kvmarm, drjones; +Cc: andre.przywara

Hi Alexandru, Zenghui
On 11/26/20 10:30 AM, Zenghui Yu wrote:
> On 2020/11/25 23:51, Alexandru Elisei wrote:
>> The reason for the failure is that the test "dev2/eventid=20 now triggers
>> an LPI" triggers 2 LPIs, not one. This behavior was present before this
>> patch, but it was ignored because check_lpi_stats() wasn't looking at the
>> acked array.
>>
>> I'm not familiar with the ITS so I'm not sure if this is expected, if the
>> test is incorrect or if there is something wrong with KVM emulation.
> 
> I think this is expected, or not.
> 
> Before INVALL, the LPI-8195 was already pending but disabled. On
> receiving INVALL, VGIC will reload configuration for all LPIs targeting
> collection-3 and deliver the now enabled LPI-8195. We'll therefore see
> and handle it before sending the following INT (which will set the
> LPI-8195 pending again).
> 
>> Did some more testing on an Ampere eMAG (fast out-of-order cores) using
>> qemu and kvmtool and Linux v5.8, here's what I found:
>>
>> - Using qemu and gic.flat built from *master*: error encountered 864 times
>>    out of 1088 runs.
>> - Using qemu: error encountered 852 times out of 1027 runs.
>> - Using kvmtool: error encountered 8164 times out of 10602 runs.
> 
> If vcpu-3 hadn't seen and handled LPI-8195 as quickly as possible (e.g.,
> vcpu-3 hadn't been scheduled), the following INT will set the already
> pending LPI-8195 pending again and we'll receive it *once* on vcpu-3.
> And we won't see the mentioned failure.
> 
> I think we can just drop the (meaningless and confusing?) INT.
Yes, I agree with Zenghui: we can remove the INT and just check that the
LPI that was made pending while disabled is eventually delivered.

Thanks

Eric
> 
> 
> Thanks,
> Zenghui
> 


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [kvm-unit-tests PATCH 10/10] arm64: gic: Use IPI test checking for the LPI tests
  2020-11-30 14:19           ` Alexandru Elisei
@ 2020-12-01 15:09             ` Alexandru Elisei
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexandru Elisei @ 2020-12-01 15:09 UTC (permalink / raw)
  To: Zenghui Yu, kvm, kvmarm, drjones; +Cc: andre.przywara

Hi,

On 11/30/20 2:19 PM, Alexandru Elisei wrote:
> Hi Zenghui,
>
> On 11/30/20 1:59 PM, Zenghui Yu wrote:
>> Hi Alex,
>>
>> On 2020/11/27 22:50, Alexandru Elisei wrote:
>>> Hi Zenghui,
>>>
>>> Thank you for having a look at this!
>>>
>>> On 11/26/20 9:30 AM, Zenghui Yu wrote:
>>>> On 2020/11/25 23:51, Alexandru Elisei wrote:
>>>>> The reason for the failure is that the test "dev2/eventid=20 now triggers
>>>>> an LPI" triggers 2 LPIs, not one. This behavior was present before this
>>>>> patch, but it was ignored because check_lpi_stats() wasn't looking at the
>>>>> acked array.
>>>>>
>>>>> I'm not familiar with the ITS so I'm not sure if this is expected, if the
>>>>> test is incorrect or if there is something wrong with KVM emulation.
>>>> I think this is expected, or not.
>>>>
>>>> Before INVALL, the LPI-8195 was already pending but disabled. On
>>>> receiving INVALL, VGIC will reload configuration for all LPIs targeting
>>>> collection-3 and deliver the now enabled LPI-8195. We'll therefore see
>>>> and handle it before sending the following INT (which will set the
>>>> LPI-8195 pending again).
>>>>
>>>>> Did some more testing on an Ampere eMAG (fast out-of-order cores) using
>>>>> qemu and kvmtool and Linux v5.8, here's what I found:
>>>>>
>>>>> - Using qemu and gic.flat built from *master*: error encountered 864 times
>>>>>     out of 1088 runs.
>>>>> - Using qemu: error encountered 852 times out of 1027 runs.
>>>>> - Using kvmtool: error encountered 8164 times out of 10602 runs.
>>>> If vcpu-3 hadn't seen and handled LPI-8195 as quickly as possible (e.g.,
>>>> vcpu-3 hadn't been scheduled), the following INT will set the already
>>>> pending LPI-8195 pending again and we'll receive it *once* on vcpu-3.
>>>> And we won't see the mentioned failure.
>>>>
>>>> I think we can just drop the (meaningless and confusing?) INT.
>>> I think I understand your explanation, the VCPU takes the interrupt immediately
>>> after the INVALL and before the INT, and the second interrupt that I am seeing is
>>> the one caused by the INT command.
>> Yes.
>>
>>> I tried modifying the test like this:
>>>
>>> diff --git a/arm/gic.c b/arm/gic.c
>>> index 6e93da80fe0d..0ef8c12ea234 100644
>>> --- a/arm/gic.c
>>> +++ b/arm/gic.c
>>> @@ -761,10 +761,17 @@ static void test_its_trigger(void)
>>>          wmb();
>>>          cpumask_clear(&mask);
>>>          cpumask_set_cpu(3, &mask);
>>> -       its_send_int(dev2, 20);
>> Shouldn't its_send_invall(col3) be moved down here? See below.
>>
>>>          wait_for_interrupts(&mask);
>>>          report(check_acked(&mask, 0, 8195),
>>> -                       "dev2/eventid=20 now triggers an LPI");
>>> +                       "dev2/eventid=20 pending LPI is received");
>>> +
>>> +       stats_reset();
>>> +       wmb();
>>> +       cpumask_clear(&mask);
>>> +       cpumask_set_cpu(3, &mask);
>>> +       its_send_int(dev2, 20);
>>> +       wait_for_interrupts(&mask);
>>> +       report(check_acked(&mask, 0, 8195), "dev2/eventid=20 triggers an LPI");
>>>            report_prefix_pop();
>>>
>>> I removed the INT from the initial test, and added a separate one to check that
>>> the INT command still works. That looks to me like it preserves the spirit of the
>>> original test. After doing stress testing this is what I got:
>>>
>>> - with kvmtool, 47,709 iterations, 27 times the test timed out when waiting for
>>> the interrupt after INVALL.
>>> - with qemu, 15,511 iterations, 258 times the test timed out when waiting for the
>>> interrupt after INVALL, just like with kvmtool.
>> I guess the reason for the failure is that the LPI is taken *immediately*
>> after the INVALL?
>>
>>     /* Now call the invall and check the LPI hits */
>>     its_send_invall(col3);
>>         <- LPI is taken, acked[]++
>>     stats_reset();
>>         <- acked[] is cleared unexpectedly
>>     wmb();
>>     cpumask_clear(&mask);
>>     cpumask_set_cpu(3, &mask);
>>     wait_for_interrupts(&mask);
>>         <- we'll time out since acked[] is 0
> Yes, of course, you're right, I didn't realize that I was resetting the stats
> *after* the interrupt was enabled. This also explains why I was still seeing
> timeouts even when the timeout duration was set to 50 seconds. I'll retest with
> the fix:
>
> diff --git a/arm/gic.c b/arm/gic.c
> index 6e93da80fe0d..c4240f5aba39 100644
> --- a/arm/gic.c
> +++ b/arm/gic.c
> @@ -756,15 +756,22 @@ static void test_its_trigger(void)
>                         "dev2/eventid=20 still does not trigger any LPI");
>  
>         /* Now call the invall and check the LPI hits */
> +       stats_reset();
> +       wmb();
> +       cpumask_clear(&mask);
> +       cpumask_set_cpu(3, &mask);
>         its_send_invall(col3);
> +       wait_for_interrupts(&mask);
> +       report(check_acked(&mask, 0, 8195),
> +                       "dev2/eventid=20 pending LPI is received");
> +
>         stats_reset();
>         wmb();
>         cpumask_clear(&mask);
>         cpumask_set_cpu(3, &mask);
>         its_send_int(dev2, 20);
>         wait_for_interrupts(&mask);
> -       report(check_acked(&mask, 0, 8195),
> -                       "dev2/eventid=20 now triggers an LPI");
> +       report(check_acked(&mask, 0, 8195), "dev2/eventid=20 triggers an LPI");
>  
>         report_prefix_pop();
>  
> I also pushed a branch at [1].
>
> Thank you so much for spotting this! You've saved me (and probably others) a lot
> of time debugging.
>
> [1] https://gitlab.arm.com/linux-arm/kvm-unit-tests-ae/-/tree/fixes1-v2

I have been testing the branch, no failures after 17,996 runs with qemu and 58,669
runs with kvmtool. This looks fine to me, I'll send a v2 with the fix.

Thanks,
Alex

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [kvm-unit-tests PATCH 03/10] arm/arm64: gic: Remove memory synchronization from ipi_clear_active_handler()
  2020-11-25 15:51   ` Alexandru Elisei
@ 2020-12-01 16:37     ` Auger Eric
  -1 siblings, 0 replies; 78+ messages in thread
From: Auger Eric @ 2020-12-01 16:37 UTC (permalink / raw)
  To: Alexandru Elisei, kvm, kvmarm, drjones; +Cc: andre.przywara

Hi Alexandru,

On 11/25/20 4:51 PM, Alexandru Elisei wrote:
> The gicv{2,3}-active test sends an IPI from the boot CPU to itself, then
> checks that the interrupt has been received as expected. There is no need
> to use inter-processor memory synchronization primitives on code that runs
> on the same CPU, so remove the unneeded memory barriers.
> 
> The arrays are modified asynchronously (in the interrupt handler) and it is
> possible for the compiler to infer that they won't be changed during normal
> program flow and try to perform harmful optimizations (like stashing a
> previous read in a register and reusing it). To prevent this, for GICv2,
> the smp_wmb() in gicv2_ipi_send_self() is replaced with a compiler barrier.
> For GICv3, the wmb() barrier in gic_ipi_send_single() already implies a
> compiler barrier.
> 
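
As an illustration of the compiler barrier mentioned above, the following is a
minimal sketch assuming a GCC-compatible compiler. The barrier() macro has the
same shape as the one in linux/compiler.h; acked_self, fake_self_ipi_handler()
and fake_send_self_ipi() are hypothetical names. The handler is an ordinary
function call here, whereas in the real test it runs asynchronously and the
barrier is what forces the counter to be reloaded from memory.

```c
#include <assert.h>

/* GCC/Clang-style compiler barrier: emits no instruction, but the compiler
 * must assume that all memory may have changed across it. */
#define barrier() __asm__ __volatile__("" : : : "memory")

/* Hypothetical counter bumped by the "interrupt handler". */
static int acked_self;

/* Hypothetical handler; in the real test this runs from the IRQ vector. */
void fake_self_ipi_handler(void)
{
	acked_self++;
}

/* Hypothetical stand-in for the GICD_SGIR write that targets the sender. */
void fake_send_self_ipi(void)
{
	/* Earlier stores are emitted before this point, not cached or delayed. */
	barrier();
	fake_self_ipi_handler();
}

/* Resets the counter, sends the self-IPI and returns the observed count. */
int self_ipi_acked(void)
{
	acked_self = 0;
	fake_send_self_ipi();
	/* Force acked_self to be re-read from memory rather than a register. */
	barrier();
	return acked_self;
}
```

No CPU barrier instruction is generated by barrier(); it only constrains the
compiler, which is all that same-CPU code needs.
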
> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> ---
>  arm/gic.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/arm/gic.c b/arm/gic.c
> index 401ffafe4299..4e947e8516a2 100644
> --- a/arm/gic.c
> +++ b/arm/gic.c
> @@ -12,6 +12,7 @@
>   * This work is licensed under the terms of the GNU LGPL, version 2.
>   */
>  #include <libcflat.h>
> +#include <linux/compiler.h>
>  #include <errata.h>
>  #include <asm/setup.h>
>  #include <asm/processor.h>
> @@ -260,7 +261,8 @@ static void check_lpi_hits(int *expected, const char *msg)
>  
>  static void gicv2_ipi_send_self(void)
>  {
> -	smp_wmb();
nit: the previous patch added it and this patch removes it. Maybe squash the
changes into the previous patch, saying that only a barrier() is needed for self()?
> +	/* Prevent the compiler from optimizing memory accesses */
> +	barrier();
>  	writel(2 << 24 | IPI_IRQ, gicv2_dist_base() + GICD_SGIR);
>  }
>  
> @@ -359,6 +361,7 @@ static struct gic gicv3 = {
>  	},
>  };
>  
> +/* Runs on the same CPU as the sender, no need for memory synchronization */
>  static void ipi_clear_active_handler(struct pt_regs *regs __unused)
>  {
>  	u32 irqstat = gic_read_iar();
> @@ -375,13 +378,10 @@ static void ipi_clear_active_handler(struct pt_regs *regs __unused)
>  
>  		writel(val, base + GICD_ICACTIVER);
>  
> -		smp_rmb(); /* pairs with wmb in stats_reset */
The comment says it pairs with the wmb in stats_reset. So is it OK to
leave the associated wmb?
>  		++acked[smp_processor_id()];
>  		check_irqnr(irqnr);
> -		smp_wmb(); /* pairs with rmb in check_acked */
same here.
>  	} else {
>  		++spurious[smp_processor_id()];
> -		smp_wmb();
>  	}
>  }
>  
> 
Thanks

Eric


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [kvm-unit-tests PATCH 01/10] lib: arm/arm64: gicv3: Add missing barrier when sending IPIs
  2020-11-25 15:51   ` Alexandru Elisei
@ 2020-12-01 16:37     ` Auger Eric
  -1 siblings, 0 replies; 78+ messages in thread
From: Auger Eric @ 2020-12-01 16:37 UTC (permalink / raw)
  To: Alexandru Elisei, kvm, kvmarm, drjones; +Cc: andre.przywara

Hi Alexandru,

On 11/25/20 4:51 PM, Alexandru Elisei wrote:
> One common usage for IPIs is for one CPU to write to a shared memory
> location, send the IPI to kick another CPU, and the receiver to read from
> the same location. Proper synchronization is needed to make sure that the
> IPI receiver reads the most recent value and not stale data (for example,
> the write from the sender CPU might still be in a store buffer).
> 
> For GICv3, IPIs are generated with a write to the ICC_SGI1R_EL1 register.
> To make sure the memory stores are observable by other CPUs, we need a
> wmb() barrier (DSB ST), which waits for stores to complete.
> 
> From the definition of DSB from ARM DDI 0487F.b, page B2-139:
> 
> "In addition, no instruction that appears in program order after the DSB
> instruction can alter any state of the system or perform any part of its
> functionality until the DSB completes other than:
> 
> - Being fetched from memory and decoded.
> - Reading the general-purpose, SIMD and floating-point, Special-purpose, or
> System registers that are directly or indirectly read without causing
> side-effects."
> 
> Similar definition for armv7 (ARM DDI 0406C.d, page A3-150).
> 
> The DSB instruction is enough to prevent reordering of the GIC register
> write which comes in program order after the memory access.
> 
> This also matches what the Linux GICv3 irqchip driver does (commit
> 21ec30c0ef52 ("irqchip/gic-v3: Use wmb() instead of smb_wmb() in
> gic_raise_softirq()")).
> 
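
The sender/receiver pattern described in the commit message can be sketched in
portable C11 as follows. This is only an analogy under stated assumptions:
shared_payload, ipi_kick, sender_publish() and receiver_consume() are
hypothetical names, an atomic flag stands in for the IPI, and the release
fence only models the ordering that wmb() provides. C11 has no equivalent of
DSB ST, which is still what a write to ICC_SGI1R_EL1 needs.

```c
#include <assert.h>
#include <stdatomic.h>

/* Plain data written by the sender and read by the receiver. */
static int shared_payload;

/* Hypothetical stand-in for the IPI: set by the sender to kick the receiver. */
static atomic_int ipi_kick;

/* Sender: write the data, order it before the kick, then kick. */
void sender_publish(int value)
{
	shared_payload = value;
	/* Analog of wmb(): the data store must not be reordered after the kick. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&ipi_kick, 1, memory_order_relaxed);
}

/* Receiver: returns the payload once kicked, or -1 if no kick was seen. */
int receiver_consume(void)
{
	if (!atomic_load_explicit(&ipi_kick, memory_order_relaxed))
		return -1;
	/* Pairs with the release fence so the payload read is not stale. */
	atomic_thread_fence(memory_order_acquire);
	return shared_payload;
}
```

Without the fence on the sender side, the receiver could in principle observe
the kick and still read an old payload, which is the stale-data case the
commit message warns about.
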
> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>

> ---
>  lib/arm/gic-v3.c | 3 +++
>  arm/gic.c        | 2 ++
>  2 files changed, 5 insertions(+)
> 
> diff --git a/lib/arm/gic-v3.c b/lib/arm/gic-v3.c
> index a7e2cb819746..a6afa42d5fbe 100644
> --- a/lib/arm/gic-v3.c
> +++ b/lib/arm/gic-v3.c
> @@ -77,6 +77,9 @@ void gicv3_ipi_send_mask(int irq, const cpumask_t *dest)
>  
>  	assert(irq < 16);
>  
> +	/* Ensure stores are visible to other CPUs before sending the IPI */
nit: stores to normal memory ...
> +	wmb();
> +
>  	/*
>  	 * For each cpu in the mask collect its peers, which are also in
>  	 * the mask, in order to form target lists.
> diff --git a/arm/gic.c b/arm/gic.c
> index acb060585fae..512c83636a2e 100644
> --- a/arm/gic.c
> +++ b/arm/gic.c
> @@ -275,6 +275,8 @@ static void gicv3_ipi_send_self(void)
>  
>  static void gicv3_ipi_send_broadcast(void)
>  {
> +	/* Ensure stores are visible to other CPUs before sending the IPI */
same
> +	wmb();
>  	gicv3_write_sgi1r(1ULL << 40 | IPI_IRQ << 24);
>  	isb();
>  }
> 
Reviewed-by: Eric Auger <eric.auger@redhat.com>

Thanks

Eric


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [kvm-unit-tests PATCH 02/10] lib: arm/arm64: gicv2: Add missing barrier when sending IPIs
  2020-11-25 15:51   ` Alexandru Elisei
@ 2020-12-01 16:37     ` Auger Eric
  -1 siblings, 0 replies; 78+ messages in thread
From: Auger Eric @ 2020-12-01 16:37 UTC (permalink / raw)
  To: Alexandru Elisei, kvm, kvmarm, drjones; +Cc: andre.przywara

Hi Alexandru,

On 11/25/20 4:51 PM, Alexandru Elisei wrote:
> GICv2 generates IPIs with a MMIO write to the GICD_SGIR register. A common
> pattern for IPI usage is for the IPI receiver to read data written to
> memory by the sender. The armv7 and armv8 architectures implement a
> weakly-ordered memory model, which means that barriers are required to make
> sure that the expected values are observed.
> 
> It turns out that because the receiver CPU must observe the write to memory
> that generated the IPI when reading the GICC_IAR MMIO register, we only
> need to ensure ordering of memory accesses, and not completion. Use a
> smp_wmb (DMB ISHST) barrier before sending the IPI.
> 
> This also matches what the Linux GICv2 irqchip driver does (more details
> in commit 8adbf57fc429 ("irqchip: gic: use dmb ishst instead of dsb when
> raising a softirq")).
> 
> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>

Eric

> ---
>  lib/arm/gic-v2.c | 4 ++++
>  arm/gic.c        | 2 ++
>  2 files changed, 6 insertions(+)
> 
> diff --git a/lib/arm/gic-v2.c b/lib/arm/gic-v2.c
> index dc6a97c600ec..da244c82de34 100644
> --- a/lib/arm/gic-v2.c
> +++ b/lib/arm/gic-v2.c
> @@ -45,6 +45,8 @@ void gicv2_ipi_send_single(int irq, int cpu)
>  {
>  	assert(cpu < 8);
>  	assert(irq < 16);
> +
> +	smp_wmb();
>  	writel(1 << (cpu + 16) | irq, gicv2_dist_base() + GICD_SGIR);
>  }
>  
> @@ -53,5 +55,7 @@ void gicv2_ipi_send_mask(int irq, const cpumask_t *dest)
>  	u8 tlist = (u8)cpumask_bits(dest)[0];
>  
>  	assert(irq < 16);
> +
> +	smp_wmb();
>  	writel(tlist << 16 | irq, gicv2_dist_base() + GICD_SGIR);
>  }
> diff --git a/arm/gic.c b/arm/gic.c
> index 512c83636a2e..401ffafe4299 100644
> --- a/arm/gic.c
> +++ b/arm/gic.c
> @@ -260,11 +260,13 @@ static void check_lpi_hits(int *expected, const char *msg)
>  
>  static void gicv2_ipi_send_self(void)
>  {
> +	smp_wmb();
>  	writel(2 << 24 | IPI_IRQ, gicv2_dist_base() + GICD_SGIR);
>  }
>  
>  static void gicv2_ipi_send_broadcast(void)
>  {
> +	smp_wmb();
>  	writel(1 << 24 | IPI_IRQ, gicv2_dist_base() + GICD_SGIR);
>  }
>  
> 
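
For readers following the diff above, the values written to GICD_SGIR can
be spelled out field by field. This is a minimal sketch, assuming the GICv2
register layout (TargetListFilter in bits [25:24], CPUTargetList in bits
[23:16], SGIINTID in bits [3:0]); the enum and the sgir_value() helper are
hypothetical names that are not part of kvm-unit-tests.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions in GICD_SGIR (GICv2). */
#define SGIR_FILTER_SHIFT	24	/* TargetListFilter, bits [25:24] */
#define SGIR_TLIST_SHIFT	16	/* CPUTargetList, bits [23:16] */

/* Hypothetical names for the TargetListFilter encodings. */
enum sgir_filter {
	SGIR_USE_TLIST	= 0,	/* forward to the CPUs in CPUTargetList */
	SGIR_ALL_OTHERS	= 1,	/* forward to all CPUs except the sender */
	SGIR_SELF_ONLY	= 2,	/* forward only to the sender */
};

/* Build the 32-bit value that would be written to GICD_SGIR. */
uint32_t sgir_value(enum sgir_filter filter, uint8_t tlist, unsigned int irq)
{
	assert(irq < 16);	/* SGIs are interrupt IDs 0-15 */

	return (uint32_t)filter << SGIR_FILTER_SHIFT |
	       (uint32_t)tlist << SGIR_TLIST_SHIFT |
	       irq;
}
```

With these names, gicv2_ipi_send_self() corresponds to
sgir_value(SGIR_SELF_ONLY, 0, IPI_IRQ), gicv2_ipi_send_broadcast() to
sgir_value(SGIR_ALL_OTHERS, 0, IPI_IRQ), and gicv2_ipi_send_single() to
sgir_value(SGIR_USE_TLIST, 1 << cpu, irq).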


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [kvm-unit-tests PATCH 04/10] arm/arm64: gic: Remove unnecessary synchronization with stats_reset()
  2020-11-25 15:51   ` Alexandru Elisei
@ 2020-12-01 16:48     ` Auger Eric
  -1 siblings, 0 replies; 78+ messages in thread
From: Auger Eric @ 2020-12-01 16:48 UTC (permalink / raw)
  To: Alexandru Elisei, kvm, kvmarm, drjones; +Cc: andre.przywara

Hi Alexandru,

On 11/25/20 4:51 PM, Alexandru Elisei wrote:
> The GICv3 driver executes a DSB barrier before sending an IPI, which
> ensures that memory accesses have completed. This removes the need to
> enforce ordering with respect to stats_reset() in the IPI handler.
> 
> For GICv2, we still need the DMB to ensure ordering between the read of the
> GICC_IAR MMIO register and the read from the acked array. It also matches
> what the Linux GICv2 driver does in gic_handle_irq().
> 
> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> ---
>  arm/gic.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/arm/gic.c b/arm/gic.c
> index 4e947e8516a2..7befda2a8673 100644
> --- a/arm/gic.c
> +++ b/arm/gic.c
> @@ -60,7 +60,6 @@ static void stats_reset(void)
>  		bad_sender[i] = -1;
>  		bad_irq[i] = -1;
>  	}
> -	smp_wmb();
Here we are (pair removed). The smp_rmb() in check_acked() still exists, though.
>  }
>  
>  static void check_acked(const char *testname, cpumask_t *mask)
> @@ -150,7 +149,13 @@ static void ipi_handler(struct pt_regs *regs __unused)
>  
>  	if (irqnr != GICC_INT_SPURIOUS) {
>  		gic_write_eoir(irqstat);
> -		smp_rmb(); /* pairs with wmb in stats_reset */
> +		/*
> +		 * Make sure data written before the IPI was triggered is
> +		 * observed after the IAR is read. Pairs with the smp_wmb
> +		 * when sending the IPI.
> +		 */
> +		if (gic_version() == 2)
> +			smp_rmb();
>  		++acked[smp_processor_id()];
>  		check_ipi_sender(irqstat);
>  		check_irqnr(irqnr);
> 
Thanks

Eric


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [kvm-unit-tests PATCH 01/10] lib: arm/arm64: gicv3: Add missing barrier when sending IPIs
  2020-12-01 16:37     ` Auger Eric
@ 2020-12-01 17:37       ` Alexandru Elisei
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexandru Elisei @ 2020-12-01 17:37 UTC (permalink / raw)
  To: Auger Eric, kvm, kvmarm, drjones; +Cc: andre.przywara

Hi Eric,

Thank you so much for having a look at the patches!

On 12/1/20 4:37 PM, Auger Eric wrote:
> Hi Alexandru,
>
> On 11/25/20 4:51 PM, Alexandru Elisei wrote:
>> One common usage for IPIs is for one CPU to write to a shared memory
>> location, send the IPI to kick another CPU, and the receiver to read from
>> the same location. Proper synchronization is needed to make sure that the
>> IPI receiver reads the most recent value and not stale data (for example,
>> the write from the sender CPU might still be in a store buffer).
>>
>> For GICv3, IPIs are generated with a write to the ICC_SGI1R_EL1 register.
>> To make sure the memory stores are observable by other CPUs, we need a
>> wmb() barrier (DSB ST), which waits for stores to complete.
>>
>> From the definition of DSB from ARM DDI 0487F.b, page B2-139:
>>
>> "In addition, no instruction that appears in program order after the DSB
>> instruction can alter any state of the system or perform any part of its
>> functionality until the DSB completes other than:
>>
>> - Being fetched from memory and decoded.
>> - Reading the general-purpose, SIMD and floating-point, Special-purpose, or
>> System registers that are directly or indirectly read without causing
>> side-effects."
>>
>> Similar definition for armv7 (ARM DDI 0406C.d, page A3-150).
>>
>> The DSB instruction is enough to prevent reordering of the GIC register
>> write which comes in program order after the memory access.
>>
>> This also matches what the Linux GICv3 irqchip driver does (commit
>> 21ec30c0ef52 ("irqchip/gic-v3: Use wmb() instead of smb_wmb() in
>> gic_raise_softirq()")).
>>
>> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
>> ---
>>  lib/arm/gic-v3.c | 3 +++
>>  arm/gic.c        | 2 ++
>>  2 files changed, 5 insertions(+)
>>
>> diff --git a/lib/arm/gic-v3.c b/lib/arm/gic-v3.c
>> index a7e2cb819746..a6afa42d5fbe 100644
>> --- a/lib/arm/gic-v3.c
>> +++ b/lib/arm/gic-v3.c
>> @@ -77,6 +77,9 @@ void gicv3_ipi_send_mask(int irq, const cpumask_t *dest)
>>  
>>  	assert(irq < 16);
>>  
>> +	/* Ensure stores are visible to other CPUs before sending the IPI */
> nit: stores to normal memory ...

Yes, you are completely right. Specifying that it affects only stores to normal
memory would match the comment in the Linux irqchip driver and also what the
architecture specifies for device memory (page B2-158):

"Data accesses to memory locations are coherent for all observers in the system,
and correspondingly are treated as being Outer Shareable.
A memory location with any Device memory attribute cannot be allocated into a cache".

Same thing below, will change.

Thanks,

Alex

>> +	wmb();
>> +
>>  	/*
>>  	 * For each cpu in the mask collect its peers, which are also in
>>  	 * the mask, in order to form target lists.
>> diff --git a/arm/gic.c b/arm/gic.c
>> index acb060585fae..512c83636a2e 100644
>> --- a/arm/gic.c
>> +++ b/arm/gic.c
>> @@ -275,6 +275,8 @@ static void gicv3_ipi_send_self(void)
>>  
>>  static void gicv3_ipi_send_broadcast(void)
>>  {
>> +	/* Ensure stores are visible to other CPUs before sending the IPI */
> same
>> +	wmb();
>>  	gicv3_write_sgi1r(1ULL << 40 | IPI_IRQ << 24);
>>  	isb();
>>  }
>>
> Reviewed-by: Eric Auger <eric.auger@redhat.com>
>
> Thanks
>
> Eric
>
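
As an illustration of the value passed to gicv3_write_sgi1r() in the quoted
diff, here is a minimal sketch of the ICC_SGI1R_EL1 encoding, assuming the
GICv3 layout (IRM in bit 40, INTID in bits [27:24], Aff3/Aff2/Aff1 in bits
[55:48]/[39:32]/[23:16], TargetList in bits [15:0]). The helper names are
hypothetical and are not taken from kvm-unit-tests.

```c
#include <assert.h>
#include <stdint.h>

#define SGI1R_IRM		(1ULL << 40)	/* route to all PEs except self */
#define SGI1R_INTID_SHIFT	24		/* INTID, bits [27:24] */

/* Value for a broadcast SGI, as in gicv3_ipi_send_broadcast(). */
uint64_t sgi1r_broadcast(unsigned int irq)
{
	assert(irq < 16);	/* SGIs are interrupt IDs 0-15 */

	return SGI1R_IRM | (uint64_t)irq << SGI1R_INTID_SHIFT;
}

/* Value for an SGI aimed at the PEs of one cluster named in tlist. */
uint64_t sgi1r_targets(uint8_t aff3, uint8_t aff2, uint8_t aff1,
		       uint16_t tlist, unsigned int irq)
{
	assert(irq < 16);

	return (uint64_t)aff3 << 48 | (uint64_t)aff2 << 32 |
	       (uint64_t)irq << SGI1R_INTID_SHIFT |
	       (uint64_t)aff1 << 16 | tlist;
}
```

In either case the wmb() has to come before the system register write, so
that stores to normal memory are visible to the receiver before the SGI is
generated.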

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [kvm-unit-tests PATCH 03/10] arm/arm64: gic: Remove memory synchronization from ipi_clear_active_handler()
  2020-12-01 16:37     ` Auger Eric
@ 2020-12-02 14:02       ` Alexandru Elisei
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexandru Elisei @ 2020-12-02 14:02 UTC (permalink / raw)
  To: Auger Eric, kvm, kvmarm, drjones; +Cc: andre.przywara

Hi Eric,

On 12/1/20 4:37 PM, Auger Eric wrote:
> Hi Alexandru,
>
> On 11/25/20 4:51 PM, Alexandru Elisei wrote:
>> The gicv{2,3}-active test sends an IPI from the boot CPU to itself, then
>> checks that the interrupt has been received as expected. There is no need
>> to use inter-processor memory synchronization primitives on code that runs
>> on the same CPU, so remove the unneeded memory barriers.
>>
>> The arrays are modified asynchronously (in the interrupt handler) and it is
>> possible for the compiler to infer that they won't be changed during normal
>> program flow and try to perform harmful optimizations (like stashing a
>> previous read in a register and reusing it). To prevent this, for GICv2,
>> the smp_wmb() in gicv2_ipi_send_self() is replaced with a compiler barrier.
>> For GICv3, the wmb() barrier in gic_ipi_send_single() already implies a
>> compiler barrier.
>>
>> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
>> ---
>>  arm/gic.c | 8 ++++----
>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/arm/gic.c b/arm/gic.c
>> index 401ffafe4299..4e947e8516a2 100644
>> --- a/arm/gic.c
>> +++ b/arm/gic.c
>> @@ -12,6 +12,7 @@
>>   * This work is licensed under the terms of the GNU LGPL, version 2.
>>   */
>>  #include <libcflat.h>
>> +#include <linux/compiler.h>
>>  #include <errata.h>
>>  #include <asm/setup.h>
>>  #include <asm/processor.h>
>> @@ -260,7 +261,8 @@ static void check_lpi_hits(int *expected, const char *msg)
>>  
>>  static void gicv2_ipi_send_self(void)
>>  {
>> -	smp_wmb();
> nit: previous patch added it and this patch removes it. maybe squash the
> modifs into the previous patch saying only a barrier() is needed for self()?
You're right, this does look out of place. I'll merge this change into the
previous patch.
>> +	/* Prevent the compiler from optimizing memory accesses */
>> +	barrier();
>>  	writel(2 << 24 | IPI_IRQ, gicv2_dist_base() + GICD_SGIR);
>>  }
>>  
>> @@ -359,6 +361,7 @@ static struct gic gicv3 = {
>>  	},
>>  };
>>  
>> +/* Runs on the same CPU as the sender, no need for memory synchronization */
>>  static void ipi_clear_active_handler(struct pt_regs *regs __unused)
>>  {
>>  	u32 irqstat = gic_read_iar();
>> @@ -375,13 +378,10 @@ static void ipi_clear_active_handler(struct pt_regs *regs __unused)
>>  
>>  		writel(val, base + GICD_ICACTIVER);
>>  
>> -		smp_rmb(); /* pairs with wmb in stats_reset */
> the comment says it is paired with wmb in stats_reset. So is it OK to
> leave the associated wmb?

This patch removes multi-processor synchronization from the functions that run on
the same CPU. stats_reset() can be called from one CPU (the IPI_SENDER CPU) while
the variables it modifies are accessed by interrupt handlers running on different
CPUs, as happens in the IPI tests. In that case we do need the proper barriers in
place.
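
The same-CPU case can be modelled in a hosted C program. The sketch below is
an analogy and not the kvm-unit-tests code: a signal raised on the calling
thread stands in for the self-directed IPI, and barrier() is written as the
usual GCC/Clang empty asm with a "memory" clobber. The function names are
hypothetical.

```c
#include <assert.h>
#include <signal.h>

/*
 * Compiler-only barrier: emits no instruction, but the compiler may not
 * keep memory values cached in registers across it (GCC/Clang syntax).
 */
#define barrier()	__asm__ __volatile__("" ::: "memory")

static int acked;	/* modified by the handler, outside normal program flow */

static void self_ipi_handler(int sig)
{
	(void)sig;
	++acked;
}

/* Returns how many times the handler ran for one self-directed "IPI". */
int send_self_and_count(void)
{
	int before = acked;
	void (*prev)(int) = signal(SIGINT, self_ipi_handler);

	assert(prev != SIG_ERR);
	barrier();		/* do not reuse a value of acked read earlier */
	raise(SIGINT);		/* stands in for the GICD_SGIR write */
	barrier();
	signal(SIGINT, prev);	/* restore the previous disposition */

	return acked - before;
}
```

No smp_wmb() or smp_rmb() appears here because the writer and the reader are
the same CPU; only the compiler has to be kept from reusing a stale value. In
this hosted model raise() is already an opaque call, so barrier() is shown
purely to mark where it sits relative to the send.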

Thanks,

Alex

>>  		++acked[smp_processor_id()];
>>  		check_irqnr(irqnr);
>> -		smp_wmb(); /* pairs with rmb in check_acked */
> same here.
>>  	} else {
>>  		++spurious[smp_processor_id()];
>> -		smp_wmb();
>>  	}
>>  }
>>  
>>
> Thanks
>
> Eric
>

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [kvm-unit-tests PATCH 04/10] arm/arm64: gic: Remove unnecessary synchronization with stats_reset()
  2020-12-01 16:48     ` Auger Eric
@ 2020-12-02 14:06       ` Alexandru Elisei
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexandru Elisei @ 2020-12-02 14:06 UTC (permalink / raw)
  To: Auger Eric, kvm, kvmarm, drjones; +Cc: andre.przywara

Hi Eric,

On 12/1/20 4:48 PM, Auger Eric wrote:
> Hi Alexandru,
>
> On 11/25/20 4:51 PM, Alexandru Elisei wrote:
>> The GICv3 driver executes a DSB barrier before sending an IPI, which
>> ensures that memory accesses have completed. This removes the need to
>> enforce ordering with respect to stats_reset() in the IPI handler.
>>
>> For GICv2, we still need the DMB to ensure ordering between the read of the
>> GICC_IAR MMIO register and the read from the acked array. It also matches
>> what the Linux GICv2 driver does in gic_handle_irq().
>>
>> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
>> ---
>>  arm/gic.c | 9 +++++++--
>>  1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/arm/gic.c b/arm/gic.c
>> index 4e947e8516a2..7befda2a8673 100644
>> --- a/arm/gic.c
>> +++ b/arm/gic.c
>> @@ -60,7 +60,6 @@ static void stats_reset(void)
>>  		bad_sender[i] = -1;
>>  		bad_irq[i] = -1;
>>  	}
>> -	smp_wmb();
> Here we are (pair removed). The smp_rmb() in check_acked() still exists, though.

The smp_rmb() from check_acked() is there to implement the message passing pattern
wrt the writes from the ipi_handler() function, not the writes from stats_reset().
See the next patch where I try to explain how the barriers should work.
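
For reference, the message passing pattern mentioned above can be sketched
in portable C11. This is an illustration under assumptions, not the code from
the series: the release and acquire fences only roughly play the roles of
smp_wmb() and smp_rmb(), the array names mirror the test but NR_CPUS and the
two helper functions are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_CPUS	8			/* hypothetical size */

static int bad_irq[NR_CPUS];		/* payload, plain memory */
static atomic_int acked[NR_CPUS];	/* flag that publishes the payload */

/* Handler side: write the payload first, then publish it via acked[]. */
void handler_record(int cpu, int irq_seen)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	bad_irq[cpu] = irq_seen;
	atomic_thread_fence(memory_order_release);	/* role of smp_wmb() */
	atomic_fetch_add_explicit(&acked[cpu], 1, memory_order_relaxed);
}

/* Checker side: returns 1 and fills *irq_seen once the flag is observed. */
int checker_poll(int cpu, int *irq_seen)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	if (atomic_load_explicit(&acked[cpu], memory_order_relaxed) == 0)
		return 0;
	atomic_thread_fence(memory_order_acquire);	/* role of smp_rmb() */
	*irq_seen = bad_irq[cpu];
	return 1;
}
```

The point of the pairing is that a checker which observes the flag is
guaranteed to also observe the payload written before it; the barrier on the
checker side pairs with the writes from the handler, not with stats_reset().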

Thanks,

Alex

>>  }
>>  
>>  static void check_acked(const char *testname, cpumask_t *mask)
>> @@ -150,7 +149,13 @@ static void ipi_handler(struct pt_regs *regs __unused)
>>  
>>  	if (irqnr != GICC_INT_SPURIOUS) {
>>  		gic_write_eoir(irqstat);
>> -		smp_rmb(); /* pairs with wmb in stats_reset */
>> +		/*
>> +		 * Make sure data written before the IPI was triggered is
>> +		 * observed after the IAR is read. Pairs with the smp_wmb
>> +		 * when sending the IPI.
>> +		 */
>> +		if (gic_version() == 2)
>> +			smp_rmb();
>>  		++acked[smp_processor_id()];
>>  		check_ipi_sender(irqstat);
>>  		check_irqnr(irqnr);
>>
> Thanks
>
> Eric
>

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [kvm-unit-tests PATCH 03/10] arm/arm64: gic: Remove memory synchronization from ipi_clear_active_handler()
  2020-12-02 14:02       ` Alexandru Elisei
@ 2020-12-02 14:14         ` Alexandru Elisei
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexandru Elisei @ 2020-12-02 14:14 UTC (permalink / raw)
  To: Auger Eric, kvm, kvmarm, drjones; +Cc: andre.przywara

Hi,

On 12/2/20 2:02 PM, Alexandru Elisei wrote:

> Hi Eric,
>
> On 12/1/20 4:37 PM, Auger Eric wrote:
>> Hi Alexandru,
>>
>> On 11/25/20 4:51 PM, Alexandru Elisei wrote:
>>> The gicv{2,3}-active test sends an IPI from the boot CPU to itself, then
>>> checks that the interrupt has been received as expected. There is no need
>>> to use inter-processor memory synchronization primitives on code that runs
>>> on the same CPU, so remove the unneeded memory barriers.
>>>
>>> The arrays are modified asynchronously (in the interrupt handler) and it is
>>> possible for the compiler to infer that they won't be changed during normal
>>> program flow and try to perform harmful optimizations (like stashing a
>>> previous read in a register and reusing it). To prevent this, for GICv2,
>>> the smp_wmb() in gicv2_ipi_send_self() is replaced with a compiler barrier.
>>> For GICv3, the wmb() barrier in gic_ipi_send_single() already implies a
>>> compiler barrier.
>>>
>>> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
>>> ---
>>>  arm/gic.c | 8 ++++----
>>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/arm/gic.c b/arm/gic.c
>>> index 401ffafe4299..4e947e8516a2 100644
>>> --- a/arm/gic.c
>>> +++ b/arm/gic.c
>>> @@ -12,6 +12,7 @@
>>>   * This work is licensed under the terms of the GNU LGPL, version 2.
>>>   */
>>>  #include <libcflat.h>
>>> +#include <linux/compiler.h>
>>>  #include <errata.h>
>>>  #include <asm/setup.h>
>>>  #include <asm/processor.h>
>>> @@ -260,7 +261,8 @@ static void check_lpi_hits(int *expected, const char *msg)
>>>  
>>>  static void gicv2_ipi_send_self(void)
>>>  {
>>> -	smp_wmb();
>> nit: previous patch added it and this patch removes it. maybe squash the
>> modifs into the previous patch saying only a barrier() is needed for self()?
> You're right, this does look out of place. I'll merge this change into the
> previous patch.
>>> +	/* Prevent the compiler from optimizing memory accesses */
>>> +	barrier();
>>>  	writel(2 << 24 | IPI_IRQ, gicv2_dist_base() + GICD_SGIR);
>>>  }
>>>  
>>> @@ -359,6 +361,7 @@ static struct gic gicv3 = {
>>>  	},
>>>  };
>>>  
>>> +/* Runs on the same CPU as the sender, no need for memory synchronization */
>>>  static void ipi_clear_active_handler(struct pt_regs *regs __unused)
>>>  {
>>>  	u32 irqstat = gic_read_iar();
>>> @@ -375,13 +378,10 @@ static void ipi_clear_active_handler(struct pt_regs *regs __unused)
>>>  
>>>  		writel(val, base + GICD_ICACTIVER);
>>>  
>>> -		smp_rmb(); /* pairs with wmb in stats_reset */
>> the comment says it is paired with wmb in stats_reset. So is it OK to
>> leave the associated wmb?
> This patch removes multi-processor synchronization from the functions that run on
> the same CPU. stats_reset() can be called from one CPU (the IPI_SENDER CPU) and
> the variables it modifies accessed by the interrupt handlers running on different
> CPUs, like it happens for the IPI tests. In that case we do need the proper
> barriers in place.

Sorry, got confused about what you were asking. The next patch removes the
smp_wmb() from stats_reset(), which became redundant after the barriers were
added to the GIC functions that send IPIs. This patch is about removing barriers
that were never needed in the first place because the functions were running on
the same CPU; it does not depend on any GIC changes.

Thanks,
Alex

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [kvm-unit-tests PATCH 03/10] arm/arm64: gic: Remove memory synchronization from ipi_clear_active_handler()
  2020-12-02 14:14         ` Alexandru Elisei
@ 2020-12-03  9:41           ` Auger Eric
  -1 siblings, 0 replies; 78+ messages in thread
From: Auger Eric @ 2020-12-03  9:41 UTC (permalink / raw)
  To: Alexandru Elisei, kvm, kvmarm, drjones; +Cc: andre.przywara

Hi Alexandru,

On 12/2/20 3:14 PM, Alexandru Elisei wrote:
> Hi,
> 
> On 12/2/20 2:02 PM, Alexandru Elisei wrote:
> 
>> Hi Eric,
>>
>> On 12/1/20 4:37 PM, Auger Eric wrote:
>>> Hi Alexandru,
>>>
>>> On 11/25/20 4:51 PM, Alexandru Elisei wrote:
>>>> The gicv{2,3}-active test sends an IPI from the boot CPU to itself, then
>>>> checks that the interrupt has been received as expected. There is no need
>>>> to use inter-processor memory synchronization primitives on code that runs
>>>> on the same CPU, so remove the unneeded memory barriers.
>>>>
>>>> The arrays are modified asynchronously (in the interrupt handler) and it is
>>>> possible for the compiler to infer that they won't be changed during normal
>>>> program flow and try to perform harmful optimizations (like stashing a
>>>> previous read in a register and reusing it). To prevent this, for GICv2,
>>>> the smp_wmb() in gicv2_ipi_send_self() is replaced with a compiler barrier.
>>>> For GICv3, the wmb() barrier in gic_ipi_send_single() already implies a
>>>> compiler barrier.
>>>>
>>>> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
>>>> ---
>>>>  arm/gic.c | 8 ++++----
>>>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/arm/gic.c b/arm/gic.c
>>>> index 401ffafe4299..4e947e8516a2 100644
>>>> --- a/arm/gic.c
>>>> +++ b/arm/gic.c
>>>> @@ -12,6 +12,7 @@
>>>>   * This work is licensed under the terms of the GNU LGPL, version 2.
>>>>   */
>>>>  #include <libcflat.h>
>>>> +#include <linux/compiler.h>
>>>>  #include <errata.h>
>>>>  #include <asm/setup.h>
>>>>  #include <asm/processor.h>
>>>> @@ -260,7 +261,8 @@ static void check_lpi_hits(int *expected, const char *msg)
>>>>  
>>>>  static void gicv2_ipi_send_self(void)
>>>>  {
>>>> -	smp_wmb();
>>> nit: previous patch added it and this patch removes it. maybe squash the
>>> modifs into the previous patch saying only a barrier() is needed for self()?
>> You're right, this does look out of place. I'll merge this change into the
>> previous patch.
>>>> +	/* Prevent the compiler from optimizing memory accesses */
>>>> +	barrier();
>>>>  	writel(2 << 24 | IPI_IRQ, gicv2_dist_base() + GICD_SGIR);
>>>>  }
>>>>  
>>>> @@ -359,6 +361,7 @@ static struct gic gicv3 = {
>>>>  	},
>>>>  };
>>>>  
>>>> +/* Runs on the same CPU as the sender, no need for memory synchronization */
>>>>  static void ipi_clear_active_handler(struct pt_regs *regs __unused)
>>>>  {
>>>>  	u32 irqstat = gic_read_iar();
>>>> @@ -375,13 +378,10 @@ static void ipi_clear_active_handler(struct pt_regs *regs __unused)
>>>>  
>>>>  		writel(val, base + GICD_ICACTIVER);
>>>>  
>>>> -		smp_rmb(); /* pairs with wmb in stats_reset */
>>> the comment says it is paired with wmb in stats_reset. So is it OK to
>>> leave the associated wmb?
>> This patch removes multi-processor synchronization from the functions that run on
>> the same CPU. stats_reset() can be called from one CPU (the IPI_SENDER CPU) and
>> the variables it modifies accessed by the interrupt handlers running on different
>> CPUs, like it happens for the IPI tests. In that case we do need the proper
>> barriers in place.
> 
> Sorry, I got confused about what you were asking. The next patch removes the
> smp_wmb() from stats_reset(), which became redundant after the barriers were
> added to the GIC functions that send IPIs. This patch is about removing barriers
> that were never needed in the first place because the functions run on the same
> CPU; it does not depend on any GIC changes.

OK, I get it. I was just confused by this pairing comment, as we remove
one item and not the other, but that's not an issue here as we do not
need the barrier in that case.

Feel free to add my R-b with or without the squash:
Reviewed-by: Eric Auger <eric.auger@redhat.com>

Thanks

Eric
> 
> Thanks,
> Alex
> 


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [kvm-unit-tests PATCH 05/10] arm/arm64: gic: Use correct memory ordering for the IPI test
  2020-11-25 15:51   ` Alexandru Elisei
@ 2020-12-03 13:10     ` Auger Eric
  -1 siblings, 0 replies; 78+ messages in thread
From: Auger Eric @ 2020-12-03 13:10 UTC (permalink / raw)
  To: Alexandru Elisei, kvm, kvmarm, drjones; +Cc: andre.przywara

Hi Alexandru,

On 11/25/20 4:51 PM, Alexandru Elisei wrote:
> The IPI test works by sending IPIs to even numbered CPUs from the
> IPI_SENDER CPU (CPU1), and then checking that the other CPUs received the
> interrupts as expected. The check is done in check_acked() by the
> IPI_SENDER CPU with the help of three arrays:
> 
> - acked, where acked[i] == 1 means that CPU i received the interrupt.
> - bad_irq, where bad_irq[i] == -1 means that the interrupt received by CPU
>   i had the expected interrupt number (IPI_IRQ).
> - bad_sender, where bad_sender[i] == -1 means that the interrupt received
>   by CPU i was from the expected sender (IPI_SENDER, GICv2 only).
> 
> The assumption made by check_acked() is that if a CPU acked an interrupt,
> then bad_sender and bad_irq have also been updated. This is a common
> inter-thread communication pattern called message passing.  For message
> passing to work correctly on weakly consistent memory model architectures,
> like arm and arm64, barriers or address dependencies are required. This is
> described in ARM DDI 0487F.b, in "Armv7 compatible approaches for ordering,
> using DMB and DSB barriers" (page K11-7993), in the section with a single
> observer, which is in our case the IPI_SENDER CPU.
> 
> The IPI test attempts to enforce the correct ordering using memory
> barriers, but it's not enough. For example, the program execution below is
> valid from an architectural point of view:
> 
> 3 online CPUs, initial state (from stats_reset()):
> 
> acked[2] = 0;
> bad_sender[2] = -1;
> bad_irq[2] = -1;
> 
> CPU1 (in check_acked())		| CPU2 (in ipi_handler())
> 				|
> smp_rmb() // DMB ISHLD		| acked[2]++;
> read 1 from acked[2]		|
> nr_pass++ // nr_pass = 3	|
> read -1 from bad_sender[2]	|
> read -1 from bad_irq[2]		|
> 				| // in check_ipi_sender()
> 				| bad_sender[2] = <bad ipi sender>
> 				| // in check_irqnr()
> 				| bad_irq[2] = <bad irq number>
> 				| smp_wmb() // DMB ISHST
> nr_pass == nr_cpus, return	|
> 
> In this scenario, CPU1 will read the updated acked value, but it will read
> the initial bad_sender and bad_irq values. This is permitted because the
> memory barriers do not create a data dependency between the value read from
> acked and the values read from bad_irq and bad_sender on CPU1, respectively
> between the values written to acked, bad_sender and bad_irq on CPU2.
> 
> To avoid this situation, let's reorder the barriers and accesses to the
> arrays to create the needed dependencies that ensure that message passing
> behaves as expected.
> 
> In the interrupt handler, the writes to bad_sender and bad_irq are
> reordered before the write to acked and a smp_wmb() barrier is added. This
> ensures that if other PEs observe the write to acked, then they will also
> observe the writes to the other two arrays.
> 
> In check_acked(), put the smp_rmb() barrier after the read from acked to
> ensure that the subsequent reads from bad_sender, respectively bad_irq,
> aren't reordered locally by the PE.
> 
> With these changes, the expected ordering of accesses is respected and we
> end up with the pattern described in the Arm ARM and also in the Linux
> litmus test MP+fencewmbonceonce+fencermbonceonce.litmus from
> tools/memory-model/litmus-tests. More examples and explanations can be
> found in the Linux source tree, in Documentation/memory-barriers.txt, in
> the sections "SMP BARRIER PAIRING" and "READ MEMORY BARRIERS VS LOAD
> SPECULATION".
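
The message passing pattern described above can be sketched in user-space C11
(an illustration under stated assumptions, not the test code). A POSIX thread
plays the role of ipi_handler() and the caller plays check_acked(). The release
and acquire fences are stand-ins that are at least as strong as smp_wmb() and
smp_rmb(); handler_thread() and run_message_passing() are hypothetical names,
and 42 is an arbitrary payload.

```c
#include <pthread.h>
#include <stdatomic.h>

static atomic_int acked;	/* flag: "the message is ready" */
static atomic_int bad_irq;	/* payload published before the flag */

/* Writer side, in the role of ipi_handler(). */
static void *handler_thread(void *arg)
{
	(void)arg;
	/* 1. Write the payload first. */
	atomic_store_explicit(&bad_irq, 42, memory_order_relaxed);
	/* 2. Order the payload write before the flag write (cf. smp_wmb()). */
	atomic_thread_fence(memory_order_release);
	/* 3. Publish the flag last. */
	atomic_store_explicit(&acked, 1, memory_order_relaxed);
	return NULL;
}

/* Reader side, in the role of check_acked(). Returns the payload seen. */
static int run_message_passing(void)
{
	pthread_t t;
	int payload;

	atomic_store(&acked, 0);
	atomic_store(&bad_irq, -1);
	if (pthread_create(&t, NULL, handler_thread, NULL))
		return -2;

	/* 1. Read the flag until it is set. */
	while (!atomic_load_explicit(&acked, memory_order_relaxed))
		;
	/* 2. Order the flag read before the payload read (cf. smp_rmb()). */
	atomic_thread_fence(memory_order_acquire);
	/* 3. Only now read the payload. */
	payload = atomic_load_explicit(&bad_irq, memory_order_relaxed);

	pthread_join(t, NULL);
	return payload;
}
```

With both fences in place the reader cannot observe acked == 1 together with
the initial bad_irq value of -1. Moving either fence to the other side of the
flag access reintroduces the window shown in the execution above.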
> 
> For consistency with ipi_handler(), the array accesses in
> ipi_clear_active_handler() have also been reordered. This shouldn't affect
> the functionality of that test.
> 
> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> ---
>  arm/gic.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/arm/gic.c b/arm/gic.c
> index 7befda2a8673..bcb834406d23 100644
> --- a/arm/gic.c
> +++ b/arm/gic.c
> @@ -73,9 +73,9 @@ static void check_acked(const char *testname, cpumask_t *mask)
>  		mdelay(100);
>  		nr_pass = 0;
>  		for_each_present_cpu(cpu) {
> -			smp_rmb();
>  			nr_pass += cpumask_test_cpu(cpu, mask) ?
>  				acked[cpu] == 1 : acked[cpu] == 0;
> +			smp_rmb(); /* pairs with smp_wmb in ipi_handler */
>  
>  			if (bad_sender[cpu] != -1) {
>  				printf("cpu%d received IPI from wrong sender %d\n",
> @@ -118,7 +118,6 @@ static void check_spurious(void)
>  {
>  	int cpu;
>  
> -	smp_rmb();
This change is not documented in the commit message.
>  	for_each_present_cpu(cpu) {
>  		if (spurious[cpu])
>  			report_info("WARN: cpu%d got %d spurious interrupts",
> @@ -156,10 +155,10 @@ static void ipi_handler(struct pt_regs *regs __unused)
>  		 */
>  		if (gic_version() == 2)
>  			smp_rmb();
> -		++acked[smp_processor_id()];
>  		check_ipi_sender(irqstat);
>  		check_irqnr(irqnr);
> -		smp_wmb(); /* pairs with rmb in check_acked */
> +		smp_wmb(); /* pairs with smp_rmb in check_acked */
> +		++acked[smp_processor_id()];
>  	} else {
>  		++spurious[smp_processor_id()];
>  		smp_wmb();
I guess this one was paired with the smp_rmb() in check_spurious()?
> @@ -383,8 +382,8 @@ static void ipi_clear_active_handler(struct pt_regs *regs __unused)
>  
>  		writel(val, base + GICD_ICACTIVER);
>  
> -		++acked[smp_processor_id()];
>  		check_irqnr(irqnr);
> +		++acked[smp_processor_id()];
This change is not really needed, is it?
>  	} else {
>  		++spurious[smp_processor_id()];
>  	}
> 
Thanks

Eric


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [kvm-unit-tests PATCH 06/10] arm/arm64: gic: Check spurious and bad_sender in the active test
  2020-11-25 15:51   ` Alexandru Elisei
@ 2020-12-03 13:10     ` Auger Eric
  -1 siblings, 0 replies; 78+ messages in thread
From: Auger Eric @ 2020-12-03 13:10 UTC (permalink / raw)
  To: Alexandru Elisei, kvm, kvmarm, drjones; +Cc: andre.przywara

Hi Alexandru,

On 11/25/20 4:51 PM, Alexandru Elisei wrote:
> The gicv{2,3}-active test sends an IPI from the boot CPU to itself, then
> checks that the interrupt has been received as expected. The
> ipi_clear_active_handler() clears the active state of the interrupt with a
> write to the GICD_ICACTIVER register instead of writing to the EOI
> register.
> 
> When acknowledging the interrupt it is possible to get back a spurious
> interrupt ID (ID 1023), and the interrupt handler increments the number of
> spurious interrupts received on the current processor. However, this is not
> checked at the end of the test. Let's also check for spurious interrupts,
> like the IPI test does.
> 
> For IPIs on GICv2, the value returned by a read of the GICC_IAR register
> performed when acknowledging the interrupt also contains the sender CPU
> ID. Add a check for that too.
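
For reference, the GICv2 GICC_IAR layout that this check relies on can be
sketched as below: bits [9:0] hold the interrupt ID and bits [12:10] hold the
source CPU ID for SGIs. The helper names iar_irqnr(), iar_src_cpu() and
check_sender() are hypothetical; only the shift and mask match the patch.

```c
#include <stdint.h>

/* Interrupt ID: GICC_IAR bits [9:0]. */
static unsigned int iar_irqnr(uint32_t iar)
{
	return iar & 0x3ff;
}

/* Source CPU ID for SGIs: GICC_IAR bits [12:10]. */
static unsigned int iar_src_cpu(uint32_t iar)
{
	return (iar >> 10) & 0x7;
}

/*
 * Returns -1 when the IPI came from the expected sender, otherwise the
 * unexpected source CPU ID, following the bad_sender[] convention.
 */
static int check_sender(uint32_t iar, unsigned int expected)
{
	unsigned int src = iar_src_cpu(iar);

	return src == expected ? -1 : (int)src;
}
```

For example, an IAR value of (2 << 10) | 1 decodes to interrupt ID 1 sent by
CPU 2, so check_sender() returns -1 when the expected sender is 2 and returns 2
when it is anything else.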
> 
> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>

Eric
> ---
>  arm/gic.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/arm/gic.c b/arm/gic.c
> index bcb834406d23..5727d72a0ef3 100644
> --- a/arm/gic.c
> +++ b/arm/gic.c
> @@ -125,12 +125,12 @@ static void check_spurious(void)
>  	}
>  }
>  
> -static void check_ipi_sender(u32 irqstat)
> +static void check_ipi_sender(u32 irqstat, int sender)
>  {
>  	if (gic_version() == 2) {
>  		int src = (irqstat >> 10) & 7;
>  
> -		if (src != IPI_SENDER)
> +		if (src != sender)
>  			bad_sender[smp_processor_id()] = src;
>  	}
>  }
> @@ -155,7 +155,7 @@ static void ipi_handler(struct pt_regs *regs __unused)
>  		 */
>  		if (gic_version() == 2)
>  			smp_rmb();
> -		check_ipi_sender(irqstat);
> +		check_ipi_sender(irqstat, IPI_SENDER);
>  		check_irqnr(irqnr);
>  		smp_wmb(); /* pairs with smp_rmb in check_acked */
>  		++acked[smp_processor_id()];
> @@ -382,6 +382,7 @@ static void ipi_clear_active_handler(struct pt_regs *regs __unused)
>  
>  		writel(val, base + GICD_ICACTIVER);
>  
> +		check_ipi_sender(irqstat, smp_processor_id());
>  		check_irqnr(irqnr);
>  		++acked[smp_processor_id()];
>  	} else {
> @@ -394,6 +395,7 @@ static void run_active_clear_test(void)
>  	report_prefix_push("active");
>  	setup_irq(ipi_clear_active_handler);
>  	ipi_test_self();
> +	check_spurious();
>  	report_prefix_pop();
>  }
>  
> 


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [kvm-unit-tests PATCH 04/10] arm/arm64: gic: Remove unnecessary synchronization with stats_reset()
  2020-11-25 15:51   ` Alexandru Elisei
@ 2020-12-03 13:10     ` Auger Eric
  -1 siblings, 0 replies; 78+ messages in thread
From: Auger Eric @ 2020-12-03 13:10 UTC (permalink / raw)
  To: Alexandru Elisei, kvm, kvmarm, drjones; +Cc: andre.przywara

Hi,

On 11/25/20 4:51 PM, Alexandru Elisei wrote:
> The GICv3 driver executes a DSB barrier before sending an IPI, which
> ensures that memory accesses have completed. This removes the need to
> enforce ordering with respect to stats_reset() in the IPI handler.
> 
> For GICv2, we still need the DMB to ensure ordering between the read of the
> GICC_IAR MMIO register and the read from the acked array. It also matches
> what the Linux GICv2 driver does in gic_handle_irq().
> 
> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>

Eric
> ---
>  arm/gic.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/arm/gic.c b/arm/gic.c
> index 4e947e8516a2..7befda2a8673 100644
> --- a/arm/gic.c
> +++ b/arm/gic.c
> @@ -60,7 +60,6 @@ static void stats_reset(void)
>  		bad_sender[i] = -1;
>  		bad_irq[i] = -1;
>  	}
> -	smp_wmb();
>  }
>  
>  static void check_acked(const char *testname, cpumask_t *mask)
> @@ -150,7 +149,13 @@ static void ipi_handler(struct pt_regs *regs __unused)
>  
>  	if (irqnr != GICC_INT_SPURIOUS) {
>  		gic_write_eoir(irqstat);
> -		smp_rmb(); /* pairs with wmb in stats_reset */
> +		/*
> +		 * Make sure data written before the IPI was triggered is
> +		 * observed after the IAR is read. Pairs with the smp_wmb
> +		 * when sending the IPI.
> +		 */
> +		if (gic_version() == 2)
> +			smp_rmb();
>  		++acked[smp_processor_id()];
>  		check_ipi_sender(irqstat);
>  		check_irqnr(irqnr);
> 


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [kvm-unit-tests PATCH 07/10] arm/arm64: gic: Wait for writes to acked or spurious to complete
  2020-11-25 15:51   ` Alexandru Elisei
@ 2020-12-03 13:21     ` Auger Eric
  -1 siblings, 0 replies; 78+ messages in thread
From: Auger Eric @ 2020-12-03 13:21 UTC (permalink / raw)
  To: Alexandru Elisei, kvm, kvmarm, drjones; +Cc: andre.przywara

Hi Alexandru,

On 11/25/20 4:51 PM, Alexandru Elisei wrote:
> The IPI test has two parts: in the first part, it tests that the sender CPU
> can send an IPI to itself (ipi_test_self()), and in the second part it
> sends interrupts to even-numbered CPUs (ipi_test_smp()). When acknowledging
> an interrupt, if we read back a spurious interrupt ID (1023), the handler
> increments the index in the static array spurious corresponding to the CPU
> ID that the handler is running on; if we get the expected interrupt ID, we
> increment the same index in the acked array.
> 
> Reads of the spurious and acked arrays are synchronized with writes
> performed before sending the IPI. The synchronization is done either in the
> IPI sender function (GICv3), or by creating a data dependency (GICv2).
> 
> At the end of the test, the sender CPU reads from the acked and spurious
> arrays to check against the expected behaviour. We need to make sure that
> the writes in ipi_handler() are observable by the sender CPU. Use a DSB
> ISHST to make sure that the writes have completed.
> 
> One might rightfully argue that there are no guarantees regarding when the
> DSB instruction completes, just like there are no guarantees regarding when
> the value is observed by the other CPUs. However, let's do our best and
> instruct the CPU to complete the memory access when we know that it will be
> needed.
> 
> We still need to follow the message passing pattern for the acked,
> respectively bad_irq and bad_sender, because DSB guarantees that all memory
> accesses that come before the barrier have completed, not that they have
> completed in program order.
I guess the removal of the smp_rmb() in check_spurious() should belong to
this patch?
> 
> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Besides, AFAIU

Reviewed-by: Eric Auger <eric.auger@redhat.com>

Thanks

Eric
> ---
>  arm/gic.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/arm/gic.c b/arm/gic.c
> index 5727d72a0ef3..544c283f5f47 100644
> --- a/arm/gic.c
> +++ b/arm/gic.c
> @@ -161,8 +161,10 @@ static void ipi_handler(struct pt_regs *regs __unused)
>  		++acked[smp_processor_id()];
>  	} else {
>  		++spurious[smp_processor_id()];
> -		smp_wmb();
>  	}
> +
> +	/* Wait for writes to acked/spurious to complete */
> +	dsb(ishst);
>  }
>  
>  static void setup_irq(irq_handler_fn handler)
> 


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [kvm-unit-tests PATCH 05/10] arm/arm64: gic: Use correct memory ordering for the IPI test
  2020-12-03 13:10     ` Auger Eric
@ 2020-12-03 13:21       ` Alexandru Elisei
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexandru Elisei @ 2020-12-03 13:21 UTC (permalink / raw)
  To: Auger Eric, kvm, kvmarm, drjones; +Cc: andre.przywara

Hi Eric,

On 12/3/20 1:10 PM, Auger Eric wrote:
> Hi Alexandru,
>
> On 11/25/20 4:51 PM, Alexandru Elisei wrote:
>> The IPI test works by sending IPIs to even numbered CPUs from the
>> IPI_SENDER CPU (CPU1), and then checking that the other CPUs received the
>> interrupts as expected. The check is done in check_acked() by the
>> IPI_SENDER CPU with the help of three arrays:
>>
>> - acked, where acked[i] == 1 means that CPU i received the interrupt.
>> - bad_irq, where bad_irq[i] == -1 means that the interrupt received by CPU
>>   i had the expected interrupt number (IPI_IRQ).
>> - bad_sender, where bad_sender[i] == -1 means that the interrupt received
>>   by CPU i was from the expected sender (IPI_SENDER, GICv2 only).
>>
>> The assumption made by check_acked() is that if a CPU acked an interrupt,
>> then bad_sender and bad_irq have also been updated. This is a common
>> inter-thread communication pattern called message passing.  For message
>> passing to work correctly on weakly consistent memory model architectures,
>> like arm and arm64, barriers or address dependencies are required. This is
>> described in ARM DDI 0487F.b, in "Armv7 compatible approaches for ordering,
>> using DMB and DSB barriers" (page K11-7993), in the section with a single
>> observer, which is in our case the IPI_SENDER CPU.
>>
>> The IPI test attempts to enforce the correct ordering using memory
>> barriers, but it's not enough. For example, the program execution below is
>> valid from an architectural point of view:
>>
>> 3 online CPUs, initial state (from stats_reset()):
>>
>> acked[2] = 0;
>> bad_sender[2] = -1;
>> bad_irq[2] = -1;
>>
>> CPU1 (in check_acked())		| CPU2 (in ipi_handler())
>> 				|
>> smp_rmb() // DMB ISHLD		| acked[2]++;
>> read 1 from acked[2]		|
>> nr_pass++ // nr_pass = 3	|
>> read -1 from bad_sender[2]	|
>> read -1 from bad_irq[2]		|
>> 				| // in check_ipi_sender()
>> 				| bad_sender[2] = <bad ipi sender>
>> 				| // in check_irqnr()
>> 				| bad_irq[2] = <bad irq number>
>> 				| smp_wmb() // DMB ISHST
>> nr_pass == nr_cpus, return	|
>>
>> In this scenario, CPU1 will read the updated acked value, but it will read
>> the initial bad_sender and bad_irq values. This is permitted because the
>> memory barriers do not create a data dependency between the value read from
> >> acked and the values read from bad_irq and bad_sender on CPU1, respectively
>> between the values written to acked, bad_sender and bad_irq on CPU2.
>>
>> To avoid this situation, let's reorder the barriers and accesses to the
>> arrays to create the needed dependencies that ensure that message passing
>> behaves as expected.
>>
>> In the interrupt handler, the writes to bad_sender and bad_irq are
>> reordered before the write to acked and a smp_wmb() barrier is added. This
>> ensures that if other PEs observe the write to acked, then they will also
>> observe the writes to the other two arrays.
>>
>> In check_acked(), put the smp_rmb() barrier after the read from acked to
>> ensure that the subsequent reads from bad_sender, respectively bad_irq,
>> aren't reordered locally by the PE.
>>
>> With these changes, the expected ordering of accesses is respected and we
>> end up with the pattern described in the Arm ARM and also in the Linux
>> litmus test MP+fencewmbonceonce+fencermbonceonce.litmus from
>> tools/memory-model/litmus-tests. More examples and explanations can be
>> found in the Linux source tree, in Documentation/memory-barriers.txt, in
>> the sections "SMP BARRIER PAIRING" and "READ MEMORY BARRIERS VS LOAD
>> SPECULATION".
>>
>> For consistency with ipi_handler(), the array accesses in
>> ipi_clear_active_handler() have also been reordered. This shouldn't affect
>> the functionality of that test.
>>
>> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
>> ---
>>  arm/gic.c | 9 ++++-----
>>  1 file changed, 4 insertions(+), 5 deletions(-)
>>
>> diff --git a/arm/gic.c b/arm/gic.c
>> index 7befda2a8673..bcb834406d23 100644
>> --- a/arm/gic.c
>> +++ b/arm/gic.c
>> @@ -73,9 +73,9 @@ static void check_acked(const char *testname, cpumask_t *mask)
>>  		mdelay(100);
>>  		nr_pass = 0;
>>  		for_each_present_cpu(cpu) {
>> -			smp_rmb();
>>  			nr_pass += cpumask_test_cpu(cpu, mask) ?
>>  				acked[cpu] == 1 : acked[cpu] == 0;
>> +			smp_rmb(); /* pairs with smp_wmb in ipi_handler */
>>  
>>  			if (bad_sender[cpu] != -1) {
>>  				printf("cpu%d received IPI from wrong sender %d\n",
>> @@ -118,7 +118,6 @@ static void check_spurious(void)
>>  {
>>  	int cpu;
>>  
>> -	smp_rmb();
> this change is not documented in the commit msg.

You are right. I think this is a rebasing mistake and should actually be part of
#7 ("arm/arm64: gic: Wait for writes to acked or spurious to complete") where I
remove the smp_wmb() when updating spurious in ipi_handler.

>>  	for_each_present_cpu(cpu) {
>>  		if (spurious[cpu])
>>  			report_info("WARN: cpu%d got %d spurious interrupts",
>> @@ -156,10 +155,10 @@ static void ipi_handler(struct pt_regs *regs __unused)
>>  		 */
>>  		if (gic_version() == 2)
>>  			smp_rmb();
>> -		++acked[smp_processor_id()];
>>  		check_ipi_sender(irqstat);
>>  		check_irqnr(irqnr);
>> -		smp_wmb(); /* pairs with rmb in check_acked */
>> +		smp_wmb(); /* pairs with smp_rmb in check_acked */
>> +		++acked[smp_processor_id()];
>>  	} else {
>>  		++spurious[smp_processor_id()];
>>  		smp_wmb();
> I guess this one was paired with check_spurious one?
>> @@ -383,8 +382,8 @@ static void ipi_clear_active_handler(struct pt_regs *regs __unused)
>>  
>>  		writel(val, base + GICD_ICACTIVER);
>>  
>> -		++acked[smp_processor_id()];
>>  		check_irqnr(irqnr);
>> +		++acked[smp_processor_id()];
> This change is not really needed, is it?

It's not needed, yes. It's explained in the commit message, it's there for
consistency with ipi_handler.
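
For reference, the ordering that the commit message describes can be reduced
to a standalone sketch. This is not the code in arm/gic.c: handler_record()
and check_cpu() are hypothetical names, NR_CPUS is an arbitrary size, and C11
fences are used as a portable approximation of smp_wmb()/smp_rmb() (a release
fence is stronger than DMB ISHST, and an acquire fence is not identical to
DMB ISHLD). The point is only the shape: payload first, barrier, flag last on
the writer; flag first, barrier, payload last on the reader.

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_CPUS 4	/* arbitrary size for the sketch */

/* Hypothetical stand-ins for the arrays used by the IPI test. */
static atomic_int acked[NR_CPUS];
static int bad_sender[NR_CPUS];
static int bad_irq[NR_CPUS];

static void stats_reset(void)
{
	for (int i = 0; i < NR_CPUS; i++) {
		atomic_store_explicit(&acked[i], 0, memory_order_relaxed);
		bad_sender[i] = -1;
		bad_irq[i] = -1;
	}
}

/*
 * Writer side (the role played by the interrupt handler): publish the
 * payload first and the flag last. Pass -1 for "nothing wrong".
 */
static void handler_record(int cpu, int sender, int irq)
{
	bad_sender[cpu] = sender;
	bad_irq[cpu] = irq;
	/* Approximates smp_wmb(): payload writes are ordered before the flag. */
	atomic_thread_fence(memory_order_release);
	atomic_fetch_add_explicit(&acked[cpu], 1, memory_order_relaxed);
}

/*
 * Reader side (the role played by the checking CPU).
 * Returns -1 if the CPU has not acked, 1 if it acked with a clean
 * payload, 0 if it acked and recorded a bad sender or irq.
 */
static int check_cpu(int cpu)
{
	if (atomic_load_explicit(&acked[cpu], memory_order_relaxed) == 0)
		return -1;
	/* Approximates smp_rmb(): payload reads are ordered after the flag. */
	atomic_thread_fence(memory_order_acquire);
	return bad_sender[cpu] == -1 && bad_irq[cpu] == -1;
}
```

With this shape, a reader that sees a non-zero acked[cpu] is guaranteed to see
the bad_sender[cpu] and bad_irq[cpu] values written before it. Moving the
increment above the payload writes, or the acquire fence above the load of
acked, loses that guarantee.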

Thanks,
Alex

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [kvm-unit-tests PATCH 08/10] arm/arm64: gic: Split check_acked() into two functions
  2020-11-25 15:51   ` Alexandru Elisei
@ 2020-12-03 13:39     ` Auger Eric
  -1 siblings, 0 replies; 78+ messages in thread
From: Auger Eric @ 2020-12-03 13:39 UTC (permalink / raw)
  To: Alexandru Elisei, kvm, kvmarm, drjones; +Cc: andre.przywara



On 11/25/20 4:51 PM, Alexandru Elisei wrote:
> check_acked() has several peculiarities: it is the only function among the
> check_* functions which calls report() directly, it does two things
> (waits for interrupts and checks for misfired interrupts) and it also
> mixes printf, report_info and report calls.
> 
> check_acked() also reports a pass and returns as soon as all the target CPUs
> have received interrupts. However, a CPU not having received an interrupt
> *now* does not guarantee not receiving an eroneous interrupt if we wait
erroneous
> long enough.
> 
> Rework the function by splitting it into two separate functions, each with
> a single responsibility: wait_for_interrupts(), which waits for the
> expected interrupts to fire, and check_acked() which checks that interrupts
> have been received as expected.
> 
> wait_for_interrupts() also waits an extra 100 milliseconds after the
> expected interrupts have been received in an effort to make sure we don't
> miss misfiring interrupts.
> 
> Splitting check_acked() into two functions will also allow us to
> customize the behavior of each function in the future more easily
> without using an unnecessarily long list of arguments for check_acked().
> 
> CC: Andre Przywara <andre.przywara@arm.com>
> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> ---
>  arm/gic.c | 73 +++++++++++++++++++++++++++++++++++--------------------
>  1 file changed, 47 insertions(+), 26 deletions(-)
> 
> diff --git a/arm/gic.c b/arm/gic.c
> index 544c283f5f47..dcdab7d5f39a 100644
> --- a/arm/gic.c
> +++ b/arm/gic.c
> @@ -62,41 +62,42 @@ static void stats_reset(void)
>  	}
>  }
>  
> -static void check_acked(const char *testname, cpumask_t *mask)
> +static void wait_for_interrupts(cpumask_t *mask)
>  {
> -	int missing = 0, extra = 0, unexpected = 0;
>  	int nr_pass, cpu, i;
> -	bool bad = false;
>  
>  	/* Wait up to 5s for all interrupts to be delivered */
> -	for (i = 0; i < 50; ++i) {
> +	for (i = 0; i < 50; i++) {
>  		mdelay(100);
>  		nr_pass = 0;
>  		for_each_present_cpu(cpu) {
> +			/*
> +			 * A CPU having receied more than one interrupts will
received
> +			 * show up in check_acked(), and no matter how long we
> +			 * wait it cannot un-receive it. Consier at least one
consider
> +			 * interrupt as a pass.
> +			 */
>  			nr_pass += cpumask_test_cpu(cpu, mask) ?
> -				acked[cpu] == 1 : acked[cpu] == 0;
> -			smp_rmb(); /* pairs with smp_wmb in ipi_handler */
> -
> -			if (bad_sender[cpu] != -1) {
> -				printf("cpu%d received IPI from wrong sender %d\n",
> -					cpu, bad_sender[cpu]);
> -				bad = true;
> -			}
> -
> -			if (bad_irq[cpu] != -1) {
> -				printf("cpu%d received wrong irq %d\n",
> -					cpu, bad_irq[cpu]);
> -				bad = true;
> -			}
> +				acked[cpu] >= 1 : acked[cpu] == 0;
>  		}
> +
>  		if (nr_pass == nr_cpus) {
> -			report(!bad, "%s", testname);
>  			if (i)
> -				report_info("took more than %d ms", i * 100);
> +				report_info("interrupts took more than %d ms", i * 100);
> +			mdelay(100);
>  			return;
>  		}
>  	}
>  
> +	report_info("interrupts timed-out (5s)");
> +}
> +
> +static bool check_acked(cpumask_t *mask)
> +{
> +	int missing = 0, extra = 0, unexpected = 0;
> +	bool pass = true;
> +	int cpu;
> +
>  	for_each_present_cpu(cpu) {
>  		if (cpumask_test_cpu(cpu, mask)) {
>  			if (!acked[cpu])
> @@ -107,11 +108,28 @@ static void check_acked(const char *testname, cpumask_t *mask)
>  			if (acked[cpu])
>  				++unexpected;
>  		}
> +		smp_rmb(); /* pairs with smp_wmb in ipi_handler */
> +
> +		if (bad_sender[cpu] != -1) {
> +			report_info("cpu%d received IPI from wrong sender %d",
> +					cpu, bad_sender[cpu]);
> +			pass = false;
> +		}
> +
> +		if (bad_irq[cpu] != -1) {
> +			report_info("cpu%d received wrong irq %d",
> +					cpu, bad_irq[cpu]);
> +			pass = false;
> +		}
> +	}
> +
> +	if (missing || extra || unexpected) {
> +		report_info("ACKS: missing=%d extra=%d unexpected=%d",
> +				missing, extra, unexpected);
> +		pass = false;
>  	}
>  
> -	report(false, "%s", testname);
> -	report_info("Timed-out (5s). ACKS: missing=%d extra=%d unexpected=%d",
> -		    missing, extra, unexpected);
> +	return pass;
>  }
>  
>  static void check_spurious(void)
> @@ -300,7 +318,8 @@ static void ipi_test_self(void)
>  	cpumask_clear(&mask);
>  	cpumask_set_cpu(smp_processor_id(), &mask);
>  	gic->ipi.send_self();
> -	check_acked("IPI: self", &mask);
> +	wait_for_interrupts(&mask);
> +	report(check_acked(&mask), "Interrupts received");
>  	report_prefix_pop();
>  }
>  
> @@ -315,7 +334,8 @@ static void ipi_test_smp(void)
>  	for (i = smp_processor_id() & 1; i < nr_cpus; i += 2)
>  		cpumask_clear_cpu(i, &mask);
>  	gic_ipi_send_mask(IPI_IRQ, &mask);
> -	check_acked("IPI: directed", &mask);
> +	wait_for_interrupts(&mask);
> +	report(check_acked(&mask), "Interrupts received");
Both ipi_test_smp() and ipi_test_self() are called from the same test, so it
would be better to use different report messages, as was done originally.

>  	report_prefix_pop();
>  
>  	report_prefix_push("broadcast");
> @@ -323,7 +343,8 @@ static void ipi_test_smp(void)
>  	cpumask_copy(&mask, &cpu_present_mask);
>  	cpumask_clear_cpu(smp_processor_id(), &mask);
>  	gic->ipi.send_broadcast();
> -	check_acked("IPI: broadcast", &mask);
> +	wait_for_interrupts(&mask);
> +	report(check_acked(&mask), "Interrupts received");
>  	report_prefix_pop();
>  }
>  
> 

Otherwise looks good to me
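
For reference, the exit condition of the polling loop in wait_for_interrupts()
can be modelled outside the test harness. This is only a sketch:
count_settled() and wait_is_over() are hypothetical names, and a plain bool
array stands in for the cpumask. The predicate itself is the one from the
hunk above: a targeted CPU counts once it has at least one ack, and any other
CPU counts only while it has none.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Count the CPUs that are in their expected state: targeted CPUs need
 * at least one ack, all other CPUs need exactly zero.
 */
static int count_settled(const int *acked, const bool *targeted, int nr_cpus)
{
	int nr_pass = 0;

	for (int cpu = 0; cpu < nr_cpus; cpu++)
		nr_pass += targeted[cpu] ? acked[cpu] >= 1 : acked[cpu] == 0;

	return nr_pass;
}

/* The polling loop may stop early once every CPU is settled. */
static bool wait_is_over(const int *acked, const bool *targeted, int nr_cpus)
{
	return count_settled(acked, targeted, nr_cpus) == nr_cpus;
}
```

A targeted CPU with two acks satisfies ">= 1" here, so the wait ends and the
duplicate is left for check_acked() to flag. With the previous "== 1" test
that CPU would never be counted and the loop would run until the 5s timeout.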

Thanks

Eric


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [kvm-unit-tests PATCH 10/10] arm64: gic: Use IPI test checking for the LPI tests
  2020-11-25 15:51   ` Alexandru Elisei
@ 2020-12-03 14:59     ` Auger Eric
  -1 siblings, 0 replies; 78+ messages in thread
From: Auger Eric @ 2020-12-03 14:59 UTC (permalink / raw)
  To: Alexandru Elisei, kvm, kvmarm, drjones; +Cc: andre.przywara

Hi Alexandru,
On 11/25/20 4:51 PM, Alexandru Elisei wrote:
> The LPI code validates a result similarly to the IPI tests, by checking if
> the target CPU received the interrupt with the expected interrupt number.
> However, the LPI tests invent their own way of checking the test results by
> creating a global struct (lpi_stats), using a separate interrupt handler
> (lpi_handler) and test function (check_lpi_stats).
> 
> There are several areas that can be improved in the LPI code, which are
> already covered by the IPI tests:
> 
> - check_lpi_stats() doesn't take into account that the target CPU can
>   receive the correct interrupt multiple times.
> - check_lpi_stats() doesn't take into account the scenarios where all
>   online CPUs can receive the interrupt, but the target CPU is the last CPU
>   that touches lpi_stats.observed.
> - Insufficient or missing memory synchronization.
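The second point in the list above can be illustrated with a small,
single-threaded model. This is only a sketch: deliver(), single_slot_check()
and per_cpu_check() are hypothetical names, and the real handlers run
concurrently on separate CPUs. It shows that one shared "observed" slot only
remembers the last writer, while per-CPU counters keep the stray delivery
visible.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Shared slot, modelled after lpi_stats.observed. */
struct event {
	int cpu_id;
	int lpi_id;
};

static struct event observed = { -1, -1 };
static int acked[NR_CPUS];	/* per-CPU delivery counters */

/* Model of an interrupt handler running on 'cpu' for interrupt 'lpi'. */
static void deliver(int cpu, int lpi)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	observed.cpu_id = cpu;	/* overwrites whatever was recorded before */
	observed.lpi_id = lpi;
	acked[cpu]++;
}

/* Old-style check: only the last recorded event is compared. */
static bool single_slot_check(int cpu, int lpi)
{
	return observed.cpu_id == cpu && observed.lpi_id == lpi;
}

/* Per-CPU check: exactly one delivery on 'cpu', none anywhere else. */
static bool per_cpu_check(int cpu)
{
	for (int i = 0; i < NR_CPUS; i++) {
		if (acked[i] != (i == cpu ? 1 : 0))
			return false;
	}
	return true;
}
```

If CPU 1 wrongly receives the LPI and CPU 3 receives it afterwards, the
single-slot check for CPU 3 passes while the per-CPU check fails.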
> 
> Instead of duplicating code, let's convert the LPI tests to use
> check_acked() and the same interrupt handler as the IPI tests, which has
> been renamed to irq_handler() to avoid any confusion.
> 
> check_lpi_stats() has been replaced with check_acked() which, together with
> using irq_handler(), instantly gives us more correctness checks and proper
> memory synchronization between threads. lpi_stats.expected has been
> replaced by the CPU mask and the expected interrupt number arguments to
> check_acked(), with no change in semantics.
> 
> lpi_handler() aborted the test if the interrupt number was not an LPI. This
> was changed in favor of allowing the test to continue, as it will fail in
> check_acked(), but possibly print information useful for debugging. If the
> test receives spurious interrupts, those are reported via report_info() at
> the end of the test for consistency with the IPI tests, which don't treat
> spurious interrupts as critical errors.
> 
> In the spirit of code reuse, secondary_lpi_test() has been replaced with
> ipi_recv() because the two are now identical; ipi_recv() has been renamed
> to irq_recv(), similarly to irq_handler(), to avoid confusion.
> 
> CC: Eric Auger <eric.auger@redhat.com>
> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> ---
> With this change, I get the following failure for its-trigger on a
> rockpro64 (running on the little cores):
> 
> $ taskset -c 0-3 arm/run arm/gic.flat -smp 4 -machine gic-version=3 -append its-trigger
> /usr/bin/qemu-system-aarch64 -nodefaults -machine virt,gic-version=host,accel=kvm -cpu host -device virtio-serial-device -device virtconsole,chardev=ctd -chardev testdev,id=ctd -device pci-testdev -display none -serial stdio -kernel arm/gic.flat -smp 4 -machine gic-version=3 -append its-trigger # -initrd /tmp/tmp.wWW0iJY6DS
> ITS: MAPD devid=2 size = 0x8 itt=0x403a0000 valid=1
> ITS: MAPD devid=7 size = 0x8 itt=0x403b0000 valid=1
> MAPC col_id=3 target_addr = 0x30000 valid=1
> MAPC col_id=2 target_addr = 0x20000 valid=1
> INVALL col_id=2
> INVALL col_id=3
> MAPTI dev_id=2 event_id=20 -> phys_id=8195, col_id=3
> MAPTI dev_id=7 event_id=255 -> phys_id=8196, col_id=2
> INT dev_id=2 event_id=20
> PASS: gicv3: its-trigger: int: dev=2, eventid=20  -> lpi= 8195, col=3
> INT dev_id=7 event_id=255
> PASS: gicv3: its-trigger: int: dev=7, eventid=255 -> lpi= 8196, col=2
> INV dev_id=2 event_id=20
> INT dev_id=2 event_id=20
> PASS: gicv3: its-trigger: inv/invall: dev2/eventid=20 does not trigger any LPI
> INT dev_id=2 event_id=20
> PASS: gicv3: its-trigger: inv/invall: dev2/eventid=20 still does not trigger any LPI
> INVALL col_id=3
> INT dev_id=2 event_id=20
> INFO: gicv3: its-trigger: inv/invall: ACKS: missing=0 extra=1 unexpected=0
> FAIL: gicv3: its-trigger: inv/invall: dev2/eventid=20 now triggers an LPI
> ITS: MAPD devid=2 size = 0x8 itt=0x403a0000 valid=0
> INT dev_id=2 event_id=20
> PASS: gicv3: its-trigger: mapd valid=false: no LPI after device unmap
> SUMMARY: 6 tests, 1 unexpected failures
> 
> The reason for the failure is that the test "dev2/eventid=20 now triggers
> an LPI" triggers 2 LPIs, not one. This behavior was present before this
> patch, but it was ignored because check_lpi_stats() wasn't looking at the
> acked array.
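To make the "ACKS: missing=0 extra=1 unexpected=0" line in the log easier to
read, here is a rough sketch of how such a tally can be computed from
per-CPU counters. It is an assumption about the counting rules, not the
actual check_acked() code: tally_acks() is a hypothetical name and a plain
bitmask stands in for cpumask_t.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

struct ack_tally {
	int missing;	/* expected CPUs that saw no interrupt */
	int extra;	/* expected CPUs that saw more than one */
	int unexpected;	/* CPUs outside the mask that saw any */
};

/* Bit 'cpu' set in 'mask' means that CPU is expected to get one interrupt. */
static struct ack_tally tally_acks(const int acked[NR_CPUS], unsigned int mask)
{
	struct ack_tally t = { 0, 0, 0 };

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		bool expected = (mask & (1u << cpu)) != 0;

		assert(acked[cpu] >= 0);
		if (expected) {
			if (acked[cpu] == 0)
				t.missing++;
			else if (acked[cpu] > 1)
				t.extra++;
		} else if (acked[cpu] != 0) {
			t.unexpected++;
		}
	}
	return t;
}
```

With CPU 3 as the only expected target and acked[3] == 2, this yields one
"extra" and nothing else, which matches the failing case reported above.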
> 
> I'm not familiar with the ITS so I'm not sure if this is expected, if the
> test is incorrect or if there is something wrong with KVM emulation.
> 
> Did some more testing on an Ampere eMAG (fast out-of-order cores) using
> qemu and kvmtool and Linux v5.8, here's what I found:
> 
> - Using qemu and gic.flat built from *master*: error encountered 864 times
>   out of 1088 runs.
> - Using qemu: error encountered 852 times out of 1027 runs.
> - Using kvmtool: error encountered 8164 times out of 10602 runs.
> 
> Looks to me like it's consistent between master and this series, and
> between qemu and kvmtool.
> 
> Here's the diff that I used for testing master (I removed the diff line
> because it causes trouble when applying the main patch):
> 
> @@ -772,8 +772,12 @@ static void test_its_trigger(void)
>         /* Now call the invall and check the LPI hits */
>         its_send_invall(col3);
>         lpi_stats_expect(3, 8195);
> +       acked[3] = 0;
> +       dsb(ishst);
>         its_send_int(dev2, 20);
>         check_lpi_stats("dev2/eventid=20 now triggers an LPI");
> +       report_info("acked[3] = %d", acked[3]);
> +       report(acked[3] == 1, "dev2/eventid=20 received one interrupt");
>  
>         report_prefix_pop();
>  
> 
>  arm/gic.c | 185 ++++++++++++++++++++++++++----------------------------
>  1 file changed, 88 insertions(+), 97 deletions(-)
> 
> diff --git a/arm/gic.c b/arm/gic.c
> index da7b42da5449..6e93da80fe0d 100644
> --- a/arm/gic.c
> +++ b/arm/gic.c
> @@ -111,7 +111,7 @@ static bool check_acked(cpumask_t *mask, int sender, int irqnum)
>  		}
>  		if (!acked[cpu])
>  			continue;
> -		smp_rmb(); /* pairs with smp_wmb in ipi_handler */
> +		smp_rmb(); /* pairs with smp_wmb in irq_handler */
>  
>  		if (has_gicv2 && irq_sender[cpu] != sender) {
>  			report_info("cpu%d received IPI from wrong sender %d",
> @@ -149,11 +149,12 @@ static void check_spurious(void)
>  static int gic_get_sender(int irqstat)
>  {
>  	if (gic_version() == 2)
> +		/* GICC_IAR.CPUID is RAZ for non-SGIs */
>  		return (irqstat >> 10) & 7;
>  	return -1;
>  }
>  
> -static void ipi_handler(struct pt_regs *regs __unused)
> +static void irq_handler(struct pt_regs *regs __unused)
>  {
>  	u32 irqstat = gic_read_iar();
>  	u32 irqnr = gic_iar_irqnr(irqstat);
> @@ -192,75 +193,6 @@ static void setup_irq(irq_handler_fn handler)
>  }
>  
>  #if defined(__aarch64__)
> -struct its_event {
> -	int cpu_id;
> -	int lpi_id;
> -};
> -
> -struct its_stats {
> -	struct its_event expected;
> -	struct its_event observed;
> -};
> -
> -static struct its_stats lpi_stats;
> -
> -static void lpi_handler(struct pt_regs *regs __unused)
> -{
> -	u32 irqstat = gic_read_iar();
> -	int irqnr = gic_iar_irqnr(irqstat);
> -
> -	gic_write_eoir(irqstat);
> -	assert(irqnr >= 8192);
> -	smp_rmb(); /* pairs with wmb in lpi_stats_expect */
> -	lpi_stats.observed.cpu_id = smp_processor_id();
> -	lpi_stats.observed.lpi_id = irqnr;
> -	acked[lpi_stats.observed.cpu_id]++;
> -	smp_wmb(); /* pairs with rmb in check_lpi_stats */
> -}
> -
> -static void lpi_stats_expect(int exp_cpu_id, int exp_lpi_id)
> -{
> -	lpi_stats.expected.cpu_id = exp_cpu_id;
> -	lpi_stats.expected.lpi_id = exp_lpi_id;
> -	lpi_stats.observed.cpu_id = -1;
> -	lpi_stats.observed.lpi_id = -1;
> -	smp_wmb(); /* pairs with rmb in handler */
> -}
> -
> -static void check_lpi_stats(const char *msg)
> -{
> -	int i;
> -
> -	for (i = 0; i < 50; i++) {
> -		mdelay(100);
> -		smp_rmb(); /* pairs with wmb in lpi_handler */
> -		if (lpi_stats.observed.cpu_id == lpi_stats.expected.cpu_id &&
> -		    lpi_stats.observed.lpi_id == lpi_stats.expected.lpi_id) {
> -			report(true, "%s", msg);
> -			return;
> -		}
> -	}
> -
> -	if (lpi_stats.observed.cpu_id == -1 && lpi_stats.observed.lpi_id == -1) {
> -		report_info("No LPI received whereas (cpuid=%d, intid=%d) "
> -			    "was expected", lpi_stats.expected.cpu_id,
> -			    lpi_stats.expected.lpi_id);
> -	} else {
> -		report_info("Unexpected LPI (cpuid=%d, intid=%d)",
> -			    lpi_stats.observed.cpu_id,
> -			    lpi_stats.observed.lpi_id);
> -	}
> -	report(false, "%s", msg);
> -}
> -
> -static void secondary_lpi_test(void)
> -{
> -	setup_irq(lpi_handler);
> -	cpumask_set_cpu(smp_processor_id(), &ready);
> -	while (1)
> -		wfi();
> -}
> -
>  static void check_lpi_hits(int *expected, const char *msg)
>  {
>  	bool pass = true;
> @@ -347,7 +279,7 @@ static void ipi_test_smp(void)
>  
>  static void ipi_send(void)
>  {
> -	setup_irq(ipi_handler);
> +	setup_irq(irq_handler);
>  	wait_on_ready();
>  	ipi_test_self();
>  	ipi_test_smp();
> @@ -355,9 +287,9 @@ static void ipi_send(void)
>  	exit(report_summary());
>  }
>  
> -static void ipi_recv(void)
> +static void irq_recv(void)
>  {
> -	setup_irq(ipi_handler);
> +	setup_irq(irq_handler);
>  	cpumask_set_cpu(smp_processor_id(), &ready);
>  	while (1)
>  		wfi();
> @@ -368,7 +300,7 @@ static void ipi_test(void *data __unused)
>  	if (smp_processor_id() == IPI_SENDER)
>  		ipi_send();
>  	else
> -		ipi_recv();
> +		irq_recv();
>  }
>  
>  static struct gic gicv2 = {
> @@ -698,12 +630,12 @@ static int its_prerequisites(int nb_cpus)
>  
>  	stats_reset();
>  
> -	setup_irq(lpi_handler);
> +	setup_irq(irq_handler);
>  
>  	for_each_present_cpu(cpu) {
>  		if (cpu == 0)
>  			continue;
> -		smp_boot_secondary(cpu, secondary_lpi_test);
> +		smp_boot_secondary(cpu, irq_recv);
>  	}
>  	wait_on_ready();
>  
> @@ -757,6 +689,7 @@ static void test_its_trigger(void)
>  {
>  	struct its_collection *col3;
>  	struct its_device *dev2, *dev7;
> +	cpumask_t mask;
>  
>  	if (its_setup1())
>  		return;
> @@ -767,13 +700,27 @@ static void test_its_trigger(void)
>  
>  	report_prefix_push("int");
>  
> -	lpi_stats_expect(3, 8195);
> +	stats_reset();
> +	/*
> +	 * its_send_int() is missing the synchronization from the GICv3 IPI
> +	 * trigger functions.
> +	 */
> +	wmb();
So don't you want to add it in __its_send_int() instead?
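A rough sketch of that idea, under stated assumptions: the names below are
hypothetical and this is not the ITS library code. The C11 release fence is
only a portable stand-in for wmb(), and a plain struct stands in for the ITS
command queue. The point is that the barrier lives in the send path, so
callers do not have to remember it after every stats_reset().

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_CPUS 4

static int acked[NR_CPUS];

/* Hypothetical record of the last INT command that was queued. */
static struct {
	int dev_id;
	int event_id;
	int count;
} last_int_cmd;

static void stats_reset(void)
{
	for (int i = 0; i < NR_CPUS; i++)
		acked[i] = 0;
}

/* Hypothetical send helper that orders earlier stores before the trigger. */
static void its_send_int_ordered(int dev_id, int event_id)
{
	assert(dev_id >= 0 && event_id >= 0);

	/* Stand-in for wmb(): make the reset visible before triggering. */
	atomic_thread_fence(memory_order_release);

	last_int_cmd.dev_id = dev_id;
	last_int_cmd.event_id = event_id;
	last_int_cmd.count++;
}
```

Each call site would then shrink to stats_reset() followed by the send call,
without an explicit wmb() in between.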

Eric
> +	cpumask_clear(&mask);
> +	cpumask_set_cpu(3, &mask);
>  	its_send_int(dev2, 20);
> -	check_lpi_stats("dev=2, eventid=20  -> lpi= 8195, col=3");
> +	wait_for_interrupts(&mask);
> +	report(check_acked(&mask, 0, 8195),
> +			"dev=2, eventid=20  -> lpi= 8195, col=3");
>  
> -	lpi_stats_expect(2, 8196);
> +	stats_reset();
> +	wmb();
> +	cpumask_clear(&mask);
> +	cpumask_set_cpu(2, &mask);
>  	its_send_int(dev7, 255);
> -	check_lpi_stats("dev=7, eventid=255 -> lpi= 8196, col=2");
> +	wait_for_interrupts(&mask);
> +	report(check_acked(&mask, 0, 8196),
> +			"dev=7, eventid=255 -> lpi= 8196, col=2");
>  
>  	report_prefix_pop();
>  
> @@ -786,9 +733,13 @@ static void test_its_trigger(void)
>  	gicv3_lpi_set_config(8195, LPI_PROP_DEFAULT & ~LPI_PROP_ENABLED);
>  	its_send_inv(dev2, 20);
>  
> -	lpi_stats_expect(-1, -1);
> +	stats_reset();
> +	wmb();
> +	cpumask_clear(&mask);
>  	its_send_int(dev2, 20);
> -	check_lpi_stats("dev2/eventid=20 does not trigger any LPI");
> +	wait_for_interrupts(&mask);
> +	report(check_acked(&mask, -1, -1),
> +			"dev2/eventid=20 does not trigger any LPI");
>  
>  	/*
>  	 * re-enable the LPI but willingly do not call invall
> @@ -796,15 +747,24 @@ static void test_its_trigger(void)
>  	 * The LPI should not hit
>  	 */
>  	gicv3_lpi_set_config(8195, LPI_PROP_DEFAULT);
> -	lpi_stats_expect(-1, -1);
> +	stats_reset();
> +	wmb();
> +	cpumask_clear(&mask);
>  	its_send_int(dev2, 20);
> -	check_lpi_stats("dev2/eventid=20 still does not trigger any LPI");
> +	wait_for_interrupts(&mask);
> +	report(check_acked(&mask, -1, -1),
> +			"dev2/eventid=20 still does not trigger any LPI");
>  
>  	/* Now call the invall and check the LPI hits */
>  	its_send_invall(col3);
> -	lpi_stats_expect(3, 8195);
> +	stats_reset();
> +	wmb();
> +	cpumask_clear(&mask);
> +	cpumask_set_cpu(3, &mask);
>  	its_send_int(dev2, 20);
> -	check_lpi_stats("dev2/eventid=20 now triggers an LPI");
> +	wait_for_interrupts(&mask);
> +	report(check_acked(&mask, 0, 8195),
> +			"dev2/eventid=20 now triggers an LPI");
>  
>  	report_prefix_pop();
>  
> @@ -815,9 +775,14 @@ static void test_its_trigger(void)
>  	 */
>  
>  	its_send_mapd(dev2, false);
> -	lpi_stats_expect(-1, -1);
> +	stats_reset();
> +	wmb();
> +	cpumask_clear(&mask);
>  	its_send_int(dev2, 20);
> -	check_lpi_stats("no LPI after device unmap");
> +	wait_for_interrupts(&mask);
> +	report(check_acked(&mask, -1, -1), "no LPI after device unmap");
> +
> +	check_spurious();
>  	report_prefix_pop();
>  }
>  
> @@ -825,6 +790,7 @@ static void test_its_migration(void)
>  {
>  	struct its_device *dev2, *dev7;
>  	bool test_skipped = false;
> +	cpumask_t mask;
>  
>  	if (its_setup1()) {
>  		test_skipped = true;
> @@ -841,13 +807,25 @@ do_migrate:
>  	if (test_skipped)
>  		return;
>  
> -	lpi_stats_expect(3, 8195);
> +	stats_reset();
> +	wmb();
> +	cpumask_clear(&mask);
> +	cpumask_set_cpu(3, &mask);
>  	its_send_int(dev2, 20);
> -	check_lpi_stats("dev2/eventid=20 triggers LPI 8195 on PE #3 after migration");
> +	wait_for_interrupts(&mask);
> +	report(check_acked(&mask, 0, 8195),
> +			"dev2/eventid=20 triggers LPI 8195 on PE #3 after migration");
>  
> -	lpi_stats_expect(2, 8196);
> +	stats_reset();
> +	wmb();
> +	cpumask_clear(&mask);
> +	cpumask_set_cpu(2, &mask);
>  	its_send_int(dev7, 255);
> -	check_lpi_stats("dev7/eventid=255 triggers LPI 8196 on PE #2 after migration");
> +	wait_for_interrupts(&mask);
> +	report(check_acked(&mask, 0, 8196),
> +			"dev7/eventid=255 triggers LPI 8196 on PE #2 after migration");
> +
> +	check_spurious();
>  }
>  
>  #define ERRATA_UNMAPPED_COLLECTIONS "ERRATA_8c58be34494b"
> @@ -857,6 +835,7 @@ static void test_migrate_unmapped_collection(void)
>  	struct its_collection *col = NULL;
>  	struct its_device *dev2 = NULL, *dev7 = NULL;
>  	bool test_skipped = false;
> +	cpumask_t mask;
>  	int pe0 = 0;
>  	u8 config;
>  
> @@ -891,17 +870,29 @@ do_migrate:
>  	its_send_mapc(col, true);
>  	its_send_invall(col);
>  
> -	lpi_stats_expect(2, 8196);
> +	stats_reset();
> +	wmb();
> +	cpumask_clear(&mask);
> +	cpumask_set_cpu(2, &mask);
>  	its_send_int(dev7, 255);
> -	check_lpi_stats("dev7/eventid= 255 triggered LPI 8196 on PE #2");
> +	wait_for_interrupts(&mask);
> +	report(check_acked(&mask, 0, 8196),
> +			"dev7/eventid= 255 triggered LPI 8196 on PE #2");
>  
>  	config = gicv3_lpi_get_config(8192);
>  	report(config == LPI_PROP_DEFAULT,
>  	       "Config of LPI 8192 was properly migrated");
>  
> -	lpi_stats_expect(pe0, 8192);
> +	stats_reset();
> +	wmb();
> +	cpumask_clear(&mask);
> +	cpumask_set_cpu(pe0, &mask);
>  	its_send_int(dev2, 0);
> -	check_lpi_stats("dev2/eventid = 0 triggered LPI 8192 on PE0");
> +	wait_for_interrupts(&mask);
> +	report(check_acked(&mask, 0, 8192),
> +			"dev2/eventid = 0 triggered LPI 8192 on PE0");
> +
> +	check_spurious();
>  }
>  
>  static void test_its_pending_migration(void)
> 


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [kvm-unit-tests PATCH 09/10] arm/arm64: gic: Make check_acked() more generic
  2020-11-25 15:51   ` Alexandru Elisei
@ 2020-12-03 14:59     ` Auger Eric
  -1 siblings, 0 replies; 78+ messages in thread
From: Auger Eric @ 2020-12-03 14:59 UTC (permalink / raw)
  To: Alexandru Elisei, kvm, kvmarm, drjones; +Cc: andre.przywara

Hi,

On 11/25/20 4:51 PM, Alexandru Elisei wrote:
> Testing that an interrupt is received as expected is done in three places:
> in check_ipi_sender(), check_irqnr() and check_acked(). check_irqnr()
> compares the interrupt ID with IPI_IRQ and records a failure in bad_irq,
> and check_ipi_sender() compares the sender with IPI_SENDER and writes to
> bad_sender when they don't match.
> 
> Let's move all the checks to check_acked() by renaming
> bad_sender->irq_sender and bad_irq->irq_number and changing their semantics
> so they record the interrupt sender and the IRQ number, respectively.
> check_acked() now takes two new parameters: the expected sender and the
> expected interrupt number.
> 
> This has two distinct advantages:
> 
> 1. check_acked() and ipi_handler() can now be used for interrupts other
>    than IPIs.
> 2. Correctness checks are consolidated in one function.
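The record-then-check pattern described above can be sketched as follows.
This is a simplified model, not the patch itself: record_irq() and
check_cpu() are hypothetical names, and C11 release/acquire operations on
the counter stand in for the smp_wmb()/smp_rmb() pairing between the handler
and check_acked().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 4

static atomic_int acked[NR_CPUS];
static int irq_sender[NR_CPUS], irq_number[NR_CPUS];

/* Handler side: record what was seen, then publish it via the counter. */
static void record_irq(int cpu, int sender, int irqnr)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	irq_sender[cpu] = sender;
	irq_number[cpu] = irqnr;
	/* Release: the two stores above are visible before the increment. */
	atomic_fetch_add_explicit(&acked[cpu], 1, memory_order_release);
}

/*
 * Checker side: compare the recorded values with the expected ones.
 * 'check_sender' models the GICv2-only sender comparison.
 */
static bool check_cpu(int cpu, bool check_sender, int sender, int irqnum)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	/* Acquire: pairs with the release increment in record_irq(). */
	if (atomic_load_explicit(&acked[cpu], memory_order_acquire) == 0)
		return false;	/* nothing recorded for this CPU */
	if (check_sender && irq_sender[cpu] != sender)
		return false;
	return irq_number[cpu] == irqnum;
}
```

Because the expected values are arguments rather than constants baked into
the handler, the same pair of functions can serve interrupts other than IPIs.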
> 
> CC: Andre Przywara <andre.przywara@arm.com>
> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>

Thanks

Eric

> ---
>  arm/gic.c | 68 +++++++++++++++++++++++++++----------------------------
>  1 file changed, 33 insertions(+), 35 deletions(-)
> 
> diff --git a/arm/gic.c b/arm/gic.c
> index dcdab7d5f39a..da7b42da5449 100644
> --- a/arm/gic.c
> +++ b/arm/gic.c
> @@ -35,7 +35,7 @@ struct gic {
>  
>  static struct gic *gic;
>  static int acked[NR_CPUS], spurious[NR_CPUS];
> -static int bad_sender[NR_CPUS], bad_irq[NR_CPUS];
> +static int irq_sender[NR_CPUS], irq_number[NR_CPUS];
>  static cpumask_t ready;
>  
>  static void nr_cpu_check(int nr)
> @@ -57,8 +57,8 @@ static void stats_reset(void)
>  
>  	for (i = 0; i < nr_cpus; ++i) {
>  		acked[i] = 0;
> -		bad_sender[i] = -1;
> -		bad_irq[i] = -1;
> +		irq_sender[i] = -1;
> +		irq_number[i] = -1;
>  	}
>  }
>  
> @@ -92,9 +92,10 @@ static void wait_for_interrupts(cpumask_t *mask)
>  	report_info("interrupts timed-out (5s)");
>  }
>  
> -static bool check_acked(cpumask_t *mask)
> +static bool check_acked(cpumask_t *mask, int sender, int irqnum)
>  {
>  	int missing = 0, extra = 0, unexpected = 0;
> +	bool has_gicv2 = (gic_version() == 2);
>  	bool pass = true;
>  	int cpu;
>  
> @@ -108,17 +109,19 @@ static bool check_acked(cpumask_t *mask)
>  			if (acked[cpu])
>  				++unexpected;
>  		}
> +		if (!acked[cpu])
> +			continue;
>  		smp_rmb(); /* pairs with smp_wmb in ipi_handler */
>  
> -		if (bad_sender[cpu] != -1) {
> +		if (has_gicv2 && irq_sender[cpu] != sender) {
>  			report_info("cpu%d received IPI from wrong sender %d",
> -					cpu, bad_sender[cpu]);
> +					cpu, irq_sender[cpu]);
>  			pass = false;
>  		}
>  
> -		if (bad_irq[cpu] != -1) {
> +		if (irq_number[cpu] != irqnum) {
>  			report_info("cpu%d received wrong irq %d",
> -					cpu, bad_irq[cpu]);
> +					cpu, irq_number[cpu]);
>  			pass = false;
>  		}
>  	}
> @@ -143,26 +146,18 @@ static void check_spurious(void)
>  	}
>  }
>  
> -static void check_ipi_sender(u32 irqstat, int sender)
> +static int gic_get_sender(int irqstat)
>  {
> -	if (gic_version() == 2) {
> -		int src = (irqstat >> 10) & 7;
> -
> -		if (src != sender)
> -			bad_sender[smp_processor_id()] = src;
> -	}
> -}
> -
> -static void check_irqnr(u32 irqnr)
> -{
> -	if (irqnr != IPI_IRQ)
> -		bad_irq[smp_processor_id()] = irqnr;
> +	if (gic_version() == 2)
> +		return (irqstat >> 10) & 7;
> +	return -1;
>  }
>  
>  static void ipi_handler(struct pt_regs *regs __unused)
>  {
>  	u32 irqstat = gic_read_iar();
>  	u32 irqnr = gic_iar_irqnr(irqstat);
> +	int this_cpu = smp_processor_id();
>  
>  	if (irqnr != GICC_INT_SPURIOUS) {
>  		gic_write_eoir(irqstat);
> @@ -173,12 +168,12 @@ static void ipi_handler(struct pt_regs *regs __unused)
>  		 */
>  		if (gic_version() == 2)
>  			smp_rmb();
> -		check_ipi_sender(irqstat, IPI_SENDER);
> -		check_irqnr(irqnr);
> +		irq_sender[this_cpu] = gic_get_sender(irqstat);
> +		irq_number[this_cpu] = irqnr;
>  		smp_wmb(); /* pairs with smp_rmb in check_acked */
> -		++acked[smp_processor_id()];
> +		++acked[this_cpu];
>  	} else {
> -		++spurious[smp_processor_id()];
> +		++spurious[this_cpu];
>  	}
>  
>  	/* Wait for writes to acked/spurious to complete */
> @@ -311,40 +306,42 @@ static void gicv3_ipi_send_broadcast(void)
>  
>  static void ipi_test_self(void)
>  {
> +	int this_cpu = smp_processor_id();
>  	cpumask_t mask;
>  
>  	report_prefix_push("self");
>  	stats_reset();
>  	cpumask_clear(&mask);
> -	cpumask_set_cpu(smp_processor_id(), &mask);
> +	cpumask_set_cpu(this_cpu, &mask);
>  	gic->ipi.send_self();
>  	wait_for_interrupts(&mask);
> -	report(check_acked(&mask), "Interrupts received");
> +	report(check_acked(&mask, this_cpu, IPI_IRQ), "Interrupts received");
>  	report_prefix_pop();
>  }
>  
>  static void ipi_test_smp(void)
>  {
> +	int this_cpu = smp_processor_id();
>  	cpumask_t mask;
>  	int i;
>  
>  	report_prefix_push("target-list");
>  	stats_reset();
>  	cpumask_copy(&mask, &cpu_present_mask);
> -	for (i = smp_processor_id() & 1; i < nr_cpus; i += 2)
> +	for (i = this_cpu & 1; i < nr_cpus; i += 2)
>  		cpumask_clear_cpu(i, &mask);
>  	gic_ipi_send_mask(IPI_IRQ, &mask);
>  	wait_for_interrupts(&mask);
> -	report(check_acked(&mask), "Interrupts received");
> +	report(check_acked(&mask, this_cpu, IPI_IRQ), "Interrupts received");
>  	report_prefix_pop();
>  
>  	report_prefix_push("broadcast");
>  	stats_reset();
>  	cpumask_copy(&mask, &cpu_present_mask);
> -	cpumask_clear_cpu(smp_processor_id(), &mask);
> +	cpumask_clear_cpu(this_cpu, &mask);
>  	gic->ipi.send_broadcast();
>  	wait_for_interrupts(&mask);
> -	report(check_acked(&mask), "Interrupts received");
> +	report(check_acked(&mask, this_cpu, IPI_IRQ), "Interrupts received");
>  	report_prefix_pop();
>  }
>  
> @@ -393,6 +390,7 @@ static void ipi_clear_active_handler(struct pt_regs *regs __unused)
>  {
>  	u32 irqstat = gic_read_iar();
>  	u32 irqnr = gic_iar_irqnr(irqstat);
> +	int this_cpu = smp_processor_id();
>  
>  	if (irqnr != GICC_INT_SPURIOUS) {
>  		void *base;
> @@ -405,11 +403,11 @@ static void ipi_clear_active_handler(struct pt_regs *regs __unused)
>  
>  		writel(val, base + GICD_ICACTIVER);
>  
> -		check_ipi_sender(irqstat, smp_processor_id());
> -		check_irqnr(irqnr);
> -		++acked[smp_processor_id()];
> +		irq_sender[this_cpu] = gic_get_sender(irqstat);
> +		irq_number[this_cpu] = irqnr;
> +		++acked[this_cpu];
>  	} else {
> -		++spurious[smp_processor_id()];
> +		++spurious[this_cpu];
>  	}
>  }
>  
> 
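
The record-then-check split that the commit message describes can be sketched outside the test harness. This is a minimal single-threaded model, not the arm/gic.c code: every sk_* name is hypothetical, the barriers are left out, and the 10-bit interrupt ID mask assumes the GICv2 IAR layout. Only the sender expression, (irqstat >> 10) & 7, is taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define SK_NR_CPUS 8	/* hypothetical; stands in for NR_CPUS */

/* Hypothetical stand-ins for acked[], irq_sender[] and irq_number[]. */
static int sk_acked[SK_NR_CPUS];
static int sk_irq_sender[SK_NR_CPUS];
static int sk_irq_number[SK_NR_CPUS];

static void sk_stats_reset(void)
{
	for (int i = 0; i < SK_NR_CPUS; i++) {
		sk_acked[i] = 0;
		sk_irq_sender[i] = -1;
		sk_irq_number[i] = -1;
	}
}

/* Same expression as gic_get_sender() in the patch; -1 when not GICv2. */
static int sk_get_sender(uint32_t irqstat, bool has_gicv2)
{
	return has_gicv2 ? (int)((irqstat >> 10) & 7) : -1;
}

/* Handler side: record what was observed, pass no judgement. */
static void sk_record(int cpu, uint32_t irqstat, bool has_gicv2)
{
	sk_irq_sender[cpu] = sk_get_sender(irqstat, has_gicv2);
	sk_irq_number[cpu] = (int)(irqstat & 0x3ff);	/* assumed GICv2 ID field */
	++sk_acked[cpu];
}

/* Checker side: compare the recorded values with the caller's expectations. */
static bool sk_check(unsigned int mask, int sender, int irqnum, bool has_gicv2)
{
	bool pass = true;

	for (int cpu = 0; cpu < SK_NR_CPUS; cpu++) {
		bool expected = mask & (1u << cpu);

		/* Targets must ack exactly once, everyone else not at all. */
		if (expected ? sk_acked[cpu] != 1 : sk_acked[cpu] != 0)
			pass = false;
		/* Nothing was recorded for this CPU, so nothing to compare. */
		if (!sk_acked[cpu])
			continue;
		if (has_gicv2 && sk_irq_sender[cpu] != sender)
			pass = false;
		if (sk_irq_number[cpu] != irqnum)
			pass = false;
	}
	return pass;
}
```

Because the handler only stores raw values, the same pair of functions could in principle serve any interrupt type: the expectations live entirely in the arguments to the checker.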


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [kvm-unit-tests PATCH 10/10] arm64: gic: Use IPI test checking for the LPI tests
  2020-12-03 14:59     ` Auger Eric
@ 2020-12-09 10:29       ` Alexandru Elisei
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexandru Elisei @ 2020-12-09 10:29 UTC (permalink / raw)
  To: Auger Eric, kvm, kvmarm, drjones; +Cc: andre.przywara

Hi Eric,

On 12/3/20 2:59 PM, Auger Eric wrote:
> Hi Alexandru,
> On 11/25/20 4:51 PM, Alexandru Elisei wrote:
>> The LPI code validates a result similarly to the IPI tests, by checking if
>> the target CPU received the interrupt with the expected interrupt number.
>> However, the LPI tests invent their own way of checking the test results by
>> creating a global struct (lpi_stats), using a separate interrupt handler
>> (lpi_handler) and test function (check_lpi_stats).
>>
>> There are several areas that can be improved in the LPI code, which are
>> already covered by the IPI tests:
>>
>> - check_lpi_stats() doesn't take into account that the target CPU can
>>   receive the correct interrupt multiple times.
>> - check_lpi_stats() doesn't take into account the scenarios where all
>>   online CPUs can receive the interrupt, but the target CPU is the last CPU
>>   that touches lpi_stats.observed.
>> - Insufficient or missing memory synchronization.
>>
>> Instead of duplicating code, let's convert the LPI tests to use
>> check_acked() and the same interrupt handler as the IPI tests, which has
>> been renamed to irq_handler() to avoid any confusion.
>>
>> check_lpi_stats() has been replaced with check_acked() which, together with
>> using irq_handler(), instantly gives us more correctness checks and proper
>> memory synchronization between threads. lpi_stats.expected has been
>> replaced by the CPU mask and the expected interrupt number arguments to
>> check_acked(), with no change in semantics.
>>
>> lpi_handler() aborted the test if the interrupt number was not an LPI. This
>> was changed in favor of allowing the test to continue, as it will fail in
>> check_acked(), but possibly print information useful for debugging. If the
>> test receives spurious interrupts, those are reported via report_info() at
>> the end of the test for consistency with the IPI tests, which don't treat
>> spurious interrupts as critical errors.
>>
>> In the spirit of code reuse, secondary_lpi_tests() has been replaced with
>> ipi_recv() because the two are now identical; ipi_recv() has been renamed
>> to irq_recv(), similarly to irq_handler(), to avoid confusion.
>>
>> CC: Eric Auger <eric.auger@redhat.com>
>> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
>> ---
>> [..]
>> @@ -767,13 +700,27 @@ static void test_its_trigger(void)
>>  
>>  	report_prefix_push("int");
>>  
>> -	lpi_stats_expect(3, 8195);
>> +	stats_reset();
>> +	/*
>> +	 * its_send_int() is missing the synchronization from the GICv3 IPI
>> +	 * trigger functions.
>> +	 */
>> +	wmb();
> so don't you want to add it in __its_send_int instead?

The memory synchronization in the IPI sender functions makes perfect sense, since
that's how IPIs are used: one CPU kicks the target, and the target reads from a
shared memory location.

I don't think receiving an interrupt from a device is how one would usually expect
to do inter-processor communication. However, I did more digging about this
ability to trigger interrupts from made-up devices, and it seems to me that this
was introduced for testing purposes (please correct me if I'm wrong). With this in
mind, I guess it wouldn't be awkward to have the wmb() in its_send_int(), because
we are using it just like we would an IPI. And it also reduces the boilerplate code.

I'll make the change in the next iteration.

Thanks,
Alex
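
A rough illustration of the change agreed on above, folding the barrier into the trigger helper so that callers don't have to repeat it. This is only a sketch under assumptions: the sk_* names and the doorbell variable are hypothetical, a C11 release fence stands in for wmb(), and a real ITS is driven through its command queue rather than a single store. Running it on one thread shows where the barrier goes, not its ordering effect.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared state that a handler on another CPU would read. */
static int sk_expected_irq;

/* Hypothetical stand-in for whatever store makes the device fire. */
static _Atomic uint32_t sk_doorbell;

/*
 * Trigger helper with the barrier folded in. The release fence plays
 * the role of wmb() here: stores issued before it are ordered before
 * the doorbell store that follows it.
 */
static void sk_send_int(uint32_t event_id)
{
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&sk_doorbell, event_id, memory_order_relaxed);
}

/* Caller: set up the shared state, then trigger. No barrier needed here. */
static void sk_trigger_test(int irq, uint32_t event_id)
{
	sk_expected_irq = irq;
	sk_send_int(event_id);
}
```

With the fence inside the helper, a call site reduces to resetting the statistics and calling the trigger, which is the boilerplate reduction mentioned above.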

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [kvm-unit-tests PATCH 08/10] arm/arm64: gic: Split check_acked() into two functions
  2020-12-03 13:39     ` Auger Eric
@ 2020-12-10 14:45       ` Alexandru Elisei
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexandru Elisei @ 2020-12-10 14:45 UTC (permalink / raw)
  To: Auger Eric, kvm, kvmarm, drjones; +Cc: andre.przywara

Hi Eric,

On 12/3/20 1:39 PM, Auger Eric wrote:
>
> On 11/25/20 4:51 PM, Alexandru Elisei wrote:
>> check_acked() has several peculiarities: it is the only function among the
>> check_* functions which calls report() directly, it does two things
>> (waits for interrupts and checks for misfired interrupts) and it also
>> mixes printf, report_info and report calls.
>>
>> check_acked() also reports a pass and returns as soon as all the target CPUs
>> have received interrupts. However, a CPU not having received an interrupt
>> *now* does not guarantee not receiving an eroneous interrupt if we wait
> erroneous
>> long enough.
>>
>> Rework the function by splitting it into two separate functions, each with
>> a single responsibility: wait_for_interrupts(), which waits for the
>> expected interrupts to fire, and check_acked() which checks that interrupts
>> have been received as expected.
>>
>> wait_for_interrupts() also waits an extra 100 milliseconds after the
>> expected interrupts have been received in an effort to make sure we don't
>> miss misfiring interrupts.
>>
>> Splitting check_acked() into two functions will also allow us to
>> customize the behavior of each function in the future more easily
>> without using an unnecessarily long list of arguments for check_acked().
>>
>> CC: Andre Przywara <andre.przywara@arm.com>
>> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
>> ---
>>  arm/gic.c | 73 +++++++++++++++++++++++++++++++++++--------------------
>>  1 file changed, 47 insertions(+), 26 deletions(-)
>>
>> diff --git a/arm/gic.c b/arm/gic.c
>> index 544c283f5f47..dcdab7d5f39a 100644
>> --- a/arm/gic.c
>> +++ b/arm/gic.c
>> @@ -62,41 +62,42 @@ static void stats_reset(void)
>>  	}
>>  }
>>  
>> -static void check_acked(const char *testname, cpumask_t *mask)
>> +static void wait_for_interrupts(cpumask_t *mask)
>>  {
>> -	int missing = 0, extra = 0, unexpected = 0;
>>  	int nr_pass, cpu, i;
>> -	bool bad = false;
>>  
>>  	/* Wait up to 5s for all interrupts to be delivered */
>> -	for (i = 0; i < 50; ++i) {
>> +	for (i = 0; i < 50; i++) {
>>  		mdelay(100);
>>  		nr_pass = 0;
>>  		for_each_present_cpu(cpu) {
>> +			/*
>> +			 * A CPU having receied more than one interrupts will
> received
>> +			 * show up in check_acked(), and no matter how long we
>> +			 * wait it cannot un-receive it. Consier at least one
> consider

Will fix all three typos, thanks.

>> +			 * interrupt as a pass.
>> +			 */
>>  			nr_pass += cpumask_test_cpu(cpu, mask) ?
>> -				acked[cpu] == 1 : acked[cpu] == 0;
>> -			smp_rmb(); /* pairs with smp_wmb in ipi_handler */
>> -
>> -			if (bad_sender[cpu] != -1) {
>> -				printf("cpu%d received IPI from wrong sender %d\n",
>> -					cpu, bad_sender[cpu]);
>> -				bad = true;
>> -			}
>> -
>> -			if (bad_irq[cpu] != -1) {
>> -				printf("cpu%d received wrong irq %d\n",
>> -					cpu, bad_irq[cpu]);
>> -				bad = true;
>> -			}
>> +				acked[cpu] >= 1 : acked[cpu] == 0;
>>  		}
>> +
>>  		if (nr_pass == nr_cpus) {
>> -			report(!bad, "%s", testname);
>>  			if (i)
>> -				report_info("took more than %d ms", i * 100);
>> +				report_info("interrupts took more than %d ms", i * 100);
>> +			mdelay(100);
>>  			return;
>>  		}
>>  	}
>>  
>> +	report_info("interrupts timed-out (5s)");
>> +}
>> +
>> +static bool check_acked(cpumask_t *mask)
>> +{
>> +	int missing = 0, extra = 0, unexpected = 0;
>> +	bool pass = true;
>> +	int cpu;
>> +
>>  	for_each_present_cpu(cpu) {
>>  		if (cpumask_test_cpu(cpu, mask)) {
>>  			if (!acked[cpu])
>> @@ -107,11 +108,28 @@ static void check_acked(const char *testname, cpumask_t *mask)
>>  			if (acked[cpu])
>>  				++unexpected;
>>  		}
>> +		smp_rmb(); /* pairs with smp_wmb in ipi_handler */
>> +
>> +		if (bad_sender[cpu] != -1) {
>> +			report_info("cpu%d received IPI from wrong sender %d",
>> +					cpu, bad_sender[cpu]);
>> +			pass = false;
>> +		}
>> +
>> +		if (bad_irq[cpu] != -1) {
>> +			report_info("cpu%d received wrong irq %d",
>> +					cpu, bad_irq[cpu]);
>> +			pass = false;
>> +		}
>> +	}
>> +
>> +	if (missing || extra || unexpected) {
>> +		report_info("ACKS: missing=%d extra=%d unexpected=%d",
>> +				missing, extra, unexpected);
>> +		pass = false;
>>  	}
>>  
>> -	report(false, "%s", testname);
>> -	report_info("Timed-out (5s). ACKS: missing=%d extra=%d unexpected=%d",
>> -		    missing, extra, unexpected);
>> +	return pass;
>>  }
>>  
>>  static void check_spurious(void)
>> @@ -300,7 +318,8 @@ static void ipi_test_self(void)
>>  	cpumask_clear(&mask);
>>  	cpumask_set_cpu(smp_processor_id(), &mask);
>>  	gic->ipi.send_self();
>> -	check_acked("IPI: self", &mask);
>> +	wait_for_interrupts(&mask);
>> +	report(check_acked(&mask), "Interrupts received");
>>  	report_prefix_pop();
>>  }
>>  
>> @@ -315,7 +334,8 @@ static void ipi_test_smp(void)
>>  	for (i = smp_processor_id() & 1; i < nr_cpus; i += 2)
>>  		cpumask_clear_cpu(i, &mask);
>>  	gic_ipi_send_mask(IPI_IRQ, &mask);
>> -	check_acked("IPI: directed", &mask);
>> +	wait_for_interrupts(&mask);
>> +	report(check_acked(&mask), "Interrupts received");
> both ipi_test_smp and ipi_test_self are called from the same test so
> better to use different error messages like it was done originally.

I used the same error message because the tests have different prefixes
("target-list" versus "broadcast"). Do you think there are cases where that's not
enough?
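
To make the point about prefixes concrete, here is a small model of a prefix stack. It is not the kvm-unit-tests report library: the sk_* names are hypothetical and the "PASS: prefix: message" output format is assumed for illustration only.

```c
#include <stdio.h>
#include <string.h>

#define SK_MAX_PREFIX 4	/* hypothetical nesting limit */

static const char *sk_prefixes[SK_MAX_PREFIX];
static int sk_depth;

static void sk_prefix_push(const char *prefix)
{
	if (sk_depth < SK_MAX_PREFIX)
		sk_prefixes[sk_depth++] = prefix;
}

static void sk_prefix_pop(void)
{
	if (sk_depth > 0)
		sk_depth--;
}

/* Format "PASS|FAIL: prefix: ...: msg" into buf, truncating if needed. */
static void sk_format_report(char *buf, size_t len, int pass, const char *msg)
{
	size_t off = (size_t)snprintf(buf, len, "%s: ", pass ? "PASS" : "FAIL");

	for (int i = 0; i < sk_depth && off < len; i++)
		off += (size_t)snprintf(buf + off, len - off, "%s: ",
					sk_prefixes[i]);
	if (off < len)
		snprintf(buf + off, len - off, "%s", msg);
}
```

Under this model, the same "Interrupts received" message yields two different lines depending on which prefix is active when it is reported.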

Thanks,
Alex
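
The wait half of the split described in the quoted commit message can be sketched as a polling loop with a grace period. This is a simplified model, not the arm/gic.c code: the sk_* names are hypothetical, the CPU mask is a plain bitmask, and the delay is a callback so the loop can be exercised without sleeping (in the test it would be mdelay(100)).

```c
#include <stdbool.h>

/* Hypothetical delay stand-in that only counts how often it was called. */
static int sk_delay_calls;

static void sk_count_delay(void)
{
	sk_delay_calls++;
}

/* True when every target CPU has at least one ack and all others have none. */
static bool sk_all_settled(const int *acked, int ncpus, unsigned int mask)
{
	for (int cpu = 0; cpu < ncpus; cpu++) {
		bool target = mask & (1u << cpu);

		if (target ? acked[cpu] < 1 : acked[cpu] != 0)
			return false;
	}
	return true;
}

/*
 * Poll up to max_polls times, delaying before each check. Once settled,
 * delay one more time so that a misfiring interrupt has a chance to land
 * before the caller runs its checks. Returns the index of the poll that
 * succeeded, or -1 on timeout.
 */
static int sk_wait_for_interrupts(const int *acked, int ncpus,
				  unsigned int mask, int max_polls,
				  void (*delay)(void))
{
	for (int i = 0; i < max_polls; i++) {
		delay();
		if (sk_all_settled(acked, ncpus, mask)) {
			delay();	/* grace period */
			return i;
		}
	}
	return -1;
}
```

Note that a CPU with more than one ack still counts as settled here; catching the extra interrupt is left to the checking function, which matches the reasoning in the quoted comment.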

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [kvm-unit-tests PATCH 08/10] arm/arm64: gic: Split check_acked() into two functions
  2020-12-10 14:45       ` Alexandru Elisei
@ 2020-12-15 13:58         ` Auger Eric
  -1 siblings, 0 replies; 78+ messages in thread
From: Auger Eric @ 2020-12-15 13:58 UTC (permalink / raw)
  To: Alexandru Elisei, kvm, kvmarm, drjones; +Cc: andre.przywara

Hi Alexandru,

On 12/10/20 3:45 PM, Alexandru Elisei wrote:
> Hi Eric,
> 
> On 12/3/20 1:39 PM, Auger Eric wrote:
>>
>> On 11/25/20 4:51 PM, Alexandru Elisei wrote:
>>> check_acked() has several peculiarities: it is the only function among the
>>> check_* functions which calls report() directly, it does two things
>>> (waits for interrupts and checks for misfired interrupts) and it also
>>> mixes printf, report_info and report calls.
>>>
>>> check_acked() also reports a pass and returns as soon as all the target CPUs
>>> have received interrupts. However, a CPU not having received an interrupt
>>> *now* does not guarantee not receiving an eroneous interrupt if we wait
>> erroneous
>>> long enough.
>>>
>>> Rework the function by splitting it into two separate functions, each with
>>> a single responsibility: wait_for_interrupts(), which waits for the
>>> expected interrupts to fire, and check_acked() which checks that interrupts
>>> have been received as expected.
>>>
>>> wait_for_interrupts() also waits an extra 100 milliseconds after the
>>> expected interrupts have been received in an effort to make sure we don't
>>> miss misfiring interrupts.
>>>
>>> Splitting check_acked() into two functions will also allow us to
>>> customize the behavior of each function in the future more easily
>>> without using an unnecessarily long list of arguments for check_acked().
>>>
>>> CC: Andre Przywara <andre.przywara@arm.com>
>>> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
>>> ---
>>>  arm/gic.c | 73 +++++++++++++++++++++++++++++++++++--------------------
>>>  1 file changed, 47 insertions(+), 26 deletions(-)
>>>
>>> diff --git a/arm/gic.c b/arm/gic.c
>>> index 544c283f5f47..dcdab7d5f39a 100644
>>> --- a/arm/gic.c
>>> +++ b/arm/gic.c
>>> @@ -62,41 +62,42 @@ static void stats_reset(void)
>>>  	}
>>>  }
>>>  
>>> -static void check_acked(const char *testname, cpumask_t *mask)
>>> +static void wait_for_interrupts(cpumask_t *mask)
>>>  {
>>> -	int missing = 0, extra = 0, unexpected = 0;
>>>  	int nr_pass, cpu, i;
>>> -	bool bad = false;
>>>  
>>>  	/* Wait up to 5s for all interrupts to be delivered */
>>> -	for (i = 0; i < 50; ++i) {
>>> +	for (i = 0; i < 50; i++) {
>>>  		mdelay(100);
>>>  		nr_pass = 0;
>>>  		for_each_present_cpu(cpu) {
>>> +			/*
>>> +			 * A CPU having receied more than one interrupts will
>> received
>>> +			 * show up in check_acked(), and no matter how long we
>>> +			 * wait it cannot un-receive it. Consier at least one
>> consider
> 
> Will fix all three typos, thanks.
> 
>>> +			 * interrupt as a pass.
>>> +			 */
>>>  			nr_pass += cpumask_test_cpu(cpu, mask) ?
>>> -				acked[cpu] == 1 : acked[cpu] == 0;
>>> -			smp_rmb(); /* pairs with smp_wmb in ipi_handler */
>>> -
>>> -			if (bad_sender[cpu] != -1) {
>>> -				printf("cpu%d received IPI from wrong sender %d\n",
>>> -					cpu, bad_sender[cpu]);
>>> -				bad = true;
>>> -			}
>>> -
>>> -			if (bad_irq[cpu] != -1) {
>>> -				printf("cpu%d received wrong irq %d\n",
>>> -					cpu, bad_irq[cpu]);
>>> -				bad = true;
>>> -			}
>>> +				acked[cpu] >= 1 : acked[cpu] == 0;
>>>  		}
>>> +
>>>  		if (nr_pass == nr_cpus) {
>>> -			report(!bad, "%s", testname);
>>>  			if (i)
>>> -				report_info("took more than %d ms", i * 100);
>>> +				report_info("interrupts took more than %d ms", i * 100);
>>> +			mdelay(100);
>>>  			return;
>>>  		}
>>>  	}
>>>  
>>> +	report_info("interrupts timed-out (5s)");
>>> +}
>>> +
>>> +static bool check_acked(cpumask_t *mask)
>>> +{
>>> +	int missing = 0, extra = 0, unexpected = 0;
>>> +	bool pass = true;
>>> +	int cpu;
>>> +
>>>  	for_each_present_cpu(cpu) {
>>>  		if (cpumask_test_cpu(cpu, mask)) {
>>>  			if (!acked[cpu])
>>> @@ -107,11 +108,28 @@ static void check_acked(const char *testname, cpumask_t *mask)
>>>  			if (acked[cpu])
>>>  				++unexpected;
>>>  		}
>>> +		smp_rmb(); /* pairs with smp_wmb in ipi_handler */
>>> +
>>> +		if (bad_sender[cpu] != -1) {
>>> +			report_info("cpu%d received IPI from wrong sender %d",
>>> +					cpu, bad_sender[cpu]);
>>> +			pass = false;
>>> +		}
>>> +
>>> +		if (bad_irq[cpu] != -1) {
>>> +			report_info("cpu%d received wrong irq %d",
>>> +					cpu, bad_irq[cpu]);
>>> +			pass = false;
>>> +		}
>>> +	}
>>> +
>>> +	if (missing || extra || unexpected) {
>>> +		report_info("ACKS: missing=%d extra=%d unexpected=%d",
>>> +				missing, extra, unexpected);
>>> +		pass = false;
>>>  	}
>>>  
>>> -	report(false, "%s", testname);
>>> -	report_info("Timed-out (5s). ACKS: missing=%d extra=%d unexpected=%d",
>>> -		    missing, extra, unexpected);
>>> +	return pass;
>>>  }
>>>  
>>>  static void check_spurious(void)
>>> @@ -300,7 +318,8 @@ static void ipi_test_self(void)
>>>  	cpumask_clear(&mask);
>>>  	cpumask_set_cpu(smp_processor_id(), &mask);
>>>  	gic->ipi.send_self();
>>> -	check_acked("IPI: self", &mask);
>>> +	wait_for_interrupts(&mask);
>>> +	report(check_acked(&mask), "Interrupts received");
>>>  	report_prefix_pop();
>>>  }
>>>  
>>> @@ -315,7 +334,8 @@ static void ipi_test_smp(void)
>>>  	for (i = smp_processor_id() & 1; i < nr_cpus; i += 2)
>>>  		cpumask_clear_cpu(i, &mask);
>>>  	gic_ipi_send_mask(IPI_IRQ, &mask);
>>> -	check_acked("IPI: directed", &mask);
>>> +	wait_for_interrupts(&mask);
>>> +	report(check_acked(&mask), "Interrupts received");
>> both ipi_test_smp and ipi_test_self are called from the same test so
>> better to use different error messages like it was done originally.
> 
> I used the same error message because the tests have a different prefix
> ("target-list" versus "broadcast"). Do you think there are cases where that's not
> enough?
I think in "ipi" test,
ipi_test -> ipi_send -> ipi_test_self, ipi_test_smp

Thanks

Eric
> 
> Thanks,
> Alex
> 
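
A note for readers following the smp_rmb() line in the check_acked() hunk
quoted above: its comment says it pairs with an smp_wmb() in ipi_handler(),
which is not shown in this message. The sketch below illustrates that kind of
publish/observe pairing in portable C11, under stated assumptions. It is not
the kvm-unit-tests code: handler_record() and cpu_acked_cleanly() are
hypothetical names, NR_CPUS is an arbitrary value, the shape of the writer
side is assumed, and atomic_thread_fence() is used as a stronger stand-in for
smp_wmb()/smp_rmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 4	/* arbitrary CPU count for this sketch */

static int bad_sender[NR_CPUS];		/* payload, written with plain stores */
static int bad_irq[NR_CPUS];		/* payload, written with plain stores */
static atomic_int acked[NR_CPUS];	/* flag, published after the payload */

static void stats_reset(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		bad_sender[cpu] = -1;
		bad_irq[cpu] = -1;
		atomic_store_explicit(&acked[cpu], 0, memory_order_relaxed);
	}
}

/* Writer side (assumed shape of an interrupt handler, hypothetical name). */
static void handler_record(int cpu, int sender, int irq,
			   int want_sender, int want_irq)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	if (sender != want_sender)
		bad_sender[cpu] = sender;
	if (irq != want_irq)
		bad_irq[cpu] = irq;

	/* Plays the role of smp_wmb(): payload stores come before the flag. */
	atomic_thread_fence(memory_order_release);
	atomic_fetch_add_explicit(&acked[cpu], 1, memory_order_relaxed);
}

/*
 * Reader side (hypothetical name): true only if the CPU has acked at least
 * one interrupt and no bad sender or bad irq was recorded for it.
 */
static bool cpu_acked_cleanly(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	if (atomic_load_explicit(&acked[cpu], memory_order_relaxed) == 0)
		return false;	/* nothing published yet */

	/* Plays the role of smp_rmb(): flag load comes before payload loads. */
	atomic_thread_fence(memory_order_acquire);
	return bad_sender[cpu] == -1 && bad_irq[cpu] == -1;
}
```

The point is the order on each side: the writer stores bad_sender/bad_irq
before it bumps acked, and the reader loads acked before it looks at
bad_sender/bad_irq, so a reader that observes the increment also observes the
values recorded for that interrupt.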


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [kvm-unit-tests PATCH 08/10] arm/arm64: gic: Split check_acked() into two functions
  2020-12-15 13:58         ` Auger Eric
@ 2020-12-16 11:40           ` Alexandru Elisei
  -1 siblings, 0 replies; 78+ messages in thread
From: Alexandru Elisei @ 2020-12-16 11:40 UTC (permalink / raw)
  To: Auger Eric, kvm, kvmarm, drjones; +Cc: andre.przywara

Hi Eric,

On 12/15/20 1:58 PM, Auger Eric wrote:
> Hi Alexandru,
>
> On 12/10/20 3:45 PM, Alexandru Elisei wrote:
>> Hi Eric,
>>
>> On 12/3/20 1:39 PM, Auger Eric wrote:
>>> [..]
>>>  
>>>  static void check_spurious(void)
>>> @@ -300,7 +318,8 @@ static void ipi_test_self(void)
>>>  	cpumask_clear(&mask);
>>>  	cpumask_set_cpu(smp_processor_id(), &mask);
>>>  	gic->ipi.send_self();
>>> -	check_acked("IPI: self", &mask);
>>> +	wait_for_interrupts(&mask);
>>> +	report(check_acked(&mask), "Interrupts received");
>>>  	report_prefix_pop();
>>>  }
>>>  
>>> @@ -315,7 +334,8 @@ static void ipi_test_smp(void)
>>>  	for (i = smp_processor_id() & 1; i < nr_cpus; i += 2)
>>>  		cpumask_clear_cpu(i, &mask);
>>>  	gic_ipi_send_mask(IPI_IRQ, &mask);
>>> -	check_acked("IPI: directed", &mask);
>>> +	wait_for_interrupts(&mask);
>>> +	report(check_acked(&mask), "Interrupts received");
>>> both ipi_test_smp and ipi_test_self are called from the same test so
>>> better to use different error messages like it was done originally.
>> I used the same error message because the tests have a different prefix
>> ("target-list" versus "broadcast"). Do you think there are cases where that's not
>> enough?
> I think in "ipi" test,
> ipi_test -> ipi_send -> ipi_test_self, ipi_test_smp

I'm afraid I don't understand what you are trying to say. This is the log for the
gicv3-ipi test:

$ cat logs/gicv3-ipi.log
timeout -k 1s --foreground 90s /usr/bin/qemu-system-aarch64 -nodefaults -machine
virt,gic-version=host,accel=kvm -cpu host -device virtio-serial-device -device
virtconsole,chardev=ctd -chardev testdev,id=ctd -device pci-testdev -display none
-serial stdio -kernel arm/gic.flat -smp 6 -machine gic-version=3 -append ipi #
-initrd /tmp/tmp.trk6aAcaZx
WARNING: early print support may not work. Found uart at 0x9000000, but early base
is 0x3f8.
PASS: gicv3: ipi: self: Interrupts received
PASS: gicv3: ipi: target-list: Interrupts received
PASS: gicv3: ipi: broadcast: Interrupts received
SUMMARY: 3 tests

The warning is because I forgot to reconfigure the tests with --vmm=qemu.

Would you mind pointing out what you think is ambiguous?

Thanks,

Alex
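
For reference, the wait-then-check calling pattern quoted above
(wait_for_interrupts() followed by report(check_acked())) can be modelled in a
few lines of standalone C. This is only a sketch of the idea described in the
commit message of this patch (poll for up to 5s in 100ms steps, then wait one
more 100ms step before checking), not the code from arm/gic.c. The fake clock,
deliver_at_ms[], sim_reset(), the plain bitmask argument and NR_CPUS are all
assumptions made so that the example runs without a GIC. The exact conditions
for "missing" and "extra" are assumed as well, since that part of the hunk is
not visible in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 4	/* arbitrary CPU count for this sketch */

static int acked[NR_CPUS];		/* interrupts seen per CPU */
static int elapsed_ms;			/* simulated clock */
static int deliver_at_ms[NR_CPUS];	/* simulated arrival time, -1 = never */

/* Hypothetical helper: put the simulation back into its initial state. */
static void sim_reset(void)
{
	elapsed_ms = 0;
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		acked[cpu] = 0;
		deliver_at_ms[cpu] = -1;
	}
}

/* Stand-in for mdelay(): advance the fake clock, deliver due interrupts. */
static void fake_mdelay(int ms)
{
	assert(ms > 0);
	elapsed_ms += ms;
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (deliver_at_ms[cpu] >= 0 && deliver_at_ms[cpu] <= elapsed_ms) {
			acked[cpu]++;
			deliver_at_ms[cpu] = -1;
		}
	}
}

/* Phase 1: only wait. Bit 'cpu' set in mask means that CPU is a target. */
static void wait_for_interrupts(unsigned int mask)
{
	int nr_pass, cpu, i;

	for (i = 0; i < 50; i++) {		/* up to 5s in 100ms steps */
		fake_mdelay(100);
		nr_pass = 0;
		for (cpu = 0; cpu < NR_CPUS; cpu++)
			nr_pass += (mask & (1u << cpu)) ?
				acked[cpu] >= 1 : acked[cpu] == 0;
		if (nr_pass == NR_CPUS) {
			fake_mdelay(100);	/* grace period for strays */
			return;
		}
	}
	printf("INFO: interrupts timed-out (5s)\n");
}

/* Phase 2: only check. True when every CPU saw exactly what was expected. */
static bool check_acked(unsigned int mask)
{
	int missing = 0, extra = 0, unexpected = 0, cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (mask & (1u << cpu)) {
			if (!acked[cpu])
				missing++;
			else if (acked[cpu] > 1)
				extra++;
		} else if (acked[cpu]) {
			unexpected++;
		}
	}
	if (missing || extra || unexpected) {
		printf("INFO: ACKS: missing=%d extra=%d unexpected=%d\n",
		       missing, extra, unexpected);
		return false;
	}
	return true;
}
```

With sim_reset() followed by deliver_at_ms[0] = 100 and deliver_at_ms[1] = 200,
wait_for_interrupts(0x1) returns after the grace period and check_acked(0x1)
then fails, because CPU 1 received an interrupt it was not supposed to get.
Without the extra delay that stray interrupt would go unnoticed.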


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [kvm-unit-tests PATCH 08/10] arm/arm64: gic: Split check_acked() into two functions
  2020-12-16 11:40           ` Alexandru Elisei
@ 2020-12-16 12:37             ` Auger Eric
  -1 siblings, 0 replies; 78+ messages in thread
From: Auger Eric @ 2020-12-16 12:37 UTC (permalink / raw)
  To: Alexandru Elisei, kvm, kvmarm, drjones; +Cc: andre.przywara

Hi Alexandru,

On 12/16/20 12:40 PM, Alexandru Elisei wrote:
> Hi Eric,
> 
> On 12/15/20 1:58 PM, Auger Eric wrote:
>> Hi Alexandru,
>>
>> On 12/10/20 3:45 PM, Alexandru Elisei wrote:
>>> Hi Eric,
>>>
>>> On 12/3/20 1:39 PM, Auger Eric wrote:
>>>> [..]
>>>>  
>>>>  static void check_spurious(void)
>>>> @@ -300,7 +318,8 @@ static void ipi_test_self(void)
>>>>  	cpumask_clear(&mask);
>>>>  	cpumask_set_cpu(smp_processor_id(), &mask);
>>>>  	gic->ipi.send_self();
>>>> -	check_acked("IPI: self", &mask);
>>>> +	wait_for_interrupts(&mask);
>>>> +	report(check_acked(&mask), "Interrupts received");
>>>>  	report_prefix_pop();
>>>>  }
>>>>  
>>>> @@ -315,7 +334,8 @@ static void ipi_test_smp(void)
>>>>  	for (i = smp_processor_id() & 1; i < nr_cpus; i += 2)
>>>>  		cpumask_clear_cpu(i, &mask);
>>>>  	gic_ipi_send_mask(IPI_IRQ, &mask);
>>>> -	check_acked("IPI: directed", &mask);
>>>> +	wait_for_interrupts(&mask);
>>>> +	report(check_acked(&mask), "Interrupts received");
>>>> both ipi_test_smp and ipi_test_self are called from the same test so
>>>> better to use different error messages like it was done originally.
>>> I used the same error message because the tests have a different prefix
>>> ("target-list" versus "broadcast"). Do you think there are cases where that's not
>>> enough?
>> I think in "ipi" test,
>> ipi_test -> ipi_send -> ipi_test_self, ipi_test_smp
> 
> I'm afraid I don't understand what you are trying to say. This is the log for the
> gicv3-ipi test:
> 
> $ cat logs/gicv3-ipi.log
> timeout -k 1s --foreground 90s /usr/bin/qemu-system-aarch64 -nodefaults -machine
> virt,gic-version=host,accel=kvm -cpu host -device virtio-serial-device -device
> virtconsole,chardev=ctd -chardev testdev,id=ctd -device pci-testdev -display none
> -serial stdio -kernel arm/gic.flat -smp 6 -machine gic-version=3 -append ipi #
> -initrd /tmp/tmp.trk6aAcaZx
> WARNING: early print support may not work. Found uart at 0x9000000, but early base
> is 0x3f8.
> PASS: gicv3: ipi: self: Interrupts received
> PASS: gicv3: ipi: target-list: Interrupts received
> PASS: gicv3: ipi: broadcast: Interrupts received
> SUMMARY: 3 tests
> 
> The warning is because I forgot to reconfigure the tests with --vmm=qemu.
> 
> Would you mind pointing out what you think is ambiguous?
Hmm, sorry, I did not pay attention to the report_prefix_push() calls within
ipi_test_self() and ipi_test_smp(). I had in mind that those were only in main().

Sorry for the noise.

Eric
> 
> Thanks,
> 
> Alex
> 
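
For readers of the archive, the effect discussed above (the same message
string ending up in distinct log lines because each test function pushes its
own prefix) can be illustrated with a small standalone model. This is a
sketch, not the kvm-unit-tests report library: prefix_push(), prefix_pop() and
report_line() are hypothetical stand-ins for report_prefix_push(),
report_prefix_pop() and report(), and MAX_DEPTH and the buffer size are
arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MAX_DEPTH 8	/* arbitrary limit for this sketch */

static const char *prefix_stack[MAX_DEPTH];
static int prefix_depth;

/* Hypothetical stand-in for report_prefix_push(). */
static void prefix_push(const char *name)
{
	assert(name && strlen(name) > 0);
	assert(prefix_depth < MAX_DEPTH);
	prefix_stack[prefix_depth++] = name;
}

/* Hypothetical stand-in for report_prefix_pop(). */
static void prefix_pop(void)
{
	assert(prefix_depth > 0);
	prefix_depth--;
}

/*
 * Hypothetical stand-in for report(): returns the formatted line instead
 * of printing it. The buffer is static, so each call overwrites the
 * previous result.
 */
static const char *report_line(bool pass, const char *msg)
{
	static char line[256];
	size_t off;
	int i;

	off = (size_t)snprintf(line, sizeof(line), "%s: ",
			       pass ? "PASS" : "FAIL");
	for (i = 0; i < prefix_depth && off < sizeof(line); i++)
		off += (size_t)snprintf(line + off, sizeof(line) - off,
					"%s: ", prefix_stack[i]);
	if (off < sizeof(line))
		snprintf(line + off, sizeof(line) - off, "%s", msg);
	return line;
}
```

Pushing "gicv3" and "ipi", then "self", and formatting a passing "Interrupts
received" gives a line of the same shape as the first PASS line in the log
quoted above. Popping "self" and pushing "target-list" gives a different line
for the same message string, which is why identical report() messages in
ipi_test_self() and ipi_test_smp() remain distinguishable.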


^ permalink raw reply	[flat|nested] 78+ messages in thread

end of thread, other threads:[~2020-12-16 12:39 UTC | newest]

Thread overview: 78+ messages
2020-11-25 15:51 [kvm-unit-tests PATCH 00/10] GIC fixes and improvements Alexandru Elisei
2020-11-25 15:51 ` [kvm-unit-tests PATCH 01/10] lib: arm/arm64: gicv3: Add missing barrier when sending IPIs Alexandru Elisei
2020-12-01 16:37   ` Auger Eric
2020-12-01 17:37     ` Alexandru Elisei
2020-11-25 15:51 ` [kvm-unit-tests PATCH 02/10] lib: arm/arm64: gicv2: " Alexandru Elisei
2020-12-01 16:37   ` Auger Eric
2020-11-25 15:51 ` [kvm-unit-tests PATCH 03/10] arm/arm64: gic: Remove memory synchronization from ipi_clear_active_handler() Alexandru Elisei
2020-12-01 16:37   ` Auger Eric
2020-12-02 14:02     ` Alexandru Elisei
2020-12-02 14:14       ` Alexandru Elisei
2020-12-03  9:41         ` Auger Eric
2020-11-25 15:51 ` [kvm-unit-tests PATCH 04/10] arm/arm64: gic: Remove unnecessary synchronization with stats_reset() Alexandru Elisei
2020-12-01 16:48   ` Auger Eric
2020-12-02 14:06     ` Alexandru Elisei
2020-12-03 13:10   ` Auger Eric
2020-11-25 15:51 ` [kvm-unit-tests PATCH 05/10] arm/arm64: gic: Use correct memory ordering for the IPI test Alexandru Elisei
2020-12-03 13:10   ` Auger Eric
2020-12-03 13:21     ` Alexandru Elisei
2020-11-25 15:51 ` [kvm-unit-tests PATCH 06/10] arm/arm64: gic: Check spurious and bad_sender in the active test Alexandru Elisei
2020-12-03 13:10   ` Auger Eric
2020-11-25 15:51 ` [kvm-unit-tests PATCH 07/10] arm/arm64: gic: Wait for writes to acked or spurious to complete Alexandru Elisei
2020-12-03 13:21   ` Auger Eric
2020-11-25 15:51 ` [kvm-unit-tests PATCH 08/10] arm/arm64: gic: Split check_acked() into two functions Alexandru Elisei
2020-12-03 13:39   ` Auger Eric
2020-12-10 14:45     ` Alexandru Elisei
2020-12-15 13:58       ` Auger Eric
2020-12-16 11:40         ` Alexandru Elisei
2020-12-16 12:37           ` Auger Eric
2020-11-25 15:51 ` [kvm-unit-tests PATCH 09/10] arm/arm64: gic: Make check_acked() more generic Alexandru Elisei
2020-12-03 14:59   ` Auger Eric
2020-11-25 15:51 ` [kvm-unit-tests PATCH 10/10] arm64: gic: Use IPI test checking for the LPI tests Alexandru Elisei
2020-11-26  9:30   ` Zenghui Yu
2020-11-27 14:50     ` Alexandru Elisei
2020-11-30 13:59       ` Zenghui Yu
2020-11-30 14:19         ` Alexandru Elisei
2020-12-01 15:09           ` Alexandru Elisei
2020-11-30 17:48     ` Auger Eric
2020-12-03 14:59   ` Auger Eric
2020-12-09 10:29     ` Alexandru Elisei
