From: Alexandru Elisei <alexandru.elisei@arm.com>
To: Auger Eric <eric.auger@redhat.com>,
	kvm@vger.kernel.org, kvmarm@lists.cs.columbia.edu,
	drjones@redhat.com
Cc: andre.przywara@arm.com
Subject: Re: [kvm-unit-tests PATCH 05/10] arm/arm64: gic: Use correct memory ordering for the IPI test
Date: Thu, 3 Dec 2020 13:21:42 +0000
Message-ID: <7bb38de0-e06a-085c-98d2-cbce62ba31b3@arm.com>
In-Reply-To: <89585694-e188-9f8c-de71-29f8baa89dd7@redhat.com>

Hi Eric,

On 12/3/20 1:10 PM, Auger Eric wrote:
> Hi Alexandru,
>
> On 11/25/20 4:51 PM, Alexandru Elisei wrote:
>> The IPI test works by sending IPIs to even numbered CPUs from the
>> IPI_SENDER CPU (CPU1), and then checking that the other CPUs received the
>> interrupts as expected. The check is done in check_acked() by the
>> IPI_SENDER CPU with the help of three arrays:
>>
>> - acked, where acked[i] == 1 means that CPU i received the interrupt.
>> - bad_irq, where bad_irq[i] == -1 means that the interrupt received by CPU
>>   i had the expected interrupt number (IPI_IRQ).
>> - bad_sender, where bad_sender[i] == -1 means that the interrupt received
>>   by CPU i was from the expected sender (IPI_SENDER, GICv2 only).
>>
>> The assumption made by check_acked() is that if a CPU acked an interrupt,
>> then bad_sender and bad_irq have also been updated. This is a common
>> inter-thread communication pattern called message passing.  For message
>> passing to work correctly on weakly consistent memory model architectures,
>> like arm and arm64, barriers or address dependencies are required. This is
>> described in ARM DDI 0487F.b, in "Armv7 compatible approaches for ordering,
>> using DMB and DSB barriers" (page K11-7993), in the section with a single
>> observer, which is in our case the IPI_SENDER CPU.
>>
>> The IPI test attempts to enforce the correct ordering using memory
>> barriers, but it's not enough. For example, the program execution below is
>> valid from an architectural point of view:
>>
>> 3 online CPUs, initial state (from stats_reset()):
>>
>> acked[2] = 0;
>> bad_sender[2] = -1;
>> bad_irq[2] = -1;
>>
>> CPU1 (in check_acked())		| CPU2 (in ipi_handler())
>> 				|
>> smp_rmb() // DMB ISHLD		| acked[2]++;
>> read 1 from acked[2]		|
>> nr_pass++ // nr_pass = 3	|
>> read -1 from bad_sender[2]	|
>> read -1 from bad_irq[2]		|
>> 				| // in check_ipi_sender()
>> 				| bad_sender[2] = <bad ipi sender>
>> 				| // in check_irqnr()
>> 				| bad_irq[2] = <bad irq number>
>> 				| smp_wmb() // DMB ISHST
>> nr_pass == nr_cpus, return	|
>>
>> In this scenario, CPU1 will read the updated acked value, but it will read
>> the initial bad_sender and bad_irq values. This is permitted because the
>> memory barriers do not create a data dependency between the value read from
>> acked and the values read from bad_irq and bad_sender on CPU1, respectively
>> between the values written to acked, bad_sender and bad_irq on CPU2.
>>
>> To avoid this situation, let's reorder the barriers and accesses to the
>> arrays to create the needed dependencies that ensure that message passing
>> behaves as expected.
>>
>> In the interrupt handler, the writes to bad_sender and bad_irq are
>> reordered before the write to acked and a smp_wmb() barrier is added. This
>> ensures that if other PEs observe the write to acked, then they will also
>> observe the writes to the other two arrays.
>>
>> In check_acked(), put the smp_rmb() barrier after the read from acked to
>> ensure that the subsequent reads from bad_sender, respectively bad_irq,
>> aren't reordered locally by the PE.
>>
>> With these changes, the expected ordering of accesses is respected and we
>> end up with the pattern described in the Arm ARM and also in the Linux
>> litmus test MP+fencewmbonceonce+fencermbonceonce.litmus from
>> tools/memory-model/litmus-tests. More examples and explanations can be
>> found in the Linux source tree, in Documentation/memory-barriers.txt, in
>> the sections "SMP BARRIER PAIRING" and "READ MEMORY BARRIERS VS LOAD
>> SPECULATION".
>>
>> For consistency with ipi_handler(), the array accesses in
>> ipi_clear_active_handler() have also been reordered. This shouldn't affect
>> the functionality of that test.
>>
>> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
>> ---
>>  arm/gic.c | 9 ++++-----
>>  1 file changed, 4 insertions(+), 5 deletions(-)
>>
>> diff --git a/arm/gic.c b/arm/gic.c
>> index 7befda2a8673..bcb834406d23 100644
>> --- a/arm/gic.c
>> +++ b/arm/gic.c
>> @@ -73,9 +73,9 @@ static void check_acked(const char *testname, cpumask_t *mask)
>>  		mdelay(100);
>>  		nr_pass = 0;
>>  		for_each_present_cpu(cpu) {
>> -			smp_rmb();
>>  			nr_pass += cpumask_test_cpu(cpu, mask) ?
>>  				acked[cpu] == 1 : acked[cpu] == 0;
>> +			smp_rmb(); /* pairs with smp_wmb in ipi_handler */
>>  
>>  			if (bad_sender[cpu] != -1) {
>>  				printf("cpu%d received IPI from wrong sender %d\n",
>> @@ -118,7 +118,6 @@ static void check_spurious(void)
>>  {
>>  	int cpu;
>>  
>> -	smp_rmb();
> This change is not documented in the commit message.

You are right. I think this is a rebasing mistake: the removal should be part of
patch #7 ("arm/arm64: gic: Wait for writes to acked or spurious to complete"),
where I remove the smp_wmb() that follows the update to spurious in ipi_handler().

>>  	for_each_present_cpu(cpu) {
>>  		if (spurious[cpu])
>>  			report_info("WARN: cpu%d got %d spurious interrupts",
>> @@ -156,10 +155,10 @@ static void ipi_handler(struct pt_regs *regs __unused)
>>  		 */
>>  		if (gic_version() == 2)
>>  			smp_rmb();
>> -		++acked[smp_processor_id()];
>>  		check_ipi_sender(irqstat);
>>  		check_irqnr(irqnr);
>> -		smp_wmb(); /* pairs with rmb in check_acked */
>> +		smp_wmb(); /* pairs with smp_rmb in check_acked */
>> +		++acked[smp_processor_id()];
>>  	} else {
>>  		++spurious[smp_processor_id()];
>>  		smp_wmb();
> I guess this one was paired with check_spurious one?
>> @@ -383,8 +382,8 @@ static void ipi_clear_active_handler(struct pt_regs *regs __unused)
>>  
>>  		writel(val, base + GICD_ICACTIVER);
>>  
>> -		++acked[smp_processor_id()];
>>  		check_irqnr(irqnr);
>> +		++acked[smp_processor_id()];
> This change is not really needed, is it?

It's not needed, yes. As the commit message explains, it's there only for
consistency with ipi_handler().
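
For reference, the message-passing pattern that the commit message describes
can be sketched in user space roughly as below. This is only an illustration,
not the kvm-unit-tests code: it assumes C11 atomics and pthreads, it uses
atomic_thread_fence() with release/acquire ordering as a stand-in for
smp_wmb()/smp_rmb() (these fences are somewhat stronger than the real
barriers), and the names handler_thread(), payload_visible_after_ack(),
count_stale_reads(), SENDER_VALUE and IRQ_VALUE are made up for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical single-CPU stand-ins for the arrays in arm/gic.c. */
static atomic_int acked;
static atomic_int bad_sender;
static atomic_int bad_irq;

/* Arbitrary payload values, chosen only for this sketch. */
#define SENDER_VALUE 7
#define IRQ_VALUE 9

/* Writer side, modelled on the reordered ipi_handler(). */
static void *handler_thread(void *arg)
{
	(void)arg;
	/* Write the payload first ... */
	atomic_store_explicit(&bad_sender, SENDER_VALUE, memory_order_relaxed);
	atomic_store_explicit(&bad_irq, IRQ_VALUE, memory_order_relaxed);
	/* ... then the fence (assumed stand-in for smp_wmb()) ... */
	atomic_thread_fence(memory_order_release);
	/* ... and only then publish the flag. */
	atomic_store_explicit(&acked, 1, memory_order_relaxed);
	return NULL;
}

/*
 * Reader side, modelled on check_acked(). Returns 1 if both payload
 * values were visible once the flag was observed, 0 otherwise.
 */
static int payload_visible_after_ack(void)
{
	pthread_t thread;
	int rc, sender, irq;

	/* Initial state, as stats_reset() would leave it. */
	atomic_store(&acked, 0);
	atomic_store(&bad_sender, -1);
	atomic_store(&bad_irq, -1);

	rc = pthread_create(&thread, NULL, handler_thread, NULL);
	assert(rc == 0);
	(void)rc;

	/* Read the flag first ... */
	while (atomic_load_explicit(&acked, memory_order_relaxed) != 1)
		; /* spin until the writer publishes the flag */
	/* ... then the fence (assumed stand-in for smp_rmb()) ... */
	atomic_thread_fence(memory_order_acquire);
	/* ... and only then read the payload. */
	sender = atomic_load_explicit(&bad_sender, memory_order_relaxed);
	irq = atomic_load_explicit(&bad_irq, memory_order_relaxed);

	pthread_join(thread, NULL);
	return sender == SENDER_VALUE && irq == IRQ_VALUE;
}

/* Repeat the exchange and count rounds where stale payload was read. */
int count_stale_reads(int rounds)
{
	int stale = 0;

	for (int i = 0; i < rounds; i++)
		if (!payload_visible_after_ack())
			stale++;
	return stale;
}
```

With the fences placed as above, count_stale_reads() should return 0 for any
number of rounds. Moving the flag write before the payload writes, or the
acquire fence before the flag read, gives back the ordering from the original
code, where a stale read is architecturally allowed (though it may be hard to
observe in practice).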

Thanks,
Alex
