From: Gavin Shan <gshan@redhat.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: virtualization@lists.linux.dev, linux-kernel@vger.kernel.org,
jasowang@redhat.com, xuanzhuo@linux.alibaba.com,
yihyu@redhat.com, shan.gavin@gmail.com,
Will Deacon <will@kernel.org>,
Catalin Marinas <catalin.marinas@arm.com>,
linux-arm-kernel@lists.infradead.org, mochs@nvidia.com
Subject: Re: [PATCH] virtio_ring: Fix the stale index in available ring
Date: Fri, 15 Mar 2024 21:24:36 +1000
Message-ID: <66e12633-b2d6-4b9a-9103-bb79770fcafa@redhat.com>
In-Reply-To: <20240315065318-mutt-send-email-mst@kernel.org>
On 3/15/24 21:05, Michael S. Tsirkin wrote:
> On Fri, Mar 15, 2024 at 08:45:10PM +1000, Gavin Shan wrote:
>> Yes, I guess smp_wmb() ('dmb') is buggy on NVidia's grace-hopper platform. I tried
>> to reproduce it with my own driver where one thread writes to the shared buffer
>> and another thread reads from the buffer. I don't hit the out-of-order issue so
>> far.
>
> Make sure the 2 areas you are accessing are in different cache lines.
>
Yes, I have already put those two areas in separate cache lines.
>
>> My driver may be not correct somewhere and I will update if I can reproduce
>> the issue with my driver in the future.
>
> Then maybe your change is just making virtio slower and masks the bug
> that is actually elsewhere?
>
> You don't really need a driver. Here's a simple test: without barriers
> assertion will fail. With barriers it will not.
> (Warning: didn't bother testing too much, could be buggy.)
>
> ---
>
> #include <pthread.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <assert.h>
>
> #define FIRST values[0]
> #define SECOND values[64]
>
> volatile int values[100] = {};
>
> void* writer_thread(void* arg) {
> while (1) {
> FIRST++;
> // NEED smp_wmb here
__asm__ volatile("dmb ishst" : : : "memory");
> SECOND++;
> }
> }
>
> void* reader_thread(void* arg) {
> while (1) {
> int first = FIRST;
> // NEED smp_rmb here
__asm__ volatile("dmb ishld" : : : "memory");
> int second = SECOND;
> assert(first - second == 1 || first - second == 0);
> }
> }
>
> int main() {
> pthread_t writer, reader;
>
> pthread_create(&writer, NULL, writer_thread, NULL);
> pthread_create(&reader, NULL, reader_thread, NULL);
>
> pthread_join(writer, NULL);
> pthread_join(reader, NULL);
>
> return 0;
> }
>
I ran a quick test on NVidia's grace-hopper and Ampere's CPUs and hit
the assert on both of them. After replacing 'dmb' with 'dsb', I still
hit the assert on both of them. I need to look at the code more closely.
[root@virt-mtcollins-02 test]# ./a
a: a.c:26: reader_thread: Assertion `first - second == 1 || first - second == 0' failed.
Aborted (core dumped)
[root@nvidia-grace-hopper-05 test]# ./a
a: a.c:26: reader_thread: Assertion `first - second == 1 || first - second == 0' failed.
Aborted (core dumped)
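One thing that may explain the failure even with 'dsb': the reader in the
quoted test loads FIRST before SECOND, so the writer can advance both
counters many times between the two loads and 'first - second' can go
negative no matter which barrier is used. Below is a minimal sketch of a
variant that reads in the opposite order and only checks 'first >= second'.
It is an illustration under assumptions, not the test from this thread:
C11 fences are used as portable stand-ins for smp_wmb()/smp_rmb() (they are
not the same instructions as 'dmb ishst'/'dmb ishld'), and
run_ordering_test() and 'rounds' are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Same layout as the quoted test: indices 0 and 64 are 256 bytes apart. */
static _Atomic unsigned int values[100];
#define FIRST  values[0]
#define SECOND values[64]

/* Number of writer iterations; set before the threads are created. */
static unsigned int rounds;

static void *writer_thread(void *arg)
{
    (void)arg;
    for (unsigned int i = 1; i <= rounds; i++) {
        atomic_store_explicit(&FIRST, i, memory_order_relaxed);
        /* Assumed stand-in for smp_wmb(): FIRST store ordered before SECOND store. */
        atomic_thread_fence(memory_order_release);
        atomic_store_explicit(&SECOND, i, memory_order_relaxed);
    }
    return NULL;
}

static void *reader_thread(void *arg)
{
    unsigned long *violations = arg;
    unsigned int first, second;

    do {
        /* Load in the opposite order to the writer: SECOND, then FIRST. */
        second = atomic_load_explicit(&SECOND, memory_order_relaxed);
        /* Assumed stand-in for smp_rmb(). */
        atomic_thread_fence(memory_order_acquire);
        first = atomic_load_explicit(&FIRST, memory_order_relaxed);
        /* With both fences in place, FIRST must never appear behind SECOND. */
        if (first < second)
            (*violations)++;
    } while (second < rounds);
    return NULL;
}

/* Hypothetical helper: runs n writer iterations, returns violations seen. */
unsigned long run_ordering_test(unsigned int n)
{
    pthread_t writer, reader;
    unsigned long violations = 0;
    int rc;

    rounds = n;
    atomic_store(&FIRST, 0);
    atomic_store(&SECOND, 0);

    rc = pthread_create(&reader, NULL, reader_thread, &violations);
    assert(rc == 0);
    rc = pthread_create(&writer, NULL, writer_thread, NULL);
    assert(rc == 0);

    pthread_join(writer, NULL);
    pthread_join(reader, NULL);
    return violations;
}
```

Calling run_ordering_test(100000) from main() and checking that it returns 0
would be the equivalent of the assert in the quoted test. The run is bounded
so that it terminates; the reader stops once it has seen the final value of
SECOND.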
Thanks,
Gavin