From: "Michael S. Tsirkin" <mst@redhat.com>
To: Gavin Shan <gshan@redhat.com>
Cc: virtualization@lists.linux.dev, linux-kernel@vger.kernel.org,
	jasowang@redhat.com, xuanzhuo@linux.alibaba.com,
	yihyu@redhat.com, shan.gavin@gmail.com,
	Will Deacon <will@kernel.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	linux-arm-kernel@lists.infradead.org, mochs@nvidia.com
Subject: Re: [PATCH] virtio_ring: Fix the stale index in available ring
Date: Sun, 17 Mar 2024 12:50:39 -0400
Message-ID: <20240317124214-mutt-send-email-mst@kernel.org>
In-Reply-To: <66e12633-b2d6-4b9a-9103-bb79770fcafa@redhat.com>

On Fri, Mar 15, 2024 at 09:24:36PM +1000, Gavin Shan wrote:
> 
> On 3/15/24 21:05, Michael S. Tsirkin wrote:
> > On Fri, Mar 15, 2024 at 08:45:10PM +1000, Gavin Shan wrote:
> > > > Yes, I guess smp_wmb() ('dmb') is buggy on NVidia's grace-hopper platform. I tried
> > > to reproduce it with my own driver where one thread writes to the shared buffer
> > > and another thread reads from the buffer. I don't hit the out-of-order issue so
> > > far.
> > 
> > Make sure the 2 areas you are accessing are in different cache lines.
> > 
> 
> Yes, I already put those 2 areas to separate cache lines.
> 
> > 
> > > My driver may be not correct somewhere and I will update if I can reproduce
> > > the issue with my driver in the future.
> > 
> > Then maybe your change is just making virtio slower and masks the bug
> > that is actually elsewhere?
> > 
> > You don't really need a driver. Here's a simple test: without barriers
> > assertion will fail. With barriers it will not.
> > > (Warning: didn't bother testing too much, could be buggy.)
> > 
> > ---
> > 
> > #include <pthread.h>
> > #include <stdio.h>
> > #include <stdlib.h>
> > #include <assert.h>
> > 
> > #define FIRST values[0]
> > #define SECOND values[64]
> > 
> > volatile int values[100] = {};
> > 
> > void* writer_thread(void* arg) {
> > 	while (1) {
> > 	FIRST++;
> > 	// NEED smp_wmb here
>         __asm__ volatile("dmb ishst" : : : "memory");
> > 	SECOND++;
> > 	}
> > }
> > 
> > void* reader_thread(void* arg) {
> >      while (1) {
> > 	int first = FIRST;
> > 	// NEED smp_rmb here
>         __asm__ volatile("dmb ishld" : : : "memory");
> > 	int second = SECOND;
> > 	assert(first - second == 1 || first - second == 0);
> >      }
> > }
> > 
> > int main() {
> >      pthread_t writer, reader;
> > 
> >      pthread_create(&writer, NULL, writer_thread, NULL);
> >      pthread_create(&reader, NULL, reader_thread, NULL);
> > 
> >      pthread_join(writer, NULL);
> >      pthread_join(reader, NULL);
> > 
> >      return 0;
> > }
> > 
> 
> Had a quick test on NVidia's grace-hopper and Ampere's CPUs. I hit
> the assert on both of them. After replacing 'dmb' with 'dsb', I can
> hit assert on both of them too. I need to look at the code closely.
> 
> [root@virt-mtcollins-02 test]# ./a
> a: a.c:26: reader_thread: Assertion `first - second == 1 || first - second == 0' failed.
> Aborted (core dumped)
> 
> [root@nvidia-grace-hopper-05 test]# ./a
> a: a.c:26: reader_thread: Assertion `first - second == 1 || first - second == 0' failed.
> Aborted (core dumped)
> 
> Thanks,
> Gavin


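Actually, this test is broken. No reordering is needed to trip the assert:
it's a simple race, since the writer can run any number of iterations between
the reader's two loads. The following works on x86, though x86 does not need
barriers anyway.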


#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

/*
 * Change to "#if 1" to emit real fence instructions.
 * With "#if 0" these are compiler barriers only.
 */
#if 0
#define x86_rmb()  asm volatile("lfence":::"memory")
#define x86_mb()  asm volatile("mfence":::"memory")
#define x86_smb()  asm volatile("sfence":::"memory")
#else
#define x86_rmb()  asm volatile("":::"memory")
#define x86_mb()  asm volatile("":::"memory")
#define x86_smb()  asm volatile("":::"memory")
#endif

#define FIRST values[0]
#define SECOND values[640]
#define FLAG values[1280]

volatile unsigned values[2000] = {};

void* writer_thread(void* arg) {
	while (1) {
		/* Wait until the reader has consumed the previous update */
		while (FLAG);
		FIRST++;
		x86_smb();
		SECOND++;
		/* Order both increments before the flag store */
		x86_smb();
		FLAG = 1;
	}
}

void* reader_thread(void* arg) {
	while (1) {
		/* Wait until the writer has published an update */
		while (!FLAG);
		x86_rmb();
		unsigned first = FIRST;
		x86_rmb();
		unsigned second = SECOND;
		assert(first - second == 1 || first - second == 0);
		/* Hand the buffer back to the writer */
		FLAG = 0;

		if (!(first % 1000000))
			printf("%u\n", first);
	}
}

int main() {
    pthread_t writer, reader;

    pthread_create(&writer, NULL, writer_thread, NULL);
    pthread_create(&reader, NULL, reader_thread, NULL);

    pthread_join(writer, NULL);
    pthread_join(reader, NULL);

    return 0;
}

