From: Keir Fraser <keirf@google.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Gavin Shan <gshan@redhat.com>, Will Deacon <will@kernel.org>,
	virtualization@lists.linux.dev, linux-kernel@vger.kernel.org,
	jasowang@redhat.com, xuanzhuo@linux.alibaba.com, yihyu@redhat.com,
	shan.gavin@gmail.com, linux-arm-kernel@lists.infradead.org,
	Catalin Marinas <catalin.marinas@arm.com>, mochs@nvidia.com
Subject: Re: [PATCH] virtio_ring: Fix the stale index in available ring
Date: Tue, 26 Mar 2024 09:38:55 +0000
Message-ID: <ZgKXr8IQ51xBECuP@google.com>
In-Reply-To: <20240326033809-mutt-send-email-mst@kernel.org>

On Tue, Mar 26, 2024 at 03:49:02AM -0400, Michael S. Tsirkin wrote:
> On Mon, Mar 25, 2024 at 05:34:29PM +1000, Gavin Shan wrote:
> > On 3/20/24 17:14, Michael S. Tsirkin wrote:
> > > On Wed, Mar 20, 2024 at 03:24:16PM +1000, Gavin Shan wrote:
> > > > On 3/20/24 10:49, Michael S. Tsirkin wrote:
> > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > index 6f7e5010a673..79456706d0bd 100644
> > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > @@ -685,7 +685,8 @@ static inline int virtqueue_add_split(struct virtqueue *_vq,
> > > > > 	/* Put entry in available array (but don't update avail->idx until they
> > > > > 	 * do sync). */
> > > > > 	avail = vq->split.avail_idx_shadow & (vq->split.vring.num - 1);
> > > > > -	vq->split.vring.avail->ring[avail] = cpu_to_virtio16(_vq->vdev, head);
> > > > > +	u16 headwithflag = head | (vq->split.avail_idx_shadow & ~(vq->split.vring.num - 1));
> > > > > +	vq->split.vring.avail->ring[avail] = cpu_to_virtio16(_vq->vdev, headwithflag);
> > > > > 	/* Descriptors and available array need to be set before we expose the
> > > > > 	 * new available array entries. */
> >
> > Ok, Michael. I continued with my debugging code. It still looks like a
> > hardware bug on NVIDIA's Grace Hopper. I really think NVIDIA needs to
> > be involved in the discussion, as you suggested.
>
> Do you have a support contact at NVIDIA to report this?
>
> > Firstly, I bind the vhost process and the vCPU thread to CPU#71 and
> > CPU#70. Note that I have only one vCPU in my configuration.
>
> Interesting, but is the guest built with CONFIG_SMP set?

arm64 is always built with CONFIG_SMP.

> > Secondly, the debugging code is enhanced so that the available head
> > for (last_avail_idx - 1) is read twice and recorded. That is, the
> > available head for one specific available index is read twice. I do
> > see different available heads returned by the consecutive reads.
> > More details are shared below.
> >
> > From the guest side
> > ===================
> >
> > virtio_net virtio0: output.0:id 86 is not a head!
> > head to be released: 047 062 112
> >
> > avail_idx:
> > 000  49665
> > 001  49666 <--
> >  :
> > 015  49664
>
> What are these numbers, 49665 and so on?
> And how large is the ring?
> I am guessing 49664 is the index, the ring size is 16, and
> 49664 % 16 == 0.

More than that, 49664 % 256 == 0.

So again there seems to be an error in the vicinity of the roll-over of
the idx low byte, as I observed in the earlier log. Surely this is more
than coincidence?

 -- Keir

> > avail_head:
>
> Is this the avail ring contents?
>
> > 000  062
> > 001  047 <--
> >  :
> > 015  112
>
> What are these arrows pointing at, btw?
>
> > From the host side
> > ==================
> >
> > avail_idx
> > 000  49663
> > 001  49666 <---
> >  :
> >
> > avail_head
> > 000  062 (062)
> > 001  047 (047) <---
> >  :
> > 015  086 (112)   // head 086 is returned by the first read,
> >                  // but head 112 is returned by the second read
> >
> > vhost_get_vq_desc: Inconsistent head in two read (86 -> 112) for avail_idx 49664
> >
> > Thanks,
> > Gavin
>
> OK, thanks. So this proves it is actually the avail ring value.
>
> --
> MST