* Re: [Patch net] vsock: improve tap delivery accuracy
2023-05-02 20:14 ` Stefan Hajnoczi
@ 2023-04-16 4:49 ` Bobby Eshleman
2023-05-03 7:38 ` Stefano Garzarella
2023-05-03 13:39 ` Stefan Hajnoczi
0 siblings, 2 replies; 8+ messages in thread
From: Bobby Eshleman @ 2023-04-16 4:49 UTC
To: Stefan Hajnoczi
Cc: Cong Wang, Cong Wang, Bobby Eshleman, kvm, netdev, virtualization
On Tue, May 02, 2023 at 04:14:18PM -0400, Stefan Hajnoczi wrote:
> On Tue, May 02, 2023 at 10:44:04AM -0700, Cong Wang wrote:
> > From: Cong Wang <cong.wang@bytedance.com>
> >
> > When virtqueue_add_sgs() fails, the skb is put back to send queue,
> > we should not deliver the copy to tap device in this case. So we
> > need to move virtio_transport_deliver_tap_pkt() down after all
> > possible failures.
> >
> > Fixes: 82dfb540aeb2 ("VSOCK: Add virtio vsock vsockmon hooks")
> > Cc: Stefan Hajnoczi <stefanha@redhat.com>
> > Cc: Stefano Garzarella <sgarzare@redhat.com>
> > Cc: Bobby Eshleman <bobby.eshleman@bytedance.com>
> > Signed-off-by: Cong Wang <cong.wang@bytedance.com>
> > ---
> > net/vmw_vsock/virtio_transport.c | 5 ++---
> > 1 file changed, 2 insertions(+), 3 deletions(-)
> >
> > diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
> > index e95df847176b..055678628c07 100644
> > --- a/net/vmw_vsock/virtio_transport.c
> > +++ b/net/vmw_vsock/virtio_transport.c
> > @@ -109,9 +109,6 @@ virtio_transport_send_pkt_work(struct work_struct *work)
> > if (!skb)
> > break;
> >
> > - virtio_transport_deliver_tap_pkt(skb);
> > - reply = virtio_vsock_skb_reply(skb);
> > -
> > sg_init_one(&hdr, virtio_vsock_hdr(skb), sizeof(*virtio_vsock_hdr(skb)));
> > sgs[out_sg++] = &hdr;
> > if (skb->len > 0) {
> > @@ -128,6 +125,8 @@ virtio_transport_send_pkt_work(struct work_struct *work)
> > break;
> > }
> >
> > + virtio_transport_deliver_tap_pkt(skb);
> > + reply = virtio_vsock_skb_reply(skb);
>
> I don't remember the reason for the ordering, but I'm pretty sure it was
> deliberate. Probably because the payload buffers could be freed as soon
> as virtqueue_add_sgs() is called.
>
> If that's no longer true with Bobby's skbuff code, then maybe it's safe
> to monitor packets after they have been sent.
>
> Stefan
Hey Stefan,

Unfortunately, skbuff doesn't change that behavior.

If I understand correctly, the problem flow you are describing
would be something like this:

Thread 0                                Thread 1
guest:virtqueue_add_sgs()[@send_pkt_work]

                                        host:vhost_vq_get_desc()[@handle_tx_kick]
                                        host:vhost_add_used()
                                        host:vhost_signal()
                                        guest:virtqueue_get_buf()[@tx_work]
                                        guest:consume_skb()

guest:deliver_tap_pkt()[@send_pkt_work]
^ use-after-free

Which I guess is possible because the receiver can consume the new
scatterlist during the processing kicked off for a previous batch?
(doesn't have to wait for the subsequent kick)

Best,
Bobby
* Re: [Patch net] vsock: improve tap delivery accuracy
2023-05-03 13:39 ` Stefan Hajnoczi
@ 2023-04-16 6:40 ` Bobby Eshleman
0 siblings, 0 replies; 8+ messages in thread
From: Bobby Eshleman @ 2023-04-16 6:40 UTC
To: Stefan Hajnoczi
Cc: Cong Wang, Cong Wang, Bobby Eshleman, kvm, netdev, virtualization
On Wed, May 03, 2023 at 09:39:13AM -0400, Stefan Hajnoczi wrote:
> On Sun, Apr 16, 2023 at 04:49:00AM +0000, Bobby Eshleman wrote:
> > On Tue, May 02, 2023 at 04:14:18PM -0400, Stefan Hajnoczi wrote:
> > > On Tue, May 02, 2023 at 10:44:04AM -0700, Cong Wang wrote:
> > > > From: Cong Wang <cong.wang@bytedance.com>
> > > >
> > > > When virtqueue_add_sgs() fails, the skb is put back to send queue,
> > > > we should not deliver the copy to tap device in this case. So we
> > > > need to move virtio_transport_deliver_tap_pkt() down after all
> > > > possible failures.
> > > >
> > > > Fixes: 82dfb540aeb2 ("VSOCK: Add virtio vsock vsockmon hooks")
> > > > Cc: Stefan Hajnoczi <stefanha@redhat.com>
> > > > Cc: Stefano Garzarella <sgarzare@redhat.com>
> > > > Cc: Bobby Eshleman <bobby.eshleman@bytedance.com>
> > > > Signed-off-by: Cong Wang <cong.wang@bytedance.com>
> > > > ---
> > > > net/vmw_vsock/virtio_transport.c | 5 ++---
> > > > 1 file changed, 2 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
> > > > index e95df847176b..055678628c07 100644
> > > > --- a/net/vmw_vsock/virtio_transport.c
> > > > +++ b/net/vmw_vsock/virtio_transport.c
> > > > @@ -109,9 +109,6 @@ virtio_transport_send_pkt_work(struct work_struct *work)
> > > > if (!skb)
> > > > break;
> > > >
> > > > - virtio_transport_deliver_tap_pkt(skb);
> > > > - reply = virtio_vsock_skb_reply(skb);
> > > > -
> > > > sg_init_one(&hdr, virtio_vsock_hdr(skb), sizeof(*virtio_vsock_hdr(skb)));
> > > > sgs[out_sg++] = &hdr;
> > > > if (skb->len > 0) {
> > > > @@ -128,6 +125,8 @@ virtio_transport_send_pkt_work(struct work_struct *work)
> > > > break;
> > > > }
> > > >
> > > > + virtio_transport_deliver_tap_pkt(skb);
> > > > + reply = virtio_vsock_skb_reply(skb);
> > >
> > > I don't remember the reason for the ordering, but I'm pretty sure it was
> > > deliberate. Probably because the payload buffers could be freed as soon
> > > as virtqueue_add_sgs() is called.
> > >
> > > If that's no longer true with Bobby's skbuff code, then maybe it's safe
> > > to monitor packets after they have been sent.
> > >
> > > Stefan
> >
> > Hey Stefan,
> >
> > Unfortunately, skbuff doesn't change that behavior.
> >
> > If I understand correctly, the problem flow you are describing
> > would be something like this:
> >
> > > Thread 0                                Thread 1
> > > guest:virtqueue_add_sgs()[@send_pkt_work]
> > >
> > >                                         host:vhost_vq_get_desc()[@handle_tx_kick]
> > >                                         host:vhost_add_used()
> > >                                         host:vhost_signal()
> > >                                         guest:virtqueue_get_buf()[@tx_work]
> > >                                         guest:consume_skb()
> > >
> > > guest:deliver_tap_pkt()[@send_pkt_work]
> > > ^ use-after-free
> >
> > Which I guess is possible because the receiver can consume the new
> > scatterlist during the processing kicked off for a previous batch?
> > (doesn't have to wait for the subsequent kick)
>
> Yes, drivers must assume that the device may complete requests before
> virtqueue_add_sgs() returns. For example, the device is allowed to poll
> the virtqueue memory and may see the new descriptors immediately.
>
> I haven't audited the current vsock code path to determine whether it's
> possible to reach consume_skb() before deliver_tap_pkt() returns, so I
> can't say whether it's safe or not.
>
I see, thanks for the clarification.
Best,
Bobby
* Re: [Patch net] vsock: improve tap delivery accuracy
2023-05-03 7:38 ` Stefano Garzarella
@ 2023-04-16 6:57 ` Bobby Eshleman
0 siblings, 0 replies; 8+ messages in thread
From: Bobby Eshleman @ 2023-04-16 6:57 UTC
To: Stefano Garzarella
Cc: Cong Wang, Bobby Eshleman, kvm, netdev, virtualization,
Stefan Hajnoczi, Cong Wang
On Wed, May 03, 2023 at 09:38:50AM +0200, Stefano Garzarella wrote:
> On Sun, Apr 16, 2023 at 04:49:00AM +0000, Bobby Eshleman wrote:
> > On Tue, May 02, 2023 at 04:14:18PM -0400, Stefan Hajnoczi wrote:
> > > On Tue, May 02, 2023 at 10:44:04AM -0700, Cong Wang wrote:
> > > > From: Cong Wang <cong.wang@bytedance.com>
> > > >
> > > > When virtqueue_add_sgs() fails, the skb is put back to send queue,
> > > > we should not deliver the copy to tap device in this case. So we
> > > > need to move virtio_transport_deliver_tap_pkt() down after all
> > > > possible failures.
> > > >
> > > > Fixes: 82dfb540aeb2 ("VSOCK: Add virtio vsock vsockmon hooks")
> > > > Cc: Stefan Hajnoczi <stefanha@redhat.com>
> > > > Cc: Stefano Garzarella <sgarzare@redhat.com>
> > > > Cc: Bobby Eshleman <bobby.eshleman@bytedance.com>
> > > > Signed-off-by: Cong Wang <cong.wang@bytedance.com>
> > > > ---
> > > > net/vmw_vsock/virtio_transport.c | 5 ++---
> > > > 1 file changed, 2 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
> > > > index e95df847176b..055678628c07 100644
> > > > --- a/net/vmw_vsock/virtio_transport.c
> > > > +++ b/net/vmw_vsock/virtio_transport.c
> > > > @@ -109,9 +109,6 @@ virtio_transport_send_pkt_work(struct work_struct *work)
> > > > if (!skb)
> > > > break;
> > > >
> > > > - virtio_transport_deliver_tap_pkt(skb);
> > > > - reply = virtio_vsock_skb_reply(skb);
> > > > -
> > > > sg_init_one(&hdr, virtio_vsock_hdr(skb), sizeof(*virtio_vsock_hdr(skb)));
> > > > sgs[out_sg++] = &hdr;
> > > > if (skb->len > 0) {
> > > > @@ -128,6 +125,8 @@ virtio_transport_send_pkt_work(struct work_struct *work)
> > > > break;
> > > > }
> > > >
> > > > + virtio_transport_deliver_tap_pkt(skb);
>
> I would move only the virtio_transport_deliver_tap_pkt(),
> virtio_vsock_skb_reply() is not related.
>
> > > > + reply = virtio_vsock_skb_reply(skb);
> > >
> > > I don't remember the reason for the ordering, but I'm pretty sure it was
> > > deliberate. Probably because the payload buffers could be freed as soon
> > > as virtqueue_add_sgs() is called.
> > >
> > > If that's no longer true with Bobby's skbuff code, then maybe it's safe
> > > to monitor packets after they have been sent.
> > >
> > > Stefan
> >
> > Hey Stefan,
> >
> > Unfortunately, skbuff doesn't change that behavior.
> >
> > If I understand correctly, the problem flow you are describing
> > would be something like this:
> >
> > > Thread 0                                Thread 1
> > > guest:virtqueue_add_sgs()[@send_pkt_work]
> > >
> > >                                         host:vhost_vq_get_desc()[@handle_tx_kick]
> > >                                         host:vhost_add_used()
> > >                                         host:vhost_signal()
> > >                                         guest:virtqueue_get_buf()[@tx_work]
> > >                                         guest:consume_skb()
> > >
> > > guest:deliver_tap_pkt()[@send_pkt_work]
> > > ^ use-after-free
> >
> > Which I guess is possible because the receiver can consume the new
> > scatterlist during the processing kicked off for a previous batch?
> > (doesn't have to wait for the subsequent kick)
>
> This is true, but both `send_pkt_work` and `tx_work` hold `tx_lock`, so can
> they really go in parallel?
>
Oh good point, the tx_lock synchronizes it:

Thread 0                                Thread 1
guest:virtqueue_add_sgs()[@send_pkt_work]

                                        host:vhost_vq_get_desc()[@handle_tx_kick]
                                        host:vhost_add_used()
                                        host:vhost_signal()
                                        guest:mutex_lock()[@tx_work]
guest:deliver_tap_pkt()[@send_pkt_work]
guest:mutex_unlock()
                                        guest:virtqueue_get_buf()[@tx_work]
                                        guest:consume_skb()

I'm pretty sure this should be safe.

Best,
Bobby
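
The argument above can be checked with a small userspace model. This is only a sketch, not kernel code: the two op lists, the state struct and every function name are hypothetical, and the host completion is folded into the OP_FREE step. The model enumerates every interleaving of a "send_pkt_work" thread and a "tx_work" thread and reports whether the tap step can ever run after the free step, with and without a shared lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Abstract steps; each stands in for one call from the diagrams above. */
enum op { OP_LOCK, OP_ADD, OP_TAP, OP_FREE, OP_UNLOCK };

/* Thread 0: add to the virtqueue, then deliver to the tap (patched order). */
static const enum op send_pkt_work[] = { OP_LOCK, OP_ADD, OP_TAP, OP_UNLOCK };
/* Thread 1: reclaim the completed buffer and free it. */
static const enum op tx_work[] = { OP_LOCK, OP_FREE, OP_UNLOCK };

struct model_state {
	int pc[2];	/* next op index for each thread */
	int owner;	/* lock holder, or -1 when the lock is free */
	bool added;	/* buffer was handed to the device */
	bool freed;	/* buffer was released by tx_work */
	bool uaf;	/* tap step ran on a freed buffer */
};

/* Depth-first search over all interleavings; true if any one hits a UAF. */
static bool any_uaf(struct model_state s, bool use_lock)
{
	if (s.uaf)
		return true;

	for (int t = 0; t < 2; t++) {
		const enum op *prog = (t == 0) ? send_pkt_work : tx_work;
		int len = (t == 0) ? 4 : 3;
		struct model_state n = s;

		if (s.pc[t] >= len)
			continue;	/* this thread has finished */

		switch (prog[s.pc[t]]) {
		case OP_LOCK:
			if (use_lock) {
				if (s.owner != -1)
					continue;	/* blocked: try the other thread */
				n.owner = t;
			}
			break;
		case OP_UNLOCK:
			if (use_lock)
				n.owner = -1;
			break;
		case OP_ADD:
			n.added = true;
			break;
		case OP_FREE:
			/* Nothing to reclaim unless the buffer was added. */
			if (s.added)
				n.freed = true;
			break;
		case OP_TAP:
			if (s.freed)
				n.uaf = true;
			break;
		}

		n.pc[t]++;
		if (any_uaf(n, use_lock))
			return true;
	}
	return false;
}

/* Entry point: can tap-after-add touch a freed buffer in this model? */
static bool tap_after_add_can_uaf(bool use_lock)
{
	struct model_state s = { .pc = { 0, 0 }, .owner = -1 };

	assert(!s.added && !s.freed && !s.uaf);
	return any_uaf(s, use_lock);
}
```

In this model tap_after_add_can_uaf(false) finds the racy schedule from the first diagram, while tap_after_add_can_uaf(true) finds none. It says nothing about paths that free the skb without taking tx_lock, so it is no substitute for auditing the real code.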
* [Patch net] vsock: improve tap delivery accuracy
@ 2023-05-02 17:44 Cong Wang
2023-05-02 20:02 ` Simon Horman
2023-05-02 20:14 ` Stefan Hajnoczi
0 siblings, 2 replies; 8+ messages in thread
From: Cong Wang @ 2023-05-02 17:44 UTC
To: netdev
Cc: virtualization, kvm, Cong Wang, Stefan Hajnoczi,
Stefano Garzarella, Bobby Eshleman
From: Cong Wang <cong.wang@bytedance.com>
When virtqueue_add_sgs() fails, the skb is put back on the send queue,
so we should not deliver a copy to the tap device in this case. Move
virtio_transport_deliver_tap_pkt() down, after all possible failures.
Fixes: 82dfb540aeb2 ("VSOCK: Add virtio vsock vsockmon hooks")
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Stefano Garzarella <sgarzare@redhat.com>
Cc: Bobby Eshleman <bobby.eshleman@bytedance.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
---
net/vmw_vsock/virtio_transport.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index e95df847176b..055678628c07 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -109,9 +109,6 @@ virtio_transport_send_pkt_work(struct work_struct *work)
 		if (!skb)
 			break;
 
-		virtio_transport_deliver_tap_pkt(skb);
-		reply = virtio_vsock_skb_reply(skb);
-
 		sg_init_one(&hdr, virtio_vsock_hdr(skb), sizeof(*virtio_vsock_hdr(skb)));
 		sgs[out_sg++] = &hdr;
 		if (skb->len > 0) {
@@ -128,6 +125,8 @@ virtio_transport_send_pkt_work(struct work_struct *work)
 			break;
 		}
 
+		virtio_transport_deliver_tap_pkt(skb);
+		reply = virtio_vsock_skb_reply(skb);
 		if (reply) {
 			struct virtqueue *rx_vq = vsock->vqs[VSOCK_VQ_RX];
 			int val;
--
2.34.1
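
The effect the commit message describes can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the kernel code: struct model_vq, model_vq_add() and model_tap_copies() are hypothetical names, the "ring full" failure stands in for any virtqueue_add_sgs() error, and the retry after the failure stands in for the skb being requeued and the work running again later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the TX virtqueue: holds at most `capacity` buffers. */
struct model_vq {
	size_t capacity;
	size_t in_flight;
};

/* Models virtqueue_add_sgs(): 0 on success, -1 when the ring is full. */
static int model_vq_add(struct model_vq *vq)
{
	if (vq->in_flight >= vq->capacity)
		return -1;
	vq->in_flight++;
	return 0;
}

/*
 * Sends npkts packets through the model ring and returns how many copies
 * the tap device would have seen.
 */
static size_t model_tap_copies(size_t npkts, size_t capacity, bool tap_before_add)
{
	struct model_vq vq = { .capacity = capacity, .in_flight = 0 };
	size_t sent = 0, tap_copies = 0;

	assert(capacity > 0);	/* otherwise no packet could ever be sent */

	while (sent < npkts) {
		if (tap_before_add)
			tap_copies++;	/* pre-patch order: tap sees every attempt */

		if (model_vq_add(&vq) < 0) {
			/*
			 * The packet goes back on the send queue. Pretend the
			 * host drains the ring before the retry.
			 */
			vq.in_flight = 0;
			continue;
		}

		if (!tap_before_add)
			tap_copies++;	/* patched order: tap sees queued packets only */
		sent++;
	}
	return tap_copies;
}
```

With five packets and a ring of two, the pre-patch order yields seven tap copies in this model (two attempts fail and are counted twice), while the patched order yields exactly five.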
* Re: [Patch net] vsock: improve tap delivery accuracy
2023-05-02 17:44 [Patch net] vsock: improve tap delivery accuracy Cong Wang
@ 2023-05-02 20:02 ` Simon Horman
2023-05-02 20:14 ` Stefan Hajnoczi
1 sibling, 0 replies; 8+ messages in thread
From: Simon Horman @ 2023-05-02 20:02 UTC
To: Cong Wang
Cc: netdev, virtualization, kvm, Cong Wang, Stefan Hajnoczi,
Stefano Garzarella, Bobby Eshleman
On Tue, May 02, 2023 at 10:44:04AM -0700, Cong Wang wrote:
> From: Cong Wang <cong.wang@bytedance.com>
>
> When virtqueue_add_sgs() fails, the skb is put back to send queue,
> we should not deliver the copy to tap device in this case. So we
> need to move virtio_transport_deliver_tap_pkt() down after all
> possible failures.
>
> Fixes: 82dfb540aeb2 ("VSOCK: Add virtio vsock vsockmon hooks")
> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> Cc: Stefano Garzarella <sgarzare@redhat.com>
> Cc: Bobby Eshleman <bobby.eshleman@bytedance.com>
> Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
* Re: [Patch net] vsock: improve tap delivery accuracy
2023-05-02 17:44 [Patch net] vsock: improve tap delivery accuracy Cong Wang
2023-05-02 20:02 ` Simon Horman
@ 2023-05-02 20:14 ` Stefan Hajnoczi
2023-04-16 4:49 ` Bobby Eshleman
1 sibling, 1 reply; 8+ messages in thread
From: Stefan Hajnoczi @ 2023-05-02 20:14 UTC
To: Cong Wang
Cc: netdev, virtualization, kvm, Cong Wang, Stefano Garzarella,
Bobby Eshleman
On Tue, May 02, 2023 at 10:44:04AM -0700, Cong Wang wrote:
> From: Cong Wang <cong.wang@bytedance.com>
>
> When virtqueue_add_sgs() fails, the skb is put back to send queue,
> we should not deliver the copy to tap device in this case. So we
> need to move virtio_transport_deliver_tap_pkt() down after all
> possible failures.
>
> Fixes: 82dfb540aeb2 ("VSOCK: Add virtio vsock vsockmon hooks")
> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> Cc: Stefano Garzarella <sgarzare@redhat.com>
> Cc: Bobby Eshleman <bobby.eshleman@bytedance.com>
> Signed-off-by: Cong Wang <cong.wang@bytedance.com>
> ---
> net/vmw_vsock/virtio_transport.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
> index e95df847176b..055678628c07 100644
> --- a/net/vmw_vsock/virtio_transport.c
> +++ b/net/vmw_vsock/virtio_transport.c
> @@ -109,9 +109,6 @@ virtio_transport_send_pkt_work(struct work_struct *work)
> if (!skb)
> break;
>
> - virtio_transport_deliver_tap_pkt(skb);
> - reply = virtio_vsock_skb_reply(skb);
> -
> sg_init_one(&hdr, virtio_vsock_hdr(skb), sizeof(*virtio_vsock_hdr(skb)));
> sgs[out_sg++] = &hdr;
> if (skb->len > 0) {
> @@ -128,6 +125,8 @@ virtio_transport_send_pkt_work(struct work_struct *work)
> break;
> }
>
> + virtio_transport_deliver_tap_pkt(skb);
> + reply = virtio_vsock_skb_reply(skb);
I don't remember the reason for the ordering, but I'm pretty sure it was
deliberate. Probably because the payload buffers could be freed as soon
as virtqueue_add_sgs() is called.
If that's no longer true with Bobby's skbuff code, then maybe it's safe
to monitor packets after they have been sent.
Stefan
* Re: [Patch net] vsock: improve tap delivery accuracy
2023-04-16 4:49 ` Bobby Eshleman
@ 2023-05-03 7:38 ` Stefano Garzarella
2023-04-16 6:57 ` Bobby Eshleman
2023-05-03 13:39 ` Stefan Hajnoczi
1 sibling, 1 reply; 8+ messages in thread
From: Stefano Garzarella @ 2023-05-03 7:38 UTC
To: Bobby Eshleman
Cc: Stefan Hajnoczi, Cong Wang, Cong Wang, Bobby Eshleman, kvm,
netdev, virtualization
On Sun, Apr 16, 2023 at 04:49:00AM +0000, Bobby Eshleman wrote:
>On Tue, May 02, 2023 at 04:14:18PM -0400, Stefan Hajnoczi wrote:
>> On Tue, May 02, 2023 at 10:44:04AM -0700, Cong Wang wrote:
>> > From: Cong Wang <cong.wang@bytedance.com>
>> >
>> > When virtqueue_add_sgs() fails, the skb is put back to send queue,
>> > we should not deliver the copy to tap device in this case. So we
>> > need to move virtio_transport_deliver_tap_pkt() down after all
>> > possible failures.
>> >
>> > Fixes: 82dfb540aeb2 ("VSOCK: Add virtio vsock vsockmon hooks")
>> > Cc: Stefan Hajnoczi <stefanha@redhat.com>
>> > Cc: Stefano Garzarella <sgarzare@redhat.com>
>> > Cc: Bobby Eshleman <bobby.eshleman@bytedance.com>
>> > Signed-off-by: Cong Wang <cong.wang@bytedance.com>
>> > ---
>> > net/vmw_vsock/virtio_transport.c | 5 ++---
>> > 1 file changed, 2 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
>> > index e95df847176b..055678628c07 100644
>> > --- a/net/vmw_vsock/virtio_transport.c
>> > +++ b/net/vmw_vsock/virtio_transport.c
>> > @@ -109,9 +109,6 @@ virtio_transport_send_pkt_work(struct work_struct *work)
>> > if (!skb)
>> > break;
>> >
>> > - virtio_transport_deliver_tap_pkt(skb);
>> > - reply = virtio_vsock_skb_reply(skb);
>> > -
>> > sg_init_one(&hdr, virtio_vsock_hdr(skb), sizeof(*virtio_vsock_hdr(skb)));
>> > sgs[out_sg++] = &hdr;
>> > if (skb->len > 0) {
>> > @@ -128,6 +125,8 @@ virtio_transport_send_pkt_work(struct work_struct *work)
>> > break;
>> > }
>> >
>> > + virtio_transport_deliver_tap_pkt(skb);
I would move only the virtio_transport_deliver_tap_pkt(),
virtio_vsock_skb_reply() is not related.
>> > + reply = virtio_vsock_skb_reply(skb);
>>
>> I don't remember the reason for the ordering, but I'm pretty sure it was
>> deliberate. Probably because the payload buffers could be freed as soon
>> as virtqueue_add_sgs() is called.
>>
>> If that's no longer true with Bobby's skbuff code, then maybe it's safe
>> to monitor packets after they have been sent.
>>
>> Stefan
>
>Hey Stefan,
>
>Unfortunately, skbuff doesn't change that behavior.
>
>If I understand correctly, the problem flow you are describing
>would be something like this:
>
>Thread 0                                Thread 1
>guest:virtqueue_add_sgs()[@send_pkt_work]
>
>                                        host:vhost_vq_get_desc()[@handle_tx_kick]
>                                        host:vhost_add_used()
>                                        host:vhost_signal()
>                                        guest:virtqueue_get_buf()[@tx_work]
>                                        guest:consume_skb()
>
>guest:deliver_tap_pkt()[@send_pkt_work]
>^ use-after-free
>
>Which I guess is possible because the receiver can consume the new
>scatterlist during the processing kicked off for a previous batch?
>(doesn't have to wait for the subsequent kick)
This is true, but both `send_pkt_work` and `tx_work` hold `tx_lock`, so
can they really go in parallel?
Thanks,
Stefano
* Re: [Patch net] vsock: improve tap delivery accuracy
2023-04-16 4:49 ` Bobby Eshleman
2023-05-03 7:38 ` Stefano Garzarella
@ 2023-05-03 13:39 ` Stefan Hajnoczi
2023-04-16 6:40 ` Bobby Eshleman
1 sibling, 1 reply; 8+ messages in thread
From: Stefan Hajnoczi @ 2023-05-03 13:39 UTC
To: Bobby Eshleman
Cc: Cong Wang, Cong Wang, Bobby Eshleman, kvm, netdev, virtualization
On Sun, Apr 16, 2023 at 04:49:00AM +0000, Bobby Eshleman wrote:
> On Tue, May 02, 2023 at 04:14:18PM -0400, Stefan Hajnoczi wrote:
> > On Tue, May 02, 2023 at 10:44:04AM -0700, Cong Wang wrote:
> > > From: Cong Wang <cong.wang@bytedance.com>
> > >
> > > When virtqueue_add_sgs() fails, the skb is put back to send queue,
> > > we should not deliver the copy to tap device in this case. So we
> > > need to move virtio_transport_deliver_tap_pkt() down after all
> > > possible failures.
> > >
> > > Fixes: 82dfb540aeb2 ("VSOCK: Add virtio vsock vsockmon hooks")
> > > Cc: Stefan Hajnoczi <stefanha@redhat.com>
> > > Cc: Stefano Garzarella <sgarzare@redhat.com>
> > > Cc: Bobby Eshleman <bobby.eshleman@bytedance.com>
> > > Signed-off-by: Cong Wang <cong.wang@bytedance.com>
> > > ---
> > > net/vmw_vsock/virtio_transport.c | 5 ++---
> > > 1 file changed, 2 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
> > > index e95df847176b..055678628c07 100644
> > > --- a/net/vmw_vsock/virtio_transport.c
> > > +++ b/net/vmw_vsock/virtio_transport.c
> > > @@ -109,9 +109,6 @@ virtio_transport_send_pkt_work(struct work_struct *work)
> > > if (!skb)
> > > break;
> > >
> > > - virtio_transport_deliver_tap_pkt(skb);
> > > - reply = virtio_vsock_skb_reply(skb);
> > > -
> > > sg_init_one(&hdr, virtio_vsock_hdr(skb), sizeof(*virtio_vsock_hdr(skb)));
> > > sgs[out_sg++] = &hdr;
> > > if (skb->len > 0) {
> > > @@ -128,6 +125,8 @@ virtio_transport_send_pkt_work(struct work_struct *work)
> > > break;
> > > }
> > >
> > > + virtio_transport_deliver_tap_pkt(skb);
> > > + reply = virtio_vsock_skb_reply(skb);
> >
> > I don't remember the reason for the ordering, but I'm pretty sure it was
> > deliberate. Probably because the payload buffers could be freed as soon
> > as virtqueue_add_sgs() is called.
> >
> > If that's no longer true with Bobby's skbuff code, then maybe it's safe
> > to monitor packets after they have been sent.
> >
> > Stefan
>
> Hey Stefan,
>
> Unfortunately, skbuff doesn't change that behavior.
>
> If I understand correctly, the problem flow you are describing
> would be something like this:
>
> Thread 0                                Thread 1
> guest:virtqueue_add_sgs()[@send_pkt_work]
>
>                                         host:vhost_vq_get_desc()[@handle_tx_kick]
>                                         host:vhost_add_used()
>                                         host:vhost_signal()
>                                         guest:virtqueue_get_buf()[@tx_work]
>                                         guest:consume_skb()
>
> guest:deliver_tap_pkt()[@send_pkt_work]
> ^ use-after-free
>
> Which I guess is possible because the receiver can consume the new
> scatterlist during the processing kicked off for a previous batch?
> (doesn't have to wait for the subsequent kick)
Yes, drivers must assume that the device may complete requests before
virtqueue_add_sgs() returns. For example, the device is allowed to poll
the virtqueue memory and may see the new descriptors immediately.
I haven't audited the current vsock code path to determine whether it's
possible to reach consume_skb() before deliver_tap_pkt() returns, so I
can't say whether it's safe or not.
Stefan
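
The general rule above (treat a buffer as possibly completed the moment it is added) is often handled by reading whatever is still needed from the buffer before publishing it. The following is a minimal userspace sketch of that pattern, not vsock code: struct model_pkt, model_add_polling_device() and send_and_report_reply() are hypothetical, and the "device" here is assumed to complete and release the packet inside the add call, which is the worst case a driver has to allow for.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical packet with one field the sender still needs after queuing. */
struct model_pkt {
	bool wants_reply;
};

/*
 * Hypothetical device that polls the ring: it completes the request and the
 * packet is released before the add call returns.
 */
static void model_add_polling_device(struct model_pkt **slot)
{
	free(*slot);
	*slot = NULL;	/* the caller must treat the packet as gone from here on */
}

/* Snapshot the field first, then hand the packet over. */
static bool send_and_report_reply(bool wants_reply)
{
	struct model_pkt *pkt = malloc(sizeof(*pkt));
	bool reply;

	assert(pkt != NULL);
	pkt->wants_reply = wants_reply;

	reply = pkt->wants_reply;		/* read before publishing */
	model_add_polling_device(&pkt);	/* pkt is NULL after this call */

	return reply;
}
```

Whether the vsock path needs this precaution depends on whether anything can free the skb while send_pkt_work still holds it, which is the question raised elsewhere in this thread about tx_lock.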