All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: RE: [PATCH 0/6] virtio: support advance DMA
       [not found] <MW3PR11MB46029D413C9A6C32C0E5B347F7519@MW3PR11MB4602.namprd11.prod.outlook.com>
@ 2022-01-11  8:06 ` Xuan Zhuo
  2022-01-11  8:25   ` Jason Wang
  0 siblings, 1 reply; 5+ messages in thread
From: Xuan Zhuo @ 2022-01-11  8:06 UTC (permalink / raw)
  To: magnus.karlsson; +Cc: Michael S.Tsirkin, virtualization

On Tue, 11 Jan 2022 08:04:05 +0000, Karlsson, Magnus <magnus.karlsson@intel.com> wrote:
>
>
> > -----Original Message-----
> > From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > Sent: Tuesday, January 11, 2022 7:17 AM
> > To: Jason Wang <jasowang@redhat.com>
> > Cc: virtualization <virtualization@lists.linux-foundation.org>; Michael S.Tsirkin
> > <mst@redhat.com>; Karlsson, Magnus <magnus.karlsson@intel.com>
> > Subject: Re: [PATCH 0/6] virtio: support advance DMA
> >
> > On Tue, 11 Jan 2022 10:54:45 +0800, Jason Wang <jasowang@redhat.com>
> > wrote:
> > > On Mon, Jan 10, 2022 at 5:59 PM Michael S. Tsirkin <mst@redhat.com>
> > wrote:
> > > >
> > > > On Fri, Jan 07, 2022 at 02:33:00PM +0800, Xuan Zhuo wrote:
> > > > > virtqueue_add() only supports virtual addresses, dma is completed
> > > > > in virtqueue_add().
> > > > >
> > > > > In some scenarios (such as the AF_XDP scenario), DMA is completed
> > > > > in advance, so it is necessary for us to support passing the DMA address
> > to virtqueue_add().
> > > > >
> > > > > This patch set stipulates that if sg->dma_address is not NULL, use
> > > > > this address as the DMA address. And record this information in
> > > > > extra->flags, which can be skipped when executing dma unmap.
> > > > >
> > > > >     extra->flags |= VRING_DESC_F_PREDMA;
> > > > >
> > > > > But the indirect desc does not have a corresponding extra, so the
> > > > > second and third patches of this patch set are to allocate the
> > > > > corresponding extra while allocating the indirect desc. Each desc
> > > > > must have a corresponding extra because it is possible in an sgs
> > > > > some are advance DMA, while others are virtual addresses. So we must
> > allocate an extra for each indirect desc.
> > > >
> > > >
> > > > I didn't realize AF_XDP didn't have space to stuff the header into.
> > > > Jason, is that expected?
> > >
> > > I might be wrong but it looks to me AF_XDP allows to reserve
> > > sufficient headroom via xdp_umem_reg_v1.
> > >
> >
> > I understand that there is a headroom for receiving packages, which can be
> > used to put virtio headers. But there is no headroom defined in the direction
> > of sending packages. I hope Magnus Karlsson can help confirm whether
> > there is any misunderstanding.
>
> You can specify the amount of headroom you want on Tx by adjusting the "addr" field in the descriptor of the Tx ring. If your chunk starts at address X in the umem and you want 128 bytes of headroom, just write your packet into X+128 and put that address into the Tx descriptor. Will this solve your problem? If not, what would you need from AF_XDP to make it work?
>
> On Rx, there is always 256 bytes worth of headroom as specified by XDP. If you need extra, you can set the headroom variable when you register the umem.

The driver of virtio net, when passing the packet to the hardware, should add a
virtnet hdr (12 bytes) in front of the packet. Both rx and tx should add such a
header. AF_XDP has a space of 256 bytes in the rx process. We can reuse this
space. The direction of AF_XDP tx has no such regulation.

The method you mentioned requires user cooperation, which is not a good method
for driver implementation.

Thanks.

>
> > It would be best if we could not use indirect.
> >
> > Thanks.
> >
> > > > It would be best to fix that, performance is best if header is
> > > > linear with the data ...
> > >
> > > This looks like a must otherwise we may meet trouble in zerocopy receive.
> > >
> > > Thanks
> > >
> > > > Or maybe we can reduce the use of indirect somewhat, at least while
> > > > the ring is mostly empty?
> > > >
> > > > > Xuan Zhuo (6):
> > > > >   virtio: rename vring_unmap_state_packed() to
> > > > >     vring_unmap_extra_packed()
> > > > >   virtio: split: alloc indirect desc with extra
> > > > >   virtio: packed: alloc indirect desc with extra
> > > > >   virtio: split: virtqueue_add_split() support dma address
> > > > >   virtio: packed: virtqueue_add_packed() support dma address
> > > > >   virtio: add api virtio_dma_map() for advance dma
> > > > >
> > > > >  drivers/virtio/virtio_ring.c | 387 ++++++++++++++++++++---------------
> > > > >  include/linux/virtio.h       |   9 +
> > > > >  2 files changed, 232 insertions(+), 164 deletions(-)
> > > > >
> > > > > --
> > > > > 2.31.0
> > > >
> > >
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: RE: [PATCH 0/6] virtio: support advance DMA
  2022-01-11  8:06 ` RE: [PATCH 0/6] virtio: support advance DMA Xuan Zhuo
@ 2022-01-11  8:25   ` Jason Wang
  2022-01-11  8:29     ` Xuan Zhuo
       [not found]     ` <MW3PR11MB46021E726D394F747E19D4C7F7519@MW3PR11MB4602.namprd11.prod.outlook.com>
  0 siblings, 2 replies; 5+ messages in thread
From: Jason Wang @ 2022-01-11  8:25 UTC (permalink / raw)
  To: Xuan Zhuo; +Cc: Michael S.Tsirkin, Magnus Karlsson, virtualization

On Tue, Jan 11, 2022 at 4:17 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>
> On Tue, 11 Jan 2022 08:04:05 +0000, Karlsson, Magnus <magnus.karlsson@intel.com> wrote:
> >
> >
> > > -----Original Message-----
> > > From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > Sent: Tuesday, January 11, 2022 7:17 AM
> > > To: Jason Wang <jasowang@redhat.com>
> > > Cc: virtualization <virtualization@lists.linux-foundation.org>; Michael S.Tsirkin
> > > <mst@redhat.com>; Karlsson, Magnus <magnus.karlsson@intel.com>
> > > Subject: Re: [PATCH 0/6] virtio: support advance DMA
> > >
> > > On Tue, 11 Jan 2022 10:54:45 +0800, Jason Wang <jasowang@redhat.com>
> > > wrote:
> > > > On Mon, Jan 10, 2022 at 5:59 PM Michael S. Tsirkin <mst@redhat.com>
> > > wrote:
> > > > >
> > > > > On Fri, Jan 07, 2022 at 02:33:00PM +0800, Xuan Zhuo wrote:
> > > > > > virtqueue_add() only supports virtual addresses, dma is completed
> > > > > > in virtqueue_add().
> > > > > >
> > > > > > In some scenarios (such as the AF_XDP scenario), DMA is completed
> > > > > > in advance, so it is necessary for us to support passing the DMA address
> > > to virtqueue_add().
> > > > > >
> > > > > > This patch set stipulates that if sg->dma_address is not NULL, use
> > > > > > this address as the DMA address. And record this information in
> > > > > > extra->flags, which can be skipped when executing dma unmap.
> > > > > >
> > > > > >     extra->flags |= VRING_DESC_F_PREDMA;
> > > > > >
> > > > > > But the indirect desc does not have a corresponding extra, so the
> > > > > > second and third patches of this patch set are to allocate the
> > > > > > corresponding extra while allocating the indirect desc. Each desc
> > > > > > must have a corresponding extra because it is possible in an sgs
> > > > > > some are advance DMA, while others are virtual addresses. So we must
> > > allocate an extra for each indirect desc.
> > > > >
> > > > >
> > > > > I didn't realize AF_XDP didn't have space to stuff the header into.
> > > > > Jason, is that expected?
> > > >
> > > > I might be wrong but it looks to me AF_XDP allows to reserve
> > > > sufficient headroom via xdp_umem_reg_v1.
> > > >
> > >
> > > I understand that there is a headroom for receiving packages, which can be
> > > used to put virtio headers. But there is no headroom defined in the direction
> > > of sending packages. I hope Magnus Karlsson can help confirm whether
> > > there is any misunderstanding.
> >
> > You can specify the amount of headroom you want on Tx by adjusting the "addr" field in the descriptor of the Tx ring. If your chunk starts at address X in the umem and you want 128 bytes of headroom, just write your packet into X+128 and put that address into the Tx descriptor. Will this solve your problem? If not, what would you need from AF_XDP to make it work?
> >
> > On Rx, there is always 256 bytes worth of headroom as specified by XDP. If you need extra, you can set the headroom variable when you register the umem.
>
> The driver of virtio net, when passing the packet to the hardware, should add a
> virtnet hdr (12 bytes) in front of the packet. Both rx and tx should add such a
> header. AF_XDP has a space of 256 bytes in the rx process. We can reuse this
> space. The direction of AF_XDP tx has no such regulation.
>
> The method you mentioned requires user cooperation, which is not a good method
> for driver implementation.

This will result in a non-portable userspace program. I wonder why TX
has become a problem here actually, anyhow we can use a dedicated sg
for vnet hdr? And if we packed all vnet headers in an array it will
give less performance impact.

Thanks

>
> Thanks.
>
> >
> > > It would be best if we could not use indirect.
> > >
> > > Thanks.
> > >
> > > > > It would be best to fix that, performance is best if header is
> > > > > linear with the data ...
> > > >
> > > > This looks like a must otherwise we may meet trouble in zerocopy receive.
> > > >
> > > > Thanks
> > > >
> > > > > Or maybe we can reduce the use of indirect somewhat, at least while
> > > > > the ring is mostly empty?
> > > > >
> > > > > > Xuan Zhuo (6):
> > > > > >   virtio: rename vring_unmap_state_packed() to
> > > > > >     vring_unmap_extra_packed()
> > > > > >   virtio: split: alloc indirect desc with extra
> > > > > >   virtio: packed: alloc indirect desc with extra
> > > > > >   virtio: split: virtqueue_add_split() support dma address
> > > > > >   virtio: packed: virtqueue_add_packed() support dma address
> > > > > >   virtio: add api virtio_dma_map() for advance dma
> > > > > >
> > > > > >  drivers/virtio/virtio_ring.c | 387 ++++++++++++++++++++---------------
> > > > > >  include/linux/virtio.h       |   9 +
> > > > > >  2 files changed, 232 insertions(+), 164 deletions(-)
> > > > > >
> > > > > > --
> > > > > > 2.31.0
> > > > >
> > > >
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: RE: [PATCH 0/6] virtio: support advance DMA
  2022-01-11  8:25   ` Jason Wang
@ 2022-01-11  8:29     ` Xuan Zhuo
       [not found]     ` <MW3PR11MB46021E726D394F747E19D4C7F7519@MW3PR11MB4602.namprd11.prod.outlook.com>
  1 sibling, 0 replies; 5+ messages in thread
From: Xuan Zhuo @ 2022-01-11  8:29 UTC (permalink / raw)
  To: Jason Wang; +Cc: Michael S.Tsirkin, Magnus Karlsson, virtualization

On Tue, 11 Jan 2022 16:25:44 +0800, Jason Wang <jasowang@redhat.com> wrote:
> On Tue, Jan 11, 2022 at 4:17 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> >
> > On Tue, 11 Jan 2022 08:04:05 +0000, Karlsson, Magnus <magnus.karlsson@intel.com> wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > Sent: Tuesday, January 11, 2022 7:17 AM
> > > > To: Jason Wang <jasowang@redhat.com>
> > > > Cc: virtualization <virtualization@lists.linux-foundation.org>; Michael S.Tsirkin
> > > > <mst@redhat.com>; Karlsson, Magnus <magnus.karlsson@intel.com>
> > > > Subject: Re: [PATCH 0/6] virtio: support advance DMA
> > > >
> > > > On Tue, 11 Jan 2022 10:54:45 +0800, Jason Wang <jasowang@redhat.com>
> > > > wrote:
> > > > > On Mon, Jan 10, 2022 at 5:59 PM Michael S. Tsirkin <mst@redhat.com>
> > > > wrote:
> > > > > >
> > > > > > On Fri, Jan 07, 2022 at 02:33:00PM +0800, Xuan Zhuo wrote:
> > > > > > > virtqueue_add() only supports virtual addresses, dma is completed
> > > > > > > in virtqueue_add().
> > > > > > >
> > > > > > > In some scenarios (such as the AF_XDP scenario), DMA is completed
> > > > > > > in advance, so it is necessary for us to support passing the DMA address
> > > > to virtqueue_add().
> > > > > > >
> > > > > > > This patch set stipulates that if sg->dma_address is not NULL, use
> > > > > > > this address as the DMA address. And record this information in
> > > > > > > extra->flags, which can be skipped when executing dma unmap.
> > > > > > >
> > > > > > >     extra->flags |= VRING_DESC_F_PREDMA;
> > > > > > >
> > > > > > > But the indirect desc does not have a corresponding extra, so the
> > > > > > > second and third patches of this patch set are to allocate the
> > > > > > > corresponding extra while allocating the indirect desc. Each desc
> > > > > > > must have a corresponding extra because it is possible in an sgs
> > > > > > > some are advance DMA, while others are virtual addresses. So we must
> > > > allocate an extra for each indirect desc.
> > > > > >
> > > > > >
> > > > > > I didn't realize AF_XDP didn't have space to stuff the header into.
> > > > > > Jason, is that expected?
> > > > >
> > > > > I might be wrong but it looks to me AF_XDP allows to reserve
> > > > > sufficient headroom via xdp_umem_reg_v1.
> > > > >
> > > >
> > > > I understand that there is a headroom for receiving packages, which can be
> > > > used to put virtio headers. But there is no headroom defined in the direction
> > > > of sending packages. I hope Magnus Karlsson can help confirm whether
> > > > there is any misunderstanding.
> > >
> > > You can specify the amount of headroom you want on Tx by adjusting the "addr" field in the descriptor of the Tx ring. If your chunk starts at address X in the umem and you want 128 bytes of headroom, just write your packet into X+128 and put that address into the Tx descriptor. Will this solve your problem? If not, what would you need from AF_XDP to make it work?
> > >
> > > On Rx, there is always 256 bytes worth of headroom as specified by XDP. If you need extra, you can set the headroom variable when you register the umem.
> >
> > The driver of virtio net, when passing the packet to the hardware, should add a
> > virtnet hdr (12 bytes) in front of the packet. Both rx and tx should add such a
> > header. AF_XDP has a space of 256 bytes in the rx process. We can reuse this
> > space. The direction of AF_XDP tx has no such regulation.
> >
> > The method you mentioned requires user cooperation, which is not a good method
> > for driver implementation.
>
> This will result in a non-portable userspace program. I wonder why TX
> has become a problem here actually, anyhow we can use a dedicated sg
> for vnet hdr? And if we packed all vnet headers in an array it will
> give less performance impact.

There is no problem in implementation, there are two performance points:

1. vnet hdr and packet are not connected memory
2. use indirect or occupy two desc.

By the way, @Karlsson, Magnus, due to the peculiarity of virtio, the DMA
implementation of virtio needs to call the specialized API of virtio (see the
last patch of this patch set). I am thinking about how xsk can support this:

1. Pass callback to xsk (must pass multiple callbacks including map, unmap, map sync...)
2. xsk judges that it is virtio, and specifically calls virtio's DMA api

Which one do you think is more suitable?

Thanks.

>
> Thanks
>
> >
> > Thanks.
> >
> > >
> > > > It would be best if we could not use indirect.
> > > >
> > > > Thanks.
> > > >
> > > > > > It would be best to fix that, performance is best if header is
> > > > > > linear with the data ...
> > > > >
> > > > > This looks like a must otherwise we may meet trouble in zerocopy receive.
> > > > >
> > > > > Thanks
> > > > >
> > > > > > Or maybe we can reduce the use of indirect somewhat, at least while
> > > > > > the ring is mostly empty?
> > > > > >
> > > > > > > Xuan Zhuo (6):
> > > > > > >   virtio: rename vring_unmap_state_packed() to
> > > > > > >     vring_unmap_extra_packed()
> > > > > > >   virtio: split: alloc indirect desc with extra
> > > > > > >   virtio: packed: alloc indirect desc with extra
> > > > > > >   virtio: split: virtqueue_add_split() support dma address
> > > > > > >   virtio: packed: virtqueue_add_packed() support dma address
> > > > > > >   virtio: add api virtio_dma_map() for advance dma
> > > > > > >
> > > > > > >  drivers/virtio/virtio_ring.c | 387 ++++++++++++++++++++---------------
> > > > > > >  include/linux/virtio.h       |   9 +
> > > > > > >  2 files changed, 232 insertions(+), 164 deletions(-)
> > > > > > >
> > > > > > > --
> > > > > > > 2.31.0
> > > > > >
> > > > >
> >
>
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: RE: [PATCH 0/6] virtio: support advance DMA
       [not found]     ` <MW3PR11MB46021E726D394F747E19D4C7F7519@MW3PR11MB4602.namprd11.prod.outlook.com>
@ 2022-01-11 15:46       ` Michael S. Tsirkin
  2022-01-12  2:11         ` Xuan Zhuo
  0 siblings, 1 reply; 5+ messages in thread
From: Michael S. Tsirkin @ 2022-01-11 15:46 UTC (permalink / raw)
  To: Karlsson, Magnus; +Cc: virtualization

On Tue, Jan 11, 2022 at 08:40:56AM +0000, Karlsson, Magnus wrote:
> 
> 
> > -----Original Message-----
> > From: Jason Wang <jasowang@redhat.com>
> > Sent: Tuesday, January 11, 2022 9:26 AM
> > To: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > Cc: Karlsson, Magnus <magnus.karlsson@intel.com>; virtualization
> > <virtualization@lists.linux-foundation.org>; Michael S.Tsirkin
> > <mst@redhat.com>
> > Subject: Re: RE: [PATCH 0/6] virtio: support advance DMA
> > 
> > On Tue, Jan 11, 2022 at 4:17 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > wrote:
> > >
> > > On Tue, 11 Jan 2022 08:04:05 +0000, Karlsson, Magnus
> > <magnus.karlsson@intel.com> wrote:
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > Sent: Tuesday, January 11, 2022 7:17 AM
> > > > > To: Jason Wang <jasowang@redhat.com>
> > > > > Cc: virtualization <virtualization@lists.linux-foundation.org>;
> > > > > Michael S.Tsirkin <mst@redhat.com>; Karlsson, Magnus
> > > > > <magnus.karlsson@intel.com>
> > > > > Subject: Re: [PATCH 0/6] virtio: support advance DMA
> > > > >
> > > > > On Tue, 11 Jan 2022 10:54:45 +0800, Jason Wang
> > > > > <jasowang@redhat.com>
> > > > > wrote:
> > > > > > On Mon, Jan 10, 2022 at 5:59 PM Michael S. Tsirkin
> > > > > > <mst@redhat.com>
> > > > > wrote:
> > > > > > >
> > > > > > > On Fri, Jan 07, 2022 at 02:33:00PM +0800, Xuan Zhuo wrote:
> > > > > > > > virtqueue_add() only supports virtual addresses, dma is
> > > > > > > > completed in virtqueue_add().
> > > > > > > >
> > > > > > > > In some scenarios (such as the AF_XDP scenario), DMA is
> > > > > > > > completed in advance, so it is necessary for us to support
> > > > > > > > passing the DMA address
> > > > > to virtqueue_add().
> > > > > > > >
> > > > > > > > This patch set stipulates that if sg->dma_address is not
> > > > > > > > NULL, use this address as the DMA address. And record this
> > > > > > > > information in
> > > > > > > > extra->flags, which can be skipped when executing dma unmap.
> > > > > > > >
> > > > > > > >     extra->flags |= VRING_DESC_F_PREDMA;
> > > > > > > >
> > > > > > > > But the indirect desc does not have a corresponding extra,
> > > > > > > > so the second and third patches of this patch set are to
> > > > > > > > allocate the corresponding extra while allocating the
> > > > > > > > indirect desc. Each desc must have a corresponding extra
> > > > > > > > because it is possible in an sgs some are advance DMA, while
> > > > > > > > others are virtual addresses. So we must
> > > > > allocate an extra for each indirect desc.
> > > > > > >
> > > > > > >
> > > > > > > I didn't realize AF_XDP didn't have space to stuff the header into.
> > > > > > > Jason, is that expected?
> > > > > >
> > > > > > I might be wrong but it looks to me AF_XDP allows to reserve
> > > > > > sufficient headroom via xdp_umem_reg_v1.
> > > > > >
> > > > >
> > > > > I understand that there is a headroom for receiving packages,
> > > > > which can be used to put virtio headers. But there is no headroom
> > > > > defined in the direction of sending packages. I hope Magnus
> > > > > Karlsson can help confirm whether there is any misunderstanding.
> > > >
> > > > You can specify the amount of headroom you want on Tx by adjusting the
> > "addr" field in the descriptor of the Tx ring. If your chunk starts at address X
> > in the umem and you want 128 bytes of headroom, just write your packet
> > into X+128 and put that address into the Tx descriptor. Will this solve your
> > problem? If not, what would you need from AF_XDP to make it work?
> > > >
> > > > On Rx, there is always 256 bytes worth of headroom as specified by XDP.
> > If you need extra, you can set the headroom variable when you register the
> > umem.
> > >
> > > The driver of virtio net, when passing the packet to the hardware,
> > > should add a virtnet hdr (12 bytes) in front of the packet. Both rx
> > > and tx should add such a header. AF_XDP has a space of 256 bytes in
> > > the rx process. We can reuse this space. The direction of AF_XDP tx has no
> > such regulation.
> > >
> > > The method you mentioned requires user cooperation, which is not a
> > > good method for driver implementation.
> > 
> > This will result in a non-portable userspace program. I wonder why TX has
> > become a problem here actually, anyhow we can use a dedicated sg for vnet
> > hdr? And if we packed all vnet headers in an array it will give less
> > performance impact.
> 
> Yes, it would be great if you could put this somewhere else 😊. Especially since I do not see a way out of the problem of the driver requiring headroom on Tx, since this is completely left up to user-space and user-space packet offset == kernel-space packet offset within the chunk, always. The only way to change that is through copying, but then it is not zero-copy anymore. One could wish there was a mechanism of communicating this requirement to user-space, but introducing that now would make any existing app not work for virtio-net. So probably easier if you could come up with a solution not requiring headroom for the driver. If not, please let me know.

We can put the headroom in a separate buffer with a gather list. Gather
list will always be slower than making it linear though. virtio is
likely not unique in that it works better with a headroom.

So it could be an optional thing where we handle a gather list too,
let's say it's not a requirement but a performance hint.
XDP generally is not too stingy with memory, so maybe just
ask apps to preferably add a couple of hundred bytes headspace
if they can, and then it's portable across drivers.
And virtio will legacy where that space is not there.
How does this sound?



> > Thanks
> > 
> > >
> > > Thanks.
> > >
> > > >
> > > > > It would be best if we could not use indirect.
> > > > >
> > > > > Thanks.
> > > > >
> > > > > > > It would be best to fix that, performance is best if header is
> > > > > > > linear with the data ...
> > > > > >
> > > > > > This looks like a must otherwise we may meet trouble in zerocopy
> > receive.
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > > > Or maybe we can reduce the use of indirect somewhat, at least
> > > > > > > while the ring is mostly empty?
> > > > > > >
> > > > > > > > Xuan Zhuo (6):
> > > > > > > >   virtio: rename vring_unmap_state_packed() to
> > > > > > > >     vring_unmap_extra_packed()
> > > > > > > >   virtio: split: alloc indirect desc with extra
> > > > > > > >   virtio: packed: alloc indirect desc with extra
> > > > > > > >   virtio: split: virtqueue_add_split() support dma address
> > > > > > > >   virtio: packed: virtqueue_add_packed() support dma address
> > > > > > > >   virtio: add api virtio_dma_map() for advance dma
> > > > > > > >
> > > > > > > >  drivers/virtio/virtio_ring.c | 387 ++++++++++++++++++++---------
> > ------
> > > > > > > >  include/linux/virtio.h       |   9 +
> > > > > > > >  2 files changed, 232 insertions(+), 164 deletions(-)
> > > > > > > >
> > > > > > > > --
> > > > > > > > 2.31.0
> > > > > > >
> > > > > >
> > >
> 

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: RE: [PATCH 0/6] virtio: support advance DMA
  2022-01-11 15:46       ` Michael S. Tsirkin
@ 2022-01-12  2:11         ` Xuan Zhuo
  0 siblings, 0 replies; 5+ messages in thread
From: Xuan Zhuo @ 2022-01-12  2:11 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: magnus.karlsson, virtualization

On Tue, 11 Jan 2022 10:46:41 -0500, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Tue, Jan 11, 2022 at 08:40:56AM +0000, Karlsson, Magnus wrote:
> >
> >
> > > -----Original Message-----
> > > From: Jason Wang <jasowang@redhat.com>
> > > Sent: Tuesday, January 11, 2022 9:26 AM
> > > To: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > Cc: Karlsson, Magnus <magnus.karlsson@intel.com>; virtualization
> > > <virtualization@lists.linux-foundation.org>; Michael S.Tsirkin
> > > <mst@redhat.com>
> > > Subject: Re: RE: [PATCH 0/6] virtio: support advance DMA
> > >
> > > On Tue, Jan 11, 2022 at 4:17 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > wrote:
> > > >
> > > > On Tue, 11 Jan 2022 08:04:05 +0000, Karlsson, Magnus
> > > <magnus.karlsson@intel.com> wrote:
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > > Sent: Tuesday, January 11, 2022 7:17 AM
> > > > > > To: Jason Wang <jasowang@redhat.com>
> > > > > > Cc: virtualization <virtualization@lists.linux-foundation.org>;
> > > > > > Michael S.Tsirkin <mst@redhat.com>; Karlsson, Magnus
> > > > > > <magnus.karlsson@intel.com>
> > > > > > Subject: Re: [PATCH 0/6] virtio: support advance DMA
> > > > > >
> > > > > > On Tue, 11 Jan 2022 10:54:45 +0800, Jason Wang
> > > > > > <jasowang@redhat.com>
> > > > > > wrote:
> > > > > > > On Mon, Jan 10, 2022 at 5:59 PM Michael S. Tsirkin
> > > > > > > <mst@redhat.com>
> > > > > > wrote:
> > > > > > > >
> > > > > > > > On Fri, Jan 07, 2022 at 02:33:00PM +0800, Xuan Zhuo wrote:
> > > > > > > > > virtqueue_add() only supports virtual addresses, dma is
> > > > > > > > > completed in virtqueue_add().
> > > > > > > > >
> > > > > > > > > In some scenarios (such as the AF_XDP scenario), DMA is
> > > > > > > > > completed in advance, so it is necessary for us to support
> > > > > > > > > passing the DMA address
> > > > > > to virtqueue_add().
> > > > > > > > >
> > > > > > > > > This patch set stipulates that if sg->dma_address is not
> > > > > > > > > NULL, use this address as the DMA address. And record this
> > > > > > > > > information in
> > > > > > > > > extra->flags, which can be skipped when executing dma unmap.
> > > > > > > > >
> > > > > > > > >     extra->flags |= VRING_DESC_F_PREDMA;
> > > > > > > > >
> > > > > > > > > But the indirect desc does not have a corresponding extra,
> > > > > > > > > so the second and third patches of this patch set are to
> > > > > > > > > allocate the corresponding extra while allocating the
> > > > > > > > > indirect desc. Each desc must have a corresponding extra
> > > > > > > > > because it is possible in an sgs some are advance DMA, while
> > > > > > > > > others are virtual addresses. So we must
> > > > > > allocate an extra for each indirect desc.
> > > > > > > >
> > > > > > > >
> > > > > > > > I didn't realize AF_XDP didn't have space to stuff the header into.
> > > > > > > > Jason, is that expected?
> > > > > > >
> > > > > > > I might be wrong but it looks to me AF_XDP allows to reserve
> > > > > > > sufficient headroom via xdp_umem_reg_v1.
> > > > > > >
> > > > > >
> > > > > > I understand that there is a headroom for receiving packages,
> > > > > > which can be used to put virtio headers. But there is no headroom
> > > > > > defined in the direction of sending packages. I hope Magnus
> > > > > > Karlsson can help confirm whether there is any misunderstanding.
> > > > >
> > > > > You can specify the amount of headroom you want on Tx by adjusting the
> > > "addr" field in the descriptor of the Tx ring. If your chunk starts at address X
> > > in the umem and you want 128 bytes of headroom, just write your packet
> > > into X+128 and put that address into the Tx descriptor. Will this solve your
> > > problem? If not, what would you need from AF_XDP to make it work?
> > > > >
> > > > > On Rx, there is always 256 bytes worth of headroom as specified by XDP.
> > > If you need extra, you can set the headroom variable when you register the
> > > umem.
> > > >
> > > > The driver of virtio net, when passing the packet to the hardware,
> > > > should add a virtnet hdr (12 bytes) in front of the packet. Both rx
> > > > and tx should add such a header. AF_XDP has a space of 256 bytes in
> > > > the rx process. We can reuse this space. The direction of AF_XDP tx has no
> > > such regulation.
> > > >
> > > > The method you mentioned requires user cooperation, which is not a
> > > > good method for driver implementation.
> > >
> > > This will result in a non-portable userspace program. I wonder why TX has
> > > become a problem here actually, anyhow we can use a dedicated sg for vnet
> > > hdr? And if we packed all vnet headers in an array it will give less
> > > performance impact.
> >
> > Yes, it would be great if you could put this somewhere else 😊. Especially since I do not see a way out of the problem of the driver requiring headroom on Tx, since this is completely left up to user-space and user-space packet offset == kernel-space packet offset within the chunk, always. The only way to change that is through copying, but then it is not zero-copy anymore. One could wish there was a mechanism of communicating this requirement to user-space, but introducing that now would make any existing app not work for virtio-net. So probably easier if you could come up with a solution not requiring headroom for the driver. If not, please let me know.
>
> We can put the headroom in a separate buffer with a gather list. Gather
> list will always be slower than making it linear though. virtio is
> likely not unique in that it works better with a headroom.
>
> So it could be an optional thing where we handle a gather list too,
> let's say it's not a requirement but a performance hint.
> XDP generally is not too stingy with memory, so maybe just
> ask apps to preferably add a couple of hundred bytes headspace
> if they can, and then it's portable across drivers.
> And virtio will legacy where that space is not there.
> How does this sound?


I think the same way, if the application leaves us enough space, then we can use
this space to place the vnet hdr. If not, use a separate buffer.

The problem now is that AF_XDP needs a general mechanism to inform the driver
that there is a space in front of the packet.

This actually led me to some other thoughts, AF_XDP should pass more information
to the driver, such as using csum and other hardware offload.

When xdp_desc.options has a specified flag, addr points to not packet, but a
struct

struct {
	int offset;  // packet offset with addr
	int csum_start;
	int csum_offset;
	.......
}

@Karlsson, Magnus

Thanks.

>
>
>
> > > Thanks
> > >
> > > >
> > > > Thanks.
> > > >
> > > > >
> > > > > > It would be best if we could not use indirect.
> > > > > >
> > > > > > Thanks.
> > > > > >
> > > > > > > > It would be best to fix that, performance is best if header is
> > > > > > > > linear with the data ...
> > > > > > >
> > > > > > > This looks like a must otherwise we may meet trouble in zerocopy
> > > receive.
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > > > Or maybe we can reduce the use of indirect somewhat, at least
> > > > > > > > while the ring is mostly empty?
> > > > > > > >
> > > > > > > > > Xuan Zhuo (6):
> > > > > > > > >   virtio: rename vring_unmap_state_packed() to
> > > > > > > > >     vring_unmap_extra_packed()
> > > > > > > > >   virtio: split: alloc indirect desc with extra
> > > > > > > > >   virtio: packed: alloc indirect desc with extra
> > > > > > > > >   virtio: split: virtqueue_add_split() support dma address
> > > > > > > > >   virtio: packed: virtqueue_add_packed() support dma address
> > > > > > > > >   virtio: add api virtio_dma_map() for advance dma
> > > > > > > > >
> > > > > > > > >  drivers/virtio/virtio_ring.c | 387 ++++++++++++++++++++---------
> > > ------
> > > > > > > > >  include/linux/virtio.h       |   9 +
> > > > > > > > >  2 files changed, 232 insertions(+), 164 deletions(-)
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > 2.31.0
> > > > > > > >
> > > > > > >
> > > >
> >
>
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-01-12  2:24 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <MW3PR11MB46029D413C9A6C32C0E5B347F7519@MW3PR11MB4602.namprd11.prod.outlook.com>
2022-01-11  8:06 ` RE: [PATCH 0/6] virtio: support advance DMA Xuan Zhuo
2022-01-11  8:25   ` Jason Wang
2022-01-11  8:29     ` Xuan Zhuo
     [not found]     ` <MW3PR11MB46021E726D394F747E19D4C7F7519@MW3PR11MB4602.namprd11.prod.outlook.com>
2022-01-11 15:46       ` Michael S. Tsirkin
2022-01-12  2:11         ` Xuan Zhuo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.