bpf.vger.kernel.org archive mirror
From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
To: Jason Wang <jasowang@redhat.com>
Cc: virtualization@lists.linux-foundation.org,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Jesper Dangaard Brouer <hawk@kernel.org>,
	John Fastabend <john.fastabend@gmail.com>,
	netdev@vger.kernel.org, bpf@vger.kernel.org
Subject: Re: [PATCH vhost v10 10/10] virtio_net: support dma premapped
Date: Tue, 27 Jun 2023 17:23:27 +0800
Message-ID: <1687857807.2478845-8-xuanzhuo@linux.alibaba.com>
In-Reply-To: <CACGkMEsyP7bxOchyaKPb=y+td=1F34NwxxP3atyNBwFAtNOsxw@mail.gmail.com>

On Tue, 27 Jun 2023 16:03:35 +0800, Jason Wang <jasowang@redhat.com> wrote:
> On Fri, Jun 2, 2023 at 5:22 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> >
> > Introduce the module param "experiment_premapped" to let virtio-net
> > do the dma mapping itself.
> >
> > If it is true, the vqs of virtio-net are in premapped mode: the virtio
> > core just handles sgs that already carry a dma_address, and the driver
> > must get the dma address of the buffer to unmap it after getting the
> > buffer back from the virtio core.
> >
> > That will be useful when AF_XDP is enabled: AF_XDP tx and the kernel
> > packet xmit will share the tx queue, so the skb xmit must support the
> > premapped mode.
> >
> > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > ---
> >  drivers/net/virtio_net.c | 163 +++++++++++++++++++++++++++++++++------
> >  1 file changed, 141 insertions(+), 22 deletions(-)
> >
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index 2396c28c0122..5898212fcb3c 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -26,10 +26,11 @@
> >  static int napi_weight = NAPI_POLL_WEIGHT;
> >  module_param(napi_weight, int, 0444);
> >
> > -static bool csum = true, gso = true, napi_tx = true;
> > +static bool csum = true, gso = true, napi_tx = true, experiment_premapped;
> >  module_param(csum, bool, 0444);
> >  module_param(gso, bool, 0444);
> >  module_param(napi_tx, bool, 0644);
> > +module_param(experiment_premapped, bool, 0644);
>
> Having a module parameter is sub-optimal. I think we can demonstrate
> real benefit:
>
> In the case of a merge rx buffer, if the mapping is done by the
> virtio-core, it needs to be done per buffer (< PAGE_SIZE).
>
> But if it is done by virtio-net, we have a chance to map the
> buffer per page, which can save a lot of mappings and unmappings. A lot
> of other optimizations could be done on top as well.


Good point.

Thanks
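
To make the saving concrete, here is a minimal sketch (plain user-space C,
not driver code) that counts map operations for the two strategies. The
helper names, and the assumption that receive buffers are carved back to
back from a single page, are illustrative only and are not taken from the
patch.

```c
#include <stddef.h>

/* Hypothetical helper: the virtio core maps every buffer on its own,
 * so the number of map operations equals the number of buffers. */
static size_t maps_per_buffer(size_t nbufs)
{
	return nbufs;
}

/* Hypothetical helper: the driver maps a whole page once and carves
 * buffers out of it, so one map operation covers every buffer that
 * fits in that page. */
static size_t maps_per_page(size_t nbufs, size_t buf_len, size_t page_size)
{
	size_t bufs_per_page;

	/* Degenerate sizes: fall back to one mapping per buffer. */
	if (buf_len == 0 || buf_len > page_size)
		return nbufs;

	bufs_per_page = page_size / buf_len;

	/* Round up: a partially filled page still needs its own mapping. */
	return (nbufs + bufs_per_page - 1) / bufs_per_page;
}
```

For example, with 256 buffers of 1536 bytes and a 4096-byte page,
maps_per_buffer(256) is 256 while maps_per_page(256, 1536, 4096) is 128.
The real gain depends on how the driver allocates its buffers, which this
sketch does not model.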


>
> If we manage to prove this, we don't need any experimental module
> parameters at all.
>
> Thanks
>
>
> >
> >  /* FIXME: MTU in config. */
> >  #define GOOD_PACKET_LEN (ETH_HLEN + VLAN_HLEN + ETH_DATA_LEN)
> > @@ -142,6 +143,9 @@ struct send_queue {
> >
> >         /* Record whether sq is in reset state. */
> >         bool reset;
> > +
> > +       /* The vq is in premapped mode. */
> > +       bool premapped;
> >  };
> >
> >  /* Internal representation of a receive virtqueue */
> > @@ -174,6 +178,9 @@ struct receive_queue {
> >         char name[16];
> >
> >         struct xdp_rxq_info xdp_rxq;
> > +
> > +       /* The vq is in premapped mode. */
> > +       bool premapped;
> >  };
> >
> >  /* This structure can contain rss message with maximum settings for indirection table and keysize
> > @@ -546,6 +553,105 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
> >         return skb;
> >  }
> >
> > +static int virtnet_generic_unmap(struct virtqueue *vq, struct virtqueue_detach_cursor *cursor)
> > +{
> > +       enum dma_data_direction dir;
> > +       dma_addr_t addr;
> > +       u32 len;
> > +       int err;
> > +
> > +       do {
> > +               err = virtqueue_detach(vq, cursor, &addr, &len, &dir);
> > +               if (!err || err == -EAGAIN)
> > +                       dma_unmap_page_attrs(virtqueue_dma_dev(vq), addr, len, dir, 0);
> > +
> > +       } while (err == -EAGAIN);
> > +
> > +       return err;
> > +}
> > +
> > +static void *virtnet_detach_unused_buf(struct virtqueue *vq, bool premapped)
> > +{
> > +       struct virtqueue_detach_cursor cursor;
> > +       void *buf;
> > +
> > +       if (!premapped)
> > +               return virtqueue_detach_unused_buf(vq);
> > +
> > +       buf = virtqueue_detach_unused_buf_premapped(vq, &cursor);
> > +       if (buf)
> > +               virtnet_generic_unmap(vq, &cursor);
> > +
> > +       return buf;
> > +}
> > +
> > +static void *virtnet_get_buf_ctx(struct virtqueue *vq, bool premapped, u32 *len, void **ctx)
> > +{
> > +       struct virtqueue_detach_cursor cursor;
> > +       void *buf;
> > +
> > +       if (!premapped)
> > +               return virtqueue_get_buf_ctx(vq, len, ctx);
> > +
> > +       buf = virtqueue_get_buf_premapped(vq, len, ctx, &cursor);
> > +       if (buf)
> > +               virtnet_generic_unmap(vq, &cursor);
> > +
> > +       return buf;
> > +}
> > +
> > +#define virtnet_rq_get_buf(rq, plen, pctx) \
> > +({ \
> > +       typeof(rq) _rq = (rq); \
> > +       virtnet_get_buf_ctx(_rq->vq, _rq->premapped, plen, pctx); \
> > +})
> > +
> > +#define virtnet_sq_get_buf(sq, plen, pctx) \
> > +({ \
> > +       typeof(sq) _sq = (sq); \
> > +       virtnet_get_buf_ctx(_sq->vq, _sq->premapped, plen, pctx); \
> > +})
> > +
> > +static int virtnet_add_sg(struct virtqueue *vq, bool premapped,
> > +                         struct scatterlist *sg, unsigned int num, bool out,
> > +                         void *data, void *ctx, gfp_t gfp)
> > +{
> > +       enum dma_data_direction dir;
> > +       struct device *dev;
> > +       int err, ret;
> > +
> > +       if (!premapped)
> > +               return virtqueue_add_sg(vq, sg, num, out, data, ctx, gfp);
> > +
> > +       dir = out ? DMA_TO_DEVICE : DMA_FROM_DEVICE;
> > +       dev = virtqueue_dma_dev(vq);
> > +
> > +       ret = dma_map_sg_attrs(dev, sg, num, dir, 0);
> > +       if (ret != num)
> > +               goto err;
> > +
> > +       err = virtqueue_add_sg(vq, sg, num, out, data, ctx, gfp);
> > +       if (err < 0)
> > +               goto err;
> > +
> > +       return 0;
> > +
> > +err:
> > +       dma_unmap_sg_attrs(dev, sg, num, dir, 0);
> > +       return -ENOMEM;
> > +}
> > +
> > +static int virtnet_add_outbuf(struct send_queue *sq, unsigned int num, void *data)
> > +{
> > +       return virtnet_add_sg(sq->vq, sq->premapped, sq->sg, num, true, data, NULL, GFP_ATOMIC);
> > +}
> > +
> > +static int virtnet_add_inbuf(struct receive_queue *rq, unsigned int num, void *data,
> > +                            void *ctx, gfp_t gfp)
> > +{
> > +       return virtnet_add_sg(rq->vq, rq->premapped, rq->sg, num, false, data, ctx, gfp);
> > +}
> > +
> >  static void free_old_xmit_skbs(struct send_queue *sq, bool in_napi)
> >  {
> >         unsigned int len;
> > @@ -553,7 +659,7 @@ static void free_old_xmit_skbs(struct send_queue *sq, bool in_napi)
> >         unsigned int bytes = 0;
> >         void *ptr;
> >
> > -       while ((ptr = virtqueue_get_buf(sq->vq, &len)) != NULL) {
> > +       while ((ptr = virtnet_sq_get_buf(sq, &len, NULL)) != NULL) {
> >                 if (likely(!is_xdp_frame(ptr))) {
> >                         struct sk_buff *skb = ptr;
> >
> > @@ -667,8 +773,7 @@ static int __virtnet_xdp_xmit_one(struct virtnet_info *vi,
> >                             skb_frag_size(frag), skb_frag_off(frag));
> >         }
> >
> > -       err = virtqueue_add_outbuf(sq->vq, sq->sg, nr_frags + 1,
> > -                                  xdp_to_ptr(xdpf), GFP_ATOMIC);
> > +       err = virtnet_add_outbuf(sq, nr_frags + 1, xdp_to_ptr(xdpf));
> >         if (unlikely(err))
> >                 return -ENOSPC; /* Caller handle free/refcnt */
> >
> > @@ -744,7 +849,7 @@ static int virtnet_xdp_xmit(struct net_device *dev,
> >         }
> >
> >         /* Free up any pending old buffers before queueing new ones. */
> > -       while ((ptr = virtqueue_get_buf(sq->vq, &len)) != NULL) {
> > +       while ((ptr = virtnet_sq_get_buf(sq, &len, NULL)) != NULL) {
> >                 if (likely(is_xdp_frame(ptr))) {
> >                         struct xdp_frame *frame = ptr_to_xdp(ptr);
> >
> > @@ -828,7 +933,7 @@ static struct page *xdp_linearize_page(struct receive_queue *rq,
> >                 void *buf;
> >                 int off;
> >
> > -               buf = virtqueue_get_buf(rq->vq, &buflen);
> > +               buf = virtnet_rq_get_buf(rq, &buflen, NULL);
> >                 if (unlikely(!buf))
> >                         goto err_buf;
> >
> > @@ -1119,7 +1224,7 @@ static int virtnet_build_xdp_buff_mrg(struct net_device *dev,
> >                 return -EINVAL;
> >
> >         while (--*num_buf > 0) {
> > -               buf = virtqueue_get_buf_ctx(rq->vq, &len, &ctx);
> > +               buf = virtnet_rq_get_buf(rq, &len, &ctx);
> >                 if (unlikely(!buf)) {
> >                         pr_debug("%s: rx error: %d buffers out of %d missing\n",
> >                                  dev->name, *num_buf,
> > @@ -1344,7 +1449,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
> >         while (--num_buf) {
> >                 int num_skb_frags;
> >
> > -               buf = virtqueue_get_buf_ctx(rq->vq, &len, &ctx);
> > +               buf = virtnet_rq_get_buf(rq, &len, &ctx);
> >                 if (unlikely(!buf)) {
> >                         pr_debug("%s: rx error: %d buffers out of %d missing\n",
> >                                  dev->name, num_buf,
> > @@ -1407,7 +1512,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
> >  err_skb:
> >         put_page(page);
> >         while (num_buf-- > 1) {
> > -               buf = virtqueue_get_buf(rq->vq, &len);
> > +               buf = virtnet_rq_get_buf(rq, &len, NULL);
> >                 if (unlikely(!buf)) {
> >                         pr_debug("%s: rx error: %d buffers missing\n",
> >                                  dev->name, num_buf);
> > @@ -1534,7 +1639,7 @@ static int add_recvbuf_small(struct virtnet_info *vi, struct receive_queue *rq,
> >         alloc_frag->offset += len;
> >         sg_init_one(rq->sg, buf + VIRTNET_RX_PAD + xdp_headroom,
> >                     vi->hdr_len + GOOD_PACKET_LEN);
> > -       err = virtqueue_add_inbuf_ctx(rq->vq, rq->sg, 1, buf, ctx, gfp);
> > +       err = virtnet_add_inbuf(rq, 1, buf, ctx, gfp);
> >         if (err < 0)
> >                 put_page(virt_to_head_page(buf));
> >         return err;
> > @@ -1581,8 +1686,8 @@ static int add_recvbuf_big(struct virtnet_info *vi, struct receive_queue *rq,
> >
> >         /* chain first in list head */
> >         first->private = (unsigned long)list;
> > -       err = virtqueue_add_inbuf(rq->vq, rq->sg, vi->big_packets_num_skbfrags + 2,
> > -                                 first, gfp);
> > +       err = virtnet_add_inbuf(rq, vi->big_packets_num_skbfrags + 2,
> > +                               first, NULL, gfp);
> >         if (err < 0)
> >                 give_pages(rq, first);
> >
> > @@ -1645,7 +1750,7 @@ static int add_recvbuf_mergeable(struct virtnet_info *vi,
> >
> >         sg_init_one(rq->sg, buf, len);
> >         ctx = mergeable_len_to_ctx(len + room, headroom);
> > -       err = virtqueue_add_inbuf_ctx(rq->vq, rq->sg, 1, buf, ctx, gfp);
> > +       err = virtnet_add_inbuf(rq, 1, buf, ctx, gfp);
> >         if (err < 0)
> >                 put_page(virt_to_head_page(buf));
> >
> > @@ -1768,13 +1873,13 @@ static int virtnet_receive(struct receive_queue *rq, int budget,
> >                 void *ctx;
> >
> >                 while (stats.packets < budget &&
> > -                      (buf = virtqueue_get_buf_ctx(rq->vq, &len, &ctx))) {
> > +                      (buf = virtnet_rq_get_buf(rq, &len, &ctx))) {
> >                         receive_buf(vi, rq, buf, len, ctx, xdp_xmit, &stats);
> >                         stats.packets++;
> >                 }
> >         } else {
> >                 while (stats.packets < budget &&
> > -                      (buf = virtqueue_get_buf(rq->vq, &len)) != NULL) {
> > +                      (buf = virtnet_rq_get_buf(rq, &len, NULL)) != NULL) {
> >                         receive_buf(vi, rq, buf, len, NULL, xdp_xmit, &stats);
> >                         stats.packets++;
> >                 }
> > @@ -1984,7 +2089,7 @@ static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
> >                         return num_sg;
> >                 num_sg++;
> >         }
> > -       return virtqueue_add_outbuf(sq->vq, sq->sg, num_sg, skb, GFP_ATOMIC);
> > +       return virtnet_add_outbuf(sq, num_sg, skb);
> >  }
> >
> >  static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
> > @@ -3552,15 +3657,17 @@ static void free_unused_bufs(struct virtnet_info *vi)
> >         int i;
> >
> >         for (i = 0; i < vi->max_queue_pairs; i++) {
> > -               struct virtqueue *vq = vi->sq[i].vq;
> > -               while ((buf = virtqueue_detach_unused_buf(vq)) != NULL)
> > -                       virtnet_sq_free_unused_buf(vq, buf);
> > +               struct send_queue *sq = &vi->sq[i];
> > +
> > +               while ((buf = virtnet_detach_unused_buf(sq->vq, sq->premapped)) != NULL)
> > +                       virtnet_sq_free_unused_buf(sq->vq, buf);
> >         }
> >
> >         for (i = 0; i < vi->max_queue_pairs; i++) {
> > -               struct virtqueue *vq = vi->rq[i].vq;
> > -               while ((buf = virtqueue_detach_unused_buf(vq)) != NULL)
> > -                       virtnet_rq_free_unused_buf(vq, buf);
> > +               struct receive_queue *rq = &vi->rq[i];
> > +
> > +               while ((buf = virtnet_detach_unused_buf(rq->vq, rq->premapped)) != NULL)
> > +                       virtnet_rq_free_unused_buf(rq->vq, buf);
> >         }
> >  }
> >
> > @@ -3658,6 +3765,18 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
> >                 vi->rq[i].vq = vqs[rxq2vq(i)];
> >                 vi->rq[i].min_buf_len = mergeable_min_buf_len(vi, vi->rq[i].vq);
> >                 vi->sq[i].vq = vqs[txq2vq(i)];
> > +
> > +               if (experiment_premapped) {
> > +                       if (!virtqueue_set_premapped(vi->rq[i].vq))
> > +                               vi->rq[i].premapped = true;
> > +                       else
> > +                               netdev_warn(vi->dev, "RXQ (%d) enable premapped failure.\n", i);
> > +
> > +                       if (!virtqueue_set_premapped(vi->sq[i].vq))
> > +                               vi->sq[i].premapped = true;
> > +                       else
> > +                               netdev_warn(vi->dev, "TXQ (%d) enable premapped failure.\n", i);
> > +               }
> >         }
> >
> >         /* run here: ret == 0. */
> > --
> > 2.32.0.3.g01195cf9f
> >
>
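
The unmap loop in virtnet_generic_unmap() above can be tried outside the
kernel with a small mock. This is only a sketch: struct mock_seg, struct
mock_cursor, mock_detach() and mock_generic_unmap() are hypothetical
stand-ins, and the return-value convention (-EAGAIN means "a segment was
returned and more follow", 0 means "the last segment was returned") is
inferred from the loop in the quoted patch, not from the virtio core API.
The -EINVAL for an exhausted cursor is an assumption of the mock.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for one DMA-mapped segment of a buffer. */
struct mock_seg {
	unsigned long addr;
	unsigned int len;
};

/* Hypothetical stand-in for struct virtqueue_detach_cursor. */
struct mock_cursor {
	const struct mock_seg *segs;
	size_t num;	/* total number of segments */
	size_t pos;	/* next segment to hand out */
};

/* Mock of virtqueue_detach(): hands out one segment per call.
 * Returns -EAGAIN if more segments follow, 0 for the last one,
 * -EINVAL (assumed) if the cursor is already exhausted. */
static int mock_detach(struct mock_cursor *c, unsigned long *addr,
		       unsigned int *len)
{
	if (c->pos >= c->num)
		return -EINVAL;

	*addr = c->segs[c->pos].addr;
	*len = c->segs[c->pos].len;
	c->pos++;

	return c->pos < c->num ? -EAGAIN : 0;
}

/* Same loop shape as virtnet_generic_unmap(); it counts segments and
 * bytes where the driver would call dma_unmap_page_attrs(). */
static int mock_generic_unmap(struct mock_cursor *c, size_t *unmapped,
			      unsigned int *bytes)
{
	unsigned long addr;
	unsigned int len;
	int err;

	do {
		err = mock_detach(c, &addr, &len);
		if (!err || err == -EAGAIN) {
			(*unmapped)++;
			*bytes += len;
		}
	} while (err == -EAGAIN);

	return err;
}
```

With a cursor over three segments of 12, 1500 and 64 bytes, one call to
mock_generic_unmap() visits all three segments, accounts for 1576 bytes
and returns 0. A second call on the same cursor returns the mock's
-EINVAL without touching the counters.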

Thread overview: 45+ messages
2023-06-02  9:21 [PATCH vhost v10 00/10] virtio core prepares for AF_XDP Xuan Zhuo
2023-06-02  9:21 ` [PATCH vhost v10 01/10] virtio_ring: put mapping error check in vring_map_one_sg Xuan Zhuo
2023-06-27  8:03   ` Jason Wang
2023-06-02  9:21 ` [PATCH vhost v10 02/10] virtio_ring: introduce virtqueue_set_premapped() Xuan Zhuo
2023-06-27  8:03   ` Jason Wang
2023-06-27  8:50     ` Xuan Zhuo
2023-06-27 14:56       ` Michael S. Tsirkin
2023-06-28  1:34         ` Xuan Zhuo
2023-06-02  9:21 ` [PATCH vhost v10 03/10] virtio_ring: split: support add premapped buf Xuan Zhuo
2023-06-27  8:03   ` Jason Wang
2023-06-27  9:01     ` Xuan Zhuo
2023-06-02  9:22 ` [PATCH vhost v10 04/10] virtio_ring: packed: " Xuan Zhuo
2023-06-27  8:03   ` Jason Wang
2023-06-27  9:05     ` Xuan Zhuo
2023-06-02  9:22 ` [PATCH vhost v10 05/10] virtio_ring: split-detach: support return dma info to driver Xuan Zhuo
2023-06-22 19:36   ` Michael S. Tsirkin
2023-06-25  2:10     ` Xuan Zhuo
2023-06-27  8:03   ` Jason Wang
2023-06-27  9:21     ` Xuan Zhuo
2023-06-02  9:22 ` [PATCH vhost v10 06/10] virtio_ring: packed-detach: " Xuan Zhuo
2023-06-02 11:40   ` Michael S. Tsirkin
2023-06-02  9:22 ` [PATCH vhost v10 07/10] virtio_ring: introduce helpers for premapped Xuan Zhuo
2023-06-04 13:45   ` Michael S. Tsirkin
2023-06-05  2:06     ` Xuan Zhuo
2023-06-05  5:38       ` Michael S. Tsirkin
2023-06-06  2:01         ` Xuan Zhuo
2023-06-22 19:29   ` Michael S. Tsirkin
2023-06-02  9:22 ` [PATCH vhost v10 08/10] virtio_ring: introduce virtqueue_dma_dev() Xuan Zhuo
2023-06-02  9:22 ` [PATCH vhost v10 09/10] virtio_ring: introduce virtqueue_add_sg() Xuan Zhuo
2023-06-02  9:22 ` [PATCH vhost v10 10/10] virtio_net: support dma premapped Xuan Zhuo
2023-06-03  6:31   ` Jakub Kicinski
2023-06-05  2:10     ` Xuan Zhuo
2023-06-05  5:44       ` Michael S. Tsirkin
2023-06-06  2:11         ` Xuan Zhuo
2023-06-22 12:15   ` Michael S. Tsirkin
2023-06-25  2:43     ` Xuan Zhuo
2023-06-27  8:03   ` Jason Wang
2023-06-27  9:23     ` Xuan Zhuo [this message]
2023-06-03  6:29 ` [PATCH vhost v10 00/10] virtio core prepares for AF_XDP Jakub Kicinski
2023-06-05  1:58   ` Xuan Zhuo
2023-06-07 14:05     ` Christoph Hellwig
2023-06-07 20:15       ` Michael S. Tsirkin
2023-06-21  6:42 ` Xuan Zhuo
2023-06-25  7:19   ` Jason Wang
2023-06-22 19:38 ` Michael S. Tsirkin
