virtualization.lists.linux-foundation.org archive mirror
* Re: [PATCH net-next v2 2/3] virtio-net: support IFF_TX_SKB_NO_LINEAR
From: Michael S. Tsirkin @ 2021-01-20  7:57 UTC
  To: Xuan Zhuo
  Cc: Song Liu, Martin KaFai Lau, linux-kernel, Jesper Dangaard Brouer,
	Daniel Borkmann, Alexander Lobakin, Yonghong Song,
	John Fastabend, Alexei Starovoitov, Andrii Nakryiko, netdev,
	Jonathan Lemon, KP Singh, Jakub Kicinski, bpf, bjorn.topel,
	virtualization, David S. Miller, Magnus Karlsson

On Wed, Jan 20, 2021 at 03:49:10PM +0800, Xuan Zhuo wrote:
> virtio-net can transmit skbs whose linear area is empty, so set
> IFF_TX_SKB_NO_LINEAR in priv_flags.
> 
> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>

Acked-by: Michael S. Tsirkin <mst@redhat.com>

> ---
>  drivers/net/virtio_net.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index ba8e637..f2ff6c3 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -2972,7 +2972,8 @@ static int virtnet_probe(struct virtio_device *vdev)
>  		return -ENOMEM;
>  
>  	/* Set up network device as normal. */
> -	dev->priv_flags |= IFF_UNICAST_FLT | IFF_LIVE_ADDR_CHANGE;
> +	dev->priv_flags |= IFF_UNICAST_FLT | IFF_LIVE_ADDR_CHANGE |
> +			   IFF_TX_SKB_NO_LINEAR;
>  	dev->netdev_ops = &virtnet_netdev;
>  	dev->features = NETIF_F_HIGHDMA;
>  
> -- 
> 1.8.3.1

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

* Re: [PATCH net-next v2 3/3] xsk: build skb by page
From: Michael S. Tsirkin @ 2021-01-20  8:10 UTC
  To: Xuan Zhuo
  Cc: Song Liu, Martin KaFai Lau, linux-kernel, Jesper Dangaard Brouer,
	Daniel Borkmann, Alexander Lobakin, Yonghong Song,
	John Fastabend, Alexei Starovoitov, Andrii Nakryiko, netdev,
	Jonathan Lemon, KP Singh, Jakub Kicinski, bpf, bjorn.topel,
	virtualization, David S. Miller, Magnus Karlsson

On Wed, Jan 20, 2021 at 03:50:01PM +0800, Xuan Zhuo wrote:
> This patch constructs the skb directly from pages to avoid the cost of
> copying memory.
> 
> The feature depends on IFF_TX_SKB_NO_LINEAR. Only a network device whose
> priv_flags includes IFF_TX_SKB_NO_LINEAR builds the skb directly from
> pages. If the device does not support this flag, the data must still be
> copied to construct the skb.
> 
> ---------------- Performance Testing ------------
> 
> The test environment is an Aliyun ECS server.
> Test command:
> ```
> xdpsock -i eth0 -t  -S -s <msg size>
> ```
> 
> Test result data:
> 
> size    64      512     1024    1500
> copy    1916747 1775988 1600203 1440054
> page    1974058 1953655 1945463 1904478
> percent 3.0%    10.0%   21.58%  32.3%
> 
> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> Reviewed-by: Dust Li <dust.li@linux.alibaba.com>

I can't see the cover letter or patch 1/3 of this series. Was it
perhaps threaded incorrectly?


> ---
>  net/xdp/xsk.c | 104 ++++++++++++++++++++++++++++++++++++++++++++++++----------
>  1 file changed, 86 insertions(+), 18 deletions(-)
> 
> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> index 8037b04..817a3a5 100644
> --- a/net/xdp/xsk.c
> +++ b/net/xdp/xsk.c
> @@ -430,6 +430,87 @@ static void xsk_destruct_skb(struct sk_buff *skb)
>  	sock_wfree(skb);
>  }
>  
> +static struct sk_buff *xsk_build_skb_zerocopy(struct xdp_sock *xs,
> +					      struct xdp_desc *desc)
> +{
> +	u32 len, offset, copy, copied;
> +	struct sk_buff *skb;
> +	struct page *page;
> +	char *buffer;

Actually, make this void *; that way you will not need
casts down the road. I know this comes from xsk_generic_xmit, and
I don't know why it's char * there, either.

> +	int err, i;
> +	u64 addr;
> +
> +	skb = sock_alloc_send_skb(&xs->sk, 0, 1, &err);
> +	if (unlikely(!skb))
> +		return ERR_PTR(err);
> +
> +	addr = desc->addr;
> +	len = desc->len;
> +
> +	buffer = xsk_buff_raw_get_data(xs->pool, addr);
> +	offset = offset_in_page(buffer);
> +	addr = buffer - (char *)xs->pool->addrs;
> +
> +	for (copied = 0, i = 0; copied < len; i++) {
> +		page = xs->pool->umem->pgs[addr >> PAGE_SHIFT];
> +
> +		get_page(page);
> +
> +		copy = min_t(u32, PAGE_SIZE - offset, len - copied);
> +
> +		skb_fill_page_desc(skb, i, page, offset, copy);
> +
> +		copied += copy;
> +		addr += copy;
> +		offset = 0;
> +	}
> +
> +	skb->len += len;
> +	skb->data_len += len;
> +	skb->truesize += len;
> +
> +	refcount_add(len, &xs->sk.sk_wmem_alloc);
> +
> +	return skb;
> +}
> +
> +static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
> +				     struct xdp_desc *desc)
> +{
> +	struct sk_buff *skb = NULL;
> +
> +	if (xs->dev->priv_flags & IFF_TX_SKB_NO_LINEAR) {
> +		skb = xsk_build_skb_zerocopy(xs, desc);
> +		if (IS_ERR(skb))
> +			return skb;
> +	} else {
> +		char *buffer;
> +		u32 len;
> +		int err;
> +
> +		len = desc->len;
> +		skb = sock_alloc_send_skb(&xs->sk, len, 1, &err);
> +		if (unlikely(!skb))
> +			return ERR_PTR(err);
> +
> +		skb_put(skb, len);
> +		buffer = xsk_buff_raw_get_data(xs->pool, desc->addr);
> +		err = skb_store_bits(skb, 0, buffer, len);
> +		if (unlikely(err)) {
> +			kfree_skb(skb);
> +			return ERR_PTR(err);
> +		}
> +	}
> +
> +	skb->dev = xs->dev;
> +	skb->priority = xs->sk.sk_priority;
> +	skb->mark = xs->sk.sk_mark;
> +	skb_shinfo(skb)->destructor_arg = (void *)(long)desc->addr;
> +	skb->destructor = xsk_destruct_skb;
> +
> +	return skb;
> +}
> +
>  static int xsk_generic_xmit(struct sock *sk)
>  {
>  	struct xdp_sock *xs = xdp_sk(sk);
> @@ -446,43 +527,30 @@ static int xsk_generic_xmit(struct sock *sk)
>  		goto out;
>  
>  	while (xskq_cons_peek_desc(xs->tx, &desc, xs->pool)) {
> -		char *buffer;
> -		u64 addr;
> -		u32 len;
> -
>  		if (max_batch-- == 0) {
>  			err = -EAGAIN;
>  			goto out;
>  		}
>  
> -		len = desc.len;
> -		skb = sock_alloc_send_skb(sk, len, 1, &err);
> -		if (unlikely(!skb))
> +		skb = xsk_build_skb(xs, &desc);
> +		if (IS_ERR(skb)) {
> +			err = PTR_ERR(skb);
>  			goto out;
> +		}
>  
> -		skb_put(skb, len);
> -		addr = desc.addr;
> -		buffer = xsk_buff_raw_get_data(xs->pool, addr);
> -		err = skb_store_bits(skb, 0, buffer, len);
>  		/* This is the backpressure mechanism for the Tx path.
>  		 * Reserve space in the completion queue and only proceed
>  		 * if there is space in it. This avoids having to implement
>  		 * any buffering in the Tx path.
>  		 */
>  		spin_lock_irqsave(&xs->pool->cq_lock, flags);
> -		if (unlikely(err) || xskq_prod_reserve(xs->pool->cq)) {
> +		if (xskq_prod_reserve(xs->pool->cq)) {
>  			spin_unlock_irqrestore(&xs->pool->cq_lock, flags);
>  			kfree_skb(skb);
>  			goto out;
>  		}
>  		spin_unlock_irqrestore(&xs->pool->cq_lock, flags);
>  
> -		skb->dev = xs->dev;
> -		skb->priority = sk->sk_priority;
> -		skb->mark = sk->sk_mark;
> -		skb_shinfo(skb)->destructor_arg = (void *)(long)desc.addr;
> -		skb->destructor = xsk_destruct_skb;
> -
>  		err = __dev_direct_xmit(skb, xs->queue_id);
>  		if  (err == NETDEV_TX_BUSY) {
>  			/* Tell user-space to retry the send */
> -- 
> 1.8.3.1

* Re: [PATCH net-next v2 3/3] xsk: build skb by page
From: Michael S. Tsirkin @ 2021-01-20  8:11 UTC
  To: Xuan Zhuo
  Cc: Song Liu, Martin KaFai Lau, linux-kernel, Jesper Dangaard Brouer,
	Daniel Borkmann, Alexander Lobakin, Yonghong Song,
	John Fastabend, Alexei Starovoitov, Andrii Nakryiko, netdev,
	Jonathan Lemon, KP Singh, Jakub Kicinski, bpf, bjorn.topel,
	virtualization, David S. Miller, Magnus Karlsson

On Wed, Jan 20, 2021 at 03:11:04AM -0500, Michael S. Tsirkin wrote:
> On Wed, Jan 20, 2021 at 03:50:01PM +0800, Xuan Zhuo wrote:
> > This patch constructs the skb directly from pages to avoid the cost of
> > copying memory.
> > 
> > The feature depends on IFF_TX_SKB_NO_LINEAR. Only a network device whose
> > priv_flags includes IFF_TX_SKB_NO_LINEAR builds the skb directly from
> > pages. If the device does not support this flag, the data must still be
> > copied to construct the skb.
> > 
> > ---------------- Performance Testing ------------
> > 
> > The test environment is an Aliyun ECS server.
> > Test command:
> > ```
> > xdpsock -i eth0 -t  -S -s <msg size>
> > ```
> > 
> > Test result data:
> > 
> > size    64      512     1024    1500
> > copy    1916747 1775988 1600203 1440054
> > page    1974058 1953655 1945463 1904478
> > percent 3.0%    10.0%   21.58%  32.3%
> > 
> > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
> 
> I can't see the cover letter or patch 1/3 of this series. Was it
> perhaps threaded incorrectly?


Hmm, I looked again and now I do see them. My mistake, please ignore.

> 
> > ---
> >  net/xdp/xsk.c | 104 ++++++++++++++++++++++++++++++++++++++++++++++++----------
> >  1 file changed, 86 insertions(+), 18 deletions(-)
> > 
> > diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> > index 8037b04..817a3a5 100644
> > --- a/net/xdp/xsk.c
> > +++ b/net/xdp/xsk.c
> > @@ -430,6 +430,87 @@ static void xsk_destruct_skb(struct sk_buff *skb)
> >  	sock_wfree(skb);
> >  }
> >  
> > +static struct sk_buff *xsk_build_skb_zerocopy(struct xdp_sock *xs,
> > +					      struct xdp_desc *desc)
> > +{
> > +	u32 len, offset, copy, copied;
> > +	struct sk_buff *skb;
> > +	struct page *page;
> > +	char *buffer;
> 
> Actually, make this void *; that way you will not need
> casts down the road. I know this comes from xsk_generic_xmit, and
> I don't know why it's char * there, either.
> 
> > +	int err, i;
> > +	u64 addr;
> > +
> > +	skb = sock_alloc_send_skb(&xs->sk, 0, 1, &err);
> > +	if (unlikely(!skb))
> > +		return ERR_PTR(err);
> > +
> > +	addr = desc->addr;
> > +	len = desc->len;
> > +
> > +	buffer = xsk_buff_raw_get_data(xs->pool, addr);
> > +	offset = offset_in_page(buffer);
> > +	addr = buffer - (char *)xs->pool->addrs;
> > +
> > +	for (copied = 0, i = 0; copied < len; i++) {
> > +		page = xs->pool->umem->pgs[addr >> PAGE_SHIFT];
> > +
> > +		get_page(page);
> > +
> > +		copy = min_t(u32, PAGE_SIZE - offset, len - copied);
> > +
> > +		skb_fill_page_desc(skb, i, page, offset, copy);
> > +
> > +		copied += copy;
> > +		addr += copy;
> > +		offset = 0;
> > +	}
> > +
> > +	skb->len += len;
> > +	skb->data_len += len;
> > +	skb->truesize += len;
> > +
> > +	refcount_add(len, &xs->sk.sk_wmem_alloc);
> > +
> > +	return skb;
> > +}
> > +
> > +static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
> > +				     struct xdp_desc *desc)
> > +{
> > +	struct sk_buff *skb = NULL;
> > +
> > +	if (xs->dev->priv_flags & IFF_TX_SKB_NO_LINEAR) {
> > +		skb = xsk_build_skb_zerocopy(xs, desc);
> > +		if (IS_ERR(skb))
> > +			return skb;
> > +	} else {
> > +		char *buffer;
> > +		u32 len;
> > +		int err;
> > +
> > +		len = desc->len;
> > +		skb = sock_alloc_send_skb(&xs->sk, len, 1, &err);
> > +		if (unlikely(!skb))
> > +			return ERR_PTR(err);
> > +
> > +		skb_put(skb, len);
> > +		buffer = xsk_buff_raw_get_data(xs->pool, desc->addr);
> > +		err = skb_store_bits(skb, 0, buffer, len);
> > +		if (unlikely(err)) {
> > +			kfree_skb(skb);
> > +			return ERR_PTR(err);
> > +		}
> > +	}
> > +
> > +	skb->dev = xs->dev;
> > +	skb->priority = xs->sk.sk_priority;
> > +	skb->mark = xs->sk.sk_mark;
> > +	skb_shinfo(skb)->destructor_arg = (void *)(long)desc->addr;
> > +	skb->destructor = xsk_destruct_skb;
> > +
> > +	return skb;
> > +}
> > +
> >  static int xsk_generic_xmit(struct sock *sk)
> >  {
> >  	struct xdp_sock *xs = xdp_sk(sk);
> > @@ -446,43 +527,30 @@ static int xsk_generic_xmit(struct sock *sk)
> >  		goto out;
> >  
> >  	while (xskq_cons_peek_desc(xs->tx, &desc, xs->pool)) {
> > -		char *buffer;
> > -		u64 addr;
> > -		u32 len;
> > -
> >  		if (max_batch-- == 0) {
> >  			err = -EAGAIN;
> >  			goto out;
> >  		}
> >  
> > -		len = desc.len;
> > -		skb = sock_alloc_send_skb(sk, len, 1, &err);
> > -		if (unlikely(!skb))
> > +		skb = xsk_build_skb(xs, &desc);
> > +		if (IS_ERR(skb)) {
> > +			err = PTR_ERR(skb);
> >  			goto out;
> > +		}
> >  
> > -		skb_put(skb, len);
> > -		addr = desc.addr;
> > -		buffer = xsk_buff_raw_get_data(xs->pool, addr);
> > -		err = skb_store_bits(skb, 0, buffer, len);
> >  		/* This is the backpressure mechanism for the Tx path.
> >  		 * Reserve space in the completion queue and only proceed
> >  		 * if there is space in it. This avoids having to implement
> >  		 * any buffering in the Tx path.
> >  		 */
> >  		spin_lock_irqsave(&xs->pool->cq_lock, flags);
> > -		if (unlikely(err) || xskq_prod_reserve(xs->pool->cq)) {
> > +		if (xskq_prod_reserve(xs->pool->cq)) {
> >  			spin_unlock_irqrestore(&xs->pool->cq_lock, flags);
> >  			kfree_skb(skb);
> >  			goto out;
> >  		}
> >  		spin_unlock_irqrestore(&xs->pool->cq_lock, flags);
> >  
> > -		skb->dev = xs->dev;
> > -		skb->priority = sk->sk_priority;
> > -		skb->mark = sk->sk_mark;
> > -		skb_shinfo(skb)->destructor_arg = (void *)(long)desc.addr;
> > -		skb->destructor = xsk_destruct_skb;
> > -
> >  		err = __dev_direct_xmit(skb, xs->queue_id);
> >  		if  (err == NETDEV_TX_BUSY) {
> >  			/* Tell user-space to retry the send */
> > -- 
> > 1.8.3.1
