netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] xsk: add cq event
@ 2020-11-16  8:10 Xuan Zhuo
  2020-11-16  9:13 ` Denis Kirjanov
  2020-11-16 14:31 ` Björn Töpel
  0 siblings, 2 replies; 6+ messages in thread
From: Xuan Zhuo @ 2020-11-16  8:10 UTC (permalink / raw)
  To: netdev
  Cc: Xuan Zhuo, Björn Töpel, Magnus Karlsson,
	Jonathan Lemon, David S. Miller, Jakub Kicinski,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, bpf, linux-kernel

When we write all cq items to tx, we have to wait for a new event based
on poll to indicate that it is writable. But the current writability is
triggered based on whether tx is full or not, and In fact, when tx is
dissatisfied, the user of cq's item may not necessarily get it, because it
may still be occupied by the network card. In this case, we need to know
when cq is available, so this patch adds a socket option, When the user
configures this option using setsockopt, when cq is available, a
readable event is generated for all xsk bound to this umem.

I can't find a better description of this event,
I think it can also be 'readable', although it is indeed different from
the 'readable' of the new data. But the overhead of xsk checking whether
cq or rx is readable is small.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
 include/net/xdp_sock.h      |  1 +
 include/uapi/linux/if_xdp.h |  1 +
 net/xdp/xsk.c               | 28 ++++++++++++++++++++++++++++
 3 files changed, 30 insertions(+)

diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h
index 1a9559c..faf5b1a 100644
--- a/include/net/xdp_sock.h
+++ b/include/net/xdp_sock.h
@@ -49,6 +49,7 @@ struct xdp_sock {
 	struct xsk_buff_pool *pool;
 	u16 queue_id;
 	bool zc;
+	bool cq_event;
 	enum {
 		XSK_READY = 0,
 		XSK_BOUND,
diff --git a/include/uapi/linux/if_xdp.h b/include/uapi/linux/if_xdp.h
index a78a809..2dba3cb 100644
--- a/include/uapi/linux/if_xdp.h
+++ b/include/uapi/linux/if_xdp.h
@@ -63,6 +63,7 @@ struct xdp_mmap_offsets {
 #define XDP_UMEM_COMPLETION_RING	6
 #define XDP_STATISTICS			7
 #define XDP_OPTIONS			8
+#define XDP_CQ_EVENT			9
 
 struct xdp_umem_reg {
 	__u64 addr; /* Start of packet data area */
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index cfbec39..0c53403 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -285,7 +285,16 @@ void __xsk_map_flush(void)
 
 void xsk_tx_completed(struct xsk_buff_pool *pool, u32 nb_entries)
 {
+	struct xdp_sock *xs;
+
 	xskq_prod_submit_n(pool->cq, nb_entries);
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(xs, &pool->xsk_tx_list, tx_list) {
+		if (xs->cq_event)
+			sock_def_readable(&xs->sk);
+	}
+	rcu_read_unlock();
 }
 EXPORT_SYMBOL(xsk_tx_completed);
 
@@ -495,6 +504,9 @@ static __poll_t xsk_poll(struct file *file, struct socket *sock,
 			__xsk_sendmsg(sk);
 	}
 
+	if (xs->cq_event && pool->cq && !xskq_prod_is_empty(pool->cq))
+		mask |= EPOLLIN | EPOLLRDNORM;
+
 	if (xs->rx && !xskq_prod_is_empty(xs->rx))
 		mask |= EPOLLIN | EPOLLRDNORM;
 	if (xs->tx && !xskq_cons_is_full(xs->tx))
@@ -882,6 +894,22 @@ static int xsk_setsockopt(struct socket *sock, int level, int optname,
 		mutex_unlock(&xs->mutex);
 		return err;
 	}
+	case XDP_CQ_EVENT:
+	{
+		int cq_event;
+
+		if (optlen < sizeof(cq_event))
+			return -EINVAL;
+		if (copy_from_sockptr(&cq_event, optval, sizeof(cq_event)))
+			return -EFAULT;
+
+		if (cq_event)
+			xs->cq_event = true;
+		else
+			xs->cq_event = false;
+
+		return 0;
+	}
 	default:
 		break;
 	}
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] xsk: add cq event
  2020-11-16  8:10 [PATCH] xsk: add cq event Xuan Zhuo
@ 2020-11-16  9:13 ` Denis Kirjanov
       [not found]   ` <5fb245b1.1c69fb81.e2685.976dSMTPIN_ADDED_MISSING@mx.google.com>
  2020-11-16 14:31 ` Björn Töpel
  1 sibling, 1 reply; 6+ messages in thread
From: Denis Kirjanov @ 2020-11-16  9:13 UTC (permalink / raw)
  To: Xuan Zhuo
  Cc: netdev, Björn Töpel, Magnus Karlsson, Jonathan Lemon,
	David S. Miller, Jakub Kicinski, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend, bpf,
	linux-kernel

On 11/16/20, Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> When we write all cq items to tx, we have to wait for a new event based
> on poll to indicate that it is writable. But the current writability is
> triggered based on whether tx is full or not, and In fact, when tx is
> dissatisfied, the user of cq's item may not necessarily get it, because it
> may still be occupied by the network card. In this case, we need to know
> when cq is available, so this patch adds a socket option, When the user
> configures this option using setsockopt, when cq is available, a
> readable event is generated for all xsk bound to this umem.
>
> I can't find a better description of this event,
> I think it can also be 'readable', although it is indeed different from
> the 'readable' of the new data. But the overhead of xsk checking whether
> cq or rx is readable is small.
>
> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> ---
>  include/net/xdp_sock.h      |  1 +
>  include/uapi/linux/if_xdp.h |  1 +
>  net/xdp/xsk.c               | 28 ++++++++++++++++++++++++++++
>  3 files changed, 30 insertions(+)
>
> diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h
> index 1a9559c..faf5b1a 100644
> --- a/include/net/xdp_sock.h
> +++ b/include/net/xdp_sock.h
> @@ -49,6 +49,7 @@ struct xdp_sock {
>  	struct xsk_buff_pool *pool;
>  	u16 queue_id;
>  	bool zc;
> +	bool cq_event;
>  	enum {
>  		XSK_READY = 0,
>  		XSK_BOUND,
> diff --git a/include/uapi/linux/if_xdp.h b/include/uapi/linux/if_xdp.h
> index a78a809..2dba3cb 100644
> --- a/include/uapi/linux/if_xdp.h
> +++ b/include/uapi/linux/if_xdp.h
> @@ -63,6 +63,7 @@ struct xdp_mmap_offsets {
>  #define XDP_UMEM_COMPLETION_RING	6
>  #define XDP_STATISTICS			7
>  #define XDP_OPTIONS			8
> +#define XDP_CQ_EVENT			9
>
>  struct xdp_umem_reg {
>  	__u64 addr; /* Start of packet data area */
> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> index cfbec39..0c53403 100644
> --- a/net/xdp/xsk.c
> +++ b/net/xdp/xsk.c
> @@ -285,7 +285,16 @@ void __xsk_map_flush(void)
>
>  void xsk_tx_completed(struct xsk_buff_pool *pool, u32 nb_entries)
>  {
> +	struct xdp_sock *xs;
> +
>  	xskq_prod_submit_n(pool->cq, nb_entries);
> +
> +	rcu_read_lock();
> +	list_for_each_entry_rcu(xs, &pool->xsk_tx_list, tx_list) {
> +		if (xs->cq_event)
> +			sock_def_readable(&xs->sk);
> +	}
> +	rcu_read_unlock();
>  }
>  EXPORT_SYMBOL(xsk_tx_completed);
>
> @@ -495,6 +504,9 @@ static __poll_t xsk_poll(struct file *file, struct
> socket *sock,
>  			__xsk_sendmsg(sk);
>  	}
>
> +	if (xs->cq_event && pool->cq && !xskq_prod_is_empty(pool->cq))
> +		mask |= EPOLLIN | EPOLLRDNORM;
> +
>  	if (xs->rx && !xskq_prod_is_empty(xs->rx))
>  		mask |= EPOLLIN | EPOLLRDNORM;
>  	if (xs->tx && !xskq_cons_is_full(xs->tx))
> @@ -882,6 +894,22 @@ static int xsk_setsockopt(struct socket *sock, int
> level, int optname,
>  		mutex_unlock(&xs->mutex);
>  		return err;
>  	}
> +	case XDP_CQ_EVENT:
> +	{
> +		int cq_event;
> +
> +		if (optlen < sizeof(cq_event))
> +			return -EINVAL;
> +		if (copy_from_sockptr(&cq_event, optval, sizeof(cq_event)))
> +			return -EFAULT;
> +
> +		if (cq_event)
> +			xs->cq_event = true;
> +		else
> +			xs->cq_event = false;

It's false by default, isn't it?

> +
> +		return 0;
> +	}
>  	default:
>  		break;
>  	}
> --
> 1.8.3.1
>
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] xsk: add cq event
       [not found]   ` <5fb245b1.1c69fb81.e2685.976dSMTPIN_ADDED_MISSING@mx.google.com>
@ 2020-11-16 10:13     ` Denis Kirjanov
  2020-11-16 10:21       ` Denis Kirjanov
  0 siblings, 1 reply; 6+ messages in thread
From: Denis Kirjanov @ 2020-11-16 10:13 UTC (permalink / raw)
  To: Xuan Zhuo
  Cc: netdev, Björn Töpel, Magnus Karlsson, Jonathan Lemon,
	David S. Miller, Jakub Kicinski, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend, bpf,
	linux-kernel

On 11/16/20, Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> On Mon, 16 Nov 2020 12:13:21 +0300, Denis Kirjanov <kda@linux-powerpc.org>
> wrote:
>> On 11/16/20, Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>> > When we write all cq items to tx, we have to wait for a new event based
>> > on poll to indicate that it is writable. But the current writability is
>> > triggered based on whether tx is full or not, and In fact, when tx is
>> > dissatisfied, the user of cq's item may not necessarily get it, because
>> > it
>> > may still be occupied by the network card. In this case, we need to
>> > know
>> > when cq is available, so this patch adds a socket option, When the user
>> > configures this option using setsockopt, when cq is available, a
>> > readable event is generated for all xsk bound to this umem.
>> >
>> > I can't find a better description of this event,
>> > I think it can also be 'readable', although it is indeed different from
>> > the 'readable' of the new data. But the overhead of xsk checking
>> > whether
>> > cq or rx is readable is small.
>> >
>> > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
>> > ---
>> >  include/net/xdp_sock.h      |  1 +
>> >  include/uapi/linux/if_xdp.h |  1 +
>> >  net/xdp/xsk.c               | 28 ++++++++++++++++++++++++++++
>> >  3 files changed, 30 insertions(+)
>> >
>> > diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h
>> > index 1a9559c..faf5b1a 100644
>> > --- a/include/net/xdp_sock.h
>> > +++ b/include/net/xdp_sock.h
>> > @@ -49,6 +49,7 @@ struct xdp_sock {
>> >  	struct xsk_buff_pool *pool;
>> >  	u16 queue_id;
>> >  	bool zc;
>> > +	bool cq_event;
>> >  	enum {
>> >  		XSK_READY = 0,
>> >  		XSK_BOUND,
>> > diff --git a/include/uapi/linux/if_xdp.h b/include/uapi/linux/if_xdp.h
>> > index a78a809..2dba3cb 100644
>> > --- a/include/uapi/linux/if_xdp.h
>> > +++ b/include/uapi/linux/if_xdp.h
>> > @@ -63,6 +63,7 @@ struct xdp_mmap_offsets {
>> >  #define XDP_UMEM_COMPLETION_RING	6
>> >  #define XDP_STATISTICS			7
>> >  #define XDP_OPTIONS			8
>> > +#define XDP_CQ_EVENT			9
>> >
>> >  struct xdp_umem_reg {
>> >  	__u64 addr; /* Start of packet data area */
>> > diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
>> > index cfbec39..0c53403 100644
>> > --- a/net/xdp/xsk.c
>> > +++ b/net/xdp/xsk.c
>> > @@ -285,7 +285,16 @@ void __xsk_map_flush(void)
>> >
>> >  void xsk_tx_completed(struct xsk_buff_pool *pool, u32 nb_entries)
>> >  {
>> > +	struct xdp_sock *xs;
>> > +
>> >  	xskq_prod_submit_n(pool->cq, nb_entries);
>> > +
>> > +	rcu_read_lock();
>> > +	list_for_each_entry_rcu(xs, &pool->xsk_tx_list, tx_list) {
>> > +		if (xs->cq_event)
>> > +			sock_def_readable(&xs->sk);
>> > +	}
>> > +	rcu_read_unlock();
>> >  }
>> >  EXPORT_SYMBOL(xsk_tx_completed);
>> >
>> > @@ -495,6 +504,9 @@ static __poll_t xsk_poll(struct file *file, struct
>> > socket *sock,
>> >  			__xsk_sendmsg(sk);
>> >  	}
>> >
>> > +	if (xs->cq_event && pool->cq && !xskq_prod_is_empty(pool->cq))
>> > +		mask |= EPOLLIN | EPOLLRDNORM;
>> > +
>> >  	if (xs->rx && !xskq_prod_is_empty(xs->rx))
>> >  		mask |= EPOLLIN | EPOLLRDNORM;
>> >  	if (xs->tx && !xskq_cons_is_full(xs->tx))
>> > @@ -882,6 +894,22 @@ static int xsk_setsockopt(struct socket *sock, int
>> > level, int optname,
>> >  		mutex_unlock(&xs->mutex);
>> >  		return err;
>> >  	}
>> > +	case XDP_CQ_EVENT:
>> > +	{
>> > +		int cq_event;
>> > +
>> > +		if (optlen < sizeof(cq_event))
>> > +			return -EINVAL;
>> > +		if (copy_from_sockptr(&cq_event, optval, sizeof(cq_event)))
>> > +			return -EFAULT;
>> > +
>> > +		if (cq_event)
>> > +			xs->cq_event = true;
>> > +		else
>> > +			xs->cq_event = false;
>>
>> It's false by default, isn't it?
>
> I add cq_event inside "xdp_sock", that is got by sk_alloc, this call
> sk_prot_alloc by __GFP_ZERO. So I think it is false.

Right, I meant that what's the point to set it explicitly to 'false'?

>
> Thanks.
>
>>
>> > +
>> > +		return 0;
>> > +	}
>> >  	default:
>> >  		break;
>> >  	}
>> > --
>> > 1.8.3.1
>> >
>> >
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] xsk: add cq event
  2020-11-16 10:13     ` Denis Kirjanov
@ 2020-11-16 10:21       ` Denis Kirjanov
  0 siblings, 0 replies; 6+ messages in thread
From: Denis Kirjanov @ 2020-11-16 10:21 UTC (permalink / raw)
  To: Xuan Zhuo
  Cc: netdev, Björn Töpel, Magnus Karlsson, Jonathan Lemon,
	David S. Miller, Jakub Kicinski, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend, bpf,
	linux-kernel

On 11/16/20, Denis Kirjanov <kda@linux-powerpc.org> wrote:
> On 11/16/20, Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>> On Mon, 16 Nov 2020 12:13:21 +0300, Denis Kirjanov
>> <kda@linux-powerpc.org>
>> wrote:
>>> On 11/16/20, Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>>> > When we write all cq items to tx, we have to wait for a new event
>>> > based
>>> > on poll to indicate that it is writable. But the current writability
>>> > is
>>> > triggered based on whether tx is full or not, and In fact, when tx is
>>> > dissatisfied, the user of cq's item may not necessarily get it,
>>> > because
>>> > it
>>> > may still be occupied by the network card. In this case, we need to
>>> > know
>>> > when cq is available, so this patch adds a socket option, When the
>>> > user
>>> > configures this option using setsockopt, when cq is available, a
>>> > readable event is generated for all xsk bound to this umem.
>>> >
>>> > I can't find a better description of this event,
>>> > I think it can also be 'readable', although it is indeed different
>>> > from
>>> > the 'readable' of the new data. But the overhead of xsk checking
>>> > whether
>>> > cq or rx is readable is small.
>>> >
>>> > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
>>> > ---
>>> >  include/net/xdp_sock.h      |  1 +
>>> >  include/uapi/linux/if_xdp.h |  1 +
>>> >  net/xdp/xsk.c               | 28 ++++++++++++++++++++++++++++
>>> >  3 files changed, 30 insertions(+)
>>> >
>>> > diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h
>>> > index 1a9559c..faf5b1a 100644
>>> > --- a/include/net/xdp_sock.h
>>> > +++ b/include/net/xdp_sock.h
>>> > @@ -49,6 +49,7 @@ struct xdp_sock {
>>> >  	struct xsk_buff_pool *pool;
>>> >  	u16 queue_id;
>>> >  	bool zc;
>>> > +	bool cq_event;
>>> >  	enum {
>>> >  		XSK_READY = 0,
>>> >  		XSK_BOUND,
>>> > diff --git a/include/uapi/linux/if_xdp.h b/include/uapi/linux/if_xdp.h
>>> > index a78a809..2dba3cb 100644
>>> > --- a/include/uapi/linux/if_xdp.h
>>> > +++ b/include/uapi/linux/if_xdp.h
>>> > @@ -63,6 +63,7 @@ struct xdp_mmap_offsets {
>>> >  #define XDP_UMEM_COMPLETION_RING	6
>>> >  #define XDP_STATISTICS			7
>>> >  #define XDP_OPTIONS			8
>>> > +#define XDP_CQ_EVENT			9
>>> >
>>> >  struct xdp_umem_reg {
>>> >  	__u64 addr; /* Start of packet data area */
>>> > diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
>>> > index cfbec39..0c53403 100644
>>> > --- a/net/xdp/xsk.c
>>> > +++ b/net/xdp/xsk.c
>>> > @@ -285,7 +285,16 @@ void __xsk_map_flush(void)
>>> >
>>> >  void xsk_tx_completed(struct xsk_buff_pool *pool, u32 nb_entries)
>>> >  {
>>> > +	struct xdp_sock *xs;
>>> > +
>>> >  	xskq_prod_submit_n(pool->cq, nb_entries);
>>> > +
>>> > +	rcu_read_lock();
>>> > +	list_for_each_entry_rcu(xs, &pool->xsk_tx_list, tx_list) {
>>> > +		if (xs->cq_event)
>>> > +			sock_def_readable(&xs->sk);
>>> > +	}
>>> > +	rcu_read_unlock();
>>> >  }
>>> >  EXPORT_SYMBOL(xsk_tx_completed);
>>> >
>>> > @@ -495,6 +504,9 @@ static __poll_t xsk_poll(struct file *file, struct
>>> > socket *sock,
>>> >  			__xsk_sendmsg(sk);
>>> >  	}
>>> >
>>> > +	if (xs->cq_event && pool->cq && !xskq_prod_is_empty(pool->cq))
>>> > +		mask |= EPOLLIN | EPOLLRDNORM;
>>> > +
>>> >  	if (xs->rx && !xskq_prod_is_empty(xs->rx))
>>> >  		mask |= EPOLLIN | EPOLLRDNORM;
>>> >  	if (xs->tx && !xskq_cons_is_full(xs->tx))
>>> > @@ -882,6 +894,22 @@ static int xsk_setsockopt(struct socket *sock,
>>> > int
>>> > level, int optname,
>>> >  		mutex_unlock(&xs->mutex);
>>> >  		return err;
>>> >  	}
>>> > +	case XDP_CQ_EVENT:
>>> > +	{
>>> > +		int cq_event;
>>> > +
>>> > +		if (optlen < sizeof(cq_event))
>>> > +			return -EINVAL;
>>> > +		if (copy_from_sockptr(&cq_event, optval, sizeof(cq_event)))
>>> > +			return -EFAULT;
>>> > +
>>> > +		if (cq_event)
>>> > +			xs->cq_event = true;
>>> > +		else
>>> > +			xs->cq_event = false;
>>>
>>> It's false by default, isn't it?
>>
>> I add cq_event inside "xdp_sock", that is got by sk_alloc, this call
>> sk_prot_alloc by __GFP_ZERO. So I think it is false.
>
> Right, I meant that what's the point to set it explicitly to 'false'?

Nevermind, It's okay.

>
>>
>> Thanks.
>>
>>>
>>> > +
>>> > +		return 0;
>>> > +	}
>>> >  	default:
>>> >  		break;
>>> >  	}
>>> > --
>>> > 1.8.3.1
>>> >
>>> >
>>
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] xsk: add cq event
  2020-11-16  8:10 [PATCH] xsk: add cq event Xuan Zhuo
  2020-11-16  9:13 ` Denis Kirjanov
@ 2020-11-16 14:31 ` Björn Töpel
  1 sibling, 0 replies; 6+ messages in thread
From: Björn Töpel @ 2020-11-16 14:31 UTC (permalink / raw)
  To: Xuan Zhuo, netdev
  Cc: Magnus Karlsson, Jonathan Lemon, David S. Miller, Jakub Kicinski,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, bpf, linux-kernel

On 2020-11-16 09:10, Xuan Zhuo wrote:
> When we write all cq items to tx, we have to wait for a new event based
> on poll to indicate that it is writable. But the current writability is
> triggered based on whether tx is full or not, and In fact, when tx is
> dissatisfied, the user of cq's item may not necessarily get it, because it
> may still be occupied by the network card. In this case, we need to know
> when cq is available, so this patch adds a socket option, When the user
> configures this option using setsockopt, when cq is available, a
> readable event is generated for all xsk bound to this umem.
> 
> I can't find a better description of this event,
> I think it can also be 'readable', although it is indeed different from
> the 'readable' of the new data. But the overhead of xsk checking whether
> cq or rx is readable is small.
> 
> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>

Thanks for the patch!

I'm not a fan of having two different "readable" event (both Rx and cq).
Could you explain a bit what the use case is, so I get a better
understanding.

The Tx queues has a back-pressure mechanism, determined of the number of
elements in cq. Is it related to that?

Please explain a bit more what you're trying to solve, and maybe we can
figure out a better way forward!


Thanks!
Björn

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] xsk: add cq event
       [not found] <964677c6-442c-485e-9268-3a801dbd4bd3@orsmsx607.amr.corp.intel.com>
@ 2020-11-17 10:00 ` Björn Töpel
  0 siblings, 0 replies; 6+ messages in thread
From: Björn Töpel @ 2020-11-17 10:00 UTC (permalink / raw)
  To: Xuan Zhuo, Karlsson, Magnus
  Cc: Magnus Karlsson, Jonathan Lemon, David S. Miller, Jakub Kicinski,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, bpf, linux-kernel, netdev

On 2020-11-16 17:12, Xuan Zhuo wrote:
> On Mon, 16 Nov 2020 15:31:20 +0100, =?UTF-8?B?QmrDtnJuIFTDtnBlbA==?= <bjorn.topel@intel.com> wrote:
> 
>> On 2020-11-16 09:10, Xuan Zhuo wrote:
> 
>>> When we write all cq items to tx, we have to wait for a new event based
> 
>>> on poll to indicate that it is writable. But the current writability is
> 
>>> triggered based on whether tx is full or not, and In fact, when tx is
> 
>>> dissatisfied, the user of cq's item may not necessarily get it, because it
> 
>>> may still be occupied by the network card. In this case, we need to know
> 
>>> when cq is available, so this patch adds a socket option, When the user
> 
>>> configures this option using setsockopt, when cq is available, a
> 
>>> readable event is generated for all xsk bound to this umem.
> 
>>>
> 
>>> I can't find a better description of this event,
> 
>>> I think it can also be 'readable', although it is indeed different from
> 
>>> the 'readable' of the new data. But the overhead of xsk checking whether
> 
>>> cq or rx is readable is small.
> 
>>>
> 
>>> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> 
>>
> 
>> Thanks for the patch!
> 
>>
> 
>> I'm not a fan of having two different "readable" event (both Rx and cq).
> 
>> Could you explain a bit what the use case is, so I get a better
> 
>> understanding.
> 
>>
> 
>> The Tx queues has a back-pressure mechanism, determined of the number of
> 
>> elements in cq. Is it related to that?
> 
>>
> 
>> Please explain a bit more what you're trying to solve, and maybe we can
> 
>> figure out a better way forward!
> 
>>
> 
>>
> 
>> Thanks!
> 
>> Björn
> 
> I want to implement a tool for mass sending. For example, the size of cq is
> 
> 1024, and I set the size of tx also to 1024, so that I will put all cq in tx at
> 
> once, and then I have to wait for an event, come Indicates that there is new
> 
> write space or new cq is available.
> 
> 
> 
> At present, we can only monitor the event of write able. This indicates whether
> 
> tx is full, but in fact, tx is basically not full, because the full state is
> 
> very short, and those tx items are being used by the network card. And
> 
> epoll_wait will be awakened directly every time, without waiting, but I cannot
> 
> get the cq item, so I still cannot successfully send the package again.
> 
> 
> 
> Of course, I don't like the "readable" event very much. This is a suitable
> 
> one I found in the existing epoll event. ^_^
>

More questions! By "Mass sending" do you mean maximum throughput, or
does that mean "in very large batches"?

For the latter to do 1k batches, you could increase the Tx/cq buffer
size to say 4k.

For maximum thoughput it's better to use smaller batches (e.g. what the
txpush scenario in samples/xdpsock does).

You're right that even if there's space in the Tx ring, it wont be sent
unless there's sufficient space in the cq ring. Maybe it would make
sense to be more restrictive when triggering the "writable" socket
event? E.g. only trigger it when there's space in Tx *and* sufficient cq
space?


Björn

> 
> 
> Thanks.
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-11-17 10:00 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-16  8:10 [PATCH] xsk: add cq event Xuan Zhuo
2020-11-16  9:13 ` Denis Kirjanov
     [not found]   ` <5fb245b1.1c69fb81.e2685.976dSMTPIN_ADDED_MISSING@mx.google.com>
2020-11-16 10:13     ` Denis Kirjanov
2020-11-16 10:21       ` Denis Kirjanov
2020-11-16 14:31 ` Björn Töpel
     [not found] <964677c6-442c-485e-9268-3a801dbd4bd3@orsmsx607.amr.corp.intel.com>
2020-11-17 10:00 ` Björn Töpel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).