All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] af_packet: fix efficiency issues in packet_read_pending
@ 2022-04-06 11:48 lianglixue
  2022-04-06 15:20 ` Eric Dumazet
  0 siblings, 1 reply; 3+ messages in thread
From: lianglixue @ 2022-04-06 11:48 UTC (permalink / raw)
  To: davem, kuba, pabeni
  Cc: edumazet, pablo, rsanger, yajun.deng, jiapeng.chong, netdev, lianglixue

In packet_read_pengding, even if the pending_refcnt of the first CPU
is not 0, the pending_refcnt of all CPUs will be traversed,
and the long delay of cross-cpu access in NUMA significantly reduces
the performance of packet sending; especially in tpacket_destruct_skb.

When pending_refcnt is not 0, it returns without counting the number of
all pending packets, which significantly reduces the traversal time.

Signed-off-by: lianglixue <lianglixue@greatwall.com.cn>
---
 net/packet/af_packet.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index c39c09899fd0..c04f49e44a33 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1210,17 +1210,18 @@ static void packet_dec_pending(struct packet_ring_buffer *rb)
 
 static unsigned int packet_read_pending(const struct packet_ring_buffer *rb)
 {
-	unsigned int refcnt = 0;
 	int cpu;
 
 	/* We don't use pending refcount in rx_ring. */
 	if (rb->pending_refcnt == NULL)
 		return 0;
 
-	for_each_possible_cpu(cpu)
-		refcnt += *per_cpu_ptr(rb->pending_refcnt, cpu);
+	for_each_possible_cpu(cpu) {
+		if (*per_cpu_ptr(rb->pending_refcnt, cpu) != 0)
+			return 1;
+	}
 
-	return refcnt;
+	return 0;
 }
 
 static int packet_alloc_pending(struct packet_sock *po)
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] af_packet: fix efficiency issues in packet_read_pending
  2022-04-06 11:48 [PATCH] af_packet: fix efficiency issues in packet_read_pending lianglixue
@ 2022-04-06 15:20 ` Eric Dumazet
       [not found]   ` <6BE2E6E4-F6D0-4CB4-96ED-F1D1931D62CE@gmail.com>
  0 siblings, 1 reply; 3+ messages in thread
From: Eric Dumazet @ 2022-04-06 15:20 UTC (permalink / raw)
  To: lianglixue
  Cc: David Miller, Jakub Kicinski, Paolo Abeni, Pablo Neira Ayuso,
	rsanger, Yajun Deng, jiapeng.chong, netdev, lianglixue

On Wed, Apr 6, 2022 at 4:49 AM lianglixue <lixue.liang5086@gmail.com> wrote:
>
> In packet_read_pengding, even if the pending_refcnt of the first CPU
> is not 0, the pending_refcnt of all CPUs will be traversed,
> and the long delay of cross-cpu access in NUMA significantly reduces
> the performance of packet sending; especially in tpacket_destruct_skb.
>
> When pending_refcnt is not 0, it returns without counting the number of
> all pending packets, which significantly reduces the traversal time.
>

Can you describe the use case ?

You are changing the slow path.

Perhaps you do not use the interface in the most efficient way.

Your patch is wrong, pending_refcnt increments and decrements can
happen on different cpus.



> Signed-off-by: lianglixue <lianglixue@greatwall.com.cn>
> ---
>  net/packet/af_packet.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index c39c09899fd0..c04f49e44a33 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -1210,17 +1210,18 @@ static void packet_dec_pending(struct packet_ring_buffer *rb)
>
>  static unsigned int packet_read_pending(const struct packet_ring_buffer *rb)
>  {
> -       unsigned int refcnt = 0;
>         int cpu;
>
>         /* We don't use pending refcount in rx_ring. */
>         if (rb->pending_refcnt == NULL)
>                 return 0;
>
> -       for_each_possible_cpu(cpu)
> -               refcnt += *per_cpu_ptr(rb->pending_refcnt, cpu);
> +       for_each_possible_cpu(cpu) {
> +               if (*per_cpu_ptr(rb->pending_refcnt, cpu) != 0)
> +                       return 1;
> +       }
>
> -       return refcnt;
> +       return 0;
>  }
>
>  static int packet_alloc_pending(struct packet_sock *po)
> --
> 2.27.0
>

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] af_packet: fix efficiency issues in packet_read_pending
       [not found]   ` <6BE2E6E4-F6D0-4CB4-96ED-F1D1931D62CE@gmail.com>
@ 2022-04-07  3:14     ` Eric Dumazet
  0 siblings, 0 replies; 3+ messages in thread
From: Eric Dumazet @ 2022-04-07  3:14 UTC (permalink / raw)
  To: lixue liang, Daniel Borkmann
  Cc: David Miller, Jakub Kicinski, Paolo Abeni, Pablo Neira Ayuso,
	rsanger, Yajun Deng, jiapeng.chong, netdev

On Wed, Apr 6, 2022 at 8:02 PM lixue liang <lixue.liang5086@gmail.com> wrote:
>
> Hi, Eric Dumazet.The pending_refcnt will indeed increase and decrease on different cpus,
> but the caller of packet_read_pending in af_packet.c only needs to determine whether there is
> a pending packet, not the sum of pending packets on each cpu .
>
> In the numa of 128 cpus, if the maximum distance of node is 100 (numactl -H, the result is as shown below),
> even if the first cpu has a pending packet, the packet_read_pending of tpacket_destruct_skb still needs to
> additionally traverse the remaining 127 cpus the pending packet, which leads to additional consumption
> of more cpu time, and the return refcnt of packet_read_pending is the sum of the pending packets that the
> caller does not need.
>
> please correct me, thanks.

Please do not send HTML messages, they won't be delivered to the list.

No, the intent of the code is to sum all cpu variables.

Your change basically will break many cases, as I explained.

If for some reason CPU 0 did a single packet_inc_pending() but the
corresponding packet_dec_pending() happened on CPU 1,
we do not want  packet_read_pending() to return 1, but 0.


If having per-cpu variable is too costly, we need to revisit the whole thing.

(ie possibly revert b013840810c221f2b0cf641d01531526052dc1fb packet:
use percpu mmap tx frame pending refcount)

>
>
>
> 2022年4月6日 23:20,Eric Dumazet <edumazet@google.com> 写道:
>
> On Wed, Apr 6, 2022 at 4:49 AM lianglixue <lixue.liang5086@gmail.com> wrote:
>
>
> In packet_read_pengding, even if the pending_refcnt of the first CPU
> is not 0, the pending_refcnt of all CPUs will be traversed,
> and the long delay of cross-cpu access in NUMA significantly reduces
> the performance of packet sending; especially in tpacket_destruct_skb.
>
> When pending_refcnt is not 0, it returns without counting the number of
> all pending packets, which significantly reduces the traversal time.
>
>
> Can you describe the use case ?
>
> You are changing the slow path.
>
> Perhaps you do not use the interface in the most efficient way.
>
> Your patch is wrong, pending_refcnt increments and decrements can
> happen on different cpus.
>
>
>
> Signed-off-by: lianglixue <lianglixue@greatwall.com.cn>
> ---
> net/packet/af_packet.c | 9 +++++----
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index c39c09899fd0..c04f49e44a33 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -1210,17 +1210,18 @@ static void packet_dec_pending(struct packet_ring_buffer *rb)
>
> static unsigned int packet_read_pending(const struct packet_ring_buffer *rb)
> {
> -       unsigned int refcnt = 0;
>        int cpu;
>
>        /* We don't use pending refcount in rx_ring. */
>        if (rb->pending_refcnt == NULL)
>                return 0;
>
> -       for_each_possible_cpu(cpu)
> -               refcnt += *per_cpu_ptr(rb->pending_refcnt, cpu);
> +       for_each_possible_cpu(cpu) {
> +               if (*per_cpu_ptr(rb->pending_refcnt, cpu) != 0)
> +                       return 1;
> +       }
>
> -       return refcnt;
> +       return 0;
> }
>
> static int packet_alloc_pending(struct packet_sock *po)
> --
> 2.27.0
>
>

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2022-04-07  3:14 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-06 11:48 [PATCH] af_packet: fix efficiency issues in packet_read_pending lianglixue
2022-04-06 15:20 ` Eric Dumazet
     [not found]   ` <6BE2E6E4-F6D0-4CB4-96ED-F1D1931D62CE@gmail.com>
2022-04-07  3:14     ` Eric Dumazet

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.