linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next] net: introduce budget_squeeze to help us tune rx behavior
@ 2023-03-11 16:36 Jason Xing
  2023-03-11 17:14 ` Stephen Hemminger
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Jason Xing @ 2023-03-11 16:36 UTC (permalink / raw)
  To: davem, edumazet, kuba, pabeni, ast, daniel, hawk, john.fastabend
  Cc: kuniyu, liuhangbin, xiangxia.m.yue, jiri, andy.ren, bpf, netdev,
	linux-kernel, kerneljasonxing, Jason Xing

From: Jason Xing <kernelxing@tencent.com>

When we encounter some performance issue and then get lost on how
to tune the budget limit and time limit in net_rx_action() function,
we can separately counting both of them to avoid the confusion.

Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
note: this commit is based on the link as below:
https://lore.kernel.org/lkml/20230311151756.83302-1-kerneljasonxing@gmail.com/
---
 include/linux/netdevice.h |  1 +
 net/core/dev.c            | 12 ++++++++----
 net/core/net-procfs.c     |  9 ++++++---
 3 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 6a14b7b11766..5736311a2133 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3157,6 +3157,7 @@ struct softnet_data {
 	/* stats */
 	unsigned int		processed;
 	unsigned int		time_squeeze;
+	unsigned int		budget_squeeze;
 #ifdef CONFIG_RPS
 	struct softnet_data	*rps_ipi_list;
 #endif
diff --git a/net/core/dev.c b/net/core/dev.c
index 253584777101..bed7a68fdb5d 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6637,6 +6637,7 @@ static __latent_entropy void net_rx_action(struct softirq_action *h)
 	unsigned long time_limit = jiffies +
 		usecs_to_jiffies(READ_ONCE(netdev_budget_usecs));
 	int budget = READ_ONCE(netdev_budget);
+	bool is_continue = true;
 	LIST_HEAD(list);
 	LIST_HEAD(repoll);
 
@@ -6644,7 +6645,7 @@ static __latent_entropy void net_rx_action(struct softirq_action *h)
 	list_splice_init(&sd->poll_list, &list);
 	local_irq_enable();
 
-	for (;;) {
+	for (; is_continue;) {
 		struct napi_struct *n;
 
 		skb_defer_free_flush(sd);
@@ -6662,10 +6663,13 @@ static __latent_entropy void net_rx_action(struct softirq_action *h)
 		 * Allow this to run for 2 jiffies since which will allow
 		 * an average latency of 1.5/HZ.
 		 */
-		if (unlikely(budget <= 0 ||
-			     time_after_eq(jiffies, time_limit))) {
+		if (unlikely(budget <= 0)) {
+			sd->budget_squeeze++;
+			is_continue = false;
+		}
+		if (unlikely(time_after_eq(jiffies, time_limit))) {
 			sd->time_squeeze++;
-			break;
+			is_continue = false;
 		}
 	}
 
diff --git a/net/core/net-procfs.c b/net/core/net-procfs.c
index 97a304e1957a..4d1a499d7c43 100644
--- a/net/core/net-procfs.c
+++ b/net/core/net-procfs.c
@@ -174,14 +174,17 @@ static int softnet_seq_show(struct seq_file *seq, void *v)
 	 */
 	seq_printf(seq,
 		   "%08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x "
-		   "%08x %08x\n",
-		   sd->processed, sd->dropped, sd->time_squeeze, 0,
+		   "%08x %08x %08x %08x\n",
+		   sd->processed, sd->dropped,
+		   0, /* was old way to count time squeeze */
+		   0,
 		   0, 0, 0, 0, /* was fastroute */
 		   0,	/* was cpu_collision */
 		   sd->received_rps, flow_limit_count,
 		   0,	/* was len of two backlog queues */
 		   (int)seq->index,
-		   softnet_input_pkt_queue_len(sd), softnet_process_queue_len(sd));
+		   softnet_input_pkt_queue_len(sd), softnet_process_queue_len(sd),
+		   sd->time_squeeze, sd->budget_squeeze);
 	return 0;
 }
 
-- 
2.37.3


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH net-next] net: introduce budget_squeeze to help us tune rx behavior
  2023-03-11 16:36 [PATCH net-next] net: introduce budget_squeeze to help us tune rx behavior Jason Xing
@ 2023-03-11 17:14 ` Stephen Hemminger
  2023-03-12  0:04   ` Jason Xing
  2023-03-13  2:05 ` Jason Xing
  2023-03-13 21:58 ` Kui-Feng Lee
  2 siblings, 1 reply; 10+ messages in thread
From: Stephen Hemminger @ 2023-03-11 17:14 UTC (permalink / raw)
  To: Jason Xing
  Cc: davem, edumazet, kuba, pabeni, ast, daniel, hawk, john.fastabend,
	kuniyu, liuhangbin, xiangxia.m.yue, jiri, andy.ren, bpf, netdev,
	linux-kernel, Jason Xing

On Sun, 12 Mar 2023 00:36:14 +0800
Jason Xing <kerneljasonxing@gmail.com> wrote:

> -	for (;;) {
> +	for (; is_continue;) {


Easier to read this as a
 	while (is_continue) {

but what is wrong with using break; instead?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net-next] net: introduce budget_squeeze to help us tune rx behavior
  2023-03-11 17:14 ` Stephen Hemminger
@ 2023-03-12  0:04   ` Jason Xing
  0 siblings, 0 replies; 10+ messages in thread
From: Jason Xing @ 2023-03-12  0:04 UTC (permalink / raw)
  To: stephen
  Cc: davem, edumazet, kuba, pabeni, ast, daniel, hawk, john.fastabend,
	kuniyu, liuhangbin, xiangxia.m.yue, jiri, bpf, netdev,
	linux-kernel, Jason Xing

On Sun, Mar 12, 2023 at 1:14 AM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Sun, 12 Mar 2023 00:36:14 +0800
> Jason Xing <kerneljasonxing@gmail.com> wrote:
>
> > -     for (;;) {
> > +     for (; is_continue;) {
>
>
> Easier to read this as a
>         while (is_continue) {
>
> but what is wrong with using break; instead?

If we hit the budget limit and 'break;' immediately, we may miss the
collection when we also hit the time limit. That's why I would like to
know if we hit both of them.

Thank,
Jason

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net-next] net: introduce budget_squeeze to help us tune rx behavior
  2023-03-11 16:36 [PATCH net-next] net: introduce budget_squeeze to help us tune rx behavior Jason Xing
  2023-03-11 17:14 ` Stephen Hemminger
@ 2023-03-13  2:05 ` Jason Xing
  2023-03-13 20:07   ` Simon Horman
  2023-03-13 21:58 ` Kui-Feng Lee
  2 siblings, 1 reply; 10+ messages in thread
From: Jason Xing @ 2023-03-13  2:05 UTC (permalink / raw)
  To: davem, edumazet, kuba, pabeni, ast, daniel, hawk, john.fastabend
  Cc: kuniyu, liuhangbin, xiangxia.m.yue, jiri, andy.ren, bpf, netdev,
	linux-kernel, Jason Xing

On Sun, Mar 12, 2023 at 12:36 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
>
> From: Jason Xing <kernelxing@tencent.com>
>
> When we encounter some performance issue and then get lost on how
> to tune the budget limit and time limit in net_rx_action() function,
> we can separately counting both of them to avoid the confusion.
>
> Signed-off-by: Jason Xing <kernelxing@tencent.com>
> ---
> note: this commit is based on the link as below:
> https://lore.kernel.org/lkml/20230311151756.83302-1-kerneljasonxing@gmail.com/
> ---
>  include/linux/netdevice.h |  1 +
>  net/core/dev.c            | 12 ++++++++----
>  net/core/net-procfs.c     |  9 ++++++---
>  3 files changed, 15 insertions(+), 7 deletions(-)
>
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 6a14b7b11766..5736311a2133 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -3157,6 +3157,7 @@ struct softnet_data {
>         /* stats */
>         unsigned int            processed;
>         unsigned int            time_squeeze;
> +       unsigned int            budget_squeeze;
>  #ifdef CONFIG_RPS
>         struct softnet_data     *rps_ipi_list;
>  #endif
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 253584777101..bed7a68fdb5d 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -6637,6 +6637,7 @@ static __latent_entropy void net_rx_action(struct softirq_action *h)
>         unsigned long time_limit = jiffies +
>                 usecs_to_jiffies(READ_ONCE(netdev_budget_usecs));
>         int budget = READ_ONCE(netdev_budget);
> +       bool is_continue = true;

I kept thinking during these days, I think it looks not that concise
and elegant and also the name is not that good though the function can
work.

In the next submission, I'm going to choose to use 'while()' instead
of 'for()' suggested by Stephen.

Does anyone else have some advice about this?

Thanks,
Jason

>         LIST_HEAD(list);
>         LIST_HEAD(repoll);
>
> @@ -6644,7 +6645,7 @@ static __latent_entropy void net_rx_action(struct softirq_action *h)
>         list_splice_init(&sd->poll_list, &list);
>         local_irq_enable();
>
> -       for (;;) {
> +       for (; is_continue;) {
>                 struct napi_struct *n;
>
>                 skb_defer_free_flush(sd);
> @@ -6662,10 +6663,13 @@ static __latent_entropy void net_rx_action(struct softirq_action *h)
>                  * Allow this to run for 2 jiffies since which will allow
>                  * an average latency of 1.5/HZ.
>                  */
> -               if (unlikely(budget <= 0 ||
> -                            time_after_eq(jiffies, time_limit))) {
> +               if (unlikely(budget <= 0)) {
> +                       sd->budget_squeeze++;
> +                       is_continue = false;
> +               }
> +               if (unlikely(time_after_eq(jiffies, time_limit))) {
>                         sd->time_squeeze++;
> -                       break;
> +                       is_continue = false;
>                 }
>         }
>
> diff --git a/net/core/net-procfs.c b/net/core/net-procfs.c
> index 97a304e1957a..4d1a499d7c43 100644
> --- a/net/core/net-procfs.c
> +++ b/net/core/net-procfs.c
> @@ -174,14 +174,17 @@ static int softnet_seq_show(struct seq_file *seq, void *v)
>          */
>         seq_printf(seq,
>                    "%08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x "
> -                  "%08x %08x\n",
> -                  sd->processed, sd->dropped, sd->time_squeeze, 0,
> +                  "%08x %08x %08x %08x\n",
> +                  sd->processed, sd->dropped,
> +                  0, /* was old way to count time squeeze */
> +                  0,
>                    0, 0, 0, 0, /* was fastroute */
>                    0,   /* was cpu_collision */
>                    sd->received_rps, flow_limit_count,
>                    0,   /* was len of two backlog queues */
>                    (int)seq->index,
> -                  softnet_input_pkt_queue_len(sd), softnet_process_queue_len(sd));
> +                  softnet_input_pkt_queue_len(sd), softnet_process_queue_len(sd),
> +                  sd->time_squeeze, sd->budget_squeeze);
>         return 0;
>  }
>
> --
> 2.37.3
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net-next] net: introduce budget_squeeze to help us tune rx behavior
  2023-03-13  2:05 ` Jason Xing
@ 2023-03-13 20:07   ` Simon Horman
  2023-03-14  1:56     ` Jason Xing
  0 siblings, 1 reply; 10+ messages in thread
From: Simon Horman @ 2023-03-13 20:07 UTC (permalink / raw)
  To: Jason Xing
  Cc: davem, edumazet, kuba, pabeni, ast, daniel, hawk, john.fastabend,
	kuniyu, liuhangbin, xiangxia.m.yue, jiri, andy.ren, bpf, netdev,
	linux-kernel, Jason Xing

On Mon, Mar 13, 2023 at 10:05:18AM +0800, Jason Xing wrote:
> On Sun, Mar 12, 2023 at 12:36 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
> >
> > From: Jason Xing <kernelxing@tencent.com>
> >
> > When we encounter some performance issue and then get lost on how
> > to tune the budget limit and time limit in net_rx_action() function,
> > we can separately counting both of them to avoid the confusion.
> >
> > Signed-off-by: Jason Xing <kernelxing@tencent.com>
> > ---
> > note: this commit is based on the link as below:
> > https://lore.kernel.org/lkml/20230311151756.83302-1-kerneljasonxing@gmail.com/
> > ---
> >  include/linux/netdevice.h |  1 +
> >  net/core/dev.c            | 12 ++++++++----
> >  net/core/net-procfs.c     |  9 ++++++---
> >  3 files changed, 15 insertions(+), 7 deletions(-)
> >
> > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> > index 6a14b7b11766..5736311a2133 100644
> > --- a/include/linux/netdevice.h
> > +++ b/include/linux/netdevice.h
> > @@ -3157,6 +3157,7 @@ struct softnet_data {
> >         /* stats */
> >         unsigned int            processed;
> >         unsigned int            time_squeeze;
> > +       unsigned int            budget_squeeze;
> >  #ifdef CONFIG_RPS
> >         struct softnet_data     *rps_ipi_list;
> >  #endif
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index 253584777101..bed7a68fdb5d 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -6637,6 +6637,7 @@ static __latent_entropy void net_rx_action(struct softirq_action *h)
> >         unsigned long time_limit = jiffies +
> >                 usecs_to_jiffies(READ_ONCE(netdev_budget_usecs));
> >         int budget = READ_ONCE(netdev_budget);
> > +       bool is_continue = true;
> 
> I kept thinking during these days, I think it looks not that concise
> and elegant and also the name is not that good though the function can
> work.
> 
> In the next submission, I'm going to choose to use 'while()' instead
> of 'for()' suggested by Stephen.
> 
> Does anyone else have some advice about this?

What about:

	int done = false

	while (!done) {
		...
	}

Or:

	for (;;) {
		int done = false;

		...
		if (done)
			break;
	}

> 
> Thanks,
> Jason
> 
> >         LIST_HEAD(list);
> >         LIST_HEAD(repoll);
> >
> > @@ -6644,7 +6645,7 @@ static __latent_entropy void net_rx_action(struct softirq_action *h)
> >         list_splice_init(&sd->poll_list, &list);
> >         local_irq_enable();
> >
> > -       for (;;) {
> > +       for (; is_continue;) {
> >                 struct napi_struct *n;
> >
> >                 skb_defer_free_flush(sd);
> > @@ -6662,10 +6663,13 @@ static __latent_entropy void net_rx_action(struct softirq_action *h)
> >                  * Allow this to run for 2 jiffies since which will allow
> >                  * an average latency of 1.5/HZ.
> >                  */
> > -               if (unlikely(budget <= 0 ||
> > -                            time_after_eq(jiffies, time_limit))) {
> > +               if (unlikely(budget <= 0)) {
> > +                       sd->budget_squeeze++;
> > +                       is_continue = false;
> > +               }
> > +               if (unlikely(time_after_eq(jiffies, time_limit))) {
> >                         sd->time_squeeze++;
> > -                       break;
> > +                       is_continue = false;
> >                 }
> >         }
> >
> > diff --git a/net/core/net-procfs.c b/net/core/net-procfs.c
> > index 97a304e1957a..4d1a499d7c43 100644
> > --- a/net/core/net-procfs.c
> > +++ b/net/core/net-procfs.c
> > @@ -174,14 +174,17 @@ static int softnet_seq_show(struct seq_file *seq, void *v)
> >          */
> >         seq_printf(seq,
> >                    "%08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x "
> > -                  "%08x %08x\n",
> > -                  sd->processed, sd->dropped, sd->time_squeeze, 0,
> > +                  "%08x %08x %08x %08x\n",
> > +                  sd->processed, sd->dropped,
> > +                  0, /* was old way to count time squeeze */
> > +                  0,
> >                    0, 0, 0, 0, /* was fastroute */
> >                    0,   /* was cpu_collision */
> >                    sd->received_rps, flow_limit_count,
> >                    0,   /* was len of two backlog queues */
> >                    (int)seq->index,
> > -                  softnet_input_pkt_queue_len(sd), softnet_process_queue_len(sd));
> > +                  softnet_input_pkt_queue_len(sd), softnet_process_queue_len(sd),
> > +                  sd->time_squeeze, sd->budget_squeeze);
> >         return 0;
> >  }
> >
> > --
> > 2.37.3
> >
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net-next] net: introduce budget_squeeze to help us tune rx behavior
  2023-03-11 16:36 [PATCH net-next] net: introduce budget_squeeze to help us tune rx behavior Jason Xing
  2023-03-11 17:14 ` Stephen Hemminger
  2023-03-13  2:05 ` Jason Xing
@ 2023-03-13 21:58 ` Kui-Feng Lee
  2023-03-14  1:57   ` Jason Xing
  2 siblings, 1 reply; 10+ messages in thread
From: Kui-Feng Lee @ 2023-03-13 21:58 UTC (permalink / raw)
  To: Jason Xing, davem, edumazet, kuba, pabeni, ast, daniel, hawk,
	john.fastabend
  Cc: kuniyu, liuhangbin, xiangxia.m.yue, jiri, andy.ren, bpf, netdev,
	linux-kernel, Jason Xing



On 3/11/23 08:36, Jason Xing wrote:
> From: Jason Xing <kernelxing@tencent.com>
> 
> When we encounter some performance issue and then get lost on how
> to tune the budget limit and time limit in net_rx_action() function,
> we can separately counting both of them to avoid the confusion.
> 
> Signed-off-by: Jason Xing <kernelxing@tencent.com>
> ---
> note: this commit is based on the link as below:
> https://lore.kernel.org/lkml/20230311151756.83302-1-kerneljasonxing@gmail.com/
> ---
>   include/linux/netdevice.h |  1 +
>   net/core/dev.c            | 12 ++++++++----
>   net/core/net-procfs.c     |  9 ++++++---
>   3 files changed, 15 insertions(+), 7 deletions(-)
> 
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 6a14b7b11766..5736311a2133 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -3157,6 +3157,7 @@ struct softnet_data {
>   	/* stats */
>   	unsigned int		processed;
>   	unsigned int		time_squeeze;
> +	unsigned int		budget_squeeze;
>   #ifdef CONFIG_RPS
>   	struct softnet_data	*rps_ipi_list;
>   #endif
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 253584777101..bed7a68fdb5d 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -6637,6 +6637,7 @@ static __latent_entropy void net_rx_action(struct softirq_action *h)
>   	unsigned long time_limit = jiffies +
>   		usecs_to_jiffies(READ_ONCE(netdev_budget_usecs));
>   	int budget = READ_ONCE(netdev_budget);
> +	bool is_continue = true;
>   	LIST_HEAD(list);
>   	LIST_HEAD(repoll);
>   
> @@ -6644,7 +6645,7 @@ static __latent_entropy void net_rx_action(struct softirq_action *h)
>   	list_splice_init(&sd->poll_list, &list);
>   	local_irq_enable();
>   
> -	for (;;) {
> +	for (; is_continue;) {
>   		struct napi_struct *n;
>   
>   		skb_defer_free_flush(sd);
> @@ -6662,10 +6663,13 @@ static __latent_entropy void net_rx_action(struct softirq_action *h)
>   		 * Allow this to run for 2 jiffies since which will allow
>   		 * an average latency of 1.5/HZ.
>   		 */
> -		if (unlikely(budget <= 0 ||
> -			     time_after_eq(jiffies, time_limit))) {
> +		if (unlikely(budget <= 0)) {
> +			sd->budget_squeeze++;
> +			is_continue = false;
> +		}
> +		if (unlikely(time_after_eq(jiffies, time_limit))) {
>   			sd->time_squeeze++;
> -			break;
> +			is_continue = false;
>   		}
>   	}
>   
> diff --git a/net/core/net-procfs.c b/net/core/net-procfs.c
> index 97a304e1957a..4d1a499d7c43 100644
> --- a/net/core/net-procfs.c
> +++ b/net/core/net-procfs.c
> @@ -174,14 +174,17 @@ static int softnet_seq_show(struct seq_file *seq, void *v)
>   	 */
>   	seq_printf(seq,
>   		   "%08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x "
> -		   "%08x %08x\n",
> -		   sd->processed, sd->dropped, sd->time_squeeze, 0,
> +		   "%08x %08x %08x %08x\n",
> +		   sd->processed, sd->dropped,
> +		   0, /* was old way to count time squeeze */

Should we show a proximate number?  For example,
sd->time_squeeze + sd->bud_squeeze.


> +		   0,
>   		   0, 0, 0, 0, /* was fastroute */
>   		   0,	/* was cpu_collision */
>   		   sd->received_rps, flow_limit_count,
>   		   0,	/* was len of two backlog queues */
>   		   (int)seq->index,
> -		   softnet_input_pkt_queue_len(sd), softnet_process_queue_len(sd));
> +		   softnet_input_pkt_queue_len(sd), softnet_process_queue_len(sd),
> +		   sd->time_squeeze, sd->budget_squeeze);
>   	return 0;
>   }
>   

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net-next] net: introduce budget_squeeze to help us tune rx behavior
  2023-03-13 20:07   ` Simon Horman
@ 2023-03-14  1:56     ` Jason Xing
  0 siblings, 0 replies; 10+ messages in thread
From: Jason Xing @ 2023-03-14  1:56 UTC (permalink / raw)
  To: Simon Horman
  Cc: davem, edumazet, kuba, pabeni, ast, daniel, hawk, john.fastabend,
	kuniyu, liuhangbin, xiangxia.m.yue, jiri, andy.ren, bpf, netdev,
	linux-kernel, Jason Xing

On Tue, Mar 14, 2023 at 4:07 AM Simon Horman <simon.horman@corigine.com> wrote:
>
> On Mon, Mar 13, 2023 at 10:05:18AM +0800, Jason Xing wrote:
> > On Sun, Mar 12, 2023 at 12:36 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
> > >
> > > From: Jason Xing <kernelxing@tencent.com>
> > >
> > > When we encounter some performance issue and then get lost on how
> > > to tune the budget limit and time limit in net_rx_action() function,
> > > we can separately counting both of them to avoid the confusion.
> > >
> > > Signed-off-by: Jason Xing <kernelxing@tencent.com>
> > > ---
> > > note: this commit is based on the link as below:
> > > https://lore.kernel.org/lkml/20230311151756.83302-1-kerneljasonxing@gmail.com/
> > > ---
> > >  include/linux/netdevice.h |  1 +
> > >  net/core/dev.c            | 12 ++++++++----
> > >  net/core/net-procfs.c     |  9 ++++++---
> > >  3 files changed, 15 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> > > index 6a14b7b11766..5736311a2133 100644
> > > --- a/include/linux/netdevice.h
> > > +++ b/include/linux/netdevice.h
> > > @@ -3157,6 +3157,7 @@ struct softnet_data {
> > >         /* stats */
> > >         unsigned int            processed;
> > >         unsigned int            time_squeeze;
> > > +       unsigned int            budget_squeeze;
> > >  #ifdef CONFIG_RPS
> > >         struct softnet_data     *rps_ipi_list;
> > >  #endif
> > > diff --git a/net/core/dev.c b/net/core/dev.c
> > > index 253584777101..bed7a68fdb5d 100644
> > > --- a/net/core/dev.c
> > > +++ b/net/core/dev.c
> > > @@ -6637,6 +6637,7 @@ static __latent_entropy void net_rx_action(struct softirq_action *h)
> > >         unsigned long time_limit = jiffies +
> > >                 usecs_to_jiffies(READ_ONCE(netdev_budget_usecs));
> > >         int budget = READ_ONCE(netdev_budget);
> > > +       bool is_continue = true;
> >
> > I kept thinking during these days, I think it looks not that concise
> > and elegant and also the name is not that good though the function can
> > work.
> >
> > In the next submission, I'm going to choose to use 'while()' instead
> > of 'for()' suggested by Stephen.
> >
> > Does anyone else have some advice about this?
>
> What about:
>
>         int done = false
>
>         while (!done) {
>                 ...
>         }
>
> Or:
>
>         for (;;) {
>                 int done = false;
>
>                 ...
>                 if (done)
>                         break;
>         }
>

Great, that looks much better:)

Thanks,
Jason

> >
> > Thanks,
> > Jason
> >
> > >         LIST_HEAD(list);
> > >         LIST_HEAD(repoll);
> > >
> > > @@ -6644,7 +6645,7 @@ static __latent_entropy void net_rx_action(struct softirq_action *h)
> > >         list_splice_init(&sd->poll_list, &list);
> > >         local_irq_enable();
> > >
> > > -       for (;;) {
> > > +       for (; is_continue;) {
> > >                 struct napi_struct *n;
> > >
> > >                 skb_defer_free_flush(sd);
> > > @@ -6662,10 +6663,13 @@ static __latent_entropy void net_rx_action(struct softirq_action *h)
> > >                  * Allow this to run for 2 jiffies since which will allow
> > >                  * an average latency of 1.5/HZ.
> > >                  */
> > > -               if (unlikely(budget <= 0 ||
> > > -                            time_after_eq(jiffies, time_limit))) {
> > > +               if (unlikely(budget <= 0)) {
> > > +                       sd->budget_squeeze++;
> > > +                       is_continue = false;
> > > +               }
> > > +               if (unlikely(time_after_eq(jiffies, time_limit))) {
> > >                         sd->time_squeeze++;
> > > -                       break;
> > > +                       is_continue = false;
> > >                 }
> > >         }
> > >
> > > diff --git a/net/core/net-procfs.c b/net/core/net-procfs.c
> > > index 97a304e1957a..4d1a499d7c43 100644
> > > --- a/net/core/net-procfs.c
> > > +++ b/net/core/net-procfs.c
> > > @@ -174,14 +174,17 @@ static int softnet_seq_show(struct seq_file *seq, void *v)
> > >          */
> > >         seq_printf(seq,
> > >                    "%08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x "
> > > -                  "%08x %08x\n",
> > > -                  sd->processed, sd->dropped, sd->time_squeeze, 0,
> > > +                  "%08x %08x %08x %08x\n",
> > > +                  sd->processed, sd->dropped,
> > > +                  0, /* was old way to count time squeeze */
> > > +                  0,
> > >                    0, 0, 0, 0, /* was fastroute */
> > >                    0,   /* was cpu_collision */
> > >                    sd->received_rps, flow_limit_count,
> > >                    0,   /* was len of two backlog queues */
> > >                    (int)seq->index,
> > > -                  softnet_input_pkt_queue_len(sd), softnet_process_queue_len(sd));
> > > +                  softnet_input_pkt_queue_len(sd), softnet_process_queue_len(sd),
> > > +                  sd->time_squeeze, sd->budget_squeeze);
> > >         return 0;
> > >  }
> > >
> > > --
> > > 2.37.3
> > >
> >

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net-next] net: introduce budget_squeeze to help us tune rx behavior
  2023-03-13 21:58 ` Kui-Feng Lee
@ 2023-03-14  1:57   ` Jason Xing
  2023-03-14  8:39     ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 10+ messages in thread
From: Jason Xing @ 2023-03-14  1:57 UTC (permalink / raw)
  To: Kui-Feng Lee
  Cc: davem, edumazet, kuba, pabeni, ast, daniel, hawk, john.fastabend,
	kuniyu, liuhangbin, xiangxia.m.yue, jiri, andy.ren, bpf, netdev,
	linux-kernel, Jason Xing

On Tue, Mar 14, 2023 at 5:58 AM Kui-Feng Lee <sinquersw@gmail.com> wrote:
>
>
>
> On 3/11/23 08:36, Jason Xing wrote:
> > From: Jason Xing <kernelxing@tencent.com>
> >
> > When we encounter some performance issue and then get lost on how
> > to tune the budget limit and time limit in net_rx_action() function,
> > we can separately counting both of them to avoid the confusion.
> >
> > Signed-off-by: Jason Xing <kernelxing@tencent.com>
> > ---
> > note: this commit is based on the link as below:
> > https://lore.kernel.org/lkml/20230311151756.83302-1-kerneljasonxing@gmail.com/
> > ---
> >   include/linux/netdevice.h |  1 +
> >   net/core/dev.c            | 12 ++++++++----
> >   net/core/net-procfs.c     |  9 ++++++---
> >   3 files changed, 15 insertions(+), 7 deletions(-)
> >
> > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> > index 6a14b7b11766..5736311a2133 100644
> > --- a/include/linux/netdevice.h
> > +++ b/include/linux/netdevice.h
> > @@ -3157,6 +3157,7 @@ struct softnet_data {
> >       /* stats */
> >       unsigned int            processed;
> >       unsigned int            time_squeeze;
> > +     unsigned int            budget_squeeze;
> >   #ifdef CONFIG_RPS
> >       struct softnet_data     *rps_ipi_list;
> >   #endif
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index 253584777101..bed7a68fdb5d 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -6637,6 +6637,7 @@ static __latent_entropy void net_rx_action(struct softirq_action *h)
> >       unsigned long time_limit = jiffies +
> >               usecs_to_jiffies(READ_ONCE(netdev_budget_usecs));
> >       int budget = READ_ONCE(netdev_budget);
> > +     bool is_continue = true;
> >       LIST_HEAD(list);
> >       LIST_HEAD(repoll);
> >
> > @@ -6644,7 +6645,7 @@ static __latent_entropy void net_rx_action(struct softirq_action *h)
> >       list_splice_init(&sd->poll_list, &list);
> >       local_irq_enable();
> >
> > -     for (;;) {
> > +     for (; is_continue;) {
> >               struct napi_struct *n;
> >
> >               skb_defer_free_flush(sd);
> > @@ -6662,10 +6663,13 @@ static __latent_entropy void net_rx_action(struct softirq_action *h)
> >                * Allow this to run for 2 jiffies since which will allow
> >                * an average latency of 1.5/HZ.
> >                */
> > -             if (unlikely(budget <= 0 ||
> > -                          time_after_eq(jiffies, time_limit))) {
> > +             if (unlikely(budget <= 0)) {
> > +                     sd->budget_squeeze++;
> > +                     is_continue = false;
> > +             }
> > +             if (unlikely(time_after_eq(jiffies, time_limit))) {
> >                       sd->time_squeeze++;
> > -                     break;
> > +                     is_continue = false;
> >               }
> >       }
> >
> > diff --git a/net/core/net-procfs.c b/net/core/net-procfs.c
> > index 97a304e1957a..4d1a499d7c43 100644
> > --- a/net/core/net-procfs.c
> > +++ b/net/core/net-procfs.c
> > @@ -174,14 +174,17 @@ static int softnet_seq_show(struct seq_file *seq, void *v)
> >        */
> >       seq_printf(seq,
> >                  "%08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x "
> > -                "%08x %08x\n",
> > -                sd->processed, sd->dropped, sd->time_squeeze, 0,
> > +                "%08x %08x %08x %08x\n",
> > +                sd->processed, sd->dropped,
> > +                0, /* was old way to count time squeeze */
>
> Should we show a proximate number?  For example,
> sd->time_squeeze + sd->bud_squeeze.

Yeah, It does make sense. Let the old way to display untouched.

>
>
> > +                0,
> >                  0, 0, 0, 0, /* was fastroute */
> >                  0,   /* was cpu_collision */
> >                  sd->received_rps, flow_limit_count,
> >                  0,   /* was len of two backlog queues */
> >                  (int)seq->index,
> > -                softnet_input_pkt_queue_len(sd), softnet_process_queue_len(sd));
> > +                softnet_input_pkt_queue_len(sd), softnet_process_queue_len(sd),
> > +                sd->time_squeeze, sd->budget_squeeze);
> >       return 0;
> >   }
> >

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net-next] net: introduce budget_squeeze to help us tune rx behavior
  2023-03-14  1:57   ` Jason Xing
@ 2023-03-14  8:39     ` Jesper Dangaard Brouer
  2023-03-14  9:21       ` Jason Xing
  0 siblings, 1 reply; 10+ messages in thread
From: Jesper Dangaard Brouer @ 2023-03-14  8:39 UTC (permalink / raw)
  To: Jason Xing, Kui-Feng Lee
  Cc: brouer, davem, edumazet, kuba, pabeni, ast, daniel, hawk,
	john.fastabend, kuniyu, liuhangbin, xiangxia.m.yue, jiri,
	andy.ren, bpf, netdev, linux-kernel, Jason Xing,
	Willem de Bruijn, Simon Sundberg


On 14/03/2023 02.57, Jason Xing wrote:
> On Tue, Mar 14, 2023 at 5:58 AM Kui-Feng Lee <sinquersw@gmail.com> wrote:
>>
>> On 3/11/23 08:36, Jason Xing wrote:
>>> From: Jason Xing <kernelxing@tencent.com>
>>>
>>> When we encounter some performance issue and then get lost on how
>>> to tune the budget limit and time limit in net_rx_action() function,
>>> we can separately counting both of them to avoid the confusion.
>>>
>>> Signed-off-by: Jason Xing <kernelxing@tencent.com>
>>> ---
>>> note: this commit is based on the link as below:
>>> https://lore.kernel.org/lkml/20230311151756.83302-1-kerneljasonxing@gmail.com/
>>> ---
[...]
>>> diff --git a/net/core/net-procfs.c b/net/core/net-procfs.c
>>> index 97a304e1957a..4d1a499d7c43 100644
>>> --- a/net/core/net-procfs.c
>>> +++ b/net/core/net-procfs.c
>>> @@ -174,14 +174,17 @@ static int softnet_seq_show(struct seq_file *seq, void *v)
>>>         */
>>>        seq_printf(seq,
>>>                   "%08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x "
>>> -                "%08x %08x\n",
>>> -                sd->processed, sd->dropped, sd->time_squeeze, 0,
>>> +                "%08x %08x %08x %08x\n",
>>> +                sd->processed, sd->dropped,
>>> +                0, /* was old way to count time squeeze */
>>
>> Should we show a proximate number?  For example,
>> sd->time_squeeze + sd->bud_squeeze.
> 
> Yeah, It does make sense. Let the old way to display untouched.
>

Yes, I don't think we can/should remove this squeeze stat because
several tools e.g. my own[1] captures these stats (and I know Willem
also have his own tool).
I like the sd->time_squeeze + sd->budget_squeeze suggestion.

  [1] 
https://github.com/netoptimizer/network-testing/blob/master/bin/softnet_stat.pl


>>
>>
>>> +                0,
>>>                   0, 0, 0, 0, /* was fastroute */
>>>                   0,   /* was cpu_collision */
>>>                   sd->received_rps, flow_limit_count,
>>>                   0,   /* was len of two backlog queues */
>>>                   (int)seq->index,
>>> -                softnet_input_pkt_queue_len(sd), softnet_process_queue_len(sd));
>>> +                softnet_input_pkt_queue_len(sd), softnet_process_queue_len(sd),
>>> +                sd->time_squeeze, sd->budget_squeeze);
>>>        return 0;
>>>    }
>>>

We recently had a very long troubleshooting session around a latency
issue (Cc Simon) where we used the tool[1].  The issue was NIC hardware
RX queue was backlogged, but we didn't see any squeeze events, which
confused us. (This happens because budget was 300 and two NICs using 64
budget each doesn't exceed 300).

We were/are missing another counter to tell us net_rx_action() "repoll"
is happening (as code !list_empty(&repoll)).  That were the case and it
would have "told" us that hardware RX ring was full (larger than 64).

We worked around this limitation by using the tracepoint for napi_poll,
and manually deduced that 64 bulking must mean that "repoll" were happening.

Oneliner bpftrace script:

  bpftrace -e 'tracepoint:napi:napi_poll { 
@napi_rx_bulk[str(args->dev_name)] = lhist(args->work, 0, 64, 4); }'

We used this script (that also measures softirq latency):

 
https://github.com/xdp-project/xdp-project/blob/master/areas/latency/napi_monitor.bt 


I do wonder is it would be valuable to *also* add a tracepoint to
net_rx_action, that expose sd->time_squeeze, sd->budget_squeeze and
repoll-not-empty.

--Jesper


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net-next] net: introduce budget_squeeze to help us tune rx behavior
  2023-03-14  8:39     ` Jesper Dangaard Brouer
@ 2023-03-14  9:21       ` Jason Xing
  0 siblings, 0 replies; 10+ messages in thread
From: Jason Xing @ 2023-03-14  9:21 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Kui-Feng Lee, brouer, davem, edumazet, kuba, pabeni, ast, daniel,
	hawk, john.fastabend, kuniyu, liuhangbin, xiangxia.m.yue, jiri,
	andy.ren, bpf, netdev, linux-kernel, Jason Xing,
	Willem de Bruijn, Simon Sundberg

On Tue, Mar 14, 2023 at 4:41 PM Jesper Dangaard Brouer
<jbrouer@redhat.com> wrote:
>
>
> On 14/03/2023 02.57, Jason Xing wrote:
> > On Tue, Mar 14, 2023 at 5:58 AM Kui-Feng Lee <sinquersw@gmail.com> wrote:
> >>
> >> On 3/11/23 08:36, Jason Xing wrote:
> >>> From: Jason Xing <kernelxing@tencent.com>
> >>>
> >>> When we encounter some performance issue and then get lost on how
> >>> to tune the budget limit and time limit in net_rx_action() function,
> >>> we can separately counting both of them to avoid the confusion.
> >>>
> >>> Signed-off-by: Jason Xing <kernelxing@tencent.com>
> >>> ---
> >>> note: this commit is based on the link as below:
> >>> https://lore.kernel.org/lkml/20230311151756.83302-1-kerneljasonxing@gmail.com/
> >>> ---
> [...]
> >>> diff --git a/net/core/net-procfs.c b/net/core/net-procfs.c
> >>> index 97a304e1957a..4d1a499d7c43 100644
> >>> --- a/net/core/net-procfs.c
> >>> +++ b/net/core/net-procfs.c
> >>> @@ -174,14 +174,17 @@ static int softnet_seq_show(struct seq_file *seq, void *v)
> >>>         */
> >>>        seq_printf(seq,
> >>>                   "%08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x "
> >>> -                "%08x %08x\n",
> >>> -                sd->processed, sd->dropped, sd->time_squeeze, 0,
> >>> +                "%08x %08x %08x %08x\n",
> >>> +                sd->processed, sd->dropped,
> >>> +                0, /* was old way to count time squeeze */
> >>
> >> Should we show a proximate number?  For example,
> >> sd->time_squeeze + sd->bud_squeeze.
> >
> > Yeah, It does make sense. Let the old way to display untouched.
> >
>
[...]
> Yes, I don't think we can/should remove this squeeze stat because
> several tools e.g. my own[1] captures these stats (and I know Willem
> also have his own tool).
> I like the sd->time_squeeze + sd->budget_squeeze suggestion.

So do I. Therefore I followed this suggestion in the next submission.

[1]
https://lore.kernel.org/lkml/20230314030532.9238-3-kerneljasonxing@gmail.com/

>
>   [1]
> https://github.com/netoptimizer/network-testing/blob/master/bin/softnet_stat.pl
>
>
> >>
> >>
> >>> +                0,
> >>>                   0, 0, 0, 0, /* was fastroute */
> >>>                   0,   /* was cpu_collision */
> >>>                   sd->received_rps, flow_limit_count,
> >>>                   0,   /* was len of two backlog queues */
> >>>                   (int)seq->index,
> >>> -                softnet_input_pkt_queue_len(sd), softnet_process_queue_len(sd));
> >>> +                softnet_input_pkt_queue_len(sd), softnet_process_queue_len(sd),
> >>> +                sd->time_squeeze, sd->budget_squeeze);
> >>>        return 0;
> >>>    }
> >>>
>
[...]
> We recently had a very long troubleshooting session around a latency
> issue (Cc Simon) where we used the tool[1].  The issue was NIC hardware
> RX queue was backlogged, but we didn't see any squeeze events, which
> confused us. (This happens because budget was 300 and two NICs using 64
> budget each doesn't exceed 300).

I recently found some users running on our production environment hit
the time_squeeze very often which aroused my interests.
Env:
1) budget is 300;
2) eth0 is virtio_net which only registers 32 input interrupts (32
queue pairs) with a larger number of cpus online.

>
> We were/are missing another counter to tell us net_rx_action() "repoll"
> is happening (as code !list_empty(&repoll)).  That were the case and it
> would have "told" us that hardware RX ring was full (larger than 64).
>
> We worked around this limitation by using the tracepoint for napi_poll,
> and manually deduced that 64 bulking must mean that "repoll" were happening.
>
> Oneliner bpftrace script:
>
>   bpftrace -e 'tracepoint:napi:napi_poll {
> @napi_rx_bulk[str(args->dev_name)] = lhist(args->work, 0, 64, 4); }'
>
> We used this script (that also measures softirq latency):
>
>
> https://github.com/xdp-project/xdp-project/blob/master/areas/latency/napi_monitor.bt
>
>
[...]
> I do wonder is it would be valuable to *also* add a tracepoint to
> net_rx_action, that expose sd->time_squeeze, sd->budget_squeeze and
> repoll-not-empty.

I believe it's useful that we can show more details in softnet_data,
but I'm confused about how to display them.
This morning I submitted one patch[1] and chose to do such things when
reading the softnet_stat file.

Could we add more data in the softnet_stat file while also tracing
those three important points? I'm not sure.

Thanks,
Jason

>
> --Jesper
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2023-03-14  9:22 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-11 16:36 [PATCH net-next] net: introduce budget_squeeze to help us tune rx behavior Jason Xing
2023-03-11 17:14 ` Stephen Hemminger
2023-03-12  0:04   ` Jason Xing
2023-03-13  2:05 ` Jason Xing
2023-03-13 20:07   ` Simon Horman
2023-03-14  1:56     ` Jason Xing
2023-03-13 21:58 ` Kui-Feng Lee
2023-03-14  1:57   ` Jason Xing
2023-03-14  8:39     ` Jesper Dangaard Brouer
2023-03-14  9:21       ` Jason Xing

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).