linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Bisected softirq accounting issue in v4.11-rc1~170^2~28
@ 2017-03-28  8:14 Jesper Dangaard Brouer
  2017-03-28 10:34 ` Wanpeng Li
  2017-03-28 14:34 ` Frederic Weisbecker
  0 siblings, 2 replies; 11+ messages in thread
From: Jesper Dangaard Brouer @ 2017-03-28  8:14 UTC (permalink / raw)
  To: Frederic Weisbecker, linux-kernel
  Cc: brouer, netdev, linux-mm, Mel Gorman, Tariq Toukan, Tariq Toukan,
	Peter Zijlstra, Rik van Riel, Thomas Gleixner, Ingo Molnar


(While evaluating some changes to the page allocator) I ran into an
issue with ksoftirqd getting too much CPU sched time.

I bisected the problem to
 a499a5a14dbd ("sched/cputime: Increment kcpustat directly on irqtime account")

 a499a5a14dbd1d0315a96fc62a8798059325e9e6 is the first bad commit
 commit a499a5a14dbd1d0315a96fc62a8798059325e9e6
 Author: Frederic Weisbecker <fweisbec@gmail.com>
 Date:   Tue Jan 31 04:09:32 2017 +0100

    sched/cputime: Increment kcpustat directly on irqtime account
    
    The irqtime is accounted is nsecs and stored in
    cpu_irq_time.hardirq_time and cpu_irq_time.softirq_time. Once the
    accumulated amount reaches a new jiffy, this one gets accounted to the
    kcpustat.
    
    This was necessary when kcpustat was stored in cputime_t, which could at
    worst have jiffies granularity. But now kcpustat is stored in nsecs
    so this whole discretization game with temporary irqtime storage has
    become unnecessary.
    
    We can now directly account the irqtime to the kcpustat.
    
    Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Cc: Fenghua Yu <fenghua.yu@intel.com>
    Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
    Cc: Michael Ellerman <mpe@ellerman.id.au>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Stanislaw Gruszka <sgruszka@redhat.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Tony Luck <tony.luck@intel.com>
    Cc: Wanpeng Li <wanpeng.li@hotmail.com>
    Link: http://lkml.kernel.org/r/1485832191-26889-17-git-send-email-fweisbec@gmail.com
    Signed-off-by: Ingo Molnar <mingo@kernel.org>

The reproducer is running a userspace udp_sink[1] program, and taskset
pinning the process to the same CPU as softirq RX is running on, and
starting a UDP flood with pktgen (tool part of kernel tree:
samples/pktgen/pktgen_sample03_burst_single_flow.sh).

[1] udp_sink
 https://github.com/netoptimizer/network-testing/blob/master/src/udp_sink.c

The expected results (after commit 4cd13c21b207 ("softirq: Let
ksoftirqd do its job")) is that the scheduler split the CPU time 50/50
between udp_sink and ksoftirqd.

After this commit, the udp_sink program does not get any sched CPU
time, and no packets are delivered to userspace.  (All packets are
dropped by softirq due to a full socket queue, nstat UdpRcvbufErrors).

A related symptom is that ksoftirqd no longer get accounted in top.

$ grep CONFIG_IRQ_TIME_ACCOUNTING .config
CONFIG_IRQ_TIME_ACCOUNTING=y

Full .config uploaded here[2]:
 [2] http://people.netfilter.org/hawk/kconfig/config02-bisect-softirq-a499a5a14dbd

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Bisected softirq accounting issue in v4.11-rc1~170^2~28
  2017-03-28  8:14 Bisected softirq accounting issue in v4.11-rc1~170^2~28 Jesper Dangaard Brouer
@ 2017-03-28 10:34 ` Wanpeng Li
  2017-03-28 11:49   ` Jesper Dangaard Brouer
  2017-03-28 12:26   ` Peter Zijlstra
  2017-03-28 14:34 ` Frederic Weisbecker
  1 sibling, 2 replies; 11+ messages in thread
From: Wanpeng Li @ 2017-03-28 10:34 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Frederic Weisbecker, linux-kernel, netdev, linux-mm, Mel Gorman,
	Tariq Toukan, Tariq Toukan, Peter Zijlstra, Rik van Riel,
	Thomas Gleixner, Ingo Molnar

2017-03-28 16:14 GMT+08:00 Jesper Dangaard Brouer <brouer@redhat.com>:
>
> (While evaluating some changes to the page allocator) I ran into an
> issue with ksoftirqd getting too much CPU sched time.
>
> I bisected the problem to
>  a499a5a14dbd ("sched/cputime: Increment kcpustat directly on irqtime account")
>
>  a499a5a14dbd1d0315a96fc62a8798059325e9e6 is the first bad commit
>  commit a499a5a14dbd1d0315a96fc62a8798059325e9e6
>  Author: Frederic Weisbecker <fweisbec@gmail.com>
>  Date:   Tue Jan 31 04:09:32 2017 +0100
>
>     sched/cputime: Increment kcpustat directly on irqtime account
>
>     The irqtime is accounted is nsecs and stored in
>     cpu_irq_time.hardirq_time and cpu_irq_time.softirq_time. Once the
>     accumulated amount reaches a new jiffy, this one gets accounted to the
>     kcpustat.
>
>     This was necessary when kcpustat was stored in cputime_t, which could at
>     worst have jiffies granularity. But now kcpustat is stored in nsecs
>     so this whole discretization game with temporary irqtime storage has
>     become unnecessary.
>
>     We can now directly account the irqtime to the kcpustat.
>
>     Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
>     Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>     Cc: Fenghua Yu <fenghua.yu@intel.com>
>     Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
>     Cc: Linus Torvalds <torvalds@linux-foundation.org>
>     Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
>     Cc: Michael Ellerman <mpe@ellerman.id.au>
>     Cc: Paul Mackerras <paulus@samba.org>
>     Cc: Peter Zijlstra <peterz@infradead.org>
>     Cc: Rik van Riel <riel@redhat.com>
>     Cc: Stanislaw Gruszka <sgruszka@redhat.com>
>     Cc: Thomas Gleixner <tglx@linutronix.de>
>     Cc: Tony Luck <tony.luck@intel.com>
>     Cc: Wanpeng Li <wanpeng.li@hotmail.com>
>     Link: http://lkml.kernel.org/r/1485832191-26889-17-git-send-email-fweisbec@gmail.com
>     Signed-off-by: Ingo Molnar <mingo@kernel.org>
>
> The reproducer is running a userspace udp_sink[1] program, and taskset
> pinning the process to the same CPU as softirq RX is running on, and
> starting a UDP flood with pktgen (tool part of kernel tree:
> samples/pktgen/pktgen_sample03_burst_single_flow.sh).
>
> [1] udp_sink
>  https://github.com/netoptimizer/network-testing/blob/master/src/udp_sink.c
>
> The expected results (after commit 4cd13c21b207 ("softirq: Let
> ksoftirqd do its job")) is that the scheduler split the CPU time 50/50
> between udp_sink and ksoftirqd.
>
> After this commit, the udp_sink program does not get any sched CPU
> time, and no packets are delivered to userspace.  (All packets are
> dropped by softirq due to a full socket queue, nstat UdpRcvbufErrors).
>
> A related symptom is that ksoftirqd no longer get accounted in top.
>
> $ grep CONFIG_IRQ_TIME_ACCOUNTING .config
> CONFIG_IRQ_TIME_ACCOUNTING=y
>
> Full .config uploaded here[2]:
>  [2] http://people.netfilter.org/hawk/kconfig/config02-bisect-softirq-a499a5a14dbd

void irqtime_account_irq(struct task_struct *curr)
{
    struct irqtime *irqtime = this_cpu_ptr(&cpu_irqtime);
    u64 *cpustat = kcpustat_this_cpu->cpustat;
    s64 delta;
    int cpu;

    if (!sched_clock_irqtime)
        return;

    cpu = smp_processor_id();
    delta = sched_clock_cpu(cpu) - irqtime->irq_start_time;

sched_clock_cpu(cpu) should be converted from cputime to ns. However,
cputime_to_nsecs() API is removed by
https://lkml.org/lkml/2017/1/22/230 for generic usage, so could you
try the below patch just for testing?

diff --git a/include/linux/sched/cputime.h b/include/linux/sched/cputime.h
index 4c5b973..166efba 100644
--- a/include/linux/sched/cputime.h
+++ b/include/linux/sched/cputime.h
@@ -7,14 +7,14 @@
  * cputime accounting APIs:
  */

-#ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
-#include <asm/cputime.h>
+#define cputime_div(__ct, divisor) div_u64((__force u64)__ct, divisor)
+#define cputime_to_usecs(__ct) \
+    cputime_div(__ct, NSEC_PER_USEC)

 #ifndef cputime_to_nsecs
 # define cputime_to_nsecs(__ct)    \
     (cputime_to_usecs(__ct) * NSEC_PER_USEC)
 #endif
-#endif /* CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */

 #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
 extern void task_cputime(struct task_struct *t,
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index f3778e2b..68064d1 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -49,7 +49,7 @@ void irqtime_account_irq(struct task_struct *curr)
         return;

     cpu = smp_processor_id();
-    delta = sched_clock_cpu(cpu) - irqtime->irq_start_time;
+    delta = cputime_to_nsecs(sched_clock_cpu(cpu)) - irqtime->irq_start_time;
     irqtime->irq_start_time += delta;

     u64_stats_update_begin(&irqtime->sync);

Regards,
Wanpeng Li

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: Bisected softirq accounting issue in v4.11-rc1~170^2~28
  2017-03-28 10:34 ` Wanpeng Li
@ 2017-03-28 11:49   ` Jesper Dangaard Brouer
  2017-03-28 12:26   ` Peter Zijlstra
  1 sibling, 0 replies; 11+ messages in thread
From: Jesper Dangaard Brouer @ 2017-03-28 11:49 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: Frederic Weisbecker, linux-kernel, netdev, linux-mm, Mel Gorman,
	Tariq Toukan, Tariq Toukan, Peter Zijlstra, Rik van Riel,
	Thomas Gleixner, Ingo Molnar, brouer

On Tue, 28 Mar 2017 18:34:52 +0800
Wanpeng Li <kernellwp@gmail.com> wrote:

> 2017-03-28 16:14 GMT+08:00 Jesper Dangaard Brouer <brouer@redhat.com>:
> >
> > (While evaluating some changes to the page allocator) I ran into an
> > issue with ksoftirqd getting too much CPU sched time.
> >
> > I bisected the problem to
> >  a499a5a14dbd ("sched/cputime: Increment kcpustat directly on irqtime account")
> >
> >  a499a5a14dbd1d0315a96fc62a8798059325e9e6 is the first bad commit
> >  commit a499a5a14dbd1d0315a96fc62a8798059325e9e6
> >  Author: Frederic Weisbecker <fweisbec@gmail.com>
> >  Date:   Tue Jan 31 04:09:32 2017 +0100
> >
> >     sched/cputime: Increment kcpustat directly on irqtime account
> >
> >     The irqtime is accounted is nsecs and stored in
> >     cpu_irq_time.hardirq_time and cpu_irq_time.softirq_time. Once the
> >     accumulated amount reaches a new jiffy, this one gets accounted to the
> >     kcpustat.
> >
> >     This was necessary when kcpustat was stored in cputime_t, which could at
> >     worst have jiffies granularity. But now kcpustat is stored in nsecs
> >     so this whole discretization game with temporary irqtime storage has
> >     become unnecessary.
> >
> >     We can now directly account the irqtime to the kcpustat.
> >
> >     Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> >     Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> >     Cc: Fenghua Yu <fenghua.yu@intel.com>
> >     Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
> >     Cc: Linus Torvalds <torvalds@linux-foundation.org>
> >     Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
> >     Cc: Michael Ellerman <mpe@ellerman.id.au>
> >     Cc: Paul Mackerras <paulus@samba.org>
> >     Cc: Peter Zijlstra <peterz@infradead.org>
> >     Cc: Rik van Riel <riel@redhat.com>
> >     Cc: Stanislaw Gruszka <sgruszka@redhat.com>
> >     Cc: Thomas Gleixner <tglx@linutronix.de>
> >     Cc: Tony Luck <tony.luck@intel.com>
> >     Cc: Wanpeng Li <wanpeng.li@hotmail.com>
> >     Link: http://lkml.kernel.org/r/1485832191-26889-17-git-send-email-fweisbec@gmail.com
> >     Signed-off-by: Ingo Molnar <mingo@kernel.org>
> >
> > The reproducer is running a userspace udp_sink[1] program, and taskset
> > pinning the process to the same CPU as softirq RX is running on, and
> > starting a UDP flood with pktgen (tool part of kernel tree:
> > samples/pktgen/pktgen_sample03_burst_single_flow.sh).
> >
> > [1] udp_sink
> >  https://github.com/netoptimizer/network-testing/blob/master/src/udp_sink.c
> >
> > The expected results (after commit 4cd13c21b207 ("softirq: Let
> > ksoftirqd do its job")) is that the scheduler split the CPU time 50/50
> > between udp_sink and ksoftirqd.
> >
> > After this commit, the udp_sink program does not get any sched CPU
> > time, and no packets are delivered to userspace.  (All packets are
> > dropped by softirq due to a full socket queue, nstat UdpRcvbufErrors).
> >
> > A related symptom is that ksoftirqd no longer get accounted in top.
> >
> > $ grep CONFIG_IRQ_TIME_ACCOUNTING .config
> > CONFIG_IRQ_TIME_ACCOUNTING=y
> >
> > Full .config uploaded here[2]:
> >  [2] http://people.netfilter.org/hawk/kconfig/config02-bisect-softirq-a499a5a14dbd  
> 
> void irqtime_account_irq(struct task_struct *curr)
> {
>     struct irqtime *irqtime = this_cpu_ptr(&cpu_irqtime);
>     u64 *cpustat = kcpustat_this_cpu->cpustat;
>     s64 delta;
>     int cpu;
> 
>     if (!sched_clock_irqtime)
>         return;
> 
>     cpu = smp_processor_id();
>     delta = sched_clock_cpu(cpu) - irqtime->irq_start_time;
> 
> sched_clock_cpu(cpu) should be converted from cputime to ns. However,
> cputime_to_nsecs() API is removed by
> https://lkml.org/lkml/2017/1/22/230 for generic usage, so could you
> try the below patch just for testing?

Sure... it didn't solve the problem :-(


> diff --git a/include/linux/sched/cputime.h b/include/linux/sched/cputime.h
> index 4c5b973..166efba 100644
> --- a/include/linux/sched/cputime.h
> +++ b/include/linux/sched/cputime.h
> @@ -7,14 +7,14 @@
>   * cputime accounting APIs:
>   */
> 
> -#ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
> -#include <asm/cputime.h>
> +#define cputime_div(__ct, divisor) div_u64((__force u64)__ct, divisor)
> +#define cputime_to_usecs(__ct) \
> +    cputime_div(__ct, NSEC_PER_USEC)
> 
>  #ifndef cputime_to_nsecs
>  # define cputime_to_nsecs(__ct)    \
>      (cputime_to_usecs(__ct) * NSEC_PER_USEC)
>  #endif
> -#endif /* CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */
> 
>  #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
>  extern void task_cputime(struct task_struct *t,
> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> index f3778e2b..68064d1 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -49,7 +49,7 @@ void irqtime_account_irq(struct task_struct *curr)
>          return;
> 
>      cpu = smp_processor_id();
> -    delta = sched_clock_cpu(cpu) - irqtime->irq_start_time;
> +    delta = cputime_to_nsecs(sched_clock_cpu(cpu)) - irqtime->irq_start_time;
>      irqtime->irq_start_time += delta;
> 
>      u64_stats_update_begin(&irqtime->sync);


-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Bisected softirq accounting issue in v4.11-rc1~170^2~28
  2017-03-28 10:34 ` Wanpeng Li
  2017-03-28 11:49   ` Jesper Dangaard Brouer
@ 2017-03-28 12:26   ` Peter Zijlstra
  2017-03-28 13:06     ` Frederic Weisbecker
  1 sibling, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2017-03-28 12:26 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: Jesper Dangaard Brouer, Frederic Weisbecker, linux-kernel,
	netdev, linux-mm, Mel Gorman, Tariq Toukan, Tariq Toukan,
	Rik van Riel, Thomas Gleixner, Ingo Molnar

On Tue, Mar 28, 2017 at 06:34:52PM +0800, Wanpeng Li wrote:
> 
> sched_clock_cpu(cpu) should be converted from cputime to ns.

Uhm, no. sched_clock_cpu() returns u64 in ns.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Bisected softirq accounting issue in v4.11-rc1~170^2~28
  2017-03-28 12:26   ` Peter Zijlstra
@ 2017-03-28 13:06     ` Frederic Weisbecker
  2017-03-28 13:18       ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 11+ messages in thread
From: Frederic Weisbecker @ 2017-03-28 13:06 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Wanpeng Li, Jesper Dangaard Brouer, linux-kernel, netdev,
	linux-mm, Mel Gorman, Tariq Toukan, Tariq Toukan, Rik van Riel,
	Thomas Gleixner, Ingo Molnar

On Tue, Mar 28, 2017 at 02:26:42PM +0200, Peter Zijlstra wrote:
> On Tue, Mar 28, 2017 at 06:34:52PM +0800, Wanpeng Li wrote:
> > 
> > sched_clock_cpu(cpu) should be converted from cputime to ns.
> 
> Uhm, no. sched_clock_cpu() returns u64 in ns.

Yes, and most of the cputime_t have been converted to u64 so there
should be no such conversion issue between u64 and cputime_t anymore.

Perhaps my commit has another side effect on softirq time accounting,
I'll see if I can reproduce.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Bisected softirq accounting issue in v4.11-rc1~170^2~28
  2017-03-28 13:06     ` Frederic Weisbecker
@ 2017-03-28 13:18       ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 11+ messages in thread
From: Jesper Dangaard Brouer @ 2017-03-28 13:18 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Peter Zijlstra, Wanpeng Li, linux-kernel, netdev, linux-mm,
	Mel Gorman, Tariq Toukan, Tariq Toukan, Rik van Riel,
	Thomas Gleixner, Ingo Molnar, brouer

On Tue, 28 Mar 2017 15:06:04 +0200
Frederic Weisbecker <fweisbec@gmail.com> wrote:

> On Tue, Mar 28, 2017 at 02:26:42PM +0200, Peter Zijlstra wrote:
> > On Tue, Mar 28, 2017 at 06:34:52PM +0800, Wanpeng Li wrote:  
> > > 
> > > sched_clock_cpu(cpu) should be converted from cputime to ns.  
> > 
> > Uhm, no. sched_clock_cpu() returns u64 in ns.  
> 
> Yes, and most of the cputime_t have been converted to u64 so there
> should be no such conversion issue between u64 and cputime_t anymore.
> 
> Perhaps my commit has another side effect on softirq time accounting,
> I'll see if I can reproduce.

(Disclaimer without knowing anything about the scheduler code)
my theory is that irqtime_account_irq() does not get invoked often
enough, as in my pktgen "overload" use-case keeps softirq always
running. And your change moved updating cpustat[CPUTIME_SOFTIRQ] here.
Before it got updated by account_other_time() which gets invoked from
irqtime_account_process_tick().

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Bisected softirq accounting issue in v4.11-rc1~170^2~28
  2017-03-28  8:14 Bisected softirq accounting issue in v4.11-rc1~170^2~28 Jesper Dangaard Brouer
  2017-03-28 10:34 ` Wanpeng Li
@ 2017-03-28 14:34 ` Frederic Weisbecker
  2017-03-28 15:23   ` Jesper Dangaard Brouer
  1 sibling, 1 reply; 11+ messages in thread
From: Frederic Weisbecker @ 2017-03-28 14:34 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: linux-kernel, netdev, linux-mm, Mel Gorman, Tariq Toukan,
	Tariq Toukan, Peter Zijlstra, Rik van Riel, Thomas Gleixner,
	Ingo Molnar

On Tue, Mar 28, 2017 at 10:14:03AM +0200, Jesper Dangaard Brouer wrote:
> 
> (While evaluating some changes to the page allocator) I ran into an
> issue with ksoftirqd getting too much CPU sched time.
> 
> I bisected the problem to
>  a499a5a14dbd ("sched/cputime: Increment kcpustat directly on irqtime account")
> 
>  a499a5a14dbd1d0315a96fc62a8798059325e9e6 is the first bad commit
>  commit a499a5a14dbd1d0315a96fc62a8798059325e9e6
>  Author: Frederic Weisbecker <fweisbec@gmail.com>
>  Date:   Tue Jan 31 04:09:32 2017 +0100
> 
>     sched/cputime: Increment kcpustat directly on irqtime account
>     
>     The irqtime is accounted is nsecs and stored in
>     cpu_irq_time.hardirq_time and cpu_irq_time.softirq_time. Once the
>     accumulated amount reaches a new jiffy, this one gets accounted to the
>     kcpustat.
>     
>     This was necessary when kcpustat was stored in cputime_t, which could at
>     worst have jiffies granularity. But now kcpustat is stored in nsecs
>     so this whole discretization game with temporary irqtime storage has
>     become unnecessary.
>     
>     We can now directly account the irqtime to the kcpustat.
>     
>     Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
>     Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>     Cc: Fenghua Yu <fenghua.yu@intel.com>
>     Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
>     Cc: Linus Torvalds <torvalds@linux-foundation.org>
>     Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
>     Cc: Michael Ellerman <mpe@ellerman.id.au>
>     Cc: Paul Mackerras <paulus@samba.org>
>     Cc: Peter Zijlstra <peterz@infradead.org>
>     Cc: Rik van Riel <riel@redhat.com>
>     Cc: Stanislaw Gruszka <sgruszka@redhat.com>
>     Cc: Thomas Gleixner <tglx@linutronix.de>
>     Cc: Tony Luck <tony.luck@intel.com>
>     Cc: Wanpeng Li <wanpeng.li@hotmail.com>
>     Link: http://lkml.kernel.org/r/1485832191-26889-17-git-send-email-fweisbec@gmail.com
>     Signed-off-by: Ingo Molnar <mingo@kernel.org>
> 
> The reproducer is running a userspace udp_sink[1] program, and taskset
> pinning the process to the same CPU as softirq RX is running on, and
> starting a UDP flood with pktgen (tool part of kernel tree:
> samples/pktgen/pktgen_sample03_burst_single_flow.sh).

So that means I need to run udp_sink on the same CPU than pktgen?

> 
> [1] udp_sink
>  https://github.com/netoptimizer/network-testing/blob/master/src/udp_sink.c
> 
> The expected results (after commit 4cd13c21b207 ("softirq: Let
> ksoftirqd do its job")) is that the scheduler split the CPU time 50/50
> between udp_sink and ksoftirqd.

I guess you mean that this is what happened before this commit?

> 
> After this commit, the udp_sink program does not get any sched CPU
> time, and no packets are delivered to userspace.  (All packets are
> dropped by softirq due to a full socket queue, nstat UdpRcvbufErrors).
> 
> A related symptom is that ksoftirqd no longer get accounted in top.

That's indeed what I observe. udp_sink has almost no CPU time, neither has
ksoftirqd but kpktgend_0 has everything.

Finally a bug I can reproduce!

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Bisected softirq accounting issue in v4.11-rc1~170^2~28
  2017-03-28 14:34 ` Frederic Weisbecker
@ 2017-03-28 15:23   ` Jesper Dangaard Brouer
  2017-03-28 21:11     ` Frederic Weisbecker
  0 siblings, 1 reply; 11+ messages in thread
From: Jesper Dangaard Brouer @ 2017-03-28 15:23 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: linux-kernel, netdev, linux-mm, Mel Gorman, Tariq Toukan,
	Tariq Toukan, Peter Zijlstra, Rik van Riel, Thomas Gleixner,
	Ingo Molnar, brouer

On Tue, 28 Mar 2017 16:34:36 +0200
Frederic Weisbecker <fweisbec@gmail.com> wrote:

> On Tue, Mar 28, 2017 at 10:14:03AM +0200, Jesper Dangaard Brouer wrote:
> > 
> > (While evaluating some changes to the page allocator) I ran into an
> > issue with ksoftirqd getting too much CPU sched time.
> > 
> > I bisected the problem to
> >  a499a5a14dbd ("sched/cputime: Increment kcpustat directly on irqtime account")
> > 
> >  a499a5a14dbd1d0315a96fc62a8798059325e9e6 is the first bad commit
> >  commit a499a5a14dbd1d0315a96fc62a8798059325e9e6
> >  Author: Frederic Weisbecker <fweisbec@gmail.com>
> >  Date:   Tue Jan 31 04:09:32 2017 +0100
> > 
> >     sched/cputime: Increment kcpustat directly on irqtime account
> >     
> >     The irqtime is accounted is nsecs and stored in
> >     cpu_irq_time.hardirq_time and cpu_irq_time.softirq_time. Once the
> >     accumulated amount reaches a new jiffy, this one gets accounted to the
> >     kcpustat.
> >     
> >     This was necessary when kcpustat was stored in cputime_t, which could at
> >     worst have jiffies granularity. But now kcpustat is stored in nsecs
> >     so this whole discretization game with temporary irqtime storage has
> >     become unnecessary.
> >     
> >     We can now directly account the irqtime to the kcpustat.
> >     
> >     Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> >     Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> >     Cc: Fenghua Yu <fenghua.yu@intel.com>
> >     Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
> >     Cc: Linus Torvalds <torvalds@linux-foundation.org>
> >     Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
> >     Cc: Michael Ellerman <mpe@ellerman.id.au>
> >     Cc: Paul Mackerras <paulus@samba.org>
> >     Cc: Peter Zijlstra <peterz@infradead.org>
> >     Cc: Rik van Riel <riel@redhat.com>
> >     Cc: Stanislaw Gruszka <sgruszka@redhat.com>
> >     Cc: Thomas Gleixner <tglx@linutronix.de>
> >     Cc: Tony Luck <tony.luck@intel.com>
> >     Cc: Wanpeng Li <wanpeng.li@hotmail.com>
> >     Link: http://lkml.kernel.org/r/1485832191-26889-17-git-send-email-fweisbec@gmail.com
> >     Signed-off-by: Ingo Molnar <mingo@kernel.org>
> > 
> > The reproducer is running a userspace udp_sink[1] program, and taskset
> > pinning the process to the same CPU as softirq RX is running on, and
> > starting a UDP flood with pktgen (tool part of kernel tree:
> > samples/pktgen/pktgen_sample03_burst_single_flow.sh).  
> 
> So that means I need to run udp_sink on the same CPU than pktgen?

No, you misunderstood.  I run pktgen on another physical machine, which
is sending UDP packets towards my Device-Under-Test (DUT) target.  The
DUT-target is receiving packets and I observe which CPU the NIC is
delivering these packets to.

E.g determine RX-CPU via mpstat command:
 mpstat -P ALL -u -I SCPU -I SUM 2

I then start udp_sink, pinned to the RX-CPU, like:
 sudo taskset -c 2 ./udp_sink --port 9 --count $((10**6)) --recvmsg --repeat 1000


> > [1] udp_sink
> >  https://github.com/netoptimizer/network-testing/blob/master/src/udp_sink.c
> > 
> > The expected results (after commit 4cd13c21b207 ("softirq: Let
> > ksoftirqd do its job")) is that the scheduler split the CPU time 50/50
> > between udp_sink and ksoftirqd.  
> 
> I guess you mean that this is what happened before this commit?

Yes. (I just pointed out the kernel had another softirq bug, that I was
involved in fixing)
 
> > 
> > After this commit, the udp_sink program does not get any sched CPU
> > time, and no packets are delivered to userspace.  (All packets are
> > dropped by softirq due to a full socket queue, nstat
> > UdpRcvbufErrors).
> > 
> > A related symptom is that ksoftirqd no longer get accounted in
> > top.  
> 
> That's indeed what I observe. udp_sink has almost no CPU time,
> neither has ksoftirqd but kpktgend_0 has everything.
> 
> Finally a bug I can reproduce!

Good to hear you can reproduce it! :-)

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Bisected softirq accounting issue in v4.11-rc1~170^2~28
  2017-03-28 15:23   ` Jesper Dangaard Brouer
@ 2017-03-28 21:11     ` Frederic Weisbecker
  2017-03-29  9:30       ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 11+ messages in thread
From: Frederic Weisbecker @ 2017-03-28 21:11 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: linux-kernel, netdev, linux-mm, Mel Gorman, Tariq Toukan,
	Tariq Toukan, Peter Zijlstra, Rik van Riel, Thomas Gleixner,
	Ingo Molnar

On Tue, Mar 28, 2017 at 05:23:03PM +0200, Jesper Dangaard Brouer wrote:
> On Tue, 28 Mar 2017 16:34:36 +0200
> Frederic Weisbecker <fweisbec@gmail.com> wrote:
> 
> > On Tue, Mar 28, 2017 at 10:14:03AM +0200, Jesper Dangaard Brouer wrote:
> > > 
> > > (While evaluating some changes to the page allocator) I ran into an
> > > issue with ksoftirqd getting too much CPU sched time.
> > > 
> > > I bisected the problem to
> > >  a499a5a14dbd ("sched/cputime: Increment kcpustat directly on irqtime account")
> > > 
> > >  a499a5a14dbd1d0315a96fc62a8798059325e9e6 is the first bad commit
> > >  commit a499a5a14dbd1d0315a96fc62a8798059325e9e6
> > >  Author: Frederic Weisbecker <fweisbec@gmail.com>
> > >  Date:   Tue Jan 31 04:09:32 2017 +0100
> > > 
> > >     sched/cputime: Increment kcpustat directly on irqtime account
> > >     
> > >     The irqtime is accounted is nsecs and stored in
> > >     cpu_irq_time.hardirq_time and cpu_irq_time.softirq_time. Once the
> > >     accumulated amount reaches a new jiffy, this one gets accounted to the
> > >     kcpustat.
> > >     
> > >     This was necessary when kcpustat was stored in cputime_t, which could at
> > >     worst have jiffies granularity. But now kcpustat is stored in nsecs
> > >     so this whole discretization game with temporary irqtime storage has
> > >     become unnecessary.
> > >     
> > >     We can now directly account the irqtime to the kcpustat.
> > >     
> > >     Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> > >     Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> > >     Cc: Fenghua Yu <fenghua.yu@intel.com>
> > >     Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
> > >     Cc: Linus Torvalds <torvalds@linux-foundation.org>
> > >     Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
> > >     Cc: Michael Ellerman <mpe@ellerman.id.au>
> > >     Cc: Paul Mackerras <paulus@samba.org>
> > >     Cc: Peter Zijlstra <peterz@infradead.org>
> > >     Cc: Rik van Riel <riel@redhat.com>
> > >     Cc: Stanislaw Gruszka <sgruszka@redhat.com>
> > >     Cc: Thomas Gleixner <tglx@linutronix.de>
> > >     Cc: Tony Luck <tony.luck@intel.com>
> > >     Cc: Wanpeng Li <wanpeng.li@hotmail.com>
> > >     Link: http://lkml.kernel.org/r/1485832191-26889-17-git-send-email-fweisbec@gmail.com
> > >     Signed-off-by: Ingo Molnar <mingo@kernel.org>
> > > 
> > > The reproducer is running a userspace udp_sink[1] program, and taskset
> > > pinning the process to the same CPU as softirq RX is running on, and
> > > starting a UDP flood with pktgen (tool part of kernel tree:
> > > samples/pktgen/pktgen_sample03_burst_single_flow.sh).  
> > 
> > So that means I need to run udp_sink on the same CPU than pktgen?
> 
> No, you misunderstood.  I run pktgen on another physical machine, which
> is sending UDP packets towards my Device-Under-Test (DUT) target.  The
> DUT-target is receiving packets and I observe which CPU the NIC is
> delivering these packets to.

Ah ok, so I tried to run pktgen on another machine and I get that strange write error:

    # ./pktgen_sample03_burst_single_flow.sh -d 192.168.1.3  -i wlan0
    ./functions.sh: ligne 76 : echo: erreur d'�criture : Erreur inconnue 524
    ERROR: Write error(1) occurred cmd: "clone_skb 100000 > /proc/net/pktgen/wlan0@0"

Any idea?

> 
> E.g determine RX-CPU via mpstat command:
>  mpstat -P ALL -u -I SCPU -I SUM 2
> 
> I then start udp_sink, pinned to the RX-CPU, like:
>  sudo taskset -c 2 ./udp_sink --port 9 --count $((10**6)) --recvmsg --repeat 1000

Ah thanks for these hints!

> > > After this commit, the udp_sink program does not get any sched CPU
> > > time, and no packets are delivered to userspace.  (All packets are
> > > dropped by softirq due to a full socket queue, nstat
> > > UdpRcvbufErrors).
> > > 
> > > A related symptom is that ksoftirqd no longer get accounted in
> > > top.  
> > 
> > That's indeed what I observe. udp_sink has almost no CPU time,
> > neither has ksoftirqd but kpktgend_0 has everything.
> > 
> > Finally a bug I can reproduce!
> 
> Good to hear you can reproduce it! :-)

Well, since I was generating the packets locally, maybe it didn't trigger
the expected interrupts...

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Bisected softirq accounting issue in v4.11-rc1~170^2~28
  2017-03-28 21:11     ` Frederic Weisbecker
@ 2017-03-29  9:30       ` Jesper Dangaard Brouer
  2017-03-29 13:13         ` Frederic Weisbecker
  0 siblings, 1 reply; 11+ messages in thread
From: Jesper Dangaard Brouer @ 2017-03-29  9:30 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: linux-kernel, netdev, linux-mm, Mel Gorman, Tariq Toukan,
	Tariq Toukan, Peter Zijlstra, Rik van Riel, Thomas Gleixner,
	Ingo Molnar, brouer

On Tue, 28 Mar 2017 23:11:22 +0200
Frederic Weisbecker <fweisbec@gmail.com> wrote:

> On Tue, Mar 28, 2017 at 05:23:03PM +0200, Jesper Dangaard Brouer wrote:
> > On Tue, 28 Mar 2017 16:34:36 +0200
> > Frederic Weisbecker <fweisbec@gmail.com> wrote:
> >   
> > > On Tue, Mar 28, 2017 at 10:14:03AM +0200, Jesper Dangaard Brouer wrote:  
> > > > 
> > > > (While evaluating some changes to the page allocator) I ran into an
> > > > issue with ksoftirqd getting too much CPU sched time.
> > > > 
> > > > I bisected the problem to
> > > >  a499a5a14dbd ("sched/cputime: Increment kcpustat directly on irqtime account")
> > > > 
> > > >  a499a5a14dbd1d0315a96fc62a8798059325e9e6 is the first bad commit
> > > >  commit a499a5a14dbd1d0315a96fc62a8798059325e9e6
> > > >  Author: Frederic Weisbecker <fweisbec@gmail.com>
> > > >  Date:   Tue Jan 31 04:09:32 2017 +0100
> > > > 
> > > >     sched/cputime: Increment kcpustat directly on irqtime account
> > > >     
> > > >     The irqtime is accounted is nsecs and stored in
> > > >     cpu_irq_time.hardirq_time and cpu_irq_time.softirq_time. Once the
> > > >     accumulated amount reaches a new jiffy, this one gets accounted to the
> > > >     kcpustat.
> > > >     
> > > >     This was necessary when kcpustat was stored in cputime_t, which could at
> > > >     worst have jiffies granularity. But now kcpustat is stored in nsecs
> > > >     so this whole discretization game with temporary irqtime storage has
> > > >     become unnecessary.
> > > >     
> > > >     We can now directly account the irqtime to the kcpustat.
> > > >     
> > > >     Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> > > >     Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> > > >     Cc: Fenghua Yu <fenghua.yu@intel.com>
> > > >     Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
> > > >     Cc: Linus Torvalds <torvalds@linux-foundation.org>
> > > >     Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
> > > >     Cc: Michael Ellerman <mpe@ellerman.id.au>
> > > >     Cc: Paul Mackerras <paulus@samba.org>
> > > >     Cc: Peter Zijlstra <peterz@infradead.org>
> > > >     Cc: Rik van Riel <riel@redhat.com>
> > > >     Cc: Stanislaw Gruszka <sgruszka@redhat.com>
> > > >     Cc: Thomas Gleixner <tglx@linutronix.de>
> > > >     Cc: Tony Luck <tony.luck@intel.com>
> > > >     Cc: Wanpeng Li <wanpeng.li@hotmail.com>
> > > >     Link: http://lkml.kernel.org/r/1485832191-26889-17-git-send-email-fweisbec@gmail.com
> > > >     Signed-off-by: Ingo Molnar <mingo@kernel.org>
> > > > 
> > > > The reproducer is running a userspace udp_sink[1] program, and taskset
> > > > pinning the process to the same CPU as softirq RX is running on, and
> > > > starting a UDP flood with pktgen (tool part of kernel tree:
> > > > samples/pktgen/pktgen_sample03_burst_single_flow.sh).    
> > > 
> > > So that means I need to run udp_sink on the same CPU than pktgen?  
> > 
> > No, you misunderstood.  I run pktgen on another physical machine, which
> > is sending UDP packets towards my Device-Under-Test (DUT) target.  The
> > DUT-target is receiving packets and I observe which CPU the NIC is
> > delivering these packets to.  
> 
> Ah ok, so I tried to run pktgen on another machine and I get that strange write error:
> 
>     # ./pktgen_sample03_burst_single_flow.sh -d 192.168.1.3  -i wlan0
>     ./functions.sh: ligne 76 : echo: erreur d'�criture : Erreur inconnue 524
>     ERROR: Write error(1) occurred cmd: "clone_skb 100000 > /proc/net/pktgen/wlan0@0"
> 
> Any idea?

Yes, this interface does not support pktgen "clone_skb".  You can
supply cmdline argument "-c 0" to fix this.  But I suspect that this
interface also does not support "burst", thus you also need "-b 0".

See all cmdline args via: ./pktgen_sample03_burst_single_flow.sh -h

Why are you using a wifi interface for this kind of overload testing?
(the basic test here is making sure softirq is busy 100%, and at slow
wifi speeds this might not be possible to force ksoftirqd into this
scheduler state)


> > 
> > E.g determine RX-CPU via mpstat command:
> >  mpstat -P ALL -u -I SCPU -I SUM 2
> > 
> > I then start udp_sink, pinned to the RX-CPU, like:
> >  sudo taskset -c 2 ./udp_sink --port 9 --count $((10**6)) --recvmsg --repeat 1000  
> 
> Ah thanks for these hints!
> 
> > > > After this commit, the udp_sink program does not get any sched CPU
> > > > time, and no packets are delivered to userspace.  (All packets are
> > > > dropped by softirq due to a full socket queue, nstat
> > > > UdpRcvbufErrors).
> > > > 
> > > > A related symptom is that ksoftirqd no longer get accounted in
> > > > top.    
> > > 
> > > That's indeed what I observe. udp_sink has almost no CPU time,
> > > neither has ksoftirqd but kpktgend_0 has everything.
> > > 
> > > Finally a bug I can reproduce!  
> > 
> > Good to hear you can reproduce it! :-)  
> 
> Well, since I was generating the packets locally, maybe it didn't trigger
> the expected interrupts...

Well, you definitely didn't create the test case I was using.  I cannot
remember if the pktgen kthreads runs in softirq context, but I suspect
it does. If so, you can recreate the main problem, which is a softirq
thread using 100% CPU time, which cause no other processes getting
sched time on that CPU.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Bisected softirq accounting issue in v4.11-rc1~170^2~28
  2017-03-29  9:30       ` Jesper Dangaard Brouer
@ 2017-03-29 13:13         ` Frederic Weisbecker
  0 siblings, 0 replies; 11+ messages in thread
From: Frederic Weisbecker @ 2017-03-29 13:13 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: linux-kernel, netdev, linux-mm, Mel Gorman, Tariq Toukan,
	Tariq Toukan, Peter Zijlstra, Rik van Riel, Thomas Gleixner,
	Ingo Molnar

On Wed, Mar 29, 2017 at 11:30:30AM +0200, Jesper Dangaard Brouer wrote:
> On Tue, 28 Mar 2017 23:11:22 +0200
> Frederic Weisbecker <fweisbec@gmail.com> wrote:
> 
> > On Tue, Mar 28, 2017 at 05:23:03PM +0200, Jesper Dangaard Brouer wrote:
> > > On Tue, 28 Mar 2017 16:34:36 +0200
> > > Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > >   
> > > > On Tue, Mar 28, 2017 at 10:14:03AM +0200, Jesper Dangaard Brouer wrote:  
> > > > > 
> > > > > (While evaluating some changes to the page allocator) I ran into an
> > > > > issue with ksoftirqd getting too much CPU sched time.
> > > > > 
> > > > > I bisected the problem to
> > > > >  a499a5a14dbd ("sched/cputime: Increment kcpustat directly on irqtime account")
> > > > > 
> > > > >  a499a5a14dbd1d0315a96fc62a8798059325e9e6 is the first bad commit
> > > > >  commit a499a5a14dbd1d0315a96fc62a8798059325e9e6
> > > > >  Author: Frederic Weisbecker <fweisbec@gmail.com>
> > > > >  Date:   Tue Jan 31 04:09:32 2017 +0100
> > > > > 
> > > > >     sched/cputime: Increment kcpustat directly on irqtime account
> > > > >     
> > > > >     The irqtime is accounted is nsecs and stored in
> > > > >     cpu_irq_time.hardirq_time and cpu_irq_time.softirq_time. Once the
> > > > >     accumulated amount reaches a new jiffy, this one gets accounted to the
> > > > >     kcpustat.
> > > > >     
> > > > >     This was necessary when kcpustat was stored in cputime_t, which could at
> > > > >     worst have jiffies granularity. But now kcpustat is stored in nsecs
> > > > >     so this whole discretization game with temporary irqtime storage has
> > > > >     become unnecessary.
> > > > >     
> > > > >     We can now directly account the irqtime to the kcpustat.
> > > > >     
> > > > >     Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> > > > >     Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> > > > >     Cc: Fenghua Yu <fenghua.yu@intel.com>
> > > > >     Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
> > > > >     Cc: Linus Torvalds <torvalds@linux-foundation.org>
> > > > >     Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
> > > > >     Cc: Michael Ellerman <mpe@ellerman.id.au>
> > > > >     Cc: Paul Mackerras <paulus@samba.org>
> > > > >     Cc: Peter Zijlstra <peterz@infradead.org>
> > > > >     Cc: Rik van Riel <riel@redhat.com>
> > > > >     Cc: Stanislaw Gruszka <sgruszka@redhat.com>
> > > > >     Cc: Thomas Gleixner <tglx@linutronix.de>
> > > > >     Cc: Tony Luck <tony.luck@intel.com>
> > > > >     Cc: Wanpeng Li <wanpeng.li@hotmail.com>
> > > > >     Link: http://lkml.kernel.org/r/1485832191-26889-17-git-send-email-fweisbec@gmail.com
> > > > >     Signed-off-by: Ingo Molnar <mingo@kernel.org>
> > > > > 
> > > > > The reproducer is running a userspace udp_sink[1] program, and taskset
> > > > > pinning the process to the same CPU as softirq RX is running on, and
> > > > > starting a UDP flood with pktgen (tool part of kernel tree:
> > > > > samples/pktgen/pktgen_sample03_burst_single_flow.sh).    
> > > > 
> > > > So that means I need to run udp_sink on the same CPU than pktgen?  
> > > 
> > > No, you misunderstood.  I run pktgen on another physical machine, which
> > > is sending UDP packets towards my Device-Under-Test (DUT) target.  The
> > > DUT-target is receiving packets and I observe which CPU the NIC is
> > > delivering these packets to.  
> > 
> > Ah ok, so I tried to run pktgen on another machine and I get that strange write error:
> > 
> >     # ./pktgen_sample03_burst_single_flow.sh -d 192.168.1.3  -i wlan0
> >     ./functions.sh: ligne 76 : echo: erreur d'�criture : Erreur inconnue 524
> >     ERROR: Write error(1) occurred cmd: "clone_skb 100000 > /proc/net/pktgen/wlan0@0"
> > 
> > Any idea?
> 
> Yes, this interface does not support pktgen "clone_skb".  You can
> supply cmdline argument "-c 0" to fix this.  But I suspect that this
> interface also does not support "burst", thus you also need "-b 0".
> 
> See all cmdline args via: ./pktgen_sample03_burst_single_flow.sh -h
> 
> Why are you using a wifi interface for this kind of overload testing?
> (the basic test here is making sure softirq is busy 100%, and at slow
> wifi speeds this might not be possible to force ksoftirqd into this
> scheduler state)

What? I need to raise from the couch and plug an ethernet cable?? ;-) ;-)

More seriously you're right, wifi probably won't be enough to trigger
the desired storm on the destination interface. I'm going to try with eth0,
that should also fix the clone_skb issues.

> > > > > After this commit, the udp_sink program does not get any sched CPU
> > > > > time, and no packets are delivered to userspace.  (All packets are
> > > > > dropped by softirq due to a full socket queue, nstat
> > > > > UdpRcvbufErrors).
> > > > > 
> > > > > A related symptom is that ksoftirqd no longer get accounted in
> > > > > top.    
> > > > 
> > > > That's indeed what I observe. udp_sink has almost no CPU time,
> > > > neither has ksoftirqd but kpktgend_0 has everything.
> > > > 
> > > > Finally a bug I can reproduce!  
> > > 
> > > Good to hear you can reproduce it! :-)  
> > 
> > Well, since I was generating the packets locally, maybe it didn't trigger
> > the expected interrupts...
> 
> Well, you definitely didn't create the test case I was using.  I cannot
> remember if the pktgen kthreads runs in softirq context, but I suspect
> it does. If so, you can recreate the main problem, which is a softirq
> thread using 100% CPU time, which cause no other processes getting
> sched time on that CPU.

Well, I prefer to reproduce the same thing than you to make sure I'm chasing
the right problem.

Thanks!

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2017-03-29 13:13 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-28  8:14 Bisected softirq accounting issue in v4.11-rc1~170^2~28 Jesper Dangaard Brouer
2017-03-28 10:34 ` Wanpeng Li
2017-03-28 11:49   ` Jesper Dangaard Brouer
2017-03-28 12:26   ` Peter Zijlstra
2017-03-28 13:06     ` Frederic Weisbecker
2017-03-28 13:18       ` Jesper Dangaard Brouer
2017-03-28 14:34 ` Frederic Weisbecker
2017-03-28 15:23   ` Jesper Dangaard Brouer
2017-03-28 21:11     ` Frederic Weisbecker
2017-03-29  9:30       ` Jesper Dangaard Brouer
2017-03-29 13:13         ` Frederic Weisbecker

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).