All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH] kick ksoftirqd more often to please soft lockup detector
@ 2012-02-27 20:38 Dan Williams
  2012-02-28  8:35 ` Yong Zhang
  2012-02-28  9:48 ` Peter Zijlstra
  0 siblings, 2 replies; 12+ messages in thread
From: Dan Williams @ 2012-02-27 20:38 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jens Axboe, Peter Zijlstra, linux-scsi, Lukasz Dorau,
	James Bottomley, Andrzej Jakowski

An experimental hack to tease out whether we are continuing to
run the softirq handler past the point of needing scheduling.

It allows only one trip through __do_softirq() as long as need_resched()
is set which hopefully creates the back pressure needed to get ksoftirqd
scheduled.

Targeted to address reports like the following that are produced
with i/o tests to a sas domain with a large number of disks (48+), and
lots of debugging enabled (slub_deubg, lockdep) that makes the
block+scsi softirq path more cpu-expensive than normal.

With this patch applied the softlockup detector seems appeased, but it
seems odd to need changes to kernel/softirq.c so maybe I have overlooked
something that needs changing at the block/scsi level?

BUG: soft lockup - CPU#3 stuck for 22s! [kworker/3:1:78]
Modules linked in: nls_utf8 ipv6 uinput sg iTCO_wdt iTCO_vendor_support ioatdma dca i2c_i801 i2c_core wmi sd_mod ahci libahci isci libsas libata scsi_transport_sas [last unloaded: scsi_wait_scan]
irq event stamp: 26260303
hardirqs last  enabled at (26260302): [<ffffffff814becf4>] restore_args+0x0/0x30
hardirqs last disabled at (26260303): [<ffffffff814c60ee>] apic_timer_interrupt+0x6e/0x80
softirqs last  enabled at (26220386): [<ffffffff81033edd>] __do_softirq+0x1ae/0x1bd
softirqs last disabled at (26220665): [<ffffffff814c696c>] call_softirq+0x1c/0x26
CPU 3
Modules linked in: nls_utf8 ipv6 uinput sg iTCO_wdt iTCO_vendor_support ioatdma dca i2c_i801 i2c_core wmi sd_mod ahci libahci isci libsas libata scsi_transport_sas [last unloaded: scsi_wait_scan]

Pid: 78, comm: kworker/3:1 Not tainted 3.3.0-rc3-7ada1dd-isci-3.0.183+ #1 Intel Corporation ROSECITY/ROSECITY
RIP: 0010:[<ffffffff814be8b6>]  [<ffffffff814be8b6>] _raw_spin_unlock_irq+0x34/0x4b
RSP: 0000:ffff8800bb8c3c50  EFLAGS: 00000202
RAX: ffff8800375f3ec0 RBX: ffffffff814becf4 RCX: ffff8800bb8c3c00
RDX: 0000000000000001 RSI: ffff880035bbc348 RDI: ffff8800375f4588
RBP: ffff8800bb8c3c60 R08: 0000000000000000 R09: ffff880035aed150
R10: 0000000000018f3b R11: ffff8800bb8c39e0 R12: ffff8800bb8c3bc8
R13: ffffffff814c60f3 R14: ffff8800bb8c3c60 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff8800bb8c0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000f2e028 CR3: 00000000b11b3000 CR4: 00000000000406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kworker/3:1 (pid: 78, threadinfo ffff8800377d2000, task ffff8800375f3ec0)
Stack:
 ffff88003555f800 ffff88003555f800 ffff8800bb8c3cc0 ffffffffa00512c4
 ffffffff814be8b2 ffff880035dfc000 ffff880035dfe000 000000000553a265
 ffff8800bb8c3cb0 ffff88003555f800 ffff8800b20af200 ffff880035dfe000
Call Trace:
 <IRQ>
 [<ffffffffa00512c4>] sas_queuecommand+0xa7/0x204 [libsas]
 [<ffffffff814be8b2>] ? _raw_spin_unlock_irq+0x30/0x4b
 [<ffffffff8132b6a9>] scsi_dispatch_cmd+0x1a2/0x24c
 [<ffffffff813317a8>] ? spin_lock+0x9/0xb
 [<ffffffff813333b0>] scsi_request_fn+0x3b1/0x3d9
 [<ffffffff8124a19f>] __blk_run_queue+0x1d/0x1f
 [<ffffffff8124b869>] blk_run_queue+0x26/0x3a
 [<ffffffff813319a5>] scsi_run_queue+0x1fb/0x20a
 [<ffffffff81332136>] scsi_next_command+0x3b/0x4c
 [<ffffffff81332b66>] scsi_io_completion+0x205/0x44f
 [<ffffffff813316b8>] ? spin_unlock_irqrestore+0x9/0xb
 [<ffffffff8132b3ab>] scsi_finish_command+0xeb/0xf4
 [<ffffffff81333a04>] scsi_softirq_done+0x112/0x11b
 [<ffffffff812540ac>] blk_done_softirq+0x7e/0x96
 [<ffffffff81033e0c>] __do_softirq+0xdd/0x1bd
 [<ffffffff814c696c>] call_softirq+0x1c/0x26
 [<ffffffff81003ce6>] do_softirq+0x4b/0xa5
 [<ffffffff81034916>] irq_exit+0x55/0xc2
 [<ffffffff814c6a9c>] smp_apic_timer_interrupt+0x7c/0x8a
 [<ffffffff814c60f3>] apic_timer_interrupt+0x73/0x80
 <EOI>
 [<ffffffff814be8b6>] ? _raw_spin_unlock_irq+0x34/0x4b
 [<ffffffffa00512c4>] sas_queuecommand+0xa7/0x204 [libsas]
 [<ffffffff814be8b2>] ? _raw_spin_unlock_irq+0x30/0x4b
 [<ffffffff8132b6a9>] scsi_dispatch_cmd+0x1a2/0x24c
 [<ffffffff813317a8>] ? spin_lock+0x9/0xb
 [<ffffffff813333b0>] scsi_request_fn+0x3b1/0x3d9
 [<ffffffff8124a19f>] __blk_run_queue+0x1d/0x1f
 [<ffffffff8125f3bb>] cfq_kick_queue+0x2f/0x41
 [<ffffffff8104462e>] process_one_work+0x1c8/0x336
 [<ffffffff81044599>] ? process_one_work+0x133/0x336
 [<ffffffff81044306>] ? spin_lock_irq+0x9/0xb
 [<ffffffff8125f38c>] ? cfq_init_queue+0x2a3/0x2a3
 [<ffffffff81045fd9>] ? workqueue_congested+0x1e/0x1e
 [<ffffffff81046085>] worker_thread+0xac/0x151
 [<ffffffff81045fd9>] ? workqueue_congested+0x1e/0x1e
 [<ffffffff8104a618>] kthread+0x8a/0x92
 [<ffffffff8107654e>] ? trace_hardirqs_on_caller+0x16/0x16d
 [<ffffffff814c6874>] kernel_thread_helper+0x4/0x10
 [<ffffffff814becf4>] ? retint_restore_args+0x13/0x13
 [<ffffffff8104a58e>] ? kthread_create_on_node+0x14d/0x14d
 [<ffffffff814c6870>] ? gs_change+0x13/0x13

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: James Bottomley <JBottomley@parallels.com>
Reported-by: Lukasz Dorau <lukasz.dorau@intel.com>
Reported-by: Andrzej Jakowski <andrzej.jakowski@intel.com>
Not-yet-signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 kernel/softirq.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/softirq.c b/kernel/softirq.c
index 4eb3a0f..82a3f43 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -255,7 +255,7 @@ restart:
 	local_irq_disable();
 
 	pending = local_softirq_pending();
-	if (pending && --max_restart)
+	if (pending && --max_restart && !need_resched())
 		goto restart;
 
 	if (pending)


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH] kick ksoftirqd more often to please soft lockup detector
  2012-02-27 20:38 [RFC PATCH] kick ksoftirqd more often to please soft lockup detector Dan Williams
@ 2012-02-28  8:35 ` Yong Zhang
  2012-02-28  9:48 ` Peter Zijlstra
  1 sibling, 0 replies; 12+ messages in thread
From: Yong Zhang @ 2012-02-28  8:35 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-kernel, Jens Axboe, Peter Zijlstra, linux-scsi,
	Lukasz Dorau, James Bottomley, Andrzej Jakowski

On Mon, Feb 27, 2012 at 12:38:47PM -0800, Dan Williams wrote:
> An experimental hack to tease out whether we are continuing to
> run the softirq handler past the point of needing scheduling.
> 
> It allows only one trip through __do_softirq() as long as need_resched()
> is set which hopefully creates the back pressure needed to get ksoftirqd
> scheduled.
> 
> Targeted to address reports like the following that are produced
> with i/o tests to a sas domain with a large number of disks (48+), and
> lots of debugging enabled (slub_deubg, lockdep) that makes the
> block+scsi softirq path more cpu-expensive than normal.
> 
> With this patch applied the softlockup detector seems appeased, but it
> seems odd to need changes to kernel/softirq.c so maybe I have overlooked
> something that needs changing at the block/scsi level?

But stucking in softirq for 22s still seems odd.

I guess the reason why your patch works is that softirq returns before
handling BLOCK_SOFTIRQ, but who knows, just guess.

Does kernel command line 'threadirqs' solve your issue?

Thanks,
Yong

> 
> BUG: soft lockup - CPU#3 stuck for 22s! [kworker/3:1:78]
> Modules linked in: nls_utf8 ipv6 uinput sg iTCO_wdt iTCO_vendor_support ioatdma dca i2c_i801 i2c_core wmi sd_mod ahci libahci isci libsas libata scsi_transport_sas [last unloaded: scsi_wait_scan]
> irq event stamp: 26260303
> hardirqs last  enabled at (26260302): [<ffffffff814becf4>] restore_args+0x0/0x30
> hardirqs last disabled at (26260303): [<ffffffff814c60ee>] apic_timer_interrupt+0x6e/0x80
> softirqs last  enabled at (26220386): [<ffffffff81033edd>] __do_softirq+0x1ae/0x1bd
> softirqs last disabled at (26220665): [<ffffffff814c696c>] call_softirq+0x1c/0x26
> CPU 3
> Modules linked in: nls_utf8 ipv6 uinput sg iTCO_wdt iTCO_vendor_support ioatdma dca i2c_i801 i2c_core wmi sd_mod ahci libahci isci libsas libata scsi_transport_sas [last unloaded: scsi_wait_scan]
> 
> Pid: 78, comm: kworker/3:1 Not tainted 3.3.0-rc3-7ada1dd-isci-3.0.183+ #1 Intel Corporation ROSECITY/ROSECITY
> RIP: 0010:[<ffffffff814be8b6>]  [<ffffffff814be8b6>] _raw_spin_unlock_irq+0x34/0x4b
> RSP: 0000:ffff8800bb8c3c50  EFLAGS: 00000202
> RAX: ffff8800375f3ec0 RBX: ffffffff814becf4 RCX: ffff8800bb8c3c00
> RDX: 0000000000000001 RSI: ffff880035bbc348 RDI: ffff8800375f4588
> RBP: ffff8800bb8c3c60 R08: 0000000000000000 R09: ffff880035aed150
> R10: 0000000000018f3b R11: ffff8800bb8c39e0 R12: ffff8800bb8c3bc8
> R13: ffffffff814c60f3 R14: ffff8800bb8c3c60 R15: 0000000000000000
> FS:  0000000000000000(0000) GS:ffff8800bb8c0000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 0000000000f2e028 CR3: 00000000b11b3000 CR4: 00000000000406e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process kworker/3:1 (pid: 78, threadinfo ffff8800377d2000, task ffff8800375f3ec0)
> Stack:
>  ffff88003555f800 ffff88003555f800 ffff8800bb8c3cc0 ffffffffa00512c4
>  ffffffff814be8b2 ffff880035dfc000 ffff880035dfe000 000000000553a265
>  ffff8800bb8c3cb0 ffff88003555f800 ffff8800b20af200 ffff880035dfe000
> Call Trace:
>  <IRQ>
>  [<ffffffffa00512c4>] sas_queuecommand+0xa7/0x204 [libsas]
>  [<ffffffff814be8b2>] ? _raw_spin_unlock_irq+0x30/0x4b
>  [<ffffffff8132b6a9>] scsi_dispatch_cmd+0x1a2/0x24c
>  [<ffffffff813317a8>] ? spin_lock+0x9/0xb
>  [<ffffffff813333b0>] scsi_request_fn+0x3b1/0x3d9
>  [<ffffffff8124a19f>] __blk_run_queue+0x1d/0x1f
>  [<ffffffff8124b869>] blk_run_queue+0x26/0x3a
>  [<ffffffff813319a5>] scsi_run_queue+0x1fb/0x20a
>  [<ffffffff81332136>] scsi_next_command+0x3b/0x4c
>  [<ffffffff81332b66>] scsi_io_completion+0x205/0x44f
>  [<ffffffff813316b8>] ? spin_unlock_irqrestore+0x9/0xb
>  [<ffffffff8132b3ab>] scsi_finish_command+0xeb/0xf4
>  [<ffffffff81333a04>] scsi_softirq_done+0x112/0x11b
>  [<ffffffff812540ac>] blk_done_softirq+0x7e/0x96
>  [<ffffffff81033e0c>] __do_softirq+0xdd/0x1bd
>  [<ffffffff814c696c>] call_softirq+0x1c/0x26
>  [<ffffffff81003ce6>] do_softirq+0x4b/0xa5
>  [<ffffffff81034916>] irq_exit+0x55/0xc2
>  [<ffffffff814c6a9c>] smp_apic_timer_interrupt+0x7c/0x8a
>  [<ffffffff814c60f3>] apic_timer_interrupt+0x73/0x80
>  <EOI>
>  [<ffffffff814be8b6>] ? _raw_spin_unlock_irq+0x34/0x4b
>  [<ffffffffa00512c4>] sas_queuecommand+0xa7/0x204 [libsas]
>  [<ffffffff814be8b2>] ? _raw_spin_unlock_irq+0x30/0x4b
>  [<ffffffff8132b6a9>] scsi_dispatch_cmd+0x1a2/0x24c
>  [<ffffffff813317a8>] ? spin_lock+0x9/0xb
>  [<ffffffff813333b0>] scsi_request_fn+0x3b1/0x3d9
>  [<ffffffff8124a19f>] __blk_run_queue+0x1d/0x1f
>  [<ffffffff8125f3bb>] cfq_kick_queue+0x2f/0x41
>  [<ffffffff8104462e>] process_one_work+0x1c8/0x336
>  [<ffffffff81044599>] ? process_one_work+0x133/0x336
>  [<ffffffff81044306>] ? spin_lock_irq+0x9/0xb
>  [<ffffffff8125f38c>] ? cfq_init_queue+0x2a3/0x2a3
>  [<ffffffff81045fd9>] ? workqueue_congested+0x1e/0x1e
>  [<ffffffff81046085>] worker_thread+0xac/0x151
>  [<ffffffff81045fd9>] ? workqueue_congested+0x1e/0x1e
>  [<ffffffff8104a618>] kthread+0x8a/0x92
>  [<ffffffff8107654e>] ? trace_hardirqs_on_caller+0x16/0x16d
>  [<ffffffff814c6874>] kernel_thread_helper+0x4/0x10
>  [<ffffffff814becf4>] ? retint_restore_args+0x13/0x13
>  [<ffffffff8104a58e>] ? kthread_create_on_node+0x14d/0x14d
>  [<ffffffff814c6870>] ? gs_change+0x13/0x13
> 
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: James Bottomley <JBottomley@parallels.com>
> Reported-by: Lukasz Dorau <lukasz.dorau@intel.com>
> Reported-by: Andrzej Jakowski <andrzej.jakowski@intel.com>
> Not-yet-signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  kernel/softirq.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index 4eb3a0f..82a3f43 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -255,7 +255,7 @@ restart:
>  	local_irq_disable();
>  
>  	pending = local_softirq_pending();
> -	if (pending && --max_restart)
> +	if (pending && --max_restart && !need_resched())
>  		goto restart;
>  
>  	if (pending)
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

-- 
Only stand for myself

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH] kick ksoftirqd more often to please soft lockup detector
  2012-02-27 20:38 [RFC PATCH] kick ksoftirqd more often to please soft lockup detector Dan Williams
  2012-02-28  8:35 ` Yong Zhang
@ 2012-02-28  9:48 ` Peter Zijlstra
  2012-02-28 16:48   ` Dan Williams
  2012-02-28 21:41   ` Thomas Gleixner
  1 sibling, 2 replies; 12+ messages in thread
From: Peter Zijlstra @ 2012-02-28  9:48 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-kernel, Jens Axboe, linux-scsi, Lukasz Dorau,
	James Bottomley, Andrzej Jakowski, Thomas Gleixner

On Mon, 2012-02-27 at 12:38 -0800, Dan Williams wrote:
> An experimental hack to tease out whether we are continuing to
> run the softirq handler past the point of needing scheduling.
> 
> It allows only one trip through __do_softirq() as long as need_resched()
> is set which hopefully creates the back pressure needed to get ksoftirqd
> scheduled.
> 
> Targeted to address reports like the following that are produced
> with i/o tests to a sas domain with a large number of disks (48+), and
> lots of debugging enabled (slub_deubg, lockdep) that makes the
> block+scsi softirq path more cpu-expensive than normal.
> 
> With this patch applied the softlockup detector seems appeased, but it
> seems odd to need changes to kernel/softirq.c so maybe I have overlooked
> something that needs changing at the block/scsi level?
> 
> BUG: soft lockup - CPU#3 stuck for 22s! [kworker/3:1:78] 

So you're stuck in softirq for 22s+, max_restart is 10, this gives that
on average you spend 2.2s+ per softirq invocation, this is completely
absolutely bonkers. Softirq handlers should never consume significant
amount of cpu-time.

Thomas, think its about time we put something like the below in?


---
 kernel/softirq.c |   16 ++++++++++++++++
 1 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/kernel/softirq.c b/kernel/softirq.c
index ff066a4..6137ee1 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -210,6 +210,7 @@ asmlinkage void __do_softirq(void)
 	__u32 pending;
 	int max_restart = MAX_SOFTIRQ_RESTART;
 	int cpu;
+	u64 start, callback, now;
 
 	pending = local_softirq_pending();
 	account_system_vtime(current);
@@ -223,6 +224,8 @@ asmlinkage void __do_softirq(void)
 	/* Reset the pending bitmask before enabling irqs */
 	set_softirq_pending(0);
 
+	start = callback = cpu_clock(cpu);
+
 	local_irq_enable();
 
 	h = softirq_vec;
@@ -246,6 +249,15 @@ asmlinkage void __do_softirq(void)
 				preempt_count() = prev_count;
 			}
 
+			now = cpu_clock(cpu);
+			if (now - callback > TICK_NSEC / 4) {
+				printk(KERN_ERR "softirq took longer than 1/4 tick: "
+						"%u %s %p\n", vec_nr, 
+						softirq_to_name[vec_nr],
+						h->action);
+			}
+			callback = now;
+
 			rcu_bh_qs(cpu);
 		}
 		h++;
@@ -254,6 +266,10 @@ asmlinkage void __do_softirq(void)
 
 	local_irq_disable();
 
+	now = cpu_clock(cpu);
+	if (now - start > TICK_NSEC / 2)
+		printk(KERN_ERR "softirq loop took longer than 1/2 tick\n");
+
 	pending = local_softirq_pending();
 	if (pending && --max_restart)
 		goto restart;


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH] kick ksoftirqd more often to please soft lockup detector
  2012-02-28  9:48 ` Peter Zijlstra
@ 2012-02-28 16:48   ` Dan Williams
  2012-02-28 21:41   ` Thomas Gleixner
  1 sibling, 0 replies; 12+ messages in thread
From: Dan Williams @ 2012-02-28 16:48 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Jens Axboe, linux-scsi, Lukasz Dorau,
	James Bottomley, Andrzej Jakowski, Thomas Gleixner

On Tue, Feb 28, 2012 at 1:48 AM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
>> BUG: soft lockup - CPU#3 stuck for 22s! [kworker/3:1:78]
>
> So you're stuck in softirq for 22s+, max_restart is 10, this gives that
> on average you spend 2.2s+ per softirq invocation, this is completely
> absolutely bonkers. Softirq handlers should never consume significant
> amount of cpu-time.

I'm not sure whether this was just one __do_softirq() invocation, or
multiple.  I'll see if I can instrument this, but the suspicion is
that max_restart can't help when softirq is constantly re-triggered.
I'll also add the below patch to see if any single invocation is
violating these expectations.

--
Dan

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH] kick ksoftirqd more often to please soft lockup detector
  2012-02-28  9:48 ` Peter Zijlstra
  2012-02-28 16:48   ` Dan Williams
@ 2012-02-28 21:41   ` Thomas Gleixner
  2012-02-28 22:16     ` Dan Williams
  1 sibling, 1 reply; 12+ messages in thread
From: Thomas Gleixner @ 2012-02-28 21:41 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Dan Williams, linux-kernel, Jens Axboe, linux-scsi, Lukasz Dorau,
	James Bottomley, Andrzej Jakowski

On Tue, 28 Feb 2012, Peter Zijlstra wrote:

> On Mon, 2012-02-27 at 12:38 -0800, Dan Williams wrote:
> > An experimental hack to tease out whether we are continuing to
> > run the softirq handler past the point of needing scheduling.
> > 
> > It allows only one trip through __do_softirq() as long as need_resched()
> > is set which hopefully creates the back pressure needed to get ksoftirqd
> > scheduled.
> > 
> > Targeted to address reports like the following that are produced
> > with i/o tests to a sas domain with a large number of disks (48+), and
> > lots of debugging enabled (slub_deubg, lockdep) that makes the
> > block+scsi softirq path more cpu-expensive than normal.
> > 
> > With this patch applied the softlockup detector seems appeased, but it
> > seems odd to need changes to kernel/softirq.c so maybe I have overlooked
> > something that needs changing at the block/scsi level?
> > 
> > BUG: soft lockup - CPU#3 stuck for 22s! [kworker/3:1:78] 
> 
> So you're stuck in softirq for 22s+, max_restart is 10, this gives that
> on average you spend 2.2s+ per softirq invocation, this is completely
> absolutely bonkers. Softirq handlers should never consume significant
> amount of cpu-time.
> 
> Thomas, think its about time we put something like the below in?

Absolutely. Anything which consumes more than a few microseconds in
the softirq handler needs to be sorted out, no matter what.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH] kick ksoftirqd more often to please soft lockup detector
  2012-02-28 21:41   ` Thomas Gleixner
@ 2012-02-28 22:16     ` Dan Williams
  2012-02-28 22:25         ` Dan Williams
  2012-02-29  9:17       ` Peter Zijlstra
  0 siblings, 2 replies; 12+ messages in thread
From: Dan Williams @ 2012-02-28 22:16 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Peter Zijlstra, linux-kernel, Jens Axboe, linux-scsi,
	Lukasz Dorau, James Bottomley, Andrzej Jakowski

On Tue, Feb 28, 2012 at 1:41 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Tue, 28 Feb 2012, Peter Zijlstra wrote:
>
>> On Mon, 2012-02-27 at 12:38 -0800, Dan Williams wrote:
>> > An experimental hack to tease out whether we are continuing to
>> > run the softirq handler past the point of needing scheduling.
>> >
>> > It allows only one trip through __do_softirq() as long as need_resched()
>> > is set which hopefully creates the back pressure needed to get ksoftirqd
>> > scheduled.
>> >
>> > Targeted to address reports like the following that are produced
>> > with i/o tests to a sas domain with a large number of disks (48+), and
>> > lots of debugging enabled (slub_deubg, lockdep) that makes the
>> > block+scsi softirq path more cpu-expensive than normal.
>> >
>> > With this patch applied the softlockup detector seems appeased, but it
>> > seems odd to need changes to kernel/softirq.c so maybe I have overlooked
>> > something that needs changing at the block/scsi level?
>> >
>> > BUG: soft lockup - CPU#3 stuck for 22s! [kworker/3:1:78]
>>
>> So you're stuck in softirq for 22s+, max_restart is 10, this gives that
>> on average you spend 2.2s+ per softirq invocation, this is completely
>> absolutely bonkers. Softirq handlers should never consume significant
>> amount of cpu-time.
>>
>> Thomas, think its about time we put something like the below in?
>
> Absolutely. Anything which consumes more than a few microseconds in
> the softirq handler needs to be sorted out, no matter what.

Looks like everyone is guilty:

[  422.765336] softirq took longer than 1/4 tick: 3 NET_RX ffffffff813f0aa0
...
[  423.971878] softirq took longer than 1/4 tick: 4 BLOCK ffffffff812519c8
[  423.985093] softirq took longer than 1/4 tick: 6 TASKLET ffffffff8103422e
[  423.993157] softirq took longer than 1/4 tick: 7 SCHED ffffffff8105e2e1
[  424.001018] softirq took longer than 1/4 tick: 9 RCU ffffffff810a0fed
[  424.008691] softirq loop took longer than 1/2 tick need_resched:
yes max_restart: 10

As expected whenever that 1/2 tick message gets emitted the softirq
handler is almost running in a need_resched() context.

$ grep need_resched.*no log | wc -l
295
$ grep need_resched.*yes log | wc -l
3201

One of these warning messages gets printed out at a rate of 1 every
40ms. (468 second log w/ 11,725 of these messages).

So is it a good idea to get more aggressive about scheduling ksoftrrqd?

--
Dan

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH] kick ksoftirqd more often to please soft lockup detector
  2012-02-28 22:16     ` Dan Williams
@ 2012-02-28 22:25         ` Dan Williams
  2012-02-29  9:17       ` Peter Zijlstra
  1 sibling, 0 replies; 12+ messages in thread
From: Dan Williams @ 2012-02-28 22:25 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Peter Zijlstra, linux-kernel, Jens Axboe, linux-scsi,
	Lukasz Dorau, James Bottomley, Andrzej Jakowski

On Tue, Feb 28, 2012 at 2:16 PM, Dan Williams <dan.j.williams@intel.com> wrote:
> On Tue, Feb 28, 2012 at 1:41 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
>> On Tue, 28 Feb 2012, Peter Zijlstra wrote:
>>
>>> On Mon, 2012-02-27 at 12:38 -0800, Dan Williams wrote:
>>> > An experimental hack to tease out whether we are continuing to
>>> > run the softirq handler past the point of needing scheduling.
>>> >
>>> > It allows only one trip through __do_softirq() as long as need_resched()
>>> > is set which hopefully creates the back pressure needed to get ksoftirqd
>>> > scheduled.
>>> >
>>> > Targeted to address reports like the following that are produced
>>> > with i/o tests to a sas domain with a large number of disks (48+), and
>>> > lots of debugging enabled (slub_deubg, lockdep) that makes the
>>> > block+scsi softirq path more cpu-expensive than normal.
>>> >
>>> > With this patch applied the softlockup detector seems appeased, but it
>>> > seems odd to need changes to kernel/softirq.c so maybe I have overlooked
>>> > something that needs changing at the block/scsi level?
>>> >
>>> > BUG: soft lockup - CPU#3 stuck for 22s! [kworker/3:1:78]
>>>
>>> So you're stuck in softirq for 22s+, max_restart is 10, this gives that
>>> on average you spend 2.2s+ per softirq invocation, this is completely
>>> absolutely bonkers. Softirq handlers should never consume significant
>>> amount of cpu-time.
>>>
>>> Thomas, think its about time we put something like the below in?
>>
>> Absolutely. Anything which consumes more than a few microseconds in
>> the softirq handler needs to be sorted out, no matter what.
>
> Looks like everyone is guilty:
>
> [  422.765336] softirq took longer than 1/4 tick: 3 NET_RX ffffffff813f0aa0
> ...
> [  423.971878] softirq took longer than 1/4 tick: 4 BLOCK ffffffff812519c8
> [  423.985093] softirq took longer than 1/4 tick: 6 TASKLET ffffffff8103422e
> [  423.993157] softirq took longer than 1/4 tick: 7 SCHED ffffffff8105e2e1
> [  424.001018] softirq took longer than 1/4 tick: 9 RCU ffffffff810a0fed
> [  424.008691] softirq loop took longer than 1/2 tick need_resched:
> yes max_restart: 10
>
> As expected whenever that 1/2 tick message gets emitted the softirq
> handler is almost running in a need_resched() context.
>
> $ grep need_resched.*no log | wc -l
> 295
> $ grep need_resched.*yes log | wc -l
> 3201
>
> One of these warning messages gets printed out at a rate of 1 every
> 40ms. (468 second log w/ 11,725 of these messages).
>
> So is it a good idea to get more aggressive about scheduling ksoftrrqd?
>

Keep in mind this is with slub_debug and lockdep turned on, but we are
basically looking to do extended i/o runs on large topologies and want
the soft lockup detector to not generate false positives.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH] kick ksoftirqd more often to please soft lockup detector
@ 2012-02-28 22:25         ` Dan Williams
  0 siblings, 0 replies; 12+ messages in thread
From: Dan Williams @ 2012-02-28 22:25 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Peter Zijlstra, linux-kernel, Jens Axboe, linux-scsi,
	Lukasz Dorau, James Bottomley, Andrzej Jakowski

On Tue, Feb 28, 2012 at 2:16 PM, Dan Williams <dan.j.williams@intel.com> wrote:
> On Tue, Feb 28, 2012 at 1:41 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
>> On Tue, 28 Feb 2012, Peter Zijlstra wrote:
>>
>>> On Mon, 2012-02-27 at 12:38 -0800, Dan Williams wrote:
>>> > An experimental hack to tease out whether we are continuing to
>>> > run the softirq handler past the point of needing scheduling.
>>> >
>>> > It allows only one trip through __do_softirq() as long as need_resched()
>>> > is set which hopefully creates the back pressure needed to get ksoftirqd
>>> > scheduled.
>>> >
>>> > Targeted to address reports like the following that are produced
>>> > with i/o tests to a sas domain with a large number of disks (48+), and
>>> > lots of debugging enabled (slub_deubg, lockdep) that makes the
>>> > block+scsi softirq path more cpu-expensive than normal.
>>> >
>>> > With this patch applied the softlockup detector seems appeased, but it
>>> > seems odd to need changes to kernel/softirq.c so maybe I have overlooked
>>> > something that needs changing at the block/scsi level?
>>> >
>>> > BUG: soft lockup - CPU#3 stuck for 22s! [kworker/3:1:78]
>>>
>>> So you're stuck in softirq for 22s+, max_restart is 10, this gives that
>>> on average you spend 2.2s+ per softirq invocation, this is completely
>>> absolutely bonkers. Softirq handlers should never consume significant
>>> amount of cpu-time.
>>>
>>> Thomas, think its about time we put something like the below in?
>>
>> Absolutely. Anything which consumes more than a few microseconds in
>> the softirq handler needs to be sorted out, no matter what.
>
> Looks like everyone is guilty:
>
> [  422.765336] softirq took longer than 1/4 tick: 3 NET_RX ffffffff813f0aa0
> ...
> [  423.971878] softirq took longer than 1/4 tick: 4 BLOCK ffffffff812519c8
> [  423.985093] softirq took longer than 1/4 tick: 6 TASKLET ffffffff8103422e
> [  423.993157] softirq took longer than 1/4 tick: 7 SCHED ffffffff8105e2e1
> [  424.001018] softirq took longer than 1/4 tick: 9 RCU ffffffff810a0fed
> [  424.008691] softirq loop took longer than 1/2 tick need_resched:
> yes max_restart: 10
>
> As expected whenever that 1/2 tick message gets emitted the softirq
> handler is almost running in a need_resched() context.
>
> $ grep need_resched.*no log | wc -l
> 295
> $ grep need_resched.*yes log | wc -l
> 3201
>
> One of these warning messages gets printed out at a rate of 1 every
> 40ms. (468 second log w/ 11,725 of these messages).
>
> So is it a good idea to get more aggressive about scheduling ksoftrrqd?
>

Keep in mind this is with slub_debug and lockdep turned on, but we are
basically looking to do extended i/o runs on large topologies and want
the soft lockup detector to not generate false positives.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH] kick ksoftirqd more often to please soft lockup detector
  2012-02-28 22:16     ` Dan Williams
  2012-02-28 22:25         ` Dan Williams
@ 2012-02-29  9:17       ` Peter Zijlstra
  2012-02-29 19:49           ` Dan Williams
  2012-03-03  8:39         ` Paul E. McKenney
  1 sibling, 2 replies; 12+ messages in thread
From: Peter Zijlstra @ 2012-02-29  9:17 UTC (permalink / raw)
  To: Dan Williams
  Cc: Thomas Gleixner, linux-kernel, Jens Axboe, linux-scsi,
	Lukasz Dorau, James Bottomley, Andrzej Jakowski

On Tue, 2012-02-28 at 14:16 -0800, Dan Williams wrote:
> Looks like everyone is guilty:
> 
> [  422.765336] softirq took longer than 1/4 tick: 3 NET_RX ffffffff813f0aa0
> ...
> [  423.971878] softirq took longer than 1/4 tick: 4 BLOCK ffffffff812519c8
> [  423.985093] softirq took longer than 1/4 tick: 6 TASKLET ffffffff8103422e
> [  423.993157] softirq took longer than 1/4 tick: 7 SCHED ffffffff8105e2e1
> [  424.001018] softirq took longer than 1/4 tick: 9 RCU ffffffff810a0fed
> [  424.008691] softirq loop took longer than 1/2 tick need_resched:

/me kicks himself for not printing the actual duration.. :-)

> As expected whenever that 1/2 tick message gets emitted the softirq
> handler is almost running in a need_resched() context.

Yeah.. that's quite expected.

> So is it a good idea to get more aggressive about scheduling ksoftrrqd? 

Nah, moving away from softirq more like. I'll put moving the
load-balancer into a kthread on the todo list. And it looks like
everybody else should move to kthreads too.



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH] kick ksoftirqd more often to please soft lockup detector
  2012-02-29  9:17       ` Peter Zijlstra
@ 2012-02-29 19:49           ` Dan Williams
  2012-03-03  8:39         ` Paul E. McKenney
  1 sibling, 0 replies; 12+ messages in thread
From: Dan Williams @ 2012-02-29 19:49 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Thomas Gleixner, linux-kernel, Jens Axboe, linux-scsi,
	Lukasz Dorau, James Bottomley, Andrzej Jakowski

On Wed, Feb 29, 2012 at 1:17 AM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> On Tue, 2012-02-28 at 14:16 -0800, Dan Williams wrote:
>> Looks like everyone is guilty:
>>
>> [  422.765336] softirq took longer than 1/4 tick: 3 NET_RX ffffffff813f0aa0
>> ...
>> [  423.971878] softirq took longer than 1/4 tick: 4 BLOCK ffffffff812519c8
>> [  423.985093] softirq took longer than 1/4 tick: 6 TASKLET ffffffff8103422e
>> [  423.993157] softirq took longer than 1/4 tick: 7 SCHED ffffffff8105e2e1
>> [  424.001018] softirq took longer than 1/4 tick: 9 RCU ffffffff810a0fed
>> [  424.008691] softirq loop took longer than 1/2 tick need_resched:
>
> /me kicks himself for not printing the actual duration.. :-)
>
>> As expected whenever that 1/2 tick message gets emitted the softirq
>> handler is almost running in a need_resched() context.
>
> Yeah.. that's quite expected.
>
>> So is it a good idea to get more aggressive about scheduling ksoftrrqd?
>
> Nah, moving away from softirq more like. I'll put moving the
> load-balancer into a kthread on the todo list. And it looks like
> everybody else should move to kthreads too.

I seem to recall this running into resistance (but maybe things have
changed in the last few years)?

https://lkml.org/lkml/2007/6/29/155


* Re: [RFC PATCH] kick ksoftirqd more often to please soft lockup detector
  2012-02-29  9:17       ` Peter Zijlstra
  2012-02-29 19:49           ` Dan Williams
@ 2012-03-03  8:39         ` Paul E. McKenney
  1 sibling, 0 replies; 12+ messages in thread
From: Paul E. McKenney @ 2012-03-03  8:39 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Dan Williams, Thomas Gleixner, linux-kernel, Jens Axboe,
	linux-scsi, Lukasz Dorau, James Bottomley, Andrzej Jakowski

On Wed, Feb 29, 2012 at 10:17:03AM +0100, Peter Zijlstra wrote:
> On Tue, 2012-02-28 at 14:16 -0800, Dan Williams wrote:
> > Looks like everyone is guilty:
> > 
> > [  422.765336] softirq took longer than 1/4 tick: 3 NET_RX ffffffff813f0aa0
> > ...
> > [  423.971878] softirq took longer than 1/4 tick: 4 BLOCK ffffffff812519c8
> > [  423.985093] softirq took longer than 1/4 tick: 6 TASKLET ffffffff8103422e
> > [  423.993157] softirq took longer than 1/4 tick: 7 SCHED ffffffff8105e2e1
> > [  424.001018] softirq took longer than 1/4 tick: 9 RCU ffffffff810a0fed
> > [  424.008691] softirq loop took longer than 1/2 tick need_resched:
> 
> /me kicks himself for not printing the actual duration.. :-)
> 
> > As expected, whenever that 1/2 tick message gets emitted the softirq
> > handler is almost always running in a need_resched() context.
> 
> Yeah.. that's quite expected.
> 
> > So is it a good idea to get more aggressive about scheduling ksoftirqd?
> 
> Nah, moving away from softirq more like. I'll put moving the
> load-balancer into a kthread on the todo list. And it looks like
> everybody else should move to kthreads too.

Last year when I tried that, things got a bit ugly.  I guess I don't need
the kthreads to be realtime unless CONFIG_RCU_BOOST=y, maybe that will
help.  Also IIRC Steven Rostedt made some real-time changes that might
help as well for my case, which would have a per-CPU RT kthread.

							Thanx, Paul
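
The per-CPU RT kthread arrangement Paul mentions might be spawned along
these lines.  This is a hypothetical sketch, not his actual code;
percpu_kthread_fn, the "rcuc/%d" name, and the priority value are
placeholders, with the SCHED_FIFO promotion gated on CONFIG_RCU_BOOST
as he describes:

```c
/* Hypothetical sketch: one kthread per possible CPU, bound to that
 * CPU, and promoted to SCHED_FIFO only when CONFIG_RCU_BOOST=y. */
static void __init spawn_percpu_kthreads(void)
{
	int cpu;

	for_each_possible_cpu(cpu) {
		struct task_struct *t;

		t = kthread_create_on_node(percpu_kthread_fn, NULL,
					   cpu_to_node(cpu), "rcuc/%d", cpu);
		if (IS_ERR(t))
			continue;
		kthread_bind(t, cpu);
#ifdef CONFIG_RCU_BOOST
		{
			struct sched_param sp = { .sched_priority = 1 };

			sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
		}
#endif
		wake_up_process(t);
	}
}
```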



end of thread, other threads:[~2012-03-03  8:39 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-02-27 20:38 [RFC PATCH] kick ksoftirqd more often to please soft lockup detector Dan Williams
2012-02-28  8:35 ` Yong Zhang
2012-02-28  9:48 ` Peter Zijlstra
2012-02-28 16:48   ` Dan Williams
2012-02-28 21:41   ` Thomas Gleixner
2012-02-28 22:16     ` Dan Williams
2012-02-28 22:25       ` Dan Williams
2012-02-28 22:25         ` Dan Williams
2012-02-29  9:17       ` Peter Zijlstra
2012-02-29 19:49         ` Dan Williams
2012-02-29 19:49           ` Dan Williams
2012-03-03  8:39         ` Paul E. McKenney
