* Calling block ops when !TASK_RUNNING warning in raid1
@ 2017-11-28  8:51 Dennis Yang
  2017-11-28 17:52 ` Shaohua Li
  0 siblings, 1 reply; 8+ messages in thread
From: Dennis Yang @ 2017-11-28  8:51 UTC (permalink / raw)
  To: Shaohua Li, linux-raid

Hi,

We recently hit the following kernel warning on raid1 with some kernel
debug options enabled.

<4>[   40.501369] ------------[ cut here ]------------
<4>[   40.501375] WARNING: CPU: 7 PID: 7477 at
kernel/sched/core.c:7404 __might_sleep+0xa2/0xb0()
<4>[   40.501378] do not call blocking ops when !TASK_RUNNING; state=2
set at [<ffffffff810c28d8>] prepare_to_wait_event+0x58/0x100
<4>[   40.501379] Modules linked in: dm_c2f(O) pl2303 usbserial
qm2_i2c(O) intel_ips drbd(O) flashcache(O) dm_tier_hro_algo
dm_thin_pool dm_bio_prison dm_persistent_data hal_netlink(O) k10temp
coretemp mlx4_en(O) mlx4_core(O) mlx_compat(O) igb e1000e(O)
mpt3sas(O) mpt2sas scsi_transport_sas raid_class usb_storage xhci_pci
xhci_hcd usblp uhci_hcd ehci_pci ehci_hcd
<4>[   40.501395] CPU: 7 PID: 7477 Comm: md321_resync Tainted: G
    O    4.2.8 #1
<4>[   40.501396] Hardware name: INSYDE QV96/Type2 - Board Product
Name1, BIOS QV96IR23 10/21/2015
<4>[   40.501397]  ffffffff8219aff7 ffff880092f7b868 ffffffff81c86c4b
ffffffff810dfb59
<4>[   40.501399]  ffff880092f7b8b8 ffff880092f7b8a8 ffffffff81079fa5
ffff880092f7b8e8
<4>[   40.501401]  ffffffff821a4f6d 0000000000000140 0000000000000000
0000000000000001
<4>[   40.501403] Call Trace:
<4>[   40.501407]  [<ffffffff81c86c4b>] dump_stack+0x4c/0x65
<4>[   40.501409]  [<ffffffff810dfb59>] ? console_unlock+0x279/0x4f0
<4>[   40.501411]  [<ffffffff81079fa5>] warn_slowpath_common+0x85/0xc0
<4>[   40.501412]  [<ffffffff8107a021>] warn_slowpath_fmt+0x41/0x50
<4>[   40.501414]  [<ffffffff810c28d8>] ? prepare_to_wait_event+0x58/0x100
<4>[   40.501415]  [<ffffffff810c28d8>] ? prepare_to_wait_event+0x58/0x100
<4>[   40.501416]  [<ffffffff810a4f72>] __might_sleep+0xa2/0xb0
<4>[   40.501419]  [<ffffffff8117bb7c>] mempool_alloc+0x7c/0x150
<4>[   40.501422]  [<ffffffff8101442a>] ? save_stack_trace+0x2a/0x50
<4>[   40.501425]  [<ffffffff8145b589>] bio_alloc_bioset+0xb9/0x260
<4>[   40.501428]  [<ffffffff816cb6da>] bio_alloc_mddev+0x1a/0x30
<4>[   40.501429]  [<ffffffff816d22a2>] md_super_write+0x32/0x90
<4>[   40.501431]  [<ffffffff816db9b2>] write_page+0x2d2/0x480
<4>[   40.501432]  [<ffffffff816db808>] ? write_page+0x128/0x480
<4>[   40.501434]  [<ffffffff816a45cc>] ? flush_pending_writes+0x1c/0xe0
<4>[   40.501435]  [<ffffffff816dc2a6>] bitmap_unplug+0x156/0x170
<4>[   40.501437]  [<ffffffff810caa2d>] ? trace_hardirqs_on+0xd/0x10
<4>[   40.501439]  [<ffffffff81c94b4b>] ? _raw_spin_unlock_irq+0x2b/0x40
<4>[   40.501440]  [<ffffffff816a4613>] flush_pending_writes+0x63/0xe0
<4>[   40.501442]  [<ffffffff816a4aff>] freeze_array+0x6f/0xc0
<4>[   40.501443]  [<ffffffff810c27e0>] ? wait_woken+0xb0/0xb0
<4>[   40.501444]  [<ffffffff816a4b8f>] raid1_quiesce+0x3f/0x50
<4>[   40.501446]  [<ffffffff816d2254>] md_do_sync+0x1394/0x13b0
<4>[   40.501447]  [<ffffffff816d1531>] ? md_do_sync+0x671/0x13b0
<4>[   40.501449]  [<ffffffff810cb680>] ? __lock_acquire+0x990/0x23a0
<4>[   40.501451]  [<ffffffff810bade7>] ? pick_next_task_fair+0x707/0xc30
<4>[   40.501453]  [<ffffffff8108783c>] ? kernel_sigaction+0x2c/0xc0
<4>[   40.501455]  [<ffffffff81c94b4b>] ? _raw_spin_unlock_irq+0x2b/0x40
<4>[   40.501456]  [<ffffffff816cac80>] ? find_pers+0x80/0x80
<4>[   40.501457]  [<ffffffff816cadbe>] md_thread+0x13e/0x150
<4>[   40.501458]  [<ffffffff816cac80>] ? find_pers+0x80/0x80
<4>[   40.501460]  [<ffffffff816cac80>] ? find_pers+0x80/0x80
<4>[   40.501462]  [<ffffffff8109ded5>] kthread+0x105/0x120
<4>[   40.501463]  [<ffffffff81c94b4b>] ? _raw_spin_unlock_irq+0x2b/0x40
<4>[   40.501465]  [<ffffffff8109ddd0>] ? kthread_create_on_node+0x220/0x220
<4>[   40.501467]  [<ffffffff81c956cf>] ret_from_fork+0x3f/0x70
<4>[   40.501468]  [<ffffffff8109ddd0>] ? kthread_create_on_node+0x220/0x220
<4>[   40.501469] ---[ end trace bd085fb137be2a87 ]---

It looks like raid1_quiesce() creates nested sleeping primitives: it calls
wait_event_lock_irq_cmd(), which sets the task state to TASK_UNINTERRUPTIBLE,
and flush_pending_writes() can then end up allocating a bio for the bitmap
update with GFP_NOIO, which might sleep and trigger this warning.
I am not sure whether this warning is just a false positive, or whether we
should change the bio allocation gfp flag to GFP_NOWAIT to prevent it from
blocking?
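
For reference, my understanding is that the wait loop expands roughly like
this (a simplified sketch of the macros in include/linux/wait.h, with wq,
__wait, condition and cmd standing in for the real arguments):

	/* wait_event_lock_irq_cmd(wq, condition, lock, cmd), roughly */
	for (;;) {
		prepare_to_wait_event(&wq, &__wait, TASK_UNINTERRUPTIBLE);
		if (condition)
			break;
		spin_unlock_irq(&lock);
		cmd;		/* flush_pending_writes(conf) runs here,
				 * still in TASK_UNINTERRUPTIBLE, and can
				 * sleep in mempool_alloc(GFP_NOIO) */
		schedule();
		spin_lock_irq(&lock);
	}
	finish_wait(&wq, &__wait);

so whatever cmd does, it runs after the task state has already been changed.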

Thank you for your patience.
Dennis

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Calling block ops when !TASK_RUNNING warning in raid1
  2017-11-28  8:51 Calling block ops when !TASK_RUNNING warning in raid1 Dennis Yang
@ 2017-11-28 17:52 ` Shaohua Li
  2017-11-30  0:20   ` NeilBrown
  0 siblings, 1 reply; 8+ messages in thread
From: Shaohua Li @ 2017-11-28 17:52 UTC (permalink / raw)
  To: Dennis Yang; +Cc: linux-raid

On Tue, Nov 28, 2017 at 04:51:25PM +0800, Dennis Yang wrote:
> Hi,
> 
> We recently hit the following kernel warning on raid1 with some kernel
> debug options enabled.
> 
> [ ... kernel warning and call trace trimmed; quoted in full above ... ]
> 
> It looks like raid1_quiesce() creates nested sleeping primitives: it calls
> wait_event_lock_irq_cmd(), which sets the task state to TASK_UNINTERRUPTIBLE,
> and flush_pending_writes() can then end up allocating a bio for the bitmap
> update with GFP_NOIO, which might sleep and trigger this warning.
> I am not sure whether this warning is just a false positive, or whether we
> should change the bio allocation gfp flag to GFP_NOWAIT to prevent it from
> blocking?

This is a legit warning. Changing gfp to GFP_NOWAIT doesn't completely fix
the issue, because generic_make_request could sleep too. I think we should
move the work to a workqueue. Could you please try the patch below (untested)?

Thanks,
Shaohua

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index cc9d337..21e4fe2 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1015,6 +1015,20 @@ static int get_unqueued_pending(struct r1conf *conf)
 	return ret;
 }
 
+/*
+ * freeze_array could be called in raid1d, so we can't move this work to raid1d
+ */
+static void raid1_do_flush_pending_work(struct work_struct *work)
+{
+	struct r1conf *conf = container_of(work, struct r1conf,
+		flush_pending_work);
+	struct blk_plug plug;
+
+	blk_start_plug(&plug);
+	flush_pending_writes(conf);
+	blk_finish_plug(&plug);
+}
+
 static void freeze_array(struct r1conf *conf, int extra)
 {
 	/* Stop sync I/O and normal I/O and wait for everything to
@@ -1047,7 +1061,7 @@ static void freeze_array(struct r1conf *conf, int extra)
 		conf->wait_barrier,
 		get_unqueued_pending(conf) == extra,
 		conf->resync_lock,
-		flush_pending_writes(conf));
+		schedule_work(&conf->flush_pending_work));
 	spin_unlock_irq(&conf->resync_lock);
 }
 static void unfreeze_array(struct r1conf *conf)
@@ -2953,6 +2967,7 @@ static struct r1conf *setup_conf(struct mddev *mddev)
 	bio_list_init(&conf->pending_bio_list);
 	conf->pending_count = 0;
 	conf->recovery_disabled = mddev->recovery_disabled - 1;
+	INIT_WORK(&conf->flush_pending_work, raid1_do_flush_pending_work);
 
 	err = -EIO;
 	for (i = 0; i < conf->raid_disks * 2; i++) {
@@ -3103,6 +3118,7 @@ static void raid1_free(struct mddev *mddev, void *priv)
 {
 	struct r1conf *conf = priv;
 
+	flush_work(&conf->flush_pending_work);
 	mempool_destroy(conf->r1bio_pool);
 	kfree(conf->mirrors);
 	safe_put_page(conf->tmppage);
diff --git a/drivers/md/raid1.h b/drivers/md/raid1.h
index c7294e7..0e9dae9 100644
--- a/drivers/md/raid1.h
+++ b/drivers/md/raid1.h
@@ -126,7 +126,7 @@ struct r1conf {
 	 */
 	sector_t		cluster_sync_low;
 	sector_t		cluster_sync_high;
-
+	struct work_struct	flush_pending_work;
 };
 
 /*

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: Calling block ops when !TASK_RUNNING warning in raid1
  2017-11-28 17:52 ` Shaohua Li
@ 2017-11-30  0:20   ` NeilBrown
  2017-11-30  2:56     ` Guoqing Jiang
  0 siblings, 1 reply; 8+ messages in thread
From: NeilBrown @ 2017-11-30  0:20 UTC (permalink / raw)
  To: Shaohua Li, Dennis Yang; +Cc: linux-raid

On Tue, Nov 28 2017, Shaohua Li wrote:

> On Tue, Nov 28, 2017 at 04:51:25PM +0800, Dennis Yang wrote:
>> [ ... original report and call trace trimmed; see the first message in this thread ... ]
>
> This is a legit warning. Changing gfp to GFP_NOWAIT doesn't completely fix
> the issue, because generic_make_request could sleep too. I think we should
> move the work to a workqueue. Could you please try the patch below (untested)?

I think it would be simpler to call __set_current_state(TASK_RUNNING)
in the 'then' branch of flush_pending_writes().

What is happening is that wait_event_lock_irq_cmd() is told to call
flush_pending_writes(conf) after dropping the lock and before calling
schedule().
If this finds nothing to do, it should go ahead and call schedule().
If it does find something to do, that probably took a while, so it makes
sense not to call schedule() and instead to go around the wait loop
again.  Setting the task state to TASK_RUNNING makes schedule() a no-op
and suppresses the warning.

The warning is saying "you set the task state and then did something
non-trivial before calling schedule()".  I think our response is "oh,
when we do something non-trivial, we don't really need to call schedule()
after all, thanks for the warning".

Thanks,
NeilBrown

> [ ... proposed patch trimmed; quoted in full in the previous message ... ]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Calling block ops when !TASK_RUNNING warning in raid1
  2017-11-30  0:20   ` NeilBrown
@ 2017-11-30  2:56     ` Guoqing Jiang
  2017-11-30  4:26       ` NeilBrown
  0 siblings, 1 reply; 8+ messages in thread
From: Guoqing Jiang @ 2017-11-30  2:56 UTC (permalink / raw)
  To: NeilBrown, Shaohua Li, Dennis Yang; +Cc: linux-raid



On 11/30/2017 08:20 AM, NeilBrown wrote:
> On Tue, Nov 28 2017, Shaohua Li wrote:
>
>> On Tue, Nov 28, 2017 at 04:51:25PM +0800, Dennis Yang wrote:
>>> [ ... original report and call trace trimmed; see the first message in this thread ... ]
>> This is a legit warning. Changing gfp to GFP_NOWAIT doesn't completely fix
>> the issue, because generic_make_request could sleep too. I think we should
>> move the work to a workqueue. Could you please try the patch below (untested)?
> I think it would be simpler to call __set_current_state(TASK_RUNNING)
> in the 'then' branch of flush_pending_writes().

There is no 'then' branch in this function; maybe you mean setting
TASK_RUNNING in the 'if' branch, since the call trace is triggered by
flush_pending_writes() -> flush_bio_list() -> bitmap_unplug().

Thanks,
Guoqing


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Calling block ops when !TASK_RUNNING warning in raid1
  2017-11-30  2:56     ` Guoqing Jiang
@ 2017-11-30  4:26       ` NeilBrown
  2017-12-01 20:10         ` Shaohua Li
  0 siblings, 1 reply; 8+ messages in thread
From: NeilBrown @ 2017-11-30  4:26 UTC (permalink / raw)
  To: Guoqing Jiang, Shaohua Li, Dennis Yang; +Cc: linux-raid

On Thu, Nov 30 2017, Guoqing Jiang wrote:

> On 11/30/2017 08:20 AM, NeilBrown wrote:
>> On Tue, Nov 28 2017, Shaohua Li wrote:
>>
>>> On Tue, Nov 28, 2017 at 04:51:25PM +0800, Dennis Yang wrote:
>>>> [ ... original report and call trace trimmed; see the first message in this thread ... ]
>>> This is a legit warning. Changing gfp to GFP_NOWAIT doesn't completely fix
>>> the issue, because generic_make_request could sleep too. I think we should
>>> move the work to a workqueue. Could you please try the patch below (untested)?
>> I think it would be simpler to call __set_current_state(TASK_RUNNING)
>> in the 'then' branch of flush_pending_writes().
>
> There is no 'then' branch in this function; maybe you mean setting
> TASK_RUNNING in the 'if' branch, since the call trace is triggered by
> flush_pending_writes() -> flush_bio_list() -> bitmap_unplug().

I grew up with BASIC and Pascal.
Every "if" statement has a "then" branch and an "else" branch.
The fact that C doesn't have a "then" keyword doesn't mean there isn't a
'then' branch.

But yes, state should be set to TASK_RUNNING when the condition in the
'if' statement evaluates as 'true' (or maybe I should say "doesn't
evaluate to 0").

NeilBrown

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Calling block ops when !TASK_RUNNING warning in raid1
  2017-11-30  4:26       ` NeilBrown
@ 2017-12-01 20:10         ` Shaohua Li
  2017-12-03 21:21           ` NeilBrown
  0 siblings, 1 reply; 8+ messages in thread
From: Shaohua Li @ 2017-12-01 20:10 UTC (permalink / raw)
  To: NeilBrown; +Cc: Guoqing Jiang, Dennis Yang, linux-raid

On Thu, Nov 30, 2017 at 03:26:06PM +1100, Neil Brown wrote:
> On Thu, Nov 30 2017, Guoqing Jiang wrote:
> 
> > On 11/30/2017 08:20 AM, NeilBrown wrote:
> >> On Tue, Nov 28 2017, Shaohua Li wrote:
> >>
> >>> On Tue, Nov 28, 2017 at 04:51:25PM +0800, Dennis Yang wrote:
> >>>> [ ... original report and call trace trimmed; see the first message in this thread ... ]
> >>> This is a legit warning. Changing gfp to GFP_NOWAIT doesn't completely fix
> >>> the issue, because generic_make_request could sleep too. I think we should
> >>> move the work to a workqueue. Could you please try the patch below (untested)?
> >> I think it would be simpler to call __set_current_state(TASK_RUNNING)
> >> in the 'then' branch of flush_pending_writes().
> >
> > There is no 'then' branch in this function; maybe you mean setting
> > TASK_RUNNING in the 'if' branch, since the call trace is triggered by
> > flush_pending_writes() -> flush_bio_list() -> bitmap_unplug().
> 
> I grew up with BASIC and Pascal.
> Every "if" statement has a "then" branch and an "else" branch.
> The fact that C doesn't have a "then" keyword doesn't mean there isn't a
> 'then' branch.
> 
> But yes, state should be set to TASK_RUNNING when the condition in the
> 'if' statement evaluates as 'true' (or maybe I should say "doesn't
> evaluate to 0").

Completely agree this fixes the issue, but I'm a little hesitant to apply it.
It looks a little weird; at the very least I would need to add comments
explaining why we do it that way. Do you have a strong preference for this
approach?

Thanks,
Shaohua

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Calling block ops when !TASK_RUNNING warning in raid1
  2017-12-01 20:10         ` Shaohua Li
@ 2017-12-03 21:21           ` NeilBrown
  2017-12-04 22:31             ` Shaohua Li
  0 siblings, 1 reply; 8+ messages in thread
From: NeilBrown @ 2017-12-03 21:21 UTC (permalink / raw)
  To: Shaohua Li; +Cc: Guoqing Jiang, Dennis Yang, linux-raid

On Fri, Dec 01 2017, Shaohua Li wrote:

> On Thu, Nov 30, 2017 at 03:26:06PM +1100, Neil Brown wrote:
>> On Thu, Nov 30 2017, Guoqing Jiang wrote:
>> 
>> > On 11/30/2017 08:20 AM, NeilBrown wrote:
>> >> On Tue, Nov 28 2017, Shaohua Li wrote:
>> >>
>> >>> On Tue, Nov 28, 2017 at 04:51:25PM +0800, Dennis Yang wrote:
>> >>>> [ ... original report and call trace trimmed; see the first message in this thread ... ]
>> >>> This is a legit warning. Changing gfp to GFP_NOWAIT doesn't completely fix
>> >>> the issue, because generic_make_request could sleep too. I think we should
>> >>> move the work to a workqueue. Could you please try the patch below (untested)?
>> >> I think it would be simpler to call __set_current_state(TASK_RUNNING)
>> >> in the 'then' branch of flush_pending_writes().
>> >
>> > There is no 'then' branch in this function; maybe you mean setting
>> > TASK_RUNNING in the 'if' branch, since the call trace is triggered by
>> > flush_pending_writes() -> flush_bio_list() -> bitmap_unplug().
>> 
>> I grew up with BASIC and Pascal.
>> Every "if" statement has a "then" branch and an "else" branch.
>> The fact that C doesn't have a "then" keyword doesn't mean there isn't a
>> 'then' branch.
>> 
>> But yes, state should be set to TASK_RUNNING when the condition in the
>> 'if' statement evaluates as 'true' (or maybe I should say "doesn't
>> evaluate to 0").
>
> Completely agree this fixes the issue, but I'm a little hesitant to apply it.
> It looks a little weird; at the very least I would need to add comments
> explaining why we do it that way. Do you have a strong preference for this
> approach?

My preference is quite strong.  I believe the current code is simple and
correct and doesn't need to change.
The warning is a false positive.  It is a good warning to have, but in
this case it doesn't indicate a problem.

I agree that comments are a good thing here.  So I propose this patch,
replete with comments.

Thanks,
NeilBrown


Subject: [PATCH] md/raid1,raid10: silence warning about wait-within-wait

If you prepare_to_wait() after a previous prepare_to_wait(), but
before calling schedule(), you get the warning:

  do not call blocking ops when !TASK_RUNNING; state=2

This is appropriate, as it is often a bug.  The event that the first
prepare_to_wait() expects might wake up the schedule() following the
second prepare_to_wait(), which could be confusing.

However if both prepare_to_wait()s are part of simple wait_event()
loops, and if the inner one is rarely called, then there is no
problem.  The inner loop is too simple to get confused by a stray
wakeup, and the outer loop won't spin unduly because the inner one
doesn't affect it often.

This pattern occurs in both raid1.c and raid10.c in the use of
flush_pending_writes().

The warning can be silenced by setting current->state to TASK_RUNNING.

Signed-off-by: NeilBrown <neilb@suse.com>
---
 drivers/md/raid1.c  | 11 +++++++++++
 drivers/md/raid10.c | 11 +++++++++++
 2 files changed, 22 insertions(+)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index cc9d337a1ed3..0faec3a5f017 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -813,6 +813,17 @@ static void flush_pending_writes(struct r1conf *conf)
 		bio = bio_list_get(&conf->pending_bio_list);
 		conf->pending_count = 0;
 		spin_unlock_irq(&conf->device_lock);
+
+		/* As this is called in a wait_event() loop, current->state
+		 * might be TASK_UNINTERRUPTIBLE which will cause a warning
+		 * when we prepare to wait again.
+		 * As it is rare that this path is taken, it is perfectly
+		 * safe to force us to go around the wait_event() loop
+		 * again, so the warning is a false-positive.
+		 * Silence the warning by resetting thread state
+		 */
+		__set_current_state(TASK_RUNNING);
+
 		flush_bio_list(conf, bio);
 	} else
 		spin_unlock_irq(&conf->device_lock);
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index b9edbc747a95..df7b78a79735 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -898,6 +898,17 @@ static void flush_pending_writes(struct r10conf *conf)
 		bio = bio_list_get(&conf->pending_bio_list);
 		conf->pending_count = 0;
 		spin_unlock_irq(&conf->device_lock);
+
+		/* As this is called in a wait_event() loop, current->state
+		 * might be TASK_UNINTERRUPTIBLE which will cause a warning
+		 * when we prepare to wait again.
+		 * As it is rare that this path is taken, it is perfectly
+		 * safe to force us to go around the wait_event() loop
+		 * again, so the warning is a false-positive.
+		 * Silence the warning by resetting thread state
+		 */
+		__set_current_state(TASK_RUNNING);
+
 		/* flush any pending bitmap writes to disk
 		 * before proceeding w/ I/O */
 		bitmap_unplug(conf->mddev->bitmap);
-- 
2.14.0.rc0.dirty


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: Calling block ops when !TASK_RUNNING warning in raid1
  2017-12-03 21:21           ` NeilBrown
@ 2017-12-04 22:31             ` Shaohua Li
  0 siblings, 0 replies; 8+ messages in thread
From: Shaohua Li @ 2017-12-04 22:31 UTC (permalink / raw)
  To: NeilBrown; +Cc: Guoqing Jiang, Dennis Yang, linux-raid

On Mon, Dec 04, 2017 at 08:21:04AM +1100, Neil Brown wrote:
> On Fri, Dec 01 2017, Shaohua Li wrote:
> 
> > On Thu, Nov 30, 2017 at 03:26:06PM +1100, Neil Brown wrote:
> >> On Thu, Nov 30 2017, Guoqing Jiang wrote:
> >> 
> >> > On 11/30/2017 08:20 AM, NeilBrown wrote:
> >> >> On Tue, Nov 28 2017, Shaohua Li wrote:
> >> >>
> >> >>> On Tue, Nov 28, 2017 at 04:51:25PM +0800, Dennis Yang wrote:
> >> >>>> [ ... original report and call trace trimmed; see the first message in this thread ... ]
> >> >>> This is a legit warning. Changing gfp to GFP_NOWAIT doesn't completely fix
> >> >>> the issue, because generic_make_request could sleep too. I think we should
> >> >>> move the work to a workqueue. Could you please try the patch below (untested)?
> >> >> I think it would be simpler to call __set_current_state(TASK_RUNNING)
> >> >> in the 'then' branch of flush_pending_writes().
> >> >
> >> > There is no 'then' branch in this function; maybe you mean setting
> >> > TASK_RUNNING in the 'if' branch, since the call trace is triggered by
> >> > flush_pending_writes() -> flush_bio_list() -> bitmap_unplug().
> >> 
> >> I grew up with BASIC and Pascal.
> >> Every "if" statement has a "then" branch and an "else" branch.
> >> The fact that C doesn't have a "then" keyword doesn't mean there isn't a
> >> 'then' branch.
> >> 
> >> But yes, state should be set to TASK_RUNNING when the condition in the
> >> 'if' statement evaluates as 'true' (or maybe I should say "doesn't
> >> evaluate to 0").
> >
> > Completely agree this fixes the issue, but I'm a little hesitant to apply it.
> > It looks a little weird; at the very least I would need to add comments
> > explaining why we do it that way. Do you have a strong preference for this
> > approach?
> 
> My preference is quite strong.  I believe the current code is simple and
> correct and doesn't need to change.
> The warning is a false positive.  It is a good warning to have, but in
> this case it doesn't indicate a problem.
> 
> I agree that comments are a good thing here.  So I propose this patch,
> replete with comments.


Ok, with the comment, I feel better. Applied, thanks!

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2017-12-04 22:31 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-28  8:51 Calling block ops when !TASK_RUNNING warning in raid1 Dennis Yang
2017-11-28 17:52 ` Shaohua Li
2017-11-30  0:20   ` NeilBrown
2017-11-30  2:56     ` Guoqing Jiang
2017-11-30  4:26       ` NeilBrown
2017-12-01 20:10         ` Shaohua Li
2017-12-03 21:21           ` NeilBrown
2017-12-04 22:31             ` Shaohua Li
