[PATCH V2] block: Prevent hung_check firing during long sync IO

* [PATCH V2] block: Prevent hung_check firing during long sync IO
@ 2020-03-18  3:43 Ming Lei
  2020-03-18 10:55 ` Guoqing Jiang
  2020-03-18 14:48 ` Jens Axboe
  0 siblings, 2 replies; 4+ messages in thread
From: Ming Lei @ 2020-03-18  3:43 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Ming Lei, Salman Qazi, Jesse Barnes, Bart Van Assche

submit_bio_wait() can be called from ioctl(BLKSECDISCARD), which
may take long time to complete, as Salman mentioned, 4K BLKSECDISCARD
takes up to 100 second on some devices. Also any block I/O operation
that occurs after the BLKSECDISCARD is submitted will also potentially
be affected by the hung task timeouts.

Another report is that task hang can be observed when running mkfs
over raid10 which takes a small max discard sectors limit because
of chunk size.

So prevent hung_check from firing by taking same approach used
in blk_execute_rq(), and the wake-up interval is set as half the
hung_check timer period, which keeps overhead low enough.

Cc: Salman Qazi <sqazi@google.com>
Cc: Jesse Barnes <jsbarnes@google.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Link: https://lkml.org/lkml/2020/2/12/1193
Reported-by: Salman Qazi <sqazi@google.com>
Reviewed-by: Jesse Barnes <jsbarnes@google.com>
Reviewed-by: Salman Qazi <sqazi@google.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
V2:
	- fix checkpatch warning
	- add reviewed-by
	- add comment log for covering one recent report on task hung on
	  raid10

 block/bio.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/block/bio.c b/block/bio.c
index 94d697217887..0985f3422556 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -17,6 +17,7 @@
 #include <linux/cgroup.h>
 #include <linux/blk-cgroup.h>
 #include <linux/highmem.h>
+#include <linux/sched/sysctl.h>
 
 #include <trace/events/block.h>
 #include "blk.h"
@@ -1019,12 +1020,21 @@ static void submit_bio_wait_endio(struct bio *bio)
 int submit_bio_wait(struct bio *bio)
 {
 	DECLARE_COMPLETION_ONSTACK_MAP(done, bio->bi_disk->lockdep_map);
+	unsigned long hang_check;
 
 	bio->bi_private = &done;
 	bio->bi_end_io = submit_bio_wait_endio;
 	bio->bi_opf |= REQ_SYNC;
 	submit_bio(bio);
-	wait_for_completion_io(&done);
+
+	/* Prevent hang_check timer from firing at us during very long I/O */
+	hang_check = sysctl_hung_task_timeout_secs;
+	if (hang_check)
+		while (!wait_for_completion_io_timeout(&done,
+					hang_check * (HZ/2)))
+			;
+	else
+		wait_for_completion_io(&done);
 
 	return blk_status_to_errno(bio->bi_status);
 }
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread