linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v6 0/2] cancel all throttled bios in del_gendisk()
@ 2022-01-10 13:47 Yu Kuai
  2022-01-10 13:47 ` [PATCH v6 1/2] blk-throtl: move WARN_ON_ONCE() from throtl_rb_first() to it's caller Yu Kuai
  2022-01-10 13:47 ` [PATCH v6 2/2] block: cancel all throttled bios in del_gendisk() Yu Kuai
  0 siblings, 2 replies; 16+ messages in thread
From: Yu Kuai @ 2022-01-10 13:47 UTC (permalink / raw)
  To: mkoutny, paulmck, tj, axboe
  Cc: cgroups, linux-block, linux-kernel, yukuai3, yi.zhang

If del_gendisk() is done when some io are still throttled, such io
will not be handled until the throttle is done, which is not
necessary.

Changes in v2:
 - move WARN_ON_ONCE() from throtl_rb_first() to it's caller
 - merge some patches into one.

Changes in v3:
 - some code optimization in patch 1
 - hold queue lock to cancel bios in patch 2

Changes in v4:
 - delete rcu_read_lock() and rcu_read_unlock() in patch 2

Changes in v5:
 - add comment about rcu lock

Changes in v6:
 - revert to include rcu lock with some comments.

Jagan Teki (1):
  ARM: dts: stm32: Enable LVDS panel on i.Core STM32MP1 EDIMM2.2

Yu Kuai (1):
  blk-throtl: move WARN_ON_ONCE() from throtl_rb_first() to it's caller

Yu Kuai (2):
  blk-throtl: move WARN_ON_ONCE() from throtl_rb_first() to it's caller
  block: cancel all throttled bios in del_gendisk()

 block/blk-throttle.c | 82 ++++++++++++++++++++++++++++++++++++++++++--
 block/blk-throttle.h |  2 ++
 block/genhd.c        |  2 ++
 3 files changed, 83 insertions(+), 3 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v6 1/2] blk-throtl: move WARN_ON_ONCE() from throtl_rb_first() to it's caller
  2022-01-10 13:47 [PATCH v6 0/2] cancel all throttled bios in del_gendisk() Yu Kuai
@ 2022-01-10 13:47 ` Yu Kuai
  2022-01-10 13:47 ` [PATCH v6 2/2] block: cancel all throttled bios in del_gendisk() Yu Kuai
  1 sibling, 0 replies; 16+ messages in thread
From: Yu Kuai @ 2022-01-10 13:47 UTC (permalink / raw)
  To: mkoutny, paulmck, tj, axboe
  Cc: cgroups, linux-block, linux-kernel, yukuai3, yi.zhang

Prepare to reintroduce tg_drain_bios(), which will iterate until
throtl_rb_first() return NULL.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 block/blk-throttle.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 39bb6e68a9a2..fdd57878e862 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -502,7 +502,6 @@ throtl_rb_first(struct throtl_service_queue *parent_sq)
 	struct rb_node *n;
 
 	n = rb_first_cached(&parent_sq->pending_tree);
-	WARN_ON_ONCE(!n);
 	if (!n)
 		return NULL;
 	return rb_entry_tg(n);
@@ -521,7 +520,7 @@ static void update_min_dispatch_time(struct throtl_service_queue *parent_sq)
 	struct throtl_grp *tg;
 
 	tg = throtl_rb_first(parent_sq);
-	if (!tg)
+	if (WARN_ON_ONCE(!tg))
 		return;
 
 	parent_sq->first_pending_disptime = tg->disptime;
@@ -1090,7 +1089,7 @@ static int throtl_select_dispatch(struct throtl_service_queue *parent_sq)
 			break;
 
 		tg = throtl_rb_first(parent_sq);
-		if (!tg)
+		if (WARN_ON_ONCE(!tg))
 			break;
 
 		if (time_before(jiffies, tg->disptime))
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v6 2/2] block: cancel all throttled bios in del_gendisk()
  2022-01-10 13:47 [PATCH v6 0/2] cancel all throttled bios in del_gendisk() Yu Kuai
  2022-01-10 13:47 ` [PATCH v6 1/2] blk-throtl: move WARN_ON_ONCE() from throtl_rb_first() to it's caller Yu Kuai
@ 2022-01-10 13:47 ` Yu Kuai
  2022-01-12  3:05   ` Ming Lei
  1 sibling, 1 reply; 16+ messages in thread
From: Yu Kuai @ 2022-01-10 13:47 UTC (permalink / raw)
  To: mkoutny, paulmck, tj, axboe
  Cc: cgroups, linux-block, linux-kernel, yukuai3, yi.zhang

Throttled bios can't be issued after del_gendisk() is done, thus
it's better to cancel them immediately rather than waiting for
throttle is done.

For example, if user thread is throttled with low bps while it's
issuing large io, and the device is deleted. The user thread will
wait for a long time for io to return.

Noted this patch is mainly from revertion of commit 32e3374304c7
("blk-throttle: remove tg_drain_bios") and commit b77412372b68
("blk-throttle: remove blk_throtl_drain").

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 block/blk-throttle.c | 77 ++++++++++++++++++++++++++++++++++++++++++++
 block/blk-throttle.h |  2 ++
 block/genhd.c        |  2 ++
 3 files changed, 81 insertions(+)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index fdd57878e862..49e783d6b0d4 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -2256,6 +2256,83 @@ void blk_throtl_bio_endio(struct bio *bio)
 }
 #endif
 
+/*
+ * Dispatch all bios from all children tg's queued on @parent_sq.  On
+ * return, @parent_sq is guaranteed to not have any active children tg's
+ * and all bios from previously active tg's are on @parent_sq->bio_lists[].
+ */
+static void tg_drain_bios(struct throtl_service_queue *parent_sq)
+{
+	struct throtl_grp *tg;
+
+	while ((tg = throtl_rb_first(parent_sq))) {
+		struct throtl_service_queue *sq = &tg->service_queue;
+		struct bio *bio;
+
+		throtl_dequeue_tg(tg);
+
+		while ((bio = throtl_peek_queued(&sq->queued[READ])))
+			tg_dispatch_one_bio(tg, bio_data_dir(bio));
+		while ((bio = throtl_peek_queued(&sq->queued[WRITE])))
+			tg_dispatch_one_bio(tg, bio_data_dir(bio));
+	}
+}
+
+/**
+ * blk_throtl_cancel_bios - cancel throttled bios
+ * @q: request_queue to cancel throttled bios for
+ *
+ * This function is called to error all currently throttled bios on @q.
+ */
+void blk_throtl_cancel_bios(struct request_queue *q)
+{
+	struct throtl_data *td = q->td;
+	struct bio_list bio_list_on_stack;
+	struct blkcg_gq *blkg;
+	struct cgroup_subsys_state *pos_css;
+	struct bio *bio;
+	int rw;
+
+	bio_list_init(&bio_list_on_stack);
+
+	/*
+	 * hold queue_lock to prevent concurrent with dispatching
+	 * throttled bios by timer.
+	 */
+	spin_lock_irq(&q->queue_lock);
+
+	/*
+	 * queue_lock is held, rcu lock is not needed here technically.
+	 * However, rcu lock is still held to emphasize that following
+	 * path need RCU protection and to prevent warning from lockdep.
+	 */
+	rcu_read_lock();
+
+	/*
+	 * Drain each tg while doing post-order walk on the blkg tree, so
+	 * that all bios are propagated to td->service_queue.  It'd be
+	 * better to walk service_queue tree directly but blkg walk is
+	 * easier.
+	 */
+	blkg_for_each_descendant_post(blkg, pos_css, td->queue->root_blkg)
+		tg_drain_bios(&blkg_to_tg(blkg)->service_queue);
+
+	/* finally, transfer bios from top-level tg's into the td */
+	tg_drain_bios(&td->service_queue);
+
+	/* all bios now should be in td->service_queue, cancel them */
+	for (rw = READ; rw <= WRITE; rw++)
+		while ((bio = throtl_pop_queued(&td->service_queue.queued[rw],
+						NULL)))
+			bio_list_add(&bio_list_on_stack, bio);
+
+	rcu_read_unlock();
+	spin_unlock_irq(&q->queue_lock);
+	if (!bio_list_empty(&bio_list_on_stack))
+		while ((bio = bio_list_pop(&bio_list_on_stack)))
+			bio_io_error(bio);
+}
+
 int blk_throtl_init(struct request_queue *q)
 {
 	struct throtl_data *td;
diff --git a/block/blk-throttle.h b/block/blk-throttle.h
index 175f03abd9e4..9d67d5139954 100644
--- a/block/blk-throttle.h
+++ b/block/blk-throttle.h
@@ -160,12 +160,14 @@ static inline void blk_throtl_exit(struct request_queue *q) { }
 static inline void blk_throtl_register_queue(struct request_queue *q) { }
 static inline void blk_throtl_charge_bio_split(struct bio *bio) { }
 static inline bool blk_throtl_bio(struct bio *bio) { return false; }
+#define blk_throtl_cancel_bios(q)  do { } while (0)
 #else /* CONFIG_BLK_DEV_THROTTLING */
 int blk_throtl_init(struct request_queue *q);
 void blk_throtl_exit(struct request_queue *q);
 void blk_throtl_register_queue(struct request_queue *q);
 void blk_throtl_charge_bio_split(struct bio *bio);
 bool __blk_throtl_bio(struct bio *bio);
+void blk_throtl_cancel_bios(struct request_queue *q);
 static inline bool blk_throtl_bio(struct bio *bio)
 {
 	struct throtl_grp *tg = blkg_to_tg(bio->bi_blkg);
diff --git a/block/genhd.c b/block/genhd.c
index c5392cc24d37..1d138f0ae26a 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -28,6 +28,7 @@
 
 #include "blk.h"
 #include "blk-rq-qos.h"
+#include "blk-throttle.h"
 
 static struct kobject *block_depr;
 
@@ -622,6 +623,7 @@ void del_gendisk(struct gendisk *disk)
 
 	blk_mq_freeze_queue_wait(q);
 
+	blk_throtl_cancel_bios(q);
 	rq_qos_exit(q);
 	blk_sync_queue(q);
 	blk_flush_integrity();
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v6 2/2] block: cancel all throttled bios in del_gendisk()
  2022-01-10 13:47 ` [PATCH v6 2/2] block: cancel all throttled bios in del_gendisk() Yu Kuai
@ 2022-01-12  3:05   ` Ming Lei
  2022-01-13  8:46     ` yukuai (C)
                       ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Ming Lei @ 2022-01-12  3:05 UTC (permalink / raw)
  To: Yu Kuai
  Cc: mkoutny, paulmck, tj, axboe, cgroups, linux-block, linux-kernel,
	yi.zhang

Hello Yu Kuai,

On Mon, Jan 10, 2022 at 09:47:58PM +0800, Yu Kuai wrote:
> Throttled bios can't be issued after del_gendisk() is done, thus
> it's better to cancel them immediately rather than waiting for
> throttle is done.
> 
> For example, if user thread is throttled with low bps while it's
> issuing large io, and the device is deleted. The user thread will
> wait for a long time for io to return.
> 
> Noted this patch is mainly from revertion of commit 32e3374304c7
> ("blk-throttle: remove tg_drain_bios") and commit b77412372b68
> ("blk-throttle: remove blk_throtl_drain").
> 
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
>  block/blk-throttle.c | 77 ++++++++++++++++++++++++++++++++++++++++++++
>  block/blk-throttle.h |  2 ++
>  block/genhd.c        |  2 ++
>  3 files changed, 81 insertions(+)

Just wondering why not take the built-in way in throtl_upgrade_state() for
canceling throttled bios? Something like the following, then we can avoid
to re-invent the wheel.

 block/blk-throttle.c | 38 +++++++++++++++++++++++++++++++-------
 block/blk-throttle.h |  2 ++
 block/genhd.c        |  3 +++
 3 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index cf7e20804f1b..17e56b2e44c4 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -1816,16 +1816,11 @@ static void throtl_upgrade_check(struct throtl_grp *tg)
 		throtl_upgrade_state(tg->td);
 }
 
-static void throtl_upgrade_state(struct throtl_data *td)
+static void __throtl_cancel_bios(struct throtl_data *td)
 {
 	struct cgroup_subsys_state *pos_css;
 	struct blkcg_gq *blkg;
 
-	throtl_log(&td->service_queue, "upgrade to max");
-	td->limit_index = LIMIT_MAX;
-	td->low_upgrade_time = jiffies;
-	td->scale = 0;
-	rcu_read_lock();
 	blkg_for_each_descendant_post(blkg, pos_css, td->queue->root_blkg) {
 		struct throtl_grp *tg = blkg_to_tg(blkg);
 		struct throtl_service_queue *sq = &tg->service_queue;
@@ -1834,12 +1829,41 @@ static void throtl_upgrade_state(struct throtl_data *td)
 		throtl_select_dispatch(sq);
 		throtl_schedule_next_dispatch(sq, true);
 	}
-	rcu_read_unlock();
 	throtl_select_dispatch(&td->service_queue);
 	throtl_schedule_next_dispatch(&td->service_queue, true);
 	queue_work(kthrotld_workqueue, &td->dispatch_work);
 }
 
+void blk_throtl_cancel_bios(struct request_queue *q)
+{
+	struct cgroup_subsys_state *pos_css;
+	struct blkcg_gq *blkg;
+
+	rcu_read_lock();
+	spin_lock_irq(&q->queue_lock);
+	__throtl_cancel_bios(q->td);
+	spin_unlock_irq(&q->queue_lock);
+	rcu_read_unlock();
+
+	blkg_for_each_descendant_post(blkg, pos_css, q->root_blkg)
+		del_timer_sync(&blkg_to_tg(blkg)->service_queue.pending_timer);
+	del_timer_sync(&q->td->service_queue.pending_timer);
+
+	throtl_shutdown_wq(q);
+}
+
+static void throtl_upgrade_state(struct throtl_data *td)
+{
+	throtl_log(&td->service_queue, "upgrade to max");
+	td->limit_index = LIMIT_MAX;
+	td->low_upgrade_time = jiffies;
+	td->scale = 0;
+
+	rcu_read_lock();
+	__throtl_cancel_bios(td);
+	rcu_read_unlock();
+}
+
 static void throtl_downgrade_state(struct throtl_data *td)
 {
 	td->scale /= 2;
diff --git a/block/blk-throttle.h b/block/blk-throttle.h
index b23a9f3abb82..525ac607c518 100644
--- a/block/blk-throttle.h
+++ b/block/blk-throttle.h
@@ -162,11 +162,13 @@ static inline int blk_throtl_init(struct request_queue *q) { return 0; }
 static inline void blk_throtl_exit(struct request_queue *q) { }
 static inline void blk_throtl_register_queue(struct request_queue *q) { }
 static inline bool blk_throtl_bio(struct bio *bio) { return false; }
+static inline void blk_throtl_cancel_bios(struct request_queue *q) {}
 #else /* CONFIG_BLK_DEV_THROTTLING */
 int blk_throtl_init(struct request_queue *q);
 void blk_throtl_exit(struct request_queue *q);
 void blk_throtl_register_queue(struct request_queue *q);
 bool __blk_throtl_bio(struct bio *bio);
+void blk_throtl_cancel_bios(struct request_queue *q);
 static inline bool blk_throtl_bio(struct bio *bio)
 {
 	struct throtl_grp *tg = blkg_to_tg(bio->bi_blkg);
diff --git a/block/genhd.c b/block/genhd.c
index 626c8406f21a..1395cbd8eacf 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -30,6 +30,7 @@
 #include "blk.h"
 #include "blk-mq-sched.h"
 #include "blk-rq-qos.h"
+#include "blk-throttle.h"
 
 static struct kobject *block_depr;
 
@@ -622,6 +623,8 @@ void del_gendisk(struct gendisk *disk)
 
 	blk_mq_freeze_queue_wait(q);
 
+	blk_throtl_cancel_bios(q);
+
 	rq_qos_exit(q);
 	blk_sync_queue(q);
 	blk_flush_integrity();

Thanks,
Ming


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v6 2/2] block: cancel all throttled bios in del_gendisk()
  2022-01-12  3:05   ` Ming Lei
@ 2022-01-13  8:46     ` yukuai (C)
  2022-01-14  3:05       ` Ming Lei
  2022-01-24  3:48     ` yukuai (C)
  2022-01-24  3:50     ` yukuai (C)
  2 siblings, 1 reply; 16+ messages in thread
From: yukuai (C) @ 2022-01-13  8:46 UTC (permalink / raw)
  To: Ming Lei
  Cc: mkoutny, paulmck, tj, axboe, cgroups, linux-block, linux-kernel,
	yi.zhang

在 2022/01/12 11:05, Ming Lei 写道:
> Hello Yu Kuai,
> 
> On Mon, Jan 10, 2022 at 09:47:58PM +0800, Yu Kuai wrote:
>> Throttled bios can't be issued after del_gendisk() is done, thus
>> it's better to cancel them immediately rather than waiting for
>> throttle is done.
>>
>> For example, if user thread is throttled with low bps while it's
>> issuing large io, and the device is deleted. The user thread will
>> wait for a long time for io to return.
>>
>> Noted this patch is mainly from revertion of commit 32e3374304c7
>> ("blk-throttle: remove tg_drain_bios") and commit b77412372b68
>> ("blk-throttle: remove blk_throtl_drain").
>>
>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
>> ---
>>   block/blk-throttle.c | 77 ++++++++++++++++++++++++++++++++++++++++++++
>>   block/blk-throttle.h |  2 ++
>>   block/genhd.c        |  2 ++
>>   3 files changed, 81 insertions(+)
> 
> Just wondering why not take the built-in way in throtl_upgrade_state() for
> canceling throttled bios? Something like the following, then we can avoid
> to re-invent the wheel.
> 
>   block/blk-throttle.c | 38 +++++++++++++++++++++++++++++++-------
>   block/blk-throttle.h |  2 ++
>   block/genhd.c        |  3 +++
>   3 files changed, 36 insertions(+), 7 deletions(-)
> 
> diff --git a/block/blk-throttle.c b/block/blk-throttle.c
> index cf7e20804f1b..17e56b2e44c4 100644
> --- a/block/blk-throttle.c
> +++ b/block/blk-throttle.c
> @@ -1816,16 +1816,11 @@ static void throtl_upgrade_check(struct throtl_grp *tg)
>   		throtl_upgrade_state(tg->td);
>   }
>   
> -static void throtl_upgrade_state(struct throtl_data *td)
> +static void __throtl_cancel_bios(struct throtl_data *td)
>   {
>   	struct cgroup_subsys_state *pos_css;
>   	struct blkcg_gq *blkg;
>   
> -	throtl_log(&td->service_queue, "upgrade to max");
> -	td->limit_index = LIMIT_MAX;
> -	td->low_upgrade_time = jiffies;
> -	td->scale = 0;
> -	rcu_read_lock();
>   	blkg_for_each_descendant_post(blkg, pos_css, td->queue->root_blkg) {
>   		struct throtl_grp *tg = blkg_to_tg(blkg);
>   		struct throtl_service_queue *sq = &tg->service_queue;
> @@ -1834,12 +1829,41 @@ static void throtl_upgrade_state(struct throtl_data *td)
>   		throtl_select_dispatch(sq);
>   		throtl_schedule_next_dispatch(sq, true);
Hi, Ming Lei

I'm confused that how can bios be canceled here?
tg->iops and tg->bps stay untouched, how can throttled bios
dispatch?
>   	}
> -	rcu_read_unlock();
>   	throtl_select_dispatch(&td->service_queue);
>   	throtl_schedule_next_dispatch(&td->service_queue, true);
>   	queue_work(kthrotld_workqueue, &td->dispatch_work);
>   }
>   
> +void blk_throtl_cancel_bios(struct request_queue *q)
> +{
> +	struct cgroup_subsys_state *pos_css;
> +	struct blkcg_gq *blkg;
> +
> +	rcu_read_lock();
> +	spin_lock_irq(&q->queue_lock);
> +	__throtl_cancel_bios(q->td);
> +	spin_unlock_irq(&q->queue_lock);
> +	rcu_read_unlock();
> +
> +	blkg_for_each_descendant_post(blkg, pos_css, q->root_blkg)
> +		del_timer_sync(&blkg_to_tg(blkg)->service_queue.pending_timer);
> +	del_timer_sync(&q->td->service_queue.pending_timer);

By the way, I think delete timer will end up io hung here if there are
some bios still be throttled.

Thanks,
Kuai
> +
> +	throtl_shutdown_wq(q);
> +}
> +
> +static void throtl_upgrade_state(struct throtl_data *td)
> +{
> +	throtl_log(&td->service_queue, "upgrade to max");
> +	td->limit_index = LIMIT_MAX;
> +	td->low_upgrade_time = jiffies;
> +	td->scale = 0;
> +
> +	rcu_read_lock();
> +	__throtl_cancel_bios(td);
> +	rcu_read_unlock();
> +}
> +
>   static void throtl_downgrade_state(struct throtl_data *td)
>   {
>   	td->scale /= 2;
> diff --git a/block/blk-throttle.h b/block/blk-throttle.h
> index b23a9f3abb82..525ac607c518 100644
> --- a/block/blk-throttle.h
> +++ b/block/blk-throttle.h
> @@ -162,11 +162,13 @@ static inline int blk_throtl_init(struct request_queue *q) { return 0; }
>   static inline void blk_throtl_exit(struct request_queue *q) { }
>   static inline void blk_throtl_register_queue(struct request_queue *q) { }
>   static inline bool blk_throtl_bio(struct bio *bio) { return false; }
> +static inline void blk_throtl_cancel_bios(struct request_queue *q) {}
>   #else /* CONFIG_BLK_DEV_THROTTLING */
>   int blk_throtl_init(struct request_queue *q);
>   void blk_throtl_exit(struct request_queue *q);
>   void blk_throtl_register_queue(struct request_queue *q);
>   bool __blk_throtl_bio(struct bio *bio);
> +void blk_throtl_cancel_bios(struct request_queue *q);
>   static inline bool blk_throtl_bio(struct bio *bio)
>   {
>   	struct throtl_grp *tg = blkg_to_tg(bio->bi_blkg);
> diff --git a/block/genhd.c b/block/genhd.c
> index 626c8406f21a..1395cbd8eacf 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -30,6 +30,7 @@
>   #include "blk.h"
>   #include "blk-mq-sched.h"
>   #include "blk-rq-qos.h"
> +#include "blk-throttle.h"
>   
>   static struct kobject *block_depr;
>   
> @@ -622,6 +623,8 @@ void del_gendisk(struct gendisk *disk)
>   
>   	blk_mq_freeze_queue_wait(q);
>   
> +	blk_throtl_cancel_bios(q);
> +
>   	rq_qos_exit(q);
>   	blk_sync_queue(q);
>   	blk_flush_integrity();
> 
> Thanks,
> Ming
> 
> .
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v6 2/2] block: cancel all throttled bios in del_gendisk()
  2022-01-13  8:46     ` yukuai (C)
@ 2022-01-14  3:05       ` Ming Lei
  2022-01-14  8:21         ` yukuai (C)
  0 siblings, 1 reply; 16+ messages in thread
From: Ming Lei @ 2022-01-14  3:05 UTC (permalink / raw)
  To: yukuai (C)
  Cc: mkoutny, paulmck, tj, Jens Axboe, cgroups, linux-block,
	linux-kernel, yi.zhang

On Thu, Jan 13, 2022 at 04:46:18PM +0800, yukuai (C) wrote:
> 在 2022/01/12 11:05, Ming Lei 写道:
> > Hello Yu Kuai,
> >
> > On Mon, Jan 10, 2022 at 09:47:58PM +0800, Yu Kuai wrote:
> > > Throttled bios can't be issued after del_gendisk() is done, thus
> > > it's better to cancel them immediately rather than waiting for
> > > throttle is done.
> > >
> > > For example, if user thread is throttled with low bps while it's
> > > issuing large io, and the device is deleted. The user thread will
> > > wait for a long time for io to return.
> > >
> > > Noted this patch is mainly from revertion of commit 32e3374304c7
> > > ("blk-throttle: remove tg_drain_bios") and commit b77412372b68
> > > ("blk-throttle: remove blk_throtl_drain").
> > >
> > > Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> > > ---
> > >   block/blk-throttle.c | 77 ++++++++++++++++++++++++++++++++++++++++++++
> > >   block/blk-throttle.h |  2 ++
> > >   block/genhd.c        |  2 ++
> > >   3 files changed, 81 insertions(+)
> >
> > Just wondering why not take the built-in way in throtl_upgrade_state() for
> > canceling throttled bios? Something like the following, then we can avoid
> > to re-invent the wheel.
> >
> >   block/blk-throttle.c | 38 +++++++++++++++++++++++++++++++-------
> >   block/blk-throttle.h |  2 ++
> >   block/genhd.c        |  3 +++
> >   3 files changed, 36 insertions(+), 7 deletions(-)
> >
> > diff --git a/block/blk-throttle.c b/block/blk-throttle.c
> > index cf7e20804f1b..17e56b2e44c4 100644
> > --- a/block/blk-throttle.c
> > +++ b/block/blk-throttle.c
> > @@ -1816,16 +1816,11 @@ static void throtl_upgrade_check(struct throtl_grp *tg)
> >             throtl_upgrade_state(tg->td);
> >   }
> > -static void throtl_upgrade_state(struct throtl_data *td)
> > +static void __throtl_cancel_bios(struct throtl_data *td)
> >   {
> >     struct cgroup_subsys_state *pos_css;
> >     struct blkcg_gq *blkg;
> > -   throtl_log(&td->service_queue, "upgrade to max");
> > -   td->limit_index = LIMIT_MAX;
> > -   td->low_upgrade_time = jiffies;
> > -   td->scale = 0;
> > -   rcu_read_lock();
> >     blkg_for_each_descendant_post(blkg, pos_css, td->queue->root_blkg) {
> >             struct throtl_grp *tg = blkg_to_tg(blkg);
> >             struct throtl_service_queue *sq = &tg->service_queue;
> > @@ -1834,12 +1829,41 @@ static void throtl_upgrade_state(struct throtl_data *td)
> >             throtl_select_dispatch(sq);
> >             throtl_schedule_next_dispatch(sq, true);
> Hi, Ming Lei
>
> I'm confused that how can bios be canceled here?
> tg->iops and tg->bps stay untouched, how can throttled bios
> dispatch?

I thought that throttled bios will be canceled by 'tg->disptime = jiffies - 1;'
and the following dispatch schedule.

But looks it isn't enough, since tg_update_disptime() updates
->disptime. However,
this problem can be solved easily by not updating ->disptime in case that we are
canceling.

> >     }
> > -   rcu_read_unlock();
> >     throtl_select_dispatch(&td->service_queue);
> >     throtl_schedule_next_dispatch(&td->service_queue, true);
> >     queue_work(kthrotld_workqueue, &td->dispatch_work);
> >   }
> > +void blk_throtl_cancel_bios(struct request_queue *q)
> > +{
> > +   struct cgroup_subsys_state *pos_css;
> > +   struct blkcg_gq *blkg;
> > +
> > +   rcu_read_lock();
> > +   spin_lock_irq(&q->queue_lock);
> > +   __throtl_cancel_bios(q->td);
> > +   spin_unlock_irq(&q->queue_lock);
> > +   rcu_read_unlock();
> > +
> > +   blkg_for_each_descendant_post(blkg, pos_css, q->root_blkg)
> > +           del_timer_sync(&blkg_to_tg(blkg)->service_queue.pending_timer);
> > +   del_timer_sync(&q->td->service_queue.pending_timer);
>
> By the way, I think delete timer will end up io hung here if there are
> some bios still be throttled.

Firstly ->queue_lock is held by blk_throtl_cancel_bios(), so no new bios
will be throttled.

Also if we don't update ->disptime, any new bios throttled after releasing
->queue_lock will be dispatched soon.


Thanks,
Ming


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v6 2/2] block: cancel all throttled bios in del_gendisk()
  2022-01-14  3:05       ` Ming Lei
@ 2022-01-14  8:21         ` yukuai (C)
  2022-01-17  9:21           ` Ming Lei
  0 siblings, 1 reply; 16+ messages in thread
From: yukuai (C) @ 2022-01-14  8:21 UTC (permalink / raw)
  To: Ming Lei
  Cc: mkoutny, paulmck, tj, Jens Axboe, cgroups, linux-block,
	linux-kernel, yi.zhang

在 2022/01/14 11:05, Ming Lei 写道:
> On Thu, Jan 13, 2022 at 04:46:18PM +0800, yukuai (C) wrote:
>> 在 2022/01/12 11:05, Ming Lei 写道:
>>> Hello Yu Kuai,
>>>
>>> On Mon, Jan 10, 2022 at 09:47:58PM +0800, Yu Kuai wrote:
>>>> Throttled bios can't be issued after del_gendisk() is done, thus
>>>> it's better to cancel them immediately rather than waiting for
>>>> throttle is done.
>>>>
>>>> For example, if user thread is throttled with low bps while it's
>>>> issuing large io, and the device is deleted. The user thread will
>>>> wait for a long time for io to return.
>>>>
>>>> Noted this patch is mainly from revertion of commit 32e3374304c7
>>>> ("blk-throttle: remove tg_drain_bios") and commit b77412372b68
>>>> ("blk-throttle: remove blk_throtl_drain").
>>>>
>>>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
>>>> ---
>>>>    block/blk-throttle.c | 77 ++++++++++++++++++++++++++++++++++++++++++++
>>>>    block/blk-throttle.h |  2 ++
>>>>    block/genhd.c        |  2 ++
>>>>    3 files changed, 81 insertions(+)
>>>
>>> Just wondering why not take the built-in way in throtl_upgrade_state() for
>>> canceling throttled bios? Something like the following, then we can avoid
>>> to re-invent the wheel.
>>>
>>>    block/blk-throttle.c | 38 +++++++++++++++++++++++++++++++-------
>>>    block/blk-throttle.h |  2 ++
>>>    block/genhd.c        |  3 +++
>>>    3 files changed, 36 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/block/blk-throttle.c b/block/blk-throttle.c
>>> index cf7e20804f1b..17e56b2e44c4 100644
>>> --- a/block/blk-throttle.c
>>> +++ b/block/blk-throttle.c
>>> @@ -1816,16 +1816,11 @@ static void throtl_upgrade_check(struct throtl_grp *tg)
>>>              throtl_upgrade_state(tg->td);
>>>    }
>>> -static void throtl_upgrade_state(struct throtl_data *td)
>>> +static void __throtl_cancel_bios(struct throtl_data *td)
>>>    {
>>>      struct cgroup_subsys_state *pos_css;
>>>      struct blkcg_gq *blkg;
>>> -   throtl_log(&td->service_queue, "upgrade to max");
>>> -   td->limit_index = LIMIT_MAX;
>>> -   td->low_upgrade_time = jiffies;
>>> -   td->scale = 0;
>>> -   rcu_read_lock();
>>>      blkg_for_each_descendant_post(blkg, pos_css, td->queue->root_blkg) {
>>>              struct throtl_grp *tg = blkg_to_tg(blkg);
>>>              struct throtl_service_queue *sq = &tg->service_queue;
>>> @@ -1834,12 +1829,41 @@ static void throtl_upgrade_state(struct throtl_data *td)
>>>              throtl_select_dispatch(sq);
>>>              throtl_schedule_next_dispatch(sq, true);
>> Hi, Ming Lei
>>
>> I'm confused that how can bios be canceled here?
>> tg->iops and tg->bps stay untouched, how can throttled bios
>> dispatch?
> 
> I thought that throttled bios will be canceled by 'tg->disptime = jiffies - 1;'
> and the following dispatch schedule.
> 
> But looks it isn't enough, since tg_update_disptime() updates
> ->disptime. However,
> this problem can be solved easily by not updating ->disptime in case that we are
> canceling.
> 
>>>      }
>>> -   rcu_read_unlock();
>>>      throtl_select_dispatch(&td->service_queue);
>>>      throtl_schedule_next_dispatch(&td->service_queue, true);
>>>      queue_work(kthrotld_workqueue, &td->dispatch_work);
>>>    }
>>> +void blk_throtl_cancel_bios(struct request_queue *q)
>>> +{
>>> +   struct cgroup_subsys_state *pos_css;
>>> +   struct blkcg_gq *blkg;
>>> +
>>> +   rcu_read_lock();
>>> +   spin_lock_irq(&q->queue_lock);
>>> +   __throtl_cancel_bios(q->td);
>>> +   spin_unlock_irq(&q->queue_lock);
>>> +   rcu_read_unlock();
>>> +
>>> +   blkg_for_each_descendant_post(blkg, pos_css, q->root_blkg)
>>> +           del_timer_sync(&blkg_to_tg(blkg)->service_queue.pending_timer);
>>> +   del_timer_sync(&q->td->service_queue.pending_timer);
>>
>> By the way, I think delete timer will end up io hung here if there are
>> some bios still be throttled.
> 
> Firstly ->queue_lock is held by blk_throtl_cancel_bios(), so no new bios
> will be throttled.
> 
> Also if we don't update ->disptime, any new bios throttled after releasing
> ->queue_lock will be dispatched soon.

Hi, Ming Lei

Just to be curiosity, I'm still trying to understand the logic here:

For example, if bps is set to 1k, and a io with size 16k is just
dispatched, then io throtle should wait for 16s untill new io can be 
dispatched. (details in tg_with_in_bps_limit).

How does such mechanism bypassed here?

Thanks,
Kuai

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v6 2/2] block: cancel all throttled bios in del_gendisk()
  2022-01-14  8:21         ` yukuai (C)
@ 2022-01-17  9:21           ` Ming Lei
  2022-01-18  8:48             ` yukuai (C)
  0 siblings, 1 reply; 16+ messages in thread
From: Ming Lei @ 2022-01-17  9:21 UTC (permalink / raw)
  To: yukuai (C)
  Cc: mkoutny, paulmck, tj, Jens Axboe, cgroups, linux-block,
	linux-kernel, yi.zhang

On Fri, Jan 14, 2022 at 04:21:04PM +0800, yukuai (C) wrote:
> 在 2022/01/14 11:05, Ming Lei 写道:
> > On Thu, Jan 13, 2022 at 04:46:18PM +0800, yukuai (C) wrote:
> > > 在 2022/01/12 11:05, Ming Lei 写道:
> > > > Hello Yu Kuai,
> > > > 
> > > > On Mon, Jan 10, 2022 at 09:47:58PM +0800, Yu Kuai wrote:
> > > > > Throttled bios can't be issued after del_gendisk() is done, thus
> > > > > it's better to cancel them immediately rather than waiting for
> > > > > throttle is done.
> > > > > 
> > > > > For example, if user thread is throttled with low bps while it's
> > > > > issuing large io, and the device is deleted. The user thread will
> > > > > wait for a long time for io to return.
> > > > > 
> > > > > Noted this patch is mainly from revertion of commit 32e3374304c7
> > > > > ("blk-throttle: remove tg_drain_bios") and commit b77412372b68
> > > > > ("blk-throttle: remove blk_throtl_drain").
> > > > > 
> > > > > Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> > > > > ---
> > > > >    block/blk-throttle.c | 77 ++++++++++++++++++++++++++++++++++++++++++++
> > > > >    block/blk-throttle.h |  2 ++
> > > > >    block/genhd.c        |  2 ++
> > > > >    3 files changed, 81 insertions(+)
> > > > 
> > > > Just wondering why not take the built-in way in throtl_upgrade_state() for
> > > > canceling throttled bios? Something like the following, then we can avoid
> > > > to re-invent the wheel.
> > > > 
> > > >    block/blk-throttle.c | 38 +++++++++++++++++++++++++++++++-------
> > > >    block/blk-throttle.h |  2 ++
> > > >    block/genhd.c        |  3 +++
> > > >    3 files changed, 36 insertions(+), 7 deletions(-)
> > > > 
> > > > diff --git a/block/blk-throttle.c b/block/blk-throttle.c
> > > > index cf7e20804f1b..17e56b2e44c4 100644
> > > > --- a/block/blk-throttle.c
> > > > +++ b/block/blk-throttle.c
> > > > @@ -1816,16 +1816,11 @@ static void throtl_upgrade_check(struct throtl_grp *tg)
> > > >              throtl_upgrade_state(tg->td);
> > > >    }
> > > > -static void throtl_upgrade_state(struct throtl_data *td)
> > > > +static void __throtl_cancel_bios(struct throtl_data *td)
> > > >    {
> > > >      struct cgroup_subsys_state *pos_css;
> > > >      struct blkcg_gq *blkg;
> > > > -   throtl_log(&td->service_queue, "upgrade to max");
> > > > -   td->limit_index = LIMIT_MAX;
> > > > -   td->low_upgrade_time = jiffies;
> > > > -   td->scale = 0;
> > > > -   rcu_read_lock();
> > > >      blkg_for_each_descendant_post(blkg, pos_css, td->queue->root_blkg) {
> > > >              struct throtl_grp *tg = blkg_to_tg(blkg);
> > > >              struct throtl_service_queue *sq = &tg->service_queue;
> > > > @@ -1834,12 +1829,41 @@ static void throtl_upgrade_state(struct throtl_data *td)
> > > >              throtl_select_dispatch(sq);
> > > >              throtl_schedule_next_dispatch(sq, true);
> > > Hi, Ming Lei
> > > 
> > > I'm confused that how can bios be canceled here?
> > > tg->iops and tg->bps stay untouched, how can throttled bios
> > > dispatch?
> > 
> > I thought that throttled bios will be canceled by 'tg->disptime = jiffies - 1;'
> > and the following dispatch schedule.
> > 
> > But looks it isn't enough, since tg_update_disptime() updates
> > ->disptime. However,
> > this problem can be solved easily by not updating ->disptime in case that we are
> > canceling.
> > 
> > > >      }
> > > > -   rcu_read_unlock();
> > > >      throtl_select_dispatch(&td->service_queue);
> > > >      throtl_schedule_next_dispatch(&td->service_queue, true);
> > > >      queue_work(kthrotld_workqueue, &td->dispatch_work);
> > > >    }
> > > > +void blk_throtl_cancel_bios(struct request_queue *q)
> > > > +{
> > > > +   struct cgroup_subsys_state *pos_css;
> > > > +   struct blkcg_gq *blkg;
> > > > +
> > > > +   rcu_read_lock();
> > > > +   spin_lock_irq(&q->queue_lock);
> > > > +   __throtl_cancel_bios(q->td);
> > > > +   spin_unlock_irq(&q->queue_lock);
> > > > +   rcu_read_unlock();
> > > > +
> > > > +   blkg_for_each_descendant_post(blkg, pos_css, q->root_blkg)
> > > > +           del_timer_sync(&blkg_to_tg(blkg)->service_queue.pending_timer);
> > > > +   del_timer_sync(&q->td->service_queue.pending_timer);
> > > 
> > > By the way, I think delete timer will end up io hung here if there are
> > > some bios still be throttled.
> > 
> > Firstly ->queue_lock is held by blk_throtl_cancel_bios(), so no new bios
> > will be throttled.
> > 
> > Also if we don't update ->disptime, any new bios throttled after releasing
> > ->queue_lock will be dispatched soon.
> 
> Hi, Ming Lei
> 
> Just to be curiosity, I'm still trying to understand the logic here:
> 
> For example, if bps is set to 1k, and a io with size 16k is just
> dispatched, then io throtle should wait for 16s untill new io can be

There isn't such wait code in blk-throttle, and the magic is just in
how to compute tg->disptime.

> dispatched. (details in tg_with_in_bps_limit).
> 
> How does such mechanism bypassed here?

The point is that tg->disptime is always set as one past time, so all
throttled IOs will be dispatched immediately if ->disptime is older than
jiffies, and I have verified that the following patch can work as expected.


diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 7c462c006b26..d9845afccd97 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -45,6 +45,7 @@ static struct workqueue_struct *kthrotld_workqueue;
 enum tg_state_flags {
 	THROTL_TG_PENDING	= 1 << 0,	/* on parent's pending tree */
 	THROTL_TG_WAS_EMPTY	= 1 << 1,	/* bio_lists[] became non-empty */
+	THROTL_TG_CANCELING	= 1 << 2,	/* starts to cancel all bios */
 };
 
 #define rb_entry_tg(node)	rb_entry((node), struct throtl_grp, rb_node)
@@ -974,6 +975,9 @@ static void tg_update_disptime(struct throtl_grp *tg)
 	unsigned long read_wait = -1, write_wait = -1, min_wait = -1, disptime;
 	struct bio *bio;
 
+	if (tg->flags & THROTL_TG_CANCELING)
+		goto update;
+
 	bio = throtl_peek_queued(&sq->queued[READ]);
 	if (bio)
 		tg_may_dispatch(tg, bio, &read_wait);
@@ -983,6 +987,7 @@ static void tg_update_disptime(struct throtl_grp *tg)
 		tg_may_dispatch(tg, bio, &write_wait);
 
 	min_wait = min(read_wait, write_wait);
+update:
 	disptime = jiffies + min_wait;
 
 	/* Update dispatch time */
@@ -1836,6 +1841,25 @@ static void throtl_upgrade_state(struct throtl_data *td)
 	queue_work(kthrotld_workqueue, &td->dispatch_work);
 }
 
+void blk_throtl_cancel_bios(struct request_queue *q)
+{
+	struct cgroup_subsys_state *pos_css;
+	struct blkcg_gq *blkg;
+
+	rcu_read_lock();
+	spin_lock_irq(&q->queue_lock);
+	blkg_for_each_descendant_post(blkg, pos_css, q->root_blkg) {
+		struct throtl_grp *tg = blkg_to_tg(blkg);
+		struct throtl_service_queue *sq = &tg->service_queue;
+
+		tg->disptime = jiffies - 1;
+		blkg_to_tg(blkg)->flags |= THROTL_TG_CANCELING;
+		throtl_schedule_pending_timer(sq, jiffies + 1);
+	}
+	spin_unlock_irq(&q->queue_lock);
+	rcu_read_unlock();
+}
+
 static void throtl_downgrade_state(struct throtl_data *td)
 {
 	td->scale /= 2;
diff --git a/block/blk-throttle.h b/block/blk-throttle.h
index 175f03abd9e4..b412a4d7cc1e 100644
--- a/block/blk-throttle.h
+++ b/block/blk-throttle.h
@@ -160,12 +160,14 @@ static inline void blk_throtl_exit(struct request_queue *q) { }
 static inline void blk_throtl_register_queue(struct request_queue *q) { }
 static inline void blk_throtl_charge_bio_split(struct bio *bio) { }
 static inline bool blk_throtl_bio(struct bio *bio) { return false; }
+static inline void blk_throtl_cancel_bios(struct request_queue *q) {}
 #else /* CONFIG_BLK_DEV_THROTTLING */
 int blk_throtl_init(struct request_queue *q);
 void blk_throtl_exit(struct request_queue *q);
 void blk_throtl_register_queue(struct request_queue *q);
 void blk_throtl_charge_bio_split(struct bio *bio);
 bool __blk_throtl_bio(struct bio *bio);
+void blk_throtl_cancel_bios(struct request_queue *q);
 static inline bool blk_throtl_bio(struct bio *bio)
 {
 	struct throtl_grp *tg = blkg_to_tg(bio->bi_blkg);
diff --git a/block/genhd.c b/block/genhd.c
index f7577dde18fc..a32d48b87223 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -29,6 +29,7 @@
 
 #include "blk.h"
 #include "blk-mq-sched.h"
+#include "blk-throttle.h"
 
 static struct kobject *block_depr;
 
@@ -576,6 +577,8 @@ void del_gendisk(struct gendisk *disk)
 	blk_integrity_del(disk);
 	disk_del_events(disk);
 
+	blk_throtl_cancel_bios(disk->queue);
+
 	mutex_lock(&disk->open_mutex);
 	remove_inode_hash(disk->part0->bd_inode);
 	blk_drop_partitions(disk);

Thanks,
Ming


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v6 2/2] block: cancel all throttled bios in del_gendisk()
  2022-01-17  9:21           ` Ming Lei
@ 2022-01-18  8:48             ` yukuai (C)
  2022-01-18 13:28               ` Ming Lei
  0 siblings, 1 reply; 16+ messages in thread
From: yukuai (C) @ 2022-01-18  8:48 UTC (permalink / raw)
  To: Ming Lei
  Cc: mkoutny, paulmck, tj, Jens Axboe, cgroups, linux-block,
	linux-kernel, yi.zhang

在 2022/01/17 17:21, Ming Lei 写道:
> On Fri, Jan 14, 2022 at 04:21:04PM +0800, yukuai (C) wrote:
>> 在 2022/01/14 11:05, Ming Lei 写道:
>>> On Thu, Jan 13, 2022 at 04:46:18PM +0800, yukuai (C) wrote:
>>>> 在 2022/01/12 11:05, Ming Lei 写道:
>>>>> Hello Yu Kuai,
>>>>>
>>>>> On Mon, Jan 10, 2022 at 09:47:58PM +0800, Yu Kuai wrote:
>>>>>> Throttled bios can't be issued after del_gendisk() is done, thus
>>>>>> it's better to cancel them immediately rather than waiting for
>>>>>> throttle is done.
>>>>>>
>>>>>> For example, if user thread is throttled with low bps while it's
>>>>>> issuing large io, and the device is deleted. The user thread will
>>>>>> wait for a long time for io to return.
>>>>>>
>>>>>> Noted this patch is mainly from revertion of commit 32e3374304c7
>>>>>> ("blk-throttle: remove tg_drain_bios") and commit b77412372b68
>>>>>> ("blk-throttle: remove blk_throtl_drain").
>>>>>>
>>>>>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
>>>>>> ---
>>>>>>     block/blk-throttle.c | 77 ++++++++++++++++++++++++++++++++++++++++++++
>>>>>>     block/blk-throttle.h |  2 ++
>>>>>>     block/genhd.c        |  2 ++
>>>>>>     3 files changed, 81 insertions(+)
>>>>>
>>>>> Just wondering why not take the built-in way in throtl_upgrade_state() for
>>>>> canceling throttled bios? Something like the following, then we can avoid
>>>>> to re-invent the wheel.
>>>>>
>>>>>     block/blk-throttle.c | 38 +++++++++++++++++++++++++++++++-------
>>>>>     block/blk-throttle.h |  2 ++
>>>>>     block/genhd.c        |  3 +++
>>>>>     3 files changed, 36 insertions(+), 7 deletions(-)
>>>>>
>>>>> diff --git a/block/blk-throttle.c b/block/blk-throttle.c
>>>>> index cf7e20804f1b..17e56b2e44c4 100644
>>>>> --- a/block/blk-throttle.c
>>>>> +++ b/block/blk-throttle.c
>>>>> @@ -1816,16 +1816,11 @@ static void throtl_upgrade_check(struct throtl_grp *tg)
>>>>>               throtl_upgrade_state(tg->td);
>>>>>     }
>>>>> -static void throtl_upgrade_state(struct throtl_data *td)
>>>>> +static void __throtl_cancel_bios(struct throtl_data *td)
>>>>>     {
>>>>>       struct cgroup_subsys_state *pos_css;
>>>>>       struct blkcg_gq *blkg;
>>>>> -   throtl_log(&td->service_queue, "upgrade to max");
>>>>> -   td->limit_index = LIMIT_MAX;
>>>>> -   td->low_upgrade_time = jiffies;
>>>>> -   td->scale = 0;
>>>>> -   rcu_read_lock();
>>>>>       blkg_for_each_descendant_post(blkg, pos_css, td->queue->root_blkg) {
>>>>>               struct throtl_grp *tg = blkg_to_tg(blkg);
>>>>>               struct throtl_service_queue *sq = &tg->service_queue;
>>>>> @@ -1834,12 +1829,41 @@ static void throtl_upgrade_state(struct throtl_data *td)
>>>>>               throtl_select_dispatch(sq);
>>>>>               throtl_schedule_next_dispatch(sq, true);
>>>> Hi, Ming Lei
>>>>
>>>> I'm confused that how can bios be canceled here?
>>>> tg->iops and tg->bps stay untouched, how can throttled bios
>>>> dispatch?
>>>
>>> I thought that throttled bios will be canceled by 'tg->disptime = jiffies - 1;'
>>> and the following dispatch schedule.
>>>
>>> But looks it isn't enough, since tg_update_disptime() updates
>>> ->disptime. However,
>>> this problem can be solved easily by not updating ->disptime in case that we are
>>> canceling.
>>>
>>>>>       }
>>>>> -   rcu_read_unlock();
>>>>>       throtl_select_dispatch(&td->service_queue);
>>>>>       throtl_schedule_next_dispatch(&td->service_queue, true);
>>>>>       queue_work(kthrotld_workqueue, &td->dispatch_work);
>>>>>     }
>>>>> +void blk_throtl_cancel_bios(struct request_queue *q)
>>>>> +{
>>>>> +   struct cgroup_subsys_state *pos_css;
>>>>> +   struct blkcg_gq *blkg;
>>>>> +
>>>>> +   rcu_read_lock();
>>>>> +   spin_lock_irq(&q->queue_lock);
>>>>> +   __throtl_cancel_bios(q->td);
>>>>> +   spin_unlock_irq(&q->queue_lock);
>>>>> +   rcu_read_unlock();
>>>>> +
>>>>> +   blkg_for_each_descendant_post(blkg, pos_css, q->root_blkg)
>>>>> +           del_timer_sync(&blkg_to_tg(blkg)->service_queue.pending_timer);
>>>>> +   del_timer_sync(&q->td->service_queue.pending_timer);
>>>>
>>>> By the way, I think delete timer will end up io hung here if there are
>>>> some bios still be throttled.
>>>
>>> Firstly ->queue_lock is held by blk_throtl_cancel_bios(), so no new bios
>>> will be throttled.
>>>
>>> Also if we don't update ->disptime, any new bios throttled after releasing
>>> ->queue_lock will be dispatched soon.
>>
>> Hi, Ming Lei
>>
>> Just to be curiosity, I'm still trying to understand the logic here:
>>
>> For example, if bps is set to 1k, and a io with size 16k is just
>> dispatched, then io throtle should wait for 16s untill new io can be
> 
> There isn't such wait code in blk-throttle, and the magic is just in
> how to compute tg->disptime.
> 
>> dispatched. (details in tg_with_in_bps_limit).
>>
>> How does such mechanism bypassed here?
> 
> The point is that tg->disptime is always set as one past time, so all
> throttled IOs will be dispatched immediately if ->disptime is older than
> jiffies, and I have verified that the following patch can work as expected.
> 
Hi, Ming Lei

I'm not sure about the logic here yet, however, I tried the following
patch, and the patch doesn't work as expected in my case:

1. limit bps to 1k
2. issue io with bs=16k

In this workload, each io will wait for 16s to complete.

3. when an io is just completed, delete the device

After this is done, what I expected is that the user thread will exit
immediately(with io error), however, with this patch applied, the user
thread will wait for about 16s to exit, which is the same without this
patch.

Thanks,
Kuai
> 
> diff --git a/block/blk-throttle.c b/block/blk-throttle.c
> index 7c462c006b26..d9845afccd97 100644
> --- a/block/blk-throttle.c
> +++ b/block/blk-throttle.c
> @@ -45,6 +45,7 @@ static struct workqueue_struct *kthrotld_workqueue;
>   enum tg_state_flags {
>   	THROTL_TG_PENDING	= 1 << 0,	/* on parent's pending tree */
>   	THROTL_TG_WAS_EMPTY	= 1 << 1,	/* bio_lists[] became non-empty */
> +	THROTL_TG_CANCELING	= 1 << 2,	/* starts to cancel all bios */
>   };
>   
>   #define rb_entry_tg(node)	rb_entry((node), struct throtl_grp, rb_node)
> @@ -974,6 +975,9 @@ static void tg_update_disptime(struct throtl_grp *tg)
>   	unsigned long read_wait = -1, write_wait = -1, min_wait = -1, disptime;
>   	struct bio *bio;
>   
> +	if (tg->flags & THROTL_TG_CANCELING)
> +		goto update;
> +
>   	bio = throtl_peek_queued(&sq->queued[READ]);
>   	if (bio)
>   		tg_may_dispatch(tg, bio, &read_wait);
> @@ -983,6 +987,7 @@ static void tg_update_disptime(struct throtl_grp *tg)
>   		tg_may_dispatch(tg, bio, &write_wait);
>   
>   	min_wait = min(read_wait, write_wait);
> +update:
>   	disptime = jiffies + min_wait;
>   
>   	/* Update dispatch time */
> @@ -1836,6 +1841,25 @@ static void throtl_upgrade_state(struct throtl_data *td)
>   	queue_work(kthrotld_workqueue, &td->dispatch_work);
>   }
>   
> +void blk_throtl_cancel_bios(struct request_queue *q)
> +{
> +	struct cgroup_subsys_state *pos_css;
> +	struct blkcg_gq *blkg;
> +
> +	rcu_read_lock();
> +	spin_lock_irq(&q->queue_lock);
> +	blkg_for_each_descendant_post(blkg, pos_css, q->root_blkg) {
> +		struct throtl_grp *tg = blkg_to_tg(blkg);
> +		struct throtl_service_queue *sq = &tg->service_queue;
> +
> +		tg->disptime = jiffies - 1;
> +		blkg_to_tg(blkg)->flags |= THROTL_TG_CANCELING;
> +		throtl_schedule_pending_timer(sq, jiffies + 1);
> +	}
> +	spin_unlock_irq(&q->queue_lock);
> +	rcu_read_unlock();
> +}
> +
>   static void throtl_downgrade_state(struct throtl_data *td)
>   {
>   	td->scale /= 2;
> diff --git a/block/blk-throttle.h b/block/blk-throttle.h
> index 175f03abd9e4..b412a4d7cc1e 100644
> --- a/block/blk-throttle.h
> +++ b/block/blk-throttle.h
> @@ -160,12 +160,14 @@ static inline void blk_throtl_exit(struct request_queue *q) { }
>   static inline void blk_throtl_register_queue(struct request_queue *q) { }
>   static inline void blk_throtl_charge_bio_split(struct bio *bio) { }
>   static inline bool blk_throtl_bio(struct bio *bio) { return false; }
> +static inline void blk_throtl_cancel_bios(struct request_queue *q) {}
>   #else /* CONFIG_BLK_DEV_THROTTLING */
>   int blk_throtl_init(struct request_queue *q);
>   void blk_throtl_exit(struct request_queue *q);
>   void blk_throtl_register_queue(struct request_queue *q);
>   void blk_throtl_charge_bio_split(struct bio *bio);
>   bool __blk_throtl_bio(struct bio *bio);
> +void blk_throtl_cancel_bios(struct request_queue *q);
>   static inline bool blk_throtl_bio(struct bio *bio)
>   {
>   	struct throtl_grp *tg = blkg_to_tg(bio->bi_blkg);
> diff --git a/block/genhd.c b/block/genhd.c
> index f7577dde18fc..a32d48b87223 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -29,6 +29,7 @@
>   
>   #include "blk.h"
>   #include "blk-mq-sched.h"
> +#include "blk-throttle.h"
>   
>   static struct kobject *block_depr;
>   
> @@ -576,6 +577,8 @@ void del_gendisk(struct gendisk *disk)
>   	blk_integrity_del(disk);
>   	disk_del_events(disk);
>   
> +	blk_throtl_cancel_bios(disk->queue);
> +
>   	mutex_lock(&disk->open_mutex);
>   	remove_inode_hash(disk->part0->bd_inode);
>   	blk_drop_partitions(disk);
> 
> Thanks,
> Ming
> 
> .
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v6 2/2] block: cancel all throttled bios in del_gendisk()
  2022-01-18  8:48             ` yukuai (C)
@ 2022-01-18 13:28               ` Ming Lei
  2022-01-19  2:24                 ` yukuai (C)
  0 siblings, 1 reply; 16+ messages in thread
From: Ming Lei @ 2022-01-18 13:28 UTC (permalink / raw)
  To: yukuai (C)
  Cc: mkoutny, paulmck, tj, Jens Axboe, cgroups, linux-block,
	linux-kernel, yi.zhang

On Tue, Jan 18, 2022 at 04:48:26PM +0800, yukuai (C) wrote:
> 在 2022/01/17 17:21, Ming Lei 写道:
> > On Fri, Jan 14, 2022 at 04:21:04PM +0800, yukuai (C) wrote:
> > > 在 2022/01/14 11:05, Ming Lei 写道:
> > > > On Thu, Jan 13, 2022 at 04:46:18PM +0800, yukuai (C) wrote:
> > > > > 在 2022/01/12 11:05, Ming Lei 写道:
> > > > > > Hello Yu Kuai,
> > > > > > 
> > > > > > On Mon, Jan 10, 2022 at 09:47:58PM +0800, Yu Kuai wrote:
> > > > > > > Throttled bios can't be issued after del_gendisk() is done, thus
> > > > > > > it's better to cancel them immediately rather than waiting for
> > > > > > > throttle is done.
> > > > > > > 
> > > > > > > For example, if user thread is throttled with low bps while it's
> > > > > > > issuing large io, and the device is deleted. The user thread will
> > > > > > > wait for a long time for io to return.
> > > > > > > 
> > > > > > > Noted this patch is mainly from revertion of commit 32e3374304c7
> > > > > > > ("blk-throttle: remove tg_drain_bios") and commit b77412372b68
> > > > > > > ("blk-throttle: remove blk_throtl_drain").
> > > > > > > 
> > > > > > > Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> > > > > > > ---
> > > > > > >     block/blk-throttle.c | 77 ++++++++++++++++++++++++++++++++++++++++++++
> > > > > > >     block/blk-throttle.h |  2 ++
> > > > > > >     block/genhd.c        |  2 ++
> > > > > > >     3 files changed, 81 insertions(+)
> > > > > > 
> > > > > > Just wondering why not take the built-in way in throtl_upgrade_state() for
> > > > > > canceling throttled bios? Something like the following, then we can avoid
> > > > > > to re-invent the wheel.
> > > > > > 
> > > > > >     block/blk-throttle.c | 38 +++++++++++++++++++++++++++++++-------
> > > > > >     block/blk-throttle.h |  2 ++
> > > > > >     block/genhd.c        |  3 +++
> > > > > >     3 files changed, 36 insertions(+), 7 deletions(-)
> > > > > > 
> > > > > > diff --git a/block/blk-throttle.c b/block/blk-throttle.c
> > > > > > index cf7e20804f1b..17e56b2e44c4 100644
> > > > > > --- a/block/blk-throttle.c
> > > > > > +++ b/block/blk-throttle.c
> > > > > > @@ -1816,16 +1816,11 @@ static void throtl_upgrade_check(struct throtl_grp *tg)
> > > > > >               throtl_upgrade_state(tg->td);
> > > > > >     }
> > > > > > -static void throtl_upgrade_state(struct throtl_data *td)
> > > > > > +static void __throtl_cancel_bios(struct throtl_data *td)
> > > > > >     {
> > > > > >       struct cgroup_subsys_state *pos_css;
> > > > > >       struct blkcg_gq *blkg;
> > > > > > -   throtl_log(&td->service_queue, "upgrade to max");
> > > > > > -   td->limit_index = LIMIT_MAX;
> > > > > > -   td->low_upgrade_time = jiffies;
> > > > > > -   td->scale = 0;
> > > > > > -   rcu_read_lock();
> > > > > >       blkg_for_each_descendant_post(blkg, pos_css, td->queue->root_blkg) {
> > > > > >               struct throtl_grp *tg = blkg_to_tg(blkg);
> > > > > >               struct throtl_service_queue *sq = &tg->service_queue;
> > > > > > @@ -1834,12 +1829,41 @@ static void throtl_upgrade_state(struct throtl_data *td)
> > > > > >               throtl_select_dispatch(sq);
> > > > > >               throtl_schedule_next_dispatch(sq, true);
> > > > > Hi, Ming Lei
> > > > > 
> > > > > I'm confused that how can bios be canceled here?
> > > > > tg->iops and tg->bps stay untouched, how can throttled bios
> > > > > dispatch?
> > > > 
> > > > I thought that throttled bios will be canceled by 'tg->disptime = jiffies - 1;'
> > > > and the following dispatch schedule.
> > > > 
> > > > But looks it isn't enough, since tg_update_disptime() updates
> > > > ->disptime. However,
> > > > this problem can be solved easily by not updating ->disptime in case that we are
> > > > canceling.
> > > > 
> > > > > >       }
> > > > > > -   rcu_read_unlock();
> > > > > >       throtl_select_dispatch(&td->service_queue);
> > > > > >       throtl_schedule_next_dispatch(&td->service_queue, true);
> > > > > >       queue_work(kthrotld_workqueue, &td->dispatch_work);
> > > > > >     }
> > > > > > +void blk_throtl_cancel_bios(struct request_queue *q)
> > > > > > +{
> > > > > > +   struct cgroup_subsys_state *pos_css;
> > > > > > +   struct blkcg_gq *blkg;
> > > > > > +
> > > > > > +   rcu_read_lock();
> > > > > > +   spin_lock_irq(&q->queue_lock);
> > > > > > +   __throtl_cancel_bios(q->td);
> > > > > > +   spin_unlock_irq(&q->queue_lock);
> > > > > > +   rcu_read_unlock();
> > > > > > +
> > > > > > +   blkg_for_each_descendant_post(blkg, pos_css, q->root_blkg)
> > > > > > +           del_timer_sync(&blkg_to_tg(blkg)->service_queue.pending_timer);
> > > > > > +   del_timer_sync(&q->td->service_queue.pending_timer);
> > > > > 
> > > > > By the way, I think delete timer will end up io hung here if there are
> > > > > some bios still be throttled.
> > > > 
> > > > Firstly ->queue_lock is held by blk_throtl_cancel_bios(), so no new bios
> > > > will be throttled.
> > > > 
> > > > Also if we don't update ->disptime, any new bios throttled after releasing
> > > > ->queue_lock will be dispatched soon.
> > > 
> > > Hi, Ming Lei
> > > 
> > > Just to be curiosity, I'm still trying to understand the logic here:
> > > 
> > > For example, if bps is set to 1k, and a io with size 16k is just
> > > dispatched, then io throtle should wait for 16s untill new io can be
> > 
> > There isn't such wait code in blk-throttle, and the magic is just in
> > how to compute tg->disptime.
> > 
> > > dispatched. (details in tg_with_in_bps_limit).
> > > 
> > > How does such mechanism bypassed here?
> > 
> > The point is that tg->disptime is always set as one past time, so all
> > throttled IOs will be dispatched immediately if ->disptime is older than
> > jiffies, and I have verified that the following patch can work as expected.
> > 
> Hi, Ming Lei
> 
> I'm not sure about the logic here yet, however, I tried the following

I believe I have explained the theory already, here we just miss to
patch tg_may_dispatch(), then CPU is spinning in
throtl_pending_timer_fn() until tg_may_dispatch() becomes true.

> patch, and the patch doesn't work as expected in my case:
> 
> 1. limit bps to 1k
> 2. issue io with bs=16k
> 
> In this workload, each io will wait for 16s to complete.
> 
> 3. when an io is just completed, delete the device
> 
> After this is done, what I expected is that the user thread will exit
> immediately(with io error), however, with this patch applied, the user
> thread will wait for about 16s to exit, which is the same without this
> patch.

Please try the revised patch, which can return immediately after
deleting the disk.


diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 7c462c006b26..912ba25839f2 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -45,6 +45,7 @@ static struct workqueue_struct *kthrotld_workqueue;
 enum tg_state_flags {
 	THROTL_TG_PENDING	= 1 << 0,	/* on parent's pending tree */
 	THROTL_TG_WAS_EMPTY	= 1 << 1,	/* bio_lists[] became non-empty */
+	THROTL_TG_CANCELING	= 1 << 2,	/* starts to cancel all bios */
 };
 
 #define rb_entry_tg(node)	rb_entry((node), struct throtl_grp, rb_node)
@@ -870,8 +871,9 @@ static bool tg_may_dispatch(struct throtl_grp *tg, struct bio *bio,
 	BUG_ON(tg->service_queue.nr_queued[rw] &&
 	       bio != throtl_peek_queued(&tg->service_queue.queued[rw]));
 
-	/* If tg->bps = -1, then BW is unlimited */
-	if (bps_limit == U64_MAX && iops_limit == UINT_MAX) {
+	/* If tg->bps = -1 or we are canceling bios, then BW is unlimited */
+	if ((bps_limit == U64_MAX && iops_limit == UINT_MAX) ||
+			(tg->flags & THROTL_TG_CANCELING)) {
 		if (wait)
 			*wait = 0;
 		return true;
@@ -974,6 +976,9 @@ static void tg_update_disptime(struct throtl_grp *tg)
 	unsigned long read_wait = -1, write_wait = -1, min_wait = -1, disptime;
 	struct bio *bio;
 
+	if (tg->flags & THROTL_TG_CANCELING)
+		goto update;
+
 	bio = throtl_peek_queued(&sq->queued[READ]);
 	if (bio)
 		tg_may_dispatch(tg, bio, &read_wait);
@@ -983,6 +988,7 @@ static void tg_update_disptime(struct throtl_grp *tg)
 		tg_may_dispatch(tg, bio, &write_wait);
 
 	min_wait = min(read_wait, write_wait);
+update:
 	disptime = jiffies + min_wait;
 
 	/* Update dispatch time */
@@ -1836,6 +1842,25 @@ static void throtl_upgrade_state(struct throtl_data *td)
 	queue_work(kthrotld_workqueue, &td->dispatch_work);
 }
 
+void blk_throtl_cancel_bios(struct request_queue *q)
+{
+	struct cgroup_subsys_state *pos_css;
+	struct blkcg_gq *blkg;
+
+	rcu_read_lock();
+	spin_lock_irq(&q->queue_lock);
+	blkg_for_each_descendant_post(blkg, pos_css, q->root_blkg) {
+		struct throtl_grp *tg = blkg_to_tg(blkg);
+		struct throtl_service_queue *sq = &tg->service_queue;
+
+		tg->disptime = jiffies - 1;
+		blkg_to_tg(blkg)->flags |= THROTL_TG_CANCELING;
+		throtl_schedule_pending_timer(sq, jiffies + 1);
+	}
+	spin_unlock_irq(&q->queue_lock);
+	rcu_read_unlock();
+}
+
 static void throtl_downgrade_state(struct throtl_data *td)
 {
 	td->scale /= 2;
diff --git a/block/blk-throttle.h b/block/blk-throttle.h
index 175f03abd9e4..b412a4d7cc1e 100644
--- a/block/blk-throttle.h
+++ b/block/blk-throttle.h
@@ -160,12 +160,14 @@ static inline void blk_throtl_exit(struct request_queue *q) { }
 static inline void blk_throtl_register_queue(struct request_queue *q) { }
 static inline void blk_throtl_charge_bio_split(struct bio *bio) { }
 static inline bool blk_throtl_bio(struct bio *bio) { return false; }
+static inline void blk_throtl_cancel_bios(struct request_queue *q) {}
 #else /* CONFIG_BLK_DEV_THROTTLING */
 int blk_throtl_init(struct request_queue *q);
 void blk_throtl_exit(struct request_queue *q);
 void blk_throtl_register_queue(struct request_queue *q);
 void blk_throtl_charge_bio_split(struct bio *bio);
 bool __blk_throtl_bio(struct bio *bio);
+void blk_throtl_cancel_bios(struct request_queue *q);
 static inline bool blk_throtl_bio(struct bio *bio)
 {
 	struct throtl_grp *tg = blkg_to_tg(bio->bi_blkg);
diff --git a/block/genhd.c b/block/genhd.c
index f7577dde18fc..a32d48b87223 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -29,6 +29,7 @@
 
 #include "blk.h"
 #include "blk-mq-sched.h"
+#include "blk-throttle.h"
 
 static struct kobject *block_depr;
 
@@ -576,6 +577,8 @@ void del_gendisk(struct gendisk *disk)
 	blk_integrity_del(disk);
 	disk_del_events(disk);
 
+	blk_throtl_cancel_bios(disk->queue);
+
 	mutex_lock(&disk->open_mutex);
 	remove_inode_hash(disk->part0->bd_inode);
 	blk_drop_partitions(disk);

Thanks, 
Ming


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v6 2/2] block: cancel all throttled bios in del_gendisk()
  2022-01-18 13:28               ` Ming Lei
@ 2022-01-19  2:24                 ` yukuai (C)
  0 siblings, 0 replies; 16+ messages in thread
From: yukuai (C) @ 2022-01-19  2:24 UTC (permalink / raw)
  To: Ming Lei
  Cc: mkoutny, paulmck, tj, Jens Axboe, cgroups, linux-block,
	linux-kernel, yi.zhang

在 2022/01/18 21:28, Ming Lei 写道:
> On Tue, Jan 18, 2022 at 04:48:26PM +0800, yukuai (C) wrote:
>> 在 2022/01/17 17:21, Ming Lei 写道:
>>> On Fri, Jan 14, 2022 at 04:21:04PM +0800, yukuai (C) wrote:
>>>> 在 2022/01/14 11:05, Ming Lei 写道:
>>>>> On Thu, Jan 13, 2022 at 04:46:18PM +0800, yukuai (C) wrote:
>>>>>> 在 2022/01/12 11:05, Ming Lei 写道:
>>>>>>> Hello Yu Kuai,
>>>>>>>
>>>>>>> On Mon, Jan 10, 2022 at 09:47:58PM +0800, Yu Kuai wrote:
>>>>>>>> Throttled bios can't be issued after del_gendisk() is done, thus
>>>>>>>> it's better to cancel them immediately rather than waiting for
>>>>>>>> throttle is done.
>>>>>>>>
>>>>>>>> For example, if user thread is throttled with low bps while it's
>>>>>>>> issuing large io, and the device is deleted. The user thread will
>>>>>>>> wait for a long time for io to return.
>>>>>>>>
>>>>>>>> Noted this patch is mainly from revertion of commit 32e3374304c7
>>>>>>>> ("blk-throttle: remove tg_drain_bios") and commit b77412372b68
>>>>>>>> ("blk-throttle: remove blk_throtl_drain").
>>>>>>>>
>>>>>>>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
>>>>>>>> ---
>>>>>>>>      block/blk-throttle.c | 77 ++++++++++++++++++++++++++++++++++++++++++++
>>>>>>>>      block/blk-throttle.h |  2 ++
>>>>>>>>      block/genhd.c        |  2 ++
>>>>>>>>      3 files changed, 81 insertions(+)
>>>>>>>
>>>>>>> Just wondering why not take the built-in way in throtl_upgrade_state() for
>>>>>>> canceling throttled bios? Something like the following, then we can avoid
>>>>>>> to re-invent the wheel.
>>>>>>>
>>>>>>>      block/blk-throttle.c | 38 +++++++++++++++++++++++++++++++-------
>>>>>>>      block/blk-throttle.h |  2 ++
>>>>>>>      block/genhd.c        |  3 +++
>>>>>>>      3 files changed, 36 insertions(+), 7 deletions(-)
>>>>>>>
>>>>>>> diff --git a/block/blk-throttle.c b/block/blk-throttle.c
>>>>>>> index cf7e20804f1b..17e56b2e44c4 100644
>>>>>>> --- a/block/blk-throttle.c
>>>>>>> +++ b/block/blk-throttle.c
>>>>>>> @@ -1816,16 +1816,11 @@ static void throtl_upgrade_check(struct throtl_grp *tg)
>>>>>>>                throtl_upgrade_state(tg->td);
>>>>>>>      }
>>>>>>> -static void throtl_upgrade_state(struct throtl_data *td)
>>>>>>> +static void __throtl_cancel_bios(struct throtl_data *td)
>>>>>>>      {
>>>>>>>        struct cgroup_subsys_state *pos_css;
>>>>>>>        struct blkcg_gq *blkg;
>>>>>>> -   throtl_log(&td->service_queue, "upgrade to max");
>>>>>>> -   td->limit_index = LIMIT_MAX;
>>>>>>> -   td->low_upgrade_time = jiffies;
>>>>>>> -   td->scale = 0;
>>>>>>> -   rcu_read_lock();
>>>>>>>        blkg_for_each_descendant_post(blkg, pos_css, td->queue->root_blkg) {
>>>>>>>                struct throtl_grp *tg = blkg_to_tg(blkg);
>>>>>>>                struct throtl_service_queue *sq = &tg->service_queue;
>>>>>>> @@ -1834,12 +1829,41 @@ static void throtl_upgrade_state(struct throtl_data *td)
>>>>>>>                throtl_select_dispatch(sq);
>>>>>>>                throtl_schedule_next_dispatch(sq, true);
>>>>>> Hi, Ming Lei
>>>>>>
>>>>>> I'm confused that how can bios be canceled here?
>>>>>> tg->iops and tg->bps stay untouched, how can throttled bios
>>>>>> dispatch?
>>>>>
>>>>> I thought that throttled bios will be canceled by 'tg->disptime = jiffies - 1;'
>>>>> and the following dispatch schedule.
>>>>>
>>>>> But looks it isn't enough, since tg_update_disptime() updates
>>>>> ->disptime. However,
>>>>> this problem can be solved easily by not updating ->disptime in case that we are
>>>>> canceling.
>>>>>
>>>>>>>        }
>>>>>>> -   rcu_read_unlock();
>>>>>>>        throtl_select_dispatch(&td->service_queue);
>>>>>>>        throtl_schedule_next_dispatch(&td->service_queue, true);
>>>>>>>        queue_work(kthrotld_workqueue, &td->dispatch_work);
>>>>>>>      }
>>>>>>> +void blk_throtl_cancel_bios(struct request_queue *q)
>>>>>>> +{
>>>>>>> +   struct cgroup_subsys_state *pos_css;
>>>>>>> +   struct blkcg_gq *blkg;
>>>>>>> +
>>>>>>> +   rcu_read_lock();
>>>>>>> +   spin_lock_irq(&q->queue_lock);
>>>>>>> +   __throtl_cancel_bios(q->td);
>>>>>>> +   spin_unlock_irq(&q->queue_lock);
>>>>>>> +   rcu_read_unlock();
>>>>>>> +
>>>>>>> +   blkg_for_each_descendant_post(blkg, pos_css, q->root_blkg)
>>>>>>> +           del_timer_sync(&blkg_to_tg(blkg)->service_queue.pending_timer);
>>>>>>> +   del_timer_sync(&q->td->service_queue.pending_timer);
>>>>>>
>>>>>> By the way, I think delete timer will end up io hung here if there are
>>>>>> some bios still be throttled.
>>>>>
>>>>> Firstly ->queue_lock is held by blk_throtl_cancel_bios(), so no new bios
>>>>> will be throttled.
>>>>>
>>>>> Also if we don't update ->disptime, any new bios throttled after releasing
>>>>> ->queue_lock will be dispatched soon.
>>>>
>>>> Hi, Ming Lei
>>>>
>>>> Just to be curiosity, I'm still trying to understand the logic here:
>>>>
>>>> For example, if bps is set to 1k, and a io with size 16k is just
>>>> dispatched, then io throtle should wait for 16s untill new io can be
>>>
>>> There isn't such wait code in blk-throttle, and the magic is just in
>>> how to compute tg->disptime.
>>>
>>>> dispatched. (details in tg_with_in_bps_limit).
>>>>
>>>> How does such mechanism bypassed here?
>>>
>>> The point is that tg->disptime is always set as one past time, so all
>>> throttled IOs will be dispatched immediately if ->disptime is older than
>>> jiffies, and I have verified that the following patch can work as expected.
>>>
>> Hi, Ming Lei
>>
>> I'm not sure about the logic here yet, however, I tried the following
> 
> I believe I have explained the theory already, here we just miss to
> patch tg_may_dispatch(), then CPU is spinning in
> throtl_pending_timer_fn() until tg_may_dispatch() becomes true.
> 
>> patch, and the patch doesn't work as expected in my case:
>>
>> 1. limit bps to 1k
>> 2. issue io with bs=16k
>>
>> In this workload, each io will wait for 16s to complete.
>>
>> 3. when an io is just completed, delete the device
>>
>> After this is done, what I expected is that the user thread will exit
>> immediately(with io error), however, with this patch applied, the user
>> thread will wait for about 16s to exit, which is the same without this
>> patch.
> 
> Please try the revised patch, which can return immediately after
> deleting the disk.
> 
> 
> diff --git a/block/blk-throttle.c b/block/blk-throttle.c
> index 7c462c006b26..912ba25839f2 100644
> --- a/block/blk-throttle.c
> +++ b/block/blk-throttle.c
> @@ -45,6 +45,7 @@ static struct workqueue_struct *kthrotld_workqueue;
>   enum tg_state_flags {
>   	THROTL_TG_PENDING	= 1 << 0,	/* on parent's pending tree */
>   	THROTL_TG_WAS_EMPTY	= 1 << 1,	/* bio_lists[] became non-empty */
> +	THROTL_TG_CANCELING	= 1 << 2,	/* starts to cancel all bios */
>   };
>   
>   #define rb_entry_tg(node)	rb_entry((node), struct throtl_grp, rb_node)
> @@ -870,8 +871,9 @@ static bool tg_may_dispatch(struct throtl_grp *tg, struct bio *bio,
>   	BUG_ON(tg->service_queue.nr_queued[rw] &&
>   	       bio != throtl_peek_queued(&tg->service_queue.queued[rw]));
>   
> -	/* If tg->bps = -1, then BW is unlimited */
> -	if (bps_limit == U64_MAX && iops_limit == UINT_MAX) {
> +	/* If tg->bps = -1 or we are canceling bios, then BW is unlimited */
> +	if ((bps_limit == U64_MAX && iops_limit == UINT_MAX) ||
> +			(tg->flags & THROTL_TG_CANCELING)) {
Hi,

I understand now that with this change, tg_with_in_bps_limit() will be
skipped and tg_may_dispatch() will always return true.

Thanks,
Kuai
>   		if (wait)
>   			*wait = 0;
>   		return true;
> @@ -974,6 +976,9 @@ static void tg_update_disptime(struct throtl_grp *tg)
>   	unsigned long read_wait = -1, write_wait = -1, min_wait = -1, disptime;
>   	struct bio *bio;
>   
> +	if (tg->flags & THROTL_TG_CANCELING)
> +		goto update;
> +
>   	bio = throtl_peek_queued(&sq->queued[READ]);
>   	if (bio)
>   		tg_may_dispatch(tg, bio, &read_wait);
> @@ -983,6 +988,7 @@ static void tg_update_disptime(struct throtl_grp *tg)
>   		tg_may_dispatch(tg, bio, &write_wait);
>   
>   	min_wait = min(read_wait, write_wait);
> +update:
>   	disptime = jiffies + min_wait;
>   
>   	/* Update dispatch time */
> @@ -1836,6 +1842,25 @@ static void throtl_upgrade_state(struct throtl_data *td)
>   	queue_work(kthrotld_workqueue, &td->dispatch_work);
>   }
>   
> +void blk_throtl_cancel_bios(struct request_queue *q)
> +{
> +	struct cgroup_subsys_state *pos_css;
> +	struct blkcg_gq *blkg;
> +
> +	rcu_read_lock();
> +	spin_lock_irq(&q->queue_lock);
> +	blkg_for_each_descendant_post(blkg, pos_css, q->root_blkg) {
> +		struct throtl_grp *tg = blkg_to_tg(blkg);
> +		struct throtl_service_queue *sq = &tg->service_queue;
> +
> +		tg->disptime = jiffies - 1;
> +		blkg_to_tg(blkg)->flags |= THROTL_TG_CANCELING;
> +		throtl_schedule_pending_timer(sq, jiffies + 1);
> +	}
> +	spin_unlock_irq(&q->queue_lock);
> +	rcu_read_unlock();
> +}
> +
>   static void throtl_downgrade_state(struct throtl_data *td)
>   {
>   	td->scale /= 2;
> diff --git a/block/blk-throttle.h b/block/blk-throttle.h
> index 175f03abd9e4..b412a4d7cc1e 100644
> --- a/block/blk-throttle.h
> +++ b/block/blk-throttle.h
> @@ -160,12 +160,14 @@ static inline void blk_throtl_exit(struct request_queue *q) { }
>   static inline void blk_throtl_register_queue(struct request_queue *q) { }
>   static inline void blk_throtl_charge_bio_split(struct bio *bio) { }
>   static inline bool blk_throtl_bio(struct bio *bio) { return false; }
> +static inline void blk_throtl_cancel_bios(struct request_queue *q) {}
>   #else /* CONFIG_BLK_DEV_THROTTLING */
>   int blk_throtl_init(struct request_queue *q);
>   void blk_throtl_exit(struct request_queue *q);
>   void blk_throtl_register_queue(struct request_queue *q);
>   void blk_throtl_charge_bio_split(struct bio *bio);
>   bool __blk_throtl_bio(struct bio *bio);
> +void blk_throtl_cancel_bios(struct request_queue *q);
>   static inline bool blk_throtl_bio(struct bio *bio)
>   {
>   	struct throtl_grp *tg = blkg_to_tg(bio->bi_blkg);
> diff --git a/block/genhd.c b/block/genhd.c
> index f7577dde18fc..a32d48b87223 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -29,6 +29,7 @@
>   
>   #include "blk.h"
>   #include "blk-mq-sched.h"
> +#include "blk-throttle.h"
>   
>   static struct kobject *block_depr;
>   
> @@ -576,6 +577,8 @@ void del_gendisk(struct gendisk *disk)
>   	blk_integrity_del(disk);
>   	disk_del_events(disk);
>   
> +	blk_throtl_cancel_bios(disk->queue);
> +
>   	mutex_lock(&disk->open_mutex);
>   	remove_inode_hash(disk->part0->bd_inode);
>   	blk_drop_partitions(disk);
> 
> Thanks,
> Ming
> 
> .
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v6 2/2] block: cancel all throttled bios in del_gendisk()
  2022-01-12  3:05   ` Ming Lei
  2022-01-13  8:46     ` yukuai (C)
@ 2022-01-24  3:48     ` yukuai (C)
  2022-01-24  3:50     ` yukuai (C)
  2 siblings, 0 replies; 16+ messages in thread
From: yukuai (C) @ 2022-01-24  3:48 UTC (permalink / raw)
  To: Ming Lei, tj
  Cc: mkoutny, paulmck, axboe, cgroups, linux-block, linux-kernel, yi.zhang

在 2022/01/12 11:05, Ming Lei 写道:
> Hello Yu Kuai,
> 
> On Mon, Jan 10, 2022 at 09:47:58PM +0800, Yu Kuai wrote:
>> Throttled bios can't be issued after del_gendisk() is done, thus
>> it's better to cancel them immediately rather than waiting for
>> throttle is done.
>>
>> For example, if user thread is throttled with low bps while it's
>> issuing large io, and the device is deleted. The user thread will
>> wait for a long time for io to return.
>>
>> Noted this patch is mainly from revertion of commit 32e3374304c7
>> ("blk-throttle: remove tg_drain_bios") and commit b77412372b68
>> ("blk-throttle: remove blk_throtl_drain").
>>
>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
>> ---
>>   block/blk-throttle.c | 77 ++++++++++++++++++++++++++++++++++++++++++++
>>   block/blk-throttle.h |  2 ++
>>   block/genhd.c        |  2 ++
>>   3 files changed, 81 insertions(+)
> 
> Just wondering why not take the built-in way in throtl_upgrade_state() for
> canceling throttled bios? Something like the following, then we can avoid
> to re-invent the wheel.

Hi, Tejun



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v6 2/2] block: cancel all throttled bios in del_gendisk()
  2022-01-12  3:05   ` Ming Lei
  2022-01-13  8:46     ` yukuai (C)
  2022-01-24  3:48     ` yukuai (C)
@ 2022-01-24  3:50     ` yukuai (C)
  2022-01-26 17:03       ` Tejun Heo
  2 siblings, 1 reply; 16+ messages in thread
From: yukuai (C) @ 2022-01-24  3:50 UTC (permalink / raw)
  To: Ming Lei
  Cc: mkoutny, paulmck, tj, axboe, cgroups, linux-block, linux-kernel,
	yi.zhang

在 2022/01/12 11:05, Ming Lei 写道:
> Hello Yu Kuai,
> 
> On Mon, Jan 10, 2022 at 09:47:58PM +0800, Yu Kuai wrote:
>> Throttled bios can't be issued after del_gendisk() is done, thus
>> it's better to cancel them immediately rather than waiting for
>> throttle is done.
>>
>> For example, if user thread is throttled with low bps while it's
>> issuing large io, and the device is deleted. The user thread will
>> wait for a long time for io to return.
>>
>> Noted this patch is mainly from revertion of commit 32e3374304c7
>> ("blk-throttle: remove tg_drain_bios") and commit b77412372b68
>> ("blk-throttle: remove blk_throtl_drain").
>>
>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
>> ---
>>   block/blk-throttle.c | 77 ++++++++++++++++++++++++++++++++++++++++++++
>>   block/blk-throttle.h |  2 ++
>>   block/genhd.c        |  2 ++
>>   3 files changed, 81 insertions(+)
> 
> Just wondering why not take the built-in way in throtl_upgrade_state() for
> canceling throttled bios? Something like the following, then we can avoid
> to re-invent the wheel.

Hi, Tejun

Both ways can fix the problem, which way do you prefer?

Thanks,
Kuai

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v6 2/2] block: cancel all throttled bios in del_gendisk()
  2022-01-24  3:50     ` yukuai (C)
@ 2022-01-26 17:03       ` Tejun Heo
  2022-01-27  2:45         ` yukuai (C)
  0 siblings, 1 reply; 16+ messages in thread
From: Tejun Heo @ 2022-01-26 17:03 UTC (permalink / raw)
  To: yukuai (C)
  Cc: Ming Lei, mkoutny, paulmck, axboe, cgroups, linux-block,
	linux-kernel, yi.zhang

On Mon, Jan 24, 2022 at 11:50:11AM +0800, yukuai (C) wrote:
> Both ways can fix the problem, which way do you prefer?

Ming's suggested change seems simpler, no?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v6 2/2] block: cancel all throttled bios in del_gendisk()
  2022-01-26 17:03       ` Tejun Heo
@ 2022-01-27  2:45         ` yukuai (C)
  2022-01-27  5:03           ` Ming Lei
  0 siblings, 1 reply; 16+ messages in thread
From: yukuai (C) @ 2022-01-27  2:45 UTC (permalink / raw)
  To: Tejun Heo, Ming Lei
  Cc: mkoutny, paulmck, axboe, cgroups, linux-block, linux-kernel, yi.zhang

在 2022/01/27 1:03, Tejun Heo 写道:
> On Mon, Jan 24, 2022 at 11:50:11AM +0800, yukuai (C) wrote:
>> Both ways can fix the problem, which way do you prefer?
> 
> Ming's suggested change seems simpler, no?

Hi,

Yes, if Ming don't mind, I can send a new version after Ming's
pathset "block: don't drain file system I/O on del_gendisk".

Thanks,
Kuai

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v6 2/2] block: cancel all throttled bios in del_gendisk()
  2022-01-27  2:45         ` yukuai (C)
@ 2022-01-27  5:03           ` Ming Lei
  0 siblings, 0 replies; 16+ messages in thread
From: Ming Lei @ 2022-01-27  5:03 UTC (permalink / raw)
  To: yukuai (C)
  Cc: Tejun Heo, mkoutny, paulmck, axboe, cgroups, linux-block,
	linux-kernel, yi.zhang

On Thu, Jan 27, 2022 at 10:45:33AM +0800, yukuai (C) wrote:
> 在 2022/01/27 1:03, Tejun Heo 写道:
> > On Mon, Jan 24, 2022 at 11:50:11AM +0800, yukuai (C) wrote:
> > > Both ways can fix the problem, which way do you prefer?
> > 
> > Ming's suggested change seems simpler, no?
> 
> Hi,
> 
> Yes, if Ming don't mind, I can send a new version after Ming's
> pathset "block: don't drain file system I/O on del_gendisk".

The patch of canceling throttled bios shouldn't be conflicted
with the above big patchset, and you can send it out now against
for-5.18/block.


Thanks,
Ming


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2022-01-27  5:04 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-10 13:47 [PATCH v6 0/2] cancel all throttled bios in del_gendisk() Yu Kuai
2022-01-10 13:47 ` [PATCH v6 1/2] blk-throtl: move WARN_ON_ONCE() from throtl_rb_first() to it's caller Yu Kuai
2022-01-10 13:47 ` [PATCH v6 2/2] block: cancel all throttled bios in del_gendisk() Yu Kuai
2022-01-12  3:05   ` Ming Lei
2022-01-13  8:46     ` yukuai (C)
2022-01-14  3:05       ` Ming Lei
2022-01-14  8:21         ` yukuai (C)
2022-01-17  9:21           ` Ming Lei
2022-01-18  8:48             ` yukuai (C)
2022-01-18 13:28               ` Ming Lei
2022-01-19  2:24                 ` yukuai (C)
2022-01-24  3:48     ` yukuai (C)
2022-01-24  3:50     ` yukuai (C)
2022-01-26 17:03       ` Tejun Heo
2022-01-27  2:45         ` yukuai (C)
2022-01-27  5:03           ` Ming Lei

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).