* [PATCH v4 0/2] cancel all throttled bios in del_gendisk()
@ 2021-12-02 13:04 ` Yu Kuai
  0 siblings, 0 replies; 16+ messages in thread
From: Yu Kuai @ 2021-12-02 13:04 UTC (permalink / raw)
  To: hch, tj, axboe; +Cc: cgroups, linux-block, linux-kernel, yukuai3, yi.zhang

If del_gendisk() is called while some bios are still throttled, those
bios will not be dispatched until their throttling delay expires, which
is unnecessary.

Changes in v2:
 - move WARN_ON_ONCE() from throtl_rb_first() to its caller
 - merge some patches into one.

Changes in v3:
 - some code optimization in patch 1
 - hold queue lock to cancel bios in patch 2

Changes in v4:
 - delete rcu_read_lock() and rcu_read_unlock() in patch 2

Yu Kuai (2):
  blk-throtl: move WARN_ON_ONCE() from throtl_rb_first() to its caller
  block: cancel all throttled bios in del_gendisk()

 block/blk-throttle.c | 74 ++++++++++++++++++++++++++++++++++++++++++--
 block/blk-throttle.h |  2 ++
 block/genhd.c        |  2 ++
 3 files changed, 75 insertions(+), 3 deletions(-)

-- 
2.31.1


* [PATCH v4 1/2] blk-throtl: move WARN_ON_ONCE() from throtl_rb_first() to its caller
@ 2021-12-02 13:04   ` Yu Kuai
  0 siblings, 0 replies; 16+ messages in thread
From: Yu Kuai @ 2021-12-02 13:04 UTC (permalink / raw)
  To: hch, tj, axboe; +Cc: cgroups, linux-block, linux-kernel, yukuai3, yi.zhang

Prepare to reintroduce tg_drain_bios(), which will iterate until
throtl_rb_first() returns NULL.
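
A minimal sketch of the loop shape this enables (the actual user is
tg_drain_bios(), added back in the next patch); a NULL return becomes a
normal termination condition rather than a warning:

	struct throtl_grp *tg;

	while ((tg = throtl_rb_first(parent_sq))) {
		/* dispatch all bios queued on @tg */
	}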

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 block/blk-throttle.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 39bb6e68a9a2..fdd57878e862 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -502,7 +502,6 @@ throtl_rb_first(struct throtl_service_queue *parent_sq)
 	struct rb_node *n;
 
 	n = rb_first_cached(&parent_sq->pending_tree);
-	WARN_ON_ONCE(!n);
 	if (!n)
 		return NULL;
 	return rb_entry_tg(n);
@@ -521,7 +520,7 @@ static void update_min_dispatch_time(struct throtl_service_queue *parent_sq)
 	struct throtl_grp *tg;
 
 	tg = throtl_rb_first(parent_sq);
-	if (!tg)
+	if (WARN_ON_ONCE(!tg))
 		return;
 
 	parent_sq->first_pending_disptime = tg->disptime;
@@ -1090,7 +1089,7 @@ static int throtl_select_dispatch(struct throtl_service_queue *parent_sq)
 			break;
 
 		tg = throtl_rb_first(parent_sq);
-		if (!tg)
+		if (WARN_ON_ONCE(!tg))
 			break;
 
 		if (time_before(jiffies, tg->disptime))
-- 
2.31.1


* [PATCH v4 2/2] block: cancel all throttled bios in del_gendisk()
  2021-12-02 13:04 ` Yu Kuai
@ 2021-12-02 13:04   ` Yu Kuai
  -1 siblings, 0 replies; 16+ messages in thread
From: Yu Kuai @ 2021-12-02 13:04 UTC (permalink / raw)
  To: hch, tj, axboe; +Cc: cgroups, linux-block, linux-kernel, yukuai3, yi.zhang

Throttled bios can't be issued after del_gendisk() is done, thus
it's better to cancel them immediately rather than waiting for the
throttling to finish.

For example, if a user thread is throttled with a low bps limit while
issuing a large io, and the device is deleted, the user thread will
wait a long time for the io to complete.

Note that this patch mainly reverts commit 32e3374304c7
("blk-throttle: remove tg_drain_bios") and commit b77412372b68
("blk-throttle: remove blk_throtl_drain").

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 block/blk-throttle.c | 69 ++++++++++++++++++++++++++++++++++++++++++++
 block/blk-throttle.h |  2 ++
 block/genhd.c        |  2 ++
 3 files changed, 73 insertions(+)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index fdd57878e862..e60f690e01ad 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -2256,6 +2256,75 @@ void blk_throtl_bio_endio(struct bio *bio)
 }
 #endif
 
+/*
+ * Dispatch all bios from all children tg's queued on @parent_sq.  On
+ * return, @parent_sq is guaranteed to not have any active children tg's
+ * and all bios from previously active tg's are on @parent_sq->bio_lists[].
+ */
+static void tg_drain_bios(struct throtl_service_queue *parent_sq)
+{
+	struct throtl_grp *tg;
+
+	while ((tg = throtl_rb_first(parent_sq))) {
+		struct throtl_service_queue *sq = &tg->service_queue;
+		struct bio *bio;
+
+		throtl_dequeue_tg(tg);
+
+		while ((bio = throtl_peek_queued(&sq->queued[READ])))
+			tg_dispatch_one_bio(tg, bio_data_dir(bio));
+		while ((bio = throtl_peek_queued(&sq->queued[WRITE])))
+			tg_dispatch_one_bio(tg, bio_data_dir(bio));
+	}
+}
+
+/**
+ * blk_throtl_cancel_bios - cancel throttled bios
+ * @q: request_queue to cancel throttled bios for
+ *
+ * This function is called to error all currently throttled bios on @q.
+ */
+void blk_throtl_cancel_bios(struct request_queue *q)
+{
+	struct throtl_data *td = q->td;
+	struct bio_list bio_list_on_stack;
+	struct blkcg_gq *blkg;
+	struct cgroup_subsys_state *pos_css;
+	struct bio *bio;
+	int rw;
+
+	bio_list_init(&bio_list_on_stack);
+
+	/*
+	 * hold queue_lock to prevent concurrent with dispatching
+	 * throttled bios by timer.
+	 */
+	spin_lock_irq(&q->queue_lock);
+
+	/*
+	 * Drain each tg while doing post-order walk on the blkg tree, so
+	 * that all bios are propagated to td->service_queue.  It'd be
+	 * better to walk service_queue tree directly but blkg walk is
+	 * easier.
+	 */
+	blkg_for_each_descendant_post(blkg, pos_css, td->queue->root_blkg)
+		tg_drain_bios(&blkg_to_tg(blkg)->service_queue);
+
+	/* finally, transfer bios from top-level tg's into the td */
+	tg_drain_bios(&td->service_queue);
+
+	/* all bios now should be in td->service_queue, cancel them */
+	for (rw = READ; rw <= WRITE; rw++)
+		while ((bio = throtl_pop_queued(&td->service_queue.queued[rw],
+						NULL)))
+			bio_list_add(&bio_list_on_stack, bio);
+
+	spin_unlock_irq(&q->queue_lock);
+	if (!bio_list_empty(&bio_list_on_stack))
+		while ((bio = bio_list_pop(&bio_list_on_stack)))
+			bio_io_error(bio);
+}
+
 int blk_throtl_init(struct request_queue *q)
 {
 	struct throtl_data *td;
diff --git a/block/blk-throttle.h b/block/blk-throttle.h
index 175f03abd9e4..9d67d5139954 100644
--- a/block/blk-throttle.h
+++ b/block/blk-throttle.h
@@ -160,12 +160,14 @@ static inline void blk_throtl_exit(struct request_queue *q) { }
 static inline void blk_throtl_register_queue(struct request_queue *q) { }
 static inline void blk_throtl_charge_bio_split(struct bio *bio) { }
 static inline bool blk_throtl_bio(struct bio *bio) { return false; }
+#define blk_throtl_cancel_bios(q)  do { } while (0)
 #else /* CONFIG_BLK_DEV_THROTTLING */
 int blk_throtl_init(struct request_queue *q);
 void blk_throtl_exit(struct request_queue *q);
 void blk_throtl_register_queue(struct request_queue *q);
 void blk_throtl_charge_bio_split(struct bio *bio);
 bool __blk_throtl_bio(struct bio *bio);
+void blk_throtl_cancel_bios(struct request_queue *q);
 static inline bool blk_throtl_bio(struct bio *bio)
 {
 	struct throtl_grp *tg = blkg_to_tg(bio->bi_blkg);
diff --git a/block/genhd.c b/block/genhd.c
index 8e9cbf23c510..24fa3356d164 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -28,6 +28,7 @@
 
 #include "blk.h"
 #include "blk-rq-qos.h"
+#include "blk-throttle.h"
 
 static struct kobject *block_depr;
 
@@ -619,6 +620,7 @@ void del_gendisk(struct gendisk *disk)
 
 	blk_mq_freeze_queue_wait(q);
 
+	blk_throtl_cancel_bios(q);
 	rq_qos_exit(q);
 	blk_sync_queue(q);
 	blk_flush_integrity();
-- 
2.31.1


* Re: [PATCH v4 2/2] block: cancel all throttled bios in del_gendisk()
@ 2021-12-02 14:48     ` Michal Koutný
  0 siblings, 0 replies; 16+ messages in thread
From: Michal Koutný @ 2021-12-02 14:48 UTC (permalink / raw)
  To: Yu Kuai; +Cc: hch, tj, axboe, cgroups, linux-block, linux-kernel, yi.zhang

Hello Kuai.

On Thu, Dec 02, 2021 at 09:04:40PM +0800, Yu Kuai <yukuai3@huawei.com> wrote:
> For example, if a user thread is throttled with a low bps limit while
> issuing a large io, and the device is deleted, the user thread will
> wait a long time for the io to complete.

Do I understand correctly the "long time" here is
outstanding_IO_size/throttled_bandwidth? Or are you getting at some
other cause/longer time?

> +void blk_throtl_cancel_bios(struct request_queue *q)
> +{
> +	struct throtl_data *td = q->td;
> +	struct bio_list bio_list_on_stack;
> +	struct blkcg_gq *blkg;
> +	struct cgroup_subsys_state *pos_css;
> +	struct bio *bio;
> +	int rw;
> +
> +	bio_list_init(&bio_list_on_stack);
> +
> +	/*
> +	 * hold queue_lock to prevent concurrent with dispatching
> +	 * throttled bios by timer.
> +	 */
> +	spin_lock_irq(&q->queue_lock);

You've replaced the rcu_read_lock() with the queue lock but...

> +
> +	/*
> +	 * Drain each tg while doing post-order walk on the blkg tree, so
> +	 * that all bios are propagated to td->service_queue.  It'd be
> +	 * better to walk service_queue tree directly but blkg walk is
> +	 * easier.
> +	 */
> +	blkg_for_each_descendant_post(blkg, pos_css, td->queue->root_blkg)
> +		tg_drain_bios(&blkg_to_tg(blkg)->service_queue);

...you also need the rcu_read_lock() here since you may encounter a
(descendant) blkcg that's removed concurrently.
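
A minimal sketch of what I mean (only the walk is shown; the rest of
blk_throtl_cancel_bios() stays as in your patch):

	spin_lock_irq(&q->queue_lock);
	rcu_read_lock();
	blkg_for_each_descendant_post(blkg, pos_css, td->queue->root_blkg)
		tg_drain_bios(&blkg_to_tg(blkg)->service_queue);
	rcu_read_unlock();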

(I may miss some consequences of doing this under the queue_lock so if
the concurrent removal is ruled out, please make a comment about it.)


Regards,
Michal

* Re: [PATCH v4 2/2] block: cancel all throttled bios in del_gendisk()
  2021-12-02 14:48     ` Michal Koutný
@ 2021-12-03  7:50       ` yukuai (C)
  -1 siblings, 0 replies; 16+ messages in thread
From: yukuai (C) @ 2021-12-03  7:50 UTC (permalink / raw)
  To: Michal Koutný
  Cc: hch, tj, axboe, cgroups, linux-block, linux-kernel, yi.zhang

On 2021/12/02 22:48, Michal Koutný wrote:
> Hello Kuai.
> 
> On Thu, Dec 02, 2021 at 09:04:40PM +0800, Yu Kuai <yukuai3@huawei.com> wrote:
>> For example, if a user thread is throttled with a low bps limit while
>> issuing a large io, and the device is deleted, the user thread will
>> wait a long time for the io to complete.
> 
> Do I understand correctly the "long time" here is
> outstanding_IO_size/throttled_bandwidth? Or are you getting at some

Hi, Michal

Yes, this is exactly what I mean.
> other cause/longer time?
> 
>> +void blk_throtl_cancel_bios(struct request_queue *q)
>> +{
>> +	struct throtl_data *td = q->td;
>> +	struct bio_list bio_list_on_stack;
>> +	struct blkcg_gq *blkg;
>> +	struct cgroup_subsys_state *pos_css;
>> +	struct bio *bio;
>> +	int rw;
>> +
>> +	bio_list_init(&bio_list_on_stack);
>> +
>> +	/*
>> +	 * hold queue_lock to prevent concurrent with dispatching
>> +	 * throttled bios by timer.
>> +	 */
>> +	spin_lock_irq(&q->queue_lock);
> 
> You've replaced the rcu_read_lock() with the queue lock but...
> 
>> +
>> +	/*
>> +	 * Drain each tg while doing post-order walk on the blkg tree, so
>> +	 * that all bios are propagated to td->service_queue.  It'd be
>> +	 * better to walk service_queue tree directly but blkg walk is
>> +	 * easier.
>> +	 */
>> +	blkg_for_each_descendant_post(blkg, pos_css, td->queue->root_blkg)
>> +		tg_drain_bios(&blkg_to_tg(blkg)->service_queue);
> 
> ...you also need the rcu_read_lock() here since you may encounter a
> (descendant) blkcg that's removed concurrently.

blkg_destroy() is protected by the queue_lock, so I think the
queue_lock can protect against such a concurrent scenario.

Thanks,
Kuai
> 
> (I may miss some consequences of doing this under the queue_lock so if
> the concurrent removal is ruled out, please make a comment about it.)
> 
> 
> Regards,
> Michal
> 

* Re: [PATCH v4 2/2] block: cancel all throttled bios in del_gendisk()
@ 2021-12-03 10:27         ` Michal Koutný
  0 siblings, 0 replies; 16+ messages in thread
From: Michal Koutný @ 2021-12-03 10:27 UTC (permalink / raw)
  To: yukuai (C); +Cc: hch, tj, axboe, cgroups, linux-block, linux-kernel, yi.zhang

On Fri, Dec 03, 2021 at 03:50:01PM +0800, "yukuai (C)" <yukuai3@huawei.com> wrote:
> blkg_destroy() is protected by the queue_lock,so I think queue_lock can
> protect such concurrent scenario.

blkg_destroy() is not as destructive :-) as the actual free that you
should synchronize against (the queue_lock ensures this for
pd_free_fn=throtl_pd_free, but you may still trip over the blkcg after
blkcg_css_free()).

[Actually, I think you should see a warning in your situation if you
enable CONFIG_PROVE_RCU.]
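
For reference (quoting from memory, so please double-check against your
tree), the descendant iterators assert roughly the following:

	/* in css_next_descendant_post() and friends */
	cgroup_assert_mutex_or_rcu_locked();

	/* which expands to something like */
	RCU_LOCKDEP_WARN(!rcu_read_lock_held() &&
			 !lockdep_is_held(&cgroup_mutex),
			 "cgroup_mutex or RCU read lock required");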

HTH,
Michal

* Re: [PATCH v4 2/2] block: cancel all throttled bios in del_gendisk()
  2021-12-03 10:27         ` Michal Koutný
@ 2021-12-04  8:03           ` yukuai (C)
  -1 siblings, 0 replies; 16+ messages in thread
From: yukuai (C) @ 2021-12-04  8:03 UTC (permalink / raw)
  To: Michal Koutný
  Cc: hch, tj, axboe, cgroups, linux-block, linux-kernel, yi.zhang

On 2021/12/03 18:27, Michal Koutný wrote:
> On Fri, Dec 03, 2021 at 03:50:01PM +0800, "yukuai (C)" <yukuai3@huawei.com> wrote:
>> blkg_destroy() is protected by the queue_lock,so I think queue_lock can
>> protect such concurrent scenario.
> 
> blkg_destroy() is not as destroying :-) as actual free, you should
> synchronize against (the queue_lock ensures this for
> pd_free_fn=throtl_pd_free but you may still trip on blkcg after
> blkcg_css_free()).

Hi, Michal

I was thinking that if there are active blkgs, holding the queue_lock
will ensure the blkcg won't be freed. However, if there are no active
blkgs in the first place, it seems you are right that rcu_read_lock()
is needed to prevent this iteration from racing with
css_release->css_release_work_fn->css_free_rwork_fn.

By the way, can spin_lock() also guarantee this, since it disables
preemption like rcu_read_lock() does?

Thanks,
Kuai

> 
> [Actually, I think you should see a warning in your situation if you
> enable CONFIG_PROVE_RCU.]
> 
> HTH,
> Michal
> 

* Re: [PATCH v4 2/2] block: cancel all throttled bios in del_gendisk()
@ 2021-12-06 15:43             ` Michal Koutný
  0 siblings, 0 replies; 16+ messages in thread
From: Michal Koutný @ 2021-12-06 15:43 UTC (permalink / raw)
  To: yukuai (C); +Cc: hch, tj, axboe, cgroups, linux-block, linux-kernel, yi.zhang

On Sat, Dec 04, 2021 at 04:03:53PM +0800, "yukuai (C)" <yukuai3@huawei.com> wrote:
> I was thinking that if there are active blkgs, holding queue_lock will
> ensure blkcg won't be freed.

My take is that the function traverses the whole blkcg tree (from global
root) and nothing prevents concurrent blkcg_css_free() in a possibly
unrelated branch (or queue).

> By the way, does spin_lock can guarantee this since it disables preempt
> like what rcu_read_lock() does?

Yes (but don't quoRTe me on that :-).

(It isn't an issue on a non-preemptible kernel either, but IMO the code
should be generic enough to allow for different configs -- or, as I
mentioned initially, add a comment explaining why the tree traversal is
not affected by concurrent frees.)
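
If concurrent removal really were ruled out, a hypothetical comment
along these lines above the walk would be enough for me:

	/*
	 * No rcu_read_lock() needed: ->queue_lock is held, which excludes
	 * blkg_destroy() and throtl_pd_free(); <reason why blkcg_css_free()
	 * of any blkcg reachable here is also excluded goes here>.
	 */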

Thanks,
Michal
