* [PATCH] blk-cgroup: delay calling blkcg_exit_disk until disk_release
@ 2023-02-08 6:35 Christoph Hellwig
2023-02-08 7:38 ` Ming Lei
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Christoph Hellwig @ 2023-02-08 6:35 UTC (permalink / raw)
To: axboe; +Cc: linux-block, Ming Lei
While del_gendisk ensures there is no outstanding I/O on the queue,
it can't prevent block layer users from building new I/O.

This leads to a NULL ->root_blkg reference in bio_associate_blkg when
allocating a new bio on a shut down file system. Delay freeing the
blk-cgroup subsystems from del_gendisk until disk_release to make
sure the blkg and throttle information is still available for bio
submitters, even if those bios will immediately fail.

This now can cause a case where disk_release is called on a disk
that hasn't been added. That's mostly harmless, except for a case
in blk_throtl_exit that now needs to check for a NULL ->td pointer.

Fixes: 178fa7d49815 ("blk-cgroup: delay blk-cgroup initialization until add_disk")
Reported-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-throttle.c | 3 ++-
 block/genhd.c        | 4 ++--
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 902203bdddb4b4..e7bd7050d68402 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -2411,7 +2411,8 @@ void blk_throtl_exit(struct gendisk *disk)
 {
 	struct request_queue *q = disk->queue;
 
-	BUG_ON(!q->td);
+	if (!q->td)
+		return;
 	del_timer_sync(&q->td->service_queue.pending_timer);
 	throtl_shutdown_wq(q);
 	blkcg_deactivate_policy(disk, &blkcg_policy_throtl);
diff --git a/block/genhd.c b/block/genhd.c
index 7e031559bf514c..65373738c70b02 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -668,8 +668,6 @@ void del_gendisk(struct gendisk *disk)
 	rq_qos_exit(q);
 	blk_mq_unquiesce_queue(q);
 
-	blkcg_exit_disk(disk);
-
 	/*
 	 * If the disk does not own the queue, allow using passthrough requests
 	 * again.  Else leave the queue frozen to fail all I/O.
@@ -1166,6 +1164,8 @@ static void disk_release(struct device *dev)
 	might_sleep();
 	WARN_ON_ONCE(disk_live(disk));
 
+	blkcg_exit_disk(disk);
+
 	/*
 	 * To undo the all initialization from blk_mq_init_allocated_queue in
 	 * case of a probe failure where add_disk is never called we have to
--
2.39.1
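Editorial sketch (not part of the posted patch): the blk-throttle.c hunk replaces a hard assertion with an early return so that the exit path tolerates a queue whose throttle data was never set up. The user-space model below illustrates only that pattern. All names in it (model_td, model_queue, model_throtl_init, model_throtl_exit) are hypothetical stand-ins, and the real function also stops a timer, shuts down a workqueue and deactivates the policy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-queue throttle data. */
struct model_td {
	int pending_timers;
};

/* Hypothetical stand-in for a request queue. */
struct model_queue {
	struct model_td *td;	/* stays NULL if the "add" step never ran */
};

/* Models the setup done at add time; returns 0 on success, -1 on failure. */
static int model_throtl_init(struct model_queue *q)
{
	q->td = calloc(1, sizeof(*q->td));
	return q->td ? 0 : -1;
}

/*
 * Models an exit path that may run for a queue that was never set up.
 * Returns true if there was state to tear down, false if it was a no-op.
 */
static bool model_throtl_exit(struct model_queue *q)
{
	if (!q->td)
		return false;	/* never initialized: nothing to undo */
	free(q->td);
	q->td = NULL;
	return true;
}
```

With the early return, calling the exit helper on a never-initialized queue is a no-op instead of a crash, which is what a release path needs once it can run for a disk that was allocated but never added.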
* Re: [PATCH] blk-cgroup: delay calling blkcg_exit_disk until disk_release
2023-02-08 6:35 [PATCH] blk-cgroup: delay calling blkcg_exit_disk until disk_release Christoph Hellwig
@ 2023-02-08 7:38 ` Ming Lei
2023-02-08 15:12 ` Christoph Hellwig
2023-02-09 15:12 ` Jens Axboe
2023-02-10 5:05 ` Ming Lei
2 siblings, 1 reply; 10+ messages in thread
From: Ming Lei @ 2023-02-08 7:38 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: axboe, linux-block, ming.lei
On Wed, Feb 08, 2023 at 07:35:14AM +0100, Christoph Hellwig wrote:
> While del_gendisk ensures there is no outstanding I/O on the queue,
> it can't prevent block layer users from building new I/O.
>
> This leads to a NULL ->root_blkg reference in bio_associate_blkg when
> allocating a new bio on a shut down file system. Delay freeing the
> blk-cgroup subsystems from del_gendisk until disk_release to make
> sure the blkg and throttle information is still available for bio
> submitters, even if those bios will immediately fail.
>
> This now can cause a case where disk_release is called on a disk
> that hasn't been added. That's mostly harmless, except for a case
> in blk_throtl_exit that now needs to check for a NULL ->td pointer.
With this approach, blkcg_init_disk() could be called before q->root_blkg
is released in the disk unbind & rebind use case. Wouldn't that leak memory?
Thanks,
Ming
* Re: [PATCH] blk-cgroup: delay calling blkcg_exit_disk until disk_release
2023-02-08 7:38 ` Ming Lei
@ 2023-02-08 15:12 ` Christoph Hellwig
2023-02-08 23:47 ` Ming Lei
0 siblings, 1 reply; 10+ messages in thread
From: Christoph Hellwig @ 2023-02-08 15:12 UTC (permalink / raw)
To: Ming Lei; +Cc: Christoph Hellwig, axboe, linux-block
On Wed, Feb 08, 2023 at 03:38:40PM +0800, Ming Lei wrote:
> > This now can cause a case where disk_release is called on a disk
> > that hasn't been added. That's mostly harmless, except for a case
> > in blk_throtl_exit that now needs to check for a NULL ->td pointer.
>
> With this approach, blkcg_init_disk() could be called before q->root_blkg
> is released in the disk unbind & rebind use case. Wouldn't that leak memory?
q->root_blkg is now disk->root_blkg. So in an unbind and rebind case
a different disk will be involved.
* Re: [PATCH] blk-cgroup: delay calling blkcg_exit_disk until disk_release
2023-02-08 15:12 ` Christoph Hellwig
@ 2023-02-08 23:47 ` Ming Lei
2023-02-09 5:04 ` Christoph Hellwig
0 siblings, 1 reply; 10+ messages in thread
From: Ming Lei @ 2023-02-08 23:47 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: axboe, linux-block
On Wed, Feb 08, 2023 at 04:12:31PM +0100, Christoph Hellwig wrote:
> On Wed, Feb 08, 2023 at 03:38:40PM +0800, Ming Lei wrote:
> > > This now can cause a case where disk_release is called on a disk
> > > that hasn't been added. That's mostly harmless, except for a case
> > > in blk_throtl_exit that now needs to check for a NULL ->td pointer.
> >
> > With this approach, blkcg_init_disk() could be called before q->root_blkg
> > is released in the disk unbind & rebind use case. Wouldn't that leak memory?
>
> q->root_blkg is now disk->root_blkg. So in an unbind and rebind case
> a different disk will be involved.
OK.
Another thing is that blkcg_init_disk() and blkcg_exit_disk() become
asymmetrical with this patch, so alloc_disk() & add_disk(), and
del_gendisk() & put_disk(), have to be done together. If that can be
documented, this patch looks fine.
Thanks,
Ming
* Re: [PATCH] blk-cgroup: delay calling blkcg_exit_disk until disk_release
2023-02-08 23:47 ` Ming Lei
@ 2023-02-09 5:04 ` Christoph Hellwig
2023-02-09 8:56 ` Ming Lei
0 siblings, 1 reply; 10+ messages in thread
From: Christoph Hellwig @ 2023-02-09 5:04 UTC (permalink / raw)
To: Ming Lei; +Cc: Christoph Hellwig, axboe, linux-block
On Thu, Feb 09, 2023 at 07:47:38AM +0800, Ming Lei wrote:
> Another thing is that blkcg_init_disk() and blkcg_exit_disk() become
> asymmetrical with this patch, so alloc_disk() & add_disk(), and
> del_gendisk() & put_disk(), have to be done together. If that can be
> documented, this patch looks fine.
As in no add_disk is allowed after del_gendisk on the same disk?
It's been like that since basically forever. I agree that documenting
it probably doesn't hurt though - but that's separate from this fix.
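Editorial sketch (not from the thread): the ordering rule discussed above, that a disk is added at most once and cannot be re-added after del_gendisk, can be pictured as a small state machine. This is a user-space model under stated assumptions. The lc_* names and the LC_* states are hypothetical, and the real block layer does not track the lifecycle with a single enum like this.

```c
#include <assert.h>

/* Hypothetical lifecycle states for one disk object. */
enum lc_state {
	LC_ALLOCATED,	/* allocated, not yet added */
	LC_ADDED,	/* visible and live */
	LC_DELETED,	/* removed, final reference not yet dropped */
	LC_RELEASED,	/* final reference dropped */
};

struct lifecycle_disk {
	enum lc_state state;
};

/* Adding is only valid once, straight after allocation. */
static int lc_add_disk(struct lifecycle_disk *d)
{
	if (d->state != LC_ALLOCATED)
		return -1;
	d->state = LC_ADDED;
	return 0;
}

/* Deleting is only valid for a disk that is currently added. */
static int lc_del_gendisk(struct lifecycle_disk *d)
{
	if (d->state != LC_ADDED)
		return -1;
	d->state = LC_DELETED;
	return 0;
}

/*
 * The final put is accepted for a deleted disk and for one that was
 * never added (the probe-failure case), but not for a live disk.
 */
static int lc_put_disk(struct lifecycle_disk *d)
{
	if (d->state == LC_ADDED || d->state == LC_RELEASED)
		return -1;
	d->state = LC_RELEASED;
	return 0;
}
```

In this model a second lc_add_disk() after lc_del_gendisk() fails, and releasing a never-added disk succeeds, which mirrors the two cases the messages above rely on.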
* Re: [PATCH] blk-cgroup: delay calling blkcg_exit_disk until disk_release
2023-02-09 5:04 ` Christoph Hellwig
@ 2023-02-09 8:56 ` Ming Lei
0 siblings, 0 replies; 10+ messages in thread
From: Ming Lei @ 2023-02-09 8:56 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: axboe, linux-block
On Thu, Feb 09, 2023 at 06:04:01AM +0100, Christoph Hellwig wrote:
> On Thu, Feb 09, 2023 at 07:47:38AM +0800, Ming Lei wrote:
> > Another thing is that blkcg_init_disk() and blkcg_exit_disk() become
> > asymmetrical with this patch, so alloc_disk() & add_disk(), and
> > del_gendisk() & put_disk(), have to be done together. If that can be
> > documented, this patch looks fine.
>
> As in no add_disk is allowed after del_gendisk on the same disk?
> It's been like that since basically forever. I agree that documenting
> it probably doesn't hurt though - but that's separate from this fix.
OK, fair enough,
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Thanks,
Ming
* Re: [PATCH] blk-cgroup: delay calling blkcg_exit_disk until disk_release
2023-02-08 6:35 [PATCH] blk-cgroup: delay calling blkcg_exit_disk until disk_release Christoph Hellwig
2023-02-08 7:38 ` Ming Lei
@ 2023-02-09 15:12 ` Jens Axboe
2023-02-10 5:05 ` Ming Lei
2 siblings, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2023-02-09 15:12 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: linux-block, Ming Lei
On Wed, 08 Feb 2023 07:35:14 +0100, Christoph Hellwig wrote:
> While del_gendisk ensures there is no outstanding I/O on the queue,
> it can't prevent block layer users from building new I/O.
>
> This leads to a NULL ->root_blkg reference in bio_associate_blkg when
> allocating a new bio on a shut down file system. Delay freeing the
> blk-cgroup subsystems from del_gendisk until disk_release to make
> sure the blkg and throttle information is still available for bio
> submitters, even if those bios will immediately fail.
>
> [...]
Applied, thanks!
[1/1] blk-cgroup: delay calling blkcg_exit_disk until disk_release
commit: c43332fe028c252a2a28e46be70a530f64fc3c9d
Best regards,
--
Jens Axboe
* Re: [PATCH] blk-cgroup: delay calling blkcg_exit_disk until disk_release
2023-02-08 6:35 [PATCH] blk-cgroup: delay calling blkcg_exit_disk until disk_release Christoph Hellwig
2023-02-08 7:38 ` Ming Lei
2023-02-09 15:12 ` Jens Axboe
@ 2023-02-10 5:05 ` Ming Lei
2023-02-13 8:37 ` Ming Lei
2 siblings, 1 reply; 10+ messages in thread
From: Ming Lei @ 2023-02-10 5:05 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: axboe, linux-block, ming.lei
On Wed, Feb 08, 2023 at 07:35:14AM +0100, Christoph Hellwig wrote:
> While del_gendisk ensures there is no outstanding I/O on the queue,
> it can't prevent block layer users from building new I/O.
>
> This leads to a NULL ->root_blkg reference in bio_associate_blkg when
> allocating a new bio on a shut down file system. Delay freeing the
> blk-cgroup subsystems from del_gendisk until disk_release to make
> sure the blkg and throttle information is still available for bio
> submitters, even if those bios will immediately fail.
>
> This now can cause a case where disk_release is called on a disk
> that hasn't been added. That's mostly harmless, except for a case
> in blk_throtl_exit that now needs to check for a NULL ->td pointer.
>
> Fixes: 178fa7d49815 ("blk-cgroup: delay blk-cgroup initialization until add_disk")
> Reported-by: Ming Lei <ming.lei@redhat.com>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Hmm, this patch actually causes bigger trouble.
After commit 84d7d462b16d ("blk-cgroup: pin the gendisk in struct blkcg_gq"),
each blkcg_gq instance holds a reference on the disk, so moving
blkcg_exit_disk() into disk_release() creates a circular reference
dependency, and both are leaked.
Thanks,
Ming
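Editorial sketch (not from the thread): the cross-dependency described above can be reproduced with a toy reference counter. The model assumes one owner reference on the disk plus one reference per blkg, and compares tearing the blkgs down at delete time against tearing them down in the release callback. All names (model_disk, model_put_disk, model_blkcg_exit, model_lifecycle) are hypothetical, and the real kernel objects use different refcounting primitives.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a disk and the blkgs that pin it. */
struct model_disk {
	int refs;		/* references held on the disk */
	int blkgs;		/* live blkgs, each holding one disk reference */
	bool exit_in_release;	/* true models the ordering after the patch */
	bool released;		/* set once the release callback has run */
};

static void model_put_disk(struct model_disk *d);

/* Destroys every blkg, dropping the disk reference each one holds. */
static void model_blkcg_exit(struct model_disk *d)
{
	while (d->blkgs > 0) {
		d->blkgs--;
		model_put_disk(d);
	}
}

/* Drops one reference; the release step runs only when the count hits 0. */
static void model_put_disk(struct model_disk *d)
{
	if (--d->refs > 0)
		return;
	d->released = true;
	if (d->exit_in_release)
		model_blkcg_exit(d);
}

/* Adding the disk creates a root blkg, which pins the disk. */
static void model_add_disk(struct model_disk *d)
{
	d->blkgs++;
	d->refs++;
}

/* Deleting the disk tears down blkgs only in the pre-patch ordering. */
static void model_del_disk(struct model_disk *d)
{
	if (!d->exit_in_release)
		model_blkcg_exit(d);
}

/* Runs alloc, add, del, put; returns true if the disk was released. */
static bool model_lifecycle(bool exit_in_release)
{
	struct model_disk d = { .refs = 1, .exit_in_release = exit_in_release };

	model_add_disk(&d);
	model_del_disk(&d);
	model_put_disk(&d);	/* the owner drops its reference */
	return d.released;
}
```

In this model, model_lifecycle(false) releases the disk, while model_lifecycle(true) never does: the blkg's reference keeps the count above zero, so the release step that would free the blkg is never reached.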
* Re: [PATCH] blk-cgroup: delay calling blkcg_exit_disk until disk_release
2023-02-10 5:05 ` Ming Lei
@ 2023-02-13 8:37 ` Ming Lei
2023-02-13 8:42 ` Christoph Hellwig
0 siblings, 1 reply; 10+ messages in thread
From: Ming Lei @ 2023-02-13 8:37 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: axboe, linux-block, ming.lei
On Fri, Feb 10, 2023 at 01:05:11PM +0800, Ming Lei wrote:
> On Wed, Feb 08, 2023 at 07:35:14AM +0100, Christoph Hellwig wrote:
> > While del_gendisk ensures there is no outstanding I/O on the queue,
> > it can't prevent block layer users from building new I/O.
> >
> > This leads to a NULL ->root_blkg reference in bio_associate_blkg when
> > allocating a new bio on a shut down file system. Delay freeing the
> > blk-cgroup subsystems from del_gendisk until disk_release to make
> > sure the blkg and throttle information is still available for bio
> > submitters, even if those bios will immediately fail.
> >
> > This now can cause a case where disk_release is called on a disk
> > that hasn't been added. That's mostly harmless, except for a case
> > in blk_throtl_exit that now needs to check for a NULL ->td pointer.
> >
> > Fixes: 178fa7d49815 ("blk-cgroup: delay blk-cgroup initialization until add_disk")
> > Reported-by: Ming Lei <ming.lei@redhat.com>
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
>
> Hmm, this patch actually causes bigger trouble.
>
> After commit 84d7d462b16d ("blk-cgroup: pin the gendisk in struct blkcg_gq"),
> each blkcg_gq instance holds a reference on the disk, so moving
> blkcg_exit_disk() into disk_release() creates a circular reference
> dependency, and both are leaked.
Hi Christoph & Jens,
This issue is fairly serious: the blkg, disk and request_queue are all leaked by
commit c43332fe028c ("blk-cgroup: delay calling blkcg_exit_disk until disk_release").
Can we solve it before merging for-6.3/block into v6.3-rc1?
Thanks,
Ming
* Re: [PATCH] blk-cgroup: delay calling blkcg_exit_disk until disk_release
2023-02-13 8:37 ` Ming Lei
@ 2023-02-13 8:42 ` Christoph Hellwig
0 siblings, 0 replies; 10+ messages in thread
From: Christoph Hellwig @ 2023-02-13 8:42 UTC (permalink / raw)
To: Ming Lei; +Cc: Christoph Hellwig, axboe, linux-block
On Mon, Feb 13, 2023 at 04:37:09PM +0800, Ming Lei wrote:
> This issue is fairly serious: the blkg, disk and request_queue are all leaked by
> commit c43332fe028c ("blk-cgroup: delay calling blkcg_exit_disk until disk_release").
>
> Can we solve it before merging for-6.3/block into v6.3-rc1?
Yes. I'm testing a fix at the moment.