linux-fsdevel.vger.kernel.org archive mirror
* Re: [PATCH 1/2] pktcdvd: Fix pkt_setup_dev() error path
       [not found] ` <20180102193948.22656-2-bart.vanassche@wdc.com>
@ 2020-04-25  1:39   ` Luis Chamberlain
  2020-04-25  9:17     ` Ming Lei
  0 siblings, 1 reply; 3+ messages in thread
From: Luis Chamberlain @ 2020-04-25  1:39 UTC (permalink / raw)
  To: Bart Van Assche, Ming Lei
  Cc: Jens Axboe, linux-block, Christoph Hellwig, Tejun Heo,
	Maciej S . Szmigiero, Linux FS Devel, Greg Kroah-Hartman

So I hopped on a time machine to revisit some old collateral due to
523e1d399ce ("block: make gendisk hold a reference to its queue"),
merged in v3.2, which added the conditional check for disk->queue
before calling blk_put_queue() in disk_release(). I started wondering
*why* the conditional was added, but I checked the original patch and
could not find any discussion around it.

Tejun, do you recall why you added that conditional?

if (disk->queue)
  blk_put_queue(disk->queue);

This patch however struck me as one I should highlight, since I'm
reviewing all this now and dealing with adding error paths on
add_disk(). Below are some notes.

On Tue, Jan 2, 2018 at 1:40 PM Bart Van Assche <bart.vanassche@wdc.com> wrote:
>
> Commit 523e1d399ce0 ("block: make gendisk hold a reference to its queue")
> modified add_disk() and disk_release() but did not update any of the
> error paths that trigger a put_disk() call after disk->queue has been
> assigned. That introduced the following behavior in the pktcdvd driver
> if pkt_new_dev() fails:
>
> Kernel BUG at 00000000e98fd882 [verbose debug info unavailable]
>
> Since disk_release() calls blk_put_queue() anyway if disk->queue != NULL,
> fix this by removing the blk_cleanup_queue() call from the pkt_setup_dev()
> error path.
>
> Fixes: commit 523e1d399ce0 ("block: make gendisk hold a reference to its queue")
> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
> Cc: <stable@vger.kernel.org> # v3.2
> ---
>  drivers/block/pktcdvd.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/drivers/block/pktcdvd.c b/drivers/block/pktcdvd.c
> index 67974796c350..2659b2534073 100644
> --- a/drivers/block/pktcdvd.c
> +++ b/drivers/block/pktcdvd.c
> @@ -2745,7 +2745,7 @@ static int pkt_setup_dev(dev_t dev, dev_t* pkt_dev)
>         pd->pkt_dev = MKDEV(pktdev_major, idx);
>         ret = pkt_new_dev(pd, dev);
>         if (ret)
> -               goto out_new_dev;
> +               goto out_mem2;
>
>         /* inherit events of the host device */
>         disk->events = pd->bdev->bd_disk->events;
> @@ -2763,8 +2763,6 @@ static int pkt_setup_dev(dev_t dev, dev_t* pkt_dev)
>         mutex_unlock(&ctl_mutex);
>         return 0;
>
> -out_new_dev:
> -       blk_cleanup_queue(disk->queue);
>  out_mem2:
>         put_disk(disk);
>  out_mem:
> --

As things stand now, drivers *do* call blk_cleanup_queue() on error paths
prior to add_disk(). An example today is drivers/block/loop.c, where
in loop_add(), if alloc_disk() fails, we call blk_cleanup_queue().
*But* this error path *never* calls put_disk(), as
drivers/block/pktcdvd.c did on error, because it doesn't need to:
alloc_disk() is the last call that can fail there, so no disk exists
on the error path of loop_add() and there is nothing to free.
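
To make the loop_add() ordering concrete, here is a minimal userspace
sketch of that unwind pattern. It is not the real driver code:
toy_loop_add(), struct toy_trace and the out_cleanup_queue label are
hypothetical names, and the alloc_disk() failure is simulated with a flag.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Records which cleanup steps ran; purely illustrative. */
struct toy_trace {
	bool queue_cleaned;	/* stands in for blk_cleanup_queue() */
	bool disk_put;		/* stands in for put_disk() */
};

/*
 * Toy model of the loop_add() ordering described above: the queue is
 * set up first and the disk allocation is the last step that can fail.
 */
static int toy_loop_add(bool fail_disk_alloc, struct toy_trace *t)
{
	int err;

	t->queue_cleaned = false;
	t->disk_put = false;

	/* The queue is assumed to have been allocated successfully here. */

	if (fail_disk_alloc) {	/* simulated alloc_disk() failure */
		err = -ENOMEM;
		goto out_cleanup_queue;
	}

	return 0;

out_cleanup_queue:
	/* No disk exists yet, so only the queue is torn down. */
	t->queue_cleaned = true;
	return err;
}
```

On the simulated failure only the queue cleanup runs and the disk is
never put, which is the asymmetry with the old pkt_setup_dev() error path.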

This will of course change once we make add_disk() return int and
capture errors, and it raises the question of whether we want to follow
a similar strategy for other drivers. Note, however, that blk_put_queue()
doesn't do everything blk_cleanup_queue() does; in fact,
blk_cleanup_queue() states it sets up "the appropriate flags" *and*
then calls blk_put_queue().

We'll have a bit more collateral evolutions if we embrace the
strategy in this commit, for those drivers that wish to start taking
advantage of the error checks. Other than considering this, I
thought it would be good to think about the fact that *today* we call
blk_cleanup_queue() on error paths *without* the disk being
associated yet. This is in spite of the fact that, the way we designed
the queue, it sits on top of the disk from a kobject perspective once
registered. Since we call blk_cleanup_queue() on error paths today,
possibly without a disk parent, nothing in
blk_cleanup_queue() should rely on it having a functional disk. Do
we want to keep it that way? If we keep the practice of drivers using
blk_cleanup_queue() safely on error paths, it just means we'll have to
ensure blk_cleanup_queue() never requires the disk moving forward, and
document this. The commit above reflects a case where removing the
call was not just preferred but in fact needed. However, I think just
setting disk->queue = NULL would have done it, as then the last
disk_release() would not have called blk_put_queue().

Let me know if folks have a preference. This is all new to me, so I'm
hoping folks have tribal knowledge that would be helpful here.

  Luis


* Re: [PATCH 1/2] pktcdvd: Fix pkt_setup_dev() error path
  2020-04-25  1:39   ` [PATCH 1/2] pktcdvd: Fix pkt_setup_dev() error path Luis Chamberlain
@ 2020-04-25  9:17     ` Ming Lei
  2020-04-25 22:34       ` Luis Chamberlain
  0 siblings, 1 reply; 3+ messages in thread
From: Ming Lei @ 2020-04-25  9:17 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: Bart Van Assche, Jens Axboe, linux-block, Christoph Hellwig,
	Tejun Heo, Maciej S . Szmigiero, Linux FS Devel,
	Greg Kroah-Hartman

On Fri, Apr 24, 2020 at 07:39:47PM -0600, Luis Chamberlain wrote:
> So I hopped on a time machine to revisit some old collateral due to
> 523e1d399ce ("block: make gendisk hold a reference to its queue"),
> merged in v3.2, which added the conditional check for disk->queue
> before calling blk_put_queue() in disk_release(). I started wondering
> *why* the conditional was added, but I checked the original patch and
> could not find any discussion around it.
> 
> Tejun, do you recall why you added that conditional?
> 
> if (disk->queue)
>   blk_put_queue(disk->queue);
> 
> This patch however struck me as one I should highlight, since I'm
> reviewing all this now and dealing with adding error paths on
> add_disk(). Below are some notes.

disk->queue is assigned by drivers. I guess that is why the check
is needed, given that the disk may be released in an error path before
the driver assigns a queue to it.

Also, some drivers may only allocate a disk and never add it, in which
case it is not necessary to assign disk->queue, such as drivers/scsi/sg.c.

> 
> On Tue, Jan 2, 2018 at 1:40 PM Bart Van Assche <bart.vanassche@wdc.com> wrote:
> >
> > Commit 523e1d399ce0 ("block: make gendisk hold a reference to its queue")
> > modified add_disk() and disk_release() but did not update any of the
> > error paths that trigger a put_disk() call after disk->queue has been
> > assigned. That introduced the following behavior in the pktcdvd driver
> > if pkt_new_dev() fails:
> >
> > Kernel BUG at 00000000e98fd882 [verbose debug info unavailable]
> >
> > Since disk_release() calls blk_put_queue() anyway if disk->queue != NULL,
> > fix this by removing the blk_cleanup_queue() call from the pkt_setup_dev()
> > error path.
> >
> > Fixes: commit 523e1d399ce0 ("block: make gendisk hold a reference to its queue")
> > Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
> > Cc: Tejun Heo <tj@kernel.org>
> > Cc: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
> > Cc: <stable@vger.kernel.org> # v3.2
> > ---
> >  drivers/block/pktcdvd.c | 4 +---
> >  1 file changed, 1 insertion(+), 3 deletions(-)
> >
> > diff --git a/drivers/block/pktcdvd.c b/drivers/block/pktcdvd.c
> > index 67974796c350..2659b2534073 100644
> > --- a/drivers/block/pktcdvd.c
> > +++ b/drivers/block/pktcdvd.c
> > @@ -2745,7 +2745,7 @@ static int pkt_setup_dev(dev_t dev, dev_t* pkt_dev)
> >         pd->pkt_dev = MKDEV(pktdev_major, idx);
> >         ret = pkt_new_dev(pd, dev);
> >         if (ret)
> > -               goto out_new_dev;
> > +               goto out_mem2;
> >
> >         /* inherit events of the host device */
> >         disk->events = pd->bdev->bd_disk->events;
> > @@ -2763,8 +2763,6 @@ static int pkt_setup_dev(dev_t dev, dev_t* pkt_dev)
> >         mutex_unlock(&ctl_mutex);
> >         return 0;
> >
> > -out_new_dev:
> > -       blk_cleanup_queue(disk->queue);
> >  out_mem2:
> >         put_disk(disk);
> >  out_mem:
> > --
> 
> As things stand now, drivers *do* call blk_cleanup_queue() on error paths
> prior to add_disk(). An example today is drivers/block/loop.c, where
> in loop_add(), if alloc_disk() fails, we call blk_cleanup_queue().
> *But* this error path *never* calls put_disk(), as
> drivers/block/pktcdvd.c did on error, because it doesn't need to:
> alloc_disk() is the last call that can fail there, so no disk exists
> on the error path of loop_add() and there is nothing to free.
> 
> This will of course change once we make add_disk() return int and
> capture errors, and it raises the question of whether we want to follow
> a similar strategy for other drivers. Note, however, that blk_put_queue()
> doesn't do everything blk_cleanup_queue() does; in fact,
> blk_cleanup_queue() states it sets up "the appropriate flags" *and*
> then calls blk_put_queue().
> 
> We'll have a bit more collateral evolutions if we embrace the
> strategy in this commit, for those drivers that wish to start taking
> advantage of the error checks. Other than considering this, I
> thought it would be good to think about the fact that *today* we call
> blk_cleanup_queue() on error paths *without* the disk being
> associated yet. This is in spite of the fact that, the way we designed

Some drivers may have only a request queue and no disk, such as
NVMe's admin queue, so I think blk_cleanup_queue() has to cover this
case.

> the queue, it sits on top of the disk from a kobject perspective once
> registered. Since we call blk_cleanup_queue() on error paths today,
> possibly without a disk parent, nothing in
> blk_cleanup_queue() should rely on it having a functional disk. Do
> we want to keep it that way? If we keep the practice of drivers using

Yes, see the reason above.


Thanks,
Ming



* Re: [PATCH 1/2] pktcdvd: Fix pkt_setup_dev() error path
  2020-04-25  9:17     ` Ming Lei
@ 2020-04-25 22:34       ` Luis Chamberlain
  0 siblings, 0 replies; 3+ messages in thread
From: Luis Chamberlain @ 2020-04-25 22:34 UTC (permalink / raw)
  To: Ming Lei
  Cc: Bart Van Assche, Jens Axboe, linux-block, Christoph Hellwig,
	Tejun Heo, Maciej S . Szmigiero, Linux FS Devel,
	Greg Kroah-Hartman

On Sat, Apr 25, 2020 at 05:17:00PM +0800, Ming Lei wrote:
> On Fri, Apr 24, 2020 at 07:39:47PM -0600, Luis Chamberlain wrote:
> > So I hopped on a time machine to revisit some old collateral due to
> > 523e1d399ce ("block: make gendisk hold a reference to its queue"),
> > merged in v3.2, which added the conditional check for disk->queue
> > before calling blk_put_queue() in disk_release(). I started wondering
> > *why* the conditional was added, but I checked the original patch and
> > could not find any discussion around it.
> > 
> > Tejun, do you recall why you added that conditional?
> > 
> > if (disk->queue)
> >   blk_put_queue(disk->queue);
> > 
> > This patch however struck me as one I should highlight, since I'm
> > reviewing all this now and dealing with adding error paths on
> > add_disk(). Below are some notes.
> 
> disk->queue is assigned by drivers. I guess that is why the check
> is needed, given that the disk may be released in an error path before
> the driver assigns a queue to it.
> 
> Also, some drivers may only allocate a disk and never add it, in which
> case it is not necessary to assign disk->queue, such as drivers/scsi/sg.c.

Jeesh. Ugh. Yes I see, thanks this helps.

> > As things stand now, drivers *do* call blk_cleanup_queue() on error paths
> > prior to add_disk(). An example today is drivers/block/loop.c, where
> > in loop_add(), if alloc_disk() fails, we call blk_cleanup_queue().
> > *But* this error path *never* calls put_disk(), as
> > drivers/block/pktcdvd.c did on error, because it doesn't need to:
> > alloc_disk() is the last call that can fail there, so no disk exists
> > on the error path of loop_add() and there is nothing to free.
> > 
> > This will of course change once we make add_disk() return int and
> > capture errors, and it raises the question of whether we want to follow
> > a similar strategy for other drivers. Note, however, that blk_put_queue()
> > doesn't do everything blk_cleanup_queue() does; in fact,
> > blk_cleanup_queue() states it sets up "the appropriate flags" *and*
> > then calls blk_put_queue().
> > 
> > We'll have a bit more collateral evolutions if we embrace the
> > strategy in this commit, for those drivers that wish to start taking
> > advantage of the error checks. Other than considering this, I
> > thought it would be good to think about the fact that *today* we call
> > blk_cleanup_queue() on error paths *without* the disk being
> > associated yet. This is in spite of the fact that, the way we designed
> 
> Some drivers may have only a request queue and no disk, such as
> NVMe's admin queue, so I think blk_cleanup_queue() has to cover this
> case.

Alright, also useful, thanks.

> > the queue, it sits on top of the disk from a kobject perspective once
> > registered. Since we call blk_cleanup_queue() on error paths today,
> > possibly without a disk parent, nothing in
> > blk_cleanup_queue() should rely on it having a functional disk. Do
> > we want to keep it that way? If we keep the practice of drivers using
> 
> Yes, see the reason above.

Alright, the patch I replied to was a case where blk_cleanup_queue() was
removed due to a crash, even though this driver both calls add_disk() and
assigns the queue. Although this patch didn't come with a full
kernel splat, only:

Kernel BUG at 00000000e98fd882 [verbose debug info unavailable]

I can only guess that this was likely a double put of the queue, once
at blk_cleanup_queue() and another with the last put on disk_release().
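
That guess can be sketched with a toy userspace refcount model. It
assumes the queue holds a single reference from allocation and that
add_disk() has not yet taken the gendisk's own reference when
pkt_new_dev() fails. Every toy_* name is hypothetical and none of this
is kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; not the real request_queue or gendisk. */
struct toy_queue {
	int refs;	/* models the queue reference count */
	bool dying;	/* models the "appropriate flags" */
};

struct toy_disk {
	struct toy_queue *queue;
};

/* Models blk_put_queue(): drop one reference. */
static void toy_put_queue(struct toy_queue *q)
{
	q->refs--;
}

/* Models blk_cleanup_queue(): set the flags, then drop a reference. */
static void toy_cleanup_queue(struct toy_queue *q)
{
	q->dying = true;
	toy_put_queue(q);
}

/* Models disk_release(): put the queue only if one was assigned. */
static void toy_disk_release(struct toy_disk *disk)
{
	if (disk->queue)
		toy_put_queue(disk->queue);
}

/* Old error path: blk_cleanup_queue() followed by put_disk(). */
static int toy_old_error_path(void)
{
	struct toy_queue q = { .refs = 1, .dying = false };
	struct toy_disk disk = { .queue = &q };

	toy_cleanup_queue(&q);		/* 1 -> 0 */
	toy_disk_release(&disk);	/* 0 -> -1: the assumed double put */
	return q.refs;
}

/* Error path after the patch: put_disk() only. */
static int toy_fixed_error_path(void)
{
	struct toy_queue q = { .refs = 1, .dying = false };
	struct toy_disk disk = { .queue = &q };

	toy_disk_release(&disk);	/* 1 -> 0 */
	return q.refs;
}

/* Alternative: keep the cleanup, but clear disk->queue before the put. */
static int toy_null_queue_error_path(void)
{
	struct toy_queue q = { .refs = 1, .dying = false };
	struct toy_disk disk = { .queue = &q };

	toy_cleanup_queue(&q);		/* 1 -> 0 */
	disk.queue = NULL;
	toy_disk_release(&disk);	/* skipped by the NULL check */
	return q.refs;
}
```

Under these assumptions the old path ends with the count below zero,
while both the patch and the disk->queue = NULL alternative leave it at
zero.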

I'll consider these things when extending the error paths, thanks for
the feedback.

  Luis
