* [PATCH 0/1] Handle NULL make_request_fn in generic_make_request()
@ 2020-01-23  9:17 Stefan Bader
  2020-01-23  9:17 ` [PATCH 1/1] blk/core: Gracefully handle unset make_request_fn Stefan Bader
  0 siblings, 1 reply; 11+ messages in thread
From: Stefan Bader @ 2020-01-23  9:17 UTC
  To: linux-kernel, dm-devel, linux-block
  Cc: Alasdair Kergon, Mike Snitzer, Jens Axboe, Tyler Hicks

In ff36ab34583a "dm: remove request-based logic from make_request_fn
wrapper", device creation became a two-stage process. In the first
stage, the block device is created with a queue set up but no mapping
function set. The mapping function is set in the second stage, when the
mapping table is supplied. At that stage the device becomes either
multi-queue/request based or bio based.

So right now, it is possible to crash the kernel by doing:
  - dmsetup create --notable <name>
  - mount /dev/dm-<minor> <somewhere>

While this may also need some fixing up in the device-mapper codebase,
it should also be handled in the block core, since allocating a queue
can potentially be done separately from assigning a mapping function.
There is already one check for a device that has no queue set up, so
this just adds a further check for make_request_fn being unset before
trying to submit the requests any further.

-Stefan

Stefan Bader (1):
  blk/core: Gracefully handle unset make_request_fn

 block/blk-core.c | 7 +++++++
 1 file changed, 7 insertions(+)

-- 
2.17.1

* [PATCH 1/1] blk/core: Gracefully handle unset make_request_fn
  2020-01-23  9:17 [PATCH 0/1] Handle NULL make_request_fn in generic_make_request() Stefan Bader
@ 2020-01-23  9:17 ` Stefan Bader
  2020-01-23 10:23   ` Tyler Hicks
  2020-01-23 10:35   ` Mike Snitzer
  0 siblings, 2 replies; 11+ messages in thread
From: Stefan Bader @ 2020-01-23  9:17 UTC
  To: linux-kernel, dm-devel, linux-block
  Cc: Alasdair Kergon, Mike Snitzer, Jens Axboe, Tyler Hicks

When device-mapper was adapted for multi-queue functionality, the way
the make-request function is set was also reorganized. Before, this
happened when the device-mapper logical device was created. Now it is
done once the mapping table gets loaded for the first time (this also
decides whether the block device is request or bio based).

However, in generic_make_request() the request function gets used
without further checks, which crashes the kernel if one tries to mount
such a partially set-up device.

This can easily be reproduced with the following steps:
  - dmsetup create -n test
  - mount /dev/dm-<#> /mnt

This may be something which should also be fixed up in device-mapper.
But given there is already a check for an unset queue pointer, and
there could be other drivers which do or might do the same, it sounds
like a good move to add another check to generic_make_request_checks()
and to bail out if the request function has not been set yet.

BugLink: https://bugs.launchpad.net/bugs/1860231
Fixes: ff36ab34583a ("dm: remove request-based logic from make_request_fn wrapper")
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
---
 block/blk-core.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/block/blk-core.c b/block/blk-core.c
index 1075aaff606d..adcd042edd2d 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -884,6 +884,13 @@ generic_make_request_checks(struct bio *bio)
 			bio_devname(bio, b), (long long)bio->bi_iter.bi_sector);
 		goto end_io;
 	}
+	if (unlikely(!q->make_request_fn)) {
+		printk(KERN_ERR
+		       "generic_make_request: Trying to access "
+		       "block-device without request function: %s\n",
+		       bio_devname(bio, b));
+		goto end_io;
+	}
 
 	/*
 	 * Non-mq queues do not honor REQ_NOWAIT, so complete a bio
-- 
2.17.1

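To make the effect of the proposed check concrete, here is a minimal user-space sketch of the control flow. It is not kernel code: every name prefixed with `model_` is a hypothetical stand-in, the structures carry only the fields needed for the illustration, and `-EIO` is assumed here as a placeholder for the error the real `end_io` path reports.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct model_bio;
struct model_queue;
typedef int (*model_make_request_fn)(struct model_queue *q,
				     struct model_bio *bio);

struct model_queue {
	model_make_request_fn make_request_fn;	/* NULL until a driver sets it */
};

struct model_bio {
	struct model_queue *queue;
	int status;	/* 0 on success, negative errno on failure */
};

/* Model of the validation step: returns 1 if the bio may be submitted. */
static int model_make_request_checks(struct model_bio *bio)
{
	struct model_queue *q = bio->queue;

	if (!q) {			/* the check that already exists */
		bio->status = -EIO;	/* assumption: placeholder error code */
		return 0;
	}
	if (!q->make_request_fn) {	/* the check the patch adds */
		bio->status = -EIO;
		return 0;
	}
	return 1;
}

/* Model of submission: never calls through a NULL function pointer. */
static int model_generic_make_request(struct model_bio *bio)
{
	assert(bio != NULL);
	if (!model_make_request_checks(bio))
		return bio->status;
	return bio->queue->make_request_fn(bio->queue, bio);
}

/* A trivial request function that completes every bio successfully. */
static int model_ok_fn(struct model_queue *q, struct model_bio *bio)
{
	(void)q;
	bio->status = 0;
	return 0;
}
```

With a queue whose `make_request_fn` is still NULL, `model_generic_make_request()` returns the error instead of calling through the pointer; once a function such as `model_ok_fn` is installed, the bio is passed on as usual. The cost is one extra pointer comparison per submission.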
* Re: [PATCH 1/1] blk/core: Gracefully handle unset make_request_fn 2020-01-23 9:17 ` [PATCH 1/1] blk/core: Gracefully handle unset make_request_fn Stefan Bader @ 2020-01-23 10:23 ` Tyler Hicks 2020-01-23 10:35 ` Mike Snitzer 1 sibling, 0 replies; 11+ messages in thread From: Tyler Hicks @ 2020-01-23 10:23 UTC (permalink / raw) To: Stefan Bader Cc: linux-kernel, dm-devel, linux-block, Alasdair Kergon, Mike Snitzer, Jens Axboe On 2020-01-23 11:17:13, Stefan Bader wrote: > When device-mapper adapted for multi-queue functionality, they > also re-organized the way the make-request function was set. > Before, this happened when the device-mapper logical device was > created. Now it is done once the mapping table gets loaded the > first time (this also decides whether the block device is request > or bio based). > > However in generic_make_request(), the request function gets used > without further checks and this happens if one tries to mount such > a partially set up device. > > This can easily be reproduced with the following steps: > - dmsetup create -n test > - mount /dev/dm-<#> /mnt > > This maybe is something which also should be fixed up in device- > mapper. But given there is already a check for an unset queue > pointer and potentially there could be other drivers which do or > might do the same, it sounds like a good move to add another check > to generic_make_request_checks() and to bail out if the request > function has not been set, yet. > > BugLink: https://bugs.launchpad.net/bugs/1860231 > Fixes: ff36ab34583a ("dm: remove request-based logic from make_request_fn wrapper") > Signed-off-by: Stefan Bader <stefan.bader@canonical.com> I helped debug the crash with Stefan and I think this is the most straightforward fix (and is trivial to backport for stable kernels). 
I looked at delaying the queue allocation in the dm code until the table load ioctl but I decided that was risky and doesn't help the general case of preventing other subsystems from making this same mistake. Tested-by: Tyler Hicks <tyhicks@canonical.com> Reviewed-by: Tyler Hicks <tyhicks@canonical.com> Tyler > --- > block/blk-core.c | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/block/blk-core.c b/block/blk-core.c > index 1075aaff606d..adcd042edd2d 100644 > --- a/block/blk-core.c > +++ b/block/blk-core.c > @@ -884,6 +884,13 @@ generic_make_request_checks(struct bio *bio) > bio_devname(bio, b), (long long)bio->bi_iter.bi_sector); > goto end_io; > } > + if (unlikely(!q->make_request_fn)) { > + printk(KERN_ERR > + "generic_make_request: Trying to access " > + "block-device without request function: %s\n", > + bio_devname(bio, b)); > + goto end_io; > + } > > /* > * Non-mq queues do not honor REQ_NOWAIT, so complete a bio > -- > 2.17.1 > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/1] blk/core: Gracefully handle unset make_request_fn 2020-01-23 9:17 ` [PATCH 1/1] blk/core: Gracefully handle unset make_request_fn Stefan Bader 2020-01-23 10:23 ` Tyler Hicks @ 2020-01-23 10:35 ` Mike Snitzer 2020-01-23 17:28 ` Mike Snitzer 1 sibling, 1 reply; 11+ messages in thread From: Mike Snitzer @ 2020-01-23 10:35 UTC (permalink / raw) To: Stefan Bader Cc: linux-kernel, dm-devel, linux-block, Alasdair Kergon, Jens Axboe, Tyler Hicks On Thu, Jan 23 2020 at 4:17am -0500, Stefan Bader <stefan.bader@canonical.com> wrote: > When device-mapper adapted for multi-queue functionality, they > also re-organized the way the make-request function was set. > Before, this happened when the device-mapper logical device was > created. Now it is done once the mapping table gets loaded the > first time (this also decides whether the block device is request > or bio based). > > However in generic_make_request(), the request function gets used > without further checks and this happens if one tries to mount such > a partially set up device. > > This can easily be reproduced with the following steps: > - dmsetup create -n test > - mount /dev/dm-<#> /mnt > > This maybe is something which also should be fixed up in device- > mapper. I'll look closer at other options. > But given there is already a check for an unset queue > pointer and potentially there could be other drivers which do or > might do the same, it sounds like a good move to add another check > to generic_make_request_checks() and to bail out if the request > function has not been set, yet. > > BugLink: https://bugs.launchpad.net/bugs/1860231 From that bug; "The currently proposed fix introduces no chance of stability regressions. There is a chance of a very small performance regression since an additional pointer comparison is performed on each block layer request but this is unlikely to be noticeable." This captures my immediate concern: slowing down everyone for this DM edge-case isn't desirable. 
Mike
* Re: [PATCH 1/1] blk/core: Gracefully handle unset make_request_fn 2020-01-23 10:35 ` Mike Snitzer @ 2020-01-23 17:28 ` Mike Snitzer 2020-01-23 18:52 ` Jens Axboe 0 siblings, 1 reply; 11+ messages in thread From: Mike Snitzer @ 2020-01-23 17:28 UTC (permalink / raw) To: Stefan Bader Cc: Jens Axboe, linux-kernel, linux-block, dm-devel, Tyler Hicks, Alasdair Kergon On Thu, Jan 23 2020 at 5:35am -0500, Mike Snitzer <snitzer@redhat.com> wrote: > On Thu, Jan 23 2020 at 4:17am -0500, > Stefan Bader <stefan.bader@canonical.com> wrote: > > > When device-mapper adapted for multi-queue functionality, they > > also re-organized the way the make-request function was set. > > Before, this happened when the device-mapper logical device was > > created. Now it is done once the mapping table gets loaded the > > first time (this also decides whether the block device is request > > or bio based). > > > > However in generic_make_request(), the request function gets used > > without further checks and this happens if one tries to mount such > > a partially set up device. > > > > This can easily be reproduced with the following steps: > > - dmsetup create -n test > > - mount /dev/dm-<#> /mnt > > > > This maybe is something which also should be fixed up in device- > > mapper. > > I'll look closer at other options. > > > But given there is already a check for an unset queue > > pointer and potentially there could be other drivers which do or > > might do the same, it sounds like a good move to add another check > > to generic_make_request_checks() and to bail out if the request > > function has not been set, yet. > > > > BugLink: https://bugs.launchpad.net/bugs/1860231 > > >From that bug; > "The currently proposed fix introduces no chance of stability > regressions. There is a chance of a very small performance regression > since an additional pointer comparison is performed on each block layer > request but this is unlikely to be noticeable." 
>
> This captures my immediate concern: slowing down everyone for this DM
> edge-case isn't desirable.

So I had a look and there isn't anything easier than adding the proposed
NULL check in generic_make_request_checks(). Given the many
conditionals in that function... what's one more? ;)

I looked at marking the queue frozen to prevent IO via
blk_queue_enter()'s existing check -- but that quickly felt like an
abuse, especially in that there isn't a queue unfreeze for bio-based.

Jens, I'll defer to you to judge this patch further. If you're OK with
it: cool. If not, I'm open to suggestions for how to proceed.

* Re: [PATCH 1/1] blk/core: Gracefully handle unset make_request_fn 2020-01-23 17:28 ` Mike Snitzer @ 2020-01-23 18:52 ` Jens Axboe 2020-01-24 6:04 ` Stefan Bader 2020-01-27 19:32 ` Mike Snitzer 0 siblings, 2 replies; 11+ messages in thread From: Jens Axboe @ 2020-01-23 18:52 UTC (permalink / raw) To: Mike Snitzer, Stefan Bader Cc: linux-kernel, linux-block, dm-devel, Tyler Hicks, Alasdair Kergon On 1/23/20 10:28 AM, Mike Snitzer wrote: > On Thu, Jan 23 2020 at 5:35am -0500, > Mike Snitzer <snitzer@redhat.com> wrote: > >> On Thu, Jan 23 2020 at 4:17am -0500, >> Stefan Bader <stefan.bader@canonical.com> wrote: >> >>> When device-mapper adapted for multi-queue functionality, they >>> also re-organized the way the make-request function was set. >>> Before, this happened when the device-mapper logical device was >>> created. Now it is done once the mapping table gets loaded the >>> first time (this also decides whether the block device is request >>> or bio based). >>> >>> However in generic_make_request(), the request function gets used >>> without further checks and this happens if one tries to mount such >>> a partially set up device. >>> >>> This can easily be reproduced with the following steps: >>> - dmsetup create -n test >>> - mount /dev/dm-<#> /mnt >>> >>> This maybe is something which also should be fixed up in device- >>> mapper. >> >> I'll look closer at other options. >> >>> But given there is already a check for an unset queue >>> pointer and potentially there could be other drivers which do or >>> might do the same, it sounds like a good move to add another check >>> to generic_make_request_checks() and to bail out if the request >>> function has not been set, yet. >>> >>> BugLink: https://bugs.launchpad.net/bugs/1860231 >> >> >From that bug; >> "The currently proposed fix introduces no chance of stability >> regressions. 
There is a chance of a very small performance regression >> since an additional pointer comparison is performed on each block layer >> request but this is unlikely to be noticeable." >> >> This captures my immediate concern: slowing down everyone for this DM >> edge-case isn't desirable. > > SO I had a look and there isn't anything easier than adding the proposed > NULL check in generic_make_request_checks(). Given the many > conditionals in that function.. what's one more? ;) > > I looked at marking the queue frozen to prevent IO via > blk_queue_enter()'s existing cheeck -- but that quickly felt like an > abuse, especially in that there isn't a queue unfreeze for bio-based. > > Jens, I'll defer to you to judge this patch further. If you're OK with > it: cool. If not, I'm open to suggestions for how to proceed. > It does kinda suck... The generic_make_request_checks() is a mess, and this doesn't make it any better. Any reason why we can't solve this two step setup in a clean fashion instead of patching around it like this? Feels like a pretty bad hack, tbh. -- Jens Axboe ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/1] blk/core: Gracefully handle unset make_request_fn 2020-01-23 18:52 ` Jens Axboe @ 2020-01-24 6:04 ` Stefan Bader 2020-01-27 19:32 ` Mike Snitzer 1 sibling, 0 replies; 11+ messages in thread From: Stefan Bader @ 2020-01-24 6:04 UTC (permalink / raw) To: Jens Axboe, Mike Snitzer Cc: linux-kernel, linux-block, dm-devel, Tyler Hicks, Alasdair Kergon [-- Attachment #1.1: Type: text/plain, Size: 3126 bytes --] On 23.01.20 20:52, Jens Axboe wrote: > On 1/23/20 10:28 AM, Mike Snitzer wrote: >> On Thu, Jan 23 2020 at 5:35am -0500, >> Mike Snitzer <snitzer@redhat.com> wrote: >> >>> On Thu, Jan 23 2020 at 4:17am -0500, >>> Stefan Bader <stefan.bader@canonical.com> wrote: >>> >>>> When device-mapper adapted for multi-queue functionality, they >>>> also re-organized the way the make-request function was set. >>>> Before, this happened when the device-mapper logical device was >>>> created. Now it is done once the mapping table gets loaded the >>>> first time (this also decides whether the block device is request >>>> or bio based). >>>> >>>> However in generic_make_request(), the request function gets used >>>> without further checks and this happens if one tries to mount such >>>> a partially set up device. >>>> >>>> This can easily be reproduced with the following steps: >>>> - dmsetup create -n test >>>> - mount /dev/dm-<#> /mnt >>>> >>>> This maybe is something which also should be fixed up in device- >>>> mapper. >>> >>> I'll look closer at other options. >>> >>>> But given there is already a check for an unset queue >>>> pointer and potentially there could be other drivers which do or >>>> might do the same, it sounds like a good move to add another check >>>> to generic_make_request_checks() and to bail out if the request >>>> function has not been set, yet. >>>> >>>> BugLink: https://bugs.launchpad.net/bugs/1860231 >>> >>> >From that bug; >>> "The currently proposed fix introduces no chance of stability >>> regressions. 
There is a chance of a very small performance regression
>>> since an additional pointer comparison is performed on each block layer
>>> request but this is unlikely to be noticeable."
>>>
>>> This captures my immediate concern: slowing down everyone for this DM
>>> edge-case isn't desirable.
>>
>> So I had a look and there isn't anything easier than adding the proposed
>> NULL check in generic_make_request_checks(). Given the many
>> conditionals in that function... what's one more? ;)
>>
>> I looked at marking the queue frozen to prevent IO via
>> blk_queue_enter()'s existing check -- but that quickly felt like an
>> abuse, especially in that there isn't a queue unfreeze for bio-based.
>>
>> Jens, I'll defer to you to judge this patch further. If you're OK with
>> it: cool. If not, I'm open to suggestions for how to proceed.
>>
>
> It does kinda suck... The generic_make_request_checks() is a mess, and
> this doesn't make it any better. Any reason why we can't solve this
> two step setup in a clean fashion instead of patching around it like
> this? Feels like a pretty bad hack, tbh.
>

Tyler spent some time thinking about delaying the allocation of the
queue structure until later, but that seemed rather dangerous. IIRC
there are places during registration of the (generic) block device which
expect this to be done.

Not sure whether it would be feasible to start with one kind of dummy
make_request_fn and then switch that over to the proper one once that
decision can be made...

* Re: [PATCH 1/1] blk/core: Gracefully handle unset make_request_fn 2020-01-23 18:52 ` Jens Axboe 2020-01-24 6:04 ` Stefan Bader @ 2020-01-27 19:32 ` Mike Snitzer 2020-01-27 19:39 ` Jens Axboe 2020-01-28 14:32 ` Stefan Bader 1 sibling, 2 replies; 11+ messages in thread From: Mike Snitzer @ 2020-01-27 19:32 UTC (permalink / raw) To: Jens Axboe Cc: Stefan Bader, linux-kernel, linux-block, dm-devel, Tyler Hicks, Alasdair Kergon On Thu, Jan 23 2020 at 1:52pm -0500, Jens Axboe <axboe@kernel.dk> wrote: > On 1/23/20 10:28 AM, Mike Snitzer wrote: > > On Thu, Jan 23 2020 at 5:35am -0500, > > Mike Snitzer <snitzer@redhat.com> wrote: > > > >> On Thu, Jan 23 2020 at 4:17am -0500, > >> Stefan Bader <stefan.bader@canonical.com> wrote: > >> > >>> When device-mapper adapted for multi-queue functionality, they > >>> also re-organized the way the make-request function was set. > >>> Before, this happened when the device-mapper logical device was > >>> created. Now it is done once the mapping table gets loaded the > >>> first time (this also decides whether the block device is request > >>> or bio based). > >>> > >>> However in generic_make_request(), the request function gets used > >>> without further checks and this happens if one tries to mount such > >>> a partially set up device. > >>> > >>> This can easily be reproduced with the following steps: > >>> - dmsetup create -n test > >>> - mount /dev/dm-<#> /mnt > >>> > >>> This maybe is something which also should be fixed up in device- > >>> mapper. > >> > >> I'll look closer at other options. > >> > >>> But given there is already a check for an unset queue > >>> pointer and potentially there could be other drivers which do or > >>> might do the same, it sounds like a good move to add another check > >>> to generic_make_request_checks() and to bail out if the request > >>> function has not been set, yet. 
> >>> > >>> BugLink: https://bugs.launchpad.net/bugs/1860231 > >> > >> >From that bug; > >> "The currently proposed fix introduces no chance of stability > >> regressions. There is a chance of a very small performance regression > >> since an additional pointer comparison is performed on each block layer > >> request but this is unlikely to be noticeable." > >> > >> This captures my immediate concern: slowing down everyone for this DM > >> edge-case isn't desirable. > > > > SO I had a look and there isn't anything easier than adding the proposed > > NULL check in generic_make_request_checks(). Given the many > > conditionals in that function.. what's one more? ;) > > > > I looked at marking the queue frozen to prevent IO via > > blk_queue_enter()'s existing cheeck -- but that quickly felt like an > > abuse, especially in that there isn't a queue unfreeze for bio-based. > > > > Jens, I'll defer to you to judge this patch further. If you're OK with > > it: cool. If not, I'm open to suggestions for how to proceed. > > > > It does kinda suck... The generic_make_request_checks() is a mess, and > this doesn't make it any better. Any reason why we can't solve this > two step setup in a clean fashion instead of patching around it like > this? Feels like a pretty bad hack, tbh. I just staged the following DM fix: https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-5.6&id=28a101d6b344f5a38d482a686d18b1205bc92333 From: Mike Snitzer <snitzer@redhat.com> Date: Mon, 27 Jan 2020 14:07:23 -0500 Subject: [PATCH] dm: fix potential for q->make_request_fn NULL pointer Move blk_queue_make_request() to dm.c:alloc_dev() so that q->make_request_fn is never NULL during the lifetime of a DM device (even one that is created without a DM table). 
Otherwise generic_make_request() will crash simply by doing:
  dmsetup create -n test
  mount /dev/dm-N /mnt

While at it, move ->congested_data initialization out of
dm.c:alloc_dev() and into the bio-based specific init method.

Reported-by: Stefan Bader <stefan.bader@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1860231
Fixes: ff36ab34583a ("dm: remove request-based logic from make_request_fn wrapper")
Depends-on: c12c9a3c3860c ("dm: various cleanups to md->queue initialization code")
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 drivers/md/dm.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index e8f9661a10a1..b89f07ee2eff 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1859,6 +1859,7 @@ static void dm_init_normal_md_queue(struct mapped_device *md)
 	/*
 	 * Initialize aspects of queue that aren't relevant for blk-mq
 	 */
+	md->queue->backing_dev_info->congested_data = md;
 	md->queue->backing_dev_info->congested_fn = dm_any_congested;
 }
 
@@ -1949,7 +1950,12 @@ static struct mapped_device *alloc_dev(int minor)
 	if (!md->queue)
 		goto bad;
 	md->queue->queuedata = md;
-	md->queue->backing_dev_info->congested_data = md;
+	/*
+	 * default to bio-based required ->make_request_fn until DM
+	 * table is loaded and md->type established. If request-based
+	 * table is loaded: blk-mq will override accordingly.
+	 */
+	blk_queue_make_request(md->queue, dm_make_request);
 
 	md->disk = alloc_disk_node(1, md->numa_node_id);
 	if (!md->disk)
@@ -2264,7 +2270,6 @@ int dm_setup_md_queue(struct mapped_device *md, struct dm_table *t)
 	case DM_TYPE_DAX_BIO_BASED:
 	case DM_TYPE_NVME_BIO_BASED:
 		dm_init_normal_md_queue(md);
-		blk_queue_make_request(md->queue, dm_make_request);
 		break;
 	case DM_TYPE_NONE:
 		WARN_ON_ONCE(true);
-- 
2.21.GIT

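The idea behind the DM fix can be sketched in user space as follows. This is only a model under stated assumptions: all `model_` names are hypothetical, the two request functions merely return distinct codes so the active path can be observed, and the request-based branch stands in for the "blk-mq will override accordingly" behaviour described in the patch comment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the device types named in the patch. */
enum model_type {
	MODEL_TYPE_NONE,
	MODEL_TYPE_BIO_BASED,
	MODEL_TYPE_REQUEST_BASED
};

struct model_md;
typedef int (*model_make_request_fn)(struct model_md *md);

struct model_md {
	enum model_type type;
	model_make_request_fn make_request_fn;
};

/* Return codes only identify which path handled the I/O. */
static int model_dm_make_request(struct model_md *md)
{
	(void)md;
	return 1;	/* bio-based path */
}

static int model_blk_mq_make_request(struct model_md *md)
{
	(void)md;
	return 2;	/* request-based (blk-mq) path */
}

/* Stage 1: device allocation installs the bio-based default right away. */
static void model_alloc_dev(struct model_md *md)
{
	md->type = MODEL_TYPE_NONE;
	md->make_request_fn = model_dm_make_request;
}

/* Stage 2: the table load fixes the type; only blk-mq replaces the default. */
static void model_setup_md_queue(struct model_md *md, enum model_type type)
{
	switch (type) {
	case MODEL_TYPE_BIO_BASED:
		md->type = type;	/* default function stays in place */
		break;
	case MODEL_TYPE_REQUEST_BASED:
		md->type = type;
		md->make_request_fn = model_blk_mq_make_request;
		break;
	case MODEL_TYPE_NONE:
		break;			/* the real code warns here */
	}
}

/* Submission can rely on the pointer being set at every stage. */
static int model_submit(struct model_md *md)
{
	assert(md->make_request_fn != NULL);
	return md->make_request_fn(md);
}
```

Because the default is installed during allocation, there is no window in which `make_request_fn` is NULL, so the generic submission path needs no additional check.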
* Re: [PATCH 1/1] blk/core: Gracefully handle unset make_request_fn 2020-01-27 19:32 ` Mike Snitzer @ 2020-01-27 19:39 ` Jens Axboe 2020-01-28 14:32 ` Stefan Bader 1 sibling, 0 replies; 11+ messages in thread From: Jens Axboe @ 2020-01-27 19:39 UTC (permalink / raw) To: Mike Snitzer Cc: Stefan Bader, linux-kernel, linux-block, dm-devel, Tyler Hicks, Alasdair Kergon On 1/27/20 12:32 PM, Mike Snitzer wrote: > On Thu, Jan 23 2020 at 1:52pm -0500, > Jens Axboe <axboe@kernel.dk> wrote: > >> On 1/23/20 10:28 AM, Mike Snitzer wrote: >>> On Thu, Jan 23 2020 at 5:35am -0500, >>> Mike Snitzer <snitzer@redhat.com> wrote: >>> >>>> On Thu, Jan 23 2020 at 4:17am -0500, >>>> Stefan Bader <stefan.bader@canonical.com> wrote: >>>> >>>>> When device-mapper adapted for multi-queue functionality, they >>>>> also re-organized the way the make-request function was set. >>>>> Before, this happened when the device-mapper logical device was >>>>> created. Now it is done once the mapping table gets loaded the >>>>> first time (this also decides whether the block device is request >>>>> or bio based). >>>>> >>>>> However in generic_make_request(), the request function gets used >>>>> without further checks and this happens if one tries to mount such >>>>> a partially set up device. >>>>> >>>>> This can easily be reproduced with the following steps: >>>>> - dmsetup create -n test >>>>> - mount /dev/dm-<#> /mnt >>>>> >>>>> This maybe is something which also should be fixed up in device- >>>>> mapper. >>>> >>>> I'll look closer at other options. >>>> >>>>> But given there is already a check for an unset queue >>>>> pointer and potentially there could be other drivers which do or >>>>> might do the same, it sounds like a good move to add another check >>>>> to generic_make_request_checks() and to bail out if the request >>>>> function has not been set, yet. 
>>>>> >>>>> BugLink: https://bugs.launchpad.net/bugs/1860231 >>>> >>>> >From that bug; >>>> "The currently proposed fix introduces no chance of stability >>>> regressions. There is a chance of a very small performance regression >>>> since an additional pointer comparison is performed on each block layer >>>> request but this is unlikely to be noticeable." >>>> >>>> This captures my immediate concern: slowing down everyone for this DM >>>> edge-case isn't desirable. >>> >>> SO I had a look and there isn't anything easier than adding the proposed >>> NULL check in generic_make_request_checks(). Given the many >>> conditionals in that function.. what's one more? ;) >>> >>> I looked at marking the queue frozen to prevent IO via >>> blk_queue_enter()'s existing cheeck -- but that quickly felt like an >>> abuse, especially in that there isn't a queue unfreeze for bio-based. >>> >>> Jens, I'll defer to you to judge this patch further. If you're OK with >>> it: cool. If not, I'm open to suggestions for how to proceed. >>> >> >> It does kinda suck... The generic_make_request_checks() is a mess, and >> this doesn't make it any better. Any reason why we can't solve this >> two step setup in a clean fashion instead of patching around it like >> this? Feels like a pretty bad hack, tbh. > > I just staged the following DM fix: > https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-5.6&id=28a101d6b344f5a38d482a686d18b1205bc92333 I like that a lot more than the NULL check in the core. -- Jens Axboe ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/1] blk/core: Gracefully handle unset make_request_fn 2020-01-27 19:32 ` Mike Snitzer 2020-01-27 19:39 ` Jens Axboe @ 2020-01-28 14:32 ` Stefan Bader 2020-01-28 16:26 ` Mike Snitzer 1 sibling, 1 reply; 11+ messages in thread From: Stefan Bader @ 2020-01-28 14:32 UTC (permalink / raw) To: Mike Snitzer, Jens Axboe Cc: linux-kernel, linux-block, dm-devel, Tyler Hicks, Alasdair Kergon [-- Attachment #1.1: Type: text/plain, Size: 5940 bytes --] On 27.01.20 20:32, Mike Snitzer wrote: > On Thu, Jan 23 2020 at 1:52pm -0500, > Jens Axboe <axboe@kernel.dk> wrote: > >> On 1/23/20 10:28 AM, Mike Snitzer wrote: >>> On Thu, Jan 23 2020 at 5:35am -0500, >>> Mike Snitzer <snitzer@redhat.com> wrote: >>> >>>> On Thu, Jan 23 2020 at 4:17am -0500, >>>> Stefan Bader <stefan.bader@canonical.com> wrote: >>>> >>>>> When device-mapper adapted for multi-queue functionality, they >>>>> also re-organized the way the make-request function was set. >>>>> Before, this happened when the device-mapper logical device was >>>>> created. Now it is done once the mapping table gets loaded the >>>>> first time (this also decides whether the block device is request >>>>> or bio based). >>>>> >>>>> However in generic_make_request(), the request function gets used >>>>> without further checks and this happens if one tries to mount such >>>>> a partially set up device. >>>>> >>>>> This can easily be reproduced with the following steps: >>>>> - dmsetup create -n test >>>>> - mount /dev/dm-<#> /mnt >>>>> >>>>> This maybe is something which also should be fixed up in device- >>>>> mapper. >>>> >>>> I'll look closer at other options. >>>> >>>>> But given there is already a check for an unset queue >>>>> pointer and potentially there could be other drivers which do or >>>>> might do the same, it sounds like a good move to add another check >>>>> to generic_make_request_checks() and to bail out if the request >>>>> function has not been set, yet. 
>>>>> >>>>> BugLink: https://bugs.launchpad.net/bugs/1860231 >>>> >>>> >From that bug; >>>> "The currently proposed fix introduces no chance of stability >>>> regressions. There is a chance of a very small performance regression >>>> since an additional pointer comparison is performed on each block layer >>>> request but this is unlikely to be noticeable." >>>> >>>> This captures my immediate concern: slowing down everyone for this DM >>>> edge-case isn't desirable. >>> >>> SO I had a look and there isn't anything easier than adding the proposed >>> NULL check in generic_make_request_checks(). Given the many >>> conditionals in that function.. what's one more? ;) >>> >>> I looked at marking the queue frozen to prevent IO via >>> blk_queue_enter()'s existing cheeck -- but that quickly felt like an >>> abuse, especially in that there isn't a queue unfreeze for bio-based. >>> >>> Jens, I'll defer to you to judge this patch further. If you're OK with >>> it: cool. If not, I'm open to suggestions for how to proceed. >>> >> >> It does kinda suck... The generic_make_request_checks() is a mess, and >> this doesn't make it any better. Any reason why we can't solve this >> two step setup in a clean fashion instead of patching around it like >> this? Feels like a pretty bad hack, tbh. > > I just staged the following DM fix: > https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-5.6&id=28a101d6b344f5a38d482a686d18b1205bc92333 Thanks Mike, yeah this looks like it resolves the problem without adding any impact on the generic I/O path. We certainly had thought about that but felt uncertain whether it would not open other risks. Like something adding requests just before the table load. Could this cause some I/O be handled by one function and the rest by another? And would that really matter? The other thing that was a bit strange but maybe someone else's problem is that mount generated I/O requests to start with. 
The device size should be 0 still. > > From: Mike Snitzer <snitzer@redhat.com> > Date: Mon, 27 Jan 2020 14:07:23 -0500 > Subject: [PATCH] dm: fix potential for q->make_request_fn NULL pointer > > Move blk_queue_make_request() to dm.c:alloc_dev() so that > q->make_request_fn is never NULL during the lifetime of a DM device > (even one that is created without a DM table). > > Otherwise generic_make_request() will crash simply by doing: > dmsetup create -n test > mount /dev/dm-N /mnt > > While at it, move ->congested_data initialization out of > dm.c:alloc_dev() and into the bio-based specific init method. > > Reported-by: Stefan Bader <stefan.bader@canonical.com> > BugLink: https://bugs.launchpad.net/bugs/1860231 > Fixes: ff36ab34583a ("dm: remove request-based logic from make_request_fn wrapper") > Depends-on: c12c9a3c3860c ("dm: various cleanups to md->queue initialization code") > Signed-off-by: Mike Snitzer <snitzer@redhat.com> > --- > drivers/md/dm.c | 9 +++++++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c > index e8f9661a10a1..b89f07ee2eff 100644 > --- a/drivers/md/dm.c > +++ b/drivers/md/dm.c > @@ -1859,6 +1859,7 @@ static void dm_init_normal_md_queue(struct mapped_device *md) > /* > * Initialize aspects of queue that aren't relevant for blk-mq > */ > + md->queue->backing_dev_info->congested_data = md; > md->queue->backing_dev_info->congested_fn = dm_any_congested; > } > > @@ -1949,7 +1950,12 @@ static struct mapped_device *alloc_dev(int minor) > if (!md->queue) > goto bad; > md->queue->queuedata = md; > - md->queue->backing_dev_info->congested_data = md; > + /* > + * default to bio-based required ->make_request_fn until DM > + * table is loaded and md->type established. If request-based > + * table is loaded: blk-mq will override accordingly. 
> +         */
> +        blk_queue_make_request(md->queue, dm_make_request);
>
>          md->disk = alloc_disk_node(1, md->numa_node_id);
>          if (!md->disk)
> @@ -2264,7 +2270,6 @@ int dm_setup_md_queue(struct mapped_device *md, struct dm_table *t)
>          case DM_TYPE_DAX_BIO_BASED:
>          case DM_TYPE_NVME_BIO_BASED:
>                  dm_init_normal_md_queue(md);
> -                blk_queue_make_request(md->queue, dm_make_request);
>                  break;
>          case DM_TYPE_NONE:
>                  WARN_ON_ONCE(true);
>
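The idea behind the staged fix can be sketched in userspace C. This is only a toy model under stated assumptions, not kernel code: every `*_model` name is hypothetical, and returning -EIO for a device without a table is an illustrative choice. The point it shows is that once the handler is installed at allocation time, the submit path can call through the pointer without an extra NULL check, and the "no table yet" case is handled inside the driver's own handler.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-ins for bio, DM table and request queue (all hypothetical). */
struct bio_model { int sector; };
struct table_model { int start; int len; };

struct queue_model;
typedef int (*make_request_model_fn)(struct queue_model *q,
                                     struct bio_model *bio);

struct queue_model {
    make_request_model_fn make_request; /* set once in alloc_dev_model() */
    const struct table_model *map;      /* NULL until a table is loaded */
};

/* Default bio-based handler: rejects IO while no table is present. */
int dm_make_request_model(struct queue_model *q, struct bio_model *bio)
{
    if (q->map == NULL)
        return -EIO; /* assumption: stands in for the "!map" error path */
    if (bio->sector < q->map->start ||
        bio->sector >= q->map->start + q->map->len)
        return -EIO; /* outside the mapped range */
    return 0;
}

/* Install the handler at allocation time, before any table exists. */
void alloc_dev_model(struct queue_model *q)
{
    q->map = NULL;
    q->make_request = dm_make_request_model;
}

/* Generic submit path: relies on the invariant instead of testing it. */
int submit_model(struct queue_model *q, struct bio_model *bio)
{
    assert(q->make_request != NULL); /* documents the invariant only */
    return q->make_request(q, bio);
}
```

In this model, submitting to a freshly allocated device returns an error rather than calling through a NULL pointer, and the same call succeeds once `map` points at a table covering the sector.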
* Re: [PATCH 1/1] blk/core: Gracefully handle unset make_request_fn 2020-01-28 14:32 ` Stefan Bader @ 2020-01-28 16:26 ` Mike Snitzer 0 siblings, 0 replies; 11+ messages in thread From: Mike Snitzer @ 2020-01-28 16:26 UTC (permalink / raw) To: Stefan Bader Cc: Jens Axboe, linux-kernel, linux-block, dm-devel, Tyler Hicks, Alasdair Kergon

On Tue, Jan 28 2020 at 9:32am -0500,
Stefan Bader <stefan.bader@canonical.com> wrote:

> On 27.01.20 20:32, Mike Snitzer wrote:
> >
> > I just staged the following DM fix:
> > https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-5.6&id=28a101d6b344f5a38d482a686d18b1205bc92333
>
> Thanks Mike,
>
> yeah this looks like it resolves the problem without adding any impact on the
> generic I/O path. We certainly had thought about that but felt uncertain whether
> it would not open other risks. Like something adding requests just before the
> table load. Could this cause some I/O be handled by one function and the rest by
> another? And would that really matter?

I considered this too. Any IO issued to the device before it is "ready"
won't matter anyway (nowhere to send the IO due to not having a DM table
-- such IO should result in an error from dm.c:dm_process_bio's !map
check). But given the device has no size, a simple write will hit
-ENOSPC before that.

And the only way to get the DM device to have a proper destination for
its IO is to load a table, which requires a sequence like:

# dmsetup create -n test
# dmsetup table
test:
# echo "0 20971520 linear 259:0 2048" | dmsetup load test
# dmsetup table --inactive
test: 0 20971520 linear 259:0 2048
# dmsetup suspend test
# dmsetup resume test
# dmsetup table
test: 0 20971520 linear 259:0 2048

And once a table is loaded there will be accompanying change uevents
that trigger udev, blkid, etc.

(NOTE: the suspend phase implies a flush of all outstanding IO, but even
if 'dmsetup suspend --noflush test' were used the IO would just get
pushed onto a list in DM core and it would be issued after the new table
is in place).

> The other thing that was a bit strange but maybe someone else's problem is that
> mount generated I/O requests to start with. The device size should be 0 still.

That's just mount not having a negative check for device size being 0.

Mike
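The NOTE about 'suspend --noflush' can be illustrated with a small userspace sketch. It is a toy model under stated assumptions, not DM core: all names are hypothetical, tables are reduced to integer ids, and the fixed-size arrays exist only to keep the example short. It shows IO submitted while suspended being parked on a list and then issued, in order, against the table that becomes live on resume.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_IO 8

struct issued_model { int io; int table; };

struct dev_model {
    int suspended;
    int live_table;      /* 0 means "no table" */
    int inactive_table;  /* 0 means "nothing staged" */
    int deferred[MODEL_MAX_IO];
    size_t n_deferred;
    struct issued_model log[MODEL_MAX_IO]; /* what was issued, and where */
    size_t n_log;
};

/* Record that an IO was issued against the currently live table. */
void model_issue(struct dev_model *d, int io)
{
    assert(d->n_log < MODEL_MAX_IO);
    d->log[d->n_log].io = io;
    d->log[d->n_log].table = d->live_table;
    d->n_log++;
}

/* While suspended, IO is parked on the deferred list instead of issued. */
void model_submit(struct dev_model *d, int io)
{
    if (d->suspended) {
        assert(d->n_deferred < MODEL_MAX_IO);
        d->deferred[d->n_deferred++] = io;
        return;
    }
    model_issue(d, io);
}

/* Stage a table in the inactive slot, like 'dmsetup load'. */
void model_load(struct dev_model *d, int table)
{
    d->inactive_table = table;
}

/* Suspend without flushing: nothing is issued, new IO gets deferred. */
void model_suspend_noflush(struct dev_model *d)
{
    d->suspended = 1;
}

/* Resume: swap in the staged table, then replay deferred IO in order. */
void model_resume(struct dev_model *d)
{
    if (d->inactive_table != 0) {
        d->live_table = d->inactive_table;
        d->inactive_table = 0;
    }
    d->suspended = 0;
    for (size_t i = 0; i < d->n_deferred; i++)
        model_issue(d, d->deferred[i]);
    d->n_deferred = 0;
}
```

With a zero-initialized `struct dev_model`, calling load, suspend, two submits, and resume leaves both IOs in the log tagged with the newly loaded table id, which is the behaviour the NOTE describes.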
Thread overview: 11+ messages
2020-01-23  9:17 [PATCH 0/1] Handle NULL make_request_fn in generic_make_request() Stefan Bader
2020-01-23  9:17 ` [PATCH 1/1] blk/core: Gracefully handle unset make_request_fn Stefan Bader
2020-01-23 10:23 ` Tyler Hicks
2020-01-23 10:35 ` Mike Snitzer
2020-01-23 17:28 ` Mike Snitzer
2020-01-23 18:52 ` Jens Axboe
2020-01-24  6:04 ` Stefan Bader
2020-01-27 19:32 ` Mike Snitzer
2020-01-27 19:39 ` Jens Axboe
2020-01-28 14:32 ` Stefan Bader
2020-01-28 16:26 ` Mike Snitzer