* [PATCH 0/2] enable plugging only for reads in zoned block devices [not found] <CGME20220925185349eucas1p1dc689bac64668ca038ba8646c44fd580@eucas1p1.samsung.com> @ 2022-09-25 18:53 ` Pankaj Raghav [not found] ` <CGME20220925185350eucas1p1fc354429027a88de7e548a3a4529b4ef@eucas1p1.samsung.com> [not found] ` <CGME20220925185351eucas1p1e0c37396c09611509c0b18bdcdeddfe1@eucas1p1.samsung.com> 0 siblings, 2 replies; 22+ messages in thread From: Pankaj Raghav @ 2022-09-25 18:53 UTC (permalink / raw) To: linux-block, axboe; +Cc: damien.lemoal, gost.dev, Pankaj Raghav Hi Jens, This patch series addresses the issue that was discussed in your plugging for passthrough patch series[1]. 1st patch modifies blk_mq_plug() function to accept plugging only for reads in zoned block devices. 2nd patch uses the blk_mq_plug function in blk_execute_rq_nowait(). The patches are based on next-20220923. [1] https://lore.kernel.org/linux-block/2e484ccb-b65b-2991-e259-d3f7be6ad1a6@kernel.dk/ Pankaj Raghav (2): block: modify blk_mq_plug() to allow only reads for zoned block devices block: use blk_mq_plug() in blk_execute_rq_nowait() block/blk-mq.c | 2 +- block/blk-mq.h | 7 ++++--- 2 files changed, 5 insertions(+), 4 deletions(-) -- 2.25.1 ^ permalink raw reply [flat|nested] 22+ messages in thread
[parent not found: <CGME20220925185350eucas1p1fc354429027a88de7e548a3a4529b4ef@eucas1p1.samsung.com>]
* [PATCH 1/2] block: modify blk_mq_plug() to allow only reads for zoned block devices [not found] ` <CGME20220925185350eucas1p1fc354429027a88de7e548a3a4529b4ef@eucas1p1.samsung.com> @ 2022-09-25 18:53 ` Pankaj Raghav 2022-09-25 22:55 ` Damien Le Moal 2022-09-26 14:37 ` Christoph Hellwig 0 siblings, 2 replies; 22+ messages in thread From: Pankaj Raghav @ 2022-09-25 18:53 UTC (permalink / raw) To: linux-block, axboe; +Cc: damien.lemoal, gost.dev, Pankaj Raghav Modify blk_mq_plug() to allow plugging only for read operations in zoned block devices as there are alternative IO paths in the linux block layer which can end up doing a write via driver private requests in sequential write zones. Signed-off-by: Pankaj Raghav <p.raghav@samsung.com> --- block/blk-mq.h | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/block/blk-mq.h b/block/blk-mq.h index 8ca453ac243d..005343df64ca 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -305,14 +305,15 @@ static inline void blk_mq_clear_mq_map(struct blk_mq_queue_map *qmap) * change can cause write BIO failures with zoned block devices as these * require sequential write patterns to zones. Prevent this from happening by * ignoring the plug state of a BIO issuing context if it is for a zoned block - * device and the BIO to plug is a write operation. + * device and the BIO to plug is not a read operation. + * * * Return current->plug if the bio can be plugged and NULL otherwise */ static inline struct blk_plug *blk_mq_plug( struct bio *bio) { - /* Zoned block device write operation case: do not plug the BIO */ - if (bdev_is_zoned(bio->bi_bdev) && op_is_write(bio_op(bio))) + /* Allow plugging only for read operations in zoned block devices */ + if (bdev_is_zoned(bio->bi_bdev) && bio_op(bio) != REQ_OP_READ) return NULL; /* -- 2.25.1 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH 1/2] block: modify blk_mq_plug() to allow only reads for zoned block devices 2022-09-25 18:53 ` [PATCH 1/2] block: modify blk_mq_plug() to allow only reads for " Pankaj Raghav @ 2022-09-25 22:55 ` Damien Le Moal 2022-09-26 14:37 ` Christoph Hellwig 1 sibling, 0 replies; 22+ messages in thread From: Damien Le Moal @ 2022-09-25 22:55 UTC (permalink / raw) To: Pankaj Raghav, linux-block, axboe; +Cc: gost.dev On 9/26/22 03:53, Pankaj Raghav wrote: > Modify blk_mq_plug() to allow plugging only for read operations in zoned > block devices as there are alternative IO paths in the linux block > layer which can end up doing a write via driver private requests in > sequential write zones. > > Signed-off-by: Pankaj Raghav <p.raghav@samsung.com> > --- > block/blk-mq.h | 7 ++++--- > 1 file changed, 4 insertions(+), 3 deletions(-) > > diff --git a/block/blk-mq.h b/block/blk-mq.h > index 8ca453ac243d..005343df64ca 100644 > --- a/block/blk-mq.h > +++ b/block/blk-mq.h > @@ -305,14 +305,15 @@ static inline void blk_mq_clear_mq_map(struct blk_mq_queue_map *qmap) > * change can cause write BIO failures with zoned block devices as these > * require sequential write patterns to zones. Prevent this from happening by > * ignoring the plug state of a BIO issuing context if it is for a zoned block > - * device and the BIO to plug is a write operation. > + * device and the BIO to plug is not a read operation. > + * > * > * Return current->plug if the bio can be plugged and NULL otherwise > */ > static inline struct blk_plug *blk_mq_plug( struct bio *bio) > { > - /* Zoned block device write operation case: do not plug the BIO */ > - if (bdev_is_zoned(bio->bi_bdev) && op_is_write(bio_op(bio))) > + /* Allow plugging only for read operations in zoned block devices */ nit: s/in/with > + if (bdev_is_zoned(bio->bi_bdev) && bio_op(bio) != REQ_OP_READ) > return NULL; > > /* Otherwise, looks good to me. Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> -- Damien Le Moal Western Digital Research ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/2] block: modify blk_mq_plug() to allow only reads for zoned block devices 2022-09-25 18:53 ` [PATCH 1/2] block: modify blk_mq_plug() to allow only reads for " Pankaj Raghav 2022-09-25 22:55 ` Damien Le Moal @ 2022-09-26 14:37 ` Christoph Hellwig 2022-09-26 14:40 ` Jens Axboe 1 sibling, 1 reply; 22+ messages in thread From: Christoph Hellwig @ 2022-09-26 14:37 UTC (permalink / raw) To: Pankaj Raghav; +Cc: linux-block, axboe, damien.lemoal, gost.dev On Sun, Sep 25, 2022 at 08:53:46PM +0200, Pankaj Raghav wrote: > Modify blk_mq_plug() to allow plugging only for read operations in zoned > block devices as there are alternative IO paths in the linux block > layer which can end up doing a write via driver private requests in > sequential write zones. We should be able to plug for all operations that are not REQ_OP_ZONE_APPEND just fine. I also really can't parse your commit log at all, what alternative paths are you talking about here? ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/2] block: modify blk_mq_plug() to allow only reads for zoned block devices 2022-09-26 14:37 ` Christoph Hellwig @ 2022-09-26 14:40 ` Jens Axboe 2022-09-26 14:43 ` Christoph Hellwig 0 siblings, 1 reply; 22+ messages in thread From: Jens Axboe @ 2022-09-26 14:40 UTC (permalink / raw) To: Christoph Hellwig, Pankaj Raghav; +Cc: linux-block, damien.lemoal, gost.dev On 9/26/22 8:37 AM, Christoph Hellwig wrote: > On Sun, Sep 25, 2022 at 08:53:46PM +0200, Pankaj Raghav wrote: >> Modify blk_mq_plug() to allow plugging only for read operations in zoned >> block devices as there are alternative IO paths in the linux block >> layer which can end up doing a write via driver private requests in >> sequential write zones. > > We should be able to plug for all operations that are not > REQ_OP_ZONE_APPEND just fine. Agree, I think we just want to make this about someone doing a series of appends. If you mix-and-match with passthrough you will have a bad time anyway. -- Jens Axboe ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/2] block: modify blk_mq_plug() to allow only reads for zoned block devices 2022-09-26 14:40 ` Jens Axboe @ 2022-09-26 14:43 ` Christoph Hellwig 2022-09-26 16:32 ` Jens Axboe 0 siblings, 1 reply; 22+ messages in thread From: Christoph Hellwig @ 2022-09-26 14:43 UTC (permalink / raw) To: Jens Axboe Cc: Christoph Hellwig, Pankaj Raghav, linux-block, damien.lemoal, gost.dev On Mon, Sep 26, 2022 at 08:40:54AM -0600, Jens Axboe wrote: > On 9/26/22 8:37 AM, Christoph Hellwig wrote: > > On Sun, Sep 25, 2022 at 08:53:46PM +0200, Pankaj Raghav wrote: > >> Modify blk_mq_plug() to allow plugging only for read operations in zoned > >> block devices as there are alternative IO paths in the linux block > >> layer which can end up doing a write via driver private requests in > >> sequential write zones. > > > > We should be able to plug for all operations that are not > > REQ_OP_ZONE_APPEND just fine. > > Agree, I think we just want to make this about someone doing a series > of appends. If you mix-and-match with passthrough you will have a bad > time anyway. Err, sorry - what I wrote about is compelte garbage. I initially wanted to say you can plug for REQ_OP_ZONE_APPEND just fine, and then realized that we also want various other ones that have the write bit set batched. So I suspect we really want to explicitly check for REQ_OP_WRITE here. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/2] block: modify blk_mq_plug() to allow only reads for zoned block devices 2022-09-26 14:43 ` Christoph Hellwig @ 2022-09-26 16:32 ` Jens Axboe 2022-09-26 19:20 ` Pankaj Raghav 0 siblings, 1 reply; 22+ messages in thread From: Jens Axboe @ 2022-09-26 16:32 UTC (permalink / raw) To: Christoph Hellwig; +Cc: Pankaj Raghav, linux-block, damien.lemoal, gost.dev On 9/26/22 8:43 AM, Christoph Hellwig wrote: > On Mon, Sep 26, 2022 at 08:40:54AM -0600, Jens Axboe wrote: >> On 9/26/22 8:37 AM, Christoph Hellwig wrote: >>> On Sun, Sep 25, 2022 at 08:53:46PM +0200, Pankaj Raghav wrote: >>>> Modify blk_mq_plug() to allow plugging only for read operations in zoned >>>> block devices as there are alternative IO paths in the linux block >>>> layer which can end up doing a write via driver private requests in >>>> sequential write zones. >>> >>> We should be able to plug for all operations that are not >>> REQ_OP_ZONE_APPEND just fine. >> >> Agree, I think we just want to make this about someone doing a series >> of appends. If you mix-and-match with passthrough you will have a bad >> time anyway. > > Err, sorry - what I wrote about is compelte garbage. I initially > wanted to say you can plug for REQ_OP_ZONE_APPEND just fine, and then > realized that we also want various other ones that have the write bit > set batched. So I suspect we really want to explicitly check for > REQ_OP_WRITE here. My memory was a bit hazy, since we have separate ops for the driver in/out, I think just checking for REQ_OP_WRITE is indeed the right choice. That's the single case we need to care about. -- Jens Axboe ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/2] block: modify blk_mq_plug() to allow only reads for zoned block devices 2022-09-26 16:32 ` Jens Axboe @ 2022-09-26 19:20 ` Pankaj Raghav 2022-09-26 19:25 ` Jens Axboe 0 siblings, 1 reply; 22+ messages in thread From: Pankaj Raghav @ 2022-09-26 19:20 UTC (permalink / raw) To: Jens Axboe, Christoph Hellwig; +Cc: linux-block, damien.lemoal, gost.dev On 2022-09-26 18:32, Jens Axboe wrote: > On 9/26/22 8:43 AM, Christoph Hellwig wrote: >> On Mon, Sep 26, 2022 at 08:40:54AM -0600, Jens Axboe wrote: >>> On 9/26/22 8:37 AM, Christoph Hellwig wrote: >>>> On Sun, Sep 25, 2022 at 08:53:46PM +0200, Pankaj Raghav wrote: >>>>> Modify blk_mq_plug() to allow plugging only for read operations in zoned >>>>> block devices as there are alternative IO paths in the linux block >>>>> layer which can end up doing a write via driver private requests in >>>>> sequential write zones. >>>> >>>> We should be able to plug for all operations that are not >>>> REQ_OP_ZONE_APPEND just fine. >>> >>> Agree, I think we just want to make this about someone doing a series >>> of appends. If you mix-and-match with passthrough you will have a bad >>> time anyway. >> >> Err, sorry - what I wrote about is compelte garbage. I initially >> wanted to say you can plug for REQ_OP_ZONE_APPEND just fine, and then >> realized that we also want various other ones that have the write bit >> set batched. So I suspect we really want to explicitly check for >> REQ_OP_WRITE here. > > My memory was a bit hazy, since we have separate ops for the driver > in/out, I think just checking for REQ_OP_WRITE is indeed the right > choice. That's the single case we need to care about. > Ah. You are right. I missed it as well. There is even a comment from Christoph: * - if the least significant bit is set transfers are TO the device * - if the least significant bit is not set transfers are FROM the device I guess the second patch should be enough to apply plugging when applicable for uring_cmd based nvme passthrough requests. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/2] block: modify blk_mq_plug() to allow only reads for zoned block devices 2022-09-26 19:20 ` Pankaj Raghav @ 2022-09-26 19:25 ` Jens Axboe 2022-09-27 15:20 ` Pankaj Raghav 0 siblings, 1 reply; 22+ messages in thread From: Jens Axboe @ 2022-09-26 19:25 UTC (permalink / raw) To: Pankaj Raghav, Christoph Hellwig; +Cc: linux-block, damien.lemoal, gost.dev On 9/26/22 1:20 PM, Pankaj Raghav wrote: > On 2022-09-26 18:32, Jens Axboe wrote: >> On 9/26/22 8:43 AM, Christoph Hellwig wrote: >>> On Mon, Sep 26, 2022 at 08:40:54AM -0600, Jens Axboe wrote: >>>> On 9/26/22 8:37 AM, Christoph Hellwig wrote: >>>>> On Sun, Sep 25, 2022 at 08:53:46PM +0200, Pankaj Raghav wrote: >>>>>> Modify blk_mq_plug() to allow plugging only for read operations in zoned >>>>>> block devices as there are alternative IO paths in the linux block >>>>>> layer which can end up doing a write via driver private requests in >>>>>> sequential write zones. >>>>> >>>>> We should be able to plug for all operations that are not >>>>> REQ_OP_ZONE_APPEND just fine. >>>> >>>> Agree, I think we just want to make this about someone doing a series >>>> of appends. If you mix-and-match with passthrough you will have a bad >>>> time anyway. >>> >>> Err, sorry - what I wrote about is compelte garbage. I initially >>> wanted to say you can plug for REQ_OP_ZONE_APPEND just fine, and then >>> realized that we also want various other ones that have the write bit >>> set batched. So I suspect we really want to explicitly check for >>> REQ_OP_WRITE here. >> >> My memory was a bit hazy, since we have separate ops for the driver >> in/out, I think just checking for REQ_OP_WRITE is indeed the right >> choice. That's the single case we need to care about. >> > Ah. You are right. I missed it as well. There is even a comment from > Christoph: > > * - if the least significant bit is set transfers are TO the device > * - if the least significant bit is not set transfers are FROM the device > > I guess the second patch should be enough to apply plugging when > applicable for uring_cmd based nvme passthrough requests. Do we even need the 2nd patch? If we're just doing passthrough for the blk_execute_nowait() API, then the condition should never trigger? If so, then it would be a cleanup just to ensure we're using a consistent API for getting the plug, which may be worthwhile to do separately for sure. -- Jens Axboe ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/2] block: modify blk_mq_plug() to allow only reads for zoned block devices 2022-09-26 19:25 ` Jens Axboe @ 2022-09-27 15:20 ` Pankaj Raghav 2022-09-27 16:04 ` Jens Axboe 0 siblings, 1 reply; 22+ messages in thread From: Pankaj Raghav @ 2022-09-27 15:20 UTC (permalink / raw) To: Jens Axboe, Christoph Hellwig; +Cc: linux-block, damien.lemoal, gost.dev >> I guess the second patch should be enough to apply plugging when >> applicable for uring_cmd based nvme passthrough requests. > > Do we even need the 2nd patch? If we're just doing passthrough for the > blk_execute_nowait() API, then the condition should never trigger? I think this was the question I raised in your first version of the series. If we do a NVMe write using the passthrough interface, then we will be using REQ_OP_DRV_OUT op, which is: REQ_OP_DRV_OUT = (__force blk_opf_t)35, // write bit is set The condition in blk_mq_plug() will trigger as we only check if it is a _write_ (op & (__force blk_opf_t)1) to the device. Am I missing something? > If so, then it would be a cleanup just to ensure we're using a consistent > API for getting the plug, which may be worthwhile to do separately for > sure. > -- Pankaj ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/2] block: modify blk_mq_plug() to allow only reads for zoned block devices 2022-09-27 15:20 ` Pankaj Raghav @ 2022-09-27 16:04 ` Jens Axboe 2022-09-27 16:51 ` Christoph Hellwig 0 siblings, 1 reply; 22+ messages in thread From: Jens Axboe @ 2022-09-27 16:04 UTC (permalink / raw) To: Pankaj Raghav, Christoph Hellwig; +Cc: linux-block, damien.lemoal, gost.dev On 9/27/22 9:20 AM, Pankaj Raghav wrote: >>> I guess the second patch should be enough to apply plugging when >>> applicable for uring_cmd based nvme passthrough requests. >> >> Do we even need the 2nd patch? If we're just doing passthrough for the >> blk_execute_nowait() API, then the condition should never trigger? > > I think this was the question I raised in your first version of the series. > > If we do a NVMe write using the passthrough interface, then we will be > using REQ_OP_DRV_OUT op, which is: > > REQ_OP_DRV_OUT = (__force blk_opf_t)35, // write bit is set > > The condition in blk_mq_plug() will trigger as we only check if it is a > _write_ (op & (__force blk_opf_t)1) to the device. Am I missing something? Ah yes, good point. We used to have this notion of 'fs' request, don't think we do anymore. Because it really should just be: if (zoned && (op & REQ_OP_WRITE) && fs_request) return NULL; for that condition imho. I guess we could make it: if (zoned && (op & REQ_OP_WRITE) && !(op & REQ_OP_DRV_OUT)) return NULL; -- Jens Axboe ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/2] block: modify blk_mq_plug() to allow only reads for zoned block devices 2022-09-27 16:04 ` Jens Axboe @ 2022-09-27 16:51 ` Christoph Hellwig 2022-09-27 16:52 ` Jens Axboe 0 siblings, 1 reply; 22+ messages in thread From: Christoph Hellwig @ 2022-09-27 16:51 UTC (permalink / raw) To: Jens Axboe Cc: Pankaj Raghav, Christoph Hellwig, linux-block, damien.lemoal, gost.dev On Tue, Sep 27, 2022 at 10:04:19AM -0600, Jens Axboe wrote: > Ah yes, good point. We used to have this notion of 'fs' request, don't > think we do anymore. Because it really should just be: A fs request is a !passthrough request. > if (zoned && (op & REQ_OP_WRITE) && fs_request) > return NULL; > > for that condition imho. I guess we could make it: > > if (zoned && (op & REQ_OP_WRITE) && !(op & REQ_OP_DRV_OUT)) > return NULL; Well, the only opcodes we do zone locking for is REQ_OP_WRITE and REQ_OP_WRITE_ZEROES. So this should be: if (zoned && (op == REQ_OP_WRITE || op == REQ_OP_WRITE_ZEROES)) return NULL; ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/2] block: modify blk_mq_plug() to allow only reads for zoned block devices 2022-09-27 16:51 ` Christoph Hellwig @ 2022-09-27 16:52 ` Jens Axboe 2022-09-27 23:07 ` Damien Le Moal 2022-09-28 11:57 ` Pankaj Raghav 0 siblings, 2 replies; 22+ messages in thread From: Jens Axboe @ 2022-09-27 16:52 UTC (permalink / raw) To: Christoph Hellwig; +Cc: Pankaj Raghav, linux-block, damien.lemoal, gost.dev On 9/27/22 10:51 AM, Christoph Hellwig wrote: > On Tue, Sep 27, 2022 at 10:04:19AM -0600, Jens Axboe wrote: >> Ah yes, good point. We used to have this notion of 'fs' request, don't >> think we do anymore. Because it really should just be: > > A fs request is a !passthrough request. Right, that's the condition I made below too. >> if (zoned && (op & REQ_OP_WRITE) && fs_request) >> return NULL; >> >> for that condition imho. I guess we could make it: >> >> if (zoned && (op & REQ_OP_WRITE) && !(op & REQ_OP_DRV_OUT)) >> return NULL; > > Well, the only opcodes we do zone locking for is REQ_OP_WRITE and > REQ_OP_WRITE_ZEROES. So this should be: > > if (zoned && (op == REQ_OP_WRITE || op == REQ_OP_WRITE_ZEROES)) > return NULL; I'd rather just make it explicit and use that. Pankaj, do you want to spin a v2 with that? -- Jens Axboe ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/2] block: modify blk_mq_plug() to allow only reads for zoned block devices 2022-09-27 16:52 ` Jens Axboe @ 2022-09-27 23:07 ` Damien Le Moal 2022-09-27 23:10 ` Damien Le Moal 2022-09-27 23:12 ` Jens Axboe 2022-09-28 11:57 ` Pankaj Raghav 1 sibling, 2 replies; 22+ messages in thread From: Damien Le Moal @ 2022-09-27 23:07 UTC (permalink / raw) To: Jens Axboe, Christoph Hellwig; +Cc: Pankaj Raghav, linux-block, gost.dev On 9/28/22 01:52, Jens Axboe wrote: > On 9/27/22 10:51 AM, Christoph Hellwig wrote: >> On Tue, Sep 27, 2022 at 10:04:19AM -0600, Jens Axboe wrote: >>> Ah yes, good point. We used to have this notion of 'fs' request, don't >>> think we do anymore. Because it really should just be: >> >> A fs request is a !passthrough request. > > Right, that's the condition I made below too. > >>> if (zoned && (op & REQ_OP_WRITE) && fs_request) >>> return NULL; >>> >>> for that condition imho. I guess we could make it: >>> >>> if (zoned && (op & REQ_OP_WRITE) && !(op & REQ_OP_DRV_OUT)) >>> return NULL; >> >> Well, the only opcodes we do zone locking for is REQ_OP_WRITE and >> REQ_OP_WRITE_ZEROES. So this should be: >> >> if (zoned && (op == REQ_OP_WRITE || op == REQ_OP_WRITE_ZEROES)) >> return NULL; > > I'd rather just make it explicit and use that. Pankaj, do you want > to spin a v2 with that? It would be nice to reuse the bio equivalent of blk_req_needs_zone_write_lock(). The test would be: if (bio_needs_zone_write_locking()) return NULL; With something like: static inline bool bio_needs_zone_write_locking() { if (!bdev_is_zoned(bio->bi_bdev)) return false; switch (bio_op(bio)) { case REQ_OP_WRITE_ZEROES: case REQ_OP_WRITE: return true; default: return false; } } Which also has the advantage that going forward, we could refine this to plug writes to conventional zones (as these can be plugged). -- Damien Le Moal Western Digital Research ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/2] block: modify blk_mq_plug() to allow only reads for zoned block devices 2022-09-27 23:07 ` Damien Le Moal @ 2022-09-27 23:10 ` Damien Le Moal 2022-09-27 23:13 ` Jens Axboe 2022-09-27 23:12 ` Jens Axboe 1 sibling, 1 reply; 22+ messages in thread From: Damien Le Moal @ 2022-09-27 23:10 UTC (permalink / raw) To: Jens Axboe, Christoph Hellwig; +Cc: Pankaj Raghav, linux-block, gost.dev On 9/28/22 08:07, Damien Le Moal wrote: > On 9/28/22 01:52, Jens Axboe wrote: >> On 9/27/22 10:51 AM, Christoph Hellwig wrote: >>> On Tue, Sep 27, 2022 at 10:04:19AM -0600, Jens Axboe wrote: >>>> Ah yes, good point. We used to have this notion of 'fs' request, don't >>>> think we do anymore. Because it really should just be: >>> >>> A fs request is a !passthrough request. >> >> Right, that's the condition I made below too. >> >>>> if (zoned && (op & REQ_OP_WRITE) && fs_request) >>>> return NULL; >>>> >>>> for that condition imho. I guess we could make it: >>>> >>>> if (zoned && (op & REQ_OP_WRITE) && !(op & REQ_OP_DRV_OUT)) >>>> return NULL; >>> >>> Well, the only opcodes we do zone locking for is REQ_OP_WRITE and >>> REQ_OP_WRITE_ZEROES. So this should be: >>> >>> if (zoned && (op == REQ_OP_WRITE || op == REQ_OP_WRITE_ZEROES)) >>> return NULL; >> >> I'd rather just make it explicit and use that. Pankaj, do you want >> to spin a v2 with that? > > It would be nice to reuse the bio equivalent of > blk_req_needs_zone_write_lock(). > > The test would be: > > if (bio_needs_zone_write_locking()) > return NULL; Note that we could also add a "IS_ENABLED(CONFIG_BLK_DEV_ZONED) &&" to the condition or stub the helper to have this hunk disappear for the !CONFIG_BLK_DEV_ZONED case. > > With something like: > > static inline bool bio_needs_zone_write_locking() > { > if (!bdev_is_zoned(bio->bi_bdev)) > return false; > > switch (bio_op(bio)) { > case REQ_OP_WRITE_ZEROES: > > case REQ_OP_WRITE: > > return true; > default: > > return false; > > } > } > > Which also has the advantage that going forward, we could refine this to > plug writes to conventional zones (as these can be plugged). > -- Damien Le Moal Western Digital Research ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/2] block: modify blk_mq_plug() to allow only reads for zoned block devices 2022-09-27 23:10 ` Damien Le Moal @ 2022-09-27 23:13 ` Jens Axboe 0 siblings, 0 replies; 22+ messages in thread From: Jens Axboe @ 2022-09-27 23:13 UTC (permalink / raw) To: Damien Le Moal, Christoph Hellwig; +Cc: Pankaj Raghav, linux-block, gost.dev On 9/27/22 5:10 PM, Damien Le Moal wrote: > On 9/28/22 08:07, Damien Le Moal wrote: >> On 9/28/22 01:52, Jens Axboe wrote: >>> On 9/27/22 10:51 AM, Christoph Hellwig wrote: >>>> On Tue, Sep 27, 2022 at 10:04:19AM -0600, Jens Axboe wrote: >>>>> Ah yes, good point. We used to have this notion of 'fs' request, don't >>>>> think we do anymore. Because it really should just be: >>>> >>>> A fs request is a !passthrough request. >>> >>> Right, that's the condition I made below too. >>> >>>>> if (zoned && (op & REQ_OP_WRITE) && fs_request) >>>>> return NULL; >>>>> >>>>> for that condition imho. I guess we could make it: >>>>> >>>>> if (zoned && (op & REQ_OP_WRITE) && !(op & REQ_OP_DRV_OUT)) >>>>> return NULL; >>>> >>>> Well, the only opcodes we do zone locking for is REQ_OP_WRITE and >>>> REQ_OP_WRITE_ZEROES. So this should be: >>>> >>>> if (zoned && (op == REQ_OP_WRITE || op == REQ_OP_WRITE_ZEROES)) >>>> return NULL; >>> >>> I'd rather just make it explicit and use that. Pankaj, do you want >>> to spin a v2 with that? >> >> It would be nice to reuse the bio equivalent of >> blk_req_needs_zone_write_lock(). >> >> The test would be: >> >> if (bio_needs_zone_write_locking()) >> return NULL; > > Note that we could also add a "IS_ENABLED(CONFIG_BLK_DEV_ZONED) &&" to > the condition or stub the helper to have this hunk disappear for the > !CONFIG_BLK_DEV_ZONED case. Indeed, that would be nice. -- Jens Axboe ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/2] block: modify blk_mq_plug() to allow only reads for zoned block devices 2022-09-27 23:07 ` Damien Le Moal 2022-09-27 23:10 ` Damien Le Moal @ 2022-09-27 23:12 ` Jens Axboe 2022-09-27 23:35 ` Damien Le Moal 1 sibling, 1 reply; 22+ messages in thread From: Jens Axboe @ 2022-09-27 23:12 UTC (permalink / raw) To: Damien Le Moal, Christoph Hellwig; +Cc: Pankaj Raghav, linux-block, gost.dev On 9/27/22 5:07 PM, Damien Le Moal wrote: > On 9/28/22 01:52, Jens Axboe wrote: >> On 9/27/22 10:51 AM, Christoph Hellwig wrote: >>> On Tue, Sep 27, 2022 at 10:04:19AM -0600, Jens Axboe wrote: >>>> Ah yes, good point. We used to have this notion of 'fs' request, don't >>>> think we do anymore. Because it really should just be: >>> >>> A fs request is a !passthrough request. >> >> Right, that's the condition I made below too. >> >>>> if (zoned && (op & REQ_OP_WRITE) && fs_request) >>>> return NULL; >>>> >>>> for that condition imho. I guess we could make it: >>>> >>>> if (zoned && (op & REQ_OP_WRITE) && !(op & REQ_OP_DRV_OUT)) >>>> return NULL; >>> >>> Well, the only opcodes we do zone locking for is REQ_OP_WRITE and >>> REQ_OP_WRITE_ZEROES. So this should be: >>> >>> if (zoned && (op == REQ_OP_WRITE || op == REQ_OP_WRITE_ZEROES)) >>> return NULL; >> >> I'd rather just make it explicit and use that. Pankaj, do you want >> to spin a v2 with that? > > It would be nice to reuse the bio equivalent of > blk_req_needs_zone_write_lock(). > > The test would be: > > if (bio_needs_zone_write_locking()) > return NULL; > > With something like: > > static inline bool bio_needs_zone_write_locking() > { > if (!bdev_is_zoned(bio->bi_bdev)) > return false; > > switch (bio_op(bio)) { > case REQ_OP_WRITE_ZEROES: > > case REQ_OP_WRITE: > > return true; > default: > > return false; > > } > } I'd be fine with that (using a shared helper), but let's please just make it: static inline bool op_is_zoned_write(bdev, op) { if (!bdev_is_zoned(bio->bi_bdev)) return false; return op == REQ_OP_WRITE_ZEROES || op == REQ_OP_WRITE; } and avoid a switch for this basic case and name it a bit more logically too. Not married to the above name, but the helper should not imply anything about zone locking. That's for the caller. -- Jens Axboe ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/2] block: modify blk_mq_plug() to allow only reads for zoned block devices 2022-09-27 23:12 ` Jens Axboe @ 2022-09-27 23:35 ` Damien Le Moal 0 siblings, 0 replies; 22+ messages in thread From: Damien Le Moal @ 2022-09-27 23:35 UTC (permalink / raw) To: Jens Axboe, Christoph Hellwig; +Cc: Pankaj Raghav, linux-block, gost.dev On 9/28/22 08:12, Jens Axboe wrote: > On 9/27/22 5:07 PM, Damien Le Moal wrote: >> On 9/28/22 01:52, Jens Axboe wrote: >>> On 9/27/22 10:51 AM, Christoph Hellwig wrote: >>>> On Tue, Sep 27, 2022 at 10:04:19AM -0600, Jens Axboe wrote: >>>>> Ah yes, good point. We used to have this notion of 'fs' request, don't >>>>> think we do anymore. Because it really should just be: >>>> >>>> A fs request is a !passthrough request. >>> >>> Right, that's the condition I made below too. >>> >>>>> if (zoned && (op & REQ_OP_WRITE) && fs_request) >>>>> return NULL; >>>>> >>>>> for that condition imho. I guess we could make it: >>>>> >>>>> if (zoned && (op & REQ_OP_WRITE) && !(op & REQ_OP_DRV_OUT)) >>>>> return NULL; >>>> >>>> Well, the only opcodes we do zone locking for is REQ_OP_WRITE and >>>> REQ_OP_WRITE_ZEROES. So this should be: >>>> >>>> if (zoned && (op == REQ_OP_WRITE || op == REQ_OP_WRITE_ZEROES)) >>>> return NULL; >>> >>> I'd rather just make it explicit and use that. Pankaj, do you want >>> to spin a v2 with that? >> >> It would be nice to reuse the bio equivalent of >> blk_req_needs_zone_write_lock(). >> >> The test would be: >> >> if (bio_needs_zone_write_locking()) >> return NULL; >> >> With something like: >> >> static inline bool bio_needs_zone_write_locking() >> { >> if (!bdev_is_zoned(bio->bi_bdev)) >> return false; >> >> switch (bio_op(bio)) { >> case REQ_OP_WRITE_ZEROES: >> >> case REQ_OP_WRITE: >> >> return true; >> default: >> >> return false; >> >> } >> } > > I'd be fine with that (using a shared helper), but let's please just > make it: > > static inline bool op_is_zoned_write(bdev, op) > { > if (!bdev_is_zoned(bio->bi_bdev)) > return false; > > return op == REQ_OP_WRITE_ZEROES || op == REQ_OP_WRITE; Works for me. Nit: should have REQ_OP_WRITE first as that is the most common case. > } > > and avoid a switch for this basic case and name it a bit more logically > too. Not married to the above name, but the helper should not imply > anything about zone locking. That's for the caller. blk_req_needs_zone_write_lock() would become: bool blk_req_needs_zone_write_lock(struct request *rq) { if (blk_rq_is_passthrough(rq)) return false; if (!rq->q->disk->seq_zones_wlock) return false; return op_is_zoned_write(rq->q->disk->part0, req_op(rq)); } -- Damien Le Moal Western Digital Research ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/2] block: modify blk_mq_plug() to allow only reads for zoned block devices 2022-09-27 16:52 ` Jens Axboe 2022-09-27 23:07 ` Damien Le Moal @ 2022-09-28 11:57 ` Pankaj Raghav 2022-09-28 22:19 ` Damien Le Moal 1 sibling, 1 reply; 22+ messages in thread From: Pankaj Raghav @ 2022-09-28 11:57 UTC (permalink / raw) To: Jens Axboe, Christoph Hellwig, damien.lemoal; +Cc: linux-block, gost.dev On 2022-09-27 18:52, Jens Axboe wrote: >> Well, the only opcodes we do zone locking for is REQ_OP_WRITE and >> REQ_OP_WRITE_ZEROES. So this should be: >> >> if (zoned && (op == REQ_OP_WRITE || op == REQ_OP_WRITE_ZEROES)) >> return NULL; > > I'd rather just make it explicit and use that. Pankaj, do you want > to spin a v2 with that? > Based on all the suggestions: block: adapt blk_mq_plug() to not plug for writes that require a zone lock The current implementation of blk_mq_plug() disables plugging for all operations that involves a transfer to the device as we just check if the last bit in op_is_write() function. Modify blk_mq_plug() to disable plugging only for REQ_OP_WRITE and REQ_OP_WRITE_ZEROS as they might require a zone lock. Suggested-by: Christoph Hellwig <hch@infradead.org> Suggested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Pankaj Raghav <p.raghav@samsung.com> diff --git a/block/blk-mq.h b/block/blk-mq.h index 8ca453ac243d..297289cdd521 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -312,7 +312,8 @@ static inline void blk_mq_clear_mq_map(struct blk_mq_queue_map *qmap) static inline struct blk_plug *blk_mq_plug( struct bio *bio) { /* Zoned block device write operation case: do not plug the BIO */ - if (bdev_is_zoned(bio->bi_bdev) && op_is_write(bio_op(bio))) + if (IS_ENABLED(CONFIG_BLK_DEV_ZONED) && + blk_op_is_zoned_write(bio->bi_bdev, bio_op(bio))) return NULL; /* diff --git a/block/blk-zoned.c b/block/blk-zoned.c index a264621d4905..fa926424edb6 100644 --- a/block/blk-zoned.c +++ b/block/blk-zoned.c @@ -63,13 +63,10 @@ bool blk_req_needs_zone_write_lock(struct request *rq) if (!rq->q->disk->seq_zones_wlock) return false; - switch (req_op(rq)) { - case REQ_OP_WRITE_ZEROES: - case REQ_OP_WRITE: + if (blk_op_is_zoned_write(rq->q->disk->part0, req_op(rq))) return blk_rq_zone_is_seq(rq); - default: - return false; - } + + return false; } EXPORT_SYMBOL_GPL(blk_req_needs_zone_write_lock); diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 8038c5fbde40..719025028fa4 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -1300,6 +1300,15 @@ static inline bool bdev_is_zoned(struct block_device *bdev) return false; } +static inline bool blk_op_is_zoned_write(struct block_device *bdev, + blk_opf_t op) +{ + if (!bdev_is_zoned(bdev)) + return false; + + return op == REQ_OP_WRITE || op == REQ_OP_WRITE_ZEROES; +} + static inline sector_t bdev_zone_sectors(struct block_device *bdev) { struct request_queue *q = bdev_get_queue(bdev); Does this look fine? ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH 1/2] block: modify blk_mq_plug() to allow only reads for zoned block devices 2022-09-28 11:57 ` Pankaj Raghav @ 2022-09-28 22:19 ` Damien Le Moal 0 siblings, 0 replies; 22+ messages in thread From: Damien Le Moal @ 2022-09-28 22:19 UTC (permalink / raw) To: Pankaj Raghav, Jens Axboe, Christoph Hellwig; +Cc: linux-block, gost.dev On 2022/09/28 20:57, Pankaj Raghav wrote: > On 2022-09-27 18:52, Jens Axboe wrote: >>> Well, the only opcodes we do zone locking for is REQ_OP_WRITE and >>> REQ_OP_WRITE_ZEROES. So this should be: >>> >>> if (zoned && (op == REQ_OP_WRITE || op == REQ_OP_WRITE_ZEROES)) >>> return NULL; >> >> I'd rather just make it explicit and use that. Pankaj, do you want >> to spin a v2 with that? >> > > Based on all the suggestions: > > block: adapt blk_mq_plug() to not plug for writes that require a zone lock > > The current implementation of blk_mq_plug() disables plugging for all > operations that involves a transfer to the device as we just check if > the last bit in op_is_write() function. > > Modify blk_mq_plug() to disable plugging only for REQ_OP_WRITE and > REQ_OP_WRITE_ZEROS as they might require a zone lock. > > Suggested-by: Christoph Hellwig <hch@infradead.org> > Suggested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> > Signed-off-by: Pankaj Raghav <p.raghav@samsung.com> > > diff --git a/block/blk-mq.h b/block/blk-mq.h > index 8ca453ac243d..297289cdd521 100644 > --- a/block/blk-mq.h > +++ b/block/blk-mq.h > @@ -312,7 +312,8 @@ static inline void blk_mq_clear_mq_map(struct blk_mq_queue_map *qmap) > static inline struct blk_plug *blk_mq_plug( struct bio *bio) > { > /* Zoned block device write operation case: do not plug the BIO */ > - if (bdev_is_zoned(bio->bi_bdev) && op_is_write(bio_op(bio))) > + if (IS_ENABLED(CONFIG_BLK_DEV_ZONED) && > + blk_op_is_zoned_write(bio->bi_bdev, bio_op(bio))) > return NULL; > > /* > diff --git a/block/blk-zoned.c b/block/blk-zoned.c > index a264621d4905..fa926424edb6 100644 > --- a/block/blk-zoned.c > +++ b/block/blk-zoned.c > @@ -63,13 +63,10 @@ bool blk_req_needs_zone_write_lock(struct request *rq) > if (!rq->q->disk->seq_zones_wlock) > return false; > > - switch (req_op(rq)) { > - case REQ_OP_WRITE_ZEROES: > - case REQ_OP_WRITE: > + if (blk_op_is_zoned_write(rq->q->disk->part0, req_op(rq))) > return blk_rq_zone_is_seq(rq); > - default: > - return false; > - } > + > + return false; > } > EXPORT_SYMBOL_GPL(blk_req_needs_zone_write_lock); > > diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h > index 8038c5fbde40..719025028fa4 100644 > --- a/include/linux/blkdev.h > +++ b/include/linux/blkdev.h > @@ -1300,6 +1300,15 @@ static inline bool bdev_is_zoned(struct block_device *bdev) > return false; > } > > +static inline bool blk_op_is_zoned_write(struct block_device *bdev, > + blk_opf_t op) To be consistent, I personally would prefer bdev_op_is_zoned_write() as the name here (the first argument is a bdev). > +{ > + if (!bdev_is_zoned(bdev)) > + return false; > + > + return op == REQ_OP_WRITE || op == REQ_OP_WRITE_ZEROES; > +} > + > static inline sector_t bdev_zone_sectors(struct block_device *bdev) > { > struct request_queue *q = bdev_get_queue(bdev); > > > Does this look fine? -- Damien Le Moal Western Digital Research ^ permalink raw reply [flat|nested] 22+ messages in thread
[parent not found: <CGME20220925185351eucas1p1e0c37396c09611509c0b18bdcdeddfe1@eucas1p1.samsung.com>]
* [PATCH 2/2] block: use blk_mq_plug() in blk_execute_rq_nowait() [not found] ` <CGME20220925185351eucas1p1e0c37396c09611509c0b18bdcdeddfe1@eucas1p1.samsung.com> @ 2022-09-25 18:53 ` Pankaj Raghav 2022-09-25 22:56 ` Damien Le Moal 0 siblings, 1 reply; 22+ messages in thread From: Pankaj Raghav @ 2022-09-25 18:53 UTC (permalink / raw) To: linux-block, axboe; +Cc: damien.lemoal, gost.dev, Pankaj Raghav blk_execute_rq_nowait() function mainly was used by low-level drivers such as NVMe to submit one-off passthrough requests. However, recently introduced uring-cmd based io-passthrough also uses the same function to submit io requests. As the plugging support is coming to io-passthrough[1], use the blk_mq_plug() helper to ensure plugging is not used in all scenarios. [1] https://lore.kernel.org/linux-block/20220922182805.96173-1-axboe@kernel.dk/ Signed-off-by: Pankaj Raghav <p.raghav@samsung.com> --- block/blk-mq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index c11949d66163..840541c1ab40 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1213,7 +1213,7 @@ void blk_execute_rq_nowait(struct request *rq, bool at_head) WARN_ON(!blk_rq_is_passthrough(rq)); blk_account_io_start(rq); - if (current->plug) + if (blk_mq_plug(rq->bio)) blk_add_rq_to_plug(current->plug, rq); else blk_mq_sched_insert_request(rq, at_head, true, false); -- 2.25.1 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] block: use blk_mq_plug() in blk_execute_rq_nowait() 2022-09-25 18:53 ` [PATCH 2/2] block: use blk_mq_plug() in blk_execute_rq_nowait() Pankaj Raghav @ 2022-09-25 22:56 ` Damien Le Moal 0 siblings, 0 replies; 22+ messages in thread From: Damien Le Moal @ 2022-09-25 22:56 UTC (permalink / raw) To: Pankaj Raghav, linux-block, axboe; +Cc: gost.dev On 9/26/22 03:53, Pankaj Raghav wrote: > blk_execute_rq_nowait() function mainly was used by low-level drivers > such as NVMe to submit one-off passthrough requests. However, recently > introduced uring-cmd based io-passthrough also uses the same function to > submit io requests. > > As the plugging support is coming to io-passthrough[1], use the > blk_mq_plug() helper to ensure plugging is not used in all scenarios. > > [1] > https://lore.kernel.org/linux-block/20220922182805.96173-1-axboe@kernel.dk/ > > Signed-off-by: Pankaj Raghav <p.raghav@samsung.com> > --- > block/blk-mq.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/block/blk-mq.c b/block/blk-mq.c > index c11949d66163..840541c1ab40 100644 > --- a/block/blk-mq.c > +++ b/block/blk-mq.c > @@ -1213,7 +1213,7 @@ void blk_execute_rq_nowait(struct request *rq, bool at_head) > WARN_ON(!blk_rq_is_passthrough(rq)); > > blk_account_io_start(rq); > - if (current->plug) > + if (blk_mq_plug(rq->bio)) > blk_add_rq_to_plug(current->plug, rq); > else > blk_mq_sched_insert_request(rq, at_head, true, false); Looks good to me. Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> -- Damien Le Moal Western Digital Research ^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2022-09-28 22:19 UTC | newest] Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <CGME20220925185349eucas1p1dc689bac64668ca038ba8646c44fd580@eucas1p1.samsung.com> 2022-09-25 18:53 ` [PATCH 0/2] enable plugging only for reads in zoned block devices Pankaj Raghav [not found] ` <CGME20220925185350eucas1p1fc354429027a88de7e548a3a4529b4ef@eucas1p1.samsung.com> 2022-09-25 18:53 ` [PATCH 1/2] block: modify blk_mq_plug() to allow only reads for " Pankaj Raghav 2022-09-25 22:55 ` Damien Le Moal 2022-09-26 14:37 ` Christoph Hellwig 2022-09-26 14:40 ` Jens Axboe 2022-09-26 14:43 ` Christoph Hellwig 2022-09-26 16:32 ` Jens Axboe 2022-09-26 19:20 ` Pankaj Raghav 2022-09-26 19:25 ` Jens Axboe 2022-09-27 15:20 ` Pankaj Raghav 2022-09-27 16:04 ` Jens Axboe 2022-09-27 16:51 ` Christoph Hellwig 2022-09-27 16:52 ` Jens Axboe 2022-09-27 23:07 ` Damien Le Moal 2022-09-27 23:10 ` Damien Le Moal 2022-09-27 23:13 ` Jens Axboe 2022-09-27 23:12 ` Jens Axboe 2022-09-27 23:35 ` Damien Le Moal 2022-09-28 11:57 ` Pankaj Raghav 2022-09-28 22:19 ` Damien Le Moal [not found] ` <CGME20220925185351eucas1p1e0c37396c09611509c0b18bdcdeddfe1@eucas1p1.samsung.com> 2022-09-25 18:53 ` [PATCH 2/2] block: use blk_mq_plug() in blk_execute_rq_nowait() Pankaj Raghav 2022-09-25 22:56 ` Damien Le Moal
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).