* [PATCH v12 0/3] bio: control bio max size [not found] <CGME20210604052157epcas1p2e5eebb52d08b06174696290e11fdd5a4@epcas1p2.samsung.com> @ 2021-06-04 5:03 ` Changheun Lee [not found] ` <CGME20210604052159epcas1p4370bee98aad882ab335dda1565db94fb@epcas1p4.samsung.com> ` (4 more replies) 0 siblings, 5 replies; 19+ messages in thread From: Changheun Lee @ 2021-06-04 5:03 UTC (permalink / raw) To: Johannes.Thumshirn, alex_y_xu, asml.silence, axboe, bgoncalv, jejb, martin.petersen, cang, avri.altman, alim.akhtar, bvanassche, damien.lemoal, gregkh, hch, jaegeuk, linux-block, linux-scsi, linux-kernel, ming.lei, osandov, patchwork-bot, tj, tom.leiming, yi.zhang Cc: jisoo2146.oh, junho89.kim, mj0123.lee, seunghwan.hyun, sookwan7.kim, woosung2.lee, yt0928.kim, Changheun Lee bio size can grow up to 4GB after muli-page bvec has been enabled. But sometimes large size of bio would lead to inefficient behaviors. Control of bio max size will be helpful to improve inefficiency. blk_queue_max_bio_bytes() is added to enable be set the max_bio_bytes in each driver layer. And max_bio_bytes sysfs is added to show current max_bio_bytes for each request queue. bio size can be controlled via max_bio_bytes. Changheun Lee (3): bio: control bio max size blk-sysfs: add max_bio_bytes ufs: set max_bio_bytes with queue max sectors Documentation/ABI/testing/sysfs-block | 10 ++++++++++ Documentation/block/queue-sysfs.rst | 7 +++++++ block/bio.c | 17 ++++++++++++++--- block/blk-settings.c | 19 +++++++++++++++++++ block/blk-sysfs.c | 7 +++++++ drivers/scsi/ufs/ufshcd.c | 5 +++++ include/linux/bio.h | 4 +++- include/linux/blkdev.h | 3 +++ 8 files changed, 68 insertions(+), 4 deletions(-) -- 2.29.0 ^ permalink raw reply [flat|nested] 19+ messages in thread
[parent not found: <CGME20210604052159epcas1p4370bee98aad882ab335dda1565db94fb@epcas1p4.samsung.com>]
* [PATCH v12 1/3] bio: control bio max size [not found] ` <CGME20210604052159epcas1p4370bee98aad882ab335dda1565db94fb@epcas1p4.samsung.com> @ 2021-06-04 5:03 ` Changheun Lee 2021-06-04 7:24 ` Damien Le Moal 2021-06-07 6:32 ` Christoph Hellwig 0 siblings, 2 replies; 19+ messages in thread From: Changheun Lee @ 2021-06-04 5:03 UTC (permalink / raw) To: Johannes.Thumshirn, alex_y_xu, asml.silence, axboe, bgoncalv, jejb, martin.petersen, cang, avri.altman, alim.akhtar, bvanassche, damien.lemoal, gregkh, hch, jaegeuk, linux-block, linux-scsi, linux-kernel, ming.lei, osandov, patchwork-bot, tj, tom.leiming, yi.zhang Cc: jisoo2146.oh, junho89.kim, mj0123.lee, seunghwan.hyun, sookwan7.kim, woosung2.lee, yt0928.kim, Changheun Lee bio size can grow up to 4GB after muli-page bvec has been enabled. But sometimes large size of bio would lead to inefficient behaviors. Control of bio max size will be helpful to improve inefficiency. Below is a example for inefficient behaviours. In case of large chunk direct I/O, - 32MB chunk read in user space - all pages for 32MB would be merged to a bio structure if the pages physical addresses are contiguous. It makes some delay to submit until merge complete. bio max size should be limited to a proper size. When 32MB chunk read with direct I/O option is coming from userspace, kernel behavior is below now in do_direct_IO() loop. It's timeline. | bio merge for 32MB. total 8,192 pages are merged. | total elapsed time is over 2ms. |------------------ ... ----------------------->| | 8,192 pages merged a bio. | at this time, first bio submit is done. | 1 bio is split to 32 read request and issue. |---------------> |---------------> |---------------> ...... |---------------> |--------------->| total 19ms elapsed to complete 32MB read done from device. | If bio max size is limited with 1MB, behavior is changed below. | bio merge for 1MB. 256 pages are merged for each bio. | total 32 bio will be made. | total elapsed time is over 2ms. it's same. | but, first bio submit timing is fast. about 100us. |--->|--->|--->|---> ... -->|--->|--->|--->|--->| | 256 pages merged a bio. | at this time, first bio submit is done. | and 1 read request is issued for 1 bio. |---------------> |---------------> |---------------> ...... |---------------> |--------------->| total 17ms elapsed to complete 32MB read done from device. | As a result, read request issue timing is faster if bio max size is limited. Current kernel behavior with multipage bvec, super large bio can be created. And it lead to delay first I/O request issue. Signed-off-by: Changheun Lee <nanich.lee@samsung.com> --- block/bio.c | 17 ++++++++++++++--- block/blk-settings.c | 19 +++++++++++++++++++ include/linux/bio.h | 4 +++- include/linux/blkdev.h | 3 +++ 4 files changed, 39 insertions(+), 4 deletions(-) diff --git a/block/bio.c b/block/bio.c index 44205dfb6b60..73b673f1684e 100644 --- a/block/bio.c +++ b/block/bio.c @@ -255,6 +255,13 @@ void bio_init(struct bio *bio, struct bio_vec *table, } EXPORT_SYMBOL(bio_init); +unsigned int bio_max_bytes(struct bio *bio) +{ + struct block_device *bdev = bio->bi_bdev; + + return bdev ? bdev->bd_disk->queue->limits.max_bio_bytes : UINT_MAX; +} + /** * bio_reset - reinitialize a bio * @bio: bio to reset @@ -866,7 +873,7 @@ bool __bio_try_merge_page(struct bio *bio, struct page *page, struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1]; if (page_is_mergeable(bv, page, len, off, same_page)) { - if (bio->bi_iter.bi_size > UINT_MAX - len) { + if (bio->bi_iter.bi_size > bio_max_bytes(bio) - len) { *same_page = false; return false; } @@ -995,6 +1002,7 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter) { unsigned short nr_pages = bio->bi_max_vecs - bio->bi_vcnt; unsigned short entries_left = bio->bi_max_vecs - bio->bi_vcnt; + unsigned int bytes_left = bio_max_bytes(bio) - bio->bi_iter.bi_size; struct bio_vec *bv = bio->bi_io_vec + bio->bi_vcnt; struct page **pages = (struct page **)bv; bool same_page = false; @@ -1010,7 +1018,8 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter) BUILD_BUG_ON(PAGE_PTRS_PER_BVEC < 2); pages += entries_left * (PAGE_PTRS_PER_BVEC - 1); - size = iov_iter_get_pages(iter, pages, LONG_MAX, nr_pages, &offset); + size = iov_iter_get_pages(iter, pages, bytes_left, nr_pages, + &offset); if (unlikely(size <= 0)) return size ? size : -EFAULT; @@ -1038,6 +1047,7 @@ static int __bio_iov_append_get_pages(struct bio *bio, struct iov_iter *iter) { unsigned short nr_pages = bio->bi_max_vecs - bio->bi_vcnt; unsigned short entries_left = bio->bi_max_vecs - bio->bi_vcnt; + unsigned int bytes_left = bio_max_bytes(bio) - bio->bi_iter.bi_size; struct request_queue *q = bio->bi_bdev->bd_disk->queue; unsigned int max_append_sectors = queue_max_zone_append_sectors(q); struct bio_vec *bv = bio->bi_io_vec + bio->bi_vcnt; @@ -1058,7 +1068,8 @@ static int __bio_iov_append_get_pages(struct bio *bio, struct iov_iter *iter) BUILD_BUG_ON(PAGE_PTRS_PER_BVEC < 2); pages += entries_left * (PAGE_PTRS_PER_BVEC - 1); - size = iov_iter_get_pages(iter, pages, LONG_MAX, nr_pages, &offset); + size = iov_iter_get_pages(iter, pages, bytes_left, nr_pages, + &offset); if (unlikely(size <= 0)) return size ? size : -EFAULT; diff --git a/block/blk-settings.c b/block/blk-settings.c index 902c40d67120..e270e31519a1 100644 --- a/block/blk-settings.c +++ b/block/blk-settings.c @@ -32,6 +32,7 @@ EXPORT_SYMBOL_GPL(blk_queue_rq_timeout); */ void blk_set_default_limits(struct queue_limits *lim) { + lim->max_bio_bytes = UINT_MAX; lim->max_segments = BLK_MAX_SEGMENTS; lim->max_discard_segments = 1; lim->max_integrity_segments = 0; @@ -100,6 +101,24 @@ void blk_queue_bounce_limit(struct request_queue *q, enum blk_bounce bounce) } EXPORT_SYMBOL(blk_queue_bounce_limit); +/** + * blk_queue_max_bio_bytes - set bio max size for queue + * @q: the request queue for the device + * @bytes : bio max bytes to be set + * + * Description: + * Set proper bio max size to optimize queue operating. + **/ +void blk_queue_max_bio_bytes(struct request_queue *q, unsigned int bytes) +{ + struct queue_limits *limits = &q->limits; + unsigned int max_bio_bytes = round_up(bytes, PAGE_SIZE); + + limits->max_bio_bytes = max_t(unsigned int, max_bio_bytes, + BIO_MAX_VECS * PAGE_SIZE); +} +EXPORT_SYMBOL(blk_queue_max_bio_bytes); + /** * blk_queue_max_hw_sectors - set max sectors for a request for this queue * @q: the request queue for the device diff --git a/include/linux/bio.h b/include/linux/bio.h index a0b4cfdf62a4..3959cc1a0652 100644 --- a/include/linux/bio.h +++ b/include/linux/bio.h @@ -106,6 +106,8 @@ static inline void *bio_data(struct bio *bio) return NULL; } +extern unsigned int bio_max_bytes(struct bio *bio); + /** * bio_full - check if the bio is full * @bio: bio to check @@ -119,7 +121,7 @@ static inline bool bio_full(struct bio *bio, unsigned len) if (bio->bi_vcnt >= bio->bi_max_vecs) return true; - if (bio->bi_iter.bi_size > UINT_MAX - len) + if (bio->bi_iter.bi_size > bio_max_bytes(bio) - len) return true; return false; diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 1255823b2bc0..861888501fc0 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -326,6 +326,8 @@ enum blk_bounce { }; struct queue_limits { + unsigned int max_bio_bytes; + enum blk_bounce bounce; unsigned long seg_boundary_mask; unsigned long virt_boundary_mask; @@ -1132,6 +1134,7 @@ extern void blk_abort_request(struct request *); * Access functions for manipulating queue properties */ extern void blk_cleanup_queue(struct request_queue *); +extern void blk_queue_max_bio_bytes(struct request_queue *, unsigned int); void blk_queue_bounce_limit(struct request_queue *q, enum blk_bounce limit); extern void blk_queue_max_hw_sectors(struct request_queue *, unsigned int); extern void blk_queue_chunk_sectors(struct request_queue *, unsigned int); -- 2.29.0 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH v12 1/3] bio: control bio max size 2021-06-04 5:03 ` [PATCH v12 1/3] " Changheun Lee @ 2021-06-04 7:24 ` Damien Le Moal [not found] ` <CGME20210604075331epcas1p13bb57f9ddfc7b112dec1ba8cf40fdc74@epcas1p1.samsung.com> 2021-06-07 6:32 ` Christoph Hellwig 1 sibling, 1 reply; 19+ messages in thread From: Damien Le Moal @ 2021-06-04 7:24 UTC (permalink / raw) To: Changheun Lee, Johannes Thumshirn, alex_y_xu, asml.silence, axboe, bgoncalv, jejb, martin.petersen, cang, Avri Altman, alim.akhtar, bvanassche, gregkh, hch, jaegeuk, linux-block, linux-scsi, linux-kernel, ming.lei, osandov, patchwork-bot, tj, tom.leiming, yi.zhang Cc: jisoo2146.oh, junho89.kim, mj0123.lee, seunghwan.hyun, sookwan7.kim, woosung2.lee, yt0928.kim On 2021/06/04 14:22, Changheun Lee wrote: > bio size can grow up to 4GB after muli-page bvec has been enabled. > But sometimes large size of bio would lead to inefficient behaviors. > Control of bio max size will be helpful to improve inefficiency. > > Below is a example for inefficient behaviours. > In case of large chunk direct I/O, - 32MB chunk read in user space - > all pages for 32MB would be merged to a bio structure if the pages > physical addresses are contiguous. It makes some delay to submit > until merge complete. bio max size should be limited to a proper size. > > When 32MB chunk read with direct I/O option is coming from userspace, > kernel behavior is below now in do_direct_IO() loop. It's timeline. > > | bio merge for 32MB. total 8,192 pages are merged. > | total elapsed time is over 2ms. > |------------------ ... ----------------------->| > | 8,192 pages merged a bio. > | at this time, first bio submit is done. > | 1 bio is split to 32 read request and issue. > |---------------> > |---------------> > |---------------> > ...... > |---------------> > |--------------->| > total 19ms elapsed to complete 32MB read done from device. | > > If bio max size is limited with 1MB, behavior is changed below. > > | bio merge for 1MB. 256 pages are merged for each bio. > | total 32 bio will be made. > | total elapsed time is over 2ms. it's same. > | but, first bio submit timing is fast. about 100us. > |--->|--->|--->|---> ... -->|--->|--->|--->|--->| > | 256 pages merged a bio. > | at this time, first bio submit is done. > | and 1 read request is issued for 1 bio. > |---------------> > |---------------> > |---------------> > ...... > |---------------> > |--------------->| > total 17ms elapsed to complete 32MB read done from device. | > > As a result, read request issue timing is faster if bio max size is limited. > Current kernel behavior with multipage bvec, super large bio can be created. > And it lead to delay first I/O request issue. > > Signed-off-by: Changheun Lee <nanich.lee@samsung.com> > --- > block/bio.c | 17 ++++++++++++++--- > block/blk-settings.c | 19 +++++++++++++++++++ > include/linux/bio.h | 4 +++- > include/linux/blkdev.h | 3 +++ > 4 files changed, 39 insertions(+), 4 deletions(-) > > diff --git a/block/bio.c b/block/bio.c > index 44205dfb6b60..73b673f1684e 100644 > --- a/block/bio.c > +++ b/block/bio.c > @@ -255,6 +255,13 @@ void bio_init(struct bio *bio, struct bio_vec *table, > } > EXPORT_SYMBOL(bio_init); > > +unsigned int bio_max_bytes(struct bio *bio) > +{ > + struct block_device *bdev = bio->bi_bdev; > + > + return bdev ? bdev->bd_disk->queue->limits.max_bio_bytes : UINT_MAX; > +} unsigned int bio_max_bytes(struct bio *bio) { struct block_device *bdev = bio->bi_bdev; if (!bdev) return UINT_MAX; return bdev->bd_disk->queue->limits.max_bio_bytes; } is a lot more readable... Also, I remember there was some problems with bd_disk possibly being null. Was that fixed ? > + > /** > * bio_reset - reinitialize a bio > * @bio: bio to reset > @@ -866,7 +873,7 @@ bool __bio_try_merge_page(struct bio *bio, struct page *page, > struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1]; > > if (page_is_mergeable(bv, page, len, off, same_page)) { > - if (bio->bi_iter.bi_size > UINT_MAX - len) { > + if (bio->bi_iter.bi_size > bio_max_bytes(bio) - len) { > *same_page = false; > return false; > } > @@ -995,6 +1002,7 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter) > { > unsigned short nr_pages = bio->bi_max_vecs - bio->bi_vcnt; > unsigned short entries_left = bio->bi_max_vecs - bio->bi_vcnt; > + unsigned int bytes_left = bio_max_bytes(bio) - bio->bi_iter.bi_size; > struct bio_vec *bv = bio->bi_io_vec + bio->bi_vcnt; > struct page **pages = (struct page **)bv; > bool same_page = false; > @@ -1010,7 +1018,8 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter) > BUILD_BUG_ON(PAGE_PTRS_PER_BVEC < 2); > pages += entries_left * (PAGE_PTRS_PER_BVEC - 1); > > - size = iov_iter_get_pages(iter, pages, LONG_MAX, nr_pages, &offset); > + size = iov_iter_get_pages(iter, pages, bytes_left, nr_pages, > + &offset); > if (unlikely(size <= 0)) > return size ? size : -EFAULT; > > @@ -1038,6 +1047,7 @@ static int __bio_iov_append_get_pages(struct bio *bio, struct iov_iter *iter) > { > unsigned short nr_pages = bio->bi_max_vecs - bio->bi_vcnt; > unsigned short entries_left = bio->bi_max_vecs - bio->bi_vcnt; > + unsigned int bytes_left = bio_max_bytes(bio) - bio->bi_iter.bi_size; > struct request_queue *q = bio->bi_bdev->bd_disk->queue; > unsigned int max_append_sectors = queue_max_zone_append_sectors(q); > struct bio_vec *bv = bio->bi_io_vec + bio->bi_vcnt; > @@ -1058,7 +1068,8 @@ static int __bio_iov_append_get_pages(struct bio *bio, struct iov_iter *iter) > BUILD_BUG_ON(PAGE_PTRS_PER_BVEC < 2); > pages += entries_left * (PAGE_PTRS_PER_BVEC - 1); > > - size = iov_iter_get_pages(iter, pages, LONG_MAX, nr_pages, &offset); > + size = iov_iter_get_pages(iter, pages, bytes_left, nr_pages, > + &offset); > if (unlikely(size <= 0)) > return size ? size : -EFAULT; > > diff --git a/block/blk-settings.c b/block/blk-settings.c > index 902c40d67120..e270e31519a1 100644 > --- a/block/blk-settings.c > +++ b/block/blk-settings.c > @@ -32,6 +32,7 @@ EXPORT_SYMBOL_GPL(blk_queue_rq_timeout); > */ > void blk_set_default_limits(struct queue_limits *lim) > { > + lim->max_bio_bytes = UINT_MAX; > lim->max_segments = BLK_MAX_SEGMENTS; > lim->max_discard_segments = 1; > lim->max_integrity_segments = 0; > @@ -100,6 +101,24 @@ void blk_queue_bounce_limit(struct request_queue *q, enum blk_bounce bounce) > } > EXPORT_SYMBOL(blk_queue_bounce_limit); > > +/** > + * blk_queue_max_bio_bytes - set bio max size for queue blk_queue_max_bio_bytes - set max_bio_bytes queue limit And then you can drop the not very useful description. > + * @q: the request queue for the device > + * @bytes : bio max bytes to be set > + * > + * Description: > + * Set proper bio max size to optimize queue operating. > + **/ > +void blk_queue_max_bio_bytes(struct request_queue *q, unsigned int bytes) > +{ > + struct queue_limits *limits = &q->limits; > + unsigned int max_bio_bytes = round_up(bytes, PAGE_SIZE); > + > + limits->max_bio_bytes = max_t(unsigned int, max_bio_bytes, > + BIO_MAX_VECS * PAGE_SIZE); > +} > +EXPORT_SYMBOL(blk_queue_max_bio_bytes); Setting of the stacked limits is still missing. > + > /** > * blk_queue_max_hw_sectors - set max sectors for a request for this queue > * @q: the request queue for the device > diff --git a/include/linux/bio.h b/include/linux/bio.h > index a0b4cfdf62a4..3959cc1a0652 100644 > --- a/include/linux/bio.h > +++ b/include/linux/bio.h > @@ -106,6 +106,8 @@ static inline void *bio_data(struct bio *bio) > return NULL; > } > > +extern unsigned int bio_max_bytes(struct bio *bio); > + > /** > * bio_full - check if the bio is full > * @bio: bio to check > @@ -119,7 +121,7 @@ static inline bool bio_full(struct bio *bio, unsigned len) > if (bio->bi_vcnt >= bio->bi_max_vecs) > return true; > > - if (bio->bi_iter.bi_size > UINT_MAX - len) > + if (bio->bi_iter.bi_size > bio_max_bytes(bio) - len) > return true; > > return false; > diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h > index 1255823b2bc0..861888501fc0 100644 > --- a/include/linux/blkdev.h > +++ b/include/linux/blkdev.h > @@ -326,6 +326,8 @@ enum blk_bounce { > }; > > struct queue_limits { > + unsigned int max_bio_bytes; > + > enum blk_bounce bounce; > unsigned long seg_boundary_mask; > unsigned long virt_boundary_mask; > @@ -1132,6 +1134,7 @@ extern void blk_abort_request(struct request *); > * Access functions for manipulating queue properties > */ > extern void blk_cleanup_queue(struct request_queue *); > +extern void blk_queue_max_bio_bytes(struct request_queue *, unsigned int); > void blk_queue_bounce_limit(struct request_queue *q, enum blk_bounce limit); > extern void blk_queue_max_hw_sectors(struct request_queue *, unsigned int); > extern void blk_queue_chunk_sectors(struct request_queue *, unsigned int); > -- Damien Le Moal Western Digital Research ^ permalink raw reply [flat|nested] 19+ messages in thread
[parent not found: <CGME20210604075331epcas1p13bb57f9ddfc7b112dec1ba8cf40fdc74@epcas1p1.samsung.com>]
* Re: [PATCH v12 1/3] bio: control bio max size [not found] ` <CGME20210604075331epcas1p13bb57f9ddfc7b112dec1ba8cf40fdc74@epcas1p1.samsung.com> @ 2021-06-04 7:34 ` Changheun Lee 2021-06-04 8:01 ` Damien Le Moal 2021-06-04 14:52 ` Bart Van Assche 0 siblings, 2 replies; 19+ messages in thread From: Changheun Lee @ 2021-06-04 7:34 UTC (permalink / raw) To: damien.lemoal Cc: Avri.Altman, Johannes.Thumshirn, alex_y_xu, alim.akhtar, asml.silence, axboe, bgoncalv, bvanassche, cang, gregkh, hch, jaegeuk, jejb, jisoo2146.oh, junho89.kim, linux-block, linux-kernel, linux-scsi, martin.petersen, ming.lei, mj0123.lee, nanich.lee, osandov, patchwork-bot, seunghwan.hyun, sookwan7.kim, tj, tom.leiming, woosung2.lee, yi.zhang, yt0928.kim > On 2021/06/04 14:22, Changheun Lee wrote: > > bio size can grow up to 4GB after muli-page bvec has been enabled. > > But sometimes large size of bio would lead to inefficient behaviors. > > Control of bio max size will be helpful to improve inefficiency. > > > > Below is a example for inefficient behaviours. > > In case of large chunk direct I/O, - 32MB chunk read in user space - > > all pages for 32MB would be merged to a bio structure if the pages > > physical addresses are contiguous. It makes some delay to submit > > until merge complete. bio max size should be limited to a proper size. > > > > When 32MB chunk read with direct I/O option is coming from userspace, > > kernel behavior is below now in do_direct_IO() loop. It's timeline. > > > > | bio merge for 32MB. total 8,192 pages are merged. > > | total elapsed time is over 2ms. > > |------------------ ... ----------------------->| > > | 8,192 pages merged a bio. > > | at this time, first bio submit is done. > > | 1 bio is split to 32 read request and issue. > > |---------------> > > |---------------> > > |---------------> > > ...... > > |---------------> > > |--------------->| > > total 19ms elapsed to complete 32MB read done from device. | > > > > If bio max size is limited with 1MB, behavior is changed below. > > > > | bio merge for 1MB. 256 pages are merged for each bio. > > | total 32 bio will be made. > > | total elapsed time is over 2ms. it's same. > > | but, first bio submit timing is fast. about 100us. > > |--->|--->|--->|---> ... -->|--->|--->|--->|--->| > > | 256 pages merged a bio. > > | at this time, first bio submit is done. > > | and 1 read request is issued for 1 bio. > > |---------------> > > |---------------> > > |---------------> > > ...... > > |---------------> > > |--------------->| > > total 17ms elapsed to complete 32MB read done from device. | > > > > As a result, read request issue timing is faster if bio max size is limited. > > Current kernel behavior with multipage bvec, super large bio can be created. > > And it lead to delay first I/O request issue. > > > > Signed-off-by: Changheun Lee <nanich.lee@samsung.com> > > --- > > block/bio.c | 17 ++++++++++++++--- > > block/blk-settings.c | 19 +++++++++++++++++++ > > include/linux/bio.h | 4 +++- > > include/linux/blkdev.h | 3 +++ > > 4 files changed, 39 insertions(+), 4 deletions(-) > > > > diff --git a/block/bio.c b/block/bio.c > > index 44205dfb6b60..73b673f1684e 100644 > > --- a/block/bio.c > > +++ b/block/bio.c > > @@ -255,6 +255,13 @@ void bio_init(struct bio *bio, struct bio_vec *table, > > } > > EXPORT_SYMBOL(bio_init); > > > > +unsigned int bio_max_bytes(struct bio *bio) > > +{ > > + struct block_device *bdev = bio->bi_bdev; > > + > > + return bdev ? bdev->bd_disk->queue->limits.max_bio_bytes : UINT_MAX; > > +} > > unsigned int bio_max_bytes(struct bio *bio) > { > struct block_device *bdev = bio->bi_bdev; > > if (!bdev) > return UINT_MAX; > return bdev->bd_disk->queue->limits.max_bio_bytes; > } > > is a lot more readable... > Also, I remember there was some problems with bd_disk possibly being null. Was > that fixed ? Thank you for review. But I'd like current style, and it's readable enough now I think. Null of bd_disk was just suspicion. bd_disk is not null if bdev is not null. > > > + > > /** > > * bio_reset - reinitialize a bio > > * @bio: bio to reset > > @@ -866,7 +873,7 @@ bool __bio_try_merge_page(struct bio *bio, struct page *page, > > struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1]; > > > > if (page_is_mergeable(bv, page, len, off, same_page)) { > > - if (bio->bi_iter.bi_size > UINT_MAX - len) { > > + if (bio->bi_iter.bi_size > bio_max_bytes(bio) - len) { > > *same_page = false; > > return false; > > } > > @@ -995,6 +1002,7 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter) > > { > > unsigned short nr_pages = bio->bi_max_vecs - bio->bi_vcnt; > > unsigned short entries_left = bio->bi_max_vecs - bio->bi_vcnt; > > + unsigned int bytes_left = bio_max_bytes(bio) - bio->bi_iter.bi_size; > > struct bio_vec *bv = bio->bi_io_vec + bio->bi_vcnt; > > struct page **pages = (struct page **)bv; > > bool same_page = false; > > @@ -1010,7 +1018,8 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter) > > BUILD_BUG_ON(PAGE_PTRS_PER_BVEC < 2); > > pages += entries_left * (PAGE_PTRS_PER_BVEC - 1); > > > > - size = iov_iter_get_pages(iter, pages, LONG_MAX, nr_pages, &offset); > > + size = iov_iter_get_pages(iter, pages, bytes_left, nr_pages, > > + &offset); > > if (unlikely(size <= 0)) > > return size ? size : -EFAULT; > > > > @@ -1038,6 +1047,7 @@ static int __bio_iov_append_get_pages(struct bio *bio, struct iov_iter *iter) > > { > > unsigned short nr_pages = bio->bi_max_vecs - bio->bi_vcnt; > > unsigned short entries_left = bio->bi_max_vecs - bio->bi_vcnt; > > + unsigned int bytes_left = bio_max_bytes(bio) - bio->bi_iter.bi_size; > > struct request_queue *q = bio->bi_bdev->bd_disk->queue; > > unsigned int max_append_sectors = queue_max_zone_append_sectors(q); > > struct bio_vec *bv = bio->bi_io_vec + bio->bi_vcnt; > > @@ -1058,7 +1068,8 @@ static int __bio_iov_append_get_pages(struct bio *bio, struct iov_iter *iter) > > BUILD_BUG_ON(PAGE_PTRS_PER_BVEC < 2); > > pages += entries_left * (PAGE_PTRS_PER_BVEC - 1); > > > > - size = iov_iter_get_pages(iter, pages, LONG_MAX, nr_pages, &offset); > > + size = iov_iter_get_pages(iter, pages, bytes_left, nr_pages, > > + &offset); > > if (unlikely(size <= 0)) > > return size ? size : -EFAULT; > > > > diff --git a/block/blk-settings.c b/block/blk-settings.c > > index 902c40d67120..e270e31519a1 100644 > > --- a/block/blk-settings.c > > +++ b/block/blk-settings.c > > @@ -32,6 +32,7 @@ EXPORT_SYMBOL_GPL(blk_queue_rq_timeout); > > */ > > void blk_set_default_limits(struct queue_limits *lim) > > { > > + lim->max_bio_bytes = UINT_MAX; > > lim->max_segments = BLK_MAX_SEGMENTS; > > lim->max_discard_segments = 1; > > lim->max_integrity_segments = 0; > > @@ -100,6 +101,24 @@ void blk_queue_bounce_limit(struct request_queue *q, enum blk_bounce bounce) > > } > > EXPORT_SYMBOL(blk_queue_bounce_limit); > > > > +/** > > + * blk_queue_max_bio_bytes - set bio max size for queue > > blk_queue_max_bio_bytes - set max_bio_bytes queue limit > > And then you can drop the not very useful description. OK. I'll. > > > + * @q: the request queue for the device > > + * @bytes : bio max bytes to be set > > + * > > + * Description: > > + * Set proper bio max size to optimize queue operating. > > + **/ > > +void blk_queue_max_bio_bytes(struct request_queue *q, unsigned int bytes) > > +{ > > + struct queue_limits *limits = &q->limits; > > + unsigned int max_bio_bytes = round_up(bytes, PAGE_SIZE); > > + > > + limits->max_bio_bytes = max_t(unsigned int, max_bio_bytes, > > + BIO_MAX_VECS * PAGE_SIZE); > > +} > > +EXPORT_SYMBOL(blk_queue_max_bio_bytes); > > Setting of the stacked limits is still missing. max_bio_bytes for stacked device is just default(UINT_MAX) in this patch. Because blk_set_stacking_limits() call blk_set_default_limits(). I'll work continue for stacked device after this patchowork. > > > + > > /** > > * blk_queue_max_hw_sectors - set max sectors for a request for this queue > > * @q: the request queue for the device > > diff --git a/include/linux/bio.h b/include/linux/bio.h > > index a0b4cfdf62a4..3959cc1a0652 100644 > > --- a/include/linux/bio.h > > +++ b/include/linux/bio.h > > @@ -106,6 +106,8 @@ static inline void *bio_data(struct bio *bio) > > return NULL; > > } > > > > +extern unsigned int bio_max_bytes(struct bio *bio); > > + > > /** > > * bio_full - check if the bio is full > > * @bio: bio to check > > @@ -119,7 +121,7 @@ static inline bool bio_full(struct bio *bio, unsigned len) > > if (bio->bi_vcnt >= bio->bi_max_vecs) > > return true; > > > > - if (bio->bi_iter.bi_size > UINT_MAX - len) > > + if (bio->bi_iter.bi_size > bio_max_bytes(bio) - len) > > return true; > > > > return false; > > diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h > > index 1255823b2bc0..861888501fc0 100644 > > --- a/include/linux/blkdev.h > > +++ b/include/linux/blkdev.h > > @@ -326,6 +326,8 @@ enum blk_bounce { > > }; > > > > struct queue_limits { > > + unsigned int max_bio_bytes; > > + > > enum blk_bounce bounce; > > unsigned long seg_boundary_mask; > > unsigned long virt_boundary_mask; > > @@ -1132,6 +1134,7 @@ extern void blk_abort_request(struct request *); > > * Access functions for manipulating queue properties > > */ > > extern void blk_cleanup_queue(struct request_queue *); > > +extern void blk_queue_max_bio_bytes(struct request_queue *, unsigned int); > > void blk_queue_bounce_limit(struct request_queue *q, enum blk_bounce limit); > > extern void blk_queue_max_hw_sectors(struct request_queue *, unsigned int); > > extern void blk_queue_chunk_sectors(struct request_queue *, unsigned int); > > > > > -- > Damien Le Moal > Western Digital Research Thank you, Changheun Lee ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v12 1/3] bio: control bio max size 2021-06-04 7:34 ` Changheun Lee @ 2021-06-04 8:01 ` Damien Le Moal [not found] ` <CGME20210604094209epcas1p26368dd18011bb2761529432cf2656a9f@epcas1p2.samsung.com> 2021-06-04 14:52 ` Bart Van Assche 1 sibling, 1 reply; 19+ messages in thread From: Damien Le Moal @ 2021-06-04 8:01 UTC (permalink / raw) To: Changheun Lee Cc: Avri Altman, Johannes Thumshirn, alex_y_xu, alim.akhtar, asml.silence, axboe, bgoncalv, bvanassche, cang, gregkh, hch, jaegeuk, jejb, jisoo2146.oh, junho89.kim, linux-block, linux-kernel, linux-scsi, martin.petersen, ming.lei, mj0123.lee, osandov, patchwork-bot, seunghwan.hyun, sookwan7.kim, tj, tom.leiming, woosung2.lee, yi.zhang, yt0928.kim On 2021/06/04 16:53, Changheun Lee wrote: >> On 2021/06/04 14:22, Changheun Lee wrote: >>> bio size can grow up to 4GB after muli-page bvec has been enabled. >>> But sometimes large size of bio would lead to inefficient behaviors. >>> Control of bio max size will be helpful to improve inefficiency. >>> >>> Below is a example for inefficient behaviours. >>> In case of large chunk direct I/O, - 32MB chunk read in user space - >>> all pages for 32MB would be merged to a bio structure if the pages >>> physical addresses are contiguous. It makes some delay to submit >>> until merge complete. bio max size should be limited to a proper size. >>> >>> When 32MB chunk read with direct I/O option is coming from userspace, >>> kernel behavior is below now in do_direct_IO() loop. It's timeline. >>> >>> | bio merge for 32MB. total 8,192 pages are merged. >>> | total elapsed time is over 2ms. >>> |------------------ ... ----------------------->| >>> | 8,192 pages merged a bio. >>> | at this time, first bio submit is done. >>> | 1 bio is split to 32 read request and issue. >>> |---------------> >>> |---------------> >>> |---------------> >>> ...... >>> |---------------> >>> |--------------->| >>> total 19ms elapsed to complete 32MB read done from device. | >>> >>> If bio max size is limited with 1MB, behavior is changed below. >>> >>> | bio merge for 1MB. 256 pages are merged for each bio. >>> | total 32 bio will be made. >>> | total elapsed time is over 2ms. it's same. >>> | but, first bio submit timing is fast. about 100us. >>> |--->|--->|--->|---> ... -->|--->|--->|--->|--->| >>> | 256 pages merged a bio. >>> | at this time, first bio submit is done. >>> | and 1 read request is issued for 1 bio. >>> |---------------> >>> |---------------> >>> |---------------> >>> ...... >>> |---------------> >>> |--------------->| >>> total 17ms elapsed to complete 32MB read done from device. | >>> >>> As a result, read request issue timing is faster if bio max size is limited. >>> Current kernel behavior with multipage bvec, super large bio can be created. >>> And it lead to delay first I/O request issue. >>> >>> Signed-off-by: Changheun Lee <nanich.lee@samsung.com> >>> --- >>> block/bio.c | 17 ++++++++++++++--- >>> block/blk-settings.c | 19 +++++++++++++++++++ >>> include/linux/bio.h | 4 +++- >>> include/linux/blkdev.h | 3 +++ >>> 4 files changed, 39 insertions(+), 4 deletions(-) >>> >>> diff --git a/block/bio.c b/block/bio.c >>> index 44205dfb6b60..73b673f1684e 100644 >>> --- a/block/bio.c >>> +++ b/block/bio.c >>> @@ -255,6 +255,13 @@ void bio_init(struct bio *bio, struct bio_vec *table, >>> } >>> EXPORT_SYMBOL(bio_init); >>> >>> +unsigned int bio_max_bytes(struct bio *bio) >>> +{ >>> + struct block_device *bdev = bio->bi_bdev; >>> + >>> + return bdev ? bdev->bd_disk->queue->limits.max_bio_bytes : UINT_MAX; >>> +} >> >> unsigned int bio_max_bytes(struct bio *bio) >> { >> struct block_device *bdev = bio->bi_bdev; >> >> if (!bdev) >> return UINT_MAX; >> return bdev->bd_disk->queue->limits.max_bio_bytes; >> } >> >> is a lot more readable... >> Also, I remember there was some problems with bd_disk possibly being null. Was >> that fixed ? > > Thank you for review. But I'd like current style, and it's readable enough > now I think. Null of bd_disk was just suspicion. bd_disk is not null if bdev > is not null. > >> >>> + >>> /** >>> * bio_reset - reinitialize a bio >>> * @bio: bio to reset >>> @@ -866,7 +873,7 @@ bool __bio_try_merge_page(struct bio *bio, struct page *page, >>> struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1]; >>> >>> if (page_is_mergeable(bv, page, len, off, same_page)) { >>> - if (bio->bi_iter.bi_size > UINT_MAX - len) { >>> + if (bio->bi_iter.bi_size > bio_max_bytes(bio) - len) { >>> *same_page = false; >>> return false; >>> } >>> @@ -995,6 +1002,7 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter) >>> { >>> unsigned short nr_pages = bio->bi_max_vecs - bio->bi_vcnt; >>> unsigned short entries_left = bio->bi_max_vecs - bio->bi_vcnt; >>> + unsigned int bytes_left = bio_max_bytes(bio) - bio->bi_iter.bi_size; >>> struct bio_vec *bv = bio->bi_io_vec + bio->bi_vcnt; >>> struct page **pages = (struct page **)bv; >>> bool same_page = false; >>> @@ -1010,7 +1018,8 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter) >>> BUILD_BUG_ON(PAGE_PTRS_PER_BVEC < 2); >>> pages += entries_left * (PAGE_PTRS_PER_BVEC - 1); >>> >>> - size = iov_iter_get_pages(iter, pages, LONG_MAX, nr_pages, &offset); >>> + size = iov_iter_get_pages(iter, pages, bytes_left, nr_pages, >>> + &offset); >>> if (unlikely(size <= 0)) >>> return size ? size : -EFAULT; >>> >>> @@ -1038,6 +1047,7 @@ static int __bio_iov_append_get_pages(struct bio *bio, struct iov_iter *iter) >>> { >>> unsigned short nr_pages = bio->bi_max_vecs - bio->bi_vcnt; >>> unsigned short entries_left = bio->bi_max_vecs - bio->bi_vcnt; >>> + unsigned int bytes_left = bio_max_bytes(bio) - bio->bi_iter.bi_size; >>> struct request_queue *q = bio->bi_bdev->bd_disk->queue; >>> unsigned int max_append_sectors = queue_max_zone_append_sectors(q); >>> struct bio_vec *bv = bio->bi_io_vec + bio->bi_vcnt; >>> @@ -1058,7 +1068,8 @@ static int __bio_iov_append_get_pages(struct bio *bio, struct iov_iter *iter) >>> BUILD_BUG_ON(PAGE_PTRS_PER_BVEC < 2); >>> pages += entries_left * (PAGE_PTRS_PER_BVEC - 1); >>> >>> - size = iov_iter_get_pages(iter, pages, LONG_MAX, nr_pages, &offset); >>> + size = iov_iter_get_pages(iter, pages, bytes_left, nr_pages, >>> + &offset); >>> if (unlikely(size <= 0)) >>> return size ? size : -EFAULT; >>> >>> diff --git a/block/blk-settings.c b/block/blk-settings.c >>> index 902c40d67120..e270e31519a1 100644 >>> --- a/block/blk-settings.c >>> +++ b/block/blk-settings.c >>> @@ -32,6 +32,7 @@ EXPORT_SYMBOL_GPL(blk_queue_rq_timeout); >>> */ >>> void blk_set_default_limits(struct queue_limits *lim) >>> { >>> + lim->max_bio_bytes = UINT_MAX; >>> lim->max_segments = BLK_MAX_SEGMENTS; >>> lim->max_discard_segments = 1; >>> lim->max_integrity_segments = 0; >>> @@ -100,6 +101,24 @@ void blk_queue_bounce_limit(struct request_queue *q, enum blk_bounce bounce) >>> } >>> EXPORT_SYMBOL(blk_queue_bounce_limit); >>> >>> +/** >>> + * blk_queue_max_bio_bytes - set bio max size for queue >> >> blk_queue_max_bio_bytes - set max_bio_bytes queue limit >> >> And then you can drop the not very useful description. > > OK. I'll. > >> >>> + * @q: the request queue for the device >>> + * @bytes : bio max bytes to be set >>> + * >>> + * Description: >>> + * Set proper bio max size to optimize queue operating. >>> + **/ >>> +void blk_queue_max_bio_bytes(struct request_queue *q, unsigned int bytes) >>> +{ >>> + struct queue_limits *limits = &q->limits; >>> + unsigned int max_bio_bytes = round_up(bytes, PAGE_SIZE); >>> + >>> + limits->max_bio_bytes = max_t(unsigned int, max_bio_bytes, >>> + BIO_MAX_VECS * PAGE_SIZE); >>> +} >>> +EXPORT_SYMBOL(blk_queue_max_bio_bytes); >> >> Setting of the stacked limits is still missing. > > max_bio_bytes for stacked device is just default(UINT_MAX) in this patch. > Because blk_set_stacking_limits() call blk_set_default_limits().> I'll work continue for stacked device after this patchowork. Why ? Without that added now, anybody using this performance fix will see no benefits if a device mapper is used. The stacking limit should be super simple. In blk_stack_limits(), just add: t->max_bio_bytes = min(t->max_bio_bytes, b->max_bio_bytes); > >> >>> + >>> /** >>> * blk_queue_max_hw_sectors - set max sectors for a request for this queue >>> * @q: the request queue for the device >>> diff --git a/include/linux/bio.h b/include/linux/bio.h >>> index a0b4cfdf62a4..3959cc1a0652 100644 >>> --- a/include/linux/bio.h >>> +++ b/include/linux/bio.h >>> @@ -106,6 +106,8 @@ static inline void *bio_data(struct bio *bio) >>> return NULL; >>> } >>> >>> +extern unsigned int bio_max_bytes(struct bio *bio); >>> + >>> /** >>> * bio_full - check if the bio is full >>> * @bio: bio to check >>> @@ -119,7 +121,7 @@ static inline bool bio_full(struct bio *bio, unsigned len) >>> if (bio->bi_vcnt >= bio->bi_max_vecs) >>> return true; >>> >>> - if (bio->bi_iter.bi_size > UINT_MAX - len) >>> + if (bio->bi_iter.bi_size > bio_max_bytes(bio) - len) >>> return true; >>> >>> return false; >>> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h >>> index 1255823b2bc0..861888501fc0 100644 >>> --- a/include/linux/blkdev.h >>> +++ b/include/linux/blkdev.h >>> @@ -326,6 +326,8 @@ enum blk_bounce { >>> }; >>> >>> struct queue_limits { >>> + unsigned int max_bio_bytes; >>> + >>> enum blk_bounce bounce; >>> unsigned long seg_boundary_mask; >>> unsigned long virt_boundary_mask; >>> @@ -1132,6 +1134,7 @@ extern void blk_abort_request(struct request *); >>> * Access functions for manipulating queue properties >>> */ >>> extern void blk_cleanup_queue(struct request_queue *); >>> +extern void blk_queue_max_bio_bytes(struct request_queue *, unsigned int); >>> void blk_queue_bounce_limit(struct request_queue *q, enum blk_bounce limit); >>> extern void blk_queue_max_hw_sectors(struct request_queue *, unsigned int); >>> extern void blk_queue_chunk_sectors(struct request_queue *, unsigned int); >>> >> >> >> -- >> Damien Le Moal >> Western Digital Research > > Thank you, > Changheun Lee > -- Damien Le Moal Western Digital Research ^ permalink raw reply [flat|nested] 19+ messages in thread
[parent not found: <CGME20210604094209epcas1p26368dd18011bb2761529432cf2656a9f@epcas1p2.samsung.com>]
* Re: [PATCH v12 1/3] bio: control bio max size [not found] ` <CGME20210604094209epcas1p26368dd18011bb2761529432cf2656a9f@epcas1p2.samsung.com> @ 2021-06-04 9:23 ` Changheun Lee 0 siblings, 0 replies; 19+ messages in thread From: Changheun Lee @ 2021-06-04 9:23 UTC (permalink / raw) To: damien.lemoal Cc: Avri.Altman, Johannes.Thumshirn, alex_y_xu, alim.akhtar, asml.silence, axboe, bgoncalv, bvanassche, cang, gregkh, hch, jaegeuk, jejb, jisoo2146.oh, junho89.kim, linux-block, linux-kernel, linux-scsi, martin.petersen, ming.lei, mj0123.lee, nanich.lee, osandov, patchwork-bot, seunghwan.hyun, sookwan7.kim, tj, tom.leiming, woosung2.lee, yi.zhang, yt0928.kim > On 2021/06/04 16:53, Changheun Lee wrote: > >> On 2021/06/04 14:22, Changheun Lee wrote: > >>> + * @q: the request queue for the device > >>> + * @bytes : bio max bytes to be set > >>> + * > >>> + * Description: > >>> + * Set proper bio max size to optimize queue operating. > >>> + **/ > >>> +void blk_queue_max_bio_bytes(struct request_queue *q, unsigned int bytes) > >>> +{ > >>> + struct queue_limits *limits = &q->limits; > >>> + unsigned int max_bio_bytes = round_up(bytes, PAGE_SIZE); > >>> + > >>> + limits->max_bio_bytes = max_t(unsigned int, max_bio_bytes, > >>> + BIO_MAX_VECS * PAGE_SIZE); > >>> +} > >>> +EXPORT_SYMBOL(blk_queue_max_bio_bytes); > >> > >> Setting of the stacked limits is still missing. > > > > max_bio_bytes for stacked device is just default(UINT_MAX) in this patch. > > Because blk_set_stacking_limits() call blk_set_default_limits(). > > I'll work continue for stacked device after this patchowork. > > Why ? Without that added now, anybody using this performance fix will see no > benefits if a device mapper is used. The stacking limit should be super simple. > In blk_stack_limits(), just add: > > t->max_bio_bytes = min(t->max_bio_bytes, b->max_bio_bytes); > > > -- > Damien Le Moal > Western Digital Research I had tried like as your comment at first. But I got many feedbacks that applying for all device is not good idea. So I'll try to control bio size in each stacked driver like as setting max_bio_bytes in LLD as you recommended. I'm trying to find some target stacked devices to contorl bio size like as dm-crypt, dm-liner, ... etc. And I'll try to find some method to control bio max size include using of blk_queue_max_bio_bytes(). Thank you, Changheun Lee ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v12 1/3] bio: control bio max size 2021-06-04 7:34 ` Changheun Lee 2021-06-04 8:01 ` Damien Le Moal @ 2021-06-04 14:52 ` Bart Van Assche 2021-06-07 6:35 ` Christoph Hellwig 1 sibling, 1 reply; 19+ messages in thread From: Bart Van Assche @ 2021-06-04 14:52 UTC (permalink / raw) To: Changheun Lee, damien.lemoal Cc: Avri.Altman, Johannes.Thumshirn, alex_y_xu, alim.akhtar, asml.silence, axboe, bgoncalv, cang, gregkh, hch, jaegeuk, jejb, jisoo2146.oh, junho89.kim, linux-block, linux-kernel, linux-scsi, martin.petersen, ming.lei, mj0123.lee, osandov, patchwork-bot, seunghwan.hyun, sookwan7.kim, tj, tom.leiming, woosung2.lee, yi.zhang, yt0928.kim On 6/4/21 12:34 AM, Changheun Lee wrote: >> On 2021/06/04 14:22, Changheun Lee wrote: >>> +unsigned int bio_max_bytes(struct bio *bio) >>> +{ >>> + struct block_device *bdev = bio->bi_bdev; >>> + >>> + return bdev ? bdev->bd_disk->queue->limits.max_bio_bytes : UINT_MAX; >>> +} >> >> unsigned int bio_max_bytes(struct bio *bio) >> { >> struct block_device *bdev = bio->bi_bdev; >> >> if (!bdev) >> return UINT_MAX; >> return bdev->bd_disk->queue->limits.max_bio_bytes; >> } >> >> is a lot more readable... >> Also, I remember there was some problems with bd_disk possibly being null. Was >> that fixed ? > > Thank you for review. But I'd like current style, and it's readable enough > now I think. Null of bd_disk was just suspicion. bd_disk is not null if bdev > is not null. Damien is right. bd_disk can be NULL. From https://lore.kernel.org/linux-block/20210425043020.30065-1-bvanassche@acm.org/: "bio_max_size() may get called before device_add_disk() and hence needs to check whether or not the block device pointer is NULL. [ ... ] This patch prevents that bio_max_size() triggers the following kernel crash during a SCSI LUN scan:\ BUG: KASAN: null-ptr-deref in bio_add_hw_page+0xa6/0x310 Read of size 8 at addr 00000000000005a8 by task kworker/u16:0/7 Workqueue: events_unbound async_run_entry_fn Call Trace: show_stack+0x52/0x58 dump_stack+0x9d/0xcf kasan_report.cold+0x4b/0x50 __asan_load8+0x69/0x90 bio_add_hw_page+0xa6/0x310 bio_add_pc_page+0xaa/0xe0 bio_map_kern+0xdc/0x1a0 blk_rq_map_kern+0xcd/0x2d0 __scsi_execute+0x9a/0x290 [scsi_mod] scsi_probe_lun.constprop.0+0x17c/0x660 [scsi_mod] scsi_probe_and_add_lun+0x178/0x750 [scsi_mod] __scsi_add_device+0x18c/0x1a0 [scsi_mod] ata_scsi_scan_host+0xe5/0x260 [libata] async_port_probe+0x94/0xb0 [libata] async_run_entry_fn+0x7d/0x2d0 process_one_work+0x582/0xac0 worker_thread+0x8f/0x5a0 kthread+0x222/0x250 ret_from_fork+0x1f/0x30" Bart. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v12 1/3] bio: control bio max size 2021-06-04 14:52 ` Bart Van Assche @ 2021-06-07 6:35 ` Christoph Hellwig 2021-06-07 16:46 ` Bart Van Assche 0 siblings, 1 reply; 19+ messages in thread From: Christoph Hellwig @ 2021-06-07 6:35 UTC (permalink / raw) To: Bart Van Assche Cc: Changheun Lee, damien.lemoal, Avri.Altman, Johannes.Thumshirn, alex_y_xu, alim.akhtar, asml.silence, axboe, bgoncalv, cang, gregkh, hch, jaegeuk, jejb, jisoo2146.oh, junho89.kim, linux-block, linux-kernel, linux-scsi, martin.petersen, ming.lei, mj0123.lee, osandov, patchwork-bot, seunghwan.hyun, sookwan7.kim, tj, tom.leiming, woosung2.lee, yi.zhang, yt0928.kim On Fri, Jun 04, 2021 at 07:52:35AM -0700, Bart Van Assche wrote: > Damien is right. bd_disk can be NULL. From bd_disk is initialized in bdev_alloc, so it should never be NULL. bi_bdev OTOH is only set afer bio_add_page in various places or not at all in case of passthrough bios. Which is a bit of a mess and I have plans to fix it. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v12 1/3] bio: control bio max size 2021-06-07 6:35 ` Christoph Hellwig @ 2021-06-07 16:46 ` Bart Van Assche 2021-06-08 5:22 ` Christoph Hellwig 0 siblings, 1 reply; 19+ messages in thread From: Bart Van Assche @ 2021-06-07 16:46 UTC (permalink / raw) To: Christoph Hellwig Cc: Changheun Lee, damien.lemoal, Avri.Altman, Johannes.Thumshirn, alex_y_xu, alim.akhtar, asml.silence, axboe, bgoncalv, cang, gregkh, jaegeuk, jejb, jisoo2146.oh, junho89.kim, linux-block, linux-kernel, linux-scsi, martin.petersen, ming.lei, mj0123.lee, osandov, patchwork-bot, seunghwan.hyun, sookwan7.kim, tj, tom.leiming, woosung2.lee, yi.zhang, yt0928.kim On 6/6/21 11:35 PM, Christoph Hellwig wrote: > On Fri, Jun 04, 2021 at 07:52:35AM -0700, Bart Van Assche wrote: >> Damien is right. bd_disk can be NULL. From > > bd_disk is initialized in bdev_alloc, so it should never be NULL. > bi_bdev OTOH is only set afer bio_add_page in various places or not at > all in case of passthrough bios. Which is a bit of a mess and I have > plans to fix it. Hi Christoph, Thank you for having shared your plans for how to improve how bi_bdev is set. In case you would not yet have had the time to do this, please take a look at the call trace available on https://lore.kernel.org/linux-block/20210425043020.30065-1-bvanassche@acm.org/. That call trace shows how bio_add_pc_page() is called by the SCSI core before alloc_disk() is called. I think that sending a SCSI command before alloc_disk() is called is fundamental in the SCSI core because the SCSI INQUIRY command has to be sent before it is known whether or not a SCSI LUN represents a disk. Thanks, Bart. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v12 1/3] bio: control bio max size 2021-06-07 16:46 ` Bart Van Assche @ 2021-06-08 5:22 ` Christoph Hellwig 0 siblings, 0 replies; 19+ messages in thread From: Christoph Hellwig @ 2021-06-08 5:22 UTC (permalink / raw) To: Bart Van Assche Cc: Christoph Hellwig, Changheun Lee, damien.lemoal, Avri.Altman, Johannes.Thumshirn, alex_y_xu, alim.akhtar, asml.silence, axboe, bgoncalv, cang, gregkh, jaegeuk, jejb, jisoo2146.oh, junho89.kim, linux-block, linux-kernel, linux-scsi, martin.petersen, ming.lei, mj0123.lee, osandov, patchwork-bot, seunghwan.hyun, sookwan7.kim, tj, tom.leiming, woosung2.lee, yi.zhang, yt0928.kim On Mon, Jun 07, 2021 at 09:46:56AM -0700, Bart Van Assche wrote: > In case you would not yet have had the time to do this, please take a > look at the call trace available on > https://lore.kernel.org/linux-block/20210425043020.30065-1-bvanassche@acm.org/. > That call trace shows how bio_add_pc_page() is called by the SCSI core > before alloc_disk() is called. I think that sending a SCSI command > before alloc_disk() is called is fundamental in the SCSI core because > the SCSI INQUIRY command has to be sent before it is known whether or > not a SCSI LUN represents a disk. I have a plan for that as well, stay tuned :) ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v12 1/3] bio: control bio max size 2021-06-04 5:03 ` [PATCH v12 1/3] " Changheun Lee 2021-06-04 7:24 ` Damien Le Moal @ 2021-06-07 6:32 ` Christoph Hellwig [not found] ` <CGME20210607110122epcas1p4557bcbc7abac791f2557cc0d317214fd@epcas1p4.samsung.com> 1 sibling, 1 reply; 19+ messages in thread From: Christoph Hellwig @ 2021-06-07 6:32 UTC (permalink / raw) To: Changheun Lee Cc: Johannes.Thumshirn, alex_y_xu, asml.silence, axboe, bgoncalv, jejb, martin.petersen, cang, avri.altman, alim.akhtar, bvanassche, damien.lemoal, gregkh, hch, jaegeuk, linux-block, linux-scsi, linux-kernel, ming.lei, osandov, patchwork-bot, tj, tom.leiming, yi.zhang, jisoo2146.oh, junho89.kim, mj0123.lee, seunghwan.hyun, sookwan7.kim, woosung2.lee, yt0928.kim NAK. As discussed countless time we already have an actual limit. And we can look it as advisory before building ever bigger bios, but the last thing we need is yet another confusing limit. ^ permalink raw reply [flat|nested] 19+ messages in thread
[parent not found: <CGME20210607110122epcas1p4557bcbc7abac791f2557cc0d317214fd@epcas1p4.samsung.com>]
* Re: [PATCH v12 1/3] bio: control bio max size [not found] ` <CGME20210607110122epcas1p4557bcbc7abac791f2557cc0d317214fd@epcas1p4.samsung.com> @ 2021-06-07 10:42 ` Changheun Lee 0 siblings, 0 replies; 19+ messages in thread From: Changheun Lee @ 2021-06-07 10:42 UTC (permalink / raw) To: hch Cc: Johannes.Thumshirn, alex_y_xu, alim.akhtar, asml.silence, avri.altman, axboe, bgoncalv, bvanassche, cang, damien.lemoal, gregkh, jaegeuk, jejb, jisoo2146.oh, junho89.kim, linux-block, linux-kernel, linux-scsi, martin.petersen, ming.lei, mj0123.lee, nanich.lee, osandov, patchwork-bot, seunghwan.hyun, sookwan7.kim, tj, tom.leiming, woosung2.lee, yi.zhang, yt0928.kim > NAK. As discussed countless time we already have an actual limit. > And we can look it as advisory before building ever bigger bios, > but the last thing we need is yet another confusing limit. Thank you for your review and feedback. I has thought proper bio size control with proper interface might be good, and can be helpful to the other module beside issued performance problem. Providing of common interface to control bio size in block layer might be good I think even though current patch is not accepted. Thanks for all, Changheun Lee ^ permalink raw reply [flat|nested] 19+ messages in thread
[parent not found: <CGME20210604052200epcas1p10754a3187476a0fcbfcc89c103a6d436@epcas1p1.samsung.com>]
* [PATCH v12 2/3] blk-sysfs: add max_bio_bytes [not found] ` <CGME20210604052200epcas1p10754a3187476a0fcbfcc89c103a6d436@epcas1p1.samsung.com> @ 2021-06-04 5:03 ` Changheun Lee 0 siblings, 0 replies; 19+ messages in thread From: Changheun Lee @ 2021-06-04 5:03 UTC (permalink / raw) To: Johannes.Thumshirn, alex_y_xu, asml.silence, axboe, bgoncalv, jejb, martin.petersen, cang, avri.altman, alim.akhtar, bvanassche, damien.lemoal, gregkh, hch, jaegeuk, linux-block, linux-scsi, linux-kernel, ming.lei, osandov, patchwork-bot, tj, tom.leiming, yi.zhang Cc: jisoo2146.oh, junho89.kim, mj0123.lee, seunghwan.hyun, sookwan7.kim, woosung2.lee, yt0928.kim, Changheun Lee Add max_bio_bytes block sysfs node to show current maximum bio size. Signed-off-by: Changheun Lee <nanich.lee@samsung.com> --- Documentation/ABI/testing/sysfs-block | 10 ++++++++++ Documentation/block/queue-sysfs.rst | 7 +++++++ block/blk-sysfs.c | 7 +++++++ 3 files changed, 24 insertions(+) diff --git a/Documentation/ABI/testing/sysfs-block b/Documentation/ABI/testing/sysfs-block index e34cdeeeb9d4..8c8a793c04b4 100644 --- a/Documentation/ABI/testing/sysfs-block +++ b/Documentation/ABI/testing/sysfs-block @@ -316,3 +316,13 @@ Description: does not complete in this time then the block driver timeout handler is invoked. That timeout handler can decide to retry the request, to fail it or to start a device recovery strategy. + +What: /sys/block/<disk>/queue/max_bio_bytes +Date: June 2021 +Contact: Changheun Lee <nanich.lee@samsung.com> +Description: + max_bio_bytes is the maximum bio size to be submitted in bytes. + It shows current maximum bio size, and bio size to be submitted + will be limited with this. Default value is UINT_MAX, and + the minimum value is 1MB. 1MB(=BIO_MAX_VECS * PAGE_SIZE) is + legacy maximum bio size. diff --git a/Documentation/block/queue-sysfs.rst b/Documentation/block/queue-sysfs.rst index 4dc7f0d499a8..90af56899aa9 100644 --- a/Documentation/block/queue-sysfs.rst +++ b/Documentation/block/queue-sysfs.rst @@ -286,4 +286,11 @@ sequential zones of zoned block devices (devices with a zoned attributed that reports "host-managed" or "host-aware"). This value is always 0 for regular block devices. +max_bio_bytes (RO) +--------------------------- +This is the maximum number of bytes that bio size to be submitted will be +limited. A value of 4,294,967,295(UINT_MAX) means no limit of bio size, +and it's a default value. The minimum value is 1MB. It's legacy maximum +bio size. (=BIO_MAX_VECS * PAGE_SIZE) + Jens Axboe <jens.axboe@oracle.com>, February 2009 diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c index e03bedf180ab..c4cae6bbcb3b 100644 --- a/block/blk-sysfs.c +++ b/block/blk-sysfs.c @@ -108,6 +108,11 @@ queue_ra_store(struct request_queue *q, const char *page, size_t count) return ret; } +static ssize_t queue_max_bio_bytes_show(struct request_queue *q, char *page) +{ + return queue_var_show(q->limits.max_bio_bytes, (page)); +} + static ssize_t queue_max_sectors_show(struct request_queue *q, char *page) { int max_sectors_kb = queue_max_sectors(q) >> 1; @@ -577,6 +582,7 @@ static struct queue_sysfs_entry _prefix##_entry = { \ QUEUE_RW_ENTRY(queue_requests, "nr_requests"); QUEUE_RW_ENTRY(queue_ra, "read_ahead_kb"); +QUEUE_RO_ENTRY(queue_max_bio_bytes, "max_bio_bytes"); QUEUE_RW_ENTRY(queue_max_sectors, "max_sectors_kb"); QUEUE_RO_ENTRY(queue_max_hw_sectors, "max_hw_sectors_kb"); QUEUE_RO_ENTRY(queue_max_segments, "max_segments"); @@ -635,6 +641,7 @@ QUEUE_RW_ENTRY(queue_stable_writes, "stable_writes"); static struct attribute *queue_attrs[] = { &queue_requests_entry.attr, &queue_ra_entry.attr, + &queue_max_bio_bytes_entry.attr, &queue_max_hw_sectors_entry.attr, &queue_max_sectors_entry.attr, &queue_max_segments_entry.attr, -- 2.29.0 ^ permalink raw reply related [flat|nested] 19+ messages in thread
[parent not found: <CGME20210604052201epcas1p41a27660b20d70b7fc4295c8f131d33ce@epcas1p4.samsung.com>]
* [PATCH v12 3/3] ufs: set max_bio_bytes with queue max sectors [not found] ` <CGME20210604052201epcas1p41a27660b20d70b7fc4295c8f131d33ce@epcas1p4.samsung.com> @ 2021-06-04 5:03 ` Changheun Lee 2021-06-04 16:11 ` Bart Van Assche 0 siblings, 1 reply; 19+ messages in thread From: Changheun Lee @ 2021-06-04 5:03 UTC (permalink / raw) To: Johannes.Thumshirn, alex_y_xu, asml.silence, axboe, bgoncalv, jejb, martin.petersen, cang, avri.altman, alim.akhtar, bvanassche, damien.lemoal, gregkh, hch, jaegeuk, linux-block, linux-scsi, linux-kernel, ming.lei, osandov, patchwork-bot, tj, tom.leiming, yi.zhang Cc: jisoo2146.oh, junho89.kim, mj0123.lee, seunghwan.hyun, sookwan7.kim, woosung2.lee, yt0928.kim, Changheun Lee Set max_bio_bytes same with queue max sectors. It will lead to fast bio submit when bio size is over than queue max sectors. And it might be helpful to align submit bio size in some case. Signed-off-by: Changheun Lee <nanich.lee@samsung.com> --- drivers/scsi/ufs/ufshcd.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 3eb54937f1d8..37365a726517 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -4858,6 +4858,7 @@ static int ufshcd_slave_configure(struct scsi_device *sdev) { struct ufs_hba *hba = shost_priv(sdev->host); struct request_queue *q = sdev->request_queue; + unsigned int max_bio_bytes; blk_queue_update_dma_pad(q, PRDT_DATA_BYTE_COUNT_PAD - 1); if (hba->quirks & UFSHCD_QUIRK_ALIGN_SG_WITH_PAGE_SIZE) @@ -4868,6 +4869,10 @@ static int ufshcd_slave_configure(struct scsi_device *sdev) ufshcd_crypto_setup_rq_keyslot_manager(hba, q); + if (!check_shl_overflow(queue_max_sectors(q), + SECTOR_SHIFT, &max_bio_bytes)) + blk_queue_max_bio_bytes(q, max_bio_bytes); + return 0; } -- 2.29.0 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH v12 3/3] ufs: set max_bio_bytes with queue max sectors 2021-06-04 5:03 ` [PATCH v12 3/3] ufs: set max_bio_bytes with queue max sectors Changheun Lee @ 2021-06-04 16:11 ` Bart Van Assche [not found] ` <CGME20210607094031epcas1p1f4a9ee01eaa4652ba0e8eb6a4964c952@epcas1p1.samsung.com> 0 siblings, 1 reply; 19+ messages in thread From: Bart Van Assche @ 2021-06-04 16:11 UTC (permalink / raw) To: Changheun Lee, Johannes.Thumshirn, alex_y_xu, asml.silence, axboe, bgoncalv, jejb, martin.petersen, cang, avri.altman, alim.akhtar, damien.lemoal, gregkh, hch, jaegeuk, linux-block, linux-scsi, linux-kernel, ming.lei, osandov, patchwork-bot, tj, tom.leiming, yi.zhang Cc: jisoo2146.oh, junho89.kim, mj0123.lee, seunghwan.hyun, sookwan7.kim, woosung2.lee, yt0928.kim On 6/3/21 10:03 PM, Changheun Lee wrote: > Set max_bio_bytes same with queue max sectors. It will lead to fast bio > submit when bio size is over than queue max sectors. And it might be helpful > to align submit bio size in some case. > > Signed-off-by: Changheun Lee <nanich.lee@samsung.com> > --- > drivers/scsi/ufs/ufshcd.c | 5 +++++ > 1 file changed, 5 insertions(+) > > diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c > index 3eb54937f1d8..37365a726517 100644 > --- a/drivers/scsi/ufs/ufshcd.c > +++ b/drivers/scsi/ufs/ufshcd.c > @@ -4858,6 +4858,7 @@ static int ufshcd_slave_configure(struct scsi_device *sdev) > { > struct ufs_hba *hba = shost_priv(sdev->host); > struct request_queue *q = sdev->request_queue; > + unsigned int max_bio_bytes; > > blk_queue_update_dma_pad(q, PRDT_DATA_BYTE_COUNT_PAD - 1); > if (hba->quirks & UFSHCD_QUIRK_ALIGN_SG_WITH_PAGE_SIZE) > @@ -4868,6 +4869,10 @@ static int ufshcd_slave_configure(struct scsi_device *sdev) > > ufshcd_crypto_setup_rq_keyslot_manager(hba, q); > > + if (!check_shl_overflow(queue_max_sectors(q), > + SECTOR_SHIFT, &max_bio_bytes)) > + blk_queue_max_bio_bytes(q, max_bio_bytes); > + > return 0; > } Just like previous versions of this patch series, this approach will trigger data corruption if dm-crypt is stacked on top of the UFS driver since ufs_max_sectors << SECTOR_SHIFT = 1024 * 512 is less than the size of the dm-crypt buffer (BIO_MAX_VECS << PAGE_SHIFT = 256 * 4096). I am not recommending to increase max_sectors for the UFS driver. We need a solution that is independent of the dm-crypt internals. Bart. ^ permalink raw reply [flat|nested] 19+ messages in thread
[parent not found: <CGME20210607094031epcas1p1f4a9ee01eaa4652ba0e8eb6a4964c952@epcas1p1.samsung.com>]
* Re: [PATCH v12 3/3] ufs: set max_bio_bytes with queue max sectors [not found] ` <CGME20210607094031epcas1p1f4a9ee01eaa4652ba0e8eb6a4964c952@epcas1p1.samsung.com> @ 2021-06-07 9:21 ` Changheun Lee 0 siblings, 0 replies; 19+ messages in thread From: Changheun Lee @ 2021-06-07 9:21 UTC (permalink / raw) To: bvanassche Cc: Johannes.Thumshirn, alex_y_xu, alim.akhtar, asml.silence, avri.altman, axboe, bgoncalv, cang, damien.lemoal, gregkh, hch, jaegeuk, jejb, jisoo2146.oh, junho89.kim, linux-block, linux-kernel, linux-scsi, martin.petersen, ming.lei, mj0123.lee, nanich.lee, osandov, patchwork-bot, seunghwan.hyun, sookwan7.kim, tj, tom.leiming, woosung2.lee, yi.zhang, yt0928.kim > On 6/3/21 10:03 PM, Changheun Lee wrote: > > Set max_bio_bytes same with queue max sectors. It will lead to fast bio > > submit when bio size is over than queue max sectors. And it might be helpful > > to align submit bio size in some case. > > > > Signed-off-by: Changheun Lee <nanich.lee@samsung.com> > > --- > > drivers/scsi/ufs/ufshcd.c | 5 +++++ > > 1 file changed, 5 insertions(+) > > > > diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c > > index 3eb54937f1d8..37365a726517 100644 > > --- a/drivers/scsi/ufs/ufshcd.c > > +++ b/drivers/scsi/ufs/ufshcd.c > > @@ -4858,6 +4858,7 @@ static int ufshcd_slave_configure(struct scsi_device *sdev) > > { > > struct ufs_hba *hba = shost_priv(sdev->host); > > struct request_queue *q = sdev->request_queue; > > + unsigned int max_bio_bytes; > > > > blk_queue_update_dma_pad(q, PRDT_DATA_BYTE_COUNT_PAD - 1); > > if (hba->quirks & UFSHCD_QUIRK_ALIGN_SG_WITH_PAGE_SIZE) > > @@ -4868,6 +4869,10 @@ static int ufshcd_slave_configure(struct scsi_device *sdev) > > > > ufshcd_crypto_setup_rq_keyslot_manager(hba, q); > > > > + if (!check_shl_overflow(queue_max_sectors(q), > > + SECTOR_SHIFT, &max_bio_bytes)) > > + blk_queue_max_bio_bytes(q, max_bio_bytes); > > + > > return 0; > > } > > Just like previous versions of this patch series, this approach will > trigger data corruption if dm-crypt is stacked on top of the UFS driver > since ufs_max_sectors << SECTOR_SHIFT = 1024 * 512 is less than the size > of the dm-crypt buffer (BIO_MAX_VECS << PAGE_SHIFT = 256 * 4096). max_bio_bytes can't be smaller than "BIO_MAX_VECS * PAGE_SIZE". Large value will be selected between input parameter and "BIO_MAX_VECS * PAGE_SIZE" in blk_queue_max_bio_bytes(). I think bio max size should be larger than "BIO_MAX_VECS * PAGE_SIZE" for kernel stability. void blk_queue_max_bio_bytes(struct request_queue *q, unsigned int bytes) { struct queue_limits *limits = &q->limits; unsigned int max_bio_bytes = round_up(bytes, PAGE_SIZE); limits->max_bio_bytes = max_t(unsigned int, max_bio_bytes, BIO_MAX_VECS * PAGE_SIZE); } EXPORT_SYMBOL(blk_queue_max_bio_bytes); > > I am not recommending to increase max_sectors for the UFS driver. We > need a solution that is independent of the dm-crypt internals. > > Bart. No need to increase max_sectors in driver to set max_bio_bytes. There are no dependency with dm-crypt I think. Thank you, Changheun Lee ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v12 0/3] bio: control bio max size 2021-06-04 5:03 ` [PATCH v12 0/3] bio: control bio max size Changheun Lee ` (2 preceding siblings ...) [not found] ` <CGME20210604052201epcas1p41a27660b20d70b7fc4295c8f131d33ce@epcas1p4.samsung.com> @ 2021-06-04 6:41 ` Can Guo 2021-06-04 16:13 ` Bart Van Assche 4 siblings, 0 replies; 19+ messages in thread From: Can Guo @ 2021-06-04 6:41 UTC (permalink / raw) To: Changheun Lee Cc: Johannes.Thumshirn, alex_y_xu, asml.silence, axboe, bgoncalv, jejb, martin.petersen, avri.altman, alim.akhtar, bvanassche, damien.lemoal, gregkh, hch, jaegeuk, linux-block, linux-scsi, linux-kernel, ming.lei, osandov, patchwork-bot, tj, tom.leiming, yi.zhang, jisoo2146.oh, junho89.kim, mj0123.lee, seunghwan.hyun, sookwan7.kim, woosung2.lee, yt0928.kim On 2021-06-04 13:03, Changheun Lee wrote: > bio size can grow up to 4GB after muli-page bvec has been enabled. > But sometimes large size of bio would lead to inefficient behaviors. > Control of bio max size will be helpful to improve inefficiency. > > blk_queue_max_bio_bytes() is added to enable be set the max_bio_bytes > in > each driver layer. And max_bio_bytes sysfs is added to show current > max_bio_bytes for each request queue. > bio size can be controlled via max_bio_bytes. > This is interesting, and we also noticed it right after multi-page bvec is enabled since last year. Internally, we had a hack to disable it. But it is good to have a tunable to control it. Thanks for the change. Reviewed-by: Can Guo <cang@codeaurora.org> > Changheun Lee (3): > bio: control bio max size > blk-sysfs: add max_bio_bytes > ufs: set max_bio_bytes with queue max sectors > > Documentation/ABI/testing/sysfs-block | 10 ++++++++++ > Documentation/block/queue-sysfs.rst | 7 +++++++ > block/bio.c | 17 ++++++++++++++--- > block/blk-settings.c | 19 +++++++++++++++++++ > block/blk-sysfs.c | 7 +++++++ > drivers/scsi/ufs/ufshcd.c | 5 +++++ > include/linux/bio.h | 4 +++- > include/linux/blkdev.h | 3 +++ > 8 files changed, 68 insertions(+), 4 deletions(-) ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v12 0/3] bio: control bio max size 2021-06-04 5:03 ` [PATCH v12 0/3] bio: control bio max size Changheun Lee ` (3 preceding siblings ...) 2021-06-04 6:41 ` [PATCH v12 0/3] bio: control bio max size Can Guo @ 2021-06-04 16:13 ` Bart Van Assche [not found] ` <CGME20210607101609epcas1p392324f6d215e329d632a615c4b1adf4c@epcas1p3.samsung.com> 4 siblings, 1 reply; 19+ messages in thread From: Bart Van Assche @ 2021-06-04 16:13 UTC (permalink / raw) To: Changheun Lee, Johannes.Thumshirn, alex_y_xu, asml.silence, axboe, bgoncalv, jejb, martin.petersen, cang, avri.altman, alim.akhtar, damien.lemoal, gregkh, hch, jaegeuk, linux-block, linux-scsi, linux-kernel, ming.lei, osandov, patchwork-bot, tj, tom.leiming, yi.zhang Cc: jisoo2146.oh, junho89.kim, mj0123.lee, seunghwan.hyun, sookwan7.kim, woosung2.lee, yt0928.kim On 6/3/21 10:03 PM, Changheun Lee wrote: > bio size can grow up to 4GB after muli-page bvec has been enabled. > But sometimes large size of bio would lead to inefficient behaviors. > Control of bio max size will be helpful to improve inefficiency. > > blk_queue_max_bio_bytes() is added to enable be set the max_bio_bytes in > each driver layer. And max_bio_bytes sysfs is added to show current > max_bio_bytes for each request queue. > bio size can be controlled via max_bio_bytes. Where is the version history for this patch series? Has this patch series been tested in combination with dm-crypt? Bart. ^ permalink raw reply [flat|nested] 19+ messages in thread
[parent not found: <CGME20210607101609epcas1p392324f6d215e329d632a615c4b1adf4c@epcas1p3.samsung.com>]
* Re: [PATCH v12 0/3] bio: control bio max size [not found] ` <CGME20210607101609epcas1p392324f6d215e329d632a615c4b1adf4c@epcas1p3.samsung.com> @ 2021-06-07 9:57 ` Changheun Lee 0 siblings, 0 replies; 19+ messages in thread From: Changheun Lee @ 2021-06-07 9:57 UTC (permalink / raw) To: bvanassche Cc: Johannes.Thumshirn, alex_y_xu, alim.akhtar, asml.silence, avri.altman, axboe, bgoncalv, cang, damien.lemoal, gregkh, hch, jaegeuk, jejb, jisoo2146.oh, junho89.kim, linux-block, linux-kernel, linux-scsi, martin.petersen, ming.lei, mj0123.lee, nanich.lee, osandov, patchwork-bot, seunghwan.hyun, sookwan7.kim, tj, tom.leiming, woosung2.lee, yi.zhang, yt0928.kim > On 6/3/21 10:03 PM, Changheun Lee wrote: > > bio size can grow up to 4GB after muli-page bvec has been enabled. > > But sometimes large size of bio would lead to inefficient behaviors. > > Control of bio max size will be helpful to improve inefficiency. > > > > blk_queue_max_bio_bytes() is added to enable be set the max_bio_bytes in > > each driver layer. And max_bio_bytes sysfs is added to show current > > max_bio_bytes for each request queue. > > bio size can be controlled via max_bio_bytes. > > Where is the version history for this patch series? Sorry. I didn't know about it. I'll do next. Thank you for advise. > > Has this patch series been tested in combination with dm-crypt? This patch has been tested in android device. And dm-crypt on the top of scsi has tested with below scsi modification in VM too. @@ -1837,6 +1837,8 @@ void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q) blk_queue_virt_boundary(q, shost->virt_boundary_mask); dma_set_max_seg_size(dev, queue_max_segment_size(q)); + blk_queue_max_bio_bytes(q, queue_max_sectors(q) << SECTOR_SHIFT); + /* * Set a reasonable default alignment: The larger of 32-byte (dword), * which is a common minimum for HBAs, and the minimum DMA alignment, > > Bart. > Thank you, Changheun Lee ^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2021-06-08 5:23 UTC | newest] Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <CGME20210604052157epcas1p2e5eebb52d08b06174696290e11fdd5a4@epcas1p2.samsung.com> 2021-06-04 5:03 ` [PATCH v12 0/3] bio: control bio max size Changheun Lee [not found] ` <CGME20210604052159epcas1p4370bee98aad882ab335dda1565db94fb@epcas1p4.samsung.com> 2021-06-04 5:03 ` [PATCH v12 1/3] " Changheun Lee 2021-06-04 7:24 ` Damien Le Moal [not found] ` <CGME20210604075331epcas1p13bb57f9ddfc7b112dec1ba8cf40fdc74@epcas1p1.samsung.com> 2021-06-04 7:34 ` Changheun Lee 2021-06-04 8:01 ` Damien Le Moal [not found] ` <CGME20210604094209epcas1p26368dd18011bb2761529432cf2656a9f@epcas1p2.samsung.com> 2021-06-04 9:23 ` Changheun Lee 2021-06-04 14:52 ` Bart Van Assche 2021-06-07 6:35 ` Christoph Hellwig 2021-06-07 16:46 ` Bart Van Assche 2021-06-08 5:22 ` Christoph Hellwig 2021-06-07 6:32 ` Christoph Hellwig [not found] ` <CGME20210607110122epcas1p4557bcbc7abac791f2557cc0d317214fd@epcas1p4.samsung.com> 2021-06-07 10:42 ` Changheun Lee [not found] ` <CGME20210604052200epcas1p10754a3187476a0fcbfcc89c103a6d436@epcas1p1.samsung.com> 2021-06-04 5:03 ` [PATCH v12 2/3] blk-sysfs: add max_bio_bytes Changheun Lee [not found] ` <CGME20210604052201epcas1p41a27660b20d70b7fc4295c8f131d33ce@epcas1p4.samsung.com> 2021-06-04 5:03 ` [PATCH v12 3/3] ufs: set max_bio_bytes with queue max sectors Changheun Lee 2021-06-04 16:11 ` Bart Van Assche [not found] ` <CGME20210607094031epcas1p1f4a9ee01eaa4652ba0e8eb6a4964c952@epcas1p1.samsung.com> 2021-06-07 9:21 ` Changheun Lee 2021-06-04 6:41 ` [PATCH v12 0/3] bio: control bio max size Can Guo 2021-06-04 16:13 ` Bart Van Assche [not found] ` <CGME20210607101609epcas1p392324f6d215e329d632a615c4b1adf4c@epcas1p3.samsung.com> 2021-06-07 9:57 ` Changheun Lee
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).