* [PATCH v2] bio: limit bio max size. [not found] <CGME20210121011324epcas1p3a213069e873fd324a7ce3188558f0782@epcas1p3.samsung.com> @ 2021-01-21 0:58 ` Changheun Lee 2021-01-21 1:53 ` Damien Le Moal ` (2 more replies) 0 siblings, 3 replies; 9+ messages in thread From: Changheun Lee @ 2021-01-21 0:58 UTC (permalink / raw) To: Johannes.Thumshirn, axboe, damien.lemoal, asml.silence, patchwork-bot, osandov, linux-block, linux-kernel, ming.lei, tj, tom.leiming Cc: jisoo2146.oh, junho89.kim, mj0123.lee, seunghwan.hyun, sookwan7.kim, woosung2.lee, yt0928.kim, Changheun Lee bio size can grow up to 4GB when muli-page bvec is enabled. but sometimes it would lead to inefficient behaviors. in case of large chunk direct I/O, - 32MB chunk read in user space - all pages for 32MB would be merged to a bio structure if memory address is continued phsycally. it makes some delay to submit until merge complete. bio max size should be limited as a proper size. When 32MB chunk read with direct I/O option is coming from userspace, kernel behavior is below now. it's timeline. | bio merge for 32MB. total 8,192 pages are merged. | total elapsed time is over 2ms. |------------------ ... ----------------------->| | 8,192 pages merged a bio. | at this time, first bio submit is done. | 1 bio is split to 32 read request and issue. |---------------> |---------------> |---------------> ...... |---------------> |--------------->| total 19ms elapsed to complete 32MB read done from device. | If bio max size is limited with 1MB, behavior is changed below. | bio merge for 1MB. 256 pages are merged for each bio. | total 32 bio will be made. | total elapsed time is over 2ms. it's same. | but, first bio submit timing is fast. about 100us. |--->|--->|--->|---> ... -->|--->|--->|--->|--->| | 256 pages merged a bio. | at this time, first bio submit is done. | and 1 read request is issued for 1 bio. |---------------> |---------------> |---------------> ...... |---------------> |--------------->| total 17ms elapsed to complete 32MB read done from device. | As a result, read request issue timing is faster if bio max size is limited. Current kernel behavior with multipage bvec, super large bio can be created. And it lead to delay first I/O request issue. Signed-off-by: Changheun Lee <nanich.lee@samsung.com> --- block/bio.c | 17 ++++++++++++++++- include/linux/bio.h | 13 +++---------- include/linux/blk_types.h | 1 + 3 files changed, 20 insertions(+), 11 deletions(-) diff --git a/block/bio.c b/block/bio.c index 1f2cc1fbe283..027503c2e2e7 100644 --- a/block/bio.c +++ b/block/bio.c @@ -284,9 +284,24 @@ void bio_init(struct bio *bio, struct bio_vec *table, bio->bi_io_vec = table; bio->bi_max_vecs = max_vecs; + bio->bi_max_size = UINT_MAX; } EXPORT_SYMBOL(bio_init); +void bio_set_dev(struct bio *bio, struct block_device *bdev) +{ + if (bio->bi_disk != bdev->bd_disk) + bio_clear_flag(bio, BIO_THROTTLED); + + bio->bi_disk = bdev->bd_disk; + bio->bi_partno = bdev->bd_partno; + bio->bi_max_size = blk_queue_get_max_sectors(bio->bi_disk->queue, + bio_op(bio)) << SECTOR_SHIFT; + + bio_associate_blkg(bio); +} +EXPORT_SYMBOL(bio_set_dev); + /** * bio_reset - reinitialize a bio * @bio: bio to reset @@ -877,7 +892,7 @@ bool __bio_try_merge_page(struct bio *bio, struct page *page, struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1]; if (page_is_mergeable(bv, page, len, off, same_page)) { - if (bio->bi_iter.bi_size > UINT_MAX - len) { + if (bio->bi_iter.bi_size > bio->bi_max_size - len) *same_page = false; return false; } diff --git a/include/linux/bio.h b/include/linux/bio.h index 1edda614f7ce..b9803e80c259 100644 --- a/include/linux/bio.h +++ b/include/linux/bio.h @@ -113,7 +113,7 @@ static inline bool bio_full(struct bio *bio, unsigned len) if (bio->bi_vcnt >= bio->bi_max_vecs) return true; - if (bio->bi_iter.bi_size > UINT_MAX - len) + if (bio->bi_iter.bi_size > bio->bi_max_size - len) return true; return false; @@ -482,20 +482,13 @@ extern struct bio_vec *bvec_alloc(gfp_t, int, unsigned long *, mempool_t *); extern void bvec_free(mempool_t *, struct bio_vec *, unsigned int); extern unsigned int bvec_nr_vecs(unsigned short idx); extern const char *bio_devname(struct bio *bio, char *buffer); - -#define bio_set_dev(bio, bdev) \ -do { \ - if ((bio)->bi_disk != (bdev)->bd_disk) \ - bio_clear_flag(bio, BIO_THROTTLED);\ - (bio)->bi_disk = (bdev)->bd_disk; \ - (bio)->bi_partno = (bdev)->bd_partno; \ - bio_associate_blkg(bio); \ -} while (0) +extern void bio_set_dev(struct bio *bio, struct block_device *bdev); #define bio_copy_dev(dst, src) \ do { \ (dst)->bi_disk = (src)->bi_disk; \ (dst)->bi_partno = (src)->bi_partno; \ + (dst)->bi_max_size = (src)->bi_max_size;\ bio_clone_blkg_association(dst, src); \ } while (0) diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h index 866f74261b3b..e5dd5b7d8fc1 100644 --- a/include/linux/blk_types.h +++ b/include/linux/blk_types.h @@ -270,6 +270,7 @@ struct bio { */ unsigned short bi_max_vecs; /* max bvl_vecs we can hold */ + unsigned int bi_max_size; /* max data size we can hold */ atomic_t __bi_cnt; /* pin count */ -- 2.28.0 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v2] bio: limit bio max size. 2021-01-21 0:58 ` [PATCH v2] bio: limit bio max size Changheun Lee @ 2021-01-21 1:53 ` Damien Le Moal [not found] ` <CGME20210121095138epcas1p125bb1879c65fafe4636d71b8ab5f90cd@epcas1p1.samsung.com> 2021-01-21 5:23 ` kernel test robot 2021-01-22 3:45 ` Ming Lei 2 siblings, 1 reply; 9+ messages in thread From: Damien Le Moal @ 2021-01-21 1:53 UTC (permalink / raw) To: Changheun Lee, Johannes Thumshirn, axboe, asml.silence, patchwork-bot, osandov, linux-block, linux-kernel, ming.lei, tj, tom.leiming Cc: jisoo2146.oh, junho89.kim, mj0123.lee, seunghwan.hyun, sookwan7.kim, woosung2.lee, yt0928.kim On 2021/01/21 10:13, Changheun Lee wrote: Please drop the "." at the end of the patch title. > bio size can grow up to 4GB when muli-page bvec is enabled. > but sometimes it would lead to inefficient behaviors. > in case of large chunk direct I/O, - 32MB chunk read in user space - > all pages for 32MB would be merged to a bio structure if memory address is > continued phsycally. it makes some delay to submit until merge complete. s/if memory address is continued phsycally/if the pages physical addresses are contiguous/ > bio max size should be limited as a proper size. s/as/to/ > > When 32MB chunk read with direct I/O option is coming from userspace, > kernel behavior is below now. it's timeline. > > | bio merge for 32MB. total 8,192 pages are merged. > | total elapsed time is over 2ms. > |------------------ ... ----------------------->| > | 8,192 pages merged a bio. > | at this time, first bio submit is done. > | 1 bio is split to 32 read request and issue. > |---------------> > |---------------> > |---------------> > ...... > |---------------> > |--------------->| > total 19ms elapsed to complete 32MB read done from device. | > > If bio max size is limited with 1MB, behavior is changed below. > > | bio merge for 1MB. 256 pages are merged for each bio. > | total 32 bio will be made. > | total elapsed time is over 2ms. it's same. > | but, first bio submit timing is fast. about 100us. > |--->|--->|--->|---> ... -->|--->|--->|--->|--->| > | 256 pages merged a bio. > | at this time, first bio submit is done. > | and 1 read request is issued for 1 bio. > |---------------> > |---------------> > |---------------> > ...... > |---------------> > |--------------->| > total 17ms elapsed to complete 32MB read done from device. | > > As a result, read request issue timing is faster if bio max size is limited. > Current kernel behavior with multipage bvec, super large bio can be created. > And it lead to delay first I/O request issue. > > Signed-off-by: Changheun Lee <nanich.lee@samsung.com> > > --- > block/bio.c | 17 ++++++++++++++++- > include/linux/bio.h | 13 +++---------- > include/linux/blk_types.h | 1 + > 3 files changed, 20 insertions(+), 11 deletions(-) > > diff --git a/block/bio.c b/block/bio.c > index 1f2cc1fbe283..027503c2e2e7 100644 > --- a/block/bio.c > +++ b/block/bio.c > @@ -284,9 +284,24 @@ void bio_init(struct bio *bio, struct bio_vec *table, > > bio->bi_io_vec = table; > bio->bi_max_vecs = max_vecs; > + bio->bi_max_size = UINT_MAX; > } > EXPORT_SYMBOL(bio_init); > > +void bio_set_dev(struct bio *bio, struct block_device *bdev) > +{ > + if (bio->bi_disk != bdev->bd_disk) > + bio_clear_flag(bio, BIO_THROTTLED); > + > + bio->bi_disk = bdev->bd_disk; > + bio->bi_partno = bdev->bd_partno; > + bio->bi_max_size = blk_queue_get_max_sectors(bio->bi_disk->queue, > + bio_op(bio)) << SECTOR_SHIFT; > + > + bio_associate_blkg(bio); > +} > +EXPORT_SYMBOL(bio_set_dev); > + > /** > * bio_reset - reinitialize a bio > * @bio: bio to reset > @@ -877,7 +892,7 @@ bool __bio_try_merge_page(struct bio *bio, struct page *page, > struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1]; > > if (page_is_mergeable(bv, page, len, off, same_page)) { > - if (bio->bi_iter.bi_size > UINT_MAX - len) { > + if (bio->bi_iter.bi_size > bio->bi_max_size - len) > *same_page = false; > return false; > } > diff --git a/include/linux/bio.h b/include/linux/bio.h > index 1edda614f7ce..b9803e80c259 100644 > --- a/include/linux/bio.h > +++ b/include/linux/bio.h > @@ -113,7 +113,7 @@ static inline bool bio_full(struct bio *bio, unsigned len) > if (bio->bi_vcnt >= bio->bi_max_vecs) > return true; > > - if (bio->bi_iter.bi_size > UINT_MAX - len) > + if (bio->bi_iter.bi_size > bio->bi_max_size - len) > return true; > > return false; > @@ -482,20 +482,13 @@ extern struct bio_vec *bvec_alloc(gfp_t, int, unsigned long *, mempool_t *); > extern void bvec_free(mempool_t *, struct bio_vec *, unsigned int); > extern unsigned int bvec_nr_vecs(unsigned short idx); > extern const char *bio_devname(struct bio *bio, char *buffer); > - > -#define bio_set_dev(bio, bdev) \ > -do { \ > - if ((bio)->bi_disk != (bdev)->bd_disk) \ > - bio_clear_flag(bio, BIO_THROTTLED);\ > - (bio)->bi_disk = (bdev)->bd_disk; \ > - (bio)->bi_partno = (bdev)->bd_partno; \ > - bio_associate_blkg(bio); \ > -} while (0) > +extern void bio_set_dev(struct bio *bio, struct block_device *bdev); > > #define bio_copy_dev(dst, src) \ > do { \ > (dst)->bi_disk = (src)->bi_disk; \ > (dst)->bi_partno = (src)->bi_partno; \ > + (dst)->bi_max_size = (src)->bi_max_size;\ > bio_clone_blkg_association(dst, src); \ > } while (0) > > diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h > index 866f74261b3b..e5dd5b7d8fc1 100644 > --- a/include/linux/blk_types.h > +++ b/include/linux/blk_types.h > @@ -270,6 +270,7 @@ struct bio { > */ > > unsigned short bi_max_vecs; /* max bvl_vecs we can hold */ > + unsigned int bi_max_size; /* max data size we can hold */ > > atomic_t __bi_cnt; /* pin count */ This modification comes at the cost of increasing the bio structure size to simply tell the block layer "do not delay BIO splitting"... I think there is a much simpler approach. What about: 1) Use a request queue flag to indicate "limit BIO size" 2) modify __bio_try_merge_page() to look at that flag to disallow page merging if the bio size exceeds blk_queue_get_max_sectors(), or more ideally a version of it that takes into account the bio start sector. 3) Set the "limit bio size" queue flag in the driver of the device that benefit from this change. Eventually, that could also be controlled through sysfs. With such change, you will get the same result without having to increase the BIO structure size. -- Damien Le Moal Western Digital Research ^ permalink raw reply [flat|nested] 9+ messages in thread
[parent not found: <CGME20210121095138epcas1p125bb1879c65fafe4636d71b8ab5f90cd@epcas1p1.samsung.com>]
* Re: [PATCH v2] bio: limit bio max size [not found] ` <CGME20210121095138epcas1p125bb1879c65fafe4636d71b8ab5f90cd@epcas1p1.samsung.com> @ 2021-01-21 9:36 ` Changheun Lee 2021-01-21 10:17 ` Damien Le Moal 0 siblings, 1 reply; 9+ messages in thread From: Changheun Lee @ 2021-01-21 9:36 UTC (permalink / raw) To: damien.lemoal Cc: Johannes.Thumshirn, asml.silence, axboe, jisoo2146.oh, junho89.kim, linux-block, linux-kernel, ming.lei, mj0123.lee, nanich.lee, osandov, patchwork-bot, seunghwan.hyun, sookwan7.kim, tj, tom.leiming, woosung2.lee, yt0928.kim >Please drop the "." at the end of the patch title. > >> bio size can grow up to 4GB when muli-page bvec is enabled. >> but sometimes it would lead to inefficient behaviors. >> in case of large chunk direct I/O, - 32MB chunk read in user space - >> all pages for 32MB would be merged to a bio structure if memory address is >> continued phsycally. it makes some delay to submit until merge complete. > >s/if memory address is continued phsycally/if the pages physical addresses are >contiguous/ > >> bio max size should be limited as a proper size. > >s/as/to/ Thank you for advice. :) > >> >> When 32MB chunk read with direct I/O option is coming from userspace, >> kernel behavior is below now. it's timeline. >> >> | bio merge for 32MB. total 8,192 pages are merged. >> | total elapsed time is over 2ms. >> |------------------ ... ----------------------->| >> | 8,192 pages merged a bio. >> | at this time, first bio submit is done. >> | 1 bio is split to 32 read request and issue. >> |---------------> >> |---------------> >> |---------------> >> ...... >> |---------------> >> |--------------->| >> total 19ms elapsed to complete 32MB read done from device. | >> >> If bio max size is limited with 1MB, behavior is changed below. >> >> | bio merge for 1MB. 256 pages are merged for each bio. >> | total 32 bio will be made. >> | total elapsed time is over 2ms. it's same. >> | but, first bio submit timing is fast. about 100us. >> |--->|--->|--->|---> ... -->|--->|--->|--->|--->| >> | 256 pages merged a bio. >> | at this time, first bio submit is done. >> | and 1 read request is issued for 1 bio. >> |---------------> >> |---------------> >> |---------------> >> ...... >> |---------------> >> |--------------->| >> total 17ms elapsed to complete 32MB read done from device. | >> >> As a result, read request issue timing is faster if bio max size is limited. >> Current kernel behavior with multipage bvec, super large bio can be created. >> And it lead to delay first I/O request issue. >> >> Signed-off-by: Changheun Lee <nanich.lee@samsung.com> >> >> --- >> block/bio.c | 17 ++++++++++++++++- >> include/linux/bio.h | 13 +++---------- >> include/linux/blk_types.h | 1 + >> 3 files changed, 20 insertions(+), 11 deletions(-) >> >> diff --git a/block/bio.c b/block/bio.c >> index 1f2cc1fbe283..027503c2e2e7 100644 >> --- a/block/bio.c >> +++ b/block/bio.c >> @@ -284,9 +284,24 @@ void bio_init(struct bio *bio, struct bio_vec *table, >> >> bio->bi_io_vec = table; >> bio->bi_max_vecs = max_vecs; >> + bio->bi_max_size = UINT_MAX; >> } >> EXPORT_SYMBOL(bio_init); >> >> +void bio_set_dev(struct bio *bio, struct block_device *bdev) >> +{ >> + if (bio->bi_disk != bdev->bd_disk) >> + bio_clear_flag(bio, BIO_THROTTLED); >> + >> + bio->bi_disk = bdev->bd_disk; >> + bio->bi_partno = bdev->bd_partno; >> + bio->bi_max_size = blk_queue_get_max_sectors(bio->bi_disk->queue, >> + bio_op(bio)) << SECTOR_SHIFT; >> + >> + bio_associate_blkg(bio); >> +} >> +EXPORT_SYMBOL(bio_set_dev); >> + >> /** >> * bio_reset - reinitialize a bio >> * @bio: bio to reset >> @@ -877,7 +892,7 @@ bool __bio_try_merge_page(struct bio *bio, struct page *page, >> struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1]; >> >> if (page_is_mergeable(bv, page, len, off, same_page)) { >> - if (bio->bi_iter.bi_size > UINT_MAX - len) { >> + if (bio->bi_iter.bi_size > bio->bi_max_size - len) >> *same_page = false; >> return false; >> } >> diff --git a/include/linux/bio.h b/include/linux/bio.h >> index 1edda614f7ce..b9803e80c259 100644 >> --- a/include/linux/bio.h >> +++ b/include/linux/bio.h >> @@ -113,7 +113,7 @@ static inline bool bio_full(struct bio *bio, unsigned len) >> if (bio->bi_vcnt >= bio->bi_max_vecs) >> return true; >> >> - if (bio->bi_iter.bi_size > UINT_MAX - len) >> + if (bio->bi_iter.bi_size > bio->bi_max_size - len) >> return true; >> >> return false; >> @@ -482,20 +482,13 @@ extern struct bio_vec *bvec_alloc(gfp_t, int, unsigned long *, mempool_t *); >> extern void bvec_free(mempool_t *, struct bio_vec *, unsigned int); >> extern unsigned int bvec_nr_vecs(unsigned short idx); >> extern const char *bio_devname(struct bio *bio, char *buffer); >> - >> -#define bio_set_dev(bio, bdev) \ >> -do { \ >> - if ((bio)->bi_disk != (bdev)->bd_disk) \ >> - bio_clear_flag(bio, BIO_THROTTLED);\ >> - (bio)->bi_disk = (bdev)->bd_disk; \ >> - (bio)->bi_partno = (bdev)->bd_partno; \ >> - bio_associate_blkg(bio); \ >> -} while (0) >> +extern void bio_set_dev(struct bio *bio, struct block_device *bdev); >> >> #define bio_copy_dev(dst, src) \ >> do { \ >> (dst)->bi_disk = (src)->bi_disk; \ >> (dst)->bi_partno = (src)->bi_partno; \ >> + (dst)->bi_max_size = (src)->bi_max_size;\ >> bio_clone_blkg_association(dst, src); \ >> } while (0) >> >> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h >> index 866f74261b3b..e5dd5b7d8fc1 100644 >> --- a/include/linux/blk_types.h >> +++ b/include/linux/blk_types.h >> @@ -270,6 +270,7 @@ struct bio { >> */ >> >> unsigned short bi_max_vecs; /* max bvl_vecs we can hold */ >> + unsigned int bi_max_size; /* max data size we can hold */ >> >> atomic_t __bi_cnt; /* pin count */ > >This modification comes at the cost of increasing the bio structure size to >simply tell the block layer "do not delay BIO splitting"... > >I think there is a much simpler approach. What about: > >1) Use a request queue flag to indicate "limit BIO size" >2) modify __bio_try_merge_page() to look at that flag to disallow page merging >if the bio size exceeds blk_queue_get_max_sectors(), or more ideally a version >of it that takes into account the bio start sector. >3) Set the "limit bio size" queue flag in the driver of the device that benefit >from this change. Eventually, that could also be controlled through sysfs. > >With such change, you will get the same result without having to increase the >BIO structure size. I have a qustion. Is adding new variable in bio not possible? Additional check for every page merge like as below is inefficient I think. bool __bio_try_merge_page(struct bio *bio, struct page *page, unsigned int len, unsigned int off, bool *same_page) { ... if (page_is_mergeable(bv, page, len, off, same_page)) { if (bio->bi_iter.bi_size > UINT_MAX - len) { *same_page = false; return false; } + if (blk_queue_limit_bio_max_size(bio) && + (bio->bi_iter.bi_size > blk_queue_get_bio_max_size(bio) - len)) { + *same_page = false; + return false; + } bv->bv_len += len; bio->bi_iter.bi_size += len; return true; } ... } static inline bool bio_full(struct bio *bio, unsigned len) { ... if (bio->bi_iter.bi_size > UINT_MAX - len) return true; + if (blk_queue_limit_bio_max_size(bio) && + (bio->bi_iter.bi_size > blk_queue_get_bio_max_size(bio) - len)) + return true; ... } Page merge is CPU-bound job as you said. How about below with adding of bi_max_size in bio? bool __bio_try_merge_page(struct bio *bio, struct page *page, unsigned int len, unsigned int off, bool *same_page) { ... if (page_is_mergeable(bv, page, len, off, same_page)) { - if (bio->bi_iter.bi_size > UINT_MAX - len) { + if (bio->bi_iter.bi_size > bio->bi_max_size - len) { *same_page = false; return false; } bv->bv_len += len; bio->bi_iter.bi_size += len; return true; } ... } static inline bool bio_full(struct bio *bio, unsigned len) { ... - if (bio->bi_iter.bi_size > UINT_MAX - len) + if (bio->bi_iter.bi_size > bio->bi_max_size - len) return true; ... } +void bio_set_dev(struct bio *bio, struct block_device *bdev) +{ + if (bio->bi_disk != bdev->bd_disk) + bio_clear_flag(bio, BIO_THROTTLED); + + bio->bi_disk = bdev->bd_disk; + bio->bi_partno = bdev->bd_partno; + if (blk_queue_limit_bio_max_size(bio)) + bio->bi_max_size = blk_queue_get_bio_max_size(bio); + + bio_associate_blkg(bio); +} +EXPORT_SYMBOL(bio_set_dev); >-- >Damien Le Moal >Western Digital Research --- Changheun Lee Samsung Electronics ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2] bio: limit bio max size 2021-01-21 9:36 ` Changheun Lee @ 2021-01-21 10:17 ` Damien Le Moal [not found] ` <CGME20210121113928epcas1p13c9dd6d1322979a977f24ded689b38de@epcas1p1.samsung.com> 0 siblings, 1 reply; 9+ messages in thread From: Damien Le Moal @ 2021-01-21 10:17 UTC (permalink / raw) To: nanich.lee Cc: seunghwan.hyun, tom.leiming, osandov, mj0123.lee, woosung2.lee, linux-kernel, patchwork-bot, Johannes Thumshirn, tj, jisoo2146.oh, ming.lei, axboe, linux-block, junho89.kim, asml.silence, yt0928.kim, sookwan7.kim On Thu, 2021-01-21 at 18:36 +0900, Changheun Lee wrote: > > Please drop the "." at the end of the patch title. > > > > > bio size can grow up to 4GB when muli-page bvec is enabled. > > > but sometimes it would lead to inefficient behaviors. > > > in case of large chunk direct I/O, - 32MB chunk read in user space - > > > all pages for 32MB would be merged to a bio structure if memory address > > > is > > > continued phsycally. it makes some delay to submit until merge complete. > > > > s/if memory address is continued phsycally/if the pages physical addresses > > are > > contiguous/ > > > > > bio max size should be limited as a proper size. > > > > s/as/to/ > > Thank you for advice. :) > > > > > > > > > When 32MB chunk read with direct I/O option is coming from userspace, > > > kernel behavior is below now. it's timeline. > > > > > > | bio merge for 32MB. total 8,192 pages are merged. > > > | total elapsed time is over 2ms. > > > |------------------ ... ----------------------->| > > > | 8,192 pages merged a > > > bio. > > > | at this time, first > > > bio submit is done. > > > | 1 bio is split to 32 > > > read request and issue. > > > |---------------> > > > |---------------> > > > |---------------> > > > ...... > > > |----- > > > ----------> > > > |---- > > > ----------->| > > > total 19ms elapsed to complete 32MB read done > > > from device. | > > > > > > If bio max size is limited with 1MB, behavior is changed below. > > > > > > | bio merge for 1MB. 256 pages are merged for each bio. > > > | total 32 bio will be made. > > > | total elapsed time is over 2ms. it's same. > > > | but, first bio submit timing is fast. about 100us. > > > |--->|--->|--->|---> ... -->|--->|--->|--->|--->| > > > | 256 pages merged a bio. > > > | at this time, first bio submit is done. > > > | and 1 read request is issued for 1 bio. > > > |---------------> > > > |---------------> > > > |---------------> > > > ...... > > > |---------------> > > > |--------------->| > > > total 17ms elapsed to complete 32MB read done from device. | > > > > > > As a result, read request issue timing is faster if bio max size is > > > limited. > > > Current kernel behavior with multipage bvec, super large bio can be > > > created. > > > And it lead to delay first I/O request issue. > > > > > > Signed-off-by: Changheun Lee <nanich.lee@samsung.com> > > > > > > --- > > > block/bio.c | 17 ++++++++++++++++- > > > include/linux/bio.h | 13 +++---------- > > > include/linux/blk_types.h | 1 + > > > 3 files changed, 20 insertions(+), 11 deletions(-) > > > > > > diff --git a/block/bio.c b/block/bio.c > > > index 1f2cc1fbe283..027503c2e2e7 100644 > > > --- a/block/bio.c > > > +++ b/block/bio.c > > > @@ -284,9 +284,24 @@ void bio_init(struct bio *bio, struct bio_vec > > > *table, > > > > > > bio->bi_io_vec = table; > > > bio->bi_max_vecs = max_vecs; > > > + bio->bi_max_size = UINT_MAX; > > > } > > > EXPORT_SYMBOL(bio_init); > > > > > > +void bio_set_dev(struct bio *bio, struct block_device *bdev) > > > +{ > > > + if (bio->bi_disk != bdev->bd_disk) > > > + bio_clear_flag(bio, BIO_THROTTLED); > > > + > > > + bio->bi_disk = bdev->bd_disk; > > > + bio->bi_partno = bdev->bd_partno; > > > + bio->bi_max_size = blk_queue_get_max_sectors(bio->bi_disk- > > > >queue, > > > + bio_op(bio)) << SECTOR_SHIFT; > > > + > > > + bio_associate_blkg(bio); > > > +} > > > +EXPORT_SYMBOL(bio_set_dev); > > > + > > > /** > > > * bio_reset - reinitialize a bio > > > * @bio: bio to reset > > > @@ -877,7 +892,7 @@ bool __bio_try_merge_page(struct bio *bio, struct > > > page *page, > > > struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1]; > > > > > > if (page_is_mergeable(bv, page, len, off, same_page)) { > > > - if (bio->bi_iter.bi_size > UINT_MAX - len) { > > > + if (bio->bi_iter.bi_size > bio->bi_max_size - > > > len) > > > *same_page = false; > > > return false; > > > } > > > diff --git a/include/linux/bio.h b/include/linux/bio.h > > > index 1edda614f7ce..b9803e80c259 100644 > > > --- a/include/linux/bio.h > > > +++ b/include/linux/bio.h > > > @@ -113,7 +113,7 @@ static inline bool bio_full(struct bio *bio, unsigned > > > len) > > > if (bio->bi_vcnt >= bio->bi_max_vecs) > > > return true; > > > > > > - if (bio->bi_iter.bi_size > UINT_MAX - len) > > > + if (bio->bi_iter.bi_size > bio->bi_max_size - len) > > > return true; > > > > > > return false; > > > @@ -482,20 +482,13 @@ extern struct bio_vec *bvec_alloc(gfp_t, int, > > > unsigned long *, mempool_t *); > > > extern void bvec_free(mempool_t *, struct bio_vec *, unsigned int); > > > extern unsigned int bvec_nr_vecs(unsigned short idx); > > > extern const char *bio_devname(struct bio *bio, char *buffer); > > > - > > > -#define bio_set_dev(bio, bdev) \ > > > -do { \ > > > - if ((bio)->bi_disk != (bdev)->bd_disk) \ > > > - bio_clear_flag(bio, BIO_THROTTLED);\ > > > - (bio)->bi_disk = (bdev)->bd_disk; \ > > > - (bio)->bi_partno = (bdev)->bd_partno; \ > > > - bio_associate_blkg(bio); \ > > > -} while (0) > > > +extern void bio_set_dev(struct bio *bio, struct block_device *bdev); > > > > > > #define bio_copy_dev(dst, src) \ > > > do { \ > > > (dst)->bi_disk = (src)->bi_disk; \ > > > (dst)->bi_partno = (src)->bi_partno; \ > > > + (dst)->bi_max_size = (src)->bi_max_size;\ > > > bio_clone_blkg_association(dst, src); \ > > > } while (0) > > > > > > diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h > > > index 866f74261b3b..e5dd5b7d8fc1 100644 > > > --- a/include/linux/blk_types.h > > > +++ b/include/linux/blk_types.h > > > @@ -270,6 +270,7 @@ struct bio { > > > */ > > > > > > unsigned short bi_max_vecs; /* max bvl_vecs we can > > > hold */ > > > + unsigned int bi_max_size; /* max data size we can > > > hold */ > > > > > > atomic_t __bi_cnt; /* pin count */ > > > > This modification comes at the cost of increasing the bio structure size to > > simply tell the block layer "do not delay BIO splitting"... > > > > I think there is a much simpler approach. What about: > > > > 1) Use a request queue flag to indicate "limit BIO size" > > 2) modify __bio_try_merge_page() to look at that flag to disallow page > > merging > > if the bio size exceeds blk_queue_get_max_sectors(), or more ideally a > > version > > of it that takes into account the bio start sector. > > 3) Set the "limit bio size" queue flag in the driver of the device that > > benefit > > from this change. Eventually, that could also be controlled through sysfs. > > > > With such change, you will get the same result without having to increase > > the > > BIO structure size. > > I have a qustion. > Is adding new variable in bio not possible? It is possible, but since it is a critical kernel resource used a lot, keeping it as small as possible for performance reasons is strongly desired. So if there is a coding scheme that can avoid increasing struct bio size, it should be explored first and discarded only with very good reasons. > Additional check for every page merge like as below is inefficient I think. For the general case of devices that do not care about limiting the bio size (like now), this will add one boolean evaluation (queue flag test). That's it. For your case, sure you now have 2 boolean evals instead of one. But that must be put in perspective with the cost of increasing the bio size. > > bool __bio_try_merge_page(struct bio *bio, struct page *page, > unsigned int len, unsigned int off, bool *same_page) > { > ... > if (page_is_mergeable(bv, page, len, off, same_page)) { > if (bio->bi_iter.bi_size > UINT_MAX - len) { > *same_page = false; > return false; > } > > + if (blk_queue_limit_bio_max_size(bio) && > + (bio->bi_iter.bi_size > > blk_queue_get_bio_max_size(bio) - len)) { > + *same_page = false; > + return false; > + } > > bv->bv_len += len; > bio->bi_iter.bi_size += len; > return true; > } > ... > } > > > static inline bool bio_full(struct bio *bio, unsigned len) > { > ... > if (bio->bi_iter.bi_size > UINT_MAX - len) > return true; > > + if (blk_queue_limit_bio_max_size(bio) && > + (bio->bi_iter.bi_size > blk_queue_get_bio_max_size(bio) - len)) > + return true; > ... > } > > > Page merge is CPU-bound job as you said. > How about below with adding of bi_max_size in bio? I am not a fan of adding a bio field for using it only in one place. This is only my opinion. I will let others comment about this, but personnally I would rather do something like this: #define blk_queue_limit_bio_merge_size(q) \ test_bit(QUEUE_FLAG_LIMIT_MERGE, &(q)->queue_flags) static inline unsigned int bio_max_merge_size(struct bio *bio) { struct request_queue *q = bio->bi_disk->queue; if (blk_queue_limit_bio_merge_size(q)) return blk_queue_get_max_sectors(q, bio_op(bio)) << SECTOR_SHIFT; return UINT_MAX; } and use that helper in __bio_try_merge_page(), e.g.: if (bio->bi_iter.bi_size > bio_max_merge_size(bio) - len) { *same_page = false; return false; } No need to change the bio struct. If you measure performance with and without this change on nullblk, you can verify if it has any impact for regular devices. And for your use case, that should give you the same performance. > > bool __bio_try_merge_page(struct bio *bio, struct page *page, > unsigned int len, unsigned int off, bool *same_page) > { > ... > if (page_is_mergeable(bv, page, len, off, same_page)) { > - if (bio->bi_iter.bi_size > UINT_MAX - len) { > + if (bio->bi_iter.bi_size > bio->bi_max_size - len) { > *same_page = false; > return false; > } > > bv->bv_len += len; > bio->bi_iter.bi_size += len; > return true; > } > ... > } > > > static inline bool bio_full(struct bio *bio, unsigned len) > { > ... > - if (bio->bi_iter.bi_size > UINT_MAX - len) > + if (bio->bi_iter.bi_size > bio->bi_max_size - len) > return true; > ... > } > > +void bio_set_dev(struct bio *bio, struct block_device *bdev) > +{ > + if (bio->bi_disk != bdev->bd_disk) > + bio_clear_flag(bio, BIO_THROTTLED); > + > + bio->bi_disk = bdev->bd_disk; > + bio->bi_partno = bdev->bd_partno; > + if (blk_queue_limit_bio_max_size(bio)) > + bio->bi_max_size = blk_queue_get_bio_max_size(bio); > + > + bio_associate_blkg(bio); > +} > +EXPORT_SYMBOL(bio_set_dev); > > > -- > > Damien Le Moal > > Western Digital Research > > --- > Changheun Lee > Samsung Electronics -- Damien Le Moal Western Digital ^ permalink raw reply [flat|nested] 9+ messages in thread
[parent not found: <CGME20210121113928epcas1p13c9dd6d1322979a977f24ded689b38de@epcas1p1.samsung.com>]
* Re: [PATCH v2] bio: limit bio max size [not found] ` <CGME20210121113928epcas1p13c9dd6d1322979a977f24ded689b38de@epcas1p1.samsung.com> @ 2021-01-21 11:24 ` Changheun Lee 0 siblings, 0 replies; 9+ messages in thread From: Changheun Lee @ 2021-01-21 11:24 UTC (permalink / raw) To: damien.lemoal Cc: Johannes.Thumshirn, asml.silence, axboe, jisoo2146.oh, junho89.kim, linux-block, linux-kernel, ming.lei, mj0123.lee, nanich.lee, osandov, patchwork-bot, seunghwan.hyun, sookwan7.kim, tj, tom.leiming, woosung2.lee, yt0928.kim > On Thu, 2021-01-21 at 18:36 +0900, Changheun Lee wrote: > > > Please drop the "." at the end of the patch title. > > > > > > > bio size can grow up to 4GB when muli-page bvec is enabled. > > > > but sometimes it would lead to inefficient behaviors. > > > > in case of large chunk direct I/O, - 32MB chunk read in user space - > > > > all pages for 32MB would be merged to a bio structure if memory address > > > > is > > > > continued phsycally. it makes some delay to submit until merge complete. > > > > > > s/if memory address is continued phsycally/if the pages physical addresses > > > are > > > contiguous/ > > > > > > > bio max size should be limited as a proper size. > > > > > > s/as/to/ > > > > Thank you for advice. :) > > > > > > > > > > > > > When 32MB chunk read with direct I/O option is coming from userspace, > > > > kernel behavior is below now. it's timeline. > > > > > > > > | bio merge for 32MB. total 8,192 pages are merged. > > > > | total elapsed time is over 2ms. > > > > |------------------ ... ----------------------->| > > > > | 8,192 pages merged a > > > > bio. > > > > | at this time, first > > > > bio submit is done. > > > > | 1 bio is split to 32 > > > > read request and issue. > > > > |---------------> > > > > |---------------> > > > > |---------------> > > > > ...... > > > > |----- > > > > ----------> > > > > |---- > > > > ----------->| > > > > total 19ms elapsed to complete 32MB read done > > > > from device. | > > > > > > > > If bio max size is limited with 1MB, behavior is changed below. > > > > > > > > | bio merge for 1MB. 256 pages are merged for each bio. > > > > | total 32 bio will be made. > > > > | total elapsed time is over 2ms. it's same. > > > > | but, first bio submit timing is fast. about 100us. > > > > |--->|--->|--->|---> ... -->|--->|--->|--->|--->| > > > > | 256 pages merged a bio. > > > > | at this time, first bio submit is done. > > > > | and 1 read request is issued for 1 bio. > > > > |---------------> > > > > |---------------> > > > > |---------------> > > > > ...... > > > > |---------------> > > > > |--------------->| > > > > total 17ms elapsed to complete 32MB read done from device. | > > > > > > > > As a result, read request issue timing is faster if bio max size is > > > > limited. > > > > Current kernel behavior with multipage bvec, super large bio can be > > > > created. > > > > And it lead to delay first I/O request issue. > > > > > > > > Signed-off-by: Changheun Lee <nanich.lee@samsung.com> > > > > > > > > --- > > > > block/bio.c | 17 ++++++++++++++++- > > > > include/linux/bio.h | 13 +++---------- > > > > include/linux/blk_types.h | 1 + > > > > 3 files changed, 20 insertions(+), 11 deletions(-) > > > > > > > > diff --git a/block/bio.c b/block/bio.c > > > > index 1f2cc1fbe283..027503c2e2e7 100644 > > > > --- a/block/bio.c > > > > +++ b/block/bio.c > > > > @@ -284,9 +284,24 @@ void bio_init(struct bio *bio, struct bio_vec > > > > *table, > > > > > > > > bio->bi_io_vec = table; > > > > bio->bi_max_vecs = max_vecs; > > > > + bio->bi_max_size = UINT_MAX; > > > > } > > > > EXPORT_SYMBOL(bio_init); > > > > > > > > +void bio_set_dev(struct bio *bio, struct block_device *bdev) > > > > +{ > > > > + if (bio->bi_disk != bdev->bd_disk) > > > > + bio_clear_flag(bio, BIO_THROTTLED); > > > > + > > > > + bio->bi_disk = bdev->bd_disk; > > > > + bio->bi_partno = bdev->bd_partno; > > > > + bio->bi_max_size = blk_queue_get_max_sectors(bio->bi_disk- > > > > >queue, > > > > + bio_op(bio)) << SECTOR_SHIFT; > > > > + > > > > + bio_associate_blkg(bio); > > > > +} > > > > +EXPORT_SYMBOL(bio_set_dev); > > > > + > > > > /** > > > > * bio_reset - reinitialize a bio > > > > * @bio: bio to reset > > > > @@ -877,7 +892,7 @@ bool __bio_try_merge_page(struct bio *bio, struct > > > > page *page, > > > > struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1]; > > > > > > > > if (page_is_mergeable(bv, page, len, off, same_page)) { > > > > - if (bio->bi_iter.bi_size > UINT_MAX - len) { > > > > + if (bio->bi_iter.bi_size > bio->bi_max_size - > > > > len) > > > > *same_page = false; > > > > return false; > > > > } > > > > diff --git a/include/linux/bio.h b/include/linux/bio.h > > > > index 1edda614f7ce..b9803e80c259 100644 > > > > --- a/include/linux/bio.h > > > > +++ b/include/linux/bio.h > > > > @@ -113,7 +113,7 @@ static inline bool bio_full(struct bio *bio, unsigned > > > > len) > > > > if (bio->bi_vcnt >= bio->bi_max_vecs) > > > > return true; > > > > > > > > - if (bio->bi_iter.bi_size > UINT_MAX - len) > > > > + if (bio->bi_iter.bi_size > bio->bi_max_size - len) > > > > return true; > > > > > > > > return false; > > > > @@ -482,20 +482,13 @@ extern struct bio_vec *bvec_alloc(gfp_t, int, > > > > unsigned long *, mempool_t *); > > > > extern void bvec_free(mempool_t *, struct bio_vec *, unsigned int); > > > > extern unsigned int bvec_nr_vecs(unsigned short idx); > > > > extern const char *bio_devname(struct bio *bio, char *buffer); > > > > - > > > > -#define bio_set_dev(bio, bdev) \ > > > > -do { \ > > > > - if ((bio)->bi_disk != (bdev)->bd_disk) \ > > > > - bio_clear_flag(bio, BIO_THROTTLED);\ > > > > - (bio)->bi_disk = (bdev)->bd_disk; \ > > > > - (bio)->bi_partno = (bdev)->bd_partno; \ > > > > - bio_associate_blkg(bio); \ > > > > -} while (0) > > > > +extern void bio_set_dev(struct bio *bio, struct block_device *bdev); > > > > > > > > #define bio_copy_dev(dst, src) \ > > > > do { \ > > > > (dst)->bi_disk = (src)->bi_disk; \ > > > > (dst)->bi_partno = (src)->bi_partno; \ > > > > + (dst)->bi_max_size = (src)->bi_max_size;\ > > > > bio_clone_blkg_association(dst, src); \ > > > > } while (0) > > > > > > > > diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h > > > > index 866f74261b3b..e5dd5b7d8fc1 100644 > > > > --- a/include/linux/blk_types.h > > > > +++ b/include/linux/blk_types.h > > > > @@ -270,6 +270,7 @@ struct bio { > > > > */ > > > > > > > > unsigned short bi_max_vecs; /* max bvl_vecs we can > > > > hold */ > > > > + unsigned int bi_max_size; /* max data size we can > > > > hold */ > > > > > > > > atomic_t __bi_cnt; /* pin count */ > > > > > > This modification comes at the cost of increasing the bio structure size to > > > simply tell the block layer "do not delay BIO splitting"... > > > > > > I think there is a much simpler approach. What about: > > > > > > 1) Use a request queue flag to indicate "limit BIO size" > > > 2) modify __bio_try_merge_page() to look at that flag to disallow page > > > merging > > > if the bio size exceeds blk_queue_get_max_sectors(), or more ideally a > > > version > > > of it that takes into account the bio start sector. > > > 3) Set the "limit bio size" queue flag in the driver of the device that > > > benefit > > > from this change. Eventually, that could also be controlled through sysfs. > > > > > > With such change, you will get the same result without having to increase > > > the > > > BIO structure size. > > > > I have a qustion. > > Is adding new variable in bio not possible? > > It is possible, but since it is a critical kernel resource used a lot, keeping > it as small as possible for performance reasons is strongly desired. So if > there is a coding scheme that can avoid increasing struct bio size, it should > be explored first and discarded only with very good reasons. I see your point. I agree with you. it's important thing. :) > > > Additional check for every page merge like as below is inefficient I think. > > For the general case of devices that do not care about limiting the bio size > (like now), this will add one boolean evaluation (queue flag test). That's it. > For your case, sure you now have 2 boolean evals instead of one. But that must > be put in perspective with the cost of increasing the bio size. > > > > > bool __bio_try_merge_page(struct bio *bio, struct page *page, > > unsigned int len, unsigned int off, bool *same_page) > > { > > ... > > if (page_is_mergeable(bv, page, len, off, same_page)) { > > if (bio->bi_iter.bi_size > UINT_MAX - len) { > > *same_page = false; > > return false; > > } > > > > + if (blk_queue_limit_bio_max_size(bio) && > > + (bio->bi_iter.bi_size > > > blk_queue_get_bio_max_size(bio) - len)) { > > + *same_page = false; > > + return false; > > + } > > > > bv->bv_len += len; > > bio->bi_iter.bi_size += len; > > return true; > > } > > ... > > } > > > > > > static inline bool bio_full(struct bio *bio, unsigned len) > > { > > ... > > if (bio->bi_iter.bi_size > UINT_MAX - len) > > return true; > > > > + if (blk_queue_limit_bio_max_size(bio) && > > + (bio->bi_iter.bi_size > blk_queue_get_bio_max_size(bio) - len)) > > + return true; > > ... > > } > > > > > > Page merge is CPU-bound job as you said. > > How about below with adding of bi_max_size in bio? > > I am not a fan of adding a bio field for using it only in one place. > This is only my opinion. I will let others comment about this, but personnally > I would rather do something like this: > > #define blk_queue_limit_bio_merge_size(q) \ > test_bit(QUEUE_FLAG_LIMIT_MERGE, &(q)->queue_flags) > > static inline unsigned int bio_max_merge_size(struct bio *bio) > { > struct request_queue *q = bio->bi_disk->queue; > > if (blk_queue_limit_bio_merge_size(q)) > return blk_queue_get_max_sectors(q, bio_op(bio)) > << SECTOR_SHIFT; > return UINT_MAX; > } > > and use that helper in __bio_try_merge_page(), e.g.: > > if (bio->bi_iter.bi_size > bio_max_merge_size(bio) - len) { > *same_page = false; > return false; > } > > No need to change the bio struct. > > If you measure performance with and without this change on nullblk, you can > verify if it has any impact for regular devices. And for your use case, that > should give you the same performance. > OK. I'll wait others comment too for a few days. I'll prepare v3 patch as you like if there are no feedback. :) v2 patch has a compile error already by my misstyping. :( > > > > bool __bio_try_merge_page(struct bio *bio, struct page *page, > > unsigned int len, unsigned int off, bool *same_page) > > { > > ... > > if (page_is_mergeable(bv, page, len, off, same_page)) { > > - if (bio->bi_iter.bi_size > UINT_MAX - len) { > > + if (bio->bi_iter.bi_size > bio->bi_max_size - len) { > > *same_page = false; > > return false; > > } > > > > bv->bv_len += len; > > bio->bi_iter.bi_size += len; > > return true; > > } > > ... > > } > > > > > > static inline bool bio_full(struct bio *bio, unsigned len) > > { > > ... > > - if (bio->bi_iter.bi_size > UINT_MAX - len) > > + if (bio->bi_iter.bi_size > bio->bi_max_size - len) > > return true; > > ... > > } > > > > +void bio_set_dev(struct bio *bio, struct block_device *bdev) > > +{ > > + if (bio->bi_disk != bdev->bd_disk) > > + bio_clear_flag(bio, BIO_THROTTLED); > > + > > + bio->bi_disk = bdev->bd_disk; > > + bio->bi_partno = bdev->bd_partno; > > + if (blk_queue_limit_bio_max_size(bio)) > > + bio->bi_max_size = blk_queue_get_bio_max_size(bio); > > + > > + bio_associate_blkg(bio); > > +} > > +EXPORT_SYMBOL(bio_set_dev); > > > > > -- > > > Damien Le Moal > > > Western Digital Research > > > > --- > > Changheun Lee > > Samsung Electronics > > -- > Damien Le Moal > Western Digital > --- Changheun Lee Samsung Electronics ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2] bio: limit bio max size. 2021-01-21 0:58 ` [PATCH v2] bio: limit bio max size Changheun Lee @ 2021-01-21 5:23 ` kernel test robot 2021-01-21 5:23 ` kernel test robot 2021-01-22 3:45 ` Ming Lei 2 siblings, 0 replies; 9+ messages in thread From: kernel test robot @ 2021-01-21 5:23 UTC (permalink / raw) To: Changheun Lee, Johannes.Thumshirn, axboe, damien.lemoal, asml.silence, patchwork-bot, osandov, linux-block, linux-kernel, ming.lei, tj Cc: kbuild-all [-- Attachment #1: Type: text/plain, Size: 6338 bytes --] Hi Changheun, Thank you for the patch! Yet something to improve: [auto build test ERROR on block/for-next] [also build test ERROR on linux/master linus/master v5.11-rc4 next-20210120] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Changheun-Lee/bio-limit-bio-max-size/20210121-092618 base: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next config: ia64-randconfig-s032-20210121 (attached as .config) compiler: ia64-linux-gcc (GCC) 9.3.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # apt-get install sparse # sparse version: v0.6.3-208-g46a52ca4-dirty # https://github.com/0day-ci/linux/commit/3b3676565e975655879c64b22c674166d5c7ff1c git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Changheun-Lee/bio-limit-bio-max-size/20210121-092618 git checkout 3b3676565e975655879c64b22c674166d5c7ff1c # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=ia64 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All error/warnings (new ones prefixed by >>): block/bio.c: In function '__bio_try_merge_page': >> block/bio.c:895:4: warning: this 'if' clause does not guard... [-Wmisleading-indentation] 895 | if (bio->bi_iter.bi_size > bio->bi_max_size - len) | ^~ block/bio.c:897:5: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if' 897 | return false; | ^~~~~~ block/bio.c: At top level: >> block/bio.c:904:2: error: expected identifier or '(' before 'return' 904 | return false; | ^~~~~~ >> block/bio.c:905:1: error: expected identifier or '(' before '}' token 905 | } | ^ block/bio.c: In function '__bio_try_merge_page': block/bio.c:903:2: error: control reaches end of non-void function [-Werror=return-type] 903 | } | ^ cc1: some warnings being treated as errors vim +904 block/bio.c ^1da177e4c3f4152 fs/bio.c Linus Torvalds 2005-04-16 868 6e68af666f533625 fs/bio.c Mike Christie 2005-11-11 869 /** 0aa69fd32a5f766e block/bio.c Christoph Hellwig 2018-06-01 870 * __bio_try_merge_page - try appending data to an existing bvec. 6e68af666f533625 fs/bio.c Mike Christie 2005-11-11 871 * @bio: destination bio 551879a48f01826f block/bio.c Ming Lei 2019-04-23 872 * @page: start page to add 0aa69fd32a5f766e block/bio.c Christoph Hellwig 2018-06-01 873 * @len: length of the data to add 551879a48f01826f block/bio.c Ming Lei 2019-04-23 874 * @off: offset of the data relative to @page ff896738be381efa block/bio.c Christoph Hellwig 2019-06-17 875 * @same_page: return if the segment has been merged inside the same page 6e68af666f533625 fs/bio.c Mike Christie 2005-11-11 876 * 0aa69fd32a5f766e block/bio.c Christoph Hellwig 2018-06-01 877 * Try to add the data at @page + @off to the last bvec of @bio. This is a 3cf148891799d465 block/bio.c Randy Dunlap 2020-07-30 878 * useful optimisation for file systems with a block size smaller than the 0aa69fd32a5f766e block/bio.c Christoph Hellwig 2018-06-01 879 * page size. 0aa69fd32a5f766e block/bio.c Christoph Hellwig 2018-06-01 880 * 551879a48f01826f block/bio.c Ming Lei 2019-04-23 881 * Warn if (@len, @off) crosses pages in case that @same_page is true. 551879a48f01826f block/bio.c Ming Lei 2019-04-23 882 * 0aa69fd32a5f766e block/bio.c Christoph Hellwig 2018-06-01 883 * Return %true on success or %false on failure. 6e68af666f533625 fs/bio.c Mike Christie 2005-11-11 884 */ 0aa69fd32a5f766e block/bio.c Christoph Hellwig 2018-06-01 885 bool __bio_try_merge_page(struct bio *bio, struct page *page, ff896738be381efa block/bio.c Christoph Hellwig 2019-06-17 886 unsigned int len, unsigned int off, bool *same_page) 6e68af666f533625 fs/bio.c Mike Christie 2005-11-11 887 { c66a14d07c136cc3 block/bio.c Kent Overstreet 2013-11-23 888 if (WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED))) 0aa69fd32a5f766e block/bio.c Christoph Hellwig 2018-06-01 889 return false; 762380ad9322951c block/bio.c Jens Axboe 2014-06-05 890 cc90bc68422318eb block/bio.c Andreas Gruenbacher 2019-12-09 891 if (bio->bi_vcnt > 0) { 0aa69fd32a5f766e block/bio.c Christoph Hellwig 2018-06-01 892 struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1]; c66a14d07c136cc3 block/bio.c Kent Overstreet 2013-11-23 893 5919482e222908d4 block/bio.c Ming Lei 2019-03-17 894 if (page_is_mergeable(bv, page, len, off, same_page)) { 3b3676565e975655 block/bio.c Changheun Lee 2021-01-21 @895 if (bio->bi_iter.bi_size > bio->bi_max_size - len) 2cd896a5e86fc326 block/bio.c Ritesh Harjani 2020-09-09 896 *same_page = false; cc90bc68422318eb block/bio.c Andreas Gruenbacher 2019-12-09 897 return false; 2cd896a5e86fc326 block/bio.c Ritesh Harjani 2020-09-09 898 } c66a14d07c136cc3 block/bio.c Kent Overstreet 2013-11-23 899 bv->bv_len += len; 0aa69fd32a5f766e block/bio.c Christoph Hellwig 2018-06-01 900 bio->bi_iter.bi_size += len; 0aa69fd32a5f766e block/bio.c Christoph Hellwig 2018-06-01 901 return true; 0aa69fd32a5f766e block/bio.c Christoph Hellwig 2018-06-01 902 } 5919482e222908d4 block/bio.c Ming Lei 2019-03-17 903 } 0aa69fd32a5f766e block/bio.c Christoph Hellwig 2018-06-01 @904 return false; c66a14d07c136cc3 block/bio.c Kent Overstreet 2013-11-23 @905 } 0aa69fd32a5f766e block/bio.c Christoph Hellwig 2018-06-01 906 EXPORT_SYMBOL_GPL(__bio_try_merge_page); c66a14d07c136cc3 block/bio.c Kent Overstreet 2013-11-23 907 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org [-- Attachment #2: .config.gz --] [-- Type: application/gzip, Size: 42790 bytes --] ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2] bio: limit bio max size. @ 2021-01-21 5:23 ` kernel test robot 0 siblings, 0 replies; 9+ messages in thread From: kernel test robot @ 2021-01-21 5:23 UTC (permalink / raw) To: kbuild-all [-- Attachment #1: Type: text/plain, Size: 6438 bytes --] Hi Changheun, Thank you for the patch! Yet something to improve: [auto build test ERROR on block/for-next] [also build test ERROR on linux/master linus/master v5.11-rc4 next-20210120] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Changheun-Lee/bio-limit-bio-max-size/20210121-092618 base: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next config: ia64-randconfig-s032-20210121 (attached as .config) compiler: ia64-linux-gcc (GCC) 9.3.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # apt-get install sparse # sparse version: v0.6.3-208-g46a52ca4-dirty # https://github.com/0day-ci/linux/commit/3b3676565e975655879c64b22c674166d5c7ff1c git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Changheun-Lee/bio-limit-bio-max-size/20210121-092618 git checkout 3b3676565e975655879c64b22c674166d5c7ff1c # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=ia64 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All error/warnings (new ones prefixed by >>): block/bio.c: In function '__bio_try_merge_page': >> block/bio.c:895:4: warning: this 'if' clause does not guard... [-Wmisleading-indentation] 895 | if (bio->bi_iter.bi_size > bio->bi_max_size - len) | ^~ block/bio.c:897:5: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if' 897 | return false; | ^~~~~~ block/bio.c: At top level: >> block/bio.c:904:2: error: expected identifier or '(' before 'return' 904 | return false; | ^~~~~~ >> block/bio.c:905:1: error: expected identifier or '(' before '}' token 905 | } | ^ block/bio.c: In function '__bio_try_merge_page': block/bio.c:903:2: error: control reaches end of non-void function [-Werror=return-type] 903 | } | ^ cc1: some warnings being treated as errors vim +904 block/bio.c ^1da177e4c3f4152 fs/bio.c Linus Torvalds 2005-04-16 868 6e68af666f533625 fs/bio.c Mike Christie 2005-11-11 869 /** 0aa69fd32a5f766e block/bio.c Christoph Hellwig 2018-06-01 870 * __bio_try_merge_page - try appending data to an existing bvec. 6e68af666f533625 fs/bio.c Mike Christie 2005-11-11 871 * @bio: destination bio 551879a48f01826f block/bio.c Ming Lei 2019-04-23 872 * @page: start page to add 0aa69fd32a5f766e block/bio.c Christoph Hellwig 2018-06-01 873 * @len: length of the data to add 551879a48f01826f block/bio.c Ming Lei 2019-04-23 874 * @off: offset of the data relative to @page ff896738be381efa block/bio.c Christoph Hellwig 2019-06-17 875 * @same_page: return if the segment has been merged inside the same page 6e68af666f533625 fs/bio.c Mike Christie 2005-11-11 876 * 0aa69fd32a5f766e block/bio.c Christoph Hellwig 2018-06-01 877 * Try to add the data at @page + @off to the last bvec of @bio. This is a 3cf148891799d465 block/bio.c Randy Dunlap 2020-07-30 878 * useful optimisation for file systems with a block size smaller than the 0aa69fd32a5f766e block/bio.c Christoph Hellwig 2018-06-01 879 * page size. 0aa69fd32a5f766e block/bio.c Christoph Hellwig 2018-06-01 880 * 551879a48f01826f block/bio.c Ming Lei 2019-04-23 881 * Warn if (@len, @off) crosses pages in case that @same_page is true. 551879a48f01826f block/bio.c Ming Lei 2019-04-23 882 * 0aa69fd32a5f766e block/bio.c Christoph Hellwig 2018-06-01 883 * Return %true on success or %false on failure. 6e68af666f533625 fs/bio.c Mike Christie 2005-11-11 884 */ 0aa69fd32a5f766e block/bio.c Christoph Hellwig 2018-06-01 885 bool __bio_try_merge_page(struct bio *bio, struct page *page, ff896738be381efa block/bio.c Christoph Hellwig 2019-06-17 886 unsigned int len, unsigned int off, bool *same_page) 6e68af666f533625 fs/bio.c Mike Christie 2005-11-11 887 { c66a14d07c136cc3 block/bio.c Kent Overstreet 2013-11-23 888 if (WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED))) 0aa69fd32a5f766e block/bio.c Christoph Hellwig 2018-06-01 889 return false; 762380ad9322951c block/bio.c Jens Axboe 2014-06-05 890 cc90bc68422318eb block/bio.c Andreas Gruenbacher 2019-12-09 891 if (bio->bi_vcnt > 0) { 0aa69fd32a5f766e block/bio.c Christoph Hellwig 2018-06-01 892 struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1]; c66a14d07c136cc3 block/bio.c Kent Overstreet 2013-11-23 893 5919482e222908d4 block/bio.c Ming Lei 2019-03-17 894 if (page_is_mergeable(bv, page, len, off, same_page)) { 3b3676565e975655 block/bio.c Changheun Lee 2021-01-21 @895 if (bio->bi_iter.bi_size > bio->bi_max_size - len) 2cd896a5e86fc326 block/bio.c Ritesh Harjani 2020-09-09 896 *same_page = false; cc90bc68422318eb block/bio.c Andreas Gruenbacher 2019-12-09 897 return false; 2cd896a5e86fc326 block/bio.c Ritesh Harjani 2020-09-09 898 } c66a14d07c136cc3 block/bio.c Kent Overstreet 2013-11-23 899 bv->bv_len += len; 0aa69fd32a5f766e block/bio.c Christoph Hellwig 2018-06-01 900 bio->bi_iter.bi_size += len; 0aa69fd32a5f766e block/bio.c Christoph Hellwig 2018-06-01 901 return true; 0aa69fd32a5f766e block/bio.c Christoph Hellwig 2018-06-01 902 } 5919482e222908d4 block/bio.c Ming Lei 2019-03-17 903 } 0aa69fd32a5f766e block/bio.c Christoph Hellwig 2018-06-01 @904 return false; c66a14d07c136cc3 block/bio.c Kent Overstreet 2013-11-23 @905 } 0aa69fd32a5f766e block/bio.c Christoph Hellwig 2018-06-01 906 EXPORT_SYMBOL_GPL(__bio_try_merge_page); c66a14d07c136cc3 block/bio.c Kent Overstreet 2013-11-23 907 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org [-- Attachment #2: config.gz --] [-- Type: application/gzip, Size: 42790 bytes --] ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2] bio: limit bio max size. 2021-01-21 0:58 ` [PATCH v2] bio: limit bio max size Changheun Lee 2021-01-21 1:53 ` Damien Le Moal 2021-01-21 5:23 ` kernel test robot @ 2021-01-22 3:45 ` Ming Lei [not found] ` <CGME20210125015636epcas1p344b530317fe3379914449197ad591eff@epcas1p3.samsung.com> 2 siblings, 1 reply; 9+ messages in thread From: Ming Lei @ 2021-01-22 3:45 UTC (permalink / raw) To: Changheun Lee Cc: Johannes.Thumshirn, axboe, damien.lemoal, asml.silence, patchwork-bot, osandov, linux-block, linux-kernel, tj, tom.leiming, jisoo2146.oh, junho89.kim, mj0123.lee, seunghwan.hyun, sookwan7.kim, woosung2.lee, yt0928.kim On Thu, Jan 21, 2021 at 09:58:03AM +0900, Changheun Lee wrote: > bio size can grow up to 4GB when muli-page bvec is enabled. > but sometimes it would lead to inefficient behaviors. > in case of large chunk direct I/O, - 32MB chunk read in user space - > all pages for 32MB would be merged to a bio structure if memory address is > continued phsycally. it makes some delay to submit until merge complete. > bio max size should be limited as a proper size. > > When 32MB chunk read with direct I/O option is coming from userspace, > kernel behavior is below now. it's timeline. IMO, the issue should only exist on sync direct IO and writeback. Not sure if writeback cares this small delay because user data has been written to page cache already. Wrt. your big direct IO case, as I suggested, you may reduce the submission delay a lot by applying THP. Or you can just hardcode the limit in case of sync dio. > > | bio merge for 32MB. total 8,192 pages are merged. > | total elapsed time is over 2ms. > |------------------ ... ----------------------->| > | 8,192 pages merged a bio. > | at this time, first bio submit is done. > | 1 bio is split to 32 read request and issue. > |---------------> > |---------------> > |---------------> > ...... > |---------------> > |--------------->| > total 19ms elapsed to complete 32MB read done from device. | > > If bio max size is limited with 1MB, behavior is changed below. > > | bio merge for 1MB. 256 pages are merged for each bio. > | total 32 bio will be made. > | total elapsed time is over 2ms. it's same. > | but, first bio submit timing is fast. about 100us. > |--->|--->|--->|---> ... -->|--->|--->|--->|--->| > | 256 pages merged a bio. > | at this time, first bio submit is done. > | and 1 read request is issued for 1 bio. > |---------------> > |---------------> > |---------------> > ...... > |---------------> > |--------------->| > total 17ms elapsed to complete 32MB read done from device. | > > As a result, read request issue timing is faster if bio max size is limited. > Current kernel behavior with multipage bvec, super large bio can be created. > And it lead to delay first I/O request issue. > > Signed-off-by: Changheun Lee <nanich.lee@samsung.com> > > --- > block/bio.c | 17 ++++++++++++++++- > include/linux/bio.h | 13 +++---------- > include/linux/blk_types.h | 1 + > 3 files changed, 20 insertions(+), 11 deletions(-) > > diff --git a/block/bio.c b/block/bio.c > index 1f2cc1fbe283..027503c2e2e7 100644 > --- a/block/bio.c > +++ b/block/bio.c > @@ -284,9 +284,24 @@ void bio_init(struct bio *bio, struct bio_vec *table, > > bio->bi_io_vec = table; > bio->bi_max_vecs = max_vecs; > + bio->bi_max_size = UINT_MAX; > } > EXPORT_SYMBOL(bio_init); > > +void bio_set_dev(struct bio *bio, struct block_device *bdev) > +{ > + if (bio->bi_disk != bdev->bd_disk) > + bio_clear_flag(bio, BIO_THROTTLED); > + > + bio->bi_disk = bdev->bd_disk; > + bio->bi_partno = bdev->bd_partno; > + bio->bi_max_size = blk_queue_get_max_sectors(bio->bi_disk->queue, > + bio_op(bio)) << SECTOR_SHIFT; > + > + bio_associate_blkg(bio); > +} > +EXPORT_SYMBOL(bio_set_dev); > + > /** > * bio_reset - reinitialize a bio > * @bio: bio to reset > @@ -877,7 +892,7 @@ bool __bio_try_merge_page(struct bio *bio, struct page *page, > struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1]; > > if (page_is_mergeable(bv, page, len, off, same_page)) { > - if (bio->bi_iter.bi_size > UINT_MAX - len) { > + if (bio->bi_iter.bi_size > bio->bi_max_size - len) > *same_page = false; > return false; > } > diff --git a/include/linux/bio.h b/include/linux/bio.h > index 1edda614f7ce..b9803e80c259 100644 > --- a/include/linux/bio.h > +++ b/include/linux/bio.h > @@ -113,7 +113,7 @@ static inline bool bio_full(struct bio *bio, unsigned len) > if (bio->bi_vcnt >= bio->bi_max_vecs) > return true; > > - if (bio->bi_iter.bi_size > UINT_MAX - len) > + if (bio->bi_iter.bi_size > bio->bi_max_size - len) > return true; > > return false; > @@ -482,20 +482,13 @@ extern struct bio_vec *bvec_alloc(gfp_t, int, unsigned long *, mempool_t *); > extern void bvec_free(mempool_t *, struct bio_vec *, unsigned int); > extern unsigned int bvec_nr_vecs(unsigned short idx); > extern const char *bio_devname(struct bio *bio, char *buffer); > - > -#define bio_set_dev(bio, bdev) \ > -do { \ > - if ((bio)->bi_disk != (bdev)->bd_disk) \ > - bio_clear_flag(bio, BIO_THROTTLED);\ > - (bio)->bi_disk = (bdev)->bd_disk; \ > - (bio)->bi_partno = (bdev)->bd_partno; \ > - bio_associate_blkg(bio); \ > -} while (0) > +extern void bio_set_dev(struct bio *bio, struct block_device *bdev); > > #define bio_copy_dev(dst, src) \ > do { \ > (dst)->bi_disk = (src)->bi_disk; \ > (dst)->bi_partno = (src)->bi_partno; \ > + (dst)->bi_max_size = (src)->bi_max_size;\ > bio_clone_blkg_association(dst, src); \ > } while (0) > > diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h > index 866f74261b3b..e5dd5b7d8fc1 100644 > --- a/include/linux/blk_types.h > +++ b/include/linux/blk_types.h > @@ -270,6 +270,7 @@ struct bio { > */ > > unsigned short bi_max_vecs; /* max bvl_vecs we can hold */ > + unsigned int bi_max_size; /* max data size we can hold */ People don't like to extend bio which can be fit in two cachelines exactly, and adding one 'int' will make it cross 3 cache lines. Thanks, Ming ^ permalink raw reply [flat|nested] 9+ messages in thread
[parent not found: <CGME20210125015636epcas1p344b530317fe3379914449197ad591eff@epcas1p3.samsung.com>]
* Re: [PATCH v2] bio: limit bio max size [not found] ` <CGME20210125015636epcas1p344b530317fe3379914449197ad591eff@epcas1p3.samsung.com> @ 2021-01-25 1:41 ` Changheun Lee 0 siblings, 0 replies; 9+ messages in thread From: Changheun Lee @ 2021-01-25 1:41 UTC (permalink / raw) To: ming.lei Cc: Johannes.Thumshirn, asml.silence, axboe, damien.lemoal, jisoo2146.oh, junho89.kim, linux-block, linux-kernel, mj0123.lee, nanich.lee, osandov, patchwork-bot, seunghwan.hyun, sookwan7.kim, tj, tom.leiming, woosung2.lee, yt0928.kim > On Thu, Jan 21, 2021 at 09:58:03AM +0900, Changheun Lee wrote: > > bio size can grow up to 4GB when muli-page bvec is enabled. > > but sometimes it would lead to inefficient behaviors. > > in case of large chunk direct I/O, - 32MB chunk read in user space - > > all pages for 32MB would be merged to a bio structure if memory address is > > continued phsycally. it makes some delay to submit until merge complete. > > bio max size should be limited as a proper size. > > > > When 32MB chunk read with direct I/O option is coming from userspace, > > kernel behavior is below now. it's timeline. > > IMO, the issue should only exist on sync direct IO and writeback. > Not sure if writeback cares this small delay because user data > has been written to page cache already. > > Wrt. your big direct IO case, as I suggested, you may reduce the > submission delay a lot by applying THP. > > Or you can just hardcode the limit in case of sync dio. As you said, this small delay is not affect in case of writeback. It's not common, but there are needs of direct I/O from userspace. It will be operated optional, not default in v3 patch version. And it'll be activated with queue flag, or sysfs node. Please, review it again. > > > > > | bio merge for 32MB. total 8,192 pages are merged. > > | total elapsed time is over 2ms. > > |------------------ ... ----------------------->| > > | 8,192 pages merged a bio. > > | at this time, first bio submit is done. > > | 1 bio is split to 32 read request and issue. > > |---------------> > > |---------------> > > |---------------> > > ...... > > |---------------> > > |--------------->| > > total 19ms elapsed to complete 32MB read done from device. | > > > > If bio max size is limited with 1MB, behavior is changed below. > > > > | bio merge for 1MB. 256 pages are merged for each bio. > > | total 32 bio will be made. > > | total elapsed time is over 2ms. it's same. > > | but, first bio submit timing is fast. about 100us. > > |--->|--->|--->|---> ... -->|--->|--->|--->|--->| > > | 256 pages merged a bio. > > | at this time, first bio submit is done. > > | and 1 read request is issued for 1 bio. > > |---------------> > > |---------------> > > |---------------> > > ...... > > |---------------> > > |--------------->| > > total 17ms elapsed to complete 32MB read done from device. | > > > > As a result, read request issue timing is faster if bio max size is limited. > > Current kernel behavior with multipage bvec, super large bio can be created. > > And it lead to delay first I/O request issue. > > > > Signed-off-by: Changheun Lee <nanich.lee@samsung.com> > > > > --- > > block/bio.c | 17 ++++++++++++++++- > > include/linux/bio.h | 13 +++---------- > > include/linux/blk_types.h | 1 + > > 3 files changed, 20 insertions(+), 11 deletions(-) > > > > diff --git a/block/bio.c b/block/bio.c > > index 1f2cc1fbe283..027503c2e2e7 100644 > > --- a/block/bio.c > > +++ b/block/bio.c > > @@ -284,9 +284,24 @@ void bio_init(struct bio *bio, struct bio_vec *table, > > > > bio->bi_io_vec = table; > > bio->bi_max_vecs = max_vecs; > > + bio->bi_max_size = UINT_MAX; > > } > > EXPORT_SYMBOL(bio_init); > > > > +void bio_set_dev(struct bio *bio, struct block_device *bdev) > > +{ > > + if (bio->bi_disk != bdev->bd_disk) > > + bio_clear_flag(bio, BIO_THROTTLED); > > + > > + bio->bi_disk = bdev->bd_disk; > > + bio->bi_partno = bdev->bd_partno; > > + bio->bi_max_size = blk_queue_get_max_sectors(bio->bi_disk->queue, > > + bio_op(bio)) << SECTOR_SHIFT; > > + > > + bio_associate_blkg(bio); > > +} > > +EXPORT_SYMBOL(bio_set_dev); > > + > > /** > > * bio_reset - reinitialize a bio > > * @bio: bio to reset > > @@ -877,7 +892,7 @@ bool __bio_try_merge_page(struct bio *bio, struct page *page, > > struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1]; > > > > if (page_is_mergeable(bv, page, len, off, same_page)) { > > - if (bio->bi_iter.bi_size > UINT_MAX - len) { > > + if (bio->bi_iter.bi_size > bio->bi_max_size - len) > > *same_page = false; > > return false; > > } > > diff --git a/include/linux/bio.h b/include/linux/bio.h > > index 1edda614f7ce..b9803e80c259 100644 > > --- a/include/linux/bio.h > > +++ b/include/linux/bio.h > > @@ -113,7 +113,7 @@ static inline bool bio_full(struct bio *bio, unsigned len) > > if (bio->bi_vcnt >= bio->bi_max_vecs) > > return true; > > > > - if (bio->bi_iter.bi_size > UINT_MAX - len) > > + if (bio->bi_iter.bi_size > bio->bi_max_size - len) > > return true; > > > > return false; > > @@ -482,20 +482,13 @@ extern struct bio_vec *bvec_alloc(gfp_t, int, unsigned long *, mempool_t *); > > extern void bvec_free(mempool_t *, struct bio_vec *, unsigned int); > > extern unsigned int bvec_nr_vecs(unsigned short idx); > > extern const char *bio_devname(struct bio *bio, char *buffer); > > - > > -#define bio_set_dev(bio, bdev) \ > > -do { \ > > - if ((bio)->bi_disk != (bdev)->bd_disk) \ > > - bio_clear_flag(bio, BIO_THROTTLED);\ > > - (bio)->bi_disk = (bdev)->bd_disk; \ > > - (bio)->bi_partno = (bdev)->bd_partno; \ > > - bio_associate_blkg(bio); \ > > -} while (0) > > +extern void bio_set_dev(struct bio *bio, struct block_device *bdev); > > > > #define bio_copy_dev(dst, src) \ > > do { \ > > (dst)->bi_disk = (src)->bi_disk; \ > > (dst)->bi_partno = (src)->bi_partno; \ > > + (dst)->bi_max_size = (src)->bi_max_size;\ > > bio_clone_blkg_association(dst, src); \ > > } while (0) > > > > diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h > > index 866f74261b3b..e5dd5b7d8fc1 100644 > > --- a/include/linux/blk_types.h > > +++ b/include/linux/blk_types.h > > @@ -270,6 +270,7 @@ struct bio { > > */ > > > > unsigned short bi_max_vecs; /* max bvl_vecs we can hold */ > > + unsigned int bi_max_size; /* max data size we can hold */ > > People don't like to extend bio which can be fit in two cachelines > exactly, and adding one 'int' will make it cross 3 cache lines. > Yes, you right. good comment. I'll not add a new variable in bio. :) > > Thanks, > Ming > > --- Changheun Lee Samsung Electronics ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2021-01-25 2:22 UTC | newest] Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <CGME20210121011324epcas1p3a213069e873fd324a7ce3188558f0782@epcas1p3.samsung.com> 2021-01-21 0:58 ` [PATCH v2] bio: limit bio max size Changheun Lee 2021-01-21 1:53 ` Damien Le Moal [not found] ` <CGME20210121095138epcas1p125bb1879c65fafe4636d71b8ab5f90cd@epcas1p1.samsung.com> 2021-01-21 9:36 ` Changheun Lee 2021-01-21 10:17 ` Damien Le Moal [not found] ` <CGME20210121113928epcas1p13c9dd6d1322979a977f24ded689b38de@epcas1p1.samsung.com> 2021-01-21 11:24 ` Changheun Lee 2021-01-21 5:23 ` kernel test robot 2021-01-21 5:23 ` kernel test robot 2021-01-22 3:45 ` Ming Lei [not found] ` <CGME20210125015636epcas1p344b530317fe3379914449197ad591eff@epcas1p3.samsung.com> 2021-01-25 1:41 ` Changheun Lee
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.