All of lore.kernel.org
 help / color / mirror / Atom feed
From: Changheun Lee <nanich.lee@samsung.com>
To: damien.lemoal@wdc.com, bvanassche@acm.org,
	Johannes.Thumshirn@wdc.com, asml.silence@gmail.com,
	axboe@kernel.dk, gregkh@linuxfoundation.org, hch@infradead.org,
	linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	ming.lei@redhat.com, osandov@fb.com, patchwork-bot@kernel.org,
	tj@kernel.org, tom.leiming@gmail.com
Cc: jisoo2146.oh@samsung.com, junho89.kim@samsung.com,
	mj0123.lee@samsung.com, seunghwan.hyun@samsung.com,
	sookwan7.kim@samsung.com, woosung2.lee@samsung.com,
	yt0928.kim@samsung.com, Changheun Lee <nanich.lee@samsung.com>
Subject: [PATCH v7 1/3] bio: limit bio max size
Date: Tue, 13 Apr 2021 11:55:00 +0900	[thread overview]
Message-ID: <20210413025502.31579-2-nanich.lee@samsung.com> (raw)
In-Reply-To: <20210413025502.31579-1-nanich.lee@samsung.com>

bio size can grow up to 4GB when muli-page bvec is enabled.
but sometimes it would lead to inefficient behaviors.
in case of large chunk direct I/O, - 32MB chunk read in user space -
all pages for 32MB would be merged to a bio structure if the pages
physical addresses are contiguous. it makes some delay to submit
until merge complete. bio max size should be limited to a proper size.

When 32MB chunk read with direct I/O option is coming from userspace,
kernel behavior is below now in do_direct_IO() loop. it's timeline.

 | bio merge for 32MB. total 8,192 pages are merged.
 | total elapsed time is over 2ms.
 |------------------ ... ----------------------->|
                                                 | 8,192 pages merged a bio.
                                                 | at this time, first bio submit is done.
                                                 | 1 bio is split to 32 read request and issue.
                                                 |--------------->
                                                  |--------------->
                                                   |--------------->
                                                              ......
                                                                   |--------------->
                                                                    |--------------->|
                          total 19ms elapsed to complete 32MB read done from device. |

If bio max size is limited with 1MB, behavior is changed below.

 | bio merge for 1MB. 256 pages are merged for each bio.
 | total 32 bio will be made.
 | total elapsed time is over 2ms. it's same.
 | but, first bio submit timing is fast. about 100us.
 |--->|--->|--->|---> ... -->|--->|--->|--->|--->|
      | 256 pages merged a bio.
      | at this time, first bio submit is done.
      | and 1 read request is issued for 1 bio.
      |--------------->
           |--------------->
                |--------------->
                                      ......
                                                 |--------------->
                                                  |--------------->|
        total 17ms elapsed to complete 32MB read done from device. |

As a result, read request issue timing is faster if bio max size is limited.
Current kernel behavior with multipage bvec, super large bio can be created.
And it lead to delay first I/O request issue.

Signed-off-by: Changheun Lee <nanich.lee@samsung.com>
---
 block/bio.c            | 13 ++++++++++++-
 block/blk-settings.c   | 17 +++++++++++++++++
 include/linux/bio.h    |  4 +++-
 include/linux/blkdev.h |  4 ++++
 4 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 50e579088aca..4bad97f1d37b 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -255,6 +255,17 @@ void bio_init(struct bio *bio, struct bio_vec *table,
 }
 EXPORT_SYMBOL(bio_init);
 
+unsigned int bio_max_size(struct bio *bio)
+{
+	struct request_queue *q = bio->bi_bdev->bd_disk->queue;
+
+	if (blk_queue_limit_bio_size(q))
+		return blk_queue_get_max_sectors(q, bio_op(bio))
+			<< SECTOR_SHIFT;
+
+	return UINT_MAX;
+}
+
 /**
  * bio_reset - reinitialize a bio
  * @bio:	bio to reset
@@ -866,7 +877,7 @@ bool __bio_try_merge_page(struct bio *bio, struct page *page,
 		struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1];
 
 		if (page_is_mergeable(bv, page, len, off, same_page)) {
-			if (bio->bi_iter.bi_size > UINT_MAX - len) {
+			if (bio->bi_iter.bi_size > bio_max_size(bio) - len) {
 				*same_page = false;
 				return false;
 			}
diff --git a/block/blk-settings.c b/block/blk-settings.c
index b4aa2f37fab6..1d94b97cea4f 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -928,6 +928,23 @@ void blk_queue_set_zoned(struct gendisk *disk, enum blk_zoned_model model)
 }
 EXPORT_SYMBOL_GPL(blk_queue_set_zoned);
 
+/**
+ * blk_queue_set_limit_bio_size - set limit bio size flag
+ * @q:		the request queue for the device
+ * @limit:	limit bio size on(true), or off
+ *
+ * bio max size will be limited to queue max sectors size,
+ * if limit is true.
+ */
+void blk_queue_set_limit_bio_size(struct request_queue *q, bool limit)
+{
+	if (limit)
+		blk_queue_flag_set(QUEUE_FLAG_LIMIT_BIO_SIZE, q);
+	else
+		blk_queue_flag_clear(QUEUE_FLAG_LIMIT_BIO_SIZE, q);
+}
+EXPORT_SYMBOL_GPL(blk_queue_set_limit_bio_size);
+
 static int __init blk_settings_init(void)
 {
 	blk_max_low_pfn = max_low_pfn - 1;
diff --git a/include/linux/bio.h b/include/linux/bio.h
index d0246c92a6e8..e5add63da3af 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -106,6 +106,8 @@ static inline void *bio_data(struct bio *bio)
 	return NULL;
 }
 
+extern unsigned int bio_max_size(struct bio *bio);
+
 /**
  * bio_full - check if the bio is full
  * @bio:	bio to check
@@ -119,7 +121,7 @@ static inline bool bio_full(struct bio *bio, unsigned len)
 	if (bio->bi_vcnt >= bio->bi_max_vecs)
 		return true;
 
-	if (bio->bi_iter.bi_size > UINT_MAX - len)
+	if (bio->bi_iter.bi_size > bio_max_size(bio) - len)
 		return true;
 
 	return false;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 158aefae1030..c69a6ed7a189 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -617,6 +617,7 @@ struct request_queue {
 #define QUEUE_FLAG_RQ_ALLOC_TIME 27	/* record rq->alloc_time_ns */
 #define QUEUE_FLAG_HCTX_ACTIVE	28	/* at least one blk-mq hctx is active */
 #define QUEUE_FLAG_NOWAIT       29	/* device supports NOWAIT */
+#define QUEUE_FLAG_LIMIT_BIO_SIZE 30	/* limit bio size */
 
 #define QUEUE_FLAG_MQ_DEFAULT	((1 << QUEUE_FLAG_IO_STAT) |		\
 				 (1 << QUEUE_FLAG_SAME_COMP) |		\
@@ -663,6 +664,8 @@ bool blk_queue_flag_test_and_set(unsigned int flag, struct request_queue *q);
 #define blk_queue_fua(q)	test_bit(QUEUE_FLAG_FUA, &(q)->queue_flags)
 #define blk_queue_registered(q)	test_bit(QUEUE_FLAG_REGISTERED, &(q)->queue_flags)
 #define blk_queue_nowait(q)	test_bit(QUEUE_FLAG_NOWAIT, &(q)->queue_flags)
+#define blk_queue_limit_bio_size(q)	\
+	test_bit(QUEUE_FLAG_LIMIT_BIO_SIZE, &(q)->queue_flags)
 
 extern void blk_set_pm_only(struct request_queue *q);
 extern void blk_clear_pm_only(struct request_queue *q);
@@ -1183,6 +1186,7 @@ extern void blk_queue_required_elevator_features(struct request_queue *q,
 						 unsigned int features);
 extern bool blk_queue_can_use_dma_map_merging(struct request_queue *q,
 					      struct device *dev);
+extern void blk_queue_set_limit_bio_size(struct request_queue *q, bool limit);
 
 /*
  * Number of physical segments as sent to the device.
-- 
2.29.0


  parent reply	other threads:[~2021-04-13  3:13 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CGME20210413031256epcas1p2f3c3f140192178928b92fc070010354d@epcas1p2.samsung.com>
2021-04-13  2:54 ` [PATCH v7 0/3] limit bio max size Changheun Lee
     [not found]   ` <CGME20210413031257epcas1p329f38effa71445de2464cee32002e618@epcas1p3.samsung.com>
2021-04-13  2:55     ` Changheun Lee [this message]
2021-04-13 16:18       ` [PATCH v7 1/3] bio: " Bart Van Assche
     [not found]         ` <CGME20210415105608epcas1p269bae87b8a7dab133753f7916420251e@epcas1p2.samsung.com>
2021-04-15 10:38           ` Changheun Lee
2021-04-15 19:18             ` Bart Van Assche
     [not found]               ` <CGME20210416060827epcas1p39350d45cef64c91be681b76180b63140@epcas1p3.samsung.com>
2021-04-16  5:50                 ` Changheun Lee
2021-04-16 15:28                   ` Bart Van Assche
     [not found]                     ` <CGME20210419060745epcas1p220138a5de8e08201a6bcd9193c37fc51@epcas1p2.samsung.com>
2021-04-19  5:49                       ` Changheun Lee
2021-04-19 22:16                         ` Bart Van Assche
     [not found]                           ` <CGME20210420011139epcas1p429e791d6f8ffea596661e9366babbec8@epcas1p4.samsung.com>
2021-04-20  0:53                             ` Changheun Lee
     [not found]   ` <CGME20210413031258epcas1p469e9bd0145a49d440541cee899fd4d8e@epcas1p4.samsung.com>
2021-04-13  2:55     ` [PATCH v7 2/3] ufs: set QUEUE_FLAG_LIMIT_BIO_SIZE Changheun Lee
2021-04-13 16:14       ` Bart Van Assche
     [not found]         ` <CGME20210414015804epcas1p21a7581e22dc553530c516459f32d78a9@epcas1p2.samsung.com>
2021-04-14  1:40           ` Changheun Lee
     [not found]   ` <CGME20210413031259epcas1p4406eaed9ba20e684fc038bf1937b94ff@epcas1p4.samsung.com>
2021-04-13  2:55     ` [PATCH v7 3/3] bio: add limit_bio_size sysfs Changheun Lee
2021-04-13  7:35       ` Greg KH
     [not found]         ` <CGME20210413113510epcas1p29dd90b47ba8c8701a2309fc34698ad29@epcas1p2.samsung.com>
2021-04-13 11:17           ` Changheun Lee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210413025502.31579-2-nanich.lee@samsung.com \
    --to=nanich.lee@samsung.com \
    --cc=Johannes.Thumshirn@wdc.com \
    --cc=asml.silence@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=bvanassche@acm.org \
    --cc=damien.lemoal@wdc.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=hch@infradead.org \
    --cc=jisoo2146.oh@samsung.com \
    --cc=junho89.kim@samsung.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ming.lei@redhat.com \
    --cc=mj0123.lee@samsung.com \
    --cc=osandov@fb.com \
    --cc=patchwork-bot@kernel.org \
    --cc=seunghwan.hyun@samsung.com \
    --cc=sookwan7.kim@samsung.com \
    --cc=tj@kernel.org \
    --cc=tom.leiming@gmail.com \
    --cc=woosung2.lee@samsung.com \
    --cc=yt0928.kim@samsung.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.