* [PATCH v3 0/9] Add support for segments smaller than one page
@ 2023-01-18 22:54 Bart Van Assche
  2023-01-18 22:54 ` [PATCH v3 1/9] block: Introduce QUEUE_FLAG_SUB_PAGE_SEGMENTS and CONFIG_BLK_SUB_PAGE_SEGMENTS Bart Van Assche
                   ` (9 more replies)
  0 siblings, 10 replies; 21+ messages in thread
From: Bart Van Assche @ 2023-01-18 22:54 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-scsi, Jaegeuk Kim, Avri Altman, Adrian Hunter,
	Christoph Hellwig, Ming Lei, Bart Van Assche

Hi Jens,

Several embedded storage controllers need support for DMA segments that are
smaller than the size of one virtual memory page. Hence this patch series.
Please consider it for the next merge window.

Thanks,

Bart.

Changes compared to v2:
- For SCSI drivers, only set the QUEUE_FLAG_SUB_PAGE_SEGMENTS flag if necessary.
- In the scsi_debug patch, sorted the kernel module parameters alphabetically.
  Only set the QUEUE_FLAG_SUB_PAGE_SEGMENTS flag if necessary.
- Added a patch for the UFS Exynos driver that enables
  CONFIG_BLK_SUB_PAGE_SEGMENTS if the page size exceeds 4 KiB.

Changes compared to v1:
- Added a CONFIG variable that controls whether or not small segment support
  is enabled.
- Improved patch descriptions.

Bart Van Assche (9):
  block: Introduce QUEUE_FLAG_SUB_PAGE_SEGMENTS and
    CONFIG_BLK_SUB_PAGE_SEGMENTS
  block: Support configuring limits below the page size
  block: Support submitting passthrough requests with small segments
  block: Add support for filesystem requests and small segments
  block: Add support for small segments in blk_rq_map_user_iov()
  scsi_debug: Support configuring the maximum segment size
  null_blk: Support configuring the maximum segment size
  scsi: core: Set BLK_SUB_PAGE_SEGMENTS for small max_segment_size
    values
  scsi: ufs: exynos: Select CONFIG_BLK_SUB_PAGE_SEGMENTS for large page
    sizes

 block/Kconfig                     |  9 +++++++
 block/blk-map.c                   | 43 ++++++++++++++++++++++++++-----
 block/blk-merge.c                 |  6 +++--
 block/blk-mq.c                    |  2 ++
 block/blk-settings.c              | 20 ++++++++------
 block/blk.h                       | 22 +++++++++++-----
 drivers/block/null_blk/main.c     | 21 ++++++++++++---
 drivers/block/null_blk/null_blk.h |  1 +
 drivers/scsi/scsi_debug.c         | 15 +++++++++++
 drivers/scsi/scsi_lib.c           |  3 +++
 drivers/ufs/host/Kconfig          |  1 +
 include/linux/blkdev.h            |  7 +++++
 12 files changed, 125 insertions(+), 25 deletions(-)


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v3 1/9] block: Introduce QUEUE_FLAG_SUB_PAGE_SEGMENTS and CONFIG_BLK_SUB_PAGE_SEGMENTS
  2023-01-18 22:54 [PATCH v3 0/9] Add support for segments smaller than one page Bart Van Assche
@ 2023-01-18 22:54 ` Bart Van Assche
  2023-01-18 23:02   ` Jens Axboe
  2023-01-18 22:54 ` [PATCH v3 2/9] block: Support configuring limits below the page size Bart Van Assche
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 21+ messages in thread
From: Bart Van Assche @ 2023-01-18 22:54 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-scsi, Jaegeuk Kim, Avri Altman, Adrian Hunter,
	Christoph Hellwig, Ming Lei, Bart Van Assche, Keith Busch

Prepare for introducing support for segments smaller than the page size
by introducing the request queue flag QUEUE_FLAG_SUB_PAGE_SEGMENTS.
Introduce CONFIG_BLK_SUB_PAGE_SEGMENTS to avoid affecting the performance
of block drivers that support segments >= PAGE_SIZE.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Keith Busch <kbusch@kernel.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/Kconfig          | 9 +++++++++
 include/linux/blkdev.h | 7 +++++++
 2 files changed, 16 insertions(+)

diff --git a/block/Kconfig b/block/Kconfig
index 5d9d9c84d516..e85061d2175b 100644
--- a/block/Kconfig
+++ b/block/Kconfig
@@ -35,6 +35,15 @@ config BLOCK_LEGACY_AUTOLOAD
 	  created on demand, but scripts that manually create device nodes and
 	  then call losetup might rely on this behavior.
 
+config BLK_SUB_PAGE_SEGMENTS
+       bool "Support segments smaller than the page size"
+       default n
+       help
+	  Most storage controllers support DMA segments larger than the typical
+	  size of a virtual memory page. Some embedded controllers only support
+	  DMA segments smaller than the page size. Enable this option to support
+	  such controllers.
+
 config BLK_RQ_ALLOC_TIME
 	bool
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 89f51d68c68a..6cbb22fb93ee 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -547,6 +547,7 @@ struct request_queue {
 /* Keep blk_queue_flag_name[] in sync with the definitions below */
 #define QUEUE_FLAG_STOPPED	0	/* queue is stopped */
 #define QUEUE_FLAG_DYING	1	/* queue being torn down */
+#define QUEUE_FLAG_SUB_PAGE_SEGMENTS 2	/* segments smaller than one page */
 #define QUEUE_FLAG_NOMERGES     3	/* disable merge attempts */
 #define QUEUE_FLAG_SAME_COMP	4	/* complete on same CPU-group */
 #define QUEUE_FLAG_FAIL_IO	5	/* fake timeout */
@@ -613,6 +614,12 @@ bool blk_queue_flag_test_and_set(unsigned int flag, struct request_queue *q);
 #define blk_queue_sq_sched(q)	test_bit(QUEUE_FLAG_SQ_SCHED, &(q)->queue_flags)
 #define blk_queue_skip_tagset_quiesce(q) \
 	test_bit(QUEUE_FLAG_SKIP_TAGSET_QUIESCE, &(q)->queue_flags)
+#ifdef CONFIG_BLK_SUB_PAGE_SEGMENTS
+#define blk_queue_sub_page_segments(q)				\
+	test_bit(QUEUE_FLAG_SUB_PAGE_SEGMENTS, &(q)->queue_flags)
+#else
+#define blk_queue_sub_page_segments(q) false
+#endif
 
 extern void blk_set_pm_only(struct request_queue *q);
 extern void blk_clear_pm_only(struct request_queue *q);
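
[Editorial note] The config-gated predicate introduced above can be modeled in
user space as follows. This is an illustrative sketch, not kernel code:
CONFIG_SKETCH stands in for CONFIG_BLK_SUB_PAGE_SEGMENTS, and removing the
define models a kernel built without the option, where the predicate collapses
to a compile-time constant so the sub-page code paths can be optimized away.

```c
#include <stdbool.h>

/* Stand-in for CONFIG_BLK_SUB_PAGE_SEGMENTS; remove to model =n. */
#define CONFIG_SKETCH 1

/* Mirrors the flag bit from the patch above. */
#define QUEUE_FLAG_SUB_PAGE_SEGMENTS 2

#ifdef CONFIG_SKETCH
/* With the option enabled, test the per-queue flag bit. */
static bool sub_page_segments(unsigned long queue_flags)
{
	return queue_flags & (1UL << QUEUE_FLAG_SUB_PAGE_SEGMENTS);
}
#else
/* With the option disabled, the predicate is always false. */
static bool sub_page_segments(unsigned long queue_flags)
{
	(void)queue_flags;
	return false;
}
#endif
```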


* [PATCH v3 2/9] block: Support configuring limits below the page size
  2023-01-18 22:54 [PATCH v3 0/9] Add support for segments smaller than one page Bart Van Assche
  2023-01-18 22:54 ` [PATCH v3 1/9] block: Introduce QUEUE_FLAG_SUB_PAGE_SEGMENTS and CONFIG_BLK_SUB_PAGE_SEGMENTS Bart Van Assche
@ 2023-01-18 22:54 ` Bart Van Assche
  2023-01-18 22:54 ` [PATCH v3 3/9] block: Support submitting passthrough requests with small segments Bart Van Assche
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 21+ messages in thread
From: Bart Van Assche @ 2023-01-18 22:54 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-scsi, Jaegeuk Kim, Avri Altman, Adrian Hunter,
	Christoph Hellwig, Ming Lei, Bart Van Assche, Keith Busch

Allow block drivers to configure the following if
CONFIG_BLK_SUB_PAGE_SEGMENTS=y:
* A maximum number of hardware sectors (max_hw_sectors) smaller than
  PAGE_SIZE >> SECTOR_SHIFT. With PAGE_SIZE = 4096 this means that values
  below 8 are supported.
* A maximum segment size smaller than the page size. This is most useful
  for page sizes above 4096 bytes.

The behavior of the block layer is not modified if
CONFIG_BLK_SUB_PAGE_SEGMENTS=n.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Keith Busch <kbusch@kernel.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/blk-settings.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index 9c9713c9269c..9820ceb18c46 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -122,12 +122,14 @@ EXPORT_SYMBOL(blk_queue_bounce_limit);
 void blk_queue_max_hw_sectors(struct request_queue *q, unsigned int max_hw_sectors)
 {
 	struct queue_limits *limits = &q->limits;
+	unsigned int min_max_hw_sectors = blk_queue_sub_page_segments(q) ? 1 :
+		PAGE_SIZE >> SECTOR_SHIFT;
 	unsigned int max_sectors;
 
-	if ((max_hw_sectors << 9) < PAGE_SIZE) {
-		max_hw_sectors = 1 << (PAGE_SHIFT - 9);
-		printk(KERN_INFO "%s: set to minimum %d\n",
-		       __func__, max_hw_sectors);
+	if (max_hw_sectors < min_max_hw_sectors) {
+		max_hw_sectors = min_max_hw_sectors;
+		printk(KERN_INFO "%s: set to minimum %u\n", __func__,
+		       max_hw_sectors);
 	}
 
 	max_hw_sectors = round_down(max_hw_sectors,
@@ -282,10 +284,12 @@ EXPORT_SYMBOL_GPL(blk_queue_max_discard_segments);
  **/
 void blk_queue_max_segment_size(struct request_queue *q, unsigned int max_size)
 {
-	if (max_size < PAGE_SIZE) {
-		max_size = PAGE_SIZE;
-		printk(KERN_INFO "%s: set to minimum %d\n",
-		       __func__, max_size);
+	unsigned int min_max_segment_size = blk_queue_sub_page_segments(q) ?
+		SECTOR_SIZE : PAGE_SIZE;
+
+	if (max_size < min_max_segment_size) {
+		max_size = min_max_segment_size;
+		printk(KERN_INFO "%s: set to minimum %u\n", __func__, max_size);
 	}
 
 	/* see blk_queue_virt_boundary() for the explanation */
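
[Editorial note] The new minimum-limit rules above can be sketched in user
space as follows, assuming a 4 KiB page and 512-byte sectors. The function and
macro names are illustrative, not the kernel's.

```c
#define SKETCH_SECTOR_SIZE	512u
#define SKETCH_PAGE_SIZE	4096u
#define SKETCH_SECTOR_SHIFT	9

/* Minimum allowed max_segment_size: SECTOR_SIZE with sub-page segment
 * support, PAGE_SIZE without. Values below the minimum are rounded up. */
static unsigned int clamp_max_segment_size(int sub_page, unsigned int max_size)
{
	unsigned int min_size = sub_page ? SKETCH_SECTOR_SIZE : SKETCH_PAGE_SIZE;

	return max_size < min_size ? min_size : max_size;
}

/* Minimum allowed max_hw_sectors: 1 with sub-page segment support,
 * PAGE_SIZE >> SECTOR_SHIFT (8 for 4 KiB pages) without. */
static unsigned int clamp_max_hw_sectors(int sub_page, unsigned int max_hw)
{
	unsigned int min_hw =
		sub_page ? 1 : SKETCH_PAGE_SIZE >> SKETCH_SECTOR_SHIFT;

	return max_hw < min_hw ? min_hw : max_hw;
}
```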


* [PATCH v3 3/9] block: Support submitting passthrough requests with small segments
  2023-01-18 22:54 [PATCH v3 0/9] Add support for segments smaller than one page Bart Van Assche
  2023-01-18 22:54 ` [PATCH v3 1/9] block: Introduce QUEUE_FLAG_SUB_PAGE_SEGMENTS and CONFIG_BLK_SUB_PAGE_SEGMENTS Bart Van Assche
  2023-01-18 22:54 ` [PATCH v3 2/9] block: Support configuring limits below the page size Bart Van Assche
@ 2023-01-18 22:54 ` Bart Van Assche
  2023-01-18 22:54 ` [PATCH v3 4/9] block: Add support for filesystem requests and " Bart Van Assche
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 21+ messages in thread
From: Bart Van Assche @ 2023-01-18 22:54 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-scsi, Jaegeuk Kim, Avri Altman, Adrian Hunter,
	Christoph Hellwig, Ming Lei, Bart Van Assche, Keith Busch

If the segment size is smaller than the page size, a single bvec may need
multiple DMA segments even if it only contains a single page. Hence make
blk_rq_append_bio() count the number of segments per bvec instead of
assuming one segment per bvec.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Keith Busch <kbusch@kernel.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/blk-map.c | 16 +++++++++++++++-
 block/blk.h     | 11 +++++++++++
 2 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/block/blk-map.c b/block/blk-map.c
index 9ee4be4ba2f1..4270db88e2c2 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -523,6 +523,20 @@ static struct bio *bio_copy_kern(struct request_queue *q, void *data,
 	return ERR_PTR(-ENOMEM);
 }
 
+#ifdef CONFIG_BLK_SUB_PAGE_SEGMENTS
+/* Number of DMA segments required to transfer @bytes data. */
+unsigned int blk_segments(const struct queue_limits *limits, unsigned int bytes)
+{
+	const unsigned int mss = limits->max_segment_size;
+
+	if (bytes <= mss)
+		return 1;
+	if (is_power_of_2(mss))
+		return round_up(bytes, mss) >> ilog2(mss);
+	return (bytes + mss - 1) / mss;
+}
+#endif
+
 /*
  * Append a bio to a passthrough request.  Only works if the bio can be merged
  * into the request based on the driver constraints.
@@ -534,7 +548,7 @@ int blk_rq_append_bio(struct request *rq, struct bio *bio)
 	unsigned int nr_segs = 0;
 
 	bio_for_each_bvec(bv, bio, iter)
-		nr_segs++;
+		nr_segs += blk_segments(&rq->q->limits, bv.bv_len);
 
 	if (!rq->bio) {
 		blk_rq_bio_prep(rq, bio, nr_segs);
diff --git a/block/blk.h b/block/blk.h
index 4c3b3325219a..8f5e749ad73b 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -76,6 +76,17 @@ struct bio_vec *bvec_alloc(mempool_t *pool, unsigned short *nr_vecs,
 		gfp_t gfp_mask);
 void bvec_free(mempool_t *pool, struct bio_vec *bv, unsigned short nr_vecs);
 
+#ifdef CONFIG_BLK_SUB_PAGE_SEGMENTS
+unsigned int blk_segments(const struct queue_limits *limits,
+			  unsigned int bytes);
+#else
+static inline unsigned int blk_segments(const struct queue_limits *limits,
+					unsigned int bytes)
+{
+	return 1;
+}
+#endif
+
 static inline bool biovec_phys_mergeable(struct request_queue *q,
 		struct bio_vec *vec1, struct bio_vec *vec2)
 {
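
[Editorial note] The segment-count computation in blk_segments() above can be
modeled in user space as follows. The kernel version uses round_up()/ilog2()
when the maximum segment size is a power of two; both branches compute the same
ceiling division, which is what this sketch uses.

```c
/* Number of DMA segments needed to transfer @bytes of data, given a
 * maximum segment size of @mss bytes (a user-space model of
 * blk_segments() from the patch above). */
static unsigned int nr_segments(unsigned int mss, unsigned int bytes)
{
	if (bytes <= mss)
		return 1;
	/* Ceiling division: equivalent to round_up(bytes, mss) / mss. */
	return (bytes + mss - 1) / mss;
}
```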


* [PATCH v3 4/9] block: Add support for filesystem requests and small segments
  2023-01-18 22:54 [PATCH v3 0/9] Add support for segments smaller than one page Bart Van Assche
                   ` (2 preceding siblings ...)
  2023-01-18 22:54 ` [PATCH v3 3/9] block: Support submitting passthrough requests with small segments Bart Van Assche
@ 2023-01-18 22:54 ` Bart Van Assche
  2023-01-18 22:54 ` [PATCH v3 5/9] block: Add support for small segments in blk_rq_map_user_iov() Bart Van Assche
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 21+ messages in thread
From: Bart Van Assche @ 2023-01-18 22:54 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-scsi, Jaegeuk Kim, Avri Altman, Adrian Hunter,
	Christoph Hellwig, Ming Lei, Bart Van Assche, Keith Busch

Add support for bios with segments smaller than the page size in the bio
splitting code and in the bio submission code.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Keith Busch <kbusch@kernel.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/blk-merge.c |  6 ++++--
 block/blk-mq.c    |  2 ++
 block/blk.h       | 11 +++++------
 3 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index b7c193d67185..bf727f67473d 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -294,7 +294,8 @@ static struct bio *bio_split_rw(struct bio *bio, const struct queue_limits *lim,
 		if (nsegs < lim->max_segments &&
 		    bytes + bv.bv_len <= max_bytes &&
 		    bv.bv_offset + bv.bv_len <= PAGE_SIZE) {
-			nsegs++;
+			/* single-page bvec optimization */
+			nsegs += blk_segments(lim, bv.bv_len);
 			bytes += bv.bv_len;
 		} else {
 			if (bvec_split_segs(lim, &bv, &nsegs, &bytes,
@@ -543,7 +544,8 @@ static int __blk_bios_map_sg(struct request_queue *q, struct bio *bio,
 			    __blk_segment_map_sg_merge(q, &bvec, &bvprv, sg))
 				goto next_bvec;
 
-			if (bvec.bv_offset + bvec.bv_len <= PAGE_SIZE)
+			if (bvec.bv_offset + bvec.bv_len <= PAGE_SIZE &&
+			    bvec.bv_len <= q->limits.max_segment_size)
 				nsegs += __blk_bvec_map_sg(bvec, sglist, sg);
 			else
 				nsegs += blk_bvec_map_sg(q, &bvec, sglist, sg);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 9d463f7563bc..947cae2def76 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2959,6 +2959,8 @@ void blk_mq_submit_bio(struct bio *bio)
 		bio = __bio_split_to_limits(bio, &q->limits, &nr_segs);
 		if (!bio)
 			return;
+	} else if (bio->bi_vcnt == 1) {
+		nr_segs = blk_segments(&q->limits, bio->bi_io_vec[0].bv_len);
 	}
 
 	if (!bio_integrity_prep(bio))
diff --git a/block/blk.h b/block/blk.h
index 8f5e749ad73b..c3dd332ba618 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -316,14 +316,13 @@ static inline bool bio_may_exceed_limits(struct bio *bio,
 	}
 
 	/*
-	 * All drivers must accept single-segments bios that are <= PAGE_SIZE.
-	 * This is a quick and dirty check that relies on the fact that
-	 * bi_io_vec[0] is always valid if a bio has data.  The check might
-	 * lead to occasional false negatives when bios are cloned, but compared
-	 * to the performance impact of cloned bios themselves the loop below
-	 * doesn't matter anyway.
+	 * Check whether bio splitting should be performed. This check may
+	 * trigger the bio splitting code even if splitting is not necessary.
 	 */
 	return lim->chunk_sectors || bio->bi_vcnt != 1 ||
+#ifdef CONFIG_BLK_SUB_PAGE_SEGMENTS
+		bio->bi_io_vec->bv_len > lim->max_segment_size ||
+#endif
 		bio->bi_io_vec->bv_len + bio->bi_io_vec->bv_offset > PAGE_SIZE;
 }
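
[Editorial note] The single-bvec fast path in bio_may_exceed_limits() above can
be sketched in user space as follows, assuming PAGE_SIZE == 4096. The
bv_len > max_segment_size test is the check this patch adds for sub-page
segments; the parameter list flattens the kernel's bio and queue_limits
structures for illustration.

```c
#include <stdbool.h>

/* Returns true when a bio must go through the splitting code. A bio with a
 * single bvec skips splitting only if the bvec fits within one page and,
 * with sub-page segment support, does not exceed max_segment_size. */
static bool may_exceed_limits(unsigned int chunk_sectors, unsigned int bi_vcnt,
			      unsigned int bv_len, unsigned int bv_offset,
			      unsigned int max_segment_size)
{
	return chunk_sectors || bi_vcnt != 1 ||
		bv_len > max_segment_size ||	/* sub-page segment check */
		bv_offset + bv_len > 4096;	/* assumes PAGE_SIZE == 4096 */
}
```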
 


* [PATCH v3 5/9] block: Add support for small segments in blk_rq_map_user_iov()
  2023-01-18 22:54 [PATCH v3 0/9] Add support for segments smaller than one page Bart Van Assche
                   ` (3 preceding siblings ...)
  2023-01-18 22:54 ` [PATCH v3 4/9] block: Add support for filesystem requests and " Bart Van Assche
@ 2023-01-18 22:54 ` Bart Van Assche
  2023-01-18 22:54 ` [PATCH v3 6/9] scsi_debug: Support configuring the maximum segment size Bart Van Assche
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 21+ messages in thread
From: Bart Van Assche @ 2023-01-18 22:54 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-scsi, Jaegeuk Kim, Avri Altman, Adrian Hunter,
	Christoph Hellwig, Ming Lei, Bart Van Assche, Keith Busch

Before changing the return value of bio_add_hw_page() to a value in the
range [0, len], make blk_rq_map_user_iov() fall back to copying the data
if mapping it is not possible due to the segment size limit.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Keith Busch <kbusch@kernel.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/blk-map.c | 27 ++++++++++++++++++++++-----
 1 file changed, 22 insertions(+), 5 deletions(-)

diff --git a/block/blk-map.c b/block/blk-map.c
index 4270db88e2c2..83163f9b2335 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -307,17 +307,26 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
 		else {
 			for (j = 0; j < npages; j++) {
 				struct page *page = pages[j];
-				unsigned int n = PAGE_SIZE - offs;
+				unsigned int n = PAGE_SIZE - offs, added;
 				bool same_page = false;
 
 				if (n > bytes)
 					n = bytes;
 
-				if (!bio_add_hw_page(rq->q, bio, page, n, offs,
-						     max_sectors, &same_page)) {
+				added = bio_add_hw_page(rq->q, bio, page, n,
+						offs, max_sectors, &same_page);
+				if (added == 0) {
 					if (same_page)
 						put_page(page);
 					break;
+				} else if (added != n) {
+					/*
+					 * The segment size is smaller than the
+					 * page size and an iov exceeds the
+					 * segment size. Give up.
+					 */
+					ret = -EREMOTEIO;
+					goto out_unmap;
 				}
 
 				bytes -= n;
@@ -671,10 +680,18 @@ int blk_rq_map_user_iov(struct request_queue *q, struct request *rq,
 
 	i = *iter;
 	do {
-		if (copy)
+		if (copy) {
 			ret = bio_copy_user_iov(rq, map_data, &i, gfp_mask);
-		else
+		} else {
 			ret = bio_map_user_iov(rq, &i, gfp_mask);
+			/*
+			 * Fall back to copying the data if bio_map_user_iov()
+			 * returns -EREMOTEIO.
+			 */
+			if (ret == -EREMOTEIO)
+				ret = bio_copy_user_iov(rq, map_data, &i,
+							gfp_mask);
+		}
 		if (ret)
 			goto unmap_rq;
 		if (!bio)
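
[Editorial note] The fallback introduced above follows a try-then-degrade
pattern that can be sketched in user space as follows. The map/copy stubs
stand in for bio_map_user_iov() and bio_copy_user_iov() and exist only for
illustration.

```c
#include <errno.h>

/* Try the zero-copy mapping path first; fall back to the bounce-buffer
 * copy path only when the iov layout cannot satisfy the segment limit
 * (signaled by -EREMOTEIO). Any other error is returned as-is. */
static int map_or_copy(int (*map_fn)(void), int (*copy_fn)(void))
{
	int ret = map_fn();

	if (ret == -EREMOTEIO)
		ret = copy_fn();
	return ret;
}

/* Illustrative stubs. */
static int map_fails_segment_limit(void) { return -EREMOTEIO; }
static int map_succeeds(void) { return 0; }
static int copy_succeeds(void) { return 0; }
```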


* [PATCH v3 6/9] scsi_debug: Support configuring the maximum segment size
  2023-01-18 22:54 [PATCH v3 0/9] Add support for segments smaller than one page Bart Van Assche
                   ` (4 preceding siblings ...)
  2023-01-18 22:54 ` [PATCH v3 5/9] block: Add support for small segments in blk_rq_map_user_iov() Bart Van Assche
@ 2023-01-18 22:54 ` Bart Van Assche
  2023-01-18 22:54 ` [PATCH v3 7/9] null_blk: " Bart Van Assche
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 21+ messages in thread
From: Bart Van Assche @ 2023-01-18 22:54 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-scsi, Jaegeuk Kim, Avri Altman, Adrian Hunter,
	Christoph Hellwig, Ming Lei, Bart Van Assche, Doug Gilbert,
	Martin K . Petersen

Add a kernel module parameter for configuring the maximum segment size.
This patch enables testing SCSI support for segments smaller than the
page size.

Cc: Doug Gilbert <dgilbert@interlog.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/scsi_debug.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 8553277effb3..d09f05b440ba 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -752,6 +752,7 @@ static int sdebug_host_max_queue;	/* per host */
 static int sdebug_lowest_aligned = DEF_LOWEST_ALIGNED;
 static int sdebug_max_luns = DEF_MAX_LUNS;
 static int sdebug_max_queue = SDEBUG_CANQUEUE;	/* per submit queue */
+static unsigned int sdebug_max_segment_size = BLK_MAX_SEGMENT_SIZE;
 static unsigned int sdebug_medium_error_start = OPT_MEDIUM_ERR_ADDR;
 static int sdebug_medium_error_count = OPT_MEDIUM_ERR_NUM;
 static atomic_t retired_max_queue;	/* if > 0 then was prior max_queue */
@@ -5205,6 +5206,11 @@ static int scsi_debug_slave_alloc(struct scsi_device *sdp)
 	if (sdebug_verbose)
 		pr_info("slave_alloc <%u %u %u %llu>\n",
 		       sdp->host->host_no, sdp->channel, sdp->id, sdp->lun);
+
+	if (sdebug_max_segment_size < PAGE_SIZE)
+		blk_queue_flag_set(QUEUE_FLAG_SUB_PAGE_SEGMENTS,
+				   sdp->request_queue);
+
 	return 0;
 }
 
@@ -5841,6 +5847,7 @@ module_param_named(lowest_aligned, sdebug_lowest_aligned, int, S_IRUGO);
 module_param_named(lun_format, sdebug_lun_am_i, int, S_IRUGO | S_IWUSR);
 module_param_named(max_luns, sdebug_max_luns, int, S_IRUGO | S_IWUSR);
 module_param_named(max_queue, sdebug_max_queue, int, S_IRUGO | S_IWUSR);
+module_param_named(max_segment_size, sdebug_max_segment_size, uint, S_IRUGO);
 module_param_named(medium_error_count, sdebug_medium_error_count, int,
 		   S_IRUGO | S_IWUSR);
 module_param_named(medium_error_start, sdebug_medium_error_start, int,
@@ -5917,6 +5924,7 @@ MODULE_PARM_DESC(lowest_aligned, "lowest aligned lba (def=0)");
 MODULE_PARM_DESC(lun_format, "LUN format: 0->peripheral (def); 1 --> flat address method");
 MODULE_PARM_DESC(max_luns, "number of LUNs per target to simulate(def=1)");
 MODULE_PARM_DESC(max_queue, "max number of queued commands (1 to max(def))");
+MODULE_PARM_DESC(max_segment_size, "max bytes in a single segment");
 MODULE_PARM_DESC(medium_error_count, "count of sectors to return follow on MEDIUM error");
 MODULE_PARM_DESC(medium_error_start, "starting sector number to return MEDIUM error");
 MODULE_PARM_DESC(ndelay, "response delay in nanoseconds (def=0 -> ignore)");
@@ -6920,6 +6928,12 @@ static int __init scsi_debug_init(void)
 		return -EINVAL;
 	}
 
+	if (sdebug_max_segment_size < SECTOR_SIZE) {
+		pr_err("invalid max_segment_size %d\n",
+		       sdebug_max_segment_size);
+		return -EINVAL;
+	}
+
 	switch (sdebug_dif) {
 	case T10_PI_TYPE0_PROTECTION:
 		break;
@@ -7816,6 +7830,7 @@ static int sdebug_driver_probe(struct device *dev)
 
 	sdebug_driver_template.can_queue = sdebug_max_queue;
 	sdebug_driver_template.cmd_per_lun = sdebug_max_queue;
+	sdebug_driver_template.max_segment_size = sdebug_max_segment_size;
 	if (!sdebug_clustering)
 		sdebug_driver_template.dma_boundary = PAGE_SIZE - 1;
 


* [PATCH v3 7/9] null_blk: Support configuring the maximum segment size
  2023-01-18 22:54 [PATCH v3 0/9] Add support for segments smaller than one page Bart Van Assche
                   ` (5 preceding siblings ...)
  2023-01-18 22:54 ` [PATCH v3 6/9] scsi_debug: Support configuring the maximum segment size Bart Van Assche
@ 2023-01-18 22:54 ` Bart Van Assche
  2023-01-18 22:54 ` [PATCH v3 8/9] scsi: core: Set BLK_SUB_PAGE_SEGMENTS for small max_segment_size values Bart Van Assche
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 21+ messages in thread
From: Bart Van Assche @ 2023-01-18 22:54 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-scsi, Jaegeuk Kim, Avri Altman, Adrian Hunter,
	Christoph Hellwig, Ming Lei, Bart Van Assche, Damien Le Moal,
	Chaitanya Kulkarni

Add support for configuring the maximum segment size and for segments
smaller than the page size. This patch enables testing segments smaller
than the page size with a driver that does not call blk_rq_map_sg().

Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Cc: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/block/null_blk/main.c     | 21 ++++++++++++++++++---
 drivers/block/null_blk/null_blk.h |  1 +
 2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c
index 4c601ca9552a..822890d89427 100644
--- a/drivers/block/null_blk/main.c
+++ b/drivers/block/null_blk/main.c
@@ -157,6 +157,10 @@ static int g_max_sectors;
 module_param_named(max_sectors, g_max_sectors, int, 0444);
 MODULE_PARM_DESC(max_sectors, "Maximum size of a command (in 512B sectors)");
 
+static unsigned int g_max_segment_size = BLK_MAX_SEGMENT_SIZE;
+module_param_named(max_segment_size, g_max_segment_size, int, 0444);
+MODULE_PARM_DESC(max_segment_size, "Maximum size of a segment in bytes");
+
 static unsigned int nr_devices = 1;
 module_param(nr_devices, uint, 0444);
 MODULE_PARM_DESC(nr_devices, "Number of devices to register");
@@ -409,6 +413,7 @@ NULLB_DEVICE_ATTR(home_node, uint, NULL);
 NULLB_DEVICE_ATTR(queue_mode, uint, NULL);
 NULLB_DEVICE_ATTR(blocksize, uint, NULL);
 NULLB_DEVICE_ATTR(max_sectors, uint, NULL);
+NULLB_DEVICE_ATTR(max_segment_size, uint, NULL);
 NULLB_DEVICE_ATTR(irqmode, uint, NULL);
 NULLB_DEVICE_ATTR(hw_queue_depth, uint, NULL);
 NULLB_DEVICE_ATTR(index, uint, NULL);
@@ -550,6 +555,7 @@ static struct configfs_attribute *nullb_device_attrs[] = {
 	&nullb_device_attr_queue_mode,
 	&nullb_device_attr_blocksize,
 	&nullb_device_attr_max_sectors,
+	&nullb_device_attr_max_segment_size,
 	&nullb_device_attr_irqmode,
 	&nullb_device_attr_hw_queue_depth,
 	&nullb_device_attr_index,
@@ -630,7 +636,8 @@ static ssize_t memb_group_features_show(struct config_item *item, char *page)
 	return snprintf(page, PAGE_SIZE,
 			"badblocks,blocking,blocksize,cache_size,"
 			"completion_nsec,discard,home_node,hw_queue_depth,"
-			"irqmode,max_sectors,mbps,memory_backed,no_sched,"
+			"irqmode,max_sectors,max_segment_size,mbps,"
+			"memory_backed,no_sched,"
 			"poll_queues,power,queue_mode,shared_tag_bitmap,size,"
 			"submit_queues,use_per_node_hctx,virt_boundary,zoned,"
 			"zone_capacity,zone_max_active,zone_max_open,"
@@ -693,6 +700,7 @@ static struct nullb_device *null_alloc_dev(void)
 	dev->queue_mode = g_queue_mode;
 	dev->blocksize = g_bs;
 	dev->max_sectors = g_max_sectors;
+	dev->max_segment_size = g_max_segment_size;
 	dev->irqmode = g_irqmode;
 	dev->hw_queue_depth = g_hw_queue_depth;
 	dev->blocking = g_blocking;
@@ -1234,6 +1242,8 @@ static int null_transfer(struct nullb *nullb, struct page *page,
 	unsigned int valid_len = len;
 	int err = 0;
 
+	WARN_ONCE(len > dev->max_segment_size, "%u > %u\n", len,
+		  dev->max_segment_size);
 	if (!is_write) {
 		if (dev->zoned)
 			valid_len = null_zone_valid_read_len(nullb,
@@ -1269,7 +1279,8 @@ static int null_handle_rq(struct nullb_cmd *cmd)
 
 	spin_lock_irq(&nullb->lock);
 	rq_for_each_segment(bvec, rq, iter) {
-		len = bvec.bv_len;
+		len = min(bvec.bv_len, nullb->dev->max_segment_size);
+		bvec.bv_len = len;
 		err = null_transfer(nullb, bvec.bv_page, len, bvec.bv_offset,
 				     op_is_write(req_op(rq)), sector,
 				     rq->cmd_flags & REQ_FUA);
@@ -1296,7 +1307,8 @@ static int null_handle_bio(struct nullb_cmd *cmd)
 
 	spin_lock_irq(&nullb->lock);
 	bio_for_each_segment(bvec, bio, iter) {
-		len = bvec.bv_len;
+		len = min(bvec.bv_len, nullb->dev->max_segment_size);
+		bvec.bv_len = len;
 		err = null_transfer(nullb, bvec.bv_page, len, bvec.bv_offset,
 				     op_is_write(bio_op(bio)), sector,
 				     bio->bi_opf & REQ_FUA);
@@ -2108,6 +2120,8 @@ static int null_add_dev(struct nullb_device *dev)
 	nullb->q->queuedata = nullb;
 	blk_queue_flag_set(QUEUE_FLAG_NONROT, nullb->q);
 	blk_queue_flag_clear(QUEUE_FLAG_ADD_RANDOM, nullb->q);
+	if (dev->max_segment_size < PAGE_SIZE)
+		blk_queue_flag_set(QUEUE_FLAG_SUB_PAGE_SEGMENTS, nullb->q);
 
 	mutex_lock(&lock);
 	rv = ida_simple_get(&nullb_indexes, 0, 0, GFP_KERNEL);
@@ -2125,6 +2139,7 @@ static int null_add_dev(struct nullb_device *dev)
 		dev->max_sectors = queue_max_hw_sectors(nullb->q);
 	dev->max_sectors = min(dev->max_sectors, BLK_DEF_MAX_SECTORS);
 	blk_queue_max_hw_sectors(nullb->q, dev->max_sectors);
+	blk_queue_max_segment_size(nullb->q, dev->max_segment_size);
 
 	if (dev->virt_boundary)
 		blk_queue_virt_boundary(nullb->q, PAGE_SIZE - 1);
diff --git a/drivers/block/null_blk/null_blk.h b/drivers/block/null_blk/null_blk.h
index eb5972c50be8..8cb73fe05fa3 100644
--- a/drivers/block/null_blk/null_blk.h
+++ b/drivers/block/null_blk/null_blk.h
@@ -102,6 +102,7 @@ struct nullb_device {
 	unsigned int queue_mode; /* block interface */
 	unsigned int blocksize; /* block size */
 	unsigned int max_sectors; /* Max sectors per command */
+	unsigned int max_segment_size; /* Max size of a single DMA segment. */
 	unsigned int irqmode; /* IRQ completion handler */
 	unsigned int hw_queue_depth; /* queue depth */
 	unsigned int index; /* index of the disk, only valid with a disk */


* [PATCH v3 8/9] scsi: core: Set BLK_SUB_PAGE_SEGMENTS for small max_segment_size values
  2023-01-18 22:54 [PATCH v3 0/9] Add support for segments smaller than one page Bart Van Assche
                   ` (6 preceding siblings ...)
  2023-01-18 22:54 ` [PATCH v3 7/9] null_blk: " Bart Van Assche
@ 2023-01-18 22:54 ` Bart Van Assche
  2023-01-19  5:38   ` Christoph Hellwig
  2023-01-18 22:54 ` [PATCH v3 9/9] scsi: ufs: exynos: Select CONFIG_BLK_SUB_PAGE_SEGMENTS for large page sizes Bart Van Assche
  2023-01-19  1:18 ` [PATCH v3 0/9] Add support for segments smaller than one page Jens Axboe
  9 siblings, 1 reply; 21+ messages in thread
From: Bart Van Assche @ 2023-01-18 22:54 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-scsi, Jaegeuk Kim, Avri Altman, Adrian Hunter,
	Christoph Hellwig, Ming Lei, Bart Van Assche, Alim Akhtar,
	Kiwoong Kim

The block layer only accepts max_segment_size values smaller than the
page size if the QUEUE_FLAG_SUB_PAGE_SEGMENTS flag is set. Hence set this
flag from __scsi_init_queue() if a SCSI host's max_segment_size is
smaller than the page size.

Cc: Alim Akhtar <alim.akhtar@samsung.com>
Cc: Kiwoong Kim <kwmad.kim@samsung.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/scsi_lib.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 9ed1ebcb7443..91f2e7f787d8 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1876,6 +1876,9 @@ void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q)
 {
 	struct device *dev = shost->dma_dev;
 
+	if (shost->max_segment_size && shost->max_segment_size < PAGE_SIZE)
+		blk_queue_flag_set(QUEUE_FLAG_SUB_PAGE_SEGMENTS, q);
+
 	/*
 	 * this limit is imposed by hardware restrictions
 	 */


* [PATCH v3 9/9] scsi: ufs: exynos: Select CONFIG_BLK_SUB_PAGE_SEGMENTS for large page sizes
  2023-01-18 22:54 [PATCH v3 0/9] Add support for segments smaller than one page Bart Van Assche
                   ` (7 preceding siblings ...)
  2023-01-18 22:54 ` [PATCH v3 8/9] scsi: core: Set BLK_SUB_PAGE_SEGMENTS for small max_segment_size values Bart Van Assche
@ 2023-01-18 22:54 ` Bart Van Assche
  2023-01-28  3:55   ` Ming Lei
  2023-01-19  1:18 ` [PATCH v3 0/9] Add support for segments smaller than one page Jens Axboe
  9 siblings, 1 reply; 21+ messages in thread
From: Bart Van Assche @ 2023-01-18 22:54 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-scsi, Jaegeuk Kim, Avri Altman, Adrian Hunter,
	Christoph Hellwig, Ming Lei, Bart Van Assche, Alim Akhtar,
	Kiwoong Kim

Since the maximum segment size supported by the Exynos controller is 4
KiB, this controller needs CONFIG_BLK_SUB_PAGE_SEGMENTS if the page size
exceeds 4 KiB.

Cc: Alim Akhtar <alim.akhtar@samsung.com>
Cc: Kiwoong Kim <kwmad.kim@samsung.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/ufs/host/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/ufs/host/Kconfig b/drivers/ufs/host/Kconfig
index 4cc2dbd79ed0..376a4039912d 100644
--- a/drivers/ufs/host/Kconfig
+++ b/drivers/ufs/host/Kconfig
@@ -117,6 +117,7 @@ config SCSI_UFS_TI_J721E
 config SCSI_UFS_EXYNOS
 	tristate "Exynos specific hooks to UFS controller platform driver"
 	depends on SCSI_UFSHCD_PLATFORM && (ARCH_EXYNOS || COMPILE_TEST)
+	select BLK_SUB_PAGE_SEGMENTS if PAGE_SIZE > 4096
 	help
 	  This selects the Samsung Exynos SoC specific additions to UFSHCD
 	  platform driver.  UFS host on Samsung Exynos SoC includes HCI and

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 1/9] block: Introduce QUEUE_FLAG_SUB_PAGE_SEGMENTS and CONFIG_BLK_SUB_PAGE_SEGMENTS
  2023-01-18 22:54 ` [PATCH v3 1/9] block: Introduce QUEUE_FLAG_SUB_PAGE_SEGMENTS and CONFIG_BLK_SUB_PAGE_SEGMENTS Bart Van Assche
@ 2023-01-18 23:02   ` Jens Axboe
  2023-01-18 23:40     ` Bart Van Assche
  2023-01-21  0:11     ` Bart Van Assche
  0 siblings, 2 replies; 21+ messages in thread
From: Jens Axboe @ 2023-01-18 23:02 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-block, linux-scsi, Jaegeuk Kim, Avri Altman, Adrian Hunter,
	Christoph Hellwig, Ming Lei, Keith Busch

On 1/18/23 3:54 PM, Bart Van Assche wrote:
> Prepare for introducing support for segments smaller than the page size
> by introducing the request queue flag QUEUE_FLAG_SUB_PAGE_SEGMENTS.
> Introduce CONFIG_BLK_SUB_PAGE_SEGMENTS to prevent that performance of
> block drivers that support segments >= PAGE_SIZE would be affected.
> 
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Ming Lei <ming.lei@redhat.com>
> Cc: Keith Busch <kbusch@kernel.org>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  block/Kconfig          | 9 +++++++++
>  include/linux/blkdev.h | 7 +++++++
>  2 files changed, 16 insertions(+)
> 
> diff --git a/block/Kconfig b/block/Kconfig
> index 5d9d9c84d516..e85061d2175b 100644
> --- a/block/Kconfig
> +++ b/block/Kconfig
> @@ -35,6 +35,15 @@ config BLOCK_LEGACY_AUTOLOAD
>  	  created on demand, but scripts that manually create device nodes and
>  	  then call losetup might rely on this behavior.
>  
> +config BLK_SUB_PAGE_SEGMENTS
> +       bool "Support segments smaller than the page size"
> +       default n
> +       help
> +	  Most storage controllers support DMA segments larger than the typical
> +	  size of a virtual memory page. Some embedded controllers only support
> +	  DMA segments smaller than the page size. Enable this option to support
> +	  such controllers.

This should not be a visible option at all, affected drivers should just
select it.

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 1/9] block: Introduce QUEUE_FLAG_SUB_PAGE_SEGMENTS and CONFIG_BLK_SUB_PAGE_SEGMENTS
  2023-01-18 23:02   ` Jens Axboe
@ 2023-01-18 23:40     ` Bart Van Assche
  2023-01-21  0:11     ` Bart Van Assche
  1 sibling, 0 replies; 21+ messages in thread
From: Bart Van Assche @ 2023-01-18 23:40 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-scsi, Jaegeuk Kim, Avri Altman, Adrian Hunter,
	Christoph Hellwig, Ming Lei, Keith Busch

On 1/18/23 15:02, Jens Axboe wrote:
> On 1/18/23 3:54 PM, Bart Van Assche wrote:
>> Prepare for introducing support for segments smaller than the page size
>> by introducing the request queue flag QUEUE_FLAG_SUB_PAGE_SEGMENTS.
>> Introduce CONFIG_BLK_SUB_PAGE_SEGMENTS to prevent that performance of
>> block drivers that support segments >= PAGE_SIZE would be affected.
>>
>> Cc: Christoph Hellwig <hch@lst.de>
>> Cc: Ming Lei <ming.lei@redhat.com>
>> Cc: Keith Busch <kbusch@kernel.org>
>> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
>> ---
>>   block/Kconfig          | 9 +++++++++
>>   include/linux/blkdev.h | 7 +++++++
>>   2 files changed, 16 insertions(+)
>>
>> diff --git a/block/Kconfig b/block/Kconfig
>> index 5d9d9c84d516..e85061d2175b 100644
>> --- a/block/Kconfig
>> +++ b/block/Kconfig
>> @@ -35,6 +35,15 @@ config BLOCK_LEGACY_AUTOLOAD
>>   	  created on demand, but scripts that manually create device nodes and
>>   	  then call losetup might rely on this behavior.
>>   
>> +config BLK_SUB_PAGE_SEGMENTS
>> +       bool "Support segments smaller than the page size"
>> +       default n
>> +       help
>> +	  Most storage controllers support DMA segments larger than the typical
>> +	  size of a virtual memory page. Some embedded controllers only support
>> +	  DMA segments smaller than the page size. Enable this option to support
>> +	  such controllers.
> 
> This should not be a visible option at all, affected drivers should just
> select it.

Hi Jens,

Thanks for the feedback. Unless anyone requests me to realize this in another
way, I plan to merge the patch below into the patch above:


diff --git a/block/Kconfig b/block/Kconfig
index e85061d2175b..b44b449c47bf 100644
--- a/block/Kconfig
+++ b/block/Kconfig
@@ -36,13 +36,7 @@ config BLOCK_LEGACY_AUTOLOAD
  	  then call losetup might rely on this behavior.

  config BLK_SUB_PAGE_SEGMENTS
-       bool "Support segments smaller than the page size"
-       default n
-       help
-	  Most storage controllers support DMA segments larger than the typical
-	  size of a virtual memory page. Some embedded controllers only support
-	  DMA segments smaller than the page size. Enable this option to support
-	  such controllers.
+       bool

  config BLK_RQ_ALLOC_TIME
  	bool



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 0/9] Add support for segments smaller than one page
  2023-01-18 22:54 [PATCH v3 0/9] Add support for segments smaller than one page Bart Van Assche
                   ` (8 preceding siblings ...)
  2023-01-18 22:54 ` [PATCH v3 9/9] scsi: ufs: exynos: Select CONFIG_BLK_SUB_PAGE_SEGMENTS for large page sizes Bart Van Assche
@ 2023-01-19  1:18 ` Jens Axboe
  2023-01-19  4:43   ` Bart Van Assche
  9 siblings, 1 reply; 21+ messages in thread
From: Jens Axboe @ 2023-01-19  1:18 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-block, linux-scsi, Jaegeuk Kim, Avri Altman, Adrian Hunter,
	Christoph Hellwig, Ming Lei

On 1/18/23 3:54 PM, Bart Van Assche wrote:
> Hi Jens,
> 
> Several embedded storage controllers need support for DMA segments that are
> smaller than the size of one virtual memory page. Hence this patch series.
> Please consider this patch series for the next merge window.

Before any real reviews are done, I have to ask "why?". This is pretty
hairy code in the middle of the fast path for some obscure controller.
Why would anyone ship that with > 4k page sizes rather than ship it with
a controller that is sane?

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 0/9] Add support for segments smaller than one page
  2023-01-19  1:18 ` [PATCH v3 0/9] Add support for segments smaller than one page Jens Axboe
@ 2023-01-19  4:43   ` Bart Van Assche
  0 siblings, 0 replies; 21+ messages in thread
From: Bart Van Assche @ 2023-01-19  4:43 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-scsi, Jaegeuk Kim, Avri Altman, Adrian Hunter,
	Christoph Hellwig, Ming Lei

On 1/18/23 17:18, Jens Axboe wrote:
> On 1/18/23 3:54 PM, Bart Van Assche wrote:
>> Hi Jens,
>>
>> Several embedded storage controllers need support for DMA segments that are
>> smaller than the size of one virtual memory page. Hence this patch series.
>> Please consider this patch series for the next merge window.
> 
> Before any real reviews are done, I have to ask "why?". This is pretty
> hairy code in the middle of the fast path for some obscure controller.
> Why would anyone ship that with > 4k page sizes rather than ship it with
> a controller that is sane?

Hi Jens,

The new config variable CONFIG_BLK_SUB_PAGE_SEGMENTS has been introduced 
to make sure that the hot path is *not* affected if that config variable 
is disabled.

Regarding the question "why": the Google Android team would like to 
improve performance by switching from 4 KiB pages to 16 KiB pages. One 
of the widely used UFS controllers (Exynos) has a maximum segment size 
of 4 KiB. Hence this patch series.

This patch series is not only useful for phones but also for Tesla cars 
since Tesla cars use an Exynos UFS controller.

A contributor to the MMC driver told me that this patch series would 
make it possible to simplify the MMC driver significantly (I have not 
yet double-checked this).

Bart.



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 8/9] scsi: core: Set BLK_SUB_PAGE_SEGMENTS for small max_segment_size values
  2023-01-18 22:54 ` [PATCH v3 8/9] scsi: core: Set BLK_SUB_PAGE_SEGMENTS for small max_segment_size values Bart Van Assche
@ 2023-01-19  5:38   ` Christoph Hellwig
  2023-01-19 17:27     ` Bart Van Assche
  0 siblings, 1 reply; 21+ messages in thread
From: Christoph Hellwig @ 2023-01-19  5:38 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, linux-block, linux-scsi, Jaegeuk Kim, Avri Altman,
	Adrian Hunter, Christoph Hellwig, Ming Lei, Alim Akhtar,
	Kiwoong Kim

> +	if (shost->max_segment_size && shost->max_segment_size < PAGE_SIZE)
> +		blk_queue_flag_set(QUEUE_FLAG_SUB_PAGE_SEGMENTS, q);

Independent of me really not wanting this code at all if we can avoid it:
this has no business in the SCSI midlayer or drivers.  Once the config
option is enabled, setting the flag should happen inside
blk_queue_max_segment_size.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 8/9] scsi: core: Set BLK_SUB_PAGE_SEGMENTS for small max_segment_size values
  2023-01-19  5:38   ` Christoph Hellwig
@ 2023-01-19 17:27     ` Bart Van Assche
  0 siblings, 0 replies; 21+ messages in thread
From: Bart Van Assche @ 2023-01-19 17:27 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, linux-block, linux-scsi, Jaegeuk Kim, Avri Altman,
	Adrian Hunter, Ming Lei, Alim Akhtar, Kiwoong Kim

On 1/18/23 21:38, Christoph Hellwig wrote:
>> +	if (shost->max_segment_size && shost->max_segment_size < PAGE_SIZE)
>> +		blk_queue_flag_set(QUEUE_FLAG_SUB_PAGE_SEGMENTS, q);
> 
> Independent of me really not wanting this code at all if we can avoid it:
> this has no business in the SCSI midlayer or drivers.  Once the config
> option is enabled, setting the flag should happen inside
> blk_queue_max_segment_size.

Hi Christoph,

Thanks for having taken a look. I will move this code into the block layer.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 1/9] block: Introduce QUEUE_FLAG_SUB_PAGE_SEGMENTS and CONFIG_BLK_SUB_PAGE_SEGMENTS
  2023-01-18 23:02   ` Jens Axboe
  2023-01-18 23:40     ` Bart Van Assche
@ 2023-01-21  0:11     ` Bart Van Assche
  2023-01-21  2:43       ` Jens Axboe
  1 sibling, 1 reply; 21+ messages in thread
From: Bart Van Assche @ 2023-01-21  0:11 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-scsi, Jaegeuk Kim, Avri Altman, Adrian Hunter,
	Christoph Hellwig, Ming Lei, Keith Busch

On 1/18/23 15:02, Jens Axboe wrote:
> On 1/18/23 3:54 PM, Bart Van Assche wrote:
>> Prepare for introducing support for segments smaller than the page size
>> by introducing the request queue flag QUEUE_FLAG_SUB_PAGE_SEGMENTS.
>> Introduce CONFIG_BLK_SUB_PAGE_SEGMENTS to prevent that performance of
>> block drivers that support segments >= PAGE_SIZE would be affected.
>>
>> Cc: Christoph Hellwig <hch@lst.de>
>> Cc: Ming Lei <ming.lei@redhat.com>
>> Cc: Keith Busch <kbusch@kernel.org>
>> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
>> ---
>>   block/Kconfig          | 9 +++++++++
>>   include/linux/blkdev.h | 7 +++++++
>>   2 files changed, 16 insertions(+)
>>
>> diff --git a/block/Kconfig b/block/Kconfig
>> index 5d9d9c84d516..e85061d2175b 100644
>> --- a/block/Kconfig
>> +++ b/block/Kconfig
>> @@ -35,6 +35,15 @@ config BLOCK_LEGACY_AUTOLOAD
>>   	  created on demand, but scripts that manually create device nodes and
>>   	  then call losetup might rely on this behavior.
>>   
>> +config BLK_SUB_PAGE_SEGMENTS
>> +       bool "Support segments smaller than the page size"
>> +       default n
>> +       help
>> +	  Most storage controllers support DMA segments larger than the typical
>> +	  size of a virtual memory page. Some embedded controllers only support
>> +	  DMA segments smaller than the page size. Enable this option to support
>> +	  such controllers.
> 
> This should not be a visible option at all, affected drivers should just
> select it.

Hi Jens,

If CONFIG_BLK_SUB_PAGE_SEGMENTS is made invisible, how should this 
option be enabled for the scsi_debug and null_blk drivers? Adding 
"select BLK_SUB_PAGE_SEGMENTS" to the Kconfig section of these drivers 
would have the unfortunate side effect that enabling either driver would 
make all block drivers slower. How about making sub-page segment support 
configurable for the scsi_debug and null_blk drivers only? That would 
allow kernel developers who want to test the sub-page segment support to 
enable this functionality without making e.g. distro kernels slower.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 1/9] block: Introduce QUEUE_FLAG_SUB_PAGE_SEGMENTS and CONFIG_BLK_SUB_PAGE_SEGMENTS
  2023-01-21  0:11     ` Bart Van Assche
@ 2023-01-21  2:43       ` Jens Axboe
  2023-01-23 19:58         ` Bart Van Assche
  0 siblings, 1 reply; 21+ messages in thread
From: Jens Axboe @ 2023-01-21  2:43 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-block, linux-scsi, Jaegeuk Kim, Avri Altman, Adrian Hunter,
	Christoph Hellwig, Ming Lei, Keith Busch

On 1/20/23 5:11 PM, Bart Van Assche wrote:
> On 1/18/23 15:02, Jens Axboe wrote:
>> On 1/18/23 3:54 PM, Bart Van Assche wrote:
>>> Prepare for introducing support for segments smaller than the page size
>>> by introducing the request queue flag QUEUE_FLAG_SUB_PAGE_SEGMENTS.
>>> Introduce CONFIG_BLK_SUB_PAGE_SEGMENTS to prevent that performance of
>>> block drivers that support segments >= PAGE_SIZE would be affected.
>>>
>>> Cc: Christoph Hellwig <hch@lst.de>
>>> Cc: Ming Lei <ming.lei@redhat.com>
>>> Cc: Keith Busch <kbusch@kernel.org>
>>> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
>>> ---
>>>   block/Kconfig          | 9 +++++++++
>>>   include/linux/blkdev.h | 7 +++++++
>>>   2 files changed, 16 insertions(+)
>>>
>>> diff --git a/block/Kconfig b/block/Kconfig
>>> index 5d9d9c84d516..e85061d2175b 100644
>>> --- a/block/Kconfig
>>> +++ b/block/Kconfig
>>> @@ -35,6 +35,15 @@ config BLOCK_LEGACY_AUTOLOAD
>>>         created on demand, but scripts that manually create device nodes and
>>>         then call losetup might rely on this behavior.
>>>   +config BLK_SUB_PAGE_SEGMENTS
>>> +       bool "Support segments smaller than the page size"
>>> +       default n
>>> +       help
>>> +      Most storage controllers support DMA segments larger than the typical
>>> +      size of a virtual memory page. Some embedded controllers only support
>>> +      DMA segments smaller than the page size. Enable this option to support
>>> +      such controllers.
>>
>> This should not be a visible option at all, affected drivers should just
>> select it.
> 
> Hi Jens,
> 
> If CONFIG_BLK_SUB_PAGE_SEGMENTS is made invisible, how should this
> option be enabled for the scsi_debug and null_blk drivers? Adding
> "select BLK_SUB_PAGE_SEGMENTS" to the Kconfig section of these drivers
> would have the unfortunate side effect that enabling either driver
> would make all block drivers slower. How about making sub-page segment
> support configurable for the scsi_debug and null_blk drivers only?
> That would allow kernel developers who want to test the sub-page
> segment support to enable this functionality without making e.g.
> distro kernels slower.

You'd need to have a separate sub-option for each of them, Kconfig
style. But this also highlights the usual issue with pretending that
Kconfig options mean that this doesn't matter, because inevitably they
end up getting enabled by default in distros anyway. And then you may as
well not even have them...

Why can't we just handle this in the driver? The segment path is hard
enough to grok in the first place, and this just makes it worse.
Generally not a huge fan of punting this to the driver, but just maybe
it'd make sense in this case since it's just the one. At least that
seems a lot more palatable than the alternative.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 1/9] block: Introduce QUEUE_FLAG_SUB_PAGE_SEGMENTS and CONFIG_BLK_SUB_PAGE_SEGMENTS
  2023-01-21  2:43       ` Jens Axboe
@ 2023-01-23 19:58         ` Bart Van Assche
  0 siblings, 0 replies; 21+ messages in thread
From: Bart Van Assche @ 2023-01-23 19:58 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, linux-scsi, Jaegeuk Kim, Avri Altman, Adrian Hunter,
	Christoph Hellwig, Ming Lei, Keith Busch

On 1/20/23 18:43, Jens Axboe wrote:
> On 1/20/23 5:11 PM, Bart Van Assche wrote:
>> If CONFIG_BLK_SUB_PAGE_SEGMENTS is made invisible, how should this
>> option be enabled for the scsi_debug and null_blk drivers? Adding
>> "select BLK_SUB_PAGE_SEGMENTS" to the Kconfig section of these drivers
>> would have the unfortunate side effect that enabling either driver
>> would make all block drivers slower. How about making sub-page segment
>> support configurable for the scsi_debug and null_blk drivers only?
>> That would allow kernel developers who want to test the sub-page
>> segment support to enable this functionality without making e.g.
>> distro kernels slower.
> 
> You'd need to have a separate sub-option for each of them, Kconfig
> style. But this also highlights the usual issue with pretending that
> Kconfig options means that this doesn't matter, because inevitably they
> end up getting enabled by default in distros anyway. And then you may as
> well not even have them...

Hi Jens,

How about the following approach?
* Remove CONFIG_BLK_SUB_PAGE_SEGMENTS.
* Use the static key mechanism instead of #ifdef
   CONFIG_BLK_SUB_PAGE_SEGMENTS to prevent this patch series from making
   the hot path in the block layer slower.
* Count the number of request queues that need support for segments
   smaller than a page or max_hw_sector values smaller than
   PAGE_SIZE >> SECTOR_SHIFT. If that number changes from zero to one,
   enable the code introduced by this patch series. If that number drops
   to zero, toggle the static key such that the overhead of the code that
   supports small segments is eliminated.

> Why can't we just handle this in the driver? The segment path is hard
> enough to grok in the first place, and this just makes it worse.
> Generally not a huge fan of punting this to the driver, but just maybe
> it'd make sense in this case since it's just the one. At least that
> seems a lot more palatable than the alternative.

One of the functions modified by this patch series is 
blk_rq_append_bio(). That function is called by code that builds a bio 
(e.g. filesystems). Code that builds a bio does not call any block 
driver code. This is why I think it would be hard to move the 
functionality introduced by this patch series into block drivers.

Thanks,

Bart.



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 9/9] scsi: ufs: exynos: Select CONFIG_BLK_SUB_PAGE_SEGMENTS for large page sizes
  2023-01-18 22:54 ` [PATCH v3 9/9] scsi: ufs: exynos: Select CONFIG_BLK_SUB_PAGE_SEGMENTS for large page sizes Bart Van Assche
@ 2023-01-28  3:55   ` Ming Lei
  2023-01-29  1:09     ` Bart Van Assche
  0 siblings, 1 reply; 21+ messages in thread
From: Ming Lei @ 2023-01-28  3:55 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, linux-block, linux-scsi, Jaegeuk Kim, Avri Altman,
	Adrian Hunter, Christoph Hellwig, Alim Akhtar, Kiwoong Kim,
	ming.lei

Hi Bart,

On Wed, Jan 18, 2023 at 02:54:47PM -0800, Bart Van Assche wrote:
> Since the maximum segment size supported by the Exynos controller is 4
> KiB, this controller needs CONFIG_BLK_SUB_PAGE_SEGMENTS if the page size
> exceeds 4 KiB.
> 
> Cc: Alim Akhtar <alim.akhtar@samsung.com>
> Cc: Kiwoong Kim <kwmad.kim@samsung.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  drivers/ufs/host/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/ufs/host/Kconfig b/drivers/ufs/host/Kconfig
> index 4cc2dbd79ed0..376a4039912d 100644
> --- a/drivers/ufs/host/Kconfig
> +++ b/drivers/ufs/host/Kconfig
> @@ -117,6 +117,7 @@ config SCSI_UFS_TI_J721E
>  config SCSI_UFS_EXYNOS
>  	tristate "Exynos specific hooks to UFS controller platform driver"
>  	depends on SCSI_UFSHCD_PLATFORM && (ARCH_EXYNOS || COMPILE_TEST)
> +	select BLK_SUB_PAGE_SEGMENTS if PAGE_SIZE > 4096

I remember that PAGE_SIZE is still 4K on the Android kernel, so
UFS_EXYNOS should work just fine. Or is the Android kernel going
to change PAGE_SIZE?


Thanks,
Ming


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 9/9] scsi: ufs: exynos: Select CONFIG_BLK_SUB_PAGE_SEGMENTS for large page sizes
  2023-01-28  3:55   ` Ming Lei
@ 2023-01-29  1:09     ` Bart Van Assche
  0 siblings, 0 replies; 21+ messages in thread
From: Bart Van Assche @ 2023-01-29  1:09 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-block, linux-scsi, Jaegeuk Kim, Avri Altman,
	Adrian Hunter, Christoph Hellwig, Alim Akhtar, Kiwoong Kim

On 1/27/23 19:55, Ming Lei wrote:
> I remember that PAGE_SIZE is still 4K on Android kernel, so
> UFS_EXYNOS should work just fine, or Android kernel is going
> to change PAGE_SIZE?

Hi Ming,

We want to improve Android performance by increasing the page size from 
4 KiB to 16 KiB. Hence this patch series.

Thanks,

Bart.


^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2023-01-29  1:09 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-18 22:54 [PATCH v3 0/9] Add support for segments smaller than one page Bart Van Assche
2023-01-18 22:54 ` [PATCH v3 1/9] block: Introduce QUEUE_FLAG_SUB_PAGE_SEGMENTS and CONFIG_BLK_SUB_PAGE_SEGMENTS Bart Van Assche
2023-01-18 23:02   ` Jens Axboe
2023-01-18 23:40     ` Bart Van Assche
2023-01-21  0:11     ` Bart Van Assche
2023-01-21  2:43       ` Jens Axboe
2023-01-23 19:58         ` Bart Van Assche
2023-01-18 22:54 ` [PATCH v3 2/9] block: Support configuring limits below the page size Bart Van Assche
2023-01-18 22:54 ` [PATCH v3 3/9] block: Support submitting passthrough requests with small segments Bart Van Assche
2023-01-18 22:54 ` [PATCH v3 4/9] block: Add support for filesystem requests and " Bart Van Assche
2023-01-18 22:54 ` [PATCH v3 5/9] block: Add support for small segments in blk_rq_map_user_iov() Bart Van Assche
2023-01-18 22:54 ` [PATCH v3 6/9] scsi_debug: Support configuring the maximum segment size Bart Van Assche
2023-01-18 22:54 ` [PATCH v3 7/9] null_blk: " Bart Van Assche
2023-01-18 22:54 ` [PATCH v3 8/9] scsi: core: Set BLK_SUB_PAGE_SEGMENTS for small max_segment_size values Bart Van Assche
2023-01-19  5:38   ` Christoph Hellwig
2023-01-19 17:27     ` Bart Van Assche
2023-01-18 22:54 ` [PATCH v3 9/9] scsi: ufs: exynos: Select CONFIG_BLK_SUB_PAGE_SEGMENTS for large page sizes Bart Van Assche
2023-01-28  3:55   ` Ming Lei
2023-01-29  1:09     ` Bart Van Assche
2023-01-19  1:18 ` [PATCH v3 0/9] Add support for segments smaller than one page Jens Axboe
2023-01-19  4:43   ` Bart Van Assche
