* [PATCH v3 0/9] Add support for segments smaller than one page
@ 2023-01-18 22:54 Bart Van Assche
2023-01-18 22:54 ` [PATCH v3 1/9] block: Introduce QUEUE_FLAG_SUB_PAGE_SEGMENTS and CONFIG_BLK_SUB_PAGE_SEGMENTS Bart Van Assche
` (9 more replies)
0 siblings, 10 replies; 21+ messages in thread
From: Bart Van Assche @ 2023-01-18 22:54 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, linux-scsi, Jaegeuk Kim, Avri Altman, Adrian Hunter,
Christoph Hellwig, Ming Lei, Bart Van Assche
Hi Jens,
Several embedded storage controllers need support for DMA segments that are
smaller than the size of one virtual memory page. Hence this patch series.
Please consider this patch series for the next merge window.
Thanks,
Bart.
Changes compared to v2:
- For SCSI drivers, only set flag QUEUE_FLAG_SUB_PAGE_SEGMENTS if necessary.
- In the scsi_debug patch, sorted kernel module parameters alphabetically.
Only set flag QUEUE_FLAG_SUB_PAGE_SEGMENTS if necessary.
- Added a patch for the UFS Exynos driver that enables
CONFIG_BLK_SUB_PAGE_SEGMENTS if the page size exceeds 4 KiB.
Changes compared to v1:
- Added a CONFIG variable that controls whether or not small segment support
is enabled.
- Improved patch descriptions.
Bart Van Assche (9):
block: Introduce QUEUE_FLAG_SUB_PAGE_SEGMENTS and
CONFIG_BLK_SUB_PAGE_SEGMENTS
block: Support configuring limits below the page size
block: Support submitting passthrough requests with small segments
block: Add support for filesystem requests and small segments
block: Add support for small segments in blk_rq_map_user_iov()
scsi_debug: Support configuring the maximum segment size
null_blk: Support configuring the maximum segment size
scsi: core: Set BLK_SUB_PAGE_SEGMENTS for small max_segment_size
values
scsi: ufs: exynos: Select CONFIG_BLK_SUB_PAGE_SEGMENTS for large page
sizes
block/Kconfig | 9 +++++++
block/blk-map.c | 43 ++++++++++++++++++++++++++-----
block/blk-merge.c | 6 +++--
block/blk-mq.c | 2 ++
block/blk-settings.c | 20 ++++++++------
block/blk.h | 22 +++++++++++-----
drivers/block/null_blk/main.c | 21 ++++++++++++---
drivers/block/null_blk/null_blk.h | 1 +
drivers/scsi/scsi_debug.c | 15 +++++++++++
drivers/scsi/scsi_lib.c | 3 +++
drivers/ufs/host/Kconfig | 1 +
include/linux/blkdev.h | 7 +++++
12 files changed, 125 insertions(+), 25 deletions(-)
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v3 1/9] block: Introduce QUEUE_FLAG_SUB_PAGE_SEGMENTS and CONFIG_BLK_SUB_PAGE_SEGMENTS
2023-01-18 22:54 [PATCH v3 0/9] Add support for segments smaller than one page Bart Van Assche
@ 2023-01-18 22:54 ` Bart Van Assche
2023-01-18 23:02 ` Jens Axboe
2023-01-18 22:54 ` [PATCH v3 2/9] block: Support configuring limits below the page size Bart Van Assche
` (8 subsequent siblings)
9 siblings, 1 reply; 21+ messages in thread
From: Bart Van Assche @ 2023-01-18 22:54 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, linux-scsi, Jaegeuk Kim, Avri Altman, Adrian Hunter,
Christoph Hellwig, Ming Lei, Bart Van Assche, Keith Busch
Prepare for introducing support for segments smaller than the page size
by introducing the request queue flag QUEUE_FLAG_SUB_PAGE_SEGMENTS.
Introduce CONFIG_BLK_SUB_PAGE_SEGMENTS to prevent the performance of
block drivers that support segments >= PAGE_SIZE from being affected.
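A user-space model of this config-gated pattern may make the intent clearer. The struct and macro names below mirror the patch but are stand-ins, not the kernel definitions: when the config option is disabled, the flag test compiles to a constant false, so drivers with >= PAGE_SIZE segments pay no runtime cost.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the kernel's request_queue. */
struct request_queue { unsigned long queue_flags; };

#define QUEUE_FLAG_SUB_PAGE_SEGMENTS 2
#define CONFIG_BLK_SUB_PAGE_SEGMENTS 1

#ifdef CONFIG_BLK_SUB_PAGE_SEGMENTS
/* Real flag test: bit 2 of queue_flags. */
#define blk_queue_sub_page_segments(q) \
	(((q)->queue_flags >> QUEUE_FLAG_SUB_PAGE_SEGMENTS) & 1)
#else
/* Compile-time constant: the check vanishes entirely. */
#define blk_queue_sub_page_segments(q) false
#endif
```

With the option enabled, setting bit 2 makes the test return true; with it disabled, the macro is a constant and the compiler can drop any branch guarded by it.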
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Keith Busch <kbusch@kernel.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
block/Kconfig | 9 +++++++++
include/linux/blkdev.h | 7 +++++++
2 files changed, 16 insertions(+)
diff --git a/block/Kconfig b/block/Kconfig
index 5d9d9c84d516..e85061d2175b 100644
--- a/block/Kconfig
+++ b/block/Kconfig
@@ -35,6 +35,15 @@ config BLOCK_LEGACY_AUTOLOAD
created on demand, but scripts that manually create device nodes and
then call losetup might rely on this behavior.
+config BLK_SUB_PAGE_SEGMENTS
+ bool "Support segments smaller than the page size"
+ default n
+ help
+ Most storage controllers support DMA segments larger than the typical
+ size of a virtual memory page. Some embedded controllers only support
+ DMA segments smaller than the page size. Enable this option to support
+ such controllers.
+
config BLK_RQ_ALLOC_TIME
bool
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 89f51d68c68a..6cbb22fb93ee 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -547,6 +547,7 @@ struct request_queue {
/* Keep blk_queue_flag_name[] in sync with the definitions below */
#define QUEUE_FLAG_STOPPED 0 /* queue is stopped */
#define QUEUE_FLAG_DYING 1 /* queue being torn down */
+#define QUEUE_FLAG_SUB_PAGE_SEGMENTS 2 /* segments smaller than one page */
#define QUEUE_FLAG_NOMERGES 3 /* disable merge attempts */
#define QUEUE_FLAG_SAME_COMP 4 /* complete on same CPU-group */
#define QUEUE_FLAG_FAIL_IO 5 /* fake timeout */
@@ -613,6 +614,12 @@ bool blk_queue_flag_test_and_set(unsigned int flag, struct request_queue *q);
#define blk_queue_sq_sched(q) test_bit(QUEUE_FLAG_SQ_SCHED, &(q)->queue_flags)
#define blk_queue_skip_tagset_quiesce(q) \
test_bit(QUEUE_FLAG_SKIP_TAGSET_QUIESCE, &(q)->queue_flags)
+#ifdef CONFIG_BLK_SUB_PAGE_SEGMENTS
+#define blk_queue_sub_page_segments(q) \
+ test_bit(QUEUE_FLAG_SUB_PAGE_SEGMENTS, &(q)->queue_flags)
+#else
+#define blk_queue_sub_page_segments(q) false
+#endif
extern void blk_set_pm_only(struct request_queue *q);
extern void blk_clear_pm_only(struct request_queue *q);
* [PATCH v3 2/9] block: Support configuring limits below the page size
2023-01-18 22:54 [PATCH v3 0/9] Add support for segments smaller than one page Bart Van Assche
2023-01-18 22:54 ` [PATCH v3 1/9] block: Introduce QUEUE_FLAG_SUB_PAGE_SEGMENTS and CONFIG_BLK_SUB_PAGE_SEGMENTS Bart Van Assche
@ 2023-01-18 22:54 ` Bart Van Assche
2023-01-18 22:54 ` [PATCH v3 3/9] block: Support submitting passthrough requests with small segments Bart Van Assche
` (7 subsequent siblings)
9 siblings, 0 replies; 21+ messages in thread
From: Bart Van Assche @ 2023-01-18 22:54 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, linux-scsi, Jaegeuk Kim, Avri Altman, Adrian Hunter,
Christoph Hellwig, Ming Lei, Bart Van Assche, Keith Busch
Allow block drivers to configure the following if
CONFIG_BLK_SUB_PAGE_SEGMENTS=y:
* A maximum number of hardware sectors smaller than
PAGE_SIZE >> SECTOR_SHIFT. With PAGE_SIZE = 4096 this means that values
below 8 are supported.
* A maximum segment size smaller than the page size. This is most useful
for page sizes above 4096 bytes.
The behavior of the block layer is not modified if
CONFIG_BLK_SUB_PAGE_SEGMENTS=n.
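A rough user-space sketch of the clamping behavior this patch introduces (the constants and helper names below are stand-ins, not the kernel code): with sub-page segments enabled the lower bounds drop to one sector, otherwise they stay at one page.

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_SIZE    4096u
#define SECTOR_SHIFT 9
#define SECTOR_SIZE  (1u << SECTOR_SHIFT)

/* Minimum accepted max_hw_sectors: 1 with sub-page segments, else
 * PAGE_SIZE >> SECTOR_SHIFT (8 for 4 KiB pages). */
static unsigned int min_max_hw_sectors(bool sub_page_segments)
{
	return sub_page_segments ? 1 : PAGE_SIZE >> SECTOR_SHIFT;
}

/* Minimum accepted max_segment_size: one sector vs. one page. */
static unsigned int min_max_segment_size(bool sub_page_segments)
{
	return sub_page_segments ? SECTOR_SIZE : PAGE_SIZE;
}

/* Clamp a driver-supplied max_hw_sectors to the applicable minimum,
 * mirroring what blk_queue_max_hw_sectors() does after this patch. */
static unsigned int clamp_max_hw_sectors(unsigned int v, bool sub)
{
	unsigned int lo = min_max_hw_sectors(sub);

	return v < lo ? lo : v;
}
```

For example, a driver asking for 4 hardware sectors is bumped up to 8 on a 4 KiB-page system unless the sub-page-segments flag is set.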
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Keith Busch <kbusch@kernel.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
block/blk-settings.c | 20 ++++++++++++--------
1 file changed, 12 insertions(+), 8 deletions(-)
diff --git a/block/blk-settings.c b/block/blk-settings.c
index 9c9713c9269c..9820ceb18c46 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -122,12 +122,14 @@ EXPORT_SYMBOL(blk_queue_bounce_limit);
void blk_queue_max_hw_sectors(struct request_queue *q, unsigned int max_hw_sectors)
{
struct queue_limits *limits = &q->limits;
+ unsigned int min_max_hw_sectors = blk_queue_sub_page_segments(q) ? 1 :
+ PAGE_SIZE >> SECTOR_SHIFT;
unsigned int max_sectors;
- if ((max_hw_sectors << 9) < PAGE_SIZE) {
- max_hw_sectors = 1 << (PAGE_SHIFT - 9);
- printk(KERN_INFO "%s: set to minimum %d\n",
- __func__, max_hw_sectors);
+ if (max_hw_sectors < min_max_hw_sectors) {
+ max_hw_sectors = min_max_hw_sectors;
+ printk(KERN_INFO "%s: set to minimum %u\n", __func__,
+ max_hw_sectors);
}
max_hw_sectors = round_down(max_hw_sectors,
@@ -282,10 +284,12 @@ EXPORT_SYMBOL_GPL(blk_queue_max_discard_segments);
**/
void blk_queue_max_segment_size(struct request_queue *q, unsigned int max_size)
{
- if (max_size < PAGE_SIZE) {
- max_size = PAGE_SIZE;
- printk(KERN_INFO "%s: set to minimum %d\n",
- __func__, max_size);
+ unsigned int min_max_segment_size = blk_queue_sub_page_segments(q) ?
+ SECTOR_SIZE : PAGE_SIZE;
+
+ if (max_size < min_max_segment_size) {
+ max_size = min_max_segment_size;
+ printk(KERN_INFO "%s: set to minimum %u\n", __func__, max_size);
}
/* see blk_queue_virt_boundary() for the explanation */
* [PATCH v3 3/9] block: Support submitting passthrough requests with small segments
2023-01-18 22:54 [PATCH v3 0/9] Add support for segments smaller than one page Bart Van Assche
2023-01-18 22:54 ` [PATCH v3 1/9] block: Introduce QUEUE_FLAG_SUB_PAGE_SEGMENTS and CONFIG_BLK_SUB_PAGE_SEGMENTS Bart Van Assche
2023-01-18 22:54 ` [PATCH v3 2/9] block: Support configuring limits below the page size Bart Van Assche
@ 2023-01-18 22:54 ` Bart Van Assche
2023-01-18 22:54 ` [PATCH v3 4/9] block: Add support for filesystem requests and " Bart Van Assche
` (6 subsequent siblings)
9 siblings, 0 replies; 21+ messages in thread
From: Bart Van Assche @ 2023-01-18 22:54 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, linux-scsi, Jaegeuk Kim, Avri Altman, Adrian Hunter,
Christoph Hellwig, Ming Lei, Bart Van Assche, Keith Busch
If the segment size is smaller than the page size, there may be multiple
segments per bvec even if a bvec only contains a single page. Hence this
patch.
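The segment-count arithmetic can be illustrated with a small user-space sketch (this models the blk_segments() helper added below, but is not the kernel function itself): it is a ceiling division of the bvec length by the maximum segment size.

```c
#include <assert.h>

/* Number of max_segment_size-sized DMA segments needed to cover
 * @bytes: ceiling division, with a minimum of one segment. */
static unsigned int nr_segments(unsigned int max_segment_size,
				unsigned int bytes)
{
	if (bytes <= max_segment_size)
		return 1;
	return (bytes + max_segment_size - 1) / max_segment_size;
}
```

So a single 4096-byte bvec counts as 8 segments when the segment limit is 512 bytes, but still as 1 segment when the limit is >= 4096.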
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Keith Busch <kbusch@kernel.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
block/blk-map.c | 16 +++++++++++++++-
block/blk.h | 11 +++++++++++
2 files changed, 26 insertions(+), 1 deletion(-)
diff --git a/block/blk-map.c b/block/blk-map.c
index 9ee4be4ba2f1..4270db88e2c2 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -523,6 +523,20 @@ static struct bio *bio_copy_kern(struct request_queue *q, void *data,
return ERR_PTR(-ENOMEM);
}
+#ifdef CONFIG_BLK_SUB_PAGE_SEGMENTS
+/* Number of DMA segments required to transfer @bytes data. */
+unsigned int blk_segments(const struct queue_limits *limits, unsigned int bytes)
+{
+ const unsigned int mss = limits->max_segment_size;
+
+ if (bytes <= mss)
+ return 1;
+ if (is_power_of_2(mss))
+ return round_up(bytes, mss) >> ilog2(mss);
+ return (bytes + mss - 1) / mss;
+}
+#endif
+
/*
* Append a bio to a passthrough request. Only works if the bio can be merged
* into the request based on the driver constraints.
@@ -534,7 +548,7 @@ int blk_rq_append_bio(struct request *rq, struct bio *bio)
unsigned int nr_segs = 0;
bio_for_each_bvec(bv, bio, iter)
- nr_segs++;
+ nr_segs += blk_segments(&rq->q->limits, bv.bv_len);
if (!rq->bio) {
blk_rq_bio_prep(rq, bio, nr_segs);
diff --git a/block/blk.h b/block/blk.h
index 4c3b3325219a..8f5e749ad73b 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -76,6 +76,17 @@ struct bio_vec *bvec_alloc(mempool_t *pool, unsigned short *nr_vecs,
gfp_t gfp_mask);
void bvec_free(mempool_t *pool, struct bio_vec *bv, unsigned short nr_vecs);
+#ifdef CONFIG_BLK_SUB_PAGE_SEGMENTS
+unsigned int blk_segments(const struct queue_limits *limits,
+ unsigned int bytes);
+#else
+static inline unsigned int blk_segments(const struct queue_limits *limits,
+ unsigned int bytes)
+{
+ return 1;
+}
+#endif
+
static inline bool biovec_phys_mergeable(struct request_queue *q,
struct bio_vec *vec1, struct bio_vec *vec2)
{
* [PATCH v3 4/9] block: Add support for filesystem requests and small segments
2023-01-18 22:54 [PATCH v3 0/9] Add support for segments smaller than one page Bart Van Assche
` (2 preceding siblings ...)
2023-01-18 22:54 ` [PATCH v3 3/9] block: Support submitting passthrough requests with small segments Bart Van Assche
@ 2023-01-18 22:54 ` Bart Van Assche
2023-01-18 22:54 ` [PATCH v3 5/9] block: Add support for small segments in blk_rq_map_user_iov() Bart Van Assche
` (5 subsequent siblings)
9 siblings, 0 replies; 21+ messages in thread
From: Bart Van Assche @ 2023-01-18 22:54 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, linux-scsi, Jaegeuk Kim, Avri Altman, Adrian Hunter,
Christoph Hellwig, Ming Lei, Bart Van Assche, Keith Busch
Add support in the bio splitting code and in the bio submission code
for bios with segments smaller than the page size.
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Keith Busch <kbusch@kernel.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
block/blk-merge.c | 6 ++++--
block/blk-mq.c | 2 ++
block/blk.h | 11 +++++------
3 files changed, 11 insertions(+), 8 deletions(-)
diff --git a/block/blk-merge.c b/block/blk-merge.c
index b7c193d67185..bf727f67473d 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -294,7 +294,8 @@ static struct bio *bio_split_rw(struct bio *bio, const struct queue_limits *lim,
if (nsegs < lim->max_segments &&
bytes + bv.bv_len <= max_bytes &&
bv.bv_offset + bv.bv_len <= PAGE_SIZE) {
- nsegs++;
+ /* single-page bvec optimization */
+ nsegs += blk_segments(lim, bv.bv_len);
bytes += bv.bv_len;
} else {
if (bvec_split_segs(lim, &bv, &nsegs, &bytes,
@@ -543,7 +544,8 @@ static int __blk_bios_map_sg(struct request_queue *q, struct bio *bio,
__blk_segment_map_sg_merge(q, &bvec, &bvprv, sg))
goto next_bvec;
- if (bvec.bv_offset + bvec.bv_len <= PAGE_SIZE)
+ if (bvec.bv_offset + bvec.bv_len <= PAGE_SIZE &&
+ bvec.bv_len <= q->limits.max_segment_size)
nsegs += __blk_bvec_map_sg(bvec, sglist, sg);
else
nsegs += blk_bvec_map_sg(q, &bvec, sglist, sg);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 9d463f7563bc..947cae2def76 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2959,6 +2959,8 @@ void blk_mq_submit_bio(struct bio *bio)
bio = __bio_split_to_limits(bio, &q->limits, &nr_segs);
if (!bio)
return;
+ } else if (bio->bi_vcnt == 1) {
+ nr_segs = blk_segments(&q->limits, bio->bi_io_vec[0].bv_len);
}
if (!bio_integrity_prep(bio))
diff --git a/block/blk.h b/block/blk.h
index 8f5e749ad73b..c3dd332ba618 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -316,14 +316,13 @@ static inline bool bio_may_exceed_limits(struct bio *bio,
}
/*
- * All drivers must accept single-segments bios that are <= PAGE_SIZE.
- * This is a quick and dirty check that relies on the fact that
- * bi_io_vec[0] is always valid if a bio has data. The check might
- * lead to occasional false negatives when bios are cloned, but compared
- * to the performance impact of cloned bios themselves the loop below
- * doesn't matter anyway.
+ * Check whether bio splitting should be performed. This check may
+ * trigger the bio splitting code even if splitting is not necessary.
*/
return lim->chunk_sectors || bio->bi_vcnt != 1 ||
+#ifdef CONFIG_BLK_SUB_PAGE_SEGMENTS
+ bio->bi_io_vec->bv_len > lim->max_segment_size ||
+#endif
bio->bi_io_vec->bv_len + bio->bi_io_vec->bv_offset > PAGE_SIZE;
}
* [PATCH v3 5/9] block: Add support for small segments in blk_rq_map_user_iov()
2023-01-18 22:54 [PATCH v3 0/9] Add support for segments smaller than one page Bart Van Assche
` (3 preceding siblings ...)
2023-01-18 22:54 ` [PATCH v3 4/9] block: Add support for filesystem requests and " Bart Van Assche
@ 2023-01-18 22:54 ` Bart Van Assche
2023-01-18 22:54 ` [PATCH v3 6/9] scsi_debug: Support configuring the maximum segment size Bart Van Assche
` (4 subsequent siblings)
9 siblings, 0 replies; 21+ messages in thread
From: Bart Van Assche @ 2023-01-18 22:54 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, linux-scsi, Jaegeuk Kim, Avri Altman, Adrian Hunter,
Christoph Hellwig, Ming Lei, Bart Van Assche, Keith Busch
Before changing the return value of bio_add_hw_page() into a value in
the range [0, len], make blk_rq_map_user_iov() fall back to copying data
if mapping the data is not possible due to the segment limit.
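The control flow of the fallback can be sketched in user space as follows. The two stand-in functions below model bio_map_user_iov() and bio_copy_user_iov(); they are illustrative only, not the kernel implementations.

```c
#include <assert.h>
#include <errno.h>

#ifndef EREMOTEIO
#define EREMOTEIO 121	/* fallback for non-Linux libcs */
#endif

/* Stand-in for bio_map_user_iov(): fails with -EREMOTEIO when an iov
 * would exceed the segment size limit. */
static int map_iov(int segment_overflow)
{
	return segment_overflow ? -EREMOTEIO : 0;
}

/* Stand-in for bio_copy_user_iov(): copying always fits the limits. */
static int copy_iov(void)
{
	return 0;
}

/* The pattern this patch adds to blk_rq_map_user_iov(): try zero-copy
 * mapping first, and fall back to copying only on -EREMOTEIO. */
static int map_or_copy(int segment_overflow)
{
	int ret = map_iov(segment_overflow);

	if (ret == -EREMOTEIO)
		ret = copy_iov();
	return ret;
}
```

Either path succeeds; the difference is only whether the data is mapped or copied.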
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Keith Busch <kbusch@kernel.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
block/blk-map.c | 27 ++++++++++++++++++++++-----
1 file changed, 22 insertions(+), 5 deletions(-)
diff --git a/block/blk-map.c b/block/blk-map.c
index 4270db88e2c2..83163f9b2335 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -307,17 +307,26 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
else {
for (j = 0; j < npages; j++) {
struct page *page = pages[j];
- unsigned int n = PAGE_SIZE - offs;
+ unsigned int n = PAGE_SIZE - offs, added;
bool same_page = false;
if (n > bytes)
n = bytes;
- if (!bio_add_hw_page(rq->q, bio, page, n, offs,
- max_sectors, &same_page)) {
+ added = bio_add_hw_page(rq->q, bio, page, n,
+ offs, max_sectors, &same_page);
+ if (added == 0) {
if (same_page)
put_page(page);
break;
+ } else if (added != n) {
+ /*
+ * The segment size is smaller than the
+ * page size and an iov exceeds the
+ * segment size. Give up.
+ */
+ ret = -EREMOTEIO;
+ goto out_unmap;
}
bytes -= n;
@@ -671,10 +680,18 @@ int blk_rq_map_user_iov(struct request_queue *q, struct request *rq,
i = *iter;
do {
- if (copy)
+ if (copy) {
ret = bio_copy_user_iov(rq, map_data, &i, gfp_mask);
- else
+ } else {
ret = bio_map_user_iov(rq, &i, gfp_mask);
+ /*
+ * Fall back to copying the data if bio_map_user_iov()
+ * returns -EREMOTEIO.
+ */
+ if (ret == -EREMOTEIO)
+ ret = bio_copy_user_iov(rq, map_data, &i,
+ gfp_mask);
+ }
if (ret)
goto unmap_rq;
if (!bio)
* [PATCH v3 6/9] scsi_debug: Support configuring the maximum segment size
2023-01-18 22:54 [PATCH v3 0/9] Add support for segments smaller than one page Bart Van Assche
` (4 preceding siblings ...)
2023-01-18 22:54 ` [PATCH v3 5/9] block: Add support for small segments in blk_rq_map_user_iov() Bart Van Assche
@ 2023-01-18 22:54 ` Bart Van Assche
2023-01-18 22:54 ` [PATCH v3 7/9] null_blk: " Bart Van Assche
` (3 subsequent siblings)
9 siblings, 0 replies; 21+ messages in thread
From: Bart Van Assche @ 2023-01-18 22:54 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, linux-scsi, Jaegeuk Kim, Avri Altman, Adrian Hunter,
Christoph Hellwig, Ming Lei, Bart Van Assche, Doug Gilbert,
Martin K . Petersen
Add a kernel module parameter for configuring the maximum segment size.
This patch enables testing SCSI support for segments smaller than the
page size.
Cc: Doug Gilbert <dgilbert@interlog.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
drivers/scsi/scsi_debug.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 8553277effb3..d09f05b440ba 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -752,6 +752,7 @@ static int sdebug_host_max_queue; /* per host */
static int sdebug_lowest_aligned = DEF_LOWEST_ALIGNED;
static int sdebug_max_luns = DEF_MAX_LUNS;
static int sdebug_max_queue = SDEBUG_CANQUEUE; /* per submit queue */
+static unsigned int sdebug_max_segment_size = BLK_MAX_SEGMENT_SIZE;
static unsigned int sdebug_medium_error_start = OPT_MEDIUM_ERR_ADDR;
static int sdebug_medium_error_count = OPT_MEDIUM_ERR_NUM;
static atomic_t retired_max_queue; /* if > 0 then was prior max_queue */
@@ -5205,6 +5206,11 @@ static int scsi_debug_slave_alloc(struct scsi_device *sdp)
if (sdebug_verbose)
pr_info("slave_alloc <%u %u %u %llu>\n",
sdp->host->host_no, sdp->channel, sdp->id, sdp->lun);
+
+ if (sdebug_max_segment_size < PAGE_SIZE)
+ blk_queue_flag_set(QUEUE_FLAG_SUB_PAGE_SEGMENTS,
+ sdp->request_queue);
+
return 0;
}
@@ -5841,6 +5847,7 @@ module_param_named(lowest_aligned, sdebug_lowest_aligned, int, S_IRUGO);
module_param_named(lun_format, sdebug_lun_am_i, int, S_IRUGO | S_IWUSR);
module_param_named(max_luns, sdebug_max_luns, int, S_IRUGO | S_IWUSR);
module_param_named(max_queue, sdebug_max_queue, int, S_IRUGO | S_IWUSR);
+module_param_named(max_segment_size, sdebug_max_segment_size, uint, S_IRUGO);
module_param_named(medium_error_count, sdebug_medium_error_count, int,
S_IRUGO | S_IWUSR);
module_param_named(medium_error_start, sdebug_medium_error_start, int,
@@ -5917,6 +5924,7 @@ MODULE_PARM_DESC(lowest_aligned, "lowest aligned lba (def=0)");
MODULE_PARM_DESC(lun_format, "LUN format: 0->peripheral (def); 1 --> flat address method");
MODULE_PARM_DESC(max_luns, "number of LUNs per target to simulate(def=1)");
MODULE_PARM_DESC(max_queue, "max number of queued commands (1 to max(def))");
+MODULE_PARM_DESC(max_segment_size, "max bytes in a single segment");
MODULE_PARM_DESC(medium_error_count, "count of sectors to return follow on MEDIUM error");
MODULE_PARM_DESC(medium_error_start, "starting sector number to return MEDIUM error");
MODULE_PARM_DESC(ndelay, "response delay in nanoseconds (def=0 -> ignore)");
@@ -6920,6 +6928,12 @@ static int __init scsi_debug_init(void)
return -EINVAL;
}
+ if (sdebug_max_segment_size < SECTOR_SIZE) {
+ pr_err("invalid max_segment_size %u\n",
+ sdebug_max_segment_size);
+ return -EINVAL;
+ }
+
switch (sdebug_dif) {
case T10_PI_TYPE0_PROTECTION:
break;
@@ -7816,6 +7830,7 @@ static int sdebug_driver_probe(struct device *dev)
sdebug_driver_template.can_queue = sdebug_max_queue;
sdebug_driver_template.cmd_per_lun = sdebug_max_queue;
+ sdebug_driver_template.max_segment_size = sdebug_max_segment_size;
if (!sdebug_clustering)
sdebug_driver_template.dma_boundary = PAGE_SIZE - 1;
* [PATCH v3 7/9] null_blk: Support configuring the maximum segment size
2023-01-18 22:54 [PATCH v3 0/9] Add support for segments smaller than one page Bart Van Assche
` (5 preceding siblings ...)
2023-01-18 22:54 ` [PATCH v3 6/9] scsi_debug: Support configuring the maximum segment size Bart Van Assche
@ 2023-01-18 22:54 ` Bart Van Assche
2023-01-18 22:54 ` [PATCH v3 8/9] scsi: core: Set BLK_SUB_PAGE_SEGMENTS for small max_segment_size values Bart Van Assche
` (2 subsequent siblings)
9 siblings, 0 replies; 21+ messages in thread
From: Bart Van Assche @ 2023-01-18 22:54 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, linux-scsi, Jaegeuk Kim, Avri Altman, Adrian Hunter,
Christoph Hellwig, Ming Lei, Bart Van Assche, Damien Le Moal,
Chaitanya Kulkarni
Add support for configuring the maximum segment size and for segments
smaller than the page size.
This patch enables testing segments smaller than the page size with a
driver that does not call blk_rq_map_sg().
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Cc: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
drivers/block/null_blk/main.c | 21 ++++++++++++++++++---
drivers/block/null_blk/null_blk.h | 1 +
2 files changed, 19 insertions(+), 3 deletions(-)
diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c
index 4c601ca9552a..822890d89427 100644
--- a/drivers/block/null_blk/main.c
+++ b/drivers/block/null_blk/main.c
@@ -157,6 +157,10 @@ static int g_max_sectors;
module_param_named(max_sectors, g_max_sectors, int, 0444);
MODULE_PARM_DESC(max_sectors, "Maximum size of a command (in 512B sectors)");
+static unsigned int g_max_segment_size = BLK_MAX_SEGMENT_SIZE;
+module_param_named(max_segment_size, g_max_segment_size, uint, 0444);
+MODULE_PARM_DESC(max_segment_size, "Maximum size of a segment in bytes");
+
static unsigned int nr_devices = 1;
module_param(nr_devices, uint, 0444);
MODULE_PARM_DESC(nr_devices, "Number of devices to register");
@@ -409,6 +413,7 @@ NULLB_DEVICE_ATTR(home_node, uint, NULL);
NULLB_DEVICE_ATTR(queue_mode, uint, NULL);
NULLB_DEVICE_ATTR(blocksize, uint, NULL);
NULLB_DEVICE_ATTR(max_sectors, uint, NULL);
+NULLB_DEVICE_ATTR(max_segment_size, uint, NULL);
NULLB_DEVICE_ATTR(irqmode, uint, NULL);
NULLB_DEVICE_ATTR(hw_queue_depth, uint, NULL);
NULLB_DEVICE_ATTR(index, uint, NULL);
@@ -550,6 +555,7 @@ static struct configfs_attribute *nullb_device_attrs[] = {
&nullb_device_attr_queue_mode,
&nullb_device_attr_blocksize,
&nullb_device_attr_max_sectors,
+ &nullb_device_attr_max_segment_size,
&nullb_device_attr_irqmode,
&nullb_device_attr_hw_queue_depth,
&nullb_device_attr_index,
@@ -630,7 +636,8 @@ static ssize_t memb_group_features_show(struct config_item *item, char *page)
return snprintf(page, PAGE_SIZE,
"badblocks,blocking,blocksize,cache_size,"
"completion_nsec,discard,home_node,hw_queue_depth,"
- "irqmode,max_sectors,mbps,memory_backed,no_sched,"
+ "irqmode,max_sectors,max_segment_size,mbps,"
+ "memory_backed,no_sched,"
"poll_queues,power,queue_mode,shared_tag_bitmap,size,"
"submit_queues,use_per_node_hctx,virt_boundary,zoned,"
"zone_capacity,zone_max_active,zone_max_open,"
@@ -693,6 +700,7 @@ static struct nullb_device *null_alloc_dev(void)
dev->queue_mode = g_queue_mode;
dev->blocksize = g_bs;
dev->max_sectors = g_max_sectors;
+ dev->max_segment_size = g_max_segment_size;
dev->irqmode = g_irqmode;
dev->hw_queue_depth = g_hw_queue_depth;
dev->blocking = g_blocking;
@@ -1234,6 +1242,8 @@ static int null_transfer(struct nullb *nullb, struct page *page,
unsigned int valid_len = len;
int err = 0;
+ WARN_ONCE(len > dev->max_segment_size, "%u > %u\n", len,
+ dev->max_segment_size);
if (!is_write) {
if (dev->zoned)
valid_len = null_zone_valid_read_len(nullb,
@@ -1269,7 +1279,8 @@ static int null_handle_rq(struct nullb_cmd *cmd)
spin_lock_irq(&nullb->lock);
rq_for_each_segment(bvec, rq, iter) {
- len = bvec.bv_len;
+ len = min(bvec.bv_len, nullb->dev->max_segment_size);
+ bvec.bv_len = len;
err = null_transfer(nullb, bvec.bv_page, len, bvec.bv_offset,
op_is_write(req_op(rq)), sector,
rq->cmd_flags & REQ_FUA);
@@ -1296,7 +1307,8 @@ static int null_handle_bio(struct nullb_cmd *cmd)
spin_lock_irq(&nullb->lock);
bio_for_each_segment(bvec, bio, iter) {
- len = bvec.bv_len;
+ len = min(bvec.bv_len, nullb->dev->max_segment_size);
+ bvec.bv_len = len;
err = null_transfer(nullb, bvec.bv_page, len, bvec.bv_offset,
op_is_write(bio_op(bio)), sector,
bio->bi_opf & REQ_FUA);
@@ -2108,6 +2120,8 @@ static int null_add_dev(struct nullb_device *dev)
nullb->q->queuedata = nullb;
blk_queue_flag_set(QUEUE_FLAG_NONROT, nullb->q);
blk_queue_flag_clear(QUEUE_FLAG_ADD_RANDOM, nullb->q);
+ if (dev->max_segment_size < PAGE_SIZE)
+ blk_queue_flag_set(QUEUE_FLAG_SUB_PAGE_SEGMENTS, nullb->q);
mutex_lock(&lock);
rv = ida_simple_get(&nullb_indexes, 0, 0, GFP_KERNEL);
@@ -2125,6 +2139,7 @@ static int null_add_dev(struct nullb_device *dev)
dev->max_sectors = queue_max_hw_sectors(nullb->q);
dev->max_sectors = min(dev->max_sectors, BLK_DEF_MAX_SECTORS);
blk_queue_max_hw_sectors(nullb->q, dev->max_sectors);
+ blk_queue_max_segment_size(nullb->q, dev->max_segment_size);
if (dev->virt_boundary)
blk_queue_virt_boundary(nullb->q, PAGE_SIZE - 1);
diff --git a/drivers/block/null_blk/null_blk.h b/drivers/block/null_blk/null_blk.h
index eb5972c50be8..8cb73fe05fa3 100644
--- a/drivers/block/null_blk/null_blk.h
+++ b/drivers/block/null_blk/null_blk.h
@@ -102,6 +102,7 @@ struct nullb_device {
unsigned int queue_mode; /* block interface */
unsigned int blocksize; /* block size */
unsigned int max_sectors; /* Max sectors per command */
+ unsigned int max_segment_size; /* Max size of a single DMA segment. */
unsigned int irqmode; /* IRQ completion handler */
unsigned int hw_queue_depth; /* queue depth */
unsigned int index; /* index of the disk, only valid with a disk */
* [PATCH v3 8/9] scsi: core: Set BLK_SUB_PAGE_SEGMENTS for small max_segment_size values
2023-01-18 22:54 [PATCH v3 0/9] Add support for segments smaller than one page Bart Van Assche
` (6 preceding siblings ...)
2023-01-18 22:54 ` [PATCH v3 7/9] null_blk: " Bart Van Assche
@ 2023-01-18 22:54 ` Bart Van Assche
2023-01-19 5:38 ` Christoph Hellwig
2023-01-18 22:54 ` [PATCH v3 9/9] scsi: ufs: exynos: Select CONFIG_BLK_SUB_PAGE_SEGMENTS for large page sizes Bart Van Assche
2023-01-19 1:18 ` [PATCH v3 0/9] Add support for segments smaller than one page Jens Axboe
9 siblings, 1 reply; 21+ messages in thread
From: Bart Van Assche @ 2023-01-18 22:54 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, linux-scsi, Jaegeuk Kim, Avri Altman, Adrian Hunter,
Christoph Hellwig, Ming Lei, Bart Van Assche, Alim Akhtar,
Kiwoong Kim
The block layer only accepts max_segment_size values smaller than the
page size if the QUEUE_FLAG_SUB_PAGE_SEGMENTS flag is set. Hence this
patch.
Cc: Alim Akhtar <alim.akhtar@samsung.com>
Cc: Kiwoong Kim <kwmad.kim@samsung.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
drivers/scsi/scsi_lib.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 9ed1ebcb7443..91f2e7f787d8 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1876,6 +1876,9 @@ void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q)
{
struct device *dev = shost->dma_dev;
+ if (shost->max_segment_size && shost->max_segment_size < PAGE_SIZE)
+ blk_queue_flag_set(QUEUE_FLAG_SUB_PAGE_SEGMENTS, q);
+
/*
* this limit is imposed by hardware restrictions
*/
* [PATCH v3 9/9] scsi: ufs: exynos: Select CONFIG_BLK_SUB_PAGE_SEGMENTS for large page sizes
2023-01-18 22:54 [PATCH v3 0/9] Add support for segments smaller than one page Bart Van Assche
` (7 preceding siblings ...)
2023-01-18 22:54 ` [PATCH v3 8/9] scsi: core: Set BLK_SUB_PAGE_SEGMENTS for small max_segment_size values Bart Van Assche
@ 2023-01-18 22:54 ` Bart Van Assche
2023-01-28 3:55 ` Ming Lei
2023-01-19 1:18 ` [PATCH v3 0/9] Add support for segments smaller than one page Jens Axboe
9 siblings, 1 reply; 21+ messages in thread
From: Bart Van Assche @ 2023-01-18 22:54 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, linux-scsi, Jaegeuk Kim, Avri Altman, Adrian Hunter,
Christoph Hellwig, Ming Lei, Bart Van Assche, Alim Akhtar,
Kiwoong Kim
Since the maximum segment size supported by the Exynos controller is 4
KiB, this controller needs CONFIG_BLK_SUB_PAGE_SEGMENTS if the page size
exceeds 4 KiB.
Cc: Alim Akhtar <alim.akhtar@samsung.com>
Cc: Kiwoong Kim <kwmad.kim@samsung.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
drivers/ufs/host/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/ufs/host/Kconfig b/drivers/ufs/host/Kconfig
index 4cc2dbd79ed0..376a4039912d 100644
--- a/drivers/ufs/host/Kconfig
+++ b/drivers/ufs/host/Kconfig
@@ -117,6 +117,7 @@ config SCSI_UFS_TI_J721E
config SCSI_UFS_EXYNOS
tristate "Exynos specific hooks to UFS controller platform driver"
depends on SCSI_UFSHCD_PLATFORM && (ARCH_EXYNOS || COMPILE_TEST)
+ select BLK_SUB_PAGE_SEGMENTS if PAGE_SIZE > 4096
help
This selects the Samsung Exynos SoC specific additions to UFSHCD
platform driver. UFS host on Samsung Exynos SoC includes HCI and
* Re: [PATCH v3 1/9] block: Introduce QUEUE_FLAG_SUB_PAGE_SEGMENTS and CONFIG_BLK_SUB_PAGE_SEGMENTS
2023-01-18 22:54 ` [PATCH v3 1/9] block: Introduce QUEUE_FLAG_SUB_PAGE_SEGMENTS and CONFIG_BLK_SUB_PAGE_SEGMENTS Bart Van Assche
@ 2023-01-18 23:02 ` Jens Axboe
2023-01-18 23:40 ` Bart Van Assche
2023-01-21 0:11 ` Bart Van Assche
0 siblings, 2 replies; 21+ messages in thread
From: Jens Axboe @ 2023-01-18 23:02 UTC (permalink / raw)
To: Bart Van Assche
Cc: linux-block, linux-scsi, Jaegeuk Kim, Avri Altman, Adrian Hunter,
Christoph Hellwig, Ming Lei, Keith Busch
On 1/18/23 3:54 PM, Bart Van Assche wrote:
> Prepare for introducing support for segments smaller than the page size
> by introducing the request queue flag QUEUE_FLAG_SUB_PAGE_SEGMENTS.
> Introduce CONFIG_BLK_SUB_PAGE_SEGMENTS to prevent the performance of
> block drivers that support segments >= PAGE_SIZE from being affected.
>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Ming Lei <ming.lei@redhat.com>
> Cc: Keith Busch <kbusch@kernel.org>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
> block/Kconfig | 9 +++++++++
> include/linux/blkdev.h | 7 +++++++
> 2 files changed, 16 insertions(+)
>
> diff --git a/block/Kconfig b/block/Kconfig
> index 5d9d9c84d516..e85061d2175b 100644
> --- a/block/Kconfig
> +++ b/block/Kconfig
> @@ -35,6 +35,15 @@ config BLOCK_LEGACY_AUTOLOAD
> created on demand, but scripts that manually create device nodes and
> then call losetup might rely on this behavior.
>
> +config BLK_SUB_PAGE_SEGMENTS
> + bool "Support segments smaller than the page size"
> + default n
> + help
> + Most storage controllers support DMA segments larger than the typical
> + size of a virtual memory page. Some embedded controllers only support
> + DMA segments smaller than the page size. Enable this option to support
> + such controllers.
This should not be a visible option at all, affected drivers should just
select it.
--
Jens Axboe
* Re: [PATCH v3 1/9] block: Introduce QUEUE_FLAG_SUB_PAGE_SEGMENTS and CONFIG_BLK_SUB_PAGE_SEGMENTS
2023-01-18 23:02 ` Jens Axboe
@ 2023-01-18 23:40 ` Bart Van Assche
2023-01-21 0:11 ` Bart Van Assche
1 sibling, 0 replies; 21+ messages in thread
From: Bart Van Assche @ 2023-01-18 23:40 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, linux-scsi, Jaegeuk Kim, Avri Altman, Adrian Hunter,
Christoph Hellwig, Ming Lei, Keith Busch
On 1/18/23 15:02, Jens Axboe wrote:
> On 1/18/23 3:54 PM, Bart Van Assche wrote:
>> Prepare for introducing support for segments smaller than the page size
>> by introducing the request queue flag QUEUE_FLAG_SUB_PAGE_SEGMENTS.
>> Introduce CONFIG_BLK_SUB_PAGE_SEGMENTS to prevent the performance of
>> block drivers that support segments >= PAGE_SIZE from being affected.
>>
>> Cc: Christoph Hellwig <hch@lst.de>
>> Cc: Ming Lei <ming.lei@redhat.com>
>> Cc: Keith Busch <kbusch@kernel.org>
>> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
>> ---
>> block/Kconfig | 9 +++++++++
>> include/linux/blkdev.h | 7 +++++++
>> 2 files changed, 16 insertions(+)
>>
>> diff --git a/block/Kconfig b/block/Kconfig
>> index 5d9d9c84d516..e85061d2175b 100644
>> --- a/block/Kconfig
>> +++ b/block/Kconfig
>> @@ -35,6 +35,15 @@ config BLOCK_LEGACY_AUTOLOAD
>> created on demand, but scripts that manually create device nodes and
>> then call losetup might rely on this behavior.
>>
>> +config BLK_SUB_PAGE_SEGMENTS
>> + bool "Support segments smaller than the page size"
>> + default n
>> + help
>> + Most storage controllers support DMA segments larger than the typical
>> + size of a virtual memory page. Some embedded controllers only support
>> + DMA segments smaller than the page size. Enable this option to support
>> + such controllers.
>
> This should not be a visible option at all, affected drivers should just
> select it.
Hi Jens,
Thanks for the feedback. Unless anyone requests me to realize this in another
way, I plan to merge the patch below into the patch above:
diff --git a/block/Kconfig b/block/Kconfig
index e85061d2175b..b44b449c47bf 100644
--- a/block/Kconfig
+++ b/block/Kconfig
@@ -36,13 +36,7 @@ config BLOCK_LEGACY_AUTOLOAD
then call losetup might rely on this behavior.
config BLK_SUB_PAGE_SEGMENTS
- bool "Support segments smaller than the page size"
- default n
- help
- Most storage controllers support DMA segments larger than the typical
- size of a virtual memory page. Some embedded controllers only support
- DMA segments smaller than the page size. Enable this option to support
- such controllers.
+ bool
config BLK_RQ_ALLOC_TIME
bool
* Re: [PATCH v3 0/9] Add support for segments smaller than one page
2023-01-18 22:54 [PATCH v3 0/9] Add support for segments smaller than one page Bart Van Assche
` (8 preceding siblings ...)
2023-01-18 22:54 ` [PATCH v3 9/9] scsi: ufs: exynos: Select CONFIG_BLK_SUB_PAGE_SEGMENTS for large page sizes Bart Van Assche
@ 2023-01-19 1:18 ` Jens Axboe
2023-01-19 4:43 ` Bart Van Assche
9 siblings, 1 reply; 21+ messages in thread
From: Jens Axboe @ 2023-01-19 1:18 UTC (permalink / raw)
To: Bart Van Assche
Cc: linux-block, linux-scsi, Jaegeuk Kim, Avri Altman, Adrian Hunter,
Christoph Hellwig, Ming Lei
On 1/18/23 3:54 PM, Bart Van Assche wrote:
> Hi Jens,
>
> Several embedded storage controllers need support for DMA segments that are
> smaller than the size of one virtual memory page. Hence this patch series.
> Please consider this patch series for the next merge window.
Before any real reviews are done, I have to ask "why?". This is pretty
hairy code in the middle of the fast path for some obscure controller.
Why would anyone ship that with > 4k page sizes rather than ship it with
a controller that is sane?
--
Jens Axboe
* Re: [PATCH v3 0/9] Add support for segments smaller than one page
2023-01-19 1:18 ` [PATCH v3 0/9] Add support for segments smaller than one page Jens Axboe
@ 2023-01-19 4:43 ` Bart Van Assche
0 siblings, 0 replies; 21+ messages in thread
From: Bart Van Assche @ 2023-01-19 4:43 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, linux-scsi, Jaegeuk Kim, Avri Altman, Adrian Hunter,
Christoph Hellwig, Ming Lei
On 1/18/23 17:18, Jens Axboe wrote:
> On 1/18/23 3:54 PM, Bart Van Assche wrote:
>> Hi Jens,
>>
>> Several embedded storage controllers need support for DMA segments that are
>> smaller than the size of one virtual memory page. Hence this patch series.
>> Please consider this patch series for the next merge window.
>
> Before any real reviews are done, I have to ask "why?". This is pretty
> hairy code in the middle of the fast path for some obscure controller.
> Why would anyone ship that with > 4k page sizes rather than ship it with
> a controller that is sane?
Hi Jens,
The new config variable CONFIG_BLK_SUB_PAGE_SEGMENTS has been introduced
to make sure that the hot path is *not* affected if that config variable
is disabled.
Regarding the question "why": the Google Android team would like to
improve performance by switching from 4 KiB pages to 16 KiB pages. One
of the widely used UFS controllers (Exynos) has a maximum segment size
of 4 KiB. Hence this patch series.
This patch series is not only useful for phones but also for Tesla cars
since Tesla cars use an Exynos UFS controller.
A contributor to the MMC driver told me that this patch series would
make it possible to simplify the MMC driver significantly (I have not
yet double-checked this).
Bart.
* Re: [PATCH v3 8/9] scsi: core: Set BLK_SUB_PAGE_SEGMENTS for small max_segment_size values
2023-01-18 22:54 ` [PATCH v3 8/9] scsi: core: Set BLK_SUB_PAGE_SEGMENTS for small max_segment_size values Bart Van Assche
@ 2023-01-19 5:38 ` Christoph Hellwig
2023-01-19 17:27 ` Bart Van Assche
0 siblings, 1 reply; 21+ messages in thread
From: Christoph Hellwig @ 2023-01-19 5:38 UTC (permalink / raw)
To: Bart Van Assche
Cc: Jens Axboe, linux-block, linux-scsi, Jaegeuk Kim, Avri Altman,
Adrian Hunter, Christoph Hellwig, Ming Lei, Alim Akhtar,
Kiwoong Kim
> + if (shost->max_segment_size && shost->max_segment_size < PAGE_SIZE)
> + blk_queue_flag_set(QUEUE_FLAG_SUB_PAGE_SEGMENTS, q);
Independent of me really not wanting this code at all if we can avoid it:
this has no business in the SCSI midlayer or drivers. Once the config
option is enabled, setting the flag should happen inside
blk_queue_max_segment_size.
* Re: [PATCH v3 8/9] scsi: core: Set BLK_SUB_PAGE_SEGMENTS for small max_segment_size values
2023-01-19 5:38 ` Christoph Hellwig
@ 2023-01-19 17:27 ` Bart Van Assche
0 siblings, 0 replies; 21+ messages in thread
From: Bart Van Assche @ 2023-01-19 17:27 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Jens Axboe, linux-block, linux-scsi, Jaegeuk Kim, Avri Altman,
Adrian Hunter, Ming Lei, Alim Akhtar, Kiwoong Kim
On 1/18/23 21:38, Christoph Hellwig wrote:
>> + if (shost->max_segment_size && shost->max_segment_size < PAGE_SIZE)
>> + blk_queue_flag_set(QUEUE_FLAG_SUB_PAGE_SEGMENTS, q);
>
> Independent of me really not wanting this code at all if we can avoid it:
> this has no business in the SCSI midlayer or drivers. Once the config
> option is enabled, setting the flag should happen inside
> blk_queue_max_segment_size.
Hi Christoph,
Thanks for having taken a look. I will move this code into the block layer.
Thanks,
Bart.
* Re: [PATCH v3 1/9] block: Introduce QUEUE_FLAG_SUB_PAGE_SEGMENTS and CONFIG_BLK_SUB_PAGE_SEGMENTS
2023-01-18 23:02 ` Jens Axboe
2023-01-18 23:40 ` Bart Van Assche
@ 2023-01-21 0:11 ` Bart Van Assche
2023-01-21 2:43 ` Jens Axboe
1 sibling, 1 reply; 21+ messages in thread
From: Bart Van Assche @ 2023-01-21 0:11 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, linux-scsi, Jaegeuk Kim, Avri Altman, Adrian Hunter,
Christoph Hellwig, Ming Lei, Keith Busch
On 1/18/23 15:02, Jens Axboe wrote:
> On 1/18/23 3:54 PM, Bart Van Assche wrote:
>> Prepare for introducing support for segments smaller than the page size
>> by introducing the request queue flag QUEUE_FLAG_SUB_PAGE_SEGMENTS.
>> Introduce CONFIG_BLK_SUB_PAGE_SEGMENTS to prevent the performance of
>> block drivers that support segments >= PAGE_SIZE from being affected.
>>
>> Cc: Christoph Hellwig <hch@lst.de>
>> Cc: Ming Lei <ming.lei@redhat.com>
>> Cc: Keith Busch <kbusch@kernel.org>
>> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
>> ---
>> block/Kconfig | 9 +++++++++
>> include/linux/blkdev.h | 7 +++++++
>> 2 files changed, 16 insertions(+)
>>
>> diff --git a/block/Kconfig b/block/Kconfig
>> index 5d9d9c84d516..e85061d2175b 100644
>> --- a/block/Kconfig
>> +++ b/block/Kconfig
>> @@ -35,6 +35,15 @@ config BLOCK_LEGACY_AUTOLOAD
>> created on demand, but scripts that manually create device nodes and
>> then call losetup might rely on this behavior.
>>
>> +config BLK_SUB_PAGE_SEGMENTS
>> + bool "Support segments smaller than the page size"
>> + default n
>> + help
>> + Most storage controllers support DMA segments larger than the typical
>> + size of a virtual memory page. Some embedded controllers only support
>> + DMA segments smaller than the page size. Enable this option to support
>> + such controllers.
>
> This should not be a visible option at all, affected drivers should just
> select it.
Hi Jens,
If CONFIG_BLK_SUB_PAGE_SEGMENTS is made invisible, how should this
option be enabled for the scsi_debug and null_blk drivers? Adding
"select BLK_SUB_PAGE_SEGMENTS" to the Kconfig section of these drivers
would have the unfortunate side effect that enabling either driver would
make all block drivers slower. How about making sub-page segment support
configurable for the scsi_debug and null_blk drivers only? That would
allow kernel developers who want to test the sub-page segment support to
enable this functionality without making e.g. distro kernels slower.
Thanks,
Bart.
* Re: [PATCH v3 1/9] block: Introduce QUEUE_FLAG_SUB_PAGE_SEGMENTS and CONFIG_BLK_SUB_PAGE_SEGMENTS
2023-01-21 0:11 ` Bart Van Assche
@ 2023-01-21 2:43 ` Jens Axboe
2023-01-23 19:58 ` Bart Van Assche
0 siblings, 1 reply; 21+ messages in thread
From: Jens Axboe @ 2023-01-21 2:43 UTC (permalink / raw)
To: Bart Van Assche
Cc: linux-block, linux-scsi, Jaegeuk Kim, Avri Altman, Adrian Hunter,
Christoph Hellwig, Ming Lei, Keith Busch
On 1/20/23 5:11 PM, Bart Van Assche wrote:
> On 1/18/23 15:02, Jens Axboe wrote:
>> On 1/18/23 3:54 PM, Bart Van Assche wrote:
>>> Prepare for introducing support for segments smaller than the page size
>>> by introducing the request queue flag QUEUE_FLAG_SUB_PAGE_SEGMENTS.
>>> Introduce CONFIG_BLK_SUB_PAGE_SEGMENTS to prevent the performance of
>>> block drivers that support segments >= PAGE_SIZE would be affected.
>>>
>>> Cc: Christoph Hellwig <hch@lst.de>
>>> Cc: Ming Lei <ming.lei@redhat.com>
>>> Cc: Keith Busch <kbusch@kernel.org>
>>> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
>>> ---
>>> block/Kconfig | 9 +++++++++
>>> include/linux/blkdev.h | 7 +++++++
>>> 2 files changed, 16 insertions(+)
>>>
>>> diff --git a/block/Kconfig b/block/Kconfig
>>> index 5d9d9c84d516..e85061d2175b 100644
>>> --- a/block/Kconfig
>>> +++ b/block/Kconfig
>>> @@ -35,6 +35,15 @@ config BLOCK_LEGACY_AUTOLOAD
>>> created on demand, but scripts that manually create device nodes and
>>> then call losetup might rely on this behavior.
>>> +config BLK_SUB_PAGE_SEGMENTS
>>> + bool "Support segments smaller than the page size"
>>> + default n
>>> + help
>>> + Most storage controllers support DMA segments larger than the typical
>>> + size of a virtual memory page. Some embedded controllers only support
>>> + DMA segments smaller than the page size. Enable this option to support
>>> + such controllers.
>>
>> This should not be a visible option at all, affected drivers should just
>> select it.
>
> Hi Jens,
>
> If CONFIG_BLK_SUB_PAGE_SEGMENTS is made invisible, how should this
> option be enabled for the scsi_debug and null_blk drivers? Adding
> "select BLK_SUB_PAGE_SEGMENTS" to the Kconfig section of these drivers
> would have the unfortunate side effect that enabling either driver
> would make all block drivers slower. How about making sub-page segment
> support configurable for the scsi_debug and null_blk drivers only?
> That would allow kernel developers who want to test the sub-page
> segment support to enable this functionality without making e.g.
> distro kernels slower.
You'd need to have a separate sub-option for each of them, Kconfig
style. But this also highlights the usual issue with pretending that
Kconfig options mean that this doesn't matter, because inevitably they
end up getting enabled by default in distros anyway. And then you may as
well not even have them...
Why can't we just handle this in the driver? The segment path is hard
enough to grok in the first place, and this just makes it worse.
Generally not a huge fan of punting this to the driver, but just maybe
it'd make sense in this case since it's just the one. At least that
seems a lot more palatable than the alternative.
--
Jens Axboe
* Re: [PATCH v3 1/9] block: Introduce QUEUE_FLAG_SUB_PAGE_SEGMENTS and CONFIG_BLK_SUB_PAGE_SEGMENTS
2023-01-21 2:43 ` Jens Axboe
@ 2023-01-23 19:58 ` Bart Van Assche
0 siblings, 0 replies; 21+ messages in thread
From: Bart Van Assche @ 2023-01-23 19:58 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, linux-scsi, Jaegeuk Kim, Avri Altman, Adrian Hunter,
Christoph Hellwig, Ming Lei, Keith Busch
On 1/20/23 18:43, Jens Axboe wrote:
> On 1/20/23 5:11 PM, Bart Van Assche wrote:
>> If CONFIG_BLK_SUB_PAGE_SEGMENTS is made invisible, how should this
>> option be enabled for the scsi_debug and null_blk drivers? Adding
>> "select BLK_SUB_PAGE_SEGMENTS" to the Kconfig section of these drivers
>> would have the unfortunate side effect that enabling either driver
>> would make all block drivers slower. How about making sub-page segment
>> support configurable for the scsi_debug and null_blk drivers only?
>> That would allow kernel developers who want to test the sub-page
>> segment support to enable this functionality without making e.g.
>> distro kernels slower.
>
> You'd need to have a separate sub-option for each of them, Kconfig
> style. But this also highlights the usual issue with pretending that
> Kconfig options mean that this doesn't matter, because inevitably they
> end up getting enabled by default in distros anyway. And then you may as
> well not even have them...
Hi Jens,
How about the following approach?
* Remove CONFIG_BLK_SUB_PAGE_SEGMENTS.
* Use the static key mechanism instead of #ifdef
CONFIG_BLK_SUB_PAGE_SEGMENTS to prevent this patch series from slowing
down the hot path in the block layer.
* Count the number of request queues that need support for segments
smaller than a page or max_hw_sector values smaller than
PAGE_SIZE >> SECTOR_SHIFT. If that number changes from zero to one,
enable the code introduced by this patch series. If that number drops
to zero, toggle the static key such that the overhead of the code that
supports small segments is eliminated.
> Why can't we just handle this in the driver? The segment path is hard
> enough to grok in the first place, and this just makes it worse.
> Generally not a huge fan of punting this to the driver, but just maybe
> it'd make sense in this case since it's just the one. At least that
> seems a lot more palatable than the alternative.
One of the functions modified by this patch series is
blk_rq_append_bio(). That function is called by code that builds a bio
(e.g. filesystems). Code that builds a bio does not call any block
driver code. This is why I think it would be hard to move the
functionality introduced by this patch series into block drivers.
Thanks,
Bart.
* Re: [PATCH v3 9/9] scsi: ufs: exynos: Select CONFIG_BLK_SUB_PAGE_SEGMENTS for large page sizes
2023-01-18 22:54 ` [PATCH v3 9/9] scsi: ufs: exynos: Select CONFIG_BLK_SUB_PAGE_SEGMENTS for large page sizes Bart Van Assche
@ 2023-01-28 3:55 ` Ming Lei
2023-01-29 1:09 ` Bart Van Assche
0 siblings, 1 reply; 21+ messages in thread
From: Ming Lei @ 2023-01-28 3:55 UTC (permalink / raw)
To: Bart Van Assche
Cc: Jens Axboe, linux-block, linux-scsi, Jaegeuk Kim, Avri Altman,
Adrian Hunter, Christoph Hellwig, Alim Akhtar, Kiwoong Kim,
ming.lei
Hi Bart,
On Wed, Jan 18, 2023 at 02:54:47PM -0800, Bart Van Assche wrote:
> Since the maximum segment size supported by the Exynos controller is 4
> KiB, this controller needs CONFIG_BLK_SUB_PAGE_SEGMENTS if the page size
> exceeds 4 KiB.
>
> Cc: Alim Akhtar <alim.akhtar@samsung.com>
> Cc: Kiwoong Kim <kwmad.kim@samsung.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
> drivers/ufs/host/Kconfig | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/ufs/host/Kconfig b/drivers/ufs/host/Kconfig
> index 4cc2dbd79ed0..376a4039912d 100644
> --- a/drivers/ufs/host/Kconfig
> +++ b/drivers/ufs/host/Kconfig
> @@ -117,6 +117,7 @@ config SCSI_UFS_TI_J721E
> config SCSI_UFS_EXYNOS
> tristate "Exynos specific hooks to UFS controller platform driver"
> depends on SCSI_UFSHCD_PLATFORM && (ARCH_EXYNOS || COMPILE_TEST)
> + select BLK_SUB_PAGE_SEGMENTS if PAGE_SIZE > 4096
I remember that PAGE_SIZE is still 4K on Android kernel, so
UFS_EXYNOS should work just fine, or Android kernel is going
to change PAGE_SIZE?
Thanks,
Ming
* Re: [PATCH v3 9/9] scsi: ufs: exynos: Select CONFIG_BLK_SUB_PAGE_SEGMENTS for large page sizes
2023-01-28 3:55 ` Ming Lei
@ 2023-01-29 1:09 ` Bart Van Assche
0 siblings, 0 replies; 21+ messages in thread
From: Bart Van Assche @ 2023-01-29 1:09 UTC (permalink / raw)
To: Ming Lei
Cc: Jens Axboe, linux-block, linux-scsi, Jaegeuk Kim, Avri Altman,
Adrian Hunter, Christoph Hellwig, Alim Akhtar, Kiwoong Kim
On 1/27/23 19:55, Ming Lei wrote:
> I remember that PAGE_SIZE is still 4K on Android kernel, so
> UFS_EXYNOS should work just fine, or Android kernel is going
> to change PAGE_SIZE?
Hi Ming,
We want to improve Android performance by increasing the page size from
4 KiB to 16 KiB. Hence this patch series.
Thanks,
Bart.
end of thread, other threads:[~2023-01-29 1:09 UTC | newest]
Thread overview: 21+ messages
2023-01-18 22:54 [PATCH v3 0/9] Add support for segments smaller than one page Bart Van Assche
2023-01-18 22:54 ` [PATCH v3 1/9] block: Introduce QUEUE_FLAG_SUB_PAGE_SEGMENTS and CONFIG_BLK_SUB_PAGE_SEGMENTS Bart Van Assche
2023-01-18 23:02 ` Jens Axboe
2023-01-18 23:40 ` Bart Van Assche
2023-01-21 0:11 ` Bart Van Assche
2023-01-21 2:43 ` Jens Axboe
2023-01-23 19:58 ` Bart Van Assche
2023-01-18 22:54 ` [PATCH v3 2/9] block: Support configuring limits below the page size Bart Van Assche
2023-01-18 22:54 ` [PATCH v3 3/9] block: Support submitting passthrough requests with small segments Bart Van Assche
2023-01-18 22:54 ` [PATCH v3 4/9] block: Add support for filesystem requests and " Bart Van Assche
2023-01-18 22:54 ` [PATCH v3 5/9] block: Add support for small segments in blk_rq_map_user_iov() Bart Van Assche
2023-01-18 22:54 ` [PATCH v3 6/9] scsi_debug: Support configuring the maximum segment size Bart Van Assche
2023-01-18 22:54 ` [PATCH v3 7/9] null_blk: " Bart Van Assche
2023-01-18 22:54 ` [PATCH v3 8/9] scsi: core: Set BLK_SUB_PAGE_SEGMENTS for small max_segment_size values Bart Van Assche
2023-01-19 5:38 ` Christoph Hellwig
2023-01-19 17:27 ` Bart Van Assche
2023-01-18 22:54 ` [PATCH v3 9/9] scsi: ufs: exynos: Select CONFIG_BLK_SUB_PAGE_SEGMENTS for large page sizes Bart Van Assche
2023-01-28 3:55 ` Ming Lei
2023-01-29 1:09 ` Bart Van Assche
2023-01-19 1:18 ` [PATCH v3 0/9] Add support for segments smaller than one page Jens Axboe
2023-01-19 4:43 ` Bart Van Assche