From: Bart Van Assche <bvanassche@acm.org>
To: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org, Christoph Hellwig <hch@lst.de>,
	jyescas@google.com, mcgrof@kernel.org,
	Bart Van Assche <bvanassche@acm.org>,
	Ming Lei <ming.lei@redhat.com>, Keith Busch <kbusch@kernel.org>
Subject: [PATCH v5 3/9] block: Support configuring limits below the page size
Date: Mon, 22 May 2023 15:25:35 -0700
Message-ID: <20230522222554.525229-4-bvanassche@acm.org>
In-Reply-To: <20230522222554.525229-1-bvanassche@acm.org>

Allow block drivers to configure the following:
* A maximum number of hardware sectors (max_hw_sectors) smaller than
  PAGE_SIZE >> SECTOR_SHIFT. For PAGE_SIZE = 4096 this means that values
  below 8 become supported.
* A maximum segment size below the page size. This is most useful
  for page sizes above 4096 bytes.
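
As an illustration only (not part of this patch), the old lower bound on
max_hw_sectors can be computed for several page sizes. The sketch assumes
SECTOR_SHIFT is 9, as in the kernel; the helper name
old_min_max_hw_sectors() is hypothetical.

```c
/* Assumption: one sector is 512 bytes, matching the kernel's SECTOR_SHIFT. */
#define SECTOR_SHIFT 9

/*
 * Hypothetical helper: the smallest max_hw_sectors value the block layer
 * accepted before this patch, i.e. one page expressed in sectors.
 */
static unsigned int old_min_max_hw_sectors(unsigned int page_size)
{
	return page_size >> SECTOR_SHIFT;
}
```

With 4 KiB pages the old minimum is 8 sectors; with 16 KiB or 64 KiB pages
it grows to 32 or 128 sectors, which is why sub-page limits matter more on
systems with larger pages.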

The blk_sub_page_limits static branch will be used in later patches to
avoid affecting the performance of block drivers that support segments >=
PAGE_SIZE and max_hw_sectors >= PAGE_SIZE >> SECTOR_SHIFT.
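
The gating scheme can be sketched in user space as follows. This is a
simplified, single-threaded model and not the kernel code: a plain bool
stands in for the static key, the mutex is omitted, and all model_* names
are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the per-queue flag added to struct queue_limits. */
struct limits_model {
	bool sub_page_limits;
};

/* Number of queues that currently use sub-page limits. */
static uint32_t model_nr_queues;
/* Plain bool standing in for the blk_sub_page_limits static key. */
static bool model_key;

static void model_enable(struct limits_model *lim)
{
	if (lim->sub_page_limits)
		return;			/* already counted for this queue */
	lim->sub_page_limits = true;
	if (++model_nr_queues == 1)
		model_key = true;	/* first user switches the key on */
}

static void model_disable(struct limits_model *lim)
{
	if (!lim->sub_page_limits)
		return;
	lim->sub_page_limits = false;
	if (--model_nr_queues == 0)
		model_key = false;	/* last user switches the key off */
}

/* Global key first, then the per-queue flag. */
static bool model_queue_sub_page_limits(const struct limits_model *lim)
{
	return model_key && lim->sub_page_limits;
}
```

As long as no queue has enabled sub-page limits, the check returns false
after testing only the global key, which is the case the static branch is
meant to keep cheap.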

This patch may change the behavior of existing block drivers from not
working into working. A block driver usually calls
blk_queue_max_hw_sectors() or blk_queue_max_segment_size() to configure
the maximum limits it supports. Without this patch, an attempt to
configure a limit below what the block layer supports causes the block
layer to select a larger value. If the block driver does not support that
value, this may cause data other than the requested data to be
transferred, a kernel crash or other undesirable behavior.
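
The change in clamping for the maximum segment size can be sketched as
follows. This is a user-space illustration under the assumption that
SECTOR_SIZE is 512; the clamp_segment_size_*() helpers are hypothetical
and only mirror the minimum-selection logic of
blk_queue_max_segment_size().

```c
/* Assumption: 512-byte sectors, as in the kernel. */
#define SECTOR_SIZE 512u

/* Before this patch: requests below the page size are raised to it. */
static unsigned int clamp_segment_size_old(unsigned int max_size,
					   unsigned int page_size)
{
	return max_size < page_size ? page_size : max_size;
}

/* With this patch: sub-page requests lower the minimum to SECTOR_SIZE. */
static unsigned int clamp_segment_size_new(unsigned int max_size,
					   unsigned int page_size)
{
	unsigned int min_max_segment_size = page_size;

	if (max_size < min_max_segment_size)
		min_max_segment_size = SECTOR_SIZE;

	return max_size < min_max_segment_size ? min_max_segment_size
					       : max_size;
}
```

A driver that requests 2048 bytes on a 4 KiB-page system got 4096 bytes
with the old logic and gets the requested 2048 bytes with the new logic.
Requests below 512 bytes are still raised to 512.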

Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Keith Busch <kbusch@kernel.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/blk-core.c       |  2 ++
 block/blk-settings.c   | 57 ++++++++++++++++++++++++++++++++++++++++++
 block/blk.h            |  9 +++++++
 include/linux/blkdev.h |  2 ++
 4 files changed, 70 insertions(+)

diff --git a/block/blk-core.c b/block/blk-core.c
index 00c74330fa92..814bfb9c9489 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -264,6 +264,8 @@ static void blk_free_queue_rcu(struct rcu_head *rcu_head)
 static void blk_free_queue(struct request_queue *q)
 {
 	blk_free_queue_stats(q->stats);
+	blk_disable_sub_page_limits(&q->limits);
+
 	if (queue_is_mq(q))
 		blk_mq_release(q);
 
diff --git a/block/blk-settings.c b/block/blk-settings.c
index 95d6e836c4a7..a4ef1dfeef76 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -19,6 +19,11 @@
 #include "blk-rq-qos.h"
 #include "blk-wbt.h"
 
+/* Protects blk_nr_sub_page_limit_queues and blk_sub_page_limits changes. */
+static DEFINE_MUTEX(blk_sub_page_limit_lock);
+static uint32_t blk_nr_sub_page_limit_queues;
+DEFINE_STATIC_KEY_FALSE(blk_sub_page_limits);
+
 void blk_queue_rq_timeout(struct request_queue *q, unsigned int timeout)
 {
 	q->rq_timeout = timeout;
@@ -59,6 +64,7 @@ void blk_set_default_limits(struct queue_limits *lim)
 	lim->zoned = BLK_ZONED_NONE;
 	lim->zone_write_granularity = 0;
 	lim->dma_alignment = 511;
+	lim->sub_page_limits = false;
 }
 
 /**
@@ -101,6 +107,47 @@ void blk_queue_bounce_limit(struct request_queue *q, enum blk_bounce bounce)
 }
 EXPORT_SYMBOL(blk_queue_bounce_limit);
 
+/**
+ * blk_enable_sub_page_limits - enable support for max_segment_size values smaller than PAGE_SIZE and for max_hw_sectors values below PAGE_SIZE >> SECTOR_SHIFT
+ * @lim: request queue limits for which to enable support of these features.
+ *
+ * Support for these features is not enabled all the time because of the
+ * runtime overhead of these features.
+ */
+static void blk_enable_sub_page_limits(struct queue_limits *lim)
+{
+	if (lim->sub_page_limits)
+		return;
+
+	lim->sub_page_limits = true;
+
+	mutex_lock(&blk_sub_page_limit_lock);
+	if (++blk_nr_sub_page_limit_queues == 1)
+		static_branch_enable(&blk_sub_page_limits);
+	mutex_unlock(&blk_sub_page_limit_lock);
+}
+
+/**
+ * blk_disable_sub_page_limits - disable support for max_segment_size values smaller than PAGE_SIZE and for max_hw_sectors values below PAGE_SIZE >> SECTOR_SHIFT
+ * @lim: request queue limits for which to disable support of these features.
+ *
+ * Support for these features is not enabled all the time because of the
+ * runtime overhead of these features.
+ */
+void blk_disable_sub_page_limits(struct queue_limits *lim)
+{
+	if (!lim->sub_page_limits)
+		return;
+
+	lim->sub_page_limits = false;
+
+	mutex_lock(&blk_sub_page_limit_lock);
+	WARN_ON_ONCE(blk_nr_sub_page_limit_queues == 0);
+	if (--blk_nr_sub_page_limit_queues == 0)
+		static_branch_disable(&blk_sub_page_limits);
+	mutex_unlock(&blk_sub_page_limit_lock);
+}
+
 /**
  * blk_queue_max_hw_sectors - set max sectors for a request for this queue
  * @q:  the request queue for the device
@@ -126,6 +173,11 @@ void blk_queue_max_hw_sectors(struct request_queue *q, unsigned int max_hw_secto
 	unsigned int min_max_hw_sectors = PAGE_SIZE >> SECTOR_SHIFT;
 	unsigned int max_sectors;
 
+	if (max_hw_sectors < min_max_hw_sectors) {
+		blk_enable_sub_page_limits(limits);
+		min_max_hw_sectors = 1;
+	}
+
 	if (max_hw_sectors < min_max_hw_sectors) {
 		max_hw_sectors = min_max_hw_sectors;
 		pr_info("%s: set to minimum %u\n", __func__, max_hw_sectors);
@@ -284,6 +336,11 @@ void blk_queue_max_segment_size(struct request_queue *q, unsigned int max_size)
 {
 	unsigned int min_max_segment_size = PAGE_SIZE;
 
+	if (max_size < min_max_segment_size) {
+		blk_enable_sub_page_limits(&q->limits);
+		min_max_segment_size = SECTOR_SIZE;
+	}
+
 	if (max_size < min_max_segment_size) {
 		max_size = min_max_segment_size;
 		pr_info("%s: set to minimum %u\n", __func__, max_size);
diff --git a/block/blk.h b/block/blk.h
index 9f171b8f1e34..49526127ea08 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -13,6 +13,7 @@ struct elevator_type;
 #define BLK_MAX_TIMEOUT		(5 * HZ)
 
 extern struct dentry *blk_debugfs_root;
+DECLARE_STATIC_KEY_FALSE(blk_sub_page_limits);
 
 struct blk_flush_queue {
 	unsigned int		flush_pending_idx:1;
@@ -32,6 +33,14 @@ struct blk_flush_queue *blk_alloc_flush_queue(int node, int cmd_size,
 					      gfp_t flags);
 void blk_free_flush_queue(struct blk_flush_queue *q);
 
+static inline bool blk_queue_sub_page_limits(const struct queue_limits *lim)
+{
+	return static_branch_unlikely(&blk_sub_page_limits) &&
+		lim->sub_page_limits;
+}
+
+void blk_disable_sub_page_limits(struct queue_limits *lim);
+
 void blk_freeze_queue(struct request_queue *q);
 void __blk_mq_unfreeze_queue(struct request_queue *q, bool force_atomic);
 void blk_queue_start_drain(struct request_queue *q);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index fe99948688df..e54fbb124efb 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -310,6 +310,8 @@ struct queue_limits {
 	 * due to possible offsets.
 	 */
 	unsigned int		dma_alignment;
+
+	bool			sub_page_limits;
 };
 
 typedef int (*report_zones_cb)(struct blk_zone *zone, unsigned int idx,
