* [PATCH v4 0/7] block: Introduce REQ_ALLOCATE flag for REQ_OP_WRITE_ZEROES
@ 2020-01-21 10:42 Kirill Tkhai
  2020-01-21 10:42 ` [PATCH v4 1/7] block: Add @flags argument to bdev_write_zeroes_sectors() Kirill Tkhai
                   ` (6 more replies)
  0 siblings, 7 replies; 15+ messages in thread
From: Kirill Tkhai @ 2020-01-21 10:42 UTC (permalink / raw)
  To: linux-block, linux-kernel, martin.petersen, bob.liu, axboe, agk,
	snitzer, dm-devel, song, tytso, adilger.kernel,
	Chaitanya.Kulkarni, darrick.wong, ming.lei, osandov, jthumshirn,
	minwoo.im.dev, damien.lemoal, andrea.parri, hare, tj, ajay.joshi,
	sagi, dsterba, chaitanya.kulkarni, bvanassche, dhowells,
	asml.silence, ktkhai

(was "[PATCH block v2 0/3] block: Introduce REQ_NOZERO flag
      for REQ_OP_WRITE_ZEROES operation";
 was "[PATCH RFC 0/3] block,ext4: Introduce REQ_OP_ASSIGN_RANGE
      to reflect extents allocation in block device internals")

v4: Correct argument for mddev_check_write_zeroes().

v3: Rename REQ_NOZERO to REQ_ALLOCATE.
    Split helpers to separate patches.
    Add a patch, disabling max_allocate_sectors inheritance for dm.

v2: Introduce a new flag for REQ_OP_WRITE_ZEROES instead of
    introducing a new operation, as suggested by Martin K. Petersen.
    Removed the ext4-related patch to focus on block changes
    for now.

Information about contiguous extent placement may be useful
for some block devices. For example, distributed network
filesystems that provide a block device interface may use this
information to place blocks better across the nodes of their
cluster and to improve performance. Block devices that map a file
on another filesystem (loop) may request an extent of the same
length on the underlying filesystem, reducing fragmentation and
batching allocation requests. Hypervisors like QEMU may also use
this information to optimize cluster allocations.

This patchset introduces the REQ_ALLOCATE flag for REQ_OP_WRITE_ZEROES,
which makes a block device allocate blocks instead of actually
zeroing them. This may be used to forward a user's fallocate(0)
requests into block device internals. E.g., in the loop driver this
results in allocating extents in the backing file, so a subsequent
write won't fail for lack of available space. Distributed network
filesystems will be able to assign specific servers to specific
extents, so subsequent writes will be more efficient.
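
For illustration only, userspace could exercise this through the
blkdev_fallocate() hook added later in this series. This is a sketch,
not part of the series; the device path is just an example:

#define _GNU_SOURCE
#include <fcntl.h>        /* open(), fallocate(), FALLOC_FL_KEEP_SIZE */
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	/* Example device path; any block device supporting the flag works. */
	int fd = open("/dev/loop0", O_RDWR);

	if (fd < 0)
		return 1;
	/*
	 * Ask the device to allocate the first 1 GiB without writing zeroes
	 * (mapped to REQ_OP_WRITE_ZEROES | REQ_ALLOCATE by this series).
	 */
	if (fallocate(fd, FALLOC_FL_KEEP_SIZE, 0, 1ULL << 30))
		perror("fallocate");
	close(fd);
	return 0;
}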

Patches [1-3/7] prepare helper functions; patch [4/7]
introduces the REQ_ALLOCATE flag and implements all the logic;
patch [5/7] adds one more helper; patch [6/7] disables REQ_ALLOCATE
for dm, which inherits limits from underlying block devices;
patch [7/7] adds loop as the first user of the flag.

Note that only block-related patches are included here; an example of
usage for ext4, with performance numbers, may be seen in [1].

[1] https://lore.kernel.org/linux-ext4/157599697369.12112.10138136904533871162.stgit@localhost.localdomain/T/#me5bdd5cc313e14de615d81bea214f355ae975db0

---

Kirill Tkhai (7):
      block: Add @flags argument to bdev_write_zeroes_sectors()
      block: Pass op_flags into blk_queue_get_max_sectors()
      block: Introduce blk_queue_get_max_write_zeroes_sectors()
      block: Add support for REQ_ALLOCATE flag
      block: Add blk_queue_max_allocate_sectors()
      dm: Directly disable max_allocate_sectors for now
      loop: Add support for REQ_ALLOCATE


 block/blk-core.c                    |    6 +++---
 block/blk-lib.c                     |   17 ++++++++++-------
 block/blk-merge.c                   |    9 ++++++---
 block/blk-settings.c                |   17 +++++++++++++++++
 drivers/block/loop.c                |   15 ++++++++++++---
 drivers/md/dm-kcopyd.c              |    2 +-
 drivers/md/dm-table.c               |    2 ++
 drivers/md/md.h                     |    1 +
 drivers/target/target_core_iblock.c |    4 ++--
 fs/block_dev.c                      |    4 ++++
 include/linux/blk_types.h           |    5 ++++-
 include/linux/blkdev.h              |   34 ++++++++++++++++++++++++++--------
 12 files changed, 88 insertions(+), 28 deletions(-)

--
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>



* [PATCH v4 1/7] block: Add @flags argument to bdev_write_zeroes_sectors()
  2020-01-21 10:42 [PATCH v4 0/7] block: Introduce REQ_ALLOCATE flag for REQ_OP_WRITE_ZEROES Kirill Tkhai
@ 2020-01-21 10:42 ` Kirill Tkhai
  2020-01-21 10:42 ` [PATCH v4 2/7] block: Pass op_flags into blk_queue_get_max_sectors() Kirill Tkhai
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Kirill Tkhai @ 2020-01-21 10:42 UTC (permalink / raw)
  To: linux-block, linux-kernel, martin.petersen, bob.liu, axboe, agk,
	snitzer, dm-devel, song, tytso, adilger.kernel,
	Chaitanya.Kulkarni, darrick.wong, ming.lei, osandov, jthumshirn,
	minwoo.im.dev, damien.lemoal, andrea.parri, hare, tj, ajay.joshi,
	sagi, dsterba, chaitanya.kulkarni, bvanassche, dhowells,
	asml.silence, ktkhai

This is preparation for a later patch in the series, which introduces
a new flag, BLKDEV_ZERO_ALLOCATE, for callers that need only
allocation of blocks and don't need the blocks actually zeroed.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Reviewed-by: Bob Liu <bob.liu@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
---
 block/blk-lib.c                     |    6 +++---
 drivers/md/dm-kcopyd.c              |    2 +-
 drivers/target/target_core_iblock.c |    4 ++--
 include/linux/blkdev.h              |    3 ++-
 4 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/block/blk-lib.c b/block/blk-lib.c
index 5f2c429d4378..3e38c93cfc53 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -224,7 +224,7 @@ static int __blkdev_issue_write_zeroes(struct block_device *bdev,
 		return -EPERM;
 
 	/* Ensure that max_write_zeroes_sectors doesn't overflow bi_size */
-	max_write_zeroes_sectors = bdev_write_zeroes_sectors(bdev);
+	max_write_zeroes_sectors = bdev_write_zeroes_sectors(bdev, 0);
 
 	if (max_write_zeroes_sectors == 0)
 		return -EOPNOTSUPP;
@@ -362,7 +362,7 @@ int blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
 	sector_t bs_mask;
 	struct bio *bio;
 	struct blk_plug plug;
-	bool try_write_zeroes = !!bdev_write_zeroes_sectors(bdev);
+	bool try_write_zeroes = !!bdev_write_zeroes_sectors(bdev, 0);
 
 	bs_mask = (bdev_logical_block_size(bdev) >> 9) - 1;
 	if ((sector | nr_sects) & bs_mask)
@@ -391,7 +391,7 @@ int blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
 			try_write_zeroes = false;
 			goto retry;
 		}
-		if (!bdev_write_zeroes_sectors(bdev)) {
+		if (!bdev_write_zeroes_sectors(bdev, 0)) {
 			/*
 			 * Zeroing offload support was indicated, but the
 			 * device reported ILLEGAL REQUEST (for some devices
diff --git a/drivers/md/dm-kcopyd.c b/drivers/md/dm-kcopyd.c
index 1bbe4a34ef4c..f1b8e7926dd4 100644
--- a/drivers/md/dm-kcopyd.c
+++ b/drivers/md/dm-kcopyd.c
@@ -831,7 +831,7 @@ void dm_kcopyd_copy(struct dm_kcopyd_client *kc, struct dm_io_region *from,
 		 */
 		job->rw = REQ_OP_WRITE_ZEROES;
 		for (i = 0; i < job->num_dests; i++)
-			if (!bdev_write_zeroes_sectors(job->dests[i].bdev)) {
+			if (!bdev_write_zeroes_sectors(job->dests[i].bdev, 0)) {
 				job->rw = WRITE;
 				break;
 			}
diff --git a/drivers/target/target_core_iblock.c b/drivers/target/target_core_iblock.c
index 51ffd5c002de..73a63e197bf5 100644
--- a/drivers/target/target_core_iblock.c
+++ b/drivers/target/target_core_iblock.c
@@ -117,7 +117,7 @@ static int iblock_configure_device(struct se_device *dev)
 	 * Enable write same emulation for IBLOCK and use 0xFFFF as
 	 * the smaller WRITE_SAME(10) only has a two-byte block count.
 	 */
-	max_write_zeroes_sectors = bdev_write_zeroes_sectors(bd);
+	max_write_zeroes_sectors = bdev_write_zeroes_sectors(bd, 0);
 	if (max_write_zeroes_sectors)
 		dev->dev_attrib.max_write_same_len = max_write_zeroes_sectors;
 	else
@@ -468,7 +468,7 @@ iblock_execute_write_same(struct se_cmd *cmd)
 		return TCM_INVALID_CDB_FIELD;
 	}
 
-	if (bdev_write_zeroes_sectors(bdev)) {
+	if (bdev_write_zeroes_sectors(bdev, 0)) {
 		if (!iblock_execute_zero_out(bdev, cmd))
 			return 0;
 	}
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 04cfa798a365..0f1127d0b043 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1418,7 +1418,8 @@ static inline unsigned int bdev_write_same(struct block_device *bdev)
 	return 0;
 }
 
-static inline unsigned int bdev_write_zeroes_sectors(struct block_device *bdev)
+static inline unsigned int bdev_write_zeroes_sectors(struct block_device *bdev,
+						     unsigned int flags)
 {
 	struct request_queue *q = bdev_get_queue(bdev);
 




* [PATCH v4 2/7] block: Pass op_flags into blk_queue_get_max_sectors()
  2020-01-21 10:42 [PATCH v4 0/7] block: Introduce REQ_ALLOCATE flag for REQ_OP_WRITE_ZEROES Kirill Tkhai
  2020-01-21 10:42 ` [PATCH v4 1/7] block: Add @flags argument to bdev_write_zeroes_sectors() Kirill Tkhai
@ 2020-01-21 10:42 ` Kirill Tkhai
  2020-01-21 10:42 ` [PATCH v4 3/7] block: Introduce blk_queue_get_max_write_zeroes_sectors() Kirill Tkhai
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Kirill Tkhai @ 2020-01-21 10:42 UTC (permalink / raw)
  To: linux-block, linux-kernel, martin.petersen, bob.liu, axboe, agk,
	snitzer, dm-devel, song, tytso, adilger.kernel,
	Chaitanya.Kulkarni, darrick.wong, ming.lei, osandov, jthumshirn,
	minwoo.im.dev, damien.lemoal, andrea.parri, hare, tj, ajay.joshi,
	sagi, dsterba, chaitanya.kulkarni, bvanassche, dhowells,
	asml.silence, ktkhai

This preparation patch changes the argument type: the function
now takes the full op_flags instead of just the op code.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 block/blk-core.c       |    4 ++--
 include/linux/blkdev.h |    8 +++++---
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 50a5de025d5e..ac2634bcda1f 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1250,10 +1250,10 @@ EXPORT_SYMBOL(submit_bio);
 static int blk_cloned_rq_check_limits(struct request_queue *q,
 				      struct request *rq)
 {
-	if (blk_rq_sectors(rq) > blk_queue_get_max_sectors(q, req_op(rq))) {
+	if (blk_rq_sectors(rq) > blk_queue_get_max_sectors(q, rq->cmd_flags)) {
 		printk(KERN_ERR "%s: over max size limit. (%u > %u)\n",
 			__func__, blk_rq_sectors(rq),
-			blk_queue_get_max_sectors(q, req_op(rq)));
+			blk_queue_get_max_sectors(q, rq->cmd_flags));
 		return -EIO;
 	}
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 0f1127d0b043..23a5850f35f6 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -989,8 +989,10 @@ static inline struct bio_vec req_bvec(struct request *rq)
 }
 
 static inline unsigned int blk_queue_get_max_sectors(struct request_queue *q,
-						     int op)
+						     unsigned int op_flags)
 {
+	int op = op_flags & REQ_OP_MASK;
+
 	if (unlikely(op == REQ_OP_DISCARD || op == REQ_OP_SECURE_ERASE))
 		return min(q->limits.max_discard_sectors,
 			   UINT_MAX >> SECTOR_SHIFT);
@@ -1029,10 +1031,10 @@ static inline unsigned int blk_rq_get_max_sectors(struct request *rq,
 	if (!q->limits.chunk_sectors ||
 	    req_op(rq) == REQ_OP_DISCARD ||
 	    req_op(rq) == REQ_OP_SECURE_ERASE)
-		return blk_queue_get_max_sectors(q, req_op(rq));
+		return blk_queue_get_max_sectors(q, rq->cmd_flags);
 
 	return min(blk_max_size_offset(q, offset),
-			blk_queue_get_max_sectors(q, req_op(rq)));
+			blk_queue_get_max_sectors(q, rq->cmd_flags));
 }
 
 static inline unsigned int blk_rq_count_bios(struct request *rq)




* [PATCH v4 3/7] block: Introduce blk_queue_get_max_write_zeroes_sectors()
  2020-01-21 10:42 [PATCH v4 0/7] block: Introduce REQ_ALLOCATE flag for REQ_OP_WRITE_ZEROES Kirill Tkhai
  2020-01-21 10:42 ` [PATCH v4 1/7] block: Add @flags argument to bdev_write_zeroes_sectors() Kirill Tkhai
  2020-01-21 10:42 ` [PATCH v4 2/7] block: Pass op_flags into blk_queue_get_max_sectors() Kirill Tkhai
@ 2020-01-21 10:42 ` Kirill Tkhai
  2020-01-21 10:42 ` [PATCH v4 4/7] block: Add support for REQ_ALLOCATE flag Kirill Tkhai
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Kirill Tkhai @ 2020-01-21 10:42 UTC (permalink / raw)
  To: linux-block, linux-kernel, martin.petersen, bob.liu, axboe, agk,
	snitzer, dm-devel, song, tytso, adilger.kernel,
	Chaitanya.Kulkarni, darrick.wong, ming.lei, osandov, jthumshirn,
	minwoo.im.dev, damien.lemoal, andrea.parri, hare, tj, ajay.joshi,
	sagi, dsterba, chaitanya.kulkarni, bvanassche, dhowells,
	asml.silence, ktkhai

This introduces a new primitive, which returns the max sectors
for the REQ_OP_WRITE_ZEROES operation.
@op_flags is unused for now; it will be used in the next patch.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 block/blk-core.c       |    2 +-
 block/blk-merge.c      |    9 ++++++---
 include/linux/blkdev.h |    8 +++++++-
 3 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index ac2634bcda1f..2edcd55624f1 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -978,7 +978,7 @@ generic_make_request_checks(struct bio *bio)
 			goto not_supported;
 		break;
 	case REQ_OP_WRITE_ZEROES:
-		if (!q->limits.max_write_zeroes_sectors)
+		if (!blk_queue_get_max_write_zeroes_sectors(q, bio->bi_opf))
 			goto not_supported;
 		break;
 	default:
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 1534ed736363..467b292bc6e8 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -105,15 +105,18 @@ static struct bio *blk_bio_discard_split(struct request_queue *q,
 static struct bio *blk_bio_write_zeroes_split(struct request_queue *q,
 		struct bio *bio, struct bio_set *bs, unsigned *nsegs)
 {
+	unsigned int max_sectors;
+
+	max_sectors = blk_queue_get_max_write_zeroes_sectors(q, bio->bi_opf);
 	*nsegs = 0;
 
-	if (!q->limits.max_write_zeroes_sectors)
+	if (!max_sectors)
 		return NULL;
 
-	if (bio_sectors(bio) <= q->limits.max_write_zeroes_sectors)
+	if (bio_sectors(bio) <= max_sectors)
 		return NULL;
 
-	return bio_split(bio, q->limits.max_write_zeroes_sectors, GFP_NOIO, bs);
+	return bio_split(bio, max_sectors, GFP_NOIO, bs);
 }
 
 static struct bio *blk_bio_write_same_split(struct request_queue *q,
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 23a5850f35f6..264202fa3bf8 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -988,6 +988,12 @@ static inline struct bio_vec req_bvec(struct request *rq)
 	return mp_bvec_iter_bvec(rq->bio->bi_io_vec, rq->bio->bi_iter);
 }
 
+static inline unsigned int blk_queue_get_max_write_zeroes_sectors(
+		struct request_queue *q, unsigned int op_flags)
+{
+	return q->limits.max_write_zeroes_sectors;
+}
+
 static inline unsigned int blk_queue_get_max_sectors(struct request_queue *q,
 						     unsigned int op_flags)
 {
@@ -1001,7 +1007,7 @@ static inline unsigned int blk_queue_get_max_sectors(struct request_queue *q,
 		return q->limits.max_write_same_sectors;
 
 	if (unlikely(op == REQ_OP_WRITE_ZEROES))
-		return q->limits.max_write_zeroes_sectors;
+		return blk_queue_get_max_write_zeroes_sectors(q, op_flags);
 
 	return q->limits.max_sectors;
 }




* [PATCH v4 4/7] block: Add support for REQ_ALLOCATE flag
  2020-01-21 10:42 [PATCH v4 0/7] block: Introduce REQ_ALLOCATE flag for REQ_OP_WRITE_ZEROES Kirill Tkhai
                   ` (2 preceding siblings ...)
  2020-01-21 10:42 ` [PATCH v4 3/7] block: Introduce blk_queue_get_max_write_zeroes_sectors() Kirill Tkhai
@ 2020-01-21 10:42 ` Kirill Tkhai
  2020-01-21 10:42 ` [PATCH v4 5/7] block: Add blk_queue_max_allocate_sectors() Kirill Tkhai
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Kirill Tkhai @ 2020-01-21 10:42 UTC (permalink / raw)
  To: linux-block, linux-kernel, martin.petersen, bob.liu, axboe, agk,
	snitzer, dm-devel, song, tytso, adilger.kernel,
	Chaitanya.Kulkarni, darrick.wong, ming.lei, osandov, jthumshirn,
	minwoo.im.dev, damien.lemoal, andrea.parri, hare, tj, ajay.joshi,
	sagi, dsterba, chaitanya.kulkarni, bvanassche, dhowells,
	asml.silence, ktkhai

This adds support for the REQ_ALLOCATE extension of the REQ_OP_WRITE_ZEROES
operation, which asks a block device driver to just allocate blocks
(or mark them allocated) instead of actually zeroing them.
REQ_ALLOCATE is aimed at network filesystems providing a block
device interface. Block devices that map a file on another
filesystem (like loop) may also use it to reduce fragmentation
and to batch fallocate() requests. Hypervisors like QEMU may
build cluster allocation optimizations on top of it.

BLKDEV_ZERO_ALLOCATE is the corresponding new flag for
blkdev_issue_zeroout().
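
For in-kernel callers, usage would look roughly like the sketch below
(not part of this patch; 'bdev', 'sector' and 'nr_sects' are
placeholders):

	int err;

	/*
	 * Allocate nr_sects sectors starting at 'sector' without zeroing
	 * them; with BLKDEV_ZERO_NOFALLBACK the call fails instead of
	 * falling back to writing explicit zeroes on unsupported devices.
	 */
	err = blkdev_issue_zeroout(bdev, sector, nr_sects, GFP_KERNEL,
				   BLKDEV_ZERO_ALLOCATE | BLKDEV_ZERO_NOFALLBACK);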

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 block/blk-lib.c           |   17 ++++++++++-------
 block/blk-settings.c      |    4 ++++
 fs/block_dev.c            |    4 ++++
 include/linux/blk_types.h |    5 ++++-
 include/linux/blkdev.h    |   13 ++++++++++---
 5 files changed, 32 insertions(+), 11 deletions(-)

diff --git a/block/blk-lib.c b/block/blk-lib.c
index 3e38c93cfc53..9cd6f86523ba 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -214,7 +214,7 @@ static int __blkdev_issue_write_zeroes(struct block_device *bdev,
 		struct bio **biop, unsigned flags)
 {
 	struct bio *bio = *biop;
-	unsigned int max_write_zeroes_sectors;
+	unsigned int max_write_zeroes_sectors, req_flags = 0;
 	struct request_queue *q = bdev_get_queue(bdev);
 
 	if (!q)
@@ -224,18 +224,21 @@ static int __blkdev_issue_write_zeroes(struct block_device *bdev,
 		return -EPERM;
 
 	/* Ensure that max_write_zeroes_sectors doesn't overflow bi_size */
-	max_write_zeroes_sectors = bdev_write_zeroes_sectors(bdev, 0);
+	max_write_zeroes_sectors = bdev_write_zeroes_sectors(bdev, flags);
 
 	if (max_write_zeroes_sectors == 0)
 		return -EOPNOTSUPP;
 
+	if (flags & BLKDEV_ZERO_NOUNMAP)
+		req_flags |= REQ_NOUNMAP;
+	if (flags & BLKDEV_ZERO_ALLOCATE)
+		req_flags |= REQ_ALLOCATE|REQ_NOUNMAP;
+
 	while (nr_sects) {
 		bio = blk_next_bio(bio, 0, gfp_mask);
 		bio->bi_iter.bi_sector = sector;
 		bio_set_dev(bio, bdev);
-		bio->bi_opf = REQ_OP_WRITE_ZEROES;
-		if (flags & BLKDEV_ZERO_NOUNMAP)
-			bio->bi_opf |= REQ_NOUNMAP;
+		bio->bi_opf = REQ_OP_WRITE_ZEROES | req_flags;
 
 		if (nr_sects > max_write_zeroes_sectors) {
 			bio->bi_iter.bi_size = max_write_zeroes_sectors << 9;
@@ -362,7 +365,7 @@ int blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
 	sector_t bs_mask;
 	struct bio *bio;
 	struct blk_plug plug;
-	bool try_write_zeroes = !!bdev_write_zeroes_sectors(bdev, 0);
+	bool try_write_zeroes = !!bdev_write_zeroes_sectors(bdev, flags);
 
 	bs_mask = (bdev_logical_block_size(bdev) >> 9) - 1;
 	if ((sector | nr_sects) & bs_mask)
@@ -391,7 +394,7 @@ int blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
 			try_write_zeroes = false;
 			goto retry;
 		}
-		if (!bdev_write_zeroes_sectors(bdev, 0)) {
+		if (!bdev_write_zeroes_sectors(bdev, flags)) {
 			/*
 			 * Zeroing offload support was indicated, but the
 			 * device reported ILLEGAL REQUEST (for some devices
diff --git a/block/blk-settings.c b/block/blk-settings.c
index c8eda2e7b91e..81468cd454a6 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -48,6 +48,7 @@ void blk_set_default_limits(struct queue_limits *lim)
 	lim->chunk_sectors = 0;
 	lim->max_write_same_sectors = 0;
 	lim->max_write_zeroes_sectors = 0;
+	lim->max_allocate_sectors = 0;
 	lim->max_discard_sectors = 0;
 	lim->max_hw_discard_sectors = 0;
 	lim->discard_granularity = 0;
@@ -83,6 +84,7 @@ void blk_set_stacking_limits(struct queue_limits *lim)
 	lim->max_dev_sectors = UINT_MAX;
 	lim->max_write_same_sectors = UINT_MAX;
 	lim->max_write_zeroes_sectors = UINT_MAX;
+	lim->max_allocate_sectors = UINT_MAX;
 }
 EXPORT_SYMBOL(blk_set_stacking_limits);
 
@@ -506,6 +508,8 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
 					b->max_write_same_sectors);
 	t->max_write_zeroes_sectors = min(t->max_write_zeroes_sectors,
 					b->max_write_zeroes_sectors);
+	t->max_allocate_sectors = min(t->max_allocate_sectors,
+					b->max_allocate_sectors);
 	t->bounce_pfn = min_not_zero(t->bounce_pfn, b->bounce_pfn);
 
 	t->seg_boundary_mask = min_not_zero(t->seg_boundary_mask,
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 69bf2fb6f7cd..1ffef894b3bd 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -2122,6 +2122,10 @@ static long blkdev_fallocate(struct file *file, int mode, loff_t start,
 		error = blkdev_issue_zeroout(bdev, start >> 9, len >> 9,
 					     GFP_KERNEL, BLKDEV_ZERO_NOFALLBACK);
 		break;
+	case FALLOC_FL_KEEP_SIZE:
+		error = blkdev_issue_zeroout(bdev, start >> 9, len >> 9,
+			GFP_KERNEL, BLKDEV_ZERO_ALLOCATE | BLKDEV_ZERO_NOFALLBACK);
+		break;
 	case FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE | FALLOC_FL_NO_HIDE_STALE:
 		error = blkdev_issue_discard(bdev, start >> 9, len >> 9,
 					     GFP_KERNEL, 0);
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 70254ae11769..86accd2caa4e 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -335,7 +335,9 @@ enum req_flag_bits {
 
 	/* command specific flags for REQ_OP_WRITE_ZEROES: */
 	__REQ_NOUNMAP,		/* do not free blocks when zeroing */
-
+	__REQ_ALLOCATE,		/* only notify about allocated blocks,
+				 * and do not actually zero them
+				 */
 	__REQ_HIPRI,
 
 	/* for driver use */
@@ -362,6 +364,7 @@ enum req_flag_bits {
 #define REQ_CGROUP_PUNT		(1ULL << __REQ_CGROUP_PUNT)
 
 #define REQ_NOUNMAP		(1ULL << __REQ_NOUNMAP)
+#define REQ_ALLOCATE		(1ULL << __REQ_ALLOCATE)
 #define REQ_HIPRI		(1ULL << __REQ_HIPRI)
 
 #define REQ_DRV			(1ULL << __REQ_DRV)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 264202fa3bf8..20c94a7f9411 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -337,6 +337,7 @@ struct queue_limits {
 	unsigned int		max_hw_discard_sectors;
 	unsigned int		max_write_same_sectors;
 	unsigned int		max_write_zeroes_sectors;
+	unsigned int		max_allocate_sectors;
 	unsigned int		discard_granularity;
 	unsigned int		discard_alignment;
 
@@ -991,6 +992,8 @@ static inline struct bio_vec req_bvec(struct request *rq)
 static inline unsigned int blk_queue_get_max_write_zeroes_sectors(
 		struct request_queue *q, unsigned int op_flags)
 {
+	if (op_flags & REQ_ALLOCATE)
+		return q->limits.max_allocate_sectors;
 	return q->limits.max_write_zeroes_sectors;
 }
 
@@ -1227,6 +1230,7 @@ extern int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
 
 #define BLKDEV_ZERO_NOUNMAP	(1 << 0)  /* do not free blocks */
 #define BLKDEV_ZERO_NOFALLBACK	(1 << 1)  /* don't write explicit zeroes */
+#define BLKDEV_ZERO_ALLOCATE	(1 << 2)  /* allocate range of blocks */
 
 extern int __blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
 		sector_t nr_sects, gfp_t gfp_mask, struct bio **biop,
@@ -1431,10 +1435,13 @@ static inline unsigned int bdev_write_zeroes_sectors(struct block_device *bdev,
 {
 	struct request_queue *q = bdev_get_queue(bdev);
 
-	if (q)
-		return q->limits.max_write_zeroes_sectors;
+	if (!q)
+		return 0;
 
-	return 0;
+	if (flags & BLKDEV_ZERO_ALLOCATE)
+		return q->limits.max_allocate_sectors;
+	else
+		return q->limits.max_write_zeroes_sectors;
 }
 
 static inline enum blk_zoned_model bdev_zoned_model(struct block_device *bdev)




* [PATCH v4 5/7] block: Add blk_queue_max_allocate_sectors()
  2020-01-21 10:42 [PATCH v4 0/7] block: Introduce REQ_ALLOCATE flag for REQ_OP_WRITE_ZEROES Kirill Tkhai
                   ` (3 preceding siblings ...)
  2020-01-21 10:42 ` [PATCH v4 4/7] block: Add support for REQ_ALLOCATE flag Kirill Tkhai
@ 2020-01-21 10:42 ` Kirill Tkhai
  2020-01-21 10:42 ` [PATCH v4 6/7] dm: Directly disable max_allocate_sectors for now Kirill Tkhai
  2020-01-21 10:42 ` [PATCH v4 7/7] loop: Add support for REQ_ALLOCATE Kirill Tkhai
  6 siblings, 0 replies; 15+ messages in thread
From: Kirill Tkhai @ 2020-01-21 10:42 UTC (permalink / raw)
  To: linux-block, linux-kernel, martin.petersen, bob.liu, axboe, agk,
	snitzer, dm-devel, song, tytso, adilger.kernel,
	Chaitanya.Kulkarni, darrick.wong, ming.lei, osandov, jthumshirn,
	minwoo.im.dev, damien.lemoal, andrea.parri, hare, tj, ajay.joshi,
	sagi, dsterba, chaitanya.kulkarni, bvanassche, dhowells,
	asml.silence, ktkhai

This is a new helper to set the max_allocate_sectors
limit of a block device queue.
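
A driver that can satisfy REQ_ALLOCATE would advertise it from its
queue setup code, roughly as loop does later in this series (sketch
only; 'q' is the driver's request queue):

	/*
	 * Announce REQ_ALLOCATE support; drivers that cannot handle it
	 * simply keep the default max_allocate_sectors of 0.
	 */
	blk_queue_max_allocate_sectors(q, UINT_MAX >> 9);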

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 block/blk-settings.c   |   13 +++++++++++++
 include/linux/blkdev.h |    2 ++
 2 files changed, 15 insertions(+)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index 81468cd454a6..8d2fcbd693f9 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -259,6 +259,19 @@ void blk_queue_max_write_zeroes_sectors(struct request_queue *q,
 }
 EXPORT_SYMBOL(blk_queue_max_write_zeroes_sectors);
 
+/**
+ * blk_queue_max_allocate_sectors - set max sectors for a single
+ *                                  allocate request
+ * @q:  the request queue for the device
+ * @max_allocate_sectors: maximum number of sectors to write per command
+ **/
+void blk_queue_max_allocate_sectors(struct request_queue *q,
+		unsigned int max_allocate_sectors)
+{
+	q->limits.max_allocate_sectors = max_allocate_sectors;
+}
+EXPORT_SYMBOL(blk_queue_max_allocate_sectors);
+
 /**
  * blk_queue_max_segments - set max hw segments for a request for this queue
  * @q:  the request queue for the device
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 20c94a7f9411..249dce6dd436 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1089,6 +1089,8 @@ extern void blk_queue_max_write_same_sectors(struct request_queue *q,
 		unsigned int max_write_same_sectors);
 extern void blk_queue_max_write_zeroes_sectors(struct request_queue *q,
 		unsigned int max_write_same_sectors);
+extern void blk_queue_max_allocate_sectors(struct request_queue *q,
+		unsigned int max_allocate_sectors);
 extern void blk_queue_logical_block_size(struct request_queue *, unsigned int);
 extern void blk_queue_physical_block_size(struct request_queue *, unsigned int);
 extern void blk_queue_alignment_offset(struct request_queue *q,




* [PATCH v4 6/7] dm: Directly disable max_allocate_sectors for now
  2020-01-21 10:42 [PATCH v4 0/7] block: Introduce REQ_ALLOCATE flag for REQ_OP_WRITE_ZEROES Kirill Tkhai
                   ` (4 preceding siblings ...)
  2020-01-21 10:42 ` [PATCH v4 5/7] block: Add blk_queue_max_allocate_sectors() Kirill Tkhai
@ 2020-01-21 10:42 ` Kirill Tkhai
  2020-01-21 12:24   ` Mike Snitzer
  2020-01-21 10:42 ` [PATCH v4 7/7] loop: Add support for REQ_ALLOCATE Kirill Tkhai
  6 siblings, 1 reply; 15+ messages in thread
From: Kirill Tkhai @ 2020-01-21 10:42 UTC (permalink / raw)
  To: linux-block, linux-kernel, martin.petersen, bob.liu, axboe, agk,
	snitzer, dm-devel, song, tytso, adilger.kernel,
	Chaitanya.Kulkarni, darrick.wong, ming.lei, osandov, jthumshirn,
	minwoo.im.dev, damien.lemoal, andrea.parri, hare, tj, ajay.joshi,
	sagi, dsterba, chaitanya.kulkarni, bvanassche, dhowells,
	asml.silence, ktkhai

Since dm inherits limits from underlying block devices,
this patch directly disables max_allocate_sectors for dm
until full allocation support is implemented.

This prevents high-level primitives (generic_make_request_checks(),
__blkdev_issue_write_zeroes(), ...) from sending REQ_ALLOCATE
requests.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 drivers/md/dm-table.c |    2 ++
 drivers/md/md.h       |    1 +
 2 files changed, 3 insertions(+)

diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 0a2cc197f62b..e245c0d882aa 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -489,6 +489,7 @@ static int dm_set_device_limits(struct dm_target *ti, struct dm_dev *dev,
 		       (unsigned long long) start << SECTOR_SHIFT);
 
 	limits->zoned = blk_queue_zoned_model(q);
+	blk_queue_max_allocate_sectors(q, 0);
 
 	return 0;
 }
@@ -1548,6 +1549,7 @@ int dm_calculate_queue_limits(struct dm_table *table,
 			       dm_device_name(table->md),
 			       (unsigned long long) ti->begin,
 			       (unsigned long long) ti->len);
+		limits->max_allocate_sectors = 0;
 
 		/*
 		 * FIXME: this should likely be moved to blk_stack_limits(), would
diff --git a/drivers/md/md.h b/drivers/md/md.h
index acd681939112..d9088122674d 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -798,5 +798,6 @@ static inline void mddev_check_write_zeroes(struct mddev *mddev, struct bio *bio
 	if (bio_op(bio) == REQ_OP_WRITE_ZEROES &&
 	    !bio->bi_disk->queue->limits.max_write_zeroes_sectors)
 		mddev->queue->limits.max_write_zeroes_sectors = 0;
+	blk_queue_max_allocate_sectors(mddev->queue, 0);
 }
 #endif /* _MD_MD_H */




* [PATCH v4 7/7] loop: Add support for REQ_ALLOCATE
  2020-01-21 10:42 [PATCH v4 0/7] block: Introduce REQ_ALLOCATE flag for REQ_OP_WRITE_ZEROES Kirill Tkhai
                   ` (5 preceding siblings ...)
  2020-01-21 10:42 ` [PATCH v4 6/7] dm: Directly disable max_allocate_sectors for now Kirill Tkhai
@ 2020-01-21 10:42 ` Kirill Tkhai
  6 siblings, 0 replies; 15+ messages in thread
From: Kirill Tkhai @ 2020-01-21 10:42 UTC (permalink / raw)
  To: linux-block, linux-kernel, martin.petersen, bob.liu, axboe, agk,
	snitzer, dm-devel, song, tytso, adilger.kernel,
	Chaitanya.Kulkarni, darrick.wong, ming.lei, osandov, jthumshirn,
	minwoo.im.dev, damien.lemoal, andrea.parri, hare, tj, ajay.joshi,
	sagi, dsterba, chaitanya.kulkarni, bvanassche, dhowells,
	asml.silence, ktkhai

Add support for the new REQ_ALLOCATE modifier of the REQ_OP_WRITE_ZEROES
command. With it set, the loop driver allocates extents in the backing
file instead of actually zeroing blocks.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 drivers/block/loop.c |   15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 739b372a5112..bfe76d9adf09 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -581,6 +581,15 @@ static int lo_rw_aio(struct loop_device *lo, struct loop_cmd *cmd,
 	return 0;
 }
 
+static unsigned int write_zeroes_to_fallocate_mode(unsigned int flags)
+{
+	if (flags & REQ_ALLOCATE)
+		return 0;
+	if (flags & REQ_NOUNMAP)
+		return FALLOC_FL_ZERO_RANGE;
+	return FALLOC_FL_PUNCH_HOLE;
+}
+
 static int do_req_filebacked(struct loop_device *lo, struct request *rq)
 {
 	struct loop_cmd *cmd = blk_mq_rq_to_pdu(rq);
@@ -604,9 +613,7 @@ static int do_req_filebacked(struct loop_device *lo, struct request *rq)
 		 * write zeroes the range.  Otherwise, punch them out.
 		 */
 		return lo_fallocate(lo, rq, pos,
-			(rq->cmd_flags & REQ_NOUNMAP) ?
-				FALLOC_FL_ZERO_RANGE :
-				FALLOC_FL_PUNCH_HOLE);
+			write_zeroes_to_fallocate_mode(rq->cmd_flags));
 	case REQ_OP_DISCARD:
 		return lo_fallocate(lo, rq, pos, FALLOC_FL_PUNCH_HOLE);
 	case REQ_OP_WRITE:
@@ -877,6 +884,7 @@ static void loop_config_discard(struct loop_device *lo)
 		q->limits.discard_alignment = 0;
 		blk_queue_max_discard_sectors(q, 0);
 		blk_queue_max_write_zeroes_sectors(q, 0);
+		blk_queue_max_allocate_sectors(q, 0);
 		blk_queue_flag_clear(QUEUE_FLAG_DISCARD, q);
 		return;
 	}
@@ -886,6 +894,7 @@ static void loop_config_discard(struct loop_device *lo)
 
 	blk_queue_max_discard_sectors(q, UINT_MAX >> 9);
 	blk_queue_max_write_zeroes_sectors(q, UINT_MAX >> 9);
+	blk_queue_max_allocate_sectors(q, UINT_MAX >> 9);
 	blk_queue_flag_set(QUEUE_FLAG_DISCARD, q);
 }
 




* Re: [PATCH v4 6/7] dm: Directly disable max_allocate_sectors for now
  2020-01-21 10:42 ` [PATCH v4 6/7] dm: Directly disable max_allocate_sectors for now Kirill Tkhai
@ 2020-01-21 12:24   ` Mike Snitzer
  2020-01-21 12:36     ` Kirill Tkhai
  0 siblings, 1 reply; 15+ messages in thread
From: Mike Snitzer @ 2020-01-21 12:24 UTC (permalink / raw)
  To: Kirill Tkhai
  Cc: linux-block, linux-kernel, martin.petersen, bob.liu, axboe, agk,
	dm-devel, song, tytso, adilger.kernel, Chaitanya.Kulkarni,
	darrick.wong, ming.lei, osandov, jthumshirn, minwoo.im.dev,
	damien.lemoal, andrea.parri, hare, tj, ajay.joshi, sagi, dsterba,
	bvanassche, dhowells, asml.silence

On Tue, Jan 21 2020 at  5:42am -0500,
Kirill Tkhai <ktkhai@virtuozzo.com> wrote:

> Since dm inherits limits from underlining block devices,
> this patch directly disables max_allocate_sectors for dm
> till full allocation support is implemented.
> 
> This prevents high-level primitives (generic_make_request_checks(),
> __blkdev_issue_write_zeroes(), ...) from sending REQ_ALLOCATE
> requests.
> 
> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
> ---
>  drivers/md/dm-table.c |    2 ++
>  drivers/md/md.h       |    1 +
>  2 files changed, 3 insertions(+)

You're mixing DM and MD changes in the same patch.

But I'm wondering if it might be best to set this default for stacking
devices in blk_set_stacking_limits()?

And then it is up to each stacking driver to override as needed.

Mike



* Re: [PATCH v4 6/7] dm: Directly disable max_allocate_sectors for now
  2020-01-21 12:24   ` Mike Snitzer
@ 2020-01-21 12:36     ` Kirill Tkhai
  2020-01-21 13:33       ` Kirill Tkhai
  0 siblings, 1 reply; 15+ messages in thread
From: Kirill Tkhai @ 2020-01-21 12:36 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: linux-block, linux-kernel, martin.petersen, bob.liu, axboe, agk,
	dm-devel, song, tytso, adilger.kernel, Chaitanya.Kulkarni,
	darrick.wong, ming.lei, osandov, jthumshirn, minwoo.im.dev,
	damien.lemoal, andrea.parri, hare, tj, ajay.joshi, sagi, dsterba,
	bvanassche, dhowells, asml.silence

On 21.01.2020 15:24, Mike Snitzer wrote:
> On Tue, Jan 21 2020 at  5:42am -0500,
> Kirill Tkhai <ktkhai@virtuozzo.com> wrote:
> 
>> Since dm inherits limits from underlining block devices,
>> this patch directly disables max_allocate_sectors for dm
>> till full allocation support is implemented.
>>
>> This prevents high-level primitives (generic_make_request_checks(),
>> __blkdev_issue_write_zeroes(), ...) from sending REQ_ALLOCATE
>> requests.
>>
>> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
>> ---
>>  drivers/md/dm-table.c |    2 ++
>>  drivers/md/md.h       |    1 +
>>  2 files changed, 3 insertions(+)
> 
> You're mixing DM and MD changes in the same patch.
> 
> But I'm wondering if it might be best to set this default for stacking
> devices in blk_set_stacking_limits()?
> 
> And then it is up to each stacking driver to override as needed.

Hm. Sounds like a good idea. Setting "lim->max_allocate_sectors = 0" in blk_set_stacking_limits()
should work for dm's dm_calculate_queue_limits(), since it calls blk_stack_limits(), which does:

	t->max_allocate_sectors = min(t->max_allocate_sectors,
				      b->max_allocate_sectors);

Could you please tell me whether this fix is also enough for md?

Kirill


* Re: [PATCH v4 6/7] dm: Directly disable max_allocate_sectors for now
  2020-01-21 12:36     ` Kirill Tkhai
@ 2020-01-21 13:33       ` Kirill Tkhai
  2020-01-21 13:48         ` Mike Snitzer
  0 siblings, 1 reply; 15+ messages in thread
From: Kirill Tkhai @ 2020-01-21 13:33 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: linux-block, linux-kernel, martin.petersen, bob.liu, axboe, agk,
	dm-devel, song, tytso, adilger.kernel, Chaitanya.Kulkarni,
	darrick.wong, ming.lei, osandov, jthumshirn, minwoo.im.dev,
	damien.lemoal, andrea.parri, hare, tj, ajay.joshi, sagi, dsterba,
	bvanassche, dhowells, asml.silence

On 21.01.2020 15:36, Kirill Tkhai wrote:
> On 21.01.2020 15:24, Mike Snitzer wrote:
>> On Tue, Jan 21 2020 at  5:42am -0500,
>> Kirill Tkhai <ktkhai@virtuozzo.com> wrote:
>>
>>> Since dm inherits limits from underlining block devices,
>>> this patch directly disables max_allocate_sectors for dm
>>> till full allocation support is implemented.
>>>
>>> This prevents high-level primitives (generic_make_request_checks(),
>>> __blkdev_issue_write_zeroes(), ...) from sending REQ_ALLOCATE
>>> requests.
>>>
>>> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
>>> ---
>>>  drivers/md/dm-table.c |    2 ++
>>>  drivers/md/md.h       |    1 +
>>>  2 files changed, 3 insertions(+)
>>
>> You're mixing DM and MD changes in the same patch.
>>
>> But I'm wondering if it might be best to set this default for stacking
>> devices in blk_set_stacking_limits()?
>>
>> And then it is up to each stacking driver to override as needed.
> 
> Hm. Sound like a good idea. This "lim->max_allocate_sectors = 0" in blk_set_stacking_limits()
> should work for dm's dm_calculate_queue_limits(), since it calls blk_stack_limits(), which is:
> 
> 	t->max_allocate_sectors = min(t->max_allocate_sectors,
> 				      b->max_allocate_sectors);
> 
> Could you please tell is this fix is also enough for md?

It looks like it's enough, since queue defaults are set in md_alloc()->blk_set_stacking_limits().
If we set "max_allocate_sectors = 0" there, it can later be changed only manually,
and nobody does that.


* Re: [PATCH v4 6/7] dm: Directly disable max_allocate_sectors for now
  2020-01-21 13:33       ` Kirill Tkhai
@ 2020-01-21 13:48         ` Mike Snitzer
  2020-01-21 14:20           ` Kirill Tkhai
  0 siblings, 1 reply; 15+ messages in thread
From: Mike Snitzer @ 2020-01-21 13:48 UTC (permalink / raw)
  To: Kirill Tkhai
  Cc: linux-block, linux-kernel, martin.petersen, bob.liu, axboe, agk,
	dm-devel, song, tytso, adilger.kernel, Chaitanya.Kulkarni,
	darrick.wong, ming.lei, osandov, jthumshirn, minwoo.im.dev,
	damien.lemoal, andrea.parri, hare, tj, ajay.joshi, sagi, dsterba,
	bvanassche, dhowells, asml.silence

On Tue, Jan 21 2020 at  8:33am -0500,
Kirill Tkhai <ktkhai@virtuozzo.com> wrote:

> On 21.01.2020 15:36, Kirill Tkhai wrote:
> > On 21.01.2020 15:24, Mike Snitzer wrote:
> >> On Tue, Jan 21 2020 at  5:42am -0500,
> >> Kirill Tkhai <ktkhai@virtuozzo.com> wrote:
> >>
> >>> Since dm inherits limits from underlining block devices,
> >>> this patch directly disables max_allocate_sectors for dm
> >>> till full allocation support is implemented.
> >>>
> >>> This prevents high-level primitives (generic_make_request_checks(),
> >>> __blkdev_issue_write_zeroes(), ...) from sending REQ_ALLOCATE
> >>> requests.
> >>>
> >>> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
> >>> ---
> >>>  drivers/md/dm-table.c |    2 ++
> >>>  drivers/md/md.h       |    1 +
> >>>  2 files changed, 3 insertions(+)
> >>
> >> You're mixing DM and MD changes in the same patch.
> >>
> >> But I'm wondering if it might be best to set this default for stacking
> >> devices in blk_set_stacking_limits()?
> >>
> >> And then it is up to each stacking driver to override as needed.
> > 
> > Hm. Sound like a good idea. This "lim->max_allocate_sectors = 0" in blk_set_stacking_limits()
> > should work for dm's dm_calculate_queue_limits(), since it calls blk_stack_limits(), which is:
> > 
> > 	t->max_allocate_sectors = min(t->max_allocate_sectors,
> > 				      b->max_allocate_sectors);
> > 
> > Could you please tell is this fix is also enough for md?
> 
> It looks like it's enough since queue defaults are set in md_alloc()->blk_set_stacking_limits().
> In case of we set "max_allocate_sectors = 0", in further it can be changed only manually,
> but nobody does this.

Yes, it will work to disable this capability for MD and DM.

But if/when a stacked device _does_ want to support this then it'll be
awkward to override this stacking default to allow blk_stack_limits()
to properly stack up this limit.  blk_limits are extremely fiddly so
this isn't necessarily new.  But by explicitly defaulting to 0 and then
having blk_stack_limits() use min() for this limit, it results in stacking
drivers needing to clumsily unwind the default.  E.g. DM will need to
tweak its blk_stack_limits()-related code to allow an override that
actually _does_ stack up the underlying devices' capability (and not
just impose its own limit that ignores the underlying devices).

So I'm not convinced this is the right way to go (be it the v4 approach
you took or the cleaner use of blk_set_stacking_limits I suggested).

And to be clear, I'm interested in having DM thinp support this
capability to preallocate blocks.

Mike



* Re: [PATCH v4 6/7] dm: Directly disable max_allocate_sectors for now
  2020-01-21 13:48         ` Mike Snitzer
@ 2020-01-21 14:20           ` Kirill Tkhai
  2020-01-21 14:43             ` Mike Snitzer
  0 siblings, 1 reply; 15+ messages in thread
From: Kirill Tkhai @ 2020-01-21 14:20 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: linux-block, linux-kernel, martin.petersen, bob.liu, axboe, agk,
	dm-devel, song, tytso, adilger.kernel, Chaitanya.Kulkarni,
	darrick.wong, ming.lei, osandov, jthumshirn, minwoo.im.dev,
	damien.lemoal, andrea.parri, hare, tj, ajay.joshi, sagi, dsterba,
	bvanassche, dhowells, asml.silence

On 21.01.2020 16:48, Mike Snitzer wrote:
> On Tue, Jan 21 2020 at  8:33am -0500,
> Kirill Tkhai <ktkhai@virtuozzo.com> wrote:
> 
>> On 21.01.2020 15:36, Kirill Tkhai wrote:
>>> On 21.01.2020 15:24, Mike Snitzer wrote:
>>>> On Tue, Jan 21 2020 at  5:42am -0500,
>>>> Kirill Tkhai <ktkhai@virtuozzo.com> wrote:
>>>>
>>>>> Since dm inherits limits from underlining block devices,
>>>>> this patch directly disables max_allocate_sectors for dm
>>>>> till full allocation support is implemented.
>>>>>
>>>>> This prevents high-level primitives (generic_make_request_checks(),
>>>>> __blkdev_issue_write_zeroes(), ...) from sending REQ_ALLOCATE
>>>>> requests.
>>>>>
>>>>> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
>>>>> ---
>>>>>  drivers/md/dm-table.c |    2 ++
>>>>>  drivers/md/md.h       |    1 +
>>>>>  2 files changed, 3 insertions(+)
>>>>
>>>> You're mixing DM and MD changes in the same patch.
>>>>
>>>> But I'm wondering if it might be best to set this default for stacking
>>>> devices in blk_set_stacking_limits()?
>>>>
>>>> And then it is up to each stacking driver to override as needed.
>>>
>>> Hm. Sound like a good idea. This "lim->max_allocate_sectors = 0" in blk_set_stacking_limits()
>>> should work for dm's dm_calculate_queue_limits(), since it calls blk_stack_limits(), which is:
>>>
>>> 	t->max_allocate_sectors = min(t->max_allocate_sectors,
>>> 				      b->max_allocate_sectors);
>>>
>>> Could you please tell is this fix is also enough for md?
>>
>> It looks like it's enough since queue defaults are set in md_alloc()->blk_set_stacking_limits().
>> In case of we set "max_allocate_sectors = 0", in further it can be changed only manually,
>> but nobody does this.
> 
> Yes, it will work to disable this capability for MD and DM.
> 
> But if/when a stacked device _dooes_ want to support this then it'll be
> awkward to override this stacking default to allow blk_stack_limits()
> to properly stack up this limit.  blk_limits are extremely fiddley so
> this isn't necessarily new.  But by explicitly defaulting to 0 and then
> having blk_stack_limits use min() for this limit: it results in stacking
> drivers needing to clumsily unwind the default.  E.g. DM will need to
> tweak its blk_stack_limits() related code to allow override that
> actually _does_  stack up the underlying devices' capability (and not
> just impose its own limit that ignores the underlying devices).
> 
> So I'm not convinced this is the right way to go (be it the v4 approach
> you took or the cleaner use of blk_set_stacking_limits I suggested).

Is there a strong vision about the way we should go? Or do you leave this choice
up to me?

> And to be clear, I'm interested in having DM thinp support this
> capability to preallocate blocks.

My opinion is that it would be better not to mix support for several
subsystems in a single patch set. Both approaches (v4 or the one you
suggested) do not prevent us from implementing allocation support in a
follow-up patch series. After we have the base functionality enabled,
we may add support in other subsystems and drivers one by one, with
more focus on each subsystem's specifics and with the best possible attention.

Regards,
Kirill


* Re: [PATCH v4 6/7] dm: Directly disable max_allocate_sectors for now
  2020-01-21 14:20           ` Kirill Tkhai
@ 2020-01-21 14:43             ` Mike Snitzer
  2020-01-21 15:13               ` Kirill Tkhai
  0 siblings, 1 reply; 15+ messages in thread
From: Mike Snitzer @ 2020-01-21 14:43 UTC (permalink / raw)
  To: Kirill Tkhai
  Cc: linux-block, linux-kernel, martin.petersen, bob.liu, axboe, agk,
	dm-devel, song, tytso, adilger.kernel, Chaitanya.Kulkarni,
	darrick.wong, ming.lei, osandov, jthumshirn, minwoo.im.dev,
	damien.lemoal, andrea.parri, hare, tj, ajay.joshi, sagi, dsterba,
	bvanassche, dhowells, asml.silence

On Tue, Jan 21 2020 at  9:20am -0500,
Kirill Tkhai <ktkhai@virtuozzo.com> wrote:

> On 21.01.2020 16:48, Mike Snitzer wrote:
> > On Tue, Jan 21 2020 at  8:33am -0500,
> > Kirill Tkhai <ktkhai@virtuozzo.com> wrote:
> > 
> >> On 21.01.2020 15:36, Kirill Tkhai wrote:
> >>> On 21.01.2020 15:24, Mike Snitzer wrote:
> >>>> On Tue, Jan 21 2020 at  5:42am -0500,
> >>>> Kirill Tkhai <ktkhai@virtuozzo.com> wrote:
> >>>>
> >>>>> Since dm inherits limits from underlining block devices,
> >>>>> this patch directly disables max_allocate_sectors for dm
> >>>>> till full allocation support is implemented.
> >>>>>
> >>>>> This prevents high-level primitives (generic_make_request_checks(),
> >>>>> __blkdev_issue_write_zeroes(), ...) from sending REQ_ALLOCATE
> >>>>> requests.
> >>>>>
> >>>>> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
> >>>>> ---
> >>>>>  drivers/md/dm-table.c |    2 ++
> >>>>>  drivers/md/md.h       |    1 +
> >>>>>  2 files changed, 3 insertions(+)
> >>>>
> >>>> You're mixing DM and MD changes in the same patch.
> >>>>
> >>>> But I'm wondering if it might be best to set this default for stacking
> >>>> devices in blk_set_stacking_limits()?
> >>>>
> >>>> And then it is up to each stacking driver to override as needed.
> >>>
> >>> Hm. Sound like a good idea. This "lim->max_allocate_sectors = 0" in blk_set_stacking_limits()
> >>> should work for dm's dm_calculate_queue_limits(), since it calls blk_stack_limits(), which is:
> >>>
> >>> 	t->max_allocate_sectors = min(t->max_allocate_sectors,
> >>> 				      b->max_allocate_sectors);
> >>>
> >>> Could you please tell is this fix is also enough for md?
> >>
> >> It looks like it's enough since queue defaults are set in md_alloc()->blk_set_stacking_limits().
> >> In case of we set "max_allocate_sectors = 0", in further it can be changed only manually,
> >> but nobody does this.
> > 
> > Yes, it will work to disable this capability for MD and DM.
> > 
> > But if/when a stacked device _dooes_ want to support this then it'll be
> > awkward to override this stacking default to allow blk_stack_limits()
> > to properly stack up this limit.  blk_limits are extremely fiddley so
> > this isn't necessarily new.  But by explicitly defaulting to 0 and then
> > having blk_stack_limits use min() for this limit: it results in stacking
> > drivers needing to clumsily unwind the default.  E.g. DM will need to
> > tweak its blk_stack_limits() related code to allow override that
> > actually _does_  stack up the underlying devices' capability (and not
> > just impose its own limit that ignores the underlying devices).
> > 
> > So I'm not convinced this is the right way to go (be it the v4 approach
> > you took or the cleaner use of blk_set_stacking_limits I suggested).
> 
> Is there a strong vision about the way we should go? Or you leave this choose
> up to me?

I don't have time to work through it at the moment (e.g. implementing
dm-thinp support to know what the block core code should be), so I'll
just defer to you on disabling it for now.

> > And to be clear, I'm interested in having DM thinp support this
> > capability to preallocate blocks.
> 
> My opinion is it would be better to not mix several subsystem related
> support in a single patch set. Both of the approaches (v4 or that you
> suggested) do not prevents us to implement allocation support in next
> patch series. After we have the base functionality enabled, we may add
> support in other subsystems and drivers one by one with more focus
> on the subsystem specificities and with the best possible attention.

Yeah, I'm aware nothing is ever set in stone.

Setting to 0 in blk_set_stacking_limits() is OK for now.

Thanks,
Mike



* Re: [PATCH v4 6/7] dm: Directly disable max_allocate_sectors for now
  2020-01-21 14:43             ` Mike Snitzer
@ 2020-01-21 15:13               ` Kirill Tkhai
  0 siblings, 0 replies; 15+ messages in thread
From: Kirill Tkhai @ 2020-01-21 15:13 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: linux-block, linux-kernel, martin.petersen, bob.liu, axboe, agk,
	dm-devel, song, tytso, adilger.kernel, Chaitanya.Kulkarni,
	darrick.wong, ming.lei, osandov, jthumshirn, minwoo.im.dev,
	damien.lemoal, andrea.parri, hare, tj, ajay.joshi, sagi, dsterba,
	bvanassche, dhowells, asml.silence

On 21.01.2020 17:43, Mike Snitzer wrote:
> On Tue, Jan 21 2020 at  9:20am -0500,
> Kirill Tkhai <ktkhai@virtuozzo.com> wrote:
> 
>> On 21.01.2020 16:48, Mike Snitzer wrote:
>>> On Tue, Jan 21 2020 at  8:33am -0500,
>>> Kirill Tkhai <ktkhai@virtuozzo.com> wrote:
>>>
>>>> On 21.01.2020 15:36, Kirill Tkhai wrote:
>>>>> On 21.01.2020 15:24, Mike Snitzer wrote:
>>>>>> On Tue, Jan 21 2020 at  5:42am -0500,
>>>>>> Kirill Tkhai <ktkhai@virtuozzo.com> wrote:
>>>>>>
>>>>>>> Since dm inherits limits from underlining block devices,
>>>>>>> this patch directly disables max_allocate_sectors for dm
>>>>>>> till full allocation support is implemented.
>>>>>>>
>>>>>>> This prevents high-level primitives (generic_make_request_checks(),
>>>>>>> __blkdev_issue_write_zeroes(), ...) from sending REQ_ALLOCATE
>>>>>>> requests.
>>>>>>>
>>>>>>> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
>>>>>>> ---
>>>>>>>  drivers/md/dm-table.c |    2 ++
>>>>>>>  drivers/md/md.h       |    1 +
>>>>>>>  2 files changed, 3 insertions(+)
>>>>>>
>>>>>> You're mixing DM and MD changes in the same patch.
>>>>>>
>>>>>> But I'm wondering if it might be best to set this default for stacking
>>>>>> devices in blk_set_stacking_limits()?
>>>>>>
>>>>>> And then it is up to each stacking driver to override as needed.
>>>>>
>>>>> Hm. Sound like a good idea. This "lim->max_allocate_sectors = 0" in blk_set_stacking_limits()
>>>>> should work for dm's dm_calculate_queue_limits(), since it calls blk_stack_limits(), which is:
>>>>>
>>>>> 	t->max_allocate_sectors = min(t->max_allocate_sectors,
>>>>> 				      b->max_allocate_sectors);
>>>>>
>>>>> Could you please tell is this fix is also enough for md?
>>>>
>>>> It looks like it's enough since queue defaults are set in md_alloc()->blk_set_stacking_limits().
>>>> In case of we set "max_allocate_sectors = 0", in further it can be changed only manually,
>>>> but nobody does this.
>>>
>>> Yes, it will work to disable this capability for MD and DM.
>>>
>>> But if/when a stacked device _dooes_ want to support this then it'll be
>>> awkward to override this stacking default to allow blk_stack_limits()
>>> to properly stack up this limit.  blk_limits are extremely fiddley so
>>> this isn't necessarily new.  But by explicitly defaulting to 0 and then
>>> having blk_stack_limits use min() for this limit: it results in stacking
>>> drivers needing to clumsily unwind the default.  E.g. DM will need to
>>> tweak its blk_stack_limits() related code to allow override that
>>> actually _does_  stack up the underlying devices' capability (and not
>>> just impose its own limit that ignores the underlying devices).
>>>
>>> So I'm not convinced this is the right way to go (be it the v4 approach
>>> you took or the cleaner use of blk_set_stacking_limits I suggested).
>>
>> Is there a strong vision about the way we should go? Or you leave this choose
>> up to me?
> 
> I don't have time to work through it at the moment (e.g. implementing
> dm-thinp support to know what the block core code should be) so I'll
> just defer to you on a disabling it for now.
> 
>>> And to be clear, I'm interested in having DM thinp support this
>>> capability to preallocate blocks.
>>
>> My opinion is it would be better to not mix several subsystem related
>> support in a single patch set. Both of the approaches (v4 or that you
>> suggested) do not prevents us to implement allocation support in next
>> patch series. After we have the base functionality enabled, we may add
>> support in other subsystems and drivers one by one with more focus
>> on the subsystem specificities and with the best possible attention.
> 
> Yeah, I'm aware nothing is ever set in stone.
> 
> Setting to 0 in blk_set_stacking_limits() is OK for now.

I get your point. Thanks for the suggestion and comments, Mike.

Kirill

