linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/4] brd: implement discard
@ 2022-09-20 17:52 Mikulas Patocka
  2022-09-20 17:53 ` [PATCH v2 1/4] brd: make brd_insert_page return bool Mikulas Patocka
                   ` (4 more replies)
  0 siblings, 5 replies; 20+ messages in thread
From: Mikulas Patocka @ 2022-09-20 17:52 UTC (permalink / raw)
  To: Jens Axboe, Zdenek Kabelac, Christoph Hellwig; +Cc: linux-block, dm-devel

Hi

Here I'm sending second version of the brd discard patches.

Mikulas


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2 1/4] brd: make brd_insert_page return bool
  2022-09-20 17:52 [PATCH v2 0/4] brd: implement discard Mikulas Patocka
@ 2022-09-20 17:53 ` Mikulas Patocka
  2022-09-21  5:00   ` Chaitanya Kulkarni
  2022-09-20 17:56 ` [PATCH v2 2/4] brd: extend the rcu regions to cover read and write Mikulas Patocka
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 20+ messages in thread
From: Mikulas Patocka @ 2022-09-20 17:53 UTC (permalink / raw)
  To: Jens Axboe, Zdenek Kabelac, Christoph Hellwig; +Cc: linux-block, dm-devel

brd_insert_page returns a pointer to struct page, however the pointer is
never used (it is only compared against NULL), so to clean-up the code, we
make it return bool.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

---
 drivers/block/brd.c |   16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

Index: linux-2.6/drivers/block/brd.c
===================================================================
--- linux-2.6.orig/drivers/block/brd.c
+++ linux-2.6/drivers/block/brd.c
@@ -78,11 +78,11 @@ static struct page *brd_lookup_page(stru
 }
 
 /*
- * Look up and return a brd's page for a given sector.
- * If one does not exist, allocate an empty page, and insert that. Then
- * return it.
+ * If the page exists, return true.
+ * If it does not exist, allocate an empty page, and insert that. Return true
+ * if the allocation was successful.
  */
-static struct page *brd_insert_page(struct brd_device *brd, sector_t sector)
+static bool brd_insert_page(struct brd_device *brd, sector_t sector)
 {
 	pgoff_t idx;
 	struct page *page;
@@ -90,7 +90,7 @@ static struct page *brd_insert_page(stru
 
 	page = brd_lookup_page(brd, sector);
 	if (page)
-		return page;
+		return true;
 
 	/*
 	 * Must use NOIO because we don't want to recurse back into the
@@ -99,11 +99,11 @@ static struct page *brd_insert_page(stru
 	gfp_flags = GFP_NOIO | __GFP_ZERO | __GFP_HIGHMEM;
 	page = alloc_page(gfp_flags);
 	if (!page)
-		return NULL;
+		return false;
 
 	if (radix_tree_preload(GFP_NOIO)) {
 		__free_page(page);
-		return NULL;
+		return false;
 	}
 
 	spin_lock(&brd->brd_lock);
@@ -121,7 +121,7 @@ static struct page *brd_insert_page(stru
 
 	radix_tree_preload_end();
 
-	return page;
+	return true;
 }
 
 /*


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2 2/4] brd: extend the rcu regions to cover read and write
  2022-09-20 17:52 [PATCH v2 0/4] brd: implement discard Mikulas Patocka
  2022-09-20 17:53 ` [PATCH v2 1/4] brd: make brd_insert_page return bool Mikulas Patocka
@ 2022-09-20 17:56 ` Mikulas Patocka
  2022-09-23 15:52   ` Christoph Hellwig
  2022-09-20 17:58 ` [PATCH v2 3/4] brd: enable discard Mikulas Patocka
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 20+ messages in thread
From: Mikulas Patocka @ 2022-09-20 17:56 UTC (permalink / raw)
  To: Jens Axboe, Zdenek Kabelac, Christoph Hellwig; +Cc: linux-block, dm-devel

This patch extends the rcu regions, so that lookup followed by a read or
write of a page is done inside rcu read lock. This si be needed for the
following patch that enables discard.

Note that we also replace "BUG_ON(!page);" with "if (page) ..." in
copy_to_brd - the page may be NULL if write races with discard. In this
situation, the result is undefined, so we can actually skip the write
operation at all.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/block/brd.c |   59 +++++++++++++++++++++++-----------------------------
 1 file changed, 27 insertions(+), 32 deletions(-)

Index: linux-2.6/drivers/block/brd.c
===================================================================
--- linux-2.6.orig/drivers/block/brd.c
+++ linux-2.6/drivers/block/brd.c
@@ -50,31 +50,12 @@ struct brd_device {
 
 /*
  * Look up and return a brd's page for a given sector.
+ * This must be called with the rcu lock held.
  */
 static struct page *brd_lookup_page(struct brd_device *brd, sector_t sector)
 {
-	pgoff_t idx;
-	struct page *page;
-
-	/*
-	 * The page lifetime is protected by the fact that we have opened the
-	 * device node -- brd pages will never be deleted under us, so we
-	 * don't need any further locking or refcounting.
-	 *
-	 * This is strictly true for the radix-tree nodes as well (ie. we
-	 * don't actually need the rcu_read_lock()), however that is not a
-	 * documented feature of the radix-tree API so it is better to be
-	 * safe here (we don't have total exclusion from radix tree updates
-	 * here, only deletes).
-	 */
-	rcu_read_lock();
-	idx = sector >> PAGE_SECTORS_SHIFT; /* sector to page index */
-	page = radix_tree_lookup(&brd->brd_pages, idx);
-	rcu_read_unlock();
-
-	BUG_ON(page && page->index != idx);
-
-	return page;
+	pgoff_t idx = sector >> PAGE_SECTORS_SHIFT; /* sector to page index */
+	return radix_tree_lookup(&brd->brd_pages, idx);
 }
 
 /*
@@ -88,7 +69,9 @@ static bool brd_insert_page(struct brd_d
 	struct page *page;
 	gfp_t gfp_flags;
 
+	rcu_read_lock();
 	page = brd_lookup_page(brd, sector);
+	rcu_read_unlock();
 	if (page)
 		return true;
 
@@ -198,23 +181,29 @@ static void copy_to_brd(struct brd_devic
 	size_t copy;
 
 	copy = min_t(size_t, n, PAGE_SIZE - offset);
-	page = brd_lookup_page(brd, sector);
-	BUG_ON(!page);
 
-	dst = kmap_atomic(page);
-	memcpy(dst + offset, src, copy);
-	kunmap_atomic(dst);
+	rcu_read_lock();
+	page = brd_lookup_page(brd, sector);
+	if (page) {
+		dst = kmap_atomic(page);
+		memcpy(dst + offset, src, copy);
+		kunmap_atomic(dst);
+	}
+	rcu_read_unlock();
 
 	if (copy < n) {
 		src += copy;
 		sector += copy >> SECTOR_SHIFT;
 		copy = n - copy;
-		page = brd_lookup_page(brd, sector);
-		BUG_ON(!page);
 
-		dst = kmap_atomic(page);
-		memcpy(dst, src, copy);
-		kunmap_atomic(dst);
+		rcu_read_lock();
+		page = brd_lookup_page(brd, sector);
+		if (page) {
+			dst = kmap_atomic(page);
+			memcpy(dst, src, copy);
+			kunmap_atomic(dst);
+		}
+		rcu_read_unlock();
 	}
 }
 
@@ -230,6 +219,8 @@ static void copy_from_brd(void *dst, str
 	size_t copy;
 
 	copy = min_t(size_t, n, PAGE_SIZE - offset);
+
+	rcu_read_lock();
 	page = brd_lookup_page(brd, sector);
 	if (page) {
 		src = kmap_atomic(page);
@@ -237,11 +228,14 @@ static void copy_from_brd(void *dst, str
 		kunmap_atomic(src);
 	} else
 		memset(dst, 0, copy);
+	rcu_read_unlock();
 
 	if (copy < n) {
 		dst += copy;
 		sector += copy >> SECTOR_SHIFT;
 		copy = n - copy;
+
+		rcu_read_lock();
 		page = brd_lookup_page(brd, sector);
 		if (page) {
 			src = kmap_atomic(page);
@@ -249,6 +243,7 @@ static void copy_from_brd(void *dst, str
 			kunmap_atomic(src);
 		} else
 			memset(dst, 0, copy);
+		rcu_read_unlock();
 	}
 }
 


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2 3/4] brd: enable discard
  2022-09-20 17:52 [PATCH v2 0/4] brd: implement discard Mikulas Patocka
  2022-09-20 17:53 ` [PATCH v2 1/4] brd: make brd_insert_page return bool Mikulas Patocka
  2022-09-20 17:56 ` [PATCH v2 2/4] brd: extend the rcu regions to cover read and write Mikulas Patocka
@ 2022-09-20 17:58 ` Mikulas Patocka
  2023-07-10 12:32   ` Li Nan
  2022-09-20 17:59 ` [PATCH v2 4/4] brd: implement secure erase and write zeroes Mikulas Patocka
  2022-09-21  5:20 ` [dm-devel] [PATCH v2 0/4] brd: implement discard Gao Xiang
  4 siblings, 1 reply; 20+ messages in thread
From: Mikulas Patocka @ 2022-09-20 17:58 UTC (permalink / raw)
  To: Jens Axboe, Zdenek Kabelac, Christoph Hellwig; +Cc: linux-block, dm-devel

This patch implements discard in the brd driver. We use RCU to free the
page, so that if there are any concurrent readers or writes, they won't
touch the page after it is freed.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/block/brd.c |   64 +++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 63 insertions(+), 1 deletion(-)

Index: linux-2.6/drivers/block/brd.c
===================================================================
--- linux-2.6.orig/drivers/block/brd.c
+++ linux-2.6/drivers/block/brd.c
@@ -48,6 +48,8 @@ struct brd_device {
 	u64			brd_nr_pages;
 };
 
+static bool discard;
+
 /*
  * Look up and return a brd's page for a given sector.
  * This must be called with the rcu lock held.
@@ -107,6 +109,25 @@ static bool brd_insert_page(struct brd_d
 	return true;
 }
 
+static void brd_free_page_rcu(struct rcu_head *head)
+{
+	struct page *page = container_of(head, struct page, rcu_head);
+	__free_page(page);
+}
+
+static void brd_free_page(struct brd_device *brd, sector_t sector)
+{
+	struct page *page;
+	pgoff_t idx;
+
+	spin_lock(&brd->brd_lock);
+	idx = sector >> PAGE_SECTORS_SHIFT;
+	page = radix_tree_delete(&brd->brd_pages, idx);
+	spin_unlock(&brd->brd_lock);
+	if (page)
+		call_rcu(&page->rcu_head, brd_free_page_rcu);
+}
+
 /*
  * Free all backing store pages and radix tree. This must only be called when
  * there are no other users of the device.
@@ -277,13 +298,44 @@ out:
 	return err;
 }
 
+void brd_do_discard(struct brd_device *brd, struct bio *bio)
+{
+	sector_t sector, len, front_pad;
+
+	if (unlikely(!discard)) {
+		bio->bi_status = BLK_STS_NOTSUPP;
+		return;
+	}
+
+	sector = bio->bi_iter.bi_sector;
+	len = bio_sectors(bio);
+	front_pad = -sector & (PAGE_SECTORS - 1);
+	sector += front_pad;
+	if (unlikely(len <= front_pad))
+		return;
+	len -= front_pad;
+	len = round_down(len, PAGE_SECTORS);
+	while (len) {
+		brd_free_page(brd, sector);
+		sector += PAGE_SECTORS;
+		len -= PAGE_SECTORS;
+		cond_resched();
+	}
+}
+
 static void brd_submit_bio(struct bio *bio)
 {
 	struct brd_device *brd = bio->bi_bdev->bd_disk->private_data;
-	sector_t sector = bio->bi_iter.bi_sector;
+	sector_t sector;
 	struct bio_vec bvec;
 	struct bvec_iter iter;
 
+	if (bio_op(bio) == REQ_OP_DISCARD) {
+		brd_do_discard(brd, bio);
+		goto endio;
+	}
+
+	sector = bio->bi_iter.bi_sector;
 	bio_for_each_segment(bvec, bio, iter) {
 		unsigned int len = bvec.bv_len;
 		int err;
@@ -301,6 +353,7 @@ static void brd_submit_bio(struct bio *b
 		sector += len >> SECTOR_SHIFT;
 	}
 
+endio:
 	bio_endio(bio);
 }
 
@@ -338,6 +391,10 @@ static int max_part = 1;
 module_param(max_part, int, 0444);
 MODULE_PARM_DESC(max_part, "Num Minors to reserve between devices");
 
+static bool discard = false;
+module_param(discard, bool, 0444);
+MODULE_PARM_DESC(discard, "Support discard");
+
 MODULE_LICENSE("GPL");
 MODULE_ALIAS_BLOCKDEV_MAJOR(RAMDISK_MAJOR);
 MODULE_ALIAS("rd");
@@ -404,6 +461,11 @@ static int brd_alloc(int i)
 	 */
 	blk_queue_physical_block_size(disk->queue, PAGE_SIZE);
 
+	if (discard) {
+		disk->queue->limits.discard_granularity = PAGE_SIZE;
+		blk_queue_max_discard_sectors(disk->queue, UINT_MAX);
+	}
+
 	/* Tell the block layer that this is not a rotational device */
 	blk_queue_flag_set(QUEUE_FLAG_NONROT, disk->queue);
 	blk_queue_flag_clear(QUEUE_FLAG_ADD_RANDOM, disk->queue);


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2 4/4] brd: implement secure erase and write zeroes
  2022-09-20 17:52 [PATCH v2 0/4] brd: implement discard Mikulas Patocka
                   ` (2 preceding siblings ...)
  2022-09-20 17:58 ` [PATCH v2 3/4] brd: enable discard Mikulas Patocka
@ 2022-09-20 17:59 ` Mikulas Patocka
  2022-09-21  5:03   ` Chaitanya Kulkarni
                     ` (2 more replies)
  2022-09-21  5:20 ` [dm-devel] [PATCH v2 0/4] brd: implement discard Gao Xiang
  4 siblings, 3 replies; 20+ messages in thread
From: Mikulas Patocka @ 2022-09-20 17:59 UTC (permalink / raw)
  To: Jens Axboe, Zdenek Kabelac, Christoph Hellwig; +Cc: linux-block, dm-devel

This patch implements REQ_OP_SECURE_ERASE and REQ_OP_WRITE_ZEROES on brd.
Write zeroes will free the pages just like discard, but the difference is
that it writes zeroes to the preceding and following page if the range is
not aligned on page boundary. Secure erase is just like write zeroes,
except that it clears the page content before freeing the page.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/block/brd.c |   30 ++++++++++++++++++++++++------
 1 file changed, 24 insertions(+), 6 deletions(-)

Index: linux-2.6/drivers/block/brd.c
===================================================================
--- linux-2.6.orig/drivers/block/brd.c
+++ linux-2.6/drivers/block/brd.c
@@ -115,7 +115,7 @@ static void brd_free_page_rcu(struct rcu
 	__free_page(page);
 }
 
-static void brd_free_page(struct brd_device *brd, sector_t sector)
+static void brd_free_page(struct brd_device *brd, sector_t sector, bool secure)
 {
 	struct page *page;
 	pgoff_t idx;
@@ -124,8 +124,11 @@ static void brd_free_page(struct brd_dev
 	idx = sector >> PAGE_SECTORS_SHIFT;
 	page = radix_tree_delete(&brd->brd_pages, idx);
 	spin_unlock(&brd->brd_lock);
-	if (page)
+	if (page) {
+		if (secure)
+			clear_highpage(page);
 		call_rcu(&page->rcu_head, brd_free_page_rcu);
+	}
 }
 
 /*
@@ -300,23 +303,34 @@ out:
 
 void brd_do_discard(struct brd_device *brd, struct bio *bio)
 {
-	sector_t sector, len, front_pad;
+	bool zero_padding;
+	sector_t sector, len, front_pad, end_pad;
 
 	if (unlikely(!discard)) {
 		bio->bi_status = BLK_STS_NOTSUPP;
 		return;
 	}
 
+	zero_padding = bio_op(bio) == REQ_OP_SECURE_ERASE || bio_op(bio) == REQ_OP_WRITE_ZEROES;
 	sector = bio->bi_iter.bi_sector;
 	len = bio_sectors(bio);
 	front_pad = -sector & (PAGE_SECTORS - 1);
+
+	if (zero_padding && unlikely(front_pad != 0))
+		copy_to_brd(brd, page_address(ZERO_PAGE(0)), sector, min(len, front_pad) << SECTOR_SHIFT);
+
 	sector += front_pad;
 	if (unlikely(len <= front_pad))
 		return;
 	len -= front_pad;
-	len = round_down(len, PAGE_SECTORS);
+
+	end_pad = len & (PAGE_SECTORS - 1);
+	if (zero_padding && unlikely(end_pad != 0))
+		copy_to_brd(brd, page_address(ZERO_PAGE(0)), sector + len - end_pad, end_pad << SECTOR_SHIFT);
+	len -= end_pad;
+
 	while (len) {
-		brd_free_page(brd, sector);
+		brd_free_page(brd, sector, bio_op(bio) == REQ_OP_SECURE_ERASE);
 		sector += PAGE_SECTORS;
 		len -= PAGE_SECTORS;
 		cond_resched();
@@ -330,7 +344,9 @@ static void brd_submit_bio(struct bio *b
 	struct bio_vec bvec;
 	struct bvec_iter iter;
 
-	if (bio_op(bio) == REQ_OP_DISCARD) {
+	if (bio_op(bio) == REQ_OP_DISCARD ||
+	    bio_op(bio) == REQ_OP_SECURE_ERASE ||
+	    bio_op(bio) == REQ_OP_WRITE_ZEROES) {
 		brd_do_discard(brd, bio);
 		goto endio;
 	}
@@ -464,6 +480,8 @@ static int brd_alloc(int i)
 	if (discard) {
 		disk->queue->limits.discard_granularity = PAGE_SIZE;
 		blk_queue_max_discard_sectors(disk->queue, UINT_MAX);
+		blk_queue_max_write_zeroes_sectors(disk->queue, UINT_MAX);
+		blk_queue_max_secure_erase_sectors(disk->queue, UINT_MAX);
 	}
 
 	/* Tell the block layer that this is not a rotational device */


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/4] brd: make brd_insert_page return bool
  2022-09-20 17:53 ` [PATCH v2 1/4] brd: make brd_insert_page return bool Mikulas Patocka
@ 2022-09-21  5:00   ` Chaitanya Kulkarni
  0 siblings, 0 replies; 20+ messages in thread
From: Chaitanya Kulkarni @ 2022-09-21  5:00 UTC (permalink / raw)
  To: Mikulas Patocka, Jens Axboe, Zdenek Kabelac, Christoph Hellwig
  Cc: linux-block, dm-devel

On 9/20/22 10:53, Mikulas Patocka wrote:
> brd_insert_page returns a pointer to struct page, however the pointer is
> never used (it is only compared against NULL), so to clean-up the code, we
> make it return bool.
> 
> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> 
> ---

Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 4/4] brd: implement secure erase and write zeroes
  2022-09-20 17:59 ` [PATCH v2 4/4] brd: implement secure erase and write zeroes Mikulas Patocka
@ 2022-09-21  5:03   ` Chaitanya Kulkarni
  2022-09-21  9:09   ` Pankaj Raghav
  2022-09-23 15:54   ` Christoph Hellwig
  2 siblings, 0 replies; 20+ messages in thread
From: Chaitanya Kulkarni @ 2022-09-21  5:03 UTC (permalink / raw)
  To: Mikulas Patocka, Jens Axboe, Zdenek Kabelac, Christoph Hellwig
  Cc: linux-block, dm-devel


>   /*
> @@ -300,23 +303,34 @@ out:
>   
>   void brd_do_discard(struct brd_device *brd, struct bio *bio)
>   {
> -	sector_t sector, len, front_pad;
> +	bool zero_padding;
> +	sector_t sector, len, front_pad, end_pad;
>   
>   	if (unlikely(!discard)) {
>   		bio->bi_status = BLK_STS_NOTSUPP;
>   		return;
>   	}
>   
> +	zero_padding = bio_op(bio) == REQ_OP_SECURE_ERASE || bio_op(bio) == REQ_OP_WRITE_ZEROES;
>   	sector = bio->bi_iter.bi_sector;
>   	len = bio_sectors(bio);
>   	front_pad = -sector & (PAGE_SECTORS - 1);
> +
> +	if (zero_padding && unlikely(front_pad != 0))
> +		copy_to_brd(brd, page_address(ZERO_PAGE(0)), sector, min(len, front_pad) << SECTOR_SHIFT);
> +
>   	sector += front_pad;
>   	if (unlikely(len <= front_pad))
>   		return;
>   	len -= front_pad;
> -	len = round_down(len, PAGE_SECTORS);
> +
> +	end_pad = len & (PAGE_SECTORS - 1);
> +	if (zero_padding && unlikely(end_pad != 0))
> +		copy_to_brd(brd, page_address(ZERO_PAGE(0)), sector + len - end_pad, end_pad << SECTOR_SHIFT);
> +	len -= end_pad;
> +
>
Is it possible to avoid these long lines ?

-ck



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [dm-devel] [PATCH v2 0/4] brd: implement discard
  2022-09-20 17:52 [PATCH v2 0/4] brd: implement discard Mikulas Patocka
                   ` (3 preceding siblings ...)
  2022-09-20 17:59 ` [PATCH v2 4/4] brd: implement secure erase and write zeroes Mikulas Patocka
@ 2022-09-21  5:20 ` Gao Xiang
  2022-09-27 14:09   ` Mikulas Patocka
  4 siblings, 1 reply; 20+ messages in thread
From: Gao Xiang @ 2022-09-21  5:20 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Jens Axboe, Zdenek Kabelac, Christoph Hellwig, linux-block, dm-devel

On Tue, Sep 20, 2022 at 01:52:38PM -0400, Mikulas Patocka wrote:
> Hi
> 
> Here I'm sending second version of the brd discard patches.

That is quite interesting.

btw, something out of topic, I once had some preliminary attempt
to add DAX support to brd since brd works as ramdisk and brd-dax
could have the following benefits:

 - DAX can be tested without PMEM devices;
 - ramdisk fses can be accessed without double page cache;
 - initrd use cases then can work well without extra page cache
   and maybe it can perform better than initramfs (without unpack
   process).

I wrote some hack stuff but don't have more time working on it...
https://git.kernel.org/pub/scm/linux/kernel/git/xiang/linux.git/log/?h=erofs/initrd-fsdax

I'm not sure if others are interested in topic though.

It would be helpful to get rid of all brd page->index use cases
first.

Thanks,
Gao Xiang

> 
> Mikulas
> --
> dm-devel mailing list
> dm-devel@redhat.com
> https://listman.redhat.com/mailman/listinfo/dm-devel
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 4/4] brd: implement secure erase and write zeroes
  2022-09-20 17:59 ` [PATCH v2 4/4] brd: implement secure erase and write zeroes Mikulas Patocka
  2022-09-21  5:03   ` Chaitanya Kulkarni
@ 2022-09-21  9:09   ` Pankaj Raghav
  2022-09-23 15:54   ` Christoph Hellwig
  2 siblings, 0 replies; 20+ messages in thread
From: Pankaj Raghav @ 2022-09-21  9:09 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Jens Axboe, Zdenek Kabelac, Christoph Hellwig, linux-block,
	dm-devel, Pankaj Raghav

> @@ -330,7 +344,9 @@ static void brd_submit_bio(struct bio *b
>  	struct bio_vec bvec;
>  	struct bvec_iter iter;
>  
> -	if (bio_op(bio) == REQ_OP_DISCARD) {
> +	if (bio_op(bio) == REQ_OP_DISCARD ||
> +	    bio_op(bio) == REQ_OP_SECURE_ERASE ||
> +	    bio_op(bio) == REQ_OP_WRITE_ZEROES) {
>  		brd_do_discard(brd, bio);
>  		goto endio;
>  	}
> @@ -464,6 +480,8 @@ static int brd_alloc(int i)
>  	if (discard) {
>  		disk->queue->limits.discard_granularity = PAGE_SIZE;
>  		blk_queue_max_discard_sectors(disk->queue, UINT_MAX);
> +		blk_queue_max_write_zeroes_sectors(disk->queue, UINT_MAX);
> +		blk_queue_max_secure_erase_sectors(disk->queue, UINT_MAX);
>  	}
>  
The previous patch has the following description for the discard module
param:
MODULE_PARM_DESC(discard, "Support discard");

But you are reusing it here to enable write zeroes and sec erase.
MODULE_PARM_DESC's "desc" parameter also needs to be updated in this patch.

I understand that all these operations kind of do the same thing at the
end, so it is upto you to decide if you want to add individual module param
for each operation or club them together as you have done here. If you
do the latter, then changing the module param variable `discard` to
something more generic would give more clarity as well.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 2/4] brd: extend the rcu regions to cover read and write
  2022-09-20 17:56 ` [PATCH v2 2/4] brd: extend the rcu regions to cover read and write Mikulas Patocka
@ 2022-09-23 15:52   ` Christoph Hellwig
  0 siblings, 0 replies; 20+ messages in thread
From: Christoph Hellwig @ 2022-09-23 15:52 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Jens Axboe, Zdenek Kabelac, Christoph Hellwig, linux-block, dm-devel

On Tue, Sep 20, 2022 at 01:56:25PM -0400, Mikulas Patocka wrote:
> This patch extends the rcu regions, so that lookup followed by a read or
> write of a page is done inside rcu read lock. This si be needed for the
> following patch that enables discard.
> 
> Note that we also replace "BUG_ON(!page);" with "if (page) ..." in
> copy_to_brd - the page may be NULL if write races with discard. In this
> situation, the result is undefined, so we can actually skip the write
> operation at all.
> 
> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> 
> ---
>  drivers/block/brd.c |   59 +++++++++++++++++++++++-----------------------------
>  1 file changed, 27 insertions(+), 32 deletions(-)
> 
> Index: linux-2.6/drivers/block/brd.c
> ===================================================================
> --- linux-2.6.orig/drivers/block/brd.c
> +++ linux-2.6/drivers/block/brd.c
> @@ -50,31 +50,12 @@ struct brd_device {
>  
>  /*
>   * Look up and return a brd's page for a given sector.
> + * This must be called with the rcu lock held.
>   */
>  static struct page *brd_lookup_page(struct brd_device *brd, sector_t sector)
>  {
> +	pgoff_t idx = sector >> PAGE_SECTORS_SHIFT; /* sector to page index */
> +	return radix_tree_lookup(&brd->brd_pages, idx);
>  }

This is still missing the rcu_read_lock_held() assertation if you
want to keep it as separate function.

> +	rcu_read_lock();
> +	page = brd_lookup_page(brd, sector);
> +	if (page) {
> +		dst = kmap_atomic(page);
> +		memcpy(dst + offset, src, copy);
> +		kunmap_atomic(dst);
> +	}
> +	rcu_read_unlock();

How is the null check going to work here?  Simply not copying
data is no exactly the expected result.

This is why I think we need the higher level rework I suggested
last time where we have a helper that always gives you page
(or maybe an error) by moving the insert so that it also does
the actual final lookup.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 4/4] brd: implement secure erase and write zeroes
  2022-09-20 17:59 ` [PATCH v2 4/4] brd: implement secure erase and write zeroes Mikulas Patocka
  2022-09-21  5:03   ` Chaitanya Kulkarni
  2022-09-21  9:09   ` Pankaj Raghav
@ 2022-09-23 15:54   ` Christoph Hellwig
  2 siblings, 0 replies; 20+ messages in thread
From: Christoph Hellwig @ 2022-09-23 15:54 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Jens Axboe, Zdenek Kabelac, Christoph Hellwig, linux-block, dm-devel

On Tue, Sep 20, 2022 at 01:59:46PM -0400, Mikulas Patocka wrote:
> This patch implements REQ_OP_SECURE_ERASE and REQ_OP_WRITE_ZEROES on brd.
> Write zeroes will free the pages just like discard, but the difference is
> that it writes zeroes to the preceding and following page if the range is
> not aligned on page boundary. Secure erase is just like write zeroes,
> except that it clears the page content before freeing the page.

So while I can see the use case for the simple discard you mentioned,
and maybe even the WRITE_ZEROES, I'd rather have a really good
justification for REQ_OP_SECURE_ERASE.  It is a bad interface,
and there are alsmost no good reasons for ever using it.  So sprinkling
it in random drivers just we can is not helpful.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [dm-devel] [PATCH v2 0/4] brd: implement discard
  2022-09-21  5:20 ` [dm-devel] [PATCH v2 0/4] brd: implement discard Gao Xiang
@ 2022-09-27 14:09   ` Mikulas Patocka
  2022-09-28  1:03     ` Gao Xiang
  0 siblings, 1 reply; 20+ messages in thread
From: Mikulas Patocka @ 2022-09-27 14:09 UTC (permalink / raw)
  To: Gao Xiang
  Cc: Jens Axboe, Zdenek Kabelac, Christoph Hellwig, linux-block, dm-devel



On Wed, 21 Sep 2022, Gao Xiang wrote:

> On Tue, Sep 20, 2022 at 01:52:38PM -0400, Mikulas Patocka wrote:
> > Hi
> > 
> > Here I'm sending second version of the brd discard patches.
> 
> That is quite interesting.
> 
> btw, something out of topic, I once had some preliminary attempt
> to add DAX support to brd since brd works as ramdisk and brd-dax
> could have the following benefits:
> 
>  - DAX can be tested without PMEM devices;
>  - ramdisk fses can be accessed without double page cache;
>  - initrd use cases then can work well without extra page cache
>    and maybe it can perform better than initramfs (without unpack
>    process).
> 
> I wrote some hack stuff but don't have more time working on it...
> https://git.kernel.org/pub/scm/linux/kernel/git/xiang/linux.git/log/?h=erofs/initrd-fsdax
> 
> I'm not sure if others are interested in topic though.
> 
> It would be helpful to get rid of all brd page->index use cases
> first.
> 
> Thanks,
> Gao Xiang

Hi

Ramdisk DAX was there in the past, but it was removed in the kernel 4.15.

Mikulas


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [dm-devel] [PATCH v2 0/4] brd: implement discard
  2022-09-27 14:09   ` Mikulas Patocka
@ 2022-09-28  1:03     ` Gao Xiang
  2022-09-29 20:05       ` Mikulas Patocka
  0 siblings, 1 reply; 20+ messages in thread
From: Gao Xiang @ 2022-09-28  1:03 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Jens Axboe, Zdenek Kabelac, Christoph Hellwig, linux-block, dm-devel

On Tue, Sep 27, 2022 at 10:09:04AM -0400, Mikulas Patocka wrote:
> 
> 
> On Wed, 21 Sep 2022, Gao Xiang wrote:
> 
> > On Tue, Sep 20, 2022 at 01:52:38PM -0400, Mikulas Patocka wrote:
> > > Hi
> > > 
> > > Here I'm sending second version of the brd discard patches.
> > 
> > That is quite interesting.
> > 
> > btw, something out of topic, I once had some preliminary attempt
> > to add DAX support to brd since brd works as ramdisk and brd-dax
> > could have the following benefits:
> > 
> >  - DAX can be tested without PMEM devices;
> >  - ramdisk fses can be accessed without double page cache;
> >  - initrd use cases then can work well without extra page cache
> >    and maybe it can perform better than initramfs (without unpack
> >    process).
> > 
> > I wrote some hack stuff but don't have more time working on it...
> > https://git.kernel.org/pub/scm/linux/kernel/git/xiang/linux.git/log/?h=erofs/initrd-fsdax
> > 
> > I'm not sure if others are interested in topic though.
> > 
> > It would be helpful to get rid of all brd page->index use cases
> > first.
> > 
> > Thanks,
> > Gao Xiang
> 
> Hi
> 
> Ramdisk DAX was there in the past, but it was removed in the kernel 4.15.

Hi Mikulas!

Thanks for pointing out! I didn't realize that, although I think if we really
use brd driver in production, enabling DAX support for brd is much better to
remove double caching so that ramdisk can become a real ramdisk for most
regular files.

I have no idea how other people think about ramdisk DAX, or brd is just a
stuff for testing only now.  If it behaves like this, sorry about the
noise.

Thanks,
Gao Xiang

> 
> Mikulas

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [dm-devel] [PATCH v2 0/4] brd: implement discard
  2022-09-28  1:03     ` Gao Xiang
@ 2022-09-29 20:05       ` Mikulas Patocka
  2022-09-29 20:48         ` Gao Xiang
  0 siblings, 1 reply; 20+ messages in thread
From: Mikulas Patocka @ 2022-09-29 20:05 UTC (permalink / raw)
  To: Gao Xiang
  Cc: Jens Axboe, Zdenek Kabelac, Christoph Hellwig, linux-block, dm-devel



On Wed, 28 Sep 2022, Gao Xiang wrote:

> > Hi
> > 
> > Ramdisk DAX was there in the past, but it was removed in the kernel 4.15.
> 
> Hi Mikulas!
> 
> Thanks for pointing out! I didn't realize that, although I think if we really
> use brd driver in production, enabling DAX support for brd is much better to
> remove double caching so that ramdisk can become a real ramdisk for most
> regular files.
> 
> I have no idea how other people think about ramdisk DAX, or brd is just a
> stuff for testing only now.  If it behaves like this, sorry about the
> noise.
> 
> Thanks,
> Gao Xiang

Hi

See the message for the commit 7a862fbbdec665190c5ef298c0c6ec9f3915cf45 
for the reason why it was removed.

Mikulas


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [dm-devel] [PATCH v2 0/4] brd: implement discard
  2022-09-29 20:05       ` Mikulas Patocka
@ 2022-09-29 20:48         ` Gao Xiang
  0 siblings, 0 replies; 20+ messages in thread
From: Gao Xiang @ 2022-09-29 20:48 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Jens Axboe, Zdenek Kabelac, Christoph Hellwig, linux-block, dm-devel

On Thu, Sep 29, 2022 at 04:05:43PM -0400, Mikulas Patocka wrote:
> 
> 
> On Wed, 28 Sep 2022, Gao Xiang wrote:
> 
> > > Hi
> > > 
> > > Ramdisk DAX was there in the past, but it was removed in the kernel 4.15.
> > 
> > Hi Mikulas!
> > 
> > Thanks for pointing out! I didn't realize that, although I think if we really
> > use brd driver in production, enabling DAX support for brd is much better to
> > remove double caching so that ramdisk can become a real ramdisk for most
> > regular files.
> > 
> > I have no idea how other people think about ramdisk DAX, or brd is just a
> > stuff for testing only now.  If it behaves like this, sorry about the
> > noise.
> > 
> > Thanks,
> > Gao Xiang
> 
> Hi
> 
> See the message for the commit 7a862fbbdec665190c5ef298c0c6ec9f3915cf45 
> for the reason why it was removed.

I've already seen that commit after you told me, yet I think the reasons
listed inside are not fundamental reasons why ramdisk cannot support DAX
in principle (although I know there are issues as listed to handle.)

IMHO, reserving ZONE_DEVICE memory to emulate pmem for ramdisk DAX-like
use is inflexible on my side since currently such reserved memory cannot
be used for other uses later even the ramdisk actually use little space
in practice regardless of its capacity.

Thanks,
Gao Xiang

> 
> Mikulas

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 3/4] brd: enable discard
  2022-09-20 17:58 ` [PATCH v2 3/4] brd: enable discard Mikulas Patocka
@ 2023-07-10 12:32   ` Li Nan
  2023-07-10 15:24     ` Mikulas Patocka
  0 siblings, 1 reply; 20+ messages in thread
From: Li Nan @ 2023-07-10 12:32 UTC (permalink / raw)
  To: Mikulas Patocka, Jens Axboe, Zdenek Kabelac, Christoph Hellwig
  Cc: linux-block, dm-devel

Hi, Mikulas

The lack of discard in ramdisk can cause some issues related to dm. see:
https://lore.kernel.org/all/20220228141354.1091687-1-luomeng12@huawei.com/

I noticed that your patch series has already supported discard for brd. 
But this patch series has not been applied to mainline at present, may I 
ask if you still plan to continue working on it?

-- 
Thanks,
Nan


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 3/4] brd: enable discard
  2023-07-10 12:32   ` Li Nan
@ 2023-07-10 15:24     ` Mikulas Patocka
  2023-07-10 19:05       ` Jens Axboe
  0 siblings, 1 reply; 20+ messages in thread
From: Mikulas Patocka @ 2023-07-10 15:24 UTC (permalink / raw)
  To: Li Nan, Jens Axboe
  Cc: Zdenek Kabelac, Christoph Hellwig, linux-block, dm-devel



On Mon, 10 Jul 2023, Li Nan wrote:

> Hi, Mikulas
> 
> The lack of discard in ramdisk can cause some issues related to dm. see:
> https://lore.kernel.org/all/20220228141354.1091687-1-luomeng12@huawei.com/
> 
> I noticed that your patch series has already supported discard for brd. But
> this patch series has not been applied to mainline at present, may I ask if
> you still plan to continue working on it?
> 
> -- 
> Thanks,
> Nan

Hi

I got no response from ramdisk maintainer Jens Axboe. We should ask him, 
whether he doesn't want discard on ramdisk at all or whether he wants it.

Mikulas


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 3/4] brd: enable discard
  2023-07-10 15:24     ` Mikulas Patocka
@ 2023-07-10 19:05       ` Jens Axboe
  2023-07-13 11:45         ` Christoph Hellwig
  2023-07-19 20:14         ` Mikulas Patocka
  0 siblings, 2 replies; 20+ messages in thread
From: Jens Axboe @ 2023-07-10 19:05 UTC (permalink / raw)
  To: Mikulas Patocka, Li Nan
  Cc: Zdenek Kabelac, Christoph Hellwig, linux-block, dm-devel

On 7/10/23 9:24?AM, Mikulas Patocka wrote:
> 
> 
> On Mon, 10 Jul 2023, Li Nan wrote:
> 
>> Hi, Mikulas
>>
>> The lack of discard in ramdisk can cause some issues related to dm. see:
>> https://lore.kernel.org/all/20220228141354.1091687-1-luomeng12@huawei.com/
>>
>> I noticed that your patch series has already supported discard for brd. But
>> this patch series has not been applied to mainline at present, may I ask if
>> you still plan to continue working on it?
>>
>> -- 
>> Thanks,
>> Nan
> 
> Hi
> 
> I got no response from ramdisk maintainer Jens Axboe. We should ask him, 
> whether he doesn't want discard on ramdisk at all or whether he wants it.

When a series is posted and reviewers comment on required changes, I
always wait for a respin of that series with those addressed. That
didn't happen, so this didn't get applied.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 3/4] brd: enable discard
  2023-07-10 19:05       ` Jens Axboe
@ 2023-07-13 11:45         ` Christoph Hellwig
  2023-07-19 20:14         ` Mikulas Patocka
  1 sibling, 0 replies; 20+ messages in thread
From: Christoph Hellwig @ 2023-07-13 11:45 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Mikulas Patocka, Li Nan, Zdenek Kabelac, Christoph Hellwig,
	linux-block, dm-devel

On Mon, Jul 10, 2023 at 01:05:00PM -0600, Jens Axboe wrote:
> When a series is posted and reviewers comment on required changes, I
> always wait for a respin of that series with those addressed. That
> didn't happen, so this didn't get applied.

... independent of that any software that assumes all block devices
can discard data is simply broken. 


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 3/4] brd: enable discard
  2023-07-10 19:05       ` Jens Axboe
  2023-07-13 11:45         ` Christoph Hellwig
@ 2023-07-19 20:14         ` Mikulas Patocka
  1 sibling, 0 replies; 20+ messages in thread
From: Mikulas Patocka @ 2023-07-19 20:14 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Li Nan, Zdenek Kabelac, Christoph Hellwig, linux-block, dm-devel



On Mon, 10 Jul 2023, Jens Axboe wrote:

> When a series is posted and reviewers comment on required changes, I
> always wait for a respin of that series with those addressed. That
> didn't happen, so this didn't get applied.
> 
> -- 
> Jens Axboe

Hi

I updated the brd discard patches so that they apply to the kernel 
6.5-rc1. I'm submitting them for review.

Mikulas


^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2023-07-19 20:15 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-20 17:52 [PATCH v2 0/4] brd: implement discard Mikulas Patocka
2022-09-20 17:53 ` [PATCH v2 1/4] brd: make brd_insert_page return bool Mikulas Patocka
2022-09-21  5:00   ` Chaitanya Kulkarni
2022-09-20 17:56 ` [PATCH v2 2/4] brd: extend the rcu regions to cover read and write Mikulas Patocka
2022-09-23 15:52   ` Christoph Hellwig
2022-09-20 17:58 ` [PATCH v2 3/4] brd: enable discard Mikulas Patocka
2023-07-10 12:32   ` Li Nan
2023-07-10 15:24     ` Mikulas Patocka
2023-07-10 19:05       ` Jens Axboe
2023-07-13 11:45         ` Christoph Hellwig
2023-07-19 20:14         ` Mikulas Patocka
2022-09-20 17:59 ` [PATCH v2 4/4] brd: implement secure erase and write zeroes Mikulas Patocka
2022-09-21  5:03   ` Chaitanya Kulkarni
2022-09-21  9:09   ` Pankaj Raghav
2022-09-23 15:54   ` Christoph Hellwig
2022-09-21  5:20 ` [dm-devel] [PATCH v2 0/4] brd: implement discard Gao Xiang
2022-09-27 14:09   ` Mikulas Patocka
2022-09-28  1:03     ` Gao Xiang
2022-09-29 20:05       ` Mikulas Patocka
2022-09-29 20:48         ` Gao Xiang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).