From: Coly Li <colyli@suse.de>
To: linux-block@vger.kernel.org, damien.lemoal@wdc.com,
	hare@suse.com, hch@lst.de, axboe@kernel.dk
Cc: linux-bcache@vger.kernel.org, kbusch@kernel.org,
	Coly Li <colyli@suse.de>, Ajay Joshi <ajay.joshi@wdc.com>,
	Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>,
	Hannes Reinecke <hare@suse.de>,
	Johannes Thumshirn <johannes.thumshirn@wdc.com>
Subject: [RFC PATCH v2 4/4] block: set bi_size to REQ_OP_ZONE_RESET bio
Date: Sat, 16 May 2020 11:54:34 +0800
Message-ID: <20200516035434.82809-5-colyli@suse.de>
In-Reply-To: <20200516035434.82809-1-colyli@suse.de>

For zoned devices, zone management ioctl commands are now converted into
zone management bios and handled by blkdev_zone_mgmt(). Four zone
management bio op codes are handled:
- REQ_OP_ZONE_RESET
  Reset the zone's write pointer and discard all previously stored data.
- REQ_OP_ZONE_OPEN
  Open the zones in the specified sector range, no influence on data.
- REQ_OP_ZONE_CLOSE
  Close the zones in the specified sector range, no influence on data.
- REQ_OP_ZONE_FINISH
  Mark the zone as full, no influence on data.
All zone management bios are zero length, i.e. their bi_size is 0.

Zero length bios worked fine for every zone management operation as long
as a zoned device (e.g. a host managed SMR hard drive) could not be used
to create a bcache device. Once it can, they still work for every
operation except REQ_OP_ZONE_RESET.

When a bcache device (a virtual block device that forwards bios, like
the md raid drivers) is created on top of a zoned device and a fast SSD
is attached as the cache device, the bcache driver may cache frequent
random READ requests on the SSD to accelerate READs of hot data.

When the bcache driver receives a zone management bio with the
REQ_OP_ZONE_RESET op, it must not only forward the request to the
underlying zoned device (e.g. a host managed SMR hard drive) but also
invalidate all data cached on the SSD for the zone being reset.
Otherwise bcache will continue to serve the outdated cached data to READ
requests, causing data inconsistency and potential corruption.

In order to invalidate the outdated data on the SSD for the reset zone,
bcache needs to know not only the start LBA but also the length of the
zone being reset. Without the length, bcache cannot accurately
invalidate the outdated cached data.
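
As an illustration only, the range calculation described above can be
sketched in userspace C. The names model_bio, sector_range and
model_invalidate_range() are hypothetical and are not bcache code; the
two fields mirror bi_iter.bi_sector (in 512-byte sectors) and
bi_iter.bi_size (in bytes).

```c
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SHIFT 9	/* 512-byte sectors */

/* Simplified stand-in for the two bvec_iter fields discussed above. */
struct model_bio {
	uint64_t bi_sector;	/* start of the range, in sectors */
	uint32_t bi_size;	/* length of the range, in bytes */
};

/* Half-open sector range [start, end) that a cache would invalidate. */
struct sector_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Returns false for a zero length bio: the start sector alone does not
 * say how much cached data is stale.
 */
static bool model_invalidate_range(const struct model_bio *bio,
				   struct sector_range *out)
{
	if (bio->bi_size == 0)
		return false;

	out->start = bio->bi_sector;
	out->end = bio->bi_sector + (bio->bi_size >> SECTOR_SHIFT);
	return true;
}
```

With a zero bi_size the helper has nothing to work with; with a byte
length equal to one zone it yields exactly that zone's sector range.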

Is it possible to simply set bi_size inside the bcache driver? The
answer is NO. Although every REQ_OP_ZONE_RESET bio covers exactly one
zone, i.e. q->limits.chunk_sectors, some other stacking block driver
may (in the future) sit between the bcache driver and
blkdev_zone_mgmt(), where the zone management bio is made.

The best place to set bi_size is blkdev_zone_mgmt(), where the zone
management bio is composed. Then, no matter how the bio is split before
the bcache driver receives it, the bcache driver can always correctly
invalidate the range being reset.

This patch sets the bi_size of each REQ_OP_ZONE_RESET bio to cover the
zone being reset. REQ_OP_ZONE_RESET_ALL is a special case: its bi_size
should be set to the capacity of the whole drive, so that bcache can
invalidate all data cached on the SSD for the zoned backing device.
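
The two cases above can be modelled in userspace C as a rough sketch.
model_reset_bio and model_zone_reset() are hypothetical names, the
has_reset_all flag stands in for the device's reset-all capability
check, and lengths are kept in sectors to keep the model short (the
kernel's bi_size field counts bytes).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one reset bio; not a kernel structure. */
struct model_reset_bio {
	bool     reset_all;	/* models REQ_OP_ZONE_RESET_ALL */
	uint64_t sector;	/* start sector */
	uint64_t nr_sectors;	/* length carried by the bio, in sectors */
};

/*
 * Fill out[] with the bios that a reset of [sector, sector + nr_sectors)
 * would produce and return how many were written (at most max).
 * Assumes a zone-aligned range.
 */
static size_t model_zone_reset(uint64_t capacity, uint64_t zone_sectors,
			       bool has_reset_all,
			       uint64_t sector, uint64_t nr_sectors,
			       struct model_reset_bio *out, size_t max)
{
	uint64_t end = sector + nr_sectors;
	size_t n = 0;

	if (max == 0)
		return 0;

	/* Whole-device reset: one bio covering the full capacity. */
	if (has_reset_all && sector == 0 && nr_sectors == capacity) {
		out[0] = (struct model_reset_bio){ true, 0, capacity };
		return 1;
	}

	/* Otherwise one bio per zone, each exactly one zone long. */
	while (sector < end && n < max) {
		out[n++] = (struct model_reset_bio){ false, sector,
						     zone_sectors };
		sector += zone_sectors;
	}
	return n;
}
```

For a device of four 8-sector zones, resetting sectors 8..23 yields two
per-zone entries, while resetting the full range on a device with the
reset-all capability yields a single entry spanning all 32 sectors.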

With this change, the bcache code can handle a REQ_OP_ZONE_RESET bio in
much the same way as a REQ_OP_DISCARD bio, with very little change.

Signed-off-by: Coly Li <colyli@suse.de>
Cc: Ajay Joshi <ajay.joshi@wdc.com>
Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: Keith Busch <kbusch@kernel.org>
---
Changelog:
v2: fix typo for REQ_OP_ZONE_RESET_ALL.
v1: initial version.

 block/blk-zoned.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index 1e0708c68267..01d91314399b 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -227,11 +227,15 @@ int blkdev_zone_mgmt(struct block_device *bdev, enum req_opf op,
 		if (op == REQ_OP_ZONE_RESET &&
 		    blkdev_allow_reset_all_zones(bdev, sector, nr_sectors)) {
 			bio->bi_opf = REQ_OP_ZONE_RESET_ALL;
+			bio->bi_iter.bi_sector = sector;
+			bio->bi_iter.bi_size = nr_sectors;
 			break;
 		}
 
 		bio->bi_opf = op | REQ_SYNC;
 		bio->bi_iter.bi_sector = sector;
+		if (op == REQ_OP_ZONE_RESET)
+			bio->bi_iter.bi_size = zone_sectors;
 		sector += zone_sectors;
 
 		/* This may take a while, so be nice to others */
-- 
2.25.0

