Linux-Block Archive on lore.kernel.org
From: Damien Le Moal <Damien.LeMoal@wdc.com>
To: Coly Li <colyli@suse.de>,
	"linux-bcache@vger.kernel.org" <linux-bcache@vger.kernel.org>
Cc: "linux-block@vger.kernel.org" <linux-block@vger.kernel.org>
Subject: Re: [RFC PATCH v4 0/3] bcache: support zoned device as bcache backing device
Date: Mon, 25 May 2020 05:25:31 +0000
Message-ID: <CY4PR04MB3751BEA457F5B95A4D6EFB01E7B30@CY4PR04MB3751.namprd04.prod.outlook.com>
In-Reply-To: <20200522121837.109651-1-colyli@suse.de>

On 2020/05/22 21:19, Coly Li wrote:
> Hi folks,
> 
> With this series, bcache can now support a zoned device (e.g. a host-managed
> SMR hard drive) as the backing device. Currently writeback mode is not
> supported yet; it is on the to-do list (it requires an on-disk super block
> format change).
> 
> The first patch makes bcache export the zone information to upper-layer
> code, for example to allow formatting zonefs on top of the bcache device.
> By default, zone 0 of the zoned device is fully reserved for the bcache
> super block, therefore the reported number of zones is one less than the
> actual number of zones of the physical SMR hard drive.
> 
> The second patch handles zone management commands for bcache. These
> zone management commands are wrapped as zone management bios.
> For REQ_OP_ZONE_RESET and REQ_OP_ZONE_RESET_ALL zone management bios,
> before forwarding the bio to the backing device, all cached data covered
> by the zone(s) being reset must be invalidated to keep data consistent.
> For the remaining zone management bios, just subtract data_offset from
> bi_sector and forward the bio to the zoned backing device.
> 
> The third patch makes sure that after the bcache device starts, the cache
> mode cannot be changed to writeback via the sysfs interface. Bcache-tools
> is modified to notify users and convert writeback mode to the default
> writethrough mode when making a bcache device.
> 
> One thing is not addressed by this series: rewriting the bcache super
> block after a REQ_OP_ZONE_RESET_ALL command. Devices with only
> sequential zones may appear quite soon, but it is OK to make bcache
> support such all-sequential-zone devices a bit later.
> 
> Now a bcache device created with a zoned SMR drive can pass these test
> cases,
> - read /sys/block/bcache0/queue/zoned, content is 'host-managed'
> - read /sys/block/bcache0/queue/nr_zones, content is number of zones
>   excluding zone 0 of the backing device (reserved for bcache super
>   block).
> - read /sys/block/bcache0/queue/chunk_sectors, content is zone size
>   in sectors.
> - run 'blkzone report /dev/bcache0', all zones information displayed.
> - run 'blkzone reset -o <zone LBA> -c <zones number> /dev/bcache0',
>   conventional zones will reject the command, sequential zones covered
>   by the command range will reset their write pointers to the start LBA
>   of their zones. If <zone LBA> is 0 and <zones number> covers all zones,
>   a REQ_OP_ZONE_RESET_ALL command will be received and handled properly
>   by the bcache device.
> - zonefs can be created on top of the bcache device, with/without cache
>   device attached. All sequential direct write and random read work well
>   and zone reset by 'truncate -s 0 <zone file>' works too.
> - Writeback cache mode is not supported yet.
> 
> Now all previous code review comments are addressed by this RFC version.
> Please don't hesitate to offer your opinion on this version.
> 
> Thanks in advance for your help.

Coly,

One more thing: your patch series lacks support for REQ_OP_ZONE_APPEND. It would
be great to add that. As is, since you do not set the max_zone_append_sectors
queue limit for the bcache device, that command will not be issued by the block
layer. But zonefs (and btrfs) will use zone append (support for zonefs is
already queued for 5.8, btrfs will come later).

If the bcache writethrough policy results in a data write being issued to both
the backend device and the cache device, then some special code will be needed:
these two BIOs will need to be serialized since the actual write location of a
zone append command is known only on completion of the command. That is, the
zone append BIO needs to be issued to the backend device first, then to the
cache SSD device as a regular write once the zone append completes and its write
location is known.
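
To make the ordering concrete, here is a rough user-space sketch of that
serialization. It is only a model under stated assumptions, not kernel or
bcache code: struct zone_model, struct cache_write, model_zone_append() and
model_writethrough_append() are all hypothetical names invented for this
illustration, and the "device" is just a write pointer in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one sequential zone on the backing device. */
struct zone_model {
	uint64_t start;	/* first sector of the zone */
	uint64_t len;	/* zone size in sectors */
	uint64_t wp;	/* write pointer, as an absolute sector */
};

/* Hypothetical record of the regular write sent to the cache device. */
struct cache_write {
	uint64_t sector;	/* location reported by the zone append */
	uint64_t nr_sectors;
	int valid;		/* 1 once the cache write has been issued */
};

/*
 * Models a zone append: the caller only names the zone, the device
 * picks the location (its current write pointer) and reports it on
 * completion. Returns 0 on success, -1 if the data does not fit.
 */
static int model_zone_append(struct zone_model *z, uint64_t nr_sectors,
			     uint64_t *written_sector)
{
	assert(z != NULL && written_sector != NULL);

	if (z->wp + nr_sectors > z->start + z->len)
		return -1;

	*written_sector = z->wp;
	z->wp += nr_sectors;
	return 0;
}

/*
 * Models the serialized writethrough path: the append goes to the
 * backing device first, and the cache write is built only after the
 * append has completed and its write location is known.
 */
static int model_writethrough_append(struct zone_model *z,
				     uint64_t nr_sectors,
				     struct cache_write *cw)
{
	uint64_t sector;

	cw->valid = 0;

	/* Step 1: zone append to the backing device. */
	if (model_zone_append(z, nr_sectors, &sector) != 0)
		return -1;

	/* Step 2: regular write to the cache at the reported sector. */
	cw->sector = sector;
	cw->nr_sectors = nr_sectors;
	cw->valid = 1;
	return 0;
}
```

The point of the sketch is only the dependency: the cache write cannot even be
constructed before the append completes, because its sector comes from the
completion. If the append fails, no cache write is issued at all.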


> 
> Coly Li
> ---
> Changelog:
> v4: another improved version without any other generic block change.
> v3: an improved version depends on other generic block layer changes.
> v2: the first RFC version for comments and review.
> v1: the initial version posted just for information.
> 
> 
> Coly Li (3):
>   bcache: export bcache zone information for zoned backing device
>   bcache: handle zone management bios for bcache device
>   bcache: reject writeback cache mode for zoned backing device
> 
>  drivers/md/bcache/bcache.h  |  10 +++
>  drivers/md/bcache/request.c | 168 +++++++++++++++++++++++++++++++++++-
>  drivers/md/bcache/super.c   |  98 ++++++++++++++++++++-
>  drivers/md/bcache/sysfs.c   |   5 ++
>  4 files changed, 279 insertions(+), 2 deletions(-)
> 


-- 
Damien Le Moal
Western Digital Research
