linux-btrfs.vger.kernel.org archive mirror
From: Naohiro Aota <naota@elisp.net>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: David Sterba <dsterba@suse.com>,
	linux-btrfs@vger.kernel.org, Chris Mason <clm@fb.com>,
	Josef Bacik <jbacik@fb.com>,
	linux-kernel@vger.kernel.org, Hannes Reinecke <hare@suse.com>,
	Damien Le Moal <damien.lemoal@wdc.com>,
	Bart Van Assche <bart.vanassche@wdc.com>,
	Matias Bjorling <mb@lightnvm.io>
Subject: Re: [RFC PATCH 00/17] btrfs zoned block device support
Date: Thu, 16 Aug 2018 18:05:31 +0900
Message-ID: <20180816090531.knjb423b3fm5fdk4@zazie>
In-Reply-To: <d5473558-47a1-708e-551b-fabb4ea0842e@gmx.com>

On Fri, Aug 10, 2018 at 03:28:21PM +0800, Qu Wenruo wrote:
> 
> 
> On 8/10/18 2:04 AM, Naohiro Aota wrote:
> > This series adds zoned block device support to btrfs.
> > 
> > A zoned block device consists of a number of zones. Zones are either
> > conventional and accepting random writes or sequential and requiring that
> > writes be issued in LBA order from each zone write pointer position.
> 
> Not familiar with zoned block device, especially for the sequential case.
> 
> Is that sequential case tape like?

It's somewhat similar to a tape drive, but not the same. With a tape
drive, you still *can* write in random access patterns, though it's
much slower. In sequential write required zones, writes must always
be issued sequentially within a zone. Violating the sequential write
rule results in an I/O error.
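
To make the rule concrete, here is a minimal user-space sketch of how a
sequential write required zone treats writes. It is only a model for
illustration: struct zone_model and zone_model_write() are hypothetical
names, not kernel or btrfs code, and returning -EIO just mirrors the
"I/O error" mentioned above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a single zone; not a kernel structure. */
struct zone_model {
	uint64_t start;		/* first LBA of the zone */
	uint64_t len;		/* zone length in blocks */
	uint64_t wp;		/* write pointer: next LBA that may be written */
	bool seq_required;	/* true for a sequential write required zone */
};

/*
 * Model a write of nr_blocks blocks starting at lba.
 * Returns 0 on success, or -EIO if the write falls outside the zone
 * or violates the sequential write rule.
 */
static int zone_model_write(struct zone_model *z, uint64_t lba,
			    uint64_t nr_blocks)
{
	assert(z != NULL);

	/* The write must fit entirely inside the zone. */
	if (lba < z->start || nr_blocks > z->len ||
	    lba - z->start > z->len - nr_blocks)
		return -EIO;

	/* Conventional zones accept random writes. */
	if (!z->seq_required)
		return 0;

	/* Sequential zones only accept writes at the write pointer. */
	if (lba != z->wp)
		return -EIO;

	z->wp += nr_blocks;	/* the write pointer advances with each write */
	return 0;
}
```

In this model, a write at the write pointer succeeds and advances it,
while rewriting an earlier LBA in the same zone fails.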

One user of sequential write required zones is Host-Managed "Shingled
Magnetic Recording" (SMR) HDDs [1]. They increase the volume capacity
by overlapping the tracks. As a result, writing to a track overwrites
adjacent tracks. This physical property forces the sequential write
pattern.

[1] https://en.wikipedia.org/wiki/Shingled_magnetic_recording

> > This
> > patch series ensures that the sequential write constraint of sequential
> > zones is respected while fundamentally not changing BtrFS block and I/O
> > management for block stored in conventional zones.
> > 
> > To achieve this, the default dev extent size of btrfs is changed on zoned
> > block devices so that dev extents are always aligned to a zone. Allocation
> > of blocks within a block group is changed so that the allocation is always
> > sequential from the beginning of the block groups. To do so, an allocation
> > pointer is added to block groups and used as the allocation hint.  The
> > allocation changes also ensures that block freed below the allocation
> > pointer are ignored, resulting in sequential block allocation regardless of
> > the block group usage.
> 
> This looks like it would cause a lot of holes for metadata block groups.
> It would be better to avoid metadata block allocation in such sequential
> zone.
> (And that would need the infrastructure to make extent allocator
> priority-aware)

Yes, it would introduce holes in metadata block groups. I agree it is
desirable to allocate metadata blocks from conventional
(non-sequential) zones.

However, it's sometimes impossible to allocate metadata blocks from
conventional zones, since some zoned block devices like SMR HDDs
generally have fewer conventional zones than sequential zones (to
achieve higher volume capacity).

While this patch series ensures that metadata and data can be
allocated in any type of zone and that everything works in any zone,
we will be able to improve metadata allocation in the future by making
the extent allocator priority/zone-type aware.

> > [...]
> > Naohiro Aota (17):
> >   btrfs: introduce HMZONED feature flag
> >   btrfs: Get zone information of zoned block devices
> >   btrfs: Check and enable HMZONED mode
> >   btrfs: limit super block locations in HMZONED mode
> >   btrfs: disable fallocate in HMZONED mode
> >   btrfs: disable direct IO in HMZONED mode
> >   btrfs: disable device replace in HMZONED mode
> >   btrfs: align extent allocation to zone boundary
> 
> According to the patch name, I though it's about extent allocation, but
> in fact it's about dev extent allocation.
> Renaming the patch would make more sense.
>
> >   btrfs: do sequential allocation on HMZONED drives
> 
> And this is the patch modifying extent allocator.

Thanks. I will fix the names of the patches in the next version.

> Despite that, the support zoned storage looks pretty interesting and
> have something in common with planned priority-aware extent allocator.
> 
> Thanks,
> Qu

Regards,
Naohiro

