linux-fsdevel.vger.kernel.org archive mirror
From: Damien Le Moal <Damien.LeMoal@wdc.com>
To: Josef Bacik <josef@toxicpanda.com>,
	Naohiro Aota <Naohiro.Aota@wdc.com>,
	"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>,
	David Sterba <dsterba@suse.com>
Cc: Chris Mason <clm@fb.com>, Nikolay Borisov <nborisov@suse.com>,
	Johannes Thumshirn <jthumshirn@suse.de>,
	Hannes Reinecke <hare@suse.com>,
	Anand Jain <anand.jain@oracle.com>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH v6 08/28] btrfs: implement log-structured superblock for HMZONED mode
Date: Fri, 13 Dec 2019 21:58:53 +0000
Message-ID: <BYAPR04MB5816552C67964D6415A3FF70E7540@BYAPR04MB5816.namprd04.prod.outlook.com>
In-Reply-To: e5bdec6e-a38e-7789-922f-5998b4401d02@toxicpanda.com

Josef,

On 2019/12/14 1:39, Josef Bacik wrote:
> On 12/12/19 11:08 PM, Naohiro Aota wrote:
>> The superblock (and its copies) is the only data structure in btrfs which
>> has a fixed location on a device. Since we cannot overwrite in a sequential
>> write required zone, we cannot place the superblock in such a zone. One easy
>> solution is to limit the superblock and its copies to conventional zones.
>> However, this method has two downsides. The first is a reduced number of
>> superblock copies: the second copy of the superblock is located at 256GB,
>> which is in a sequential write required zone on typical devices on the
>> market today, so the number of superblocks, copies included, is limited to
>> two. The second downside is that we cannot support devices which have no
>> conventional zones at all.
>>
>> To solve these two problems, we employ superblock log writing. It uses two
>> zones as a circular buffer to write updated superblocks. Once the first
>> zone is filled up, we start writing into the second zone and reset the
>> first one. We can determine the position of the latest superblock by
>> reading write pointer information from the device.
>>
>> The following zones are reserved as the circular buffer on HMZONED btrfs.
>>
>> - The primary superblock: zones 0 and 1
>> - The first copy: zones 16 and 17
>> - The second copy: zone 1024 or the zone at 256GB, whichever is smaller,
>>    and the zone next to it
>>
> 
> So the series of events for writing is
> 
> -> get wp
> -> write super block
> -> advance wp
>    -> if wp == end of the zone, reset the wp

In your example, the reset is for the other zone, leaving the zone that
was just filled as is. The sequence would in fact be more like this for
zones 0 & 1:

-> get the wp of zone 0; if the zone is full, reset it
-> write the super block in zone 0
-> advance the wp of zone 0; if the zone is now full, switch to zone 1 for
   the next update

This would come after the sequence:
-> get the wp of zone 1
-> write the super block in zone 1
-> advance the wp of zone 1; if the zone is now full, switch to zone 0 for
   the next update
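
To make the ordering of the reset and the write concrete, the two sequences above can be modelled with a small Python simulation. This is only a sketch and not the patch's code: the `Zone` and `SuperblockLog` classes, the `capacity` parameter (number of superblock slots per zone) and the use of plain integers as superblock generations are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Zone:
    """Hypothetical model of a sequential write required zone."""
    capacity: int                      # superblock slots per zone (assumed)
    blocks: list = field(default_factory=list)

    @property
    def wp(self) -> int:
        # The write pointer is the index of the next writable slot.
        return len(self.blocks)

    @property
    def full(self) -> bool:
        return self.wp == self.capacity

    def reset(self) -> None:
        # A zone reset rewinds the write pointer to the start of the zone.
        self.blocks.clear()

    def append(self, sb: int) -> None:
        # Writes are only possible at the write pointer.
        assert not self.full, "cannot write to a full zone"
        self.blocks.append(sb)


class SuperblockLog:
    """Two zones used as a circular buffer, following the steps above."""

    def __init__(self, capacity: int) -> None:
        self.zones = [Zone(capacity), Zone(capacity)]
        self.active = 0                # zone that takes the next update

    def write(self, sb: int) -> None:
        zone = self.zones[self.active]
        if zone.full:                  # get wp; if the zone is full, reset it
            zone.reset()
        zone.append(sb)                # write the super block, wp advances
        if zone.full:                  # zone now full: switch for next update
            self.active ^= 1


log = SuperblockLog(capacity=2)
for generation in range(1, 6):
    log.write(generation)
print(log.zones[0].blocks, log.zones[1].blocks)
```

In this model the reset of zone 0 happens only when the fifth update is about to be written, at which point zone 1 is full and still holds the previous update.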

> 
> now assume we crash here.  We'll go to mount the fs and the zone will look like 
> it's empty because we reset the wp, and we'll be unable to mount the fs.  Am I 
> missing something here?  Thanks,

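The last successful update of the super block is always present on disk
as the block right before the wp position of zone 0 or zone 1: a zone is
only reset once the other zone has been filled, so the reset never touches
the zone holding the latest copy.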
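
As an illustration of the mount-time lookup, here is a minimal Python sketch that picks the latest super block from the two write pointers. It is not the patch's implementation: the `latest_superblock` helper is hypothetical, each zone is modelled as a list of generation numbers whose length is the write pointer, and choosing the newer of the two candidates by generation number is an assumption of this sketch rather than something stated in this thread.

```python
from typing import List, Optional, Tuple


def latest_superblock(zones: List[List[int]]) -> Optional[Tuple[int, int, int]]:
    """Return (zone index, block index, generation) of the newest super block.

    zones[i] lists the generations written to zone i in order, so
    len(zones[i]) plays the role of that zone's write pointer.
    Returns None when both zones are empty.
    """
    candidates = []
    for zone_idx, blocks in enumerate(zones):
        wp = len(blocks)
        if wp > 0:
            # The candidate in each zone is the block right before the wp.
            candidates.append((blocks[wp - 1], zone_idx, wp - 1))
    if not candidates:
        return None
    # Assumption: a higher generation number means a newer super block.
    generation, zone_idx, block_idx = max(candidates)
    return (zone_idx, block_idx, generation)


# Crash right after zone 0 was reset and before the new write landed:
# zone 0 looks empty, but zone 1 still ends with the last good update.
print(latest_superblock([[], [3, 4]]))
```

The crash case from the question above corresponds to the first zone being empty while the second is full, and the lookup still finds the last block of the second zone.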

> 
> Josef
> 


-- 
Damien Le Moal
Western Digital Research

