linux-kernel.vger.kernel.org archive mirror
From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: Hannes Reinecke <hare@suse.com>,
	dsterba@suse.cz, Naohiro Aota <naota@elisp.net>,
	David Sterba <dsterba@suse.com>,
	linux-btrfs@vger.kernel.org, Chris Mason <clm@fb.com>,
	Josef Bacik <jbacik@fb.com>,
	linux-kernel@vger.kernel.org,
	Damien Le Moal <damien.lemoal@wdc.com>,
	Bart Van Assche <bart.vanassche@wdc.com>,
	Matias Bjorling <mb@lightnvm.io>
Subject: Re: [RFC PATCH 00/17] btrfs zoned block device support
Date: Wed, 15 Aug 2018 07:25:27 -0400
Message-ID: <3896e121-0f68-6773-fd3e-921d89756349@gmail.com>
In-Reply-To: <9531d57f-2271-7eb8-b734-dac6d33f0ec1@suse.com>

On 2018-08-14 03:41, Hannes Reinecke wrote:
> On 08/13/2018 09:29 PM, Austin S. Hemmelgarn wrote:
>> On 2018-08-13 15:20, Hannes Reinecke wrote:
>>> On 08/13/2018 08:42 PM, David Sterba wrote:
>>>> On Fri, Aug 10, 2018 at 03:04:33AM +0900, Naohiro Aota wrote:
>>>>> This series adds zoned block device support to btrfs.
>>>>
>>>> Yay, thanks!
>>>>
> [ .. ]
>>>> Device replace is disabled, but the changelog suggests there's a way to
>>>> make it work, so it's a matter of implementation. And this should be
>>>> implemented at the time of merge.
>>>>
>>> How would a device replace work in general?
>>> While I do understand that device replace is possible with RAID
>>> thingies, I somewhat fail to see how one could do a device replacement
>>> without RAID functionality.
>>> Is it even possible?
>>> If so, how would it be different from a simple umount?
>> Device replace is implemented in largely the same manner as most other
>> live data migration tools (for example, LVM2's pvmove command).
>>
>> In short, when you issue a replace command for a given device, all
>> writes that would go to that device are instead sent to the new device.
>> While this is happening, old data is copied over from the old device to
>> the new one.  Once all the data is copied, the old device is released
>> (and its BTRFS signature wiped), and the new device has its device ID
>> updated to that of the old device.
>>
>> This is possible largely because of the COW infrastructure, but it's
>> implemented in a way that doesn't entirely depend on it (otherwise it
>> wouldn't work for NOCOW files).
>>
>> Handling this on zoned devices is not likely to be easy, though; you
>> would functionally have to freeze I/O that would hit the device being
>> replaced so that you don't accidentally write to a sequential zone out
>> of order.
> 
> Ah. Oh. Hmm.
> 
> It would be possible in principle if we freeze accesses to any partially
> filled zones on the original device. Then all new writes will be going
> into new/empty zones on the new disks, and we can copy over the old data
> with no issue at all.
> We end up with some partially filled zones on the new disk, but they
> really should be cleaned up eventually either by the allocator filling
> up the partially filled zones or once garbage collection clears out
> stale zones.
> 
> However, I fear the required changes to the btrfs allocator are beyond
> my btrfs knowledge :-(
The easy short-term solution is to just disallow the replace command
(with the intent of getting it working in the future), but ensure that
the older-style add/remove method works.  The add/remove method uses the
balance code internally, so it should honor any restrictions on block
placement for the new device, and therefore should be pretty easy to get
working.
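
To make the ordering constraint discussed above concrete, here is a toy
Python model.  It is only a sketch and not btrfs code: `Zone` and
`replace_with_frozen_zones` are hypothetical names, and the assumption
is that a sequential zone accepts a write only at its current write
pointer.  It shows why a copy interleaved with new writes is safe as
long as the two never target the same zone on the new device.

```python
class Zone:
    """Toy sequential-write zone: a write must land exactly at the write pointer."""

    def __init__(self, size):
        self.size = size
        self.blocks = []  # data written so far; its length is the write pointer

    @property
    def write_pointer(self):
        return len(self.blocks)

    def write(self, offset, data):
        if self.write_pointer >= self.size:
            raise ValueError("zone is full")
        if offset != self.write_pointer:
            raise ValueError(
                f"out-of-order write at {offset}, write pointer is {self.write_pointer}"
            )
        self.blocks.append(data)


def replace_with_frozen_zones(old_zones, new_zones, new_writes):
    """Hypothetical replace: old zones are frozen and copied 1:1, while new
    writes go only to zones that were empty when the replace started."""
    n = len(old_zones)
    copy_targets, fresh = new_zones[:n], new_zones[n:]

    # Each old zone is copied front to back into its own target zone.
    copy_steps = [
        (dst, offset, data)
        for src, dst in zip(old_zones, copy_targets)
        for offset, data in enumerate(src.blocks)
    ]
    pending = list(new_writes)
    fresh_idx = 0

    # Alternate one copy step with one new write to mimic concurrent I/O.
    while copy_steps or pending:
        if copy_steps:
            dst, offset, data = copy_steps.pop(0)
            dst.write(offset, data)
        if pending:
            # Move on to the next empty zone once the current one is full
            # (raises IndexError if the new device runs out of fresh zones).
            while fresh[fresh_idx].write_pointer >= fresh[fresh_idx].size:
                fresh_idx += 1
            zone = fresh[fresh_idx]
            zone.write(zone.write_pointer, pending.pop(0))
    return new_zones


# Old device: one partially filled zone and one full zone.
old = [Zone(4), Zone(4)]
for i, d in enumerate(["a0", "a1"]):
    old[0].write(i, d)
for i, d in enumerate(["b0", "b1", "b2", "b3"]):
    old[1].write(i, d)

new = [Zone(4) for _ in range(4)]
replace_with_frozen_zones(old, new, ["n0", "n1", "n2", "n3", "n4"])
```

In this model the new device ends up with a partially filled copy of
the first old zone, which matches the "partially filled zones on the new
disk" outcome described above.  The add/remove route avoids the problem
differently, since balance relocates data through the normal allocator
rather than copying it to fixed positions.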

