From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-io0-f196.google.com ([209.85.223.196]:33460 "EHLO
        mail-io0-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751832AbdHARkD (ORCPT ); Tue, 1 Aug 2017 13:40:03 -0400
Received: by mail-io0-f196.google.com with SMTP id q64so2050449ioi.0 for ;
        Tue, 01 Aug 2017 10:40:03 -0700 (PDT)
Subject: Re: [PATCH 00/14 RFC] Btrfs: Add journal for raid5/6 writes
To: Roman Mamedov , Liu Bo
Cc: linux-btrfs@vger.kernel.org
References: <20170801161439.13426-1-bo.li.liu@oracle.com> <20170801222547.35d1bd03@natsu>
From: "Austin S. Hemmelgarn"
Message-ID: <50312ea2-a0bf-09f7-8bc0-804c3a087ae4@gmail.com>
Date: Tue, 1 Aug 2017 13:39:59 -0400
MIME-Version: 1.0
In-Reply-To: <20170801222547.35d1bd03@natsu>
Content-Type: text/plain; charset=utf-8; format=flowed
Sender: linux-btrfs-owner@vger.kernel.org
List-ID:

On 2017-08-01 13:25, Roman Mamedov wrote:
> On Tue, 1 Aug 2017 10:14:23 -0600
> Liu Bo wrote:
>
>> This aims to fix write hole issue on btrfs raid5/6 setup by adding a
>> separate disk as a journal (aka raid5/6 log), so that after unclean
>> shutdown we can make sure data and parity are consistent on the raid
>> array by replaying the journal.
>
> Could it be possible to designate areas on the in-array devices to be used as
> journal?
>
> While md doesn't have much spare room in its metadata for extraneous things
> like this, Btrfs could use almost as much as it wants to, adding to size of the
> FS metadata areas. Reliability-wise, the log could be stored as RAID1 chunks.
>
> It doesn't seem convenient to need having an additional storage device around
> just for the log, and also needing to maintain its fault tolerance yourself (so
> the log device would better be on a mirror, such as mdadm RAID1? more expense
> and maintenance complexity).
>
I agree, MD pretty much needs a separate device simply because they can't allocate arbitrary space on the other array members.
BTRFS can do that though, and I would actually think that that would be _easier_ to implement than having a separate device. That said, I do think that it would need to be a separate chunk type, because things could get really complicated if the metadata is itself using a parity raid profile.
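
For anyone following along who hasn't run into the write hole before, here is a rough toy model of what the quoted cover letter describes: log the new data and parity first, then update the stripe, and replay the log after an unclean shutdown. This is only a sketch under heavy assumptions. Everything lives in memory, parity is plain XOR as in RAID5, and every name in it (ToyStripe, journaled_write, replay, crash_after_data) is made up for illustration. It says nothing about the on-disk format or code in the actual patch set, nor about where the log lives (separate device or in-array chunks).

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together (RAID5-style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


class ToyStripe:
    """One RAID5-like stripe held in memory: data blocks plus one parity block."""

    def __init__(self, data_blocks):
        self.data = list(data_blocks)
        self.parity = xor_blocks(self.data)

    def consistent(self):
        """True when the stored parity matches the stored data."""
        return self.parity == xor_blocks(self.data)


def journaled_write(stripe, journal, index, new_block, crash_after_data=False):
    """Update one data block, logging the intent first (hypothetical scheme)."""
    # 1. Record the new data and the new parity in the log before
    #    touching the stripe itself.
    new_data = list(stripe.data)
    new_data[index] = new_block
    journal.append((index, new_block, xor_blocks(new_data)))

    # 2. Update the stripe in place. A crash between the data write and
    #    the parity write leaves stale parity: that gap is the write hole.
    stripe.data[index] = new_block
    if crash_after_data:
        return  # simulated unclean shutdown
    stripe.parity = xor_blocks(new_data)

    # 3. Both writes landed, so the log entry is no longer needed.
    journal.pop()


def replay(stripe, journal):
    """After an unclean shutdown, reapply every logged write in order."""
    while journal:
        index, block, parity = journal.pop(0)
        stripe.data[index] = block
        stripe.parity = parity


if __name__ == "__main__":
    stripe = ToyStripe([b"\x01\x02", b"\x10\x20", b"\xaa\xbb"])
    journal = []
    journaled_write(stripe, journal, 1, b"\xff\x00", crash_after_data=True)
    print("consistent after crash: ", stripe.consistent())
    replay(stripe, journal)
    print("consistent after replay:", stripe.consistent())
```

Replay is idempotent in this model because each log entry carries the full new block and the full new parity, so reapplying an entry whose writes had partly landed is harmless.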