Subject: Re: Problem with file system
To: linux-btrfs@vger.kernel.org
From: "Austin S. Hemmelgarn"
Date: Fri, 3 Nov 2017 07:33:25 -0400

On 2017-11-03 03:42, Kai Krakow wrote:
> On Tue, 31 Oct 2017 07:28:58 -0400, "Austin S. Hemmelgarn" wrote:
>
>> On 2017-10-31 01:57, Marat Khalili wrote:
>>> On 31/10/17 00:37, Chris Murphy wrote:
>>>> But offhand it sounds like hardware was sabotaging the expected
>>>> write ordering. How to test a given hardware setup for that is, I
>>>> think, really overdue. It affects literally every file system and
>>>> Linux storage technology.
>>>>
>>>> It kinda sounds to me like something other than the supers is
>>>> being overwritten too soon, and that's why none of the backup
>>>> roots can find a valid root tree: all four possible root trees
>>>> either haven't actually been written yet or have already been
>>>> overwritten, even though the super is updated. But again, it's
>>>> speculation; we don't actually know why your system was no longer
>>>> mountable.
>>> Just a detached view: I know hardware should respect
>>> ordering/barriers and such, but how hard is it really to avoid
>>> overwriting at least one complete metadata tree for half an hour
>>> (or better yet, another one for a day)? Just metadata, not data
>>> extents.
>> If you're running on an SSD (or thinly provisioned storage, or
>> anything else which supports discards) and have the 'discard' mount
>> option enabled, then there is no backup metadata tree, because it
>> has already been discarded (this issue was mentioned on the list a
>> while ago, but nobody ever replied). Ideally this should be
>> addressed (we need some sort of discard queue for handling inline
>> discards), but it's not easy to do.
>>
>> Otherwise, it becomes a question of space usage on the filesystem,
>> and this is just another reason to keep some extra slack space on
>> the FS (it doesn't help _much_, but it does help). In theory this
>> could be addressed too, but it probably can't be applied across
>> mounts of a filesystem without an on-disk format change.
>
> Well, maybe inline discard is working at the wrong level. It should
> kick in when the reference through any of the backup roots is
> dropped, not when the current instance is dropped.
Indeed.
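As a practical aside, you can check whether inline discard is active
and see which backup roots the super currently records with something
along these lines (the device and mount point are placeholders):

    # Is 'discard' among the active mount options?
    findmnt -no OPTIONS /mnt/data | tr ',' '\n' | grep -x discard

    # Dump the superblock, including the four backup root slots.
    btrfs inspect-internal dump-super -f /dev/sdX | grep backup_tree_root

    # If the current root is unreadable, try the backup roots
    # read-only ('-o recovery' on kernels older than 4.6).
    mount -o ro,usebackuproot /dev/sdX /mnt/data

With inline discard enabled, those backup slots can end up pointing
at blocks that have already been discarded, so usebackuproot has
nothing valid left to try.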
>
> Without knowledge of the internals, I guess discards could be queued
> up in a new tree in btrfs, with an extent only entering that queue
> once the last backup root referencing it has been dropped. But this
> will probably add some bad performance spikes.
Inline discards can already cause bad performance spikes.
>
> I wonder how a regular fstrim run through cron interacts with this
> problem?
You still functionally lose any old (freed) trees; they just get kept
around until the next fstrim run, so your recovery window is however
long it has been since the last trim.
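For reference, the cron approach is just something like this (the
schedule and mount point are only examples):

    # /etc/crontab entry: trim free space on / every Sunday at 03:30,
    # instead of discarding blocks the moment they are freed.
    30 3 * * 0  root  /sbin/fstrim -v /

Recent util-linux also ships an fstrim.timer unit that does the same
job on systemd machines. Either way, anything freed since the last
run is still on disk for usebackuproot to find, which is exactly the
window that inline discard takes away.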