From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from tartarus.angband.pl ([89.206.35.136]:52778 "EHLO tartarus.angband.pl" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751266AbdKDEql (ORCPT ); Sat, 4 Nov 2017 00:46:41 -0400
Date: Sat, 4 Nov 2017 05:46:34 +0100
From: Adam Borowski
To: Chris Murphy
Cc: "Austin S. Hemmelgarn" , Marat Khalili , Dave , Linux fs Btrfs , Fred Van Andel
Subject: Re: Problem with file system
Message-ID: <20171104044634.thg7mnchm4hvzdic@angband.pl>
References: <9871a669-141b-ac64-9da6-9050bcad7640@cn.fujitsu.com> <10fb0b92-bc93-a217-0608-5284ac1a05cd@rqc.ru>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
In-Reply-To:
Sender: linux-btrfs-owner@vger.kernel.org
List-ID:

On Fri, Nov 03, 2017 at 04:03:44PM -0600, Chris Murphy wrote:
> On Tue, Oct 31, 2017 at 5:28 AM, Austin S. Hemmelgarn
> wrote:
>
> > If you're running on an SSD (or thinly provisioned storage, or something
> > else which supports discards) and have the 'discard' mount option enabled,
> > then there is no backup metadata tree (this issue was mentioned on the list
> > a while ago, but nobody ever replied),
>
>
> This is a really good point. I've been running discard mount option
> for some time now without problems, in a laptop with Samsung
> Electronics Co Ltd NVMe SSD Controller SM951/PM951.
>
> However, just trying btrfs-debug-tree -b on a specific block address
> for any of the backup root trees listed in the super, only the current
> one returns a valid result. All others fail with checksum errors. And
> even the good one fails with checksum errors within seconds as a new
> tree is created, the super updated, and Btrfs considers the old root
> tree disposable and subject to discard.
>
> So absolutely if I were to have a problem, probably no rollback for
> me. This seems to totally obviate a fundamental part of Btrfs design.

How is this an issue? Discard is issued only once we're positive there's no
reference to the freed blocks anywhere.
At that point, they're also open for reuse, thus they can be arbitrarily
scribbled upon.

Unless your hardware is seriously broken (such as lying about barriers,
which is nearly-guaranteed data loss on btrfs anyway), there's no way the
filesystem will ever reference such blocks. The corpses of old trees that
are left lying around with no discard can at most be used for manual
forensics, but whether a given block will have been overwritten or not is
a matter of pure luck.

For rollbacks, there are snapshots. Once a transaction has been fully
committed, the old version is considered gone.

> > because it's already been discarded.
> > This is ideally something which should be addressed (we need some sort of
> > discard queue for handling in-line discards), but it's not easy to address.
>
> Discard data extents, don't discard metadata extents? Or put them on a
> substantial delay.

Why would you special-case metadata? Metadata that points to overwritten or
discarded blocks is of no use either.


Meow!
-- 
⢀⣴⠾⠻⢶⣦⠀ Laws we want back: Poland, Dz.U. 1921 nr.30 poz.177 (also Dz.U.
⣾⠁⢰⠒⠀⣿⡁ 1920 nr.11 poz.61): Art.2: An official, guilty of accepting a gift
⢿⡄⠘⠷⠚⠋⠀ or another material benefit, or a promise thereof, [in matters
⠈⠳⣄⠀⠀⠀⠀ relevant to duties], shall be punished by death by shooting.