Date: Mon, 2 Jul 2018 21:22:41 -0700
From: Marc MERLIN
To: Chris Murphy
Cc: Qu Wenruo, Su Yue, Btrfs BTRFS
Subject: Re: So, does btrfs check lowmem take days? weeks?
Message-ID: <20180703042241.GI5567@merlins.org>

On Mon, Jul 02, 2018 at 06:31:43PM -0600, Chris Murphy wrote:
> So the idea behind journaled file systems is that journal replay
> enables mount-time "repair" that's faster than an fsck. Already, btrfs
> use cases with big, but not huge, file systems make btrfs check a
> problem: it either runs out of memory or takes too long. So it already
> isn't scaling as well as ext4 or XFS in this regard.
> 
> So what does the future hold? It seems like the goal is that the
> problems must be avoided in the first place rather than repaired after
> the fact.
> 
> Are the problems Marc is running into understood well enough that
> there can eventually be a fix, maybe even an on-disk format change,
> that prevents such problems from happening in the first place?
> 
> Or does it make sense for him to be running with btrfs debug or some
> subset of the btrfs integrity checking mask to try to catch the
> problems in the act of happening?

Those are all good questions.

To be fair, I cannot claim that btrfs was at fault for whatever
filesystem damage I ended up with. It's very possible that it happened
because of a flaky SATA card that kicked drives off the bus when it
shouldn't have.

Sure, in theory a journaling filesystem can recover from unexpected
power loss and drives dropping off at bad times, but I'm going to guess
that btrfs' complexity also means it has data structures (the extent
tree?) that need to be updated completely, "or else".

I'm obviously ok with a filesystem check being necessary to recover in
cases like this; after all, I still occasionally have to run e2fsck on
ext4 too. But I'm a lot less thrilled with the btrfs situation, where
the repair tools can either completely crash your kernel, or take days
and then either get stuck in an infinite loop or hit an algorithm that
can't scale if you have too many hardlinks/snapshots.

It sounds like there may not be a fix to this problem within the
filesystem's design, outside of "do not get there, or else".

It would even be useful for the btrfs tools to start computing
heuristics and output warnings like "you have more than 100 snapshots
on this filesystem, this is not recommended, please read http://url/"
(a rough sketch of what I mean is below my signature).

Qu, Su, does that sound both reasonable and doable?

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
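
P.S. To make the heuristic idea concrete, here is a rough sketch, not
something btrfs-progs does today. It assumes the `btrfs` tool from
btrfs-progs is in PATH and counts snapshot subvolumes via
`btrfs subvolume list -s`; the 100 threshold and the http://url/ link
are only placeholders, not official recommendations:

    #!/usr/bin/env python3
    # Sketch of a "too many snapshots" warning for btrfs tooling.
    # Assumes btrfs-progs is installed; threshold is purely illustrative.
    import subprocess
    import sys

    SNAPSHOT_WARN_THRESHOLD = 100  # placeholder, not an official limit

    def count_snapshots(mountpoint: str) -> int:
        """Count snapshot subvolumes using `btrfs subvolume list -s`."""
        out = subprocess.run(
            ["btrfs", "subvolume", "list", "-s", mountpoint],
            check=True, capture_output=True, text=True,
        ).stdout
        # One line of output per snapshot subvolume.
        return sum(1 for line in out.splitlines() if line.strip())

    def main() -> None:
        mountpoint = sys.argv[1] if len(sys.argv) > 1 else "/"
        n = count_snapshots(mountpoint)
        if n > SNAPSHOT_WARN_THRESHOLD:
            print(f"warning: {n} snapshots on {mountpoint}; "
                  "this is not recommended, please read http://url/")

    if __name__ == "__main__":
        main()

Something like this could run from btrfs check (or even a cron job) and
at least tell people they are heading toward the "repair can't scale"
territory before they get there.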