Date: Tue, 5 Apr 2022 19:51:29 -0400
From: Zygo Blaxell
To: Josef Bacik
Cc: Marc MERLIN, "linux-btrfs@vger.kernel.org"
Subject: Re: Rebuilding 24TB Raid5 array (was btrfs corruption: parent transid verify failed + open_ctree failed)

On Tue, Apr 05, 2022 at 04:24:16PM -0400, Josef Bacik wrote:
> On Tue, Apr 5, 2022 at 4:08 PM Marc MERLIN wrote:
> >
> > On Tue, Apr 05, 2022 at 04:05:05PM -0400, Josef Bacik wrote:
> > > Well it's still the same, and this thing is 20mib into your fs so IDK
> > > how it would be screwing up now.  Can you do
> > >
> > > ./btrfs inspect-internal dump-tree -b 21069824
> > >
> > > and see what that spits out?  IDK why it would suddenly start
> > > complaining about your chunk root.  Thanks,
> >
> > Thanks for your patience and sticking with me
> > gargamel:/var/local/src/btrfs-progs-josefbacik# ./btrfs inspect-internal dump-tree -b 21069824 /dev/mapper/dshelf1a >/dev/null
>
> Ok well that worked, which means it found the chunk tree fine, I'm

There are two copies of the chunk tree in dup metadata.  Maybe one of
them is bad?  (A sketch for comparing the two copies directly is at the
end of this mail.)

It would surprise me if that's the case.  I didn't think the userspace
read code did anything like the current->pid % 2 dance, and even if it
did, I'd expect it to have shown up before now.

Other possibilities include non-reproducible reads coming from the
bcache layer if that's still active (e.g. if it was on mdadm raid1, and
the raid1 mirrors were out of sync and didn't know it).  Hopefully not,
because it would make this...challenging.

> going to chalk this up to it just fucking with me and ignore it for
> now.  I pushed some changes for the find root thing, can you re-run
>
> ./btrfs-find-root -o 1 /dev/whatever
>
> it should be less noisy and spit out one line at the end, "Found tree
> root at blah blah".  Thanks,
>
> Josef
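
For the dup-copy theory above, here is an untested sketch for comparing
the two metadata copies of that block directly with btrfs-map-logical
from the same progs build.  It assumes the default 16K nodesize and
made-up output paths under /tmp, so adjust as needed:

    # print the physical location of each dup copy of logical 21069824
    ./btrfs-map-logical -l 21069824 /dev/mapper/dshelf1a

    # dump each copy (assuming 16384-byte nodesize) and compare the raw bytes
    ./btrfs-map-logical -l 21069824 -b 16384 -c 1 -o /tmp/chunk-copy1 /dev/mapper/dshelf1a
    ./btrfs-map-logical -l 21069824 -b 16384 -c 2 -o /tmp/chunk-copy2 /dev/mapper/dshelf1a
    cmp /tmp/chunk-copy1 /tmp/chunk-copy2

If cmp reports a difference, one of the dup copies really is bad; if
they match, whatever flakiness there is comes from somewhere below
btrfs.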
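
For the bcache/mdadm theory, the md layer can be asked to verify that
the raid1 legs still agree (hypothetical array name mdX; substitute the
real one):

    # kick off a consistency check of the array
    echo check > /sys/block/mdX/md/sync_action
    # watch /proc/mdstat until the check finishes, then:
    cat /sys/block/mdX/md/mismatch_cnt

A non-zero mismatch_cnt after the check means the mirrors have diverged
somewhere and reads can return different data depending on which leg
they hit (small counts can be benign on volumes with swap or in-flight
writes, but on a mostly idle data volume they're worth worrying about).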