From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.5 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C347EC433E0 for ; Mon, 10 Aug 2020 22:24:49 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A05042075D for ; Mon, 10 Aug 2020 22:24:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726789AbgHJWYs convert rfc822-to-8bit (ORCPT ); Mon, 10 Aug 2020 18:24:48 -0400 Received: from james.kirk.hungrycats.org ([174.142.39.145]:33748 "EHLO james.kirk.hungrycats.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726500AbgHJWYs (ORCPT ); Mon, 10 Aug 2020 18:24:48 -0400 Received: by james.kirk.hungrycats.org (Postfix, from userid 1002) id AA3AD7AA9D3; Mon, 10 Aug 2020 18:24:44 -0400 (EDT) Date: Mon, 10 Aug 2020 18:24:44 -0400 From: Zygo Blaxell To: Nikolay Borisov Cc: =?utf-8?Q?Agust=C3=ADn_Dall=CA=BCAlba?= , linux-btrfs@vger.kernel.org Subject: Re: raid10 corruption while removing failing disk Message-ID: <20200810222444.GP5890@hungrycats.org> References: <3dc4d28e81b3336311c979bda35ceb87b9645606.camel@dallalba.com.ar> <4c967884-2252-a21d-a994-80df64a7e6ef@suse.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8BIT In-Reply-To: <4c967884-2252-a21d-a994-80df64a7e6ef@suse.com> User-Agent: Mutt/1.10.1 (2018-07-13) Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org On Mon, Aug 10, 2020 at 11:21:29AM +0300, Nikolay Borisov wrote: > > > On 10.08.20 г. 10:03 ч., Agustín DallʼAlba wrote: > > Hello! > > > > The last quarterly scrub on our btrfs filesystem found a few bad > > sectors in one of its devices (/dev/sdd), and because there's nobody on > > site to replace the failing disk I decided to remove it from the array > > with `btrfs device remove` before the problem could get worse. > > > > The removal was going relatively well (although slowly and I had to > > reboot a few times due to the bad sectors) until it had about 200 GB > > left to move. Now the filesystem turns read only when I try to finish > > the removal and `btrfs check` complains about wrong metadata checksums. > > However as far as I can tell none of the copies of the corrupt data are > > in the failing disk. > > > > How could this happen? Is it possible to fix this filesystem? > > > > I have refrained from trying anything so far, like upgrading to a newer > > kernel or disconnecting the failing drive, before confirming with you > > that it's safe. > > > > Kind regards. > > > > > > # uname -a > > Linux susanita 4.15.0-111-generic #112-Ubuntu SMP Thu Jul 9 20:32:34 > > UTC 2020 x86_64 x86_64 x86_64 GNU/Linux > > > > > > # btrfs --version > > btrfs-progs v4.15.1 > > > > > > # btrfs fi show > > Label: 'Susanita' uuid: 4d3acf20-d408-49ab-b0a6-182396a9f27c > > Total devices 5 FS bytes used 4.90TiB > > devid 1 size 3.64TiB used 3.42TiB path /dev/sda > > devid 2 size 3.64TiB used 3.42TiB path /dev/sde > > devid 3 size 1.82TiB used 1.59TiB path /dev/sdb > > devid 5 size 0.00B used 185.50GiB path /dev/sdd > > devid 6 size 1.82TiB used 1.22TiB path /dev/sdc > > > > > > # btrfs fi df / > > Data, RAID1: total=4.90TiB, used=4.90TiB > > System, RAID10: total=64.00MiB, used=880.00KiB > > Metadata, RAID10: total=9.00GiB, used=7.57GiB > > GlobalReserve, single: total=512.00MiB, used=0.00B > > > > > > # btrfs check --force --readonly /dev/sda > > WARNING: filesystem mounted, continuing because of --force > > Checking filesystem on /dev/sda > > UUID: 4d3acf20-d408-49ab-b0a6-182396a9f27c > > checksum verify failed on 10919566688256 found BAB1746E wanted A8A48266 > > checksum verify failed on 10919566688256 found BAB1746E wanted A8A48266 > > bytenr mismatch, want=10919566688256, have=17196831625821864417 > > ERROR: failed to repair root items: Input/output error > > > > # btrfs-map-logical -l 10919566688256 /dev/sda > > mirror 1 logical 10919566688256 physical 394473357312 device /dev/sdc > > mirror 2 logical 10919566688256 physical 477218586624 device /dev/sda > > > > > > Relevant dmesg output: > > [ 4.963420] Btrfs loaded, crc32c=crc32c-generic > > [ 5.072878] BTRFS: device label Susanita devid 6 transid 4241535 /dev/sdc > > [ 5.073165] BTRFS: device label Susanita devid 3 transid 4241535 /dev/sdb > > [ 5.073713] BTRFS: device label Susanita devid 2 transid 4241535 /dev/sde > > [ 5.073916] BTRFS: device label Susanita devid 5 transid 4241535 /dev/sdd > > [ 5.074398] BTRFS: device label Susanita devid 1 transid 4241535 /dev/sda > > [ 5.152479] BTRFS info (device sda): disk space caching is enabled > > [ 5.152551] BTRFS info (device sda): has skinny extents > > [ 5.332538] BTRFS info (device sda): bdev /dev/sdd errs: wr 0, rd 24, flush 0, corrupt 0, gen 0 > > [ 38.869423] BTRFS info (device sda): enabling auto defrag > > [ 38.869490] BTRFS info (device sda): use lzo compression, level 0 > > [ 38.869547] BTRFS info (device sda): disk space caching is enabled > > > > > > After running btrfs device remove /dev/sdd /: > > [ 193.684703] BTRFS info (device sda): relocating block group 10593404846080 flags metadata|raid10 > > [ 312.921934] BTRFS error (device sda): bad tree block start 10597444141056 10919566688256 > > [ 313.034339] BTRFS error (device sda): bad tree block start 17196831625821864417 10919566688256 > > [ 313.034595] BTRFS error (device sda): bad tree block start 10597444141056 10919566688256 > > [ 313.034621] BTRFS: error (device sda) in btrfs_run_delayed_refs:3083: errno=-5 IO failure > > [ 313.034627] BTRFS info (device sda): forced readonly > > [ 313.036328] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 > > [ 313.036596] IP: merge_reloc_roots+0x19f/0x2c0 [btrfs] > > This suggests you are hitting a known problem with reloc roots which > have been fixed in the latest upstream and lts (5.4) kernels: > > 044ca910276b btrfs: reloc: fix reloc root leak and NULL pointer > dereference (3 months ago) > 707de9c0806d btrfs: relocation: fix reloc_root lifespan and access (7 > months ago) > 1fac4a54374f btrfs: relocation: fix use-after-free on dead relocation > roots (11 months ago) Those commits fix a bug that did not exist in btrfs before 5.1. What is the rationale for these commits being relevant to a 4.15 kernel? > So yes, try to update to latest stable kernel and re-run the device > remove. Also update your btrfs progs to latest 5.6 version and rerun > check again (by default it's a read only operations so it shouldn't > cause any more damage). > >