Date: Tue, 21 Jun 2022 17:46:53 +0200
From: Christoph Hellwig
To: Nikolay Borisov
Cc: Christoph Hellwig, clm@fb.com, josef@toxicpanda.com, dsterba@suse.com, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] btrfs: repair all bad mirrors
Message-ID: <20220621154653.GA10068@lst.de>
References: <20220619082821.2151052-1-hch@lst.de>

On Tue, Jun 21, 2022 at 06:19:19PM +0300, Nikolay Borisov wrote:
>> +
>> +	mirror = failrec->this_mirror;
>> +	do {
>> +		mirror = prev_mirror(failrec, mirror);
>> +		repair_io_failure(fs_info, ino, start, failrec->len,
>> +				  failrec->logical, page, pg_offset, mirror);
>> +	} while (mirror != failrec->orig_mirror);
>
> But does this work as intended? Say we have a raid1c4 and we read from
> mirror 3 which is bad, in this case failrec->orig_mirror = 3 and
> ->this_mirror = 4.
> The read from mirror 4 returns good data and
> clean_io_failure is called with mirror= 3 in which case only mirror 3 is
> repaired (assume 1/2 were also bad we don't know it yet, because the
> original bio request didn't submit to them based on the PID policy).

Yes.  Although that is what I intended, as we don't want to read data
we don't otherwise have to.  Maybe it should state "all known bad
mirrors" instead of "all mirrors".  I think if we want to check all
mirrors we need to defer to the scrub code.
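
To make the "known bad mirrors" point concrete, here is a minimal
userspace sketch of the walk done by the quoted loop.  It is not the
kernel code: struct failrec_sketch, prev_mirror_sketch() and
repair_targets() are hypothetical names, and the wrap-around from
mirror 1 back to num_copies is an assumption about what prev_mirror()
does.  Instead of issuing repair writes it only records which mirrors
would be rewritten.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the failure record fields used by the loop. */
struct failrec_sketch {
	int num_copies;   /* number of mirrors, e.g. 4 for raid1c4 */
	int orig_mirror;  /* mirror whose read failed first */
	int this_mirror;  /* mirror that finally returned good data */
};

/* Assumed behaviour: step back one mirror, wrapping from 1 to num_copies. */
static int prev_mirror_sketch(const struct failrec_sketch *rec, int cur)
{
	return cur == 1 ? rec->num_copies : cur - 1;
}

/*
 * Record, in order, the mirrors the loop would repair.
 * Returns the number of entries stored in out[].
 */
static size_t repair_targets(const struct failrec_sketch *rec,
			     int *out, size_t cap)
{
	size_t n = 0;
	int mirror = rec->this_mirror;

	do {
		mirror = prev_mirror_sketch(rec, mirror);
		assert(n < cap);	/* caller must size out[] for num_copies */
		out[n++] = mirror;
	} while (mirror != rec->orig_mirror);

	return n;
}
```

With num_copies = 4, orig_mirror = 3 and this_mirror = 4 this records
only mirror 3, which matches the raid1c4 case above: mirrors 1 and 2
were never read, so they are not known to be bad.  If mirror 4 had
failed as well and mirror 1 returned good data (this_mirror = 1), it
records 4 and then 3.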