From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=3.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 59D28C433DF for ; Thu, 28 May 2020 15:09:38 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2A43B20814 for ; Thu, 28 May 2020 15:09:38 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="QB+4TE0R" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2404034AbgE1PJh (ORCPT ); Thu, 28 May 2020 11:09:37 -0400 Received: from us-smtp-1.mimecast.com ([205.139.110.61]:47194 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S2403906AbgE1PJg (ORCPT ); Thu, 28 May 2020 11:09:36 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1590678574; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=vqgfZmgQ+eKFCVb4EWp4i4ocS1CDHddlRBsP2XRMH70=; b=QB+4TE0Rk9BSLNcvhoqUQtBnaWB1yykTEeXklAp13RQ6urlYrtkyYhuDkEC0gkn/Lbzn9b 0SO9yhP+1Lswrz75sS5eaRaobpL3Elj2EQ3W6E6Dy5jZqQxJQUDjBt7PoquiGh7RM+HiEe dh6NipACq93ANjmE7q/yTzj2ZFhrPT8= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-152-i4cwVw3YNw61GJWbabKM-Q-1; Thu, 28 May 2020 11:09:24 -0400 X-MC-Unique: i4cwVw3YNw61GJWbabKM-Q-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id BD374872FF2; Thu, 28 May 2020 15:09:23 +0000 (UTC) Received: from bfoster (dhcp-41-2.bos.redhat.com [10.18.41.2]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 1B8151A836; Thu, 28 May 2020 15:09:23 +0000 (UTC) Date: Thu, 28 May 2020 11:09:21 -0400 From: Brian Foster To: "Darrick J. Wong" Cc: sandeen@sandeen.net, linux-xfs@vger.kernel.org Subject: Re: [PATCH 3/9] xfs_repair: create a new class of btree rebuild cursors Message-ID: <20200528150921.GB17794@bfoster> References: <158993944270.983175.4120094597556662259.stgit@magnolia> <158993946213.983175.9823091723787830102.stgit@magnolia> <20200527121804.GC12014@bfoster> <20200527220733.GN8230@magnolia> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20200527220733.GN8230@magnolia> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org On Wed, May 27, 2020 at 03:07:33PM -0700, Darrick J. Wong wrote: > On Wed, May 27, 2020 at 08:18:04AM -0400, Brian Foster wrote: > > On Tue, May 19, 2020 at 06:51:02PM -0700, Darrick J. Wong wrote: > > > From: Darrick J. Wong > > > > > > Create some new support structures and functions to assist phase5 in > > > using the btree bulk loader to reconstruct metadata btrees. This is the > > > first step in removing the open-coded rebuilding code. > > > > > > Signed-off-by: Darrick J. Wong > > > --- > > > repair/phase5.c | 239 ++++++++++++++++++++++++++++++++++++++++++++++++++----- > > > 1 file changed, 218 insertions(+), 21 deletions(-) > > > > > > > > > diff --git a/repair/phase5.c b/repair/phase5.c > > > index 84c05a13..8f5e5f59 100644 > > > --- a/repair/phase5.c > > > +++ b/repair/phase5.c > > > @@ -18,6 +18,7 @@ > > > #include "progress.h" > > > #include "slab.h" > > > #include "rmap.h" > > > +#include "bload.h" > > > > > > /* > > > * we maintain the current slice (path from root to leaf) > > ... > > > @@ -306,6 +324,156 @@ _("error - not enough free space in filesystem\n")); > > > #endif > > > } > > > > > ... > > > +static void > > > +consume_freespace( > > > + xfs_agnumber_t agno, > > > + struct extent_tree_node *ext_ptr, > > > + uint32_t len) > > > +{ > > > + struct extent_tree_node *bno_ext_ptr; > > > + xfs_agblock_t new_start = ext_ptr->ex_startblock + len; > > > + xfs_extlen_t new_len = ext_ptr->ex_blockcount - len; > > > + > > > + /* Delete the used-up extent from both extent trees. */ > > > +#ifdef XR_BLD_FREE_TRACE > > > + fprintf(stderr, "releasing extent: %u [%u %u]\n", agno, > > > + ext_ptr->ex_startblock, ext_ptr->ex_blockcount); > > > +#endif > > > + bno_ext_ptr = find_bno_extent(agno, ext_ptr->ex_startblock); > > > + ASSERT(bno_ext_ptr != NULL); > > > + get_bno_extent(agno, bno_ext_ptr); > > > + release_extent_tree_node(bno_ext_ptr); > > > + > > > + ext_ptr = get_bcnt_extent(agno, ext_ptr->ex_startblock, > > > + ext_ptr->ex_blockcount); > > > + release_extent_tree_node(ext_ptr); > > > + > > > > Not having looked too deeply at the in-core extent tracking structures, > > is there any particular reason we unconditionally remove and reinsert > > new records each time around? Is it because we're basically changing the > > extent index in the tree? If so, comment please (an update to the > > comment below is probably fine). :) > > Yes. We're changing the free space tree records, and the incore bno and > cnt trees maintain the records in sorted order. Therefore, if we want > to change a record we have to delete the record from the tree and > reinsert it. > > > > + /* > > > + * If we only used part of this last extent, then we must reinsert the > > > + * extent in the extent trees. > > /* > * If we only used part of this last extent, then we must reinsert the > * extent to maintain proper sorting order. > */ > > How about that? > Works for me, thanks. > > > + */ > > > + if (new_len > 0) { > > > + add_bno_extent(agno, new_start, new_len); > > > + add_bcnt_extent(agno, new_start, new_len); > > > + } > > > +} > > > + > > > +/* Reserve blocks for the new btree. */ > > > +static void > > > +setup_rebuild( > > > + struct xfs_mount *mp, > > > + xfs_agnumber_t agno, > > > + struct bt_rebuild *btr, > > > + uint32_t nr_blocks) > > > +{ > > > + struct extent_tree_node *ext_ptr; > > > + uint32_t blocks_allocated = 0; > > > + uint32_t len; > > > + int error; > > > + > > > + while (blocks_allocated < nr_blocks) { > > > + /* > > > + * Grab the smallest extent and use it up, then get the > > > + * next smallest. This mimics the init_*_cursor code. > > > + */ > > > + ext_ptr = findfirst_bcnt_extent(agno); > > > > Extra whitespace ^ > > > > > + if (!ext_ptr) > > > + do_error( > > > +_("error - not enough free space in filesystem\n")); > > > + > > > + /* Use up the extent we've got. */ > > > + len = min(ext_ptr->ex_blockcount, nr_blocks - blocks_allocated); > > > + error = xrep_newbt_add_blocks(&btr->newbt, > > > + XFS_AGB_TO_FSB(mp, agno, > > > + ext_ptr->ex_startblock), > > > + len); > > > > Alignment. > > Will fix both of these. > > > > + if (error) > > > + do_error(_("could not set up btree reservation: %s\n"), > > > + strerror(-error)); > > > + > > > + error = rmap_add_ag_rec(mp, agno, ext_ptr->ex_startblock, len, > > > + btr->newbt.oinfo.oi_owner); > > > + if (error) > > > + do_error(_("could not set up btree rmaps: %s\n"), > > > + strerror(-error)); > > > + > > > + consume_freespace(agno, ext_ptr, len); > > > + blocks_allocated += len; > > > + } > > > +#ifdef XR_BLD_FREE_TRACE > > > + fprintf(stderr, "blocks_allocated = %d\n", > > > + blocks_allocated); > > > +#endif > > > +} > > > + > > > +/* Feed one of the new btree blocks to the bulk loader. */ > > > +static int > > > +rebuild_claim_block( > > > + struct xfs_btree_cur *cur, > > > + union xfs_btree_ptr *ptr, > > > + void *priv) > > > +{ > > > + struct bt_rebuild *btr = priv; > > > + > > > + return xrep_newbt_claim_block(cur, &btr->newbt, ptr); > > > +} > > > + > > > > Seems like an unnecessary helper, unless this grows more code in later > > patches..? > > It doesn't grow any more code, but keep in mind that get_record, > claim_block, and iroot_size are all callbacks of xfs_btree_bload(). The > priv parameter passed to that function are passed unchanged to the three > callbacks. The bulk load code doesn't know anything about where the > blocks or the records come from, so this is how both repairs will pass > that information to the callbacks. > Ok. > > > static void > > > write_cursor(bt_status_t *curs) > > > { > > ... > > > @@ -2287,28 +2483,29 @@ keep_fsinos(xfs_mount_t *mp) > > > > > > static void > > > phase5_func( > > > - xfs_mount_t *mp, > > > - xfs_agnumber_t agno, > > > - struct xfs_slab *lost_fsb) > > > + struct xfs_mount *mp, > > > + xfs_agnumber_t agno, > > > + struct xfs_slab *lost_fsb) > > > { > > > - uint64_t num_inos; > > > - uint64_t num_free_inos; > > > - uint64_t finobt_num_inos; > > > - uint64_t finobt_num_free_inos; > > > - bt_status_t bno_btree_curs; > > > - bt_status_t bcnt_btree_curs; > > > - bt_status_t ino_btree_curs; > > > - bt_status_t fino_btree_curs; > > > - bt_status_t rmap_btree_curs; > > > - bt_status_t refcnt_btree_curs; > > > - int extra_blocks = 0; > > > - uint num_freeblocks; > > > - xfs_extlen_t freeblks1; > > > + struct repair_ctx sc = { .mp = mp, }; > > > > I don't see any reason to add sc here when it's still unused. It's not > > as if a single variable is saving complexity somewhere else. I guess > > I'll defer to Eric on the approach wrt to the other unused warnings. > > I'll ask. It seems dumb to have a prep patch that adds a bunch > of symbols that won't get used until the next patch, but OTOH combining > the two will make for a ~40K patch. > I've no strong preference either way in general (the single variable thing aside) as long as each patch compiles and functions correctly and warnings are addressed by the end of the series. I do think that if we keep separate patches, it should probably be documented in the commit log that unused infrastructure is introduced (i.e. warnings expected) and users are introduced in a following patch. It's usually easier to squash patches than separate, so the maintainer can always squash them post review if he wanted to eliminate the warnings from the commit history. Brian > > Also, what's the purpose of the rmap change below? I'm wondering if that > > (along with all of the indentation cleanup) should be its own patch with > > appropriate explanation. > > Errk, that one definitely should be separate. > > > Brian > > > > > + struct agi_stat agi_stat = {0,}; > > > + uint64_t num_inos; > > > + uint64_t num_free_inos; > > > + uint64_t finobt_num_inos; > > > + uint64_t finobt_num_free_inos; > > > + bt_status_t bno_btree_curs; > > > + bt_status_t bcnt_btree_curs; > > > + bt_status_t ino_btree_curs; > > > + bt_status_t fino_btree_curs; > > > + bt_status_t rmap_btree_curs; > > > + bt_status_t refcnt_btree_curs; > > > + int extra_blocks = 0; > > > + uint num_freeblocks; > > > + xfs_extlen_t freeblks1; > > > #ifdef DEBUG > > > - xfs_extlen_t freeblks2; > > > + xfs_extlen_t freeblks2; > > > #endif > > > - xfs_agblock_t num_extents; > > > - struct agi_stat agi_stat = {0,}; > > > + xfs_agblock_t num_extents; > > > > > > if (verbose) > > > do_log(_(" - agno = %d\n"), agno); > > > @@ -2516,8 +2713,8 @@ inject_lost_blocks( > > > if (error) > > > goto out_cancel; > > > > > > - error = -libxfs_free_extent(tp, *fsb, 1, &XFS_RMAP_OINFO_AG, > > > - XFS_AG_RESV_NONE); > > > + error = -libxfs_free_extent(tp, *fsb, 1, > > > + &XFS_RMAP_OINFO_ANY_OWNER, XFS_AG_RESV_NONE); > > > if (error) > > > goto out_cancel; > > > > > > > > >