From: Josef Bacik
To: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Cc: Qu Wenruo
Subject: [PATCH 12/13] btrfs: do not cleanup upper nodes in btrfs_backref_cleanup_node
Date: Wed, 16 Dec 2020 11:22:16 -0500
Message-Id: <59ebfb4821922076f1ab4a1fb007f154a21945e3.1608135557.git.josef@toxicpanda.com>
X-Mailing-List: linux-btrfs@vger.kernel.org

Zygo reported the following panic when testing my error handling patches
for relocation

------------[ cut here ]------------
kernel BUG at fs/btrfs/backref.c:2545!
invalid opcode: 0000 [#1] SMP KASAN PTI
CPU: 3 PID: 8472 Comm: btrfs Tainted: G W 14
Hardware name: QEMU Standard PC (i440FX + PIIX,
Call Trace:
 btrfs_backref_error_cleanup+0x4df/0x530
 build_backref_tree+0x1a5/0x700
 ? _raw_spin_unlock+0x22/0x30
 ? release_extent_buffer+0x225/0x280
 ? free_extent_buffer.part.52+0xd7/0x140
 relocate_tree_blocks+0x2a6/0xb60
 ? kasan_unpoison_shadow+0x35/0x50
 ? do_relocation+0xc10/0xc10
 ? kasan_kmalloc+0x9/0x10
 ? kmem_cache_alloc_trace+0x6a3/0xcb0
 ? free_extent_buffer.part.52+0xd7/0x140
 ? rb_insert_color+0x342/0x360
 ? add_tree_block.isra.36+0x236/0x2b0
 relocate_block_group+0x2eb/0x780
 ? merge_reloc_roots+0x470/0x470
 btrfs_relocate_block_group+0x26e/0x4c0
 btrfs_relocate_chunk+0x52/0x120
 btrfs_balance+0xe2e/0x18f0
 ? pvclock_clocksource_read+0xeb/0x190
 ? btrfs_relocate_chunk+0x120/0x120
 ? lock_contended+0x620/0x6e0
 ? do_raw_spin_lock+0x1e0/0x1e0
 ? do_raw_spin_unlock+0xa8/0x140
 btrfs_ioctl_balance+0x1f9/0x460
 btrfs_ioctl+0x24c8/0x4380
 ? __kasan_check_read+0x11/0x20
 ? check_chain_key+0x1f4/0x2f0
 ? __asan_loadN+0xf/0x20
 ? btrfs_ioctl_get_supported_features+0x30/0x30
 ? kvm_sched_clock_read+0x18/0x30
 ? check_chain_key+0x1f4/0x2f0
 ? lock_downgrade+0x3f0/0x3f0
 ? handle_mm_fault+0xad6/0x2150
 ? do_vfs_ioctl+0xfc/0x9d0
 ? ioctl_file_clone+0xe0/0xe0
 ? check_flags.part.50+0x6c/0x1e0
 ? check_flags.part.50+0x6c/0x1e0
 ? check_flags+0x26/0x30
 ? lock_is_held_type+0xc3/0xf0
 ? syscall_enter_from_user_mode+0x1b/0x60
 ? do_syscall_64+0x13/0x80
 ? rcu_read_lock_sched_held+0xa1/0xd0
 ? __kasan_check_read+0x11/0x20
 ? __fget_light+0xae/0x110
 __x64_sys_ioctl+0xc3/0x100
 do_syscall_64+0x37/0x80
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

This occurs because of this check

	if (RB_EMPTY_NODE(&upper->rb_node))
		BUG_ON(!list_empty(&node->upper));

As we are dropping the backref node, if we discover that our upper node
in the edge we just cleaned up isn't linked into the cache, we assume
that we are now done with this node, thus the BUG_ON().

However this is an erroneous assumption, as we will look up all the
references for a node first, and then process the pending edges. All of
the 'upper' nodes in our pending edges won't be in the cache's rb_tree
yet, because they haven't been processed. We could very well have many
edges still left to clean up on this node.

The fact is we simply do not need this check. We can just process all of
the edges for this node only, because below this check we do the
following

	if (list_empty(&upper->lower)) {
		list_add_tail(&upper->lower, &cache->leaves);
		upper->lowest = 1;
	}

If the upper node truly isn't used yet, then we add it to the
cache->leaves list to be cleaned up later. If it is still used then the
last child node that has it linked into its node will add it to the
leaves list and then it will be cleaned up.

Fix this problem by dropping this logic altogether. With this fix I no
longer see the panic when testing with error injection in the backref
code.

Reviewed-by: Qu Wenruo
Signed-off-by: Josef Bacik
---
 fs/btrfs/backref.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
index 3af38b09be43..7ac59a568595 100644
--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -2541,13 +2541,6 @@ void btrfs_backref_cleanup_node(struct btrfs_backref_cache *cache,
 		list_del(&edge->list[UPPER]);
 		btrfs_backref_free_edge(cache, edge);
 
-		if (RB_EMPTY_NODE(&upper->rb_node)) {
-			BUG_ON(!list_empty(&node->upper));
-			btrfs_backref_drop_node(cache, node);
-			node = upper;
-			node->lowest = 1;
-			continue;
-		}
 		/*
 		 * Add the node to leaf node list if no other child block
 		 * cached.
-- 
2.26.2
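
For readers who want to see the failure mode in isolation, here is a
rough userspace sketch of the argument in the changelog. It is not kernel
code: struct toy_node and every toy_* helper are hypothetical stand-ins
invented for illustration, with plain counters replacing the list heads
and a bool replacing the rb_tree linkage. It models one node with two
pending edges whose uppers have not been inserted into the cache yet,
shows that the removed check would have fired, and shows that dropping
the edges and queueing the now-childless uppers is enough.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_EDGES 4

/* Hypothetical stand-in for struct btrfs_backref_node; not kernel code. */
struct toy_node {
	bool in_cache;   /* models !RB_EMPTY_NODE(&node->rb_node) */
	bool on_leaves;  /* models being linked into cache->leaves */
	int nr_lower;    /* number of child edges still attached */
	int nr_upper;    /* number of pending parent edges */
	struct toy_node *upper[TOY_MAX_EDGES];
};

/* Record a pending edge from child to parent. */
static void toy_link(struct toy_node *child, struct toy_node *parent)
{
	assert(child->nr_upper < TOY_MAX_EDGES);
	child->upper[child->nr_upper++] = parent;
	parent->nr_lower++;
}

/*
 * Models the removed branch: after dropping an edge, an upper that is
 * not in the cache trips the BUG_ON() if the node has edges left.
 */
static bool toy_old_check_would_bug(const struct toy_node *node)
{
	for (int i = 0; i < node->nr_upper; i++) {
		int remaining = node->nr_upper - i - 1;

		if (!node->upper[i]->in_cache && remaining > 0)
			return true;
	}
	return false;
}

/*
 * Models the patched loop: drop every edge of this node only, and queue
 * each upper that has no children left. Returns how many were queued.
 */
static int toy_cleanup_node(struct toy_node *node)
{
	int queued = 0;

	while (node->nr_upper > 0) {
		struct toy_node *upper = node->upper[--node->nr_upper];

		upper->nr_lower--;
		if (upper->nr_lower == 0 && !upper->on_leaves) {
			upper->on_leaves = true;
			queued++;
		}
	}
	return queued;
}

/* One node, two pending uppers, neither of them in the cache yet. */
static int toy_demo(void)
{
	struct toy_node a = {0}, b = {0}, n = {0};

	toy_link(&n, &a);
	toy_link(&n, &b);
	if (!toy_old_check_would_bug(&n))
		return -1;
	return toy_cleanup_node(&n);
}
```

In this toy setup toy_demo() reports that the old check would have
fired, and that the patched loop queues both uppers for later cleanup.
The real code reuses upper->lower as the link into cache->leaves, which
the sketch simplifies to a flag.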