From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.8 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D86ADC10F03 for ; Thu, 14 Mar 2019 01:06:59 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 9F849213A2 for ; Thu, 14 Mar 2019 01:06:59 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="qWoK0/2c" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726680AbfCNBG6 (ORCPT ); Wed, 13 Mar 2019 21:06:58 -0400 Received: from mail-ot1-f67.google.com ([209.85.210.67]:37358 "EHLO mail-ot1-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726218AbfCNBG6 (ORCPT ); Wed, 13 Mar 2019 21:06:58 -0400 Received: by mail-ot1-f67.google.com with SMTP id b3so3615135otp.4 for ; Wed, 13 Mar 2019 18:06:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=subject:to:cc:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-language:content-transfer-encoding; bh=fxH63bT0tIyXg6/IXYao2pvR6Oy8Vp+e2u49nQ59V40=; b=qWoK0/2csFTXR0O+LXDHaWgU9Qmidhh6/qp8GCTBQzv8FNeY9OCzstXo5S9XYDqNgx q4OEQuRjlpXayCMAQVZtjU+MspX4chLrkKM64lHhBJ49auD2xZFFGVDs8R0G9AKmHYdt jchh6OC/+ZkInweexqcajA1smWHyaUqcpYHcgnVgxq14eC7fl6EUQIaXSeyouYa88ghn ZK5if30Wns6HZ5Qb3h8rE/QfJf2YDarIua/qorBhkdWopo823H2RgcEb9/IOjCI9JLBf Zg/fUKJJXzKiTyctct7t8IEr0evzeWCgjSzeRTf6RlNjmwB4ZHL3Vq50nuQbIjw/ATVP hKwQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=fxH63bT0tIyXg6/IXYao2pvR6Oy8Vp+e2u49nQ59V40=; b=P4N/o8kNII8M0ZJeZ9bWp9w2Lh8o0mDNcQlxLMZfL6FOyfOoz2nOh05ATytIkf91+Z nCeIUOHMY3YAp+dEvcYRC96sa3WcjcTINOify9StSBt5uZyARMgsKbRt/MVDDMuSl4YE +ztiDK+dEGYg3NRJQDjuQdnTsdT3rDdRUuT7vOkFF6SQ59o8IF+YSpxszIkl0KFP8u+w XLPG7DilI5VINW0TtjYkwGTsP9G1ddUbyMQw1Y52k0D8B6a3hURWWjM0MUrZSKssEKA7 90gHZ3/lUpHbtujkIT3BkiDtt/vTk4Wbjh2Yp/AV9CoN/8hvqa0utKYwssF0pwPL5tjQ sJHg== X-Gm-Message-State: APjAAAVwzvsV7J+VY42dxpRFJ2Tbmcsdn+KUtDVYQWaRC801AhsH97af sMkW0S/neDbtLX/+O7gOlg5Ey7+f X-Google-Smtp-Source: APXvYqw41sViWgPJRDywhyeUnQ0n8w+4AnP4jZaWTwwf3gbcgIljeRk7SbO/tacgQAX7j8ecGQB/wQ== X-Received: by 2002:a05:6830:108f:: with SMTP id y15mr1461766oto.49.1552525616955; Wed, 13 Mar 2019 18:06:56 -0700 (PDT) Received: from JosephdeMacBook-Pro.local ([205.204.117.30]) by smtp.gmail.com with ESMTPSA id j21sm5200542otr.28.2019.03.13.18.06.54 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 13 Mar 2019 18:06:56 -0700 (PDT) Subject: Re: [Ocfs2-devel] [PATCH] ocfs2: fix inode bh swapping mixup in ocfs2_reflink_inodes_lock To: "Darrick J. Wong" , akpm@linux-foundation.org, ocfs2-devel@oss.oracle.com Cc: linux-fsdevel@vger.kernel.org References: <20190312214910.GK20533@magnolia> From: Joseph Qi Message-ID: <8b40e9d3-bd05-5b54-4770-dcd46f669ff6@gmail.com> Date: Thu, 14 Mar 2019 09:06:51 +0800 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:60.0) Gecko/20100101 Thunderbird/60.5.3 MIME-Version: 1.0 In-Reply-To: <20190312214910.GK20533@magnolia> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org On 19/3/13 05:49, Darrick J. Wong wrote: > From: Darrick J. Wong > > ocfs2_reflink_inodes_lock can swap the inode1/inode2 variables so that > we always grab cluster locks in order of increasing inode number. > Unfortunately, we forget to swap the inode record buffer head pointers > when we've done this, which leads to incorrect bookkeepping when we're > trying to make the two inodes have the same refcount tree. > > This has the effect of causing filesystem shutdowns if you're trying to > reflink data from inode 100 into inode 97, where inode 100 already has a > refcount tree attached and inode 97 doesn't. The reflink code decides > to copy the refcount tree pointer from 100 to 97, but uses inode 97's > inode record to open the tree root (which it doesn't have) and blows up. > This issue causes filesystem shutdowns and metadata corruption! > > Fixes: 29ac8e856cb369 ("ocfs2: implement the VFS clone_range, copy_range, and dedupe_range features") > Signed-off-by: Darrick J. Wong Looks good to me. Reviewed-by: Joseph Qi > --- > fs/ocfs2/refcounttree.c | 42 ++++++++++++++++++++++++------------------ > 1 file changed, 24 insertions(+), 18 deletions(-) > > diff --git a/fs/ocfs2/refcounttree.c b/fs/ocfs2/refcounttree.c > index a35259eebc56..1dc9a08e8bdc 100644 > --- a/fs/ocfs2/refcounttree.c > +++ b/fs/ocfs2/refcounttree.c > @@ -4719,22 +4719,23 @@ loff_t ocfs2_reflink_remap_blocks(struct inode *s_inode, > > /* Lock an inode and grab a bh pointing to the inode. */ > int ocfs2_reflink_inodes_lock(struct inode *s_inode, > - struct buffer_head **bh1, > + struct buffer_head **bh_s, > struct inode *t_inode, > - struct buffer_head **bh2) > + struct buffer_head **bh_t) > { > - struct inode *inode1; > - struct inode *inode2; > + struct inode *inode1 = s_inode; > + struct inode *inode2 = t_inode; > struct ocfs2_inode_info *oi1; > struct ocfs2_inode_info *oi2; > + struct buffer_head *bh1 = NULL; > + struct buffer_head *bh2 = NULL; > bool same_inode = (s_inode == t_inode); > + bool need_swap = (inode1->i_ino > inode2->i_ino); > int status; > > /* First grab the VFS and rw locks. */ > lock_two_nondirectories(s_inode, t_inode); > - inode1 = s_inode; > - inode2 = t_inode; > - if (inode1->i_ino > inode2->i_ino) > + if (need_swap) > swap(inode1, inode2); > > status = ocfs2_rw_lock(inode1, 1); > @@ -4757,17 +4758,13 @@ int ocfs2_reflink_inodes_lock(struct inode *s_inode, > trace_ocfs2_double_lock((unsigned long long)oi1->ip_blkno, > (unsigned long long)oi2->ip_blkno); > > - if (*bh1) > - *bh1 = NULL; > - if (*bh2) > - *bh2 = NULL; > - > /* We always want to lock the one with the lower lockid first. */ > if (oi1->ip_blkno > oi2->ip_blkno) > mlog_errno(-ENOLCK); > > /* lock id1 */ > - status = ocfs2_inode_lock_nested(inode1, bh1, 1, OI_LS_REFLINK_TARGET); > + status = ocfs2_inode_lock_nested(inode1, &bh1, 1, > + OI_LS_REFLINK_TARGET); > if (status < 0) { > if (status != -ENOENT) > mlog_errno(status); > @@ -4776,15 +4773,25 @@ int ocfs2_reflink_inodes_lock(struct inode *s_inode, > > /* lock id2 */ > if (!same_inode) { > - status = ocfs2_inode_lock_nested(inode2, bh2, 1, > + status = ocfs2_inode_lock_nested(inode2, &bh2, 1, > OI_LS_REFLINK_TARGET); > if (status < 0) { > if (status != -ENOENT) > mlog_errno(status); > goto out_cl1; > } > - } else > - *bh2 = *bh1; > + } else { > + bh2 = bh1; > + } > + > + /* > + * If we swapped inode order above, we have to swap the buffer heads > + * before passing them back to the caller. > + */ > + if (need_swap) > + swap(bh1, bh2); > + *bh_s = bh1; > + *bh_t = bh2; > > trace_ocfs2_double_lock_end( > (unsigned long long)oi1->ip_blkno, > @@ -4794,8 +4801,7 @@ int ocfs2_reflink_inodes_lock(struct inode *s_inode, > > out_cl1: > ocfs2_inode_unlock(inode1, 1); > - brelse(*bh1); > - *bh1 = NULL; > + brelse(bh1); > out_rw2: > ocfs2_rw_unlock(inode2, 1); > out_i2: > > _______________________________________________ > Ocfs2-devel mailing list > Ocfs2-devel@oss.oracle.com > https://oss.oracle.com/mailman/listinfo/ocfs2-devel >