From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A6FBEC77B7C for ; Wed, 31 May 2023 09:52:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235653AbjEaJwq (ORCPT ); Wed, 31 May 2023 05:52:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49958 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233199AbjEaJwn (ORCPT ); Wed, 31 May 2023 05:52:43 -0400 Received: from frasgout12.his.huawei.com (unknown [14.137.139.154]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 63C12E2; Wed, 31 May 2023 02:52:41 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.18.147.228]) by frasgout12.his.huawei.com (SkyGuard) with ESMTP id 4QWPQf0hfxz9xGWp; Wed, 31 May 2023 17:40:58 +0800 (CST) Received: from [10.206.134.65] (unknown [10.206.134.65]) by APP2 (Coremail) with SMTP id GxC2BwAX513IGHdkaX3yAg--.2936S2; Wed, 31 May 2023 10:52:16 +0100 (CET) Message-ID: <38048f0a-7706-12c4-dc85-4d2fa13df015@huaweicloud.com> Date: Wed, 31 May 2023 11:52:06 +0200 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.6.0 Subject: Re: [syzbot] [reiserfs?] possible deadlock in open_xa_dir Content-Language: en-US From: Roberto Sassu To: Paul Moore , syzbot Cc: hdanton@sina.com, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, reiserfs-devel@vger.kernel.org, roberto.sassu@huawei.com, syzkaller-bugs@googlegroups.com, peterz@infradead.org, mingo@redhat.com, will@kernel.org, Jeff Mahoney , Jan Kara References: <0000000000007bedb605f119ed9f@google.com> <00000000000000964605faf87416@google.com> <1020d006-c698-aacc-bcc3-92e5b237ef91@huaweicloud.com> In-Reply-To: <1020d006-c698-aacc-bcc3-92e5b237ef91@huaweicloud.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-CM-TRANSID: GxC2BwAX513IGHdkaX3yAg--.2936S2 X-Coremail-Antispam: 1UD129KBjvJXoWxWF4fXFyxZrWxZw4DXr18Zrb_yoWrKw1rpr W8ta1DKryqvr1kJr40q3WDW340qrsxG34UJr1DGFyUuanrZrnFqF4Ivw1Fgr45JrWkCFsx Jr1jyw1UZrn5JwUanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUvFb4IE77IF4wAFF20E14v26r4j6ryUM7CY07I20VC2zVCF04k2 6cxKx2IYs7xG6rWj6s0DM7CIcVAFz4kK6r1j6r18M28lY4IEw2IIxxk0rwA2F7IY1VAKz4 vEj48ve4kI8wA2z4x0Y4vE2Ix0cI8IcVAFwI0_Jr0_JF4l84ACjcxK6xIIjxv20xvEc7Cj xVAFwI0_Gr0_Cr1l84ACjcxK6I8E87Iv67AKxVW8JVWxJwA2z4x0Y4vEx4A2jsIEc7CjxV AFwI0_Gr0_Gr1UM2AIxVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40E x7xfMcIj6xIIjxv20xvE14v26r106r15McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x 0Yz7v_Jr0_Gr1lF7xvr2IY64vIr41lFIxGxcIEc7CjxVA2Y2ka0xkIwI1lc7I2V7IY0VAS 07AlzVAYIcxG8wCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c 02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_Jw0_ GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUJVWUCwCI42IY6xIIjxv20xvEc7 CjxVAFwI0_Gr0_Cr1lIxAIcVCF04k26cxKx2IYs7xG6Fyj6rWUJwCI42IY6I8E87Iv67AK xVWUJVW8JwCI42IY6I8E87Iv6xkF7I0E14v26r4j6r4UJbIYCTnIWIevJa73UjIFyTuYvj xUOyCJDUUUU X-CM-SenderInfo: purev21wro2thvvxqx5xdzvxpfor3voofrz/1tbiAQAOBF1jj43qAwABsf X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 5/31/2023 11:49 AM, Roberto Sassu wrote: > On 5/5/2023 11:36 PM, Paul Moore wrote: >> On Fri, May 5, 2023 at 4:51 PM syzbot >> wrote: >>> >>> syzbot has bisected this issue to: >>> >>> commit d82dcd9e21b77d338dc4875f3d4111f0db314a7c >>> Author: Roberto Sassu >>> Date:   Fri Mar 31 12:32:18 2023 +0000 >>> >>>      reiserfs: Add security prefix to xattr name in >>> reiserfs_security_write() >>> >>> bisection log: >>> https://syzkaller.appspot.com/x/bisect.txt?x=14403182280000 >>> start commit:   3c4aa4434377 Merge tag 'ceph-for-6.4-rc1' of >>> https://githu.. >>> git tree:       upstream >>> final oops: >>> https://syzkaller.appspot.com/x/report.txt?x=16403182280000 >>> console output: https://syzkaller.appspot.com/x/log.txt?x=12403182280000 >>> kernel config: >>> https://syzkaller.appspot.com/x/.config?x=73a06f6ef2d5b492 >>> dashboard link: >>> https://syzkaller.appspot.com/bug?extid=8fb64a61fdd96b50f3b8 >>> syz repro: >>> https://syzkaller.appspot.com/x/repro.syz?x=12442414280000 >>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=176a7318280000 >>> >>> Reported-by: syzbot+8fb64a61fdd96b50f3b8@syzkaller.appspotmail.com >>> Fixes: d82dcd9e21b7 ("reiserfs: Add security prefix to xattr name in >>> reiserfs_security_write()") >>> >>> For information about bisection process see: >>> https://goo.gl/tpsmEJ#bisection >> >> I don't think Roberto's patch identified above is the actual root >> cause of this problem as reiserfs_xattr_set_handle() is called in >> reiserfs_security_write() both before and after the patch.  However, >> due to some bad logic in reiserfs_security_write() which Roberto >> corrected, I'm thinking that it is possible this code is being >> exercised for the first time and syzbot is starting to trigger a >> locking issue in the reiserfs code ... ? > > + Jan, Jeff (which basically restructured the lock) Actually adding Jan and Jeff. Roberto > + Petr, Ingo, Will > > I involve the lockdep experts, to get a bit of help on this. > > First of all, the lockdep warning is trivial to reproduce: > > # dd if=/dev/zero of=reiserfs.img bs=1M count=100 > # losetup -f --show reiserfs.img > /dev/loop0 > # mkfs.reiserfs /dev/loop0 > # mount /dev/loop0 /mnt/ > # touch file0 > > In the testing system, Smack is the major LSM. > > Ok, so the warning here is clear: > > https://syzkaller.appspot.com/x/log.txt?x=12403182280000 > > However, I was looking if that can really happen. From this: > > [   77.746561][ T5418] -> #1 (&sbi->lock){+.+.}-{3:3}: > [   77.753772][ T5418]        lock_acquire+0x23e/0x630 > [   77.758792][ T5418]        __mutex_lock_common+0x1d8/0x2530 > [   77.764504][ T5418]        mutex_lock_nested+0x1b/0x20 > [   77.769868][ T5418]        reiserfs_write_lock+0x70/0xc0 > [   77.775321][ T5418]        reiserfs_mkdir+0x321/0x870 > > I see that the lock is taken in reiserfs_write_lock(), while lockdep says: > > [   77.710227][ T5418] but task is already holding lock: > [   77.717587][ T5418] ffff88807568d090 (&sbi->lock){+.+.}-{3:3}, at: > reiserfs_write_lock_nested+0x4a/0xb0 > > which is in a different place, I believe here: > > int reiserfs_paste_into_item(struct reiserfs_transaction_handle *th, >                              /* Path to the pasted item. */ > [...] > >         depth = reiserfs_write_unlock_nested(sb); >         dquot_free_space_nodirty(inode, pasted_size); >         reiserfs_write_lock_nested(sb, depth); >         return retval; > } > > This is called by reiserfs_add_entry(), which is called by > reiserfs_create() (it is in the lockdep trace). After returning to > reiserfs_create(), d_instantiate_new() is called. > > I don't know exactly, I take the part that the lock is held. But if it > is held, how d_instantiate_new() can be executed in another task? > > static int reiserfs_create(struct mnt_idmap *idmap, struct inode *dir, >                         struct dentry *dentry, umode_t mode, bool excl) > { > > [...] > >         reiserfs_write_lock(dir->i_sb); > >         retval = journal_begin(&th, dir->i_sb, jbegin_count); > > [...] > >         d_instantiate_new(dentry, inode); >         retval = journal_end(&th); > > out_failed: >         reiserfs_write_unlock(dir->i_sb); > > If the lock is held, the scenario lockdep describes cannot happen. Any > thoughts? > > Thanks > > Roberto