From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.1 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,NICE_REPLY_A, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 402EBC433E2 for ; Fri, 28 Aug 2020 09:04:30 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 200AA2071B for ; Fri, 28 Aug 2020 09:04:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728670AbgH1JE3 (ORCPT ); Fri, 28 Aug 2020 05:04:29 -0400 Received: from mail.cn.fujitsu.com ([183.91.158.132]:52639 "EHLO heian.cn.fujitsu.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726834AbgH1JE1 (ORCPT ); Fri, 28 Aug 2020 05:04:27 -0400 X-IronPort-AV: E=Sophos;i="5.76,363,1592841600"; d="scan'208";a="98664409" Received: from unknown (HELO cn.fujitsu.com) ([10.167.33.5]) by heian.cn.fujitsu.com with ESMTP; 28 Aug 2020 17:04:22 +0800 Received: from G08CNEXMBPEKD04.g08.fujitsu.local (unknown [10.167.33.201]) by cn.fujitsu.com (Postfix) with ESMTP id 187A348990C4; Fri, 28 Aug 2020 17:04:16 +0800 (CST) Received: from [10.167.225.206] (10.167.225.206) by G08CNEXMBPEKD04.g08.fujitsu.local (10.167.33.201) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Fri, 28 Aug 2020 17:04:16 +0800 Subject: Re: [PATCH] fs: Kill DCACHE_DONTCACHE dentry even if DCACHE_REFERENCED is set To: Dave Chinner CC: , , , References: <20200821015953.22956-1-lihao2018.fnst@cn.fujitsu.com> <20200827063748.GA12096@dread.disaster.area> <6b3b3439-2199-8f00-ceca-d65769e94fe0@cn.fujitsu.com> <20200828003541.GD12096@dread.disaster.area> From: "Li, Hao" Message-ID: Date: Fri, 28 Aug 2020 17:04:14 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.1.1 MIME-Version: 1.0 In-Reply-To: <20200828003541.GD12096@dread.disaster.area> Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Content-Language: en-US X-Originating-IP: [10.167.225.206] X-ClientProxiedBy: G08CNEXCHPEKD06.g08.fujitsu.local (10.167.33.205) To G08CNEXMBPEKD04.g08.fujitsu.local (10.167.33.201) X-yoursite-MailScanner-ID: 187A348990C4.ADFE2 X-yoursite-MailScanner: Found to be clean X-yoursite-MailScanner-From: lihao2018.fnst@cn.fujitsu.com Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2020/8/28 8:35, Dave Chinner wrote: > On Thu, Aug 27, 2020 at 05:58:07PM +0800, Li, Hao wrote: >> On 2020/8/27 14:37, Dave Chinner wrote: >>> On Fri, Aug 21, 2020 at 09:59:53AM +0800, Hao Li wrote: >>>> Currently, DCACHE_REFERENCED prevents the dentry with DCACHE_DONTCACHE >>>> set from being killed, so the corresponding inode can't be evicted. If >>>> the DAX policy of an inode is changed, we can't make policy changing >>>> take effects unless dropping caches manually. >>>> >>>> This patch fixes this problem and flushes the inode to disk to prepare >>>> for evicting it. >>>> >>>> Signed-off-by: Hao Li >>>> --- >>>> fs/dcache.c | 3 ++- >>>> fs/inode.c | 2 +- >>>> 2 files changed, 3 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/fs/dcache.c b/fs/dcache.c >>>> index ea0485861d93..486c7409dc82 100644 >>>> --- a/fs/dcache.c >>>> +++ b/fs/dcache.c >>>> @@ -796,7 +796,8 @@ static inline bool fast_dput(struct dentry *dentry) >>>> */ >>>> smp_rmb(); >>>> d_flags = READ_ONCE(dentry->d_flags); >>>> - d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | DCACHE_DISCONNECTED; >>>> + d_flags &= DCACHE_REFERENCED | DCACHE_LRU_LIST | DCACHE_DISCONNECTED >>>> + | DCACHE_DONTCACHE; >>> Seems reasonable, but you need to update the comment above as to >>> how this flag fits into this code.... >> Yes. I will change it. Thanks. >> >>>> /* Nothing to do? Dropping the reference was all we needed? */ >>>> if (d_flags == (DCACHE_REFERENCED | DCACHE_LRU_LIST) && !d_unhashed(dentry)) >>>> diff --git a/fs/inode.c b/fs/inode.c >>>> index 72c4c347afb7..5218a8aebd7f 100644 >>>> --- a/fs/inode.c >>>> +++ b/fs/inode.c >>>> @@ -1632,7 +1632,7 @@ static void iput_final(struct inode *inode) >>>> } >>>> >>>> state = inode->i_state; >>>> - if (!drop) { >>>> + if (!drop || (drop && (inode->i_state & I_DONTCACHE))) { >>>> WRITE_ONCE(inode->i_state, state | I_WILL_FREE); >>>> spin_unlock(&inode->i_lock); >>> What's this supposed to do? We'll only get here with drop set if the >>> filesystem is mounting or unmounting. >> The variable drop will also be set to True if I_DONTCACHE is set on >> inode->i_state. >> Although mounting/unmounting will set the drop variable, it won't set >> I_DONTCACHE if I understand correctly. As a result, >> drop && (inode->i_state & I_DONTCACHE) will filter out mounting/unmounting. > So what does this have to do with changing the way the dcache > treats DCACHE_DONTCACHE? After changing the way the dcache treats DCACHE_DONTCACHE, the dentry with DCACHE_DONTCACHE set will be killed unconditionally even if DCACHE_REFERENCED is set, and its inode will be processed by iput_final(). This inode has I_DONTCACHE flag, so op->drop_inode() will return true, and the inode will be evicted _directly_ even though it has dirty pages. I think the kernel will run into error state because it doesn't writeback dirty pages of this inode before evicting. This is why I write back this inode here. According to my test, if I revert the second hunk of this patch, kernel will hang after running the following command: echo 123 > test.txt && xfs_io -c "chattr +x" test.txt The backtrace is: xfs_fs_destroy_inode+0x204/0x24d destroy_inode+0x3b/0x65 evict+0x150/0x1aa iput+0x117/0x19a dentry_unlink_inode+0x12b/0x12f __dentry_kill+0xee/0x211 dentry_kill+0x112/0x22f dput+0x79/0x86 __fput+0x200/0x23f ____fput+0x25/0x28 task_work_run+0x144/0x177 do_exit+0x4fb/0xb94 This backtrace is printed with an ASSERT(0) statement in xfs_fs_destroy_inode(). > > Also, if I_DONTCACHE is set, but the inode has also been unlinked or > is unhashed, should we be writing it back? i.e. it might have been > dropped for a different reason to I_DONTCACHE.... This is a problem I didn't realize. You are right. If the inode has been unlinked/unhashed and the I_DONTCACHE is also set, the appended condition will lead to unnecessary writeback. I think I need to handle the inode writeback more carefully. Maybe I can revert the second hunk and remove I_DONTCACHE from generic_drop_inode() and then change if (!drop && (sb->s_flags & SB_ACTIVE)) to if (!drop && !(inode->i_state & I_DONTCACHE) && (sb->s_flags & SB_ACTIVE)) This approach would be more suitable. > > IOWs, if there is a problem with how I_DONTCACHE is being handled, > then the problem must already exists regardless of the > DCACHE_DONTCACHE behaviour, right? So shouldn't this be a separate > bug fix with it's own explanation of the problem and the fix? I do think the way we treat I_DONTCACHE in current kernel is not suitable. generic_drop_inode() is used to determine if the inode should be evicted without writeback, so I_DONTCACHE shouldn't be handled in this function. The suitable approach is illustrated above. I can submit a patch to fix this problem if this new approach is acceptable. Thanks, Hao Li > Cheers, > > Dave.