From: "Li, Hao" <lihao2018.fnst@cn.fujitsu.com>
To: Ira Weiny <ira.weiny@intel.com>
Cc: Dave Chinner <david@fromorbit.com>,
"linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>,
"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
<yangx.jy@cn.fujitsu.com>, <ruansy.fnst@cn.fujitsu.com>,
<gujx@cn.fujitsu.com>, Yasunori Goto <y-goto@fujitsu.com>
Subject: Re: Can we change the S_DAX flag immediately on XFS without dropping caches?
Date: Wed, 5 Aug 2020 16:10:05 +0800
Message-ID: <ed4b2df4-086f-a384-3695-4ea721a70326@cn.fujitsu.com>
In-Reply-To: <5717e1e5-79fb-af3c-0859-eea3cd8d9626@cn.fujitsu.com>
Hello,
Ping.
Thanks,
Hao Li
On 2020/7/31 17:12, Li, Hao wrote:
> On 2020/7/30 0:10, Ira Weiny wrote:
>
>> On Wed, Jul 29, 2020 at 11:23:21AM +0900, Yasunori Goto wrote:
>>> Hi,
>>>
>>> On 2020/07/28 11:20, Dave Chinner wrote:
>>>> On Tue, Jul 28, 2020 at 02:00:08AM +0000, Li, Hao wrote:
>>>>> Hi,
>>>>>
>>>>> I have noticed that we have to drop caches to make a change to the S_DAX
>>>>> flag take effect after using chattr +x to turn on DAX for an existing
>>>>> regular file. The related function is xfs_diflags_to_iflags, whose
>>>>> second parameter determines whether we should set S_DAX immediately.
>>>> Yup, as documented in Documentation/filesystems/dax.txt. Specifically:
>>>>
>>>> 6. When changing the S_DAX policy via toggling the persistent FS_XFLAG_DAX flag,
>>>>    the change in behaviour for existing regular files may not occur
>>>>    immediately. If the change must take effect immediately, the administrator
>>>>    needs to:
>>>>
>>>>    a) stop the application so there are no active references to the data set
>>>>       the policy change will affect
>>>>
>>>>    b) evict the data set from kernel caches so it will be re-instantiated when
>>>>       the application is restarted. This can be achieved by:
>>>>
>>>>       i. drop-caches
>>>>       ii. a filesystem unmount and mount cycle
>>>>       iii. a system reboot
>>>>
>>>>> I can't figure out why we do this. Is this because the page caches in
>>>>> address_space->i_pages are hard to deal with?
>>>> Because of unfixable races in the page fault path that prevent
>>>> changing the caching behaviour of the inode while concurrent access
>>>> is possible. The only way to guarantee races can't happen is to
>>>> cycle the inode out of cache.
>>> I understand why the drop_cache operation is necessary. Thanks.
>>>
>>> BTW, even if a normal user becomes able to change the DAX flag for an inode,
>>> the drop_cache operation still requires root permission, right?
>>>
>>> So, if the kernel had a feature that let a normal user drop the cache for
>>> "an inode", subject to the user's own permissions, I think it would ease
>>> the above limitation, and we would like to try to implement it soon.
>>>
>>> Do you have any opinion on adding such a feature?
>>> (Agreement, opposition, or any other comment?)
>> I would not be opposed but there were many hurdles to that implementation.
>>
>> What is the use case you are thinking of here?
>>
>> The compromise of dropping caches was reached because we envisioned that many
>> users would simply want to choose the file mode when a file was created and
>> maintain that mode through the lifetime of the file. To that end one can
>> simply create directories which have the desired dax mode and any files created
>> in that directory will inherit the dax mode immediately.
> The inheritance mechanism for the DAX mode is reasonable, but chattr plus
> drop_caches makes things complicated.
>> So there is no need
>> to switch the file mode directly as a normal user.
> The problem is that normal users can indeed use chattr to change the DAX
> mode of a regular file whenever they want. However, when they do this,
> they have no way to make the change take effect. I think this behavior is
> weird. We can say chattr executes successfully because XFS_DIFLAG2_DAX has
> been set in xfs_inode->i_d.di_flags2, but we can also say chattr doesn't
> finish the job because S_DAX is not set in inode->i_flags.
> The user may be confused about why chattr +/-x doesn't seem to work at all.
> Maybe we should find a way for a normal user to make chattr take effect
> without calling the administrator, or we could make the chattr +/-x command
> require root permission, since a user with root permission can make the
> DAX change take effect through echo 2 > /proc/sys/vm/drop_caches.
>
>
> Regards,
>
> Hao Li
>
>> Would that work for your use case?
>>
>> Ira
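
The gap discussed in this thread, where chattr +x sets the persistent flag
while S_DAX is not yet in effect on the cached inode, can be observed from
user space. The following is a minimal sketch, not something proposed by
anyone in the thread. It assumes a kernel that reports STATX_ATTR_DAX
through statx() (Linux 5.8 or later), glibc 2.28 or later, and a filesystem
where the per-inode flag decides the mode. The names classify_dax,
query_dax, and enum dax_state are hypothetical helpers made up for
illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <unistd.h>
#include <linux/fs.h>

/* Fallbacks for older userspace headers; values match the kernel UAPI. */
#ifndef STATX_ATTR_DAX
#define STATX_ATTR_DAX 0x00200000
#endif
#ifndef FS_XFLAG_DAX
#define FS_XFLAG_DAX 0x00008000
#endif

/* Hypothetical labels for the four flag combinations. */
enum dax_state {
    DAX_OFF,              /* flag clear, S_DAX not in effect */
    DAX_ON,               /* flag set, S_DAX in effect */
    DAX_FLAG_NOT_APPLIED, /* flag set, S_DAX not yet in effect */
    DAX_STILL_ACTIVE      /* flag clear, S_DAX still in effect */
};

/* Combine the persistent flag and the effective state into one label. */
enum dax_state classify_dax(int persistent, int effective)
{
    if (persistent && effective)
        return DAX_ON;
    if (!persistent && !effective)
        return DAX_OFF;
    return persistent ? DAX_FLAG_NOT_APPLIED : DAX_STILL_ACTIVE;
}

/*
 * Read the persistent FS_XFLAG_DAX bit (what chattr +x toggles) and the
 * effective STATX_ATTR_DAX attribute for `path`.
 * Returns 0 on success, -1 on error with errno set.
 */
int query_dax(const char *path, int *persistent, int *effective)
{
    struct fsxattr fsx;
    struct statx stx;
    int fd, ret;

    assert(path && persistent && effective);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    ret = ioctl(fd, FS_IOC_FSGETXATTR, &fsx);
    close(fd);
    if (ret < 0)
        return -1;

    if (statx(AT_FDCWD, path, 0, STATX_BASIC_STATS, &stx) < 0)
        return -1;
    if (!(stx.stx_attributes_mask & STATX_ATTR_DAX)) {
        /* Kernel or filesystem does not report the DAX attribute. */
        errno = ENOTSUP;
        return -1;
    }

    *persistent = (fsx.fsx_xflags & FS_XFLAG_DAX) != 0;
    *effective = (stx.stx_attributes & STATX_ATTR_DAX) != 0;
    return 0;
}
```

A caller would pass the two values from query_dax() to classify_dax().
Right after chattr +x on an existing file, the expected label under the
assumptions above is DAX_FLAG_NOT_APPLIED until the inode is cycled out of
cache. A mount-wide DAX setting can also turn S_DAX on without the
per-inode flag, in which case DAX_STILL_ACTIVE does not indicate a pending
change.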
Thread overview: 14+ messages
2020-07-28 2:00 Can we change the S_DAX flag immediately on XFS without dropping caches? Li, Hao
2020-07-28 2:20 ` Dave Chinner
2020-07-29 2:23 ` Yasunori Goto
2020-07-29 16:10 ` Ira Weiny
2020-07-31 9:12 ` Li, Hao
2020-08-05 8:10 ` Li, Hao [this message]
2020-08-05 15:44 ` Darrick J. Wong
2020-08-07 16:57 ` Ira Weiny
2020-07-31 10:04 ` Yasunori Goto
2020-07-29 23:21 ` Dave Chinner
2020-07-31 9:15 ` Li, Hao
2020-07-31 9:59 ` Yasunori Goto
2020-08-07 17:09 ` Ira Weiny
2020-08-18 9:16 ` Li, Hao