Linux-Fsdevel Archive on lore.kernel.org
 help / color / Atom feed
* [RFC PATCH V2 00/12] Enable per-file/directory DAX operations V2
@ 2020-01-10 19:29 ira.weiny
  2020-01-10 19:29 ` [RFC PATCH V2 01/12] fs/stat: Define DAX statx attribute ira.weiny
                   ` (11 more replies)
  0 siblings, 12 replies; 55+ messages in thread
From: ira.weiny @ 2020-01-10 19:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ira Weiny, Alexander Viro, Darrick J. Wong, Dan Williams,
	Dave Chinner, Christoph Hellwig, Theodore Y. Ts'o, Jan Kara,
	linux-ext4, linux-xfs, linux-fsdevel

From: Ira Weiny <ira.weiny@intel.com>

At LSF/MM'19 [1] [2] we discussed applications that overestimate memory
consumption due to their inability to detect whether the kernel will
instantiate page cache for a file, and cases where a global dax enable via a
mount option is too coarse.

The following patch series enables selecting the use of DAX on individual files
and/or directories on xfs, and lays some groundwork to do so in ext4.  In this
scheme the dax mount option can be omitted to allow the per-file property to
take effect.

The insight at LSF/MM was to separate the per-mount or per-file "physical"
capability switch from an "effective" attribute for the file.

At LSF/MM we discussed the difficulties of switching the mode of a file with
active mappings / page cache.  It was thought the races could be avoided by
limiting mode flips to 0-length files.

However, this turns out to not be true.[3] This is because address space
operations (a_ops) may be in use at any time the inode is referenced and users
have expressed a desire to be able to change the mode on a file with data in
it.  For those reasons this patch set allows changing the mode flag on a file
as long as it is not current mapped.

Furthermore, DAX is a property of the inode and as such, many operations other
than address space operations need to be protected during a mode change.

Therefore callbacks are placed within the inode operations and used to lock the
inode as appropriate.

As in V1, Users are able to query the effective and physical flags separately
at any time.  Specifically the addition of the statx attribute bit allows them
to ensure the file is operating in the mode they intend.  This 'effective flag'
and physical flags could differ when the filesystem is mounted with the dax
flag for example.

It should be noted that the physical DAX flag inheritance is not shown in this
patch set as it was maintained from previous work on XFS.  The physical DAX flag
and it's inheritance will need to be added to other file systems for user
control. 

Finally, extensive testing was performed which resulted in a couple of bug fix
and clean up patches.  Specifically:

	fs: remove unneeded IS_DAX() check
	fs/xfs: Fix truncate up

'Fix truncate up' deserves specific attention because I'm not 100% sure it is
the correct fix.  Without that patch fsx testing failed within a few minutes
with this error.

	Mapped Write: non-zero data past EOF (0x3b0da) page offset 0xdb is 0x3711

With 'Fix truncate up' running fsx while changing modes can run for hours but I
have seen 2 other errors in the same genre after many hours of continuous
testing.

They are:

	READ BAD DATA: offset = 0x22dc, size = 0xcc7e, fname = /mnt/pmem/dax-file

	Mapped Read: non-zero data past EOF (0x3309e) page offset 0x9f is 0x6ab4

After seeing the patches to fix stale data exposure problems[4] I'm more
confident now that all 3 of these errors are a latent bug rather than a bug in
this series itself.

However, because of these failures I'm only submitting this set RFC.


[1] https://lwn.net/Articles/787973/
[2] https://lwn.net/Articles/787233/
[3] https://lkml.org/lkml/2019/10/20/96
[4] https://patchwork.kernel.org/patch/11310511/


To: linux-kernel@vger.kernel.org
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "Theodore Y. Ts'o" <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>
Cc: linux-ext4@vger.kernel.org
Cc: linux-xfs@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org


^ permalink raw reply	[flat|nested] 55+ messages in thread

end of thread, back to index

Thread overview: 55+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-10 19:29 [RFC PATCH V2 00/12] Enable per-file/directory DAX operations V2 ira.weiny
2020-01-10 19:29 ` [RFC PATCH V2 01/12] fs/stat: Define DAX statx attribute ira.weiny
2020-01-15 11:37   ` Jan Kara
2020-01-15 17:38     ` Darrick J. Wong
2020-01-15 19:45       ` Ira Weiny
2020-01-15 20:10         ` Dan Williams
2020-01-15 22:38           ` Ira Weiny
2020-01-16  5:39             ` Darrick J. Wong
2020-01-16  6:05               ` Dan Williams
2020-01-16  6:18                 ` Darrick J. Wong
2020-01-16  6:25                   ` Dan Williams
2020-01-18  9:11                 ` Dave Chinner
2020-01-16 17:55               ` Ira Weiny
2020-01-16 18:04                 ` Darrick J. Wong
2020-01-16 18:52                   ` Ira Weiny
2020-01-16 22:19                     ` Darrick J. Wong
2020-01-17 11:58                     ` Jan Kara
2020-01-10 19:29 ` [RFC PATCH V2 02/12] fs/xfs: Isolate the physical DAX flag from effective ira.weiny
2020-01-10 19:29 ` [RFC PATCH V2 03/12] fs/xfs: Separate functionality of xfs_inode_supports_dax() ira.weiny
2020-01-10 19:29 ` [RFC PATCH V2 04/12] fs/xfs: Clean up DAX support check ira.weiny
2020-01-10 19:29 ` [RFC PATCH V2 05/12] fs: remove unneeded IS_DAX() check ira.weiny
2020-01-16  9:38   ` Jan Kara
2020-01-16 18:47     ` Ira Weiny
2020-01-10 19:29 ` [RFC PATCH V2 06/12] fs/xfs: Check if the inode supports DAX under lock ira.weiny
2020-01-10 19:29 ` [RFC PATCH V2 07/12] fs: Add locking for a dynamic inode 'mode' ira.weiny
2020-01-13 22:12   ` Darrick J. Wong
2020-01-14  0:20     ` Ira Weiny
2020-01-14  1:03       ` Darrick J. Wong
2020-01-15 19:08         ` Ira Weiny
2020-01-16  5:40           ` Darrick J. Wong
2020-01-16 18:54             ` Ira Weiny
2020-01-10 19:29 ` [RFC PATCH V2 08/12] fs/xfs: Add lock/unlock mode to xfs ira.weiny
2020-01-13 22:19   ` Darrick J. Wong
2020-01-14  0:35     ` Ira Weiny
2020-01-15  0:57       ` Ira Weiny
2020-01-15 23:52     ` Ira Weiny
2020-01-16  9:24   ` Jan Kara
2020-01-16 19:12     ` Ira Weiny
2020-01-10 19:29 ` [RFC PATCH V2 09/12] fs: Prevent mode change if file is mmap'ed ira.weiny
2020-01-13 22:22   ` Darrick J. Wong
2020-01-14  0:46     ` Ira Weiny
2020-01-14  1:30       ` Darrick J. Wong
2020-01-14 17:53         ` Ira Weiny
2020-01-15 11:34           ` Jan Kara
2020-01-15 18:24             ` Ira Weiny
2020-01-15 10:21   ` David Laight
2020-01-15 17:53     ` Ira Weiny
2020-01-10 19:29 ` [RFC PATCH V2 10/12] fs/xfs: Fix truncate up ira.weiny
2020-01-13 22:27   ` Darrick J. Wong
2020-01-14  0:40     ` Ira Weiny
2020-01-14  1:14       ` Darrick J. Wong
2020-01-14 19:00         ` Ira Weiny
2020-01-14 19:39           ` Ira Weiny
2020-01-10 19:29 ` [RFC PATCH V2 11/12] fs/xfs: Clean up locking in dax invalidate ira.weiny
2020-01-10 19:29 ` [RFC PATCH V2 12/12] fs/xfs: Allow toggle of effective DAX flag ira.weiny

Linux-Fsdevel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-fsdevel/0 linux-fsdevel/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-fsdevel linux-fsdevel/ https://lore.kernel.org/linux-fsdevel \
		linux-fsdevel@vger.kernel.org
	public-inbox-index linux-fsdevel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-fsdevel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git