From: ira.weiny@intel.com
To: linux-kernel@vger.kernel.org
Cc: Ira Weiny <ira.weiny@intel.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Dave Chinner <david@fromorbit.com>,
	Christoph Hellwig <hch@lst.de>,
	"Theodore Y. Ts'o" <tytso@mit.edu>,
	Jan Kara <jack@suse.cz>,
	linux-ext4@vger.kernel.org,
	linux-xfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org
Subject: [RFC PATCH V2 00/12] Enable per-file/directory DAX operations V2
Date: Fri, 10 Jan 2020 11:29:30 -0800
Message-ID: <20200110192942.25021-1-ira.weiny@intel.com>

From: Ira Weiny <ira.weiny@intel.com>

At LSF/MM'19 [1] [2] we discussed applications that overestimate memory
consumption due to their inability to detect whether the kernel will
instantiate page cache for a file, and cases where a global dax enable via a
mount option is too coarse.

The following patch series enables selecting the use of DAX on individual
files and/or directories on xfs, and lays some groundwork to do so in ext4.
In this scheme the dax mount option can be omitted to allow the per-file
property to take effect.

The insight at LSF/MM was to separate the per-mount or per-file "physical"
capability switch from an "effective" attribute for the file.

At LSF/MM we discussed the difficulties of switching the mode of a file with
active mappings / page cache. It was thought the races could be avoided by
limiting mode flips to 0-length files.

However, this turns out not to be true. [3] Address space operations (a_ops)
may be in use at any time the inode is referenced, so a 0-length file is not
free of races. In addition, users have expressed a desire to be able to change
the mode on a file with data in it. For those reasons this patch set allows
changing the mode flag on a file as long as it is not currently mapped.

Furthermore, DAX is a property of the inode and as such, many operations other
than address space operations need to be protected during a mode change.
Therefore callbacks are placed within the inode operations and used to lock
the inode as appropriate.

As in V1, users are able to query the effective and physical flags separately
at any time. Specifically, the addition of the statx attribute bit allows them
to ensure the file is operating in the mode they intend. The effective and
physical flags can differ, for example when the filesystem is mounted with the
dax option.

It should be noted that the physical DAX flag inheritance is not shown in this
patch set as it was maintained from previous work on XFS. The physical DAX
flag and its inheritance will need to be added to other file systems for user
control.

Finally, extensive testing was performed which resulted in a couple of bug-fix
and clean-up patches. Specifically:

	fs: remove unneeded IS_DAX() check
	fs/xfs: Fix truncate up

'Fix truncate up' deserves specific attention because I'm not 100% sure it is
the correct fix. Without that patch fsx testing failed within a few minutes
with this error:

	Mapped Write: non-zero data past EOF (0x3b0da) page offset 0xdb is 0x3711

With 'Fix truncate up', fsx can run for hours while modes are being changed,
but I have seen 2 other errors of the same kind after many hours of continuous
testing. They are:

	READ BAD DATA: offset = 0x22dc, size = 0xcc7e, fname = /mnt/pmem/dax-file

	Mapped Read: non-zero data past EOF (0x3309e) page offset 0x9f is 0x6ab4

After seeing the patches to fix stale data exposure problems [4] I'm more
confident now that all 3 of these errors are a latent bug rather than a bug in
this series itself.

However, because of these failures I'm only submitting this set as an RFC.

[1] https://lwn.net/Articles/787973/
[2] https://lwn.net/Articles/787233/
[3] https://lkml.org/lkml/2019/10/20/96
[4] https://patchwork.kernel.org/patch/11310511/
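The cover letter says users can query the effective and physical flags
separately. A minimal userspace sketch of such a query follows. It is not
taken from the series and rests on several assumptions: that the effective
state is reported through a statx attribute bit named STATX_ATTR_DAX (the
fallback value below is the one in current mainline uapi headers and may not
match this RFC), and that the physical flag is the one XFS exposes as
FS_XFLAG_DAX through the FS_IOC_FSGETXATTR ioctl. The helper names
dax_effective(), dax_physical() and dax_flags_differ() are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>      /* AT_FDCWD, open() */
#include <stdbool.h>
#include <sys/ioctl.h>
#include <sys/stat.h>   /* statx(), struct statx (glibc 2.28 or later) */
#include <unistd.h>     /* close() */
#include <linux/fs.h>   /* struct fsxattr, FS_IOC_FSGETXATTR, FS_XFLAG_DAX */

/* Assumption: bit name and value as in current mainline uapi headers;
 * the value defined by this RFC may differ. */
#ifndef STATX_ATTR_DAX
#define STATX_ATTR_DAX 0x00200000
#endif

/*
 * Effective state: 1 if the file currently operates in DAX mode, 0 if not,
 * -1 on error or if the kernel/filesystem does not report the bit.
 */
int dax_effective(const char *path)
{
	struct statx stx;

	assert(path != NULL);
	if (statx(AT_FDCWD, path, 0, STATX_BASIC_STATS, &stx) != 0)
		return -1;
	/* stx_attributes_mask says which attribute bits are meaningful here. */
	if (!(stx.stx_attributes_mask & STATX_ATTR_DAX))
		return -1;
	return (stx.stx_attributes & STATX_ATTR_DAX) ? 1 : 0;
}

/*
 * Physical flag: 1 if the inode flag is set, 0 if clear, -1 on error
 * (for example when the filesystem does not implement the ioctl).
 */
int dax_physical(const char *path)
{
	struct fsxattr fsx;
	int fd, rc;

	assert(path != NULL);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	rc = ioctl(fd, FS_IOC_FSGETXATTR, &fsx);
	close(fd);
	if (rc != 0)
		return -1;
	return (fsx.fsx_xflags & FS_XFLAG_DAX) ? 1 : 0;
}

/*
 * True only when both values are known and disagree, e.g. the physical
 * flag is clear but the filesystem was mounted with the dax option.
 */
bool dax_flags_differ(int effective, int physical)
{
	if (effective < 0 || physical < 0)
		return false;
	return effective != physical;
}
```

A caller could pass a path such as /mnt/pmem/dax-file to both helpers and feed
the results to dax_flags_differ(). A -1 result is kept distinct from 0 so that
"not reported" is never mistaken for "DAX disabled".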

Thread overview: 55+ messages

2020-01-10 19:29 ira.weiny [this message]
2020-01-10 19:29 ` [RFC PATCH V2 01/12] fs/stat: Define DAX statx attribute ira.weiny
    2020-01-15 11:37 ` Jan Kara
    2020-01-15 17:38 ` Darrick J. Wong
    2020-01-15 19:45 ` Ira Weiny
    2020-01-15 20:10 ` Dan Williams
    2020-01-15 22:38 ` Ira Weiny
    2020-01-16  5:39 ` Darrick J. Wong
    2020-01-16  6:05 ` Dan Williams
    2020-01-16  6:18 ` Darrick J. Wong
    2020-01-16  6:25 ` Dan Williams
    2020-01-18  9:11 ` Dave Chinner
    2020-01-16 17:55 ` Ira Weiny
    2020-01-16 18:04 ` Darrick J. Wong
    2020-01-16 18:52 ` Ira Weiny
    2020-01-16 22:19 ` Darrick J. Wong
    2020-01-17 11:58 ` Jan Kara
2020-01-10 19:29 ` [RFC PATCH V2 02/12] fs/xfs: Isolate the physical DAX flag from effective ira.weiny
2020-01-10 19:29 ` [RFC PATCH V2 03/12] fs/xfs: Separate functionality of xfs_inode_supports_dax() ira.weiny
2020-01-10 19:29 ` [RFC PATCH V2 04/12] fs/xfs: Clean up DAX support check ira.weiny
2020-01-10 19:29 ` [RFC PATCH V2 05/12] fs: remove unneeded IS_DAX() check ira.weiny
    2020-01-16  9:38 ` Jan Kara
    2020-01-16 18:47 ` Ira Weiny
2020-01-10 19:29 ` [RFC PATCH V2 06/12] fs/xfs: Check if the inode supports DAX under lock ira.weiny
2020-01-10 19:29 ` [RFC PATCH V2 07/12] fs: Add locking for a dynamic inode 'mode' ira.weiny
    2020-01-13 22:12 ` Darrick J. Wong
    2020-01-14  0:20 ` Ira Weiny
    2020-01-14  1:03 ` Darrick J. Wong
    2020-01-15 19:08 ` Ira Weiny
    2020-01-16  5:40 ` Darrick J. Wong
    2020-01-16 18:54 ` Ira Weiny
2020-01-10 19:29 ` [RFC PATCH V2 08/12] fs/xfs: Add lock/unlock mode to xfs ira.weiny
    2020-01-13 22:19 ` Darrick J. Wong
    2020-01-14  0:35 ` Ira Weiny
    2020-01-15  0:57 ` Ira Weiny
    2020-01-15 23:52 ` Ira Weiny
    2020-01-16  9:24 ` Jan Kara
    2020-01-16 19:12 ` Ira Weiny
2020-01-10 19:29 ` [RFC PATCH V2 09/12] fs: Prevent mode change if file is mmap'ed ira.weiny
    2020-01-13 22:22 ` Darrick J. Wong
    2020-01-14  0:46 ` Ira Weiny
    2020-01-14  1:30 ` Darrick J. Wong
    2020-01-14 17:53 ` Ira Weiny
    2020-01-15 11:34 ` Jan Kara
    2020-01-15 18:24 ` Ira Weiny
    2020-01-15 10:21 ` David Laight
    2020-01-15 17:53 ` Ira Weiny
2020-01-10 19:29 ` [RFC PATCH V2 10/12] fs/xfs: Fix truncate up ira.weiny
    2020-01-13 22:27 ` Darrick J. Wong
    2020-01-14  0:40 ` Ira Weiny
    2020-01-14  1:14 ` Darrick J. Wong
    2020-01-14 19:00 ` Ira Weiny
    2020-01-14 19:39 ` Ira Weiny
2020-01-10 19:29 ` [RFC PATCH V2 11/12] fs/xfs: Clean up locking in dax invalidate ira.weiny
2020-01-10 19:29 ` [RFC PATCH V2 12/12] fs/xfs: Allow toggle of effective DAX flag ira.weiny

Linux-XFS Archive on lore.kernel.org