Linux-XFS Archive on lore.kernel.org
From: Ira Weiny <ira.weiny@intel.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Jan Kara <jack@suse.cz>,
	linux-kernel@vger.kernel.org,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Dan Williams <dan.j.williams@intel.com>,
	Dave Chinner <david@fromorbit.com>,
	Christoph Hellwig <hch@lst.de>,
	"Theodore Y. Ts'o" <tytso@mit.edu>,
	linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org
Subject: Re: [RFC PATCH V2 01/12] fs/stat: Define DAX statx attribute
Date: Wed, 15 Jan 2020 11:45:13 -0800
Message-ID: <20200115194512.GF23311@iweiny-DESK2.sc.intel.com>
In-Reply-To: <20200115173834.GD8247@magnolia>

On Wed, Jan 15, 2020 at 09:38:34AM -0800, Darrick J. Wong wrote:
> On Wed, Jan 15, 2020 at 12:37:15PM +0100, Jan Kara wrote:
> > On Fri 10-01-20 11:29:31, ira.weiny@intel.com wrote:
> > > From: Ira Weiny <ira.weiny@intel.com>
> > > 
> > > In order for users to determine if a file is currently operating in DAX
> > > mode (effective DAX).  Define a statx attribute value and set that
> > > attribute if the effective DAX flag is set.
> > > 
> > > To go along with this we propose the following addition to the statx man
> > > page:
> > > 
> > > STATX_ATTR_DAX
> > > 
> > > 	DAX (cpu direct access) is a file mode that attempts to minimize
> 
> "..is a file I/O mode"?

or  "... is a file state ..."?
 
> > > 	software cache effects for both I/O and memory mappings of this
> > > 	file.  It requires a capable device, a compatible filesystem
> > > 	block size, and filesystem opt-in.
> 
> "...a capable storage device..."

Done

> 
> What does "compatible fs block size" mean?  How does the user figure out
> if their fs blocksize is compatible?  Do we tell users to refer their
> filesystem's documentation here?

Perhaps this should not be in the man page at all?  Would it be better to
assume the file system and block device have already been configured properly
by the admin?

The blocksize restrictions are already well documented elsewhere, e.g.:

https://www.kernel.org/doc/Documentation/filesystems/dax.txt

How about changing the text to:

	It requires a block device and file system which have been configured
	to support DAX.

?

> 
> > > It generally assumes all
> > > 	accesses are via cpu load / store instructions which can
> > > 	minimize overhead for small accesses, but adversely affect cpu
> > > 	utilization for large transfers.
> 
> Will this always be true for persistent memory?

I'm not sure I follow.  Did you mean "this" == adverse utilization for large
transfers?

> 
> I wasn't even aware that large transfers adversely affected CPU
> utilization. ;)

Sure, compared with using a DMA engine, for example.

> 
> > >  File I/O is done directly
> > > 	to/from user-space buffers. While the DAX property tends to
> > > 	result in data being transferred synchronously it does not give
> 
> "...transferred synchronously, it does not..."

done.

> 
> > > 	the guarantees of synchronous I/O that data and necessary
> 
> "...it does not guarantee that I/O or file metadata have been flushed to
> the storage device."

The lack of guarantee here is mainly regarding metadata.

How about:

        While the DAX property tends to result in data being transferred
        synchronously, it does not give the same guarantees of
        synchronous I/O where data and the necessary metadata are
        transferred together.

> 
> > > 	metadata are transferred. Memory mapped I/O may be performed
> > > 	with direct mappings that bypass system memory buffering.
> 
> "...with direct memory mappings that bypass kernel page cache."

Done.

> 
> > > Again
> > > 	while memory-mapped I/O tends to result in data being
> 
> I would move the sentence about "Memory mapped I/O..." to directly after
> the sentence about file I/O being done directly to and from userspace so
> that you don't need to repeat this statement.

Done.

> 
> > > 	transferred synchronously it does not guarantee synchronous
> > > 	metadata updates. A dax file may optionally support being mapped
> > > 	with the MAP_SYNC flag which does allow cpu store operations to
> > > 	be considered synchronous modulo cpu cache effects.
> 
> How does one detect or work around or deal with "cpu cache effects"?  I
> assume some sort of CPU cache flush instruction is what is meant here,
> but I think we could mention the basics of what has to be done here:
> 
> "A DAX file may support being mapped with the MAP_SYNC flag, which
> enables a program to use CPU cache flush operations to persist CPU store
> operations without an explicit fsync(2).  See mmap(2) for more
> information."?

That sounds better.  I like the reference to mmap as well.
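
For anyone following along, here is a rough userspace sketch of what the
MAP_SYNC wording implies.  It is not part of this patch.  The helper name
map_prefer_sync() is hypothetical, and the fallback #defines are assumptions
for builds against older libc headers (the values are the generic Linux ones).
mmap(2) documents that MAP_SYNC is only accepted together with
MAP_SHARED_VALIDATE and fails with EOPNOTSUPP on files that do not support
DAX, so the sketch falls back to a plain shared mapping in that case:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Assumed fallbacks for older headers; generic Linux values. */
#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x03
#endif
#ifndef MAP_SYNC
#define MAP_SYNC 0x080000
#endif

/*
 * Hypothetical helper: try a MAP_SYNC mapping first, then fall back
 * to an ordinary shared mapping.
 *
 * On return, *is_sync is 1 only if the MAP_SYNC mapping succeeded.
 * In that case CPU cache flushes are enough to persist stores.
 * Otherwise the caller still needs msync(2)/fsync(2).
 * Returns MAP_FAILED if neither mapping could be created.
 */
void *map_prefer_sync(int fd, size_t len, int *is_sync)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);

	if (p != MAP_FAILED) {
		*is_sync = 1;
		return p;
	}

	/* Not a DAX file (or an old kernel): plain shared mapping. */
	*is_sync = 0;
	return mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}
```

The cache flush step itself is architecture specific and is deliberately left
out of the sketch.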

OK, I changed a couple of other things as well.  How does this sound?


STATX_ATTR_DAX 

        DAX (cpu direct access) is a file mode that attempts to minimize
        software cache effects for both I/O and memory mappings of this
        file.  It requires a block device and file system which have
        been configured to support DAX.

        DAX generally assumes all accesses are via cpu load / store
        instructions which can minimize overhead for small accesses, but
        may adversely affect cpu utilization for large transfers.

        File I/O is done directly to/from user-space buffers and memory
        mapped I/O may be performed with direct memory mappings that
        bypass kernel page cache.

        While the DAX property tends to result in data being transferred
        synchronously, it does not give the same guarantees of
        synchronous I/O where data and the necessary metadata are
        transferred together.

        A DAX file may support being mapped with the MAP_SYNC flag,
        which enables a program to use CPU cache flush operations to
        persist CPU store operations without an explicit fsync(2).  See
        mmap(2) for more information.
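
As a rough illustration of how userspace might consume the attribute, here is
a minimal sketch.  It is not part of the patch.  The helper name file_is_dax()
is hypothetical, it assumes a glibc that provides the statx() wrapper, and the
fallback #define is an assumption for headers that predate the attribute.
Note that the fallback uses the value that, as far as I know, mainline headers
ended up with, which differs from the 0x00002000 proposed in this RFC:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <fcntl.h>	/* AT_FDCWD */
#include <sys/stat.h>	/* statx(), struct statx */

/*
 * Assumed fallback for old headers.  This RFC proposes 0x00002000;
 * the value below is the one believed to be used by mainline.
 */
#ifndef STATX_ATTR_DAX
#define STATX_ATTR_DAX 0x00200000
#endif

/*
 * Hypothetical helper.
 * Returns 1 if the kernel reports the file as currently in DAX
 * state, 0 if it does not, and -1 if statx() itself failed.
 *
 * A result of 0 cannot distinguish "not DAX" from "kernel does not
 * know about the attribute".
 */
int file_is_dax(const char *path)
{
	struct statx stx;

	if (statx(AT_FDCWD, path, 0, STATX_BASIC_STATS, &stx) != 0)
		return -1;

	return (stx.stx_attributes & STATX_ATTR_DAX) ? 1 : 0;
}
```

This reports the effective state only.  It says nothing about whether the
file could be switched into DAX mode.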


Ira

> 
> Oof, a paragraph break would be nice. :)
> 
> --D
> 
> > > 
> > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > 
> > This looks good to me. You can add:
> > 
> > Reviewed-by: Jan Kara <jack@suse.cz>
> > 
> > 								Honza
> > 
> > > ---
> > >  fs/stat.c                 | 3 +++
> > >  include/uapi/linux/stat.h | 1 +
> > >  2 files changed, 4 insertions(+)
> > > 
> > > diff --git a/fs/stat.c b/fs/stat.c
> > > index 030008796479..894699c74dde 100644
> > > --- a/fs/stat.c
> > > +++ b/fs/stat.c
> > > @@ -79,6 +79,9 @@ int vfs_getattr_nosec(const struct path *path, struct kstat *stat,
> > >  	if (IS_AUTOMOUNT(inode))
> > >  		stat->attributes |= STATX_ATTR_AUTOMOUNT;
> > >  
> > > +	if (IS_DAX(inode))
> > > +		stat->attributes |= STATX_ATTR_DAX;
> > > +
> > >  	if (inode->i_op->getattr)
> > >  		return inode->i_op->getattr(path, stat, request_mask,
> > >  					    query_flags);
> > > diff --git a/include/uapi/linux/stat.h b/include/uapi/linux/stat.h
> > > index ad80a5c885d5..e5f9d5517f6b 100644
> > > --- a/include/uapi/linux/stat.h
> > > +++ b/include/uapi/linux/stat.h
> > > @@ -169,6 +169,7 @@ struct statx {
> > >  #define STATX_ATTR_ENCRYPTED		0x00000800 /* [I] File requires key to decrypt in fs */
> > >  #define STATX_ATTR_AUTOMOUNT		0x00001000 /* Dir: Automount trigger */
> > >  #define STATX_ATTR_VERITY		0x00100000 /* [I] Verity protected file */
> > > +#define STATX_ATTR_DAX			0x00002000 /* [I] File is DAX */
> > >  
> > >  
> > >  #endif /* _UAPI_LINUX_STAT_H */
> > > -- 
> > > 2.21.0
> > > 
> > -- 
> > Jan Kara <jack@suse.com>
> > SUSE Labs, CR

