Linux-Fsdevel Archive on lore.kernel.org
From: Christoph Hellwig <hch@lst.de>
To: Dave Chinner <david@fromorbit.com>
Cc: Dan Williams <dan.j.williams@intel.com>,
	Christoph Hellwig <hch@lst.de>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-nvdimm <linux-nvdimm@ml01.01.org>,
	XFS Developers <xfs@oss.sgi.com>
Subject: Re: xfs: untangle the direct I/O and DAX path, fix DAX locking
Date: Fri, 24 Jun 2016 09:31:09 +0200
Message-ID: <20160624073109.GB22205@lst.de> (raw)
In-Reply-To: <20160624071318.GE12670@dastard>

On Fri, Jun 24, 2016 at 05:13:18PM +1000, Dave Chinner wrote:
> This is a POSIX compliant fsync() implementation:
> 
> int fsync(int fd)
> {
> 	return 0;
> }

Depends on what you mean by "Posix".  Modern Posix, which includes
XPG, has the _POSIX_SYNCHRONIZED_IO option, which Linux implements.  For
that option Posix says about fsync:

    [SIO] [Option Start] If _POSIX_SYNCHRONIZED_IO is defined, the fsync()
    function shall force all currently queued I/O operations associated with
    the file indicated by file descriptor fildes to the synchronized I/O
    completion state. All I/O operations shall be completed as defined for
    synchronized I/O file integrity completion. [Option End]


Synchronized I/O file integrity completion, in turn, is defined on top
of data integrity completion:

     3.378 Synchronized I/O Data Integrity Completion

     For read, when the operation has been completed or diagnosed if
     unsuccessful. The read is complete only when an image of the data has been
     successfully transferred to the requesting process. If there were any
     pending write requests affecting the data to be read at the time that the
     synchronized read operation was requested, these write requests are
     successfully transferred prior to reading the data.

     For write, when the operation has been completed or diagnosed if
     unsuccessful. The write is complete only when the data specified in the
     write request is successfully transferred and all file system information
     required to retrieve the data is successfully transferred.

     File attributes that are not necessary for data retrieval (access time,
     modification time, status change time) need not be successfully
     transferred prior to returning to the calling process.

     3.379 Synchronized I/O File Integrity Completion

     Identical to a synchronized I/O data integrity completion with the
     addition that all file attributes relative to the I/O operation (including
     access time, modification time, status change time) are successfully
     transferred prior to returning to the calling process.


So in this case Posix very much requires data to be on a stable
medium.
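
To make that concrete, here is a minimal sketch of what an application
relying on these semantics might look like.  The helper names
has_synchronized_io() and write_durably() are made up for illustration
and are not taken from any patch in this series; the sketch only assumes
the standard open/write/fsync/fdatasync/sysconf interfaces.  fdatasync()
asks for data integrity completion, fsync() for file integrity
completion:

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: nonzero if the system reports the SIO option. */
int has_synchronized_io(void)
{
	return sysconf(_SC_SYNCHRONIZED_IO) > 0;
}

/*
 * Hypothetical helper: write len bytes to path and return only after the
 * kernel reports the requested completion state.
 *
 * file_integrity != 0: fsync(), i.e. file integrity completion
 * file_integrity == 0: fdatasync(), i.e. data integrity completion
 *
 * Returns 0 on success, -1 on failure with errno set.
 */
int write_durably(const char *path, const void *buf, size_t len,
		  int file_integrity)
{
	const char *p = buf;
	int fd, ret, saved;

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return -1;

	/* write() may be short or interrupted, so loop until done. */
	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			goto fail;
		}
		p += n;
		len -= (size_t)n;
	}

	/* An error here means the data may NOT be on stable storage. */
	ret = file_integrity ? fsync(fd) : fdatasync(fd);
	if (ret < 0)
		goto fail;

	return close(fd);

fail:
	saved = errno;
	close(fd);
	errno = saved;
	return -1;
}
```

With the "return 0" fsync() quoted above, a caller of write_durably()
could not tell a durable write from one still sitting in volatile
caches, which is exactly what the [SIO] text rules out.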

> The POSIX exclusive write requirement is a different case. No linux
> filesystem except XFS has ever met that requirement (in 20 something
> years), yet I don't see applications falling over with corrupt data
> from non-exclusive writes all the time, nor do I see application
> developers shouting at us to provide it. i.e. reality tells us this
> isn't a POSIX behaviour that applications rely on because everyone
> implements it differently.

Every file system excludes writes from other writes.
