Linux-Fsdevel Archive on
 help / color / Atom feed
From: Boaz Harrosh <>
To: Christoph Hellwig <>, Boaz Harrosh <>
Cc: Dave Chinner <>,,,
Subject: Re: xfs: untangle the direct I/O and DAX path, fix DAX locking
Date: Wed, 29 Jun 2016 15:23:48 +0300
Message-ID: <> (raw)
In-Reply-To: <>

On 06/28/2016 06:39 PM, Christoph Hellwig wrote:
> On Tue, Jun 28, 2016 at 04:56:30PM +0300, Boaz Harrosh wrote:
>> Actually with O_APPEND each write request should write a different region
>> of the file, there are no overlapping writes. And no issue of which version of the
>> write came last and got its data written.
> You have one fd for multiple threads or processes (it doesn't matter if
> you're using O_APPEND or not), and all of them write to it.

Yes so? but they do not write to the same (overlapping) region of the
file, each thread usually writes to his own record.

> i_size is only updated once the write finishes, so having multiple
> concurrent writes will mean multiple records go into the same regions.
> Now to be fair in current XFS writes beyond i_size will always take
> the lock exclusively, so for this case we will not get concurrent
> writes and thus data corruption anyway.  

Exactly that I understand. And it must be so.

> But if you have a cycling
> log that gets overwritten (say a database journal) we're back to
> square one.

No! In this "cycling log" case the application (the DB) has an Head and a Tail
pointers each thread grabs the next available record and writes to it. The IO
is not overlapping, each thread writes to his own record, and even if they
write at the same time they do not over-write each other. As long as they
properly sync on the Head pointer the write itself can happen in parallel. This is not a
good example and will work perfectly well with the old (current) DAX code.
(Even if such records where 4k aligned)

>> I still don't see how an application can use the fact that two writers
>> will not give them mixed records. And surly it does not work on a shared
>> FS. So I was really wondering if you know of any such app
> If it doesn't work for two threads using the same fd on a shared fs
> the fs is broken.

What works? the above cycling log, sure it will work, also on current
dax code.

I wish you could write a test to demonstrate this bug. Sorry for my
slowness but I don't see it. I do not see how the fact that there is
only a single memcpy in progress can help an application.
Yes sure isize update must be synced, which it is in current code.

Thanks Christoph, I've taken too much of your time, I guess the QA
will need to bang every possible application and see what breaks when
multiple parallel writers are allowed on a single file. So far for
whatever I use in a VM it all works just the same.


      reply index

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-06-22 15:27 Christoph Hellwig
2016-06-22 15:27 ` [PATCH 1/8] xfs: don't pass ioflags around in the ioctl path Christoph Hellwig
2016-06-22 15:27 ` [PATCH 2/8] xfs: kill ioflags Christoph Hellwig
2016-06-22 15:27 ` [PATCH 3/8] xfs: remove s_maxbytes enforcement in xfs_file_read_iter Christoph Hellwig
2016-06-22 15:27 ` [PATCH 4/8] xfs: split xfs_file_read_iter into buffered and direct I/O helpers Christoph Hellwig
2016-06-22 15:27 ` [PATCH 5/8] xfs: stop using generic_file_read_iter for direct I/O Christoph Hellwig
2016-06-22 15:27 ` [PATCH 6/8] xfs: direct calls in the direct I/O path Christoph Hellwig
2016-06-22 15:27 ` [PATCH 7/8] xfs: split direct I/O and DAX path Christoph Hellwig
2016-09-29  2:53   ` Darrick J. Wong
2016-09-29  8:38     ` aio completions vs file_accessed race, was: " Christoph Hellwig
2016-09-29 20:18       ` Christoph Hellwig
2016-09-29 20:18         ` Christoph Hellwig
2016-09-29 20:33           ` Darrick J. Wong
2016-06-22 15:27 ` [PATCH 8/8] xfs: fix locking for DAX writes Christoph Hellwig
2016-06-23 14:22   ` Boaz Harrosh
2016-06-23 23:24 ` xfs: untangle the direct I/O and DAX path, fix DAX locking Dave Chinner
2016-06-24  1:14   ` Dan Williams
2016-06-24  7:13     ` Dave Chinner
2016-06-24  7:31       ` Christoph Hellwig
2016-06-24  7:26   ` Christoph Hellwig
2016-06-24 23:00     ` Dave Chinner
2016-06-28 13:10       ` Christoph Hellwig
2016-06-28 13:27         ` Boaz Harrosh
2016-06-28 13:39           ` Christoph Hellwig
2016-06-28 13:56             ` Boaz Harrosh
2016-06-28 15:39               ` Christoph Hellwig
2016-06-29 12:23                 ` Boaz Harrosh [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \ \ \ \ \ \ \ \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-Fsdevel Archive on

Archives are clonable:
	git clone --mirror linux-fsdevel/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-fsdevel linux-fsdevel/ \
	public-inbox-index linux-fsdevel

Example config snippet for mirrors

Newsgroup available over NNTP:

AGPL code for this site: git clone