linux-fsdevel.vger.kernel.org archive mirror
From: Chris Mason <clm@fb.com>
To: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@infradead.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	"linux-api@vger.kernel.org" <linux-api@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] io_uring: add support for barrier fsync
Date: Tue, 9 Apr 2019 18:42:55 +0000
Message-ID: <5BF7FDDE-212E-4F9A-9B50-26BDA99E952A@fb.com>
In-Reply-To: <5f8d9644-9e8f-c9d2-611e-4b144c62539c@kernel.dk>

On 9 Apr 2019, at 14:23, Jens Axboe wrote:

> On 4/9/19 12:17 PM, Christoph Hellwig wrote:
>> On Tue, Apr 09, 2019 at 10:27:43AM -0600, Jens Axboe wrote:
>>> It's a quite common use case to issue a bunch of writes, then an fsync
>>> or fdatasync when they complete. Since io_uring doesn't guarantee any
>>> type of ordering, the application must track issued writes and wait
>>> with the fsync issue until they have completed.
>>>
>>> Add an IORING_FSYNC_BARRIER flag that helps with this so the application
>>> doesn't have to do this manually. If this flag is set for the fsync
>>> request, we won't issue it until pending IO has already completed.
>>
>> I think we need a much more detailed explanation of the semantics,
>> preferably in man page format.
>>
>> Barrier at least in Linux traditionally means all previously submitted
>> requests have finished and no new ones are started until the
>> barrier request finishes, which is very heavy handed.  Is that what
>> this is supposed to do?  If not what are the exact guarantees vs
>> ordering and or barrier semantics?
>
> The patch description isn't that great, and maybe the naming isn't that
> intuitive either. The way it's implemented, the fsync will NOT be issued
> until previously issued IOs have completed. That means both reads and
> writes, since there's no way to wait for just one.  In terms of
> semantics, any previously submitted writes will have completed before
> this fsync is issued. The barrier fsync has no ordering wrt future
> writes, no ordering is implied there. Hence:
>
> W1, W2, W3, FSYNC_W_BARRIER, W4, W5
>
> W1..3 will have been completed by the hardware side before we start
> FSYNC_W_BARRIER. We don't wait with issuing W4..5 until after the fsync
> completes, no ordering is provided there.
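
To make the quoted ordering concrete, here is a minimal userspace model of it
in plain C. It is only a sketch and does not use io_uring or any kernel
interface: struct req, can_issue(), simulate() and run_example() are
hypothetical names, the "barrier" field merely stands in for the proposed
IORING_FSYNC_BARRIER flag, and the round-based completion (everything issued
in a round completes in that round) is a simplifying assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

enum req_state { REQ_QUEUED, REQ_ISSUED, REQ_DONE };

struct req {
    const char *name;
    bool barrier;          /* hypothetical stand-in for IORING_FSYNC_BARRIER */
    enum req_state state;
    int issue_round;       /* round in which the request was issued, -1 if never */
};

/* A barrier request may be issued only once every earlier request is done.
 * Requests without the flag are never held back, not even by a deferred
 * barrier that was submitted before them. */
static bool can_issue(const struct req *reqs, int idx)
{
    if (!reqs[idx].barrier)
        return true;
    for (int i = 0; i < idx; i++)
        if (reqs[i].state != REQ_DONE)
            return false;
    return true;
}

/* Run issue/complete rounds until every request is done.
 * Returns the number of rounds that were needed. */
static int simulate(struct req *reqs, int n)
{
    int round = 0, remaining = n;

    while (remaining > 0) {
        int completed = 0;

        /* Issue phase: walk the queue in submission order. */
        for (int i = 0; i < n; i++) {
            if (reqs[i].state == REQ_QUEUED && can_issue(reqs, i)) {
                reqs[i].state = REQ_ISSUED;
                reqs[i].issue_round = round;
            }
        }
        /* Completion phase: assume all in-flight requests finish. */
        for (int i = 0; i < n; i++) {
            if (reqs[i].state == REQ_ISSUED) {
                reqs[i].state = REQ_DONE;
                completed++;
            }
        }
        assert(completed > 0);  /* every round must make progress */
        remaining -= completed;
        round++;
    }
    return round;
}

/* Models W1, W2, W3, FSYNC_W_BARRIER, W4, W5 and prints the round in which
 * each request was issued. Returns the issue round of the barrier fsync. */
int run_example(void)
{
    struct req reqs[] = {
        { "W1",              false, REQ_QUEUED, -1 },
        { "W2",              false, REQ_QUEUED, -1 },
        { "W3",              false, REQ_QUEUED, -1 },
        { "FSYNC_W_BARRIER", true,  REQ_QUEUED, -1 },
        { "W4",              false, REQ_QUEUED, -1 },
        { "W5",              false, REQ_QUEUED, -1 },
    };
    int n = (int)(sizeof(reqs) / sizeof(reqs[0]));

    simulate(reqs, n);
    for (int i = 0; i < n; i++)
        printf("%-16s issued in round %d\n", reqs[i].name, reqs[i].issue_round);
    return reqs[3].issue_round;
}
```

Calling run_example() from a main() shows the fsync being deferred by one
round while W4 and W5 go out together with W1..3. Nothing in the model is
fsync-specific: the flag could be set on any entry in the array.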

Looking at the patch, why is fsync special?  It seems like you could add
this ordering bit to any write.

While you're here, do you want to add a way to request FUA writes or
cache flushes?  That is basically the rest of what userland would need
to build its own write-back-cache-safe implementation.

-chris

Thread overview: 7+ messages
2019-04-09 16:27 [PATCH] io_uring: add support for barrier fsync Jens Axboe
2019-04-09 18:17 ` Christoph Hellwig
2019-04-09 18:23   ` Jens Axboe
2019-04-09 18:42     ` Chris Mason [this message]
2019-04-09 18:46       ` Jens Axboe
2019-04-09 18:56         ` Chris Mason
2019-04-11 11:05         ` Dave Chinner