All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Theodore Y. Ts'o" <tytso@mit.edu>
To: Jan Kara <jack@suse.cz>
Cc: harshad shirwadkar <harshadshirwadkar@gmail.com>,
	Ext4 Developers List <linux-ext4@vger.kernel.org>,
	Jayashree <jaya@cs.utexas.edu>,
	vijay@cs.utexas.edu
Subject: Re: [PATCH v10 5/9] ext4: main fast-commit commit path
Date: Tue, 27 Oct 2020 14:45:07 -0400	[thread overview]
Message-ID: <20201027184507.GD5691@mit.edu> (raw)
In-Reply-To: <20201027142910.GB16090@quack2.suse.cz>

On Tue, Oct 27, 2020 at 03:29:10PM +0100, Jan Kara wrote:
> 
> OK, I see. Maybe add a paragraph about this to fastcommit doc? I agree that
> we can leave these optimizations for later, I was just wondering whether
> there isn't some fundamental reason why global flush would be required and
> I'm happy to hear that there isn't.
> 
> The advantage of soft_consistency as you call it would be IMO most seen if
> there's relatively heavy non-fsync IO load in parallel with frequent fsyncs
> of a tiny file. And such load is not infrequent in practice. I agree that
> benchmarks like dbench are unlikely to benefit from soft_consistency since
> all IO the benchmark does is in fact forced by fsync.
> 
> I also think that with soft_consistency we could benefit (e.g. on SSD
> storage) from having several fast-commit areas in the journal so multiple
> fastcommits can run in parallel. But that's also for some later
> experimentation...

Right, so this is the reason why I wasn't super-excited by the
proposal to document crash recovery semantics in Linux file systems
proposed by Jayashree Mohan and Prof. Vijay Chidambaram last year[1].  I
knew that we were planning the Fast Commit work (Jayashree and Vijay,
this is a simplified version of the proposal made by Park and Shin in
their iJournaling paper[2]) and having something document that an
fsync(2) to one file guarantees that changes made to some other file
that were made "earlier" would disallow this particular optimization.

[1] http://lore.kernel.org/r/1552418820-18102-1-git-send-email-jaya@cs.utexas.edu
[2] https://www.usenix.org/conference/atc17/technical-sessions/presentation/park

That being said, I was afraid that there *were* applications that
might be (wrongly) making this assumption, even though it wasn't
guaranteed by POSIX.  So when it didn't make much difference for
benchmarks, and given that our original goal was to speed up NFS file
serving, where every single NFS RPC has to be synchronous before an
acknowledgement is sent back to the client, we decided to take the
conservative path --- at least for now.

I do agree with you that I can certainly think of workloads where not
requiring entanglement of unrelated file writes via fsync(2) could be
a huge performance win.

One of the things that I did discuss with Harshad was using some
hueristics, where if there are two "unrelated" applications (e.g.,
different session id, or process group leader, or different uid,
etc. --- details to be determined layer), we would not entangele
writes to unrelated files via fsync(2), while forcing files written by
the same application to share fate with one another even if only file
is fsync'ed.  Hopefully, this would head off the possibility of
another O_PONIES[3] controversy while still giving most of the
benefits of not making fsync(2) a global file system barrier.  It
would be hell to document in a standards specification, so the
official rule would still be "fsync(2) only applies to the single
file, and anything else is an accident of the implementation", per
POSIX.

[3] https://lwn.net/Articles/322823/

I still think the right answer is a new system call which takes an
array of file descriptors, so the application can explicitly declare
which set of files can be reliably fsync'ed in the same transaction
commit.  The downside is that this would require applications to
change what they are doing, and it would take the better part of a
decade before we could assume well-written applications are explicitly
declaring their crash recovery needs.

					- Ted

  parent reply	other threads:[~2020-10-27 18:46 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-15 20:37 [PATCH v10 0/9] Add fast commits in Ext4 file system Harshad Shirwadkar
2020-10-15 20:37 ` [PATCH v10 1/9] doc: update ext4 and journalling docs to include fast commit feature Harshad Shirwadkar
2020-10-21 16:04   ` Jan Kara
2020-10-21 17:25     ` harshad shirwadkar
2020-10-22 13:06       ` Jan Kara
2020-10-15 20:37 ` [PATCH v10 2/9] ext4: add fast_commit feature and handling for extended mount options Harshad Shirwadkar
2020-10-21 16:18   ` Jan Kara
2020-10-21 17:31     ` harshad shirwadkar
2020-10-22 13:09       ` Jan Kara
2020-10-26 16:40         ` harshad shirwadkar
2020-10-15 20:37 ` [PATCH v10 3/9] ext4 / jbd2: add fast commit initialization Harshad Shirwadkar
2020-10-21 20:00   ` Jan Kara
2020-10-29 23:28     ` harshad shirwadkar
2020-10-30 15:40       ` Jan Kara
2020-10-15 20:37 ` [PATCH v10 4/9] jbd2: add fast commit machinery Harshad Shirwadkar
2020-10-22 10:16   ` Jan Kara
2020-10-23 17:17     ` harshad shirwadkar
2020-10-26  9:03       ` Jan Kara
2020-10-26 16:34         ` harshad shirwadkar
2020-10-15 20:37 ` [PATCH v10 5/9] ext4: main fast-commit commit path Harshad Shirwadkar
2020-10-23 10:30   ` Jan Kara
2020-10-26 20:55     ` harshad shirwadkar
2020-10-27 14:29       ` Jan Kara
2020-10-27 17:38         ` harshad shirwadkar
2020-10-30 15:28           ` Jan Kara
2020-10-30 16:45             ` harshad shirwadkar
2020-11-03 10:04               ` Jan Kara
2020-11-03 18:31                 ` harshad shirwadkar
2020-10-27 18:45         ` Theodore Y. Ts'o [this message]
2020-10-15 20:37 ` [PATCH v10 6/9] jbd2: fast commit recovery path Harshad Shirwadkar
2020-10-15 20:37 ` [PATCH v10 7/9] ext4: " Harshad Shirwadkar
2020-10-15 20:38 ` [PATCH v10 8/9] ext4: add a mount opt to forcefully turn fast commits on Harshad Shirwadkar
2020-10-15 20:38 ` [PATCH v10 9/9] ext4: add fast commit stats in procfs Harshad Shirwadkar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201027184507.GD5691@mit.edu \
    --to=tytso@mit.edu \
    --cc=harshadshirwadkar@gmail.com \
    --cc=jack@suse.cz \
    --cc=jaya@cs.utexas.edu \
    --cc=linux-ext4@vger.kernel.org \
    --cc=vijay@cs.utexas.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.