From: "Theodore Ts'o" <tytso@mit.edu>
To: Dave Chinner <david@fromorbit.com>
Cc: Jeremy Bongio <bongiojp@gmail.com>,
	"Darrick J . Wong" <djwong@kernel.org>,
	Allison Henderson <allison.henderson@oracle.com>,
	linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-xfs@vger.kernel.org
Subject: Re: [PATCH 0/1] iomap regression for aio dio 4k writes
Date: Thu, 22 Jun 2023 22:32:33 -0400
Message-ID: <20230623023233.GC34229@mit.edu>
In-Reply-To: <ZJOO4SobNFaQ+C5g@dread.disaster.area>

On Thu, Jun 22, 2023 at 09:59:29AM +1000, Dave Chinner wrote:
> Ah, you are testing pure overwrites, which means for ext4 the only
> thing it needs to care about is cached mappings. What happens when
> you add O_DSYNC here?

I think you mean O_SYNC, right?  In a pure overwrite case, where all
of the extents are initialized and where the Oracle or DB2 server is
doing writes to preallocated, pre-initialized space in the tablespace
file followed by fdatasync(), there *are* no post-I/O data integrity
operations which are required.
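
A minimal userspace sketch of the workload pattern described above, offered as an illustration rather than anything taken from the thread: a file is preallocated and initialized once, and is later overwritten in place with O_DIRECT followed by fdatasync(2). The helper names `prepare_file` and `overwrite_block` and the 4k `DIO_BLOCK` constant are hypothetical. It uses synchronous pwrite(2) for brevity, whereas the thread concerns AIO, and it uses posix_fallocate(3) in place of a direct fallocate(2) call for portability.

```c
#define _GNU_SOURCE             /* for O_DIRECT */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define DIO_BLOCK 4096          /* hypothetical I/O size, matching the 4k writes in the subject */

/* Preallocate and zero-fill nblocks blocks so that later writes are pure overwrites. */
static int prepare_file(const char *path, int nblocks)
{
    static const char zeros[DIO_BLOCK];
    int fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0600);

    if (fd < 0)
        return -1;

    /* posix_fallocate() returns an error number rather than setting errno. */
    int err = posix_fallocate(fd, 0, (off_t)nblocks * DIO_BLOCK);
    if (err != 0) {
        close(fd);
        errno = err;
        return -1;
    }

    /* Writing every block initializes the preallocated space. */
    for (int i = 0; i < nblocks; i++) {
        if (pwrite(fd, zeros, DIO_BLOCK, (off_t)i * DIO_BLOCK) != DIO_BLOCK) {
            close(fd);
            return -1;
        }
    }
    if (fsync(fd) != 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}

/* Overwrite one already-initialized block in place, then call fdatasync(). */
static int overwrite_block(const char *path, int block, unsigned char fill)
{
    void *buf;
    int fd;

    assert(block >= 0);

    fd = open(path, O_WRONLY | O_DIRECT);
    if (fd < 0 && errno == EINVAL)
        fd = open(path, O_WRONLY);  /* assumption: fall back where O_DIRECT is unsupported */
    if (fd < 0)
        return -1;

    /* O_DIRECT requires an aligned buffer, offset, and length. */
    if (posix_memalign(&buf, DIO_BLOCK, DIO_BLOCK) != 0) {
        close(fd);
        return -1;
    }
    memset(buf, fill, DIO_BLOCK);

    ssize_t n = pwrite(fd, buf, DIO_BLOCK, (off_t)block * DIO_BLOCK);
    int rc = (n == DIO_BLOCK && fdatasync(fd) == 0) ? 0 : -1;

    free(buf);
    close(fd);
    return rc;
}
```

Because the file size and block allocation do not change in `overwrite_block`, the write leaves only the data itself to be made durable, which is the case the paragraph above has in mind.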

If the file is opened O_SYNC or if the blocks were not preallocated
using fallocate(2) and not initialized ahead of time, then sure, we
can't use this optimization.

However, the cases where database workloads *are* doing overwrites
and using fdatasync(2) most certainly do exist, and the benefit of
this optimization can be a 20% throughput improvement.  Which is
nothing to sneeze at.

What we might do is let the file system tell the iomap layer via a
flag that no post-I/O metadata operations are required, and then *if*
that flag is set, and *if* the inode has no pages in the page cache
(so there are no invalidate operations necessary), it should be safe
to skip using queue_work().  That way,
the file system has to affirmatively state that it is safe to skip the
workqueue, so it shouldn't do any harm to other file systems using the
iomap DIO layer.
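
The proposal above can be modeled as a small decision function. This is a userspace sketch only, not kernel code and not the patch under discussion: the flag `SKETCH_DIO_NO_POST_IO_METADATA`, the `struct sketch_dio` fields, and `sketch_pick_completion` are all hypothetical names invented for illustration. It shows the two conditions being combined so that deferring to the workqueue stays the default.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag: the file system affirms that this write needs no
 * post-I/O metadata operations. */
#define SKETCH_DIO_NO_POST_IO_METADATA (1u << 0)

/* Hypothetical stand-in for the state the DIO layer would consult. */
struct sketch_dio {
    unsigned int fs_flags;          /* flags supplied by the file system */
    unsigned long nr_cached_pages;  /* pages the inode has in the page cache */
};

enum sketch_completion {
    SKETCH_COMPLETE_INLINE,     /* finish without deferring */
    SKETCH_COMPLETE_WORKQUEUE   /* defer, as queue_work() would */
};

/* Skip the workqueue only when both conditions hold; otherwise defer. */
static enum sketch_completion
sketch_pick_completion(const struct sketch_dio *dio)
{
    assert(dio != NULL);

    bool fs_opted_in = (dio->fs_flags & SKETCH_DIO_NO_POST_IO_METADATA) != 0;
    bool no_cached_pages = dio->nr_cached_pages == 0;

    if (fs_opted_in && no_cached_pages)
        return SKETCH_COMPLETE_INLINE;

    return SKETCH_COMPLETE_WORKQUEUE;
}
```

A file system that never sets the flag always gets `SKETCH_COMPLETE_WORKQUEUE` in this model, which mirrors the opt-in property the paragraph above relies on.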

What am I missing?

Cheers,

						- Ted
