linux-fsdevel.vger.kernel.org archive mirror
From: Chinmay V S <cvs268@gmail.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: Stefan Priebe - Profihost AG <s.priebe@profihost.ag>,
	linux-fsdevel@vger.kernel.org, Al Viro <viro@zeniv.linux.org.uk>,
	LKML <linux-kernel@vger.kernel.org>,
	matthew@wil.cx
Subject: Re: Why is O_DSYNC on linux so slow / what's wrong with my SSD?
Date: Wed, 20 Nov 2013 19:04:15 +0530
Message-ID: <CAK-9PRAManphkxT3ub0DfW8hx=xbq+ZeqUB0E0CEnFTfF7AQuw@mail.gmail.com>
In-Reply-To: <20131120125446.GA6284@infradead.org>

Hi Stefan,

Christoph is exactly right. To elaborate, here is what is happening
in the case above:
By using the DIRECT and SYNC/DSYNC flags on a block device (i.e.
bypassing the filesystem layer), you are essentially forcing a
CMD_FLUSH with each I/O command sent to the disk. This is by design in
the block-device driver of the Linux kernel, and it severely degrades
performance.
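
To get a feel for the cost of the flag, here is a minimal Python
sketch that times the same small writes with and without O_DSYNC. It
is only an illustration under stated assumptions: it writes to a
temporary regular file rather than a raw block device (which would
need root and would overwrite data), it leaves out O_DIRECT because
that requires aligned buffers, and the helper names timed_writes and
compare are made up for this example. The absolute numbers depend
entirely on the filesystem and the disk underneath.

```python
import os
import tempfile
import time


def timed_writes(path, extra_flags, count=50, block_size=4096):
    """Write `count` blocks of `block_size` bytes; return elapsed seconds."""
    flags = os.O_WRONLY | os.O_CREAT | extra_flags
    buf = b"\0" * block_size
    fd = os.open(path, flags, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, buf)
        return time.perf_counter() - start
    finally:
        os.close(fd)


def compare(directory=None):
    """Time buffered writes against O_DSYNC writes on a scratch file."""
    with tempfile.TemporaryDirectory(dir=directory) as tmp:
        path = os.path.join(tmp, "dsync_test.bin")
        buffered = timed_writes(path, 0)          # page cache only
        dsync = timed_writes(path, os.O_DSYNC)    # each write waits for data
    return buffered, dsync


if __name__ == "__main__":
    buffered, dsync = compare()
    print(f"buffered: {buffered:.4f} s, O_DSYNC: {dsync:.4f} s")
```

Passing a directory on the disk under test (instead of the default
temporary location, which may be tmpfs) gives a more meaningful
comparison.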

A detailed walk-through of the various I/O scenarios is available at
thecodeartist.blogspot.com/2012/08/hdd-filesystems-osync.html

Note that SYNC/DSYNC on a filesystem (e.g. ext2/3/4) does NOT issue a
CMD_FLUSH. The "SYNC" via the filesystem simply guarantees that the
data is sent to the disk, not that it is actually flushed to the
media. It will continue to reside in the disk's internal cache,
waiting to be written to the platter in an optimal manner (a bunch of
writes re-ordered to be sequential on-disk and clubbed together in one
go). This makes a large difference to performance on modern HDDs with
NCQ support (a CMD_FLUSH simply cancels all the performance benefits
of NCQ).

In the case of SSDs, the huge IOPS figure quoted for the disk (40,000
for the Crucial M4) is again typically measured with the write cache
enabled. For Crucial M4 SSDs, see
http://www.crucial.com/pdf/tech_specs-letter_crucial_m4_ssd_v3-11-11_online.pdf
Footnote 1 - "Typical I/O performance numbers as measured using Iometer
with a queue depth of 32 and write cache enabled. Iometer measurements
are performed on a 8GB span. 4k transfers used for Read/Write latency
values."
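
As a back-of-the-envelope illustration of why the queue depth in that
footnote matters, the sketch below applies Little's law (outstanding
requests = throughput x latency) to the quoted figures. It assumes the
mean per-command latency stays the same at queue depth 1, which is
optimistic for synchronous writes that also flush the cache; the
function names are made up for this example.

```python
def mean_latency_s(iops, queue_depth):
    """Little's law: mean latency = outstanding requests / throughput."""
    return queue_depth / iops


def iops_at_depth(latency_s, queue_depth):
    """Throughput achievable with `queue_depth` requests in flight."""
    return queue_depth / latency_s


if __name__ == "__main__":
    # Figures from the spec sheet footnote: 40,000 IOPS at queue depth 32.
    latency = mean_latency_s(40_000, 32)
    # One request in flight at a time, as with synchronous writes.
    qd1_iops = iops_at_depth(latency, 1)
    print(f"mean latency: {latency * 1000:.1f} ms")
    print(f"IOPS at queue depth 1: {qd1_iops:.0f}")
```

Under these assumptions the latency works out to roughly 0.8 ms per
command, so a single synchronous writer would see on the order of
1,250 IOPS even before any flush cost is added.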

To disable this behaviour, so that SYNC/DSYNC semantics and
performance on raw block-device I/O resemble those of standard
filesystem I/O, you may want to apply the following patch to your
kernel:
https://gist.github.com/TheCodeArtist/93dddcd6a21dc81414ba

The patch simply disables CMD_FLUSH support, even on disks that claim
to support it.

regards
ChinmayVS

On Wed, Nov 20, 2013 at 6:24 PM, Christoph Hellwig <hch@infradead.org> wrote:
> On Wed, Nov 20, 2013 at 01:12:43PM +0100, Stefan Priebe - Profihost AG wrote:
>> Can anyone explain to me why O_DSYNC for my app on linux is so slow?
>
> Because FreeBSD ignores O_DSYNC on block devices, it never sends a FLUSH
> to the device.

