From: Gabriel Krisman Bertazi <>
To: "Theodore Ts'o" <>
Subject: Potential regression with iomap DIO for 4k writes
Date: Wed, 02 Jun 2021 19:35:48 -0400


While exploring the performance of different DIO implementations, I've
come across what seems to be a noticeable regression (22% slowdown) in
4k writes on Ext4 when comparing the original direct-io with the current
iomap dio implementation, as it exists on linus/master.  Perhaps you
already know about this, but I'm having a hard time understanding the
root cause in order to attempt to improve the situation.

* Benchmark

For starters, I'm comparing three kernels, built with the same config
and compiler (gcc-8.4.0, locally built).  My DUT is a Samsung SSD 970 EVO
Plus 250GB dedicated to this test (no concurrent IO).

  - Kernel 1: The commit immediately before iomap for ext4 was merged
    ("f112a2fd1f59").  In the data below, this kernel is identified as
    5.4.0-original-dio.  It is available in a public branch at:

    < -b dio/original-dio>

  - Kernel 2: tag 5.5 (first release with dio-iomap).  In the data
    below, identified as 5.5.0-old-iomap.  For completeness, it is
    available at:

    < -b dio/old-dio>

  - Kernel 3: Kernel tag 5.13-rc3.  In the data below, identified as
    5.13.0-rc3-iomap.  For completeness, it is available at:

    < -b dio/iomap>

I ran the fio job below with the combinations: BS=4k,16k and RW=read,write

  fio --ioengine libaio --size=2G --direct=1 --iodepth=64 --time_based=1 \
      --thread=1 --overwrite=1 --runtime=100 --output-format=terse

For every kernel test, the file system was recreated, and the 2GB file
was pre-allocated.
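
As a rough sketch of the test matrix described above (not the actual
script), the four BS/RW combinations can be expanded into fio command
lines as follows.  The fixed options are copied from the command above;
passing the varying parameters via --name, --filename, --bs and --rw, as
well as the file path, are assumptions made for illustration.

```python
import itertools
import shlex

# Fixed options, copied from the fio command line in this message.
BASE_ARGS = [
    "fio", "--ioengine", "libaio", "--size=2G", "--direct=1",
    "--iodepth=64", "--time_based=1", "--thread=1", "--overwrite=1",
    "--runtime=100", "--output-format=terse",
]

BLOCK_SIZES = ["4k", "16k"]   # BS=4k,16k
RW_MODES = ["read", "write"]  # RW=read,write


def build_jobs(filename):
    """Return {label: argv} for every RW x BS combination.

    `filename` is a hypothetical path to the pre-allocated 2GB file;
    how the real script names the job and file is not stated.
    """
    jobs = {}
    for rw, bs in itertools.product(RW_MODES, BLOCK_SIZES):
        label = f"{rw}-{bs}"
        jobs[label] = BASE_ARGS + [
            f"--name={label}",
            f"--filename={filename}",
            f"--bs={bs}",
            f"--rw={rw}",
        ]
    return jobs


# Print the command lines instead of running them (assumed mount point).
for label, argv in build_jobs("/mnt/test/fio.dat").items():
    print(shlex.join(argv))
```

The labels produced (read-4k, read-16k, write-4k, write-16k) match the
column names used in the results.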

In an attempt to further isolate the problem, I tested both xfs and ext4
under the same conditions.

The script I used is available at:


* Results

I obtained the following performance results, relative to the
5.4.0-original-dio baseline (1.0 means no change):

|                                                  IOPS                                                 |
|    kernel              |          read-4k |          read-16k |          write-4k |         write-16k |
| 5.13.0-rc3-iomap-ext4  | 1.01192950082305 |  1.00026413252562 | 0.806377013901006 |  1.00020735846057 |
| 5.5.0-old-iomap-ext4   | 1.01154156662508 | 0.998753983520427 | 0.777051125458035 | 0.999937792461829 |
| 5.13.0-rc3-iomap-xfs   | 1.00234888443008 |  1.00027645151444 |  1.00996172750095 |  1.00156349447934 |
| 5.5.0-old-iomap-xfs    | 1.00010412786902 |  1.00202731110586 |  1.01502846821264 |  1.00149431330769 |

Total IO is the amount of data copied (relative to baseline).

|                                                TOTAL_IO                                               |
| kernel                 |          read-4k |          read-16k |          write-4k |         write-16k |
| 5.13.0-rc3-iomap-ext4  | 1.01193023173156 |  1.00026332569559 | 0.806377530301477 |  1.00014686835205 |
| 5.5.0-old-iomap-ext4   | 1.01154196621591 | 0.998758131673757 | 0.777050753425118 | 0.999902824986834 |
| 5.13.0-rc3-iomap-xfs   | 1.00234893734134 |  1.00027535318322 |  1.00996437458991 |  1.00156305646789 |
| 5.5.0-old-iomap-xfs    | 1.00010328564078 |  1.00202831801018 |  1.01503060595258 |  1.00149069402364 |

With a visualization of the above data here:


The only out-of-the-ordinary result seems to be in write-4k for Ext4,
which shows roughly 19% to 22% fewer IOPS (and total IO) for iomap in
comparison to the original DIO.  This is not a one-off run; it is
consistently reproducible across repeated test runs in my environment.
The performance reduction also doesn't reproduce on XFS.
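
To make the reading of the table explicit, here is a small sketch that
converts the relative IOPS values above into percent changes and lists
the cells that stand out.  The numbers are copied from the IOPS table;
the 5% cut-off is an arbitrary threshold chosen for illustration, not
part of the original analysis.

```python
# Relative IOPS from the table above (1.0 == 5.4.0-original-dio baseline).
RELATIVE_IOPS = {
    "5.13.0-rc3-iomap-ext4": {
        "read-4k": 1.01192950082305, "read-16k": 1.00026413252562,
        "write-4k": 0.806377013901006, "write-16k": 1.00020735846057,
    },
    "5.5.0-old-iomap-ext4": {
        "read-4k": 1.01154156662508, "read-16k": 0.998753983520427,
        "write-4k": 0.777051125458035, "write-16k": 0.999937792461829,
    },
    "5.13.0-rc3-iomap-xfs": {
        "read-4k": 1.00234888443008, "read-16k": 1.00027645151444,
        "write-4k": 1.00996172750095, "write-16k": 1.00156349447934,
    },
    "5.5.0-old-iomap-xfs": {
        "read-4k": 1.00010412786902, "read-16k": 1.00202731110586,
        "write-4k": 1.01502846821264, "write-16k": 1.00149431330769,
    },
}


def percent_change(ratio):
    """Convert a ratio against the baseline into a percent change."""
    return (ratio - 1.0) * 100.0


def outliers(table, threshold_pct=5.0):
    """Return (kernel, workload, change) for cells beyond the threshold.

    threshold_pct is an assumed cut-off, used only for this example.
    """
    found = []
    for kernel, row in table.items():
        for workload, ratio in row.items():
            change = percent_change(ratio)
            if abs(change) > threshold_pct:
                found.append((kernel, workload, round(change, 1)))
    return found


for kernel, workload, change in outliers(RELATIVE_IOPS):
    print(f"{kernel:24s} {workload:10s} {change:+.1f}%")
```

With this threshold only the two Ext4 write-4k cells are reported, at
-19.4% for 5.13.0-rc3 and -22.3% for 5.5.0; every other cell is within
about 1.5% of the baseline.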

I tried to limit the influence of other parts of the kernel that could
affect the behavior by comparing the kernel immediately before the
introduction of dio-iomap for ext4 with the first version with that
feature.  By also observing that xfs doesn't change, I believe it to be
ext4 specific.

I'm also publishing the raw data and all related material at the link
below, in case anyone wants to tinker with my data:

Perhaps I'm missing something obvious, but I can't pinpoint a specific
problem with my analysis.  Is this expected, given the way ext4 iomap
works?  Do you have any idea of the root cause or how it can be improved?

I will keep looking into this issue, but I'd like to share this partial
result, in case there is a problem with my analysis, or if you have any


Gabriel Krisman Bertazi
