All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Andreas Grünbacher" <andreas.gruenbacher@gmail.com>
To: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
Cc: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Matthew Wilcox <willy@infradead.org>,
	Dave Chinner <david@fromorbit.com>,
	Brian Foster <bfoster@redhat.com>,
	Christoph Hellwig <hch@infradead.org>,
	Andreas Gruenbacher <agruenba@redhat.com>,
	Ojaswin Mujoo <ojaswin@linux.ibm.com>,
	Disha Goel <disgoel@linux.ibm.com>
Subject: Re: [PATCHv6 0/5] iomap: Add support for per-block dirty state to improve write performance
Date: Tue, 6 Jun 2023 14:17:13 +0200	[thread overview]
Message-ID: <CAHpGcM+xTevu0LaJvaPrF-M3bEDvicUqqY5MPD6z2VuMO9vRWw@mail.gmail.com> (raw)
In-Reply-To: <cover.1685900733.git.ritesh.list@gmail.com>

Ritesh,

Am Mo., 5. Juni 2023 um 03:33 Uhr schrieb Ritesh Harjani (IBM)
<ritesh.list@gmail.com>:
> Hello All,
>
> Please find PATCHv6 which adds per-block dirty tracking to iomap.
> As discussed earlier this is required to improve write performance and reduce
> write amplification for cases where either blocksize is less than pagesize (such
> as Power platform with 64k pagesize) or when we have a large folio (such as xfs
> which currently supports large folio).
>
> RFCv5 -> PATCHv6:
> =================
> 1. Addresses review comments from Brian, Christoph and Matthew.
>    @Christoph:
>      - I have renamed the higher level functions such as iop_alloc/iop_free() to
>        iomap_iop_alloc/free() in v6.
>      - As for the low level bitmap accessor functions I couldn't find any better
>        naming then iop_test_/set/clear_**. I could have gone for
>        iomap_iop__test/set/clear/_** or iomap__iop_test/set/clear_**, but
>        I wasn't convinced with either of above as it also increases function
>        name.
>        Besides iop_test/set_clear_ accessors functions for uptodate and dirty
>        status tracking make sense as we are sure we have a valid iop in such
>        cases. Please do let me know if this looks ok to you.
> 2. I tried testing gfs2 (initially with no patches) with xfstests. But I always ended up
>    in some or the other deadlock (I couldn't spend any time debugging that).
>    I also ran it with -x log, but still it was always failing for me.
>    @Andreas:
>    - could you please suggest how can I test gfs2 with these patches. I see gfs2
>      can have a smaller blocksize and it uses iomap buffered io path. It will be
>      good if we can get these patches tested on it too.

here's a minimal list of tests we're running automatically on a daily basis:

https://gitlab.com/redhat/centos-stream/tests/kernel/kernel-tests/-/blob/main/filesystems/xfs/xfstests/RUNTESTS.gfs2

Please note that function inode_to_wb() in <linux/backing-dev.h>
includes some asserts that are active when CONFIG_LOCKDEP is enabled.
Those asserts will cause every single fstest to fail. So either
disable CONFIG_LOCKDEP or remove the asserts in inode_to_wb() for now.

Thanks,
Andreas

> 3. I can now say I have run some good amount of fstests on these patches on
>    these platforms and I haven't found any new failure in my testing so far.
>    arm64 (64k pagesize): with 4k -g quick
>    Power: with 4k -g auto
>    x86: 1k, 4k with -g auto and adv_auto
>
> From my testing so far these patches looks stable to me and if this looks good
> to reviewers as well, do you think this can be queued to linux-next for wider
> testing?
>
> Performance numbers copied from last patch commit message
> ==================================================
> Performance testing of below fio workload reveals ~16x performance
> improvement using nvme with XFS (4k blocksize) on Power (64K pagesize)
> FIO reported write bw scores improved from around ~28 MBps to ~452 MBps.
>
> 1. <test_randwrite.fio>
> [global]
>         ioengine=psync
>         rw=randwrite
>         overwrite=1
>         pre_read=1
>         direct=0
>         bs=4k
>         size=1G
>         dir=./
>         numjobs=8
>         fdatasync=1
>         runtime=60
>         iodepth=64
>         group_reporting=1
>
> [fio-run]
>
> 2. Also our internal performance team reported that this patch improves
>    their database workload performance by around ~83% (with XFS on Power)
>
> Ritesh Harjani (IBM) (5):
>   iomap: Rename iomap_page_create/release() to iomap_iop_alloc/free()
>   iomap: Move folio_detach_private() in iomap_iop_free() to the end
>   iomap: Refactor some iop related accessor functions
>   iomap: Allocate iop in ->write_begin() early
>   iomap: Add per-block dirty state tracking to improve performance
>
>  fs/gfs2/aops.c         |   2 +-
>  fs/iomap/buffered-io.c | 309 ++++++++++++++++++++++++++++++-----------
>  fs/xfs/xfs_aops.c      |   2 +-
>  fs/zonefs/file.c       |   2 +-
>  include/linux/iomap.h  |   1 +
>  5 files changed, 235 insertions(+), 81 deletions(-)
>
> --
> 2.40.1
>

      parent reply	other threads:[~2023-06-06 12:17 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-05  1:31 [PATCHv6 0/5] iomap: Add support for per-block dirty state to improve write performance Ritesh Harjani (IBM)
2023-06-05  1:31 ` [PATCHv6 1/5] iomap: Rename iomap_page_create/release() to iomap_iop_alloc/free() Ritesh Harjani (IBM)
2023-06-05  1:31 ` [PATCHv6 2/5] iomap: Move folio_detach_private() in iomap_iop_free() to the end Ritesh Harjani (IBM)
2023-06-05  1:31 ` [PATCHv6 3/5] iomap: Refactor some iop related accessor functions Ritesh Harjani (IBM)
2023-06-05  3:33   ` Matthew Wilcox
2023-06-05  5:16     ` Ritesh Harjani
2023-06-05  1:31 ` [PATCHv6 4/5] iomap: Allocate iop in ->write_begin() early Ritesh Harjani (IBM)
2023-06-05  1:31 ` [PATCHv6 5/5] iomap: Add per-block dirty state tracking to improve performance Ritesh Harjani (IBM)
2023-06-05  4:03   ` Matthew Wilcox
2023-06-05  5:20     ` Ritesh Harjani
2023-06-06 12:17 ` Andreas Grünbacher [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAHpGcM+xTevu0LaJvaPrF-M3bEDvicUqqY5MPD6z2VuMO9vRWw@mail.gmail.com \
    --to=andreas.gruenbacher@gmail.com \
    --cc=agruenba@redhat.com \
    --cc=bfoster@redhat.com \
    --cc=david@fromorbit.com \
    --cc=disgoel@linux.ibm.com \
    --cc=hch@infradead.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=ojaswin@linux.ibm.com \
    --cc=ritesh.list@gmail.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.