From: Dave Chinner <david@fromorbit.com>
To: linux-xfs@vger.kernel.org
Cc: "Darrick J. Wong" <djwong@kernel.org>
Subject: [GIT PULL] xfs: xlog_write rework and CIL scalability
Date: Fri, 10 Dec 2021 11:09:56 +1100
Message-ID: <20211210000956.GO449541@dread.disaster.area>

Hi Darrick,

Can you please pull the following changes from the tag listed below
for the XFS dev tree?

Cheers,

Dave.

The following changes since commit 0fcfb00b28c0b7884635dacf38e46d60bf3d4eb1:

  Linux 5.16-rc4 (2021-12-05 14:08:22 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs.git tags/xfs-cil-scale-3-tag

for you to fetch changes up to 3b5181b310e0f2064f2aafb6143cdb0e920f5858:

  xfs: expanding delayed logging design with background material (2021-12-09 10:22:36 +1100)

----------------------------------------------------------------
xfs: CIL and log scalability improvements

xlog_write() is code that causes severe eye bleeding. It's extremely
difficult to understand the way it is structured, and extremely easy
to break because of all the weird parameters it passes between
functions that do very non-obvious things. State set in
xlog_write_finish_copy() is carried across both outer and inner
loop iterations and is used by xlog_write_setup_copy(), which in
turn sets state that xlog_write_finish_copy() needs. The way iclog
space was obtained affects the accounting logic that ends up being
passed to xlog_state_finish_copy(). The code that handles commit
iclogs is spread over multiple functions and is obfuscated by the
setup/finish copy code.

It's just a mess.

It's also extremely inefficient.

That's why I've rewritten the code. I think the code I've written is
much easier to understand and there's less of it. The compiled code
is smaller and faster. It has far fewer subtleties and outside
dependencies, and is easier to reason about and modify.

Built on top of this are the CIL scalability improvements. My 32p
machine hits lock contention limits in xlog_cil_commit() at about
700,000 transaction commits a second. It hits this at 16-thread
workloads, and 32-thread workloads go no faster and just burn CPU on
the CIL spinlocks.

This patchset gets rid of spinlocks and global serialisation points
in the xlog_cil_commit() path. It does this by moving to a
combination of per-cpu counters, unordered per-cpu lists and
post-ordered per-cpu lists, and is built upon the xlog_write()
simplifications introduced earlier in the rewrite of that function.
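
To make the per-cpu scheme above a little more concrete, here is a
minimal userspace sketch in C. It is not the kernel code: every demo_*
name, DEMO_NR_CPUS and the singly linked lists are made up for
illustration, the CPU number is passed in explicitly, and the sketch is
single threaded, so the order ID counter is a plain integer where
concurrent code would need an atomic. It only shows the general shape:
commits touch nothing but their own CPU's slot and a sequence number,
and ordering is restored once, at push time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_NR_CPUS 4	/* hypothetical fixed CPU count for the sketch */

struct demo_item {
	uint64_t order_id;	/* stamped at commit, used to sort at push */
	int bytes;		/* space this item consumes */
	struct demo_item *next;
};

struct demo_pcp {		/* one slot per CPU */
	int space_used;		/* per-cpu space counter */
	struct demo_item *items;	/* unordered per-cpu list */
};

struct demo_cil {
	struct demo_pcp pcp[DEMO_NR_CPUS];
	uint64_t next_order_id;	/* assumption: would be atomic if concurrent */
};

/* Commit: only the local CPU's slot is modified, no global list or lock. */
static void demo_commit(struct demo_cil *cil, int cpu,
			struct demo_item *item, int bytes)
{
	struct demo_pcp *pcp;

	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
	pcp = &cil->pcp[cpu];
	item->bytes = bytes;
	item->order_id = cil->next_order_id++;
	item->next = pcp->items;	/* head insert, so list order is arbitrary */
	pcp->items = item;
	pcp->space_used += bytes;
}

/* Compare two array slots (pointers to items) by ascending order ID. */
static int demo_cmp(const void *a, const void *b)
{
	const struct demo_item *x = *(const struct demo_item * const *)a;
	const struct demo_item *y = *(const struct demo_item * const *)b;

	return (x->order_id > y->order_id) - (x->order_id < y->order_id);
}

/*
 * Push: drain up to max items from every per-cpu list into out[], sum the
 * space they used, then sort once by order ID to recover commit order.
 * Items that do not fit in out[] stay on their per-cpu lists.
 */
static size_t demo_push(struct demo_cil *cil, struct demo_item **out,
			size_t max, int *total_bytes)
{
	size_t n = 0;
	int cpu;

	*total_bytes = 0;
	for (cpu = 0; cpu < DEMO_NR_CPUS; cpu++) {
		struct demo_pcp *pcp = &cil->pcp[cpu];

		while (pcp->items && n < max) {
			struct demo_item *item = pcp->items;

			pcp->items = item->next;
			item->next = NULL;
			pcp->space_used -= item->bytes;
			*total_bytes += item->bytes;
			out[n++] = item;
		}
	}
	qsort(out, n, sizeof(out[0]), demo_cmp);
	return n;
}
```

The trade-off the sketch tries to show is that the commit side never
walks or sorts anything, while the push side pays for one aggregation
pass and one sort over everything it collected.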

This results in transaction commit rates exceeding 2 million
commits/s under certain unlink workloads, but in general the
improvements are smaller than this as the scalability limitations
simply move from xlog_cil_commit() to global VFS lock contexts.

----------------------------------------------------------------
Christoph Hellwig (2):
      xfs: change the type of ic_datap
      xfs: remove xlog_verify_dest_ptr

Dave Chinner (28):
      xfs: factor out the CIL transaction header building
      xfs: only CIL pushes require a start record
      xfs: embed the xlog_op_header in the unmount record
      xfs: embed the xlog_op_header in the commit record
      xfs: log tickets don't need log client id
      xfs: move log iovec alignment to preparation function
      xfs: reserve space and initialise xlog_op_header in item formatting
      xfs: log ticket region debug is largely useless
      xfs: pass lv chain length into xlog_write()
      xfs: introduce xlog_write_full()
      xfs: introduce xlog_write_partial()
      xfs: xlog_write() no longer needs contwr state
      xfs: xlog_write() doesn't need optype anymore
      xfs: CIL context doesn't need to count iovecs
      xfs: use the CIL space used counter for emptiness checks
      xfs: lift init CIL reservation out of xc_cil_lock
      xfs: rework per-iclog header CIL reservation
      xfs: introduce per-cpu CIL tracking structure
      xfs: implement percpu cil space used calculation
      xfs: track CIL ticket reservation in percpu structure
      xfs: convert CIL busy extents to per-cpu
      xfs: Add order IDs to log items in CIL
      xfs: convert CIL to unordered per cpu lists
      xfs: convert log vector chain to use list heads
      xfs: move CIL ordering to the logvec chain
      xfs: avoid cil push lock if possible
      xfs: xlog_sync() manually adjusts grant head space
      xfs: expanding delayed logging design with background material

 Documentation/filesystems/xfs-delayed-logging-design.rst | 361 +++++++++++++++++++++++++++++++++----
 fs/xfs/libxfs/xfs_log_format.h                           |   1 -
 fs/xfs/xfs_log.c                                         | 809 ++++++++++++++++++++++++++++++++++++----------------------------------------------
 fs/xfs/xfs_log.h                                         |  58 ++----
 fs/xfs/xfs_log_cil.c                                     | 550 +++++++++++++++++++++++++++++++++++++++-----------------
 fs/xfs/xfs_log_priv.h                                    | 103 +++++------
 fs/xfs/xfs_super.c                                       |   1 +
 fs/xfs/xfs_trans.c                                       |  10 +-
 fs/xfs/xfs_trans.h                                       |   1 +
 fs/xfs/xfs_trans_priv.h                                  |   3 +-
 10 files changed, 1134 insertions(+), 763 deletions(-)

-- 
Dave Chinner
david@fromorbit.com


Thread overview: 6+ messages
2021-12-10  0:09 Dave Chinner [this message]
2022-01-06 21:40 ` [GIT PULL] xfs: xlog_write rework and CIL scalability Darrick J. Wong
2022-01-11  5:04   ` Dave Chinner
2022-01-11 17:58     ` Darrick J. Wong
2022-01-12 23:56       ` Darrick J. Wong
2022-01-13  3:46         ` Dave Chinner
