linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [GIT PULL v2] xfs: CIL and log scalability improvements
@ 2021-06-08  4:43 Dave Chinner
  2021-06-09 20:46 ` Darrick J. Wong
  0 siblings, 1 reply; 2+ messages in thread
From: Dave Chinner @ 2021-06-08  4:43 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs

Hi Darrick,

I've updated the branch and tag for the CIL and log scalability
improvements to fix the CPU hotplug bug that was in the previous
version. The code changes are limited to those, otherwise everything
else in the series is unchanged.

Please pull from the tag decsribed below.

Cheers,

Dave.

The following changes since commit d07f6ca923ea0927a1024dfccafc5b53b61cfecc:

  Linux 5.13-rc2 (2021-05-16 15:27:44 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs.git tags/xfs-cil-scale-2-tag

for you to fetch changes up to 7017b129e69c1b451fa926f2cac507c4128608dc:

  xfs: expanding delayed logging design with background material (2021-06-08 14:27:46 +1000)

----------------------------------------------------------------
xfs: CIL and log scalability improvements

Performance improvements are largely documented in the change logs of the
individual patches. Headline numbers are an increase in transaction rate from
700k commits/s to 1.7M commits/s, and a reduction in fua/flush operations by
2-3 orders of magnitude on metadata heavy workloads that don't use fsync.

Summary of series:

Patches         Modifications
-------         -------------
1-7:            log write FUA/FLUSH optimisations
8:              bug fix
9-11:           Async CIL pushes
12-25:          xlog_write() rework
26-39:          CIL commit scalability

The log write FUA/FLUSH optimisations reduce the number of cache flushes
required to flush the CIL to the journal. It extends the old pre-delayed logging
ordering semantics required by writing individual transactions to the iclogs out
to cover then CIL checkpoint transactions rather than individual writes to the
iclogs. In doing so, we reduce the cache flush requirements to once per CIL
checkpoint rather than once per iclog write.

The async CIL pushes fix a pipeline limitation that only allowed a single CIL
push to be processed at a time. This was causing CIL checkpoint writing to
become CPU bound as only a single CIL checkpoint could be pushed at a time. The
checkpoint pipleine was designed to allow multiple pushes to be in flight at
once and use careful ordering of the commit records to ensure correct recovery
order, but the workqueue implementation didn't allow concurrent works to be run.
The concurrent works now extend out to 4 CIL checkpoints running at a time,
hence removing the CPU usage limiations without introducing new lock contention
issues.

The xlog_write() rework is long overdue. The code is complex, difficult to
understand, full of tricky, subtle corner cases and just generally really hard
to modify. This patchset reworks the xlog_write() API to reduce the processing
overhead of writing out long log vector chains, and factors the xlog_write()
code into a simple, compact fast path along with a clearer slow path to handle
the complex cases.

The CIL commit scalability patchset removes spinlocks from the transaction
commit fast path. These spinlocks are the performance limiting bottleneck in the
transaction commit path, so we apply a variety of different techniques to do
either atomic. lockless or per-cpu updates of the CIL tracking structures during
commits. This greatly increases the throughput of the the transaction commit
engine, moving the contention point to the log space tracking algorithms after
doubling throughput on 32-way workloads.

----------------------------------------------------------------
Dave Chinner (40):
      xfs: log stripe roundoff is a property of the log
      xfs: separate CIL commit record IO
      xfs: remove xfs_blkdev_issue_flush
      xfs: async blkdev cache flush
      xfs: CIL checkpoint flushes caches unconditionally
      xfs: remove need_start_rec parameter from xlog_write()
      xfs: journal IO cache flush reductions
      xfs: Fix CIL throttle hang when CIL space used going backwards
      xfs: xfs_log_force_lsn isn't passed a LSN
      xfs: AIL needs asynchronous CIL forcing
      xfs: CIL work is serialised, not pipelined
      xfs: factor out the CIL transaction header building
      xfs: only CIL pushes require a start record
      xfs: embed the xlog_op_header in the unmount record
      xfs: embed the xlog_op_header in the commit record
      xfs: log tickets don't need log client id
      xfs: move log iovec alignment to preparation function
      xfs: reserve space and initialise xlog_op_header in item formatting
      xfs: log ticket region debug is largely useless
      xfs: pass lv chain length into xlog_write()
      xfs: introduce xlog_write_single()
      xfs:_introduce xlog_write_partial()
      xfs: xlog_write() no longer needs contwr state
      xfs: xlog_write() doesn't need optype anymore
      xfs: CIL context doesn't need to count iovecs
      xfs: use the CIL space used counter for emptiness checks
      xfs: lift init CIL reservation out of xc_cil_lock
      xfs: rework per-iclog header CIL reservation
      xfs: introduce CPU hotplug infrastructure
      xfs: introduce per-cpu CIL tracking structure
      xfs: implement percpu cil space used calculation
      xfs: track CIL ticket reservation in percpu structure
      xfs: convert CIL busy extents to per-cpu
      xfs: Add order IDs to log items in CIL
      xfs: convert CIL to unordered per cpu lists
      xfs: convert log vector chain to use list heads
      xfs: move CIL ordering to the logvec chain
      xfs: avoid cil push lock if possible
      xfs: xlog_sync() manually adjusts grant head space
      xfs: expanding delayed logging design with background material

 Documentation/filesystems/xfs-delayed-logging-design.rst |  361 ++++++++++++++++++++++++++----
 fs/xfs/libxfs/xfs_log_format.h                           |    4 -
 fs/xfs/libxfs/xfs_types.h                                |    1 +
 fs/xfs/xfs_bio_io.c                                      |   35 +++
 fs/xfs/xfs_buf.c                                         |    2 +-
 fs/xfs/xfs_buf_item.c                                    |   39 ++--
 fs/xfs/xfs_dquot_item.c                                  |    2 +-
 fs/xfs/xfs_file.c                                        |   20 +-
 fs/xfs/xfs_inode.c                                       |   10 +-
 fs/xfs/xfs_inode_item.c                                  |   18 +-
 fs/xfs/xfs_inode_item.h                                  |    2 +-
 fs/xfs/xfs_linux.h                                       |    2 +
 fs/xfs/xfs_log.c                                         | 1015 +++++++++++++++++++++++++++++++++++++++---------------------------------------------
 fs/xfs/xfs_log.h                                         |   66 ++----
 fs/xfs/xfs_log_cil.c                                     |  804 ++++++++++++++++++++++++++++++++++++++++++++++++------------------
 fs/xfs/xfs_log_priv.h                                    |  123 ++++++-----
 fs/xfs/xfs_super.c                                       |   52 ++++-
 fs/xfs/xfs_super.h                                       |    1 -
 fs/xfs/xfs_sysfs.c                                       |    1 +
 fs/xfs/xfs_trace.c                                       |    1 +
 fs/xfs/xfs_trans.c                                       |   18 +-
 fs/xfs/xfs_trans.h                                       |    5 +-
 fs/xfs/xfs_trans_ail.c                                   |   11 +-
 fs/xfs/xfs_trans_priv.h                                  |    3 +-
 include/linux/cpuhotplug.h                               |    1 +
 25 files changed, 1632 insertions(+), 965 deletions(-)

-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: [GIT PULL v2] xfs: CIL and log scalability improvements
  2021-06-08  4:43 [GIT PULL v2] xfs: CIL and log scalability improvements Dave Chinner
@ 2021-06-09 20:46 ` Darrick J. Wong
  0 siblings, 0 replies; 2+ messages in thread
From: Darrick J. Wong @ 2021-06-09 20:46 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

On Tue, Jun 08, 2021 at 02:43:40PM +1000, Dave Chinner wrote:
> Hi Darrick,
> 
> I've updated the branch and tag for the CIL and log scalability
> improvements to fix the CPU hotplug bug that was in the previous
> version. The code changes are limited to those, otherwise everything
> else in the series is unchanged.
> 
> Please pull from the tag decsribed below.

Pulled, thanks!

--D

> 
> Cheers,
> 
> Dave.
> 
> The following changes since commit d07f6ca923ea0927a1024dfccafc5b53b61cfecc:
> 
>   Linux 5.13-rc2 (2021-05-16 15:27:44 -0700)
> 
> are available in the Git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs.git tags/xfs-cil-scale-2-tag
> 
> for you to fetch changes up to 7017b129e69c1b451fa926f2cac507c4128608dc:
> 
>   xfs: expanding delayed logging design with background material (2021-06-08 14:27:46 +1000)
> 
> ----------------------------------------------------------------
> xfs: CIL and log scalability improvements
> 
> Performance improvements are largely documented in the change logs of the
> individual patches. Headline numbers are an increase in transaction rate from
> 700k commits/s to 1.7M commits/s, and a reduction in fua/flush operations by
> 2-3 orders of magnitude on metadata heavy workloads that don't use fsync.
> 
> Summary of series:
> 
> Patches         Modifications
> -------         -------------
> 1-7:            log write FUA/FLUSH optimisations
> 8:              bug fix
> 9-11:           Async CIL pushes
> 12-25:          xlog_write() rework
> 26-39:          CIL commit scalability
> 
> The log write FUA/FLUSH optimisations reduce the number of cache flushes
> required to flush the CIL to the journal. It extends the old pre-delayed logging
> ordering semantics required by writing individual transactions to the iclogs out
> to cover then CIL checkpoint transactions rather than individual writes to the
> iclogs. In doing so, we reduce the cache flush requirements to once per CIL
> checkpoint rather than once per iclog write.
> 
> The async CIL pushes fix a pipeline limitation that only allowed a single CIL
> push to be processed at a time. This was causing CIL checkpoint writing to
> become CPU bound as only a single CIL checkpoint could be pushed at a time. The
> checkpoint pipleine was designed to allow multiple pushes to be in flight at
> once and use careful ordering of the commit records to ensure correct recovery
> order, but the workqueue implementation didn't allow concurrent works to be run.
> The concurrent works now extend out to 4 CIL checkpoints running at a time,
> hence removing the CPU usage limiations without introducing new lock contention
> issues.
> 
> The xlog_write() rework is long overdue. The code is complex, difficult to
> understand, full of tricky, subtle corner cases and just generally really hard
> to modify. This patchset reworks the xlog_write() API to reduce the processing
> overhead of writing out long log vector chains, and factors the xlog_write()
> code into a simple, compact fast path along with a clearer slow path to handle
> the complex cases.
> 
> The CIL commit scalability patchset removes spinlocks from the transaction
> commit fast path. These spinlocks are the performance limiting bottleneck in the
> transaction commit path, so we apply a variety of different techniques to do
> either atomic. lockless or per-cpu updates of the CIL tracking structures during
> commits. This greatly increases the throughput of the the transaction commit
> engine, moving the contention point to the log space tracking algorithms after
> doubling throughput on 32-way workloads.
> 
> ----------------------------------------------------------------
> Dave Chinner (40):
>       xfs: log stripe roundoff is a property of the log
>       xfs: separate CIL commit record IO
>       xfs: remove xfs_blkdev_issue_flush
>       xfs: async blkdev cache flush
>       xfs: CIL checkpoint flushes caches unconditionally
>       xfs: remove need_start_rec parameter from xlog_write()
>       xfs: journal IO cache flush reductions
>       xfs: Fix CIL throttle hang when CIL space used going backwards
>       xfs: xfs_log_force_lsn isn't passed a LSN
>       xfs: AIL needs asynchronous CIL forcing
>       xfs: CIL work is serialised, not pipelined
>       xfs: factor out the CIL transaction header building
>       xfs: only CIL pushes require a start record
>       xfs: embed the xlog_op_header in the unmount record
>       xfs: embed the xlog_op_header in the commit record
>       xfs: log tickets don't need log client id
>       xfs: move log iovec alignment to preparation function
>       xfs: reserve space and initialise xlog_op_header in item formatting
>       xfs: log ticket region debug is largely useless
>       xfs: pass lv chain length into xlog_write()
>       xfs: introduce xlog_write_single()
>       xfs:_introduce xlog_write_partial()
>       xfs: xlog_write() no longer needs contwr state
>       xfs: xlog_write() doesn't need optype anymore
>       xfs: CIL context doesn't need to count iovecs
>       xfs: use the CIL space used counter for emptiness checks
>       xfs: lift init CIL reservation out of xc_cil_lock
>       xfs: rework per-iclog header CIL reservation
>       xfs: introduce CPU hotplug infrastructure
>       xfs: introduce per-cpu CIL tracking structure
>       xfs: implement percpu cil space used calculation
>       xfs: track CIL ticket reservation in percpu structure
>       xfs: convert CIL busy extents to per-cpu
>       xfs: Add order IDs to log items in CIL
>       xfs: convert CIL to unordered per cpu lists
>       xfs: convert log vector chain to use list heads
>       xfs: move CIL ordering to the logvec chain
>       xfs: avoid cil push lock if possible
>       xfs: xlog_sync() manually adjusts grant head space
>       xfs: expanding delayed logging design with background material
> 
>  Documentation/filesystems/xfs-delayed-logging-design.rst |  361 ++++++++++++++++++++++++++----
>  fs/xfs/libxfs/xfs_log_format.h                           |    4 -
>  fs/xfs/libxfs/xfs_types.h                                |    1 +
>  fs/xfs/xfs_bio_io.c                                      |   35 +++
>  fs/xfs/xfs_buf.c                                         |    2 +-
>  fs/xfs/xfs_buf_item.c                                    |   39 ++--
>  fs/xfs/xfs_dquot_item.c                                  |    2 +-
>  fs/xfs/xfs_file.c                                        |   20 +-
>  fs/xfs/xfs_inode.c                                       |   10 +-
>  fs/xfs/xfs_inode_item.c                                  |   18 +-
>  fs/xfs/xfs_inode_item.h                                  |    2 +-
>  fs/xfs/xfs_linux.h                                       |    2 +
>  fs/xfs/xfs_log.c                                         | 1015 +++++++++++++++++++++++++++++++++++++++---------------------------------------------
>  fs/xfs/xfs_log.h                                         |   66 ++----
>  fs/xfs/xfs_log_cil.c                                     |  804 ++++++++++++++++++++++++++++++++++++++++++++++++------------------
>  fs/xfs/xfs_log_priv.h                                    |  123 ++++++-----
>  fs/xfs/xfs_super.c                                       |   52 ++++-
>  fs/xfs/xfs_super.h                                       |    1 -
>  fs/xfs/xfs_sysfs.c                                       |    1 +
>  fs/xfs/xfs_trace.c                                       |    1 +
>  fs/xfs/xfs_trans.c                                       |   18 +-
>  fs/xfs/xfs_trans.h                                       |    5 +-
>  fs/xfs/xfs_trans_ail.c                                   |   11 +-
>  fs/xfs/xfs_trans_priv.h                                  |    3 +-
>  include/linux/cpuhotplug.h                               |    1 +
>  25 files changed, 1632 insertions(+), 965 deletions(-)
> 
> -- 
> Dave Chinner
> david@fromorbit.com

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2021-06-09 20:46 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-08  4:43 [GIT PULL v2] xfs: CIL and log scalability improvements Dave Chinner
2021-06-09 20:46 ` Darrick J. Wong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).