linux-xfs.vger.kernel.org archive mirror
* [RFC PATCH 0/2] xfs: hard limit background CIL push size
@ 2019-09-09  1:51 Dave Chinner
  2019-09-09  1:51 ` [PATCH 1/2] xfs: Lower CIL flush limit for large logs Dave Chinner
  2019-09-09  1:51 ` [PATCH 2/2] xfs: hard limit the background CIL push Dave Chinner
  0 siblings, 2 replies; 13+ messages in thread
From: Dave Chinner @ 2019-09-09  1:51 UTC
  To: linux-xfs

Hi folks,

As we've been discussing from the recent generic/530 hangs, the CIL
push work can get held off for an arbitrary amount of time and
result in the CIL checkpoint growing large enough that it can
consume the entire log. This is not allowed - the log recovery
algorithm for finding the head and tail of the log requires two
checkpoints in the log so it can find the last full checkpoint that
it needs to recover. That puts a hard limit on the size of the CIL
checkpoints of just under 50% of the log.

The CIL currently doesn't have any enforcement on that - log space
hangs due to CIL overruns are not something we've seen in test or
production systems until the recent unlinked list modifications we
made. Hence this has largely been a theoretical problem rather than
a practical problem up to this point.

While we've made changes that avoid the CIL hold-off vectors that
lead to the generic/530 hangs, we have enough data on the CIL
characteristics and performance to be able to put in place hard
limits without compromising performance.

The first patch limits the CIL size on large logs - we don't need to
aggregate hundreds of megabytes of metadata in the CIL to realise
the relogging benefits that the CIL provides. Measurements show that the
point at which performance starts to be affected is somewhere
between 16MB and 32MB of aggregated changes in the CIL. Hence
the background push threshold is limited to be 32MB on large logs,
but remains at 12.5% of the log on small logs.
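
As a rough illustration of the threshold rule described above (a
sketch, not the code from the patch), the background push threshold
can be modelled as the smaller of 12.5% of the log and a 32MB cap.
The function and macro names here are hypothetical and do not
correspond to identifiers in the XFS source:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cap taken from the description above: 32MB. */
#define CIL_PUSH_CAP_BYTES (32ULL * 1024 * 1024)

/*
 * Illustrative model only: return the background push threshold for a
 * log of the given size, i.e. 12.5% of the log, capped at 32MB.
 */
static uint64_t cil_background_push_threshold(uint64_t log_bytes)
{
	assert(log_bytes > 0);

	uint64_t eighth = log_bytes / 8;	/* 12.5% of the log */

	return eighth < CIL_PUSH_CAP_BYTES ? eighth : CIL_PUSH_CAP_BYTES;
}
```

Under this model the cap only takes effect once the log is larger
than 256MB; smaller logs keep the 12.5% threshold.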

The second patch adds a hard limit on the CIL background push
threshold. This is set to double the size of the push threshold, so
a single CIL context will consume at most 25% of the log before
processes committing to it block waiting for the background push of
that context to start. This means that all processes that commit to
a push context that is over the hard limit will sleep until the
background CIL push work starts on that context. At that point, they
will be woken and their next transaction commits will go into the
new CIL context that the background push switches over to.
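
The behaviour described above can be sketched as a small decision
function: what a committing process does given the current size of
the CIL context. This is a simplified userspace model under stated
assumptions, not the kernel implementation. The enum, the function
name and the exact boundary handling (at or above a limit counts as
over it) are all hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical outcomes for a process that has just committed. */
enum cil_commit_action {
	CIL_CONTINUE,		/* below the push threshold: nothing to do */
	CIL_QUEUE_PUSH,		/* over the threshold: request a background push */
	CIL_QUEUE_PUSH_AND_WAIT	/* over the hard limit: also sleep until the
				 * background push starts on this context */
};

/*
 * Illustrative model only. The hard limit is double the background
 * push threshold, as in the description above. Treating "equal to the
 * limit" as over the limit is an assumption made for this sketch.
 */
static enum cil_commit_action
cil_commit_action(uint64_t ctx_bytes, uint64_t push_threshold)
{
	assert(push_threshold > 0);

	uint64_t hard_limit = 2 * push_threshold;

	if (ctx_bytes < push_threshold)
		return CIL_CONTINUE;
	if (ctx_bytes < hard_limit)
		return CIL_QUEUE_PUSH;
	return CIL_QUEUE_PUSH_AND_WAIT;
}
```

In this model a waiter is released once the push work starts and the
context is switched, so its next commit lands in the new context.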

This provides a hard limit on the size of a CIL checkpoint of ~1/4 of
the entire log, well inside the size limit that log recovery imposes
on us. For maximally sized logs, this hard limit ends up being about
3% of the entire log, so it serves to keep the logged objects moving
from the CIL to the AIL at a reasonable rate and spreads them over a
wider range of LSNs, giving more graduated (less bursty) tail pushing
behaviour.
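
The percentages above can be checked with a short calculation. This
sketch assumes a maximum log size of roughly 2GB, which is not stated
in this message, and reuses the 12.5% / 32MB threshold and the 2x
hard limit from the description. The names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define MIB (1024ULL * 1024)

/*
 * Illustrative model only: the hard limit (double the background push
 * threshold) expressed as a percentage of the log size.
 */
static double cil_hard_limit_percent(uint64_t log_bytes)
{
	assert(log_bytes > 0);

	uint64_t threshold = log_bytes / 8;	/* 12.5% of the log */

	if (threshold > 32 * MIB)		/* capped at 32MB */
		threshold = 32 * MIB;

	return 100.0 * (double)(2 * threshold) / (double)log_bytes;
}
```

For a 64MB log this gives 25%, and for a 2048MB log it gives 64MB out
of 2048MB, a little over 3%.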

Comments welcome.

-Dave.


* [PATCH v2 0/2] xfs: limit CIL push sizes
@ 2019-09-30  6:03 Dave Chinner
  2019-09-30  6:03 ` [PATCH 1/2] xfs: Lower CIL flush limit for large logs Dave Chinner
  0 siblings, 1 reply; 13+ messages in thread
From: Dave Chinner @ 2019-09-30  6:03 UTC
  To: linux-xfs

Hi folks,

Version 2 of the CIL push size limiting patches. The main changes in
this version are updates to comments describing behaviour, making it
clear this isn't a hard limit but a method of providing schedule
points that will allow the CIL push to proceed rather than being
held off indefinitely by ongoing work.

The original patchset was here:

https://lore.kernel.org/linux-xfs/20190909015159.19662-1-david@fromorbit.com/

Cheers,

Dave.




end of thread, other threads:[~2019-09-30 16:55 UTC]

Thread overview: 13+ messages
2019-09-09  1:51 [RFC PATCH 0/2] xfs: hard limit background CIL push size Dave Chinner
2019-09-09  1:51 ` [PATCH 1/2] xfs: Lower CIL flush limit for large logs Dave Chinner
2019-09-16 16:33   ` Darrick J. Wong
2019-09-24 22:29     ` Dave Chinner
2019-09-25 12:08       ` Brian Foster
2019-09-27 22:47         ` Dave Chinner
2019-09-30 12:24           ` Brian Foster
2019-09-09  1:51 ` [PATCH 2/2] xfs: hard limit the background CIL push Dave Chinner
2019-09-16 16:42   ` Darrick J. Wong
2019-09-24 22:36     ` Dave Chinner
2019-09-24 22:41       ` Darrick J. Wong
2019-09-30  6:03 [PATCH v2 0/2] xfs: limit CIL push sizes Dave Chinner
2019-09-30  6:03 ` [PATCH 1/2] xfs: Lower CIL flush limit for large logs Dave Chinner
2019-09-30 16:55   ` Brian Foster
