From: Dave Chinner <david@fromorbit.com>
To: Brian Foster <bfoster@redhat.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 3/5] [RFC] xfs: use percpu counters for CIL context counters
Date: Thu, 14 May 2020 11:50:55 +1000
Message-ID: <20200514015055.GI2040@dread.disaster.area>
In-Reply-To: <20200513215241.GG2040@dread.disaster.area>

On Thu, May 14, 2020 at 07:52:41AM +1000, Dave Chinner wrote:
> On Wed, May 13, 2020 at 08:09:59AM -0400, Brian Foster wrote:
> > On Wed, May 13, 2020 at 09:36:27AM +1000, Dave Chinner wrote:
> > > On Tue, May 12, 2020 at 10:05:44AM -0400, Brian Foster wrote:
> > > > Particularly as it relates to percpu functionality. Does
> > > > the window scale with cpu count, for example? It might not matter either
> > > 
> > > Not really. We need a thundering herd to cause issues, and this
> > > occurs after formatting an item so we won't get a huge thundering
> > > herd even when lots of threads block on the xc_ctx_lock waiting for
> > > a push to complete.
> > > 
> > 
> > It would be nice to have some debug code somewhere that somehow or
> > another asserts or warns if the CIL reservation exceeds some
> > insane/unexpected heuristic based on the current size of the context. I
> > don't know what that code or heuristic looks like (i.e. multiple factors
> > of the ctx size?) so I'm obviously handwaving. Just something to think
> > about if we can come up with a way to accomplish that opportunistically.
> 
> I don't think there is a reliable mechanism that can be used here.
> At one end of the scale we have the valid case of a synchronous
> inode modification on a log with a 256k stripe unit. So it's valid
> to have a CIL reservation of ~550kB for a single item that consumes
> ~700 bytes of log space.
> 
> OTOH, we might be freeing extents on a massively fragmented file and
> filesystem, so we're pushing 200kB+ transactions into the CIL for
> every rolling transaction. On a filesystem with a 512 byte log
> sector size and no LSU, the CIL reservations are dwarfed by the
> actual metadata being logged...
> 
> I'd suggest that looking at the ungrant trace for the CIL ticket
> once it has committed will tell us exactly how much the reservation
> was over-estimated, as the unused portion of the reservation will be
> returned to the reserve grant head at this point in time.

A typical CIL ticket for this workload looks like this at
ungrant time:

t_curr_res 13408 t_unit_res 231100
t_curr_res 9240 t_unit_res 140724
t_curr_res 46284 t_unit_res 263964
t_curr_res 29780 t_unit_res 190020
t_curr_res 38044 t_unit_res 342016
t_curr_res 21636 t_unit_res 321476
t_curr_res 21576 t_unit_res 263964
t_curr_res 42200 t_unit_res 411852
t_curr_res 21636 t_unit_res 292720
t_curr_res 62740 t_unit_res 514552
t_curr_res 17456 t_unit_res 284504
t_curr_res 29852 t_unit_res 411852
t_curr_res 13384 t_unit_res 206452
t_curr_res 70956 t_unit_res 518660
t_curr_res 70908 t_unit_res 333800
t_curr_res 50404 t_unit_res 518660
t_curr_res 17480 t_unit_res 321476
t_curr_res 33948 t_unit_res 436500
t_curr_res 17492 t_unit_res 317368
t_curr_res 50392 t_unit_res 489904
t_curr_res 13360 t_unit_res 325584
t_curr_res 66812 t_unit_res 506336
t_curr_res 33924 t_unit_res 366664
t_curr_res 70932 t_unit_res 551524
t_curr_res 29852 t_unit_res 374880
t_curr_res 25720 t_unit_res 494012
t_curr_res 42152 t_unit_res 506336
t_curr_res 21684 t_unit_res 543308
t_curr_res 29840 t_unit_res 440608
t_curr_res 46320 t_unit_res 551524
t_curr_res 21624 t_unit_res 387204
t_curr_res 29840 t_unit_res 522768

So we are looking at a reservation of up to ~550kB, and typically
using all but a few tens of kB of it.
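
For anyone wanting to check the arithmetic, here is a minimal sketch
that summarizes ungrant samples like the ones above. It assumes
t_curr_res is the unused remainder of the reservation at ungrant time
and t_unit_res is the full reservation, both in bytes. The function
names and the four-line sample are illustrative only and are not part
of any XFS tooling.

```python
import re

# A few of the ungrant samples quoted above (values in bytes).
SAMPLES = """\
t_curr_res 13408 t_unit_res 231100
t_curr_res 9240 t_unit_res 140724
t_curr_res 70932 t_unit_res 551524
t_curr_res 21684 t_unit_res 543308
"""

# Matches one trace line in the format shown in this message.
LINE_RE = re.compile(r"t_curr_res (\d+) t_unit_res (\d+)")


def parse_tickets(text):
    """Return a list of (unused, reserved) byte pairs, one per trace line."""
    return [(int(m.group(1)), int(m.group(2))) for m in LINE_RE.finditer(text)]


def summarize(tickets):
    """Summarize how much of each reservation was consumed.

    Assumption: t_curr_res is what is left over at ungrant time, so
    the consumed amount is t_unit_res - t_curr_res.
    """
    used_fractions = [(res - unused) / res for unused, res in tickets]
    return {
        "max_reserved": max(res for _, res in tickets),
        "max_unused": max(unused for unused, _ in tickets),
        "min_used_fraction": min(used_fractions),
    }


if __name__ == "__main__":
    print(summarize(parse_tickets(SAMPLES)))
```

Feeding the full list of samples through parse_tickets() instead of the
four-line excerpt gives the same kind of summary for the whole trace.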

I'll use this as the ballpark for the lockless code.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
