From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: chandan.babu@oracle.com, chandanrlinux@gmail.com,
	linux-xfs@vger.kernel.org
Subject: Re: [PATCH 11/14] xfs: dynamically allocate cursors based on maxlevels
Date: Tue, 21 Sep 2021 09:06:35 +1000
Message-ID: <20210920230635.GM1756565@dread.disaster.area>
In-Reply-To: <163192861018.416199.11733078081556457241.stgit@magnolia>

On Fri, Sep 17, 2021 at 06:30:10PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Replace the statically-sized btree cursor zone with dynamically sized
> allocations so that we can reduce the memory overhead for per-AG bt
> cursors while handling very tall btrees for rt metadata.

Hmmmmm. We do a *lot* of btree cursor allocation and freeing under
load. Keeping that in a single slab rather than using heap memory is
a good idea for stuff like this for many reasons...

I mean, if we are creating a million inodes a second, a rough
back-of-the-envelope calculation says we are doing 3-4 million btree
cursor instantiations a second. That's a lot of short-term churn that
we don't really need to subject the heap to. And even a few extra
instructions in a path called millions of times a second add up to a
lot of extra runtime overhead.

So....

> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
>  fs/xfs/libxfs/xfs_btree.c |   40 ++++++++++++++++++++++++++++++++--------
>  fs/xfs/libxfs/xfs_btree.h |    2 --
>  fs/xfs/xfs_super.c        |   11 +----------
>  3 files changed, 33 insertions(+), 20 deletions(-)
> 
> 
> diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
> index 2486ba22c01d..f9516828a847 100644
> --- a/fs/xfs/libxfs/xfs_btree.c
> +++ b/fs/xfs/libxfs/xfs_btree.c
> @@ -23,11 +23,6 @@
>  #include "xfs_btree_staging.h"
>  #include "xfs_ag.h"
>  
> -/*
> - * Cursor allocation zone.
> - */
> -kmem_zone_t	*xfs_btree_cur_zone;
> -
>  /*
>   * Btree magic numbers.
>   */
> @@ -379,7 +374,7 @@ xfs_btree_del_cursor(
>  		kmem_free(cur->bc_ops);
>  	if (!(cur->bc_flags & XFS_BTREE_LONG_PTRS) && cur->bc_ag.pag)
>  		xfs_perag_put(cur->bc_ag.pag);
> -	kmem_cache_free(xfs_btree_cur_zone, cur);
> +	kmem_free(cur);
>  }
>  
>  /*
> @@ -4927,6 +4922,32 @@ xfs_btree_has_more_records(
>  		return block->bb_u.s.bb_rightsib != cpu_to_be32(NULLAGBLOCK);
>  }
>  
> +/* Compute the maximum allowed height for a given btree type. */
> +static unsigned int
> +xfs_btree_maxlevels(
> +	struct xfs_mount	*mp,
> +	xfs_btnum_t		btnum)
> +{
> +	switch (btnum) {
> +	case XFS_BTNUM_BNO:
> +	case XFS_BTNUM_CNT:
> +		return mp->m_ag_maxlevels;
> +	case XFS_BTNUM_BMAP:
> +		return max(mp->m_bm_maxlevels[XFS_DATA_FORK],
> +			   mp->m_bm_maxlevels[XFS_ATTR_FORK]);
> +	case XFS_BTNUM_INO:
> +	case XFS_BTNUM_FINO:
> +		return M_IGEO(mp)->inobt_maxlevels;
> +	case XFS_BTNUM_RMAP:
> +		return mp->m_rmap_maxlevels;
> +	case XFS_BTNUM_REFC:
> +		return mp->m_refc_maxlevels;
> +	default:
> +		ASSERT(0);
> +		return XFS_BTREE_MAXLEVELS;
> +	}
> +}
> +
>  /* Allocate a new btree cursor of the appropriate size. */
>  struct xfs_btree_cur *
>  xfs_btree_alloc_cursor(
> @@ -4935,13 +4956,16 @@ xfs_btree_alloc_cursor(
>  	xfs_btnum_t		btnum)
>  {
>  	struct xfs_btree_cur	*cur;
> +	unsigned int		maxlevels = xfs_btree_maxlevels(mp, btnum);
>  
> -	cur = kmem_cache_zalloc(xfs_btree_cur_zone, GFP_NOFS | __GFP_NOFAIL);
> +	ASSERT(maxlevels <= XFS_BTREE_MAXLEVELS);
> +
> +	cur = kmem_zalloc(xfs_btree_cur_sizeof(maxlevels), KM_NOFS);

Instead of multiple dynamic runtime calculations to determine the
size to allocate from the heap, which then has to select a slab
based on size, why don't we just pre-calculate the max size of
the cursor at XFS module init and use that for the btree cursor slab
size?

The memory overhead of the cursor isn't an issue because we've been
maximally sizing it since forever, and the whole point of a slab
cache is to minimise allocation overhead of frequently allocated
objects. It seems to me that we really want to retain these
properties of the cursor allocator, not give them up just as we're
in the process of making other modifications that will hit the path
more frequently than it's ever been hit before...
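
A minimal userspace sketch of that suggestion follows. Everything named
demo_* is hypothetical and only stands in for the real XFS structures,
and calloc() stands in for a zeroing slab allocation. The object size is
computed once at init from the tallest btree type and then only read on
the allocation path:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-level cursor state; not the real XFS layout. */
struct demo_level {
	void		*bp;	/* buffer backing this level */
	int		ptr;	/* record index within the block */
};

/* Hypothetical cursor with one trailing slot per btree level. */
struct demo_cursor {
	unsigned int		nlevels;
	unsigned int		maxlevels;
	struct demo_level	levels[];
};

/* Bytes needed for a cursor that can walk a tree of this height. */
size_t
demo_cursor_sizeof(unsigned int nlevels)
{
	return sizeof(struct demo_cursor) +
	       nlevels * sizeof(struct demo_level);
}

/* Written once at init time, read-only afterwards. */
size_t		demo_cursor_objsize;
unsigned int	demo_cursor_maxlevels;

/* Size the single cursor cache for the tallest btree type. */
void
demo_cursor_cache_init(const unsigned int *maxlevels, size_t ntypes)
{
	unsigned int	max = 0;
	size_t		i;

	for (i = 0; i < ntypes; i++)
		if (maxlevels[i] > max)
			max = maxlevels[i];

	demo_cursor_maxlevels = max;
	demo_cursor_objsize = demo_cursor_sizeof(max);
}

/* Hot path: no size calculation, every object has the same size. */
struct demo_cursor *
demo_cursor_alloc(unsigned int maxlevels)
{
	struct demo_cursor	*cur;

	assert(maxlevels <= demo_cursor_maxlevels);

	cur = calloc(1, demo_cursor_objsize);
	if (!cur)
		return NULL;
	cur->maxlevels = maxlevels;
	return cur;
}
```

The cursor still records its own maxlevels, so the dynamic height
guards keep working; only the allocation size stays fixed.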

I like all the dynamic sized guards that this series places in the
cursor, but I don't think we want to change the way we allocate the
cursors just to support that.

FWIW, an example of avoidable runtime calculation overhead of
constants is xlog_calc_unit_res(). These values are actually
constant for a given transaction reservation, but at 1.6 million
transactions a second it shows up at #20 on the flat profile of
functions using the most CPU:

0.71%  [kernel]  [k] xlog_calc_unit_res

0.71% of 32 CPUs for 1.6 million calculations a second of the same
constants is a non-trivial amount of CPU time to spend doing
unnecessary repeated calculations.
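
As an illustration of the general pattern only (this is not the real
xlog_calc_unit_res() logic; the demo_* names, the header size and the
rounding are all assumptions), the constant can be computed once when
the reservation is set up and then just read back on the hot path:

```c
#include <stddef.h>

/* Hypothetical reservation descriptor; not the real XFS structure. */
struct demo_res {
	unsigned int	logres;		/* bytes requested by this type */
	unsigned int	unit_res;	/* cached result, set at setup */
};

/* Stand-in for a pure calculation whose inputs never change. */
unsigned int
demo_calc_unit_res(unsigned int logres, unsigned int hdr_size,
		   unsigned int round)
{
	unsigned int	bytes = logres + hdr_size;

	/* Round up to the next multiple of 'round'. */
	return (bytes + round - 1) / round * round;
}

/* Run once at setup: fill in the cached value for every type. */
void
demo_res_setup(struct demo_res *res, size_t nres,
	       unsigned int hdr_size, unsigned int round)
{
	size_t	i;

	for (i = 0; i < nres; i++)
		res[i].unit_res = demo_calc_unit_res(res[i].logres,
						     hdr_size, round);
}

/* Hot path: a single load instead of a repeated calculation. */
unsigned int
demo_res_unit(const struct demo_res *res)
{
	return res->unit_res;
}
```

The cached value only needs recomputing if the inputs change, which
for constants of this kind means at setup time only.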

Even though the btree cursor constant calculations are simpler than
the log res calculations, they are more frequent. Hence on general
principles of efficiency, I don't think we want to be replacing high
frequency, low overhead slab/zone based allocations with heap
allocations that require repeated constant calculations and
size->slab redirection....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
