From: Brian Foster <bfoster@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 15/24] xfs: eagerly free shadow buffers to reduce CIL footprint
Date: Tue, 6 Aug 2019 08:57:27 -0400
Message-ID: <20190806125727.GD2979@bfoster>
In-Reply-To: <20190805233326.GA7777@dread.disaster.area>

On Tue, Aug 06, 2019 at 09:33:26AM +1000, Dave Chinner wrote:
> On Mon, Aug 05, 2019 at 02:03:01PM -0400, Brian Foster wrote:
> > On Thu, Aug 01, 2019 at 12:17:43PM +1000, Dave Chinner wrote:
> > > From: Dave Chinner <dchinner@redhat.com>
> > > 
> > > The CIL can pin a lot of memory and effectively defines the lower
> > > free memory boundary of operation for XFS. The way we hang onto
> > > log item shadow buffers "just in case" effectively doubles the
> > > memory footprint of the CIL for dubious reasons.
> > > 
> > > That is, we hang onto the old shadow buffer in case the next time
> > > we log the item it will fit into the shadow buffer and we won't have
> > > to allocate a new one. However, we only ever tend to grow dirty
> > > objects in the CIL through relogging, so once we've allocated a
> > > larger buffer the old buffer we set as a shadow buffer will never
> > > get reused as the amount we log never decreases until the item is
> > > clean. And then for buffer items we free the log item and the shadow
> > > buffers, anyway. Inode items will hold onto their shadow buffer
> > > until they are reclaimed - this could double the inode's memory
> > > footprint for its lifetime...
> > > 
> > > Hence we should just free the old log item buffer when we replace it
> > > with a new shadow buffer rather than storing it for later use. It's
> > > not useful, get rid of it as early as possible.
> > > 
> > > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> > > ---
> > >  fs/xfs/xfs_log_cil.c | 7 +++----
> > >  1 file changed, 3 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c
> > > index fa5602d0fd7f..1863a9bdf4a9 100644
> > > --- a/fs/xfs/xfs_log_cil.c
> > > +++ b/fs/xfs/xfs_log_cil.c
> > > @@ -238,9 +238,7 @@ xfs_cil_prepare_item(
> > >  	/*
> > >  	 * If there is no old LV, this is the first time we've seen the item in
> > >  	 * this CIL context and so we need to pin it. If we are replacing the
> > > -	 * old_lv, then remove the space it accounts for and make it the shadow
> > > -	 * buffer for later freeing. In both cases we are now switching to the
> > > -	 * shadow buffer, so update the the pointer to it appropriately.
> > > +	 * old_lv, then remove the space it accounts for and free it.
> > >  	 */
> > 
> > The comment above xlog_cil_alloc_shadow_bufs() needs a similar update
> > around how we handle the old buffer when the shadow buffer is used.
> 
> *nod*
> 
> > 
> > >  	if (!old_lv) {
> > >  		if (lv->lv_item->li_ops->iop_pin)
> > > @@ -251,7 +249,8 @@ xfs_cil_prepare_item(
> > >  
> > >  		*diff_len -= old_lv->lv_bytes;
> > >  		*diff_iovecs -= old_lv->lv_niovecs;
> > > -		lv->lv_item->li_lv_shadow = old_lv;
> > > +		kmem_free(old_lv);
> > > +		lv->lv_item->li_lv_shadow = NULL;
> > >  	}
> > 
> > So IIUC this is the case where we allocated a shadow buffer, the item
> > was already pinned (so old_lv is still around) but we ended up using the
> > shadow buffer for this relog. Instead of keeping the old buffer around
> > as a new shadow, we toss it. That makes sense, but if the objective is
> > to not leave dangling shadow buffers around as such, what about the case
> > where we allocated a shadow buffer but didn't end up using it because
> > old_lv was reusable? It looks like we still keep the shadow buffer
> > around in that scenario with a similar lifetime as the swapout scenario
> > this patch removes. Hm?
> 
> Off the top of my head, we shouldn't allocate a new shadow buffer in
> that case (see xlog_cil_alloc_shadow_bufs()). i.e. we check up front
> if the formatted size of the item will fit in the existing buffer,
> and if it does we do not allocate a new shadow buffer as we just
> reuse the existing one. So we should only have to free a shadow
> buffer when we switch them, not when we overwrite.
> 

We have such a check in xlog_cil_insert_format_items(), so we'd reuse
->li_lv if it will suffice even if we have a shadow buffer available.

> I'll recheck this, but I'm pretty sure overwrite won't leave a
> shadow buffer around.
> 

But before that we have the following logic:

static void
xlog_cil_alloc_shadow_bufs(
	...

	if (!lip->li_lv_shadow ||
	    buf_size > lip->li_lv_shadow->lv_size) {
		...
		lv = kmem_alloc_large(buf_size, KM_SLEEP | KM_NOFS);
		...
		lip->li_lv_shadow = lv;
	} else {
		<reuse shadow>
	}
	...
}

... which always allocates a shadow buffer if one doesn't exist. We
don't look at the currently used (lip->li_lv) buffer at all here. IIUC,
that has to do with the TOCTOU race described in the big comment above
the function... hm?
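
To make the sequence concrete, here is a minimal userspace sketch of the two
steps as read above. It is only a model under stated assumptions, not the
kernel code: struct model_lv, struct model_item and every model_* function are
hypothetical stand-ins, malloc()/free() replace the kmem helpers, and the
eager_free flag stands for the behaviour this patch adds. Pinning, iovec
accounting and locking are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the log vector and log item; not kernel types. */
struct model_lv {
	size_t lv_size;			/* capacity of this buffer */
};

struct model_item {
	struct model_lv *li_lv;		/* buffer currently attached to the CIL */
	struct model_lv *li_lv_shadow;	/* preallocated shadow buffer */
};

static struct model_lv *model_lv_alloc(size_t size)
{
	struct model_lv *lv = malloc(sizeof(*lv));

	if (!lv)
		abort();
	lv->lv_size = size;
	return lv;
}

/* Step 1: same shape as the check quoted above; li_lv is never consulted. */
static void model_alloc_shadow(struct model_item *lip, size_t buf_size)
{
	if (!lip->li_lv_shadow || buf_size > lip->li_lv_shadow->lv_size) {
		free(lip->li_lv_shadow);	/* free(NULL) is a no-op */
		lip->li_lv_shadow = model_lv_alloc(buf_size);
	}
	/* else: the existing shadow buffer is big enough and is reused */
}

/* Step 2: overwrite li_lv in place if it fits, else switch to the shadow. */
static void model_format_item(struct model_item *lip, size_t buf_size,
			      int eager_free)
{
	struct model_lv *old_lv = lip->li_lv;

	if (old_lv && buf_size <= old_lv->lv_size)
		return;			/* in-place relog, shadow untouched */

	lip->li_lv = lip->li_lv_shadow;
	if (eager_free) {
		free(old_lv);		/* modelled behaviour of this patch */
		lip->li_lv_shadow = NULL;
	} else {
		lip->li_lv_shadow = old_lv;	/* NULL on the first log */
	}
}

/* Returns 1 if a shadow buffer stays attached after an in-place relog. */
static int model_relog_leaves_shadow(int eager_free)
{
	struct model_item item = { NULL, NULL };
	int dangling;

	/* First log of the item: the shadow buffer becomes li_lv. */
	model_alloc_shadow(&item, 128);
	model_format_item(&item, 128, eager_free);
	assert(item.li_lv_shadow == NULL);

	/* Relog at the same size: step 1 allocates, step 2 reuses li_lv. */
	model_alloc_shadow(&item, 128);
	model_format_item(&item, 128, eager_free);
	dangling = item.li_lv_shadow != NULL;

	free(item.li_lv_shadow);
	free(item.li_lv);
	return dangling;
}
```

In this model, model_relog_leaves_shadow() returns 1 whether eager_free is set
or not, i.e. the in-place relog path still leaves a shadow buffer attached,
which is the case I'm asking about. If the real code differs from this model
somewhere, that's probably the bit I'm missing.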

Brian

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
