From: Omar Sandoval <osandov@osandov.com>
To: Liu Bo <bo.li.liu@oracle.com>
Cc: linux-btrfs@vger.kernel.org, Josef Bacik <jbacik@fb.com>,
	kernel-team@fb.com
Subject: Re: [PATCH 6/7] Btrfs: rework delayed ref total_bytes_pinned accounting
Date: Fri, 9 Jun 2017 16:38:42 -0700
Message-ID: <20170609233842.GA15078@vader.Home>
In-Reply-To: <20170607201810.GB16793@lim.localdomain>

On Wed, Jun 07, 2017 at 01:18:10PM -0700, Liu Bo wrote:
> On Tue, Jun 06, 2017 at 04:45:31PM -0700, Omar Sandoval wrote:
> > From: Omar Sandoval <osandov@fb.com>
> > 
> > The total_bytes_pinned counter is completely broken when accounting
> > delayed refs:
> > 
> > - If two drops for the same extent are merged, we will decrement
> >   total_bytes_pinned twice but only increment it once.
> > - If an add is merged into a drop or vice versa, we will decrement the
> >   total_bytes_pinned counter but never increment it.
> > - If multiple references to an extent are dropped, we will account it
> >   multiple times, potentially vastly over-estimating the number of bytes
> >   that will be freed by a commit and doing unnecessary work when we're
> >   close to ENOSPC.
> > 
> > The last issue is relatively minor, but the first two make the
> > total_bytes_pinned counter leak or underflow very often. These
> > accounting issues were introduced in b150a4f10d87 ("Btrfs: use a percpu
> > to keep track of possibly pinned bytes"), but they were papered over by
> > zeroing out the counter on every commit until d288db5dc011 ("Btrfs: fix
> > race of using total_bytes_pinned").
> > 
> > We need to make sure that an extent is accounted as pinned exactly once
> > if and only if we will drop references to it when the transaction
> > is committed. Ideally we would only add to total_bytes_pinned when the
> > *last* reference is dropped, but this information isn't readily
> > available for data extents. Again, this over-estimation can lead to
> > extra commits when we're close to ENOSPC, but it's not as bad as before.
> > 
> > The fix implemented here is to increment total_bytes_pinned when the
> > total refmod count for an extent goes negative and decrement it if the
> > refmod count goes back to non-negative or after we've run all of the
> > delayed refs for that extent.
> >
> 
> The patch could be cleaner if we inc/dec %pinned inside delayed_ref.c.
> 
> The idea looks good to me.
> 
> Reviewed-by: Liu Bo <bo.li.liu@oracle.com>

Yeah, I think that'll work. My first reaction was that it'd be a
layering violation, but I think it makes sense, since this counter
really is necessary because of delayed refs.
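
For readers following the thread, the accounting rule from the quoted
patch description (pin when the total refmod count for an extent goes
negative, unpin when it returns to non-negative or once all delayed refs
for the extent have been run) can be sketched roughly as below. This is
only an illustration under stated assumptions, not the code from the
patch: struct extent_head, queue_delayed_ref(), finish_head() and the
plain global counter are hypothetical stand-ins for the kernel's
structures and its percpu counter.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a delayed ref head; not the kernel struct. */
struct extent_head {
	int64_t num_bytes;	/* size of the extent */
	int total_ref_mod;	/* net refmod of all queued delayed refs */
};

/* Hypothetical stand-in for the total_bytes_pinned counter. */
static int64_t total_bytes_pinned;

/* Queue one delayed ref; mod is +1 for an add and -1 for a drop. */
static void queue_delayed_ref(struct extent_head *head, int mod)
{
	int old_ref_mod = head->total_ref_mod;
	int new_ref_mod = old_ref_mod + mod;

	head->total_ref_mod = new_ref_mod;

	/* Account the extent exactly once, on the sign transition only. */
	if (old_ref_mod >= 0 && new_ref_mod < 0)
		total_bytes_pinned += head->num_bytes;
	else if (old_ref_mod < 0 && new_ref_mod >= 0)
		total_bytes_pinned -= head->num_bytes;
}

/* Called after all delayed refs for this extent have been run. */
static void finish_head(struct extent_head *head)
{
	if (head->total_ref_mod < 0)
		total_bytes_pinned -= head->num_bytes;
	head->total_ref_mod = 0;
}
```

With this rule, two merged drops for the same extent add num_bytes only
once, and an add that cancels a drop removes it again, so the counter
neither leaks nor underflows in the cases listed in the patch
description.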


Thread overview: 23+ messages
2017-06-06 23:45 [PATCH 0/7] Btrfs: fix total_bytes_pinned counter Omar Sandoval
2017-06-06 23:45 ` [PATCH 1/7] Btrfs: make add_pinned_bytes() take an s64 num_bytes instead of u64 Omar Sandoval
2017-06-12 13:39   ` David Sterba
2017-06-12 17:34   ` Liu Bo
2017-06-06 23:45 ` [PATCH 2/7] Btrfs: make BUG_ON() in add_pinned_bytes() an ASSERT() Omar Sandoval
2017-06-12 13:26   ` David Sterba
2017-06-21 17:31   ` David Sterba
2017-06-06 23:45 ` [PATCH 3/7] Btrfs: update total_bytes_pinned when pinning down extents Omar Sandoval
2017-06-12 17:37   ` Liu Bo
2017-06-06 23:45 ` [PATCH 4/7] Btrfs: always account pinned bytes when dropping a tree block ref Omar Sandoval
2017-06-07 20:20   ` Liu Bo
2017-06-06 23:45 ` [PATCH 5/7] Btrfs: return old and new total ref mods when adding delayed refs Omar Sandoval
2017-06-07 20:06   ` Liu Bo
2017-06-06 23:45 ` [PATCH 6/7] Btrfs: rework delayed ref total_bytes_pinned accounting Omar Sandoval
2017-06-07 20:18   ` Liu Bo
2017-06-09 23:38     ` Omar Sandoval [this message]
2017-06-06 23:45 ` [PATCH 7/7] Btrfs: warn if total_bytes_pinned is non-zero on unmount Omar Sandoval
2017-06-07 20:22   ` Liu Bo
2017-06-09 23:45     ` Omar Sandoval
2017-06-13 18:35   ` Jeff Mahoney
2017-06-21 17:40   ` David Sterba
2017-06-07 15:48 ` [PATCH 0/7] Btrfs: fix total_bytes_pinned counter Holger Hoffstätte
2017-06-07 17:37   ` Omar Sandoval
