From: Brian Foster <bfoster@redhat.com>
To: Christoph Hellwig <hch@lst.de>
Cc: linux-xfs@vger.kernel.org, eguan@redhat.com, darrick.wong@oracle.com
Subject: Re: [PATCH 1/4] xfs: fix bogus minleft manipulations
Date: Tue, 20 Dec 2016 09:17:47 -0500
Message-ID: <20161220141747.GA25290@bfoster.bfoster>
In-Reply-To: <20161219113826.GA26535@lst.de>

On Mon, Dec 19, 2016 at 12:38:26PM +0100, Christoph Hellwig wrote:
> On Thu, Dec 15, 2016 at 09:34:33AM -0500, Brian Foster wrote:
> > FWIW, I was playing with this a bit more and managed to manufacture a
> > filesystem layout that this series doesn't handle too well. Emphasis on
> > "manufactured" because this might not be a likely real world scenario,
> > but either way the current code handles it fine.
> 
> It does, although mostly by accident.  I suspect with an even better
> manufactured image you could also drive the current code to its knees,
> e.g. only have one single block free in the first few AGs, and then
> a small number just higher than that in a higher AG.
> 

Perhaps. I certainly wouldn't expect the code in its current form to be
perfect. It's hard enough to understand as it is. Just trying to avoid
regressions and properly scope the required fix...

> > I've attached a metadump of the offending image. mdrestore it, mount and
> > attempt something like 'dd if=/dev/zero of=/mnt/file' on the root. The
> > buffered write looks like it's in a livelock, waiting indefinitely for a
> > writeback cycle that will never complete...
> 
> Yeah, that's the loop that keeps going even if it can't allocate any
> blocks, which seems generally bogus.  But even without that we'd get
> ENOSPC despite not having a reservation. Which is a little easier to
> debug, but just as wrong.
> 

Indeed.

> The only good way out I can see is to not hand out any more reservations
> after we only have nr_ags * xfs_bmap_worst_indlen(1) available.  I'll
> see if I can come up with a patch for that.

Hmm, so the idea is to basically find a way we can infer accurate
information about the per-AG state at the time blocks are reserved from
the global pool (i.e., buffered write time) and cut off writes at the
point we can no longer guarantee at least one AG can satisfy the
smallest write..?
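
To make sure I'm reading the proposal correctly, here is a minimal sketch
of the cutoff as I understand it. This is a userspace model, not XFS code:
struct fs_model, reserve_floor() and try_reserve() are hypothetical names,
and worst_indlen1 just stands in for whatever xfs_bmap_worst_indlen(1)
would return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the global free-space state; not an XFS structure. */
struct fs_model {
	uint64_t free_blocks;   /* blocks still available in the global pool */
	uint32_t nr_ags;        /* number of allocation groups */
	uint64_t worst_indlen1; /* assumed value of xfs_bmap_worst_indlen(1) */
};

/* Blocks held back: nr_ags * xfs_bmap_worst_indlen(1) in the proposal. */
static uint64_t reserve_floor(const struct fs_model *fs)
{
	assert(fs->nr_ags > 0);
	return (uint64_t)fs->nr_ags * fs->worst_indlen1;
}

/* Grant a reservation of len blocks only if the pool stays at or above
 * the floor afterwards; otherwise the write would fail up front. */
static bool try_reserve(struct fs_model *fs, uint64_t len)
{
	uint64_t floor = reserve_floor(fs);

	if (fs->free_blocks <= floor || fs->free_blocks - floor < len)
		return false;   /* models ENOSPC at buffered write time */
	fs->free_blocks -= len;
	return true;
}
```

Note that this only tracks the global counter. It says nothing about how
the remaining blocks are spread across the AGs.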

If so, that seems reasonable to me in principle. I'd have to think about
it a bit more. The first question that comes to mind is that we'd have
to make sure all allocations honor the minleft heuristic, yes? (Or
perhaps not allow any allocations after this point?) Otherwise, what
prevents the assumption of (available > nr_ags *
xfs_bmap_worst_indlen(1)) from becoming false after the reservation has
been granted but before the physical allocation is attempted at
writeback time? E.g., write/reserve the last available delalloc block,
then chew up the remaining minleft in each AG via sparse inode allocs or
something (for example), then writeback occurs and can't find an AG to
honor minleft (??).
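
A rough model of the sequence I'm worried about, under the assumption that
some allocations do not honor minleft. Again, this is a userspace sketch
with hypothetical names (struct ag_model, find_ag(),
alloc_ignoring_minleft()), not the real allocator:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_AGS 4

/* Hypothetical per-AG free block counts; not an XFS structure. */
struct ag_model {
	uint64_t free[NR_AGS];
};

/* Return the first AG that can hold len blocks and still keep minleft
 * blocks free afterwards, or -1 if no AG qualifies. */
static int find_ag(const struct ag_model *m, uint64_t len, uint64_t minleft)
{
	for (size_t i = 0; i < NR_AGS; i++)
		if (m->free[i] >= len + minleft)
			return (int)i;
	return -1;
}

/* An allocation that ignores minleft (e.g. the sparse inode case):
 * it succeeds as long as the AG has len blocks free at all. */
static bool alloc_ignoring_minleft(struct ag_model *m, size_t ag, uint64_t len)
{
	if (ag >= NR_AGS || m->free[ag] < len)
		return false;
	m->free[ag] -= len;
	return true;
}
```

With 3 free blocks in each AG and minleft = 2, find_ag() succeeds for a
one-block allocation at reservation time. If alloc_ignoring_minleft() then
takes one block from every AG before writeback, the same find_ag() call
returns -1 even though the reservation was already granted.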

Brian
