From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-xfs@vger.kernel.org, Brian Foster <bfoster@redhat.com>,
	xfs@oss.sgi.com
Subject: Re: "Bad page state" errors when calling BULKSTAT under memory pressure?
Date: Thu, 25 Aug 2016 10:35:48 +1000	[thread overview]
Message-ID: <20160825003548.GD19025@dastard> (raw)
In-Reply-To: <20160824234237.GA22760@birch.djwong.org>

On Wed, Aug 24, 2016 at 04:42:37PM -0700, Darrick J. Wong wrote:
> Hi everyone,
> 
> [cc'ing Brian because he was the last one to touch xfs_buf.c]
> 
> I've been stress-testing xfs_scrub against a 900GB filesystem with 2M inodes
> using a VM with 512M of RAM.  I've noticed that I get BUG messages about
> pages with negative refcount, but only if the system is under memory pressure.
> No errors are seen if the VM memory is increased to, say, 20GB.
> 
> : BUG: Bad page state in process xfs_scrub  pfn:00426
> : page:ffffea0000010980 count:-1 mapcount:0 mapping:          (null) index:0x0
> : flags: 0x0()
> : page dumped because: nonzero _count

Unless we are double-freeing a buffer, that's not an XFS problem.
Have you tried with memory poisoning and allocation debug turned on?

> : Modules linked in: xfs libcrc32c sch_fq_codel af_packet
> : CPU: 1 PID: 2058 Comm: xfs_scrub Not tainted 4.8.0-rc3-mcsum #18

The mm architecture was significantly modified in 4.8.0-rc1 - it
went from per-zone to per-node infrastructure, so it's entirely
possible this is a memory reclaim regression. Can you reproduce it
on an older kernel (e.g. 4.7.0)?

> Obviously, a page refcount of -1 is not a good sign.  I had a hunch that
> the page in question was (hopefully) a page backing an xfs_buf, so I
> applied the following debug patch:
> 
> diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> index 607cc29..144b976 100644
> --- a/fs/xfs/xfs_buf.c
> +++ b/fs/xfs/xfs_buf.c
> @@ -317,7 +317,7 @@ xfs_buf_free(
>  
>                 for (i = 0; i < bp->b_page_count; i++) {
>                         struct page     *page = bp->b_pages[i];
> -
> +if (page_ref_count(page) != 1) {xfs_err(NULL, "%s: OHNO! daddr=%llu page=%p ref=%d", __func__, bp->b_bn, page, page_ref_count(page)); dump_stack();}
>                         __free_page(page);
>                 }
>         } else if (bp->b_flags & _XBF_KMEM)
> 
> I then saw this:
> 
> : SGI XFS with ACLs, security attributes, realtime, debug enabled
> : XFS (sda): Mounting V4 Filesystem
> : XFS (sda): Ending clean mount
> : XFS: xfs_buf_free: OHNO! daddr=113849120 page=ffffea0000010980 ref=0

Which implies something else has dropped the page reference count on
us while we hold a reference to it. What you might like to check is
what the page reference counts are on /allocation/, to see if we're
being handed a page from the freelist with a bad ref count....

If the ref counts are good at allocation, but bad on free, then I
very much doubt it's an XFS problem. We don't actually touch the
page reference count anywhere, so let's make sure that it's not a
double free or something like that in XFS first.
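
To illustrate the idea of checking at both ends, here is a rough
userspace model in C. It is not kernel code: struct model_page,
model_alloc_page(), model_free_page() and the two counters are
hypothetical names made up for this sketch, and it assumes a healthy
page is handed out with a reference count of exactly 1. The point is
only that recording where the bad count is first seen tells you
whether the allocator or some later holder is at fault.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct page: only the refcount matters here. */
struct model_page {
	int refcount;
};

/* Where a bad reference count was observed (illustrative counters). */
static int bad_at_alloc;
static int bad_at_free;

/*
 * Model of page allocation. initial_ref lets a caller simulate an
 * allocator that hands out a page with a wrong count; the assumed
 * healthy value is 1.
 */
static struct model_page *model_alloc_page(int initial_ref)
{
	struct model_page *page = malloc(sizeof(*page));

	if (!page)
		return NULL;
	page->refcount = initial_ref;
	if (page->refcount != 1) {
		bad_at_alloc++;
		fprintf(stderr, "alloc: page=%p ref=%d\n",
			(void *)page, page->refcount);
	}
	return page;
}

/* Model of the free path: same check, mirroring the debug patch above. */
static void model_free_page(struct model_page *page)
{
	assert(page != NULL);
	if (page->refcount != 1) {
		bad_at_free++;
		fprintf(stderr, "free: page=%p ref=%d\n",
			(void *)page, page->refcount);
	}
	free(page);
}
```

If only bad_at_free ever goes up, the count was fine when the page was
handed out and something changed it afterwards. If bad_at_alloc goes
up, the page was already bad when it came off the freelist.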

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


