All of lore.kernel.org
 help / color / mirror / Atom feed
From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Andi Kleen <andi@firstfloor.org>, Tony Luck <tony.luck@intel.com>,
	"Kirill A. Shutemov" <kirill@shutemov.name>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Naoya Horiguchi <nao.horiguchi@gmail.com>
Subject: Re: [PATCH 2/4] mm/memory-failure: introduce get_hwpoison_page() for consistent refcount handling
Date: Wed, 13 May 2015 00:11:41 +0000	[thread overview]
Message-ID: <20150513001141.GA14599@hori1.linux.bs1.fc.nec.co.jp> (raw)
In-Reply-To: <20150512150017.4172e4b7bd549e16d8772753@linux-foundation.org>

On Tue, May 12, 2015 at 03:00:17PM -0700, Andrew Morton wrote:
> On Tue, 12 May 2015 09:46:47 +0000 Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:
> 
> > memory_failrue() can run in 2 different mode (specified by MF_COUNT_INCREASED)
> > in page refcount perspective. When MF_COUNT_INCREASED is set, memory_failrue()
> > assumes that the caller takes a refcount of the target page. And if cleared,
> > memory_failure() takes it in it's own.
> > 
> > In current code, however, refcounting is done differently in each caller. For
> > example, madvise_hwpoison() uses get_user_pages_fast() and hwpoison_inject()
> > uses get_page_unless_zero(). So this inconsistent refcounting causes refcount
> > failure especially for thp tail pages. Typical user visible effects are like
> > memory leak or VM_BUG_ON_PAGE(!page_count(page)) in isolate_lru_page().
> > 
> > To fix this refcounting issue, this patch introduces get_hwpoison_page() to
> > handle thp tail pages in the same manner for each caller of hwpoison code.
> > 
> > There's a non-trivial change around unpoisoning, which now returns immediately
> > for thp with "MCE: Memory failure is now running on %#lx\n" message. This is
> > not right when split_huge_page() fails. So this patch also allows
> > unpoison_memory() to handle thp.
> >
> > ...
> >
> >  /*
> > + * Get refcount for memory error handling:
> > + * - @page: raw page
> > + */
> > +inline int get_hwpoison_page(struct page *page)
> > +{
> > +	struct page *head = compound_head(page);
> > +
> > +	if (PageHuge(head))
> > +		return get_page_unless_zero(head);
> > +	else if (PageTransHuge(head))
> > +		if (get_page_unless_zero(head)) {
> > +			if (PageTail(page))
> > +				get_page(page);
> > +			return 1;
> > +		} else {
> > +			return 0;
> > +		}
> > +	else
> > +		return get_page_unless_zero(page);
> > +}
> 
> This function is a bit weird.
> 
> - The comment looks like kerneldoc but isn't kerneldoc

OK, will fix it.

> - Why the inline?  It isn't fastpath?

No, so I'll drop 'inline'.

> - The layout is rather painful.  It could be
> 
> 	if (PageHuge(head))
> 		return get_page_unless_zero(head);
> 
> 	if (PageTransHuge(head)) {
> 		if (get_page_unless_zero(head)) {
> 			if (PageTail(page))
> 				get_page(page);
> 			return 1;
> 		} else {
> 			return 0;
> 		}
> 	}
> 
> 	return get_page_unless_zero(page);

OK, will do like this.

> - Some illuminating comments would be nice.  In particular that code
>   path where it grabs a ref on the tail page as well as on the head
>   page.  What's going on there?

We can't call get_page_unless_zero() directly for thp tail pages because
that breaks thp's refcounting rule (refcount of tail pages is stored in
->_mapcount.) This code intends to comply with the rule in hwpoison code too.
So I'll comment the point.

Hmm, I found just now that I forget to put_page(head) in if (PageTail) block,
which leaks head page after thp split.
So it'll be fixed in the next version.

Thanks,
Naoya Horiguchi

WARNING: multiple messages have this Message-ID (diff)
From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Andi Kleen <andi@firstfloor.org>, Tony Luck <tony.luck@intel.com>,
	"Kirill A. Shutemov" <kirill@shutemov.name>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Naoya Horiguchi <nao.horiguchi@gmail.com>
Subject: Re: [PATCH 2/4] mm/memory-failure: introduce get_hwpoison_page() for consistent refcount handling
Date: Wed, 13 May 2015 00:11:41 +0000	[thread overview]
Message-ID: <20150513001141.GA14599@hori1.linux.bs1.fc.nec.co.jp> (raw)
In-Reply-To: <20150512150017.4172e4b7bd549e16d8772753@linux-foundation.org>

On Tue, May 12, 2015 at 03:00:17PM -0700, Andrew Morton wrote:
> On Tue, 12 May 2015 09:46:47 +0000 Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:
> 
> > memory_failrue() can run in 2 different mode (specified by MF_COUNT_INCREASED)
> > in page refcount perspective. When MF_COUNT_INCREASED is set, memory_failrue()
> > assumes that the caller takes a refcount of the target page. And if cleared,
> > memory_failure() takes it in it's own.
> > 
> > In current code, however, refcounting is done differently in each caller. For
> > example, madvise_hwpoison() uses get_user_pages_fast() and hwpoison_inject()
> > uses get_page_unless_zero(). So this inconsistent refcounting causes refcount
> > failure especially for thp tail pages. Typical user visible effects are like
> > memory leak or VM_BUG_ON_PAGE(!page_count(page)) in isolate_lru_page().
> > 
> > To fix this refcounting issue, this patch introduces get_hwpoison_page() to
> > handle thp tail pages in the same manner for each caller of hwpoison code.
> > 
> > There's a non-trivial change around unpoisoning, which now returns immediately
> > for thp with "MCE: Memory failure is now running on %#lx\n" message. This is
> > not right when split_huge_page() fails. So this patch also allows
> > unpoison_memory() to handle thp.
> >
> > ...
> >
> >  /*
> > + * Get refcount for memory error handling:
> > + * - @page: raw page
> > + */
> > +inline int get_hwpoison_page(struct page *page)
> > +{
> > +	struct page *head = compound_head(page);
> > +
> > +	if (PageHuge(head))
> > +		return get_page_unless_zero(head);
> > +	else if (PageTransHuge(head))
> > +		if (get_page_unless_zero(head)) {
> > +			if (PageTail(page))
> > +				get_page(page);
> > +			return 1;
> > +		} else {
> > +			return 0;
> > +		}
> > +	else
> > +		return get_page_unless_zero(page);
> > +}
> 
> This function is a bit weird.
> 
> - The comment looks like kerneldoc but isn't kerneldoc

OK, will fix it.

> - Why the inline?  It isn't fastpath?

No, so I'll drop 'inline'.

> - The layout is rather painful.  It could be
> 
> 	if (PageHuge(head))
> 		return get_page_unless_zero(head);
> 
> 	if (PageTransHuge(head)) {
> 		if (get_page_unless_zero(head)) {
> 			if (PageTail(page))
> 				get_page(page);
> 			return 1;
> 		} else {
> 			return 0;
> 		}
> 	}
> 
> 	return get_page_unless_zero(page);

OK, will do like this.

> - Some illuminating comments would be nice.  In particular that code
>   path where it grabs a ref on the tail page as well as on the head
>   page.  What's going on there?

We can't call get_page_unless_zero() directly for thp tail pages because
that breaks thp's refcounting rule (refcount of tail pages is stored in
->_mapcount.) This code intends to comply with the rule in hwpoison code too.
So I'll comment the point.

Hmm, I found just now that I forget to put_page(head) in if (PageTail) block,
which leaks head page after thp split.
So it'll be fixed in the next version.

Thanks,
Naoya Horiguchi
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2015-05-13  0:22 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-12  9:46 [PATCH 0/4] hwpoison fixes for v4.2 Naoya Horiguchi
2015-05-12  9:46 ` Naoya Horiguchi
2015-05-12  9:46 ` [PATCH 2/4] mm/memory-failure: introduce get_hwpoison_page() for consistent refcount handling Naoya Horiguchi
2015-05-12  9:46   ` Naoya Horiguchi
2015-05-12 22:00   ` Andrew Morton
2015-05-12 22:00     ` Andrew Morton
2015-05-13  0:11     ` Naoya Horiguchi [this message]
2015-05-13  0:11       ` Naoya Horiguchi
2015-05-12  9:46 ` [PATCH 1/4] mm/memory-failure: split thp earlier in memory error handling Naoya Horiguchi
2015-05-12  9:46   ` Naoya Horiguchi
2015-05-12  9:46 ` [PATCH 3/4] mm: soft-offline: don't free target page in successful page migration Naoya Horiguchi
2015-05-12  9:46   ` Naoya Horiguchi
2015-05-12  9:46 ` [PATCH 4/4] mm/memory-failure: me_huge_page() does nothing for thp Naoya Horiguchi
2015-05-12  9:46   ` Naoya Horiguchi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150513001141.GA14599@hori1.linux.bs1.fc.nec.co.jp \
    --to=n-horiguchi@ah.jp.nec.com \
    --cc=akpm@linux-foundation.org \
    --cc=andi@firstfloor.org \
    --cc=kirill@shutemov.name \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=nao.horiguchi@gmail.com \
    --cc=tony.luck@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.