All of lore.kernel.org
 help / color / mirror / Atom feed
From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
To: Zi Yan <zi.yan@cs.rutgers.edu>
Cc: Yisheng Xie <xieyisheng1@huawei.com>,
	Jan Stancek <jstancek@redhat.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"ltp@lists.linux.it" <ltp@lists.linux.it>
Subject: Re: Is MADV_HWPOISON supposed to work only on faulted-in pages?
Date: Mon, 27 Feb 2017 06:33:09 +0000	[thread overview]
Message-ID: <20170227063308.GA14387@hori1.linux.bs1.fc.nec.co.jp> (raw)
In-Reply-To: <22763879-C335-41E6-8102-2022EED75DAE@cs.rutgers.edu>

On Sun, Feb 26, 2017 at 10:27:02PM -0600, Zi Yan wrote:
> On 26 Feb 2017, at 19:20, Naoya Horiguchi wrote:
> 
> > On Sat, Feb 25, 2017 at 10:28:15AM +0800, Yisheng Xie wrote:
> >> hi Naoya,
> >>
> >> On 2017/2/23 11:23, Naoya Horiguchi wrote:
> >>> On Mon, Feb 20, 2017 at 05:00:17AM +0000, Horiguchi Naoya(堀口 直也) wrote:
> >>>> On Tue, Feb 14, 2017 at 04:41:29PM +0100, Jan Stancek wrote:
> >>>>> Hi,
> >>>>>
> >>>>> code below (and LTP madvise07 [1]) doesn't produce SIGBUS,
> >>>>> unless I touch/prefault page before call to madvise().
> >>>>>
> >>>>> Is this expected behavior?
> >>>>
> >>>> Thank you for reporting.
> >>>>
> >>>> madvise(MADV_HWPOISON) triggers page fault when called on the address
> >>>> over which no page is faulted-in, so I think that SIGBUS should be
> >>>> called in such case.
> >>>>
> >>>> But it seems that memory error handler considers such a page as "reserved
> >>>> kernel page" and recovery action fails (see below.)
> >>>>
> >>>>   [  383.371372] Injecting memory failure for page 0x1f10 at 0x7efcdc569000
> >>>>   [  383.375678] Memory failure: 0x1f10: reserved kernel page still referenced by 1 users
> >>>>   [  383.377570] Memory failure: 0x1f10: recovery action for reserved kernel page: Failed
> >>>>
> >>>> I'm not sure how/when this behavior was introduced, so I try to understand.
> >>>
> >>> I found that this is a zero page, which is not recoverable for memory
> >>> error now.
> >>>
> >>>> IMO, the test code below looks valid to me, so no need to change.
> >>>
> >>> I think that what the testcase effectively does is to test whether memory
> >>> handling on zero pages works or not.
> >>> And the testcase's failure seems acceptable, because it's simply not-implemented yet.
> >>> Maybe recovering from error on zero page is possible (because there's no data
> >>> loss for memory error,) but I'm not sure that code might be simple enough and/or
> >>> it's worth doing ...
> >> I question about it,  if a memory error happened on zero page, it will
> >> cause all of data read from zero page is error, I mean no-zero, right?
> >
> > Hi Yisheng,
> >
> > Yes, the impact is serious (could affect many processes,) but it's possibility
> > is very low because there's only one page in a system that is used for zero page.
> > There are many other pages which are not recoverable for memory error like
> > slab pages, so I'm not sure how I prioritize it (maybe it's not a
> > top-priority thing, nor low-hanging fruit.)
> >
> >> And can we just use re-initial it with zero data maybe by memset ?
> >
> > Maybe it's not enoguh. Under a real hwpoison, we should isolate the error
> > page to prevent the access on the broken data.
> > But zero page is statically defined as an array of global variable, so
> > it's not trival to replace it with a new zero page at runtime.
> >
> > Anyway, it's in my todo list, so hopefully revisited in the future.
> >
> 
> Hi Naoya,
> 
> The test case tries to HWPOISON a range of virtual addresses that do not
> map to any physical pages.
> 

Hi Yan,

> I expected either madvise should fail because HWPOISON does not work on
> non-existing physical pages or madvise_hwpoison() should populate
> some physical pages for that virtual address range and poison them.

The latter is the current behavior. It just comes from get_user_pages_fast()
which not only finds the page and takes refcount, but also touch the page.

madvise(MADV_HWPOISON) is a test feature, and calling it for address backed
by no page doesn't simulate anything real. IOW, the behavior is undefined.
So I don't have a strong opinion about how it should behave.

> 
> As I tested it on kernel v4.10, the test application exited at
> madvise, because madvise returns -1 and error message is
> "Device or resource busy". I think this is a proper behavior.

yes, maybe we see the same thing, you can see in dmesg "recovery action
for reserved kernel page: Failed" message.

> 
> There might be some confusion in madvise's man page on MADV_HWPOISON.
> If you add some text saying madvise fails if any page is not mapped in
> the given address range, that can eliminate the confusion*

Writing it down to man page makes readers think this behavior is a part of
specification, that might not be good now because the failure in error
handling of zero page is not the eventually fixed behavior.
I mean that if zero page handles hwpoison properly in the future, madvise
will succeed without any confusion.
So I feel that we don't have to update man page for this issue.

Thanks,
Naoya Horiguchi
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
To: ltp@lists.linux.it
Subject: [LTP] Is MADV_HWPOISON supposed to work only on faulted-in pages?
Date: Mon, 27 Feb 2017 06:33:09 +0000	[thread overview]
Message-ID: <20170227063308.GA14387@hori1.linux.bs1.fc.nec.co.jp> (raw)
In-Reply-To: <22763879-C335-41E6-8102-2022EED75DAE@cs.rutgers.edu>

On Sun, Feb 26, 2017 at 10:27:02PM -0600, Zi Yan wrote:
> On 26 Feb 2017, at 19:20, Naoya Horiguchi wrote:
> 
> > On Sat, Feb 25, 2017 at 10:28:15AM +0800, Yisheng Xie wrote:
> >> hi Naoya,
> >>
> >> On 2017/2/23 11:23, Naoya Horiguchi wrote:
> >>> On Mon, Feb 20, 2017 at 05:00:17AM +0000, Horiguchi Naoya(堀口 直也) wrote:
> >>>> On Tue, Feb 14, 2017 at 04:41:29PM +0100, Jan Stancek wrote:
> >>>>> Hi,
> >>>>>
> >>>>> code below (and LTP madvise07 [1]) doesn't produce SIGBUS,
> >>>>> unless I touch/prefault page before call to madvise().
> >>>>>
> >>>>> Is this expected behavior?
> >>>>
> >>>> Thank you for reporting.
> >>>>
> >>>> madvise(MADV_HWPOISON) triggers page fault when called on the address
> >>>> over which no page is faulted-in, so I think that SIGBUS should be
> >>>> called in such case.
> >>>>
> >>>> But it seems that memory error handler considers such a page as "reserved
> >>>> kernel page" and recovery action fails (see below.)
> >>>>
> >>>>   [  383.371372] Injecting memory failure for page 0x1f10 at 0x7efcdc569000
> >>>>   [  383.375678] Memory failure: 0x1f10: reserved kernel page still referenced by 1 users
> >>>>   [  383.377570] Memory failure: 0x1f10: recovery action for reserved kernel page: Failed
> >>>>
> >>>> I'm not sure how/when this behavior was introduced, so I try to understand.
> >>>
> >>> I found that this is a zero page, which is not recoverable for memory
> >>> error now.
> >>>
> >>>> IMO, the test code below looks valid to me, so no need to change.
> >>>
> >>> I think that what the testcase effectively does is to test whether memory
> >>> handling on zero pages works or not.
> >>> And the testcase's failure seems acceptable, because it's simply not-implemented yet.
> >>> Maybe recovering from error on zero page is possible (because there's no data
> >>> loss for memory error,) but I'm not sure that code might be simple enough and/or
> >>> it's worth doing ...
> >> I question about it,  if a memory error happened on zero page, it will
> >> cause all of data read from zero page is error, I mean no-zero, right?
> >
> > Hi Yisheng,
> >
> > Yes, the impact is serious (could affect many processes,) but it's possibility
> > is very low because there's only one page in a system that is used for zero page.
> > There are many other pages which are not recoverable for memory error like
> > slab pages, so I'm not sure how I prioritize it (maybe it's not a
> > top-priority thing, nor low-hanging fruit.)
> >
> >> And can we just use re-initial it with zero data maybe by memset ?
> >
> > Maybe it's not enoguh. Under a real hwpoison, we should isolate the error
> > page to prevent the access on the broken data.
> > But zero page is statically defined as an array of global variable, so
> > it's not trival to replace it with a new zero page at runtime.
> >
> > Anyway, it's in my todo list, so hopefully revisited in the future.
> >
> 
> Hi Naoya,
> 
> The test case tries to HWPOISON a range of virtual addresses that do not
> map to any physical pages.
> 

Hi Yan,

> I expected either madvise should fail because HWPOISON does not work on
> non-existing physical pages or madvise_hwpoison() should populate
> some physical pages for that virtual address range and poison them.

The latter is the current behavior. It just comes from get_user_pages_fast()
which not only finds the page and takes refcount, but also touch the page.

madvise(MADV_HWPOISON) is a test feature, and calling it for address backed
by no page doesn't simulate anything real. IOW, the behavior is undefined.
So I don't have a strong opinion about how it should behave.

> 
> As I tested it on kernel v4.10, the test application exited at
> madvise, because madvise returns -1 and error message is
> "Device or resource busy". I think this is a proper behavior.

yes, maybe we see the same thing, you can see in dmesg "recovery action
for reserved kernel page: Failed" message.

> 
> There might be some confusion in madvise's man page on MADV_HWPOISON.
> If you add some text saying madvise fails if any page is not mapped in
> the given address range, that can eliminate the confusion*

Writing it down to man page makes readers think this behavior is a part of
specification, that might not be good now because the failure in error
handling of zero page is not the eventually fixed behavior.
I mean that if zero page handles hwpoison properly in the future, madvise
will succeed without any confusion.
So I feel that we don't have to update man page for this issue.

Thanks,
Naoya Horiguchi

  reply	other threads:[~2017-02-27  6:44 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-02-14 15:41 Is MADV_HWPOISON supposed to work only on faulted-in pages? Jan Stancek
2017-02-14 15:41 ` [LTP] " Jan Stancek
2017-02-20  5:00 ` Naoya Horiguchi
2017-02-20  5:00   ` [LTP] " Naoya Horiguchi
2017-02-23  3:23   ` Naoya Horiguchi
2017-02-23  3:23     ` [LTP] " Naoya Horiguchi
2017-02-25  2:28     ` Yisheng Xie
2017-02-25  2:28       ` [LTP] " Yisheng Xie
2017-02-27  1:20       ` Naoya Horiguchi
2017-02-27  1:20         ` [LTP] " Naoya Horiguchi
2017-02-27  4:27         ` Zi Yan
2017-02-27  4:27           ` [LTP] " Zi Yan
2017-02-27  6:33           ` Naoya Horiguchi [this message]
2017-02-27  6:33             ` Naoya Horiguchi
2017-02-27 16:10             ` Zi Yan
2017-02-27 16:10               ` [LTP] " Zi Yan
2017-03-14 13:20             ` Cyril Hrubis
2017-03-14 13:20               ` Cyril Hrubis
2017-03-27 12:08             ` Richard Palethorpe
2017-03-27 12:08               ` Richard Palethorpe
2017-03-27 23:54     ` Andi Kleen
2017-03-27 23:54       ` [LTP] " Andi Kleen
2017-03-28  8:25       ` Cyril Hrubis
2017-03-28  8:25         ` Cyril Hrubis
2017-03-28 20:26         ` Andi Kleen
2017-03-28 20:26           ` Andi Kleen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170227063308.GA14387@hori1.linux.bs1.fc.nec.co.jp \
    --to=n-horiguchi@ah.jp.nec.com \
    --cc=jstancek@redhat.com \
    --cc=linux-mm@kvack.org \
    --cc=ltp@lists.linux.it \
    --cc=xieyisheng1@huawei.com \
    --cc=zi.yan@cs.rutgers.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.