From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> To: Zi Yan <zi.yan@cs.rutgers.edu> Cc: Yisheng Xie <xieyisheng1@huawei.com>, Jan Stancek <jstancek@redhat.com>, "linux-mm@kvack.org" <linux-mm@kvack.org>, "ltp@lists.linux.it" <ltp@lists.linux.it> Subject: Re: Is MADV_HWPOISON supposed to work only on faulted-in pages? Date: Mon, 27 Feb 2017 06:33:09 +0000 [thread overview] Message-ID: <20170227063308.GA14387@hori1.linux.bs1.fc.nec.co.jp> (raw) In-Reply-To: <22763879-C335-41E6-8102-2022EED75DAE@cs.rutgers.edu> On Sun, Feb 26, 2017 at 10:27:02PM -0600, Zi Yan wrote: > On 26 Feb 2017, at 19:20, Naoya Horiguchi wrote: > > > On Sat, Feb 25, 2017 at 10:28:15AM +0800, Yisheng Xie wrote: > >> hi Naoya, > >> > >> On 2017/2/23 11:23, Naoya Horiguchi wrote: > >>> On Mon, Feb 20, 2017 at 05:00:17AM +0000, Horiguchi Naoya(堀口 直也) wrote: > >>>> On Tue, Feb 14, 2017 at 04:41:29PM +0100, Jan Stancek wrote: > >>>>> Hi, > >>>>> > >>>>> code below (and LTP madvise07 [1]) doesn't produce SIGBUS, > >>>>> unless I touch/prefault page before call to madvise(). > >>>>> > >>>>> Is this expected behavior? > >>>> > >>>> Thank you for reporting. > >>>> > >>>> madvise(MADV_HWPOISON) triggers page fault when called on the address > >>>> over which no page is faulted-in, so I think that SIGBUS should be > >>>> called in such case. > >>>> > >>>> But it seems that memory error handler considers such a page as "reserved > >>>> kernel page" and recovery action fails (see below.) > >>>> > >>>> [ 383.371372] Injecting memory failure for page 0x1f10 at 0x7efcdc569000 > >>>> [ 383.375678] Memory failure: 0x1f10: reserved kernel page still referenced by 1 users > >>>> [ 383.377570] Memory failure: 0x1f10: recovery action for reserved kernel page: Failed > >>>> > >>>> I'm not sure how/when this behavior was introduced, so I try to understand. > >>> > >>> I found that this is a zero page, which is not recoverable for memory > >>> error now. > >>> > >>>> IMO, the test code below looks valid to me, so no need to change. > >>> > >>> I think that what the testcase effectively does is to test whether memory > >>> handling on zero pages works or not. > >>> And the testcase's failure seems acceptable, because it's simply not-implemented yet. > >>> Maybe recovering from error on zero page is possible (because there's no data > >>> loss for memory error,) but I'm not sure that code might be simple enough and/or > >>> it's worth doing ... > >> I question about it, if a memory error happened on zero page, it will > >> cause all of data read from zero page is error, I mean no-zero, right? > > > > Hi Yisheng, > > > > Yes, the impact is serious (could affect many processes,) but it's possibility > > is very low because there's only one page in a system that is used for zero page. > > There are many other pages which are not recoverable for memory error like > > slab pages, so I'm not sure how I prioritize it (maybe it's not a > > top-priority thing, nor low-hanging fruit.) > > > >> And can we just use re-initial it with zero data maybe by memset ? > > > > Maybe it's not enoguh. Under a real hwpoison, we should isolate the error > > page to prevent the access on the broken data. > > But zero page is statically defined as an array of global variable, so > > it's not trival to replace it with a new zero page at runtime. > > > > Anyway, it's in my todo list, so hopefully revisited in the future. > > > > Hi Naoya, > > The test case tries to HWPOISON a range of virtual addresses that do not > map to any physical pages. > Hi Yan, > I expected either madvise should fail because HWPOISON does not work on > non-existing physical pages or madvise_hwpoison() should populate > some physical pages for that virtual address range and poison them. The latter is the current behavior. It just comes from get_user_pages_fast() which not only finds the page and takes refcount, but also touch the page. madvise(MADV_HWPOISON) is a test feature, and calling it for address backed by no page doesn't simulate anything real. IOW, the behavior is undefined. So I don't have a strong opinion about how it should behave. > > As I tested it on kernel v4.10, the test application exited at > madvise, because madvise returns -1 and error message is > "Device or resource busy". I think this is a proper behavior. yes, maybe we see the same thing, you can see in dmesg "recovery action for reserved kernel page: Failed" message. > > There might be some confusion in madvise's man page on MADV_HWPOISON. > If you add some text saying madvise fails if any page is not mapped in > the given address range, that can eliminate the confusion* Writing it down to man page makes readers think this behavior is a part of specification, that might not be good now because the failure in error handling of zero page is not the eventually fixed behavior. I mean that if zero page handles hwpoison properly in the future, madvise will succeed without any confusion. So I feel that we don't have to update man page for this issue. Thanks, Naoya Horiguchi -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
WARNING: multiple messages have this Message-ID (diff)
From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> To: ltp@lists.linux.it Subject: [LTP] Is MADV_HWPOISON supposed to work only on faulted-in pages? Date: Mon, 27 Feb 2017 06:33:09 +0000 [thread overview] Message-ID: <20170227063308.GA14387@hori1.linux.bs1.fc.nec.co.jp> (raw) In-Reply-To: <22763879-C335-41E6-8102-2022EED75DAE@cs.rutgers.edu> On Sun, Feb 26, 2017 at 10:27:02PM -0600, Zi Yan wrote: > On 26 Feb 2017, at 19:20, Naoya Horiguchi wrote: > > > On Sat, Feb 25, 2017 at 10:28:15AM +0800, Yisheng Xie wrote: > >> hi Naoya, > >> > >> On 2017/2/23 11:23, Naoya Horiguchi wrote: > >>> On Mon, Feb 20, 2017 at 05:00:17AM +0000, Horiguchi Naoya(堀口 直也) wrote: > >>>> On Tue, Feb 14, 2017 at 04:41:29PM +0100, Jan Stancek wrote: > >>>>> Hi, > >>>>> > >>>>> code below (and LTP madvise07 [1]) doesn't produce SIGBUS, > >>>>> unless I touch/prefault page before call to madvise(). > >>>>> > >>>>> Is this expected behavior? > >>>> > >>>> Thank you for reporting. > >>>> > >>>> madvise(MADV_HWPOISON) triggers page fault when called on the address > >>>> over which no page is faulted-in, so I think that SIGBUS should be > >>>> called in such case. > >>>> > >>>> But it seems that memory error handler considers such a page as "reserved > >>>> kernel page" and recovery action fails (see below.) > >>>> > >>>> [ 383.371372] Injecting memory failure for page 0x1f10 at 0x7efcdc569000 > >>>> [ 383.375678] Memory failure: 0x1f10: reserved kernel page still referenced by 1 users > >>>> [ 383.377570] Memory failure: 0x1f10: recovery action for reserved kernel page: Failed > >>>> > >>>> I'm not sure how/when this behavior was introduced, so I try to understand. > >>> > >>> I found that this is a zero page, which is not recoverable for memory > >>> error now. > >>> > >>>> IMO, the test code below looks valid to me, so no need to change. > >>> > >>> I think that what the testcase effectively does is to test whether memory > >>> handling on zero pages works or not. > >>> And the testcase's failure seems acceptable, because it's simply not-implemented yet. > >>> Maybe recovering from error on zero page is possible (because there's no data > >>> loss for memory error,) but I'm not sure that code might be simple enough and/or > >>> it's worth doing ... > >> I question about it, if a memory error happened on zero page, it will > >> cause all of data read from zero page is error, I mean no-zero, right? > > > > Hi Yisheng, > > > > Yes, the impact is serious (could affect many processes,) but it's possibility > > is very low because there's only one page in a system that is used for zero page. > > There are many other pages which are not recoverable for memory error like > > slab pages, so I'm not sure how I prioritize it (maybe it's not a > > top-priority thing, nor low-hanging fruit.) > > > >> And can we just use re-initial it with zero data maybe by memset ? > > > > Maybe it's not enoguh. Under a real hwpoison, we should isolate the error > > page to prevent the access on the broken data. > > But zero page is statically defined as an array of global variable, so > > it's not trival to replace it with a new zero page at runtime. > > > > Anyway, it's in my todo list, so hopefully revisited in the future. > > > > Hi Naoya, > > The test case tries to HWPOISON a range of virtual addresses that do not > map to any physical pages. > Hi Yan, > I expected either madvise should fail because HWPOISON does not work on > non-existing physical pages or madvise_hwpoison() should populate > some physical pages for that virtual address range and poison them. The latter is the current behavior. It just comes from get_user_pages_fast() which not only finds the page and takes refcount, but also touch the page. madvise(MADV_HWPOISON) is a test feature, and calling it for address backed by no page doesn't simulate anything real. IOW, the behavior is undefined. So I don't have a strong opinion about how it should behave. > > As I tested it on kernel v4.10, the test application exited at > madvise, because madvise returns -1 and error message is > "Device or resource busy". I think this is a proper behavior. yes, maybe we see the same thing, you can see in dmesg "recovery action for reserved kernel page: Failed" message. > > There might be some confusion in madvise's man page on MADV_HWPOISON. > If you add some text saying madvise fails if any page is not mapped in > the given address range, that can eliminate the confusion* Writing it down to man page makes readers think this behavior is a part of specification, that might not be good now because the failure in error handling of zero page is not the eventually fixed behavior. I mean that if zero page handles hwpoison properly in the future, madvise will succeed without any confusion. So I feel that we don't have to update man page for this issue. Thanks, Naoya Horiguchi
next prev parent reply other threads:[~2017-02-27 6:44 UTC|newest] Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top 2017-02-14 15:41 Is MADV_HWPOISON supposed to work only on faulted-in pages? Jan Stancek 2017-02-14 15:41 ` [LTP] " Jan Stancek 2017-02-20 5:00 ` Naoya Horiguchi 2017-02-20 5:00 ` [LTP] " Naoya Horiguchi 2017-02-23 3:23 ` Naoya Horiguchi 2017-02-23 3:23 ` [LTP] " Naoya Horiguchi 2017-02-25 2:28 ` Yisheng Xie 2017-02-25 2:28 ` [LTP] " Yisheng Xie 2017-02-27 1:20 ` Naoya Horiguchi 2017-02-27 1:20 ` [LTP] " Naoya Horiguchi 2017-02-27 4:27 ` Zi Yan 2017-02-27 4:27 ` [LTP] " Zi Yan 2017-02-27 6:33 ` Naoya Horiguchi [this message] 2017-02-27 6:33 ` Naoya Horiguchi 2017-02-27 16:10 ` Zi Yan 2017-02-27 16:10 ` [LTP] " Zi Yan 2017-03-14 13:20 ` Cyril Hrubis 2017-03-14 13:20 ` Cyril Hrubis 2017-03-27 12:08 ` Richard Palethorpe 2017-03-27 12:08 ` Richard Palethorpe 2017-03-27 23:54 ` Andi Kleen 2017-03-27 23:54 ` [LTP] " Andi Kleen 2017-03-28 8:25 ` Cyril Hrubis 2017-03-28 8:25 ` Cyril Hrubis 2017-03-28 20:26 ` Andi Kleen 2017-03-28 20:26 ` Andi Kleen
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20170227063308.GA14387@hori1.linux.bs1.fc.nec.co.jp \ --to=n-horiguchi@ah.jp.nec.com \ --cc=jstancek@redhat.com \ --cc=linux-mm@kvack.org \ --cc=ltp@lists.linux.it \ --cc=xieyisheng1@huawei.com \ --cc=zi.yan@cs.rutgers.edu \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.