* [bug report] mm, hwpoison: memory_failure races with alloc_fresh_huge_page/free_huge_page @ 2022-08-02 2:00 Miaohe Lin 2022-08-02 4:07 ` Muchun Song 0 siblings, 1 reply; 6+ messages in thread From: Miaohe Lin @ 2022-08-02 2:00 UTC (permalink / raw) To: Linux-MM, linux-kernel Cc: Andrew Morton, Mike Kravetz, Naoya Horiguchi, Muchun Song Hi all: When I investigate the mm/memory-failure.c code again, I found there's a possible race window between memory_failure and alloc_fresh_huge_page/free_huge_page. Thank about the below scene: CPU 1 CPU 2 alloc_fresh_huge_page -- page refcnt > 0 memory_failure prep_new_huge_page get_huge_page_for_hwpoison !PageHeadHuge -- so 2(not a hugepage) is returned hugetlb_vmemmap_optimize -- subpages is read-only set_compound_page_dtor -- PageHuge is true now, but too late!!! TestSetPageHWPoison(p) -- We might write to read-only subpages here!!! Another similar scene: CPU 1 CPU 2 free_huge_page -- page refcnt == 0 and not PageHuge memory_failure get_huge_page_for_hwpoison !PageHeadHuge -- so 2(not a hugepage) is returned TestSetPageHWPoison(p) -- We might write to read-only subpages here!!! hugetlb_vmemmap_restore -- subpages can be written to now, but too late!!! I think the above scenes are possible. But I can't found a stable solution to fix it. Any suggestions? Or is it not worth to fix it as it's too rare? Or am I miss something? Any response would be appreciated! Thanks! ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [bug report] mm, hwpoison: memory_failure races with alloc_fresh_huge_page/free_huge_page 2022-08-02 2:00 [bug report] mm, hwpoison: memory_failure races with alloc_fresh_huge_page/free_huge_page Miaohe Lin @ 2022-08-02 4:07 ` Muchun Song 2022-08-02 6:27 ` Miaohe Lin 0 siblings, 1 reply; 6+ messages in thread From: Muchun Song @ 2022-08-02 4:07 UTC (permalink / raw) To: Miaohe Lin Cc: Linux-MM, linux-kernel, Andrew Morton, Mike Kravetz, Naoya Horiguchi On Tue, Aug 02, 2022 at 10:00:50AM +0800, Miaohe Lin wrote: > Hi all: > When I investigate the mm/memory-failure.c code again, I found there's a possible race window > between memory_failure and alloc_fresh_huge_page/free_huge_page. Thank about the below scene: > > CPU 1 CPU 2 > alloc_fresh_huge_page -- page refcnt > 0 memory_failure > prep_new_huge_page get_huge_page_for_hwpoison > !PageHeadHuge -- so 2(not a hugepage) is returned > hugetlb_vmemmap_optimize -- subpages is read-only > set_compound_page_dtor -- PageHuge is true now, but too late!!! > TestSetPageHWPoison(p) > -- We might write to read-only subpages here!!! > > Another similar scene: > > CPU 1 CPU 2 > free_huge_page -- page refcnt == 0 and not PageHuge memory_failure > get_huge_page_for_hwpoison > !PageHeadHuge -- so 2(not a hugepage) is returned > TestSetPageHWPoison(p) > -- We might write to read-only subpages here!!! > hugetlb_vmemmap_restore -- subpages can be written to now, but too late!!! > I agree this race is possible, I have proposed this race in thread [1]. But I didn't think more how to solve it. [1] https://lore.kernel.org/linux-mm/20220623235153.2623702-1-naoya.horiguchi@linux.dev/T/#ma094a7cea7df8fd9a77a91551bf39077d89e23bd > I think the above scenes are possible. But I can't found a stable solution to fix it. Any suggestions? > Or is it not worth to fix it as it's too rare? Or am I miss something? > Luckily, the system will panic at once, which encountering this race. However, we don't see any bug report. If we have an easy way to fix it, I think it is worth. Just a quick reply, no suggestion/solutions from me. Thanks. > Any response would be appreciated! > > Thanks! > ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [bug report] mm, hwpoison: memory_failure races with alloc_fresh_huge_page/free_huge_page 2022-08-02 4:07 ` Muchun Song @ 2022-08-02 6:27 ` Miaohe Lin 2022-08-04 7:40 ` Naoya Horiguchi 0 siblings, 1 reply; 6+ messages in thread From: Miaohe Lin @ 2022-08-02 6:27 UTC (permalink / raw) To: Muchun Song Cc: Linux-MM, linux-kernel, Andrew Morton, Mike Kravetz, Naoya Horiguchi On 2022/8/2 12:07, Muchun Song wrote: > On Tue, Aug 02, 2022 at 10:00:50AM +0800, Miaohe Lin wrote: >> Hi all: >> When I investigate the mm/memory-failure.c code again, I found there's a possible race window >> between memory_failure and alloc_fresh_huge_page/free_huge_page. Thank about the below scene: >> >> CPU 1 CPU 2 >> alloc_fresh_huge_page -- page refcnt > 0 memory_failure >> prep_new_huge_page get_huge_page_for_hwpoison >> !PageHeadHuge -- so 2(not a hugepage) is returned >> hugetlb_vmemmap_optimize -- subpages is read-only >> set_compound_page_dtor -- PageHuge is true now, but too late!!! >> TestSetPageHWPoison(p) >> -- We might write to read-only subpages here!!! >> >> Another similar scene: >> >> CPU 1 CPU 2 >> free_huge_page -- page refcnt == 0 and not PageHuge memory_failure >> get_huge_page_for_hwpoison >> !PageHeadHuge -- so 2(not a hugepage) is returned >> TestSetPageHWPoison(p) >> -- We might write to read-only subpages here!!! >> hugetlb_vmemmap_restore -- subpages can be written to now, but too late!!! >> > > I agree this race is possible, I have proposed this race in thread [1]. Oh, I remember I see the race proposed in [1] but I did not look into that carefully at that time. Sorry. > But I didn't think more how to solve it. I hope this thread can find a good solution. :) > > [1] https://lore.kernel.org/linux-mm/20220623235153.2623702-1-naoya.horiguchi@linux.dev/T/#ma094a7cea7df8fd9a77a91551bf39077d89e23bd > >> I think the above scenes are possible. But I can't found a stable solution to fix it. Any suggestions? >> Or is it not worth to fix it as it's too rare? Or am I miss something? >> > > Luckily, the system will panic at once, which encountering this race. However, > we don't see any bug report. If we have an easy way to fix it, I think it is worth. Agree. But I can't find a easy way to fix it yet. > Just a quick reply, no suggestion/solutions from me. Many thanks for your quick reply. > > Thanks. > >> Any response would be appreciated! >> >> Thanks! >> > . > ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [bug report] mm, hwpoison: memory_failure races with alloc_fresh_huge_page/free_huge_page 2022-08-02 6:27 ` Miaohe Lin @ 2022-08-04 7:40 ` Naoya Horiguchi 2022-08-04 7:44 ` Naoya Horiguchi 0 siblings, 1 reply; 6+ messages in thread From: Naoya Horiguchi @ 2022-08-04 7:40 UTC (permalink / raw) To: Miaohe Lin Cc: Muchun Song, Linux-MM, linux-kernel, Andrew Morton, Mike Kravetz On Tue, Aug 02, 2022 at 02:27:36PM +0800, Miaohe Lin wrote: > On 2022/8/2 12:07, Muchun Song wrote: > > On Tue, Aug 02, 2022 at 10:00:50AM +0800, Miaohe Lin wrote: > >> Hi all: > >> When I investigate the mm/memory-failure.c code again, I found there's a possible race window > >> between memory_failure and alloc_fresh_huge_page/free_huge_page. Thank about the below scene: > >> > >> CPU 1 CPU 2 > >> alloc_fresh_huge_page -- page refcnt > 0 memory_failure > >> prep_new_huge_page get_huge_page_for_hwpoison > >> !PageHeadHuge -- so 2(not a hugepage) is returned > >> hugetlb_vmemmap_optimize -- subpages is read-only > >> set_compound_page_dtor -- PageHuge is true now, but too late!!! > >> TestSetPageHWPoison(p) > >> -- We might write to read-only subpages here!!! > >> > >> Another similar scene: > >> > >> CPU 1 CPU 2 > >> free_huge_page -- page refcnt == 0 and not PageHuge memory_failure > >> get_huge_page_for_hwpoison > >> !PageHeadHuge -- so 2(not a hugepage) is returned > >> TestSetPageHWPoison(p) > >> -- We might write to read-only subpages here!!! > >> hugetlb_vmemmap_restore -- subpages can be written to now, but too late!!! > >> > > > > I agree this race is possible, I have proposed this race in thread [1]. Thank you for reminding this, and I agree that we need some solution. > > Oh, I remember I see the race proposed in [1] but I did not look into that carefully at that time. Sorry. > > > But I didn't think more how to solve it. > I hope this thread can find a good solution. :) Both of the races show that __get_huge_page_for_hwpoison() fails to capture the case of generic compound page during turning into hugetlb, What makes things complicated is that no one knows to which state such a compound page finally turns into. So I think that if the page to be handled is an unknown compound page, we need to wait until it becomes some known page state to avoid misjudging. If we need a quick small fix, we may replace the check "!PageHeadHuge()" in __get_huge_page_for_hwpoison() with "!PageCompound()", and add another retry path in get_huge_page_for_hwpoison() for non-hugetlb compound pages. Thanks, Naoya Horiguchi > > > > > [1] https://lore.kernel.org/linux-mm/20220623235153.2623702-1-naoya.horiguchi@linux.dev/T/#ma094a7cea7df8fd9a77a91551bf39077d89e23bd > > > >> I think the above scenes are possible. But I can't found a stable solution to fix it. Any suggestions? > >> Or is it not worth to fix it as it's too rare? Or am I miss something? > >> > > > > Luckily, the system will panic at once, which encountering this race. However, > > we don't see any bug report. If we have an easy way to fix it, I think it is worth. > > Agree. But I can't find a easy way to fix it yet. > > > Just a quick reply, no suggestion/solutions from me. > > Many thanks for your quick reply. > > > > > Thanks. > > > >> Any response would be appreciated! > >> > >> Thanks! > >> > > . > > > ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [bug report] mm, hwpoison: memory_failure races with alloc_fresh_huge_page/free_huge_page 2022-08-04 7:40 ` Naoya Horiguchi @ 2022-08-04 7:44 ` Naoya Horiguchi 2022-08-05 2:33 ` Miaohe Lin 0 siblings, 1 reply; 6+ messages in thread From: Naoya Horiguchi @ 2022-08-04 7:44 UTC (permalink / raw) To: Miaohe Lin Cc: Muchun Song, Linux-MM, linux-kernel, Andrew Morton, Mike Kravetz On Thu, Aug 04, 2022 at 04:40:25PM +0900, Naoya Horiguchi wrote: > On Tue, Aug 02, 2022 at 02:27:36PM +0800, Miaohe Lin wrote: > > On 2022/8/2 12:07, Muchun Song wrote: > > > On Tue, Aug 02, 2022 at 10:00:50AM +0800, Miaohe Lin wrote: > > >> Hi all: > > >> When I investigate the mm/memory-failure.c code again, I found there's a possible race window > > >> between memory_failure and alloc_fresh_huge_page/free_huge_page. Thank about the below scene: > > >> > > >> CPU 1 CPU 2 > > >> alloc_fresh_huge_page -- page refcnt > 0 memory_failure > > >> prep_new_huge_page get_huge_page_for_hwpoison > > >> !PageHeadHuge -- so 2(not a hugepage) is returned > > >> hugetlb_vmemmap_optimize -- subpages is read-only > > >> set_compound_page_dtor -- PageHuge is true now, but too late!!! > > >> TestSetPageHWPoison(p) > > >> -- We might write to read-only subpages here!!! > > >> > > >> Another similar scene: > > >> > > >> CPU 1 CPU 2 > > >> free_huge_page -- page refcnt == 0 and not PageHuge memory_failure > > >> get_huge_page_for_hwpoison > > >> !PageHeadHuge -- so 2(not a hugepage) is returned > > >> TestSetPageHWPoison(p) > > >> -- We might write to read-only subpages here!!! > > >> hugetlb_vmemmap_restore -- subpages can be written to now, but too late!!! > > >> > > > > > > I agree this race is possible, I have proposed this race in thread [1]. > > Thank you for reminding this, and I agree that we need some solution. > > > > > Oh, I remember I see the race proposed in [1] but I did not look into that carefully at that time. Sorry. > > > > > But I didn't think more how to solve it. > > I hope this thread can find a good solution. :) > > Both of the races show that __get_huge_page_for_hwpoison() fails to > capture the case of generic compound page during turning into hugetlb, I noticed my wrong description just after sending this out. The second race is about generic compound page turning into "buddy", sorry. - Naoya Horiguchi > What makes things complicated is that no one knows to which state > such a compound page finally turns into. So I think that if the page > to be handled is an unknown compound page, we need to wait until it > becomes some known page state to avoid misjudging. > > If we need a quick small fix, we may replace the check "!PageHeadHuge()" > in __get_huge_page_for_hwpoison() with "!PageCompound()", and add another > retry path in get_huge_page_for_hwpoison() for non-hugetlb compound pages. > > Thanks, > Naoya Horiguchi ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [bug report] mm, hwpoison: memory_failure races with alloc_fresh_huge_page/free_huge_page 2022-08-04 7:44 ` Naoya Horiguchi @ 2022-08-05 2:33 ` Miaohe Lin 0 siblings, 0 replies; 6+ messages in thread From: Miaohe Lin @ 2022-08-05 2:33 UTC (permalink / raw) To: Naoya Horiguchi Cc: Muchun Song, Linux-MM, linux-kernel, Andrew Morton, Mike Kravetz On 2022/8/4 15:44, Naoya Horiguchi wrote: > On Thu, Aug 04, 2022 at 04:40:25PM +0900, Naoya Horiguchi wrote: >> On Tue, Aug 02, 2022 at 02:27:36PM +0800, Miaohe Lin wrote: >>> On 2022/8/2 12:07, Muchun Song wrote: >>>> On Tue, Aug 02, 2022 at 10:00:50AM +0800, Miaohe Lin wrote: >>>>> Hi all: >>>>> When I investigate the mm/memory-failure.c code again, I found there's a possible race window >>>>> between memory_failure and alloc_fresh_huge_page/free_huge_page. Thank about the below scene: >>>>> >>>>> CPU 1 CPU 2 >>>>> alloc_fresh_huge_page -- page refcnt > 0 memory_failure >>>>> prep_new_huge_page get_huge_page_for_hwpoison >>>>> !PageHeadHuge -- so 2(not a hugepage) is returned >>>>> hugetlb_vmemmap_optimize -- subpages is read-only >>>>> set_compound_page_dtor -- PageHuge is true now, but too late!!! >>>>> TestSetPageHWPoison(p) >>>>> -- We might write to read-only subpages here!!! >>>>> >>>>> Another similar scene: >>>>> >>>>> CPU 1 CPU 2 >>>>> free_huge_page -- page refcnt == 0 and not PageHuge memory_failure >>>>> get_huge_page_for_hwpoison >>>>> !PageHeadHuge -- so 2(not a hugepage) is returned >>>>> TestSetPageHWPoison(p) >>>>> -- We might write to read-only subpages here!!! >>>>> hugetlb_vmemmap_restore -- subpages can be written to now, but too late!!! >>>>> >>>> >>>> I agree this race is possible, I have proposed this race in thread [1]. >> >> Thank you for reminding this, and I agree that we need some solution. >> >>> >>> Oh, I remember I see the race proposed in [1] but I did not look into that carefully at that time. Sorry. >>> >>>> But I didn't think more how to solve it. >>> I hope this thread can find a good solution. :) >> >> Both of the races show that __get_huge_page_for_hwpoison() fails to >> capture the case of generic compound page during turning into hugetlb, > > I noticed my wrong description just after sending this out. The second race > is about generic compound page turning into "buddy", sorry. Never mind. Thanks for your thought. > > - Naoya Horiguchi > >> What makes things complicated is that no one knows to which state >> such a compound page finally turns into. So I think that if the page >> to be handled is an unknown compound page, we need to wait until it >> becomes some known page state to avoid misjudging. >> >> If we need a quick small fix, we may replace the check "!PageHeadHuge()" >> in __get_huge_page_for_hwpoison() with "!PageCompound()", and add another >> retry path in get_huge_page_for_hwpoison() for non-hugetlb compound pages. Sounds like a good idea. If we can figure out a stable way to tell it from other compound page, it will be worth fixing it. Let's have a try. :) >> >> Thanks, >> Naoya Horiguchi Many thanks for your suggestion, Naoya. > . > ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2022-08-05 2:33 UTC | newest] Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2022-08-02 2:00 [bug report] mm, hwpoison: memory_failure races with alloc_fresh_huge_page/free_huge_page Miaohe Lin 2022-08-02 4:07 ` Muchun Song 2022-08-02 6:27 ` Miaohe Lin 2022-08-04 7:40 ` Naoya Horiguchi 2022-08-04 7:44 ` Naoya Horiguchi 2022-08-05 2:33 ` Miaohe Lin
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).