* [PATCH 0/3] Cleanup and fixups for memory hotplug
From: Miaohe Lin @ 2021-08-21  9:42 UTC
To: akpm
Cc: naoya.horiguchi, mhocko, minchan, cgoldswo, linux-mm, linux-kernel, linmiaohe

Hi all,
This series contains a cleanup that uses a helper function to simplify the
code. It also fixes some potential bugs. More details can be found in the
respective changelogs. Thanks!

Miaohe Lin (3):
  mm/memory_hotplug: use helper zone_is_zone_device() to simplify the code
  mm/memory_hotplug: fix potential permanent lru cache disable
  mm/memory_hotplug: make HWPoisoned dirty swapcache pages unmovable

 mm/memory_hotplug.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

--
2.23.0

* [PATCH 1/3] mm/memory_hotplug: use helper zone_is_zone_device() to simplify the code
From: Miaohe Lin @ 2021-08-21  9:42 UTC
To: akpm
Cc: naoya.horiguchi, mhocko, minchan, cgoldswo, linux-mm, linux-kernel, linmiaohe

Use helper zone_is_zone_device() to simplify the code and remove some
explicit CONFIG_ZONE_DEVICE codes.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 mm/memory_hotplug.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index b287ff3d7229..d986d3791986 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -477,15 +477,13 @@ void __ref remove_pfn_range_from_zone(struct zone *zone,
 		       sizeof(struct page) * cur_nr_pages);
 	}
 
-#ifdef CONFIG_ZONE_DEVICE
 	/*
 	 * Zone shrinking code cannot properly deal with ZONE_DEVICE. So
 	 * we will not try to shrink the zones - which is okay as
 	 * set_zone_contiguous() cannot deal with ZONE_DEVICE either way.
 	 */
-	if (zone_idx(zone) == ZONE_DEVICE)
+	if (zone_is_zone_device(zone))
 		return;
-#endif
 
 	clear_zone_contiguous(zone);
 
--
2.23.0

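The reason the #ifdef can go away is that the helper itself carries the config dependency. The following is a minimal userspace sketch of that pattern, not the kernel's actual header: struct zone, its idx field, the shortened zone_type enum and should_skip_zone_shrinking() are all hypothetical stand-ins, and CONFIG_ZONE_DEVICE is assumed to be an ordinary preprocessor macro that is left undefined here.

```c
#include <stdbool.h>

/* Toy stand-ins for the kernel's zone types; not the real definitions. */
enum zone_type {
	ZONE_NORMAL,
	ZONE_MOVABLE,
#ifdef CONFIG_ZONE_DEVICE
	ZONE_DEVICE,
#endif
	MAX_NR_ZONES
};

struct zone {
	enum zone_type idx;	/* hypothetical field used only by this model */
};

static inline enum zone_type zone_idx(const struct zone *zone)
{
	return zone->idx;
}

/*
 * The helper hides the config dependency: when ZONE_DEVICE is compiled
 * out it is a constant false, so callers need no #ifdef of their own.
 */
#ifdef CONFIG_ZONE_DEVICE
static inline bool zone_is_zone_device(const struct zone *zone)
{
	return zone_idx(zone) == ZONE_DEVICE;
}
#else
static inline bool zone_is_zone_device(const struct zone *zone)
{
	(void)zone;
	return false;
}
#endif

/* Caller shaped like the patched check (hypothetical wrapper). */
static bool should_skip_zone_shrinking(const struct zone *zone)
{
	return zone_is_zone_device(zone);
}
```

With the macro undefined, the caller's branch reduces to a constant false, which is the same outcome the removed #ifdef produced by hand.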
* Re: [PATCH 1/3] mm/memory_hotplug: use helper zone_is_zone_device() to simplify the code
From: HORIGUCHI NAOYA(堀口 直也) @ 2021-08-23  8:20 UTC
To: Miaohe Lin
Cc: akpm, mhocko, minchan, cgoldswo, linux-mm, linux-kernel

On Sat, Aug 21, 2021 at 05:42:44PM +0800, Miaohe Lin wrote:
> Use helper zone_is_zone_device() to simplify the code and remove some
> explicit CONFIG_ZONE_DEVICE codes.
>
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>

Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>

* Re: [PATCH 1/3] mm/memory_hotplug: use helper zone_is_zone_device() to simplify the code
From: Oscar Salvador @ 2021-08-23  9:11 UTC
To: Miaohe Lin
Cc: akpm, naoya.horiguchi, mhocko, minchan, cgoldswo, linux-mm, linux-kernel

On 2021-08-21 11:42, Miaohe Lin wrote:
> Use helper zone_is_zone_device() to simplify the code and remove some
> explicit CONFIG_ZONE_DEVICE codes.
>
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>

Reviewed-by: Oscar Salvador <osalvador@suse.de>

--
Oscar Salvador
SUSE L3

* Re: [PATCH 1/3] mm/memory_hotplug: use helper zone_is_zone_device() to simplify the code
From: David Hildenbrand @ 2021-08-23 12:14 UTC
To: Miaohe Lin, akpm
Cc: naoya.horiguchi, mhocko, minchan, cgoldswo, linux-mm, linux-kernel

On 21.08.21 11:42, Miaohe Lin wrote:
> Use helper zone_is_zone_device() to simplify the code and remove some
> explicit CONFIG_ZONE_DEVICE codes.
>
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> ---
>  mm/memory_hotplug.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index b287ff3d7229..d986d3791986 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -477,15 +477,13 @@ void __ref remove_pfn_range_from_zone(struct zone *zone,
>  		       sizeof(struct page) * cur_nr_pages);
>  	}
>
> -#ifdef CONFIG_ZONE_DEVICE
>  	/*
>  	 * Zone shrinking code cannot properly deal with ZONE_DEVICE. So
>  	 * we will not try to shrink the zones - which is okay as
>  	 * set_zone_contiguous() cannot deal with ZONE_DEVICE either way.
>  	 */
> -	if (zone_idx(zone) == ZONE_DEVICE)
> +	if (zone_is_zone_device(zone))
>  		return;
> -#endif
>
>  	clear_zone_contiguous(zone);
>

Reviewed-by: David Hildenbrand <david@redhat.com>

--
Thanks,

David / dhildenb

* [PATCH 2/3] mm/memory_hotplug: fix potential permanent lru cache disable
From: Miaohe Lin @ 2021-08-21  9:42 UTC
To: akpm
Cc: naoya.horiguchi, mhocko, minchan, cgoldswo, linux-mm, linux-kernel, linmiaohe

If offline_pages failed after lru_cache_disable(), it forgot to do
lru_cache_enable() in error path. So we would have lru cache disabled
permanently in this case.

Fixes: d479960e44f2 ("mm: disable LRU pagevec during the migration temporarily")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 mm/memory_hotplug.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index d986d3791986..9fd0be32a281 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -2033,6 +2033,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
 	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
 	memory_notify(MEM_CANCEL_OFFLINE, &arg);
 failed_removal_pcplists_disabled:
+	lru_cache_enable();
 	zone_pcp_enable(zone);
 failed_removal:
 	pr_debug("memory offlining [mem %#010llx-%#010llx] failed due to %s\n",
--
2.23.0

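The bug is an unbalanced disable/enable pair on a goto-based error path. The sketch below is a simplified userspace model of that shape, assuming the disable state behaves like a plain counter; the counters, the fail_at enum and offline_pages_model() are hypothetical names for illustration, and only the label names are taken from the diff above.

```c
#include <assert.h>
#include <errno.h>

/* Toy model: plain counters stand in for the kernel's disable state. */
static int lru_disable_count;
static int pcp_disable_count;

static void lru_cache_disable(void) { lru_disable_count++; }
static void lru_cache_enable(void)
{
	assert(lru_disable_count > 0);	/* catch an unbalanced enable */
	lru_disable_count--;
}
static void zone_pcp_disable(void) { pcp_disable_count++; }
static void zone_pcp_enable(void) { pcp_disable_count--; }

/* Hypothetical failure points, used only to drive the model. */
enum fail_at { FAIL_NONE, FAIL_EARLY, FAIL_ISOLATE, FAIL_MIGRATE };

static int offline_pages_model(enum fail_at fail)
{
	int ret = 0;

	if (fail == FAIL_EARLY) {
		ret = -EINVAL;
		goto failed_removal;	/* nothing disabled yet */
	}

	zone_pcp_disable();
	lru_cache_disable();

	if (fail == FAIL_ISOLATE) {
		ret = -EBUSY;
		goto failed_removal_pcplists_disabled;
	}
	if (fail == FAIL_MIGRATE) {
		ret = -EBUSY;
		goto failed_removal_isolated;
	}

	/* Success path re-enables both before returning. */
	lru_cache_enable();
	zone_pcp_enable();
	return 0;

failed_removal_isolated:
	/* the real code undoes isolation and cancels the offline here */
failed_removal_pcplists_disabled:
	lru_cache_enable();	/* the call the patch adds */
	zone_pcp_enable();
failed_removal:
	return ret;
}
```

Deleting the marked lru_cache_enable() call from the model leaves lru_disable_count at 1 after any failure past the disable step, which corresponds to the permanently disabled state the changelog describes.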
* Re: [PATCH 2/3] mm/memory_hotplug: fix potential permanent lru cache disable
From: HORIGUCHI NAOYA(堀口 直也) @ 2021-08-23  8:21 UTC
To: Miaohe Lin
Cc: akpm, mhocko, minchan, cgoldswo, linux-mm, linux-kernel

On Sat, Aug 21, 2021 at 05:42:45PM +0800, Miaohe Lin wrote:
> If offline_pages failed after lru_cache_disable(), it forgot to do
> lru_cache_enable() in error path. So we would have lru cache disabled
> permanently in this case.
>
> Fixes: d479960e44f2 ("mm: disable LRU pagevec during the migration temporarily")
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>

Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>

* Re: [PATCH 2/3] mm/memory_hotplug: fix potential permanent lru cache disable
From: Oscar Salvador @ 2021-08-23  9:15 UTC
To: Miaohe Lin
Cc: akpm, naoya.horiguchi, mhocko, minchan, cgoldswo, linux-mm, linux-kernel

On 2021-08-21 11:42, Miaohe Lin wrote:
> If offline_pages failed after lru_cache_disable(), it forgot to do
> lru_cache_enable() in error path. So we would have lru cache disabled
> permanently in this case.
>
> Fixes: d479960e44f2 ("mm: disable LRU pagevec during the migration
> temporarily")
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>

Reviewed-by: Oscar Salvador <osalvador@suse.de>

Should this go to stable?
In case we fail to enable it again, we will bypass the pvec cache anytime
we add a new page to the LRU which might lead to severe performance
regression?

> ---
>  mm/memory_hotplug.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index d986d3791986..9fd0be32a281 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -2033,6 +2033,7 @@ int __ref offline_pages(unsigned long start_pfn,
> unsigned long nr_pages,
>  	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
>  	memory_notify(MEM_CANCEL_OFFLINE, &arg);
>  failed_removal_pcplists_disabled:
> +	lru_cache_enable();
>  	zone_pcp_enable(zone);
>  failed_removal:
>  	pr_debug("memory offlining [mem %#010llx-%#010llx] failed due to
> %s\n",

--
Oscar Salvador
SUSE L3

* Re: [PATCH 2/3] mm/memory_hotplug: fix potential permanent lru cache disable
From: Miaohe Lin @ 2021-08-23 11:13 UTC
To: Oscar Salvador
Cc: akpm, naoya.horiguchi, mhocko, minchan, cgoldswo, linux-mm, linux-kernel

On 2021/8/23 17:15, Oscar Salvador wrote:
> On 2021-08-21 11:42, Miaohe Lin wrote:
>> If offline_pages failed after lru_cache_disable(), it forgot to do
>> lru_cache_enable() in error path. So we would have lru cache disabled
>> permanently in this case.
>>
>> Fixes: d479960e44f2 ("mm: disable LRU pagevec during the migration temporarily")
>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
>
> Reviewed-by: Oscar Salvador <osalvador@suse.de>
>

Many thanks for your review and reply. :)

> Should this go to stable?
> In case we fail to enable it again, we will bypass the pvec cache anytime we add a new page to the LRU which might lead to severe performance regression?
>

Agree with you. I think this should go to stable too.

>> ---
>>  mm/memory_hotplug.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index d986d3791986..9fd0be32a281 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -2033,6 +2033,7 @@ int __ref offline_pages(unsigned long start_pfn,
>> unsigned long nr_pages,
>>  	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
>>  	memory_notify(MEM_CANCEL_OFFLINE, &arg);
>>  failed_removal_pcplists_disabled:
>> +	lru_cache_enable();
>>  	zone_pcp_enable(zone);
>>  failed_removal:
>>  	pr_debug("memory offlining [mem %#010llx-%#010llx] failed due to %s\n",
>

* Re: [PATCH 2/3] mm/memory_hotplug: fix potential permanent lru cache disable
From: David Hildenbrand @ 2021-08-23 12:15 UTC
To: Miaohe Lin, akpm
Cc: naoya.horiguchi, mhocko, minchan, cgoldswo, linux-mm, linux-kernel

On 21.08.21 11:42, Miaohe Lin wrote:
> If offline_pages failed after lru_cache_disable(), it forgot to do
> lru_cache_enable() in error path. So we would have lru cache disabled
> permanently in this case.
>
> Fixes: d479960e44f2 ("mm: disable LRU pagevec during the migration temporarily")
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> ---
>  mm/memory_hotplug.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index d986d3791986..9fd0be32a281 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -2033,6 +2033,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
>  	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
>  	memory_notify(MEM_CANCEL_OFFLINE, &arg);
>  failed_removal_pcplists_disabled:
> +	lru_cache_enable();
>  	zone_pcp_enable(zone);
>  failed_removal:
>  	pr_debug("memory offlining [mem %#010llx-%#010llx] failed due to %s\n",
>

Reviewed-by: David Hildenbrand <david@redhat.com>

As mentioned, this should be backported to stable.

--
Thanks,

David / dhildenb

* [PATCH 3/3] mm/memory_hotplug: make HWPoisoned dirty swapcache pages unmovable
From: Miaohe Lin @ 2021-08-21  9:42 UTC
To: akpm
Cc: naoya.horiguchi, mhocko, minchan, cgoldswo, linux-mm, linux-kernel, linmiaohe

HWPoisoned dirty swapcache pages are kept for killing owner processes.
We should not offline these pages or do_swap_page() would access the
offline pages and lead to bad ending.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 mm/memory_hotplug.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 9fd0be32a281..0488eed3327c 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1664,6 +1664,12 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
 		 */
 		if (PageOffline(page) && page_count(page))
 			return -EBUSY;
+		/*
+		 * HWPoisoned dirty swapcache pages are definitely unmovable
+		 * because they are kept for killing owner processes.
+		 */
+		if (PageHWPoison(page) && PageSwapCache(page))
+			return -EBUSY;
 
 		if (!PageHuge(page))
 			continue;
--
2.23.0

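To make the effect of the proposed check concrete, here is a rough userspace model of a scan loop that aborts with -EBUSY on such a page. It is only a sketch under assumptions: struct page_model, its boolean fields and scan_movable_pages_model() are hypothetical, the real function inspects page flags and handles more cases, and the -ENOENT return for "nothing movable found" is an assumption of the model.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy page descriptor; the booleans stand in for page flag tests. */
struct page_model {
	bool lru;	/* page is on an LRU list, so it can be migrated */
	bool hwpoison;	/* models PageHWPoison() */
	bool swapcache;	/* models PageSwapCache() */
};

/*
 * Returns 0 and stores the index of the first movable page in *movable,
 * -EBUSY if a page the patch treats as definitely unmovable comes first,
 * or -ENOENT if the range contains nothing to migrate.
 */
static int scan_movable_pages_model(const struct page_model *pages, size_t n,
				    size_t *movable)
{
	for (size_t i = 0; i < n; i++) {
		if (pages[i].lru) {
			*movable = i;
			return 0;
		}
		/* The check the patch proposes: give up on the whole range. */
		if (pages[i].hwpoison && pages[i].swapcache)
			return -EBUSY;
	}
	return -ENOENT;
}
```

In this model a single poisoned swapcache page makes the whole range fail to offline, which is the behaviour questioned in the replies that follow.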
* Re: [PATCH 3/3] mm/memory_hotplug: make HWPoisoned dirty swapcache pages unmovable
From: HORIGUCHI NAOYA(堀口 直也) @ 2021-08-23  8:26 UTC
To: Miaohe Lin
Cc: akpm, mhocko, minchan, cgoldswo, linux-mm, linux-kernel

On Sat, Aug 21, 2021 at 05:42:46PM +0800, Miaohe Lin wrote:
> HWPoisoned dirty swapcache pages are kept for killing owner processes.
> We should not offline these pages or do_swap_page() would access the
> offline pages and lead to bad ending.
>

Thank you for the report. I'm not yet sure of the whole picture of this
issue. do_swap_page() is expected to return with fault VM_FAULT_HWPOISON
when called via the access to the error page, so I wonder why this doesn't
work for your situation. And what is the "bad ending" in the description?

I feel that aborting memory hotremove due to a hwpoisoned dirty swapcache
might be too hard, so I'd like to find another solution if we have.

# You may separate this patch from former two to make them merged to
# mainline soon.

Thanks,
Naoya Horiguchi

> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> ---
>  mm/memory_hotplug.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 9fd0be32a281..0488eed3327c 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1664,6 +1664,12 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
>  		 */
>  		if (PageOffline(page) && page_count(page))
>  			return -EBUSY;
> +		/*
> +		 * HWPoisoned dirty swapcache pages are definitely unmovable
> +		 * because they are kept for killing owner processes.
> +		 */
> +		if (PageHWPoison(page) && PageSwapCache(page))
> +			return -EBUSY;

* Re: [PATCH 3/3] mm/memory_hotplug: make HWPoisoned dirty swapcache pages unmovable
From: Miaohe Lin @ 2021-08-23  9:14 UTC
To: HORIGUCHI NAOYA(堀口 直也)
Cc: akpm, mhocko, minchan, cgoldswo, linux-mm, linux-kernel

On 2021/8/23 16:26, HORIGUCHI NAOYA(堀口 直也) wrote:
> On Sat, Aug 21, 2021 at 05:42:46PM +0800, Miaohe Lin wrote:
>> HWPoisoned dirty swapcache pages are kept for killing owner processes.
>> We should not offline these pages or do_swap_page() would access the
>> offline pages and lead to bad ending.
>>
>
> Thank you for the report. I'm not yet sure of the whole picture of this
> issue. do_swap_page() is expected to return with fault VM_FAULT_HWPOISON
> when called via the access to the error page, so I wonder why this doesn't
> work for your situation. And what is the "bad ending" in the description?
>

IMO we might hotremove the page while SwapCache still has a ref to it. Thus the page
struct would be accessed after being offlined. The page struct should be invalid in this case
and this would make do_swap_page fragile. Or am I missing something?

> I feel that aborting memory hotremove due to a hwpoisoned dirty swapcache
> might be too hard, so I'd like to find another solution if we have.

If there is a better way, we can just drop this one.

Many thanks for your review and reply! :)

> # You may separate this patch from former two to make them merged to
> # mainline soon.
>
> Thanks,
> Naoya Horiguchi
>
>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
>> ---
>>  mm/memory_hotplug.c | 6 ++++++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index 9fd0be32a281..0488eed3327c 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -1664,6 +1664,12 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
>>  		 */
>>  		if (PageOffline(page) && page_count(page))
>>  			return -EBUSY;
>> +		/*
>> +		 * HWPoisoned dirty swapcache pages are definitely unmovable
>> +		 * because they are kept for killing owner processes.
>> +		 */
>> +		if (PageHWPoison(page) && PageSwapCache(page))
>> +			return -EBUSY;

* Re: [PATCH 3/3] mm/memory_hotplug: make HWPoisoned dirty swapcache pages unmovable
From: Andrew Morton @ 2021-11-04 22:07 UTC
To: Miaohe Lin
Cc: HORIGUCHI NAOYA, mhocko, minchan, cgoldswo, linux-mm, linux-kernel

On Mon, 23 Aug 2021 17:14:29 +0800 Miaohe Lin <linmiaohe@huawei.com> wrote:

> On 2021/8/23 16:26, HORIGUCHI NAOYA(堀口 直也) wrote:
> > On Sat, Aug 21, 2021 at 05:42:46PM +0800, Miaohe Lin wrote:
> >> HWPoisoned dirty swapcache pages are kept for killing owner processes.
> >> We should not offline these pages or do_swap_page() would access the
> >> offline pages and lead to bad ending.
> >>
> >
> > Thank you for the report. I'm not yet sure of the whole picture of this
> > issue. do_swap_page() is expected to return with fault VM_FAULT_HWPOISON
> > when called via the access to the error page, so I wonder why this doesn't
> > work for your situation. And what is the "bad ending" in the description?
> >
>
> IMO we might hotremove the page while SwapCache still has a ref to it. Thus the page
> struct would be accessed after being offlined. The page struct should be invalid in this case
> and this would make do_swap_page fragile. Or am I missing something?
>
> > I feel that aborting memory hotremove due to a hwpoisoned dirty swapcache
> > might be too hard, so I'd like to find another solution if we have.
>
> If there is a better way, we can just drop this one.
>
> Many thanks for your review and reply! :)
>
> > # You may separate this patch from former two to make them merged to
> > # mainline soon.
>
> ...
>
> >> --- a/mm/memory_hotplug.c
> >> +++ b/mm/memory_hotplug.c
> >> @@ -1664,6 +1664,12 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
> >>  		 */
> >>  		if (PageOffline(page) && page_count(page))
> >>  			return -EBUSY;
> >> +		/*
> >> +		 * HWPoisoned dirty swapcache pages are definitely unmovable
> >> +		 * because they are kept for killing owner processes.
> >> +		 */
> >> +		if (PageHWPoison(page) && PageSwapCache(page))
> >> +			return -EBUSY;
>

I'll drop this. Please resend something if you still believe that
changes are desirable.

end of thread, newest message 2021-11-04 22:07 UTC

Thread overview: 14+ messages
2021-08-21  9:42 [PATCH 0/3] Cleanup and fixups for memory hotplug Miaohe Lin
2021-08-21  9:42 ` [PATCH 1/3] mm/memory_hotplug: use helper zone_is_zone_device() to simplify the code Miaohe Lin
2021-08-23  8:20   ` HORIGUCHI NAOYA(堀口 直也)
2021-08-23  9:11   ` Oscar Salvador
2021-08-23 12:14   ` David Hildenbrand
2021-08-21  9:42 ` [PATCH 2/3] mm/memory_hotplug: fix potential permanent lru cache disable Miaohe Lin
2021-08-23  8:21   ` HORIGUCHI NAOYA(堀口 直也)
2021-08-23  9:15   ` Oscar Salvador
2021-08-23 11:13     ` Miaohe Lin
2021-08-23 12:15   ` David Hildenbrand
2021-08-21  9:42 ` [PATCH 3/3] mm/memory_hotplug: make HWPoisoned dirty swapcache pages unmovable Miaohe Lin
2021-08-23  8:26   ` HORIGUCHI NAOYA(堀口 直也)
2021-08-23  9:14     ` Miaohe Lin
2021-11-04 22:07       ` Andrew Morton