Naoya Horiguchi wrote: > On Fri, Apr 21, 2017 at 10:55:49AM -0500, Zi Yan wrote: >> >> Anshuman Khandual wrote: >>> On 04/21/2017 02:17 AM, Zi Yan wrote: >>>> From: Naoya Horiguchi >>>> >>>> This patch enables thp migration for soft offline. >>>> >>>> Signed-off-by: Naoya Horiguchi >>>> >>>> ChangeLog: v1 -> v5: >>>> - fix page isolation counting error >>>> >>>> Signed-off-by: Zi Yan >>>> --- >>>> mm/memory-failure.c | 35 ++++++++++++++--------------------- >>>> 1 file changed, 14 insertions(+), 21 deletions(-) >>>> >>>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c >>>> index 9b77476ef31f..23ff02eb3ed4 100644 >>>> --- a/mm/memory-failure.c >>>> +++ b/mm/memory-failure.c >>>> @@ -1481,7 +1481,17 @@ static struct page *new_page(struct page *p, unsigned long private, int **x) >>>> if (PageHuge(p)) >>>> return alloc_huge_page_node(page_hstate(compound_head(p)), >>>> nid); >>>> - else >>>> + else if (thp_migration_supported() && PageTransHuge(p)) { >>>> + struct page *thp; >>>> + >>>> + thp = alloc_pages_node(nid, >>>> + (GFP_TRANSHUGE | __GFP_THISNODE) & ~__GFP_RECLAIM, >>> Why not __GFP_RECLAIM ? Its soft offline path we wait a bit before >>> declaring that THP page cannot be allocated and hence should invoke >>> reclaim methods as well. >> I am not sure how much effort the kernel wants to put here to soft >> offline a THP. Naoya knows more here. > > What I thought at first was that soft offline is not an urgent user > and no need to reclaim (i.e. give a little impact on other thread.) > But that's not a strong opinion, so if you like __GFP_RECLAIM here, > I'm fine about that. OK, I will add __GFP_RECLAIM. > >> >>>> + HPAGE_PMD_ORDER); >>>> + if (!thp) >>>> + return NULL; >>>> + prep_transhuge_page(thp); >>>> + return thp; >>>> + } else >>>> return __alloc_pages_node(nid, GFP_HIGHUSER_MOVABLE, 0); >>>> } >>>> >>>> @@ -1665,8 +1675,8 @@ static int __soft_offline_page(struct page *page, int flags) >>>> * cannot have PAGE_MAPPING_MOVABLE. >>>> */ >>>> if (!__PageMovable(page)) >>>> - inc_node_page_state(page, NR_ISOLATED_ANON + >>>> - page_is_file_cache(page)); >>>> + mod_node_page_state(page_pgdat(page), NR_ISOLATED_ANON + >>>> + page_is_file_cache(page), hpage_nr_pages(page)); >>>> list_add(&page->lru, &pagelist); >>>> ret = migrate_pages(&pagelist, new_page, NULL, MPOL_MF_MOVE_ALL, >>>> MIGRATE_SYNC, MR_MEMORY_FAILURE); >>>> @@ -1689,28 +1699,11 @@ static int __soft_offline_page(struct page *page, int flags) >>>> static int soft_offline_in_use_page(struct page *page, int flags) >>>> { >>>> int ret; >>>> - struct page *hpage = compound_head(page); >>>> - >>>> - if (!PageHuge(page) && PageTransHuge(hpage)) { >>>> - lock_page(hpage); >>>> - if (!PageAnon(hpage) || unlikely(split_huge_page(hpage))) { >>>> - unlock_page(hpage); >>>> - if (!PageAnon(hpage)) >>>> - pr_info("soft offline: %#lx: non anonymous thp\n", page_to_pfn(page)); >>>> - else >>>> - pr_info("soft offline: %#lx: thp split failed\n", page_to_pfn(page)); >>>> - put_hwpoison_page(hpage); >>>> - return -EBUSY; >>>> - } >>>> - unlock_page(hpage); >>>> - get_hwpoison_page(page); >>>> - put_hwpoison_page(hpage); >>>> - } >>>> >>>> if (PageHuge(page)) >>>> ret = soft_offline_huge_page(page, flags); >>>> else >>>> - ret = __soft_offline_page(page, flags); >>>> + ret = __soft_offline_page(compound_head(page), flags); >>> Hmm, what if the THP allocation fails in the new_page() path and >>> we fallback for general page allocation. In that case we will >>> always be still calling with the head page ? Because we dont >>> split the huge page any more. >> This could be a problem if the user wants to offline a TailPage but due >> to THP allocation failure, the HeadPage is offlined. > > Right, "retry with split" part is unfinished, so we need some improvement. > >> It may be better to only soft offline THPs if page == >> compound_head(page). If page != compound_head(page), we still split THPs >> like before. >> >> Because in migrate_pages(), we cannot guarantee any TailPages in that >> THP are migrated (1. THP allocation failure causes THP splitting, then >> only HeadPage is going to be migrated; 2. even if we change existing >> migrate_pages() implementation to add all TailPages to migration list >> instead of LRU list, we still cannot guarantee the TailPage we want to >> migrate is migrated.). >> >> Naoya, what do you think? > > Maybe soft offline is a special caller of page migration because it > basically wants to migrate only one page, but thp migration still has > a benefit because we can avoid thp split. > So I like that we try thp migration at first, and if it fails we fall > back to split and migrate (only) a raw error page. This should be done > in caller side for soft offline, because it knows where the error page is. Make sense. So when migrate_pages() sees the migrate reason is MR_MEMORY_FAILURE, it will not split THP when newpage allocation fails. Then, the soft offline caller will split failed THP and retry migrating the error subpage. I can do that. > > As for generic case (for other migration callers which mainly want to > migrate multiple pages for their purpose,) thp split and retry can be > done in common migration code. After thp split, all subpages are linked > to migration list, then we retry without returning to the caller. > So I think that split_huge_page() can be moved to (for example) for-loop > in migrate_pages(). > > I tried to write a patch for it last year, but considering vm event > accounting, the patch might be large (~100 lines). Yes. I saw your code on your github. I can pick it up and send it for review after this patchset is merged, if you are OK with it. -- Best Regards, Yan Zi