linux-kernel.vger.kernel.org archive mirror
* [PATCH V2] mm/madvise: Enable (soft|hard) offline of HugeTLB pages at PGD level
@ 2017-04-26  3:57 Anshuman Khandual
  2017-05-12 21:35 ` Andrew Morton
  2017-05-16 10:05 ` [PATCH V3] " Anshuman Khandual
  0 siblings, 2 replies; 6+ messages in thread
From: Anshuman Khandual @ 2017-04-26  3:57 UTC
  To: linux-kernel, linux-mm; +Cc: n-horiguchi, akpm, aneesh.kumar

Though migrating gigantic HugeTLB pages does not sound much like a real
world use case, they can be affected by memory errors. Hence migration
of PGD level HugeTLB pages should be supported, if only to enable the
soft and hard offline use cases.

While allocating the new gigantic HugeTLB page, it should not matter
whether the new page comes from the same node or not. There would be
very few gigantic pages on the system after all, so we should not be
bothered about node locality when trying to save a big page from
crashing.

This introduces a new HugeTLB allocator called alloc_huge_page_nonid()
which will scan over all online nodes on the system and allocate a
single HugeTLB page.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
Tested on a POWER8 machine with 16GB pages along with Aneesh's
recent HugeTLB enablement patch series on powerpc which can
be found here.

https://lkml.org/lkml/2017/4/17/225

Here, we directly call alloc_huge_page_nonid() which ignores the
node locality. But we could also first call the normal alloc_huge_page()
with the node number and only call alloc_huge_page_nonid() as a
fallback if that allocation fails.

Aneesh pointed out the waste of memory if we just have to soft
offline a single page. The problem exists for both PGD and PMD
level HugeTLB pages. Tried solving the problem with
https://patchwork.kernel.org/patch/9690119/ but right now
madvise() splits the entire range into HugeTLB pages (if the page
is a HugeTLB one) and calls soft_offline_page() on the head page
of each HugeTLB page, because soft_offline_page() acts on the
entire HugeTLB range and not just the individual pages. Changing
the iterator in madvise() to scan over individual pages solves the
problem, but it then creates multiple HugeTLB migrations
(HUGE_PAGE_SIZE / PAGE_SIZE times to be precise) when we really
just have to soft offline a single HugeTLB page, which is not
optimal.

Hence for now, let's just enable PGD level HugeTLB soft offline
on par with PMD level HugeTLB, before we go back and address the
memory wastage problem comprehensively for both PGD and PMD level
HugeTLB pages as mentioned above.

Changes in V2:
 * Added hstate_is_gigantic() definition when !CONFIG_HUGETLB_PAGE
   which takes care of the build failure reported earlier.

 include/linux/hugetlb.h |  9 ++++++++-
 mm/hugetlb.c            | 17 +++++++++++++++++
 mm/memory-failure.c     |  8 ++++++--
 3 files changed, 31 insertions(+), 3 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 04b73a9c8b4b..964d964f22c8 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -347,6 +347,7 @@ struct huge_bootmem_page {
 
 struct page *alloc_huge_page(struct vm_area_struct *vma,
 				unsigned long addr, int avoid_reserve);
+struct page *alloc_huge_page_nonid(struct hstate *h);
 struct page *alloc_huge_page_node(struct hstate *h, int nid);
 struct page *alloc_huge_page_noerr(struct vm_area_struct *vma,
 				unsigned long addr, int avoid_reserve);
@@ -473,7 +474,11 @@ extern int dissolve_free_huge_pages(unsigned long start_pfn,
 static inline bool hugepage_migration_supported(struct hstate *h)
 {
 #ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
-	return huge_page_shift(h) == PMD_SHIFT;
+	if ((huge_page_shift(h) == PMD_SHIFT) ||
+		(huge_page_shift(h) == PGDIR_SHIFT))
+		return true;
+	else
+		return false;
 #else
 	return false;
 #endif
@@ -511,6 +516,7 @@ static inline void hugetlb_count_sub(long l, struct mm_struct *mm)
 #else	/* CONFIG_HUGETLB_PAGE */
 struct hstate {};
 #define alloc_huge_page(v, a, r) NULL
+#define alloc_huge_page_nonid(h) NULL
 #define alloc_huge_page_node(h, nid) NULL
 #define alloc_huge_page_noerr(v, a, r) NULL
 #define alloc_bootmem_huge_page(h) NULL
@@ -525,6 +531,7 @@ struct hstate {};
 #define vma_mmu_pagesize(v) PAGE_SIZE
 #define huge_page_order(h) 0
 #define huge_page_shift(h) PAGE_SHIFT
+#define hstate_is_gigantic(h) 0
 static inline unsigned int pages_per_huge_page(struct hstate *h)
 {
 	return 1;
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 97a44db06850..bd96fff2bc09 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1669,6 +1669,23 @@ struct page *__alloc_buddy_huge_page_with_mpol(struct hstate *h,
 	return __alloc_buddy_huge_page(h, vma, addr, NUMA_NO_NODE);
 }
 
+struct page *alloc_huge_page_nonid(struct hstate *h)
+{
+	struct page *page = NULL;
+	int nid = 0;
+
+	spin_lock(&hugetlb_lock);
+	if (h->free_huge_pages - h->resv_huge_pages > 0) {
+		for_each_online_node(nid) {
+			page = dequeue_huge_page_node(h, nid);
+			if (page)
+				break;
+		}
+	}
+	spin_unlock(&hugetlb_lock);
+	return page;
+}
+
 /*
  * This allocation function is useful in the context where vma is irrelevant.
  * E.g. soft-offlining uses this function because it only cares physical
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index fe64d7729a8e..d4f5710cf3f7 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1481,11 +1481,15 @@ EXPORT_SYMBOL(unpoison_memory);
 static struct page *new_page(struct page *p, unsigned long private, int **x)
 {
 	int nid = page_to_nid(p);
-	if (PageHuge(p))
+	if (PageHuge(p)) {
+		if (hstate_is_gigantic(page_hstate(compound_head(p))))
+			return alloc_huge_page_nonid(page_hstate(compound_head(p)));
+
 		return alloc_huge_page_node(page_hstate(compound_head(p)),
 						   nid);
-	else
+	} else {
 		return __alloc_pages_node(nid, GFP_HIGHUSER_MOVABLE, 0);
+	}
 }
 
 /*
-- 
2.12.0


* Re: [PATCH V2] mm/madvise: Enable (soft|hard) offline of HugeTLB pages at PGD level
  2017-04-26  3:57 [PATCH V2] mm/madvise: Enable (soft|hard) offline of HugeTLB pages at PGD level Anshuman Khandual
@ 2017-05-12 21:35 ` Andrew Morton
  2017-05-14  4:11   ` Anshuman Khandual
  2017-05-16 10:05 ` [PATCH V3] " Anshuman Khandual
  1 sibling, 1 reply; 6+ messages in thread
From: Andrew Morton @ 2017-05-12 21:35 UTC
  To: Anshuman Khandual; +Cc: linux-kernel, linux-mm, n-horiguchi, aneesh.kumar

On Wed, 26 Apr 2017 09:27:31 +0530 Anshuman Khandual <khandual@linux.vnet.ibm.com> wrote:

> Though migrating gigantic HugeTLB pages does not sound much like a real
> world use case, they can be affected by memory errors. Hence migration
> of PGD level HugeTLB pages should be supported, if only to enable the
> soft and hard offline use cases.
> 
> While allocating the new gigantic HugeTLB page, it should not matter
> whether the new page comes from the same node or not. There would be
> very few gigantic pages on the system after all, so we should not be
> bothered about node locality when trying to save a big page from
> crashing.
> 
> This introduces a new HugeTLB allocator called alloc_huge_page_nonid()
> which will scan over all online nodes on the system and allocate a
> single HugeTLB page.
> 
> ...
>
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1669,6 +1669,23 @@ struct page *__alloc_buddy_huge_page_with_mpol(struct hstate *h,
>  	return __alloc_buddy_huge_page(h, vma, addr, NUMA_NO_NODE);
>  }
>  
> +struct page *alloc_huge_page_nonid(struct hstate *h)
> +{
> +	struct page *page = NULL;
> +	int nid = 0;
> +
> +	spin_lock(&hugetlb_lock);
> +	if (h->free_huge_pages - h->resv_huge_pages > 0) {
> +		for_each_online_node(nid) {
> +			page = dequeue_huge_page_node(h, nid);
> +			if (page)
> +				break;
> +		}
> +	}
> +	spin_unlock(&hugetlb_lock);
> +	return page;
> +}
> +
>  /*
>   * This allocation function is useful in the context where vma is irrelevant.
>   * E.g. soft-offlining uses this function because it only cares physical
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index fe64d7729a8e..d4f5710cf3f7 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1481,11 +1481,15 @@ EXPORT_SYMBOL(unpoison_memory);
>  static struct page *new_page(struct page *p, unsigned long private, int **x)
>  {
>  	int nid = page_to_nid(p);
> -	if (PageHuge(p))
> +	if (PageHuge(p)) {
> +		if (hstate_is_gigantic(page_hstate(compound_head(p))))
> +			return alloc_huge_page_nonid(page_hstate(compound_head(p)));
> +
>  		return alloc_huge_page_node(page_hstate(compound_head(p)),
>  						   nid);
> -	else
> +	} else {
>  		return __alloc_pages_node(nid, GFP_HIGHUSER_MOVABLE, 0);
> +	}
>  }

Rather than adding alloc_huge_page_nonid(), would it be neater to teach
alloc_huge_page_node() (actually dequeue_huge_page_node()) to understand
nid==NUMA_NO_NODE?


* Re: [PATCH V2] mm/madvise: Enable (soft|hard) offline of HugeTLB pages at PGD level
  2017-05-12 21:35 ` Andrew Morton
@ 2017-05-14  4:11   ` Anshuman Khandual
  0 siblings, 0 replies; 6+ messages in thread
From: Anshuman Khandual @ 2017-05-14  4:11 UTC
  To: Andrew Morton, Anshuman Khandual
  Cc: linux-kernel, linux-mm, n-horiguchi, aneesh.kumar

On 05/13/2017 03:05 AM, Andrew Morton wrote:
> On Wed, 26 Apr 2017 09:27:31 +0530 Anshuman Khandual <khandual@linux.vnet.ibm.com> wrote:
> 
>> Though migrating gigantic HugeTLB pages does not sound much like a real
>> world use case, they can be affected by memory errors. Hence migration
>> of PGD level HugeTLB pages should be supported, if only to enable the
>> soft and hard offline use cases.
>>
>> While allocating the new gigantic HugeTLB page, it should not matter
>> whether the new page comes from the same node or not. There would be
>> very few gigantic pages on the system after all, so we should not be
>> bothered about node locality when trying to save a big page from
>> crashing.
>>
>> This introduces a new HugeTLB allocator called alloc_huge_page_nonid()
>> which will scan over all online nodes on the system and allocate a
>> single HugeTLB page.
>>
>> ...
>>
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -1669,6 +1669,23 @@ struct page *__alloc_buddy_huge_page_with_mpol(struct hstate *h,
>>  	return __alloc_buddy_huge_page(h, vma, addr, NUMA_NO_NODE);
>>  }
>>  
>> +struct page *alloc_huge_page_nonid(struct hstate *h)
>> +{
>> +	struct page *page = NULL;
>> +	int nid = 0;
>> +
>> +	spin_lock(&hugetlb_lock);
>> +	if (h->free_huge_pages - h->resv_huge_pages > 0) {
>> +		for_each_online_node(nid) {
>> +			page = dequeue_huge_page_node(h, nid);
>> +			if (page)
>> +				break;
>> +		}
>> +	}
>> +	spin_unlock(&hugetlb_lock);
>> +	return page;
>> +}
>> +
>>  /*
>>   * This allocation function is useful in the context where vma is irrelevant.
>>   * E.g. soft-offlining uses this function because it only cares physical
>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>> index fe64d7729a8e..d4f5710cf3f7 100644
>> --- a/mm/memory-failure.c
>> +++ b/mm/memory-failure.c
>> @@ -1481,11 +1481,15 @@ EXPORT_SYMBOL(unpoison_memory);
>>  static struct page *new_page(struct page *p, unsigned long private, int **x)
>>  {
>>  	int nid = page_to_nid(p);
>> -	if (PageHuge(p))
>> +	if (PageHuge(p)) {
>> +		if (hstate_is_gigantic(page_hstate(compound_head(p))))
>> +			return alloc_huge_page_nonid(page_hstate(compound_head(p)));
>> +
>>  		return alloc_huge_page_node(page_hstate(compound_head(p)),
>>  						   nid);
>> -	else
>> +	} else {
>>  		return __alloc_pages_node(nid, GFP_HIGHUSER_MOVABLE, 0);
>> +	}
>>  }
> 
> Rather than adding alloc_huge_page_nonid(), would it be neater to teach
> alloc_huge_page_node() (actually dequeue_huge_page_node()) to understand
> nid==NUMA_NO_NODE?

Sure, will change dequeue_huge_page_node() to accommodate NUMA_NO_NODE and
let soft offline call with NUMA_NO_NODE in case of gigantic huge pages.


* [PATCH V3] mm/madvise: Enable (soft|hard) offline of HugeTLB pages at PGD level
  2017-04-26  3:57 [PATCH V2] mm/madvise: Enable (soft|hard) offline of HugeTLB pages at PGD level Anshuman Khandual
  2017-05-12 21:35 ` Andrew Morton
@ 2017-05-16 10:05 ` Anshuman Khandual
  2017-07-28  0:49   ` Mike Kravetz
  1 sibling, 1 reply; 6+ messages in thread
From: Anshuman Khandual @ 2017-05-16 10:05 UTC
  To: linux-kernel, linux-mm; +Cc: akpm

Though migrating gigantic HugeTLB pages does not sound much like a real
world use case, they can be affected by memory errors. Hence migration
of PGD level HugeTLB pages should be supported, if only to enable the
soft and hard offline use cases.

While allocating the new gigantic HugeTLB page, it should not matter
whether the new page comes from the same node or not. There would be
very few gigantic pages on the system after all, so we should not be
bothered about node locality when trying to save a big page from
crashing.

This change renames the dequeue_huge_page_node() function to
dequeue_huge_page_node_exact(), preserving its original functionality.
The new dequeue_huge_page_node() function scans through all online
nodes to allocate a huge page in the NUMA_NO_NODE case and simply
calls dequeue_huge_page_node_exact() in all other cases.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
Changes in V3:
* Dropped alloc_huge_page_nonid() as per Andrew
* Changed dequeue_huge_page_node() to accommodate NUMA_NO_NODE as per Andrew
* Added dequeue_huge_page_node_exact() which implements functionality for the
  previous dequeue_huge_page_node() function

Changes in V2:
 * Added hstate_is_gigantic() definition when !CONFIG_HUGETLB_PAGE
   which takes care of the build failure reported earlier.

 include/linux/hugetlb.h |  7 ++++++-
 mm/hugetlb.c            | 18 +++++++++++++++++-
 mm/memory-failure.c     | 13 +++++++++----
 3 files changed, 32 insertions(+), 6 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index b857fc8cc2ec..614a0a40f1ef 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -466,7 +466,11 @@ extern int dissolve_free_huge_pages(unsigned long start_pfn,
 static inline bool hugepage_migration_supported(struct hstate *h)
 {
 #ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
-	return huge_page_shift(h) == PMD_SHIFT;
+	if ((huge_page_shift(h) == PMD_SHIFT) ||
+		(huge_page_shift(h) == PGDIR_SHIFT))
+		return true;
+	else
+		return false;
 #else
 	return false;
 #endif
@@ -518,6 +522,7 @@ struct hstate {};
 #define vma_mmu_pagesize(v) PAGE_SIZE
 #define huge_page_order(h) 0
 #define huge_page_shift(h) PAGE_SHIFT
+#define hstate_is_gigantic(h) 0
 static inline unsigned int pages_per_huge_page(struct hstate *h)
 {
 	return 1;
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index e5828875f7bb..7cd0f09b8dd0 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -867,7 +867,7 @@ static void enqueue_huge_page(struct hstate *h, struct page *page)
 	h->free_huge_pages_node[nid]++;
 }
 
-static struct page *dequeue_huge_page_node(struct hstate *h, int nid)
+static struct page *dequeue_huge_page_node_exact(struct hstate *h, int nid)
 {
 	struct page *page;
 
@@ -887,6 +887,22 @@ static struct page *dequeue_huge_page_node(struct hstate *h, int nid)
 	return page;
 }
 
+static struct page *dequeue_huge_page_node(struct hstate *h, int nid)
+{
+	struct page *page;
+	int node;
+
+	if (nid != NUMA_NO_NODE)
+		return dequeue_huge_page_node_exact(h, nid);
+
+	for_each_online_node(node) {
+		page = dequeue_huge_page_node_exact(h, node);
+		if (page)
+			return page;
+	}
+	return NULL;
+}
+
 /* Movability of hugepages depends on migration support. */
 static inline gfp_t htlb_alloc_mask(struct hstate *h)
 {
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 2527dfeddb00..f71efae2e494 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1489,11 +1489,16 @@ EXPORT_SYMBOL(unpoison_memory);
 static struct page *new_page(struct page *p, unsigned long private, int **x)
 {
 	int nid = page_to_nid(p);
-	if (PageHuge(p))
-		return alloc_huge_page_node(page_hstate(compound_head(p)),
-						   nid);
-	else
+	if (PageHuge(p)) {
+		struct hstate *hstate = page_hstate(compound_head(p));
+
+		if (hstate_is_gigantic(hstate))
+			return alloc_huge_page_node(hstate, NUMA_NO_NODE);
+
+		return alloc_huge_page_node(hstate, nid);
+	} else {
 		return __alloc_pages_node(nid, GFP_HIGHUSER_MOVABLE, 0);
+	}
 }
 
 /*
-- 
2.12.0


* Re: [PATCH V3] mm/madvise: Enable (soft|hard) offline of HugeTLB pages at PGD level
  2017-05-16 10:05 ` [PATCH V3] " Anshuman Khandual
@ 2017-07-28  0:49   ` Mike Kravetz
  2017-07-28  5:53     ` Anshuman Khandual
  0 siblings, 1 reply; 6+ messages in thread
From: Mike Kravetz @ 2017-07-28  0:49 UTC
  To: Anshuman Khandual, linux-kernel, linux-mm; +Cc: akpm

On 05/16/2017 03:05 AM, Anshuman Khandual wrote:
> Though migrating gigantic HugeTLB pages does not sound much like a real
> world use case, they can be affected by memory errors. Hence migration
> of PGD level HugeTLB pages should be supported, if only to enable the
> soft and hard offline use cases.

Hi Anshuman,

Sorry for the late question, but I just stumbled on this code when
looking at something else.

It appears the primary motivation for these changes is to handle
memory errors in gigantic pages.  In this case, you migrate to
another gigantic page.  However, doesn't this assume that there is
a pre-allocated gigantic page sitting unused that will be the target
of the migration?  alloc_huge_page_node will not allocate a gigantic
page.  Or, am I missing something?

-- 
Mike Kravetz



* Re: [PATCH V3] mm/madvise: Enable (soft|hard) offline of HugeTLB pages at PGD level
  2017-07-28  0:49   ` Mike Kravetz
@ 2017-07-28  5:53     ` Anshuman Khandual
  0 siblings, 0 replies; 6+ messages in thread
From: Anshuman Khandual @ 2017-07-28  5:53 UTC
  To: Mike Kravetz, Anshuman Khandual, linux-kernel, linux-mm; +Cc: akpm

On 07/28/2017 06:19 AM, Mike Kravetz wrote:
> On 05/16/2017 03:05 AM, Anshuman Khandual wrote:
>> Though migrating gigantic HugeTLB pages does not sound much like a real
>> world use case, they can be affected by memory errors. Hence migration
>> of PGD level HugeTLB pages should be supported, if only to enable the
>> soft and hard offline use cases.
> 
> Hi Anshuman,
> 
> Sorry for the late question, but I just stumbled on this code when
> looking at something else.
> 
> It appears the primary motivation for these changes is to handle
> memory errors in gigantic pages.  In this case, you migrate to

Right.

> another gigantic page.  However, doesn't this assume that there is

Right.

> a pre-allocated gigantic page sitting unused that will be the target
> of the migration?  alloc_huge_page_node will not allocate a gigantic
> page.  Or, am I missing something?

Yes, it's in the context of 16GB pages on a POWER8 system, where all
the gigantic pages are preallocated by the platform and passed on to
the kernel through the device tree. We don't allocate these gigantic
pages at runtime.
