* [PATCH v4 0/3] mm/hugetlb: memory offline issues with hugepages
From: Gerald Schaefer @ 2016-09-26 17:28 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Naoya Horiguchi, linux-mm, linux-kernel, Michal Hocko,
	Kirill A . Shutemov, Vlastimil Babka, Mike Kravetz,
	Aneesh Kumar K . V, Martin Schwidefsky, Heiko Carstens, Rui Teng,
	Dave Hansen

This addresses several issues with hugepages and memory offline. While
the first patch fixes a panic, and is therefore rather important, the
last patch is just a performance optimization.

The second patch fixes a theoretical issue with reserved hugepages,
while still leaving an ugly usability issue; see its description.

Changes in v4:
- Add check for free vs. reserved hugepages
- Revalidate checks in dissolve_free_huge_page() after taking the lock
- Split up into 3 patches

Changes in v3:
- Add Fixes: c8721bbb
- Add Cc: stable
- Elaborate on losing the gigantic page vs. failing memory offline
- Move page_count() check out of dissolve_free_huge_page()

Changes in v2:
- Update comment in dissolve_free_huge_pages()
- Change locking in dissolve_free_huge_page()

Gerald Schaefer (3):
  mm/hugetlb: fix memory offline with hugepage size > memory block size
  mm/hugetlb: check for reserved hugepages during memory offline
  mm/hugetlb: improve locking in dissolve_free_huge_pages()

 include/linux/hugetlb.h |  6 +++---
 mm/hugetlb.c            | 47 +++++++++++++++++++++++++++++++++++------------
 mm/memory_hotplug.c     |  4 +++-
 3 files changed, 41 insertions(+), 16 deletions(-)

-- 
2.8.4

* [PATCH v4 1/3] mm/hugetlb: fix memory offline with hugepage size > memory block size
From: Gerald Schaefer @ 2016-09-26 17:28 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Naoya Horiguchi, linux-mm, linux-kernel, Michal Hocko,
	Kirill A . Shutemov, Vlastimil Babka, Mike Kravetz,
	Aneesh Kumar K . V, Martin Schwidefsky, Heiko Carstens, Rui Teng,
	Dave Hansen

dissolve_free_huge_pages() will either run into the VM_BUG_ON() or a
list corruption and addressing exception when trying to set a memory
block offline that is part (but not the first part) of a "gigantic"
hugetlb page with a size > memory block size.

When no other smaller hugetlb page sizes are present, the VM_BUG_ON()
will trigger directly. In the other case we will run into an addressing
exception later, because dissolve_free_huge_page() will not work on the
head page of the compound hugetlb page which will result in a NULL
hstate from page_hstate().

To fix this, first remove the VM_BUG_ON() because it is wrong, and then
use the compound head page in dissolve_free_huge_page(). This means that
an unused pre-allocated gigantic page that has any part of itself inside
the memory block that is going offline will be dissolved completely.
Losing an unused gigantic hugepage is preferable to failing the memory
offline, for example in the situation where a (possibly faulty) memory
DIMM needs to go offline.

Fixes: c8721bbb ("mm: memory-hotplug: enable memory hotplug to handle hugepage")
Cc: <stable@vger.kernel.org>
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
---
 mm/hugetlb.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 87e11d8..603bdd0 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1443,13 +1443,14 @@ static void dissolve_free_huge_page(struct page *page)
 {
 	spin_lock(&hugetlb_lock);
 	if (PageHuge(page) && !page_count(page)) {
-		struct hstate *h = page_hstate(page);
-		int nid = page_to_nid(page);
-		list_del(&page->lru);
+		struct page *head = compound_head(page);
+		struct hstate *h = page_hstate(head);
+		int nid = page_to_nid(head);
+		list_del(&head->lru);
 		h->free_huge_pages--;
 		h->free_huge_pages_node[nid]--;
 		h->max_huge_pages--;
-		update_and_free_page(h, page);
+		update_and_free_page(h, head);
 	}
 	spin_unlock(&hugetlb_lock);
 }
@@ -1457,7 +1458,8 @@ static void dissolve_free_huge_page(struct page *page)
 /*
  * Dissolve free hugepages in a given pfn range. Used by memory hotplug to
  * make specified memory blocks removable from the system.
- * Note that start_pfn should aligned with (minimum) hugepage size.
+ * Note that this will dissolve a free gigantic hugepage completely, if any
+ * part of it lies within the given range.
  */
 void dissolve_free_huge_pages(unsigned long start_pfn, unsigned long end_pfn)
 {
@@ -1466,7 +1468,6 @@ void dissolve_free_huge_pages(unsigned long start_pfn, unsigned long end_pfn)
 	if (!hugepages_supported())
 		return;
 
-	VM_BUG_ON(!IS_ALIGNED(start_pfn, 1 << minimum_order));
 	for (pfn = start_pfn; pfn < end_pfn; pfn += 1 << minimum_order)
 		dissolve_free_huge_page(pfn_to_page(pfn));
 }
-- 
2.8.4

* [PATCH v4 2/3] mm/hugetlb: check for reserved hugepages during memory offline
From: Gerald Schaefer @ 2016-09-26 17:28 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Naoya Horiguchi, linux-mm, linux-kernel, Michal Hocko,
	Kirill A . Shutemov, Vlastimil Babka, Mike Kravetz,
	Aneesh Kumar K . V, Martin Schwidefsky, Heiko Carstens, Rui Teng,
	Dave Hansen

In dissolve_free_huge_pages(), free hugepages will be dissolved without
making sure that there are enough of them left to satisfy hugepage
reservations.

Fix this by adding a return value to dissolve_free_huge_pages() and
checking h->free_huge_pages vs. h->resv_huge_pages. Note that this may
lead to the situation where dissolve_free_huge_page() returns an error
and all free hugepages that were dissolved before that error are lost,
while the memory block still cannot be set offline.

Fixes: c8721bbb ("mm: memory-hotplug: enable memory hotplug to handle hugepage")
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
---
 include/linux/hugetlb.h |  6 +++---
 mm/hugetlb.c            | 26 +++++++++++++++++++++-----
 mm/memory_hotplug.c     |  4 +++-
 3 files changed, 27 insertions(+), 9 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index c26d463..fe99e6f 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -450,8 +450,8 @@ static inline pgoff_t basepage_index(struct page *page)
 	return __basepage_index(page);
 }
 
-extern void dissolve_free_huge_pages(unsigned long start_pfn,
-				     unsigned long end_pfn);
+extern int dissolve_free_huge_pages(unsigned long start_pfn,
+				    unsigned long end_pfn);
 static inline bool hugepage_migration_supported(struct hstate *h)
 {
 #ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
@@ -518,7 +518,7 @@ static inline pgoff_t basepage_index(struct page *page)
 {
 	return page->index;
 }
-#define dissolve_free_huge_pages(s, e)	do {} while (0)
+#define dissolve_free_huge_pages(s, e)	0
 #define hugepage_migration_supported(h)	false
 
 static inline spinlock_t *huge_pte_lockptr(struct hstate *h,
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 603bdd0..91ae1f5 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1437,22 +1437,32 @@ static int free_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed,
 
 /*
  * Dissolve a given free hugepage into free buddy pages. This function does
- * nothing for in-use (including surplus) hugepages.
+ * nothing for in-use (including surplus) hugepages. Returns -EBUSY if the
+ * number of free hugepages would be reduced below the number of reserved
+ * hugepages.
  */
-static void dissolve_free_huge_page(struct page *page)
+static int dissolve_free_huge_page(struct page *page)
 {
+	int rc = 0;
+
 	spin_lock(&hugetlb_lock);
 	if (PageHuge(page) && !page_count(page)) {
 		struct page *head = compound_head(page);
 		struct hstate *h = page_hstate(head);
 		int nid = page_to_nid(head);
+		if (h->free_huge_pages - h->resv_huge_pages == 0) {
+			rc = -EBUSY;
+			goto out;
+		}
 		list_del(&head->lru);
 		h->free_huge_pages--;
 		h->free_huge_pages_node[nid]--;
 		h->max_huge_pages--;
 		update_and_free_page(h, head);
 	}
+out:
 	spin_unlock(&hugetlb_lock);
+	return rc;
 }
 
 /*
@@ -1460,16 +1470,22 @@ static void dissolve_free_huge_page(struct page *page)
  * make specified memory blocks removable from the system.
  * Note that this will dissolve a free gigantic hugepage completely, if any
  * part of it lies within the given range.
+ * Also note that if dissolve_free_huge_page() returns with an error, all
+ * free hugepages that were dissolved before that error are lost.
  */
-void dissolve_free_huge_pages(unsigned long start_pfn, unsigned long end_pfn)
+int dissolve_free_huge_pages(unsigned long start_pfn, unsigned long end_pfn)
 {
 	unsigned long pfn;
+	int rc = 0;
 
 	if (!hugepages_supported())
-		return;
+		return rc;
 
 	for (pfn = start_pfn; pfn < end_pfn; pfn += 1 << minimum_order)
-		dissolve_free_huge_page(pfn_to_page(pfn));
+		if ((rc = dissolve_free_huge_page(pfn_to_page(pfn))))
+			break;
+
+	return rc;
 }
 
 /*
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index b58906b..13998d9 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1945,7 +1945,9 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	 * dissolve free hugepages in the memory block before doing offlining
 	 * actually in order to make hugetlbfs's object counting consistent.
 	 */
-	dissolve_free_huge_pages(start_pfn, end_pfn);
+	ret = dissolve_free_huge_pages(start_pfn, end_pfn);
+	if (ret)
+		goto failed_removal;
 	/* check again */
 	offlined_pages = check_pages_isolated(start_pfn, end_pfn);
 	if (offlined_pages < 0) {
-- 
2.8.4

* [PATCH v4 3/3] mm/hugetlb: improve locking in dissolve_free_huge_pages()
From: Gerald Schaefer @ 2016-09-26 17:28 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Naoya Horiguchi, linux-mm, linux-kernel, Michal Hocko,
	Kirill A . Shutemov, Vlastimil Babka, Mike Kravetz,
	Aneesh Kumar K . V, Martin Schwidefsky, Heiko Carstens, Rui Teng,
	Dave Hansen

For every pfn aligned to minimum_order, dissolve_free_huge_pages() will
call dissolve_free_huge_page() which takes the hugetlb spinlock, even if
the page is not huge at all or a hugepage that is in-use.

Improve this by doing the PageHuge() and page_count() checks already in
dissolve_free_huge_pages() before calling dissolve_free_huge_page(). In
dissolve_free_huge_page(), when holding the spinlock, those checks need
to be revalidated.

Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
---
 mm/hugetlb.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 91ae1f5..770d83e 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1476,14 +1476,20 @@ static int dissolve_free_huge_page(struct page *page)
 int dissolve_free_huge_pages(unsigned long start_pfn, unsigned long end_pfn)
 {
 	unsigned long pfn;
+	struct page *page;
 	int rc = 0;
 
 	if (!hugepages_supported())
 		return rc;
 
-	for (pfn = start_pfn; pfn < end_pfn; pfn += 1 << minimum_order)
-		if ((rc = dissolve_free_huge_page(pfn_to_page(pfn))))
-			break;
+	for (pfn = start_pfn; pfn < end_pfn; pfn += 1 << minimum_order) {
+		page = pfn_to_page(pfn);
+		if (PageHuge(page) && !page_count(page)) {
+			rc = dissolve_free_huge_page(page);
+			if (rc)
+				break;
+		}
+	}
 
 	return rc;
 }
-- 
2.8.4

* Re: [PATCH v4 0/3] mm/hugetlb: memory offline issues with hugepages
  2016-09-26 17:28 ` Gerald Schaefer
@ 2016-09-29  6:14   ` Naoya Horiguchi
  -1 siblings, 0 replies; 20+ messages in thread
From: Naoya Horiguchi @ 2016-09-29  6:14 UTC (permalink / raw)
  To: Gerald Schaefer
  Cc: Andrew Morton, linux-mm, linux-kernel, Michal Hocko,
	Kirill A . Shutemov, Vlastimil Babka, Mike Kravetz,
	Aneesh Kumar K . V, Martin Schwidefsky, Heiko Carstens, Rui Teng,
	Dave Hansen

On Mon, Sep 26, 2016 at 07:28:08PM +0200, Gerald Schaefer wrote:
> This addresses several issues with hugepages and memory offline. While
> the first patch fixes a panic, and is therefore rather important, the
> last patch is just a performance optimization.
> 
> The second patch fixes a theoretical issue with reserved hugepages,
> while still leaving an ugly usability issue; see its description.
> 
> Changes in v4:
> - Add check for free vs. reserved hugepages
> - Revalidate checks in dissolve_free_huge_page() after taking the lock
> - Split up into 3 patches
> 
> Changes in v3:
> - Add Fixes: c8721bbb
> - Add Cc: stable
> - Elaborate on losing the gigantic page vs. failing memory offline
> - Move page_count() check out of dissolve_free_huge_page()
> 
> Changes in v2:
> - Update comment in dissolve_free_huge_pages()
> - Change locking in dissolve_free_huge_page()
> 
> Gerald Schaefer (3):
>   mm/hugetlb: fix memory offline with hugepage size > memory block size
>   mm/hugetlb: check for reserved hugepages during memory offline
>   mm/hugetlb: improve locking in dissolve_free_huge_pages()
> 
>  include/linux/hugetlb.h |  6 +++---
>  mm/hugetlb.c            | 47 +++++++++++++++++++++++++++++++++++------------
>  mm/memory_hotplug.c     |  4 +++-
>  3 files changed, 41 insertions(+), 16 deletions(-)

I'm happy with these patches fixing/improving hugetlb offline code,
thank you very much, Gerald and reviewers/testers!

For patchset ...

Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>

* Re: [PATCH v4 1/3] mm/hugetlb: fix memory offline with hugepage size > memory block size
From: Michal Hocko @ 2016-09-29 12:11 UTC (permalink / raw)
  To: Gerald Schaefer
  Cc: Andrew Morton, Naoya Horiguchi, linux-mm, linux-kernel,
	Kirill A . Shutemov, Vlastimil Babka, Mike Kravetz,
	Aneesh Kumar K . V, Martin Schwidefsky, Heiko Carstens, Rui Teng,
	Dave Hansen

On Mon 26-09-16 19:28:09, Gerald Schaefer wrote:
> dissolve_free_huge_pages() will either run into the VM_BUG_ON() or a
> list corruption and addressing exception when trying to set a memory
> block offline that is part (but not the first part) of a "gigantic"
> hugetlb page with a size > memory block size.
> 
> When no other smaller hugetlb page sizes are present, the VM_BUG_ON()
> will trigger directly. In the other case we will run into an addressing
> exception later, because dissolve_free_huge_page() will not work on the
> head page of the compound hugetlb page which will result in a NULL
> hstate from page_hstate().
> 
> To fix this, first remove the VM_BUG_ON() because it is wrong, and then
> use the compound head page in dissolve_free_huge_page(). This means that
> an unused pre-allocated gigantic page that has any part of itself inside
> the memory block that is going offline will be dissolved completely.
> Losing an unused gigantic hugepage is preferable to failing the memory
> offline, for example in the situation where a (possibly faulty) memory
> DIMM needs to go offline.
> 
> Fixes: c8721bbb ("mm: memory-hotplug: enable memory hotplug to handle hugepage")
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/hugetlb.c | 13 +++++++------
>  1 file changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 87e11d8..603bdd0 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1443,13 +1443,14 @@ static void dissolve_free_huge_page(struct page *page)
>  {
>  	spin_lock(&hugetlb_lock);
>  	if (PageHuge(page) && !page_count(page)) {
> -		struct hstate *h = page_hstate(page);
> -		int nid = page_to_nid(page);
> -		list_del(&page->lru);
> +		struct page *head = compound_head(page);
> +		struct hstate *h = page_hstate(head);
> +		int nid = page_to_nid(head);
> +		list_del(&head->lru);
>  		h->free_huge_pages--;
>  		h->free_huge_pages_node[nid]--;
>  		h->max_huge_pages--;
> -		update_and_free_page(h, page);
> +		update_and_free_page(h, head);
>  	}
>  	spin_unlock(&hugetlb_lock);
>  }
> @@ -1457,7 +1458,8 @@ static void dissolve_free_huge_page(struct page *page)
>  /*
>   * Dissolve free hugepages in a given pfn range. Used by memory hotplug to
>   * make specified memory blocks removable from the system.
> - * Note that start_pfn should aligned with (minimum) hugepage size.
> + * Note that this will dissolve a free gigantic hugepage completely, if any
> + * part of it lies within the given range.
>   */
>  void dissolve_free_huge_pages(unsigned long start_pfn, unsigned long end_pfn)
>  {
> @@ -1466,7 +1468,6 @@ void dissolve_free_huge_pages(unsigned long start_pfn, unsigned long end_pfn)
>  	if (!hugepages_supported())
>  		return;
>  
> -	VM_BUG_ON(!IS_ALIGNED(start_pfn, 1 << minimum_order));
>  	for (pfn = start_pfn; pfn < end_pfn; pfn += 1 << minimum_order)
>  		dissolve_free_huge_page(pfn_to_page(pfn));
>  }
> -- 
> 2.8.4

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 2/3] mm/hugetlb: check for reserved hugepages during memory offline
  2016-09-26 17:28   ` Gerald Schaefer
@ 2016-09-29 12:30     ` Michal Hocko
  -1 siblings, 0 replies; 20+ messages in thread
From: Michal Hocko @ 2016-09-29 12:30 UTC (permalink / raw)
  To: Gerald Schaefer
  Cc: Andrew Morton, Naoya Horiguchi, linux-mm, linux-kernel,
	Kirill A . Shutemov, Vlastimil Babka, Mike Kravetz,
	Aneesh Kumar K . V, Martin Schwidefsky, Heiko Carstens, Rui Teng,
	Dave Hansen

On Mon 26-09-16 19:28:10, Gerald Schaefer wrote:
> In dissolve_free_huge_pages(), free hugepages will be dissolved without
> making sure that there are enough of them left to satisfy hugepage
> reservations.

otherwise a poor process with a reservation might get unexpected SIGBUS,
right?

> Fix this by adding a return value to dissolve_free_huge_pages() and
> checking h->free_huge_pages vs. h->resv_huge_pages. Note that this may
> lead to the situation where dissolve_free_huge_page() returns an error
> and all free hugepages that were dissolved before that error are lost,
> while the memory block still cannot be set offline.

Hmm, OK offline failure is certainly a better option than an application
failure.
 
> Fixes: c8721bbb ("mm: memory-hotplug: enable memory hotplug to handle hugepage")
> Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>

Acked-by: Michal Hocko <mhocko@suse.com>
> ---
>  include/linux/hugetlb.h |  6 +++---
>  mm/hugetlb.c            | 26 +++++++++++++++++++++-----
>  mm/memory_hotplug.c     |  4 +++-
>  3 files changed, 27 insertions(+), 9 deletions(-)
> 
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index c26d463..fe99e6f 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -450,8 +450,8 @@ static inline pgoff_t basepage_index(struct page *page)
>  	return __basepage_index(page);
>  }
>  
> -extern void dissolve_free_huge_pages(unsigned long start_pfn,
> -				     unsigned long end_pfn);
> +extern int dissolve_free_huge_pages(unsigned long start_pfn,
> +				    unsigned long end_pfn);
>  static inline bool hugepage_migration_supported(struct hstate *h)
>  {
>  #ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
> @@ -518,7 +518,7 @@ static inline pgoff_t basepage_index(struct page *page)
>  {
>  	return page->index;
>  }
> -#define dissolve_free_huge_pages(s, e)	do {} while (0)
> +#define dissolve_free_huge_pages(s, e)	0
>  #define hugepage_migration_supported(h)	false
>  
>  static inline spinlock_t *huge_pte_lockptr(struct hstate *h,
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 603bdd0..91ae1f5 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1437,22 +1437,32 @@ static int free_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed,
>  
>  /*
>   * Dissolve a given free hugepage into free buddy pages. This function does
> - * nothing for in-use (including surplus) hugepages.
> + * nothing for in-use (including surplus) hugepages. Returns -EBUSY if the
> + * number of free hugepages would be reduced below the number of reserved
> + * hugepages.
>   */
> -static void dissolve_free_huge_page(struct page *page)
> +static int dissolve_free_huge_page(struct page *page)
>  {
> +	int rc = 0;
> +
>  	spin_lock(&hugetlb_lock);
>  	if (PageHuge(page) && !page_count(page)) {
>  		struct page *head = compound_head(page);
>  		struct hstate *h = page_hstate(head);
>  		int nid = page_to_nid(head);
> +		if (h->free_huge_pages - h->resv_huge_pages == 0) {
> +			rc = -EBUSY;
> +			goto out;
> +		}
>  		list_del(&head->lru);
>  		h->free_huge_pages--;
>  		h->free_huge_pages_node[nid]--;
>  		h->max_huge_pages--;
>  		update_and_free_page(h, head);
>  	}
> +out:
>  	spin_unlock(&hugetlb_lock);
> +	return rc;
>  }
>  
>  /*
> @@ -1460,16 +1470,22 @@ static void dissolve_free_huge_page(struct page *page)
>   * make specified memory blocks removable from the system.
>   * Note that this will dissolve a free gigantic hugepage completely, if any
>   * part of it lies within the given range.
> + * Also note that if dissolve_free_huge_page() returns with an error, all
> + * free hugepages that were dissolved before that error are lost.
>   */
> -void dissolve_free_huge_pages(unsigned long start_pfn, unsigned long end_pfn)
> +int dissolve_free_huge_pages(unsigned long start_pfn, unsigned long end_pfn)
>  {
>  	unsigned long pfn;
> +	int rc = 0;
>  
>  	if (!hugepages_supported())
> -		return;
> +		return rc;
>  
>  	for (pfn = start_pfn; pfn < end_pfn; pfn += 1 << minimum_order)
> -		dissolve_free_huge_page(pfn_to_page(pfn));
> +		if (rc = dissolve_free_huge_page(pfn_to_page(pfn)))
> +			break;
> +
> +	return rc;
>  }
>  
>  /*
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index b58906b..13998d9 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1945,7 +1945,9 @@ static int __ref __offline_pages(unsigned long start_pfn,
>  	 * dissolve free hugepages in the memory block before doing offlining
>  	 * actually in order to make hugetlbfs's object counting consistent.
>  	 */
> -	dissolve_free_huge_pages(start_pfn, end_pfn);
> +	ret = dissolve_free_huge_pages(start_pfn, end_pfn);
> +	if (ret)
> +		goto failed_removal;
>  	/* check again */
>  	offlined_pages = check_pages_isolated(start_pfn, end_pfn);
>  	if (offlined_pages < 0) {
> -- 
> 2.8.4

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 3/3] mm/hugetlb: improve locking in dissolve_free_huge_pages()
  2016-09-26 17:28   ` Gerald Schaefer
@ 2016-09-29 12:32     ` Michal Hocko
  -1 siblings, 0 replies; 20+ messages in thread
From: Michal Hocko @ 2016-09-29 12:32 UTC (permalink / raw)
  To: Gerald Schaefer
  Cc: Andrew Morton, Naoya Horiguchi, linux-mm, linux-kernel,
	Kirill A . Shutemov, Vlastimil Babka, Mike Kravetz,
	Aneesh Kumar K . V, Martin Schwidefsky, Heiko Carstens, Rui Teng,
	Dave Hansen

On Mon 26-09-16 19:28:11, Gerald Schaefer wrote:
> For every pfn aligned to minimum_order, dissolve_free_huge_pages() will
> call dissolve_free_huge_page() which takes the hugetlb spinlock, even if
> the page is not huge at all or a hugepage that is in-use.
> 
> Improve this by doing the PageHuge() and page_count() checks already in
> dissolve_free_huge_pages() before calling dissolve_free_huge_page(). In
> dissolve_free_huge_page(), when holding the spinlock, those checks need
> to be revalidated.
> 
> Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/hugetlb.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 91ae1f5..770d83e 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1476,14 +1476,20 @@ static int dissolve_free_huge_page(struct page *page)
>  int dissolve_free_huge_pages(unsigned long start_pfn, unsigned long end_pfn)
>  {
>  	unsigned long pfn;
> +	struct page *page;
>  	int rc = 0;
>  
>  	if (!hugepages_supported())
>  		return rc;
>  
> -	for (pfn = start_pfn; pfn < end_pfn; pfn += 1 << minimum_order)
> -		if (rc = dissolve_free_huge_page(pfn_to_page(pfn)))
> -			break;
> +	for (pfn = start_pfn; pfn < end_pfn; pfn += 1 << minimum_order) {
> +		page = pfn_to_page(pfn);
> +		if (PageHuge(page) && !page_count(page)) {
> +			rc = dissolve_free_huge_page(page);
> +			if (rc)
> +				break;
> +		}
> +	}
>  
>  	return rc;
>  }
> -- 
> 2.8.4

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 2/3] mm/hugetlb: check for reserved hugepages during memory offline
  2016-09-29 12:30     ` Michal Hocko
@ 2016-09-29 17:09       ` Mike Kravetz
  -1 siblings, 0 replies; 20+ messages in thread
From: Mike Kravetz @ 2016-09-29 17:09 UTC (permalink / raw)
  To: Michal Hocko, Gerald Schaefer
  Cc: Andrew Morton, Naoya Horiguchi, linux-mm, linux-kernel,
	Kirill A . Shutemov, Vlastimil Babka, Aneesh Kumar K . V,
	Martin Schwidefsky, Heiko Carstens, Rui Teng, Dave Hansen

On 09/29/2016 05:30 AM, Michal Hocko wrote:
> On Mon 26-09-16 19:28:10, Gerald Schaefer wrote:
>> In dissolve_free_huge_pages(), free hugepages will be dissolved without
>> making sure that there are enough of them left to satisfy hugepage
>> reservations.
> 
> otherwise a poor process with a reservation might get unexpected SIGBUS,
> right?

Yes, that is correct.

> 
>> Fix this by adding a return value to dissolve_free_huge_pages() and
>> checking h->free_huge_pages vs. h->resv_huge_pages. Note that this may
>> lead to the situation where dissolve_free_huge_page() returns an error
>> and all free hugepages that were dissolved before that error are lost,
>> while the memory block still cannot be set offline.
> 
> Hmm, OK offline failure is certainly a better option than an application
> failure.

I agree.

However, if the reason for the offline is that a DIMM within the huge page
is starting to fail, then one could argue that forced offline of
the huge page would be more desirable.  We really don't know the reason for
the offline, so I think the approach of this patch is best.

--
Mike Kravetz

>  
>> Fixes: c8721bbb ("mm: memory-hotplug: enable memory hotplug to handle hugepage")
>> Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
> 
> Acked-by: Michal Hocko <mhocko@suse.com>
>> ---
>>  include/linux/hugetlb.h |  6 +++---
>>  mm/hugetlb.c            | 26 +++++++++++++++++++++-----
>>  mm/memory_hotplug.c     |  4 +++-
>>  3 files changed, 27 insertions(+), 9 deletions(-)
>>
>> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
>> index c26d463..fe99e6f 100644
>> --- a/include/linux/hugetlb.h
>> +++ b/include/linux/hugetlb.h
>> @@ -450,8 +450,8 @@ static inline pgoff_t basepage_index(struct page *page)
>>  	return __basepage_index(page);
>>  }
>>  
>> -extern void dissolve_free_huge_pages(unsigned long start_pfn,
>> -				     unsigned long end_pfn);
>> +extern int dissolve_free_huge_pages(unsigned long start_pfn,
>> +				    unsigned long end_pfn);
>>  static inline bool hugepage_migration_supported(struct hstate *h)
>>  {
>>  #ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
>> @@ -518,7 +518,7 @@ static inline pgoff_t basepage_index(struct page *page)
>>  {
>>  	return page->index;
>>  }
>> -#define dissolve_free_huge_pages(s, e)	do {} while (0)
>> +#define dissolve_free_huge_pages(s, e)	0
>>  #define hugepage_migration_supported(h)	false
>>  
>>  static inline spinlock_t *huge_pte_lockptr(struct hstate *h,
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index 603bdd0..91ae1f5 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -1437,22 +1437,32 @@ static int free_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed,
>>  
>>  /*
>>   * Dissolve a given free hugepage into free buddy pages. This function does
>> - * nothing for in-use (including surplus) hugepages.
>> + * nothing for in-use (including surplus) hugepages. Returns -EBUSY if the
>> + * number of free hugepages would be reduced below the number of reserved
>> + * hugepages.
>>   */
>> -static void dissolve_free_huge_page(struct page *page)
>> +static int dissolve_free_huge_page(struct page *page)
>>  {
>> +	int rc = 0;
>> +
>>  	spin_lock(&hugetlb_lock);
>>  	if (PageHuge(page) && !page_count(page)) {
>>  		struct page *head = compound_head(page);
>>  		struct hstate *h = page_hstate(head);
>>  		int nid = page_to_nid(head);
>> +		if (h->free_huge_pages - h->resv_huge_pages == 0) {
>> +			rc = -EBUSY;
>> +			goto out;
>> +		}
>>  		list_del(&head->lru);
>>  		h->free_huge_pages--;
>>  		h->free_huge_pages_node[nid]--;
>>  		h->max_huge_pages--;
>>  		update_and_free_page(h, head);
>>  	}
>> +out:
>>  	spin_unlock(&hugetlb_lock);
>> +	return rc;
>>  }
>>  
>>  /*
>> @@ -1460,16 +1470,22 @@ static void dissolve_free_huge_page(struct page *page)
>>   * make specified memory blocks removable from the system.
>>   * Note that this will dissolve a free gigantic hugepage completely, if any
>>   * part of it lies within the given range.
>> + * Also note that if dissolve_free_huge_page() returns with an error, all
>> + * free hugepages that were dissolved before that error are lost.
>>   */
>> -void dissolve_free_huge_pages(unsigned long start_pfn, unsigned long end_pfn)
>> +int dissolve_free_huge_pages(unsigned long start_pfn, unsigned long end_pfn)
>>  {
>>  	unsigned long pfn;
>> +	int rc = 0;
>>  
>>  	if (!hugepages_supported())
>> -		return;
>> +		return rc;
>>  
>>  	for (pfn = start_pfn; pfn < end_pfn; pfn += 1 << minimum_order)
>> -		dissolve_free_huge_page(pfn_to_page(pfn));
>> +		if (rc = dissolve_free_huge_page(pfn_to_page(pfn)))
>> +			break;
>> +
>> +	return rc;
>>  }
>>  
>>  /*
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index b58906b..13998d9 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -1945,7 +1945,9 @@ static int __ref __offline_pages(unsigned long start_pfn,
>>  	 * dissolve free hugepages in the memory block before doing offlining
>>  	 * actually in order to make hugetlbfs's object counting consistent.
>>  	 */
>> -	dissolve_free_huge_pages(start_pfn, end_pfn);
>> +	ret = dissolve_free_huge_pages(start_pfn, end_pfn);
>> +	if (ret)
>> +		goto failed_removal;
>>  	/* check again */
>>  	offlined_pages = check_pages_isolated(start_pfn, end_pfn);
>>  	if (offlined_pages < 0) {
>> -- 
>> 2.8.4
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 2/3] mm/hugetlb: check for reserved hugepages during memory offline
  2016-09-29 17:09       ` Mike Kravetz
@ 2016-09-30  6:38         ` Michal Hocko
  -1 siblings, 0 replies; 20+ messages in thread
From: Michal Hocko @ 2016-09-30  6:38 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: Gerald Schaefer, Andrew Morton, Naoya Horiguchi, linux-mm,
	linux-kernel, Kirill A . Shutemov, Vlastimil Babka,
	Aneesh Kumar K . V, Martin Schwidefsky, Heiko Carstens, Rui Teng,
	Dave Hansen

On Thu 29-09-16 10:09:37, Mike Kravetz wrote:
> On 09/29/2016 05:30 AM, Michal Hocko wrote:
> > On Mon 26-09-16 19:28:10, Gerald Schaefer wrote:
[...]
> >> Fix this by adding a return value to dissolve_free_huge_pages() and
> >> checking h->free_huge_pages vs. h->resv_huge_pages. Note that this may
> >> lead to the situation where dissolve_free_huge_page() returns an error
> >> and all free hugepages that were dissolved before that error are lost,
> >> while the memory block still cannot be set offline.
> > 
> > Hmm, OK offline failure is certainly a better option than an application
> > failure.
> 
> I agree.
> 
> However, if the reason for the offline is that a dimm within the huge page
> is starting to fail, then one could make an argument that forced offline of
> the huge page would be more desirable.  We really don't know the reason for
> the offline.  So, I think the approach of this patch is best.

I thought that memory which was already reported to be faulty would be
marked HWPoison and removed from the allocator. But it's been quite some
time since I've looked in that area...
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2016-09-30  6:45 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-26 17:28 [PATCH v4 0/3] mm/hugetlb: memory offline issues with hugepages Gerald Schaefer
2016-09-26 17:28 ` [PATCH v4 1/3] mm/hugetlb: fix memory offline with hugepage size > memory block size Gerald Schaefer
2016-09-29 12:11   ` Michal Hocko
2016-09-26 17:28 ` [PATCH v4 2/3] mm/hugetlb: check for reserved hugepages during memory offline Gerald Schaefer
2016-09-29 12:30   ` Michal Hocko
2016-09-29 17:09     ` Mike Kravetz
2016-09-30  6:38       ` Michal Hocko
2016-09-26 17:28 ` [PATCH v4 3/3] mm/hugetlb: improve locking in dissolve_free_huge_pages() Gerald Schaefer
2016-09-29 12:32   ` Michal Hocko
2016-09-29  6:14 ` [PATCH v4 0/3] mm/hugetlb: memory offline issues with hugepages Naoya Horiguchi
