linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] HWPOISON, hugetlbfs: small bug fixes
@ 2012-12-05 21:47 Naoya Horiguchi
  2012-12-05 21:47 ` [PATCH 1/3] HWPOISON, hugetlbfs: fix warning on freeing hwpoisoned hugepage Naoya Horiguchi
                   ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Naoya Horiguchi @ 2012-12-05 21:47 UTC (permalink / raw)
  To: Andrew Morton, Andi Kleen
  Cc: Tony Luck, Wu Fengguang, linux-kernel, linux-mm, Naoya Horiguchi

Hi,

I found some small bugs about memory error handling on hugepages by my
testing on the recent kernel, so I wrote patches for them.
Can I have your reviews or comments on them?

Thanks,
Naoya

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 1/3] HWPOISON, hugetlbfs: fix warning on freeing hwpoisoned hugepage
  2012-12-05 21:47 [PATCH 0/3] HWPOISON, hugetlbfs: small bug fixes Naoya Horiguchi
@ 2012-12-05 21:47 ` Naoya Horiguchi
  2012-12-05 22:13   ` Luck, Tony
  2012-12-06 22:36   ` Andrew Morton
  2012-12-05 21:47 ` [PATCH 2/3] HWPOISON, hugetlbfs: fix "bad pmd" warning in unmapping " Naoya Horiguchi
  2012-12-05 21:47 ` [PATCH 3/3] HWPOISON, hugetlbfs: fix RSS-counter warning Naoya Horiguchi
  2 siblings, 2 replies; 22+ messages in thread
From: Naoya Horiguchi @ 2012-12-05 21:47 UTC (permalink / raw)
  To: Andrew Morton, Andi Kleen
  Cc: Tony Luck, Wu Fengguang, linux-kernel, linux-mm, Naoya Horiguchi

This patch fixes the warning from __list_del_entry() which is triggered
when a process tries to do free_huge_page() for a hwpoisoned hugepage.

Originally, page->lru of hugetlbfs head page was dangling when the
hugepage was in use. This behavior has changed by commit 0edaecfab218d7
("hugetlb: add a list for tracking in-use HugeTLB pages"), where hugepages
in use are linked to hugepage_activelist. HWpoisoned hugepages should not
be charged to any process, so we introduce another list to link hwpoisoned
hugepages.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
 include/linux/hugetlb.h | 3 +++
 mm/hugetlb.c            | 3 ++-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git v3.7-rc8.orig/include/linux/hugetlb.h v3.7-rc8/include/linux/hugetlb.h
index 2251648..13858ba 100644
--- v3.7-rc8.orig/include/linux/hugetlb.h
+++ v3.7-rc8/include/linux/hugetlb.h
@@ -230,6 +230,9 @@ struct hstate {
 	unsigned long nr_overcommit_huge_pages;
 	struct list_head hugepage_activelist;
 	struct list_head hugepage_freelists[MAX_NUMNODES];
+#ifdef CONFIG_MEMORY_FAILURE
+	struct list_head hugepage_hwpoisonedlist;
+#endif
 	unsigned int nr_huge_pages_node[MAX_NUMNODES];
 	unsigned int free_huge_pages_node[MAX_NUMNODES];
 	unsigned int surplus_huge_pages_node[MAX_NUMNODES];
diff --git v3.7-rc8.orig/mm/hugetlb.c v3.7-rc8/mm/hugetlb.c
index 59a0059..e61a749 100644
--- v3.7-rc8.orig/mm/hugetlb.c
+++ v3.7-rc8/mm/hugetlb.c
@@ -1939,6 +1939,7 @@ void __init hugetlb_add_hstate(unsigned order)
 	for (i = 0; i < MAX_NUMNODES; ++i)
 		INIT_LIST_HEAD(&h->hugepage_freelists[i]);
 	INIT_LIST_HEAD(&h->hugepage_activelist);
+	INIT_LIST_HEAD(&h->hugepage_hwpoisonedlist);
 	h->next_nid_to_alloc = first_node(node_states[N_HIGH_MEMORY]);
 	h->next_nid_to_free = first_node(node_states[N_HIGH_MEMORY]);
 	snprintf(h->name, HSTATE_NAME_LEN, "hugepages-%lukB",
@@ -3170,7 +3171,7 @@ int dequeue_hwpoisoned_huge_page(struct page *hpage)
 
 	spin_lock(&hugetlb_lock);
 	if (is_hugepage_on_freelist(hpage)) {
-		list_del(&hpage->lru);
+		list_move(&hpage->lru, &h->hugepage_hwpoisonedlist);
 		set_page_refcounted(hpage);
 		h->free_huge_pages--;
 		h->free_huge_pages_node[nid]--;
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 2/3] HWPOISON, hugetlbfs: fix "bad pmd" warning in unmapping hwpoisoned hugepage
  2012-12-05 21:47 [PATCH 0/3] HWPOISON, hugetlbfs: small bug fixes Naoya Horiguchi
  2012-12-05 21:47 ` [PATCH 1/3] HWPOISON, hugetlbfs: fix warning on freeing hwpoisoned hugepage Naoya Horiguchi
@ 2012-12-05 21:47 ` Naoya Horiguchi
  2012-12-05 21:47 ` [PATCH 3/3] HWPOISON, hugetlbfs: fix RSS-counter warning Naoya Horiguchi
  2 siblings, 0 replies; 22+ messages in thread
From: Naoya Horiguchi @ 2012-12-05 21:47 UTC (permalink / raw)
  To: Andrew Morton, Andi Kleen
  Cc: Tony Luck, Wu Fengguang, linux-kernel, linux-mm, Naoya Horiguchi

When a process which used a hwpoisoned hugepage tries to exit() or
munmap(), the kernel can print out "bad pmd" message because page
table walker in free_pgtables() encounters 'hwpoisoned entry' on pmd.

This is because currently we fail to clear the hwpoisoned entry in
__unmap_hugepage_range(), so this patch simply does it.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
 mm/hugetlb.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git v3.7-rc8.orig/mm/hugetlb.c v3.7-rc8/mm/hugetlb.c
index e61a749..fe7c2a7 100644
--- v3.7-rc8.orig/mm/hugetlb.c
+++ v3.7-rc8/mm/hugetlb.c
@@ -2387,8 +2387,10 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
 		/*
 		 * HWPoisoned hugepage is already unmapped and dropped reference
 		 */
-		if (unlikely(is_hugetlb_entry_hwpoisoned(pte)))
+		if (unlikely(is_hugetlb_entry_hwpoisoned(pte))) {
+			pte_clear(mm, address, ptep);
 			continue;
+		}
 
 		page = pte_page(pte);
 		/*
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 3/3] HWPOISON, hugetlbfs: fix RSS-counter warning
  2012-12-05 21:47 [PATCH 0/3] HWPOISON, hugetlbfs: small bug fixes Naoya Horiguchi
  2012-12-05 21:47 ` [PATCH 1/3] HWPOISON, hugetlbfs: fix warning on freeing hwpoisoned hugepage Naoya Horiguchi
  2012-12-05 21:47 ` [PATCH 2/3] HWPOISON, hugetlbfs: fix "bad pmd" warning in unmapping " Naoya Horiguchi
@ 2012-12-05 21:47 ` Naoya Horiguchi
  2012-12-05 22:04   ` Luck, Tony
  2012-12-10 11:17   ` Simon Jeons
  2 siblings, 2 replies; 22+ messages in thread
From: Naoya Horiguchi @ 2012-12-05 21:47 UTC (permalink / raw)
  To: Andrew Morton, Andi Kleen
  Cc: Tony Luck, Wu Fengguang, linux-kernel, linux-mm, Naoya Horiguchi

Memory error handling on hugepages can break a RSS counter, which emits
a message like "Bad rss-counter state mm:ffff88040abecac0 idx:1 val:-1".
This is because PageAnon returns true for hugepage (this behavior is
necessary for reverse mapping to work on hugetlbfs).

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
 mm/rmap.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git v3.7-rc8.orig/mm/rmap.c v3.7-rc8/mm/rmap.c
index 2ee1ef0..df54ef0 100644
--- v3.7-rc8.orig/mm/rmap.c
+++ v3.7-rc8/mm/rmap.c
@@ -1235,7 +1235,9 @@ int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 	update_hiwater_rss(mm);
 
 	if (PageHWPoison(page) && !(flags & TTU_IGNORE_HWPOISON)) {
-		if (PageAnon(page))
+		if (PageHuge(page))
+			;
+		else if (PageAnon(page))
 			dec_mm_counter(mm, MM_ANONPAGES);
 		else
 			dec_mm_counter(mm, MM_FILEPAGES);
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* RE: [PATCH 3/3] HWPOISON, hugetlbfs: fix RSS-counter warning
  2012-12-05 21:47 ` [PATCH 3/3] HWPOISON, hugetlbfs: fix RSS-counter warning Naoya Horiguchi
@ 2012-12-05 22:04   ` Luck, Tony
  2012-12-05 22:14     ` Naoya Horiguchi
  2012-12-10 11:17   ` Simon Jeons
  1 sibling, 1 reply; 22+ messages in thread
From: Luck, Tony @ 2012-12-05 22:04 UTC (permalink / raw)
  To: Naoya Horiguchi, Andrew Morton, Kleen, Andi
  Cc: Wu, Fengguang, linux-kernel, linux-mm

	if (PageHWPoison(page) && !(flags & TTU_IGNORE_HWPOISON)) {
-		if (PageAnon(page))
+		if (PageHuge(page))
+			;
+		else if (PageAnon(page))
 			dec_mm_counter(mm, MM_ANONPAGES);
 		else
 			dec_mm_counter(mm, MM_FILEPAGES);

This style minimizes the "diff" ... but wouldn't it be nicer to say:

		if (!PageHuge(page)) {
			old code in here
		}

-Tony

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [PATCH 1/3] HWPOISON, hugetlbfs: fix warning on freeing hwpoisoned hugepage
  2012-12-05 21:47 ` [PATCH 1/3] HWPOISON, hugetlbfs: fix warning on freeing hwpoisoned hugepage Naoya Horiguchi
@ 2012-12-05 22:13   ` Luck, Tony
  2012-12-07  2:18     ` Naoya Horiguchi
  2012-12-06 22:36   ` Andrew Morton
  1 sibling, 1 reply; 22+ messages in thread
From: Luck, Tony @ 2012-12-05 22:13 UTC (permalink / raw)
  To: Naoya Horiguchi, Andrew Morton, Kleen, Andi
  Cc: Wu, Fengguang, linux-kernel, linux-mm

> This patch fixes the warning from __list_del_entry() which is triggered
> when a process tries to do free_huge_page() for a hwpoisoned hugepage.

Ultimately it would be nice to avoid poisoning huge pages. Generally we know the
location of the poison to a cache line granularity (but sometimes only to a 4K
granularity) ... and it is rather inefficient to take an entire 2M page out of service.
With 1G pages things would be even worse!!

It also makes life harder for applications that would like to catch the SIGBUS
and try to take their own recovery actions. Losing more data than they really
need to will make it less likely that they can do something to work around the
loss.

Has anyone looked at how hard it might be to have the code in memory-failure.c
break up a huge page and only poison the 4K that needs to be taken out of service?

-Tony

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/3] HWPOISON, hugetlbfs: fix RSS-counter warning
  2012-12-05 22:04   ` Luck, Tony
@ 2012-12-05 22:14     ` Naoya Horiguchi
  2012-12-06 22:40       ` Andrew Morton
  0 siblings, 1 reply; 22+ messages in thread
From: Naoya Horiguchi @ 2012-12-05 22:14 UTC (permalink / raw)
  To: Tony Luck
  Cc: Naoya Horiguchi, Andrew Morton, Andi Kleen, Wu Fengguang,
	linux-kernel, linux-mm

Hi Tony,

On Wed, Dec 05, 2012 at 10:04:50PM +0000, Luck, Tony wrote:
> 	if (PageHWPoison(page) && !(flags & TTU_IGNORE_HWPOISON)) {
> -		if (PageAnon(page))
> +		if (PageHuge(page))
> +			;
> +		else if (PageAnon(page))
>  			dec_mm_counter(mm, MM_ANONPAGES);
>  		else
>  			dec_mm_counter(mm, MM_FILEPAGES);
> 
> This style minimizes the "diff" ... but wouldn't it be nicer to say:
> 
> 		if (!PageHuge(page)) {
> 			old code in here
> 		}
> 

I think this need more lines in diff because old code should be
indented without any logical change.

Thanks,
Naoya

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3] HWPOISON, hugetlbfs: fix warning on freeing hwpoisoned hugepage
  2012-12-05 21:47 ` [PATCH 1/3] HWPOISON, hugetlbfs: fix warning on freeing hwpoisoned hugepage Naoya Horiguchi
  2012-12-05 22:13   ` Luck, Tony
@ 2012-12-06 22:36   ` Andrew Morton
  2012-12-07  2:03     ` Naoya Horiguchi
  1 sibling, 1 reply; 22+ messages in thread
From: Andrew Morton @ 2012-12-06 22:36 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: Andi Kleen, Tony Luck, Wu Fengguang, linux-kernel, linux-mm

On Wed,  5 Dec 2012 16:47:36 -0500
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:

> This patch fixes the warning from __list_del_entry() which is triggered
> when a process tries to do free_huge_page() for a hwpoisoned hugepage.
> 
> Originally, page->lru of hugetlbfs head page was dangling when the
> hugepage was in use. This behavior has changed by commit 0edaecfab218d7
> ("hugetlb: add a list for tracking in-use HugeTLB pages"), where hugepages
> in use are linked to hugepage_activelist. HWpoisoned hugepages should not
> be charged to any process, so we introduce another list to link hwpoisoned
> hugepages.
> 
> ...
>
> --- v3.7-rc8.orig/include/linux/hugetlb.h
> +++ v3.7-rc8/include/linux/hugetlb.h
> @@ -230,6 +230,9 @@ struct hstate {
>  	unsigned long nr_overcommit_huge_pages;
>  	struct list_head hugepage_activelist;
>  	struct list_head hugepage_freelists[MAX_NUMNODES];
> +#ifdef CONFIG_MEMORY_FAILURE
> +	struct list_head hugepage_hwpoisonedlist;
> +#endif
>  	unsigned int nr_huge_pages_node[MAX_NUMNODES];
>  	unsigned int free_huge_pages_node[MAX_NUMNODES];
>  	unsigned int surplus_huge_pages_node[MAX_NUMNODES];
> diff --git v3.7-rc8.orig/mm/hugetlb.c v3.7-rc8/mm/hugetlb.c
> index 59a0059..e61a749 100644
> --- v3.7-rc8.orig/mm/hugetlb.c
> +++ v3.7-rc8/mm/hugetlb.c
> @@ -1939,6 +1939,7 @@ void __init hugetlb_add_hstate(unsigned order)
>  	for (i = 0; i < MAX_NUMNODES; ++i)
>  		INIT_LIST_HEAD(&h->hugepage_freelists[i]);
>  	INIT_LIST_HEAD(&h->hugepage_activelist);
> +	INIT_LIST_HEAD(&h->hugepage_hwpoisonedlist);
>  	h->next_nid_to_alloc = first_node(node_states[N_HIGH_MEMORY]);
>  	h->next_nid_to_free = first_node(node_states[N_HIGH_MEMORY]);
>  	snprintf(h->name, HSTATE_NAME_LEN, "hugepages-%lukB",
> @@ -3170,7 +3171,7 @@ int dequeue_hwpoisoned_huge_page(struct page *hpage)
>  
>  	spin_lock(&hugetlb_lock);
>  	if (is_hugepage_on_freelist(hpage)) {
> -		list_del(&hpage->lru);
> +		list_move(&hpage->lru, &h->hugepage_hwpoisonedlist);
>  		set_page_refcounted(hpage);
>  		h->free_huge_pages--;
>  		h->free_huge_pages_node[nid]--;

Do we actually need to new list?  We could use list_del_init() to leave
the page's list_head pointing at itself.  In this state, it is its own
list_head and further list_del()s are a no-op.

I don't know whether this would trigger list-debug warnings.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/3] HWPOISON, hugetlbfs: fix RSS-counter warning
  2012-12-05 22:14     ` Naoya Horiguchi
@ 2012-12-06 22:40       ` Andrew Morton
  2012-12-07  1:22         ` Naoya Horiguchi
  0 siblings, 1 reply; 22+ messages in thread
From: Andrew Morton @ 2012-12-06 22:40 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: Tony Luck, Andi Kleen, Wu Fengguang, linux-kernel, linux-mm

On Wed,  5 Dec 2012 17:14:33 -0500
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:

> Hi Tony,
> 
> On Wed, Dec 05, 2012 at 10:04:50PM +0000, Luck, Tony wrote:
> > 	if (PageHWPoison(page) && !(flags & TTU_IGNORE_HWPOISON)) {
> > -		if (PageAnon(page))
> > +		if (PageHuge(page))
> > +			;
> > +		else if (PageAnon(page))
> >  			dec_mm_counter(mm, MM_ANONPAGES);
> >  		else
> >  			dec_mm_counter(mm, MM_FILEPAGES);
> > 
> > This style minimizes the "diff" ... but wouldn't it be nicer to say:
> > 
> > 		if (!PageHuge(page)) {
> > 			old code in here
> > 		}
> > 
> 
> I think this need more lines in diff because old code should be
> indented without any logical change.

I do agree with Tony on this.  While it is nice to keep the diff
looking simple, it is more important that the resulting code be clean
and idiomatic.

--- a/mm/rmap.c~hwpoison-hugetlbfs-fix-rss-counter-warning-fix
+++ a/mm/rmap.c
@@ -1249,14 +1249,14 @@ int try_to_unmap_one(struct page *page, 
 	update_hiwater_rss(mm);
 
 	if (PageHWPoison(page) && !(flags & TTU_IGNORE_HWPOISON)) {
-		if (PageHuge(page))
-			;
-		else if (PageAnon(page))
-			dec_mm_counter(mm, MM_ANONPAGES);
-		else
-			dec_mm_counter(mm, MM_FILEPAGES);
-		set_pte_at(mm, address, pte,
-				swp_entry_to_pte(make_hwpoison_entry(page)));
+		if (!PageHuge(page)) {
+			if (PageAnon(page))
+				dec_mm_counter(mm, MM_ANONPAGES);
+			else
+				dec_mm_counter(mm, MM_FILEPAGES);
+			set_pte_at(mm, address, pte,
+				   swp_entry_to_pte(make_hwpoison_entry(page)));
+		}
 	} else if (PageAnon(page)) {
 		swp_entry_t entry = { .val = page_private(page) };
 
_


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/3] HWPOISON, hugetlbfs: fix RSS-counter warning
  2012-12-06 22:40       ` Andrew Morton
@ 2012-12-07  1:22         ` Naoya Horiguchi
  2012-12-07  2:10           ` Andrew Morton
  0 siblings, 1 reply; 22+ messages in thread
From: Naoya Horiguchi @ 2012-12-07  1:22 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Naoya Horiguchi, Tony Luck, Andi Kleen, Wu Fengguang,
	linux-kernel, linux-mm

On Thu, Dec 06, 2012 at 02:40:08PM -0800, Andrew Morton wrote:
...
> > On Wed, Dec 05, 2012 at 10:04:50PM +0000, Luck, Tony wrote:
> > > 	if (PageHWPoison(page) && !(flags & TTU_IGNORE_HWPOISON)) {
> > > -		if (PageAnon(page))
> > > +		if (PageHuge(page))
> > > +			;
> > > +		else if (PageAnon(page))
> > >  			dec_mm_counter(mm, MM_ANONPAGES);
> > >  		else
> > >  			dec_mm_counter(mm, MM_FILEPAGES);
> > > 
> > > This style minimizes the "diff" ... but wouldn't it be nicer to say:
> > > 
> > > 		if (!PageHuge(page)) {
> > > 			old code in here
> > > 		}
> > > 
> > 
> > I think this need more lines in diff because old code should be
> > indented without any logical change.
> 
> I do agree with Tony on this.  While it is nice to keep the diff
> looking simple, it is more important that the resulting code be clean
> and idiomatic.

OK, I agree.

> --- a/mm/rmap.c~hwpoison-hugetlbfs-fix-rss-counter-warning-fix
> +++ a/mm/rmap.c
> @@ -1249,14 +1249,14 @@ int try_to_unmap_one(struct page *page, 
>  	update_hiwater_rss(mm);
>  
>  	if (PageHWPoison(page) && !(flags & TTU_IGNORE_HWPOISON)) {
> -		if (PageHuge(page))
> -			;
> -		else if (PageAnon(page))
> -			dec_mm_counter(mm, MM_ANONPAGES);
> -		else
> -			dec_mm_counter(mm, MM_FILEPAGES);
> -		set_pte_at(mm, address, pte,
> -				swp_entry_to_pte(make_hwpoison_entry(page)));
> +		if (!PageHuge(page)) {
> +			if (PageAnon(page))
> +				dec_mm_counter(mm, MM_ANONPAGES);
> +			else
> +				dec_mm_counter(mm, MM_FILEPAGES);
> +			set_pte_at(mm, address, pte,
> +				   swp_entry_to_pte(make_hwpoison_entry(page)));
> +		}

This set_pte_at() should come outside the if-block, or error containment
does not work.

Thanks,
Naoya

>  	} else if (PageAnon(page)) {
>  		swp_entry_t entry = { .val = page_private(page) };
>  
> _
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3] HWPOISON, hugetlbfs: fix warning on freeing hwpoisoned hugepage
  2012-12-06 22:36   ` Andrew Morton
@ 2012-12-07  2:03     ` Naoya Horiguchi
  2012-12-07  2:20       ` Andrew Morton
  2012-12-07  5:36       ` Aneesh Kumar K.V
  0 siblings, 2 replies; 22+ messages in thread
From: Naoya Horiguchi @ 2012-12-07  2:03 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Naoya Horiguchi, Andi Kleen, Tony Luck, Wu Fengguang,
	linux-kernel, linux-mm

On Thu, Dec 06, 2012 at 02:36:52PM -0800, Andrew Morton wrote:
> On Wed,  5 Dec 2012 16:47:36 -0500
> Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:
> 
> > This patch fixes the warning from __list_del_entry() which is triggered
> > when a process tries to do free_huge_page() for a hwpoisoned hugepage.
> > 
> > Originally, page->lru of hugetlbfs head page was dangling when the
> > hugepage was in use. This behavior has changed by commit 0edaecfab218d7
> > ("hugetlb: add a list for tracking in-use HugeTLB pages"), where hugepages
> > in use are linked to hugepage_activelist. HWpoisoned hugepages should not
> > be charged to any process, so we introduce another list to link hwpoisoned
> > hugepages.
> > 
> > ...
> >
> > --- v3.7-rc8.orig/include/linux/hugetlb.h
> > +++ v3.7-rc8/include/linux/hugetlb.h
> > @@ -230,6 +230,9 @@ struct hstate {
> >  	unsigned long nr_overcommit_huge_pages;
> >  	struct list_head hugepage_activelist;
> >  	struct list_head hugepage_freelists[MAX_NUMNODES];
> > +#ifdef CONFIG_MEMORY_FAILURE
> > +	struct list_head hugepage_hwpoisonedlist;
> > +#endif
> >  	unsigned int nr_huge_pages_node[MAX_NUMNODES];
> >  	unsigned int free_huge_pages_node[MAX_NUMNODES];
> >  	unsigned int surplus_huge_pages_node[MAX_NUMNODES];
> > diff --git v3.7-rc8.orig/mm/hugetlb.c v3.7-rc8/mm/hugetlb.c
> > index 59a0059..e61a749 100644
> > --- v3.7-rc8.orig/mm/hugetlb.c
> > +++ v3.7-rc8/mm/hugetlb.c
> > @@ -1939,6 +1939,7 @@ void __init hugetlb_add_hstate(unsigned order)
> >  	for (i = 0; i < MAX_NUMNODES; ++i)
> >  		INIT_LIST_HEAD(&h->hugepage_freelists[i]);
> >  	INIT_LIST_HEAD(&h->hugepage_activelist);
> > +	INIT_LIST_HEAD(&h->hugepage_hwpoisonedlist);
> >  	h->next_nid_to_alloc = first_node(node_states[N_HIGH_MEMORY]);
> >  	h->next_nid_to_free = first_node(node_states[N_HIGH_MEMORY]);
> >  	snprintf(h->name, HSTATE_NAME_LEN, "hugepages-%lukB",
> > @@ -3170,7 +3171,7 @@ int dequeue_hwpoisoned_huge_page(struct page *hpage)
> >  
> >  	spin_lock(&hugetlb_lock);
> >  	if (is_hugepage_on_freelist(hpage)) {
> > -		list_del(&hpage->lru);
> > +		list_move(&hpage->lru, &h->hugepage_hwpoisonedlist);
> >  		set_page_refcounted(hpage);
> >  		h->free_huge_pages--;
> >  		h->free_huge_pages_node[nid]--;
> 
> Do we actually need to new list?  We could use list_del_init() to leave
> the page's list_head pointing at itself.  In this state, it is its own
> list_head and further list_del()s are a no-op.

OK, it's better, thanks.

> I don't know whether this would trigger list-debug warnings.

I tested your idea (with attached patch) and confirmed that
we never get the warnings.

Thanks,
Naoya
---
From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Date: Thu, 6 Dec 2012 20:54:30 -0500
Subject: [PATCH v2] HWPOISON, hugetlbfs: fix warning on freeing hwpoisoned
 hugepage

This patch fixes the warning from __list_del_entry() which is triggered
when a process tries to do free_huge_page() for a hwpoisoned hugepage.

ChangeLog v2:
 - simply use list_del_init instead of introducing new hugepage list

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
 mm/hugetlb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 59a0059..9308752 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3170,7 +3170,7 @@ int dequeue_hwpoisoned_huge_page(struct page *hpage)
 
 	spin_lock(&hugetlb_lock);
 	if (is_hugepage_on_freelist(hpage)) {
-		list_del(&hpage->lru);
+		list_del_init(&hpage->lru);
 		set_page_refcounted(hpage);
 		h->free_huge_pages--;
 		h->free_huge_pages_node[nid]--;
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/3] HWPOISON, hugetlbfs: fix RSS-counter warning
  2012-12-07  1:22         ` Naoya Horiguchi
@ 2012-12-07  2:10           ` Andrew Morton
  0 siblings, 0 replies; 22+ messages in thread
From: Andrew Morton @ 2012-12-07  2:10 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: Tony Luck, Andi Kleen, Wu Fengguang, linux-kernel, linux-mm

On Thu,  6 Dec 2012 20:22:42 -0500 Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:

> > --- a/mm/rmap.c~hwpoison-hugetlbfs-fix-rss-counter-warning-fix
> > +++ a/mm/rmap.c
> > @@ -1249,14 +1249,14 @@ int try_to_unmap_one(struct page *page, 
> >  	update_hiwater_rss(mm);
> >  
> >  	if (PageHWPoison(page) && !(flags & TTU_IGNORE_HWPOISON)) {
> > -		if (PageHuge(page))
> > -			;
> > -		else if (PageAnon(page))
> > -			dec_mm_counter(mm, MM_ANONPAGES);
> > -		else
> > -			dec_mm_counter(mm, MM_FILEPAGES);
> > -		set_pte_at(mm, address, pte,
> > -				swp_entry_to_pte(make_hwpoison_entry(page)));
> > +		if (!PageHuge(page)) {
> > +			if (PageAnon(page))
> > +				dec_mm_counter(mm, MM_ANONPAGES);
> > +			else
> > +				dec_mm_counter(mm, MM_FILEPAGES);
> > +			set_pte_at(mm, address, pte,
> > +				   swp_entry_to_pte(make_hwpoison_entry(page)));
> > +		}
> 
> This set_pte_at() should come outside the if-block, or error containment
> does not work.

Doh.  C is really hard!

--- a/mm/rmap.c~hwpoison-hugetlbfs-fix-rss-counter-warning-fix-fix
+++ a/mm/rmap.c
@@ -1254,9 +1254,9 @@ int try_to_unmap_one(struct page *page, 
 				dec_mm_counter(mm, MM_ANONPAGES);
 			else
 				dec_mm_counter(mm, MM_FILEPAGES);
-			set_pte_at(mm, address, pte,
-				   swp_entry_to_pte(make_hwpoison_entry(page)));
 		}
+		set_pte_at(mm, address, pte,
+			   swp_entry_to_pte(make_hwpoison_entry(page)));
 	} else if (PageAnon(page)) {
 		swp_entry_t entry = { .val = page_private(page) };
 
_


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3] HWPOISON, hugetlbfs: fix warning on freeing hwpoisoned hugepage
  2012-12-05 22:13   ` Luck, Tony
@ 2012-12-07  2:18     ` Naoya Horiguchi
  0 siblings, 0 replies; 22+ messages in thread
From: Naoya Horiguchi @ 2012-12-07  2:18 UTC (permalink / raw)
  To: Tony Luck
  Cc: Naoya Horiguchi, Andrew Morton, Andi Kleen, Wu Fengguang,
	linux-kernel, linux-mm

On Wed, Dec 05, 2012 at 10:13:42PM +0000, Luck, Tony wrote:
> > This patch fixes the warning from __list_del_entry() which is triggered
> > when a process tries to do free_huge_page() for a hwpoisoned hugepage.
> 
> Ultimately it would be nice to avoid poisoning huge pages. Generally we know the
> location of the poison to a cache line granularity (but sometimes only to a 4K
> granularity) ... and it is rather inefficient to take an entire 2M page out of service.
> With 1G pages things would be even worse!!

Thanks for the comment.
And yes, it's remaining work to be done.

> It also makes life harder for applications that would like to catch the SIGBUS
> and try to take their own recovery actions. Losing more data than they really
> need to will make it less likely that they can do something to work around the
> loss.
> 
> Has anyone looked at how hard it might be to have the code in memory-failure.c
> break up a huge page and only poison the 4K that needs to be taken out of service?

This work is one of my interest and became a bit easier than used to be,
because now transparent hugepage works commonly and some of code can be
copied from or shared with it.

Thanks,
Naoya

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3] HWPOISON, hugetlbfs: fix warning on freeing hwpoisoned hugepage
  2012-12-07  2:03     ` Naoya Horiguchi
@ 2012-12-07  2:20       ` Andrew Morton
  2012-12-07  5:48         ` Naoya Horiguchi
  2012-12-07  5:36       ` Aneesh Kumar K.V
  1 sibling, 1 reply; 22+ messages in thread
From: Andrew Morton @ 2012-12-07  2:20 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: Andi Kleen, Tony Luck, Wu Fengguang, linux-kernel, linux-mm

On Thu,  6 Dec 2012 21:03:44 -0500 Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:

> This patch fixes the warning from __list_del_entry() which is triggered
> when a process tries to do free_huge_page() for a hwpoisoned hugepage.
> 
> ChangeLog v2:
>  - simply use list_del_init instead of introducing new hugepage list
> 
> Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> ---
>  mm/hugetlb.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 59a0059..9308752 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -3170,7 +3170,7 @@ int dequeue_hwpoisoned_huge_page(struct page *hpage)
>  
>  	spin_lock(&hugetlb_lock);
>  	if (is_hugepage_on_freelist(hpage)) {
> -		list_del(&hpage->lru);
> +		list_del_init(&hpage->lru);

Can we please have a code comment in here explaining why
list_del_init() is used?

>  		set_page_refcounted(hpage);
>  		h->free_huge_pages--;
>  		h->free_huge_pages_node[nid]--;

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3] HWPOISON, hugetlbfs: fix warning on freeing hwpoisoned hugepage
  2012-12-07  2:03     ` Naoya Horiguchi
  2012-12-07  2:20       ` Andrew Morton
@ 2012-12-07  5:36       ` Aneesh Kumar K.V
  2012-12-07  6:14         ` Naoya Horiguchi
  1 sibling, 1 reply; 22+ messages in thread
From: Aneesh Kumar K.V @ 2012-12-07  5:36 UTC (permalink / raw)
  To: Naoya Horiguchi, Andrew Morton
  Cc: Naoya Horiguchi, Andi Kleen, Tony Luck, Wu Fengguang,
	linux-kernel, linux-mm

Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> writes:

> On Thu, Dec 06, 2012 at 02:36:52PM -0800, Andrew Morton wrote:
>> On Wed,  5 Dec 2012 16:47:36 -0500
>> Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:
>> 
>> > This patch fixes the warning from __list_del_entry() which is triggered
>> > when a process tries to do free_huge_page() for a hwpoisoned hugepage.
>> > 
>> > Originally, page->lru of hugetlbfs head page was dangling when the
>> > hugepage was in use. This behavior has changed by commit 0edaecfab218d7
>> > ("hugetlb: add a list for tracking in-use HugeTLB pages"), where hugepages
>> > in use are linked to hugepage_activelist. HWpoisoned hugepages should not
>> > be charged to any process, so we introduce another list to link hwpoisoned
>> > hugepages.
>> > 
>> > ...
>> >
>> > --- v3.7-rc8.orig/include/linux/hugetlb.h
>> > +++ v3.7-rc8/include/linux/hugetlb.h
>> > @@ -230,6 +230,9 @@ struct hstate {
>> >  	unsigned long nr_overcommit_huge_pages;
>> >  	struct list_head hugepage_activelist;
>> >  	struct list_head hugepage_freelists[MAX_NUMNODES];
>> > +#ifdef CONFIG_MEMORY_FAILURE
>> > +	struct list_head hugepage_hwpoisonedlist;
>> > +#endif
>> >  	unsigned int nr_huge_pages_node[MAX_NUMNODES];
>> >  	unsigned int free_huge_pages_node[MAX_NUMNODES];
>> >  	unsigned int surplus_huge_pages_node[MAX_NUMNODES];
>> > diff --git v3.7-rc8.orig/mm/hugetlb.c v3.7-rc8/mm/hugetlb.c
>> > index 59a0059..e61a749 100644
>> > --- v3.7-rc8.orig/mm/hugetlb.c
>> > +++ v3.7-rc8/mm/hugetlb.c
>> > @@ -1939,6 +1939,7 @@ void __init hugetlb_add_hstate(unsigned order)
>> >  	for (i = 0; i < MAX_NUMNODES; ++i)
>> >  		INIT_LIST_HEAD(&h->hugepage_freelists[i]);
>> >  	INIT_LIST_HEAD(&h->hugepage_activelist);
>> > +	INIT_LIST_HEAD(&h->hugepage_hwpoisonedlist);
>> >  	h->next_nid_to_alloc = first_node(node_states[N_HIGH_MEMORY]);
>> >  	h->next_nid_to_free = first_node(node_states[N_HIGH_MEMORY]);
>> >  	snprintf(h->name, HSTATE_NAME_LEN, "hugepages-%lukB",
>> > @@ -3170,7 +3171,7 @@ int dequeue_hwpoisoned_huge_page(struct page *hpage)
>> >  
>> >  	spin_lock(&hugetlb_lock);
>> >  	if (is_hugepage_on_freelist(hpage)) {
>> > -		list_del(&hpage->lru);
>> > +		list_move(&hpage->lru, &h->hugepage_hwpoisonedlist);
>> >  		set_page_refcounted(hpage);
>> >  		h->free_huge_pages--;
>> >  		h->free_huge_pages_node[nid]--;
>> 
>> Do we actually need to new list?  We could use list_del_init() to leave
>> the page's list_head pointing at itself.  In this state, it is its own
>> list_head and further list_del()s are a no-op.
>
> OK, it's better, thanks.
>
>> I don't know whether this would trigger list-debug warnings.
>
> I tested your idea (with attached patch) and confirmed that
> we never get the warnings.
>
> Thanks,
> Naoya
> ---
> From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> Date: Thu, 6 Dec 2012 20:54:30 -0500
> Subject: [PATCH v2] HWPOISON, hugetlbfs: fix warning on freeing hwpoisoned
>  hugepage
>
> This patch fixes the warning from __list_del_entry() which is triggered
> when a process tries to do free_huge_page() for a hwpoisoned hugepage.


Can you get a dump stack for that. I am confused because the page was
already in freelist, and we deleted it from the list and set the
refcount to 1. So how are we reaching free_huge_page() again ?


>
> ChangeLog v2:
>  - simply use list_del_init instead of introducing new hugepage list
>
> Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> ---
>  mm/hugetlb.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 59a0059..9308752 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -3170,7 +3170,7 @@ int dequeue_hwpoisoned_huge_page(struct page *hpage)
>
>  	spin_lock(&hugetlb_lock);
>  	if (is_hugepage_on_freelist(hpage)) {
> -		list_del(&hpage->lru);
> +		list_del_init(&hpage->lru);
>  		set_page_refcounted(hpage);
>  		h->free_huge_pages--;
>  		h->free_huge_pages_node[nid]--;
> -- 

-aneesh


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3] HWPOISON, hugetlbfs: fix warning on freeing hwpoisoned hugepage
  2012-12-07  2:20       ` Andrew Morton
@ 2012-12-07  5:48         ` Naoya Horiguchi
  0 siblings, 0 replies; 22+ messages in thread
From: Naoya Horiguchi @ 2012-12-07  5:48 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Naoya Horiguchi, Andi Kleen, Tony Luck, Wu Fengguang,
	linux-kernel, linux-mm

On Thu, Dec 06, 2012 at 06:20:24PM -0800, Andrew Morton wrote:
> On Thu,  6 Dec 2012 21:03:44 -0500 Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:
> 
> > This patch fixes the warning from __list_del_entry() which is triggered
> > when a process tries to do free_huge_page() for a hwpoisoned hugepage.
> > 
> > ChangeLog v2:
> >  - simply use list_del_init instead of introducing new hugepage list
> > 
> > Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> > ---
> >  mm/hugetlb.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index 59a0059..9308752 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -3170,7 +3170,7 @@ int dequeue_hwpoisoned_huge_page(struct page *hpage)
> >  
> >  	spin_lock(&hugetlb_lock);
> >  	if (is_hugepage_on_freelist(hpage)) {
> > -		list_del(&hpage->lru);
> > +		list_del_init(&hpage->lru);
> 
> Can we please have a code comment in here explaining why
> list_del_init() is used?

OK, I added it.
---
From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Date: Thu, 6 Dec 2012 20:54:30 -0500
Subject: [PATCH v3] HWPOISON, hugetlbfs: fix warning on freeing hwpoisoned
 hugepage

This patch fixes the warning from __list_del_entry() which is triggered
when a process tries to do free_huge_page() for a hwpoisoned hugepage.

ChangeLog v3:
 - add comment

ChangeLog v2:
 - simply use list_del_init instead of introducing new hugepage list

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
 mm/hugetlb.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 59a0059..df03345 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3170,7 +3170,12 @@ int dequeue_hwpoisoned_huge_page(struct page *hpage)
 
 	spin_lock(&hugetlb_lock);
 	if (is_hugepage_on_freelist(hpage)) {
-		list_del(&hpage->lru);
+		/* 
+		 * Hwpoisoned hugepage isn't linked to activelist or freelist,
+		 * but dangling hpage->lru can trigger list-debug warnings,
+		 * so let it point to itself with list_del_init().
+		 */
+		list_del_init(&hpage->lru);
 		set_page_refcounted(hpage);
 		h->free_huge_pages--;
 		h->free_huge_pages_node[nid]--;
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3] HWPOISON, hugetlbfs: fix warning on freeing hwpoisoned hugepage
  2012-12-07  5:36       ` Aneesh Kumar K.V
@ 2012-12-07  6:14         ` Naoya Horiguchi
  2012-12-07  7:54           ` Aneesh Kumar K.V
  0 siblings, 1 reply; 22+ messages in thread
From: Naoya Horiguchi @ 2012-12-07  6:14 UTC (permalink / raw)
  To: aneesh.kumar
  Cc: Naoya Horiguchi, Andrew Morton, Andi Kleen, Tony Luck,
	Wu Fengguang, linux-kernel, linux-mm

On Fri, Dec 07, 2012 at 11:06:41AM +0530, Aneesh Kumar K.V wrote:
...
> > From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> > Date: Thu, 6 Dec 2012 20:54:30 -0500
> > Subject: [PATCH v2] HWPOISON, hugetlbfs: fix warning on freeing hwpoisoned
> >  hugepage
> >
> > This patch fixes the warning from __list_del_entry() which is triggered
> > when a process tries to do free_huge_page() for a hwpoisoned hugepage.
> 
> 
> Can you get a dump stack for that. I am confused because the page was
> already in freelist, and we deleted it from the list and set the
> refcount to 1. So how are we reaching free_huge_page() again ?

free_huge_page() can be called for hwpoisoned hugepage from unpoison_memory().
This function gets refcount once and clears PageHWPoison, and then puts
refcount twice to return the hugepage back to free pool.
The second put_page() finally reaches free_huge_page().

Thanks,
Naoya

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3] HWPOISON, hugetlbfs: fix warning on freeing hwpoisoned hugepage
  2012-12-07  6:14         ` Naoya Horiguchi
@ 2012-12-07  7:54           ` Aneesh Kumar K.V
  2012-12-07 15:49             ` Naoya Horiguchi
  0 siblings, 1 reply; 22+ messages in thread
From: Aneesh Kumar K.V @ 2012-12-07  7:54 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: Naoya Horiguchi, Andrew Morton, Andi Kleen, Tony Luck,
	Wu Fengguang, linux-kernel, linux-mm

Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> writes:

> On Fri, Dec 07, 2012 at 11:06:41AM +0530, Aneesh Kumar K.V wrote:
> ...
>> > From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
>> > Date: Thu, 6 Dec 2012 20:54:30 -0500
>> > Subject: [PATCH v2] HWPOISON, hugetlbfs: fix warning on freeing hwpoisoned
>> >  hugepage
>> >
>> > This patch fixes the warning from __list_del_entry() which is triggered
>> > when a process tries to do free_huge_page() for a hwpoisoned hugepage.
>> 
>> 
>> Can you get a dump stack for that. I am confused because the page was
>> already in freelist, and we deleted it from the list and set the
>> refcount to 1. So how are we reaching free_huge_page() again ?
>
> free_huge_page() can be called for hwpoisoned hugepage from unpoison_memory().
> This function gets refcount once and clears PageHWPoison, and then puts
> refcount twice to return the hugepage back to free pool.
> The second put_page() finally reaches free_huge_page().
>

Can we add this also to the commit message ?. With that you can add

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

Thanks
-aneesh


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3] HWPOISON, hugetlbfs: fix warning on freeing hwpoisoned hugepage
  2012-12-07  7:54           ` Aneesh Kumar K.V
@ 2012-12-07 15:49             ` Naoya Horiguchi
  2012-12-07 22:34               ` Andrew Morton
  0 siblings, 1 reply; 22+ messages in thread
From: Naoya Horiguchi @ 2012-12-07 15:49 UTC (permalink / raw)
  To: aneesh.kumar
  Cc: Naoya Horiguchi, Andrew Morton, Andi Kleen, Tony Luck,
	Wu Fengguang, linux-kernel, linux-mm

On Fri, Dec 07, 2012 at 01:24:45PM +0530, Aneesh Kumar K.V wrote:
> Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> writes:
> 
> > On Fri, Dec 07, 2012 at 11:06:41AM +0530, Aneesh Kumar K.V wrote:
> > ...
> >> > From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> >> > Date: Thu, 6 Dec 2012 20:54:30 -0500
> >> > Subject: [PATCH v2] HWPOISON, hugetlbfs: fix warning on freeing hwpoisoned
> >> >  hugepage
> >> >
> >> > This patch fixes the warning from __list_del_entry() which is triggered
> >> > when a process tries to do free_huge_page() for a hwpoisoned hugepage.
> >> 
> >> 
> >> Can you get a dump stack for that. I am confused because the page was
> >> already in freelist, and we deleted it from the list and set the
> >> refcount to 1. So how are we reaching free_huge_page() again ?
> >
> > free_huge_page() can be called for hwpoisoned hugepage from unpoison_memory().
> > This function gets refcount once and clears PageHWPoison, and then puts
> > refcount twice to return the hugepage back to free pool.
> > The second put_page() finally reaches free_huge_page().
> >
> 
> Can we add this also to the commit message ?. With that you can add
> 
> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

OK, I added it.
Thanks for the review!

Naoya
---
From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Date: Thu, 6 Dec 2012 20:54:30 -0500
Subject: [PATCH v4] HWPOISON, hugetlbfs: fix warning on freeing hwpoisoned
 hugepage

This patch fixes the warning from __list_del_entry() which is triggered
when a process tries to do free_huge_page() for a hwpoisoned hugepage.

ChangeLog v4:
 - Add comment about when the warning is triggered

ChangeLog v3:
 - Add comment

ChangeLog v2:
 - simply use list_del_init instead of introducing new hugepage list

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 mm/hugetlb.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 59a0059..2511bcb 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3170,7 +3170,13 @@ int dequeue_hwpoisoned_huge_page(struct page *hpage)
 
 	spin_lock(&hugetlb_lock);
 	if (is_hugepage_on_freelist(hpage)) {
-		list_del(&hpage->lru);
+		/* 
+		 * Hwpoisoned hugepage isn't linked to activelist or freelist,
+		 * but dangling hpage->lru can trigger list-debug warnings
+		 * (this happens when we call unpoison_memory() on it),
+		 * so let it point to itself with list_del_init().
+		 */
+		list_del_init(&hpage->lru);
 		set_page_refcounted(hpage);
 		h->free_huge_pages--;
 		h->free_huge_pages_node[nid]--;
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3] HWPOISON, hugetlbfs: fix warning on freeing hwpoisoned hugepage
  2012-12-07 15:49             ` Naoya Horiguchi
@ 2012-12-07 22:34               ` Andrew Morton
  2012-12-08 21:04                 ` Naoya Horiguchi
  0 siblings, 1 reply; 22+ messages in thread
From: Andrew Morton @ 2012-12-07 22:34 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: aneesh.kumar, Andi Kleen, Tony Luck, Wu Fengguang, linux-kernel,
	linux-mm

On Fri,  7 Dec 2012 10:49:57 -0500
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:

> This patch fixes the warning from __list_del_entry() which is triggered
> when a process tries to do free_huge_page() for a hwpoisoned hugepage.

This changelog is very short.  In fact it is too short, resulting in
others having to ask questions about the patch.  When this happens,
please treat it as a sign that the changelog needs additional
information - so that other readers will not feel a need to ask the
same questions!

I added this paragraph:

: free_huge_page() can be called for hwpoisoned hugepage from
: unpoison_memory().  This function gets refcount once and clears
: PageHWPoison, and then puts refcount twice to return the hugepage back to
: free pool.  The second put_page() finally reaches free_huge_page().



Also, is the description accurate?  Is the __list_del_entry() warning
the only problem?

Or is it the case that this bug will cause memory corruption?  If so
then the patch is pretty important and is probably needed in -stable as
well?  I haven't checked how far back in time the bug exists.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3] HWPOISON, hugetlbfs: fix warning on freeing hwpoisoned hugepage
  2012-12-07 22:34               ` Andrew Morton
@ 2012-12-08 21:04                 ` Naoya Horiguchi
  0 siblings, 0 replies; 22+ messages in thread
From: Naoya Horiguchi @ 2012-12-08 21:04 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Naoya Horiguchi, aneesh.kumar, Andi Kleen, Tony Luck,
	Wu Fengguang, linux-kernel, linux-mm

On Fri, Dec 07, 2012 at 02:34:14PM -0800, Andrew Morton wrote:
> On Fri,  7 Dec 2012 10:49:57 -0500
> Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:
>
> > This patch fixes the warning from __list_del_entry() which is triggered
> > when a process tries to do free_huge_page() for a hwpoisoned hugepage.
>
> This changelog is very short.  In fact it is too short, resulting in
> others having to ask questions about the patch.  When this happens,
> please treat it as a sign that the changelog needs additional
> information - so that other readers will not feel a need to ask the
> same questions!

OK, I'll be careful after this.

> I added this paragraph:
>
> : free_huge_page() can be called for hwpoisoned hugepage from
> : unpoison_memory().  This function gets refcount once and clears
> : PageHWPoison, and then puts refcount twice to return the hugepage back to
> : free pool.  The second put_page() finally reaches free_huge_page().
>
>
>
> Also, is the description accurate?  Is the __list_del_entry() warning
> the only problem?

Right, this description is correct and this warning is the only problem.

> Or is it the case that this bug will cause memory corruption?  If so
> then the patch is pretty important and is probably needed in -stable as
> well?  I haven't checked how far back in time the bug exists.

There's no memory corruption even if we leave this bug unfixed, because
in unpoisoning (only way to change the status of hwpoisoned hugepage),
there are two possible operations on page->lru as shown below:

  static void free_huge_page(struct page *page)
  {
          ...
          if (h->surplus_huge_pages_node[nid] && huge_page_order(h) < MAX_ORDER) {
                  /* remove the page from active list */
                  list_del(&page->lru);
                  update_and_free_page(h, page);
                  h->surplus_huge_pages--;
                  h->surplus_huge_pages_node[nid]--;
          } else {
                  arch_clear_hugepage_flags(page);
                  enqueue_huge_page(h, page); /* list_move inside this function*/
          }
          ...
  }

, but both path do simply list_del() or list_move(), so there's no
difference (except for warning) after this block whether page->lru is
dangling or pointing to itself.

This bug was introduced recently on commit 0edaecfab218d747d30de
("hugetlb: add a list for tracking in-use HugeTLB pages").

Naoya

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/3] HWPOISON, hugetlbfs: fix RSS-counter warning
  2012-12-05 21:47 ` [PATCH 3/3] HWPOISON, hugetlbfs: fix RSS-counter warning Naoya Horiguchi
  2012-12-05 22:04   ` Luck, Tony
@ 2012-12-10 11:17   ` Simon Jeons
  1 sibling, 0 replies; 22+ messages in thread
From: Simon Jeons @ 2012-12-10 11:17 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: Andrew Morton, Andi Kleen, Tony Luck, Wu Fengguang, linux-kernel,
	linux-mm

On Wed, 2012-12-05 at 16:47 -0500, Naoya Horiguchi wrote:
> Memory error handling on hugepages can break a RSS counter, which emits
> a message like "Bad rss-counter state mm:ffff88040abecac0 idx:1 val:-1".
> This is because PageAnon returns true for hugepage (this behavior is

Could you explain why PageAnon returns true for hugepage?

> necessary for reverse mapping to work on hugetlbfs).
> 
> Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> ---
>  mm/rmap.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git v3.7-rc8.orig/mm/rmap.c v3.7-rc8/mm/rmap.c
> index 2ee1ef0..df54ef0 100644
> --- v3.7-rc8.orig/mm/rmap.c
> +++ v3.7-rc8/mm/rmap.c
> @@ -1235,7 +1235,9 @@ int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
>  	update_hiwater_rss(mm);
>  
>  	if (PageHWPoison(page) && !(flags & TTU_IGNORE_HWPOISON)) {
> -		if (PageAnon(page))
> +		if (PageHuge(page))
> +			;
> +		else if (PageAnon(page))
>  			dec_mm_counter(mm, MM_ANONPAGES);
>  		else
>  			dec_mm_counter(mm, MM_FILEPAGES);



^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2012-12-10 11:25 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-12-05 21:47 [PATCH 0/3] HWPOISON, hugetlbfs: small bug fixes Naoya Horiguchi
2012-12-05 21:47 ` [PATCH 1/3] HWPOISON, hugetlbfs: fix warning on freeing hwpoisoned hugepage Naoya Horiguchi
2012-12-05 22:13   ` Luck, Tony
2012-12-07  2:18     ` Naoya Horiguchi
2012-12-06 22:36   ` Andrew Morton
2012-12-07  2:03     ` Naoya Horiguchi
2012-12-07  2:20       ` Andrew Morton
2012-12-07  5:48         ` Naoya Horiguchi
2012-12-07  5:36       ` Aneesh Kumar K.V
2012-12-07  6:14         ` Naoya Horiguchi
2012-12-07  7:54           ` Aneesh Kumar K.V
2012-12-07 15:49             ` Naoya Horiguchi
2012-12-07 22:34               ` Andrew Morton
2012-12-08 21:04                 ` Naoya Horiguchi
2012-12-05 21:47 ` [PATCH 2/3] HWPOISON, hugetlbfs: fix "bad pmd" warning in unmapping " Naoya Horiguchi
2012-12-05 21:47 ` [PATCH 3/3] HWPOISON, hugetlbfs: fix RSS-counter warning Naoya Horiguchi
2012-12-05 22:04   ` Luck, Tony
2012-12-05 22:14     ` Naoya Horiguchi
2012-12-06 22:40       ` Andrew Morton
2012-12-07  1:22         ` Naoya Horiguchi
2012-12-07  2:10           ` Andrew Morton
2012-12-10 11:17   ` Simon Jeons

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).