* [PATCH] mm/memory-failure: fix race with compound page split/merge
@ 2016-04-18 11:43 ` Konstantin Khlebnikov
  0 siblings, 0 replies; 8+ messages in thread
From: Konstantin Khlebnikov @ 2016-04-18 11:43 UTC (permalink / raw)
  To: linux-mm, Naoya Horiguchi; +Cc: linux-kernel, Kirill A. Shutemov

get_hwpoison_page() must recheck the relation between head and tail pages.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
---
 mm/memory-failure.c |   10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 78f5f2641b91..ca5acee53b7a 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -888,7 +888,15 @@ int get_hwpoison_page(struct page *page)
 		}
 	}
 
-	return get_page_unless_zero(head);
+	if (get_page_unless_zero(head)) {
+		if (head == compound_head(page))
+			return 1;
+
+		pr_info("MCE: %#lx cannot catch tail\n", page_to_pfn(page));
+		put_page(head);
+	}
+
+	return 0;
 }
 EXPORT_SYMBOL_GPL(get_hwpoison_page);
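
The recheck in the hunk above can be illustrated with a small user-space C sketch. It is only a model under stated assumptions: struct toy_page, the toy_* helpers, and the racer callback are hypothetical stand-ins, not kernel code; the kernel's struct page layout and its compound_head(), get_page_unless_zero(), and put_page() are more involved. The racer callback replays a split between the head lookup and the refcount grab, which is the window the patch closes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page: NOT the kernel's layout (assumption). */
struct toy_page {
	atomic_int refcount;            /* 0 means nobody may take a reference */
	struct toy_page *_Atomic head;  /* NULL for a head or non-compound page */
};

static void toy_init(struct toy_page *page, int refs, struct toy_page *head)
{
	atomic_init(&page->refcount, refs);
	atomic_init(&page->head, head);
}

/* Head of the compound page that currently contains @page. */
static struct toy_page *toy_compound_head(struct toy_page *page)
{
	struct toy_page *head = atomic_load(&page->head);

	return head ? head : page;
}

/* Take a reference unless the count is already zero. */
static bool toy_get_unless_zero(struct toy_page *page)
{
	int old = atomic_load(&page->refcount);

	while (old != 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&page->refcount, &old, old + 1))
			return true;
	}
	return false;
}

static void toy_put(struct toy_page *page)
{
	atomic_fetch_sub(&page->refcount, 1);
}

/* Simulated split: the tail becomes an independent page with one reference. */
static void toy_split(struct toy_page *tail)
{
	atomic_store(&tail->refcount, 1);
	atomic_store(&tail->head, (struct toy_page *)NULL);
}

/*
 * Pin the compound page containing @page. @racer, if non-NULL, runs between
 * the head lookup and the refcount grab to replay a concurrent split.
 * Returns 1 if the right page was pinned, 0 otherwise.
 */
static int toy_get_hwpoison_page(struct toy_page *page,
				 void (*racer)(struct toy_page *))
{
	struct toy_page *head = toy_compound_head(page);

	if (racer)
		racer(page);

	if (toy_get_unless_zero(head)) {
		if (head == toy_compound_head(page))
			return 1;
		toy_put(head);	/* pinned an unrelated page: undo it */
	}
	return 0;
}

/* No race: the tail still belongs to head, so head ends up pinned. */
int toy_demo_no_race(void)
{
	struct toy_page head, tail;
	int ret;

	toy_init(&head, 1, NULL);
	toy_init(&tail, 0, &head);
	ret = toy_get_hwpoison_page(&tail, NULL);
	assert(atomic_load(&head.refcount) == 2);
	return ret;
}

/* Split in the window: the stale head reference must be dropped again. */
int toy_demo_split_race(void)
{
	struct toy_page head, tail;
	int ret;

	toy_init(&head, 1, NULL);
	toy_init(&tail, 0, &head);
	ret = toy_get_hwpoison_page(&tail, toy_split);
	assert(atomic_load(&head.refcount) == 1);
	return ret;
}
```

Calling toy_demo_no_race() and toy_demo_split_race() from a main() exercises both paths. Without the second toy_compound_head() comparison, the split case would return success while holding a reference on a page that no longer contains the poisoned one.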
 


* Re: [PATCH] mm/memory-failure: fix race with compound page split/merge
  2016-04-18 11:43 ` Konstantin Khlebnikov
@ 2016-04-18 23:15   ` Naoya Horiguchi
  -1 siblings, 0 replies; 8+ messages in thread
From: Naoya Horiguchi @ 2016-04-18 23:15 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: linux-mm, linux-kernel, Kirill A. Shutemov, Andrew Morton

# CCed Andrew,

On Mon, Apr 18, 2016 at 02:43:45PM +0300, Konstantin Khlebnikov wrote:
> get_hwpoison_page() must recheck the relation between head and tail pages.
> 
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>

Looks good to me. Without this recheck, the race causes the kernel to pin
an irrelevant page, and eventually crashes the kernel on a refcount mismatch...

Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>

> ---
>  mm/memory-failure.c |   10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 78f5f2641b91..ca5acee53b7a 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -888,7 +888,15 @@ int get_hwpoison_page(struct page *page)
>  		}
>  	}
>  
> -	return get_page_unless_zero(head);
> +	if (get_page_unless_zero(head)) {
> +		if (head == compound_head(page))
> +			return 1;
> +
> +		pr_info("MCE: %#lx cannot catch tail\n", page_to_pfn(page));

Recently Chen Yucong replaced the label "MCE:" with "Memory failure:",
but the resolution is trivial, I think.

Thanks,
Naoya Horiguchi

> +		put_page(head);
> +	}
> +
> +	return 0;
>  }
>  EXPORT_SYMBOL_GPL(get_hwpoison_page);
>  
> 


* Re: [PATCH] mm/memory-failure: fix race with compound page split/merge
  2016-04-18 23:15   ` Naoya Horiguchi
@ 2016-04-19  5:54     ` Konstantin Khlebnikov
  -1 siblings, 0 replies; 8+ messages in thread
From: Konstantin Khlebnikov @ 2016-04-19  5:54 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: Konstantin Khlebnikov, linux-mm, linux-kernel,
	Kirill A. Shutemov, Andrew Morton

On Tue, Apr 19, 2016 at 2:15 AM, Naoya Horiguchi
<n-horiguchi@ah.jp.nec.com> wrote:
> # CCed Andrew,
>
> On Mon, Apr 18, 2016 at 02:43:45PM +0300, Konstantin Khlebnikov wrote:
>> get_hwpoison_page() must recheck the relation between head and tail pages.
>>
>> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
>
> Looks good to me. Without this recheck, the race causes the kernel to pin
> an irrelevant page, and eventually crashes the kernel on a refcount mismatch...

Yep, I've seen that a lot. Unfortunately that was on the 3.18 branch, and
it'll take several months to verify this fix.
This code, and page reference counting overall, have changed
significantly since then, so there are probably more bugs here.
For example, I'm not sure about races with atomic set for page
reference counting:
I've found and removed a couple in the Mellanox driver, but there are
more in mm and net.
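
The thread does not spell out the atomic-set scenario, so the following C sketch is only one plausible reading: an owner overwrites a refcount with a plain store while a speculative reader holds a temporary reference, and the reader's increment is lost. The names ref_get_unless_zero(), ref_put_and_test(), and freed_under_owner() are hypothetical, and the interleaving is replayed by hand in a single thread rather than with real concurrency.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Take a reference unless the count is already zero. */
static bool ref_get_unless_zero(atomic_int *ref)
{
	int old = atomic_load(ref);

	while (old != 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(ref, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop one reference; returns true when the count reaches zero. */
static bool ref_put_and_test(atomic_int *ref)
{
	return atomic_fetch_sub(ref, 1) == 1;
}

/*
 * Replay one interleaving. The owner holds one reference and wants @extra
 * more; a speculative reader takes and drops a reference around that update.
 * Returns true if the count reached zero (page freed) while the owner still
 * believed it held its last reference.
 */
bool freed_under_owner(bool owner_uses_set, int extra)
{
	atomic_int ref;
	bool freed;

	atomic_init(&ref, 1);                   /* owner's initial reference */

	bool got = ref_get_unless_zero(&ref);   /* reader: 1 -> 2 */
	assert(got);

	if (owner_uses_set)
		atomic_store(&ref, 1 + extra);  /* overwrites the reader's +1 */
	else
		atomic_fetch_add(&ref, extra);  /* preserves the reader's +1 */

	freed = ref_put_and_test(&ref);         /* reader drops its reference */

	/* Owner drops its extra references but keeps the last one. */
	for (int i = 0; i < extra && !freed; i++)
		freed = ref_put_and_test(&ref);

	return freed;
}
```

In this model freed_under_owner(true, 1) reports a premature free, while freed_under_owner(false, 1) does not, because the additive update cannot erase a concurrent increment.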

>
> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
>
>> ---
>>  mm/memory-failure.c |   10 +++++++++-
>>  1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>> index 78f5f2641b91..ca5acee53b7a 100644
>> --- a/mm/memory-failure.c
>> +++ b/mm/memory-failure.c
>> @@ -888,7 +888,15 @@ int get_hwpoison_page(struct page *page)
>>               }
>>       }
>>
>> -     return get_page_unless_zero(head);
>> +     if (get_page_unless_zero(head)) {
>> +             if (head == compound_head(page))
>> +                     return 1;
>> +
>> +             pr_info("MCE: %#lx cannot catch tail\n", page_to_pfn(page));
>
> Recently Chen Yucong replaced the label "MCE:" with "Memory failure:",
> but the resolution is trivial, I think.
>
> Thanks,
> Naoya Horiguchi
>
>> +             put_page(head);
>> +     }
>> +
>> +     return 0;
>>  }
>>  EXPORT_SYMBOL_GPL(get_hwpoison_page);
>>
>>


* Re: [PATCH] mm/memory-failure: fix race with compound page split/merge
  2016-04-18 23:15   ` Naoya Horiguchi
@ 2016-04-21 23:44     ` Andrew Morton
  -1 siblings, 0 replies; 8+ messages in thread
From: Andrew Morton @ 2016-04-21 23:44 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: Konstantin Khlebnikov, linux-mm, linux-kernel, Kirill A. Shutemov

On Mon, 18 Apr 2016 23:15:52 +0000 Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:

> # CCed Andrew,

Thanks.

> On Mon, Apr 18, 2016 at 02:43:45PM +0300, Konstantin Khlebnikov wrote:
> > get_hwpoison_page() must recheck the relation between head and tail pages.
> > 
> > Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> 
> Looks good to me. Without this recheck, the race causes the kernel to pin
> an irrelevant page, and eventually crashes the kernel on a refcount mismatch...

Thanks.  I'll add the above (important!) info to the changelog and
cc:stable.

> > -	return get_page_unless_zero(head);
> > +	if (get_page_unless_zero(head)) {
> > +		if (head == compound_head(page))
> > +			return 1;
> > +
> > +		pr_info("MCE: %#lx cannot catch tail\n", page_to_pfn(page));
> 
> Recently Chen Yucong replaced the label "MCE:" with "Memory failure:",
> but the resolution is trivial, I think.

Yup, that patch is in my (large) backlog.  Away at conferences for
seven days, receiving 100 actionable emails per day.  Give me a few
days ;)


Thread overview:
2016-04-18 11:43 [PATCH] mm/memory-failure: fix race with compound page split/merge Konstantin Khlebnikov
2016-04-18 23:15 ` Naoya Horiguchi
2016-04-19  5:54   ` Konstantin Khlebnikov
2016-04-21 23:44   ` Andrew Morton
