linux-kernel.vger.kernel.org archive mirror
* [RFC] HWPOISON: soft offlining for non-lru movable page
@ 2017-01-18  4:00 Yisheng Xie
  2017-01-18  9:45 ` Naoya Horiguchi
  2017-01-18  9:51 ` Michal Hocko
  0 siblings, 2 replies; 6+ messages in thread
From: Yisheng Xie @ 2017-01-18  4:00 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: n-horiguchi, mhocko, akpm, minchan, vbabka, guohanjun, qiuxishi

This patch extends the soft offlining framework to support
non-lru pages, which have supported migration since
commit bda807d44454 ("mm: migrate: support non-lru movable page
migration").

When corrected memory errors occur on a non-lru movable page,
we can choose to stop using it by migrating its data onto another
page and disabling the original (maybe half-broken) one.

Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
---
 mm/memory-failure.c | 55 +++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 53 insertions(+), 2 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index f283c7e..10043a4 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1527,7 +1527,8 @@ static int get_any_page(struct page *page, unsigned long pfn, int flags)
 {
 	int ret = __get_any_page(page, pfn, flags);
 
-	if (ret == 1 && !PageHuge(page) && !PageLRU(page)) {
+	if (ret == 1 && !PageHuge(page) &&
+	    !PageLRU(page) && !__PageMovable(page)) {
 		/*
 		 * Try to free it.
 		 */
@@ -1549,6 +1550,54 @@ static int get_any_page(struct page *page, unsigned long pfn, int flags)
 	return ret;
 }
 
+static int soft_offline_movable_page(struct page *page, int flags)
+{
+	int ret;
+	unsigned long pfn = page_to_pfn(page);
+	LIST_HEAD(pagelist);
+
+	/*
+	 * This double-check of PageHWPoison is to avoid the race with
+	 * memory_failure(). See also comment in __soft_offline_page().
+	 */
+	lock_page(page);
+	if (PageHWPoison(page)) {
+		unlock_page(page);
+		put_hwpoison_page(page);
+		pr_info("soft offline: %#lx movable page already poisoned\n",
+			pfn);
+		return -EBUSY;
+	}
+	unlock_page(page);
+
+	ret = isolate_movable_page(page, ISOLATE_UNEVICTABLE);
+	/*
+	 * get_any_page() and isolate_movable_page() each take a refcount,
+	 * so we need to drop one here.
+	 */
+	put_hwpoison_page(page);
+	if (!ret) {
+		pr_info("soft offline: %#lx movable page failed to isolate\n",
+			pfn);
+		return -EBUSY;
+	}
+
+	list_add(&page->lru, &pagelist);
+	ret = migrate_pages(&pagelist, new_page, NULL, MPOL_MF_MOVE_ALL,
+			    MIGRATE_SYNC, MR_MEMORY_FAILURE);
+	if (ret) {
+		if (!list_empty(&pagelist))
+			putback_movable_pages(&pagelist);
+
+		pr_info("soft offline: %#lx: migration failed %d, type %lx\n",
+			pfn, ret, page->flags);
+		if (ret > 0)
+			ret = -EIO;
+	}
+
+	return ret;
+}
+
 static int soft_offline_huge_page(struct page *page, int flags)
 {
 	int ret;
@@ -1705,8 +1754,10 @@ static int soft_offline_in_use_page(struct page *page, int flags)
 
 	if (PageHuge(page))
 		ret = soft_offline_huge_page(page, flags);
-	else
+	else if (PageLRU(page))
 		ret = __soft_offline_page(page, flags);
+	else
+		ret = soft_offline_movable_page(page, flags);
 
 	return ret;
 }
-- 
1.7.12.4
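
The refcount comment in soft_offline_movable_page() can be followed with a small userspace model. This is only a sketch: toy_page, toy_get(), toy_put(), toy_isolate() and toy_soft_offline_prepare() are hypothetical stand-ins, not kernel APIs, and the model assumes that a failed isolation leaves the refcount unchanged. The real isolate_movable_page() also takes the page lock and calls into the owning driver, which is left out here.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct page: only the fields the model needs. */
struct toy_page {
	int refcount;	/* models the page reference count */
	bool movable;	/* models __PageMovable() */
	bool isolated;	/* models the isolated state */
};

/* Models the reference taken by get_any_page(). */
static void toy_get(struct toy_page *p)
{
	p->refcount++;
}

/* Models put_hwpoison_page(): drop exactly one reference. */
static void toy_put(struct toy_page *p)
{
	assert(p->refcount > 0);
	p->refcount--;
}

/*
 * Models isolate_movable_page(): on success it takes its own reference
 * and marks the page isolated; on failure the refcount is unchanged.
 */
static bool toy_isolate(struct toy_page *p)
{
	if (!p->movable || p->isolated)
		return false;
	p->refcount++;
	p->isolated = true;
	return true;
}

/*
 * Same order of operations as the patch: isolate, then drop the
 * get_any_page() reference whether or not isolation succeeded.
 * Returns 0 if the page is ready to migrate, -1 otherwise.
 */
static int toy_soft_offline_prepare(struct toy_page *p)
{
	bool ok;

	toy_get(p);		/* get_any_page() */
	ok = toy_isolate(p);	/* isolate_movable_page() */
	toy_put(p);		/* put_hwpoison_page() */
	return ok ? 0 : -1;
}
```

In this model, a page that starts with one reference from its owner ends up with two after a successful prepare step (owner plus isolation), and is back at one if isolation fails.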


* Re: [RFC] HWPOISON: soft offlining for non-lru movable page
  2017-01-18  4:00 [RFC] HWPOISON: soft offlining for non-lru movable page Yisheng Xie
@ 2017-01-18  9:45 ` Naoya Horiguchi
  2017-01-20  9:52   ` Yisheng Xie
  2017-01-18  9:51 ` Michal Hocko
  1 sibling, 1 reply; 6+ messages in thread
From: Naoya Horiguchi @ 2017-01-18  9:45 UTC (permalink / raw)
  To: Yisheng Xie
  Cc: linux-mm, linux-kernel, mhocko, akpm, minchan, vbabka, guohanjun,
	qiuxishi

On Wed, Jan 18, 2017 at 12:00:54PM +0800, Yisheng Xie wrote:
> This patch extends the soft offlining framework to support
> non-lru pages, which have supported migration since
> commit bda807d44454 ("mm: migrate: support non-lru movable page
> migration").
> 
> When corrected memory errors occur on a non-lru movable page,
> we can choose to stop using it by migrating its data onto another
> page and disabling the original (maybe half-broken) one.
> 
> Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>

It looks OK at a quick glance. I'll do some more testing tomorrow.

Thanks,
Naoya Horiguchi



* Re: [RFC] HWPOISON: soft offlining for non-lru movable page
  2017-01-18  4:00 [RFC] HWPOISON: soft offlining for non-lru movable page Yisheng Xie
  2017-01-18  9:45 ` Naoya Horiguchi
@ 2017-01-18  9:51 ` Michal Hocko
  2017-01-19  1:21   ` Yisheng Xie
  1 sibling, 1 reply; 6+ messages in thread
From: Michal Hocko @ 2017-01-18  9:51 UTC (permalink / raw)
  To: Yisheng Xie
  Cc: linux-mm, linux-kernel, n-horiguchi, akpm, minchan, vbabka,
	guohanjun, qiuxishi

On Wed 18-01-17 12:00:54, Yisheng Xie wrote:
> This patch extends the soft offlining framework to support
> non-lru pages, which have supported migration since
> commit bda807d44454 ("mm: migrate: support non-lru movable page
> migration").
> 
> When corrected memory errors occur on a non-lru movable page,
> we can choose to stop using it by migrating its data onto another
> page and disabling the original (maybe half-broken) one.

soft_offline_movable_page duplicates quite a lot from
__soft_offline_page. Would it be better to handle both cases in
__soft_offline_page?
-- 
Michal Hocko
SUSE Labs
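
One possible reading of the suggestion above is that only the isolation step differs between LRU pages and non-lru movable pages, so everything after isolation can be shared. The userspace sketch below illustrates that shape under that assumption. It is not the v2 patch, which does not appear in this thread, and every toy_* name is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum toy_kind { TOY_LRU, TOY_MOVABLE, TOY_OTHER };

struct toy_page {
	enum toy_kind kind;
	bool isolated;
	bool migrated;
};

/* Stand-in for isolate_lru_page(). */
static bool toy_isolate_lru(struct toy_page *p)
{
	p->isolated = true;
	return true;
}

/* Stand-in for isolate_movable_page(). */
static bool toy_isolate_movable(struct toy_page *p)
{
	p->isolated = true;
	return true;
}

/* One function for both page types, as the review comment proposes. */
static int toy_soft_offline(struct toy_page *p)
{
	bool isolated = false;

	/* The only type-specific step: how the page gets isolated. */
	if (p->kind == TOY_LRU)
		isolated = toy_isolate_lru(p);
	else if (p->kind == TOY_MOVABLE)
		isolated = toy_isolate_movable(p);

	if (!isolated)
		return -EBUSY;

	/* Shared tail: one migration path and one error report. */
	assert(p->isolated);
	p->migrated = true;
	return 0;
}
```

The trade-off is the one raised in the thread: a separate function keeps each path easy to read, while a merged one avoids repeating the HWPoison double-check, the migrate_pages() call and the failure logging.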


* Re: [RFC] HWPOISON: soft offlining for non-lru movable page
  2017-01-18  9:51 ` Michal Hocko
@ 2017-01-19  1:21   ` Yisheng Xie
  0 siblings, 0 replies; 6+ messages in thread
From: Yisheng Xie @ 2017-01-19  1:21 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, linux-kernel, n-horiguchi, akpm, minchan, vbabka,
	guohanjun, qiuxishi



On 2017/1/18 17:51, Michal Hocko wrote:
> On Wed 18-01-17 12:00:54, Yisheng Xie wrote:
>> This patch extends the soft offlining framework to support
>> non-lru pages, which have supported migration since
>> commit bda807d44454 ("mm: migrate: support non-lru movable page
>> migration").
>>
>> When corrected memory errors occur on a non-lru movable page,
>> we can choose to stop using it by migrating its data onto another
>> page and disabling the original (maybe half-broken) one.
> 
> soft_offline_movable_page duplicates quite a lot from
> __soft_offline_page. Would it be better to handle both cases in
> __soft_offline_page?
> 
Hi Michal,
Thanks for reviewing.
Yes, most of the code in soft_offline_movable_page duplicates
__soft_offline_page. I used a separate function to keep the code clear,
just as soft_offline_huge_page does.

I will try to make a v2 following your suggestion.

Thanks
Yisheng Xie.


* Re: [RFC] HWPOISON: soft offlining for non-lru movable page
  2017-01-18  9:45 ` Naoya Horiguchi
@ 2017-01-20  9:52   ` Yisheng Xie
  2017-01-23  4:26     ` Naoya Horiguchi
  0 siblings, 1 reply; 6+ messages in thread
From: Yisheng Xie @ 2017-01-20  9:52 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: linux-mm, linux-kernel, mhocko, akpm, minchan, vbabka, guohanjun,
	qiuxishi

Hi Naoya,

On 2017/1/18 17:45, Naoya Horiguchi wrote:
> On Wed, Jan 18, 2017 at 12:00:54PM +0800, Yisheng Xie wrote:
>> This patch extends the soft offlining framework to support
>> non-lru pages, which have supported migration since
>> commit bda807d44454 ("mm: migrate: support non-lru movable page
>> migration").
>>
>> When corrected memory errors occur on a non-lru movable page,
>> we can choose to stop using it by migrating its data onto another
>> page and disabling the original (maybe half-broken) one.
>>
>> Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
> 
> It looks OK at a quick glance. I'll do some more testing tomorrow.
> 
Thanks for reviewing.
I have done some basic tests, such as soft offlining a movable page and
unpoisoning it. Do you have a test suite or any test suggestions, so that
I can do some more testing to double-check? Thank you very much.

Thanks
Yisheng Xie.

> 


* Re: [RFC] HWPOISON: soft offlining for non-lru movable page
  2017-01-20  9:52   ` Yisheng Xie
@ 2017-01-23  4:26     ` Naoya Horiguchi
  0 siblings, 0 replies; 6+ messages in thread
From: Naoya Horiguchi @ 2017-01-23  4:26 UTC (permalink / raw)
  To: Yisheng Xie
  Cc: linux-mm, linux-kernel, mhocko, akpm, minchan, vbabka, guohanjun,
	qiuxishi

On Fri, Jan 20, 2017 at 05:52:13PM +0800, Yisheng Xie wrote:
> Hi Naoya,
> 
> On 2017/1/18 17:45, Naoya Horiguchi wrote:
> > On Wed, Jan 18, 2017 at 12:00:54PM +0800, Yisheng Xie wrote:
> >> This patch extends the soft offlining framework to support
> >> non-lru pages, which have supported migration since
> >> commit bda807d44454 ("mm: migrate: support non-lru movable page
> >> migration").
> >>
> >> When corrected memory errors occur on a non-lru movable page,
> >> we can choose to stop using it by migrating its data onto another
> >> page and disabling the original (maybe half-broken) one.
> >>
> >> Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
> > 
> > It looks OK at a quick glance. I'll do some more testing tomorrow.
> > 
> Thanks for reviewing.
> I have done some basic tests, such as soft offlining a movable page and
> unpoisoning it. Do you have a test suite or any test suggestions, so that
> I can do some more testing to double-check? Thank you very much.

I've tried soft offline on zram pages with your v2 patch, and it works fine.
I have no specific suggestion about other testcases.

Thanks,
Naoya Horiguchi
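
The thread does not say how the zram test was driven. One way to request soft offlining of a specific page frame from userspace, assuming a kernel built with CONFIG_MEMORY_FAILURE and a caller with CAP_SYS_ADMIN, is to write the page's physical address to /sys/devices/system/memory/soft_offline_page. The sketch below shows only that step. The helper names format_phys_addr() and soft_offline_pfn() are hypothetical, and finding the PFN that backs a zram page is out of scope here.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define SOFT_OFFLINE_PATH "/sys/devices/system/memory/soft_offline_page"

/*
 * Format the physical address of page frame `pfn` as hex text.
 * Returns the number of characters written, or -1 if buf is too small.
 */
static int format_phys_addr(unsigned long long pfn, unsigned int page_shift,
			    char *buf, size_t len)
{
	int n = snprintf(buf, len, "0x%llx", pfn << page_shift);

	return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/*
 * Write the physical address of `pfn` to the sysfs file at `path`.
 * Returns 0 on success or a negative errno value on failure.
 */
static int soft_offline_pfn(const char *path, unsigned long long pfn,
			    unsigned int page_shift)
{
	char buf[32];
	ssize_t written;
	int fd, n, ret = 0;

	n = format_phys_addr(pfn, page_shift, buf, sizeof(buf));
	if (n < 0)
		return -EINVAL;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	written = write(fd, buf, (size_t)n);
	if (written < 0)
		ret = -errno;	/* the kernel's error for this request */
	else if (written != n)
		ret = -EIO;	/* short write: treat as an I/O error */

	close(fd);
	return ret;
}
```

A caller would pass SOFT_OFFLINE_PATH as the path and the system's page shift (12 for 4 KiB pages). A nonzero return value is the negated errno from open() or write().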

