linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] mlock: fix mlock count can not decrease in race condition
@ 2017-05-25  2:13 Yisheng Xie
  2017-05-25  6:32 ` Vlastimil Babka
  0 siblings, 1 reply; 3+ messages in thread
From: Yisheng Xie @ 2017-05-25  2:13 UTC (permalink / raw)
  To: akpm
  Cc: vbabka, joern, mgorman, walken, hughd, riel, hannes, mhocko,
	qiuxishi, zhongjiang, guohanjun, wangkefeng.wang, stable,
	linux-kernel, linux-mm

Kefeng reported that when run the follow test the mlock count in meminfo
cannot be decreased:
 [1] testcase
 linux:~ # cat test_mlockal
 grep Mlocked /proc/meminfo
  for j in `seq 0 10`
  do
 	for i in `seq 4 15`
 	do
 		./p_mlockall >> log &
 	done
 	sleep 0.2
 done
 # wait some time to let mlock counter decrease and 5s may not enough
 sleep 5
 grep Mlocked /proc/meminfo

 linux:~ # cat p_mlockall.c
 #include <sys/mman.h>
 #include <stdlib.h>
 #include <stdio.h>

 #define SPACE_LEN	4096

 int main(int argc, char ** argv)
 {
 	int ret;
 	void *adr = malloc(SPACE_LEN);
 	if (!adr)
 		return -1;

 	ret = mlockall(MCL_CURRENT | MCL_FUTURE);
 	printf("mlcokall ret = %d\n", ret);

 	ret = munlockall();
 	printf("munlcokall ret = %d\n", ret);

 	free(adr);
 	return 0;
 }

When __munlock_pagevec, we ClearPageMlock but isolation_failed in race
condition, and we do not count these page into delta_munlocked, which cause
mlock counter incorrect for we had Clear the PageMlock and cannot count down
the number in the feture.

Fix it by count the number of page whoes PageMlock flag is cleared.

Fixes: 1ebb7cc6a583 (" mm: munlock: batch NR_MLOCK zone state updates")
Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Joern Engel <joern@logfs.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michel Lespinasse <walken@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Xishi Qiu <qiuxishi@huawei.com>
CC: zhongjiang <zhongjiang@huawei.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: <stable@vger.kernel.org>
---
v2:
 - use delta_munlocked for it doesn't do the increment in fastpath - Vlastimil

Hi Andrew:
Could you please help to fold this?

Thanks
Yisheng Xie

 mm/mlock.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/mlock.c b/mm/mlock.c
index c483c5c..b562b55 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -284,7 +284,7 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone)
 {
 	int i;
 	int nr = pagevec_count(pvec);
-	int delta_munlocked;
+	int delta_munlocked = -nr;
 	struct pagevec pvec_putback;
 	int pgrescued = 0;
 
@@ -304,6 +304,8 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone)
 				continue;
 			else
 				__munlock_isolation_failed(page);
+		} else {
+			delta_munlocked++;
 		}
 
 		/*
@@ -315,7 +317,6 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone)
 		pagevec_add(&pvec_putback, pvec->pages[i]);
 		pvec->pages[i] = NULL;
 	}
-	delta_munlocked = -nr + pagevec_count(&pvec_putback);
 	__mod_zone_page_state(zone, NR_MLOCK, delta_munlocked);
 	spin_unlock_irq(zone_lru_lock(zone));
 
-- 
1.7.12.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH v2] mlock: fix mlock count can not decrease in race condition
  2017-05-25  2:13 [PATCH v2] mlock: fix mlock count can not decrease in race condition Yisheng Xie
@ 2017-05-25  6:32 ` Vlastimil Babka
  2017-05-25  6:48   ` Yisheng Xie
  0 siblings, 1 reply; 3+ messages in thread
From: Vlastimil Babka @ 2017-05-25  6:32 UTC (permalink / raw)
  To: Yisheng Xie, akpm
  Cc: joern, mgorman, walken, hughd, riel, hannes, mhocko, qiuxishi,
	zhongjiang, guohanjun, wangkefeng.wang, stable, linux-kernel,
	linux-mm

On 05/25/2017 04:13 AM, Yisheng Xie wrote:
> Kefeng reported that when run the follow test the mlock count

> in meminfo
> cannot be decreased:

"increases permanently."?

>  [1] testcase
>  linux:~ # cat test_mlockal
>  grep Mlocked /proc/meminfo
>   for j in `seq 0 10`
>   do
>  	for i in `seq 4 15`
>  	do
>  		./p_mlockall >> log &
>  	done
>  	sleep 0.2
>  done
>  # wait some time to let mlock counter decrease and 5s may not enough
>  sleep 5
>  grep Mlocked /proc/meminfo
> 
>  linux:~ # cat p_mlockall.c
>  #include <sys/mman.h>
>  #include <stdlib.h>
>  #include <stdio.h>
> 
>  #define SPACE_LEN	4096
> 
>  int main(int argc, char ** argv)
>  {
>  	int ret;
>  	void *adr = malloc(SPACE_LEN);
>  	if (!adr)
>  		return -1;
> 
>  	ret = mlockall(MCL_CURRENT | MCL_FUTURE);
>  	printf("mlcokall ret = %d\n", ret);
> 
>  	ret = munlockall();
>  	printf("munlcokall ret = %d\n", ret);
> 
>  	free(adr);
>  	return 0;
>  }
> 
> When __munlock_pagevec, we ClearPageMlock but isolation_failed in race
> condition, and we do not count these page into delta_munlocked, which cause
> mlock counter incorrect for we had Clear the PageMlock and cannot count down
> the number in the feture.

Can I suggest the following instead:

In __munlock_pagevec() we should decrement NR_MLOCK for each page where
we clear the PageMlocked flag. Commit 1ebb7cc6a583 ("mm: munlock: batch
NR_MLOCK zone state updates") has introduced a bug where we don't
decrement NR_MLOCK for pages where we clear the flag, but fail to
isolate them from the lru list (e.g. when the pages are on some other
cpu's percpu pagevec). Since PageMlocked stays cleared, the NR_MLOCK
accounting gets permanently disrupted by this.

> Fix it by count the number of page whoes PageMlock flag is cleared.
> 
> Fixes: 1ebb7cc6a583 (" mm: munlock: batch NR_MLOCK zone state updates")
> Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
> Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

Thanks!

> Cc: Joern Engel <joern@logfs.org>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Michel Lespinasse <walken@google.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Rik van Riel <riel@redhat.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: Xishi Qiu <qiuxishi@huawei.com>
> CC: zhongjiang <zhongjiang@huawei.com>
> Cc: Hanjun Guo <guohanjun@huawei.com>
> Cc: <stable@vger.kernel.org>
> ---
> v2:
>  - use delta_munlocked for it doesn't do the increment in fastpath - Vlastimil
> 
> Hi Andrew:
> Could you please help to fold this?
> 
> Thanks
> Yisheng Xie
> 
>  mm/mlock.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/mlock.c b/mm/mlock.c
> index c483c5c..b562b55 100644
> --- a/mm/mlock.c
> +++ b/mm/mlock.c
> @@ -284,7 +284,7 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone)
>  {
>  	int i;
>  	int nr = pagevec_count(pvec);
> -	int delta_munlocked;
> +	int delta_munlocked = -nr;
>  	struct pagevec pvec_putback;
>  	int pgrescued = 0;
>  
> @@ -304,6 +304,8 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone)
>  				continue;
>  			else
>  				__munlock_isolation_failed(page);
> +		} else {
> +			delta_munlocked++;
>  		}
>  
>  		/*
> @@ -315,7 +317,6 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone)
>  		pagevec_add(&pvec_putback, pvec->pages[i]);
>  		pvec->pages[i] = NULL;
>  	}
> -	delta_munlocked = -nr + pagevec_count(&pvec_putback);
>  	__mod_zone_page_state(zone, NR_MLOCK, delta_munlocked);
>  	spin_unlock_irq(zone_lru_lock(zone));
>  
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH v2] mlock: fix mlock count can not decrease in race condition
  2017-05-25  6:32 ` Vlastimil Babka
@ 2017-05-25  6:48   ` Yisheng Xie
  0 siblings, 0 replies; 3+ messages in thread
From: Yisheng Xie @ 2017-05-25  6:48 UTC (permalink / raw)
  To: Vlastimil Babka, akpm
  Cc: joern, mgorman, walken, hughd, riel, hannes, mhocko, qiuxishi,
	zhongjiang, guohanjun, wangkefeng.wang, stable, linux-kernel,
	linux-mm

Hi Vlastimil,
Thanks for comment!

On 2017/5/25 14:32, Vlastimil Babka wrote:
> On 05/25/2017 04:13 AM, Yisheng Xie wrote:
>> Kefeng reported that when run the follow test the mlock count
> 
>> in meminfo
>> cannot be decreased:
> 
> "increases permanently."?
Yes if I am not mis-understanding what your means.

> 
>>  [1] testcase
>>  linux:~ # cat test_mlockal
>>  grep Mlocked /proc/meminfo
>>   for j in `seq 0 10`
>>   do
>>  	for i in `seq 4 15`
>>  	do
>>  		./p_mlockall >> log &
>>  	done
>>  	sleep 0.2
>>  done
>>  # wait some time to let mlock counter decrease and 5s may not enough
>>  sleep 5
>>  grep Mlocked /proc/meminfo
>>
>>  linux:~ # cat p_mlockall.c
>>  #include <sys/mman.h>
>>  #include <stdlib.h>
>>  #include <stdio.h>
>>
>>  #define SPACE_LEN	4096
>>
>>  int main(int argc, char ** argv)
>>  {
>>  	int ret;
>>  	void *adr = malloc(SPACE_LEN);
>>  	if (!adr)
>>  		return -1;
>>
>>  	ret = mlockall(MCL_CURRENT | MCL_FUTURE);
>>  	printf("mlcokall ret = %d\n", ret);
>>
>>  	ret = munlockall();
>>  	printf("munlcokall ret = %d\n", ret);
>>
>>  	free(adr);
>>  	return 0;
>>  }
>>
>> When __munlock_pagevec, we ClearPageMlock but isolation_failed in race
>> condition, and we do not count these page into delta_munlocked, which cause
>> mlock counter incorrect for we had Clear the PageMlock and cannot count down
>> the number in the feture.
> 
> Can I suggest the following instead:
> 
> In __munlock_pagevec() we should decrement NR_MLOCK for each page where
> we clear the PageMlocked flag. Commit 1ebb7cc6a583 ("mm: munlock: batch
> NR_MLOCK zone state updates") has introduced a bug where we don't
> decrement NR_MLOCK for pages where we clear the flag, but fail to
> isolate them from the lru list (e.g. when the pages are on some other
> cpu's percpu pagevec). Since PageMlocked stays cleared, the NR_MLOCK
> accounting gets permanently disrupted by this.
That's much better and clear. Should I send another version ?

Thanks
Yisheng Xie

> 
>> Fix it by count the number of page whoes PageMlock flag is cleared.
>>
>> Fixes: 1ebb7cc6a583 (" mm: munlock: batch NR_MLOCK zone state updates")
>> Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
>> Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>> Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>> Cc: Vlastimil Babka <vbabka@suse.cz>
> 
> Acked-by: Vlastimil Babka <vbabka@suse.cz>
> 
> Thanks!
> 
>> Cc: Joern Engel <joern@logfs.org>
>> Cc: Mel Gorman <mgorman@suse.de>
>> Cc: Michel Lespinasse <walken@google.com>
>> Cc: Hugh Dickins <hughd@google.com>
>> Cc: Rik van Riel <riel@redhat.com>
>> Cc: Johannes Weiner <hannes@cmpxchg.org>
>> Cc: Michal Hocko <mhocko@suse.cz>
>> Cc: Xishi Qiu <qiuxishi@huawei.com>
>> CC: zhongjiang <zhongjiang@huawei.com>
>> Cc: Hanjun Guo <guohanjun@huawei.com>
>> Cc: <stable@vger.kernel.org>
>> ---
>> v2:
>>  - use delta_munlocked for it doesn't do the increment in fastpath - Vlastimil
>>
>> Hi Andrew:
>> Could you please help to fold this?
>>
>> Thanks
>> Yisheng Xie
>>
>>  mm/mlock.c | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/mlock.c b/mm/mlock.c
>> index c483c5c..b562b55 100644
>> --- a/mm/mlock.c
>> +++ b/mm/mlock.c
>> @@ -284,7 +284,7 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone)
>>  {
>>  	int i;
>>  	int nr = pagevec_count(pvec);
>> -	int delta_munlocked;
>> +	int delta_munlocked = -nr;
>>  	struct pagevec pvec_putback;
>>  	int pgrescued = 0;
>>  
>> @@ -304,6 +304,8 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone)
>>  				continue;
>>  			else
>>  				__munlock_isolation_failed(page);
>> +		} else {
>> +			delta_munlocked++;
>>  		}
>>  
>>  		/*
>> @@ -315,7 +317,6 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone)
>>  		pagevec_add(&pvec_putback, pvec->pages[i]);
>>  		pvec->pages[i] = NULL;
>>  	}
>> -	delta_munlocked = -nr + pagevec_count(&pvec_putback);
>>  	__mod_zone_page_state(zone, NR_MLOCK, delta_munlocked);
>>  	spin_unlock_irq(zone_lru_lock(zone));
>>  
>>
> 
> 
> .
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2017-05-25  6:48 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-25  2:13 [PATCH v2] mlock: fix mlock count can not decrease in race condition Yisheng Xie
2017-05-25  6:32 ` Vlastimil Babka
2017-05-25  6:48   ` Yisheng Xie

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).