All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH] mm: thp: grab the lock before manipulation defer list
@ 2020-01-03 14:34 Wei Yang
  2020-01-03 19:29   ` David Rientjes
                   ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Wei Yang @ 2020-01-03 14:34 UTC (permalink / raw)
  To: hannes, mhocko, vdavydov.dev, akpm
  Cc: kirill.shutemov, cgroups, linux-mm, linux-kernel, yang.shi, Wei Yang

As all the other places, we grab the lock before manipulate the defer list.
Current implementation may face a race condition.

Fixes: 87eaceb3faa5 ("mm: thp: make deferred split shrinker memcg aware")

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>

---
I notice the difference during code reading and just confused about the
difference. No specific test is done since limited knowledge about cgroup.

Maybe I miss something important?
---
 mm/memcontrol.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index bc01423277c5..62b7ec34ef1a 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5368,12 +5368,12 @@ static int mem_cgroup_move_account(struct page *page,
 	}
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
+	spin_lock(&from->deferred_split_queue.split_queue_lock);
 	if (compound && !list_empty(page_deferred_list(page))) {
-		spin_lock(&from->deferred_split_queue.split_queue_lock);
 		list_del_init(page_deferred_list(page));
 		from->deferred_split_queue.split_queue_len--;
-		spin_unlock(&from->deferred_split_queue.split_queue_lock);
 	}
+	spin_unlock(&from->deferred_split_queue.split_queue_lock);
 #endif
 	/*
 	 * It is safe to change page->mem_cgroup here because the page
@@ -5385,13 +5385,13 @@ static int mem_cgroup_move_account(struct page *page,
 	page->mem_cgroup = to;
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
+	spin_lock(&to->deferred_split_queue.split_queue_lock);
 	if (compound && list_empty(page_deferred_list(page))) {
-		spin_lock(&to->deferred_split_queue.split_queue_lock);
 		list_add_tail(page_deferred_list(page),
 			      &to->deferred_split_queue.split_queue);
 		to->deferred_split_queue.split_queue_len++;
-		spin_unlock(&to->deferred_split_queue.split_queue_lock);
 	}
+	spin_unlock(&to->deferred_split_queue.split_queue_lock);
 #endif
 
 	spin_unlock_irqrestore(&from->move_lock, flags);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
  2020-01-03 14:34 [RFC PATCH] mm: thp: grab the lock before manipulation defer list Wei Yang
@ 2020-01-03 19:29   ` David Rientjes
  2020-01-06 10:23 ` Michal Hocko
  2020-01-06 16:18   ` Alexander Duyck
  2 siblings, 0 replies; 24+ messages in thread
From: David Rientjes @ 2020-01-03 19:29 UTC (permalink / raw)
  To: Wei Yang
  Cc: hannes, mhocko, vdavydov.dev, akpm, kirill.shutemov, cgroups,
	linux-mm, linux-kernel, yang.shi

On Fri, 3 Jan 2020, Wei Yang wrote:

> As all the other places, we grab the lock before manipulate the defer list.
> Current implementation may face a race condition.
> 
> Fixes: 87eaceb3faa5 ("mm: thp: make deferred split shrinker memcg aware")
> 
> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
> 
> ---
> I notice the difference during code reading and just confused about the
> difference. No specific test is done since limited knowledge about cgroup.
> 
> Maybe I miss something important?

The check for !list_empty(page_deferred_list(page)) must certainly be 
serialized with doing list_del_init(page_deferred_list(page)).

> ---
>  mm/memcontrol.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index bc01423277c5..62b7ec34ef1a 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -5368,12 +5368,12 @@ static int mem_cgroup_move_account(struct page *page,
>  	}
>  
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +	spin_lock(&from->deferred_split_queue.split_queue_lock);
>  	if (compound && !list_empty(page_deferred_list(page))) {
> -		spin_lock(&from->deferred_split_queue.split_queue_lock);
>  		list_del_init(page_deferred_list(page));
>  		from->deferred_split_queue.split_queue_len--;
> -		spin_unlock(&from->deferred_split_queue.split_queue_lock);
>  	}
> +	spin_unlock(&from->deferred_split_queue.split_queue_lock);
>  #endif
>  	/*
>  	 * It is safe to change page->mem_cgroup here because the page
> @@ -5385,13 +5385,13 @@ static int mem_cgroup_move_account(struct page *page,
>  	page->mem_cgroup = to;
>  
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +	spin_lock(&to->deferred_split_queue.split_queue_lock);
>  	if (compound && list_empty(page_deferred_list(page))) {
> -		spin_lock(&to->deferred_split_queue.split_queue_lock);
>  		list_add_tail(page_deferred_list(page),
>  			      &to->deferred_split_queue.split_queue);
>  		to->deferred_split_queue.split_queue_len++;
> -		spin_unlock(&to->deferred_split_queue.split_queue_lock);
>  	}
> +	spin_unlock(&to->deferred_split_queue.split_queue_lock);
>  #endif
>  
>  	spin_unlock_irqrestore(&from->move_lock, flags);
> -- 
> 2.17.1
> 
> 
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
@ 2020-01-03 19:29   ` David Rientjes
  0 siblings, 0 replies; 24+ messages in thread
From: David Rientjes @ 2020-01-03 19:29 UTC (permalink / raw)
  To: Wei Yang
  Cc: hannes, mhocko, vdavydov.dev, akpm, kirill.shutemov, cgroups,
	linux-mm, linux-kernel, yang.shi

On Fri, 3 Jan 2020, Wei Yang wrote:

> As all the other places, we grab the lock before manipulate the defer list.
> Current implementation may face a race condition.
> 
> Fixes: 87eaceb3faa5 ("mm: thp: make deferred split shrinker memcg aware")
> 
> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
> 
> ---
> I notice the difference during code reading and just confused about the
> difference. No specific test is done since limited knowledge about cgroup.
> 
> Maybe I miss something important?

The check for !list_empty(page_deferred_list(page)) must certainly be 
serialized with doing list_del_init(page_deferred_list(page)).

> ---
>  mm/memcontrol.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index bc01423277c5..62b7ec34ef1a 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -5368,12 +5368,12 @@ static int mem_cgroup_move_account(struct page *page,
>  	}
>  
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +	spin_lock(&from->deferred_split_queue.split_queue_lock);
>  	if (compound && !list_empty(page_deferred_list(page))) {
> -		spin_lock(&from->deferred_split_queue.split_queue_lock);
>  		list_del_init(page_deferred_list(page));
>  		from->deferred_split_queue.split_queue_len--;
> -		spin_unlock(&from->deferred_split_queue.split_queue_lock);
>  	}
> +	spin_unlock(&from->deferred_split_queue.split_queue_lock);
>  #endif
>  	/*
>  	 * It is safe to change page->mem_cgroup here because the page
> @@ -5385,13 +5385,13 @@ static int mem_cgroup_move_account(struct page *page,
>  	page->mem_cgroup = to;
>  
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +	spin_lock(&to->deferred_split_queue.split_queue_lock);
>  	if (compound && list_empty(page_deferred_list(page))) {
> -		spin_lock(&to->deferred_split_queue.split_queue_lock);
>  		list_add_tail(page_deferred_list(page),
>  			      &to->deferred_split_queue.split_queue);
>  		to->deferred_split_queue.split_queue_len++;
> -		spin_unlock(&to->deferred_split_queue.split_queue_lock);
>  	}
> +	spin_unlock(&to->deferred_split_queue.split_queue_lock);
>  #endif
>  
>  	spin_unlock_irqrestore(&from->move_lock, flags);
> -- 
> 2.17.1
> 
> 
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
  2020-01-03 19:29   ` David Rientjes
  (?)
@ 2020-01-03 23:39   ` Wei Yang
  2020-01-04  0:44       ` David Rientjes
  -1 siblings, 1 reply; 24+ messages in thread
From: Wei Yang @ 2020-01-03 23:39 UTC (permalink / raw)
  To: David Rientjes
  Cc: Wei Yang, hannes, mhocko, vdavydov.dev, akpm, kirill.shutemov,
	cgroups, linux-mm, linux-kernel, yang.shi

On Fri, Jan 03, 2020 at 11:29:06AM -0800, David Rientjes wrote:
>On Fri, 3 Jan 2020, Wei Yang wrote:
>
>> As all the other places, we grab the lock before manipulate the defer list.
>> Current implementation may face a race condition.
>> 
>> Fixes: 87eaceb3faa5 ("mm: thp: make deferred split shrinker memcg aware")
>> 
>> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
>> 
>> ---
>> I notice the difference during code reading and just confused about the
>> difference. No specific test is done since limited knowledge about cgroup.
>> 
>> Maybe I miss something important?
>
>The check for !list_empty(page_deferred_list(page)) must certainly be 
>serialized with doing list_del_init(page_deferred_list(page)).
>

Hi David

Would you mind giving more information? You mean list_empty and list_del_init
is atomic?

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
  2020-01-03 23:39   ` Wei Yang
@ 2020-01-04  0:44       ` David Rientjes
  0 siblings, 0 replies; 24+ messages in thread
From: David Rientjes @ 2020-01-04  0:44 UTC (permalink / raw)
  To: Wei Yang
  Cc: hannes, mhocko, vdavydov.dev, akpm, kirill.shutemov, cgroups,
	linux-mm, linux-kernel, yang.shi

On Sat, 4 Jan 2020, Wei Yang wrote:

> On Fri, Jan 03, 2020 at 11:29:06AM -0800, David Rientjes wrote:
> >On Fri, 3 Jan 2020, Wei Yang wrote:
> >
> >> As all the other places, we grab the lock before manipulate the defer list.
> >> Current implementation may face a race condition.
> >> 
> >> Fixes: 87eaceb3faa5 ("mm: thp: make deferred split shrinker memcg aware")
> >> 
> >> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
> >> 
> >> ---
> >> I notice the difference during code reading and just confused about the
> >> difference. No specific test is done since limited knowledge about cgroup.
> >> 
> >> Maybe I miss something important?
> >
> >The check for !list_empty(page_deferred_list(page)) must certainly be 
> >serialized with doing list_del_init(page_deferred_list(page)).
> >
> 
> Hi David
> 
> Would you mind giving more information? You mean list_empty and list_del_init
> is atomic?
> 

I mean your patch is obviously correct :)  It should likely also have a 
stable@vger.kernel.org # 5.4+

Acked-by: David Rientjes <rientjes@google.com>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
@ 2020-01-04  0:44       ` David Rientjes
  0 siblings, 0 replies; 24+ messages in thread
From: David Rientjes @ 2020-01-04  0:44 UTC (permalink / raw)
  To: Wei Yang
  Cc: hannes, mhocko, vdavydov.dev, akpm, kirill.shutemov, cgroups,
	linux-mm, linux-kernel, yang.shi

On Sat, 4 Jan 2020, Wei Yang wrote:

> On Fri, Jan 03, 2020 at 11:29:06AM -0800, David Rientjes wrote:
> >On Fri, 3 Jan 2020, Wei Yang wrote:
> >
> >> As all the other places, we grab the lock before manipulate the defer list.
> >> Current implementation may face a race condition.
> >> 
> >> Fixes: 87eaceb3faa5 ("mm: thp: make deferred split shrinker memcg aware")
> >> 
> >> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
> >> 
> >> ---
> >> I notice the difference during code reading and just confused about the
> >> difference. No specific test is done since limited knowledge about cgroup.
> >> 
> >> Maybe I miss something important?
> >
> >The check for !list_empty(page_deferred_list(page)) must certainly be 
> >serialized with doing list_del_init(page_deferred_list(page)).
> >
> 
> Hi David
> 
> Would you mind giving more information? You mean list_empty and list_del_init
> is atomic?
> 

I mean your patch is obviously correct :)  It should likely also have a 
stable@vger.kernel.org # 5.4+

Acked-by: David Rientjes <rientjes@google.com>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
  2020-01-04  0:44       ` David Rientjes
  (?)
@ 2020-01-06  1:20       ` Wei Yang
  -1 siblings, 0 replies; 24+ messages in thread
From: Wei Yang @ 2020-01-06  1:20 UTC (permalink / raw)
  To: David Rientjes
  Cc: Wei Yang, hannes, mhocko, vdavydov.dev, akpm, kirill.shutemov,
	cgroups, linux-mm, linux-kernel, yang.shi

On Fri, Jan 03, 2020 at 04:44:59PM -0800, David Rientjes wrote:
>On Sat, 4 Jan 2020, Wei Yang wrote:
>
>> On Fri, Jan 03, 2020 at 11:29:06AM -0800, David Rientjes wrote:
>> >On Fri, 3 Jan 2020, Wei Yang wrote:
>> >
>> >> As all the other places, we grab the lock before manipulate the defer list.
>> >> Current implementation may face a race condition.
>> >> 
>> >> Fixes: 87eaceb3faa5 ("mm: thp: make deferred split shrinker memcg aware")
>> >> 
>> >> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
>> >> 
>> >> ---
>> >> I notice the difference during code reading and just confused about the
>> >> difference. No specific test is done since limited knowledge about cgroup.
>> >> 
>> >> Maybe I miss something important?
>> >
>> >The check for !list_empty(page_deferred_list(page)) must certainly be 
>> >serialized with doing list_del_init(page_deferred_list(page)).
>> >
>> 
>> Hi David
>> 
>> Would you mind giving more information? You mean list_empty and list_del_init
>> is atomic?
>> 
>
>I mean your patch is obviously correct :)  It should likely also have a 
>stable@vger.kernel.org # 5.4+

Ah, my poor English ;-)

>
>Acked-by: David Rientjes <rientjes@google.com>

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
  2020-01-03 14:34 [RFC PATCH] mm: thp: grab the lock before manipulation defer list Wei Yang
  2020-01-03 19:29   ` David Rientjes
@ 2020-01-06 10:23 ` Michal Hocko
  2020-01-07  1:22   ` Wei Yang
  2020-01-06 16:18   ` Alexander Duyck
  2 siblings, 1 reply; 24+ messages in thread
From: Michal Hocko @ 2020-01-06 10:23 UTC (permalink / raw)
  To: Wei Yang
  Cc: hannes, vdavydov.dev, akpm, kirill.shutemov, cgroups, linux-mm,
	linux-kernel, yang.shi

On Fri 03-01-20 22:34:07, Wei Yang wrote:
> As all the other places, we grab the lock before manipulate the defer list.
> Current implementation may face a race condition.

Please always make sure to describe the effect of the change. Why a racy
list_empty check matters?

> Fixes: 87eaceb3faa5 ("mm: thp: make deferred split shrinker memcg aware")
> 
> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
> 
> ---
> I notice the difference during code reading and just confused about the
> difference. No specific test is done since limited knowledge about cgroup.
> 
> Maybe I miss something important?
> ---
>  mm/memcontrol.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index bc01423277c5..62b7ec34ef1a 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -5368,12 +5368,12 @@ static int mem_cgroup_move_account(struct page *page,
>  	}
>  
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +	spin_lock(&from->deferred_split_queue.split_queue_lock);
>  	if (compound && !list_empty(page_deferred_list(page))) {
> -		spin_lock(&from->deferred_split_queue.split_queue_lock);
>  		list_del_init(page_deferred_list(page));
>  		from->deferred_split_queue.split_queue_len--;
> -		spin_unlock(&from->deferred_split_queue.split_queue_lock);
>  	}
> +	spin_unlock(&from->deferred_split_queue.split_queue_lock);
>  #endif
>  	/*
>  	 * It is safe to change page->mem_cgroup here because the page
> @@ -5385,13 +5385,13 @@ static int mem_cgroup_move_account(struct page *page,
>  	page->mem_cgroup = to;
>  
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +	spin_lock(&to->deferred_split_queue.split_queue_lock);
>  	if (compound && list_empty(page_deferred_list(page))) {
> -		spin_lock(&to->deferred_split_queue.split_queue_lock);
>  		list_add_tail(page_deferred_list(page),
>  			      &to->deferred_split_queue.split_queue);
>  		to->deferred_split_queue.split_queue_len++;
> -		spin_unlock(&to->deferred_split_queue.split_queue_lock);
>  	}
> +	spin_unlock(&to->deferred_split_queue.split_queue_lock);
>  #endif
>  
>  	spin_unlock_irqrestore(&from->move_lock, flags);
> -- 
> 2.17.1

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
  2020-01-03 14:34 [RFC PATCH] mm: thp: grab the lock before manipulation defer list Wei Yang
@ 2020-01-06 16:18   ` Alexander Duyck
  2020-01-06 10:23 ` Michal Hocko
  2020-01-06 16:18   ` Alexander Duyck
  2 siblings, 0 replies; 24+ messages in thread
From: Alexander Duyck @ 2020-01-06 16:18 UTC (permalink / raw)
  To: Wei Yang
  Cc: Johannes Weiner, Michal Hocko, vdavydov.dev, Andrew Morton,
	Kirill A. Shutemov, cgroups, linux-mm, LKML, Yang Shi

On Fri, Jan 3, 2020 at 6:34 AM Wei Yang <richardw.yang@linux.intel.com> wrote:
>
> As all the other places, we grab the lock before manipulate the defer list.
> Current implementation may face a race condition.
>
> Fixes: 87eaceb3faa5 ("mm: thp: make deferred split shrinker memcg aware")
>
> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
>
> ---
> I notice the difference during code reading and just confused about the
> difference. No specific test is done since limited knowledge about cgroup.
>
> Maybe I miss something important?
> ---
>  mm/memcontrol.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index bc01423277c5..62b7ec34ef1a 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -5368,12 +5368,12 @@ static int mem_cgroup_move_account(struct page *page,
>         }
>
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +       spin_lock(&from->deferred_split_queue.split_queue_lock);
>         if (compound && !list_empty(page_deferred_list(page))) {
> -               spin_lock(&from->deferred_split_queue.split_queue_lock);
>                 list_del_init(page_deferred_list(page));
>                 from->deferred_split_queue.split_queue_len--;
> -               spin_unlock(&from->deferred_split_queue.split_queue_lock);
>         }
> +       spin_unlock(&from->deferred_split_queue.split_queue_lock);
>  #endif
>         /*
>          * It is safe to change page->mem_cgroup here because the page

So I suspect the lock placement has to do with the compound boolean
value passed to the function.

One thing you might want to do is pull the "if (compound)" check out
and place it outside of the spinlock check. It would then simplify
this signficantly so it is something like
if (compound) {
  spin_lock();
  list = page_deferred_list(page);
  if (!list_empty(list)) {
    list_del_init(list);
    from->..split_queue_len--;
  }
  spin_unlock();
}

Same for the block below. I would pull the check for compound outside
of the spinlock call since it is a value that shouldn't change and
would eliminate an unnecessary lock in the non-compound case.

> @@ -5385,13 +5385,13 @@ static int mem_cgroup_move_account(struct page *page,
>         page->mem_cgroup = to;
>
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +       spin_lock(&to->deferred_split_queue.split_queue_lock);
>         if (compound && list_empty(page_deferred_list(page))) {
> -               spin_lock(&to->deferred_split_queue.split_queue_lock);
>                 list_add_tail(page_deferred_list(page),
>                               &to->deferred_split_queue.split_queue);
>                 to->deferred_split_queue.split_queue_len++;
> -               spin_unlock(&to->deferred_split_queue.split_queue_lock);
>         }
> +       spin_unlock(&to->deferred_split_queue.split_queue_lock);
>  #endif
>
>         spin_unlock_irqrestore(&from->move_lock, flags);
> --

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
@ 2020-01-06 16:18   ` Alexander Duyck
  0 siblings, 0 replies; 24+ messages in thread
From: Alexander Duyck @ 2020-01-06 16:18 UTC (permalink / raw)
  To: Wei Yang
  Cc: Johannes Weiner, Michal Hocko, vdavydov.dev, Andrew Morton,
	Kirill A. Shutemov, cgroups, linux-mm, LKML, Yang Shi

On Fri, Jan 3, 2020 at 6:34 AM Wei Yang <richardw.yang@linux.intel.com> wrote:
>
> As all the other places, we grab the lock before manipulate the defer list.
> Current implementation may face a race condition.
>
> Fixes: 87eaceb3faa5 ("mm: thp: make deferred split shrinker memcg aware")
>
> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
>
> ---
> I notice the difference during code reading and just confused about the
> difference. No specific test is done since limited knowledge about cgroup.
>
> Maybe I miss something important?
> ---
>  mm/memcontrol.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index bc01423277c5..62b7ec34ef1a 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -5368,12 +5368,12 @@ static int mem_cgroup_move_account(struct page *page,
>         }
>
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +       spin_lock(&from->deferred_split_queue.split_queue_lock);
>         if (compound && !list_empty(page_deferred_list(page))) {
> -               spin_lock(&from->deferred_split_queue.split_queue_lock);
>                 list_del_init(page_deferred_list(page));
>                 from->deferred_split_queue.split_queue_len--;
> -               spin_unlock(&from->deferred_split_queue.split_queue_lock);
>         }
> +       spin_unlock(&from->deferred_split_queue.split_queue_lock);
>  #endif
>         /*
>          * It is safe to change page->mem_cgroup here because the page

So I suspect the lock placement has to do with the compound boolean
value passed to the function.

One thing you might want to do is pull the "if (compound)" check out
and place it outside of the spinlock check. It would then simplify
this signficantly so it is something like
if (compound) {
  spin_lock();
  list = page_deferred_list(page);
  if (!list_empty(list)) {
    list_del_init(list);
    from->..split_queue_len--;
  }
  spin_unlock();
}

Same for the block below. I would pull the check for compound outside
of the spinlock call since it is a value that shouldn't change and
would eliminate an unnecessary lock in the non-compound case.

> @@ -5385,13 +5385,13 @@ static int mem_cgroup_move_account(struct page *page,
>         page->mem_cgroup = to;
>
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +       spin_lock(&to->deferred_split_queue.split_queue_lock);
>         if (compound && list_empty(page_deferred_list(page))) {
> -               spin_lock(&to->deferred_split_queue.split_queue_lock);
>                 list_add_tail(page_deferred_list(page),
>                               &to->deferred_split_queue.split_queue);
>                 to->deferred_split_queue.split_queue_len++;
> -               spin_unlock(&to->deferred_split_queue.split_queue_lock);
>         }
> +       spin_unlock(&to->deferred_split_queue.split_queue_lock);
>  #endif
>
>         spin_unlock_irqrestore(&from->move_lock, flags);
> --

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
  2020-01-06 10:23 ` Michal Hocko
@ 2020-01-07  1:22   ` Wei Yang
  2020-01-07  8:38     ` Michal Hocko
  0 siblings, 1 reply; 24+ messages in thread
From: Wei Yang @ 2020-01-07  1:22 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Wei Yang, hannes, vdavydov.dev, akpm, kirill.shutemov, cgroups,
	linux-mm, linux-kernel, yang.shi

On Mon, Jan 06, 2020 at 11:23:45AM +0100, Michal Hocko wrote:
>On Fri 03-01-20 22:34:07, Wei Yang wrote:
>> As all the other places, we grab the lock before manipulate the defer list.
>> Current implementation may face a race condition.
>
>Please always make sure to describe the effect of the change. Why a racy
>list_empty check matters?
>

Hmm... access the list without proper lock leads to many bad behaviors.

For example, if we grab the lock after checking list_empty, the page may
already be removed from list in split_huge_page_list. And then list_del_init
would trigger bug.

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
  2020-01-06 16:18   ` Alexander Duyck
  (?)
@ 2020-01-07  1:26   ` Wei Yang
  2020-01-07  2:07       ` David Rientjes
  -1 siblings, 1 reply; 24+ messages in thread
From: Wei Yang @ 2020-01-07  1:26 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Wei Yang, Johannes Weiner, Michal Hocko, vdavydov.dev,
	Andrew Morton, Kirill A. Shutemov, cgroups, linux-mm, LKML,
	Yang Shi

On Mon, Jan 06, 2020 at 08:18:34AM -0800, Alexander Duyck wrote:
>On Fri, Jan 3, 2020 at 6:34 AM Wei Yang <richardw.yang@linux.intel.com> wrote:
>>
>> As all the other places, we grab the lock before manipulate the defer list.
>> Current implementation may face a race condition.
>>
>> Fixes: 87eaceb3faa5 ("mm: thp: make deferred split shrinker memcg aware")
>>
>> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
>>
>> ---
>> I notice the difference during code reading and just confused about the
>> difference. No specific test is done since limited knowledge about cgroup.
>>
>> Maybe I miss something important?
>> ---
>>  mm/memcontrol.c | 8 ++++----
>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>> index bc01423277c5..62b7ec34ef1a 100644
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -5368,12 +5368,12 @@ static int mem_cgroup_move_account(struct page *page,
>>         }
>>
>>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>> +       spin_lock(&from->deferred_split_queue.split_queue_lock);
>>         if (compound && !list_empty(page_deferred_list(page))) {
>> -               spin_lock(&from->deferred_split_queue.split_queue_lock);
>>                 list_del_init(page_deferred_list(page));
>>                 from->deferred_split_queue.split_queue_len--;
>> -               spin_unlock(&from->deferred_split_queue.split_queue_lock);
>>         }
>> +       spin_unlock(&from->deferred_split_queue.split_queue_lock);
>>  #endif
>>         /*
>>          * It is safe to change page->mem_cgroup here because the page
>
>So I suspect the lock placement has to do with the compound boolean
>value passed to the function.
>

Hey, Alexander

Thanks for your comment.

>One thing you might want to do is pull the "if (compound)" check out
>and place it outside of the spinlock check. It would then simplify
>this signficantly so it is something like
>if (compound) {
>  spin_lock();
>  list = page_deferred_list(page);
>  if (!list_empty(list)) {
>    list_del_init(list);
>    from->..split_queue_len--;
>  }
>  spin_unlock();
>}
>
>Same for the block below. I would pull the check for compound outside
>of the spinlock call since it is a value that shouldn't change and
>would eliminate an unnecessary lock in the non-compound case.

This is reasonable, if no objection from others, I would change this in v2.


-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
  2020-01-07  1:26   ` Wei Yang
@ 2020-01-07  2:07       ` David Rientjes
  0 siblings, 0 replies; 24+ messages in thread
From: David Rientjes @ 2020-01-07  2:07 UTC (permalink / raw)
  To: Wei Yang
  Cc: Alexander Duyck, Johannes Weiner, Michal Hocko, vdavydov.dev,
	Andrew Morton, Kirill A. Shutemov, cgroups, linux-mm, LKML,
	Yang Shi

On Tue, 7 Jan 2020, Wei Yang wrote:

> >One thing you might want to do is pull the "if (compound)" check out
> >and place it outside of the spinlock check. It would then simplify
> >this signficantly so it is something like
> >if (compound) {
> >  spin_lock();
> >  list = page_deferred_list(page);
> >  if (!list_empty(list)) {
> >    list_del_init(list);
> >    from->..split_queue_len--;
> >  }
> >  spin_unlock();
> >}
> >
> >Same for the block below. I would pull the check for compound outside
> >of the spinlock call since it is a value that shouldn't change and
> >would eliminate an unnecessary lock in the non-compound case.
> 
> This is reasonable, if no objection from others, I would change this in v2.

Looks fine to me; I don't see it as a necessary improvement but there's 
also no reason to object to it.  It's definitely a patch that is needed, 
however, for the simple reason that with the existing code we can 
manipulate the deferred split queue incorrectly so either way works for 
me.  Feel free to keep my acked-by.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
@ 2020-01-07  2:07       ` David Rientjes
  0 siblings, 0 replies; 24+ messages in thread
From: David Rientjes @ 2020-01-07  2:07 UTC (permalink / raw)
  To: Wei Yang
  Cc: Alexander Duyck, Johannes Weiner, Michal Hocko, vdavydov.dev,
	Andrew Morton, Kirill A. Shutemov, cgroups, linux-mm, LKML,
	Yang Shi

On Tue, 7 Jan 2020, Wei Yang wrote:

> >One thing you might want to do is pull the "if (compound)" check out
> >and place it outside of the spinlock check. It would then simplify
> >this signficantly so it is something like
> >if (compound) {
> >  spin_lock();
> >  list = page_deferred_list(page);
> >  if (!list_empty(list)) {
> >    list_del_init(list);
> >    from->..split_queue_len--;
> >  }
> >  spin_unlock();
> >}
> >
> >Same for the block below. I would pull the check for compound outside
> >of the spinlock call since it is a value that shouldn't change and
> >would eliminate an unnecessary lock in the non-compound case.
> 
> This is reasonable, if no objection from others, I would change this in v2.

Looks fine to me; I don't see it as a necessary improvement but there's 
also no reason to object to it.  It's definitely a patch that is needed, 
however, for the simple reason that with the existing code we can 
manipulate the deferred split queue incorrectly so either way works for 
me.  Feel free to keep my acked-by.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
  2020-01-07  2:07       ` David Rientjes
  (?)
@ 2020-01-07  2:33       ` Wei Yang
  -1 siblings, 0 replies; 24+ messages in thread
From: Wei Yang @ 2020-01-07  2:33 UTC (permalink / raw)
  To: David Rientjes
  Cc: Wei Yang, Alexander Duyck, Johannes Weiner, Michal Hocko,
	vdavydov.dev, Andrew Morton, Kirill A. Shutemov, cgroups,
	linux-mm, LKML, Yang Shi

On Mon, Jan 06, 2020 at 06:07:29PM -0800, David Rientjes wrote:
>On Tue, 7 Jan 2020, Wei Yang wrote:
>
>> >One thing you might want to do is pull the "if (compound)" check out
>> >and place it outside of the spinlock check. It would then simplify
>> >this signficantly so it is something like
>> >if (compound) {
>> >  spin_lock();
>> >  list = page_deferred_list(page);
>> >  if (!list_empty(list)) {
>> >    list_del_init(list);
>> >    from->..split_queue_len--;
>> >  }
>> >  spin_unlock();
>> >}
>> >
>> >Same for the block below. I would pull the check for compound outside
>> >of the spinlock call since it is a value that shouldn't change and
>> >would eliminate an unnecessary lock in the non-compound case.
>> 
>> This is reasonable, if no objection from others, I would change this in v2.
>
>Looks fine to me; I don't see it as a necessary improvement but there's 
>also no reason to object to it.  It's definitely a patch that is needed, 
>however, for the simple reason that with the existing code we can 
>manipulate the deferred split queue incorrectly so either way works for 
>me.  Feel free to keep my acked-by.

Ah, thanks David. You are so supportive.

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
  2020-01-07  1:22   ` Wei Yang
@ 2020-01-07  8:38     ` Michal Hocko
  2020-01-08  0:35       ` Wei Yang
  0 siblings, 1 reply; 24+ messages in thread
From: Michal Hocko @ 2020-01-07  8:38 UTC (permalink / raw)
  To: Wei Yang
  Cc: hannes, vdavydov.dev, akpm, kirill.shutemov, cgroups, linux-mm,
	linux-kernel, yang.shi

On Tue 07-01-20 09:22:41, Wei Yang wrote:
> On Mon, Jan 06, 2020 at 11:23:45AM +0100, Michal Hocko wrote:
> >On Fri 03-01-20 22:34:07, Wei Yang wrote:
> >> As all the other places, we grab the lock before manipulate the defer list.
> >> Current implementation may face a race condition.
> >
> >Please always make sure to describe the effect of the change. Why a racy
> >list_empty check matters?
> >
> 
> Hmm... access the list without proper lock leads to many bad behaviors.

My point is that the changelog should describe that bad behavior.

> For example, if we grab the lock after checking list_empty, the page may
> already be removed from list in split_huge_page_list. And then list_del_init
> would trigger bug.

And how does list_empty check under the lock guarantee that the page is
on the deferred list?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
  2020-01-07  8:38     ` Michal Hocko
@ 2020-01-08  0:35       ` Wei Yang
  2020-01-08  9:40         ` Michal Hocko
  0 siblings, 1 reply; 24+ messages in thread
From: Wei Yang @ 2020-01-08  0:35 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Wei Yang, hannes, vdavydov.dev, akpm, kirill.shutemov, cgroups,
	linux-mm, linux-kernel, yang.shi

On Tue, Jan 07, 2020 at 09:38:08AM +0100, Michal Hocko wrote:
>On Tue 07-01-20 09:22:41, Wei Yang wrote:
>> On Mon, Jan 06, 2020 at 11:23:45AM +0100, Michal Hocko wrote:
>> >On Fri 03-01-20 22:34:07, Wei Yang wrote:
>> >> As all the other places, we grab the lock before manipulate the defer list.
>> >> Current implementation may face a race condition.
>> >
>> >Please always make sure to describe the effect of the change. Why a racy
>> >list_empty check matters?
>> >
>> 
>> Hmm... access the list without proper lock leads to many bad behaviors.
>
>My point is that the changelog should describe that bad behavior.
>
>> For example, if we grab the lock after checking list_empty, the page may
>> already be removed from list in split_huge_page_list. And then list_del_init
>> would trigger bug.
>
>And how does list_empty check under the lock guarantee that the page is
>on the deferred list?

Just one confusion, is this kind of description basic concept of concurrent
programming? How detail level we need to describe the effect?

To me, grab the lock before accessing the critical section is obvious.
list_empty and list_del should be the critical section. And the
lock should protect the whole critical section instead of part of it.

>-- 
>Michal Hocko
>SUSE Labs

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
  2020-01-08  0:35       ` Wei Yang
@ 2020-01-08  9:40         ` Michal Hocko
  2020-01-09  2:03           ` Wei Yang
                             ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Michal Hocko @ 2020-01-08  9:40 UTC (permalink / raw)
  To: Wei Yang
  Cc: hannes, vdavydov.dev, akpm, kirill.shutemov, cgroups, linux-mm,
	linux-kernel, yang.shi

On Wed 08-01-20 08:35:43, Wei Yang wrote:
> On Tue, Jan 07, 2020 at 09:38:08AM +0100, Michal Hocko wrote:
> >On Tue 07-01-20 09:22:41, Wei Yang wrote:
> >> On Mon, Jan 06, 2020 at 11:23:45AM +0100, Michal Hocko wrote:
> >> >On Fri 03-01-20 22:34:07, Wei Yang wrote:
> >> >> As all the other places, we grab the lock before manipulate the defer list.
> >> >> Current implementation may face a race condition.
> >> >
> >> >Please always make sure to describe the effect of the change. Why a racy
> >> >list_empty check matters?
> >> >
> >> 
> >> Hmm... access the list without proper lock leads to many bad behaviors.
> >
> >My point is that the changelog should describe that bad behavior.
> >
> >> For example, if we grab the lock after checking list_empty, the page may
> >> already be removed from list in split_huge_page_list. And then list_del_init
> >> would trigger bug.
> >
> >And how does list_empty check under the lock guarantee that the page is
> >on the deferred list?
> 
> Just one confusion, is this kind of description basic concept of concurrent
> programming? How detail level we need to describe the effect?

When I write changelogs for patches like this I usually describe, what
is the potential race - e.g.
	CPU1			CPU2
	path1			path2
	  check			  lock
	  			    operation2
				  unlock
	    lock
	    # check might not hold anymore
	    operation1
	    unlock

and what is the effect of the race - e.g. a crash, data corruption,
pointless attempt for operation1 which fails with user visible effect
etc.
This helps reviewers and everybody reading the code in the future to
understand the locking scheme.

> To me, grab the lock before accessing the critical section is obvious.

It might be obvious but in many cases it is useful to minimize the
locking and do a potentially race check before the lock is taken if the
resulting operation can handle that.

> list_empty and list_del should be the critical section. And the
> lock should protect the whole critical section instead of part of it.

I am not disputing that. What I am trying to say is that the changelog
should described the problem in the first place.

Moreover, look at the code you are trying to fix. Sure extending the
locking seem straightforward but does it result in a correct code
though? See my question in the previous email. How do we know that the
page is actually enqued in a non-empty list?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
  2020-01-08  9:40         ` Michal Hocko
@ 2020-01-09  2:03           ` Wei Yang
  2020-01-09  8:34             ` Michal Hocko
  2020-01-09  2:03           ` Wei Yang
  2020-01-09  3:18           ` Wei Yang
  2 siblings, 1 reply; 24+ messages in thread
From: Wei Yang @ 2020-01-09  2:03 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Wei Yang, hannes, vdavydov.dev, akpm, kirill.shutemov, cgroups,
	linux-mm, linux-kernel, yang.shi

On Wed, Jan 08, 2020 at 10:40:41AM +0100, Michal Hocko wrote:
>On Wed 08-01-20 08:35:43, Wei Yang wrote:
>> On Tue, Jan 07, 2020 at 09:38:08AM +0100, Michal Hocko wrote:
>> >On Tue 07-01-20 09:22:41, Wei Yang wrote:
>> >> On Mon, Jan 06, 2020 at 11:23:45AM +0100, Michal Hocko wrote:
>> >> >On Fri 03-01-20 22:34:07, Wei Yang wrote:
>> >> >> As all the other places, we grab the lock before manipulate the defer list.
>> >> >> Current implementation may face a race condition.
>> >> >
>> >> >Please always make sure to describe the effect of the change. Why a racy
>> >> >list_empty check matters?
>> >> >
>> >> 
>> >> Hmm... access the list without proper lock leads to many bad behaviors.
>> >
>> >My point is that the changelog should describe that bad behavior.
>> >
>> >> For example, if we grab the lock after checking list_empty, the page may
>> >> already be removed from list in split_huge_page_list. And then list_del_init
>> >> would trigger bug.
>> >
>> >And how does list_empty check under the lock guarantee that the page is
>> >on the deferred list?
>> 
>> Just one confusion, is this kind of description basic concept of concurrent
>> programming? How detail level we need to describe the effect?
>
>When I write changelogs for patches like this I usually describe, what
>is the potential race - e.g.
>	CPU1			CPU2
>	path1			path2
>	  check			  lock
>	  			    operation2
>				  unlock
>	    lock
>	    # check might not hold anymore
>	    operation1
>	    unlock
>

Nice, I would prepare a changelog like this.

>and what is the effect of the race - e.g. a crash, data corruption,
>pointless attempt for operation1 which fails with user visible effect
>etc.
>This helps reviewers and everybody reading the code in the future to
>understand the locking scheme.
>
>> To me, grab the lock before accessing the critical section is obvious.
>
>It might be obvious but in many cases it is useful to minimize the
>locking and do a potentially race check before the lock is taken if the
>resulting operation can handle that.
>
>> list_empty and list_del should be the critical section. And the
>> lock should protect the whole critical section instead of part of it.
>
>I am not disputing that. What I am trying to say is that the changelog
>should described the problem in the first place.
>
>Moreover, look at the code you are trying to fix. Sure extending the
>locking seem straightforward but does it result in a correct code
>though? See my question in the previous email. How do we know that the
>page is actually enqued in a non-empty list?

I may not get your point for the last sentence.

The list_empty() doesn't check the queue is empty but check the list, here is
the page, is not enqueued into any list. Is this your concern?

>-- 
>Michal Hocko
>SUSE Labs

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
  2020-01-08  9:40         ` Michal Hocko
  2020-01-09  2:03           ` Wei Yang
@ 2020-01-09  2:03           ` Wei Yang
  2020-01-09  3:18           ` Wei Yang
  2 siblings, 0 replies; 24+ messages in thread
From: Wei Yang @ 2020-01-09  2:03 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Wei Yang, hannes, vdavydov.dev, akpm, kirill.shutemov, cgroups,
	linux-mm, linux-kernel, yang.shi

On Wed, Jan 08, 2020 at 10:40:41AM +0100, Michal Hocko wrote:
>On Wed 08-01-20 08:35:43, Wei Yang wrote:
>> On Tue, Jan 07, 2020 at 09:38:08AM +0100, Michal Hocko wrote:
>> >On Tue 07-01-20 09:22:41, Wei Yang wrote:
>> >> On Mon, Jan 06, 2020 at 11:23:45AM +0100, Michal Hocko wrote:
>> >> >On Fri 03-01-20 22:34:07, Wei Yang wrote:
>> >> >> As all the other places, we grab the lock before manipulate the defer list.
>> >> >> Current implementation may face a race condition.
>> >> >
>> >> >Please always make sure to describe the effect of the change. Why a racy
>> >> >list_empty check matters?
>> >> >
>> >> 
>> >> Hmm... access the list without proper lock leads to many bad behaviors.
>> >
>> >My point is that the changelog should describe that bad behavior.
>> >
>> >> For example, if we grab the lock after checking list_empty, the page may
>> >> already be removed from list in split_huge_page_list. And then list_del_init
>> >> would trigger bug.
>> >
>> >And how does list_empty check under the lock guarantee that the page is
>> >on the deferred list?
>> 
>> Just one confusion, is this kind of description basic concept of concurrent
>> programming? How detail level we need to describe the effect?
>
>When I write changelogs for patches like this I usually describe, what
>is the potential race - e.g.
>	CPU1			CPU2
>	path1			path2
>	  check			  lock
>	  			    operation2
>				  unlock
>	    lock
>	    # check might not hold anymore
>	    operation1
>	    unlock
>

Nice, I would prepare a changelog like this.

>and what is the effect of the race - e.g. a crash, data corruption,
>pointless attempt for operation1 which fails with user visible effect
>etc.
>This helps reviewers and everybody reading the code in the future to
>understand the locking scheme.
>
>> To me, grab the lock before accessing the critical section is obvious.
>
>It might be obvious but in many cases it is useful to minimize the
>locking and do a potentially race check before the lock is taken if the
>resulting operation can handle that.
>
>> list_empty and list_del should be the critical section. And the
>> lock should protect the whole critical section instead of part of it.
>
>I am not disputing that. What I am trying to say is that the changelog
>should described the problem in the first place.
>
>Moreover, look at the code you are trying to fix. Sure extending the
>locking seem straightforward but does it result in a correct code
>though? See my question in the previous email. How do we know that the
>page is actually enqued in a non-empty list?

I may not get your point for the last sentence.

The list_empty() doesn't check the queue is empty but check the list, here is
the page, is not enqueued into any list. Is this your concern?

>-- 
>Michal Hocko
>SUSE Labs

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
  2020-01-08  9:40         ` Michal Hocko
  2020-01-09  2:03           ` Wei Yang
  2020-01-09  2:03           ` Wei Yang
@ 2020-01-09  3:18           ` Wei Yang
  2020-01-09  8:36             ` Michal Hocko
  2 siblings, 1 reply; 24+ messages in thread
From: Wei Yang @ 2020-01-09  3:18 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Wei Yang, hannes, vdavydov.dev, akpm, kirill.shutemov, cgroups,
	linux-mm, linux-kernel, yang.shi

On Wed, Jan 08, 2020 at 10:40:41AM +0100, Michal Hocko wrote:
>On Wed 08-01-20 08:35:43, Wei Yang wrote:
>> On Tue, Jan 07, 2020 at 09:38:08AM +0100, Michal Hocko wrote:
>> >On Tue 07-01-20 09:22:41, Wei Yang wrote:
>> >> On Mon, Jan 06, 2020 at 11:23:45AM +0100, Michal Hocko wrote:
>> >> >On Fri 03-01-20 22:34:07, Wei Yang wrote:
>> >> >> As all the other places, we grab the lock before manipulate the defer list.
>> >> >> Current implementation may face a race condition.
>> >> >
>> >> >Please always make sure to describe the effect of the change. Why a racy
>> >> >list_empty check matters?
>> >> >
>> >> 
>> >> Hmm... access the list without proper lock leads to many bad behaviors.
>> >
>> >My point is that the changelog should describe that bad behavior.
>> >
>> >> For example, if we grab the lock after checking list_empty, the page may
>> >> already be removed from list in split_huge_page_list. And then list_del_init
>> >> would trigger bug.
>> >
>> >And how does list_empty check under the lock guarantee that the page is
>> >on the deferred list?
>> 
>> Just one confusion, is this kind of description basic concept of concurrent
>> programming? How detail level we need to describe the effect?
>
>When I write changelogs for patches like this I usually describe, what
>is the potential race - e.g.
>	CPU1			CPU2
>	path1			path2
>	  check			  lock
>	  			    operation2
>				  unlock
>	    lock
>	    # check might not hold anymore
>	    operation1
>	    unlock
>
>and what is the effect of the race - e.g. a crash, data corruption,
>pointless attempt for operation1 which fails with user visible effect
>etc.

Hi, Michal, here is my attempt for an example. Hope this one looks good to
you.


    For example, the potential race would be:
    
        CPU1                      CPU2
        mem_cgroup_move_account   split_huge_page_to_list
          !list_empty
                                    lock
                                    !list_empty
                                    list_del
                                    unlock
          lock
          # !list_empty might not hold anymore
          list_del_init
          unlock
    
    When this sequence happens, the list_del_init() in
    mem_cgroup_move_account() would crash since the page is already been
    removed by list_del in split_huge_page_to_list().


-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
  2020-01-09  2:03           ` Wei Yang
@ 2020-01-09  8:34             ` Michal Hocko
  0 siblings, 0 replies; 24+ messages in thread
From: Michal Hocko @ 2020-01-09  8:34 UTC (permalink / raw)
  To: Wei Yang
  Cc: hannes, vdavydov.dev, akpm, kirill.shutemov, cgroups, linux-mm,
	linux-kernel, yang.shi

On Thu 09-01-20 10:03:19, Wei Yang wrote:
> On Wed, Jan 08, 2020 at 10:40:41AM +0100, Michal Hocko wrote:
[...]
> >Moreover, look at the code you are trying to fix. Sure extending the
> >locking seem straightforward but does it result in a correct code
> >though? See my question in the previous email. How do we know that the
> >page is actually enqued in a non-empty list?
> 
> I may not get your point for the last sentence.
> 
> The list_empty() doesn't check the queue is empty but check the list, here is
> the page, is not enqueued into any list. Is this your concern?

My bad. For some reason I have misread the code and thought this was
get_deferred_split_queue rather than page_deferred_list. Sorry about the
confusion.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
  2020-01-09  3:18           ` Wei Yang
@ 2020-01-09  8:36             ` Michal Hocko
  2020-01-09  8:52               ` Wei Yang
  0 siblings, 1 reply; 24+ messages in thread
From: Michal Hocko @ 2020-01-09  8:36 UTC (permalink / raw)
  To: Wei Yang
  Cc: hannes, vdavydov.dev, akpm, kirill.shutemov, cgroups, linux-mm,
	linux-kernel, yang.shi

On Thu 09-01-20 11:18:21, Wei Yang wrote:
> On Wed, Jan 08, 2020 at 10:40:41AM +0100, Michal Hocko wrote:
> >On Wed 08-01-20 08:35:43, Wei Yang wrote:
> >> On Tue, Jan 07, 2020 at 09:38:08AM +0100, Michal Hocko wrote:
> >> >On Tue 07-01-20 09:22:41, Wei Yang wrote:
> >> >> On Mon, Jan 06, 2020 at 11:23:45AM +0100, Michal Hocko wrote:
> >> >> >On Fri 03-01-20 22:34:07, Wei Yang wrote:
> >> >> >> As all the other places, we grab the lock before manipulate the defer list.
> >> >> >> Current implementation may face a race condition.
> >> >> >
> >> >> >Please always make sure to describe the effect of the change. Why a racy
> >> >> >list_empty check matters?
> >> >> >
> >> >> 
> >> >> Hmm... access the list without proper lock leads to many bad behaviors.
> >> >
> >> >My point is that the changelog should describe that bad behavior.
> >> >
> >> >> For example, if we grab the lock after checking list_empty, the page may
> >> >> already be removed from list in split_huge_page_list. And then list_del_init
> >> >> would trigger bug.
> >> >
> >> >And how does list_empty check under the lock guarantee that the page is
> >> >on the deferred list?
> >> 
> >> Just one confusion, is this kind of description basic concept of concurrent
> >> programming? How detail level we need to describe the effect?
> >
> >When I write changelogs for patches like this I usually describe, what
> >is the potential race - e.g.
> >	CPU1			CPU2
> >	path1			path2
> >	  check			  lock
> >	  			    operation2
> >				  unlock
> >	    lock
> >	    # check might not hold anymore
> >	    operation1
> >	    unlock
> >
> >and what is the effect of the race - e.g. a crash, data corruption,
> >pointless attempt for operation1 which fails with user visible effect
> >etc.
> 
> Hi, Michal, here is my attempt for an example. Hope this one looks good to
> you.
> 
> 
>     For example, the potential race would be:
>     
>         CPU1                      CPU2
>         mem_cgroup_move_account   split_huge_page_to_list
>           !list_empty
>                                     lock
>                                     !list_empty
>                                     list_del
>                                     unlock
>           lock
>           # !list_empty might not hold anymore
>           list_del_init
>           unlock
>     
>     When this sequence happens, the list_del_init() in
>     mem_cgroup_move_account() would crash since the page is already been
>     removed by list_del in split_huge_page_to_list().

Yes this looks much more informative. I would just add that this will
crash if CONFIG_DEBUG_LIST.

Thanks!
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH] mm: thp: grab the lock before manipulation defer list
  2020-01-09  8:36             ` Michal Hocko
@ 2020-01-09  8:52               ` Wei Yang
  0 siblings, 0 replies; 24+ messages in thread
From: Wei Yang @ 2020-01-09  8:52 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Wei Yang, hannes, vdavydov.dev, akpm, kirill.shutemov, cgroups,
	linux-mm, linux-kernel, yang.shi

On Thu, Jan 09, 2020 at 09:36:41AM +0100, Michal Hocko wrote:
>On Thu 09-01-20 11:18:21, Wei Yang wrote:
>> On Wed, Jan 08, 2020 at 10:40:41AM +0100, Michal Hocko wrote:
>> >On Wed 08-01-20 08:35:43, Wei Yang wrote:
>> >> On Tue, Jan 07, 2020 at 09:38:08AM +0100, Michal Hocko wrote:
>> >> >On Tue 07-01-20 09:22:41, Wei Yang wrote:
>> >> >> On Mon, Jan 06, 2020 at 11:23:45AM +0100, Michal Hocko wrote:
>> >> >> >On Fri 03-01-20 22:34:07, Wei Yang wrote:
>> >> >> >> As all the other places, we grab the lock before manipulate the defer list.
>> >> >> >> Current implementation may face a race condition.
>> >> >> >
>> >> >> >Please always make sure to describe the effect of the change. Why a racy
>> >> >> >list_empty check matters?
>> >> >> >
>> >> >> 
>> >> >> Hmm... access the list without proper lock leads to many bad behaviors.
>> >> >
>> >> >My point is that the changelog should describe that bad behavior.
>> >> >
>> >> >> For example, if we grab the lock after checking list_empty, the page may
>> >> >> already be removed from list in split_huge_page_list. And then list_del_init
>> >> >> would trigger bug.
>> >> >
>> >> >And how does list_empty check under the lock guarantee that the page is
>> >> >on the deferred list?
>> >> 
>> >> Just one confusion, is this kind of description basic concept of concurrent
>> >> programming? How detail level we need to describe the effect?
>> >
>> >When I write changelogs for patches like this I usually describe, what
>> >is the potential race - e.g.
>> >	CPU1			CPU2
>> >	path1			path2
>> >	  check			  lock
>> >	  			    operation2
>> >				  unlock
>> >	    lock
>> >	    # check might not hold anymore
>> >	    operation1
>> >	    unlock
>> >
>> >and what is the effect of the race - e.g. a crash, data corruption,
>> >pointless attempt for operation1 which fails with user visible effect
>> >etc.
>> 
>> Hi, Michal, here is my attempt for an example. Hope this one looks good to
>> you.
>> 
>> 
>>     For example, the potential race would be:
>>     
>>         CPU1                      CPU2
>>         mem_cgroup_move_account   split_huge_page_to_list
>>           !list_empty
>>                                     lock
>>                                     !list_empty
>>                                     list_del
>>                                     unlock
>>           lock
>>           # !list_empty might not hold anymore
>>           list_del_init
>>           unlock
>>     
>>     When this sequence happens, the list_del_init() in
>>     mem_cgroup_move_account() would crash since the page is already been
>>     removed by list_del in split_huge_page_to_list().
>
>Yes this looks much more informative. I would just add that this will
>crash if CONFIG_DEBUG_LIST.
>
>Thanks!

Glad you like it~

Will prepare v2 with your suggestion :-)

>-- 
>Michal Hocko
>SUSE Labs

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2020-01-09  8:52 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-03 14:34 [RFC PATCH] mm: thp: grab the lock before manipulation defer list Wei Yang
2020-01-03 19:29 ` David Rientjes
2020-01-03 19:29   ` David Rientjes
2020-01-03 23:39   ` Wei Yang
2020-01-04  0:44     ` David Rientjes
2020-01-04  0:44       ` David Rientjes
2020-01-06  1:20       ` Wei Yang
2020-01-06 10:23 ` Michal Hocko
2020-01-07  1:22   ` Wei Yang
2020-01-07  8:38     ` Michal Hocko
2020-01-08  0:35       ` Wei Yang
2020-01-08  9:40         ` Michal Hocko
2020-01-09  2:03           ` Wei Yang
2020-01-09  8:34             ` Michal Hocko
2020-01-09  2:03           ` Wei Yang
2020-01-09  3:18           ` Wei Yang
2020-01-09  8:36             ` Michal Hocko
2020-01-09  8:52               ` Wei Yang
2020-01-06 16:18 ` Alexander Duyck
2020-01-06 16:18   ` Alexander Duyck
2020-01-07  1:26   ` Wei Yang
2020-01-07  2:07     ` David Rientjes
2020-01-07  2:07       ` David Rientjes
2020-01-07  2:33       ` Wei Yang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.