linux-mm.kvack.org archive mirror
* [PATCH] mm: vmalloc: Use the vmap_area_lock to protect ne_fit_preload_node
@ 2019-10-03  9:09 Daniel Wagner
  2019-10-03 11:55 ` Uladzislau Rezki
  2019-10-04 15:37 ` Sebastian Andrzej Siewior
  0 siblings, 2 replies; 19+ messages in thread
From: Daniel Wagner @ 2019-10-03  9:09 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, linux-rt-users, Andrew Morton, Daniel Wagner,
	Uladzislau Rezki

Replace the preempt_disable()/preempt_enable() pair with the
vmap_area_lock spin_lock. Calling spin_lock() with preemption disabled
is illegal for -rt. Furthermore, enabling preemption inside the
spin_lock()-protected section doesn't really make sense.
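
For reference, the sequence removed below (heavily simplified, the
allocation inside the if-block is elided) is:

	preempt_disable();
	if (!__this_cpu_read(ne_fit_preload_node)) {
		...
	}

	spin_lock(&vmap_area_lock);
	preempt_enable();

i.e. spin_lock() is acquired with preemption disabled, and preemption
is re-enabled only after the lock has been taken.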

Fixes: 82dd23e84be3 ("mm/vmalloc.c: preload a CPU with one object for split purpose")
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 mm/vmalloc.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 08c134aa7ff3..0d1175673583 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1091,11 +1091,11 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 	 * Even if it fails we do not really care about that. Just proceed
 	 * as it is. "overflow" path will refill the cache we allocate from.
 	 */
-	preempt_disable();
+	spin_lock(&vmap_area_lock);
 	if (!__this_cpu_read(ne_fit_preload_node)) {
-		preempt_enable();
+		spin_unlock(&vmap_area_lock);
 		pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
-		preempt_disable();
+		spin_lock(&vmap_area_lock);
 
 		if (__this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) {
 			if (pva)
@@ -1103,9 +1103,6 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 		}
 	}
 
-	spin_lock(&vmap_area_lock);
-	preempt_enable();
-
 	/*
 	 * If an allocation fails, the "vend" address is
 	 * returned. Therefore trigger the overflow path.
-- 
2.16.4




* Re: [PATCH] mm: vmalloc: Use the vmap_area_lock to protect ne_fit_preload_node
  2019-10-03  9:09 [PATCH] mm: vmalloc: Use the vmap_area_lock to protect ne_fit_preload_node Daniel Wagner
@ 2019-10-03 11:55 ` Uladzislau Rezki
  2019-10-04 15:37 ` Sebastian Andrzej Siewior
  1 sibling, 0 replies; 19+ messages in thread
From: Uladzislau Rezki @ 2019-10-03 11:55 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: linux-mm, linux-kernel, linux-rt-users, Andrew Morton, Uladzislau Rezki

On Thu, Oct 03, 2019 at 11:09:06AM +0200, Daniel Wagner wrote:
> Replace preempt_enable() and preempt_disable() with the vmap_area_lock
> spin_lock instead. Calling spin_lock() with preempt disabled is
> illegal for -rt. Furthermore, enabling preemption inside the
> spin_lock() doesn't really make sense.
> 
> Fixes: 82dd23e84be3 ("mm/vmalloc.c: preload a CPU with one object for
> split purpose")
> Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>  mm/vmalloc.c | 9 +++------
>  1 file changed, 3 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 08c134aa7ff3..0d1175673583 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -1091,11 +1091,11 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
>  	 * Even if it fails we do not really care about that. Just proceed
>  	 * as it is. "overflow" path will refill the cache we allocate from.
>  	 */
> -	preempt_disable();
> +	spin_lock(&vmap_area_lock);
>  	if (!__this_cpu_read(ne_fit_preload_node)) {
> -		preempt_enable();
> +		spin_unlock(&vmap_area_lock);
>  		pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
> -		preempt_disable();
> +		spin_lock(&vmap_area_lock);
>  
>  		if (__this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) {
>  			if (pva)
> @@ -1103,9 +1103,6 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
>  		}
>  	}
>  
> -	spin_lock(&vmap_area_lock);
> -	preempt_enable();
> -
>  	/*
>  	 * If an allocation fails, the "vend" address is
>  	 * returned. Therefore trigger the overflow path.
> -- 
> 2.16.4
> 
Some background: the idea was to avoid touching the vmap_area_lock
several times, therefore preempt_disable()/preempt_enable() are used
instead, in order to stay on the same CPU.

When it comes to PREEMPT_RT this is a problem, so

Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>

--
Vlad Rezki



* Re: [PATCH] mm: vmalloc: Use the vmap_area_lock to protect ne_fit_preload_node
  2019-10-03  9:09 [PATCH] mm: vmalloc: Use the vmap_area_lock to protect ne_fit_preload_node Daniel Wagner
  2019-10-03 11:55 ` Uladzislau Rezki
@ 2019-10-04 15:37 ` Sebastian Andrzej Siewior
  2019-10-04 16:20   ` Uladzislau Rezki
  2019-10-07  8:27   ` Daniel Wagner
  1 sibling, 2 replies; 19+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-10-04 15:37 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: linux-mm, linux-kernel, linux-rt-users, Andrew Morton, Uladzislau Rezki

If you post something that is related to PREEMPT_RT please keep tglx and
me in Cc.

On 2019-10-03 11:09:06 [+0200], Daniel Wagner wrote:
> Replace preempt_enable() and preempt_disable() with the vmap_area_lock
> spin_lock instead. Calling spin_lock() with preempt disabled is
> illegal for -rt. Furthermore, enabling preemption inside the
> spin_lock() doesn't really make sense.

This spin_lock will cause all CPUs to block on it while the
preempt_disable() does not have this limitation.
I added a migrate_disable() in my -next tree. Looking at it again, I
have reasonable doubt that this preempt_disable() is needed.

> Fixes: 82dd23e84be3 ("mm/vmalloc.c: preload a CPU with one object for
> split purpose")
> Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>  mm/vmalloc.c | 9 +++------
>  1 file changed, 3 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 08c134aa7ff3..0d1175673583 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -1091,11 +1091,11 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
>  	 * Even if it fails we do not really care about that. Just proceed
>  	 * as it is. "overflow" path will refill the cache we allocate from.
>  	 */
> -	preempt_disable();
> +	spin_lock(&vmap_area_lock);
>  	if (!__this_cpu_read(ne_fit_preload_node)) {
> -		preempt_enable();
> +		spin_unlock(&vmap_area_lock);
>  		pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
> -		preempt_disable();
> +		spin_lock(&vmap_area_lock);
>  
>  		if (__this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) {
>  			if (pva)
> @@ -1103,9 +1103,6 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
>  		}
>  	}
>  
> -	spin_lock(&vmap_area_lock);
> -	preempt_enable();
> -
>  	/*
>  	 * If an allocation fails, the "vend" address is
>  	 * returned. Therefore trigger the overflow path.
> -- 

Sebastian



* Re: [PATCH] mm: vmalloc: Use the vmap_area_lock to protect ne_fit_preload_node
  2019-10-04 15:37 ` Sebastian Andrzej Siewior
@ 2019-10-04 16:20   ` Uladzislau Rezki
  2019-10-04 16:30     ` Sebastian Andrzej Siewior
  2019-10-07  8:27   ` Daniel Wagner
  1 sibling, 1 reply; 19+ messages in thread
From: Uladzislau Rezki @ 2019-10-04 16:20 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Daniel Wagner, linux-mm, linux-kernel, linux-rt-users,
	Andrew Morton, Uladzislau Rezki

On Fri, Oct 04, 2019 at 05:37:28PM +0200, Sebastian Andrzej Siewior wrote:
> If you post something that is related to PREEMPT_RT please keep tglx and
> me in Cc.
> 
> On 2019-10-03 11:09:06 [+0200], Daniel Wagner wrote:
> > Replace preempt_enable() and preempt_disable() with the vmap_area_lock
> > spin_lock instead. Calling spin_lock() with preempt disabled is
> > illegal for -rt. Furthermore, enabling preemption inside the
> 
> Looking at it again, I have reasonable doubt that this
> preempt_disable() is needed.
> 
The intention was to preload the current CPU with one extra object in
non-atomic context, and thus to use the permissive GFP_KERNEL parameters.
I.e. that allows us to avoid any allocation (if we stay on the same CPU)
when we do the splitting in atomic context.

If we have migrate_disable()/migrate_enable(), then I think
preempt_disable()/preempt_enable() should be replaced by it, and not in
the way it has been proposed in this patch.
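
Roughly something like the below (just a sketch of a direct
substitution, assuming an -rt tree that provides
migrate_disable()/migrate_enable(); it is not a tested patch):

<snip>
	migrate_disable();
	if (!__this_cpu_read(ne_fit_preload_node)) {
		migrate_enable();
		pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
		migrate_disable();

		/* We may have migrated during the allocation; if this CPU
		 * already has a preloaded object, free the spare one. */
		if (__this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) {
			if (pva)
				kmem_cache_free(vmap_area_cachep, pva);
		}
	}

	spin_lock(&vmap_area_lock);
	migrate_enable();
<snip>

That keeps us on the same CPU across the preload check without
disabling preemption, so taking the sleeping spin_lock on -rt stays
legal.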

--
Vlad Rezki



* Re: [PATCH] mm: vmalloc: Use the vmap_area_lock to protect ne_fit_preload_node
  2019-10-04 16:20   ` Uladzislau Rezki
@ 2019-10-04 16:30     ` Sebastian Andrzej Siewior
  2019-10-04 17:04       ` Uladzislau Rezki
  2019-10-07  8:30       ` Daniel Wagner
  0 siblings, 2 replies; 19+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-10-04 16:30 UTC (permalink / raw)
  To: Uladzislau Rezki
  Cc: Daniel Wagner, linux-mm, linux-kernel, linux-rt-users, Andrew Morton

On 2019-10-04 18:20:41 [+0200], Uladzislau Rezki wrote:
> On Fri, Oct 04, 2019 at 05:37:28PM +0200, Sebastian Andrzej Siewior wrote:
> > If you post something that is related to PREEMPT_RT please keep tglx and
> > me in Cc.
> > 
> > On 2019-10-03 11:09:06 [+0200], Daniel Wagner wrote:
> > > Replace preempt_enable() and preempt_disable() with the vmap_area_lock
> > > spin_lock instead. Calling spin_lock() with preempt disabled is
> > > illegal for -rt. Furthermore, enabling preemption inside the
> > 
> > Looking at it again, I have reasonable doubt that this
> > preempt_disable() is needed.
> > 
> The intention was to preload a current CPU with one extra object in
> non-atomic context, thus to use GFP_KERNEL permissive parameters. I.e.
> that allows us to avoid any allocation(if we stay on the same CPU)
> when we are in atomic context do splitting.

You could have been migrated to another CPU before the first
preempt_disable(). You could have been migrated to another CPU while
the memory was being allocated.
I don't really see the point of that preempt_disable() besides keeping
debug code quiet.

> If we have migrate_disable/enable, then, i think preempt_enable/disable
> should be replaced by it and not the way how it has been proposed
> in the patch.

I don't think this patch is appropriate for upstream.

Sebastian



* Re: [PATCH] mm: vmalloc: Use the vmap_area_lock to protect ne_fit_preload_node
  2019-10-04 16:30     ` Sebastian Andrzej Siewior
@ 2019-10-04 17:04       ` Uladzislau Rezki
  2019-10-04 17:45         ` Sebastian Andrzej Siewior
  2019-10-07  8:30       ` Daniel Wagner
  1 sibling, 1 reply; 19+ messages in thread
From: Uladzislau Rezki @ 2019-10-04 17:04 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Uladzislau Rezki, Daniel Wagner, linux-mm, linux-kernel,
	linux-rt-users, Andrew Morton

>
> You could have been migrated to another CPU while
> memory has been allocated.
> 
It is true that we can migrate, since we allow preemption when we
allocate. But it does not really matter on which CPU an allocation
occurs, or whether we migrate or not.

If we land on another CPU or still stay on the same one, we will check
one more time anyway whether it (the other/same CPU) is preloaded
or not:

<snip>
    preempt_disable();
    if (__this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)
<snip>

If it is another CPU and it already has a preloaded object, we can free
the object allocated in the previous step. If the other CPU does not
have one, we save it in ne_fit_preload_node for the now-current CPU to
reuse later. Further on we cannot migrate because of:

<snip>
    spin_lock(&vmap_area_lock);
    preempt_enable();
<snip>
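
Putting the two snippets together, the full mainline sequence under
discussion (simplified from the hunk the patch removes) is:

<snip>
	preempt_disable();
	if (!__this_cpu_read(ne_fit_preload_node)) {
		preempt_enable();
		pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
		preempt_disable();

		/* Landed on an already preloaded CPU: free the spare object. */
		if (__this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) {
			if (pva)
				kmem_cache_free(vmap_area_cachep, pva);
		}
	}

	spin_lock(&vmap_area_lock);
	preempt_enable();
<snip>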

--
Vlad Rezki



* Re: [PATCH] mm: vmalloc: Use the vmap_area_lock to protect ne_fit_preload_node
  2019-10-04 17:04       ` Uladzislau Rezki
@ 2019-10-04 17:45         ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 19+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-10-04 17:45 UTC (permalink / raw)
  To: Uladzislau Rezki
  Cc: Daniel Wagner, linux-mm, linux-kernel, linux-rt-users, Andrew Morton

On 2019-10-04 19:04:11 [+0200], Uladzislau Rezki wrote:
> if another, we can free the object allocated on previous step if
> it already has it. If another CPU does not have it, save it in 
> ne_fit_preload_node for another current CPU to reuse later. Further
> we can not migrate because of:
> 
> <snip>
>     spin_lock(&vmap_area_lock);
>     preempt_enable();
> <snip>

Ah right. So you keep the lock held later on, I somehow missed that part.
That way it actually makes sense.

Sebastian



* Re: [PATCH] mm: vmalloc: Use the vmap_area_lock to protect ne_fit_preload_node
  2019-10-04 15:37 ` Sebastian Andrzej Siewior
  2019-10-04 16:20   ` Uladzislau Rezki
@ 2019-10-07  8:27   ` Daniel Wagner
  1 sibling, 0 replies; 19+ messages in thread
From: Daniel Wagner @ 2019-10-07  8:27 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-mm, linux-kernel, linux-rt-users, Andrew Morton, Uladzislau Rezki

On Fri, Oct 04, 2019 at 05:37:28PM +0200, Sebastian Andrzej Siewior wrote:
> If you post something that is related to PREEMPT_RT please keep tglx and
> me in Cc.

Sure, just forgot to add it this time. My email setup needs a bit more
love. Sorry.



* Re: [PATCH] mm: vmalloc: Use the vmap_area_lock to protect ne_fit_preload_node
  2019-10-04 16:30     ` Sebastian Andrzej Siewior
  2019-10-04 17:04       ` Uladzislau Rezki
@ 2019-10-07  8:30       ` Daniel Wagner
  2019-10-07 10:56         ` Sebastian Andrzej Siewior
  1 sibling, 1 reply; 19+ messages in thread
From: Daniel Wagner @ 2019-10-07  8:30 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Uladzislau Rezki, linux-mm, linux-kernel, linux-rt-users, Andrew Morton

Hi,

On Fri, Oct 04, 2019 at 06:30:42PM +0200, Sebastian Andrzej Siewior wrote:
> On 2019-10-04 18:20:41 [+0200], Uladzislau Rezki wrote:
> > If we have migrate_disable/enable, then, i think preempt_enable/disable
> > should be replaced by it and not the way how it has been proposed
> > in the patch.
> 
> I don't think this patch is appropriate for upstream.

Yes, I agree. The discussion made this clear: this is only for -rt
trees. Initially I thought this should be in mainline too.

Thanks,
Daniel



* Re: [PATCH] mm: vmalloc: Use the vmap_area_lock to protect ne_fit_preload_node
  2019-10-07  8:30       ` Daniel Wagner
@ 2019-10-07 10:56         ` Sebastian Andrzej Siewior
  2019-10-07 16:23           ` Uladzislau Rezki
  0 siblings, 1 reply; 19+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-10-07 10:56 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: Uladzislau Rezki, linux-mm, linux-kernel, linux-rt-users, Andrew Morton

On 2019-10-07 10:30:37 [+0200], Daniel Wagner wrote:
> Hi,
Hi Daniel,

> On Fri, Oct 04, 2019 at 06:30:42PM +0200, Sebastian Andrzej Siewior wrote:
> > On 2019-10-04 18:20:41 [+0200], Uladzislau Rezki wrote:
> > > If we have migrate_disable/enable, then, i think preempt_enable/disable
> > > should be replaced by it and not the way how it has been proposed
> > > in the patch.
> > 
> > I don't think this patch is appropriate for upstream.
> 
> Yes, I agree. The discussion made this clear, this is only for -rt
> trees. Initially I though this should be in mainline too.

Sorry, this was _before_ Uladzislau pointed out that you *just* moved
the lock that was there from the beginning. I missed that while looking
over the patch. Based on that I no longer think that this patch is
inappropriate for upstream.

> Thanks,
> Daniel

Sebastian



* Re: [PATCH] mm: vmalloc: Use the vmap_area_lock to protect ne_fit_preload_node
  2019-10-07 10:56         ` Sebastian Andrzej Siewior
@ 2019-10-07 16:23           ` Uladzislau Rezki
  2019-10-07 16:34             ` Daniel Wagner
  0 siblings, 1 reply; 19+ messages in thread
From: Uladzislau Rezki @ 2019-10-07 16:23 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Daniel Wagner, Uladzislau Rezki, linux-mm, linux-kernel,
	linux-rt-users, Andrew Morton

Hello, Daniel, Sebastian.

> > On Fri, Oct 04, 2019 at 06:30:42PM +0200, Sebastian Andrzej Siewior wrote:
> > > On 2019-10-04 18:20:41 [+0200], Uladzislau Rezki wrote:
> > > > If we have migrate_disable/enable, then, i think preempt_enable/disable
> > > > should be replaced by it and not the way how it has been proposed
> > > > in the patch.
> > > 
> > > I don't think this patch is appropriate for upstream.
> > 
> > Yes, I agree. The discussion made this clear, this is only for -rt
> > trees. Initially I though this should be in mainline too.
> 
> Sorry, this was _before_ Uladzislau pointed out that you *just* moved
> the lock that was there from the beginning. I missed that while looking
> over the patch. Based on that I don't think that this patch is not
> appropriate for upstream.
> 
Yes, that is a bit messy :) Then I do not see what that patch fixes in
mainline. Instead it will just add extra blocking; I did not want that,
which is why I used preempt_disable()/preempt_enable(). But when I saw
this patch I took it as preparation for the PREEMPT_RT merging work.

--
Vlad Rezki



* Re: [PATCH] mm: vmalloc: Use the vmap_area_lock to protect ne_fit_preload_node
  2019-10-07 16:23           ` Uladzislau Rezki
@ 2019-10-07 16:34             ` Daniel Wagner
  2019-10-07 16:56               ` Uladzislau Rezki
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel Wagner @ 2019-10-07 16:34 UTC (permalink / raw)
  To: Uladzislau Rezki
  Cc: Sebastian Andrzej Siewior, linux-mm, linux-kernel,
	linux-rt-users, Andrew Morton

On Mon, Oct 07, 2019 at 06:23:30PM +0200, Uladzislau Rezki wrote:
> Hello, Daniel, Sebastian.
> 
> > > On Fri, Oct 04, 2019 at 06:30:42PM +0200, Sebastian Andrzej Siewior wrote:
> > > > On 2019-10-04 18:20:41 [+0200], Uladzislau Rezki wrote:
> > > > > If we have migrate_disable/enable, then, i think preempt_enable/disable
> > > > > should be replaced by it and not the way how it has been proposed
> > > > > in the patch.
> > > > 
> > > > I don't think this patch is appropriate for upstream.
> > > 
> > > Yes, I agree. The discussion made this clear, this is only for -rt
> > > trees. Initially I though this should be in mainline too.
> > 
> > Sorry, this was _before_ Uladzislau pointed out that you *just* moved
> > the lock that was there from the beginning. I missed that while looking
> > over the patch. Based on that I don't think that this patch is not
> > appropriate for upstream.
> > 
> Yes that is a bit messy :) Then i do not see what that patch fixes in
> mainline? Instead it will just add an extra blocking, i did not want that
> therefore used preempt_enable/disable. But, when i saw this patch i got it
> as a preparation of PREEMPT_RT merging work.

Maybe I should add some background info here as well. Currently, I am
creating an -rt tree on v5.3 for which I need this patch (or a
migrate_disable() version of it). So this is slightly independent of
the work Sebastian is doing, though the mainline effort of PREEMPT_RT
will hit this problem as well.

I understood Sebastian wrong above. I thought he suggested using the
migrate_disable() approach even for mainline.

I suppose one thing which would help in this discussion is: what do
you gain by using preempt_disable() instead of moving the lock up?
Do you have performance numbers which could justify the code?



* Re: [PATCH] mm: vmalloc: Use the vmap_area_lock to protect ne_fit_preload_node
  2019-10-07 16:34             ` Daniel Wagner
@ 2019-10-07 16:56               ` Uladzislau Rezki
  2019-10-07 17:22                 ` Daniel Wagner
  2019-10-07 17:36                 ` Sebastian Andrzej Siewior
  0 siblings, 2 replies; 19+ messages in thread
From: Uladzislau Rezki @ 2019-10-07 16:56 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: Uladzislau Rezki, Sebastian Andrzej Siewior, linux-mm,
	linux-kernel, linux-rt-users, Andrew Morton

On Mon, Oct 07, 2019 at 06:34:43PM +0200, Daniel Wagner wrote:
> On Mon, Oct 07, 2019 at 06:23:30PM +0200, Uladzislau Rezki wrote:
> > Hello, Daniel, Sebastian.
> > 
> > > > On Fri, Oct 04, 2019 at 06:30:42PM +0200, Sebastian Andrzej Siewior wrote:
> > > > > On 2019-10-04 18:20:41 [+0200], Uladzislau Rezki wrote:
> > > > > > If we have migrate_disable/enable, then, i think preempt_enable/disable
> > > > > > should be replaced by it and not the way how it has been proposed
> > > > > > in the patch.
> > > > > 
> > > > > I don't think this patch is appropriate for upstream.
> > > > 
> > > > Yes, I agree. The discussion made this clear, this is only for -rt
> > > > trees. Initially I though this should be in mainline too.
> > > 
> > > Sorry, this was _before_ Uladzislau pointed out that you *just* moved
> > > the lock that was there from the beginning. I missed that while looking
> > > over the patch. Based on that I don't think that this patch is not
> > > appropriate for upstream.
> > > 
> > Yes that is a bit messy :) Then i do not see what that patch fixes in
> > mainline? Instead it will just add an extra blocking, i did not want that
> > therefore used preempt_enable/disable. But, when i saw this patch i got it
> > as a preparation of PREEMPT_RT merging work.
> 
> Maybe I should add some background info here as well. Currently, I am
> creating an -rt tree on v5.3 for which I need this patch (or a
> migrate_disable() version of it). So this is slightly independent of
> the work Sebiastian is doing. Though the mainline effort of PREEMPT_RT
> will hit this problem as well.
> 
> I understood Sebiastian wrong above. I thought he suggest to use the
> migrate_disable() approach even for mainline. 
> 
> I supppose, one thing which would help in this discussion, is what do
> you gain by using preempt_disable() instead of moving the lock up?
> Do you have performance numbers which could justify the code?
>
Actually there is high lock contention on vmap_area_lock, because it
is still global. You can have a look at the last slide:

https://linuxplumbersconf.org/event/4/contributions/547/attachments/287/479/Reworking_of_KVA_allocator_in_Linux_kernel.pdf

so this change will make it a bit higher. On the other hand I agree
that for -rt it should be fixed; probably it could be done like:

#ifdef PREEMPT_RT
    migrate_disable()
#else
    preempt_disable()
...

but I am not sure it is good either.

--
Vlad Rezki




* Re: [PATCH] mm: vmalloc: Use the vmap_area_lock to protect ne_fit_preload_node
  2019-10-07 16:56               ` Uladzislau Rezki
@ 2019-10-07 17:22                 ` Daniel Wagner
  2019-10-07 17:36                 ` Sebastian Andrzej Siewior
  1 sibling, 0 replies; 19+ messages in thread
From: Daniel Wagner @ 2019-10-07 17:22 UTC (permalink / raw)
  To: Uladzislau Rezki
  Cc: Sebastian Andrzej Siewior, linux-mm, linux-kernel,
	linux-rt-users, Andrew Morton

On Mon, Oct 07, 2019 at 06:56:11PM +0200, Uladzislau Rezki wrote:
> On Mon, Oct 07, 2019 at 06:34:43PM +0200, Daniel Wagner wrote:
> > I supppose, one thing which would help in this discussion, is what do
> > you gain by using preempt_disable() instead of moving the lock up?
> > Do you have performance numbers which could justify the code?
> >
> Actually there is a high lock contention on vmap_area_lock, because it
> is still global. You can have a look at last slide:
> 
> https://linuxplumbersconf.org/event/4/contributions/547/attachments/287/479/Reworking_of_KVA_allocator_in_Linux_kernel.pdf
> 
> so this change will make it a bit higher.

Thanks! I suspected something like this :(

On the todo-list page you state that vmap_area_lock could be split
and the contention thereby reduced. If you could avoid those
preempt_disable() tricks and just use plain spin_lock()s for
protection, that would be really helpful.

> From the other hand i agree
> that for rt it should be fixed, probably it could be done like:
> 
> ifdef PREEMPT_RT
>     migrate_disable()
> #else
>     preempt_disable()
> ...
> 
> but i am not sure it is good either.

I don't think this is the way to go. I guess Sebastian and Thomas have
a better idea of how to address this for PREEMPT_RT.



* Re: [PATCH] mm: vmalloc: Use the vmap_area_lock to protect ne_fit_preload_node
  2019-10-07 16:56               ` Uladzislau Rezki
  2019-10-07 17:22                 ` Daniel Wagner
@ 2019-10-07 17:36                 ` Sebastian Andrzej Siewior
  2019-10-07 21:44                   ` Uladzislau Rezki
  1 sibling, 1 reply; 19+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-10-07 17:36 UTC (permalink / raw)
  To: Uladzislau Rezki
  Cc: Daniel Wagner, linux-mm, linux-kernel, linux-rt-users, Andrew Morton

On 2019-10-07 18:56:11 [+0200], Uladzislau Rezki wrote:
> Actually there is a high lock contention on vmap_area_lock, because it
> is still global. You can have a look at last slide:
> 
> https://linuxplumbersconf.org/event/4/contributions/547/attachments/287/479/Reworking_of_KVA_allocator_in_Linux_kernel.pdf
> 
> so this change will make it a bit higher. From the other hand i agree
> that for rt it should be fixed, probably it could be done like:
> 
> ifdef PREEMPT_RT
>     migrate_disable()
> #else
>     preempt_disable()
> ...
> 
> but i am not sure it is good either.

What is to be expected on average? Is the lock acquired and then
released again because the slot is empty and memory needs to be
allocated or can it be assumed that this hardly happens? 

Sebastian



* Re: [PATCH] mm: vmalloc: Use the vmap_area_lock to protect ne_fit_preload_node
  2019-10-07 17:36                 ` Sebastian Andrzej Siewior
@ 2019-10-07 21:44                   ` Uladzislau Rezki
  2019-10-08 16:04                     ` Uladzislau Rezki
  0 siblings, 1 reply; 19+ messages in thread
From: Uladzislau Rezki @ 2019-10-07 21:44 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Uladzislau Rezki, Daniel Wagner, linux-mm, linux-kernel,
	linux-rt-users, Andrew Morton

On Mon, Oct 07, 2019 at 07:36:44PM +0200, Sebastian Andrzej Siewior wrote:
> On 2019-10-07 18:56:11 [+0200], Uladzislau Rezki wrote:
> > Actually there is a high lock contention on vmap_area_lock, because it
> > is still global. You can have a look at last slide:
> > 
> > https://linuxplumbersconf.org/event/4/contributions/547/attachments/287/479/Reworking_of_KVA_allocator_in_Linux_kernel.pdf
> > 
> > so this change will make it a bit higher. From the other hand i agree
> > that for rt it should be fixed, probably it could be done like:
> > 
> > ifdef PREEMPT_RT
> >     migrate_disable()
> > #else
> >     preempt_disable()
> > ...
> > 
> > but i am not sure it is good either.
> 
> What is to be expected on average? Is the lock acquired and then
> released again because the slot is empty and memory needs to be
> allocated or can it be assumed that this hardly happens? 
> 
The lock is not released (we are not allowed to), instead we just try
to allocate with the GFP_NOWAIT flag. That can happen if the
preallocation with the GFP_KERNEL flag failed earlier:

<snip>
...
 } else if (type == NE_FIT_TYPE) {
  /*
   * Split no edge of fit VA.
   *
   *     |       |
   *   L V  NVA  V R
   * |---|-------|---|
   */
  lva = __this_cpu_xchg(ne_fit_preload_node, NULL);
  if (unlikely(!lva)) {
      ...
      lva = kmem_cache_alloc(vmap_area_cachep, GFP_NOWAIT);
      ...
  }
...
<snip>

How often we need an extra object for the split purpose: the answer is
that it depends. For example the fork() path falls into that pattern.

I think we can assume that migration hardly ever happens and should be
considered a rare case. Thus we can do the preloading without worrying
much if it occurs:

<snip>
urezki@pc636:~/data/ssd/coding/linux-stable$ git diff
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index e92ff5f7dd8b..bc782edcd1fd 100644 
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1089,20 +1089,16 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
         * Even if it fails we do not really care about that. Just proceed
         * as it is. "overflow" path will refill the cache we allocate from.
         */
-       preempt_disable();
-       if (!__this_cpu_read(ne_fit_preload_node)) {
-               preempt_enable();
+       if (!this_cpu_read(ne_fit_preload_node)) {
                pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
-               preempt_disable();

-               if (__this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) {
+               if (this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) {
                        if (pva)
                                kmem_cache_free(vmap_area_cachep, pva);
                }
        }
 
        spin_lock(&vmap_area_lock);
-       preempt_enable();

        /*
         * If an allocation fails, the "vend" address is
urezki@pc636:~/data/ssd/coding/linux-stable$
<snip>

So, we do not guarantee it, instead we minimize the number of
allocations with the GFP_NOWAIT flag. For example on my 4xCPUs system I
am not able to even trigger the case where the CPU is not preloaded.

I can test it tomorrow on my 12xCPUs system to see its behavior there.

--
Vlad Rezki



* Re: [PATCH] mm: vmalloc: Use the vmap_area_lock to protect ne_fit_preload_node
  2019-10-07 21:44                   ` Uladzislau Rezki
@ 2019-10-08 16:04                     ` Uladzislau Rezki
  2019-10-09  6:05                       ` Daniel Wagner
  0 siblings, 1 reply; 19+ messages in thread
From: Uladzislau Rezki @ 2019-10-08 16:04 UTC (permalink / raw)
  To: Uladzislau Rezki
  Cc: Sebastian Andrzej Siewior, Daniel Wagner, linux-mm, linux-kernel,
	linux-rt-users, Andrew Morton

On Mon, Oct 07, 2019 at 11:44:20PM +0200, Uladzislau Rezki wrote:
> On Mon, Oct 07, 2019 at 07:36:44PM +0200, Sebastian Andrzej Siewior wrote:
> > On 2019-10-07 18:56:11 [+0200], Uladzislau Rezki wrote:
> > > Actually there is a high lock contention on vmap_area_lock, because it
> > > is still global. You can have a look at last slide:
> > > 
> > > https://linuxplumbersconf.org/event/4/contributions/547/attachments/287/479/Reworking_of_KVA_allocator_in_Linux_kernel.pdf
> > > 
> > > so this change will make it a bit higher. From the other hand i agree
> > > that for rt it should be fixed, probably it could be done like:
> > > 
> > > ifdef PREEMPT_RT
> > >     migrate_disable()
> > > #else
> > >     preempt_disable()
> > > ...
> > > 
> > > but i am not sure it is good either.
> > 
> > What is to be expected on average? Is the lock acquired and then
> > released again because the slot is empty and memory needs to be
> > allocated or can it be assumed that this hardly happens? 
> > 
> The lock is not released(we are not allowed), instead we just try
> to allocate with GFP_NOWAIT flag. It can happen if preallocation
> has been failed with GFP_KERNEL flag earlier:
> 
> <snip>
> ...
>  } else if (type == NE_FIT_TYPE) {
>   /*
>    * Split no edge of fit VA.
>    *
>    *     |       |
>    *   L V  NVA  V R
>    * |---|-------|---|
>    */
>   lva = __this_cpu_xchg(ne_fit_preload_node, NULL);
>   if (unlikely(!lva)) {
>       ...
>       lva = kmem_cache_alloc(vmap_area_cachep, GFP_NOWAIT);
>       ...
>   }
> ...
> <snip>
> 
> How often we need an extra object for split purpose, the answer
> is it depends on. For example fork() path falls to that pattern.
> 
> I think we can assume that migration can hardly ever happen and
> that should be considered as rare case. Thus we can do a prealoading
> without worrying much if a it occurs:
> 
> <snip>
> urezki@pc636:~/data/ssd/coding/linux-stable$ git diff
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index e92ff5f7dd8b..bc782edcd1fd 100644 
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -1089,20 +1089,16 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
>          * Even if it fails we do not really care about that. Just proceed
>          * as it is. "overflow" path will refill the cache we allocate from.
>          */
> -       preempt_disable();
> -       if (!__this_cpu_read(ne_fit_preload_node)) {
> -               preempt_enable();
> +       if (!this_cpu_read(ne_fit_preload_node)) {
>                 pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
> -               preempt_disable();
> 
> -               if (__this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) {
> +               if (this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) {
>                         if (pva)
>                                 kmem_cache_free(vmap_area_cachep, pva);
>                 }
>         }
>  
>         spin_lock(&vmap_area_lock);
> -       preempt_enable();
> 
>         /*
>          * If an allocation fails, the "vend" address is
> urezki@pc636:~/data/ssd/coding/linux-stable$
> <snip>
> 
> so, we do not guarantee, instead we minimize number of allocations
> with GFP_NOWAIT flag. For example on my 4xCPUs i am not able to
> even trigger the case when CPU is not preloaded.
> 
> I can test it tomorrow on my 12xCPUs to see its behavior there.
> 
Tested it on different systems. For example on my 8xCPUs system that
runs a PREEMPT kernel I see only a few GFP_NOWAIT allocations, i.e. it
happens when we land on another CPU that was not preloaded.

I ran a special test case that follows the preload pattern and path.
20 "unbind" threads run it and each does 1000000 allocations. As a
result, on average only 3.5 times per 1000000 allocations the CPU was
not preloaded during splitting, thus GFP_NOWAIT was used to obtain an
extra object.

It is obvious that the slightly modified approach still minimizes
allocations in atomic context, so it can happen, but the number is
negligible and can be ignored, I think.

--
Vlad Rezki



* Re: [PATCH] mm: vmalloc: Use the vmap_area_lock to protect ne_fit_preload_node
  2019-10-08 16:04                     ` Uladzislau Rezki
@ 2019-10-09  6:05                       ` Daniel Wagner
  2019-10-09  9:47                         ` Uladzislau Rezki
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel Wagner @ 2019-10-09  6:05 UTC (permalink / raw)
  To: Uladzislau Rezki
  Cc: Sebastian Andrzej Siewior, linux-mm, linux-kernel,
	linux-rt-users, Andrew Morton

On Tue, Oct 08, 2019 at 06:04:59PM +0200, Uladzislau Rezki wrote:
> > so, we do not guarantee, instead we minimize number of allocations
> > with GFP_NOWAIT flag. For example on my 4xCPUs i am not able to
> > even trigger the case when CPU is not preloaded.
> > 
> > I can test it tomorrow on my 12xCPUs to see its behavior there.
> > 
> Tested it on different systems. For example on my 8xCPUs system that
> runs PREEMPT kernel i see only few GFP_NOWAIT allocations, i.e. it
> happens when we land to another CPU that was not preloaded.
> 
> I run the special test case that follows the preload pattern and path.
> So 20 "unbind" threads run it and each does 1000000 allocations. As a
> result only 3.5 times among 1000000, during splitting, CPU was not
> preloaded thus, GFP_NOWAIT was used to obtain an extra object.
> 
> It is obvious that slightly modified approach still minimizes allocations
> in atomic context, so it can happen but the number is negligible and can
> be ignored, i think.

Thanks for doing the tests. In this case I would suggest getting rid of
the preempt_disable() micro-optimization, since there is almost no
gain in doing so. Will you send a patch? :)

Thanks,
Daniel



* Re: [PATCH] mm: vmalloc: Use the vmap_area_lock to protect ne_fit_preload_node
  2019-10-09  6:05                       ` Daniel Wagner
@ 2019-10-09  9:47                         ` Uladzislau Rezki
  0 siblings, 0 replies; 19+ messages in thread
From: Uladzislau Rezki @ 2019-10-09  9:47 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: Uladzislau Rezki, Sebastian Andrzej Siewior, linux-mm,
	linux-kernel, linux-rt-users, Andrew Morton

Hello, Daniel.

On Wed, Oct 09, 2019 at 08:05:39AM +0200, Daniel Wagner wrote:
> On Tue, Oct 08, 2019 at 06:04:59PM +0200, Uladzislau Rezki wrote:
> > > so, we do not guarantee, instead we minimize number of allocations
> > > with GFP_NOWAIT flag. For example on my 4xCPUs i am not able to
> > > even trigger the case when CPU is not preloaded.
> > > 
> > > I can test it tomorrow on my 12xCPUs to see its behavior there.
> > > 
> > Tested it on different systems. For example on my 8xCPUs system that
> > runs PREEMPT kernel i see only few GFP_NOWAIT allocations, i.e. it
> > happens when we land to another CPU that was not preloaded.
> > 
> > I run the special test case that follows the preload pattern and path.
> > So 20 "unbind" threads run it and each does 1000000 allocations. As a
> > result only 3.5 times among 1000000, during splitting, CPU was not
> > preloaded thus, GFP_NOWAIT was used to obtain an extra object.
> > 
> > It is obvious that slightly modified approach still minimizes allocations
> > in atomic context, so it can happen but the number is negligible and can
> > be ignored, i think.
> 
> Thanks for doing the tests. In this case I would suggest to get rid of
> the preempt_disable() micro optimization, since there is almost no
> gain in doing so. Do you send a patch? :)
> 
I can do it, for sure, if you do not mind, since it was your
initiative. Otherwise you can post a v2.

--
Vlad Rezki



