Linux-mm Archive on lore.kernel.org
* [PATCH 1/1] mm/vmalloc: remove preempt_disable/enable when do preloading
@ 2019-10-09 16:49 Uladzislau Rezki (Sony)
  2019-10-09 18:21 ` Steven Rostedt
  2019-10-09 22:19 ` Andrew Morton
  0 siblings, 2 replies; 9+ messages in thread
From: Uladzislau Rezki (Sony) @ 2019-10-09 16:49 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Daniel Wagner, Sebastian Andrzej Siewior, Thomas Gleixner,
	linux-mm, LKML, Peter Zijlstra, Uladzislau Rezki, Hillf Danton,
	Michal Hocko, Matthew Wilcox, Oleksiy Avramchenko,
	Steven Rostedt

Get rid of preempt_disable() and preempt_enable() when the
preload is done for splitting purposes. The reason is that
calling spin_lock() with preemption disabled is forbidden in
a CONFIG_PREEMPT_RT kernel.

Therefore, we no longer guarantee that a CPU is preloaded;
instead, this change minimizes the cases when it is not.

For example, I ran a special test case that follows the preload
pattern and path. 20 "unbind" threads ran it, each doing
1000000 allocations. On average, a CPU turned out not to be
preloaded only 3.5 times per 1000000 allocations. So it can
happen, but the number is rather negligible.

Fixes: 82dd23e84be3 ("mm/vmalloc.c: preload a CPU with one object for split purpose")
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 mm/vmalloc.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index e92ff5f7dd8b..2ed6fef86950 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1078,9 +1078,12 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 
 retry:
 	/*
-	 * Preload this CPU with one extra vmap_area object to ensure
-	 * that we have it available when fit type of free area is
-	 * NE_FIT_TYPE.
+	 * Preload this CPU with one extra vmap_area object. It is used
+	 * when fit type of free area is NE_FIT_TYPE. Please note, it
+	 * does not guarantee that an allocation occurs on a CPU that
+	 * is preloaded, instead we minimize the case when it is not.
+	 * It can happen because of migration, because there is a race
+	 * until the below spinlock is taken.
 	 *
 	 * The preload is done in non-atomic context, thus it allows us
 	 * to use more permissive allocation masks to be more stable under
@@ -1089,20 +1092,16 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 	 * Even if it fails we do not really care about that. Just proceed
 	 * as it is. "overflow" path will refill the cache we allocate from.
 	 */
-	preempt_disable();
-	if (!__this_cpu_read(ne_fit_preload_node)) {
-		preempt_enable();
+	if (!this_cpu_read(ne_fit_preload_node)) {
 		pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
-		preempt_disable();
 
-		if (__this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) {
+		if (this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) {
 			if (pva)
 				kmem_cache_free(vmap_area_cachep, pva);
 		}
 	}
 
 	spin_lock(&vmap_area_lock);
-	preempt_enable();
 
 	/*
 	 * If an allocation fails, the "vend" address is
-- 
2.20.1




* Re: [PATCH 1/1] mm/vmalloc: remove preempt_disable/enable when do preloading
  2019-10-09 16:49 [PATCH 1/1] mm/vmalloc: remove preempt_disable/enable when do preloading Uladzislau Rezki (Sony)
@ 2019-10-09 18:21 ` Steven Rostedt
  2019-10-09 22:19 ` Andrew Morton
  1 sibling, 0 replies; 9+ messages in thread
From: Steven Rostedt @ 2019-10-09 18:21 UTC (permalink / raw)
  To: Uladzislau Rezki (Sony)
  Cc: Andrew Morton, Daniel Wagner, Sebastian Andrzej Siewior,
	Thomas Gleixner, linux-mm, LKML, Peter Zijlstra, Hillf Danton,
	Michal Hocko, Matthew Wilcox, Oleksiy Avramchenko

On Wed,  9 Oct 2019 18:49:34 +0200
"Uladzislau Rezki (Sony)" <urezki@gmail.com> wrote:

> Get rid of preempt_disable() and preempt_enable() when the
> preload is done for splitting purposes. The reason is that
> calling spin_lock() with preemption disabled is forbidden in
> a CONFIG_PREEMPT_RT kernel.
> 
> Therefore, we no longer guarantee that a CPU is preloaded;
> instead, this change minimizes the cases when it is not.
> 
> For example, I ran a special test case that follows the preload
> pattern and path. 20 "unbind" threads ran it, each doing
> 1000000 allocations. On average, a CPU turned out not to be
> preloaded only 3.5 times per 1000000 allocations. So it can
> happen, but the number is rather negligible.

Thanks for the analysis.

> 
> Fixes: 82dd23e84be3 ("mm/vmalloc.c: preload a CPU with one object for split purpose")
> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> ---
>  mm/vmalloc.c | 17 ++++++++---------
>  1 file changed, 8 insertions(+), 9 deletions(-)
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index e92ff5f7dd8b..2ed6fef86950 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -1078,9 +1078,12 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
>  
>  retry:
>  	/*
> -	 * Preload this CPU with one extra vmap_area object to ensure
> -	 * that we have it available when fit type of free area is
> -	 * NE_FIT_TYPE.
> +	 * Preload this CPU with one extra vmap_area object. It is used
> +	 * when fit type of free area is NE_FIT_TYPE. Please note, it
> +	 * does not guarantee that an allocation occurs on a CPU that
> +	 * is preloaded, instead we minimize the case when it is not.
> +	 * It can happen because of migration, because there is a race
> +	 * until the below spinlock is taken.
>  	 *
>  	 * The preload is done in non-atomic context, thus it allows us
>  	 * to use more permissive allocation masks to be more stable under
> @@ -1089,20 +1092,16 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
>  	 * Even if it fails we do not really care about that. Just proceed
>  	 * as it is. "overflow" path will refill the cache we allocate from.
>  	 */
> -	preempt_disable();
> -	if (!__this_cpu_read(ne_fit_preload_node)) {
> -		preempt_enable();

As the original code enables preemption here regardless, there's no
guarantee that the original patch would allocate the pva to the CPU in
question.

I agree with this patch: the preempt_disable() here only narrows an
already narrow window, without really helping with what it was trying
to do.

> +	if (!this_cpu_read(ne_fit_preload_node)) {
>  		pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);

If the memory allocation failed here, we still may not have a pva for
the current CPU's ne_fit_preload_node, rare as that may be.

> -		preempt_disable();
>  
> -		if (__this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) {
> +		if (this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) {


Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

-- Steve


>  			if (pva)
>  				kmem_cache_free(vmap_area_cachep, pva);
>  		}
>  	}
>  
>  	spin_lock(&vmap_area_lock);
> -	preempt_enable();
>  
>  	/*
>  	 * If an allocation fails, the "vend" address is




* Re: [PATCH 1/1] mm/vmalloc: remove preempt_disable/enable when do preloading
  2019-10-09 16:49 [PATCH 1/1] mm/vmalloc: remove preempt_disable/enable when do preloading Uladzislau Rezki (Sony)
  2019-10-09 18:21 ` Steven Rostedt
@ 2019-10-09 22:19 ` Andrew Morton
  2019-10-10  2:17   ` Steven Rostedt
  1 sibling, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2019-10-09 22:19 UTC (permalink / raw)
  To: Uladzislau Rezki (Sony)
  Cc: Daniel Wagner, Sebastian Andrzej Siewior, Thomas Gleixner,
	linux-mm, LKML, Peter Zijlstra, Hillf Danton, Michal Hocko,
	Matthew Wilcox, Oleksiy Avramchenko, Steven Rostedt

On Wed,  9 Oct 2019 18:49:34 +0200 "Uladzislau Rezki (Sony)" <urezki@gmail.com> wrote:

> Get rid of preempt_disable() and preempt_enable() when the
> preload is done for splitting purposes. The reason is that
> calling spin_lock() with preemption disabled is forbidden in
> a CONFIG_PREEMPT_RT kernel.
> 
> Therefore, we no longer guarantee that a CPU is preloaded;
> instead, this change minimizes the cases when it is not.
> 
> For example, I ran a special test case that follows the preload
> pattern and path. 20 "unbind" threads ran it, each doing
> 1000000 allocations. On average, a CPU turned out not to be
> preloaded only 3.5 times per 1000000 allocations. So it can
> happen, but the number is rather negligible.
>
> ...
>

A few questions about the resulting alloc_vmap_area():

: static struct vmap_area *alloc_vmap_area(unsigned long size,
: 				unsigned long align,
: 				unsigned long vstart, unsigned long vend,
: 				int node, gfp_t gfp_mask)
: {
: 	struct vmap_area *va, *pva;
: 	unsigned long addr;
: 	int purged = 0;
: 
: 	BUG_ON(!size);
: 	BUG_ON(offset_in_page(size));
: 	BUG_ON(!is_power_of_2(align));
: 
: 	if (unlikely(!vmap_initialized))
: 		return ERR_PTR(-EBUSY);
: 
: 	might_sleep();
: 
: 	va = kmem_cache_alloc_node(vmap_area_cachep,
: 			gfp_mask & GFP_RECLAIM_MASK, node);

Why does this use GFP_RECLAIM_MASK?  Please add a comment explaining
this.

: 	if (unlikely(!va))
: 		return ERR_PTR(-ENOMEM);
: 
: 	/*
: 	 * Only scan the relevant parts containing pointers to other objects
: 	 * to avoid false negatives.
: 	 */
: 	kmemleak_scan_area(&va->rb_node, SIZE_MAX, gfp_mask & GFP_RECLAIM_MASK);
: 
: retry:
: 	/*
: 	 * Preload this CPU with one extra vmap_area object. It is used
: 	 * when fit type of free area is NE_FIT_TYPE. Please note, it
: 	 * does not guarantee that an allocation occurs on a CPU that
: 	 * is preloaded, instead we minimize the case when it is not.
: 	 * It can happen because of migration, because there is a race
: 	 * until the below spinlock is taken.
: 	 *
: 	 * The preload is done in non-atomic context, thus it allows us
: 	 * to use more permissive allocation masks to be more stable under
: 	 * low memory condition and high memory pressure.
: 	 *
: 	 * Even if it fails we do not really care about that. Just proceed
: 	 * as it is. "overflow" path will refill the cache we allocate from.
: 	 */
: 	if (!this_cpu_read(ne_fit_preload_node)) {

Readability nit: local `pva' should be defined here, rather than having
function-wide scope.

: 		pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);

Why doesn't this honour gfp_mask?  If it's not a bug, please add
comment explaining this.

The kmem_cache_alloc() in adjust_va_to_fit_type() omits the caller's
gfp_mask also.  If not a bug, please document the unexpected behaviour.

: 
: 		if (this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) {
: 			if (pva)
: 				kmem_cache_free(vmap_area_cachep, pva);
: 		}
: 	}
: 
: 	spin_lock(&vmap_area_lock);
: 
: 	/*
: 	 * If an allocation fails, the "vend" address is
: 	 * returned. Therefore trigger the overflow path.
: 	 */

As for the intent of this patch, why not preallocate the vmap_area
outside the spinlock and use it within the spinlock?  Does spin_lock()
disable preemption on RT?  I forget, but it doesn't matter much anyway
- doing this will make the code better in the regular kernel I think? 
Something like this:

	struct vmap_area *pva = NULL;

	...

	if (!this_cpu_read(ne_fit_preload_node))
		pva = kmem_cache_alloc_node(vmap_area_cachep, ...);

	spin_lock(&vmap_area_lock);

	if (pva && __this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva))
		kmem_cache_free(vmap_area_cachep, pva);







* Re: [PATCH 1/1] mm/vmalloc: remove preempt_disable/enable when do preloading
  2019-10-09 22:19 ` Andrew Morton
@ 2019-10-10  2:17   ` Steven Rostedt
  2019-10-10 15:17     ` Uladzislau Rezki
  0 siblings, 1 reply; 9+ messages in thread
From: Steven Rostedt @ 2019-10-10  2:17 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Uladzislau Rezki (Sony),
	Daniel Wagner, Sebastian Andrzej Siewior, Thomas Gleixner,
	linux-mm, LKML, Peter Zijlstra, Hillf Danton, Michal Hocko,
	Matthew Wilcox, Oleksiy Avramchenko

On Wed, 9 Oct 2019 15:19:01 -0700
Andrew Morton <akpm@linux-foundation.org> wrote:

> On Wed,  9 Oct 2019 18:49:34 +0200 "Uladzislau Rezki (Sony)" <urezki@gmail.com> wrote:
> 
> > Get rid of preempt_disable() and preempt_enable() when the
> > preload is done for splitting purposes. The reason is that
> > calling spin_lock() with preemption disabled is forbidden in
> > a CONFIG_PREEMPT_RT kernel.
> > 
> > Therefore, we no longer guarantee that a CPU is preloaded;
> > instead, this change minimizes the cases when it is not.
> > 
> > For example, I ran a special test case that follows the preload
> > pattern and path. 20 "unbind" threads ran it, each doing
> > 1000000 allocations. On average, a CPU turned out not to be
> > preloaded only 3.5 times per 1000000 allocations. So it can
> > happen, but the number is rather negligible.
> >
> > ...
> >  
> 
> A few questions about the resulting alloc_vmap_area():
> 
> : static struct vmap_area *alloc_vmap_area(unsigned long size,
> : 				unsigned long align,
> : 				unsigned long vstart, unsigned long vend,
> : 				int node, gfp_t gfp_mask)
> : {
> : 	struct vmap_area *va, *pva;
> : 	unsigned long addr;
> : 	int purged = 0;
> : 
> : 	BUG_ON(!size);
> : 	BUG_ON(offset_in_page(size));
> : 	BUG_ON(!is_power_of_2(align));
> : 
> : 	if (unlikely(!vmap_initialized))
> : 		return ERR_PTR(-EBUSY);
> : 
> : 	might_sleep();
> : 
> : 	va = kmem_cache_alloc_node(vmap_area_cachep,
> : 			gfp_mask & GFP_RECLAIM_MASK, node);
> 
> Why does this use GFP_RECLAIM_MASK?  Please add a comment explaining
> this.
> 
> : 	if (unlikely(!va))
> : 		return ERR_PTR(-ENOMEM);
> : 
> : 	/*
> : 	 * Only scan the relevant parts containing pointers to other objects
> : 	 * to avoid false negatives.
> : 	 */
> : 	kmemleak_scan_area(&va->rb_node, SIZE_MAX, gfp_mask & GFP_RECLAIM_MASK);
> : 
> : retry:
> : 	/*
> : 	 * Preload this CPU with one extra vmap_area object. It is used
> : 	 * when fit type of free area is NE_FIT_TYPE. Please note, it
> : 	 * does not guarantee that an allocation occurs on a CPU that
> : 	 * is preloaded, instead we minimize the case when it is not.
> : 	 * It can happen because of migration, because there is a race
> : 	 * until the below spinlock is taken.
> : 	 *
> : 	 * The preload is done in non-atomic context, thus it allows us
> : 	 * to use more permissive allocation masks to be more stable under
> : 	 * low memory condition and high memory pressure.
> : 	 *
> : 	 * Even if it fails we do not really care about that. Just proceed
> : 	 * as it is. "overflow" path will refill the cache we allocate from.
> : 	 */
> : 	if (!this_cpu_read(ne_fit_preload_node)) {
> 
> Readability nit: local `pva' should be defined here, rather than having
> function-wide scope.
> 
> : 		pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
> 
> Why doesn't this honour gfp_mask?  If it's not a bug, please add
> comment explaining this.
> 
> The kmem_cache_alloc() in adjust_va_to_fit_type() omits the caller's
> gfp_mask also.  If not a bug, please document the unexpected behaviour.
> 

These questions appear to be for the code that this patch touches, not
for the patch itself.

> : 
> : 		if (this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) {
> : 			if (pva)
> : 				kmem_cache_free(vmap_area_cachep, pva);
> : 		}
> : 	}
> : 
> : 	spin_lock(&vmap_area_lock);
> : 
> : 	/*
> : 	 * If an allocation fails, the "vend" address is
> : 	 * returned. Therefore trigger the overflow path.
> : 	 */
> 
> As for the intent of this patch, why not preallocate the vmap_area
> outside the spinlock and use it within the spinlock?  Does spin_lock()
> disable preemption on RT?  I forget, but it doesn't matter much anyway

spin_lock() does not disable preemption on RT. But it does disable
migration (thus the task should remain on the current CPU).

> - doing this will make the code better in the regular kernel I think? 
> Something like this:
> 
> 	struct vmap_area *pva = NULL;
> 
> 	...
> 
> 	if (!this_cpu_read(ne_fit_preload_node))
> 		pva = kmem_cache_alloc_node(vmap_area_cachep, ...);
> 
> 	spin_lock(&vmap_area_lock);
> 
> 	if (pva && __this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva))
> 		kmem_cache_free(vmap_area_cachep, pva);
> 


This looks fine to me.

-- Steve



* Re: [PATCH 1/1] mm/vmalloc: remove preempt_disable/enable when do preloading
  2019-10-10  2:17   ` Steven Rostedt
@ 2019-10-10 15:17     ` Uladzislau Rezki
  2019-10-11 23:55       ` Andrew Morton
  0 siblings, 1 reply; 9+ messages in thread
From: Uladzislau Rezki @ 2019-10-10 15:17 UTC (permalink / raw)
  To: Steven Rostedt, Andrew Morton
  Cc: Andrew Morton, Uladzislau Rezki (Sony),
	Daniel Wagner, Sebastian Andrzej Siewior, Thomas Gleixner,
	linux-mm, LKML, Peter Zijlstra, Hillf Danton, Michal Hocko,
	Matthew Wilcox, Oleksiy Avramchenko

> > 
> > A few questions about the resulting alloc_vmap_area():
> > 
> > : static struct vmap_area *alloc_vmap_area(unsigned long size,
> > : 				unsigned long align,
> > : 				unsigned long vstart, unsigned long vend,
> > : 				int node, gfp_t gfp_mask)
> > : {
> > : 	struct vmap_area *va, *pva;
> > : 	unsigned long addr;
> > : 	int purged = 0;
> > : 
> > : 	BUG_ON(!size);
> > : 	BUG_ON(offset_in_page(size));
> > : 	BUG_ON(!is_power_of_2(align));
> > : 
> > : 	if (unlikely(!vmap_initialized))
> > : 		return ERR_PTR(-EBUSY);
> > : 
> > : 	might_sleep();
> > : 
> > : 	va = kmem_cache_alloc_node(vmap_area_cachep,
> > : 			gfp_mask & GFP_RECLAIM_MASK, node);
> > 
> > Why does this use GFP_RECLAIM_MASK?  Please add a comment explaining
> > this.
> > 
I need to think about it. Initially it was like that.

> > : 	if (unlikely(!va))
> > : 		return ERR_PTR(-ENOMEM);
> > : 
> > : 	/*
> > : 	 * Only scan the relevant parts containing pointers to other objects
> > : 	 * to avoid false negatives.
> > : 	 */
> > : 	kmemleak_scan_area(&va->rb_node, SIZE_MAX, gfp_mask & GFP_RECLAIM_MASK);
> > : 
> > : retry:
> > : 	/*
> > : 	 * Preload this CPU with one extra vmap_area object. It is used
> > : 	 * when fit type of free area is NE_FIT_TYPE. Please note, it
> > : 	 * does not guarantee that an allocation occurs on a CPU that
> > : 	 * is preloaded, instead we minimize the case when it is not.
> > : 	 * It can happen because of migration, because there is a race
> > : 	 * until the below spinlock is taken.
> > : 	 *
> > : 	 * The preload is done in non-atomic context, thus it allows us
> > : 	 * to use more permissive allocation masks to be more stable under
> > : 	 * low memory condition and high memory pressure.
> > : 	 *
> > : 	 * Even if it fails we do not really care about that. Just proceed
> > : 	 * as it is. "overflow" path will refill the cache we allocate from.
> > : 	 */
> > : 	if (!this_cpu_read(ne_fit_preload_node)) {
> > 
> > Readability nit: local `pva' should be defined here, rather than having
> > function-wide scope.
> > 
> > : 		pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
> > 
> > Why doesn't this honour gfp_mask?  If it's not a bug, please add
> > comment explaining this.
> > 
But there is a comment, if I understand you correctly:

<snip>
* Even if it fails we do not really care about that. Just proceed
* as it is. "overflow" path will refill the cache we allocate from.
<snip>

> > The kmem_cache_alloc() in adjust_va_to_fit_type() omits the caller's
> > gfp_mask also.  If not a bug, please document the unexpected behaviour.
> > 
I will add a comment there.

> 
> These questions appear to be for the code that this patch touches, not
> for the patch itself.
> 
> > : 
> > : 		if (this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) {
> > : 			if (pva)
> > : 				kmem_cache_free(vmap_area_cachep, pva);
> > : 		}
> > : 	}
> > : 
> > : 	spin_lock(&vmap_area_lock);
> > : 
> > : 	/*
> > : 	 * If an allocation fails, the "vend" address is
> > : 	 * returned. Therefore trigger the overflow path.
> > : 	 */
> > 
> > As for the intent of this patch, why not preallocate the vmap_area
> > outside the spinlock and use it within the spinlock?  Does spin_lock()
> > disable preemption on RT?  I forget, but it doesn't matter much anyway
> 
> spin_lock() does not disable preemption on RT. But it does disable
> migration (thus the task should remain on the current CPU).
> 
> > - doing this will make the code better in the regular kernel I think? 
> > Something like this:
> > 
> > 	struct vmap_area *pva = NULL;
> > 
> > 	...
> > 
> > 	if (!this_cpu_read(ne_fit_preload_node))
> > 		pva = kmem_cache_alloc_node(vmap_area_cachep, ...);
> > 
> > 	spin_lock(&vmap_area_lock);
> > 
> > 	if (pva && __this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva))
> > 		kmem_cache_free(vmap_area_cachep, pva);
> > 
> 
> 
> This looks fine to me.
> 
Yes, I agree that is better. I was thinking about doing so, but decided
to keep it as is, because of the low number of "corner cases" anyway.

I will upload v2.

Thanks for the comments!

--
Vlad Rezki



* Re: [PATCH 1/1] mm/vmalloc: remove preempt_disable/enable when do preloading
  2019-10-10 15:17     ` Uladzislau Rezki
@ 2019-10-11 23:55       ` Andrew Morton
  2019-10-14 14:27         ` Uladzislau Rezki
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2019-10-11 23:55 UTC (permalink / raw)
  To: Uladzislau Rezki
  Cc: Steven Rostedt, Daniel Wagner, Sebastian Andrzej Siewior,
	Thomas Gleixner, linux-mm, LKML, Peter Zijlstra, Hillf Danton,
	Michal Hocko, Matthew Wilcox, Oleksiy Avramchenko

On Thu, 10 Oct 2019 17:17:49 +0200 Uladzislau Rezki <urezki@gmail.com> wrote:

> > > : 	 * The preload is done in non-atomic context, thus it allows us
> > > : 	 * to use more permissive allocation masks to be more stable under
> > > : 	 * low memory condition and high memory pressure.
> > > : 	 *
> > > : 	 * Even if it fails we do not really care about that. Just proceed
> > > : 	 * as it is. "overflow" path will refill the cache we allocate from.
> > > : 	 */
> > > : 	if (!this_cpu_read(ne_fit_preload_node)) {
> > > 
> > > Readability nit: local `pva' should be defined here, rather than having
> > > function-wide scope.
> > > 
> > > : 		pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
> > > 
> > > Why doesn't this honour gfp_mask?  If it's not a bug, please add
> > > comment explaining this.
> > > 
> But there is a comment, if understand you correctly:
> 
> <snip>
> * Even if it fails we do not really care about that. Just proceed
> * as it is. "overflow" path will refill the cache we allocate from.
> <snip>

My point is that the alloc_vmap_area() caller passed us a gfp_t but
this code ignores it, as does adjust_va_to_fit_type().  These *look*
like potential bugs.  If not, they should be commented so they don't
look like bugs any more ;)





* Re: [PATCH 1/1] mm/vmalloc: remove preempt_disable/enable when do preloading
  2019-10-11 23:55       ` Andrew Morton
@ 2019-10-14 14:27         ` Uladzislau Rezki
  2019-10-14 16:30           ` Michal Hocko
  0 siblings, 1 reply; 9+ messages in thread
From: Uladzislau Rezki @ 2019-10-14 14:27 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Uladzislau Rezki, Steven Rostedt, Daniel Wagner,
	Sebastian Andrzej Siewior, Thomas Gleixner, linux-mm, LKML,
	Peter Zijlstra, Hillf Danton, Michal Hocko, Matthew Wilcox,
	Oleksiy Avramchenko

On Fri, Oct 11, 2019 at 04:55:15PM -0700, Andrew Morton wrote:
> On Thu, 10 Oct 2019 17:17:49 +0200 Uladzislau Rezki <urezki@gmail.com> wrote:
> 
> > > > : 	 * The preload is done in non-atomic context, thus it allows us
> > > > : 	 * to use more permissive allocation masks to be more stable under
> > > > : 	 * low memory condition and high memory pressure.
> > > > : 	 *
> > > > : 	 * Even if it fails we do not really care about that. Just proceed
> > > > : 	 * as it is. "overflow" path will refill the cache we allocate from.
> > > > : 	 */
> > > > : 	if (!this_cpu_read(ne_fit_preload_node)) {
> > > > 
> > > > Readability nit: local `pva' should be defined here, rather than having
> > > > function-wide scope.
> > > > 
> > > > : 		pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
> > > > 
> > > > Why doesn't this honour gfp_mask?  If it's not a bug, please add
> > > > comment explaining this.
> > > > 
> > But there is a comment, if I understand you correctly:
> > 
> > <snip>
> > * Even if it fails we do not really care about that. Just proceed
> > * as it is. "overflow" path will refill the cache we allocate from.
> > <snip>
> 
> My point is that the alloc_vmap_area() caller passed us a gfp_t but
> this code ignores it, as does adjust_va_to_fit_type().  These *look*
> like potential bugs.  If not, they should be commented so they don't
> look like bugs any more ;)
> 
I got it, there was a misunderstanding on my side :) I agree.

In the first case I should have used and respected the passed "gfp_mask",
like below:

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index f48cd0711478..880b6e8cdeae 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1113,7 +1113,8 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
                 * Just proceed as it is. If needed "overflow" path
                 * will refill the cache we allocate from.
                 */
-               pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
+               pva = kmem_cache_alloc_node(vmap_area_cachep,
+                               gfp_mask & GFP_RECLAIM_MASK, node);
 
        spin_lock(&vmap_area_lock);

It should be sent as a separate patch, I think.

As for adjust_va_to_fit_type(), I can add a comment, since we cannot
sleep there and the case occurs about once per 1000000 allocations, or
even less often with your proposal.

Does it sound good?

Thank you!

--
Vlad Rezki



* Re: [PATCH 1/1] mm/vmalloc: remove preempt_disable/enable when do preloading
  2019-10-14 14:27         ` Uladzislau Rezki
@ 2019-10-14 16:30           ` Michal Hocko
  2019-10-15  9:54             ` Uladzislau Rezki
  0 siblings, 1 reply; 9+ messages in thread
From: Michal Hocko @ 2019-10-14 16:30 UTC (permalink / raw)
  To: Uladzislau Rezki
  Cc: Andrew Morton, Steven Rostedt, Daniel Wagner,
	Sebastian Andrzej Siewior, Thomas Gleixner, linux-mm, LKML,
	Peter Zijlstra, Hillf Danton, Matthew Wilcox,
	Oleksiy Avramchenko

On Mon 14-10-19 16:27:19, Uladzislau Rezki wrote:
> On Fri, Oct 11, 2019 at 04:55:15PM -0700, Andrew Morton wrote:
> > On Thu, 10 Oct 2019 17:17:49 +0200 Uladzislau Rezki <urezki@gmail.com> wrote:
> > 
> > > > > : 	 * The preload is done in non-atomic context, thus it allows us
> > > > > : 	 * to use more permissive allocation masks to be more stable under
> > > > > : 	 * low memory condition and high memory pressure.
> > > > > : 	 *
> > > > > : 	 * Even if it fails we do not really care about that. Just proceed
> > > > > : 	 * as it is. "overflow" path will refill the cache we allocate from.
> > > > > : 	 */
> > > > > : 	if (!this_cpu_read(ne_fit_preload_node)) {
> > > > > 
> > > > > Readability nit: local `pva' should be defined here, rather than having
> > > > > function-wide scope.
> > > > > 
> > > > > : 		pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
> > > > > 
> > > > > Why doesn't this honour gfp_mask?  If it's not a bug, please add
> > > > > comment explaining this.
> > > > > 
> > > But there is a comment, if I understand you correctly:
> > > 
> > > <snip>
> > > * Even if it fails we do not really care about that. Just proceed
> > > * as it is. "overflow" path will refill the cache we allocate from.
> > > <snip>
> > 
> > My point is that the alloc_vmap_area() caller passed us a gfp_t but
> > this code ignores it, as does adjust_va_to_fit_type().  These *look*
> > like potential bugs.  If not, they should be commented so they don't
> > look like bugs any more ;)
> > 
> I got it, there was a misunderstanding on my side :) I agree.
> 
> In the first case I should have used and respected the passed "gfp_mask",
> like below:
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index f48cd0711478..880b6e8cdeae 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -1113,7 +1113,8 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
>                  * Just proceed as it is. If needed "overflow" path
>                  * will refill the cache we allocate from.
>                  */
> -               pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
> +               pva = kmem_cache_alloc_node(vmap_area_cachep,
> +                               gfp_mask & GFP_RECLAIM_MASK, node);
>  
>         spin_lock(&vmap_area_lock);
> 
> It should be sent as a separate patch, I think.

Yes. I do not think this would make any real difference, because that
battle was lost long ago. vmalloc is simply not gfp-mask friendly. There
are places, like page table allocation, which are hardcoded to GFP_KERNEL,
so the GFP_NOWAIT semantic is not going to work, really. The above makes
sense from a pure aesthetic POV, though, I would say.
-- 
Michal Hocko
SUSE Labs



* Re: [PATCH 1/1] mm/vmalloc: remove preempt_disable/enable when do preloading
  2019-10-14 16:30           ` Michal Hocko
@ 2019-10-15  9:54             ` Uladzislau Rezki
  0 siblings, 0 replies; 9+ messages in thread
From: Uladzislau Rezki @ 2019-10-15  9:54 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Uladzislau Rezki, Andrew Morton, Steven Rostedt, Daniel Wagner,
	Sebastian Andrzej Siewior, Thomas Gleixner, linux-mm, LKML,
	Peter Zijlstra, Hillf Danton, Matthew Wilcox,
	Oleksiy Avramchenko

> > > > > > : 	 * The preload is done in non-atomic context, thus it allows us
> > > > > > : 	 * to use more permissive allocation masks to be more stable under
> > > > > > : 	 * low memory condition and high memory pressure.
> > > > > > : 	 *
> > > > > > : 	 * Even if it fails we do not really care about that. Just proceed
> > > > > > : 	 * as it is. "overflow" path will refill the cache we allocate from.
> > > > > > : 	 */
> > > > > > : 	if (!this_cpu_read(ne_fit_preload_node)) {
> > > > > > 
> > > > > > Readability nit: local `pva' should be defined here, rather than having
> > > > > > function-wide scope.
> > > > > > 
> > > > > > : 		pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
> > > > > > 
> > > > > > Why doesn't this honour gfp_mask?  If it's not a bug, please add
> > > > > > comment explaining this.
> > > > > > 
> > > > But there is a comment, if I understand you correctly:
> > > > 
> > > > <snip>
> > > > * Even if it fails we do not really care about that. Just proceed
> > > > * as it is. "overflow" path will refill the cache we allocate from.
> > > > <snip>
> > > 
> > > My point is that the alloc_vmap_area() caller passed us a gfp_t but
> > > this code ignores it, as does adjust_va_to_fit_type().  These *look*
> > > like potential bugs.  If not, they should be commented so they don't
> > > look like bugs any more ;)
> > > 
> > I got it, there was a misunderstanding on my side :) I agree.
> > 
> > In the first case I should have used and respected the passed "gfp_mask",
> > like below:
> > 
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index f48cd0711478..880b6e8cdeae 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -1113,7 +1113,8 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
> >                  * Just proceed as it is. If needed "overflow" path
> >                  * will refill the cache we allocate from.
> >                  */
> > -               pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
> > +               pva = kmem_cache_alloc_node(vmap_area_cachep,
> > +                               gfp_mask & GFP_RECLAIM_MASK, node);
> >  
> >         spin_lock(&vmap_area_lock);
> > 
> > It should be sent as a separate patch, I think.
> 
> Yes. I do not think this would make any real difference, because that
> battle was lost long ago. vmalloc is simply not gfp-mask friendly. There
> are places, like page table allocation, which are hardcoded to GFP_KERNEL,
> so the GFP_NOWAIT semantic is not going to work, really. The above makes
> sense from a pure aesthetic POV, though, I would say.
I agree. Then I will create a patch.

Thank you!

--
Vlad Rezki



