* [PATCH] mm/util.c: Make kvfree() safe for calling while holding spinlocks
From: Manfred Spraul @ 2021-12-22 19:48 UTC
  To: LKML, Andrew Morton
  Cc: Vasily Averin, cgel.zte, shakeelb, rdunlap, dbueso, unixbhaskar,
	chi.minghao, arnd, Zeal Robot, linux-mm, 1vier1, Manfred Spraul,
	stable

One codepath in find_alloc_undo() calls kvfree() while holding a spinlock.
Since vfree() can sleep, this is a bug.

Previously, the code path used kfree(), and kfree() is safe to be called
while holding a spinlock.

Minghao proposed to fix this by updating find_alloc_undo().

Alternate proposal to fix this: Instead of changing find_alloc_undo(),
change kvfree() so that the same rules as for kfree() apply:
Having different rules for kfree() and kvfree() just asks for bugs.

Disadvantage: Releasing vmalloc'ed memory will be delayed a bit.

Reported-by: Zeal Robot <zealci@zte.com.cn>
Reported-by: Minghao Chi <chi.minghao@zte.com.cn>
Link: https://lore.kernel.org/all/20211222081026.484058-1-chi.minghao@zte.com.cn/
Fixes: fc37a3b8b438 ("[PATCH] ipc sem: use kvmalloc for sem_undo allocation")
Cc: stable@vger.kernel.org
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
---
 mm/util.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/util.c b/mm/util.c
index 741ba32a43ac..7f9181998835 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -610,12 +610,12 @@ EXPORT_SYMBOL(kvmalloc_node);
  * It is slightly more efficient to use kfree() or vfree() if you are certain
  * that you know which one to use.
  *
- * Context: Either preemptible task context or not-NMI interrupt.
+ * Context: Any context except NMI interrupt.
  */
 void kvfree(const void *addr)
 {
 	if (is_vmalloc_addr(addr))
-		vfree(addr);
+		vfree_atomic(addr);
 	else
 		kfree(addr);
 }
-- 
2.33.1
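
For context on why this is safe: vfree_atomic() defers the actual free
to a per-CPU workqueue instead of freeing inline. A simplified sketch of
the mm/vmalloc.c mechanism (the NMI check, the kmemleak hook and the
worker body are omitted here):

	/* Per-CPU deferred-free state (simplified). */
	struct vfree_deferred {
		struct llist_head list;	/* vmalloc'ed blocks pending free */
		struct work_struct wq;	/* worker that vfree()s them */
	};
	static DEFINE_PER_CPU(struct vfree_deferred, vfree_deferred);

	void vfree_atomic(const void *addr)
	{
		struct vfree_deferred *p = raw_cpu_ptr(&vfree_deferred);

		if (!addr)
			return;
		/*
		 * Chain the block onto a lock-less list; adding the first
		 * entry kicks the worker, which does the sleeping vfree()
		 * from process context.
		 */
		if (llist_add((struct llist_node *)addr, &p->list))
			schedule_work(&p->wq);
	}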



* Re: [PATCH] mm/util.c: Make kvfree() safe for calling while holding spinlocks
From: Davidlohr Bueso @ 2021-12-23  3:40 UTC
  To: Manfred Spraul
  Cc: LKML, Andrew Morton, Vasily Averin, cgel.zte, shakeelb, rdunlap,
	unixbhaskar, chi.minghao, arnd, Zeal Robot, linux-mm, 1vier1,
	stable, mhocko, willy, vbabka

Cc'ing more mm folks.

On 2021-12-22 11:48, Manfred Spraul wrote:
> One codepath in find_alloc_undo() calls kvfree() while holding a 
> spinlock.
> Since vfree() can sleep this is a bug.

afaict the only other offender is devx_async_cmd_event_destroy_uobj(), 
in drivers/infiniband/hw/mlx5/devx.c. I was expecting to find more, 
actually.
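
For reference, the offending pattern looks roughly like this (a
schematic sketch of the find_alloc_undo() path, not the verbatim
ipc/sem.c code):

	rcu_read_lock();
	spin_lock(&ulp->lock);
	un = lookup_undo(ulp, semid);
	if (un) {
		/*
		 * Someone else already allocated the undo structure,
		 * ours is unused. 'new' came from kvmalloc(), so it may
		 * be vmalloc'ed, and vfree() can sleep under the lock.
		 */
		kvfree(new);
	}
	spin_unlock(&ulp->lock);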

> Previously, the code path used kfree(), and kfree() is safe to be 
> called
> while holding a spinlock.
> 
> Minghao proposed to fix this by updating find_alloc_undo().
> 
> Alternate proposal to fix this: Instead of changing find_alloc_undo(),
> change kvfree() so that the same rules as for kfree() apply:
> Having different rules for kfree() and kvfree() just asks for bugs.

I agree that it is best to have the same atomic semantics across the
whole family of calls.

> 
> Disadvantage: Releasing vmalloc'ed memory will be delayed a bit.

I would not expect the added latency to be a big deal unless under
serious memory pressure, in which case things are already fragile to
begin with. Furthermore, users of kvfree() are already warned that this
is the slower choice. Feel free to add my:

Acked-by: Davidlohr Bueso <dbueso@suse.de>

> 
> Reported-by: Zeal Robot <zealci@zte.com.cn>
> Reported-by: Minghao Chi <chi.minghao@zte.com.cn>
> Link:
> https://lore.kernel.org/all/20211222081026.484058-1-chi.minghao@zte.com.cn/
> Fixes: fc37a3b8b438 ("[PATCH] ipc sem: use kvmalloc for sem_undo 
> allocation")
> Cc: stable@vger.kernel.org
> Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
> ---
>  mm/util.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/util.c b/mm/util.c
> index 741ba32a43ac..7f9181998835 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -610,12 +610,12 @@ EXPORT_SYMBOL(kvmalloc_node);
>   * It is slightly more efficient to use kfree() or vfree() if you are 
> certain
>   * that you know which one to use.
>   *
> - * Context: Either preemptible task context or not-NMI interrupt.
> + * Context: Any context except NMI interrupt.
>   */
>  void kvfree(const void *addr)
>  {
>  	if (is_vmalloc_addr(addr))
> -		vfree(addr);
> +		vfree_atomic(addr);
>  	else
>  		kfree(addr);
>  }


* Re: [PATCH] mm/util.c: Make kvfree() safe for calling while holding spinlocks
From: Vasily Averin @ 2021-12-23  7:21 UTC
  To: Manfred Spraul, LKML, Andrew Morton
  Cc: cgel.zte, shakeelb, rdunlap, dbueso, unixbhaskar, chi.minghao,
	arnd, Zeal Robot, linux-mm, 1vier1, stable

On 22.12.2021 22:48, Manfred Spraul wrote:
> One codepath in find_alloc_undo() calls kvfree() while holding a spinlock.
> Since vfree() can sleep this is a bug.
> 
> Previously, the code path used kfree(), and kfree() is safe to be called
> while holding a spinlock.
> 
> Minghao proposed to fix this by updating find_alloc_undo().
> 
> Alternate proposal to fix this: Instead of changing find_alloc_undo(),
> change kvfree() so that the same rules as for kfree() apply:
> Having different rules for kfree() and kvfree() just asks for bugs.
> 
> Disadvantage: Releasing vmalloc'ed memory will be delayed a bit.
> 
> Reported-by: Zeal Robot <zealci@zte.com.cn>
> Reported-by: Minghao Chi <chi.minghao@zte.com.cn>
> Link: https://lore.kernel.org/all/20211222081026.484058-1-chi.minghao@zte.com.cn/
> Fixes: fc37a3b8b438 ("[PATCH] ipc sem: use kvmalloc for sem_undo allocation")
> Cc: stable@vger.kernel.org
> Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
> ---
>  mm/util.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/util.c b/mm/util.c
> index 741ba32a43ac..7f9181998835 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -610,12 +610,12 @@ EXPORT_SYMBOL(kvmalloc_node);
>   * It is slightly more efficient to use kfree() or vfree() if you are certain
>   * that you know which one to use.
>   *
> - * Context: Either preemptible task context or not-NMI interrupt.
> + * Context: Any context except NMI interrupt.
>   */
>  void kvfree(const void *addr)
>  {
>  	if (is_vmalloc_addr(addr))
> -		vfree(addr);
> +		vfree_atomic(addr);
>  	else
>  		kfree(addr);
>  }

I would prefer to release memory ASAP if it's possible.
What do you think about this change?
--- a/mm/util.c
+++ b/mm/util.c
@@ -614,9 +614,12 @@ EXPORT_SYMBOL(kvmalloc_node);
  */
 void kvfree(const void *addr)
 {
-       if (is_vmalloc_addr(addr))
-               vfree(addr);
-       else
+       if (is_vmalloc_addr(addr)) {
+               if (in_atomic())
+                       vfree_atomic(addr);
+               else
+                       vfree(addr);
+       } else
                kfree(addr);
 }
 EXPORT_SYMBOL(kvfree);




* Re: [PATCH] mm/util.c: Make kvfree() safe for calling while holding spinlocks
From: Manfred Spraul @ 2021-12-23 11:52 UTC
  To: Vasily Averin, LKML, Andrew Morton
  Cc: cgel.zte, shakeelb, rdunlap, dbueso, unixbhaskar, chi.minghao,
	arnd, Zeal Robot, linux-mm, 1vier1, stable

Hello Vasily,

On 12/23/21 08:21, Vasily Averin wrote:
>
> I would prefer to release memory ASAP if it's possible.
> What do you think about this change?
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -614,9 +614,12 @@ EXPORT_SYMBOL(kvmalloc_node);
>    */
>   void kvfree(const void *addr)
>   {
> -       if (is_vmalloc_addr(addr))
> -               vfree(addr);
> -       else
> +       if (is_vmalloc_addr(addr)) {
> +               if (in_atomic())
> +                       vfree_atomic(addr);
> +               else
> +                       vfree(addr);
> +       } else
>                  kfree(addr);
>   }
>   EXPORT_SYMBOL(kvfree);
>
Unfortunately this cannot work:

> /*
> * Are we running in atomic context?  WARNING: this macro cannot
> * always detect atomic context; in particular, it cannot know about
> * held spinlocks in non-preemptible kernels.  Thus it should not be
> * used in the general case to determine whether sleeping is possible.
> * Do not use in_atomic() in driver code.
> */
> #define in_atomic()     (preempt_count() != 0)
>
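
To make the limitation concrete: on a kernel built without
CONFIG_PREEMPT_COUNT, spin_lock() does not touch preempt_count(), so a
check like the following (a hedged sketch, 'some_lock' is illustrative)
takes the wrong branch:

	spin_lock(&some_lock);		/* preempt_count() stays 0 */
	if (in_atomic())		/* false, yet sleeping is forbidden */
		vfree_atomic(addr);
	else
		vfree(addr);		/* may sleep -> bug */
	spin_unlock(&some_lock);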

--

     Manfred



* Re: [PATCH] mm/util.c: Make kvfree() safe for calling while holding spinlocks
From: Vasily Averin @ 2021-12-23 12:34 UTC
  To: Manfred Spraul, LKML, Andrew Morton
  Cc: cgel.zte, shakeelb, rdunlap, dbueso, unixbhaskar, chi.minghao,
	arnd, Zeal Robot, linux-mm, 1vier1, stable

On 23.12.2021 14:52, Manfred Spraul wrote:
> Hello Vasily,
> 
> On 12/23/21 08:21, Vasily Averin wrote:
>>
>> I would prefer to release memory ASAP if it's possible.
>> What do you think about this change?
>> --- a/mm/util.c
>> +++ b/mm/util.c
>> @@ -614,9 +614,12 @@ EXPORT_SYMBOL(kvmalloc_node);
>>    */
>>   void kvfree(const void *addr)
>>   {
>> -       if (is_vmalloc_addr(addr))
>> -               vfree(addr);
>> -       else
>> +       if (is_vmalloc_addr(addr)) {
>> +               if (in_atomic())
>> +                       vfree_atomic(addr);
>> +               else
>> +                       vfree(addr);
>> +       } else
>>                  kfree(addr);
>>   }
>>   EXPORT_SYMBOL(kvfree);
>>
> Unfortunately this cannot work:

Yes, you're right, and I do not see any better solution yet.


* Re: [PATCH] mm/util.c: Make kvfree() safe for calling while holding spinlocks
From: Uladzislau Rezki @ 2021-12-25 18:54 UTC
  To: Manfred Spraul
  Cc: LKML, Andrew Morton, Vasily Averin, cgel.zte, shakeelb, rdunlap,
	dbueso, unixbhaskar, chi.minghao, arnd, Zeal Robot, linux-mm,
	1vier1, stable

> One codepath in find_alloc_undo() calls kvfree() while holding a spinlock.
> Since vfree() can sleep this is a bug.
> 
> Previously, the code path used kfree(), and kfree() is safe to be called
> while holding a spinlock.
> 
> Minghao proposed to fix this by updating find_alloc_undo().
> 
> Alternate proposal to fix this: Instead of changing find_alloc_undo(),
> change kvfree() so that the same rules as for kfree() apply:
> Having different rules for kfree() and kvfree() just asks for bugs.
> 
> Disadvantage: Releasing vmalloc'ed memory will be delayed a bit.
> 
I guess the issue is with the "vmap_purge_lock" mutex? I think it is
better to make the vfree() call non-blocking, i.e. the current design
suffers from one drawback: it purges the outstanding lazy areas from the
caller's context. The drain process can be time consuming, and if it is
done from high-prio or RT contexts it can hog a CPU. Another issue is
the one you reported, namely calling schedule() while holding a
spinlock. The proposal is to perform the drain in a separate work item:

<snip>
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index d2a00ad4e1dd..7c5d9b148fa4 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1717,18 +1717,6 @@ static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end)
 	return true;
 }
 
-/*
- * Kick off a purge of the outstanding lazy areas. Don't bother if somebody
- * is already purging.
- */
-static void try_purge_vmap_area_lazy(void)
-{
-	if (mutex_trylock(&vmap_purge_lock)) {
-		__purge_vmap_area_lazy(ULONG_MAX, 0);
-		mutex_unlock(&vmap_purge_lock);
-	}
-}
-
 /*
  * Kick off a purge of the outstanding lazy areas.
  */
@@ -1740,6 +1728,16 @@ static void purge_vmap_area_lazy(void)
 	mutex_unlock(&vmap_purge_lock);
 }
 
+static void drain_vmap_area(struct work_struct *work)
+{
+	if (mutex_trylock(&vmap_purge_lock)) {
+		__purge_vmap_area_lazy(ULONG_MAX, 0);
+		mutex_unlock(&vmap_purge_lock);
+	}
+}
+
+static DECLARE_WORK(drain_vmap_area_work, drain_vmap_area);
+
 /*
  * Free a vmap area, caller ensuring that the area has been unmapped
  * and flush_cache_vunmap had been called for the correct range
@@ -1766,7 +1764,7 @@ static void free_vmap_area_noflush(struct vmap_area *va)
 
 	/* After this point, we may free va at any time */
 	if (unlikely(nr_lazy > lazy_max_pages()))
-		try_purge_vmap_area_lazy();
+		schedule_work(&drain_vmap_area_work);
 }
 
 /*
<snip>


--
Vlad Rezki


* Re: [PATCH] mm/util.c: Make kvfree() safe for calling while holding spinlocks
From: Matthew Wilcox @ 2021-12-25 22:58 UTC
  To: Uladzislau Rezki
  Cc: Manfred Spraul, LKML, Andrew Morton, Vasily Averin, cgel.zte,
	shakeelb, rdunlap, dbueso, unixbhaskar, chi.minghao, arnd,
	Zeal Robot, linux-mm, 1vier1, stable

On Sat, Dec 25, 2021 at 07:54:12PM +0100, Uladzislau Rezki wrote:
> +static void drain_vmap_area(struct work_struct *work)
> +{
> +	if (mutex_trylock(&vmap_purge_lock)) {
> +		__purge_vmap_area_lazy(ULONG_MAX, 0);
> +		mutex_unlock(&vmap_purge_lock);
> +	}
> +}
> +
> +static DECLARE_WORK(drain_vmap_area_work, drain_vmap_area);

Presumably if the worker fails to get the mutex, it should reschedule
itself?  And should it even trylock or just always lock?

This kind of ties into something I've been wondering about -- we have
a number of places in the kernel which cache 'freed' vmalloc allocations
in order to speed up future allocations of the same size.  Kind of like
slab.  Would we be better off trying to cache frequent allocations
inside vmalloc instead of always purging them?



* Re: [PATCH] mm/util.c: Make kvfree() safe for calling while holding spinlocks
From: Uladzislau Rezki @ 2021-12-26 17:57 UTC
  To: Matthew Wilcox
  Cc: Uladzislau Rezki, Manfred Spraul, LKML, Andrew Morton,
	Vasily Averin, cgel.zte, shakeelb, rdunlap, dbueso, unixbhaskar,
	chi.minghao, arnd, Zeal Robot, linux-mm, 1vier1, stable

On Sat, Dec 25, 2021 at 10:58:29PM +0000, Matthew Wilcox wrote:
> On Sat, Dec 25, 2021 at 07:54:12PM +0100, Uladzislau Rezki wrote:
> > +static void drain_vmap_area(struct work_struct *work)
> > +{
> > +	if (mutex_trylock(&vmap_purge_lock)) {
> > +		__purge_vmap_area_lazy(ULONG_MAX, 0);
> > +		mutex_unlock(&vmap_purge_lock);
> > +	}
> > +}
> > +
> > +static DECLARE_WORK(drain_vmap_area_work, drain_vmap_area);
> 
> Presumably if the worker fails to get the mutex, it should reschedule
> itself?  And should it even trylock or just always lock?
> 
mutex_trylock() makes no sense here. It should just always take the
lock, otherwise we can miss the point at which a purge is needed.
I agree with your opinion.

>
> This kind of ties into something I've been wondering about -- we have
> a number of places in the kernel which cache 'freed' vmalloc allocations
> in order to speed up future allocations of the same size.  Kind of like
> slab.  Would we be better off trying to cache frequent allocations
> inside vmalloc instead of always purging them?
>
Hm... Some sort of caching would be good, though it will require some
time to think over the details and the design itself. We could cache VAs
instead of purging them, up to some point or threshold: keep them in our
data structures, associate them with a cache based on size, and reuse
them later in alloc_vmap_area().

All of that covers "vmap_area" caching. Another option is to cache the
"vm_struct", which includes the "vmap_area" plus the pages that drive
the mapping. That is a higher level of caching, and I am not sure an
implementation would be as straightforward.
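
Purely as an illustration of the vmap_area-caching idea (all names below
are hypothetical, not existing kernel API):

	#define NR_VA_SIZE_CLASSES	8	/* hypothetical power-of-two classes */

	/* Hypothetical per-size-class cache of freed vmap areas. */
	static struct llist_head va_size_cache[NR_VA_SIZE_CLASSES];

	static int va_size_class(unsigned long size)
	{
		int idx = fls_long(size >> PAGE_SHIFT);

		return min(idx, NR_VA_SIZE_CLASSES - 1);
	}

On free, a vmap_area would be pushed onto va_size_cache[] instead of
being purged; alloc_vmap_area() would first try to pop an entry of a
matching class before falling back to the regular search. A threshold on
the total cached pages would bound the memory held back.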

--
Vlad Rezki


* Re: [PATCH] mm/util.c: Make kvfree() safe for calling while holding spinlocks
From: Uladzislau Rezki @ 2021-12-28 19:45 UTC
  To: Manfred Spraul
  Cc: Matthew Wilcox, Manfred Spraul, LKML, Andrew Morton,
	Vasily Averin, cgel.zte, shakeelb, rdunlap, dbueso, unixbhaskar,
	chi.minghao, arnd, Zeal Robot, linux-mm, 1vier1, stable

On Sun, Dec 26, 2021 at 06:57:16PM +0100, Uladzislau Rezki wrote:
> On Sat, Dec 25, 2021 at 10:58:29PM +0000, Matthew Wilcox wrote:
> > On Sat, Dec 25, 2021 at 07:54:12PM +0100, Uladzislau Rezki wrote:
> > > +static void drain_vmap_area(struct work_struct *work)
> > > +{
> > > +	if (mutex_trylock(&vmap_purge_lock)) {
> > > +		__purge_vmap_area_lazy(ULONG_MAX, 0);
> > > +		mutex_unlock(&vmap_purge_lock);
> > > +	}
> > > +}
> > > +
> > > +static DECLARE_WORK(drain_vmap_area_work, drain_vmap_area);
> > 
> > Presumably if the worker fails to get the mutex, it should reschedule
> > itself?  And should it even trylock or just always lock?
> > 
> mutex_trylock() has no sense here. It should just always get the lock.
> Otherwise we can miss the point to purge. Agree with your opinion.
> 
Below is the patch that addresses Matthew's points:

<snip>
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index d2a00ad4e1dd..b82db44fea60 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1717,17 +1717,10 @@ static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end)
 	return true;
 }
 
-/*
- * Kick off a purge of the outstanding lazy areas. Don't bother if somebody
- * is already purging.
- */
-static void try_purge_vmap_area_lazy(void)
-{
-	if (mutex_trylock(&vmap_purge_lock)) {
-		__purge_vmap_area_lazy(ULONG_MAX, 0);
-		mutex_unlock(&vmap_purge_lock);
-	}
-}
+static void purge_vmap_area_lazy(void);
+static void drain_vmap_area(struct work_struct *work);
+static DECLARE_WORK(drain_vmap_area_work, drain_vmap_area);
+static atomic_t drain_vmap_area_work_in_progress;
 
 /*
  * Kick off a purge of the outstanding lazy areas.
@@ -1740,6 +1733,22 @@ static void purge_vmap_area_lazy(void)
 	mutex_unlock(&vmap_purge_lock);
 }
 
+static void drain_vmap_area(struct work_struct *work)
+{
+	mutex_lock(&vmap_purge_lock);
+	__purge_vmap_area_lazy(ULONG_MAX, 0);
+	mutex_unlock(&vmap_purge_lock);
+
+	/*
+	 * Check if rearming is still required. If not, we are
+	 * done and can let the next caller initiate a new drain.
+	 */
+	if (atomic_long_read(&vmap_lazy_nr) > lazy_max_pages())
+		schedule_work(&drain_vmap_area_work);
+	else
+		atomic_set(&drain_vmap_area_work_in_progress, 0);
+}
+
 /*
  * Free a vmap area, caller ensuring that the area has been unmapped
  * and flush_cache_vunmap had been called for the correct range
@@ -1766,7 +1775,8 @@ static void free_vmap_area_noflush(struct vmap_area *va)
 
 	/* After this point, we may free va at any time */
 	if (unlikely(nr_lazy > lazy_max_pages()))
-		try_purge_vmap_area_lazy();
+		if (!atomic_xchg(&drain_vmap_area_work_in_progress, 1))
+			schedule_work(&drain_vmap_area_work);
 }
 
 /*
<snip>

Manfred, could you please have a look and, if you have time, test it?
I mean, check whether it solves your issue. You can take over this patch
and resend it; otherwise I can send it myself later if we all agree on it.

--
Vlad Rezki


* Re: [PATCH] mm/util.c: Make kvfree() safe for calling while holding spinlocks
From: Manfred Spraul @ 2021-12-28 20:04 UTC
  To: Uladzislau Rezki
  Cc: Matthew Wilcox, LKML, Andrew Morton, Vasily Averin, cgel.zte,
	shakeelb, rdunlap, dbueso, unixbhaskar, chi.minghao, arnd,
	Zeal Robot, linux-mm, 1vier1, stable

Hello Vlad,

On 12/28/21 20:45, Uladzislau Rezki wrote:
> [...]
> Manfred, could you please have a look and, if you have time, test it?
> I mean, check whether it solves your issue. You can take over this patch
> and resend it; otherwise I can send it myself later if we all agree on it.

I think we are mixing two tasks here: we have a bug in ipc/sem.c, thus
we need a solution suitable for stable.

Fixes: fc37a3b8b438 ("[PATCH] ipc sem: use kvmalloc for sem_undo 
allocation")
Cc: stable@vger.kernel.org

I think for stable, there are only two options:

- change ipc/sem.c, call kvfree() after dropping the spinlock

- change kvfree() to use vfree_atomic().

 From my point of view, both approaches are fine.
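
For reference, the first option would look roughly like this (a sketch
of the idea assuming the structure of find_alloc_undo(), not the actual
tested hunk):

	un = lookup_undo(ulp, semid);
	if (un) {
		spin_unlock(&ulp->lock);
		rcu_read_unlock();
		/* All locks dropped: kvfree() may sleep safely here. */
		kvfree(new);
		goto out;	/* 'out' label is illustrative */
	}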

I.e. I'm waiting for feedback from an mm maintainer.

As soon as it is agreed, I will retest the chosen solution.


Now you propose to redesign vfree(), so that vfree() is safe to be 
called while holding spinlocks:

> <snip>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index d2a00ad4e1dd..b82db44fea60 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -1717,17 +1717,10 @@ static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end)
>   	return true;
>   }
>   
> -/*
> - * Kick off a purge of the outstanding lazy areas. Don't bother if somebody
> - * is already purging.
> - */
> -static void try_purge_vmap_area_lazy(void)
> -{
> -	if (mutex_trylock(&vmap_purge_lock)) {
> -		__purge_vmap_area_lazy(ULONG_MAX, 0);
> -		mutex_unlock(&vmap_purge_lock);
> -	}
> -}
> +static void purge_vmap_area_lazy(void);
> +static void drain_vmap_area(struct work_struct *work);
> +static DECLARE_WORK(drain_vmap_area_work, drain_vmap_area);
> +static atomic_t drain_vmap_area_work_in_progress;
>   
>   /*
>    * Kick off a purge of the outstanding lazy areas.
> @@ -1740,6 +1733,22 @@ static void purge_vmap_area_lazy(void)
>   	mutex_unlock(&vmap_purge_lock);
>   }
>   
> +static void drain_vmap_area(struct work_struct *work)
> +{
> +	mutex_lock(&vmap_purge_lock);
> +	__purge_vmap_area_lazy(ULONG_MAX, 0);
> +	mutex_unlock(&vmap_purge_lock);
> +
> +	/*
> +	 * Check if rearming is still required. If not, we are
> +	 * done and can let the next caller initiate a new drain.
> +	 */
> +	if (atomic_long_read(&vmap_lazy_nr) > lazy_max_pages())
> +		schedule_work(&drain_vmap_area_work);
> +	else
> +		atomic_set(&drain_vmap_area_work_in_progress, 0);
> +}
> +
>   /*
>    * Free a vmap area, caller ensuring that the area has been unmapped
>    * and flush_cache_vunmap had been called for the correct range
> @@ -1766,7 +1775,8 @@ static void free_vmap_area_noflush(struct vmap_area *va)
>   
>   	/* After this point, we may free va at any time */
>   	if (unlikely(nr_lazy > lazy_max_pages()))
> -		try_purge_vmap_area_lazy();
> +		if (!atomic_xchg(&drain_vmap_area_work_in_progress, 1))
> +			schedule_work(&drain_vmap_area_work);
>   }
>   
>   /*
> <snip>
I do not know the mm code well enough to understand the side effects of
the change. And I doubt that it is suitable for stable, i.e. we need the
simple patch first.

--

     Manfred



* Re: [PATCH] mm/util.c: Make kvfree() safe for calling while holding spinlocks
From: Uladzislau Rezki @ 2021-12-28 20:26 UTC
  To: Manfred Spraul
  Cc: Uladzislau Rezki, Matthew Wilcox, LKML, Andrew Morton,
	Vasily Averin, cgel.zte, shakeelb, rdunlap, dbueso, unixbhaskar,
	chi.minghao, arnd, Zeal Robot, linux-mm, 1vier1, stable

> Hello Vlad,
> 
> On 12/28/21 20:45, Uladzislau Rezki wrote:
> > [...]
> > Manfred, could you please have a look and, if you have time, test it?
> > I mean, check whether it solves your issue. You can take over this patch
> > and resend it; otherwise I can send it myself later if we all agree on it.
> 
> I think we are mixing two tasks here: we have a bug in ipc/sem.c, thus
> we need a solution suitable for stable.
> 
> Fixes: fc37a3b8b438 ("[PATCH] ipc sem: use kvmalloc for sem_undo
> allocation")
> Cc: stable@vger.kernel.org
> 
> I think for stable, there are only two options:
> 
> - change ipc/sem.c, call kvfree() after dropping the spinlock
> 
> - change kvfree() to use vfree_atomic().
> 
> From my point of view, both approaches are fine.
> 
> I.e. I'm waiting for feedback from an mm maintainer.
> 
> As soon as it is agreed, I will retest the chosen solution.
> 
To me it looks like a behavioral change either way, and it is hard to
judge whether the second solution is safe for stable, because it is a
new change and the kvfree() interface is modified internally.

> 
> Now you propose to redesign vfree(), so that vfree() is safe to be called
> while holding spinlocks:
> 
> > <snip>
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index d2a00ad4e1dd..b82db44fea60 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -1717,17 +1717,10 @@ static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end)
> >   	return true;
> >   }
> > -/*
> > - * Kick off a purge of the outstanding lazy areas. Don't bother if somebody
> > - * is already purging.
> > - */
> > -static void try_purge_vmap_area_lazy(void)
> > -{
> > -	if (mutex_trylock(&vmap_purge_lock)) {
> > -		__purge_vmap_area_lazy(ULONG_MAX, 0);
> > -		mutex_unlock(&vmap_purge_lock);
> > -	}
> > -}
> > +static void purge_vmap_area_lazy(void);
> > +static void drain_vmap_area(struct work_struct *work);
> > +static DECLARE_WORK(drain_vmap_area_work, drain_vmap_area);
> > +static atomic_t drain_vmap_area_work_in_progress;
> >   /*
> >    * Kick off a purge of the outstanding lazy areas.
> > @@ -1740,6 +1733,22 @@ static void purge_vmap_area_lazy(void)
> >   	mutex_unlock(&vmap_purge_lock);
> >   }
> > +static void drain_vmap_area(struct work_struct *work)
> > +{
> > +	mutex_lock(&vmap_purge_lock);
> > +	__purge_vmap_area_lazy(ULONG_MAX, 0);
> > +	mutex_unlock(&vmap_purge_lock);
> > +
> > +	/*
> > +	 * Check if rearming is still required. If not, we are
> > +	 * done and can let the next caller initiate a new drain.
> > +	 */
> > +	if (atomic_long_read(&vmap_lazy_nr) > lazy_max_pages())
> > +		schedule_work(&drain_vmap_area_work);
> > +	else
> > +		atomic_set(&drain_vmap_area_work_in_progress, 0);
> > +}
> > +
> >   /*
> >    * Free a vmap area, caller ensuring that the area has been unmapped
> >    * and flush_cache_vunmap had been called for the correct range
> > @@ -1766,7 +1775,8 @@ static void free_vmap_area_noflush(struct vmap_area *va)
> >   	/* After this point, we may free va at any time */
> >   	if (unlikely(nr_lazy > lazy_max_pages()))
> > -		try_purge_vmap_area_lazy();
> > +		if (!atomic_xchg(&drain_vmap_area_work_in_progress, 1))
> > +			schedule_work(&drain_vmap_area_work);
> >   }
> >   /*
> > <snip>
> I do not know the mm code well enough to understand the side effects of
> the change. And I doubt that it is suitable for stable, i.e. we need the
> simple patch first.
> 
Well, it is as simple as it could be :)

--
Vlad Rezki


* Re: [PATCH] mm/util.c: Make kvfree() safe for calling while holding spinlocks
From: Andrew Morton @ 2022-01-27  2:53 UTC
  To: Manfred Spraul
  Cc: LKML, Vasily Averin, cgel.zte, shakeelb, rdunlap, dbueso,
	unixbhaskar, chi.minghao, arnd, Zeal Robot, linux-mm, 1vier1,
	stable, Michal Hocko

On Wed, 22 Dec 2021 20:48:28 +0100 Manfred Spraul <manfred@colorfullife.com> wrote:

> One codepath in find_alloc_undo() calls kvfree() while holding a spinlock.
> Since vfree() can sleep this is a bug.
> 
> Previously, the code path used kfree(), and kfree() is safe to be called
> while holding a spinlock.
> 
> Minghao proposed to fix this by updating find_alloc_undo().
> 
> Alternate proposal to fix this: Instead of changing find_alloc_undo(),
> change kvfree() so that the same rules as for kfree() apply:
> Having different rules for kfree() and kvfree() just asks for bugs.
> 
> Disadvantage: Releasing vmalloc'ed memory will be delayed a bit.

I know we've been around this loop a bunch of times and deferring was
considered.   But I forget the conclusion.  IIRC, mhocko was involved?

> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -610,12 +610,12 @@ EXPORT_SYMBOL(kvmalloc_node);
>   * It is slightly more efficient to use kfree() or vfree() if you are certain
>   * that you know which one to use.
>   *
> - * Context: Either preemptible task context or not-NMI interrupt.
> + * Context: Any context except NMI interrupt.
>   */
>  void kvfree(const void *addr)
>  {
>  	if (is_vmalloc_addr(addr))
> -		vfree(addr);
> +		vfree_atomic(addr);
>  	else
>  		kfree(addr);
>  }




* Re: [PATCH] mm/util.c: Make kvfree() safe for calling while holding spinlocks
From: Manfred Spraul @ 2022-01-27  5:59 UTC
  To: Andrew Morton
  Cc: LKML, Vasily Averin, cgel.zte, shakeelb, rdunlap, dbueso,
	unixbhaskar, chi.minghao, arnd, Zeal Robot, linux-mm, 1vier1,
	stable, Michal Hocko

Hi Andrew,

On 1/27/22 03:53, Andrew Morton wrote:
> On Wed, 22 Dec 2021 20:48:28 +0100 Manfred Spraul <manfred@colorfullife.com> wrote:
>
>> One codepath in find_alloc_undo() calls kvfree() while holding a spinlock.
>> Since vfree() can sleep this is a bug.
>>
>> Previously, the code path used kfree(), and kfree() is safe to be called
>> while holding a spinlock.
>>
>> Minghao proposed to fix this by updating find_alloc_undo().
>>
>> Alternate proposal to fix this: Instead of changing find_alloc_undo(),
>> change kvfree() so that the same rules as for kfree() apply:
>> Having different rules for kfree() and kvfree() just asks for bugs.
>>
>> Disadvantage: Releasing vmalloc'ed memory will be delayed a bit.
> I know we've been around this loop a bunch of times and deferring was
> considered.   But I forget the conclusion.  IIRC, mhocko was involved?

I do not remember a mail from mhocko.

Shakeel proposed to use the approach from Chi.

Decision: https://marc.info/?l=linux-kernel&m=164132032717757&w=2

With Reviewed-by:

https://marc.info/?l=linux-kernel&m=164132744522325&w=2
>> --- a/mm/util.c
>> +++ b/mm/util.c
>> @@ -610,12 +610,12 @@ EXPORT_SYMBOL(kvmalloc_node);
>>    * It is slightly more efficient to use kfree() or vfree() if you are certain
>>    * that you know which one to use.
>>    *
>> - * Context: Either preemptible task context or not-NMI interrupt.
>> + * Context: Any context except NMI interrupt.
>>    */
>>   void kvfree(const void *addr)
>>   {
>>   	if (is_vmalloc_addr(addr))
>> -		vfree(addr);
>> +		vfree_atomic(addr);
>>   	else
>>   		kfree(addr);
>>   }




* Re: [PATCH] mm/util.c: Make kvfree() safe for calling while holding spinlocks
From: Michal Hocko @ 2022-01-27  8:25 UTC
  To: Manfred Spraul
  Cc: Andrew Morton, LKML, Vasily Averin, cgel.zte, shakeelb, rdunlap,
	dbueso, unixbhaskar, chi.minghao, arnd, Zeal Robot, linux-mm,
	1vier1, stable

On Thu 27-01-22 06:59:50, Manfred Spraul wrote:
> Hi Andrew,
> 
> On 1/27/22 03:53, Andrew Morton wrote:
> > On Wed, 22 Dec 2021 20:48:28 +0100 Manfred Spraul <manfred@colorfullife.com> wrote:
> > 
> > > One codepath in find_alloc_undo() calls kvfree() while holding a spinlock.
> > > Since vfree() can sleep this is a bug.
> > > 
> > > Previously, the code path used kfree(), and kfree() is safe to be called
> > > while holding a spinlock.
> > > 
> > > Minghao proposed to fix this by updating find_alloc_undo().
> > > 
> > > Alternate proposal to fix this: Instead of changing find_alloc_undo(),
> > > change kvfree() so that the same rules as for kfree() apply:
> > > Having different rules for kfree() and kvfree() just asks for bugs.
> > > 
> > > Disadvantage: Releasing vmalloc'ed memory will be delayed a bit.
> > I know we've been around this loop a bunch of times and deferring was
> > considered.   But I forget the conclusion.  IIRC, mhocko was involved?
> 
> I do not remember a mail from mhocko.

I do not remember either.

> 
> Shakeel proposed to use the approach from Chi.
> 
> Decision: https://marc.info/?l=linux-kernel&m=164132032717757&w=2

And I would agree with Shakeel and go with the original change to the
ipc code. That is trivial and without any other side effects, unlike
this one. I bet nobody has evaluated what effects the unconditional
deferred freeing has. At least the changelog doesn't really dive into
that beyond a very vague statement that this will happen.

> With Reviewed-by:
> 
> https://marc.info/?l=linux-kernel&m=164132744522325&w=2
> > > --- a/mm/util.c
> > > +++ b/mm/util.c
> > > @@ -610,12 +610,12 @@ EXPORT_SYMBOL(kvmalloc_node);
> > >    * It is slightly more efficient to use kfree() or vfree() if you are certain
> > >    * that you know which one to use.
> > >    *
> > > - * Context: Either preemptible task context or not-NMI interrupt.
> > > + * Context: Any context except NMI interrupt.
> > >    */
> > >   void kvfree(const void *addr)
> > >   {
> > >   	if (is_vmalloc_addr(addr))
> > > -		vfree(addr);
> > > +		vfree_atomic(addr);
> > >   	else
> > >   		kfree(addr);
> > >   }
> 

-- 
Michal Hocko
SUSE Labs


* Re: [PATCH] mm/util.c: Make kvfree() safe for calling while holding spinlocks
From: Uladzislau Rezki @ 2022-01-27 15:54 UTC
  To: Michal Hocko, Andrew Morton
  Cc: Manfred Spraul, Andrew Morton, LKML, Vasily Averin, cgel.zte,
	shakeelb, rdunlap, dbueso, unixbhaskar, chi.minghao, arnd,
	Zeal Robot, linux-mm, 1vier1, stable

On Thu, Jan 27, 2022 at 09:25:48AM +0100, Michal Hocko wrote:
> On Thu 27-01-22 06:59:50, Manfred Spraul wrote:
> > Hi Andrew,
> > 
> > On 1/27/22 03:53, Andrew Morton wrote:
> > > On Wed, 22 Dec 2021 20:48:28 +0100 Manfred Spraul <manfred@colorfullife.com> wrote:
> > > 
> > > > One codepath in find_alloc_undo() calls kvfree() while holding a spinlock.
> > > > Since vfree() can sleep this is a bug.
> > > > 
> > > > Previously, the code path used kfree(), and kfree() is safe to be called
> > > > while holding a spinlock.
> > > > 
> > > > Minghao proposed to fix this by updating find_alloc_undo().
> > > > 
> > > > Alternate proposal to fix this: Instead of changing find_alloc_undo(),
> > > > change kvfree() so that the same rules as for kfree() apply:
> > > > Having different rules for kfree() and kvfree() just asks for bugs.
> > > > 
> > > > Disadvantage: Releasing vmalloc'ed memory will be delayed a bit.
> > > I know we've been around this loop a bunch of times and deferring was
> > > considered.   But I forget the conclusion.  IIRC, mhocko was involved?
> > 
> > I do not remember a mail from mhocko.
> 
> I do not remember either.
> 
> > 
> > Shakeel proposed to use the approach from Chi.
> > 
> > Decision: https://marc.info/?l=linux-kernel&m=164132032717757&w=2
> 
> And I would agree with Shakeel and go with the original change to the
> ipc code. That is trivial and without any other side effects like this
> one. I bet nobody has evaluated what the undconditional deferred freeing
> has. At least changelog doesn't really dive into that more than a very
> vague statement that this will happen.
>
Absolutely agree here. Especially since changing kvfree() does not look
like stable material.

After applying https://www.spinics.net/lists/linux-mm/msg282264.html
we will be able to use vfree() from atomic context anyway.

--
Vlad Rezki
