Linux-mm Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH] kmemleak: Do not corrupt the object_list during clean-up
@ 2019-10-04 13:46 Catalin Marinas
  2019-10-05  3:11 ` Alexey Kardashevskiy
  0 siblings, 1 reply; 4+ messages in thread
From: Catalin Marinas @ 2019-10-04 13:46 UTC (permalink / raw)
  To: linux-mm; +Cc: linux-kernel, Alexey Kardashevskiy, Marc Dionne, Andrew Morton

In case of an error (e.g. memory pool too small), kmemleak disables
itself and cleans up the already allocated metadata objects. However, if
this happens early before the RCU callback mechanism is available,
put_object() skips call_rcu() and frees the object directly. This is not
safe with the RCU list traversal in __kmemleak_do_cleanup().

Change the list traversal in __kmemleak_do_cleanup() to
list_for_each_entry_safe() and remove the rcu_read_{lock,unlock} since
the kmemleak is already disabled at this point. In addition, avoid an
unnecessary metadata object rb-tree look-up since it already has the
struct kmemleak_object pointer.

Fixes: c5665868183f ("mm: kmemleak: use the memory pool for early allocations")
Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reported-by: Marc Dionne <marc.c.dionne@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 mm/kmemleak.c | 30 +++++++++++++++++++++---------
 1 file changed, 21 insertions(+), 9 deletions(-)

diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index 03a8d84badad..244607663363 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -526,6 +526,16 @@ static struct kmemleak_object *find_and_get_object(unsigned long ptr, int alias)
 	return object;
 }
 
+/*
+ * Remove an object from the object_tree_root and object_list. Must be called
+ * with the kmemleak_lock held _if_ kmemleak is still enabled.
+ */
+static void __remove_object(struct kmemleak_object *object)
+{
+	rb_erase(&object->rb_node, &object_tree_root);
+	list_del_rcu(&object->object_list);
+}
+
 /*
  * Look up an object in the object search tree and remove it from both
  * object_tree_root and object_list. The returned object's use_count should be
@@ -538,10 +548,8 @@ static struct kmemleak_object *find_and_remove_object(unsigned long ptr, int ali
 
 	write_lock_irqsave(&kmemleak_lock, flags);
 	object = lookup_object(ptr, alias);
-	if (object) {
-		rb_erase(&object->rb_node, &object_tree_root);
-		list_del_rcu(&object->object_list);
-	}
+	if (object)
+		__remove_object(object);
 	write_unlock_irqrestore(&kmemleak_lock, flags);
 
 	return object;
@@ -1834,12 +1842,16 @@ static const struct file_operations kmemleak_fops = {
 
 static void __kmemleak_do_cleanup(void)
 {
-	struct kmemleak_object *object;
+	struct kmemleak_object *object, *tmp;
 
-	rcu_read_lock();
-	list_for_each_entry_rcu(object, &object_list, object_list)
-		delete_object_full(object->pointer);
-	rcu_read_unlock();
+	/*
+	 * Kmemleak has already been disabled, no need for RCU list traversal
+	 * or kmemleak_lock held.
+	 */
+	list_for_each_entry_safe(object, tmp, &object_list, object_list) {
+		__remove_object(object);
+		__delete_object(object);
+	}
 }
 
 /*


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] kmemleak: Do not corrupt the object_list during clean-up
  2019-10-04 13:46 [PATCH] kmemleak: Do not corrupt the object_list during clean-up Catalin Marinas
@ 2019-10-05  3:11 ` Alexey Kardashevskiy
  2019-10-09 16:37   ` Song Liu
  0 siblings, 1 reply; 4+ messages in thread
From: Alexey Kardashevskiy @ 2019-10-05  3:11 UTC (permalink / raw)
  To: Catalin Marinas, linux-mm; +Cc: linux-kernel, Marc Dionne, Andrew Morton



On 04/10/2019 23:46, Catalin Marinas wrote:
> In case of an error (e.g. memory pool too small), kmemleak disables
> itself and cleans up the already allocated metadata objects. However, if
> this happens early before the RCU callback mechanism is available,
> put_object() skips call_rcu() and frees the object directly. This is not
> safe with the RCU list traversal in __kmemleak_do_cleanup().
> 
> Change the list traversal in __kmemleak_do_cleanup() to
> list_for_each_entry_safe() and remove the rcu_read_{lock,unlock} since
> the kmemleak is already disabled at this point. In addition, avoid an
> unnecessary metadata object rb-tree look-up since it already has the
> struct kmemleak_object pointer.
> 
> Fixes: c5665868183f ("mm: kmemleak: use the memory pool for early allocations")
> Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> Reported-by: Marc Dionne <marc.c.dionne@gmail.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>

It not just fixed lockups but brought network speed back to normal but I guess it is because kmemleak disables itself
when CONFIG_DEBUG_KMEMLEAK_MEM_POOL_SIZE=400.

dmesg:
[    0.000000] kmemleak: Memory pool empty, consider increasing CONFIG_DEBUG_KMEMLEAK_MEM_POOL_SIZE
[    0.000000] kmemleak: Cannot allocate a kmemleak_object structure
[    0.000000] kmemleak: Kernel memory leak detector disabled



> ---
>  mm/kmemleak.c | 30 +++++++++++++++++++++---------
>  1 file changed, 21 insertions(+), 9 deletions(-)
> 
> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
> index 03a8d84badad..244607663363 100644
> --- a/mm/kmemleak.c
> +++ b/mm/kmemleak.c
> @@ -526,6 +526,16 @@ static struct kmemleak_object *find_and_get_object(unsigned long ptr, int alias)
>  	return object;
>  }
>  
> +/*
> + * Remove an object from the object_tree_root and object_list. Must be called
> + * with the kmemleak_lock held _if_ kmemleak is still enabled.
> + */
> +static void __remove_object(struct kmemleak_object *object)
> +{
> +	rb_erase(&object->rb_node, &object_tree_root);
> +	list_del_rcu(&object->object_list);
> +}
> +
>  /*
>   * Look up an object in the object search tree and remove it from both
>   * object_tree_root and object_list. The returned object's use_count should be
> @@ -538,10 +548,8 @@ static struct kmemleak_object *find_and_remove_object(unsigned long ptr, int ali
>  
>  	write_lock_irqsave(&kmemleak_lock, flags);
>  	object = lookup_object(ptr, alias);
> -	if (object) {
> -		rb_erase(&object->rb_node, &object_tree_root);
> -		list_del_rcu(&object->object_list);
> -	}
> +	if (object)
> +		__remove_object(object);
>  	write_unlock_irqrestore(&kmemleak_lock, flags);
>  
>  	return object;
> @@ -1834,12 +1842,16 @@ static const struct file_operations kmemleak_fops = {
>  
>  static void __kmemleak_do_cleanup(void)
>  {
> -	struct kmemleak_object *object;
> +	struct kmemleak_object *object, *tmp;
>  
> -	rcu_read_lock();
> -	list_for_each_entry_rcu(object, &object_list, object_list)
> -		delete_object_full(object->pointer);
> -	rcu_read_unlock();
> +	/*
> +	 * Kmemleak has already been disabled, no need for RCU list traversal
> +	 * or kmemleak_lock held.
> +	 */
> +	list_for_each_entry_safe(object, tmp, &object_list, object_list) {
> +		__remove_object(object);
> +		__delete_object(object);
> +	}
>  }
>  
>  /*
> 

-- 
Alexey


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] kmemleak: Do not corrupt the object_list during clean-up
  2019-10-05  3:11 ` Alexey Kardashevskiy
@ 2019-10-09 16:37   ` Song Liu
  2019-10-09 16:37     ` Song Liu
  0 siblings, 1 reply; 4+ messages in thread
From: Song Liu @ 2019-10-09 16:37 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: Catalin Marinas, Linux-MM, open list, Marc Dionne, Andrew Morton

On Fri, Oct 4, 2019 at 8:11 PM Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>
>
>
> On 04/10/2019 23:46, Catalin Marinas wrote:
> > In case of an error (e.g. memory pool too small), kmemleak disables
> > itself and cleans up the already allocated metadata objects. However, if
> > this happens early before the RCU callback mechanism is available,
> > put_object() skips call_rcu() and frees the object directly. This is not
> > safe with the RCU list traversal in __kmemleak_do_cleanup().
> >
> > Change the list traversal in __kmemleak_do_cleanup() to
> > list_for_each_entry_safe() and remove the rcu_read_{lock,unlock} since
> > the kmemleak is already disabled at this point. In addition, avoid an
> > unnecessary metadata object rb-tree look-up since it already has the
> > struct kmemleak_object pointer.
> >
> > Fixes: c5665868183f ("mm: kmemleak: use the memory pool for early allocations")
> > Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> > Reported-by: Marc Dionne <marc.c.dionne@gmail.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
>
>
> Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>

Tested-by: Song Liu <songliubraving@fb.com>

This fixes my vm, which could not boot with 5.4-rc3.

Thanks,
Song


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] kmemleak: Do not corrupt the object_list during clean-up
  2019-10-09 16:37   ` Song Liu
@ 2019-10-09 16:37     ` Song Liu
  0 siblings, 0 replies; 4+ messages in thread
From: Song Liu @ 2019-10-09 16:37 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: Catalin Marinas, Linux-MM, open list, Marc Dionne, Andrew Morton

On Fri, Oct 4, 2019 at 8:11 PM Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>
>
>
> On 04/10/2019 23:46, Catalin Marinas wrote:
> > In case of an error (e.g. memory pool too small), kmemleak disables
> > itself and cleans up the already allocated metadata objects. However, if
> > this happens early before the RCU callback mechanism is available,
> > put_object() skips call_rcu() and frees the object directly. This is not
> > safe with the RCU list traversal in __kmemleak_do_cleanup().
> >
> > Change the list traversal in __kmemleak_do_cleanup() to
> > list_for_each_entry_safe() and remove the rcu_read_{lock,unlock} since
> > the kmemleak is already disabled at this point. In addition, avoid an
> > unnecessary metadata object rb-tree look-up since it already has the
> > struct kmemleak_object pointer.
> >
> > Fixes: c5665868183f ("mm: kmemleak: use the memory pool for early allocations")
> > Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> > Reported-by: Marc Dionne <marc.c.dionne@gmail.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
>
>
> Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>

Tested-by: Song Liu <songliubraving@fb.com>

This fixes my vm, which could not boot with 5.4-rc3.

Thanks,
Song



^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, back to index

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-04 13:46 [PATCH] kmemleak: Do not corrupt the object_list during clean-up Catalin Marinas
2019-10-05  3:11 ` Alexey Kardashevskiy
2019-10-09 16:37   ` Song Liu
2019-10-09 16:37     ` Song Liu

Linux-mm Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-mm/0 linux-mm/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-mm linux-mm/ https://lore.kernel.org/linux-mm \
		linux-mm@kvack.org
	public-inbox-index linux-mm

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kvack.linux-mm


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git