All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] kasan: fix slab double free when cpu-hotplug
@ 2020-12-04 10:22 qiang.zhang
  2020-12-04 12:01 ` Kuan-Ying Lee
  0 siblings, 1 reply; 9+ messages in thread
From: qiang.zhang @ 2020-12-04 10:22 UTC (permalink / raw)
  To: aryabinin, dvyukov
  Cc: akpm, andreyknvl, qcai, kuan-ying.lee, linux-mm, linux-kernel

From: Zqiang <qiang.zhang@windriver.com>

When a CPU offline, the per-cpu quarantine's offline be set true,
after this, if the quarantine_put be called in this CPU, the objects
will be free and return false, free objects doesn't to be done, due
to return false, the slab memory manager will free this objects.

Fixes: 41ab1aae781f ("kasan: fix object remaining in offline per-cpu quarantine")
Signed-off-by: Zqiang <qiang.zhang@windriver.com>
---
 mm/kasan/quarantine.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/mm/kasan/quarantine.c b/mm/kasan/quarantine.c
index d98b516f372f..55783125a767 100644
--- a/mm/kasan/quarantine.c
+++ b/mm/kasan/quarantine.c
@@ -194,7 +194,6 @@ bool quarantine_put(struct kmem_cache *cache, void *object)
 
 	q = this_cpu_ptr(&cpu_quarantine);
 	if (q->offline) {
-		qlink_free(&meta->quarantine_link, cache);
 		local_irq_restore(flags);
 		return false;
 	}
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] kasan: fix slab double free when cpu-hotplug
  2020-12-04 10:22 [PATCH] kasan: fix slab double free when cpu-hotplug qiang.zhang
@ 2020-12-04 12:01 ` Kuan-Ying Lee
  2020-12-05  1:25   ` Andrew Morton
  0 siblings, 1 reply; 9+ messages in thread
From: Kuan-Ying Lee @ 2020-12-04 12:01 UTC (permalink / raw)
  To: qiang.zhang, sfr
  Cc: aryabinin, dvyukov, akpm, andreyknvl, qcai, linux-mm,
	linux-kernel, walter-zh.wu

On Fri, 2020-12-04 at 18:22 +0800, qiang.zhang@windriver.com wrote:
> From: Zqiang <qiang.zhang@windriver.com>
> 
> When a CPU offline, the per-cpu quarantine's offline be set true,
> after this, if the quarantine_put be called in this CPU, the objects
> will be free and return false, free objects doesn't to be done, due
> to return false, the slab memory manager will free this objects.
> 
> Fixes: 41ab1aae781f ("kasan: fix object remaining in offline per-cpu quarantine")
> Signed-off-by: Zqiang <qiang.zhang@windriver.com>
> ---
>  mm/kasan/quarantine.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/mm/kasan/quarantine.c b/mm/kasan/quarantine.c
> index d98b516f372f..55783125a767 100644
> --- a/mm/kasan/quarantine.c
> +++ b/mm/kasan/quarantine.c
> @@ -194,7 +194,6 @@ bool quarantine_put(struct kmem_cache *cache, void *object)
>  
>  	q = this_cpu_ptr(&cpu_quarantine);
>  	if (q->offline) {
> -		qlink_free(&meta->quarantine_link, cache);
>  		local_irq_restore(flags);
>  		return false;
>  	}

Hi Qiang,

Thanks for fixing this.
Due to that issue, my commit has been removed by Stephen from
linux-next.


Hi Stephen, Andrew,

Should I directly upload the v4 or Stephen can pick the commit which 
has been removed back to the linux-next.

What do you think?

Thanks a lot.

Kuan-Ying

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] kasan: fix slab double free when cpu-hotplug
  2020-12-04 12:01 ` Kuan-Ying Lee
@ 2020-12-05  1:25   ` Andrew Morton
  2020-12-05 16:17     ` Kuan-Ying Lee
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2020-12-05  1:25 UTC (permalink / raw)
  To: Kuan-Ying Lee
  Cc: qiang.zhang, sfr, aryabinin, dvyukov, andreyknvl, qcai, linux-mm,
	linux-kernel, walter-zh.wu

On Fri, 4 Dec 2020 20:01:35 +0800 Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com> wrote:

> > diff --git a/mm/kasan/quarantine.c b/mm/kasan/quarantine.c
> > index d98b516f372f..55783125a767 100644
> > --- a/mm/kasan/quarantine.c
> > +++ b/mm/kasan/quarantine.c
> > @@ -194,7 +194,6 @@ bool quarantine_put(struct kmem_cache *cache, void *object)
> >  
> >  	q = this_cpu_ptr(&cpu_quarantine);
> >  	if (q->offline) {
> > -		qlink_free(&meta->quarantine_link, cache);
> >  		local_irq_restore(flags);
> >  		return false;
> >  	}
> 
> Hi Qiang,
> 
> Thanks for fixing this.
> Due to that issue, my commit has been removed by Stephen from
> linux-next.
> 
> 
> Hi Stephen, Andrew,
> 
> Should I directly upload the v4 or Stephen can pick the commit which 
> has been removed back to the linux-next.

I took care of it.  Restored the original patch and added this one as a
-fix.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] kasan: fix slab double free when cpu-hotplug
  2020-12-05  1:25   ` Andrew Morton
@ 2020-12-05 16:17     ` Kuan-Ying Lee
  2020-12-06  1:09       ` Andrew Morton
  0 siblings, 1 reply; 9+ messages in thread
From: Kuan-Ying Lee @ 2020-12-05 16:17 UTC (permalink / raw)
  To: Andrew Morton
  Cc: qiang.zhang, sfr, aryabinin, dvyukov, andreyknvl, qcai, linux-mm,
	linux-kernel, walter-zh.wu

On Fri, 2020-12-04 at 17:25 -0800, Andrew Morton wrote:
> On Fri, 4 Dec 2020 20:01:35 +0800 Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com> wrote:
> 
> > > diff --git a/mm/kasan/quarantine.c b/mm/kasan/quarantine.c
> > > index d98b516f372f..55783125a767 100644
> > > --- a/mm/kasan/quarantine.c
> > > +++ b/mm/kasan/quarantine.c
> > > @@ -194,7 +194,6 @@ bool quarantine_put(struct kmem_cache *cache, void *object)
> > >  
> > >  	q = this_cpu_ptr(&cpu_quarantine);
> > >  	if (q->offline) {
> > > -		qlink_free(&meta->quarantine_link, cache);
> > >  		local_irq_restore(flags);
> > >  		return false;

Hi Andrew,

Return false will cause slab allocator to free the object.
Thus, we do not need to qlink_free here to free object twice.

The return value is introduced from Andrey's patch.
"kasan: sanitize objects when metadata doesn't fit"


> > >  	}
> > 
> > Hi Qiang,
> > 
> > Thanks for fixing this.
> > Due to that issue, my commit has been removed by Stephen from
> > linux-next.
> > 
> > 
> > Hi Stephen, Andrew,
> > 
> > Should I directly upload the v4 or Stephen can pick the commit which 
> > has been removed back to the linux-next.
> 
> I took care of it.  Restored the original patch and added this one as a
> -fix.

Thanks for taking care of it.

I think there are some problem in the patch you just restored.
I saw the restored patch is not based on Andrey's patch and Stephen's
fix conflict patch.

But the issue Qiang fixed need to be based on the Andrey's patch and
Stephen's fix conflict patch.
"kasan: sanitize objects when metadata doesn't fit"
"kasan-rename-get_alloc-free_info-fix"

If the restored patch is not based on that, it may cause some problems
and conflicts.

I think I can prepare a patch v4 based on Andrey's patch, fix the
conflict and include the Qiang's modification.

Thanks,
Kuan-Ying



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] kasan: fix slab double free when cpu-hotplug
  2020-12-05 16:17     ` Kuan-Ying Lee
@ 2020-12-06  1:09       ` Andrew Morton
  2020-12-07  2:06         ` Kuan-Ying Lee
  2020-12-11 13:43         ` Chris Down
  0 siblings, 2 replies; 9+ messages in thread
From: Andrew Morton @ 2020-12-06  1:09 UTC (permalink / raw)
  To: Kuan-Ying Lee
  Cc: qiang.zhang, sfr, aryabinin, dvyukov, andreyknvl, qcai, linux-mm,
	linux-kernel, walter-zh.wu

On Sun, 6 Dec 2020 00:17:15 +0800 Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com> wrote:

> On Fri, 2020-12-04 at 17:25 -0800, Andrew Morton wrote:
> > On Fri, 4 Dec 2020 20:01:35 +0800 Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com> wrote:
> > 
> > > > diff --git a/mm/kasan/quarantine.c b/mm/kasan/quarantine.c
> > > > index d98b516f372f..55783125a767 100644
> > > > --- a/mm/kasan/quarantine.c
> > > > +++ b/mm/kasan/quarantine.c
> > > > @@ -194,7 +194,6 @@ bool quarantine_put(struct kmem_cache *cache, void *object)
> > > >  
> > > >  	q = this_cpu_ptr(&cpu_quarantine);
> > > >  	if (q->offline) {
> > > > -		qlink_free(&meta->quarantine_link, cache);
> > > >  		local_irq_restore(flags);
> > > >  		return false;
> 
> Hi Andrew,
> 
> Return false will cause slab allocator to free the object.
> Thus, we do not need to qlink_free here to free object twice.
> 
> The return value is introduced from Andrey's patch.
> "kasan: sanitize objects when metadata doesn't fit"
> 
> 
> > > >  	}
> > > 
> > > Hi Qiang,
> > > 
> > > Thanks for fixing this.
> > > Due to that issue, my commit has been removed by Stephen from
> > > linux-next.
> > > 
> > > 
> > > Hi Stephen, Andrew,
> > > 
> > > Should I directly upload the v4 or Stephen can pick the commit which 
> > > has been removed back to the linux-next.
> > 
> > I took care of it.  Restored the original patch and added this one as a
> > -fix.
> 
> Thanks for taking care of it.
> 
> I think there are some problem in the patch you just restored.
> I saw the restored patch is not based on Andrey's patch and Stephen's
> fix conflict patch.
> 
> But the issue Qiang fixed need to be based on the Andrey's patch and
> Stephen's fix conflict patch.
> "kasan: sanitize objects when metadata doesn't fit"
> "kasan-rename-get_alloc-free_info-fix"
> 
> If the restored patch is not based on that, it may cause some problems
> and conflicts.
> 
> I think I can prepare a patch v4 based on Andrey's patch, fix the
> conflict and include the Qiang's modification.

I'm not sure what you mean here.  When appying this fix, yes, I had to
replace "meta" with "info", of course.

So the combined patch, which I'd like to send to Linus next week is as
below.  Is there something wrong with it?


From: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
Subject: kasan: fix object remaining in offline per-cpu quarantine

We hit this issue in our internal test.  When enabling generic kasan, a
kfree()'d object is put into per-cpu quarantine first.  If the cpu goes
offline, object still remains in the per-cpu quarantine.  If we call
kmem_cache_destroy() now, slub will report "Objects remaining" error.

[   74.982625] =============================================================================
[   74.983380] BUG test_module_slab (Not tainted): Objects remaining in test_module_slab on __kmem_cache_shutdown()
[   74.984145] -----------------------------------------------------------------------------
[   74.984145]
[   74.984883] Disabling lock debugging due to kernel taint
[   74.985561] INFO: Slab 0x(____ptrval____) objects=34 used=1 fp=0x(____ptrval____) flags=0x2ffff00000010200
[   74.986638] CPU: 3 PID: 176 Comm: cat Tainted: G    B             5.10.0-rc1-00007-g4525c8781ec0-dirty #10
[   74.987262] Hardware name: linux,dummy-virt (DT)
[   74.987606] Call trace:
[   74.987924]  dump_backtrace+0x0/0x2b0
[   74.988296]  show_stack+0x18/0x68
[   74.988698]  dump_stack+0xfc/0x168
[   74.989030]  slab_err+0xac/0xd4
[   74.989346]  __kmem_cache_shutdown+0x1e4/0x3c8
[   74.989779]  kmem_cache_destroy+0x68/0x130
[   74.990176]  test_version_show+0x84/0xf0
[   74.990679]  module_attr_show+0x40/0x60
[   74.991218]  sysfs_kf_seq_show+0x128/0x1c0
[   74.991656]  kernfs_seq_show+0xa0/0xb8
[   74.992059]  seq_read+0x1f0/0x7e8
[   74.992415]  kernfs_fop_read+0x70/0x338
[   74.993051]  vfs_read+0xe4/0x250
[   74.993498]  ksys_read+0xc8/0x180
[   74.993825]  __arm64_sys_read+0x44/0x58
[   74.994203]  el0_svc_common.constprop.0+0xac/0x228
[   74.994708]  do_el0_svc+0x38/0xa0
[   74.995088]  el0_sync_handler+0x170/0x178
[   74.995497]  el0_sync+0x174/0x180
[   74.996050] INFO: Object 0x(____ptrval____) @offset=15848
[   74.996752] INFO: Allocated in test_version_show+0x98/0xf0 age=8188 cpu=6 pid=172
[   75.000802]  stack_trace_save+0x9c/0xd0
[   75.002420]  set_track+0x64/0xf0
[   75.002770]  alloc_debug_processing+0x104/0x1a0
[   75.003171]  ___slab_alloc+0x628/0x648
[   75.004213]  __slab_alloc.isra.0+0x2c/0x58
[   75.004757]  kmem_cache_alloc+0x560/0x588
[   75.005376]  test_version_show+0x98/0xf0
[   75.005756]  module_attr_show+0x40/0x60
[   75.007035]  sysfs_kf_seq_show+0x128/0x1c0
[   75.007433]  kernfs_seq_show+0xa0/0xb8
[   75.007800]  seq_read+0x1f0/0x7e8
[   75.008128]  kernfs_fop_read+0x70/0x338
[   75.008507]  vfs_read+0xe4/0x250
[   75.008990]  ksys_read+0xc8/0x180
[   75.009462]  __arm64_sys_read+0x44/0x58
[   75.010085]  el0_svc_common.constprop.0+0xac/0x228
[   75.011006] kmem_cache_destroy test_module_slab: Slab cache still has objects

Register a cpu hotplug function to remove all objects in the offline
per-cpu quarantine when cpu is going offline.  Set a per-cpu variable to
indicate this cpu is offline.

[qiang.zhang@windriver.com: fix slab double free when cpu-hotplug]
  Link: https://lkml.kernel.org/r/20201204102206.20237-1-qiang.zhang@windriver.com
Link: https://lkml.kernel.org/r/1606895585-17382-2-git-send-email-Kuan-Ying.Lee@mediatek.com
Signed-off-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
Signed-off-by: Zqiang <qiang.zhang@windriver.com>
Suggested-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: Guangye Yang <guangye.yang@mediatek.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Nicholas Tang <nicholas.tang@mediatek.com>
Cc: Miles Chen <miles.chen@mediatek.com>
Cc: Qian Cai <qcai@redhat.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/kasan/quarantine.c |   39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

--- a/mm/kasan/quarantine.c~kasan-fix-object-remain-in-offline-per-cpu-quarantine
+++ a/mm/kasan/quarantine.c
@@ -29,6 +29,7 @@
 #include <linux/srcu.h>
 #include <linux/string.h>
 #include <linux/types.h>
+#include <linux/cpuhotplug.h>
 
 #include "../slab.h"
 #include "kasan.h"
@@ -43,6 +44,7 @@ struct qlist_head {
 	struct qlist_node *head;
 	struct qlist_node *tail;
 	size_t bytes;
+	bool offline;
 };
 
 #define QLIST_INIT { NULL, NULL, 0 }
@@ -188,6 +190,10 @@ void quarantine_put(struct kasan_free_me
 	local_irq_save(flags);
 
 	q = this_cpu_ptr(&cpu_quarantine);
+	if (q->offline) {
+		local_irq_restore(flags);
+		return;
+	}
 	qlist_put(q, &info->quarantine_link, cache->size);
 	if (unlikely(q->bytes > QUARANTINE_PERCPU_SIZE)) {
 		qlist_move_all(q, &temp);
@@ -328,3 +334,36 @@ void quarantine_remove_cache(struct kmem
 
 	synchronize_srcu(&remove_cache_srcu);
 }
+
+static int kasan_cpu_online(unsigned int cpu)
+{
+	this_cpu_ptr(&cpu_quarantine)->offline = false;
+	return 0;
+}
+
+static int kasan_cpu_offline(unsigned int cpu)
+{
+	struct qlist_head *q;
+
+	q = this_cpu_ptr(&cpu_quarantine);
+	/* Ensure the ordering between the writing to q->offline and
+	 * qlist_free_all. Otherwise, cpu_quarantine may be corrupted
+	 * by interrupt.
+	 */
+	WRITE_ONCE(q->offline, true);
+	barrier();
+	qlist_free_all(q, NULL);
+	return 0;
+}
+
+static int __init kasan_cpu_quarantine_init(void)
+{
+	int ret = 0;
+
+	ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "mm/kasan:online",
+				kasan_cpu_online, kasan_cpu_offline);
+	if (ret < 0)
+		pr_err("kasan cpu quarantine register failed [%d]\n", ret);
+	return ret;
+}
+late_initcall(kasan_cpu_quarantine_init);
_


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] kasan: fix slab double free when cpu-hotplug
  2020-12-06  1:09       ` Andrew Morton
@ 2020-12-07  2:06         ` Kuan-Ying Lee
  2020-12-07  7:00           ` Kuan-Ying Lee
  2020-12-11 13:43         ` Chris Down
  1 sibling, 1 reply; 9+ messages in thread
From: Kuan-Ying Lee @ 2020-12-07  2:06 UTC (permalink / raw)
  To: Andrew Morton
  Cc: qiang.zhang, sfr, aryabinin, dvyukov, andreyknvl, qcai, linux-mm,
	linux-kernel, walter-zh.wu

On Sat, 2020-12-05 at 17:09 -0800, Andrew Morton wrote:
> On Sun, 6 Dec 2020 00:17:15 +0800 Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com> wrote:
> 
> > On Fri, 2020-12-04 at 17:25 -0800, Andrew Morton wrote:
> > > On Fri, 4 Dec 2020 20:01:35 +0800 Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com> wrote:
> > > 
> > > > > diff --git a/mm/kasan/quarantine.c b/mm/kasan/quarantine.c
> > > > > index d98b516f372f..55783125a767 100644
> > > > > --- a/mm/kasan/quarantine.c
> > > > > +++ b/mm/kasan/quarantine.c
> > > > > @@ -194,7 +194,6 @@ bool quarantine_put(struct kmem_cache *cache, void *object)
> > > > >  
> > > > >  	q = this_cpu_ptr(&cpu_quarantine);
> > > > >  	if (q->offline) {
> > > > > -		qlink_free(&meta->quarantine_link, cache);
> > > > >  		local_irq_restore(flags);
> > > > >  		return false;
> > 
> > Hi Andrew,
> > 
> > Return false will cause slab allocator to free the object.
> > Thus, we do not need to qlink_free here to free object twice.
> > 
> > The return value is introduced from Andrey's patch.
> > "kasan: sanitize objects when metadata doesn't fit"
> > 
> > 
> > > > >  	}
> > > > 
> > > > Hi Qiang,
> > > > 
> > > > Thanks for fixing this.
> > > > Due to that issue, my commit has been removed by Stephen from
> > > > linux-next.
> > > > 
> > > > 
> > > > Hi Stephen, Andrew,
> > > > 
> > > > Should I directly upload the v4 or Stephen can pick the commit which 
> > > > has been removed back to the linux-next.
> > > 
> > > I took care of it.  Restored the original patch and added this one as a
> > > -fix.
> > 
> > Thanks for taking care of it.
> > 
> > I think there are some problem in the patch you just restored.
> > I saw the restored patch is not based on Andrey's patch and Stephen's
> > fix conflict patch.
> > 
> > But the issue Qiang fixed need to be based on the Andrey's patch and
> > Stephen's fix conflict patch.
> > "kasan: sanitize objects when metadata doesn't fit"
> > "kasan-rename-get_alloc-free_info-fix"
> > 
> > If the restored patch is not based on that, it may cause some problems
> > and conflicts.
> > 
> > I think I can prepare a patch v4 based on Andrey's patch, fix the
> > conflict and include the Qiang's modification.
> 
> I'm not sure what you mean here.  When appying this fix, yes, I had to
> replace "meta" with "info", of course.
> 
> So the combined patch, which I'd like to send to Linus next week is as
> below.  Is there something wrong with it?
> 

Is this combined patch based on Andrey's patch?

If yes, Andrey's patch not only change the "info" to "meta" but also
introduce the return value.
I think we need to add return value or it will build error.

> 
> From: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
> Subject: kasan: fix object remaining in offline per-cpu quarantine
> 
> We hit this issue in our internal test.  When enabling generic kasan, a
> kfree()'d object is put into per-cpu quarantine first.  If the cpu goes
> offline, object still remains in the per-cpu quarantine.  If we call
> kmem_cache_destroy() now, slub will report "Objects remaining" error.
> 
> [   74.982625] =============================================================================
> [   74.983380] BUG test_module_slab (Not tainted): Objects remaining in test_module_slab on __kmem_cache_shutdown()
> [   74.984145] -----------------------------------------------------------------------------
> [   74.984145]
> [   74.984883] Disabling lock debugging due to kernel taint
> [   74.985561] INFO: Slab 0x(____ptrval____) objects=34 used=1 fp=0x(____ptrval____) flags=0x2ffff00000010200
> [   74.986638] CPU: 3 PID: 176 Comm: cat Tainted: G    B             5.10.0-rc1-00007-g4525c8781ec0-dirty #10
> [   74.987262] Hardware name: linux,dummy-virt (DT)
> [   74.987606] Call trace:
> [   74.987924]  dump_backtrace+0x0/0x2b0
> [   74.988296]  show_stack+0x18/0x68
> [   74.988698]  dump_stack+0xfc/0x168
> [   74.989030]  slab_err+0xac/0xd4
> [   74.989346]  __kmem_cache_shutdown+0x1e4/0x3c8
> [   74.989779]  kmem_cache_destroy+0x68/0x130
> [   74.990176]  test_version_show+0x84/0xf0
> [   74.990679]  module_attr_show+0x40/0x60
> [   74.991218]  sysfs_kf_seq_show+0x128/0x1c0
> [   74.991656]  kernfs_seq_show+0xa0/0xb8
> [   74.992059]  seq_read+0x1f0/0x7e8
> [   74.992415]  kernfs_fop_read+0x70/0x338
> [   74.993051]  vfs_read+0xe4/0x250
> [   74.993498]  ksys_read+0xc8/0x180
> [   74.993825]  __arm64_sys_read+0x44/0x58
> [   74.994203]  el0_svc_common.constprop.0+0xac/0x228
> [   74.994708]  do_el0_svc+0x38/0xa0
> [   74.995088]  el0_sync_handler+0x170/0x178
> [   74.995497]  el0_sync+0x174/0x180
> [   74.996050] INFO: Object 0x(____ptrval____) @offset=15848
> [   74.996752] INFO: Allocated in test_version_show+0x98/0xf0 age=8188 cpu=6 pid=172
> [   75.000802]  stack_trace_save+0x9c/0xd0
> [   75.002420]  set_track+0x64/0xf0
> [   75.002770]  alloc_debug_processing+0x104/0x1a0
> [   75.003171]  ___slab_alloc+0x628/0x648
> [   75.004213]  __slab_alloc.isra.0+0x2c/0x58
> [   75.004757]  kmem_cache_alloc+0x560/0x588
> [   75.005376]  test_version_show+0x98/0xf0
> [   75.005756]  module_attr_show+0x40/0x60
> [   75.007035]  sysfs_kf_seq_show+0x128/0x1c0
> [   75.007433]  kernfs_seq_show+0xa0/0xb8
> [   75.007800]  seq_read+0x1f0/0x7e8
> [   75.008128]  kernfs_fop_read+0x70/0x338
> [   75.008507]  vfs_read+0xe4/0x250
> [   75.008990]  ksys_read+0xc8/0x180
> [   75.009462]  __arm64_sys_read+0x44/0x58
> [   75.010085]  el0_svc_common.constprop.0+0xac/0x228
> [   75.011006] kmem_cache_destroy test_module_slab: Slab cache still has objects
> 
> Register a cpu hotplug function to remove all objects in the offline
> per-cpu quarantine when cpu is going offline.  Set a per-cpu variable to
> indicate this cpu is offline.
> 
> [qiang.zhang@windriver.com: fix slab double free when cpu-hotplug]
>   Link: https://lkml.kernel.org/r/20201204102206.20237-1-qiang.zhang@windriver.com
> Link: https://lkml.kernel.org/r/1606895585-17382-2-git-send-email-Kuan-Ying.Lee@mediatek.com
> Signed-off-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
> Signed-off-by: Zqiang <qiang.zhang@windriver.com>
> Suggested-by: Dmitry Vyukov <dvyukov@google.com>
> Reported-by: Guangye Yang <guangye.yang@mediatek.com>
> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
> Cc: Alexander Potapenko <glider@google.com>
> Cc: Matthias Brugger <matthias.bgg@gmail.com>
> Cc: Nicholas Tang <nicholas.tang@mediatek.com>
> Cc: Miles Chen <miles.chen@mediatek.com>
> Cc: Qian Cai <qcai@redhat.com>
> Cc: Stephen Rothwell <sfr@canb.auug.org.au>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
> 
>  mm/kasan/quarantine.c |   39 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 39 insertions(+)
> 
> --- a/mm/kasan/quarantine.c~kasan-fix-object-remain-in-offline-per-cpu-quarantine
> +++ a/mm/kasan/quarantine.c
> @@ -29,6 +29,7 @@
>  #include <linux/srcu.h>
>  #include <linux/string.h>
>  #include <linux/types.h>
> +#include <linux/cpuhotplug.h>
>  
>  #include "../slab.h"
>  #include "kasan.h"
> @@ -43,6 +44,7 @@ struct qlist_head {
>  	struct qlist_node *head;
>  	struct qlist_node *tail;
>  	size_t bytes;
> +	bool offline;
>  };
>  
>  #define QLIST_INIT { NULL, NULL, 0 }
> @@ -188,6 +190,10 @@ void quarantine_put(struct kasan_free_me

Andrey's patch changes the return value from "void" to "bool".
We need to replace void with bool.

>  	local_irq_save(flags);
>  
>  	q = this_cpu_ptr(&cpu_quarantine);
> +	if (q->offline) {
> +		local_irq_restore(flags);
> +		return;

I think we need to return false here. Otherwise, it will lack return
value and cause build error.

> +	}
>  	qlist_put(q, &info->quarantine_link, cache->size);

This "info" may cause conflict because Andrey's patch has already
changed it to "meta".

Thanks.
Kuan-Ying

>  	if (unlikely(q->bytes > QUARANTINE_PERCPU_SIZE)) {
>  		qlist_move_all(q, &temp);
> @@ -328,3 +334,36 @@ void quarantine_remove_cache(struct kmem
>  
>  	synchronize_srcu(&remove_cache_srcu);
>  }
> +
> +static int kasan_cpu_online(unsigned int cpu)
> +{
> +	this_cpu_ptr(&cpu_quarantine)->offline = false;
> +	return 0;
> +}
> +
> +static int kasan_cpu_offline(unsigned int cpu)
> +{
> +	struct qlist_head *q;
> +
> +	q = this_cpu_ptr(&cpu_quarantine);
> +	/* Ensure the ordering between the writing to q->offline and
> +	 * qlist_free_all. Otherwise, cpu_quarantine may be corrupted
> +	 * by interrupt.
> +	 */
> +	WRITE_ONCE(q->offline, true);
> +	barrier();
> +	qlist_free_all(q, NULL);
> +	return 0;
> +}
> +
> +static int __init kasan_cpu_quarantine_init(void)
> +{
> +	int ret = 0;
> +
> +	ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "mm/kasan:online",
> +				kasan_cpu_online, kasan_cpu_offline);
> +	if (ret < 0)
> +		pr_err("kasan cpu quarantine register failed [%d]\n", ret);
> +	return ret;
> +}
> +late_initcall(kasan_cpu_quarantine_init);
> _
> 


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] kasan: fix slab double free when cpu-hotplug
  2020-12-07  2:06         ` Kuan-Ying Lee
@ 2020-12-07  7:00           ` Kuan-Ying Lee
  0 siblings, 0 replies; 9+ messages in thread
From: Kuan-Ying Lee @ 2020-12-07  7:00 UTC (permalink / raw)
  To: Andrew Morton
  Cc: qiang.zhang, sfr, aryabinin, dvyukov, andreyknvl, qcai, linux-mm,
	linux-kernel, walter-zh.wu

On Mon, 2020-12-07 at 10:06 +0800, Kuan-Ying Lee wrote:
> On Sat, 2020-12-05 at 17:09 -0800, Andrew Morton wrote:
> > On Sun, 6 Dec 2020 00:17:15 +0800 Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com> wrote:
> > 
> > > On Fri, 2020-12-04 at 17:25 -0800, Andrew Morton wrote:
> > > > On Fri, 4 Dec 2020 20:01:35 +0800 Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com> wrote:
> > > > 
> > > > > > diff --git a/mm/kasan/quarantine.c b/mm/kasan/quarantine.c
> > > > > > index d98b516f372f..55783125a767 100644
> > > > > > --- a/mm/kasan/quarantine.c
> > > > > > +++ b/mm/kasan/quarantine.c
> > > > > > @@ -194,7 +194,6 @@ bool quarantine_put(struct kmem_cache *cache, void *object)
> > > > > >  
> > > > > >  	q = this_cpu_ptr(&cpu_quarantine);
> > > > > >  	if (q->offline) {
> > > > > > -		qlink_free(&meta->quarantine_link, cache);
> > > > > >  		local_irq_restore(flags);
> > > > > >  		return false;
> > > 
> > > Hi Andrew,
> > > 
> > > Return false will cause slab allocator to free the object.
> > > Thus, we do not need to qlink_free here to free object twice.
> > > 
> > > The return value is introduced from Andrey's patch.
> > > "kasan: sanitize objects when metadata doesn't fit"
> > > 
> > > 
> > > > > >  	}
> > > > > 
> > > > > Hi Qiang,
> > > > > 
> > > > > Thanks for fixing this.
> > > > > Due to that issue, my commit has been removed by Stephen from
> > > > > linux-next.
> > > > > 
> > > > > 
> > > > > Hi Stephen, Andrew,
> > > > > 
> > > > > Should I directly upload the v4 or Stephen can pick the commit which 
> > > > > has been removed back to the linux-next.
> > > > 
> > > > I took care of it.  Restored the original patch and added this one as a
> > > > -fix.
> > > 
> > > Thanks for taking care of it.
> > > 
> > > I think there are some problem in the patch you just restored.
> > > I saw the restored patch is not based on Andrey's patch and Stephen's
> > > fix conflict patch.
> > > 
> > > But the issue Qiang fixed need to be based on the Andrey's patch and
> > > Stephen's fix conflict patch.
> > > "kasan: sanitize objects when metadata doesn't fit"
> > > "kasan-rename-get_alloc-free_info-fix"
> > > 
> > > If the restored patch is not based on that, it may cause some problems
> > > and conflicts.
> > > 
> > > I think I can prepare a patch v4 based on Andrey's patch, fix the
> > > conflict and include the Qiang's modification.
> > 
> > I'm not sure what you mean here.  When appying this fix, yes, I had to
> > replace "meta" with "info", of course.
> > 
> > So the combined patch, which I'd like to send to Linus next week is as
> > below.  Is there something wrong with it?
> > 
> 
> Is this combined patch based on Andrey's patch?
> 
> If yes, Andrey's patch not only change the "info" to "meta" but also
> introduce the return value.
> I think we need to add return value or it will build error.
> 

Hi Andrew,

Sorry to bother.

Thanks for fixing my patch.
But it still has some problem.

I know you are busy. I can fix that by myself.
I will upload the patch v4 to fix the conflicts and build errors and 
include the Qiang's patch.

Thanks.
Kuan-Ying

> > 
> > From: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
> > Subject: kasan: fix object remaining in offline per-cpu quarantine
> > 
> > We hit this issue in our internal test.  When enabling generic kasan, a
> > kfree()'d object is put into per-cpu quarantine first.  If the cpu goes
> > offline, object still remains in the per-cpu quarantine.  If we call
> > kmem_cache_destroy() now, slub will report "Objects remaining" error.
> > 
> > [   74.982625] =============================================================================
> > [   74.983380] BUG test_module_slab (Not tainted): Objects remaining in test_module_slab on __kmem_cache_shutdown()
> > [   74.984145] -----------------------------------------------------------------------------
> > [   74.984145]
> > [   74.984883] Disabling lock debugging due to kernel taint
> > [   74.985561] INFO: Slab 0x(____ptrval____) objects=34 used=1 fp=0x(____ptrval____) flags=0x2ffff00000010200
> > [   74.986638] CPU: 3 PID: 176 Comm: cat Tainted: G    B             5.10.0-rc1-00007-g4525c8781ec0-dirty #10
> > [   74.987262] Hardware name: linux,dummy-virt (DT)
> > [   74.987606] Call trace:
> > [   74.987924]  dump_backtrace+0x0/0x2b0
> > [   74.988296]  show_stack+0x18/0x68
> > [   74.988698]  dump_stack+0xfc/0x168
> > [   74.989030]  slab_err+0xac/0xd4
> > [   74.989346]  __kmem_cache_shutdown+0x1e4/0x3c8
> > [   74.989779]  kmem_cache_destroy+0x68/0x130
> > [   74.990176]  test_version_show+0x84/0xf0
> > [   74.990679]  module_attr_show+0x40/0x60
> > [   74.991218]  sysfs_kf_seq_show+0x128/0x1c0
> > [   74.991656]  kernfs_seq_show+0xa0/0xb8
> > [   74.992059]  seq_read+0x1f0/0x7e8
> > [   74.992415]  kernfs_fop_read+0x70/0x338
> > [   74.993051]  vfs_read+0xe4/0x250
> > [   74.993498]  ksys_read+0xc8/0x180
> > [   74.993825]  __arm64_sys_read+0x44/0x58
> > [   74.994203]  el0_svc_common.constprop.0+0xac/0x228
> > [   74.994708]  do_el0_svc+0x38/0xa0
> > [   74.995088]  el0_sync_handler+0x170/0x178
> > [   74.995497]  el0_sync+0x174/0x180
> > [   74.996050] INFO: Object 0x(____ptrval____) @offset=15848
> > [   74.996752] INFO: Allocated in test_version_show+0x98/0xf0 age=8188 cpu=6 pid=172
> > [   75.000802]  stack_trace_save+0x9c/0xd0
> > [   75.002420]  set_track+0x64/0xf0
> > [   75.002770]  alloc_debug_processing+0x104/0x1a0
> > [   75.003171]  ___slab_alloc+0x628/0x648
> > [   75.004213]  __slab_alloc.isra.0+0x2c/0x58
> > [   75.004757]  kmem_cache_alloc+0x560/0x588
> > [   75.005376]  test_version_show+0x98/0xf0
> > [   75.005756]  module_attr_show+0x40/0x60
> > [   75.007035]  sysfs_kf_seq_show+0x128/0x1c0
> > [   75.007433]  kernfs_seq_show+0xa0/0xb8
> > [   75.007800]  seq_read+0x1f0/0x7e8
> > [   75.008128]  kernfs_fop_read+0x70/0x338
> > [   75.008507]  vfs_read+0xe4/0x250
> > [   75.008990]  ksys_read+0xc8/0x180
> > [   75.009462]  __arm64_sys_read+0x44/0x58
> > [   75.010085]  el0_svc_common.constprop.0+0xac/0x228
> > [   75.011006] kmem_cache_destroy test_module_slab: Slab cache still has objects
> > 
> > Register a cpu hotplug function to remove all objects in the offline
> > per-cpu quarantine when cpu is going offline.  Set a per-cpu variable to
> > indicate this cpu is offline.
> > 
> > [qiang.zhang@windriver.com: fix slab double free when cpu-hotplug]
> >   Link: https://lkml.kernel.org/r/20201204102206.20237-1-qiang.zhang@windriver.com
> > Link: https://lkml.kernel.org/r/1606895585-17382-2-git-send-email-Kuan-Ying.Lee@mediatek.com
> > Signed-off-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
> > Signed-off-by: Zqiang <qiang.zhang@windriver.com>
> > Suggested-by: Dmitry Vyukov <dvyukov@google.com>
> > Reported-by: Guangye Yang <guangye.yang@mediatek.com>
> > Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
> > Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
> > Cc: Alexander Potapenko <glider@google.com>
> > Cc: Matthias Brugger <matthias.bgg@gmail.com>
> > Cc: Nicholas Tang <nicholas.tang@mediatek.com>
> > Cc: Miles Chen <miles.chen@mediatek.com>
> > Cc: Qian Cai <qcai@redhat.com>
> > Cc: Stephen Rothwell <sfr@canb.auug.org.au>
> > Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> > ---
> > 
> >  mm/kasan/quarantine.c |   39 +++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 39 insertions(+)
> > 
> > --- a/mm/kasan/quarantine.c~kasan-fix-object-remain-in-offline-per-cpu-quarantine
> > +++ a/mm/kasan/quarantine.c
> > @@ -29,6 +29,7 @@
> >  #include <linux/srcu.h>
> >  #include <linux/string.h>
> >  #include <linux/types.h>
> > +#include <linux/cpuhotplug.h>
> >  
> >  #include "../slab.h"
> >  #include "kasan.h"
> > @@ -43,6 +44,7 @@ struct qlist_head {
> >  	struct qlist_node *head;
> >  	struct qlist_node *tail;
> >  	size_t bytes;
> > +	bool offline;
> >  };
> >  
> >  #define QLIST_INIT { NULL, NULL, 0 }
> > @@ -188,6 +190,10 @@ void quarantine_put(struct kasan_free_me
> 
> Andrey's patch changes the return value from "void" to "bool".
> We need to replace void with bool.
> 
> >  	local_irq_save(flags);
> >  
> >  	q = this_cpu_ptr(&cpu_quarantine);
> > +	if (q->offline) {
> > +		local_irq_restore(flags);
> > +		return;
> 
> I think we need to return false here. Otherwise, it will lack return
> value and cause build error.
> 
> > +	}
> >  	qlist_put(q, &info->quarantine_link, cache->size);
> 
> This "info" may cause conflict because Andrey's patch has already
> changed it to "meta".
> 
> Thanks.
> Kuan-Ying
> 
> >  	if (unlikely(q->bytes > QUARANTINE_PERCPU_SIZE)) {
> >  		qlist_move_all(q, &temp);
> > @@ -328,3 +334,36 @@ void quarantine_remove_cache(struct kmem
> >  
> >  	synchronize_srcu(&remove_cache_srcu);
> >  }
> > +
> > +static int kasan_cpu_online(unsigned int cpu)
> > +{
> > +	this_cpu_ptr(&cpu_quarantine)->offline = false;
> > +	return 0;
> > +}
> > +
> > +static int kasan_cpu_offline(unsigned int cpu)
> > +{
> > +	struct qlist_head *q;
> > +
> > +	q = this_cpu_ptr(&cpu_quarantine);
> > +	/* Ensure the ordering between the writing to q->offline and
> > +	 * qlist_free_all. Otherwise, cpu_quarantine may be corrupted
> > +	 * by interrupt.
> > +	 */
> > +	WRITE_ONCE(q->offline, true);
> > +	barrier();
> > +	qlist_free_all(q, NULL);
> > +	return 0;
> > +}
> > +
> > +static int __init kasan_cpu_quarantine_init(void)
> > +{
> > +	int ret = 0;
> > +
> > +	ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "mm/kasan:online",
> > +				kasan_cpu_online, kasan_cpu_offline);
> > +	if (ret < 0)
> > +		pr_err("kasan cpu quarantine register failed [%d]\n", ret);
> > +	return ret;
> > +}
> > +late_initcall(kasan_cpu_quarantine_init);
> > _
> > 
> 


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] kasan: fix slab double free when cpu-hotplug
  2020-12-06  1:09       ` Andrew Morton
  2020-12-07  2:06         ` Kuan-Ying Lee
@ 2020-12-11 13:43         ` Chris Down
  2020-12-11 19:39           ` Andrew Morton
  1 sibling, 1 reply; 9+ messages in thread
From: Chris Down @ 2020-12-11 13:43 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Kuan-Ying Lee, qiang.zhang, sfr, aryabinin, dvyukov, andreyknvl,
	qcai, linux-mm, linux-kernel, walter-zh.wu

Hi folks,

Andrew Morton writes:
>@@ -188,6 +190,10 @@ void quarantine_put(struct kasan_free_me
> 	local_irq_save(flags);
>
> 	q = this_cpu_ptr(&cpu_quarantine);
>+	if (q->offline) {
>+		local_irq_restore(flags);
>+		return;
>+	}
> 	qlist_put(q, &info->quarantine_link, cache->size);
> 	if (unlikely(q->bytes > QUARANTINE_PERCPU_SIZE)) {
> 		qlist_move_all(q, &temp);

I'm afraid as well as the issues already identified, this also fails, because 
`quarantine_put` now returns a bool after "kasan: sanitize objects when 
metadata doesn't fit":

     mm/kasan/quarantine.c: In function ‘quarantine_put’:
     mm/kasan/quarantine.c:198:3: error: ‘return’ with no value, in function returning non-void [-Werror=return-type]
       198 |   return;
           |   ^~~~~~
     mm/kasan/quarantine.c:171:6: note: declared here
       171 | bool quarantine_put(struct kmem_cache *cache, void *object)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] kasan: fix slab double free when cpu-hotplug
  2020-12-11 13:43         ` Chris Down
@ 2020-12-11 19:39           ` Andrew Morton
  0 siblings, 0 replies; 9+ messages in thread
From: Andrew Morton @ 2020-12-11 19:39 UTC (permalink / raw)
  To: Chris Down
  Cc: Kuan-Ying Lee, qiang.zhang, sfr, aryabinin, dvyukov, andreyknvl,
	qcai, linux-mm, linux-kernel, walter-zh.wu

On Fri, 11 Dec 2020 13:43:39 +0000 Chris Down <chris@chrisdown.name> wrote:

> Hi folks,
> 
> Andrew Morton writes:
> >@@ -188,6 +190,10 @@ void quarantine_put(struct kasan_free_me
> > 	local_irq_save(flags);
> >
> > 	q = this_cpu_ptr(&cpu_quarantine);
> >+	if (q->offline) {
> >+		local_irq_restore(flags);
> >+		return;
> >+	}
> > 	qlist_put(q, &info->quarantine_link, cache->size);
> > 	if (unlikely(q->bytes > QUARANTINE_PERCPU_SIZE)) {
> > 		qlist_move_all(q, &temp);
> 
> I'm afraid as well as the issues already identified, this also fails, because 
> `quarantine_put` now returns a bool after "kasan: sanitize objects when 
> metadata doesn't fit":
> 
>      mm/kasan/quarantine.c: In function ‘quarantine_put’:
>      mm/kasan/quarantine.c:198:3: error: ‘return’ with no value, in function returning non-void [-Werror=return-type]
>        198 |   return;
>            |   ^~~~~~
>      mm/kasan/quarantine.c:171:6: note: declared here
>        171 | bool quarantine_put(struct kmem_cache *cache, void *object)

Yup, thanks.  I think I have this all fixed now, using the old
apply-one-patch-compile-everything-repeat approach.


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-12-11 21:09 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-04 10:22 [PATCH] kasan: fix slab double free when cpu-hotplug qiang.zhang
2020-12-04 12:01 ` Kuan-Ying Lee
2020-12-05  1:25   ` Andrew Morton
2020-12-05 16:17     ` Kuan-Ying Lee
2020-12-06  1:09       ` Andrew Morton
2020-12-07  2:06         ` Kuan-Ying Lee
2020-12-07  7:00           ` Kuan-Ying Lee
2020-12-11 13:43         ` Chris Down
2020-12-11 19:39           ` Andrew Morton

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.