All of lore.kernel.org
 help / color / mirror / Atom feed
* [Patch net-next v3] net_sched: change tcf_del_walker() to take idrinfo->lock
@ 2018-09-19 23:37 Cong Wang
  2018-09-28 14:59 ` Ido Schimmel
  0 siblings, 1 reply; 8+ messages in thread
From: Cong Wang @ 2018-09-19 23:37 UTC (permalink / raw)
  To: netdev; +Cc: jiri, jhs, Vlad Buslov, Cong Wang

From: Vlad Buslov <vladbu@mellanox.com>

From: Vlad Buslov <vladbu@mellanox.com>

Action API was changed to work with actions and action_idr in concurrency
safe manner, however tcf_del_walker() still uses actions without taking a
reference or idrinfo->lock first, and deletes them directly, disregarding
possible concurrent delete.

Change tcf_del_walker() to take idrinfo->lock while iterating over actions
and use new tcf_idr_release_unsafe() to release them while holding the
lock.

And the blocking function fl_hw_destroy_tmplt() could be called when we
put a filter chain, so defer it to a work queue.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
[xiyou.wangcong@gmail.com: heavily modify the code and changelog]
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 net/sched/act_api.c    | 20 +++++++++++++++++++-
 net/sched/cls_flower.c | 13 +++++++++++--
 2 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 6f118d62c731..fac8c769454f 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -246,6 +246,20 @@ static int tcf_dump_walker(struct tcf_idrinfo *idrinfo, struct sk_buff *skb,
 	goto done;
 }
 
+static int tcf_idr_release_unsafe(struct tc_action *p)
+{
+	if (atomic_read(&p->tcfa_bindcnt) > 0)
+		return -EPERM;
+
+	if (refcount_dec_and_test(&p->tcfa_refcnt)) {
+		idr_remove(&p->idrinfo->action_idr, p->tcfa_index);
+		tcf_action_cleanup(p);
+		return ACT_P_DELETED;
+	}
+
+	return 0;
+}
+
 static int tcf_del_walker(struct tcf_idrinfo *idrinfo, struct sk_buff *skb,
 			  const struct tc_action_ops *ops)
 {
@@ -262,15 +276,19 @@ static int tcf_del_walker(struct tcf_idrinfo *idrinfo, struct sk_buff *skb,
 	if (nla_put_string(skb, TCA_KIND, ops->kind))
 		goto nla_put_failure;
 
+	spin_lock(&idrinfo->lock);
 	idr_for_each_entry_ul(idr, p, id) {
-		ret = __tcf_idr_release(p, false, true);
+		ret = tcf_idr_release_unsafe(p);
 		if (ret == ACT_P_DELETED) {
 			module_put(ops->owner);
 			n_i++;
 		} else if (ret < 0) {
+			spin_unlock(&idrinfo->lock);
 			goto nla_put_failure;
 		}
 	}
+	spin_unlock(&idrinfo->lock);
+
 	if (nla_put_u32(skb, TCA_FCNT, n_i))
 		goto nla_put_failure;
 	nla_nest_end(skb, nest);
diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index 4b8dd37dd4f8..0ed6630a5049 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -79,6 +79,7 @@ struct fl_flow_tmplt {
 	struct fl_flow_key mask;
 	struct flow_dissector dissector;
 	struct tcf_chain *chain;
+	struct rcu_work rwork;
 };
 
 struct cls_fl_head {
@@ -1437,14 +1438,22 @@ static void *fl_tmplt_create(struct net *net, struct tcf_chain *chain,
 	return ERR_PTR(err);
 }
 
-static void fl_tmplt_destroy(void *tmplt_priv)
+static void fl_tmplt_destroy_work(struct work_struct *work)
 {
-	struct fl_flow_tmplt *tmplt = tmplt_priv;
+	struct fl_flow_tmplt *tmplt = container_of(to_rcu_work(work),
+						 struct fl_flow_tmplt, rwork);
 
 	fl_hw_destroy_tmplt(tmplt->chain, tmplt);
 	kfree(tmplt);
 }
 
+static void fl_tmplt_destroy(void *tmplt_priv)
+{
+	struct fl_flow_tmplt *tmplt = tmplt_priv;
+
+	tcf_queue_work(&tmplt->rwork, fl_tmplt_destroy_work);
+}
+
 static int fl_dump_key_val(struct sk_buff *skb,
 			   void *val, int val_type,
 			   void *mask, int mask_type, int len)
-- 
2.14.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [Patch net-next v3] net_sched: change tcf_del_walker() to take idrinfo->lock
  2018-09-19 23:37 [Patch net-next v3] net_sched: change tcf_del_walker() to take idrinfo->lock Cong Wang
@ 2018-09-28 14:59 ` Ido Schimmel
  2018-09-28 15:08   ` Ido Schimmel
  2018-09-28 17:56   ` Cong Wang
  0 siblings, 2 replies; 8+ messages in thread
From: Ido Schimmel @ 2018-09-28 14:59 UTC (permalink / raw)
  To: Cong Wang; +Cc: netdev, jiri, jhs, Vlad Buslov

On Wed, Sep 19, 2018 at 04:37:29PM -0700, Cong Wang wrote:
> From: Vlad Buslov <vladbu@mellanox.com>
> 
> From: Vlad Buslov <vladbu@mellanox.com>
> 
> Action API was changed to work with actions and action_idr in concurrency
> safe manner, however tcf_del_walker() still uses actions without taking a
> reference or idrinfo->lock first, and deletes them directly, disregarding
> possible concurrent delete.
> 
> Change tcf_del_walker() to take idrinfo->lock while iterating over actions
> and use new tcf_idr_release_unsafe() to release them while holding the
> lock.
> 
> And the blocking function fl_hw_destroy_tmplt() could be called when we
> put a filter chain, so defer it to a work queue.

I'm getting a use-after-free when running tc_chains.sh selftest and I
believe it's caused by this patch.

To reproduce:
# cd tools/testing/selftests/net/forwarding
# export TESTS="template_filter_fits"; ./tc_chains.sh veth0 veth1

__tcf_chain_put()
	tc_chain_tmplt_del()
		fl_tmplt_destroy()
			tcf_queue_work(&tmplt->rwork, fl_tmplt_destroy_work)
	tcf_chain_destroy()
		kfree(chain)

Some time later fl_tmplt_destroy_work() starts executing and
dereferencing 'chain'.

Splat:
[   48.269074] ==================================================================
[   48.270186] BUG: KASAN: use-after-free in fl_tmplt_destroy_work+0x289/0x2d0
[   48.271199] Read of size 8 at addr ffff880067f0d498 by task kworker/u2:1/18
[   48.272270]                                        
[   48.272520] CPU: 0 PID: 18 Comm: kworker/u2:1 Not tainted 4.19.0-rc5-custom+ #917      
[   48.273683] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180531_142017-buildhw-08.phx2.fedoraproject.org-1.fc28 04/01/2014
[   48.275495] Workqueue: tc_filter_workqueue fl_tmplt_destroy_work                    
[   48.276387] Call Trace:                                                             
[   48.276766]  dump_stack+0x10f/0x1d8                        
[   48.277302]  ? dump_stack_print_info.cold.0+0x20/0x20
[   48.278080]  ? printk+0xac/0xd4                   
[   48.278585]  ? cpumask_weight.constprop.23+0x39/0x39                          
[   48.279360]  print_address_description.cold.3+0x9/0x258
[   48.280186]  kasan_report.cold.4+0x6a/0x9d
[   48.280856]  __asan_report_load8_noabort+0x19/0x20
[   48.281642]  fl_tmplt_destroy_work+0x289/0x2d0                                
[   48.282331]  ? fl_tmplt_destroy+0x30/0x30                                     
[   48.282966]  process_one_work+0xb5d/0x1a10                                    
[   48.283642]  ? kasan_check_write+0x14/0x20                  
[   48.284301]  ? pwq_dec_nr_in_flight+0x4e0/0x4e0                        
[   48.285003]  ? lock_repin_lock+0x360/0x360                                                                               
[   48.285652]  ? __schedule+0x86d/0x2310
[   48.286256]  ? lockdep_hardirqs_on+0x39f/0x580
[   48.286995]  ? __sched_text_start+0x8/0x8
[   48.287654]  ? lock_downgrade+0x740/0x740
[   48.288292]  ? sched_clock_local+0xe0/0x150
[   48.288969]  ? save_trace+0x340/0x340
[   48.289526]  ? save_trace+0x340/0x340
[   48.290156]  ? check_preemption_disabled+0x3b/0x210
[   48.290937]  ? find_held_lock+0x40/0x1d0
[   48.291598]  ? debug_smp_processor_id+0x1c/0x20
[   48.292290]  ? worker_thread+0x3c3/0x12b0
[   48.292952]  ? lock_contended+0x1260/0x1260
[   48.293606]  ? _raw_spin_unlock_irq+0x2c/0x50
[   48.294311]  ? do_raw_spin_trylock+0x121/0x1c0
[   48.294982]  ? do_raw_spin_lock+0x1e0/0x1e0
[   48.295658]  ? trace_hardirqs_on+0x290/0x290
[   48.296346]  worker_thread+0x18e/0x12b0
[   48.296972]  ? lock_repin_lock+0x360/0x360
[   48.297624]  ? process_one_work+0x1a10/0x1a10
[   48.298332]  ? save_trace+0x340/0x340
[   48.298937]  ? kasan_check_write+0x14/0x20
[   48.299639]  ? sched_clock_local+0xe0/0x150
[   48.300310]  ? check_preemption_disabled+0x3b/0x210
[   48.301064]  ? check_preemption_disabled+0x3b/0x210
[   48.301848]  ? find_held_lock+0x40/0x1d0
[   48.302472]  ? debug_smp_processor_id+0x1c/0x20
[   48.303196]  ? _raw_spin_unlock_irqrestore+0x57/0x70
[   48.304037]  ? _raw_spin_unlock_irqrestore+0x57/0x70
[   48.304862]  ? lockdep_hardirqs_on+0x39f/0x580
[   48.305595]  ? trace_hardirqs_on+0xa1/0x290
[   48.306237]  ? do_raw_spin_trylock+0x121/0x1c0
[   48.306924]  ? __kthread_parkme+0xdc/0x1a0
[   48.307607]  ? preempt_schedule_common+0x1f/0xd0
[   48.308359]  ? __kthread_parkme+0x5d/0x1a0
[   48.309014]  ? __kthread_parkme+0xfa/0x1a0
[   48.309690]  ? _raw_spin_unlock_irqrestore+0x60/0x70
[   48.310461]  kthread+0x34d/0x410
[   48.310970]  ? process_one_work+0x1a10/0x1a10
[   48.311672]  ? kthread_delayed_work_timer_fn+0x450/0x450
[   48.312548]  ret_from_fork+0x24/0x30
[   48.313114]
[   48.313390] Allocated by task 1239:
[   48.313983]  save_stack+0x43/0xd0
[   48.314533]  kasan_kmalloc+0xc4/0xe0
[   48.315119]  kmem_cache_alloc_trace+0x124/0x280
[   48.315873]  tcf_chain_create+0xa4/0x370
[   48.316506]  tc_ctl_chain+0xe3d/0x1430
[   48.317114]  rtnetlink_rcv_msg+0x3a3/0xa80
[   48.317799]  netlink_rcv_skb+0x152/0x3c0
[   48.318440]  rtnetlink_rcv+0x21/0x30
[   48.319042]  netlink_unicast+0x52f/0x740
[   48.319675]  netlink_sendmsg+0x9c7/0xf50
[   48.320276]  sock_sendmsg+0xcf/0x120
[   48.320853]  ___sys_sendmsg+0x778/0x8f0
[   48.321445]  __sys_sendmsg+0x112/0x270
[   48.322079]  __x64_sys_sendmsg+0x7d/0xc0
[   48.322687]  do_syscall_64+0x15d/0x640
[   48.323274]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[   48.324004]
[   48.324264] Freed by task 1248:
[   48.324796]  save_stack+0x43/0xd0
[   48.314533]  kasan_kmalloc+0xc4/0xe0
[   48.315119]  kmem_cache_alloc_trace+0x124/0x280
[   48.315873]  tcf_chain_create+0xa4/0x370
[   48.316506]  tc_ctl_chain+0xe3d/0x1430
[   48.317114]  rtnetlink_rcv_msg+0x3a3/0xa80
[   48.317799]  netlink_rcv_skb+0x152/0x3c0
[   48.318440]  rtnetlink_rcv+0x21/0x30
[   48.319042]  netlink_unicast+0x52f/0x740
[   48.319675]  netlink_sendmsg+0x9c7/0xf50
[   48.320276]  sock_sendmsg+0xcf/0x120
[   48.320853]  ___sys_sendmsg+0x778/0x8f0
[   48.321445]  __sys_sendmsg+0x112/0x270
[   48.322079]  __x64_sys_sendmsg+0x7d/0xc0
[   48.322687]  do_syscall_64+0x15d/0x640
[   48.323274]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[   48.324004]
[   48.324264] Freed by task 1248:
[   48.324796]  save_stack+0x43/0xd0
[   48.325355]  __kasan_slab_free+0x131/0x180
[   48.326001]  kasan_slab_free+0xe/0x10
[   48.326579]  kfree+0xd7/0x2b0
[   48.327054]  __tcf_chain_put+0x339/0x710
[   48.327654]  tc_ctl_chain+0xbde/0x1430
[   48.328276]  rtnetlink_rcv_msg+0x3a3/0xa80
[   48.328899]  netlink_rcv_skb+0x152/0x3c0
[   48.329556]  rtnetlink_rcv+0x21/0x30
[   48.330170]  netlink_unicast+0x52f/0x740
[   48.330839]  netlink_sendmsg+0x9c7/0xf50
[   48.331491]  sock_sendmsg+0xcf/0x120
[   48.332079]  ___sys_sendmsg+0x778/0x8f0
[   48.332706]  __sys_sendmsg+0x112/0x270
[   48.333334]  __x64_sys_sendmsg+0x7d/0xc0
[   48.333957]  do_syscall_64+0x15d/0x640
[   48.334581]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[   48.335362]
[   48.335645] The buggy address belongs to the object at ffff880067f0d480
[   48.335645]  which belongs to the cache kmalloc-64 of size 64
[   48.337559] The buggy address is located 24 bytes inside of
[   48.337559]  64-byte region [ffff880067f0d480, ffff880067f0d4c0)
[   48.339313] The buggy address belongs to the page:
[   48.340083] page:ffffea00019fc340 count:1 mapcount:0 mapping:ffff88006cc01780 index:0x0
[   48.341330] flags: 0x100000000000100(slab)
[   48.342000] raw: 0100000000000100 ffffea00019fce80 0000001100000011 ffff88006cc01780
[   48.343163] raw: 0000000000000000 00000000002a002a 00000001ffffffff 0000000000000000
[   48.344424] page dumped because: kasan: bad access detected
[   48.345299]
[   48.345580] Memory state around the buggy address:
[   48.346356]  ffff880067f0d380: fb fb fb fb fc fc fc fc fb fb fb fb fb fb fb fb
[   48.347477]  ffff880067f0d400: fc fc fc fc fb fb fb fb fb fb fb fb fc fc fc fc
[   48.348645] >ffff880067f0d480: fb fb fb fb fb fb fb fb fc fc fc fc fb fb fb fb
[   48.349799]                             ^
[   48.350457]  ffff880067f0d500: fb fb fb fb fc fc fc fc fb fb fb fb fb fb fb fb
[   48.351627]  ffff880067f0d580: fc fc fc fc fb fb fb fb fb fb fb fb fc fc fc fc
[   48.352802] ==================================================================

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Patch net-next v3] net_sched: change tcf_del_walker() to take idrinfo->lock
  2018-09-28 14:59 ` Ido Schimmel
@ 2018-09-28 15:08   ` Ido Schimmel
  2018-09-28 17:56   ` Cong Wang
  1 sibling, 0 replies; 8+ messages in thread
From: Ido Schimmel @ 2018-09-28 15:08 UTC (permalink / raw)
  To: Cong Wang; +Cc: netdev, jiri, jhs, Vlad Buslov

On Fri, Sep 28, 2018 at 05:59:00PM +0300, Ido Schimmel wrote:
> I'm getting a use-after-free when running tc_chains.sh selftest and I
> believe it's caused by this patch.

BTW, I can't reproduce the issue after reverting the patch.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Patch net-next v3] net_sched: change tcf_del_walker() to take idrinfo->lock
  2018-09-28 14:59 ` Ido Schimmel
  2018-09-28 15:08   ` Ido Schimmel
@ 2018-09-28 17:56   ` Cong Wang
  2018-09-28 18:11     ` Ido Schimmel
  1 sibling, 1 reply; 8+ messages in thread
From: Cong Wang @ 2018-09-28 17:56 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: Linux Kernel Network Developers, Jiri Pirko, Jamal Hadi Salim,
	Vlad Buslov

On Fri, Sep 28, 2018 at 7:59 AM Ido Schimmel <idosch@idosch.org> wrote:
>
> On Wed, Sep 19, 2018 at 04:37:29PM -0700, Cong Wang wrote:
> > From: Vlad Buslov <vladbu@mellanox.com>
> >
> > From: Vlad Buslov <vladbu@mellanox.com>
> >
> > Action API was changed to work with actions and action_idr in concurrency
> > safe manner, however tcf_del_walker() still uses actions without taking a
> > reference or idrinfo->lock first, and deletes them directly, disregarding
> > possible concurrent delete.
> >
> > Change tcf_del_walker() to take idrinfo->lock while iterating over actions
> > and use new tcf_idr_release_unsafe() to release them while holding the
> > lock.
> >
> > And the blocking function fl_hw_destroy_tmplt() could be called when we
> > put a filter chain, so defer it to a work queue.
>
> I'm getting a use-after-free when running tc_chains.sh selftest and I
> believe it's caused by this patch.
>
> To reproduce:
> # cd tools/testing/selftests/net/forwarding
> # export TESTS="template_filter_fits"; ./tc_chains.sh veth0 veth1
>
> __tcf_chain_put()
>         tc_chain_tmplt_del()
>                 fl_tmplt_destroy()
>                         tcf_queue_work(&tmplt->rwork, fl_tmplt_destroy_work)
>         tcf_chain_destroy()
>                 kfree(chain)
>
> Some time later fl_tmplt_destroy_work() starts executing and
> dereferencing 'chain'.

Oops, forgot to hold the chain... I will test this:

diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index 92dd5071a708..cbb68d5515d6 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -1444,6 +1444,7 @@ static void fl_tmplt_destroy_work(struct
work_struct *work)
                                                 struct fl_flow_tmplt, rwork);

        fl_hw_destroy_tmplt(tmplt->chain, tmplt);
+       tcf_chain_put(tmplt->chain);
        kfree(tmplt);
 }

@@ -1451,6 +1452,7 @@ static void fl_tmplt_destroy(void *tmplt_priv)
 {
        struct fl_flow_tmplt *tmplt = tmplt_priv;

+       tcf_chain_hold(tmplt->chain);
        tcf_queue_work(&tmplt->rwork, fl_tmplt_destroy_work);
 }

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [Patch net-next v3] net_sched: change tcf_del_walker() to take idrinfo->lock
  2018-09-28 17:56   ` Cong Wang
@ 2018-09-28 18:11     ` Ido Schimmel
  2018-09-28 18:29       ` Cong Wang
  0 siblings, 1 reply; 8+ messages in thread
From: Ido Schimmel @ 2018-09-28 18:11 UTC (permalink / raw)
  To: Cong Wang
  Cc: Linux Kernel Network Developers, Jiri Pirko, Jamal Hadi Salim,
	Vlad Buslov

On Fri, Sep 28, 2018 at 10:56:47AM -0700, Cong Wang wrote:
> On Fri, Sep 28, 2018 at 7:59 AM Ido Schimmel <idosch@idosch.org> wrote:
> >
> > On Wed, Sep 19, 2018 at 04:37:29PM -0700, Cong Wang wrote:
> > > From: Vlad Buslov <vladbu@mellanox.com>
> > >
> > > From: Vlad Buslov <vladbu@mellanox.com>
> > >
> > > Action API was changed to work with actions and action_idr in concurrency
> > > safe manner, however tcf_del_walker() still uses actions without taking a
> > > reference or idrinfo->lock first, and deletes them directly, disregarding
> > > possible concurrent delete.
> > >
> > > Change tcf_del_walker() to take idrinfo->lock while iterating over actions
> > > and use new tcf_idr_release_unsafe() to release them while holding the
> > > lock.
> > >
> > > And the blocking function fl_hw_destroy_tmplt() could be called when we
> > > put a filter chain, so defer it to a work queue.
> >
> > I'm getting a use-after-free when running tc_chains.sh selftest and I
> > believe it's caused by this patch.
> >
> > To reproduce:
> > # cd tools/testing/selftests/net/forwarding
> > # export TESTS="template_filter_fits"; ./tc_chains.sh veth0 veth1
> >
> > __tcf_chain_put()
> >         tc_chain_tmplt_del()
> >                 fl_tmplt_destroy()
> >                         tcf_queue_work(&tmplt->rwork, fl_tmplt_destroy_work)
> >         tcf_chain_destroy()
> >                 kfree(chain)
> >
> > Some time later fl_tmplt_destroy_work() starts executing and
> > dereferencing 'chain'.
> 
> Oops, forgot to hold the chain... I will test this:
> 
> diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
> index 92dd5071a708..cbb68d5515d6 100644
> --- a/net/sched/cls_flower.c
> +++ b/net/sched/cls_flower.c
> @@ -1444,6 +1444,7 @@ static void fl_tmplt_destroy_work(struct
> work_struct *work)
>                                                  struct fl_flow_tmplt, rwork);
> 
>         fl_hw_destroy_tmplt(tmplt->chain, tmplt);
> +       tcf_chain_put(tmplt->chain);
>         kfree(tmplt);
>  }
> 
> @@ -1451,6 +1452,7 @@ static void fl_tmplt_destroy(void *tmplt_priv)
>  {
>         struct fl_flow_tmplt *tmplt = tmplt_priv;
> 
> +       tcf_chain_hold(tmplt->chain);
>         tcf_queue_work(&tmplt->rwork, fl_tmplt_destroy_work);
>  }

I don't think this will work given the reference count already dropped
to 0, which is why the template deletion function was invoked. I didn't
test the patch, but I don't see what would prevent the chain from being
freed.

Thanks for looking into this.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Patch net-next v3] net_sched: change tcf_del_walker() to take idrinfo->lock
  2018-09-28 18:11     ` Ido Schimmel
@ 2018-09-28 18:29       ` Cong Wang
  2018-09-29  4:47         ` Cong Wang
  0 siblings, 1 reply; 8+ messages in thread
From: Cong Wang @ 2018-09-28 18:29 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: Linux Kernel Network Developers, Jiri Pirko, Jamal Hadi Salim,
	Vlad Buslov

On Fri, Sep 28, 2018 at 11:11 AM Ido Schimmel <idosch@idosch.org> wrote:
> I don't think this will work given the reference count already dropped
> to 0, which is why the template deletion function was invoked. I didn't
> test the patch, but I don't see what would prevent the chain from being
> freed.

Good catch! So I should just "move" tcf_chain_destroy() into cls_flower.

A patch is being compiled now.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Patch net-next v3] net_sched: change tcf_del_walker() to take idrinfo->lock
  2018-09-28 18:29       ` Cong Wang
@ 2018-09-29  4:47         ` Cong Wang
  2018-10-02 19:23           ` Cong Wang
  0 siblings, 1 reply; 8+ messages in thread
From: Cong Wang @ 2018-09-29  4:47 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: Linux Kernel Network Developers, Jiri Pirko, Jamal Hadi Salim,
	Vlad Buslov

On Fri, Sep 28, 2018 at 11:29 AM Cong Wang <xiyou.wangcong@gmail.com> wrote:
>
> On Fri, Sep 28, 2018 at 11:11 AM Ido Schimmel <idosch@idosch.org> wrote:
> > I don't think this will work given the reference count already dropped
> > to 0, which is why the template deletion function was invoked. I didn't
> > test the patch, but I don't see what would prevent the chain from being
> > freed.
>
> Good catch! So I should just "move" tcf_chain_destroy() into cls_flower.
>
> A patch is being compiled now.

Just FYI, the crash is fixed, but another RCU warning pops up after
my fix. So I am still debugging it, it is related to the kfree_rcu() in
tcf_chain_destroy().

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Patch net-next v3] net_sched: change tcf_del_walker() to take idrinfo->lock
  2018-09-29  4:47         ` Cong Wang
@ 2018-10-02 19:23           ` Cong Wang
  0 siblings, 0 replies; 8+ messages in thread
From: Cong Wang @ 2018-10-02 19:23 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: Linux Kernel Network Developers, Jiri Pirko, Jamal Hadi Salim,
	Vlad Buslov

On Fri, Sep 28, 2018 at 9:47 PM Cong Wang <xiyou.wangcong@gmail.com> wrote:
>
> Just FYI, the crash is fixed, but another RCU warning pops up after
> my fix. So I am still debugging it, it is related to the kfree_rcu() in
> tcf_chain_destroy().

Update: I tried very hard to make it working, the end result is ugly,
as I have to play with refcnt==1. I don't like it either. So, I am thinking
about to just convert the idrinfo->lock from a spinlock to a mutex,
which could save us a lot of troubles here.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2018-10-03  2:08 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-19 23:37 [Patch net-next v3] net_sched: change tcf_del_walker() to take idrinfo->lock Cong Wang
2018-09-28 14:59 ` Ido Schimmel
2018-09-28 15:08   ` Ido Schimmel
2018-09-28 17:56   ` Cong Wang
2018-09-28 18:11     ` Ido Schimmel
2018-09-28 18:29       ` Cong Wang
2018-09-29  4:47         ` Cong Wang
2018-10-02 19:23           ` Cong Wang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.