linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [BUG] act_ife: sleeping functions called in atomic context
@ 2016-06-16 20:50 Alexey Khoroshilov
  2016-06-16 21:43 ` Cong Wang
  0 siblings, 1 reply; 10+ messages in thread
From: Alexey Khoroshilov @ 2016-06-16 20:50 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Alexey Khoroshilov, David S. Miller, netdev, linux-kernel, ldv-project

tcf_ife_init() contains a big chunk of code executed with
ife->tcf_lock spinlock held. But that code contains several calls
to sleeping functions:
  populate_metalist() and use_all_metadata()
    -> add_metainfo()
      -> find_ife_oplist(metaid)
        -> read_lock()
        -> try_module_get(o->owner)
      -> kzalloc(sizeof(*mi), GFP_KERNEL);
      -> ops->alloc(mi, metaval);
      -> module_put(ops->owner);
  _tcf_ife_cleanup()
    -> module_put()

The same problem is actual for tcf_ife_cleanup() as well.

Found by Linux Driver Verification project (linuxtesting.org).

--
Alexey Khoroshilov
Linux Verification Center, ISPRAS
web: http://linuxtesting.org

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [BUG] act_ife: sleeping functions called in atomic context
  2016-06-16 20:50 [BUG] act_ife: sleeping functions called in atomic context Alexey Khoroshilov
@ 2016-06-16 21:43 ` Cong Wang
  2016-06-17  0:38   ` Jamal Hadi Salim
  0 siblings, 1 reply; 10+ messages in thread
From: Cong Wang @ 2016-06-16 21:43 UTC (permalink / raw)
  To: Alexey Khoroshilov
  Cc: Jamal Hadi Salim, David S. Miller,
	Linux Kernel Network Developers, LKML, ldv-project

On Thu, Jun 16, 2016 at 1:50 PM, Alexey Khoroshilov
<khoroshilov@ispras.ru> wrote:
> tcf_ife_init() contains a big chunk of code executed with
> ife->tcf_lock spinlock held. But that code contains several calls
> to sleeping functions:
>   populate_metalist() and use_all_metadata()
>     -> add_metainfo()
>       -> find_ife_oplist(metaid)
>         -> read_lock()
>         -> try_module_get(o->owner)
>       -> kzalloc(sizeof(*mi), GFP_KERNEL);

Hmm, we don't need to hold that spinlock when we create a new ife action,
since we haven't inserted it yet. We do need it when we modify an existing
one. So I am thinking if we can refactor that code to avoid spinlock
whenever possible.

>       -> ops->alloc(mi, metaval);
>       -> module_put(ops->owner);
>   _tcf_ife_cleanup()
>     -> module_put()
>
> The same problem is actual for tcf_ife_cleanup() as well.
>

Huh? Both module_put() and kfree() should not sleep, right?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [BUG] act_ife: sleeping functions called in atomic context
  2016-06-16 21:43 ` Cong Wang
@ 2016-06-17  0:38   ` Jamal Hadi Salim
  2016-06-17  2:14     ` Cong Wang
  0 siblings, 1 reply; 10+ messages in thread
From: Jamal Hadi Salim @ 2016-06-17  0:38 UTC (permalink / raw)
  To: Cong Wang, Alexey Khoroshilov
  Cc: David S. Miller, Linux Kernel Network Developers, LKML, ldv-project

[-- Attachment #1: Type: text/plain, Size: 1120 bytes --]

On 16-06-16 05:43 PM, Cong Wang wrote:
> On Thu, Jun 16, 2016 at 1:50 PM, Alexey Khoroshilov
> <khoroshilov@ispras.ru> wrote:
>> tcf_ife_init() contains a big chunk of code executed with
>> ife->tcf_lock spinlock held. But that code contains several calls
>> to sleeping functions:
>>    populate_metalist() and use_all_metadata()
>>      -> add_metainfo()
>>        -> find_ife_oplist(metaid)
>>          -> read_lock()
>>          -> try_module_get(o->owner)
>>        -> kzalloc(sizeof(*mi), GFP_KERNEL);
>
> Hmm, we don't need to hold that spinlock when we create a new ife action,
> since we haven't inserted it yet. We do need it when we modify an existing
> one. So I am thinking if we can refactor that code to avoid spinlock
> whenever possible.
>

Does attached (compile tested) patch help?

>>        -> ops->alloc(mi, metaval);
>>        -> module_put(ops->owner);
>>    _tcf_ife_cleanup()
>>      -> module_put()
>>
>> The same problem is actual for tcf_ife_cleanup() as well.
>>
>
> Huh? Both module_put() and kfree() should not sleep, right?
>

I dont think there is any sleeping there ;->

cheers,
jamal

[-- Attachment #2: patchlet1 --]
[-- Type: text/plain, Size: 1032 bytes --]

diff --git a/net/sched/act_ife.c b/net/sched/act_ife.c
index 6bbc518..e341bef 100644
--- a/net/sched/act_ife.c
+++ b/net/sched/act_ife.c
@@ -302,7 +302,9 @@ static int add_metainfo(struct tcf_ife_info *ife, u32 metaid, void *metaval,
 		}
 	}
 
+	spin_lock_bh(&ife->tcf_lock);
 	list_add_tail(&mi->metalist, &ife->metalist);
+	spin_unlock_bh(&ife->tcf_lock);
 
 	return ret;
 }
@@ -474,7 +476,6 @@ static int tcf_ife_init(struct net *net, struct nlattr *nla,
 			saddr = nla_data(tb[TCA_IFE_SMAC]);
 	}
 
-	spin_lock_bh(&ife->tcf_lock);
 	ife->tcf_action = parm->action;
 
 	if (parm->flags & IFE_ENCODE) {
@@ -504,7 +505,6 @@ metadata_parse_err:
 			if (ret == ACT_P_CREATED)
 				_tcf_ife_cleanup(a, bind);
 
-			spin_unlock_bh(&ife->tcf_lock);
 			return err;
 		}
 
@@ -523,13 +523,10 @@ metadata_parse_err:
 			if (ret == ACT_P_CREATED)
 				_tcf_ife_cleanup(a, bind);
 
-			spin_unlock_bh(&ife->tcf_lock);
 			return err;
 		}
 	}
 
-	spin_unlock_bh(&ife->tcf_lock);
-
 	if (ret == ACT_P_CREATED)
 		tcf_hash_insert(tn, a);
 

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [BUG] act_ife: sleeping functions called in atomic context
  2016-06-17  0:38   ` Jamal Hadi Salim
@ 2016-06-17  2:14     ` Cong Wang
  2016-06-17  5:38       ` Cong Wang
  0 siblings, 1 reply; 10+ messages in thread
From: Cong Wang @ 2016-06-17  2:14 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Alexey Khoroshilov, David S. Miller,
	Linux Kernel Network Developers, LKML, ldv-project

On Thu, Jun 16, 2016 at 5:38 PM, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> On 16-06-16 05:43 PM, Cong Wang wrote:
>>
>> On Thu, Jun 16, 2016 at 1:50 PM, Alexey Khoroshilov
>> <khoroshilov@ispras.ru> wrote:
>>>
>>> tcf_ife_init() contains a big chunk of code executed with
>>> ife->tcf_lock spinlock held. But that code contains several calls
>>> to sleeping functions:
>>>    populate_metalist() and use_all_metadata()
>>>      -> add_metainfo()
>>>        -> find_ife_oplist(metaid)
>>>          -> read_lock()
>>>          -> try_module_get(o->owner)
>>>        -> kzalloc(sizeof(*mi), GFP_KERNEL);
>>
>>
>> Hmm, we don't need to hold that spinlock when we create a new ife action,
>> since we haven't inserted it yet. We do need it when we modify an existing
>> one. So I am thinking if we can refactor that code to avoid spinlock
>> whenever possible.
>>
>
> Does attached (compile tested) patch help?

You at least miss the unlock in load_metaops_and_vet()?

I think we can just remove that tcf_lock, I am testing a patch now.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [BUG] act_ife: sleeping functions called in atomic context
  2016-06-17  2:14     ` Cong Wang
@ 2016-06-17  5:38       ` Cong Wang
  2016-06-17 11:05         ` Alexey Khoroshilov
  2016-06-17 11:07         ` Jamal Hadi Salim
  0 siblings, 2 replies; 10+ messages in thread
From: Cong Wang @ 2016-06-17  5:38 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Alexey Khoroshilov, David S. Miller,
	Linux Kernel Network Developers, LKML, ldv-project

[-- Attachment #1: Type: text/plain, Size: 221 bytes --]

On Thu, Jun 16, 2016 at 7:14 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>
> I think we can just remove that tcf_lock, I am testing a patch now.

Please try the attached patch, I will do more tests tomorrow.

Thanks!

[-- Attachment #2: ife.diff --]
[-- Type: text/plain, Size: 8875 bytes --]

diff --git a/net/sched/act_ife.c b/net/sched/act_ife.c
index 658046d..859fb02 100644
--- a/net/sched/act_ife.c
+++ b/net/sched/act_ife.c
@@ -240,8 +240,7 @@ static int ife_validate_metatype(struct tcf_meta_ops *ops, void *val, int len)
 }
 
 /* called when adding new meta information
- * under ife->tcf_lock
-*/
+ */
 static int load_metaops_and_vet(struct tcf_ife_info *ife, u32 metaid,
 				void *val, int len)
 {
@@ -251,11 +250,9 @@ static int load_metaops_and_vet(struct tcf_ife_info *ife, u32 metaid,
 	if (!ops) {
 		ret = -ENOENT;
 #ifdef CONFIG_MODULES
-		spin_unlock_bh(&ife->tcf_lock);
 		rtnl_unlock();
 		request_module("ifemeta%u", metaid);
 		rtnl_lock();
-		spin_lock_bh(&ife->tcf_lock);
 		ops = find_ife_oplist(metaid);
 #endif
 	}
@@ -272,8 +269,8 @@ static int load_metaops_and_vet(struct tcf_ife_info *ife, u32 metaid,
 }
 
 /* called when adding new meta information
- * under ife->tcf_lock
-*/
+ * under RTNL lock
+ */
 static int add_metainfo(struct tcf_ife_info *ife, u32 metaid, void *metaval,
 			int len)
 {
@@ -284,7 +281,7 @@ static int add_metainfo(struct tcf_ife_info *ife, u32 metaid, void *metaval,
 	if (!ops)
 		return -ENOENT;
 
-	mi = kzalloc(sizeof(*mi), GFP_KERNEL);
+	mi = kzalloc(sizeof(*mi), GFP_ATOMIC);
 	if (!mi) {
 		/*put back what find_ife_oplist took */
 		module_put(ops->owner);
@@ -302,8 +299,7 @@ static int add_metainfo(struct tcf_ife_info *ife, u32 metaid, void *metaval,
 		}
 	}
 
-	list_add_tail(&mi->metalist, &ife->metalist);
-
+	list_add_tail_rcu(&mi->metalist, &ife->metalist);
 	return ret;
 }
 
@@ -313,11 +309,13 @@ static int use_all_metadata(struct tcf_ife_info *ife)
 	int rc = 0;
 	int installed = 0;
 
+	read_lock(&ife_mod_lock);
 	list_for_each_entry(o, &ifeoplist, list) {
 		rc = add_metainfo(ife, o->metaid, NULL, 0);
 		if (rc == 0)
 			installed += 1;
 	}
+	read_unlock(&ife_mod_lock);
 
 	if (installed)
 		return 0;
@@ -340,10 +338,12 @@ static int dump_metalist(struct sk_buff *skb, struct tcf_ife_info *ife)
 	if (!nest)
 		goto out_nlmsg_trim;
 
-	list_for_each_entry(e, &ife->metalist, metalist) {
+	rcu_read_lock();
+	list_for_each_entry_rcu(e, &ife->metalist, metalist) {
 		if (!e->ops->get(skb, e))
 			total_encoded += 1;
 	}
+	rcu_read_unlock();
 
 	if (!total_encoded)
 		goto out_nlmsg_trim;
@@ -357,15 +357,14 @@ out_nlmsg_trim:
 	return -1;
 }
 
-/* under ife->tcf_lock */
-static void _tcf_ife_cleanup(struct tc_action *a, int bind)
+static void tcf_ife_cleanup(struct tc_action *a, int bind)
 {
 	struct tcf_ife_info *ife = a->priv;
 	struct tcf_meta_info *e, *n;
 
 	list_for_each_entry_safe(e, n, &ife->metalist, metalist) {
 		module_put(e->ops->owner);
-		list_del(&e->metalist);
+		list_del_rcu(&e->metalist);
 		if (e->metaval) {
 			if (e->ops->release)
 				e->ops->release(e);
@@ -376,16 +375,6 @@ static void _tcf_ife_cleanup(struct tc_action *a, int bind)
 	}
 }
 
-static void tcf_ife_cleanup(struct tc_action *a, int bind)
-{
-	struct tcf_ife_info *ife = a->priv;
-
-	spin_lock_bh(&ife->tcf_lock);
-	_tcf_ife_cleanup(a, bind);
-	spin_unlock_bh(&ife->tcf_lock);
-}
-
-/* under ife->tcf_lock */
 static int populate_metalist(struct tcf_ife_info *ife, struct nlattr **tb)
 {
 	int len = 0;
@@ -474,7 +463,6 @@ static int tcf_ife_init(struct net *net, struct nlattr *nla,
 			saddr = nla_data(tb[TCA_IFE_SMAC]);
 	}
 
-	spin_lock_bh(&ife->tcf_lock);
 	ife->tcf_action = parm->action;
 
 	if (parm->flags & IFE_ENCODE) {
@@ -502,9 +490,8 @@ metadata_parse_err:
 			if (exists)
 				tcf_hash_release(a, bind);
 			if (ret == ACT_P_CREATED)
-				_tcf_ife_cleanup(a, bind);
+				tcf_ife_cleanup(a, bind);
 
-			spin_unlock_bh(&ife->tcf_lock);
 			return err;
 		}
 
@@ -521,15 +508,12 @@ metadata_parse_err:
 		err = use_all_metadata(ife);
 		if (err) {
 			if (ret == ACT_P_CREATED)
-				_tcf_ife_cleanup(a, bind);
+				tcf_ife_cleanup(a, bind);
 
-			spin_unlock_bh(&ife->tcf_lock);
 			return err;
 		}
 	}
 
-	spin_unlock_bh(&ife->tcf_lock);
-
 	if (ret == ACT_P_CREATED)
 		tcf_hash_insert(tn, a);
 
@@ -589,16 +573,19 @@ int find_decode_metaid(struct sk_buff *skb, struct tcf_ife_info *ife,
 {
 	struct tcf_meta_info *e;
 
+	rcu_read_lock();
 	/* XXX: use hash to speed up */
-	list_for_each_entry(e, &ife->metalist, metalist) {
+	list_for_each_entry_rcu(e, &ife->metalist, metalist) {
 		if (metaid == e->metaid) {
 			if (e->ops) {
+				rcu_read_unlock();
 				/* We check for decode presence already */
 				return e->ops->decode(skb, mdata, mlen);
 			}
 		}
 	}
 
+	rcu_read_unlock();
 	return 0;
 }
 
@@ -621,16 +608,12 @@ static int tcf_ife_decode(struct sk_buff *skb, const struct tc_action *a,
 	u16 ifehdrln = ifehdr->metalen;
 	struct meta_tlvhdr *tlv = (struct meta_tlvhdr *)(ifehdr->tlv_data);
 
-	spin_lock(&ife->tcf_lock);
-	bstats_update(&ife->tcf_bstats, skb);
-	ife->tcf_tm.lastuse = jiffies;
-	spin_unlock(&ife->tcf_lock);
+	tcf_lastuse_update(&ife->tcf_tm);
+	bstats_cpu_update(this_cpu_ptr(ife->common.cpu_bstats), skb);
 
 	ifehdrln = ntohs(ifehdrln);
 	if (unlikely(!pskb_may_pull(skb, ifehdrln))) {
-		spin_lock(&ife->tcf_lock);
-		ife->tcf_qstats.drops++;
-		spin_unlock(&ife->tcf_lock);
+		qstats_drop_inc(this_cpu_ptr(ife->common.cpu_qstats));
 		return TC_ACT_SHOT;
 	}
 
@@ -654,7 +637,7 @@ static int tcf_ife_decode(struct sk_buff *skb, const struct tc_action *a,
 			 */
 			pr_info_ratelimited("Unknown metaid %d alnlen %d\n",
 					    mtype, mlen);
-			ife->tcf_qstats.overlimits++;
+			qstats_overlimit_inc(this_cpu_ptr(ife->common.cpu_qstats));
 		}
 
 		tlvdata += mlen;
@@ -671,16 +654,17 @@ static int tcf_ife_decode(struct sk_buff *skb, const struct tc_action *a,
 **/
 static int ife_get_sz(struct sk_buff *skb, struct tcf_ife_info *ife)
 {
-	struct tcf_meta_info *e, *n;
+	struct tcf_meta_info *e;
 	int tot_run_sz = 0, run_sz = 0;
 
-	list_for_each_entry_safe(e, n, &ife->metalist, metalist) {
+	rcu_read_lock();
+	list_for_each_entry_rcu(e, &ife->metalist, metalist) {
 		if (e->ops->check_presence) {
 			run_sz = e->ops->check_presence(skb, e);
 			tot_run_sz += run_sz;
 		}
 	}
-
+	rcu_read_unlock();
 	return tot_run_sz;
 }
 
@@ -709,34 +693,28 @@ static int tcf_ife_encode(struct sk_buff *skb, const struct tc_action *a,
 			exceed_mtu = true;
 	}
 
-	spin_lock(&ife->tcf_lock);
-	bstats_update(&ife->tcf_bstats, skb);
-	ife->tcf_tm.lastuse = jiffies;
+	tcf_lastuse_update(&ife->tcf_tm);
+	bstats_cpu_update(this_cpu_ptr(ife->common.cpu_bstats), skb);
 
+	rcu_read_lock();
 	if (!metalen) {		/* no metadata to send */
 		/* abuse overlimits to count when we allow packet
 		 * with no metadata
 		 */
-		ife->tcf_qstats.overlimits++;
-		spin_unlock(&ife->tcf_lock);
+		qstats_overlimit_inc(this_cpu_ptr(ife->common.cpu_qstats));
+		rcu_read_unlock();
 		return action;
 	}
 	/* could be stupid policy setup or mtu config
 	 * so lets be conservative.. */
-	if ((action == TC_ACT_SHOT) || exceed_mtu) {
-		ife->tcf_qstats.drops++;
-		spin_unlock(&ife->tcf_lock);
-		return TC_ACT_SHOT;
-	}
+	if ((action == TC_ACT_SHOT) || exceed_mtu)
+		goto drop;
 
 	iethh = eth_hdr(skb);
 
 	err = skb_cow_head(skb, hdrm);
-	if (unlikely(err)) {
-		ife->tcf_qstats.drops++;
-		spin_unlock(&ife->tcf_lock);
-		return TC_ACT_SHOT;
-	}
+	if (unlikely(err))
+		goto drop;
 
 	if (!(at & AT_EGRESS))
 		skb_push(skb, skb->dev->hard_header_len);
@@ -756,17 +734,13 @@ static int tcf_ife_encode(struct sk_buff *skb, const struct tc_action *a,
 	 * not repeat some of the computations that are done by
 	 * ops->presence_check...
 	 */
-	list_for_each_entry(e, &ife->metalist, metalist) {
+	list_for_each_entry_rcu(e, &ife->metalist, metalist) {
 		if (e->ops->encode) {
 			err = e->ops->encode(skb, (void *)(skb->data + skboff),
 					     e);
 		}
-		if (err < 0) {
-			/* too corrupt to keep around if overwritten */
-			ife->tcf_qstats.drops++;
-			spin_unlock(&ife->tcf_lock);
-			return TC_ACT_SHOT;
-		}
+		if (err < 0)
+			goto drop;
 		skboff += err;
 	}
 
@@ -783,9 +757,14 @@ static int tcf_ife_encode(struct sk_buff *skb, const struct tc_action *a,
 	if (!(at & AT_EGRESS))
 		skb_pull(skb, skb->dev->hard_header_len);
 
-	spin_unlock(&ife->tcf_lock);
+	rcu_read_unlock();
 
 	return action;
+
+drop:
+	qstats_drop_inc(this_cpu_ptr(ife->common.cpu_qstats));
+	rcu_read_unlock();
+	return TC_ACT_SHOT;
 }
 
 static int tcf_ife_act(struct sk_buff *skb, const struct tc_action *a,
@@ -800,11 +779,10 @@ static int tcf_ife_act(struct sk_buff *skb, const struct tc_action *a,
 		return tcf_ife_decode(skb, a, res);
 
 	pr_info_ratelimited("unknown failure(policy neither de/encode\n");
-	spin_lock(&ife->tcf_lock);
-	bstats_update(&ife->tcf_bstats, skb);
-	ife->tcf_tm.lastuse = jiffies;
-	ife->tcf_qstats.drops++;
-	spin_unlock(&ife->tcf_lock);
+
+	tcf_lastuse_update(&ife->tcf_tm);
+	bstats_cpu_update(this_cpu_ptr(ife->common.cpu_bstats), skb);
+	qstats_drop_inc(this_cpu_ptr(ife->common.cpu_qstats));
 
 	return TC_ACT_SHOT;
 }

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [BUG] act_ife: sleeping functions called in atomic context
  2016-06-17  5:38       ` Cong Wang
@ 2016-06-17 11:05         ` Alexey Khoroshilov
  2016-06-17 17:16           ` Cong Wang
  2016-06-17 11:07         ` Jamal Hadi Salim
  1 sibling, 1 reply; 10+ messages in thread
From: Alexey Khoroshilov @ 2016-06-17 11:05 UTC (permalink / raw)
  To: Cong Wang, Jamal Hadi Salim
  Cc: David S. Miller, Linux Kernel Network Developers, LKML, ldv-project

On 17.06.2016 08:38, Cong Wang wrote:
> On Thu, Jun 16, 2016 at 7:14 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>>
>> I think we can just remove that tcf_lock, I am testing a patch now.
> 
> Please try the attached patch, I will do more tests tomorrow.
> 
> Thanks!
> 

Looks good with two notes:
1. add_metainfo() still contains
ret = ops->alloc(mi, metaval);
that allocates memory with GFP_KERNEL.
So, I would add gfpflag argument to alloc() operation.

2. It makes sense to mention ife_mod_lock in the comment before
add_metainfo(), because ife_mod_lock is the reason to use GFP_ATOMIC there.

--
Alexey

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [BUG] act_ife: sleeping functions called in atomic context
  2016-06-17  5:38       ` Cong Wang
  2016-06-17 11:05         ` Alexey Khoroshilov
@ 2016-06-17 11:07         ` Jamal Hadi Salim
  2016-06-17 17:31           ` Cong Wang
  1 sibling, 1 reply; 10+ messages in thread
From: Jamal Hadi Salim @ 2016-06-17 11:07 UTC (permalink / raw)
  To: Cong Wang
  Cc: Alexey Khoroshilov, David S. Miller,
	Linux Kernel Network Developers, LKML, ldv-project

On 16-06-17 01:38 AM, Cong Wang wrote:
> On Thu, Jun 16, 2016 at 7:14 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>>
>> I think we can just remove that tcf_lock, I am testing a patch now.
>
> Please try the attached patch, I will do more tests tomorrow.
>
> Thanks!
>

Cong, What tree are you using? I dont see the time aggregation patches
that I sent (and Dave took in) in your changes.

Comments:
Is GFP_ATOMIC really necessary? Thats user->kernel interface. GFP_KERNEL 
should be sufficient.
Also, it would be nice to kill the lock - but this feels like two
patches in one. 1) to fix the alloc not to be under the lock 2) to
kill said lock. Maybe split it as such for easier review.
I am using this action extensively so will be happy to test.
I think my patch is a good beginning to #1 - if you fix the forgotten
unlock and ensure we lock around updating ife fields when it exists
already (you said it in your earlier email and I thought about
that afterwards).

cheers,
jamal

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [BUG] act_ife: sleeping functions called in atomic context
  2016-06-17 11:05         ` Alexey Khoroshilov
@ 2016-06-17 17:16           ` Cong Wang
  0 siblings, 0 replies; 10+ messages in thread
From: Cong Wang @ 2016-06-17 17:16 UTC (permalink / raw)
  To: Alexey Khoroshilov
  Cc: Jamal Hadi Salim, David S. Miller,
	Linux Kernel Network Developers, LKML, ldv-project

On Fri, Jun 17, 2016 at 4:05 AM, Alexey Khoroshilov
<khoroshilov@ispras.ru> wrote:
> On 17.06.2016 08:38, Cong Wang wrote:
>> On Thu, Jun 16, 2016 at 7:14 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>>>
>>> I think we can just remove that tcf_lock, I am testing a patch now.
>>
>> Please try the attached patch, I will do more tests tomorrow.
>>
>> Thanks!
>>
>
> Looks good with two notes:
> 1. add_metainfo() still contains
> ret = ops->alloc(mi, metaval);
> that allocates memory with GFP_KERNEL.
> So, I would add gfpflag argument to alloc() operation.

I thought about this too, but we just allocate 32+ bytes here,
not sure if it is really worth to pass a gfp flag.

>
> 2. It makes sense to mention ife_mod_lock in the comment before
> add_metainfo(), because ife_mod_lock is the reason to use GFP_ATOMIC there.

Don't worry, it is in a separated patch, I will explain this
in the changelog. (I sent a combined patch just for review/tests.)

Thanks!

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [BUG] act_ife: sleeping functions called in atomic context
  2016-06-17 11:07         ` Jamal Hadi Salim
@ 2016-06-17 17:31           ` Cong Wang
  2016-06-18 14:38             ` Jamal Hadi Salim
  0 siblings, 1 reply; 10+ messages in thread
From: Cong Wang @ 2016-06-17 17:31 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Alexey Khoroshilov, David S. Miller,
	Linux Kernel Network Developers, LKML, ldv-project

On Fri, Jun 17, 2016 at 4:07 AM, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> On 16-06-17 01:38 AM, Cong Wang wrote:
>>
>> On Thu, Jun 16, 2016 at 7:14 PM, Cong Wang <xiyou.wangcong@gmail.com>
>> wrote:
>>>
>>>
>>> I think we can just remove that tcf_lock, I am testing a patch now.
>>
>>
>> Please try the attached patch, I will do more tests tomorrow.
>>
>> Thanks!
>>
>
> Cong, What tree are you using? I dont see the time aggregation patches
> that I sent (and Dave took in) in your changes.

My patch is against -net. (I see you already figured out your patch is
missing in -net-next.)

Or are you suggesting to rebase it for -net-next? I think it fixes some real
bug so -net is better, although it is slightly large as a bug fix.

>
> Comments:
> Is GFP_ATOMIC really necessary? Thats user->kernel interface. GFP_KERNEL
> should be sufficient.

I added a read_lock(ife_mod_lock), this is why we need
GFP_ATOMIC.

Again, don't worry, this change should be in a separated patch,
you will not miss it again when I send them formally. ;)


> Also, it would be nice to kill the lock - but this feels like two
> patches in one. 1) to fix the alloc not to be under the lock 2) to
> kill said lock. Maybe split it as such for easier review.
> I am using this action extensively so will be happy to test.
> I think my patch is a good beginning to #1 - if you fix the forgotten
> unlock and ensure we lock around updating ife fields when it exists
> already (you said it in your earlier email and I thought about
> that afterwards).

Yes, it makes sense too. Your patch is smaller, if you plan to
backport it to stable, we can use your patch for -net  and -stable
and I am happy to rebase mine for -net-next.

I am fine with either way.

Thanks!

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [BUG] act_ife: sleeping functions called in atomic context
  2016-06-17 17:31           ` Cong Wang
@ 2016-06-18 14:38             ` Jamal Hadi Salim
  0 siblings, 0 replies; 10+ messages in thread
From: Jamal Hadi Salim @ 2016-06-18 14:38 UTC (permalink / raw)
  To: Cong Wang
  Cc: Alexey Khoroshilov, David S. Miller,
	Linux Kernel Network Developers, LKML, ldv-project

[-- Attachment #1: Type: text/plain, Size: 590 bytes --]

On 16-06-17 01:31 PM, Cong Wang wrote:

> My patch is against -net. (I see you already figured out your patch is
> missing in -net-next.)
>

Ok, should have re-read this email before working on the patch;->

> Or are you suggesting to rebase it for -net-next? I think it fixes some real
> bug so -net is better, although it is slightly large as a bug fix.
>

I am conflicted. There are a lot of changes in net-next at the moment;
adding this to -net seems like will definetely cause merge issues for
Dave.

Ok, Cong - patch attached and tested. Let me know what you think.

cheers,
jamal



[-- Attachment #2: patchlet1 --]
[-- Type: text/plain, Size: 2915 bytes --]

diff --git a/net/sched/act_ife.c b/net/sched/act_ife.c
index b7fa969..b52deb4 100644
--- a/net/sched/act_ife.c
+++ b/net/sched/act_ife.c
@@ -239,8 +239,7 @@ static int ife_validate_metatype(struct tcf_meta_ops *ops, void *val, int len)
 	return ret;
 }
 
-/* called when adding new meta information
- * under ife->tcf_lock
+/* called to find new metadata ops. Possibly load it as a module.
 */
 static int load_metaops_and_vet(struct tcf_ife_info *ife, u32 metaid,
 				void *val, int len)
@@ -251,11 +250,9 @@ static int load_metaops_and_vet(struct tcf_ife_info *ife, u32 metaid,
 	if (!ops) {
 		ret = -ENOENT;
 #ifdef CONFIG_MODULES
-		spin_unlock_bh(&ife->tcf_lock);
 		rtnl_unlock();
 		request_module("ifemeta%u", metaid);
 		rtnl_lock();
-		spin_lock_bh(&ife->tcf_lock);
 		ops = find_ife_oplist(metaid);
 #endif
 	}
@@ -272,7 +269,6 @@ static int load_metaops_and_vet(struct tcf_ife_info *ife, u32 metaid,
 }
 
 /* called when adding new meta information
- * under ife->tcf_lock
 */
 static int add_metainfo(struct tcf_ife_info *ife, u32 metaid, void *metaval,
 			int len)
@@ -302,7 +298,9 @@ static int add_metainfo(struct tcf_ife_info *ife, u32 metaid, void *metaval,
 		}
 	}
 
+	spin_lock_bh(&ife->tcf_lock);
 	list_add_tail(&mi->metalist, &ife->metalist);
+	spin_unlock_bh(&ife->tcf_lock);
 
 	return ret;
 }
@@ -357,7 +355,6 @@ out_nlmsg_trim:
 	return -1;
 }
 
-/* under ife->tcf_lock */
 static void _tcf_ife_cleanup(struct tc_action *a, int bind)
 {
 	struct tcf_ife_info *ife = a->priv;
@@ -385,7 +382,6 @@ static void tcf_ife_cleanup(struct tc_action *a, int bind)
 	spin_unlock_bh(&ife->tcf_lock);
 }
 
-/* under ife->tcf_lock */
 static int populate_metalist(struct tcf_ife_info *ife, struct nlattr **tb)
 {
 	int len = 0;
@@ -465,6 +461,8 @@ static int tcf_ife_init(struct net *net, struct nlattr *nla,
 	}
 
 	ife = to_ife(a);
+	if (exists)
+		spin_lock_bh(&ife->tcf_lock);
 	ife->flags = parm->flags;
 
 	if (parm->flags & IFE_ENCODE) {
@@ -475,10 +473,9 @@ static int tcf_ife_init(struct net *net, struct nlattr *nla,
 			saddr = nla_data(tb[TCA_IFE_SMAC]);
 	}
 
-	spin_lock_bh(&ife->tcf_lock);
 	ife->tcf_action = parm->action;
-
 	if (parm->flags & IFE_ENCODE) {
+
 		if (daddr)
 			ether_addr_copy(ife->eth_dst, daddr);
 		else
@@ -492,6 +489,9 @@ static int tcf_ife_init(struct net *net, struct nlattr *nla,
 		ife->eth_type = ife_type;
 	}
 
+	if (exists)
+		spin_unlock_bh(&ife->tcf_lock);
+
 	if (ret == ACT_P_CREATED)
 		INIT_LIST_HEAD(&ife->metalist);
 
@@ -505,7 +505,6 @@ metadata_parse_err:
 			if (ret == ACT_P_CREATED)
 				_tcf_ife_cleanup(a, bind);
 
-			spin_unlock_bh(&ife->tcf_lock);
 			return err;
 		}
 
@@ -524,13 +523,10 @@ metadata_parse_err:
 			if (ret == ACT_P_CREATED)
 				_tcf_ife_cleanup(a, bind);
 
-			spin_unlock_bh(&ife->tcf_lock);
 			return err;
 		}
 	}
 
-	spin_unlock_bh(&ife->tcf_lock);
-
 	if (ret == ACT_P_CREATED)
 		tcf_hash_insert(tn, a);
 

^ permalink raw reply related	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2016-06-18 14:46 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-16 20:50 [BUG] act_ife: sleeping functions called in atomic context Alexey Khoroshilov
2016-06-16 21:43 ` Cong Wang
2016-06-17  0:38   ` Jamal Hadi Salim
2016-06-17  2:14     ` Cong Wang
2016-06-17  5:38       ` Cong Wang
2016-06-17 11:05         ` Alexey Khoroshilov
2016-06-17 17:16           ` Cong Wang
2016-06-17 11:07         ` Jamal Hadi Salim
2016-06-17 17:31           ` Cong Wang
2016-06-18 14:38             ` Jamal Hadi Salim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).