netdev.vger.kernel.org archive mirror
* [PATCH net-next] net: sched: don't release block->lock when dumping chains
@ 2019-02-25 15:45 Vlad Buslov
  2019-02-25 18:25 ` David Miller
  2019-02-26  0:15 ` Cong Wang
  0 siblings, 2 replies; 8+ messages in thread
From: Vlad Buslov @ 2019-02-25 15:45 UTC (permalink / raw)
  To: netdev; +Cc: jhs, xiyou.wangcong, jiri, davem, Vlad Buslov

Function tc_dump_chain() obtains and releases block->lock on each iteration
of its inner loop that dumps all chains on a block. Outputting chain template
info is a fast operation, so locking/unlocking the mutex multiple times is
needless overhead when the lock is highly contended. Modify tc_dump_chain()
to obtain block->lock only once and dump all chains without releasing it.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 net/sched/cls_api.c | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 3543be31d400..5c6a9baa389f 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -2996,12 +2996,12 @@ static int tc_ctl_chain(struct sk_buff *skb, struct nlmsghdr *n,
 /* called with RTNL */
 static int tc_dump_chain(struct sk_buff *skb, struct netlink_callback *cb)
 {
-	struct tcf_chain *chain, *chain_prev;
 	struct net *net = sock_net(skb->sk);
 	struct nlattr *tca[TCA_MAX + 1];
 	struct Qdisc *q = NULL;
 	struct tcf_block *block;
 	struct tcmsg *tcm = nlmsg_data(cb->nlh);
+	struct tcf_chain *chain;
 	long index_start;
 	long index;
 	u32 parent;
@@ -3064,11 +3064,8 @@ static int tc_dump_chain(struct sk_buff *skb, struct netlink_callback *cb)
 	index_start = cb->args[0];
 	index = 0;
 
-	for (chain = __tcf_get_next_chain(block, NULL);
-	     chain;
-	     chain_prev = chain,
-		     chain = __tcf_get_next_chain(block, chain),
-		     tcf_chain_put(chain_prev)) {
+	mutex_lock(&block->lock);
+	list_for_each_entry(chain, &block->chain_list, list) {
 		if ((tca[TCA_CHAIN] &&
 		     nla_get_u32(tca[TCA_CHAIN]) != chain->index))
 			continue;
@@ -3076,17 +3073,18 @@ static int tc_dump_chain(struct sk_buff *skb, struct netlink_callback *cb)
 			index++;
 			continue;
 		}
+		if (tcf_chain_held_by_acts_only(chain))
+			continue;
 		err = tc_chain_fill_node(chain->tmplt_ops, chain->tmplt_priv,
 					 chain->index, net, skb, block,
 					 NETLINK_CB(cb->skb).portid,
 					 cb->nlh->nlmsg_seq, NLM_F_MULTI,
 					 RTM_NEWCHAIN);
-		if (err <= 0) {
-			tcf_chain_put(chain);
+		if (err <= 0)
 			break;
-		}
 		index++;
 	}
+	mutex_unlock(&block->lock);
 
 	if (tcm->tcm_ifindex == TCM_IFINDEX_MAGIC_BLOCK)
 		tcf_block_refcnt_put(block, true);
-- 
2.13.6
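
For context, a minimal sketch of the two locking patterns the commit message
contrasts. The helper body is an assumption based on the description above
(the real iterator is __tcf_get_next_chain() in net/sched/cls_api.c), and
chain_get_next()/dump_one_chain() are placeholder names:

/* Old pattern: the iterator takes and drops block->lock on every step,
 * pinning the returned chain with a reference so it stays valid while
 * the lock is not held.
 */
static struct tcf_chain *chain_get_next(struct tcf_block *block,
					struct tcf_chain *chain)
{
	mutex_lock(&block->lock);
	if (!chain)
		chain = list_first_entry_or_null(&block->chain_list,
						 struct tcf_chain, list);
	else if (list_is_last(&chain->list, &block->chain_list))
		chain = NULL;
	else
		chain = list_next_entry(chain, list);
	if (chain)
		tcf_chain_hold(chain);	/* take the reference under the lock */
	mutex_unlock(&block->lock);
	return chain;
}

/* New pattern in tc_dump_chain(): one lock acquisition for the whole (fast)
 * template dump, no per-chain references needed.
 */
mutex_lock(&block->lock);
list_for_each_entry(chain, &block->chain_list, list)
	dump_one_chain(chain);	/* fast, never sleeps */
mutex_unlock(&block->lock);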


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH net-next] net: sched: don't release block->lock when dumping chains
  2019-02-25 15:45 [PATCH net-next] net: sched: don't release block->lock when dumping chains Vlad Buslov
@ 2019-02-25 18:25 ` David Miller
  2019-02-26  0:15 ` Cong Wang
  1 sibling, 0 replies; 8+ messages in thread
From: David Miller @ 2019-02-25 18:25 UTC (permalink / raw)
  To: vladbu; +Cc: netdev, jhs, xiyou.wangcong, jiri

From: Vlad Buslov <vladbu@mellanox.com>
Date: Mon, 25 Feb 2019 17:45:44 +0200

> Function tc_dump_chain() obtains and releases block->lock on each iteration
> of its inner loop that dumps all chains on a block. Outputting chain template
> info is a fast operation, so locking/unlocking the mutex multiple times is
> needless overhead when the lock is highly contended. Modify tc_dump_chain()
> to obtain block->lock only once and dump all chains without releasing it.
> 
> Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
> Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>

Applied, thanks Vlad.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net-next] net: sched: don't release block->lock when dumping chains
  2019-02-25 15:45 [PATCH net-next] net: sched: don't release block->lock when dumping chains Vlad Buslov
  2019-02-25 18:25 ` David Miller
@ 2019-02-26  0:15 ` Cong Wang
  2019-02-26 16:09   ` Vlad Buslov
  1 sibling, 1 reply; 8+ messages in thread
From: Cong Wang @ 2019-02-26  0:15 UTC (permalink / raw)
  To: Vlad Buslov
  Cc: Linux Kernel Network Developers, Jamal Hadi Salim, Jiri Pirko,
	David Miller

On Mon, Feb 25, 2019 at 7:45 AM Vlad Buslov <vladbu@mellanox.com> wrote:
>
> Function tc_dump_chain() obtains and releases block->lock on each iteration
> of its inner loop that dumps all chains on a block. Outputting chain template
> info is a fast operation, so locking/unlocking the mutex multiple times is
> needless overhead when the lock is highly contended. Modify tc_dump_chain()
> to obtain block->lock only once and dump all chains without releasing it.
>
> Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
> Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>

Thanks for the followup!

Isn't it similar for __tcf_get_next_proto() in tcf_chain_dump()?
And for tc_dump_tfilter()?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net-next] net: sched: don't release block->lock when dumping chains
  2019-02-26  0:15 ` Cong Wang
@ 2019-02-26 16:09   ` Vlad Buslov
  2019-02-27 23:03     ` Cong Wang
  0 siblings, 1 reply; 8+ messages in thread
From: Vlad Buslov @ 2019-02-26 16:09 UTC (permalink / raw)
  To: Cong Wang
  Cc: Linux Kernel Network Developers, Jamal Hadi Salim, Jiri Pirko,
	David Miller


On Tue 26 Feb 2019 at 00:15, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Mon, Feb 25, 2019 at 7:45 AM Vlad Buslov <vladbu@mellanox.com> wrote:
>>
>> Function tc_dump_chain() obtains and releases block->lock on each iteration
>> of its inner loop that dumps all chains on a block. Outputting chain template
>> info is a fast operation, so locking/unlocking the mutex multiple times is
>> needless overhead when the lock is highly contended. Modify tc_dump_chain()
>> to obtain block->lock only once and dump all chains without releasing it.
>>
>> Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
>> Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
>
> Thanks for the followup!
>
> Isn't it similar for __tcf_get_next_proto() in tcf_chain_dump()?
> And for tc_dump_tfilter()?

Not really. These two dump all tp filters, not just a template, which is
O(n) in the number of filters and can be slow because they call the hw
offload API for each of them. Our typical use case involves a periodic
filter dump (to update stats) while multiple concurrent user-space threads
are updating filters, so it is important for those threads to be able to
execute in parallel.
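
For illustration, a sketch of the pattern being defended here, i.e. why the
filter dump keeps re-taking the lock per iteration instead of holding it
throughout. Helper names are placeholders, not the exact tc_dump_tfilter()
code:

static void dump_all_filters(struct tcf_block *block)
{
	struct tcf_chain *chain, *prev;

	/* Hold a reference to each chain so block->lock can be dropped
	 * around the slow part; concurrent writers can take the lock
	 * between iterations.
	 */
	for (chain = chain_get_next_ref(block, NULL); chain;
	     prev = chain, chain = chain_get_next_ref(block, prev),
	     chain_put_ref(prev)) {
		/* O(n) over filters and may call driver offload hooks,
		 * i.e. may sleep, so it must run without block->lock held.
		 */
		dump_chain_filters_and_stats(chain);
	}
}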

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net-next] net: sched: don't release block->lock when dumping chains
  2019-02-26 16:09   ` Vlad Buslov
@ 2019-02-27 23:03     ` Cong Wang
  2019-02-28 14:53       ` Vlad Buslov
  0 siblings, 1 reply; 8+ messages in thread
From: Cong Wang @ 2019-02-27 23:03 UTC (permalink / raw)
  To: Vlad Buslov
  Cc: Linux Kernel Network Developers, Jamal Hadi Salim, Jiri Pirko,
	David Miller

On Tue, Feb 26, 2019 at 8:10 AM Vlad Buslov <vladbu@mellanox.com> wrote:
>
>
> On Tue 26 Feb 2019 at 00:15, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> > On Mon, Feb 25, 2019 at 7:45 AM Vlad Buslov <vladbu@mellanox.com> wrote:
> >>
> >> Function tc_dump_chain() obtains and releases block->lock on each iteration
> >> of its inner loop that dumps all chains on a block. Outputting chain template
> >> info is a fast operation, so locking/unlocking the mutex multiple times is
> >> needless overhead when the lock is highly contended. Modify tc_dump_chain()
> >> to obtain block->lock only once and dump all chains without releasing it.
> >>
> >> Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
> >> Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
> >
> > Thanks for the followup!
> >
> > Isn't it similar for __tcf_get_next_proto() in tcf_chain_dump()?
> > And for tc_dump_tfilter()?
>
> Not really. These two dump all tp filters, not just a template, which is
> O(n) in the number of filters and can be slow because they call the hw
> offload API for each of them. Our typical use case involves a periodic
> filter dump (to update stats) while multiple concurrent user-space threads
> are updating filters, so it is important for those threads to be able to
> execute in parallel.

Hmm, but if these are read-only, you probably don't even need a
mutex, you can just use RCU read lock to protect list iteration
and you still can grab the refcnt in the same way.
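
The suggested pattern is roughly the following (a generic sketch: it assumes
the object carries an atomic refcount_t that can be taken without the mutex,
which the follow-up replies explain is not the case for tcf_chain today;
do_sleeping_work(), chain_put() and index are placeholders):

struct tcf_chain *found = NULL;

rcu_read_lock();
list_for_each_entry_rcu(chain, &block->chain_list, list) {
	if (chain->index == index &&
	    refcount_inc_not_zero(&chain->refcnt)) {
		found = chain;	/* pinned: stays valid after unlock */
		break;
	}
}
rcu_read_unlock();

if (found) {
	/* The RCU read section is already dropped, so sleeping is fine. */
	do_sleeping_work(found);
	chain_put(found);	/* release the reference */
}

A full dump would additionally need a way to resume the list walk after
dropping the RCU read lock, which is what the mutex-based get-next helpers
provide.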

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net-next] net: sched: don't release block->lock when dumping chains
  2019-02-27 23:03     ` Cong Wang
@ 2019-02-28 14:53       ` Vlad Buslov
  2019-03-02  0:08         ` Cong Wang
  0 siblings, 1 reply; 8+ messages in thread
From: Vlad Buslov @ 2019-02-28 14:53 UTC (permalink / raw)
  To: Cong Wang
  Cc: Linux Kernel Network Developers, Jamal Hadi Salim, Jiri Pirko,
	David Miller


On Wed 27 Feb 2019 at 23:03, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Tue, Feb 26, 2019 at 8:10 AM Vlad Buslov <vladbu@mellanox.com> wrote:
>>
>>
>> On Tue 26 Feb 2019 at 00:15, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>> > On Mon, Feb 25, 2019 at 7:45 AM Vlad Buslov <vladbu@mellanox.com> wrote:
>> >>
>> >> Function tc_dump_chain() obtains and releases block->lock on each iteration
>> >> of its inner loop that dumps all chains on a block. Outputting chain template
>> >> info is a fast operation, so locking/unlocking the mutex multiple times is
>> >> needless overhead when the lock is highly contended. Modify tc_dump_chain()
>> >> to obtain block->lock only once and dump all chains without releasing it.
>> >>
>> >> Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
>> >> Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
>> >
>> > Thanks for the followup!
>> >
>> > Isn't it similar for __tcf_get_next_proto() in tcf_chain_dump()?
>> > And for tc_dump_tfilter()?
>>
>> Not really. These two dump all tp filters, not just a template, which is
>> O(n) in the number of filters and can be slow because they call the hw
>> offload API for each of them. Our typical use case involves a periodic
>> filter dump (to update stats) while multiple concurrent user-space threads
>> are updating filters, so it is important for those threads to be able to
>> execute in parallel.
>
> Hmm, but if these are read-only, you probably don't even need a
> mutex, you can just use RCU read lock to protect list iteration
> and you still can grab the refcnt in the same way.

That is how it worked in my initial implementation. However, it doesn't
work with hw offloads because driver callbacks can sleep.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net-next] net: sched: don't release block->lock when dumping chains
  2019-02-28 14:53       ` Vlad Buslov
@ 2019-03-02  0:08         ` Cong Wang
  2019-03-04 14:14           ` Vlad Buslov
  0 siblings, 1 reply; 8+ messages in thread
From: Cong Wang @ 2019-03-02  0:08 UTC (permalink / raw)
  To: Vlad Buslov
  Cc: Linux Kernel Network Developers, Jamal Hadi Salim, Jiri Pirko,
	David Miller

On Thu, Feb 28, 2019 at 6:53 AM Vlad Buslov <vladbu@mellanox.com> wrote:
>
>
> On Wed 27 Feb 2019 at 23:03, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> > On Tue, Feb 26, 2019 at 8:10 AM Vlad Buslov <vladbu@mellanox.com> wrote:
> >>
> >>
> >> On Tue 26 Feb 2019 at 00:15, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> >> > On Mon, Feb 25, 2019 at 7:45 AM Vlad Buslov <vladbu@mellanox.com> wrote:
> >> >>
> >> >> Function tc_dump_chain() obtains and releases block->lock on each iteration
> >> >> of its inner loop that dumps all chains on a block. Outputting chain template
> >> >> info is a fast operation, so locking/unlocking the mutex multiple times is
> >> >> needless overhead when the lock is highly contended. Modify tc_dump_chain()
> >> >> to obtain block->lock only once and dump all chains without releasing it.
> >> >>
> >> >> Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
> >> >> Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
> >> >
> >> > Thanks for the followup!
> >> >
> >> > Isn't it similar for __tcf_get_next_proto() in tcf_chain_dump()?
> >> > And for tc_dump_tfilter()?
> >>
> >> Not really. These two dump all tp filters, not just a template, which is
> >> O(n) in the number of filters and can be slow because they call the hw
> >> offload API for each of them. Our typical use case involves a periodic
> >> filter dump (to update stats) while multiple concurrent user-space threads
> >> are updating filters, so it is important for those threads to be able to
> >> execute in parallel.
> >
> > Hmm, but if these are read-only, you probably don't even need a
> > mutex, you can just use RCU read lock to protect list iteration
> > and you still can grab the refcnt in the same way.
>
> That is how it worked in my initial implementation. However, it doesn't
> work with hw offloads because driver callbacks can sleep.

Hmm? You drop RCU read lock after grabbing the refcnt,
right? If so what's the problem with sleeping?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net-next] net: sched: don't release block->lock when dumping chains
  2019-03-02  0:08         ` Cong Wang
@ 2019-03-04 14:14           ` Vlad Buslov
  0 siblings, 0 replies; 8+ messages in thread
From: Vlad Buslov @ 2019-03-04 14:14 UTC (permalink / raw)
  To: Cong Wang
  Cc: Linux Kernel Network Developers, Jamal Hadi Salim, Jiri Pirko,
	David Miller


On Sat 02 Mar 2019 at 00:08, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Thu, Feb 28, 2019 at 6:53 AM Vlad Buslov <vladbu@mellanox.com> wrote:
>>
>>
>> On Wed 27 Feb 2019 at 23:03, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>> > On Tue, Feb 26, 2019 at 8:10 AM Vlad Buslov <vladbu@mellanox.com> wrote:
>> >>
>> >>
>> >> On Tue 26 Feb 2019 at 00:15, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>> >> > On Mon, Feb 25, 2019 at 7:45 AM Vlad Buslov <vladbu@mellanox.com> wrote:
>> >> >>
>> >> >> Function tc_dump_chain() obtains and releases block->lock on each iteration
>> >> >> of its inner loop that dumps all chains on a block. Outputting chain template
>> >> >> info is a fast operation, so locking/unlocking the mutex multiple times is
>> >> >> needless overhead when the lock is highly contended. Modify tc_dump_chain()
>> >> >> to obtain block->lock only once and dump all chains without releasing it.
>> >> >>
>> >> >> Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
>> >> >> Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
>> >> >
>> >> > Thanks for the followup!
>> >> >
>> >> > Isn't it similar for __tcf_get_next_proto() in tcf_chain_dump()?
>> >> > And for tc_dump_tfilter()?
>> >>
>> >> Not really. These two dump all tp filters, not just a template, which is
>> >> O(n) in the number of filters and can be slow because they call the hw
>> >> offload API for each of them. Our typical use case involves a periodic
>> >> filter dump (to update stats) while multiple concurrent user-space threads
>> >> are updating filters, so it is important for those threads to be able to
>> >> execute in parallel.
>> >
>> > Hmm, but if these are read-only, you probably don't even need a
>> > mutex, you can just use RCU read lock to protect list iteration
>> > and you still can grab the refcnt in the same way.
>>
>> That is how it worked in my initial implementation. However, it doesn't
>> work with hw offloads because driver callbacks can sleep.
>
> Hmm? You drop RCU read lock after grabbing the refcnt,
> right? If so what's the problem with sleeping?

Okay, I misunderstood your suggestion. In tc_dump_tfilter() we can't use
RCU in __tcf_get_next_chain() because chain reference counters are not
atomic and require the protection of block->lock. __tcf_get_next_proto()
requires chain->filter_chain_lock because it checks the 'deleting' flag
in addition to taking a reference to the tp.
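
To make the constraint concrete, a small sketch (field and helper names are
simplified, not the exact struct tcf_chain definitions):

/* The chain refcount is a plain integer protected by block->lock, so a
 * reference can only be taken while holding the mutex:
 */
mutex_lock(&block->lock);
chain->refcnt++;
mutex_unlock(&block->lock);

/* An RCU reader would instead need the counter to be an atomic refcount_t
 * with a "fail if already zero" acquire, e.g.:
 */
if (refcount_inc_not_zero(&chain->refcnt))
	use_chain_outside_rcu(chain);	/* placeholder for the dump work */

/* Likewise, __tcf_get_next_proto() has to observe the chain's 'deleting'
 * flag and take the tp reference atomically with respect to writers,
 * which is what chain->filter_chain_lock provides.
 */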

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2019-03-04 14:14 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-25 15:45 [PATCH net-next] net: sched: don't release block->lock when dumping chains Vlad Buslov
2019-02-25 18:25 ` David Miller
2019-02-26  0:15 ` Cong Wang
2019-02-26 16:09   ` Vlad Buslov
2019-02-27 23:03     ` Cong Wang
2019-02-28 14:53       ` Vlad Buslov
2019-03-02  0:08         ` Cong Wang
2019-03-04 14:14           ` Vlad Buslov
