netdev.vger.kernel.org archive mirror
* locating the 'tc actions' hook
@ 2013-07-31 21:19 John Fastabend
  2013-08-01 11:40 ` Jamal Hadi Salim
  0 siblings, 1 reply; 8+ messages in thread
From: John Fastabend @ 2013-07-31 21:19 UTC (permalink / raw)
  To: Jamal Hadi Salim, Stephen Hemminger; +Cc: Eric Dumazet, Tom Herbert, netdev

Hi Jamal,

I'm trying to sort out why I would use 'tc actions' and exactly
what it does.

It's clear, I think, that adding one or more actions to filters lets
them be executed in the classifier via tcf_exts_exec() through the
classify hook called from a qdisc.

This is your standard

	# tc filter add ... {u32|fw|tcindex|route|...} ... action ...

When these actions get configured, the specific action's tc_action_ops
will be used to init the action, create the hash entry via
tcf_hash_create(), and parse the options. All of this I can follow in
./net/sched/.
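
To make that concrete, a hypothetical instance of the above (device
name, handle, and match values are made up) would be a u32 filter with
a gact drop action attached:

```shell
# Hypothetical example: classify TCP dport 80 on eth0's root qdisc and
# attach a drop (gact) action. The action is executed via
# tcf_exts_exec() when the u32 match hits.
tc filter add dev eth0 parent 1: protocol ip u32 \
    match ip dport 80 0xffff \
    action drop
```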

But the actions netlink hook does this,

tc_ctl_action()
	tcf_action_add()
		tcf_action_init() <- inserts action in the table

So at this point we have the entry in the table, but I must be missing
where the tc_action_ops act() is going to be called, because it's not
via tcf_exts_exec().

Am I missing something obvious here? Is there a way to link them to
filters? Sorry if it turns out to be a stupid question.

My motivation here is to use the filters/actions outside the qdisc lock
for mq, mqprio, and the ingress qdisc.

.John

-- 
John Fastabend         Intel Corporation


* Re: locating the 'tc actions' hook
  2013-07-31 21:19 locating the 'tc actions' hook John Fastabend
@ 2013-08-01 11:40 ` Jamal Hadi Salim
  2013-08-01 23:18   ` John Fastabend
  0 siblings, 1 reply; 8+ messages in thread
From: Jamal Hadi Salim @ 2013-08-01 11:40 UTC (permalink / raw)
  To: John Fastabend; +Cc: Stephen Hemminger, Eric Dumazet, Tom Herbert, netdev

On 13-07-31 05:19 PM, John Fastabend wrote:
> Hi Jamal,
>
> I'm trying to sort out why I would use 'tc actions' and exactly
> what it does.
>
> It's clear, I think, that adding one or more actions to filters lets
> them be executed in the classifier via tcf_exts_exec() through the
> classify hook called from a qdisc.
>
> This is your standard
>
>      # tc filter add ... {u32|fw|tcindex|route|...} ... action ...
>
> When these actions get configured, the specific action's tc_action_ops
> will be used to init the action, create the hash entry via
> tcf_hash_create(), and parse the options. All of this I can follow in
> ./net/sched/.
>
> But the actions netlink hook does this,
>
> tc_ctl_action()
>      tcf_action_add()
>          tcf_action_init() <- inserts action in the table
>
> So at this point we have the entry in the table, but I must be missing
> where the tc_action_ops act() is going to be called, because it's not
> via tcf_exts_exec().
>


There are two ways to add actions.

a) You specify a filter followed by a graph of actions to be executed
on a match (yes, in the kernel 10 years before OpenFlow, and still
expressively more powerful ;->).

b) You add actions first, then later specify the filter followed by a
graph of actions. Actions in that case are referenced by their table
indices.

(a) tends to be the more popular way of provisioning.
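
As a sketch of (b) — device, rates, and indices here are hypothetical —
you pre-create the action and then bind filters to it by its table
index, so multiple flows share one policer instance:

```shell
# Method (b), hypothetical values: create the action first; 'index 1'
# is its slot in the action table (created via tcf_hash_create).
tc actions add action police rate 1mbit burst 10k index 1

# Later, attach filters that reference the existing action by index.
# Both flows now share the same policer instance (and its rate state).
tc filter add dev eth0 parent 1: protocol ip u32 \
    match ip src 10.0.0.1/32 action police index 1
tc filter add dev eth0 parent 1: protocol ip u32 \
    match ip src 10.0.0.2/32 action police index 1
```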

> Am I missing something obvious here? Is there a way to link them to
> filters? Sorry if it turns out to be a stupid question.
>

I think the second use case is what you are bumping into. I know from
answering questions that this is a very popular use case in some
Eastern European countries (where one policer with a specific rate is
shared by many flows); I think they have a setup where you share your
DSL connection with your neighbors. It's quite a clever setup.


> My motivation here is to use the filters/actions outside the qdisc lock
> for mq, mqprio, and the ingress qdisc.
>

Are you trying to offload these actions into hardware?
Is the classifier in hardware?
Please let me know if you need further help; for example, I could send
you a bunch of examples for either case.

cheers,
jamal


> .John
>


* Re: locating the 'tc actions' hook
  2013-08-01 11:40 ` Jamal Hadi Salim
@ 2013-08-01 23:18   ` John Fastabend
  2013-08-02 18:46     ` John Fastabend
  2013-08-03 11:47     ` Jamal Hadi Salim
  0 siblings, 2 replies; 8+ messages in thread
From: John Fastabend @ 2013-08-01 23:18 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: John Fastabend, Stephen Hemminger, Eric Dumazet, Tom Herbert, netdev

[...]

>> Am I missing something obvious here? Is there a way to link them to
>> filters? Sorry if it turns out to be a stupid question.
>>
>
> I think the second use case is what you are bumping into. I know from
> answering questions that this is a very popular use case in some
> Eastern European countries (where one policer with a specific rate is
> shared by many flows); I think they have a setup where you share your
> DSL connection with your neighbors. It's quite a clever setup.
>

Great, thanks. I was missing part (b) above. Now I see how the index
works.

>
>> My motivation here is to use the filters/actions outside the qdisc lock
>> for mq, mqprio, and the ingress qdisc.
>>
>
> Are you trying to offload these actions into hardware?
> Is the classifier in hardware?
> Please let me know if you need further help; for example, I could
> send you a bunch of examples for either case.
>

I have two things in mind for this.

The first is directly related to the previous per-queue rate limiter
patch. With rate limiters per queue on a multiqueue device using mq or
mqprio, I need some mechanism to steer packets to queues. One way to
do this is to use mqprio and create a 'tc' with a single queue in it,
and then use iptables or netprio_cgroup to steer packets. Another way
to do this would be to use 'skbedit queue_mapping' to set the queue
from 'tc', but unfortunately with the existing code path the queue has
already been selected by the time the classifiers are called. Calling
into the classifier chain before picking the qdisc would fix this. For
flow-based QoS with multiqueue devices this type of functionality
would be useful.
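
The skbedit approach might look like the following sketch (device,
handle, and match are hypothetical); note the caveat above — today
this classifier runs after the tx queue has already been chosen, so it
cannot influence queue selection:

```shell
# Hypothetical sketch: mqprio root qdisc, then attempt to steer
# port-80 traffic to tx queue 3 with skbedit. As described above, the
# classifier currently runs too late for queue_mapping to take effect.
tc qdisc add dev eth0 root handle 1: mqprio
tc filter add dev eth0 parent 1: protocol ip u32 \
    match ip dport 80 0xffff \
    action skbedit queue_mapping 3
```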

The second thought I've been piecing together would be to populate the
rxhash (or maybe some other field) using the hardware flow classifier
in some meaningful way for the ingress qdisc. Some of the existing
Intel NICs can do this, and I believe other vendors have similar
capabilities. Currently, though, with the qdisc lock wrapped around
the ingress qdisc, multiqueue devices take a perf hit just by
instantiating the ingress qdisc, which really only uses the lock to
guard some stats and keep the classifier/action chains sane.

If you have some good examples it would be great to see them and drop
them in my testbed. Go ahead and send them to me offlist if you can.

.John

> cheers,
> jamal
>
>
>> .John
>>
>


* Re: locating the 'tc actions' hook
  2013-08-01 23:18   ` John Fastabend
@ 2013-08-02 18:46     ` John Fastabend
  2013-08-03 11:49       ` Jamal Hadi Salim
  2013-08-03 11:47     ` Jamal Hadi Salim
  1 sibling, 1 reply; 8+ messages in thread
From: John Fastabend @ 2013-08-02 18:46 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: John Fastabend, Stephen Hemminger, Eric Dumazet, Tom Herbert, netdev

On 08/01/2013 04:18 PM, John Fastabend wrote:
> [...]
>
>>> Am I missing something obvious here? Is there a way to link them to
>>> filters? Sorry if it turns out to be a stupid question.
>>>
>>
>> I think the second use case is what you are bumping into. I know from
>> answering questions that this is a very popular use case in some
>> Eastern European countries (where one policer with a specific rate is
>> shared by many flows); I think they have a setup where you share your
>> DSL connection with your neighbors. It's quite a clever setup.
>>
>
> Great, thanks. I was missing part (b) above. Now I see how the index
> works.
>

Perhaps another incorrect observation, but what protects the
tc_actions?

Create a series of actions via 'tc actions'; this populates the hash
table protected by hinfo->lock, and rtnetlink is holding the rtnl
lock.

Add a filter with the index hook to get this action graph attached to
a filter's tcf_exts pointer.

Now for what I think is the race: the classifier will call
tcf_exts_exec(), which will call tcf_action_exec() and start walking
the actions and executing them with the qdisc_lock held.

At the same time, tcf_action_destroy() may be called via 'tc actions
delete', which will only hold the rtnl lock via rtnetlink.

Again, I think I might be missing a piece somewhere, but I'm not
seeing how the locking adds up here. I'll look at it a bit more, but I
thought it might be worth asking.


>>
>>> My motivation here is to use the filters/actions outside the qdisc lock
>>> for mq, mqprio, and the ingress qdisc.
>>>
>>
>> Are you trying to offload these actions into hardware?
>> Is the classifier in hardware?
>> Please let me know if you need further help; for example, I could
>> send you a bunch of examples for either case.
>>
>
> I have two things in mind for this.
>
> The first being directly related to the previous per queue rate limiter
> patch. With rate limiters per queue on a multiqueue device using mq or
> mqprio I need some mechanism to steer packets to queues. One way to do
> this is to use mqprio and create a 'tc' with a single queue in it.
> And then use iptables or netprio_cgroup to steer packets. Another way
> to do this would be to use 'skbedit queue_mapping' to set the queue from
> 'tc' but unfortunately with the existing flows the queue has already
> been selected by the time the classifiers are called. Calling into the
> classifier chain before picking the qdisc would fix this. For
> flow-based QoS with multiqueue devices this type of functionality
> would be useful.
>
> The second thought that I've been piecing together would be to populate
> the rxhash (or maybe some other field) using the hardware flow
> classifier in some meaningful way for the ingress qdisc. Some of the
> existing Intel NICs can do this and I believe other vendors have similar
> capabilities. Although currently with the qdisc lock running around the
> ingress qdisc the multiqueue devices take a perf hit just by
> instantiating the ingress qdisc which really is only using the lock to
> guard some stats and keep the classifier/action chains sane.
>
> If you have some good examples it would be great to see them and drop
> them in my testbed. Go ahead and send them to me offlist if you can.
>
> .John
>
>> cheers,
>> jamal
>>
>>
>>> .John
>>>
>>
>


-- 
John Fastabend         Intel Corporation


* Re: locating the 'tc actions' hook
  2013-08-01 23:18   ` John Fastabend
  2013-08-02 18:46     ` John Fastabend
@ 2013-08-03 11:47     ` Jamal Hadi Salim
  2013-08-05 16:11       ` John Fastabend
  1 sibling, 1 reply; 8+ messages in thread
From: Jamal Hadi Salim @ 2013-08-03 11:47 UTC (permalink / raw)
  To: John Fastabend
  Cc: John Fastabend, Stephen Hemminger, Eric Dumazet, Tom Herbert, netdev

On 13-08-01 07:18 PM, John Fastabend wrote:

[..]

> The first being directly related to the previous per queue rate limiter
> patch. With rate limiters per queue on a multiqueue device using mq or
> mqprio I need some mechanism to steer packets to queues. One way to do
> this is to use mqprio and create a 'tc' with a single queue in it.
> And then use iptables or netprio_cgroup to steer packets. Another way
> to do this would be to use 'skbedit queue_mapping' to set the queue from
> 'tc' but unfortunately with the existing flows the queue has already
> been selected by the time the classifiers are called.

I am assuming the skb (mark) will be tagged with a proper, meaningful
tag, maybe by the driver. Such a tag can be used later up the stack.

> Calling into the
> classifier chain before picking the qdisc would fix this. For
> flow-based QoS with multiqueue devices this type of functionality
> would be useful.
>

From your description this seems to be the ingress side, so we should
be able to use that tag if you set it.


> The second thought that I've been piecing together would be to populate
> the rxhash (or maybe some other field) using the hardware flow
> classifier in some meaningful way for the ingress qdisc.


The rxhash would be useful for further classification or avoiding
further classification.

> Some of the
> existing Intel NICs can do this and I believe other vendors have similar
> capabilities. Although currently with the qdisc lock running around the
> ingress qdisc the multiqueue devices take a perf hit just by
> instantiating the ingress qdisc which really is only using the lock to
> guard some stats and keep the classifier/action chains sane.
>

Instantiating any qdisc is not a problem; you do it once and are done.
There are locking penalties with using qdiscs, but those penalties
really have to do with the design of the netdev, not the qdiscs:
queues are centered around netdevs, and netdevs are shared across
processors.


> If you have some good examples it would be great to see them and drop
> them in my testbed. Go ahead and send them to me offlist if you can.
>

will do.

cheers,
jamal


> .John
>


* Re: locating the 'tc actions' hook
  2013-08-02 18:46     ` John Fastabend
@ 2013-08-03 11:49       ` Jamal Hadi Salim
  0 siblings, 0 replies; 8+ messages in thread
From: Jamal Hadi Salim @ 2013-08-03 11:49 UTC (permalink / raw)
  To: John Fastabend
  Cc: John Fastabend, Stephen Hemminger, Eric Dumazet, Tom Herbert, netdev



On 13-08-02 02:46 PM, John Fastabend wrote:

> Perhaps another incorrect observation but what protects the tc_actions?
>
> Create a series of actions via 'tc actions' which populates the hash
> table protected by hinfo->lock and also rtnetlink is holding the rtnl
> lock.
>
> Add a filter with the index hook to get this action graph attached
> to a filter's tcf_exts pointer.
>
> Now for what I think is the race, the classifier will call tcf_exts_exec
> which will call tcf_action_exec() and start walking the actions and
> executing them with the qdisc_lock held.
>
> At the same time tcf_action_destroy() may be called via 'tc actions
> delete' which will only hold the rtnl lock via rtnetlink.
>

One runs in user context and the other in softirq context, so it will
work fine.

cheers,
jamal


* Re: locating the 'tc actions' hook
  2013-08-03 11:47     ` Jamal Hadi Salim
@ 2013-08-05 16:11       ` John Fastabend
  2013-08-12  0:55         ` Jamal Hadi Salim
  0 siblings, 1 reply; 8+ messages in thread
From: John Fastabend @ 2013-08-05 16:11 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: John Fastabend, Stephen Hemminger, Eric Dumazet, Tom Herbert, netdev

On 08/03/2013 04:47 AM, Jamal Hadi Salim wrote:
> On 13-08-01 07:18 PM, John Fastabend wrote:
>
> [..]
>
>> The first being directly related to the previous per queue rate limiter
>> patch. With rate limiters per queue on a multiqueue device using mq or
>> mqprio I need some mechanism to steer packets to queues. One way to do
>> this is to use mqprio and create a 'tc' with a single queue in it.
>> And then use iptables or netprio_cgroup to steer packets. Another way
>> to do this would be to use 'skbedit queue_mapping' to set the queue from
>> 'tc' but unfortunately with the existing flows the queue has already
>> been selected by the time the classifiers are called.
>
> I am assuming the skb (mark) will be tagged with a proper, meaningful
> tag, maybe by the driver. Such a tag can be used later up the stack.
>

Actually in the first case here I was considering egress traffic.

Separating the classifier chain from the qdisc allows using the
classifiers to pick a qdisc via queue_mapping, because qdiscs are
attached to queues when using sch_mq.

>> Calling into the
>> classifier chain before picking the qdisc would fix this. For
>> flow-based QoS with multiqueue devices this type of functionality
>> would be useful.
>>
>
> From your description this seems to be the ingress side, so we
> should be able to use that tag if you set it.
>

Sorry, I mixed two examples, one egress and one ingress, together
without clearly stating it. The second case here was ingress, the
first egress.

>
>> The second thought that I've been piecing together would be to populate
>> the rxhash (or maybe some other field) using the hardware flow
>> classifier in some meaningful way for the ingress qdisc.
>
>
> The rxhash would be useful for further classification or avoiding
> further classification.
>
>> Some of the
>> existing Intel NICs can do this and I believe other vendors have similar
>> capabilities. Although currently with the qdisc lock running around the
>> ingress qdisc the multiqueue devices take a perf hit just by
>> instantiating the ingress qdisc which really is only using the lock to
>> guard some stats and keep the classifier/action chains sane.
>>
>
> Instantiating any qdisc is not a problem. You do it once and are done.
> There are penalties with using qdiscs in terms of locking.
> The locks penalties really have to do with the design of the netdev
> not qdiscs i.e queues are centered around netdevs and netdevs
> are shared across  processors.
>

But currently we have a situation on egress where multiqueue devices
use mq or mqprio, which aligns qdiscs with queues. This fixes the
performance penalties, but you lose the ability to have state across
multiple queues and the ability to bind flows to queues.

For example, a SW rate limiter that applies to a set of queues, or a
set of tuned low-latency queues. To fix this we can attach a qdisc
(htb, whatever) to the netdev, but this has performance penalties. So
we are left with a trade-off between functionality and performance.
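
That trade-off can be sketched with the two root-qdisc choices (device
name and rates are hypothetical):

```shell
# htb at the root: state shared across all tx queues (e.g. one rate
# limit spanning them), but every CPU serializes on one qdisc lock.
tc qdisc add dev eth0 root handle 1: htb default 10
tc class add dev eth0 parent 1: classid 1:10 htb rate 500mbit

# Alternatively, mq at the root: one qdisc per tx queue and no shared
# lock, but also no state shared across queues.
tc qdisc replace dev eth0 root mq
```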

On the ingress side we may have many queues but as soon as we add a
qdisc/classifier/action we have a single lock.

>
>> If you have some good examples it would be great to see them and drop
>> them in my testbed. Go ahead and send them to me offlist if you can.
>>
>
> will do.
>

Thanks. I'll see if I can draw up what I'm thinking here a bit more
clearly and send it your way.

> cheers,
> jamal
>
>
>> .John
>>


-- 
John Fastabend         Intel Corporation


* Re: locating the 'tc actions' hook
  2013-08-05 16:11       ` John Fastabend
@ 2013-08-12  0:55         ` Jamal Hadi Salim
  0 siblings, 0 replies; 8+ messages in thread
From: Jamal Hadi Salim @ 2013-08-12  0:55 UTC (permalink / raw)
  To: John Fastabend
  Cc: John Fastabend, Stephen Hemminger, Eric Dumazet, Tom Herbert, netdev

On 13-08-05 12:11 PM, John Fastabend wrote:

> Actually in the first case here I was considering egress traffic.
>


[..]
> Separating the classifier chain from the qdisc allows using the
> classifiers to pick a qdisc via queue_mapping, because qdiscs are
> attached to queues when using sch_mq.
>

[..]
> Sorry, I mixed two examples, one egress and one ingress, together
> without clearly stating it. The second case here was ingress, the
> first egress.
>

OK, still fine; if a tag of some form exists, the driver should be
able to inherit it and pass it down to the hardware. There's already
precedent for such offloading; this should just follow in the same
spirit.

> But currently we have a situation on egress where multiqueue devices
> use mq or mqprio, which aligns qdiscs with queues. This fixes the
> performance penalties, but you lose the ability to have state across
> multiple queues and the ability to bind flows to queues.
>

If the goal is to share resources, there's no way to avoid locks.
You have to update resource usage state, which is a write operation.
Don't even bother with RCU.

> For example, a SW rate limiter that applies to a set of queues, or
> a set of tuned low-latency queues. To fix this we can attach a qdisc
> (htb, whatever) to the netdev, but this has performance penalties. So
> we are left with a trade-off between functionality and performance.
>
> On the ingress side we may have many queues but as soon as we add a
> qdisc/classifier/action we have a single lock.
>

[..]
> Thanks. I'll see if I can draw up what I'm thinking here a bit more
> clearly and send it your way.

I think that would help me. Maybe we should take it offline and sync up.

cheers,
jamal

