netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* TC stats / hw offload question
@ 2019-02-05 19:41 Edward Cree
  2019-02-06  2:20 ` Jamal Hadi Salim
  0 siblings, 1 reply; 21+ messages in thread
From: Edward Cree @ 2019-02-05 19:41 UTC (permalink / raw)
  To: netdev; +Cc: Jiri Pirko, Jamal Hadi Salim, Cong Wang

Regarding TC_CLSFLOWER_STATS, when a filter rule modifies the length of
 the packet, e.g. by adding a VLAN or encap header, should the bytes
 counter count the length of the packet _before_ edits (i.e. as seen by
 the match), or the length after edits?  If the latter, what is the
 correct behaviour when (say) a packet is mirrored as-is but also
 redirected with encapsulation?
The fact that the stats live in the struct tc_action suggests a per-
 action connection that would imply post-edit length, but then in
 tcf_exts_dump_stats() we only look at the first action, which seems to
 imply we really want the pre-edit length.
I can't find any kind of doc or spec defining what behaviour is required.

-Ed

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: TC stats / hw offload question
  2019-02-05 19:41 TC stats / hw offload question Edward Cree
@ 2019-02-06  2:20 ` Jamal Hadi Salim
  2019-02-08 10:26   ` Edward Cree
  2019-04-24 14:05   ` Edward Cree
  0 siblings, 2 replies; 21+ messages in thread
From: Jamal Hadi Salim @ 2019-02-06  2:20 UTC (permalink / raw)
  To: Edward Cree, netdev; +Cc: Jiri Pirko, Cong Wang

On 2019-02-05 2:41 p.m., Edward Cree wrote:
> Regarding TC_CLSFLOWER_STATS, when a filter rule modifies the length of
>   the packet, e.g. by adding a VLAN or encap header, should the bytes
>   counter count the length of the packet _before_ edits (i.e. as seen by
>   the match), or the length after edits?  

Classifier stats - when they exist - should be on
"classified" packets. i.e before edits.
Edits are typically done by actions. And even then we dont
have counters which indicate the post edit resizing.

> If the latter, what is the
>   correct behaviour when (say) a packet is mirrored as-is but also
>   redirected with encapsulation?

It is that ambiguity that make it hard to maintain a single
counter for post-edit size. packet and byte counts are on
"entry".

> The fact that the stats live in the struct tc_action suggests a per-
>   action connection that would imply post-edit length,

The classifiers dont mod the packets. The actions do. And they
maintain stats on the size on "entry" i.e pre-edit.

> but then in
>   tcf_exts_dump_stats() we only look at the first action, which seems to
>   imply we really want the pre-edit length.
> I can't find any kind of doc or spec defining what behaviour is required.
> 

Each action keeps its own counters. If you did something like:

tc match using flower blah \
action vlan push tag ... \
action redirect to egress of eth0

And you submited a packet of size x bytes,
then the "match" would record x bytes.
the "vlan action" would record x bytes.
the "redirect" would record size x+vlaninfo bytes
the egress of eth0 would  recorr x+vlaninfo bytes

cheers,
jamal

> -Ed
> 


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: TC stats / hw offload question
  2019-02-06  2:20 ` Jamal Hadi Salim
@ 2019-02-08 10:26   ` Edward Cree
  2019-02-09 17:39     ` Jamal Hadi Salim
  2019-04-24 14:05   ` Edward Cree
  1 sibling, 1 reply; 21+ messages in thread
From: Edward Cree @ 2019-02-08 10:26 UTC (permalink / raw)
  To: Jamal Hadi Salim, netdev; +Cc: Jiri Pirko, Cong Wang

On 06/02/19 02:20, Jamal Hadi Salim wrote:
> The classifiers dont mod the packets. The actions do. And they
> maintain stats on the size on "entry" i.e pre-edit. 
Thank you for clearing that up.

> Each action keeps its own counters. If you did something like:
>
> tc match using flower blah \
> action vlan push tag ... \
> action redirect to egress of eth0
>
> And you submited a packet of size x bytes,
> then the "match" would record x bytes.
Sorry, where would it record that?  I can't find any stats counters on
 the "match" either in the software path or the offload API.

> the "vlan action" would record x bytes.
> the "redirect" would record size x+vlaninfo bytes
> the egress of eth0 would  recorr x+vlaninfo bytes
Am I right in thinking that offloaded counters don't do that?  As far
 as I can tell, the drivers with flower offload all use
 tcf_exts_stats_update() which takes a single 'bytes' count and adds
 it to all the actions.  (Presumably this is pre-edit length.)

-Ed

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: TC stats / hw offload question
  2019-02-08 10:26   ` Edward Cree
@ 2019-02-09 17:39     ` Jamal Hadi Salim
  2019-02-11 11:44       ` Edward Cree
  0 siblings, 1 reply; 21+ messages in thread
From: Jamal Hadi Salim @ 2019-02-09 17:39 UTC (permalink / raw)
  To: Edward Cree, netdev
  Cc: Jiri Pirko, Cong Wang, Or Gerlitz, Andy Gospodarek,
	PJ Waskiewicz, Anjali Singhai Jain, Jakub Kicinski

On 2019-02-08 5:26 a.m., Edward Cree wrote:
> On 06/02/19 02:20, Jamal Hadi Salim wrote:

>> tc match using flower blah \
>> action vlan push tag ... \
>> action redirect to egress of eth0
>>
>> And you submited a packet of size x bytes,
>> then the "match" would record x bytes.
 >
> Sorry, where would it record that?

Sorry - that was hypothetical on my part. We dont
record the bytes in the classifier. We do (on some
classifiers) record hit counts, so we can see how many
times a lookup on a specific filter happened and how
many times that lookup resulted in a match.
See u32 (or a few others Cong has been updating
recently).

>  I can't find any stats counters on
>   the "match" either in the software path or the offload API.

Hasnt been necessary thus far.
Is your end goal to match and count?
Most hardware has stats/counters as a separate table.
Assuming yours does as well, then if all you want was
to just match and accept, then you would add a filter
with an accept/ok. i.e something along the lines of:

tc match using flower blah \
action ok index 12345

The "action index" field can be used to map to a counter
table index in hardware. Note if you dont explicitly
specify the index, the kernel (and in this case h/w)
issues one for you.

>> the "vlan action" would record x bytes.
>> the "redirect" would record size x+vlaninfo bytes
>> the egress of eth0 would  recorr x+vlaninfo bytes
> Am I right in thinking that offloaded counters don't do that?  As far
>   as I can tell, the drivers with flower offload all use
>   tcf_exts_stats_update() which takes a single 'bytes' count and adds
>   it to all the actions.  (Presumably this is pre-edit length.)

On the pre/post edit, perhaps some of the hardware
owners could chime in +Ccing a few.

To the tcf_exts_stats_update():
Note, most people interested in h/w offload will choose to skip
sw (explicitly defined when you add  a rule).
The update synchronizes the hardware stats/counters
to the kernel - so when you dump the policies in such a setup
you are seeing what is reflected in hardware.

cheers,
jamal

PS: I have to say just keeping one counter is at times
insufficient. Example, the policer records how many bytes
were seen, not how many bytes were allowed through.
For tunnels, it is easy to compute post-edit size because
you can easily compute later the added/removed bytes.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: TC stats / hw offload question
  2019-02-09 17:39     ` Jamal Hadi Salim
@ 2019-02-11 11:44       ` Edward Cree
  2019-02-14 12:39         ` Jamal Hadi Salim
  0 siblings, 1 reply; 21+ messages in thread
From: Edward Cree @ 2019-02-11 11:44 UTC (permalink / raw)
  To: Jamal Hadi Salim, netdev
  Cc: Jiri Pirko, Cong Wang, Or Gerlitz, Andy Gospodarek,
	PJ Waskiewicz, Anjali Singhai Jain, Jakub Kicinski

On 09/02/19 17:39, Jamal Hadi Salim wrote:
> On 2019-02-08 5:26 a.m., Edward Cree wrote:
>>   I can't find any stats counters on
>>   the "match" either in the software path or the offload API.
>
> Hasnt been necessary thus far.
> Is your end goal to match and count?
My end goal is to implement TC offload in some hw we're designing
 here at Solarflare.  So I'm trying to determine what hardware is
 expected/required to do.
It might be possible to design our new hw so that we can attach a
 counter to every action, if that's what TC wants.  But since the
 other vendors don't seem to do that, I wondered if there was a
 reason, or if perhaps the counter resources (and PCI bw to read
 them) could be saved if all those separate counters aren't really
 needed.  Right now the design we are considering would only count
 packets as-matched, i.e. before any edits.  That's fine for encap
 — you can calculate the bytes correction in SW — but not for decap
 since in principle the length of the RXed outer headers could
 vary (e.g. you might have IP options there).

-Ed

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: TC stats / hw offload question
  2019-02-11 11:44       ` Edward Cree
@ 2019-02-14 12:39         ` Jamal Hadi Salim
  2019-02-14 15:17           ` Andy Gospodarek
  2019-02-18 18:56           ` Edward Cree
  0 siblings, 2 replies; 21+ messages in thread
From: Jamal Hadi Salim @ 2019-02-14 12:39 UTC (permalink / raw)
  To: Edward Cree, netdev
  Cc: Jiri Pirko, Cong Wang, Or Gerlitz, Andy Gospodarek,
	PJ Waskiewicz, Anjali Singhai Jain, Jakub Kicinski


On 2019-02-11 6:44 a.m., Edward Cree wrote:

>> Hasnt been necessary thus far.
>> Is your end goal to match and count?
> My end goal is to implement TC offload in some hw we're designing
>   here at Solarflare.  So I'm trying to determine what hardware is
>   expected/required to do.
> It might be possible to design our new hw so that we can attach a
>   counter to every action, if that's what TC wants. 

It makes sense to have a counter on every action - even if it is
for debugging purposes. The two most basic actions are "drop" or
"accept". In TC speak the default action is "classid x:y" which
typically is to select a queue or give the flow some identity
(one should be able to use the same action on h/w ingress as well
to select a rx DMA ring for example, but that seems uncommon).

Note, your counters should also be shareable; example, count all
the drops in one counter across multiple flows as in the following
case where counter index 1 is used.

tc flower match foo action drop index 1
tc flower match bar action drop index 1


>   But since the
>   other vendors don't seem to do that, I wondered if there was a
>   reason, or if perhaps the counter resources (and PCI bw to read
>   them) could be saved if all those separate counters aren't really
>   needed.  

Probably nobody has paid attention or asked as you did.
Will let the h/w folks speak for themselves. My understanding
based on experience is counters are cheap. Most modern NICs
and ASICs have a gazillion of them at their disposal.

> Right now the design we are considering would only count
>   packets as-matched, i.e. before any edits.  That's fine for encap
>   — you can calculate the bytes correction in SW — but not for decap
>   since in principle the length of the RXed outer headers could
>   vary (e.g. you might have IP options there).
>

ok, so not much in terms of other types of actions.
But for abstraction sake maybe use "flowid x:y" and have counters
associated with that. Or even make this optional and only attach
a counter if someone says "action ok" and allow them to specify
the counter index (assuming you architecture has an indexed table
of counters).

cheers,
jamal


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: TC stats / hw offload question
  2019-02-14 12:39         ` Jamal Hadi Salim
@ 2019-02-14 15:17           ` Andy Gospodarek
  2019-02-18 18:56           ` Edward Cree
  1 sibling, 0 replies; 21+ messages in thread
From: Andy Gospodarek @ 2019-02-14 15:17 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Edward Cree, netdev, Jiri Pirko, Cong Wang, Or Gerlitz,
	PJ Waskiewicz, Anjali Singhai Jain, Jakub Kicinski

On Thu, Feb 14, 2019 at 07:39:28AM -0500, Jamal Hadi Salim wrote:
> 
> On 2019-02-11 6:44 a.m., Edward Cree wrote:
> 
> >   But since the
> >   other vendors don't seem to do that, I wondered if there was a
> >   reason, or if perhaps the counter resources (and PCI bw to read
> >   them) could be saved if all those separate counters aren't really
> >   needed.
> 
> Probably nobody has paid attention or asked as you did.

That's not totally true, but I'm glad to hear that others are
considering it.

> Will let the h/w folks speak for themselves. My understanding
> based on experience is counters are cheap. Most modern NICs
> and ASICs have a gazillion of them at their disposal.

Counters can be cheap to implement (though that does not always mean
that everyone chooses to add many of them to hardware), but the real
cost to them, as Edward points out, is the cost of accessing them an
keeping them up to date.  If we are concerned about keeping track of
flow counters on thousands (or more) flows the cost on the PCI bus can
be quite high.

We have been looking at a few options to deal with tracking this massive
number of counts, but are open to hearing what others feel they would
like to see happen to this.  Though stats sometimes seem boring to some
developers, they are critical and we have been considering whether or
not we need to think about different driver or global infra to handle
it.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: TC stats / hw offload question
  2019-02-14 12:39         ` Jamal Hadi Salim
  2019-02-14 15:17           ` Andy Gospodarek
@ 2019-02-18 18:56           ` Edward Cree
  2019-02-18 19:37             ` Edward Cree
  1 sibling, 1 reply; 21+ messages in thread
From: Edward Cree @ 2019-02-18 18:56 UTC (permalink / raw)
  To: Jamal Hadi Salim, netdev
  Cc: Jiri Pirko, Cong Wang, Or Gerlitz, Andy Gospodarek,
	PJ Waskiewicz, Anjali Singhai Jain, Jakub Kicinski

On 14/02/19 12:39, Jamal Hadi Salim wrote:
> On 2019-02-11 6:44 a.m., Edward Cree wrote:
>> My end goal is to implement TC offload in some hw we're designing
>>   here at Solarflare.  So I'm trying to determine what hardware is
>>   expected/required to do.
>> It might be possible to design our new hw so that we can attach a
>>   counter to every action, if that's what TC wants. 
>
> It makes sense to have a counter on every action - even if it is
> for debugging purposes. The two most basic actions are "drop" or
> "accept". In TC speak the default action is "classid x:y" which
> typically is to select a queue or give the flow some identity
Perhaps I was insufficiently clear; we're only looking at the switching
 side of things (e.g. OVS offload) right now; we don't yet have a plan
 for 'delivery' filters (I imagine we'll probably initially port over
 our ethtool ntuple filter handling from ef10, though we may end up
 going down the TC route there).
So I think at the moment 'classid' isn't relevant (?)

> Note, your counters should also be shareable; example, count all
> the drops in one counter across multiple flows as in the following
> case where counter index 1 is used.
>
> tc flower match foo action drop index 1
> tc flower match bar action drop index 1
[...]
> allow them to specify
> the counter index (assuming you architecture has an indexed table
> of counters).
Our architecture allocates objects (including counters) and returns
 opaque handles to them, so we'd need a software table to connect
 counter index to FW counter ID.
Also, sharing counters in hw causes extra work for the driver code
 that keeps track of which encap actions are getting hit (so it can
 keep the neighbour entries alive).  Maybe summing the shared
 counters in sw is easier than that, I'm not sure (or maybe encap
 action counters should just be kept separate from the counters we
 report to TC).

TBH I'm coming to the conclusion that what we should do for the first
 version of our driver is just to create a counter per rule and report
 it against the first action (only), and for now ignore the index (or
 maybe require it to be set to some distinguished value, like 0, to
 mean 'allocate', so that as a future extension we can support
 shareable counters).

-Ed

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: TC stats / hw offload question
  2019-02-18 18:56           ` Edward Cree
@ 2019-02-18 19:37             ` Edward Cree
  0 siblings, 0 replies; 21+ messages in thread
From: Edward Cree @ 2019-02-18 19:37 UTC (permalink / raw)
  To: Jamal Hadi Salim, netdev
  Cc: Jiri Pirko, Cong Wang, Or Gerlitz, Andy Gospodarek,
	PJ Waskiewicz, Anjali Singhai Jain, Jakub Kicinski

On 18/02/19 18:56, Edward Cree wrote:
> (or
>  maybe require it to be set to some distinguished value, like 0, to
>  mean 'allocate', so that as a future extension we can support
>  shareable counters).
Turns out 0 isn't an option, because TC already internally swallows
 that and allocates its own index (same as if 'index' wasn't specified
 on cmdline at all), unless we can change TC to pass that 0 on to the
 driver.

-Ed

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: TC stats / hw offload question
  2019-02-06  2:20 ` Jamal Hadi Salim
  2019-02-08 10:26   ` Edward Cree
@ 2019-04-24 14:05   ` Edward Cree
  2019-04-24 14:11     ` Pablo Neira Ayuso
  1 sibling, 1 reply; 21+ messages in thread
From: Edward Cree @ 2019-04-24 14:05 UTC (permalink / raw)
  To: Jamal Hadi Salim, netdev, Pablo Neira Ayuso; +Cc: Jiri Pirko, Cong Wang

On 06/02/2019 02:20, Jamal Hadi Salim wrote:
> The classifiers dont mod the packets. The actions do. And they
> maintain stats on the size on "entry" i.e pre-edit.
> Each action keeps its own counters. If you did something like:
>
> tc match using flower blah \
> action vlan push tag ... \
> action redirect to egress of eth0
>
> And you submited a packet of size x bytes,
> then the "match" would record x bytes.
> the "vlan action" would record x bytes.
> the "redirect" would record size x+vlaninfo bytes
> the egress of eth0 would  recorr x+vlaninfo bytes
So, I had this working for a while, by calling tcf_action_stats_update()
 directly on the correct element of tc->exts->actions[], instead of using
 tcf_exts_stats_update().  And this was fine, until I tried to port my
 code to the 5.1 kernel, with Pablo's flow action infrastructure.  On that
 it's not possible, because there is only flow_stats_update(), which takes
 a single bytes value for _all_ the actions in the rule.
Pablo, was your patch series[1] intended to change the semantics of TC
 action bytes-counters?  What are the proper semantics now?

-Ed

[1] http://patchwork.ozlabs.org/cover/1035414/

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: TC stats / hw offload question
  2019-04-24 14:05   ` Edward Cree
@ 2019-04-24 14:11     ` Pablo Neira Ayuso
  2019-04-24 15:03       ` Edward Cree
  0 siblings, 1 reply; 21+ messages in thread
From: Pablo Neira Ayuso @ 2019-04-24 14:11 UTC (permalink / raw)
  To: Edward Cree; +Cc: Jamal Hadi Salim, netdev, Jiri Pirko, Cong Wang

On Wed, Apr 24, 2019 at 03:05:05PM +0100, Edward Cree wrote:
> On 06/02/2019 02:20, Jamal Hadi Salim wrote:
> > The classifiers dont mod the packets. The actions do. And they
> > maintain stats on the size on "entry" i.e pre-edit.
> > Each action keeps its own counters. If you did something like:
> >
> > tc match using flower blah \
> > action vlan push tag ... \
> > action redirect to egress of eth0
> >
> > And you submited a packet of size x bytes,
> > then the "match" would record x bytes.
> > the "vlan action" would record x bytes.
> > the "redirect" would record size x+vlaninfo bytes
> > the egress of eth0 would  recorr x+vlaninfo bytes
> So, I had this working for a while, by calling tcf_action_stats_update()
>  directly on the correct element of tc->exts->actions[], instead of using
>  tcf_exts_stats_update().  And this was fine, until I tried to port my
>  code to the 5.1 kernel, with Pablo's flow action infrastructure.  On that
>  it's not possible, because there is only flow_stats_update(), which takes
>  a single bytes value for _all_ the actions in the rule.

Then, we need to extend the API to allow to update counters per
action, no driver in the tree has been supporting this so far.  May I
have a look at your driver code to infer what we might need?

Thanks.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: TC stats / hw offload question
  2019-04-24 14:11     ` Pablo Neira Ayuso
@ 2019-04-24 15:03       ` Edward Cree
  2019-04-25 13:23         ` Edward Cree
  0 siblings, 1 reply; 21+ messages in thread
From: Edward Cree @ 2019-04-24 15:03 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Jamal Hadi Salim, netdev, Jiri Pirko, Cong Wang

On 24/04/2019 15:11, Pablo Neira Ayuso wrote:
> On Wed, Apr 24, 2019 at 03:05:05PM +0100, Edward Cree wrote:
>> So, I had this working for a while, by calling tcf_action_stats_update()
>>  directly on the correct element of tc->exts->actions[], instead of using
>>  tcf_exts_stats_update().  And this was fine, until I tried to port my
>>  code to the 5.1 kernel, with Pablo's flow action infrastructure.  On that
>>  it's not possible, because there is only flow_stats_update(), which takes
>>  a single bytes value for _all_ the actions in the rule.
> Then, we need to extend the API to allow to update counters per
> action, no driver in the tree has been supporting this so far.  May I
> have a look at your driver code to infer what we might need?
>
> Thanks.
Somewhat abridged and out-of-context, because this is for stuff that's a
 long way away from being ready to upstream (mumble unreleased hardware),
 here's the counter-handling part of it.

struct efx_tc_action_set {
    u16 vlan_push:2;
    u16 vlan_pop:2;
    /* other action flag bits */
    __be16 vlan_tci[2]; /* TCIs for vlan_push */
    __be16 vlan_proto[2]; /* Ethertypes for vlan_push */
    struct efx_tc_counter_index *count;
    int count_action_idx; /* Which tc_action uses the counter */
    int count_size_delta; /* bytes per packet change */
    /* other action metadata and gubbins */
};

static void calculate_count_delta(struct efx_tc_action_set *act)
{
    act->count_size_delta = 0;
    if (act->vlan_pop & 2)
        act->count_size_delta -= 4;
    if (act->vlan_pop & 1)
        act->count_size_delta -= 4;
    if (act->vlan_push & 1)
        act->count_size_delta += 4;
    if (act->vlan_push & 2)
        act->count_size_delta += 4;
}

static int efx_tc_flower_replace(struct efx_nic *efx,
                                 struct net_device *net_dev,
                                 struct tc_cls_flower_offload *tc)
{
    struct efx_tc_action_set *act;

    /* parse the match */

    tcf_exts_for_each_action(i, a, tc->exts) {
        if (a->ops && a->ops->stats_update) {
            /* act is the hw action we're building */
            act->count = allocate_a_counter();
            act->count_action_idx = i;
            calculate_count_delta(act);
        }
        /* handle the actual action */
        /* in the case of 'mirror' we'll stick 'act' on a list and
         * allocate a new one that starts out the same, to represent
         * the piped packet.
         */
    }

    /* insert the rule in hw*/
    return ...;
}

static int efx_tc_flower_stats(struct efx_nic *efx, struct net_device *net_dev,
                               struct tc_cls_flower_offload *tc)
{
    struct efx_tc_action_set *act;
    struct efx_tc_counter *cnt;
    u64 packets, bytes;

    rule = /* lookup the rule */;
    list_for_each_entry(act, &/*list of rule actions*/, list)
        if (act->count) {
            struct tc_action *a;
            u64 d_packets, d_bytes;

            cnt = /* get the counter */;
            /* Report only new pkts/bytes since last time TC asked */
            packets = cnt->packets;
            bytes = cnt->bytes;
            d_packets = packets - cnt->old_packets;
            /* Apply corrections for any VLAN pops/pushes before
             * our count action (but after the HW counter)
             */
            d_bytes = bytes - cnt->old_bytes +
                      act->count_size_delta * d_packets;
            a = tc->exts->actions[act->count_action_idx];
            tcf_action_stats_update(a, d_bytes, d_packets,
                                    cnt->touched, true);
            cnt->old_packets = packets;
            cnt->old_bytes = bytes;
        }
    return 0;
}

With this code, if for instance the rule is
tc match using flower blah \
action mirred egress mirror eth1 \
action vlan push blah \
action mirred egress redirect eth2

and a packed of length 100 bytes is matched, then tc stats show will report:
action mirred egress mirror eth1:
    1 packets, 100 bytes
action vlan push blah
    0 packets, 0 bytes /* Doesn't have an ops->stats_update */
action mirred egress redirect eth2
    1 packets, 104 bytes
which AIUI was the correct semantics, although no drivers upstream did it.

TC does have all the information it needs to make those corrections itself
 (it knows that the mirred eth2 comes after a vlan push), but currently it
 doesn't do so.
So one possible solution would be for flow_stats_update() to take bytes 'as
 matched', and make the corrections as it goes along.  Unfortunately, this
 doesn't quite work for decap (tunnel_key unset), because the length of the
 tunnel header is not statically known.  (For this reason, our hw counts
 bytes _after_ decap but _before_ other actions like VLAN push/pop.)  Since
 from TC's perspective tunnel_key is metadata rather than packet data, this
 works if we are willing to define the counters as not including tunnel
 headers — that's what I've done currently.  Then VLAN push/pop are the
 only TC actions that alter packet length.  It is however a bit unintuitive
 that in the case of
    tc filter add dev vxlan0 ... flower blah \
    action mirred egress mirror eth1
    action tunnel_key unset
    action mirred egress redirect eth2
 both mirreds will show the same (decapped) bytes count, yet eth1 will (at
 least as I've interpreted it for offload) receive the packet with the
 encap header still present.

So to implement this we could do something like (include/net/pkt_cls.h,
 written in mail client & not at all tested...)

static inline void
tcf_exts_stats_update(const struct tcf_exts *exts,
                      u64 bytes, u64 packets, u64 lastuse)
{
#ifdef CONFIG_NET_CLS_ACT
    int i, delta = 0;

    preempt_disable();

    for (i = 0; i < exts->nr_actions; i++) {
        struct tc_action *a = exts->actions[i];

        tcf_action_stats_update(a, bytes + packets * delta, packets,
                                lastuse, true);
        if (is_tcf_vlan(a))
            switch (tcf_vlan_action(a)) {
            case TCA_VLAN_ACT_POP:
                delta -= 4;
                break;
            case TCA_VLAN_ACT_PUSH:
                delta += 4;
                break;
            case TCA_VLAN_ACT_MODIFY:
            default:
                break;
            }
    }

    preempt_enable();
#endif
}

Does that seem reasonable?

-Ed

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: TC stats / hw offload question
  2019-04-24 15:03       ` Edward Cree
@ 2019-04-25 13:23         ` Edward Cree
  2019-04-25 22:33           ` Pablo Neira Ayuso
  0 siblings, 1 reply; 21+ messages in thread
From: Edward Cree @ 2019-04-25 13:23 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Jamal Hadi Salim, netdev, Jiri Pirko, Cong Wang

On 24/04/2019 16:03, Edward Cree wrote:
> static int efx_tc_flower_replace(struct efx_nic *efx,
>                                  struct net_device *net_dev,
>                                  struct tc_cls_flower_offload *tc)
> {
>     struct efx_tc_action_set *act;
>
>     /* parse the match */
>
>     tcf_exts_for_each_action(i, a, tc->exts) {
>         if (a->ops && a->ops->stats_update) {
>             /* act is the hw action we're building */
>             act->count = allocate_a_counter();
Also, this was actually taking a->tcfa_index, allowing multiple rules to
 share a counter.  The action index doesn't seem to be available in the
 new flow_offload API.

-Ed

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: TC stats / hw offload question
  2019-04-25 13:23         ` Edward Cree
@ 2019-04-25 22:33           ` Pablo Neira Ayuso
  2019-04-26 12:13             ` Edward Cree
  0 siblings, 1 reply; 21+ messages in thread
From: Pablo Neira Ayuso @ 2019-04-25 22:33 UTC (permalink / raw)
  To: Edward Cree; +Cc: Jamal Hadi Salim, netdev, Jiri Pirko, Cong Wang

On Thu, Apr 25, 2019 at 02:23:08PM +0100, Edward Cree wrote:
> On 24/04/2019 16:03, Edward Cree wrote:
> > static int efx_tc_flower_replace(struct efx_nic *efx,
> >                                  struct net_device *net_dev,
> >                                  struct tc_cls_flower_offload *tc)
> > {
> >     struct efx_tc_action_set *act;
> >
> >     /* parse the match */
> >
> >     tcf_exts_for_each_action(i, a, tc->exts) {
> >         if (a->ops && a->ops->stats_update) {
> >             /* act is the hw action we're building */
> >             act->count = allocate_a_counter();
>
> Also, this was actually taking a->tcfa_index, allowing multiple rules to
>  share a counter.  The action index doesn't seem to be available in the
>  new flow_offload API.

Could you show a bit more code to see how you use a->tcfa_index from
your efx_tc_flower_replace()?

Thanks.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: TC stats / hw offload question
  2019-04-25 22:33           ` Pablo Neira Ayuso
@ 2019-04-26 12:13             ` Edward Cree
  2019-04-26 12:42               ` Jamal Hadi Salim
  2019-04-26 18:49               ` Pablo Neira Ayuso
  0 siblings, 2 replies; 21+ messages in thread
From: Edward Cree @ 2019-04-26 12:13 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Jamal Hadi Salim, netdev, Jiri Pirko, Cong Wang

On 25/04/2019 23:33, Pablo Neira Ayuso wrote:
> On Thu, Apr 25, 2019 at 02:23:08PM +0100, Edward Cree wrote:
>> On 24/04/2019 16:03, Edward Cree wrote:
>>> static int efx_tc_flower_replace(struct efx_nic *efx,
>>>                                  struct net_device *net_dev,
>>>                                  struct tc_cls_flower_offload *tc)
>>> {
>>>     struct efx_tc_action_set *act;
>>>
>>>     /* parse the match */
>>>
>>>     tcf_exts_for_each_action(i, a, tc->exts) {
>>>         if (a->ops && a->ops->stats_update) {
>>>             /* act is the hw action we're building */
>>>             act->count = allocate_a_counter();
>> Also, this was actually taking a->tcfa_index, allowing multiple rules to
>>  share a counter.  The action index doesn't seem to be available in the
>>  new flow_offload API.
> Could you show a bit more code to see how you use a->tcfa_index from
> your efx_tc_flower_replace()?
>
> Thanks.
Sure; this block is (still slightly abridged)

if (a->ops && a->ops->stats_update) {
    struct efx_tc_counter_index *ctr;

    ctr = efx_tc_flower_get_counter_by_index(efx, a->tcfa_index);
    if (IS_ERR(ctr)) {
        rc = PTR_ERR(ctr);
        goto release;
    }
    act->count = ctr;
    act->count_action_idx = i;
    efx_tc_calculate_count_delta(act);
}

and we have

struct efx_tc_counter_index {
    u32 tcfa_index;
    struct rhash_head linkage;
    refcount_t ref;
    u32 fw_id;
};

const static struct rhashtable_params efx_tc_counter_ht_params = {
    .key_len        = offsetof(struct efx_tc_counter_index, linkage),
    .key_offset     = 0,
    .head_offset    = offsetof(struct efx_tc_counter_index, linkage),
};

static struct efx_tc_counter_index *efx_tc_flower_get_counter_by_index(
                struct efx_nic *efx, u32 idx)
{
    struct efx_tc_counter_index *ctr, *old;
    long rc;

    ctr = kzalloc(sizeof(*ctr), GFP_USER);
    if (!ctr)
        return ERR_PTR(-ENOMEM);
    ctr->tcfa_index = idx;
    old = rhashtable_lookup_get_insert_fast(&efx->tc->counter_ht,
                                            &ctr->linkage,
                                            efx_tc_counter_ht_params);
    if (old) {
        /* don't need our new entry */
        kfree(ctr);
        if (!refcount_inc_not_zero(&old->ref))
            return ERR_PTR(-EAGAIN);
        /* existing entry found */
        ctr = old;
    } else {
        rc = efx_tc_flower_allocate_counter(efx);
        if (rc < 0) {
            rhashtable_remove_fast(&efx->tc->counter_ht,
                                   &ctr->linkage,
                                   efx_tc_counter_ht_params);
            kfree(ctr);
            return ERR_PTR(rc);
        }
        ctr->fw_id = rc;
        refcount_inc(&ctr->ref);
    }
    return ctr;
}

Thus if (and only if) two TC actions have the same tcfa_index, they will
 share a single counter in the HW.
I gathered from a previous conversation with Jamal[1] that that was the
 correct behaviour:
> Note, your counters should also be shareable; example, count all
> the drops in one counter across multiple flows as in the following
> case where counter index 1 is used.
>
> tc flower match foo action drop index 1
> tc flower match bar action drop index 1

-Ed

[1]: https://www.spinics.net/lists/netdev/msg551448.html

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: TC stats / hw offload question
  2019-04-26 12:13             ` Edward Cree
@ 2019-04-26 12:42               ` Jamal Hadi Salim
  2019-04-26 18:49               ` Pablo Neira Ayuso
  1 sibling, 0 replies; 21+ messages in thread
From: Jamal Hadi Salim @ 2019-04-26 12:42 UTC (permalink / raw)
  To: Edward Cree, Pablo Neira Ayuso; +Cc: netdev, Jiri Pirko, Cong Wang

On 2019-04-26 8:13 a.m., Edward Cree wrote:

> Sure; this block is (still slightly abridged)
> 
> if (a->ops && a->ops->stats_update) {
>      struct efx_tc_counter_index *ctr;
> 
>      ctr = efx_tc_flower_get_counter_by_index(efx, a->tcfa_index);
>      if (IS_ERR(ctr)) {
>          rc = PTR_ERR(ctr);
>          goto release;
>      }
>      act->count = ctr;
>      act->count_action_idx = i;
>      efx_tc_calculate_count_delta(act);
> }
> 
> and we have
> 
> struct efx_tc_counter_index {
>      u32 tcfa_index;
>      struct rhash_head linkage;
>      refcount_t ref;
>      u32 fw_id;
> };
> 
> const static struct rhashtable_params efx_tc_counter_ht_params = {
>      .key_len        = offsetof(struct efx_tc_counter_index, linkage),
>      .key_offset     = 0,
>      .head_offset    = offsetof(struct efx_tc_counter_index, linkage),
> };
> 
> static struct efx_tc_counter_index *efx_tc_flower_get_counter_by_index(
>                  struct efx_nic *efx, u32 idx)
> {
>      struct efx_tc_counter_index *ctr, *old;
>      long rc;
> 
>      ctr = kzalloc(sizeof(*ctr), GFP_USER);
>      if (!ctr)
>          return ERR_PTR(-ENOMEM);
>      ctr->tcfa_index = idx;
>      old = rhashtable_lookup_get_insert_fast(&efx->tc->counter_ht,
>                                              &ctr->linkage,
>                                              efx_tc_counter_ht_params);
>      if (old) {
>          /* don't need our new entry */
>          kfree(ctr);
>          if (!refcount_inc_not_zero(&old->ref))
>              return ERR_PTR(-EAGAIN);
>          /* existing entry found */
>          ctr = old;
>      } else {
>          rc = efx_tc_flower_allocate_counter(efx);
>          if (rc < 0) {
>              rhashtable_remove_fast(&efx->tc->counter_ht,
>                                     &ctr->linkage,
>                                     efx_tc_counter_ht_params);
>              kfree(ctr);
>              return ERR_PTR(rc);
>          }
>          ctr->fw_id = rc;
>          refcount_inc(&ctr->ref);
>      }
>      return ctr;
> }
> 
> Thus if (and only if) two TC actions have the same tcfa_index, they will
>   share a single counter in the HW.
> I gathered from a previous conversation with Jamal[1] that that was the
>   correct behaviour:

Yes, this is expected behavior. Meters/policers as well used indices to
indicate sharing.

cheers,
jamal

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: TC stats / hw offload question
  2019-04-26 12:13             ` Edward Cree
  2019-04-26 12:42               ` Jamal Hadi Salim
@ 2019-04-26 18:49               ` Pablo Neira Ayuso
  2019-04-29 14:11                 ` Edward Cree
  1 sibling, 1 reply; 21+ messages in thread
From: Pablo Neira Ayuso @ 2019-04-26 18:49 UTC (permalink / raw)
  To: Edward Cree; +Cc: Jamal Hadi Salim, netdev, Jiri Pirko, Cong Wang

On Fri, Apr 26, 2019 at 01:13:41PM +0100, Edward Cree wrote:
> On 25/04/2019 23:33, Pablo Neira Ayuso wrote:
> > On Thu, Apr 25, 2019 at 02:23:08PM +0100, Edward Cree wrote:
> >> On 24/04/2019 16:03, Edward Cree wrote:
> >>> static int efx_tc_flower_replace(struct efx_nic *efx,
> >>>                                  struct net_device *net_dev,
> >>>                                  struct tc_cls_flower_offload *tc)
> >>> {
> >>>     struct efx_tc_action_set *act;
> >>>
> >>>     /* parse the match */
> >>>
> >>>     tcf_exts_for_each_action(i, a, tc->exts) {
> >>>         if (a->ops && a->ops->stats_update) {
> >>>             /* act is the hw action we're building */
> >>>             act->count = allocate_a_counter();
> >> Also, this was actually taking a->tcfa_index, allowing multiple rules to
> >>  share a counter.  The action index doesn't seem to be available in the
> >>  new flow_offload API.
> > Could you show a bit more code to see how you use a->tcfa_index from
> > your efx_tc_flower_replace()?
> >
> > Thanks.
> Sure; this block is (still slightly abridged)
> 
> if (a->ops && a->ops->stats_update) {
>     struct efx_tc_counter_index *ctr;
> 
>     ctr = efx_tc_flower_get_counter_by_index(efx, a->tcfa_index);
>     if (IS_ERR(ctr)) {
>         rc = PTR_ERR(ctr);
>         goto release;
>     }
>     act->count = ctr;
>     act->count_action_idx = i;
>     efx_tc_calculate_count_delta(act);
> }
> 
> and we have
> 
> struct efx_tc_counter_index {
>     u32 tcfa_index;
>     struct rhash_head linkage;
>     refcount_t ref;
>     u32 fw_id;
> };
> 
> const static struct rhashtable_params efx_tc_counter_ht_params = {
>     .key_len        = offsetof(struct efx_tc_counter_index, linkage),
>     .key_offset     = 0,
>     .head_offset    = offsetof(struct efx_tc_counter_index, linkage),
> };
> 
> static struct efx_tc_counter_index *efx_tc_flower_get_counter_by_index(
>                 struct efx_nic *efx, u32 idx)
> {
>     struct efx_tc_counter_index *ctr, *old;
>     long rc;
> 
>     ctr = kzalloc(sizeof(*ctr), GFP_USER);
>     if (!ctr)
>         return ERR_PTR(-ENOMEM);
>     ctr->tcfa_index = idx;
>     old = rhashtable_lookup_get_insert_fast(&efx->tc->counter_ht,
>                                             &ctr->linkage,
>                                             efx_tc_counter_ht_params);
>     if (old) {
>         /* don't need our new entry */
>         kfree(ctr);
>         if (!refcount_inc_not_zero(&old->ref))
>             return ERR_PTR(-EAGAIN);
>         /* existing entry found */
>         ctr = old;
>     } else {
>         rc = efx_tc_flower_allocate_counter(efx);
>         if (rc < 0) {
>             rhashtable_remove_fast(&efx->tc->counter_ht,
>                                    &ctr->linkage,
>                                    efx_tc_counter_ht_params);
>             kfree(ctr);
>             return ERR_PTR(rc);
>         }
>         ctr->fw_id = rc;
>         refcount_inc(&ctr->ref);
>     }
>     return ctr;
> }
> 
> Thus if (and only if) two TC actions have the same tcfa_index, they will
>  share a single counter in the HW.
> I gathered from a previous conversation with Jamal[1] that that was the
>  correct behaviour:
> > Note, your counters should also be shareable; example, count all
> > the drops in one counter across multiple flows as in the following
> > case where counter index 1 is used.
> >
> > tc flower match foo action drop index 1
> > tc flower match bar action drop index 1

The flow_action_entry structure needs a new 'counter_index' field to
store this. The tc_setup_flow_action() function needs to be updated
for this for the FLOW_ACTION_{ACCEPT,DROP,REDIRECT,MIRRED} cases to
set this entry->counter_index field to tcfa_index, so the driver has
access to this.

Thanks.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: TC stats / hw offload question
  2019-04-26 18:49               ` Pablo Neira Ayuso
@ 2019-04-29 14:11                 ` Edward Cree
  2019-04-29 15:21                   ` Pablo Neira Ayuso
  0 siblings, 1 reply; 21+ messages in thread
From: Edward Cree @ 2019-04-29 14:11 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Jamal Hadi Salim, netdev, Jiri Pirko, Cong Wang

On 26/04/2019 19:49, Pablo Neira Ayuso wrote:
> On Fri, Apr 26, 2019 at 01:13:41PM +0100, Edward Cree wrote:
>> Thus if (and only if) two TC actions have the same tcfa_index, they will
>>  share a single counter in the HW.
>> I gathered from a previous conversation with Jamal[1] that that was the
>>  correct behaviour:
>>> Note, your counters should also be shareable; example, count all
>>> the drops in one counter across multiple flows as in the following
>>> case where counter index 1 is used.
>>>
>>> tc flower match foo action drop index 1
>>> tc flower match bar action drop index 1
> The flow_action_entry structure needs a new 'counter_index' field to
> store this. The tc_setup_flow_action() function needs to be updated
> for this for the FLOW_ACTION_{ACCEPT,DROP,REDIRECT,MIRRED} cases to
> set this entry->counter_index field to tcfa_index, so the driver has
> access to this.
Hmm, I'm still not sure this solves everything.
Before, we could write
tc flower match foo \
    action mirred egress mirror eth1 index 1 \
    action mirred egress redirect eth2 index 2
and have two distinct HW counters (one of which might e.g. be shared
 with another rule).  But when reading those counters, under
 fl_hw_update_stats(), the driver only gets to return one set of flow
 stats for both actions.
Previously, the driver's TC_CLSFLOWER_STATS handler was updating the
 action stats directly, so was able to do something different for each
 action, but that's not possible in 5.1.  At stats gathering time, the
 driver doesn't even have access to anything that's per-action and
 thus could have a flow_stats member shoved in it.
AFAICT, the only reason this isn't a regression is that existing
 drivers didn't implement the old semantics correctly.
This is a bit of a mess; the best idea I've got is for the
 TC_CLSFLOWER_STATS call to include a tcfa_index.  Then the driver
 returns counter stats for that index, and tcf_exts_stats_update()
 only updates those actions whose index matches.  But then
 fl_hw_update_stats() would have to iterate over all the indices in
 f->exts.  What do you think?

-Ed

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: TC stats / hw offload question
  2019-04-29 14:11                 ` Edward Cree
@ 2019-04-29 15:21                   ` Pablo Neira Ayuso
  2019-04-29 16:25                     ` Edward Cree
  0 siblings, 1 reply; 21+ messages in thread
From: Pablo Neira Ayuso @ 2019-04-29 15:21 UTC (permalink / raw)
  To: Edward Cree; +Cc: Jamal Hadi Salim, netdev, Jiri Pirko, Cong Wang

On Mon, Apr 29, 2019 at 03:11:06PM +0100, Edward Cree wrote:
> On 26/04/2019 19:49, Pablo Neira Ayuso wrote:
> > On Fri, Apr 26, 2019 at 01:13:41PM +0100, Edward Cree wrote:
> >> Thus if (and only if) two TC actions have the same tcfa_index, they will
> >>  share a single counter in the HW.
> >> I gathered from a previous conversation with Jamal[1] that that was the
> >>  correct behaviour:
> >>> Note, your counters should also be shareable; example, count all
> >>> the drops in one counter across multiple flows as in the following
> >>> case where counter index 1 is used.
> >>>
> >>> tc flower match foo action drop index 1
> >>> tc flower match bar action drop index 1
> > The flow_action_entry structure needs a new 'counter_index' field to
> > store this. The tc_setup_flow_action() function needs to be updated
> > for this for the FLOW_ACTION_{ACCEPT,DROP,REDIRECT,MIRRED} cases to
> > set this entry->counter_index field to tcfa_index, so the driver has
> > access to this.
> Hmm, I'm still not sure this solves everything.
> Before, we could write
> tc flower match foo \
>     action mirred egress mirror eth1 index 1 \
>     action mirred egress redirect eth2 index 2
> and have two distinct HW counters (one of which might e.g. be shared
>  with another rule).  But when reading those counters, under
>  fl_hw_update_stats(), the driver only gets to return one set of flow
>  stats for both actions.
> Previously, the driver's TC_CLSFLOWER_STATS handler was updating the
>  action stats directly, so was able to do something different for each
>  action, but that's not possible in 5.1.  At stats gathering time, the
>  driver doesn't even have access to anything that's per-action and
>  thus could have a flow_stats member shoved in it.
> AFAICT, the only reason this isn't a regression is that existing
>  drivers didn't implement the old semantics correctly.
>
> This is a bit of a mess; the best idea I've got is for the
>  TC_CLSFLOWER_STATS call to include a tcfa_index.  Then the driver
>  returns counter stats for that index, and tcf_exts_stats_update()
>  only updates those actions whose index matches.  But then
>  fl_hw_update_stats() would have to iterate over all the indices in
>  f->exts.  What do you think?

You could extend struct flow_stats to pass an array of stats to the
driver, including one stats per action and the counter index. Then,
tcf_exts_stats_update() uses this array of stats to update per-action
stats.

struct flow_action_stats {
        u32     counter_index;
        u64     pkts;
        u64     bytes;
        u64     lastused;
};

struct flow_stats {
        struct flow_action_stats        *stats[];
        u32                             num_actions;
};

As you mentioned, no driver supports for tcfa_index so far, probably
it would be a good idea to return -EOPNOTSUPP in such case by now.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: TC stats / hw offload question
  2019-04-29 15:21                   ` Pablo Neira Ayuso
@ 2019-04-29 16:25                     ` Edward Cree
  2019-04-29 19:14                       ` Pablo Neira Ayuso
  0 siblings, 1 reply; 21+ messages in thread
From: Edward Cree @ 2019-04-29 16:25 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Jamal Hadi Salim, netdev, Jiri Pirko, Cong Wang

On 29/04/2019 16:21, Pablo Neira Ayuso wrote:
> On Mon, Apr 29, 2019 at 03:11:06PM +0100, Edward Cree wrote:
>> This is a bit of a mess; the best idea I've got is for the
>>  TC_CLSFLOWER_STATS call to include a tcfa_index.  Then the driver
>>  returns counter stats for that index, and tcf_exts_stats_update()
>>  only updates those actions whose index matches.  But then
>>  fl_hw_update_stats() would have to iterate over all the indices in
>>  f->exts.  What do you think?
> You could extend struct flow_stats to pass an array of stats to the
> driver, including one stats per action and the counter index. Then,
> tcf_exts_stats_update() uses this array of stats to update per-action
> stats.
Yes, but that means allocating the flow_stats.stats array each time;
 I'd rather avoid memory allocation unless it's necessary.  As long as
 we can move the preempt_disable() inside the loop that's currently in
 tcf_exts_stats_update() (i.e. only disable pre-emption across each
 individual call to tcf_action_stats_update()) I think we can.
I think I prefer my approach (ask for one tcfa_index at a time); but
 unmodified drivers that don't look at the passed index would return
 zeroes for actions after the first, so we'll need some way to handle
 those drivers separately (e.g. one tc_setup_cb_call with "answer
 this one if you don't do indices" and a bunch more with specified
 index values).  I think that requires much less change to the
 existing drivers than putting an array back in the API, and keeps as
 much of the work as possible in the core where it won't have to be
 replicated in every driver.
I'll put an RFC patch together soonish if no objections.

-Ed

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: TC stats / hw offload question
  2019-04-29 16:25                     ` Edward Cree
@ 2019-04-29 19:14                       ` Pablo Neira Ayuso
  0 siblings, 0 replies; 21+ messages in thread
From: Pablo Neira Ayuso @ 2019-04-29 19:14 UTC (permalink / raw)
  To: Edward Cree; +Cc: Jamal Hadi Salim, netdev, Jiri Pirko, Cong Wang

On Mon, Apr 29, 2019 at 05:25:10PM +0100, Edward Cree wrote:
> On 29/04/2019 16:21, Pablo Neira Ayuso wrote:
> > On Mon, Apr 29, 2019 at 03:11:06PM +0100, Edward Cree wrote:
> >> This is a bit of a mess; the best idea I've got is for the
> >>  TC_CLSFLOWER_STATS call to include a tcfa_index.  Then the driver
> >>  returns counter stats for that index, and tcf_exts_stats_update()
> >>  only updates those actions whose index matches.  But then
> >>  fl_hw_update_stats() would have to iterate over all the indices in
> >>  f->exts.  What do you think?
> > You could extend struct flow_stats to pass an array of stats to the
> > driver, including one stats per action and the counter index. Then,
> > tcf_exts_stats_update() uses this array of stats to update per-action
> > stats.
> Yes, but that means allocating the flow_stats.stats array each time;

We use the stack to attach a reasonable size array, eg. 16 actions,
otherwise fall back to kmalloc(). I haven't seen any driver in the
tree supporting more than that so far.

>  I'd rather avoid memory allocation unless it's necessary.  As long as
>  we can move the preempt_disable() inside the loop that's currently in
>  tcf_exts_stats_update() (i.e. only disable pre-emption across each
>  individual call to tcf_action_stats_update()) I think we can.
> I think I prefer my approach (ask for one tcfa_index at a time); but
>  unmodified drivers that don't look at the passed index would return
>  zeroes for actions after the first, so we'll need some way to handle
>  those drivers separately (e.g. one tc_setup_cb_call with "answer
>  this one if you don't do indices" and a bunch more with specified
>  index values).  I think that requires much less change to the
>  existing drivers than putting an array back in the API, and keeps as
>  much of the work as possible in the core where it won't have to be
>  replicated in every driver.

That's all right. This chunk update will not be particularly large, so
we can change it anytime in the future.

> I'll put an RFC patch together soonish if no objections.

Sure, thanks.

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2019-04-29 19:14 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-05 19:41 TC stats / hw offload question Edward Cree
2019-02-06  2:20 ` Jamal Hadi Salim
2019-02-08 10:26   ` Edward Cree
2019-02-09 17:39     ` Jamal Hadi Salim
2019-02-11 11:44       ` Edward Cree
2019-02-14 12:39         ` Jamal Hadi Salim
2019-02-14 15:17           ` Andy Gospodarek
2019-02-18 18:56           ` Edward Cree
2019-02-18 19:37             ` Edward Cree
2019-04-24 14:05   ` Edward Cree
2019-04-24 14:11     ` Pablo Neira Ayuso
2019-04-24 15:03       ` Edward Cree
2019-04-25 13:23         ` Edward Cree
2019-04-25 22:33           ` Pablo Neira Ayuso
2019-04-26 12:13             ` Edward Cree
2019-04-26 12:42               ` Jamal Hadi Salim
2019-04-26 18:49               ` Pablo Neira Ayuso
2019-04-29 14:11                 ` Edward Cree
2019-04-29 15:21                   ` Pablo Neira Ayuso
2019-04-29 16:25                     ` Edward Cree
2019-04-29 19:14                       ` Pablo Neira Ayuso

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).