From: Jiri Pirko <jiri@resnulli.us>
To: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org, davem@davemloft.net, saeedm@mellanox.com,
leon@kernel.org, michael.chan@broadcom.com, vishal@chelsio.com,
jeffrey.t.kirsher@intel.com, idosch@mellanox.com,
aelior@marvell.com, peppe.cavallaro@st.com,
alexandre.torgue@st.com, jhs@mojatatu.com,
xiyou.wangcong@gmail.com, pablo@netfilter.org,
ecree@solarflare.com, mlxsw@mellanox.com
Subject: Re: [patch net-next v2 12/12] sched: act: allow user to specify type of HW stats for a filter
Date: Sun, 1 Mar 2020 09:57:56 +0100 [thread overview]
Message-ID: <20200301085756.GS26061@nanopsycho> (raw)
In-Reply-To: <20200229121452.5dd4963b@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
Sat, Feb 29, 2020 at 09:14:52PM CET, kuba@kernel.org wrote:
>On Sat, 29 Feb 2020 08:52:09 +0100 Jiri Pirko wrote:
>> Fri, Feb 28, 2020 at 08:59:23PM CET, kuba@kernel.org wrote:
>> >On Fri, 28 Feb 2020 18:25:05 +0100 Jiri Pirko wrote:
>> >> From: Jiri Pirko <jiri@mellanox.com>
>> >> +/* tca HW stats type */
>> >> +enum tca_act_hw_stats_type {
>> >> + TCA_ACT_HW_STATS_TYPE_ANY, /* User does not care, it's default
>> >> + * when user does not pass the attr.
>> >> + * Instructs the driver that user does not
>> >> + * care if the HW stats are "immediate"
>> >> + * or "delayed".
>> >> + */
>> >> + TCA_ACT_HW_STATS_TYPE_IMMEDIATE, /* Means that in dump, user gets
>> >> + * the current HW stats state from
>> >> + * the device queried at the dump time.
>> >> + */
>> >> + TCA_ACT_HW_STATS_TYPE_DELAYED, /* Means that in dump, user gets
>> >> + * HW stats that might be out of date
>> >> + * for some time, maybe couple of
>> >> + * seconds. This is the case when driver
>> >> + * polls stats updates periodically
>> >> + * or when it gets async stats update
>> >> + * from the device.
>> >> + */
>> >> + TCA_ACT_HW_STATS_TYPE_DISABLED, /* User is not interested in getting
>> >> + * any HW statistics.
>> >> + */
>> >> +};
>> >
>> >On the ABI I wonder if we can redefine it a little bit..
>> >
>> >Can we make the stat types into a bitfield?
>> >
>> >On request:
>> > - no attr -> any stats allowed but some stats must be provided *
>> > - 0 -> no stats requested / disabled
>> > - 0x1 -> must be stat type0
>> > - 0x6 -> stat type1 or stat type2 are both fine
>>
>> I was thinking about this of course. On the write side, this is ok
>> however, this is very tricky on read side. See below.
>>
>> >* no attr kinda doesn't work 'cause u32 offload has no stats and this
>> > is action-level now, not flower-level :S What about u32 and matchall?
>>
>> The fact that cls does not implement stats offloading is a lack of
>> feature of the particular cls.
>
>Yeah, I wonder how that squares with strict netlink parsing.
>
>> >We can add a separate attribute with "active" stat types:
>> > - no attr -> old kernel
>> > - 0 -> no stats are provided / stats disabled
>> > - 0x1 -> only stat type0 is used by drivers
>> > - 0x6 -> at least one driver is using type1 and one type2
>>
>> There are 2 problems:
>> 1) There is a mismatch between write and read. User might pass different
>> value than it eventually gets from kernel. I guess this might be fine.
>
>Separate attribute would work.
>
>> 2) Much bigger problem is, that since the same action may be offloaded
>> by multiple drivers, the read would have to provide an array of
>> bitfields, each array item would represent one offloaded driver. That is
>> why I decided for simple value instead of bitfield which is the same on
>> write and read.
>
>Why an array? The counter itself is added up from all the drivers.
>If the value is a bitfield all drivers can just OR-in their type.
Yeah, for uapi. Internally the array would be still needed. Also the
driver would need to somehow "write-back" the value to the offload
caller and someone (caller/tc) would have to use the array to track
these bitfields for individual callbacks (probably idr of some sort).
I don't know, is this excercise worth it?
Seems to me like we are overengineering this one a bit.
Also there would be no "any" it would be type0|type1|type2 the user
would have to pass. If new type appears, the userspace would have to be
updated to do "any" again :/ This is inconvenient.
>
>> >That assumes that we may one day add another stat type which would
>> >not be just based on the reporting time.
>> >
>> >If we only foresee time-based reporting would it make sense to turn
>> >the attribute into max acceptable delay in ms?
>> >
>> >0 -> only immediate / blocking stats
>> >(0, MAX) -> given reporting delay in ms is acceptable
>> >MAX -> don't care about stats at all
>>
>> Interesting, is this "delayed" granularity something that has a usecase?
>> It might turn into a guessing game between user and driver during action
>> insertion :/
>
>Yeah, I don't like the guessing part too, worst case refresh time may
>be system dependent.
>
>With just "DELAYED" I'm worried users will think the delay may be too
>long for OvS. Or simply poll the statistics more often than the HW
>reports them, which would be pointless.
>
>For the latter case I guess the best case refresh time is needed,
>while the former needs worst case. Hopefully the two are not too far
>apart.
>
>Maybe some day drivers may also tweak the refresh rate based on user
>requests to save PCIe bandwidth and CPU..
>
>Anyway.. maybe its not worth it today.
>
>> >> +tcf_flow_action_hw_stats_type(enum tca_act_hw_stats_type hw_stats_type)
>> >> +{
>> >> + switch (hw_stats_type) {
>> >> + default:
>> >> + WARN_ON(1);
>> >> + /* fall-through */
>> >
>> >without the policy change this seems user-triggerable
>>
>> Nope. tcf_action_hw_stats_type_get() takes care of setting
>> TCA_ACT_HW_STATS_TYPE_ANY when no attr is passed.
>
>I meant attribute is present but carries a large value.
Ah, will sanitize this.
>
>> >> + case TCA_ACT_HW_STATS_TYPE_ANY:
>> >> + return FLOW_ACTION_HW_STATS_TYPE_ANY;
>> >> + case TCA_ACT_HW_STATS_TYPE_IMMEDIATE:
>> >> + return FLOW_ACTION_HW_STATS_TYPE_IMMEDIATE;
>> >> + case TCA_ACT_HW_STATS_TYPE_DELAYED:
>> >> + return FLOW_ACTION_HW_STATS_TYPE_DELAYED;
>> >> + case TCA_ACT_HW_STATS_TYPE_DISABLED:
>> >> + return FLOW_ACTION_HW_STATS_TYPE_DISABLED;
>
next prev parent reply other threads:[~2020-03-01 8:59 UTC|newest]
Thread overview: 51+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-02-28 17:24 [patch net-next v2 00/12] net: allow user specify TC action HW stats type Jiri Pirko
2020-02-28 17:24 ` [patch net-next v2 01/12] flow_offload: Introduce offload of " Jiri Pirko
2020-02-29 19:29 ` Pablo Neira Ayuso
2020-03-01 8:44 ` Jiri Pirko
2020-03-02 13:20 ` Pablo Neira Ayuso
2020-03-02 13:58 ` Jiri Pirko
2020-03-02 16:29 ` Edward Cree
2020-03-02 19:24 ` Pablo Neira Ayuso
2020-03-02 20:18 ` Jakub Kicinski
2020-03-02 21:46 ` Pablo Neira Ayuso
2020-03-02 22:49 ` Jakub Kicinski
2020-03-03 17:25 ` Pablo Neira Ayuso
2020-03-03 19:34 ` Jakub Kicinski
2020-03-03 18:55 ` Edward Cree
2020-03-03 19:26 ` Jakub Kicinski
2020-03-03 20:27 ` Pablo Neira Ayuso
2020-03-03 21:06 ` Edward Cree
2020-03-03 21:25 ` Pablo Neira Ayuso
2020-03-03 19:13 ` Edward Cree
2020-03-04 14:18 ` Jiri Pirko
2020-02-28 17:24 ` [patch net-next v2 02/12] ocelot_flower: use flow_offload_has_one_action() helper Jiri Pirko
2020-02-29 0:50 ` Vladimir Oltean
2020-02-28 17:24 ` [patch net-next v2 03/12] flow_offload: check for basic action hw stats type Jiri Pirko
2020-02-28 19:40 ` Jakub Kicinski
2020-02-29 7:40 ` Jiri Pirko
2020-02-29 19:18 ` Jakub Kicinski
2020-03-01 9:00 ` Jiri Pirko
2020-03-02 19:33 ` Jakub Kicinski
2020-03-03 11:46 ` Jiri Pirko
2020-02-28 17:24 ` [patch net-next v2 04/12] mlx5: en_tc: Do not allow mixing HW stats types for actions Jiri Pirko
2020-02-28 17:24 ` [patch net-next v2 05/12] mlxsw: spectrum_flower: " Jiri Pirko
2020-02-28 17:24 ` [patch net-next v2 06/12] mlx5: restrict supported HW stats type to "any" Jiri Pirko
2020-02-28 17:25 ` [patch net-next v2 07/12] mlxsw: " Jiri Pirko
2020-02-28 17:25 ` [patch net-next v2 08/12] flow_offload: introduce "immediate" HW stats type and allow it in mlxsw Jiri Pirko
2020-02-29 19:32 ` Pablo Neira Ayuso
2020-03-01 8:47 ` Jiri Pirko
2020-03-02 13:23 ` Pablo Neira Ayuso
2020-03-02 13:59 ` Jiri Pirko
2020-02-28 17:25 ` [patch net-next v2 09/12] flow_offload: introduce "delayed" HW stats type and allow it in mlx5 Jiri Pirko
2020-02-28 17:25 ` [patch net-next v2 10/12] mlxsw: spectrum_acl: Ask device for rule stats only if counter was created Jiri Pirko
2020-02-28 17:25 ` [patch net-next v2 11/12] flow_offload: introduce "disabled" HW stats type and allow it in mlxsw Jiri Pirko
2020-02-28 17:25 ` [patch net-next v2 12/12] sched: act: allow user to specify type of HW stats for a filter Jiri Pirko
2020-02-28 19:59 ` Jakub Kicinski
2020-02-29 7:52 ` Jiri Pirko
2020-02-29 20:14 ` Jakub Kicinski
2020-03-01 8:57 ` Jiri Pirko [this message]
2020-03-02 19:39 ` Jakub Kicinski
2020-03-03 13:20 ` Jiri Pirko
2020-03-03 19:48 ` Jakub Kicinski
2020-03-04 8:15 ` Jiri Pirko
2020-03-03 19:44 ` Jakub Kicinski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200301085756.GS26061@nanopsycho \
--to=jiri@resnulli.us \
--cc=aelior@marvell.com \
--cc=alexandre.torgue@st.com \
--cc=davem@davemloft.net \
--cc=ecree@solarflare.com \
--cc=idosch@mellanox.com \
--cc=jeffrey.t.kirsher@intel.com \
--cc=jhs@mojatatu.com \
--cc=kuba@kernel.org \
--cc=leon@kernel.org \
--cc=michael.chan@broadcom.com \
--cc=mlxsw@mellanox.com \
--cc=netdev@vger.kernel.org \
--cc=pablo@netfilter.org \
--cc=peppe.cavallaro@st.com \
--cc=saeedm@mellanox.com \
--cc=vishal@chelsio.com \
--cc=xiyou.wangcong@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).