* Re: [PATCH net-next 00/18] devlink: rate objects API
From: Jakub Kicinski @ 2021-04-20 20:35 UTC (permalink / raw)
To: dlinkin; +Cc: netdev, davem, jiri
On Tue, 20 Apr 2021 14:33:36 +0300 dlinkin@nvidia.com wrote:
> From: Dmytro Linkin <dlinkin@nvidia.com>
>
> Currently the kernel provides a way to change the tx rate of a single VF
> in switchdev mode via the tc-police action. When lots of VFs are
> configured, managing their rates becomes a non-trivial task and some
> grouping mechanism is required. Implementing such grouping in tc-police
> would bring flow-related limitations and unwanted complications, like:
> - flows require a net device to be placed on
Meaning they are only usable in "switchdev mode"?
> - effect of limiting depends on the position of tc-police action in the
> pipeline
Could you expand? tc-police is usually expected to be first.
> - etc.
Please expand.
> Given that, devlink is the most appropriate place.
>
> This series introduces a devlink API for managing the tx rate of a single
> devlink port or of a group by invoking callbacks (see below) of the
> corresponding driver. A devlink port or a group can also be added to a
> parent group, where the driver is responsible for handling the rates of
> the group's elements. To achieve all of that, a new rate object is added.
> It can be one of two types:
> - leaf - represents a single devlink port; created/destroyed by the
> driver and bound to the devlink port. For example, a driver may
> create a leaf rate object for every devlink port associated with a VF.
> Since a leaf has a 1:1 mapping to its devlink port, in userspace it is
> referred to as pci/<bus_addr>/<port_index>;
> - node - represents a group of rate objects; created/deleted by request
> from userspace; initially empty (no rate objects added). In userspace
> it is referred to as pci/<bus_addr>/<node_name>, where the node name
> can be anything except a decimal number, to avoid collisions with
> leaves.
>
> devlink_ops is extended with the following callbacks:
> - rate_{leaf|node}_tx_{share|max}_set
> - rate_node_{new|del}
> - rate_{leaf|node}_parent_set
Tx is incorrect. You're setting an admission rate limiter on the port.
> KAPI provides:
> - creation/destruction of the leaf rate object associated with devlink
> port
> - storing/retrieving driver specific data in rate object
>
> UAPI provides:
> - dumping all or single rate objects
> - setting tx_{share|max} of rate object of any type
> - creating/deleting node rate object
> - setting/unsetting parent of any rate object
> Add devlink rate object support to the netdevsim driver.
> To support devlink rate objects, implement VF ports and an eswitch mode
> selector for the netdevsim driver.
>
> Issues/open questions:
> - Does the user need a DEVLINK_CMD_RATE_DEL_ALL_CHILD command to remove
> all children of a particular parent node? For example:
> $ devlink port func rate flush netdevsim/netdevsim10/group
Is this an RFC? There is no real user in this set.
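
The leaf/node hierarchy described in the quoted cover letter can be sketched
as a small userspace model. This is only an illustration under stated
assumptions, not the kernel implementation: the names RateObject, RateTree,
leaf_create, node_new and parent_set are hypothetical (loosely echoing the
rate_node_new and rate_{leaf|node}_parent_set callbacks), the bus address is
made up, and the loop check is an assumption the thread does not discuss.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class RateObject:
    handle: str                 # pci/<bus_addr>/<port_index or node_name>
    is_leaf: bool
    tx_share: int = 0           # 0 stands for "unset" in this sketch
    tx_max: int = 0
    parent: Optional["RateObject"] = None


class RateTree:
    """Toy model of the rate objects belonging to one devlink instance."""

    def __init__(self, bus_addr: str) -> None:
        self.bus_addr = bus_addr
        self.objects: Dict[str, RateObject] = {}

    def _add(self, suffix: str, is_leaf: bool) -> RateObject:
        handle = f"pci/{self.bus_addr}/{suffix}"
        if handle in self.objects:
            raise ValueError(f"{handle} already exists")
        obj = RateObject(handle, is_leaf)
        self.objects[handle] = obj
        return obj

    def leaf_create(self, port_index: int) -> RateObject:
        # Leaves are created by the driver, one per devlink port.
        return self._add(str(port_index), is_leaf=True)

    def node_new(self, node_name: str) -> RateObject:
        # Nodes are created on request from userspace; a purely decimal
        # name is rejected so it cannot collide with a leaf's port index.
        if node_name.isdecimal():
            raise ValueError("node name must not be a decimal number")
        return self._add(node_name, is_leaf=False)

    def parent_set(self, child: RateObject,
                   parent: Optional[RateObject]) -> None:
        # parent=None unsets the parent. Only nodes can act as parents.
        if parent is not None and parent.is_leaf:
            raise ValueError("a leaf cannot be a parent")
        # Assumption: loops are rejected (not covered in the thread).
        ancestor = parent
        while ancestor is not None:
            if ancestor is child:
                raise ValueError("parent_set would create a loop")
            ancestor = ancestor.parent
        child.parent = parent


# Hypothetical usage: one VF port grouped under a node.
tree = RateTree("0000:03:00.0")
vf1 = tree.leaf_create(1)
group = tree.node_new("group1")
tree.parent_set(vf1, group)
```

In this model a leaf's handle ends in its port index and a node's handle in
its name, which is why decimal node names have to be refused.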
* Re: [PATCH net-next 00/18] devlink: rate objects API
From: Dmytro Linkin @ 2021-04-21 12:08 UTC (permalink / raw)
To: Jakub Kicinski; +Cc: netdev, davem, jiri
On 4/20/21 11:35 PM, Jakub Kicinski wrote:
> On Tue, 20 Apr 2021 14:33:36 +0300 dlinkin@nvidia.com wrote:
>> From: Dmytro Linkin <dlinkin@nvidia.com>
>>
>> Currently the kernel provides a way to change the tx rate of a single VF
>> in switchdev mode via the tc-police action. When lots of VFs are
>> configured, managing their rates becomes a non-trivial task and some
>> grouping mechanism is required. Implementing such grouping in tc-police
>> would bring flow-related limitations and unwanted complications, like:
>> - flows require a net device to be placed on
>
> Meaning they are only usable in "switchdev mode"?
Meaning that "groups" wouldn't have corresponding net devices, and we would
need to deal with that somehow. I'll rephrase this line.
>
>> - effect of limiting depends on the position of tc-police action in the
>> pipeline
>
> Could you expand? tc-police is usually expected to be first.
Ok
>
>> - etc.
>
> Please expand.
Ok
>
>> Given that, devlink is the most appropriate place.
>>
>> This series introduces a devlink API for managing the tx rate of a single
>> devlink port or of a group by invoking callbacks (see below) of the
>> corresponding driver. A devlink port or a group can also be added to a
>> parent group, where the driver is responsible for handling the rates of
>> the group's elements. To achieve all of that, a new rate object is added.
>> It can be one of two types:
>> - leaf - represents a single devlink port; created/destroyed by the
>> driver and bound to the devlink port. For example, a driver may
>> create a leaf rate object for every devlink port associated with a VF.
>> Since a leaf has a 1:1 mapping to its devlink port, in userspace it is
>> referred to as pci/<bus_addr>/<port_index>;
>> - node - represents a group of rate objects; created/deleted by request
>> from userspace; initially empty (no rate objects added). In userspace
>> it is referred to as pci/<bus_addr>/<node_name>, where the node name
>> can be anything except a decimal number, to avoid collisions with
>> leaves.
>>
>> devlink_ops is extended with the following callbacks:
>> - rate_{leaf|node}_tx_{share|max}_set
>> - rate_node_{new|del}
>> - rate_{leaf|node}_parent_set
>
> Tx is incorrect. You're setting an admission rate limiter on the port.
>
>> KAPI provides:
>> - creation/destruction of the leaf rate object associated with devlink
>> port
>> - storing/retrieving driver specific data in rate object
>>
>> UAPI provides:
>> - dumping all or single rate objects
>> - setting tx_{share|max} of rate object of any type
>> - creating/deleting node rate object
>> - setting/unsetting parent of any rate object
>
>> Add devlink rate object support to the netdevsim driver.
>> To support devlink rate objects, implement VF ports and an eswitch mode
>> selector for the netdevsim driver.
>>
>> Issues/open questions:
>> - Does the user need a DEVLINK_CMD_RATE_DEL_ALL_CHILD command to remove
>> all children of a particular parent node? For example:
>> $ devlink port func rate flush netdevsim/netdevsim10/group
>
> Is this an RFC? There is no real user in this set.
Yes. I'll resend the patches anyway because of an issue with the SMTP server.
>
* Re: [PATCH net-next 00/18] devlink: rate objects API
From: Jakub Kicinski @ 2021-04-21 18:59 UTC (permalink / raw)
To: Dmytro Linkin; +Cc: netdev, davem, jiri
On Wed, 21 Apr 2021 15:08:07 +0300 Dmytro Linkin wrote:
> On 4/20/21 11:35 PM, Jakub Kicinski wrote:
> > On Tue, 20 Apr 2021 14:33:36 +0300 dlinkin@nvidia.com wrote:
> >> From: Dmytro Linkin <dlinkin@nvidia.com>
> >>
> >> Currently the kernel provides a way to change the tx rate of a single VF
> >> in switchdev mode via the tc-police action. When lots of VFs are
> >> configured, managing their rates becomes a non-trivial task and some
> >> grouping mechanism is required. Implementing such grouping in tc-police
> >> would bring flow-related limitations and unwanted complications, like:
> >> - flows require a net device to be placed on
> >
> > Meaning they are only usable in "switchdev mode"?
>
> Meaning that "groups" wouldn't have corresponding net devices, and we would
> need to deal with that somehow. I'll rephrase this line.
But you can share a police action across netdevs. A deeper analysis of
the capabilities of the current subsystem would be appreciated before
we commit to this (less expressive) implementation.
* Re: [PATCH net-next 00/18] devlink: rate objects API
From: Dmytro Linkin @ 2021-05-05 8:05 UTC (permalink / raw)
To: Jakub Kicinski; +Cc: netdev, davem, jiri
On 4/21/21 9:59 PM, Jakub Kicinski wrote:
> On Wed, 21 Apr 2021 15:08:07 +0300 Dmytro Linkin wrote:
>> On 4/20/21 11:35 PM, Jakub Kicinski wrote:
>>> On Tue, 20 Apr 2021 14:33:36 +0300 dlinkin@nvidia.com wrote:
>>>> From: Dmytro Linkin <dlinkin@nvidia.com>
>>>>
>>>> Currently the kernel provides a way to change the tx rate of a single VF
>>>> in switchdev mode via the tc-police action. When lots of VFs are
>>>> configured, managing their rates becomes a non-trivial task and some
>>>> grouping mechanism is required. Implementing such grouping in tc-police
>>>> would bring flow-related limitations and unwanted complications, like:
>>>> - flows require a net device to be placed on
>>>
>>> Meaning they are only usable in "switchdev mode"?
>>
>> Meaning that "groups" wouldn't have corresponding net devices, and we would
>> need to deal with that somehow. I'll rephrase this line.
>
> But you can share a police action across netdevs. A deeper analysis of
> the capabilities of the current subsystem would be appreciated before
> we commit to this (less expressive) implementation.
>
Hi, sorry for the delay in answering.
We have a customer request for a traffic shaper for a group of VFs.
The tc-police action is a policer, so a shared action isn't suitable. Since
the request was about a group shaper, we reviewed a case where the
representor has a policer and the driver uses a shaper if the qdisc
action is continue and the qdisc contains a group of VFs, but such an
approach is ugly, complicated and misleading.
Also, TC is ingress only, while configuring the "other" side of the wire
looks more like the "real" picture, where shaping is outside of the
steering world.
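
The policer/shaper distinction drawn above can be illustrated with a generic
token-bucket sketch. It is not how tc-police or any driver is implemented:
the functions police and shape, the per-tick packet counts, and the unbounded
shaper queue are all assumptions made for the example. The arrivals list
stands for the aggregate traffic of a group of VFs.

```python
from typing import List, Tuple


def police(arrivals: List[int], rate: int, burst: int) -> Tuple[List[int], int]:
    """Token-bucket policer: packets that find no token are dropped."""
    tokens, sent, dropped = burst, [], 0
    for n in arrivals:
        tokens = min(burst, tokens + rate)   # refill once per tick
        ok = min(n, tokens)
        tokens -= ok
        dropped += n - ok                    # excess is lost for good
        sent.append(ok)
    return sent, dropped


def shape(arrivals: List[int], rate: int, burst: int) -> Tuple[List[int], int]:
    """Token-bucket shaper: excess packets wait in a queue instead."""
    tokens, backlog, sent = burst, 0, []
    for n in arrivals:
        tokens = min(burst, tokens + rate)   # refill once per tick
        backlog += n                         # queue is unbounded here
        ok = min(backlog, tokens)
        tokens -= ok
        backlog -= ok                        # the rest is delayed, not lost
        sent.append(ok)
    return sent, backlog


# Same bursty aggregate load, same rate limit, different behaviour.
load = [5, 0, 0, 3, 0]
policed, dropped = police(load, rate=2, burst=2)
shaped, queued = shape(load, rate=2, burst=2)
```

With this load the policer forwards 4 of the 8 packets and drops the rest,
while the shaper forwards all 8, spread over later ticks.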