netdev.vger.kernel.org archive mirror
* Re: [PATCH net-next 00/18] devlink: rate objects API
       [not found] <1618918434-25520-1-git-send-email-dlinkin@nvidia.com>
@ 2021-04-20 20:35 ` Jakub Kicinski
  2021-04-21 12:08   ` Dmytro Linkin
  0 siblings, 1 reply; 4+ messages in thread
From: Jakub Kicinski @ 2021-04-20 20:35 UTC
  To: dlinkin; +Cc: netdev, davem, jiri

On Tue, 20 Apr 2021 14:33:36 +0300 dlinkin@nvidia.com wrote:
> From: Dmytro Linkin <dlinkin@nvidia.com>
> 
> Currently the kernel provides a way to change the tx rate of a single VF in
> switchdev mode via the tc-police action. When lots of VFs are configured,
> managing their rates becomes a non-trivial task and some grouping
> mechanism is required. Implementing such grouping in tc-police would bring
> flow-related limitations and unwanted complications, like:
> - flows require a net device to be placed on

Meaning they are only usable in "switchdev mode"?

> - effect of limiting depends on the position of tc-police action in the
>   pipeline

Could you expand? tc-police is usually expected to be first.

> - etc.

Please expand.

> Given that, devlink is the most appropriate place.
> 
> This series introduces a devlink API for managing the tx rate of a single
> devlink port or of a group by invoking callbacks (see below) of the
> corresponding driver. A devlink port or a group can also be added to a
> parent group, where the driver is responsible for handling the rates of
> the group's elements. To achieve all of that, a new rate object is added.
> It can be one of two types:
> - leaf - represents a single devlink port; created/destroyed by the
>   driver and bound to the devlink port. For example, a driver may
>   create a leaf rate object for every devlink port associated with a VF.
>   Since a leaf has a 1-to-1 mapping to its devlink port, in userspace it
>   is referred to as pci/<bus_addr>/<port_index>;
> - node - represents a group of rate objects; created/deleted by request
>   from userspace; initially empty (no rate objects added). In
>   userspace it is referred to as pci/<bus_addr>/<node_name>, where the
>   node name can be anything except a decimal number, to avoid collisions
>   with leaves.
> 
> devlink_ops is extended with the following callbacks:
> - rate_{leaf|node}_tx_{share|max}_set
> - rate_node_{new|del}
> - rate_{leaf|node}_parent_set

Tx is incorrect. You're setting an admission rate limiter on the port.

> KAPI provides:
> - creation/destruction of the leaf rate object associated with devlink
>   port
> - storing/retrieving driver specific data in rate object
> 
> UAPI provides:
> - dumping all or single rate objects
> - setting tx_{share|max} of rate object of any type
> - creating/deleting node rate object
> - setting/unsetting parent of any rate object

> Add devlink rate object support for netdevsim driver.
> To support devlink rate objects implement VF ports and eswitch mode
> selector for netdevsim driver.
> 
> Issues/open questions:
> - Does user need DEVLINK_CMD_RATE_DEL_ALL_CHILD command to clean all
>   children of particular parent node? For example:
>   $ devlink port func rate flush netdevsim/netdevsim10/group

Is this an RFC? There is no real user in this set.


* Re: [PATCH net-next 00/18] devlink: rate objects API
  2021-04-20 20:35 ` [PATCH net-next 00/18] devlink: rate objects API Jakub Kicinski
@ 2021-04-21 12:08   ` Dmytro Linkin
  2021-04-21 18:59     ` Jakub Kicinski
  0 siblings, 1 reply; 4+ messages in thread
From: Dmytro Linkin @ 2021-04-21 12:08 UTC
  To: Jakub Kicinski; +Cc: netdev, davem, jiri

On 4/20/21 11:35 PM, Jakub Kicinski wrote:
> On Tue, 20 Apr 2021 14:33:36 +0300 dlinkin@nvidia.com wrote:
>> From: Dmytro Linkin <dlinkin@nvidia.com>
>>
>> Currently the kernel provides a way to change the tx rate of a single VF in
>> switchdev mode via the tc-police action. When lots of VFs are configured,
>> managing their rates becomes a non-trivial task and some grouping
>> mechanism is required. Implementing such grouping in tc-police would bring
>> flow-related limitations and unwanted complications, like:
>> - flows require a net device to be placed on
> 
> Meaning they are only usable in "switchdev mode"?

Meaning "groups" wouldn't have corresponding net devices, and that would
need to be dealt with somehow. I'll rephrase this line.

> 
>> - effect of limiting depends on the position of tc-police action in the
>>   pipeline
> 
> Could you expand? tc-police is usually expected to be first.

Ok

>
>> - etc.
> 
> Please expand.

Ok

> 
>> Given that, devlink is the most appropriate place.
>>
>> This series introduces a devlink API for managing the tx rate of a single
>> devlink port or of a group by invoking callbacks (see below) of the
>> corresponding driver. A devlink port or a group can also be added to a
>> parent group, where the driver is responsible for handling the rates of
>> the group's elements. To achieve all of that, a new rate object is added.
>> It can be one of two types:
>> - leaf - represents a single devlink port; created/destroyed by the
>>   driver and bound to the devlink port. For example, a driver may
>>   create a leaf rate object for every devlink port associated with a VF.
>>   Since a leaf has a 1-to-1 mapping to its devlink port, in userspace it
>>   is referred to as pci/<bus_addr>/<port_index>;
>> - node - represents a group of rate objects; created/deleted by request
>>   from userspace; initially empty (no rate objects added). In
>>   userspace it is referred to as pci/<bus_addr>/<node_name>, where the
>>   node name can be anything except a decimal number, to avoid collisions
>>   with leaves.
>>
>> devlink_ops is extended with the following callbacks:
>> - rate_{leaf|node}_tx_{share|max}_set
>> - rate_node_{new|del}
>> - rate_{leaf|node}_parent_set
> 
> Tx is incorrect. You're setting an admission rate limiter on the port.
> 
>> KAPI provides:
>> - creation/destruction of the leaf rate object associated with devlink
>>   port
>> - storing/retrieving driver specific data in rate object
>>
>> UAPI provides:
>> - dumping all or single rate objects
>> - setting tx_{share|max} of rate object of any type
>> - creating/deleting node rate object
>> - setting/unsetting parent of any rate object
> 
>> Add devlink rate object support for netdevsim driver.
>> To support devlink rate objects implement VF ports and eswitch mode
>> selector for netdevsim driver.
>>
>> Issues/open questions:
>> - Does user need DEVLINK_CMD_RATE_DEL_ALL_CHILD command to clean all
>>   children of particular parent node? For example:
>>   $ devlink port func rate flush netdevsim/netdevsim10/group
> 
> Is this an RFC? There is no real user in this set.

Yes. I'll resend the patches anyway, because of an issue with the SMTP server.

> 



* Re: [PATCH net-next 00/18] devlink: rate objects API
  2021-04-21 12:08   ` Dmytro Linkin
@ 2021-04-21 18:59     ` Jakub Kicinski
  2021-05-05  8:05       ` Dmytro Linkin
  0 siblings, 1 reply; 4+ messages in thread
From: Jakub Kicinski @ 2021-04-21 18:59 UTC
  To: Dmytro Linkin; +Cc: netdev, davem, jiri

On Wed, 21 Apr 2021 15:08:07 +0300 Dmytro Linkin wrote:
> On 4/20/21 11:35 PM, Jakub Kicinski wrote:
> > On Tue, 20 Apr 2021 14:33:36 +0300 dlinkin@nvidia.com wrote:  
> >> From: Dmytro Linkin <dlinkin@nvidia.com>
> >>
> >> Currently the kernel provides a way to change the tx rate of a single VF in
> >> switchdev mode via the tc-police action. When lots of VFs are configured,
> >> managing their rates becomes a non-trivial task and some grouping
> >> mechanism is required. Implementing such grouping in tc-police would bring
> >> flow-related limitations and unwanted complications, like:
> >> - flows require a net device to be placed on
> > 
> > Meaning they are only usable in "switchdev mode"?  
> 
> Meaning "groups" wouldn't have corresponding net devices, and that would
> need to be dealt with somehow. I'll rephrase this line.

But you can share a police action across netdevs. A deeper analysis of
the capabilities of the current subsystem would be appreciated before
we commit to this (less expressive) implementation.


* Re: [PATCH net-next 00/18] devlink: rate objects API
  2021-04-21 18:59     ` Jakub Kicinski
@ 2021-05-05  8:05       ` Dmytro Linkin
  0 siblings, 0 replies; 4+ messages in thread
From: Dmytro Linkin @ 2021-05-05  8:05 UTC
  To: Jakub Kicinski; +Cc: netdev, davem, jiri

On 4/21/21 9:59 PM, Jakub Kicinski wrote:
> On Wed, 21 Apr 2021 15:08:07 +0300 Dmytro Linkin wrote:
>> On 4/20/21 11:35 PM, Jakub Kicinski wrote:
>>> On Tue, 20 Apr 2021 14:33:36 +0300 dlinkin@nvidia.com wrote:  
>>>> From: Dmytro Linkin <dlinkin@nvidia.com>
>>>>
>>>> Currently the kernel provides a way to change the tx rate of a single VF in
>>>> switchdev mode via the tc-police action. When lots of VFs are configured,
>>>> managing their rates becomes a non-trivial task and some grouping
>>>> mechanism is required. Implementing such grouping in tc-police would bring
>>>> flow-related limitations and unwanted complications, like:
>>>> - flows require a net device to be placed on
>>>
>>> Meaning they are only usable in "switchdev mode"?  
>>
>> Meaning "groups" wouldn't have corresponding net devices, and that would
>> need to be dealt with somehow. I'll rephrase this line.
> 
> But you can share a police action across netdevs. A deeper analysis of
> the capabilities of the current subsystem would be appreciated before
> we commit to this (less expressive) implementation.
> 

Hi, sorry for the delay in answering.

We have a customer request for a traffic shaper for a group of VFs.
The tc-police action is a policer, so a shared action isn't suitable.
Since the request was more about a group shaper, we reviewed a case where
the representor has a policer and the driver uses a shaper if the qdisc
action is continue and the qdisc contains a group of VFs, but such an
approach is ugly, complicated and misleading.
Also, TC is ingress only, while configuring the "other" side of the wire
looks more like the "real" picture, where shaping is outside of the
steering world.
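
The policer/shaper distinction can be illustrated with a toy model. This
is a sketch in Python, not kernel, tc or devlink code; the function names
police() and shape(), the bytes-per-second units and the sample traffic
are all assumptions made for illustration. A policer drops packets that
exceed its token budget, while a shaper delays them so that everything
is eventually sent at the configured rate.

```python
def police(arrivals, rate, burst):
    """Token-bucket policer (illustrative only).

    arrivals: list of (time_sec, size_bytes), sorted by time
    rate:     token refill rate in bytes per second
    burst:    bucket depth in bytes
    Returns the packets that pass; everything else is dropped.
    """
    tokens, last, passed = burst, 0.0, []
    for t, size in arrivals:
        # Refill tokens for the elapsed time, capped at the bucket depth.
        tokens = min(burst, tokens + (t - last) * rate)
        last = t
        if size <= tokens:
            tokens -= size
            passed.append((t, size))
        # Otherwise the packet is dropped and never transmitted.
    return passed


def shape(arrivals, rate):
    """Simple shaper (illustrative only).

    Every packet is transmitted, but its start time is pushed back so
    that the output never exceeds `rate` bytes per second.
    Returns a list of (start_time_sec, size_bytes).
    """
    free_at, sent = 0.0, []
    for t, size in arrivals:
        start = max(t, free_at)          # wait until the link is free
        free_at = start + size / rate    # link is busy while sending
        sent.append((start, size))
    return sent


# Three 1000-byte packets arriving at the same instant, 1000 B/s limit.
burst_of_three = [(0.0, 1000), (0.0, 1000), (0.0, 1000)]
print(police(burst_of_three, rate=1000, burst=1000))
print(shape(burst_of_three, rate=1000))
```

With this input the policer passes only the first packet, while the
shaper sends all three, spaced one second apart.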




