From: Petr Machata <petrm@nvidia.com>
To: <Daniel.Machon@microchip.com>
Cc: <petrm@nvidia.com>, <netdev@vger.kernel.org>, <kuba@kernel.org>,
<vinicius.gomes@intel.com>, <vladimir.oltean@nxp.com>,
<thomas.petazzoni@bootlin.com>, <Allan.Nielsen@microchip.com>,
<maxime.chevallier@bootlin.com>, <roopa@nvidia.com>
Subject: Re: Basic PCP/DEI-based queue classification
Date: Wed, 24 Aug 2022 21:36:54 +0200
Message-ID: <87k06xjplj.fsf@nvidia.com>
In-Reply-To: <YwZoGJXgx/t/Qxam@DEN-LT-70577>
<Daniel.Machon@microchip.com> writes:
>> > As I hinted earlier, we could also add an entirely new PCP interface
>> > (like with maxrate), this will give us a bit more flexibility and will
>> > not clash with anything. This approach will not give us trust for DSCP,
>> > but maybe we can disregard this and go with a PCP solution initially?
>>
>> I would like to have a line of sight to how things will be done. Not
>> everything needs to be implemented at once, but we have to understand
>> how to get there when we need to. At least for issues that we can
>> already foresee now, such as the DSCP / PCP / default ordering.
>>
>> Adding the PCP rules as a new APP selector, and then expressing the
>> ordering as a "selector policy" or whatever, IMHO takes care of this
>> nicely.
>>
>> But OK, let's talk about the "flexibility" bit that you mention: what
>> does this approach make difficult or impossible?
>
> It was merely a concern of not changing too much on something that is
> already standard. Maybe I don't quite see how the APP interface can be
> extended to accommodate: pcp/dei, ingress/egress and trust. Let's
> try to break it down:
>
> - pcp/dei:
> this *could* be expressed in app->protocol and map 1:1 to the
> pcp table entries, so that 8*dei+pcp:priority. If I want to map
> pcp 3, with dei 1 to priority 2, it would be encoded 11:2.
Yep. In particular something like {sel=255, pid=11, prio=2}.
iproute2 "dcb" would obviously grow brains to let you configure this
stuff semantically, so e.g.:
    # dcb app replace dev X pcp-prio 3:3 3de:2 2:2 2de:1
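
To make the encoding concrete, here is a rough sketch of how the semantic tokens could be turned into APP entries. It only illustrates the 8*dei+pcp arithmetic discussed above. The selector value 255 is the placeholder from my example, not an allocated number, the "de" suffix is the hypothetical syntax from the command line above, and the function names are made up.

```python
# Placeholder selector from the example above; NOT an allocated value.
PCP_SELECTOR = 255


def pcp_dei_to_pid(pcp: int, dei: int) -> int:
    """Fold a PCP/DEI pair into a single protocol ID as 8*dei + pcp."""
    if not 0 <= pcp <= 7:
        raise ValueError("PCP must be in 0..7")
    if dei not in (0, 1):
        raise ValueError("DEI must be 0 or 1")
    return 8 * dei + pcp


def parse_pcp_prio(token: str) -> dict:
    """Parse one hypothetical 'pcp-prio' token such as '3:3' or '3de:2'."""
    key, prio_str = token.split(":")
    dei = 1 if key.endswith("de") else 0
    pcp = int(key[:-2]) if dei else int(key)
    prio = int(prio_str)
    if not 0 <= prio <= 7:
        raise ValueError("priority must be in 0..7")
    return {"sel": PCP_SELECTOR, "pid": pcp_dei_to_pid(pcp, dei), "prio": prio}


if __name__ == "__main__":
    for tok in "3:3 3de:2 2:2 2de:1".split():
        print(tok, "->", parse_pcp_prio(tok))
```

With this, "3de:2" comes out as {sel=255, pid=11, prio=2}, matching the hand-encoded entry above.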
> - ingress/egress:
> I guess we need a selector for each? I notice that the mellanox
> driver uses the dcb_ieee_getapp_prio_dscp_mask_map and
> dcb_ieee_getapp_dscp_prio_mask_map for priority map and priority
> rewrite map, but these seem to be the same for both ingress and
> egress to me?
Ha, I was only thinking about prioritization, not about rewrite at all.
Yeah, mlxsw uses APP rules for rewrite as well. The logic is that if the
network behind port X uses DSCP value D to express priority P, then
packets with priority P leaving that port should have DSCP value of D.
Of course it doesn't work too well, because there are 8 priorities, but
64 DSCP values. So mlxsw arbitrarily chooses the highest DSCP value.
The situation is similar with PCP, where there are 16 PCP+DEI
combinations, but only 8 priorities.
So having a way to configure rewrite would be good. But then we are very
firmly in the extension territory. This would basically need a separate
APP-like object.
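
For illustration, the "invert the prioritization map and pick the highest DSCP" behavior described above could be sketched like this. This is not the mlxsw code, just the idea; the function name and the dict representation of the APP rules are assumptions.

```python
def build_rewrite_map(dscp_to_prio: dict) -> dict:
    """Derive a priority->DSCP rewrite map from DSCP->priority rules.

    Several DSCP values may map to one priority, so the inverse is
    ambiguous. This sketch resolves it by keeping the highest DSCP
    value per priority, as described above.
    """
    rewrite = {}
    for dscp, prio in dscp_to_prio.items():
        if prio not in rewrite or dscp > rewrite[prio]:
            rewrite[prio] = dscp
    return rewrite


if __name__ == "__main__":
    # DSCP 10 and 12 both express priority 1; 46 expresses priority 5.
    print(build_rewrite_map({10: 1, 12: 1, 46: 5}))
```

Priorities that no rule refers to get no rewrite entry at all, which is one more reason why an explicitly configured rewrite table would be preferable.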
> So far only subtle changes. Now how do you see trust going in. Can you
> elaborate a little on the policy selector you mentioned?
Sure. In my mind the policy is an array that describes the order in which
APP rules are applied. "default" is implicitly last.
So "trust DSCP" has a policy of just [DSCP]. "Trust PCP" of [PCP].
"Trust DSCP, then PCP" of [DSCP, PCP]. "Trust port" (i.e. just default)
is simply []. Etc.
Individual drivers validate whether their device can implement the
policy.
I expect most devices to really just support the DSCP and PCP parts, but
this is flexible in allowing more general configuration in devices that
allow it.
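
As a rough model of what I mean, the lookup could behave like the sketch below. All names here are made up for illustration. It also assumes that a selector with no matching rule falls through to the next one in the policy, which is one possible semantic and would need to be agreed on.

```python
def classify(policy, rules, packet, port_default):
    """Return the priority for a packet under a selector policy.

    policy:       ordered list of selector names, e.g. ["dscp", "pcp"]
    rules:        per-selector APP rules, e.g. {"dscp": {46: 5}}
    packet:       field values present in the packet, e.g. {"pcp": 3}
    port_default: the implicit last resort ("default" APP rule)
    """
    for selector in policy:
        if selector not in packet:
            continue  # e.g. an untagged packet has no PCP to trust
        prio = rules.get(selector, {}).get(packet[selector])
        if prio is not None:
            return prio
    return port_default


def validate_policy(policy, supported):
    """Driver-side check: no duplicates, only selectors the device handles."""
    return len(set(policy)) == len(policy) and all(s in supported for s in policy)


if __name__ == "__main__":
    rules = {"dscp": {46: 5}, "pcp": {3: 3, 11: 2}}
    pkt = {"dscp": 46, "pcp": 3}
    print(classify(["dscp", "pcp"], rules, pkt, 0))  # trust DSCP, then PCP
    print(classify(["pcp"], rules, pkt, 0))          # trust PCP
    print(classify([], rules, pkt, 0))               # trust port
```

A device that only does DSCP and PCP would reject anything else in validate_policy(), while more capable hardware could accept longer lists.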
ABI-wise it is tempting to reuse APP to assign priority to selectors in
the same way that it currently assigns priority to field values:
    # dcb app replace dev X sel-prio dscp:2 pcp:1
But that feels like a hack. It will probably be better to have a
dedicated object for this:
    # dcb app-policy set dev X sel-order dscp pcp
This can be sliced in different ways that we can bikeshed to death
later. Does the above basically address your request?