* mlx5e question about PF fwd packets to PF
From: Tonghao Zhang @ 2019-12-31  8:34 UTC
  To: Saeed Mahameed, Roi Dayan; +Cc: Linux Kernel Network Developers

In one case, we want to forward packets from one PF to the other PF in
switchdev mode.

The tc flower rule can be installed, but it does not work:
tc filter add dev $PF0 protocol all parent ffff: prio 1 handle 1
flower action mirred egress redirect dev $PF1

# tc -d -s filter show dev $PF0 ingress
filter protocol all pref 1 flower chain 0
filter protocol all pref 1 flower chain 0 handle 0x1
  in_hw
action order 1: mirred (Egress Redirect to device enp130s0f1) stolen
  index 1 ref 1 bind 1 installed 19 sec used 0 sec
  Action statistics:
Sent 3206840 bytes 32723 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0

But I can't see any packets with:
mlnx_perf -i $PF1

The kernel version is 5.4.0+ and the NIC is "MT27800 Family [ConnectX-5]".


* Re: mlx5e question about PF fwd packets to PF
From: Or Gerlitz @ 2019-12-31 20:40 UTC
  To: Tonghao Zhang; +Cc: Saeed Mahameed, Roi Dayan, Linux Kernel Network Developers

On Tue, Dec 31, 2019 at 10:39 AM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:

> In one case, we want to forward packets from one PF to the other PF in switchdev mode.

Did you want to say from one uplink to the other uplink? -- this is
not supported.

What we do support is the following (I think you do it by now):

PF0.uplink --> esw --> PF0.VFx --> hairpin --> PF1.VFy --> esw --> PF1.uplink

Hairpin is an offload for a SW gateway; a SW gateway is an **application** that
runs over two NIC ports -- we allow them to be virtual NIC ports -- PF0.VFx/PF1.VFy.

Since e-switch != (SW) gateway, eswitch offload != (SW) gateway offload.

Note that, steering-rules-wise:

PF0.uplink --> T1 --> PF0.VFx --> T2 --> PF1.VFy --> T3 --> PF1.uplink

Since you instantiate an eswitch on the system, the T(ype)1 and T(ype)3 rules
are the ones that differentiate the packets that belong to this GW. The
T(ype)2 rules, however, can just be "fwd everything" -- TC-wise you can even
mask out the ethertype, i.e. a single tc/flower rule that forwards everything
from the ingress PF0.VFx vNIC to the egress PF1.VFy vNIC.
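For illustration, a minimal sketch of such a T2 rule (assuming $VF0 and $VF1
are placeholder names for the PF0.VFx and PF1.VFy vNIC netdevs used by the
gateway application):

# tc filter add dev $VF0 protocol all parent ffff: prio 1 flower skip_sw \
    action mirred egress redirect dev $VF1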

Further, you can also use this match-all (but with flower..) rule for the
PF1.VFy --> PF1.uplink part of the chain, since you know that everything that
originates in this VF should go to the uplink.
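Similarly, a rough sketch of that PF1.VFy --> PF1.uplink rule (assuming
$VF1_REP is a placeholder for the PF1.VFy representor netdev and $PF1 is the
PF1 uplink netdev):

# tc filter add dev $VF1_REP protocol all parent ffff: prio 1 flower skip_sw \
    action mirred egress redirect dev $PF1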

Hence the claim here is: if PF0.uplink --> hairpin --> PF1.uplink were
supported and the system had N steering rules, then with what is currently
supported you need N+2 rules -- the same N rules plus one T2 rule and one
T3 rule.


* Re: mlx5e question about PF fwd packets to PF
From: Tonghao Zhang @ 2020-01-02  3:03 UTC
  To: Or Gerlitz; +Cc: Saeed Mahameed, Roi Dayan, Linux Kernel Network Developers

On Wed, Jan 1, 2020 at 4:40 AM Or Gerlitz <gerlitz.or@gmail.com> wrote:
>
> On Tue, Dec 31, 2019 at 10:39 AM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
>
> > In one case, we want to forward packets from one PF to the other PF in switchdev mode.
>
> Did you want to say from one uplink to the other uplink? -- this is
> not supported.
Yes, I tried to install one rule, hoping that one uplink could forward the
packets to the other PF uplink.
But the rule is installed successfully, and the counter of the rule changes,
as shown below:

# tc filter add dev $PF0 protocol all parent ffff: prio 1 handle 1
flower action mirred egress redirect dev $PF1

# tc -d -s filter show dev $PF0 ingress
filter protocol all pref 1 flower chain 0
filter protocol all pref 1 flower chain 0 handle 0x1
  in_hw
action order 1: mirred (Egress Redirect to device enp130s0f1) stolen
  index 1 ref 1 bind 1 installed 19 sec used 0 sec
  Action statistics:
Sent 3206840 bytes 32723 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0

The PF1 uplink doesn't send the packets out (as you say, we don't support it
now). If we don't support it, should we return an error such as -EOPNOTSUPP
when we install the hairpin rule between the PF uplinks? It confuses me.
> What we do support is the following (I think you do it by now):
>
> PF0.uplink --> esw --> PF0.VFx --> hairpin --> PF1.VFy --> esw --> PF1.uplink
Yes, I have tested it, and it works fine for us.
> Hairpin is an offload for a SW gateway; a SW gateway is an **application** that
> runs over two NIC ports -- we allow them to be virtual NIC ports -- PF0.VFx/PF1.VFy.
>
> Since e-switch != (SW) gateway, eswitch offload != (SW) gateway offload.
>
> Note that, steering-rules-wise:
>
> PF0.uplink --> T1 --> PF0.VFx --> T2 --> PF1.VFy --> T3 --> PF1.uplink
>
> Since you instantiate an eswitch on the system, the T(ype)1 and T(ype)3 rules
> are the ones that differentiate the packets that belong to this GW. The
> T(ype)2 rules, however, can just be "fwd everything" -- TC-wise you can even
> mask out the ethertype, i.e. a single tc/flower rule that forwards everything
> from the ingress PF0.VFx vNIC to the egress PF1.VFy vNIC.
>
> Further, you can also use this match-all (but with flower..) rule for the
> PF1.VFy --> PF1.uplink part of the chain, since you know that everything that
> originates in this VF should go to the uplink.
>
> Hence the claim here is: if PF0.uplink --> hairpin --> PF1.uplink were
> supported
Do we have a plan to support that function?
> and the system had N steering rules, then with what is currently supported
> you need N+2 rules -- the same N rules plus one T2 rule and one T3 rule.


* Re: mlx5e question about PF fwd packets to PF
From: Or Gerlitz @ 2020-01-02  7:53 UTC
  To: Tonghao Zhang; +Cc: Saeed Mahameed, Roi Dayan, Linux Kernel Network Developers

On Thu, Jan 2, 2020 at 5:04 AM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
>
> On Wed, Jan 1, 2020 at 4:40 AM Or Gerlitz <gerlitz.or@gmail.com> wrote:
> > On Tue, Dec 31, 2019 at 10:39 AM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> > > In one case, we want to forward packets from one PF to the other PF in switchdev mode.


>
> > Did you want to say from one uplink to the other uplink? -- this is not supported.
> Yes, I tried to install one rule, hoping that one uplink could forward the
> packets to the other PF uplink.
> But the rule is installed successfully, and the counter of the rule changes,
> as shown below:
>
> # tc filter add dev $PF0 protocol all parent ffff: prio 1 handle 1
> flower action mirred egress redirect dev $PF1
>

You didn't ask for skip_sw; if you install a rule with "none" (the default) and
adding it to HW fails, the rule is still fine in the SW data-path.
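For example, a sketch based on your command -- with skip_sw the add should
fail with an error when the driver can't offload the rule, instead of silently
falling back to the SW data-path:

# tc filter add dev $PF0 protocol all parent ffff: prio 1 handle 1 flower \
    skip_sw action mirred egress redirect dev $PF1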

>
> # tc -d -s filter show dev $PF0 ingress
> filter protocol all pref 1 flower chain 0
> filter protocol all pref 1 flower chain 0 handle 0x1
>   in_hw


This (in_hw) seems to be a bug; we don't support it AFAIK.

> action order 1: mirred (Egress Redirect to device enp130s0f1) stolen
>   index 1 ref 1 bind 1 installed 19 sec used 0 sec
>   Action statistics:
> Sent 3206840 bytes 32723 pkt (dropped 0, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0


I think newish kernels and iproute2 (for about a year now, or maybe more) have
per-data-path (SW/HW) rule traffic counters -- these would help you realize
what is going on down there.
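For example, the same show command you used prints separate per-data-path
counters on a recent enough tc:

# tc -d -s filter show dev $PF0 ingress
(look for the "Sent software ..." and "Sent hardware ..." lines in the action
statistics)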

>
> The PF1 uplink doesn't send the packets out (as you say, we don't support it
> now). If we don't support it, should we return an error such as -EOPNOTSUPP
> when we install the hairpin rule between the PF uplinks? It confuses me.


Indeed, but only if you use skip_sw.

Still, the in_hw indication suggests there is a driver bug.


>
> > What we do support is the following (I think you do it by now):
> > PF0.uplink --> esw --> PF0.VFx --> hairpin --> PF1.VFy --> esw --> PF1.uplink


>
> Yes, I have tested it, and it works fine for us.


cool, so production can keep using these rules..


>
> > Hence the claim here is: if PF0.uplink --> hairpin --> PF1.uplink were
> > supported


>
> Do we have a plan to support that function?


I don't think so. What is the need? Is something wrong with the N+2 rules I
suggested?


* Re: mlx5e question about PF fwd packets to PF
From: Tonghao Zhang @ 2020-01-02  9:31 UTC
  To: Or Gerlitz; +Cc: Saeed Mahameed, Roi Dayan, Linux Kernel Network Developers

On Thu, Jan 2, 2020 at 3:50 PM Or Gerlitz <gerlitz.or@gmail.com> wrote:
>
> On Thu, Jan 2, 2020 at 5:04 AM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
>>
>> On Wed, Jan 1, 2020 at 4:40 AM Or Gerlitz <gerlitz.or@gmail.com> wrote:
>> > On Tue, Dec 31, 2019 at 10:39 AM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
>
>
>>
>> >> In one case, we want to forward packets from one PF to the other PF in switchdev mode.
>
>
>>
>> > Did you want to say from one uplink to the other uplink? -- this is not supported.
>
>
>>
>> Yes, I tried to install one rule, hoping that one uplink could forward the
>> packets to the other PF uplink.
>
>
>
> this is not supported
>
>
>>
>> But the rule is installed successfully, and the counter of the rule changes,
>> as shown below:
>
>
>>
>> # tc filter add dev $PF0 protocol all parent ffff: prio 1 handle 1
>> flower action mirred egress redirect dev $PF1
>
>
> You didn't ask for skip_sw; if you install a rule with "none" (the default) and
> adding it to HW fails, the rule is still fine in the SW data-path.
>
>
>>
>> # tc -d -s filter show dev $PF0 ingress
>> filter protocol all pref 1 flower chain 0
>> filter protocol all pref 1 flower chain 0 handle 0x1
>>   in_hw
>
>
> This (in_hw) seems to be a bug; we don't support it AFAIK.
>
>> action order 1: mirred (Egress Redirect to device enp130s0f1) stolen
>>   index 1 ref 1 bind 1 installed 19 sec used 0 sec
>>   Action statistics:
>> Sent 3206840 bytes 32723 pkt (dropped 0, overlimits 0 requeues 0)
>> backlog 0b 0p requeues 0
>
>
> I think newish kernels and iproute2 (for about a year now, or maybe more) have
> per-data-path (SW/HW) rule traffic counters -- these would help you realize
> what is going on down there.
Hi Or,
Thanks for answering my question.
I added the "skip_sw" option to the tc command and updated tc to the upstream
version; it runs successfully:
# tc filter add dev $PF0 protocol all parent ffff: prio 1 handle 1
flower skip_sw action mirred egress redirect dev $PF1
# tc -d -s filter show dev $PF0 ingress
filter protocol all pref 1 flower chain 0
filter protocol all pref 1 flower chain 0 handle 0x1
  skip_sw
  in_hw in_hw_count 1
action order 1: mirred (Egress Redirect to device enp130s0f1) stolen
  index 1 ref 1 bind 1 installed 42 sec used 0 sec
  Action statistics:
Sent 408954 bytes 4173 pkt (dropped 0, overlimits 0 requeues 0)
Sent software 0 bytes 0 pkt
Sent hardware 408954 bytes 4173 pkt
backlog 0b 0p requeues 0

>>
>> The PF1 uplink doesn't send the packets out (as you say, we don't support it
>> now). If we don't support it, should we return an error such as -EOPNOTSUPP
>> when we install the hairpin rule between the PF uplinks? It confuses me.
>
>
> Indeed, but only if you use skip_sw.
>
> Still, the in_hw indication suggests there is a driver bug.
>
>
>>
>> > What we do support is the following (I think you do it by now):
>> > PF0.uplink --> esw --> PF0.VFx --> hairpin --> PF1.VFy --> esw --> PF1.uplink
>
>
>>
>> Yes, I have tested it, and it works fine for us.
>
>
> cool, so production can keep using these rules..
>
>
>>
>> > Hence the claim here is: if PF0.uplink --> hairpin --> PF1.uplink were
>> > supported
>
>
>>
>> Do we have a plan to support that function?
>
>
> I don't think so. What is the need? Is something wrong with the N+2 rules I suggested?
N+2 works fine. I am doing some research on OVS offload with the Mellanox NIC.
I added the uplinks of PF0 and PF1 to OVS, and it can offload the rule (PF0 to
PF1, which I reproduced with the tc commands above) to hardware, but the NIC
can't send the packets out.
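For reference, roughly the OVS setup I used (hw-offload also needs an OVS
restart to take effect; br0 and the device names are just placeholders):

# ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
# ovs-vsctl add-br br0
# ovs-vsctl add-port br0 $PF0
# ovs-vsctl add-port br0 $PF1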
>
>>
>> > and the system had N steering rules, then with what is currently supported
>> > you need N+2 rules -- the same N rules plus one T2 rule and one T3 rule.


* Re: mlx5e question about PF fwd packets to PF
From: Or Gerlitz @ 2020-01-02  9:38 UTC
  To: Tonghao Zhang; +Cc: Saeed Mahameed, Roi Dayan, Linux Kernel Network Developers

On Thu, Jan 2, 2020 at 11:32 AM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> On Thu, Jan 2, 2020 at 3:50 PM Or Gerlitz <gerlitz.or@gmail.com> wrote:
> > On Thu, Jan 2, 2020 at 5:04 AM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> >> On Wed, Jan 1, 2020 at 4:40 AM Or Gerlitz <gerlitz.or@gmail.com> wrote:
> >> > On Tue, Dec 31, 2019 at 10:39 AM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:

> I add "skip_sw" option in tc command, and update the tc version to
> upstream, it run successfully:
> # tc filter add dev $PF0 protocol all parent ffff: prio 1 handle 1
> flower skip_sw action mirred egress redirect dev $PF1
> # tc -d -s filter show dev $PF0 ingress
> filter protocol all pref 1 flower chain 0
> filter protocol all pref 1 flower chain 0 handle 0x1
>   skip_sw
>   in_hw in_hw_count 1

As I said, in_hw seems like a bug

> action order 1: mirred (Egress Redirect to device enp130s0f1) stolen
>   index 1 ref 1 bind 1 installed 42 sec used 0 sec
>   Action statistics:
> Sent 408954 bytes 4173 pkt (dropped 0, overlimits 0 requeues 0)
> Sent software 0 bytes 0 pkt
> Sent hardware 408954 bytes 4173 pkt
> backlog 0b 0p requeues 0

> > I don't think so. What is the need? Is something wrong with the N+2 rules I suggested?

> N+2 works fine.

good!

> I am doing some research on OVS offload with the Mellanox NIC.

cool

> I added the uplinks of PF0 and PF1 to OVS, and it can offload the rule (PF0 to
> PF1, which I reproduced with the tc commands above) to hardware, but the NIC
> can't send the packets out.

We don't offload that, and we should return an error on the tc command.

