* switchdev offload & ecmp
@ 2017-05-15 14:25 Nicolas Dichtel
  2017-05-15 16:40 ` Ido Schimmel
  0 siblings, 1 reply; 5+ messages in thread
From: Nicolas Dichtel @ 2017-05-15 14:25 UTC (permalink / raw)
  To: Jiri Pirko, Ido Schimmel; +Cc: Nikolay Aleksandrov, Roopa Prabhu, netdev

Hi Jiri and Ido,

I'm trying to understand how ecmp offloading works. It seems that rocker doesn't
support it:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/rocker/rocker_ofdpa.c#n2409.
But I saw that the support was added in spectrum:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/?h=684a95c064fc.

Is the kernel's ecmp algorithm consistent with the one used by spectrum?
I suspect that there can be scenarios where some packets of a flow are forwarded
by the driver and some others are forwarded by the kernel.
For example, an ecmp route with two nexthops: a connected route and a gw? In
that case, the periodic nexthops update
(https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c#n987)
won't help. How do you ensure that all packets of the flow are always forwarded
through the same nexthop?


Regards,
Nicolas


* Re: switchdev offload & ecmp
  2017-05-15 14:25 switchdev offload & ecmp Nicolas Dichtel
@ 2017-05-15 16:40 ` Ido Schimmel
  2017-05-16 12:57   ` Nicolas Dichtel
  0 siblings, 1 reply; 5+ messages in thread
From: Ido Schimmel @ 2017-05-15 16:40 UTC (permalink / raw)
  To: Nicolas Dichtel; +Cc: Jiri Pirko, Nikolay Aleksandrov, Roopa Prabhu, netdev

Hi,

On Mon, May 15, 2017 at 04:25:43PM +0200, Nicolas Dichtel wrote:
> Hi Jiri and Ido,
> 
> I'm trying to understand how ecmp offloading works. It seems that rocker doesn't
> support it:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/rocker/rocker_ofdpa.c#n2409.
> But I saw that the support was added in spectrum:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/?h=684a95c064fc.
> 
> Is the kernel's ecmp algorithm consistent with the one used by spectrum?

We currently use the hardware's defaults for ECMP hashing, which include
both L3 and L4 fields. I'm aware of Nik's patch, but we've yet to
reflect that. Note that the L4 fields aren't considered for fragmented
packets.
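
To illustrate the behaviour described above, here is a minimal sketch of
hash-based nexthop selection that uses L3 and L4 fields and drops the L4
fields for fragments. It is not the device's actual hash: the CRC32 hash,
the exact field set, and the names `Packet` and `select_nexthop` are
assumptions made only for this example.

```python
import zlib
from typing import NamedTuple, Sequence


class Packet(NamedTuple):
    """Hypothetical packet descriptor holding the fields used for hashing."""
    src_ip: str
    dst_ip: str
    proto: int
    src_port: int
    dst_port: int
    is_fragment: bool = False


def select_nexthop(pkt: Packet, nexthops: Sequence[str]) -> str:
    """Pick one nexthop of an ECMP group by hashing the packet headers."""
    # L3 fields are always part of the hash key.
    key = f"{pkt.src_ip}|{pkt.dst_ip}|{pkt.proto}"
    # L4 fields are only considered for non-fragmented packets.
    if not pkt.is_fragment:
        key += f"|{pkt.src_port}|{pkt.dst_port}"
    # CRC32 is an illustrative stand-in for the hardware hash function.
    return nexthops[zlib.crc32(key.encode()) % len(nexthops)]
```

With this scheme every packet of an unfragmented flow maps to the same
nexthop, and all fragments between the same two hosts map to one nexthop
regardless of ports. If the kernel hashes different fields, it may pick a
different nexthop for the same flow.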

> I suspect that there can be scenarios where some packets of a flow are forwarded
> by the driver and some others are forwarded by the kernel.

Can you elaborate? The kernel only sees specific packets, which were
trapped to the CPU. See:
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/tree/drivers/net/ethernet/mellanox/mlxsw/spectrum.c#n2996

> For example, an ecmp route with two nexthops: a connected route and a gw? 

Not sure I'm following you. A packet will either hit a remote route or a
directly connected one. We distinguish between the two based on the
scope of the first nexthop in the group. See:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c#n2043

> In that case, the periodic nexthops update
> (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c#n987)
> won't help. How do you ensure that all packets of the flow are always forwarded
> through the same nexthop?

I don't think we can ensure that for a flow in which some packets are
forwarded by the kernel and some by the device, but I failed to
understand your example of such a flow.

Thanks


* Re: switchdev offload & ecmp
  2017-05-15 16:40 ` Ido Schimmel
@ 2017-05-16 12:57   ` Nicolas Dichtel
  2017-05-16 14:11     ` Ido Schimmel
  0 siblings, 1 reply; 5+ messages in thread
From: Nicolas Dichtel @ 2017-05-16 12:57 UTC (permalink / raw)
  To: Ido Schimmel; +Cc: Jiri Pirko, Nikolay Aleksandrov, Roopa Prabhu, netdev

On 15/05/2017 at 18:40, Ido Schimmel wrote:
[snip]
>> Is the kernel's ecmp algorithm consistent with the one used by spectrum?
> 
> We currently use the hardware's defaults for ECMP hashing, which include
> both L3 and L4 fields. I'm aware of Nik's patch, but we've yet to
> reflect that. Note that the L4 fields aren't considered for fragmented
> packets.
Ok.

> 
>> I suspect that there can be scenarios where some packets of a flow are forwarded
>> by the driver and some others are forwarded by the kernel.
> 
> Can you elaborate? The kernel only sees specific packets, which were
> trapped to the CPU. See:
> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/tree/drivers/net/ethernet/mellanox/mlxsw/spectrum.c#n2996
Ok, this part was not clear to me, thank you for the pointer.

So, when an ARP resolution is needed, the packets are not trapped to the CPU and
the device manages the queue itself?

> 
>> For example, an ecmp route with two nexthops: a connected route and a gw? 
> 
> Not sure I'm following you. A packet will either hit a remote route or a
> directly connected one. We distinguish between the two based on the
> scope of the first nexthop in the group. See:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c#n2043
> 
>> In that case, the periodic nexthops update
>> (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c#n987)
>> won't help. How do you ensure that all packets of the flow are always forwarded
>> through the same nexthop?
> 
> I don't think we can ensure that for a flow in which some packets are
> forwarded by the kernel and some by the device, but I failed to
> understand your example of such a flow.
I was trying to understand whether the nexthop choice is always the same in the
kernel and in the device. I was also trying to understand whether it's possible
to have some packets of a flow routed by the kernel and some others by the device.


Thank you for the answers,
Nicolas


* Re: switchdev offload & ecmp
  2017-05-16 12:57   ` Nicolas Dichtel
@ 2017-05-16 14:11     ` Ido Schimmel
  2017-05-16 20:22       ` Nicolas Dichtel
  0 siblings, 1 reply; 5+ messages in thread
From: Ido Schimmel @ 2017-05-16 14:11 UTC (permalink / raw)
  To: Nicolas Dichtel; +Cc: Jiri Pirko, Nikolay Aleksandrov, Roopa Prabhu, netdev

On Tue, May 16, 2017 at 02:57:47PM +0200, Nicolas Dichtel wrote:
> >> I suspect that there can be scenarios where some packets of a flow are forwarded
> >> by the driver and some others are forwarded by the kernel.
> > 
> > Can you elaborate? The kernel only sees specific packets, which were
> > trapped to the CPU. See:
> > https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/tree/drivers/net/ethernet/mellanox/mlxsw/spectrum.c#n2996
> Ok, this part was not clear to me, thank you for the pointer.
> 
> So, when an ARP resolution is needed, the packets are not trapped to the CPU and
> the device manages the queue itself?

There are two cases here. If you need an ARP resolution following a hit
of a directly connected route and this neighbour isn't in the device's
table, then the packet is trapped (HOST_MISS_IPV4 in above list) to the CPU
and triggers ARP resolution in the kernel. Eventually a NETEVENT will be
sent and the neighbour will be programmed to the device.

If you need an ARP resolution of a nexthop, then this is a bit
different. If you have an ECMP group with several nexthops, then once
one of them is resolved, packets will be forwarded using it. To make
sure other nexthops will also be resolved we try to periodically refresh
them. Otherwise packets will always be forwarded using a single nexthop,
as the kernel won't have motivation to resolve the others.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c#n987

In case no nexthops can be resolved, then packets will be trapped to the
CPU (RTR_INGRESS0 in above list) and forwarded by the kernel.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c#n1896
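
The cases above can be summarised in a small sketch. It only models the
decision described in this message and is not the mlxsw code: the `Action`
enum and both function names are hypothetical, and only the trap names
HOST_MISS_IPV4 and RTR_INGRESS0 come from the list referenced earlier.

```python
from enum import Enum
from typing import Sequence


class Action(Enum):
    """Hypothetical outcomes for a packet hitting an offloaded route."""
    FORWARD_IN_DEVICE = "forwarded by the device"
    TRAP_HOST_MISS_IPV4 = "trapped to CPU (HOST_MISS_IPV4)"
    TRAP_RTR_INGRESS0 = "trapped to CPU (RTR_INGRESS0)"


def connected_route_action(neigh_in_device: bool) -> Action:
    """Directly connected route: trap only if the neighbour is unknown."""
    if neigh_in_device:
        return Action.FORWARD_IN_DEVICE
    # The trapped packet triggers ARP resolution in the kernel; the
    # neighbour is later programmed to the device.
    return Action.TRAP_HOST_MISS_IPV4


def ecmp_route_action(nexthop_resolved: Sequence[bool]) -> Action:
    """Remote route: forward as soon as any nexthop of the group is resolved."""
    if any(nexthop_resolved):
        # Unresolved nexthops are refreshed periodically in the background.
        return Action.FORWARD_IN_DEVICE
    # No usable nexthop: the kernel forwards the packet instead.
    return Action.TRAP_RTR_INGRESS0
```

For example, `ecmp_route_action([True, False])` yields
`Action.FORWARD_IN_DEVICE`: traffic uses the single resolved nexthop until
the periodic refresh resolves the other one.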


* Re: switchdev offload & ecmp
  2017-05-16 14:11     ` Ido Schimmel
@ 2017-05-16 20:22       ` Nicolas Dichtel
  0 siblings, 0 replies; 5+ messages in thread
From: Nicolas Dichtel @ 2017-05-16 20:22 UTC (permalink / raw)
  To: Ido Schimmel; +Cc: Jiri Pirko, Nikolay Aleksandrov, Roopa Prabhu, netdev

On 16/05/2017 at 16:11, Ido Schimmel wrote:
> On Tue, May 16, 2017 at 02:57:47PM +0200, Nicolas Dichtel wrote:
>>>> I suspect that there can be scenarios where some packets of a flow are forwarded
>>>> by the driver and some others are forwarded by the kernel.
>>>
>>> Can you elaborate? The kernel only sees specific packets, which were
>>> trapped to the CPU. See:
>>> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/tree/drivers/net/ethernet/mellanox/mlxsw/spectrum.c#n2996
>> Ok, this part was not clear to me, thank you for the pointer.
>>
>> So, when an ARP resolution is needed, the packets are not trapped to the CPU and
>> the device manages the queue itself?
> 
> There are two cases here. If you need an ARP resolution following a hit
> of a directly connected route and this neighbour isn't in the device's
> table, then the packet is trapped (HOST_MISS_IPV4 in above list) to the CPU
> and triggers ARP resolution in the kernel. Eventually a NETEVENT will be
> sent and the neighbour will be programmed to the device.
> 
> If you need an ARP resolution of a nexthop, then this is a bit
> different. If you have an ECMP group with several nexthops, then once
> one of them is resolved, packets will be forwarded using it. To make
> sure other nexthops will also be resolved we try to periodically refresh
> them. Otherwise packets will always be forwarded using a single nexthop,
> as the kernel won't have motivation to resolve the others.
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c#n987
> 
> In case no nexthops can be resolved, then packets will be trapped to the
> CPU (RTR_INGRESS0 in above list) and forwarded by the kernel.
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c#n1896
> 
Ok, thank you for the details.

Regards,
Nicolas

