* switchdev offload & ecmp
From: Nicolas Dichtel @ 2017-05-15 14:25 UTC
To: Jiri Pirko, Ido Schimmel; +Cc: Nikolay Aleksandrov, Roopa Prabhu, netdev

Hi Jiri and Ido,

I'm trying to understand how ecmp offloading works. It seems that rocker doesn't
support it:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/rocker/rocker_ofdpa.c#n2409.
But I saw that the support was added in spectrum:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/?h=684a95c064fc.

Is there a consistency between the ecmp algorithm of the kernel and the one from
spectrum?
I suspect that there can be scenarios where some packets of a flow are forwarded
by the driver and some others are forwarded by the kernel.

For example, an ecmp route with two nexthops: a connected route and a gw?
In that case, the periodic nexthops update
(https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c#n987)
won't help. How do you ensure that all packets of the flow are always forwarded
through the same nexthop?

Regards,
Nicolas

* Re: switchdev offload & ecmp
From: Ido Schimmel @ 2017-05-15 16:40 UTC
To: Nicolas Dichtel; +Cc: Jiri Pirko, Nikolay Aleksandrov, Roopa Prabhu, netdev

Hi,

On Mon, May 15, 2017 at 04:25:43PM +0200, Nicolas Dichtel wrote:
> Hi Jiri and Ido,
>
> I'm trying to understand how ecmp offloading works. It seems that rocker doesn't
> support it:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/rocker/rocker_ofdpa.c#n2409.
> But I saw that the support was added in spectrum:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/?h=684a95c064fc.
>
> Is there a consistency between the ecmp algorithm of the kernel and the one from
> spectrum?

We currently use the hardware's defaults for ECMP hashing, which include
both L3 and L4 fields. I'm aware of Nik's patch, but we've yet to
reflect that. Note that the L4 fields aren't considered for fragmented
packets.

> I suspect that there can be scenarios where some packets of a flow are forwarded
> by the driver and some others are forwarded by the kernel.

Can you elaborate? The kernel only sees specific packets, which were
trapped to the CPU. See:
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/tree/drivers/net/ethernet/mellanox/mlxsw/spectrum.c#n2996

> For example, an ecmp route with two nexthops: a connected route and a gw?

Not sure I'm following you. A packet will either hit a remote route or a
directly connected one. We distinguish between the two based on the
scope of the first nexthop in the group. See:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c#n2043

> In that case, the periodic nexthops update
> (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c#n987)
> won't help. How do you ensure that all packets of the flow are always forwarded
> through the same nexthop?

I don't think we can ensure that for a flow in which some packets are
forwarded by the kernel and some by the device, but I failed to
understand your example of such a flow.

Thanks

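The hashing behaviour described in this reply (L3 and L4 fields, with the L4 fields left out for fragmented packets) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Packet` type, the `ecmp_hash_key` and `select_nexthop` helpers, and the use of CRC32 with a modulo are hypothetical and are not the hash the Spectrum hardware or the kernel actually uses.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical packet summary; field names are illustrative only."""
    src_ip: str
    dst_ip: str
    proto: int
    src_port: int = 0
    dst_port: int = 0
    is_fragment: bool = False


def ecmp_hash_key(pkt: Packet) -> bytes:
    """Build the hash input: L3 fields always, L4 fields only if not fragmented."""
    fields = [pkt.src_ip, pkt.dst_ip, str(pkt.proto)]
    if not pkt.is_fragment:
        # L4 ports are skipped for fragments, as described in the thread.
        fields += [str(pkt.src_port), str(pkt.dst_port)]
    return "|".join(fields).encode()


def select_nexthop(pkt: Packet, nexthops: list) -> str:
    """Pick one nexthop of the group. CRC32 is a stand-in, not the real hash."""
    if not nexthops:
        raise ValueError("ECMP group has no nexthops")
    return nexthops[zlib.crc32(ecmp_hash_key(pkt)) % len(nexthops)]
```

With such a scheme every packet of an unfragmented flow maps to the same nexthop as long as the group membership is stable, while fragments of flows that differ only in their ports share one hash key. Two forwarders only agree on the nexthop if they use the same fields, the same hash function and the same group ordering, which is the consistency question raised in the first message.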
* Re: switchdev offload & ecmp
From: Nicolas Dichtel @ 2017-05-16 12:57 UTC
To: Ido Schimmel; +Cc: Jiri Pirko, Nikolay Aleksandrov, Roopa Prabhu, netdev

On 15/05/2017 at 18:40, Ido Schimmel wrote:
[snip]
>> Is there a consistency between the ecmp algorithm of the kernel and the one from
>> spectrum?
>
> We currently use the hardware's defaults for ECMP hashing, which include
> both L3 and L4 fields. I'm aware of Nik's patch, but we've yet to
> reflect that. Note that the L4 fields aren't considered for fragmented
> packets.
Ok.

>
>> I suspect that there can be scenarios where some packets of a flow are forwarded
>> by the driver and some others are forwarded by the kernel.
>
> Can you elaborate? The kernel only sees specific packets, which were
> trapped to the CPU. See:
> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/tree/drivers/net/ethernet/mellanox/mlxsw/spectrum.c#n2996
Ok, this part was not clear to me, thank you for the pointer.

So, when an arp resolution is needed, the packets are not trapped to the CPU,
the device manages the queue itself?

>
>> For example, an ecmp route with two nexthops: a connected route and a gw?
>
> Not sure I'm following you. A packet will either hit a remote route or a
> directly connected one. We distinguish between the two based on the
> scope of the first nexthop in the group. See:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c#n2043
>
>> In that case, the periodic nexthops update
>> (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c#n987)
>> won't help. How do you ensure that all packets of the flow are always forwarded
>> through the same nexthop?
>
> I don't think we can ensure that for a flow in which some packets are
> forwarded by the kernel and some by the device, but I failed to
> understand your example of such a flow.
I was trying to understand if nexthop choice is always the same in the kernel
and in the device.
And I was also trying to understand if it's possible to have some packets of a
flow routed by the kernel and some others by the device.

Thank you for the answers,
Nicolas

* Re: switchdev offload & ecmp
From: Ido Schimmel @ 2017-05-16 14:11 UTC
To: Nicolas Dichtel; +Cc: Jiri Pirko, Nikolay Aleksandrov, Roopa Prabhu, netdev

On Tue, May 16, 2017 at 02:57:47PM +0200, Nicolas Dichtel wrote:
> >> I suspect that there can be scenarios where some packets of a flow are forwarded
> >> by the driver and some others are forwarded by the kernel.
> >
> > Can you elaborate? The kernel only sees specific packets, which were
> > trapped to the CPU. See:
> > https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/tree/drivers/net/ethernet/mellanox/mlxsw/spectrum.c#n2996
> Ok, this part was not clear to me, thank you for the pointer.
>
> So, when an arp resolution is needed, the packets are not trapped to the CPU,
> the device manages the queue itself?

There are two cases here. If you need an ARP resolution following a hit
of a directly connected route and this neighbour isn't in the device's
table, then the packet is trapped (HOST_MISS_IPV4 in the above list) to the
CPU and triggers ARP resolution in the kernel. Eventually a NETEVENT will be
sent and the neighbour will be programmed to the device.

If you need an ARP resolution of a nexthop, then this is a bit
different. If you have an ECMP group with several nexthops, then once
one of them is resolved, packets will be forwarded using it. To make
sure other nexthops will also be resolved we try to periodically refresh
them. Otherwise packets will always be forwarded using a single nexthop,
as the kernel won't have motivation to resolve the others.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c#n987

In case no nexthops can be resolved, then packets will be trapped to the
CPU (RTR_INGRESS0 in the above list) and forwarded by the kernel.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c#n1896

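The cases described in this reply can be summarised as a small decision sketch. It is only a model of the explanation above, not the mlxsw driver code: the `Action` enum, the three helper functions and the representation of the device's neighbour table as a plain set of IP strings are all hypothetical. Only the trap names HOST_MISS_IPV4 and RTR_INGRESS0 come from the thread.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical outcomes for a routed packet."""
    FORWARD_IN_HW = "forwarded by the device"
    TRAP_HOST_MISS_IPV4 = "trapped to CPU (HOST_MISS_IPV4)"
    TRAP_RTR_INGRESS0 = "trapped to CPU (RTR_INGRESS0)"


def connected_route_action(dst_ip, device_neighbours):
    """Directly connected route: a neighbour miss traps the packet so the
    kernel can run ARP resolution and later program the neighbour."""
    if dst_ip in device_neighbours:
        return Action.FORWARD_IN_HW
    return Action.TRAP_HOST_MISS_IPV4


def ecmp_route_action(nexthops, device_neighbours):
    """Remote route: forward using the resolved nexthops; if none is
    resolved, trap the packet and let the kernel forward it."""
    resolved = [nh for nh in nexthops if nh in device_neighbours]
    if resolved:
        return Action.FORWARD_IN_HW, resolved
    return Action.TRAP_RTR_INGRESS0, []


def nexthops_to_refresh(nexthops, device_neighbours):
    """Unresolved nexthops that a periodic refresh would try to resolve, so
    traffic does not stay pinned to a single resolved nexthop."""
    return [nh for nh in nexthops if nh not in device_neighbours]
```

In this model a flow is forwarded by the kernel only while its ECMP group has no resolved nexthop at all. Once a first nexthop resolves, the device takes over, which is the transition where the nexthop chosen for a flow could change.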
* Re: switchdev offload & ecmp
From: Nicolas Dichtel @ 2017-05-16 20:22 UTC
To: Ido Schimmel; +Cc: Jiri Pirko, Nikolay Aleksandrov, Roopa Prabhu, netdev

On 16/05/2017 at 16:11, Ido Schimmel wrote:
> On Tue, May 16, 2017 at 02:57:47PM +0200, Nicolas Dichtel wrote:
>>>> I suspect that there can be scenarios where some packets of a flow are forwarded
>>>> by the driver and some others are forwarded by the kernel.
>>>
>>> Can you elaborate? The kernel only sees specific packets, which were
>>> trapped to the CPU. See:
>>> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/tree/drivers/net/ethernet/mellanox/mlxsw/spectrum.c#n2996
>> Ok, this part was not clear to me, thank you for the pointer.
>>
>> So, when an arp resolution is needed, the packets are not trapped to the CPU,
>> the device manages the queue itself?
>
> There are two cases here. If you need an ARP resolution following a hit
> of a directly connected route and this neighbour isn't in the device's
> table, then the packet is trapped (HOST_MISS_IPV4 in the above list) to the
> CPU and triggers ARP resolution in the kernel. Eventually a NETEVENT will be
> sent and the neighbour will be programmed to the device.
>
> If you need an ARP resolution of a nexthop, then this is a bit
> different. If you have an ECMP group with several nexthops, then once
> one of them is resolved, packets will be forwarded using it. To make
> sure other nexthops will also be resolved we try to periodically refresh
> them. Otherwise packets will always be forwarded using a single nexthop,
> as the kernel won't have motivation to resolve the others.
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c#n987
>
> In case no nexthops can be resolved, then packets will be trapped to the
> CPU (RTR_INGRESS0 in the above list) and forwarded by the kernel.
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c#n1896
>
Ok, thank you for the details.

Regards,
Nicolas