From: Rainer Sickinger <rainersickinger.official@gmail.com>
To: Alexander Lobakin <alobakin@dlink.ru>
Cc: "David S. Miller" <davem@davemloft.net>,
Muciri Gatimu <muciri@openmesh.com>,
Shashidhar Lakkavalli <shashidhar.lakkavalli@openmesh.com>,
John Crispin <john@phrozen.org>, Andrew Lunn <andrew@lunn.ch>,
Vivien Didelot <vivien.didelot@gmail.com>,
Florian Fainelli <f.fainelli@gmail.com>,
Stanislav Fomichev <sdf@google.com>,
Daniel Borkmann <daniel@iogearbox.net>,
Song Liu <songliubraving@fb.com>,
Alexei Starovoitov <ast@kernel.org>,
Matteo Croce <mcroce@redhat.com>,
Jakub Sitnicki <jakub@cloudflare.com>,
Eric Dumazet <edumazet@google.com>,
Paul Blakey <paulb@mellanox.com>,
Yoshiki Komachi <komachi.yoshiki@gmail.com>,
netdev@vger.kernel.org, lkml <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH net] net: dsa: fix flow dissection on Tx path
Date: Fri, 6 Dec 2019 20:32:29 +0100
Message-ID: <CAD9cdQ68Ye8wuO5N_PZG04htFeqFPCMuNgYWJVRdnHw47fnqqA@mail.gmail.com>
In-Reply-To: <20191205100235.14195-1-alobakin@dlink.ru>
That is a really great improvement!
On Thu, 5 Dec 2019 at 11:04, Alexander Lobakin <alobakin@dlink.ru> wrote:
>
> Commit 43e665287f93 ("net-next: dsa: fix flow dissection") added an
> ability to override protocol and network offset during flow dissection
> for DSA-enabled devices (i.e. controllers shipped as switch CPU ports)
> in order to fix skb hashing for RPS on Rx path.
>
> However, skb_hash() and added part of code can be invoked not only on
> Rx, but also on Tx path if we have a multi-queued device and:
> - kernel is running on UP system or
> - XPS is not configured.
>
> The call stack in these two cases will be like: dev_queue_xmit() ->
> __dev_queue_xmit() -> netdev_core_pick_tx() -> netdev_pick_tx() ->
> skb_tx_hash() -> skb_get_hash().
>
> The problem is that skbs queued for Tx have both network offset and
> correct protocol already set up even after inserting a CPU tag by DSA
> tagger, so calling tag_ops->flow_dissect() on this path actually only
> breaks flow dissection and hashing.
>
> This can be observed by adding debug prints just before and right after
> tag_ops->flow_dissect() call to the related block of code:
>
> Before the patch:
>
> Rx path (RPS):
>
> [ 19.240001] Rx: proto: 0x00f8, nhoff: 0 /* ETH_P_XDSA */
> [ 19.244271] tag_ops->flow_dissect()
> [ 19.247811] Rx: proto: 0x0800, nhoff: 8 /* ETH_P_IP */
>
> [ 19.215435] Rx: proto: 0x00f8, nhoff: 0 /* ETH_P_XDSA */
> [ 19.219746] tag_ops->flow_dissect()
> [ 19.223241] Rx: proto: 0x0806, nhoff: 8 /* ETH_P_ARP */
>
> [ 18.654057] Rx: proto: 0x00f8, nhoff: 0 /* ETH_P_XDSA */
> [ 18.658332] tag_ops->flow_dissect()
> [ 18.661826] Rx: proto: 0x8100, nhoff: 8 /* ETH_P_8021Q */
>
> Tx path (UP system):
>
> [ 18.759560] Tx: proto: 0x0800, nhoff: 26 /* ETH_P_IP */
> [ 18.763933] tag_ops->flow_dissect()
> [ 18.767485] Tx: proto: 0x920b, nhoff: 34 /* junk */
>
> [ 22.800020] Tx: proto: 0x0806, nhoff: 26 /* ETH_P_ARP */
> [ 22.804392] tag_ops->flow_dissect()
> [ 22.807921] Tx: proto: 0x920b, nhoff: 34 /* junk */
>
> [ 16.898342] Tx: proto: 0x86dd, nhoff: 26 /* ETH_P_IPV6 */
> [ 16.902705] tag_ops->flow_dissect()
> [ 16.906227] Tx: proto: 0x920b, nhoff: 34 /* junk */
>
> After:
>
> Rx path (RPS):
>
> [ 16.520993] Rx: proto: 0x00f8, nhoff: 0 /* ETH_P_XDSA */
> [ 16.525260] tag_ops->flow_dissect()
> [ 16.528808] Rx: proto: 0x0800, nhoff: 8 /* ETH_P_IP */
>
> [ 15.484807] Rx: proto: 0x00f8, nhoff: 0 /* ETH_P_XDSA */
> [ 15.490417] tag_ops->flow_dissect()
> [ 15.495223] Rx: proto: 0x0806, nhoff: 8 /* ETH_P_ARP */
>
> [ 17.134621] Rx: proto: 0x00f8, nhoff: 0 /* ETH_P_XDSA */
> [ 17.138895] tag_ops->flow_dissect()
> [ 17.142388] Rx: proto: 0x8100, nhoff: 8 /* ETH_P_8021Q */
>
> Tx path (UP system):
>
> [ 15.499558] Tx: proto: 0x0800, nhoff: 26 /* ETH_P_IP */
>
> [ 20.664689] Tx: proto: 0x0806, nhoff: 26 /* ETH_P_ARP */
>
> [ 18.565782] Tx: proto: 0x86dd, nhoff: 26 /* ETH_P_IPV6 */
>
> In order to fix that we can add the check 'proto == htons(ETH_P_XDSA)'
> to prevent code from calling tag_ops->flow_dissect() on Tx.
> I also decided to initialize the 'offset' variable so tagger callbacks can
> now safely leave it untouched without causing chaos.
>
> Fixes: 43e665287f93 ("net-next: dsa: fix flow dissection")
> Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
> ---
> net/core/flow_dissector.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
> index 69395b804709..d524a693e00f 100644
> --- a/net/core/flow_dissector.c
> +++ b/net/core/flow_dissector.c
> @@ -969,9 +969,10 @@ bool __skb_flow_dissect(const struct net *net,
>  		nhoff = skb_network_offset(skb);
>  		hlen = skb_headlen(skb);
>  #if IS_ENABLED(CONFIG_NET_DSA)
> -		if (unlikely(skb->dev && netdev_uses_dsa(skb->dev))) {
> +		if (unlikely(skb->dev && netdev_uses_dsa(skb->dev) &&
> +			     proto == htons(ETH_P_XDSA))) {
>  			const struct dsa_device_ops *ops;
> -			int offset;
> +			int offset = 0;
>
>  			ops = skb->dev->dsa_ptr->tag_ops;
>  			if (ops->flow_dissect &&
> --
> 2.24.0
>
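
For anyone following along, the effect of the new check can be modelled in
user space. This is only a rough sketch, not the kernel code: struct fake_skb,
fake_tag_flow_dissect(), FAKE_TAG_LEN and the bytes_after_tag field are
hypothetical stand-ins invented for illustration, and protocol values are kept
in host byte order for simplicity. It assumes an 8-byte tag, matching the
nhoff delta in the logs above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* EtherType values from include/uapi/linux/if_ether.h; host byte order in this model. */
#define ETH_P_IP   0x0800
#define ETH_P_XDSA 0x00F8

/* Hypothetical tag length, chosen to match the nhoff delta of 8 in the logs. */
#define FAKE_TAG_LEN 8

/* Hypothetical stand-in for the few skb fields the dissector touches here. */
struct fake_skb {
	uint16_t proto;           /* models skb->protocol */
	int nhoff;                /* models the network header offset */
	bool uses_dsa;            /* models netdev_uses_dsa(skb->dev) */
	uint16_t bytes_after_tag; /* whatever the tagger reads where it expects the real EtherType */
};

/* Hypothetical tagger callback: reports the tag length and the protocol found behind it. */
static int fake_tag_flow_dissect(const struct fake_skb *skb, uint16_t *proto, int *offset)
{
	*offset = FAKE_TAG_LEN;
	*proto = skb->bytes_after_tag;
	return 0; /* 0 means success, as with the real callback */
}

/* Behaviour before the patch: the callback runs for every skb on a DSA device. */
static void dissect_before_patch(struct fake_skb *skb)
{
	assert(skb != NULL);
	if (skb->uses_dsa) {
		int offset;

		if (!fake_tag_flow_dissect(skb, &skb->proto, &offset))
			skb->nhoff += offset;
	}
}

/* Behaviour after the patch: the callback runs only while proto is still ETH_P_XDSA. */
static void dissect_after_patch(struct fake_skb *skb)
{
	assert(skb != NULL);
	if (skb->uses_dsa && skb->proto == ETH_P_XDSA) {
		int offset = 0; /* initialized, so a callback may leave it untouched */

		if (!fake_tag_flow_dissect(skb, &skb->proto, &offset))
			skb->nhoff += offset;
	}
}
```

Feeding a Tx-like skb (proto 0x0800, nhoff 26) through dissect_before_patch()
shifts nhoff to 34 and overwrites proto, while dissect_after_patch() leaves
both alone. An Rx-like skb (proto 0x00f8, nhoff 0) is still dissected in both
variants.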
Thread overview: 13+ messages
2019-12-05 10:02 [PATCH net] net: dsa: fix flow dissection on Tx path Alexander Lobakin
2019-12-05 12:58 ` Andrew Lunn
2019-12-05 13:34 ` Alexander Lobakin
2019-12-05 14:01 ` Andrew Lunn
2019-12-05 14:58 ` Alexander Lobakin
2019-12-06 3:32 ` Florian Fainelli
2019-12-06 7:37 ` Alexander Lobakin
2019-12-06 18:05 ` Florian Fainelli
2019-12-06 3:28 ` Florian Fainelli
2019-12-06 15:06 ` Alexander Lobakin
2019-12-06 19:32 ` Rainer Sickinger [this message]
2019-12-07 4:19 ` David Miller
2019-12-07 8:10 ` Alexander Lobakin