From: Florian Fainelli <f.fainelli@gmail.com>
To: Alexander Lobakin <alobakin@dlink.ru>, Andrew Lunn <andrew@lunn.ch>
Cc: "David S. Miller" <davem@davemloft.net>,
	Muciri Gatimu <muciri@openmesh.com>,
	Shashidhar Lakkavalli <shashidhar.lakkavalli@openmesh.com>,
	John Crispin <john@phrozen.org>,
	Vivien Didelot <vivien.didelot@gmail.com>,
	Stanislav Fomichev <sdf@google.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Song Liu <songliubraving@fb.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Matteo Croce <mcroce@redhat.com>,
	Jakub Sitnicki <jakub@cloudflare.com>,
	Eric Dumazet <edumazet@google.com>,
	Paul Blakey <paulb@mellanox.com>,
	Yoshiki Komachi <komachi.yoshiki@gmail.com>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH net] net: dsa: fix flow dissection on Tx path
Date: Thu, 5 Dec 2019 19:32:38 -0800
Message-ID: <3cc7a0c3-4eeb-52d5-1777-f646329a9303@gmail.com>
In-Reply-To: <72a21c5f03abdc3d2d1c1bb85fd4489d@dlink.ru>



On 12/5/2019 6:58 AM, Alexander Lobakin wrote:
> Andrew Lunn wrote 05.12.2019 17:01:
>>> Hi,
>>>
>>> > What i'm missing here is an explanation why the flow dissector is
>>> > called here if the protocol is already set? It suggests there is a
>>> > case when the protocol is not correctly set, and we do need to look
>>> > into the frame?
>>>
>>> If we have a device with multiple Tx queues, but XPS is not configured
>>> or the kernel is running on a uniprocessor system, then the networking
>>> core selects a Tx queue based on the flow, to utilize as many Tx queues
>>> as possible without breaking frame order.
>>> This selection happens in net/core/dev.c:skb_tx_hash() as:
>>>
>>> reciprocal_scale(skb_get_hash(skb), qcount)
>>>
>>> where 'qcount' is the total number of Tx queues on the network device.
>>>
>>> If skb has not been hashed prior to this line, then skb_get_hash() will
>>> call flow dissector to generate a new hash. That's why flow dissection
>>> can occur on Tx path.
>>
>>
>> Hi Alexander
>>
>> So it looks like you are now skipping this hash, which in your
>> testing gives better results, because the protocol is already set
>> correctly. But are there cases when the protocol is not set correctly?
>> We really do need to look into the frame?
> 
> Actually no, I'm not skipping the entire hashing, I'm only skipping
> tag_ops->flow_dissect() (helper that only alters network offset and
> replaces fake ETH_P_XDSA with the actual protocol) call on Tx path,
> because this only breaks flow dissection logic. All skbs are still
> processed and hashed by the generic code that goes after that call.
> 
>> How about when an outer header has just been removed? The frame was
>> received on a GRE tunnel, the GRE header has just been removed, and
>> now the frame is on its way out? Is the protocol still GRE, and we
>> should look into the frame to determine if it is IPv4, ARP etc?
>>
>> Your patch looks to improve things for the cases you have tested, but
>> i'm wondering if there are other use cases where we really do need to
>> look into the frame? In which case, your fix is doing the wrong thing.
>> Should we be extending the tagger to handle the TX case as well as the
>> RX case?
> 
> We really have two options: don't call tag_ops->flow_dissect() on Tx
> (this patch), or extend tagger callbacks to handle Tx path too. I was
> using both of these for several months each and couldn't detect cases
> where the first one was worse than the second.
> I mean, there _might_ be such cases in theory, and if they appear
> we should extend our taggers. But for now I don't see the necessity to
> do this as generic flow dissection logic works as expected after this
> patch and is completely broken without it.
> And remember that we have the reverse logic on Tx and all skbs are
> first queued on the slave netdevice and only then on the master/CPU port.
> 
> It would be nice to see what other people think about it anyways.

Your patch seems appropriate to me, and quite frankly I am not sure why
flow dissection on RX is done at the DSA master device level, where we
have not parsed the DSA tag yet, instead of at the DSA slave network
device level. It seems to me that if the DSA master has N RX queues, we
should create the DSA slave devices with the same number of RX queues
and perform RPS there against a standard Ethernet frame (sans DSA tag).
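
To illustrate why dissecting at the master level needs a per-tagger hint,
here is a rough userspace model, not the kernel code. It assumes a
hypothetical 4-byte tag sitting right after the source MAC, so that once
the 14-byte Ethernet header has been pulled, the data pointer lands 2
bytes into the tag. struct dissect_state, tag_flow_dissect() and
dissect_prepare() are made-up names, and the guard on ETH_P_XDSA is only
one possible way to express "leave Tx skbs alone":

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_P_IP	0x0800
#define ETH_P_XDSA	0x00F8	/* placeholder protocol seen on the master */
#define TAG_LEN		4	/* hypothetical 4-byte switch tag */

/* Minimal stand-in for the dissector's starting state (not struct sk_buff). */
struct dissect_state {
	const uint8_t *data;	/* bytes right after the pulled Ethernet header */
	uint16_t proto;		/* protocol the dissector starts from */
	int nhoff;		/* network header offset relative to data */
};

/* Hypothetical tagger hint: report the real EtherType and the bytes to skip. */
static void tag_flow_dissect(const uint8_t *data, uint16_t *proto, int *offset)
{
	/* data[0..1] are the tail of the tag, data[2..3] the real EtherType */
	*offset = TAG_LEN;
	*proto = (uint16_t)((data[2] << 8) | data[3]);
}

/* Apply the hint only if the frame still carries the placeholder protocol. */
void dissect_prepare(struct dissect_state *st)
{
	uint16_t proto;
	int offset;

	assert(st != NULL && st->data != NULL);

	if (st->proto != ETH_P_XDSA)
		return;	/* protocol and offset are already usable, e.g. on Tx */

	tag_flow_dissect(st->data, &proto, &offset);
	st->proto = proto;
	st->nhoff += offset;
}
```

In this model a frame that arrives with ETH_P_XDSA gets its protocol and
offset corrected, while a frame whose protocol is already set is left
untouched.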

For TX the story is a little different: we can have multiqueue DSA slave
network devices in order to steer traffic towards particular switch
queues, so we could do XPS there.
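
For reference, the queue selection quoted earlier in this thread,
reciprocal_scale(skb_get_hash(skb), qcount), maps a 32-bit hash onto
[0, qcount) with a multiply and shift rather than a modulo. A minimal
userspace sketch of that arithmetic follows; pick_tx_queue() is a
hypothetical wrapper, and the hash is passed in directly instead of being
produced by the flow dissector:

```c
#include <assert.h>
#include <stdint.h>

/* Scale a 32-bit value into the range [0, ep_ro) without a division. */
uint32_t reciprocal_scale(uint32_t val, uint32_t ep_ro)
{
	return (uint32_t)(((uint64_t)val * ep_ro) >> 32);
}

/* Hypothetical wrapper: choose a Tx queue index from a flow hash. */
uint32_t pick_tx_queue(uint32_t flow_hash, uint32_t qcount)
{
	assert(qcount > 0);
	return reciprocal_scale(flow_hash, qcount);
}
```

The same hash always maps to the same queue, which is what keeps the
frames of one flow in order, while different hashes spread across the
available queues.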

What do you think?
-- 
Florian
