From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
To: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com>
Cc: Network Development <netdev@vger.kernel.org>,
	David Miller <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>,
	Tom Herbert <tom@herbertland.com>,
	carolyn.wyborny@intel.com,
	"Keller, Jacob E" <jacob.e.keller@intel.com>,
	amritha.nambiar@intel.com
Subject: Re: [RFC PATCH net-next 0/3] sock: Fix sock queue mapping to include device
Date: Thu, 22 Oct 2020 13:38:06 -0400
Message-ID: <CA+FuTSdjL4bFYHXyH8dv2x-ZEQZSuA7R8ecttzdZMRwyPEF-=A@mail.gmail.com>
In-Reply-To: <20201021194743.781583-1-harshitha.ramamurthy@intel.com>

On Wed, Oct 21, 2020 at 3:51 PM Harshitha Ramamurthy
<harshitha.ramamurthy@intel.com> wrote:
>
> In XPS, the transmit queue selected for a packet is saved in the associated
> sock for the packet and is then used to avoid recalculating the queue
> on subsequent sends. The problem is that the corresponding device is not
> also recorded so that when the queue mapping is referenced it may
> correspond to a different device than the sending one, resulting in an
> incorrect queue being used for transmit. Particularly with xps_rxqs, this
> can lead to non-deterministic behaviour as illustrated below.
>
> Consider a case where xps_rxqs is configured and there is a difference
> in number of Tx and Rx queues. Suppose we have 2 devices A and B. Device A
> has 0-7 queues and device B has 0-15 queues. Packets are transmitted from
> Device A but packets are received on B. For packets received on queue 0-7
> of Device B, xps_rxqs will be applied for reply packets to transmit on
> Device A's queues 0-7. However, when packets are received on queues
> 8-15 of Device B, normal XPS is used to reply packets when transmitting
> from Device A. This leads to non-deterministic behaviour. The case where
> there are fewer receive queues is even more insidious. Consider Device
> A, the transmitting device, has queues 0-15 and Device B, the receiver,
> has queues 0-7. With xps_rxqs enabled, the packets will be received only
> on queues 0-7 of Device B, but sent only on 0-7 queues of Device A
> thereby causing a load imbalance.

So the issue is limited to xps_rxqs with multiple NICs.

When do we need sk_tx_dev_and_queue_mapping (patch 3/3)? It is used in
netdev_pick_tx, but associations are reset on route change and
recomputed if queue_index would exceed the current device queue count.
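
To make the question concrete, here is a rough user-space model of the
behaviour described above. It is not the kernel code: model_sock,
model_dev, model_pick_tx, model_route_change and the NO_QUEUE_MAPPING
sentinel are all hypothetical names, and the hash-modulo computation is
only a stand-in for the real queue selection.

```c
#include <assert.h>
#include <stdint.h>

#define NO_QUEUE_MAPPING UINT16_MAX	/* hypothetical "unset" sentinel */

struct model_sock {
	uint16_t tx_queue;		/* cached queue, no device recorded */
};

struct model_dev {
	uint16_t real_num_tx_queues;
};

/* Stand-in for the real hash/XPS queue computation. */
static uint16_t model_compute_queue(const struct model_dev *dev,
				    uint32_t flow_hash)
{
	assert(dev->real_num_tx_queues > 0);
	return (uint16_t)(flow_hash % dev->real_num_tx_queues);
}

/* Reuse the cached queue unless it is unset or out of range for dev. */
static uint16_t model_pick_tx(struct model_sock *sk,
			      const struct model_dev *dev,
			      uint32_t flow_hash)
{
	uint16_t q = sk->tx_queue;

	if (q == NO_QUEUE_MAPPING || q >= dev->real_num_tx_queues) {
		q = model_compute_queue(dev, flow_hash);
		sk->tx_queue = q;
	}
	return q;
}

/* A route change drops the association entirely. */
static void model_route_change(struct model_sock *sk)
{
	sk->tx_queue = NO_QUEUE_MAPPING;
}
```

In this model a cached index from a 16-queue device is recomputed when
the socket sends on an 8-queue device if the index is 8 or higher, and
any cached index is dropped on route change. An in-range index left over
from another device is still reused, which is the only case the extra
device check could change.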

> This patch set fixes the issue by recording both the device (via
> ifindex) and the queue in the sock mapping. The pair is set and
> retrieved atomically.

I guess this is the reason for the somewhat convoluted cast to u64
logic in patch 1/3. Is the assumption that 64-bit loads and stores are
atomic on all platforms? That is not correct.
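
For illustration only, a user-space sketch of what an explicitly atomic
pair could look like with C11 atomics. This is not the code from patch
1/3: struct dev_queue, the pack/unpack helpers and the 16-bit shift are
assumptions made up for the example. The point is that the atomicity
has to be requested explicitly; a plain u64 load or store may be split
into two 32-bit accesses on some 32-bit platforms.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pair: device ifindex plus queue index. */
struct dev_queue {
	int32_t ifindex;
	uint16_t queue;
};

/* Pack the pair into one 64-bit word: ifindex above the low 16 bits. */
static uint64_t dev_queue_pack(struct dev_queue dq)
{
	assert(dq.ifindex >= 0);
	return ((uint64_t)dq.ifindex << 16) | dq.queue;
}

static struct dev_queue dev_queue_unpack(uint64_t v)
{
	struct dev_queue dq = { (int32_t)(v >> 16), (uint16_t)(v & 0xffff) };

	return dq;
}

/* Explicit atomic store/load, so the pair can never be observed torn. */
static void mapping_set(_Atomic uint64_t *slot, struct dev_queue dq)
{
	atomic_store_explicit(slot, dev_queue_pack(dq), memory_order_relaxed);
}

static struct dev_queue mapping_get(_Atomic uint64_t *slot)
{
	return dev_queue_unpack(atomic_load_explicit(slot,
						     memory_order_relaxed));
}

/* True only where 64-bit (long long) atomics are always lock-free. */
static bool mapping_always_lock_free(void)
{
	return ATOMIC_LLONG_LOCK_FREE == 2;
}
```

Where mapping_always_lock_free() is false, the compiler may fall back to
a lock for every access, which is a very different cost from a single
plain store.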

Is atomicity even needed? For the purpose of load balancing it isn't.
Just adding a sk->rx_ifindex would be a lot simpler.
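
Roughly what I have in mind, as a user-space sketch rather than a patch.
The struct, the helper names and the NO_RX_QUEUE sentinel are
hypothetical; only the idea of a separate rx_ifindex field next to the
rx queue comes from the suggestion above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NO_RX_QUEUE UINT16_MAX	/* hypothetical "unset" sentinel */

struct model_sock {
	int rx_ifindex;		/* device the flow was last received on */
	uint16_t rx_queue;	/* rx queue on that device */
};

/* Two plain stores: a concurrent reader may briefly see a mixed pair. */
static void model_record_rx(struct model_sock *sk, int ifindex,
			    uint16_t queue)
{
	sk->rx_ifindex = ifindex;
	sk->rx_queue = queue;
}

/*
 * Return the recorded rx queue only if it belongs to the transmitting
 * device; otherwise the caller falls back to regular XPS.
 */
static bool model_rx_queue_for_dev(const struct model_sock *sk,
				   int tx_ifindex, uint16_t *queue)
{
	assert(queue != NULL);
	if (sk->rx_queue == NO_RX_QUEUE || sk->rx_ifindex != tx_ifindex)
		return false;
	*queue = sk->rx_queue;
	return true;
}
```

If a reader does race with model_record_rx(), the worst case is one
packet steered by a stale or mismatched pair, which is acceptable when
the mapping is only a load balancing hint.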

sk->sk_napi_id already uniquely identifies the device. Unfortunately,
dev_get_by_napi_id is not cheap (traverses a hashtable bucket). Though
purely for the purpose of load balancing this validation could be
sample based.
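
A sketch of the sampling idea, again as a user-space model. The lookup
function is a made-up stand-in for an expensive napi-id-to-device
lookup (its napi_id >> 8 encoding is arbitrary), and SAMPLE_INTERVAL,
the struct and the counter are assumptions for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SAMPLE_INTERVAL 64u	/* hypothetical: validate 1 in 64 sends */

struct model_sock {
	unsigned int napi_id;
	uint32_t tx_count;
	bool rx_mapping_valid;	/* result of the last sampled check */
};

static unsigned int lookup_calls;	/* counts expensive lookups */

/* Stand-in for an expensive napi id to device lookup. */
static int model_ifindex_by_napi_id(unsigned int napi_id)
{
	lookup_calls++;
	return (int)(napi_id >> 8);
}

/*
 * Re-validate the rx mapping against the transmitting device only on
 * every SAMPLE_INTERVAL-th send; in between, trust the cached verdict.
 */
static bool model_rx_mapping_usable(struct model_sock *sk, int tx_ifindex)
{
	assert(tx_ifindex > 0);
	if (sk->tx_count++ % SAMPLE_INTERVAL == 0)
		sk->rx_mapping_valid =
			model_ifindex_by_napi_id(sk->napi_id) == tx_ifindex;
	return sk->rx_mapping_valid;
}
```

The cost is that a change of receive device can go unnoticed for up to
SAMPLE_INTERVAL - 1 sends, which again only matters for balance, not
correctness.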

The rx ifindex is also already recorded for inet sockets in
rx_dst_ifindex, and the sk_rx_queue_get functions are limited to
those, so could conceivably use that. But it is derived from skb_iif,
which is overwritten with every reentry of __netif_receive_skb_core.
