netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH net-next 0/3] sock: Fix sock queue mapping to include device
@ 2020-10-21 19:47 Harshitha Ramamurthy
  2020-10-21 19:47 ` [RFC PATCH net-next 1/3] sock: Definition and general functions for dev_and_queue structure Harshitha Ramamurthy
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Harshitha Ramamurthy @ 2020-10-21 19:47 UTC (permalink / raw)
  To: netdev, davem, kuba
  Cc: tom, carolyn.wyborny, jacob.e.keller, amritha.nambiar,
	Harshitha Ramamurthy

In XPS, the transmit queue selected for a packet is saved in the associated
sock for the packet and is then used to avoid recalculating the queue
on subsequent sends. The problem is that the corresponding device is not
also recorded so that when the queue mapping is referenced it may
correspond to a different device than the sending one, resulting in an
incorrect queue being used for transmit. Particularly with xps_rxqs, this
can lead to non-deterministic behaviour as illustrated below.

Consider a case where xps_rxqs is configured and there is a difference
in number of Tx and Rx queues. Suppose we have 2 devices A and B. Device A
has 0-7 queues and device B has 0-15 queues. Packets are transmitted from
Device A but packets are received on B. For packets received on queue 0-7
of Device B, xps_rxqs will be applied for reply packets to transmit on
Device A's queues 0-7. However, when packets are received on queues
8-15 of Device B, normal XPS is used to reply packets when transmitting
from Device A. This leads to non-deterministic behaviour. The case where
there are fewer receive queues is even more insidious. Consider Device
A, the trasmitting device has queues 0-15 and Device B, the receiver
has queues 0-7. With xps_rxqs enabled, the packets will be received only
on queues 0-7 of Device B, but sent only on 0-7 queues of Device A
thereby causing a load imbalance.

This patch set fixes the issue by recording both the device (via
ifindex) and the queue in the sock mapping. The pair is set and
retrieved atomically. While retrieving the queue using the get
functions, we check if the ifindex held is the same as the ifindex
stored before returning the queue held. For instance during transmit,
we return a valid queue number only after checking if the ifindex stored
matches the device currently held.

This patch set contains:
	- Definition of dev_and_queue structure to hold the ifindex
	  and queue number
	- Generic functions to get, set, and clear dev_and_queue
	  structure
	- Change sk_tx_queue_{get,set,clear} to
	  sk_tx_dev_and_queue_{get,set,clear}
	- Modify callers of above to use new interface
	- Change sk_rx_queue_{get,set,clear} to 
          sk_rx_dev_and_queue_{get,set,clear}
        - Modify callers of above to use new interface

This patch set was tested as follows:
	- XPS with both xps_cpus and xps_rxqs works as expected
	- the Q index is calculated only once when picking a tx queue
	  per connection. For ex: in netdev_pick_tx

Tom Herbert (3):
  sock: Definition and general functions for dev_and_queue structure
  sock: Use dev_and_queue structure for RX queue mapping in sock
  sock: Use dev_and_queue structure for TX queue mapping in sock

 .../mellanox/mlx5/core/en_accel/ktls_rx.c     |   6 +-
 drivers/net/hyperv/netvsc_drv.c               |   4 +-
 include/net/busy_poll.h                       |   2 +-
 include/net/request_sock.h                    |   2 +-
 include/net/sock.h                            | 107 ++++++++++++------
 net/core/dev.c                                |   6 +-
 net/core/filter.c                             |   7 +-
 net/core/sock.c                               |  10 +-
 net/ipv4/tcp_input.c                          |   2 +-
 9 files changed, 93 insertions(+), 53 deletions(-)

-- 
2.26.2


^ permalink raw reply	[flat|nested] 6+ messages in thread
* [RFC PATCH net-next 0/3] sock: Fix sock queue mapping to include device
@ 2020-07-24 20:14 Tom Herbert
  0 siblings, 0 replies; 6+ messages in thread
From: Tom Herbert @ 2020-07-24 20:14 UTC (permalink / raw)
  To: netdev, amritha.nambiar; +Cc: Tom Herbert

The transmit queue selected for a packet is saved in the associated sock
for the packet and is subsequently used to avoid recalculating the queue
on subsequent sends. The problem is that the corresponding device is not
also recorded so that when the queue mapping is referenced it may
correspond to a different device than the sending one, resulting in an
incorrect queue being used for transmit. A similar problem exists in
recording the receive queue in the sock without the corresponding
receive device.

This patch set fixes the issue by recording both the device (via
ifindex) and the queue in the sock mapping. The pair is set and
retrieved atomically. The caller getting the mapping pair checks
that both the recorded queue and in the device are valid in the
context (for instance, in transmit the returned ifindex is checked
against that of the transmitting device to ensure they refer to
same device before apply the recorded queue).

This patch set contains:
	- Definition of dev_and_queue structure to hold the ifindex
	  and queue number
	- Generic functions to get, set, and clear dev_and_queue
	  structure
	- Change sk_tx_queue_{get,set,clear} to
	  sk_tx_dev_and_queue_{get,set,clear}
	- Modify callers of above to use new interface
	- Change sk_rx_queue_{get,set,clear} to 
          sk_rx_dev_and_queue_{get,set,clear}
        - Modify callers of above to use new interface

Tom Herbert (3):
  sock: Definition and general functions for dev_and_queue structure
  sock: Use dev_and_queue structure for TX queue mapping in sock
  sock: Use dev_and_queue structure for RX queue mapping in sock

 .../mellanox/mlx5/core/en_accel/ktls_rx.c     |  10 +-
 drivers/net/hyperv/netvsc_drv.c               |   9 +-
 include/net/busy_poll.h                       |   2 +-
 include/net/request_sock.h                    |   2 +-
 include/net/sock.h                            | 126 +++++++++++++-----
 net/core/dev.c                                |  15 ++-
 net/core/filter.c                             |   7 +-
 net/core/sock.c                               |  10 +-
 net/ipv4/tcp_input.c                          |   2 +-
 9 files changed, 124 insertions(+), 59 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-10-22 17:38 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-21 19:47 [RFC PATCH net-next 0/3] sock: Fix sock queue mapping to include device Harshitha Ramamurthy
2020-10-21 19:47 ` [RFC PATCH net-next 1/3] sock: Definition and general functions for dev_and_queue structure Harshitha Ramamurthy
2020-10-21 19:47 ` [RFC PATCH net-next 2/3] sock: Use dev_and_queue structure for RX queue mapping in sock Harshitha Ramamurthy
2020-10-21 19:47 ` [RFC PATCH net-next 3/3] sock: Use dev_and_queue structure for TX " Harshitha Ramamurthy
2020-10-22 17:38 ` [RFC PATCH net-next 0/3] sock: Fix sock queue mapping to include device Willem de Bruijn
  -- strict thread matches above, loose matches on Subject: below --
2020-07-24 20:14 Tom Herbert

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).