Subject: ADQ - comparison to aRFS, clarifications on NAPI ID, binding with busy-polling
From: Maxim Mikityanskiy @ 2020-06-17 13:15 UTC
  To: Amritha Nambiar, Kiran Patil, Sridhar Samudrala
  Cc: Alexander Duyck, Eric Dumazet, Tom Herbert, netdev

Hi,

I discovered the Intel ADQ feature [1], which boosts performance by 
dedicating queues to application traffic. We did some research, and I 
now have a general understanding of how it works, but some questions 
remain that I hope you can answer.

1. SO_INCOMING_NAPI_ID usage. In my understanding, every connection has 
a key (sk_napi_id) that is unique to the NAPI where this connection is 
handled, and the application uses that key to choose a handler thread 
from the thread pool. If we have a one-to-one relationship between 
application threads and NAPI IDs of connections, each application thread 
will handle only traffic from a single NAPI. Is my understanding correct?
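
For reference, here is roughly the dispatch scheme I have in mind 
(only a sketch; the fixed pool size and the modulo mapping are 
placeholders of mine):

    #include <sys/socket.h>

    #ifndef SO_INCOMING_NAPI_ID
    #define SO_INCOMING_NAPI_ID 56 /* from asm-generic/socket.h */
    #endif

    #define NUM_WORKERS 8 /* size of the application's thread pool */

    /* Pick a worker thread for a freshly accepted connection, keyed
     * by the NAPI ID of the queue the connection arrived on. */
    static int pick_worker(int conn_fd)
    {
        unsigned int napi_id = 0;
        socklen_t len = sizeof(napi_id);

        if (getsockopt(conn_fd, SOL_SOCKET, SO_INCOMING_NAPI_ID,
                       &napi_id, &len) < 0)
            return -1;

        /* Stable NAPI-ID -> thread mapping; with one worker per NAPI
         * this becomes the one-to-one case described above. */
        return napi_id % NUM_WORKERS;
    }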

1.1. I wonder how the application thread gets scheduled on the same 
core that NAPI runs on. This currently works only with busy_poll, so 
when the application initiates busy polling (calls epoll), does the 
Linux scheduler move the thread to the right CPU? Do we have to have 
a strict one-to-one relationship between threads and NAPIs, or can 
one thread handle multiple NAPIs? When data arrives, does the 
scheduler run the application thread on the same CPU that NAPI ran on?
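
For concreteness, the epoll-driven busy-poll setup I am assuming is 
just the usual sysctl knobs (values in microseconds; as far as I 
understand, net.core.busy_poll also covers epoll_wait):

    sysctl -w net.core.busy_read=50
    sysctl -w net.core.busy_poll=50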

1.2. I see that SO_INCOMING_NAPI_ID is tightly coupled with 
busy_poll; it is enabled only if CONFIG_NET_RX_BUSY_POLL is set. Is 
there a real reason it can't be used without busy_poll? In other 
words, if we modified the kernel to drop this requirement, would the 
kernel still schedule the application thread on the same CPU as NAPI 
when busy_poll is not used?

2. Can you compare ADQ to aRFS+XPS? aRFS provides a way to steer 
traffic to the application's CPU automatically, and xps_rxqs can be 
used to transmit from the corresponding queues. That setup needs no 
manual TC configuration and is not limited to 4 applications. The 
difference with ADQ, in my understanding, is that ADQ moves the 
application to the RX CPU, while aRFS steers the traffic to the RX 
queue handled by the application's CPU. Is there any advantage of ADQ 
over aRFS that I failed to find?
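
To make the comparison concrete, the aRFS+XPS setup I have in mind is 
along these lines (a sketch for a hypothetical eth0 with 8 queue 
pairs):

    # aRFS: enable ntuple filters and size the RFS flow tables
    ethtool -K eth0 ntuple on
    echo 32768 > /proc/sys/net/core/rps_sock_flow_entries
    for i in $(seq 0 7); do
        echo 4096 > /sys/class/net/eth0/queues/rx-$i/rps_flow_cnt
    done

    # xps_rxqs: transmit from the TX queue paired with each RX queue
    for i in $(seq 0 7); do
        printf '%x' $((1 << i)) > /sys/class/net/eth0/queues/tx-$i/xps_rxqs
    done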

3. At [1], you mention that ADQ can be used to create separate RSS 
sets. Could you elaborate on the API used? Does the tc mqprio 
configuration also affect RSS? Can it be turned on/off?

4. How is tc flower used in the context of ADQ? Does the user need to 
reflect the configuration in both the mqprio qdisc (for TX) and tc 
flower (for RX)? It looks like tc flower maps incoming traffic to 
TCs, but what is the mechanism that maps TCs to RX queues?
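
For reference, this is the kind of configuration I pieced together 
from the public docs (interface name, addresses and port are 
placeholders; please correct me if this is not the intended flow):

    # TX: two TCs, 8 queues each, offloaded to hardware channels
    tc qdisc add dev eth0 root mqprio num_tc 2 \
        map 0 0 0 0 1 1 1 1 queues 8@0 8@8 hw 1 mode channel

    # RX: steer the application's traffic into TC 1 with tc flower
    tc qdisc add dev eth0 clsact
    tc filter add dev eth0 protocol ip ingress prio 1 flower \
        dst_ip 192.168.1.1/32 ip_proto tcp dst_port 6379 \
        skip_sw hw_tc 1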

I really hope you can shed more light on this feature, both to help 
me understand how to use it and to compare it with aRFS.

Thanks,
Max

[1]: https://netdevconf.info/0x14/session.html?talk-ADQ-for-system-level-network-io-performance-improvements
