netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next] net_sched: remove need_resched() from qdisc_run()
@ 2019-10-01 21:02 Eric Dumazet
  2019-10-02 21:28 ` David Miller
  0 siblings, 1 reply; 2+ messages in thread
From: Eric Dumazet @ 2019-10-01 21:02 UTC (permalink / raw)
  To: David S . Miller; +Cc: netdev, Eric Dumazet, Eric Dumazet

The introduction of this schedule point was done in commit
2ba2506ca7ca ("[NET]: Add preemption point in qdisc_run")
at a time the loop was not bounded.

Then later in commit d5b8aa1d246f ("net_sched: fix dequeuer fairness")
we added a limit on the number of packets.

Now is the time to remove the schedule point, since the default
limit of 64 packets matches the number of packets a typical NAPI
poll can process in a row.

This solves a latency problem for most TCP receivers under moderate load :

1) host receives a packet.
   NET_RX_SOFTIRQ is raised by NIC hard IRQ handler

2) __do_softirq() does its first loop, handling NET_RX_SOFTIRQ
   and calling the driver napi->loop() function

3) TCP stores the skb in socket receive queue:

4) TCP calls sk->sk_data_ready() and wakeups a user thread
   waiting for EPOLLIN (as a result, need_resched() might now be true)

5) TCP cooks an ACK and sends it.

6) qdisc_run() processes one packet from qdisc, and sees need_resched(),
   this raises NET_TX_SOFTIRQ (even if there are no more packets in
   the qdisc)

Then we go back to the __do_softirq() in 2), and we see that new
softirqs were raised. Since need_resched() is true, we end up waking
ksoftirqd in this path :

    if (pending) {
            if (time_before(jiffies, end) && !need_resched() &&
                --max_restart)
                    goto restart;

            wakeup_softirqd();
    }

So we have many wakeups of ksoftirqd kernel threads,
and more calls to qdisc_run() with associated lock overhead.

Note that another way to solve the issue would be to change TCP
to first send the ACK packet, then signal the EPOLLIN,
but this changes P99 latencies, as sending the ACK packet
can add a long delay.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/sched/sch_generic.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 17bd8f539bc7f1d596e97c713467f953802c9b82..4c75dbabd343e4585da2f3b11105e436c872c4a8 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -382,13 +382,8 @@ void __qdisc_run(struct Qdisc *q)
 	int packets;
 
 	while (qdisc_restart(q, &packets)) {
-		/*
-		 * Ordered by possible occurrence: Postpone processing if
-		 * 1. we've exceeded packet quota
-		 * 2. another process needs the CPU;
-		 */
 		quota -= packets;
-		if (quota <= 0 || need_resched()) {
+		if (quota <= 0) {
 			__netif_schedule(q);
 			break;
 		}
-- 
2.23.0.581.g78d2f28ef7-goog


^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: [PATCH net-next] net_sched: remove need_resched() from qdisc_run()
  2019-10-01 21:02 [PATCH net-next] net_sched: remove need_resched() from qdisc_run() Eric Dumazet
@ 2019-10-02 21:28 ` David Miller
  0 siblings, 0 replies; 2+ messages in thread
From: David Miller @ 2019-10-02 21:28 UTC (permalink / raw)
  To: edumazet; +Cc: netdev, eric.dumazet

From: Eric Dumazet <edumazet@google.com>
Date: Tue,  1 Oct 2019 14:02:36 -0700

> Now is the time to remove the schedule point, since the default
> limit of 64 packets matches the number of packets a typical NAPI
> poll can process in a row.
 ...

Agreed.

Applied.

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2019-10-02 21:29 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-01 21:02 [PATCH net-next] net_sched: remove need_resched() from qdisc_run() Eric Dumazet
2019-10-02 21:28 ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).