From: Eric Dumazet <eric.dumazet@gmail.com>
To: David Miller <davem@davemloft.net>
Cc: Alexander Duyck <alexander.duyck@gmail.com>,
	netdev <netdev@vger.kernel.org>,
	Alexander Duyck <aduyck@mirantis.com>,
	John Fastabend <john.r.fastabend@intel.com>,
	Jamal Hadi Salim <jhs@mojatatu.com>,
	Jesper Dangaard Brouer <brouer@redhat.com>
Subject: [PATCH net] net_sched: avoid too many hrtimer_start() calls
Date: Mon, 23 May 2016 14:24:56 -0700
Message-ID: <1464038696.5939.29.camel@edumazet-glaptop3.roam.corp.google.com>
In-Reply-To: <20160523115000.40e25fab@redhat.com>

From: Eric Dumazet <edumazet@google.com>

I found a serious performance bug in packet schedulers using hrtimers.

sch_htb and sch_fq are definitely impacted by this problem.

We constantly rearm high resolution timers if some packets are throttled
in one (or more) classes, and other packets are flying through the qdisc
on another (non-throttled) class.

hrtimer_start() does not have the mod_timer() trick of doing nothing if
the expires value does not change:

	if (timer_pending(timer) &&
	    timer->expires == expires)
		return 1;

This issue is particularly visible when multiple CPUs can queue/dequeue
packets on the same qdisc, as the hrtimer code has to lock a remote base.

I used the following fix:

1) Change htb to use qdisc_watchdog_schedule_ns() instead of open-coding
it.

2) Cache the watchdog's previous expiration. hrtimer might provide this,
but I prefer not to rely on hrtimer internals.
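
For illustration only, here is a minimal user-space sketch of the caching
idea in point 2. It is not kernel code: struct fake_timer,
fake_timer_start(), struct watchdog, watchdog_schedule_ns() and
count_starts() are hypothetical stand-ins, and the fake timer merely
counts how often it is armed. The sketch also assumes the timer stays
pending between calls; it does not model the timer firing or being
cancelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct hrtimer: only records arm calls. */
struct fake_timer {
	uint64_t expires;
	unsigned int starts;
};

/* Hypothetical stand-in for hrtimer_start(): always pays the arm cost. */
static void fake_timer_start(struct fake_timer *t, uint64_t expires)
{
	t->expires = expires;
	t->starts++;
}

/* Same shape as struct qdisc_watchdog after the patch, minus the qdisc. */
struct watchdog {
	uint64_t last_expires;
	struct fake_timer timer;
};

/* Skip the (expensive) rearm when the expiry did not change. */
static void watchdog_schedule_ns(struct watchdog *wd, uint64_t expires)
{
	assert(wd != NULL);

	if (wd->last_expires == expires)
		return;

	wd->last_expires = expires;
	fake_timer_start(&wd->timer, expires);
}

/* Simulate n dequeues that all compute the same next event time. */
static unsigned int count_starts(unsigned int n, uint64_t next_event)
{
	struct watchdog wd = { 0 };
	unsigned int i;

	for (i = 0; i < n; i++)
		watchdog_schedule_ns(&wd, next_event);

	return wd.timer.starts;
}
```

In this sketch, calling count_starts() with any nonzero next_event arms
the fake timer once, however many dequeues are simulated; without the
last_expires check it would be armed on every call. Because the cache
starts at zero here, an expiry of 0 would be skipped, which is an
artifact of the sketch.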

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/pkt_sched.h |    1 +
 net/sched/sch_api.c     |    4 ++++
 net/sched/sch_htb.c     |   13 +++----------
 3 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
index 401038d2f9b8..fea53f4d92ca 100644
--- a/include/net/pkt_sched.h
+++ b/include/net/pkt_sched.h
@@ -61,6 +61,7 @@ psched_tdiff_bounded(psched_time_t tv1, psched_time_t tv2, psched_time_t bound)
 }
 
 struct qdisc_watchdog {
+	u64		last_expires;
 	struct hrtimer	timer;
 	struct Qdisc	*qdisc;
 };
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 64f71a2155f3..ddf047df5361 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -607,6 +607,10 @@ void qdisc_watchdog_schedule_ns(struct qdisc_watchdog *wd, u64 expires, bool thr
 	if (throttle)
 		qdisc_throttled(wd->qdisc);
 
+	if (wd->last_expires == expires)
+		return;
+
+	wd->last_expires = expires;
 	hrtimer_start(&wd->timer,
 		      ns_to_ktime(expires),
 		      HRTIMER_MODE_ABS_PINNED);
diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index f6bf5818ed4d..d4b4218af6b1 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -928,17 +928,10 @@ ok:
 		}
 	}
 	qdisc_qstats_overlimit(sch);
-	if (likely(next_event > q->now)) {
-		if (!test_bit(__QDISC_STATE_DEACTIVATED,
-			      &qdisc_root_sleeping(q->watchdog.qdisc)->state)) {
-			ktime_t time = ns_to_ktime(next_event);
-			qdisc_throttled(q->watchdog.qdisc);
-			hrtimer_start(&q->watchdog.timer, time,
-				      HRTIMER_MODE_ABS_PINNED);
-		}
-	} else {
+	if (likely(next_event > q->now))
+		qdisc_watchdog_schedule_ns(&q->watchdog, next_event, true);
+	else
 		schedule_work(&q->work);
-	}
 fin:
 	return skb;
 }
