All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Jörn Engel" <joern@purestorage.com>
To: linux-kernel@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>, Peter Zijlstra <peterz@infradead.org>
Subject: sched: wakeup setting TIF_NEED_RESCHED too frequently
Date: Tue, 27 Apr 2021 14:16:50 -0700	[thread overview]
Message-ID: <YIh/QubidJcE5IIv@cork> (raw)

I came across something that looks wrong and is certainly inefficient.
Here is a trace of what happened.  Kernel was 5.4, so not too ancient.

 + nanoseconds before present
 |       + 0 for voluntary (going to sleep), 1 for involuntary (TASK_RUNNING)
 |       |    + PID of prev task
 |       |    |      + PID of next task
 |       |    |      |    + RT priority
 |       |    |      |    |  + comm
 v       v    v      v    v  v
8729751  0   6893   6894  0 one
8723802  0   6894 106051  0 one
8718671  1 106051   6899  0 other
8711411  0   6899   6900  0 one
8705643  0   6900   6902  0 one
8699444  0   6902   6903  0 one
8693987  0   6903   6906  0 one
8687832  0   6906   6904  0 one
8681188  0   6904   6901  0 one
8674083  0   6901 106051  0 one
8639249  1 106051   6893  0 other
8631961  0   6893   6894  0 one
8626684  0   6894 106051  0 one
8623375  1 106051   6899  0 other
8617434  0   6899   6900  0 one
8612341  0   6900   6901  0 one
8607215  0   6901   6902  0 one
8600824  0   6902   6903  0 one
8595214  0   6903   6906  0 one
8590161  0   6906   6904  0 one
8584825  0   6904 106051  0 one
8465395  1 106051   6893  0 other
8358696  0   6893   6903  0 one

The "one" process seems to frequently wake a bunch of threads that
do very little work before going back to sleep.  Clearly inefficient,
but we are not interested in userspace problems here.  The "other"
process has a thread that would like to continue running, but only gets
to run for very short periods, 3.3µs in one case.  That also seems
inefficient and looks like either a scheduler or debatable policy.

My rule of thumb for task switch overhead is 1-3µs.  Probably 1µs for
the raw switch, plus another 2µs for cold cache effects.  If we want to
amortize that overhead a bit, time slices should be in the region of
1ms, give or take.  I would also be happy if the "one" process would
have to wait 1ms before the woken threads actually get to run.  So in my
book the observed behaviour is wrong, but opinions may differ.

Anyway, trying to find a cause, I noticed the following call chain:
	set_nr_if_polling()
	ttwu_queue_remote()
	ttwu_queue()
	try_to_wake_up()
	default_wake_function()
	curr->func()
	__wake_up_common()
	__wake_up_common_lock()
	__wake_up()
	wake_up()

Call chain above is manually created from source code.  Closest sample I
caught with instrumentation is missing the leaf calls after
try_to_wake_up():
	_raw_spin_unlock_irqrestore+0x1f/0x40
	try_to_wake_up+0x425/0x5e0
	wake_up_q+0x3f/0x80
	futex_wake+0x159/0x180
	do_futex+0xcd/0xba0

Afaics, the result is us setting TIF_NEED_RESCHED on any wakeup, unless
wake_list is already populated.  Is that actually intentional?  And is
that useful for performance of latency?  I think it isn't, but I am
probably missing something here.

Jörn

--
With a PC, I always felt limited by the software available. On Unix,
I am limited only by my knowledge.
-- Peter J. Schoenster

             reply	other threads:[~2021-04-27 21:17 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-27 21:16 Jörn Engel [this message]
2021-04-28  8:30 ` sched: wakeup setting TIF_NEED_RESCHED too frequently Peter Zijlstra
2021-04-28 18:51   ` Jörn Engel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YIh/QubidJcE5IIv@cork \
    --to=joern@purestorage.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.