From: Yong Zhang <yong.zhang0@gmail.com>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Borislav Petkov <bp@amd64.org>, Borislav Petkov <bp@alien8.de>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"markus@trippelsdorf.de" <markus@trippelsdorf.de>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"mingo@elte.hu" <mingo@elte.hu>,
	"linux-tip-commits@vger.kernel.org" 
	<linux-tip-commits@vger.kernel.org>
Subject: Re: [tip:sched/urgent] sched: Fix cross-cpu clock sync on remote wakeups
Date: Fri, 3 Jun 2011 14:49:38 +0800
Message-ID: <BANLkTikciSq0x6-RXTx7ugrgthxXtUsoJA@mail.gmail.com>
In-Reply-To: <1307029711.2497.717.camel@laptop>

On Thu, Jun 2, 2011 at 11:48 PM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> On Thu, 2011-06-02 at 22:23 +0800, Yong Zhang wrote:
>> On Thu, Jun 02, 2011 at 03:04:26PM +0200, Peter Zijlstra wrote:
>> > On Thu, 2011-06-02 at 15:52 +0800, Yong Zhang wrote:
>> > > In sched_clock_local(), clock is calculated around ->tick_gtod even if
>> > > that ->tick_gtod is stale for long time because we stays in idle state.
>> > > You know ->tick_gtod is only updated in sched_clock_tick();
>> >
>> > (well, no, there's idle callbacks as you said below)
>> >
>> > > IOW, when a cpu goes out of idle, sched_clock_tick() is called from
>> > > tick_nohz_stop_idle() which is later than interrupt.
>> >
>> > Gah, that would be awefull and mean wakeups from interrupts were already
>> > borken. /me goes look at code.
>> >
>> > irq_enter() -> tick_check_idle() -> tick_check_nohz() ->
>> > tick_nohz_stop_idle() -> sched_clock_idle_wakeup_event()
>> >
>> > should update the thing before we run any isrs, right?
>>
>> Hmmm, you are right.
>>
>> But smp_reschedule_interrupt() doesn't call irq_enter()/irq_exit(),
>> is that correct?
>
> Crap.. you're right.
> And I bet other archs don't do that either.

Most of them ;)
I only noticed sparc32 doing that. There may be more,
but I didn't check very carefully.

> With
> NO_HZ you really need irq_enter() for pretty much all interrupts so I
> was assuming the resched IPI had it, but its been special and never
> really needed it. If it would wake an idle cpu the idle loop exit would
> deal with it, if it interrupted userspace the thing was running and
> NO_HZ wasn't relevant.
>
> Damn.
>
> And yes, the only reason I didn't see this on my dev box was because we
> do indeed set that sched_clock_stable thing on wsm. And I never noticed
> on my desktop because firefox/X/etc. consuming heaps of CPU isn't weird
> at all.
>
> Adding it to all resched int handlers is of course a possibility but
> would slow down the thing, although with the new code, most users are
> now indeed wakeups (excepting weird and wonderful users like KVM).
>
> We could of course add it in sched.c since the logic recurses just
> fine.. its not pretty though.. :/

Yeah, IMHO it's suitable here, and my testing looks good.

Reviewed-and-Tested-by: Yong Zhang <yong.zhang0@gmail.com>

BTW, scheduler_ipi() and sched_ttwu_pending() could share a piece of
code now. And we place irq_enter()/irq_exit() in scheduler_ipi() because
it's the only function smp_reschedule_interrupt() calls, so
account_system_vtime() gets an almost exact time value. IOW, we should
pay attention to future changes to smp_reschedule_interrupt().
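
To make the code-sharing point concrete, here is a rough userspace
mock-up, not a patch. It assumes a helper I'm calling
ttwu_activate_list() (a made-up name), and struct rq, struct
task_struct, mock_irq_enter() and mock_irq_exit() are simplified
stand-ins for the kernel ones. rq->lock handling is left out entirely,
and the int return value of scheduler_ipi() exists only so the mock can
be checked.

```c
/*
 * Userspace mock-up only. The types and the mock_* / ttwu_activate_list()
 * names are hypothetical stand-ins, not the kernel definitions.
 * rq->lock is deliberately omitted.
 */
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct task_struct {
	int pid;
	struct task_struct *wake_entry;	/* next task on the wake list */
};

struct rq {
	_Atomic(struct task_struct *) wake_list;
	int nr_activated;	/* counts calls that stand in for ttwu_do_activate() */
	int irq_entries;	/* counts mock_irq_enter() calls */
	int irq_depth;		/* enter/exit nesting, must end at 0 */
};

static void mock_irq_enter(struct rq *rq)
{
	rq->irq_entries++;
	rq->irq_depth++;
}

static void mock_irq_exit(struct rq *rq)
{
	rq->irq_depth--;
}

/* Hypothetical shared helper: activate every task on a detached list. */
static void ttwu_activate_list(struct rq *rq, struct task_struct *list)
{
	while (list) {
		struct task_struct *p = list;

		list = p->wake_entry;
		rq->nr_activated++;	/* stands in for ttwu_do_activate(rq, p, 0) */
	}
}

/* Non-IPI path: drain the list without touching irq accounting. */
static void sched_ttwu_pending(struct rq *rq)
{
	struct task_struct *list =
		atomic_exchange(&rq->wake_list, (struct task_struct *)NULL);

	ttwu_activate_list(rq, list);
}

/* IPI path: only pay for irq_enter()/irq_exit() when work is queued. */
static int scheduler_ipi(struct rq *rq)
{
	struct task_struct *list =
		atomic_exchange(&rq->wake_list, (struct task_struct *)NULL);

	if (!list)
		return 0;

	mock_irq_enter(rq);
	ttwu_activate_list(rq, list);
	mock_irq_exit(rq);
	return 1;
}
```

Both callers detach the list with the same exchange and differ only in
whether they wrap the walk in irq_enter()/irq_exit(), which is the part
that has to stay in scheduler_ipi().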

Thanks,
Yong

>
> Thoughts?
>
> ---
>  kernel/sched.c |   18 +++++++++++++++++-
>  1 files changed, 17 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/sched.c b/kernel/sched.c
> index 2fe98ed..365ed6b 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -2554,7 +2554,23 @@ static void sched_ttwu_pending(void)
>
>  void scheduler_ipi(void)
>  {
> -       sched_ttwu_pending();
> +       struct rq *rq = this_rq();
> +       struct task_struct *list = xchg(&rq->wake_list, NULL);
> +
> +       if (!list)
> +               return;
> +
> +       irq_enter();
> +       raw_spin_lock(&rq->lock);
> +
> +       while (list) {
> +               struct task_struct *p = list;
> +               list = list->wake_entry;
> +               ttwu_do_activate(rq, p, 0);
> +       }
> +
> +       raw_spin_unlock(&rq->lock);
> +       irq_exit();
>  }
>
>  static void ttwu_queue_remote(struct task_struct *p, int cpu)
>
>
>



-- 
Only stand for myself
