From: Vincent Guittot <vincent.guittot@linaro.org>
To: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Heiner Kallweit <hkallweit1@gmail.com>,
	"Peter Zijlstra (Intel)" <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: Problem with commit 31e77c93e432 "sched/fair: Update blocked load when newly idle"
Date: Wed, 28 Mar 2018 16:01:44 +0200
Message-ID: <CAKfTPtDigSEOmtxk7daTTBb3o6kVR8AgUVhu4X6YWutK9yWNwg@mail.gmail.com>
In-Reply-To: <6b151c56-9ead-7bbe-d1b7-b0e6d69c0d7f@arm.com>

Hi,

On 28 March 2018 at 12:37, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
> Hi,
>
> On 03/24/2018 01:47 PM, Heiner Kallweit wrote:
>>
>> On 24.03.2018 at 07:46, Vincent Guittot wrote:
>>>
>>> Hi Heiner,
>>>
>>> On Friday 23 Mar 2018 at 22:28:09 (+0100), Heiner Kallweit wrote:
>>>>
>>>> Recently I started to get the following problems with linux-next:
>>>>
>>>> - When working via Putty/SSH on the system the console frequently
>>>>    freezes for a few seconds. Sometimes only opening a second console
>>>>    makes the first one react again.
>>>>
>>>> - I get "INFO: rcu_sched detected stalls on CPUs/tasks:" warnings as
>>>>    described in [1].
>>>>
>
> I can't catch this issue on my Juno r0 (arm64 big.LITTLE).
>
> root@juno:~# uname -r
> 4.16.0-rc4-00198-g31e77c93e432
>
> I'm using openssh-client and openssh-server though.

I think I have finally been able to reproduce it on my hikey (octo-core
Cortex-A53) after unplugging 6 cores and waiting for almost 2 hours.
This seems to happen only on dual-core systems, as I had not hit it
before on the hikey that I have used for my tests.
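
A minimal sketch of the hotplug step, assuming the standard sysfs
interface (/sys/devices/system/cpu/cpuN/online). The helper names and
the sysfs_root parameter are hypothetical and only there for
illustration; the exact commands used for the run below are not shown
in this mail.

```python
from pathlib import Path


def set_cpu_online(cpu, online, sysfs_root="/sys/devices/system/cpu"):
    """Write 1/0 to cpu<N>/online, the sysfs CPU hotplug control file."""
    ctl = Path(sysfs_root) / f"cpu{cpu}" / "online"
    ctl.write_text("1" if online else "0")


def offline_cpus(cpus, sysfs_root="/sys/devices/system/cpu"):
    """Take each CPU in `cpus` offline, one at a time, in the given order."""
    for cpu in cpus:
        set_cpu_online(cpu, False, sysfs_root)


# Usage on the target (needs root), leaving only CPU0 and CPU1 online:
#     offline_cpus(range(2, 8))
```

The sysfs_root argument exists only so the helpers can be pointed at a
scratch directory instead of the real /sys when trying them out.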

[  191.365730] CPU2: shutdown
[  191.368482] psci: CPU2 killed.
[  195.601017] CPU3: shutdown
[  195.603767] psci: CPU3 killed.
[  199.037500] CPU4: shutdown
[  199.040251] psci: CPU4 killed.
[  201.813237] CPU5: shutdown
[  201.815996] psci: CPU5 killed.
[  204.624902] CPU6: shutdown
[  204.627646] psci: CPU6 killed.
[  207.652478] CPU7: shutdown
[  207.655204] psci: CPU7 killed.
[ 6017.160463] INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 6017.166151] 1-...!: (4 GPs behind) idle=e20/0/0 softirq=10820/10864 fqs=0
[ 6017.173113] (detected by 0, t=20705 jiffies, g=1389, c=1388, q=27)
[ 6017.179386] Task dump for CPU 1:
[ 6017.182612] swapper/1       R  running task        0     0      1 0x00000000
[ 6017.189666] Call trace:
[ 6017.192120]  __switch_to+0x8c/0xd0
[ 6017.195524]  cpuidle_enter_state+0x64/0x360
[ 6017.199706]  cpuidle_enter+0x18/0x20
[ 6017.203282]  call_cpuidle+0x18/0x30
[ 6017.206771]  do_idle+0x1a4/0x1e0
[ 6017.209999]  cpu_startup_entry+0x20/0x28
[ 6017.213923]  secondary_start_kernel+0x188/0x1c8
[ 6017.218457] rcu_preempt kthread starved for 20705 jiffies! g1389 c1388 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=0
[ 6017.228985] rcu_preempt     I    0     8      2 0x00000000
[ 6017.234474] Call trace:
[ 6017.236918]  __switch_to+0x8c/0xd0
[ 6017.240322]  __schedule+0x1b8/0x730
[ 6017.243810]  schedule+0x38/0xa0
[ 6017.246952]  schedule_timeout+0x194/0x428
[ 6017.250964]  rcu_gp_kthread+0x4d4/0x780
[ 6017.254802]  kthread+0xfc/0x128
[ 6017.257942]  ret_from_fork+0x10/0x18
[ 6066.541736] INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 6066.547428] 1-...!: (5 GPs behind) idle=e28/0/0 softirq=10820/10864 fqs=0
[ 6066.554392] (detected by 0, t=12345 jiffies, g=1390, c=1389, q=48)
[ 6066.560666] Task dump for CPU 1:
[ 6066.563893] swapper/1       R  running task        0     0      1 0x00000000
[ 6066.570948] Call trace:
[ 6066.573404]  __switch_to+0x8c/0xd0
[ 6066.576809]  cpuidle_enter_state+0x64/0x360
[ 6066.580992]  cpuidle_enter+0x18/0x20
[ 6066.584568]  call_cpuidle+0x18/0x30
[ 6066.588056]  do_idle+0x1a4/0x1e0
[ 6066.591284]  cpu_startup_entry+0x20/0x28
[ 6066.595208]  secondary_start_kernel+0x188/0x1c8
[ 6066.599742] rcu_preempt kthread starved for 12345 jiffies! g1390 c1389 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=0
[ 6066.610270] rcu_preempt     I    0     8      2 0x00000000
[ 6066.615758] Call trace:
[ 6066.618203]  __switch_to+0x8c/0xd0
[ 6066.621607]  __schedule+0x1b8/0x730
[ 6066.625095]  schedule+0x38/0xa0
[ 6066.628236]  schedule_timeout+0x194/0x428
[ 6066.632249]  rcu_gp_kthread+0x4d4/0x780
[ 6066.636087]  kthread+0xfc/0x128
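
To read the "t=... jiffies" values in the log above as wall-clock time,
the small sketch below pulls them out of the "detected by" lines and
divides by the tick rate. The log does not state CONFIG_HZ, so hz=250
is an assumption; it looks plausible because 12345 / 250 = 49.38 s,
which is close to the gap between the two reports (6066.54 - 6017.16),
but it remains a guess. The function and variable names are made up for
the example.

```python
import re

# Matches the "(detected by N, t=... jiffies, ...)" line of an RCU stall
# report in the format shown in the log above.
STALL_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\(detected by (?P<cpu>\d+), "
    r"t=(?P<jiffies>\d+) jiffies"
)


def stall_durations(log_text, hz):
    """Return (timestamp_s, detecting_cpu, stall_seconds) per stall report.

    hz is the kernel tick rate (CONFIG_HZ); the log does not state it,
    so the caller has to supply it.
    """
    out = []
    for m in STALL_RE.finditer(log_text):
        out.append((float(m["ts"]), int(m["cpu"]), int(m["jiffies"]) / hz))
    return out


LOG = """\
[ 6017.173113] (detected by 0, t=20705 jiffies, g=1389, c=1388, q=27)
[ 6066.554392] (detected by 0, t=12345 jiffies, g=1390, c=1389, q=48)
"""

for ts, cpu, secs in stall_durations(LOG, hz=250):  # hz=250 is an assumption
    print(f"{ts:.6f}: detected by CPU{cpu}, stalled for ~{secs:.2f}s")
```

Under that assumption the two reports correspond to roughly 82.8 s and
49.4 s during which the rcu_preempt kthread did not get to run.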

>
>>>> Bisecting the issue resulted in:
>>>>
>>>> 31e77c93e432dec79c7d90b888bbfc3652592741 is the first bad commit
>>>> commit 31e77c93e432dec79c7d90b888bbfc3652592741
>>>> Author: Vincent Guittot <vincent.guittot@linaro.org>
>>>> Date:   Wed Feb 14 16:26:46 2018 +0100
>>>>
>>>>      sched/fair: Update blocked load when newly idle
>>>>
>>>>      When NEWLY_IDLE load balance is not triggered, we might need to update the
>>>>      blocked load anyway. We can kick an ilb so an idle CPU will take care of
>>>>      updating blocked load or we can try to update them locally before entering
>>>>      idle. In the latter case, we reuse part of the nohz_idle_balance.
>>>>
>>>> After reverting this commit at least the issue with the freezing console
>>>> is gone. The second one appeared only sporadically, I still have to see
>>>> whether it pops up again.
>
>
> [...]
>
