From: Dexuan Cui <decui@microsoft.com>
To: Valentin Schneider <valentin.schneider@arm.com>
Cc: "mingo@redhat.com" <mingo@redhat.com>,
"peterz@infradead.org" <peterz@infradead.org>,
"juri.lelli@redhat.com" <juri.lelli@redhat.com>,
"vincent.guittot@linaro.org" <vincent.guittot@linaro.org>,
"dietmar.eggemann@arm.com" <dietmar.eggemann@arm.com>,
"rostedt@goodmis.org" <rostedt@goodmis.org>,
"bsegall@google.com" <bsegall@google.com>,
"mgorman@suse.de" <mgorman@suse.de>,
"bristot@redhat.com" <bristot@redhat.com>,
"x86@kernel.org" <x86@kernel.org>,
"linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>,
Michael Kelley <mikelley@microsoft.com>
Subject: RE: v5.10: sched_cpu_dying() hits BUG_ON during hibernation: kernel BUG at kernel/sched/core.c:7596!
Date: Tue, 22 Dec 2020 21:44:37 +0000
Message-ID: <MW4PR21MB1857209BF0AB8C074FA4A5B6BFDF9@MW4PR21MB1857.namprd21.prod.outlook.com>
In-Reply-To: <jhjlfdqrmc6.mognet@arm.com>
> From: Valentin Schneider <valentin.schneider@arm.com>
> Sent: Tuesday, December 22, 2020 5:40 AM
> To: Dexuan Cui <decui@microsoft.com>
> Cc: mingo@redhat.com; peterz@infradead.org; juri.lelli@redhat.com;
> vincent.guittot@linaro.org; dietmar.eggemann@arm.com;
> rostedt@goodmis.org; bsegall@google.com; mgorman@suse.de;
> bristot@redhat.com; x86@kernel.org; linux-pm@vger.kernel.org;
> linux-kernel@vger.kernel.org; linux-hyperv@vger.kernel.org; Michael Kelley
> <mikelley@microsoft.com>
> Subject: Re: v5.10: sched_cpu_dying() hits BUG_ON during hibernation: kernel
> BUG at kernel/sched/core.c:7596!
>
>
> Hi,
>
> On 22/12/20 09:13, Dexuan Cui wrote:
> > Hi,
> > I'm running a Linux VM with the recent mainline (48342fc07272, 12/20/2020)
> > on Hyper-V.
> > When I test hibernation, the VM can easily hit the below BUG_ON during the
> > resume procedure (I estimate this can repro about 1/5 of the time). BTW, my
> > VM has 40 vCPUs.
> >
> > I can't repro the BUG_ON with v5.9.0, so I suspect something in v5.10.0 may
> > be broken?
> >
> > In v5.10.0, when the BUG_ON happens, rq->nr_running==2, and
> > rq->nr_pinned==0:
> >
> > 7587 int sched_cpu_dying(unsigned int cpu)
> > 7588 {
> > 7589 struct rq *rq = cpu_rq(cpu);
> > 7590 struct rq_flags rf;
> > 7591
> > 7592 /* Handle pending wakeups and then migrate everything off */
> > 7593 sched_tick_stop(cpu);
> > 7594
> > 7595 rq_lock_irqsave(rq, &rf);
> > 7596 BUG_ON(rq->nr_running != 1 || rq_has_pinned_tasks(rq));
> > 7597 rq_unlock_irqrestore(rq, &rf);
> > 7598
> > 7599 calc_load_migrate(rq);
> > 7600 update_max_interval();
> > 7601 nohz_balance_exit_idle(rq);
> > 7602 hrtick_clear(rq);
> > 7603 return 0;
> > 7604 }
> >
> > The last commit that touches the BUG_ON line is the commit
> > 3015ef4b98f5 ("sched/core: Make migrate disable and CPU hotplug
> > cooperative")
> > but the commit looks good to me.
> >
> > Any idea?
> >
>
> I'd wager this extra task is a kworker; could you give this series a try?
>
>
> https://lore.kernel.org/lkml/20201218170919.2950-1-jiangshanlai@gmail.com/

Thanks, Valentin! It looks like the patchset can fix the BUG_ON, though I see
a warning, which I reported here: https://lkml.org/lkml/2020/12/22/648
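
For anyone reading along, here is a minimal userspace sketch of the check
quoted above. It is not kernel code: struct rq_model and the model_* helpers
are hypothetical names made up for illustration, and it assumes that
rq_has_pinned_tasks() simply reports whether rq->nr_pinned is non-zero. It
only shows that the values I observed (nr_running==2, nr_pinned==0) trip the
first half of the condition, i.e. one extra runnable task is still on the
runqueue besides the one the check expects.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two struct rq fields the check reads. */
struct rq_model {
	unsigned int nr_running;	/* runnable tasks on this runqueue */
	unsigned int nr_pinned;		/* migrate-disabled tasks */
};

/* Assumption: models rq_has_pinned_tasks() as "nr_pinned is non-zero". */
static bool model_has_pinned_tasks(const struct rq_model *rq)
{
	return rq->nr_pinned != 0;
}

/* True when the condition on line 7596 would fire for these values. */
static bool model_would_bug(const struct rq_model *rq)
{
	return rq->nr_running != 1 || model_has_pinned_tasks(rq);
}
```

Calling model_would_bug() on { .nr_running = 2, .nr_pinned = 0 } returns true
purely because of nr_running, which fits Valentin's guess that the extra task
is an ordinary (not migrate-disabled) kworker.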
Thanks,
-- Dexuan