From: "Chen, Yu C" <yu.c.chen@intel.com>
To: "Rafael J. Wysocki" <rjw@rjwysocki.net>,
James Morse <james.morse@arm.com>
Cc: "linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
"H. Peter Anvin" <hpa@zytor.com>, Pavel Machek <pavel@ucw.cz>,
Borislav Petkov <bp@suse.de>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>, Len Brown <lenb@kernel.org>,
"x86@kernel.org" <x86@kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: RE: [PATCH][RFC v3] x86, hotplug: Use hlt instead of mwait if invoked from disable_nonboot_cpus
Date: Thu, 7 Jul 2016 02:50:09 +0000
Message-ID: <36DF59CE26D8EE47B0655C516E9CE6402877EBB4@shsmsx102.ccr.corp.intel.com>
In-Reply-To: <209704957.TWElLMTLfP@vostro.rjw.lan>
> -----Original Message-----
> From: Rafael J. Wysocki [mailto:rjw@rjwysocki.net]
> Sent: Thursday, July 07, 2016 8:33 AM
> To: Chen, Yu C; James Morse
> Cc: linux-pm@vger.kernel.org; Thomas Gleixner; H. Peter Anvin; Pavel Machek;
> Borislav Petkov; Peter Zijlstra; Ingo Molnar; Len Brown; x86@kernel.org;
> linux-kernel@vger.kernel.org
> Subject: Re: [PATCH][RFC v3] x86, hotplug: Use hlt instead of mwait if invoked
> from disable_nonboot_cpus
>
> On Tuesday, June 28, 2016 05:16:43 PM Chen Yu wrote:
> > A stress test from Varun Koyyalagunta reports that a nonboot CPU
> > occasionally hangs when resuming from hibernation. Further
> > investigation shows that the hang occurs when the nonboot CPU has
> > been woken up incorrectly and tries to monitor the mwait_ptr for the
> > second time: an exception is then triggered by an illegal vaddr
> > access, something like 'Unable to handle kernel address of
> > 0xffff8800ba800010...'
> >
> > Further investigation shows that this exception is caused by
> > accessing a page without the PRESENT flag, because the pte entry for
> > this vaddr is zero. Here is how the problem happens: the page table
> > for the direct mapping is allocated dynamically by
> > kernel_physical_mapping_init. During resume, while the boot CPU is
> > writing pages back to their original addresses, it may happen to
> > write to the monitored mwait_ptr and thereby wake up one of the
> > nonboot CPUs. Since the page table currently used by the nonboot CPU
> > might not be the same as it was before hibernation, an exception
> > might occur due to the inconsistent page table.
> >
> > The first attempt was to get rid of this problem by changing the
> > monitor address from task.flag to the zero page, because no one would
> > write data to the zero page. But there is still a problem because of a
> > ping-pong wake-up scenario in mwait_play_dead:
> >
> > One possible implementation of a clflush is a read-invalidate snoop,
> > which is what a store might look like, so clflush might break the mwait.
> >
> > 1. CPU1 waits at the zero page
> > 2. CPU2 clflushes the zero page, waking CPU1 up, then CPU2 waits at the zero page
> > 3. CPU1 is woken up and invokes clflush on the zero page, thus waking up CPU2 again.
> > Then the nonboot CPUs never sleep for long.
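
The ping-pong in the steps above can be illustrated with a toy user-space
model. This is only a sketch, not kernel code: it assumes, as described in
the quoted text, that a clflush on a monitored line wakes every other CPU
armed on that same line. The names simulate_wakeups() and MAX_CPUS, and the
array-of-line-ids layout, are hypothetical and introduced purely for
illustration.

```c
#include <stddef.h>

#define MAX_CPUS 64	/* hypothetical bound for this toy model */

/*
 * Toy model of the mwait_play_dead loop: each running CPU flushes its
 * monitored line, then arms the monitor and waits. Assumption: a flush
 * wakes every other CPU armed on the same line. line[i] is an abstract
 * id of the cache line monitored by CPU i. Returns the number of
 * spurious wakeups seen in at most `steps` loop iterations, or -1 on
 * bad input.
 */
int simulate_wakeups(const int *line, int ncpus, int steps)
{
	int armed[MAX_CPUS], awake[MAX_CPUS];
	int wakeups = 0;

	if (line == NULL || ncpus < 1 || ncpus > MAX_CPUS)
		return -1;

	/* CPU 0 is just entering the loop; all others already wait. */
	for (int i = 0; i < ncpus; i++) {
		awake[i] = (i == 0);
		armed[i] = (i != 0);
	}

	for (int s = 0; s < steps; s++) {
		int runner = -1;

		/* Pick the lowest-numbered awake CPU as the next runner. */
		for (int i = 0; i < ncpus && runner < 0; i++)
			if (awake[i])
				runner = i;
		if (runner < 0)
			break;	/* every CPU is asleep: no ping-pong */

		/* The runner's clflush disturbs same-line waiters. */
		for (int j = 0; j < ncpus; j++) {
			if (j != runner && armed[j] && line[j] == line[runner]) {
				armed[j] = 0;
				awake[j] = 1;
				wakeups++;
			}
		}

		/* The runner then arms its monitor and waits. */
		awake[runner] = 0;
		armed[runner] = 1;
	}
	return wakeups;
}
```

In this model, two CPUs sharing one line keep waking each other on every
iteration, while two CPUs on different lines settle after the first
iteration with no wakeups at all.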
> >
> > So it's better to monitor a different address for each nonboot CPU.
> > However, since there is only one zero page, at most
> > PAGE_SIZE/L1_CACHE_LINE CPUs can be satisfied, which is usually 64 on
> > x86_64. That is apparently not enough for servers; maybe more zero
> > pages are required.
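
The arithmetic behind the limit of 64 can be written out as a small
user-space sketch. It assumes the typical x86_64 values of a 4096-byte page
and a 64-byte cache line; the actual monitor granularity is reported by
CPUID and may differ. The helper names monitor_slots_per_page() and
pages_needed() are hypothetical and exist only for this illustration.

```c
#include <stddef.h>

/*
 * Number of distinct cache-line-sized monitor addresses in one page.
 * Returns 0 if line_size is 0 or larger than page_size.
 */
size_t monitor_slots_per_page(size_t page_size, size_t line_size)
{
	if (line_size == 0 || line_size > page_size)
		return 0;
	return page_size / line_size;
}

/*
 * Pages needed so that each of ncpus CPUs gets its own line
 * (rounded up). Returns 0 if the sizes are invalid.
 */
size_t pages_needed(size_t ncpus, size_t page_size, size_t line_size)
{
	size_t slots = monitor_slots_per_page(page_size, line_size);

	if (slots == 0)
		return 0;
	return (ncpus + slots - 1) / slots;
}
```

With these assumed sizes, monitor_slots_per_page(4096, 64) gives 64, so a
single page covers at most 64 CPUs and a 65th CPU would already need a
second page.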
> >
> > So choose a new solution, as Brian suggested: put the nonboot CPUs
> > into hlt before resume, without touching any memory during s/r.
> > Theoretically there might still be some problems if some of the CPUs
> > have already been put offline, but since that case is very rare and
> > users can work around it, we do not deal with this special case in
> > the kernel for now.
> >
> > BTW, as James mentioned, he might want to make disable_nonboot_cpus
> > arch-specific, so this patch might need a small change after that.
> >
> > Comments and suggestions would be appreciated.
> >
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=106371
> > Reported-and-tested-by: Varun Koyyalagunta <cpudebug@centtech.com>
> > Signed-off-by: Chen Yu <yu.c.chen@intel.com>
>
> Below is my version of this (untested). I did it this way because the
> issue is specific to resume from hibernation (the workaround need not be
> applied anywhere else) and the hibernate_resume_nonboot_cpu_disable()
> thing may be useful to arm64 too, if I'm not mistaken (James?).
James might want a flag to distinguish whether it is called from suspend or
resume in his arch-specific disable_nonboot_cpus?
And this patch works on my Xeon.
Tested-by: Chen Yu <yu.c.chen@intel.com>
>
> Actually, if arm64 uses it too, the __weak implementation can be dropped,
> because it will be possible to make it depend on ARCH_HIBERNATION_HEADER
> (x86 and arm64 are the only users of that).
>
> Thanks,
> Rafael
>