From: "河合英宏 / KAWAI,HIDEHIRO" <hidehiro.kawai.ez@hitachi.com>
To: "'Steven Rostedt'" <rostedt@goodmis.org>
Cc: "Jonathan Corbet" <corbet@lwn.net>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Ingo Molnar" <mingo@kernel.org>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Vivek Goyal" <vgoyal@redhat.com>,
	"Baoquan He" <bhe@redhat.com>,
	"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
	"x86@kernel.org" <x86@kernel.org>,
	"kexec@lists.infradead.org" <kexec@lists.infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Michal Hocko" <mhocko@kernel.org>,
	"Borislav Petkov" <bp@alien8.de>,
	"平松雅巳 / HIRAMATU,MASAMI" <masami.hiramatsu.pt@hitachi.com>
Subject: RE: [V5 PATCH 3/4] kexec: Fix race between panic() and crash_kexec() called directly
Date: Wed, 25 Nov 2015 06:28:14 +0000
Message-ID: <04EAB7311EE43145B2D3536183D1A84454A2A48B@GSjpTKYDCembx31.service.hitachi.net>
In-Reply-To: <20151124203555.GC6100@home.goodmis.org>

> On Fri, Nov 20, 2015 at 06:36:48PM +0900, Hidehiro Kawai wrote:
> > Currently, panic() and crash_kexec() can be called at the same time.
> > For example (x86 case):
> >
> > CPU 0:
> >   oops_end()
> >     crash_kexec()
> >       mutex_trylock() // acquired
> >         nmi_shootdown_cpus() // stop other cpus
> >
> > CPU 1:
> >   panic()
> >     crash_kexec()
> >       mutex_trylock() // failed to acquire
> >     smp_send_stop() // stop other cpus
> >     infinite loop
> >
> > If CPU 1 calls smp_send_stop() before nmi_shootdown_cpus(), kdump
> > fails.
>
> So the smp_send_stop() stops CPU 0 from calling nmi_shootdown_cpus(), right?

Yes, but the important point is that CPU 1 stops CPU 0, which is the only
CPU processing the crash_kexec() routines.

> >
> > In another case:
> >
> > CPU 0:
> >   oops_end()
> >     crash_kexec()
> >       mutex_trylock() // acquired
> >         <NMI>
> >         io_check_error()
> >           panic()
> >             crash_kexec()
> >               mutex_trylock() // failed to acquire
> >             infinite loop
> >
> > Clearly, this is an undesirable result.
>
> I'm trying to see how this patch fixes this case.
>
> >
> > To fix this problem, this patch changes crash_kexec() to exclude
> > others by using atomic_t panic_cpu.
> >
> > V5:
> > - Add missing dummy __crash_kexec() for !CONFIG_KEXEC_CORE case
> > - Replace atomic_xchg() with atomic_set() in crash_kexec() because
> >   it is used as a release operation and there is no need of memory
> >   barrier effect.  This change also removes an unused value warning
> >
> > V4:
> > - Use new __crash_kexec(), no exclusion check version of crash_kexec(),
> >   instead of checking if panic_cpu is the current cpu or not
> >
> > V2:
> > - Use atomic_cmpxchg() instead of spin_trylock() on panic_lock
> >   to exclude concurrent accesses
> > - Don't introduce no-lock version of crash_kexec()
> >
> > Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
> > Cc: Eric Biederman <ebiederm@xmission.com>
> > Cc: Vivek Goyal <vgoyal@redhat.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Michal Hocko <mhocko@kernel.org>
> > ---
> >  include/linux/kexec.h |  2 ++
> >  kernel/kexec_core.c   | 26 +++++++++++++++++++++++++-
> >  kernel/panic.c        |  4 ++--
> >  3 files changed, 29 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> > index d140b1e..7b68d27 100644
> > --- a/include/linux/kexec.h
> > +++ b/include/linux/kexec.h
> > @@ -237,6 +237,7 @@ extern int kexec_purgatory_get_set_symbol(struct kimage *image,
> >  					  unsigned int size, bool get_value);
> >  extern void *kexec_purgatory_get_symbol_addr(struct kimage *image,
> >  					     const char *name);
> > +extern void __crash_kexec(struct pt_regs *);
> >  extern void crash_kexec(struct pt_regs *);
> >  int kexec_should_crash(struct task_struct *);
> >  void crash_save_cpu(struct pt_regs *regs, int cpu);
> > @@ -332,6 +333,7 @@ int __weak arch_kexec_apply_relocations(const Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
> >  #else /* !CONFIG_KEXEC_CORE */
> >  struct pt_regs;
> >  struct task_struct;
> > +static inline void __crash_kexec(struct pt_regs *regs) { }
> >  static inline void crash_kexec(struct pt_regs *regs) { }
> >  static inline int kexec_should_crash(struct task_struct *p) { return 0; }
> >  #define kexec_in_progress false
> > diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> > index 11b64a6..9d097f5 100644
> > --- a/kernel/kexec_core.c
> > +++ b/kernel/kexec_core.c
> > @@ -853,7 +853,8 @@ struct kimage *kexec_image;
> >  struct kimage *kexec_crash_image;
> >  int kexec_load_disabled;
> >
> > -void crash_kexec(struct pt_regs *regs)
> > +/* No panic_cpu check version of crash_kexec */
> > +void __crash_kexec(struct pt_regs *regs)
> >  {
> >  	/* Take the kexec_mutex here to prevent sys_kexec_load
> >  	 * running on one cpu from replacing the crash kernel
> > @@ -876,6 +877,29 @@ void crash_kexec(struct pt_regs *regs)
> >  	}
> >  }
> >
> > +void crash_kexec(struct pt_regs *regs)
> > +{
> > +	int old_cpu, this_cpu;
> > +
> > +	/*
> > +	 * Only one CPU is allowed to execute the crash_kexec() code as with
> > +	 * panic().  Otherwise parallel calls of panic() and crash_kexec()
> > +	 * may stop each other.  To exclude them, we use panic_cpu here too.
> > +	 */
> > +	this_cpu = raw_smp_processor_id();
> > +	old_cpu = atomic_cmpxchg(&panic_cpu, -1, this_cpu);
> > +	if (old_cpu == -1) {
> > +		/* This is the 1st CPU which comes here, so go ahead. */
> > +		__crash_kexec(regs);
> > +
> > +		/*
> > +		 * Reset panic_cpu to allow another panic()/crash_kexec()
> > +		 * call.
> > +		 */
> > +		atomic_set(&panic_cpu, -1);
> > +	}
> > +}
> > +
> >  size_t crash_get_memory_size(void)
> >  {
> >  	size_t size = 0;
> > diff --git a/kernel/panic.c b/kernel/panic.c
> > index 4fce2be..5d0b807 100644
> > --- a/kernel/panic.c
> > +++ b/kernel/panic.c
> > @@ -138,7 +138,7 @@ void panic(const char *fmt, ...)
> >  	 * the "crash_kexec_post_notifiers" option to the kernel.
> >  	 */
> >  	if (!crash_kexec_post_notifiers)
> > -		crash_kexec(NULL);
> > +		__crash_kexec(NULL);
>
> Why call the __crash_kexec() version and not just crash_kexec() here.
> This needs to be documented.

This patch adds exclusive execution control to crash_kexec() by using
panic_cpu.  When crash_kexec() is called from panic(), we don't need to
check panic_cpu because panic() already holds the exclusive control.
So __crash_kexec() is used here to bypass the check.

Of course, we could call crash_kexec() here and have crash_kexec() check
whether panic_cpu is equal to the current CPU number and, if so, continue
processing the crash_kexec() routines.  An older version of this patch
series did that, but the check gave Peter the wrong impression: it seems
to imply that a recursive call of crash_kexec() is permitted (actually, a
recursive call of crash_kexec() can't happen).

Anyway, I'll add some comments.

Regards,
--
Hidehiro Kawai
Hitachi, Ltd. Research & Development Group
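The exclusion scheme discussed in the message above can be illustrated outside the kernel. The following is only a minimal userspace sketch, not the patch itself: it assumes C11 <stdatomic.h> as a stand-in for the kernel's atomic_t API, and the names owner_cpu, do_crash_work(), guarded_crash() and panic_like() are hypothetical placeholders for panic_cpu, __crash_kexec(), crash_kexec() and the relevant part of panic().

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NO_OWNER (-1)

/* Stand-in for panic_cpu: NO_OWNER means nobody holds exclusive control. */
static atomic_int owner_cpu = NO_OWNER;

/* Counts how often the unguarded worker ran (for demonstration only). */
static int work_count;

/* Hypothetical stand-in for __crash_kexec(): performs no ownership check. */
static void do_crash_work(void)
{
	work_count++;
}

/*
 * Hypothetical stand-in for crash_kexec(): only the caller that swaps
 * NO_OWNER for its own CPU number runs the worker; it then releases
 * ownership so a later call can proceed.
 */
static bool guarded_crash(int this_cpu)
{
	int expected = NO_OWNER;

	if (!atomic_compare_exchange_strong(&owner_cpu, &expected, this_cpu))
		return false;	/* another CPU already holds control */

	do_crash_work();
	atomic_store(&owner_cpu, NO_OWNER);
	return true;
}

/*
 * Hypothetical stand-in for the panic() path: it claims ownership once,
 * never releases it, and therefore calls the unguarded worker directly
 * instead of going through guarded_crash().
 */
static bool panic_like(int this_cpu)
{
	int expected = NO_OWNER;

	if (!atomic_compare_exchange_strong(&owner_cpu, &expected, this_cpu))
		return false;

	do_crash_work();
	return true;
}
```

In this sketch, once panic_like() has claimed ownership, any later guarded_crash() call from another CPU returns without running the worker, which mirrors why the panic() path can call the unguarded variant directly.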