From: Jiri Olsa <jolsa@redhat.com> To: Thomas Gleixner <tglx@linutronix.de> Cc: Prarit Bhargava <prarit@redhat.com>, linux-kernel@vger.kernel.org, Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>, x86@kernel.org, Peter Zijlstra <peterz@infradead.org>, Len Brown <len.brown@intel.com>, Borislav Petkov <bp@suse.de>, Andi Kleen <ak@linux.intel.com>, Juergen Gross <jgross@suse.com>, dyoung@redhat.com, Eric Biederman <ebiederm@xmission.com>, kexec@lists.infradead.org Subject: Re: [PATCH] arch/x86: Fix kdump on x86 with physically hotadded CPUs Date: Tue, 4 Oct 2016 14:27:17 +0200 [thread overview] Message-ID: <20161004122717.GA4998@krava> (raw) In-Reply-To: <alpine.DEB.2.20.1610041241160.7712@nanos> On Tue, Oct 04, 2016 at 12:58:04PM +0200, Thomas Gleixner wrote: > On Mon, 3 Oct 2016, Prarit Bhargava wrote: > > BUG: unable to handle kernel paging request at 0000000000841f1f > > IP: [<ffffffff81014ec4>] uncore_change_context+0xd4/0x180 > ... > > [<ffffffff81015a60>] ? uncore_cpu_starting+0x130/0x130 > > [<ffffffff81015acc>] uncore_event_cpu_online+0x6c/0x80 > > [<ffffffff8108e819>] cpuhp_invoke_callback+0x49/0x100 > > [<ffffffff8108ead1>] cpuhp_thread_fun+0x41/0x100 > > [<ffffffff810b054f>] smpboot_thread_fn+0x10f/0x160 > > [<ffffffff810b0440>] ? sort_range+0x30/0x30 > > [<ffffffff810accd8>] kthread+0xd8/0xf0 > > [<ffffffff816ff4bf>] ret_from_fork+0x1f/0x40 > > [<ffffffff810acc00>] ? kthread_park+0x60/0x60 > > > arch/x86/events/intel/uncore.c: > > 1137 static void uncore_change_type_ctx(struct intel_uncore_type *type, int old_ cpu, > > 1138 int new_cpu) > > 1139 { > > 1140 struct intel_uncore_pmu *pmu = type->pmus; > > 1141 struct intel_uncore_box *box; > > 1142 int i, pkg; > > 1143 > > 1144 pkg = topology_logical_package_id(old_cpu < 0 ? new_cpu : old_cpu); > > 1145 for (i = 0; i < type->num_boxes; i++, pmu++) { > > 1146 box = pmu->boxes[pkg]; > > > > pmu->boxes[pkg] is garbage because pkg was returned as 0xffff. > > And that's what needs to be fixed in the first place. right, I'll check on that.. but I think we need this fix as well > > > This patch adds the missing generic_processor_info() to > > prefill_possible_map() to ensure the initialization of the boot cpu is > > correct. > > > This results in smp_init_package_map() having correct data and > > properly setting the package map for the hotplugged boot cpu, which in > > turn resolves the kdump kernel panic on physically hotplugged cpus. > > While it is the right thing to initialize the package map in that case, it > still papers over a robustness issue in the uncore code, which needs to be > fixed first. > > > [2] prefill_possible_map() is called before smp_store_boot_cpu_info(). > > The comment beside the call to smp_store_boot_cpu_info() states that the > > completed call results in "Final full version of the data". > > I'm not sure what that [2] here means and I cannot figure out the meaning > of this sentence either. > > This changelog is incomprehensible in general and more a "oh look how I > decoded this problem" report than something which clearly describes the > problem at hand, the root cause and the fix. The latter wants a > understandable explanation why prefill_possible_map() is the right place to > do this. I was wondering if acpi_boot_init was a better place for that, but then Prarit suggested in our discussion that the prefill_possible_map() call seems to be a hotplug cleanup.. so it seemed to fit however it's difficult to say with complex code like this, so any ideas are welcome ;-) thanks, jirka
WARNING: multiple messages have this Message-ID (diff)
From: Jiri Olsa <jolsa@redhat.com> To: Thomas Gleixner <tglx@linutronix.de> Cc: Prarit Bhargava <prarit@redhat.com>, Len Brown <len.brown@intel.com>, Andi Kleen <ak@linux.intel.com>, Juergen Gross <jgross@suse.com>, Peter Zijlstra <peterz@infradead.org>, x86@kernel.org, kexec@lists.infradead.org, linux-kernel@vger.kernel.org, Ingo Molnar <mingo@redhat.com>, Eric Biederman <ebiederm@xmission.com>, "H. Peter Anvin" <hpa@zytor.com>, Borislav Petkov <bp@suse.de>, dyoung@redhat.com Subject: Re: [PATCH] arch/x86: Fix kdump on x86 with physically hotadded CPUs Date: Tue, 4 Oct 2016 14:27:17 +0200 [thread overview] Message-ID: <20161004122717.GA4998@krava> (raw) In-Reply-To: <alpine.DEB.2.20.1610041241160.7712@nanos> On Tue, Oct 04, 2016 at 12:58:04PM +0200, Thomas Gleixner wrote: > On Mon, 3 Oct 2016, Prarit Bhargava wrote: > > BUG: unable to handle kernel paging request at 0000000000841f1f > > IP: [<ffffffff81014ec4>] uncore_change_context+0xd4/0x180 > ... > > [<ffffffff81015a60>] ? uncore_cpu_starting+0x130/0x130 > > [<ffffffff81015acc>] uncore_event_cpu_online+0x6c/0x80 > > [<ffffffff8108e819>] cpuhp_invoke_callback+0x49/0x100 > > [<ffffffff8108ead1>] cpuhp_thread_fun+0x41/0x100 > > [<ffffffff810b054f>] smpboot_thread_fn+0x10f/0x160 > > [<ffffffff810b0440>] ? sort_range+0x30/0x30 > > [<ffffffff810accd8>] kthread+0xd8/0xf0 > > [<ffffffff816ff4bf>] ret_from_fork+0x1f/0x40 > > [<ffffffff810acc00>] ? kthread_park+0x60/0x60 > > > arch/x86/events/intel/uncore.c: > > 1137 static void uncore_change_type_ctx(struct intel_uncore_type *type, int old_ cpu, > > 1138 int new_cpu) > > 1139 { > > 1140 struct intel_uncore_pmu *pmu = type->pmus; > > 1141 struct intel_uncore_box *box; > > 1142 int i, pkg; > > 1143 > > 1144 pkg = topology_logical_package_id(old_cpu < 0 ? new_cpu : old_cpu); > > 1145 for (i = 0; i < type->num_boxes; i++, pmu++) { > > 1146 box = pmu->boxes[pkg]; > > > > pmu->boxes[pkg] is garbage because pkg was returned as 0xffff. > > And that's what needs to be fixed in the first place. right, I'll check on that.. but I think we need this fix as well > > > This patch adds the missing generic_processor_info() to > > prefill_possible_map() to ensure the initialization of the boot cpu is > > correct. > > > This results in smp_init_package_map() having correct data and > > properly setting the package map for the hotplugged boot cpu, which in > > turn resolves the kdump kernel panic on physically hotplugged cpus. > > While it is the right thing to initialize the package map in that case, it > still papers over a robustness issue in the uncore code, which needs to be > fixed first. > > > [2] prefill_possible_map() is called before smp_store_boot_cpu_info(). > > The comment beside the call to smp_store_boot_cpu_info() states that the > > completed call results in "Final full version of the data". > > I'm not sure what that [2] here means and I cannot figure out the meaning > of this sentence either. > > This changelog is incomprehensible in general and more a "oh look how I > decoded this problem" report than something which clearly describes the > problem at hand, the root cause and the fix. The latter wants a > understandable explanation why prefill_possible_map() is the right place to > do this. I was wondering if acpi_boot_init was a better place for that, but then Prarit suggested in our discussion that the prefill_possible_map() call seems to be a hotplug cleanup.. so it seemed to fit however it's difficult to say with complex code like this, so any ideas are welcome ;-) thanks, jirka _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
next prev parent reply other threads:[~2016-10-04 12:27 UTC|newest] Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top 2016-10-03 17:07 [PATCH] arch/x86: Fix kdump on x86 with physically hotadded CPUs Prarit Bhargava 2016-10-03 17:07 ` Prarit Bhargava 2016-10-03 22:22 ` Jiri Olsa 2016-10-03 22:22 ` Jiri Olsa 2016-10-04 10:58 ` Thomas Gleixner 2016-10-04 10:58 ` Thomas Gleixner 2016-10-04 12:09 ` Prarit Bhargava 2016-10-04 12:09 ` Prarit Bhargava 2016-10-04 14:38 ` Thomas Gleixner 2016-10-04 14:38 ` Thomas Gleixner 2016-10-04 16:01 ` Prarit Bhargava 2016-10-04 16:01 ` Prarit Bhargava 2016-10-05 16:14 ` Jiri Olsa 2016-10-05 16:14 ` Jiri Olsa 2016-10-06 15:25 ` Prarit Bhargava 2016-10-06 15:25 ` Prarit Bhargava 2016-10-07 6:49 ` Jiri Olsa 2016-10-07 6:49 ` Jiri Olsa 2016-10-07 8:02 ` Thomas Gleixner 2016-10-07 8:02 ` Thomas Gleixner 2016-10-04 12:27 ` Jiri Olsa [this message] 2016-10-04 12:27 ` Jiri Olsa 2016-10-04 14:19 ` Thomas Gleixner 2016-10-04 14:19 ` Thomas Gleixner 2016-10-07 13:28 ` [tip:x86/urgent] arch/x86: Handle non enumerated CPU after physical hotplug tip-bot for Prarit Bhargava
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20161004122717.GA4998@krava \ --to=jolsa@redhat.com \ --cc=ak@linux.intel.com \ --cc=bp@suse.de \ --cc=dyoung@redhat.com \ --cc=ebiederm@xmission.com \ --cc=hpa@zytor.com \ --cc=jgross@suse.com \ --cc=kexec@lists.infradead.org \ --cc=len.brown@intel.com \ --cc=linux-kernel@vger.kernel.org \ --cc=mingo@redhat.com \ --cc=peterz@infradead.org \ --cc=prarit@redhat.com \ --cc=tglx@linutronix.de \ --cc=x86@kernel.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.