All of lore.kernel.org
 help / color / mirror / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: Prarit Bhargava <prarit@redhat.com>
Cc: linux-kernel@vger.kernel.org, Ingo Molnar <mingo@redhat.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	x86@kernel.org, Peter Zijlstra <peterz@infradead.org>,
	Len Brown <len.brown@intel.com>, Borislav Petkov <bp@suse.de>,
	Andi Kleen <ak@linux.intel.com>, Jiri Olsa <jolsa@redhat.com>,
	Juergen Gross <jgross@suse.com>,
	dyoung@redhat.com, Eric Biederman <ebiederm@xmission.com>,
	kexec@lists.infradead.org
Subject: Re: [PATCH] arch/x86: Fix kdump on x86 with physically hotadded CPUs
Date: Tue, 4 Oct 2016 12:58:04 +0200 (CEST)	[thread overview]
Message-ID: <alpine.DEB.2.20.1610041241160.7712@nanos> (raw)
In-Reply-To: <1475514432-27682-1-git-send-email-prarit@redhat.com>

On Mon, 3 Oct 2016, Prarit Bhargava wrote:
> BUG: unable to handle kernel paging request at 0000000000841f1f
> IP: [<ffffffff81014ec4>] uncore_change_context+0xd4/0x180
...
>  [<ffffffff81015a60>] ? uncore_cpu_starting+0x130/0x130
>  [<ffffffff81015acc>] uncore_event_cpu_online+0x6c/0x80
>  [<ffffffff8108e819>] cpuhp_invoke_callback+0x49/0x100
>  [<ffffffff8108ead1>] cpuhp_thread_fun+0x41/0x100
>  [<ffffffff810b054f>] smpboot_thread_fn+0x10f/0x160
>  [<ffffffff810b0440>] ? sort_range+0x30/0x30
>  [<ffffffff810accd8>] kthread+0xd8/0xf0
>  [<ffffffff816ff4bf>] ret_from_fork+0x1f/0x40
>  [<ffffffff810acc00>] ? kthread_park+0x60/0x60

> arch/x86/events/intel/uncore.c:
> 1137 static void uncore_change_type_ctx(struct intel_uncore_type *type, int old_     cpu,
> 1138                                    int new_cpu)
> 1139 {
> 1140         struct intel_uncore_pmu *pmu = type->pmus;
> 1141         struct intel_uncore_box *box;
> 1142         int i, pkg;
> 1143
> 1144         pkg = topology_logical_package_id(old_cpu < 0 ? new_cpu : old_cpu);
> 1145         for (i = 0; i < type->num_boxes; i++, pmu++) {
> 1146                 box = pmu->boxes[pkg];
> 
> pmu->boxes[pkg] is garbage because pkg was returned as 0xffff.

And that's what needs to be fixed in the first place.

> This patch adds the missing generic_processor_info() to
> prefill_possible_map() to ensure the initialization of the boot cpu is
> correct. 

> This results in smp_init_package_map() having correct data and
> properly setting the package map for the hotplugged boot cpu, which in
> turn resolves the kdump kernel panic on physically hotplugged cpus.

While it is the right thing to initialize the package map in that case, it
still papers over a robustness issue in the uncore code, which needs to be
fixed first.

> [2] prefill_possible_map() is called before smp_store_boot_cpu_info().
> The comment beside the call to smp_store_boot_cpu_info() states that the
> completed call results in "Final full version of the data".

I'm not sure what that [2] here means and I cannot figure out the meaning
of this sentence either.

This changelog is incomprehensible in general and more a "oh look how I
decoded this problem" report than something which clearly describes the
problem at hand, the root cause and the fix. The latter wants a
understandable explanation why prefill_possible_map() is the right place to
do this.

> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index 4296beb8fdd3..d1272febc13b 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -1406,9 +1406,18 @@ __init void prefill_possible_map(void)
>  {
>  	int i, possible;
>  
> -	/* no processor from mptable or madt */
> -	if (!num_processors)
> -		num_processors = 1;
> +	/* No boot processor was found in mptable or ACPI MADT */
> +	if (!num_processors) {
> +		/* Make sure boot cpu is enumerated */
> +		if (apic->cpu_present_to_apicid(0) == BAD_APICID &&
> +		    apic->apic_id_valid(boot_cpu_physical_apicid))
> +			generic_processor_info(boot_cpu_physical_apicid,
> +					apic_version[boot_cpu_physical_apicid]);
> +		if (!num_processors) {
> +			pr_warn("CPU 0 not enumerated in mptable or ACPI MADT\n");
> +			num_processors = 1;

And in this case we end up with the same problem, right?

Thanks,

	tglx

WARNING: multiple messages have this Message-ID (diff)
From: Thomas Gleixner <tglx@linutronix.de>
To: Prarit Bhargava <prarit@redhat.com>
Cc: Juergen Gross <jgross@suse.com>, Len Brown <len.brown@intel.com>,
	Andi Kleen <ak@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	dyoung@redhat.com, x86@kernel.org, kexec@lists.infradead.org,
	linux-kernel@vger.kernel.org, Ingo Molnar <mingo@redhat.com>,
	Eric Biederman <ebiederm@xmission.com>,
	"H. Peter Anvin" <hpa@zytor.com>, Borislav Petkov <bp@suse.de>,
	Jiri Olsa <jolsa@redhat.com>
Subject: Re: [PATCH] arch/x86: Fix kdump on x86 with physically hotadded CPUs
Date: Tue, 4 Oct 2016 12:58:04 +0200 (CEST)	[thread overview]
Message-ID: <alpine.DEB.2.20.1610041241160.7712@nanos> (raw)
In-Reply-To: <1475514432-27682-1-git-send-email-prarit@redhat.com>

On Mon, 3 Oct 2016, Prarit Bhargava wrote:
> BUG: unable to handle kernel paging request at 0000000000841f1f
> IP: [<ffffffff81014ec4>] uncore_change_context+0xd4/0x180
...
>  [<ffffffff81015a60>] ? uncore_cpu_starting+0x130/0x130
>  [<ffffffff81015acc>] uncore_event_cpu_online+0x6c/0x80
>  [<ffffffff8108e819>] cpuhp_invoke_callback+0x49/0x100
>  [<ffffffff8108ead1>] cpuhp_thread_fun+0x41/0x100
>  [<ffffffff810b054f>] smpboot_thread_fn+0x10f/0x160
>  [<ffffffff810b0440>] ? sort_range+0x30/0x30
>  [<ffffffff810accd8>] kthread+0xd8/0xf0
>  [<ffffffff816ff4bf>] ret_from_fork+0x1f/0x40
>  [<ffffffff810acc00>] ? kthread_park+0x60/0x60

> arch/x86/events/intel/uncore.c:
> 1137 static void uncore_change_type_ctx(struct intel_uncore_type *type, int old_     cpu,
> 1138                                    int new_cpu)
> 1139 {
> 1140         struct intel_uncore_pmu *pmu = type->pmus;
> 1141         struct intel_uncore_box *box;
> 1142         int i, pkg;
> 1143
> 1144         pkg = topology_logical_package_id(old_cpu < 0 ? new_cpu : old_cpu);
> 1145         for (i = 0; i < type->num_boxes; i++, pmu++) {
> 1146                 box = pmu->boxes[pkg];
> 
> pmu->boxes[pkg] is garbage because pkg was returned as 0xffff.

And that's what needs to be fixed in the first place.

> This patch adds the missing generic_processor_info() to
> prefill_possible_map() to ensure the initialization of the boot cpu is
> correct. 

> This results in smp_init_package_map() having correct data and
> properly setting the package map for the hotplugged boot cpu, which in
> turn resolves the kdump kernel panic on physically hotplugged cpus.

While it is the right thing to initialize the package map in that case, it
still papers over a robustness issue in the uncore code, which needs to be
fixed first.

> [2] prefill_possible_map() is called before smp_store_boot_cpu_info().
> The comment beside the call to smp_store_boot_cpu_info() states that the
> completed call results in "Final full version of the data".

I'm not sure what that [2] here means and I cannot figure out the meaning
of this sentence either.

This changelog is incomprehensible in general and more a "oh look how I
decoded this problem" report than something which clearly describes the
problem at hand, the root cause and the fix. The latter wants a
understandable explanation why prefill_possible_map() is the right place to
do this.

> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index 4296beb8fdd3..d1272febc13b 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -1406,9 +1406,18 @@ __init void prefill_possible_map(void)
>  {
>  	int i, possible;
>  
> -	/* no processor from mptable or madt */
> -	if (!num_processors)
> -		num_processors = 1;
> +	/* No boot processor was found in mptable or ACPI MADT */
> +	if (!num_processors) {
> +		/* Make sure boot cpu is enumerated */
> +		if (apic->cpu_present_to_apicid(0) == BAD_APICID &&
> +		    apic->apic_id_valid(boot_cpu_physical_apicid))
> +			generic_processor_info(boot_cpu_physical_apicid,
> +					apic_version[boot_cpu_physical_apicid]);
> +		if (!num_processors) {
> +			pr_warn("CPU 0 not enumerated in mptable or ACPI MADT\n");
> +			num_processors = 1;

And in this case we end up with the same problem, right?

Thanks,

	tglx

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

  parent reply	other threads:[~2016-10-04 11:00 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-10-03 17:07 [PATCH] arch/x86: Fix kdump on x86 with physically hotadded CPUs Prarit Bhargava
2016-10-03 17:07 ` Prarit Bhargava
2016-10-03 22:22 ` Jiri Olsa
2016-10-03 22:22   ` Jiri Olsa
2016-10-04 10:58 ` Thomas Gleixner [this message]
2016-10-04 10:58   ` Thomas Gleixner
2016-10-04 12:09   ` Prarit Bhargava
2016-10-04 12:09     ` Prarit Bhargava
2016-10-04 14:38     ` Thomas Gleixner
2016-10-04 14:38       ` Thomas Gleixner
2016-10-04 16:01       ` Prarit Bhargava
2016-10-04 16:01         ` Prarit Bhargava
2016-10-05 16:14       ` Jiri Olsa
2016-10-05 16:14         ` Jiri Olsa
2016-10-06 15:25         ` Prarit Bhargava
2016-10-06 15:25           ` Prarit Bhargava
2016-10-07  6:49           ` Jiri Olsa
2016-10-07  6:49             ` Jiri Olsa
2016-10-07  8:02             ` Thomas Gleixner
2016-10-07  8:02               ` Thomas Gleixner
2016-10-04 12:27   ` Jiri Olsa
2016-10-04 12:27     ` Jiri Olsa
2016-10-04 14:19     ` Thomas Gleixner
2016-10-04 14:19       ` Thomas Gleixner
2016-10-07 13:28 ` [tip:x86/urgent] arch/x86: Handle non enumerated CPU after physical hotplug tip-bot for Prarit Bhargava

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.DEB.2.20.1610041241160.7712@nanos \
    --to=tglx@linutronix.de \
    --cc=ak@linux.intel.com \
    --cc=bp@suse.de \
    --cc=dyoung@redhat.com \
    --cc=ebiederm@xmission.com \
    --cc=hpa@zytor.com \
    --cc=jgross@suse.com \
    --cc=jolsa@redhat.com \
    --cc=kexec@lists.infradead.org \
    --cc=len.brown@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=prarit@redhat.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.