Linux-HyperV Archive on lore.kernel.org
 help / color / Atom feed
* [RESEND][PATCH v3] x86/apic/flat64: Add back the early_param("apic", parse_apic)
@ 2020-06-26 18:21 Dexuan Cui
  2020-07-13 10:31 ` Wei Liu
  2020-07-17 13:03 ` Thomas Gleixner
  0 siblings, 2 replies; 3+ messages in thread
From: Dexuan Cui @ 2020-06-26 18:21 UTC (permalink / raw)
  To: tglx, mingo, rdunlap, bp, hpa, x86, peterz, allison,
	alexios.zavras, gregkh, decui, namit, mikelley, longli
  Cc: linux-kernel, linux-hyperv

parse_apic() allows the user to try a different APIC driver than the
default one that's automatically chosen. It works for X86-32, but
doesn't work for X86-64 because it was removed in 2009 for X86-64 by
commit 7b38725318f4 ("x86: remove subarchitecture support code"),
whose changelog doesn't explicitly describe the removal for X86-64.

The patch adds back the functionality for X86-64. The intent is mainly
to work around an APIC emulation bug in Hyper-V in the case of kdump:
currently Hyper-V does not honor the disabled state of the local APICs,
so all the IOAPIC-based interrupts may not be delivered to the correct
virtual CPU, if the logical-mode APIC driver is used (the kdump
kernel usually uses the logical-mode APIC driver, since typically
only 1 CPU is active). Luckily the kdump issue can be worked around by
forcing the kdump kernel to use physical mode, before the fix to Hyper-V
becomes widely available.

The current algorithm of choosing an APIC driver is:

1. The global pointer "struct apic *apic" has a default value, i.e
"apic_default" on X86-32, and "apic_flat" on X86-64.

2. If the early_param "apic=" is specified, parse_apic() is called and
the pointer "apic" is changed if a matching APIC driver is found.

3. default_acpi_madt_oem_check() calls the acpi_madt_oem_check() method
of all APIC drivers, which may override the "apic" pointer.

4. default_setup_apic_routing() may override the "apic" pointer, e.g.
by calling the probe() method of all APIC drivers. Note: refer to the
order of the APIC drivers specified in arch/x86/kernel/apic/Makefile.

The patch is safe because if the apic= early param is not specified,
the current algorithm of choosing an APIC driver is unchanged; when the
param is specified (e.g. on X86-64, "apic=physical flat"), the kernel
still tries to find a "more suitable" APIC driver in the above step 3 and
4: e.g. if the BIOS/firmware requires that apic_x2apic_phys should be used,
the above step 4 will override the APIC driver to apic_x2apic_phys, even
if an early_param "apic=physical flat" is specified.

On Hyper-V, when a Linux VM has <= 8 virtual CPUs, if we use
"apic=physical flat", sending IPIs to multiple vCPUs is still fast because
Linux VM uses the para-virtualized IPI hypercalls: see hv_apic_init().

The patch adds the __init tag for flat_acpi_madt_oem_check() and
physflat_acpi_madt_oem_check() to avoid a warning seen with "make W=1":
flat_acpi_madt_oem_check() accesses cmdline_apic, which has a __initdata
tag.

Fixes: 7b38725318f4 ("x86: remove subarchitecture support code")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
---

Changes in v2 (5/28/2020):
  Updated Documentation/admin-guide/kernel-parameters.txt. [Randy Dunlap]
  Changed apic_set_verbosity().
  Enhanced the changelog.

Changes in v3 (5/31/2020):
  Added the __init tag for flat_acpi_madt_oem_check() and
physflat_acpi_madt_oem_check() to avoid a warning seen with "make W=1".
  (Thanks to kbuild test robot <lkp@intel.com>).

  Updated the changelog for the __init tag.

Today is 6/26/2020 and this is just a RESEND of v3, which was posted
on 5/31 (https://lkml.org/lkml/2020/5/31/198).

 .../admin-guide/kernel-parameters.txt         | 11 +++++--
 arch/x86/kernel/apic/apic.c                   | 11 +++----
 arch/x86/kernel/apic/apic_flat_64.c           | 31 +++++++++++++++++--
 3 files changed, 40 insertions(+), 13 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 7bc83f3d9bdf..c4503fff9348 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -341,10 +341,15 @@
 			Format: { quiet (default) | verbose | debug }
 			Change the amount of debugging information output
 			when initialising the APIC and IO-APIC components.
-			For X86-32, this can also be used to specify an APIC
-			driver name.
+			This can also be used to specify an APIC driver name.
 			Format: apic=driver_name
-			Examples: apic=bigsmp
+			Examples:
+			  On X86-32:  apic=bigsmp
+			  On X86-64: "apic=physical flat"
+			  Note: the available driver names depend on the
+			  architecture and the kernel config; the setting may
+			  be overridden by the acpi_madt_oem_check() and probe()
+			  methods of other APIC drivers.
 
 	apic_extnmi=	[APIC,X86] External NMI delivery setting
 			Format: { bsp (default) | all | none }
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index e53dda210cd7..6f7d75b6358b 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2855,13 +2855,10 @@ static int __init apic_set_verbosity(char *arg)
 		apic_verbosity = APIC_DEBUG;
 	else if (strcmp("verbose", arg) == 0)
 		apic_verbosity = APIC_VERBOSE;
-#ifdef CONFIG_X86_64
-	else {
-		pr_warn("APIC Verbosity level %s not recognised"
-			" use apic=verbose or apic=debug\n", arg);
-		return -EINVAL;
-	}
-#endif
+
+	/* Ignore unrecognized verbosity level setting. */
+
+	pr_info("APIC Verbosity level is %d\n", apic_verbosity);
 
 	return 0;
 }
diff --git a/arch/x86/kernel/apic/apic_flat_64.c b/arch/x86/kernel/apic/apic_flat_64.c
index 7862b152a052..da8f3640453f 100644
--- a/arch/x86/kernel/apic/apic_flat_64.c
+++ b/arch/x86/kernel/apic/apic_flat_64.c
@@ -23,9 +23,34 @@ static struct apic apic_flat;
 struct apic *apic __ro_after_init = &apic_flat;
 EXPORT_SYMBOL_GPL(apic);
 
-static int flat_acpi_madt_oem_check(char *oem_id, char *oem_table_id)
+static int cmdline_apic __initdata;
+static int __init parse_apic(char *arg)
 {
-	return 1;
+	struct apic **drv;
+
+	if (!arg)
+		return -EINVAL;
+
+	for (drv = __apicdrivers; drv < __apicdrivers_end; drv++) {
+		if (!strcmp((*drv)->name, arg)) {
+			apic = *drv;
+			cmdline_apic = 1;
+			return 0;
+		}
+	}
+
+	/* Parsed again by __setup for debug/verbose */
+	return 0;
+}
+early_param("apic", parse_apic);
+
+
+static int __init flat_acpi_madt_oem_check(char *oem_id, char *oem_table_id)
+{
+	if (!cmdline_apic)
+		return 1;
+
+	return apic == &apic_flat;
 }
 
 /*
@@ -157,7 +182,7 @@ static struct apic apic_flat __ro_after_init = {
  * We cannot use logical delivery in this case because the mask
  * overflows, so use physical mode.
  */
-static int physflat_acpi_madt_oem_check(char *oem_id, char *oem_table_id)
+static int __init physflat_acpi_madt_oem_check(char *oem_id, char *oem_table_id)
 {
 #ifdef CONFIG_ACPI
 	/*
-- 
2.19.1


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [RESEND][PATCH v3] x86/apic/flat64: Add back the early_param("apic", parse_apic)
  2020-06-26 18:21 [RESEND][PATCH v3] x86/apic/flat64: Add back the early_param("apic", parse_apic) Dexuan Cui
@ 2020-07-13 10:31 ` Wei Liu
  2020-07-17 13:03 ` Thomas Gleixner
  1 sibling, 0 replies; 3+ messages in thread
From: Wei Liu @ 2020-07-13 10:31 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: tglx, mingo, rdunlap, bp, hpa, x86, peterz, allison,
	alexios.zavras, gregkh, namit, mikelley, longli, linux-kernel,
	linux-hyperv, Wei Liu

On Fri, Jun 26, 2020 at 11:21:06AM -0700, Dexuan Cui wrote:
> parse_apic() allows the user to try a different APIC driver than the
> default one that's automatically chosen. It works for X86-32, but
> doesn't work for X86-64 because it was removed in 2009 for X86-64 by
> commit 7b38725318f4 ("x86: remove subarchitecture support code"),
> whose changelog doesn't explicitly describe the removal for X86-64.
> 
> The patch adds back the functionality for X86-64. The intent is mainly
> to work around an APIC emulation bug in Hyper-V in the case of kdump:
> currently Hyper-V does not honor the disabled state of the local APICs,
> so all the IOAPIC-based interrupts may not be delivered to the correct
> virtual CPU, if the logical-mode APIC driver is used (the kdump
> kernel usually uses the logical-mode APIC driver, since typically
> only 1 CPU is active). Luckily the kdump issue can be worked around by
> forcing the kdump kernel to use physical mode, before the fix to Hyper-V
> becomes widely available.
> 
> The current algorithm of choosing an APIC driver is:
> 
> 1. The global pointer "struct apic *apic" has a default value, i.e
> "apic_default" on X86-32, and "apic_flat" on X86-64.
> 
> 2. If the early_param "apic=" is specified, parse_apic() is called and
> the pointer "apic" is changed if a matching APIC driver is found.
> 
> 3. default_acpi_madt_oem_check() calls the acpi_madt_oem_check() method
> of all APIC drivers, which may override the "apic" pointer.
> 
> 4. default_setup_apic_routing() may override the "apic" pointer, e.g.
> by calling the probe() method of all APIC drivers. Note: refer to the
> order of the APIC drivers specified in arch/x86/kernel/apic/Makefile.
> 
> The patch is safe because if the apic= early param is not specified,
> the current algorithm of choosing an APIC driver is unchanged; when the
> param is specified (e.g. on X86-64, "apic=physical flat"), the kernel
> still tries to find a "more suitable" APIC driver in the above step 3 and
> 4: e.g. if the BIOS/firmware requires that apic_x2apic_phys should be used,
> the above step 4 will override the APIC driver to apic_x2apic_phys, even
> if an early_param "apic=physical flat" is specified.
> 
> On Hyper-V, when a Linux VM has <= 8 virtual CPUs, if we use
> "apic=physical flat", sending IPIs to multiple vCPUs is still fast because
> Linux VM uses the para-virtualized IPI hypercalls: see hv_apic_init().
> 
> The patch adds the __init tag for flat_acpi_madt_oem_check() and
> physflat_acpi_madt_oem_check() to avoid a warning seen with "make W=1":
> flat_acpi_madt_oem_check() accesses cmdline_apic, which has a __initdata
> tag.
> 
> Fixes: 7b38725318f4 ("x86: remove subarchitecture support code")
> Signed-off-by: Dexuan Cui <decui@microsoft.com>

[...]

> +static int cmdline_apic __initdata;
> +static int __init parse_apic(char *arg)
>  {
> -	return 1;
> +	struct apic **drv;
> +
> +	if (!arg)
> +		return -EINVAL;
> +
> +	for (drv = __apicdrivers; drv < __apicdrivers_end; drv++) {
> +		if (!strcmp((*drv)->name, arg)) {
> +			apic = *drv;
> +			cmdline_apic = 1;
> +			return 0;
> +		}
> +	}
> +
> +	/* Parsed again by __setup for debug/verbose */
> +	return 0;
> +}
> +early_param("apic", parse_apic);

This is a verbatim copy of the code in probe_32.c. Can you avoid the
duplication by moving the snippet from probe_32.c to apic.c?

You can declare cmdline_apic in local.h if you do that.

Wei.

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [RESEND][PATCH v3] x86/apic/flat64: Add back the early_param("apic", parse_apic)
  2020-06-26 18:21 [RESEND][PATCH v3] x86/apic/flat64: Add back the early_param("apic", parse_apic) Dexuan Cui
  2020-07-13 10:31 ` Wei Liu
@ 2020-07-17 13:03 ` Thomas Gleixner
  1 sibling, 0 replies; 3+ messages in thread
From: Thomas Gleixner @ 2020-07-17 13:03 UTC (permalink / raw)
  To: Dexuan Cui, mingo, rdunlap, bp, hpa, x86, peterz, allison,
	alexios.zavras, gregkh, decui, namit, mikelley, longli
  Cc: linux-kernel, linux-hyperv

Dexuan Cui <decui@microsoft.com> writes:

What has the early param to do with apic/flat64?

> parse_apic() allows the user to try a different APIC driver than the
> default one that's automatically chosen. It works for X86-32, but
> doesn't work for X86-64 because it was removed in 2009 for X86-64 by
> commit 7b38725318f4 ("x86: remove subarchitecture support code"),
> whose changelog doesn't explicitly describe the removal for X86-64.

Well, the patches leading to this had the clear intent to simplify the
whole APIC code and the removal of that command line override was part
of that cleanup. Up to now nothing needed it at all.

> The patch adds back the functionality for X86-64. The intent is mainly

git grep 'This patch' Documentation/process 

> to work around an APIC emulation bug in Hyper-V in the case of kdump:
> currently Hyper-V does not honor the disabled state of the local APICs,
> so all the IOAPIC-based interrupts may not be delivered to the correct
> virtual CPU, if the logical-mode APIC driver is used (the kdump
> kernel usually uses the logical-mode APIC driver, since typically
> only 1 CPU is active). Luckily the kdump issue can be worked around by
> forcing the kdump kernel to use physical mode, before the fix to Hyper-V
> becomes widely available.

This is just another proof that virtualization creates more problems
than it solves.

> The current algorithm of choosing an APIC driver is:
>
> 1. The global pointer "struct apic *apic" has a default value, i.e
> "apic_default" on X86-32, and "apic_flat" on X86-64.
>
> 2. If the early_param "apic=" is specified, parse_apic() is called and
> the pointer "apic" is changed if a matching APIC driver is found.
>
> 3. default_acpi_madt_oem_check() calls the acpi_madt_oem_check() method
> of all APIC drivers, which may override the "apic" pointer.
>
> 4. default_setup_apic_routing() may override the "apic" pointer, e.g.
> by calling the probe() method of all APIC drivers. Note: refer to the
> order of the APIC drivers specified in arch/x86/kernel/apic/Makefile.

What's the 'current algorithm'? #2 is not in use on 64bit currently.

And looking at 32bit then the above decription is wrong. Once a command
line parameter is specified and matched then #3 and #4 do not override
anything. See the probe code in 32 bit.

> On Hyper-V, when a Linux VM has <= 8 virtual CPUs, if we use
> "apic=physical flat", sending IPIs to multiple vCPUs is still fast because
> Linux VM uses the para-virtualized IPI hypercalls: see hv_apic_init().

And this is relevant because?

> The patch adds the __init tag for flat_acpi_madt_oem_check() and
> physflat_acpi_madt_oem_check() to avoid a warning seen with "make W=1":
> flat_acpi_madt_oem_check() accesses cmdline_apic, which has a __initdata
> tag.

And both function have pointers stored in the corresponding apic
structs, i.e. after init these pointers become dangling. No...

Aside of that the only function which accesses cmdline_apic is
flat_acpi_madt_oem_check() according to the patch below.

> Fixes: 7b38725318f4 ("x86: remove subarchitecture support code")

This fixes tag is blantantly wrong. You want to add:

Fixes: Broken hypervisor

When this mess was removed hyperv still wore diapers.

>  			Format: { quiet (default) | verbose | debug }
>  			Change the amount of debugging information output
>  			when initialising the APIC and IO-APIC components.
> -			For X86-32, this can also be used to specify an APIC
> -			driver name.
> +			This can also be used to specify an APIC driver name.
>  			Format: apic=driver_name
> -			Examples: apic=bigsmp
> +			Examples:
> +			  On X86-32:  apic=bigsmp
> +			  On X86-64: "apic=physical flat"
> +			  Note: the available driver names depend on the
> +			  architecture and the kernel config; the setting may
> +			  be overridden by the acpi_madt_oem_check() and probe()
> +			  methods of other APIC drivers.

Which is wrong because that's not true for 32bit.

> @@ -2855,13 +2855,10 @@ static int __init apic_set_verbosity(char *arg)
>  		apic_verbosity = APIC_DEBUG;
>  	else if (strcmp("verbose", arg) == 0)
>  		apic_verbosity = APIC_VERBOSE;
> -#ifdef CONFIG_X86_64
> -	else {
> -		pr_warn("APIC Verbosity level %s not recognised"
> -			" use apic=verbose or apic=debug\n", arg);
> -		return -EINVAL;
> -	}
> -#endif
> +
> +	/* Ignore unrecognized verbosity level setting. */
> +
> +	pr_info("APIC Verbosity level is %d\n", apic_verbosity);

We need that printk on every boot because?

>  
>  	return 0;
>  }
> diff --git a/arch/x86/kernel/apic/apic_flat_64.c b/arch/x86/kernel/apic/apic_flat_64.c
> index 7862b152a052..da8f3640453f 100644
> --- a/arch/x86/kernel/apic/apic_flat_64.c
> +++ b/arch/x86/kernel/apic/apic_flat_64.c
> @@ -23,9 +23,34 @@ static struct apic apic_flat;
>  struct apic *apic __ro_after_init = &apic_flat;
>  EXPORT_SYMBOL_GPL(apic);
>  
> -static int flat_acpi_madt_oem_check(char *oem_id, char *oem_table_id)
> +static int cmdline_apic __initdata;
> +static int __init parse_apic(char *arg)
>  {
> -	return 1;
> +	struct apic **drv;
> +
> +	if (!arg)
> +		return -EINVAL;
> +
> +	for (drv = __apicdrivers; drv < __apicdrivers_end; drv++) {
> +		if (!strcmp((*drv)->name, arg)) {
> +			apic = *drv;
> +			cmdline_apic = 1;
> +			return 0;
> +		}
> +	}
> +
> +	/* Parsed again by __setup for debug/verbose */
> +	return 0;
> +}
> +early_param("apic", parse_apic);

We surely need yet another copy of the code in probe_32.c

> +static int __init flat_acpi_madt_oem_check(char *oem_id, char *oem_table_id)
> +{
> +	if (!cmdline_apic)
> +		return 1;
> +
> +	return apic == &apic_flat;

Comments are overrated...

>  }
>  
>  /*
> @@ -157,7 +182,7 @@ static struct apic apic_flat __ro_after_init = {
>   * We cannot use logical delivery in this case because the mask
>   * overflows, so use physical mode.
>   */
> -static int physflat_acpi_madt_oem_check(char *oem_id, char *oem_table_id)
> +static int __init physflat_acpi_madt_oem_check(char *oem_id, char *oem_table_id)
>  {

See above ...

But why do you need all this if it is only relevant for a crash dump
kernel? Can't this be simply cured by using 'nolapic' on the kernel
command line?

Even if you must have the local apic for whatever reason in the kdump
kernel then this can be completely done w/o command line hackery simply
because its only required for the mshypervers + kdump case.

Thanks,

        tglx
---
--- a/arch/x86/kernel/apic/apic_flat_64.c
+++ b/arch/x86/kernel/apic/apic_flat_64.c
@@ -176,6 +176,13 @@ static int physflat_acpi_madt_oem_check(
 		return 1;
 	}
 #endif
+	/*
+	 * Workaround for a bug in the MS hypervisor which fails to reset
+	 * the APIC properly when a kernel crash jumps into the kdump
+	 * kernel.
+	 */
+	if (hypervisor_is_type(X86_HYPER_MS_HYPERV) && is_kdump_kernel())
+		return 1;
 
 	return 0;
 }

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, back to index

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-26 18:21 [RESEND][PATCH v3] x86/apic/flat64: Add back the early_param("apic", parse_apic) Dexuan Cui
2020-07-13 10:31 ` Wei Liu
2020-07-17 13:03 ` Thomas Gleixner

Linux-HyperV Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-hyperv/0 linux-hyperv/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-hyperv linux-hyperv/ https://lore.kernel.org/linux-hyperv \
		linux-hyperv@vger.kernel.org
	public-inbox-index linux-hyperv

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-hyperv


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git