* [PATCH] x86/hyperv: Enable 15-bit APIC ID if the hypervisor supports it
@ 2020-11-03  1:11 Dexuan Cui
  2020-11-03  2:04 ` Dexuan Cui
                   ` (5 more replies)
  0 siblings, 6 replies; 9+ messages in thread
From: Dexuan Cui @ 2020-11-03  1:11 UTC (permalink / raw)
  To: tglx, dwmw2, x86, decui, mikelley, linux-hyperv
  Cc: linux-kernel, Tianyu.Lan, vkuznets, kys, haiyangz, sthemmin,
	wei.liu, mingo, bp, hpa

When a Linux VM runs on Hyper-V, if the VM has CPUs with >255 APIC IDs,
the CPUs can't be the destination of IOAPIC interrupts, because the
IOAPIC RTE's Dest Field has only 8 bits. Currently the hackery driver
drivers/iommu/hyperv-iommu.c is used to ensure IOAPIC interrupts are
only routed to CPUs that don't have >255 APIC IDs. However, there is
an issue with kdump, because the kdump kernel can run on any CPU, and
hence IOAPIC interrupts can't work if the kdump kernel runs on a CPU
with a >255 APIC ID.

The kdump issue can be fixed by the Extended Dest ID, which is introduced
recently by David Woodhouse (for IOAPIC, see the field virt_destid_8_14 in
struct IO_APIC_route_entry). Of course, the Extended Dest ID needs the
support of the underlying hypervisor. The latest Hyper-V has added the
support recently: with this commit, on such a Hyper-V host, Linux VM
does not use hyperv-iommu.c because hyperv_prepare_irq_remapping()
returns -ENODEV; instead, Linux kernel's generic support of Extended Dest
ID from David is used, meaning that Linux VM is able to support up to
32K CPUs, and IOAPIC interrupts can be routed to all the CPUs.

On an old Hyper-V host that doesn't support the Extended Dest ID, nothing
changes with this commit: Linux VM is still able to bring up the CPUs with
>255 APIC IDs with the help of hyperv-iommu.c, but IOAPIC interrupts still
can not go to such CPUs, and the kdump kernel still can not work properly
on such CPUs.
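
To make the 8-bit versus 15-bit limit above concrete, here is a minimal user-space sketch of how an APIC ID could be split between the classic 8-bit Dest Field and the 7 extra bits of the Extended Dest ID. Only the field name virt_destid_8_14 comes from this thread; struct rte_dest and both helper functions are hypothetical names made up for illustration and are not the kernel's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container; only virt_destid_8_14 is named in the thread. */
struct rte_dest {
	uint8_t dest_0_7;         /* classic 8-bit Dest Field: APIC ID bits 0-7 */
	uint8_t virt_destid_8_14; /* Extended Dest ID: APIC ID bits 8-14 */
};

/*
 * Encode an APIC ID. Without the extended format only IDs 0..255 fit;
 * with it, IDs 0..32767 (15 bits, i.e. "32K CPUs") fit.
 * Returns false if the ID cannot be represented.
 */
static bool rte_encode_dest(uint32_t apic_id, bool ext_dest_id,
			    struct rte_dest *out)
{
	uint32_t max_id = ext_dest_id ? 0x7fffu : 0xffu;

	assert(out != NULL);
	if (apic_id > max_id)
		return false;

	out->dest_0_7 = (uint8_t)(apic_id & 0xffu);
	out->virt_destid_8_14 = (uint8_t)((apic_id >> 8) & 0x7fu);
	return true;
}

/* Reassemble the APIC ID from the two fields. */
static uint32_t rte_decode_dest(const struct rte_dest *d)
{
	assert(d != NULL);
	return (uint32_t)d->dest_0_7 | ((uint32_t)d->virt_destid_8_14 << 8);
}
```

For example, APIC ID 300 (0x12C) cannot be encoded without the extended format, but with it the low byte 0x2C goes into the Dest Field and the value 1 goes into virt_destid_8_14.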

Signed-off-by: Dexuan Cui <decui@microsoft.com>
---
 arch/x86/include/asm/hyperv-tlfs.h |  7 +++++++
 arch/x86/kernel/cpu/mshyperv.c     | 30 ++++++++++++++++++++++++++++++
 2 files changed, 37 insertions(+)

diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index 0ed20e8bba9e..6bf42aed387e 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -23,6 +23,13 @@
 #define HYPERV_CPUID_IMPLEMENT_LIMITS		0x40000005
 #define HYPERV_CPUID_NESTED_FEATURES		0x4000000A
 
+#define HYPERV_CPUID_VIRT_STACK_INTERFACE	0x40000081
+#define HYPERV_VS_INTERFACE_EAX_SIGNATURE	0x31235356  /* "VS#1" */
+
+#define HYPERV_CPUID_VIRT_STACK_PROPERTIES	0x40000082
+/* Support for the extended IOAPIC RTE format */
+#define HYPERV_VS_PROPERTIES_EAX_EXTENDED_IOAPIC_RTE	BIT(2)
+
 #define HYPERV_HYPERVISOR_PRESENT_BIT		0x80000000
 #define HYPERV_CPUID_MIN			0x40000005
 #define HYPERV_CPUID_MAX			0x4000ffff
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 05ef1f4550cb..cc4037d841df 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -366,9 +366,39 @@ static void __init ms_hyperv_init_platform(void)
 #endif
 }
 
+static bool __init ms_hyperv_x2apic_available(void)
+{
+	return x2apic_supported();
+}
+
+/*
+ * If ms_hyperv_msi_ext_dest_id() returns true, hyperv_prepare_irq_remapping()
+ * returns -ENODEV and the Hyper-V IOMMU driver is not used; instead, the
+ * generic support of the 15-bit APIC ID is used: see __irq_msi_compose_msg().
+ *
+ * Note: For a VM on Hyper-V, no emulated legacy device supports PCI MSI/MSI-X,
+ * and PCI MSI/MSI-X only come from the assigned physical PCIe device, and the
+ * PCI MSI/MSI-X interrupts are handled by the pci-hyperv driver. Here despite
+ * the word "msi" in the name "msi_ext_dest_id", actually the callback only
+ * affects how IOAPIC interrupts are routed.
+ */
+static bool __init ms_hyperv_msi_ext_dest_id(void)
+{
+	u32 eax;
+
+	eax = cpuid_eax(HYPERV_CPUID_VIRT_STACK_INTERFACE);
+	if (eax != HYPERV_VS_INTERFACE_EAX_SIGNATURE)
+		return false;
+
+	eax = cpuid_eax(HYPERV_CPUID_VIRT_STACK_PROPERTIES);
+	return eax & HYPERV_VS_PROPERTIES_EAX_EXTENDED_IOAPIC_RTE;
+}
+
 const __initconst struct hypervisor_x86 x86_hyper_ms_hyperv = {
 	.name			= "Microsoft Hyper-V",
 	.detect			= ms_hyperv_platform,
 	.type			= X86_HYPER_MS_HYPERV,
+	.init.x2apic_available	= ms_hyperv_x2apic_available,
+	.init.msi_ext_dest_id	= ms_hyperv_msi_ext_dest_id,
 	.init.init_platform	= ms_hyperv_init_platform,
 };
-- 
2.25.1
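
The detection logic in ms_hyperv_msi_ext_dest_id() can be exercised outside the kernel with a small sketch like the one below. It takes the two EAX values as plain parameters instead of executing CPUID, so it can be tested anywhere. The constants mirror the ones added to hyperv-tlfs.h in this patch; the function names vs_signature_from_ascii() and vs_ext_ioapic_rte_supported() are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Values copied from the patch (hyperv-tlfs.h). */
#define VS_INTERFACE_EAX_SIGNATURE            0x31235356u /* "VS#1" */
#define VS_PROPERTIES_EAX_EXTENDED_IOAPIC_RTE (1u << 2)   /* BIT(2) */

/* Pack a 4-character ASCII tag the way it appears in EAX (little-endian). */
static uint32_t vs_signature_from_ascii(const char *s)
{
	assert(s != NULL && strlen(s) == 4);
	return (uint32_t)(unsigned char)s[0] |
	       ((uint32_t)(unsigned char)s[1] << 8) |
	       ((uint32_t)(unsigned char)s[2] << 16) |
	       ((uint32_t)(unsigned char)s[3] << 24);
}

/*
 * Same decision as the patch: the properties leaf is only trusted when
 * the interface leaf carries the expected signature.
 */
static bool vs_ext_ioapic_rte_supported(uint32_t interface_eax,
					uint32_t properties_eax)
{
	if (interface_eax != VS_INTERFACE_EAX_SIGNATURE)
		return false;

	return (properties_eax & VS_PROPERTIES_EAX_EXTENDED_IOAPIC_RTE) != 0;
}
```

Checking the signature first matters because the properties leaf is only meaningful when the interface leaf identifies itself as "VS#1"; otherwise bit 2 of whatever EAX contains carries no guarantee.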


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* RE: [PATCH] x86/hyperv: Enable 15-bit APIC ID if the hypervisor supports it
  2020-11-03  1:11 [PATCH] x86/hyperv: Enable 15-bit APIC ID if the hypervisor supports it Dexuan Cui
@ 2020-11-03  2:04 ` Dexuan Cui
  2020-11-03  7:38   ` kernel test robot
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Dexuan Cui @ 2020-11-03  2:04 UTC (permalink / raw)
  To: tglx, dwmw2, x86, Michael Kelley, linux-hyperv
  Cc: linux-kernel, Tianyu Lan, vkuznets, KY Srinivasan, Haiyang Zhang,
	Stephen Hemminger, wei.liu, mingo, bp, hpa

> From: Dexuan Cui <decui@microsoft.com>
> Sent: Monday, November 2, 2020 5:12 PM

Sorry, I forgot to mention that this patch is based on tip.git's x86/apic branch.

Thanks,
-- Dexuan

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] x86/hyperv: Enable 15-bit APIC ID if the hypervisor supports it
  2020-11-03  1:11 [PATCH] x86/hyperv: Enable 15-bit APIC ID if the hypervisor supports it Dexuan Cui
@ 2020-11-03  7:38   ` kernel test robot
  2020-11-03  7:38   ` kernel test robot
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: kernel test robot @ 2020-11-03  7:38 UTC (permalink / raw)
  To: Dexuan Cui, tglx, dwmw2, x86, mikelley, linux-hyperv
  Cc: kbuild-all, linux-kernel, Tianyu.Lan, vkuznets, kys

[-- Attachment #1: Type: text/plain, Size: 2874 bytes --]

Hi Dexuan,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tip/x86/core]
[also build test ERROR on tip/master v5.10-rc2 next-20201102]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Dexuan-Cui/x86-hyperv-Enable-15-bit-APIC-ID-if-the-hypervisor-supports-it/20201103-091414
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 238c91115cd05c71447ea071624a4c9fe661f970
config: i386-randconfig-r012-20201103 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
reproduce (this is a W=1 build):
        # https://github.com/0day-ci/linux/commit/c4037b8c4cd61f970749c6685a3df5a1376193d2
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Dexuan-Cui/x86-hyperv-Enable-15-bit-APIC-ID-if-the-hypervisor-supports-it/20201103-091414
        git checkout c4037b8c4cd61f970749c6685a3df5a1376193d2
        # save the attached .config to linux build tree
        make W=1 ARCH=i386 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All error/warnings (new ones prefixed by >>):

>> arch/x86/kernel/cpu/mshyperv.c:397:8: error: 'struct x86_hyper_init' has no member named 'msi_ext_dest_id'
     397 |  .init.msi_ext_dest_id = ms_hyperv_msi_ext_dest_id,
         |        ^~~~~~~~~~~~~~~
>> arch/x86/kernel/cpu/mshyperv.c:397:26: error: initialization of 'void (*)(void)' from incompatible pointer type 'bool (*)(void)' {aka '_Bool (*)(void)'} [-Werror=incompatible-pointer-types]
     397 |  .init.msi_ext_dest_id = ms_hyperv_msi_ext_dest_id,
         |                          ^~~~~~~~~~~~~~~~~~~~~~~~~
   arch/x86/kernel/cpu/mshyperv.c:397:26: note: (near initialization for 'x86_hyper_ms_hyperv.init.init_platform')
>> arch/x86/kernel/cpu/mshyperv.c:398:24: warning: initialized field overwritten [-Woverride-init]
     398 |  .init.init_platform = ms_hyperv_init_platform,
         |                        ^~~~~~~~~~~~~~~~~~~~~~~
   arch/x86/kernel/cpu/mshyperv.c:398:24: note: (near initialization for 'x86_hyper_ms_hyperv.init.init_platform')
   cc1: some warnings being treated as errors

vim +397 arch/x86/kernel/cpu/mshyperv.c

   391	
   392	const __initconst struct hypervisor_x86 x86_hyper_ms_hyperv = {
   393		.name			= "Microsoft Hyper-V",
   394		.detect			= ms_hyperv_platform,
   395		.type			= X86_HYPER_MS_HYPERV,
   396		.init.x2apic_available	= ms_hyperv_x2apic_available,
 > 397		.init.msi_ext_dest_id	= ms_hyperv_msi_ext_dest_id,
 > 398		.init.init_platform	= ms_hyperv_init_platform,

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 26544 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] x86/hyperv: Enable 15-bit APIC ID if the hypervisor supports it
  2020-11-03  1:11 [PATCH] x86/hyperv: Enable 15-bit APIC ID if the hypervisor supports it Dexuan Cui
  2020-11-03  2:04 ` Dexuan Cui
  2020-11-03  7:38   ` kernel test robot
@ 2020-11-03  8:02 ` David Woodhouse
  2020-11-03  9:46   ` Dexuan Cui
  2020-11-03  8:22 ` [tip: x86/apic] " tip-bot2 for Dexuan Cui
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 9+ messages in thread
From: David Woodhouse @ 2020-11-03  8:02 UTC (permalink / raw)
  To: Dexuan Cui, tglx, x86, mikelley, linux-hyperv
  Cc: linux-kernel, Tianyu.Lan, vkuznets, kys, haiyangz, sthemmin,
	wei.liu, mingo, bp, hpa

[-- Attachment #1: Type: text/plain, Size: 3079 bytes --]

On Mon, 2020-11-02 at 17:11 -0800, Dexuan Cui wrote:
> When a Linux VM runs on Hyper-V, if the VM has CPUs with >255 APIC IDs,
> the CPUs can't be the destination of IOAPIC interrupts, because the
> IOAPIC RTE's Dest Field has only 8 bits. Currently the hackery driver
> drivers/iommu/hyperv-iommu.c is used to ensure IOAPIC interrupts are
> only routed to CPUs that don't have >255 APIC IDs. However, there is
> an issue with kdump, because the kdump kernel can run on any CPU, and
> hence IOAPIC interrupts can't work if the kdump kernel runs on a CPU
> with a >255 APIC ID.
> 
> The kdump issue can be fixed by the Extended Dest ID, which is introduced
> recently by David Woodhouse (for IOAPIC, see the field virt_destid_8_14 in
> struct IO_APIC_route_entry). Of course, the Extended Dest ID needs the
> support of the underlying hypervisor. The latest Hyper-V has added the
> support recently: with this commit, on such a Hyper-V host, Linux VM
> does not use hyperv-iommu.c because hyperv_prepare_irq_remapping()
> returns -ENODEV; instead, Linux kernel's generic support of Extended Dest
> ID from David is used, meaning that Linux VM is able to support up to
> 32K CPUs, and IOAPIC interrupts can be routed to all the CPUs.
> 
> On an old Hyper-V host that doesn't support the Extended Dest ID, nothing
> changes with this commit: Linux VM is still able to bring up the CPUs with
> >255 APIC IDs with the help of hyperv-iommu.c, but IOAPIC interrupts still
> can not go to such CPUs, and the kdump kernel still can not work properly
> on such CPUs.
> 
> Signed-off-by: Dexuan Cui <decui@microsoft.com>

Acked-by: David Woodhouse <dwmw@amazon.co.uk>

> +/*
> + * If ms_hyperv_msi_ext_dest_id() returns true, hyperv_prepare_irq_remapping()
> + * returns -ENODEV and the Hyper-V IOMMU driver is not used; instead, the
> + * generic support of the 15-bit APIC ID is used: see __irq_msi_compose_msg().
> + *
> + * Note: For a VM on Hyper-V, no emulated legacy device supports PCI MSI/MSI-X,
> + * and PCI MSI/MSI-X only come from the assigned physical PCIe device, and the
> + * PCI MSI/MSI-X interrupts are handled by the pci-hyperv driver. Here despite
> + * the word "msi" in the name "msi_ext_dest_id", actually the callback only
> + * affects how IOAPIC interrupts are routed.
> + */

I named it like that on purpose to make the point that the I/OAPIC is
just a device for turning line interrupts into MSIs. Some VMMs, just
like real hardware, really do implement their I/OAPIC emulation that
way. It makes a lot of sense to do so if you support interrupt
remapping.

FWIW I might have phrased your last paragraph in that comment as

  Note: for a VM on Hyper-V, the I/OAPIC is the only device which
  (logically) generates MSIs directly to the system APIC irq domain.
  There is no HPET, and PCI MSI/MSI-X interrupts are remapped by the
  pci-hyperv host bridge.

But don't bother to change it; I think I've made my point quite well
enough with https://git.kernel.org/tip/tip/c/5d5a97133 :)
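
The "I/OAPIC is just a device that generates MSIs" view can be illustrated with a rough sketch of how a 15-bit destination might be folded into the low 32 bits of an x86 MSI address. The bit positions used here (destination bits 0-7 in address bits 12-19, destination bits 8-14 in address bits 5-11, base 0xFEE in the top 12 bits) are stated as an assumption about the layout used by the generic 15-bit support, and the macro and function names are hypothetical; this is not the kernel's __irq_msi_compose_msg().

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout of the low MSI address word; names are illustrative. */
#define MSI_ADDR_BASE              0xFEE00000u
#define MSI_DESTID_0_7_SHIFT       12 /* APIC ID bits 0-7  */
#define MSI_VIRT_DESTID_8_14_SHIFT 5  /* APIC ID bits 8-14 */

/* Build the low address word for a destination APIC ID of up to 15 bits. */
static uint32_t msi_addr_lo_for(uint32_t apic_id)
{
	assert(apic_id <= 0x7fffu);
	return MSI_ADDR_BASE |
	       ((apic_id & 0xffu) << MSI_DESTID_0_7_SHIFT) |
	       (((apic_id >> 8) & 0x7fu) << MSI_VIRT_DESTID_8_14_SHIFT);
}

/* Recover the destination APIC ID from the low address word. */
static uint32_t msi_addr_lo_to_apic_id(uint32_t addr_lo)
{
	return ((addr_lo >> MSI_DESTID_0_7_SHIFT) & 0xffu) |
	       (((addr_lo >> MSI_VIRT_DESTID_8_14_SHIFT) & 0x7fu) << 8);
}
```

Under these assumptions, a destination that fits in 8 bits produces the same address as it would without the extension, since the extra field is simply zero.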

-- 
dwmw2



[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5174 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [tip: x86/apic] x86/hyperv: Enable 15-bit APIC ID if the hypervisor supports it
  2020-11-03  1:11 [PATCH] x86/hyperv: Enable 15-bit APIC ID if the hypervisor supports it Dexuan Cui
                   ` (2 preceding siblings ...)
  2020-11-03  8:02 ` David Woodhouse
@ 2020-11-03  8:22 ` tip-bot2 for Dexuan Cui
  2020-11-03  9:37 ` [PATCH] " Dexuan Cui
  2020-11-04 10:13 ` [tip: x86/apic] " tip-bot2 for Dexuan Cui
  5 siblings, 0 replies; 9+ messages in thread
From: tip-bot2 for Dexuan Cui @ 2020-11-03  8:22 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Dexuan Cui, Thomas Gleixner, David Woodhouse, x86, LKML

The following commit has been merged into the x86/apic branch of tip:

Commit-ID:     af2abc92c5ddf5fc5a2036bc106c4d9a80a4d5f7
Gitweb:        https://git.kernel.org/tip/af2abc92c5ddf5fc5a2036bc106c4d9a80a4d5f7
Author:        Dexuan Cui <decui@microsoft.com>
AuthorDate:    Mon, 02 Nov 2020 17:11:36 -08:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 03 Nov 2020 09:16:46 +01:00

x86/hyperv: Enable 15-bit APIC ID if the hypervisor supports it

When a Linux VM runs on Hyper-V, if the VM has CPUs with >255 APIC IDs,
the CPUs can't be the destination of IOAPIC interrupts, because the
IOAPIC RTE's Dest Field has only 8 bits. Currently the hackery driver
drivers/iommu/hyperv-iommu.c is used to ensure IOAPIC interrupts are
only routed to CPUs that don't have >255 APIC IDs. However, there is
an issue with kdump, because the kdump kernel can run on any CPU, and
hence IOAPIC interrupts can't work if the kdump kernel runs on a CPU
with a >255 APIC ID.

The kdump issue can be fixed by the Extended Dest ID, which is introduced
recently by David Woodhouse (for IOAPIC, see the field virt_destid_8_14 in
struct IO_APIC_route_entry). Of course, the Extended Dest ID needs the
support of the underlying hypervisor. The latest Hyper-V has added the
support recently: with this commit, on such a Hyper-V host, Linux VM
does not use hyperv-iommu.c because hyperv_prepare_irq_remapping()
returns -ENODEV; instead, Linux kernel's generic support of Extended Dest
ID from David is used, meaning that Linux VM is able to support up to
32K CPUs, and IOAPIC interrupts can be routed to all the CPUs.

On an old Hyper-V host that doesn't support the Extended Dest ID, nothing
changes with this commit: Linux VM is still able to bring up the CPUs with
can not go to such CPUs, and the kdump kernel still can not work properly
on such CPUs.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Link: https://lore.kernel.org/r/20201103011136.59108-1-decui@microsoft.com

---
 arch/x86/include/asm/hyperv-tlfs.h |  7 +++++++-
 arch/x86/kernel/cpu/mshyperv.c     | 30 +++++++++++++++++++++++++++++-
 2 files changed, 37 insertions(+)

diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index 0ed20e8..6bf42ae 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -23,6 +23,13 @@
 #define HYPERV_CPUID_IMPLEMENT_LIMITS		0x40000005
 #define HYPERV_CPUID_NESTED_FEATURES		0x4000000A
 
+#define HYPERV_CPUID_VIRT_STACK_INTERFACE	0x40000081
+#define HYPERV_VS_INTERFACE_EAX_SIGNATURE	0x31235356  /* "VS#1" */
+
+#define HYPERV_CPUID_VIRT_STACK_PROPERTIES	0x40000082
+/* Support for the extended IOAPIC RTE format */
+#define HYPERV_VS_PROPERTIES_EAX_EXTENDED_IOAPIC_RTE	BIT(2)
+
 #define HYPERV_HYPERVISOR_PRESENT_BIT		0x80000000
 #define HYPERV_CPUID_MIN			0x40000005
 #define HYPERV_CPUID_MAX			0x4000ffff
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 05ef1f4..cc4037d 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -366,9 +366,39 @@ static void __init ms_hyperv_init_platform(void)
 #endif
 }
 
+static bool __init ms_hyperv_x2apic_available(void)
+{
+	return x2apic_supported();
+}
+
+/*
+ * If ms_hyperv_msi_ext_dest_id() returns true, hyperv_prepare_irq_remapping()
+ * returns -ENODEV and the Hyper-V IOMMU driver is not used; instead, the
+ * generic support of the 15-bit APIC ID is used: see __irq_msi_compose_msg().
+ *
+ * Note: For a VM on Hyper-V, no emulated legacy device supports PCI MSI/MSI-X,
+ * and PCI MSI/MSI-X only come from the assigned physical PCIe device, and the
+ * PCI MSI/MSI-X interrupts are handled by the pci-hyperv driver. Here despite
+ * the word "msi" in the name "msi_ext_dest_id", actually the callback only
+ * affects how IOAPIC interrupts are routed.
+ */
+static bool __init ms_hyperv_msi_ext_dest_id(void)
+{
+	u32 eax;
+
+	eax = cpuid_eax(HYPERV_CPUID_VIRT_STACK_INTERFACE);
+	if (eax != HYPERV_VS_INTERFACE_EAX_SIGNATURE)
+		return false;
+
+	eax = cpuid_eax(HYPERV_CPUID_VIRT_STACK_PROPERTIES);
+	return eax & HYPERV_VS_PROPERTIES_EAX_EXTENDED_IOAPIC_RTE;
+}
+
 const __initconst struct hypervisor_x86 x86_hyper_ms_hyperv = {
 	.name			= "Microsoft Hyper-V",
 	.detect			= ms_hyperv_platform,
 	.type			= X86_HYPER_MS_HYPERV,
+	.init.x2apic_available	= ms_hyperv_x2apic_available,
+	.init.msi_ext_dest_id	= ms_hyperv_msi_ext_dest_id,
 	.init.init_platform	= ms_hyperv_init_platform,
 };

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* RE: [PATCH] x86/hyperv: Enable 15-bit APIC ID if the hypervisor supports it
  2020-11-03  1:11 [PATCH] x86/hyperv: Enable 15-bit APIC ID if the hypervisor supports it Dexuan Cui
                   ` (3 preceding siblings ...)
  2020-11-03  8:22 ` [tip: x86/apic] " tip-bot2 for Dexuan Cui
@ 2020-11-03  9:37 ` Dexuan Cui
  2020-11-04 10:13 ` [tip: x86/apic] " tip-bot2 for Dexuan Cui
  5 siblings, 0 replies; 9+ messages in thread
From: Dexuan Cui @ 2020-11-03  9:37 UTC (permalink / raw)
  To: tglx, dwmw2, x86, Michael Kelley, linux-hyperv
  Cc: linux-kernel, Tianyu Lan, vkuznets, KY Srinivasan, Haiyang Zhang,
	Stephen Hemminger, wei.liu, mingo, bp, hpa

> From: Dexuan Cui <decui@microsoft.com>
> Sent: Monday, November 2, 2020 5:12 PM
> 
> ...

Hi tglx,
Now this patch is in the x86/apic branch, which is great! Thanks for the
quick action! But the third line of the paragraph below is missing from the
commit log... Sorry, I just realized I should not have started that line with
">255 APIC IDs" -- it looks like a line is ignored if it starts with 2 chars of ">>". :-(

> On an old Hyper-V host that doesn't support the Extended Dest ID, nothing
> changes with this commit: Linux VM is still able to bring up the CPUs with
> >255 APIC IDs with the help of hyperv-iommu.c, but IOAPIC interrupts still
> can not go to such CPUs, and the kdump kernel still can not work properly
> on such CPUs.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH] x86/hyperv: Enable 15-bit APIC ID if the hypervisor supports it
  2020-11-03  8:02 ` David Woodhouse
@ 2020-11-03  9:46   ` Dexuan Cui
  0 siblings, 0 replies; 9+ messages in thread
From: Dexuan Cui @ 2020-11-03  9:46 UTC (permalink / raw)
  To: David Woodhouse, tglx, x86, Michael Kelley, linux-hyperv
  Cc: linux-kernel, Tianyu Lan, vkuznets, KY Srinivasan, Haiyang Zhang,
	Stephen Hemminger, wei.liu, mingo, bp, hpa

> From: David Woodhouse <dwmw2@infradead.org>
> Sent: Tuesday, November 3, 2020 12:03 AM
> > +/*
> > + * If ms_hyperv_msi_ext_dest_id() returns true, hyperv_prepare_irq_remapping()
> > + * returns -ENODEV and the Hyper-V IOMMU driver is not used; instead, the
> > + * generic support of the 15-bit APIC ID is used: see __irq_msi_compose_msg().
> > + *
> > + * Note: For a VM on Hyper-V, no emulated legacy device supports PCI MSI/MSI-X,
> > + * and PCI MSI/MSI-X only come from the assigned physical PCIe device, and the
> > + * PCI MSI/MSI-X interrupts are handled by the pci-hyperv driver. Here despite
> > + * the word "msi" in the name "msi_ext_dest_id", actually the callback only
> > + * affects how IOAPIC interrupts are routed.
> > + */
> 
> I named it like that on purpose to make the point that the I/OAPIC is
> just a device for turning line interrupts into MSIs. Some VMMs, just
> like real hardware, really do implement their I/OAPIC emulation that
> way. It makes a lot of sense to do so if you support interrupt
> remapping.

I totally agree.
 
> FWIW I might have phrased your last paragraph in that comment as
> 
>   Note: for a VM on Hyper-V, the I/OAPIC is the only device which
>   (logically) generates MSIs directly to the system APIC irq domain.
>   There is no HPET, and PCI MSI/MSI-X interrupts are remapped by the
>   pci-hyperv host bridge.

I agree. This version is much better.
 
> But don't bother to change it; I think I've made my point quite well
> enough with https://git.kernel.org/tip/tip/c/5d5a97133 :)
> 
> --
> dwmw2

Hi David,
This patch is already in the x86/apic branch (with a line missing in the commit
log). If possible, I hope tglx can help make the change you suggested, and add
the missing line to the commit log. :-)

Thanks,
-- Dexuan

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [tip: x86/apic] x86/hyperv: Enable 15-bit APIC ID if the hypervisor supports it
  2020-11-03  1:11 [PATCH] x86/hyperv: Enable 15-bit APIC ID if the hypervisor supports it Dexuan Cui
                   ` (4 preceding siblings ...)
  2020-11-03  9:37 ` [PATCH] " Dexuan Cui
@ 2020-11-04 10:13 ` tip-bot2 for Dexuan Cui
  5 siblings, 0 replies; 9+ messages in thread
From: tip-bot2 for Dexuan Cui @ 2020-11-04 10:13 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Dexuan Cui, Thomas Gleixner, David Woodhouse, x86, LKML

The following commit has been merged into the x86/apic branch of tip:

Commit-ID:     d981059e13ffa9ed03a73472e932d070323bd057
Gitweb:        https://git.kernel.org/tip/d981059e13ffa9ed03a73472e932d070323bd057
Author:        Dexuan Cui <decui@microsoft.com>
AuthorDate:    Mon, 02 Nov 2020 17:11:36 -08:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Wed, 04 Nov 2020 11:10:52 +01:00

x86/hyperv: Enable 15-bit APIC ID if the hypervisor supports it

When a Linux VM runs on Hyper-V, if the VM has CPUs with >255 APIC IDs,
the CPUs can't be the destination of IOAPIC interrupts, because the
IOAPIC RTE's Dest Field has only 8 bits. Currently the hackery driver
drivers/iommu/hyperv-iommu.c is used to ensure IOAPIC interrupts are
only routed to CPUs that don't have >255 APIC IDs. However, there is
an issue with kdump, because the kdump kernel can run on any CPU, and
hence IOAPIC interrupts can't work if the kdump kernel runs on a CPU
with a >255 APIC ID.

The kdump issue can be fixed by the Extended Dest ID, which is introduced
recently by David Woodhouse (for IOAPIC, see the field virt_destid_8_14 in
struct IO_APIC_route_entry). Of course, the Extended Dest ID needs the
support of the underlying hypervisor. The latest Hyper-V has added the
support recently: with this commit, on such a Hyper-V host, Linux VM
does not use hyperv-iommu.c because hyperv_prepare_irq_remapping()
returns -ENODEV; instead, Linux kernel's generic support of Extended Dest
ID from David is used, meaning that Linux VM is able to support up to
32K CPUs, and IOAPIC interrupts can be routed to all the CPUs.

On an old Hyper-V host that doesn't support the Extended Dest ID, nothing
changes with this commit: Linux VM is still able to bring up the CPUs with
> 255 APIC IDs with the help of hyperv-iommu.c, but IOAPIC interrupts still
can not go to such CPUs, and the kdump kernel still can not work properly
on such CPUs.

[ tglx: Updated comment as suggested by David ]

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Link: https://lore.kernel.org/r/20201103011136.59108-1-decui@microsoft.com
---
 arch/x86/include/asm/hyperv-tlfs.h |  7 +++++++-
 arch/x86/kernel/cpu/mshyperv.c     | 29 +++++++++++++++++++++++++++++-
 2 files changed, 36 insertions(+)

diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index 0ed20e8..6bf42ae 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -23,6 +23,13 @@
 #define HYPERV_CPUID_IMPLEMENT_LIMITS		0x40000005
 #define HYPERV_CPUID_NESTED_FEATURES		0x4000000A
 
+#define HYPERV_CPUID_VIRT_STACK_INTERFACE	0x40000081
+#define HYPERV_VS_INTERFACE_EAX_SIGNATURE	0x31235356  /* "VS#1" */
+
+#define HYPERV_CPUID_VIRT_STACK_PROPERTIES	0x40000082
+/* Support for the extended IOAPIC RTE format */
+#define HYPERV_VS_PROPERTIES_EAX_EXTENDED_IOAPIC_RTE	BIT(2)
+
 #define HYPERV_HYPERVISOR_PRESENT_BIT		0x80000000
 #define HYPERV_CPUID_MIN			0x40000005
 #define HYPERV_CPUID_MAX			0x4000ffff
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 05ef1f4..f628e3d 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -366,9 +366,38 @@ static void __init ms_hyperv_init_platform(void)
 #endif
 }
 
+static bool __init ms_hyperv_x2apic_available(void)
+{
+	return x2apic_supported();
+}
+
+/*
+ * If ms_hyperv_msi_ext_dest_id() returns true, hyperv_prepare_irq_remapping()
+ * returns -ENODEV and the Hyper-V IOMMU driver is not used; instead, the
+ * generic support of the 15-bit APIC ID is used: see __irq_msi_compose_msg().
+ *
+ * Note: for a VM on Hyper-V, the I/O-APIC is the only device which
+ * (logically) generates MSIs directly to the system APIC irq domain.
+ * There is no HPET, and PCI MSI/MSI-X interrupts are remapped by the
+ * pci-hyperv host bridge.
+ */
+static bool __init ms_hyperv_msi_ext_dest_id(void)
+{
+	u32 eax;
+
+	eax = cpuid_eax(HYPERV_CPUID_VIRT_STACK_INTERFACE);
+	if (eax != HYPERV_VS_INTERFACE_EAX_SIGNATURE)
+		return false;
+
+	eax = cpuid_eax(HYPERV_CPUID_VIRT_STACK_PROPERTIES);
+	return eax & HYPERV_VS_PROPERTIES_EAX_EXTENDED_IOAPIC_RTE;
+}
+
 const __initconst struct hypervisor_x86 x86_hyper_ms_hyperv = {
 	.name			= "Microsoft Hyper-V",
 	.detect			= ms_hyperv_platform,
 	.type			= X86_HYPER_MS_HYPERV,
+	.init.x2apic_available	= ms_hyperv_x2apic_available,
+	.init.msi_ext_dest_id	= ms_hyperv_msi_ext_dest_id,
 	.init.init_platform	= ms_hyperv_init_platform,
 };

^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-11-04 10:13 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-03  1:11 [PATCH] x86/hyperv: Enable 15-bit APIC ID if the hypervisor supports it Dexuan Cui
2020-11-03  2:04 ` Dexuan Cui
2020-11-03  7:38 ` kernel test robot
2020-11-03  7:38   ` kernel test robot
2020-11-03  8:02 ` David Woodhouse
2020-11-03  9:46   ` Dexuan Cui
2020-11-03  8:22 ` [tip: x86/apic] " tip-bot2 for Dexuan Cui
2020-11-03  9:37 ` [PATCH] " Dexuan Cui
2020-11-04 10:13 ` [tip: x86/apic] " tip-bot2 for Dexuan Cui
