* [PATCH 0/2] VMD fixes for 4.20
@ 2018-10-16 0:48 Jon Derrick
2018-10-16 0:48 ` [PATCH 1/2] PCI/VMD: Detach resources after stopping root bus Jon Derrick
2018-10-16 0:48 ` [PATCH 2/2] PCI/VMD: Set up firmware-first if capable Jon Derrick
0 siblings, 2 replies; 11+ messages in thread
From: Jon Derrick @ 2018-10-16 0:48 UTC (permalink / raw)
To: linux-pci; +Cc: Bjorn Helgaas, Lorenzo Pieralisi, Keith Busch, Jon Derrick
Hello Lorenzo, Keith, Bjorn
These are VMD fixes for 4.20
First one looks to fix an orphaned struct resource issue when removing the VMD
module
Second patch enables firmware-first error handling on a VMD domain.
Because the endpoint is somewhat divorced from the VMD domain, the driver
doesn't use a standard ACPI method of signaling firmware-first on the domain.
The idea behind firmware-first in VMD is that the endpoint will advertise its
support using the interface bit in the endpoint. The driver enables SMI system
error messages in the root port control register. The firmware is allowed to
handle the interrupt as it sees fit, but returns control to the VMD driver by
synthesizing an MSI message. The kernel native AER driver may be used for
further processing, so is left enabled. If the firmware doesn't need the native
driver, of course it can handle and clear all errors and the subsequent MSI
will be eventually dropped as it goes through the VMD ISR.
At this time, I'm unaware of how the errors will be represented in the error
status headers. In the past they could be seen as a combination of the VMD
endpoint device's bus number and the erroring device's bus number, but could be
different depending on the type of error and if the erroring endpoint was lost
(completion timeout). Currently the native driver handles these errors by
scanning every device on the bus for error status and there's no plan to change
that mechanism for future VMD devices.
Follow-up to:
https://patchwork.ozlabs.org/patch/966128/
https://patchwork.ozlabs.org/patch/964254/
Jon Derrick (2):
PCI/VMD: Detach resources after stopping root bus
PCI/VMD: Set up firmware-first if capable
drivers/pci/controller/vmd.c | 32 +++++++++++++++++++++++++++++++-
1 file changed, 31 insertions(+), 1 deletion(-)
--
1.8.3.1
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/2] PCI/VMD: Detach resources after stopping root bus
2018-10-16 0:48 [PATCH 0/2] VMD fixes for 4.20 Jon Derrick
@ 2018-10-16 0:48 ` Jon Derrick
2018-10-16 14:48 ` Keith Busch
2018-10-18 16:43 ` Lorenzo Pieralisi
2018-10-16 0:48 ` [PATCH 2/2] PCI/VMD: Set up firmware-first if capable Jon Derrick
1 sibling, 2 replies; 11+ messages in thread
From: Jon Derrick @ 2018-10-16 0:48 UTC (permalink / raw)
To: linux-pci; +Cc: Bjorn Helgaas, Lorenzo Pieralisi, Keith Busch, Jon Derrick
The VMD removal path calls pci_stop_root_bus, which tears down the pcie
tree, including detaching all of the attached drivers. During driver
detachment, devices may use pci_release_region to release resources.
This path relies on the resource being accessible in resource tree.
By detaching the child domain from the parent resource domain prior to
stopping the bus, we are preventing the list traversal from finding the
resource to be freed. If we instead detach the resource after stopping
the bus, we will have properly freed the resource and detaching is
simply accounting at that point.
Without this order, the resource is never freed and is orphaned on VMD
removal, leading to warning:
[ 181.940162] Trying to free nonexistent resource <e5a10000-e5a13fff>
Fixes 2c2c5c5cd213aea38c850bb6edc9b7f77f29802f:
"x86/PCI: VMD: Attach VMD resources to parent domain's resource tree"
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
---
drivers/pci/controller/vmd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index fd2dbd7..46ed80f 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -813,12 +813,12 @@ static void vmd_remove(struct pci_dev *dev)
{
struct vmd_dev *vmd = pci_get_drvdata(dev);
- vmd_detach_resources(vmd);
sysfs_remove_link(&vmd->dev->dev.kobj, "domain");
pci_stop_root_bus(vmd->bus);
pci_remove_root_bus(vmd->bus);
vmd_cleanup_srcu(vmd);
vmd_teardown_dma_ops(vmd);
+ vmd_detach_resources(vmd);
irq_domain_remove(vmd->irq_domain);
}
--
1.8.3.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 2/2] PCI/VMD: Set up firmware-first if capable
2018-10-16 0:48 [PATCH 0/2] VMD fixes for 4.20 Jon Derrick
2018-10-16 0:48 ` [PATCH 1/2] PCI/VMD: Detach resources after stopping root bus Jon Derrick
@ 2018-10-16 0:48 ` Jon Derrick
2018-10-16 2:25 ` kbuild test robot
2018-10-17 14:12 ` Lorenzo Pieralisi
1 sibling, 2 replies; 11+ messages in thread
From: Jon Derrick @ 2018-10-16 0:48 UTC (permalink / raw)
To: linux-pci; +Cc: Bjorn Helgaas, Lorenzo Pieralisi, Keith Busch, Jon Derrick
The VMD endpoint device exposes a domain of root ports and any
downstream devices attached to that hierarchy. VMD domains consisting of
the root ports and downstream devices are not represented in the ACPI
tables and _OSC is unsupported. Because of this non-standard way of
signaling firmware-first enabling on the root ports, the VMD device
instead advertises support for firmware-first on the root ports by
setting its interface bit to 0x1.
When firmware-first is enabled on a VMD domain, the driver sets up the
root port control registers to generate SMI system interrupts to
firmware on errors. System firmware will handle the error as it sees
fit, then passes back control to VMD with a synthesized MSI message.
Because of this kernel pass-back, the driver does not disable the native
AER port service driver attached to the VMD root ports, allowing for
further kernel error handling if desired.
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
---
drivers/pci/controller/vmd.c | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)
diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index 46ed80f..9625dca 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -589,6 +589,7 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
LIST_HEAD(resources);
resource_size_t offset[2] = {0};
resource_size_t membar2_offset = 0x2000, busn_start = 0;
+ u8 interface;
/*
* Shadow registers may exist in certain VMD device ids which allow
@@ -718,6 +719,35 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
dev_set_msi_domain(&vmd->bus->dev, vmd->irq_domain);
pci_rescan_bus(vmd->bus);
+ /*
+ * Certain VMD devices may request firmware-first error handling
+ * support on the domain. These domains are virtual and not described
+ * by ACPI and must be configured manually. VMD domains which utilize
+ * firmware-first may still require further kernel error handling, but
+ * the domain is intended to first interrupt upon error to system
+ * firmware before being passed back to the kernel. The system error
+ * handling bits in the root port control register must be enabled
+ * following the AER service driver configuration in order to generate
+ * these system interrupts.
+ *
+ * Because the root ports are not described by ACPI and _OSC is
+ * unsupported in VMD domains, the intent to use firmware-first error
+ * handling in the root ports is instead described by the VMD device's
+ * interface bit.
+ */
+ pci_read_config_byte(vmd->dev, PCI_CLASS_PROG, &interface);
+ if (interface == 0x1) {
+ struct pci_dev *rpdev;
+
+ list_for_each_entry(rpdev, &vmd->bus->devices, bus_list) {
+ if (rpdev->aer_cap)
+ pcie_capability_set_word(rpdev, PCI_EXP_RTCTL,
+ PCI_EXP_RTCTL_SECEE |
+ PCI_EXP_RTCTL_SENFEE |
+ PCI_EXP_RTCTL_SEFEE);
+ }
+ }
+
WARN(sysfs_create_link(&vmd->dev->dev.kobj, &vmd->bus->dev.kobj,
"domain"), "Can't create symlink to domain\n");
return 0;
--
1.8.3.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] PCI/VMD: Set up firmware-first if capable
2018-10-16 0:48 ` [PATCH 2/2] PCI/VMD: Set up firmware-first if capable Jon Derrick
@ 2018-10-16 2:25 ` kbuild test robot
2018-10-17 14:12 ` Lorenzo Pieralisi
1 sibling, 0 replies; 11+ messages in thread
From: kbuild test robot @ 2018-10-16 2:25 UTC (permalink / raw)
To: Jon Derrick
Cc: kbuild-all, linux-pci, Bjorn Helgaas, Lorenzo Pieralisi,
Keith Busch, Jon Derrick
[-- Attachment #1: Type: text/plain, Size: 7948 bytes --]
Hi Jon,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on pci/next]
[also build test ERROR on v4.19-rc8 next-20181012]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Jon-Derrick/VMD-fixes-for-4-20/20181016-085809
base: https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git next
config: x86_64-randconfig-x019-201841 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64
All errors (new ones prefixed by >>):
drivers/pci//controller/vmd.c: In function 'vmd_enable_domain':
>> drivers/pci//controller/vmd.c:743:15: error: 'struct pci_dev' has no member named 'aer_cap'; did you mean 'ats_cap'?
if (rpdev->aer_cap)
^~~~~~~
ats_cap
vim +743 drivers/pci//controller/vmd.c
581
582 static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
583 {
584 struct pci_sysdata *sd = &vmd->sysdata;
585 struct fwnode_handle *fn;
586 struct resource *res;
587 u32 upper_bits;
588 unsigned long flags;
589 LIST_HEAD(resources);
590 resource_size_t offset[2] = {0};
591 resource_size_t membar2_offset = 0x2000, busn_start = 0;
592 u8 interface;
593
594 /*
595 * Shadow registers may exist in certain VMD device ids which allow
596 * guests to correctly assign host physical addresses to the root ports
597 * and child devices. These registers will either return the host value
598 * or 0, depending on an enable bit in the VMD device.
599 */
600 if (features & VMD_FEAT_HAS_MEMBAR_SHADOW) {
601 u32 vmlock;
602 int ret;
603
604 membar2_offset = 0x2018;
605 ret = pci_read_config_dword(vmd->dev, PCI_REG_VMLOCK, &vmlock);
606 if (ret || vmlock == ~0)
607 return -ENODEV;
608
609 if (MB2_SHADOW_EN(vmlock)) {
610 void __iomem *membar2;
611
612 membar2 = pci_iomap(vmd->dev, VMD_MEMBAR2, 0);
613 if (!membar2)
614 return -ENOMEM;
615 offset[0] = vmd->dev->resource[VMD_MEMBAR1].start -
616 readq(membar2 + 0x2008);
617 offset[1] = vmd->dev->resource[VMD_MEMBAR2].start -
618 readq(membar2 + 0x2010);
619 pci_iounmap(vmd->dev, membar2);
620 }
621 }
622
623 /*
624 * Certain VMD devices may have a root port configuration option which
625 * limits the bus range to between 0-127 or 128-255
626 */
627 if (features & VMD_FEAT_HAS_BUS_RESTRICTIONS) {
628 u32 vmcap, vmconfig;
629
630 pci_read_config_dword(vmd->dev, PCI_REG_VMCAP, &vmcap);
631 pci_read_config_dword(vmd->dev, PCI_REG_VMCONFIG, &vmconfig);
632 if (BUS_RESTRICT_CAP(vmcap) &&
633 (BUS_RESTRICT_CFG(vmconfig) == 0x1))
634 busn_start = 128;
635 }
636
637 res = &vmd->dev->resource[VMD_CFGBAR];
638 vmd->resources[0] = (struct resource) {
639 .name = "VMD CFGBAR",
640 .start = busn_start,
641 .end = busn_start + (resource_size(res) >> 20) - 1,
642 .flags = IORESOURCE_BUS | IORESOURCE_PCI_FIXED,
643 };
644
645 /*
646 * If the window is below 4GB, clear IORESOURCE_MEM_64 so we can
647 * put 32-bit resources in the window.
648 *
649 * There's no hardware reason why a 64-bit window *couldn't*
650 * contain a 32-bit resource, but pbus_size_mem() computes the
651 * bridge window size assuming a 64-bit window will contain no
652 * 32-bit resources. __pci_assign_resource() enforces that
653 * artificial restriction to make sure everything will fit.
654 *
655 * The only way we could use a 64-bit non-prefechable MEMBAR is
656 * if its address is <4GB so that we can convert it to a 32-bit
657 * resource. To be visible to the host OS, all VMD endpoints must
658 * be initially configured by platform BIOS, which includes setting
659 * up these resources. We can assume the device is configured
660 * according to the platform needs.
661 */
662 res = &vmd->dev->resource[VMD_MEMBAR1];
663 upper_bits = upper_32_bits(res->end);
664 flags = res->flags & ~IORESOURCE_SIZEALIGN;
665 if (!upper_bits)
666 flags &= ~IORESOURCE_MEM_64;
667 vmd->resources[1] = (struct resource) {
668 .name = "VMD MEMBAR1",
669 .start = res->start,
670 .end = res->end,
671 .flags = flags,
672 .parent = res,
673 };
674
675 res = &vmd->dev->resource[VMD_MEMBAR2];
676 upper_bits = upper_32_bits(res->end);
677 flags = res->flags & ~IORESOURCE_SIZEALIGN;
678 if (!upper_bits)
679 flags &= ~IORESOURCE_MEM_64;
680 vmd->resources[2] = (struct resource) {
681 .name = "VMD MEMBAR2",
682 .start = res->start + membar2_offset,
683 .end = res->end,
684 .flags = flags,
685 .parent = res,
686 };
687
688 sd->vmd_domain = true;
689 sd->domain = vmd_find_free_domain();
690 if (sd->domain < 0)
691 return sd->domain;
692
693 sd->node = pcibus_to_node(vmd->dev->bus);
694
695 fn = irq_domain_alloc_named_id_fwnode("VMD-MSI", vmd->sysdata.domain);
696 if (!fn)
697 return -ENODEV;
698
699 vmd->irq_domain = pci_msi_create_irq_domain(fn, &vmd_msi_domain_info,
700 x86_vector_domain);
701 irq_domain_free_fwnode(fn);
702 if (!vmd->irq_domain)
703 return -ENODEV;
704
705 pci_add_resource(&resources, &vmd->resources[0]);
706 pci_add_resource_offset(&resources, &vmd->resources[1], offset[0]);
707 pci_add_resource_offset(&resources, &vmd->resources[2], offset[1]);
708
709 vmd->bus = pci_create_root_bus(&vmd->dev->dev, busn_start, &vmd_ops,
710 sd, &resources);
711 if (!vmd->bus) {
712 pci_free_resource_list(&resources);
713 irq_domain_remove(vmd->irq_domain);
714 return -ENODEV;
715 }
716
717 vmd_attach_resources(vmd);
718 vmd_setup_dma_ops(vmd);
719 dev_set_msi_domain(&vmd->bus->dev, vmd->irq_domain);
720 pci_rescan_bus(vmd->bus);
721
722 /*
723 * Certain VMD devices may request firmware-first error handling
724 * support on the domain. These domains are virtual and not described
725 * by ACPI and must be configured manually. VMD domains which utilize
726 * firmware-first may still require further kernel error handling, but
727 * the domain is intended to first interrupt upon error to system
728 * firmware before being passed back to the kernel. The system error
729 * handling bits in the root port control register must be enabled
730 * following the AER service driver configuration in order to generate
731 * these system interrupts.
732 *
733 * Because the root ports are not described by ACPI and _OSC is
734 * unsupported in VMD domains, the intent to use firmware-first error
735 * handling in the root ports is instead described by the VMD device's
736 * interface bit.
737 */
738 pci_read_config_byte(vmd->dev, PCI_CLASS_PROG, &interface);
739 if (interface == 0x1) {
740 struct pci_dev *rpdev;
741
742 list_for_each_entry(rpdev, &vmd->bus->devices, bus_list) {
> 743 if (rpdev->aer_cap)
744 pcie_capability_set_word(rpdev, PCI_EXP_RTCTL,
745 PCI_EXP_RTCTL_SECEE |
746 PCI_EXP_RTCTL_SENFEE |
747 PCI_EXP_RTCTL_SEFEE);
748 }
749 }
750
751 WARN(sysfs_create_link(&vmd->dev->dev.kobj, &vmd->bus->dev.kobj,
752 "domain"), "Can't create symlink to domain\n");
753 return 0;
754 }
755
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 34656 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] PCI/VMD: Detach resources after stopping root bus
2018-10-16 0:48 ` [PATCH 1/2] PCI/VMD: Detach resources after stopping root bus Jon Derrick
@ 2018-10-16 14:48 ` Keith Busch
2018-10-18 16:43 ` Lorenzo Pieralisi
1 sibling, 0 replies; 11+ messages in thread
From: Keith Busch @ 2018-10-16 14:48 UTC (permalink / raw)
To: Jon Derrick; +Cc: linux-pci, Bjorn Helgaas, Lorenzo Pieralisi
On Mon, Oct 15, 2018 at 05:48:07PM -0700, Jon Derrick wrote:
> The VMD removal path calls pci_stop_root_bus, which tears down the pcie
> tree, including detaching all of the attached drivers. During driver
> detachment, devices may use pci_release_region to release resources.
> This path relies on the resource being accessible in resource tree.
>
> By detaching the child domain from the parent resource domain prior to
> stopping the bus, we are preventing the list traversal from finding the
> resource to be freed. If we instead detach the resource after stopping
> the bus, we will have properly freed the resource and detaching is
> simply accounting at that point.
>
> Without this order, the resource is never freed and is orphaned on VMD
> removal, leading to warning:
> [ 181.940162] Trying to free nonexistent resource <e5a10000-e5a13fff>
>
> Fixes 2c2c5c5cd213aea38c850bb6edc9b7f77f29802f:
> "x86/PCI: VMD: Attach VMD resources to parent domain's resource tree"
> Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Looks good.
Reviewed-by: Keith Busch <keith.busch@intel.com>
> ---
> drivers/pci/controller/vmd.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> index fd2dbd7..46ed80f 100644
> --- a/drivers/pci/controller/vmd.c
> +++ b/drivers/pci/controller/vmd.c
> @@ -813,12 +813,12 @@ static void vmd_remove(struct pci_dev *dev)
> {
> struct vmd_dev *vmd = pci_get_drvdata(dev);
>
> - vmd_detach_resources(vmd);
> sysfs_remove_link(&vmd->dev->dev.kobj, "domain");
> pci_stop_root_bus(vmd->bus);
> pci_remove_root_bus(vmd->bus);
> vmd_cleanup_srcu(vmd);
> vmd_teardown_dma_ops(vmd);
> + vmd_detach_resources(vmd);
> irq_domain_remove(vmd->irq_domain);
> }
>
> --
> 1.8.3.1
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] PCI/VMD: Set up firmware-first if capable
2018-10-16 0:48 ` [PATCH 2/2] PCI/VMD: Set up firmware-first if capable Jon Derrick
2018-10-16 2:25 ` kbuild test robot
@ 2018-10-17 14:12 ` Lorenzo Pieralisi
2018-10-17 18:16 ` Derrick, Jonathan
1 sibling, 1 reply; 11+ messages in thread
From: Lorenzo Pieralisi @ 2018-10-17 14:12 UTC (permalink / raw)
To: Jon Derrick, bhelgaas; +Cc: linux-pci, Bjorn Helgaas, Keith Busch
On Mon, Oct 15, 2018 at 06:48:08PM -0600, Jon Derrick wrote:
> The VMD endpoint device exposes a domain of root ports and any
> downstream devices attached to that hierarchy. VMD domains consisting of
> the root ports and downstream devices are not represented in the ACPI
> tables and _OSC is unsupported. Because of this non-standard way of
> signaling firmware-first enabling on the root ports, the VMD device
> instead advertises support for firmware-first on the root ports by
> setting its interface bit to 0x1.
>
> When firmware-first is enabled on a VMD domain, the driver sets up the
> root port control registers to generate SMI system interrupts to
> firmware on errors. System firmware will handle the error as it sees
> fit, then passes back control to VMD with a synthesized MSI message.
>
> Because of this kernel pass-back, the driver does not disable the native
> AER port service driver attached to the VMD root ports, allowing for
> further kernel error handling if desired.
This patch looks more like a policy to detect whether the generic
Root Port system error reporting should be enabled or not, or at
least may be generalized as such.
It is contained in the VMD driver - I would like to have Keith and Bjorn
opinions before merging it, see below.
> Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
> ---
> drivers/pci/controller/vmd.c | 30 ++++++++++++++++++++++++++++++
> 1 file changed, 30 insertions(+)
>
> diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> index 46ed80f..9625dca 100644
> --- a/drivers/pci/controller/vmd.c
> +++ b/drivers/pci/controller/vmd.c
> @@ -589,6 +589,7 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
> LIST_HEAD(resources);
> resource_size_t offset[2] = {0};
> resource_size_t membar2_offset = 0x2000, busn_start = 0;
> + u8 interface;
>
> /*
> * Shadow registers may exist in certain VMD device ids which allow
> @@ -718,6 +719,35 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
> dev_set_msi_domain(&vmd->bus->dev, vmd->irq_domain);
> pci_rescan_bus(vmd->bus);
>
> + /*
> + * Certain VMD devices may request firmware-first error handling
> + * support on the domain. These domains are virtual and not described
> + * by ACPI and must be configured manually. VMD domains which utilize
> + * firmware-first may still require further kernel error handling, but
> + * the domain is intended to first interrupt upon error to system
> + * firmware before being passed back to the kernel. The system error
> + * handling bits in the root port control register must be enabled
> + * following the AER service driver configuration in order to generate
> + * these system interrupts.
> + *
> + * Because the root ports are not described by ACPI and _OSC is
> + * unsupported in VMD domains, the intent to use firmware-first error
> + * handling in the root ports is instead described by the VMD device's
> + * interface bit.
> + */
> + pci_read_config_byte(vmd->dev, PCI_CLASS_PROG, &interface);
> + if (interface == 0x1) {
> + struct pci_dev *rpdev;
> +
> + list_for_each_entry(rpdev, &vmd->bus->devices, bus_list) {
> + if (rpdev->aer_cap)
This should be CONFIG_PCIEAER guarded but I would like to understand its
logic.
IIUC this assumes all devices in the root bus are root ports, which is
an VMD assumption I reckon.
> + pcie_capability_set_word(rpdev, PCI_EXP_RTCTL,
> + PCI_EXP_RTCTL_SECEE |
> + PCI_EXP_RTCTL_SENFEE |
> + PCI_EXP_RTCTL_SEFEE);
I wonder whether this code should be part of the generic AER layer (ie
the PCIE port driver) rather than VMD specific, after all that's part of
the generic specifications, I do not know if we can envisage an API that
allow PCI controller drivers to enable/disable system errors.
Thoughts ?
Lorenzo
> + }
> + }
> +
> WARN(sysfs_create_link(&vmd->dev->dev.kobj, &vmd->bus->dev.kobj,
> "domain"), "Can't create symlink to domain\n");
> return 0;
> --
> 1.8.3.1
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] PCI/VMD: Set up firmware-first if capable
2018-10-17 14:12 ` Lorenzo Pieralisi
@ 2018-10-17 18:16 ` Derrick, Jonathan
2018-10-17 23:01 ` Bjorn Helgaas
0 siblings, 1 reply; 11+ messages in thread
From: Derrick, Jonathan @ 2018-10-17 18:16 UTC (permalink / raw)
To: lorenzo.pieralisi, bhelgaas; +Cc: linux-pci, Busch, Keith, helgaas
Hi Lorenzo,
On Wed, 2018-10-17 at 15:12 +0100, Lorenzo Pieralisi wrote:
> On Mon, Oct 15, 2018 at 06:48:08PM -0600, Jon Derrick wrote:
> > The VMD endpoint device exposes a domain of root ports and any
> > downstream devices attached to that hierarchy. VMD domains
> > consisting of
> > the root ports and downstream devices are not represented in the
> > ACPI
> > tables and _OSC is unsupported. Because of this non-standard way of
> > signaling firmware-first enabling on the root ports, the VMD device
> > instead advertises support for firmware-first on the root ports by
> > setting its interface bit to 0x1.
> >
> > When firmware-first is enabled on a VMD domain, the driver sets up
> > the
> > root port control registers to generate SMI system interrupts to
> > firmware on errors. System firmware will handle the error as it
> > sees
> > fit, then passes back control to VMD with a synthesized MSI
> > message.
> >
> > Because of this kernel pass-back, the driver does not disable the
> > native
> > AER port service driver attached to the VMD root ports, allowing
> > for
> > further kernel error handling if desired.
>
> This patch looks more like a policy to detect whether the generic
> Root Port system error reporting should be enabled or not, or at
> least may be generalized as such.
> It is contained in the VMD driver - I would like to have Keith and
> Bjorn
> opinions before merging it, see below.
It is inherently a policy decision by the BIOS.
>
> > Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
> > ---
> > drivers/pci/controller/vmd.c | 30 ++++++++++++++++++++++++++++++
> > 1 file changed, 30 insertions(+)
> >
> > diff --git a/drivers/pci/controller/vmd.c
> > b/drivers/pci/controller/vmd.c
> > index 46ed80f..9625dca 100644
> > --- a/drivers/pci/controller/vmd.c
> > +++ b/drivers/pci/controller/vmd.c
> > @@ -589,6 +589,7 @@ static int vmd_enable_domain(struct vmd_dev
> > *vmd, unsigned long features)
> > LIST_HEAD(resources);
> > resource_size_t offset[2] = {0};
> > resource_size_t membar2_offset = 0x2000, busn_start = 0;
> > + u8 interface;
> >
> > /*
> > * Shadow registers may exist in certain VMD device ids
> > which allow
> > @@ -718,6 +719,35 @@ static int vmd_enable_domain(struct vmd_dev
> > *vmd, unsigned long features)
> > dev_set_msi_domain(&vmd->bus->dev, vmd->irq_domain);
> > pci_rescan_bus(vmd->bus);
> >
> > + /*
> > + * Certain VMD devices may request firmware-first error
> > handling
> > + * support on the domain. These domains are virtual and
> > not described
> > + * by ACPI and must be configured manually. VMD domains
> > which utilize
> > + * firmware-first may still require further kernel error
> > handling, but
> > + * the domain is intended to first interrupt upon error to
> > system
> > + * firmware before being passed back to the kernel. The
> > system error
> > + * handling bits in the root port control register must be
> > enabled
> > + * following the AER service driver configuration in order
> > to generate
> > + * these system interrupts.
> > + *
> > + * Because the root ports are not described by ACPI and
> > _OSC is
> > + * unsupported in VMD domains, the intent to use firmware-
> > first error
> > + * handling in the root ports is instead described by the
> > VMD device's
> > + * interface bit.
> > + */
> > + pci_read_config_byte(vmd->dev, PCI_CLASS_PROG,
> > &interface);
> > + if (interface == 0x1) {
> > + struct pci_dev *rpdev;
> > +
> > + list_for_each_entry(rpdev, &vmd->bus->devices,
> > bus_list) {
> > + if (rpdev->aer_cap)
>
> This should be CONFIG_PCIEAER guarded but I would like to understand
> its
> logic.
I think we should consider allowing it for CONFIG_PCIEAER=n. The
firmware should be able to try and manage the error even if it occurs
when native aer isn't built-in. It would be worse in that respect if
the error went left unchecked.
> IIUC this assumes all devices in the root bus are root ports, which
> is
> an VMD assumption I reckon.
That's true for the VMD hardware so far. We don't expect any devices
besides root ports on the root bus.
I would rather drop the dev->aer_cap check, but I can add a
pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT check
>
>
> > + pcie_capability_set_word(rpdev,
> > PCI_EXP_RTCTL,
> > + PCI_EXP_R
> > TCTL_SECEE |
> > + PCI_EXP_R
> > TCTL_SENFEE |
> > + PCI_EXP_R
> > TCTL_SEFEE);
>
> I wonder whether this code should be part of the generic AER layer
> (ie
> the PCIE port driver) rather than VMD specific, after all that's part
> of
> the generic specifications, I do not know if we can envisage an API
> that
> allow PCI controller drivers to enable/disable system errors.
>
> Thoughts ?
>
> Lorenzo
I did consider this option first.
There's a call in aer.c to disable these bits, so the API need has
precendence.
We would need this API reachable by vmd.c. It seems easiest to inline
in pci.h.
>
> > + }
> > + }
> > +
> > WARN(sysfs_create_link(&vmd->dev->dev.kobj, &vmd->bus-
> > >dev.kobj,
> > "domain"), "Can't create symlink to
> > domain\n");
> > return 0;
> > --
> > 1.8.3.1
> >
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] PCI/VMD: Set up firmware-first if capable
2018-10-17 18:16 ` Derrick, Jonathan
@ 2018-10-17 23:01 ` Bjorn Helgaas
0 siblings, 0 replies; 11+ messages in thread
From: Bjorn Helgaas @ 2018-10-17 23:01 UTC (permalink / raw)
To: Derrick, Jonathan; +Cc: lorenzo.pieralisi, bhelgaas, linux-pci, Busch, Keith
[Jon, whatever you're using to respond is inserting line breaks in the
patch itself, which makes it really hard to read. I often rewrap the
text so it fits in a reasonable line length, but that's a disaster for
the code parts.]
On Wed, Oct 17, 2018 at 06:16:34PM +0000, Derrick, Jonathan wrote:
> On Wed, 2018-10-17 at 15:12 +0100, Lorenzo Pieralisi wrote:
> > On Mon, Oct 15, 2018 at 06:48:08PM -0600, Jon Derrick wrote:
> > > The VMD endpoint device exposes a domain of root ports and any
> > > downstream devices attached to that hierarchy. VMD domains
> > > consisting of the root ports and downstream devices are not
> > > represented in the ACPI tables and _OSC is unsupported. Because
> > > of this non-standard way of signaling firmware-first enabling on
> > > the root ports, the VMD device instead advertises support for
> > > firmware-first on the root ports by setting its interface bit to
> > > 0x1.
> > >
> > > When firmware-first is enabled on a VMD domain, the driver sets
> > > up the root port control registers to generate SMI system
> > > interrupts to firmware on errors. System firmware will handle
> > > the error as it sees fit, then passes back control to VMD with a
> > > synthesized MSI message.
Does this synthesized MSI look like an AER interrupt? How does AER
work in this case? If you're using the usual "firmware-first" model,
AER needs to get the error data from firmware via the ACPI APEI
framework, but I don't see anything here that tells the AER code to
use that instead of reading the AER registers directly.
Or maybe you're implementing something that's sort of *like* the ACPI
"firmware-first" model but without using any of that framework? E.g.,
the VMD Root Port system errors cause SMIs, the firmware handles the
SMI, the OS regains control (knowing nothing of the SMI) and handles
what it thinks is a "normal" AER interrupt and reads and clears the
AER registers directly.
If you're doing something like the above, the changelog should
definitely outline that, because using "firmware-first" in this
special VMD-only way that is nothing like the ACPI usage is going to
cause confusion.
> > > Because of this kernel pass-back, the driver does not disable
> > > the native AER port service driver attached to the VMD root
> > > ports, allowing for further kernel error handling if desired.
> >
> > This patch looks more like a policy to detect whether the generic
> > Root Port system error reporting should be enabled or not, or at
> > least may be generalized as such. It is contained in the VMD
> > driver - I would like to have Keith and Bjorn opinions before
> > merging it, see below.
>
> It is inherently a policy decision by the BIOS.
>
> > > Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
> > > ---
> > > drivers/pci/controller/vmd.c | 30 ++++++++++++++++++++++++++++++
> > > 1 file changed, 30 insertions(+)
> > >
> > > diff --git a/drivers/pci/controller/vmd.c
> > > b/drivers/pci/controller/vmd.c
> > > index 46ed80f..9625dca 100644
> > > --- a/drivers/pci/controller/vmd.c
> > > +++ b/drivers/pci/controller/vmd.c
> > > @@ -589,6 +589,7 @@ static int vmd_enable_domain(struct vmd_dev
> > > *vmd, unsigned long features)
> > > LIST_HEAD(resources);
> > > resource_size_t offset[2] = {0};
> > > resource_size_t membar2_offset = 0x2000, busn_start = 0;
> > > + u8 interface;
> > >
> > > /*
> > > * Shadow registers may exist in certain VMD device ids
> > > which allow
> > > @@ -718,6 +719,35 @@ static int vmd_enable_domain(struct vmd_dev
> > > *vmd, unsigned long features)
> > > dev_set_msi_domain(&vmd->bus->dev, vmd->irq_domain);
> > > pci_rescan_bus(vmd->bus);
> > >
> > > + /*
> > > + * Certain VMD devices may request firmware-first error
> > > handling
> > > + * support on the domain. These domains are virtual and
> > > not described
> > > + * by ACPI and must be configured manually. VMD domains
> > > which utilize
> > > + * firmware-first may still require further kernel error
> > > handling, but
> > > + * the domain is intended to first interrupt upon error to
> > > system
> > > + * firmware before being passed back to the kernel. The
> > > system error
> > > + * handling bits in the root port control register must be
> > > enabled
> > > + * following the AER service driver configuration in order
> > > to generate
> > > + * these system interrupts.
> > > + *
> > > + * Because the root ports are not described by ACPI and
> > > _OSC is
> > > + * unsupported in VMD domains, the intent to use firmware-
> > > first error
> > > + * handling in the root ports is instead described by the
> > > VMD device's
> > > + * interface bit.
> > > + */
> > > + pci_read_config_byte(vmd->dev, PCI_CLASS_PROG,
> > > &interface);
> > > + if (interface == 0x1) {
The previous version [1] implemented this only for
PCI_DEVICE_ID_INTEL_VMD_28C0, but I guess you decided this is safe for
all VMD devices?
And there's still no spec that says "interface == 0x1" means
"firmware-first"? I guess that's OK; it's VMD-specific code. But
this doesn't seem to be the usual ACPI firmware-first situation, so I
suspect we need a different name and/or some comment/changelog
clarification.
[1] https://lore.kernel.org/linux-pci/1535675403-2903-2-git-send-email-jonathan.derrick@intel.com
> > > + struct pci_dev *rpdev;
> > > +
> > > + list_for_each_entry(rpdev, &vmd->bus->devices,
> > > bus_list) {
> > > + if (rpdev->aer_cap)
> >
> > This should be CONFIG_PCIEAER guarded but I would like to
> > understand its logic.
>
> I think we should consider allowing it for CONFIG_PCIEAER=n. The
> firmware should be able to try and manage the error even if it
> occurs when native aer isn't built-in. It would be worse in that
> respect if the error went left unchecked.
>
> > IIUC this assumes all devices in the root bus are root ports,
> > which is an VMD assumption I reckon.
>
> That's true for the VMD hardware so far. We don't expect any devices
> besides root ports on the root bus.
>
> I would rather drop the dev->aer_cap check, but I can add a
> pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT check
That sounds sensible to me. I think you want firmware to handle these
system errors regardless of what the kernel may do later, so this
shouldn't have anything to do with AER.
I think this patch relies on the AER driver binding to the Root Ports
(which clears the System Error Enable bits) before you set those bits
again here. And I think that's a pretty safe assumption because the
AER driver cannot be a module, so portdrv/AER should always bind to
the Root Ports before pci_rescan_bus() returns:
vmd_probe
vmd_enable_domain
pci_create_root_bus
pci_rescan_bus
pci_bus_add_devices
...
aer_probe
aer_enable_rootport
pcie_capability_clear_word(..., PCI_EXP_RTCTL, ...)
pcie_capability_set_word(..., PCI_EXP_RTCTL, ...)
It's a little ugly to loop through and touch these Root Ports after
they have been added because (a) that breaks hotplug scenarios (but I
assume it's impossible to hot-add a Root Port under an existing VMD)
and (b) the core or host drivers like VMD really shouldn't touch
devices that may have been claimed by other drivers (but we "know"
portdrv is the only driver involved here).
The ideal way would be to set these bits somewhere in the
pci_device_add() path as the Root Port is being enumerated, but
there's no good hook for VMD-specific code and we'd lose the ordering
with respect to aer_enable_rootport(), so that's not practical.
> > > + pcie_capability_set_word(rpdev,
> > > PCI_EXP_RTCTL,
> > > + PCI_EXP_R
> > > TCTL_SECEE |
> > > + PCI_EXP_R
> > > TCTL_SENFEE |
> > > + PCI_EXP_R
> > > TCTL_SEFEE);
> >
> > I wonder whether this code should be part of the generic AER layer
> > (ie the PCIE port driver) rather than VMD specific, after all
> > that's part of the generic specifications, I do not know if we can
> > envisage an API that allow PCI controller drivers to
> > enable/disable system errors.
Maybe a bit in struct pci_host_bridge the VMD driver could set to say
"please don't disable my System Error interrupts"? I assume the BIOS
sets them already and the only problem is that AER turns them off
again?
I'm not sure what _OSC has to do with this. It doesn't sound like you
would want to use it to prevent the OS from using AER.
> I did consider this option first. There's a call in aer.c to
> disable these bits, so the API need has precendence.
>
> We would need this API reachable by vmd.c. It seems easiest to
> inline in pci.h.
> > > + }
> > > + }
> > > +
> > > WARN(sysfs_create_link(&vmd->dev->dev.kobj, &vmd->bus-
> > > >dev.kobj,
> > > "domain"), "Can't create symlink to
> > > domain\n");
> > > return 0;
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] PCI/VMD: Detach resources after stopping root bus
2018-10-16 0:48 ` [PATCH 1/2] PCI/VMD: Detach resources after stopping root bus Jon Derrick
2018-10-16 14:48 ` Keith Busch
@ 2018-10-18 16:43 ` Lorenzo Pieralisi
1 sibling, 0 replies; 11+ messages in thread
From: Lorenzo Pieralisi @ 2018-10-18 16:43 UTC (permalink / raw)
To: Jon Derrick; +Cc: linux-pci, Bjorn Helgaas, Keith Busch
On Mon, Oct 15, 2018 at 06:48:07PM -0600, Jon Derrick wrote:
> The VMD removal path calls pci_stop_root_bus, which tears down the pcie
> tree, including detaching all of the attached drivers. During driver
> detachment, devices may use pci_release_region to release resources.
> This path relies on the resource being accessible in resource tree.
>
> By detaching the child domain from the parent resource domain prior to
> stopping the bus, we are preventing the list traversal from finding the
> resource to be freed. If we instead detach the resource after stopping
> the bus, we will have properly freed the resource and detaching is
> simply accounting at that point.
>
> Without this order, the resource is never freed and is orphaned on VMD
> removal, leading to warning:
> [ 181.940162] Trying to free nonexistent resource <e5a10000-e5a13fff>
>
> Fixes 2c2c5c5cd213aea38c850bb6edc9b7f77f29802f:
> "x86/PCI: VMD: Attach VMD resources to parent domain's resource tree"
> Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
> ---
> drivers/pci/controller/vmd.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
I have applied this patch to pci/vmd for v4.20, thanks.
Lorenzo
> diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> index fd2dbd7..46ed80f 100644
> --- a/drivers/pci/controller/vmd.c
> +++ b/drivers/pci/controller/vmd.c
> @@ -813,12 +813,12 @@ static void vmd_remove(struct pci_dev *dev)
> {
> struct vmd_dev *vmd = pci_get_drvdata(dev);
>
> - vmd_detach_resources(vmd);
> sysfs_remove_link(&vmd->dev->dev.kobj, "domain");
> pci_stop_root_bus(vmd->bus);
> pci_remove_root_bus(vmd->bus);
> vmd_cleanup_srcu(vmd);
> vmd_teardown_dma_ops(vmd);
> + vmd_detach_resources(vmd);
> irq_domain_remove(vmd->irq_domain);
> }
>
> --
> 1.8.3.1
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] PCI/VMD: Detach resources after stopping root bus
2018-09-05 1:09 [PATCH 1/2] PCI/VMD: Detach resources after stopping root bus Jon Derrick
@ 2018-10-02 9:44 ` Lorenzo Pieralisi
0 siblings, 0 replies; 11+ messages in thread
From: Lorenzo Pieralisi @ 2018-10-02 9:44 UTC (permalink / raw)
To: Jon Derrick; +Cc: Bjorn Helgaas, Keith Busch, linux-pci
On Tue, Sep 04, 2018 at 07:09:50PM -0600, Jon Derrick wrote:
> Fixes the ugly warning:
> [ 181.940162] Trying to free nonexistent resource <e5a10000-e5a13fff>
It would be good to explain why that warning triggers and why this
patch fixes it.
Lorenzo
> Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
> ---
> drivers/pci/controller/vmd.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> index fd2dbd7..46ed80f 100644
> --- a/drivers/pci/controller/vmd.c
> +++ b/drivers/pci/controller/vmd.c
> @@ -813,12 +813,12 @@ static void vmd_remove(struct pci_dev *dev)
> {
> struct vmd_dev *vmd = pci_get_drvdata(dev);
>
> - vmd_detach_resources(vmd);
> sysfs_remove_link(&vmd->dev->dev.kobj, "domain");
> pci_stop_root_bus(vmd->bus);
> pci_remove_root_bus(vmd->bus);
> vmd_cleanup_srcu(vmd);
> vmd_teardown_dma_ops(vmd);
> + vmd_detach_resources(vmd);
> irq_domain_remove(vmd->irq_domain);
> }
>
> --
> 1.8.3.1
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/2] PCI/VMD: Detach resources after stopping root bus
@ 2018-09-05 1:09 Jon Derrick
2018-10-02 9:44 ` Lorenzo Pieralisi
0 siblings, 1 reply; 11+ messages in thread
From: Jon Derrick @ 2018-09-05 1:09 UTC (permalink / raw)
To: Bjorn Helgaas; +Cc: Keith Busch, Lorenzo Pieralisi, linux-pci, Jon Derrick
Fixes the ugly warning:
[ 181.940162] Trying to free nonexistent resource <e5a10000-e5a13fff>
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
---
drivers/pci/controller/vmd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index fd2dbd7..46ed80f 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -813,12 +813,12 @@ static void vmd_remove(struct pci_dev *dev)
{
struct vmd_dev *vmd = pci_get_drvdata(dev);
- vmd_detach_resources(vmd);
sysfs_remove_link(&vmd->dev->dev.kobj, "domain");
pci_stop_root_bus(vmd->bus);
pci_remove_root_bus(vmd->bus);
vmd_cleanup_srcu(vmd);
vmd_teardown_dma_ops(vmd);
+ vmd_detach_resources(vmd);
irq_domain_remove(vmd->irq_domain);
}
--
1.8.3.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
end of thread, other threads:[~2018-10-18 16:43 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-16 0:48 [PATCH 0/2] VMD fixes for 4.20 Jon Derrick
2018-10-16 0:48 ` [PATCH 1/2] PCI/VMD: Detach resources after stopping root bus Jon Derrick
2018-10-16 14:48 ` Keith Busch
2018-10-18 16:43 ` Lorenzo Pieralisi
2018-10-16 0:48 ` [PATCH 2/2] PCI/VMD: Set up firmware-first if capable Jon Derrick
2018-10-16 2:25 ` kbuild test robot
2018-10-17 14:12 ` Lorenzo Pieralisi
2018-10-17 18:16 ` Derrick, Jonathan
2018-10-17 23:01 ` Bjorn Helgaas
-- strict thread matches above, loose matches on Subject: below --
2018-09-05 1:09 [PATCH 1/2] PCI/VMD: Detach resources after stopping root bus Jon Derrick
2018-10-02 9:44 ` Lorenzo Pieralisi
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).