All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3] PCI: pciehp: Add quirk to handle spurious DLLSC on a x4x4 SSD
@ 2021-08-30 15:56 Jon Derrick
  2021-08-30 17:46 ` Raj, Ashok
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Jon Derrick @ 2021-08-30 15:56 UTC (permalink / raw)
  To: linux-pci, Bjorn Helgaas
  Cc: Ashok Raj, Lukas Wunner, jonathan.derrick, James Puthukattukaran,
	Jon Derrick

From: James Puthukattukaran <james.puthukattukaran@oracle.com>

When an Intel P5608 SSD is bifurcated into x4x4 mode, and the upstream
ports both support hotplugging on each respective x4 device, a slot
management system for the SSD requires both x4 slots to have power
removed via sysfs (echo 0 > slot/N/power), from the OS before it can
safely turn-off physical power for the whole x8 device. The implications
are that slot status will display powered off and link inactive statuses
for the x4 devices where the devices are actually powered until both
ports have powered off.

The issue with the SSD manifests when power is removed from the
first-half and then the second-half. During the first-half removal, slot
status shows the slot as powered-down and link-inactive, while internal
power and link remain active while waiting for the second-half to have
power removed. When power is then removed from the second-half, the
first-half starts shutdown sequence and will trigger a DLLSC event. This
is misinterpreted as an enabling event and causes the first-half to be
re-enabled.

The spurious enable can be resolved by ignoring link status change
events when no link is active when in the off state. This patch adds a
quirk for specific P5608 SSDs which have been tested for compatibility.

Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: James Puthukattukaran <james.puthukattukaran@oracle.com>
---
v2->v3: Clarified commit message and comment block
	Added second supported subdevice ID
	Added hotplug ifdef blocks

 drivers/pci/hotplug/pciehp_ctrl.c |  7 +++++++
 drivers/pci/quirks.c              | 34 +++++++++++++++++++++++++++++++
 include/linux/pci.h               |  1 +
 3 files changed, 42 insertions(+)

diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
index 529c34808440..db41f78bfac8 100644
--- a/drivers/pci/hotplug/pciehp_ctrl.c
+++ b/drivers/pci/hotplug/pciehp_ctrl.c
@@ -225,6 +225,7 @@ void pciehp_handle_disable_request(struct controller *ctrl)
 void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
 {
 	int present, link_active;
+	struct pci_dev *pdev = ctrl->pcie->port;
 
 	/*
 	 * If the slot is on and presence or link has changed, turn it off.
@@ -265,6 +266,12 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
 		cancel_delayed_work(&ctrl->button_work);
 		fallthrough;
 	case OFF_STATE:
+		if (pdev->shared_pcc_and_link_slot &&
+		    (events & PCI_EXP_SLTSTA_DLLSC) && !link_active) {
+			mutex_unlock(&ctrl->state_lock);
+			break;
+		}
+
 		ctrl->state = POWERON_STATE;
 		mutex_unlock(&ctrl->state_lock);
 		if (present)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 10122e3318cf..c6b48ddc5c3d 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5750,3 +5750,37 @@ static void apex_pci_fixup_class(struct pci_dev *pdev)
 }
 DECLARE_PCI_FIXUP_CLASS_HEADER(0x1ac1, 0x089a,
 			       PCI_CLASS_NOT_DEFINED, 8, apex_pci_fixup_class);
+
+#ifdef CONFIG_HOTPLUG_PCI_PCIE
+/*
+ * This is an Intel NVMe SSD which sits in a x8 pciehp slot but is bifurcated
+ * as a x4x4 and manifests as two slots with respect to PCIe hot plug register
+ * states. However, the hotplug controller treats these slots as a single x8
+ * slot for link and power. Either one of the two slots can be powered down
+ * separately and the slot status will show negative power and link states, but
+ * internal power and link will be active until the last of the two slots is
+ * powered down. When the last of the two x4 slots is turned off, power and
+ * link will be turned off for the x8 slot by the HP controller. This
+ * configuration causes some interesting behavior in bringup sequence
+ *
+ * When the second slot is powered off to remove the card, this will cause the
+ * link to go down for both x4 slots. So, the x4 that is already powered down
+ * earlier will see a DLLSC event and attempt to bring itself up (card present,
+ * link change event, link state is down). Special handling is required in
+ * pciehp_handle_presence_or_link_change to prevent this unintended bring up
+ */
+static void quirk_shared_pcc_and_link_slot(struct pci_dev *pdev)
+{
+	struct pci_dev *parent = pci_upstream_bridge(pdev);
+
+	if (parent && pdev->subsystem_vendor == 0x108e) {
+		switch (pdev->subsystem_device) {
+		/* P5608 */
+		case 0x487d:
+		case 0x488d:
+			parent->shared_pcc_and_link_slot = 1;
+		}
+	}
+}
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0b60, quirk_shared_pcc_and_link_slot);
+#endif /* CONFIG_HOTPLUG_PCI_PCIE */
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 7ed95f11c6bd..bcef73209487 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -463,6 +463,7 @@ struct pci_dev {
 
 #ifdef CONFIG_HOTPLUG_PCI_PCIE
 	unsigned int	broken_cmd_compl:1;	/* No compl for some cmds */
+	unsigned int	shared_pcc_and_link_slot:1;
 #endif
 #ifdef CONFIG_PCIE_PTM
 	unsigned int	ptm_root:1;
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v3] PCI: pciehp: Add quirk to handle spurious DLLSC on a x4x4 SSD
  2021-08-30 15:56 [PATCH v3] PCI: pciehp: Add quirk to handle spurious DLLSC on a x4x4 SSD Jon Derrick
@ 2021-08-30 17:46 ` Raj, Ashok
  2021-08-31  1:59 ` jonathan.derrick
  2021-09-12  8:45 ` Lukas Wunner
  2 siblings, 0 replies; 11+ messages in thread
From: Raj, Ashok @ 2021-08-30 17:46 UTC (permalink / raw)
  To: Jon Derrick
  Cc: linux-pci, Bjorn Helgaas, Lukas Wunner, James Puthukattukaran,
	Jon Derrick, Ashok Raj

Hi Jonathan

Looks mostly good. Thanks for the update. Reads better, but I'm still a bit
confused .. sorry


On Mon, Aug 30, 2021 at 09:56:28AM -0600, Jon Derrick wrote:
> From: James Puthukattukaran <james.puthukattukaran@oracle.com>
> 
> When an Intel P5608 SSD is bifurcated into x4x4 mode, and the upstream
> ports both support hotplugging on each respective x4 device, a slot
> management system for the SSD requires both x4 slots to have power
> removed via sysfs (echo 0 > slot/N/power), from the OS before it can
> safely turn-off physical power for the whole x8 device. The implications
> are that slot status will display powered off and link inactive statuses
> for the x4 devices where the devices are actually powered until both
> ports have powered off.

This part is a bit muddy based on the description in the following writeup.

slot1->power off->link1 DOWN (Should fire a DLLSC event?) or does the link
stay active until slot2 is powered off?

slot2->power off, looks like only now link1 is getting DLLSC which you are
ignoring skipping to make sure you don't turn this into an ON event.

slot2 will also receive DLLSC, and you would ignore it also for the same
reasons.

Maybe this was discussed earlier and I didn't catch it. 
When in OFF-state, and DLLSC is received but link is down while in OFF state,
shoudn't that be the default handling? It almost looks like its a spurious
DLLSC event that can be ignored for all cases without a specific quirk?

Or is there a case we care about for normal devices?

> 
> The issue with the SSD manifests when power is removed from the
> first-half and then the second-half. During the first-half removal, slot
> status shows the slot as powered-down and link-inactive, while internal
> power and link remain active while waiting for the second-half to have
> power removed. When power is then removed from the second-half, the
> first-half starts shutdown sequence and will trigger a DLLSC event. This
> is misinterpreted as an enabling event and causes the first-half to be
> re-enabled.

Question is:

when slot1 is powered off, do you get a DLLSC for that link that reports
link_down? or does external link still report UP until both ports are
powered off?

> 
> The spurious enable can be resolved by ignoring link status change
> events when no link is active when in the off state. This patch adds a
> quirk for specific P5608 SSDs which have been tested for compatibility.
> 
> Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
> Signed-off-by: James Puthukattukaran <james.puthukattukaran@oracle.com>
> ---
> v2->v3: Clarified commit message and comment block
> 	Added second supported subdevice ID
> 	Added hotplug ifdef blocks
> 
>  drivers/pci/hotplug/pciehp_ctrl.c |  7 +++++++
>  drivers/pci/quirks.c              | 34 +++++++++++++++++++++++++++++++
>  include/linux/pci.h               |  1 +
>  3 files changed, 42 insertions(+)
> 
> diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
> index 529c34808440..db41f78bfac8 100644
> --- a/drivers/pci/hotplug/pciehp_ctrl.c
> +++ b/drivers/pci/hotplug/pciehp_ctrl.c
> @@ -225,6 +225,7 @@ void pciehp_handle_disable_request(struct controller *ctrl)
>  void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
>  {
>  	int present, link_active;
> +	struct pci_dev *pdev = ctrl->pcie->port;
>  
>  	/*
>  	 * If the slot is on and presence or link has changed, turn it off.
> @@ -265,6 +266,12 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
>  		cancel_delayed_work(&ctrl->button_work);
>  		fallthrough;
>  	case OFF_STATE:
> +		if (pdev->shared_pcc_and_link_slot &&
> +		    (events & PCI_EXP_SLTSTA_DLLSC) && !link_active) {
> +			mutex_unlock(&ctrl->state_lock);
> +			break;
> +		}
> +
>  		ctrl->state = POWERON_STATE;
>  		mutex_unlock(&ctrl->state_lock);
>  		if (present)
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 10122e3318cf..c6b48ddc5c3d 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -5750,3 +5750,37 @@ static void apex_pci_fixup_class(struct pci_dev *pdev)
>  }
>  DECLARE_PCI_FIXUP_CLASS_HEADER(0x1ac1, 0x089a,
>  			       PCI_CLASS_NOT_DEFINED, 8, apex_pci_fixup_class);
> +
> +#ifdef CONFIG_HOTPLUG_PCI_PCIE
> +/*
> + * This is an Intel NVMe SSD which sits in a x8 pciehp slot but is bifurcated
> + * as a x4x4 and manifests as two slots with respect to PCIe hot plug register
> + * states. However, the hotplug controller treats these slots as a single x8
> + * slot for link and power. Either one of the two slots can be powered down
> + * separately and the slot status will show negative power and link states, but
> + * internal power and link will be active until the last of the two slots is
> + * powered down. When the last of the two x4 slots is turned off, power and
> + * link will be turned off for the x8 slot by the HP controller. This
> + * configuration causes some interesting behavior in bringup sequence
> + *
> + * When the second slot is powered off to remove the card, this will cause the
> + * link to go down for both x4 slots. So, the x4 that is already powered down
> + * earlier will see a DLLSC event and attempt to bring itself up (card present,
> + * link change event, link state is down). Special handling is required in
> + * pciehp_handle_presence_or_link_change to prevent this unintended bring up
> + */
> +static void quirk_shared_pcc_and_link_slot(struct pci_dev *pdev)
> +{
> +	struct pci_dev *parent = pci_upstream_bridge(pdev);
> +
> +	if (parent && pdev->subsystem_vendor == 0x108e) {
> +		switch (pdev->subsystem_device) {
> +		/* P5608 */
> +		case 0x487d:
> +		case 0x488d:
> +			parent->shared_pcc_and_link_slot = 1;
> +		}
> +	}
> +}
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0b60, quirk_shared_pcc_and_link_slot);
> +#endif /* CONFIG_HOTPLUG_PCI_PCIE */
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 7ed95f11c6bd..bcef73209487 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -463,6 +463,7 @@ struct pci_dev {
>  
>  #ifdef CONFIG_HOTPLUG_PCI_PCIE
>  	unsigned int	broken_cmd_compl:1;	/* No compl for some cmds */
> +	unsigned int	shared_pcc_and_link_slot:1;
>  #endif
>  #ifdef CONFIG_PCIE_PTM
>  	unsigned int	ptm_root:1;
> -- 
> 2.27.0
> 

-- 
Cheers,
Ashok


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3] PCI: pciehp: Add quirk to handle spurious DLLSC on a x4x4 SSD
  2021-08-30 15:56 [PATCH v3] PCI: pciehp: Add quirk to handle spurious DLLSC on a x4x4 SSD Jon Derrick
  2021-08-30 17:46 ` Raj, Ashok
@ 2021-08-31  1:59 ` jonathan.derrick
  2021-09-12  8:45 ` Lukas Wunner
  2 siblings, 0 replies; 11+ messages in thread
From: jonathan.derrick @ 2021-08-31  1:59 UTC (permalink / raw)
  To: Raj, Ashok
  Cc: linux-pci, Bjorn Helgaas, Lukas Wunner, James Puthukattukaran,
	Jon Derrick

Hi Ashok

August 30, 2021 12:46 PM, "Raj, Ashok" <ashok.raj@intel.com> wrote:

> Hi Jonathan
> 
> Looks mostly good. Thanks for the update. Reads better, but I'm still a bit
> confused .. sorry
> 
> On Mon, Aug 30, 2021 at 09:56:28AM -0600, Jon Derrick wrote:
> 
>> From: James Puthukattukaran <james.puthukattukaran@oracle.com>
>> 
>> When an Intel P5608 SSD is bifurcated into x4x4 mode, and the upstream
>> ports both support hotplugging on each respective x4 device, a slot
>> management system for the SSD requires both x4 slots to have power
>> removed via sysfs (echo 0 > slot/N/power), from the OS before it can
>> safely turn-off physical power for the whole x8 device. The implications
>> are that slot status will display powered off and link inactive statuses
>> for the x4 devices where the devices are actually powered until both
>> ports have powered off.
> 
> This part is a bit muddy based on the description in the following writeup.
> 
> slot1->power off->link1 DOWN (Should fire a DLLSC event?) or does the link
> stay active until slot2 is powered off?

Internal link is active but link is
presented as down in the slot status register


> 
> slot2->power off, looks like only now link1 is getting DLLSC which you are
> ignoring skipping to make sure you don't turn this into an ON event.

I think the dllsc on slot 1 occurs when the
whole complex powers down after both slots
are powered-off



> 
> slot2 will also receive DLLSC, and you would ignore it also for the same
> reasons.
> 
> Maybe this was discussed earlier and I didn't catch it.
> When in OFF-state, and DLLSC is received but link is down while in OFF state,
> shoudn't that be the default handling? It almost looks like its a spurious
> DLLSC event that can be ignored for all cases without a specific quirk?
> 
> Or is there a case we care about for normal devices?

Lukas presented a possible race [1] with a
spurious surprise hotplug where you could be
in off state and link is still inactive,
expecting the state machine to power on the slot.

This is avoided in this SSD because the
individual slots in the complex won’t be
surprise hotplugged, avoiding spurious
events.

[1] https://lore.kernel.org/linux-pci/20210525192512.GA3450@wunner.de/


> 
>> The issue with the SSD manifests when power is removed from the
>> first-half and then the second-half. During the first-half removal, slot
>> status shows the slot as powered-down and link-inactive, while internal
>> power and link remain active while waiting for the second-half to have
>> power removed. When power is then removed from the second-half, the
>> first-half starts shutdown sequence and will trigger a DLLSC event. This
>> is misinterpreted as an enabling event and causes the first-half to be
>> re-enabled.
> 
> Question is:
> 
> when slot1 is powered off, do you get a DLLSC for that link that reports
> link_down? or does external link still report UP until both ports are
> powered off?
As far as the kernel knows by checking the
slot status register, the link is inactive.
Internally to the complex, link is still
active until both slots are powered off.

It’s the powering off of the whole complex
that causes slot 1 to throw dllsc, where
internally slot 1 is going from active to
inactive, and externally the kernel sees it
staying inactive.


> 
>> The spurious enable can be resolved by ignoring link status change
>> events when no link is active when in the off state. This patch adds a
>> quirk for specific P5608 SSDs which have been tested for compatibility.
>> 
>> Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
>> Signed-off-by: James Puthukattukaran <james.puthukattukaran@oracle.com>
>> ---
>> v2->v3: Clarified commit message and comment block
>> Added second supported subdevice ID
>> Added hotplug ifdef blocks
>> 
>> drivers/pci/hotplug/pciehp_ctrl.c | 7 +++++++
>> drivers/pci/quirks.c | 34 +++++++++++++++++++++++++++++++
>> include/linux/pci.h | 1 +
>> 3 files changed, 42 insertions(+)
>> 
>> diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
>> index 529c34808440..db41f78bfac8 100644
>> --- a/drivers/pci/hotplug/pciehp_ctrl.c
>> +++ b/drivers/pci/hotplug/pciehp_ctrl.c
>> @@ -225,6 +225,7 @@ void pciehp_handle_disable_request(struct controller *ctrl)
>> void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
>> {
>> int present, link_active;
>> + struct pci_dev *pdev = ctrl->pcie->port;
>> 
>> /*
>> * If the slot is on and presence or link has changed, turn it off.
>> @@ -265,6 +266,12 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32
>> events)
>> cancel_delayed_work(&ctrl->button_work);
>> fallthrough;
>> case OFF_STATE:
>> + if (pdev->shared_pcc_and_link_slot &&
>> + (events & PCI_EXP_SLTSTA_DLLSC) && !link_active) {
>> + mutex_unlock(&ctrl->state_lock);
>> + break;
>> + }
>> +
>> ctrl->state = POWERON_STATE;
>> mutex_unlock(&ctrl->state_lock);
>> if (present)
>> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>> index 10122e3318cf..c6b48ddc5c3d 100644
>> --- a/drivers/pci/quirks.c
>> +++ b/drivers/pci/quirks.c
>> @@ -5750,3 +5750,37 @@ static void apex_pci_fixup_class(struct pci_dev *pdev)
>> }
>> DECLARE_PCI_FIXUP_CLASS_HEADER(0x1ac1, 0x089a,
>> PCI_CLASS_NOT_DEFINED, 8, apex_pci_fixup_class);
>> +
>> +#ifdef CONFIG_HOTPLUG_PCI_PCIE
>> +/*
>> + * This is an Intel NVMe SSD which sits in a x8 pciehp slot but is bifurcated
>> + * as a x4x4 and manifests as two slots with respect to PCIe hot plug register
>> + * states. However, the hotplug controller treats these slots as a single x8
>> + * slot for link and power. Either one of the two slots can be powered down
>> + * separately and the slot status will show negative power and link states, but
>> + * internal power and link will be active until the last of the two slots is
>> + * powered down. When the last of the two x4 slots is turned off, power and
>> + * link will be turned off for the x8 slot by the HP controller. This
>> + * configuration causes some interesting behavior in bringup sequence
>> + *
>> + * When the second slot is powered off to remove the card, this will cause the
>> + * link to go down for both x4 slots. So, the x4 that is already powered down
>> + * earlier will see a DLLSC event and attempt to bring itself up (card present,
>> + * link change event, link state is down). Special handling is required in
>> + * pciehp_handle_presence_or_link_change to prevent this unintended bring up
>> + */
>> +static void quirk_shared_pcc_and_link_slot(struct pci_dev *pdev)
>> +{
>> + struct pci_dev *parent = pci_upstream_bridge(pdev);
>> +
>> + if (parent && pdev->subsystem_vendor == 0x108e) {
>> + switch (pdev->subsystem_device) {
>> + /* P5608 */
>> + case 0x487d:
>> + case 0x488d:
>> + parent->shared_pcc_and_link_slot = 1;
>> + }
>> + }
>> +}
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0b60, quirk_shared_pcc_and_link_slot);
>> +#endif /* CONFIG_HOTPLUG_PCI_PCIE */
>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>> index 7ed95f11c6bd..bcef73209487 100644
>> --- a/include/linux/pci.h
>> +++ b/include/linux/pci.h
>> @@ -463,6 +463,7 @@ struct pci_dev {
>> 
>> #ifdef CONFIG_HOTPLUG_PCI_PCIE
>> unsigned int broken_cmd_compl:1; /* No compl for some cmds */
>> + unsigned int shared_pcc_and_link_slot:1;
>> #endif
>> #ifdef CONFIG_PCIE_PTM
>> unsigned int ptm_root:1;
>> --
>> 2.27.0
> 
> --
> Cheers,
> Ashok

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3] PCI: pciehp: Add quirk to handle spurious DLLSC on a x4x4 SSD
  2021-08-30 15:56 [PATCH v3] PCI: pciehp: Add quirk to handle spurious DLLSC on a x4x4 SSD Jon Derrick
  2021-08-30 17:46 ` Raj, Ashok
  2021-08-31  1:59 ` jonathan.derrick
@ 2021-09-12  8:45 ` Lukas Wunner
  2021-09-13 21:07   ` Jon Derrick
  2 siblings, 1 reply; 11+ messages in thread
From: Lukas Wunner @ 2021-09-12  8:45 UTC (permalink / raw)
  To: Jon Derrick
  Cc: linux-pci, Bjorn Helgaas, Ashok Raj, James Puthukattukaran, Jon Derrick

On Mon, Aug 30, 2021 at 09:56:28AM -0600, Jon Derrick wrote:
> When an Intel P5608 SSD is bifurcated into x4x4 mode, and the upstream
> ports both support hotplugging on each respective x4 device, a slot
> management system for the SSD requires both x4 slots to have power
> removed via sysfs (echo 0 > slot/N/power), from the OS before it can
> safely turn-off physical power for the whole x8 device. The implications
> are that slot status will display powered off and link inactive statuses
> for the x4 devices where the devices are actually powered until both
> ports have powered off.

Just to get a better understanding, does the P5608 have an internal
PCIe switch with hotplug capability on the Downstream Ports or
does it plug into two separate PCIe slots?  I recall previous patches
mentioned a CEM interposer?  (An lspci listing might be helpful.)


> --- a/drivers/pci/hotplug/pciehp_ctrl.c
> +++ b/drivers/pci/hotplug/pciehp_ctrl.c
> @@ -225,6 +225,7 @@ void pciehp_handle_disable_request(struct controller *ctrl)
>  void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
>  {
>  	int present, link_active;
> +	struct pci_dev *pdev = ctrl->pcie->port;

Nit: Reverse christmas tree.


> @@ -265,6 +266,12 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
>  		cancel_delayed_work(&ctrl->button_work);
>  		fallthrough;
>  	case OFF_STATE:
> +		if (pdev->shared_pcc_and_link_slot &&
> +		    (events & PCI_EXP_SLTSTA_DLLSC) && !link_active) {
> +			mutex_unlock(&ctrl->state_lock);
> +			break;
> +		}
> +

I think you also need to add...

			pdev->shared_pcc_and_link_slot = false;

... here to reset the shared_pcc_and_link_slot attribute in case the
next card plugged into the slot doesn't have the quirk.

(This can't be done in pciehp_unconfigure_device() because the attribute
is queried *after* the slot has been brought down.)


> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -5750,3 +5750,37 @@ static void apex_pci_fixup_class(struct pci_dev *pdev)
>  }
>  DECLARE_PCI_FIXUP_CLASS_HEADER(0x1ac1, 0x089a,
>  			       PCI_CLASS_NOT_DEFINED, 8, apex_pci_fixup_class);
> +
> +#ifdef CONFIG_HOTPLUG_PCI_PCIE

It's possible to put the quirk at the bottom of pciehp_ctrl.c and
thus avoid the need for the #ifdef here.  (We've got another
pciehp-specific quirk at the bottom of pciehp_hpc.c.)

Otherwise LGTM.

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3] PCI: pciehp: Add quirk to handle spurious DLLSC on a x4x4 SSD
  2021-09-12  8:45 ` Lukas Wunner
@ 2021-09-13 21:07   ` Jon Derrick
  2021-09-14 14:46     ` Lukas Wunner
  0 siblings, 1 reply; 11+ messages in thread
From: Jon Derrick @ 2021-09-13 21:07 UTC (permalink / raw)
  To: Lukas Wunner, Jon Derrick
  Cc: linux-pci, Bjorn Helgaas, Raj, Ashok, James Puthukattukaran



On 9/12/21 3:45 AM, Lukas Wunner wrote:
> On Mon, Aug 30, 2021 at 09:56:28AM -0600, Jon Derrick wrote:
>> When an Intel P5608 SSD is bifurcated into x4x4 mode, and the upstream
>> ports both support hotplugging on each respective x4 device, a slot
>> management system for the SSD requires both x4 slots to have power
>> removed via sysfs (echo 0 > slot/N/power), from the OS before it can
>> safely turn-off physical power for the whole x8 device. The implications
>> are that slot status will display powered off and link inactive statuses
>> for the x4 devices where the devices are actually powered until both
>> ports have powered off.
> 
> Just to get a better understanding, does the P5608 have an internal
> PCIe switch with hotplug capability on the Downstream Ports or
> does it plug into two separate PCIe slots?  I recall previous patches
> mentioned a CEM interposer?  (An lspci listing might be helpful.)

It looks like 2 NVMe endpoints plugged into two different root ports, ex,
80:00.0 Root port to [81-86]
80:01.0 Root port to [87-8b]
81:00.0 NVMe
87:00.0 NVMe

The x8 is bifurcated to x4x4. Physically they share the same slot
power/clock/reset but are logically separate per root port.


> 
> 
>> --- a/drivers/pci/hotplug/pciehp_ctrl.c
>> +++ b/drivers/pci/hotplug/pciehp_ctrl.c
>> @@ -225,6 +225,7 @@ void pciehp_handle_disable_request(struct controller *ctrl)
>>  void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
>>  {
>>  	int present, link_active;
>> +	struct pci_dev *pdev = ctrl->pcie->port;
> 
> Nit: Reverse christmas tree.
Sure


> 
> 
>> @@ -265,6 +266,12 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
>>  		cancel_delayed_work(&ctrl->button_work);
>>  		fallthrough;
>>  	case OFF_STATE:
>> +		if (pdev->shared_pcc_and_link_slot &&
>> +		    (events & PCI_EXP_SLTSTA_DLLSC) && !link_active) {
>> +			mutex_unlock(&ctrl->state_lock);
>> +			break;
>> +		}
>> +
> 
> I think you also need to add...
> 
> 			pdev->shared_pcc_and_link_slot = false;
> 
> ... here to reset the shared_pcc_and_link_slot attribute in case the
> next card plugged into the slot doesn't have the quirk.
> 
> (This can't be done in pciehp_unconfigure_device() because the attribute
> is queried *after* the slot has been brought down.)
Agreed. I'll find a good spot for it.


> 
> 
>> --- a/drivers/pci/quirks.c
>> +++ b/drivers/pci/quirks.c
>> @@ -5750,3 +5750,37 @@ static void apex_pci_fixup_class(struct pci_dev *pdev)
>>  }
>>  DECLARE_PCI_FIXUP_CLASS_HEADER(0x1ac1, 0x089a,
>>  			       PCI_CLASS_NOT_DEFINED, 8, apex_pci_fixup_class);
>> +
>> +#ifdef CONFIG_HOTPLUG_PCI_PCIE
> 
> It's possible to put the quirk at the bottom of pciehp_ctrl.c and
> thus avoid the need for the #ifdef here.  (We've got another
> pciehp-specific quirk at the bottom of pciehp_hpc.c.)
Sure that would look fine


> 
> Otherwise LGTM.
> 
> Thanks,
> 
> Lukas
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3] PCI: pciehp: Add quirk to handle spurious DLLSC on a x4x4 SSD
  2021-09-13 21:07   ` Jon Derrick
@ 2021-09-14 14:46     ` Lukas Wunner
  2021-09-20 17:18       ` Jon Derrick
  0 siblings, 1 reply; 11+ messages in thread
From: Lukas Wunner @ 2021-09-14 14:46 UTC (permalink / raw)
  To: Jon Derrick
  Cc: Jon Derrick, linux-pci, Bjorn Helgaas, Raj, Ashok, James Puthukattukaran

On Mon, Sep 13, 2021 at 04:07:22PM -0500, Jon Derrick wrote:
> On 9/12/21 3:45 AM, Lukas Wunner wrote:
> > On Mon, Aug 30, 2021 at 09:56:28AM -0600, Jon Derrick wrote:
> > > When an Intel P5608 SSD is bifurcated into x4x4 mode, and the upstream
> > > ports both support hotplugging on each respective x4 device, a slot
> > > management system for the SSD requires both x4 slots to have power
> > > removed via sysfs (echo 0 > slot/N/power), from the OS before it can
> > > safely turn-off physical power for the whole x8 device. The implications
> > > are that slot status will display powered off and link inactive statuses
> > > for the x4 devices where the devices are actually powered until both
> > > ports have powered off.
> > 
> > Just to get a better understanding, does the P5608 have an internal
> > PCIe switch with hotplug capability on the Downstream Ports or
> > does it plug into two separate PCIe slots?  I recall previous patches
> > mentioned a CEM interposer?  (An lspci listing might be helpful.)
> 
> It looks like 2 NVMe endpoints plugged into two different root ports, ex,
> 80:00.0 Root port to [81-86]
> 80:01.0 Root port to [87-8b]
> 81:00.0 NVMe
> 87:00.0 NVMe
> 
> The x8 is bifurcated to x4x4. Physically they share the same slot
> power/clock/reset but are logically separate per root port.

So are these two P5608 drives attached to a single Root Port with an
interposer in-between?

I assume the Root Port needs to know that it's bifurcated and has to
appear as two slots on the bus.  Is this configured with a BIOS setting?

If these assumptions are true, the quirk isn't really specific to
the P5608 but should rather apply to the bifurcation-capable Root Port
and the quirk should set the flag if the Root Port is indeed configured
for bifurcation.


> > > @@ -265,6 +266,12 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
> > >  		cancel_delayed_work(&ctrl->button_work);
> > >  		fallthrough;
> > >  	case OFF_STATE:
> > > +		if (pdev->shared_pcc_and_link_slot &&
> > > +		    (events & PCI_EXP_SLTSTA_DLLSC) && !link_active) {
> > > +			mutex_unlock(&ctrl->state_lock);
> > > +			break;
> > > +		}
> > > +
> > 
> > I think you also need to add...
> > 
> > 			pdev->shared_pcc_and_link_slot = false;
> > 
> > ... here to reset the shared_pcc_and_link_slot attribute in case the
> > next card plugged into the slot doesn't have the quirk.
> > 
> > (This can't be done in pciehp_unconfigure_device() because the attribute
> > is queried *after* the slot has been brought down.)
> 
> Agreed. I'll find a good spot for it.

Adding it in the if-clause above should work.  The if-clause is only
entered when the sibling card has had its power removed, and this
only happens once.  When power is reinstated via sysfs, the device
in the slot is reenumerated and pdev->shared_pcc_and_link_slot is
set to true again if there's a quirked device in the slot.

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3] PCI: pciehp: Add quirk to handle spurious DLLSC on a x4x4 SSD
  2021-09-14 14:46     ` Lukas Wunner
@ 2021-09-20 17:18       ` Jon Derrick
  2022-07-08 16:35         ` Jonathan Derrick
  0 siblings, 1 reply; 11+ messages in thread
From: Jon Derrick @ 2021-09-20 17:18 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Jon Derrick, linux-pci, Bjorn Helgaas, Raj, Ashok, James Puthukattukaran



On 9/14/21 9:46 AM, Lukas Wunner wrote:
> On Mon, Sep 13, 2021 at 04:07:22PM -0500, Jon Derrick wrote:
>> On 9/12/21 3:45 AM, Lukas Wunner wrote:
>>> On Mon, Aug 30, 2021 at 09:56:28AM -0600, Jon Derrick wrote:
>>>> When an Intel P5608 SSD is bifurcated into x4x4 mode, and the upstream
>>>> ports both support hotplugging on each respective x4 device, a slot
>>>> management system for the SSD requires both x4 slots to have power
>>>> removed via sysfs (echo 0 > slot/N/power), from the OS before it can
>>>> safely turn-off physical power for the whole x8 device. The implications
>>>> are that slot status will display powered off and link inactive statuses
>>>> for the x4 devices where the devices are actually powered until both
>>>> ports have powered off.
>>>
>>> Just to get a better understanding, does the P5608 have an internal
>>> PCIe switch with hotplug capability on the Downstream Ports or
>>> does it plug into two separate PCIe slots?  I recall previous patches
>>> mentioned a CEM interposer?  (An lspci listing might be helpful.)
>>
>> It looks like 2 NVMe endpoints plugged into two different root ports, ex,
>> 80:00.0 Root port to [81-86]
>> 80:01.0 Root port to [87-8b]
>> 81:00.0 NVMe
>> 87:00.0 NVMe
>>
>> The x8 is bifurcated to x4x4. Physically they share the same slot
>> power/clock/reset but are logically separate per root port.
> 
> So are these two P5608 drives attached to a single Root Port with an
> interposer in-between?
> 
> I assume the Root Port needs to know that it's bifurcated and has to
> appear as two slots on the bus.  Is this configured with a BIOS setting?
> 
> If these assumptions are true, the quirk isn't really specific to
> the P5608 but should rather apply to the bifurcation-capable Root Port
> and the quirk should set the flag if the Root Port is indeed configured
> for bifurcation.
It's a function of the slot + card combination, but we can't distinguish this
slot's special power handling behavior from the vanilla behavior. It's modified
to handle power on the logically bifurcated, singular physical device.


> 
> 
>>>> @@ -265,6 +266,12 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
>>>>  		cancel_delayed_work(&ctrl->button_work);
>>>>  		fallthrough;
>>>>  	case OFF_STATE:
>>>> +		if (pdev->shared_pcc_and_link_slot &&
>>>> +		    (events & PCI_EXP_SLTSTA_DLLSC) && !link_active) {
>>>> +			mutex_unlock(&ctrl->state_lock);
>>>> +			break;
>>>> +		}
>>>> +
>>>
>>> I think you also need to add...
>>>
>>> 			pdev->shared_pcc_and_link_slot = false;
>>>
>>> ... here to reset the shared_pcc_and_link_slot attribute in case the
>>> next card plugged into the slot doesn't have the quirk.
>>>
>>> (This can't be done in pciehp_unconfigure_device() because the attribute
>>> is queried *after* the slot has been brought down.)
>>
>> Agreed. I'll find a good spot for it.
> 
> Adding it in the if-clause above should work.  The if-clause is only
> entered when the sibling card has had its power removed, and this
> only happens once.  When power is reinstated via sysfs, the device
> in the slot is reenumerated and pdev->shared_pcc_and_link_slot is
> set to true again if there's a quirked device in the slot.
> 
> Thanks,
> 
> Lukas
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3] PCI: pciehp: Add quirk to handle spurious DLLSC on a x4x4 SSD
  2021-09-20 17:18       ` Jon Derrick
@ 2022-07-08 16:35         ` Jonathan Derrick
  2022-09-24  7:32           ` Lukas Wunner
  0 siblings, 1 reply; 11+ messages in thread
From: Jonathan Derrick @ 2022-07-08 16:35 UTC (permalink / raw)
  To: Jon Derrick, Lukas Wunner
  Cc: linux-pci, Bjorn Helgaas, Raj, Ashok, James Puthukattukaran



On 9/20/2021 11:18 AM, Jon Derrick wrote:
> 
> 
> On 9/14/21 9:46 AM, Lukas Wunner wrote:
>> On Mon, Sep 13, 2021 at 04:07:22PM -0500, Jon Derrick wrote:
>>> On 9/12/21 3:45 AM, Lukas Wunner wrote:
>>>> On Mon, Aug 30, 2021 at 09:56:28AM -0600, Jon Derrick wrote:
>>>>> When an Intel P5608 SSD is bifurcated into x4x4 mode, and the upstream
>>>>> ports both support hotplugging on each respective x4 device, a slot
>>>>> management system for the SSD requires both x4 slots to have power
>>>>> removed via sysfs (echo 0 > slot/N/power), from the OS before it can
>>>>> safely turn-off physical power for the whole x8 device. The implications
>>>>> are that slot status will display powered off and link inactive statuses
>>>>> for the x4 devices where the devices are actually powered until both
>>>>> ports have powered off.
>>>>
>>>> Just to get a better understanding, does the P5608 have an internal
>>>> PCIe switch with hotplug capability on the Downstream Ports or
>>>> does it plug into two separate PCIe slots?  I recall previous patches
>>>> mentioned a CEM interposer?  (An lspci listing might be helpful.)
>>>
>>> It looks like 2 NVMe endpoints plugged into two different root ports, ex,
>>> 80:00.0 Root port to [81-86]
>>> 80:01.0 Root port to [87-8b]
>>> 81:00.0 NVMe
>>> 87:00.0 NVMe
>>>
>>> The x8 is bifurcated to x4x4. Physically they share the same slot
>>> power/clock/reset but are logically separate per root port.
>>
>> So are these two P5608 drives attached to a single Root Port with an
>> interposer in-between?
>>
>> I assume the Root Port needs to know that it's bifurcated and has to
>> appear as two slots on the bus.  Is this configured with a BIOS setting?
>>
>> If these assumptions are true, the quirk isn't really specific to
>> the P5608 but should rather apply to the bifurcation-capable Root Port
>> and the quirk should set the flag if the Root Port is indeed configured
>> for bifurcation.
> It's a function of the slot + card combination, but we can't distinguish this
> slot's special power handling behavior from the vanilla behavior. It's modified
> to handle power on the logically bifurcated, singular physical device.
> 
> 
>>
>>
>>>>> @@ -265,6 +266,12 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
>>>>>  		cancel_delayed_work(&ctrl->button_work);
>>>>>  		fallthrough;
>>>>>  	case OFF_STATE:
>>>>> +		if (pdev->shared_pcc_and_link_slot &&
>>>>> +		    (events & PCI_EXP_SLTSTA_DLLSC) && !link_active) {
>>>>> +			mutex_unlock(&ctrl->state_lock);
>>>>> +			break;
>>>>> +		}
>>>>> +
>>>>
>>>> I think you also need to add...
>>>>
>>>> 			pdev->shared_pcc_and_link_slot = false;
>>>>
>>>> ... here to reset the shared_pcc_and_link_slot attribute in case the
>>>> next card plugged into the slot doesn't have the quirk.
>>>>
>>>> (This can't be done in pciehp_unconfigure_device() because the attribute
>>>> is queried *after* the slot has been brought down.)
>>>
>>> Agreed. I'll find a good spot for it.
>>
>> Adding it in the if-clause above should work.  The if-clause is only
>> entered when the sibling card has had its power removed, and this
>> only happens once.  When power is reinstated via sysfs, the device
>> in the slot is reenumerated and pdev->shared_pcc_and_link_slot is
>> set to true again if there's a quirked device in the slot.
>>
>> Thanks,
>>
>> Lukas
>>
> 

Hi Bjorn, Lukas,

I need to resubmit this.

Besides the 'pdev->shared_pcc_and_link_slot = false', addition mentioned
above, is there anything else that should be changed or any reason this
wouldn't be accepted?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3] PCI: pciehp: Add quirk to handle spurious DLLSC on a x4x4 SSD
  2022-07-08 16:35         ` Jonathan Derrick
@ 2022-09-24  7:32           ` Lukas Wunner
  2022-09-26 21:05             ` Jonathan Derrick
  2022-09-26 21:21             ` Ashok Raj
  0 siblings, 2 replies; 11+ messages in thread
From: Lukas Wunner @ 2022-09-24  7:32 UTC (permalink / raw)
  To: Jonathan Derrick
  Cc: Jon Derrick, linux-pci, Bjorn Helgaas, Raj, Ashok, James Puthukattukaran

On Fri, Jul 08, 2022 at 10:35:15AM -0600, Jonathan Derrick wrote:
> On 9/20/2021 11:18 AM, Jon Derrick wrote:
> > On 9/14/21 9:46 AM, Lukas Wunner wrote:
> >> On Mon, Sep 13, 2021 at 04:07:22PM -0500, Jon Derrick wrote:
> >>> On 9/12/21 3:45 AM, Lukas Wunner wrote:
> >>>> On Mon, Aug 30, 2021 at 09:56:28AM -0600, Jon Derrick wrote:
> >>>>> When an Intel P5608 SSD is bifurcated into x4x4 mode, and the upstream
> >>>>> ports both support hotplugging on each respective x4 device, a slot
> >>>>> management system for the SSD requires both x4 slots to have power
> >>>>> removed via sysfs (echo 0 > slot/N/power), from the OS before it can
> >>>>> safely turn-off physical power for the whole x8 device. The implications
> >>>>> are that slot status will display powered off and link inactive statuses
> >>>>> for the x4 devices where the devices are actually powered until both
> >>>>> ports have powered off.
> >>>>
> >>>> Just to get a better understanding, does the P5608 have an internal
> >>>> PCIe switch with hotplug capability on the Downstream Ports or
> >>>> does it plug into two separate PCIe slots?  I recall previous patches
> >>>> mentioned a CEM interposer?  (An lspci listing might be helpful.)
> >>>
> >>> It looks like 2 NVMe endpoints plugged into two different root ports, ex,
> >>> 80:00.0 Root port to [81-86]
> >>> 80:01.0 Root port to [87-8b]
> >>> 81:00.0 NVMe
> >>> 87:00.0 NVMe
> >>>
> >>> The x8 is bifurcated to x4x4. Physically they share the same slot
> >>> power/clock/reset but are logically separate per root port.
> >>
> >> So are these two P5608 drives attached to a single Root Port with an
> >> interposer in-between?
> >>
> >> I assume the Root Port needs to know that it's bifurcated and has to
> >> appear as two slots on the bus.  Is this configured with a BIOS setting?
> >>
> >> If these assumptions are true, the quirk isn't really specific to
> >> the P5608 but should rather apply to the bifurcation-capable Root Port
> >> and the quirk should set the flag if the Root Port is indeed configured
> >> for bifurcation.
> > It's a function of the slot + card combination, but we can't distinguish this
> > slot's special power handling behavior from the vanilla behavior. It's modified
> > to handle power on the logically bifurcated, singular physical device.
> 
> Hi Bjorn, Lukas,
> 
> I need to resubmit this.
> 
> Besides the 'pdev->shared_pcc_and_link_slot = false', addition mentioned
> above, is there anything else that should be changed or any reason this
> wouldn't be accepted?

Another report has cropped up of spurious DLLSC events:
https://bugzilla.kernel.org/show_bug.cgi?id=216511

That other case differs from yours in that a spurious DLLSC event
is seen on plugging *in* a card, whereas in your case the event
seems to occur on *removing* a card.  In both cases, the spurious
event is seen on the hotplug port's sibling.

I'm starting to think that we should probably disable DLLSC events
entirely if they're known to be unreliable.  The hotplug port
solely relies on PDC events then.  Otherwise we'd have to clutter
the event handling with all sorts of special cases.  The code would
become fairly difficult to follow.

I've attached an experimental patch to the bug report which disables
DLLSC events on a hotplug port if a P5608 SSD is plugged into it:
https://bugzilla.kernel.org/attachment.cgi?id=301845

Would this approach work for you?

One other question:  What if the SSD is not bifurcated (i.e. x8
instead of x4x4), don't we need avoid applying the quirk in that case?
Your patch doesn't seem to do that.  Can we recognize somehow whether
the card is bifurcated or not?  Is it sufficient to just look at the
Maximum Link Width in the Link Capabilities Register?  Does the SSD
report x4 there if it's bifurcated?

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3] PCI: pciehp: Add quirk to handle spurious DLLSC on a x4x4 SSD
  2022-09-24  7:32           ` Lukas Wunner
@ 2022-09-26 21:05             ` Jonathan Derrick
  2022-09-26 21:21             ` Ashok Raj
  1 sibling, 0 replies; 11+ messages in thread
From: Jonathan Derrick @ 2022-09-26 21:05 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Jon Derrick, linux-pci, Bjorn Helgaas, Raj, Ashok, James Puthukattukaran



On 9/24/2022 1:32 AM, Lukas Wunner wrote:
> On Fri, Jul 08, 2022 at 10:35:15AM -0600, Jonathan Derrick wrote:
>> On 9/20/2021 11:18 AM, Jon Derrick wrote:
>>> On 9/14/21 9:46 AM, Lukas Wunner wrote:
>>>> On Mon, Sep 13, 2021 at 04:07:22PM -0500, Jon Derrick wrote:
>>>>> On 9/12/21 3:45 AM, Lukas Wunner wrote:
>>>>>> On Mon, Aug 30, 2021 at 09:56:28AM -0600, Jon Derrick wrote:
>>>>>>> When an Intel P5608 SSD is bifurcated into x4x4 mode, and the upstream
>>>>>>> ports both support hotplugging on each respective x4 device, a slot
>>>>>>> management system for the SSD requires both x4 slots to have power
>>>>>>> removed via sysfs (echo 0 > slot/N/power), from the OS before it can
>>>>>>> safely turn-off physical power for the whole x8 device. The implications
>>>>>>> are that slot status will display powered off and link inactive statuses
>>>>>>> for the x4 devices where the devices are actually powered until both
>>>>>>> ports have powered off.
>>>>>>
>>>>>> Just to get a better understanding, does the P5608 have an internal
>>>>>> PCIe switch with hotplug capability on the Downstream Ports or
>>>>>> does it plug into two separate PCIe slots?  I recall previous patches
>>>>>> mentioned a CEM interposer?  (An lspci listing might be helpful.)
>>>>>
>>>>> It looks like 2 NVMe endpoints plugged into two different root ports, ex,
>>>>> 80:00.0 Root port to [81-86]
>>>>> 80:01.0 Root port to [87-8b]
>>>>> 81:00.0 NVMe
>>>>> 87:00.0 NVMe
>>>>>
>>>>> The x8 is bifurcated to x4x4. Physically they share the same slot
>>>>> power/clock/reset but are logically separate per root port.
>>>>
>>>> So are these two P5608 drives attached to a single Root Port with an
>>>> interposer in-between?
>>>>
>>>> I assume the Root Port needs to know that it's bifurcated and has to
>>>> appear as two slots on the bus.  Is this configured with a BIOS setting?
>>>>
>>>> If these assumptions are true, the quirk isn't really specific to
>>>> the P5608 but should rather apply to the bifurcation-capable Root Port
>>>> and the quirk should set the flag if the Root Port is indeed configured
>>>> for bifurcation.
>>> It's a function of the slot + card combination, but we can't distinguish this
>>> slot's special power handling behavior from the vanilla behavior. It's modified
>>> to handle power on the logically bifurcated, singular physical device.
>>
>> Hi Bjorn, Lukas,
>>
>> I need to resubmit this.
>>
>> Besides the 'pdev->shared_pcc_and_link_slot = false', addition mentioned
>> above, is there anything else that should be changed or any reason this
>> wouldn't be accepted?
> 
> Another report has cropped up of spurious DLLSC events:
> https://bugzilla.kernel.org/show_bug.cgi?id=216511
> 
> That other case differs from yours in that a spurious DLLSC event
> is seen on plugging *in* a card, whereas in your case the event
> seems to occur on *removing* a card.  In both cases, the spurious
> event is seen on the hotplug port's sibling.
> 
> I'm starting to think that we should probably disable DLLSC events
> entirely if they're known to be unreliable.  The hotplug port
> solely relies on PDC events then.  Otherwise we'd have to clutter
> the event handling with all sorts of special cases.  The code would
> become fairly difficult to follow.
I'm not sure we can do that either. Think of a non-logical interposer.
PDC will be static but DLLSC may be the only hotplug status received.


> 
> I've attached an experimental patch to the bug report which disables
> DLLSC events on a hotplug port if a P5608 SSD is plugged into it:
> https://bugzilla.kernel.org/attachment.cgi?id=301845
> 
> Would this approach work for you?
It looks correct to me


> 
> One other question:  What if the SSD is not bifurcated (i.e. x8
> instead of x4x4), don't we need avoid applying the quirk in that case?
> Your patch doesn't seem to do that.  Can we recognize somehow whether
> the card is bifurcated or not?  Is it sufficient to just look at the
> Maximum Link Width in the Link Capabilities Register?  Does the SSD
> report x4 there if it's bifurcate
Good question. Unique to the subsystem device id?

> 
> Thanks,
> 
> Lukas

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3] PCI: pciehp: Add quirk to handle spurious DLLSC on a x4x4 SSD
  2022-09-24  7:32           ` Lukas Wunner
  2022-09-26 21:05             ` Jonathan Derrick
@ 2022-09-26 21:21             ` Ashok Raj
  1 sibling, 0 replies; 11+ messages in thread
From: Ashok Raj @ 2022-09-26 21:21 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Jonathan Derrick, Jon Derrick, linux-pci, Bjorn Helgaas,
	James Puthukattukaran, Ashok Raj

On Sat, Sep 24, 2022 at 09:32:08AM +0200, Lukas Wunner wrote:
> On Fri, Jul 08, 2022 at 10:35:15AM -0600, Jonathan Derrick wrote:
> > On 9/20/2021 11:18 AM, Jon Derrick wrote:
> > > On 9/14/21 9:46 AM, Lukas Wunner wrote:
> > >> On Mon, Sep 13, 2021 at 04:07:22PM -0500, Jon Derrick wrote:
> > >>> On 9/12/21 3:45 AM, Lukas Wunner wrote:
> > >>>> On Mon, Aug 30, 2021 at 09:56:28AM -0600, Jon Derrick wrote:
> > >>>>> When an Intel P5608 SSD is bifurcated into x4x4 mode, and the upstream
> > >>>>> ports both support hotplugging on each respective x4 device, a slot
> > >>>>> management system for the SSD requires both x4 slots to have power
> > >>>>> removed via sysfs (echo 0 > slot/N/power), from the OS before it can
> > >>>>> safely turn-off physical power for the whole x8 device. The implications
> > >>>>> are that slot status will display powered off and link inactive statuses
> > >>>>> for the x4 devices where the devices are actually powered until both
> > >>>>> ports have powered off.
> > >>>>
> > >>>> Just to get a better understanding, does the P5608 have an internal
> > >>>> PCIe switch with hotplug capability on the Downstream Ports or
> > >>>> does it plug into two separate PCIe slots?  I recall previous patches
> > >>>> mentioned a CEM interposer?  (An lspci listing might be helpful.)
> > >>>
> > >>> It looks like 2 NVMe endpoints plugged into two different root ports, ex,
> > >>> 80:00.0 Root port to [81-86]
> > >>> 80:01.0 Root port to [87-8b]
> > >>> 81:00.0 NVMe
> > >>> 87:00.0 NVMe
> > >>>
> > >>> The x8 is bifurcated to x4x4. Physically they share the same slot
> > >>> power/clock/reset but are logically separate per root port.
> > >>
> > >> So are these two P5608 drives attached to a single Root Port with an
> > >> interposer in-between?
> > >>
> > >> I assume the Root Port needs to know that it's bifurcated and has to
> > >> appear as two slots on the bus.  Is this configured with a BIOS setting?
> > >>
> > >> If these assumptions are true, the quirk isn't really specific to
> > >> the P5608 but should rather apply to the bifurcation-capable Root Port
> > >> and the quirk should set the flag if the Root Port is indeed configured
> > >> for bifurcation.
> > > It's a function of the slot + card combination, but we can't distinguish this
> > > slot's special power handling behavior from the vanilla behavior. It's modified
> > > to handle power on the logically bifurcated, singular physical device.
> > 
> > Hi Bjorn, Lukas,
> > 
> > I need to resubmit this.
> > 
> > Besides the 'pdev->shared_pcc_and_link_slot = false', addition mentioned
> > above, is there anything else that should be changed or any reason this
> > wouldn't be accepted?
> 
> Another report has cropped up of spurious DLLSC events:
> https://bugzilla.kernel.org/show_bug.cgi?id=216511
> 
> That other case differs from yours in that a spurious DLLSC event
> is seen on plugging *in* a card, whereas in your case the event
> seems to occur on *removing* a card.  In both cases, the spurious
> event is seen on the hotplug port's sibling.
> 
> I'm starting to think that we should probably disable DLLSC events
> entirely if they're known to be unreliable.  The hotplug port
> solely relies on PDC events then.  Otherwise we'd have to clutter

But when we have ATTN, we don't subscribe to PDC events correct?
Can we always rely on PDC changes?

Unless you have a PowerControl, PDC can also be spurious correct? and so
will be DLLSC.. But I agreem, the hotplug code is getting hairy complex
and can help some simplification. A long time ago I remember I worked
out a state diagram in google draw for how these states transition to
guide the code, but I don't remember where they went.

> the event handling with all sorts of special cases.  The code would
> become fairly difficult to follow.
> 
> I've attached an experimental patch to the bug report which disables
> DLLSC events on a hotplug port if a P5608 SSD is plugged into it:
> https://bugzilla.kernel.org/attachment.cgi?id=301845
> 
> Would this approach work for you?
> 
> One other question:  What if the SSD is not bifurcated (i.e. x8
> instead of x4x4), don't we need avoid applying the quirk in that case?
> Your patch doesn't seem to do that.  Can we recognize somehow whether
> the card is bifurcated or not?  Is it sufficient to just look at the
> Maximum Link Width in the Link Capabilities Register?  Does the SSD
> report x4 there if it's bifurcated?
> 
> Thanks,
> 
> Lukas

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-09-26 21:21 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-30 15:56 [PATCH v3] PCI: pciehp: Add quirk to handle spurious DLLSC on a x4x4 SSD Jon Derrick
2021-08-30 17:46 ` Raj, Ashok
2021-08-31  1:59 ` jonathan.derrick
2021-09-12  8:45 ` Lukas Wunner
2021-09-13 21:07   ` Jon Derrick
2021-09-14 14:46     ` Lukas Wunner
2021-09-20 17:18       ` Jon Derrick
2022-07-08 16:35         ` Jonathan Derrick
2022-09-24  7:32           ` Lukas Wunner
2022-09-26 21:05             ` Jonathan Derrick
2022-09-26 21:21             ` Ashok Raj

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.