All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] PCI: pciehp: Ignore spurious link inactive change when off
@ 2021-04-09 20:59 Jon Derrick
  2021-04-09 21:38 ` Raj, Ashok
  2021-05-24 22:42 ` Bjorn Helgaas
  0 siblings, 2 replies; 5+ messages in thread
From: Jon Derrick @ 2021-04-09 20:59 UTC (permalink / raw)
  To: linux-pci
  Cc: Bjorn Helgaas, Lukas Wunner, Ashok Raj, Dan Williams,
	Sathyanarayanan Kuppuswamy, Jon Derrick

When a specific x8 CEM card is bifurcated into x4x4 mode, and the
upstream ports both support hotplugging on each respective x4 device, a
slot management system for the CEM card requires both x4 devices to be
sysfs removed from the OS before it can safely turn-off physical power.
The implications are that Slot Control will display Powered Off status
for the device where the device is actually powered until both ports
have powered off.

When power is removed from the first half, the link remains active to
provide clocking while waiting for the second half to have power
removed. When power is then removed from the second half, the first half
starts shutdown sequence and will trigger a link status change event.
This is misinterpreted as an enabling event due to positive presence
detect and causes the first half to be re-enabled.

The spurious enable can be resolved by ignoring link status change
events when no link is active when in the off state.

Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
---
 drivers/pci/hotplug/pciehp_ctrl.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
index 529c34808440..a2c5eef03e7d 100644
--- a/drivers/pci/hotplug/pciehp_ctrl.c
+++ b/drivers/pci/hotplug/pciehp_ctrl.c
@@ -265,6 +265,11 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
 		cancel_delayed_work(&ctrl->button_work);
 		fallthrough;
 	case OFF_STATE:
+		if ((events & PCI_EXP_SLTSTA_DLLSC) && !link_active) {
+			mutex_unlock(&ctrl->state_lock);
+			break;
+		}
+
 		ctrl->state = POWERON_STATE;
 		mutex_unlock(&ctrl->state_lock);
 		if (present)
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] PCI: pciehp: Ignore spurious link inactive change when off
  2021-04-09 20:59 [PATCH] PCI: pciehp: Ignore spurious link inactive change when off Jon Derrick
@ 2021-04-09 21:38 ` Raj, Ashok
  2021-05-24 22:42 ` Bjorn Helgaas
  1 sibling, 0 replies; 5+ messages in thread
From: Raj, Ashok @ 2021-04-09 21:38 UTC (permalink / raw)
  To: Jon Derrick
  Cc: linux-pci, Bjorn Helgaas, Lukas Wunner, Dan Williams,
	Sathyanarayanan Kuppuswamy, Ashok Raj

On Fri, Apr 09, 2021 at 02:59:35PM -0600, Derrick, Jonathan wrote:
> When a specific x8 CEM card is bifurcated into x4x4 mode, and the
> upstream ports both support hotplugging on each respective x4 device, a
> slot management system for the CEM card requires both x4 devices to be
> sysfs removed from the OS before it can safely turn-off physical power.
> The implications are that Slot Control will display Powered Off status
> for the device where the device is actually powered until both ports
> have powered off.
> 
> When power is removed from the first half, the link remains active to
> provide clocking while waiting for the second half to have power
> removed. When power is then removed from the second half, the first half
> starts shutdown sequence and will trigger a link status change event.
> This is misinterpreted as an enabling event due to positive presence
> detect and causes the first half to be re-enabled.
> 
> The spurious enable can be resolved by ignoring link status change
> events when no link is active when in the off state.
> 
> Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>

Although this seems like it should never happen with normal cards, it seems
harmless otherwise. 


Reviewed by: Ashok Raj <ashok.raj@intel.com>
> ---
>  drivers/pci/hotplug/pciehp_ctrl.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
> index 529c34808440..a2c5eef03e7d 100644
> --- a/drivers/pci/hotplug/pciehp_ctrl.c
> +++ b/drivers/pci/hotplug/pciehp_ctrl.c
> @@ -265,6 +265,11 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
>  		cancel_delayed_work(&ctrl->button_work);
>  		fallthrough;
>  	case OFF_STATE:
> +		if ((events & PCI_EXP_SLTSTA_DLLSC) && !link_active) {
> +			mutex_unlock(&ctrl->state_lock);
> +			break;
> +		}
> +
>  		ctrl->state = POWERON_STATE;
>  		mutex_unlock(&ctrl->state_lock);
>  		if (present)
> -- 
> 2.26.2
> 

-- 
Cheers,
Ashok

[Forgiveness is the attribute of the STRONG - Gandhi]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] PCI: pciehp: Ignore spurious link inactive change when off
  2021-04-09 20:59 [PATCH] PCI: pciehp: Ignore spurious link inactive change when off Jon Derrick
  2021-04-09 21:38 ` Raj, Ashok
@ 2021-05-24 22:42 ` Bjorn Helgaas
  2021-05-25 19:25   ` Lukas Wunner
  1 sibling, 1 reply; 5+ messages in thread
From: Bjorn Helgaas @ 2021-05-24 22:42 UTC (permalink / raw)
  To: Jon Derrick
  Cc: linux-pci, Lukas Wunner, Ashok Raj, Dan Williams,
	Sathyanarayanan Kuppuswamy

On Fri, Apr 09, 2021 at 02:59:35PM -0600, Jon Derrick wrote:
> When a specific x8 CEM card is bifurcated into x4x4 mode, and the
> upstream ports both support hotplugging on each respective x4 device, a
> slot management system for the CEM card requires both x4 devices to be
> sysfs removed from the OS before it can safely turn-off physical power.
> The implications are that Slot Control will display Powered Off status
> for the device where the device is actually powered until both ports
> have powered off.
> 
> When power is removed from the first half, the link remains active to
> provide clocking while waiting for the second half to have power
> removed. When power is then removed from the second half, the first half
> starts shutdown sequence and will trigger a link status change event.
> This is misinterpreted as an enabling event due to positive presence
> detect and causes the first half to be re-enabled.
> 
> The spurious enable can be resolved by ignoring link status change
> events when no link is active when in the off state.
> 
> Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>

Applied to pci/hotplug for v5.14, thanks!

> ---
>  drivers/pci/hotplug/pciehp_ctrl.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
> index 529c34808440..a2c5eef03e7d 100644
> --- a/drivers/pci/hotplug/pciehp_ctrl.c
> +++ b/drivers/pci/hotplug/pciehp_ctrl.c
> @@ -265,6 +265,11 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
>  		cancel_delayed_work(&ctrl->button_work);
>  		fallthrough;
>  	case OFF_STATE:
> +		if ((events & PCI_EXP_SLTSTA_DLLSC) && !link_active) {
> +			mutex_unlock(&ctrl->state_lock);
> +			break;
> +		}
> +
>  		ctrl->state = POWERON_STATE;
>  		mutex_unlock(&ctrl->state_lock);
>  		if (present)
> -- 
> 2.26.2
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] PCI: pciehp: Ignore spurious link inactive change when off
  2021-05-24 22:42 ` Bjorn Helgaas
@ 2021-05-25 19:25   ` Lukas Wunner
  2021-05-25 20:17     ` Bjorn Helgaas
  0 siblings, 1 reply; 5+ messages in thread
From: Lukas Wunner @ 2021-05-25 19:25 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Jon Derrick, linux-pci, Ashok Raj, Dan Williams,
	Sathyanarayanan Kuppuswamy

On Mon, May 24, 2021 at 05:42:18PM -0500, Bjorn Helgaas wrote:
> On Fri, Apr 09, 2021 at 02:59:35PM -0600, Jon Derrick wrote:
> > When a specific x8 CEM card is bifurcated into x4x4 mode, and the
> > upstream ports both support hotplugging on each respective x4 device, a
> > slot management system for the CEM card requires both x4 devices to be
> > sysfs removed from the OS before it can safely turn-off physical power.
> > The implications are that Slot Control will display Powered Off status
> > for the device where the device is actually powered until both ports
> > have powered off.
> > 
> > When power is removed from the first half, the link remains active to
> > provide clocking while waiting for the second half to have power
> > removed. When power is then removed from the second half, the first half
> > starts shutdown sequence and will trigger a link status change event.
> > This is misinterpreted as an enabling event due to positive presence
> > detect and causes the first half to be re-enabled.
> > 
> > The spurious enable can be resolved by ignoring link status change
> > events when no link is active when in the off state.

Sorry for not responding earlier, I missed this patch.


> > --- a/drivers/pci/hotplug/pciehp_ctrl.c
> > +++ b/drivers/pci/hotplug/pciehp_ctrl.c
> > @@ -265,6 +265,11 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
> >  		cancel_delayed_work(&ctrl->button_work);
> >  		fallthrough;
> >  	case OFF_STATE:
> > +		if ((events & PCI_EXP_SLTSTA_DLLSC) && !link_active) {
> > +			mutex_unlock(&ctrl->state_lock);
> > +			break;
> > +		}
> > +

I think this change will inadvertently ignore events that shouldn't be
ignored:  E.g., a DLLSC event may have been triggered by replacement of
the card in the slot and while Presence Detect State is 1, the link may
not yet be active.  The change above will cause not only the DLLSC but
also the PDC event to be ignored.

There are also reports of link flaps on card insertion and the above
change may result in the slot not being brought up even though it should.

The commit message sounds like powering down the CEM card takes longer
than expected.  We wait 1 second in set_slot_off() after disabling
slot power and that's apparently not sufficient.  The 1 second delay
is mandated by section 6.7.1.8 of the PCIe Base Spec.  If this card
needs a longer delay, a quirk should be added rather than changing
the algorithm for everyone.

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] PCI: pciehp: Ignore spurious link inactive change when off
  2021-05-25 19:25   ` Lukas Wunner
@ 2021-05-25 20:17     ` Bjorn Helgaas
  0 siblings, 0 replies; 5+ messages in thread
From: Bjorn Helgaas @ 2021-05-25 20:17 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Jon Derrick, linux-pci, Ashok Raj, Dan Williams,
	Sathyanarayanan Kuppuswamy

On Tue, May 25, 2021 at 09:25:12PM +0200, Lukas Wunner wrote:
> On Mon, May 24, 2021 at 05:42:18PM -0500, Bjorn Helgaas wrote:
> > On Fri, Apr 09, 2021 at 02:59:35PM -0600, Jon Derrick wrote:
> > > When a specific x8 CEM card is bifurcated into x4x4 mode, and the
> > > upstream ports both support hotplugging on each respective x4 device, a
> > > slot management system for the CEM card requires both x4 devices to be
> > > sysfs removed from the OS before it can safely turn-off physical power.
> > > The implications are that Slot Control will display Powered Off status
> > > for the device where the device is actually powered until both ports
> > > have powered off.
> > > 
> > > When power is removed from the first half, the link remains active to
> > > provide clocking while waiting for the second half to have power
> > > removed. When power is then removed from the second half, the first half
> > > starts shutdown sequence and will trigger a link status change event.
> > > This is misinterpreted as an enabling event due to positive presence
> > > detect and causes the first half to be re-enabled.
> > > 
> > > The spurious enable can be resolved by ignoring link status change
> > > events when no link is active when in the off state.
> 
> Sorry for not responding earlier, I missed this patch.
> 
> 
> > > --- a/drivers/pci/hotplug/pciehp_ctrl.c
> > > +++ b/drivers/pci/hotplug/pciehp_ctrl.c
> > > @@ -265,6 +265,11 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
> > >  		cancel_delayed_work(&ctrl->button_work);
> > >  		fallthrough;
> > >  	case OFF_STATE:
> > > +		if ((events & PCI_EXP_SLTSTA_DLLSC) && !link_active) {
> > > +			mutex_unlock(&ctrl->state_lock);
> > > +			break;
> > > +		}
> > > +
> 
> I think this change will inadvertently ignore events that shouldn't be
> ignored:  E.g., a DLLSC event may have been triggered by replacement of
> the card in the slot and while Presence Detect State is 1, the link may
> not yet be active.  The change above will cause not only the DLLSC but
> also the PDC event to be ignored.
> 
> There are also reports of link flaps on card insertion and the above
> change may result in the slot not being brought up even though it should.
> 
> The commit message sounds like powering down the CEM card takes longer
> than expected.  We wait 1 second in set_slot_off() after disabling
> slot power and that's apparently not sufficient.  The 1 second delay
> is mandated by section 6.7.1.8 of the PCIe Base Spec.  If this card
> needs a longer delay, a quirk should be added rather than changing
> the algorithm for everyone.

I dropped this patch for now, thanks for taking a look, Lukas.

Bjorn

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-05-25 20:17 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-09 20:59 [PATCH] PCI: pciehp: Ignore spurious link inactive change when off Jon Derrick
2021-04-09 21:38 ` Raj, Ashok
2021-05-24 22:42 ` Bjorn Helgaas
2021-05-25 19:25   ` Lukas Wunner
2021-05-25 20:17     ` Bjorn Helgaas

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.