All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/2] pci: add option to ignore slot capabilities
@ 2016-08-08 20:19 Keith Busch
  2016-08-08 20:19 ` [PATCH 2/2] pci: Add ignore indicator quirk for devices Keith Busch
  2016-08-13  0:03 ` [PATCH 1/2] pci: add option to ignore slot capabilities Sean O. Stalley
  0 siblings, 2 replies; 23+ messages in thread
From: Keith Busch @ 2016-08-08 20:19 UTC (permalink / raw)
  To: linux-pci, Bjorn Helgaas; +Cc: Keith Busch

This adds flags to struct pci_dev that can be set to request ignoring
attention and power indicators. This is in prepration for devices that
advertise these capabilities, but do not support it.

Signed-off-by: Keith Busch <keith.busch@intel.com>
---
 drivers/pci/hotplug/pciehp_hpc.c | 6 ++++++
 include/linux/pci.h              | 2 ++
 2 files changed, 8 insertions(+)

diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
index 08e84d6..3e6646c 100644
--- a/drivers/pci/hotplug/pciehp_hpc.c
+++ b/drivers/pci/hotplug/pciehp_hpc.c
@@ -804,6 +804,12 @@ struct controller *pcie_init(struct pcie_device *dev)
 	}
 	ctrl->pcie = dev;
 	pcie_capability_read_dword(pdev, PCI_EXP_SLTCAP, &slot_cap);
+
+	if (pdev->ignore_aip)
+		slot_cap &= ~PCI_EXP_SLTCAP_AIP;
+	if (pdev->ignore_pip)
+		slot_cap &= ~PCI_EXP_SLTCAP_PIP;
+
 	ctrl->slot_cap = slot_cap;
 	mutex_init(&ctrl->ctrl_lock);
 	init_waitqueue_head(&ctrl->queue);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 9890906..d8bc530 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -308,6 +308,8 @@ struct pci_dev {
 						   powered on/off by the
 						   corresponding bridge */
 	unsigned int	ignore_hotplug:1;	/* Ignore hotplug events */
+	unsigned int	ignore_aip:1;		/* Ignore attention indicator */
+	unsigned int	ignore_pip:1;		/* Ignore power indicator */
 	unsigned int	d3_delay;	/* D3->D0 transition time in ms */
 	unsigned int	d3cold_delay;	/* D3cold->D0 transition time in ms */
 
-- 
2.7.2


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 2/2] pci: Add ignore indicator quirk for devices
  2016-08-08 20:19 [PATCH 1/2] pci: add option to ignore slot capabilities Keith Busch
@ 2016-08-08 20:19 ` Keith Busch
  2016-08-15 17:40   ` Bjorn Helgaas
  2016-08-13  0:03 ` [PATCH 1/2] pci: add option to ignore slot capabilities Sean O. Stalley
  1 sibling, 1 reply; 23+ messages in thread
From: Keith Busch @ 2016-08-08 20:19 UTC (permalink / raw)
  To: linux-pci, Bjorn Helgaas; +Cc: Keith Busch

This adds a quirk for certain devices that require ignoring attention
and power indicators.

Signed-off-by: Keith Busch <keith.busch@intel.com>
---
 drivers/pci/quirks.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index b69321c..b5bd7a0 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -4427,3 +4427,18 @@ static void quirk_intel_qat_vf_cap(struct pci_dev *pdev)
 	}
 }
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x443, quirk_intel_qat_vf_cap);
+
+/*
+ * The PCIe slot capabilities for Intel compatible Hot-swap backplane advertise
+ * attention and power indicators, but will do the wrong thing if used in a
+ * standard way. Ignore these.
+ */
+static void quirk_hsbp(struct pci_dev *pdev)
+{
+	pdev->ignore_aip = 1;
+	pdev->ignore_pip = 1;
+}
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x2030, quirk_hsbp);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x2031, quirk_hsbp);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x2032, quirk_hsbp);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x2033, quirk_hsbp);
-- 
2.7.2


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] pci: add option to ignore slot capabilities
  2016-08-08 20:19 [PATCH 1/2] pci: add option to ignore slot capabilities Keith Busch
  2016-08-08 20:19 ` [PATCH 2/2] pci: Add ignore indicator quirk for devices Keith Busch
@ 2016-08-13  0:03 ` Sean O. Stalley
  2016-08-13  0:58   ` Keith Busch
  1 sibling, 1 reply; 23+ messages in thread
From: Sean O. Stalley @ 2016-08-13  0:03 UTC (permalink / raw)
  To: Keith Busch; +Cc: linux-pci, Bjorn Helgaas

Hi Keith,

On Mon, Aug 08, 2016 at 02:19:01PM -0600, Keith Busch wrote:
> This adds flags to struct pci_dev that can be set to request ignoring
> attention and power indicators. This is in prepration for devices that
> advertise these capabilities, but do not support it.
> 
> Signed-off-by: Keith Busch <keith.busch@intel.com>
> ---
>  drivers/pci/hotplug/pciehp_hpc.c | 6 ++++++
>  include/linux/pci.h              | 2 ++
>  2 files changed, 8 insertions(+)
> 
> diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
> index 08e84d6..3e6646c 100644
> --- a/drivers/pci/hotplug/pciehp_hpc.c
> +++ b/drivers/pci/hotplug/pciehp_hpc.c
> @@ -804,6 +804,12 @@ struct controller *pcie_init(struct pcie_device *dev)
>  	}
>  	ctrl->pcie = dev;
>  	pcie_capability_read_dword(pdev, PCI_EXP_SLTCAP, &slot_cap);
> +
> +	if (pdev->ignore_aip)
> +		slot_cap &= ~PCI_EXP_SLTCAP_AIP;
> +	if (pdev->ignore_pip)
> +		slot_cap &= ~PCI_EXP_SLTCAP_PIP;
> +

I think we should clear these bits after the ctrl_info() below.
If we clear it immediately, the ctrl_info() will report these bits as cleared,
even though they may be set in hardware.

I don't think we want to put a value into the log that differs from what hardware reports.

>  	ctrl->slot_cap = slot_cap;
>  	mutex_init(&ctrl->ctrl_lock);
>  	init_waitqueue_head(&ctrl->queue);
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 9890906..d8bc530 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -308,6 +308,8 @@ struct pci_dev {
>  						   powered on/off by the
>  						   corresponding bridge */
>  	unsigned int	ignore_hotplug:1;	/* Ignore hotplug events */
> +	unsigned int	ignore_aip:1;		/* Ignore attention indicator */
> +	unsigned int	ignore_pip:1;		/* Ignore power indicator */
>  	unsigned int	d3_delay;	/* D3->D0 transition time in ms */
>  	unsigned int	d3cold_delay;	/* D3cold->D0 transition time in ms */
>  
> -- 
> 2.7.2
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] pci: add option to ignore slot capabilities
  2016-08-13  0:03 ` [PATCH 1/2] pci: add option to ignore slot capabilities Sean O. Stalley
@ 2016-08-13  0:58   ` Keith Busch
  0 siblings, 0 replies; 23+ messages in thread
From: Keith Busch @ 2016-08-13  0:58 UTC (permalink / raw)
  To: Sean O. Stalley; +Cc: linux-pci, Bjorn Helgaas

On Fri, Aug 12, 2016 at 05:03:58PM -0700, Sean O. Stalley wrote:
> I think we should clear these bits after the ctrl_info() below.
> If we clear it immediately, the ctrl_info() will report these bits as cleared,
> even though they may be set in hardware.
> 
> I don't think we want to put a value into the log that differs from what hardware reports.

Maybe, I didn't think of it that way. I thought if a user saw the
capabilities in the kernel message, she could reasonably expect what
the pciehp driver is going to do, even though the driver masked off
that behavior.

'lspci' would still show the hardware capabilities if that's okay for
a user. Had I my way, the hardware wouldn't advertise this capability
at all.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/2] pci: Add ignore indicator quirk for devices
  2016-08-08 20:19 ` [PATCH 2/2] pci: Add ignore indicator quirk for devices Keith Busch
@ 2016-08-15 17:40   ` Bjorn Helgaas
  2016-08-15 19:23     ` Keith Busch
  0 siblings, 1 reply; 23+ messages in thread
From: Bjorn Helgaas @ 2016-08-15 17:40 UTC (permalink / raw)
  To: Keith Busch; +Cc: linux-pci, Bjorn Helgaas

On Mon, Aug 08, 2016 at 02:19:02PM -0600, Keith Busch wrote:
> This adds a quirk for certain devices that require ignoring attention
> and power indicators.
> 
> Signed-off-by: Keith Busch <keith.busch@intel.com>
> ---
>  drivers/pci/quirks.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index b69321c..b5bd7a0 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -4427,3 +4427,18 @@ static void quirk_intel_qat_vf_cap(struct pci_dev *pdev)
>  	}
>  }
>  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x443, quirk_intel_qat_vf_cap);
> +
> +/*
> + * The PCIe slot capabilities for Intel compatible Hot-swap backplane advertise
> + * attention and power indicators, but will do the wrong thing if used in a
> + * standard way. Ignore these.
> + */

Hmm.  So I guess you're saying these devices are defective?  Is there
an erratum we can reference?

What exactly does "do the wrong thing" mean?  These are indicators, so
the only thing we really do is turn them on and off.  I think we do
that with pcie_write_cmd_nowait(), and all the synchronization there
is a little messy.  Maybe we got that wrong somehow?

It's hard to believe something as simple as controlling an LED is
broken.  If it *is* broken, I would think the breakage would be
platform-dependent, not just device-dependent, i.e., I would suspect
something wrong with motherboard wiring or firmware.

> +static void quirk_hsbp(struct pci_dev *pdev)
> +{
> +	pdev->ignore_aip = 1;
> +	pdev->ignore_pip = 1;
> +}
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x2030, quirk_hsbp);
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x2031, quirk_hsbp);
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x2032, quirk_hsbp);
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x2033, quirk_hsbp);
> -- 
> 2.7.2
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/2] pci: Add ignore indicator quirk for devices
  2016-08-15 17:40   ` Bjorn Helgaas
@ 2016-08-15 19:23     ` Keith Busch
  2016-08-15 19:50       ` Bjorn Helgaas
  2016-08-17 21:37       ` Bjorn Helgaas
  0 siblings, 2 replies; 23+ messages in thread
From: Keith Busch @ 2016-08-15 19:23 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: linux-pci, Bjorn Helgaas

On Mon, Aug 15, 2016 at 12:40:02PM -0500, Bjorn Helgaas wrote:
> On Mon, Aug 08, 2016 at 02:19:02PM -0600, Keith Busch wrote:
> > +/*
> > + * The PCIe slot capabilities for Intel compatible Hot-swap backplane advertise
> > + * attention and power indicators, but will do the wrong thing if used in a
> > + * standard way. Ignore these.
> > + */
> 
> Hmm.  So I guess you're saying these devices are defective?  Is there
> an erratum we can reference?
> 
> What exactly does "do the wrong thing" mean?  These are indicators, so
> the only thing we really do is turn them on and off.  I think we do
> that with pcie_write_cmd_nowait(), and all the synchronization there
> is a little messy.  Maybe we got that wrong somehow?
> 
> It's hard to believe something as simple as controlling an LED is
> broken.  If it *is* broken, I would think the breakage would be
> platform-dependent, not just device-dependent, i.e., I would suspect
> something wrong with motherboard wiring or firmware.

This is actually a "feature". The devices listed in the patch re-purpose
the spec defined capability and control bits for Attention and Power
indicators. The control values match IBPI (International Blinking Pattern
Interpretation) rather than the spec definition.

Since these operate in a non-standard way, we'd just as soon not let
the kernel know about them (an incorrect LED pattern will definitely
occur). The LEDs are to be set from user space by 'ledmon' instead.

Had I my way, the hardware wouldn't advertise the capability in the
first place. I rarely get my way, so I instead get to publicly defend
the quirk. :)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/2] pci: Add ignore indicator quirk for devices
  2016-08-15 19:23     ` Keith Busch
@ 2016-08-15 19:50       ` Bjorn Helgaas
  2016-08-15 22:35         ` Keith Busch
  2016-08-17 21:37       ` Bjorn Helgaas
  1 sibling, 1 reply; 23+ messages in thread
From: Bjorn Helgaas @ 2016-08-15 19:50 UTC (permalink / raw)
  To: Keith Busch; +Cc: linux-pci, Bjorn Helgaas

On Mon, Aug 15, 2016 at 03:23:16PM -0400, Keith Busch wrote:
> On Mon, Aug 15, 2016 at 12:40:02PM -0500, Bjorn Helgaas wrote:
> > On Mon, Aug 08, 2016 at 02:19:02PM -0600, Keith Busch wrote:
> > > +/*
> > > + * The PCIe slot capabilities for Intel compatible Hot-swap backplane advertise
> > > + * attention and power indicators, but will do the wrong thing if used in a
> > > + * standard way. Ignore these.
> > > + */
> > 
> > Hmm.  So I guess you're saying these devices are defective?  Is there
> > an erratum we can reference?
> > 
> > What exactly does "do the wrong thing" mean?  These are indicators, so
> > the only thing we really do is turn them on and off.  I think we do
> > that with pcie_write_cmd_nowait(), and all the synchronization there
> > is a little messy.  Maybe we got that wrong somehow?
> > 
> > It's hard to believe something as simple as controlling an LED is
> > broken.  If it *is* broken, I would think the breakage would be
> > platform-dependent, not just device-dependent, i.e., I would suspect
> > something wrong with motherboard wiring or firmware.
> 
> This is actually a "feature". The devices listed in the patch re-purpose
> the spec defined capability and control bits for Attention and Power
> indicators. The control values match IBPI (International Blinking Pattern
> Interpretation) rather than the spec definition.
> 
> Since these operate in a non-standard way, we'd just as soon not let
> the kernel know about them (an incorrect LED pattern will definitely
> occur). The LEDs are to be set from user space by 'ledmon' instead.

What?  This is August 15, not April 1.  That makes no sense
whatsoever.

The device is advertising that it supports the hotplug model in the
PCIe spec.  If you don't like that model, you should define a new
model with a different capability ID.

Or you could propose a PCIe spec update with a new capability bit so
software can tell that this device supports different functionality.

> Had I my way, the hardware wouldn't advertise the capability in the
> first place. I rarely get my way, so I instead get to publicly defend
> the quirk. :)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/2] pci: Add ignore indicator quirk for devices
  2016-08-15 19:50       ` Bjorn Helgaas
@ 2016-08-15 22:35         ` Keith Busch
  2016-08-16  3:03           ` Bjorn Helgaas
  0 siblings, 1 reply; 23+ messages in thread
From: Keith Busch @ 2016-08-15 22:35 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: linux-pci, Bjorn Helgaas

On Mon, Aug 15, 2016 at 02:50:58PM -0500, Bjorn Helgaas wrote:
> What?  This is August 15, not April 1.  That makes no sense
> whatsoever.

Yes, I agree, and not advocating this was a good idea. I'm just the
messenger in this case.

That said, it looks like this can fall under the intended usage for
the quirk framework if we can consider this a work-around for standard
non-compliance. Or are your thoughts that we've gone too far if it's
not a true erratum?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/2] pci: Add ignore indicator quirk for devices
  2016-08-15 22:35         ` Keith Busch
@ 2016-08-16  3:03           ` Bjorn Helgaas
  0 siblings, 0 replies; 23+ messages in thread
From: Bjorn Helgaas @ 2016-08-16  3:03 UTC (permalink / raw)
  To: Keith Busch; +Cc: linux-pci, Bjorn Helgaas

On Mon, Aug 15, 2016 at 06:35:03PM -0400, Keith Busch wrote:
> On Mon, Aug 15, 2016 at 02:50:58PM -0500, Bjorn Helgaas wrote:
> > What?  This is August 15, not April 1.  That makes no sense
> > whatsoever.
> 
> Yes, I agree, and not advocating this was a good idea. I'm just the
> messenger in this case.
> 
> That said, it looks like this can fall under the intended usage for
> the quirk framework if we can consider this a work-around for standard
> non-compliance. Or are your thoughts that we've gone too far if it's
> not a true erratum?

Well, I don't want to add random quirks for everybody who decides to
make their own non-standard hardware.  The reason we have the PCI
specs is for interoperability, and this is not it.

The whole purpose of capability bits is for the hardware to tell the
OS that it has certain *capabilities* that work in certain ways.  It's
idiotic to set a capability bit that says "I have this indicator" when
it doesn't work the way it's supposed to.  The hardware should just
NOT SET THE BIT.  How hard is that?  I can't conceive of the reasoning
that concluded this was useful.

I don't care if you want to make the LEDs do something else, but for
Pete's sake, don't claim they work according to spec when they don't.

I assume there will be more broken devices like this in the future.
How long will we be adding quirks?  Do you have a plan to resolve this
the correct way eventually?

I still don't know what it means that the devices "do the wrong
thing."  Do the LEDs not work?  Do the LEDs work according to the PCIe
spec, but the patterns don't match the International Blinken Lights
spec?  Does the hotplug controller lock up?

Sigh.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/2] pci: Add ignore indicator quirk for devices
  2016-08-15 19:23     ` Keith Busch
  2016-08-15 19:50       ` Bjorn Helgaas
@ 2016-08-17 21:37       ` Bjorn Helgaas
  2016-08-17 23:09         ` Keith Busch
  1 sibling, 1 reply; 23+ messages in thread
From: Bjorn Helgaas @ 2016-08-17 21:37 UTC (permalink / raw)
  To: Keith Busch; +Cc: linux-pci, Bjorn Helgaas

On Mon, Aug 15, 2016 at 03:23:16PM -0400, Keith Busch wrote:
> On Mon, Aug 15, 2016 at 12:40:02PM -0500, Bjorn Helgaas wrote:
> > On Mon, Aug 08, 2016 at 02:19:02PM -0600, Keith Busch wrote:
> > > +/*
> > > + * The PCIe slot capabilities for Intel compatible Hot-swap backplane advertise
> > > + * attention and power indicators, but will do the wrong thing if used in a
> > > + * standard way. Ignore these.
> > > + */
> > 
> > Hmm.  So I guess you're saying these devices are defective?  Is there
> > an erratum we can reference?
> > 
> > What exactly does "do the wrong thing" mean?  These are indicators, so
> > the only thing we really do is turn them on and off.  I think we do
> > that with pcie_write_cmd_nowait(), and all the synchronization there
> > is a little messy.  Maybe we got that wrong somehow?
> > 
> > It's hard to believe something as simple as controlling an LED is
> > broken.  If it *is* broken, I would think the breakage would be
> > platform-dependent, not just device-dependent, i.e., I would suspect
> > something wrong with motherboard wiring or firmware.
> 
> This is actually a "feature". The devices listed in the patch re-purpose
> the spec defined capability and control bits for Attention and Power
> indicators. The control values match IBPI (International Blinking Pattern
> Interpretation) rather than the spec definition.
> 
> Since these operate in a non-standard way, we'd just as soon not let
> the kernel know about them (an incorrect LED pattern will definitely
> occur). The LEDs are to be set from user space by 'ledmon' instead.
> 
> Had I my way, the hardware wouldn't advertise the capability in the
> first place. I rarely get my way, so I instead get to publicly defend
> the quirk. :)

Usually when I think something is totally stupid, it's because I don't
know the whole story.  So it might make more sense and lead to a
better solution if you could tell us more about your intent here.

According to the Linux PCI database, the devices you want to quirk are:

  2030  Sky Lake-E PCI Express Root Port 1A	
  2031  Sky Lake-E PCI Express Root Port 1B	
  2032  Sky Lake-E PCI Express Root Port 1C	
  2033  Sky Lake-E PCI Express Root Port 1D

So are you saying that on every platform that uses Sky Lake-E, these
indicators are non-standard in this way?

IBPI looks like it's targeted at storage arrays, since it has states
for "drive not present", "fail", "rebuild", "hotspare", etc.  Maybe
there's some sense for Sky Lake-E platforms with directly-attached
storage.

But if somebody built a Sky Lake-E platform with one of these Root
Ports leading to a plain hotplug PCIe slot with regular indicators,
your quirk would break them, wouldn't it?  Or are you imposing
constraints on how those Root Ports can be used?

How does 'ledmon' manage the indicators?  The kernel (pciehp) uses the
Slot Control register, which is not completely trivial because of the
Command Completed synchronization required.  I'm hoping ledmon isn't
going to mess up that synchronization.

How does this work for other OSes?  Are you proposing similar changes
to Windows?

What's your plan for backwards compatibility?  Just accept that old
OSes won't be able to operate the indicators correctly until they're
patched with this quirk?

You must have set that capability bit for some reason.  You don't want
the OS to consume it, so who *do* you expect to consume it, and how
(direct PCI config access, lspci, etc.), and what are they supposed to
do with it?

Still scratching my head,
Bjorn

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/2] pci: Add ignore indicator quirk for devices
  2016-08-17 21:37       ` Bjorn Helgaas
@ 2016-08-17 23:09         ` Keith Busch
  2016-08-18 19:56           ` Bjorn Helgaas
  0 siblings, 1 reply; 23+ messages in thread
From: Keith Busch @ 2016-08-17 23:09 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: linux-pci, Bjorn Helgaas

On Wed, Aug 17, 2016 at 04:37:45PM -0500, Bjorn Helgaas wrote:
> Usually when I think something is totally stupid, it's because I don't
> know the whole story.  So it might make more sense and lead to a
> better solution if you could tell us more about your intent here.

Definitely. I did not provide all the details because I didn't think
knowing the full context would help toward understanding the function
of the patch, but I see now that skimping on the details did not help
our cause.

This came about from wanting a simple SGPIO-like LED management solution
for PCIe SSDs. The Intel group who made this, not considering the
more broad impact on standarization, chose to reuse the hot plug
serial SMBus in the Intel CPUs (aka VPP) that already carried the Slot
Control register bits out of the CPU.

We would love to have been able to disable the capability present
bits, but the hardware logic that would disable those bits would also
have disabled LED control entirely, and we can't change the hardware
now. We have to rely on software to work around this limitation for this
generation of hardware.

The next generation of these devices will pursue standards compliant
methods by engaging with PCI-SIG and the NVMe-MI standards bodies.
This current generation of devices is the only one set this way that
requires this work-around.

> According to the Linux PCI database, the devices you want to quirk are:
> 
>   2030  Sky Lake-E PCI Express Root Port 1A	
>   2031  Sky Lake-E PCI Express Root Port 1B	
>   2032  Sky Lake-E PCI Express Root Port 1C	
>   2033  Sky Lake-E PCI Express Root Port 1D
>
> 
> So are you saying that on every platform that uses Sky Lake-E, these
> indicators are non-standard in this way?

Yes (more details below), but I may need to refine this to specific
subsystem IDs (clarifying that internally now).

Also, just for future note, the device list that I provided with the quirk
may need to be augmented for other vendor devices that implement it this
way as well.
 
> IBPI looks like it's targeted at storage arrays, since it has states
> for "drive not present", "fail", "rebuild", "hotspare", etc.  Maybe
> there's some sense for Sky Lake-E platforms with directly-attached
> storage.
> 
> But if somebody built a Sky Lake-E platform with one of these Root
> Ports leading to a plain hotplug PCIe slot with regular indicators,
> your quirk would break them, wouldn't it?  Or are you imposing
> constraints on how those Root Ports can be used?

Yes, you totally nailed it on this being for storage.

To go even further, we constrain these to be in Sky Lake's VMD domains. We
do not support using VMD for anything but PCIe storage.

> How does 'ledmon' manage the indicators?  The kernel (pciehp) uses the
> Slot Control register, which is not completely trivial because of the
> Command Completed synchronization required.  I'm hoping ledmon isn't
> going to mess up that synchronization.

We've augmented 'ledmon' with libpci and toggles these indicators very
similar to how 'setpci' can change PCI config registers. It is done only
after the storage device is up and registered, which is well outside
the time when pciehp is actively using the slot control.
 
> How does this work for other OSes?  Are you proposing similar changes
> to Windows?

Aha, the mystery to that part might be clarified knowing this is tied
to VMD.

For other OSes including Windows, the VMD driver does not expose the
domain to the rest of the operating system. These VMD drivers implement
their own pci bus and port service drivers, nvme storage drivers,
and interrupt chaining, so they don't have to worry about anyone using
these devices incorrectly. Those drivers also provide VMD specific
management interfaces (special ioctls, for example) to manage its domain.

For Linux, we did not think it appropriate or upstreamable to
re-write all the drivers and services into a monolithic VMD driver,
then implement new user tooling for things that 'lspci' and 'setpci'
already provide. Instead, we leverage all the existing good code by
implementing VMD as a PCI host-bridge driver. The down side is we have
to get our quirks fixed in generic code.


> What's your plan for backwards compatibility?  Just accept that old
> OSes won't be able to operate the indicators correctly until they're
> patched with this quirk?

You may have surmised this already based on the new details, but I'll
just add that since it's tied to VMD, and that being a very new driver,
backward compatibility is not a concern.

> You must have set that capability bit for some reason.  You don't want
> the OS to consume it, so who *do* you expect to consume it, and how
> (direct PCI config access, lspci, etc.), and what are they supposed to
> do with it?

Just restating what I mentioned above, this is an artifact on how the h/w gates
were wired. Suppressing the capability bit automatically suppresses the
LED control, and we need LED control.

This is being addressed in the next generation to do provide the desired
LED control in a standard compliant manner.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/2] pci: Add ignore indicator quirk for devices
  2016-08-17 23:09         ` Keith Busch
@ 2016-08-18 19:56           ` Bjorn Helgaas
  2016-08-18 22:46             ` Keith Busch
  0 siblings, 1 reply; 23+ messages in thread
From: Bjorn Helgaas @ 2016-08-18 19:56 UTC (permalink / raw)
  To: Keith Busch; +Cc: linux-pci, Bjorn Helgaas

On Wed, Aug 17, 2016 at 07:09:51PM -0400, Keith Busch wrote:
> On Wed, Aug 17, 2016 at 04:37:45PM -0500, Bjorn Helgaas wrote:
> > Usually when I think something is totally stupid, it's because I don't
> > know the whole story.  So it might make more sense and lead to a
> > better solution if you could tell us more about your intent here.
> 
> Definitely. I did not provide all the details because I didn't think
> knowing the full context would help toward understanding the function
> of the patch, but I see now that skimping on the details did not help
> our cause.
> 
> This came about from wanting a simple SGPIO-like LED management solution
> for PCIe SSDs. The Intel group who made this, not considering the
> more broad impact on standarization, chose to reuse the hot plug
> serial SMBus in the Intel CPUs (aka VPP) that already carried the Slot
> Control register bits out of the CPU.
> 
> We would love to have been able to disable the capability present
> bits, but the hardware logic that would disable those bits would also
> have disabled LED control entirely, and we can't change the hardware
> now. We have to rely on software to work around this limitation for this
> generation of hardware.
> 
> The next generation of these devices will pursue standards compliant
> methods by engaging with PCI-SIG and the NVMe-MI standards bodies.
> This current generation of devices is the only one set this way that
> requires this work-around.

That's good news.

> > How does 'ledmon' manage the indicators?  The kernel (pciehp) uses the
> > Slot Control register, which is not completely trivial because of the
> > Command Completed synchronization required.  I'm hoping ledmon isn't
> > going to mess up that synchronization.
> 
> We've augmented 'ledmon' with libpci and toggles these indicators very
> similar to how 'setpci' can change PCI config registers. It is done only
> after the storage device is up and registered, which is well outside
> the time when pciehp is actively using the slot control.

I assume this means ledmon writes Slot Control to manage the LEDs.
That sounds pretty scary to me because it's impossible to enforce the
assumption that ledmon only uses Slot Control when pciehp isn't using
it.  Hotplug includes surprise events, and I think pciehp is entitled
to assume it is the sole owner of Slot Control.

I wonder if there's some way to implement the LED control in pciehp,
e.g., by enhancing pciehp_set_attention_status().  I assume the
desired indicator state is coming from some in-kernel storage driver,
and ledmon learns about that somehow, then fiddles with Slot Control
behind the kernel's back?  That looping from kernel to user and back
to kernel sounds a little dodgy and fragile.

Can you share any details about how you plan to implement this on the
next generation of devices?  Maybe we can enhance pciehp indicator
control in a way that supports both Sky Lake and future devices.

Bjorn

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/2] pci: Add ignore indicator quirk for devices
  2016-08-18 19:56           ` Bjorn Helgaas
@ 2016-08-18 22:46             ` Keith Busch
  2016-08-22 16:55               ` Bjorn Helgaas
  0 siblings, 1 reply; 23+ messages in thread
From: Keith Busch @ 2016-08-18 22:46 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: linux-pci, Bjorn Helgaas

On Thu, Aug 18, 2016 at 02:56:56PM -0500, Bjorn Helgaas wrote:
> On Wed, Aug 17, 2016 at 07:09:51PM -0400, Keith Busch wrote:
> > We've augmented 'ledmon' with libpci and toggles these indicators very
> > similar to how 'setpci' can change PCI config registers. It is done only
> > after the storage device is up and registered, which is well outside
> > the time when pciehp is actively using the slot control.
> 
> I assume this means ledmon writes Slot Control to manage the LEDs.

Correct, the registers that are defined for AttnCtrl and PwrCtrl are
what's being redefined for the new pattern.

We are definitely not dead set on requiring we let ledmon have direct
access to SlotCtl through libpci. That's just the one way we thought of
that didn't require new kernel dependencies.

> That sounds pretty scary to me because it's impossible to enforce the
> assumption that ledmon only uses Slot Control when pciehp isn't using
> it.  Hotplug includes surprise events, and I think pciehp is entitled
> to assume it is the sole owner of Slot Control.

I understand the concern here. You don't want independent programs
vying to mask/unmask bits in SlotCtl and end up with a state neither
intended. The only scenario that I can come up with where that could
happen is if ledmon changes the LED on a slot at the same time the drive
is removed, but there is no MRL or attention buttons on this hardware,
and the rest that we do care about looks totally harmless on these slots.

This condition also isn't really new. While probably not recommended,
I could blink the standard Attn and Pwr LED's like this (user, of course,
assumes all the risks):

  # setpci -s <B:D.f> CAP_EXP+18.w=280:3c0

It's basically the same as what we'd have ledmon do, but with a less
esoteric syntax:

  # ledctl locate=/dev/nvme0n1

[ledmon invokes ledctl from a daemon, but ledctl can run on its own]

> I wonder if there's some way to implement the LED control in pciehp,
> e.g., by enhancing pciehp_set_attention_status().  I assume the
> desired indicator state is coming from some in-kernel storage driver,
> and ledmon learns about that somehow, then fiddles with Slot Control
> behind the kernel's back?  That looping from kernel to user and back
> to kernel sounds a little dodgy and fragile.

That is definitely an interesting possibility if you are open to
this. We'd "just" need the kernel to comprehend these vendor specific
bit patterns for this particular generation of hardware.

If you're okay with that, then I'm more than happy to propose a patch
for consideration. We can then have ledmon subscribe to the sysfs entry
point instead of going through libpci, no problem.
 
> Can you share any details about how you plan to implement this on the
> next generation of devices?  Maybe we can enhance pciehp indicator
> control in a way that supports both Sky Lake and future devices.

One possibility is to define through PCI-SIG
SlotCap2 and SlotCtl2 with this kind of LED control.

Another possibilities I hear about through the NVMe-MI working group
is creating similar type of devices as SES (SCSI Enclosure Services)
for PCIe SSD enclosures.



Finally, at the risk of digressing this thread, but if I may switch
topic back to suppressing the indicator presence fields...

Since these are specific to VMD domains, there are folks at Intel who
want to see the VMD driver mask off these capabilities from the driver.
There is never a case where a slot in this domain should advertise these,
so masking off the bits at the config read level would achieve the desired
result. It'd be isolated to the VMD driver, and the quirk wouldn't need
to be maintained as addtional vendor devices implementing the same quirk
are made.

Below is the diff for *that* proposal, and the developer of this patch
wanted to know if you may consider this over the quirk that was previously
written. This looks odd to me, but I can't argue against that it achieves
what the device maker desires.

Thanks!
Keith

---
diff --git a/arch/x86/pci/vmd.c b/arch/x86/pci/vmd.c
index b73da50..aa2791f 100644
--- a/arch/x86/pci/vmd.c
+++ b/arch/x86/pci/vmd.c
@@ -452,6 +452,7 @@ static int vmd_pci_read(struct pci_bus *bus,
unsigned int devfn, int reg,
 {
 	struct vmd_dev *vmd = vmd_from_bus(bus);
 	char __iomem *addr = vmd_cfg_addr(vmd, bus, devfn, reg, len);
+	struct pci_dev *dev;
 	unsigned long flags;
 	int ret = 0;
 
@@ -474,6 +475,17 @@ static int vmd_pci_read(struct pci_bus *bus,
unsigned int devfn, int reg,
 		break;
 	}
 	spin_unlock_irqrestore(&vmd->cfg_lock, flags);
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+		if (dev->devfn == devfn) {
+			if (dev->pcie_cap &&
+					reg == dev->pcie_cap + PCI_EXP_SLTCAP)
+				*value &= ~(PCI_EXP_SLTCAP_AIP |
+					    PCI_EXP_SLTCAP_PIP);
+			break;
+		}
+	}
+
 	return ret;
 }
 
--

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/2] pci: Add ignore indicator quirk for devices
  2016-08-18 22:46             ` Keith Busch
@ 2016-08-22 16:55               ` Bjorn Helgaas
  2016-08-22 21:15                 ` Keith Busch
  0 siblings, 1 reply; 23+ messages in thread
From: Bjorn Helgaas @ 2016-08-22 16:55 UTC (permalink / raw)
  To: Keith Busch; +Cc: linux-pci, Bjorn Helgaas

On Thu, Aug 18, 2016 at 06:46:10PM -0400, Keith Busch wrote:
> On Thu, Aug 18, 2016 at 02:56:56PM -0500, Bjorn Helgaas wrote:
> > On Wed, Aug 17, 2016 at 07:09:51PM -0400, Keith Busch wrote:
> > > We've augmented 'ledmon' with libpci and toggles these indicators very
> > > similar to how 'setpci' can change PCI config registers. It is done only
> > > after the storage device is up and registered, which is well outside
> > > the time when pciehp is actively using the slot control.
> > 
> > I assume this means ledmon writes Slot Control to manage the LEDs.
> 
> Correct, the registers that are defined for AttnCtrl and PwrCtrl are
> what's being redefined for the new pattern.
> 
> We are definitely not dead set on requiring we let ledmon have direct
> access to SlotCtl through libpci. That's just the one way we thought of
> that didn't require new kernel dependencies.
> 
> > That sounds pretty scary to me because it's impossible to enforce the
> > assumption that ledmon only uses Slot Control when pciehp isn't using
> > it.  Hotplug includes surprise events, and I think pciehp is entitled
> > to assume it is the sole owner of Slot Control.
> 
> I understand the concern here. You don't want independent programs
> vying to mask/unmask bits in SlotCtl and end up with a state neither
> intended. The only scenario that I can come up with where that could
> happen is if ledmon changes the LED on a slot at the same time the drive
> is removed, but there is no MRL or attention buttons on this hardware,
> and the rest that we do care about looks totally harmless on these slots.
> 
> This condition also isn't really new. While probably not recommended,
> I could blink the standard Attn and Pwr LED's like this (user, of course,
> assumes all the risks):
> 
>   # setpci -s <B:D.f> CAP_EXP+18.w=280:3c0

Definitely not recommended.  If you use setpci, all bets are off and
we can't really support the system afterwards.  In fact, we probably
should taint the kernel when users write to config space.

If we tainted the kernel and put something in dmesg on the first user
config space write, I might be OK with the original quirk to make
pciehp ignore the attention and power indicators.

The only reason I suggest a dmesg note is because while we can't
*imagine* a scenario where pciehp and ledmon stomp on each other, that
doesn't mean such a scenario doesn't exist, and a problem would be
hard to reproduce and hard to debug.

> > I wonder if there's some way to implement the LED control in pciehp,
> > e.g., by enhancing pciehp_set_attention_status().  I assume the
> > desired indicator state is coming from some in-kernel storage driver,
> > and ledmon learns about that somehow, then fiddles with Slot Control
> > behind the kernel's back?  That looping from kernel to user and back
> > to kernel sounds a little dodgy and fragile.
> 
> That is definitely an interesting possibility if you are open to
> this. We'd "just" need the kernel to comprehend these vendor specific
> bit patterns for this particular generation of hardware.
> 
> If you're okay with that, then I'm more than happy to propose a patch
> for consideration. We can then have ledmon subscribe to the sysfs entry
> point instead of going through libpci, no problem.

I was imagining these LEDs as some sort of extension to the PCIe
hotplug model, but I think that was a mistake: logically, they have
nothing to do with hotplug, and the only reason they're currently
associated with hotplug is because you chose to re-use a bus (VPP)
that happens to be connected to the Slot Control registers.

>From an architectural point of view, these LEDs seem device-specific
or storage-specific, with no connection to PCIe at all, so I don't
know why we would put them in the PCIe spec or teach pciehp about
them.

> > Can you share any details about how you plan to implement this on the
> > next generation of devices?  Maybe we can enhance pciehp indicator
> > control in a way that supports both Sky Lake and future devices.
> 
> One possibility is to define through PCI-SIG
> SlotCap2 and SlotCtl2 with this kind of LED control.
> 
> Another possibilities I hear about through the NVMe-MI working group
> is creating similar type of devices as SES (SCSI Enclosure Services)
> for PCIe SSD enclosures.

> Since these are specific to VMD domains, there are folks at Intel who
> want to see the VMD driver mask off these capabilities from the driver.
> There is never a case where a slot in this domain should advertise these,
> so masking off the bits at the config read level would achieve the desired
> result. It'd be isolated to the VMD driver, and the quirk wouldn't need
> to be maintained as addtional vendor devices implementing the same quirk
> are made.

My biggest concern is the ownership problem, and this strategy doesn't
address that.

> ---
> diff --git a/arch/x86/pci/vmd.c b/arch/x86/pci/vmd.c
> index b73da50..aa2791f 100644
> --- a/arch/x86/pci/vmd.c
> +++ b/arch/x86/pci/vmd.c
> @@ -452,6 +452,7 @@ static int vmd_pci_read(struct pci_bus *bus,
> unsigned int devfn, int reg,
>  {
>  	struct vmd_dev *vmd = vmd_from_bus(bus);
>  	char __iomem *addr = vmd_cfg_addr(vmd, bus, devfn, reg, len);
> +	struct pci_dev *dev;
>  	unsigned long flags;
>  	int ret = 0;
>  
> @@ -474,6 +475,17 @@ static int vmd_pci_read(struct pci_bus *bus,
> unsigned int devfn, int reg,
>  		break;
>  	}
>  	spin_unlock_irqrestore(&vmd->cfg_lock, flags);
> +
> +	list_for_each_entry(dev, &bus->devices, bus_list) {
> +		if (dev->devfn == devfn) {
> +			if (dev->pcie_cap &&
> +					reg == dev->pcie_cap + PCI_EXP_SLTCAP)
> +				*value &= ~(PCI_EXP_SLTCAP_AIP |
> +					    PCI_EXP_SLTCAP_PIP);
> +			break;
> +		}
> +	}
> +
>  	return ret;
>  }
>  
> --

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/2] pci: Add ignore indicator quirk for devices
  2016-08-22 16:55               ` Bjorn Helgaas
@ 2016-08-22 21:15                 ` Keith Busch
  2016-08-23 13:39                   ` Bjorn Helgaas
  0 siblings, 1 reply; 23+ messages in thread
From: Keith Busch @ 2016-08-22 21:15 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: linux-pci, Bjorn Helgaas

On Mon, Aug 22, 2016 at 11:55:24AM -0500, Bjorn Helgaas wrote:
> I was imagining these LEDs as some sort of extension to the PCIe
> hotplug model, but I think that was a mistake: logically, they have
> nothing to do with hotplug, and the only reason they're currently
> associated with hotplug is because you chose to re-use a bus (VPP)
> that happens to be connected to the Slot Control registers.
> 
> From an architectural point of view, these LEDs seem device-specific
> or storage-specific, with no connection to PCIe at all, so I don't
> know why we would put them in the PCIe spec or teach pciehp about
> them.

It's not entirely for hotplug scenarios, but it does help with user pain
points locating devices they intend to hot remove.

I hear many vendors are for the concept of proposing new status
and location indicator definitions to PCIe. I don't think anyone is
suggesting the implementation requiring this patch be made standard;
this generation of hardware is just a non-standard implementation that
needs a quirk to help fill the gap.

Would it be more palatable if I modify the quirk such that when set,
pciehp provides a sysfs entry allowing arbitrary user defined Slot Control
commands? That removes the dangerous direct access from user space.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/2] pci: Add ignore indicator quirk for devices
  2016-08-22 21:15                 ` Keith Busch
@ 2016-08-23 13:39                   ` Bjorn Helgaas
  2016-08-23 17:10                     ` Keith Busch
  0 siblings, 1 reply; 23+ messages in thread
From: Bjorn Helgaas @ 2016-08-23 13:39 UTC (permalink / raw)
  To: Keith Busch; +Cc: linux-pci, Bjorn Helgaas

On Mon, Aug 22, 2016 at 05:15:36PM -0400, Keith Busch wrote:
> On Mon, Aug 22, 2016 at 11:55:24AM -0500, Bjorn Helgaas wrote:
> > I was imagining these LEDs as some sort of extension to the PCIe
> > hotplug model, but I think that was a mistake: logically, they have
> > nothing to do with hotplug, and the only reason they're currently
> > associated with hotplug is because you chose to re-use a bus (VPP)
> > that happens to be connected to the Slot Control registers.
> > 
> > From an architectural point of view, these LEDs seem device-specific
> > or storage-specific, with no connection to PCIe at all, so I don't
> > know why we would put them in the PCIe spec or teach pciehp about
> > them.
> 
> It's not entirely for hotplug scenarios, but it does help with user pain
> points locating devices they intend to hot remove.
> 
> I hear many vendors are for the concept of proposing new status
> and location indicator definitions to PCIe. I don't think anyone is
> suggesting the implementation requiring this patch be made standard;
> this generation of hardware is just a non-standard implementation that
> needs a quirk to help fill the gap.

I'm still not buying the hotplug connection.  The existing Attention
Indicator is already defined for the purpose of locating a device
(PCIe r3.0, sec 6.7.1.1.1):

  A blinking Attention Indicator indicates that system software is
  identifying this slot for a human operator to find. This behavior is
  controlled by a user (for example, from a software user interface or
  management tool).

The other IBPI states (fail, rebuild, predicted failure, hotspare,
degraded, failed) all seem like storage-related things without an
obvious connection to pciehp.  I would expect a kernel driver like md
to manage those states.

PCIe hotplug is designed with a 1:1:1 relationship between the
Downstream Port, the slot, and a replaceable device.  It's not obvious
to me how the IBPI signaling maps into that.  The first diagram on the
IBPI wikipedia page shows a single iPass cable (e.g., a single PCIe
slot connection) connected to an enclosure of four drives.  Each drive
has three LEDs, so we couldn't control all twelve of those LEDs using
the single Slot Control of the Downstream Port.

> Would it be more palatable if I modify the quirk such that when set,
> pciehp provides a sysfs entry allowing arbitrary user defined Slot Control
> commands? That removes the dangerous direct access from user space.

Maybe, I dunno.  So far, it still seems like an unrelated wart on pciehp.

Bjorn

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/2] pci: Add ignore indicator quirk for devices
  2016-08-23 13:39                   ` Bjorn Helgaas
@ 2016-08-23 17:10                     ` Keith Busch
  2016-08-23 17:14                       ` Sinan Kaya
  2016-08-23 20:02                       ` Bjorn Helgaas
  0 siblings, 2 replies; 23+ messages in thread
From: Keith Busch @ 2016-08-23 17:10 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: linux-pci, Bjorn Helgaas

On Tue, Aug 23, 2016 at 08:39:13AM -0500, Bjorn Helgaas wrote:
> On Mon, Aug 22, 2016 at 05:15:36PM -0400, Keith Busch wrote:
> 
> The other IBPI states (fail, rebuild, predicted failure, hotspare,
> degraded, failed) all seem like storage-related things without an
> obvious connection to pciehp.  I would expect a kernel driver like md
> to manage those states.

Yes, that's fairly accurate on how this works with other storage
transports. While md doesn't directly affect LED states, 'ledmon'
subscribes to md events and executes an appropriate LED control method
specific to that transport.

It just happens that this quirky hardware chose to re-purpose the Slot
Control commands for the PCIe transport specific method to get feature
parity with other storage fabrics.
 
> PCIe hotplug is designed with a 1:1:1 relationship between the
> Downstream Port, the slot, and a replaceable device.  It's not obvious
> to me how the IBPI signaling maps into that.  The first diagram on the
> IBPI wikipedia page shows a single iPass cable (e.g., a single PCIe
> slot connection) connected to an enclosure of four drives.  Each drive
> has three LEDs, so we couldn't control all twelve of those LEDs using
> the single Slot Control of the Downstream Port.

For a PCIe storage transport, each drive has its own slot, and the
relationship with the downstream port and its replaceable device is still
preserved; connections to each drive provide individual Slot Control.
 
> > Would it be more palatable if I modify the quirk such that when set,
> > pciehp provides a sysfs entry allowing arbitrary user defined Slot Control
> > commands? That removes the dangerous direct access from user space.
> 
> Maybe, I dunno.  So far, it still seems like an unrelated wart on pciehp.

It's superficially related to pciehp only because that's the only module
that touches Slot Control, and this particular hardware interprets
pciehp's control commands differently than the specification.

Since the hardware can't be changed, is there any guidance you can
recommend we follow to appropriately fence off pciehp from attention and
power indicator control? I initially attempted the least invasive method,
but I'm happy to explore other possibilities.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/2] pci: Add ignore indicator quirk for devices
  2016-08-23 17:10                     ` Keith Busch
@ 2016-08-23 17:14                       ` Sinan Kaya
  2016-08-23 19:23                         ` Keith Busch
  2016-08-23 20:02                       ` Bjorn Helgaas
  1 sibling, 1 reply; 23+ messages in thread
From: Sinan Kaya @ 2016-08-23 17:14 UTC (permalink / raw)
  To: Keith Busch, Bjorn Helgaas; +Cc: linux-pci, Bjorn Helgaas

On 8/23/2016 1:10 PM, Keith Busch wrote:
> It's superficially related to pciehp only because that's the only module
> that touches Slot Control, and this particular hardware interprets
> pciehp's control commands differently than the specification.
> 
> Since the hardware can't be changed, is there any guidance you can
> recommend we follow to appropriately fence off pciehp from attention and
> power indicator control? I initially attempted the least invasive method,
> but I'm happy to explore other possibilities.

Most other non-standard HW require an LED driver for hotplug in the kernel. 
IBM has one. It sound like you need another one.

static struct hotplug_slot_ops cpci_hotplug_slot_ops = {
	.enable_slot = enable_slot,
	.disable_slot = disable_slot,
	.set_attention_status = set_attention_status,
	.get_power_status = get_power_status,
	.get_attention_status = get_attention_status,
	.get_adapter_status = get_adapter_status,
	.get_latch_status = get_latch_status,
}; 

cpci_hotplug_core.c and ibmphp_core.c.

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/2] pci: Add ignore indicator quirk for devices
  2016-08-23 17:14                       ` Sinan Kaya
@ 2016-08-23 19:23                         ` Keith Busch
  2016-08-23 19:52                           ` Bjorn Helgaas
  0 siblings, 1 reply; 23+ messages in thread
From: Keith Busch @ 2016-08-23 19:23 UTC (permalink / raw)
  To: Sinan Kaya; +Cc: Bjorn Helgaas, linux-pci, Bjorn Helgaas

On Tue, Aug 23, 2016 at 01:14:03PM -0400, Sinan Kaya wrote:
> On 8/23/2016 1:10 PM, Keith Busch wrote:
> > It's superficially related to pciehp only because that's the only module
> > that touches Slot Control, and this particular hardware interprets
> > pciehp's control commands differently than the specification.
> > 
> > Since the hardware can't be changed, is there any guidance you can
> > recommend we follow to appropriately fence off pciehp from attention and
> > power indicator control? I initially attempted the least invasive method,
> > but I'm happy to explore other possibilities.
> 
> Most other non-standard HW require an LED driver for hotplug in the kernel. 
> IBM has one. It sound like you need another one.

We'd still need to fence off pciehp from LED control since it manipulates
these in response to hot plug events handled by that driver.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/2] pci: Add ignore indicator quirk for devices
  2016-08-23 19:23                         ` Keith Busch
@ 2016-08-23 19:52                           ` Bjorn Helgaas
  2016-08-23 20:44                             ` Keith Busch
  0 siblings, 1 reply; 23+ messages in thread
From: Bjorn Helgaas @ 2016-08-23 19:52 UTC (permalink / raw)
  To: Keith Busch; +Cc: Sinan Kaya, linux-pci, Bjorn Helgaas

On Tue, Aug 23, 2016 at 03:23:31PM -0400, Keith Busch wrote:
> On Tue, Aug 23, 2016 at 01:14:03PM -0400, Sinan Kaya wrote:
> > On 8/23/2016 1:10 PM, Keith Busch wrote:
> > > It's superficially related to pciehp only because that's the only module
> > > that touches Slot Control, and this particular hardware interprets
> > > pciehp's control commands differently than the specification.
> > > 
> > > Since the hardware can't be changed, is there any guidance you can
> > > recommend we follow to appropriately fence off pciehp from attention and
> > > power indicator control? I initially attempted the least invasive method,
> > > but I'm happy to explore other possibilities.
> > 
> > Most other non-standard HW require an LED driver for hotplug in the kernel. 
> > IBM has one. It sound like you need another one.
> 
> We'd still need to fence off pciehp from LED control since it manipulates
> these in response to hot plug events handled by that driver.

Everybody probably knows this, but the problem is not with LED control
per se.  The problem is multiple writers to the Slot Control register.
Each write to Slot Control is a "command", and software is responsible
for waiting for one command to complete before writing another (see
PCIe r3.0, sec 6.7.3.2).

It's no problem to make pciehp keep its mitts off the LEDs; the
problem is if ledmon writes Slot Control for the indicators, and
pciehp writes Slot Control for some other reason at about the same
time.

Sinan has a very interesting idea...  Maybe we can work with that.
Hmm, it looks like if we have an attention indicator, we already put
an "attention" file in sysfs (see attention_write_file()).  That
should eventually call into pciehp_set_attention_status(), where we
currently only support off/on/blink states.

What if you made a quirk that replaced pciehp_set_attention_status()
with a special version that supported the extra states you need and
made ledmon use that sysfs file instead of writing Slot Control
directly?  Would everything just work?

Bjorn

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/2] pci: Add ignore indicator quirk for devices
  2016-08-23 17:10                     ` Keith Busch
  2016-08-23 17:14                       ` Sinan Kaya
@ 2016-08-23 20:02                       ` Bjorn Helgaas
  2016-08-24 17:33                         ` Keith Busch
  1 sibling, 1 reply; 23+ messages in thread
From: Bjorn Helgaas @ 2016-08-23 20:02 UTC (permalink / raw)
  To: Keith Busch; +Cc: linux-pci, Bjorn Helgaas

On Tue, Aug 23, 2016 at 01:10:18PM -0400, Keith Busch wrote:
> On Tue, Aug 23, 2016 at 08:39:13AM -0500, Bjorn Helgaas wrote:
> > On Mon, Aug 22, 2016 at 05:15:36PM -0400, Keith Busch wrote:
> > 
> > The other IBPI states (fail, rebuild, predicted failure, hotspare,
> > degraded, failed) all seem like storage-related things without an
> > obvious connection to pciehp.  I would expect a kernel driver like md
> > to manage those states.
> 
> Yes, that's fairly accurate on how this works with other storage
> transports. While md doesn't directly affect LED states, 'ledmon'
> subscribes to md events and executes an appropriate LED control method
> specific to that transport.

This is a tangent, but is there some reason for putting ledmon in the
middle of this scheme?  It seems like the important pieces (md and LED
control) are intimately related and both in the kernel, so why the
trip through userland?  ledmon sounds like a glorified special-case
/dev/mem, with similar security and stability issues.  Plus it must
have to associate a storage device with a PCIe port, grub through its
config space to find the Slot Control register, deal with possible
hotplug, etc.  It all sounds icky.

Bjorn

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/2] pci: Add ignore indicator quirk for devices
  2016-08-23 19:52                           ` Bjorn Helgaas
@ 2016-08-23 20:44                             ` Keith Busch
  0 siblings, 0 replies; 23+ messages in thread
From: Keith Busch @ 2016-08-23 20:44 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Sinan Kaya, linux-pci, Bjorn Helgaas

On Tue, Aug 23, 2016 at 02:52:21PM -0500, Bjorn Helgaas wrote:
> Everybody probably knows this, but the problem is not with LED control
> per se.  The problem is multiple writers to the Slot Control register.
> Each write to Slot Control is a "command", and software is responsible
> for waiting for one command to complete before writing another (see
> PCIe r3.0, sec 6.7.3.2).
> 
> It's no problem to make pciehp keep its mitts off the LEDs; the
> problem is if ledmon writes Slot Control for the indicators, and
> pciehp writes Slot Control for some other reason at about the same
> time.
> 
> Sinan has a very interesting idea...  Maybe we can work with that.
> Hmm, it looks like if we have an attention indicator, we already put
> an "attention" file in sysfs (see attention_write_file()).  That
> should eventually call into pciehp_set_attention_status(), where we
> currently only support off/on/blink states.
> 
> What if you made a quirk that replaced pciehp_set_attention_status()
> with a special version that supported the extra states you need and
> made ledmon use that sysfs file instead of writing Slot Control
> directly?  Would everything just work?

I'm on board with this idea. I'll get some testing started and send a
new version if/when successful.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/2] pci: Add ignore indicator quirk for devices
  2016-08-23 20:02                       ` Bjorn Helgaas
@ 2016-08-24 17:33                         ` Keith Busch
  0 siblings, 0 replies; 23+ messages in thread
From: Keith Busch @ 2016-08-24 17:33 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: linux-pci, Bjorn Helgaas

On Tue, Aug 23, 2016 at 03:02:32PM -0500, Bjorn Helgaas wrote:
> On Tue, Aug 23, 2016 at 01:10:18PM -0400, Keith Busch wrote:
> > Yes, that's fairly accurate on how this works with other storage
> > transports. While md doesn't directly affect LED states, 'ledmon'
> > subscribes to md events and executes an appropriate LED control method
> > specific to that transport.
> 
> This is a tangent, but is there some reason for putting ledmon in the
> middle of this scheme?  It seems like the important pieces (md and LED
> control) are intimately related and both in the kernel, so why the
> trip through userland?  ledmon sounds like a glorified special-case
> /dev/mem, with similar security and stability issues.  Plus it must
> have to associate a storage device with a PCIe port, grub through its
> config space to find the Slot Control register, deal with possible
> hotplug, etc.  It all sounds icky.

DM knows to report status on groups and members, but as far as I know,
there's no in-kernel API for setting storage status LEDs. If we want to
put it in the kernel, I think it'd be a longer term project, but could
be interesting to try if this is a desirable feature.

There's more than one way to set status (SGPIO, SES, IPMI), and
'ledmon' already knows how to figure out which method to use for many
storage devices. We will just need to teach it whatever method we expose
for PCIe attached drives.

The direct SlotCtl config write method had no new kernel dependencies,
so seemed desirable, but you raised a good point that it potentially
breaks the slot control command sequence.

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2016-08-24 17:33 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-08 20:19 [PATCH 1/2] pci: add option to ignore slot capabilities Keith Busch
2016-08-08 20:19 ` [PATCH 2/2] pci: Add ignore indicator quirk for devices Keith Busch
2016-08-15 17:40   ` Bjorn Helgaas
2016-08-15 19:23     ` Keith Busch
2016-08-15 19:50       ` Bjorn Helgaas
2016-08-15 22:35         ` Keith Busch
2016-08-16  3:03           ` Bjorn Helgaas
2016-08-17 21:37       ` Bjorn Helgaas
2016-08-17 23:09         ` Keith Busch
2016-08-18 19:56           ` Bjorn Helgaas
2016-08-18 22:46             ` Keith Busch
2016-08-22 16:55               ` Bjorn Helgaas
2016-08-22 21:15                 ` Keith Busch
2016-08-23 13:39                   ` Bjorn Helgaas
2016-08-23 17:10                     ` Keith Busch
2016-08-23 17:14                       ` Sinan Kaya
2016-08-23 19:23                         ` Keith Busch
2016-08-23 19:52                           ` Bjorn Helgaas
2016-08-23 20:44                             ` Keith Busch
2016-08-23 20:02                       ` Bjorn Helgaas
2016-08-24 17:33                         ` Keith Busch
2016-08-13  0:03 ` [PATCH 1/2] pci: add option to ignore slot capabilities Sean O. Stalley
2016-08-13  0:58   ` Keith Busch

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.