All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCHv2 0/4] PCI VPD access fixes
@ 2016-01-13 11:25 Hannes Reinecke
  2016-01-13 11:25 ` [PATCHv2 1/4] pci: Update VPD definitions Hannes Reinecke
                   ` (6 more replies)
  0 siblings, 7 replies; 37+ messages in thread
From: Hannes Reinecke @ 2016-01-13 11:25 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Alexander Duyck, linux-pci, linux-kernel, Babu Moger, Hannes Reinecke

Hi all,

the current PCI VPD page access assumes that the entire possible VPD
data is readable. However, the spec only guarantees a VPD data up to
the 'end' marker, with everything beyond that being undefined.
This causes a system lockup on certain devices.

With this patch we always set the VPD sysfs attribute size to '0', and
calculate the available VPD size on the first access.
If no valid data can be read an I/O error is returned.

I've also included the patch from Babu to blacklists devices which
are known to lockup when accessing the VPD data.

Babu Moger (1):
  pci: Blacklist vpd access for buggy devices

Hannes Reinecke (3):
  pci: Update VPD definitions
  pci: allow access to VPD attributes with size '0'
  pci: Determine actual VPD size on first access

 drivers/pci/access.c    | 79 ++++++++++++++++++++++++++++++++++++++++++++++++-
 drivers/pci/pci-sysfs.c | 22 ++++++++------
 drivers/pci/quirks.c    | 41 +++++++++++++++++++++++++
 include/linux/pci.h     | 27 +++++++++++++++--
 4 files changed, 157 insertions(+), 12 deletions(-)

-- 
1.8.5.6

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCHv2 1/4] pci: Update VPD definitions
  2016-01-13 11:25 [PATCHv2 0/4] PCI VPD access fixes Hannes Reinecke
@ 2016-01-13 11:25 ` Hannes Reinecke
  2016-01-13 11:25 ` [PATCHv2 2/4] pci: allow access to VPD attributes with size '0' Hannes Reinecke
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 37+ messages in thread
From: Hannes Reinecke @ 2016-01-13 11:25 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Alexander Duyck, linux-pci, linux-kernel, Babu Moger,
	Hannes Reinecke, Bjorn Helgaas, Hannes Reinecke

The 'end' tag is actually 0x0f, it's the representation as a
small resource data type tag that's 0x78 (ie shifted by 3).
This patch also adds helper functions to extract the resource
data type tags for both large and small resource data types.

Cc: Alexander Duyck <alexander.duyck@gmail.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Hannes Reinecke <hare@suse.com>
---
 include/linux/pci.h | 27 +++++++++++++++++++++++++--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/include/linux/pci.h b/include/linux/pci.h
index d86378c..5d61c36 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1836,12 +1836,13 @@ bool pci_acs_path_enabled(struct pci_dev *start,
 #define PCI_VPD_LRDT_RW_DATA		PCI_VPD_LRDT_ID(PCI_VPD_LTIN_RW_DATA)
 
 /* Small Resource Data Type Tag Item Names */
-#define PCI_VPD_STIN_END		0x78	/* End */
+#define PCI_VPD_STIN_END		0x0f	/* End */
 
-#define PCI_VPD_SRDT_END		PCI_VPD_STIN_END
+#define PCI_VPD_SRDT_END		(PCI_VPD_STIN_END << 3)
 
 #define PCI_VPD_SRDT_TIN_MASK		0x78
 #define PCI_VPD_SRDT_LEN_MASK		0x07
+#define PCI_VPD_LRDT_TIN_MASK		0x7f
 
 #define PCI_VPD_LRDT_TAG_SIZE		3
 #define PCI_VPD_SRDT_TAG_SIZE		1
@@ -1865,6 +1866,17 @@ static inline u16 pci_vpd_lrdt_size(const u8 *lrdt)
 }
 
 /**
+ * pci_vpd_lrdt_tag - Extracts the Large Resource Data Type Tag Item
+ * @lrdt: Pointer to the beginning of the Large Resource Data Type tag
+ *
+ * Returns the extracted Large Resource Data Type Tag item.
+ */
+static inline u16 pci_vpd_lrdt_tag(const u8 *lrdt)
+{
+    return (u16)(lrdt[0] & PCI_VPD_LRDT_TIN_MASK);
+}
+
+/**
  * pci_vpd_srdt_size - Extracts the Small Resource Data Type length
  * @lrdt: Pointer to the beginning of the Small Resource Data Type tag
  *
@@ -1876,6 +1888,17 @@ static inline u8 pci_vpd_srdt_size(const u8 *srdt)
 }
 
 /**
+ * pci_vpd_srdt_tag - Extracts the Small Resource Data Type Tag Item
+ * @lrdt: Pointer to the beginning of the Small Resource Data Type tag
+ *
+ * Returns the extracted Small Resource Data Type Tag Item.
+ */
+static inline u8 pci_vpd_srdt_tag(const u8 *srdt)
+{
+	return ((*srdt) & PCI_VPD_SRDT_TIN_MASK) >> 3;
+}
+
+/**
  * pci_vpd_info_field_size - Extracts the information field length
  * @lrdt: Pointer to the beginning of an information field header
  *
-- 
1.8.5.6

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCHv2 2/4] pci: allow access to VPD attributes with size '0'
  2016-01-13 11:25 [PATCHv2 0/4] PCI VPD access fixes Hannes Reinecke
  2016-01-13 11:25 ` [PATCHv2 1/4] pci: Update VPD definitions Hannes Reinecke
@ 2016-01-13 11:25 ` Hannes Reinecke
  2016-02-09 20:53   ` Bjorn Helgaas
  2016-01-13 11:25 ` [PATCHv2 3/4] pci: Determine actual VPD size on first access Hannes Reinecke
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 37+ messages in thread
From: Hannes Reinecke @ 2016-01-13 11:25 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Alexander Duyck, linux-pci, linux-kernel, Babu Moger, Hannes Reinecke

It is not always possible to determine the actual size of the VPD
data, so allow access to them if the size is set to '0'.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/pci/pci-sysfs.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index eead54c..de327c3 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -772,10 +772,12 @@ static ssize_t read_vpd_attr(struct file *filp, struct kobject *kobj,
 	struct pci_dev *dev =
 		to_pci_dev(container_of(kobj, struct device, kobj));
 
-	if (off > bin_attr->size)
-		count = 0;
-	else if (count > bin_attr->size - off)
-		count = bin_attr->size - off;
+	if (bin_attr->size > 0) {
+		if (off > bin_attr->size)
+			count = 0;
+		else if (count > bin_attr->size - off)
+			count = bin_attr->size - off;
+	}
 
 	return pci_read_vpd(dev, off, count, buf);
 }
@@ -787,10 +789,12 @@ static ssize_t write_vpd_attr(struct file *filp, struct kobject *kobj,
 	struct pci_dev *dev =
 		to_pci_dev(container_of(kobj, struct device, kobj));
 
-	if (off > bin_attr->size)
-		count = 0;
-	else if (count > bin_attr->size - off)
-		count = bin_attr->size - off;
+	if (bin_attr->size > 0) {
+		if (off > bin_attr->size)
+			count = 0;
+		else if (count > bin_attr->size - off)
+			count = bin_attr->size - off;
+	}
 
 	return pci_write_vpd(dev, off, count, buf);
 }
-- 
1.8.5.6

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCHv2 3/4] pci: Determine actual VPD size on first access
  2016-01-13 11:25 [PATCHv2 0/4] PCI VPD access fixes Hannes Reinecke
  2016-01-13 11:25 ` [PATCHv2 1/4] pci: Update VPD definitions Hannes Reinecke
  2016-01-13 11:25 ` [PATCHv2 2/4] pci: allow access to VPD attributes with size '0' Hannes Reinecke
@ 2016-01-13 11:25 ` Hannes Reinecke
  2016-02-09 21:04   ` Bjorn Helgaas
  2016-01-13 11:25 ` [PATCHv2 4/4] pci: Blacklist vpd access for buggy devices Hannes Reinecke
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 37+ messages in thread
From: Hannes Reinecke @ 2016-01-13 11:25 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Alexander Duyck, linux-pci, linux-kernel, Babu Moger, Hannes Reinecke

PCI-2.2 VPD entries have a maximum size of 32k, but might actually
be smaller than that. To figure out the actual size one has to read
the VPD area until the 'end marker' is reached.
Trying to read VPD data beyond that marker results in 'interesting'
effects, from simple read errors to crashing the card. And to make
matters worse not every PCI card implements this properly, leaving
us with no 'end' marker or even completely invalid data.
This path tries to determine the size of the VPD data.
If no valid data can be read an I/O error will be returned when
reading the sysfs attribute.

Cc: Alexander Duyck <alexander.duyck@gmail.com>
Cc: Bjorn Helgaas <helgaas@kernel.org>
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/pci/access.c    | 76 ++++++++++++++++++++++++++++++++++++++++++++++++-
 drivers/pci/pci-sysfs.c |  2 +-
 2 files changed, 76 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/access.c b/drivers/pci/access.c
index 59ac36f..914e023 100644
--- a/drivers/pci/access.c
+++ b/drivers/pci/access.c
@@ -284,9 +284,65 @@ struct pci_vpd_pci22 {
 	struct mutex lock;
 	u16	flag;
 	bool	busy;
+	bool	valid;
 	u8	cap;
 };
 
+/**
+ * pci_vpd_size - determine actual size of Vital Product Data
+ * @dev:	pci device struct
+ * @old_size:	current assumed size, also maximum allowed size
+ *
+ */
+static size_t
+pci_vpd_pci22_size(struct pci_dev *dev)
+{
+	size_t off = 0;
+	unsigned char header[1+2];	/* 1 byte tag, 2 bytes length */
+
+	while (off < PCI_VPD_PCI22_SIZE &&
+	       pci_read_vpd(dev, off, 1, header) == 1) {
+		unsigned char tag;
+
+		if (header[0] & PCI_VPD_LRDT) {
+			/* Large Resource Data Type Tag */
+			tag = pci_vpd_lrdt_tag(header);
+			/* Only read length from known tag items */
+			if ((tag == PCI_VPD_LTIN_ID_STRING) ||
+			    (tag == PCI_VPD_LTIN_RO_DATA) ||
+			    (tag == PCI_VPD_LTIN_RW_DATA)) {
+				if (pci_read_vpd(dev, off+1, 2,
+						 &header[1]) != 2) {
+					dev_dbg(&dev->dev,
+						"invalid large vpd tag %02x "
+						"size at offset %zu",
+						tag, off + 1);
+					break;
+				}
+				off += PCI_VPD_LRDT_TAG_SIZE +
+					pci_vpd_lrdt_size(header);
+			}
+		} else {
+			/* Short Resource Data Type Tag */
+			off += PCI_VPD_SRDT_TAG_SIZE +
+				pci_vpd_srdt_size(header);
+			tag = pci_vpd_srdt_tag(header);
+		}
+		if (tag == PCI_VPD_STIN_END)	/* End tag descriptor */
+			return off;
+		if ((tag != PCI_VPD_LTIN_ID_STRING) &&
+		    (tag != PCI_VPD_LTIN_RO_DATA) &&
+		    (tag != PCI_VPD_LTIN_RW_DATA)) {
+			dev_dbg(&dev->dev,
+				"invalid %s vpd tag %02x at offset %zu",
+				(header[0] & PCI_VPD_LRDT) ? "large" : "short",
+				tag, off);
+			break;
+		}
+	}
+	return 0;
+}
+
 /*
  * Wait for last operation to complete.
  * This code has to spin since there is no other notification from the PCI
@@ -337,9 +393,23 @@ static ssize_t pci_vpd_pci22_read(struct pci_dev *dev, loff_t pos, size_t count,
 	loff_t end = pos + count;
 	u8 *buf = arg;
 
-	if (pos < 0 || pos > vpd->base.len || end > vpd->base.len)
+	if (pos < 0)
 		return -EINVAL;
 
+	if (!vpd->valid) {
+		vpd->valid = true;
+		vpd->base.len = pci_vpd_pci22_size(dev);
+	}
+	if (vpd->base.len == 0)
+		return -EIO;
+
+	if (end > vpd->base.len) {
+		if (pos > vpd->base.len)
+			return 0;
+		end = vpd->base.len;
+		count = end - pos;
+	}
+
 	if (mutex_lock_killable(&vpd->lock))
 		return -EINTR;
 
@@ -389,6 +459,9 @@ static ssize_t pci_vpd_pci22_write(struct pci_dev *dev, loff_t pos, size_t count
 	loff_t end = pos + count;
 	int ret = 0;
 
+	if (!vpd->valid)
+		return -EAGAIN;
+
 	if (pos < 0 || (pos & 3) || (count & 3) || end > vpd->base.len)
 		return -EINVAL;
 
@@ -496,6 +569,7 @@ int pci_vpd_pci22_init(struct pci_dev *dev)
 	mutex_init(&vpd->lock);
 	vpd->cap = cap;
 	vpd->busy = false;
+	vpd->valid = false;
 	dev->vpd = &vpd->base;
 	return 0;
 }
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index de327c3..31a1f35 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -1333,7 +1333,7 @@ static int pci_create_capabilities_sysfs(struct pci_dev *dev)
 			return -ENOMEM;
 
 		sysfs_bin_attr_init(attr);
-		attr->size = dev->vpd->len;
+		attr->size = 0;
 		attr->attr.name = "vpd";
 		attr->attr.mode = S_IRUSR | S_IWUSR;
 		attr->read = read_vpd_attr;
-- 
1.8.5.6

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCHv2 4/4] pci: Blacklist vpd access for buggy devices
  2016-01-13 11:25 [PATCHv2 0/4] PCI VPD access fixes Hannes Reinecke
                   ` (2 preceding siblings ...)
  2016-01-13 11:25 ` [PATCHv2 3/4] pci: Determine actual VPD size on first access Hannes Reinecke
@ 2016-01-13 11:25 ` Hannes Reinecke
  2016-01-19 20:57   ` [PATCH v3 " Babu Moger
  2016-02-09 21:07   ` [PATCHv2 " Bjorn Helgaas
  2016-01-15  1:07 ` [PATCHv2 0/4] PCI VPD access fixes Seymour, Shane M
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 37+ messages in thread
From: Hannes Reinecke @ 2016-01-13 11:25 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Alexander Duyck, linux-pci, linux-kernel, Babu Moger, Hannes Reinecke

From: Babu Moger <babu.moger@oracle.com>

Reading or Writing of PCI VPD data causes system panic.
We saw this problem by running "lspci -vvv" in the beginning.
However this can be easily reproduced by running
 cat /sys/bus/devices/XX../vpd

As even a simple read on any VPD data triggers a system
lockup on certain cards this patch implements a PCI quirk
to disabling VPD acces altogether by setting the vpd length
to '0'.

Signed-off-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/pci/access.c |  5 ++++-
 drivers/pci/quirks.c | 41 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/access.c b/drivers/pci/access.c
index 914e023..82f41a8 100644
--- a/drivers/pci/access.c
+++ b/drivers/pci/access.c
@@ -396,7 +396,7 @@ static ssize_t pci_vpd_pci22_read(struct pci_dev *dev, loff_t pos, size_t count,
 	if (pos < 0)
 		return -EINVAL;
 
-	if (!vpd->valid) {
+	if (!vpd->valid && vpd->base.len > 0) {
 		vpd->valid = true;
 		vpd->base.len = pci_vpd_pci22_size(dev);
 	}
@@ -459,6 +459,9 @@ static ssize_t pci_vpd_pci22_write(struct pci_dev *dev, loff_t pos, size_t count
 	loff_t end = pos + count;
 	int ret = 0;
 
+	if (vpd->base.len == 0)
+		return -EIO;
+
 	if (!vpd->valid)
 		return -EAGAIN;
 
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 7e32730..af0f8a1 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -2123,6 +2123,47 @@ static void quirk_via_cx700_pci_parking_caching(struct pci_dev *dev)
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_VIA, 0x324e, quirk_via_cx700_pci_parking_caching);
 
 /*
+ * A read/write to sysfs entry ('/sys/bus/pci/devices/<id>/vpd')
+ * will dump 32k of data. The default length is set as 32768.
+ * Reading a full 32k will cause an access beyond the VPD end tag.
+ * The system behaviour at that point is mostly unpredictable.
+ * Apparently, some vendors have not implemented this VPD headers properly.
+ * Adding a generic function disable vpd data for these buggy adapters
+ * Add the DECLARE_PCI_FIXUP_FINAL line below with the specific with
+ * vendor and device of interest to use this quirk.
+ */
+static void quirk_blacklist_vpd(struct pci_dev *dev)
+{
+	if (dev->vpd) {
+		dev->vpd->len = 0;
+		dev_warn(&dev->dev, "PCI vpd access has been disabled due to firmware bug\n");
+	}
+}
+
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0060,
+		quirk_blacklist_vpd);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x007c,
+		quirk_blacklist_vpd);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0413,
+		quirk_blacklist_vpd);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0078,
+		quirk_blacklist_vpd);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0079,
+		quirk_blacklist_vpd);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0073,
+		quirk_blacklist_vpd);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0071,
+		quirk_blacklist_vpd);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x005b,
+		quirk_blacklist_vpd);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x002f,
+		quirk_blacklist_vpd);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x005d,
+		quirk_blacklist_vpd);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x005f,
+		quirk_blacklist_vpd);
+
+/*
  * For Broadcom 5706, 5708, 5709 rev. A nics, any read beyond the
  * VPD end tag will hang the device.  This problem was initially
  * observed when a vpd entry was created in sysfs
-- 
1.8.5.6

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* RE: [PATCHv2 0/4] PCI VPD access fixes
  2016-01-13 11:25 [PATCHv2 0/4] PCI VPD access fixes Hannes Reinecke
                   ` (3 preceding siblings ...)
  2016-01-13 11:25 ` [PATCHv2 4/4] pci: Blacklist vpd access for buggy devices Hannes Reinecke
@ 2016-01-15  1:07 ` Seymour, Shane M
  2016-01-15 14:10   ` Babu Moger
  2016-01-19 20:53 ` Babu Moger
  2016-01-21 18:34 ` [PATCH v4 4/4] pci: Blacklist vpd access for buggy devices Babu Moger
  6 siblings, 1 reply; 37+ messages in thread
From: Seymour, Shane M @ 2016-01-15  1:07 UTC (permalink / raw)
  To: Hannes Reinecke, Bjorn Helgaas
  Cc: Alexander Duyck, linux-pci, linux-kernel, Babu Moger

For the series. Tested with AE311 PCI-Express 4Gb Fibre Channel HBA.
Truncation of the vpd data returned works (154 bytes) and lspci -vvv
prints out the VPD tags the same way with and without the changes.
This card did not require any quirks it was capable of returning 32k
of data (although it repeated every 4k).
---
Tested-by: Shane Seymour <shane.seymour@hpe.com>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCHv2 0/4] PCI VPD access fixes
  2016-01-15  1:07 ` [PATCHv2 0/4] PCI VPD access fixes Seymour, Shane M
@ 2016-01-15 14:10   ` Babu Moger
  2016-01-15 14:18     ` Hannes Reinecke
  0 siblings, 1 reply; 37+ messages in thread
From: Babu Moger @ 2016-01-15 14:10 UTC (permalink / raw)
  To: Seymour, Shane M, Hannes Reinecke, Bjorn Helgaas
  Cc: Alexander Duyck, linux-pci, linux-kernel

Shane, Thank You very much..

On 1/14/2016 7:07 PM, Seymour, Shane M wrote:
> For the series. Tested with AE311 PCI-Express 4Gb Fibre Channel HBA.
> Truncation of the vpd data returned works (154 bytes) and lspci -vvv
> prints out the VPD tags the same way with and without the changes.
> This card did not require any quirks it was capable of returning 32k
> of data (although it repeated every 4k).
> ---
> Tested-by: Shane Seymour <shane.seymour@hpe.com>
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCHv2 0/4] PCI VPD access fixes
  2016-01-15 14:10   ` Babu Moger
@ 2016-01-15 14:18     ` Hannes Reinecke
  0 siblings, 0 replies; 37+ messages in thread
From: Hannes Reinecke @ 2016-01-15 14:18 UTC (permalink / raw)
  To: Babu Moger, Seymour, Shane M, Bjorn Helgaas
  Cc: Alexander Duyck, linux-pci, linux-kernel

On 01/15/2016 03:10 PM, Babu Moger wrote:
> Shane, Thank You very much..
>
> On 1/14/2016 7:07 PM, Seymour, Shane M wrote:
>> For the series. Tested with AE311 PCI-Express 4Gb Fibre Channel HBA.
>> Truncation of the vpd data returned works (154 bytes) and lspci -vvv
>> prints out the VPD tags the same way with and without the changes.
>> This card did not require any quirks it was capable of returning 32k
>> of data (although it repeated every 4k).
>> ---
>> Tested-by: Shane Seymour <shane.seymour@hpe.com>
>>
So, Alexander, are you happy with this series?
It looks as if the technical issues are resolved now ...

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCHv2 0/4] PCI VPD access fixes
  2016-01-13 11:25 [PATCHv2 0/4] PCI VPD access fixes Hannes Reinecke
                   ` (4 preceding siblings ...)
  2016-01-15  1:07 ` [PATCHv2 0/4] PCI VPD access fixes Seymour, Shane M
@ 2016-01-19 20:53 ` Babu Moger
  2016-01-21 18:34 ` [PATCH v4 4/4] pci: Blacklist vpd access for buggy devices Babu Moger
  6 siblings, 0 replies; 37+ messages in thread
From: Babu Moger @ 2016-01-19 20:53 UTC (permalink / raw)
  To: Hannes Reinecke, Bjorn Helgaas
  Cc: Alexander Duyck, linux-pci, linux-kernel, Jordan Hargrave

Hi,

On 1/13/2016 5:25 AM, Hannes Reinecke wrote:
> Hi all,
> 
> the current PCI VPD page access assumes that the entire possible VPD
> data is readable. However, the spec only guarantees a VPD data up to
> the 'end' marker, with everything beyond that being undefined.
> This causes a system lockup on certain devices.
> 
> With this patch we always set the VPD sysfs attribute size to '0', and
> calculate the available VPD size on the first access.
> If no valid data can be read an I/O error is returned.
> 
> I've also included the patch from Babu to blacklists devices which
> are known to lockup when accessing the VPD data.
> 
> Babu Moger (1):
>   pci: Blacklist vpd access for buggy devices
> 
> Hannes Reinecke (3):
>   pci: Update VPD definitions
>   pci: allow access to VPD attributes with size '0'
>   pci: Determine actual VPD size on first access
> 
>  drivers/pci/access.c    | 79 ++++++++++++++++++++++++++++++++++++++++++++++++-
>  drivers/pci/pci-sysfs.c | 22 ++++++++------
>  drivers/pci/quirks.c    | 41 +++++++++++++++++++++++++
>  include/linux/pci.h     | 27 +++++++++++++++--
>  4 files changed, 157 insertions(+), 12 deletions(-)
> 

Resending the patch 4/4. Added Atheros controller(0x1969:0x1026) in blacklist.
Jordan confirmed the Vendor and Device id(0x1969:0x1026). Here is the device.
09:00.0 Ethernet controller: Atheros Communications AR8121/AR8113/AR8114 Gigabit or Fast Ethernet (rev b0)

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v3 4/4] pci: Blacklist vpd access for buggy devices
  2016-01-13 11:25 ` [PATCHv2 4/4] pci: Blacklist vpd access for buggy devices Hannes Reinecke
@ 2016-01-19 20:57   ` Babu Moger
  2016-02-09 21:07   ` [PATCHv2 " Bjorn Helgaas
  1 sibling, 0 replies; 37+ messages in thread
From: Babu Moger @ 2016-01-19 20:57 UTC (permalink / raw)
  To: bhelgaas; +Cc: linux-pci, hare, alexander.duyck, linux-kernel, Jordan_Hargrave

Reading or Writing of PCI VPD data causes system panic.
We saw this problem by running "lspci -vvv" in the beginning.
However this can be easily reproduced by running
 cat /sys/bus/devices/XX../vpd

As even a simple read on any VPD data triggers a system
lockup on certain cards this patch implements a PCI quirk
to disabling VPD acces altogether by setting the vpd length
to '0'.

Signed-off-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jordan Hargrave <Jordan_Hargrave@dell.com>
---
 drivers/pci/access.c |    5 ++++-
 drivers/pci/quirks.c |   43 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 47 insertions(+), 1 deletions(-)

diff --git a/drivers/pci/access.c b/drivers/pci/access.c
index 914e023..82f41a8 100644
--- a/drivers/pci/access.c
+++ b/drivers/pci/access.c
@@ -396,7 +396,7 @@ static ssize_t pci_vpd_pci22_read(struct pci_dev *dev, loff_t pos, size_t count,
 	if (pos < 0)
 		return -EINVAL;
 
-	if (!vpd->valid) {
+	if (!vpd->valid && vpd->base.len > 0) {
 		vpd->valid = true;
 		vpd->base.len = pci_vpd_pci22_size(dev);
 	}
@@ -459,6 +459,9 @@ static ssize_t pci_vpd_pci22_write(struct pci_dev *dev, loff_t pos, size_t count
 	loff_t end = pos + count;
 	int ret = 0;
 
+	if (vpd->base.len == 0)
+		return -EIO;
+
 	if (!vpd->valid)
 		return -EAGAIN;
 
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index b03373f..ed09258 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -2123,6 +2123,49 @@ static void quirk_via_cx700_pci_parking_caching(struct pci_dev *dev)
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_VIA, 0x324e, quirk_via_cx700_pci_parking_caching);
 
 /*
+ * A read/write to sysfs entry ('/sys/bus/pci/devices/<id>/vpd')
+ * will dump 32k of data. The default length is set as 32768.
+ * Reading a full 32k will cause an access beyond the VPD end tag.
+ * The system behaviour at that point is mostly unpredictable.
+ * Apparently, some vendors have not implemented this VPD headers properly.
+ * Adding a generic function disable vpd data for these buggy adapters
+ * Add the DECLARE_PCI_FIXUP_FINAL line below with the specific with
+ * vendor and device of interest to use this quirk.
+ */
+static void quirk_blacklist_vpd(struct pci_dev *dev)
+{
+	if (dev->vpd) {
+		dev->vpd->len = 0;
+		dev_warn(&dev->dev, "PCI vpd access has been disabled due to firmware bug\n");
+	}
+}
+
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0060,
+		quirk_blacklist_vpd);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x007c,
+		quirk_blacklist_vpd);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0413,
+		quirk_blacklist_vpd);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0078,
+		quirk_blacklist_vpd);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0079,
+		quirk_blacklist_vpd);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0073,
+		quirk_blacklist_vpd);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0071,
+		quirk_blacklist_vpd);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x005b,
+		quirk_blacklist_vpd);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x002f,
+		quirk_blacklist_vpd);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x005d,
+		quirk_blacklist_vpd);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x005f,
+		quirk_blacklist_vpd);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATTANSIC, 0x1026,
+		quirk_blacklist_vpd);
+
+/*
  * For Broadcom 5706, 5708, 5709 rev. A nics, any read beyond the
  * VPD end tag will hang the device.  This problem was initially
  * observed when a vpd entry was created in sysfs
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v4 4/4] pci: Blacklist vpd access for buggy devices
  2016-01-13 11:25 [PATCHv2 0/4] PCI VPD access fixes Hannes Reinecke
                   ` (5 preceding siblings ...)
  2016-01-19 20:53 ` Babu Moger
@ 2016-01-21 18:34 ` Babu Moger
  6 siblings, 0 replies; 37+ messages in thread
From: Babu Moger @ 2016-01-21 18:34 UTC (permalink / raw)
  To: bhelgaas; +Cc: linux-pci, hare, alexander.duyck, linux-kernel, Jordan_Hargrave

Reading or Writing of PCI VPD data causes system panic.
We saw this problem by running "lspci -vvv" in the beginning.
However this can be easily reproduced by running
 cat /sys/bus/devices/XX../vpd

As even a simple read on any VPD data triggers a system
lockup on certain cards this patch implements a PCI quirk
to disabling VPD acces altogether by setting the vpd length
to '0'.

Added all the PCI_VENDOR_ID_ATTANSIC varients.

Signed-off-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jordan Hargrave <Jordan_Hargrave@dell.com>
---
 drivers/pci/access.c |    5 ++++-
 drivers/pci/quirks.c |   43 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 47 insertions(+), 1 deletions(-)

diff --git a/drivers/pci/access.c b/drivers/pci/access.c
index 914e023..82f41a8 100644
--- a/drivers/pci/access.c
+++ b/drivers/pci/access.c
@@ -396,7 +396,7 @@ static ssize_t pci_vpd_pci22_read(struct pci_dev *dev, loff_t pos, size_t count,
 	if (pos < 0)
 		return -EINVAL;
 
-	if (!vpd->valid) {
+	if (!vpd->valid && vpd->base.len > 0) {
 		vpd->valid = true;
 		vpd->base.len = pci_vpd_pci22_size(dev);
 	}
@@ -459,6 +459,9 @@ static ssize_t pci_vpd_pci22_write(struct pci_dev *dev, loff_t pos, size_t count
 	loff_t end = pos + count;
 	int ret = 0;
 
+	if (vpd->base.len == 0)
+		return -EIO;
+
 	if (!vpd->valid)
 		return -EAGAIN;
 
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index b03373f..f0007e9 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -2123,6 +2123,49 @@ static void quirk_via_cx700_pci_parking_caching(struct pci_dev *dev)
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_VIA, 0x324e, quirk_via_cx700_pci_parking_caching);
 
 /*
+ * A read/write to sysfs entry ('/sys/bus/pci/devices/<id>/vpd')
+ * will dump 32k of data. The default length is set as 32768.
+ * Reading a full 32k will cause an access beyond the VPD end tag.
+ * The system behaviour at that point is mostly unpredictable.
+ * Apparently, some vendors have not implemented this VPD headers properly.
+ * Adding a generic function disable vpd data for these buggy adapters
+ * Add the DECLARE_PCI_FIXUP_FINAL line below with the specific with
+ * vendor and device of interest to use this quirk.
+ */
+static void quirk_blacklist_vpd(struct pci_dev *dev)
+{
+	if (dev->vpd) {
+		dev->vpd->len = 0;
+		dev_warn(&dev->dev, "PCI vpd access has been disabled due to firmware bug\n");
+	}
+}
+
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0060,
+		quirk_blacklist_vpd);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x007c,
+		quirk_blacklist_vpd);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0413,
+		quirk_blacklist_vpd);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0078,
+		quirk_blacklist_vpd);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0079,
+		quirk_blacklist_vpd);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0073,
+		quirk_blacklist_vpd);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0071,
+		quirk_blacklist_vpd);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x005b,
+		quirk_blacklist_vpd);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x002f,
+		quirk_blacklist_vpd);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x005d,
+		quirk_blacklist_vpd);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x005f,
+		quirk_blacklist_vpd);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATTANSIC, PCI_ANY_ID,
+		quirk_blacklist_vpd);
+
+/*
  * For Broadcom 5706, 5708, 5709 rev. A nics, any read beyond the
  * VPD end tag will hang the device.  This problem was initially
  * observed when a vpd entry was created in sysfs
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCHv2 2/4] pci: allow access to VPD attributes with size '0'
  2016-01-13 11:25 ` [PATCHv2 2/4] pci: allow access to VPD attributes with size '0' Hannes Reinecke
@ 2016-02-09 20:53   ` Bjorn Helgaas
  2016-02-10  7:17     ` Hannes Reinecke
  0 siblings, 1 reply; 37+ messages in thread
From: Bjorn Helgaas @ 2016-02-09 20:53 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: Alexander Duyck, linux-pci, linux-kernel, Babu Moger

Hi Hannes,

On Wed, Jan 13, 2016 at 12:25:33PM +0100, Hannes Reinecke wrote:
> It is not always possible to determine the actual size of the VPD
> data, so allow access to them if the size is set to '0'.
> 
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
>  drivers/pci/pci-sysfs.c | 20 ++++++++++++--------
>  1 file changed, 12 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index eead54c..de327c3 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -772,10 +772,12 @@ static ssize_t read_vpd_attr(struct file *filp, struct kobject *kobj,
>  	struct pci_dev *dev =
>  		to_pci_dev(container_of(kobj, struct device, kobj));
>  
> -	if (off > bin_attr->size)
> -		count = 0;
> -	else if (count > bin_attr->size - off)
> -		count = bin_attr->size - off;
> +	if (bin_attr->size > 0) {
> +		if (off > bin_attr->size)
> +			count = 0;
> +		else if (count > bin_attr->size - off)
> +			count = bin_attr->size - off;
> +	}

I'm trying to figure out why we do any of this checking here.  It
seems like we could do all of it in pci_read_vpd().

Where is bin_attr->size set?  I assume this is the "attr" allocated in
pci_create_capabilities_sysfs().  I see that before your series, we
set "attr->size = dev->vpd->len" there, but after patch 3/4 of your
series, we set it to 0 there, and I don't see a place that sets it to
anything else.

>  	return pci_read_vpd(dev, off, count, buf);
>  }
> @@ -787,10 +789,12 @@ static ssize_t write_vpd_attr(struct file *filp, struct kobject *kobj,
>  	struct pci_dev *dev =
>  		to_pci_dev(container_of(kobj, struct device, kobj));
>  
> -	if (off > bin_attr->size)
> -		count = 0;
> -	else if (count > bin_attr->size - off)
> -		count = bin_attr->size - off;
> +	if (bin_attr->size > 0) {
> +		if (off > bin_attr->size)
> +			count = 0;
> +		else if (count > bin_attr->size - off)
> +			count = bin_attr->size - off;
> +	}
>  
>  	return pci_write_vpd(dev, off, count, buf);
>  }
> -- 
> 1.8.5.6
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCHv2 3/4] pci: Determine actual VPD size on first access
  2016-01-13 11:25 ` [PATCHv2 3/4] pci: Determine actual VPD size on first access Hannes Reinecke
@ 2016-02-09 21:04   ` Bjorn Helgaas
  2016-02-10  7:24     ` Hannes Reinecke
  2016-08-09 12:54     ` Alexey Kardashevskiy
  0 siblings, 2 replies; 37+ messages in thread
From: Bjorn Helgaas @ 2016-02-09 21:04 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: Alexander Duyck, linux-pci, linux-kernel, Babu Moger

On Wed, Jan 13, 2016 at 12:25:34PM +0100, Hannes Reinecke wrote:
> PCI-2.2 VPD entries have a maximum size of 32k, but might actually
> be smaller than that. To figure out the actual size one has to read
> the VPD area until the 'end marker' is reached.
> Trying to read VPD data beyond that marker results in 'interesting'
> effects, from simple read errors to crashing the card. And to make
> matters worse not every PCI card implements this properly, leaving
> us with no 'end' marker or even completely invalid data.
> This path tries to determine the size of the VPD data.
> If no valid data can be read an I/O error will be returned when
> reading the sysfs attribute.
> 
> Cc: Alexander Duyck <alexander.duyck@gmail.com>
> Cc: Bjorn Helgaas <helgaas@kernel.org>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
>  drivers/pci/access.c    | 76 ++++++++++++++++++++++++++++++++++++++++++++++++-
>  drivers/pci/pci-sysfs.c |  2 +-
>  2 files changed, 76 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/access.c b/drivers/pci/access.c
> index 59ac36f..914e023 100644
> --- a/drivers/pci/access.c
> +++ b/drivers/pci/access.c
> @@ -284,9 +284,65 @@ struct pci_vpd_pci22 {
>  	struct mutex lock;
>  	u16	flag;
>  	bool	busy;
> +	bool	valid;

You're just following the precedent here, but I think using "bool" in
structures like this is pointless: it takes more space than a :1 field
and doesn't really add any safety.

>  	u8	cap;
>  };
>  
> +/**
> + * pci_vpd_size - determine actual size of Vital Product Data
> + * @dev:	pci device struct
> + * @old_size:	current assumed size, also maximum allowed size
> + *

Superfluous empty line.

> + */
> +static size_t
> +pci_vpd_pci22_size(struct pci_dev *dev)

Please follow indentation style of the file (all on one line).

> +{
> +	size_t off = 0;
> +	unsigned char header[1+2];	/* 1 byte tag, 2 bytes length */
> +
> +	while (off < PCI_VPD_PCI22_SIZE &&
> +	       pci_read_vpd(dev, off, 1, header) == 1) {
> +		unsigned char tag;
> +
> +		if (header[0] & PCI_VPD_LRDT) {
> +			/* Large Resource Data Type Tag */
> +			tag = pci_vpd_lrdt_tag(header);
> +			/* Only read length from known tag items */
> +			if ((tag == PCI_VPD_LTIN_ID_STRING) ||
> +			    (tag == PCI_VPD_LTIN_RO_DATA) ||
> +			    (tag == PCI_VPD_LTIN_RW_DATA)) {
> +				if (pci_read_vpd(dev, off+1, 2,
> +						 &header[1]) != 2) {
> +					dev_dbg(&dev->dev,
> +						"invalid large vpd tag %02x "
> +						"size at offset %zu",
> +						tag, off + 1);

The effect of this is that we set vpd->base.len to 0, which will cause
all VPD accesses to fail, right?  If so, I'd make this at least a
dev_info(), maybe even a dev_warn().  The dev_dbg() may not appear in
dmesg at all, depending on config options.

Capitalize "VPD" and concatenate all the strings, even if it exceeds
80 columns.

> +					break;

I think this leads to "return 0" below; I'd just return directly here
so the reader doesn't have to figure out what we're breaking from.

> +				}
> +				off += PCI_VPD_LRDT_TAG_SIZE +
> +					pci_vpd_lrdt_size(header);
> +			}
> +		} else {
> +			/* Short Resource Data Type Tag */
> +			off += PCI_VPD_SRDT_TAG_SIZE +
> +				pci_vpd_srdt_size(header);
> +			tag = pci_vpd_srdt_tag(header);
> +		}
> +		if (tag == PCI_VPD_STIN_END)	/* End tag descriptor */
> +			return off;
> +		if ((tag != PCI_VPD_LTIN_ID_STRING) &&
> +		    (tag != PCI_VPD_LTIN_RO_DATA) &&
> +		    (tag != PCI_VPD_LTIN_RW_DATA)) {
> +			dev_dbg(&dev->dev,
> +				"invalid %s vpd tag %02x at offset %zu",
> +				(header[0] & PCI_VPD_LRDT) ? "large" : "short",
> +				tag, off);
> +			break;

Same dev_dbg() and break comment here.

> +		}
> +	}
> +	return 0;
> +}
> +
>  /*
>   * Wait for last operation to complete.
>   * This code has to spin since there is no other notification from the PCI
> @@ -337,9 +393,23 @@ static ssize_t pci_vpd_pci22_read(struct pci_dev *dev, loff_t pos, size_t count,
>  	loff_t end = pos + count;
>  	u8 *buf = arg;
>  
> -	if (pos < 0 || pos > vpd->base.len || end > vpd->base.len)
> +	if (pos < 0)
>  		return -EINVAL;
>  
> +	if (!vpd->valid) {
> +		vpd->valid = true;
> +		vpd->base.len = pci_vpd_pci22_size(dev);
> +	}
> +	if (vpd->base.len == 0)
> +		return -EIO;
> +
> +	if (end > vpd->base.len) {
> +		if (pos > vpd->base.len)
> +			return 0;
> +		end = vpd->base.len;
> +		count = end - pos;
> +	}
> +
>  	if (mutex_lock_killable(&vpd->lock))
>  		return -EINTR;
>  
> @@ -389,6 +459,9 @@ static ssize_t pci_vpd_pci22_write(struct pci_dev *dev, loff_t pos, size_t count
>  	loff_t end = pos + count;
>  	int ret = 0;
>  
> +	if (!vpd->valid)
> +		return -EAGAIN;

Does this mean we now have to do a VPD read before a VPD write?  That
sounds like a new requirement that is non-obvious.

> +
>  	if (pos < 0 || (pos & 3) || (count & 3) || end > vpd->base.len)
>  		return -EINVAL;
>  
> @@ -496,6 +569,7 @@ int pci_vpd_pci22_init(struct pci_dev *dev)
>  	mutex_init(&vpd->lock);
>  	vpd->cap = cap;
>  	vpd->busy = false;
> +	vpd->valid = false;
>  	dev->vpd = &vpd->base;
>  	return 0;
>  }
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index de327c3..31a1f35 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -1333,7 +1333,7 @@ static int pci_create_capabilities_sysfs(struct pci_dev *dev)
>  			return -ENOMEM;
>  
>  		sysfs_bin_attr_init(attr);
> -		attr->size = dev->vpd->len;
> +		attr->size = 0;

I'm still puzzled about how we're using attr->size.  I don't really
want that size to change at run-time, i.e., I don't want it to be zero
immediately after boot, then change to something else after the first
VPD read.  I don't think it's important that this reflect the size
computed based on the VPD contents.  I think it would be fine if it
were set to 32K all the time and possibly set to zero by quirks on
devices where VPD is completely broken.

>  		attr->attr.name = "vpd";
>  		attr->attr.mode = S_IRUSR | S_IWUSR;
>  		attr->read = read_vpd_attr;
> -- 
> 1.8.5.6
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCHv2 4/4] pci: Blacklist vpd access for buggy devices
  2016-01-13 11:25 ` [PATCHv2 4/4] pci: Blacklist vpd access for buggy devices Hannes Reinecke
  2016-01-19 20:57   ` [PATCH v3 " Babu Moger
@ 2016-02-09 21:07   ` Bjorn Helgaas
  2016-02-09 21:24     ` Babu Moger
  1 sibling, 1 reply; 37+ messages in thread
From: Bjorn Helgaas @ 2016-02-09 21:07 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: Alexander Duyck, linux-pci, linux-kernel, Babu Moger

There seem to be several revs of this patch, and it's hard for me to
keep track of what's current.  If you want to update any patch in the
series, please repost the entire series with a new version number.

On Wed, Jan 13, 2016 at 12:25:35PM +0100, Hannes Reinecke wrote:
> From: Babu Moger <babu.moger@oracle.com>
> 
> Reading or Writing of PCI VPD data causes system panic.
> We saw this problem by running "lspci -vvv" in the beginning.
> However this can be easily reproduced by running
>  cat /sys/bus/devices/XX../vpd
> 
> As even a simple read on any VPD data triggers a system
> lockup on certain cards this patch implements a PCI quirk
> to disabling VPD acces altogether by setting the vpd length

s/acces/access/
s/vpd/VPD/

> to '0'.
> 
> Signed-off-by: Babu Moger <babu.moger@oracle.com>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
>  drivers/pci/access.c |  5 ++++-
>  drivers/pci/quirks.c | 41 +++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 45 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/access.c b/drivers/pci/access.c
> index 914e023..82f41a8 100644
> --- a/drivers/pci/access.c
> +++ b/drivers/pci/access.c
> @@ -396,7 +396,7 @@ static ssize_t pci_vpd_pci22_read(struct pci_dev *dev, loff_t pos, size_t count,
>  	if (pos < 0)
>  		return -EINVAL;
>  
> -	if (!vpd->valid) {
> +	if (!vpd->valid && vpd->base.len > 0) {
>  		vpd->valid = true;
>  		vpd->base.len = pci_vpd_pci22_size(dev);
>  	}
> @@ -459,6 +459,9 @@ static ssize_t pci_vpd_pci22_write(struct pci_dev *dev, loff_t pos, size_t count
>  	loff_t end = pos + count;
>  	int ret = 0;
>  
> +	if (vpd->base.len == 0)
> +		return -EIO;
> +
>  	if (!vpd->valid)
>  		return -EAGAIN;
>  
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 7e32730..af0f8a1 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -2123,6 +2123,47 @@ static void quirk_via_cx700_pci_parking_caching(struct pci_dev *dev)
>  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_VIA, 0x324e, quirk_via_cx700_pci_parking_caching);
>  
>  /*
> + * A read/write to sysfs entry ('/sys/bus/pci/devices/<id>/vpd')
> + * will dump 32k of data. The default length is set as 32768.
> + * Reading a full 32k will cause an access beyond the VPD end tag.
> + * The system behaviour at that point is mostly unpredictable.
> + * Apparently, some vendors have not implemented this VPD headers properly.
> + * Adding a generic function disable vpd data for these buggy adapters
> + * Add the DECLARE_PCI_FIXUP_FINAL line below with the specific with
> + * vendor and device of interest to use this quirk.
> + */
> +static void quirk_blacklist_vpd(struct pci_dev *dev)
> +{
> +	if (dev->vpd) {
> +		dev->vpd->len = 0;
> +		dev_warn(&dev->dev, "PCI vpd access has been disabled due to firmware bug\n");

"PCI" is superfluous and "VPD" should be capitalized.

> +	}
> +}
> +
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0060,
> +		quirk_blacklist_vpd);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x007c,
> +		quirk_blacklist_vpd);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0413,
> +		quirk_blacklist_vpd);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0078,
> +		quirk_blacklist_vpd);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0079,
> +		quirk_blacklist_vpd);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0073,
> +		quirk_blacklist_vpd);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0071,
> +		quirk_blacklist_vpd);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x005b,
> +		quirk_blacklist_vpd);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x002f,
> +		quirk_blacklist_vpd);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x005d,
> +		quirk_blacklist_vpd);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x005f,
> +		quirk_blacklist_vpd);
> +
> +/*
>   * For Broadcom 5706, 5708, 5709 rev. A nics, any read beyond the
>   * VPD end tag will hang the device.  This problem was initially
>   * observed when a vpd entry was created in sysfs
> -- 
> 1.8.5.6
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCHv2 4/4] pci: Blacklist vpd access for buggy devices
  2016-02-09 21:07   ` [PATCHv2 " Bjorn Helgaas
@ 2016-02-09 21:24     ` Babu Moger
  0 siblings, 0 replies; 37+ messages in thread
From: Babu Moger @ 2016-02-09 21:24 UTC (permalink / raw)
  To: Bjorn Helgaas, Hannes Reinecke; +Cc: Alexander Duyck, linux-pci, linux-kernel

On 2/9/2016 3:07 PM, Bjorn Helgaas wrote:
> There seem to be several revs of this patch, and it's hard for me to
> keep track of what's current.  If you want to update any patch in the
> series, please repost the entire series with a new version number.

 Here is the latest of patch 4/4.
 https://patchwork.kernel.org/patch/8084221/

 I will wait for Hannes's response before re-posting it.
 Hannes, If you want me to re-post all the series let me know.
 
> 
> On Wed, Jan 13, 2016 at 12:25:35PM +0100, Hannes Reinecke wrote:
>> From: Babu Moger <babu.moger@oracle.com>
>>
>> Reading or Writing of PCI VPD data causes system panic.
>> We saw this problem by running "lspci -vvv" in the beginning.
>> However this can be easily reproduced by running
>>  cat /sys/bus/devices/XX../vpd
>>
>> As even a simple read on any VPD data triggers a system
>> lockup on certain cards this patch implements a PCI quirk
>> to disabling VPD acces altogether by setting the vpd length
> 
> s/acces/access/
> s/vpd/VPD/
> 
>> to '0'.
>>
>> Signed-off-by: Babu Moger <babu.moger@oracle.com>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> ---
>>  drivers/pci/access.c |  5 ++++-
>>  drivers/pci/quirks.c | 41 +++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 45 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/pci/access.c b/drivers/pci/access.c
>> index 914e023..82f41a8 100644
>> --- a/drivers/pci/access.c
>> +++ b/drivers/pci/access.c
>> @@ -396,7 +396,7 @@ static ssize_t pci_vpd_pci22_read(struct pci_dev *dev, loff_t pos, size_t count,
>>  	if (pos < 0)
>>  		return -EINVAL;
>>  
>> -	if (!vpd->valid) {
>> +	if (!vpd->valid && vpd->base.len > 0) {
>>  		vpd->valid = true;
>>  		vpd->base.len = pci_vpd_pci22_size(dev);
>>  	}
>> @@ -459,6 +459,9 @@ static ssize_t pci_vpd_pci22_write(struct pci_dev *dev, loff_t pos, size_t count
>>  	loff_t end = pos + count;
>>  	int ret = 0;
>>  
>> +	if (vpd->base.len == 0)
>> +		return -EIO;
>> +
>>  	if (!vpd->valid)
>>  		return -EAGAIN;
>>  
>> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>> index 7e32730..af0f8a1 100644
>> --- a/drivers/pci/quirks.c
>> +++ b/drivers/pci/quirks.c
>> @@ -2123,6 +2123,47 @@ static void quirk_via_cx700_pci_parking_caching(struct pci_dev *dev)
>>  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_VIA, 0x324e, quirk_via_cx700_pci_parking_caching);
>>  
>>  /*
>> + * A read/write to sysfs entry ('/sys/bus/pci/devices/<id>/vpd')
>> + * will dump 32k of data. The default length is set as 32768.
>> + * Reading a full 32k will cause an access beyond the VPD end tag.
>> + * The system behaviour at that point is mostly unpredictable.
>> + * Apparently, some vendors have not implemented this VPD headers properly.
>> + * Adding a generic function disable vpd data for these buggy adapters
>> + * Add the DECLARE_PCI_FIXUP_FINAL line below with the specific with
>> + * vendor and device of interest to use this quirk.
>> + */
>> +static void quirk_blacklist_vpd(struct pci_dev *dev)
>> +{
>> +	if (dev->vpd) {
>> +		dev->vpd->len = 0;
>> +		dev_warn(&dev->dev, "PCI vpd access has been disabled due to firmware bug\n");
> 
> "PCI" is superfluous and "VPD" should be capitalized.
> 
>> +	}
>> +}
>> +
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0060,
>> +		quirk_blacklist_vpd);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x007c,
>> +		quirk_blacklist_vpd);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0413,
>> +		quirk_blacklist_vpd);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0078,
>> +		quirk_blacklist_vpd);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0079,
>> +		quirk_blacklist_vpd);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0073,
>> +		quirk_blacklist_vpd);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x0071,
>> +		quirk_blacklist_vpd);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x005b,
>> +		quirk_blacklist_vpd);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x002f,
>> +		quirk_blacklist_vpd);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x005d,
>> +		quirk_blacklist_vpd);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x005f,
>> +		quirk_blacklist_vpd);
>> +
>> +/*
>>   * For Broadcom 5706, 5708, 5709 rev. A nics, any read beyond the
>>   * VPD end tag will hang the device.  This problem was initially
>>   * observed when a vpd entry was created in sysfs
>> -- 
>> 1.8.5.6
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCHv2 2/4] pci: allow access to VPD attributes with size '0'
  2016-02-09 20:53   ` Bjorn Helgaas
@ 2016-02-10  7:17     ` Hannes Reinecke
  0 siblings, 0 replies; 37+ messages in thread
From: Hannes Reinecke @ 2016-02-10  7:17 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Alexander Duyck, linux-pci, linux-kernel, Babu Moger

On 02/09/2016 09:53 PM, Bjorn Helgaas wrote:
> Hi Hannes,
> 
> On Wed, Jan 13, 2016 at 12:25:33PM +0100, Hannes Reinecke wrote:
>> It is not always possible to determine the actual size of the VPD
>> data, so allow access to them if the size is set to '0'.
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> ---
>>  drivers/pci/pci-sysfs.c | 20 ++++++++++++--------
>>  1 file changed, 12 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
>> index eead54c..de327c3 100644
>> --- a/drivers/pci/pci-sysfs.c
>> +++ b/drivers/pci/pci-sysfs.c
>> @@ -772,10 +772,12 @@ static ssize_t read_vpd_attr(struct file *filp, struct kobject *kobj,
>>  	struct pci_dev *dev =
>>  		to_pci_dev(container_of(kobj, struct device, kobj));
>>  
>> -	if (off > bin_attr->size)
>> -		count = 0;
>> -	else if (count > bin_attr->size - off)
>> -		count = bin_attr->size - off;
>> +	if (bin_attr->size > 0) {
>> +		if (off > bin_attr->size)
>> +			count = 0;
>> +		else if (count > bin_attr->size - off)
>> +			count = bin_attr->size - off;
>> +	}
> 
> I'm trying to figure out why we do any of this checking here.  It
> seems like we could do all of it in pci_read_vpd().
> 
> Where is bin_attr->size set?  I assume this is the "attr" allocated in
> pci_create_capabilities_sysfs().  I see that before your series, we
> set "attr->size = dev->vpd->len" there, but after patch 3/4 of your
> series, we set it to 0 there, and I don't see a place that sets it to
> anything else.
> 
That is entirely correct.
One of the joys of sysfs binary attributes.

In general, sysfs binary attributes are allowed to have a size of
'0', but still can provide data when read.
The size of '0' is just an indicator for "I don't know how much data
I can provide, figure out yourself".
(Cf fs/sysfs/file.c:sysfs_kf_bin_read() for details here)

In our case we're not using that function, but rather our own
pci_read_vpd(), which always exposes a size.

So this patch just takes the same reasoning from
sysfs_kf_bin_read(), and sets the size to '0', as initially we
simply do _not_ know how much data will be provided.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCHv2 3/4] pci: Determine actual VPD size on first access
  2016-02-09 21:04   ` Bjorn Helgaas
@ 2016-02-10  7:24     ` Hannes Reinecke
  2016-08-09 12:54     ` Alexey Kardashevskiy
  1 sibling, 0 replies; 37+ messages in thread
From: Hannes Reinecke @ 2016-02-10  7:24 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Alexander Duyck, linux-pci, linux-kernel, Babu Moger

On 02/09/2016 10:04 PM, Bjorn Helgaas wrote:
> On Wed, Jan 13, 2016 at 12:25:34PM +0100, Hannes Reinecke wrote:
>> PCI-2.2 VPD entries have a maximum size of 32k, but might actually
>> be smaller than that. To figure out the actual size one has to read
>> the VPD area until the 'end marker' is reached.
>> Trying to read VPD data beyond that marker results in 'interesting'
>> effects, from simple read errors to crashing the card. And to make
>> matters worse not every PCI card implements this properly, leaving
>> us with no 'end' marker or even completely invalid data.
>> This path tries to determine the size of the VPD data.
>> If no valid data can be read an I/O error will be returned when
>> reading the sysfs attribute.
>>
>> Cc: Alexander Duyck <alexander.duyck@gmail.com>
>> Cc: Bjorn Helgaas <helgaas@kernel.org>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> ---
>>  drivers/pci/access.c    | 76 ++++++++++++++++++++++++++++++++++++++++++++++++-
>>  drivers/pci/pci-sysfs.c |  2 +-
>>  2 files changed, 76 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/pci/access.c b/drivers/pci/access.c
>> index 59ac36f..914e023 100644
>> --- a/drivers/pci/access.c
>> +++ b/drivers/pci/access.c
>> @@ -284,9 +284,65 @@ struct pci_vpd_pci22 {
>>  	struct mutex lock;
>>  	u16	flag;
>>  	bool	busy;
>> +	bool	valid;
> 
> You're just following the precedent here, but I think using "bool" in
> structures like this is pointless: it takes more space than a :1 field
> and doesn't really add any safety.
> 
Okay, will be moving to a bitfield here.

>>  	u8	cap;
>>  };
>>  
>> +/**
>> + * pci_vpd_size - determine actual size of Vital Product Data
>> + * @dev:	pci device struct
>> + * @old_size:	current assumed size, also maximum allowed size
>> + *
> 
> Superfluous empty line.
> 
Ok.

>> + */
>> +static size_t
>> +pci_vpd_pci22_size(struct pci_dev *dev)
> 
> Please follow indentation style of the file (all on one line).
> 
Ok.

>> +{
>> +	size_t off = 0;
>> +	unsigned char header[1+2];	/* 1 byte tag, 2 bytes length */
>> +
>> +	while (off < PCI_VPD_PCI22_SIZE &&
>> +	       pci_read_vpd(dev, off, 1, header) == 1) {
>> +		unsigned char tag;
>> +
>> +		if (header[0] & PCI_VPD_LRDT) {
>> +			/* Large Resource Data Type Tag */
>> +			tag = pci_vpd_lrdt_tag(header);
>> +			/* Only read length from known tag items */
>> +			if ((tag == PCI_VPD_LTIN_ID_STRING) ||
>> +			    (tag == PCI_VPD_LTIN_RO_DATA) ||
>> +			    (tag == PCI_VPD_LTIN_RW_DATA)) {
>> +				if (pci_read_vpd(dev, off+1, 2,
>> +						 &header[1]) != 2) {
>> +					dev_dbg(&dev->dev,
>> +						"invalid large vpd tag %02x "
>> +						"size at offset %zu",
>> +						tag, off + 1);
> 
> The effect of this is that we set vpd->base.len to 0, which will cause
> all VPD accesses to fail, right?  If so, I'd make this at least a
> dev_info(), maybe even a dev_warn().  The dev_dbg() may not appear in
> dmesg at all, depending on config options.
> 
Wrong. Setting the length to '0' will just make the sysfs attribute
appear with a length of '0', as we cannot determine the length when
we create the attribute.

> Capitalize "VPD" and concatenate all the strings, even if it exceeds
> 80 columns.
> 
Okay.

>> +					break;
> 
> I think this leads to "return 0" below; I'd just return directly here
> so the reader doesn't have to figure out what we're breaking from.
> 
Okay, can do.

>> +				}
>> +				off += PCI_VPD_LRDT_TAG_SIZE +
>> +					pci_vpd_lrdt_size(header);
>> +			}
>> +		} else {
>> +			/* Short Resource Data Type Tag */
>> +			off += PCI_VPD_SRDT_TAG_SIZE +
>> +				pci_vpd_srdt_size(header);
>> +			tag = pci_vpd_srdt_tag(header);
>> +		}
>> +		if (tag == PCI_VPD_STIN_END)	/* End tag descriptor */
>> +			return off;
>> +		if ((tag != PCI_VPD_LTIN_ID_STRING) &&
>> +		    (tag != PCI_VPD_LTIN_RO_DATA) &&
>> +		    (tag != PCI_VPD_LTIN_RW_DATA)) {
>> +			dev_dbg(&dev->dev,
>> +				"invalid %s vpd tag %02x at offset %zu",
>> +				(header[0] & PCI_VPD_LRDT) ? "large" : "short",
>> +				tag, off);
>> +			break;
> 
> Same dev_dbg() and break comment here.
> 
>> +		}
>> +	}
>> +	return 0;
>> +}
>> +
>>  /*
>>   * Wait for last operation to complete.
>>   * This code has to spin since there is no other notification from the PCI
>> @@ -337,9 +393,23 @@ static ssize_t pci_vpd_pci22_read(struct pci_dev *dev, loff_t pos, size_t count,
>>  	loff_t end = pos + count;
>>  	u8 *buf = arg;
>>  
>> -	if (pos < 0 || pos > vpd->base.len || end > vpd->base.len)
>> +	if (pos < 0)
>>  		return -EINVAL;
>>  
>> +	if (!vpd->valid) {
>> +		vpd->valid = true;
>> +		vpd->base.len = pci_vpd_pci22_size(dev);
>> +	}
>> +	if (vpd->base.len == 0)
>> +		return -EIO;
>> +
>> +	if (end > vpd->base.len) {
>> +		if (pos > vpd->base.len)
>> +			return 0;
>> +		end = vpd->base.len;
>> +		count = end - pos;
>> +	}
>> +
>>  	if (mutex_lock_killable(&vpd->lock))
>>  		return -EINTR;
>>  
>> @@ -389,6 +459,9 @@ static ssize_t pci_vpd_pci22_write(struct pci_dev *dev, loff_t pos, size_t count
>>  	loff_t end = pos + count;
>>  	int ret = 0;
>>  
>> +	if (!vpd->valid)
>> +		return -EAGAIN;
> 
> Does this mean we now have to do a VPD read before a VPD write?  That
> sounds like a new requirement that is non-obvious.
> 
Hmm. I thought no-one would be daft enough to write a VPD attribute
without first checking whether it actually _needs_ to be written to.
But yeah, you're right. I'll be adding a call to
pci_vpd_pci22_size() here.

>> +
>>  	if (pos < 0 || (pos & 3) || (count & 3) || end > vpd->base.len)
>>  		return -EINVAL;
>>  
>> @@ -496,6 +569,7 @@ int pci_vpd_pci22_init(struct pci_dev *dev)
>>  	mutex_init(&vpd->lock);
>>  	vpd->cap = cap;
>>  	vpd->busy = false;
>> +	vpd->valid = false;
>>  	dev->vpd = &vpd->base;
>>  	return 0;
>>  }
>> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
>> index de327c3..31a1f35 100644
>> --- a/drivers/pci/pci-sysfs.c
>> +++ b/drivers/pci/pci-sysfs.c
>> @@ -1333,7 +1333,7 @@ static int pci_create_capabilities_sysfs(struct pci_dev *dev)
>>  			return -ENOMEM;
>>  
>>  		sysfs_bin_attr_init(attr);
>> -		attr->size = dev->vpd->len;
>> +		attr->size = 0;
> 
> I'm still puzzled about how we're using attr->size.  I don't really
> want that size to change at run-time, i.e., I don't want it to be zero
> immediately after boot, then change to something else after the first
> VPD read.  I don't think it's important that this reflect the size
> computed based on the VPD contents.  I think it would be fine if it
> were set to 32K all the time and possibly set to zero by quirks on
> devices where VPD is completely broken.
> 
The general idea is to always set the attribute size to '0' (ie the
size reflected to userland via the sysfs attribute).
The _actual_ size of the VPD data is generated by a call to
pci_vpd_pci22_size(), and stored in the vpd structure itself.

With that we won't need to change the sysfs attribute size (which we
cannot do anyway), and don't need any quirks like modifying the
return data for invalid VPD data.

But yeah, the sysfs zero-length binfile handling is non-obvious. It
took me several days and a confused mail exchange with Alexander
Duyck to figure it out.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCHv2 3/4] pci: Determine actual VPD size on first access
  2016-02-09 21:04   ` Bjorn Helgaas
  2016-02-10  7:24     ` Hannes Reinecke
@ 2016-08-09 12:54     ` Alexey Kardashevskiy
  2016-08-09 18:12       ` Alexander Duyck
  2016-08-09 23:59       ` [PATCHv2 3/4] pci: Determine actual VPD size on first access Benjamin Herrenschmidt
  1 sibling, 2 replies; 37+ messages in thread
From: Alexey Kardashevskiy @ 2016-08-09 12:54 UTC (permalink / raw)
  To: Bjorn Helgaas, Hannes Reinecke
  Cc: Alexander Duyck, linux-pci, linux-kernel, Babu Moger,
	Paul Mackerras, Alex Williamson, Benjamin Herrenschmidt

On 10/02/16 08:04, Bjorn Helgaas wrote:
> On Wed, Jan 13, 2016 at 12:25:34PM +0100, Hannes Reinecke wrote:
>> PCI-2.2 VPD entries have a maximum size of 32k, but might actually
>> be smaller than that. To figure out the actual size one has to read
>> the VPD area until the 'end marker' is reached.
>> Trying to read VPD data beyond that marker results in 'interesting'
>> effects, from simple read errors to crashing the card. And to make
>> matters worse not every PCI card implements this properly, leaving
>> us with no 'end' marker or even completely invalid data.
>> This path tries to determine the size of the VPD data.
>> If no valid data can be read an I/O error will be returned when
>> reading the sysfs attribute.


I have a problem with this particular feature as today VFIO uses this
pci_vpd_xxxx API to virtualize access to VPD and the existing code assumes
there is just one VPD block with 0x2 start and 0xf end. However I have at
least one device where this is not true - "10 Gigabit Ethernet-SR PCI
Express Adapter" - it has 2 blocks (made a script to read/parse it as
/sys/bus/pci/devices/0001\:03\:00.0/vpd shows it wrong):


#0000 Large item 42 bytes; name 0x2 Identifier String
#002d Large item 74 bytes; name 0x10
#007a Small item 1 bytes; name 0xf End Tag
---
#0c00 Large item 16 bytes; name 0x2 Identifier String
#0c13 Large item 234 bytes; name 0x10
#0d00 Large item 252 bytes; name 0x11
#0dff Small item 0 bytes; name 0xf End Tag

The cxgb3 driver is reading the second bit starting from 0xc00 but since
the size is wrongly detected as 0x7c, VFIO blocks access beyond it and the
guest driver fails to probe.

I also cannot find a clause in the PCI 3.0 spec saying that there must be
just a single block, is it there?

What would the correct fix be? Scanning all 32k of VPD is not an option I
suppose as this is what this patch is trying to avoid. Thanks.



This is the device:

[aik@p81-p9 ~]$ sudo lspci -vvnns 0001:03:00.0
0001:03:00.0 Ethernet controller [0200]: Chelsio Communications Inc T310
10GbE Single Port Adapter [1425:0030]
	Subsystem: IBM Device [1014:038c]
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 494
	Region 0: Memory at 3fe080880000 (64-bit, non-prefetchable) [size=4K]
	Region 2: Memory at 3fe080000000 (64-bit, non-prefetchable) [size=8M]
	Region 4: Memory at 3fe080881000 (64-bit, non-prefetchable) [size=4K]
	[virtual] Expansion ROM at 3fe080800000 [disabled] [size=512K]
	Capabilities: [40] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold-)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [48] MSI: Enable- Count=1/32 Maskable- 64bit+
		Address: 0000000000000000  Data: 0000
	Capabilities: [58] Express (v2) Endpoint, MSI 00
		DevCap:	MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported+
			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 256 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
		LnkCap:	Port #0, Speed 2.5GT/s, Width x8, ASPM L0s L1, Exit Latency L0s
unlimited, L1 unlimited
			ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk-
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt-
ABWMgmt-
		DevCap2: Completion Timeout: Range ABC, TimeoutDis-, LTR-, OBFF Not Supported
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
		LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance-
ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-,
EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [94] Vital Product Data
		Product Name: 10 Gigabit Ethernet-SR PCI Express Adapter
		Read-only fields:
			[EC] Engineering changes: D76809
			[FN] Unknown: 34 36 4b 37 38 39 37
			[PN] Part number: 46K7897
			[MN] Manufacture ID: 31 30 33 37
			[FC] Unknown: 35 37 36 39
			[SN] Serial number: YL102035603V
			[NA] Unknown: 30 30 31 34 35 45 39 39 32 45 44 31
		End
	Capabilities: [9c] MSI-X: Enable+ Count=32 Masked-
		Vector table: BAR=4 offset=00000000
		PBA: BAR=4 offset=00000800
	Capabilities: [100 v1] Device Serial Number 00-00-00-01-00-00-00-01
	Capabilities: [300 v1] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP-
ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP-
ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+
ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
	Kernel driver in use: cxgb3
	Kernel modules: cxgb3


>>
>> Cc: Alexander Duyck <alexander.duyck@gmail.com>
>> Cc: Bjorn Helgaas <helgaas@kernel.org>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> ---
>>  drivers/pci/access.c    | 76 ++++++++++++++++++++++++++++++++++++++++++++++++-
>>  drivers/pci/pci-sysfs.c |  2 +-
>>  2 files changed, 76 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/pci/access.c b/drivers/pci/access.c
>> index 59ac36f..914e023 100644
>> --- a/drivers/pci/access.c
>> +++ b/drivers/pci/access.c
>> @@ -284,9 +284,65 @@ struct pci_vpd_pci22 {
>>  	struct mutex lock;
>>  	u16	flag;
>>  	bool	busy;
>> +	bool	valid;
> 
> You're just following the precedent here, but I think using "bool" in
> structures like this is pointless: it takes more space than a :1 field
> and doesn't really add any safety.
> 
>>  	u8	cap;
>>  };
>>  
>> +/**
>> + * pci_vpd_size - determine actual size of Vital Product Data
>> + * @dev:	pci device struct
>> + * @old_size:	current assumed size, also maximum allowed size
>> + *
> 
> Superfluous empty line.
> 
>> + */
>> +static size_t
>> +pci_vpd_pci22_size(struct pci_dev *dev)
> 
> Please follow indentation style of the file (all on one line).
> 
>> +{
>> +	size_t off = 0;
>> +	unsigned char header[1+2];	/* 1 byte tag, 2 bytes length */
>> +
>> +	while (off < PCI_VPD_PCI22_SIZE &&
>> +	       pci_read_vpd(dev, off, 1, header) == 1) {
>> +		unsigned char tag;
>> +
>> +		if (header[0] & PCI_VPD_LRDT) {
>> +			/* Large Resource Data Type Tag */
>> +			tag = pci_vpd_lrdt_tag(header);
>> +			/* Only read length from known tag items */
>> +			if ((tag == PCI_VPD_LTIN_ID_STRING) ||
>> +			    (tag == PCI_VPD_LTIN_RO_DATA) ||
>> +			    (tag == PCI_VPD_LTIN_RW_DATA)) {
>> +				if (pci_read_vpd(dev, off+1, 2,
>> +						 &header[1]) != 2) {
>> +					dev_dbg(&dev->dev,
>> +						"invalid large vpd tag %02x "
>> +						"size at offset %zu",
>> +						tag, off + 1);
> 
> The effect of this is that we set vpd->base.len to 0, which will cause
> all VPD accesses to fail, right?  If so, I'd make this at least a
> dev_info(), maybe even a dev_warn().  The dev_dbg() may not appear in
> dmesg at all, depending on config options.
> 
> Capitalize "VPD" and concatenate all the strings, even if it exceeds
> 80 columns.
> 
>> +					break;
> 
> I think this leads to "return 0" below; I'd just return directly here
> so the reader doesn't have to figure out what we're breaking from.
> 
>> +				}
>> +				off += PCI_VPD_LRDT_TAG_SIZE +
>> +					pci_vpd_lrdt_size(header);
>> +			}
>> +		} else {
>> +			/* Short Resource Data Type Tag */
>> +			off += PCI_VPD_SRDT_TAG_SIZE +
>> +				pci_vpd_srdt_size(header);
>> +			tag = pci_vpd_srdt_tag(header);
>> +		}
>> +		if (tag == PCI_VPD_STIN_END)	/* End tag descriptor */
>> +			return off;
>> +		if ((tag != PCI_VPD_LTIN_ID_STRING) &&
>> +		    (tag != PCI_VPD_LTIN_RO_DATA) &&
>> +		    (tag != PCI_VPD_LTIN_RW_DATA)) {
>> +			dev_dbg(&dev->dev,
>> +				"invalid %s vpd tag %02x at offset %zu",
>> +				(header[0] & PCI_VPD_LRDT) ? "large" : "short",
>> +				tag, off);
>> +			break;
> 
> Same dev_dbg() and break comment here.
> 
>> +		}
>> +	}
>> +	return 0;
>> +}
>> +
>>  /*
>>   * Wait for last operation to complete.
>>   * This code has to spin since there is no other notification from the PCI
>> @@ -337,9 +393,23 @@ static ssize_t pci_vpd_pci22_read(struct pci_dev *dev, loff_t pos, size_t count,
>>  	loff_t end = pos + count;
>>  	u8 *buf = arg;
>>  
>> -	if (pos < 0 || pos > vpd->base.len || end > vpd->base.len)
>> +	if (pos < 0)
>>  		return -EINVAL;
>>  
>> +	if (!vpd->valid) {
>> +		vpd->valid = true;
>> +		vpd->base.len = pci_vpd_pci22_size(dev);
>> +	}
>> +	if (vpd->base.len == 0)
>> +		return -EIO;
>> +
>> +	if (end > vpd->base.len) {
>> +		if (pos > vpd->base.len)
>> +			return 0;
>> +		end = vpd->base.len;
>> +		count = end - pos;
>> +	}
>> +
>>  	if (mutex_lock_killable(&vpd->lock))
>>  		return -EINTR;
>>  
>> @@ -389,6 +459,9 @@ static ssize_t pci_vpd_pci22_write(struct pci_dev *dev, loff_t pos, size_t count
>>  	loff_t end = pos + count;
>>  	int ret = 0;
>>  
>> +	if (!vpd->valid)
>> +		return -EAGAIN;
> 
> Does this mean we now have to do a VPD read before a VPD write?  That
> sounds like a new requirement that is non-obvious.
> 
>> +
>>  	if (pos < 0 || (pos & 3) || (count & 3) || end > vpd->base.len)
>>  		return -EINVAL;
>>  
>> @@ -496,6 +569,7 @@ int pci_vpd_pci22_init(struct pci_dev *dev)
>>  	mutex_init(&vpd->lock);
>>  	vpd->cap = cap;
>>  	vpd->busy = false;
>> +	vpd->valid = false;
>>  	dev->vpd = &vpd->base;
>>  	return 0;
>>  }
>> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
>> index de327c3..31a1f35 100644
>> --- a/drivers/pci/pci-sysfs.c
>> +++ b/drivers/pci/pci-sysfs.c
>> @@ -1333,7 +1333,7 @@ static int pci_create_capabilities_sysfs(struct pci_dev *dev)
>>  			return -ENOMEM;
>>  
>>  		sysfs_bin_attr_init(attr);
>> -		attr->size = dev->vpd->len;
>> +		attr->size = 0;
> 
> I'm still puzzled about how we're using attr->size.  I don't really
> want that size to change at run-time, i.e., I don't want it to be zero
> immediately after boot, then change to something else after the first
> VPD read.  I don't think it's important that this reflect the size
> computed based on the VPD contents.  I think it would be fine if it
> were set to 32K all the time and possibly set to zero by quirks on
> devices where VPD is completely broken.
> 
>>  		attr->attr.name = "vpd";
>>  		attr->attr.mode = S_IRUSR | S_IWUSR;
>>  		attr->read = read_vpd_attr;
>> -- 
>> 1.8.5.6
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


-- 
Alexey

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCHv2 3/4] pci: Determine actual VPD size on first access
  2016-08-09 12:54     ` Alexey Kardashevskiy
@ 2016-08-09 18:12       ` Alexander Duyck
  2016-08-10  0:03         ` Benjamin Herrenschmidt
  2016-08-10  6:23         ` Hannes Reinecke
  2016-08-09 23:59       ` [PATCHv2 3/4] pci: Determine actual VPD size on first access Benjamin Herrenschmidt
  1 sibling, 2 replies; 37+ messages in thread
From: Alexander Duyck @ 2016-08-09 18:12 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: Bjorn Helgaas, Hannes Reinecke, linux-pci, linux-kernel,
	Babu Moger, Paul Mackerras, Alex Williamson,
	Benjamin Herrenschmidt, santosh, Netdev

On Tue, Aug 9, 2016 at 5:54 AM, Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> On 10/02/16 08:04, Bjorn Helgaas wrote:
>> On Wed, Jan 13, 2016 at 12:25:34PM +0100, Hannes Reinecke wrote:
>>> PCI-2.2 VPD entries have a maximum size of 32k, but might actually
>>> be smaller than that. To figure out the actual size one has to read
>>> the VPD area until the 'end marker' is reached.
>>> Trying to read VPD data beyond that marker results in 'interesting'
>>> effects, from simple read errors to crashing the card. And to make
>>> matters worse not every PCI card implements this properly, leaving
>>> us with no 'end' marker or even completely invalid data.
>>> This path tries to determine the size of the VPD data.
>>> If no valid data can be read an I/O error will be returned when
>>> reading the sysfs attribute.
>
>
> I have a problem with this particular feature as today VFIO uses this
> pci_vpd_xxxx API to virtualize access to VPD and the existing code assumes
> there is just one VPD block with 0x2 start and 0xf end. However I have at
> least one device where this is not true - "10 Gigabit Ethernet-SR PCI
> Express Adapter" - it has 2 blocks (made a script to read/parse it as
> /sys/bus/pci/devices/0001\:03\:00.0/vpd shows it wrong):

The PCI spec is what essentially assumes that there is only one block.
If I am not mistaken in the case of this device the second block here
actually contains device configuration data, not actual VPD data.  The
issue here is that the second block is being accessed as VPD when it
isn't.

> #0000 Large item 42 bytes; name 0x2 Identifier String
> #002d Large item 74 bytes; name 0x10
> #007a Small item 1 bytes; name 0xf End Tag
> ---
> #0c00 Large item 16 bytes; name 0x2 Identifier String
> #0c13 Large item 234 bytes; name 0x10
> #0d00 Large item 252 bytes; name 0x11
> #0dff Small item 0 bytes; name 0xf End Tag

The second block here is driver proprietary setup bits.

> The cxgb3 driver is reading the second bit starting from 0xc00 but since
> the size is wrongly detected as 0x7c, VFIO blocks access beyond it and the
> guest driver fails to probe.
>
> I also cannot find a clause in the PCI 3.0 spec saying that there must be
> just a single block, is it there?

The problem is we need to be able to parse it.  The spec defines a
series of tags that can be used starting at offset 0.  That is how we
are supposed to get around through the VPD data.  The problem is we
can't have more than one end tag and what appears to be happening here
is that we are defining a second block of data which uses the same
formatting as VPD but is not VPD.

> What would the correct fix be? Scanning all 32k of VPD is not an option I
> suppose as this is what this patch is trying to avoid. Thanks.

I adding the current cxgb3 maintainer and netdev list to the Cc.  This
is something that can probably be addressed via a PCI quirk as what
needs to happen is that we need to extend the VPD in the case of this
part in order to include this second block.  As long as we can read
the VPD data all the way out to 0xdff odds are we could probably just
have the size arbitrarily increased to 0xe00 via the quirk and then
you would be able to access all of the VPD for the device.  We already
have code making other modifications to drivers/pci/quirks.c for
several Broadcom devices and probably just need something similar to
allow extended access in the case of these devices.

>
>
>
> This is the device:
>
> [aik@p81-p9 ~]$ sudo lspci -vvnns 0001:03:00.0
> 0001:03:00.0 Ethernet controller [0200]: Chelsio Communications Inc T310
> 10GbE Single Port Adapter [1425:0030]
>         Subsystem: IBM Device [1014:038c]
>         Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> Stepping- SERR- FastB2B- DisINTx+
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
> <MAbort- >SERR- <PERR- INTx-
>         Latency: 0
>         Interrupt: pin A routed to IRQ 494
>         Region 0: Memory at 3fe080880000 (64-bit, non-prefetchable) [size=4K]
>         Region 2: Memory at 3fe080000000 (64-bit, non-prefetchable) [size=8M]
>         Region 4: Memory at 3fe080881000 (64-bit, non-prefetchable) [size=4K]
>         [virtual] Expansion ROM at 3fe080800000 [disabled] [size=512K]
>         Capabilities: [40] Power Management version 3
>                 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold-)
>                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
>         Capabilities: [48] MSI: Enable- Count=1/32 Maskable- 64bit+
>                 Address: 0000000000000000  Data: 0000
>         Capabilities: [58] Express (v2) Endpoint, MSI 00
>                 DevCap: MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
>                         ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
>                 DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported+
>                         RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
>                         MaxPayload 256 bytes, MaxReadReq 512 bytes
>                 DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
>                 LnkCap: Port #0, Speed 2.5GT/s, Width x8, ASPM L0s L1, Exit Latency L0s
> unlimited, L1 unlimited
>                         ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
>                 LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
>                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>                 LnkSta: Speed 2.5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt-
> ABWMgmt-
>                 DevCap2: Completion Timeout: Range ABC, TimeoutDis-, LTR-, OBFF Not Supported
>                 DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
>                 LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
>                          Transmit Margin: Normal Operating Range, EnterModifiedCompliance-
> ComplianceSOS-
>                          Compliance De-emphasis: -6dB
>                 LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-,
> EqualizationPhase1-
>                          EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
>         Capabilities: [94] Vital Product Data
>                 Product Name: 10 Gigabit Ethernet-SR PCI Express Adapter
>                 Read-only fields:
>                         [EC] Engineering changes: D76809
>                         [FN] Unknown: 34 36 4b 37 38 39 37
>                         [PN] Part number: 46K7897
>                         [MN] Manufacture ID: 31 30 33 37
>                         [FC] Unknown: 35 37 36 39
>                         [SN] Serial number: YL102035603V
>                         [NA] Unknown: 30 30 31 34 35 45 39 39 32 45 44 31
>                 End
>         Capabilities: [9c] MSI-X: Enable+ Count=32 Masked-
>                 Vector table: BAR=4 offset=00000000
>                 PBA: BAR=4 offset=00000800
>         Capabilities: [100 v1] Device Serial Number 00-00-00-01-00-00-00-01
>         Capabilities: [300 v1] Advanced Error Reporting
>                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP-
> ECRC- UnsupReq- ACSViol-
>                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP-
> ECRC- UnsupReq- ACSViol-
>                 UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+
> ECRC- UnsupReq- ACSViol-
>                 CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
>                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>                 AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
>         Kernel driver in use: cxgb3
>         Kernel modules: cxgb3
>
>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCHv2 3/4] pci: Determine actual VPD size on first access
  2016-08-09 12:54     ` Alexey Kardashevskiy
  2016-08-09 18:12       ` Alexander Duyck
@ 2016-08-09 23:59       ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 37+ messages in thread
From: Benjamin Herrenschmidt @ 2016-08-09 23:59 UTC (permalink / raw)
  To: Alexey Kardashevskiy, Bjorn Helgaas, Hannes Reinecke
  Cc: Alexander Duyck, linux-pci, linux-kernel, Babu Moger,
	Paul Mackerras, Alex Williamson

> On Tue, 2016-08-09 at 22:54 +1000, Alexey Kardashevskiy wrote:
> The cxgb3 driver is reading the second bit starting from 0xc00 but since
> the size is wrongly detected as 0x7c, VFIO blocks access beyond it and the
> guest driver fails to probe.
> 
> I also cannot find a clause in the PCI 3.0 spec saying that there must be
> just a single block, is it there?
> 
> What would the correct fix be? Scanning all 32k of VPD is not an option I
> suppose as this is what this patch is trying to avoid. Thanks.

Additionally, Hannes, Alex, I argue that for platforms with proper HW isolation
(such as ppc with EEH), we shouldn't have VFIO try to virtualize that stuff.

It's the same problem with the bloody MSIs. Just let the guest config space
accesses go straight through. Its drivers knows better what the HW needs and
if it crashes the card, too bad for that guest.

That being said, we don't have fine grained per-device PERST control on
all systems so there may not be recovery from that but on the other hand,
our constant attempts at "filtering" what the guest does to the HW is
imho, doomed.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCHv2 3/4] pci: Determine actual VPD size on first access
  2016-08-09 18:12       ` Alexander Duyck
@ 2016-08-10  0:03         ` Benjamin Herrenschmidt
  2016-08-10 15:47           ` Alexander Duyck
  2016-08-10  6:23         ` Hannes Reinecke
  1 sibling, 1 reply; 37+ messages in thread
From: Benjamin Herrenschmidt @ 2016-08-10  0:03 UTC (permalink / raw)
  To: Alexander Duyck, Alexey Kardashevskiy
  Cc: Bjorn Helgaas, Hannes Reinecke, linux-pci, linux-kernel,
	Babu Moger, Paul Mackerras, Alex Williamson, santosh, Netdev

On Tue, 2016-08-09 at 11:12 -0700, Alexander Duyck wrote:
> 
> The PCI spec is what essentially assumes that there is only one block.
> If I am not mistaken in the case of this device the second block here
> actually contains device configuration data, not actual VPD data.  The
> issue here is that the second block is being accessed as VPD when it
> isn't.

Devices do funny things with config space, film at 11. VFIO trying to
be the middle man and intercept/interpret things is broken, cannot work,
will never work, will just results in lots and lots of useless code, but
I've been singing that song for too long and nobody seems to care...

> > > #0000 Large item 42 bytes; name 0x2 Identifier String
> > #002d Large item 74 bytes; name 0x10
> > #007a Small item 1 bytes; name 0xf End Tag
> > ---
> > #0c00 Large item 16 bytes; name 0x2 Identifier String
> > #0c13 Large item 234 bytes; name 0x10
> > #0d00 Large item 252 bytes; name 0x11
> > #0dff Small item 0 bytes; name 0xf End Tag
> 
> The second block here is driver proprietary setup bits.

Right. They happen to be in VPD on this device. They an be elsewhere on
other devices. In between capabilities on some, in vendor caps on others...

> > > The cxgb3 driver is reading the second bit starting from 0xc00 but since
> > the size is wrongly detected as 0x7c, VFIO blocks access beyond it and the
> > guest driver fails to probe.
> >
> > I also cannot find a clause in the PCI 3.0 spec saying that there must be
> > just a single block, is it there?
> 
> > The problem is we need to be able to parse it.

We can parse the standard part for generic stuff like inventory tools
or lsvpd, but we shouldn't get in the way of the driver poking at its
device.

>   The spec defines a
> series of tags that can be used starting at offset 0.  That is how we
> are supposed to get around through the VPD data.  The problem is we
> can't have more than one end tag and what appears to be happening here
> is that we are defining a second block of data which uses the same
> formatting as VPD but is not VPD.
> 
> > What would the correct fix be? Scanning all 32k of VPD is not an option I
> > suppose as this is what this patch is trying to avoid. Thanks.
> 
> I adding the current cxgb3 maintainer and netdev list to the Cc.  This
> is something that can probably be addressed via a PCI quirk as what
> needs to happen is that we need to extend the VPD in the case of this
> part in order to include this second block.  As long as we can read
> the VPD data all the way out to 0xdff odds are we could probably just
> have the size arbitrarily increased to 0xe00 via the quirk and then
> you would be able to access all of the VPD for the device.  We already
> have code making other modifications to drivers/pci/quirks.c for
> several Broadcom devices and probably just need something similar to
> allow extended access in the case of these devices.


> > >
> >
> >
> > This is the device:
> >
> > > [aik@p81-p9 ~]$ sudo lspci -vvnns 0001:03:00.0
> > 0001:03:00.0 Ethernet controller [0200]: Chelsio Communications Inc T310
> > 10GbE Single Port Adapter [1425:0030]
> >         Subsystem: IBM Device [1014:038c]
> >         Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> > Stepping- SERR- FastB2B- DisINTx+
> >         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
> > <MAbort- >SERR- <PERR- INTx-
> >         Latency: 0
> >         Interrupt: pin A routed to IRQ 494
> >         Region 0: Memory at 3fe080880000 (64-bit, non-prefetchable) [size=4K]
> >         Region 2: Memory at 3fe080000000 (64-bit, non-prefetchable) [size=8M]
> >         Region 4: Memory at 3fe080881000 (64-bit, non-prefetchable) [size=4K]
> >         [virtual] Expansion ROM at 3fe080800000 [disabled] [size=512K]
> >         Capabilities: [40] Power Management version 3
> >                 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold-)
> >                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
> >         Capabilities: [48] MSI: Enable- Count=1/32 Maskable- 64bit+
> >                 Address: 0000000000000000  Data: 0000
> >         Capabilities: [58] Express (v2) Endpoint, MSI 00
> >                 DevCap: MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
> >                         ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
> >                 DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported+
> >                         RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
> >                         MaxPayload 256 bytes, MaxReadReq 512 bytes
> >                 DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
> >                 LnkCap: Port #0, Speed 2.5GT/s, Width x8, ASPM L0s L1, Exit Latency L0s
> > unlimited, L1 unlimited
> >                         ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
> >                 LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
> >                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> >                 LnkSta: Speed 2.5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt-
> > ABWMgmt-
> >                 DevCap2: Completion Timeout: Range ABC, TimeoutDis-, LTR-, OBFF Not Supported
> >                 DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
> >                 LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
> >                          Transmit Margin: Normal Operating Range, EnterModifiedCompliance-
> > ComplianceSOS-
> >                          Compliance De-emphasis: -6dB
> >                 LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-,
> > EqualizationPhase1-
> >                          EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
> >         Capabilities: [94] Vital Product Data
> >                 Product Name: 10 Gigabit Ethernet-SR PCI Express Adapter
> >                 Read-only fields:
> >                         [EC] Engineering changes: D76809
> >                         [FN] Unknown: 34 36 4b 37 38 39 37
> >                         [PN] Part number: 46K7897
> >                         [MN] Manufacture ID: 31 30 33 37
> >                         [FC] Unknown: 35 37 36 39
> >                         [SN] Serial number: YL102035603V
> >                         [NA] Unknown: 30 30 31 34 35 45 39 39 32 45 44 31
> >                 End
> >         Capabilities: [9c] MSI-X: Enable+ Count=32 Masked-
> >                 Vector table: BAR=4 offset=00000000
> >                 PBA: BAR=4 offset=00000800
> >         Capabilities: [100 v1] Device Serial Number 00-00-00-01-00-00-00-01
> >         Capabilities: [300 v1] Advanced Error Reporting
> >                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP-
> > ECRC- UnsupReq- ACSViol-
> >                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP-
> > ECRC- UnsupReq- ACSViol-
> >                 UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+
> > ECRC- UnsupReq- ACSViol-
> >                 CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
> >                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> >                 AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
> >         Kernel driver in use: cxgb3
> >         Kernel modules: cxgb3
> >
> >

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCHv2 3/4] pci: Determine actual VPD size on first access
  2016-08-09 18:12       ` Alexander Duyck
  2016-08-10  0:03         ` Benjamin Herrenschmidt
@ 2016-08-10  6:23         ` Hannes Reinecke
  2016-08-11 10:03           ` [RFC PATCH kernel] PCI: Enable access to custom VPD for Chelsio devices (cxgb3) Alexey Kardashevskiy
  1 sibling, 1 reply; 37+ messages in thread
From: Hannes Reinecke @ 2016-08-10  6:23 UTC (permalink / raw)
  To: Alexander Duyck, Alexey Kardashevskiy
  Cc: Bjorn Helgaas, linux-pci, linux-kernel, Babu Moger,
	Paul Mackerras, Alex Williamson, Benjamin Herrenschmidt, santosh,
	Netdev

On 08/09/2016 08:12 PM, Alexander Duyck wrote:
> On Tue, Aug 9, 2016 at 5:54 AM, Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>> On 10/02/16 08:04, Bjorn Helgaas wrote:
>>> On Wed, Jan 13, 2016 at 12:25:34PM +0100, Hannes Reinecke wrote:
>>>> PCI-2.2 VPD entries have a maximum size of 32k, but might actually
>>>> be smaller than that. To figure out the actual size one has to read
>>>> the VPD area until the 'end marker' is reached.
>>>> Trying to read VPD data beyond that marker results in 'interesting'
>>>> effects, from simple read errors to crashing the card. And to make
>>>> matters worse not every PCI card implements this properly, leaving
>>>> us with no 'end' marker or even completely invalid data.
>>>> This path tries to determine the size of the VPD data.
>>>> If no valid data can be read an I/O error will be returned when
>>>> reading the sysfs attribute.
>>
>>
>> I have a problem with this particular feature as today VFIO uses this
>> pci_vpd_xxxx API to virtualize access to VPD and the existing code assumes
>> there is just one VPD block with 0x2 start and 0xf end. However I have at
>> least one device where this is not true - "10 Gigabit Ethernet-SR PCI
>> Express Adapter" - it has 2 blocks (made a script to read/parse it as
>> /sys/bus/pci/devices/0001\:03\:00.0/vpd shows it wrong):
> 
> The PCI spec is what essentially assumes that there is only one block.
> If I am not mistaken in the case of this device the second block here
> actually contains device configuration data, not actual VPD data.  The
> issue here is that the second block is being accessed as VPD when it
> isn't.
> 
>> #0000 Large item 42 bytes; name 0x2 Identifier String
>> #002d Large item 74 bytes; name 0x10
>> #007a Small item 1 bytes; name 0xf End Tag
>> ---
>> #0c00 Large item 16 bytes; name 0x2 Identifier String
>> #0c13 Large item 234 bytes; name 0x10
>> #0d00 Large item 252 bytes; name 0x11
>> #0dff Small item 0 bytes; name 0xf End Tag
> 
> The second block here is driver proprietary setup bits.
> 
>> The cxgb3 driver is reading the second bit starting from 0xc00 but since
>> the size is wrongly detected as 0x7c, VFIO blocks access beyond it and the
>> guest driver fails to probe.
>>
>> I also cannot find a clause in the PCI 3.0 spec saying that there must be
>> just a single block, is it there?
> 
> The problem is we need to be able to parse it.  The spec defines a
> series of tags that can be used starting at offset 0.  That is how we
> are supposed to get around through the VPD data.  The problem is we
> can't have more than one end tag and what appears to be happening here
> is that we are defining a second block of data which uses the same
> formatting as VPD but is not VPD.
> 
>> What would the correct fix be? Scanning all 32k of VPD is not an option I
>> suppose as this is what this patch is trying to avoid. Thanks.
> 
> I adding the current cxgb3 maintainer and netdev list to the Cc.  This
> is something that can probably be addressed via a PCI quirk as what
> needs to happen is that we need to extend the VPD in the case of this
> part in order to include this second block.  As long as we can read
> the VPD data all the way out to 0xdff odds are we could probably just
> have the size arbitrarily increased to 0xe00 via the quirk and then
> you would be able to access all of the VPD for the device.  We already
> have code making other modifications to drivers/pci/quirks.c for
> several Broadcom devices and probably just need something similar to
> allow extended access in the case of these devices.
> 
Yes, that's what I think, too.
The Broadcom quirk should work here, too.
(Didn't we do that already?)

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCHv2 3/4] pci: Determine actual VPD size on first access
  2016-08-10  0:03         ` Benjamin Herrenschmidt
@ 2016-08-10 15:47           ` Alexander Duyck
  2016-08-10 23:54             ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 37+ messages in thread
From: Alexander Duyck @ 2016-08-10 15:47 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Alexey Kardashevskiy, Bjorn Helgaas, Hannes Reinecke, linux-pci,
	linux-kernel, Babu Moger, Paul Mackerras, Alex Williamson,
	santosh, Netdev

On Tue, Aug 9, 2016 at 5:03 PM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Tue, 2016-08-09 at 11:12 -0700, Alexander Duyck wrote:
>>
>> The PCI spec is what essentially assumes that there is only one block.
>> If I am not mistaken in the case of this device the second block here
>> actually contains device configuration data, not actual VPD data.  The
>> issue here is that the second block is being accessed as VPD when it
>> isn't.
>
> Devices do funny things with config space, film at 11. VFIO trying to
> be the middle man and intercept/interpret things is broken, cannot work,
> will never work, will just results in lots and lots of useless code, but
> I've been singing that song for too long and nobody seems to care...
>
>> > > #0000 Large item 42 bytes; name 0x2 Identifier String
>> > #002d Large item 74 bytes; name 0x10
>> > #007a Small item 1 bytes; name 0xf End Tag
>> > ---
>> > #0c00 Large item 16 bytes; name 0x2 Identifier String
>> > #0c13 Large item 234 bytes; name 0x10
>> > #0d00 Large item 252 bytes; name 0x11
>> > #0dff Small item 0 bytes; name 0xf End Tag
>>
>> The second block here is driver proprietary setup bits.
>
> Right. They happen to be in VPD on this device. They an be elsewhere on
> other devices. In between capabilities on some, in vendor caps on others...
>
>> > > The cxgb3 driver is reading the second bit starting from 0xc00 but since
>> > the size is wrongly detected as 0x7c, VFIO blocks access beyond it and the
>> > guest driver fails to probe.
>> >
>> > I also cannot find a clause in the PCI 3.0 spec saying that there must be
>> > just a single block, is it there?
>>
>> > The problem is we need to be able to parse it.
>
> We can parse the standard part for generic stuff like inventory tools
> or lsvpd, but we shouldn't get in the way of the driver poking at its
> device.

If we add the quirk to the kernel that reports the VPD for this device
is the actual size of both blocks then we wouldn't be blocking the VPD
access like we currently are.

The problem is if we don't do this it becomes possible for a guest to
essentially cripple a device on the host by just accessing VPD regions
that aren't actually viable on many devices.  We are much better off
in terms of security and stability if we restrict access to what
should be accessible.  In this case what has happened is that the
vendor threw in an extra out-of-spec block and just expected it to
work.  In order to work around it we just need to add a small function
to drivers/pci/quirks.c that would update the VPD size reported so
that it matches what the hardware is actually providing instead of
what we can determine based on the VPD layout.

Really working around something like this is not much different than
what we would have to do if the vendor had stuffed the data in some
reserved section of their PCI configuration space.  We end up needing
to add special quirks any time a vendor goes out-of-spec for some
one-off configuration interface that only they are ever going to use.

- Alex

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCHv2 3/4] pci: Determine actual VPD size on first access
  2016-08-10 15:47           ` Alexander Duyck
@ 2016-08-10 23:54             ` Benjamin Herrenschmidt
  2016-08-11 18:52               ` Alexander Duyck
  0 siblings, 1 reply; 37+ messages in thread
From: Benjamin Herrenschmidt @ 2016-08-10 23:54 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Alexey Kardashevskiy, Bjorn Helgaas, Hannes Reinecke, linux-pci,
	linux-kernel, Babu Moger, Paul Mackerras, Alex Williamson,
	santosh, Netdev

On Wed, 2016-08-10 at 08:47 -0700, Alexander Duyck wrote:
> 
> The problem is if we don't do this it becomes possible for a guest to
> essentially cripple a device on the host by just accessing VPD
> regions that aren't actually viable on many devices. 

And ? We already can cripple the device in so many different ways
simpy because we have pretty much full BAR access to it...

>  We are much better off
> in terms of security and stability if we restrict access to what
> should be accessible. 

Bollox. I've heard that argument over and over again, it never stood
and still doesn't.

We have full BAR access for god sake. We can already destroy the device
in many cases (think: reflashing microcode, internal debug bus access
with a route to the config space, voltage/freq control ....).

We aren't protecting anything more here, we are just adding layers of
bloat, complication and bugs.

>  In this case what has happened is that the
> vendor threw in an extra out-of-spec block and just expected it to
> work.

Like vendors do all the time in all sort of places

I still completely fail to see the point in acting as a filtering
middle man.

> In order to work around it we just need to add a small function
> to drivers/pci/quirks.c that would update the VPD size reported so
> that it matches what the hardware is actually providing instead of
> what we can determine based on the VPD layout.
> 
> Really working around something like this is not much different than
> what we would have to do if the vendor had stuffed the data in some
> reserved section of their PCI configuration space.

It is, in both cases we shouldn't have VFIO or the host involved. We
should just let the guest config space accesses go through.

>   We end up needing
> to add special quirks any time a vendor goes out-of-spec for some
> one-off configuration interface that only they are ever going to use.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [RFC PATCH kernel] PCI: Enable access to custom VPD for Chelsio devices (cxgb3)
  2016-08-10  6:23         ` Hannes Reinecke
@ 2016-08-11 10:03           ` Alexey Kardashevskiy
  2016-09-06 15:48             ` Bjorn Helgaas
  0 siblings, 1 reply; 37+ messages in thread
From: Alexey Kardashevskiy @ 2016-08-11 10:03 UTC (permalink / raw)
  To: netdev
  Cc: Alexey Kardashevskiy, Bjorn Helgaas, Santosh Raspatur,
	Alex Williamson, Paul Mackerras, Alexander Duyck,
	Hannes Reinecke, linux-kernel, linux-pci

There is at least one Chelsio 10Gb card which uses VPD area to store
some custom blocks (example below). However pci_vpd_size() returns
the length of the first block only assuming that there can be only
one VPD "End Tag" and VFIO blocks access beyond that offset
(since 4e1a63555) which leads to the situation when the guest "cxgb3"
driver fails to probe the device. The host system does not have this
problem as the drives accesses the config space directly without
pci_read_vpd()/...

This adds a quirk to override the VPD size to a bigger value.

This is the controller:
Ethernet controller [0200]: Chelsio Communications Inc T310 10GbE Single Port Adapter [1425:0030]

This is its VPD:
#0000 Large item 42 bytes; name 0x2 Identifier String
	b'10 Gigabit Ethernet-SR PCI Express Adapter'
#002d Large item 74 bytes; name 0x10
	#00 [EC] len=7: b'D76809 '
	#0a [FN] len=7: b'46K7897'
	#14 [PN] len=7: b'46K7897'
	#1e [MN] len=4: b'1037'
	#25 [FC] len=4: b'5769'
	#2c [SN] len=12: b'YL102035603V'
	#3b [NA] len=12: b'00145E992ED1'
#007a Small item 1 bytes; name 0xf End Tag
---
#0c00 Large item 16 bytes; name 0x2 Identifier String
	b'S310E-SR-X      '
#0c13 Large item 234 bytes; name 0x10
	#00 [PN] len=16: b'TBD             '
	#13 [EC] len=16: b'110107730D2     '
	#26 [SN] len=16: b'97YL102035603V  '
	#39 [NA] len=12: b'00145E992ED1'
	#48 [V0] len=6: b'175000'
	#51 [V1] len=6: b'266666'
	#5a [V2] len=6: b'266666'
	#63 [V3] len=6: b'2000  '
	#6c [V4] len=2: b'1 '
	#71 [V5] len=6: b'c2    '
	#7a [V6] len=6: b'0     '
	#83 [V7] len=2: b'1 '
	#88 [V8] len=2: b'0 '
	#8d [V9] len=2: b'0 '
	#92 [VA] len=2: b'0 '
	#97 [RV] len=80: b's\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'...
#0d00 Large item 252 bytes; name 0x11
	#00 [VC] len=16: b'122310_1222 dp  '
	#13 [VD] len=16: b'610-0001-00 H1\x00\x00'
	#26 [VE] len=16: b'122310_1353 fp  '
	#39 [VF] len=16: b'610-0001-00 H1\x00\x00'
	#4c [RW] len=173: b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'...
#0dff Small item 0 bytes; name 0xf End Tag

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
 drivers/pci/quirks.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index ee72ebe1..94d3fb5 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3241,6 +3241,18 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_CACTUS_RIDGE_4C
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_PORT_RIDGE,
 			quirk_thunderbolt_hotplug_msi);
 
+static void quirk_chelsio_extend_vpd(struct pci_dev *dev)
+{
+	if (!dev->vpd || !dev->vpd->ops || !dev->vpd->ops->set_size)
+		return;
+
+	dev->vpd->ops->set_size(dev, max_t(unsigned int, dev->vpd->len, 0xe00));
+}
+
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CHELSIO,
+			PCI_ANY_ID,
+			quirk_chelsio_extend_vpd);
+
 #ifdef CONFIG_ACPI
 /*
  * Apple: Shutdown Cactus Ridge Thunderbolt controller.
-- 
2.5.0.rc3

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCHv2 3/4] pci: Determine actual VPD size on first access
  2016-08-10 23:54             ` Benjamin Herrenschmidt
@ 2016-08-11 18:52               ` Alexander Duyck
  2016-08-11 20:17                 ` Alex Williamson
  2016-08-16  1:40                 ` Alexey Kardashevskiy
  0 siblings, 2 replies; 37+ messages in thread
From: Alexander Duyck @ 2016-08-11 18:52 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Alexey Kardashevskiy, Bjorn Helgaas, Hannes Reinecke, linux-pci,
	linux-kernel, Babu Moger, Paul Mackerras, Alex Williamson,
	santosh, Netdev

On Wed, Aug 10, 2016 at 4:54 PM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Wed, 2016-08-10 at 08:47 -0700, Alexander Duyck wrote:
>>
>> The problem is if we don't do this it becomes possible for a guest to
>> essentially cripple a device on the host by just accessing VPD
>> regions that aren't actually viable on many devices.
>
> And ? We already can cripple the device in so many different ways
> simpy because we have pretty much full BAR access to it...
>
>>  We are much better off
>> in terms of security and stability if we restrict access to what
>> should be accessible.
>
> Bollox. I've heard that argument over and over again, it never stood
> and still doesn't.
>
> We have full BAR access for god sake. We can already destroy the device
> in many cases (think: reflashing microcode, internal debug bus access
> with a route to the config space, voltage/freq control ....).
>
> We aren't protecting anything more here, we are just adding layers of
> bloat, complication and bugs.

To some extent I can agree with you.  I don't know if we should be
restricting the VFIO based interface the same way we restrict systemd
from accessing this region.  In the case of VFIO maybe we need to look
at a different approach for accessing this.  Perhaps we need a
privileged version of the VPD accessors that could be used by things
like VFIO and the cxgb3 driver since they are assumed to be a bit
smarter than those interfaces that were just trying to slurp up
something like 4K of VPD data.

>>  In this case what has happened is that the
>> vendor threw in an extra out-of-spec block and just expected it to
>> work.
>
> Like vendors do all the time in all sort of places
>
> I still completely fail to see the point in acting as a filtering
> middle man.

The problem is we are having to do some filtering because things like
systemd were using dumb accessors that were trying to suck down 4K of
VPD data instead of trying to parse through and read it a field at a
time.

>> In order to work around it we just need to add a small function
>> to drivers/pci/quirks.c that would update the VPD size reported so
>> that it matches what the hardware is actually providing instead of
>> what we can determine based on the VPD layout.
>>
>> Really working around something like this is not much different than
>> what we would have to do if the vendor had stuffed the data in some
>> reserved section of their PCI configuration space.
>
> It is, in both cases we shouldn't have VFIO or the host involved. We
> should just let the guest config space accesses go through.
>
>>   We end up needing
>> to add special quirks any time a vendor goes out-of-spec for some
>> one-off configuration interface that only they are ever going to use.
>
> Cheers,
> Ben.

If you have a suggestion on how to resolve this patches are always
welcome.  Otherwise I think the simpler approach to fixing this
without re-introducing the existing bugs is to just add the quirk.  I
will try to get to it sometime this weekend if nobody else does.  It
should be pretty straight foward, but I just don't have the time to
pull up a kernel and generate a patch right now.

- Alex

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCHv2 3/4] pci: Determine actual VPD size on first access
  2016-08-11 18:52               ` Alexander Duyck
@ 2016-08-11 20:17                 ` Alex Williamson
  2016-08-12  5:11                   ` Benjamin Herrenschmidt
  2016-08-16  1:40                 ` Alexey Kardashevskiy
  1 sibling, 1 reply; 37+ messages in thread
From: Alex Williamson @ 2016-08-11 20:17 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Benjamin Herrenschmidt, Alexey Kardashevskiy, Bjorn Helgaas,
	Hannes Reinecke, linux-pci, linux-kernel, Babu Moger,
	Paul Mackerras, santosh, Netdev

On Thu, 11 Aug 2016 11:52:02 -0700
Alexander Duyck <alexander.duyck@gmail.com> wrote:

> On Wed, Aug 10, 2016 at 4:54 PM, Benjamin Herrenschmidt
> <benh@kernel.crashing.org> wrote:
> > On Wed, 2016-08-10 at 08:47 -0700, Alexander Duyck wrote:  
> >>
> >> The problem is if we don't do this it becomes possible for a guest to
> >> essentially cripple a device on the host by just accessing VPD
> >> regions that aren't actually viable on many devices.  
> >
> > And ? We already can cripple the device in so many different ways
> > simpy because we have pretty much full BAR access to it...
> >  
> >>  We are much better off
> >> in terms of security and stability if we restrict access to what
> >> should be accessible.  
> >
> > Bollox. I've heard that argument over and over again, it never stood
> > and still doesn't.
> >
> > We have full BAR access for god sake. We can already destroy the device
> > in many cases (think: reflashing microcode, internal debug bus access
> > with a route to the config space, voltage/freq control ....).
> >
> > We aren't protecting anything more here, we are just adding layers of
> > bloat, complication and bugs.  
> 
> To some extent I can agree with you.  I don't know if we should be
> restricting the VFIO based interface the same way we restrict systemd
> from accessing this region.  In the case of VFIO maybe we need to look
> at a different approach for accessing this.  Perhaps we need a
> privileged version of the VPD accessors that could be used by things
> like VFIO and the cxgb3 driver since they are assumed to be a bit
> smarter than those interfaces that were just trying to slurp up
> something like 4K of VPD data.
> 
> >>  In this case what has happened is that the
> >> vendor threw in an extra out-of-spec block and just expected it to
> >> work.  
> >
> > Like vendors do all the time in all sort of places
> >
> > I still completely fail to see the point in acting as a filtering
> > middle man.  
> 
> The problem is we are having to do some filtering because things like
> systemd were using dumb accessors that were trying to suck down 4K of
> VPD data instead of trying to parse through and read it a field at a
> time.

vfio isn't playing nanny here for the fun of it, part of the reason we
have vpd access functions is because folks have discovered that vpd
registers between PCI functions on multi-function devices may be
shared.  So pounding on vpd registers for function 1 may adversely
affect someone else reading from a different function.  This is a case
where I feel vfio needs to step in because if that's a user competing
with the host or two users stepping on each other, well that's exactly
what vfio tries to prevent.  A driver in userspace or a VM driver can't
very well determine these sorts of interactions when it only has
visibility to a subset of the functions and users and hardware folks
would throw a fit if I extended iommu groups to encompass all the
related devices rather than take the relatively simple step of
virtualizing these accesses and occasionally quirking devices that are
extra broken, as seems to be required here.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCHv2 3/4] pci: Determine actual VPD size on first access
  2016-08-11 20:17                 ` Alex Williamson
@ 2016-08-12  5:11                   ` Benjamin Herrenschmidt
  2016-08-15 17:59                     ` Rustad, Mark D
  0 siblings, 1 reply; 37+ messages in thread
From: Benjamin Herrenschmidt @ 2016-08-12  5:11 UTC (permalink / raw)
  To: Alex Williamson, Alexander Duyck
  Cc: Alexey Kardashevskiy, Bjorn Helgaas, Hannes Reinecke, linux-pci,
	linux-kernel, Babu Moger, Paul Mackerras, santosh, Netdev

On Thu, 2016-08-11 at 14:17 -0600, Alex Williamson wrote:
> 
> vfio isn't playing nanny here for the fun of it, part of the reason we
> have vpd access functions is because folks have discovered that vpd
> registers between PCI functions on multi-function devices may be
> shared.  So pounding on vpd registers for function 1 may adversely
> > affect someone else reading from a different function.

A lot more than that may be shared... the bus for example. lots of
devices have backdoors into their own BARs, one partition can make its
BARs overlap the neighbour function, ... dang !

In the end, PCI assignment at a granularity smaller than a bus cannot
be done in a fully air tight way and shouldn't be relied upon in
environemnt where the isolation between partitions matters.

The only exception here is SR-IOV of course since it's designed
specifically so that the VFs can't be messed with.

>   This is a case
> where I feel vfio needs to step in because if that's a user competing
> with the host or two users stepping on each other, well that's exactly
> what vfio tries to prevent.  A driver in userspace or a VM driver can't
> very well determine these sorts of interactions when it only has
> visibility to a subset of the functions and users and hardware folks
> would throw a fit if I extended iommu groups to encompass all the
> related devices rather than take the relatively simple step of
> virtualizing these accesses and occasionally quirking devices that are
> extra broken, as seems to be required here.  Thanks,

We may want some kind of "strict" vs. "relaxed" model here to
differenciate the desktop user wanting to give a function to his/her
windows partition and doesn't care about strict isolation vs. the cloud
data center.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCHv2 3/4] pci: Determine actual VPD size on first access
  2016-08-12  5:11                   ` Benjamin Herrenschmidt
@ 2016-08-15 17:59                     ` Rustad, Mark D
  2016-08-15 22:23                       ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 37+ messages in thread
From: Rustad, Mark D @ 2016-08-15 17:59 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Alex Williamson, Alexander Duyck, Alexey Kardashevskiy,
	Bjorn Helgaas, Hannes Reinecke, linux-pci, linux-kernel,
	Babu Moger, Paul Mackerras, santosh, Netdev

[-- Attachment #1: Type: text/plain, Size: 498 bytes --]

Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:

> We may want some kind of "strict" vs. "relaxed" model here to
> differenciate the desktop user wanting to give a function to his/her
> windows partition and doesn't care about strict isolation vs. the cloud
> data center.

I don't think desktop users appreciate hangs any more than anyone else, and  
that is one of the symptoms that can arise here without the vfio  
coordination.

--
Mark Rustad, Networking Division, Intel Corporation

[-- Attachment #2: Message signed with OpenPGP using GPGMail --]
[-- Type: application/pgp-signature, Size: 841 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCHv2 3/4] pci: Determine actual VPD size on first access
  2016-08-15 17:59                     ` Rustad, Mark D
@ 2016-08-15 22:23                       ` Benjamin Herrenschmidt
  2016-08-15 22:33                         ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 37+ messages in thread
From: Benjamin Herrenschmidt @ 2016-08-15 22:23 UTC (permalink / raw)
  To: Rustad, Mark D
  Cc: Alex Williamson, Alexander Duyck, Alexey Kardashevskiy,
	Bjorn Helgaas, Hannes Reinecke, linux-pci, linux-kernel,
	Babu Moger, Paul Mackerras, santosh, Netdev

On Mon, 2016-08-15 at 17:59 +0000, Rustad, Mark D wrote:
> > Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> 
> > 
> > We may want some kind of "strict" vs. "relaxed" model here to
> > differenciate the desktop user wanting to give a function to his/her
> > windows partition and doesn't care about strict isolation vs. the cloud
> > data center.
> 
> I don't think desktop users appreciate hangs any more than anyone else, and  
> that is one of the symptoms that can arise here without the vfio  
> coordination.

And can happen with it as well.... 

Put it this way, we have a hole the size of a football field and we are
spending all that time "plugging" it by putting a 3 foot gate in the
middle of it... ugh.

VFIO shouldn't try to get between the device driver and the HW, it doesn't
work. It cannot work.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCHv2 3/4] pci: Determine actual VPD size on first access
  2016-08-15 22:23                       ` Benjamin Herrenschmidt
@ 2016-08-15 22:33                         ` Benjamin Herrenschmidt
  2016-08-15 23:16                           ` Rustad, Mark D
  0 siblings, 1 reply; 37+ messages in thread
From: Benjamin Herrenschmidt @ 2016-08-15 22:33 UTC (permalink / raw)
  To: Rustad, Mark D
  Cc: Alex Williamson, Alexander Duyck, Alexey Kardashevskiy,
	Bjorn Helgaas, Hannes Reinecke, linux-pci, linux-kernel,
	Babu Moger, Paul Mackerras, santosh, Netdev

On Tue, 2016-08-16 at 08:23 +1000, Benjamin Herrenschmidt wrote:
> > I don't think desktop users appreciate hangs any more than anyone else, and  
> > that is one of the symptoms that can arise here without the vfio  
> > coordination.
> 
> And can happen with it as well.... 

Oh and your response was completely besides the point I was trying
to make, just some passive aggressive noise, thank you.

The point was that if you want the sort of safety that we are trying
to aim for, without additional HW support, you basically have to do
the isolation on a granularity no smaller than a bridge/bus (with the
notable exception of SR-IOV of course).

Otherwise guest *will* be able to harm each other (or the host) in
all sorts of ways if anything because devices will have MMIO backdoors
into their own BAR space, or can be made to DMA to a neighbour, etc...

This is the only safe thing to do (and we are enforcing that on POWER
with our IOMMU groups).

Now that being said, if you want to keep the ability to assign 2 functions
of a device to different guests for your average non-critical desktop user,
that's where you may want to consider two models.

Generally speaking, filtering things for "safety" won't work.

Filtering things to work around bugs in existing guests to avoid crashes
is a different kettle of fish and could be justified but keep in mind that
in most cases a malicious guest will be able to exploit those HW flaws.

Assuming that a device coming back from a guest is functional and not
completely broken and can be re-used without a full PERST or power cycle
is a wrong assumption. It may or may not work, no amount of "filtering"
will fix the fundamental issue. If your HW won't give you access to PERST
well ... blame Intel for not specifying a standard way to generate it in
the first place :-)

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCHv2 3/4] pci: Determine actual VPD size on first access
  2016-08-15 22:33                         ` Benjamin Herrenschmidt
@ 2016-08-15 23:16                           ` Rustad, Mark D
  2016-08-16  0:13                             ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 37+ messages in thread
From: Rustad, Mark D @ 2016-08-15 23:16 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Alex Williamson, Alexander Duyck, Alexey Kardashevskiy,
	Bjorn Helgaas, Hannes Reinecke, linux-pci, linux-kernel,
	Babu Moger, Paul Mackerras, santosh, Netdev

[-- Attachment #1: Type: text/plain, Size: 2688 bytes --]

Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:

> Filtering things to work around bugs in existing guests to avoid crashes
> is a different kettle of fish and could be justified but keep in mind that
> in most cases a malicious guest will be able to exploit those HW flaws.

Bugs in existing guests is an interesting case, but I have been focused on  
getting acceptable behavior from a properly functioning guest, in the face  
of hardware issues that can only be resolved in a single place.

I agree that a malicious guest can cause all kinds of havoc with  
directly-assigned devices. Consider a 4-port PHY chip on a shared MDIO bus,  
for instance. There is really nothing to be done about the potential for  
mischief with that kind of thing.

The VPD problem that I had been concerned about arises from a bad design in  
the PCI spec together with implementations that share the registers across  
functions. The hardware isn't going to change and I really doubt that the  
spec will either, so we address it the only place we can.

I am certain that we agree that not everything can or should be addressed  
in vfio. I did not mean to suggest it should try to address everything, but  
I think it should make it possible for correctly behaving guests to work. I  
think that is not unreasonable.

Perhaps the VPD range check should really just have been implemented for  
the sysfs interface, and left the vfio case unchecked. I don't know because  
I was not involved in that issue. Perhaps someone more intimately involved  
can comment on that notion.

> Assuming that a device coming back from a guest is functional and not
> completely broken and can be re-used without a full PERST or power cycle
> is a wrong assumption. It may or may not work, no amount of "filtering"
> will fix the fundamental issue. If your HW won't give you access to PERST
> well ... blame Intel for not specifying a standard way to generate it in
> the first place :-)

Yeah, I worry about the state that a malicious guest could leave a device  
in, but I consider direct assignment always risky anyway. I would just like  
it to at least work in the non-malicious guest cases.

I guess my previous response was really just too terse, I was just focused  
on unavoidable hangs and data corruption, which even were happening without  
any guest involvement. For me, guests were just an additional exposure of  
the same underlying issue.

With hindsight, it is easy to see that a standard reset would now be a  
pretty useful thing. I am sure that even if it existed, we would now have  
lots and lots of quirks around it as well! :-)

--
Mark Rustad, Networking Division, Intel Corporation

[-- Attachment #2: Message signed with OpenPGP using GPGMail --]
[-- Type: application/pgp-signature, Size: 841 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCHv2 3/4] pci: Determine actual VPD size on first access
  2016-08-15 23:16                           ` Rustad, Mark D
@ 2016-08-16  0:13                             ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 37+ messages in thread
From: Benjamin Herrenschmidt @ 2016-08-16  0:13 UTC (permalink / raw)
  To: Rustad, Mark D
  Cc: Alex Williamson, Alexander Duyck, Alexey Kardashevskiy,
	Bjorn Helgaas, Hannes Reinecke, linux-pci, linux-kernel,
	Babu Moger, Paul Mackerras, santosh, Netdev

On Mon, 2016-08-15 at 23:16 +0000, Rustad, Mark D wrote:
> 
> Bugs in existing guests is an interesting case, but I have been focused on  
> getting acceptable behavior from a properly functioning guest, in the face  
> of hardware issues that can only be resolved in a single place.
> 
> I agree that a malicious guest can cause all kinds of havoc with  
> directly-assigned devices. Consider a 4-port PHY chip on a shared MDIO bus,  
> for instance. There is really nothing to be done about the potential for  
> mischief with that kind of thing.
> 
> The VPD problem that I had been concerned about arises from a bad design in  
> the PCI spec together with implementations that share the registers across  
> functions. The hardware isn't going to change and I really doubt that the  
> spec will either, so we address it the only place we can.

Right but as I mentioned, there are plenty of holes when it comes to
having multi function devices assigned to different guests, this is just
one of them.

Now, that being said, "working around" the bug to make the "non fully secure"
case work (in my example case the "desktop user") is probably an ok workaround
as long as we fully agree that this is all it is, it by no means provide
actual isolation or security.
 
> >   >   rtain that we agree that not everything can or should be addressed  
> in vfio. I did not mean to suggest it should try to address everything, but  
> I think it should make it possible for correctly behaving guests to work. I  
> think that is not unreasonable.

Again as long as there is no expectation of security here, such as a data
center giving PCI access to some devices.

> > Perhaps the VPD range check should really just have been implemented for  
> the sysfs interface, and left the vfio case unchecked. I don't know because  
> I was not involved in that issue. Perhaps someone more intimately involved  
> can comment on that notion.

That would have been my preferred approach... I think VFIO tries to do too much
which complicates things, causes other bugs, without briging actual safety. I
don't think it should stand in between a guest driver and its device unless
absolutely necessary to provide the functionality due to broken HW or design,
but with the full understanding that doing so remains unsafe from an isolation
standpoint.

> > > Assuming that a device coming back from a guest is functional and not
> > completely broken and can be re-used without a full PERST or power cycle
> > is a wrong assumption. It may or may not work, no amount of "filtering"
> > will fix the fundamental issue. If your HW won't give you access to PERST
> > well ... blame Intel for not specifying a standard way to generate it in
> > the first place :-)
> 
> Yeah, I worry about the state that a malicious guest could leave a device  
> in, but I consider direct assignment always risky anyway. I would just like  
> it to at least work in the non-malicious guest cases.

Right. Only SR-IOV which is somewhat designed for assignment is reasonably safe
in the general case.

On server POWER boxes, we have isolation at the bus level with usual per-slot
PERST control so we are in a much better situation but we also for all the
above reasons, only allow a slot granularity for pass-through.

> > I guess my previous response was really just too terse, I was just focused  
> on unavoidable hangs and data corruption, which even were happening without  
> any guest involvement. For me, guests were just an additional exposure of  
> the same underlying issue.
> 
> With hindsight, it is easy to see that a standard reset would now be a  
> pretty useful thing. I am sure that even if it existed, we would now have  
> lots and lots of quirks around it as well! :-)

Hehe yes, well, HW for you ...

Cheers,
Ben.

> > --
> Mark Rustad, Networking Division, Intel Corporation

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCHv2 3/4] pci: Determine actual VPD size on first access
  2016-08-11 18:52               ` Alexander Duyck
  2016-08-11 20:17                 ` Alex Williamson
@ 2016-08-16  1:40                 ` Alexey Kardashevskiy
  1 sibling, 0 replies; 37+ messages in thread
From: Alexey Kardashevskiy @ 2016-08-16  1:40 UTC (permalink / raw)
  To: Alexander Duyck, Benjamin Herrenschmidt
  Cc: Bjorn Helgaas, Hannes Reinecke, linux-pci, linux-kernel,
	Babu Moger, Paul Mackerras, Alex Williamson, santosh, Netdev

On 12/08/16 04:52, Alexander Duyck wrote:
> On Wed, Aug 10, 2016 at 4:54 PM, Benjamin Herrenschmidt
> <benh@kernel.crashing.org> wrote:
>> On Wed, 2016-08-10 at 08:47 -0700, Alexander Duyck wrote:
>>>
>>> The problem is if we don't do this it becomes possible for a guest to
>>> essentially cripple a device on the host by just accessing VPD
>>> regions that aren't actually viable on many devices.
>>
>> And ? We already can cripple the device in so many different ways
>> simpy because we have pretty much full BAR access to it...
>>
>>>  We are much better off
>>> in terms of security and stability if we restrict access to what
>>> should be accessible.
>>
>> Bollox. I've heard that argument over and over again, it never stood
>> and still doesn't.
>>
>> We have full BAR access for god sake. We can already destroy the device
>> in many cases (think: reflashing microcode, internal debug bus access
>> with a route to the config space, voltage/freq control ....).
>>
>> We aren't protecting anything more here, we are just adding layers of
>> bloat, complication and bugs.
> 
> To some extent I can agree with you.  I don't know if we should be
> restricting the VFIO based interface the same way we restrict systemd
> from accessing this region.  In the case of VFIO maybe we need to look
> at a different approach for accessing this.  Perhaps we need a
> privileged version of the VPD accessors that could be used by things
> like VFIO and the cxgb3 driver since they are assumed to be a bit
> smarter than those interfaces that were just trying to slurp up
> something like 4K of VPD data.
> 
>>>  In this case what has happened is that the
>>> vendor threw in an extra out-of-spec block and just expected it to
>>> work.
>>
>> Like vendors do all the time in all sort of places
>>
>> I still completely fail to see the point in acting as a filtering
>> middle man.
> 
> The problem is we are having to do some filtering because things like
> systemd were using dumb accessors that were trying to suck down 4K of
> VPD data instead of trying to parse through and read it a field at a
> time.
> 
>>> In order to work around it we just need to add a small function
>>> to drivers/pci/quirks.c that would update the VPD size reported so
>>> that it matches what the hardware is actually providing instead of
>>> what we can determine based on the VPD layout.
>>>
>>> Really working around something like this is not much different than
>>> what we would have to do if the vendor had stuffed the data in some
>>> reserved section of their PCI configuration space.
>>
>> It is, in both cases we shouldn't have VFIO or the host involved. We
>> should just let the guest config space accesses go through.
>>
>>>   We end up needing
>>> to add special quirks any time a vendor goes out-of-spec for some
>>> one-off configuration interface that only they are ever going to use.
>>
>> Cheers,
>> Ben.
> 
> If you have a suggestion on how to resolve this patches are always
> welcome.  Otherwise I think the simpler approach to fixing this
> without re-introducing the existing bugs is to just add the quirk.  I
> will try to get to it sometime this weekend if nobody else does.  It
> should be pretty straight foward, but I just don't have the time to
> pull up a kernel and generate a patch right now.


How exactly is mine - https://lkml.org/lkml/2016/8/11/200 - bad (except
missing rb/ab from Chelsio folks)? Thanks.


-- 
Alexey

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH kernel] PCI: Enable access to custom VPD for Chelsio devices (cxgb3)
  2016-08-11 10:03           ` [RFC PATCH kernel] PCI: Enable access to custom VPD for Chelsio devices (cxgb3) Alexey Kardashevskiy
@ 2016-09-06 15:48             ` Bjorn Helgaas
  2016-09-06 18:30               ` Alexander Duyck
  0 siblings, 1 reply; 37+ messages in thread
From: Bjorn Helgaas @ 2016-09-06 15:48 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: netdev, Bjorn Helgaas, Santosh Raspatur, Alex Williamson,
	Paul Mackerras, Alexander Duyck, Hannes Reinecke, linux-kernel,
	linux-pci

Hi Alexey,

On Thu, Aug 11, 2016 at 08:03:29PM +1000, Alexey Kardashevskiy wrote:
> There is at least one Chelsio 10Gb card which uses VPD area to store
> some custom blocks (example below). However pci_vpd_size() returns
> the length of the first block only assuming that there can be only
> one VPD "End Tag" and VFIO blocks access beyond that offset
> (since 4e1a63555) which leads to the situation when the guest "cxgb3"
> driver fails to probe the device. The host system does not have this
> problem as the drives accesses the config space directly without
> pci_read_vpd()/...
> 
> This adds a quirk to override the VPD size to a bigger value.
> 
> This is the controller:
> Ethernet controller [0200]: Chelsio Communications Inc T310 10GbE Single Port Adapter [1425:0030]
> 
> This is its VPD:
> #0000 Large item 42 bytes; name 0x2 Identifier String
> 	b'10 Gigabit Ethernet-SR PCI Express Adapter'
> #002d Large item 74 bytes; name 0x10
> 	#00 [EC] len=7: b'D76809 '
> 	#0a [FN] len=7: b'46K7897'
> 	#14 [PN] len=7: b'46K7897'
> 	#1e [MN] len=4: b'1037'
> 	#25 [FC] len=4: b'5769'
> 	#2c [SN] len=12: b'YL102035603V'
> 	#3b [NA] len=12: b'00145E992ED1'
> #007a Small item 1 bytes; name 0xf End Tag
> ---
> #0c00 Large item 16 bytes; name 0x2 Identifier String
> 	b'S310E-SR-X      '
> #0c13 Large item 234 bytes; name 0x10
> 	#00 [PN] len=16: b'TBD             '
> 	#13 [EC] len=16: b'110107730D2     '
> 	#26 [SN] len=16: b'97YL102035603V  '
> 	#39 [NA] len=12: b'00145E992ED1'
> 	#48 [V0] len=6: b'175000'
> 	#51 [V1] len=6: b'266666'
> 	#5a [V2] len=6: b'266666'
> 	#63 [V3] len=6: b'2000  '
> 	#6c [V4] len=2: b'1 '
> 	#71 [V5] len=6: b'c2    '
> 	#7a [V6] len=6: b'0     '
> 	#83 [V7] len=2: b'1 '
> 	#88 [V8] len=2: b'0 '
> 	#8d [V9] len=2: b'0 '
> 	#92 [VA] len=2: b'0 '
> 	#97 [RV] len=80: b's\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'...
> #0d00 Large item 252 bytes; name 0x11
> 	#00 [VC] len=16: b'122310_1222 dp  '
> 	#13 [VD] len=16: b'610-0001-00 H1\x00\x00'
> 	#26 [VE] len=16: b'122310_1353 fp  '
> 	#39 [VF] len=16: b'610-0001-00 H1\x00\x00'
> 	#4c [RW] len=173: b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'...
> #0dff Small item 0 bytes; name 0xf End Tag
> 
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
>  drivers/pci/quirks.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index ee72ebe1..94d3fb5 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -3241,6 +3241,18 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_CACTUS_RIDGE_4C
>  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_PORT_RIDGE,
>  			quirk_thunderbolt_hotplug_msi);
>  
> +static void quirk_chelsio_extend_vpd(struct pci_dev *dev)
> +{
> +	if (!dev->vpd || !dev->vpd->ops || !dev->vpd->ops->set_size)
> +		return;
> +
> +	dev->vpd->ops->set_size(dev, max_t(unsigned int, dev->vpd->len, 0xe00));
> +}
> +
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CHELSIO,
> +			PCI_ANY_ID,
> +			quirk_chelsio_extend_vpd);

Do you really want this for *all* Chelsio devices?  If you only need
it for certain devices, the quirk could probably go in the driver.

Can you use pci_set_vpd_size() instead?  There's already a use of that
in cxgb4.

> +
>  #ifdef CONFIG_ACPI
>  /*
>   * Apple: Shutdown Cactus Ridge Thunderbolt controller.
> -- 
> 2.5.0.rc3
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH kernel] PCI: Enable access to custom VPD for Chelsio devices (cxgb3)
  2016-09-06 15:48             ` Bjorn Helgaas
@ 2016-09-06 18:30               ` Alexander Duyck
  2016-09-21 10:53                 ` Alexey Kardashevskiy
  0 siblings, 1 reply; 37+ messages in thread
From: Alexander Duyck @ 2016-09-06 18:30 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Alexey Kardashevskiy, Netdev, Bjorn Helgaas, Santosh Raspatur,
	Alex Williamson, Paul Mackerras, Hannes Reinecke, linux-kernel,
	linux-pci

On Tue, Sep 6, 2016 at 8:48 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> Hi Alexey,
>
> On Thu, Aug 11, 2016 at 08:03:29PM +1000, Alexey Kardashevskiy wrote:
>> There is at least one Chelsio 10Gb card which uses VPD area to store
>> some custom blocks (example below). However pci_vpd_size() returns
>> the length of the first block only assuming that there can be only
>> one VPD "End Tag" and VFIO blocks access beyond that offset
>> (since 4e1a63555) which leads to the situation when the guest "cxgb3"
>> driver fails to probe the device. The host system does not have this
>> problem as the drives accesses the config space directly without
>> pci_read_vpd()/...
>>
>> This adds a quirk to override the VPD size to a bigger value.
>>
>> This is the controller:
>> Ethernet controller [0200]: Chelsio Communications Inc T310 10GbE Single Port Adapter [1425:0030]
>>
>> This is its VPD:
>> #0000 Large item 42 bytes; name 0x2 Identifier String
>>       b'10 Gigabit Ethernet-SR PCI Express Adapter'
>> #002d Large item 74 bytes; name 0x10
>>       #00 [EC] len=7: b'D76809 '
>>       #0a [FN] len=7: b'46K7897'
>>       #14 [PN] len=7: b'46K7897'
>>       #1e [MN] len=4: b'1037'
>>       #25 [FC] len=4: b'5769'
>>       #2c [SN] len=12: b'YL102035603V'
>>       #3b [NA] len=12: b'00145E992ED1'
>> #007a Small item 1 bytes; name 0xf End Tag
>> ---
>> #0c00 Large item 16 bytes; name 0x2 Identifier String
>>       b'S310E-SR-X      '
>> #0c13 Large item 234 bytes; name 0x10
>>       #00 [PN] len=16: b'TBD             '
>>       #13 [EC] len=16: b'110107730D2     '
>>       #26 [SN] len=16: b'97YL102035603V  '
>>       #39 [NA] len=12: b'00145E992ED1'
>>       #48 [V0] len=6: b'175000'
>>       #51 [V1] len=6: b'266666'
>>       #5a [V2] len=6: b'266666'
>>       #63 [V3] len=6: b'2000  '
>>       #6c [V4] len=2: b'1 '
>>       #71 [V5] len=6: b'c2    '
>>       #7a [V6] len=6: b'0     '
>>       #83 [V7] len=2: b'1 '
>>       #88 [V8] len=2: b'0 '
>>       #8d [V9] len=2: b'0 '
>>       #92 [VA] len=2: b'0 '
>>       #97 [RV] len=80: b's\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'...
>> #0d00 Large item 252 bytes; name 0x11
>>       #00 [VC] len=16: b'122310_1222 dp  '
>>       #13 [VD] len=16: b'610-0001-00 H1\x00\x00'
>>       #26 [VE] len=16: b'122310_1353 fp  '
>>       #39 [VF] len=16: b'610-0001-00 H1\x00\x00'
>>       #4c [RW] len=173: b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'...
>> #0dff Small item 0 bytes; name 0xf End Tag
>>
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>> ---
>>  drivers/pci/quirks.c | 12 ++++++++++++
>>  1 file changed, 12 insertions(+)
>>
>> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>> index ee72ebe1..94d3fb5 100644
>> --- a/drivers/pci/quirks.c
>> +++ b/drivers/pci/quirks.c
>> @@ -3241,6 +3241,18 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_CACTUS_RIDGE_4C
>>  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_PORT_RIDGE,
>>                       quirk_thunderbolt_hotplug_msi);
>>
>> +static void quirk_chelsio_extend_vpd(struct pci_dev *dev)
>> +{
>> +     if (!dev->vpd || !dev->vpd->ops || !dev->vpd->ops->set_size)
>> +             return;
>> +
>> +     dev->vpd->ops->set_size(dev, max_t(unsigned int, dev->vpd->len, 0xe00));
>> +}
>> +

So one thing you might want to look at doing is actually validating
there is something there before increasing the size.  If you look at
the get_vpd_params function from the cxgb3 driver you will see what
they do is verify the first tag located at 0xC00 is 0x82 before they
do any further reads.  You might do something similar just to verify
there is something there before you open it up to access by anyone.

One option would be to modify pci_vpd_size so that you can use it
outside of access.c and can pass it an offset.  Then you could update
your quirk so that you call pci_vpd_size and pass it the offset of
0xC00.  It should then be able to walk from that starting point and
reach the end of the list.  If you do then pci_vpd_size will return
the total size, else it returns 0.  So if size comes back as a
non-zero value then you could pass that into pci_set_vpd_size,
otherwise we can assume the starting offset is 0 and let the existing
code run its course.

>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CHELSIO,
>> +                     PCI_ANY_ID,
>> +                     quirk_chelsio_extend_vpd);
>
> Do you really want this for *all* Chelsio devices?  If you only need
> it for certain devices, the quirk could probably go in the driver.
>
> Can you use pci_set_vpd_size() instead?  There's already a use of that
> in cxgb4.
>

I would assume this quirk needs to support the same device IDs as
supported by the cxgb3 driver.  If so you might just clone the ID list
from cxgb3_pci_tbl for this quirk.

Also from the looks of it the cxgb3 driver probably needs to be
updated to use the VPD accessor functions instead of just open coding
it themselves.  Otherwise we run the risk of the driver having issues
if it attempts to access the EEPROM at the same time as other
applications attempting to access the VPD for the device.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH kernel] PCI: Enable access to custom VPD for Chelsio devices (cxgb3)
  2016-09-06 18:30               ` Alexander Duyck
@ 2016-09-21 10:53                 ` Alexey Kardashevskiy
  0 siblings, 0 replies; 37+ messages in thread
From: Alexey Kardashevskiy @ 2016-09-21 10:53 UTC (permalink / raw)
  To: Alexander Duyck, Bjorn Helgaas
  Cc: Netdev, Bjorn Helgaas, Santosh Raspatur, Alex Williamson,
	Paul Mackerras, Hannes Reinecke, linux-kernel, linux-pci

On 07/09/16 04:30, Alexander Duyck wrote:
> On Tue, Sep 6, 2016 at 8:48 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
>> Hi Alexey,
>>
>> On Thu, Aug 11, 2016 at 08:03:29PM +1000, Alexey Kardashevskiy wrote:
>>> There is at least one Chelsio 10Gb card which uses VPD area to store
>>> some custom blocks (example below). However pci_vpd_size() returns
>>> the length of the first block only assuming that there can be only
>>> one VPD "End Tag" and VFIO blocks access beyond that offset
>>> (since 4e1a63555) which leads to the situation when the guest "cxgb3"
>>> driver fails to probe the device. The host system does not have this
>>> problem as the drives accesses the config space directly without
>>> pci_read_vpd()/...
>>>
>>> This adds a quirk to override the VPD size to a bigger value.
>>>
>>> This is the controller:
>>> Ethernet controller [0200]: Chelsio Communications Inc T310 10GbE Single Port Adapter [1425:0030]
>>>
>>> This is its VPD:
>>> #0000 Large item 42 bytes; name 0x2 Identifier String
>>>       b'10 Gigabit Ethernet-SR PCI Express Adapter'
>>> #002d Large item 74 bytes; name 0x10
>>>       #00 [EC] len=7: b'D76809 '
>>>       #0a [FN] len=7: b'46K7897'
>>>       #14 [PN] len=7: b'46K7897'
>>>       #1e [MN] len=4: b'1037'
>>>       #25 [FC] len=4: b'5769'
>>>       #2c [SN] len=12: b'YL102035603V'
>>>       #3b [NA] len=12: b'00145E992ED1'
>>> #007a Small item 1 bytes; name 0xf End Tag
>>> ---
>>> #0c00 Large item 16 bytes; name 0x2 Identifier String
>>>       b'S310E-SR-X      '
>>> #0c13 Large item 234 bytes; name 0x10
>>>       #00 [PN] len=16: b'TBD             '
>>>       #13 [EC] len=16: b'110107730D2     '
>>>       #26 [SN] len=16: b'97YL102035603V  '
>>>       #39 [NA] len=12: b'00145E992ED1'
>>>       #48 [V0] len=6: b'175000'
>>>       #51 [V1] len=6: b'266666'
>>>       #5a [V2] len=6: b'266666'
>>>       #63 [V3] len=6: b'2000  '
>>>       #6c [V4] len=2: b'1 '
>>>       #71 [V5] len=6: b'c2    '
>>>       #7a [V6] len=6: b'0     '
>>>       #83 [V7] len=2: b'1 '
>>>       #88 [V8] len=2: b'0 '
>>>       #8d [V9] len=2: b'0 '
>>>       #92 [VA] len=2: b'0 '
>>>       #97 [RV] len=80: b's\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'...
>>> #0d00 Large item 252 bytes; name 0x11
>>>       #00 [VC] len=16: b'122310_1222 dp  '
>>>       #13 [VD] len=16: b'610-0001-00 H1\x00\x00'
>>>       #26 [VE] len=16: b'122310_1353 fp  '
>>>       #39 [VF] len=16: b'610-0001-00 H1\x00\x00'
>>>       #4c [RW] len=173: b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'...
>>> #0dff Small item 0 bytes; name 0xf End Tag
>>>
>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>>> ---
>>>  drivers/pci/quirks.c | 12 ++++++++++++
>>>  1 file changed, 12 insertions(+)
>>>
>>> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>>> index ee72ebe1..94d3fb5 100644
>>> --- a/drivers/pci/quirks.c
>>> +++ b/drivers/pci/quirks.c
>>> @@ -3241,6 +3241,18 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_CACTUS_RIDGE_4C
>>>  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_PORT_RIDGE,
>>>                       quirk_thunderbolt_hotplug_msi);
>>>
>>> +static void quirk_chelsio_extend_vpd(struct pci_dev *dev)
>>> +{
>>> +     if (!dev->vpd || !dev->vpd->ops || !dev->vpd->ops->set_size)
>>> +             return;
>>> +
>>> +     dev->vpd->ops->set_size(dev, max_t(unsigned int, dev->vpd->len, 0xe00));
>>> +}
>>> +
> 
> So one thing you might want to look at doing is actually validating
> there is something there before increasing the size.  If you look at
> the get_vpd_params function from the cxgb3 driver you will see what
> they do is verify the first tag located at 0xC00 is 0x82 before they
> do any further reads.  You might do something similar just to verify
> there is something there before you open it up to access by anyone.
> 
> One option would be to modify pci_vpd_size so that you can use it
> outside of access.c and can pass it an offset.  Then you could update
> your quirk so that you call pci_vpd_size and pass it the offset of
> 0xC00.  It should then be able to walk from that starting point and
> reach the end of the list.  If you do then pci_vpd_size will return
> the total size, else it returns 0.  So if size comes back as a
> non-zero value then you could pass that into pci_set_vpd_size,
> otherwise we can assume the starting offset is 0 and let the existing
> code run its course.


I could do this but I can predict endless argument whether a generic
pci_vpd_size() should scan for something which looks like VPD but it is
not, etc. I'd read at 0xc00, if it is 0x82, then pci_set_vpd_size(0xe00)
and that's it.


> 
>>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CHELSIO,
>>> +                     PCI_ANY_ID,
>>> +                     quirk_chelsio_extend_vpd);
>>
>> Do you really want this for *all* Chelsio devices?  

Well, no. When I did this, I just wanted opinions :) I'll have a list here
in next revision.


> If you only need it for certain devices, the quirk could probably go in the driver.

It cannot, the cxgb3 driver is not running on the host, vfio-pci controls it
.

>
>>
>> Can you use pci_set_vpd_size() instead?  There's already a use of that
>> in cxgb4.
>>
> 
> I would assume this quirk needs to support the same device IDs as
> supported by the cxgb3 driver.  If so you might just clone the ID list
> from cxgb3_pci_tbl for this quirk.
> 
> Also from the looks of it the cxgb3 driver probably needs to be
> updated to use the VPD accessor functions instead of just open coding
> it themselves.  Otherwise we run the risk of the driver having issues
> if it attempts to access the EEPROM at the same time as other
> applications attempting to access the VPD for the device.


In this particular case cxgb3 driver is not running, vfio-pci controls the
device on the host side and does all the filtering so I have to add a quirk
to  drivers/pci/quirks.c with all 13 PCI IDs of cxgb3 hardware OOOOOR
implement quirks in vfio-pci (it has one already but I seriously doubt we
should extend it).

Fixing cxgb3 in the guest won't make any difference.

So where should I put the quirks? Thanks.


-- 
Alexey

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2016-09-21 10:54 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-13 11:25 [PATCHv2 0/4] PCI VPD access fixes Hannes Reinecke
2016-01-13 11:25 ` [PATCHv2 1/4] pci: Update VPD definitions Hannes Reinecke
2016-01-13 11:25 ` [PATCHv2 2/4] pci: allow access to VPD attributes with size '0' Hannes Reinecke
2016-02-09 20:53   ` Bjorn Helgaas
2016-02-10  7:17     ` Hannes Reinecke
2016-01-13 11:25 ` [PATCHv2 3/4] pci: Determine actual VPD size on first access Hannes Reinecke
2016-02-09 21:04   ` Bjorn Helgaas
2016-02-10  7:24     ` Hannes Reinecke
2016-08-09 12:54     ` Alexey Kardashevskiy
2016-08-09 18:12       ` Alexander Duyck
2016-08-10  0:03         ` Benjamin Herrenschmidt
2016-08-10 15:47           ` Alexander Duyck
2016-08-10 23:54             ` Benjamin Herrenschmidt
2016-08-11 18:52               ` Alexander Duyck
2016-08-11 20:17                 ` Alex Williamson
2016-08-12  5:11                   ` Benjamin Herrenschmidt
2016-08-15 17:59                     ` Rustad, Mark D
2016-08-15 22:23                       ` Benjamin Herrenschmidt
2016-08-15 22:33                         ` Benjamin Herrenschmidt
2016-08-15 23:16                           ` Rustad, Mark D
2016-08-16  0:13                             ` Benjamin Herrenschmidt
2016-08-16  1:40                 ` Alexey Kardashevskiy
2016-08-10  6:23         ` Hannes Reinecke
2016-08-11 10:03           ` [RFC PATCH kernel] PCI: Enable access to custom VPD for Chelsio devices (cxgb3) Alexey Kardashevskiy
2016-09-06 15:48             ` Bjorn Helgaas
2016-09-06 18:30               ` Alexander Duyck
2016-09-21 10:53                 ` Alexey Kardashevskiy
2016-08-09 23:59       ` [PATCHv2 3/4] pci: Determine actual VPD size on first access Benjamin Herrenschmidt
2016-01-13 11:25 ` [PATCHv2 4/4] pci: Blacklist vpd access for buggy devices Hannes Reinecke
2016-01-19 20:57   ` [PATCH v3 " Babu Moger
2016-02-09 21:07   ` [PATCHv2 " Bjorn Helgaas
2016-02-09 21:24     ` Babu Moger
2016-01-15  1:07 ` [PATCHv2 0/4] PCI VPD access fixes Seymour, Shane M
2016-01-15 14:10   ` Babu Moger
2016-01-15 14:18     ` Hannes Reinecke
2016-01-19 20:53 ` Babu Moger
2016-01-21 18:34 ` [PATCH v4 4/4] pci: Blacklist vpd access for buggy devices Babu Moger

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.