linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RESEND PATCH v2 0/4] PCI: Add support for enforcing all MMIO BARs not to share PAGE_SIZE
@ 2016-06-02  5:46 Yongji Xie
  2016-06-02  5:46 ` [RESEND PATCH v2 1/4] PCI: Ignore resource_alignment if PCI_PROBE_ONLY was set Yongji Xie
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Yongji Xie @ 2016-06-02  5:46 UTC (permalink / raw)
  To: linux-kernel, linux-pci, linuxppc-dev, linux-doc
  Cc: bhelgaas, alex.williamson, aik, benh, paulus, mpe, corbet,
	warrier, zhong, nikunj, gwshan

This series aims to add an option for PCI resource allocator to  
force BARs not to share PAGE_SIZE. This would make sense to VFIO 
driver. Because current VFIO implementation disallows to mmap 
sub-page(size < PAGE_SIZE) MMIO BARs which may share the same page 
with other BARs for security reasons. Thus, we have to handle mmio 
access to these BARs in QEMU emulation rather than in guest which 
will cause some performance loss.

To achieve that, we would like to make use of the existing 
resource_alignment kernel parameter and force a minimum alignment 
of PAGE_SIZE. It's flexible. And we can default to enable it on  
some archs which may easily hit the performance issue because of  
their 64K page.

In this series, patch 1 fixed a bug of resource_alignment; patch 2,3 
tried to add a new option for resource_alignment to use 
IORESOURCE_STARTALIGN to specify the alignment of PCI BARs; patch 4 
modified resource_alignment to support syntax which can be used to  
enforce the alignment of all MMIO BARs to be at least PAGE_SIZE.

Changelog v2:
- Ignore enforced alignment to VF BARs on pci_reassigndev_resource_alignment()

Yongji Xie (4):
  PCI: Ignore resource_alignment if PCI_PROBE_ONLY was set
  PCI: Do not Use IORESOURCE_STARTALIGN to identify bridge resources
  PCI: Add a new option for resource_alignment to reassign alignment
  PCI: Add support for enforcing all MMIO BARs to be page aligned

 Documentation/kernel-parameters.txt |    7 ++-
 arch/powerpc/include/asm/pci.h      |    2 +
 drivers/pci/pci.c                   |  109 ++++++++++++++++++++++++++++-------
 drivers/pci/setup-bus.c             |    9 ++-
 4 files changed, 102 insertions(+), 25 deletions(-)

-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [RESEND PATCH v2 1/4] PCI: Ignore resource_alignment if PCI_PROBE_ONLY was set
  2016-06-02  5:46 [RESEND PATCH v2 0/4] PCI: Add support for enforcing all MMIO BARs not to share PAGE_SIZE Yongji Xie
@ 2016-06-02  5:46 ` Yongji Xie
  2016-06-21  1:43   ` [RESEND PATCH v2 1/4] PCI: Ignore resource_alignment if PCI_PROBE_ONLY was set\\ Bjorn Helgaas
  2016-06-02  5:46 ` [RESEND PATCH v2 2/4] PCI: Do not Use IORESOURCE_STARTALIGN to identify bridge resources Yongji Xie
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 14+ messages in thread
From: Yongji Xie @ 2016-06-02  5:46 UTC (permalink / raw)
  To: linux-kernel, linux-pci, linuxppc-dev, linux-doc
  Cc: bhelgaas, alex.williamson, aik, benh, paulus, mpe, corbet,
	warrier, zhong, nikunj, gwshan

The resource_alignment will releases memory resources allocated
by firmware so that kernel can reassign new resources later on.
But this will cause the problem that no resources can be
allocated by kernel if PCI_PROBE_ONLY was set, e.g. on pSeries
platform because PCI_PROBE_ONLY force kernel to use firmware
setup and not to reassign any resources.

To solve this problem, this patch ignores resource_alignment
if PCI_PROBE_ONLY was set.

Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
---
 drivers/pci/pci.c |    6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index c8b4dbd..a259394 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4761,6 +4761,12 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev)
 	spin_lock(&resource_alignment_lock);
 	p = resource_alignment_param;
 	while (*p) {
+		if (pci_has_flag(PCI_PROBE_ONLY)) {
+			printk(KERN_ERR "PCI: Ignore resource_alignment parameter: %s with PCI_PROBE_ONLY set\n",
+					p);
+			*p = 0;
+			break;
+		}
 		count = 0;
 		if (sscanf(p, "%d%n", &align_order, &count) == 1 &&
 							p[count] == '@') {
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [RESEND PATCH v2 2/4] PCI: Do not Use IORESOURCE_STARTALIGN to identify bridge resources
  2016-06-02  5:46 [RESEND PATCH v2 0/4] PCI: Add support for enforcing all MMIO BARs not to share PAGE_SIZE Yongji Xie
  2016-06-02  5:46 ` [RESEND PATCH v2 1/4] PCI: Ignore resource_alignment if PCI_PROBE_ONLY was set Yongji Xie
@ 2016-06-02  5:46 ` Yongji Xie
  2016-06-21  1:50   ` Bjorn Helgaas
  2016-06-02  5:46 ` [RESEND PATCH v2 3/4] PCI: Add a new option for resource_alignment to reassign alignment Yongji Xie
  2016-06-02  5:46 ` [RESEND PATCH v2 4/4] PCI: Add support for enforcing all MMIO BARs to be page aligned Yongji Xie
  3 siblings, 1 reply; 14+ messages in thread
From: Yongji Xie @ 2016-06-02  5:46 UTC (permalink / raw)
  To: linux-kernel, linux-pci, linuxppc-dev, linux-doc
  Cc: bhelgaas, alex.williamson, aik, benh, paulus, mpe, corbet,
	warrier, zhong, nikunj, gwshan

Now we use the IORESOURCE_STARTALIGN to identify bridge resources
in __assign_resources_sorted(). That's quite fragile. We can't
make sure that the PCI devices' resources will not use
IORESOURCE_STARTALIGN any more.

In this patch, we try to use a more robust way to identify
bridge resources.

Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
---
 drivers/pci/setup-bus.c |    9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 55641a3..216ddbc 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -390,6 +390,7 @@ static void __assign_resources_sorted(struct list_head *head,
 	struct pci_dev_resource *dev_res, *tmp_res, *dev_res2;
 	unsigned long fail_type;
 	resource_size_t add_align, align;
+	int index;
 
 	/* Check if optional add_size is there */
 	if (!realloc_head || list_empty(realloc_head))
@@ -410,11 +411,13 @@ static void __assign_resources_sorted(struct list_head *head,
 
 		/*
 		 * There are two kinds of additional resources in the list:
-		 * 1. bridge resource  -- IORESOURCE_STARTALIGN
-		 * 2. SR-IOV resource   -- IORESOURCE_SIZEALIGN
+		 * 1. bridge resource
+		 * 2. SR-IOV resource
 		 * Here just fix the additional alignment for bridge
 		 */
-		if (!(dev_res->res->flags & IORESOURCE_STARTALIGN))
+		index = dev_res->res - dev_res->dev->resource;
+		if (index < PCI_BRIDGE_RESOURCES ||
+			index > PCI_BRIDGE_RESOURCE_END)
 			continue;
 
 		add_align = get_res_add_align(realloc_head, dev_res->res);
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [RESEND PATCH v2 3/4] PCI: Add a new option for resource_alignment to reassign alignment
  2016-06-02  5:46 [RESEND PATCH v2 0/4] PCI: Add support for enforcing all MMIO BARs not to share PAGE_SIZE Yongji Xie
  2016-06-02  5:46 ` [RESEND PATCH v2 1/4] PCI: Ignore resource_alignment if PCI_PROBE_ONLY was set Yongji Xie
  2016-06-02  5:46 ` [RESEND PATCH v2 2/4] PCI: Do not Use IORESOURCE_STARTALIGN to identify bridge resources Yongji Xie
@ 2016-06-02  5:46 ` Yongji Xie
  2016-06-21  1:57   ` Bjorn Helgaas
  2016-06-02  5:46 ` [RESEND PATCH v2 4/4] PCI: Add support for enforcing all MMIO BARs to be page aligned Yongji Xie
  3 siblings, 1 reply; 14+ messages in thread
From: Yongji Xie @ 2016-06-02  5:46 UTC (permalink / raw)
  To: linux-kernel, linux-pci, linuxppc-dev, linux-doc
  Cc: bhelgaas, alex.williamson, aik, benh, paulus, mpe, corbet,
	warrier, zhong, nikunj, gwshan

When using resource_alignment kernel parameter, the current
implement reassigns the alignment by changing resources' size
which can potentially break some drivers. For example, the driver
uses the size to locate some register whose length is related
to the size.

This patch adds a new option "noresize" for the parameter to
solve this problem.

Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
---
 Documentation/kernel-parameters.txt |    5 ++++-
 drivers/pci/pci.c                   |   35 +++++++++++++++++++++++++----------
 2 files changed, 29 insertions(+), 11 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 82b42c9..c4802f5 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2997,13 +2997,16 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 				window. The default value is 64 megabytes.
 		resource_alignment=
 				Format:
-				[<order of align>@][<domain>:]<bus>:<slot>.<func>[; ...]
+				[<order of align>@][<domain>:]<bus>:<slot>.<func>
+				[:noresize][; ...]
 				Specifies alignment and device to reassign
 				aligned memory resources.
 				If <order of align> is not specified,
 				PAGE_SIZE is used as alignment.
 				PCI-PCI bridge can be specified, if resource
 				windows need to be expanded.
+				noresize: Don't change the resources' sizes when
+				reassigning alignment.
 		ecrc=		Enable/disable PCIe ECRC (transaction layer
 				end-to-end CRC checking).
 				bios: Use BIOS/firmware settings. This is the
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index a259394..3ee13e5 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4748,11 +4748,13 @@ static DEFINE_SPINLOCK(resource_alignment_lock);
 /**
  * pci_specified_resource_alignment - get resource alignment specified by user.
  * @dev: the PCI device to get
+ * @resize: whether or not to change resources' size when reassigning alignment
  *
  * RETURNS: Resource alignment if it is specified.
  *          Zero if it is not specified.
  */
-static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev)
+static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev,
+		bool *resize)
 {
 	int seg, bus, slot, func, align_order, count;
 	resource_size_t align = 0;
@@ -4786,6 +4788,11 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev)
 			}
 		}
 		p += count;
+		if (!strncmp(p, ":noresize", 9)) {
+			*resize = false;
+			p += 9;
+		} else
+			*resize = true;
 		if (seg == pci_domain_nr(dev->bus) &&
 			bus == dev->bus->number &&
 			slot == PCI_SLOT(dev->devfn) &&
@@ -4818,11 +4825,12 @@ void pci_reassigndev_resource_alignment(struct pci_dev *dev)
 {
 	int i;
 	struct resource *r;
+	bool resize = true;
 	resource_size_t align, size;
 	u16 command;
 
 	/* check if specified PCI is target device to reassign */
-	align = pci_specified_resource_alignment(dev);
+	align = pci_specified_resource_alignment(dev, &resize);
 	if (!align)
 		return;
 
@@ -4844,15 +4852,22 @@ void pci_reassigndev_resource_alignment(struct pci_dev *dev)
 		if (!(r->flags & IORESOURCE_MEM))
 			continue;
 		size = resource_size(r);
-		if (size < align) {
-			size = align;
-			dev_info(&dev->dev,
-				"Rounding up size of resource #%d to %#llx.\n",
-				i, (unsigned long long)size);
+		if (resize) {
+			if (size < align) {
+				size = align;
+				dev_info(&dev->dev,
+					"Rounding up size of resource #%d to %#llx.\n",
+					i, (unsigned long long)size);
+			}
+			r->flags |= IORESOURCE_UNSET;
+			r->end = size - 1;
+			r->start = 0;
+		} else {
+			r->flags &= ~IORESOURCE_SIZEALIGN;
+			r->flags |= IORESOURCE_STARTALIGN | IORESOURCE_UNSET;
+			r->start = max(align, size);
+			r->end = r->start + size - 1;
 		}
-		r->flags |= IORESOURCE_UNSET;
-		r->end = size - 1;
-		r->start = 0;
 	}
 	/* Need to disable bridge's resource window,
 	 * to enable the kernel to reassign new resource
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [RESEND PATCH v2 4/4] PCI: Add support for enforcing all MMIO BARs to be page aligned
  2016-06-02  5:46 [RESEND PATCH v2 0/4] PCI: Add support for enforcing all MMIO BARs not to share PAGE_SIZE Yongji Xie
                   ` (2 preceding siblings ...)
  2016-06-02  5:46 ` [RESEND PATCH v2 3/4] PCI: Add a new option for resource_alignment to reassign alignment Yongji Xie
@ 2016-06-02  5:46 ` Yongji Xie
  2016-06-21  2:26   ` Bjorn Helgaas
  3 siblings, 1 reply; 14+ messages in thread
From: Yongji Xie @ 2016-06-02  5:46 UTC (permalink / raw)
  To: linux-kernel, linux-pci, linuxppc-dev, linux-doc
  Cc: bhelgaas, alex.williamson, aik, benh, paulus, mpe, corbet,
	warrier, zhong, nikunj, gwshan

When vfio passthrough a PCI device of which MMIO BARs are
smaller than PAGE_SIZE, guest will not handle the mmio
accesses to the BARs which leads to mmio emulations in host.

This is because vfio will not allow to passthrough one BAR's
mmio page which may be shared with other BARs. Otherwise,
there will be a backdoor that guest can use to access BARs
of other guest.

To solve this issue, this patch modifies resource_alignment
to support syntax where multiple devices get the same
alignment. So we can use something like
"pci=resource_alignment=*:*:*.*:noresize" to enforce the
alignment of all MMIO BARs to be at least PAGE_SIZE so that
one BAR's mmio page would not be shared with other BARs.

And we also define a macro PCIBIOS_MIN_ALIGNMENT to enable this
automatically on PPC64 platform which can easily hit this issue
because its PAGE_SIZE is 64KB.

Note that this would not be applied to VFs whose BARs are always
page aligned and should be never reassigned according to SRIOV
spec.

Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
---
 Documentation/kernel-parameters.txt |    2 ++
 arch/powerpc/include/asm/pci.h      |    2 ++
 drivers/pci/pci.c                   |   68 +++++++++++++++++++++++++++++------
 3 files changed, 61 insertions(+), 11 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index c4802f5..cb09503 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -3003,6 +3003,8 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 				aligned memory resources.
 				If <order of align> is not specified,
 				PAGE_SIZE is used as alignment.
+				<domain>, <bus>, <slot> and <func> can be set to
+				"*" which means match all values.
 				PCI-PCI bridge can be specified, if resource
 				windows need to be expanded.
 				noresize: Don't change the resources' sizes when
diff --git a/arch/powerpc/include/asm/pci.h b/arch/powerpc/include/asm/pci.h
index a6f3ac0..742fd34 100644
--- a/arch/powerpc/include/asm/pci.h
+++ b/arch/powerpc/include/asm/pci.h
@@ -28,6 +28,8 @@
 #define PCIBIOS_MIN_IO		0x1000
 #define PCIBIOS_MIN_MEM		0x10000000
 
+#define PCIBIOS_MIN_ALIGNMENT  PAGE_SIZE
+
 struct pci_dev;
 
 /* Values for the `which' argument to sys_pciconfig_iobase syscall.  */
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 3ee13e5..664f295 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4759,7 +4759,12 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev,
 	int seg, bus, slot, func, align_order, count;
 	resource_size_t align = 0;
 	char *p;
+	bool invalid = false;
 
+#ifdef PCIBIOS_MIN_ALIGNMENT
+	align = PCIBIOS_MIN_ALIGNMENT;
+	*resize = false;
+#endif
 	spin_lock(&resource_alignment_lock);
 	p = resource_alignment_param;
 	while (*p) {
@@ -4776,16 +4781,49 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev,
 		} else {
 			align_order = -1;
 		}
-		if (sscanf(p, "%x:%x:%x.%x%n",
-			&seg, &bus, &slot, &func, &count) != 4) {
+		if (p[0] == '*' && p[1] == ':') {
+			seg = -1;
+			count = 1;
+		} else if (sscanf(p, "%x%n", &seg, &count) != 1 ||
+				p[count] != ':') {
+			invalid = true;
+			break;
+		}
+		p += count + 1;
+		if (*p == '*') {
+			bus = -1;
+			count = 1;
+		} else if (sscanf(p, "%x%n", &bus, &count) != 1) {
+			invalid = true;
+			break;
+		}
+		p += count;
+		if (*p == '.') {
+			slot = bus;
+			bus = seg;
 			seg = 0;
-			if (sscanf(p, "%x:%x.%x%n",
-					&bus, &slot, &func, &count) != 3) {
-				/* Invalid format */
-				printk(KERN_ERR "PCI: Can't parse resource_alignment parameter: %s\n",
-					p);
+			p++;
+		} else if (*p == ':') {
+			p++;
+			if (p[0] == '*' && p[1] == '.') {
+				slot = -1;
+				count = 1;
+			} else if (sscanf(p, "%x%n", &slot, &count) != 1 ||
+					p[count] != '.') {
+				invalid = true;
 				break;
 			}
+			p += count + 1;
+		} else {
+			invalid = true;
+			break;
+		}
+		if (*p == '*') {
+			func = -1;
+			count = 1;
+		} else if (sscanf(p, "%x%n", &func, &count) != 1) {
+			invalid = true;
+			break;
 		}
 		p += count;
 		if (!strncmp(p, ":noresize", 9)) {
@@ -4793,10 +4831,10 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev,
 			p += 9;
 		} else
 			*resize = true;
-		if (seg == pci_domain_nr(dev->bus) &&
-			bus == dev->bus->number &&
-			slot == PCI_SLOT(dev->devfn) &&
-			func == PCI_FUNC(dev->devfn)) {
+		if ((seg == pci_domain_nr(dev->bus) || seg == -1) &&
+			(bus == dev->bus->number || bus == -1) &&
+			(slot == PCI_SLOT(dev->devfn) || slot == -1) &&
+			(func == PCI_FUNC(dev->devfn) || func == -1)) {
 			if (align_order == -1)
 				align = PAGE_SIZE;
 			else
@@ -4806,10 +4844,14 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev,
 		}
 		if (*p != ';' && *p != ',') {
 			/* End of param or invalid format */
+			invalid = true;
 			break;
 		}
 		p++;
 	}
+	if (invalid)
+		printk(KERN_ERR "PCI: Can't parse resource_alignment parameter:%s\n",
+				p);
 	spin_unlock(&resource_alignment_lock);
 	return align;
 }
@@ -4829,6 +4871,10 @@ void pci_reassigndev_resource_alignment(struct pci_dev *dev)
 	resource_size_t align, size;
 	u16 command;
 
+	/* We should never try to reassign VF's alignment */
+	if (dev->is_virtfn)
+		return;
+
 	/* check if specified PCI is target device to reassign */
 	align = pci_specified_resource_alignment(dev, &resize);
 	if (!align)
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [RESEND PATCH v2 1/4] PCI: Ignore resource_alignment if PCI_PROBE_ONLY was set\\
  2016-06-02  5:46 ` [RESEND PATCH v2 1/4] PCI: Ignore resource_alignment if PCI_PROBE_ONLY was set Yongji Xie
@ 2016-06-21  1:43   ` Bjorn Helgaas
  2016-06-21  2:16     ` Yongji Xie
  0 siblings, 1 reply; 14+ messages in thread
From: Bjorn Helgaas @ 2016-06-21  1:43 UTC (permalink / raw)
  To: Yongji Xie
  Cc: linux-kernel, linux-pci, linuxppc-dev, linux-doc, bhelgaas,
	alex.williamson, aik, benh, paulus, mpe, corbet, warrier, zhong,
	nikunj, gwshan

On Thu, Jun 02, 2016 at 01:46:48PM +0800, Yongji Xie wrote:
> The resource_alignment will releases memory resources allocated
> by firmware so that kernel can reassign new resources later on.
> But this will cause the problem that no resources can be
> allocated by kernel if PCI_PROBE_ONLY was set, e.g. on pSeries
> platform because PCI_PROBE_ONLY force kernel to use firmware
> setup and not to reassign any resources.
> 
> To solve this problem, this patch ignores resource_alignment
> if PCI_PROBE_ONLY was set.
> 
> Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
> ---
>  drivers/pci/pci.c |    6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index c8b4dbd..a259394 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4761,6 +4761,12 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev)
>  	spin_lock(&resource_alignment_lock);
>  	p = resource_alignment_param;
>  	while (*p) {
> +		if (pci_has_flag(PCI_PROBE_ONLY)) {
> +			printk(KERN_ERR "PCI: Ignore resource_alignment parameter: %s with PCI_PROBE_ONLY set\n",
> +					p);
> +			*p = 0;
> +			break;

Wouldn't it be simpler to make pci_set_resource_alignment_param() fail
if PCI_PROBE_ONLY is set?

> +		}
>  		count = 0;
>  		if (sscanf(p, "%d%n", &align_order, &count) == 1 &&
>  							p[count] == '@') {
> -- 
> 1.7.9.5
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RESEND PATCH v2 2/4] PCI: Do not Use IORESOURCE_STARTALIGN to identify bridge resources
  2016-06-02  5:46 ` [RESEND PATCH v2 2/4] PCI: Do not Use IORESOURCE_STARTALIGN to identify bridge resources Yongji Xie
@ 2016-06-21  1:50   ` Bjorn Helgaas
  2016-06-21  2:59     ` Yongji Xie
  0 siblings, 1 reply; 14+ messages in thread
From: Bjorn Helgaas @ 2016-06-21  1:50 UTC (permalink / raw)
  To: Yongji Xie
  Cc: linux-kernel, linux-pci, linuxppc-dev, linux-doc, bhelgaas,
	alex.williamson, aik, benh, paulus, mpe, corbet, warrier, zhong,
	nikunj, gwshan

On Thu, Jun 02, 2016 at 01:46:49PM +0800, Yongji Xie wrote:
> Now we use the IORESOURCE_STARTALIGN to identify bridge resources
> in __assign_resources_sorted(). That's quite fragile. We can't
> make sure that the PCI devices' resources will not use
> IORESOURCE_STARTALIGN any more.

Can you explain this a little more?  I don't quite understand the
problem.  Maybe you can give an example of the problem?

> In this patch, we try to use a more robust way to identify
> bridge resources.
> 
> Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
> ---
>  drivers/pci/setup-bus.c |    9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
> index 55641a3..216ddbc 100644
> --- a/drivers/pci/setup-bus.c
> +++ b/drivers/pci/setup-bus.c
> @@ -390,6 +390,7 @@ static void __assign_resources_sorted(struct list_head *head,
>  	struct pci_dev_resource *dev_res, *tmp_res, *dev_res2;
>  	unsigned long fail_type;
>  	resource_size_t add_align, align;
> +	int index;
>  
>  	/* Check if optional add_size is there */
>  	if (!realloc_head || list_empty(realloc_head))
> @@ -410,11 +411,13 @@ static void __assign_resources_sorted(struct list_head *head,
>  
>  		/*
>  		 * There are two kinds of additional resources in the list:
> -		 * 1. bridge resource  -- IORESOURCE_STARTALIGN
> -		 * 2. SR-IOV resource   -- IORESOURCE_SIZEALIGN
> +		 * 1. bridge resource
> +		 * 2. SR-IOV resource
>  		 * Here just fix the additional alignment for bridge
>  		 */
> -		if (!(dev_res->res->flags & IORESOURCE_STARTALIGN))
> +		index = dev_res->res - dev_res->dev->resource;
> +		if (index < PCI_BRIDGE_RESOURCES ||
> +			index > PCI_BRIDGE_RESOURCE_END)

I think the code looks OK; at least, it seems to match the comment.
I'd just like to understand the problem a little better.

>  			continue;
>  
>  		add_align = get_res_add_align(realloc_head, dev_res->res);
> -- 
> 1.7.9.5
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RESEND PATCH v2 3/4] PCI: Add a new option for resource_alignment to reassign alignment
  2016-06-02  5:46 ` [RESEND PATCH v2 3/4] PCI: Add a new option for resource_alignment to reassign alignment Yongji Xie
@ 2016-06-21  1:57   ` Bjorn Helgaas
  2016-06-21  3:26     ` Yongji Xie
  0 siblings, 1 reply; 14+ messages in thread
From: Bjorn Helgaas @ 2016-06-21  1:57 UTC (permalink / raw)
  To: Yongji Xie
  Cc: linux-kernel, linux-pci, linuxppc-dev, linux-doc, bhelgaas,
	alex.williamson, aik, benh, paulus, mpe, corbet, warrier, zhong,
	nikunj, gwshan

On Thu, Jun 02, 2016 at 01:46:50PM +0800, Yongji Xie wrote:
> When using resource_alignment kernel parameter, the current
> implement reassigns the alignment by changing resources' size
> which can potentially break some drivers. For example, the driver
> uses the size to locate some register whose length is related
> to the size.
> 
> This patch adds a new option "noresize" for the parameter to
> solve this problem.

Why do we ever want to change the resource's size?  I understand that
you want to change the resource's *alignment*, and that part makes
sense.  But why change the *size*?  Changing the resource size doesn't
change the hardware BAR size; it just means the struct resource will
describe a region larger than what the BAR actually claims.  That
unnecessarily wastes space after the BAR.

This was a problem with the code even before your patch; I'm
suggesting that if you have a way to change the alignment without
changing the resource size, maybe we should do that all the time.
Then you wouldn't need to add the "noresize" option.

> Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
> ---
>  Documentation/kernel-parameters.txt |    5 ++++-
>  drivers/pci/pci.c                   |   35 +++++++++++++++++++++++++----------
>  2 files changed, 29 insertions(+), 11 deletions(-)
> 
> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
> index 82b42c9..c4802f5 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -2997,13 +2997,16 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
>  				window. The default value is 64 megabytes.
>  		resource_alignment=
>  				Format:
> -				[<order of align>@][<domain>:]<bus>:<slot>.<func>[; ...]
> +				[<order of align>@][<domain>:]<bus>:<slot>.<func>
> +				[:noresize][; ...]
>  				Specifies alignment and device to reassign
>  				aligned memory resources.
>  				If <order of align> is not specified,
>  				PAGE_SIZE is used as alignment.
>  				PCI-PCI bridge can be specified, if resource
>  				windows need to be expanded.
> +				noresize: Don't change the resources' sizes when
> +				reassigning alignment.
>  		ecrc=		Enable/disable PCIe ECRC (transaction layer
>  				end-to-end CRC checking).
>  				bios: Use BIOS/firmware settings. This is the
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index a259394..3ee13e5 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4748,11 +4748,13 @@ static DEFINE_SPINLOCK(resource_alignment_lock);
>  /**
>   * pci_specified_resource_alignment - get resource alignment specified by user.
>   * @dev: the PCI device to get
> + * @resize: whether or not to change resources' size when reassigning alignment
>   *
>   * RETURNS: Resource alignment if it is specified.
>   *          Zero if it is not specified.
>   */
> -static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev)
> +static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev,
> +		bool *resize)
>  {
>  	int seg, bus, slot, func, align_order, count;
>  	resource_size_t align = 0;
> @@ -4786,6 +4788,11 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev)
>  			}
>  		}
>  		p += count;
> +		if (!strncmp(p, ":noresize", 9)) {
> +			*resize = false;
> +			p += 9;
> +		} else
> +			*resize = true;
>  		if (seg == pci_domain_nr(dev->bus) &&
>  			bus == dev->bus->number &&
>  			slot == PCI_SLOT(dev->devfn) &&
> @@ -4818,11 +4825,12 @@ void pci_reassigndev_resource_alignment(struct pci_dev *dev)
>  {
>  	int i;
>  	struct resource *r;
> +	bool resize = true;
>  	resource_size_t align, size;
>  	u16 command;
>  
>  	/* check if specified PCI is target device to reassign */
> -	align = pci_specified_resource_alignment(dev);
> +	align = pci_specified_resource_alignment(dev, &resize);
>  	if (!align)
>  		return;
>  
> @@ -4844,15 +4852,22 @@ void pci_reassigndev_resource_alignment(struct pci_dev *dev)
>  		if (!(r->flags & IORESOURCE_MEM))
>  			continue;
>  		size = resource_size(r);
> -		if (size < align) {
> -			size = align;
> -			dev_info(&dev->dev,
> -				"Rounding up size of resource #%d to %#llx.\n",
> -				i, (unsigned long long)size);
> +		if (resize) {
> +			if (size < align) {
> +				size = align;
> +				dev_info(&dev->dev,
> +					"Rounding up size of resource #%d to %#llx.\n",
> +					i, (unsigned long long)size);
> +			}
> +			r->flags |= IORESOURCE_UNSET;
> +			r->end = size - 1;
> +			r->start = 0;
> +		} else {
> +			r->flags &= ~IORESOURCE_SIZEALIGN;
> +			r->flags |= IORESOURCE_STARTALIGN | IORESOURCE_UNSET;
> +			r->start = max(align, size);
> +			r->end = r->start + size - 1;
>  		}
> -		r->flags |= IORESOURCE_UNSET;
> -		r->end = size - 1;
> -		r->start = 0;
>  	}
>  	/* Need to disable bridge's resource window,
>  	 * to enable the kernel to reassign new resource
> -- 
> 1.7.9.5
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RESEND PATCH v2 1/4] PCI: Ignore resource_alignment if PCI_PROBE_ONLY was set\\
  2016-06-21  1:43   ` [RESEND PATCH v2 1/4] PCI: Ignore resource_alignment if PCI_PROBE_ONLY was set\\ Bjorn Helgaas
@ 2016-06-21  2:16     ` Yongji Xie
  2016-06-21  8:38       ` Yongji Xie
  0 siblings, 1 reply; 14+ messages in thread
From: Yongji Xie @ 2016-06-21  2:16 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: nikunj, zhong, linux-doc, aik, linux-pci, corbet, linux-kernel,
	gwshan, warrier, alex.williamson, paulus, bhelgaas, linuxppc-dev

On 2016/6/21 9:43, Bjorn Helgaas wrote:

> On Thu, Jun 02, 2016 at 01:46:48PM +0800, Yongji Xie wrote:
>> The resource_alignment will releases memory resources allocated
>> by firmware so that kernel can reassign new resources later on.
>> But this will cause the problem that no resources can be
>> allocated by kernel if PCI_PROBE_ONLY was set, e.g. on pSeries
>> platform because PCI_PROBE_ONLY force kernel to use firmware
>> setup and not to reassign any resources.
>>
>> To solve this problem, this patch ignores resource_alignment
>> if PCI_PROBE_ONLY was set.
>>
>> Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
>> ---
>>   drivers/pci/pci.c |    6 ++++++
>>   1 file changed, 6 insertions(+)
>>
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index c8b4dbd..a259394 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -4761,6 +4761,12 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev)
>>   	spin_lock(&resource_alignment_lock);
>>   	p = resource_alignment_param;
>>   	while (*p) {
>> +		if (pci_has_flag(PCI_PROBE_ONLY)) {
>> +			printk(KERN_ERR "PCI: Ignore resource_alignment parameter: %s with PCI_PROBE_ONLY set\n",
>> +					p);
>> +			*p = 0;
>> +			break;
> Wouldn't it be simpler to make pci_set_resource_alignment_param() fail
> if PCI_PROBE_ONLY is set?

I add the check here because I want to print some logs so that users
could know the reason why resource_alignment doesn't work when
they add this parameter.

Thanks,
Yongji

>> +		}
>>   		count = 0;
>>   		if (sscanf(p, "%d%n", &align_order, &count) == 1 &&
>>   							p[count] == '@') {
>> -- 
>> 1.7.9.5
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RESEND PATCH v2 4/4] PCI: Add support for enforcing all MMIO BARs to be page aligned
  2016-06-02  5:46 ` [RESEND PATCH v2 4/4] PCI: Add support for enforcing all MMIO BARs to be page aligned Yongji Xie
@ 2016-06-21  2:26   ` Bjorn Helgaas
  2016-06-21  6:46     ` Yongji Xie
  0 siblings, 1 reply; 14+ messages in thread
From: Bjorn Helgaas @ 2016-06-21  2:26 UTC (permalink / raw)
  To: Yongji Xie
  Cc: linux-kernel, linux-pci, linuxppc-dev, linux-doc, bhelgaas,
	alex.williamson, aik, benh, paulus, mpe, corbet, warrier, zhong,
	nikunj, gwshan

On Thu, Jun 02, 2016 at 01:46:51PM +0800, Yongji Xie wrote:
> When vfio passthrough a PCI device of which MMIO BARs are
> smaller than PAGE_SIZE, guest will not handle the mmio
> accesses to the BARs which leads to mmio emulations in host.
> 
> This is because vfio will not allow to passthrough one BAR's
> mmio page which may be shared with other BARs. Otherwise,
> there will be a backdoor that guest can use to access BARs
> of other guest.
> 
> To solve this issue, this patch modifies resource_alignment
> to support syntax where multiple devices get the same
> alignment. So we can use something like
> "pci=resource_alignment=*:*:*.*:noresize" to enforce the
> alignment of all MMIO BARs to be at least PAGE_SIZE so that
> one BAR's mmio page would not be shared with other BARs.
> 
> And we also define a macro PCIBIOS_MIN_ALIGNMENT to enable this
> automatically on PPC64 platform which can easily hit this issue
> because its PAGE_SIZE is 64KB.
> 
> Note that this would not be applied to VFs whose BARs are always
> page aligned and should be never reassigned according to SRIOV
> spec.

I see that SR-IOV spec r1.1, sec 3.3.13 requires that all VF BAR
resources be aligned on System Page Size, and must be sized to consume
an integral number of pages.

Where does it say VF BARs can't be reassigned?  I thought they *could*
be reassigned, as long as VFs are disabled when you do it.

> Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
> ---
>  Documentation/kernel-parameters.txt |    2 ++
>  arch/powerpc/include/asm/pci.h      |    2 ++
>  drivers/pci/pci.c                   |   68 +++++++++++++++++++++++++++++------
>  3 files changed, 61 insertions(+), 11 deletions(-)
> 
> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
> index c4802f5..cb09503 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -3003,6 +3003,8 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
>  				aligned memory resources.
>  				If <order of align> is not specified,
>  				PAGE_SIZE is used as alignment.
> +				<domain>, <bus>, <slot> and <func> can be set to
> +				"*" which means match all values.
>  				PCI-PCI bridge can be specified, if resource
>  				windows need to be expanded.
>  				noresize: Don't change the resources' sizes when
> diff --git a/arch/powerpc/include/asm/pci.h b/arch/powerpc/include/asm/pci.h
> index a6f3ac0..742fd34 100644
> --- a/arch/powerpc/include/asm/pci.h
> +++ b/arch/powerpc/include/asm/pci.h
> @@ -28,6 +28,8 @@
>  #define PCIBIOS_MIN_IO		0x1000
>  #define PCIBIOS_MIN_MEM		0x10000000
>  
> +#define PCIBIOS_MIN_ALIGNMENT  PAGE_SIZE
> +
>  struct pci_dev;
>  
>  /* Values for the `which' argument to sys_pciconfig_iobase syscall.  */
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 3ee13e5..664f295 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4759,7 +4759,12 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev,
>  	int seg, bus, slot, func, align_order, count;
>  	resource_size_t align = 0;
>  	char *p;
> +	bool invalid = false;
>  
> +#ifdef PCIBIOS_MIN_ALIGNMENT
> +	align = PCIBIOS_MIN_ALIGNMENT;
> +	*resize = false;
> +#endif

This PCIBIOS_MIN_ALIGNMENT part should be a separate patch by itself.

If you have PCIBIOS_MIN_ALIGNMENT enabled automatically for powerpc,
do you still need the command-line argument?

>  	spin_lock(&resource_alignment_lock);
>  	p = resource_alignment_param;
>  	while (*p) {
> @@ -4776,16 +4781,49 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev,
>  		} else {
>  			align_order = -1;
>  		}
> -		if (sscanf(p, "%x:%x:%x.%x%n",
> -			&seg, &bus, &slot, &func, &count) != 4) {
> +		if (p[0] == '*' && p[1] == ':') {
> +			seg = -1;
> +			count = 1;
> +		} else if (sscanf(p, "%x%n", &seg, &count) != 1 ||
> +				p[count] != ':') {
> +			invalid = true;
> +			break;
> +		}
> +		p += count + 1;
> +		if (*p == '*') {
> +			bus = -1;
> +			count = 1;
> +		} else if (sscanf(p, "%x%n", &bus, &count) != 1) {
> +			invalid = true;
> +			break;
> +		}
> +		p += count;
> +		if (*p == '.') {
> +			slot = bus;
> +			bus = seg;
>  			seg = 0;
> -			if (sscanf(p, "%x:%x.%x%n",
> -					&bus, &slot, &func, &count) != 3) {
> -				/* Invalid format */
> -				printk(KERN_ERR "PCI: Can't parse resource_alignment parameter: %s\n",
> -					p);
> +			p++;
> +		} else if (*p == ':') {
> +			p++;
> +			if (p[0] == '*' && p[1] == '.') {
> +				slot = -1;
> +				count = 1;
> +			} else if (sscanf(p, "%x%n", &slot, &count) != 1 ||
> +					p[count] != '.') {
> +				invalid = true;
>  				break;
>  			}
> +			p += count + 1;
> +		} else {
> +			invalid = true;
> +			break;
> +		}
> +		if (*p == '*') {
> +			func = -1;
> +			count = 1;
> +		} else if (sscanf(p, "%x%n", &func, &count) != 1) {
> +			invalid = true;
> +			break;
>  		}
>  		p += count;
>  		if (!strncmp(p, ":noresize", 9)) {
> @@ -4793,10 +4831,10 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev,
>  			p += 9;
>  		} else
>  			*resize = true;
> -		if (seg == pci_domain_nr(dev->bus) &&
> -			bus == dev->bus->number &&
> -			slot == PCI_SLOT(dev->devfn) &&
> -			func == PCI_FUNC(dev->devfn)) {
> +		if ((seg == pci_domain_nr(dev->bus) || seg == -1) &&
> +			(bus == dev->bus->number || bus == -1) &&
> +			(slot == PCI_SLOT(dev->devfn) || slot == -1) &&
> +			(func == PCI_FUNC(dev->devfn) || func == -1)) {
>  			if (align_order == -1)
>  				align = PAGE_SIZE;
>  			else
> @@ -4806,10 +4844,14 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev,
>  		}
>  		if (*p != ';' && *p != ',') {
>  			/* End of param or invalid format */
> +			invalid = true;
>  			break;
>  		}
>  		p++;
>  	}
> +	if (invalid)
> +		printk(KERN_ERR "PCI: Can't parse resource_alignment parameter:%s\n",
> +				p);
>  	spin_unlock(&resource_alignment_lock);
>  	return align;
>  }
> @@ -4829,6 +4871,10 @@ void pci_reassigndev_resource_alignment(struct pci_dev *dev)
>  	resource_size_t align, size;
>  	u16 command;
>  
> +	/* We should never try to reassign VF's alignment */
> +	if (dev->is_virtfn)
> +		return;

This part looks like a bugfix that should be in a separate patch.

I assume this is because VFs have no read/write BARs themselves.  A PF
has the usual read/write BAR0-BAR5 at offsets 0x10-0x24, as well as
read/write VF BAR0-BAR5 in the SR-IOV capability.  The VF BARs in the
SR-IOV capability determine the resources assigned for VFs.

For the VFs themselves, BAR0-BAR5 at offsets 0x10-0x24 are read-only
zeroes (SR-IOV spec r1.1., sec 3.4.1.11), and there is no SR-IOV
capability.

Right?

>  	/* check if specified PCI is target device to reassign */
>  	align = pci_specified_resource_alignment(dev, &resize);
>  	if (!align)
> -- 
> 1.7.9.5
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RESEND PATCH v2 2/4] PCI: Do not Use IORESOURCE_STARTALIGN to identify bridge resources
  2016-06-21  1:50   ` Bjorn Helgaas
@ 2016-06-21  2:59     ` Yongji Xie
  0 siblings, 0 replies; 14+ messages in thread
From: Yongji Xie @ 2016-06-21  2:59 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: nikunj, zhong, linux-doc, aik, linux-pci, corbet, linux-kernel,
	gwshan, warrier, alex.williamson, paulus, bhelgaas, linuxppc-dev

On 2016/6/21 9:50, Bjorn Helgaas wrote:

> On Thu, Jun 02, 2016 at 01:46:49PM +0800, Yongji Xie wrote:
>> Now we use the IORESOURCE_STARTALIGN to identify bridge resources
>> in __assign_resources_sorted(). That's quite fragile. We can't
>> make sure that the PCI devices' resources will not use
>> IORESOURCE_STARTALIGN any more.
> Can you explain this a little more?  I don't quite understand the
> problem.  Maybe you can give an example of the problem?

Now there are two kinds of additional resources in the list: bridge
resource and SR-IOV resource. Here we just want to fix the
additional alignment for bridge. But if SR-IOV resource get the
flag IORESOURCE_STARTALIGN which is not only for bridge, the
current check seems to be wrong. And kernel parameter
"resource_alignment" could set IORESOURCE_STARTALIGN for
SR-IOV resource with my next patch applied.

Thanks,
Yongji

>> In this patch, we try to use a more robust way to identify
>> bridge resources.
>>
>> Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
>> ---
>>   drivers/pci/setup-bus.c |    9 ++++++---
>>   1 file changed, 6 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
>> index 55641a3..216ddbc 100644
>> --- a/drivers/pci/setup-bus.c
>> +++ b/drivers/pci/setup-bus.c
>> @@ -390,6 +390,7 @@ static void __assign_resources_sorted(struct list_head *head,
>>   	struct pci_dev_resource *dev_res, *tmp_res, *dev_res2;
>>   	unsigned long fail_type;
>>   	resource_size_t add_align, align;
>> +	int index;
>>   
>>   	/* Check if optional add_size is there */
>>   	if (!realloc_head || list_empty(realloc_head))
>> @@ -410,11 +411,13 @@ static void __assign_resources_sorted(struct list_head *head,
>>   
>>   		/*
>>   		 * There are two kinds of additional resources in the list:
>> -		 * 1. bridge resource  -- IORESOURCE_STARTALIGN
>> -		 * 2. SR-IOV resource   -- IORESOURCE_SIZEALIGN
>> +		 * 1. bridge resource
>> +		 * 2. SR-IOV resource
>>   		 * Here just fix the additional alignment for bridge
>>   		 */
>> -		if (!(dev_res->res->flags & IORESOURCE_STARTALIGN))
>> +		index = dev_res->res - dev_res->dev->resource;
>> +		if (index < PCI_BRIDGE_RESOURCES ||
>> +			index > PCI_BRIDGE_RESOURCE_END)
> I think the code looks OK; at least, it seems to match the comment.
> I'd just like to understand the problem a little better.
>
>>   			continue;
>>   
>>   		add_align = get_res_add_align(realloc_head, dev_res->res);
>> -- 
>> 1.7.9.5
>>
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RESEND PATCH v2 3/4] PCI: Add a new option for resource_alignment to reassign alignment
  2016-06-21  1:57   ` Bjorn Helgaas
@ 2016-06-21  3:26     ` Yongji Xie
  0 siblings, 0 replies; 14+ messages in thread
From: Yongji Xie @ 2016-06-21  3:26 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-kernel, linux-pci, linuxppc-dev, linux-doc, bhelgaas,
	alex.williamson, aik, benh, paulus, mpe, corbet, warrier, zhong,
	nikunj, gwshan

On 2016/6/21 9:57, Bjorn Helgaas wrote:

> On Thu, Jun 02, 2016 at 01:46:50PM +0800, Yongji Xie wrote:
>> When using resource_alignment kernel parameter, the current
>> implement reassigns the alignment by changing resources' size
>> which can potentially break some drivers. For example, the driver
>> uses the size to locate some register whose length is related
>> to the size.
>>
>> This patch adds a new option "noresize" for the parameter to
>> solve this problem.
> Why do we ever want to change the resource's size?  I understand that
> you want to change the resource's *alignment*, and that part makes
> sense.  But why change the *size*?  Changing the resource size doesn't
> change the hardware BAR size; it just means the struct resource will
> describe a region larger than what the BAR actually claims.  That
> unnecessarily wastes space after the BAR.
>
> This was a problem with the code even before your patch; I'm
> suggesting that if you have a way to change the alignment without
> changing the resource size, maybe we should do that all the time.
> Then you wouldn't need to add the "noresize" option.

Yes, changing resource's size seems not a good idea. But
would it break some existing systems which are using this
kernel parameter if we remove the "noresize" option and change
the alignment all the time?

Thanks,
Yongji

>> Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
>> ---
>>   Documentation/kernel-parameters.txt |    5 ++++-
>>   drivers/pci/pci.c                   |   35 +++++++++++++++++++++++++----------
>>   2 files changed, 29 insertions(+), 11 deletions(-)
>>
>> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
>> index 82b42c9..c4802f5 100644
>> --- a/Documentation/kernel-parameters.txt
>> +++ b/Documentation/kernel-parameters.txt
>> @@ -2997,13 +2997,16 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
>>   				window. The default value is 64 megabytes.
>>   		resource_alignment=
>>   				Format:
>> -				[<order of align>@][<domain>:]<bus>:<slot>.<func>[; ...]
>> +				[<order of align>@][<domain>:]<bus>:<slot>.<func>
>> +				[:noresize][; ...]
>>   				Specifies alignment and device to reassign
>>   				aligned memory resources.
>>   				If <order of align> is not specified,
>>   				PAGE_SIZE is used as alignment.
>>   				PCI-PCI bridge can be specified, if resource
>>   				windows need to be expanded.
>> +				noresize: Don't change the resources' sizes when
>> +				reassigning alignment.
>>   		ecrc=		Enable/disable PCIe ECRC (transaction layer
>>   				end-to-end CRC checking).
>>   				bios: Use BIOS/firmware settings. This is the
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index a259394..3ee13e5 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -4748,11 +4748,13 @@ static DEFINE_SPINLOCK(resource_alignment_lock);
>>   /**
>>    * pci_specified_resource_alignment - get resource alignment specified by user.
>>    * @dev: the PCI device to get
>> + * @resize: whether or not to change resources' size when reassigning alignment
>>    *
>>    * RETURNS: Resource alignment if it is specified.
>>    *          Zero if it is not specified.
>>    */
>> -static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev)
>> +static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev,
>> +		bool *resize)
>>   {
>>   	int seg, bus, slot, func, align_order, count;
>>   	resource_size_t align = 0;
>> @@ -4786,6 +4788,11 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev)
>>   			}
>>   		}
>>   		p += count;
>> +		if (!strncmp(p, ":noresize", 9)) {
>> +			*resize = false;
>> +			p += 9;
>> +		} else
>> +			*resize = true;
>>   		if (seg == pci_domain_nr(dev->bus) &&
>>   			bus == dev->bus->number &&
>>   			slot == PCI_SLOT(dev->devfn) &&
>> @@ -4818,11 +4825,12 @@ void pci_reassigndev_resource_alignment(struct pci_dev *dev)
>>   {
>>   	int i;
>>   	struct resource *r;
>> +	bool resize = true;
>>   	resource_size_t align, size;
>>   	u16 command;
>>   
>>   	/* check if specified PCI is target device to reassign */
>> -	align = pci_specified_resource_alignment(dev);
>> +	align = pci_specified_resource_alignment(dev, &resize);
>>   	if (!align)
>>   		return;
>>   
>> @@ -4844,15 +4852,22 @@ void pci_reassigndev_resource_alignment(struct pci_dev *dev)
>>   		if (!(r->flags & IORESOURCE_MEM))
>>   			continue;
>>   		size = resource_size(r);
>> -		if (size < align) {
>> -			size = align;
>> -			dev_info(&dev->dev,
>> -				"Rounding up size of resource #%d to %#llx.\n",
>> -				i, (unsigned long long)size);
>> +		if (resize) {
>> +			if (size < align) {
>> +				size = align;
>> +				dev_info(&dev->dev,
>> +					"Rounding up size of resource #%d to %#llx.\n",
>> +					i, (unsigned long long)size);
>> +			}
>> +			r->flags |= IORESOURCE_UNSET;
>> +			r->end = size - 1;
>> +			r->start = 0;
>> +		} else {
>> +			r->flags &= ~IORESOURCE_SIZEALIGN;
>> +			r->flags |= IORESOURCE_STARTALIGN | IORESOURCE_UNSET;
>> +			r->start = max(align, size);
>> +			r->end = r->start + size - 1;
>>   		}
>> -		r->flags |= IORESOURCE_UNSET;
>> -		r->end = size - 1;
>> -		r->start = 0;
>>   	}
>>   	/* Need to disable bridge's resource window,
>>   	 * to enable the kernel to reassign new resource
>> -- 
>> 1.7.9.5
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RESEND PATCH v2 4/4] PCI: Add support for enforcing all MMIO BARs to be page aligned
  2016-06-21  2:26   ` Bjorn Helgaas
@ 2016-06-21  6:46     ` Yongji Xie
  0 siblings, 0 replies; 14+ messages in thread
From: Yongji Xie @ 2016-06-21  6:46 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: nikunj, zhong, linux-doc, aik, linux-pci, corbet, linux-kernel,
	gwshan, warrier, alex.williamson, paulus, bhelgaas, linuxppc-dev

On 2016/6/21 10:26, Bjorn Helgaas wrote:

> On Thu, Jun 02, 2016 at 01:46:51PM +0800, Yongji Xie wrote:
>> When vfio passthrough a PCI device of which MMIO BARs are
>> smaller than PAGE_SIZE, guest will not handle the mmio
>> accesses to the BARs which leads to mmio emulations in host.
>>
>> This is because vfio will not allow to passthrough one BAR's
>> mmio page which may be shared with other BARs. Otherwise,
>> there will be a backdoor that guest can use to access BARs
>> of other guest.
>>
>> To solve this issue, this patch modifies resource_alignment
>> to support syntax where multiple devices get the same
>> alignment. So we can use something like
>> "pci=resource_alignment=*:*:*.*:noresize" to enforce the
>> alignment of all MMIO BARs to be at least PAGE_SIZE so that
>> one BAR's mmio page would not be shared with other BARs.
>>
>> And we also define a macro PCIBIOS_MIN_ALIGNMENT to enable this
>> automatically on PPC64 platform which can easily hit this issue
>> because its PAGE_SIZE is 64KB.
>>
>> Note that this would not be applied to VFs whose BARs are always
>> page aligned and should be never reassigned according to SRIOV
>> spec.
> I see that SR-IOV spec r1.1, sec 3.3.13 requires that all VF BAR
> resources be aligned on System Page Size, and must be sized to consume
> an integral number of pages.
>
> Where does it say VF BARs can't be reassigned?  I thought they *could*
> be reassigned, as long as VFs are disabled when you do it.

Oh, sorry. I made a mistake here. We can reassign VF BARs by writing the
alignment to System Page Size(20h) when VFs are disabled.

As you said below,  VF BARs are read-only zeroes, the normal way(writing
BARs) of resources allocation wouldn't be applied to VFs. The resources
allocation of VFs have been determined when we enable SR-IOV capability.
So we should not touch VF BARs here. It's useless and will release the
allocated resources of VFs which leads to a bug.

>> Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
>> ---
>>   Documentation/kernel-parameters.txt |    2 ++
>>   arch/powerpc/include/asm/pci.h      |    2 ++
>>   drivers/pci/pci.c                   |   68 +++++++++++++++++++++++++++++------
>>   3 files changed, 61 insertions(+), 11 deletions(-)
>>
>> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
>> index c4802f5..cb09503 100644
>> --- a/Documentation/kernel-parameters.txt
>> +++ b/Documentation/kernel-parameters.txt
>> @@ -3003,6 +3003,8 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
>>   				aligned memory resources.
>>   				If <order of align> is not specified,
>>   				PAGE_SIZE is used as alignment.
>> +				<domain>, <bus>, <slot> and <func> can be set to
>> +				"*" which means match all values.
>>   				PCI-PCI bridge can be specified, if resource
>>   				windows need to be expanded.
>>   				noresize: Don't change the resources' sizes when
>> diff --git a/arch/powerpc/include/asm/pci.h b/arch/powerpc/include/asm/pci.h
>> index a6f3ac0..742fd34 100644
>> --- a/arch/powerpc/include/asm/pci.h
>> +++ b/arch/powerpc/include/asm/pci.h
>> @@ -28,6 +28,8 @@
>>   #define PCIBIOS_MIN_IO		0x1000
>>   #define PCIBIOS_MIN_MEM		0x10000000
>>   
>> +#define PCIBIOS_MIN_ALIGNMENT  PAGE_SIZE
>> +
>>   struct pci_dev;
>>   
>>   /* Values for the `which' argument to sys_pciconfig_iobase syscall.  */
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index 3ee13e5..664f295 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -4759,7 +4759,12 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev,
>>   	int seg, bus, slot, func, align_order, count;
>>   	resource_size_t align = 0;
>>   	char *p;
>> +	bool invalid = false;
>>   
>> +#ifdef PCIBIOS_MIN_ALIGNMENT
>> +	align = PCIBIOS_MIN_ALIGNMENT;
>> +	*resize = false;
>> +#endif
> This PCIBIOS_MIN_ALIGNMENT part should be a separate patch by itself.

OK, I will.

> If you have PCIBIOS_MIN_ALIGNMENT enabled automatically for powerpc,
> do you still need the command-line argument?

Other archs may benefit from this. And using command-line seems to be
more flexible that we can enable/disable this feature dynamically.

>>   	spin_lock(&resource_alignment_lock);
>>   	p = resource_alignment_param;
>>   	while (*p) {
>> @@ -4776,16 +4781,49 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev,
>>   		} else {
>>   			align_order = -1;
>>   		}
>> -		if (sscanf(p, "%x:%x:%x.%x%n",
>> -			&seg, &bus, &slot, &func, &count) != 4) {
>> +		if (p[0] == '*' && p[1] == ':') {
>> +			seg = -1;
>> +			count = 1;
>> +		} else if (sscanf(p, "%x%n", &seg, &count) != 1 ||
>> +				p[count] != ':') {
>> +			invalid = true;
>> +			break;
>> +		}
>> +		p += count + 1;
>> +		if (*p == '*') {
>> +			bus = -1;
>> +			count = 1;
>> +		} else if (sscanf(p, "%x%n", &bus, &count) != 1) {
>> +			invalid = true;
>> +			break;
>> +		}
>> +		p += count;
>> +		if (*p == '.') {
>> +			slot = bus;
>> +			bus = seg;
>>   			seg = 0;
>> -			if (sscanf(p, "%x:%x.%x%n",
>> -					&bus, &slot, &func, &count) != 3) {
>> -				/* Invalid format */
>> -				printk(KERN_ERR "PCI: Can't parse resource_alignment parameter: %s\n",
>> -					p);
>> +			p++;
>> +		} else if (*p == ':') {
>> +			p++;
>> +			if (p[0] == '*' && p[1] == '.') {
>> +				slot = -1;
>> +				count = 1;
>> +			} else if (sscanf(p, "%x%n", &slot, &count) != 1 ||
>> +					p[count] != '.') {
>> +				invalid = true;
>>   				break;
>>   			}
>> +			p += count + 1;
>> +		} else {
>> +			invalid = true;
>> +			break;
>> +		}
>> +		if (*p == '*') {
>> +			func = -1;
>> +			count = 1;
>> +		} else if (sscanf(p, "%x%n", &func, &count) != 1) {
>> +			invalid = true;
>> +			break;
>>   		}
>>   		p += count;
>>   		if (!strncmp(p, ":noresize", 9)) {
>> @@ -4793,10 +4831,10 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev,
>>   			p += 9;
>>   		} else
>>   			*resize = true;
>> -		if (seg == pci_domain_nr(dev->bus) &&
>> -			bus == dev->bus->number &&
>> -			slot == PCI_SLOT(dev->devfn) &&
>> -			func == PCI_FUNC(dev->devfn)) {
>> +		if ((seg == pci_domain_nr(dev->bus) || seg == -1) &&
>> +			(bus == dev->bus->number || bus == -1) &&
>> +			(slot == PCI_SLOT(dev->devfn) || slot == -1) &&
>> +			(func == PCI_FUNC(dev->devfn) || func == -1)) {
>>   			if (align_order == -1)
>>   				align = PAGE_SIZE;
>>   			else
>> @@ -4806,10 +4844,14 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev,
>>   		}
>>   		if (*p != ';' && *p != ',') {
>>   			/* End of param or invalid format */
>> +			invalid = true;
>>   			break;
>>   		}
>>   		p++;
>>   	}
>> +	if (invalid)
>> +		printk(KERN_ERR "PCI: Can't parse resource_alignment parameter:%s\n",
>> +				p);
>>   	spin_unlock(&resource_alignment_lock);
>>   	return align;
>>   }
>> @@ -4829,6 +4871,10 @@ void pci_reassigndev_resource_alignment(struct pci_dev *dev)
>>   	resource_size_t align, size;
>>   	u16 command;
>>   
>> +	/* We should never try to reassign VF's alignment */
>> +	if (dev->is_virtfn)
>> +		return;
> This part looks like a bugfix that should be in a separate patch.

Yes, it's a bugfix. VFs would not work if we enable the reassignment to them.
  

> I assume this is because VFs have no read/write BARs themselves.  A PF
> has the usual read/write BAR0-BAR5 at offsets 0x10-0x24, as well as
> read/write VF BAR0-BAR5 in the SR-IOV capability.  The VF BARs in the
> SR-IOV capability determine the resources assigned for VFs.
>
> For the VFs themselves, BAR0-BAR5 at offsets 0x10-0x24 are read-only
> zeroes (SR-IOV spec r1.1., sec 3.4.1.11), and there is no SR-IOV
> capability.
>
> Right?

You are right. The resources should not be reassigned after we
enable VFs. It's useless because of the read-only BARs and will release
the resources allocated in sriov_enable().

Thanks,
Yongji

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RESEND PATCH v2 1/4] PCI: Ignore resource_alignment if PCI_PROBE_ONLY was set\\
  2016-06-21  2:16     ` Yongji Xie
@ 2016-06-21  8:38       ` Yongji Xie
  0 siblings, 0 replies; 14+ messages in thread
From: Yongji Xie @ 2016-06-21  8:38 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: nikunj, zhong, corbet, aik, linux-pci, linux-doc, linux-kernel,
	gwshan, bhelgaas, alex.williamson, paulus, warrier, linuxppc-dev

On 2016/6/21 10:16, Yongji Xie wrote:

> On 2016/6/21 9:43, Bjorn Helgaas wrote:
>
>> On Thu, Jun 02, 2016 at 01:46:48PM +0800, Yongji Xie wrote:
>>> The resource_alignment will releases memory resources allocated
>>> by firmware so that kernel can reassign new resources later on.
>>> But this will cause the problem that no resources can be
>>> allocated by kernel if PCI_PROBE_ONLY was set, e.g. on pSeries
>>> platform because PCI_PROBE_ONLY force kernel to use firmware
>>> setup and not to reassign any resources.
>>>
>>> To solve this problem, this patch ignores resource_alignment
>>> if PCI_PROBE_ONLY was set.
>>>
>>> Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
>>> ---
>>>   drivers/pci/pci.c |    6 ++++++
>>>   1 file changed, 6 insertions(+)
>>>
>>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>>> index c8b4dbd..a259394 100644
>>> --- a/drivers/pci/pci.c
>>> +++ b/drivers/pci/pci.c
>>> @@ -4761,6 +4761,12 @@ static resource_size_t 
>>> pci_specified_resource_alignment(struct pci_dev *dev)
>>>       spin_lock(&resource_alignment_lock);
>>>       p = resource_alignment_param;
>>>       while (*p) {
>>> +        if (pci_has_flag(PCI_PROBE_ONLY)) {
>>> +            printk(KERN_ERR "PCI: Ignore resource_alignment 
>>> parameter: %s with PCI_PROBE_ONLY set\n",
>>> +                    p);
>>> +            *p = 0;
>>> +            break;
>> Wouldn't it be simpler to make pci_set_resource_alignment_param() fail
>> if PCI_PROBE_ONLY is set?
>
> I add the check here because I want to print some logs so that users
> could know the reason why resource_alignment doesn't work when
> they add this parameter.
>
> Thanks,
> Yongji
>

Sorry, please ignore the previous reply. I didn't add this check in
pci_set_resource_alignment_param() because PCI_PROBE_ONLY
may be set after we parse "resource_alignment".

And it seems that printk_once() may be better here so that
we don't need to set *p = 0.

Thanks,
Yongji

>>> +        }
>>>           count = 0;
>>>           if (sscanf(p, "%d%n", &align_order, &count) == 1 &&
>>>                               p[count] == '@') {
>>> -- 
>>> 1.7.9.5
>>>
>>> -- 
>>> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>> _______________________________________________
>> Linuxppc-dev mailing list
>> Linuxppc-dev@lists.ozlabs.org
>> https://lists.ozlabs.org/listinfo/linuxppc-dev
>
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2016-06-21  8:38 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-02  5:46 [RESEND PATCH v2 0/4] PCI: Add support for enforcing all MMIO BARs not to share PAGE_SIZE Yongji Xie
2016-06-02  5:46 ` [RESEND PATCH v2 1/4] PCI: Ignore resource_alignment if PCI_PROBE_ONLY was set Yongji Xie
2016-06-21  1:43   ` [RESEND PATCH v2 1/4] PCI: Ignore resource_alignment if PCI_PROBE_ONLY was set\\ Bjorn Helgaas
2016-06-21  2:16     ` Yongji Xie
2016-06-21  8:38       ` Yongji Xie
2016-06-02  5:46 ` [RESEND PATCH v2 2/4] PCI: Do not Use IORESOURCE_STARTALIGN to identify bridge resources Yongji Xie
2016-06-21  1:50   ` Bjorn Helgaas
2016-06-21  2:59     ` Yongji Xie
2016-06-02  5:46 ` [RESEND PATCH v2 3/4] PCI: Add a new option for resource_alignment to reassign alignment Yongji Xie
2016-06-21  1:57   ` Bjorn Helgaas
2016-06-21  3:26     ` Yongji Xie
2016-06-02  5:46 ` [RESEND PATCH v2 4/4] PCI: Add support for enforcing all MMIO BARs to be page aligned Yongji Xie
2016-06-21  2:26   ` Bjorn Helgaas
2016-06-21  6:46     ` Yongji Xie

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).