linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v9 0/3] PCI: Introduce a way to enforce all MMIO BARs not to share PAGE_SIZE
@ 2017-02-15  6:45 Yongji Xie
  2017-02-15  6:45 ` [PATCH v9 1/3] PCI: A fix for caculating bridge window's size and alignment Yongji Xie
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Yongji Xie @ 2017-02-15  6:45 UTC (permalink / raw)
  To: bhelgaas
  Cc: linux-pci, linuxppc-dev, alex.williamson, gwshan, aik, benh, mpe,
	paulus, zhong

This series introduces a way for PCI resource allocator to force
MMIO BARs not to share PAGE_SIZE. This would make sense to VFIO
driver. Because current VFIO implementation disallows to mmap
sub-page(size < PAGE_SIZE) MMIO BARs which may share the same page
with other BARs for security reasons. Thus, we have to handle mmio
access to these BARs in QEMU emulation rather than in guest which
will cause some performance loss.

In our solution, we try to make use of the existing code path of
resource_alignment kernel parameter and add a macro to set default
alignment for it. Thus we can define this macro by default on some
archs which may easily hit the performance issue because of their
64K page.

In this series, patch 1 fixes a bug related to bridge window
size/alignment caculating; patch 2,3 add support for setting default
alignment of all MMIO BAR.

Changelog v9:
- Add a patch to fix for caculating bridge window's size and alignment
- Remove an unrelated patch
- Rework the patch that fix bug that reassign resources's alignment by
  changing its size

Changelog v8:
- Rebased against v4.10-rc4
- Rework the patch 2
- Change the commit log of patch 1

Changelog v7:
- Rebased against v4.9-rc2
- Drop two merged patches
- Rework the patch which fix a bug that resources's size is changed when
  using resource_alignment
- Add a patch that fix a bug for IOV BARs when using resource_alignment

Changelog v6: 
- Remove the option "noresize@" of resource_alignment

Changelog v5: 
- Rebased against v4.8-rc6
- Drop the patch that forbidding disable memory decoding in
  pci_reassigndev_resource_alignment()

Changelog v4: 
- Rebased against v4.8-rc1
- Drop one irrelevant patch
- Drop the patch that adding wildcard to resource_alignment to enforce
  the alignment of all MMIO BARs to be at least PAGE_SIZE
- Change the format of option "noresize" of resource_alignment
- Code style improvements

Changelog v3: 
- Ignore enforced alignment to fixed BARs
- Fix issue that disabling memory decoding when reassigning the alignment
- Only enable default alignment on PowerNV platform

Changelog v2:
- Ignore enforced alignment to VF BARs on pci_reassigndev_resource_alignment()

Yongji Xie (3):
  PCI: A fix for caculating bridge window's size and alignment
  PCI: Add a macro to set default alignment for all PCI devices
  PCI: Don't extend device's size when using default alignment for all devices

 arch/powerpc/include/asm/pci.h |    4 ++++
 drivers/pci/pci.c              |   37 +++++++++++++++++++++++++++----------
 drivers/pci/setup-bus.c        |    4 ++--
 3 files changed, 33 insertions(+), 12 deletions(-)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v9 1/3] PCI: A fix for caculating bridge window's size and alignment
  2017-02-15  6:45 [PATCH v9 0/3] PCI: Introduce a way to enforce all MMIO BARs not to share PAGE_SIZE Yongji Xie
@ 2017-02-15  6:45 ` Yongji Xie
  2017-02-15  6:45 ` [PATCH v9 2/3] PCI: Add a macro to set default alignment for all PCI devices Yongji Xie
  2017-02-15  6:45 ` [PATCH v9 3/3] PCI: Don't extend device's size when using default alignment for all devices Yongji Xie
  2 siblings, 0 replies; 12+ messages in thread
From: Yongji Xie @ 2017-02-15  6:45 UTC (permalink / raw)
  To: bhelgaas
  Cc: linux-pci, linuxppc-dev, alex.williamson, gwshan, aik, benh, mpe,
	paulus, zhong

In case that one device's alignment is greater than its size,
we may get an incorrect size and alignment for its bus's memory
window in pbus_size_mem(). This patch fixes this case.

Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
---
 drivers/pci/setup-bus.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index f30ca75..b3cabc2 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1075,10 +1075,10 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask,
 				r->flags = 0;
 				continue;
 			}
-			size += r_size;
+			size += max(r_size, align);
 			/* Exclude ranges with size > align from
 			   calculation of the alignment. */
-			if (r_size == align)
+			if (r_size <= align)
 				aligns[order] += align;
 			if (order > max_order)
 				max_order = order;
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v9 2/3] PCI: Add a macro to set default alignment for all PCI devices
  2017-02-15  6:45 [PATCH v9 0/3] PCI: Introduce a way to enforce all MMIO BARs not to share PAGE_SIZE Yongji Xie
  2017-02-15  6:45 ` [PATCH v9 1/3] PCI: A fix for caculating bridge window's size and alignment Yongji Xie
@ 2017-02-15  6:45 ` Yongji Xie
  2017-03-23 20:53   ` Bjorn Helgaas
  2017-02-15  6:45 ` [PATCH v9 3/3] PCI: Don't extend device's size when using default alignment for all devices Yongji Xie
  2 siblings, 1 reply; 12+ messages in thread
From: Yongji Xie @ 2017-02-15  6:45 UTC (permalink / raw)
  To: bhelgaas
  Cc: linux-pci, linuxppc-dev, alex.williamson, gwshan, aik, benh, mpe,
	paulus, zhong

When vfio passthroughs a PCI device of which MMIO BARs are
smaller than PAGE_SIZE, guest will not handle the mmio
accesses to the BARs which leads to mmio emulations in host.

This is because vfio will not allow to passthrough one BAR's
mmio page which may be shared with other BARs. Otherwise,
there will be a backdoor that guest can use to access BARs
of other guest.

This patch adds a macro to set default alignment for all
PCI devices. Then we could solve this issue on some platforms
which would easily hit this issue because of their 64K page
such as PowerNV platform by defining this macro as PAGE_SIZE.

Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/pci.h |    4 ++++
 drivers/pci/pci.c              |    3 +++
 2 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/pci.h b/arch/powerpc/include/asm/pci.h
index e9bd6cf..5e31bc2 100644
--- a/arch/powerpc/include/asm/pci.h
+++ b/arch/powerpc/include/asm/pci.h
@@ -28,6 +28,10 @@
 #define PCIBIOS_MIN_IO		0x1000
 #define PCIBIOS_MIN_MEM		0x10000000
 
+#ifdef CONFIG_PPC_POWERNV
+#define PCIBIOS_DEFAULT_ALIGNMENT	PAGE_SIZE
+#endif
+
 struct pci_dev;
 
 /* Values for the `which' argument to sys_pciconfig_iobase syscall.  */
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index a881c0d..2622e9b 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4965,6 +4965,9 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev)
 	resource_size_t align = 0;
 	char *p;
 
+#ifdef PCIBIOS_DEFAULT_ALIGNMENT
+	align = PCIBIOS_DEFAULT_ALIGNMENT;
+#endif
 	spin_lock(&resource_alignment_lock);
 	p = resource_alignment_param;
 	if (!*p)
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v9 3/3] PCI: Don't extend device's size when using default alignment for all devices
  2017-02-15  6:45 [PATCH v9 0/3] PCI: Introduce a way to enforce all MMIO BARs not to share PAGE_SIZE Yongji Xie
  2017-02-15  6:45 ` [PATCH v9 1/3] PCI: A fix for caculating bridge window's size and alignment Yongji Xie
  2017-02-15  6:45 ` [PATCH v9 2/3] PCI: Add a macro to set default alignment for all PCI devices Yongji Xie
@ 2017-02-15  6:45 ` Yongji Xie
  2017-03-23 21:55   ` Bjorn Helgaas
  2 siblings, 1 reply; 12+ messages in thread
From: Yongji Xie @ 2017-02-15  6:45 UTC (permalink / raw)
  To: bhelgaas
  Cc: linux-pci, linuxppc-dev, alex.williamson, gwshan, aik, benh, mpe,
	paulus, zhong

Currently we reassign the alignment by extending resources' size in
pci_reassigndev_resource_alignment(). This could potentially break
some drivers when the driver uses the size to locate register
whose length is related to the size. Some examples as below:

- misc\Hpilo.c:
off = pci_resource_len(pdev, bar) - 0x2000;

- net\ethernet\chelsio\cxgb4\cxgb4_uld.h:
(pci_resource_len((pdev), 2) - roundup_pow_of_two((vres)->ocq.size))

- infiniband\hw\nes\Nes_hw.c:
num_pds = pci_resource_len(nesdev->pcidev, BAR_1) >> PAGE_SHIFT;

This risk could be easily prevented before because we only had one way
(kernel parameter resource_alignment) to touch those codes. And even
some users may be happy to see the extended size.

But now we define a default alignment for all devices which would also
touch those codes. It would be hard to prevent the risk in this case. So
this patch tries to use START_ALIGNMENT to identify the resource's
alignment without extending the size when the alignment reassigning is
caused by the default alignment.

Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
---
 drivers/pci/pci.c |   34 ++++++++++++++++++++++++----------
 1 files changed, 24 insertions(+), 10 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 2622e9b..0017e5e 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4958,7 +4958,8 @@ void pci_ignore_hotplug(struct pci_dev *dev)
  * RETURNS: Resource alignment if it is specified.
  *          Zero if it is not specified.
  */
-static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev)
+static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev,
+							bool *resize)
 {
 	int seg, bus, slot, func, align_order, count;
 	unsigned short vendor, device, subsystem_vendor, subsystem_device;
@@ -5002,6 +5003,7 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev)
 				(!device || (device == dev->device)) &&
 				(!subsystem_vendor || (subsystem_vendor == dev->subsystem_vendor)) &&
 				(!subsystem_device || (subsystem_device == dev->subsystem_device))) {
+				*resize = true;
 				if (align_order == -1)
 					align = PAGE_SIZE;
 				else
@@ -5027,6 +5029,7 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev)
 				bus == dev->bus->number &&
 				slot == PCI_SLOT(dev->devfn) &&
 				func == PCI_FUNC(dev->devfn)) {
+				*resize = true;
 				if (align_order == -1)
 					align = PAGE_SIZE;
 				else
@@ -5059,6 +5062,7 @@ void pci_reassigndev_resource_alignment(struct pci_dev *dev)
 	struct resource *r;
 	resource_size_t align, size;
 	u16 command;
+	bool resize = false;
 
 	/*
 	 * VF BARs are read-only zero according to SR-IOV spec r1.1, sec
@@ -5070,7 +5074,7 @@ void pci_reassigndev_resource_alignment(struct pci_dev *dev)
 		return;
 
 	/* check if specified PCI is target device to reassign */
-	align = pci_specified_resource_alignment(dev);
+	align = pci_specified_resource_alignment(dev, &resize);
 	if (!align)
 		return;
 
@@ -5098,15 +5102,25 @@ void pci_reassigndev_resource_alignment(struct pci_dev *dev)
 		}
 
 		size = resource_size(r);
-		if (size < align) {
-			size = align;
-			dev_info(&dev->dev,
-				"Rounding up size of resource #%d to %#llx.\n",
-				i, (unsigned long long)size);
+		if (resize) {
+			if (size < align) {
+				size = align;
+				dev_info(&dev->dev,
+					"Rounding up size of resource #%d to %#llx.\n",
+					i, (unsigned long long)size);
+			}
+			r->flags |= IORESOURCE_UNSET;
+			r->end = size - 1;
+			r->start = 0;
+		} else {
+			if (size < align) {
+				r->flags &= ~IORESOURCE_SIZEALIGN;
+				r->flags |= IORESOURCE_STARTALIGN |
+							IORESOURCE_UNSET;
+				r->start = align;
+				r->end = r->start + size - 1;
+			}
 		}
-		r->flags |= IORESOURCE_UNSET;
-		r->end = size - 1;
-		r->start = 0;
 	}
 	/* Need to disable bridge's resource window,
 	 * to enable the kernel to reassign new resource
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v9 2/3] PCI: Add a macro to set default alignment for all PCI devices
  2017-02-15  6:45 ` [PATCH v9 2/3] PCI: Add a macro to set default alignment for all PCI devices Yongji Xie
@ 2017-03-23 20:53   ` Bjorn Helgaas
  2017-03-23 21:57     ` Bjorn Helgaas
                       ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Bjorn Helgaas @ 2017-03-23 20:53 UTC (permalink / raw)
  To: Yongji Xie
  Cc: bhelgaas, linux-pci, linuxppc-dev, alex.williamson, gwshan, aik,
	benh, mpe, paulus, zhong

Hi Yongji,

On Wed, Feb 15, 2017 at 02:45:05PM +0800, Yongji Xie wrote:
> When vfio passthroughs a PCI device of which MMIO BARs are
> smaller than PAGE_SIZE, guest will not handle the mmio
> accesses to the BARs which leads to mmio emulations in host.
> 
> This is because vfio will not allow to passthrough one BAR's
> mmio page which may be shared with other BARs. Otherwise,
> there will be a backdoor that guest can use to access BARs
> of other guest.

Please include a pointer to the VFIO code that enforces this.  It's
not obvious to me how it would do this.  This doesn't change the
*size* of the resource, only the alignment.  So if VFIO sees a BAR
like [mem 0x80000000-0x800000ff], it knows the BAR is aligned enough
that it *could* be the only thing on a page, but I don't know how it
could know that there will never be another BAR at 0x80000100.  Even
if there's no other BAR on that page *now*, it would have to know that
no hot-added device will ever be placed on that page.

> This patch adds a macro to set default alignment for all
> PCI devices. Then we could solve this issue on some platforms
> which would easily hit this issue because of their 64K page
> such as PowerNV platform by defining this macro as PAGE_SIZE.
> 
> Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
> ---
>  arch/powerpc/include/asm/pci.h |    4 ++++
>  drivers/pci/pci.c              |    3 +++
>  2 files changed, 7 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/pci.h b/arch/powerpc/include/asm/pci.h
> index e9bd6cf..5e31bc2 100644
> --- a/arch/powerpc/include/asm/pci.h
> +++ b/arch/powerpc/include/asm/pci.h
> @@ -28,6 +28,10 @@
>  #define PCIBIOS_MIN_IO		0x1000
>  #define PCIBIOS_MIN_MEM		0x10000000
>  
> +#ifdef CONFIG_PPC_POWERNV
> +#define PCIBIOS_DEFAULT_ALIGNMENT	PAGE_SIZE
> +#endif
> +
>  struct pci_dev;
>  
>  /* Values for the `which' argument to sys_pciconfig_iobase syscall.  */
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index a881c0d..2622e9b 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4965,6 +4965,9 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev)
>  	resource_size_t align = 0;
>  	char *p;
>  
> +#ifdef PCIBIOS_DEFAULT_ALIGNMENT
> +	align = PCIBIOS_DEFAULT_ALIGNMENT;
> +#endif

I'd prefer to move this #ifdef out of the code path, as in the patch
below.

I don't know how powerpc kernels are built: can you build a kernel
that will run on PowerNV as well as on different powerpc systems?  If
so, is it acceptable to force that kernel to user 64K alignment even
when it's running on non-PowerNV systems?  Or do you need a function
call here instead of a #define?

>  	spin_lock(&resource_alignment_lock);
>  	p = resource_alignment_param;
>  	if (!*p)


commit 60106ae302a264f9be15fa736dae2ea51140b988
Author: Yongji Xie <xyjxie@linux.vnet.ibm.com>
Date:   Wed Feb 15 14:45:05 2017 +0800

    PCI: Add PCIBIOS_DEFAULT_ALIGNMENT for arch-specific alignment control
    
    When VFIO passes through a PCI device to a guest, it does not allow the
    guest direct access to BARs that are smaller than PAGE_SIZE.  This is
    because a page might contain several small BARs for unrelated devices and
    a guest should not be able to access all of them.
    
    VFIO emulates guest accesses to these small BARs, which is functional but
    slow.  On systems with large page sizes, e.g., PowerNV with 64K pages, BARs
    are more likely to share a page and performance is more of a problem.
    
    Add a macro to set default alignment for all PCI devices.  An arch can set
    this to PAGE_SIZE to prevent memory BARs from sharing a page.
    
    [bhelgaas: add default PCIBIOS_DEFAULT_ALIGNMENT definition, changelog]
    Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

diff --git a/arch/powerpc/include/asm/pci.h b/arch/powerpc/include/asm/pci.h
index 93eded8d3843..dc3c81ab4cc4 100644
--- a/arch/powerpc/include/asm/pci.h
+++ b/arch/powerpc/include/asm/pci.h
@@ -28,6 +28,10 @@
 #define PCIBIOS_MIN_IO		0x1000
 #define PCIBIOS_MIN_MEM		0x10000000
 
+#ifdef CONFIG_PPC_POWERNV
+#define PCIBIOS_DEFAULT_ALIGNMENT	PAGE_SIZE
+#endif
+
 struct pci_dev;
 
 /* Values for the `which' argument to sys_pciconfig_iobase syscall.  */
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 7904d02ffdb9..e9a079063fd0 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4962,7 +4962,7 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev)
 {
 	int seg, bus, slot, func, align_order, count;
 	unsigned short vendor, device, subsystem_vendor, subsystem_device;
-	resource_size_t align = 0;
+	resource_size_t align = PCIBIOS_DEFAULT_ALIGNMENT;
 	char *p;
 
 	spin_lock(&resource_alignment_lock);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index eb3da1a04e6c..618beb5809d1 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1630,6 +1630,10 @@ static inline int pci_get_new_domain_nr(void) { return -ENOSYS; }
 #define pci_root_bus_fwnode(bus)	NULL
 #endif
 
+#ifndef PCIBIOS_DEFAULT_ALIGNMENT
+#define PCIBIOS_DEFAULT_ALIGNMENT	0
+#endif
+
 /* these helpers provide future and backwards compatibility
  * for accessing popular PCI BAR info */
 #define pci_resource_start(dev, bar)	((dev)->resource[(bar)].start)

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v9 3/3] PCI: Don't extend device's size when using default alignment for all devices
  2017-02-15  6:45 ` [PATCH v9 3/3] PCI: Don't extend device's size when using default alignment for all devices Yongji Xie
@ 2017-03-23 21:55   ` Bjorn Helgaas
  2017-03-25 12:46     ` Yongji Xie
  0 siblings, 1 reply; 12+ messages in thread
From: Bjorn Helgaas @ 2017-03-23 21:55 UTC (permalink / raw)
  To: Yongji Xie
  Cc: bhelgaas, linux-pci, linuxppc-dev, alex.williamson, gwshan, aik,
	benh, mpe, paulus, zhong

On Wed, Feb 15, 2017 at 02:45:06PM +0800, Yongji Xie wrote:
> Currently we reassign the alignment by extending resources' size in
> pci_reassigndev_resource_alignment(). This could potentially break
> some drivers when the driver uses the size to locate register
> whose length is related to the size. Some examples as below:

I understand the motivation to stop changing the resource size.  That
makes good sense.

> - misc\Hpilo.c:

This isn't Windows; please use regular Linux forward slashes here.

> off = pci_resource_len(pdev, bar) - 0x2000;
> 
> - net\ethernet\chelsio\cxgb4\cxgb4_uld.h:
> (pci_resource_len((pdev), 2) - roundup_pow_of_two((vres)->ocq.size))
> 
> - infiniband\hw\nes\Nes_hw.c:
> num_pds = pci_resource_len(nesdev->pcidev, BAR_1) >> PAGE_SHIFT;
> 
> This risk could be easily prevented before because we only had one way
> (kernel parameter resource_alignment) to touch those codes. And even
> some users may be happy to see the extended size.
> 
> But now we define a default alignment for all devices which would also
> touch those codes. It would be hard to prevent the risk in this case. So
> this patch tries to use START_ALIGNMENT to identify the resource's
> alignment without extending the size when the alignment reassigning is
> caused by the default alignment.

But I don't understand the new START_ALIGNMENT case.

> Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
> ---
>  drivers/pci/pci.c |   34 ++++++++++++++++++++++++----------
>  1 files changed, 24 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 2622e9b..0017e5e 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4958,7 +4958,8 @@ void pci_ignore_hotplug(struct pci_dev *dev)
>   * RETURNS: Resource alignment if it is specified.
>   *          Zero if it is not specified.

Since there's function doc, please update it to reflect the new
"resize" return parameter.

What does "resize" mean?  I can't figure it out from your code.

>   */
> -static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev)
> +static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev,
> +							bool *resize)
>  {
>  	int seg, bus, slot, func, align_order, count;
>  	unsigned short vendor, device, subsystem_vendor, subsystem_device;
> @@ -5002,6 +5003,7 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev)
>  				(!device || (device == dev->device)) &&
>  				(!subsystem_vendor || (subsystem_vendor == dev->subsystem_vendor)) &&
>  				(!subsystem_device || (subsystem_device == dev->subsystem_device))) {
> +				*resize = true;
>  				if (align_order == -1)
>  					align = PAGE_SIZE;
>  				else
> @@ -5027,6 +5029,7 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev)
>  				bus == dev->bus->number &&
>  				slot == PCI_SLOT(dev->devfn) &&
>  				func == PCI_FUNC(dev->devfn)) {
> +				*resize = true;
>  				if (align_order == -1)
>  					align = PAGE_SIZE;
>  				else
> @@ -5059,6 +5062,7 @@ void pci_reassigndev_resource_alignment(struct pci_dev *dev)
>  	struct resource *r;
>  	resource_size_t align, size;
>  	u16 command;
> +	bool resize = false;
>  
>  	/*
>  	 * VF BARs are read-only zero according to SR-IOV spec r1.1, sec
> @@ -5070,7 +5074,7 @@ void pci_reassigndev_resource_alignment(struct pci_dev *dev)
>  		return;
>  
>  	/* check if specified PCI is target device to reassign */
> -	align = pci_specified_resource_alignment(dev);
> +	align = pci_specified_resource_alignment(dev, &resize);
>  	if (!align)
>  		return;
>  
> @@ -5098,15 +5102,25 @@ void pci_reassigndev_resource_alignment(struct pci_dev *dev)
>  		}
>  
>  		size = resource_size(r);
> -		if (size < align) {
> -			size = align;
> -			dev_info(&dev->dev,
> -				"Rounding up size of resource #%d to %#llx.\n",
> -				i, (unsigned long long)size);
> +		if (resize) {
> +			if (size < align) {
> +				size = align;
> +				dev_info(&dev->dev,
> +					"Rounding up size of resource #%d to %#llx.\n",
> +					i, (unsigned long long)size);
> +			}
> +			r->flags |= IORESOURCE_UNSET;
> +			r->end = size - 1;
> +			r->start = 0;
> +		} else {
> +			if (size < align) {
> +				r->flags &= ~IORESOURCE_SIZEALIGN;
> +				r->flags |= IORESOURCE_STARTALIGN |
> +							IORESOURCE_UNSET;
> +				r->start = align;
> +				r->end = r->start + size - 1;

If you have to have two blocks that fiddle with r->start, etc., they
should at least be as similar as possible.  You should have this in
both cases:

  r->end = r->start + size - 1;

> +			}
>  		}
> -		r->flags |= IORESOURCE_UNSET;
> -		r->end = size - 1;
> -		r->start = 0;
>  	}
>  	/* Need to disable bridge's resource window,
>  	 * to enable the kernel to reassign new resource
> -- 
> 1.7.1
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v9 2/3] PCI: Add a macro to set default alignment for all PCI devices
  2017-03-23 20:53   ` Bjorn Helgaas
@ 2017-03-23 21:57     ` Bjorn Helgaas
  2017-03-25 12:36     ` Yongji Xie
  2017-03-27 10:17     ` Michael Ellerman
  2 siblings, 0 replies; 12+ messages in thread
From: Bjorn Helgaas @ 2017-03-23 21:57 UTC (permalink / raw)
  To: Yongji Xie
  Cc: bhelgaas, linux-pci, linuxppc-dev, alex.williamson, gwshan, aik,
	benh, mpe, paulus, zhong

On Thu, Mar 23, 2017 at 03:53:42PM -0500, Bjorn Helgaas wrote:
> Hi Yongji,
> 
> On Wed, Feb 15, 2017 at 02:45:05PM +0800, Yongji Xie wrote:
> > When vfio passthroughs a PCI device of which MMIO BARs are
> > smaller than PAGE_SIZE, guest will not handle the mmio
> > accesses to the BARs which leads to mmio emulations in host.
> > 
> > This is because vfio will not allow to passthrough one BAR's
> > mmio page which may be shared with other BARs. Otherwise,
> > there will be a backdoor that guest can use to access BARs
> > of other guest.
> 
> Please include a pointer to the VFIO code that enforces this.  It's
> not obvious to me how it would do this.  This doesn't change the
> *size* of the resource, only the alignment.  So if VFIO sees a BAR
> like [mem 0x80000000-0x800000ff], it knows the BAR is aligned enough
> that it *could* be the only thing on a page, but I don't know how it
> could know that there will never be another BAR at 0x80000100.  Even
> if there's no other BAR on that page *now*, it would have to know that
> no hot-added device will ever be placed on that page.

Never mind, I found it.  I updated the changelog like this; please
correct anything I got wrong:

    When VFIO passes through a PCI device to a guest, it does not
    allow the guest to mmap BARs that are smaller than PAGE_SIZE
    unless it can reserve the rest of the page (see
    vfio_pci_probe_mmaps()).  This is because a page might contain
    several small BARs for unrelated devices and a guest should not be
    able to access all of them.

    VFIO emulates guest accesses to non-mappable BARs, which is
    functional but slow.  On systems with large page sizes, e.g.,
    PowerNV with 64K pages, BARs are more likely to share a page and
    performance is more likely to be a problem.

    Add a macro to set default alignment for all PCI devices.  An arch
    can set this to PAGE_SIZE to force the PCI core to place memory
    BARs on their own pages.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v9 2/3] PCI: Add a macro to set default alignment for all PCI devices
  2017-03-23 20:53   ` Bjorn Helgaas
  2017-03-23 21:57     ` Bjorn Helgaas
@ 2017-03-25 12:36     ` Yongji Xie
  2017-03-27 10:17     ` Michael Ellerman
  2 siblings, 0 replies; 12+ messages in thread
From: Yongji Xie @ 2017-03-25 12:36 UTC (permalink / raw)
  To: bhelgaas
  Cc: linux-pci, linuxppc-dev, alex.williamson, gwshan, aik, benh, mpe,
	paulus, zhong

On Thu, Mar 23, 2017 at 03:53:42PM -0500, Bjorn Helgaas wrote:
> Hi Yongji,
>
> On Wed, Feb 15, 2017 at 02:45:05PM +0800, Yongji Xie wrote:
>> When vfio passthroughs a PCI device of which MMIO BARs are
>> smaller than PAGE_SIZE, guest will not handle the mmio
>> accesses to the BARs which leads to mmio emulations in host.
>>
>> This is because vfio will not allow to passthrough one BAR's
>> mmio page which may be shared with other BARs. Otherwise,
>> there will be a backdoor that guest can use to access BARs
>> of other guest.
>
> Please include a pointer to the VFIO code that enforces this.  It's
> not obvious to me how it would do this.  This doesn't change the
> *size* of the resource, only the alignment.  So if VFIO sees a BAR
> like [mem 0x80000000-0x800000ff], it knows the BAR is aligned enough
> that it *could* be the only thing on a page, but I don't know how it
> could know that there will never be another BAR at 0x80000100.  Even
> if there's no other BAR on that page *now*, it would have to know that
> no hot-added device will ever be placed on that page.
>
>> This patch adds a macro to set default alignment for all
>> PCI devices. Then we could solve this issue on some platforms
>> which would easily hit this issue because of their 64K page
>> such as PowerNV platform by defining this macro as PAGE_SIZE.
>>
>> Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
>> ---
>> arch/powerpc/include/asm/pci.h |    4 ++++
>> drivers/pci/pci.c              |    3 +++
>> 2 files changed, 7 insertions(+), 0 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/pci.h b/arch/powerpc/include/asm/pci.h
>> index e9bd6cf..5e31bc2 100644
>> --- a/arch/powerpc/include/asm/pci.h
>> +++ b/arch/powerpc/include/asm/pci.h
>> @@ -28,6 +28,10 @@
>> #define PCIBIOS_MIN_IO               0x1000
>> #define PCIBIOS_MIN_MEM              0x10000000
>>
>> +#ifdef CONFIG_PPC_POWERNV
>> +#define PCIBIOS_DEFAULT_ALIGNMENT   PAGE_SIZE
>> +#endif
>> +
>> struct pci_dev;
>>
>> /* Values for the `which' argument to sys_pciconfig_iobase syscall.  */
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index a881c0d..2622e9b 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -4965,6 +4965,9 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev)
>>      resource_size_t align = 0;
>>      char *p;
>>
>> +#ifdef PCIBIOS_DEFAULT_ALIGNMENT
>> +    align = PCIBIOS_DEFAULT_ALIGNMENT;
>> +#endif
>
> I'd prefer to move this #ifdef out of the code path, as in the patch
> below.
>
> I don't know how powerpc kernels are built: can you build a kernel
> that will run on PowerNV as well as on different powerpc systems?  If
> so, is it acceptable to force that kernel to user 64K alignment even
> when it's running on non-PowerNV systems?  Or do you need a function
> call here instead of a #define?
>

That's a good point, the macro may be also defined on non-PowerNV systems.
Maybe an arch-specific function is more reasonable here.

Thanks,
Yongji

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v9 3/3] PCI: Don't extend device's size when using default alignment for all devices
  2017-03-23 21:55   ` Bjorn Helgaas
@ 2017-03-25 12:46     ` Yongji Xie
  0 siblings, 0 replies; 12+ messages in thread
From: Yongji Xie @ 2017-03-25 12:46 UTC (permalink / raw)
  To: bhelgaas
  Cc: linux-pci, linuxppc-dev, alex.williamson, gwshan, aik, benh, mpe,
	paulus, zhong

On Thu, Mar 23, 2017 at 04:55:58PM -0500, Bjorn Helgaas wrote:
>
> On Wed, Feb 15, 2017 at 02:45:06PM +0800, Yongji Xie wrote:
>> Currently we reassign the alignment by extending resources' size in
>> pci_reassigndev_resource_alignment(). This could potentially break
>> some drivers when the driver uses the size to locate register
>> whose length is related to the size. Some examples as below:
>
> I understand the motivation to stop changing the resource size.  That
> makes good sense.
>
>> - misc\Hpilo.c:
>
> This isn't Windows; please use regular Linux forward slashes here.
>

Will fix it.

>> off = pci_resource_len(pdev, bar) - 0x2000;
>>
>> - net\ethernet\chelsio\cxgb4\cxgb4_uld.h:
>> (pci_resource_len((pdev), 2) - roundup_pow_of_two((vres)->ocq.size))
>>
>> - infiniband\hw\nes\Nes_hw.c:
>> num_pds = pci_resource_len(nesdev->pcidev, BAR_1) >> PAGE_SHIFT;
>>
>> This risk could be easily prevented before because we only had one way
>> (kernel parameter resource_alignment) to touch those codes. And even
>> some users may be happy to see the extended size.
>>
>> But now we define a default alignment for all devices which would also
>> touch those codes. It would be hard to prevent the risk in this case. So
>> this patch tries to use START_ALIGNMENT to identify the resource's
>> alignment without extending the size when the alignment reassigning is
>> caused by the default alignment.
>
> But I don't understand the new START_ALIGNMENT case.
>

The START_ALIGNMENT case will be used when we have default alignment
for PCI devices, which can identify the resource's alignment without
extending the size. If we don't use START_ALIGNMENT for default alignment,
resources' size would be extended by default, that means the system's 
default action could potentially break drivers, it looks unreasonable.

>> Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
>> ---
>> drivers/pci/pci.c |   34 ++++++++++++++++++++++++----------
>> 1 files changed, 24 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index 2622e9b..0017e5e 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -4958,7 +4958,8 @@ void pci_ignore_hotplug(struct pci_dev *dev)
>>  * RETURNS: Resource alignment if it is specified.
>>  *          Zero if it is not specified.
>
> Since there's function doc, please update it to reflect the new
> "resize" return parameter.
>

Will do it.

> What does "resize" mean?  I can't figure it out from your code.

"resize" means we will update the resource'alignment by extending the 
resource's size. This would happen when we use kernel parameter to identify 
resource's alignment. If the resource's alignment is identified by the default 
alignment function(macro) rather than kernel parameter, "resize" should be 
false, which means we update the resource'alignment without extending the 
resource's size.

Thanks,
Yongji

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v9 2/3] PCI: Add a macro to set default alignment for all PCI devices
  2017-03-23 20:53   ` Bjorn Helgaas
  2017-03-23 21:57     ` Bjorn Helgaas
  2017-03-25 12:36     ` Yongji Xie
@ 2017-03-27 10:17     ` Michael Ellerman
  2017-03-27 10:25       ` Benjamin Herrenschmidt
  2 siblings, 1 reply; 12+ messages in thread
From: Michael Ellerman @ 2017-03-27 10:17 UTC (permalink / raw)
  To: Bjorn Helgaas, Yongji Xie
  Cc: bhelgaas, linux-pci, linuxppc-dev, alex.williamson, gwshan, aik,
	benh, paulus, zhong

Bjorn Helgaas <helgaas@kernel.org> writes:
>
> I don't know how powerpc kernels are built: can you build a kernel
> that will run on PowerNV as well as on different powerpc systems?

Yes you can. But there are some restrictions.

You can only build a 64-bit OR a 32-bit kernel.

And within 64-bit you can only build a *kernel* for the "server
architecture (~= IBM CPUs)" or the "embedded architecture
(Freescale/NXP)".

So in practice you can build a 64-bit kernel that supports:
 - powernv (bare metal on IBM server chips)
 - pseries (virtualised on IBM server chips or qemu)
 - powermac (Apple G5s)
 - pasemi (long dead but still used by some Amiga folks?)
 - Cell/PS3 (also long dead)

> If so, is it acceptable to force that kernel to user 64K alignment even
> when it's running on non-PowerNV systems?

Probably, but I'm not sure TBH. Benh will know, I'll try and get his
attention.

cheers

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v9 2/3] PCI: Add a macro to set default alignment for all PCI devices
  2017-03-27 10:17     ` Michael Ellerman
@ 2017-03-27 10:25       ` Benjamin Herrenschmidt
  2017-03-30 23:13         ` Bjorn Helgaas
  0 siblings, 1 reply; 12+ messages in thread
From: Benjamin Herrenschmidt @ 2017-03-27 10:25 UTC (permalink / raw)
  To: Michael Ellerman, Bjorn Helgaas, Yongji Xie
  Cc: bhelgaas, linux-pci, linuxppc-dev, alex.williamson, gwshan, aik,
	paulus, zhong

On Mon, 2017-03-27 at 21:17 +1100, Michael Ellerman wrote:
> > If so, is it acceptable to force that kernel to user 64K alignment
> > even
> > when it's running on non-PowerNV systems?
> 
> Probably, but I'm not sure TBH. Benh will know, I'll try and get his
> attention.

Can we make the alignment PAGE_SIZE ? I think that should be ok as long
as it doesn't try to re-allocate BARs for devices that have already
been properly allocated by firmware (ie, PowerMac, if you mess around
with the MacIO chip on these, bad things will happen).

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v9 2/3] PCI: Add a macro to set default alignment for all PCI devices
  2017-03-27 10:25       ` Benjamin Herrenschmidt
@ 2017-03-30 23:13         ` Bjorn Helgaas
  0 siblings, 0 replies; 12+ messages in thread
From: Bjorn Helgaas @ 2017-03-30 23:13 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Michael Ellerman, Yongji Xie, bhelgaas, linux-pci, linuxppc-dev,
	alex.williamson, gwshan, aik, paulus, zhong

On Mon, Mar 27, 2017 at 09:25:50PM +1100, Benjamin Herrenschmidt wrote:
> On Mon, 2017-03-27 at 21:17 +1100, Michael Ellerman wrote:
> > > If so, is it acceptable to force that kernel to user 64K alignment
> > > even
> > > when it's running on non-PowerNV systems?
> > 
> > Probably, but I'm not sure TBH. Benh will know, I'll try and get his
> > attention.
> 
> Can we make the alignment PAGE_SIZE ? I think that should be ok as long
> as it doesn't try to re-allocate BARs for devices that have already
> been properly allocated by firmware (ie, PowerMac, if you mess around
> with the MacIO chip on these, bad things will happen).

I think this *will* change BARs even if they've already been allocated
by firmware, because it affects this path that is used for every
device:

  pci_device_add
    pci_reassigndev_resource_alignment
      pci_specified_resource_alignment
        align = PCIBIOS_DEFAULT_ALIGNMENT

I'm looking for an update that fixes the minor comment issues and
possibly addresses this PowerMac question.

Bjorn

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2017-03-30 23:13 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-15  6:45 [PATCH v9 0/3] PCI: Introduce a way to enforce all MMIO BARs not to share PAGE_SIZE Yongji Xie
2017-02-15  6:45 ` [PATCH v9 1/3] PCI: A fix for caculating bridge window's size and alignment Yongji Xie
2017-02-15  6:45 ` [PATCH v9 2/3] PCI: Add a macro to set default alignment for all PCI devices Yongji Xie
2017-03-23 20:53   ` Bjorn Helgaas
2017-03-23 21:57     ` Bjorn Helgaas
2017-03-25 12:36     ` Yongji Xie
2017-03-27 10:17     ` Michael Ellerman
2017-03-27 10:25       ` Benjamin Herrenschmidt
2017-03-30 23:13         ` Bjorn Helgaas
2017-02-15  6:45 ` [PATCH v9 3/3] PCI: Don't extend device's size when using default alignment for all devices Yongji Xie
2017-03-23 21:55   ` Bjorn Helgaas
2017-03-25 12:46     ` Yongji Xie

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).