All of lore.kernel.org
 help / color / mirror / Atom feed
From: Bjorn Helgaas <helgaas@kernel.org>
To: linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v12 01/15] PCI: Let pci_mmap_page_range() take extra resource pointer
Date: Wed, 08 Jun 2016 21:03:22 +0000	[thread overview]
Message-ID: <20160608210322.GA4248@localhost> (raw)
In-Reply-To: <20160604000642.28162-2-yinghai@kernel.org>

On Fri, Jun 03, 2016 at 05:06:28PM -0700, Yinghai Lu wrote:
> This one is preparing patch for next one:
>   PCI: Let pci_mmap_page_range() take resource addr
> 
> We need to pass extra resource pointer to avoid searching that again
> for powerpc and microblaze prot set operation.

I'm not convinced yet that the extra resource pointer is necessary.

Microblaze does look up the resource in pci_mmap_page_range(), but it
never actually uses it.  It *looks* like it uses it, but that code is
actually dead and I think we should apply the first patch below.

That leaves powerpc as the only arch that would use this extra
resource pointer.  It uses it in __pci_mmap_set_pgprot() to help
decide whether to make a normal uncacheable mapping or a write-
combining one.  There's nothing here that's specific to the powerpc
architecture, and I don't think we should add this parameter just to
cater to powerpc.

There are two cases where __pci_mmap_set_pgprot() on powerpc does
something based on the resource:

  1) We're using procfs to mmap I/O port space after we requested
     write-combining, e.g., we did this:

       ioctl(fd, PCIIOC_MMAP_IS_IO);           # request I/O port space
       ioctl(fd, PCIIOC_WRITE_COMBINE, 1);     # request write-combining
       mmap(fd, ...)

     On powerpc, we ignore the write-combining request in this case.

     I think we can handle this case by applying the second patch
     below to ignore write-combining on I/O space for all arches, not
     just powerpc.

  2) We're using sysfs to mmap resourceN (not resourceN_wc), and
     the resource is prefetchable.  On powerpc, we turn *on*
     write-combining, even though the user didn't ask for it.

     I'm not sure this case is actually safe, because it changes the
     ordering properties.  If it *is* safe, we could enable write-
     combining in pci_mmap_resource(), where we already have the
     resource and it could be done for all arches.

     This case is not strictly necessary, except to avoid a
     performance regression, because the user could have mapped
     resourceN_wc to explicitly request write-combining.

> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index d319a9c..5bbe20c 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -1027,7 +1027,7 @@ static int pci_mmap_resource(struct kobject *kobj, struct bin_attribute *attr,
>  	pci_resource_to_user(pdev, i, res, &start, &end);
>  	vma->vm_pgoff += start >> PAGE_SHIFT;
>  	mmap_type = res->flags & IORESOURCE_MEM ? pci_mmap_mem : pci_mmap_io;
> -	return pci_mmap_page_range(pdev, vma, mmap_type, write_combine);
> +	return pci_mmap_page_range(pdev, res, vma, mmap_type, write_combine);
>  }
>  
>  static int pci_mmap_resource_uc(struct file *filp, struct kobject *kobj,
> diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
> index 3f155e7..f19ee2a 100644
> --- a/drivers/pci/proc.c
> +++ b/drivers/pci/proc.c
> @@ -245,7 +245,7 @@ static int proc_bus_pci_mmap(struct file *file, struct vm_area_struct *vma)
>  	if (i >= PCI_ROM_RESOURCE)
>  		return -ENODEV;
>  
> -	ret = pci_mmap_page_range(dev, vma,
> +	ret = pci_mmap_page_range(dev, &dev->resource[i], vma,
>  				  fpriv->mmap_state,
>  				  fpriv->write_combine);
>  	if (ret < 0)
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index b67e4df..3c1a0f4 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -70,6 +70,12 @@ enum pci_mmap_state {
>  	pci_mmap_mem
>  };
>  
> +struct vm_area_struct;
> +/* Map a range of PCI memory or I/O space for a device into user space */
> +int pci_mmap_page_range(struct pci_dev *dev, struct resource *res,
> +			struct vm_area_struct *vma,
> +			enum pci_mmap_state mmap_state, int write_combine);
> +
>  /*
>   *  For PCI devices, the region numbers are assigned this way:
>   */


commit 4e712b691abc5b579e3e4327f56b0b7988bdd1cb
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Wed Jun 8 14:00:14 2016 -0500

    microblaze/PCI: Remove useless __pci_mmap_set_pgprot()
    
    The microblaze __pci_mmap_set_pgprot() was apparently copied from powerpc,
    where it computes either an uncacheable pgprot_t or a write-combining one.
    But on microblaze, we always use the regular uncacheable pgprot_t.
    
    Remove the useless code in __pci_mmap_set_pgprot() and inline the
    pgprot_noncached() at the only caller.
    
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

diff --git a/arch/microblaze/pci/pci-common.c b/arch/microblaze/pci/pci-common.c
index 14cba60..1974567 100644
--- a/arch/microblaze/pci/pci-common.c
+++ b/arch/microblaze/pci/pci-common.c
@@ -219,33 +219,6 @@ static struct resource *__pci_mmap_make_offset(struct pci_dev *dev,
 }
 
 /*
- * Set vm_page_prot of VMA, as appropriate for this architecture, for a pci
- * device mapping.
- */
-static pgprot_t __pci_mmap_set_pgprot(struct pci_dev *dev, struct resource *rp,
-				      pgprot_t protection,
-				      enum pci_mmap_state mmap_state,
-				      int write_combine)
-{
-	pgprot_t prot = protection;
-
-	/* Write combine is always 0 on non-memory space mappings. On
-	 * memory space, if the user didn't pass 1, we check for a
-	 * "prefetchable" resource. This is a bit hackish, but we use
-	 * this to workaround the inability of /sysfs to provide a write
-	 * combine bit
-	 */
-	if (mmap_state != pci_mmap_mem)
-		write_combine = 0;
-	else if (write_combine = 0) {
-		if (rp->flags & IORESOURCE_PREFETCH)
-			write_combine = 1;
-	}
-
-	return pgprot_noncached(prot);
-}
-
-/*
  * This one is used by /dev/mem and fbdev who have no clue about the
  * PCI device, it tries to find the PCI device first and calls the
  * above routine
@@ -317,9 +290,7 @@ int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
 		return -EINVAL;
 
 	vma->vm_pgoff = offset >> PAGE_SHIFT;
-	vma->vm_page_prot = __pci_mmap_set_pgprot(dev, rp,
-						  vma->vm_page_prot,
-						  mmap_state, write_combine);
+	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
 
 	ret = remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
 			       vma->vm_end - vma->vm_start, vma->vm_page_prot);



commit 962972ee5e0ba6ceb680cb182bad65f8886586a6
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Wed Jun 8 14:46:54 2016 -0500

    PCI: Ignore write-combining when mapping I/O port space
    
    PCI exposes files like /proc/bus/pci/00/00.0 in procfs.  These files
    support operations like this:
    
      ioctl(fd, PCIIOC_MMAP_IS_IO);           # request I/O port space
      ioctl(fd, PCIIOC_WRITE_COMBINE, 1);     # request write-combining
      mmap(fd, ...)
    
    Many architectures don't allow mmap of I/O port space at all, but I don't
    think it makes sense to do a write-combining mapping on the ones that do.
    We could change proc_bus_pci_ioctl() so the user could never enable write-
    combining for I/O port space, but that would break the following sequence,
    which is currently legal:
    
      mmap(fd, ...)                           # default is I/O, non-combining
      ioctl(fd, PCIIOC_WRITE_COMBINE, 1);     # request write-combining
      ioctl(fd, PCIIOC_MMAP_IS_MEM);          # request memory space
      mmap(fd, ...)
    
    Ignore the write-combining flag when mapping I/O port space.
    
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
index 3f155e7..21f8d613 100644
--- a/drivers/pci/proc.c
+++ b/drivers/pci/proc.c
@@ -247,7 +247,8 @@ static int proc_bus_pci_mmap(struct file *file, struct vm_area_struct *vma)
 
 	ret = pci_mmap_page_range(dev, vma,
 				  fpriv->mmap_state,
-				  fpriv->write_combine);
+				  (fpriv->mmap_state = pci_mmap_mem) ?
+					fpriv->write_combine : 0);
 	if (ret < 0)
 		return ret;
 

WARNING: multiple messages have this Message-ID (diff)
From: Bjorn Helgaas <helgaas@kernel.org>
To: Yinghai Lu <yinghai@kernel.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
	David Miller <davem@davemloft.net>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Wei Yang <weiyang@linux.vnet.ibm.com>,
	Khalid Aziz <khalid.aziz@oracle.com>,
	linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, linux-cris-kernel@axis.com,
	linux-ia64@vger.kernel.org, linux-mips@linux-mips.org,
	linux-am33-list@redhat.com, linux-parisc@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, linux-sh@vger.kernel.org,
	sparclinux@vger.kernel.org, linux-xtensa@linux-xtensa.org
Subject: Re: [PATCH v12 01/15] PCI: Let pci_mmap_page_range() take extra resource pointer
Date: Wed, 8 Jun 2016 16:03:22 -0500	[thread overview]
Message-ID: <20160608210322.GA4248@localhost> (raw)
In-Reply-To: <20160604000642.28162-2-yinghai@kernel.org>

On Fri, Jun 03, 2016 at 05:06:28PM -0700, Yinghai Lu wrote:
> This one is preparing patch for next one:
>   PCI: Let pci_mmap_page_range() take resource addr
> 
> We need to pass extra resource pointer to avoid searching that again
> for powerpc and microblaze prot set operation.

I'm not convinced yet that the extra resource pointer is necessary.

Microblaze does look up the resource in pci_mmap_page_range(), but it
never actually uses it.  It *looks* like it uses it, but that code is
actually dead and I think we should apply the first patch below.

That leaves powerpc as the only arch that would use this extra
resource pointer.  It uses it in __pci_mmap_set_pgprot() to help
decide whether to make a normal uncacheable mapping or a write-
combining one.  There's nothing here that's specific to the powerpc
architecture, and I don't think we should add this parameter just to
cater to powerpc.

There are two cases where __pci_mmap_set_pgprot() on powerpc does
something based on the resource:

  1) We're using procfs to mmap I/O port space after we requested
     write-combining, e.g., we did this:

       ioctl(fd, PCIIOC_MMAP_IS_IO);           # request I/O port space
       ioctl(fd, PCIIOC_WRITE_COMBINE, 1);     # request write-combining
       mmap(fd, ...)

     On powerpc, we ignore the write-combining request in this case.

     I think we can handle this case by applying the second patch
     below to ignore write-combining on I/O space for all arches, not
     just powerpc.

  2) We're using sysfs to mmap resourceN (not resourceN_wc), and
     the resource is prefetchable.  On powerpc, we turn *on*
     write-combining, even though the user didn't ask for it.

     I'm not sure this case is actually safe, because it changes the
     ordering properties.  If it *is* safe, we could enable write-
     combining in pci_mmap_resource(), where we already have the
     resource and it could be done for all arches.

     This case is not strictly necessary, except to avoid a
     performance regression, because the user could have mapped
     resourceN_wc to explicitly request write-combining.

> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index d319a9c..5bbe20c 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -1027,7 +1027,7 @@ static int pci_mmap_resource(struct kobject *kobj, struct bin_attribute *attr,
>  	pci_resource_to_user(pdev, i, res, &start, &end);
>  	vma->vm_pgoff += start >> PAGE_SHIFT;
>  	mmap_type = res->flags & IORESOURCE_MEM ? pci_mmap_mem : pci_mmap_io;
> -	return pci_mmap_page_range(pdev, vma, mmap_type, write_combine);
> +	return pci_mmap_page_range(pdev, res, vma, mmap_type, write_combine);
>  }
>  
>  static int pci_mmap_resource_uc(struct file *filp, struct kobject *kobj,
> diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
> index 3f155e7..f19ee2a 100644
> --- a/drivers/pci/proc.c
> +++ b/drivers/pci/proc.c
> @@ -245,7 +245,7 @@ static int proc_bus_pci_mmap(struct file *file, struct vm_area_struct *vma)
>  	if (i >= PCI_ROM_RESOURCE)
>  		return -ENODEV;
>  
> -	ret = pci_mmap_page_range(dev, vma,
> +	ret = pci_mmap_page_range(dev, &dev->resource[i], vma,
>  				  fpriv->mmap_state,
>  				  fpriv->write_combine);
>  	if (ret < 0)
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index b67e4df..3c1a0f4 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -70,6 +70,12 @@ enum pci_mmap_state {
>  	pci_mmap_mem
>  };
>  
> +struct vm_area_struct;
> +/* Map a range of PCI memory or I/O space for a device into user space */
> +int pci_mmap_page_range(struct pci_dev *dev, struct resource *res,
> +			struct vm_area_struct *vma,
> +			enum pci_mmap_state mmap_state, int write_combine);
> +
>  /*
>   *  For PCI devices, the region numbers are assigned this way:
>   */


commit 4e712b691abc5b579e3e4327f56b0b7988bdd1cb
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Wed Jun 8 14:00:14 2016 -0500

    microblaze/PCI: Remove useless __pci_mmap_set_pgprot()
    
    The microblaze __pci_mmap_set_pgprot() was apparently copied from powerpc,
    where it computes either an uncacheable pgprot_t or a write-combining one.
    But on microblaze, we always use the regular uncacheable pgprot_t.
    
    Remove the useless code in __pci_mmap_set_pgprot() and inline the
    pgprot_noncached() at the only caller.
    
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

diff --git a/arch/microblaze/pci/pci-common.c b/arch/microblaze/pci/pci-common.c
index 14cba60..1974567 100644
--- a/arch/microblaze/pci/pci-common.c
+++ b/arch/microblaze/pci/pci-common.c
@@ -219,33 +219,6 @@ static struct resource *__pci_mmap_make_offset(struct pci_dev *dev,
 }
 
 /*
- * Set vm_page_prot of VMA, as appropriate for this architecture, for a pci
- * device mapping.
- */
-static pgprot_t __pci_mmap_set_pgprot(struct pci_dev *dev, struct resource *rp,
-				      pgprot_t protection,
-				      enum pci_mmap_state mmap_state,
-				      int write_combine)
-{
-	pgprot_t prot = protection;
-
-	/* Write combine is always 0 on non-memory space mappings. On
-	 * memory space, if the user didn't pass 1, we check for a
-	 * "prefetchable" resource. This is a bit hackish, but we use
-	 * this to workaround the inability of /sysfs to provide a write
-	 * combine bit
-	 */
-	if (mmap_state != pci_mmap_mem)
-		write_combine = 0;
-	else if (write_combine == 0) {
-		if (rp->flags & IORESOURCE_PREFETCH)
-			write_combine = 1;
-	}
-
-	return pgprot_noncached(prot);
-}
-
-/*
  * This one is used by /dev/mem and fbdev who have no clue about the
  * PCI device, it tries to find the PCI device first and calls the
  * above routine
@@ -317,9 +290,7 @@ int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
 		return -EINVAL;
 
 	vma->vm_pgoff = offset >> PAGE_SHIFT;
-	vma->vm_page_prot = __pci_mmap_set_pgprot(dev, rp,
-						  vma->vm_page_prot,
-						  mmap_state, write_combine);
+	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
 
 	ret = remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
 			       vma->vm_end - vma->vm_start, vma->vm_page_prot);



commit 962972ee5e0ba6ceb680cb182bad65f8886586a6
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Wed Jun 8 14:46:54 2016 -0500

    PCI: Ignore write-combining when mapping I/O port space
    
    PCI exposes files like /proc/bus/pci/00/00.0 in procfs.  These files
    support operations like this:
    
      ioctl(fd, PCIIOC_MMAP_IS_IO);           # request I/O port space
      ioctl(fd, PCIIOC_WRITE_COMBINE, 1);     # request write-combining
      mmap(fd, ...)
    
    Many architectures don't allow mmap of I/O port space at all, but I don't
    think it makes sense to do a write-combining mapping on the ones that do.
    We could change proc_bus_pci_ioctl() so the user could never enable write-
    combining for I/O port space, but that would break the following sequence,
    which is currently legal:
    
      mmap(fd, ...)                           # default is I/O, non-combining
      ioctl(fd, PCIIOC_WRITE_COMBINE, 1);     # request write-combining
      ioctl(fd, PCIIOC_MMAP_IS_MEM);          # request memory space
      mmap(fd, ...)
    
    Ignore the write-combining flag when mapping I/O port space.
    
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
index 3f155e7..21f8d613 100644
--- a/drivers/pci/proc.c
+++ b/drivers/pci/proc.c
@@ -247,7 +247,8 @@ static int proc_bus_pci_mmap(struct file *file, struct vm_area_struct *vma)
 
 	ret = pci_mmap_page_range(dev, vma,
 				  fpriv->mmap_state,
-				  fpriv->write_combine);
+				  (fpriv->mmap_state == pci_mmap_mem) ?
+					fpriv->write_combine : 0);
 	if (ret < 0)
 		return ret;
 

WARNING: multiple messages have this Message-ID (diff)
From: Bjorn Helgaas <helgaas@kernel.org>
To: Yinghai Lu <yinghai@kernel.org>
Cc: linux-mips@linux-mips.org, Wei Yang <weiyang@linux.vnet.ibm.com>,
	linux-am33-list@redhat.com, linux-ia64@vger.kernel.org,
	linux-parisc@vger.kernel.org, linux-cris-kernel@axis.com,
	linux-sh@vger.kernel.org,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	linux-xtensa@linux-xtensa.org, linuxppc-dev@lists.ozlabs.org,
	linux-kernel@vger.kernel.org, sparclinux@vger.kernel.org,
	Khalid Aziz <khalid.aziz@oracle.com>,
	linux-pci@vger.kernel.org, Bjorn Helgaas <bhelgaas@google.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	David Miller <davem@davemloft.net>,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v12 01/15] PCI: Let pci_mmap_page_range() take extra resource pointer
Date: Wed, 8 Jun 2016 16:03:22 -0500	[thread overview]
Message-ID: <20160608210322.GA4248@localhost> (raw)
In-Reply-To: <20160604000642.28162-2-yinghai@kernel.org>

On Fri, Jun 03, 2016 at 05:06:28PM -0700, Yinghai Lu wrote:
> This one is preparing patch for next one:
>   PCI: Let pci_mmap_page_range() take resource addr
> 
> We need to pass extra resource pointer to avoid searching that again
> for powerpc and microblaze prot set operation.

I'm not convinced yet that the extra resource pointer is necessary.

Microblaze does look up the resource in pci_mmap_page_range(), but it
never actually uses it.  It *looks* like it uses it, but that code is
actually dead and I think we should apply the first patch below.

That leaves powerpc as the only arch that would use this extra
resource pointer.  It uses it in __pci_mmap_set_pgprot() to help
decide whether to make a normal uncacheable mapping or a write-
combining one.  There's nothing here that's specific to the powerpc
architecture, and I don't think we should add this parameter just to
cater to powerpc.

There are two cases where __pci_mmap_set_pgprot() on powerpc does
something based on the resource:

  1) We're using procfs to mmap I/O port space after we requested
     write-combining, e.g., we did this:

       ioctl(fd, PCIIOC_MMAP_IS_IO);           # request I/O port space
       ioctl(fd, PCIIOC_WRITE_COMBINE, 1);     # request write-combining
       mmap(fd, ...)

     On powerpc, we ignore the write-combining request in this case.

     I think we can handle this case by applying the second patch
     below to ignore write-combining on I/O space for all arches, not
     just powerpc.

  2) We're using sysfs to mmap resourceN (not resourceN_wc), and
     the resource is prefetchable.  On powerpc, we turn *on*
     write-combining, even though the user didn't ask for it.

     I'm not sure this case is actually safe, because it changes the
     ordering properties.  If it *is* safe, we could enable write-
     combining in pci_mmap_resource(), where we already have the
     resource and it could be done for all arches.

     This case is not strictly necessary, except to avoid a
     performance regression, because the user could have mapped
     resourceN_wc to explicitly request write-combining.

> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index d319a9c..5bbe20c 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -1027,7 +1027,7 @@ static int pci_mmap_resource(struct kobject *kobj, struct bin_attribute *attr,
>  	pci_resource_to_user(pdev, i, res, &start, &end);
>  	vma->vm_pgoff += start >> PAGE_SHIFT;
>  	mmap_type = res->flags & IORESOURCE_MEM ? pci_mmap_mem : pci_mmap_io;
> -	return pci_mmap_page_range(pdev, vma, mmap_type, write_combine);
> +	return pci_mmap_page_range(pdev, res, vma, mmap_type, write_combine);
>  }
>  
>  static int pci_mmap_resource_uc(struct file *filp, struct kobject *kobj,
> diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
> index 3f155e7..f19ee2a 100644
> --- a/drivers/pci/proc.c
> +++ b/drivers/pci/proc.c
> @@ -245,7 +245,7 @@ static int proc_bus_pci_mmap(struct file *file, struct vm_area_struct *vma)
>  	if (i >= PCI_ROM_RESOURCE)
>  		return -ENODEV;
>  
> -	ret = pci_mmap_page_range(dev, vma,
> +	ret = pci_mmap_page_range(dev, &dev->resource[i], vma,
>  				  fpriv->mmap_state,
>  				  fpriv->write_combine);
>  	if (ret < 0)
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index b67e4df..3c1a0f4 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -70,6 +70,12 @@ enum pci_mmap_state {
>  	pci_mmap_mem
>  };
>  
> +struct vm_area_struct;
> +/* Map a range of PCI memory or I/O space for a device into user space */
> +int pci_mmap_page_range(struct pci_dev *dev, struct resource *res,
> +			struct vm_area_struct *vma,
> +			enum pci_mmap_state mmap_state, int write_combine);
> +
>  /*
>   *  For PCI devices, the region numbers are assigned this way:
>   */


commit 4e712b691abc5b579e3e4327f56b0b7988bdd1cb
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Wed Jun 8 14:00:14 2016 -0500

    microblaze/PCI: Remove useless __pci_mmap_set_pgprot()
    
    The microblaze __pci_mmap_set_pgprot() was apparently copied from powerpc,
    where it computes either an uncacheable pgprot_t or a write-combining one.
    But on microblaze, we always use the regular uncacheable pgprot_t.
    
    Remove the useless code in __pci_mmap_set_pgprot() and inline the
    pgprot_noncached() at the only caller.
    
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

diff --git a/arch/microblaze/pci/pci-common.c b/arch/microblaze/pci/pci-common.c
index 14cba60..1974567 100644
--- a/arch/microblaze/pci/pci-common.c
+++ b/arch/microblaze/pci/pci-common.c
@@ -219,33 +219,6 @@ static struct resource *__pci_mmap_make_offset(struct pci_dev *dev,
 }
 
 /*
- * Set vm_page_prot of VMA, as appropriate for this architecture, for a pci
- * device mapping.
- */
-static pgprot_t __pci_mmap_set_pgprot(struct pci_dev *dev, struct resource *rp,
-				      pgprot_t protection,
-				      enum pci_mmap_state mmap_state,
-				      int write_combine)
-{
-	pgprot_t prot = protection;
-
-	/* Write combine is always 0 on non-memory space mappings. On
-	 * memory space, if the user didn't pass 1, we check for a
-	 * "prefetchable" resource. This is a bit hackish, but we use
-	 * this to workaround the inability of /sysfs to provide a write
-	 * combine bit
-	 */
-	if (mmap_state != pci_mmap_mem)
-		write_combine = 0;
-	else if (write_combine == 0) {
-		if (rp->flags & IORESOURCE_PREFETCH)
-			write_combine = 1;
-	}
-
-	return pgprot_noncached(prot);
-}
-
-/*
  * This one is used by /dev/mem and fbdev who have no clue about the
  * PCI device, it tries to find the PCI device first and calls the
  * above routine
@@ -317,9 +290,7 @@ int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
 		return -EINVAL;
 
 	vma->vm_pgoff = offset >> PAGE_SHIFT;
-	vma->vm_page_prot = __pci_mmap_set_pgprot(dev, rp,
-						  vma->vm_page_prot,
-						  mmap_state, write_combine);
+	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
 
 	ret = remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
 			       vma->vm_end - vma->vm_start, vma->vm_page_prot);



commit 962972ee5e0ba6ceb680cb182bad65f8886586a6
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Wed Jun 8 14:46:54 2016 -0500

    PCI: Ignore write-combining when mapping I/O port space
    
    PCI exposes files like /proc/bus/pci/00/00.0 in procfs.  These files
    support operations like this:
    
      ioctl(fd, PCIIOC_MMAP_IS_IO);           # request I/O port space
      ioctl(fd, PCIIOC_WRITE_COMBINE, 1);     # request write-combining
      mmap(fd, ...)
    
    Many architectures don't allow mmap of I/O port space at all, but I don't
    think it makes sense to do a write-combining mapping on the ones that do.
    We could change proc_bus_pci_ioctl() so the user could never enable write-
    combining for I/O port space, but that would break the following sequence,
    which is currently legal:
    
      mmap(fd, ...)                           # default is I/O, non-combining
      ioctl(fd, PCIIOC_WRITE_COMBINE, 1);     # request write-combining
      ioctl(fd, PCIIOC_MMAP_IS_MEM);          # request memory space
      mmap(fd, ...)
    
    Ignore the write-combining flag when mapping I/O port space.
    
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
index 3f155e7..21f8d613 100644
--- a/drivers/pci/proc.c
+++ b/drivers/pci/proc.c
@@ -247,7 +247,8 @@ static int proc_bus_pci_mmap(struct file *file, struct vm_area_struct *vma)
 
 	ret = pci_mmap_page_range(dev, vma,
 				  fpriv->mmap_state,
-				  fpriv->write_combine);
+				  (fpriv->mmap_state == pci_mmap_mem) ?
+					fpriv->write_combine : 0);
 	if (ret < 0)
 		return ret;
 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

WARNING: multiple messages have this Message-ID (diff)
From: helgaas@kernel.org (Bjorn Helgaas)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v12 01/15] PCI: Let pci_mmap_page_range() take extra resource pointer
Date: Wed, 8 Jun 2016 16:03:22 -0500	[thread overview]
Message-ID: <20160608210322.GA4248@localhost> (raw)
In-Reply-To: <20160604000642.28162-2-yinghai@kernel.org>

On Fri, Jun 03, 2016 at 05:06:28PM -0700, Yinghai Lu wrote:
> This one is preparing patch for next one:
>   PCI: Let pci_mmap_page_range() take resource addr
> 
> We need to pass extra resource pointer to avoid searching that again
> for powerpc and microblaze prot set operation.

I'm not convinced yet that the extra resource pointer is necessary.

Microblaze does look up the resource in pci_mmap_page_range(), but it
never actually uses it.  It *looks* like it uses it, but that code is
actually dead and I think we should apply the first patch below.

That leaves powerpc as the only arch that would use this extra
resource pointer.  It uses it in __pci_mmap_set_pgprot() to help
decide whether to make a normal uncacheable mapping or a write-
combining one.  There's nothing here that's specific to the powerpc
architecture, and I don't think we should add this parameter just to
cater to powerpc.

There are two cases where __pci_mmap_set_pgprot() on powerpc does
something based on the resource:

  1) We're using procfs to mmap I/O port space after we requested
     write-combining, e.g., we did this:

       ioctl(fd, PCIIOC_MMAP_IS_IO);           # request I/O port space
       ioctl(fd, PCIIOC_WRITE_COMBINE, 1);     # request write-combining
       mmap(fd, ...)

     On powerpc, we ignore the write-combining request in this case.

     I think we can handle this case by applying the second patch
     below to ignore write-combining on I/O space for all arches, not
     just powerpc.

  2) We're using sysfs to mmap resourceN (not resourceN_wc), and
     the resource is prefetchable.  On powerpc, we turn *on*
     write-combining, even though the user didn't ask for it.

     I'm not sure this case is actually safe, because it changes the
     ordering properties.  If it *is* safe, we could enable write-
     combining in pci_mmap_resource(), where we already have the
     resource and it could be done for all arches.

     This case is not strictly necessary, except to avoid a
     performance regression, because the user could have mapped
     resourceN_wc to explicitly request write-combining.

> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index d319a9c..5bbe20c 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -1027,7 +1027,7 @@ static int pci_mmap_resource(struct kobject *kobj, struct bin_attribute *attr,
>  	pci_resource_to_user(pdev, i, res, &start, &end);
>  	vma->vm_pgoff += start >> PAGE_SHIFT;
>  	mmap_type = res->flags & IORESOURCE_MEM ? pci_mmap_mem : pci_mmap_io;
> -	return pci_mmap_page_range(pdev, vma, mmap_type, write_combine);
> +	return pci_mmap_page_range(pdev, res, vma, mmap_type, write_combine);
>  }
>  
>  static int pci_mmap_resource_uc(struct file *filp, struct kobject *kobj,
> diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
> index 3f155e7..f19ee2a 100644
> --- a/drivers/pci/proc.c
> +++ b/drivers/pci/proc.c
> @@ -245,7 +245,7 @@ static int proc_bus_pci_mmap(struct file *file, struct vm_area_struct *vma)
>  	if (i >= PCI_ROM_RESOURCE)
>  		return -ENODEV;
>  
> -	ret = pci_mmap_page_range(dev, vma,
> +	ret = pci_mmap_page_range(dev, &dev->resource[i], vma,
>  				  fpriv->mmap_state,
>  				  fpriv->write_combine);
>  	if (ret < 0)
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index b67e4df..3c1a0f4 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -70,6 +70,12 @@ enum pci_mmap_state {
>  	pci_mmap_mem
>  };
>  
> +struct vm_area_struct;
> +/* Map a range of PCI memory or I/O space for a device into user space */
> +int pci_mmap_page_range(struct pci_dev *dev, struct resource *res,
> +			struct vm_area_struct *vma,
> +			enum pci_mmap_state mmap_state, int write_combine);
> +
>  /*
>   *  For PCI devices, the region numbers are assigned this way:
>   */


commit 4e712b691abc5b579e3e4327f56b0b7988bdd1cb
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Wed Jun 8 14:00:14 2016 -0500

    microblaze/PCI: Remove useless __pci_mmap_set_pgprot()
    
    The microblaze __pci_mmap_set_pgprot() was apparently copied from powerpc,
    where it computes either an uncacheable pgprot_t or a write-combining one.
    But on microblaze, we always use the regular uncacheable pgprot_t.
    
    Remove the useless code in __pci_mmap_set_pgprot() and inline the
    pgprot_noncached() at the only caller.
    
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

diff --git a/arch/microblaze/pci/pci-common.c b/arch/microblaze/pci/pci-common.c
index 14cba60..1974567 100644
--- a/arch/microblaze/pci/pci-common.c
+++ b/arch/microblaze/pci/pci-common.c
@@ -219,33 +219,6 @@ static struct resource *__pci_mmap_make_offset(struct pci_dev *dev,
 }
 
 /*
- * Set vm_page_prot of VMA, as appropriate for this architecture, for a pci
- * device mapping.
- */
-static pgprot_t __pci_mmap_set_pgprot(struct pci_dev *dev, struct resource *rp,
-				      pgprot_t protection,
-				      enum pci_mmap_state mmap_state,
-				      int write_combine)
-{
-	pgprot_t prot = protection;
-
-	/* Write combine is always 0 on non-memory space mappings. On
-	 * memory space, if the user didn't pass 1, we check for a
-	 * "prefetchable" resource. This is a bit hackish, but we use
-	 * this to workaround the inability of /sysfs to provide a write
-	 * combine bit
-	 */
-	if (mmap_state != pci_mmap_mem)
-		write_combine = 0;
-	else if (write_combine == 0) {
-		if (rp->flags & IORESOURCE_PREFETCH)
-			write_combine = 1;
-	}
-
-	return pgprot_noncached(prot);
-}
-
-/*
  * This one is used by /dev/mem and fbdev who have no clue about the
  * PCI device, it tries to find the PCI device first and calls the
  * above routine
@@ -317,9 +290,7 @@ int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
 		return -EINVAL;
 
 	vma->vm_pgoff = offset >> PAGE_SHIFT;
-	vma->vm_page_prot = __pci_mmap_set_pgprot(dev, rp,
-						  vma->vm_page_prot,
-						  mmap_state, write_combine);
+	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
 
 	ret = remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
 			       vma->vm_end - vma->vm_start, vma->vm_page_prot);



commit 962972ee5e0ba6ceb680cb182bad65f8886586a6
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Wed Jun 8 14:46:54 2016 -0500

    PCI: Ignore write-combining when mapping I/O port space
    
    PCI exposes files like /proc/bus/pci/00/00.0 in procfs.  These files
    support operations like this:
    
      ioctl(fd, PCIIOC_MMAP_IS_IO);           # request I/O port space
      ioctl(fd, PCIIOC_WRITE_COMBINE, 1);     # request write-combining
      mmap(fd, ...)
    
    Many architectures don't allow mmap of I/O port space at all, but I don't
    think it makes sense to do a write-combining mapping on the ones that do.
    We could change proc_bus_pci_ioctl() so the user could never enable write-
    combining for I/O port space, but that would break the following sequence,
    which is currently legal:
    
      mmap(fd, ...)                           # default is I/O, non-combining
      ioctl(fd, PCIIOC_WRITE_COMBINE, 1);     # request write-combining
      ioctl(fd, PCIIOC_MMAP_IS_MEM);          # request memory space
      mmap(fd, ...)
    
    Ignore the write-combining flag when mapping I/O port space.
    
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
index 3f155e7..21f8d613 100644
--- a/drivers/pci/proc.c
+++ b/drivers/pci/proc.c
@@ -247,7 +247,8 @@ static int proc_bus_pci_mmap(struct file *file, struct vm_area_struct *vma)
 
 	ret = pci_mmap_page_range(dev, vma,
 				  fpriv->mmap_state,
-				  fpriv->write_combine);
+				  (fpriv->mmap_state == pci_mmap_mem) ?
+					fpriv->write_combine : 0);
 	if (ret < 0)
 		return ret;
 

  parent reply	other threads:[~2016-06-08 21:03 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-06-04  0:06 [PATCH v12 00/15] PCI: Fixup for 64bit resource with sparc Yinghai Lu
2016-06-04  0:06 ` [PATCH v12 01/15] PCI: Let pci_mmap_page_range() take extra resource pointer Yinghai Lu
2016-06-04  0:06   ` Yinghai Lu
2016-06-04  0:06   ` Yinghai Lu
2016-06-07  8:04   ` Jesper Nilsson
2016-06-07  8:04     ` Jesper Nilsson
2016-06-07  8:04     ` Jesper Nilsson
2016-06-08 21:03   ` Bjorn Helgaas [this message]
2016-06-08 21:03     ` Bjorn Helgaas
2016-06-08 21:03     ` Bjorn Helgaas
2016-06-08 21:03     ` Bjorn Helgaas
2016-06-08 22:35     ` Yinghai Lu
2016-06-08 22:35       ` Yinghai Lu
2016-06-08 22:35       ` Yinghai Lu
2016-06-08 22:35       ` Yinghai Lu
2016-06-09  0:00       ` Yinghai Lu
2016-06-09  0:00         ` Yinghai Lu
2016-06-09  0:00         ` Yinghai Lu
2016-06-09  0:00         ` Yinghai Lu
2016-06-09 22:36         ` Yinghai Lu
2016-06-09 22:36           ` Yinghai Lu
2016-06-09 22:36           ` Yinghai Lu
2016-06-09 22:36           ` Yinghai Lu
2016-06-04  0:06 ` [PATCH v12 02/15] PCI: Let pci_mmap_page_range() take resource address Yinghai Lu
2016-06-04  0:06   ` Yinghai Lu
2016-06-04  0:06 ` [PATCH v12 03/15] sparc/PCI: Use correct offset for bus address to resource Yinghai Lu
2016-06-04  0:06   ` Yinghai Lu
2016-06-04  0:06 ` [PATCH v12 04/15] PCI: Add pci_find_bus_resource() Yinghai Lu
2016-06-04  0:06 ` [PATCH v12 05/15] sparc/PCI: Reserve legacy mmio after PCI mmio Yinghai Lu
2016-06-04  0:06   ` Yinghai Lu
2016-06-04  0:06 ` [PATCH v12 06/15] sparc/PCI: Add IORESOURCE_MEM_64 for 64-bit resource in OF parsing Yinghai Lu
2016-06-04  0:06   ` Yinghai Lu
2016-06-04  0:06 ` [PATCH v12 07/15] sparc/PCI: Keep resource idx order with bridge register number Yinghai Lu
2016-06-04  0:06   ` Yinghai Lu
2016-06-04  0:06 ` [PATCH v12 08/15] powerpc/PCI: " Yinghai Lu
2016-06-04  0:06 ` [PATCH v12 09/15] powerpc/PCI: Add IORESOURCE_MEM_64 for 64-bit resource in OF parsing Yinghai Lu
2016-06-04  0:06 ` [PATCH v12 10/15] OF/PCI: Add IORESOURCE_MEM_64 for 64-bit resource Yinghai Lu
2016-06-04  0:06 ` [PATCH v12 11/15] PCI: Check pref compatible bit for mem64 resource of PCIe device Yinghai Lu
2016-06-04  0:06   ` Yinghai Lu
2016-06-04  0:06 ` [PATCH v12 12/15] PCI: Only treat non-pref mmio64 as pref if all bridges have MEM_64 Yinghai Lu
2016-06-04  0:06 ` [PATCH v12 13/15] PCI: Add has_mem64 for struct host_bridge Yinghai Lu
2016-06-04  0:06 ` [PATCH v12 14/15] PCI: Only treat non-pref mmio64 as pref if host bridge has mmio64 Yinghai Lu
2016-06-04  0:06 ` [PATCH v12 15/15] PCI: Restore pref MMIO allocation logic for host bridge without mmio64 Yinghai Lu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160608210322.GA4248@localhost \
    --to=helgaas@kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.