All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yinghai Lu <yinghai@kernel.org>
To: linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v12 01/15] PCI: Let pci_mmap_page_range() take extra resource pointer
Date: Wed, 08 Jun 2016 22:35:16 +0000	[thread overview]
Message-ID: <CAE9FiQXPmG6UYYGHG52_i8vaBJ5yPm6Z4Zfx_BhQxVhyWC5fnw@mail.gmail.com> (raw)
In-Reply-To: <20160608210322.GA4248@localhost>

On Wed, Jun 8, 2016 at 2:03 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> Microblaze does look up the resource in pci_mmap_page_range(), but it
> never actually uses it.  It *looks* like it uses it, but that code is
> actually dead and I think we should apply the first patch below.

Good one.

>
> That leaves powerpc as the only arch that would use this extra
> resource pointer.  It uses it in __pci_mmap_set_pgprot() to help
> decide whether to make a normal uncacheable mapping or a write-
> combining one.  There's nothing here that's specific to the powerpc
> architecture, and I don't think we should add this parameter just to
> cater to powerpc.
>
> There are two cases where __pci_mmap_set_pgprot() on powerpc does
> something based on the resource:
>
>   1) We're using procfs to mmap I/O port space after we requested
>      write-combining, e.g., we did this:
>
>        ioctl(fd, PCIIOC_MMAP_IS_IO);           # request I/O port space
>        ioctl(fd, PCIIOC_WRITE_COMBINE, 1);     # request write-combining
>        mmap(fd, ...)
>
>      On powerpc, we ignore the write-combining request in this case.
>
>      I think we can handle this case by applying the second patch
>      below to ignore write-combining on I/O space for all arches, not
>      just powerpc.
>
>   2) We're using sysfs to mmap resourceN (not resourceN_wc), and
>      the resource is prefetchable.  On powerpc, we turn *on*
>      write-combining, even though the user didn't ask for it.
>
>      I'm not sure this case is actually safe, because it changes the
>      ordering properties.  If it *is* safe, we could enable write-
>      combining in pci_mmap_resource(), where we already have the
>      resource and it could be done for all arches.
>
>      This case is not strictly necessary, except to avoid a
>      performance regression, because the user could have mapped
>      resourceN_wc to explicitly request write-combining.
>

Agreed.

>
> commit 4e712b691abc5b579e3e4327f56b0b7988bdd1cb
> Author: Bjorn Helgaas <bhelgaas@google.com>
> Date:   Wed Jun 8 14:00:14 2016 -0500
>
>     microblaze/PCI: Remove useless __pci_mmap_set_pgprot()
>
>     The microblaze __pci_mmap_set_pgprot() was apparently copied from powerpc,
>     where it computes either an uncacheable pgprot_t or a write-combining one.
>     But on microblaze, we always use the regular uncacheable pgprot_t.
>
>     Remove the useless code in __pci_mmap_set_pgprot() and inline the
>     pgprot_noncached() at the only caller.
>
>     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>
> diff --git a/arch/microblaze/pci/pci-common.c b/arch/microblaze/pci/pci-common.c
> index 14cba60..1974567 100644
> --- a/arch/microblaze/pci/pci-common.c
> +++ b/arch/microblaze/pci/pci-common.c
> @@ -219,33 +219,6 @@ static struct resource *__pci_mmap_make_offset(struct pci_dev *dev,
>  }
>
>  /*
> - * Set vm_page_prot of VMA, as appropriate for this architecture, for a pci
> - * device mapping.
> - */
> -static pgprot_t __pci_mmap_set_pgprot(struct pci_dev *dev, struct resource *rp,
> -                                     pgprot_t protection,
> -                                     enum pci_mmap_state mmap_state,
> -                                     int write_combine)
> -{
> -       pgprot_t prot = protection;
> -
> -       /* Write combine is always 0 on non-memory space mappings. On
> -        * memory space, if the user didn't pass 1, we check for a
> -        * "prefetchable" resource. This is a bit hackish, but we use
> -        * this to workaround the inability of /sysfs to provide a write
> -        * combine bit
> -        */
> -       if (mmap_state != pci_mmap_mem)
> -               write_combine = 0;
> -       else if (write_combine = 0) {
> -               if (rp->flags & IORESOURCE_PREFETCH)
> -                       write_combine = 1;
> -       }
> -
> -       return pgprot_noncached(prot);
> -}
> -
> -/*
>   * This one is used by /dev/mem and fbdev who have no clue about the
>   * PCI device, it tries to find the PCI device first and calls the
>   * above routine
> @@ -317,9 +290,7 @@ int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
>                 return -EINVAL;
>
>         vma->vm_pgoff = offset >> PAGE_SHIFT;
> -       vma->vm_page_prot = __pci_mmap_set_pgprot(dev, rp,
> -                                                 vma->vm_page_prot,
> -                                                 mmap_state, write_combine);
> +       vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
>
>         ret = remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
>                                vma->vm_end - vma->vm_start, vma->vm_page_prot);
>

Acked-by: Yinghai Lu <yinghai@kernel.org>

>
>
> commit 962972ee5e0ba6ceb680cb182bad65f8886586a6
> Author: Bjorn Helgaas <bhelgaas@google.com>
> Date:   Wed Jun 8 14:46:54 2016 -0500
>
>     PCI: Ignore write-combining when mapping I/O port space
>
>     PCI exposes files like /proc/bus/pci/00/00.0 in procfs.  These files
>     support operations like this:
>
>       ioctl(fd, PCIIOC_MMAP_IS_IO);           # request I/O port space
>       ioctl(fd, PCIIOC_WRITE_COMBINE, 1);     # request write-combining
>       mmap(fd, ...)
>
>     Many architectures don't allow mmap of I/O port space at all, but I don't
>     think it makes sense to do a write-combining mapping on the ones that do.
>     We could change proc_bus_pci_ioctl() so the user could never enable write-
>     combining for I/O port space, but that would break the following sequence,
>     which is currently legal:
>
>       mmap(fd, ...)                           # default is I/O, non-combining
>       ioctl(fd, PCIIOC_WRITE_COMBINE, 1);     # request write-combining
>       ioctl(fd, PCIIOC_MMAP_IS_MEM);          # request memory space
>       mmap(fd, ...)
>
>     Ignore the write-combining flag when mapping I/O port space.
>
>     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>
> diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
> index 3f155e7..21f8d613 100644
> --- a/drivers/pci/proc.c
> +++ b/drivers/pci/proc.c
> @@ -247,7 +247,8 @@ static int proc_bus_pci_mmap(struct file *file, struct vm_area_struct *vma)
>
>         ret = pci_mmap_page_range(dev, vma,
>                                   fpriv->mmap_state,
> -                                 fpriv->write_combine);
> +                                 (fpriv->mmap_state = pci_mmap_mem) ?
> +                                       fpriv->write_combine : 0);
>         if (ret < 0)
>                 return ret;
>

ok to me.

At the same time, can you kill __pci_mmap_set_pgprot() for powerpc.

diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index 0f7a60f..0d0148d 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -356,36 +356,6 @@ static struct resource
*__pci_mmap_make_offset(struct pci_dev *dev,
 }

 /*
- * Set vm_page_prot of VMA, as appropriate for this architecture, for a pci
- * device mapping.
- */
-static pgprot_t __pci_mmap_set_pgprot(struct pci_dev *dev, struct resource *rp,
-                      pgprot_t protection,
-                      enum pci_mmap_state mmap_state,
-                      int write_combine)
-{
-
-    /* Write combine is always 0 on non-memory space mappings. On
-     * memory space, if the user didn't pass 1, we check for a
-     * "prefetchable" resource. This is a bit hackish, but we use
-     * this to workaround the inability of /sysfs to provide a write
-     * combine bit
-     */
-    if (mmap_state != pci_mmap_mem)
-        write_combine = 0;
-    else if (write_combine = 0) {
-        if (rp->flags & IORESOURCE_PREFETCH)
-            write_combine = 1;
-    }
-
-    /* XXX would be nice to have a way to ask for write-through */
-    if (write_combine)
-        return pgprot_noncached_wc(protection);
-    else
-        return pgprot_noncached(protection);
-}
-
-/*
  * This one is used by /dev/mem and fbdev who have no clue about the
  * PCI device, it tries to find the PCI device first and calls the
  * above routine
@@ -458,9 +428,10 @@ int pci_mmap_page_range(struct pci_dev *dev,
struct vm_area_struct *vma,
         return -EINVAL;

     vma->vm_pgoff = offset >> PAGE_SHIFT;
-    vma->vm_page_prot = __pci_mmap_set_pgprot(dev, rp,
-                          vma->vm_page_prot,
-                          mmap_state, write_combine);
+    if (write_combine)
+        vma->vm_page_prot = pgprot_noncached_wc(protection);
+    else
+        vma->vm_page_prot = pgprot_noncached(protection);

     ret = remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
                    vma->vm_end - vma->vm_start, vma->vm_page_prot);

WARNING: multiple messages have this Message-ID (diff)
From: Yinghai Lu <yinghai@kernel.org>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
	David Miller <davem@davemloft.net>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Wei Yang <weiyang@linux.vnet.ibm.com>,
	Khalid Aziz <khalid.aziz@oracle.com>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	linux-cris-kernel@axis.com,
	"linux-ia64@vger.kernel.org" <linux-ia64@vger.kernel.org>,
	"linux-mips@linux-mips.org" <linux-mips@linux-mips.org>,
	linux-am33-list@redhat.com, linux-parisc@vger.kernel.org,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
	linux-sh@vger.kernel.org,
	"sparclinux@vger.kernel.org" <sparclinux@vger.kernel.org>,
	linux-xtensa@linux-xtensa.org
Subject: Re: [PATCH v12 01/15] PCI: Let pci_mmap_page_range() take extra resource pointer
Date: Wed, 8 Jun 2016 15:35:16 -0700	[thread overview]
Message-ID: <CAE9FiQXPmG6UYYGHG52_i8vaBJ5yPm6Z4Zfx_BhQxVhyWC5fnw@mail.gmail.com> (raw)
In-Reply-To: <20160608210322.GA4248@localhost>

On Wed, Jun 8, 2016 at 2:03 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> Microblaze does look up the resource in pci_mmap_page_range(), but it
> never actually uses it.  It *looks* like it uses it, but that code is
> actually dead and I think we should apply the first patch below.

Good one.

>
> That leaves powerpc as the only arch that would use this extra
> resource pointer.  It uses it in __pci_mmap_set_pgprot() to help
> decide whether to make a normal uncacheable mapping or a write-
> combining one.  There's nothing here that's specific to the powerpc
> architecture, and I don't think we should add this parameter just to
> cater to powerpc.
>
> There are two cases where __pci_mmap_set_pgprot() on powerpc does
> something based on the resource:
>
>   1) We're using procfs to mmap I/O port space after we requested
>      write-combining, e.g., we did this:
>
>        ioctl(fd, PCIIOC_MMAP_IS_IO);           # request I/O port space
>        ioctl(fd, PCIIOC_WRITE_COMBINE, 1);     # request write-combining
>        mmap(fd, ...)
>
>      On powerpc, we ignore the write-combining request in this case.
>
>      I think we can handle this case by applying the second patch
>      below to ignore write-combining on I/O space for all arches, not
>      just powerpc.
>
>   2) We're using sysfs to mmap resourceN (not resourceN_wc), and
>      the resource is prefetchable.  On powerpc, we turn *on*
>      write-combining, even though the user didn't ask for it.
>
>      I'm not sure this case is actually safe, because it changes the
>      ordering properties.  If it *is* safe, we could enable write-
>      combining in pci_mmap_resource(), where we already have the
>      resource and it could be done for all arches.
>
>      This case is not strictly necessary, except to avoid a
>      performance regression, because the user could have mapped
>      resourceN_wc to explicitly request write-combining.
>

Agreed.

>
> commit 4e712b691abc5b579e3e4327f56b0b7988bdd1cb
> Author: Bjorn Helgaas <bhelgaas@google.com>
> Date:   Wed Jun 8 14:00:14 2016 -0500
>
>     microblaze/PCI: Remove useless __pci_mmap_set_pgprot()
>
>     The microblaze __pci_mmap_set_pgprot() was apparently copied from powerpc,
>     where it computes either an uncacheable pgprot_t or a write-combining one.
>     But on microblaze, we always use the regular uncacheable pgprot_t.
>
>     Remove the useless code in __pci_mmap_set_pgprot() and inline the
>     pgprot_noncached() at the only caller.
>
>     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>
> diff --git a/arch/microblaze/pci/pci-common.c b/arch/microblaze/pci/pci-common.c
> index 14cba60..1974567 100644
> --- a/arch/microblaze/pci/pci-common.c
> +++ b/arch/microblaze/pci/pci-common.c
> @@ -219,33 +219,6 @@ static struct resource *__pci_mmap_make_offset(struct pci_dev *dev,
>  }
>
>  /*
> - * Set vm_page_prot of VMA, as appropriate for this architecture, for a pci
> - * device mapping.
> - */
> -static pgprot_t __pci_mmap_set_pgprot(struct pci_dev *dev, struct resource *rp,
> -                                     pgprot_t protection,
> -                                     enum pci_mmap_state mmap_state,
> -                                     int write_combine)
> -{
> -       pgprot_t prot = protection;
> -
> -       /* Write combine is always 0 on non-memory space mappings. On
> -        * memory space, if the user didn't pass 1, we check for a
> -        * "prefetchable" resource. This is a bit hackish, but we use
> -        * this to workaround the inability of /sysfs to provide a write
> -        * combine bit
> -        */
> -       if (mmap_state != pci_mmap_mem)
> -               write_combine = 0;
> -       else if (write_combine == 0) {
> -               if (rp->flags & IORESOURCE_PREFETCH)
> -                       write_combine = 1;
> -       }
> -
> -       return pgprot_noncached(prot);
> -}
> -
> -/*
>   * This one is used by /dev/mem and fbdev who have no clue about the
>   * PCI device, it tries to find the PCI device first and calls the
>   * above routine
> @@ -317,9 +290,7 @@ int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
>                 return -EINVAL;
>
>         vma->vm_pgoff = offset >> PAGE_SHIFT;
> -       vma->vm_page_prot = __pci_mmap_set_pgprot(dev, rp,
> -                                                 vma->vm_page_prot,
> -                                                 mmap_state, write_combine);
> +       vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
>
>         ret = remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
>                                vma->vm_end - vma->vm_start, vma->vm_page_prot);
>

Acked-by: Yinghai Lu <yinghai@kernel.org>

>
>
> commit 962972ee5e0ba6ceb680cb182bad65f8886586a6
> Author: Bjorn Helgaas <bhelgaas@google.com>
> Date:   Wed Jun 8 14:46:54 2016 -0500
>
>     PCI: Ignore write-combining when mapping I/O port space
>
>     PCI exposes files like /proc/bus/pci/00/00.0 in procfs.  These files
>     support operations like this:
>
>       ioctl(fd, PCIIOC_MMAP_IS_IO);           # request I/O port space
>       ioctl(fd, PCIIOC_WRITE_COMBINE, 1);     # request write-combining
>       mmap(fd, ...)
>
>     Many architectures don't allow mmap of I/O port space at all, but I don't
>     think it makes sense to do a write-combining mapping on the ones that do.
>     We could change proc_bus_pci_ioctl() so the user could never enable write-
>     combining for I/O port space, but that would break the following sequence,
>     which is currently legal:
>
>       mmap(fd, ...)                           # default is I/O, non-combining
>       ioctl(fd, PCIIOC_WRITE_COMBINE, 1);     # request write-combining
>       ioctl(fd, PCIIOC_MMAP_IS_MEM);          # request memory space
>       mmap(fd, ...)
>
>     Ignore the write-combining flag when mapping I/O port space.
>
>     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>
> diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
> index 3f155e7..21f8d613 100644
> --- a/drivers/pci/proc.c
> +++ b/drivers/pci/proc.c
> @@ -247,7 +247,8 @@ static int proc_bus_pci_mmap(struct file *file, struct vm_area_struct *vma)
>
>         ret = pci_mmap_page_range(dev, vma,
>                                   fpriv->mmap_state,
> -                                 fpriv->write_combine);
> +                                 (fpriv->mmap_state == pci_mmap_mem) ?
> +                                       fpriv->write_combine : 0);
>         if (ret < 0)
>                 return ret;
>

ok to me.

At the same time, can you kill __pci_mmap_set_pgprot() for powerpc.

diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index 0f7a60f..0d0148d 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -356,36 +356,6 @@ static struct resource
*__pci_mmap_make_offset(struct pci_dev *dev,
 }

 /*
- * Set vm_page_prot of VMA, as appropriate for this architecture, for a pci
- * device mapping.
- */
-static pgprot_t __pci_mmap_set_pgprot(struct pci_dev *dev, struct resource *rp,
-                      pgprot_t protection,
-                      enum pci_mmap_state mmap_state,
-                      int write_combine)
-{
-
-    /* Write combine is always 0 on non-memory space mappings. On
-     * memory space, if the user didn't pass 1, we check for a
-     * "prefetchable" resource. This is a bit hackish, but we use
-     * this to workaround the inability of /sysfs to provide a write
-     * combine bit
-     */
-    if (mmap_state != pci_mmap_mem)
-        write_combine = 0;
-    else if (write_combine == 0) {
-        if (rp->flags & IORESOURCE_PREFETCH)
-            write_combine = 1;
-    }
-
-    /* XXX would be nice to have a way to ask for write-through */
-    if (write_combine)
-        return pgprot_noncached_wc(protection);
-    else
-        return pgprot_noncached(protection);
-}
-
-/*
  * This one is used by /dev/mem and fbdev who have no clue about the
  * PCI device, it tries to find the PCI device first and calls the
  * above routine
@@ -458,9 +428,10 @@ int pci_mmap_page_range(struct pci_dev *dev,
struct vm_area_struct *vma,
         return -EINVAL;

     vma->vm_pgoff = offset >> PAGE_SHIFT;
-    vma->vm_page_prot = __pci_mmap_set_pgprot(dev, rp,
-                          vma->vm_page_prot,
-                          mmap_state, write_combine);
+    if (write_combine)
+        vma->vm_page_prot = pgprot_noncached_wc(protection);
+    else
+        vma->vm_page_prot = pgprot_noncached(protection);

     ret = remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
                    vma->vm_end - vma->vm_start, vma->vm_page_prot);

WARNING: multiple messages have this Message-ID (diff)
From: Yinghai Lu <yinghai@kernel.org>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
	David Miller <davem@davemloft.net>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Wei Yang <weiyang@linux.vnet.ibm.com>,
	Khalid Aziz <khalid.aziz@oracle.com>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"linux-arm-kernel@lists.infradead.org" 
	<linux-arm-kernel@lists.infradead.org>,
	linux-cris-kernel@axis.com,
	"linux-ia64@vger.kernel.org" <linux-ia64@vger.kernel.org>,
	"linux-mips@linux-mips.org" <linux-mips@linux-mips.org>,
	linux-am33-list@redhat.com, linux-parisc@vger.kernel.org,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
	linux-sh@vger.kernel.org,
	"sparclinux@vger.kernel.org" <sparclinux@vger.kernel.org>,
	linux-xtensa@linux-xtensa.org
Subject: Re: [PATCH v12 01/15] PCI: Let pci_mmap_page_range() take extra resource pointer
Date: Wed, 8 Jun 2016 15:35:16 -0700	[thread overview]
Message-ID: <CAE9FiQXPmG6UYYGHG52_i8vaBJ5yPm6Z4Zfx_BhQxVhyWC5fnw@mail.gmail.com> (raw)
In-Reply-To: <20160608210322.GA4248@localhost>

On Wed, Jun 8, 2016 at 2:03 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> Microblaze does look up the resource in pci_mmap_page_range(), but it
> never actually uses it.  It *looks* like it uses it, but that code is
> actually dead and I think we should apply the first patch below.

Good one.

>
> That leaves powerpc as the only arch that would use this extra
> resource pointer.  It uses it in __pci_mmap_set_pgprot() to help
> decide whether to make a normal uncacheable mapping or a write-
> combining one.  There's nothing here that's specific to the powerpc
> architecture, and I don't think we should add this parameter just to
> cater to powerpc.
>
> There are two cases where __pci_mmap_set_pgprot() on powerpc does
> something based on the resource:
>
>   1) We're using procfs to mmap I/O port space after we requested
>      write-combining, e.g., we did this:
>
>        ioctl(fd, PCIIOC_MMAP_IS_IO);           # request I/O port space
>        ioctl(fd, PCIIOC_WRITE_COMBINE, 1);     # request write-combining
>        mmap(fd, ...)
>
>      On powerpc, we ignore the write-combining request in this case.
>
>      I think we can handle this case by applying the second patch
>      below to ignore write-combining on I/O space for all arches, not
>      just powerpc.
>
>   2) We're using sysfs to mmap resourceN (not resourceN_wc), and
>      the resource is prefetchable.  On powerpc, we turn *on*
>      write-combining, even though the user didn't ask for it.
>
>      I'm not sure this case is actually safe, because it changes the
>      ordering properties.  If it *is* safe, we could enable write-
>      combining in pci_mmap_resource(), where we already have the
>      resource and it could be done for all arches.
>
>      This case is not strictly necessary, except to avoid a
>      performance regression, because the user could have mapped
>      resourceN_wc to explicitly request write-combining.
>

Agreed.

>
> commit 4e712b691abc5b579e3e4327f56b0b7988bdd1cb
> Author: Bjorn Helgaas <bhelgaas@google.com>
> Date:   Wed Jun 8 14:00:14 2016 -0500
>
>     microblaze/PCI: Remove useless __pci_mmap_set_pgprot()
>
>     The microblaze __pci_mmap_set_pgprot() was apparently copied from powerpc,
>     where it computes either an uncacheable pgprot_t or a write-combining one.
>     But on microblaze, we always use the regular uncacheable pgprot_t.
>
>     Remove the useless code in __pci_mmap_set_pgprot() and inline the
>     pgprot_noncached() at the only caller.
>
>     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>
> diff --git a/arch/microblaze/pci/pci-common.c b/arch/microblaze/pci/pci-common.c
> index 14cba60..1974567 100644
> --- a/arch/microblaze/pci/pci-common.c
> +++ b/arch/microblaze/pci/pci-common.c
> @@ -219,33 +219,6 @@ static struct resource *__pci_mmap_make_offset(struct pci_dev *dev,
>  }
>
>  /*
> - * Set vm_page_prot of VMA, as appropriate for this architecture, for a pci
> - * device mapping.
> - */
> -static pgprot_t __pci_mmap_set_pgprot(struct pci_dev *dev, struct resource *rp,
> -                                     pgprot_t protection,
> -                                     enum pci_mmap_state mmap_state,
> -                                     int write_combine)
> -{
> -       pgprot_t prot = protection;
> -
> -       /* Write combine is always 0 on non-memory space mappings. On
> -        * memory space, if the user didn't pass 1, we check for a
> -        * "prefetchable" resource. This is a bit hackish, but we use
> -        * this to workaround the inability of /sysfs to provide a write
> -        * combine bit
> -        */
> -       if (mmap_state != pci_mmap_mem)
> -               write_combine = 0;
> -       else if (write_combine == 0) {
> -               if (rp->flags & IORESOURCE_PREFETCH)
> -                       write_combine = 1;
> -       }
> -
> -       return pgprot_noncached(prot);
> -}
> -
> -/*
>   * This one is used by /dev/mem and fbdev who have no clue about the
>   * PCI device, it tries to find the PCI device first and calls the
>   * above routine
> @@ -317,9 +290,7 @@ int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
>                 return -EINVAL;
>
>         vma->vm_pgoff = offset >> PAGE_SHIFT;
> -       vma->vm_page_prot = __pci_mmap_set_pgprot(dev, rp,
> -                                                 vma->vm_page_prot,
> -                                                 mmap_state, write_combine);
> +       vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
>
>         ret = remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
>                                vma->vm_end - vma->vm_start, vma->vm_page_prot);
>

Acked-by: Yinghai Lu <yinghai@kernel.org>

>
>
> commit 962972ee5e0ba6ceb680cb182bad65f8886586a6
> Author: Bjorn Helgaas <bhelgaas@google.com>
> Date:   Wed Jun 8 14:46:54 2016 -0500
>
>     PCI: Ignore write-combining when mapping I/O port space
>
>     PCI exposes files like /proc/bus/pci/00/00.0 in procfs.  These files
>     support operations like this:
>
>       ioctl(fd, PCIIOC_MMAP_IS_IO);           # request I/O port space
>       ioctl(fd, PCIIOC_WRITE_COMBINE, 1);     # request write-combining
>       mmap(fd, ...)
>
>     Many architectures don't allow mmap of I/O port space at all, but I don't
>     think it makes sense to do a write-combining mapping on the ones that do.
>     We could change proc_bus_pci_ioctl() so the user could never enable write-
>     combining for I/O port space, but that would break the following sequence,
>     which is currently legal:
>
>       mmap(fd, ...)                           # default is I/O, non-combining
>       ioctl(fd, PCIIOC_WRITE_COMBINE, 1);     # request write-combining
>       ioctl(fd, PCIIOC_MMAP_IS_MEM);          # request memory space
>       mmap(fd, ...)
>
>     Ignore the write-combining flag when mapping I/O port space.
>
>     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>
> diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
> index 3f155e7..21f8d613 100644
> --- a/drivers/pci/proc.c
> +++ b/drivers/pci/proc.c
> @@ -247,7 +247,8 @@ static int proc_bus_pci_mmap(struct file *file, struct vm_area_struct *vma)
>
>         ret = pci_mmap_page_range(dev, vma,
>                                   fpriv->mmap_state,
> -                                 fpriv->write_combine);
> +                                 (fpriv->mmap_state == pci_mmap_mem) ?
> +                                       fpriv->write_combine : 0);
>         if (ret < 0)
>                 return ret;
>

ok to me.

At the same time, can you kill __pci_mmap_set_pgprot() for powerpc.

diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index 0f7a60f..0d0148d 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -356,36 +356,6 @@ static struct resource
*__pci_mmap_make_offset(struct pci_dev *dev,
 }

 /*
- * Set vm_page_prot of VMA, as appropriate for this architecture, for a pci
- * device mapping.
- */
-static pgprot_t __pci_mmap_set_pgprot(struct pci_dev *dev, struct resource *rp,
-                      pgprot_t protection,
-                      enum pci_mmap_state mmap_state,
-                      int write_combine)
-{
-
-    /* Write combine is always 0 on non-memory space mappings. On
-     * memory space, if the user didn't pass 1, we check for a
-     * "prefetchable" resource. This is a bit hackish, but we use
-     * this to workaround the inability of /sysfs to provide a write
-     * combine bit
-     */
-    if (mmap_state != pci_mmap_mem)
-        write_combine = 0;
-    else if (write_combine == 0) {
-        if (rp->flags & IORESOURCE_PREFETCH)
-            write_combine = 1;
-    }
-
-    /* XXX would be nice to have a way to ask for write-through */
-    if (write_combine)
-        return pgprot_noncached_wc(protection);
-    else
-        return pgprot_noncached(protection);
-}
-
-/*
  * This one is used by /dev/mem and fbdev who have no clue about the
  * PCI device, it tries to find the PCI device first and calls the
  * above routine
@@ -458,9 +428,10 @@ int pci_mmap_page_range(struct pci_dev *dev,
struct vm_area_struct *vma,
         return -EINVAL;

     vma->vm_pgoff = offset >> PAGE_SHIFT;
-    vma->vm_page_prot = __pci_mmap_set_pgprot(dev, rp,
-                          vma->vm_page_prot,
-                          mmap_state, write_combine);
+    if (write_combine)
+        vma->vm_page_prot = pgprot_noncached_wc(protection);
+    else
+        vma->vm_page_prot = pgprot_noncached(protection);

     ret = remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
                    vma->vm_end - vma->vm_start, vma->vm_page_prot);

WARNING: multiple messages have this Message-ID (diff)
From: yinghai@kernel.org (Yinghai Lu)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v12 01/15] PCI: Let pci_mmap_page_range() take extra resource pointer
Date: Wed, 8 Jun 2016 15:35:16 -0700	[thread overview]
Message-ID: <CAE9FiQXPmG6UYYGHG52_i8vaBJ5yPm6Z4Zfx_BhQxVhyWC5fnw@mail.gmail.com> (raw)
In-Reply-To: <20160608210322.GA4248@localhost>

On Wed, Jun 8, 2016 at 2:03 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> Microblaze does look up the resource in pci_mmap_page_range(), but it
> never actually uses it.  It *looks* like it uses it, but that code is
> actually dead and I think we should apply the first patch below.

Good one.

>
> That leaves powerpc as the only arch that would use this extra
> resource pointer.  It uses it in __pci_mmap_set_pgprot() to help
> decide whether to make a normal uncacheable mapping or a write-
> combining one.  There's nothing here that's specific to the powerpc
> architecture, and I don't think we should add this parameter just to
> cater to powerpc.
>
> There are two cases where __pci_mmap_set_pgprot() on powerpc does
> something based on the resource:
>
>   1) We're using procfs to mmap I/O port space after we requested
>      write-combining, e.g., we did this:
>
>        ioctl(fd, PCIIOC_MMAP_IS_IO);           # request I/O port space
>        ioctl(fd, PCIIOC_WRITE_COMBINE, 1);     # request write-combining
>        mmap(fd, ...)
>
>      On powerpc, we ignore the write-combining request in this case.
>
>      I think we can handle this case by applying the second patch
>      below to ignore write-combining on I/O space for all arches, not
>      just powerpc.
>
>   2) We're using sysfs to mmap resourceN (not resourceN_wc), and
>      the resource is prefetchable.  On powerpc, we turn *on*
>      write-combining, even though the user didn't ask for it.
>
>      I'm not sure this case is actually safe, because it changes the
>      ordering properties.  If it *is* safe, we could enable write-
>      combining in pci_mmap_resource(), where we already have the
>      resource and it could be done for all arches.
>
>      This case is not strictly necessary, except to avoid a
>      performance regression, because the user could have mapped
>      resourceN_wc to explicitly request write-combining.
>

Agreed.

>
> commit 4e712b691abc5b579e3e4327f56b0b7988bdd1cb
> Author: Bjorn Helgaas <bhelgaas@google.com>
> Date:   Wed Jun 8 14:00:14 2016 -0500
>
>     microblaze/PCI: Remove useless __pci_mmap_set_pgprot()
>
>     The microblaze __pci_mmap_set_pgprot() was apparently copied from powerpc,
>     where it computes either an uncacheable pgprot_t or a write-combining one.
>     But on microblaze, we always use the regular uncacheable pgprot_t.
>
>     Remove the useless code in __pci_mmap_set_pgprot() and inline the
>     pgprot_noncached() at the only caller.
>
>     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>
> diff --git a/arch/microblaze/pci/pci-common.c b/arch/microblaze/pci/pci-common.c
> index 14cba60..1974567 100644
> --- a/arch/microblaze/pci/pci-common.c
> +++ b/arch/microblaze/pci/pci-common.c
> @@ -219,33 +219,6 @@ static struct resource *__pci_mmap_make_offset(struct pci_dev *dev,
>  }
>
>  /*
> - * Set vm_page_prot of VMA, as appropriate for this architecture, for a pci
> - * device mapping.
> - */
> -static pgprot_t __pci_mmap_set_pgprot(struct pci_dev *dev, struct resource *rp,
> -                                     pgprot_t protection,
> -                                     enum pci_mmap_state mmap_state,
> -                                     int write_combine)
> -{
> -       pgprot_t prot = protection;
> -
> -       /* Write combine is always 0 on non-memory space mappings. On
> -        * memory space, if the user didn't pass 1, we check for a
> -        * "prefetchable" resource. This is a bit hackish, but we use
> -        * this to workaround the inability of /sysfs to provide a write
> -        * combine bit
> -        */
> -       if (mmap_state != pci_mmap_mem)
> -               write_combine = 0;
> -       else if (write_combine == 0) {
> -               if (rp->flags & IORESOURCE_PREFETCH)
> -                       write_combine = 1;
> -       }
> -
> -       return pgprot_noncached(prot);
> -}
> -
> -/*
>   * This one is used by /dev/mem and fbdev who have no clue about the
>   * PCI device, it tries to find the PCI device first and calls the
>   * above routine
> @@ -317,9 +290,7 @@ int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
>                 return -EINVAL;
>
>         vma->vm_pgoff = offset >> PAGE_SHIFT;
> -       vma->vm_page_prot = __pci_mmap_set_pgprot(dev, rp,
> -                                                 vma->vm_page_prot,
> -                                                 mmap_state, write_combine);
> +       vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
>
>         ret = remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
>                                vma->vm_end - vma->vm_start, vma->vm_page_prot);
>

Acked-by: Yinghai Lu <yinghai@kernel.org>

>
>
> commit 962972ee5e0ba6ceb680cb182bad65f8886586a6
> Author: Bjorn Helgaas <bhelgaas@google.com>
> Date:   Wed Jun 8 14:46:54 2016 -0500
>
>     PCI: Ignore write-combining when mapping I/O port space
>
>     PCI exposes files like /proc/bus/pci/00/00.0 in procfs.  These files
>     support operations like this:
>
>       ioctl(fd, PCIIOC_MMAP_IS_IO);           # request I/O port space
>       ioctl(fd, PCIIOC_WRITE_COMBINE, 1);     # request write-combining
>       mmap(fd, ...)
>
>     Many architectures don't allow mmap of I/O port space at all, but I don't
>     think it makes sense to do a write-combining mapping on the ones that do.
>     We could change proc_bus_pci_ioctl() so the user could never enable write-
>     combining for I/O port space, but that would break the following sequence,
>     which is currently legal:
>
>       mmap(fd, ...)                           # default is I/O, non-combining
>       ioctl(fd, PCIIOC_WRITE_COMBINE, 1);     # request write-combining
>       ioctl(fd, PCIIOC_MMAP_IS_MEM);          # request memory space
>       mmap(fd, ...)
>
>     Ignore the write-combining flag when mapping I/O port space.
>
>     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>
> diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
> index 3f155e7..21f8d613 100644
> --- a/drivers/pci/proc.c
> +++ b/drivers/pci/proc.c
> @@ -247,7 +247,8 @@ static int proc_bus_pci_mmap(struct file *file, struct vm_area_struct *vma)
>
>         ret = pci_mmap_page_range(dev, vma,
>                                   fpriv->mmap_state,
> -                                 fpriv->write_combine);
> +                                 (fpriv->mmap_state == pci_mmap_mem) ?
> +                                       fpriv->write_combine : 0);
>         if (ret < 0)
>                 return ret;
>

ok to me.

At the same time, can you kill __pci_mmap_set_pgprot() for powerpc.

diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index 0f7a60f..0d0148d 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -356,36 +356,6 @@ static struct resource
*__pci_mmap_make_offset(struct pci_dev *dev,
 }

 /*
- * Set vm_page_prot of VMA, as appropriate for this architecture, for a pci
- * device mapping.
- */
-static pgprot_t __pci_mmap_set_pgprot(struct pci_dev *dev, struct resource *rp,
-                      pgprot_t protection,
-                      enum pci_mmap_state mmap_state,
-                      int write_combine)
-{
-
-    /* Write combine is always 0 on non-memory space mappings. On
-     * memory space, if the user didn't pass 1, we check for a
-     * "prefetchable" resource. This is a bit hackish, but we use
-     * this to workaround the inability of /sysfs to provide a write
-     * combine bit
-     */
-    if (mmap_state != pci_mmap_mem)
-        write_combine = 0;
-    else if (write_combine == 0) {
-        if (rp->flags & IORESOURCE_PREFETCH)
-            write_combine = 1;
-    }
-
-    /* XXX would be nice to have a way to ask for write-through */
-    if (write_combine)
-        return pgprot_noncached_wc(protection);
-    else
-        return pgprot_noncached(protection);
-}
-
-/*
  * This one is used by /dev/mem and fbdev who have no clue about the
  * PCI device, it tries to find the PCI device first and calls the
  * above routine
@@ -458,9 +428,10 @@ int pci_mmap_page_range(struct pci_dev *dev,
struct vm_area_struct *vma,
         return -EINVAL;

     vma->vm_pgoff = offset >> PAGE_SHIFT;
-    vma->vm_page_prot = __pci_mmap_set_pgprot(dev, rp,
-                          vma->vm_page_prot,
-                          mmap_state, write_combine);
+    if (write_combine)
+        vma->vm_page_prot = pgprot_noncached_wc(protection);
+    else
+        vma->vm_page_prot = pgprot_noncached(protection);

     ret = remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
                    vma->vm_end - vma->vm_start, vma->vm_page_prot);

  reply	other threads:[~2016-06-08 22:35 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-06-04  0:06 [PATCH v12 00/15] PCI: Fixup for 64bit resource with sparc Yinghai Lu
2016-06-04  0:06 ` [PATCH v12 01/15] PCI: Let pci_mmap_page_range() take extra resource pointer Yinghai Lu
2016-06-04  0:06   ` Yinghai Lu
2016-06-04  0:06   ` Yinghai Lu
2016-06-07  8:04   ` Jesper Nilsson
2016-06-07  8:04     ` Jesper Nilsson
2016-06-07  8:04     ` Jesper Nilsson
2016-06-08 21:03   ` Bjorn Helgaas
2016-06-08 21:03     ` Bjorn Helgaas
2016-06-08 21:03     ` Bjorn Helgaas
2016-06-08 21:03     ` Bjorn Helgaas
2016-06-08 22:35     ` Yinghai Lu [this message]
2016-06-08 22:35       ` Yinghai Lu
2016-06-08 22:35       ` Yinghai Lu
2016-06-08 22:35       ` Yinghai Lu
2016-06-09  0:00       ` Yinghai Lu
2016-06-09  0:00         ` Yinghai Lu
2016-06-09  0:00         ` Yinghai Lu
2016-06-09  0:00         ` Yinghai Lu
2016-06-09 22:36         ` Yinghai Lu
2016-06-09 22:36           ` Yinghai Lu
2016-06-09 22:36           ` Yinghai Lu
2016-06-09 22:36           ` Yinghai Lu
2016-06-04  0:06 ` [PATCH v12 02/15] PCI: Let pci_mmap_page_range() take resource address Yinghai Lu
2016-06-04  0:06   ` Yinghai Lu
2016-06-04  0:06 ` [PATCH v12 03/15] sparc/PCI: Use correct offset for bus address to resource Yinghai Lu
2016-06-04  0:06   ` Yinghai Lu
2016-06-04  0:06 ` [PATCH v12 04/15] PCI: Add pci_find_bus_resource() Yinghai Lu
2016-06-04  0:06 ` [PATCH v12 05/15] sparc/PCI: Reserve legacy mmio after PCI mmio Yinghai Lu
2016-06-04  0:06   ` Yinghai Lu
2016-06-04  0:06 ` [PATCH v12 06/15] sparc/PCI: Add IORESOURCE_MEM_64 for 64-bit resource in OF parsing Yinghai Lu
2016-06-04  0:06   ` Yinghai Lu
2016-06-04  0:06 ` [PATCH v12 07/15] sparc/PCI: Keep resource idx order with bridge register number Yinghai Lu
2016-06-04  0:06   ` Yinghai Lu
2016-06-04  0:06 ` [PATCH v12 08/15] powerpc/PCI: " Yinghai Lu
2016-06-04  0:06 ` [PATCH v12 09/15] powerpc/PCI: Add IORESOURCE_MEM_64 for 64-bit resource in OF parsing Yinghai Lu
2016-06-04  0:06 ` [PATCH v12 10/15] OF/PCI: Add IORESOURCE_MEM_64 for 64-bit resource Yinghai Lu
2016-06-04  0:06 ` [PATCH v12 11/15] PCI: Check pref compatible bit for mem64 resource of PCIe device Yinghai Lu
2016-06-04  0:06   ` Yinghai Lu
2016-06-04  0:06 ` [PATCH v12 12/15] PCI: Only treat non-pref mmio64 as pref if all bridges have MEM_64 Yinghai Lu
2016-06-04  0:06 ` [PATCH v12 13/15] PCI: Add has_mem64 for struct host_bridge Yinghai Lu
2016-06-04  0:06 ` [PATCH v12 14/15] PCI: Only treat non-pref mmio64 as pref if host bridge has mmio64 Yinghai Lu
2016-06-04  0:06 ` [PATCH v12 15/15] PCI: Restore pref MMIO allocation logic for host bridge without mmio64 Yinghai Lu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAE9FiQXPmG6UYYGHG52_i8vaBJ5yPm6Z4Zfx_BhQxVhyWC5fnw@mail.gmail.com \
    --to=yinghai@kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.