linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] pci sysfs file iomem revoke support
@ 2021-02-04 16:58 Daniel Vetter
  2021-02-04 16:58 ` [PATCH 1/2] PCI: also set up legacy files only after sysfs init Daniel Vetter
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Daniel Vetter @ 2021-02-04 16:58 UTC (permalink / raw)
  To: LKML
  Cc: DRI Development, Daniel Vetter, Stephen Rothwell,
	Jason Gunthorpe, Kees Cook, Dan Williams, Andrew Morton,
	John Hubbard, Jérôme Glisse, Jan Kara,
	Greg Kroah-Hartman, linux-mm, linux-arm-kernel,
	linux-samsung-soc, linux-media, Bjorn Helgaas, linux-pci

Hi all,

This is a revised version of patch 12 from my series to lock down some
follow_pfn vs VM_SPECIAL races:

https://lore.kernel.org/dri-devel/CAKwvOdnSrsnTgPEuQJyaOTSkTP2dR9208Y66HQG_h1e2LKfqtw@mail.gmail.com/

Stephen reported an issue on HAVE_PCI_LEGACY platforms which this patch
set tries to address. Previous patches are all still in linux-next.

Stephen, would be awesome if you can give this a spin.

Björn/Greg, review on the first patch is needed, I think that's the
cleanest approach from all the options I discussed with Greg in this
thread:

https://lore.kernel.org/dri-devel/CAKMK7uGrdDrbtj0OyzqQc0CGrQwc2F3tFJU9vLfm2jjufAZ5YQ@mail.gmail.com/

Cheers, Daniel

Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Kees Cook <keescook@chromium.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-mm@kvack.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-samsung-soc@vger.kernel.org
Cc: linux-media@vger.kernel.org
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-pci@vger.kernel.org

Daniel Vetter (2):
  PCI: also set up legacy files only after sysfs init
  PCI: Revoke mappings like devmem

 drivers/pci/pci-sysfs.c | 11 +++++++++++
 drivers/pci/proc.c      |  1 +
 2 files changed, 12 insertions(+)

-- 
2.30.0


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/2] PCI: also set up legacy files only after sysfs init
  2021-02-04 16:58 [PATCH 0/2] pci sysfs file iomem revoke support Daniel Vetter
@ 2021-02-04 16:58 ` Daniel Vetter
  2021-02-04 21:50   ` Bjorn Helgaas
  2021-02-04 16:58 ` [PATCH 2/2] PCI: Revoke mappings like devmem Daniel Vetter
  2021-02-07 19:43 ` [PATCH 0/2] pci sysfs file iomem revoke support Stephen Rothwell
  2 siblings, 1 reply; 14+ messages in thread
From: Daniel Vetter @ 2021-02-04 16:58 UTC (permalink / raw)
  To: LKML
  Cc: DRI Development, Daniel Vetter, Daniel Vetter, Stephen Rothwell,
	Jason Gunthorpe, Kees Cook, Dan Williams, Andrew Morton,
	John Hubbard, Jérôme Glisse, Jan Kara,
	Greg Kroah-Hartman, linux-mm, linux-arm-kernel,
	linux-samsung-soc, linux-media, Bjorn Helgaas, linux-pci

We are already doing this for all the regular sysfs files on PCI
devices, but not yet on the legacy io files on the PCI buses. Thus far
now problem, but in the next patch I want to wire up iomem revoke
support. That needs the vfs up an running already to make so that
iomem_get_mapping() works.

Wire it up exactly like the existing code. Note that
pci_remove_legacy_files() doesn't need a check since the one for
pci_bus->legacy_io is sufficient.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Kees Cook <keescook@chromium.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-mm@kvack.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-samsung-soc@vger.kernel.org
Cc: linux-media@vger.kernel.org
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-pci@vger.kernel.org
---
 drivers/pci/pci-sysfs.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index fb072f4b3176..0c45b4f7b214 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -927,6 +927,9 @@ void pci_create_legacy_files(struct pci_bus *b)
 {
 	int error;
 
+	if (!sysfs_initialized)
+		return;
+
 	b->legacy_io = kcalloc(2, sizeof(struct bin_attribute),
 			       GFP_ATOMIC);
 	if (!b->legacy_io)
@@ -1448,6 +1451,7 @@ void pci_remove_sysfs_dev_files(struct pci_dev *pdev)
 static int __init pci_sysfs_init(void)
 {
 	struct pci_dev *pdev = NULL;
+	struct pci_bus *pbus = NULL;
 	int retval;
 
 	sysfs_initialized = 1;
@@ -1459,6 +1463,9 @@ static int __init pci_sysfs_init(void)
 		}
 	}
 
+	while ((pbus = pci_find_next_bus(pbus)))
+		pci_create_legacy_files(pbus);
+
 	return 0;
 }
 late_initcall(pci_sysfs_init);
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/2] PCI: Revoke mappings like devmem
  2021-02-04 16:58 [PATCH 0/2] pci sysfs file iomem revoke support Daniel Vetter
  2021-02-04 16:58 ` [PATCH 1/2] PCI: also set up legacy files only after sysfs init Daniel Vetter
@ 2021-02-04 16:58 ` Daniel Vetter
  2021-02-10 22:20   ` Bjorn Helgaas
  2021-03-13 21:57   ` Bjorn Helgaas
  2021-02-07 19:43 ` [PATCH 0/2] pci sysfs file iomem revoke support Stephen Rothwell
  2 siblings, 2 replies; 14+ messages in thread
From: Daniel Vetter @ 2021-02-04 16:58 UTC (permalink / raw)
  To: LKML
  Cc: DRI Development, Daniel Vetter, Bjorn Helgaas, Dan Williams,
	Daniel Vetter, Stephen Rothwell, Jason Gunthorpe, Kees Cook,
	Andrew Morton, John Hubbard, Jérôme Glisse, Jan Kara,
	Greg Kroah-Hartman, linux-mm, linux-arm-kernel,
	linux-samsung-soc, linux-media, linux-pci

Since 3234ac664a87 ("/dev/mem: Revoke mappings when a driver claims
the region") /dev/kmem zaps ptes when the kernel requests exclusive
acccess to an iomem region. And with CONFIG_IO_STRICT_DEVMEM, this is
the default for all driver uses.

Except there's two more ways to access PCI BARs: sysfs and proc mmap
support. Let's plug that hole.

For revoke_devmem() to work we need to link our vma into the same
address_space, with consistent vma->vm_pgoff. ->pgoff is already
adjusted, because that's how (io_)remap_pfn_range works, but for the
mapping we need to adjust vma->vm_file->f_mapping. The cleanest way is
to adjust this at at ->open time:

- for sysfs this is easy, now that binary attributes support this. We
  just set bin_attr->mapping when mmap is supported
- for procfs it's a bit more tricky, since procfs pci access has only
  one file per device, and access to a specific resources first needs
  to be set up with some ioctl calls. But mmap is only supported for
  the same resources as sysfs exposes with mmap support, and otherwise
  rejected, so we can set the mapping unconditionally at open time
  without harm.

A special consideration is for arch_can_pci_mmap_io() - we need to
make sure that the ->f_mapping doesn't alias between ioport and iomem
space. There's only 2 ways in-tree to support mmap of ioports: generic
pci mmap (ARCH_GENERIC_PCI_MMAP_RESOURCE), and sparc as the single
architecture hand-rolling. Both approach support ioport mmap through a
special pfn range and not through magic pte attributes. Aliasing is
therefore not a problem.

The only difference in access checks left is that sysfs PCI mmap does
not check for CAP_RAWIO. I'm not really sure whether that should be
added or not.

Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Kees Cook <keescook@chromium.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-mm@kvack.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-samsung-soc@vger.kernel.org
Cc: linux-media@vger.kernel.org
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-pci@vger.kernel.org
---
 drivers/pci/pci-sysfs.c | 4 ++++
 drivers/pci/proc.c      | 1 +
 2 files changed, 5 insertions(+)

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 0c45b4f7b214..f8afd54ca3e1 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -942,6 +942,7 @@ void pci_create_legacy_files(struct pci_bus *b)
 	b->legacy_io->read = pci_read_legacy_io;
 	b->legacy_io->write = pci_write_legacy_io;
 	b->legacy_io->mmap = pci_mmap_legacy_io;
+	b->legacy_io->mapping = iomem_get_mapping();
 	pci_adjust_legacy_attr(b, pci_mmap_io);
 	error = device_create_bin_file(&b->dev, b->legacy_io);
 	if (error)
@@ -954,6 +955,7 @@ void pci_create_legacy_files(struct pci_bus *b)
 	b->legacy_mem->size = 1024*1024;
 	b->legacy_mem->attr.mode = 0600;
 	b->legacy_mem->mmap = pci_mmap_legacy_mem;
+	b->legacy_io->mapping = iomem_get_mapping();
 	pci_adjust_legacy_attr(b, pci_mmap_mem);
 	error = device_create_bin_file(&b->dev, b->legacy_mem);
 	if (error)
@@ -1169,6 +1171,8 @@ static int pci_create_attr(struct pci_dev *pdev, int num, int write_combine)
 			res_attr->mmap = pci_mmap_resource_uc;
 		}
 	}
+	if (res_attr->mmap)
+		res_attr->mapping = iomem_get_mapping();
 	res_attr->attr.name = res_attr_name;
 	res_attr->attr.mode = 0600;
 	res_attr->size = pci_resource_len(pdev, num);
diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
index 3a2f90beb4cb..9bab07302bbf 100644
--- a/drivers/pci/proc.c
+++ b/drivers/pci/proc.c
@@ -298,6 +298,7 @@ static int proc_bus_pci_open(struct inode *inode, struct file *file)
 	fpriv->write_combine = 0;
 
 	file->private_data = fpriv;
+	file->f_mapping = iomem_get_mapping();
 
 	return 0;
 }
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] PCI: also set up legacy files only after sysfs init
  2021-02-04 16:58 ` [PATCH 1/2] PCI: also set up legacy files only after sysfs init Daniel Vetter
@ 2021-02-04 21:50   ` Bjorn Helgaas
  2021-02-04 22:24     ` Pali Rohár
  2021-02-05  9:23     ` Daniel Vetter
  0 siblings, 2 replies; 14+ messages in thread
From: Bjorn Helgaas @ 2021-02-04 21:50 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: LKML, Stephen Rothwell, linux-samsung-soc, Jan Kara, Kees Cook,
	Greg Kroah-Hartman, linux-pci, DRI Development, linux-mm,
	Jason Gunthorpe, Jérôme Glisse, John Hubbard,
	Bjorn Helgaas, Daniel Vetter, Dan Williams, Andrew Morton,
	linux-arm-kernel, linux-media, Oliver O'Halloran,
	Pali Rohár, Krzysztof Wilczyński

[+cc Oliver, Pali, Krzysztof]

s/also/Also/ in subject

On Thu, Feb 04, 2021 at 05:58:30PM +0100, Daniel Vetter wrote:
> We are already doing this for all the regular sysfs files on PCI
> devices, but not yet on the legacy io files on the PCI buses. Thus far
> now problem, but in the next patch I want to wire up iomem revoke
> support. That needs the vfs up an running already to make so that
> iomem_get_mapping() works.

s/now problem/no problem/
s/an running/and running/
s/so that/sure that/ ?

iomem_get_mapping() doesn't exist; I don't know what that should be.

> Wire it up exactly like the existing code. Note that
> pci_remove_legacy_files() doesn't need a check since the one for
> pci_bus->legacy_io is sufficient.

I'm not sure exactly what you mean by "the existing code."  I could
probably figure it out, but it would save time to mention the existing
function here.

This looks like another instance where we should really apply Oliver's
idea of converting these to attribute_groups [1].

The cover letter mentions options discussed with Greg in [2], but I
don't think the "sysfs_initialized" hack vs attribute_groups was part
of that discussion.

It's not absolutely a show-stopper, but it *is* a shame to extend the
sysfs_initialized hack if attribute_groups could do this more cleanly
and help solve more than one issue.

Bjorn

[1] https://lore.kernel.org/r/CAOSf1CHss03DBSDO4PmTtMp0tCEu5kScn704ZEwLKGXQzBfqaA@mail.gmail.com
[2] https://lore.kernel.org/dri-devel/CAKMK7uGrdDrbtj0OyzqQc0CGrQwc2F3tFJU9vLfm2jjufAZ5YQ@mail.gmail.com/

> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Stephen Rothwell <sfr@canb.auug.org.au>
> Cc: Jason Gunthorpe <jgg@ziepe.ca>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: John Hubbard <jhubbard@nvidia.com>
> Cc: Jérôme Glisse <jglisse@redhat.com>
> Cc: Jan Kara <jack@suse.cz>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: linux-mm@kvack.org
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-samsung-soc@vger.kernel.org
> Cc: linux-media@vger.kernel.org
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: linux-pci@vger.kernel.org
> ---
>  drivers/pci/pci-sysfs.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index fb072f4b3176..0c45b4f7b214 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -927,6 +927,9 @@ void pci_create_legacy_files(struct pci_bus *b)
>  {
>  	int error;
>  
> +	if (!sysfs_initialized)
> +		return;
> +
>  	b->legacy_io = kcalloc(2, sizeof(struct bin_attribute),
>  			       GFP_ATOMIC);
>  	if (!b->legacy_io)
> @@ -1448,6 +1451,7 @@ void pci_remove_sysfs_dev_files(struct pci_dev *pdev)
>  static int __init pci_sysfs_init(void)
>  {
>  	struct pci_dev *pdev = NULL;
> +	struct pci_bus *pbus = NULL;
>  	int retval;
>  
>  	sysfs_initialized = 1;
> @@ -1459,6 +1463,9 @@ static int __init pci_sysfs_init(void)
>  		}
>  	}
>  
> +	while ((pbus = pci_find_next_bus(pbus)))
> +		pci_create_legacy_files(pbus);
> +
>  	return 0;
>  }
>  late_initcall(pci_sysfs_init);
> -- 
> 2.30.0
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] PCI: also set up legacy files only after sysfs init
  2021-02-04 21:50   ` Bjorn Helgaas
@ 2021-02-04 22:24     ` Pali Rohár
  2021-02-05  9:59       ` Daniel Vetter
  2021-02-05  9:23     ` Daniel Vetter
  1 sibling, 1 reply; 14+ messages in thread
From: Pali Rohár @ 2021-02-04 22:24 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Daniel Vetter, LKML, Stephen Rothwell, linux-samsung-soc,
	Jan Kara, Kees Cook, Greg Kroah-Hartman, linux-pci, linux-mm,
	Jason Gunthorpe, Jérôme Glisse, John Hubbard,
	Bjorn Helgaas, Daniel Vetter, Dan Williams, Andrew Morton,
	linux-arm-kernel, linux-media, Oliver O'Halloran,
	Krzysztof Wilczyński

On Thursday 04 February 2021 15:50:19 Bjorn Helgaas wrote:
> [+cc Oliver, Pali, Krzysztof]

Just to note that extending or using sysfs_initialized introduces
another race condition into kernel code which results in PCI fatal
errors. Details are in email discussion which Bjorn already sent.

> s/also/Also/ in subject
> 
> On Thu, Feb 04, 2021 at 05:58:30PM +0100, Daniel Vetter wrote:
> > We are already doing this for all the regular sysfs files on PCI
> > devices, but not yet on the legacy io files on the PCI buses. Thus far
> > now problem, but in the next patch I want to wire up iomem revoke
> > support. That needs the vfs up an running already to make so that
> > iomem_get_mapping() works.
> 
> s/now problem/no problem/
> s/an running/and running/
> s/so that/sure that/ ?
> 
> iomem_get_mapping() doesn't exist; I don't know what that should be.
> 
> > Wire it up exactly like the existing code. Note that
> > pci_remove_legacy_files() doesn't need a check since the one for
> > pci_bus->legacy_io is sufficient.
> 
> I'm not sure exactly what you mean by "the existing code."  I could
> probably figure it out, but it would save time to mention the existing
> function here.
> 
> This looks like another instance where we should really apply Oliver's
> idea of converting these to attribute_groups [1].
> 
> The cover letter mentions options discussed with Greg in [2], but I
> don't think the "sysfs_initialized" hack vs attribute_groups was part
> of that discussion.
> 
> It's not absolutely a show-stopper, but it *is* a shame to extend the
> sysfs_initialized hack if attribute_groups could do this more cleanly
> and help solve more than one issue.
> 
> Bjorn
> 
> [1] https://lore.kernel.org/r/CAOSf1CHss03DBSDO4PmTtMp0tCEu5kScn704ZEwLKGXQzBfqaA@mail.gmail.com
> [2] https://lore.kernel.org/dri-devel/CAKMK7uGrdDrbtj0OyzqQc0CGrQwc2F3tFJU9vLfm2jjufAZ5YQ@mail.gmail.com/
> 
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Stephen Rothwell <sfr@canb.auug.org.au>
> > Cc: Jason Gunthorpe <jgg@ziepe.ca>
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: John Hubbard <jhubbard@nvidia.com>
> > Cc: Jérôme Glisse <jglisse@redhat.com>
> > Cc: Jan Kara <jack@suse.cz>
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > Cc: linux-mm@kvack.org
> > Cc: linux-arm-kernel@lists.infradead.org
> > Cc: linux-samsung-soc@vger.kernel.org
> > Cc: linux-media@vger.kernel.org
> > Cc: Bjorn Helgaas <bhelgaas@google.com>
> > Cc: linux-pci@vger.kernel.org
> > ---
> >  drivers/pci/pci-sysfs.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> > 
> > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> > index fb072f4b3176..0c45b4f7b214 100644
> > --- a/drivers/pci/pci-sysfs.c
> > +++ b/drivers/pci/pci-sysfs.c
> > @@ -927,6 +927,9 @@ void pci_create_legacy_files(struct pci_bus *b)
> >  {
> >  	int error;
> >  
> > +	if (!sysfs_initialized)
> > +		return;
> > +
> >  	b->legacy_io = kcalloc(2, sizeof(struct bin_attribute),
> >  			       GFP_ATOMIC);
> >  	if (!b->legacy_io)
> > @@ -1448,6 +1451,7 @@ void pci_remove_sysfs_dev_files(struct pci_dev *pdev)
> >  static int __init pci_sysfs_init(void)
> >  {
> >  	struct pci_dev *pdev = NULL;
> > +	struct pci_bus *pbus = NULL;
> >  	int retval;
> >  
> >  	sysfs_initialized = 1;
> > @@ -1459,6 +1463,9 @@ static int __init pci_sysfs_init(void)
> >  		}
> >  	}
> >  
> > +	while ((pbus = pci_find_next_bus(pbus)))
> > +		pci_create_legacy_files(pbus);
> > +
> >  	return 0;
> >  }
> >  late_initcall(pci_sysfs_init);
> > -- 
> > 2.30.0
> > 
> > 
> > _______________________________________________
> > linux-arm-kernel mailing list
> > linux-arm-kernel@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] PCI: also set up legacy files only after sysfs init
  2021-02-04 21:50   ` Bjorn Helgaas
  2021-02-04 22:24     ` Pali Rohár
@ 2021-02-05  9:23     ` Daniel Vetter
  1 sibling, 0 replies; 14+ messages in thread
From: Daniel Vetter @ 2021-02-05  9:23 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: LKML, Stephen Rothwell, linux-samsung-soc, Jan Kara, Kees Cook,
	Greg Kroah-Hartman, Linux PCI, DRI Development, Linux MM,
	Jason Gunthorpe, Jérôme Glisse, John Hubbard,
	Bjorn Helgaas, Daniel Vetter, Dan Williams, Andrew Morton,
	Linux ARM, open list:DMA BUFFER SHARING FRAMEWORK,
	Oliver O'Halloran, Pali Rohár,
	Krzysztof Wilczyński

On Thu, Feb 4, 2021 at 10:50 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> [+cc Oliver, Pali, Krzysztof]
>
> s/also/Also/ in subject
>
> On Thu, Feb 04, 2021 at 05:58:30PM +0100, Daniel Vetter wrote:
> > We are already doing this for all the regular sysfs files on PCI
> > devices, but not yet on the legacy io files on the PCI buses. Thus far
> > now problem, but in the next patch I want to wire up iomem revoke
> > support. That needs the vfs up an running already to make so that
> > iomem_get_mapping() works.
>
> s/now problem/no problem/
> s/an running/and running/
> s/so that/sure that/ ?
>
> iomem_get_mapping() doesn't exist; I don't know what that should be.

Series is based on top of linux-next, where iomem_get_mapping exists.
This patch fixes the 2nd patch in this series, which I had to take out
of my branch because it failed.

> > Wire it up exactly like the existing code. Note that
> > pci_remove_legacy_files() doesn't need a check since the one for
> > pci_bus->legacy_io is sufficient.
>
> I'm not sure exactly what you mean by "the existing code."  I could
> probably figure it out, but it would save time to mention the existing
> function here.

Sorry, I meant the existing code in pci_create_sysfs_dev_files().

> This looks like another instance where we should really apply Oliver's
> idea of converting these to attribute_groups [1].
>
> The cover letter mentions options discussed with Greg in [2], but I
> don't think the "sysfs_initialized" hack vs attribute_groups was part
> of that discussion.

Hm not sure the attribute_groups works. The problem is that I cant set
up the attributes before the vfs layer is initialized, because before
that point the iomem_get_mapping function doesn't return anything
useful (well it crashes), because it needs to have an inode available.

So if you want to set up the attributes earlier, we'd need some kind
of callback, which Greg didn't like.

> It's not absolutely a show-stopper, but it *is* a shame to extend the
> sysfs_initialized hack if attribute_groups could do this more cleanly
> and help solve more than one issue.

So I think I have yet another init ordering problem here, but not sure.
-Daniel

>
> Bjorn
>
> [1] https://lore.kernel.org/r/CAOSf1CHss03DBSDO4PmTtMp0tCEu5kScn704ZEwLKGXQzBfqaA@mail.gmail.com
> [2] https://lore.kernel.org/dri-devel/CAKMK7uGrdDrbtj0OyzqQc0CGrQwc2F3tFJU9vLfm2jjufAZ5YQ@mail.gmail.com/
>
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Stephen Rothwell <sfr@canb.auug.org.au>
> > Cc: Jason Gunthorpe <jgg@ziepe.ca>
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: John Hubbard <jhubbard@nvidia.com>
> > Cc: Jérôme Glisse <jglisse@redhat.com>
> > Cc: Jan Kara <jack@suse.cz>
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > Cc: linux-mm@kvack.org
> > Cc: linux-arm-kernel@lists.infradead.org
> > Cc: linux-samsung-soc@vger.kernel.org
> > Cc: linux-media@vger.kernel.org
> > Cc: Bjorn Helgaas <bhelgaas@google.com>
> > Cc: linux-pci@vger.kernel.org
> > ---
> >  drivers/pci/pci-sysfs.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> > index fb072f4b3176..0c45b4f7b214 100644
> > --- a/drivers/pci/pci-sysfs.c
> > +++ b/drivers/pci/pci-sysfs.c
> > @@ -927,6 +927,9 @@ void pci_create_legacy_files(struct pci_bus *b)
> >  {
> >       int error;
> >
> > +     if (!sysfs_initialized)
> > +             return;
> > +
> >       b->legacy_io = kcalloc(2, sizeof(struct bin_attribute),
> >                              GFP_ATOMIC);
> >       if (!b->legacy_io)
> > @@ -1448,6 +1451,7 @@ void pci_remove_sysfs_dev_files(struct pci_dev *pdev)
> >  static int __init pci_sysfs_init(void)
> >  {
> >       struct pci_dev *pdev = NULL;
> > +     struct pci_bus *pbus = NULL;
> >       int retval;
> >
> >       sysfs_initialized = 1;
> > @@ -1459,6 +1463,9 @@ static int __init pci_sysfs_init(void)
> >               }
> >       }
> >
> > +     while ((pbus = pci_find_next_bus(pbus)))
> > +             pci_create_legacy_files(pbus);
> > +
> >       return 0;
> >  }
> >  late_initcall(pci_sysfs_init);
> > --
> > 2.30.0
> >
> >
> > _______________________________________________
> > linux-arm-kernel mailing list
> > linux-arm-kernel@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] PCI: also set up legacy files only after sysfs init
  2021-02-04 22:24     ` Pali Rohár
@ 2021-02-05  9:59       ` Daniel Vetter
  2021-02-05 10:04         ` Pali Rohár
  0 siblings, 1 reply; 14+ messages in thread
From: Daniel Vetter @ 2021-02-05  9:59 UTC (permalink / raw)
  To: Pali Rohár
  Cc: Bjorn Helgaas, LKML, Stephen Rothwell, linux-samsung-soc,
	Jan Kara, Kees Cook, Greg Kroah-Hartman, Linux PCI, Linux MM,
	Jason Gunthorpe, Jérôme Glisse, John Hubbard,
	Bjorn Helgaas, Daniel Vetter, Dan Williams, Andrew Morton,
	Linux ARM, open list:DMA BUFFER SHARING FRAMEWORK,
	Oliver O'Halloran, Krzysztof Wilczyński

On Thu, Feb 4, 2021 at 11:24 PM Pali Rohár <pali@kernel.org> wrote:
>
> On Thursday 04 February 2021 15:50:19 Bjorn Helgaas wrote:
> > [+cc Oliver, Pali, Krzysztof]
>
> Just to note that extending or using sysfs_initialized introduces
> another race condition into kernel code which results in PCI fatal
> errors. Details are in email discussion which Bjorn already sent.

Yeah I wondered why this doesn't race, but since the history goes back
to pre-git times I figured it would have been addressed somehow
already if it indeed does race.
-Daniel

> > s/also/Also/ in subject
> >
> > On Thu, Feb 04, 2021 at 05:58:30PM +0100, Daniel Vetter wrote:
> > > We are already doing this for all the regular sysfs files on PCI
> > > devices, but not yet on the legacy io files on the PCI buses. Thus far
> > > now problem, but in the next patch I want to wire up iomem revoke
> > > support. That needs the vfs up an running already to make so that
> > > iomem_get_mapping() works.
> >
> > s/now problem/no problem/
> > s/an running/and running/
> > s/so that/sure that/ ?
> >
> > iomem_get_mapping() doesn't exist; I don't know what that should be.
> >
> > > Wire it up exactly like the existing code. Note that
> > > pci_remove_legacy_files() doesn't need a check since the one for
> > > pci_bus->legacy_io is sufficient.
> >
> > I'm not sure exactly what you mean by "the existing code."  I could
> > probably figure it out, but it would save time to mention the existing
> > function here.
> >
> > This looks like another instance where we should really apply Oliver's
> > idea of converting these to attribute_groups [1].
> >
> > The cover letter mentions options discussed with Greg in [2], but I
> > don't think the "sysfs_initialized" hack vs attribute_groups was part
> > of that discussion.
> >
> > It's not absolutely a show-stopper, but it *is* a shame to extend the
> > sysfs_initialized hack if attribute_groups could do this more cleanly
> > and help solve more than one issue.
> >
> > Bjorn
> >
> > [1] https://lore.kernel.org/r/CAOSf1CHss03DBSDO4PmTtMp0tCEu5kScn704ZEwLKGXQzBfqaA@mail.gmail.com
> > [2] https://lore.kernel.org/dri-devel/CAKMK7uGrdDrbtj0OyzqQc0CGrQwc2F3tFJU9vLfm2jjufAZ5YQ@mail.gmail.com/
> >
> > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > > Cc: Stephen Rothwell <sfr@canb.auug.org.au>
> > > Cc: Jason Gunthorpe <jgg@ziepe.ca>
> > > Cc: Kees Cook <keescook@chromium.org>
> > > Cc: Dan Williams <dan.j.williams@intel.com>
> > > Cc: Andrew Morton <akpm@linux-foundation.org>
> > > Cc: John Hubbard <jhubbard@nvidia.com>
> > > Cc: Jérôme Glisse <jglisse@redhat.com>
> > > Cc: Jan Kara <jack@suse.cz>
> > > Cc: Dan Williams <dan.j.williams@intel.com>
> > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > Cc: linux-mm@kvack.org
> > > Cc: linux-arm-kernel@lists.infradead.org
> > > Cc: linux-samsung-soc@vger.kernel.org
> > > Cc: linux-media@vger.kernel.org
> > > Cc: Bjorn Helgaas <bhelgaas@google.com>
> > > Cc: linux-pci@vger.kernel.org
> > > ---
> > >  drivers/pci/pci-sysfs.c | 7 +++++++
> > >  1 file changed, 7 insertions(+)
> > >
> > > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> > > index fb072f4b3176..0c45b4f7b214 100644
> > > --- a/drivers/pci/pci-sysfs.c
> > > +++ b/drivers/pci/pci-sysfs.c
> > > @@ -927,6 +927,9 @@ void pci_create_legacy_files(struct pci_bus *b)
> > >  {
> > >     int error;
> > >
> > > +   if (!sysfs_initialized)
> > > +           return;
> > > +
> > >     b->legacy_io = kcalloc(2, sizeof(struct bin_attribute),
> > >                            GFP_ATOMIC);
> > >     if (!b->legacy_io)
> > > @@ -1448,6 +1451,7 @@ void pci_remove_sysfs_dev_files(struct pci_dev *pdev)
> > >  static int __init pci_sysfs_init(void)
> > >  {
> > >     struct pci_dev *pdev = NULL;
> > > +   struct pci_bus *pbus = NULL;
> > >     int retval;
> > >
> > >     sysfs_initialized = 1;
> > > @@ -1459,6 +1463,9 @@ static int __init pci_sysfs_init(void)
> > >             }
> > >     }
> > >
> > > +   while ((pbus = pci_find_next_bus(pbus)))
> > > +           pci_create_legacy_files(pbus);
> > > +
> > >     return 0;
> > >  }
> > >  late_initcall(pci_sysfs_init);
> > > --
> > > 2.30.0
> > >
> > >
> > > _______________________________________________
> > > linux-arm-kernel mailing list
> > > linux-arm-kernel@lists.infradead.org
> > > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] PCI: also set up legacy files only after sysfs init
  2021-02-05  9:59       ` Daniel Vetter
@ 2021-02-05 10:04         ` Pali Rohár
  2021-02-05 10:16           ` Daniel Vetter
  0 siblings, 1 reply; 14+ messages in thread
From: Pali Rohár @ 2021-02-05 10:04 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Bjorn Helgaas, LKML, Stephen Rothwell, linux-samsung-soc,
	Jan Kara, Kees Cook, Greg Kroah-Hartman, Linux PCI, Linux MM,
	Jason Gunthorpe, Jérôme Glisse, John Hubbard,
	Bjorn Helgaas, Daniel Vetter, Dan Williams, Andrew Morton,
	Linux ARM, open list:DMA BUFFER SHARING FRAMEWORK,
	Oliver O'Halloran, Krzysztof Wilczyński

On Friday 05 February 2021 10:59:50 Daniel Vetter wrote:
> On Thu, Feb 4, 2021 at 11:24 PM Pali Rohár <pali@kernel.org> wrote:
> >
> > On Thursday 04 February 2021 15:50:19 Bjorn Helgaas wrote:
> > > [+cc Oliver, Pali, Krzysztof]
> >
> > Just to note that extending or using sysfs_initialized introduces
> > another race condition into kernel code which results in PCI fatal
> > errors. Details are in email discussion which Bjorn already sent.
> 
> Yeah I wondered why this doesn't race.

It races, but with smaller probability. I have not seen this race
condition on x86. But I was able to reproduce it with native PCIe
drivers on ARM64 (Marvell Armada 3720; pci-aardvark). In mentioned
discussion I wrote when this race condition happen. But I understand
that it is hard to simulate it.

> but since the history goes back
> to pre-git times I figured it would have been addressed somehow
> already if it indeed does race.
> -Daniel
> 
> > > s/also/Also/ in subject
> > >
> > > On Thu, Feb 04, 2021 at 05:58:30PM +0100, Daniel Vetter wrote:
> > > > We are already doing this for all the regular sysfs files on PCI
> > > > devices, but not yet on the legacy io files on the PCI buses. Thus far
> > > > now problem, but in the next patch I want to wire up iomem revoke
> > > > support. That needs the vfs up an running already to make so that
> > > > iomem_get_mapping() works.
> > >
> > > s/now problem/no problem/
> > > s/an running/and running/
> > > s/so that/sure that/ ?
> > >
> > > iomem_get_mapping() doesn't exist; I don't know what that should be.
> > >
> > > > Wire it up exactly like the existing code. Note that
> > > > pci_remove_legacy_files() doesn't need a check since the one for
> > > > pci_bus->legacy_io is sufficient.
> > >
> > > I'm not sure exactly what you mean by "the existing code."  I could
> > > probably figure it out, but it would save time to mention the existing
> > > function here.
> > >
> > > This looks like another instance where we should really apply Oliver's
> > > idea of converting these to attribute_groups [1].
> > >
> > > The cover letter mentions options discussed with Greg in [2], but I
> > > don't think the "sysfs_initialized" hack vs attribute_groups was part
> > > of that discussion.
> > >
> > > It's not absolutely a show-stopper, but it *is* a shame to extend the
> > > sysfs_initialized hack if attribute_groups could do this more cleanly
> > > and help solve more than one issue.
> > >
> > > Bjorn
> > >
> > > [1] https://lore.kernel.org/r/CAOSf1CHss03DBSDO4PmTtMp0tCEu5kScn704ZEwLKGXQzBfqaA@mail.gmail.com
> > > [2] https://lore.kernel.org/dri-devel/CAKMK7uGrdDrbtj0OyzqQc0CGrQwc2F3tFJU9vLfm2jjufAZ5YQ@mail.gmail.com/
> > >
> > > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > > > Cc: Stephen Rothwell <sfr@canb.auug.org.au>
> > > > Cc: Jason Gunthorpe <jgg@ziepe.ca>
> > > > Cc: Kees Cook <keescook@chromium.org>
> > > > Cc: Dan Williams <dan.j.williams@intel.com>
> > > > Cc: Andrew Morton <akpm@linux-foundation.org>
> > > > Cc: John Hubbard <jhubbard@nvidia.com>
> > > > Cc: Jérôme Glisse <jglisse@redhat.com>
> > > > Cc: Jan Kara <jack@suse.cz>
> > > > Cc: Dan Williams <dan.j.williams@intel.com>
> > > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > > Cc: linux-mm@kvack.org
> > > > Cc: linux-arm-kernel@lists.infradead.org
> > > > Cc: linux-samsung-soc@vger.kernel.org
> > > > Cc: linux-media@vger.kernel.org
> > > > Cc: Bjorn Helgaas <bhelgaas@google.com>
> > > > Cc: linux-pci@vger.kernel.org
> > > > ---
> > > >  drivers/pci/pci-sysfs.c | 7 +++++++
> > > >  1 file changed, 7 insertions(+)
> > > >
> > > > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> > > > index fb072f4b3176..0c45b4f7b214 100644
> > > > --- a/drivers/pci/pci-sysfs.c
> > > > +++ b/drivers/pci/pci-sysfs.c
> > > > @@ -927,6 +927,9 @@ void pci_create_legacy_files(struct pci_bus *b)
> > > >  {
> > > >     int error;
> > > >
> > > > +   if (!sysfs_initialized)
> > > > +           return;
> > > > +
> > > >     b->legacy_io = kcalloc(2, sizeof(struct bin_attribute),
> > > >                            GFP_ATOMIC);
> > > >     if (!b->legacy_io)
> > > > @@ -1448,6 +1451,7 @@ void pci_remove_sysfs_dev_files(struct pci_dev *pdev)
> > > >  static int __init pci_sysfs_init(void)
> > > >  {
> > > >     struct pci_dev *pdev = NULL;
> > > > +   struct pci_bus *pbus = NULL;
> > > >     int retval;
> > > >
> > > >     sysfs_initialized = 1;
> > > > @@ -1459,6 +1463,9 @@ static int __init pci_sysfs_init(void)
> > > >             }
> > > >     }
> > > >
> > > > +   while ((pbus = pci_find_next_bus(pbus)))
> > > > +           pci_create_legacy_files(pbus);
> > > > +
> > > >     return 0;
> > > >  }
> > > >  late_initcall(pci_sysfs_init);
> > > > --
> > > > 2.30.0
> > > >
> > > >
> > > > _______________________________________________
> > > > linux-arm-kernel mailing list
> > > > linux-arm-kernel@lists.infradead.org
> > > > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 
> 
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] PCI: also set up legacy files only after sysfs init
  2021-02-05 10:04         ` Pali Rohár
@ 2021-02-05 10:16           ` Daniel Vetter
  2021-02-05 10:21             ` Pali Rohár
  0 siblings, 1 reply; 14+ messages in thread
From: Daniel Vetter @ 2021-02-05 10:16 UTC (permalink / raw)
  To: Pali Rohár
  Cc: Bjorn Helgaas, LKML, Stephen Rothwell, linux-samsung-soc,
	Jan Kara, Kees Cook, Greg Kroah-Hartman, Linux PCI, Linux MM,
	Jason Gunthorpe, Jérôme Glisse, John Hubbard,
	Bjorn Helgaas, Daniel Vetter, Dan Williams, Andrew Morton,
	Linux ARM, open list:DMA BUFFER SHARING FRAMEWORK,
	Oliver O'Halloran, Krzysztof Wilczyński

On Fri, Feb 5, 2021 at 11:04 AM Pali Rohár <pali@kernel.org> wrote:
>
> On Friday 05 February 2021 10:59:50 Daniel Vetter wrote:
> > On Thu, Feb 4, 2021 at 11:24 PM Pali Rohár <pali@kernel.org> wrote:
> > >
> > > On Thursday 04 February 2021 15:50:19 Bjorn Helgaas wrote:
> > > > [+cc Oliver, Pali, Krzysztof]
> > >
> > > Just to note that extending or using sysfs_initialized introduces
> > > another race condition into kernel code which results in PCI fatal
> > > errors. Details are in email discussion which Bjorn already sent.
> >
> > Yeah I wondered why this doesn't race.
>
> It races, but with smaller probability. I have not seen this race
> condition on x86. But I was able to reproduce it with native PCIe
> drivers on ARM64 (Marvell Armada 3720; pci-aardvark). In mentioned
> discussion I wrote when this race condition happen. But I understand
> that it is hard to simulate it.

btw I looked at your patch, and isn't that just reducing the race window?

I think we have a very similar problem in drm, where the
drm_dev_register() for the overall device (which also registers all
drm_connector) can race with the hotplug of an individual connector in
drm_connector_register() which is hotplugged at runtime.

I went with a per-connector registered boolean + a lock to make sure
that really only one of the two call paths can end up registering the
connector. Part of registering connectors is setting up sysfs files,
so I think it's exactly the same problem as here.

Cheers, Daniel

>
> > but since the history goes back
> > to pre-git times I figured it would have been addressed somehow
> > already if it indeed does race.
> > -Daniel
> >
> > > > s/also/Also/ in subject
> > > >
> > > > On Thu, Feb 04, 2021 at 05:58:30PM +0100, Daniel Vetter wrote:
> > > > > We are already doing this for all the regular sysfs files on PCI
> > > > > devices, but not yet on the legacy io files on the PCI buses. Thus far
> > > > > now problem, but in the next patch I want to wire up iomem revoke
> > > > > support. That needs the vfs up an running already to make so that
> > > > > iomem_get_mapping() works.
> > > >
> > > > s/now problem/no problem/
> > > > s/an running/and running/
> > > > s/so that/sure that/ ?
> > > >
> > > > iomem_get_mapping() doesn't exist; I don't know what that should be.
> > > >
> > > > > Wire it up exactly like the existing code. Note that
> > > > > pci_remove_legacy_files() doesn't need a check since the one for
> > > > > pci_bus->legacy_io is sufficient.
> > > >
> > > > I'm not sure exactly what you mean by "the existing code."  I could
> > > > probably figure it out, but it would save time to mention the existing
> > > > function here.
> > > >
> > > > This looks like another instance where we should really apply Oliver's
> > > > idea of converting these to attribute_groups [1].
> > > >
> > > > The cover letter mentions options discussed with Greg in [2], but I
> > > > don't think the "sysfs_initialized" hack vs attribute_groups was part
> > > > of that discussion.
> > > >
> > > > It's not absolutely a show-stopper, but it *is* a shame to extend the
> > > > sysfs_initialized hack if attribute_groups could do this more cleanly
> > > > and help solve more than one issue.
> > > >
> > > > Bjorn
> > > >
> > > > [1] https://lore.kernel.org/r/CAOSf1CHss03DBSDO4PmTtMp0tCEu5kScn704ZEwLKGXQzBfqaA@mail.gmail.com
> > > > [2] https://lore.kernel.org/dri-devel/CAKMK7uGrdDrbtj0OyzqQc0CGrQwc2F3tFJU9vLfm2jjufAZ5YQ@mail.gmail.com/
> > > >
> > > > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > > > > Cc: Stephen Rothwell <sfr@canb.auug.org.au>
> > > > > Cc: Jason Gunthorpe <jgg@ziepe.ca>
> > > > > Cc: Kees Cook <keescook@chromium.org>
> > > > > Cc: Dan Williams <dan.j.williams@intel.com>
> > > > > Cc: Andrew Morton <akpm@linux-foundation.org>
> > > > > Cc: John Hubbard <jhubbard@nvidia.com>
> > > > > Cc: Jérôme Glisse <jglisse@redhat.com>
> > > > > Cc: Jan Kara <jack@suse.cz>
> > > > > Cc: Dan Williams <dan.j.williams@intel.com>
> > > > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > > > Cc: linux-mm@kvack.org
> > > > > Cc: linux-arm-kernel@lists.infradead.org
> > > > > Cc: linux-samsung-soc@vger.kernel.org
> > > > > Cc: linux-media@vger.kernel.org
> > > > > Cc: Bjorn Helgaas <bhelgaas@google.com>
> > > > > Cc: linux-pci@vger.kernel.org
> > > > > ---
> > > > >  drivers/pci/pci-sysfs.c | 7 +++++++
> > > > >  1 file changed, 7 insertions(+)
> > > > >
> > > > > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> > > > > index fb072f4b3176..0c45b4f7b214 100644
> > > > > --- a/drivers/pci/pci-sysfs.c
> > > > > +++ b/drivers/pci/pci-sysfs.c
> > > > > @@ -927,6 +927,9 @@ void pci_create_legacy_files(struct pci_bus *b)
> > > > >  {
> > > > >     int error;
> > > > >
> > > > > +   if (!sysfs_initialized)
> > > > > +           return;
> > > > > +
> > > > >     b->legacy_io = kcalloc(2, sizeof(struct bin_attribute),
> > > > >                            GFP_ATOMIC);
> > > > >     if (!b->legacy_io)
> > > > > @@ -1448,6 +1451,7 @@ void pci_remove_sysfs_dev_files(struct pci_dev *pdev)
> > > > >  static int __init pci_sysfs_init(void)
> > > > >  {
> > > > >     struct pci_dev *pdev = NULL;
> > > > > +   struct pci_bus *pbus = NULL;
> > > > >     int retval;
> > > > >
> > > > >     sysfs_initialized = 1;
> > > > > @@ -1459,6 +1463,9 @@ static int __init pci_sysfs_init(void)
> > > > >             }
> > > > >     }
> > > > >
> > > > > +   while ((pbus = pci_find_next_bus(pbus)))
> > > > > +           pci_create_legacy_files(pbus);
> > > > > +
> > > > >     return 0;
> > > > >  }
> > > > >  late_initcall(pci_sysfs_init);
> > > > > --
> > > > > 2.30.0
> > > > >
> > > > >
> > > > > _______________________________________________
> > > > > linux-arm-kernel mailing list
> > > > > linux-arm-kernel@lists.infradead.org
> > > > > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> >
> >
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] PCI: also set up legacy files only after sysfs init
  2021-02-05 10:16           ` Daniel Vetter
@ 2021-02-05 10:21             ` Pali Rohár
  0 siblings, 0 replies; 14+ messages in thread
From: Pali Rohár @ 2021-02-05 10:21 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Bjorn Helgaas, LKML, Stephen Rothwell, linux-samsung-soc,
	Jan Kara, Kees Cook, Greg Kroah-Hartman, Linux PCI, Linux MM,
	Jason Gunthorpe, Jérôme Glisse, John Hubbard,
	Bjorn Helgaas, Daniel Vetter, Dan Williams, Andrew Morton,
	Linux ARM, open list:DMA BUFFER SHARING FRAMEWORK,
	Oliver O'Halloran, Krzysztof Wilczyński

On Friday 05 February 2021 11:16:00 Daniel Vetter wrote:
> On Fri, Feb 5, 2021 at 11:04 AM Pali Rohár <pali@kernel.org> wrote:
> >
> > On Friday 05 February 2021 10:59:50 Daniel Vetter wrote:
> > > On Thu, Feb 4, 2021 at 11:24 PM Pali Rohár <pali@kernel.org> wrote:
> > > >
> > > > On Thursday 04 February 2021 15:50:19 Bjorn Helgaas wrote:
> > > > > [+cc Oliver, Pali, Krzysztof]
> > > >
> > > > Just to note that extending or using sysfs_initialized introduces
> > > > another race condition into kernel code which results in PCI fatal
> > > > errors. Details are in email discussion which Bjorn already sent.
> > >
> > > Yeah I wondered why this doesn't race.
> >
> > It races, but with smaller probability. I have not seen this race
> > condition on x86. But I was able to reproduce it with native PCIe
> > drivers on ARM64 (Marvell Armada 3720; pci-aardvark). In mentioned
> > discussion I wrote when this race condition happen. But I understand
> > that it is hard to simulate it.
> 
> btw I looked at your patch, and isn't that just reducing the race window?

I probably have not wrote reply to that thread and only to Krzysztof on
IRC, but my "hack" really does not solve that race condition. And as you
wrote it only reduced occurrence on tested HW.

Krzysztof wrote that would look at this issue and try to solve it
properly. So I have not doing more investigation on that my "hack"
patch, race conditions are hard to catch and solve...

> I think we have a very similar problem in drm, where the
> drm_dev_register() for the overall device (which also registers all
> drm_connector) can race with the hotplug of an individual connector in
> drm_connector_register() which is hotplugged at runtime.
> 
> I went with a per-connector registered boolean + a lock to make sure
> that really only one of the two call paths can end up registering the
> connector. Part of registering connectors is setting up sysfs files,
> so I think it's exactly the same problem as here.
> 
> Cheers, Daniel
> 
> >
> > > but since the history goes back
> > > to pre-git times I figured it would have been addressed somehow
> > > already if it indeed does race.
> > > -Daniel
> > >
> > > > > s/also/Also/ in subject
> > > > >
> > > > > On Thu, Feb 04, 2021 at 05:58:30PM +0100, Daniel Vetter wrote:
> > > > > > We are already doing this for all the regular sysfs files on PCI
> > > > > > devices, but not yet on the legacy io files on the PCI buses. Thus far
> > > > > > now problem, but in the next patch I want to wire up iomem revoke
> > > > > > support. That needs the vfs up an running already to make so that
> > > > > > iomem_get_mapping() works.
> > > > >
> > > > > s/now problem/no problem/
> > > > > s/an running/and running/
> > > > > s/so that/sure that/ ?
> > > > >
> > > > > iomem_get_mapping() doesn't exist; I don't know what that should be.
> > > > >
> > > > > > Wire it up exactly like the existing code. Note that
> > > > > > pci_remove_legacy_files() doesn't need a check since the one for
> > > > > > pci_bus->legacy_io is sufficient.
> > > > >
> > > > > I'm not sure exactly what you mean by "the existing code."  I could
> > > > > probably figure it out, but it would save time to mention the existing
> > > > > function here.
> > > > >
> > > > > This looks like another instance where we should really apply Oliver's
> > > > > idea of converting these to attribute_groups [1].
> > > > >
> > > > > The cover letter mentions options discussed with Greg in [2], but I
> > > > > don't think the "sysfs_initialized" hack vs attribute_groups was part
> > > > > of that discussion.
> > > > >
> > > > > It's not absolutely a show-stopper, but it *is* a shame to extend the
> > > > > sysfs_initialized hack if attribute_groups could do this more cleanly
> > > > > and help solve more than one issue.
> > > > >
> > > > > Bjorn
> > > > >
> > > > > [1] https://lore.kernel.org/r/CAOSf1CHss03DBSDO4PmTtMp0tCEu5kScn704ZEwLKGXQzBfqaA@mail.gmail.com
> > > > > [2] https://lore.kernel.org/dri-devel/CAKMK7uGrdDrbtj0OyzqQc0CGrQwc2F3tFJU9vLfm2jjufAZ5YQ@mail.gmail.com/
> > > > >
> > > > > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > > > > > Cc: Stephen Rothwell <sfr@canb.auug.org.au>
> > > > > > Cc: Jason Gunthorpe <jgg@ziepe.ca>
> > > > > > Cc: Kees Cook <keescook@chromium.org>
> > > > > > Cc: Dan Williams <dan.j.williams@intel.com>
> > > > > > Cc: Andrew Morton <akpm@linux-foundation.org>
> > > > > > Cc: John Hubbard <jhubbard@nvidia.com>
> > > > > > Cc: Jérôme Glisse <jglisse@redhat.com>
> > > > > > Cc: Jan Kara <jack@suse.cz>
> > > > > > Cc: Dan Williams <dan.j.williams@intel.com>
> > > > > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > > > > Cc: linux-mm@kvack.org
> > > > > > Cc: linux-arm-kernel@lists.infradead.org
> > > > > > Cc: linux-samsung-soc@vger.kernel.org
> > > > > > Cc: linux-media@vger.kernel.org
> > > > > > Cc: Bjorn Helgaas <bhelgaas@google.com>
> > > > > > Cc: linux-pci@vger.kernel.org
> > > > > > ---
> > > > > >  drivers/pci/pci-sysfs.c | 7 +++++++
> > > > > >  1 file changed, 7 insertions(+)
> > > > > >
> > > > > > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> > > > > > index fb072f4b3176..0c45b4f7b214 100644
> > > > > > --- a/drivers/pci/pci-sysfs.c
> > > > > > +++ b/drivers/pci/pci-sysfs.c
> > > > > > @@ -927,6 +927,9 @@ void pci_create_legacy_files(struct pci_bus *b)
> > > > > >  {
> > > > > >     int error;
> > > > > >
> > > > > > +   if (!sysfs_initialized)
> > > > > > +           return;
> > > > > > +
> > > > > >     b->legacy_io = kcalloc(2, sizeof(struct bin_attribute),
> > > > > >                            GFP_ATOMIC);
> > > > > >     if (!b->legacy_io)
> > > > > > @@ -1448,6 +1451,7 @@ void pci_remove_sysfs_dev_files(struct pci_dev *pdev)
> > > > > >  static int __init pci_sysfs_init(void)
> > > > > >  {
> > > > > >     struct pci_dev *pdev = NULL;
> > > > > > +   struct pci_bus *pbus = NULL;
> > > > > >     int retval;
> > > > > >
> > > > > >     sysfs_initialized = 1;
> > > > > > @@ -1459,6 +1463,9 @@ static int __init pci_sysfs_init(void)
> > > > > >             }
> > > > > >     }
> > > > > >
> > > > > > +   while ((pbus = pci_find_next_bus(pbus)))
> > > > > > +           pci_create_legacy_files(pbus);
> > > > > > +
> > > > > >     return 0;
> > > > > >  }
> > > > > >  late_initcall(pci_sysfs_init);
> > > > > > --
> > > > > > 2.30.0
> > > > > >
> > > > > >
> > > > > > _______________________________________________
> > > > > > linux-arm-kernel mailing list
> > > > > > linux-arm-kernel@lists.infradead.org
> > > > > > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> > >
> > >
> > >
> > > --
> > > Daniel Vetter
> > > Software Engineer, Intel Corporation
> > > http://blog.ffwll.ch
> 
> 
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/2] pci sysfs file iomem revoke support
  2021-02-04 16:58 [PATCH 0/2] pci sysfs file iomem revoke support Daniel Vetter
  2021-02-04 16:58 ` [PATCH 1/2] PCI: also set up legacy files only after sysfs init Daniel Vetter
  2021-02-04 16:58 ` [PATCH 2/2] PCI: Revoke mappings like devmem Daniel Vetter
@ 2021-02-07 19:43 ` Stephen Rothwell
  2 siblings, 0 replies; 14+ messages in thread
From: Stephen Rothwell @ 2021-02-07 19:43 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: LKML, DRI Development, Jason Gunthorpe, Kees Cook, Dan Williams,
	Andrew Morton, John Hubbard, Jérôme Glisse, Jan Kara,
	Greg Kroah-Hartman, linux-mm, linux-arm-kernel,
	linux-samsung-soc, linux-media, Bjorn Helgaas, linux-pci

[-- Attachment #1: Type: text/plain, Size: 726 bytes --]

Hi Daniel,

On Thu,  4 Feb 2021 17:58:29 +0100 Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> Hi all,
> 
> This is a revised version of patch 12 from my series to lock down some
> follow_pfn vs VM_SPECIAL races:
> 
> https://lore.kernel.org/dri-devel/CAKwvOdnSrsnTgPEuQJyaOTSkTP2dR9208Y66HQG_h1e2LKfqtw@mail.gmail.com/
> 
> Stephen reported an issue on HAVE_PCI_LEGACY platforms which this patch
> set tries to address. Previous patches are all still in linux-next.
> 
> Stephen, would be awesome if you can give this a spin.

OK, I applied the 2 patches on top of next-20210205 and it no longer
panics for my simple boot test (PowerPC pseries_le_defconfig under
qemu).

-- 
Cheers,
Stephen Rothwell

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] PCI: Revoke mappings like devmem
  2021-02-04 16:58 ` [PATCH 2/2] PCI: Revoke mappings like devmem Daniel Vetter
@ 2021-02-10 22:20   ` Bjorn Helgaas
  2021-03-13 21:57   ` Bjorn Helgaas
  1 sibling, 0 replies; 14+ messages in thread
From: Bjorn Helgaas @ 2021-02-10 22:20 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: LKML, Stephen Rothwell, linux-samsung-soc, Jan Kara, Kees Cook,
	Greg Kroah-Hartman, linux-pci, DRI Development, linux-mm,
	Jason Gunthorpe, Jérôme Glisse, John Hubbard,
	Bjorn Helgaas, Daniel Vetter, Dan Williams, Andrew Morton,
	linux-arm-kernel, linux-media

I see I already acked this, but if you haven't merged it yet there are
a few typos in the commit log:

On Thu, Feb 04, 2021 at 05:58:31PM +0100, Daniel Vetter wrote:
> Since 3234ac664a87 ("/dev/mem: Revoke mappings when a driver claims
> the region") /dev/kmem zaps ptes when the kernel requests exclusive
> acccess to an iomem region. And with CONFIG_IO_STRICT_DEVMEM, this is
> the default for all driver uses.

s/ptes/PTEs/

> Except there's two more ways to access PCI BARs: sysfs and proc mmap
> support. Let's plug that hole.

s/there's two/there are two/

> For revoke_devmem() to work we need to link our vma into the same
> address_space, with consistent vma->vm_pgoff. ->pgoff is already
> adjusted, because that's how (io_)remap_pfn_range works, but for the
> mapping we need to adjust vma->vm_file->f_mapping. The cleanest way is
> to adjust this at at ->open time:
> 
> - for sysfs this is easy, now that binary attributes support this. We
>   just set bin_attr->mapping when mmap is supported
> - for procfs it's a bit more tricky, since procfs pci access has only
>   one file per device, and access to a specific resources first needs
>   to be set up with some ioctl calls. But mmap is only supported for
>   the same resources as sysfs exposes with mmap support, and otherwise
>   rejected, so we can set the mapping unconditionally at open time
>   without harm.

s/pci access/PCI access/
s/a specific resources/a specific resource/

> A special consideration is for arch_can_pci_mmap_io() - we need to
> make sure that the ->f_mapping doesn't alias between ioport and iomem
> space. There's only 2 ways in-tree to support mmap of ioports: generic
> pci mmap (ARCH_GENERIC_PCI_MMAP_RESOURCE), and sparc as the single
> architecture hand-rolling. Both approach support ioport mmap through a
> special pfn range and not through magic pte attributes. Aliasing is
> therefore not a problem.

s/There's only 2/There are only two/
s/pci mmap/PCI mmap/
s/Both approach/Both approaches/
s/pfn/PFN/
s/pte/PTE/

> The only difference in access checks left is that sysfs PCI mmap does
> not check for CAP_RAWIO. I'm not really sure whether that should be
> added or not.
> 
> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Stephen Rothwell <sfr@canb.auug.org.au>
> Cc: Jason Gunthorpe <jgg@ziepe.ca>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: John Hubbard <jhubbard@nvidia.com>
> Cc: Jérôme Glisse <jglisse@redhat.com>
> Cc: Jan Kara <jack@suse.cz>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: linux-mm@kvack.org
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-samsung-soc@vger.kernel.org
> Cc: linux-media@vger.kernel.org
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: linux-pci@vger.kernel.org
> ---
>  drivers/pci/pci-sysfs.c | 4 ++++
>  drivers/pci/proc.c      | 1 +
>  2 files changed, 5 insertions(+)
> 
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index 0c45b4f7b214..f8afd54ca3e1 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -942,6 +942,7 @@ void pci_create_legacy_files(struct pci_bus *b)
>  	b->legacy_io->read = pci_read_legacy_io;
>  	b->legacy_io->write = pci_write_legacy_io;
>  	b->legacy_io->mmap = pci_mmap_legacy_io;
> +	b->legacy_io->mapping = iomem_get_mapping();
>  	pci_adjust_legacy_attr(b, pci_mmap_io);
>  	error = device_create_bin_file(&b->dev, b->legacy_io);
>  	if (error)
> @@ -954,6 +955,7 @@ void pci_create_legacy_files(struct pci_bus *b)
>  	b->legacy_mem->size = 1024*1024;
>  	b->legacy_mem->attr.mode = 0600;
>  	b->legacy_mem->mmap = pci_mmap_legacy_mem;
> +	b->legacy_io->mapping = iomem_get_mapping();
>  	pci_adjust_legacy_attr(b, pci_mmap_mem);
>  	error = device_create_bin_file(&b->dev, b->legacy_mem);
>  	if (error)
> @@ -1169,6 +1171,8 @@ static int pci_create_attr(struct pci_dev *pdev, int num, int write_combine)
>  			res_attr->mmap = pci_mmap_resource_uc;
>  		}
>  	}
> +	if (res_attr->mmap)
> +		res_attr->mapping = iomem_get_mapping();
>  	res_attr->attr.name = res_attr_name;
>  	res_attr->attr.mode = 0600;
>  	res_attr->size = pci_resource_len(pdev, num);
> diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
> index 3a2f90beb4cb..9bab07302bbf 100644
> --- a/drivers/pci/proc.c
> +++ b/drivers/pci/proc.c
> @@ -298,6 +298,7 @@ static int proc_bus_pci_open(struct inode *inode, struct file *file)
>  	fpriv->write_combine = 0;
>  
>  	file->private_data = fpriv;
> +	file->f_mapping = iomem_get_mapping();
>  
>  	return 0;
>  }
> -- 
> 2.30.0
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] PCI: Revoke mappings like devmem
  2021-02-04 16:58 ` [PATCH 2/2] PCI: Revoke mappings like devmem Daniel Vetter
  2021-02-10 22:20   ` Bjorn Helgaas
@ 2021-03-13 21:57   ` Bjorn Helgaas
  2021-03-13 22:36     ` Daniel Vetter
  1 sibling, 1 reply; 14+ messages in thread
From: Bjorn Helgaas @ 2021-03-13 21:57 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: LKML, Stephen Rothwell, linux-samsung-soc, Jan Kara, Kees Cook,
	Greg Kroah-Hartman, linux-pci, DRI Development, linux-mm,
	Jason Gunthorpe, Jérôme Glisse, John Hubbard,
	Bjorn Helgaas, Daniel Vetter, Dan Williams, Andrew Morton,
	linux-arm-kernel, linux-media, Krzysztof Wilczyński,
	Pali Rohár, Oliver O'Halloran

[+cc Krzysztof, Pali, Oliver]

On Thu, Feb 04, 2021 at 05:58:31PM +0100, Daniel Vetter wrote:
> Since 3234ac664a87 ("/dev/mem: Revoke mappings when a driver claims
> the region") /dev/kmem zaps ptes when the kernel requests exclusive
> acccess to an iomem region. And with CONFIG_IO_STRICT_DEVMEM, this is
> the default for all driver uses.
> 
> Except there's two more ways to access PCI BARs: sysfs and proc mmap
> support. Let's plug that hole.

IIUC, the idea is that if a driver calls request_mem_region() on a PCI
BAR, we prevent access to the BAR via sysfs.  I guess I'm OK with that
if it's a real security improvement or something.

But the downside of this implementation is that it depends on
iomem_get_mapping(), which doesn't work until after fs_initcalls,
which means the sysfs files cannot be static attributes of devices
added before that.  PCI devices are typically enumerated in
subsys_initcall.

Krzysztof is converting PCI sysfs files (config, rom, reset, vpd, etc)
to static attributes.  This is a major improvement that could get rid
of pci_create_sysfs_dev_files(), the late_initcall pci_sysfs_init(),
and the "sysfs_initialized" hack.  This would fix a race reported by
Pali [1] (thanks to Oliver for the idea [2]).

EXCEPT that this revoke change means the "resource%d", "legacy_io",
and "legacy_mem" files cannot be static attributes because of
iomem_get_mapping().

Any ideas on how to deal with this?  Having to keep the
pci_sysfs_init() initcall just for these few files seems like the tail
wagging the dog.

[1] https://lore.kernel.org/r/20200716110423.xtfyb3n6tn5ixedh@pali
[2] https://lore.kernel.org/r/CAOSf1CHss03DBSDO4PmTtMp0tCEu5kScn704ZEwLKGXQzBfqaA@mail.gmail.com

> For revoke_devmem() to work we need to link our vma into the same
> address_space, with consistent vma->vm_pgoff. ->pgoff is already
> adjusted, because that's how (io_)remap_pfn_range works, but for the
> mapping we need to adjust vma->vm_file->f_mapping. The cleanest way is
> to adjust this at at ->open time:
> 
> - for sysfs this is easy, now that binary attributes support this. We
>   just set bin_attr->mapping when mmap is supported
> - for procfs it's a bit more tricky, since procfs pci access has only
>   one file per device, and access to a specific resources first needs
>   to be set up with some ioctl calls. But mmap is only supported for
>   the same resources as sysfs exposes with mmap support, and otherwise
>   rejected, so we can set the mapping unconditionally at open time
>   without harm.
> 
> A special consideration is for arch_can_pci_mmap_io() - we need to
> make sure that the ->f_mapping doesn't alias between ioport and iomem
> space. There's only 2 ways in-tree to support mmap of ioports: generic
> pci mmap (ARCH_GENERIC_PCI_MMAP_RESOURCE), and sparc as the single
> architecture hand-rolling. Both approach support ioport mmap through a
> special pfn range and not through magic pte attributes. Aliasing is
> therefore not a problem.
> 
> The only difference in access checks left is that sysfs PCI mmap does
> not check for CAP_RAWIO. I'm not really sure whether that should be
> added or not.
> 
> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Stephen Rothwell <sfr@canb.auug.org.au>
> Cc: Jason Gunthorpe <jgg@ziepe.ca>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: John Hubbard <jhubbard@nvidia.com>
> Cc: Jérôme Glisse <jglisse@redhat.com>
> Cc: Jan Kara <jack@suse.cz>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: linux-mm@kvack.org
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-samsung-soc@vger.kernel.org
> Cc: linux-media@vger.kernel.org
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: linux-pci@vger.kernel.org
> ---
>  drivers/pci/pci-sysfs.c | 4 ++++
>  drivers/pci/proc.c      | 1 +
>  2 files changed, 5 insertions(+)
> 
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index 0c45b4f7b214..f8afd54ca3e1 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -942,6 +942,7 @@ void pci_create_legacy_files(struct pci_bus *b)
>  	b->legacy_io->read = pci_read_legacy_io;
>  	b->legacy_io->write = pci_write_legacy_io;
>  	b->legacy_io->mmap = pci_mmap_legacy_io;
> +	b->legacy_io->mapping = iomem_get_mapping();
>  	pci_adjust_legacy_attr(b, pci_mmap_io);
>  	error = device_create_bin_file(&b->dev, b->legacy_io);
>  	if (error)
> @@ -954,6 +955,7 @@ void pci_create_legacy_files(struct pci_bus *b)
>  	b->legacy_mem->size = 1024*1024;
>  	b->legacy_mem->attr.mode = 0600;
>  	b->legacy_mem->mmap = pci_mmap_legacy_mem;
> +	b->legacy_io->mapping = iomem_get_mapping();
>  	pci_adjust_legacy_attr(b, pci_mmap_mem);
>  	error = device_create_bin_file(&b->dev, b->legacy_mem);
>  	if (error)
> @@ -1169,6 +1171,8 @@ static int pci_create_attr(struct pci_dev *pdev, int num, int write_combine)
>  			res_attr->mmap = pci_mmap_resource_uc;
>  		}
>  	}
> +	if (res_attr->mmap)
> +		res_attr->mapping = iomem_get_mapping();
>  	res_attr->attr.name = res_attr_name;
>  	res_attr->attr.mode = 0600;
>  	res_attr->size = pci_resource_len(pdev, num);
> diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
> index 3a2f90beb4cb..9bab07302bbf 100644
> --- a/drivers/pci/proc.c
> +++ b/drivers/pci/proc.c
> @@ -298,6 +298,7 @@ static int proc_bus_pci_open(struct inode *inode, struct file *file)
>  	fpriv->write_combine = 0;
>  
>  	file->private_data = fpriv;
> +	file->f_mapping = iomem_get_mapping();
>  
>  	return 0;
>  }
> -- 
> 2.30.0
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] PCI: Revoke mappings like devmem
  2021-03-13 21:57   ` Bjorn Helgaas
@ 2021-03-13 22:36     ` Daniel Vetter
  0 siblings, 0 replies; 14+ messages in thread
From: Daniel Vetter @ 2021-03-13 22:36 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: LKML, Stephen Rothwell, linux-samsung-soc, Jan Kara, Kees Cook,
	Greg Kroah-Hartman, Linux PCI, DRI Development, Linux MM,
	Jason Gunthorpe, Jérôme Glisse, John Hubbard,
	Bjorn Helgaas, Daniel Vetter, Dan Williams, Andrew Morton,
	Linux ARM, open list:DMA BUFFER SHARING FRAMEWORK,
	Krzysztof Wilczyński, Pali Rohár,
	Oliver O'Halloran

On Sat, Mar 13, 2021 at 10:57 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> [+cc Krzysztof, Pali, Oliver]
>
> On Thu, Feb 04, 2021 at 05:58:31PM +0100, Daniel Vetter wrote:
> > Since 3234ac664a87 ("/dev/mem: Revoke mappings when a driver claims
> > the region") /dev/kmem zaps ptes when the kernel requests exclusive
> > acccess to an iomem region. And with CONFIG_IO_STRICT_DEVMEM, this is
> > the default for all driver uses.
> >
> > Except there's two more ways to access PCI BARs: sysfs and proc mmap
> > support. Let's plug that hole.
>
> IIUC, the idea is that if a driver calls request_mem_region() on a PCI
> BAR, we prevent access to the BAR via sysfs.  I guess I'm OK with that
> if it's a real security improvement or something.

Yup.

> But the downside of this implementation is that it depends on
> iomem_get_mapping(), which doesn't work until after fs_initcalls,
> which means the sysfs files cannot be static attributes of devices
> added before that.  PCI devices are typically enumerated in
> subsys_initcall.
>
> Krzysztof is converting PCI sysfs files (config, rom, reset, vpd, etc)
> to static attributes.  This is a major improvement that could get rid
> of pci_create_sysfs_dev_files(), the late_initcall pci_sysfs_init(),
> and the "sysfs_initialized" hack.  This would fix a race reported by
> Pali [1] (thanks to Oliver for the idea [2]).
>
> EXCEPT that this revoke change means the "resource%d", "legacy_io",
> and "legacy_mem" files cannot be static attributes because of
> iomem_get_mapping().
>
> Any ideas on how to deal with this?  Having to keep the
> pci_sysfs_init() initcall just for these few files seems like the tail
> wagging the dog.

It's a bit "pick your ugly". Either we have the late init call (not
pretty), or the sysfs side needs a callback to fish out the
address_space for the mmap at open() time, which didn't stir up much
enthusiams with Greg because we need a new callback just for these
mmio files. Either approach works.
-Daniel

> [1] https://lore.kernel.org/r/20200716110423.xtfyb3n6tn5ixedh@pali
> [2] https://lore.kernel.org/r/CAOSf1CHss03DBSDO4PmTtMp0tCEu5kScn704ZEwLKGXQzBfqaA@mail.gmail.com
>
> > For revoke_devmem() to work we need to link our vma into the same
> > address_space, with consistent vma->vm_pgoff. ->pgoff is already
> > adjusted, because that's how (io_)remap_pfn_range works, but for the
> > mapping we need to adjust vma->vm_file->f_mapping. The cleanest way is
> > to adjust this at at ->open time:
> >
> > - for sysfs this is easy, now that binary attributes support this. We
> >   just set bin_attr->mapping when mmap is supported
> > - for procfs it's a bit more tricky, since procfs pci access has only
> >   one file per device, and access to a specific resources first needs
> >   to be set up with some ioctl calls. But mmap is only supported for
> >   the same resources as sysfs exposes with mmap support, and otherwise
> >   rejected, so we can set the mapping unconditionally at open time
> >   without harm.
> >
> > A special consideration is for arch_can_pci_mmap_io() - we need to
> > make sure that the ->f_mapping doesn't alias between ioport and iomem
> > space. There's only 2 ways in-tree to support mmap of ioports: generic
> > pci mmap (ARCH_GENERIC_PCI_MMAP_RESOURCE), and sparc as the single
> > architecture hand-rolling. Both approach support ioport mmap through a
> > special pfn range and not through magic pte attributes. Aliasing is
> > therefore not a problem.
> >
> > The only difference in access checks left is that sysfs PCI mmap does
> > not check for CAP_RAWIO. I'm not really sure whether that should be
> > added or not.
> >
> > Acked-by: Bjorn Helgaas <bhelgaas@google.com>
> > Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Stephen Rothwell <sfr@canb.auug.org.au>
> > Cc: Jason Gunthorpe <jgg@ziepe.ca>
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: John Hubbard <jhubbard@nvidia.com>
> > Cc: Jérôme Glisse <jglisse@redhat.com>
> > Cc: Jan Kara <jack@suse.cz>
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > Cc: linux-mm@kvack.org
> > Cc: linux-arm-kernel@lists.infradead.org
> > Cc: linux-samsung-soc@vger.kernel.org
> > Cc: linux-media@vger.kernel.org
> > Cc: Bjorn Helgaas <bhelgaas@google.com>
> > Cc: linux-pci@vger.kernel.org
> > ---
> >  drivers/pci/pci-sysfs.c | 4 ++++
> >  drivers/pci/proc.c      | 1 +
> >  2 files changed, 5 insertions(+)
> >
> > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> > index 0c45b4f7b214..f8afd54ca3e1 100644
> > --- a/drivers/pci/pci-sysfs.c
> > +++ b/drivers/pci/pci-sysfs.c
> > @@ -942,6 +942,7 @@ void pci_create_legacy_files(struct pci_bus *b)
> >       b->legacy_io->read = pci_read_legacy_io;
> >       b->legacy_io->write = pci_write_legacy_io;
> >       b->legacy_io->mmap = pci_mmap_legacy_io;
> > +     b->legacy_io->mapping = iomem_get_mapping();
> >       pci_adjust_legacy_attr(b, pci_mmap_io);
> >       error = device_create_bin_file(&b->dev, b->legacy_io);
> >       if (error)
> > @@ -954,6 +955,7 @@ void pci_create_legacy_files(struct pci_bus *b)
> >       b->legacy_mem->size = 1024*1024;
> >       b->legacy_mem->attr.mode = 0600;
> >       b->legacy_mem->mmap = pci_mmap_legacy_mem;
> > +     b->legacy_io->mapping = iomem_get_mapping();
> >       pci_adjust_legacy_attr(b, pci_mmap_mem);
> >       error = device_create_bin_file(&b->dev, b->legacy_mem);
> >       if (error)
> > @@ -1169,6 +1171,8 @@ static int pci_create_attr(struct pci_dev *pdev, int num, int write_combine)
> >                       res_attr->mmap = pci_mmap_resource_uc;
> >               }
> >       }
> > +     if (res_attr->mmap)
> > +             res_attr->mapping = iomem_get_mapping();
> >       res_attr->attr.name = res_attr_name;
> >       res_attr->attr.mode = 0600;
> >       res_attr->size = pci_resource_len(pdev, num);
> > diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
> > index 3a2f90beb4cb..9bab07302bbf 100644
> > --- a/drivers/pci/proc.c
> > +++ b/drivers/pci/proc.c
> > @@ -298,6 +298,7 @@ static int proc_bus_pci_open(struct inode *inode, struct file *file)
> >       fpriv->write_combine = 0;
> >
> >       file->private_data = fpriv;
> > +     file->f_mapping = iomem_get_mapping();
> >
> >       return 0;
> >  }
> > --
> > 2.30.0
> >
> >
> > _______________________________________________
> > linux-arm-kernel mailing list
> > linux-arm-kernel@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2021-03-13 22:37 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-04 16:58 [PATCH 0/2] pci sysfs file iomem revoke support Daniel Vetter
2021-02-04 16:58 ` [PATCH 1/2] PCI: also set up legacy files only after sysfs init Daniel Vetter
2021-02-04 21:50   ` Bjorn Helgaas
2021-02-04 22:24     ` Pali Rohár
2021-02-05  9:59       ` Daniel Vetter
2021-02-05 10:04         ` Pali Rohár
2021-02-05 10:16           ` Daniel Vetter
2021-02-05 10:21             ` Pali Rohár
2021-02-05  9:23     ` Daniel Vetter
2021-02-04 16:58 ` [PATCH 2/2] PCI: Revoke mappings like devmem Daniel Vetter
2021-02-10 22:20   ` Bjorn Helgaas
2021-03-13 21:57   ` Bjorn Helgaas
2021-03-13 22:36     ` Daniel Vetter
2021-02-07 19:43 ` [PATCH 0/2] pci sysfs file iomem revoke support Stephen Rothwell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).