All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Pali Rohár" <pali@kernel.org>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: "Bjorn Helgaas" <bhelgaas@google.com>,
	linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	"Lorenzo Pieralisi" <lorenzo.pieralisi@arm.com>,
	"Gregory Clement" <gregory.clement@bootlin.com>,
	"Andrew Lunn" <andrew@lunn.ch>,
	"Krzysztof Wilczyński" <kw@linux.com>,
	"Yinghai Lu" <yinghai@kernel.org>
Subject: Re: PCI: Race condition in pci_create_sysfs_dev_files
Date: Wed, 7 Oct 2020 10:12:27 +0200	[thread overview]
Message-ID: <20201007081227.d6f6otfsnmlgvegi@pali> (raw)
In-Reply-To: <20201006222222.GA3221382@bjorn-Precision-5520>

On Tuesday 06 October 2020 17:22:22 Bjorn Helgaas wrote:
> [+cc Krzysztof, Yinghai]
> 
> On Wed, Sep 09, 2020 at 01:28:50PM +0200, Pali Rohár wrote:
> > Hello! I'm adding more people to loop.
> > 
> > Can somebody look at these race conditions and my patch?
> > 
> > On Friday 14 August 2020 10:08:24 Pali Rohár wrote:
> > > Hello! I would like to remind this issue which I reported month ago.
> > > 
> > > On Thursday 16 July 2020 13:04:23 Pali Rohár wrote:
> > > > Hello Bjorn!
> > > > 
> > > > I see following error message in dmesg which looks like a race condition:
> > > > 
> > > > sysfs: cannot create duplicate filename '/devices/platform/soc/d0070000.pcie/pci0000:00/0000:00:00.0/config'
> > > > 
> > > > I looked at it deeper and found out that in PCI subsystem code is race
> > > > condition between pci_bus_add_device() and pci_sysfs_init() calls. Both
> > > > of these functions calls pci_create_sysfs_dev_files() and calling this
> > > > function more times for same pci device throws above error message.
> > > > 
> > > > There can be two different race conditions:
> > > > 
> > > > 1. pci_bus_add_device() called pcibios_bus_add_device() or
> > > > pci_fixup_device() but have not called pci_create_sysfs_dev_files() yet.
> > > > Meanwhile pci_sysfs_init() is running and pci_create_sysfs_dev_files()
> > > > was called for newly registered device. In this case function
> > > > pci_create_sysfs_dev_files() is called two times, ones from
> > > > pci_bus_add_device() and once from pci_sysfs_init().
> > > > 
> > > > 2. pci_sysfs_init() is called. It first sets sysfs_initialized to 1
> > > > which unblock calling pci_create_sysfs_dev_files(). Then another bus
> > > > registers new PCI device and calls pci_bus_add_device() which calls
> > > > pci_create_sysfs_dev_files() and registers sysfs files. Function
> > > > pci_sysfs_init() continues execution and calls function
> > > > pci_create_sysfs_dev_files() also for this newly registered device. So
> > > > pci_create_sysfs_dev_files() is again called two times.
> > > > 
> > > > 
> > > > I workaround both race conditions I created following hack patch. After
> > > > applying it I'm not getting that 'sysfs: cannot create duplicate filename'
> > > > error message anymore.
> > > > 
> > > > Can you look at it how to fix both race conditions in proper way?
> > > 
> > > Is this workaround diff enough? Or are you going to prepare
> > > something better?
> > > 
> > > Please let me know if I should send this diff as regular patch.
> 
> I'm not really a fan of this because pci_sysfs_init() is a bit of a
> hack to begin with, and this makes it even more complicated.

I understand, but the two race conditions are in this pci_sysfs_init()
function. So either it needs to be fixed or code changed, so this
function can be removed.

I'm hitting these race conditions randomly on pci aardvark controller
driver- I prepared patch which speed up initialization of this driver,
but also increase probability that it hits above race conditions :-(

> It's not obvious from the code why we need pci_sysfs_init(), but
> Yinghai hinted [1] that we need to create sysfs after assigning
> resources.  I experimented by removing pci_sysfs_init() and skipping
> the ROM BAR sizing.  In that case, we create sysfs files in
> pci_bus_add_device() and later assign space for the ROM BAR, so we
> fail to create the "rom" sysfs file.

Thank you for explanation. I think it should be documented somewhere in
the code as it is really not obvious.

> The current solution to that is to delay the sysfs files until
> pci_sysfs_init(), a late_initcall(), which runs after resource
> assignments.  But I think it would be better if we could create the
> sysfs file when we assign the BAR.  Then we could get rid of the
> late_initcall() and that implicit ordering requirement.
> 
> But I haven't tried to code it up, so it's probably more complicated
> than this.  I guess ideally we would assign all the resources before
> pci_bus_add_device().  If we could do that, we could just remove
> pci_sysfs_init() and everything would just work, but I think that's a
> HUGE can of worms.
> 
> [1] https://lore.kernel.org/linux-pci/CAE9FiQWBXHgz-gWCmpWLaBOfQQJwtRZemV6Ut9GVw_KJ-dTGTA@mail.gmail.com/
> 
> > > > diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
> > > > index 8e40b3e6da77..691be2258c4e 100644
> > > > --- a/drivers/pci/bus.c
> > > > +++ b/drivers/pci/bus.c
> > > > @@ -316,7 +316,7 @@ void pci_bus_add_device(struct pci_dev *dev)
> > > >  	 */
> > > >  	pcibios_bus_add_device(dev);
> > > >  	pci_fixup_device(pci_fixup_final, dev);
> > > > -	pci_create_sysfs_dev_files(dev);
> > > > +	pci_create_sysfs_dev_files(dev, false);
> > > >  	pci_proc_attach_device(dev);
> > > >  	pci_bridge_d3_update(dev);
> > > >  
> > > > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> > > > index 6d78df981d41..b0c4852a51dd 100644
> > > > --- a/drivers/pci/pci-sysfs.c
> > > > +++ b/drivers/pci/pci-sysfs.c
> > > > @@ -1328,13 +1328,13 @@ static int pci_create_capabilities_sysfs(struct pci_dev *dev)
> > > >  	return retval;
> > > >  }
> > > >  
> > > > -int __must_check pci_create_sysfs_dev_files(struct pci_dev *pdev)
> > > > +int __must_check pci_create_sysfs_dev_files(struct pci_dev *pdev, bool sysfs_initializing)
> > > >  {
> > > >  	int retval;
> > > >  	int rom_size;
> > > >  	struct bin_attribute *attr;
> > > >  
> > > > -	if (!sysfs_initialized)
> > > > +	if (!sysfs_initializing && !sysfs_initialized)
> > > >  		return -EACCES;
> > > >  
> > > >  	if (pdev->cfg_size > PCI_CFG_SPACE_SIZE)
> > > > @@ -1437,18 +1437,21 @@ void pci_remove_sysfs_dev_files(struct pci_dev *pdev)
> > > >  static int __init pci_sysfs_init(void)
> > > >  {
> > > >  	struct pci_dev *pdev = NULL;
> > > > -	int retval;
> > > > +	int retval = 0;
> > > >  
> > > > -	sysfs_initialized = 1;
> > > >  	for_each_pci_dev(pdev) {
> > > > -		retval = pci_create_sysfs_dev_files(pdev);
> > > > +		if (!pci_dev_is_added(pdev))
> > > > +			continue;
> > > > +		retval = pci_create_sysfs_dev_files(pdev, true);
> > > >  		if (retval) {
> > > >  			pci_dev_put(pdev);
> > > > -			return retval;
> > > > +			goto out;
> > > >  		}
> > > >  	}
> > > >  
> > > > -	return 0;
> > > > +out:
> > > > +	sysfs_initialized = 1;
> > > > +	return retval;
> > > >  }
> > > >  late_initcall(pci_sysfs_init);
> > > >  
> > > > diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> > > > index 6d3f75867106..304294c7171e 100644
> > > > --- a/drivers/pci/pci.h
> > > > +++ b/drivers/pci/pci.h
> > > > @@ -19,7 +19,7 @@ bool pcie_cap_has_rtctl(const struct pci_dev *dev);
> > > >  
> > > >  /* Functions internal to the PCI core code */
> > > >  
> > > > -int pci_create_sysfs_dev_files(struct pci_dev *pdev);
> > > > +int pci_create_sysfs_dev_files(struct pci_dev *pdev, bool sysfs_initializing);
> > > >  void pci_remove_sysfs_dev_files(struct pci_dev *pdev);
> > > >  #if !defined(CONFIG_DMI) && !defined(CONFIG_ACPI)
> > > >  static inline void pci_create_firmware_label_files(struct pci_dev *pdev)
> > > > 

  parent reply	other threads:[~2020-10-07  8:12 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-16 11:04 PCI: Race condition in pci_create_sysfs_dev_files Pali Rohár
2020-08-14  8:08 ` Pali Rohár
2020-09-09 11:28   ` Pali Rohár
2020-10-05  8:20     ` Pali Rohár
2020-10-05 13:54       ` Bjorn Helgaas
2020-10-06 22:22     ` Bjorn Helgaas
2020-10-07  1:47       ` Oliver O'Halloran
2020-10-07  8:14         ` Pali Rohár
2020-10-07 16:14           ` Bjorn Helgaas
2020-10-08 19:59             ` Bjorn Helgaas
2020-10-09  8:08               ` Pali Rohár
2020-11-04 16:29                 ` Pali Rohár
2020-11-15  6:19                   ` Krzysztof Wilczyński
2020-11-15 12:10                     ` Pali Rohár
2020-10-07  8:12       ` Pali Rohár [this message]
2021-04-07 14:25         ` Petr Štetiar
2021-04-07 14:51           ` Pali Rohár
2021-04-07 15:30             ` Petr Štetiar
2021-06-25 11:29               ` Koen Vandeputte
2021-06-25 11:54                 ` Pali Rohár
2021-10-13  6:18                   ` Dexuan-Linux Cui
2021-11-18  3:09                     ` Krzysztof Wilczyński
2021-11-18  3:42                       ` Dexuan Cui
2020-11-04 16:46       ` Pali Rohár
2021-04-07 14:41 ` Pali Rohár

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201007081227.d6f6otfsnmlgvegi@pali \
    --to=pali@kernel.org \
    --cc=andrew@lunn.ch \
    --cc=bhelgaas@google.com \
    --cc=gregory.clement@bootlin.com \
    --cc=helgaas@kernel.org \
    --cc=kw@linux.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lorenzo.pieralisi@arm.com \
    --cc=yinghai@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.