linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Pali Rohár" <pali@kernel.org>
To: "Petr Štetiar" <ynezz@true.cz>
Cc: "Bjorn Helgaas" <helgaas@kernel.org>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	"Lorenzo Pieralisi" <lorenzo.pieralisi@arm.com>,
	"Gregory Clement" <gregory.clement@bootlin.com>,
	"Andrew Lunn" <andrew@lunn.ch>,
	"Krzysztof Wilczyński" <kw@linux.com>,
	"Yinghai Lu" <yinghai@kernel.org>,
	"Koen Vandeputte" <koen.vandeputte@citymesh.com>
Subject: Re: PCI: Race condition in pci_create_sysfs_dev_files
Date: Wed, 7 Apr 2021 16:51:47 +0200	[thread overview]
Message-ID: <20210407145147.bvtubdmz4k6nulf7@pali> (raw)
In-Reply-To: <20210407142538.GA12387@meh.true.cz>

On Wednesday 07 April 2021 16:25:46 Petr Štetiar wrote:
> Pali Rohár <pali@kernel.org> [2020-10-07 10:12:27]:
> 
> Hi,
> 
> [adding Koen to Cc:]
> 
> > I'm hitting these race conditions randomly on pci aardvark controller
> > driver- I prepared patch which speed up initialization of this driver,
> > but also increase probability that it hits above race conditions :-(
> 
> it seems, that I'm able to reproduce this race condition on every boot with
> 5.10 on my Freescale i.MX6Q board, see the log excerpt bellow. I don't know if
> this helps, but it's not happening on 5.4 kernel.

Hello! This is same race condition which I described in my original
report. Good to know that other people can reproduce it too!

> [    0.000000] Booting Linux on physical CPU 0x0
> [    0.000000] Linux version 5.10.27 (ynezz@ntbk) (arm-openwrt-linux-muslgnueabi-gcc (OpenWrt GCC 8.4.0 r12719+30-84f4a783c698) 8.4.0, GNU ld (GNU Binutils) 2.34) #0 SMP Wed Apr 7 12:52:23 2021
> [    0.000000] CPU: ARMv7 Processor [412fc09a] revision 10 (ARMv7), cr=10c5387d
> [    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
> [    0.000000] OF: fdt: Machine model: Toradex Apalis iMX6Q/D Module on Ixora Carrier Board
> 
> ...
> 
> [    2.239498] pci 0000:00:00.0: BAR 0: assigned [mem 0x01000000-0x010fffff]
> [    2.266430] pci 0000:00:00.0: BAR 6: assigned [mem 0x01100000-0x0110ffff pref]
> ����#���+$HH��.274570] pci 0000:00:00.0: Max Payload Size set to  128/ 128 (was  128), Max Read Rq  128
> 
>  (this serial console hiccup during PCI initialization seems quite strange as well, happens always)

I'm seeing similar garbage on UART output when such thing happen. But I
suspect that this is the issue when doing serialization of more parallel
messages print by kernel. Nothing relevant to PCI.

But running 'dmesg' after full bootup can show also these lost messages.

> [    2.283074] sysfs: cannot create duplicate filename '/devices/platform/soc/1ffc000.pcie/pci0000:00/0000:00:00.0/config'
> [    2.293884] CPU: 1 PID: 47 Comm: kworker/u8:3 Not tainted 5.10.27 #0
> [    2.300165] random: fast init done
> [    2.300249] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
> [    2.310186] Workqueue: events_unbound async_run_entry_fn

And this is important information! Device is registered by some async
workqueue. When I tried to use my (private) patch which speed by
aardvark initialization by putting init code into separate worker then
probability of hitting this race condition increased to about 90%. So
same situation as yours, it was called from async workqueue.

But now with 5.12-rc kernel I'm not able to reproduce it because
something changed in kernel and kernel schedule this "racy" worker long
after race condition may happen.

So it needs perfect timing and seems that one important thing is to call
pci_bus_add_device() function from separate worker.

> [    2.315507] Backtrace:
> [    2.317976] [<8010cc88>] (dump_backtrace) from [<8010d134>] (show_stack+0x20/0x24)
> [    2.325556]  r7:813283d4 r6:60000013 r5:00000000 r4:80ec845c
> [    2.331236] [<8010d114>] (show_stack) from [<805971fc>] (dump_stack+0xa4/0xb8)
> [    2.338480] [<80597158>] (dump_stack) from [<803a7128>] (sysfs_warn_dup+0x68/0x74)
> [    2.346058]  r7:813283d4 r6:80af0dfc r5:812f48f0 r4:81afa000
> [    2.351726] [<803a70c0>] (sysfs_warn_dup) from [<803a6c90>] (sysfs_add_file_mode_ns+0x100/0x1cc)
> [    2.360518]  r7:813283d4 r6:812f48f0 r5:80b4299c r4:ffffffef
> [    2.366187] [<803a6b90>] (sysfs_add_file_mode_ns) from [<803a6fe8>] (sysfs_create_bin_file+0x94/0x9c)
> [    2.375411]  r6:81eb8078 r5:80b4299c r4:00000000
> [    2.380043] [<803a6f54>] (sysfs_create_bin_file) from [<805da848>] (pci_create_sysfs_dev_files+0x58/0x2cc)
> [    2.389701]  r6:81eb8000 r5:81eb8078 r4:81eb8000
> [    2.394341] [<805da7f0>] (pci_create_sysfs_dev_files) from [<805cba98>] (pci_bus_add_device+0x34/0x90)
> [    2.403659]  r10:80b45d88 r9:81818810 r8:81328200 r7:813283d4 r6:8190c000 r5:81eb8078
> [    2.411490]  r4:81eb8000
> [    2.414034] [<805cba64>] (pci_bus_add_device) from [<805cbb30>] (pci_bus_add_devices+0x3c/0x80)
> [    2.421744] random: crng init done
> [    2.422737]  r5:8190c014 r4:81eb8000
> [    2.429720] [<805cbaf4>] (pci_bus_add_devices) from [<805cf898>] (pci_host_probe+0x50/0xa0)
> [    2.438078]  r7:813283d4 r6:8190c000 r5:8190c00c r4:00000000
> [    2.443753] [<805cf848>] (pci_host_probe) from [<805eeb20>] (dw_pcie_host_init+0x1d0/0x414)
> [    2.452111]  r7:813283d4 r6:81328058 r5:00000200 r4:00000000
> [    2.457780] [<805ee950>] (dw_pcie_host_init) from [<805ef5a8>] (imx6_pcie_probe+0x38c/0x69c)
> [    2.466226]  r10:81226180 r9:ef0205c4 r8:81818800 r7:81328040 r6:81328040 r5:81818810
> [    2.474060]  r4:00000020
> [    2.476615] [<805ef21c>] (imx6_pcie_probe) from [<8065e858>] (platform_drv_probe+0x58/0xa8)
> [    2.484977]  r10:80ec9f78 r9:00000000 r8:80f160a8 r7:00000000 r6:80ec9f78 r5:00000000
> [    2.492809]  r4:81818810
> [    2.495357] [<8065e800>] (platform_drv_probe) from [<8065c0a0>] (really_probe+0x128/0x534)
> [    2.503627]  r7:00000000 r6:80f5b8c4 r5:81818810 r4:80f5b8d4
> [    2.509296] [<8065bf78>] (really_probe) from [<8065c700>] (driver_probe_device+0x88/0x200)
> [    2.517570]  r10:00000000 r9:80f0bb60 r8:00000000 r7:80f160a8 r6:80ec9f78 r5:80ec9f78
> [    2.525401]  r4:81818810
> [    2.527946] [<8065c678>] (driver_probe_device) from [<8065c904>] (__driver_attach_async_helper+0x8c/0xb4)
> [    2.537521]  r9:80f0bb60 r8:00000000 r7:8104d000 r6:80ec9f78 r5:80f25010 r4:81818810
> [    2.545273] [<8065c878>] (__driver_attach_async_helper) from [<8015b930>] (async_run_entry_fn+0x58/0x1bc)
> [    2.554843]  r6:819fb480 r5:80f25010 r4:819fb490
> [    2.559479] [<8015b8d8>] (async_run_entry_fn) from [<8015114c>] (process_one_work+0x238/0x5ac)
> [    2.568099]  r8:00000000 r7:8104d000 r6:81048400 r5:8127a080 r4:819fb490
> [    2.574811] [<80150f14>] (process_one_work) from [<8015152c>] (worker_thread+0x6c/0x5c0)
> [    2.582909]  r10:81048400 r9:80e03d00 r8:81048418 r7:00000088 r6:81048400 r5:8127a094
> [    2.590742]  r4:8127a080
> [    2.593290] [<801514c0>] (worker_thread) from [<80158168>] (kthread+0x174/0x178)
> [    2.600694]  r10:8127a080 r9:812e3024 r8:8116fe74 r7:813c0000 r6:00000000 r5:8127b040
> [    2.608527]  r4:812e3000
> [    2.611073] [<80157ff4>] (kthread) from [<80100148>] (ret_from_fork+0x14/0x2c)
> [    2.618302] Exception stack(0x813c1fb0 to 0x813c1ff8)
> [    2.623362] 1fa0:                                     00000000 00000000 00000000 00000000
> [    2.631548] 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [    2.639731] 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000
> [    2.646354]  r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:80157ff4
> [    2.654187]  r4:8127b040
> [    2.657157] pcieport 0000:00:00.0: PME: Signaling with IRQ 316
> [    2.674770] VFS: Mounted root (squashfs filesystem) readonly on device 179:2.
> [    2.686264] Freeing unused kernel memory: 1024K
> 
> Complete serial console log http://sprunge.us/wCe6zs

Could you run 'dmesg' and provide its output? So also missing / garbage
messages would be visible.

  reply	other threads:[~2021-04-07 14:51 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-16 11:04 PCI: Race condition in pci_create_sysfs_dev_files Pali Rohár
2020-08-14  8:08 ` Pali Rohár
2020-09-09 11:28   ` Pali Rohár
2020-10-05  8:20     ` Pali Rohár
2020-10-05 13:54       ` Bjorn Helgaas
2020-10-06 22:22     ` Bjorn Helgaas
2020-10-07  1:47       ` Oliver O'Halloran
2020-10-07  8:14         ` Pali Rohár
2020-10-07 16:14           ` Bjorn Helgaas
2020-10-08 19:59             ` Bjorn Helgaas
2020-10-09  8:08               ` Pali Rohár
2020-11-04 16:29                 ` Pali Rohár
2020-11-15  6:19                   ` Krzysztof Wilczyński
2020-11-15 12:10                     ` Pali Rohár
2020-10-07  8:12       ` Pali Rohár
2021-04-07 14:25         ` Petr Štetiar
2021-04-07 14:51           ` Pali Rohár [this message]
2021-04-07 15:30             ` Petr Štetiar
2021-06-25 11:29               ` Koen Vandeputte
2021-06-25 11:54                 ` Pali Rohár
2021-10-13  6:18                   ` Dexuan-Linux Cui
2021-11-18  3:09                     ` Krzysztof Wilczyński
2021-11-18  3:42                       ` Dexuan Cui
2020-11-04 16:46       ` Pali Rohár
2021-04-07 14:41 ` Pali Rohár

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210407145147.bvtubdmz4k6nulf7@pali \
    --to=pali@kernel.org \
    --cc=andrew@lunn.ch \
    --cc=bhelgaas@google.com \
    --cc=gregory.clement@bootlin.com \
    --cc=helgaas@kernel.org \
    --cc=koen.vandeputte@citymesh.com \
    --cc=kw@linux.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lorenzo.pieralisi@arm.com \
    --cc=yinghai@kernel.org \
    --cc=ynezz@true.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).