From: Bjorn Helgaas <helgaas@kernel.org>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hari Vyas <hari.vyas@broadcom.com>,
bhelgaas@google.com, linux-pci@vger.kernel.org,
ray.jui@broadcom.com, Paul Mackerras <paulus@samba.org>,
Michael Ellerman <mpe@ellerman.id.au>,
linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH v3] PCI: Data corruption happening due to race condition
Date: Fri, 27 Jul 2018 17:25:40 -0500
Message-ID: <20180727222540.GH173328@bhelgaas-glaptop.roam.corp.google.com>
In-Reply-To: <65dd986d0b8b2ebe5132b365dabb2dbaaed9177f.camel@kernel.crashing.org>
On Thu, Jul 19, 2018 at 02:18:09PM +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2018-07-18 at 18:29 -0500, Bjorn Helgaas wrote:
> > [+cc Paul, Michael, linuxppc-dev]
> >
>
> ..../...
>
> > > Debugging revealed a race condition between the PCIe core driver
> > > setting the is_added bit (in pci_bus_add_device()) and the nvme
> > > driver's reset work-queue setting the is_busmaster bit (via
> > > pci_set_master()). Neither field is updated atomically, so a
> > > concurrent update can clear the is_added bit.
> > >
> > > The fix moves the is_added bit to a separate private flag
> > > variable and uses atomic functions to set and retrieve the
> > > device addition state. Because is_added now lives in a different
> > > memory location, the race condition is avoided.
> >
> > Really nice bit of debugging!
>
> Indeed. However, I'm not a fan of the solution. Shouldn't we instead have
> some locking for the content of pci_dev? I've always been wary of us
> having other similar races in there.
>
> As for the powerpc bits, I'm probably the one who wrote them, however,
> I'm on vacation this week and right now, no bandwidth to context switch
> all that back in :-) So give me a few days and/or ping me next week.

OK, here's a ping :)

Some powerpc cleanup would be ideal, but I'd like to fix the race for
v4.19, so I'm fine with this patch as-is. But I'd definitely want
your ack before inserting the ugly #include path in the powerpc code.

> The powerpc PCI code contains a lot of cruft coming from the depth of
> history, including rather nasty assumptions. We want to progressively
> clean it up, starting with EEH, but it will take time.
>
> Cheers,
> Ben.
>
> > > Signed-off-by: Hari Vyas <hari.vyas@broadcom.com>
> > > ---
> > > arch/powerpc/kernel/pci-common.c | 4 +++-
> > > arch/powerpc/platforms/powernv/pci-ioda.c | 3 ++-
> > > arch/powerpc/platforms/pseries/setup.c | 3 ++-
> > > drivers/pci/bus.c | 6 +++---
> > > drivers/pci/hotplug/acpiphp_glue.c | 2 +-
> > > drivers/pci/pci.h | 11 +++++++++++
> > > drivers/pci/probe.c | 4 ++--
> > > drivers/pci/remove.c | 5 +++--
> > > include/linux/pci.h | 1 -
> > > 9 files changed, 27 insertions(+), 12 deletions(-)
> > >
> > > diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
> > > index fe9733f..471aac3 100644
> > > --- a/arch/powerpc/kernel/pci-common.c
> > > +++ b/arch/powerpc/kernel/pci-common.c
> > > @@ -42,6 +42,8 @@
> > > #include <asm/ppc-pci.h>
> > > #include <asm/eeh.h>
> > >
> > > +#include "../../../drivers/pci/pci.h"
> >
> > I see why you need it, but this include path is really ugly. Outside
> > of bootloaders and tools, there are very few instances of includes
> > like this that reference a different top-level directory, and I'm not
> > very keen about adding more.
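
For readers following the quoted patch description, the idea of keeping the
"added" state in a separate flag word that is only touched with atomic
operations can be sketched in userspace C. This is a minimal illustration,
not the patch itself: struct demo_pci_dev, DEMO_DEV_ADDED, demo_assign_added()
and demo_is_added() are hypothetical names, and C11 atomics stand in for
whatever atomic bit helpers the kernel code uses.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct pci_dev; not the kernel definition. */
struct demo_pci_dev {
	/*
	 * Plain bit-fields share a storage unit, so writing one of them
	 * is a non-atomic read-modify-write that also rewrites its
	 * neighbours. Two CPUs updating different bits can lose an update.
	 */
	unsigned int is_busmaster:1;
	unsigned int other_flag:1;

	/* The "added" state lives in its own word, updated only atomically. */
	atomic_ulong priv_flags;
};

#define DEMO_DEV_ADDED 0	/* bit number within priv_flags (assumed) */

/* Atomically set or clear the "added" bit without touching the bit-fields. */
static inline void demo_assign_added(struct demo_pci_dev *dev, bool added)
{
	unsigned long mask = 1UL << DEMO_DEV_ADDED;

	if (added)
		atomic_fetch_or(&dev->priv_flags, mask);
	else
		atomic_fetch_and(&dev->priv_flags, ~mask);
}

/* Atomically read the "added" bit. */
static inline bool demo_is_added(struct demo_pci_dev *dev)
{
	return (atomic_load(&dev->priv_flags) & (1UL << DEMO_DEV_ADDED)) != 0;
}
```

Because the "added" bit no longer shares a storage unit with is_busmaster, a
writer of is_busmaster cannot overwrite it with a stale value. The bit-fields
themselves remain unprotected against each other, which is the broader
concern about locking the contents of pci_dev raised in the quoted reply.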