linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Bjorn Helgaas <bhelgaas@google.com>
To: Yinghai Lu <yinghai@kernel.org>
Cc: Jiang Liu <liuj97@gmail.com>, Jiang Liu <jiang.liu@huawei.com>,
	Don Dutile <ddutile@redhat.com>,
	Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>,
	Yijing Wang <wangyijing@huawei.com>,
	linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org
Subject: Re: [PATCH 4/5] PCI/IOV: simplify code by hotplug safe pci_get_domain_bus_and_slot()
Date: Thu, 20 Sep 2012 20:56:43 -0600	[thread overview]
Message-ID: <CAErSpo5zeBVBowMyQS8QgAzrVbTJOb=Tuees9r1BnUZ3x13V0A@mail.gmail.com> (raw)
In-Reply-To: <CAE9FiQWuVW6U9cE+iY0Gpua_0xhdxp6HxcWyhqqSN+L3Da_1Qg@mail.gmail.com>

On Thu, Sep 20, 2012 at 7:51 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> On Thu, Sep 20, 2012 at 4:59 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>> On Thu, Sep 20, 2012 at 2:38 PM, Yinghai Lu <yinghai@kernel.org> wrote:
>>> in that case, VFs  are stopped before PF, so they are not in device
>>> tree anymore.
>>> so pci_get_domain_bus_and_slot will not find those VFs.
>>>
>>> So the reference to PF is not released. Also vit_bus may not be released too.
>>>
>>> So you have to rework
>>>     pci_get_domain_bus_and_slot to make it work on pci devices get stopped only.
>>>
>>> or just drop this from the tree.
>>
>> pci_find_bus() is a broken interface (because there's no reference
>> counting or safety with respect to hot-plug), and if the design
>> depends on it, that means the design is broken, too.  I don't think
>> reworking pci_get_domain_bus_and_slot() is the right answer.
>>
>> It's not clear to me why we need the split between stopping and
>> removing devices.  That split leads to these zombie devices that have
>> been stopped and are no longer findable by bus_find_device() (which is
>> used by pci_get_domain_bus_and_slot()), but still "exist" in some
>> fashion until they're removed.  It's unreasonable for random PCI and
>> driver code to have to worry about that zombie state.
>
> That is not zombie state. that is driver unloaded, and not in /sys, /proc.
> that pci device only can be found under bus->devices.

It doesn't matter whether we call this a "zombie state" or just refer
to it as "devices not in /sys & /proc but still in bus->devices."  The
point is that this state is not very useful, and code outside the PCI
core should not have to know that it exists.

> just like we have pci_device_add and pci_bus_add_device
> or acpi add and acpi start.

The fact that ACPI drivers have both .add() and .start() methods is
another artifact of poor design, in my opinion.  No other subsystem
has that split, as far as I know.  The ACPI split exists because of a
messed-up ACPI hotplug implementation.  That doesn't mean we should
copy it.

>> I'm not happy about either reverting Jiang's patch or splitting
>> stop/remove again.  It complicates the design and the code.  I'll
>> apply them because they're regressions, and we don't have time for
>> redesign before 3.7.  But I encourage you to think about how to do
>> this more cleanly.
>
> That will need to redesign sriov implementation.

That's right.  If we can improve the situation by redesigning, that's
what we should do.  The present situation, where we keep adding
special cases because "that's the way the rest of the system works" is
not sustainable in the long term.

> Also that pci root bus add/start, stop/remove will need special
> sequence to make ioapic
> and dmar to be started early before normal pci device drivers and
> stopped after normal pci device drivers.

This is another thing I'm curious about.  How do you handle this
situation today (before host bridge hot-add)?

The DMAR I'm not so worried about because as far as I know, there's no
such thing as a DMAR that's discovered by PCI enumeration.  We should
discover it via ACPI, and that can happen before we enumerate anything
behind a host bridge, so I don't really see any ordering problem
between the DMAR and the PCI devices that would use it.

However, I know there *are* IOAPICs that are enumerated as PCI
devices, and I don't know whether we can deduce a relationship between
the IOAPIC and the devices that use it.  Don't we have this problem
already?  I assume that even without hot-adding a host bridge, we
might discover a PCI IOAPIC that was present at boot, and we'd have to
make sure to bind a driver to it before we use any of the PCI devices
connected to it.  How does that work?

Bjorn

  reply	other threads:[~2012-09-21  2:57 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-08-28 15:43 [PATCH 0/5] Simplify code by using hotplug safe pci_get_domain_bus_and_slot() Jiang Liu
2012-08-28 15:43 ` [PATCH 1/5] PCI/IA64: simplify code by " Jiang Liu
2012-08-28 15:43 ` [PATCH 2/5] PCI/vga: " Jiang Liu
2012-08-28 15:43 ` [PATCH 3/5] PCI/cpcihp: " Jiang Liu
2012-08-28 15:43 ` [PATCH 4/5] PCI/IOV: " Jiang Liu
2012-09-20 20:38   ` Yinghai Lu
2012-09-20 23:59     ` Bjorn Helgaas
2012-09-21  0:02       ` Jiang Liu
2012-09-21  1:51       ` Yinghai Lu
2012-09-21  2:56         ` Bjorn Helgaas [this message]
2012-09-21  6:22           ` Yinghai Lu
2012-09-21 14:15             ` Don Dutile
2012-09-21 15:11               ` Yinghai Lu
2012-08-28 15:43 ` [PATCH 5/5] PCI/xen-pcifront: " Jiang Liu
2012-08-28 16:59   ` Konrad Rzeszutek Wilk
2012-08-28 23:56     ` Jiang Liu
2012-09-05 16:29       ` Konrad Rzeszutek Wilk
2012-09-05 20:32   ` Konrad Rzeszutek Wilk
2012-09-17 19:34 ` [PATCH 0/5] Simplify code by using " Bjorn Helgaas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAErSpo5zeBVBowMyQS8QgAzrVbTJOb=Tuees9r1BnUZ3x13V0A@mail.gmail.com' \
    --to=bhelgaas@google.com \
    --cc=ddutile@redhat.com \
    --cc=jiang.liu@huawei.com \
    --cc=kaneshige.kenji@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=liuj97@gmail.com \
    --cc=wangyijing@huawei.com \
    --cc=yinghai@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).