From: Bjorn Helgaas <bhelgaas@google.com>
To: Don Dutile <ddutile@redhat.com>
Cc: Jiang Liu <jiang.liu@huawei.com>,
	"Rafael J. Wysocki" <rjw@sisk.pl>,
	Yinghai Lu <yinghai@kernel.org>,
	Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>,
	Taku Izumi <izumi.taku@jp.fujitsu.com>,
	Yijing Wang <wangyijing@huawei.com>,
	Keping Chen <chenkeping@huawei.com>,
	linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	Jiang Liu <liuj97@gmail.com>
Subject: Re: [Resend with Ack][PATCH v1] PCI: allow acpiphp to handle PCIe ports without native PCIe hotplug capability
Date: Wed, 4 Jul 2012 12:07:11 -0600
Message-ID: <CAErSpo7=BzPX87qgKS_xkJg4o8BGDan-o7D9NqRiqKvH0kLjfQ@mail.gmail.com>
In-Reply-To: <4FF34CEF.3090400@redhat.com>

On Tue, Jul 3, 2012 at 1:50 PM, Don Dutile <ddutile@redhat.com> wrote:
> On 07/03/2012 11:59 AM, Bjorn Helgaas wrote:
>>
>> On Mon, Jul 2, 2012 at 10:16 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>>>
>>> On Mon, Jun 4, 2012 at 1:44 AM, Jiang Liu <jiang.liu@huawei.com> wrote:
>>>>
>>>> Commit 0d52f54e2ef64c189dedc332e680b2eb4a34590a (PCI / ACPI: Make
>>>> acpiphp ignore root bridges using PCIe native hotplug) added code that
>>>> made the acpiphp driver completely ignore PCIe root complexes for which
>>>> the kernel had been granted control of the native PCIe hotplug feature
>>>> by the BIOS through _OSC. Later commit
>>>> 619a5182d1f38a3d629ee48e04fa182ef9170052 "PCI hotplug: Always allow
>>>> acpiphp to handle non-PCIe bridges" relaxed the constraints to allow
>>>> the acpiphp driver to handle non-PCIe bridges under such a complex. The
>>>> constraint needs to be relaxed further to allow the acpiphp driver to
>>>> handle PCIe ports without native PCIe hotplug capability.
>>>>
>>>> Some MR-IOV switch chipsets, such as the PLX8696, support multiple
>>>> virtual PCIe switches and may migrate downstream ports among virtual
>>>> switches. To migrate a downstream port from the source virtual switch
>>>> to the target, the port needs to be hot-removed from the source and
>>>> hot-added into the target. The pciehp driver can't be used here because
>>>> there are no slots within the virtual PCIe switch, so the acpiphp
>>>> driver is used to support downstream port migration. A typical
>>>> configuration is as below:
>>>> [Root w/o native PCIe HP]
>>>>          [Upstream port of vswitch w/o native PCIe HP]
>>>>                  [Downstream port of vswitch w/ native PCIe HP]
>>>>                          [PCIe endpoint]
>>>>
>>>> Here the acpiphp driver will be used to handle root ports and the
>>>> upstream port in the virtual switch, and the pciehp driver will be used
>>>> to handle downstream ports in the virtual switch.
>>>>
>>>> Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
>>>> Signed-off-by: Jiang Liu <liuj97@gmail.com>
>>>>
>>>> ---
>>>>   drivers/pci/hotplug/acpiphp_glue.c |   49 ++++++++++++++++++++++++++++-------
>>>>   1 files changed, 39 insertions(+), 10 deletions(-)
>>>>
>>>> diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
>>>> index 806c44f..4889448 100644
>>>> --- a/drivers/pci/hotplug/acpiphp_glue.c
>>>> +++ b/drivers/pci/hotplug/acpiphp_glue.c
>>>> @@ -115,6 +115,43 @@ static const struct acpi_dock_ops acpiphp_dock_ops = {
>>>>          .handler = handle_hotplug_event_func,
>>>>   };
>>>>
>>>> +/* Check whether device is managed by native PCIe hotplug driver */
>>>> +static bool device_is_managed_by_native_pciehp(struct pci_dev *pdev)
>>>> +{
>>>> +       int pos;
>>>> +       u16 reg16;
>>>> +       u32 reg32;
>>>> +       acpi_handle tmp;
>>>> +       struct acpi_pci_root *root;
>>>> +
>>>> +       if (!pci_is_pcie(pdev))
>>>> +               return false;
>>>> +
>>>> +       /* Check whether PCIe port supports native PCIe hotplug */
>>>> +       pos = pci_pcie_cap(pdev);
>>>
>>>
>>> Add "if (!pos) return false;" here and you can drop the "if
>>> (!pci_is_pcie())" test above.
>>>
>>>> +       pci_read_config_word(pdev, pos + PCI_EXP_FLAGS, &reg16);
>>>> +       if (!(reg16 & PCI_EXP_FLAGS_SLOT))
>>>
>>>
>>> I think this is unsafe.  Per the PCIe v3.0 spec, sec 7.8.2 on p648,
>>> the "Slot Implemented" bit is undefined except for Downstream Ports,
>>> so we're using an undefined bit to decide whether to read
>>> PCI_EXP_SLTCAP.
>>>
>>> If the device has a v1 PCIe Capability, it is not required to even
>>> implement PCI_EXP_SLTCAP, so we could be reading garbage out of an
>>> unrelated capability.  This is in sec 7.8, p363, of the v1.1 PCIe
>>> spec.  I think v3.0 of the spec is dangerously incomplete because it
>>> doesn't include enough information to handle the v1 PCIe Capability
>>> correctly.
>>>
>>> There's a fair amount of work to fix this.  I started doing it, but
>>> decided I didn't have time to complete it.  Here's what I think we
>>> (and by "we," I'm afraid I mean "you" :)) should do:
>>>
>>>    - Add a "u16 pcie_flags" field in struct pci_dev and save the "PCI
>>> Express Capabilities Register" there in set_pcie_port_type().  All
>>> fields in that register are read-only, so it should be safe to cache
>>> it.
>>>    - Remove pcie_type from struct pci_dev and replace it with a
>>> pcie_type() inline that extracts it from pcie_flags.
>>>    - Rework the pcie_cap_has_*() macros in drivers/pci/pci.c to take a
>>> struct pci_dev * and use pcie_flags instead of type and flags.  This
>>> will remove the need for callers to read the flags themselves.
>>>    - Move the pcie_cap_has_*() macros to include/linux/pci_reg.h so
>>> they can be shared.
>>>    - Audit all uses of the Link registers (PCI_EXP_LNKCAP,
>>> PCI_EXP_LNKCTL, PCI_EXP_LNKSTA), Slot registers (PCI_EXP_SLTCAP,
>>> PCI_EXP_SLTCTL, PCI_EXP_SLTSTA), and Root registers (PCI_EXP_RTCAP,
>>> PCI_EXP_RTCTL, PCI_EXP_RTSTA) to make sure the register exists, either
>>> by using pcie_cap_has_*() or some other knowledge of the device.
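
To make the first three items above a bit more concrete, here is a
minimal user-space sketch of caching the flags register and deriving the
port type and a has-slot-registers check from it.  It is an illustration
only: struct mock_pci_dev and every mock_* function are hypothetical
stand-ins, not existing kernel interfaces, and the slot check follows my
reading of the current pcie_cap_has_sltctl() macro, so treat it as an
assumption.  The PCI_EXP_* values are copied from the register
definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit definitions for the PCI Express Capabilities Register */
#define PCI_EXP_FLAGS_VERS	0x000f	/* Capability version */
#define PCI_EXP_FLAGS_TYPE	0x00f0	/* Device/Port type */
#define PCI_EXP_FLAGS_SLOT	0x0100	/* Slot implemented */
#define PCI_EXP_TYPE_ROOT_PORT	0x4
#define PCI_EXP_TYPE_UPSTREAM	0x5
#define PCI_EXP_TYPE_DOWNSTREAM	0x6

/* Hypothetical stand-in for struct pci_dev; only the cached flags matter */
struct mock_pci_dev {
	uint16_t pcie_flags;	/* read once at enumeration, read-only */
};

/* Port type extracted from the cached flags (replaces a pcie_type field) */
static inline int mock_pcie_type(const struct mock_pci_dev *dev)
{
	return (dev->pcie_flags & PCI_EXP_FLAGS_TYPE) >> 4;
}

/*
 * Slot registers exist for any v2+ capability, for root ports, and for
 * downstream ports that set Slot Implemented.  The Slot Implemented bit
 * is ignored for every other port type because it is undefined there.
 */
static inline bool mock_pcie_cap_has_sltctl(const struct mock_pci_dev *dev)
{
	int type = mock_pcie_type(dev);

	return (dev->pcie_flags & PCI_EXP_FLAGS_VERS) > 1 ||
	       type == PCI_EXP_TYPE_ROOT_PORT ||
	       (type == PCI_EXP_TYPE_DOWNSTREAM &&
		(dev->pcie_flags & PCI_EXP_FLAGS_SLOT));
}
```

With this shape, a v1 upstream port whose flags happen to have bit 8 set
(e.g. pcie_flags == 0x0151) is reported as having no Slot registers,
which is the case the patch above gets wrong.
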
>>
>>
>> Thinking about this some more, this still leaves the callers
>> responsible for using pcie_cap_has_*(), which feels pretty
>> error-prone.
>>
>> I wonder if it'd be worth adding interfaces like:
>>
>>    pcie_cap_read_word(const struct pci_dev *, int where, u16 *val);
>>    pcie_cap_read_dword(const struct pci_dev *, int where, u32 *val);
>>    pcie_cap_write_word(const struct pci_dev *, int where, u16 val);
>>    pcie_cap_write_dword(const struct pci_dev *, int where, u32 val);
>>
>
> I like your thinking!
>
>
>> We might be able to encapsulate the v1/v2 differences inside these, e.g.,
>>
>>    int pcie_cap_read_word(const struct pci_dev *dev, int where, u16 *val)
>>    {
>>        int pos;
>>
>>        pos = pci_pcie_cap(dev);
>>        if (!pos)
>>            return -EINVAL;
>>
> You may want to set the read value to 0 in case callers skip the
> return-value check and just mask the value read and go.  I believe that,
> for all the optional/versioned registers below, registers that are not
> implemented are required to read as zero.

Generally I prefer that if a function returns failure, it doesn't
modify the parameters passed by reference, but in this case, I think
you're right that we should set *val to zero to begin with.  It will
simplify the following code somewhat, too.

Note that most non-implemented registers should read as zero, but Slot
Status of Downstream Ports is an exception (spec v3.0, sec 7.8, line
25).
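
As a rough illustration of what zeroing *val up front would look like,
here is a user-space sketch with a fake config space.  Everything named
mock_* is hypothetical and exists only so the example stands alone; the
register offsets are copied from the PCI_EXP_* definitions.  Only the
Slot registers are shown; the Link and Root cases would presumably follow
the same pattern.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define PCI_EXP_FLAGS		2
#define PCI_EXP_FLAGS_VERS	0x000f
#define PCI_EXP_FLAGS_TYPE	0x00f0
#define PCI_EXP_FLAGS_SLOT	0x0100
#define PCI_EXP_TYPE_ROOT_PORT	0x4
#define PCI_EXP_TYPE_DOWNSTREAM	0x6
#define PCI_EXP_DEVCTL		8
#define PCI_EXP_DEVSTA		10
#define PCI_EXP_SLTCTL		24
#define PCI_EXP_SLTSTA		26
#define PCI_EXP_SLTSTA_PDS	0x0040	/* Presence Detect State */

/* Hypothetical stand-in for struct pci_dev plus its config space */
struct mock_pci_dev {
	int pcie_cap;		/* offset of PCIe capability, 0 if none */
	uint16_t pcie_flags;	/* cached PCI Express Capabilities Register */
	uint8_t config[256];	/* fake config space, little-endian */
};

static int mock_read_config_word(const struct mock_pci_dev *dev, int where,
				 uint16_t *val)
{
	if (where < 0 || where > 254)
		return -EINVAL;
	*val = (uint16_t)(dev->config[where] | (dev->config[where + 1] << 8));
	return 0;
}

static int mock_pcie_type(const struct mock_pci_dev *dev)
{
	return (dev->pcie_flags & PCI_EXP_FLAGS_TYPE) >> 4;
}

static bool mock_cap_has_sltctl(const struct mock_pci_dev *dev)
{
	int type = mock_pcie_type(dev);

	return (dev->pcie_flags & PCI_EXP_FLAGS_VERS) > 1 ||
	       type == PCI_EXP_TYPE_ROOT_PORT ||
	       (type == PCI_EXP_TYPE_DOWNSTREAM &&
		(dev->pcie_flags & PCI_EXP_FLAGS_SLOT));
}

static int mock_pcie_cap_read_word(const struct mock_pci_dev *dev, int where,
				   uint16_t *val)
{
	*val = 0;	/* default for failure and for absent registers */

	if (!dev->pcie_cap)
		return -EINVAL;

	switch (where) {
	case PCI_EXP_FLAGS:
	case PCI_EXP_DEVCTL:
	case PCI_EXP_DEVSTA:
		return mock_read_config_word(dev, dev->pcie_cap + where, val);
	case PCI_EXP_SLTCTL:
	case PCI_EXP_SLTSTA:
		if (mock_cap_has_sltctl(dev))
			return mock_read_config_word(dev,
						     dev->pcie_cap + where,
						     val);
		/* The one non-zero default: Slot Status of Downstream Ports */
		if (where == PCI_EXP_SLTSTA &&
		    mock_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM)
			*val = PCI_EXP_SLTSTA_PDS;
		return 0;
	}
	return -EINVAL;
}
```

Because *val is cleared first, the else branches from the earlier draft
collapse into a single special case, and a caller that ignores the return
value still never sees bytes from an unrelated capability.
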

>>        switch (where) {
>>        case PCI_EXP_FLAGS:
>>        case PCI_EXP_DEVCTL:
>>        case PCI_EXP_DEVSTA:
>>            return pci_read_config_word(dev, pos + where, val);
>>        case PCI_EXP_LNKCTL:
>>        case PCI_EXP_LNKSTA:
>>            if (pcie_cap_has_lnkctl(dev))
>>                return pci_read_config_word(dev, pos + where, val);
>>            else {
>>                *val = 0;
>>                return 0;
>>            }
>>        case PCI_EXP_SLTCTL:
>>        case PCI_EXP_SLTSTA:
>>            if (pcie_cap_has_sltctl(dev))
>>                return pci_read_config_word(dev, pos + where, val);
>>            else {
>>                *val = 0;
>>                if (where == PCI_EXP_SLTSTA &&
>>                    dev->pcie_type == PCI_EXP_TYPE_DOWNSTREAM)
>>                    *val = PCI_EXP_SLTSTA_PDS;
>>                return 0;
>>        ...
>>        };
>>        return -EINVAL;
>>    }
>>
>> Any thoughts?
>
>
> only one is that 'cap' is overused in PCI space, just like 'domain' in
> various kernel subsystems.  cap could be 'cap list structure'
> or a specific 'capability'.  I wish we had a better TLA for 'cap' and what
> it refers to. ... but that's my pet peeve...

I agree, 'cap' is overused and confusing.  Even "PCI Express
Capability Structure" as used in the spec seems slightly confusing to
me.  The existing pci_find_capability() interfaces spell out 'cap';
maybe we should, too.  Here are some possibilities, starting with my
current favorite:

  pci_pcie_capability_read_word()
  pci_pcie_cap_read_word()  (extension of existing pci_pcie_cap() idea)
  pci_express_cap_read_word()
