All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vivek Kumar Gautam <vivek.gautam@arm.com>
To: Alexandru Elisei <alexandru.elisei@arm.com>,
	will@kernel.org, julien.thierry.kdev@gmail.com,
	kvm@vger.kernel.org
Cc: andre.przywara@arm.com, lorenzo.pieralisi@arm.com,
	jean-philippe@linaro.org, eric.auger@redhat.com
Subject: Re: [PATCH] vfio/pci: Add support for PCIe extended capabilities
Date: Thu, 2 Sep 2021 16:18:09 +0530	[thread overview]
Message-ID: <51cc78f5-6841-5e83-3b28-bef7fbf68937@arm.com> (raw)
In-Reply-To: <af1ae6fe-176b-8016-3815-faa9651b0aba@arm.com>

Hi Alex,


On 9/2/21 3:29 PM, Alexandru Elisei wrote:
> Hi Vivek,
>
> On 8/10/21 7:25 AM, Vivek Gautam wrote:
>> Add support to parse extended configuration space for vfio based
>> assigned PCIe devices and add extended capabilities for the device
>> in the guest. This allows the guest to see and work on extended
>> capabilities, for example to toggle PRI extended cap to enable and
>> disable Shared virtual addressing (SVA) support.
>> PCIe extended capability header that is the first DWORD of all
>> extended caps is shown below -
>>
>>     31               20  19   16  15                 0
>>     ____________________|_______|_____________________
>>    |    Next cap off    |  1h   |     Cap ID          |
>>    |____________________|_______|_____________________|
>>
>> Out of the two upper bytes of extended cap header, the
>> lower nibble is actually cap version - 0x1.
>> 'next cap offset' if present at bits [31:20], should be
>> right shifted by 4 bits to calculate the position of next
>> capability.
>> This change supports parsing and adding ATS, PRI and PASID caps.
>>
>> Signed-off-by: Vivek Gautam <vivek.gautam@arm.com>
>> ---
>>   include/kvm/pci.h |   6 +++
>>   vfio/pci.c        | 104 ++++++++++++++++++++++++++++++++++++++++++----
>>   2 files changed, 103 insertions(+), 7 deletions(-)
>>
>> diff --git a/include/kvm/pci.h b/include/kvm/pci.h
>> index 0f2d5bb..a365337 100644
>> --- a/include/kvm/pci.h
>> +++ b/include/kvm/pci.h
>> @@ -165,6 +165,12 @@ struct pci_exp_cap {
>>      u32 root_status;
>>   };
>>
>> +struct pci_ext_cap_hdr {
>> +    u16     type;
>> +    /* bit 19:16 = 0x1: Cap version */
>
> I believe bits 19:16 are the cap version if you look at the header as a 32bit
> value (next:type). If you actually want to set those bits, you need to set bits
> 3:0 of the next field. I believe the comment should reflect that because it's
> slightly confusing (no field is larger than 16 bits, for example). Or you can move
> the comment at the top of the struct and keep it as it is.

Right, bit [19:16] for u16 would be a wrong interpretation. I will move
this comment to top of the structure.

>
>> +    u16     next;
>> +};
>> +
>>   struct pci_device_header;
>>
>>   typedef int (*bar_activate_fn_t)(struct kvm *kvm,
>> diff --git a/vfio/pci.c b/vfio/pci.c
>> index ea33fd6..d045e0d 100644
>> --- a/vfio/pci.c
>> +++ b/vfio/pci.c
>> @@ -665,19 +665,105 @@ static int vfio_pci_add_cap(struct vfio_device *vdev, u8 *virt_hdr,
>>      return 0;
>>   }
>>
>> +static ssize_t vfio_pci_ext_cap_size(struct pci_ext_cap_hdr *cap_hdr)
>> +{
>> +    switch (cap_hdr->type) {
>> +    case PCI_EXT_CAP_ID_ATS:
>> +            return PCI_EXT_CAP_ATS_SIZEOF;
>> +    case PCI_EXT_CAP_ID_PRI:
>> +            return PCI_EXT_CAP_PRI_SIZEOF;
>> +    case PCI_EXT_CAP_ID_PASID:
>> +            return PCI_EXT_CAP_PASID_SIZEOF;
>> +    default:
>> +            pr_err("unknown extended PCI capability 0x%x", cap_hdr->type);
>> +            return 0;
>> +    }
>> +}
>> +
>> +static int vfio_pci_add_ext_cap(struct vfio_device *vdev, u8 *virt_hdr,
>> +                        struct pci_ext_cap_hdr *cap, off_t pos)
>> +{
>> +    struct pci_ext_cap_hdr *last;
>> +
>> +    cap->next = 0;
>> +    last = PCI_CAP(virt_hdr, 0x100);
>> +
>> +    while (last->next)
>> +            last = PCI_CAP(virt_hdr, last->next);
>> +
>> +    last->next = pos;
>> +    /*
>> +     * Out of the two upper bytes of extended cap header, the
>> +     * nibble [19:16] is actually cap version that should be 0x1,
>> +     * so shift back the actual 'next' value by 4 bits.
>> +     */
>> +    last->next = (last->next << 4) | 0x1;
>> +    memcpy(virt_hdr + pos, cap, vfio_pci_ext_cap_size(cap));
>> +
>> +    return 0;
>> +
>> +}
>> +
>> +static int vfio_pci_parse_ext_caps(struct vfio_device *vdev, u8 *virt_hdr)
>> +{
>> +    int ret;
>> +    u16 pos, next;
>> +    struct pci_ext_cap_hdr *ext_cap;
>> +    struct vfio_pci_device *pdev = &vdev->pci;
>> +
>> +    /* Extended cap only for PCIe devices */
>
> Devices are PCI Express if they have the PCI Express Capability (this is also how
> Linux tells them apart). The arch_has_pci_exp() is meant to check that the
> architecture kvmtool has been compiled for can emulate a PCI Express bus (as
> apposed to a legacy PCI bus). For example, when you compile kvmtool for x86, you
> will get a legacy PCI bus.
>
> I'm not saying the check is bad, because it definitely should be done, but if what
> you're trying to do is to check that the device is a PCI Express capable device,
> then you also need to have a look at the PCI Express Capability like Andre suggested.

Yes, better to check the 'presence' of PCI express caps after reading
the extended configuration space rather than relying on a static check.
I will add as Andre suggested.

>
>> +    if (!arch_has_pci_exp())
>> +            return 0;
>> +
>> +    /* Extended caps start from 0x100 offset. */
>> +    pos = 0x100;
>> +
>> +    for (; pos; pos = next) {
>> +            ext_cap = PCI_CAP(&pdev->hdr, pos);
>> +            /*
>> +             * Out of the two upper bytes of extended cap header, the
>> +             * lowest nibble is actually cap version.
>> +             * 'next cap offset' if present at bits [31:20], while
>> +             * bits [19:16] are set to 1 to indicate cap version.
>> +             * So to get position of next cap right shift by 4 bits.
>> +             */
>> +            next = (ext_cap->next >> 4);
>> +
>> +            switch (ext_cap->type) {
>> +            case PCI_EXT_CAP_ID_ATS:
>> +                    ret = vfio_pci_add_ext_cap(vdev, virt_hdr, ext_cap, pos);
>> +                    if (ret)
>> +                            return ret;
>> +                    break;
>> +            case PCI_EXT_CAP_ID_PRI:
>> +                    ret = vfio_pci_add_ext_cap(vdev, virt_hdr, ext_cap, pos);
>> +                    if (ret)
>> +                            return ret;
>> +                    break;
>> +            case PCI_EXT_CAP_ID_PASID:
>> +                    ret = vfio_pci_add_ext_cap(vdev, virt_hdr, ext_cap, pos);
>> +                    if (ret)
>> +                            return ret;
>> +                    break;
>> +            }
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>>   static int vfio_pci_parse_caps(struct vfio_device *vdev)
>>   {
>>      int ret;
>>      size_t size;
>>      u16 pos, next;
>>      struct pci_cap_hdr *cap;
>> -    u8 virt_hdr[PCI_DEV_CFG_SIZE_LEGACY];
>> +    u8 virt_hdr[PCI_DEV_CFG_SIZE];
>>      struct vfio_pci_device *pdev = &vdev->pci;
>>
>>      if (!(pdev->hdr.status & PCI_STATUS_CAP_LIST))
>>              return 0;
>>
>> -    memset(virt_hdr, 0, PCI_DEV_CFG_SIZE_LEGACY);
>> +    memset(virt_hdr, 0, PCI_DEV_CFG_SIZE);
>>
>>      pos = pdev->hdr.capabilities & ~3;
>>
>> @@ -715,9 +801,13 @@ static int vfio_pci_parse_caps(struct vfio_device *vdev)
>>              }
>>      }
>>
>> +    ret = vfio_pci_parse_ext_caps(vdev, virt_hdr);
>> +    if (ret)
>> +            return ret;
>> +
>>      /* Wipe remaining capabilities */
>>      pos = PCI_STD_HEADER_SIZEOF;
>> -    size = PCI_DEV_CFG_SIZE_LEGACY - PCI_STD_HEADER_SIZEOF;
>> +    size = PCI_DEV_CFG_SIZE - PCI_STD_HEADER_SIZEOF;
>>      memcpy((void *)&pdev->hdr + pos, virt_hdr + pos, size);
>>
>>      return 0;
>> @@ -725,7 +815,7 @@ static int vfio_pci_parse_caps(struct vfio_device *vdev)
>>
>>   static int vfio_pci_parse_cfg_space(struct vfio_device *vdev)
>>   {
>> -    ssize_t sz = PCI_DEV_CFG_SIZE_LEGACY;
>> +    ssize_t sz = PCI_DEV_CFG_SIZE;
>>      struct vfio_region_info *info;
>>      struct vfio_pci_device *pdev = &vdev->pci;
>>
>> @@ -831,10 +921,10 @@ static int vfio_pci_fixup_cfg_space(struct vfio_device *vdev)
>>      /* Install our fake Configuration Space */
>>      info = &vdev->regions[VFIO_PCI_CONFIG_REGION_INDEX].info;
>>      /*
>> -     * We don't touch the extended configuration space, let's be cautious
>> -     * and not overwrite it all with zeros, or bad things might happen.
>> +     * Update the extended configuration space as well since we
>> +     * are now populating the extended capabilities.
>>       */
>> -    hdr_sz = PCI_DEV_CFG_SIZE_LEGACY;
>> +    hdr_sz = PCI_DEV_CFG_SIZE;
>
> In one of the earlier versions of the PCI Express patches I was doing the same
> thing here - overwriting the entire PCI Express configuration space for a device.
> However, that made one of the devices I was using for testing stop working when
> assigned to a VM.
>
> I'll go through my testing notes and test it again, the cause of the failure might
> have been something else entirely which was fixed since then.

Sure. Let me know your findings.

Best regards
Vivek

>
> Thanks,
>
> Alex
>
>>      if (pwrite(vdev->fd, &pdev->hdr, hdr_sz, info->offset) != hdr_sz) {
>>              vfio_dev_err(vdev, "failed to write %zd bytes to Config Space",
>>                           hdr_sz);
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

  reply	other threads:[~2021-09-02 10:48 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-10  6:25 [PATCH] vfio/pci: Add support for PCIe extended capabilities Vivek Gautam
2021-08-31 14:54 ` Will Deacon
2021-09-01 10:27   ` Andre Przywara
2021-09-01 10:44     ` Will Deacon
2021-09-02 10:28       ` Vivek Kumar Gautam
2021-08-31 17:14 ` Andre Przywara
2021-09-02 10:42   ` Vivek Kumar Gautam
2021-09-02  9:59 ` Alexandru Elisei
2021-09-02 10:48   ` Vivek Kumar Gautam [this message]
2021-09-03 15:15     ` Alexandru Elisei
2021-09-08  9:32       ` Vivek Kumar Gautam
2021-09-08  9:36         ` Vivek Kumar Gautam
2021-09-08 16:20           ` Alexandru Elisei
2021-09-08 16:18         ` Alexandru Elisei
2021-10-04  9:47         ` Will Deacon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=51cc78f5-6841-5e83-3b28-bef7fbf68937@arm.com \
    --to=vivek.gautam@arm.com \
    --cc=alexandru.elisei@arm.com \
    --cc=andre.przywara@arm.com \
    --cc=eric.auger@redhat.com \
    --cc=jean-philippe@linaro.org \
    --cc=julien.thierry.kdev@gmail.com \
    --cc=kvm@vger.kernel.org \
    --cc=lorenzo.pieralisi@arm.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.