All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
To: Auger Eric <eric.auger@redhat.com>,
	"iommu@lists.linux-foundation.org"
	<iommu@lists.linux-foundation.org>,
	"virtualization@lists.linux-foundation.org"
	<virtualization@lists.linux-foundation.org>,
	"devicetree@vger.kernel.org" <devicetree@vger.kernel.org>
Cc: Mark Rutland <Mark.Rutland@arm.com>,
	"peter.maydell@linaro.org" <peter.maydell@linaro.org>,
	Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com>,
	"tnowicki@caviumnetworks.com" <tnowicki@caviumnetworks.com>,
	"mst@redhat.com" <mst@redhat.com>,
	Marc Zyngier <Marc.Zyngier@arm.com>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	Will Deacon <Will.Deacon@arm.com>,
	"kvmarm@lists.cs.columbia.edu" <kvmarm@lists.cs.columbia.edu>,
	"robh+dt@kernel.org" <robh+dt@kernel.org>,
	Robin Murphy <Robin.Murphy@arm.com>,
	"joro@8bytes.org" <joro@8bytes.org>
Subject: Re: [PATCH v3 0/7] Add virtio-iommu driver
Date: Wed, 17 Oct 2018 12:54:28 +0100	[thread overview]
Message-ID: <40dddb20-6248-5bb0-e0ed-48bacd1867a1@arm.com> (raw)
In-Reply-To: <c7167c51-d864-08f5-c128-f4b3257840ce@redhat.com>

On 16/10/2018 21:31, Auger Eric wrote:
> Hi Jean,
> 
> On 10/16/18 8:44 PM, Jean-Philippe Brucker wrote:
>> On 16/10/2018 10:25, Auger Eric wrote:
>>> Hi Jean,
>>>
>>> On 10/12/18 4:59 PM, Jean-Philippe Brucker wrote:
>>>> Implement the virtio-iommu driver, following specification v0.8 [1].
>>>> Changes since v2 [2]:
>>>>
>>>> * Patches 2-4 allow virtio-iommu to use the PCI transport, since QEMU
>>>>    would like to phase out the MMIO transport. This produces a complex
>>>>    topology where the programming interface of the IOMMU could appear
>>>>    lower than the endpoints that it translates. It's not unheard of (e.g.
>>>>    AMD IOMMU), and the guest easily copes with this.
>>>>    
>>>>    The "Firmware description" section of the specification has been
>>>>    updated with all combinations of PCI, MMIO and DT, ACPI.
>>>
>>> I have a question wrt the FW specification. The IOMMU consumes 1 slot in
>>> the PCI domain and one needs to leave a RID hole in the iommu-map.  It
>>> is not obvious to me that this RID always is predictable given the pcie
>>> enumeration mechanism. Generally we have a coarse grain mapping of RID
>>> onto iommu phandles/STREAMIDs. Here, if I understand correctly we need
>>> to precisely identify the RID granted to the iommu. On QEMU this may
>>> depend on the instantiation order of the virtio-pci device right?
>> 
>> Yes, although it should all happen before you boot the guest, since
>> there is no hotplugging an IOMMU. Could you reserve a PCI slot upfront
>> and use it for virtio-iommu later? Or generate the iommu-map at the same
>> time as generating the child node of the PCI RC?
> 
> Even when cold-plugging the PCIe devices through qemu CLI, this depends
> on the order of the pcie devices in the list I guess. I need to further
> experiment.

Please let me know how it goes. I guess the problem will be the same for
building IORT tables? You're also going to need a hole in the ID
mappings of the PCI root complex node.

>>> So
>>> this does not look trivial to build this info. Isn't it possible to do
>>> this exclusion at kernel level instead?
>> 
>> So in theory VIRTIO_F_IOMMU_PLATFORM already does that:
>> 
>> VIRTIO_F_IOMMU_PLATFORM(33)
>>     This feature indicates that the device is behind an IOMMU that
>>     translates bus addresses from the device into physical addresses in
>>     memory. If this feature bit is set to 0, then the device emits
>>     physical addresses which are not translated further, even though an
>>     IOMMU may be present.
> 
> This tells the driver to use the dma api, right? 

That's how Linux implements the bit, install custom DMA ops when the bit
is absent. But it doesn't work for everyone and has caused a lot of
debate (https://patchwork.ozlabs.org/cover/946708/)

> Effectively this
> explicitly says whether the device is supposed to be upfront an IOMMU.

Yes. It's quite strange if you consider hotpluggable hardware, since
those devices shouldn't get to choose whether they are managed by an
IOMMU. For the IOMMU itself, it should be fine

>> For better or for worse, the guest has to implement it. If this feature
>> bit is unset for virtio-iommu, it does DMA on the physical address
>> space, regardless of what the static topology description says.
>> 
>> In practice it doesn't quite work. If your iommu-map describes the IOMMU
>> as translating itself, Linux' OF code will wait for the IOMMU to be
>> probed before probing the IOMMU. Working around this with hacks is
>> possible, but I don't want to introduce more questionable code to OF and
>> device tree bindings if there is any other way.
> Hum ok. I cannot really comment on this.
> 
> I just wanted to raise this concern about RID identfication.

We can always try. Relaxing iommu-map further would be one additional
patch to Documentation/devicetree/bindings/pci/pci-iommu.txt, and one to
drivers/iommu/of-iommu.c. I'd rather make it a separate RFC.

Since we need acks from an OF maintainer and I'd also like Joerg's
approval for adding a new driver to the IOMMU tree, I think it's too
late for this iteration. I wasn't intending for this to go into 4.20,
just have something to discuss at KVM forum next week.

Thanks,
Jean
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

WARNING: multiple messages have this Message-ID (diff)
From: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
To: Auger Eric <eric.auger@redhat.com>,
	"iommu@lists.linux-foundation.org"
	<iommu@lists.linux-foundation.org>,
	"virtualization@lists.linux-foundation.org" 
	<virtualization@lists.linux-foundation.org>,
	"devicetree@vger.kernel.org" <devicetree@vger.kernel.org>
Cc: "linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	"kvmarm@lists.cs.columbia.edu" <kvmarm@lists.cs.columbia.edu>,
	"peter.maydell@linaro.org" <peter.maydell@linaro.org>,
	"joro@8bytes.org" <joro@8bytes.org>,
	"mst@redhat.com" <mst@redhat.com>,
	"jasowang@redhat.com" <jasowang@redhat.com>,
	"robh+dt@kernel.org" <robh+dt@kernel.org>,
	Mark Rutland <Mark.Rutland@arm.com>,
	"tnowicki@caviumnetworks.com" <tnowicki@caviumnetworks.com>,
	"kevin.tian@intel.com" <kevin.tian@intel.com>,
	Marc Zyngier <Marc.Zyngier@arm.com>,
	Robin Murphy <Robin.Murphy@arm.com>,
	Will Deacon <Will.Deacon@arm.com>,
	Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com>
Subject: Re: [PATCH v3 0/7] Add virtio-iommu driver
Date: Wed, 17 Oct 2018 12:54:28 +0100	[thread overview]
Message-ID: <40dddb20-6248-5bb0-e0ed-48bacd1867a1@arm.com> (raw)
In-Reply-To: <c7167c51-d864-08f5-c128-f4b3257840ce@redhat.com>

On 16/10/2018 21:31, Auger Eric wrote:
> Hi Jean,
> 
> On 10/16/18 8:44 PM, Jean-Philippe Brucker wrote:
>> On 16/10/2018 10:25, Auger Eric wrote:
>>> Hi Jean,
>>>
>>> On 10/12/18 4:59 PM, Jean-Philippe Brucker wrote:
>>>> Implement the virtio-iommu driver, following specification v0.8 [1].
>>>> Changes since v2 [2]:
>>>>
>>>> * Patches 2-4 allow virtio-iommu to use the PCI transport, since QEMU
>>>>    would like to phase out the MMIO transport. This produces a complex
>>>>    topology where the programming interface of the IOMMU could appear
>>>>    lower than the endpoints that it translates. It's not unheard of (e.g.
>>>>    AMD IOMMU), and the guest easily copes with this.
>>>>    
>>>>    The "Firmware description" section of the specification has been
>>>>    updated with all combinations of PCI, MMIO and DT, ACPI.
>>>
>>> I have a question wrt the FW specification. The IOMMU consumes 1 slot in
>>> the PCI domain and one needs to leave a RID hole in the iommu-map.  It
>>> is not obvious to me that this RID always is predictable given the pcie
>>> enumeration mechanism. Generally we have a coarse grain mapping of RID
>>> onto iommu phandles/STREAMIDs. Here, if I understand correctly we need
>>> to precisely identify the RID granted to the iommu. On QEMU this may
>>> depend on the instantiation order of the virtio-pci device right?
>> 
>> Yes, although it should all happen before you boot the guest, since
>> there is no hotplugging an IOMMU. Could you reserve a PCI slot upfront
>> and use it for virtio-iommu later? Or generate the iommu-map at the same
>> time as generating the child node of the PCI RC?
> 
> Even when cold-plugging the PCIe devices through qemu CLI, this depends
> on the order of the pcie devices in the list I guess. I need to further
> experiment.

Please let me know how it goes. I guess the problem will be the same for
building IORT tables? You're also going to need a hole in the ID
mappings of the PCI root complex node.

>>> So
>>> this does not look trivial to build this info. Isn't it possible to do
>>> this exclusion at kernel level instead?
>> 
>> So in theory VIRTIO_F_IOMMU_PLATFORM already does that:
>> 
>> VIRTIO_F_IOMMU_PLATFORM(33)
>>     This feature indicates that the device is behind an IOMMU that
>>     translates bus addresses from the device into physical addresses in
>>     memory. If this feature bit is set to 0, then the device emits
>>     physical addresses which are not translated further, even though an
>>     IOMMU may be present.
> 
> This tells the driver to use the dma api, right? 

That's how Linux implements the bit, install custom DMA ops when the bit
is absent. But it doesn't work for everyone and has caused a lot of
debate (https://patchwork.ozlabs.org/cover/946708/)

> Effectively this
> explicitly says whether the device is supposed to be upfront an IOMMU.

Yes. It's quite strange if you consider hotpluggable hardware, since
those devices shouldn't get to choose whether they are managed by an
IOMMU. For the IOMMU itself, it should be fine

>> For better or for worse, the guest has to implement it. If this feature
>> bit is unset for virtio-iommu, it does DMA on the physical address
>> space, regardless of what the static topology description says.
>> 
>> In practice it doesn't quite work. If your iommu-map describes the IOMMU
>> as translating itself, Linux' OF code will wait for the IOMMU to be
>> probed before probing the IOMMU. Working around this with hacks is
>> possible, but I don't want to introduce more questionable code to OF and
>> device tree bindings if there is any other way.
> Hum ok. I cannot really comment on this.
> 
> I just wanted to raise this concern about RID identfication.

We can always try. Relaxing iommu-map further would be one additional
patch to Documentation/devicetree/bindings/pci/pci-iommu.txt, and one to
drivers/iommu/of-iommu.c. I'd rather make it a separate RFC.

Since we need acks from an OF maintainer and I'd also like Joerg's
approval for adding a new driver to the IOMMU tree, I think it's too
late for this iteration. I wasn't intending for this to go into 4.20,
just have something to discuss at KVM forum next week.

Thanks,
Jean

  reply	other threads:[~2018-10-17 11:54 UTC|newest]

Thread overview: 102+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-12 14:59 [PATCH v3 0/7] Add virtio-iommu driver Jean-Philippe Brucker
2018-10-12 14:59 ` Jean-Philippe Brucker
2018-10-12 14:59 ` [PATCH v3 1/7] dt-bindings: virtio-mmio: Add IOMMU description Jean-Philippe Brucker
2018-10-12 14:59 ` Jean-Philippe Brucker
2018-10-12 14:59   ` Jean-Philippe Brucker
2018-10-18  0:30   ` Rob Herring
     [not found]   ` <20181012145917.6840-2-jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
2018-10-18  0:30     ` Rob Herring
2018-10-18  0:30       ` Rob Herring
2018-11-15  8:45   ` Auger Eric
2018-11-15  8:45     ` Auger Eric
2018-10-12 14:59 ` [PATCH v3 2/7] dt-bindings: virtio: Add virtio-pci-iommu node Jean-Philippe Brucker
2018-10-12 14:59 ` Jean-Philippe Brucker
2018-10-12 14:59   ` Jean-Philippe Brucker
2018-10-18  0:35   ` Rob Herring
2018-10-18  0:35     ` Rob Herring
2018-10-18  0:35   ` Rob Herring
2018-11-15  8:45   ` Auger Eric
2018-11-15  8:45   ` Auger Eric
2018-11-15  8:45     ` Auger Eric
2018-10-12 14:59 ` [PATCH v3 3/7] PCI: OF: Allow endpoints to bypass the iommu Jean-Philippe Brucker
2018-10-12 14:59 ` Jean-Philippe Brucker
2018-10-12 14:59   ` Jean-Philippe Brucker
     [not found]   ` <20181012145917.6840-4-jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
2018-10-12 19:41     ` Bjorn Helgaas
2018-10-12 19:41       ` Bjorn Helgaas
2018-10-15 10:52       ` Michael S. Tsirkin
2018-10-15 11:32       ` Robin Murphy
     [not found]       ` <20181012194158.GX5906-1RhO1Y9PlrlHTL0Zs8A6p5iNqAH0jzoTYJqu5kTmcBRl57MIdRCFDg@public.gmane.org>
2018-10-15 10:52         ` Michael S. Tsirkin
2018-10-15 10:52           ` Michael S. Tsirkin
     [not found]           ` <20181015065024-mutt-send-email-mst-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2018-10-15 19:46             ` Jean-philippe Brucker
2018-10-15 19:46               ` Jean-philippe Brucker
2018-10-17 15:14               ` Michael S. Tsirkin
2018-10-17 15:14                 ` Michael S. Tsirkin
2018-10-18 10:47                 ` Robin Murphy
2018-10-18 10:47                 ` Robin Murphy
2018-10-18 10:47                   ` Robin Murphy
     [not found]                 ` <20181017111100-mutt-send-email-mst-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2018-10-22 11:27                   ` Jean-Philippe Brucker
2018-10-22 11:27                     ` Jean-Philippe Brucker
2018-10-22 11:27                 ` Jean-Philippe Brucker
2018-10-15 11:32         ` Robin Murphy
2018-10-15 11:32           ` Robin Murphy
2018-10-15 19:45         ` Jean-philippe Brucker
2018-10-15 19:45           ` Jean-philippe Brucker
2018-10-12 14:59 ` [PATCH v3 4/7] PCI: OF: Initialize dev->fwnode appropriately Jean-Philippe Brucker
2018-10-12 14:59   ` Jean-Philippe Brucker
     [not found]   ` <20181012145917.6840-5-jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
2018-10-12 19:44     ` Bjorn Helgaas
2018-10-12 19:44       ` Bjorn Helgaas
2018-10-12 14:59 ` Jean-Philippe Brucker
2018-10-12 14:59 ` [PATCH v3 5/7] iommu: Add virtio-iommu driver Jean-Philippe Brucker
2018-10-12 14:59   ` Jean-Philippe Brucker
2018-10-12 16:35   ` Michael S. Tsirkin
2018-10-12 16:35   ` Michael S. Tsirkin
2018-10-12 16:35     ` Michael S. Tsirkin
     [not found]     ` <20181012120953-mutt-send-email-mst-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2018-10-12 18:54       ` Jean-Philippe Brucker
2018-10-12 18:54         ` Jean-Philippe Brucker
2018-10-12 18:54     ` Jean-Philippe Brucker
2018-11-08 14:51     ` Auger Eric
2018-11-08 14:51       ` Auger Eric
2018-11-08 16:46       ` Jean-Philippe Brucker
2018-11-08 16:46         ` Jean-Philippe Brucker
2018-11-08 16:46       ` Jean-Philippe Brucker
2018-11-08 14:51     ` Auger Eric
2018-10-12 14:59 ` Jean-Philippe Brucker
2018-10-12 14:59 ` [PATCH v3 6/7] iommu/virtio: Add probe request Jean-Philippe Brucker
2018-10-12 14:59   ` Jean-Philippe Brucker
2018-10-12 16:42   ` Michael S. Tsirkin
2018-10-12 16:42     ` Michael S. Tsirkin
2018-11-08 14:48   ` Auger Eric
     [not found]   ` <20181012145917.6840-7-jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
2018-11-08 14:48     ` Auger Eric
2018-11-08 14:48       ` Auger Eric
     [not found]       ` <295d30bb-5aef-2727-01c0-ec10c7a8fa8c-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2018-11-08 16:46         ` Jean-Philippe Brucker
2018-11-08 16:46           ` Jean-Philippe Brucker
2018-11-08 16:46       ` Jean-Philippe Brucker
2018-11-15 13:20     ` Auger Eric
2018-11-15 13:20       ` Auger Eric
2018-11-15 16:22       ` Jean-Philippe Brucker
2018-11-15 16:22         ` Jean-Philippe Brucker
2018-11-15 16:22       ` Jean-Philippe Brucker
2018-11-15 13:20   ` Auger Eric
2018-10-12 14:59 ` Jean-Philippe Brucker
2018-10-12 14:59 ` [PATCH v3 7/7] iommu/virtio: Add event queue Jean-Philippe Brucker
2018-10-12 14:59 ` Jean-Philippe Brucker
2018-10-12 14:59   ` Jean-Philippe Brucker
2018-10-12 17:00 ` [PATCH v3 0/7] Add virtio-iommu driver Michael S. Tsirkin
     [not found] ` <20181012145917.6840-1-jean-philippe.brucker-5wv7dgnIgG8@public.gmane.org>
2018-10-12 17:00   ` Michael S. Tsirkin
2018-10-12 17:00     ` Michael S. Tsirkin
2018-10-12 18:55     ` Jean-Philippe Brucker
     [not found]     ` <20181012125443-mutt-send-email-mst-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2018-10-12 18:55       ` Jean-Philippe Brucker
2018-10-12 18:55         ` Jean-Philippe Brucker
2018-10-16  9:25   ` Auger Eric
2018-10-16  9:25     ` Auger Eric
2018-10-16 18:44     ` Jean-Philippe Brucker
2018-10-16 18:44       ` Jean-Philippe Brucker
2018-10-16 20:31       ` Auger Eric
2018-10-16 20:31         ` Auger Eric
2018-10-17 11:54         ` Jean-Philippe Brucker [this message]
2018-10-17 11:54           ` Jean-Philippe Brucker
2018-10-17 15:23           ` Michael S. Tsirkin
2018-10-17 15:23             ` Michael S. Tsirkin
2018-10-17 15:23           ` Michael S. Tsirkin
2018-10-16 18:44     ` Jean-Philippe Brucker
2018-10-16  9:25 ` Auger Eric
2018-10-12 14:59 Jean-Philippe Brucker

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=40dddb20-6248-5bb0-e0ed-48bacd1867a1@arm.com \
    --to=jean-philippe.brucker@arm.com \
    --cc=Lorenzo.Pieralisi@arm.com \
    --cc=Marc.Zyngier@arm.com \
    --cc=Mark.Rutland@arm.com \
    --cc=Robin.Murphy@arm.com \
    --cc=Will.Deacon@arm.com \
    --cc=devicetree@vger.kernel.org \
    --cc=eric.auger@redhat.com \
    --cc=iommu@lists.linux-foundation.org \
    --cc=joro@8bytes.org \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=linux-pci@vger.kernel.org \
    --cc=mst@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=robh+dt@kernel.org \
    --cc=tnowicki@caviumnetworks.com \
    --cc=virtualization@lists.linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.