linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Auger Eric <eric.auger@redhat.com>
To: Robin Murphy <robin.murphy@arm.com>,
	eric.auger.pro@gmail.com, joro@8bytes.org,
	iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org,
	dwmw2@infradead.org, lorenzo.pieralisi@arm.com,
	will.deacon@arm.com, hanjun.guo@linaro.org, sudeep.holla@arm.com
Cc: alex.williamson@redhat.com, shameerali.kolothum.thodi@huawei.com
Subject: Re: [PATCH v3 6/7] iommu: Introduce IOMMU_RESV_DIRECT_RELAXABLE reserved memory regions
Date: Thu, 16 May 2019 15:23:17 +0200	[thread overview]
Message-ID: <57db1955-9d19-7c0b-eca3-37cc0d7d745b@redhat.com> (raw)
In-Reply-To: <ad8a99fa-b98a-14d3-12be-74df0e6eb8f8@arm.com>

Hi Robin,
On 5/16/19 2:46 PM, Robin Murphy wrote:
> On 16/05/2019 11:08, Eric Auger wrote:
>> Introduce a new type for reserved region. This corresponds
>> to directly mapped regions which are known to be relaxable
>> in some specific conditions, such as device assignment use
>> case. Well known examples are those used by USB controllers
>> providing PS/2 keyboard emulation for pre-boot BIOS and
>> early BOOT or RMRRs associated to IGD working in legacy mode.
>>
>> Since commit c875d2c1b808 ("iommu/vt-d: Exclude devices using RMRRs
>> from IOMMU API domains") and commit 18436afdc11a ("iommu/vt-d: Allow
>> RMRR on graphics devices too"), those regions are currently
>> considered "safe" with respect to device assignment use case
>> which requires a non direct mapping at IOMMU physical level
>> (RAM GPA -> HPA mapping).
>>
>> Those RMRRs currently exist and sometimes the device is
>> attempting to access it but this has not been considered
>> an issue until now.
>>
>> However at the moment, iommu_get_group_resv_regions() is
>> not able to make any difference between directly mapped
>> regions: those which must be absolutely enforced and those
>> like above ones which are known as relaxable.
>>
>> This is a blocker for reporting severe conflicts between
>> non relaxable RMRRs (like MSI doorbells) and guest GPA space.
>>
>> With this new reserved region type we will be able to use
>> iommu_get_group_resv_regions() to enumerate the IOVA space
>> that is usable through the IOMMU API without introducing
>> regressions with respect to existing device assignment
>> use cases (USB and IGD).
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>>
>> Note: At the moment the sysfs ABI is not changed. However I wonder
>> whether it wouldn't be preferable to report the direct region as
>> "direct_relaxed" there. At the moment, in case the same direct
>> region is used by 2 devices, one USB/GFX and another not belonging
>> to the previous categories, the direct region will be output twice
>> with "direct" type.
> 
> Hmm, that sounds a bit off - if we have overlapping regions within the
> same domain, then we need to do some additional pre-processing to adjust
> them anyway, since any part of a relaxable region which overlaps a
> non-relaxable region cannot actually be relaxed, and so really should
> never be described as such.
In iommu_insert_resv_region(), we are overlapping regions of the same
type. So iommu_get_group_resv_regions() should return both the relaxable
region and non relaxable one. I should test this again using a hacked
kernel though.
> 
>> This would unblock Shameer's series:
>> [PATCH v6 0/7] vfio/type1: Add support for valid iova list management
>> https://patchwork.kernel.org/patch/10425309/
>>
>> which failed to get pulled for 4.18 merge window due to IGD
>> device assignment regression.
>>
>> v2 -> v3:
>> - fix direct type check
>> ---
>>   drivers/iommu/iommu.c | 12 +++++++-----
>>   include/linux/iommu.h |  6 ++++++
>>   2 files changed, 13 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>> index ae4ea5c0e6f9..28c3d6351832 100644
>> --- a/drivers/iommu/iommu.c
>> +++ b/drivers/iommu/iommu.c
>> @@ -73,10 +73,11 @@ struct iommu_group_attribute {
>>   };
>>     static const char * const iommu_group_resv_type_string[] = {
>> -    [IOMMU_RESV_DIRECT]    = "direct",
>> -    [IOMMU_RESV_RESERVED]    = "reserved",
>> -    [IOMMU_RESV_MSI]    = "msi",
>> -    [IOMMU_RESV_SW_MSI]    = "msi",
>> +    [IOMMU_RESV_DIRECT]            = "direct",
>> +    [IOMMU_RESV_DIRECT_RELAXABLE]        = "direct",
> 
> Wasn't part of the idea that we might not need to expose these to
> userspace at all if they're only relevant to default domains, and not to
> VFIO domains or anything else userspace can get involved with?
Fur sure, today, those regions are exposed as "direct". Isn't this
information relevant to the userspace as it would rather not use those
direct regions. We eventually chose to implement the check inside the
VFIO driver but I understood we wanted this info to be visible. Then it
is arguable whether we should expose the new relaxable type, at the pain
of making the UABI evolve.

> 
>> +    [IOMMU_RESV_RESERVED]            = "reserved",
>> +    [IOMMU_RESV_MSI]            = "msi",
>> +    [IOMMU_RESV_SW_MSI]            = "msi",
>>   };
>>     #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store)        \
>> @@ -573,7 +574,8 @@ static int
>> iommu_group_create_direct_mappings(struct iommu_group *group,
>>           start = ALIGN(entry->start, pg_size);
>>           end   = ALIGN(entry->start + entry->length, pg_size);
>>   -        if (entry->type != IOMMU_RESV_DIRECT)
>> +        if (entry->type != IOMMU_RESV_DIRECT &&
>> +            entry->type != IOMMU_RESV_DIRECT_RELAXABLE)
>>               continue;
>>             for (addr = start; addr < end; addr += pg_size) {
>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>> index ba91666998fb..14a521f85f14 100644
>> --- a/include/linux/iommu.h
>> +++ b/include/linux/iommu.h
>> @@ -135,6 +135,12 @@ enum iommu_attr {
>>   enum iommu_resv_type {
>>       /* Memory regions which must be mapped 1:1 at all times */
>>       IOMMU_RESV_DIRECT,
>> +    /*
>> +     * Memory regions which are advertised to be 1:1 but are
>> +     * commonly considered relaxable in some conditions,
>> +     * for instance in device assignment use case (USB, Graphics)
>> +     */
>> +    IOMMU_RESV_DIRECT_RELAXABLE,
> 
> What do you think of s/RELAXABLE/BOOT/ ? My understanding is that these
> regions are only considered relevant until Linux has taken full control
> of the endpoint, and having a slightly more well-defined scope than
> "some conditions" might be nice.
That's not my current understanding. I think those RMRRs may be used
post-boot (especially the IGD stolen memory covered by RMRR). I
understand this depends on the video mode or FW in use by the IGD. But I
am definitively not an expert here.
> 
> There's an open question of how to handle things like simple-fb and EFI
> framebuffer handover on Arm and other platforms, so I'd really like to
> be able to slot that right into this case as well (how we describe the
> relevant connections in DT/ACPI is another matter, but hey, one step at
> a time...)
> 
> Otherwise though, I too am pleased to see this, thanks! I did start
> having a go myself after talking to Alex at KVM Forum, but I got bogged
> down in the intel-iommu parts and inevitably distracted - patch #7 looks
> beautifully simpler than I'd convinced myself was possible :D
Thank you Robin. Let's see if we can converge ...

Thanks

Eric
> 
> Robin.
> 
>>       /* Arbitrary "never map this or give it to a device" address
>> ranges */
>>       IOMMU_RESV_RESERVED,
>>       /* Hardware MSI region (untranslated) */
>>

  reply	other threads:[~2019-05-16 13:23 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-16 10:08 [PATCH v3 0/7] RMRR related fixes and enhancements Eric Auger
2019-05-16 10:08 ` [PATCH v3 1/7] iommu: Pass a GFP flag parameter to iommu_alloc_resv_region() Eric Auger
2019-05-16 10:08 ` [PATCH v3 2/7] iommu/vt-d: Duplicate iommu_resv_region objects per device list Eric Auger
2019-05-16 10:08 ` [PATCH v3 3/7] iommu/vt-d: Introduce is_downstream_to_pci_bridge helper Eric Auger
2019-05-16 10:08 ` [PATCH v3 4/7] iommu/vt-d: Handle RMRR with PCI bridge device scopes Eric Auger
2019-05-16 10:08 ` [PATCH v3 5/7] iommu/vt-d: Handle PCI bridge RMRR device scopes in intel_iommu_get_resv_regions Eric Auger
2019-05-16 10:08 ` [PATCH v3 6/7] iommu: Introduce IOMMU_RESV_DIRECT_RELAXABLE reserved memory regions Eric Auger
2019-05-16 11:16   ` Jean-Philippe Brucker
2019-05-16 11:45     ` Auger Eric
2019-05-16 12:43       ` Jean-Philippe Brucker
2019-05-16 12:58         ` Auger Eric
2019-05-16 17:06           ` Alex Williamson
2019-05-16 17:23             ` Robin Murphy
2019-05-16 12:46   ` Robin Murphy
2019-05-16 13:23     ` Auger Eric [this message]
2019-05-16 16:53       ` Alex Williamson
2019-05-16 17:53       ` Robin Murphy
2019-05-17  7:11         ` Auger Eric
2019-05-20 12:45         ` Auger Eric
2019-05-16 10:08 ` [PATCH v3 7/7] iommu/vt-d: Differentiate relaxable and non relaxable RMRRs Eric Auger
2019-05-17  4:46   ` Lu Baolu
2019-05-17  7:07     ` Auger Eric

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=57db1955-9d19-7c0b-eca3-37cc0d7d745b@redhat.com \
    --to=eric.auger@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=dwmw2@infradead.org \
    --cc=eric.auger.pro@gmail.com \
    --cc=hanjun.guo@linaro.org \
    --cc=iommu@lists.linux-foundation.org \
    --cc=joro@8bytes.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lorenzo.pieralisi@arm.com \
    --cc=robin.murphy@arm.com \
    --cc=shameerali.kolothum.thodi@huawei.com \
    --cc=sudeep.holla@arm.com \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).