All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Jan Beulich" <JBeulich@suse.com>
To: Roger Pau Monne <roger.pau@citrix.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>,
	Wei Liu <wei.liu2@citrix.com>,
	George Dunlap <George.Dunlap@eu.citrix.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Ian Jackson <Ian.Jackson@eu.citrix.com>, Tim Deegan <tim@xen.org>,
	Julien Grall <julien.grall@arm.com>,
	xen-devel <xen-devel@lists.xenproject.org>
Subject: Re: [PATCH v4 4/4] x86/iommu: add reserved dom0-iommu option to map reserved memory ranges
Date: Thu, 09 Aug 2018 01:36:57 -0600	[thread overview]
Message-ID: <5B6BEF1902000078001DC4F8@prv1-mh.provo.novell.com> (raw)
In-Reply-To: <20180808161826.pwqeud7ix3vzvaum@mac.bytemobile.com>

>>> On 08.08.18 at 18:18, <roger.pau@citrix.com> wrote:
> On Wed, Aug 08, 2018 at 07:15:54AM -0600, Jan Beulich wrote:
>> >>> On 08.08.18 at 12:07, <roger.pau@citrix.com> wrote:
>> > Several people have reported hardware issues (malfunctioning USB
>> > controllers) due to iommu page faults on Intel hardware. Those faults
>> > are caused by missing RMRR (VTd) entries in the ACPI tables. Those can
>> > be worked around on VTd hardware by manually adding RMRR entries on
>> > the command line, this is however limited to Intel hardware and quite
>> > cumbersome to do.
>> > 
>> > In order to solve those issues add a new dom0-iommu=reserved option
>> > that identity maps all regions marked as reserved in the memory map.
>> 
>> Considering that report we've had (yesterday? earlier today?), don't
>> we need to go further and use the union of RMRRs and reserved
>> regions? Iirc they had a case where an RMRR was not in a reserved
>> region ...
> 
> AFAICT (and I could be reading the code wrong) RMRR regions not on
> reserved regions are still correctly mapped to the guest.

Oh, yes, you're right - the logged message is really just that,
with no other enforced effects.

>> > --- a/docs/misc/xen-command-line.markdown
>> > +++ b/docs/misc/xen-command-line.markdown
>> > @@ -1205,7 +1205,7 @@ detection of systems known to misbehave upon accesses 
>> > to that port.
>> >  >> Enable IOMMU debugging code (implies `verbose`).
>> >  
>> >  ### dom0-iommu
>> > -> `= List of [ none | strict | relaxed | inclusive ]`
>> > +> `= List of [ none | strict | relaxed | inclusive | reserved ]`
>> >  
>> >  * `none`: disables DMA remapping for Dom0.
>> >  
>> > @@ -1233,6 +1233,15 @@ meaning:
>> >    option is only applicable to a PV Dom0 and is enabled by default on Intel
>> >    hardware.
>> >  
>> > +* `reserved`: sets up DMA remapping for all the reserved regions in the  memory
>> > +  map for Dom0. Use this to work around firmware issues providing incorrect
>> > +  RMRR/IVMD entries. Rather than only mapping RAM pages for IOMMU accesses
>> > +  for Dom0, all memory regions marked as reserved in the memory map that don't
>> > +  overlap with any MMIO region from emulated devices will be identity mapped.
>> > +  This option maps a subset of the memory that would be mapped when using the
>> > +  `inclusive` option. This option is available to a PVH Dom0 and is enabled by
>> > +  default on Intel hardware.
>> 
>> The sub-options so far were quite clear in their meanings, but
>> "dom0-iommu=reserved" might mean all sorts of things to me, but quite
>> certainly not "map all reserved regions". "map-reserved" perhaps?
> 
> Then we should also have 'map-inclusive' for symmetry IMO.

I don't mind at all such symmetry to be established.

>> > @@ -134,13 +135,67 @@ void arch_iommu_domain_destroy(struct domain *d)
>> >  {
>> >  }
>> >  
>> > +static bool __hwdom_init hwdom_iommu_map(const struct domain *d, unsigned long pfn,
>> > +                                         unsigned long max_pfn)
>> > +{
>> > +    unsigned int i;
>> > +
>> > +    /*
>> > +     * Ignore any address below 1MB, that's already identity mapped by the
>> > +     * domain builder for HVM.
>> > +     */
>> > +    if ( (is_hvm_domain(d) && pfn < PFN_DOWN(MB(1))) ||
>> 
>> Careful again here with the distinction between Dom0 and hwdom:
>> The domain builder you refer to is - aiui - the in-Xen one, i.e. the
>> one _only_ dealing with Dom0.
> 
> So this should instead be:
> 
> if ( (is_control_domain(d) && is_hvm_domain(d) && pfn < PFN_DOWN(MB(1))) ||

From an abstract perspective (long term plans) is_control_domain()
can be true for multiple domains, none of which may be Dom0 or
hwdom. So no, I don't think the addition would help in any way.
With the reference to the in-Xen domain builder, I think you really
mean Dom0 here.

>> > +    /*
>> > +     * If dom0-strict mode is enabled or the guest type is PVH/HVM then exclude
>> > +     * conventional RAM and let the common code map dom0's pages.
>> > +     */
>> > +    if ( page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) &&
>> > +         (iommu_dom0_strict || is_hvm_domain(d)) )
>> > +        return false;
>> > +    if ( page_is_ram_type(pfn, RAM_TYPE_RESERVED) &&
>> > +         !iommu_dom0_reserved && !iommu_dom0_inclusive )
>> > +        return false;
>> > +    if ( !page_is_ram_type(pfn, RAM_TYPE_UNUSABLE) &&
>> > +         !page_is_ram_type(pfn, RAM_TYPE_RESERVED) &&
>> > +         !page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) &&
>> > +         (!iommu_dom0_inclusive || pfn > max_pfn) )
>> > +        return false;
>> 
>> As page_is_ram_type() is, especially on systems with many E820
>> entries, not really cheap, I think at least a minimum amount of
>> optimization is on order here - after all you do this for every
>> single page of the system. Hence minimally the result of the first
>> CONVENTIONAL and RESERVED queries should be latched and
>> re-used (or "else" be used suitably). Ideally an approach would
>> be used which involved just a single iteration through the E820
>> map, but I realize this may be more than is feasible here.
> 
> This would be quite better if I could just fetch the type, then I
> could add a switch (type) { ... and it would be better IMO.

Except that, as said, there's no "the" type for an entire page.
Only a single byte can have an exact type.

>> Furthermore I'm unconvinced the !page_is_ram_type() uses
>> are really what you want: The function returning false means
>> "don't know", not "no". Perhaps the function needs to be made
>> return a tristate (yes, no, or part of it).
> 
> Right, that's why I have three different !page_is_ram_type checks in
> the last branch of the if, so that I can make sure it's not one of the
> previous types and also account for holes.

I'm afraid I don't understand. Take the example of a single
page being split in an unusable and a reserved part. Both
respective function invocations will return false. Yet you
want to exclude both unusable and reserved types when
!iommu_dom0_inclusive, and hence your goal would be to
exclude that page here.

As to unusable - don't you break original behavior here
anyway? Shouldn't the function return false when
page_is_ram_type(pfn, RAM_TYPE_UNUSABLE), effectively
independent of any command line options?

>> > +    /* Check that it doesn't overlap with the LAPIC */
>> > +    if ( has_vlapic(d) )
>> > +    {
>> > +        const struct vcpu *v;
>> > +
>> > +        for_each_vcpu(d, v)
>> > +            if ( pfn == PFN_DOWN(vlapic_base_address(vcpu_vlapic(v))) )
>> > +                return false;
>> > +    }
>> 
>> I don't, btw, recall any code adjusting the IOMMU mappings in
>> case the domain relocates its LAPIC. If you do the check above,
>> wouldn't that other side too need taking care of?
> 
> Yes. I can add something later but this is already an issue if the
> guest for example relocates the LAPIC over a RAM region.

Well, I'm fine leaving out parts that are currently broken anyway,
but please leave at least a TODO note just like you do for MCFG.
Or (for both) don't do anything here until the situation gets taken
care of uniformly (but presumably that's the worse option).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

  reply	other threads:[~2018-08-09  7:37 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-08 10:07 [PATCH v4 0/4] 86/iommu: PVH Dom0 workarounds for missing RMRR entries Roger Pau Monne
2018-08-08 10:07 ` [PATCH v4 1/4] iommu: introduce dom0-iommu option Roger Pau Monne
2018-08-08 12:10   ` Jan Beulich
2018-08-08 15:50     ` Roger Pau Monné
2018-08-09  7:00       ` Jan Beulich
2018-08-09 10:01         ` Roger Pau Monné
2018-08-09 10:29           ` Jan Beulich
2018-08-09 10:51             ` Roger Pau Monné
2018-08-09 11:46               ` Jan Beulich
2018-08-09 10:18         ` Paul Durrant
2018-08-08 10:07 ` [PATCH v4 2/4] iommu: make iommu_inclusive_mapping a suboption of dom0-iommu Roger Pau Monne
2018-08-08 12:32   ` Jan Beulich
2018-08-08 16:09     ` Roger Pau Monné
2018-08-09  7:17       ` Jan Beulich
2018-08-08 10:07 ` [PATCH v4 3/4] dom0/pvh: change the order of the MMCFG initialization Roger Pau Monne
2018-08-08 12:35   ` Jan Beulich
2018-08-10 10:04     ` Roger Pau Monné
2018-08-10 11:41       ` Jan Beulich
2018-08-08 10:07 ` [PATCH v4 4/4] x86/iommu: add reserved dom0-iommu option to map reserved memory ranges Roger Pau Monne
2018-08-08 13:15   ` Jan Beulich
2018-08-08 16:18     ` Roger Pau Monné
2018-08-09  7:36       ` Jan Beulich [this message]
2018-08-09 10:23         ` Roger Pau Monné
2018-08-09 10:33           ` Jan Beulich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5B6BEF1902000078001DC4F8@prv1-mh.provo.novell.com \
    --to=jbeulich@suse.com \
    --cc=George.Dunlap@eu.citrix.com \
    --cc=Ian.Jackson@eu.citrix.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=julien.grall@arm.com \
    --cc=roger.pau@citrix.com \
    --cc=sstabellini@kernel.org \
    --cc=tim@xen.org \
    --cc=wei.liu2@citrix.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.