xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: "Roger Pau Monné" <roger.pau@citrix.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: Juergen Gross <jgross@suse.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>, Wei Liu <wl@xen.org>,
	xen-devel@lists.xenproject.org
Subject: Re: [Xen-devel] [PATCH v2 1/2] x2APIC: translate IO-APIC entries when enabling the IOMMU
Date: Fri, 11 Oct 2019 12:36:53 +0200	[thread overview]
Message-ID: <20191011103653.GK1389@Air-de-Roger.citrite.net> (raw)
In-Reply-To: <a35f02d7-d928-e950-e7b5-bcf2e176655f@suse.com>

On Fri, Oct 11, 2019 at 12:01:36PM +0200, Jan Beulich wrote:
> On 11.10.2019 11:28, Roger Pau Monné  wrote:
> > On Fri, Oct 11, 2019 at 09:52:35AM +0200, Jan Beulich wrote:
> >> On 10.10.2019 17:19, Roger Pau Monné  wrote:
> >>> On Thu, Oct 10, 2019 at 03:46:45PM +0200, Jan Beulich wrote:
> >>>> On 10.10.2019 15:12, Roger Pau Monné  wrote:
> >>>>> On Thu, Oct 10, 2019 at 02:55:02PM +0200, Jan Beulich wrote:
> >>>>>> On 10.10.2019 14:13, Roger Pau Monné  wrote:
> >>>>>>> On Thu, Oct 10, 2019 at 01:54:06PM +0200, Jan Beulich wrote:
> >>>>>>>> On 10.10.2019 13:33, Roger Pau Monne wrote:
> >>>>>>>>> When interrupt remapping is enabled as part of enabling x2APIC the
> >>>>>>>>
> >>>>>>>> Perhaps "unmasked" instead of "the"?
> >>>>>>>>
> >>>>>>>>> IO-APIC entries also need to be translated to the new format and added
> >>>>>>>>> to the interrupt remapping table.
> >>>>>>>>>
> >>>>>>>>> This prevents IOMMU interrupt remapping faults when booting on
> >>>>>>>>> hardware that has unmasked IO-APIC pins.
> >>>>>>>>
> >>>>>>>> But in the end it only papers over whatever the spurious interrupts
> >>>>>>>> result form, doesn't it? Which isn't to say this isn't an
> >>>>>>>> improvement. Calling out the ExtInt case here may be worthwhile as
> >>>>>>>> well, as would be pointing out that this case still won't work on
> >>>>>>>> AMD IOMMUs.
> >>>>>>>
> >>>>>>> But the fix for the ExtINT AMD issue should be done in
> >>>>>>> amd_iommu_ioapic_update_ire then, so that it can properly handle
> >>>>>>> ExtINT delivery mode, not to this part of the code. I will look
> >>>>>>> into it, but I think it's kind of tangential to the issue here.
> >>>>>>
> >>>>>> I'm not talking of you working on fixing this right away. I'm merely
> >>>>>> asking that you mention here (a) the ExtInt special case and (b)
> >>>>>> that this special case will (continue to) not work in the AMD case.
> >>>>>>
> >>>>>>>>> --- a/xen/arch/x86/apic.c
> >>>>>>>>> +++ b/xen/arch/x86/apic.c
> >>>>>>>>> @@ -515,7 +515,7 @@ static void resume_x2apic(void)
> >>>>>>>>>      iommu_enable_x2apic();
> >>>>>>>>>      __enable_x2apic();
> >>>>>>>>>  
> >>>>>>>>> -    restore_IO_APIC_setup(ioapic_entries);
> >>>>>>>>> +    restore_IO_APIC_setup(ioapic_entries, true);
> >>>>>>>>>      unmask_8259A();
> >>>>>>>>>  
> >>>>>>>>>  out:
> >>>>>>>>> @@ -961,7 +961,12 @@ void __init x2apic_bsp_setup(void)
> >>>>>>>>>          printk("Switched to APIC driver %s\n", genapic.name);
> >>>>>>>>>  
> >>>>>>>>>  restore_out:
> >>>>>>>>> -    restore_IO_APIC_setup(ioapic_entries);
> >>>>>>>>> +    /*
> >>>>>>>>> +     * NB: do not use raw mode when restoring entries if the iommu has been
> >>>>>>>>> +     * enabled during the process, because the entries need to be translated
> >>>>>>>>> +     * and added to the remapping table in that case.
> >>>>>>>>> +     */
> >>>>>>>>> +    restore_IO_APIC_setup(ioapic_entries, !x2apic_enabled);
> >>>>>>>>
> >>>>>>>> How is this different in the resume_x2apic() case? The IOMMU gets
> >>>>>>>> enabled in the course of that as well. I.e. I'd expect you want
> >>>>>>>> to pass "false" there, not "true".
> >>>>>>>
> >>>>>>> In the resume_x2apic case interrupt remapping should already be
> >>>>>>> enabled or not, but that function is not going to enable interrupt
> >>>>>>> remapping if it wasn't enabled before, hence the IO-APIC entries
> >>>>>>> should already be using the interrupt remapping table and no
> >>>>>>> translation is needed.
> >>>>>>
> >>>>>> Who / what would have enabled the IOMMU in the resume case?
> >>>>>
> >>>>> I don't think the question is who enables interrupt remapping in the
> >>>>> resume case (which is resume_x2apic when calling iommu_enable_x2apic
> >>>>> AFAICT), the point here is that on resume the entries in the IO-APIC
> >>>>> will already match the state of interrupt remapping, so they shouldn't
> >>>>> be translated. If interrupt remapping was off before suspend it will
> >>>>> still be off after resume, and there won't be any translation needed.
> >>>>> The same is true if interrupt remapping is on before suspend.
> >>>>
> >>>> I disagree: save_IO_APIC_setup() gets called from resume_x2apic(),
> >>>> not prior to suspend.
> >>>
> >>> Oh, so maybe that's a misunderstanding on my side. I don't seem to be
> >>> able to find a statement about the contents of the IO-APIC registers
> >>> (and more specifically the entries) when getting back from
> >>> suspension. Are all entries cleared and masked?
> >>>
> >>> Are the values previous to suspension stored?
> >>
> >> See ioapic_suspend() / ioapic_resume(): Looks like there's some
> >> redundancy here - I don't think it makes sense for the LAPIC
> >> code to fiddle with all the RTEs if subsequently (I assume;
> >> didn't check) they'll all be overwritten anyway. I would seem
> >> more logical to me if they'd just all get masked for IOMMU
> >> enabling, deferring to ioapic_resume() for everything else.
> > 
> > Yes, it seems like resume_x2apic shouldn't need to play with the
> > entries at all TBH, just enabling interrupt remapping should be enough
> > given that the entries are already save and restored by
> > ioapic_{suspend/resume}.
> > 
> > Would you agree to leave that as-is in this patch, ie: always pass
> > true to restore_IO_APIC_setup in resume_x2apic?
> 
> Properly explained in the description, that's an option. Even better
> would of course be to do away with the unnecessary save/restore in
> a prereq patch, at which point the question disappears as to what to
> pass to the function at that point.

OK, I can create a pre-patch, but I don't think I have any hardware
that does suspend/resume since it's all server-grade. Someone would
have to test it and assert it actually works correctly, hence my
reluctance to touch any of the suspend/resume logic.

> >>> But it's simply not possible to reach the call to
> >>> restore_IO_APIC_setup with x2apic_enabled == true and interrupt
> >>> remapping disabled, regardless of the initial value of
> >>> x2apic_enabled.
> >>>
> >>> All the paths that could lead to this scenario are short-circuited
> >>> above with a panic.
> >>
> >> Hmm, true. Nevertheless it would feel better if the conditionals
> >> were using what actually matters, rather than something derived.
> > 
> > Would you be OK to using something like v1:
> > 
> > https://lists.xenproject.org/archives/html/xen-devel/2019-10/msg00805.html
> > 
> > See the usage of iommu_enabled in x2apic_bsp_setup. I think using
> > x2apic_enabled is fine given the logic in the function, and the fact
> > that x2APIC mandates interrupt remapping, but I could live with that
> > extra local variable.
> 
> Conceptually this looks better to me. I'd prefer to avoid shadowing
> the global "iommu_enabled" though, and set the local variable to
> "true" only one we actually enable the IOMMU. Also strictly speaking
> you care about the enabling of interrupt remapping, not that of the
> IOMMU (which implicitly means DMA remapping). I wonder whether the
> code couldn't easily be made cope with IOMMU already being enabled,
> assuming that this would lead to {iommu,intremap}_enabled to already
> be true on entry. I.e. you'd request a "raw" restore unless
> intremap_enabled changed from false to true.

I'm not sure it's that easy. A raw restore wouldn't fully work, since
Xen would have replaced the interrupt remapping table with an empty
one, so Xen would have to add the entries for the unmasked IO-APIC
pins to the IRT and then likely also update the IO-APIC pin entry to
match the newly created IRTE.

I think adding a comment at the top of the function to note that this
won't work properly if there are unmasked IO-APIC pins and interrupt
remapping enabled should be enough for now, as the patch doesn't make
this case worse than it's current state.

> >>>> I realize that io_apic_write() would suitably avoid going
> >>>> the remapping path, but I think it would be more clear if the
> >>>> distinction was already made properly at the call site.
> >>>
> >>> I'm afraid I'm slightly loss, do you mean to replace the
> >>> ioapic_write_entry with an io_apic_write in restore_IO_APIC_setup?
> >>
> >> No, because of ...
> >>
> >>> That would be the same as always passing raw == false AFAICT.
> >>
> >> ... this. I'm asking to pass in the argument for "raw" based
> >> on what you want, without relying on io_apic_write()'s
> >> behavior. That's more for code clarity than actual correctness,
> >> since - as said - io_apic_write() would invoke __io_apic_write()
> >> anyway.
> > 
> > Sorry, but I'm afraid I still don't fully understand what you mean
> > with this. restore_IO_APIC_setup uses ioapic_write_entry which in turn
> > uses __io_apic_write or io_apic_write. I could get rid of the usage of
> > ioapic_write_entry and just call __io_apic_write or io_apic_write (or
> > even directly call iommu_update_ire_from_apic), but I'm not sure
> > what's the benefit of any of this, it's just open-coding logic that's
> > done in the helpers.
> > 
> > Would you rather mean to rename the parameter of restore_IO_APIC_setup
> > from 'raw' to 'translate' or some such in order to clearly denote
> > there's a transformation done before writing the entry?
> 
> Whether it's "raw" or "translate" doesn't really matter to me.
> What I'm after is for you to not pass raw=true when you really
> mean raw=false (or the other way around), relying on the
> intremap_enabled check down the call chain in io_apic_write().

I don't think that's the case with my proposed code (or at least not
the intention), when raw=false is passed that's because the entries
_must_ be translated, not because x2apic_bsp_setup is relying on the
value of intremap_enabled.

Let me prepare a new series with the proposed changes and we can
continue from there.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

  reply	other threads:[~2019-10-11 10:37 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-10 11:33 [Xen-devel] [PATCH v2 0/2] iommu: fixes for interrupt remapping enabling Roger Pau Monne
2019-10-10 11:33 ` [Xen-devel] [PATCH v2 1/2] x2APIC: translate IO-APIC entries when enabling the IOMMU Roger Pau Monne
2019-10-10 11:54   ` Jan Beulich
2019-10-10 12:13     ` Roger Pau Monné
2019-10-10 12:55       ` Jan Beulich
2019-10-10 13:12         ` Roger Pau Monné
2019-10-10 13:46           ` Jan Beulich
2019-10-10 15:19             ` Roger Pau Monné
2019-10-11  7:52               ` Jan Beulich
2019-10-11  9:28                 ` Roger Pau Monné
2019-10-11 10:01                   ` Jan Beulich
2019-10-11 10:36                     ` Roger Pau Monné [this message]
2019-10-10 13:29         ` Roger Pau Monné
2019-10-10 13:53           ` Jan Beulich
2019-10-10 11:33 ` [Xen-devel] [PATCH v2 2/2] iommu: translate IO-APIC pins when enabling interrupt remapping Roger Pau Monne
2019-10-10 11:59   ` Jan Beulich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191011103653.GK1389@Air-de-Roger.citrite.net \
    --to=roger.pau@citrix.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=jbeulich@suse.com \
    --cc=jgross@suse.com \
    --cc=wl@xen.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).