xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Scott Davis <scott.davis@starlab.io>
To: "paul@xen.org" <paul@xen.org>, Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
	George Dunlap <george.dunlap@citrix.com>,
	Ian Jackson <ian.jackson@eu.citrix.com>,
	Julien Grall <julien@xen.org>,
	Stefano Stabellini <sstabellini@kernel.org>, Wei Liu <wl@xen.org>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>
Subject: Re: [RFC PATCH] iommu: make no-quarantine mean no-quarantine
Date: Thu, 29 Apr 2021 21:04:58 +0000	[thread overview]
Message-ID: <2A65573D-2DD6-49C7-B34C-27B15FD620B2@starlab.io> (raw)
In-Reply-To: <dc6556ae-c653-8519-1a81-9524e4472f26@xen.org>

On 4/28/21, 3:20 AM, Paul Durrant wrote:
>> Following the extension to the command line option I'm putting in place
>> in "IOMMU: make DMA containment of quarantined devices optional" (which
>> I still need to get around to address review feedback for and resubmit),
>> I'd be inclined to suggest "iommu=quarantine=always" or
>> "iommu=quarantine=on-assign". Unless of course we'd prefer to have the
>> caller of the assignment operation have full control over the behavior
>> here anyway (in which case a command line option control simply is not
>> necessary).
>
> I'm still not entirely sure why not quarantining on is a problem, other
> than it triggering an as-yet undiagnosed issue in QEMU, but I agree that
> that the expectation of 'no-quarantine' meaning just that (i.e. the old
> dom0->domU and domU->dom0 transitions are re-instated) is reasonable. Do
> we really want yet more command line options?

Regarding the problem in QEMU, I traced the crash trigger down to a
write to the IQ tail register during the mapping operation into dom_io
(backtrace below). Along the way I noticed that, since a non-present
entry was being flushed, flush_context_qi only performs this
invalidation on an IOMMU with caching mode enabled (i.e. a software
IOMMU). Therefore this issue is probably only hittable when nesting.
Disabling caching mode on the QEMU vIOMMU was enough to prevent the
crash and give me a working system.

(gdb) si
0xffff82d04025b68b  72  in qinval.c
   0xffff82d04025b687 <qinval_update_qtail+43>: ... shl    $0x4,%r12
=> 0xffff82d04025b68b <qinval_update_qtail+47>: ... mov    %r12,0x88(%rax)
(gdb) bt
#0  0xffff82d04025b68b in qinval_update_qtail (...) at qinval.c:72
#1  0xffff82d04025baa7 in queue_invalidate_context_sync (...) at qinval.c:101
#2  flush_context_qi (...) at qinval.c:341
#3  0xffff82d040259125 in iommu_flush_context_device (...) at iommu.c:400
#4  domain_context_mapping_one (...) at iommu.c:1436
#5  0xffff82d040259351 in domain_context_mapping (...) at iommu.c:1510
#6  0xffff82d040259d20 in reassign_device_ownership (...) at iommu.c:2412
#7  0xffff82d040259f19 in intel_iommu_assign_device (...) at iommu.c:2476
#8  0xffff82d040267154 in assign_device (...) at pci.c:1545
#9  iommu_do_pci_domctl (...) at pci.c:1732
#10 0xffff82d040264de3 in iommu_do_domctl (...) at iommu.c:539
#11 0xffff82d040322ca5 in arch_do_domctl (...) at domctl.c:1496
#12 0xffff82d040205a19 in do_domctl (...) at domctl.c:956
#13 0xffff82d040319476 in pv_hypercall (...) at hypercall.c:155
#14 0xffff82d040390432 in lstar_enter () at entry.S:271
#15 0x0000000000000000 in ?? ()

As a result of the above, I no longer have a need to patch Xen to work
around the problem. Though I do want to test against newer versions of
QEMU (currently on 4.2.1) to see if it still exists.

So unless there's interest among Xen developers for this patch, I will
probably shelve it for now. Especially since it looks like Jan has some
ongoing work in this area that I had not previously discovered. If there
is interest, I just need a resolution on whether iommu=quarantine should
be left as a boolean or expanded to support always, never,
deassign-only, and (why not) assign-only.

Thanks,
Scott


  parent reply	other threads:[~2021-04-29 21:05 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-26 17:25 [RFC PATCH] iommu: make no-quarantine mean no-quarantine Scott Davis
2021-04-27  6:56 ` Jan Beulich
2021-04-27 22:00   ` Scott Davis
2021-04-28  6:15     ` Jan Beulich
2021-04-28  7:19       ` Paul Durrant
2021-04-28  8:49         ` Jan Beulich
2021-04-28  8:51           ` Paul Durrant
2021-04-29 21:04         ` Scott Davis [this message]
2021-04-30  7:15           ` Jan Beulich
2021-04-30 19:27             ` Scott Davis

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2A65573D-2DD6-49C7-B34C-27B15FD620B2@starlab.io \
    --to=scott.davis@starlab.io \
    --cc=andrew.cooper3@citrix.com \
    --cc=george.dunlap@citrix.com \
    --cc=ian.jackson@eu.citrix.com \
    --cc=jbeulich@suse.com \
    --cc=julien@xen.org \
    --cc=paul@xen.org \
    --cc=sstabellini@kernel.org \
    --cc=wl@xen.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).