All of lore.kernel.org
 help / color / mirror / Atom feed
From: Scott Davis <scott.davis@starlab.io>
To: "paul@xen.org" <paul@xen.org>, Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
	George Dunlap <george.dunlap@citrix.com>,
	Ian Jackson <ian.jackson@eu.citrix.com>,
	Julien Grall <julien@xen.org>,
	Stefano Stabellini <sstabellini@kernel.org>, Wei Liu <wl@xen.org>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>
Subject: Re: [RFC PATCH] iommu: make no-quarantine mean no-quarantine
Date: Thu, 29 Apr 2021 21:04:58 +0000	[thread overview]
Message-ID: <2A65573D-2DD6-49C7-B34C-27B15FD620B2@starlab.io> (raw)
In-Reply-To: <dc6556ae-c653-8519-1a81-9524e4472f26@xen.org>

On 4/28/21, 3:20 AM, Paul Durrant wrote:
>> Following the extension to the command line option I'm putting in place
>> in "IOMMU: make DMA containment of quarantined devices optional" (which
>> I still need to get around to address review feedback for and resubmit),
>> I'd be inclined to suggest "iommu=quarantine=always" or
>> "iommu=quarantine=on-assign". Unless of course we'd prefer to have the
>> caller of the assignment operation have full control over the behavior
>> here anyway (in which case a command line option control simply is not
>> necessary).
>
> I'm still not entirely sure why not quarantining on is a problem, other
> than it triggering an as-yet undiagnosed issue in QEMU, but I agree that
> that the expectation of 'no-quarantine' meaning just that (i.e. the old
> dom0->domU and domU->dom0 transitions are re-instated) is reasonable. Do
> we really want yet more command line options?

Regarding the problem in QEMU, I traced the crash trigger down to a
write to the IQ tail register during the mapping operation into dom_io
(backtrace below). Along the way I noticed that, since a non-present
entry was being flushed, flush_context_qi only performs this
invalidation on an IOMMU with caching mode enabled (i.e. a software
IOMMU). Therefore this issue is probably only hittable when nesting.
Disabling caching mode on the QEMU vIOMMU was enough to prevent the
crash and give me a working system.

(gdb) si
0xffff82d04025b68b  72  in qinval.c
   0xffff82d04025b687 <qinval_update_qtail+43>: ... shl    $0x4,%r12
=> 0xffff82d04025b68b <qinval_update_qtail+47>: ... mov    %r12,0x88(%rax)
(gdb) bt
#0  0xffff82d04025b68b in qinval_update_qtail (...) at qinval.c:72
#1  0xffff82d04025baa7 in queue_invalidate_context_sync (...) at qinval.c:101
#2  flush_context_qi (...) at qinval.c:341
#3  0xffff82d040259125 in iommu_flush_context_device (...) at iommu.c:400
#4  domain_context_mapping_one (...) at iommu.c:1436
#5  0xffff82d040259351 in domain_context_mapping (...) at iommu.c:1510
#6  0xffff82d040259d20 in reassign_device_ownership (...) at iommu.c:2412
#7  0xffff82d040259f19 in intel_iommu_assign_device (...) at iommu.c:2476
#8  0xffff82d040267154 in assign_device (...) at pci.c:1545
#9  iommu_do_pci_domctl (...) at pci.c:1732
#10 0xffff82d040264de3 in iommu_do_domctl (...) at iommu.c:539
#11 0xffff82d040322ca5 in arch_do_domctl (...) at domctl.c:1496
#12 0xffff82d040205a19 in do_domctl (...) at domctl.c:956
#13 0xffff82d040319476 in pv_hypercall (...) at hypercall.c:155
#14 0xffff82d040390432 in lstar_enter () at entry.S:271
#15 0x0000000000000000 in ?? ()

As a result of the above, I no longer have a need to patch Xen to work
around the problem. Though I do want to test against newer versions of
QEMU (currently on 4.2.1) to see if it still exists.

So unless there's interest among Xen developers for this patch, I will
probably shelve it for now. Especially since it looks like Jan has some
ongoing work in this area that I had not previously discovered. If there
is interest, I just need a resolution on whether iommu=quarantine should
be left as a boolean or expanded to support always, never,
deassign-only, and (why not) assign-only.

Thanks,
Scott


  parent reply	other threads:[~2021-04-29 21:05 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-26 17:25 [RFC PATCH] iommu: make no-quarantine mean no-quarantine Scott Davis
2021-04-27  6:56 ` Jan Beulich
2021-04-27 22:00   ` Scott Davis
2021-04-28  6:15     ` Jan Beulich
2021-04-28  7:19       ` Paul Durrant
2021-04-28  8:49         ` Jan Beulich
2021-04-28  8:51           ` Paul Durrant
2021-04-29 21:04         ` Scott Davis [this message]
2021-04-30  7:15           ` Jan Beulich
2021-04-30 19:27             ` Scott Davis

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2A65573D-2DD6-49C7-B34C-27B15FD620B2@starlab.io \
    --to=scott.davis@starlab.io \
    --cc=andrew.cooper3@citrix.com \
    --cc=george.dunlap@citrix.com \
    --cc=ian.jackson@eu.citrix.com \
    --cc=jbeulich@suse.com \
    --cc=julien@xen.org \
    --cc=paul@xen.org \
    --cc=sstabellini@kernel.org \
    --cc=wl@xen.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.