From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Jan Beulich <jbeulich@suse.com>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>
Cc: "Paul Durrant" <paul@xen.org>, "Ian Jackson" <iwj@xenproject.org>,
	"Roger Pau Monné" <roger.pau@citrix.com>
Subject: Re: [PATCH v8 1/6] AMD/IOMMU: obtain IVHD type to use earlier
Date: Wed, 20 Oct 2021 00:34:23 +0100
Message-ID: <932476cb-9667-efaa-65e9-7dc4baa3dc7c@citrix.com>
In-Reply-To: <d5f76461-70d2-fc59-2213-99a093e3b57f@suse.com>

On 22/09/2021 15:36, Jan Beulich wrote:
> Doing this in amd_iommu_prepare() is too late for it, in particular, to
> be used in amd_iommu_detect_one_acpi(), as a subsequent change will want
> to do. Moving it immediately ahead of amd_iommu_detect_acpi() is
> (luckily) pretty simple, (pretty importantly) without breaking
> amd_iommu_prepare()'s logic to prevent multiple processing.
>
> This involves moving table checksumming, as
> amd_iommu_get_supported_ivhd_type() ->  get_supported_ivhd_type() will
> now be invoked before amd_iommu_detect_acpi()  -> detect_iommu_acpi(). In
> the course of doing so stop open-coding acpi_tb_checksum(), seeing that
> we have other uses of this originally ACPI-private function elsewhere in
> the tree.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

I'm afraid this breaks booting on Skylake Server.  Yes, really - I
didn't believe the bisection at first either.

From a bit of debugging, I've found:

(XEN) *** acpi_dmar_init() => -19
(XEN) *** amd_iommu_get_supported_ivhd_type() => -19

So VT-d is disabled in firmware.  Oops, but something we should cope with.

Then we fall into acpi_ivrs_init(), and take the new-in-this-patch early
exit with -ENODEV (-19) too.

It turns out ...

> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> @@ -179,9 +179,17 @@ static int __must_check amd_iommu_setup_
>  
>  int __init acpi_ivrs_init(void)
>  {
> +    int rc;
> +
>      if ( !iommu_enable && !iommu_intremap )
>          return 0;
>  
> +    rc = amd_iommu_get_supported_ivhd_type();
> +    if ( rc < 0 )
> +        return rc;
> +    BUG_ON(!rc);
> +    ivhd_type = rc;
> +
>      if ( (amd_iommu_detect_acpi() !=0) || (iommu_found() == 0) )
>      {
>          iommu_intremap = iommu_intremap_off;
>

... we're relying on this path (now skipped) to set iommu_intremap away
from iommu_intremap_full in the "no IOMMU anywhere to be found" case.

This explains why, during the failure, I occasionally get spew such as:

(XEN) CPU0: No irq handler for vector 7a (IRQ -2147483648, LAPIC)
[   17.117518] xhci_hcd 0000:00:14.0: Error while assigning device slot ID
[   17.121114] xhci_hcd 0000:00:14.0: Max number of devices this xHCI host supports is 64.
[   17.125198] usb usb1-port2: couldn't allocate usb_device
[  248.317462] INFO: task kworker/u32:0:7 blocked for more than 120 seconds.

and eventually (after more than 400s) get dumped into a dracut shell.

Booting with an explicit iommu=no-intremap, which clobbers
iommu_intremap during cmdline parsing, recovers the system.
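
To make the failure mode concrete, here is a minimal standalone model of the
flow described above. It is a sketch only, not Xen code: the stub
stub_get_supported_ivhd_type() and both ivrs_init_*() functions are
hypothetical names, the globals are simplified stand-ins, and the second
function shows just one possible adjustment, not necessarily the fix that
should go in.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Values follow the mail: "off" is 0 and "full" is 2. */
enum { iommu_intremap_off = 0, iommu_intremap_full = 2 };

/* Simplified stand-ins for the Xen globals; not the real declarations. */
static bool iommu_enable = true;
static int iommu_intremap = iommu_intremap_full;

/* Hypothetical stub: firmware exposes no usable IVRS table. */
static int stub_get_supported_ivhd_type(void)
{
    return -ENODEV;
}

/* Models the patched flow: the early return leaves iommu_intremap alone. */
static int ivrs_init_as_patched(void)
{
    int rc;

    if ( !iommu_enable && !iommu_intremap )
        return 0;

    rc = stub_get_supported_ivhd_type();
    if ( rc < 0 )
        return rc;
    assert(rc != 0); /* stands in for BUG_ON(!rc) */

    return 0;
}

/* One possible adjustment: turn remapping off before bailing out. */
static int ivrs_init_clearing_intremap(void)
{
    int rc;

    if ( !iommu_enable && !iommu_intremap )
        return 0;

    rc = stub_get_supported_ivhd_type();
    if ( rc < 0 )
    {
        iommu_intremap = iommu_intremap_off;
        return rc;
    }
    assert(rc != 0);

    return 0;
}
```

In the first variant iommu_intremap is still iommu_intremap_full after the
-ENODEV return, which matches what I see; in the second it ends up as
iommu_intremap_off, the same end state as booting with iommu=no-intremap.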

This variable controls a whole lot of magic with interrupt handling.  It
should default to 0, not 2, and only become nonzero when an IOMMU is
properly established.  It also shouldn't be serving double duty as "what
the user wants" ahead of determining the system capabilities.
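
As a rough illustration of that split, assuming entirely hypothetical names
(struct intremap_state, intremap_state_init() and intremap_setup_done() do not
exist in Xen), the user's request and the established state could be tracked
separately, with the established state starting at 0:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Values follow the mail: "off" is 0 and "full" is 2. */
enum intremap_mode { intremap_off = 0, intremap_full = 2 };

/* Hypothetical container keeping the two roles apart. */
struct intremap_state {
    enum intremap_mode requested; /* what the user asked for on the cmdline */
    enum intremap_mode active;    /* what IOMMU setup actually established */
};

/* The active mode starts as off, whatever the user requested. */
static struct intremap_state intremap_state_init(enum intremap_mode requested)
{
    return (struct intremap_state){ .requested = requested,
                                    .active = intremap_off };
}

/* Only a successfully established IOMMU may make the active mode nonzero. */
static void intremap_setup_done(struct intremap_state *s, bool iommu_found)
{
    assert(s != NULL);
    s->active = iommu_found ? s->requested : intremap_off;
}
```

Interrupt handling code would then consult only the active field, so an early
exit from IOMMU setup cannot leave it claiming that remapping is in use.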

And not to open another can of worms, but our entire way of working
explodes if there are devices on the system not covered by an IOMMU.

~Andrew


