All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH 0/2] AMD IOMMU: XSA-36 follow ups
@ 2013-02-08 14:29 Boris Ostrovsky
  2013-02-08 16:48 ` Jan Beulich
  0 siblings, 1 reply; 11+ messages in thread
From: Boris Ostrovsky @ 2013-02-08 14:29 UTC (permalink / raw)
  To: JBeulich; +Cc: sherry.hurwitz, xiantao.zhang, suravee.suthikulpanit, xen-devel


----- JBeulich@suse.com wrote:

> >>> On 06.02.13 at 14:04, "Jan Beulich" <JBeulich@suse.com> wrote:
> > A regression was reported on a class of broken firmware that c/s
> > 26517:601139e2b0db didn't consider, leading to a boot time crash.
> 
> After some more thought on this and the comments we got
> regarding disabling the IOMMU in this situation altogether making
> things worse instead of better, I came to the conclusion that we
> can actually restrict the action in affected cases to just disabling
> interrupt remapping. That doesn't make the situation worse than
> prior to the XSA-36 fixes (where interrupt remapping didn't really
> protect domains from one another), but allows at least DMA
> isolation to still be utilized. Patch 3/2 below/attached.

But now users who don't examine log messages may not realize 
that interupt remapping is disabled and therefore the system can be
affected by XSA-36.

With current code (boot option to use global remapping table) users
are explicitly agreeing to allow for possibility of cross-domain interrupt
attack.

Also, I think it may not be a bad idea to have AMD folks test you earlier
patch on multi-IOMMU system (and simulate bad IVRS) to see how it behaves there.

Suravee --- this is the patch I am referring to: http://lists.xen.org/archives/html/xen-devel/2013-02/msg00408.html

-boris

> 
> Jan
> 
> AMD IOMMU: only disable interrupt remapping when certain IVRS
> consistency checks fail
> 
> After some more thought on the XSA-36 and specifically the comments
> we
> got regarding disabling the IOMMU in this situation altogether making
> things worse instead of better, I came to the conclusion that we can
> actually restrict the action in affected cases to just disabling
> interrupt remapping. That doesn't make the situation worse than prior
> to the XSA-36 fixes (where interrupt remapping didn't really protect
> domains from one another), but allows at least DMA isolation to still
> be utilized.
> 
> In the course of this, the calls to the interrupt remapping related
> IOMMU implementation specific operations needed to be re-qualified
> from depending on iommu_enabled to iommu_intremap (as was already the
> case for x86's HPET code).
> 
> That in turn required to make sure iommu_intremap gets properly
> cleared when the respective initialization fails (or isn't being
> done at all).
> 
> Along with making sure interrupt remapping doesn't get inconsistently
> enabled on some IOMMUs and not on others in the VT-d code, this in
> turn
> allowed quite a bit of cleanup on the VT-d side (if desired, that
> cleanup could of course be broken out into a separate patch).
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> --- a/xen/arch/x86/msi.c
> +++ b/xen/arch/x86/msi.c
> @@ -207,7 +207,7 @@ static void read_msi_msg(struct msi_desc
>          BUG();
>      }
>  
> -    if ( iommu_enabled )
> +    if ( iommu_intremap )
>          iommu_read_msi_from_ire(entry, msg);
>  }
>  
> @@ -215,7 +215,7 @@ static void write_msi_msg(struct msi_des
>  {
>      entry->msg = *msg;
>  
> -    if ( iommu_enabled )
> +    if ( iommu_intremap )
>      {
>          ASSERT(msg != &entry->msg);
>          iommu_update_ire_from_msi(entry, msg);
> @@ -489,7 +489,7 @@ int msi_free_irq(struct msi_desc *entry)
>      }
>  
>      /* Free the unused IRTE if intr remap enabled */
> -    if ( iommu_enabled )
> +    if ( iommu_intremap )
>          iommu_update_ire_from_msi(entry, NULL);
>  
>      list_del(&entry->list);
> --- a/xen/drivers/passthrough/amd/iommu_acpi.c
> +++ b/xen/drivers/passthrough/amd/iommu_acpi.c
> @@ -682,10 +682,10 @@ static u16 __init parse_ivhd_device_spec
>                      printk(XENLOG_ERR "IVHD Error: Conflicting
> IO-APIC %#x entries\n",
>                             special->handle);
>                      if ( amd_iommu_perdev_intremap )
> -                        return 0;
> +                        iommu_intremap = 0;
>                  }
>              }
> -            else
> +            else if ( iommu_intremap )
>              {
>                  /* set device id of ioapic */
>                  ioapic_sbdf[special->handle].bdf = bdf;
> @@ -697,7 +697,7 @@ static u16 __init parse_ivhd_device_spec
>                       !ioapic_sbdf[IO_APIC_ID(apic)].pin_setup )
>                  {
>                      printk(XENLOG_ERR "IVHD Error: Out of
> memory\n");
> -                    return 0;
> +                    iommu_intremap = 0;
>                  }
>              }
>              break;
> @@ -706,7 +706,7 @@ static u16 __init parse_ivhd_device_spec
>          {
>              printk(XENLOG_ERR "IVHD Error: Invalid IO-APIC %#x\n",
>                     special->handle);
> -            return 0;
> +            iommu_intremap = 0;
>          }
>          break;
>      case ACPI_IVHD_HPET:
> @@ -932,7 +932,7 @@ static int __init parse_ivrs_table(struc
>          printk(XENLOG_ERR "IVHD Error: no information for IO-APIC
> %#x\n",
>                 IO_APIC_ID(apic));
>          if ( amd_iommu_perdev_intremap )
> -            error = -ENXIO;
> +            iommu_intremap = 0;
>          else
>          {
>              ioapic_sbdf[IO_APIC_ID(apic)].pin_setup = xzalloc_array(
> @@ -940,7 +940,7 @@ static int __init parse_ivrs_table(struc
>              if ( !ioapic_sbdf[IO_APIC_ID(apic)].pin_setup )
>              {
>                  printk(XENLOG_ERR "IVHD Error: Out of memory\n");
> -                error = -ENOMEM;
> +                iommu_intremap = 0;
>              }
>          }
>      }
> --- a/xen/drivers/passthrough/amd/iommu_init.c
> +++ b/xen/drivers/passthrough/amd/iommu_init.c
> @@ -1157,7 +1157,7 @@ int __init amd_iommu_init(void)
>      BUG_ON( !iommu_found() );
>  
>      if ( amd_iommu_perdev_intremap && amd_sp5100_erratum28() )
> -        goto error_out;
> +        iommu_intremap = 0;
>  
>      ivrs_bdf_entries = amd_iommu_get_ivrs_dev_entries();
>  
> @@ -1173,7 +1173,7 @@ int __init amd_iommu_init(void)
>          goto error_out;
>  
>      /* initialize io-apic interrupt remapping entries */
> -    if ( amd_iommu_setup_ioapic_remapping() != 0 )
> +    if ( iommu_intremap && amd_iommu_setup_ioapic_remapping() != 0 )
>          goto error_out;
>  
>      /* allocate and initialize a global device table shared by all
> iommus */
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -470,6 +470,8 @@ int __init iommu_setup(void)
>          rc = iommu_hardware_setup();
>          iommu_enabled = (rc == 0);
>      }
> +    if ( !iommu_enabled )
> +        iommu_intremap = 0;
>  
>      if ( (force_iommu && !iommu_enabled) ||
>           (force_intremap && !iommu_intremap) )
> @@ -487,6 +489,7 @@ int __init iommu_setup(void)
>          printk(" - Dom0 mode: %s\n",
>                 iommu_passthrough ? "Passthrough" :
>                 iommu_dom0_strict ? "Strict" : "Relaxed");
> +    printk("Interrupt remapping %sabled\n", iommu_intremap ? "en" :
> "dis");
>  
>      return rc;
>  }
> --- a/xen/drivers/passthrough/vtd/intremap.c
> +++ b/xen/drivers/passthrough/vtd/intremap.c
> @@ -373,7 +373,7 @@ unsigned int io_apic_read_remap_rte(
>      struct iommu *iommu = ioapic_to_iommu(IO_APIC_ID(apic));
>      struct ir_ctrl *ir_ctrl = iommu_ir_ctrl(iommu);
>  
> -    if ( !ir_ctrl || !ir_ctrl->iremap_maddr || !ir_ctrl->iremap_num
> ||
> +    if ( !ir_ctrl->iremap_num ||
>          ( (index = apic_pin_2_ir_idx[apic][ioapic_pin]) < 0 ) )
>          return __io_apic_read(apic, reg);
>  
> @@ -396,15 +396,8 @@ void io_apic_write_remap_rte(
>      struct IO_APIC_route_remap_entry *remap_rte;
>      unsigned int rte_upper = (reg & 1) ? 1 : 0;
>      struct iommu *iommu = ioapic_to_iommu(IO_APIC_ID(apic));
> -    struct ir_ctrl *ir_ctrl = iommu_ir_ctrl(iommu);
>      int saved_mask;
>  
> -    if ( !ir_ctrl || !ir_ctrl->iremap_maddr )
> -    {
> -        __io_apic_write(apic, reg, value);
> -        return;
> -    }
> -
>      old_rte = __ioapic_read_entry(apic, ioapic_pin, 1);
>  
>      remap_rte = (struct IO_APIC_route_remap_entry *) &old_rte;
> @@ -653,20 +646,11 @@ void msi_msg_read_remap_rte(
>  {
>      struct pci_dev *pdev = msi_desc->dev;
>      struct acpi_drhd_unit *drhd = NULL;
> -    struct iommu *iommu = NULL;
> -    struct ir_ctrl *ir_ctrl;
>  
>      drhd = pdev ? acpi_find_matched_drhd_unit(pdev)
>                  : hpet_to_drhd(msi_desc->hpet_id);
> -    if ( !drhd )
> -        return;
> -    iommu = drhd->iommu;
> -
> -    ir_ctrl = iommu_ir_ctrl(iommu);
> -    if ( !ir_ctrl || !ir_ctrl->iremap_maddr )
> -        return;
> -
> -    remap_entry_to_msi_msg(iommu, msg);
> +    if ( drhd )
> +        remap_entry_to_msi_msg(drhd->iommu, msg);
>  }
>  
>  void msi_msg_write_remap_rte(
> @@ -674,20 +658,11 @@ void msi_msg_write_remap_rte(
>  {
>      struct pci_dev *pdev = msi_desc->dev;
>      struct acpi_drhd_unit *drhd = NULL;
> -    struct iommu *iommu = NULL;
> -    struct ir_ctrl *ir_ctrl;
>  
>      drhd = pdev ? acpi_find_matched_drhd_unit(pdev)
>                  : hpet_to_drhd(msi_desc->hpet_id);
> -    if ( !drhd )
> -        return;
> -    iommu = drhd->iommu;
> -
> -    ir_ctrl = iommu_ir_ctrl(iommu);
> -    if ( !ir_ctrl || !ir_ctrl->iremap_maddr )
> -        return;
> -
> -    msi_msg_to_remap_entry(iommu, pdev, msi_desc, msg);
> +    if ( drhd )
> +        msi_msg_to_remap_entry(drhd->iommu, pdev, msi_desc, msg);
>  }
>  
>  int __init intel_setup_hpet_msi(struct msi_desc *msi_desc)
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -2065,6 +2065,9 @@ static int init_vtd_hw(void)
>                  break;
>              }
>          }
> +        if ( !iommu_intremap )
> +            for_each_drhd_unit ( drhd )
> +                disable_intremap(drhd->iommu);
>      }
>  
>      /*
> --- a/xen/include/asm-x86/io_apic.h
> +++ b/xen/include/asm-x86/io_apic.h
> @@ -129,7 +129,7 @@ struct IO_APIC_route_entry {
>  extern struct mpc_config_ioapic mp_ioapics[MAX_IO_APICS];
>  
>  /* Only need to remap ioapic RTE (reg: 10~3Fh) */
> -#define ioapic_reg_remapped(reg) (iommu_enabled && ((reg) >= 0x10))
> +#define ioapic_reg_remapped(reg) (iommu_intremap && ((reg) >= 0x10))
>  
>  static inline unsigned int __io_apic_read(unsigned int apic, unsigned
> int reg)
>  {
> 
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] AMD IOMMU: XSA-36 follow ups
  2013-02-08 14:29 [PATCH 0/2] AMD IOMMU: XSA-36 follow ups Boris Ostrovsky
@ 2013-02-08 16:48 ` Jan Beulich
  2013-02-08 17:10   ` Sander Eikelenboom
  2013-02-11 11:46   ` Ian Campbell
  0 siblings, 2 replies; 11+ messages in thread
From: Jan Beulich @ 2013-02-08 16:48 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: xen-devel, xiantao.zhang, suravee.suthikulpanit, sherry.hurwitz

>>> On 08.02.13 at 15:29, Boris Ostrovsky <boris.ostrovsky@oracle.com> wrote:

> ----- JBeulich@suse.com wrote:
> 
>> >>> On 06.02.13 at 14:04, "Jan Beulich" <JBeulich@suse.com> wrote:
>> > A regression was reported on a class of broken firmware that c/s
>> > 26517:601139e2b0db didn't consider, leading to a boot time crash.
>> 
>> After some more thought on this and the comments we got
>> regarding disabling the IOMMU in this situation altogether making
>> things worse instead of better, I came to the conclusion that we
>> can actually restrict the action in affected cases to just disabling
>> interrupt remapping. That doesn't make the situation worse than
>> prior to the XSA-36 fixes (where interrupt remapping didn't really
>> protect domains from one another), but allows at least DMA
>> isolation to still be utilized. Patch 3/2 below/attached.
> 
> But now users who don't examine log messages may not realize 
> that interupt remapping is disabled and therefore the system can be
> affected by XSA-36.

Yes. We need to balance these against one another - I see pros
and cons in both (and I don't mind dropping this additional patch
if we collectively come to the conclusion that the way it is now -
with the one earlier fix - is the better state). So I'm really
interested in others' opinions.

> With current code (boot option to use global remapping table) users
> are explicitly agreeing to allow for possibility of cross-domain interrupt
> attack.
> 
> Also, I think it may not be a bad idea to have AMD folks test you earlier
> patch on multi-IOMMU system (and simulate bad IVRS) to see how it behaves 
> there.

That would indeed be very desirable.

Jan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] AMD IOMMU: XSA-36 follow ups
  2013-02-08 16:48 ` Jan Beulich
@ 2013-02-08 17:10   ` Sander Eikelenboom
  2013-02-11 11:46   ` Ian Campbell
  1 sibling, 0 replies; 11+ messages in thread
From: Sander Eikelenboom @ 2013-02-08 17:10 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Boris Ostrovsky, sherry.hurwitz, suravee.suthikulpanit,
	xiantao.zhang, xen-devel


Friday, February 8, 2013, 5:48:41 PM, you wrote:

>>>> On 08.02.13 at 15:29, Boris Ostrovsky <boris.ostrovsky@oracle.com> wrote:

>> ----- JBeulich@suse.com wrote:
>> 
>>> >>> On 06.02.13 at 14:04, "Jan Beulich" <JBeulich@suse.com> wrote:
>>> > A regression was reported on a class of broken firmware that c/s
>>> > 26517:601139e2b0db didn't consider, leading to a boot time crash.
>>> 
>>> After some more thought on this and the comments we got
>>> regarding disabling the IOMMU in this situation altogether making
>>> things worse instead of better, I came to the conclusion that we
>>> can actually restrict the action in affected cases to just disabling
>>> interrupt remapping. That doesn't make the situation worse than
>>> prior to the XSA-36 fixes (where interrupt remapping didn't really
>>> protect domains from one another), but allows at least DMA
>>> isolation to still be utilized. Patch 3/2 below/attached.
>> 
>> But now users who don't examine log messages may not realize 
>> that interupt remapping is disabled and therefore the system can be
>> affected by XSA-36.

> Yes. We need to balance these against one another - I see pros
> and cons in both (and I don't mind dropping this additional patch
> if we collectively come to the conclusion that the way it is now -
> with the one earlier fix - is the better state). So I'm really
> interested in others' opinions.

One argument pro could be that linux seems to do the same (only disable interrupt remapping)

>> With current code (boot option to use global remapping table) users
>> are explicitly agreeing to allow for possibility of cross-domain interrupt
>> attack.
>> 
>> Also, I think it may not be a bad idea to have AMD folks test you earlier
>> patch on multi-IOMMU system (and simulate bad IVRS) to see how it behaves 
>> there.

And getting mainbord / bios manufactures actually support stuff .. instead of giving some nice disinformation.
Fresh from MSI techsupport:

Posted:         2013-02-09 00:11:38 (GMT+8),MSI
Content:        CPU virtualization (SVM) should be possible on the 890FXA-GD70. I/O virtualization (IOMMU) is support on the MSI's AMD 900-series. IOMMU will not be possible on MSI's AMD 800 series.

In short .. buy a new mainboard and pray .. pray hard .. that it works out of the box

> That would indeed be very desirable.

> Jan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] AMD IOMMU: XSA-36 follow ups
  2013-02-08 16:48 ` Jan Beulich
  2013-02-08 17:10   ` Sander Eikelenboom
@ 2013-02-11 11:46   ` Ian Campbell
  1 sibling, 0 replies; 11+ messages in thread
From: Ian Campbell @ 2013-02-11 11:46 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Boris Ostrovsky, sherry.hurwitz, suravee.suthikulpanit,
	xiantao.zhang, xen-devel

On Fri, 2013-02-08 at 16:48 +0000, Jan Beulich wrote:
> >>> On 08.02.13 at 15:29, Boris Ostrovsky <boris.ostrovsky@oracle.com> wrote:
> 
> > ----- JBeulich@suse.com wrote:
> > 
> >> >>> On 06.02.13 at 14:04, "Jan Beulich" <JBeulich@suse.com> wrote:
> >> > A regression was reported on a class of broken firmware that c/s
> >> > 26517:601139e2b0db didn't consider, leading to a boot time crash.
> >> 
> >> After some more thought on this and the comments we got
> >> regarding disabling the IOMMU in this situation altogether making
> >> things worse instead of better, I came to the conclusion that we
> >> can actually restrict the action in affected cases to just disabling
> >> interrupt remapping. That doesn't make the situation worse than
> >> prior to the XSA-36 fixes (where interrupt remapping didn't really
> >> protect domains from one another), but allows at least DMA
> >> isolation to still be utilized. Patch 3/2 below/attached.
> > 
> > But now users who don't examine log messages may not realize 
> > that interupt remapping is disabled and therefore the system can be
> > affected by XSA-36.
> 
> Yes. We need to balance these against one another - I see pros
> and cons in both (and I don't mind dropping this additional patch
> if we collectively come to the conclusion that the way it is now -
> with the one earlier fix - is the better state). So I'm really
> interested in others' opinions.

Given the infrastructure currently available for dealing with these
things in a more fine-grained/less heavy-handed manner (i.e. not much)
and in the context of security advisories (where obviousness of the
patch is a very useful property and where we are effectively
highlighting the issue to the world) I think it is reasonable to be
fairly aggressive in our handling of systems which don't provide the
necessary security properties.

Most of these sorts of issues naturally only arise as part of a security
vulnerability but if someone wanted to implement more fine grained
control after the fact that would also seem reasonable.

If we had pre-existing better infrastructure for determining controlling
things at runtime, including informing the user and allowing for
overrides, which could be tweaked with one or two lines in a security
fix then it would seem reasonable to use them up front.

Somebody would need to figure out what that actually means and implement
it as part of the normal development process, since we obviously don't
want to do it next time we are under the cosh of an impending security
advisory.

Ian.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] AMD IOMMU: XSA-36 follow ups
  2013-03-01 18:45   ` Malcolm Crossley
@ 2013-03-04  9:48     ` Jan Beulich
  0 siblings, 0 replies; 11+ messages in thread
From: Jan Beulich @ 2013-03-04  9:48 UTC (permalink / raw)
  To: Malcolm Crossley; +Cc: xen-devel

>>> On 01.03.13 at 19:45, Malcolm Crossley <malcolm.crossley@citrix.com> wrote:
> On 08/02/13 09:58, Jan Beulich wrote:
>>>>> On 06.02.13 at 14:04, "Jan Beulich" <JBeulich@suse.com> wrote:
>>> A regression was reported on a class of broken firmware that c/s
>>> 26517:601139e2b0db didn't consider, leading to a boot time crash.
>> After some more thought on this and the comments we got
>> regarding disabling the IOMMU in this situation altogether making
>> things worse instead of better, I came to the conclusion that we
>> can actually restrict the action in affected cases to just disabling
>> interrupt remapping. That doesn't make the situation worse than
>> prior to the XSA-36 fixes (where interrupt remapping didn't really
>> protect domains from one another), but allows at least DMA
>> isolation to still be utilized. Patch 3/2 below/attached.
> 
> What is the status of this patch? It has not been included into 
> xen-unstable as yet.

No real consensus was reached on the better of two less than
optimal alternatives.

> I am of the opinion that it is better to have DMA isolation than to 
> remove the feature altogether.
> Particularly because there are other ways a guest with a PCI passthrough 
> device can attack the host
> than via reprogramming the interrupt vector in the PCI device.

Yes, that's one of the two perspectives one can take. The other
is that by doing what the patch does, it re-opens the security
hole that XSA-36 describes, i.e. without other measures users'
systems become vulnerable again.

Are you sure you consider either of the two alternatives
significantly better than the other?

Jan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] AMD IOMMU: XSA-36 follow ups
  2013-02-08  9:58 ` Jan Beulich
  2013-02-08 13:53   ` Sander Eikelenboom
@ 2013-03-01 18:45   ` Malcolm Crossley
  2013-03-04  9:48     ` Jan Beulich
  1 sibling, 1 reply; 11+ messages in thread
From: Malcolm Crossley @ 2013-03-01 18:45 UTC (permalink / raw)
  To: xen-devel

On 08/02/13 09:58, Jan Beulich wrote:
>>>> On 06.02.13 at 14:04, "Jan Beulich" <JBeulich@suse.com> wrote:
>> A regression was reported on a class of broken firmware that c/s
>> 26517:601139e2b0db didn't consider, leading to a boot time crash.
> After some more thought on this and the comments we got
> regarding disabling the IOMMU in this situation altogether making
> things worse instead of better, I came to the conclusion that we
> can actually restrict the action in affected cases to just disabling
> interrupt remapping. That doesn't make the situation worse than
> prior to the XSA-36 fixes (where interrupt remapping didn't really
> protect domains from one another), but allows at least DMA
> isolation to still be utilized. Patch 3/2 below/attached.
>
> Jan

What is the status of this patch? It has not been included into 
xen-unstable as yet.

I am of the opinion that it is better to have DMA isolation than to 
remove the feature altogether.
Particularly because there are other ways a guest with a PCI passthrough 
device can attack the host
than via reprogramming the interrupt vector in the PCI device.

Malcolm
> AMD IOMMU: only disable interrupt remapping when certain IVRS consistency checks fail
>
> After some more thought on the XSA-36 and specifically the comments we
> got regarding disabling the IOMMU in this situation altogether making
> things worse instead of better, I came to the conclusion that we can
> actually restrict the action in affected cases to just disabling
> interrupt remapping. That doesn't make the situation worse than prior
> to the XSA-36 fixes (where interrupt remapping didn't really protect
> domains from one another), but allows at least DMA isolation to still
> be utilized.
>
> In the course of this, the calls to the interrupt remapping related
> IOMMU implementation specific operations needed to be re-qualified
> from depending on iommu_enabled to iommu_intremap (as was already the
> case for x86's HPET code).
>
> That in turn required to make sure iommu_intremap gets properly
> cleared when the respective initialization fails (or isn't being
> done at all).
>
> Along with making sure interrupt remapping doesn't get inconsistently
> enabled on some IOMMUs and not on others in the VT-d code, this in turn
> allowed quite a bit of cleanup on the VT-d side (if desired, that
> cleanup could of course be broken out into a separate patch).
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> --- a/xen/arch/x86/msi.c
> +++ b/xen/arch/x86/msi.c
> @@ -207,7 +207,7 @@ static void read_msi_msg(struct msi_desc
>           BUG();
>       }
>   
> -    if ( iommu_enabled )
> +    if ( iommu_intremap )
>           iommu_read_msi_from_ire(entry, msg);
>   }
>   
> @@ -215,7 +215,7 @@ static void write_msi_msg(struct msi_des
>   {
>       entry->msg = *msg;
>   
> -    if ( iommu_enabled )
> +    if ( iommu_intremap )
>       {
>           ASSERT(msg != &entry->msg);
>           iommu_update_ire_from_msi(entry, msg);
> @@ -489,7 +489,7 @@ int msi_free_irq(struct msi_desc *entry)
>       }
>   
>       /* Free the unused IRTE if intr remap enabled */
> -    if ( iommu_enabled )
> +    if ( iommu_intremap )
>           iommu_update_ire_from_msi(entry, NULL);
>   
>       list_del(&entry->list);
> --- a/xen/drivers/passthrough/amd/iommu_acpi.c
> +++ b/xen/drivers/passthrough/amd/iommu_acpi.c
> @@ -682,10 +682,10 @@ static u16 __init parse_ivhd_device_spec
>                       printk(XENLOG_ERR "IVHD Error: Conflicting IO-APIC %#x entries\n",
>                              special->handle);
>                       if ( amd_iommu_perdev_intremap )
> -                        return 0;
> +                        iommu_intremap = 0;
>                   }
>               }
> -            else
> +            else if ( iommu_intremap )
>               {
>                   /* set device id of ioapic */
>                   ioapic_sbdf[special->handle].bdf = bdf;
> @@ -697,7 +697,7 @@ static u16 __init parse_ivhd_device_spec
>                        !ioapic_sbdf[IO_APIC_ID(apic)].pin_setup )
>                   {
>                       printk(XENLOG_ERR "IVHD Error: Out of memory\n");
> -                    return 0;
> +                    iommu_intremap = 0;
>                   }
>               }
>               break;
> @@ -706,7 +706,7 @@ static u16 __init parse_ivhd_device_spec
>           {
>               printk(XENLOG_ERR "IVHD Error: Invalid IO-APIC %#x\n",
>                      special->handle);
> -            return 0;
> +            iommu_intremap = 0;
>           }
>           break;
>       case ACPI_IVHD_HPET:
> @@ -932,7 +932,7 @@ static int __init parse_ivrs_table(struc
>           printk(XENLOG_ERR "IVHD Error: no information for IO-APIC %#x\n",
>                  IO_APIC_ID(apic));
>           if ( amd_iommu_perdev_intremap )
> -            error = -ENXIO;
> +            iommu_intremap = 0;
>           else
>           {
>               ioapic_sbdf[IO_APIC_ID(apic)].pin_setup = xzalloc_array(
> @@ -940,7 +940,7 @@ static int __init parse_ivrs_table(struc
>               if ( !ioapic_sbdf[IO_APIC_ID(apic)].pin_setup )
>               {
>                   printk(XENLOG_ERR "IVHD Error: Out of memory\n");
> -                error = -ENOMEM;
> +                iommu_intremap = 0;
>               }
>           }
>       }
> --- a/xen/drivers/passthrough/amd/iommu_init.c
> +++ b/xen/drivers/passthrough/amd/iommu_init.c
> @@ -1157,7 +1157,7 @@ int __init amd_iommu_init(void)
>       BUG_ON( !iommu_found() );
>   
>       if ( amd_iommu_perdev_intremap && amd_sp5100_erratum28() )
> -        goto error_out;
> +        iommu_intremap = 0;
>   
>       ivrs_bdf_entries = amd_iommu_get_ivrs_dev_entries();
>   
> @@ -1173,7 +1173,7 @@ int __init amd_iommu_init(void)
>           goto error_out;
>   
>       /* initialize io-apic interrupt remapping entries */
> -    if ( amd_iommu_setup_ioapic_remapping() != 0 )
> +    if ( iommu_intremap && amd_iommu_setup_ioapic_remapping() != 0 )
>           goto error_out;
>   
>       /* allocate and initialize a global device table shared by all iommus */
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -470,6 +470,8 @@ int __init iommu_setup(void)
>           rc = iommu_hardware_setup();
>           iommu_enabled = (rc == 0);
>       }
> +    if ( !iommu_enabled )
> +        iommu_intremap = 0;
>   
>       if ( (force_iommu && !iommu_enabled) ||
>            (force_intremap && !iommu_intremap) )
> @@ -487,6 +489,7 @@ int __init iommu_setup(void)
>           printk(" - Dom0 mode: %s\n",
>                  iommu_passthrough ? "Passthrough" :
>                  iommu_dom0_strict ? "Strict" : "Relaxed");
> +    printk("Interrupt remapping %sabled\n", iommu_intremap ? "en" : "dis");
>   
>       return rc;
>   }
> --- a/xen/drivers/passthrough/vtd/intremap.c
> +++ b/xen/drivers/passthrough/vtd/intremap.c
> @@ -373,7 +373,7 @@ unsigned int io_apic_read_remap_rte(
>       struct iommu *iommu = ioapic_to_iommu(IO_APIC_ID(apic));
>       struct ir_ctrl *ir_ctrl = iommu_ir_ctrl(iommu);
>   
> -    if ( !ir_ctrl || !ir_ctrl->iremap_maddr || !ir_ctrl->iremap_num ||
> +    if ( !ir_ctrl->iremap_num ||
>           ( (index = apic_pin_2_ir_idx[apic][ioapic_pin]) < 0 ) )
>           return __io_apic_read(apic, reg);
>   
> @@ -396,15 +396,8 @@ void io_apic_write_remap_rte(
>       struct IO_APIC_route_remap_entry *remap_rte;
>       unsigned int rte_upper = (reg & 1) ? 1 : 0;
>       struct iommu *iommu = ioapic_to_iommu(IO_APIC_ID(apic));
> -    struct ir_ctrl *ir_ctrl = iommu_ir_ctrl(iommu);
>       int saved_mask;
>   
> -    if ( !ir_ctrl || !ir_ctrl->iremap_maddr )
> -    {
> -        __io_apic_write(apic, reg, value);
> -        return;
> -    }
> -
>       old_rte = __ioapic_read_entry(apic, ioapic_pin, 1);
>   
>       remap_rte = (struct IO_APIC_route_remap_entry *) &old_rte;
> @@ -653,20 +646,11 @@ void msi_msg_read_remap_rte(
>   {
>       struct pci_dev *pdev = msi_desc->dev;
>       struct acpi_drhd_unit *drhd = NULL;
> -    struct iommu *iommu = NULL;
> -    struct ir_ctrl *ir_ctrl;
>   
>       drhd = pdev ? acpi_find_matched_drhd_unit(pdev)
>                   : hpet_to_drhd(msi_desc->hpet_id);
> -    if ( !drhd )
> -        return;
> -    iommu = drhd->iommu;
> -
> -    ir_ctrl = iommu_ir_ctrl(iommu);
> -    if ( !ir_ctrl || !ir_ctrl->iremap_maddr )
> -        return;
> -
> -    remap_entry_to_msi_msg(iommu, msg);
> +    if ( drhd )
> +        remap_entry_to_msi_msg(drhd->iommu, msg);
>   }
>   
>   void msi_msg_write_remap_rte(
> @@ -674,20 +658,11 @@ void msi_msg_write_remap_rte(
>   {
>       struct pci_dev *pdev = msi_desc->dev;
>       struct acpi_drhd_unit *drhd = NULL;
> -    struct iommu *iommu = NULL;
> -    struct ir_ctrl *ir_ctrl;
>   
>       drhd = pdev ? acpi_find_matched_drhd_unit(pdev)
>                   : hpet_to_drhd(msi_desc->hpet_id);
> -    if ( !drhd )
> -        return;
> -    iommu = drhd->iommu;
> -
> -    ir_ctrl = iommu_ir_ctrl(iommu);
> -    if ( !ir_ctrl || !ir_ctrl->iremap_maddr )
> -        return;
> -
> -    msi_msg_to_remap_entry(iommu, pdev, msi_desc, msg);
> +    if ( drhd )
> +        msi_msg_to_remap_entry(drhd->iommu, pdev, msi_desc, msg);
>   }
>   
>   int __init intel_setup_hpet_msi(struct msi_desc *msi_desc)
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -2065,6 +2065,9 @@ static int init_vtd_hw(void)
>                   break;
>               }
>           }
> +        if ( !iommu_intremap )
> +            for_each_drhd_unit ( drhd )
> +                disable_intremap(drhd->iommu);
>       }
>   
>       /*
> --- a/xen/include/asm-x86/io_apic.h
> +++ b/xen/include/asm-x86/io_apic.h
> @@ -129,7 +129,7 @@ struct IO_APIC_route_entry {
>   extern struct mpc_config_ioapic mp_ioapics[MAX_IO_APICS];
>   
>   /* Only need to remap ioapic RTE (reg: 10~3Fh) */
> -#define ioapic_reg_remapped(reg) (iommu_enabled && ((reg) >= 0x10))
> +#define ioapic_reg_remapped(reg) (iommu_intremap && ((reg) >= 0x10))
>   
>   static inline unsigned int __io_apic_read(unsigned int apic, unsigned int reg)
>   {
>
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] AMD IOMMU: XSA-36 follow ups
  2013-02-11 13:12 ` Ian Campbell
@ 2013-02-12  8:45   ` Jan Beulich
  0 siblings, 0 replies; 11+ messages in thread
From: Jan Beulich @ 2013-02-12  8:45 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Boris Ostrovsky, xen-devel, Sherry Hurwitz

>>> On 11.02.13 at 14:12, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Wed, 2013-02-06 at 13:04 +0000, Jan Beulich wrote:
>> A regression was reported on a class of broken firmware that c/s
>> 26517:601139e2b0db didn't consider, leading to a boot time crash.
> 
> Should we release an update to XSA-36 when these have gone in?

I think so, yes.

Jan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] AMD IOMMU: XSA-36 follow ups
  2013-02-06 13:04 Jan Beulich
  2013-02-08  9:58 ` Jan Beulich
@ 2013-02-11 13:12 ` Ian Campbell
  2013-02-12  8:45   ` Jan Beulich
  1 sibling, 1 reply; 11+ messages in thread
From: Ian Campbell @ 2013-02-11 13:12 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Boris Ostrovsky, Sherry Hurwitz, xen-devel

On Wed, 2013-02-06 at 13:04 +0000, Jan Beulich wrote:
> A regression was reported on a class of broken firmware that c/s
> 26517:601139e2b0db didn't consider, leading to a boot time crash.

Should we release an update to XSA-36 when these have gone in?

Ian.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] AMD IOMMU: XSA-36 follow ups
  2013-02-08  9:58 ` Jan Beulich
@ 2013-02-08 13:53   ` Sander Eikelenboom
  2013-03-01 18:45   ` Malcolm Crossley
  1 sibling, 0 replies; 11+ messages in thread
From: Sander Eikelenboom @ 2013-02-08 13:53 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Boris Ostrovsky, Sherry Hurwitz, xiantao.zhang, xen-devel


Friday, February 8, 2013, 10:58:02 AM, you wrote:

>>>> On 06.02.13 at 14:04, "Jan Beulich" <JBeulich@suse.com> wrote:
>> A regression was reported on a class of broken firmware that c/s
>> 26517:601139e2b0db didn't consider, leading to a boot time crash.

> After some more thought on this and the comments we got
> regarding disabling the IOMMU in this situation altogether making
> things worse instead of better, I came to the conclusion that we
> can actually restrict the action in affected cases to just disabling
> interrupt remapping. That doesn't make the situation worse than
> prior to the XSA-36 fixes (where interrupt remapping didn't really
> protect domains from one another), but allows at least DMA
> isolation to still be utilized. Patch 3/2 below/attached.

> Jan

That seems the same linux does.
Unfortunately it seems i'am having to use it, just tested the latest bios for my motherboard and that still has no IVHD for the IOAPIC, but also locks up immediately after that message.


> AMD IOMMU: only disable interrupt remapping when certain IVRS consistency checks fail

> After some more thought on the XSA-36 and specifically the comments we
> got regarding disabling the IOMMU in this situation altogether making
> things worse instead of better, I came to the conclusion that we can
> actually restrict the action in affected cases to just disabling
> interrupt remapping. That doesn't make the situation worse than prior
> to the XSA-36 fixes (where interrupt remapping didn't really protect
> domains from one another), but allows at least DMA isolation to still
> be utilized.

> In the course of this, the calls to the interrupt remapping related
> IOMMU implementation specific operations needed to be re-qualified
> from depending on iommu_enabled to iommu_intremap (as was already the
> case for x86's HPET code).

> That in turn required to make sure iommu_intremap gets properly
> cleared when the respective initialization fails (or isn't being
> done at all).

> Along with making sure interrupt remapping doesn't get inconsistently
> enabled on some IOMMUs and not on others in the VT-d code, this in turn
> allowed quite a bit of cleanup on the VT-d side (if desired, that
> cleanup could of course be broken out into a separate patch).

> Signed-off-by: Jan Beulich <jbeulich@suse.com>

> --- a/xen/arch/x86/msi.c
> +++ b/xen/arch/x86/msi.c
> @@ -207,7 +207,7 @@ static void read_msi_msg(struct msi_desc
>          BUG();
>      }
>  
> -    if ( iommu_enabled )
> +    if ( iommu_intremap )
>          iommu_read_msi_from_ire(entry, msg);
>  }
>  
> @@ -215,7 +215,7 @@ static void write_msi_msg(struct msi_des
>  {
>      entry->msg = *msg;
>  
> -    if ( iommu_enabled )
> +    if ( iommu_intremap )
>      {
>          ASSERT(msg != &entry->msg);
>          iommu_update_ire_from_msi(entry, msg);
> @@ -489,7 +489,7 @@ int msi_free_irq(struct msi_desc *entry)
>      }
>  
>      /* Free the unused IRTE if intr remap enabled */
> -    if ( iommu_enabled )
> +    if ( iommu_intremap )
>          iommu_update_ire_from_msi(entry, NULL);
>  
>      list_del(&entry->list);
> --- a/xen/drivers/passthrough/amd/iommu_acpi.c
> +++ b/xen/drivers/passthrough/amd/iommu_acpi.c
> @@ -682,10 +682,10 @@ static u16 __init parse_ivhd_device_spec
>                      printk(XENLOG_ERR "IVHD Error: Conflicting IO-APIC %#x entries\n",
>                             special->handle);
>                      if ( amd_iommu_perdev_intremap )
> -                        return 0;
> +                        iommu_intremap = 0;
>                  }
>              }
> -            else
> +            else if ( iommu_intremap )
>              {
>                  /* set device id of ioapic */
>                  ioapic_sbdf[special->handle].bdf = bdf;
> @@ -697,7 +697,7 @@ static u16 __init parse_ivhd_device_spec
>                       !ioapic_sbdf[IO_APIC_ID(apic)].pin_setup )
>                  {
>                      printk(XENLOG_ERR "IVHD Error: Out of memory\n");
> -                    return 0;
> +                    iommu_intremap = 0;
>                  }
>              }
>              break;
> @@ -706,7 +706,7 @@ static u16 __init parse_ivhd_device_spec
>          {
>              printk(XENLOG_ERR "IVHD Error: Invalid IO-APIC %#x\n",
>                     special->handle);
> -            return 0;
> +            iommu_intremap = 0;
>          }
>          break;
>      case ACPI_IVHD_HPET:
> @@ -932,7 +932,7 @@ static int __init parse_ivrs_table(struc
>          printk(XENLOG_ERR "IVHD Error: no information for IO-APIC %#x\n",
>                 IO_APIC_ID(apic));
>          if ( amd_iommu_perdev_intremap )
> -            error = -ENXIO;
> +            iommu_intremap = 0;
>          else
>          {
>              ioapic_sbdf[IO_APIC_ID(apic)].pin_setup = xzalloc_array(
> @@ -940,7 +940,7 @@ static int __init parse_ivrs_table(struc
>              if ( !ioapic_sbdf[IO_APIC_ID(apic)].pin_setup )
>              {
>                  printk(XENLOG_ERR "IVHD Error: Out of memory\n");
> -                error = -ENOMEM;
> +                iommu_intremap = 0;
>              }
>          }
>      }
> --- a/xen/drivers/passthrough/amd/iommu_init.c
> +++ b/xen/drivers/passthrough/amd/iommu_init.c
> @@ -1157,7 +1157,7 @@ int __init amd_iommu_init(void)
>      BUG_ON( !iommu_found() );
>  
>      if ( amd_iommu_perdev_intremap && amd_sp5100_erratum28() )
> -        goto error_out;
> +        iommu_intremap = 0;
>  
>      ivrs_bdf_entries = amd_iommu_get_ivrs_dev_entries();
>  
> @@ -1173,7 +1173,7 @@ int __init amd_iommu_init(void)
>          goto error_out;
>  
>      /* initialize io-apic interrupt remapping entries */
> -    if ( amd_iommu_setup_ioapic_remapping() != 0 )
> +    if ( iommu_intremap && amd_iommu_setup_ioapic_remapping() != 0 )
>          goto error_out;
>  
>      /* allocate and initialize a global device table shared by all iommus */
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -470,6 +470,8 @@ int __init iommu_setup(void)
>          rc = iommu_hardware_setup();
>          iommu_enabled = (rc == 0);
>      }
> +    if ( !iommu_enabled )
> +        iommu_intremap = 0;
>  
>      if ( (force_iommu && !iommu_enabled) ||
>           (force_intremap && !iommu_intremap) )
> @@ -487,6 +489,7 @@ int __init iommu_setup(void)
>          printk(" - Dom0 mode: %s\n",
>                 iommu_passthrough ? "Passthrough" :
>                 iommu_dom0_strict ? "Strict" : "Relaxed");
> +    printk("Interrupt remapping %sabled\n", iommu_intremap ? "en" : "dis");
>  
>      return rc;
>  }
> --- a/xen/drivers/passthrough/vtd/intremap.c
> +++ b/xen/drivers/passthrough/vtd/intremap.c
> @@ -373,7 +373,7 @@ unsigned int io_apic_read_remap_rte(
>      struct iommu *iommu = ioapic_to_iommu(IO_APIC_ID(apic));
>      struct ir_ctrl *ir_ctrl = iommu_ir_ctrl(iommu);
>  
> -    if ( !ir_ctrl || !ir_ctrl->iremap_maddr || !ir_ctrl->iremap_num ||
+    if ( !ir_ctrl->>iremap_num ||
>          ( (index = apic_pin_2_ir_idx[apic][ioapic_pin]) < 0 ) )
>          return __io_apic_read(apic, reg);
>  
> @@ -396,15 +396,8 @@ void io_apic_write_remap_rte(
>      struct IO_APIC_route_remap_entry *remap_rte;
>      unsigned int rte_upper = (reg & 1) ? 1 : 0;
>      struct iommu *iommu = ioapic_to_iommu(IO_APIC_ID(apic));
> -    struct ir_ctrl *ir_ctrl = iommu_ir_ctrl(iommu);
>      int saved_mask;
>  
> -    if ( !ir_ctrl || !ir_ctrl->iremap_maddr )
> -    {
> -        __io_apic_write(apic, reg, value);
> -        return;
> -    }
> -
>      old_rte = __ioapic_read_entry(apic, ioapic_pin, 1);
>  
>      remap_rte = (struct IO_APIC_route_remap_entry *) &old_rte;
> @@ -653,20 +646,11 @@ void msi_msg_read_remap_rte(
>  {
>      struct pci_dev *pdev = msi_desc->dev;
>      struct acpi_drhd_unit *drhd = NULL;
> -    struct iommu *iommu = NULL;
> -    struct ir_ctrl *ir_ctrl;
>  
>      drhd = pdev ? acpi_find_matched_drhd_unit(pdev)
>                  : hpet_to_drhd(msi_desc->hpet_id);
> -    if ( !drhd )
> -        return;
-    iommu = drhd->>iommu;
> -
> -    ir_ctrl = iommu_ir_ctrl(iommu);
> -    if ( !ir_ctrl || !ir_ctrl->iremap_maddr )
> -        return;
> -
> -    remap_entry_to_msi_msg(iommu, msg);
> +    if ( drhd )
> +        remap_entry_to_msi_msg(drhd->iommu, msg);
>  }
>  
>  void msi_msg_write_remap_rte(
> @@ -674,20 +658,11 @@ void msi_msg_write_remap_rte(
>  {
>      struct pci_dev *pdev = msi_desc->dev;
>      struct acpi_drhd_unit *drhd = NULL;
> -    struct iommu *iommu = NULL;
> -    struct ir_ctrl *ir_ctrl;
>  
>      drhd = pdev ? acpi_find_matched_drhd_unit(pdev)
>                  : hpet_to_drhd(msi_desc->hpet_id);
> -    if ( !drhd )
> -        return;
-    iommu = drhd->>iommu;
> -
> -    ir_ctrl = iommu_ir_ctrl(iommu);
> -    if ( !ir_ctrl || !ir_ctrl->iremap_maddr )
> -        return;
> -
> -    msi_msg_to_remap_entry(iommu, pdev, msi_desc, msg);
> +    if ( drhd )
> +        msi_msg_to_remap_entry(drhd->iommu, pdev, msi_desc, msg);
>  }
>  
>  int __init intel_setup_hpet_msi(struct msi_desc *msi_desc)
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -2065,6 +2065,9 @@ static int init_vtd_hw(void)
>                  break;
>              }
>          }
> +        if ( !iommu_intremap )
> +            for_each_drhd_unit ( drhd )
> +                disable_intremap(drhd->iommu);
>      }
>  
>      /*
> --- a/xen/include/asm-x86/io_apic.h
> +++ b/xen/include/asm-x86/io_apic.h
> @@ -129,7 +129,7 @@ struct IO_APIC_route_entry {
>  extern struct mpc_config_ioapic mp_ioapics[MAX_IO_APICS];
>  
>  /* Only need to remap ioapic RTE (reg: 10~3Fh) */
> -#define ioapic_reg_remapped(reg) (iommu_enabled && ((reg) >= 0x10))
> +#define ioapic_reg_remapped(reg) (iommu_intremap && ((reg) >= 0x10))
>  
>  static inline unsigned int __io_apic_read(unsigned int apic, unsigned int reg)
>  {

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/2] AMD IOMMU: XSA-36 follow ups
  2013-02-06 13:04 Jan Beulich
@ 2013-02-08  9:58 ` Jan Beulich
  2013-02-08 13:53   ` Sander Eikelenboom
  2013-03-01 18:45   ` Malcolm Crossley
  2013-02-11 13:12 ` Ian Campbell
  1 sibling, 2 replies; 11+ messages in thread
From: Jan Beulich @ 2013-02-08  9:58 UTC (permalink / raw)
  To: xen-devel; +Cc: Boris Ostrovsky, xiantao.zhang, Sherry Hurwitz

[-- Attachment #1: Type: text/plain, Size: 9405 bytes --]

>>> On 06.02.13 at 14:04, "Jan Beulich" <JBeulich@suse.com> wrote:
> A regression was reported on a class of broken firmware that c/s
> 26517:601139e2b0db didn't consider, leading to a boot time crash.

After some more thought on this and the comments we got
regarding disabling the IOMMU in this situation altogether making
things worse instead of better, I came to the conclusion that we
can actually restrict the action in affected cases to just disabling
interrupt remapping. That doesn't make the situation worse than
prior to the XSA-36 fixes (where interrupt remapping didn't really
protect domains from one another), but allows at least DMA
isolation to still be utilized. Patch 3/2 below/attached.

Jan

AMD IOMMU: only disable interrupt remapping when certain IVRS consistency checks fail

After some more thought on the XSA-36 and specifically the comments we
got regarding disabling the IOMMU in this situation altogether making
things worse instead of better, I came to the conclusion that we can
actually restrict the action in affected cases to just disabling
interrupt remapping. That doesn't make the situation worse than prior
to the XSA-36 fixes (where interrupt remapping didn't really protect
domains from one another), but allows at least DMA isolation to still
be utilized.

In the course of this, the calls to the interrupt remapping related
IOMMU implementation specific operations needed to be re-qualified
from depending on iommu_enabled to iommu_intremap (as was already the
case for x86's HPET code).

That in turn required to make sure iommu_intremap gets properly
cleared when the respective initialization fails (or isn't being
done at all).

Along with making sure interrupt remapping doesn't get inconsistently
enabled on some IOMMUs and not on others in the VT-d code, this in turn
allowed quite a bit of cleanup on the VT-d side (if desired, that
cleanup could of course be broken out into a separate patch).

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/msi.c
+++ b/xen/arch/x86/msi.c
@@ -207,7 +207,7 @@ static void read_msi_msg(struct msi_desc
         BUG();
     }
 
-    if ( iommu_enabled )
+    if ( iommu_intremap )
         iommu_read_msi_from_ire(entry, msg);
 }
 
@@ -215,7 +215,7 @@ static void write_msi_msg(struct msi_des
 {
     entry->msg = *msg;
 
-    if ( iommu_enabled )
+    if ( iommu_intremap )
     {
         ASSERT(msg != &entry->msg);
         iommu_update_ire_from_msi(entry, msg);
@@ -489,7 +489,7 @@ int msi_free_irq(struct msi_desc *entry)
     }
 
     /* Free the unused IRTE if intr remap enabled */
-    if ( iommu_enabled )
+    if ( iommu_intremap )
         iommu_update_ire_from_msi(entry, NULL);
 
     list_del(&entry->list);
--- a/xen/drivers/passthrough/amd/iommu_acpi.c
+++ b/xen/drivers/passthrough/amd/iommu_acpi.c
@@ -682,10 +682,10 @@ static u16 __init parse_ivhd_device_spec
                     printk(XENLOG_ERR "IVHD Error: Conflicting IO-APIC %#x entries\n",
                            special->handle);
                     if ( amd_iommu_perdev_intremap )
-                        return 0;
+                        iommu_intremap = 0;
                 }
             }
-            else
+            else if ( iommu_intremap )
             {
                 /* set device id of ioapic */
                 ioapic_sbdf[special->handle].bdf = bdf;
@@ -697,7 +697,7 @@ static u16 __init parse_ivhd_device_spec
                      !ioapic_sbdf[IO_APIC_ID(apic)].pin_setup )
                 {
                     printk(XENLOG_ERR "IVHD Error: Out of memory\n");
-                    return 0;
+                    iommu_intremap = 0;
                 }
             }
             break;
@@ -706,7 +706,7 @@ static u16 __init parse_ivhd_device_spec
         {
             printk(XENLOG_ERR "IVHD Error: Invalid IO-APIC %#x\n",
                    special->handle);
-            return 0;
+            iommu_intremap = 0;
         }
         break;
     case ACPI_IVHD_HPET:
@@ -932,7 +932,7 @@ static int __init parse_ivrs_table(struc
         printk(XENLOG_ERR "IVHD Error: no information for IO-APIC %#x\n",
                IO_APIC_ID(apic));
         if ( amd_iommu_perdev_intremap )
-            error = -ENXIO;
+            iommu_intremap = 0;
         else
         {
             ioapic_sbdf[IO_APIC_ID(apic)].pin_setup = xzalloc_array(
@@ -940,7 +940,7 @@ static int __init parse_ivrs_table(struc
             if ( !ioapic_sbdf[IO_APIC_ID(apic)].pin_setup )
             {
                 printk(XENLOG_ERR "IVHD Error: Out of memory\n");
-                error = -ENOMEM;
+                iommu_intremap = 0;
             }
         }
     }
--- a/xen/drivers/passthrough/amd/iommu_init.c
+++ b/xen/drivers/passthrough/amd/iommu_init.c
@@ -1157,7 +1157,7 @@ int __init amd_iommu_init(void)
     BUG_ON( !iommu_found() );
 
     if ( amd_iommu_perdev_intremap && amd_sp5100_erratum28() )
-        goto error_out;
+        iommu_intremap = 0;
 
     ivrs_bdf_entries = amd_iommu_get_ivrs_dev_entries();
 
@@ -1173,7 +1173,7 @@ int __init amd_iommu_init(void)
         goto error_out;
 
     /* initialize io-apic interrupt remapping entries */
-    if ( amd_iommu_setup_ioapic_remapping() != 0 )
+    if ( iommu_intremap && amd_iommu_setup_ioapic_remapping() != 0 )
         goto error_out;
 
     /* allocate and initialize a global device table shared by all iommus */
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -470,6 +470,8 @@ int __init iommu_setup(void)
         rc = iommu_hardware_setup();
         iommu_enabled = (rc == 0);
     }
+    if ( !iommu_enabled )
+        iommu_intremap = 0;
 
     if ( (force_iommu && !iommu_enabled) ||
          (force_intremap && !iommu_intremap) )
@@ -487,6 +489,7 @@ int __init iommu_setup(void)
         printk(" - Dom0 mode: %s\n",
                iommu_passthrough ? "Passthrough" :
                iommu_dom0_strict ? "Strict" : "Relaxed");
+    printk("Interrupt remapping %sabled\n", iommu_intremap ? "en" : "dis");
 
     return rc;
 }
--- a/xen/drivers/passthrough/vtd/intremap.c
+++ b/xen/drivers/passthrough/vtd/intremap.c
@@ -373,7 +373,7 @@ unsigned int io_apic_read_remap_rte(
     struct iommu *iommu = ioapic_to_iommu(IO_APIC_ID(apic));
     struct ir_ctrl *ir_ctrl = iommu_ir_ctrl(iommu);
 
-    if ( !ir_ctrl || !ir_ctrl->iremap_maddr || !ir_ctrl->iremap_num ||
+    if ( !ir_ctrl->iremap_num ||
         ( (index = apic_pin_2_ir_idx[apic][ioapic_pin]) < 0 ) )
         return __io_apic_read(apic, reg);
 
@@ -396,15 +396,8 @@ void io_apic_write_remap_rte(
     struct IO_APIC_route_remap_entry *remap_rte;
     unsigned int rte_upper = (reg & 1) ? 1 : 0;
     struct iommu *iommu = ioapic_to_iommu(IO_APIC_ID(apic));
-    struct ir_ctrl *ir_ctrl = iommu_ir_ctrl(iommu);
     int saved_mask;
 
-    if ( !ir_ctrl || !ir_ctrl->iremap_maddr )
-    {
-        __io_apic_write(apic, reg, value);
-        return;
-    }
-
     old_rte = __ioapic_read_entry(apic, ioapic_pin, 1);
 
     remap_rte = (struct IO_APIC_route_remap_entry *) &old_rte;
@@ -653,20 +646,11 @@ void msi_msg_read_remap_rte(
 {
     struct pci_dev *pdev = msi_desc->dev;
     struct acpi_drhd_unit *drhd = NULL;
-    struct iommu *iommu = NULL;
-    struct ir_ctrl *ir_ctrl;
 
     drhd = pdev ? acpi_find_matched_drhd_unit(pdev)
                 : hpet_to_drhd(msi_desc->hpet_id);
-    if ( !drhd )
-        return;
-    iommu = drhd->iommu;
-
-    ir_ctrl = iommu_ir_ctrl(iommu);
-    if ( !ir_ctrl || !ir_ctrl->iremap_maddr )
-        return;
-
-    remap_entry_to_msi_msg(iommu, msg);
+    if ( drhd )
+        remap_entry_to_msi_msg(drhd->iommu, msg);
 }
 
 void msi_msg_write_remap_rte(
@@ -674,20 +658,11 @@ void msi_msg_write_remap_rte(
 {
     struct pci_dev *pdev = msi_desc->dev;
     struct acpi_drhd_unit *drhd = NULL;
-    struct iommu *iommu = NULL;
-    struct ir_ctrl *ir_ctrl;
 
     drhd = pdev ? acpi_find_matched_drhd_unit(pdev)
                 : hpet_to_drhd(msi_desc->hpet_id);
-    if ( !drhd )
-        return;
-    iommu = drhd->iommu;
-
-    ir_ctrl = iommu_ir_ctrl(iommu);
-    if ( !ir_ctrl || !ir_ctrl->iremap_maddr )
-        return;
-
-    msi_msg_to_remap_entry(iommu, pdev, msi_desc, msg);
+    if ( drhd )
+        msi_msg_to_remap_entry(drhd->iommu, pdev, msi_desc, msg);
 }
 
 int __init intel_setup_hpet_msi(struct msi_desc *msi_desc)
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -2065,6 +2065,9 @@ static int init_vtd_hw(void)
                 break;
             }
         }
+        if ( !iommu_intremap )
+            for_each_drhd_unit ( drhd )
+                disable_intremap(drhd->iommu);
     }
 
     /*
--- a/xen/include/asm-x86/io_apic.h
+++ b/xen/include/asm-x86/io_apic.h
@@ -129,7 +129,7 @@ struct IO_APIC_route_entry {
 extern struct mpc_config_ioapic mp_ioapics[MAX_IO_APICS];
 
 /* Only need to remap ioapic RTE (reg: 10~3Fh) */
-#define ioapic_reg_remapped(reg) (iommu_enabled && ((reg) >= 0x10))
+#define ioapic_reg_remapped(reg) (iommu_intremap && ((reg) >= 0x10))
 
 static inline unsigned int __io_apic_read(unsigned int apic, unsigned int reg)
 {



[-- Attachment #2: AMD-IOMMU-disable-intremap-only.patch --]
[-- Type: text/plain, Size: 8673 bytes --]

AMD IOMMU: only disable interrupt remapping when certain IVRS consistency checks fail

After some more thought on the XSA-36 and specifically the comments we
got regarding disabling the IOMMU in this situation altogether making
things worse instead of better, I came to the conclusion that we can
actually restrict the action in affected cases to just disabling
interrupt remapping. That doesn't make the situation worse than prior
to the XSA-36 fixes (where interrupt remapping didn't really protect
domains from one another), but allows at least DMA isolation to still
be utilized.

In the course of this, the calls to the interrupt remapping related
IOMMU implementation specific operations needed to be re-qualified
from depending on iommu_enabled to iommu_intremap (as was already the
case for x86's HPET code).

That in turn required to make sure iommu_intremap gets properly
cleared when the respective initialization fails (or isn't being
done at all).

Along with making sure interrupt remapping doesn't get inconsistently
enabled on some IOMMUs and not on others in the VT-d code, this in turn
allowed quite a bit of cleanup on the VT-d side (if desired, that
cleanup could of course be broken out into a separate patch).

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/msi.c
+++ b/xen/arch/x86/msi.c
@@ -207,7 +207,7 @@ static void read_msi_msg(struct msi_desc
         BUG();
     }
 
-    if ( iommu_enabled )
+    if ( iommu_intremap )
         iommu_read_msi_from_ire(entry, msg);
 }
 
@@ -215,7 +215,7 @@ static void write_msi_msg(struct msi_des
 {
     entry->msg = *msg;
 
-    if ( iommu_enabled )
+    if ( iommu_intremap )
     {
         ASSERT(msg != &entry->msg);
         iommu_update_ire_from_msi(entry, msg);
@@ -489,7 +489,7 @@ int msi_free_irq(struct msi_desc *entry)
     }
 
     /* Free the unused IRTE if intr remap enabled */
-    if ( iommu_enabled )
+    if ( iommu_intremap )
         iommu_update_ire_from_msi(entry, NULL);
 
     list_del(&entry->list);
--- a/xen/drivers/passthrough/amd/iommu_acpi.c
+++ b/xen/drivers/passthrough/amd/iommu_acpi.c
@@ -682,10 +682,10 @@ static u16 __init parse_ivhd_device_spec
                     printk(XENLOG_ERR "IVHD Error: Conflicting IO-APIC %#x entries\n",
                            special->handle);
                     if ( amd_iommu_perdev_intremap )
-                        return 0;
+                        iommu_intremap = 0;
                 }
             }
-            else
+            else if ( iommu_intremap )
             {
                 /* set device id of ioapic */
                 ioapic_sbdf[special->handle].bdf = bdf;
@@ -697,7 +697,7 @@ static u16 __init parse_ivhd_device_spec
                      !ioapic_sbdf[IO_APIC_ID(apic)].pin_setup )
                 {
                     printk(XENLOG_ERR "IVHD Error: Out of memory\n");
-                    return 0;
+                    iommu_intremap = 0;
                 }
             }
             break;
@@ -706,7 +706,7 @@ static u16 __init parse_ivhd_device_spec
         {
             printk(XENLOG_ERR "IVHD Error: Invalid IO-APIC %#x\n",
                    special->handle);
-            return 0;
+            iommu_intremap = 0;
         }
         break;
     case ACPI_IVHD_HPET:
@@ -932,7 +932,7 @@ static int __init parse_ivrs_table(struc
         printk(XENLOG_ERR "IVHD Error: no information for IO-APIC %#x\n",
                IO_APIC_ID(apic));
         if ( amd_iommu_perdev_intremap )
-            error = -ENXIO;
+            iommu_intremap = 0;
         else
         {
             ioapic_sbdf[IO_APIC_ID(apic)].pin_setup = xzalloc_array(
@@ -940,7 +940,7 @@ static int __init parse_ivrs_table(struc
             if ( !ioapic_sbdf[IO_APIC_ID(apic)].pin_setup )
             {
                 printk(XENLOG_ERR "IVHD Error: Out of memory\n");
-                error = -ENOMEM;
+                iommu_intremap = 0;
             }
         }
     }
--- a/xen/drivers/passthrough/amd/iommu_init.c
+++ b/xen/drivers/passthrough/amd/iommu_init.c
@@ -1157,7 +1157,7 @@ int __init amd_iommu_init(void)
     BUG_ON( !iommu_found() );
 
     if ( amd_iommu_perdev_intremap && amd_sp5100_erratum28() )
-        goto error_out;
+        iommu_intremap = 0;
 
     ivrs_bdf_entries = amd_iommu_get_ivrs_dev_entries();
 
@@ -1173,7 +1173,7 @@ int __init amd_iommu_init(void)
         goto error_out;
 
     /* initialize io-apic interrupt remapping entries */
-    if ( amd_iommu_setup_ioapic_remapping() != 0 )
+    if ( iommu_intremap && amd_iommu_setup_ioapic_remapping() != 0 )
         goto error_out;
 
     /* allocate and initialize a global device table shared by all iommus */
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -470,6 +470,8 @@ int __init iommu_setup(void)
         rc = iommu_hardware_setup();
         iommu_enabled = (rc == 0);
     }
+    if ( !iommu_enabled )
+        iommu_intremap = 0;
 
     if ( (force_iommu && !iommu_enabled) ||
          (force_intremap && !iommu_intremap) )
@@ -487,6 +489,7 @@ int __init iommu_setup(void)
         printk(" - Dom0 mode: %s\n",
                iommu_passthrough ? "Passthrough" :
                iommu_dom0_strict ? "Strict" : "Relaxed");
+    printk("Interrupt remapping %sabled\n", iommu_intremap ? "en" : "dis");
 
     return rc;
 }
--- a/xen/drivers/passthrough/vtd/intremap.c
+++ b/xen/drivers/passthrough/vtd/intremap.c
@@ -373,7 +373,7 @@ unsigned int io_apic_read_remap_rte(
     struct iommu *iommu = ioapic_to_iommu(IO_APIC_ID(apic));
     struct ir_ctrl *ir_ctrl = iommu_ir_ctrl(iommu);
 
-    if ( !ir_ctrl || !ir_ctrl->iremap_maddr || !ir_ctrl->iremap_num ||
+    if ( !ir_ctrl->iremap_num ||
         ( (index = apic_pin_2_ir_idx[apic][ioapic_pin]) < 0 ) )
         return __io_apic_read(apic, reg);
 
@@ -396,15 +396,8 @@ void io_apic_write_remap_rte(
     struct IO_APIC_route_remap_entry *remap_rte;
     unsigned int rte_upper = (reg & 1) ? 1 : 0;
     struct iommu *iommu = ioapic_to_iommu(IO_APIC_ID(apic));
-    struct ir_ctrl *ir_ctrl = iommu_ir_ctrl(iommu);
     int saved_mask;
 
-    if ( !ir_ctrl || !ir_ctrl->iremap_maddr )
-    {
-        __io_apic_write(apic, reg, value);
-        return;
-    }
-
     old_rte = __ioapic_read_entry(apic, ioapic_pin, 1);
 
     remap_rte = (struct IO_APIC_route_remap_entry *) &old_rte;
@@ -653,20 +646,11 @@ void msi_msg_read_remap_rte(
 {
     struct pci_dev *pdev = msi_desc->dev;
     struct acpi_drhd_unit *drhd = NULL;
-    struct iommu *iommu = NULL;
-    struct ir_ctrl *ir_ctrl;
 
     drhd = pdev ? acpi_find_matched_drhd_unit(pdev)
                 : hpet_to_drhd(msi_desc->hpet_id);
-    if ( !drhd )
-        return;
-    iommu = drhd->iommu;
-
-    ir_ctrl = iommu_ir_ctrl(iommu);
-    if ( !ir_ctrl || !ir_ctrl->iremap_maddr )
-        return;
-
-    remap_entry_to_msi_msg(iommu, msg);
+    if ( drhd )
+        remap_entry_to_msi_msg(drhd->iommu, msg);
 }
 
 void msi_msg_write_remap_rte(
@@ -674,20 +658,11 @@ void msi_msg_write_remap_rte(
 {
     struct pci_dev *pdev = msi_desc->dev;
     struct acpi_drhd_unit *drhd = NULL;
-    struct iommu *iommu = NULL;
-    struct ir_ctrl *ir_ctrl;
 
     drhd = pdev ? acpi_find_matched_drhd_unit(pdev)
                 : hpet_to_drhd(msi_desc->hpet_id);
-    if ( !drhd )
-        return;
-    iommu = drhd->iommu;
-
-    ir_ctrl = iommu_ir_ctrl(iommu);
-    if ( !ir_ctrl || !ir_ctrl->iremap_maddr )
-        return;
-
-    msi_msg_to_remap_entry(iommu, pdev, msi_desc, msg);
+    if ( drhd )
+        msi_msg_to_remap_entry(drhd->iommu, pdev, msi_desc, msg);
 }
 
 int __init intel_setup_hpet_msi(struct msi_desc *msi_desc)
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -2065,6 +2065,9 @@ static int init_vtd_hw(void)
                 break;
             }
         }
+        if ( !iommu_intremap )
+            for_each_drhd_unit ( drhd )
+                disable_intremap(drhd->iommu);
     }
 
     /*
--- a/xen/include/asm-x86/io_apic.h
+++ b/xen/include/asm-x86/io_apic.h
@@ -129,7 +129,7 @@ struct IO_APIC_route_entry {
 extern struct mpc_config_ioapic mp_ioapics[MAX_IO_APICS];
 
 /* Only need to remap ioapic RTE (reg: 10~3Fh) */
-#define ioapic_reg_remapped(reg) (iommu_enabled && ((reg) >= 0x10))
+#define ioapic_reg_remapped(reg) (iommu_intremap && ((reg) >= 0x10))
 
 static inline unsigned int __io_apic_read(unsigned int apic, unsigned int reg)
 {

[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 0/2] AMD IOMMU: XSA-36 follow ups
@ 2013-02-06 13:04 Jan Beulich
  2013-02-08  9:58 ` Jan Beulich
  2013-02-11 13:12 ` Ian Campbell
  0 siblings, 2 replies; 11+ messages in thread
From: Jan Beulich @ 2013-02-06 13:04 UTC (permalink / raw)
  To: xen-devel; +Cc: Boris Ostrovsky, Sherry Hurwitz

A regression was reported on a class of broken firmware that c/s
26517:601139e2b0db didn't consider, leading to a boot time crash.

Further, the phantom function support committed earlier is still
lacking AMD side code (during the composition of which the
security issue was noticed, and hence the patch had to be held
back.

1: also spot missing IO-APIC entries in IVRS table
2: handle MSI for phantom functions

Signed-off-by: Jan Beulich <jbeulich@suse.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2013-03-04  9:48 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-02-08 14:29 [PATCH 0/2] AMD IOMMU: XSA-36 follow ups Boris Ostrovsky
2013-02-08 16:48 ` Jan Beulich
2013-02-08 17:10   ` Sander Eikelenboom
2013-02-11 11:46   ` Ian Campbell
  -- strict thread matches above, loose matches on Subject: below --
2013-02-06 13:04 Jan Beulich
2013-02-08  9:58 ` Jan Beulich
2013-02-08 13:53   ` Sander Eikelenboom
2013-03-01 18:45   ` Malcolm Crossley
2013-03-04  9:48     ` Jan Beulich
2013-02-11 13:12 ` Ian Campbell
2013-02-12  8:45   ` Jan Beulich

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.