* [PATCH 0/2] iommu: amd: Fix intremap IO_PAGE_FAULT for VMs @ 2020-09-02 4:51 ` Suravee Suthikulpanit 0 siblings, 0 replies; 12+ messages in thread From: Suravee Suthikulpanit @ 2020-09-02 4:51 UTC (permalink / raw) To: linux-kernel, iommu Cc: joro, sean.m.osborne, james.puthukattukaran, joao.m.martins, boris.ostrovsky, jon.grimm, Suravee Suthikulpanit Interrupt remapping IO_PAGE_FAULT has been observed under system w/ large number of VMs w/ pass-through devices. This can be reproduced with 64 VMs + 64 pass-through VFs of Mellanox MT28800 Family [ConnectX-5 Ex], where each VM runs small-packet netperf test via the pass-through device to the netserver running on the host. All VMs are running in reboot loop, to trigger IRTE updates. In addition, to accelerate the failure, irqbalance is triggered periodically (e.g. 1-5 sec), which should generate large amount of updates to IRTE. This setup generally triggers IO_PAGE_FAULT within 3-4 hours. Investigation has shown that the issue is in the code to update IRTE while remapping is enabled. Please see patch 2/2 for detail discussion. This serires has been tested running in the setup mentioned above upto 96 hours w/o seeing issues. Thanks, Suravee Suravee Suthikulpanit (2): iommu: amd: Restore IRTE.RemapEn bit after programming IRTE iommu: amd: Use cmpxchg_double() when updating 128-bit IRTE drivers/iommu/amd/Kconfig | 2 +- drivers/iommu/amd/init.c | 21 +++++++++++++++++++-- drivers/iommu/amd/iommu.c | 19 +++++++++++++++---- 3 files changed, 35 insertions(+), 7 deletions(-) -- 2.17.1 ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 0/2] iommu: amd: Fix intremap IO_PAGE_FAULT for VMs @ 2020-09-02 4:51 ` Suravee Suthikulpanit 0 siblings, 0 replies; 12+ messages in thread From: Suravee Suthikulpanit @ 2020-09-02 4:51 UTC (permalink / raw) To: linux-kernel, iommu Cc: jon.grimm, james.puthukattukaran, boris.ostrovsky, joao.m.martins, sean.m.osborne Interrupt remapping IO_PAGE_FAULT has been observed under system w/ large number of VMs w/ pass-through devices. This can be reproduced with 64 VMs + 64 pass-through VFs of Mellanox MT28800 Family [ConnectX-5 Ex], where each VM runs small-packet netperf test via the pass-through device to the netserver running on the host. All VMs are running in reboot loop, to trigger IRTE updates. In addition, to accelerate the failure, irqbalance is triggered periodically (e.g. 1-5 sec), which should generate large amount of updates to IRTE. This setup generally triggers IO_PAGE_FAULT within 3-4 hours. Investigation has shown that the issue is in the code to update IRTE while remapping is enabled. Please see patch 2/2 for detail discussion. This serires has been tested running in the setup mentioned above upto 96 hours w/o seeing issues. Thanks, Suravee Suravee Suthikulpanit (2): iommu: amd: Restore IRTE.RemapEn bit after programming IRTE iommu: amd: Use cmpxchg_double() when updating 128-bit IRTE drivers/iommu/amd/Kconfig | 2 +- drivers/iommu/amd/init.c | 21 +++++++++++++++++++-- drivers/iommu/amd/iommu.c | 19 +++++++++++++++---- 3 files changed, 35 insertions(+), 7 deletions(-) -- 2.17.1 _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/2] iommu: amd: Restore IRTE.RemapEn bit after programming IRTE 2020-09-02 4:51 ` Suravee Suthikulpanit @ 2020-09-02 4:51 ` Suravee Suthikulpanit -1 siblings, 0 replies; 12+ messages in thread From: Suravee Suthikulpanit @ 2020-09-02 4:51 UTC (permalink / raw) To: linux-kernel, iommu Cc: joro, sean.m.osborne, james.puthukattukaran, joao.m.martins, boris.ostrovsky, jon.grimm, Suravee Suthikulpanit Currently, the RemapEn (valid) bit is accidentally cleared when programming IRTE w/ guestMode=0. It should be restored to the prior state. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> --- drivers/iommu/amd/iommu.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index ba9f3dbc5b94..967f4e96d1eb 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -3850,6 +3850,7 @@ int amd_iommu_deactivate_guest_mode(void *data) struct amd_ir_data *ir_data = (struct amd_ir_data *)data; struct irte_ga *entry = (struct irte_ga *) ir_data->entry; struct irq_cfg *cfg = ir_data->cfg; + u64 valid = entry->lo.fields_remap.valid; if (!AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir) || !entry || !entry->lo.fields_vapic.guest_mode) @@ -3858,6 +3859,7 @@ int amd_iommu_deactivate_guest_mode(void *data) entry->lo.val = 0; entry->hi.val = 0; + entry->lo.fields_remap.valid = valid; entry->lo.fields_remap.dm = apic->irq_dest_mode; entry->lo.fields_remap.int_type = apic->irq_delivery_mode; entry->hi.fields.vector = cfg->vector; -- 2.17.1 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 1/2] iommu: amd: Restore IRTE.RemapEn bit after programming IRTE @ 2020-09-02 4:51 ` Suravee Suthikulpanit 0 siblings, 0 replies; 12+ messages in thread From: Suravee Suthikulpanit @ 2020-09-02 4:51 UTC (permalink / raw) To: linux-kernel, iommu Cc: jon.grimm, james.puthukattukaran, boris.ostrovsky, joao.m.martins, sean.m.osborne Currently, the RemapEn (valid) bit is accidentally cleared when programming IRTE w/ guestMode=0. It should be restored to the prior state. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> --- drivers/iommu/amd/iommu.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index ba9f3dbc5b94..967f4e96d1eb 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -3850,6 +3850,7 @@ int amd_iommu_deactivate_guest_mode(void *data) struct amd_ir_data *ir_data = (struct amd_ir_data *)data; struct irte_ga *entry = (struct irte_ga *) ir_data->entry; struct irq_cfg *cfg = ir_data->cfg; + u64 valid = entry->lo.fields_remap.valid; if (!AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir) || !entry || !entry->lo.fields_vapic.guest_mode) @@ -3858,6 +3859,7 @@ int amd_iommu_deactivate_guest_mode(void *data) entry->lo.val = 0; entry->hi.val = 0; + entry->lo.fields_remap.valid = valid; entry->lo.fields_remap.dm = apic->irq_dest_mode; entry->lo.fields_remap.int_type = apic->irq_delivery_mode; entry->hi.fields.vector = cfg->vector; -- 2.17.1 _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] iommu: amd: Restore IRTE.RemapEn bit after programming IRTE 2020-09-02 4:51 ` Suravee Suthikulpanit @ 2020-09-02 15:26 ` Joao Martins -1 siblings, 0 replies; 12+ messages in thread From: Joao Martins @ 2020-09-02 15:26 UTC (permalink / raw) To: Suravee Suthikulpanit Cc: linux-kernel, iommu, joro, sean.m.osborne, james.puthukattukaran, boris.ostrovsky, jon.grimm On 9/2/20 5:51 AM, Suravee Suthikulpanit wrote: > Currently, the RemapEn (valid) bit is accidentally cleared when > programming IRTE w/ guestMode=0. It should be restored to > the prior state. > Probably requires: Fixes: b9fc6b56f478 ("iommu/amd: Implements irq_set_vcpu_affinity() hook to setup vapic mode for pass-through devices") ? > Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> FWIW, Reviewed-by: Joao Martins <joao.m.martins@oracle.com> ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] iommu: amd: Restore IRTE.RemapEn bit after programming IRTE @ 2020-09-02 15:26 ` Joao Martins 0 siblings, 0 replies; 12+ messages in thread From: Joao Martins @ 2020-09-02 15:26 UTC (permalink / raw) To: Suravee Suthikulpanit Cc: jon.grimm, linux-kernel, iommu, james.puthukattukaran, boris.ostrovsky, sean.m.osborne On 9/2/20 5:51 AM, Suravee Suthikulpanit wrote: > Currently, the RemapEn (valid) bit is accidentally cleared when > programming IRTE w/ guestMode=0. It should be restored to > the prior state. > Probably requires: Fixes: b9fc6b56f478 ("iommu/amd: Implements irq_set_vcpu_affinity() hook to setup vapic mode for pass-through devices") ? > Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> FWIW, Reviewed-by: Joao Martins <joao.m.martins@oracle.com> _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 2/2] iommu: amd: Use cmpxchg_double() when updating 128-bit IRTE 2020-09-02 4:51 ` Suravee Suthikulpanit @ 2020-09-02 4:51 ` Suravee Suthikulpanit -1 siblings, 0 replies; 12+ messages in thread From: Suravee Suthikulpanit @ 2020-09-02 4:51 UTC (permalink / raw) To: linux-kernel, iommu Cc: joro, sean.m.osborne, james.puthukattukaran, joao.m.martins, boris.ostrovsky, jon.grimm, Suravee Suthikulpanit When using 128-bit interrupt-remapping table entry (IRTE) (a.k.a GA mode), current driver disables interrupt remapping when it updates the IRTE so that the upper and lower 64-bit values can be updated safely. However, this creates a small window, where the interrupt could arrive and result in IO_PAGE_FAULT (for interrupt) as shown below. IOMMU Driver Device IRQ ============ =========== irte.RemapEn=0 ... change IRTE IRQ from device ==> IO_PAGE_FAULT !! ... irte.RemapEn=1 This scenario has been observed when changing irq affinity on a system running I/O-intensive workload, in which the destination APIC ID in the IRTE is updated. Instead, use cmpxchg_double() to update the 128-bit IRTE at once without disabling the interrupt remapping. However, this means several features, which require GA (128-bit IRTE) support will also be affected if cmpxchg16b is not supported (which is unprecedented for AMD processors w/ IOMMU). Reported-by: Sean Osborne <sean.m.osborne@oracle.com> Tested-by: Erik Rockstrom <erik.rockstrom@oracle.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> --- drivers/iommu/amd/Kconfig | 2 +- drivers/iommu/amd/init.c | 21 +++++++++++++++++++-- drivers/iommu/amd/iommu.c | 17 +++++++++++++---- 3 files changed, 33 insertions(+), 7 deletions(-) diff --git a/drivers/iommu/amd/Kconfig b/drivers/iommu/amd/Kconfig index 1f061d91e0b8..626b97d0dd21 100644 --- a/drivers/iommu/amd/Kconfig +++ b/drivers/iommu/amd/Kconfig @@ -10,7 +10,7 @@ config AMD_IOMMU select IOMMU_API select IOMMU_IOVA select IOMMU_DMA - depends on X86_64 && PCI && ACPI + depends on X86_64 && PCI && ACPI && HAVE_CMPXCHG_DOUBLE help With this option you can enable support for AMD IOMMU hardware in your system. An IOMMU is a hardware component which provides diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index c652f16eb702..ad30467f6930 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -1511,7 +1511,14 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h) iommu->mmio_phys_end = MMIO_REG_END_OFFSET; else iommu->mmio_phys_end = MMIO_CNTR_CONF_OFFSET; - if (((h->efr_attr & (0x1 << IOMMU_FEAT_GASUP_SHIFT)) == 0)) + + /* + * Note: GA (128-bit IRTE) mode requires cmpxchg16b supports. + * GAM also requires GA mode. Therefore, we need to + * check cmbxchg16b support before enabling it. + */ + if (!boot_cpu_has(X86_FEATURE_CX16) || + ((h->efr_attr & (0x1 << IOMMU_FEAT_GASUP_SHIFT)) == 0)) amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY; break; case 0x11: @@ -1520,8 +1527,18 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h) iommu->mmio_phys_end = MMIO_REG_END_OFFSET; else iommu->mmio_phys_end = MMIO_CNTR_CONF_OFFSET; - if (((h->efr_reg & (0x1 << IOMMU_EFR_GASUP_SHIFT)) == 0)) + + /* + * Note: GA (128-bit IRTE) mode requires cmpxchg16b supports. + * XT, GAM also requires GA mode. Therefore, we need to + * check cmbxchg16b support before enabling them. + */ + if (boot_cpu_has(X86_FEATURE_CX16) || + ((h->efr_reg & (0x1 << IOMMU_EFR_GASUP_SHIFT)) == 0)) { amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY; + break; + } + /* * Note: Since iommu_update_intcapxt() leverages * the IOMMU MMIO access to MSI capability block registers diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index 967f4e96d1eb..a382d7a73eaa 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -3292,6 +3292,7 @@ static int alloc_irq_index(u16 devid, int count, bool align, static int modify_irte_ga(u16 devid, int index, struct irte_ga *irte, struct amd_ir_data *data) { + bool ret; struct irq_remap_table *table; struct amd_iommu *iommu; unsigned long flags; @@ -3309,10 +3310,18 @@ static int modify_irte_ga(u16 devid, int index, struct irte_ga *irte, entry = (struct irte_ga *)table->table; entry = &entry[index]; - entry->lo.fields_remap.valid = 0; - entry->hi.val = irte->hi.val; - entry->lo.val = irte->lo.val; - entry->lo.fields_remap.valid = 1; + + ret = cmpxchg_double(&entry->lo.val, &entry->hi.val, + entry->lo.val, entry->hi.val, + irte->lo.val, irte->hi.val); + /* + * We use cmpxchg16 to atomically update the 128-bit IRTE, + * and it cannot be updated by the hardware or other processors + * behind us, so the return value of cmpxchg16 should be the + * same as the old value. + */ + WARN_ON(!ret); + if (data) data->ref = entry; -- 2.17.1 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 2/2] iommu: amd: Use cmpxchg_double() when updating 128-bit IRTE @ 2020-09-02 4:51 ` Suravee Suthikulpanit 0 siblings, 0 replies; 12+ messages in thread From: Suravee Suthikulpanit @ 2020-09-02 4:51 UTC (permalink / raw) To: linux-kernel, iommu Cc: jon.grimm, james.puthukattukaran, boris.ostrovsky, joao.m.martins, sean.m.osborne When using 128-bit interrupt-remapping table entry (IRTE) (a.k.a GA mode), current driver disables interrupt remapping when it updates the IRTE so that the upper and lower 64-bit values can be updated safely. However, this creates a small window, where the interrupt could arrive and result in IO_PAGE_FAULT (for interrupt) as shown below. IOMMU Driver Device IRQ ============ =========== irte.RemapEn=0 ... change IRTE IRQ from device ==> IO_PAGE_FAULT !! ... irte.RemapEn=1 This scenario has been observed when changing irq affinity on a system running I/O-intensive workload, in which the destination APIC ID in the IRTE is updated. Instead, use cmpxchg_double() to update the 128-bit IRTE at once without disabling the interrupt remapping. However, this means several features, which require GA (128-bit IRTE) support will also be affected if cmpxchg16b is not supported (which is unprecedented for AMD processors w/ IOMMU). Reported-by: Sean Osborne <sean.m.osborne@oracle.com> Tested-by: Erik Rockstrom <erik.rockstrom@oracle.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> --- drivers/iommu/amd/Kconfig | 2 +- drivers/iommu/amd/init.c | 21 +++++++++++++++++++-- drivers/iommu/amd/iommu.c | 17 +++++++++++++---- 3 files changed, 33 insertions(+), 7 deletions(-) diff --git a/drivers/iommu/amd/Kconfig b/drivers/iommu/amd/Kconfig index 1f061d91e0b8..626b97d0dd21 100644 --- a/drivers/iommu/amd/Kconfig +++ b/drivers/iommu/amd/Kconfig @@ -10,7 +10,7 @@ config AMD_IOMMU select IOMMU_API select IOMMU_IOVA select IOMMU_DMA - depends on X86_64 && PCI && ACPI + depends on X86_64 && PCI && ACPI && HAVE_CMPXCHG_DOUBLE help With this option you can enable support for AMD IOMMU hardware in your system. An IOMMU is a hardware component which provides diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index c652f16eb702..ad30467f6930 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -1511,7 +1511,14 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h) iommu->mmio_phys_end = MMIO_REG_END_OFFSET; else iommu->mmio_phys_end = MMIO_CNTR_CONF_OFFSET; - if (((h->efr_attr & (0x1 << IOMMU_FEAT_GASUP_SHIFT)) == 0)) + + /* + * Note: GA (128-bit IRTE) mode requires cmpxchg16b supports. + * GAM also requires GA mode. Therefore, we need to + * check cmbxchg16b support before enabling it. + */ + if (!boot_cpu_has(X86_FEATURE_CX16) || + ((h->efr_attr & (0x1 << IOMMU_FEAT_GASUP_SHIFT)) == 0)) amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY; break; case 0x11: @@ -1520,8 +1527,18 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h) iommu->mmio_phys_end = MMIO_REG_END_OFFSET; else iommu->mmio_phys_end = MMIO_CNTR_CONF_OFFSET; - if (((h->efr_reg & (0x1 << IOMMU_EFR_GASUP_SHIFT)) == 0)) + + /* + * Note: GA (128-bit IRTE) mode requires cmpxchg16b supports. + * XT, GAM also requires GA mode. Therefore, we need to + * check cmbxchg16b support before enabling them. + */ + if (boot_cpu_has(X86_FEATURE_CX16) || + ((h->efr_reg & (0x1 << IOMMU_EFR_GASUP_SHIFT)) == 0)) { amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY; + break; + } + /* * Note: Since iommu_update_intcapxt() leverages * the IOMMU MMIO access to MSI capability block registers diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index 967f4e96d1eb..a382d7a73eaa 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -3292,6 +3292,7 @@ static int alloc_irq_index(u16 devid, int count, bool align, static int modify_irte_ga(u16 devid, int index, struct irte_ga *irte, struct amd_ir_data *data) { + bool ret; struct irq_remap_table *table; struct amd_iommu *iommu; unsigned long flags; @@ -3309,10 +3310,18 @@ static int modify_irte_ga(u16 devid, int index, struct irte_ga *irte, entry = (struct irte_ga *)table->table; entry = &entry[index]; - entry->lo.fields_remap.valid = 0; - entry->hi.val = irte->hi.val; - entry->lo.val = irte->lo.val; - entry->lo.fields_remap.valid = 1; + + ret = cmpxchg_double(&entry->lo.val, &entry->hi.val, + entry->lo.val, entry->hi.val, + irte->lo.val, irte->hi.val); + /* + * We use cmpxchg16 to atomically update the 128-bit IRTE, + * and it cannot be updated by the hardware or other processors + * behind us, so the return value of cmpxchg16 should be the + * same as the old value. + */ + WARN_ON(!ret); + if (data) data->ref = entry; -- 2.17.1 _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] iommu: amd: Use cmpxchg_double() when updating 128-bit IRTE 2020-09-02 4:51 ` Suravee Suthikulpanit @ 2020-09-02 15:26 ` Joao Martins -1 siblings, 0 replies; 12+ messages in thread From: Joao Martins @ 2020-09-02 15:26 UTC (permalink / raw) To: Suravee Suthikulpanit Cc: linux-kernel, iommu, joro, sean.m.osborne, james.puthukattukaran, boris.ostrovsky, jon.grimm On 9/2/20 5:51 AM, Suravee Suthikulpanit wrote: > When using 128-bit interrupt-remapping table entry (IRTE) (a.k.a GA mode), > current driver disables interrupt remapping when it updates the IRTE > so that the upper and lower 64-bit values can be updated safely. > > However, this creates a small window, where the interrupt could > arrive and result in IO_PAGE_FAULT (for interrupt) as shown below. > > IOMMU Driver Device IRQ > ============ =========== > irte.RemapEn=0 > ... > change IRTE IRQ from device ==> IO_PAGE_FAULT !! > ... > irte.RemapEn=1 > > This scenario has been observed when changing irq affinity on a system > running I/O-intensive workload, in which the destination APIC ID > in the IRTE is updated. > > Instead, use cmpxchg_double() to update the 128-bit IRTE at once without > disabling the interrupt remapping. However, this means several features, > which require GA (128-bit IRTE) support will also be affected if cmpxchg16b > is not supported (which is unprecedented for AMD processors w/ IOMMU). > Probably requires: Fixes: 880ac60e2538 ("iommu/amd: Introduce interrupt remapping ops structure") ? > Reported-by: Sean Osborne <sean.m.osborne@oracle.com> > Tested-by: Erik Rockstrom <erik.rockstrom@oracle.com> > Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> With the comments below addressed, FWIW: Reviewed-by: Joao Martins <joao.m.martins@oracle.com> > diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c > index c652f16eb702..ad30467f6930 100644 > --- a/drivers/iommu/amd/init.c > +++ b/drivers/iommu/amd/init.c > @@ -1511,7 +1511,14 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h) > iommu->mmio_phys_end = MMIO_REG_END_OFFSET; > else > iommu->mmio_phys_end = MMIO_CNTR_CONF_OFFSET; > - if (((h->efr_attr & (0x1 << IOMMU_FEAT_GASUP_SHIFT)) == 0)) > + > + /* > + * Note: GA (128-bit IRTE) mode requires cmpxchg16b supports. > + * GAM also requires GA mode. Therefore, we need to > + * check cmbxchg16b support before enabling it. > + */ s/cmbxchg16b/cmpxchg16b > + if (!boot_cpu_has(X86_FEATURE_CX16) || > + ((h->efr_attr & (0x1 << IOMMU_FEAT_GASUP_SHIFT)) == 0)) > amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY; > break; > case 0x11: > @@ -1520,8 +1527,18 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h) > iommu->mmio_phys_end = MMIO_REG_END_OFFSET; > else > iommu->mmio_phys_end = MMIO_CNTR_CONF_OFFSET; > - if (((h->efr_reg & (0x1 << IOMMU_EFR_GASUP_SHIFT)) == 0)) > + > + /* > + * Note: GA (128-bit IRTE) mode requires cmpxchg16b supports. > + * XT, GAM also requires GA mode. Therefore, we need to > + * check cmbxchg16b support before enabling them. s/cmbxchg16b/cmpxchg16b > + */ > + if (boot_cpu_has(X86_FEATURE_CX16) || You probably want !boot_cpu_has(X86_FEATURE_CX16) ? > + ((h->efr_reg & (0x1 << IOMMU_EFR_GASUP_SHIFT)) == 0)) { > amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY; > + break; > + } > + > /* > * Note: Since iommu_update_intcapxt() leverages > * the IOMMU MMIO access to MSI capability block registers ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] iommu: amd: Use cmpxchg_double() when updating 128-bit IRTE @ 2020-09-02 15:26 ` Joao Martins 0 siblings, 0 replies; 12+ messages in thread From: Joao Martins @ 2020-09-02 15:26 UTC (permalink / raw) To: Suravee Suthikulpanit Cc: jon.grimm, linux-kernel, iommu, james.puthukattukaran, boris.ostrovsky, sean.m.osborne On 9/2/20 5:51 AM, Suravee Suthikulpanit wrote: > When using 128-bit interrupt-remapping table entry (IRTE) (a.k.a GA mode), > current driver disables interrupt remapping when it updates the IRTE > so that the upper and lower 64-bit values can be updated safely. > > However, this creates a small window, where the interrupt could > arrive and result in IO_PAGE_FAULT (for interrupt) as shown below. > > IOMMU Driver Device IRQ > ============ =========== > irte.RemapEn=0 > ... > change IRTE IRQ from device ==> IO_PAGE_FAULT !! > ... > irte.RemapEn=1 > > This scenario has been observed when changing irq affinity on a system > running I/O-intensive workload, in which the destination APIC ID > in the IRTE is updated. > > Instead, use cmpxchg_double() to update the 128-bit IRTE at once without > disabling the interrupt remapping. However, this means several features, > which require GA (128-bit IRTE) support will also be affected if cmpxchg16b > is not supported (which is unprecedented for AMD processors w/ IOMMU). > Probably requires: Fixes: 880ac60e2538 ("iommu/amd: Introduce interrupt remapping ops structure") ? > Reported-by: Sean Osborne <sean.m.osborne@oracle.com> > Tested-by: Erik Rockstrom <erik.rockstrom@oracle.com> > Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> With the comments below addressed, FWIW: Reviewed-by: Joao Martins <joao.m.martins@oracle.com> > diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c > index c652f16eb702..ad30467f6930 100644 > --- a/drivers/iommu/amd/init.c > +++ b/drivers/iommu/amd/init.c > @@ -1511,7 +1511,14 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h) > iommu->mmio_phys_end = MMIO_REG_END_OFFSET; > else > iommu->mmio_phys_end = MMIO_CNTR_CONF_OFFSET; > - if (((h->efr_attr & (0x1 << IOMMU_FEAT_GASUP_SHIFT)) == 0)) > + > + /* > + * Note: GA (128-bit IRTE) mode requires cmpxchg16b supports. > + * GAM also requires GA mode. Therefore, we need to > + * check cmbxchg16b support before enabling it. > + */ s/cmbxchg16b/cmpxchg16b > + if (!boot_cpu_has(X86_FEATURE_CX16) || > + ((h->efr_attr & (0x1 << IOMMU_FEAT_GASUP_SHIFT)) == 0)) > amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY; > break; > case 0x11: > @@ -1520,8 +1527,18 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h) > iommu->mmio_phys_end = MMIO_REG_END_OFFSET; > else > iommu->mmio_phys_end = MMIO_CNTR_CONF_OFFSET; > - if (((h->efr_reg & (0x1 << IOMMU_EFR_GASUP_SHIFT)) == 0)) > + > + /* > + * Note: GA (128-bit IRTE) mode requires cmpxchg16b supports. > + * XT, GAM also requires GA mode. Therefore, we need to > + * check cmbxchg16b support before enabling them. s/cmbxchg16b/cmpxchg16b > + */ > + if (boot_cpu_has(X86_FEATURE_CX16) || You probably want !boot_cpu_has(X86_FEATURE_CX16) ? > + ((h->efr_reg & (0x1 << IOMMU_EFR_GASUP_SHIFT)) == 0)) { > amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY; > + break; > + } > + > /* > * Note: Since iommu_update_intcapxt() leverages > * the IOMMU MMIO access to MSI capability block registers _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] iommu: amd: Use cmpxchg_double() when updating 128-bit IRTE 2020-09-02 15:26 ` Joao Martins @ 2020-09-03 8:14 ` Suravee Suthikulpanit -1 siblings, 0 replies; 12+ messages in thread From: Suravee Suthikulpanit @ 2020-09-03 8:14 UTC (permalink / raw) To: Joao Martins Cc: linux-kernel, iommu, joro, sean.m.osborne, james.puthukattukaran, boris.ostrovsky, jon.grimm Hi, I'll send out V2 with fixes to the review comments below ... On 9/2/20 10:26 PM, Joao Martins wrote: > On 9/2/20 5:51 AM, Suravee Suthikulpanit wrote: >> When using 128-bit interrupt-remapping table entry (IRTE) (a.k.a GA mode), >> current driver disables interrupt remapping when it updates the IRTE >> so that the upper and lower 64-bit values can be updated safely. >> >> However, this creates a small window, where the interrupt could >> arrive and result in IO_PAGE_FAULT (for interrupt) as shown below. >> >> IOMMU Driver Device IRQ >> ============ =========== >> irte.RemapEn=0 >> ... >> change IRTE IRQ from device ==> IO_PAGE_FAULT !! >> ... >> irte.RemapEn=1 >> >> This scenario has been observed when changing irq affinity on a system >> running I/O-intensive workload, in which the destination APIC ID >> in the IRTE is updated. >> >> Instead, use cmpxchg_double() to update the 128-bit IRTE at once without >> disabling the interrupt remapping. However, this means several features, >> which require GA (128-bit IRTE) support will also be affected if cmpxchg16b >> is not supported (which is unprecedented for AMD processors w/ IOMMU). >> > Probably requires: > > Fixes: 880ac60e2538 ("iommu/amd: Introduce interrupt remapping ops structure") > Yes, I will include this in V2. > >> Reported-by: Sean Osborne <sean.m.osborne@oracle.com> >> Tested-by: Erik Rockstrom <erik.rockstrom@oracle.com> >> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> > > With the comments below addressed, FWIW: > > Reviewed-by: Joao Martins <joao.m.martins@oracle.com> > >> diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c >> index c652f16eb702..ad30467f6930 100644 >> --- a/drivers/iommu/amd/init.c >> +++ b/drivers/iommu/amd/init.c >> @@ -1511,7 +1511,14 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h) >> iommu->mmio_phys_end = MMIO_REG_END_OFFSET; >> else >> iommu->mmio_phys_end = MMIO_CNTR_CONF_OFFSET; >> - if (((h->efr_attr & (0x1 << IOMMU_FEAT_GASUP_SHIFT)) == 0)) >> + >> + /* >> + * Note: GA (128-bit IRTE) mode requires cmpxchg16b supports. >> + * GAM also requires GA mode. Therefore, we need to >> + * check cmbxchg16b support before enabling it. >> + */ > > s/cmbxchg16b/cmpxchg16b > >> + if (!boot_cpu_has(X86_FEATURE_CX16) || >> + ((h->efr_attr & (0x1 << IOMMU_FEAT_GASUP_SHIFT)) == 0)) >> amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY; >> break; >> case 0x11: >> @@ -1520,8 +1527,18 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h) >> iommu->mmio_phys_end = MMIO_REG_END_OFFSET; >> else >> iommu->mmio_phys_end = MMIO_CNTR_CONF_OFFSET; >> - if (((h->efr_reg & (0x1 << IOMMU_EFR_GASUP_SHIFT)) == 0)) >> + >> + /* >> + * Note: GA (128-bit IRTE) mode requires cmpxchg16b supports. >> + * XT, GAM also requires GA mode. Therefore, we need to >> + * check cmbxchg16b support before enabling them. > > s/cmbxchg16b/cmpxchg16b > >> + */ >> + if (boot_cpu_has(X86_FEATURE_CX16) || > > You probably want !boot_cpu_has(X86_FEATURE_CX16) ? .... Ah, sorry!! I forgot to change it back after testing for the negative case. Thank you for catching this. Suravee > >> + ((h->efr_reg & (0x1 << IOMMU_EFR_GASUP_SHIFT)) == 0)) { >> amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY; >> + break; >> + } >> + >> /* >> * Note: Since iommu_update_intcapxt() leverages >> * the IOMMU MMIO access to MSI capability block registers ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] iommu: amd: Use cmpxchg_double() when updating 128-bit IRTE @ 2020-09-03 8:14 ` Suravee Suthikulpanit 0 siblings, 0 replies; 12+ messages in thread From: Suravee Suthikulpanit @ 2020-09-03 8:14 UTC (permalink / raw) To: Joao Martins Cc: jon.grimm, linux-kernel, iommu, james.puthukattukaran, boris.ostrovsky, sean.m.osborne Hi, I'll send out V2 with fixes to the review comments below ... On 9/2/20 10:26 PM, Joao Martins wrote: > On 9/2/20 5:51 AM, Suravee Suthikulpanit wrote: >> When using 128-bit interrupt-remapping table entry (IRTE) (a.k.a GA mode), >> current driver disables interrupt remapping when it updates the IRTE >> so that the upper and lower 64-bit values can be updated safely. >> >> However, this creates a small window, where the interrupt could >> arrive and result in IO_PAGE_FAULT (for interrupt) as shown below. >> >> IOMMU Driver Device IRQ >> ============ =========== >> irte.RemapEn=0 >> ... >> change IRTE IRQ from device ==> IO_PAGE_FAULT !! >> ... >> irte.RemapEn=1 >> >> This scenario has been observed when changing irq affinity on a system >> running I/O-intensive workload, in which the destination APIC ID >> in the IRTE is updated. >> >> Instead, use cmpxchg_double() to update the 128-bit IRTE at once without >> disabling the interrupt remapping. However, this means several features, >> which require GA (128-bit IRTE) support will also be affected if cmpxchg16b >> is not supported (which is unprecedented for AMD processors w/ IOMMU). >> > Probably requires: > > Fixes: 880ac60e2538 ("iommu/amd: Introduce interrupt remapping ops structure") > Yes, I will include this in V2. > >> Reported-by: Sean Osborne <sean.m.osborne@oracle.com> >> Tested-by: Erik Rockstrom <erik.rockstrom@oracle.com> >> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> > > With the comments below addressed, FWIW: > > Reviewed-by: Joao Martins <joao.m.martins@oracle.com> > >> diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c >> index c652f16eb702..ad30467f6930 100644 >> --- a/drivers/iommu/amd/init.c >> +++ b/drivers/iommu/amd/init.c >> @@ -1511,7 +1511,14 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h) >> iommu->mmio_phys_end = MMIO_REG_END_OFFSET; >> else >> iommu->mmio_phys_end = MMIO_CNTR_CONF_OFFSET; >> - if (((h->efr_attr & (0x1 << IOMMU_FEAT_GASUP_SHIFT)) == 0)) >> + >> + /* >> + * Note: GA (128-bit IRTE) mode requires cmpxchg16b supports. >> + * GAM also requires GA mode. Therefore, we need to >> + * check cmbxchg16b support before enabling it. >> + */ > > s/cmbxchg16b/cmpxchg16b > >> + if (!boot_cpu_has(X86_FEATURE_CX16) || >> + ((h->efr_attr & (0x1 << IOMMU_FEAT_GASUP_SHIFT)) == 0)) >> amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY; >> break; >> case 0x11: >> @@ -1520,8 +1527,18 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h) >> iommu->mmio_phys_end = MMIO_REG_END_OFFSET; >> else >> iommu->mmio_phys_end = MMIO_CNTR_CONF_OFFSET; >> - if (((h->efr_reg & (0x1 << IOMMU_EFR_GASUP_SHIFT)) == 0)) >> + >> + /* >> + * Note: GA (128-bit IRTE) mode requires cmpxchg16b supports. >> + * XT, GAM also requires GA mode. Therefore, we need to >> + * check cmbxchg16b support before enabling them. > > s/cmbxchg16b/cmpxchg16b > >> + */ >> + if (boot_cpu_has(X86_FEATURE_CX16) || > > You probably want !boot_cpu_has(X86_FEATURE_CX16) ? .... Ah, sorry!! I forgot to change it back after testing for the negative case. Thank you for catching this. Suravee > >> + ((h->efr_reg & (0x1 << IOMMU_EFR_GASUP_SHIFT)) == 0)) { >> amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY; >> + break; >> + } >> + >> /* >> * Note: Since iommu_update_intcapxt() leverages >> * the IOMMU MMIO access to MSI capability block registers _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2020-09-03 8:47 UTC | newest] Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-09-02 4:51 [PATCH 0/2] iommu: amd: Fix intremap IO_PAGE_FAULT for VMs Suravee Suthikulpanit 2020-09-02 4:51 ` Suravee Suthikulpanit 2020-09-02 4:51 ` [PATCH 1/2] iommu: amd: Restore IRTE.RemapEn bit after programming IRTE Suravee Suthikulpanit 2020-09-02 4:51 ` Suravee Suthikulpanit 2020-09-02 15:26 ` Joao Martins 2020-09-02 15:26 ` Joao Martins 2020-09-02 4:51 ` [PATCH 2/2] iommu: amd: Use cmpxchg_double() when updating 128-bit IRTE Suravee Suthikulpanit 2020-09-02 4:51 ` Suravee Suthikulpanit 2020-09-02 15:26 ` Joao Martins 2020-09-02 15:26 ` Joao Martins 2020-09-03 8:14 ` Suravee Suthikulpanit 2020-09-03 8:14 ` Suravee Suthikulpanit
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.