* [PATCH v2 0/2] IOMMU: XSA-373 follow-on
@ 2021-06-25 12:13 Jan Beulich
2021-06-25 12:15 ` [PATCH v2 1/2] AMD/IOMMU: redo awaiting of command completion Jan Beulich
2021-06-25 12:15 ` [PATCH v2 2/2] AMD/IOMMU: re-work locking around sending of commands Jan Beulich
From: Jan Beulich @ 2021-06-25 12:13 UTC
To: xen-devel; +Cc: Paul Durrant, Andrew Cooper
A number of further adjustments were left out of the XSA, as they are
not a security concern (no longer one, in some of the cases, with the
changes put in place there). Only the AMD-side pieces are left in v2.
1: AMD/IOMMU: redo awaiting of command completion
2: AMD/IOMMU: re-work locking around sending of commands
Jan
* [PATCH v2 1/2] AMD/IOMMU: redo awaiting of command completion
2021-06-25 12:13 [PATCH v2 0/2] IOMMU: XSA-373 follow-on Jan Beulich
@ 2021-06-25 12:15 ` Jan Beulich
2021-06-25 12:15 ` [PATCH v2 2/2] AMD/IOMMU: re-work locking around sending of commands Jan Beulich
From: Jan Beulich @ 2021-06-25 12:15 UTC
To: xen-devel; +Cc: Paul Durrant, Andrew Cooper
The present abuse of the completion interrupt not only stands in the
way of, down the road, using it for its actual purpose, but also
requires holding the IOMMU lock while waiting for command completion,
limiting parallelism and keeping interrupts off for non-negligible
periods of time. Have the IOMMU do an ordinary memory write instead of
signaling an otherwise disabled interrupt (by merely updating a status
register bit).
Since IOMMU_COMP_WAIT_I_FLAG_SHIFT is now unused and
IOMMU_COMP_WAIT_[FS]_FLAG_SHIFT already were, drop all three of them
while at it.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
---
v2: Avoid set_field_in_reg_u32(). Drop now unused
IOMMU_COMP_WAIT_[FIS]_FLAG_SHIFT.
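
For illustration, a condensed sketch of the new completion-wait pattern
(distilled from the diff below; the helper name is invented here, and
the timeout/threshold handling present in the actual patch is omitted):

/*
 * The COMPLETION_WAIT command carries the S ("store") flag plus a
 * physical address / data pair: once all earlier commands have
 * completed, the IOMMU stores the data to that address, so the CPU
 * polls plain per-CPU memory instead of a status register.
 */
static DEFINE_PER_CPU(uint64_t, poll_slot);

static void await_completion_sketch(struct amd_iommu *iommu)
{
    uint64_t *slot = &this_cpu(poll_slot);
    paddr_t addr = virt_to_maddr(slot);
    uint32_t cmd[4] = {
        addr | MASK_INSR(IOMMU_CONTROL_ENABLED,
                         IOMMU_COMP_WAIT_S_FLAG_MASK),
        (addr >> 32) | MASK_INSR(IOMMU_CMD_COMPLETION_WAIT,
                                 IOMMU_CMD_OPCODE_MASK),
        CMD_COMPLETION_DONE, /* low 32 bits of the datum; cmd[3] stays 0 */
    };

    ACCESS_ONCE(*slot) = CMD_COMPLETION_INIT;
    send_iommu_command(iommu, cmd);

    while ( ACCESS_ONCE(*slot) != CMD_COMPLETION_DONE )
        cpu_relax(); /* the real code additionally applies a timeout */
}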
--- a/xen/drivers/passthrough/amd/iommu-defs.h
+++ b/xen/drivers/passthrough/amd/iommu-defs.h
@@ -178,11 +178,8 @@ struct amd_iommu_dte {
#define IOMMU_COMP_WAIT_DATA_BUFFER_SIZE 8
#define IOMMU_COMP_WAIT_DATA_BUFFER_ALIGNMENT 8
#define IOMMU_COMP_WAIT_S_FLAG_MASK 0x00000001
-#define IOMMU_COMP_WAIT_S_FLAG_SHIFT 0
#define IOMMU_COMP_WAIT_I_FLAG_MASK 0x00000002
-#define IOMMU_COMP_WAIT_I_FLAG_SHIFT 1
#define IOMMU_COMP_WAIT_F_FLAG_MASK 0x00000004
-#define IOMMU_COMP_WAIT_F_FLAG_SHIFT 2
#define IOMMU_COMP_WAIT_ADDR_LOW_MASK 0xFFFFFFF8
#define IOMMU_COMP_WAIT_ADDR_LOW_SHIFT 3
#define IOMMU_COMP_WAIT_ADDR_HIGH_MASK 0x000FFFFF
--- a/xen/drivers/passthrough/amd/iommu_cmd.c
+++ b/xen/drivers/passthrough/amd/iommu_cmd.c
@@ -20,6 +20,9 @@
#include "iommu.h"
#include "../ats.h"
+#define CMD_COMPLETION_INIT 0
+#define CMD_COMPLETION_DONE 1
+
static void send_iommu_command(struct amd_iommu *iommu,
const uint32_t cmd[4])
{
@@ -49,28 +52,27 @@ static void send_iommu_command(struct am
static void flush_command_buffer(struct amd_iommu *iommu,
unsigned int timeout_base)
{
- uint32_t cmd[4];
+ static DEFINE_PER_CPU(uint64_t, poll_slot);
+ uint64_t *this_poll_slot = &this_cpu(poll_slot);
+ paddr_t addr = virt_to_maddr(this_poll_slot);
+ /* send a COMPLETION_WAIT command to flush command buffer */
+ uint32_t cmd[4] = {
+ addr | MASK_INSR(IOMMU_CONTROL_ENABLED,
+ IOMMU_COMP_WAIT_S_FLAG_MASK),
+ (addr >> 32) | MASK_INSR(IOMMU_CMD_COMPLETION_WAIT,
+ IOMMU_CMD_OPCODE_MASK),
+ CMD_COMPLETION_DONE
+ };
s_time_t start, timeout;
static unsigned int __read_mostly threshold = 1;
- /* RW1C 'ComWaitInt' in status register */
- writel(IOMMU_STATUS_COMP_WAIT_INT,
- iommu->mmio_base + IOMMU_STATUS_MMIO_OFFSET);
-
- /* send an empty COMPLETION_WAIT command to flush command buffer */
- cmd[3] = cmd[2] = 0;
- set_field_in_reg_u32(IOMMU_CMD_COMPLETION_WAIT, 0,
- IOMMU_CMD_OPCODE_MASK,
- IOMMU_CMD_OPCODE_SHIFT, &cmd[1]);
- set_field_in_reg_u32(IOMMU_CONTROL_ENABLED, 0,
- IOMMU_COMP_WAIT_I_FLAG_MASK,
- IOMMU_COMP_WAIT_I_FLAG_SHIFT, &cmd[0]);
+ ACCESS_ONCE(*this_poll_slot) = CMD_COMPLETION_INIT;
+
send_iommu_command(iommu, cmd);
start = NOW();
timeout = start + (timeout_base ?: 100) * MILLISECS(threshold);
- while ( !(readl(iommu->mmio_base + IOMMU_STATUS_MMIO_OFFSET) &
- IOMMU_STATUS_COMP_WAIT_INT) )
+ while ( ACCESS_ONCE(*this_poll_slot) != CMD_COMPLETION_DONE )
{
if ( timeout && NOW() > timeout )
{
* [PATCH v2 2/2] AMD/IOMMU: re-work locking around sending of commands
2021-06-25 12:13 [PATCH v2 0/2] IOMMU: XSA-373 follow-on Jan Beulich
2021-06-25 12:15 ` [PATCH v2 1/2] AMD/IOMMU: redo awaiting of command completion Jan Beulich
@ 2021-06-25 12:15 ` Jan Beulich
2021-06-25 14:13 ` Paul Durrant
From: Jan Beulich @ 2021-06-25 12:15 UTC
To: xen-devel; +Cc: Paul Durrant, Andrew Cooper
It appears unhelpful to me for flush_command_buffer() to block all
progress elsewhere for the given IOMMU by holding its lock while waiting
for command completion. There's no real need for callers of that
function, or of send_iommu_command(), to hold the lock. Contain all
locking related to command sending within the latter function.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Re-work to contain locking to send_iommu_command().
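
For illustration, the resulting locking structure in condensed form (a
sketch only, with the ring-buffer details elided; see the diff below
for the real code):

/*
 * The IOMMU lock now merely serializes insertion into the command
 * ring; waiting for completion happens with the lock dropped.
 */
static void send_iommu_command(struct amd_iommu *iommu,
                               const uint32_t cmd[4])
{
    unsigned long flags;

    spin_lock_irqsave(&iommu->lock, flags);
    /* ... copy @cmd into the ring, advance and publish the tail ... */
    spin_unlock_irqrestore(&iommu->lock, flags);
}

void amd_iommu_flush_device(struct amd_iommu *iommu, uint16_t bdf)
{
    /* Callers no longer hold (or need to hold) the IOMMU lock. */
    invalidate_dev_table_entry(iommu, bdf); /* uses send_iommu_command() */
    flush_command_buffer(iommu, 0);         /* polls without the lock */
}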
--- a/xen/drivers/passthrough/amd/iommu_cmd.c
+++ b/xen/drivers/passthrough/amd/iommu_cmd.c
@@ -27,6 +27,9 @@ static void send_iommu_command(struct am
const uint32_t cmd[4])
{
uint32_t tail;
+ unsigned long flags;
+
+ spin_lock_irqsave(&iommu->lock, flags);
tail = iommu->cmd_buffer.tail + sizeof(cmd_entry_t);
if ( tail == iommu->cmd_buffer.size )
@@ -47,6 +50,8 @@ static void send_iommu_command(struct am
iommu->cmd_buffer.tail = tail;
writel(tail, iommu->mmio_base + IOMMU_CMD_BUFFER_TAIL_OFFSET);
+
+ spin_unlock_irqrestore(&iommu->lock, flags);
}
static void flush_command_buffer(struct amd_iommu *iommu,
@@ -273,7 +278,6 @@ static void invalidate_iommu_all(struct
void amd_iommu_flush_iotlb(u8 devfn, const struct pci_dev *pdev,
daddr_t daddr, unsigned int order)
{
- unsigned long flags;
struct amd_iommu *iommu;
unsigned int req_id, queueid, maxpend;
@@ -300,10 +304,8 @@ void amd_iommu_flush_iotlb(u8 devfn, con
maxpend = pdev->ats.queue_depth & 0xff;
/* send INVALIDATE_IOTLB_PAGES command */
- spin_lock_irqsave(&iommu->lock, flags);
invalidate_iotlb_pages(iommu, maxpend, 0, queueid, daddr, req_id, order);
flush_command_buffer(iommu, iommu_dev_iotlb_timeout);
- spin_unlock_irqrestore(&iommu->lock, flags);
}
static void amd_iommu_flush_all_iotlbs(struct domain *d, daddr_t daddr,
@@ -330,17 +332,14 @@ static void amd_iommu_flush_all_iotlbs(s
static void _amd_iommu_flush_pages(struct domain *d,
daddr_t daddr, unsigned int order)
{
- unsigned long flags;
struct amd_iommu *iommu;
unsigned int dom_id = d->domain_id;
/* send INVALIDATE_IOMMU_PAGES command */
for_each_amd_iommu ( iommu )
{
- spin_lock_irqsave(&iommu->lock, flags);
invalidate_iommu_pages(iommu, daddr, dom_id, order);
flush_command_buffer(iommu, 0);
- spin_unlock_irqrestore(&iommu->lock, flags);
}
if ( ats_enabled )
@@ -360,37 +359,25 @@ void amd_iommu_flush_pages(struct domain
void amd_iommu_flush_device(struct amd_iommu *iommu, uint16_t bdf)
{
- ASSERT( spin_is_locked(&iommu->lock) );
-
invalidate_dev_table_entry(iommu, bdf);
flush_command_buffer(iommu, 0);
}
void amd_iommu_flush_intremap(struct amd_iommu *iommu, uint16_t bdf)
{
- ASSERT( spin_is_locked(&iommu->lock) );
-
invalidate_interrupt_table(iommu, bdf);
flush_command_buffer(iommu, 0);
}
void amd_iommu_flush_all_caches(struct amd_iommu *iommu)
{
- ASSERT( spin_is_locked(&iommu->lock) );
-
invalidate_iommu_all(iommu);
flush_command_buffer(iommu, 0);
}
void amd_iommu_send_guest_cmd(struct amd_iommu *iommu, u32 cmd[])
{
- unsigned long flags;
-
- spin_lock_irqsave(&iommu->lock, flags);
-
send_iommu_command(iommu, cmd);
/* TBD: Timeout selection may require peeking into cmd[]. */
flush_command_buffer(iommu, 0);
-
- spin_unlock_irqrestore(&iommu->lock, flags);
}
--- a/xen/drivers/passthrough/amd/iommu_guest.c
+++ b/xen/drivers/passthrough/amd/iommu_guest.c
@@ -449,9 +449,10 @@ static int do_invalidate_dte(struct doma
spin_lock_irqsave(&iommu->lock, flags);
dte_set_gcr3_table(mdte, hdom_id, gcr3_mfn << PAGE_SHIFT, gv, glx);
- amd_iommu_flush_device(iommu, req_id);
spin_unlock_irqrestore(&iommu->lock, flags);
+ amd_iommu_flush_device(iommu, req_id);
+
return 0;
}
--- a/xen/drivers/passthrough/amd/iommu_init.c
+++ b/xen/drivers/passthrough/amd/iommu_init.c
@@ -871,7 +871,10 @@ static void enable_iommu(struct amd_iomm
spin_lock_irqsave(&iommu->lock, flags);
if ( unlikely(iommu->enabled) )
- goto out;
+ {
+ spin_unlock_irqrestore(&iommu->lock, flags);
+ return;
+ }
amd_iommu_erratum_746_workaround(iommu);
@@ -921,13 +924,12 @@ static void enable_iommu(struct amd_iomm
set_iommu_translation_control(iommu, IOMMU_CONTROL_ENABLED);
- if ( iommu->features.flds.ia_sup )
- amd_iommu_flush_all_caches(iommu);
-
iommu->enabled = 1;
- out:
spin_unlock_irqrestore(&iommu->lock, flags);
+
+ if ( iommu->features.flds.ia_sup )
+ amd_iommu_flush_all_caches(iommu);
}
static void disable_iommu(struct amd_iommu *iommu)
@@ -1544,7 +1546,6 @@ static int _invalidate_all_devices(
{
unsigned int bdf;
u16 req_id;
- unsigned long flags;
struct amd_iommu *iommu;
for ( bdf = 0; bdf < ivrs_bdf_entries; bdf++ )
@@ -1553,10 +1554,8 @@ static int _invalidate_all_devices(
req_id = ivrs_mappings[bdf].dte_requestor_id;
if ( iommu )
{
- spin_lock_irqsave(&iommu->lock, flags);
amd_iommu_flush_device(iommu, req_id);
amd_iommu_flush_intremap(iommu, req_id);
- spin_unlock_irqrestore(&iommu->lock, flags);
}
}
--- a/xen/drivers/passthrough/amd/iommu_intr.c
+++ b/xen/drivers/passthrough/amd/iommu_intr.c
@@ -310,9 +310,7 @@ static int update_intremap_entry_from_io
entry.ptr32->flds.remap_en = false;
spin_unlock(lock);
- spin_lock(&iommu->lock);
amd_iommu_flush_intremap(iommu, req_id);
- spin_unlock(&iommu->lock);
spin_lock(lock);
}
@@ -527,11 +525,9 @@ static int update_intremap_entry_from_ms
if ( iommu->enabled )
{
- spin_lock_irqsave(&iommu->lock, flags);
amd_iommu_flush_intremap(iommu, req_id);
if ( alias_id != req_id )
amd_iommu_flush_intremap(iommu, alias_id);
- spin_unlock_irqrestore(&iommu->lock, flags);
}
return 0;
@@ -567,11 +563,9 @@ static int update_intremap_entry_from_ms
entry.ptr32->flds.remap_en = false;
spin_unlock(lock);
- spin_lock(&iommu->lock);
amd_iommu_flush_intremap(iommu, req_id);
if ( alias_id != req_id )
amd_iommu_flush_intremap(iommu, alias_id);
- spin_unlock(&iommu->lock);
spin_lock(lock);
}
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -129,6 +129,8 @@ static void amd_iommu_setup_domain_devic
iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) )
dte->i = ats_enabled;
+ spin_unlock_irqrestore(&iommu->lock, flags);
+
amd_iommu_flush_device(iommu, req_id);
AMD_IOMMU_DEBUG("Setup I/O page table: device id = %#x, type = %#x, "
@@ -138,8 +140,8 @@ static void amd_iommu_setup_domain_devic
page_to_maddr(hd->arch.amd.root_table),
domain->domain_id, hd->arch.amd.paging_mode);
}
-
- spin_unlock_irqrestore(&iommu->lock, flags);
+ else
+ spin_unlock_irqrestore(&iommu->lock, flags);
ASSERT(pcidevs_locked());
@@ -307,6 +309,8 @@ static void amd_iommu_disable_domain_dev
smp_wmb();
dte->v = true;
+ spin_unlock_irqrestore(&iommu->lock, flags);
+
amd_iommu_flush_device(iommu, req_id);
AMD_IOMMU_DEBUG("Disable: device id = %#x, "
@@ -314,7 +318,8 @@ static void amd_iommu_disable_domain_dev
req_id, domain->domain_id,
dom_iommu(domain)->arch.amd.paging_mode);
}
- spin_unlock_irqrestore(&iommu->lock, flags);
+ else
+ spin_unlock_irqrestore(&iommu->lock, flags);
ASSERT(pcidevs_locked());
@@ -455,9 +460,9 @@ static int amd_iommu_add_device(u8 devfn
iommu->dev_table.buffer + (bdf * IOMMU_DEV_TABLE_ENTRY_SIZE),
ivrs_mappings[bdf].intremap_table, iommu, iommu_intremap);
- amd_iommu_flush_device(iommu, bdf);
-
spin_unlock_irqrestore(&iommu->lock, flags);
+
+ amd_iommu_flush_device(iommu, bdf);
}
amd_iommu_setup_domain_device(pdev->domain, iommu, devfn, pdev);
* Re: [PATCH v2 2/2] AMD/IOMMU: re-work locking around sending of commands
2021-06-25 12:15 ` [PATCH v2 2/2] AMD/IOMMU: re-work locking around sending of commands Jan Beulich
@ 2021-06-25 14:13 ` Paul Durrant
From: Paul Durrant @ 2021-06-25 14:13 UTC
To: Jan Beulich, xen-devel; +Cc: Andrew Cooper
On 25/06/2021 13:15, Jan Beulich wrote:
> It appears unhelpful to me for flush_command_buffer() to block all
> progress elsewhere for the given IOMMU by holding its lock while waiting
> for command completion. There's no real need for callers of that
> function, or of send_iommu_command(), to hold the lock. Contain all
> locking related to command sending within the latter function.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>