* [PATCH v2 0/2] iommu/amd: Fix GAM IRTEs affinity and GALog restart
@ 2023-03-16 20:02 Joao Martins
  2023-03-16 20:02 ` [PATCH v2 1/2] iommu/amd: Don't block updates to GATag if guest mode is on Joao Martins
  2023-03-16 20:02 ` [PATCH v2 2/2] iommu/amd: Handle GALog overflows Joao Martins
  0 siblings, 2 replies; 14+ messages in thread
From: Joao Martins @ 2023-03-16 20:02 UTC (permalink / raw)
  To: iommu
  Cc: Joerg Roedel, Suravee Suthikulpanit, Vasant Hegde, Will Deacon,
	Robin Murphy, Maxim Levitsky, Alejandro Jimenez, kvm,
	Joao Martins

Hey,

This small series expands from v1 with one more patch:

Patch 1) Fix affinity changes to already-in-guest-mode IRTEs which would
         otherwise be nops.

Patch 2) Handle the GALog overflow condition by restarting it

I have a follow-up 3-patch series but, it being a potential optimization[0],
I prefer keeping it separate. This series just tackles bugs.

Comments appreciated.

Thanks,
	Joao

Changes since v1[1]:
- Adjust commit message in first patch (Suravee)
- Add Rb in the first patch (Suravee)

[0] https://lore.kernel.org/linux-iommu/b39d505c-8d2b-d90b-f52d-ceabde8225cf@oracle.com/
[1] https://lore.kernel.org/linux-iommu/20230208131938.39898-1-joao.m.martins@oracle.com/

Joao Martins (2):
  iommu/amd: Don't block updates to GATag if guest mode is on
  iommu/amd: Handle GALog overflows

 drivers/iommu/amd/amd_iommu.h |  1 +
 drivers/iommu/amd/init.c      | 24 ++++++++++++++++++++++++
 drivers/iommu/amd/iommu.c     | 11 +++++++++--
 3 files changed, 34 insertions(+), 2 deletions(-)

-- 
2.17.2


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v2 1/2] iommu/amd: Don't block updates to GATag if guest mode is on
  2023-03-16 20:02 [PATCH v2 0/2] iommu/amd: Fix GAM IRTEs affinity and GALog restart Joao Martins
@ 2023-03-16 20:02 ` Joao Martins
  2023-03-16 21:01   ` Sean Christopherson
  2023-03-28  9:07   ` Alexey Kardashevskiy
  2023-03-16 20:02 ` [PATCH v2 2/2] iommu/amd: Handle GALog overflows Joao Martins
  1 sibling, 2 replies; 14+ messages in thread
From: Joao Martins @ 2023-03-16 20:02 UTC (permalink / raw)
  To: iommu
  Cc: Joerg Roedel, Suravee Suthikulpanit, Vasant Hegde, Will Deacon,
	Robin Murphy, Maxim Levitsky, Alejandro Jimenez, kvm,
	Joao Martins

On KVM GSI routing table updates, especially those where guests have vIOMMUs
with interrupt remapping enabled (to boot setups with >255 vCPUs without relying
on KVM_FEATURE_MSI_EXT_DEST_ID), a VMM may update the backing VF MSIs
with a new vCPU affinity.

On AMD with AVIC enabled, the new vcpu affinity info is updated via:
	avic_pi_update_irte()
		irq_set_vcpu_affinity()
			amd_ir_set_vcpu_affinity()
				amd_iommu_{de}activate_guest_mode()

Where the IRTE[GATag] is updated with the new vcpu affinity. The GATag
contains VM ID and VCPU ID, and is used by IOMMU hardware to signal KVM
(via GALog) when an interrupt cannot be delivered because the vCPU is in
a blocking state.

The issue is that amd_iommu_activate_guest_mode() will essentially
only change IRTE fields on transitions from non-guest-mode to guest-mode
and otherwise returns *with no changes to IRTE* on already configured
guest-mode interrupts. To the guest this means that the VF interrupts
remain affined to the vCPU they were first configured on, and the guest
will be unable to issue VF interrupts and will receive messages like this
from spurious interrupts (e.g. from waking the wrong vCPU in GALog):

[  167.759472] __common_interrupt: 3.34 No irq handler for vector
[  230.680927] mlx5_core 0000:00:02.0: mlx5_cmd_eq_recover:247:(pid
3122): Recovered 1 EQEs on cmd_eq
[  230.681799] mlx5_core 0000:00:02.0:
wait_func_handle_exec_timeout:1113:(pid 3122): cmd[0]: CREATE_CQ(0x400)
recovered after timeout
[  230.683266] __common_interrupt: 3.34 No irq handler for vector

Given that amd_ir_set_vcpu_affinity() uses
amd_iommu_activate_guest_mode() underneath, this essentially means that vCPU
affinity changes of IRTEs are nops. Fix it by dropping the check for
guest-mode at amd_iommu_activate_guest_mode(). Same thing is applicable to
amd_iommu_deactivate_guest_mode() although, even if the IRTE doesn't change
underlying DestID on the host, the VFIO IRQ handler will still be able to
poke at the right guest-vCPU.
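
The no-op behaviour can be modelled outside the kernel. What follows is
a minimal user-space sketch, not the driver code: struct toy_irte,
toy_activate_old() and toy_activate_new() are hypothetical names, and the
IRTE is reduced to a guest_mode flag plus a ga_tag field.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for a guest-mode IRTE. */
struct toy_irte {
	bool guest_mode;
	uint32_t ga_tag;
};

/* Models the old check: bail out if the entry is already in guest mode. */
static int toy_activate_old(struct toy_irte *entry, uint32_t new_tag)
{
	if (!entry || entry->guest_mode)
		return 0;	/* affinity update silently dropped */

	entry->guest_mode = true;
	entry->ga_tag = new_tag;
	return 0;
}

/* Models the fixed check: always rewrite the entry. */
static int toy_activate_new(struct toy_irte *entry, uint32_t new_tag)
{
	if (!entry)
		return 0;

	entry->guest_mode = true;
	entry->ga_tag = new_tag;
	return 0;
}
```

In this model, a second call to toy_activate_old() with a different tag
leaves ga_tag untouched, while toy_activate_new() stores the new tag.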

Fixes: b9c6ff94e43a ("iommu/amd: Re-factor guest virtual APIC (de-)activation code")
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd/iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 5a505ba5467e..bf3ebc9d6cde 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3485,7 +3485,7 @@ int amd_iommu_activate_guest_mode(void *data)
 	u64 valid;
 
 	if (!AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir) ||
-	    !entry || entry->lo.fields_vapic.guest_mode)
+	    !entry)
 		return 0;
 
 	valid = entry->lo.fields_vapic.valid;
-- 
2.17.2



* [PATCH v2 2/2] iommu/amd: Handle GALog overflows
  2023-03-16 20:02 [PATCH v2 0/2] iommu/amd: Fix GAM IRTEs affinity and GALog restart Joao Martins
  2023-03-16 20:02 ` [PATCH v2 1/2] iommu/amd: Don't block updates to GATag if guest mode is on Joao Martins
@ 2023-03-16 20:02 ` Joao Martins
  2023-04-13 10:24   ` Suthikulpanit, Suravee
  2023-04-17  5:04   ` Vasant Hegde
  1 sibling, 2 replies; 14+ messages in thread
From: Joao Martins @ 2023-03-16 20:02 UTC (permalink / raw)
  To: iommu
  Cc: Joerg Roedel, Suravee Suthikulpanit, Vasant Hegde, Will Deacon,
	Robin Murphy, Maxim Levitsky, Alejandro Jimenez, kvm,
	Joao Martins

GALog exists to propagate interrupts into all vCPUs in the system when
interrupts are marked as non running (e.g. when vCPUs aren't running). A
GALog overflow happens when there is no space in the log to record the
GATag of the interrupt. So when the GALOverflow condition happens, the
GALog queue is processed and the GALog is restarted, as the IOMMU
manual indicates in section "2.7.4 Guest Virtual APIC Log Restart
Procedure":

| * Wait until MMIO Offset 2020h[GALogRun]=0b so that all request
|   entries are completed as circumstances allow. GALogRun must be 0b to
|   modify the guest virtual APIC log registers safely.
| * Write MMIO Offset 0018h[GALogEn]=0b.
| * As necessary, change the following values (e.g., to relocate or
| resize the guest virtual APIC event log):
|   - the Guest Virtual APIC Log Base Address Register
|      [MMIO Offset 00E0h],
|   - the Guest Virtual APIC Log Head Pointer Register
|      [MMIO Offset 2040h][GALogHead], and
|   - the Guest Virtual APIC Log Tail Pointer Register
|      [MMIO Offset 2048h][GALogTail].
| * Write MMIO Offset 2020h[GALOverflow] = 1b to clear the bit (W1C).
| * Write MMIO Offset 0018h[GALogEn] = 1b, and either set
|   MMIO Offset 0018h[GAIntEn] to enable the GA log interrupt or clear
|   the bit to disable it.

Failing to handle the GALog overflow means that none of the VFs (in any
guest) will work with IOMMU AVIC, forcing the user to power cycle the
host. When handling the event, the GALog is resumed without resizing,
much like how it is done in the event log overflow handling. The
[MMIO Offset 2020h][GALOverflow] bit might be set in status register
without the [MMIO Offset 2020h][GAInt] bit, so when deciding to poll
for GA events (to clear space in the galog), also check the overflow
bit.
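
The restart sequence and the write-1-to-clear (W1C) semantics can be
illustrated with a small user-space model. This is a sketch under stated
assumptions, not the driver code: struct toy_iommu, the TOY_* bit
positions and the toy_* helpers are all hypothetical and do not match the
real register layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen for this sketch only. */
#define TOY_STATUS_GALOG_RUN		(1u << 0)
#define TOY_STATUS_GALOG_OVERFLOW	(1u << 1)
#define TOY_STATUS_GALOG_INT		(1u << 2)
#define TOY_CTRL_GALOG_EN		(1u << 0)
#define TOY_CTRL_GAINT_EN		(1u << 1)

/* Hypothetical register file: one status and one control word. */
struct toy_iommu {
	uint32_t status;
	uint32_t control;
};

/* Write-1-to-clear: bits set in val clear the matching status bits. */
static void toy_status_w1c(struct toy_iommu *iommu, uint32_t val)
{
	iommu->status &= ~val;
}

/* Poll the log on either the interrupt bit or the overflow bit. */
static bool toy_should_poll(uint32_t status)
{
	return (status & (TOY_STATUS_GALOG_INT |
			  TOY_STATUS_GALOG_OVERFLOW)) != 0;
}

/* Returns true if restarted, false if the log is still running. */
static bool toy_restart_ga_log(struct toy_iommu *iommu)
{
	if (iommu->status & TOY_STATUS_GALOG_RUN)
		return false;

	iommu->control &= ~(TOY_CTRL_GALOG_EN | TOY_CTRL_GAINT_EN);
	toy_status_w1c(iommu, TOY_STATUS_GALOG_OVERFLOW);
	iommu->control |= TOY_CTRL_GAINT_EN | TOY_CTRL_GALOG_EN;
	return true;
}
```

The model skips the optional relocate/resize step, mirroring the choice
above to resume the log without resizing.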

[suravee: Check for GAOverflow without GAInt, toggle CONTROL_GAINT_EN]
Co-developed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
---
 drivers/iommu/amd/amd_iommu.h |  1 +
 drivers/iommu/amd/init.c      | 24 ++++++++++++++++++++++++
 drivers/iommu/amd/iommu.c     |  9 ++++++++-
 3 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index c160a332ce33..24c7e6c6c0de 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -15,6 +15,7 @@ extern irqreturn_t amd_iommu_int_thread(int irq, void *data);
 extern irqreturn_t amd_iommu_int_handler(int irq, void *data);
 extern void amd_iommu_apply_erratum_63(struct amd_iommu *iommu, u16 devid);
 extern void amd_iommu_restart_event_logging(struct amd_iommu *iommu);
+extern void amd_iommu_restart_ga_log(struct amd_iommu *iommu);
 extern int amd_iommu_init_devices(void);
 extern void amd_iommu_uninit_devices(void);
 extern void amd_iommu_init_notifier(void);
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 19a46b9f7357..fd487c33b28a 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -751,6 +751,30 @@ void amd_iommu_restart_event_logging(struct amd_iommu *iommu)
 	iommu_feature_enable(iommu, CONTROL_EVT_LOG_EN);
 }
 
+/*
+ * This function restarts GA logging in case the IOMMU experienced
+ * a GA log overflow.
+ */
+void amd_iommu_restart_ga_log(struct amd_iommu *iommu)
+{
+	u32 status;
+
+	status = readl(iommu->mmio_base + MMIO_STATUS_OFFSET);
+	if (status & MMIO_STATUS_GALOG_RUN_MASK)
+		return;
+
+	pr_info_ratelimited("IOMMU GA Log restarting\n");
+
+	iommu_feature_disable(iommu, CONTROL_GALOG_EN);
+	iommu_feature_disable(iommu, CONTROL_GAINT_EN);
+
+	writel(MMIO_STATUS_GALOG_OVERFLOW_MASK,
+	       iommu->mmio_base + MMIO_STATUS_OFFSET);
+
+	iommu_feature_enable(iommu, CONTROL_GAINT_EN);
+	iommu_feature_enable(iommu, CONTROL_GALOG_EN);
+}
+
 /*
  * This function resets the command buffer if the IOMMU stopped fetching
  * commands from it.
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index bf3ebc9d6cde..ebb155bfef15 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -845,6 +845,7 @@ amd_iommu_set_pci_msi_domain(struct device *dev, struct amd_iommu *iommu) { }
 	(MMIO_STATUS_EVT_OVERFLOW_INT_MASK | \
 	 MMIO_STATUS_EVT_INT_MASK | \
 	 MMIO_STATUS_PPR_INT_MASK | \
+	 MMIO_STATUS_GALOG_OVERFLOW_MASK | \
 	 MMIO_STATUS_GALOG_INT_MASK)
 
 irqreturn_t amd_iommu_int_thread(int irq, void *data)
@@ -868,10 +869,16 @@ irqreturn_t amd_iommu_int_thread(int irq, void *data)
 		}
 
 #ifdef CONFIG_IRQ_REMAP
-		if (status & MMIO_STATUS_GALOG_INT_MASK) {
+		if (status & (MMIO_STATUS_GALOG_INT_MASK |
+			      MMIO_STATUS_GALOG_OVERFLOW_MASK)) {
 			pr_devel("Processing IOMMU GA Log\n");
 			iommu_poll_ga_log(iommu);
 		}
+
+		if (status & MMIO_STATUS_GALOG_OVERFLOW_MASK) {
+			pr_info_ratelimited("IOMMU GA Log overflow\n");
+			amd_iommu_restart_ga_log(iommu);
+		}
 #endif
 
 		if (status & MMIO_STATUS_EVT_OVERFLOW_INT_MASK) {
-- 
2.17.2



* Re: [PATCH v2 1/2] iommu/amd: Don't block updates to GATag if guest mode is on
  2023-03-16 20:02 ` [PATCH v2 1/2] iommu/amd: Don't block updates to GATag if guest mode is on Joao Martins
@ 2023-03-16 21:01   ` Sean Christopherson
  2023-03-16 21:25     ` Joao Martins
  2023-03-28  9:07   ` Alexey Kardashevskiy
  1 sibling, 1 reply; 14+ messages in thread
From: Sean Christopherson @ 2023-03-16 21:01 UTC (permalink / raw)
  To: Joao Martins
  Cc: iommu, Joerg Roedel, Suravee Suthikulpanit, Vasant Hegde,
	Will Deacon, Robin Murphy, Maxim Levitsky, Alejandro Jimenez,
	kvm

On Thu, Mar 16, 2023, Joao Martins wrote:
> On KVM GSI routing table updates, specially those where they have vIOMMUs
> with interrupt remapping enabled (to boot >255vcpus setups without relying
> on KVM_FEATURE_MSI_EXT_DEST_ID), a VMM may update the backing VF MSIs
> with a new VCPU affinity.
> 
> On AMD with AVIC enabled, the new vcpu affinity info is updated via:
> 	avic_pi_update_irte()
> 		irq_set_vcpu_affinity()
> 			amd_ir_set_vcpu_affinity()
> 				amd_iommu_{de}activate_guest_mode()
> 
> Where the IRTE[GATag] is updated with the new vcpu affinity. The GATag
> contains VM ID and VCPU ID, and is used by IOMMU hardware to signal KVM
> (via GALog) when interrupt cannot be delivered due to vCPU is in
> blocking state.
> 
> The issue is that amd_iommu_activate_guest_mode() will essentially
> only change IRTE fields on transitions from non-guest-mode to guest-mode
> and otherwise returns *with no changes to IRTE* on already configured
> guest-mode interrupts. To the guest this means that the VF interrupts
> remain affined to the first vCPU they were first configured, and guest
> will be unable to either VF interrupts and receive messages like this
> from spuruious interrupts (e.g. from waking the wrong vCPU in GALog):
> 
> [  167.759472] __common_interrupt: 3.34 No irq handler for vector
> [  230.680927] mlx5_core 0000:00:02.0: mlx5_cmd_eq_recover:247:(pid
> 3122): Recovered 1 EQEs on cmd_eq
> [  230.681799] mlx5_core 0000:00:02.0:
> wait_func_handle_exec_timeout:1113:(pid 3122): cmd[0]: CREATE_CQ(0x400)
> recovered after timeout
> [  230.683266] __common_interrupt: 3.34 No irq handler for vector
> 
> Given the fact that amd_ir_set_vcpu_affinity() uses
> amd_iommu_activate_guest_mode() underneath it essentially means that VCPU
> affinity changes of IRTEs are nops. Fix it by dropping the check for
> guest-mode at amd_iommu_activate_guest_mode(). Same thing is applicable to
> amd_iommu_deactivate_guest_mode() although, even if the IRTE doesn't change
> underlying DestID on the host, the VFIO IRQ handler will still be able to
> poke at the right guest-vCPU.

Is there any harm in giving deactivate the same treatment?  If the worst case
scenario is a few wasted cycles, having symmetric flows and eliminating benign
bugs seems like a worthwhile tradeoff (assuming this is indeed a relatively slow
path like I think it is).

> Fixes: b9c6ff94e43a ("iommu/amd: Re-factor guest virtual APIC (de-)activation code")
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> ---
>  drivers/iommu/amd/iommu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index 5a505ba5467e..bf3ebc9d6cde 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -3485,7 +3485,7 @@ int amd_iommu_activate_guest_mode(void *data)

Any chance you (or anyone) would want to create a follow-up series to rename and/or
rework these flows to make it more obvious that the helpers handle updates as well
as transitions between "guest mode" and "host mode"?  E.g. I can see KVM getting
clever and skipping the "activation" when KVM knows AVIC is already active (though
I can't tell for certain whether or not that would actually be problematic).

>  	u64 valid;
>  
>  	if (!AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir) ||
> -	    !entry || entry->lo.fields_vapic.guest_mode)
> +	    !entry)

This can easily fit on the previous line.

	if (!AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir) || !entry)
		return 0;


* Re: [PATCH v2 1/2] iommu/amd: Don't block updates to GATag if guest mode is on
  2023-03-16 21:01   ` Sean Christopherson
@ 2023-03-16 21:25     ` Joao Martins
  2023-03-24 14:31       ` Sean Christopherson
  0 siblings, 1 reply; 14+ messages in thread
From: Joao Martins @ 2023-03-16 21:25 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: iommu, Joerg Roedel, Suravee Suthikulpanit, Vasant Hegde,
	Will Deacon, Robin Murphy, Maxim Levitsky, Alejandro Jimenez,
	kvm

On 16/03/2023 21:01, Sean Christopherson wrote:
> On Thu, Mar 16, 2023, Joao Martins wrote:
>> On KVM GSI routing table updates, specially those where they have vIOMMUs
>> with interrupt remapping enabled (to boot >255vcpus setups without relying
>> on KVM_FEATURE_MSI_EXT_DEST_ID), a VMM may update the backing VF MSIs
>> with a new VCPU affinity.
>>
>> On AMD with AVIC enabled, the new vcpu affinity info is updated via:
>> 	avic_pi_update_irte()
>> 		irq_set_vcpu_affinity()
>> 			amd_ir_set_vcpu_affinity()
>> 				amd_iommu_{de}activate_guest_mode()
>>
>> Where the IRTE[GATag] is updated with the new vcpu affinity. The GATag
>> contains VM ID and VCPU ID, and is used by IOMMU hardware to signal KVM
>> (via GALog) when interrupt cannot be delivered due to vCPU is in
>> blocking state.
>>
>> The issue is that amd_iommu_activate_guest_mode() will essentially
>> only change IRTE fields on transitions from non-guest-mode to guest-mode
>> and otherwise returns *with no changes to IRTE* on already configured
>> guest-mode interrupts. To the guest this means that the VF interrupts
>> remain affined to the first vCPU they were first configured, and guest
>> will be unable to either VF interrupts and receive messages like this
>> from spuruious interrupts (e.g. from waking the wrong vCPU in GALog):
>>
>> [  167.759472] __common_interrupt: 3.34 No irq handler for vector
>> [  230.680927] mlx5_core 0000:00:02.0: mlx5_cmd_eq_recover:247:(pid
>> 3122): Recovered 1 EQEs on cmd_eq
>> [  230.681799] mlx5_core 0000:00:02.0:
>> wait_func_handle_exec_timeout:1113:(pid 3122): cmd[0]: CREATE_CQ(0x400)
>> recovered after timeout
>> [  230.683266] __common_interrupt: 3.34 No irq handler for vector
>>
>> Given the fact that amd_ir_set_vcpu_affinity() uses
>> amd_iommu_activate_guest_mode() underneath it essentially means that VCPU
>> affinity changes of IRTEs are nops. Fix it by dropping the check for
>> guest-mode at amd_iommu_activate_guest_mode(). Same thing is applicable to
>> amd_iommu_deactivate_guest_mode() although, even if the IRTE doesn't change
>> underlying DestID on the host, the VFIO IRQ handler will still be able to
>> poke at the right guest-vCPU.
> 
> Is there any harm in giving deactivate the same treatement?  If the worst case
> scenario is a few wasted cycles, having symmetric flows and eliminating benign
> bugs seems like a worthwhile tradeoff (assuming this is indeed a relatively slow
> path like I think it is).
> 

I wanna say there's no harm, but initially I had such a patch, and on testing it
broke the classic interrupt remapping case but I didn't investigate further --
my suspicion is that the only case that should care is the updates (not the
actual deactivation of guest-mode).

>> Fixes: b9c6ff94e43a ("iommu/amd: Re-factor guest virtual APIC (de-)activation code")
>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>> Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>> ---
>>  drivers/iommu/amd/iommu.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
>> index 5a505ba5467e..bf3ebc9d6cde 100644
>> --- a/drivers/iommu/amd/iommu.c
>> +++ b/drivers/iommu/amd/iommu.c
>> @@ -3485,7 +3485,7 @@ int amd_iommu_activate_guest_mode(void *data)
> 
> Any chance you (or anyone) would want to create a follow-up series to rename and/or
> rework these flows to make it more obvious that the helpers handle updates as well
> as transitions between "guest mode" and "host mode"?  E.g. I can see KVM getting
> clever and skipping the "activation" when KVM knows AVIC is already active (though
> I can't tell for certain whether or not that would actually be problematic).
> 

To be honest, I think the function naming is correct.

Part of the problem here (as you also hint) is instead the reuse of helpers:
the ones used for the (correct) transition to/from guest-mode *externally* by
callers are also used *internally* in the amd iommu code for IRQ vcpu affinity.
That's also the reason I put the Fixes tag, as that patch introduced such
reuse, and the tag could be useful for stable trees. Here we are mainly
concerned with the updates (the internal usage) and with actually exercising
the IRTE update instead of skipping it, such that when you have interrupts on
blocked vCPUs you actually wake up the right one (not doing so has a
rather drastic effect for VFs within the guest).

>>  	u64 valid;
>>  
>>  	if (!AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir) ||
>> -	    !entry || entry->lo.fields_vapic.guest_mode)
>> +	    !entry)
> 
> This can easily fit on the previous line.
> 
> 	if (!AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir) || !entry)
> 		return 0;

True, I can move it to the previous line.


* Re: [PATCH v2 1/2] iommu/amd: Don't block updates to GATag if guest mode is on
  2023-03-16 21:25     ` Joao Martins
@ 2023-03-24 14:31       ` Sean Christopherson
  2023-03-28 10:42         ` Joao Martins
  0 siblings, 1 reply; 14+ messages in thread
From: Sean Christopherson @ 2023-03-24 14:31 UTC (permalink / raw)
  To: Joao Martins
  Cc: iommu, Joerg Roedel, Suravee Suthikulpanit, Vasant Hegde,
	Will Deacon, Robin Murphy, Maxim Levitsky, Alejandro Jimenez,
	kvm

On Thu, Mar 16, 2023, Joao Martins wrote:
> On 16/03/2023 21:01, Sean Christopherson wrote:
> > Is there any harm in giving deactivate the same treatement?  If the worst case
> > scenario is a few wasted cycles, having symmetric flows and eliminating benign
> > bugs seems like a worthwhile tradeoff (assuming this is indeed a relatively slow
> > path like I think it is).
> > 
> 
> I wanna say there's no harm, but initially I had such a patch, and on testing it
> broke the classic interrupt remapping case but I didn't investigate further --
> my suspicion is that the only case that should care is the updates (not the
> actual deactivation of guest-mode).

Ugh, I bet this is due to KVM invoking irq_set_vcpu_affinity() with garbage when
AVIC is enabled, but KVM can't use a posted interrupt due to how the IRQ is
configured.  I vaguely recall a bug report about uninitialized data in "pi" being
consumed, but I can't find it at the moment.

	if (!get_pi_vcpu_info(kvm, e, &vcpu_info, &svm) && set &&
		    kvm_vcpu_apicv_active(&svm->vcpu)) {

		...

	} else {
			/* Use legacy mode in IRTE */
			struct amd_iommu_pi_data pi;

			/**
			 * Here, pi is used to:
			 * - Tell IOMMU to use legacy mode for this interrupt.
			 * - Retrieve ga_tag of prior interrupt remapping data.
			 */
			pi.prev_ga_tag = 0;
			pi.is_guest_mode = false;
			ret = irq_set_vcpu_affinity(host_irq, &pi);
	}


> > Any chance you (or anyone) would want to create a follow-up series to rename and/or
> > rework these flows to make it more obvious that the helpers handle updates as well
> > as transitions between "guest mode" and "host mode"?  E.g. I can see KVM getting
> > clever and skipping the "activation" when KVM knows AVIC is already active (though
> > I can't tell for certain whether or not that would actually be problematic).
> > 
> 
> To be honest, I think the function naming is correct.

After looking more closely at the KVM code, I agree.  I was thinking KVM invoked
the (de)activate helpers somewhat spuriously, but that's not actually the case,
KVM just has a few less-than-perfect names due to conflicting requirements.

Thanks!


* Re: [PATCH v2 1/2] iommu/amd: Don't block updates to GATag if guest mode is on
  2023-03-16 20:02 ` [PATCH v2 1/2] iommu/amd: Don't block updates to GATag if guest mode is on Joao Martins
  2023-03-16 21:01   ` Sean Christopherson
@ 2023-03-28  9:07   ` Alexey Kardashevskiy
  2023-03-28 10:19     ` Joao Martins
  1 sibling, 1 reply; 14+ messages in thread
From: Alexey Kardashevskiy @ 2023-03-28  9:07 UTC (permalink / raw)
  To: Joao Martins, iommu
  Cc: Joerg Roedel, Suravee Suthikulpanit, Vasant Hegde, Will Deacon,
	Robin Murphy, Maxim Levitsky, Alejandro Jimenez, kvm

On 17/3/23 07:02, Joao Martins wrote:
> On KVM GSI routing table updates, specially those where they have vIOMMUs
> with interrupt remapping enabled (to boot >255vcpus setups without relying
> on KVM_FEATURE_MSI_EXT_DEST_ID), a VMM may update the backing VF MSIs
> with a new VCPU affinity.
> 
> On AMD with AVIC enabled, the new vcpu affinity info is updated via:
> 	avic_pi_update_irte()
> 		irq_set_vcpu_affinity()
> 			amd_ir_set_vcpu_affinity()
> 				amd_iommu_{de}activate_guest_mode()
> 
> Where the IRTE[GATag] is updated with the new vcpu affinity. The GATag
> contains VM ID and VCPU ID, and is used by IOMMU hardware to signal KVM
> (via GALog) when interrupt cannot be delivered due to vCPU is in
> blocking state.
> 
> The issue is that amd_iommu_activate_guest_mode() will essentially
> only change IRTE fields on transitions from non-guest-mode to guest-mode
> and otherwise returns *with no changes to IRTE* on already configured
> guest-mode interrupts. To the guest this means that the VF interrupts
> remain affined to the first vCPU they were first configured,and guest
> will be unable to either VF interrupts and receive messages like this
> from spuruious interrupts (e.g. from waking the wrong vCPU in GALog):

The "either" above sounds like there should be a verb which it is not, 
or is it? (my english skills are meh). I kinda get the idea anyway (I hope).

btw s/spuruious/spurious/, says my vim. Thanks,

> 
> [  167.759472] __common_interrupt: 3.34 No irq handler for vector
> [  230.680927] mlx5_core 0000:00:02.0: mlx5_cmd_eq_recover:247:(pid
> 3122): Recovered 1 EQEs on cmd_eq
> [  230.681799] mlx5_core 0000:00:02.0:
> wait_func_handle_exec_timeout:1113:(pid 3122): cmd[0]: CREATE_CQ(0x400)
> recovered after timeout
> [  230.683266] __common_interrupt: 3.34 No irq handler for vector
> 
> Given the fact that amd_ir_set_vcpu_affinity() uses
> amd_iommu_activate_guest_mode() underneath it essentially means that VCPU
> affinity changes of IRTEs are nops. Fix it by dropping the check for
> guest-mode at amd_iommu_activate_guest_mode(). Same thing is applicable to
> amd_iommu_deactivate_guest_mode() although, even if the IRTE doesn't change
> underlying DestID on the host, the VFIO IRQ handler will still be able to
> poke at the right guest-vCPU.
> 
> Fixes: b9c6ff94e43a ("iommu/amd: Re-factor guest virtual APIC (de-)activation code")
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> ---
>   drivers/iommu/amd/iommu.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index 5a505ba5467e..bf3ebc9d6cde 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -3485,7 +3485,7 @@ int amd_iommu_activate_guest_mode(void *data)
>   	u64 valid;
>   
>   	if (!AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir) ||
> -	    !entry || entry->lo.fields_vapic.guest_mode)
> +	    !entry)
>   		return 0;
>   
>   	valid = entry->lo.fields_vapic.valid;

-- 
Alexey



* Re: [PATCH v2 1/2] iommu/amd: Don't block updates to GATag if guest mode is on
  2023-03-28  9:07   ` Alexey Kardashevskiy
@ 2023-03-28 10:19     ` Joao Martins
  0 siblings, 0 replies; 14+ messages in thread
From: Joao Martins @ 2023-03-28 10:19 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: Joerg Roedel, Suravee Suthikulpanit, Vasant Hegde, Will Deacon,
	Robin Murphy, Maxim Levitsky, Alejandro Jimenez, kvm, iommu

On 28/03/2023 10:07, Alexey Kardashevskiy wrote:
> On 17/3/23 07:02, Joao Martins wrote:
>> On KVM GSI routing table updates, specially those where they have vIOMMUs
>> with interrupt remapping enabled (to boot >255vcpus setups without relying
>> on KVM_FEATURE_MSI_EXT_DEST_ID), a VMM may update the backing VF MSIs
>> with a new VCPU affinity.
>>
>> On AMD with AVIC enabled, the new vcpu affinity info is updated via:
>>     avic_pi_update_irte()
>>         irq_set_vcpu_affinity()
>>             amd_ir_set_vcpu_affinity()
>>                 amd_iommu_{de}activate_guest_mode()
>>
>> Where the IRTE[GATag] is updated with the new vcpu affinity. The GATag
>> contains VM ID and VCPU ID, and is used by IOMMU hardware to signal KVM
>> (via GALog) when interrupt cannot be delivered due to vCPU is in
>> blocking state.
>>
>> The issue is that amd_iommu_activate_guest_mode() will essentially
>> only change IRTE fields on transitions from non-guest-mode to guest-mode
>> and otherwise returns *with no changes to IRTE* on already configured
>> guest-mode interrupts. To the guest this means that the VF interrupts
>> remain affined to the first vCPU they were first configured,and guest
>> will be unable to either VF interrupts and receive messages like this
>> from spuruious interrupts (e.g. from waking the wrong vCPU in GALog):
> 
> The "either" above sounds like there should be a verb which it is not, or is it?
> (my english skills are meh). I kinda get the idea anyway (I hope).
> 
It should be 'issue'. I'll delete the 'either'

> btw s/spuruious/spurious/, says my vim. Thanks,
> 
/me nods


* Re: [PATCH v2 1/2] iommu/amd: Don't block updates to GATag if guest mode is on
  2023-03-24 14:31       ` Sean Christopherson
@ 2023-03-28 10:42         ` Joao Martins
  2023-03-28 15:20           ` Sean Christopherson
  0 siblings, 1 reply; 14+ messages in thread
From: Joao Martins @ 2023-03-28 10:42 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: iommu, Joerg Roedel, Suravee Suthikulpanit, Vasant Hegde,
	Will Deacon, Robin Murphy, Maxim Levitsky, Alejandro Jimenez,
	kvm

[I was out sick, hence the delay]

On 24/03/2023 14:31, Sean Christopherson wrote:
> On Thu, Mar 16, 2023, Joao Martins wrote:
>> On 16/03/2023 21:01, Sean Christopherson wrote:
>>> Is there any harm in giving deactivate the same treatement?  If the worst case
>>> scenario is a few wasted cycles, having symmetric flows and eliminating benign
>>> bugs seems like a worthwhile tradeoff (assuming this is indeed a relatively slow
>>> path like I think it is).
>>>
>>
>> I wanna say there's no harm, but initially I had such a patch, and on testing it
>> broke the classic interrupt remapping case but I didn't investigate further --
>> my suspicion is that the only case that should care is the updates (not the
>> actual deactivation of guest-mode).
> 
> Ugh, I bet this is due to KVM invoking irq_set_vcpu_affinity() with garbage when
> AVIC is enabled, but KVM can't use a posted interrupt due to the how the IRQ is
> configured.  I vaguely recall a bug report about uninitialized data in "pi" being
> consumed, but I can't find it at the moment.
> 
> 	if (!get_pi_vcpu_info(kvm, e, &vcpu_info, &svm) && set &&
> 		    kvm_vcpu_apicv_active(&svm->vcpu)) {
> 
> 		...
> 
> 	} else {
> 			/* Use legacy mode in IRTE */
> 			struct amd_iommu_pi_data pi;
> 
> 			/**
> 			 * Here, pi is used to:
> 			 * - Tell IOMMU to use legacy mode for this interrupt.
> 			 * - Retrieve ga_tag of prior interrupt remapping data.
> 			 */
> 			pi.prev_ga_tag = 0;
> 			pi.is_guest_mode = false;
> 			ret = irq_set_vcpu_affinity(host_irq, &pi);
> 	}
> 
> 

I recall one instance of the 'garbage pi data' issue but this was due to
prev_ga_tag not being initialized (see commit f6426ab9c957). As far as I
understand, the AMD implementation of irq_set_vcpu_affinity() will write back
to the caller the following fields of pi:

- prev_ga_tag
- ir_data
- guest_mode (sometimes when it is unsupported or disabled by the host via cmdline)

On legacy interrupt remap path (no iommu avic) the IRQ update just uses irq data
mostly. It's the avic path that uses more things (vcpu_data, ga_tag, base,
ga_root_ptr, ga_vector), but all of which are initialized by KVM properly already.


* Re: [PATCH v2 1/2] iommu/amd: Don't block updates to GATag if guest mode is on
  2023-03-28 10:42         ` Joao Martins
@ 2023-03-28 15:20           ` Sean Christopherson
  0 siblings, 0 replies; 14+ messages in thread
From: Sean Christopherson @ 2023-03-28 15:20 UTC (permalink / raw)
  To: Joao Martins
  Cc: iommu, Joerg Roedel, Suravee Suthikulpanit, Vasant Hegde,
	Will Deacon, Robin Murphy, Maxim Levitsky, Alejandro Jimenez,
	kvm

On Tue, Mar 28, 2023, Joao Martins wrote:
> [I was out sick, hence the delay]
> 
> On 24/03/2023 14:31, Sean Christopherson wrote:
> > On Thu, Mar 16, 2023, Joao Martins wrote:
> >> On 16/03/2023 21:01, Sean Christopherson wrote:
> >>> Is there any harm in giving deactivate the same treatment?  If the worst case
> >>> scenario is a few wasted cycles, having symmetric flows and eliminating benign
> >>> bugs seems like a worthwhile tradeoff (assuming this is indeed a relatively slow
> >>> path like I think it is).
> >>>
> >>
> >> I wanna say there's no harm, but initially I had such a patch, and on testing it
> >> broke the classic interrupt remapping case but I didn't investigate further --
> >> my suspicion is that the only case that should care is the updates (not the
> >> actual deactivation of guest-mode).
> > 
> > Ugh, I bet this is due to KVM invoking irq_set_vcpu_affinity() with garbage when
> > AVIC is enabled, but KVM can't use a posted interrupt due to how the IRQ is
> > configured.  I vaguely recall a bug report about uninitialized data in "pi" being
> > consumed, but I can't find it at the moment.
> > 
> > 	if (!get_pi_vcpu_info(kvm, e, &vcpu_info, &svm) && set &&
> > 		    kvm_vcpu_apicv_active(&svm->vcpu)) {
> > 
> > 		...
> > 
> > 	} else {
> > 			/* Use legacy mode in IRTE */
> > 			struct amd_iommu_pi_data pi;
> > 
> > 			/**
> > 			 * Here, pi is used to:
> > 			 * - Tell IOMMU to use legacy mode for this interrupt.
> > 			 * - Retrieve ga_tag of prior interrupt remapping data.
> > 			 */
> > 			pi.prev_ga_tag = 0;
> > 			pi.is_guest_mode = false;
> > 			ret = irq_set_vcpu_affinity(host_irq, &pi);
> > 	}
> > 
> > 
> 
> I recall one instance of the 'garbage pi data' issue, but this was due to
> prev_ga_tag not being initialized (see commit f6426ab9c957).

Yep, that's the one I was trying to recall.

> As far as I understand, the AMD implementation of irq_set_vcpu_affinity() will
> write back to the caller the following fields of pi:
> 
> - prev_ga_tag
> - ir_data
> - is_guest_mode (sometimes, when it is unsupported or disabled by the host via cmdline)
> 
> On the legacy interrupt remapping path (no IOMMU AVIC), the IRQ update mostly
> just uses the irq data. It's the AVIC path that uses more things (vcpu_data,
> ga_tag, base, ga_root_ptr, ga_vector), all of which are already initialized
> properly by KVM.

Ya, on my Nth read through, I don't see any issues with KVM's behavior.  I was
thinking that KVM's "pi" could bleed into amd_iommu_deactivate_guest_mode(), but
I had just gotten turned around by the many "data" variables.  Bummer.


* Re: [PATCH v2 2/2] iommu/amd: Handle GALog overflows
  2023-03-16 20:02 ` [PATCH v2 2/2] iommu/amd: Handle GALog overflows Joao Martins
@ 2023-04-13 10:24   ` Suthikulpanit, Suravee
  2023-04-13 10:30     ` Joao Martins
  2023-04-17  5:04   ` Vasant Hegde
  1 sibling, 1 reply; 14+ messages in thread
From: Suthikulpanit, Suravee @ 2023-04-13 10:24 UTC (permalink / raw)
  To: Joao Martins, iommu
  Cc: Joerg Roedel, Vasant Hegde, Will Deacon, Robin Murphy,
	Maxim Levitsky, Alejandro Jimenez, kvm



On 3/17/2023 3:02 AM, Joao Martins wrote:
> GALog exists to propagate interrupts into all vCPUs in the system when
> interrupts are marked as non running (e.g. when vCPUs aren't running). A
> GALog overflow happens when there's no space in the log to record the
> GATag of the interrupt. So when the GALOverflow condition happens, the
> GALog queue is processed and the GALog is restarted, as the IOMMU
> manual indicates in section "2.7.4 Guest Virtual APIC Log Restart
> Procedure":
> 
> | * Wait until MMIO Offset 2020h[GALogRun]=0b so that all request
> |   entries are completed as circumstances allow. GALogRun must be 0b to
> |   modify the guest virtual APIC log registers safely.
> | * Write MMIO Offset 0018h[GALogEn]=0b.
> | * As necessary, change the following values (e.g., to relocate or
> | resize the guest virtual APIC event log):
> |   - the Guest Virtual APIC Log Base Address Register
> |      [MMIO Offset 00E0h],
> |   - the Guest Virtual APIC Log Head Pointer Register
> |      [MMIO Offset 2040h][GALogHead], and
> |   - the Guest Virtual APIC Log Tail Pointer Register
> |      [MMIO Offset 2048h][GALogTail].
> | * Write MMIO Offset 2020h[GALOverflow] = 1b to clear the bit (W1C).
> | * Write MMIO Offset 0018h[GALogEn] = 1b, and either set
> |   MMIO Offset 0018h[GAIntEn] to enable the GA log interrupt or clear
> |   the bit to disable it.
> 
> Failing to handle the GALog overflow means that none of the VFs (in any
> guest) will work with IOMMU AVIC forcing the user to power cycle the
> host. When handling the event it resumes the GALog without resizing
> much like how it is done in the event handler overflow. The
> [MMIO Offset 2020h][GALOverflow] bit might be set in status register
> without the [MMIO Offset 2020h][GAInt] bit, so when deciding to poll
> for GA events (to clear space in the galog), also check the overflow
> bit.
> 
> [suravee: Check for GAOverflow without GAInt, toggle CONTROL_GAINT_EN]

According to the AMD IOMMU spec,

* The GAInt bit is set when a virtual interrupt request is written to the
GALog, and the IOMMU hardware generates an interrupt when GAInt changes
from 0 to 1.

* The GALOverflow bit is set when a new guest virtual APIC event is to be
written to the GALog and there is no usable entry in the GALog, causing
the new event information to be discarded. No interrupt is generated
when GALOverflow is changed from 0b to 1b.

So, whenever the IOMMU driver detects GALOverflow, it should also
make sure to process any existing entries in the GALog.
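
The flow described above, combined with the restart procedure quoted from the
manual, can be sketched as a small stand-alone simulation. This is not the
patch itself: struct fake_iommu and all galog_*() helpers are hypothetical,
the registers are plain variables rather than MMIO, and the bit positions are
assumptions chosen for illustration. The sketch polls when either GAInt or
GALOverflow is set (since overflow raises no interrupt), drains the log, and
then restarts it on overflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions are assumptions made for this sketch. */
#define STATUS_GALOG_RUN       (1u << 8)
#define STATUS_GALOG_OVERFLOW  (1u << 9)
#define STATUS_GALOG_INT       (1u << 10)
#define CONTROL_GALOG_EN       (1u << 28)
#define CONTROL_GAINT_EN       (1u << 29)

/* Hypothetical stand-in for the control (0018h) and status (2020h) registers. */
struct fake_iommu {
	uint32_t control;
	uint32_t status;
	unsigned int pending;   /* entries sitting in the simulated GALog */
	unsigned int drained;   /* entries processed so far */
};

/* Status bits are write-1-to-clear (W1C); the fake models that directly. */
static void status_w1c(struct fake_iommu *io, uint32_t bits)
{
	io->status &= ~bits;
}

/* Overflow raises no interrupt, so it must be checked alongside GAInt. */
static bool galog_needs_poll(uint32_t status)
{
	return (status & (STATUS_GALOG_INT | STATUS_GALOG_OVERFLOW)) != 0;
}

/* Process everything currently in the log to free up space. */
static void galog_drain(struct fake_iommu *io)
{
	status_w1c(io, STATUS_GALOG_INT);
	io->drained += io->pending;
	io->pending = 0;
}

/* Restart without resizing, following the order of the quoted procedure. */
static int galog_restart(struct fake_iommu *io)
{
	int tries = 1000;

	/* Wait for GALogRun == 0b; real code would re-read MMIO and delay. */
	while ((io->status & STATUS_GALOG_RUN) && tries-- > 0)
		;
	if (io->status & STATUS_GALOG_RUN)
		return -1;

	io->control &= ~(CONTROL_GALOG_EN | CONTROL_GAINT_EN); /* disable */
	status_w1c(io, STATUS_GALOG_OVERFLOW);                 /* clear overflow */
	io->control |= CONTROL_GALOG_EN | CONTROL_GAINT_EN;    /* re-enable */
	return 0;
}

static int galog_handle_status(struct fake_iommu *io)
{
	uint32_t status;

	assert(io);
	status = io->status;
	if (!galog_needs_poll(status))
		return 0;

	galog_drain(io);
	if (status & STATUS_GALOG_OVERFLOW)
		return galog_restart(io);
	return 0;
}
```

Draining before the restart is a choice made in this sketch so that entries
already logged are not lost; base, head and tail are left untouched because
the log is not being relocated or resized.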

Please note that we are working on another patch series to isolate the 
interrupts for Event, PPR, and GALog so that each one can be handled 
separately in a similar fashion.

Thanks,
Suravee


* Re: [PATCH v2 2/2] iommu/amd: Handle GALog overflows
  2023-04-13 10:24   ` Suthikulpanit, Suravee
@ 2023-04-13 10:30     ` Joao Martins
  2023-04-13 10:41       ` Suthikulpanit, Suravee
  0 siblings, 1 reply; 14+ messages in thread
From: Joao Martins @ 2023-04-13 10:30 UTC (permalink / raw)
  To: Suthikulpanit, Suravee
  Cc: Joerg Roedel, Vasant Hegde, Will Deacon, Robin Murphy,
	Maxim Levitsky, Alejandro Jimenez, kvm, iommu

On 13/04/2023 11:24, Suthikulpanit, Suravee wrote:
> On 3/17/2023 3:02 AM, Joao Martins wrote:
>> GALog exists to propagate interrupts into all vCPUs in the system when
>> interrupts are marked as non running (e.g. when vCPUs aren't running). A
>> GALog overflow happens when there's no space in the log to record the
>> GATag of the interrupt. So when the GALOverflow condition happens, the
>> GALog queue is processed and the GALog is restarted, as the IOMMU
>> manual indicates in section "2.7.4 Guest Virtual APIC Log Restart
>> Procedure":
>>
>> | * Wait until MMIO Offset 2020h[GALogRun]=0b so that all request
>> |   entries are completed as circumstances allow. GALogRun must be 0b to
>> |   modify the guest virtual APIC log registers safely.
>> | * Write MMIO Offset 0018h[GALogEn]=0b.
>> | * As necessary, change the following values (e.g., to relocate or
>> | resize the guest virtual APIC event log):
>> |   - the Guest Virtual APIC Log Base Address Register
>> |      [MMIO Offset 00E0h],
>> |   - the Guest Virtual APIC Log Head Pointer Register
>> |      [MMIO Offset 2040h][GALogHead], and
>> |   - the Guest Virtual APIC Log Tail Pointer Register
>> |      [MMIO Offset 2048h][GALogTail].
>> | * Write MMIO Offset 2020h[GALOverflow] = 1b to clear the bit (W1C).
>> | * Write MMIO Offset 0018h[GALogEn] = 1b, and either set
>> |   MMIO Offset 0018h[GAIntEn] to enable the GA log interrupt or clear
>> |   the bit to disable it.
>>
>> Failing to handle the GALog overflow means that none of the VFs (in any
>> guest) will work with IOMMU AVIC forcing the user to power cycle the
>> host. When handling the event it resumes the GALog without resizing
>> much like how it is done in the event handler overflow. The
>> [MMIO Offset 2020h][GALOverflow] bit might be set in status register
>> without the [MMIO Offset 2020h][GAInt] bit, so when deciding to poll
>> for GA events (to clear space in the galog), also check the overflow
>> bit.
>>
>> [suravee: Check for GAOverflow without GAInt, toggle CONTROL_GAINT_EN]
> 
> According to the AMD IOMMU spec,
> 
> * The GAInt bit is set when a virtual interrupt request is written to the GALog,
> and the IOMMU hardware generates an interrupt when GAInt changes from 0 to 1.
> 
> * The GALOverflow bit is set when a new guest virtual APIC event is to be written
> to the GALog and there is no usable entry in the GALog, causing the new event
> information to be discarded. No interrupt is generated when GALOverflow is
> changed from 0b to 1b.
> 
> So, whenever the IOMMU driver detects GALOverflow, it should also make sure to
> process any existing entries in the GALog.
> 

... And I am doing all that, aren't I?

Or do you want me to edit the commit message to quote these two bullet points from
the IOMMU manual?

> Please note that we are working on another patch series to isolate the
> interrupts for Event, PPR, and GALog so that each one can be handled separately
> in a similar fashion.
> 
Cool, if possible please CC me on that series.

	Joao


* Re: [PATCH v2 2/2] iommu/amd: Handle GALog overflows
  2023-04-13 10:30     ` Joao Martins
@ 2023-04-13 10:41       ` Suthikulpanit, Suravee
  0 siblings, 0 replies; 14+ messages in thread
From: Suthikulpanit, Suravee @ 2023-04-13 10:41 UTC (permalink / raw)
  To: Joao Martins
  Cc: Joerg Roedel, Vasant Hegde, Will Deacon, Robin Murphy,
	Maxim Levitsky, Alejandro Jimenez, kvm, iommu



On 4/13/2023 5:30 PM, Joao Martins wrote:
> On 13/04/2023 11:24, Suthikulpanit, Suravee wrote:
>> On 3/17/2023 3:02 AM, Joao Martins wrote:
>>> GALog exists to propagate interrupts into all vCPUs in the system when
>>> interrupts are marked as non running (e.g. when vCPUs aren't running). A
>>> GALog overflow happens when there's no space in the log to record the
>>> GATag of the interrupt. So when the GALOverflow condition happens, the
>>> GALog queue is processed and the GALog is restarted, as the IOMMU
>>> manual indicates in section "2.7.4 Guest Virtual APIC Log Restart
>>> Procedure":
>>>
>>> | * Wait until MMIO Offset 2020h[GALogRun]=0b so that all request
>>> |   entries are completed as circumstances allow. GALogRun must be 0b to
>>> |   modify the guest virtual APIC log registers safely.
>>> | * Write MMIO Offset 0018h[GALogEn]=0b.
>>> | * As necessary, change the following values (e.g., to relocate or
>>> | resize the guest virtual APIC event log):
>>> |   - the Guest Virtual APIC Log Base Address Register
>>> |      [MMIO Offset 00E0h],
>>> |   - the Guest Virtual APIC Log Head Pointer Register
>>> |      [MMIO Offset 2040h][GALogHead], and
>>> |   - the Guest Virtual APIC Log Tail Pointer Register
>>> |      [MMIO Offset 2048h][GALogTail].
>>> | * Write MMIO Offset 2020h[GALOverflow] = 1b to clear the bit (W1C).
>>> | * Write MMIO Offset 0018h[GALogEn] = 1b, and either set
>>> |   MMIO Offset 0018h[GAIntEn] to enable the GA log interrupt or clear
>>> |   the bit to disable it.
>>>
>>> Failing to handle the GALog overflow means that none of the VFs (in any
>>> guest) will work with IOMMU AVIC forcing the user to power cycle the
>>> host. When handling the event it resumes the GALog without resizing
>>> much like how it is done in the event handler overflow. The
>>> [MMIO Offset 2020h][GALOverflow] bit might be set in status register
>>> without the [MMIO Offset 2020h][GAInt] bit, so when deciding to poll
>>> for GA events (to clear space in the galog), also check the overflow
>>> bit.
>>>
>>> [suravee: Check for GAOverflow without GAInt, toggle CONTROL_GAINT_EN]
>> According to the AMD IOMMU spec,
>>
>> * The GAInt bit is set when a virtual interrupt request is written to the GALog,
>> and the IOMMU hardware generates an interrupt when GAInt changes from 0 to 1.
>>
>> * The GALOverflow bit is set when a new guest virtual APIC event is to be written
>> to the GALog and there is no usable entry in the GALog, causing the new event
>> information to be discarded. No interrupt is generated when GALOverflow is
>> changed from 0b to 1b.
>>
>> So, whenever the IOMMU driver detects GALOverflow, it should also make sure to
>> process any existing entries in the GALog.
>>
> ... And I am doing all that, aren't I?

Correct. I am just following up to clarify the details of the GALOverflow and
GAInt bits.


> Or do you want me to edit the commit message to quote these two bullet points from
> the IOMMU manual?

No need since this is already documented in the spec.

Thanks,
Suravee


* Re: [PATCH v2 2/2] iommu/amd: Handle GALog overflows
  2023-03-16 20:02 ` [PATCH v2 2/2] iommu/amd: Handle GALog overflows Joao Martins
  2023-04-13 10:24   ` Suthikulpanit, Suravee
@ 2023-04-17  5:04   ` Vasant Hegde
  1 sibling, 0 replies; 14+ messages in thread
From: Vasant Hegde @ 2023-04-17  5:04 UTC (permalink / raw)
  To: Joao Martins, iommu
  Cc: Joerg Roedel, Suravee Suthikulpanit, Will Deacon, Robin Murphy,
	Maxim Levitsky, Alejandro Jimenez, kvm

On 3/17/2023 1:32 AM, Joao Martins wrote:
> GALog exists to propagate interrupts into all vCPUs in the system when
> interrupts are marked as non running (e.g. when vCPUs aren't running). A
> GALog overflow happens when there's no space in the log to record the
> GATag of the interrupt. So when the GALOverflow condition happens, the
> GALog queue is processed and the GALog is restarted, as the IOMMU
> manual indicates in section "2.7.4 Guest Virtual APIC Log Restart
> Procedure":
> 
> | * Wait until MMIO Offset 2020h[GALogRun]=0b so that all request
> |   entries are completed as circumstances allow. GALogRun must be 0b to
> |   modify the guest virtual APIC log registers safely.
> | * Write MMIO Offset 0018h[GALogEn]=0b.
> | * As necessary, change the following values (e.g., to relocate or
> | resize the guest virtual APIC event log):
> |   - the Guest Virtual APIC Log Base Address Register
> |      [MMIO Offset 00E0h],
> |   - the Guest Virtual APIC Log Head Pointer Register
> |      [MMIO Offset 2040h][GALogHead], and
> |   - the Guest Virtual APIC Log Tail Pointer Register
> |      [MMIO Offset 2048h][GALogTail].
> | * Write MMIO Offset 2020h[GALOverflow] = 1b to clear the bit (W1C).
> | * Write MMIO Offset 0018h[GALogEn] = 1b, and either set
> |   MMIO Offset 0018h[GAIntEn] to enable the GA log interrupt or clear
> |   the bit to disable it.
> 
> Failing to handle the GALog overflow means that none of the VFs (in any
> guest) will work with IOMMU AVIC forcing the user to power cycle the
> host. When handling the event it resumes the GALog without resizing
> much like how it is done in the event handler overflow. The
> [MMIO Offset 2020h][GALOverflow] bit might be set in status register
> without the [MMIO Offset 2020h][GAInt] bit, so when deciding to poll
> for GA events (to clear space in the galog), also check the overflow
> bit.
> 
> [suravee: Check for GAOverflow without GAInt, toggle CONTROL_GAINT_EN]
> Co-developed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>

Patch looks good to me.

Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>

-Vasant




end of thread, other threads:[~2023-04-17  5:05 UTC | newest]

Thread overview: 14+ messages
2023-03-16 20:02 [PATCH v2 0/2] iommu/amd: Fix GAM IRTEs affinity and GALog restart Joao Martins
2023-03-16 20:02 ` [PATCH v2 1/2] iommu/amd: Don't block updates to GATag if guest mode is on Joao Martins
2023-03-16 21:01   ` Sean Christopherson
2023-03-16 21:25     ` Joao Martins
2023-03-24 14:31       ` Sean Christopherson
2023-03-28 10:42         ` Joao Martins
2023-03-28 15:20           ` Sean Christopherson
2023-03-28  9:07   ` Alexey Kardashevskiy
2023-03-28 10:19     ` Joao Martins
2023-03-16 20:02 ` [PATCH v2 2/2] iommu/amd: Handle GALog overflows Joao Martins
2023-04-13 10:24   ` Suthikulpanit, Suravee
2023-04-13 10:30     ` Joao Martins
2023-04-13 10:41       ` Suthikulpanit, Suravee
2023-04-17  5:04   ` Vasant Hegde
