* [PATCH v3 0/5] iommu/arm-smmu-v3: Add NVIDIA Grace CMDQ-V Support
@ 2021-11-19 7:19 ` Nicolin Chen via iommu
0 siblings, 0 replies; 51+ messages in thread
From: Nicolin Chen @ 2021-11-19 7:19 UTC (permalink / raw)
To: joro, will, robin.murphy
Cc: nicoleotsuka, thierry.reding, vdumpa, nwatterson, jean-philippe,
thunder.leizhen, chenxiang66, Jonathan.Cameron, yuzenghui,
linux-kernel, iommu, linux-arm-kernel, linux-tegra, jgg
From: Nicolin Chen <nicoleotsuka@gmail.com>
NVIDIA's Grace SoC has CMDQ-Virtualization (CMDQV) hardware that
extends the standard ARM SMMUv3 to support multiple command queues
with virtualization capabilities. Though this is similar to the ECMDQ
in SMMUv3.3, CMDQV provides additional V-Interfaces that allow VMs to
have their own interfaces and command queues. Compared to the standard
SMMUv3 CMDQ, these queues can execute only a limited set of commands,
mainly TLB invalidation commands, when running in guest mode.
This patch series extends the SMMUv3 driver to support NVIDIA CMDQV
and implements it first for in-kernel use. Upon kernel boot, some of
the vcmdqs will be set up for the kernel driver to use. The driver
selects one of the command queues based on the currently executing
CPU, to avoid the lock contention hot spots of a single queue.
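For readers who want a concrete picture of the per-CPU queue selection
described above, here is a minimal userspace sketch. It is not code
from this series: the demo_* names are hypothetical, and the plain
modulo mapping from CPU number to queue is an assumption made for
illustration only.

```c
#include <stddef.h>

/* Hypothetical stand-in for one command queue; not the driver's struct. */
struct demo_cmdq {
	unsigned int id;
};

/* Hypothetical container for the queues that were set up at boot. */
struct demo_cmdqv {
	struct demo_cmdq *queues;
	unsigned int num_queues;
};

/*
 * Pick a queue from the CPU number so that different CPUs tend to
 * land on different queues. The modulo mapping is an assumption;
 * the real driver may distribute CPUs differently.
 */
static struct demo_cmdq *demo_get_cmdq(struct demo_cmdqv *cmdqv,
				       unsigned int cpu)
{
	if (!cmdqv || !cmdqv->num_queues)
		return NULL;

	return &cmdqv->queues[cpu % cmdqv->num_queues];
}
```

With two queues, CPUs 0 and 2 would share the first queue and CPUs 1
and 3 the second, so contention on any single queue lock is roughly
halved in this toy model.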
Although the HW is able to securely expose the additional V-Interfaces
and command queues to guest VMs for fast TLB invalidations without
a hypervisor trap, due to the ongoing proposal of IOMMUFD [0], we
have to postpone the virtualization support that was available in
v2, as suggested by Alex and Jason [1]. We envision that it will
be added back via IOMMUFD in the months ahead.
Thank you!
[0] https://lore.kernel.org/lkml/20210919063848.1476776-1-yi.l.liu@intel.com/
[1] https://lore.kernel.org/kvm/20210831101549.237151fa.alex.williamson@redhat.com/T/#ma07dcfce69fa3f9d59e8b16579f694a0e10798d9
Changelog (details available in PATCH)
v2->v3:
* Dropped VMID and mdev patches to redesign later based on IOMMUFD.
* Separated HYP_OWN part for guest support into a new patch.
* Added new preparational changes.
v1->v2:
* Added mdev interface support for hypervisor and VMs.
* Added preparational changes for mdev interface implementation.
* PATCH-12 Changed ->issue_cmdlist() to ->get_cmdq() for a better
integration with recently merged ECMDQ-related changes.
Nate Watterson (1):
iommu/arm-smmu-v3: Add host support for NVIDIA Grace CMDQ-V
Nicolin Chen (4):
iommu/arm-smmu-v3: Add CS_NONE quirk
iommu/arm-smmu-v3: Make arm_smmu_cmdq_init reusable
iommu/arm-smmu-v3: Pass cmdq pointer in arm_smmu_cmdq_issue_cmdlist()
iommu/nvidia-grace-cmdqv: Limit CMDs for guest owned VINTF
MAINTAINERS | 1 +
drivers/iommu/Kconfig | 12 +
drivers/iommu/arm/arm-smmu-v3/Makefile | 1 +
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 53 ++-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 48 ++
.../arm/arm-smmu-v3/nvidia-grace-cmdqv.c | 446 ++++++++++++++++++
6 files changed, 542 insertions(+), 19 deletions(-)
create mode 100644 drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
--
2.17.1
* [PATCH v3 1/5] iommu/arm-smmu-v3: Add CS_NONE quirk
2021-11-19 7:19 ` Nicolin Chen via iommu
(?)
@ 2021-11-19 7:19 ` Nicolin Chen via iommu
-1 siblings, 0 replies; 51+ messages in thread
From: Nicolin Chen @ 2021-11-19 7:19 UTC (permalink / raw)
To: joro, will, robin.murphy
Cc: nicoleotsuka, thierry.reding, vdumpa, nwatterson, jean-philippe,
thunder.leizhen, chenxiang66, Jonathan.Cameron, yuzenghui,
linux-kernel, iommu, linux-arm-kernel, linux-tegra, jgg
The CMDQV extension in the NVIDIA Grace SoC supports only CS_NONE in
the CS field of CMD_SYNC. This patch adds a per-queue quirk flag so
that CMD_SYNC is always built with CS_NONE on such queues.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
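Illustration only, not part of the patch: a small userspace sketch of
the CS selection order that the hunk below introduces. The demo_*
names are hypothetical. The bit position of the CS field (bits 13:12)
and the encodings NONE=0, IRQ=1, SEV=2 are taken from my reading of
the mainline arm-smmu-v3.h and should be treated as assumptions here.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_OP_CMD_SYNC	0x46u	/* matches CMDQ_OP_CMD_SYNC in the diff */
#define DEMO_SYNC_CS_SHIFT	12	/* assumed: CS occupies bits 13:12 */

/* Assumed encodings of the CS (completion signal) field. */
enum demo_cs {
	DEMO_CS_NONE = 0,
	DEMO_CS_IRQ  = 1,
	DEMO_CS_SEV  = 2,
};

/*
 * Build word 0 of a CMD_SYNC. The quirk (cs_none) is checked first,
 * so it overrides the MSI path even when an MSI address is supplied.
 */
static uint64_t demo_build_sync_word0(bool cs_none, uint64_t msiaddr)
{
	uint64_t cmd0 = DEMO_OP_CMD_SYNC;
	enum demo_cs cs;

	if (cs_none)
		cs = DEMO_CS_NONE;
	else if (msiaddr)
		cs = DEMO_CS_IRQ;
	else
		cs = DEMO_CS_SEV;

	cmd0 |= (uint64_t)cs << DEMO_SYNC_CS_SHIFT;
	return cmd0;
}
```

The point of the ordering is that a queue flagged with the quirk never
emits CS_IRQ or CS_SEV, whatever the caller passes as the MSI address.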
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 7 ++++++-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 4 ++++
2 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index f5848b351b19..e6fee69dd79c 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -319,7 +319,9 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
cmd[1] |= FIELD_PREP(CMDQ_RESUME_1_STAG, ent->resume.stag);
break;
case CMDQ_OP_CMD_SYNC:
- if (ent->sync.msiaddr) {
+ if (ent->sync.cs_none) {
+ cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_CS, CMDQ_SYNC_0_CS_NONE);
+ } else if (ent->sync.msiaddr) {
cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_CS, CMDQ_SYNC_0_CS_IRQ);
cmd[1] |= ent->sync.msiaddr & CMDQ_SYNC_1_MSIADDR_MASK;
} else {
@@ -356,6 +358,9 @@ static void arm_smmu_cmdq_build_sync_cmd(u64 *cmd, struct arm_smmu_device *smmu,
q->ent_dwords * 8;
}
+ if (q->quirks & CMDQ_QUIRK_SYNC_CS_NONE_ONLY)
+ ent.sync.cs_none = true;
+
arm_smmu_cmdq_build_cmd(cmd, &ent);
}
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 4cb136f07914..7a6a6045700d 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -499,6 +499,7 @@ struct arm_smmu_cmdq_ent {
#define CMDQ_OP_CMD_SYNC 0x46
struct {
u64 msiaddr;
+ bool cs_none;
} sync;
};
};
@@ -531,6 +532,9 @@ struct arm_smmu_queue {
u32 __iomem *prod_reg;
u32 __iomem *cons_reg;
+
+#define CMDQ_QUIRK_SYNC_CS_NONE_ONLY BIT(0) /* CMD_SYNC CS field supports CS_NONE only */
+ u32 quirks;
};
struct arm_smmu_queue_poll {
--
2.17.1
* [PATCH v3 2/5] iommu/arm-smmu-v3: Make arm_smmu_cmdq_init reusable
2021-11-19 7:19 ` Nicolin Chen via iommu
(?)
@ 2021-11-19 7:19 ` Nicolin Chen via iommu
-1 siblings, 0 replies; 51+ messages in thread
From: Nicolin Chen @ 2021-11-19 7:19 UTC (permalink / raw)
To: joro, will, robin.murphy
Cc: nicoleotsuka, thierry.reding, vdumpa, nwatterson, jean-philippe,
thunder.leizhen, chenxiang66, Jonathan.Cameron, yuzenghui,
linux-kernel, iommu, linux-arm-kernel, linux-tegra, jgg
The CMDQV extension in the NVIDIA Grace SoC reuses the arm_smmu_cmdq
structure, but its queues are not located at smmu->cmdq. So this
patch adds a cmdq argument to the arm_smmu_cmdq_init() function and
declares it in the header for the CMDQV driver to use.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
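Illustration only, not part of the patch: a userspace sketch of why
taking the queue as an argument makes the init routine reusable. The
demo_* names are hypothetical, and calloc() stands in for the kernel's
bitmap allocation, so this is a model of the idea rather than of the
driver.

```c
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified command queue. */
struct demo_cmdq {
	unsigned int max_n_shift;	/* log2 of the number of entries */
	unsigned int nents;
	unsigned long *valid_map;	/* one bit per queue entry */
};

/* Hypothetical device that embeds its built-in queue. */
struct demo_smmu {
	struct demo_cmdq cmdq;
};

/*
 * Initialise whichever queue the caller hands in. Before the change
 * the function could only reach &smmu->cmdq; with the extra argument
 * it also works for queues owned by another driver.
 */
static int demo_cmdq_init(struct demo_smmu *smmu, struct demo_cmdq *cmdq)
{
	const size_t bits_per_long = sizeof(unsigned long) * CHAR_BIT;
	size_t nlongs;

	(void)smmu;	/* unused in this sketch */

	cmdq->nents = 1u << cmdq->max_n_shift;
	nlongs = (cmdq->nents + bits_per_long - 1) / bits_per_long;
	cmdq->valid_map = calloc(nlongs, sizeof(unsigned long));

	return cmdq->valid_map ? 0 : -1;
}
```

A caller would pass &smmu->cmdq for the built-in queue, as the second
hunk below does, and a pointer to its own queue for any additional
one.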
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 6 +++---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 2 ++
2 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index e6fee69dd79c..6be20e926f63 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2922,10 +2922,10 @@ static void arm_smmu_cmdq_free_bitmap(void *data)
bitmap_free(bitmap);
}
-static int arm_smmu_cmdq_init(struct arm_smmu_device *smmu)
+int arm_smmu_cmdq_init(struct arm_smmu_device *smmu,
+ struct arm_smmu_cmdq *cmdq)
{
int ret = 0;
- struct arm_smmu_cmdq *cmdq = &smmu->cmdq;
unsigned int nents = 1 << cmdq->q.llq.max_n_shift;
atomic_long_t *bitmap;
@@ -2955,7 +2955,7 @@ static int arm_smmu_init_queues(struct arm_smmu_device *smmu)
if (ret)
return ret;
- ret = arm_smmu_cmdq_init(smmu);
+ ret = arm_smmu_cmdq_init(smmu, &smmu->cmdq);
if (ret)
return ret;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 7a6a6045700d..475f004ccbe4 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -751,6 +751,8 @@ void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid,
bool arm_smmu_free_asid(struct arm_smmu_ctx_desc *cd);
int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid,
unsigned long iova, size_t size);
+int arm_smmu_cmdq_init(struct arm_smmu_device *smmu,
+ struct arm_smmu_cmdq *cmdq);
#ifdef CONFIG_ARM_SMMU_V3_SVA
bool arm_smmu_sva_supported(struct arm_smmu_device *smmu);
--
2.17.1
* [PATCH v3 3/5] iommu/arm-smmu-v3: Pass cmdq pointer in arm_smmu_cmdq_issue_cmdlist()
2021-11-19 7:19 ` Nicolin Chen via iommu
(?)
@ 2021-11-19 7:19 ` Nicolin Chen via iommu
-1 siblings, 0 replies; 51+ messages in thread
From: Nicolin Chen @ 2021-11-19 7:19 UTC (permalink / raw)
To: joro, will, robin.murphy
Cc: nicoleotsuka, thierry.reding, vdumpa, nwatterson, jean-philippe,
thunder.leizhen, chenxiang66, Jonathan.Cameron, yuzenghui,
linux-kernel, iommu, linux-arm-kernel, linux-tegra, jgg
The driver currently calls the arm_smmu_get_cmdq() helper internally
in several places, though they are all reached from the same
source: the arm_smmu_cmdq_issue_cmdlist() function.
This patch passes the cmdq pointer to these functions instead of
calling arm_smmu_get_cmdq() every time.
This also helps the CMDQV extension in the NVIDIA Grace SoC, whose
driver maintains its own cmdq pointers and needs to redirect
arm_smmu->cmdq to one of them when it sees a supported command,
which it determines by checking the opcode.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
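Illustration only, not part of the patch: a userspace sketch of the
"look up the queue once, then pass it down" pattern. All demo_* names
are hypothetical. The opcode-based lookup and the choice of which
opcode counts as supported are assumptions for illustration; at this
point in the series arm_smmu_get_cmdq() takes only the smmu pointer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_OP_CFGI_STE	0x03u
#define DEMO_OP_TLBI_NH_VA	0x12u

/* Hypothetical queue that just counts how often it was polled. */
struct demo_cmdq {
	unsigned int polls;
};

struct demo_smmu {
	struct demo_cmdq cmdq;	/* built-in queue */
	struct demo_cmdq *alt;	/* optional extra queue, may be NULL */
};

/* Assumption: only a TLB invalidation may go to the extra queue. */
static bool demo_alt_supports(uint8_t opcode)
{
	return opcode == DEMO_OP_TLBI_NH_VA;
}

/* Single decision point for which queue a command uses. */
static struct demo_cmdq *demo_get_cmdq(struct demo_smmu *smmu, uint8_t opcode)
{
	if (smmu->alt && demo_alt_supports(opcode))
		return smmu->alt;

	return &smmu->cmdq;
}

/* Helper takes the queue as an argument instead of looking it up again. */
static int demo_poll_until_sync(struct demo_smmu *smmu, struct demo_cmdq *cmdq)
{
	(void)smmu;	/* unused in this sketch */
	cmdq->polls++;
	return 0;
}

/* The issuing path chooses the queue once and hands it to every helper. */
static int demo_issue_cmd(struct demo_smmu *smmu, uint8_t opcode)
{
	struct demo_cmdq *cmdq = demo_get_cmdq(smmu, opcode);

	return demo_poll_until_sync(smmu, cmdq);
}
```

Because the helpers no longer repeat the lookup, they are guaranteed
to poll the same queue that the command was written to, even when the
lookup result depends on the command itself.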
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 6be20e926f63..188865ec9a33 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -586,11 +586,11 @@ static void arm_smmu_cmdq_poll_valid_map(struct arm_smmu_cmdq *cmdq,
/* Wait for the command queue to become non-full */
static int arm_smmu_cmdq_poll_until_not_full(struct arm_smmu_device *smmu,
+ struct arm_smmu_cmdq *cmdq,
struct arm_smmu_ll_queue *llq)
{
unsigned long flags;
struct arm_smmu_queue_poll qp;
- struct arm_smmu_cmdq *cmdq = arm_smmu_get_cmdq(smmu);
int ret = 0;
/*
@@ -621,11 +621,11 @@ static int arm_smmu_cmdq_poll_until_not_full(struct arm_smmu_device *smmu,
* Must be called with the cmdq lock held in some capacity.
*/
static int __arm_smmu_cmdq_poll_until_msi(struct arm_smmu_device *smmu,
+ struct arm_smmu_cmdq *cmdq,
struct arm_smmu_ll_queue *llq)
{
int ret = 0;
struct arm_smmu_queue_poll qp;
- struct arm_smmu_cmdq *cmdq = arm_smmu_get_cmdq(smmu);
u32 *cmd = (u32 *)(Q_ENT(&cmdq->q, llq->prod));
queue_poll_init(smmu, &qp);
@@ -645,10 +645,10 @@ static int __arm_smmu_cmdq_poll_until_msi(struct arm_smmu_device *smmu,
* Must be called with the cmdq lock held in some capacity.
*/
static int __arm_smmu_cmdq_poll_until_consumed(struct arm_smmu_device *smmu,
+ struct arm_smmu_cmdq *cmdq,
struct arm_smmu_ll_queue *llq)
{
struct arm_smmu_queue_poll qp;
- struct arm_smmu_cmdq *cmdq = arm_smmu_get_cmdq(smmu);
u32 prod = llq->prod;
int ret = 0;
@@ -695,12 +695,13 @@ static int __arm_smmu_cmdq_poll_until_consumed(struct arm_smmu_device *smmu,
}
static int arm_smmu_cmdq_poll_until_sync(struct arm_smmu_device *smmu,
+ struct arm_smmu_cmdq *cmdq,
struct arm_smmu_ll_queue *llq)
{
if (smmu->options & ARM_SMMU_OPT_MSIPOLL)
- return __arm_smmu_cmdq_poll_until_msi(smmu, llq);
+ return __arm_smmu_cmdq_poll_until_msi(smmu, cmdq, llq);
- return __arm_smmu_cmdq_poll_until_consumed(smmu, llq);
+ return __arm_smmu_cmdq_poll_until_consumed(smmu, cmdq, llq);
}
static void arm_smmu_cmdq_write_entries(struct arm_smmu_cmdq *cmdq, u64 *cmds,
@@ -757,7 +758,7 @@ static int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
while (!queue_has_space(&llq, n + sync)) {
local_irq_restore(flags);
- if (arm_smmu_cmdq_poll_until_not_full(smmu, &llq))
+ if (arm_smmu_cmdq_poll_until_not_full(smmu, cmdq, &llq))
dev_err_ratelimited(smmu->dev, "CMDQ timeout\n");
local_irq_save(flags);
}
@@ -833,7 +834,7 @@ static int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
/* 5. If we are inserting a CMD_SYNC, we must wait for it to complete */
if (sync) {
llq.prod = queue_inc_prod_n(&llq, n);
- ret = arm_smmu_cmdq_poll_until_sync(smmu, &llq);
+ ret = arm_smmu_cmdq_poll_until_sync(smmu, cmdq, &llq);
if (ret) {
dev_err_ratelimited(smmu->dev,
"CMD_SYNC timeout at 0x%08x [hwprod 0x%08x, hwcons 0x%08x]\n",
--
2.17.1
^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH v3 3/5] iommu/arm-smmu-v3: Pass cmdq pointer in arm_smmu_cmdq_issue_cmdlist()
@ 2021-11-19 7:19 ` Nicolin Chen via iommu
0 siblings, 0 replies; 51+ messages in thread
From: Nicolin Chen via iommu @ 2021-11-19 7:19 UTC (permalink / raw)
To: joro, will, robin.murphy
Cc: jean-philippe, linux-kernel, iommu, linux-tegra, thierry.reding,
jgg, linux-arm-kernel
The driver currently calls arm_smmu_get_cmdq() helper internally in
different places, though they are all actually called from the same
source -- arm_smmu_cmdq_issue_cmdlist() function.
This patch changes this to pass the cmdq pointer to these functions
instead of calling arm_smmu_get_cmdq() every time.
This also helps CMDQV extension in NVIDIA Grace SoC, whose driver'd
maintain its own cmdq pointers and needs to redirect arm_smmu->cmdq
to that upon seeing a supported command by checking its opcode.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 6be20e926f63..188865ec9a33 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -586,11 +586,11 @@ static void arm_smmu_cmdq_poll_valid_map(struct arm_smmu_cmdq *cmdq,
/* Wait for the command queue to become non-full */
static int arm_smmu_cmdq_poll_until_not_full(struct arm_smmu_device *smmu,
+ struct arm_smmu_cmdq *cmdq,
struct arm_smmu_ll_queue *llq)
{
unsigned long flags;
struct arm_smmu_queue_poll qp;
- struct arm_smmu_cmdq *cmdq = arm_smmu_get_cmdq(smmu);
int ret = 0;
/*
@@ -621,11 +621,11 @@ static int arm_smmu_cmdq_poll_until_not_full(struct arm_smmu_device *smmu,
* Must be called with the cmdq lock held in some capacity.
*/
static int __arm_smmu_cmdq_poll_until_msi(struct arm_smmu_device *smmu,
+ struct arm_smmu_cmdq *cmdq,
struct arm_smmu_ll_queue *llq)
{
int ret = 0;
struct arm_smmu_queue_poll qp;
- struct arm_smmu_cmdq *cmdq = arm_smmu_get_cmdq(smmu);
u32 *cmd = (u32 *)(Q_ENT(&cmdq->q, llq->prod));
queue_poll_init(smmu, &qp);
@@ -645,10 +645,10 @@ static int __arm_smmu_cmdq_poll_until_msi(struct arm_smmu_device *smmu,
* Must be called with the cmdq lock held in some capacity.
*/
static int __arm_smmu_cmdq_poll_until_consumed(struct arm_smmu_device *smmu,
+ struct arm_smmu_cmdq *cmdq,
struct arm_smmu_ll_queue *llq)
{
struct arm_smmu_queue_poll qp;
- struct arm_smmu_cmdq *cmdq = arm_smmu_get_cmdq(smmu);
u32 prod = llq->prod;
int ret = 0;
@@ -695,12 +695,13 @@ static int __arm_smmu_cmdq_poll_until_consumed(struct arm_smmu_device *smmu,
}
static int arm_smmu_cmdq_poll_until_sync(struct arm_smmu_device *smmu,
+ struct arm_smmu_cmdq *cmdq,
struct arm_smmu_ll_queue *llq)
{
if (smmu->options & ARM_SMMU_OPT_MSIPOLL)
- return __arm_smmu_cmdq_poll_until_msi(smmu, llq);
+ return __arm_smmu_cmdq_poll_until_msi(smmu, cmdq, llq);
- return __arm_smmu_cmdq_poll_until_consumed(smmu, llq);
+ return __arm_smmu_cmdq_poll_until_consumed(smmu, cmdq, llq);
}
static void arm_smmu_cmdq_write_entries(struct arm_smmu_cmdq *cmdq, u64 *cmds,
@@ -757,7 +758,7 @@ static int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
while (!queue_has_space(&llq, n + sync)) {
local_irq_restore(flags);
- if (arm_smmu_cmdq_poll_until_not_full(smmu, &llq))
+ if (arm_smmu_cmdq_poll_until_not_full(smmu, cmdq, &llq))
dev_err_ratelimited(smmu->dev, "CMDQ timeout\n");
local_irq_save(flags);
}
@@ -833,7 +834,7 @@ static int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
/* 5. If we are inserting a CMD_SYNC, we must wait for it to complete */
if (sync) {
llq.prod = queue_inc_prod_n(&llq, n);
- ret = arm_smmu_cmdq_poll_until_sync(smmu, &llq);
+ ret = arm_smmu_cmdq_poll_until_sync(smmu, cmdq, &llq);
if (ret) {
dev_err_ratelimited(smmu->dev,
"CMD_SYNC timeout at 0x%08x [hwprod 0x%08x, hwcons 0x%08x]\n",
--
2.17.1
^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH v3 4/5] iommu/arm-smmu-v3: Add host support for NVIDIA Grace CMDQ-V
2021-11-19 7:19 ` Nicolin Chen via iommu
(?)
@ 2021-11-19 7:19 ` Nicolin Chen via iommu
-1 siblings, 0 replies; 51+ messages in thread
From: Nicolin Chen @ 2021-11-19 7:19 UTC (permalink / raw)
To: joro, will, robin.murphy
Cc: nicoleotsuka, thierry.reding, vdumpa, nwatterson, jean-philippe,
thunder.leizhen, chenxiang66, Jonathan.Cameron, yuzenghui,
linux-kernel, iommu, linux-arm-kernel, linux-tegra, jgg
From: Nate Watterson <nwatterson@nvidia.com>
NVIDIA's Grace SoC has CMDQ-Virtualization (CMDQV) hardware,
which extends the standard ARM SMMUv3 IP to support multiple
VCMDQs with virtualization capabilities. In the host kernel,
they are used to reduce contention on a single queue. As
command queues, they are very similar to the standard CMDQ and
ECMDQs, but support only CS_NONE in the CS field of the
CMD_SYNC command.
This patch adds a new nvidia-grace-cmdqv file, inserts a
pointer to its structure into the existing arm_smmu_device,
and adds the related function calls to the arm-smmu-v3 driver.
In the CMDQV driver itself, this patch adds only the minimal
part needed for host kernel support. Upon probe(), VINTF0 is
reserved for in-kernel use, and some of the VCMDQs are assigned
to VINTF0. The driver then selects one of the VCMDQs in VINTF0,
based on the CPU currently executing, to issue commands.
Note that, under the current plan, the CMDQV driver supports
only ACPI configuration.
Signed-off-by: Nate Watterson <nwatterson@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
Changelog:
v2->v3:
* Replaced impl design with simpler "nvidia_grace_cmdqv" pointer
* Aligned all naming to "nvidia_grace_cmdqv" or "cmdqv"
* Changed VINTF_ENABLED check in get_cmdq() to VINTF_STATUS
* Dropped overrides at smmu->features and smmu->options
* Inlined hw_probe() to acpi_probe() for simplification
* Added a new CMDQV CONFIG depending on CONFIG_ACPI
* Removed additional platform_device involvement
* Switched krealloc to kzalloc for cmdqv pointer
* Moved devm_request_irq() out of device_reset()
* Dropped IRQF_SHARED flag at devm_request_irq()
* Reused acpi_iort_node pointer from SMMU driver
* Reused existing smmu functions to init vcmdqs
* Changed writel_relaxed to writel to be safe
* Removed pointless comments and prints
* Updated Copyright lines
MAINTAINERS | 1 +
drivers/iommu/Kconfig | 12 +
drivers/iommu/arm/arm-smmu-v3/Makefile | 1 +
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 21 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 41 ++
.../arm/arm-smmu-v3/nvidia-grace-cmdqv.c | 418 ++++++++++++++++++
6 files changed, 488 insertions(+), 6 deletions(-)
create mode 100644 drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
diff --git a/MAINTAINERS b/MAINTAINERS
index f32c7d733255..0314ee1edf62 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -18726,6 +18726,7 @@ M: Thierry Reding <thierry.reding@gmail.com>
R: Krishna Reddy <vdumpa@nvidia.com>
L: linux-tegra@vger.kernel.org
S: Supported
+F: drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
F: drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
F: drivers/iommu/tegra*
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 3eb68fa1b8cc..290af9c7b2a5 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -388,6 +388,18 @@ config ARM_SMMU_V3_SVA
Say Y here if your system supports SVA extensions such as PCIe PASID
and PRI.
+config NVIDIA_GRACE_CMDQV
+ bool "NVIDIA Grace CMDQ-V extension support for ARM SMMUv3"
+ depends on ARM_SMMU_V3
+ depends on ACPI
+ help
+ Support for NVIDIA Grace CMDQ-Virtualization extension for ARM SMMUv3.
+ The CMDQ-V extension is similar to v3.3 ECMDQ for multi command queues
+ support, except with virtualization capabilities.
+
+ Say Y here if your system is NVIDIA Grace or it has the same CMDQ-V
+ extension.
+
config S390_IOMMU
def_bool y if S390 && PCI
depends on S390 && PCI
diff --git a/drivers/iommu/arm/arm-smmu-v3/Makefile b/drivers/iommu/arm/arm-smmu-v3/Makefile
index 54feb1ecccad..a083019de68a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/Makefile
+++ b/drivers/iommu/arm/arm-smmu-v3/Makefile
@@ -2,4 +2,5 @@
obj-$(CONFIG_ARM_SMMU_V3) += arm_smmu_v3.o
arm_smmu_v3-objs-y += arm-smmu-v3.o
arm_smmu_v3-objs-$(CONFIG_ARM_SMMU_V3_SVA) += arm-smmu-v3-sva.o
+arm_smmu_v3-objs-$(CONFIG_NVIDIA_GRACE_CMDQV) += nvidia-grace-cmdqv.o
arm_smmu_v3-objs := $(arm_smmu_v3-objs-y)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 188865ec9a33..b1182dd825fd 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -339,6 +339,9 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
static struct arm_smmu_cmdq *arm_smmu_get_cmdq(struct arm_smmu_device *smmu)
{
+ if (smmu->nvidia_grace_cmdqv)
+ return nvidia_grace_cmdqv_get_cmdq(smmu);
+
return &smmu->cmdq;
}
@@ -2874,12 +2877,10 @@ static struct iommu_ops arm_smmu_ops = {
};
/* Probing and initialisation functions */
-static int arm_smmu_init_one_queue(struct arm_smmu_device *smmu,
- struct arm_smmu_queue *q,
- void __iomem *page,
- unsigned long prod_off,
- unsigned long cons_off,
- size_t dwords, const char *name)
+int arm_smmu_init_one_queue(struct arm_smmu_device *smmu,
+ struct arm_smmu_queue *q, void __iomem *page,
+ unsigned long prod_off, unsigned long cons_off,
+ size_t dwords, const char *name)
{
size_t qsz;
@@ -3438,6 +3439,12 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
return ret;
}
+ if (smmu->nvidia_grace_cmdqv) {
+ ret = nvidia_grace_cmdqv_device_reset(smmu);
+ if (ret)
+ return ret;
+ }
+
return 0;
}
@@ -3686,6 +3693,8 @@ static int arm_smmu_device_acpi_probe(struct platform_device *pdev,
if (iort_smmu->flags & ACPI_IORT_SMMU_V3_COHACC_OVERRIDE)
smmu->features |= ARM_SMMU_FEAT_COHERENCY;
+ smmu->nvidia_grace_cmdqv = nvidia_grace_cmdqv_acpi_probe(smmu, node);
+
return 0;
}
#else
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 475f004ccbe4..24f93444aeeb 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -619,6 +619,8 @@ struct arm_smmu_strtab_cfg {
u32 strtab_base_cfg;
};
+struct nvidia_grace_cmdqv;
+
/* An SMMUv3 instance */
struct arm_smmu_device {
struct device *dev;
@@ -679,6 +681,12 @@ struct arm_smmu_device {
struct rb_root streams;
struct mutex streams_mutex;
+
+ /*
+ * Pointer to NVIDIA Grace CMDQ-Virtualization Extension support,
+ * similar to v3.3 ECMDQ except with virtualization capabilities.
+ */
+ struct nvidia_grace_cmdqv *nvidia_grace_cmdqv;
};
struct arm_smmu_stream {
@@ -753,6 +761,10 @@ int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid,
unsigned long iova, size_t size);
int arm_smmu_cmdq_init(struct arm_smmu_device *smmu,
struct arm_smmu_cmdq *cmdq);
+int arm_smmu_init_one_queue(struct arm_smmu_device *smmu,
+ struct arm_smmu_queue *q, void __iomem *page,
+ unsigned long prod_off, unsigned long cons_off,
+ size_t dwords, const char *name);
#ifdef CONFIG_ARM_SMMU_V3_SVA
bool arm_smmu_sva_supported(struct arm_smmu_device *smmu);
@@ -812,4 +824,33 @@ static inline u32 arm_smmu_sva_get_pasid(struct iommu_sva *handle)
static inline void arm_smmu_sva_notifier_synchronize(void) {}
#endif /* CONFIG_ARM_SMMU_V3_SVA */
+
+struct acpi_iort_node;
+
+#ifdef CONFIG_NVIDIA_GRACE_CMDQV
+struct nvidia_grace_cmdqv *
+nvidia_grace_cmdqv_acpi_probe(struct arm_smmu_device *smmu,
+ struct acpi_iort_node *node);
+int nvidia_grace_cmdqv_device_reset(struct arm_smmu_device *smmu);
+struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu);
+#else /* CONFIG_NVIDIA_GRACE_CMDQV */
+static inline struct nvidia_grace_cmdqv *
+nvidia_grace_cmdqv_acpi_probe(struct arm_smmu_device *smmu,
+ struct acpi_iort_node *node)
+{
+ return NULL;
+}
+
+static inline int nvidia_grace_cmdqv_device_reset(struct arm_smmu_device *smmu)
+{
+ return -ENODEV;
+}
+
+static inline struct arm_smmu_cmdq *
+nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
+{
+ return NULL;
+}
+#endif /* CONFIG_NVIDIA_GRACE_CMDQV */
+
#endif /* _ARM_SMMU_V3_H */
diff --git a/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c b/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
new file mode 100644
index 000000000000..c0d7351f13e2
--- /dev/null
+++ b/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
@@ -0,0 +1,418 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (C) 2021 NVIDIA CORPORATION & AFFILIATES */
+
+#define dev_fmt(fmt) "nvidia_grace_cmdqv: " fmt
+
+#include <linux/acpi.h>
+#include <linux/dma-mapping.h>
+#include <linux/interrupt.h>
+#include <linux/iommu.h>
+#include <linux/iopoll.h>
+
+#include <acpi/acpixf.h>
+
+#include "arm-smmu-v3.h"
+
+#define NVIDIA_CMDQV_HID "NVDA0600"
+
+/* CMDQV register page base and size defines */
+#define NVIDIA_CMDQV_CONFIG_BASE (0)
+#define NVIDIA_CMDQV_CONFIG_SIZE (SZ_64K)
+#define NVIDIA_VCMDQ_BASE (0 + SZ_64K)
+#define NVIDIA_VCMDQ_SIZE (SZ_64K * 2) /* PAGE0 and PAGE1 */
+
+/* CMDQV global config regs */
+#define NVIDIA_CMDQV_CONFIG 0x0000
+#define CMDQV_EN BIT(0)
+
+#define NVIDIA_CMDQV_PARAM 0x0004
+#define CMDQV_NUM_VINTF_LOG2 GENMASK(11, 8)
+#define CMDQV_NUM_VCMDQ_LOG2 GENMASK(7, 4)
+
+#define NVIDIA_CMDQV_STATUS 0x0008
+#define CMDQV_STATUS GENMASK(2, 1)
+#define CMDQV_ENABLED BIT(0)
+
+#define NVIDIA_CMDQV_VINTF_ERR_MAP 0x000C
+#define NVIDIA_CMDQV_VINTF_INT_MASK 0x0014
+#define NVIDIA_CMDQV_VCMDQ_ERR_MAP 0x001C
+
+#define NVIDIA_CMDQV_CMDQ_ALLOC(q) (0x0200 + 0x4*(q))
+#define CMDQV_CMDQ_ALLOC_VINTF GENMASK(20, 15)
+#define CMDQV_CMDQ_ALLOC_LVCMDQ GENMASK(7, 1)
+#define CMDQV_CMDQ_ALLOCATED BIT(0)
+
+/* VINTF config regs */
+#define NVIDIA_CMDQV_VINTF(v) (0x1000 + 0x100*(v))
+
+#define NVIDIA_VINTF_CONFIG 0x0000
+#define VINTF_HYP_OWN BIT(17)
+#define VINTF_VMID GENMASK(16, 1)
+#define VINTF_EN BIT(0)
+
+#define NVIDIA_VINTF_STATUS 0x0004
+#define VINTF_STATUS GENMASK(3, 1)
+#define VINTF_ENABLED BIT(0)
+
+/* VCMDQ config regs */
+/* -- PAGE0 -- */
+#define NVIDIA_CMDQV_VCMDQ(q) (NVIDIA_VCMDQ_BASE + 0x80*(q))
+
+#define NVIDIA_VCMDQ_CONS 0x00000
+#define VCMDQ_CONS_ERR GENMASK(30, 24)
+
+#define NVIDIA_VCMDQ_PROD 0x00004
+
+#define NVIDIA_VCMDQ_CONFIG 0x00008
+#define VCMDQ_EN BIT(0)
+
+#define NVIDIA_VCMDQ_STATUS 0x0000C
+#define VCMDQ_ENABLED BIT(0)
+
+#define NVIDIA_VCMDQ_GERROR 0x00010
+#define NVIDIA_VCMDQ_GERRORN 0x00014
+
+/* -- PAGE1 -- */
+#define NVIDIA_VCMDQ_BASE_L(q) (NVIDIA_CMDQV_VCMDQ(q) + SZ_64K)
+#define VCMDQ_ADDR GENMASK(47, 5)
+#define VCMDQ_LOG2SIZE GENMASK(4, 0)
+
+struct nvidia_grace_cmdqv_vintf {
+ u16 idx;
+ u32 cfg;
+ u32 status;
+
+ void __iomem *base;
+ struct arm_smmu_cmdq *vcmdqs;
+};
+
+struct nvidia_grace_cmdqv {
+ struct arm_smmu_device *smmu;
+
+ struct device *dev;
+ struct resource *res;
+ void __iomem *base;
+ int irq;
+
+ /* CMDQV Hardware Params */
+ u16 num_total_vintfs;
+ u16 num_total_vcmdqs;
+ u16 num_vcmdqs_per_vintf;
+
+ /* CMDQV_VINTF(0) reserved for host kernel use */
+ struct nvidia_grace_cmdqv_vintf vintf0;
+};
+
+static irqreturn_t nvidia_grace_cmdqv_isr(int irq, void *devid)
+{
+ struct nvidia_grace_cmdqv *cmdqv = (struct nvidia_grace_cmdqv *)devid;
+ struct nvidia_grace_cmdqv_vintf *vintf0 = &cmdqv->vintf0;
+ u32 vintf_err_map[2];
+ u32 vcmdq_err_map[4];
+
+ vintf_err_map[0] = readl_relaxed(cmdqv->base + NVIDIA_CMDQV_VINTF_ERR_MAP);
+ vintf_err_map[1] = readl_relaxed(cmdqv->base + NVIDIA_CMDQV_VINTF_ERR_MAP + 0x4);
+
+ vcmdq_err_map[0] = readl_relaxed(cmdqv->base + NVIDIA_CMDQV_VCMDQ_ERR_MAP);
+ vcmdq_err_map[1] = readl_relaxed(cmdqv->base + NVIDIA_CMDQV_VCMDQ_ERR_MAP + 0x4);
+ vcmdq_err_map[2] = readl_relaxed(cmdqv->base + NVIDIA_CMDQV_VCMDQ_ERR_MAP + 0x8);
+ vcmdq_err_map[3] = readl_relaxed(cmdqv->base + NVIDIA_CMDQV_VCMDQ_ERR_MAP + 0xC);
+
+ dev_warn(cmdqv->dev,
+ "unexpected cmdqv error reported: vintf_map %08X %08X, vcmdq_map %08X %08X %08X %08X\n",
+ vintf_err_map[0], vintf_err_map[1], vcmdq_err_map[0], vcmdq_err_map[1],
+ vcmdq_err_map[2], vcmdq_err_map[3]);
+
+ /* If the error was reported by vintf0, avoid using any of its VCMDQs */
+ if (vintf_err_map[vintf0->idx / 32] & (1 << (vintf0->idx % 32))) {
+ vintf0->status = readl_relaxed(vintf0->base + NVIDIA_VINTF_STATUS);
+
+ dev_warn(cmdqv->dev, "error (0x%lX) reported by host vintf0 - disabling its vcmdqs\n",
+ FIELD_GET(VINTF_STATUS, vintf0->status));
+ } else if (vintf_err_map[0] || vintf_err_map[1]) {
+ dev_err(cmdqv->dev, "cmdqv error interrupt triggered by unassigned vintf!\n");
+ }
+
+ return IRQ_HANDLED;
+}
+
+/* Adapt struct arm_smmu_cmdq init sequences from arm-smmu-v3.c for VCMDQs */
+static int nvidia_grace_cmdqv_init_one_vcmdq(struct nvidia_grace_cmdqv *cmdqv,
+ struct arm_smmu_cmdq *cmdq,
+ void __iomem *vcmdq_base, u16 qidx)
+{
+ struct arm_smmu_queue *q = &cmdq->q;
+ char name[16];
+ int ret;
+
+ sprintf(name, "vcmdq%u", qidx);
+
+ q->llq.max_n_shift = ilog2(SZ_64K >> CMDQ_ENT_SZ_SHIFT);
+
+ /* Use the common helper to init the VCMDQ, and then... */
+ ret = arm_smmu_init_one_queue(cmdqv->smmu, q, vcmdq_base,
+ NVIDIA_VCMDQ_PROD, NVIDIA_VCMDQ_CONS,
+ CMDQ_ENT_DWORDS, name);
+ if (ret)
+ return ret;
+
+ /* ...override q_base for VCMDQ_BASE_L/H registers */
+ q->q_base = q->base_dma & VCMDQ_ADDR;
+ q->q_base |= FIELD_PREP(VCMDQ_LOG2SIZE, q->llq.max_n_shift);
+
+ /* All VCMDQs support CS_NONE only for CMD_SYNC */
+ q->quirks = CMDQ_QUIRK_SYNC_CS_NONE_ONLY;
+
+ return arm_smmu_cmdq_init(cmdqv->smmu, cmdq);
+}
+
+struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
+{
+ struct nvidia_grace_cmdqv *cmdqv = smmu->nvidia_grace_cmdqv;
+ struct nvidia_grace_cmdqv_vintf *vintf0 = &cmdqv->vintf0;
+ u16 qidx;
+
+ /* Check error status of vintf0 */
+ if (!FIELD_GET(VINTF_STATUS, vintf0->status))
+ return &smmu->cmdq;
+
+ /*
+ * Select a vcmdq to use. Here we use a temporal solution to
+ * balance out traffic on cmdq issuing: each cmdq has its own
+ * lock, if all cpus issue cmdlist using the same cmdq, only
+ * one CPU at a time can enter the process, while the others
+ * will be spinning at the same lock.
+ */
+ qidx = smp_processor_id() % cmdqv->num_vcmdqs_per_vintf;
+ return &vintf0->vcmdqs[qidx];
+}
+
+int nvidia_grace_cmdqv_device_reset(struct arm_smmu_device *smmu)
+{
+ struct nvidia_grace_cmdqv *cmdqv = smmu->nvidia_grace_cmdqv;
+ struct nvidia_grace_cmdqv_vintf *vintf0 = &cmdqv->vintf0;
+ u32 regval;
+ u16 qidx;
+ int ret;
+
+ /* Setup vintf0 for host kernel */
+ vintf0->idx = 0;
+ vintf0->base = cmdqv->base + NVIDIA_CMDQV_VINTF(0);
+
+ regval = FIELD_PREP(VINTF_HYP_OWN, 1);
+ writel(regval, vintf0->base + NVIDIA_VINTF_CONFIG);
+
+ regval |= FIELD_PREP(VINTF_EN, 1);
+ writel(regval, vintf0->base + NVIDIA_VINTF_CONFIG);
+
+ vintf0->cfg = regval;
+
+ ret = readl_relaxed_poll_timeout(vintf0->base + NVIDIA_VINTF_STATUS,
+ regval, regval == VINTF_ENABLED,
+ 1, ARM_SMMU_POLL_TIMEOUT_US);
+ vintf0->status = regval;
+ if (ret) {
+ dev_err(cmdqv->dev, "failed to enable VINTF%u: STATUS = 0x%08X\n",
+ vintf0->idx, regval);
+ return ret;
+ }
+
+ /* Allocate vcmdqs to vintf0 */
+ for (qidx = 0; qidx < cmdqv->num_vcmdqs_per_vintf; qidx++) {
+ regval = FIELD_PREP(CMDQV_CMDQ_ALLOC_VINTF, vintf0->idx);
+ regval |= FIELD_PREP(CMDQV_CMDQ_ALLOC_LVCMDQ, qidx);
+ regval |= CMDQV_CMDQ_ALLOCATED;
+ writel_relaxed(regval, cmdqv->base + NVIDIA_CMDQV_CMDQ_ALLOC(qidx));
+ }
+
+ /* Build an arm_smmu_cmdq for each vcmdq allocated to vintf0 */
+ vintf0->vcmdqs = devm_kcalloc(cmdqv->dev, cmdqv->num_vcmdqs_per_vintf,
+ sizeof(*vintf0->vcmdqs), GFP_KERNEL);
+ if (!vintf0->vcmdqs)
+ return -ENOMEM;
+
+ for (qidx = 0; qidx < cmdqv->num_vcmdqs_per_vintf; qidx++) {
+ void __iomem *vcmdq_base = cmdqv->base + NVIDIA_CMDQV_VCMDQ(qidx);
+ struct arm_smmu_cmdq *cmdq = &vintf0->vcmdqs[qidx];
+
+ /* Setup struct arm_smmu_cmdq data members */
+ nvidia_grace_cmdqv_init_one_vcmdq(cmdqv, cmdq, vcmdq_base, qidx);
+
+ /* Configure and enable the vcmdq */
+ writel_relaxed(0, vcmdq_base + NVIDIA_VCMDQ_PROD);
+ writel_relaxed(0, vcmdq_base + NVIDIA_VCMDQ_CONS);
+
+ writeq_relaxed(cmdq->q.q_base, cmdqv->base + NVIDIA_VCMDQ_BASE_L(qidx));
+
+ writel(VCMDQ_EN, vcmdq_base + NVIDIA_VCMDQ_CONFIG);
+ ret = readl_poll_timeout(vcmdq_base + NVIDIA_VCMDQ_STATUS,
+ regval, regval == VCMDQ_ENABLED,
+ 1, ARM_SMMU_POLL_TIMEOUT_US);
+ if (ret) {
+ u32 gerror = readl_relaxed(vcmdq_base + NVIDIA_VCMDQ_GERROR);
+ u32 gerrorn = readl_relaxed(vcmdq_base + NVIDIA_VCMDQ_GERRORN);
+ u32 cons = readl_relaxed(vcmdq_base + NVIDIA_VCMDQ_CONS);
+
+ dev_err(cmdqv->dev,
+ "failed to enable VCMDQ%u: GERROR=0x%X, GERRORN=0x%X, CONS=0x%X\n",
+ qidx, gerror, gerrorn, cons);
+ return ret;
+ }
+
+ dev_info(cmdqv->dev, "VCMDQ%u allocated to VINTF%u as logical-VCMDQ%u\n",
+ qidx, vintf0->idx, qidx);
+ }
+
+ return 0;
+}
+
+static int nvidia_grace_cmdqv_acpi_is_memory(struct acpi_resource *res, void *data)
+{
+ struct resource r;
+
+ return !acpi_dev_resource_memory(res, &r);
+}
+
+static int nvidia_grace_cmdqv_acpi_get_irqs(struct acpi_resource *ares, void *data)
+{
+ struct resource r;
+ int *irq = data;
+
+ if (*irq <= 0 && acpi_dev_resource_interrupt(ares, 0, &r))
+ *irq = r.start;
+
+ return 1; /* No need to add resource to the list */
+}
+
+/*
+ * Function taking care of all ACPI resource probings and according allocations
+ *
+ * Note that it uses devm_* functions for resource allocations here so that smmu
+ * driver can roll back cmdqv resources automatically without additional cleanup
+ * routine, if any further error happens there. Yet this means all error unwinds
+ * here will have to go with devm_* too.
+ */
+static struct nvidia_grace_cmdqv *
+nvidia_grace_cmdqv_find_resource(struct arm_smmu_device *smmu,
+ struct acpi_iort_node *node)
+{
+ struct nvidia_grace_cmdqv *cmdqv = NULL;
+ struct list_head resource_list;
+ struct resource_entry *rentry;
+ struct acpi_device *adev;
+ const char *match_uid;
+ int ret;
+
+ if (acpi_disabled)
+ return NULL;
+
+ /* Look for a device in the DSDT whose _UID matches the SMMU's iort_node identifier */
+ match_uid = kasprintf(GFP_KERNEL, "%u", node->identifier);
+ adev = acpi_dev_get_first_match_dev(NVIDIA_CMDQV_HID, match_uid, -1);
+ kfree(match_uid);
+
+ if (!adev)
+ return NULL;
+
+ dev_info(smmu->dev, "found companion CMDQV device, %s\n", dev_name(&adev->dev));
+
+ INIT_LIST_HEAD(&resource_list);
+ ret = acpi_dev_get_resources(adev, &resource_list,
+ nvidia_grace_cmdqv_acpi_is_memory, NULL);
+ if (ret < 0) {
+ dev_err(smmu->dev, "failed to get memory resource: %d\n", ret);
+ goto put_dev;
+ }
+
+ cmdqv = devm_kzalloc(smmu->dev, sizeof(*cmdqv), GFP_KERNEL);
+ if (!cmdqv)
+ goto free_list;
+
+ rentry = list_first_entry_or_null(&resource_list, struct resource_entry, node);
+ if (!rentry) {
+ dev_err(smmu->dev, "failed to get memory resource entry\n");
+ goto free_cmdqv;
+ }
+
+ cmdqv->smmu = smmu;
+ cmdqv->dev = smmu->dev;
+ cmdqv->res = rentry->res;
+
+ cmdqv->base = devm_ioremap_resource(smmu->dev, rentry->res);
+ if (IS_ERR(cmdqv->base)) {
+ dev_err(smmu->dev, "failed to ioremap: %ld\n", PTR_ERR(cmdqv->base));
+ goto free_cmdqv;
+ }
+
+ ret = acpi_dev_get_resources(adev, &resource_list,
+ nvidia_grace_cmdqv_acpi_get_irqs, &cmdqv->irq);
+ if (ret < 0) {
+ dev_warn(smmu->dev, "no cmdqv interrupt - errors will not be reported\n");
+ cmdqv->irq = 0;
+ } else {
+ ret = devm_request_irq(smmu->dev, cmdqv->irq, nvidia_grace_cmdqv_isr,
+ 0, "nvidia-grace-cmdqv", cmdqv);
+ if (ret) {
+ dev_err(smmu->dev, "failed to request irq (%d): %d\n",
+ cmdqv->irq, ret);
+ goto iounmap;
+ }
+ }
+
+ goto free_list;
+
+iounmap:
+ devm_iounmap(smmu->dev, cmdqv->base);
+ devm_release_mem_region(smmu->dev, cmdqv->res->start,
+ resource_size(cmdqv->res));
+free_cmdqv:
+ devm_kfree(smmu->dev, cmdqv);
+ cmdqv = NULL;
+free_list:
+ acpi_dev_free_resource_list(&resource_list);
+put_dev:
+ put_device(&adev->dev);
+
+ return cmdqv;
+}
+
+struct nvidia_grace_cmdqv *
+nvidia_grace_cmdqv_acpi_probe(struct arm_smmu_device *smmu,
+ struct acpi_iort_node *node)
+{
+ struct nvidia_grace_cmdqv *cmdqv;
+ u32 regval;
+
+ cmdqv = nvidia_grace_cmdqv_find_resource(smmu, node);
+ if (!cmdqv)
+ return NULL;
+
+ regval = readl_relaxed(cmdqv->base + NVIDIA_CMDQV_CONFIG);
+ if (!FIELD_GET(CMDQV_EN, regval)) {
+ dev_err(cmdqv->dev, "CMDQV h/w is disabled: CMDQV_CONFIG=0x%08X\n", regval);
+ goto free_res;
+ }
+
+ regval = readl_relaxed(cmdqv->base + NVIDIA_CMDQV_STATUS);
+ if (!FIELD_GET(CMDQV_ENABLED, regval) || FIELD_GET(CMDQV_STATUS, regval)) {
+ dev_err(cmdqv->dev, "CMDQV h/w not ready: CMDQV_STATUS=0x%08X\n", regval);
+ goto free_res;
+ }
+
+ regval = readl_relaxed(cmdqv->base + NVIDIA_CMDQV_PARAM);
+ cmdqv->num_total_vintfs = 1 << FIELD_GET(CMDQV_NUM_VINTF_LOG2, regval);
+ cmdqv->num_total_vcmdqs = 1 << FIELD_GET(CMDQV_NUM_VCMDQ_LOG2, regval);
+ cmdqv->num_vcmdqs_per_vintf = cmdqv->num_total_vcmdqs / cmdqv->num_total_vintfs;
+
+ return cmdqv;
+
+free_res:
+ if (cmdqv->irq)
+ devm_free_irq(smmu->dev, cmdqv->irq, cmdqv);
+ devm_iounmap(smmu->dev, cmdqv->base);
+ devm_release_mem_region(smmu->dev, cmdqv->res->start,
+ resource_size(cmdqv->res));
+ devm_kfree(smmu->dev, cmdqv);
+
+ return NULL;
+}
--
2.17.1
^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH v3 4/5] iommu/arm-smmu-v3: Add host support for NVIDIA Grace CMDQ-V
@ 2021-11-19 7:19 ` Nicolin Chen via iommu
0 siblings, 0 replies; 51+ messages in thread
From: Nicolin Chen via iommu @ 2021-11-19 7:19 UTC (permalink / raw)
To: joro, will, robin.murphy
Cc: jean-philippe, linux-kernel, iommu, linux-tegra, thierry.reding,
jgg, linux-arm-kernel
From: Nate Watterson <nwatterson@nvidia.com>
NVIDIA's Grace SoC has CMDQ-Virtualization (CMDQV) hardware,
which extends the standard ARM SMMUv3 IP to support multiple
VCMDQs with virtualization capabilities. In the host kernel,
they are used to reduce contention on a single queue. As
command queues, they are very similar to the standard CMDQ and
ECMDQs, but support only CS_NONE in the CS field of the
CMD_SYNC command.
This patch adds a new nvidia-grace-cmdqv file, inserts a
pointer to its structure into the existing arm_smmu_device,
and adds the related function calls to the arm-smmu-v3 driver.
In the CMDQV driver itself, this patch adds only the minimal
part needed for host kernel support. Upon probe(), VINTF0 is
reserved for in-kernel use, and some of the VCMDQs are assigned
to VINTF0. The driver then selects one of the VCMDQs in VINTF0,
based on the CPU currently executing, to issue commands.
Note that, under the current plan, the CMDQV driver supports
only ACPI configuration.
Signed-off-by: Nate Watterson <nwatterson@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
Changelog:
v2->v3:
* Replaced impl design with simpler "nvidia_grace_cmdqv" pointer
* Aligned all naming to "nvidia_grace_cmdqv" or "cmdqv"
* Changed VINTF_ENABLED check in get_cmdq() to VINTF_STATUS
* Dropped overrides at smmu->features and smmu->options
* Inlined hw_probe() to acpi_probe() for simplification
* Added a new CMDQV CONFIG depending on CONFIG_ACPI
* Removed additional platform_device involvement
* Switched krealloc to kzalloc for cmdqv pointer
* Moved devm_request_irq() out of device_reset()
* Dropped IRQF_SHARED flag at devm_request_irq()
* Reused acpi_iort_node pointer from SMMU driver
* Reused existing smmu functions to init vcmdqs
* Changed writel_relaxed to writel to be safe
* Removed pointless comments and prints
* Updated Copyright lines
MAINTAINERS | 1 +
drivers/iommu/Kconfig | 12 +
drivers/iommu/arm/arm-smmu-v3/Makefile | 1 +
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 21 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 41 ++
.../arm/arm-smmu-v3/nvidia-grace-cmdqv.c | 418 ++++++++++++++++++
6 files changed, 488 insertions(+), 6 deletions(-)
create mode 100644 drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
diff --git a/MAINTAINERS b/MAINTAINERS
index f32c7d733255..0314ee1edf62 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -18726,6 +18726,7 @@ M: Thierry Reding <thierry.reding@gmail.com>
R: Krishna Reddy <vdumpa@nvidia.com>
L: linux-tegra@vger.kernel.org
S: Supported
+F: drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
F: drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
F: drivers/iommu/tegra*
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 3eb68fa1b8cc..290af9c7b2a5 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -388,6 +388,18 @@ config ARM_SMMU_V3_SVA
Say Y here if your system supports SVA extensions such as PCIe PASID
and PRI.
+config NVIDIA_GRACE_CMDQV
+ bool "NVIDIA Grace CMDQ-V extension support for ARM SMMUv3"
+ depends on ARM_SMMU_V3
+ depends on ACPI
+ help
+ Support for NVIDIA Grace CMDQ-Virtualization extension for ARM SMMUv3.
+ The CMDQ-V extension is similar to v3.3 ECMDQ for multi command queues
+ support, except with virtualization capabilities.
+
+ Say Y here if your system is NVIDIA Grace or it has the same CMDQ-V
+ extension.
+
config S390_IOMMU
def_bool y if S390 && PCI
depends on S390 && PCI
diff --git a/drivers/iommu/arm/arm-smmu-v3/Makefile b/drivers/iommu/arm/arm-smmu-v3/Makefile
index 54feb1ecccad..a083019de68a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/Makefile
+++ b/drivers/iommu/arm/arm-smmu-v3/Makefile
@@ -2,4 +2,5 @@
obj-$(CONFIG_ARM_SMMU_V3) += arm_smmu_v3.o
arm_smmu_v3-objs-y += arm-smmu-v3.o
arm_smmu_v3-objs-$(CONFIG_ARM_SMMU_V3_SVA) += arm-smmu-v3-sva.o
+arm_smmu_v3-objs-$(CONFIG_NVIDIA_GRACE_CMDQV) += nvidia-grace-cmdqv.o
arm_smmu_v3-objs := $(arm_smmu_v3-objs-y)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 188865ec9a33..b1182dd825fd 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -339,6 +339,9 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
static struct arm_smmu_cmdq *arm_smmu_get_cmdq(struct arm_smmu_device *smmu)
{
+ if (smmu->nvidia_grace_cmdqv)
+ return nvidia_grace_cmdqv_get_cmdq(smmu);
+
return &smmu->cmdq;
}
@@ -2874,12 +2877,10 @@ static struct iommu_ops arm_smmu_ops = {
};
/* Probing and initialisation functions */
-static int arm_smmu_init_one_queue(struct arm_smmu_device *smmu,
- struct arm_smmu_queue *q,
- void __iomem *page,
- unsigned long prod_off,
- unsigned long cons_off,
- size_t dwords, const char *name)
+int arm_smmu_init_one_queue(struct arm_smmu_device *smmu,
+ struct arm_smmu_queue *q, void __iomem *page,
+ unsigned long prod_off, unsigned long cons_off,
+ size_t dwords, const char *name)
{
size_t qsz;
@@ -3438,6 +3439,12 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
return ret;
}
+ if (smmu->nvidia_grace_cmdqv) {
+ ret = nvidia_grace_cmdqv_device_reset(smmu);
+ if (ret)
+ return ret;
+ }
+
return 0;
}
@@ -3686,6 +3693,8 @@ static int arm_smmu_device_acpi_probe(struct platform_device *pdev,
if (iort_smmu->flags & ACPI_IORT_SMMU_V3_COHACC_OVERRIDE)
smmu->features |= ARM_SMMU_FEAT_COHERENCY;
+ smmu->nvidia_grace_cmdqv = nvidia_grace_cmdqv_acpi_probe(smmu, node);
+
return 0;
}
#else
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 475f004ccbe4..24f93444aeeb 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -619,6 +619,8 @@ struct arm_smmu_strtab_cfg {
u32 strtab_base_cfg;
};
+struct nvidia_grace_cmdqv;
+
/* An SMMUv3 instance */
struct arm_smmu_device {
struct device *dev;
@@ -679,6 +681,12 @@ struct arm_smmu_device {
struct rb_root streams;
struct mutex streams_mutex;
+
+ /*
+ * Pointer to NVIDIA Grace CMDQ-Virtualization Extension support,
+ * similar to v3.3 ECMDQ except with virtualization capabilities.
+ */
+ struct nvidia_grace_cmdqv *nvidia_grace_cmdqv;
};
struct arm_smmu_stream {
@@ -753,6 +761,10 @@ int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid,
unsigned long iova, size_t size);
int arm_smmu_cmdq_init(struct arm_smmu_device *smmu,
struct arm_smmu_cmdq *cmdq);
+int arm_smmu_init_one_queue(struct arm_smmu_device *smmu,
+ struct arm_smmu_queue *q, void __iomem *page,
+ unsigned long prod_off, unsigned long cons_off,
+ size_t dwords, const char *name);
#ifdef CONFIG_ARM_SMMU_V3_SVA
bool arm_smmu_sva_supported(struct arm_smmu_device *smmu);
@@ -812,4 +824,33 @@ static inline u32 arm_smmu_sva_get_pasid(struct iommu_sva *handle)
static inline void arm_smmu_sva_notifier_synchronize(void) {}
#endif /* CONFIG_ARM_SMMU_V3_SVA */
+
+struct acpi_iort_node;
+
+#ifdef CONFIG_NVIDIA_GRACE_CMDQV
+struct nvidia_grace_cmdqv *
+nvidia_grace_cmdqv_acpi_probe(struct arm_smmu_device *smmu,
+ struct acpi_iort_node *node);
+int nvidia_grace_cmdqv_device_reset(struct arm_smmu_device *smmu);
+struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu);
+#else /* CONFIG_NVIDIA_GRACE_CMDQV */
+static inline struct nvidia_grace_cmdqv *
+nvidia_grace_cmdqv_acpi_probe(struct arm_smmu_device *smmu,
+ struct acpi_iort_node *node)
+{
+ return NULL;
+}
+
+static inline int nvidia_grace_cmdqv_device_reset(struct arm_smmu_device *smmu)
+{
+ return -ENODEV;
+}
+
+static inline struct arm_smmu_cmdq *
+nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
+{
+ return NULL;
+}
+#endif /* CONFIG_NVIDIA_GRACE_CMDQV */
+
#endif /* _ARM_SMMU_V3_H */
diff --git a/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c b/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
new file mode 100644
index 000000000000..c0d7351f13e2
--- /dev/null
+++ b/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
@@ -0,0 +1,418 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (C) 2021 NVIDIA CORPORATION & AFFILIATES */
+
+#define dev_fmt(fmt) "nvidia_grace_cmdqv: " fmt
+
+#include <linux/acpi.h>
+#include <linux/dma-mapping.h>
+#include <linux/interrupt.h>
+#include <linux/iommu.h>
+#include <linux/iopoll.h>
+
+#include <acpi/acpixf.h>
+
+#include "arm-smmu-v3.h"
+
+#define NVIDIA_CMDQV_HID "NVDA0600"
+
+/* CMDQV register page base and size defines */
+#define NVIDIA_CMDQV_CONFIG_BASE (0)
+#define NVIDIA_CMDQV_CONFIG_SIZE (SZ_64K)
+#define NVIDIA_VCMDQ_BASE (0 + SZ_64K)
+#define NVIDIA_VCMDQ_SIZE (SZ_64K * 2) /* PAGE0 and PAGE1 */
+
+/* CMDQV global config regs */
+#define NVIDIA_CMDQV_CONFIG 0x0000
+#define CMDQV_EN BIT(0)
+
+#define NVIDIA_CMDQV_PARAM 0x0004
+#define CMDQV_NUM_VINTF_LOG2 GENMASK(11, 8)
+#define CMDQV_NUM_VCMDQ_LOG2 GENMASK(7, 4)
+
+#define NVIDIA_CMDQV_STATUS 0x0008
+#define CMDQV_STATUS GENMASK(2, 1)
+#define CMDQV_ENABLED BIT(0)
+
+#define NVIDIA_CMDQV_VINTF_ERR_MAP 0x000C
+#define NVIDIA_CMDQV_VINTF_INT_MASK 0x0014
+#define NVIDIA_CMDQV_VCMDQ_ERR_MAP 0x001C
+
+#define NVIDIA_CMDQV_CMDQ_ALLOC(q) (0x0200 + 0x4*(q))
+#define CMDQV_CMDQ_ALLOC_VINTF GENMASK(20, 15)
+#define CMDQV_CMDQ_ALLOC_LVCMDQ GENMASK(7, 1)
+#define CMDQV_CMDQ_ALLOCATED BIT(0)
+
+/* VINTF config regs */
+#define NVIDIA_CMDQV_VINTF(v) (0x1000 + 0x100*(v))
+
+#define NVIDIA_VINTF_CONFIG 0x0000
+#define VINTF_HYP_OWN BIT(17)
+#define VINTF_VMID GENMASK(16, 1)
+#define VINTF_EN BIT(0)
+
+#define NVIDIA_VINTF_STATUS 0x0004
+#define VINTF_STATUS GENMASK(3, 1)
+#define VINTF_ENABLED BIT(0)
+
+/* VCMDQ config regs */
+/* -- PAGE0 -- */
+#define NVIDIA_CMDQV_VCMDQ(q) (NVIDIA_VCMDQ_BASE + 0x80*(q))
+
+#define NVIDIA_VCMDQ_CONS 0x00000
+#define VCMDQ_CONS_ERR GENMASK(30, 24)
+
+#define NVIDIA_VCMDQ_PROD 0x00004
+
+#define NVIDIA_VCMDQ_CONFIG 0x00008
+#define VCMDQ_EN BIT(0)
+
+#define NVIDIA_VCMDQ_STATUS 0x0000C
+#define VCMDQ_ENABLED BIT(0)
+
+#define NVIDIA_VCMDQ_GERROR 0x00010
+#define NVIDIA_VCMDQ_GERRORN 0x00014
+
+/* -- PAGE1 -- */
+#define NVIDIA_VCMDQ_BASE_L(q) (NVIDIA_CMDQV_VCMDQ(q) + SZ_64K)
+#define VCMDQ_ADDR GENMASK(47, 5)
+#define VCMDQ_LOG2SIZE GENMASK(4, 0)
+
+struct nvidia_grace_cmdqv_vintf {
+ u16 idx;
+ u32 cfg;
+ u32 status;
+
+ void __iomem *base;
+ struct arm_smmu_cmdq *vcmdqs;
+};
+
+struct nvidia_grace_cmdqv {
+ struct arm_smmu_device *smmu;
+
+ struct device *dev;
+ struct resource *res;
+ void __iomem *base;
+ int irq;
+
+ /* CMDQV Hardware Params */
+ u16 num_total_vintfs;
+ u16 num_total_vcmdqs;
+ u16 num_vcmdqs_per_vintf;
+
+ /* CMDQV_VINTF(0) reserved for host kernel use */
+ struct nvidia_grace_cmdqv_vintf vintf0;
+};
+
+static irqreturn_t nvidia_grace_cmdqv_isr(int irq, void *devid)
+{
+ struct nvidia_grace_cmdqv *cmdqv = (struct nvidia_grace_cmdqv *)devid;
+ struct nvidia_grace_cmdqv_vintf *vintf0 = &cmdqv->vintf0;
+ u32 vintf_err_map[2];
+ u32 vcmdq_err_map[4];
+
+ vintf_err_map[0] = readl_relaxed(cmdqv->base + NVIDIA_CMDQV_VINTF_ERR_MAP);
+ vintf_err_map[1] = readl_relaxed(cmdqv->base + NVIDIA_CMDQV_VINTF_ERR_MAP + 0x4);
+
+ vcmdq_err_map[0] = readl_relaxed(cmdqv->base + NVIDIA_CMDQV_VCMDQ_ERR_MAP);
+ vcmdq_err_map[1] = readl_relaxed(cmdqv->base + NVIDIA_CMDQV_VCMDQ_ERR_MAP + 0x4);
+ vcmdq_err_map[2] = readl_relaxed(cmdqv->base + NVIDIA_CMDQV_VCMDQ_ERR_MAP + 0x8);
+ vcmdq_err_map[3] = readl_relaxed(cmdqv->base + NVIDIA_CMDQV_VCMDQ_ERR_MAP + 0xC);
+
+ dev_warn(cmdqv->dev,
+ "unexpected cmdqv error reported: vintf_map %08X %08X, vcmdq_map %08X %08X %08X %08X\n",
+ vintf_err_map[0], vintf_err_map[1], vcmdq_err_map[0], vcmdq_err_map[1],
+ vcmdq_err_map[2], vcmdq_err_map[3]);
+
+ /* If the error was reported by vintf0, avoid using any of its VCMDQs */
+ if (vintf_err_map[vintf0->idx / 32] & (1 << (vintf0->idx % 32))) {
+ vintf0->status = readl_relaxed(vintf0->base + NVIDIA_VINTF_STATUS);
+
+ dev_warn(cmdqv->dev, "error (0x%lX) reported by host vintf0 - disabling its vcmdqs\n",
+ FIELD_GET(VINTF_STATUS, vintf0->status));
+ } else if (vintf_err_map[0] || vintf_err_map[1]) {
+ dev_err(cmdqv->dev, "cmdqv error interrupt triggered by unassigned vintf!\n");
+ }
+
+ return IRQ_HANDLED;
+}
+
+/* Adapt struct arm_smmu_cmdq init sequences from arm-smmu-v3.c for VCMDQs */
+static int nvidia_grace_cmdqv_init_one_vcmdq(struct nvidia_grace_cmdqv *cmdqv,
+ struct arm_smmu_cmdq *cmdq,
+ void __iomem *vcmdq_base, u16 qidx)
+{
+ struct arm_smmu_queue *q = &cmdq->q;
+ char name[16];
+ int ret;
+
+ sprintf(name, "vcmdq%u", qidx);
+
+ q->llq.max_n_shift = ilog2(SZ_64K >> CMDQ_ENT_SZ_SHIFT);
+
+ /* Use the common helper to init the VCMDQ, and then... */
+ ret = arm_smmu_init_one_queue(cmdqv->smmu, q, vcmdq_base,
+ NVIDIA_VCMDQ_PROD, NVIDIA_VCMDQ_CONS,
+ CMDQ_ENT_DWORDS, name);
+ if (ret)
+ return ret;
+
+ /* ...override q_base for VCMDQ_BASE_L/H registers */
+ q->q_base = q->base_dma & VCMDQ_ADDR;
+ q->q_base |= FIELD_PREP(VCMDQ_LOG2SIZE, q->llq.max_n_shift);
+
+ /* All VCMDQs support CS_NONE only for CMD_SYNC */
+ q->quirks = CMDQ_QUIRK_SYNC_CS_NONE_ONLY;
+
+ return arm_smmu_cmdq_init(cmdqv->smmu, cmdq);
+}
+
+struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
+{
+ struct nvidia_grace_cmdqv *cmdqv = smmu->nvidia_grace_cmdqv;
+ struct nvidia_grace_cmdqv_vintf *vintf0 = &cmdqv->vintf0;
+ u16 qidx;
+
+ /* Fall back to the standard CMDQ if vintf0 has reported an error */
+ if (FIELD_GET(VINTF_STATUS, vintf0->status))
+ return &smmu->cmdq;
+
+ /*
+ * Select a vcmdq to use. Here we use a temporary solution to
+ * balance out traffic on cmdq issuing: each cmdq has its own
+ * lock, so if all CPUs issue cmdlists using the same cmdq,
+ * only one CPU at a time can enter the process, while the
+ * others will be spinning on the same lock.
+ */
+ qidx = smp_processor_id() % cmdqv->num_vcmdqs_per_vintf;
+ return &vintf0->vcmdqs[qidx];
+}
+
+int nvidia_grace_cmdqv_device_reset(struct arm_smmu_device *smmu)
+{
+ struct nvidia_grace_cmdqv *cmdqv = smmu->nvidia_grace_cmdqv;
+ struct nvidia_grace_cmdqv_vintf *vintf0 = &cmdqv->vintf0;
+ u32 regval;
+ u16 qidx;
+ int ret;
+
+ /* Setup vintf0 for host kernel */
+ vintf0->idx = 0;
+ vintf0->base = cmdqv->base + NVIDIA_CMDQV_VINTF(0);
+
+ regval = FIELD_PREP(VINTF_HYP_OWN, 1);
+ writel(regval, vintf0->base + NVIDIA_VINTF_CONFIG);
+
+ regval |= FIELD_PREP(VINTF_EN, 1);
+ writel(regval, vintf0->base + NVIDIA_VINTF_CONFIG);
+
+ vintf0->cfg = regval;
+
+ ret = readl_relaxed_poll_timeout(vintf0->base + NVIDIA_VINTF_STATUS,
+ regval, regval == VINTF_ENABLED,
+ 1, ARM_SMMU_POLL_TIMEOUT_US);
+ vintf0->status = regval;
+ if (ret) {
+ dev_err(cmdqv->dev, "failed to enable VINTF%u: STATUS = 0x%08X\n",
+ vintf0->idx, regval);
+ return ret;
+ }
+
+ /* Allocate vcmdqs to vintf0 */
+ for (qidx = 0; qidx < cmdqv->num_vcmdqs_per_vintf; qidx++) {
+ regval = FIELD_PREP(CMDQV_CMDQ_ALLOC_VINTF, vintf0->idx);
+ regval |= FIELD_PREP(CMDQV_CMDQ_ALLOC_LVCMDQ, qidx);
+ regval |= CMDQV_CMDQ_ALLOCATED;
+ writel_relaxed(regval, cmdqv->base + NVIDIA_CMDQV_CMDQ_ALLOC(qidx));
+ }
+
+ /* Build an arm_smmu_cmdq for each vcmdq allocated to vintf0 */
+ vintf0->vcmdqs = devm_kcalloc(cmdqv->dev, cmdqv->num_vcmdqs_per_vintf,
+ sizeof(*vintf0->vcmdqs), GFP_KERNEL);
+ if (!vintf0->vcmdqs)
+ return -ENOMEM;
+
+ for (qidx = 0; qidx < cmdqv->num_vcmdqs_per_vintf; qidx++) {
+ void __iomem *vcmdq_base = cmdqv->base + NVIDIA_CMDQV_VCMDQ(qidx);
+ struct arm_smmu_cmdq *cmdq = &vintf0->vcmdqs[qidx];
+
+ /* Setup struct arm_smmu_cmdq data members */
+ nvidia_grace_cmdqv_init_one_vcmdq(cmdqv, cmdq, vcmdq_base, qidx);
+
+ /* Configure and enable the vcmdq */
+ writel_relaxed(0, vcmdq_base + NVIDIA_VCMDQ_PROD);
+ writel_relaxed(0, vcmdq_base + NVIDIA_VCMDQ_CONS);
+
+ writeq_relaxed(cmdq->q.q_base, cmdqv->base + NVIDIA_VCMDQ_BASE_L(qidx));
+
+ writel(VCMDQ_EN, vcmdq_base + NVIDIA_VCMDQ_CONFIG);
+ ret = readl_poll_timeout(vcmdq_base + NVIDIA_VCMDQ_STATUS,
+ regval, regval == VCMDQ_ENABLED,
+ 1, ARM_SMMU_POLL_TIMEOUT_US);
+ if (ret) {
+ u32 gerror = readl_relaxed(vcmdq_base + NVIDIA_VCMDQ_GERROR);
+ u32 gerrorn = readl_relaxed(vcmdq_base + NVIDIA_VCMDQ_GERRORN);
+ u32 cons = readl_relaxed(vcmdq_base + NVIDIA_VCMDQ_CONS);
+
+ dev_err(cmdqv->dev,
+ "failed to enable VCMDQ%u: GERROR=0x%X, GERRORN=0x%X, CONS=0x%X\n",
+ qidx, gerror, gerrorn, cons);
+ return ret;
+ }
+
+ dev_info(cmdqv->dev, "VCMDQ%u allocated to VINTF%u as logical-VCMDQ%u\n",
+ qidx, vintf0->idx, qidx);
+ }
+
+ return 0;
+}
+
+static int nvidia_grace_cmdqv_acpi_is_memory(struct acpi_resource *res, void *data)
+{
+ struct resource r;
+
+ return !acpi_dev_resource_memory(res, &r);
+}
+
+static int nvidia_grace_cmdqv_acpi_get_irqs(struct acpi_resource *ares, void *data)
+{
+ struct resource r;
+ int *irq = data;
+
+ if (*irq <= 0 && acpi_dev_resource_interrupt(ares, 0, &r))
+ *irq = r.start;
+
+ return 1; /* No need to add resource to the list */
+}
+
+/*
+ * Handle all ACPI resource probing and the corresponding allocations
+ *
+ * Note that it uses devm_* functions for resource allocations so that the smmu
+ * driver can roll back cmdqv resources automatically, without an additional
+ * cleanup routine, if any further error happens there. This also means that all
+ * error unwinds here have to use devm_* functions too.
+ */
+static struct nvidia_grace_cmdqv *
+nvidia_grace_cmdqv_find_resource(struct arm_smmu_device *smmu,
+ struct acpi_iort_node *node)
+{
+ struct nvidia_grace_cmdqv *cmdqv = NULL;
+ struct list_head resource_list;
+ struct resource_entry *rentry;
+ struct acpi_device *adev;
+ const char *match_uid;
+ int ret;
+
+ if (acpi_disabled)
+ return NULL;
+
+ /* Look for a device in the DSDT whose _UID matches the SMMU's iort_node identifier */
+ match_uid = kasprintf(GFP_KERNEL, "%u", node->identifier);
+ adev = acpi_dev_get_first_match_dev(NVIDIA_CMDQV_HID, match_uid, -1);
+ kfree(match_uid);
+
+ if (!adev)
+ return NULL;
+
+ dev_info(smmu->dev, "found companion CMDQV device, %s\n", dev_name(&adev->dev));
+
+ INIT_LIST_HEAD(&resource_list);
+ ret = acpi_dev_get_resources(adev, &resource_list,
+ nvidia_grace_cmdqv_acpi_is_memory, NULL);
+ if (ret < 0) {
+ dev_err(smmu->dev, "failed to get memory resource: %d\n", ret);
+ goto put_dev;
+ }
+
+ cmdqv = devm_kzalloc(smmu->dev, sizeof(*cmdqv), GFP_KERNEL);
+ if (!cmdqv)
+ goto free_list;
+
+ rentry = list_first_entry_or_null(&resource_list, struct resource_entry, node);
+ if (!rentry) {
+ dev_err(smmu->dev, "failed to get memory resource entry\n");
+ goto free_cmdqv;
+ }
+
+ cmdqv->smmu = smmu;
+ cmdqv->dev = smmu->dev;
+ cmdqv->res = rentry->res;
+
+ cmdqv->base = devm_ioremap_resource(smmu->dev, rentry->res);
+ if (IS_ERR(cmdqv->base)) {
+ dev_err(smmu->dev, "failed to ioremap: %ld\n", PTR_ERR(cmdqv->base));
+ goto free_cmdqv;
+ }
+
+ ret = acpi_dev_get_resources(adev, &resource_list,
+ nvidia_grace_cmdqv_acpi_get_irqs, &cmdqv->irq);
+ if (ret < 0) {
+ dev_warn(smmu->dev, "no cmdqv interrupt - errors will not be reported\n");
+ cmdqv->irq = 0;
+ } else {
+ ret = devm_request_irq(smmu->dev, cmdqv->irq, nvidia_grace_cmdqv_isr,
+ 0, "nvidia-grace-cmdqv", cmdqv);
+ if (ret) {
+ dev_err(smmu->dev, "failed to request irq (%d): %d\n",
+ cmdqv->irq, ret);
+ goto iounmap;
+ }
+ }
+
+ goto free_list;
+
+iounmap:
+ devm_iounmap(smmu->dev, cmdqv->base);
+ devm_release_mem_region(smmu->dev, cmdqv->res->start,
+ resource_size(cmdqv->res));
+free_cmdqv:
+ devm_kfree(smmu->dev, cmdqv);
+ cmdqv = NULL;
+free_list:
+ acpi_dev_free_resource_list(&resource_list);
+put_dev:
+ put_device(&adev->dev);
+
+ return cmdqv;
+}
+
+struct nvidia_grace_cmdqv *
+nvidia_grace_cmdqv_acpi_probe(struct arm_smmu_device *smmu,
+ struct acpi_iort_node *node)
+{
+ struct nvidia_grace_cmdqv *cmdqv;
+ u32 regval;
+
+ cmdqv = nvidia_grace_cmdqv_find_resource(smmu, node);
+ if (!cmdqv)
+ return NULL;
+
+ regval = readl_relaxed(cmdqv->base + NVIDIA_CMDQV_CONFIG);
+ if (!FIELD_GET(CMDQV_EN, regval)) {
+ dev_err(cmdqv->dev, "CMDQV h/w is disabled: CMDQV_CONFIG=0x%08X\n", regval);
+ goto free_res;
+ }
+
+ regval = readl_relaxed(cmdqv->base + NVIDIA_CMDQV_STATUS);
+ if (!FIELD_GET(CMDQV_ENABLED, regval) || FIELD_GET(CMDQV_STATUS, regval)) {
+ dev_err(cmdqv->dev, "CMDQV h/w not ready: CMDQV_STATUS=0x%08X\n", regval);
+ goto free_res;
+ }
+
+ regval = readl_relaxed(cmdqv->base + NVIDIA_CMDQV_PARAM);
+ cmdqv->num_total_vintfs = 1 << FIELD_GET(CMDQV_NUM_VINTF_LOG2, regval);
+ cmdqv->num_total_vcmdqs = 1 << FIELD_GET(CMDQV_NUM_VCMDQ_LOG2, regval);
+ cmdqv->num_vcmdqs_per_vintf = cmdqv->num_total_vcmdqs / cmdqv->num_total_vintfs;
+
+ return cmdqv;
+
+free_res:
+ if (cmdqv->irq)
+ devm_free_irq(smmu->dev, cmdqv->irq, cmdqv);
+ devm_iounmap(smmu->dev, cmdqv->base);
+ devm_release_mem_region(smmu->dev, cmdqv->res->start,
+ resource_size(cmdqv->res));
+ devm_kfree(smmu->dev, cmdqv);
+
+ return NULL;
+}
--
2.17.1
* [PATCH v3 4/5] iommu/arm-smmu-v3: Add host support for NVIDIA Grace CMDQ-V
@ 2021-11-19 7:19 ` Nicolin Chen via iommu
0 siblings, 0 replies; 51+ messages in thread
From: Nicolin Chen @ 2021-11-19 7:19 UTC (permalink / raw)
To: joro, will, robin.murphy
Cc: jean-philippe, nwatterson, chenxiang66, Jonathan.Cameron,
linux-kernel, iommu, nicoleotsuka, linux-tegra, thierry.reding,
jgg, thunder.leizhen, yuzenghui, linux-arm-kernel
From: Nate Watterson <nwatterson@nvidia.com>
NVIDIA's Grace SoC has CMDQ-Virtualization (CMDQV) hardware,
which extends the standard ARM SMMU v3 IP to support multiple
VCMDQs with virtualization capabilities. In the host kernel,
they are used to reduce contention on a single queue. As
command queues, they are much like the standard CMDQ/ECMDQs,
but support only CS_NONE in the CS field of the CMD_SYNC
command.
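As an illustration of the CS_NONE restriction, here is a minimal
user-space sketch of building dword 0 of a CMD_SYNC. It is not the
driver code: the SKETCH_* names and sketch_build_sync0() are
hypothetical, and the field layout (opcode 0x46 in bits [7:0], CS in
bits [13:12], CS_NONE/IRQ/SEV = 0/1/2) is assumed from the upstream
arm-smmu-v3.h header rather than taken from this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout of CMD_SYNC dword 0 (not defined in this patch) */
#define SKETCH_CMDQ_0_OP_MASK    0xffULL   /* opcode, bits [7:0] */
#define SKETCH_CMDQ_OP_CMD_SYNC  0x46ULL
#define SKETCH_SYNC_0_CS_SHIFT   12        /* CS field, bits [13:12] */
#define SKETCH_SYNC_0_CS_MASK    (0x3ULL << SKETCH_SYNC_0_CS_SHIFT)

enum sketch_cs {
	SKETCH_CS_NONE = 0,	/* no completion signal */
	SKETCH_CS_IRQ  = 1,	/* completion signalled by interrupt/MSI */
	SKETCH_CS_SEV  = 2,	/* completion signalled by event */
};

/*
 * Build dword 0 of a CMD_SYNC. When cs_none_only is set (modelling a
 * VCMDQ), whatever CS value the caller asked for is replaced by
 * CS_NONE, so completion has to be detected by polling instead.
 */
static uint64_t sketch_build_sync0(enum sketch_cs cs, int cs_none_only)
{
	uint64_t cmd = SKETCH_CMDQ_OP_CMD_SYNC & SKETCH_CMDQ_0_OP_MASK;

	assert(cs <= SKETCH_CS_SEV);

	if (cs_none_only)
		cs = SKETCH_CS_NONE;

	cmd |= ((uint64_t)cs << SKETCH_SYNC_0_CS_SHIFT) & SKETCH_SYNC_0_CS_MASK;
	return cmd;
}
```

With these assumed values, a request for CS_SEV on a standard queue
sets bit 13, while the same request on a CS_NONE-only queue leaves
the CS field zero.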
This patch adds a new nvidia-grace-cmdqv file and inserts its
structure pointer into the existing arm_smmu_device, and then
adds related function calls in the arm-smmu-v3 driver.
In the CMDQV driver itself, this patch adds only the minimal
part needed for host kernel support. Upon probe(), VINTF0 is
reserved for in-kernel use, and some of the VCMDQs are assigned
to it. The driver then selects one of the VCMDQs in VINTF0,
based on the CPU currently executing, to issue commands.
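The sizing and per-CPU selection described above can be sketched in
plain C. This is a user-space model only, under stated assumptions:
the sketch_* names are hypothetical, the bit positions mirror the
CMDQV_PARAM defines in this patch, and "vintf_err" stands in for a
nonzero VINTF_STATUS error field. The zero-queue guard is an addition
of the sketch, not something the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Field positions mirroring the CMDQV_PARAM defines in this patch */
#define SKETCH_NUM_VINTF_LOG2(v)  (((v) >> 8) & 0xfu)	/* bits [11:8] */
#define SKETCH_NUM_VCMDQ_LOG2(v)  (((v) >> 4) & 0xfu)	/* bits [7:4] */

/* Hypothetical stand-in for the hardware parameters the driver caches */
struct sketch_cmdqv {
	unsigned int num_total_vintfs;
	unsigned int num_total_vcmdqs;
	unsigned int num_vcmdqs_per_vintf;
};

/* Decode a raw PARAM register value into queue and interface counts */
static void sketch_decode_param(uint32_t param, struct sketch_cmdqv *c)
{
	assert(c != NULL);

	c->num_total_vintfs = 1u << SKETCH_NUM_VINTF_LOG2(param);
	c->num_total_vcmdqs = 1u << SKETCH_NUM_VCMDQ_LOG2(param);
	c->num_vcmdqs_per_vintf = c->num_total_vcmdqs / c->num_total_vintfs;
}

/*
 * Pick the VCMDQ index for the given CPU. Returns -1 when the caller
 * should fall back to the standard CMDQ: either an error was reported
 * for the VINTF, or no VCMDQs are available to it.
 */
static int sketch_select_vcmdq(const struct sketch_cmdqv *c,
			       unsigned int cpu, unsigned int vintf_err)
{
	assert(c != NULL);

	if (vintf_err || c->num_vcmdqs_per_vintf == 0)
		return -1;

	return (int)(cpu % c->num_vcmdqs_per_vintf);
}
```

For example, a PARAM value of 0x670 decodes to 64 VINTFs and 128
VCMDQs, so each VINTF gets 2 VCMDQs and CPUs alternate between
index 0 and index 1. A modulo keeps the choice cheap and stateless,
at the cost of uneven load when the CPUs issuing commands happen to
map to the same index.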
Note that, for now, the CMDQV driver supports only ACPI
configuration.
Signed-off-by: Nate Watterson <nwatterson@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
Changelog:
v2->v3:
* Replaced impl design with simpler "nvidia_grace_cmdqv" pointer
* Aligned all the namings to "nvidia_grace_cmdqv" or "cmdqv"
* Changed VINTF_ENABLED check in get_cmdq() to VINTF_STATUS
* Dropped overrides at smmu->features and smmu->options
* Inlined hw_probe() to acpi_probe() for simplification
* Added a new CMDQV CONFIG depending on CONFIG_ACPI
* Removed additional platform_device involvement
* Switched krealloc to kzalloc for cmdqv pointer
* Moved devm_request_irq() out of device_reset()
* Dropped IRQF_SHARED flag at devm_request_irq()
* Reused acpi_iort_node pointer from SMMU driver
* Reused existing smmu functions to init vcmdqs
* Changed writel_relaxed to writel to be safe
* Removed pointless comments and prints
* Updated Copyright lines
MAINTAINERS | 1 +
drivers/iommu/Kconfig | 12 +
drivers/iommu/arm/arm-smmu-v3/Makefile | 1 +
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 21 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 41 ++
.../arm/arm-smmu-v3/nvidia-grace-cmdqv.c | 418 ++++++++++++++++++
6 files changed, 488 insertions(+), 6 deletions(-)
create mode 100644 drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
diff --git a/MAINTAINERS b/MAINTAINERS
index f32c7d733255..0314ee1edf62 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -18726,6 +18726,7 @@ M: Thierry Reding <thierry.reding@gmail.com>
R: Krishna Reddy <vdumpa@nvidia.com>
L: linux-tegra@vger.kernel.org
S: Supported
+F: drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
F: drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
F: drivers/iommu/tegra*
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 3eb68fa1b8cc..290af9c7b2a5 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -388,6 +388,18 @@ config ARM_SMMU_V3_SVA
Say Y here if your system supports SVA extensions such as PCIe PASID
and PRI.
+config NVIDIA_GRACE_CMDQV
+ bool "NVIDIA Grace CMDQ-V extension support for ARM SMMUv3"
+ depends on ARM_SMMU_V3
+ depends on ACPI
+ help
+ Support for NVIDIA Grace CMDQ-Virtualization extension for ARM SMMUv3.
+ The CMDQ-V extension is similar to the SMMUv3.3 ECMDQ in supporting
+ multiple command queues, but adds virtualization capabilities.
+
+ Say Y here if your system is NVIDIA Grace or it has the same CMDQ-V
+ extension.
+
config S390_IOMMU
def_bool y if S390 && PCI
depends on S390 && PCI
diff --git a/drivers/iommu/arm/arm-smmu-v3/Makefile b/drivers/iommu/arm/arm-smmu-v3/Makefile
index 54feb1ecccad..a083019de68a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/Makefile
+++ b/drivers/iommu/arm/arm-smmu-v3/Makefile
@@ -2,4 +2,5 @@
obj-$(CONFIG_ARM_SMMU_V3) += arm_smmu_v3.o
arm_smmu_v3-objs-y += arm-smmu-v3.o
arm_smmu_v3-objs-$(CONFIG_ARM_SMMU_V3_SVA) += arm-smmu-v3-sva.o
+arm_smmu_v3-objs-$(CONFIG_NVIDIA_GRACE_CMDQV) += nvidia-grace-cmdqv.o
arm_smmu_v3-objs := $(arm_smmu_v3-objs-y)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 188865ec9a33..b1182dd825fd 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -339,6 +339,9 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
static struct arm_smmu_cmdq *arm_smmu_get_cmdq(struct arm_smmu_device *smmu)
{
+ if (smmu->nvidia_grace_cmdqv)
+ return nvidia_grace_cmdqv_get_cmdq(smmu);
+
return &smmu->cmdq;
}
@@ -2874,12 +2877,10 @@ static struct iommu_ops arm_smmu_ops = {
};
/* Probing and initialisation functions */
-static int arm_smmu_init_one_queue(struct arm_smmu_device *smmu,
- struct arm_smmu_queue *q,
- void __iomem *page,
- unsigned long prod_off,
- unsigned long cons_off,
- size_t dwords, const char *name)
+int arm_smmu_init_one_queue(struct arm_smmu_device *smmu,
+ struct arm_smmu_queue *q, void __iomem *page,
+ unsigned long prod_off, unsigned long cons_off,
+ size_t dwords, const char *name)
{
size_t qsz;
@@ -3438,6 +3439,12 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
return ret;
}
+ if (smmu->nvidia_grace_cmdqv) {
+ ret = nvidia_grace_cmdqv_device_reset(smmu);
+ if (ret)
+ return ret;
+ }
+
return 0;
}
@@ -3686,6 +3693,8 @@ static int arm_smmu_device_acpi_probe(struct platform_device *pdev,
if (iort_smmu->flags & ACPI_IORT_SMMU_V3_COHACC_OVERRIDE)
smmu->features |= ARM_SMMU_FEAT_COHERENCY;
+ smmu->nvidia_grace_cmdqv = nvidia_grace_cmdqv_acpi_probe(smmu, node);
+
return 0;
}
#else
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 475f004ccbe4..24f93444aeeb 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -619,6 +619,8 @@ struct arm_smmu_strtab_cfg {
u32 strtab_base_cfg;
};
+struct nvidia_grace_cmdqv;
+
/* An SMMUv3 instance */
struct arm_smmu_device {
struct device *dev;
@@ -679,6 +681,12 @@ struct arm_smmu_device {
struct rb_root streams;
struct mutex streams_mutex;
+
+ /*
+ * Pointer to NVIDIA Grace CMDQ-Virtualization Extension support,
+ * similar to v3.3 ECMDQ except with virtualization capabilities.
+ */
+ struct nvidia_grace_cmdqv *nvidia_grace_cmdqv;
};
struct arm_smmu_stream {
@@ -753,6 +761,10 @@ int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid,
unsigned long iova, size_t size);
int arm_smmu_cmdq_init(struct arm_smmu_device *smmu,
struct arm_smmu_cmdq *cmdq);
+int arm_smmu_init_one_queue(struct arm_smmu_device *smmu,
+ struct arm_smmu_queue *q, void __iomem *page,
+ unsigned long prod_off, unsigned long cons_off,
+ size_t dwords, const char *name);
#ifdef CONFIG_ARM_SMMU_V3_SVA
bool arm_smmu_sva_supported(struct arm_smmu_device *smmu);
@@ -812,4 +824,33 @@ static inline u32 arm_smmu_sva_get_pasid(struct iommu_sva *handle)
static inline void arm_smmu_sva_notifier_synchronize(void) {}
#endif /* CONFIG_ARM_SMMU_V3_SVA */
+
+struct acpi_iort_node;
+
+#ifdef CONFIG_NVIDIA_GRACE_CMDQV
+struct nvidia_grace_cmdqv *
+nvidia_grace_cmdqv_acpi_probe(struct arm_smmu_device *smmu,
+ struct acpi_iort_node *node);
+int nvidia_grace_cmdqv_device_reset(struct arm_smmu_device *smmu);
+struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu);
+#else /* CONFIG_NVIDIA_GRACE_CMDQV */
+static inline struct nvidia_grace_cmdqv *
+nvidia_grace_cmdqv_acpi_probe(struct arm_smmu_device *smmu,
+ struct acpi_iort_node *node)
+{
+ return NULL;
+}
+
+static inline int nvidia_grace_cmdqv_device_reset(struct arm_smmu_device *smmu)
+{
+ return -ENODEV;
+}
+
+static inline struct arm_smmu_cmdq *
+nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
+{
+ return NULL;
+}
+#endif /* CONFIG_NVIDIA_GRACE_CMDQV */
+
#endif /* _ARM_SMMU_V3_H */
diff --git a/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c b/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
new file mode 100644
index 000000000000..c0d7351f13e2
--- /dev/null
+++ b/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
@@ -0,0 +1,418 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (C) 2021 NVIDIA CORPORATION & AFFILIATES */
+
+#define dev_fmt(fmt) "nvidia_grace_cmdqv: " fmt
+
+#include <linux/acpi.h>
+#include <linux/dma-mapping.h>
+#include <linux/interrupt.h>
+#include <linux/iommu.h>
+#include <linux/iopoll.h>
+
+#include <acpi/acpixf.h>
+
+#include "arm-smmu-v3.h"
+
+#define NVIDIA_CMDQV_HID "NVDA0600"
+
+/* CMDQV register page base and size defines */
+#define NVIDIA_CMDQV_CONFIG_BASE (0)
+#define NVIDIA_CMDQV_CONFIG_SIZE (SZ_64K)
+#define NVIDIA_VCMDQ_BASE (0 + SZ_64K)
+#define NVIDIA_VCMDQ_SIZE (SZ_64K * 2) /* PAGE0 and PAGE1 */
+
+/* CMDQV global config regs */
+#define NVIDIA_CMDQV_CONFIG 0x0000
+#define CMDQV_EN BIT(0)
+
+#define NVIDIA_CMDQV_PARAM 0x0004
+#define CMDQV_NUM_VINTF_LOG2 GENMASK(11, 8)
+#define CMDQV_NUM_VCMDQ_LOG2 GENMASK(7, 4)
+
+#define NVIDIA_CMDQV_STATUS 0x0008
+#define CMDQV_STATUS GENMASK(2, 1)
+#define CMDQV_ENABLED BIT(0)
+
+#define NVIDIA_CMDQV_VINTF_ERR_MAP 0x000C
+#define NVIDIA_CMDQV_VINTF_INT_MASK 0x0014
+#define NVIDIA_CMDQV_VCMDQ_ERR_MAP 0x001C
+
+#define NVIDIA_CMDQV_CMDQ_ALLOC(q) (0x0200 + 0x4*(q))
+#define CMDQV_CMDQ_ALLOC_VINTF GENMASK(20, 15)
+#define CMDQV_CMDQ_ALLOC_LVCMDQ GENMASK(7, 1)
+#define CMDQV_CMDQ_ALLOCATED BIT(0)
+
+/* VINTF config regs */
+#define NVIDIA_CMDQV_VINTF(v) (0x1000 + 0x100*(v))
+
+#define NVIDIA_VINTF_CONFIG 0x0000
+#define VINTF_HYP_OWN BIT(17)
+#define VINTF_VMID GENMASK(16, 1)
+#define VINTF_EN BIT(0)
+
+#define NVIDIA_VINTF_STATUS 0x0004
+#define VINTF_STATUS GENMASK(3, 1)
+#define VINTF_ENABLED BIT(0)
+
+/* VCMDQ config regs */
+/* -- PAGE0 -- */
+#define NVIDIA_CMDQV_VCMDQ(q) (NVIDIA_VCMDQ_BASE + 0x80*(q))
+
+#define NVIDIA_VCMDQ_CONS 0x00000
+#define VCMDQ_CONS_ERR GENMASK(30, 24)
+
+#define NVIDIA_VCMDQ_PROD 0x00004
+
+#define NVIDIA_VCMDQ_CONFIG 0x00008
+#define VCMDQ_EN BIT(0)
+
+#define NVIDIA_VCMDQ_STATUS 0x0000C
+#define VCMDQ_ENABLED BIT(0)
+
+#define NVIDIA_VCMDQ_GERROR 0x00010
+#define NVIDIA_VCMDQ_GERRORN 0x00014
+
+/* -- PAGE1 -- */
+#define NVIDIA_VCMDQ_BASE_L(q) (NVIDIA_CMDQV_VCMDQ(q) + SZ_64K)
+#define VCMDQ_ADDR GENMASK(47, 5)
+#define VCMDQ_LOG2SIZE GENMASK(4, 0)
+
+struct nvidia_grace_cmdqv_vintf {
+ u16 idx;
+ u32 cfg;
+ u32 status;
+
+ void __iomem *base;
+ struct arm_smmu_cmdq *vcmdqs;
+};
+
+struct nvidia_grace_cmdqv {
+ struct arm_smmu_device *smmu;
+
+ struct device *dev;
+ struct resource *res;
+ void __iomem *base;
+ int irq;
+
+ /* CMDQV Hardware Params */
+ u16 num_total_vintfs;
+ u16 num_total_vcmdqs;
+ u16 num_vcmdqs_per_vintf;
+
+ /* CMDQV_VINTF(0) reserved for host kernel use */
+ struct nvidia_grace_cmdqv_vintf vintf0;
+};
+
+static irqreturn_t nvidia_grace_cmdqv_isr(int irq, void *devid)
+{
+ struct nvidia_grace_cmdqv *cmdqv = (struct nvidia_grace_cmdqv *)devid;
+ struct nvidia_grace_cmdqv_vintf *vintf0 = &cmdqv->vintf0;
+ u32 vintf_err_map[2];
+ u32 vcmdq_err_map[4];
+
+ vintf_err_map[0] = readl_relaxed(cmdqv->base + NVIDIA_CMDQV_VINTF_ERR_MAP);
+ vintf_err_map[1] = readl_relaxed(cmdqv->base + NVIDIA_CMDQV_VINTF_ERR_MAP + 0x4);
+
+ vcmdq_err_map[0] = readl_relaxed(cmdqv->base + NVIDIA_CMDQV_VCMDQ_ERR_MAP);
+ vcmdq_err_map[1] = readl_relaxed(cmdqv->base + NVIDIA_CMDQV_VCMDQ_ERR_MAP + 0x4);
+ vcmdq_err_map[2] = readl_relaxed(cmdqv->base + NVIDIA_CMDQV_VCMDQ_ERR_MAP + 0x8);
+ vcmdq_err_map[3] = readl_relaxed(cmdqv->base + NVIDIA_CMDQV_VCMDQ_ERR_MAP + 0xC);
+
+ dev_warn(cmdqv->dev,
+ "unexpected cmdqv error reported: vintf_map %08X %08X, vcmdq_map %08X %08X %08X %08X\n",
+ vintf_err_map[0], vintf_err_map[1], vcmdq_err_map[0], vcmdq_err_map[1],
+ vcmdq_err_map[2], vcmdq_err_map[3]);
+
+ /* If the error was reported by vintf0, avoid using any of its VCMDQs */
+ if (vintf_err_map[vintf0->idx / 32] & (1 << (vintf0->idx % 32))) {
+ vintf0->status = readl_relaxed(vintf0->base + NVIDIA_VINTF_STATUS);
+
+ dev_warn(cmdqv->dev, "error (0x%lX) reported by host vintf0 - disabling its vcmdqs\n",
+ FIELD_GET(VINTF_STATUS, vintf0->status));
+ } else if (vintf_err_map[0] || vintf_err_map[1]) {
+ dev_err(cmdqv->dev, "cmdqv error interrupt triggered by unassigned vintf!\n");
+ }
+
+ return IRQ_HANDLED;
+}
+
+/* Adapt struct arm_smmu_cmdq init sequences from arm-smmu-v3.c for VCMDQs */
+static int nvidia_grace_cmdqv_init_one_vcmdq(struct nvidia_grace_cmdqv *cmdqv,
+ struct arm_smmu_cmdq *cmdq,
+ void __iomem *vcmdq_base, u16 qidx)
+{
+ struct arm_smmu_queue *q = &cmdq->q;
+ char name[16];
+ int ret;
+
+ sprintf(name, "vcmdq%u", qidx);
+
+ q->llq.max_n_shift = ilog2(SZ_64K >> CMDQ_ENT_SZ_SHIFT);
+
+ /* Use the common helper to init the VCMDQ, and then... */
+ ret = arm_smmu_init_one_queue(cmdqv->smmu, q, vcmdq_base,
+ NVIDIA_VCMDQ_PROD, NVIDIA_VCMDQ_CONS,
+ CMDQ_ENT_DWORDS, name);
+ if (ret)
+ return ret;
+
+ /* ...override q_base for VCMDQ_BASE_L/H registers */
+ q->q_base = q->base_dma & VCMDQ_ADDR;
+ q->q_base |= FIELD_PREP(VCMDQ_LOG2SIZE, q->llq.max_n_shift);
+
+ /* All VCMDQs support CS_NONE only for CMD_SYNC */
+ q->quirks = CMDQ_QUIRK_SYNC_CS_NONE_ONLY;
+
+ return arm_smmu_cmdq_init(cmdqv->smmu, cmdq);
+}
+
+struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
+{
+ struct nvidia_grace_cmdqv *cmdqv = smmu->nvidia_grace_cmdqv;
+ struct nvidia_grace_cmdqv_vintf *vintf0 = &cmdqv->vintf0;
+ u16 qidx;
+
+ /* Fall back to the standard CMDQ if vintf0 has reported an error */
+ if (FIELD_GET(VINTF_STATUS, vintf0->status))
+ return &smmu->cmdq;
+
+ /*
+ * Select a vcmdq to use. Here we use a temporary solution to
+ * balance out traffic on cmdq issuing: each cmdq has its own
+ * lock, so if all CPUs issue cmdlists using the same cmdq,
+ * only one CPU at a time can enter the process, while the
+ * others will be spinning on the same lock.
+ */
+ qidx = smp_processor_id() % cmdqv->num_vcmdqs_per_vintf;
+ return &vintf0->vcmdqs[qidx];
+}
+
+int nvidia_grace_cmdqv_device_reset(struct arm_smmu_device *smmu)
+{
+ struct nvidia_grace_cmdqv *cmdqv = smmu->nvidia_grace_cmdqv;
+ struct nvidia_grace_cmdqv_vintf *vintf0 = &cmdqv->vintf0;
+ u32 regval;
+ u16 qidx;
+ int ret;
+
+ /* Setup vintf0 for host kernel */
+ vintf0->idx = 0;
+ vintf0->base = cmdqv->base + NVIDIA_CMDQV_VINTF(0);
+
+ regval = FIELD_PREP(VINTF_HYP_OWN, 1);
+ writel(regval, vintf0->base + NVIDIA_VINTF_CONFIG);
+
+ regval |= FIELD_PREP(VINTF_EN, 1);
+ writel(regval, vintf0->base + NVIDIA_VINTF_CONFIG);
+
+ vintf0->cfg = regval;
+
+ ret = readl_relaxed_poll_timeout(vintf0->base + NVIDIA_VINTF_STATUS,
+ regval, regval == VINTF_ENABLED,
+ 1, ARM_SMMU_POLL_TIMEOUT_US);
+ vintf0->status = regval;
+ if (ret) {
+ dev_err(cmdqv->dev, "failed to enable VINTF%u: STATUS = 0x%08X\n",
+ vintf0->idx, regval);
+ return ret;
+ }
+
+ /* Allocate vcmdqs to vintf0 */
+ for (qidx = 0; qidx < cmdqv->num_vcmdqs_per_vintf; qidx++) {
+ regval = FIELD_PREP(CMDQV_CMDQ_ALLOC_VINTF, vintf0->idx);
+ regval |= FIELD_PREP(CMDQV_CMDQ_ALLOC_LVCMDQ, qidx);
+ regval |= CMDQV_CMDQ_ALLOCATED;
+ writel_relaxed(regval, cmdqv->base + NVIDIA_CMDQV_CMDQ_ALLOC(qidx));
+ }
+
+ /* Build an arm_smmu_cmdq for each vcmdq allocated to vintf0 */
+ vintf0->vcmdqs = devm_kcalloc(cmdqv->dev, cmdqv->num_vcmdqs_per_vintf,
+ sizeof(*vintf0->vcmdqs), GFP_KERNEL);
+ if (!vintf0->vcmdqs)
+ return -ENOMEM;
+
+ for (qidx = 0; qidx < cmdqv->num_vcmdqs_per_vintf; qidx++) {
+ void __iomem *vcmdq_base = cmdqv->base + NVIDIA_CMDQV_VCMDQ(qidx);
+ struct arm_smmu_cmdq *cmdq = &vintf0->vcmdqs[qidx];
+
+ /* Setup struct arm_smmu_cmdq data members */
+ nvidia_grace_cmdqv_init_one_vcmdq(cmdqv, cmdq, vcmdq_base, qidx);
+
+ /* Configure and enable the vcmdq */
+ writel_relaxed(0, vcmdq_base + NVIDIA_VCMDQ_PROD);
+ writel_relaxed(0, vcmdq_base + NVIDIA_VCMDQ_CONS);
+
+ writeq_relaxed(cmdq->q.q_base, cmdqv->base + NVIDIA_VCMDQ_BASE_L(qidx));
+
+ writel(VCMDQ_EN, vcmdq_base + NVIDIA_VCMDQ_CONFIG);
+ ret = readl_poll_timeout(vcmdq_base + NVIDIA_VCMDQ_STATUS,
+ regval, regval == VCMDQ_ENABLED,
+ 1, ARM_SMMU_POLL_TIMEOUT_US);
+ if (ret) {
+ u32 gerror = readl_relaxed(vcmdq_base + NVIDIA_VCMDQ_GERROR);
+ u32 gerrorn = readl_relaxed(vcmdq_base + NVIDIA_VCMDQ_GERRORN);
+ u32 cons = readl_relaxed(vcmdq_base + NVIDIA_VCMDQ_CONS);
+
+ dev_err(cmdqv->dev,
+ "failed to enable VCMDQ%u: GERROR=0x%X, GERRORN=0x%X, CONS=0x%X\n",
+ qidx, gerror, gerrorn, cons);
+ return ret;
+ }
+
+ dev_info(cmdqv->dev, "VCMDQ%u allocated to VINTF%u as logical-VCMDQ%u\n",
+ qidx, vintf0->idx, qidx);
+ }
+
+ return 0;
+}
+
+static int nvidia_grace_cmdqv_acpi_is_memory(struct acpi_resource *res, void *data)
+{
+ struct resource r;
+
+ return !acpi_dev_resource_memory(res, &r);
+}
+
+static int nvidia_grace_cmdqv_acpi_get_irqs(struct acpi_resource *ares, void *data)
+{
+ struct resource r;
+ int *irq = data;
+
+ if (*irq <= 0 && acpi_dev_resource_interrupt(ares, 0, &r))
+ *irq = r.start;
+
+ return 1; /* No need to add resource to the list */
+}
+
+/*
+ * Probe all ACPI resources of the CMDQV device and allocate them accordingly
+ *
+ * Note that this uses devm_* functions for resource allocation, so that the
+ * smmu driver can roll back cmdqv resources automatically, with no additional
+ * cleanup routine, if a later error happens there. This also means that all
+ * error unwinding here has to use devm_* functions too.
+ */
+static struct nvidia_grace_cmdqv *
+nvidia_grace_cmdqv_find_resource(struct arm_smmu_device *smmu,
+ struct acpi_iort_node *node)
+{
+ struct nvidia_grace_cmdqv *cmdqv = NULL;
+ struct list_head resource_list;
+ struct resource_entry *rentry;
+ struct acpi_device *adev;
+ const char *match_uid;
+ int ret;
+
+ if (acpi_disabled)
+ return NULL;
+
+ /* Look for a device in the DSDT whose _UID matches the SMMU's iort_node identifier */
+ match_uid = kasprintf(GFP_KERNEL, "%u", node->identifier);
+ adev = acpi_dev_get_first_match_dev(NVIDIA_CMDQV_HID, match_uid, -1);
+ kfree(match_uid);
+
+ if (!adev)
+ return NULL;
+
+ dev_info(smmu->dev, "found companion CMDQV device, %s\n", dev_name(&adev->dev));
+
+ INIT_LIST_HEAD(&resource_list);
+ ret = acpi_dev_get_resources(adev, &resource_list,
+ nvidia_grace_cmdqv_acpi_is_memory, NULL);
+ if (ret < 0) {
+ dev_err(smmu->dev, "failed to get memory resource: %d\n", ret);
+ goto put_dev;
+ }
+
+ cmdqv = devm_kzalloc(smmu->dev, sizeof(*cmdqv), GFP_KERNEL);
+ if (!cmdqv)
+ goto free_list;
+
+ rentry = list_first_entry_or_null(&resource_list, struct resource_entry, node);
+ if (!rentry) {
+ dev_err(smmu->dev, "failed to get memory resource entry\n");
+ goto free_cmdqv;
+ }
+
+ cmdqv->smmu = smmu;
+ cmdqv->dev = smmu->dev;
+ cmdqv->res = rentry->res;
+
+ cmdqv->base = devm_ioremap_resource(smmu->dev, rentry->res);
+ if (IS_ERR(cmdqv->base)) {
+ dev_err(smmu->dev, "failed to ioremap: %ld\n", PTR_ERR(cmdqv->base));
+ goto free_cmdqv;
+ }
+
+ ret = acpi_dev_get_resources(adev, &resource_list,
+ nvidia_grace_cmdqv_acpi_get_irqs, &cmdqv->irq);
+ if (ret < 0) {
+ dev_warn(smmu->dev, "no cmdqv interrupt - errors will not be reported\n");
+ cmdqv->irq = 0;
+ } else {
+ ret = devm_request_irq(smmu->dev, cmdqv->irq, nvidia_grace_cmdqv_isr,
+ 0, "nvidia-grace-cmdqv", cmdqv);
+ if (ret) {
+ dev_err(smmu->dev, "failed to request irq (%d): %d\n",
+ cmdqv->irq, ret);
+ goto iounmap;
+ }
+ }
+
+ goto free_list;
+
+iounmap:
+ devm_iounmap(smmu->dev, cmdqv->base);
+ devm_release_mem_region(smmu->dev, cmdqv->res->start,
+ resource_size(cmdqv->res));
+free_cmdqv:
+ devm_kfree(smmu->dev, cmdqv);
+ cmdqv = NULL;
+free_list:
+ acpi_dev_free_resource_list(&resource_list);
+put_dev:
+ put_device(&adev->dev);
+
+ return cmdqv;
+}
+
+struct nvidia_grace_cmdqv *
+nvidia_grace_cmdqv_acpi_probe(struct arm_smmu_device *smmu,
+ struct acpi_iort_node *node)
+{
+ struct nvidia_grace_cmdqv *cmdqv;
+ u32 regval;
+
+ cmdqv = nvidia_grace_cmdqv_find_resource(smmu, node);
+ if (!cmdqv)
+ return NULL;
+
+ regval = readl_relaxed(cmdqv->base + NVIDIA_CMDQV_CONFIG);
+ if (!FIELD_GET(CMDQV_EN, regval)) {
+ dev_err(cmdqv->dev, "CMDQV h/w is disabled: CMDQV_CONFIG=0x%08X\n", regval);
+ goto free_res;
+ }
+
+ regval = readl_relaxed(cmdqv->base + NVIDIA_CMDQV_STATUS);
+ if (!FIELD_GET(CMDQV_ENABLED, regval) || FIELD_GET(CMDQV_STATUS, regval)) {
+ dev_err(cmdqv->dev, "CMDQV h/w not ready: CMDQV_STATUS=0x%08X\n", regval);
+ goto free_res;
+ }
+
+ regval = readl_relaxed(cmdqv->base + NVIDIA_CMDQV_PARAM);
+ cmdqv->num_total_vintfs = 1 << FIELD_GET(CMDQV_NUM_VINTF_LOG2, regval);
+ cmdqv->num_total_vcmdqs = 1 << FIELD_GET(CMDQV_NUM_VCMDQ_LOG2, regval);
+ cmdqv->num_vcmdqs_per_vintf = cmdqv->num_total_vcmdqs / cmdqv->num_total_vintfs;
+
+ return cmdqv;
+
+free_res:
+ if (cmdqv->irq)
+ devm_free_irq(smmu->dev, cmdqv->irq, cmdqv);
+ devm_iounmap(smmu->dev, cmdqv->base);
+ devm_release_mem_region(smmu->dev, cmdqv->res->start,
+ resource_size(cmdqv->res));
+ devm_kfree(smmu->dev, cmdqv);
+
+ return NULL;
+}
--
2.17.1
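
As an illustration of the CMDQV_PARAM decoding at the end of
nvidia_grace_cmdqv_acpi_probe() above, here is a minimal user-space sketch.
The DEMO_* masks and the demo_* helpers are hypothetical and are not taken
from the patch; the real field positions come from the CMDQV_NUM_VINTF_LOG2
and CMDQV_NUM_VCMDQ_LOG2 macros in the driver. Only the arithmetic (two log2
fields, then an even split of VCMDQs across VINTFs) mirrors the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field layout, for illustration only (not from the patch) */
#define DEMO_NUM_VINTF_LOG2_MASK 0x000000f0u /* assumed bits [7:4] */
#define DEMO_NUM_VCMDQ_LOG2_MASK 0x00000f00u /* assumed bits [11:8] */

/* User-space stand-in for the kernel's FIELD_GET(): extract a masked field */
static uint32_t demo_field_get(uint32_t mask, uint32_t reg)
{
	/* (mask & -mask) isolates the lowest set bit of the mask */
	return (reg & mask) / (mask & (~mask + 1u));
}

struct demo_params {
	uint32_t num_total_vintfs;
	uint32_t num_total_vcmdqs;
	uint32_t num_vcmdqs_per_vintf;
};

/* Decode the two log2 fields and split the VCMDQs evenly across VINTFs */
static struct demo_params demo_decode_param(uint32_t regval)
{
	struct demo_params p;

	p.num_total_vintfs = 1u << demo_field_get(DEMO_NUM_VINTF_LOG2_MASK, regval);
	p.num_total_vcmdqs = 1u << demo_field_get(DEMO_NUM_VCMDQ_LOG2_MASK, regval);
	p.num_vcmdqs_per_vintf = p.num_total_vcmdqs / p.num_total_vintfs;

	return p;
}
```

With the assumed layout, a register value of (6 << 4) | (7 << 8) decodes to
64 VINTFs and 128 VCMDQs, i.e. 2 VCMDQs per VINTF.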
^ permalink raw reply related [flat|nested] 51+ messages in thread
* [PATCH v3 5/5] iommu/nvidia-grace-cmdqv: Limit CMDs for guest owned VINTF
2021-11-19 7:19 ` Nicolin Chen via iommu
(?)
@ 2021-11-19 7:19 ` Nicolin Chen via iommu
-1 siblings, 0 replies; 51+ messages in thread
From: Nicolin Chen @ 2021-11-19 7:19 UTC (permalink / raw)
To: joro, will, robin.murphy
Cc: nicoleotsuka, thierry.reding, vdumpa, nwatterson, jean-philippe,
thunder.leizhen, chenxiang66, Jonathan.Cameron, yuzenghui,
linux-kernel, iommu, linux-arm-kernel, linux-tegra, jgg
When VCMDQs are assigned to a VINTF that is owned by a guest rather than
the hypervisor (HYP_OWN bit is unset), only TLB invalidation commands
are supported. This requires the get_cmdq() function to scan the input
commands before selecting a cmdq between smmu->cmdq and vintf->vcmdq, so
that unsupported commands can still go through the emulated smmu->cmdq.
Also, the guest should never see the HYP_OWN bit set, regardless of
whether the guest kernel driver writes it or not, i.e. the user space
driver running in the host OS should wire this bit to zero when trapping
a write access to the VINTF_CONFIG register from a guest kernel.
So instead of using the existing regval, this patch reads the register
value back explicitly and caches it in vintf->cfg.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 6 ++--
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 5 +--
.../arm/arm-smmu-v3/nvidia-grace-cmdqv.c | 32 +++++++++++++++++--
3 files changed, 36 insertions(+), 7 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index b1182dd825fd..73941ccc1a3e 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -337,10 +337,10 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
return 0;
}
-static struct arm_smmu_cmdq *arm_smmu_get_cmdq(struct arm_smmu_device *smmu)
+static struct arm_smmu_cmdq *arm_smmu_get_cmdq(struct arm_smmu_device *smmu, u64 *cmds, int n)
{
if (smmu->nvidia_grace_cmdqv)
- return nvidia_grace_cmdqv_get_cmdq(smmu);
+ return nvidia_grace_cmdqv_get_cmdq(smmu, cmds, n);
return &smmu->cmdq;
}
@@ -747,7 +747,7 @@ static int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
u32 prod;
unsigned long flags;
bool owner;
- struct arm_smmu_cmdq *cmdq = arm_smmu_get_cmdq(smmu);
+ struct arm_smmu_cmdq *cmdq = arm_smmu_get_cmdq(smmu, cmds, n);
struct arm_smmu_ll_queue llq, head;
int ret = 0;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 24f93444aeeb..085c775c2eea 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -832,7 +832,8 @@ struct nvidia_grace_cmdqv *
nvidia_grace_cmdqv_acpi_probe(struct arm_smmu_device *smmu,
struct acpi_iort_node *node);
int nvidia_grace_cmdqv_device_reset(struct arm_smmu_device *smmu);
-struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu);
+struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu,
+ u64 *cmds, int n);
#else /* CONFIG_NVIDIA_GRACE_CMDQV */
static inline struct nvidia_grace_cmdqv *
nvidia_grace_cmdqv_acpi_probe(struct arm_smmu_device *smmu,
@@ -847,7 +848,7 @@ static inline int nvidia_grace_cmdqv_device_reset(struct arm_smmu_device *smmu)
}
static inline struct arm_smmu_cmdq *
-nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
+nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu, u64 *cmds, int n)
{
return NULL;
}
diff --git a/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c b/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
index c0d7351f13e2..71f6bc684e64 100644
--- a/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
+++ b/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
@@ -166,7 +166,8 @@ static int nvidia_grace_cmdqv_init_one_vcmdq(struct nvidia_grace_cmdqv *cmdqv,
return arm_smmu_cmdq_init(cmdqv->smmu, cmdq);
}
-struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
+struct arm_smmu_cmdq *
+nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu, u64 *cmds, int n)
{
struct nvidia_grace_cmdqv *cmdqv = smmu->nvidia_grace_cmdqv;
struct nvidia_grace_cmdqv_vintf *vintf0 = &cmdqv->vintf0;
@@ -176,6 +177,24 @@ struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
if (!FIELD_GET(VINTF_STATUS, vintf0->status))
return &smmu->cmdq;
+ /* Check for supported CMDs if VINTF is owned by guest (not hypervisor) */
+ if (!FIELD_GET(VINTF_HYP_OWN, vintf0->cfg)) {
+ u64 opcode = (n) ? FIELD_GET(CMDQ_0_OP, cmds[0]) : CMDQ_OP_CMD_SYNC;
+
+ /* List all supported CMDs for vintf->cmdq pathway */
+ switch (opcode) {
+ case CMDQ_OP_TLBI_NH_ASID:
+ case CMDQ_OP_TLBI_NH_VA:
+ case CMDQ_OP_TLBI_S12_VMALL:
+ case CMDQ_OP_TLBI_S2_IPA:
+ case CMDQ_OP_ATC_INV:
+ break;
+ default:
+ /* Unsupported CMDs go for smmu->cmdq pathway */
+ return &smmu->cmdq;
+ }
+ }
+
/*
* Select a vcmdq to use. Here we use a temporal solution to
* balance out traffic on cmdq issuing: each cmdq has its own
@@ -199,13 +218,22 @@ int nvidia_grace_cmdqv_device_reset(struct arm_smmu_device *smmu)
vintf0->idx = 0;
vintf0->base = cmdqv->base + NVIDIA_CMDQV_VINTF(0);
+ /*
+ * Note that the HYP_OWN bit is wired to zero when running in a guest
+ * kernel regardless of enabling it here, as !HYP_OWN cmdqs support
+ * only a restricted set of commands, following the HW design.
+ */
regval = FIELD_PREP(VINTF_HYP_OWN, 1);
writel(regval, vintf0->base + NVIDIA_VINTF_CONFIG);
regval |= FIELD_PREP(VINTF_EN, 1);
writel(regval, vintf0->base + NVIDIA_VINTF_CONFIG);
- vintf0->cfg = regval;
+ /*
+ * As mentioned above, the HYP_OWN bit is wired to zero for a guest
+ * kernel, so read the value back from HW to make sure cfg reflects it
+ */
+ vintf0->cfg = readl(vintf0->base + NVIDIA_VINTF_CONFIG);
ret = readl_relaxed_poll_timeout(vintf0->base + NVIDIA_VINTF_STATUS,
regval, regval == VINTF_ENABLED,
--
2.17.1
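
The command filtering described in the commit message can be sketched in
user space as follows. This is only an illustration under stated
assumptions: the DEMO_* opcode values and the opcode mask are placeholders
chosen for this example, and demo_select_queue() is a hypothetical helper,
not the driver function. As in the patch, only the first command of the
batch is inspected, and an empty batch is treated as a lone CMD_SYNC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed encoding for this sketch: opcode in bits [7:0] of the first dword */
#define DEMO_OP_MASK		0xffull

/* Placeholder opcode values, for illustration only */
#define DEMO_OP_CFGI_STE	0x03
#define DEMO_OP_TLBI_NH_ASID	0x11
#define DEMO_OP_TLBI_NH_VA	0x12
#define DEMO_OP_TLBI_S12_VMALL	0x28
#define DEMO_OP_TLBI_S2_IPA	0x2a
#define DEMO_OP_ATC_INV		0x40
#define DEMO_OP_CMD_SYNC	0x46

enum demo_queue {
	DEMO_QUEUE_SMMU_CMDQ,	/* emulated/standard smmu->cmdq */
	DEMO_QUEUE_VCMDQ,	/* one of the vintf->vcmdqs */
};

static enum demo_queue demo_select_queue(bool hyp_own, const uint64_t *cmds, int n)
{
	uint64_t opcode;

	/* A hypervisor-owned VINTF accepts every command */
	if (hyp_own)
		return DEMO_QUEUE_VCMDQ;

	/* Guest-owned VINTF: look at the opcode of the first command only */
	opcode = n ? (cmds[0] & DEMO_OP_MASK) : DEMO_OP_CMD_SYNC;

	switch (opcode) {
	case DEMO_OP_TLBI_NH_ASID:
	case DEMO_OP_TLBI_NH_VA:
	case DEMO_OP_TLBI_S12_VMALL:
	case DEMO_OP_TLBI_S2_IPA:
	case DEMO_OP_ATC_INV:
		return DEMO_QUEUE_VCMDQ;
	default:
		/* Everything else falls back to the standard queue */
		return DEMO_QUEUE_SMMU_CMDQ;
	}
}
```

Note that with this logic a batch consisting only of a sync (n == 0) always
takes the smmu->cmdq path on a guest-owned VINTF, since CMD_SYNC is not in
the list of supported opcodes.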
^ permalink raw reply related [flat|nested] 51+ messages in thread
* Re: [PATCH v3 4/5] iommu/arm-smmu-v3: Add host support for NVIDIA Grace CMDQ-V
2021-11-19 7:19 ` Nicolin Chen via iommu
(?)
@ 2021-12-20 18:42 ` Robin Murphy
-1 siblings, 0 replies; 51+ messages in thread
From: Robin Murphy @ 2021-12-20 18:42 UTC (permalink / raw)
To: Nicolin Chen, joro, will
Cc: nicoleotsuka, thierry.reding, vdumpa, nwatterson, jean-philippe,
thunder.leizhen, chenxiang66, Jonathan.Cameron, yuzenghui,
linux-kernel, iommu, linux-arm-kernel, linux-tegra, jgg
On 2021-11-19 07:19, Nicolin Chen wrote:
> From: Nate Watterson <nwatterson@nvidia.com>
>
> NVIDIA's Grace SoC has CMDQ-Virtualization (CMDQV) hardware,
> which extends the standard ARM SMMU v3 IP to support multiple
> VCMDQs with virtualization capabilities. In the host OS kernel,
> they're used to reduce contention on a single queue. As command
> queues, they are very similar to the standard CMDQ/ECMDQs, but
> only support CS_NONE in the CS field of the CMD_SYNC command.
>
> This patch adds a new nvidia-grace-cmdqv file and inserts its
> structure pointer into the existing arm_smmu_device, and then
> adds related function calls in the arm-smmu-v3 driver.
>
> In the CMDQV driver itself, this patch only adds the minimal part
> for host kernel support. Upon probe(), VINTF0 is reserved for
> in-kernel use, and some of the VCMDQs are assigned to VINTF0.
> Then the driver will select one of the VCMDQs in VINTF0, based
> on the CPU currently executing, to issue commands.
Is there a tangible difference to DMA API or VFIO performance?
[...]
> +struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
> +{
> + struct nvidia_grace_cmdqv *cmdqv = smmu->nvidia_grace_cmdqv;
> + struct nvidia_grace_cmdqv_vintf *vintf0 = &cmdqv->vintf0;
> + u16 qidx;
> +
> + /* Check error status of vintf0 */
> + if (!FIELD_GET(VINTF_STATUS, vintf0->status))
> + return &smmu->cmdq;
> +
> + /*
> + * Select a vcmdq to use. Here we use a temporal solution to
> + * balance out traffic on cmdq issuing: each cmdq has its own
> + * lock, if all cpus issue cmdlist using the same cmdq, only
> + * one CPU at a time can enter the process, while the others
> + * will be spinning at the same lock.
> + */
> + qidx = smp_processor_id() % cmdqv->num_vcmdqs_per_vintf;
How does ordering work between queues? Do they follow a global order
such that a sync on any queue is guaranteed to complete all prior
commands on all queues?
The challenge in making ECMDQ useful to Linux is how to make sure that all
the commands expected to be within the scope of a future CMD_SYNC, plus that
sync itself, all get issued on the same queue, so I'd be mildly surprised
if you didn't have the same problem.
Robin.
> + return &vintf0->vcmdqs[qidx];
> +}
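
The per-CPU selection quoted above can be modelled outside the kernel as
follows. This is a minimal sketch, not the driver code: the demo_* helpers
are hypothetical, and the CPU number is passed in explicitly instead of
being read with smp_processor_id(). It only shows how the modulo spreads
CPUs across the VCMDQs of one VINTF; it says nothing about how commands and
their CMD_SYNC are ordered across different queues.

```c
#include <assert.h>
#include <stdint.h>

/* Map a CPU number onto one of the VCMDQs of a VINTF, as the patch does */
static uint16_t demo_pick_vcmdq(unsigned int cpu, uint16_t num_vcmdqs_per_vintf)
{
	return (uint16_t)(cpu % num_vcmdqs_per_vintf);
}

/*
 * Count how many of num_cpus CPUs land on each queue.
 * counts[] must have room for num_queues entries.
 */
static void demo_count_cpus_per_queue(unsigned int num_cpus, uint16_t num_queues,
				      unsigned int *counts)
{
	unsigned int cpu;
	uint16_t q;

	for (q = 0; q < num_queues; q++)
		counts[q] = 0;

	for (cpu = 0; cpu < num_cpus; cpu++)
		counts[demo_pick_vcmdq(cpu, num_queues)]++;
}
```

With 5 CPUs and 2 queues, CPUs 0, 2 and 4 share queue 0 while CPUs 1 and 3
share queue 1, so contention is reduced but not removed whenever there are
more CPUs than queues.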
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH v3 4/5] iommu/arm-smmu-v3: Add host support for NVIDIA Grace CMDQ-V
2021-12-20 18:42 ` Robin Murphy
(?)
@ 2021-12-20 19:27 ` Nicolin Chen via iommu
-1 siblings, 0 replies; 51+ messages in thread
From: Nicolin Chen @ 2021-12-20 19:27 UTC (permalink / raw)
To: Robin Murphy
Cc: joro, will, nicoleotsuka, thierry.reding, vdumpa, nwatterson,
jean-philippe, thunder.leizhen, chenxiang66, Jonathan.Cameron,
yuzenghui, linux-kernel, iommu, linux-arm-kernel, linux-tegra,
jgg
Hi Robin,
Thank you for the reply!
On Mon, Dec 20, 2021 at 06:42:26PM +0000, Robin Murphy wrote:
> On 2021-11-19 07:19, Nicolin Chen wrote:
> > From: Nate Watterson <nwatterson@nvidia.com>
> >
> > NVIDIA's Grace SoC has a CMDQ-Virtualization (CMDQV) hardware,
> > which extends the standard ARM SMMU v3 IP to support multiple
> > VCMDQs with virtualization capabilities. In-kernel of host OS,
> > they're used to reduce contention on a single queue. In terms
> > of command queue, they are very like the standard CMDQ/ECMDQs,
> > but only support CS_NONE in the CS field of CMD_SYNC command.
> >
> > This patch adds a new nvidia-grace-cmdqv file and inserts its
> > structure pointer into the existing arm_smmu_device, and then
> > adds related function calls in the arm-smmu-v3 driver.
> >
> > In the CMDQV driver itself, this patch only adds minimal part
> > for host kernel support. Upon probe(), VINTF0 is reserved for
> > in-kernel use. And some of the VCMDQs are assigned to VINTF0.
> > Then the driver will select one of VCMDQs in the VINTF0 based
> > on the CPU currently executing, to issue commands.
>
> Is there a tangible difference to DMA API or VFIO performance?
Our testing environment is currently running on a single-core
CPU, so unfortunately we don't have any perf data at this point.
> [...]
> > +struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
> > +{
> > + struct nvidia_grace_cmdqv *cmdqv = smmu->nvidia_grace_cmdqv;
> > + struct nvidia_grace_cmdqv_vintf *vintf0 = &cmdqv->vintf0;
> > + u16 qidx;
> > +
> > + /* Check error status of vintf0 */
> > + if (!FIELD_GET(VINTF_STATUS, vintf0->status))
> > + return &smmu->cmdq;
> > +
> > + /*
> > + * Select a vcmdq to use. Here we use a temporary solution to
> > + * balance out traffic on cmdq issuing: each cmdq has its own
> > + * lock, if all cpus issue cmdlist using the same cmdq, only
> > + * one CPU at a time can enter the process, while the others
> > + * will be spinning at the same lock.
> > + */
> > + qidx = smp_processor_id() % cmdqv->num_vcmdqs_per_vintf;
>
> How does ordering work between queues? Do they follow a global order
> such that a sync on any queue is guaranteed to complete all prior
> commands on all queues?
The CMDQV internal scheduler would insert a SYNC when (for example)
switching from VCMDQ0 to VCMDQ1 while the last command in VCMDQ0 is
not a SYNC. The HW has a configuration bit in a register to disable
this feature, which is enabled by default.
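
To make the behaviour described above concrete, here is a rough
user-space model, not driver code. The enum, struct and function names
are made up for illustration, and attributing the implicit SYNC to the
queue being switched away from is an assumption rather than something
taken from HW documentation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical command kinds; these are not the real SMMUv3 opcodes. */
enum cmd_kind { CMD_TLBI, CMD_SYNC };

struct cmd {
	unsigned int queue;	/* VCMDQ index the command was queued on */
	enum cmd_kind kind;
};

/*
 * Walk the commands in the order the scheduler consumes them and copy
 * them to out. With auto_sync set, an extra SYNC is emitted whenever
 * the source queue changes and the previous command was not a SYNC.
 * out must have room for 2 * n entries. Returns the number written.
 */
size_t schedule_with_auto_sync(const struct cmd *in, size_t n,
			       struct cmd *out, bool auto_sync)
{
	size_t i, w = 0;

	assert(in && out);
	for (i = 0; i < n; i++) {
		if (auto_sync && i > 0 &&
		    in[i].queue != in[i - 1].queue &&
		    in[i - 1].kind != CMD_SYNC) {
			/* Implicit SYNC closing off the previous queue. */
			out[w].queue = in[i - 1].queue;
			out[w].kind = CMD_SYNC;
			w++;
		}
		out[w++] = in[i];
	}
	return w;
}
```

With auto_sync cleared (the model's stand-in for turning the
configuration bit off), the commands pass through unchanged, and any
ordering between queues then has to come from software.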
> The challenge to make ECMDQ useful to Linux is how to make sure that all
> the commands expected to be within scope of a future CMD_SYNC plus that
> sync itself all get issued on the same queue, so I'd be mildly surprised
> if you didn't have the same problem.
PATCH-3 in this series actually helps align the command queues
between issued commands and the SYNC, if bool sync == true. Yet,
with something like issue->issue->issue_with_sync, it could be
trickier.
Thanks
Nic
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH v3 4/5] iommu/arm-smmu-v3: Add host support for NVIDIA Grace CMDQ-V
2021-12-20 19:27 ` Nicolin Chen via iommu
(?)
@ 2021-12-21 18:55 ` Robin Murphy
-1 siblings, 0 replies; 51+ messages in thread
From: Robin Murphy @ 2021-12-21 18:55 UTC (permalink / raw)
To: Nicolin Chen
Cc: joro, will, nicoleotsuka, thierry.reding, vdumpa, nwatterson,
jean-philippe, thunder.leizhen, chenxiang66, Jonathan.Cameron,
yuzenghui, linux-kernel, iommu, linux-arm-kernel, linux-tegra,
jgg
On 2021-12-20 19:27, Nicolin Chen wrote:
> Hi Robin,
>
> Thank you for the reply!
>
> On Mon, Dec 20, 2021 at 06:42:26PM +0000, Robin Murphy wrote:
>> On 2021-11-19 07:19, Nicolin Chen wrote:
>>> From: Nate Watterson <nwatterson@nvidia.com>
>>>
>>> NVIDIA's Grace SoC has a CMDQ-Virtualization (CMDQV) hardware,
>>> which extends the standard ARM SMMU v3 IP to support multiple
>>> VCMDQs with virtualization capabilities. In-kernel of host OS,
>>> they're used to reduce contention on a single queue. In terms
>>> of command queue, they are very like the standard CMDQ/ECMDQs,
>>> but only support CS_NONE in the CS field of CMD_SYNC command.
>>>
>>> This patch adds a new nvidia-grace-cmdqv file and inserts its
>>> structure pointer into the existing arm_smmu_device, and then
>>> adds related function calls in the arm-smmu-v3 driver.
>>>
>>> In the CMDQV driver itself, this patch only adds minimal part
>>> for host kernel support. Upon probe(), VINTF0 is reserved for
>>> in-kernel use. And some of the VCMDQs are assigned to VINTF0.
>>> Then the driver will select one of VCMDQs in the VINTF0 based
>>> on the CPU currently executing, to issue commands.
>>
>> Is there a tangible difference to DMA API or VFIO performance?
>
> Our testing environment is currently running on a single-core
> CPU, so unfortunately we don't have any perf data at this point.
OK, as for the ECMDQ patches I think we'll need some investigation with
real workloads to judge whether we can benefit from these things enough
to justify the complexity, and whether the design is right.
My gut feeling is that if these multi-queue schemes really can live up
to their promise of making contention negligible, then they should
further stand to benefit from bypassing the complex lock-free command
batching in favour of something more lightweight, which could change the
direction of much of the refactoring.
>> [...]
>>> +struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
>>> +{
>>> + struct nvidia_grace_cmdqv *cmdqv = smmu->nvidia_grace_cmdqv;
>>> + struct nvidia_grace_cmdqv_vintf *vintf0 = &cmdqv->vintf0;
>>> + u16 qidx;
>>> +
>>> + /* Check error status of vintf0 */
>>> + if (!FIELD_GET(VINTF_STATUS, vintf0->status))
>>> + return &smmu->cmdq;
>>> +
>>> + /*
>>> + * Select a vcmdq to use. Here we use a temporary solution to
>>> + * balance out traffic on cmdq issuing: each cmdq has its own
>>> + * lock, if all cpus issue cmdlist using the same cmdq, only
>>> + * one CPU at a time can enter the process, while the others
>>> + * will be spinning at the same lock.
>>> + */
>>> + qidx = smp_processor_id() % cmdqv->num_vcmdqs_per_vintf;
>>
>> How does ordering work between queues? Do they follow a global order
>> such that a sync on any queue is guaranteed to complete all prior
>> commands on all queues?
>
> The CMDQV internal scheduler would insert a SYNC when (for example)
> switching from VCMDQ0 to VCMDQ1 while the last command in VCMDQ0 is
> not a SYNC. The HW has a configuration bit in a register to disable
> this feature, which is enabled by default.
Interesting, thanks. So it sounds like this is something you can get
away with for the moment, but may need to revisit once people chasing
real-world performance start wanting to turn that bit off.
>> The challenge to make ECMDQ useful to Linux is how to make sure that all
>> the commands expected to be within scope of a future CMD_SYNC plus that
>> sync itself all get issued on the same queue, so I'd be mildly surprised
>> if you didn't have the same problem.
>
> PATCH-3 in this series actually helps align the command queues
> between issued commands and the SYNC, if bool sync == true. Yet,
> with something like issue->issue->issue_with_sync, it could be
> trickier.
Indeed between the iommu_iotlb_gather mechanism and low-level command
batching things are already a lot more concentrated than they could be,
but arm_smmu_cmdq_batch_add() and its callers stand out as examples of
where we'd still be vulnerable to preemption. What I haven't even tried
to reason about yet is assumptions in the higher-level APIs, e.g. if
io-pgtable might chuck out a TLBI during an iommu_unmap() which we
implicitly expect a later iommu_iotlb_sync() to cover.
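
The preemption hazard can be illustrated with a small user-space model.
This is only a sketch: NUM_QUEUES, queues_used() and the idea of
latching the queue at the first call are hypothetical and are not part
of the posted series. It counts how many distinct queues a sequence of
issues (ending in the CMD_SYNC) lands on when the queue is re-derived
from the current CPU at every call, as opposed to being chosen once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_QUEUES 4u	/* assumed queue count; must be <= 32 for the bitmask */

/*
 * cpu_at_call[i] is the CPU the task happens to be running on when it
 * issues its i-th command, the last one being the CMD_SYNC.
 * If latch_first is false, the queue is re-derived from the CPU on
 * every call; if true, the queue chosen for the first call is reused.
 * Returns the number of distinct queues the sequence ends up on.
 */
unsigned int queues_used(const unsigned int *cpu_at_call, size_t ncalls,
			 bool latch_first)
{
	uint32_t seen = 0;
	unsigned int count = 0;
	size_t i;

	assert(cpu_at_call || ncalls == 0);
	for (i = 0; i < ncalls; i++) {
		unsigned int cpu = latch_first ? cpu_at_call[0] : cpu_at_call[i];
		unsigned int q = cpu % NUM_QUEUES;

		if (!(seen & (UINT32_C(1) << q))) {
			seen |= UINT32_C(1) << q;
			count++;
		}
	}
	return count;
}
```

A task that migrates from CPU 2 to CPU 7 between its second and third
call ends up on two queues in the per-call case, so the final sync no
longer covers the earlier commands unless the hardware orders the
queues for it.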
I've been thinking that in many ways per-domain queues make quite a bit
of sense and would be easier to manage than per-CPU ones - plus that's
pretty much the usage model once we get to VMs anyway - but that fails
to help the significant cases like networking and storage where many
CPUs are servicing a big monolithic device in a single domain :(
Robin.
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH v3 4/5] iommu/arm-smmu-v3: Add host support for NVIDIA Grace CMDQ-V
2021-12-21 18:55 ` Robin Murphy
(?)
@ 2021-12-21 22:00 ` Nicolin Chen via iommu
-1 siblings, 0 replies; 51+ messages in thread
From: Nicolin Chen @ 2021-12-21 22:00 UTC (permalink / raw)
To: Robin Murphy
Cc: joro, will, nicoleotsuka, thierry.reding, vdumpa, nwatterson,
jean-philippe, thunder.leizhen, chenxiang66, Jonathan.Cameron,
yuzenghui, linux-kernel, iommu, linux-arm-kernel, linux-tegra,
jgg
On Tue, Dec 21, 2021 at 06:55:20PM +0000, Robin Murphy wrote:
> On 2021-12-20 19:27, Nicolin Chen wrote:
> > Hi Robin,
> >
> > Thank you for the reply!
> >
> > On Mon, Dec 20, 2021 at 06:42:26PM +0000, Robin Murphy wrote:
> > > On 2021-11-19 07:19, Nicolin Chen wrote:
> > > > From: Nate Watterson <nwatterson@nvidia.com>
> > > >
> > > > NVIDIA's Grace SoC has a CMDQ-Virtualization (CMDQV) hardware,
> > > > which extends the standard ARM SMMU v3 IP to support multiple
> > > > VCMDQs with virtualization capabilities. In-kernel of host OS,
> > > > they're used to reduce contention on a single queue. In terms
> > > > of command queue, they are very like the standard CMDQ/ECMDQs,
> > > > but only support CS_NONE in the CS field of CMD_SYNC command.
> > > >
> > > > This patch adds a new nvidia-grace-cmdqv file and inserts its
> > > > structure pointer into the existing arm_smmu_device, and then
> > > > adds related function calls in the arm-smmu-v3 driver.
> > > >
> > > > In the CMDQV driver itself, this patch only adds minimal part
> > > > for host kernel support. Upon probe(), VINTF0 is reserved for
> > > > in-kernel use. And some of the VCMDQs are assigned to VINTF0.
> > > > Then the driver will select one of VCMDQs in the VINTF0 based
> > > > on the CPU currently executing, to issue commands.
> > >
> > > Is there a tangible difference to DMA API or VFIO performance?
> >
> > Our testing environment is currently running on a single-core
> > CPU, so unfortunately we don't have any perf data at this point.
>
> OK, as for the ECMDQ patches I think we'll need some investigation with
> real workloads to judge whether we can benefit from these things enough
> to justify the complexity, and whether the design is right.
>
> My gut feeling is that if these multi-queue schemes really can live up
> to their promise of making contention negligible, then they should
> further stand to benefit from bypassing the complex lock-free command
> batching in favour of something more lightweight, which could change the
> direction of much of the refactoring.
Makes sense. We will share our perf data once we have a certain
level of support in our test environment.
> > > [...]
> > > > +struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
> > > > +{
> > > > + struct nvidia_grace_cmdqv *cmdqv = smmu->nvidia_grace_cmdqv;
> > > > + struct nvidia_grace_cmdqv_vintf *vintf0 = &cmdqv->vintf0;
> > > > + u16 qidx;
> > > > +
> > > > + /* Check error status of vintf0 */
> > > > + if (!FIELD_GET(VINTF_STATUS, vintf0->status))
> > > > + return &smmu->cmdq;
> > > > +
> > > > + /*
> > > > + * Select a vcmdq to use. Here we use a temporary solution to
> > > > + * balance out traffic on cmdq issuing: each cmdq has its own
> > > > + * lock, if all cpus issue cmdlist using the same cmdq, only
> > > > + * one CPU at a time can enter the process, while the others
> > > > + * will be spinning at the same lock.
> > > > + */
> > > > + qidx = smp_processor_id() % cmdqv->num_vcmdqs_per_vintf;
> > >
> > > How does ordering work between queues? Do they follow a global order
> > > such that a sync on any queue is guaranteed to complete all prior
> > > commands on all queues?
> >
> > The CMDQV internal scheduler would insert a SYNC when (for example)
> > switching from VCMDQ0 to VCMDQ1 while the last command in VCMDQ0 is
> > not a SYNC. The HW has a configuration bit in a register to disable
> > this feature, which is enabled by default.
>
> Interesting, thanks. So it sounds like this is something you can get
> away with for the moment, but may need to revisit once people chasing
> real-world performance start wanting to turn that bit off.
Yeah, we currently have limitations on both the testing setup and
the available clients for an in-depth perf measurement. But we
will surely do as you mentioned. Anyway, this is just for initial
support.
> > > The challenge to make ECMDQ useful to Linux is how to make sure that all
> > > the commands expected to be within scope of a future CMD_SYNC plus that
> > > sync itself all get issued on the same queue, so I'd be mildly surprised
> > > if you didn't have the same problem.
> >
> > PATCH-3 in this series actually helps align the command queues
> > between issued commands and the SYNC, if bool sync == true. Yet,
> > with something like issue->issue->issue_with_sync, it could be
> > trickier.
>
> Indeed between the iommu_iotlb_gather mechanism and low-level command
> batching things are already a lot more concentrated than they could be,
> but arm_smmu_cmdq_batch_add() and its callers stand out as examples of
> where we'd still be vulnerable to preemption. What I haven't even tried
> to reason about yet is assumptions in the higher-level APIs, e.g. if
> io-pgtable might chuck out a TLBI during an iommu_unmap() which we
> implicitly expect a later iommu_iotlb_sync() to cover.
Though I might be oversimplifying the situation here, I see that
arm_smmu_cmdq_batch_add() calls are typically followed by
arm_smmu_cmdq_batch_submit(). Could we just issue a SYNC in
_batch_submit() to all the queues that were previously touched
in _batch_add()?
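
A minimal sketch of that idea, as a user-space model only: the batch
remembers which queues it has touched in a bitmask, and submit issues
one SYNC per touched queue. The struct fake_queues and struct batch
types, MAX_QUEUES, and these batch_add()/batch_submit() helpers are
hypothetical stand-ins, not the driver's arm_smmu_cmdq_batch API.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_QUEUES 32u	/* assumed upper bound, set by the bitmask width */

/* Hypothetical per-queue counters standing in for real command queues. */
struct fake_queues {
	unsigned int cmds[MAX_QUEUES];
	unsigned int syncs[MAX_QUEUES];
};

/* Hypothetical batch: remembers every queue it has issued commands to. */
struct batch {
	uint32_t touched;	/* bit q set => queue q still needs a SYNC */
};

/* Issue one command on the queue derived from the current CPU. */
void batch_add(struct fake_queues *fq, struct batch *b,
	       unsigned int cpu, unsigned int nqueues)
{
	unsigned int q;

	assert(nqueues > 0 && nqueues <= MAX_QUEUES);
	q = cpu % nqueues;
	fq->cmds[q]++;
	b->touched |= UINT32_C(1) << q;
}

/* Issue a SYNC on every touched queue; returns how many were issued. */
unsigned int batch_submit(struct fake_queues *fq, struct batch *b)
{
	unsigned int q, issued = 0;

	for (q = 0; q < MAX_QUEUES; q++) {
		if (b->touched & (UINT32_C(1) << q)) {
			fq->syncs[q]++;
			issued++;
		}
	}
	b->touched = 0;
	return issued;
}
```

The cost in this model is one SYNC per touched queue instead of one per
batch, so it only stays cheap if a batch rarely spans more than one
queue.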
> I've been thinking that in many ways per-domain queues make quite a bit
> of sense and would be easier to manage than per-CPU ones - plus that's
> pretty much the usage model once we get to VMs anyway - but that fails
> to help the significant cases like networking and storage where many
> CPUs are servicing a big monolithic device in a single domain :(
Yeah, and it's hard to predict which client would use the CMDQ
more frequently, in order to balance or assign more queues to
that client, which feels like a QoS conundrum.
Thanks
Nic
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH v3 4/5] iommu/arm-smmu-v3: Add host support for NVIDIA Grace CMDQ-V
2021-12-21 22:00 ` Nicolin Chen via iommu
(?)
@ 2021-12-22 11:57 ` Robin Murphy
-1 siblings, 0 replies; 51+ messages in thread
From: Robin Murphy @ 2021-12-22 11:57 UTC (permalink / raw)
To: Nicolin Chen
Cc: joro, will, nicoleotsuka, thierry.reding, vdumpa, nwatterson,
jean-philippe, thunder.leizhen, chenxiang66, Jonathan.Cameron,
yuzenghui, linux-kernel, iommu, linux-arm-kernel, linux-tegra,
jgg
On 2021-12-21 22:00, Nicolin Chen wrote:
[...]
>>>> The challenge to make ECMDQ useful to Linux is how to make sure that all
>>>> the commands expected to be within scope of a future CMD_SYNC plus that
>>>> sync itself all get issued on the same queue, so I'd be mildly surprised
>>>> if you didn't have the same problem.
>>>
>>> PATCH-3 in this series actually helps align the command queues,
>>> between issued commands and SYNC, if bool sync == true. Yet, if
>>> doing something like issue->issue->issue_with_sync, it could be
>>> trickier.
>>
>> Indeed between the iommu_iotlb_gather mechanism and low-level command
>> batching things are already a lot more concentrated than they could be,
>> but arm_smmu_cmdq_batch_add() and its callers stand out as examples of
>> where we'd still be vulnerable to preemption. What I haven't even tried
>> to reason about yet is assumptions in the higher-level APIs, e.g. if
>> io-pgtable might chuck out a TLBI during an iommu_unmap() which we
>> implicitly expect a later iommu_iotlb_sync() to cover.
>
> Though I might have oversimplified the situation here, I see
> the arm_smmu_cmdq_batch_add() calls are typically followed by
> arm_smmu_cmdq_batch_submit(). Could we just add a SYNC in the
> _batch_submit() to all the queues that it previously touched
> in the _batch_add()?
Keeping track of which queues a batch has touched is certainly possible,
but it's yet more overhead to impose across the board when intra-batch
preemption should (hopefully) be very rare in practice. I was thinking
more along the lines of disabling preemption/migration for the lifetime
of a batch, or more pragmatically just hoisting the queue selection all
the way out to the scope of the batch itself (which also conveniently
seems about the right shape for potentially forking off a whole other
dedicated command submission flow from that point later).
We still can't mitigate inter-batch preemption, though, so we'll just
have to audit everything very carefully to make sure we don't have (or
inadvertently introduce in future) any places where that could be
problematic. We really want to avoid over-syncing as that's liable to
end up being just as bad for performance as the contention that we're
nominally avoiding.
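A minimal sketch of hoisting the queue selection out to the scope of the
batch, assuming a toy model with hypothetical toy_* names (this is not
the driver's API, and BATCH_MAX is an arbitrary illustrative limit). The
queue is chosen once when the batch is initialised; later adds and the
final submit never look at the current CPU again, so a migration in the
middle of a batch cannot split it across queues:

```c
#include <assert.h>

#define NUM_QUEUES 4	/* hypothetical number of queues */
#define BATCH_MAX 16	/* hypothetical batch capacity */

/* Hypothetical stand-in for one hardware command queue. */
struct toy_cmdq {
	unsigned int cmds;	/* non-sync commands issued on this queue */
	unsigned int syncs;	/* SYNCs issued on this queue */
};

struct toy_batch {
	struct toy_cmdq *q;	/* chosen once in toy_batch_init() */
	unsigned int num;	/* commands added so far */
};

/* The only place where the CPU number influences queue selection. */
static void toy_batch_init(struct toy_batch *b, struct toy_cmdq *qs,
			   unsigned int cpu)
{
	b->q = &qs[cpu % NUM_QUEUES];
	b->num = 0;
}

/* Adds always go to the queue bound at init time. */
static void toy_batch_add(struct toy_batch *b)
{
	assert(b->num < BATCH_MAX);
	b->q->cmds++;
	b->num++;
}

/* One SYNC on the bound queue covers every command in the batch. */
static void toy_batch_submit(struct toy_batch *b)
{
	b->q->syncs++;
	b->num = 0;
}
```

This only addresses intra-batch migration; two separate batches can
still land on different queues.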
>> I've been thinking that in many ways per-domain queues make quite a bit
>> of sense and would be easier to manage than per-CPU ones - plus that's
>> pretty much the usage model once we get to VMs anyway - but that fails
>> to help the significant cases like networking and storage where many
>> CPUs are servicing a big monolithic device in a single domain :(
>
> Yea, and it's hard to assume which client would use CMDQ more
> frequently, in order to balance or assign more queues to that
> client, which feels like a QoS conundrum.
Indeed, plus once we start assigning queues to VMs we're going to want
to remove them from the general pool for host usage, so we definitely
want to plan ahead here.
Cheers,
Robin.
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH v3 5/5] iommu/nvidia-grace-cmdqv: Limit CMDs for guest owned VINTF
2021-11-19 7:19 ` Nicolin Chen via iommu
(?)
@ 2021-12-22 12:32 ` Robin Murphy
-1 siblings, 0 replies; 51+ messages in thread
From: Robin Murphy @ 2021-12-22 12:32 UTC (permalink / raw)
To: Nicolin Chen, joro, will
Cc: jean-philippe, linux-kernel, iommu, linux-tegra, thierry.reding,
jgg, linux-arm-kernel
On 2021-11-19 07:19, Nicolin Chen via iommu wrote:
> When VCMDQs are assigned to a VINTF that is owned by a guest, not
> hypervisor (HYP_OWN bit is unset), only TLB invalidation commands
> are supported. This requires get_cmd() function to scan the input
> cmd before selecting cmdq between smmu->cmdq and vintf->vcmdq, so
> unsupported commands can still go through emulated smmu->cmdq.
>
> Also the guest shouldn't have HYP_OWN bit being set regardless of
> guest kernel driver writing it or not, i.e. the user space driver
> running in the host OS should wire this bit to zero when trapping
> a write access to this VINTF_CONFIG register from a guest kernel.
> So instead of using the existing regval, this patch reads out the
> register value explicitly to cache in vintf->cfg.
>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> ---
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 6 ++--
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 5 +--
> .../arm/arm-smmu-v3/nvidia-grace-cmdqv.c | 32 +++++++++++++++++--
> 3 files changed, 36 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index b1182dd825fd..73941ccc1a3e 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -337,10 +337,10 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
> return 0;
> }
>
> -static struct arm_smmu_cmdq *arm_smmu_get_cmdq(struct arm_smmu_device *smmu)
> +static struct arm_smmu_cmdq *arm_smmu_get_cmdq(struct arm_smmu_device *smmu, u64 *cmds, int n)
> {
> if (smmu->nvidia_grace_cmdqv)
> - return nvidia_grace_cmdqv_get_cmdq(smmu);
> + return nvidia_grace_cmdqv_get_cmdq(smmu, cmds, n);
>
> return &smmu->cmdq;
> }
> @@ -747,7 +747,7 @@ static int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
> u32 prod;
> unsigned long flags;
> bool owner;
> - struct arm_smmu_cmdq *cmdq = arm_smmu_get_cmdq(smmu);
> + struct arm_smmu_cmdq *cmdq = arm_smmu_get_cmdq(smmu, cmds, n);
> struct arm_smmu_ll_queue llq, head;
> int ret = 0;
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index 24f93444aeeb..085c775c2eea 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -832,7 +832,8 @@ struct nvidia_grace_cmdqv *
> nvidia_grace_cmdqv_acpi_probe(struct arm_smmu_device *smmu,
> struct acpi_iort_node *node);
> int nvidia_grace_cmdqv_device_reset(struct arm_smmu_device *smmu);
> -struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu);
> +struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu,
> + u64 *cmds, int n);
> #else /* CONFIG_NVIDIA_GRACE_CMDQV */
> static inline struct nvidia_grace_cmdqv *
> nvidia_grace_cmdqv_acpi_probe(struct arm_smmu_device *smmu,
> @@ -847,7 +848,7 @@ static inline int nvidia_grace_cmdqv_device_reset(struct arm_smmu_device *smmu)
> }
>
> static inline struct arm_smmu_cmdq *
> -nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
> +nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu, u64 *cmds, int n)
> {
> return NULL;
> }
> diff --git a/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c b/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
> index c0d7351f13e2..71f6bc684e64 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
> @@ -166,7 +166,8 @@ static int nvidia_grace_cmdqv_init_one_vcmdq(struct nvidia_grace_cmdqv *cmdqv,
> return arm_smmu_cmdq_init(cmdqv->smmu, cmdq);
> }
>
> -struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
> +struct arm_smmu_cmdq *
> +nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu, u64 *cmds, int n)
> {
> struct nvidia_grace_cmdqv *cmdqv = smmu->nvidia_grace_cmdqv;
> struct nvidia_grace_cmdqv_vintf *vintf0 = &cmdqv->vintf0;
> @@ -176,6 +177,24 @@ struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
>  	if (!FIELD_GET(VINTF_STATUS, vintf0->status))
>  		return &smmu->cmdq;
>
> +	/* Check for supported CMDs if VINTF is owned by guest (not hypervisor) */
> +	if (!FIELD_GET(VINTF_HYP_OWN, vintf0->cfg)) {
> +		u64 opcode = (n) ? FIELD_GET(CMDQ_0_OP, cmds[0]) : CMDQ_OP_CMD_SYNC;
I'm not sure there was ever a conscious design decision that batches
only ever contain one type of command - if something needs to start
depending on that behaviour then that dependency probably wants to be
clearly documented. Also, a sync on its own gets trapped to the main
cmdq but a sync on the end of a batch of TLBIs or ATCIs goes to the
VCMDQ, huh?
> +
> +		/* List all supported CMDs for vintf->cmdq pathway */
> +		switch (opcode) {
> +		case CMDQ_OP_TLBI_NH_ASID:
> +		case CMDQ_OP_TLBI_NH_VA:
> +		case CMDQ_OP_TLBI_S12_VMALL:
> +		case CMDQ_OP_TLBI_S2_IPA:
Fun! Can the guest invalidate any VMID it feels like, or is there some
additional magic on the host side that we're missing here?
> +		case CMDQ_OP_ATC_INV:
> +			break;
Ditto for StreamID here.
Robin.
> +		default:
> +			/* Unsupported CMDs go for smmu->cmdq pathway */
> +			return &smmu->cmdq;
> +		}
> +	}
> +
> /*
> * Select a vcmdq to use. Here we use a temporal solution to
> * balance out traffic on cmdq issuing: each cmdq has its own
> @@ -199,13 +218,22 @@ int nvidia_grace_cmdqv_device_reset(struct arm_smmu_device *smmu)
> vintf0->idx = 0;
> vintf0->base = cmdqv->base + NVIDIA_CMDQV_VINTF(0);
>
> + /*
> + * Note that HYP_OWN bit is wired to zero when running in guest kernel
> + * regardless of enabling it here, as !HYP_OWN cmdqs have a restricted
> + * set of supported commands, by following the HW design.
> + */
> regval = FIELD_PREP(VINTF_HYP_OWN, 1);
> writel(regval, vintf0->base + NVIDIA_VINTF_CONFIG);
>
> regval |= FIELD_PREP(VINTF_EN, 1);
> writel(regval, vintf0->base + NVIDIA_VINTF_CONFIG);
>
> - vintf0->cfg = regval;
> + /*
> + * As being mentioned above, HYP_OWN bit is wired to zero for a guest
> + * kernel, so read back regval from HW to ensure that reflects in cfg
> + */
> + vintf0->cfg = readl(vintf0->base + NVIDIA_VINTF_CONFIG);
>
> ret = readl_relaxed_poll_timeout(vintf0->base + NVIDIA_VINTF_STATUS,
> regval, regval == VINTF_ENABLED,
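
On the point above about batches containing more than one type of
command: a minimal sketch of checking every command in the list rather
than only cmds[0], assuming two 64-bit words per command with the opcode
in the low 8 bits of the first word. The toy_* names are hypothetical,
and the opcode values are believed to match the SMMUv3 encodings but
should be treated as illustrative. The n == 0 case mirrors the quoted
patch, i.e. a lone sync is sent to the main cmdq, which is the asymmetry
questioned above:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ENT_DWORDS 2		/* 64-bit words per command (assumed) */
#define OP_MASK 0xffULL		/* opcode field: low 8 bits of word 0 (assumed) */

enum {
	OP_CFGI_STE = 0x03,	/* example of an unsupported command */
	OP_TLBI_NH_ASID = 0x11,
	OP_TLBI_NH_VA = 0x12,
	OP_TLBI_S12_VMALL = 0x28,
	OP_TLBI_S2_IPA = 0x2a,
	OP_ATC_INV = 0x40,
};

/* Same list of opcodes as the switch statement in the quoted patch. */
static bool toy_op_supported(uint8_t op)
{
	switch (op) {
	case OP_TLBI_NH_ASID:
	case OP_TLBI_NH_VA:
	case OP_TLBI_S12_VMALL:
	case OP_TLBI_S2_IPA:
	case OP_ATC_INV:
		return true;
	default:
		return false;
	}
}

/* True only if every one of the n commands may use a guest-owned VCMDQ. */
static bool toy_batch_fits_vcmdq(const uint64_t *cmds, int n)
{
	int i;

	assert(n >= 0);
	if (n == 0)
		return false;	/* lone sync: main cmdq, as in the patch */

	for (i = 0; i < n; i++) {
		uint8_t op = (uint8_t)(cmds[i * ENT_DWORDS] & OP_MASK);

		if (!toy_op_supported(op))
			return false;
	}
	return true;
}
```

The scan is linear in the batch length, so whether it is worth paying on
every submission, versus documenting a one-type-per-batch rule, is a
separate question.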
^ permalink raw reply [flat|nested] 51+ messages in thread
> }
>
> -struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
> +struct arm_smmu_cmdq *
> +nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu, u64 *cmds, int n)
> {
> struct nvidia_grace_cmdqv *cmdqv = smmu->nvidia_grace_cmdqv;
> struct nvidia_grace_cmdqv_vintf *vintf0 = &cmdqv->vintf0;
> @@ -176,6 +177,24 @@ struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
> if (!FIELD_GET(VINTF_STATUS, vintf0->status))
> return &smmu->cmdq;
>
> + /* Check for supported CMDs if VINTF is owned by guest (not hypervisor) */
> + if (!FIELD_GET(VINTF_HYP_OWN, vintf0->cfg)) {
> + u64 opcode = (n) ? FIELD_GET(CMDQ_0_OP, cmds[0]) : CMDQ_OP_CMD_SYNC;
I'm not sure there was ever a conscious design decision that batches
only ever contain one type of command - if something needs to start
depending on that behaviour then that dependency probably wants to be
clearly documented. Also, a sync on its own gets trapped to the main
cmdq but a sync on the end of a batch of TLBIs or ATCIs goes to the
VCMDQ, huh?
> +
> + /* List all supported CMDs for vintf->cmdq pathway */
> + switch (opcode) {
> + case CMDQ_OP_TLBI_NH_ASID:
> + case CMDQ_OP_TLBI_NH_VA:
> + case CMDQ_OP_TLBI_S12_VMALL:
> + case CMDQ_OP_TLBI_S2_IPA:
Fun! Can the guest invalidate any VMID it feels like, or is there some
additional magic on the host side that we're missing here?
> + case CMDQ_OP_ATC_INV:
> + break;
Ditto for StreamID here.
Robin.
> + default:
> + /* Unsupported CMDs go for smmu->cmdq pathway */
> + return &smmu->cmdq;
> + }
> + }
> +
> /*
> * Select a vcmdq to use. Here we use a temporal solution to
> * balance out traffic on cmdq issuing: each cmdq has its own
> @@ -199,13 +218,22 @@ int nvidia_grace_cmdqv_device_reset(struct arm_smmu_device *smmu)
> vintf0->idx = 0;
> vintf0->base = cmdqv->base + NVIDIA_CMDQV_VINTF(0);
>
> + /*
> + * Note that HYP_OWN bit is wired to zero when running in guest kernel
> + * regardless of enabling it here, as !HYP_OWN cmdqs have a restricted
> + * set of supported commands, by following the HW design.
> + */
> regval = FIELD_PREP(VINTF_HYP_OWN, 1);
> writel(regval, vintf0->base + NVIDIA_VINTF_CONFIG);
>
> regval |= FIELD_PREP(VINTF_EN, 1);
> writel(regval, vintf0->base + NVIDIA_VINTF_CONFIG);
>
> - vintf0->cfg = regval;
> + /*
> + * As being mentioned above, HYP_OWN bit is wired to zero for a guest
> + * kernel, so read back regval from HW to ensure that reflects in cfg
> + */
> + vintf0->cfg = readl(vintf0->base + NVIDIA_VINTF_CONFIG);
>
> ret = readl_relaxed_poll_timeout(vintf0->base + NVIDIA_VINTF_STATUS,
> regval, regval == VINTF_ENABLED,
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH v3 5/5] iommu/nvidia-grace-cmdqv: Limit CMDs for guest owned VINTF
2021-12-22 12:32 ` Robin Murphy
(?)
@ 2021-12-22 22:52 ` Nicolin Chen via iommu
-1 siblings, 0 replies; 51+ messages in thread
From: Nicolin Chen @ 2021-12-22 22:52 UTC (permalink / raw)
To: Robin Murphy
Cc: joro, will, jean-philippe, linux-kernel, iommu, linux-tegra,
thierry.reding, jgg, linux-arm-kernel
On Wed, Dec 22, 2021 at 12:32:29PM +0000, Robin Murphy wrote:
> On 2021-11-19 07:19, Nicolin Chen via iommu wrote:
> > When VCMDQs are assigned to a VINTF that is owned by a guest, not
> > hypervisor (HYP_OWN bit is unset), only TLB invalidation commands
> > are supported. This requires get_cmd() function to scan the input
> > cmd before selecting cmdq between smmu->cmdq and vintf->vcmdq, so
> > unsupported commands can still go through emulated smmu->cmdq.
> >
> > Also the guest shouldn't have HYP_OWN bit being set regardless of
> > guest kernel driver writing it or not, i.e. the user space driver
> > running in the host OS should wire this bit to zero when trapping
> > a write access to this VINTF_CONFIG register from a guest kernel.
> > So instead of using the existing regval, this patch reads out the
> > register value explicitly to cache in vintf->cfg.
> >
> > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> > ---
> > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 6 ++--
> > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 5 +--
> > .../arm/arm-smmu-v3/nvidia-grace-cmdqv.c | 32 +++++++++++++++++--
> > 3 files changed, 36 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > index b1182dd825fd..73941ccc1a3e 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > @@ -337,10 +337,10 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
> > return 0;
> > }
> >
> > -static struct arm_smmu_cmdq *arm_smmu_get_cmdq(struct arm_smmu_device *smmu)
> > +static struct arm_smmu_cmdq *arm_smmu_get_cmdq(struct arm_smmu_device *smmu, u64 *cmds, int n)
> > {
> > if (smmu->nvidia_grace_cmdqv)
> > - return nvidia_grace_cmdqv_get_cmdq(smmu);
> > + return nvidia_grace_cmdqv_get_cmdq(smmu, cmds, n);
> >
> > return &smmu->cmdq;
> > }
> > @@ -747,7 +747,7 @@ static int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
> > u32 prod;
> > unsigned long flags;
> > bool owner;
> > - struct arm_smmu_cmdq *cmdq = arm_smmu_get_cmdq(smmu);
> > + struct arm_smmu_cmdq *cmdq = arm_smmu_get_cmdq(smmu, cmds, n);
> > struct arm_smmu_ll_queue llq, head;
> > int ret = 0;
> >
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> > index 24f93444aeeb..085c775c2eea 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> > @@ -832,7 +832,8 @@ struct nvidia_grace_cmdqv *
> > nvidia_grace_cmdqv_acpi_probe(struct arm_smmu_device *smmu,
> > struct acpi_iort_node *node);
> > int nvidia_grace_cmdqv_device_reset(struct arm_smmu_device *smmu);
> > -struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu);
> > +struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu,
> > + u64 *cmds, int n);
> > #else /* CONFIG_NVIDIA_GRACE_CMDQV */
> > static inline struct nvidia_grace_cmdqv *
> > nvidia_grace_cmdqv_acpi_probe(struct arm_smmu_device *smmu,
> > @@ -847,7 +848,7 @@ static inline int nvidia_grace_cmdqv_device_reset(struct arm_smmu_device *smmu)
> > }
> >
> > static inline struct arm_smmu_cmdq *
> > -nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
> > +nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu, u64 *cmds, int n)
> > {
> > return NULL;
> > }
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c b/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
> > index c0d7351f13e2..71f6bc684e64 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
> > +++ b/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
> > @@ -166,7 +166,8 @@ static int nvidia_grace_cmdqv_init_one_vcmdq(struct nvidia_grace_cmdqv *cmdqv,
> > return arm_smmu_cmdq_init(cmdqv->smmu, cmdq);
> > }
> >
> > -struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
> > +struct arm_smmu_cmdq *
> > +nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu, u64 *cmds, int n)
> > {
> > struct nvidia_grace_cmdqv *cmdqv = smmu->nvidia_grace_cmdqv;
> > struct nvidia_grace_cmdqv_vintf *vintf0 = &cmdqv->vintf0;
> > @@ -176,6 +177,24 @@ struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
> > if (!FIELD_GET(VINTF_STATUS, vintf0->status))
> > return &smmu->cmdq;
> >
> > + /* Check for supported CMDs if VINTF is owned by guest (not hypervisor) */
> > + if (!FIELD_GET(VINTF_HYP_OWN, vintf0->cfg)) {
> > + u64 opcode = (n) ? FIELD_GET(CMDQ_0_OP, cmds[0]) : CMDQ_OP_CMD_SYNC;
>
> I'm not sure there was ever a conscious design decision that batches
> only ever contain one type of command - if something needs to start
Hmm, I think that's a good catch -- this could be a potential bug.
The SMMUv3 driver currently seems to fill each batch in a loop with
a single type of command and submit it right away, so checking the
opcode of cmds[0] alone is sufficient for now, but that might not
hold in the future. We'd need either to apply certain constraints
on the types of cmds in a batch in the SMMUv3 driver when
smmu->nvidia_grace_cmdqv is set, or to fall back to the SMMUv3's
CMDQ pathway here if any one of the cmds is not supported.
> depending on that behaviour then that dependency probably wants to be
> clearly documented. Also, a sync on its own gets trapped to the main
> cmdq but a sync on the end of a batch of TLBIs or ATCIs goes to the
> VCMDQ, huh?
Yea...looks like another implicit assumption: that a batch always
has a SYNC at its end. I will see if any simple change can fix
these two. If you have suggestions for them, I would love to hear
them too.
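One possible shape for the "fall back if any cmd is unsupported" option
discussed above, as a standalone sketch rather than driver code. The
opcode values and the two-dword command size follow arm-smmu-v3.h; the
helper names vcmdq_supports_op() and vcmdq_supports_batch() are
hypothetical. It also assumes that a guest-owned VCMDQ accepts CMD_SYNC,
which the thread implies but does not state.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Command layout and opcode values as defined in arm-smmu-v3.h */
#define CMDQ_ENT_DWORDS		2
#define CMDQ_0_OP_MASK		0xffULL
#define CMDQ_OP_CFGI_STE	0x03
#define CMDQ_OP_TLBI_NH_ASID	0x11
#define CMDQ_OP_TLBI_NH_VA	0x12
#define CMDQ_OP_TLBI_S12_VMALL	0x28
#define CMDQ_OP_TLBI_S2_IPA	0x2a
#define CMDQ_OP_ATC_INV		0x40
#define CMDQ_OP_CMD_SYNC	0x46

/* Hypothetical helper: may this opcode go to a guest-owned VCMDQ? */
static bool vcmdq_supports_op(uint64_t op)
{
	switch (op) {
	case CMDQ_OP_TLBI_NH_ASID:
	case CMDQ_OP_TLBI_NH_VA:
	case CMDQ_OP_TLBI_S12_VMALL:
	case CMDQ_OP_TLBI_S2_IPA:
	case CMDQ_OP_ATC_INV:
	case CMDQ_OP_CMD_SYNC:	/* assumption: SYNC is accepted as well */
		return true;
	default:
		return false;
	}
}

/*
 * Hypothetical helper: scan every command in the batch instead of only
 * cmds[0]. An empty batch (n == 0, i.e. a lone SYNC) is treated the same
 * way as a SYNC that trails a batch, so both take the same queue.
 */
static bool vcmdq_supports_batch(const uint64_t *cmds, int n)
{
	for (int i = 0; i < n; i++) {
		uint64_t op = cmds[i * CMDQ_ENT_DWORDS] & CMDQ_0_OP_MASK;

		if (!vcmdq_supports_op(op))
			return false;
	}
	return true;
}
```

With something like this, get_cmdq() would return &smmu->cmdq whenever
vcmdq_supports_batch() is false. The scan is O(n) per submission, which
is the cost of not relying on batches being homogeneous.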
> > +
> > + /* List all supported CMDs for vintf->cmdq pathway */
> > + switch (opcode) {
> > + case CMDQ_OP_TLBI_NH_ASID:
> > + case CMDQ_OP_TLBI_NH_VA:
> > + case CMDQ_OP_TLBI_S12_VMALL:
> > + case CMDQ_OP_TLBI_S2_IPA:
>
> Fun! Can the guest invalidate any VMID it feels like, or is there some
> additional magic on the host side that we're missing here?
Yes, the HW handles that. VINTF has a register for SW to program a
VMID, so that the HW can replace the VMIDs in the cmds in the VCMDQs
of that VINTF with the programmed VMID. That was the reason why we
had a number of patches in v2 to route the VMID between guest and
host.
> > + case CMDQ_OP_ATC_INV:
> > + break;
> Ditto for StreamID here.
Yes. StreamID is handled similarly by the HW: each VINTF provides us
16 pairs of MATCH+REPLACE registers for programming the host's and
guest's StreamIDs. Our previous mdev implementation in v2 can serve
as reference code:
https://lore.kernel.org/kvm/20210831101549.237151fa.alex.williamson@redhat.com/T/#m903a1b44935d9e0376439a0c63e832eb464fbaee
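To illustrate the VMID and StreamID substitution described in the two
replies above, here is a small software model of what the HW is said to
do. It is only a sketch: struct vintf_model and both helpers are
hypothetical, the real register layout is not shown in this thread, and
it assumes that MATCH holds the guest StreamID, REPLACE holds the host
StreamID, and that a command with no matching pair is rejected. The field
positions (VMID in bits 47:32 of TLBI commands, StreamID in bits 63:32 of
ATC_INV) follow arm-smmu-v3.h.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Field positions in command dword 0, as in arm-smmu-v3.h */
#define CMDQ_TLBI_0_VMID_SHIFT	32
#define CMDQ_TLBI_0_VMID_MASK	(0xffffULL << CMDQ_TLBI_0_VMID_SHIFT)
#define CMDQ_ATC_0_SID_SHIFT	32
#define CMDQ_ATC_0_SID_MASK	(0xffffffffULL << CMDQ_ATC_0_SID_SHIFT)

#define VINTF_NUM_SID_PAIRS	16	/* "16 pairs of MATCH+REPLACE" */

/* Hypothetical software model of one VINTF's translation state */
struct sid_pair {
	bool valid;
	uint32_t match;		/* assumption: StreamID as seen by the guest */
	uint32_t replace;	/* assumption: StreamID used on the host */
};

struct vintf_model {
	uint16_t vmid;		/* VMID programmed by host SW */
	struct sid_pair sid[VINTF_NUM_SID_PAIRS];
};

/* Overwrite whatever VMID the guest put into a TLBI command */
static uint64_t vintf_replace_vmid(const struct vintf_model *v, uint64_t cmd0)
{
	return (cmd0 & ~CMDQ_TLBI_0_VMID_MASK) |
	       ((uint64_t)v->vmid << CMDQ_TLBI_0_VMID_SHIFT);
}

/*
 * Translate the guest StreamID of an ATC_INV command. Returns false and
 * leaves the command untouched if no valid pair matches.
 */
static bool vintf_replace_sid(const struct vintf_model *v, uint64_t *cmd0)
{
	uint32_t guest_sid = (uint32_t)(*cmd0 >> CMDQ_ATC_0_SID_SHIFT);

	for (size_t i = 0; i < VINTF_NUM_SID_PAIRS; i++) {
		if (v->sid[i].valid && v->sid[i].match == guest_sid) {
			*cmd0 = (*cmd0 & ~CMDQ_ATC_0_SID_MASK) |
				((uint64_t)v->sid[i].replace << CMDQ_ATC_0_SID_SHIFT);
			return true;
		}
	}
	return false;
}
```

The point of the model is that the guest never chooses the VMID or the
physical StreamID: whatever it writes into the command is overwritten or
looked up against values that only the host programs.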
Thanks
Nic
> > + default:
> > + /* Unsupported CMDs go for smmu->cmdq pathway */
> > + return &smmu->cmdq;
> > + }
> > + }
> > +
> > /*
> > * Select a vcmdq to use. Here we use a temporal solution to
> > * balance out traffic on cmdq issuing: each cmdq has its own
> > @@ -199,13 +218,22 @@ int nvidia_grace_cmdqv_device_reset(struct arm_smmu_device *smmu)
> > vintf0->idx = 0;
> > vintf0->base = cmdqv->base + NVIDIA_CMDQV_VINTF(0);
> >
> > + /*
> > + * Note that HYP_OWN bit is wired to zero when running in guest kernel
> > + * regardless of enabling it here, as !HYP_OWN cmdqs have a restricted
> > + * set of supported commands, by following the HW design.
> > + */
> > regval = FIELD_PREP(VINTF_HYP_OWN, 1);
> > writel(regval, vintf0->base + NVIDIA_VINTF_CONFIG);
> >
> > regval |= FIELD_PREP(VINTF_EN, 1);
> > writel(regval, vintf0->base + NVIDIA_VINTF_CONFIG);
> >
> > - vintf0->cfg = regval;
> > + /*
> > + * As being mentioned above, HYP_OWN bit is wired to zero for a guest
> > + * kernel, so read back regval from HW to ensure that reflects in cfg
> > + */
> > + vintf0->cfg = readl(vintf0->base + NVIDIA_VINTF_CONFIG);
> >
> > ret = readl_relaxed_poll_timeout(vintf0->base + NVIDIA_VINTF_STATUS,
> > regval, regval == VINTF_ENABLED,
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH v3 5/5] iommu/nvidia-grace-cmdqv: Limit CMDs for guest owned VINTF
@ 2021-12-22 22:52 ` Nicolin Chen via iommu
0 siblings, 0 replies; 51+ messages in thread
From: Nicolin Chen @ 2021-12-22 22:52 UTC (permalink / raw)
To: Robin Murphy
Cc: joro, will, jean-philippe, linux-kernel, iommu, linux-tegra,
thierry.reding, jgg, linux-arm-kernel
On Wed, Dec 22, 2021 at 12:32:29PM +0000, Robin Murphy wrote:
> External email: Use caution opening links or attachments
>
>
> On 2021-11-19 07:19, Nicolin Chen via iommu wrote:
> > When VCMDQs are assigned to a VINTF that is owned by a guest, not
> > hypervisor (HYP_OWN bit is unset), only TLB invalidation commands
> > are supported. This requires get_cmd() function to scan the input
> > cmd before selecting cmdq between smmu->cmdq and vintf->vcmdq, so
> > unsupported commands can still go through emulated smmu->cmdq.
> >
> > Also the guest shouldn't have HYP_OWN bit being set regardless of
> > guest kernel driver writing it or not, i.e. the user space driver
> > running in the host OS should wire this bit to zero when trapping
> > a write access to this VINTF_CONFIG register from a guest kernel.
> > So instead of using the existing regval, this patch reads out the
> > register value explicitly to cache in vintf->cfg.
> >
> > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> > ---
> > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 6 ++--
> > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 5 +--
> > .../arm/arm-smmu-v3/nvidia-grace-cmdqv.c | 32 +++++++++++++++++--
> > 3 files changed, 36 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > index b1182dd825fd..73941ccc1a3e 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > @@ -337,10 +337,10 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
> > return 0;
> > }
> >
> > -static struct arm_smmu_cmdq *arm_smmu_get_cmdq(struct arm_smmu_device *smmu)
> > +static struct arm_smmu_cmdq *arm_smmu_get_cmdq(struct arm_smmu_device *smmu, u64 *cmds, int n)
> > {
> > if (smmu->nvidia_grace_cmdqv)
> > - return nvidia_grace_cmdqv_get_cmdq(smmu);
> > + return nvidia_grace_cmdqv_get_cmdq(smmu, cmds, n);
> >
> > return &smmu->cmdq;
> > }
> > @@ -747,7 +747,7 @@ static int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
> > u32 prod;
> > unsigned long flags;
> > bool owner;
> > - struct arm_smmu_cmdq *cmdq = arm_smmu_get_cmdq(smmu);
> > + struct arm_smmu_cmdq *cmdq = arm_smmu_get_cmdq(smmu, cmds, n);
> > struct arm_smmu_ll_queue llq, head;
> > int ret = 0;
> >
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> > index 24f93444aeeb..085c775c2eea 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> > @@ -832,7 +832,8 @@ struct nvidia_grace_cmdqv *
> > nvidia_grace_cmdqv_acpi_probe(struct arm_smmu_device *smmu,
> > struct acpi_iort_node *node);
> > int nvidia_grace_cmdqv_device_reset(struct arm_smmu_device *smmu);
> > -struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu);
> > +struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu,
> > + u64 *cmds, int n);
> > #else /* CONFIG_NVIDIA_GRACE_CMDQV */
> > static inline struct nvidia_grace_cmdqv *
> > nvidia_grace_cmdqv_acpi_probe(struct arm_smmu_device *smmu,
> > @@ -847,7 +848,7 @@ static inline int nvidia_grace_cmdqv_device_reset(struct arm_smmu_device *smmu)
> > }
> >
> > static inline struct arm_smmu_cmdq *
> > -nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
> > +nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu, u64 *cmds, int n)
> > {
> > return NULL;
> > }
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c b/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
> > index c0d7351f13e2..71f6bc684e64 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
> > +++ b/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
> > @@ -166,7 +166,8 @@ static int nvidia_grace_cmdqv_init_one_vcmdq(struct nvidia_grace_cmdqv *cmdqv,
> > return arm_smmu_cmdq_init(cmdqv->smmu, cmdq);
> > }
> >
> > -struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
> > +struct arm_smmu_cmdq *
> > +nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu, u64 *cmds, int n)
> > {
> > struct nvidia_grace_cmdqv *cmdqv = smmu->nvidia_grace_cmdqv;
> > struct nvidia_grace_cmdqv_vintf *vintf0 = &cmdqv->vintf0;
> > @@ -176,6 +177,24 @@ struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
> > if (!FIELD_GET(VINTF_STATUS, vintf0->status))
> > return &smmu->cmdq;
> >
> > + /* Check for supported CMDs if VINTF is owned by guest (not hypervisor) */
> > + if (!FIELD_GET(VINTF_HYP_OWN, vintf0->cfg)) {
> > + u64 opcode = (n) ? FIELD_GET(CMDQ_0_OP, cmds[0]) : CMDQ_OP_CMD_SYNC;
>
> I'm not sure there was ever a conscious design decision that batches
> only ever contain one type of command - if something needs to start
Hmm, I think that's a good catch -- it could be a potential bug
here. The SMMUv3 driver currently seems to fill each batch in a
loop with a single type of cmd and submit it right away, so
checking the opcode of cmds[0] alone is sufficient for now, but
that might not hold in the future. We'd need to either constrain
the types of cmds in a batch in the SMMUv3 driver when
smmu->nvidia_grace_cmdqv is set, or fall back to the SMMUv3's
CMDQ pathway here if any of the cmds is not supported.
> depending on that behaviour then that dependency probably wants to be
> clearly documented. Also, a sync on its own gets trapped to the main
> cmdq but a sync on the end of a batch of TLBIs or ATCIs goes to the
> VCMDQ, huh?
Yeah... this looks like another implicit assumption: that cmds
must have a SYNC at the end of the batch. I will see if any simple
change can fix these two. If you have suggestions for them, I
would love to hear them too.
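One way to read the fallback option above is to scan the whole batch rather than only cmds[0]. What follows is a minimal user-space sketch, not driver code. The opcode values and the two-dword command layout are stand-ins that are assumed to mirror arm-smmu-v3.h, and batch_is_vcmdq_safe() and opcode_is_vcmdq_safe() are hypothetical helper names. Whether CMD_SYNC counts as supported on a guest-owned VINTF is the open question in this thread, so the sketch takes it as a parameter instead of deciding it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in definitions, assumed to mirror arm-smmu-v3.h (not part of this patch). */
#define CMDQ_ENT_DWORDS         2
#define CMDQ_0_OP_MASK          0xffULL
#define CMDQ_OP_TLBI_NH_ASID    0x11
#define CMDQ_OP_TLBI_NH_VA      0x12
#define CMDQ_OP_TLBI_S12_VMALL  0x28
#define CMDQ_OP_TLBI_S2_IPA     0x2a
#define CMDQ_OP_ATC_INV         0x40
#define CMDQ_OP_CMD_SYNC        0x46

/* Hypothetical helper: is this opcode in the list the patch treats as supported? */
bool opcode_is_vcmdq_safe(uint64_t opcode, bool allow_sync)
{
	switch (opcode) {
	case CMDQ_OP_TLBI_NH_ASID:
	case CMDQ_OP_TLBI_NH_VA:
	case CMDQ_OP_TLBI_S12_VMALL:
	case CMDQ_OP_TLBI_S2_IPA:
	case CMDQ_OP_ATC_INV:
		return true;
	case CMDQ_OP_CMD_SYNC:
		/* Open question in the review, so left to the caller. */
		return allow_sync;
	default:
		return false;
	}
}

/* Hypothetical helper: true only if every command in the batch is supported. */
bool batch_is_vcmdq_safe(const uint64_t *cmds, int n, bool allow_sync)
{
	/* In the patch, n == 0 stands for a lone CMD_SYNC. */
	if (n == 0)
		return allow_sync;

	for (int i = 0; i < n; i++) {
		/* The opcode lives in the low byte of each command's first dword. */
		uint64_t opcode = cmds[i * CMDQ_ENT_DWORDS] & CMDQ_0_OP_MASK;

		if (!opcode_is_vcmdq_safe(opcode, allow_sync))
			return false;
	}
	return true;
}
```

With a check of this shape, a mixed batch would go to smmu->cmdq as a whole instead of being routed by its first entry, at the cost of an O(n) scan per submission.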
> > +
> > + /* List all supported CMDs for vintf->cmdq pathway */
> > + switch (opcode) {
> > + case CMDQ_OP_TLBI_NH_ASID:
> > + case CMDQ_OP_TLBI_NH_VA:
> > + case CMDQ_OP_TLBI_S12_VMALL:
> > + case CMDQ_OP_TLBI_S2_IPA:
>
> Fun! Can the guest invalidate any VMID it feels like, or is there some
> additional magic on the host side that we're missing here?
Yes. VINTF has a register for SW to program a VMID so that the HW
can replace the VMIDs in the cmds in the VCMDQs of that VINTF with
the programmed VMID. That was the reason why we had a number of
patches in v2 to route the VMID between guest and host.
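The VMID substitution described above can be modelled in a few lines. This is a behavioural sketch of what the HW is described as doing, not driver or register-level code. The VMID field position (bits 47:32 of the first command dword) follows the SMMUv3 TLBI command layout, while struct vintf_model and vintf_rewrite_vmid() are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* VMID field of a TLBI command's first dword: bits [47:32]. */
#define CMDQ_TLBI_0_VMID_SHIFT  32
#define CMDQ_TLBI_0_VMID_MASK   (0xffffULL << CMDQ_TLBI_0_VMID_SHIFT)

/* Hypothetical model of the per-VINTF state programmed by host SW. */
struct vintf_model {
	uint16_t vmid;	/* VMID the host assigned to this VINTF */
};

/*
 * Hypothetical model of the HW rewrite: whatever VMID the guest put in
 * the command is discarded and replaced by the VINTF's programmed VMID.
 */
uint64_t vintf_rewrite_vmid(const struct vintf_model *vintf, uint64_t cmd0)
{
	cmd0 &= ~CMDQ_TLBI_0_VMID_MASK;
	cmd0 |= (uint64_t)vintf->vmid << CMDQ_TLBI_0_VMID_SHIFT;
	return cmd0;
}
```

Under this model the guest cannot target another VM's VMID, because the field it writes never reaches the SMMU unchanged.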
> > + case CMDQ_OP_ATC_INV:
> > + break;
> Ditto for StreamID here.
Yes. The HW handles StreamIDs similarly: each VINTF provides
16 pairs of MATCH+REPLACE registers for programming the host's
and guest's StreamIDs. Our previous mdev implementation in v2 is
a good reference:
https://lore.kernel.org/kvm/20210831101549.237151fa.alex.williamson@redhat.com/T/#m903a1b44935d9e0376439a0c63e832eb464fbaee
Thanks
Nic
> > + default:
> > + /* Unsupported CMDs go for smmu->cmdq pathway */
> > + return &smmu->cmdq;
> > + }
> > + }
> > +
> > /*
> > * Select a vcmdq to use. Here we use a temporal solution to
> > * balance out traffic on cmdq issuing: each cmdq has its own
> > @@ -199,13 +218,22 @@ int nvidia_grace_cmdqv_device_reset(struct arm_smmu_device *smmu)
> > vintf0->idx = 0;
> > vintf0->base = cmdqv->base + NVIDIA_CMDQV_VINTF(0);
> >
> > + /*
> > + * Note that HYP_OWN bit is wired to zero when running in guest kernel
> > + * regardless of enabling it here, as !HYP_OWN cmdqs have a restricted
> > + * set of supported commands, by following the HW design.
> > + */
> > regval = FIELD_PREP(VINTF_HYP_OWN, 1);
> > writel(regval, vintf0->base + NVIDIA_VINTF_CONFIG);
> >
> > regval |= FIELD_PREP(VINTF_EN, 1);
> > writel(regval, vintf0->base + NVIDIA_VINTF_CONFIG);
> >
> > - vintf0->cfg = regval;
> > + /*
> > + * As being mentioned above, HYP_OWN bit is wired to zero for a guest
> > + * kernel, so read back regval from HW to ensure that reflects in cfg
> > + */
> > + vintf0->cfg = readl(vintf0->base + NVIDIA_VINTF_CONFIG);
> >
> > ret = readl_relaxed_poll_timeout(vintf0->base + NVIDIA_VINTF_STATUS,
> > regval, regval == VINTF_ENABLED,
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH v3 5/5] iommu/nvidia-grace-cmdqv: Limit CMDs for guest owned VINTF
2021-12-22 22:52 ` Nicolin Chen via iommu
(?)
@ 2021-12-23 11:14 ` Robin Murphy
-1 siblings, 0 replies; 51+ messages in thread
From: Robin Murphy @ 2021-12-23 11:14 UTC (permalink / raw)
To: Nicolin Chen
Cc: joro, will, jean-philippe, linux-kernel, iommu, linux-tegra,
thierry.reding, jgg, linux-arm-kernel
On 2021-12-22 22:52, Nicolin Chen wrote:
> On Wed, Dec 22, 2021 at 12:32:29PM +0000, Robin Murphy wrote:
>> External email: Use caution opening links or attachments
>>
>>
>> On 2021-11-19 07:19, Nicolin Chen via iommu wrote:
>>> When VCMDQs are assigned to a VINTF that is owned by a guest, not
>>> hypervisor (HYP_OWN bit is unset), only TLB invalidation commands
>>> are supported. This requires get_cmd() function to scan the input
>>> cmd before selecting cmdq between smmu->cmdq and vintf->vcmdq, so
>>> unsupported commands can still go through emulated smmu->cmdq.
>>>
>>> Also the guest shouldn't have HYP_OWN bit being set regardless of
>>> guest kernel driver writing it or not, i.e. the user space driver
>>> running in the host OS should wire this bit to zero when trapping
>>> a write access to this VINTF_CONFIG register from a guest kernel.
>>> So instead of using the existing regval, this patch reads out the
>>> register value explicitly to cache in vintf->cfg.
>>>
>>> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
>>> ---
>>> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 6 ++--
>>> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 5 +--
>>> .../arm/arm-smmu-v3/nvidia-grace-cmdqv.c | 32 +++++++++++++++++--
>>> 3 files changed, 36 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>> index b1182dd825fd..73941ccc1a3e 100644
>>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>> @@ -337,10 +337,10 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
>>> return 0;
>>> }
>>>
>>> -static struct arm_smmu_cmdq *arm_smmu_get_cmdq(struct arm_smmu_device *smmu)
>>> +static struct arm_smmu_cmdq *arm_smmu_get_cmdq(struct arm_smmu_device *smmu, u64 *cmds, int n)
>>> {
>>> if (smmu->nvidia_grace_cmdqv)
>>> - return nvidia_grace_cmdqv_get_cmdq(smmu);
>>> + return nvidia_grace_cmdqv_get_cmdq(smmu, cmds, n);
>>>
>>> return &smmu->cmdq;
>>> }
>>> @@ -747,7 +747,7 @@ static int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
>>> u32 prod;
>>> unsigned long flags;
>>> bool owner;
>>> - struct arm_smmu_cmdq *cmdq = arm_smmu_get_cmdq(smmu);
>>> + struct arm_smmu_cmdq *cmdq = arm_smmu_get_cmdq(smmu, cmds, n);
>>> struct arm_smmu_ll_queue llq, head;
>>> int ret = 0;
>>>
>>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
>>> index 24f93444aeeb..085c775c2eea 100644
>>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
>>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
>>> @@ -832,7 +832,8 @@ struct nvidia_grace_cmdqv *
>>> nvidia_grace_cmdqv_acpi_probe(struct arm_smmu_device *smmu,
>>> struct acpi_iort_node *node);
>>> int nvidia_grace_cmdqv_device_reset(struct arm_smmu_device *smmu);
>>> -struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu);
>>> +struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu,
>>> + u64 *cmds, int n);
>>> #else /* CONFIG_NVIDIA_GRACE_CMDQV */
>>> static inline struct nvidia_grace_cmdqv *
>>> nvidia_grace_cmdqv_acpi_probe(struct arm_smmu_device *smmu,
>>> @@ -847,7 +848,7 @@ static inline int nvidia_grace_cmdqv_device_reset(struct arm_smmu_device *smmu)
>>> }
>>>
>>> static inline struct arm_smmu_cmdq *
>>> -nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
>>> +nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu, u64 *cmds, int n)
>>> {
>>> return NULL;
>>> }
>>> diff --git a/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c b/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
>>> index c0d7351f13e2..71f6bc684e64 100644
>>> --- a/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
>>> +++ b/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
>>> @@ -166,7 +166,8 @@ static int nvidia_grace_cmdqv_init_one_vcmdq(struct nvidia_grace_cmdqv *cmdqv,
>>> return arm_smmu_cmdq_init(cmdqv->smmu, cmdq);
>>> }
>>>
>>> -struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
>>> +struct arm_smmu_cmdq *
>>> +nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu, u64 *cmds, int n)
>>> {
>>> struct nvidia_grace_cmdqv *cmdqv = smmu->nvidia_grace_cmdqv;
>>> struct nvidia_grace_cmdqv_vintf *vintf0 = &cmdqv->vintf0;
>>> @@ -176,6 +177,24 @@ struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
>>> if (!FIELD_GET(VINTF_STATUS, vintf0->status))
>>> return &smmu->cmdq;
>>>
>>> + /* Check for supported CMDs if VINTF is owned by guest (not hypervisor) */
>>> + if (!FIELD_GET(VINTF_HYP_OWN, vintf0->cfg)) {
>>> + u64 opcode = (n) ? FIELD_GET(CMDQ_0_OP, cmds[0]) : CMDQ_OP_CMD_SYNC;
>>
>> I'm not sure there was ever a conscious design decision that batches
>> only ever contain one type of command - if something needs to start
>
> Hmm, I think that's a good catch -- it could be a potential bug
> here. The SMMUv3 driver currently seems to fill each batch in a
> loop with a single type of cmd and submit it right away, so
> checking the opcode of cmds[0] alone is sufficient for now, but
> that might not hold in the future. We'd need to either constrain
> the types of cmds in a batch in the SMMUv3 driver when
> smmu->nvidia_grace_cmdqv is set, or fall back to the SMMUv3's
> CMDQ pathway here if any of the cmds is not supported.
>
>> depending on that behaviour then that dependency probably wants to be
>> clearly documented. Also, a sync on its own gets trapped to the main
>> cmdq but a sync on the end of a batch of TLBIs or ATCIs goes to the
>> VCMDQ, huh?
>
> Yeah... this looks like another implicit assumption: that cmds
> must have a SYNC at the end of the batch. I will see if any simple
> change can fix these two. If you have suggestions for them, I
> would love to hear them too.
Can you explain the current logic here? It's not entirely clear to me
whether the VCMDQ is actually meant to support CMD_SYNC or not.
>>> +
>>> + /* List all supported CMDs for vintf->cmdq pathway */
>>> + switch (opcode) {
>>> + case CMDQ_OP_TLBI_NH_ASID:
>>> + case CMDQ_OP_TLBI_NH_VA:
>>> + case CMDQ_OP_TLBI_S12_VMALL:
>>> + case CMDQ_OP_TLBI_S2_IPA:
>>
>> Fun! Can the guest invalidate any VMID it feels like, or is there some
>> additional magic on the host side that we're missing here?
>
> Yes. VINTF has a register for SW to program a VMID so that the HW
> can replace the VMIDs in the cmds in the VCMDQs of that VINTF with
> the programmed VMID. That was the reason why we had a number of
> patches in v2 to route the VMID between guest and host.
>
>>> + case CMDQ_OP_ATC_INV:
>>> + break;
>> Ditto for StreamID here.
>
> Yes. The HW handles StreamIDs similarly: each VINTF provides
> 16 pairs of MATCH+REPLACE registers for programming the host's
> and guest's StreamIDs. Our previous mdev implementation in v2 is
> a good reference:
> https://lore.kernel.org/kvm/20210831101549.237151fa.alex.williamson@redhat.com/T/#m903a1b44935d9e0376439a0c63e832eb464fbaee
Ah, sorry, I haven't had the bandwidth to dig back through all the
previous threads. Thanks for clarifying - I'm still not sure why any
notion of stage 2 would be exposed to guests at all, but at least it
sounds like there's no functional concern here, other than constraining
the number of devices which can be assigned to a single VM, but I think
that falls into the bucket of information that userspace VMMs will have
to learn about this kind of direct IOMMU interface assignment anyway
(most importantly, the relationship of assigned devices to vIOMMUs
suddenly has to start reflecting the underlying physical topology).
Out of interest, would ATC_INV with an unmatched StreamID raise an error
or just be ignored? Particularly if the host gets a chance to handle a
GError and decide whether CMDQ_CONS.ERR is reported back to the guest or
not, there's scope to do some interesting things for functionality and
robustness.
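To make the question concrete, here is a small model of the MATCH+REPLACE lookup described earlier in the thread. It is only a sketch under assumptions: it treats MATCH as the guest-visible StreamID and REPLACE as the host one, adds a hypothetical valid flag per pair, and leaves the miss case to the caller, since what the HW does with an unmatched StreamID is exactly what is being asked. All struct and function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* "16 pairs" per the description in this thread. */
#define VINTF_NUM_SID_PAIRS 16

/* Hypothetical model of one MATCH+REPLACE register pair. */
struct sid_pair {
	bool valid;		/* assumed enable flag, not from the source */
	uint32_t match;		/* assumed: StreamID as seen by the guest */
	uint32_t replace;	/* assumed: StreamID used on the host */
};

struct vintf_sid_model {
	struct sid_pair pairs[VINTF_NUM_SID_PAIRS];
};

/*
 * Returns true and stores the host StreamID on a hit. Returns false on a
 * miss and leaves *host_sid untouched; raising an error or silently
 * dropping the command would then be a policy decision for the caller.
 */
bool vintf_translate_sid(const struct vintf_sid_model *vintf,
			 uint32_t guest_sid, uint32_t *host_sid)
{
	for (int i = 0; i < VINTF_NUM_SID_PAIRS; i++) {
		const struct sid_pair *p = &vintf->pairs[i];

		if (p->valid && p->match == guest_sid) {
			*host_sid = p->replace;
			return true;
		}
	}
	return false;
}
```

The fixed table size is also where the limit on devices per VM comes from: once all 16 pairs are in use, no further StreamID can be mapped for that VINTF.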
Cheers,
Robin.
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH v3 5/5] iommu/nvidia-grace-cmdqv: Limit CMDs for guest owned VINTF
2021-12-23 11:14 ` Robin Murphy
(?)
@ 2021-12-24 8:02 ` Nicolin Chen via iommu
-1 siblings, 0 replies; 51+ messages in thread
From: Nicolin Chen @ 2021-12-24 8:02 UTC (permalink / raw)
To: Robin Murphy
Cc: joro, will, jean-philippe, linux-kernel, iommu, linux-tegra,
thierry.reding, jgg, linux-arm-kernel
On Thu, Dec 23, 2021 at 11:14:17AM +0000, Robin Murphy wrote:
> On 2021-12-22 22:52, Nicolin Chen wrote:
> > On Wed, Dec 22, 2021 at 12:32:29PM +0000, Robin Murphy wrote:
> > > On 2021-11-19 07:19, Nicolin Chen via iommu wrote:
> > > > When VCMDQs are assigned to a VINTF that is owned by a guest, not
> > > > hypervisor (HYP_OWN bit is unset), only TLB invalidation commands
> > > > are supported. This requires get_cmd() function to scan the input
> > > > cmd before selecting cmdq between smmu->cmdq and vintf->vcmdq, so
> > > > unsupported commands can still go through emulated smmu->cmdq.
> > > >
> > > > Also the guest shouldn't have HYP_OWN bit being set regardless of
> > > > guest kernel driver writing it or not, i.e. the user space driver
> > > > running in the host OS should wire this bit to zero when trapping
> > > > a write access to this VINTF_CONFIG register from a guest kernel.
> > > > So instead of using the existing regval, this patch reads out the
> > > > register value explicitly to cache in vintf->cfg.
> > > >
> > > > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> > > > ---
> > > > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 6 ++--
> > > > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 5 +--
> > > > .../arm/arm-smmu-v3/nvidia-grace-cmdqv.c | 32 +++++++++++++++++--
> > > > 3 files changed, 36 insertions(+), 7 deletions(-)
> > > >
> > > > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > > > index b1182dd825fd..73941ccc1a3e 100644
> > > > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > > > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > > > @@ -337,10 +337,10 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
> > > > return 0;
> > > > }
> > > >
> > > > -static struct arm_smmu_cmdq *arm_smmu_get_cmdq(struct arm_smmu_device *smmu)
> > > > +static struct arm_smmu_cmdq *arm_smmu_get_cmdq(struct arm_smmu_device *smmu, u64 *cmds, int n)
> > > > {
> > > > if (smmu->nvidia_grace_cmdqv)
> > > > - return nvidia_grace_cmdqv_get_cmdq(smmu);
> > > > + return nvidia_grace_cmdqv_get_cmdq(smmu, cmds, n);
> > > >
> > > > return &smmu->cmdq;
> > > > }
> > > > @@ -747,7 +747,7 @@ static int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
> > > > u32 prod;
> > > > unsigned long flags;
> > > > bool owner;
> > > > - struct arm_smmu_cmdq *cmdq = arm_smmu_get_cmdq(smmu);
> > > > + struct arm_smmu_cmdq *cmdq = arm_smmu_get_cmdq(smmu, cmds, n);
> > > > struct arm_smmu_ll_queue llq, head;
> > > > int ret = 0;
> > > >
> > > > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> > > > index 24f93444aeeb..085c775c2eea 100644
> > > > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> > > > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> > > > @@ -832,7 +832,8 @@ struct nvidia_grace_cmdqv *
> > > > nvidia_grace_cmdqv_acpi_probe(struct arm_smmu_device *smmu,
> > > > struct acpi_iort_node *node);
> > > > int nvidia_grace_cmdqv_device_reset(struct arm_smmu_device *smmu);
> > > > -struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu);
> > > > +struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu,
> > > > + u64 *cmds, int n);
> > > > #else /* CONFIG_NVIDIA_GRACE_CMDQV */
> > > > static inline struct nvidia_grace_cmdqv *
> > > > nvidia_grace_cmdqv_acpi_probe(struct arm_smmu_device *smmu,
> > > > @@ -847,7 +848,7 @@ static inline int nvidia_grace_cmdqv_device_reset(struct arm_smmu_device *smmu)
> > > > }
> > > >
> > > > static inline struct arm_smmu_cmdq *
> > > > -nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
> > > > +nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu, u64 *cmds, int n)
> > > > {
> > > > return NULL;
> > > > }
> > > > diff --git a/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c b/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
> > > > index c0d7351f13e2..71f6bc684e64 100644
> > > > --- a/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
> > > > +++ b/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
> > > > @@ -166,7 +166,8 @@ static int nvidia_grace_cmdqv_init_one_vcmdq(struct nvidia_grace_cmdqv *cmdqv,
> > > > return arm_smmu_cmdq_init(cmdqv->smmu, cmdq);
> > > > }
> > > >
> > > > -struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
> > > > +struct arm_smmu_cmdq *
> > > > +nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu, u64 *cmds, int n)
> > > > {
> > > > struct nvidia_grace_cmdqv *cmdqv = smmu->nvidia_grace_cmdqv;
> > > > struct nvidia_grace_cmdqv_vintf *vintf0 = &cmdqv->vintf0;
> > > > @@ -176,6 +177,24 @@ struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
> > > > if (!FIELD_GET(VINTF_STATUS, vintf0->status))
> > > > return &smmu->cmdq;
> > > >
> > > > + /* Check for supported CMDs if VINTF is owned by guest (not hypervisor) */
> > > > + if (!FIELD_GET(VINTF_HYP_OWN, vintf0->cfg)) {
> > > > + u64 opcode = (n) ? FIELD_GET(CMDQ_0_OP, cmds[0]) : CMDQ_OP_CMD_SYNC;
> > >
> > > I'm not sure there was ever a conscious design decision that batches
> > > only ever contain one type of command - if something needs to start
> >
> > Hmm, I think that's a good catch -- as it could be a potential
> > bug here. Though the SMMUv3 driver currently seems to use loop
> > by adding one type of cmds to any batch and submitting it right
> > away so checking opcode of cmds[0] alone seems to be sufficient
> > at this moment, yet it might not be so in the future. We'd need
> > to apply certain constraints on the type of cmds in the batch in
> > SMMUv3 driver upon smmu->nvidia_grace_cmdqv, or fall back to the
> > SMMUv3's CMDQ pathway here if one of cmds is not supported.
> >
> > > depending on that behaviour then that dependency probably wants to be
> > > clearly documented. Also, a sync on its own gets trapped to the main
> > > cmdq but a sync on the end of a batch of TLBIs or ATCIs goes to the
> > > VCMDQ, huh?
> >
> > Yea...looks like an implication again where cmds must have SYNC
> > at the end of the batch. I will see if any simple change can be
> > done to fix these two. If you have suggestions for them, I would
> > love to hear too.
>
> Can you explain the current logic here? It's not entirely clear to me
> whether the VCMDQ is actually meant to support CMD_SYNC or not.
Yes. It's designed to take CMD_SYNC in the same queue too. It
also has features such as a HW-inserted SYNC when the scheduler
moves away from the current queue or when the number of cmds in
the vcmdq reaches a MAX-BATCH-SIZE setting (in a config register),
but it'd be safer for software to ensure that a CMD_SYNC is
inserted at the end of the batch.
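For illustration only (not part of the posted patch): one way to
drop the dependency on cmds[0] might be to scan every command in
the batch and fall back to smmu->cmdq if any opcode is unsupported.
The sketch below is a standalone userspace model under stated
assumptions: the helper names vcmdq_opcode_supported() and
vcmdq_batch_supported() are hypothetical, the opcode values are
believed to mirror arm-smmu-v3.h but should be checked against the
tree, and CMD_SYNC is treated as supported, as described above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror arm-smmu-v3.h; verify against the actual tree. */
#define CMDQ_ENT_DWORDS		2
#define CMDQ_0_OP_MASK		0xffULL

#define CMDQ_OP_CFGI_STE	0x03	/* example of an unsupported opcode */
#define CMDQ_OP_TLBI_NH_ASID	0x11
#define CMDQ_OP_TLBI_NH_VA	0x12
#define CMDQ_OP_TLBI_S12_VMALL	0x28
#define CMDQ_OP_TLBI_S2_IPA	0x2a
#define CMDQ_OP_ATC_INV		0x40
#define CMDQ_OP_CMD_SYNC	0x46

/* Hypothetical helper: opcodes a guest-owned VINTF is assumed to accept. */
static bool vcmdq_opcode_supported(uint8_t op)
{
	switch (op) {
	case CMDQ_OP_TLBI_NH_ASID:
	case CMDQ_OP_TLBI_NH_VA:
	case CMDQ_OP_TLBI_S12_VMALL:
	case CMDQ_OP_TLBI_S2_IPA:
	case CMDQ_OP_ATC_INV:
	case CMDQ_OP_CMD_SYNC:
		return true;
	default:
		return false;
	}
}

/*
 * Hypothetical helper: true only if all n commands may go to the VCMDQ.
 * Each command occupies CMDQ_ENT_DWORDS u64s; the opcode is in dword 0.
 */
static bool vcmdq_batch_supported(const uint64_t *cmds, int n)
{
	for (int i = 0; i < n; i++) {
		uint64_t dw0 = cmds[(size_t)i * CMDQ_ENT_DWORDS];

		if (!vcmdq_opcode_supported((uint8_t)(dw0 & CMDQ_0_OP_MASK)))
			return false;
	}
	return true;
}
```

With this shape an empty batch (a lone sync, n == 0) would also be
routed to the VCMDQ, whereas the posted patch sends it to
smmu->cmdq. Which behaviour is preferable is left open here.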
> > > > +
> > > > + /* List all supported CMDs for vintf->cmdq pathway */
> > > > + switch (opcode) {
> > > > + case CMDQ_OP_TLBI_NH_ASID:
> > > > + case CMDQ_OP_TLBI_NH_VA:
> > > > + case CMDQ_OP_TLBI_S12_VMALL:
> > > > + case CMDQ_OP_TLBI_S2_IPA:
> > >
> > > Fun! Can the guest invalidate any VMID it feels like, or is there some
> > > additional magic on the host side that we're missing here?
> >
> > Yes. VINTF has a register for SW to program VMID so that the HW
> > can replace VMIDs in the cmds in the VCMDQs of that VINTF with
> > the programmed VMID. That was the reason why we had numbers of
> > patches in v2 to route the VMID between guest and host.
> >
> > > > + case CMDQ_OP_ATC_INV:
> > > > + break;
> > > Ditto for StreamID here.
> >
> > Yes. StreamID works similarly by the HW: each VINTF provides us
> > 16 pairs of MATCH+REPLACE registers to program host and guest's
> > StreamIDs. Our previous mdev implementation in v2 can be a good
> > reference code:
> > https://lore.kernel.org/kvm/20210831101549.237151fa.alex.williamson@redhat.com/T/#m903a1b44935d9e0376439a0c63e832eb464fbaee
>
> Ah, sorry, I haven't had the bandwidth to dig back through all the
> previous threads. Thanks for clarifying - I'm still not sure why any
> notion of stage 2 would be exposed to guests at all, but at least it
Do you mean, by "notion of stage 2", Host Stream IDs? The guest
wouldn't get those, I think. They'll be trapped in the hypervisor
-- i.e. the user driver (a QEMU CMDQV device model, for example).
> sounds like there's no functional concern here, other than constraining
> the number of devices which can be assigned to a single VM, but I think
> that falls into the bucket of information that userspace VMMs will have
> to learn about this kind of direct IOMMU interface assignment anyway
> (most importantly, the relationship of assigned devices to vIOMMUs
> suddenly has to start reflecting the underlying physical topology).
We haven't started to think about how best to fit this into
IOMMUFD, but we will likely have some ideas or a test case in Jan.
> Out of interest, would ATC_INV with an unmatched StreamID raise an error
> or just be ignored? Particularly if the host gets a chance to handle a
A mismatched StreamID will be treated as an illegal command. Yes,
there'd be an error.
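As a rough software model of that behaviour (purely illustrative,
not the hardware register layout): the 16 MATCH+REPLACE pairs per
VINTF can be pictured as a small lookup table, where a StreamID
that hits a MATCH entry is substituted with the REPLACE value and
anything else is rejected as an illegal command. The struct, the
field names and vintf_replace_sid() are hypothetical, as is the
assumption that MATCH holds the guest-visible StreamID and REPLACE
the host one.

```c
#include <stdbool.h>
#include <stdint.h>

#define VINTF_NUM_SID_PAIRS	16	/* "16 pairs" per VINTF, per this thread */

/* Hypothetical model of one MATCH+REPLACE pair. */
struct vintf_sid_pair {
	bool valid;
	uint32_t match;		/* StreamID in the guest's command (assumed) */
	uint32_t replace;	/* StreamID substituted by HW (assumed host SID) */
};

/*
 * Returns 0 and stores the replacement StreamID in *out on a match,
 * or -1 if no valid pair matches (modelled as an illegal command).
 */
static int vintf_replace_sid(const struct vintf_sid_pair *pairs,
			     uint32_t sid, uint32_t *out)
{
	for (int i = 0; i < VINTF_NUM_SID_PAIRS; i++) {
		if (pairs[i].valid && pairs[i].match == sid) {
			*out = pairs[i].replace;
			return 0;
		}
	}
	return -1;
}
```

The fixed table size is also what limits how many devices could be
assigned to a single VM through one VINTF in such a model.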
> GError and decide whether CMDQ_CONS.ERR is reported back to the guest or
> not, there's scope to do some interesting things for functionality and
> robustness.
Would love to learn more about your thoughts :)
Btw, I think we may continue the discussion on this PATCH-5 and
then figure out ideal solutions for the potential bugs that you
have pointed out so far, as this patch really is very introductory
to guest support (we need more implementation based on IOMMUFD).
The first 4 patches could be separated. Do you see a chance to
get them applied first? They have been on the mailing list for a
while now, and we'd like to accelerate the progress of those four
changes first.
Thank you!
Nic
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH v3 5/5] iommu/nvidia-grace-cmdqv: Limit CMDs for guest owned VINTF
2021-12-24 8:02 ` Nicolin Chen via iommu
(?)
@ 2021-12-24 12:13 ` Robin Murphy
-1 siblings, 0 replies; 51+ messages in thread
From: Robin Murphy @ 2021-12-24 12:13 UTC (permalink / raw)
To: Nicolin Chen
Cc: joro, will, jean-philippe, linux-kernel, iommu, linux-tegra,
thierry.reding, jgg, linux-arm-kernel
On 2021-12-24 08:02, Nicolin Chen wrote:
> On Thu, Dec 23, 2021 at 11:14:17AM +0000, Robin Murphy wrote:
>> On 2021-12-22 22:52, Nicolin Chen wrote:
>>> On Wed, Dec 22, 2021 at 12:32:29PM +0000, Robin Murphy wrote:
>>>> On 2021-11-19 07:19, Nicolin Chen via iommu wrote:
>>>>> When VCMDQs are assigned to a VINTF that is owned by a guest, not
>>>>> hypervisor (HYP_OWN bit is unset), only TLB invalidation commands
>>>>> are supported. This requires get_cmd() function to scan the input
>>>>> cmd before selecting cmdq between smmu->cmdq and vintf->vcmdq, so
>>>>> unsupported commands can still go through emulated smmu->cmdq.
>>>>>
>>>>> Also the guest shouldn't have HYP_OWN bit being set regardless of
>>>>> guest kernel driver writing it or not, i.e. the user space driver
>>>>> running in the host OS should wire this bit to zero when trapping
>>>>> a write access to this VINTF_CONFIG register from a guest kernel.
>>>>> So instead of using the existing regval, this patch reads out the
>>>>> register value explicitly to cache in vintf->cfg.
>>>>>
>>>>> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
>>>>> ---
>>>>> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 6 ++--
>>>>> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 5 +--
>>>>> .../arm/arm-smmu-v3/nvidia-grace-cmdqv.c | 32 +++++++++++++++++--
>>>>> 3 files changed, 36 insertions(+), 7 deletions(-)
>>>>>
>>>>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>>>> index b1182dd825fd..73941ccc1a3e 100644
>>>>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>>>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>>>> @@ -337,10 +337,10 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
>>>>> return 0;
>>>>> }
>>>>>
>>>>> -static struct arm_smmu_cmdq *arm_smmu_get_cmdq(struct arm_smmu_device *smmu)
>>>>> +static struct arm_smmu_cmdq *arm_smmu_get_cmdq(struct arm_smmu_device *smmu, u64 *cmds, int n)
>>>>> {
>>>>> if (smmu->nvidia_grace_cmdqv)
>>>>> - return nvidia_grace_cmdqv_get_cmdq(smmu);
>>>>> + return nvidia_grace_cmdqv_get_cmdq(smmu, cmds, n);
>>>>>
>>>>> return &smmu->cmdq;
>>>>> }
>>>>> @@ -747,7 +747,7 @@ static int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
>>>>> u32 prod;
>>>>> unsigned long flags;
>>>>> bool owner;
>>>>> - struct arm_smmu_cmdq *cmdq = arm_smmu_get_cmdq(smmu);
>>>>> + struct arm_smmu_cmdq *cmdq = arm_smmu_get_cmdq(smmu, cmds, n);
>>>>> struct arm_smmu_ll_queue llq, head;
>>>>> int ret = 0;
>>>>>
>>>>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
>>>>> index 24f93444aeeb..085c775c2eea 100644
>>>>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
>>>>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
>>>>> @@ -832,7 +832,8 @@ struct nvidia_grace_cmdqv *
>>>>> nvidia_grace_cmdqv_acpi_probe(struct arm_smmu_device *smmu,
>>>>> struct acpi_iort_node *node);
>>>>> int nvidia_grace_cmdqv_device_reset(struct arm_smmu_device *smmu);
>>>>> -struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu);
>>>>> +struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu,
>>>>> + u64 *cmds, int n);
>>>>> #else /* CONFIG_NVIDIA_GRACE_CMDQV */
>>>>> static inline struct nvidia_grace_cmdqv *
>>>>> nvidia_grace_cmdqv_acpi_probe(struct arm_smmu_device *smmu,
>>>>> @@ -847,7 +848,7 @@ static inline int nvidia_grace_cmdqv_device_reset(struct arm_smmu_device *smmu)
>>>>> }
>>>>>
>>>>> static inline struct arm_smmu_cmdq *
>>>>> -nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
>>>>> +nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu, u64 *cmds, int n)
>>>>> {
>>>>> return NULL;
>>>>> }
>>>>> diff --git a/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c b/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
>>>>> index c0d7351f13e2..71f6bc684e64 100644
>>>>> --- a/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
>>>>> +++ b/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
>>>>> @@ -166,7 +166,8 @@ static int nvidia_grace_cmdqv_init_one_vcmdq(struct nvidia_grace_cmdqv *cmdqv,
>>>>> return arm_smmu_cmdq_init(cmdqv->smmu, cmdq);
>>>>> }
>>>>>
>>>>> -struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
>>>>> +struct arm_smmu_cmdq *
>>>>> +nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu, u64 *cmds, int n)
>>>>> {
>>>>> struct nvidia_grace_cmdqv *cmdqv = smmu->nvidia_grace_cmdqv;
>>>>> struct nvidia_grace_cmdqv_vintf *vintf0 = &cmdqv->vintf0;
>>>>> @@ -176,6 +177,24 @@ struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
>>>>> if (!FIELD_GET(VINTF_STATUS, vintf0->status))
>>>>> return &smmu->cmdq;
>>>>>
>>>>> + /* Check for supported CMDs if VINTF is owned by guest (not hypervisor) */
>>>>> + if (!FIELD_GET(VINTF_HYP_OWN, vintf0->cfg)) {
>>>>> + u64 opcode = (n) ? FIELD_GET(CMDQ_0_OP, cmds[0]) : CMDQ_OP_CMD_SYNC;
>>>>
>>>> I'm not sure there was ever a conscious design decision that batches
>>>> only ever contain one type of command - if something needs to start
>>>
>>> Hmm, I think that's a good catch -- as it could be a potential
>>> bug here. Though the SMMUv3 driver currently seems to use loop
>>> by adding one type of cmds to any batch and submitting it right
>>> away so checking opcode of cmds[0] alone seems to be sufficient
>>> at this moment, yet it might not be so in the future. We'd need
>>> to apply certain constraints on the type of cmds in the batch in
>>> SMMUv3 driver upon smmu->nvidia_grace_cmdqv, or fallback to the
>>> SMMUv3's CMDQ pathway here if one of cmds is not supported.
>>>
>>>> depending on that behaviour then that dependency probably wants to be
>>>> clearly documented. Also, a sync on its own gets trapped to the main
>>>> cmdq but a sync on the end of a batch of TLBIs or ATCIs goes to the
>>>> VCMDQ, huh?
>>>
>>> Yea...looks like an implication again where cmds must have SYNC
>>> at the end of the batch. I will see if any simple change can be
>>> done to fix these two. If you have suggestions for them, I would
>>> love to hear too.
>>
>> Can you explain the current logic here? It's not entirely clear to me
>> whether the VCMDQ is actually meant to support CMD_SYNC or not.
>
> Yes. It's designed to take CMD_SYNC in same queue too. Though it
> also has features, such as HW-inserted-SYNC when scheduler moves
> away from the current queue or when the number of cmds in vcmdq
> meets a MAX-BATCH-SIZE setting (in config register), yet it'd be
> safer for software to ensure the CMD_SYNC is inserted to the end
> of the batch.
OK, so the bug here is just that we're missing CMDQ_OP_CMD_SYNC from the
switch statement? That's reassuring at least. Having to trap to the host
to issue a sync would be horrible, and largely defeat the point of the
whole exercise.
It's not generally much use to software to know that the hardware may or
may not have automatically inserted syncs at arbitrary points in the
timeline; certainly for our flow in Linux, which I don't think is
atypical, we need to know for sure that specific invalidation commands
have completed before we can safely reuse resources associated with the
invalidated translations, and the only way to guarantee that is to
explicitly observe the consumption of a CMD_SYNC from a later queue index.
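To make the two points above concrete, here is a minimal sketch of a check
that scans every command in the batch rather than only cmds[0], and that
treats CMD_SYNC as supported. The helper names are hypothetical and not taken
from the patch; the opcode values and the two-dword command size are assumed
to match arm-smmu-v3.h, and the stage-2 TLBI cases are simply carried over
from the list in the patch as posted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to match arm-smmu-v3.h: opcode lives in bits [7:0] of dword 0. */
#define CMDQ_ENT_DWORDS         2
#define CMDQ_0_OP_MASK          0xffULL
#define CMDQ_OP_CFGI_STE        0x03
#define CMDQ_OP_TLBI_NH_ASID    0x11
#define CMDQ_OP_TLBI_NH_VA      0x12
#define CMDQ_OP_TLBI_S12_VMALL  0x28
#define CMDQ_OP_TLBI_S2_IPA     0x2a
#define CMDQ_OP_ATC_INV         0x40
#define CMDQ_OP_CMD_SYNC        0x46

/* Hypothetical helper: may a guest-owned VCMDQ consume this opcode? */
static bool vcmdq_supports_opcode(uint64_t opcode)
{
	switch (opcode) {
	case CMDQ_OP_TLBI_NH_ASID:
	case CMDQ_OP_TLBI_NH_VA:
	case CMDQ_OP_TLBI_S12_VMALL:
	case CMDQ_OP_TLBI_S2_IPA:
	case CMDQ_OP_ATC_INV:
	case CMDQ_OP_CMD_SYNC:	/* the case missing from the posted switch */
		return true;
	default:
		return false;
	}
}

/*
 * Hypothetical helper: true only if every command in the batch is supported.
 * n == 0 stands for a lone CMD_SYNC, which is therefore accepted as well.
 */
static bool vcmdq_supports_batch(const uint64_t *cmds, int n)
{
	for (int i = 0; i < n; i++) {
		uint64_t opcode = cmds[(size_t)i * CMDQ_ENT_DWORDS] & CMDQ_0_OP_MASK;

		if (!vcmdq_supports_opcode(opcode))
			return false;
	}
	return true;
}
```

With something along these lines, a batch that mixes a TLBI with, say, a
CFGI_STE would fall back to smmu->cmdq as a whole, while a sync on its own
and a sync following a batch of TLBIs would both take the VCMDQ path.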
>>>>> +
>>>>> + /* List all supported CMDs for vintf->cmdq pathway */
>>>>> + switch (opcode) {
>>>>> + case CMDQ_OP_TLBI_NH_ASID:
>>>>> + case CMDQ_OP_TLBI_NH_VA:
>>>>> + case CMDQ_OP_TLBI_S12_VMALL:
>>>>> + case CMDQ_OP_TLBI_S2_IPA:
>>>>
>>>> Fun! Can the guest invalidate any VMID it feels like, or is there some
>>>> additional magic on the host side that we're missing here?
>>>
>>> Yes. VINTF has a register for SW to program VMID so that the HW
>>> can replace VMIDs in the cmds in the VCMDQs of that VINTF with
>>> the programmed VMID. That was the reason why we had numbers of
>>> patches in v2 to route the VMID between guest and host.
>>>
>>>>> + case CMDQ_OP_ATC_INV:
>>>>> + break;
>>>> Ditto for StreamID here.
>>>
>>> Yes. StreamID works similarly by the HW: each VINTF provides us
>>> 16 pairs of MATCH+REPLACE registers to program host and guest's
>>> StreamIDs. Our previous mdev implementation in v2 can be a good
>>> reference code:
>>> https://lore.kernel.org/kvm/20210831101549.237151fa.alex.williamson@redhat.com/T/#m903a1b44935d9e0376439a0c63e832eb464fbaee
>>
>> Ah, sorry, I haven't had the bandwidth to dig back through all the
>> previous threads. Thanks for clarifying - I'm still not sure why any
>> notion of stage 2 would be exposed to guests at all, but at least it
>
> Do you mean, by "notion of stage 2", Host Stream IDs? The guest
> wouldn't get those I think. They'll be trapped in the hypervisor
> -- the user driver (QEMU CMDQV device model for example.)
I mean if it's emulated as a full SMMUv3 interface, IDR0.S2P=0. At the
moment it makes no sense for a guest to even *think* it can issue
TLBI_S2_IPA or TLBI_S12_VMALL. My understanding of the usage model for
this is that we pick the Context Descriptor from guest memory via the
emulated Stream Table (or other mechanism like virtio-iommu) and plumb
it directly into the S1ContextPtr of the appropriate underlying physical
STE, on top of the host's S2 translation. I don't see how we could also
flatten an emulated S2 into either physical stage without having to go
back to the costly "trap all pagetable accesses" approach which would
obliterate the benefit of having a directly-assigned queue.
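As a rough illustration of the IDR0.S2P point, the sketch below decides
whether a TLBI opcode makes sense for a guest given the IDR0 value its
emulated SMMU advertises. The function name is hypothetical, and the policy
is only an assumption drawn from the paragraph above, not something the
hardware or the patch defines; the IDR0 bit positions and opcode values are
assumed to match arm-smmu-v3.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match arm-smmu-v3.h. */
#define IDR0_S2P                (1u << 0)	/* stage 2 translation supported */
#define IDR0_S1P                (1u << 1)	/* stage 1 translation supported */
#define CMDQ_OP_TLBI_NH_ASID    0x11
#define CMDQ_OP_TLBI_NH_VA      0x12
#define CMDQ_OP_TLBI_S12_VMALL  0x28
#define CMDQ_OP_TLBI_S2_IPA     0x2a

/*
 * Hypothetical policy: a guest should only expect to issue the TLBIs that
 * correspond to a translation stage its emulated IDR0 actually advertises.
 */
static bool guest_tlbi_makes_sense(uint32_t emulated_idr0, uint8_t opcode)
{
	switch (opcode) {
	case CMDQ_OP_TLBI_NH_ASID:
	case CMDQ_OP_TLBI_NH_VA:
		return (emulated_idr0 & IDR0_S1P) != 0;
	case CMDQ_OP_TLBI_S12_VMALL:
	case CMDQ_OP_TLBI_S2_IPA:
		return (emulated_idr0 & IDR0_S2P) != 0;
	default:
		return false;
	}
}
```

For a stage-1-only vSMMU (S1P=1, S2P=0) this accepts TLBI_NH_ASID and
TLBI_NH_VA and rejects both stage-2 opcodes, which is why listing them for a
guest-owned VINTF looks odd.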
>> sounds like there's no functional concern here, other than constraining
>> the number of devices which can be assigned to a single VM, but I think
>> that falls into the bucket of information that userspace VMMs will have
>> to learn about this kind of direct IOMMU interface assignment anyway
>> (most importantly, the relationship of assigned devices to vIOMMUs
>> suddenly has to start reflecting the underlying physical topology).
>
> We haven't started to think how to fit the best into the IOMMUFD
> but we will be likely having some idea or test case in Jan.
>
>> Out of interest, would ATC_INV with an unmatched StreamID raise an error
>> or just be ignored? Particularly if the host gets a chance to handle a
>
> Mismatched StreamID will be treated as an Illegal command. Yes,
> there'd be an error.
>
>> GError and decide whether CMDQ_CONS.ERR is reported back to the guest or
>> not, there's scope to do some interesting things for functionality and
>> robustness.
>
> Would love to learn more about your thoughts :)
Basically it would be quite neat if we could present a virtual queue to the
guest as the vSMMU's main queue, such that any commands that the
hardware can't consume directly could be fixed up or emulated by the
host with the illusion that they're being consumed as normal. It does
push more complexity into the host, and a round trip via the GError
interrupt would be a bit less efficient than trapping synchronously on a
write to an emulated CMDQ_PROD for commands that *do* need emulating,
but conversely it means we could support any guest with only the most
basic understanding of SMMUv3.0, and could potentially be more robust
overall. As I say, though, it depends entirely on the guest not being
able to observe an error until the host has decided not to fix up the
offending command.
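A toy model of that fix-up flow, purely to show the ordering that matters:
the guest-visible CONS value only gains an error code after the host has
declined to emulate the offending command. Everything here (the struct, the
function names, and the choice of which opcodes the host emulates) is
hypothetical; only the CMDQ_CONS.ERR field position, the CERROR_ILL code and
the opcode values are assumed to match arm-smmu-v3.h. Queue wrap handling is
deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match arm-smmu-v3.h: CMDQ_CONS.ERR is bits [30:24]. */
#define CMDQ_CONS_ERR_SHIFT      24
#define CMDQ_CONS_ERR_MASK       (0x7fu << CMDQ_CONS_ERR_SHIFT)
#define CMDQ_ERR_CERROR_ILL_IDX  1u
#define CMDQ_OP_CFGI_STE         0x03
#define CMDQ_OP_CFGI_CD          0x05
#define CMDQ_OP_RESUME           0x44

/* Hypothetical model of what the guest is allowed to observe. */
struct vqueue_model {
	uint32_t guest_cons;	/* CONS value exposed to the guest */
};

/* Hypothetical policy: commands the host is willing to emulate. */
static bool host_can_emulate(uint8_t opcode)
{
	return opcode == CMDQ_OP_CFGI_STE || opcode == CMDQ_OP_CFGI_CD;
}

/*
 * Called when the hardware has stopped at queue index idx on a command it
 * cannot consume. Returns true if the host fixed it up, in which case the
 * guest only ever sees CONS move past the command; otherwise the error is
 * reported in the guest-visible CONS. Wrap bits are ignored in this sketch.
 */
static bool host_handle_vcmdq_error(struct vqueue_model *q, uint32_t idx,
				    uint8_t opcode)
{
	if (host_can_emulate(opcode)) {
		/* Emulation of the command itself is elided here. */
		q->guest_cons = idx + 1;
		return true;
	}

	q->guest_cons = idx | (CMDQ_ERR_CERROR_ILL_IDX << CMDQ_CONS_ERR_SHIFT);
	return false;
}
```

The trade-off described above shows up in where this runs: it sits behind
the GError interrupt rather than a synchronous trap on CMDQ_PROD, so it
costs an extra round trip only for the commands that need emulating.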
> Btw, I think we may continue the discussion on this PATCH-5 and
> then to figure out ideal solutions for those potential bugs that
> you commented so far, as this patch really is very introductory
> to Guest support (we need more implementation based on IOMMUFD.)
>
> For the first 4 patches, they could be separated. Do you see a
> chance to get them applied first? They are in the mail list for
> a while now. And we'd like to accelerate the progress of those
> four changes first.
I can't speak for Will, but personally I'd consider them exactly the
same as the ECMDQ patches - it's good to have them out here, reviewed as
far as we reasonably can, and ready for people to experiment with as
soon as the real hardware turns up, but I don't see any benefit in
actually merging unproven complexity into mainline before then. Neither
patchset gives Linux any new functionality that it can't achieve already
with the regular cmdq, so there's nothing to gain until it's actually
demonstrable that we really are addressing the right bottlenecks in the
right manner to meaningfully improve real-world performance, but what we
have to lose is more effort spent ripping stuff out again if it turns
out to be no good. Even patches #1-#3 here fundamentally beg the
question of whether replicating the full heavyweight cmdq behaviour is
the right way to go.
I appreciate you've probably got hardware validation teams on your back
wanting "the driver" to support every new feature right now for them to
exercise, but we just have to stand firm and tell them that's not how
upstream works :)
Thanks,
Robin.
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH v3 5/5] iommu/nvidia-grace-cmdqv: Limit CMDs for guest owned VINTF
@ 2021-12-24 12:13 ` Robin Murphy
0 siblings, 0 replies; 51+ messages in thread
From: Robin Murphy @ 2021-12-24 12:13 UTC (permalink / raw)
To: Nicolin Chen
Cc: joro, will, jean-philippe, linux-kernel, iommu, linux-tegra,
thierry.reding, jgg, linux-arm-kernel
On 2021-12-24 08:02, Nicolin Chen wrote:
> On Thu, Dec 23, 2021 at 11:14:17AM +0000, Robin Murphy wrote:
>> On 2021-12-22 22:52, Nicolin Chen wrote:
>>> On Wed, Dec 22, 2021 at 12:32:29PM +0000, Robin Murphy wrote:
>>>> On 2021-11-19 07:19, Nicolin Chen via iommu wrote:
>>>>> When VCMDQs are assigned to a VINTF that is owned by a guest, not
>>>>> hypervisor (HYP_OWN bit is unset), only TLB invalidation commands
>>>>> are supported. This requires get_cmd() function to scan the input
>>>>> cmd before selecting cmdq between smmu->cmdq and vintf->vcmdq, so
>>>>> unsupported commands can still go through emulated smmu->cmdq.
>>>>>
>>>>> Also the guest shouldn't have HYP_OWN bit being set regardless of
>>>>> guest kernel driver writing it or not, i.e. the user space driver
>>>>> running in the host OS should wire this bit to zero when trapping
>>>>> a write access to this VINTF_CONFIG register from a guest kernel.
>>>>> So instead of using the existing regval, this patch reads out the
>>>>> register value explicitly to cache in vintf->cfg.
>>>>>
>>>>> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
>>>>> ---
>>>>> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 6 ++--
>>>>> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 5 +--
>>>>> .../arm/arm-smmu-v3/nvidia-grace-cmdqv.c | 32 +++++++++++++++++--
>>>>> 3 files changed, 36 insertions(+), 7 deletions(-)
>>>>>
>>>>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>>>> index b1182dd825fd..73941ccc1a3e 100644
>>>>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>>>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>>>> @@ -337,10 +337,10 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
>>>>> return 0;
>>>>> }
>>>>>
>>>>> -static struct arm_smmu_cmdq *arm_smmu_get_cmdq(struct arm_smmu_device *smmu)
>>>>> +static struct arm_smmu_cmdq *arm_smmu_get_cmdq(struct arm_smmu_device *smmu, u64 *cmds, int n)
>>>>> {
>>>>> if (smmu->nvidia_grace_cmdqv)
>>>>> - return nvidia_grace_cmdqv_get_cmdq(smmu);
>>>>> + return nvidia_grace_cmdqv_get_cmdq(smmu, cmds, n);
>>>>>
>>>>> return &smmu->cmdq;
>>>>> }
>>>>> @@ -747,7 +747,7 @@ static int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
>>>>> u32 prod;
>>>>> unsigned long flags;
>>>>> bool owner;
>>>>> - struct arm_smmu_cmdq *cmdq = arm_smmu_get_cmdq(smmu);
>>>>> + struct arm_smmu_cmdq *cmdq = arm_smmu_get_cmdq(smmu, cmds, n);
>>>>> struct arm_smmu_ll_queue llq, head;
>>>>> int ret = 0;
>>>>>
>>>>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
>>>>> index 24f93444aeeb..085c775c2eea 100644
>>>>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
>>>>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
>>>>> @@ -832,7 +832,8 @@ struct nvidia_grace_cmdqv *
>>>>> nvidia_grace_cmdqv_acpi_probe(struct arm_smmu_device *smmu,
>>>>> struct acpi_iort_node *node);
>>>>> int nvidia_grace_cmdqv_device_reset(struct arm_smmu_device *smmu);
>>>>> -struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu);
>>>>> +struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu,
>>>>> + u64 *cmds, int n);
>>>>> #else /* CONFIG_NVIDIA_GRACE_CMDQV */
>>>>> static inline struct nvidia_grace_cmdqv *
>>>>> nvidia_grace_cmdqv_acpi_probe(struct arm_smmu_device *smmu,
>>>>> @@ -847,7 +848,7 @@ static inline int nvidia_grace_cmdqv_device_reset(struct arm_smmu_device *smmu)
>>>>> }
>>>>>
>>>>> static inline struct arm_smmu_cmdq *
>>>>> -nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
>>>>> +nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu, u64 *cmds, int n)
>>>>> {
>>>>> return NULL;
>>>>> }
>>>>> diff --git a/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c b/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
>>>>> index c0d7351f13e2..71f6bc684e64 100644
>>>>> --- a/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
>>>>> +++ b/drivers/iommu/arm/arm-smmu-v3/nvidia-grace-cmdqv.c
>>>>> @@ -166,7 +166,8 @@ static int nvidia_grace_cmdqv_init_one_vcmdq(struct nvidia_grace_cmdqv *cmdqv,
>>>>> return arm_smmu_cmdq_init(cmdqv->smmu, cmdq);
>>>>> }
>>>>>
>>>>> -struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
>>>>> +struct arm_smmu_cmdq *
>>>>> +nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu, u64 *cmds, int n)
>>>>> {
>>>>> struct nvidia_grace_cmdqv *cmdqv = smmu->nvidia_grace_cmdqv;
>>>>> struct nvidia_grace_cmdqv_vintf *vintf0 = &cmdqv->vintf0;
>>>>> @@ -176,6 +177,24 @@ struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
>>>>> if (!FIELD_GET(VINTF_STATUS, vintf0->status))
>>>>> return &smmu->cmdq;
>>>>>
>>>>> + /* Check for supported CMDs if VINTF is owned by guest (not hypervisor) */
>>>>> + if (!FIELD_GET(VINTF_HYP_OWN, vintf0->cfg)) {
>>>>> + u64 opcode = (n) ? FIELD_GET(CMDQ_0_OP, cmds[0]) : CMDQ_OP_CMD_SYNC;
>>>>
>>>> I'm not sure there was ever a conscious design decision that batches
>>>> only ever contain one type of command - if something needs to start
>>>
>>> Hmm, I think that's a good catch -- as it could be a potential
>>> bug here. Though the SMMUv3 driver currently seems to use loop
>>> by adding one type of cmds to any batch and submitting it right
>>> away so checking opcode of cmds[0] alone seems to be sufficient
>>> at this moment, yet it might not be so in the future. We'd need
>>> to apply certain constraints on the type of cmds in the batch in
>>> SMMUv3 driver upon smmu->nvidia_grace_cmdqv, or fallback to the
>>> SMMUv3's CMDQ pathway here if one of cmds is not supported.
>>>
>>>> depending on that behaviour then that dependency probably wants to be
>>>> clearly documented. Also, a sync on its own gets trapped to the main
>>>> cmdq but a sync on the end of a batch of TLBIs or ATCIs goes to the
>>>> VCMDQ, huh?
>>>
>>> Yea...looks like an implication again where cmds must have SYNC
>>> at the end of the batch. I will see if any simple change can be
>>> done to fix these two. If you have suggestions for them, I would
>>> love to hear too.
>>
>> Can you explain the current logic here? It's not entirely clear to me
>> whether the VCMDQ is actually meant to support CMD_SYNC or not.
>
> Yes. It's designed to take CMD_SYNC in same queue too. Though it
> also has features, such as HW-inserted-SYNC when scheduler moves
> away from the current queue or when the number of cmds in vcmdq
> meets a MAX-BATCH-SIZE setting (in config register), yet it'd be
> safer for software to ensure the CMD_SYNC is inserted to the end
> of the batch.
OK, so the bug here is just that we're missing CMDQ_OP_CMD_SYNC from the
switch statement? That's reassuring at least. Having to trap to the host
to issue a sync would be horrible, and largely defeat the point of the
whole exercise.
It's not generally much use to software to know that the hardware may or
may not have automatically inserted syncs at arbitrary points in the
timeline; certainly for our flow in Linux, which I don't think is
atypical, we need to know for sure that specific invalidation commands
have completed before we can safely reuse resources associated with the
invalidated translations, and the only way to guarantee that is to
explicitly observe the consumption of a CMD_SYNC from a later queue index.
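To make the "observe the consumption of a CMD_SYNC from a later queue index" point concrete, here is a minimal sketch in C. It assumes a simplified model in which queue positions are free-running 32-bit counters; the real CMDQ_PROD/CMDQ_CONS registers use an index plus a wrap bit instead. The name sync_consumed and its parameters are hypothetical and not taken from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Simplified model (an assumption for illustration): queue positions are
 * free-running 32-bit counters that wrap modulo 2^32.
 *
 * sync_idx is the position at which software wrote its CMD_SYNC. The
 * invalidations queued before it are known to be complete only once the
 * consumer position has moved past that slot.
 */
static bool sync_consumed(uint32_t cons, uint32_t sync_idx)
{
	/* The signed difference stays correct across counter wrap-around. */
	return (int32_t)(cons - sync_idx) > 0;
}
```

A hardware-inserted sync gives software no such index to compare against, which is why it cannot stand in for an explicit CMD_SYNC here.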
>>>>> +
>>>>> + /* List all supported CMDs for vintf->cmdq pathway */
>>>>> + switch (opcode) {
>>>>> + case CMDQ_OP_TLBI_NH_ASID:
>>>>> + case CMDQ_OP_TLBI_NH_VA:
>>>>> + case CMDQ_OP_TLBI_S12_VMALL:
>>>>> + case CMDQ_OP_TLBI_S2_IPA:
>>>>
>>>> Fun! Can the guest invalidate any VMID it feels like, or is there some
>>>> additional magic on the host side that we're missing here?
>>>
>>> Yes. VINTF has a register for SW to program VMID so that the HW
>>> can replace VMIDs in the cmds in the VCMDQs of that VINTF with
>>> the programmed VMID. That was the reason why we had numbers of
>>> patches in v2 to route the VMID between guest and host.
>>>
>>>>> + case CMDQ_OP_ATC_INV:
>>>>> + break;
>>>> Ditto for StreamID here.
>>>
>>> Yes. StreamID works similarly by the HW: each VINTF provides us
>>> 16 pairs of MATCH+REPLACE registers to program host and guest's
>>> StreamIDs. Our previous mdev implementation in v2 can be a good
>>> reference code:
>>> https://lore.kernel.org/kvm/20210831101549.237151fa.alex.williamson@redhat.com/T/#m903a1b44935d9e0376439a0c63e832eb464fbaee
>>
>> Ah, sorry, I haven't had the bandwidth to dig back through all the
>> previous threads. Thanks for clarifying - I'm still not sure why any
>> notion of stage 2 would be exposed to guests at all, but at least it
>
> Do you mean, by "notion of stage 2", Host Stream IDs? The guest
> wouldn't get those I think. They'll be trapped in the hypervisor
> -- the user driver (QEMU CMDQV device model for example.)
I mean if it's emulated as a full SMMUv3 interface, IDR0.S2P=0. At the
moment it makes no sense for a guest to even *think* it can issue
TLBI_S2_IPA or TLBI_S12_VMALL. My understanding of the usage model for
this is that we pick the Context Descriptor from guest memory via the
emulated Stream Table (or other mechanism like virtio-iommu) and plumb
it directly into the S1ContextPtr of the appropriate underlying physical
STE, on top of the host's S2 translation. I don't see how we could also
flatten an emulated S2 into either physical stage without having to go
back to the costly "trap all pagetable accesses" approach which would
obliterate the benefit of having a directly-assigned queue.
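As a rough sketch of what that implies for the command filter: the stage-2 opcodes would only be accepted when the emulated interface actually advertises stage 2. The opcode values below are the ones I recall from the SMMUv3 driver header and should be treated as assumptions. guest_cmd_allowed and guest_has_s2 are hypothetical names, and accepting CMD_SYNC is the fix suggested earlier in this mail rather than what the patch currently does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Opcode values assumed to match the SMMUv3 driver header. */
#define CMDQ_OP_TLBI_NH_ASID	0x11
#define CMDQ_OP_TLBI_NH_VA	0x12
#define CMDQ_OP_TLBI_S12_VMALL	0x28
#define CMDQ_OP_TLBI_S2_IPA	0x2a
#define CMDQ_OP_ATC_INV		0x40
#define CMDQ_OP_CMD_SYNC	0x46

/*
 * Hypothetical filter: guest_has_s2 models whether the emulated SMMU
 * advertises stage 2 (IDR0.S2P). With S2P=0 the guest has no reason to
 * issue the stage-2 invalidations, so they are rejected.
 */
static bool guest_cmd_allowed(uint8_t opcode, bool guest_has_s2)
{
	switch (opcode) {
	case CMDQ_OP_TLBI_NH_ASID:
	case CMDQ_OP_TLBI_NH_VA:
	case CMDQ_OP_ATC_INV:
	case CMDQ_OP_CMD_SYNC:
		return true;
	case CMDQ_OP_TLBI_S12_VMALL:
	case CMDQ_OP_TLBI_S2_IPA:
		return guest_has_s2;
	default:
		return false;
	}
}
```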
>> sounds like there's no functional concern here, other than constraining
>> the number of devices which can be assigned to a single VM, but I think
>> that falls into the bucket of information that userspace VMMs will have
>> to learn about this kind of direct IOMMU interface assignment anyway
>> (most importantly, the relationship of assigned devices to vIOMMUs
>> suddenly has to start reflecting the underlying physical topology).
>
> We haven't started to think how to fit the best into the IOMMUFD
> but we will be likely having some idea or test case in Jan.
>
>> Out of interest, would ATC_INV with an unmatched StreamID raise an error
>> or just be ignored? Particularly if the host gets a chance to handle a
>
> Mismatched StreamID will be treated as an Illegal command. Yes,
> there'd be an error.
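For readers following along, the MATCH+REPLACE behaviour described above can be modelled roughly as below. This is only a sketch: it assumes MATCH holds the StreamID the guest puts in its command and REPLACE holds the StreamID the hardware substitutes, and the struct layout and the name sid_translate are hypothetical, not the register interface.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_SID_PAIRS	16	/* "16 pairs" per the description above */

/* Hypothetical software model of one MATCH+REPLACE register pair. */
struct sid_pair {
	bool     valid;
	uint32_t match;		/* StreamID as written by the guest (assumed) */
	uint32_t replace;	/* StreamID substituted by hardware (assumed) */
};

/*
 * Returns true and stores the substituted StreamID in *out on a match.
 * Returning false models the "Illegal command" error for an unmatched
 * StreamID.
 */
static bool sid_translate(const struct sid_pair *pairs, uint32_t sid,
			  uint32_t *out)
{
	size_t i;

	for (i = 0; i < NUM_SID_PAIRS; i++) {
		if (pairs[i].valid && pairs[i].match == sid) {
			*out = pairs[i].replace;
			return true;
		}
	}
	return false;
}
```

The fixed table size is also where the limit on devices per VM mentioned earlier comes from.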
>
>> GError and decide whether CMDQ_CONS.ERR is reported back to the guest or
>> not, there's scope to do some interesting things for functionality and
>> robustness.
>
> Would love to learn more about your thoughts :)
Basically it's quite neat if we could present a virtual queue to the
guest as the vSMMU's main queue, such that any commands that the
hardware can't consume directly could be fixed up or emulated by the
host with the illusion that they're being consumed as normal. It does
push more complexity into the host, and a round trip via the GError
interrupt would be a bit less efficient than trapping synchronously on a
write to an emulated CMDQ_PROD for commands that *do* need emulating,
but conversely it means we could support any guest with only the most
basic understanding of SMMUv3.0, and could potentially be more robust
overall. As I say, though, it depends entirely on the guest not being
able to observe an error until the host has decided not to fix up the
offending command.
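A very rough sketch of the host-side decision this would need, purely to illustrate the idea. The policy shown (emulate configuration invalidations, report everything else) is an invented example, the names are hypothetical, and the opcode values are the ones I recall from the SMMUv3 driver header.

```c
#include <stdint.h>

/* Opcode values assumed to match the SMMUv3 driver header. */
#define CMDQ_OP_CFGI_STE	0x03
#define CMDQ_OP_CFGI_CD		0x05

enum illegal_cmd_action {
	ILLEGAL_CMD_EMULATE,	/* host fixes it up; guest never sees an error */
	ILLEGAL_CMD_REPORT,	/* host lets CMDQ_CONS.ERR reach the guest */
};

/*
 * Hypothetical triage run by the host when the hardware flags a command
 * it cannot consume directly. The guest-visible error state must not
 * change until this decision has been made.
 */
static enum illegal_cmd_action host_triage_illegal_cmd(uint8_t opcode)
{
	switch (opcode) {
	case CMDQ_OP_CFGI_STE:
	case CMDQ_OP_CFGI_CD:
		return ILLEGAL_CMD_EMULATE;
	default:
		return ILLEGAL_CMD_REPORT;
	}
}
```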
> Btw, I think we may continue the discussion on this PATCH-5 and
> then to figure out ideal solutions for those potential bugs that
> you commented so far, as this patch really is very introductory
> to Guest support (we need more implementation based on IOMMUFD.)
>
> For the first 4 patches, they could be separated. Do you see a
> chance to get them applied first? They are in the mail list for
> a while now. And we'd like to accelerate the progress of those
> four changes first.
I can't speak for Will, but personally I'd consider them exactly the
same as the ECMDQ patches - it's good to have them out here, reviewed as
far as we reasonably can, and ready for people to experiment with as
soon as the real hardware turns up, but I don't see any benefit in
actually merging unproven complexity into mainline before then. Neither
patchset gives Linux any new functionality that it can't achieve already
with the regular cmdq, so there's nothing to gain until it's actually
demonstrable that we really are addressing the right bottlenecks in the
right manner to meaningfully improve real-world performance, but what we
have to lose is more effort spent ripping stuff out again if it turns
out to be no good. Even patches #1-#3 here fundamentally beg the
question of whether replicating the full heavyweight cmdq behaviour is
the right way to go.
I appreciate you've probably got hardware validation teams on your back
wanting "the driver" to support every new feature right now for them to
exercise, but we just have to stand firm and tell them that's not how
upstream works :)
Thanks,
Robin.
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH v3 5/5] iommu/nvidia-grace-cmdqv: Limit CMDs for guest owned VINTF
2021-12-24 12:13 ` Robin Murphy
(?)
@ 2021-12-28 5:49 ` Nicolin Chen via iommu
-1 siblings, 0 replies; 51+ messages in thread
From: Nicolin Chen @ 2021-12-28 5:49 UTC (permalink / raw)
To: Robin Murphy
Cc: joro, will, jean-philippe, linux-kernel, iommu, linux-tegra,
thierry.reding, jgg, linux-arm-kernel
On Fri, Dec 24, 2021 at 12:13:57PM +0000, Robin Murphy wrote:
> > > > > > @@ -176,6 +177,24 @@ struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
> > > > > > if (!FIELD_GET(VINTF_STATUS, vintf0->status))
> > > > > > return &smmu->cmdq;
> > > > > >
> > > > > > + /* Check for supported CMDs if VINTF is owned by guest (not hypervisor) */
> > > > > > + if (!FIELD_GET(VINTF_HYP_OWN, vintf0->cfg)) {
> > > > > > + u64 opcode = (n) ? FIELD_GET(CMDQ_0_OP, cmds[0]) : CMDQ_OP_CMD_SYNC;
> > > > >
> > > > > I'm not sure there was ever a conscious design decision that batches
> > > > > only ever contain one type of command - if something needs to start
> > > >
> > > > Hmm, I think that's a good catch -- as it could be a potential
> > > > bug here. Though the SMMUv3 driver currently seems to use loop
> > > > by adding one type of cmds to any batch and submitting it right
> > > > away so checking opcode of cmds[0] alone seems to be sufficient
> > > > at this moment, yet it might not be so in the future. We'd need
> > > > to apply certain constraints on the type of cmds in the batch in
> > > > SMMUv3 driver upon smmu->nvidia_grace_cmdqv, or fall back to the
> > > > SMMUv3's CMDQ pathway here if one of cmds is not supported.
> > > >
> > > > > depending on that behaviour then that dependency probably wants to be
> > > > > clearly documented. Also, a sync on its own gets trapped to the main
> > > > > cmdq but a sync on the end of a batch of TLBIs or ATCIs goes to the
> > > > > VCMDQ, huh?
> > > >
> > > > Yea...looks like an implication again where cmds must have SYNC
> > > > at the end of the batch. I will see if any simple change can be
> > > > done to fix these two. If you have suggestions for them, I would
> > > > love to hear too.
> > >
> > > Can you explain the current logic here? It's not entirely clear to me
> > > whether the VCMDQ is actually meant to support CMD_SYNC or not.
> >
> > Yes. It's designed to take CMD_SYNC in same queue too. Though it
> > also has features, such as HW-inserted-SYNC when scheduler moves
> > away from the current queue or when the number of cmds in vcmdq
> > meets a MAX-BATCH-SIZE setting (in config register), yet it'd be
> > safer for software to ensure the CMD_SYNC is inserted to the end
> > of the batch.
>
> OK, so the bug here is just that we're missing CMDQ_OP_CMD_SYNC from the
> switch statement? That's reassuring at least. Having to trap to the host
> to issue a sync would be horrible, and largely defeat the point of the
> whole exercise.
Hmm..I'm not sure why we need CMD_SYNC in the switch statement.
I thought that you pointed out a potential corner case where a
batch could be submitted separately, e.g. Batch A {TLBI_NH_VAx2}
and then Batch B {CMD_SYNC}. Right now the SMMUv3 driver submits
all TLBI commands with sync=true, so we don't run into a problem
so far.
> It's not generally much use to software to know that the hardware may or
> may not have automatically inserted syncs at arbitrary points in the
> timeline; certainly for our flow in Linux, which I don't think is
> atypical, we need to know for sure that specific invalidation commands
> have completed before we can safely reuse resources associated with the
> invalidated translations, and the only way to guarantee that is to
> explicitly observe the consumption of a CMD_SYNC from a later queue index.
Hmm, if I capture it correctly, for the potential issue that I
listed above, we could simply ensure each TLBI batch to contain
TLBI commands only and to have CMD_SYNC at the end.
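A minimal sketch of such a check, walking the whole batch instead of looking at cmds[0] only. It assumes two 64-bit words per command with the opcode in the low byte of the first word, as in the SMMUv3 driver, and opcode values as I recall them from the driver header. batch_fits_vcmdq and is_invalidation are hypothetical names, and ATC_INV is kept because it is in the supported list of this patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CMDQ_ENT_DWORDS		2	/* 64-bit words per command (assumed) */
#define CMDQ_0_OP_MASK		0xffULL	/* opcode in bits [7:0] of word 0 */

/* Opcode values assumed to match the SMMUv3 driver header. */
#define CMDQ_OP_TLBI_NH_ASID	0x11
#define CMDQ_OP_TLBI_NH_VA	0x12
#define CMDQ_OP_TLBI_S12_VMALL	0x28
#define CMDQ_OP_TLBI_S2_IPA	0x2a
#define CMDQ_OP_ATC_INV		0x40

static bool is_invalidation(uint8_t op)
{
	switch (op) {
	case CMDQ_OP_TLBI_NH_ASID:
	case CMDQ_OP_TLBI_NH_VA:
	case CMDQ_OP_TLBI_S12_VMALL:
	case CMDQ_OP_TLBI_S2_IPA:
	case CMDQ_OP_ATC_INV:
		return true;
	default:
		return false;
	}
}

/*
 * Hypothetical check: the batch may use the VCMDQ only if every one of
 * its n commands is an invalidation and a trailing CMD_SYNC is requested.
 * With n == 0 and sync set, this is a lone CMD_SYNC, which is accepted.
 */
static bool batch_fits_vcmdq(const uint64_t *cmds, size_t n, bool sync)
{
	size_t i;

	if (!sync)
		return false;

	for (i = 0; i < n; i++) {
		uint8_t op = cmds[i * CMDQ_ENT_DWORDS] & CMDQ_0_OP_MASK;

		if (!is_invalidation(op))
			return false;
	}
	return true;
}
```

Anything that fails the check would fall back to the SMMUv3's main CMDQ.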
> > > > > > +
> > > > > > + /* List all supported CMDs for vintf->cmdq pathway */
> > > > > > + switch (opcode) {
> > > > > > + case CMDQ_OP_TLBI_NH_ASID:
> > > > > > + case CMDQ_OP_TLBI_NH_VA:
> > > > > > + case CMDQ_OP_TLBI_S12_VMALL:
> > > > > > + case CMDQ_OP_TLBI_S2_IPA:
> > > > >
> > > > > Fun! Can the guest invalidate any VMID it feels like, or is there some
> > > > > additional magic on the host side that we're missing here?
> > > >
> > > > Yes. VINTF has a register for SW to program VMID so that the HW
> > > > can replace VMIDs in the cmds in the VCMDQs of that VINTF with
> > > > the programmed VMID. That was the reason why we had numbers of
> > > > patches in v2 to route the VMID between guest and host.
> > > >
> > > > > > + case CMDQ_OP_ATC_INV:
> > > > > > + break;
> > > > > Ditto for StreamID here.
> > > >
> > > > Yes. StreamID works similarly by the HW: each VINTF provides us
> > > > 16 pairs of MATCH+REPLACE registers to program host and guest's
> > > > StreamIDs. Our previous mdev implementation in v2 can be a good
> > > > reference code:
> > > > https://lore.kernel.org/kvm/20210831101549.237151fa.alex.williamson@redhat.com/T/#m903a1b44935d9e0376439a0c63e832eb464fbaee
> > >
> > > Ah, sorry, I haven't had the bandwidth to dig back through all the
> > > previous threads. Thanks for clarifying - I'm still not sure why any
> > > notion of stage 2 would be exposed to guests at all, but at least it
> >
> > Do you mean, by "notion of stage 2", Host Stream IDs? The guest
> > wouldn't get those I think. They'll be trapped in the hypervisor
> > -- the user driver (QEMU CMDQV device model for example.)
>
> I mean if it's emulated as a full SMMUv3 interface, IDR0.S2P=0. At the
> moment it makes no sense for a guest to even *think* it can issue
> TLBI_S2_IPA or TLBI_S12_VMALL. My understanding of the usage model for
Ah..that's true. We've listed those by following the supported
command list from HW team. There might be no use case to cover
those.
> this is that we pick the Context Descriptor from guest memory via the
> emulated Stream Table (or other mechanism like virtio-iommu) and plumb
> it directly into the S1ContextPtr of the appropriate underlying physical
> STE, on top of the host's S2 translation. I don't see how we could also
Yea. VCMDQs are supposed to do TLB invalidation only. All other
commands should be going through ioctls (VFIO or IOMMUFD). What
we currently use for verification is Nesting patches from Eric,
yet the TLB invalidation would run into the VCMDQ pathway, as a
hardware acceleration.
> flatten an emulated S2 into either physical stage without having to go
> back to the costly "trap all pagetable accesses" approach which would
> obliterate the benefit of having a directly-assigned queue.
>
> > > sounds like there's no functional concern here, other than constraining
> > > the number of devices which can be assigned to a single VM, but I think
> > > that falls into the bucket of information that userspace VMMs will have
> > > to learn about this kind of direct IOMMU interface assignment anyway
> > > (most importantly, the relationship of assigned devices to vIOMMUs
> > > suddenly has to start reflecting the underlying physical topology).
> >
> > We haven't started to think how to fit the best into the IOMMUFD
> > but we will be likely having some idea or test case in Jan.
> >
> > > Out of interest, would ATC_INV with an unmatched StreamID raise an error
> > > or just be ignored? Particularly if the host gets a chance to handle a
> >
> > Mismatched StreamID will be treated as an Illegal command. Yes,
> > there'd be an error.
> >
> > > GError and decide whether CMDQ_CONS.ERR is reported back to the guest or
> > > not, there's scope to do some interesting things for functionality and
> > > robustness.
> >
> > Would love to learn more about your thoughts :)
>
> Basically it's quite neat if we could present a virtual queue to the
> guest as the vSMMU's main queue, such that any commands that the
> hardware can't consume directly could be fixed up or emulated by the
> host with the illusion that they're being consumed as normal. It does
> push more complexity into the host, and a round trip via the GError
> interrupt would be a bit less efficient than trapping synchronously on a
> write to an emulated CMDQ_PROD for commands that *do* need emulating,
> but conversely it means we could support any guest with only the most
> basic understanding of SMMUv3.0, and could potentially be more robust
Wow...that's an interesting idea! So the host kernel could have
more queues to serve those trapping IRQs, although I am not
sure if IRQs, over 64 interfaces and 128 queues, would overwhelm
the host ISR... Can threaded interrupts from the same IRQ number
be served at the same time on different CPU cores? If so, multi-
queue like VCMDQs and ECMDQ might take advantage of that.
Just one concern here: we'll need to support multi VCMDQs on the
guest level too. So using the vSMMU's main queue slot may not be
sufficient for CMDQV use cases.
> overall. As I say, though, it depends entirely on the guest not being
> able to observe an error until the host has decided not to fix up the
> offending command.
Well, given that guest IRQs are raised by the user space driver,
I think we can have certain controls to support that.
> > Btw, I think we may continue the discussion on this PATCH-5 and
> > then to figure out ideal solutions for those potential bugs that
> > you commented so far, as this patch really is very introductory
> > to Guest support (we need more implementation based on IOMMUFD.)
> >
> > For the first 4 patches, they could be separated. Do you see a
> > chance to get them applied first? They are in the mail list for
> > a while now. And we'd like to accelerate the progress of those
> > four changes first.
>
> I can't speak for Will, but personally I'd consider them exactly the
> same as the ECMDQ patches - it's good to have them out here, reviewed as
> far as we reasonably can, and ready for people to experiment with as
> soon as the real hardware turns up, but I don't see any benefit in
> actually merging unproven complexity into mainline before then. Neither
> patchset gives Linux any new functionality that it can't achieve already
> with the regular cmdq, so there's nothing to gain until it's actually
> demonstrable that we really are addressing the right bottlenecks in the
> right manner to meaningfully improve real-world performance, but what we
> have to lose is more effort spent ripping stuff out again if it turns
> out to be no good. Even patches #1-#3 here fundamentally beg the
> question of whether replicating the full heavyweight cmdq behaviour is
> the right way to go.
OK...looks like we'd have to provide some solid perf data here
for host-kernel use of VCMDQs (PATCH 1-4), or to wait until we
have a full-stack support covering guest use cases too.
> I appreciate you've probably got hardware validation teams on your back
> wanting "the driver" to support every new feature right now for them to
> exercise, but we just have to stand firm and tell them that's not how
> upstream works :)
Well..I'd expect that they'll just push me back to do whatever
I can to get the job done lol
Thank you!
Nic
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [PATCH v3 5/5] iommu/nvidia-grace-cmdqv: Limit CMDs for guest owned VINTF
@ 2021-12-28 5:49 ` Nicolin Chen via iommu
0 siblings, 0 replies; 51+ messages in thread
From: Nicolin Chen @ 2021-12-28 5:49 UTC (permalink / raw)
To: Robin Murphy
Cc: joro, will, jean-philippe, linux-kernel, iommu, linux-tegra,
thierry.reding, jgg, linux-arm-kernel
On Fri, Dec 24, 2021 at 12:13:57PM +0000, Robin Murphy wrote:
> > > > > > @@ -176,6 +177,24 @@ struct arm_smmu_cmdq *nvidia_grace_cmdqv_get_cmdq(struct arm_smmu_device *smmu)
> > > > > > if (!FIELD_GET(VINTF_STATUS, vintf0->status))
> > > > > > return &smmu->cmdq;
> > > > > >
> > > > > > + /* Check for supported CMDs if VINTF is owned by guest (not hypervisor) */
> > > > > > + if (!FIELD_GET(VINTF_HYP_OWN, vintf0->cfg)) {
> > > > > > + u64 opcode = (n) ? FIELD_GET(CMDQ_0_OP, cmds[0]) : CMDQ_OP_CMD_SYNC;
> > > > >
> > > > > I'm not sure there was ever a conscious design decision that batches
> > > > > only ever contain one type of command - if something needs to start
> > > >
> > > > Hmm, I think that's a good catch -- as it could be a potential
> > > > bug here. Though the SMMUv3 driver currently seems to use loop
> > > > by adding one type of cmds to any batch and submitting it right
> > > > away so checking opcode of cmds[0] alone seems to be sufficient
> > > > at this moment, yet it might not be so in the future. We'd need
> > > > to apply certain constraints on the type of cmds in the batch in
> > > > SMMUv3 driver upon smmu->nvidia_grace_cmdqv, or fall back to the
> > > > SMMUv3's CMDQ pathway here if one of cmds is not supported.
> > > >
> > > > > depending on that behaviour then that dependency probably wants to be
> > > > > clearly documented. Also, a sync on its own gets trapped to the main
> > > > > cmdq but a sync on the end of a batch of TLBIs or ATCIs goes to the
> > > > > VCMDQ, huh?
> > > >
> > > > Yea...looks like an implication again where cmds must have SYNC
> > > > at the end of the batch. I will see if any simple change can be
> > > > done to fix these two. If you have suggestions for them, I would
> > > > love to hear too.
> > >
> > > Can you explain the current logic here? It's not entirely clear to me
> > > whether the VCMDQ is actually meant to support CMD_SYNC or not.
> >
> > Yes. It's designed to take CMD_SYNC in same queue too. Though it
> > also has features, such as HW-inserted-SYNC when scheduler moves
> > away from the current queue or when the number of cmds in vcmdq
> > meets a MAX-BATCH-SIZE setting (in config register), yet it'd be
> > safer for software to ensure the CMD_SYNC is inserted to the end
> > of the batch.
>
> OK, so the bug here is just that we're missing CMDQ_OP_CMD_SYNC from the
> switch statement? That's reassuring at least. Having to trap to the host
> to issue a sync would be horrible, and largely defeat the point of the
> whole exercise.
Hmm..I'm not sure why we need CMD_SYNC in the switch statement.
I thought that you pointed out a potential corner case where a
batch could be submitted separately, e.g. Batch A {TLBI_NH_VAx2}
and then Batch B {CMD_SYNC}. Right now the SMMUv3 driver submits
all TLBI commands with sync=true, so we don't run into a problem
so far.
> It's not generally much use to software to know that the hardware may or
> may not have automatically inserted syncs at arbitrary points in the
> timeline; certainly for our flow in Linux, which I don't think is
> atypical, we need to know for sure that specific invalidation commands
> have completed before we can safely reuse resources associated with the
> invalidated translations, and the only way to guarantee that is to
> explicitly observe the consumption of a CMD_SYNC from a later queue index.
Hmm, if I understand it correctly, for the potential issue that I
listed above, we could simply ensure that each TLBI batch contains
TLBI commands only and has a CMD_SYNC at the end.
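To make that constraint concrete, here is a rough standalone sketch (not
code from the patch). It checks every command in a batch instead of only
cmds[0], and only accepts the batch for the VCMDQ when the caller also
asks for a trailing CMD_SYNC. The helper names vcmdq_supports_op() and
batch_fits_vcmdq() are hypothetical, the opcode values are quoted from
memory of arm-smmu-v3.h and should be treated as assumptions, and
routing a lone sync to the main cmdq is just one possible policy.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Opcode values as recalled from arm-smmu-v3.h; treat them as assumptions. */
#define CMDQ_OP_TLBI_NH_ASID	0x11
#define CMDQ_OP_TLBI_NH_VA	0x12
#define CMDQ_OP_TLBI_S12_VMALL	0x28
#define CMDQ_OP_TLBI_S2_IPA	0x2a
#define CMDQ_OP_ATC_INV		0x40

#define CMDQ_ENT_DWORDS		2	/* each command is two 64-bit words */
#define CMDQ_0_OP_MASK		0xffULL	/* opcode lives in bits [7:0] of word 0 */

/* Hypothetical helper: same list as the switch statement in PATCH-5. */
static bool vcmdq_supports_op(uint8_t opcode)
{
	switch (opcode) {
	case CMDQ_OP_TLBI_NH_ASID:
	case CMDQ_OP_TLBI_NH_VA:
	case CMDQ_OP_TLBI_S12_VMALL:
	case CMDQ_OP_TLBI_S2_IPA:
	case CMDQ_OP_ATC_INV:
		return true;
	default:
		return false;
	}
}

/*
 * Hypothetical helper: return true if the whole batch may go to a VCMDQ.
 * 'cmds' holds 'n' commands of CMDQ_ENT_DWORDS words each; 'sync' says
 * whether the caller wants a CMD_SYNC appended after the batch.
 */
static bool batch_fits_vcmdq(const uint64_t *cmds, size_t n, bool sync)
{
	size_t i;

	/* A lone sync, or a batch without a sync, takes the main cmdq. */
	if (n == 0 || !sync)
		return false;

	/* Inspect every command rather than only cmds[0]. */
	for (i = 0; i < n; i++) {
		uint8_t opcode = cmds[i * CMDQ_ENT_DWORDS] & CMDQ_0_OP_MASK;

		if (!vcmdq_supports_op(opcode))
			return false;
	}
	return true;
}
```

A caller would select the VCMDQ when batch_fits_vcmdq() returns true and
fall back to the SMMUv3 CMDQ otherwise, so a mixed batch never ends up
split across two queues.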
> > > > > > +
> > > > > > + /* List all supported CMDs for vintf->cmdq pathway */
> > > > > > + switch (opcode) {
> > > > > > + case CMDQ_OP_TLBI_NH_ASID:
> > > > > > + case CMDQ_OP_TLBI_NH_VA:
> > > > > > + case CMDQ_OP_TLBI_S12_VMALL:
> > > > > > + case CMDQ_OP_TLBI_S2_IPA:
> > > > >
> > > > > Fun! Can the guest invalidate any VMID it feels like, or is there some
> > > > > additional magic on the host side that we're missing here?
> > > >
> > > > Yes. VINTF has a register for SW to program VMID so that the HW
> > > > can replace VMIDs in the cmds in the VCMDQs of that VINTF with
> > > > the programmed VMID. That was the reason why we had numbers of
> > > > patches in v2 to route the VMID between guest and host.
> > > >
> > > > > > + case CMDQ_OP_ATC_INV:
> > > > > > + break;
> > > > > Ditto for StreamID here.
> > > >
> > > > Yes. StreamID works similarly by the HW: each VINTF provides us
> > > > 16 pairs of MATCH+REPLACE registers to program host and guest's
> > > > StreamIDs. Our previous mdev implementation in v2 can be a good
> > > > reference code:
> > > > https://lore.kernel.org/kvm/20210831101549.237151fa.alex.williamson@redhat.com/T/#m903a1b44935d9e0376439a0c63e832eb464fbaee
> > >
> > > Ah, sorry, I haven't had the bandwidth to dig back through all the
> > > previous threads. Thanks for clarifying - I'm still not sure why any
> > > notion of stage 2 would be exposed to guests at all, but at least it
> >
> > Do you mean, by "notion of stage 2", Host Stream IDs? The guest
> > wouldn't get those I think. They'll be trapped in the hypervisor
> > -- the user driver (QEMU CMDQV device model for example.)
>
> I mean if it's emulated as a full SMMUv3 interface, IDR0.S2P=0. At the
> moment it makes no sense for a guest to even *think* it can issue
> TLBI_S2_IPA or TLBI_S12_VMALL. My understanding of the usage model for
Ah..that's true. We've listed those by following the supported
command list from HW team. There might be no use case to cover
those.
> this is that we pick the Context Descriptor from guest memory via the
> emulated Stream Table (or other mechanism like virtio-iommu) and plumb
> it directly into the S1ContextPtr of the appropriate underlying physical
> STE, on top of the host's S2 translation. I don't see how we could also
Yea. VCMDQs are supposed to do TLB invalidation only. All other
commands should be going through ioctls (VFIO or IOMMUFD). What
we currently use for verification is Nesting patches from Eric,
yet the TLB invalidation would run into the VCMDQ pathway, as a
hardware acceleration.
> flatten an emulated S2 into either physical stage without having to go
> back to the costly "trap all pagetable accesses" approach which would
> obliterate the benefit of having a directly-assigned queue.
>
> > > sounds like there's no functional concern here, other than constraining
> > > the number of devices which can be assigned to a single VM, but I think
> > > that falls into the bucket of information that userspace VMMs will have
> > > to learn about this kind of direct IOMMU interface assignment anyway
> > > (most importantly, the relationship of assigned devices to vIOMMUs
> > > suddenly has to start reflecting the underlying physical topology).
> >
> > We haven't started to think about how best to fit into IOMMUFD,
> > but we will likely have some ideas or a test case in January.
> >
> > > Out of interest, would ATC_INV with an unmatched StreamID raise an error
> > > or just be ignored? Particularly if the host gets a chance to handle a
> >
> > Mismatched StreamID will be treated as an Illegal command. Yes,
> > there'd be an error.
> >
> > > GError and decide whether CMDQ_CONS.ERR is reported back to the guest or
> > > not, there's scope to do some interesting things for functionality and
> > > robustness.
> >
> > Would love to learn more about your thoughts :)
>
> Basically it's quite neat if we could present a virtual queue to the
> guest as the vSMMU's main queue, such that any commands that the
> hardware can't consume directly could be fixed up or emulated by the
> host with the illusion that they're being consumed as normal. It does
> push more complexity into the host, and a round trip via the GError
> interrupt would be a bit less efficient than trapping synchronously on a
> write to an emulated CMDQ_PROD for commands that *do* need emulating,
> but conversely it means we could support any guest with only the most
> basic understanding of SMMUv3.0, and could potentially be more robust
Wow...that's an interesting idea! So the host kernel could have
more queues to serve those trapping IRQs, although I am not
sure if IRQs, over 64 interfaces and 128 queues, would overwhelm
the host ISR... Can threaded interrupts from the same IRQ number
be served at the same time on different CPU cores? If so, multi-
queue like VCMDQs and ECMDQ might take advantage of that.
Just one concern here: we'll need to support multi VCMDQs on the
guest level too. So using the vSMMU's main queue slot may not be
sufficient for CMDQV use cases.
> overall. As I say, though, it depends entirely on the guest not being
> able to observe an error until the host has decided not to fix up the
> offending command.
Well, given that guest IRQs are raised by the user space driver,
I think we can have certain controls to support that.
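Just to check my understanding of the fix-up flow, a very rough model
follows. It is only a sketch under assumptions: struct vq_model,
host_can_emulate() and vq_handle_illegal_cmd() are hypothetical names,
the set of emulatable commands is a placeholder, and the CFGI opcode
values are quoted from memory of arm-smmu-v3.h. The point is only that
the guest-visible error flag is set solely when the host gives up.

```c
#include <stdbool.h>
#include <stdint.h>

/* Opcode values as recalled from arm-smmu-v3.h; treat them as assumptions. */
#define CMDQ_OP_CFGI_STE	0x03
#define CMDQ_OP_CFGI_CD		0x05
#define CMDQ_0_OP_MASK		0xffULL

/* Hypothetical model of one virtual queue as the host sees it. */
struct vq_model {
	const uint64_t *cmds;	/* two 64-bit words per entry */
	uint32_t nents;		/* number of entries in the queue */
	uint32_t cons;		/* index at which the HW stopped */
	bool guest_sees_err;	/* would CMDQ_CONS.ERR be shown to the guest? */
};

/* Placeholder policy: which commands the host is willing to emulate. */
static bool host_can_emulate(uint8_t opcode)
{
	return opcode == CMDQ_OP_CFGI_STE || opcode == CMDQ_OP_CFGI_CD;
}

/*
 * Called after the HW flags the command at q->cons as illegal.
 * Returns true if the host consumed the command on the guest's behalf,
 * false if the error has to be reported back to the guest.
 */
static bool vq_handle_illegal_cmd(struct vq_model *q)
{
	uint8_t opcode = q->cmds[q->cons * 2] & CMDQ_0_OP_MASK;

	if (!host_can_emulate(opcode)) {
		/* Only now does the guest get to observe the error. */
		q->guest_sees_err = true;
		return false;
	}

	/* The actual emulation would happen here (omitted in this sketch). */
	q->cons = (q->cons + 1) % q->nents;
	return true;
}
```

The user space driver would then raise the guest IRQ only when
guest_sees_err ends up set.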
> > Btw, I think we may continue the discussion on this PATCH-5 and
> > then to figure out ideal solutions for those potential bugs that
> > you commented so far, as this patch really is very introductory
> > to Guest support (we need more implementation based on IOMMUFD.)
> >
> > For the first 4 patches, they could be separated. Do you see a
> > chance to get them applied first? They have been on the mailing list
> > for a while now. And we'd like to accelerate the progress of those
> > four changes first.
>
> I can't speak for Will, but personally I'd consider them exactly the
> same as the ECMDQ patches - it's good to have them out here, reviewed as
> far as we reasonably can, and ready for people to experiment with as
> soon as the real hardware turns up, but I don't see any benefit in
> actually merging unproven complexity into mainline before then. Neither
> patchset gives Linux any new functionality that it can't achieve already
> with the regular cmdq, so there's nothing to gain until it's actually
> demonstrable that we really are addressing the right bottlenecks in the
> right manner to meaningfully improve real-world performance, but what we
> have to lose is more effort spent ripping stuff out again if it turns
> out to be no good. Even patches #1-#3 here fundamentally beg the
> question of whether replicating the full heavyweight cmdq behaviour is
> the right way to go.
OK...looks like we'd have to provide some solid perf data here
for host-kernel use of VCMDQs (PATCH 1-4), or to wait until we
have a full-stack support covering guest use cases too.
> I appreciate you've probably got hardware validation teams on your back
> wanting "the driver" to support every new feature right now for them to
> exercise, but we just have to stand firm and tell them that's not how
> upstream works :)
Well..I'd expect that they'll just push me back to do whatever
I can to get the job done lol
Thank you!
Nic
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
end of thread, other threads:[~2021-12-28 5:51 UTC | newest]
Thread overview: 51+ messages
2021-11-19 7:19 [PATCH v3 0/5] iommu/arm-smmu-v3: Add NVIDIA Grace CMDQ-V Support Nicolin Chen
2021-11-19 7:19 ` Nicolin Chen
2021-11-19 7:19 ` Nicolin Chen via iommu
2021-11-19 7:19 ` [PATCH v3 1/5] iommu/arm-smmu-v3: Add CS_NONE quirk Nicolin Chen
2021-11-19 7:19 ` Nicolin Chen
2021-11-19 7:19 ` Nicolin Chen via iommu
2021-11-19 7:19 ` [PATCH v3 2/5] iommu/arm-smmu-v3: Make arm_smmu_cmdq_init reusable Nicolin Chen
2021-11-19 7:19 ` Nicolin Chen
2021-11-19 7:19 ` Nicolin Chen via iommu
2021-11-19 7:19 ` [PATCH v3 3/5] iommu/arm-smmu-v3: Pass cmdq pointer in arm_smmu_cmdq_issue_cmdlist() Nicolin Chen
2021-11-19 7:19 ` Nicolin Chen
2021-11-19 7:19 ` Nicolin Chen via iommu
2021-11-19 7:19 ` [PATCH v3 4/5] iommu/arm-smmu-v3: Add host support for NVIDIA Grace CMDQ-V Nicolin Chen
2021-11-19 7:19 ` Nicolin Chen
2021-11-19 7:19 ` Nicolin Chen via iommu
2021-12-20 18:42 ` Robin Murphy
2021-12-20 18:42 ` Robin Murphy
2021-12-20 18:42 ` Robin Murphy
2021-12-20 19:27 ` Nicolin Chen
2021-12-20 19:27 ` Nicolin Chen
2021-12-20 19:27 ` Nicolin Chen via iommu
2021-12-21 18:55 ` Robin Murphy
2021-12-21 18:55 ` Robin Murphy
2021-12-21 18:55 ` Robin Murphy
2021-12-21 22:00 ` Nicolin Chen
2021-12-21 22:00 ` Nicolin Chen
2021-12-21 22:00 ` Nicolin Chen via iommu
2021-12-22 11:57 ` Robin Murphy
2021-12-22 11:57 ` Robin Murphy
2021-12-22 11:57 ` Robin Murphy
2021-11-19 7:19 ` [PATCH v3 5/5] iommu/nvidia-grace-cmdqv: Limit CMDs for guest owned VINTF Nicolin Chen
2021-11-19 7:19 ` Nicolin Chen
2021-11-19 7:19 ` Nicolin Chen via iommu
2021-12-22 12:32 ` Robin Murphy
2021-12-22 12:32 ` Robin Murphy
2021-12-22 12:32 ` Robin Murphy
2021-12-22 22:52 ` Nicolin Chen
2021-12-22 22:52 ` Nicolin Chen
2021-12-22 22:52 ` Nicolin Chen via iommu
2021-12-23 11:14 ` Robin Murphy
2021-12-23 11:14 ` Robin Murphy
2021-12-23 11:14 ` Robin Murphy
2021-12-24 8:02 ` Nicolin Chen
2021-12-24 8:02 ` Nicolin Chen
2021-12-24 8:02 ` Nicolin Chen via iommu
2021-12-24 12:13 ` Robin Murphy
2021-12-24 12:13 ` Robin Murphy
2021-12-24 12:13 ` Robin Murphy
2021-12-28 5:49 ` Nicolin Chen
2021-12-28 5:49 ` Nicolin Chen
2021-12-28 5:49 ` Nicolin Chen via iommu