All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/9] KFD dGPU initialization
@ 2018-01-04 22:17 Felix Kuehling
  2018-01-04 22:17   ` Felix Kuehling
                   ` (3 more replies)
  0 siblings, 4 replies; 47+ messages in thread
From: Felix Kuehling @ 2018-01-04 22:17 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Felix Kuehling

Remaining patches from the previous 37-patch series.

Patch 1: Reworked PCIe atomic patch with feedback from PCI maintainers
Patch 2-9: Rebased from previous series

CC-ed linux-pci@vger.kernel.org on relevant patches for context.

Felix Kuehling (8):
  drm/amdkfd: Conditionally enable PCIe atomics
  drm/amdkfd: Make IOMMUv2 code conditional
  drm/amdkfd: Make sched_policy a per-device setting
  drm/amdkfd: Add dGPU support to the device queue manager
  drm/amdkfd: Add dGPU support to the MQD manager
  drm/amdkfd: Add dGPU support to kernel_queue_init
  drm/amdkfd: Add dGPU device IDs and device info
  drm/amdgpu: Enable KFD initialization on dGPUs

Jay Cornwall (1):
  PCI: Add pci_enable_atomic_ops_to_root

 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c         |   5 +
 drivers/gpu/drm/amd/amdkfd/Kconfig                 |   2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c           |   3 +-
 drivers/gpu/drm/amd/amdkfd/kfd_crat.c              |   8 +-
 drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c            |   3 +-
 drivers/gpu/drm/amd/amdkfd/kfd_device.c            | 234 +++++++++++++++++++--
 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  |  33 ++-
 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.h  |   5 +
 .../drm/amd/amdkfd/kfd_device_queue_manager_cik.c  |  56 +++++
 .../drm/amd/amdkfd/kfd_device_queue_manager_vi.c   |  93 ++++++++
 drivers/gpu/drm/amd/amdkfd/kfd_events.c            |   2 +
 drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c      |   5 +
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c       |   7 +
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   |  35 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c    |  21 ++
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h              |  10 +
 drivers/gpu/drm/amd/amdkfd/kfd_process.c           |  17 +-
 .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |   3 +-
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c          |   2 +
 drivers/gpu/drm/amd/amdkfd/kfd_topology.h          |   2 +
 drivers/pci/pci.c                                  |  80 +++++++
 include/linux/pci.h                                |   1 +
 include/uapi/linux/pci_regs.h                      |   4 +-
 23 files changed, 592 insertions(+), 39 deletions(-)

-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH 1/9] PCI: Add pci_enable_atomic_ops_to_root
       [not found] ` <1515104268-25087-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2018-01-04 22:17   ` Felix Kuehling
  2018-01-04 22:17   ` [PATCH 4/9] drm/amdkfd: Make sched_policy a per-device setting Felix Kuehling
                     ` (5 subsequent siblings)
  6 siblings, 0 replies; 47+ messages in thread
From: Felix Kuehling @ 2018-01-04 22:17 UTC (permalink / raw)
  To: amd-gfx, oded.gabbay; +Cc: Jay Cornwall, linux-pci, Felix Kuehling

From: Jay Cornwall <Jay.Cornwall@amd.com>

The PCIe 3.0 AtomicOp (6.15) feature allows atomic transctions to be
requested by, routed through and completed by PCIe components. Routing and
completion do not require software support. Component support for each is
detectable via the DEVCAP2 register.

AtomicOp requests are permitted only if a component's
DEVCTL2.ATOMICOP_REQUESTER_ENABLE field is set. This capability cannot be
detected but is a no-op if set on a component with no support. These
requests can only be serviced if the upstream components support AtomicOp
completion and/or routing to a component which does.

A concrete example is the AMD Fiji-class GPU, which is specified to
support AtomicOp requests, routed through a PLX 8747 switch (advertising
AtomicOp routing) to a Haswell host bridge (advertising AtomicOp
completion support). When AtomicOp requests are disabled the GPU logs
attempts to initiate requests to an MMIO register for debugging.

Add pci_enable_atomic_ops_to_root for per-device control over AtomicOp
requests. Upstream bridges are checked for AtomicOp routing capability and
the call fails if any lack this capability. The root port is checked for
AtomicOp completion capabilities and the call fails if it does not support
any. Routes to other PCIe components are not checked for AtomicOp routing
and completion capabilities.

v2: Check for AtomicOp route to root port with AtomicOp completion
v3: Style fixes
v4: Endpoint to root port only, check upstream egress blocking
v5: Rebase, use existing PCI_EXP_DEVCTL2_ATOMIC_EGRESS_BLOCK define
v6: Add comp_caps param, fix upstream port detection, cosmetic/comments

CC: linux-pci@vger.kernel.org
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/pci/pci.c             | 80 +++++++++++++++++++++++++++++++++++++++++++
 include/linux/pci.h           |  1 +
 include/uapi/linux/pci_regs.h |  4 ++-
 3 files changed, 84 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 4a7c686..9cea399 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3066,6 +3066,86 @@ int pci_rebar_set_size(struct pci_dev *pdev, int bar, int size)
 }
 
 /**
+ * pci_enable_atomic_ops_to_root - enable AtomicOp requests to root port
+ * @dev: the PCI device
+ * @comp_caps: Caps required for atomic request completion
+ *
+ * Return 0 if all upstream bridges support AtomicOp routing, egress
+ * blocking is disabled on all upstream ports, and the root port
+ * supports the requested completion capabilities (32-bit, 64-bit
+ * and/or 128-bit AtomicOp completion), or negative otherwise.
+ */
+int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 comp_caps)
+{
+	struct pci_bus *bus = dev->bus;
+
+	if (!pci_is_pcie(dev))
+		return -EINVAL;
+
+	switch (pci_pcie_type(dev)) {
+	/*
+	 * PCIe 3.0, 6.15 specifies that endpoints and root ports are permitted
+	 * to implement AtomicOp requester capabilities.
+	 */
+	case PCI_EXP_TYPE_ENDPOINT:
+	case PCI_EXP_TYPE_LEG_END:
+	case PCI_EXP_TYPE_RC_END:
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	while (bus->parent) {
+		struct pci_dev *bridge = bus->self;
+		u32 cap;
+
+		pcie_capability_read_dword(bridge, PCI_EXP_DEVCAP2, &cap);
+
+		switch (pci_pcie_type(bridge)) {
+		/*
+		 * Upstream, downstream and root ports may implement AtomicOp
+		 * routing capabilities. AtomicOp routing via a root port is
+		 * not considered.
+		 */
+		case PCI_EXP_TYPE_UPSTREAM:
+		case PCI_EXP_TYPE_DOWNSTREAM:
+			if (!(cap & PCI_EXP_DEVCAP2_ATOMIC_ROUTE))
+				return -EINVAL;
+			break;
+
+		/*
+		 * Root ports are permitted to implement AtomicOp completion
+		 * capabilities.
+		 */
+		case PCI_EXP_TYPE_ROOT_PORT:
+			if ((cap & comp_caps) != comp_caps)
+				return -EINVAL;
+			break;
+		}
+
+		/*
+		 * Upstream ports may block AtomicOps on egress.
+		 */
+		if (!bridge->has_secondary_link) {
+			u32 ctl2;
+
+			pcie_capability_read_dword(bridge, PCI_EXP_DEVCTL2,
+						   &ctl2);
+			if (ctl2 & PCI_EXP_DEVCTL2_ATOMIC_EGRESS_BLOCK)
+				return -EINVAL;
+		}
+
+		bus = bus->parent;
+	}
+
+	pcie_capability_set_word(dev, PCI_EXP_DEVCTL2,
+				 PCI_EXP_DEVCTL2_ATOMIC_REQ);
+
+	return 0;
+}
+EXPORT_SYMBOL(pci_enable_atomic_ops_to_root);
+
+/**
  * pci_swizzle_interrupt_pin - swizzle INTx for device behind bridge
  * @dev: the PCI device
  * @pin: the INTx pin (1=INTA, 2=INTB, 3=INTC, 4=INTD)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index c170c92..52a17754 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -2061,6 +2061,7 @@ void pci_request_acs(void);
 bool pci_acs_enabled(struct pci_dev *pdev, u16 acs_flags);
 bool pci_acs_path_enabled(struct pci_dev *start,
 			  struct pci_dev *end, u16 acs_flags);
+int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 comp_caps);
 
 #define PCI_VPD_LRDT			0x80	/* Large Resource Data Type */
 #define PCI_VPD_LRDT_ID(x)		((x) | PCI_VPD_LRDT)
diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index 70c2b2a..f31b56b 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -624,7 +624,9 @@
 #define PCI_EXP_DEVCAP2		36	/* Device Capabilities 2 */
 #define  PCI_EXP_DEVCAP2_ARI		0x00000020 /* Alternative Routing-ID */
 #define  PCI_EXP_DEVCAP2_ATOMIC_ROUTE	0x00000040 /* Atomic Op routing */
-#define PCI_EXP_DEVCAP2_ATOMIC_COMP64	0x00000100 /* Atomic 64-bit compare */
+#define  PCI_EXP_DEVCAP2_ATOMIC_COMP32	0x00000080 /* 32b AtomicOp completion */
+#define  PCI_EXP_DEVCAP2_ATOMIC_COMP64	0x00000100 /* 64b AtomicOp completion */
+#define  PCI_EXP_DEVCAP2_ATOMIC_COMP128	0x00000200 /* 128b AtomicOp completion */
 #define  PCI_EXP_DEVCAP2_LTR		0x00000800 /* Latency tolerance reporting */
 #define  PCI_EXP_DEVCAP2_OBFF_MASK	0x000c0000 /* OBFF support mechanism */
 #define  PCI_EXP_DEVCAP2_OBFF_MSG	0x00040000 /* New message signaling */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 1/9] PCI: Add pci_enable_atomic_ops_to_root
@ 2018-01-04 22:17   ` Felix Kuehling
  0 siblings, 0 replies; 47+ messages in thread
From: Felix Kuehling @ 2018-01-04 22:17 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: linux-pci-u79uwXL29TY76Z2rM5mHXA, Felix Kuehling, Jay Cornwall

From: Jay Cornwall <Jay.Cornwall@amd.com>

The PCIe 3.0 AtomicOp (6.15) feature allows atomic transctions to be
requested by, routed through and completed by PCIe components. Routing and
completion do not require software support. Component support for each is
detectable via the DEVCAP2 register.

AtomicOp requests are permitted only if a component's
DEVCTL2.ATOMICOP_REQUESTER_ENABLE field is set. This capability cannot be
detected but is a no-op if set on a component with no support. These
requests can only be serviced if the upstream components support AtomicOp
completion and/or routing to a component which does.

A concrete example is the AMD Fiji-class GPU, which is specified to
support AtomicOp requests, routed through a PLX 8747 switch (advertising
AtomicOp routing) to a Haswell host bridge (advertising AtomicOp
completion support). When AtomicOp requests are disabled the GPU logs
attempts to initiate requests to an MMIO register for debugging.

Add pci_enable_atomic_ops_to_root for per-device control over AtomicOp
requests. Upstream bridges are checked for AtomicOp routing capability and
the call fails if any lack this capability. The root port is checked for
AtomicOp completion capabilities and the call fails if it does not support
any. Routes to other PCIe components are not checked for AtomicOp routing
and completion capabilities.

v2: Check for AtomicOp route to root port with AtomicOp completion
v3: Style fixes
v4: Endpoint to root port only, check upstream egress blocking
v5: Rebase, use existing PCI_EXP_DEVCTL2_ATOMIC_EGRESS_BLOCK define
v6: Add comp_caps param, fix upstream port detection, cosmetic/comments

CC: linux-pci@vger.kernel.org
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/pci/pci.c             | 80 +++++++++++++++++++++++++++++++++++++++++++
 include/linux/pci.h           |  1 +
 include/uapi/linux/pci_regs.h |  4 ++-
 3 files changed, 84 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 4a7c686..9cea399 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3066,6 +3066,86 @@ int pci_rebar_set_size(struct pci_dev *pdev, int bar, int size)
 }
 
 /**
+ * pci_enable_atomic_ops_to_root - enable AtomicOp requests to root port
+ * @dev: the PCI device
+ * @comp_caps: Caps required for atomic request completion
+ *
+ * Return 0 if all upstream bridges support AtomicOp routing, egress
+ * blocking is disabled on all upstream ports, and the root port
+ * supports the requested completion capabilities (32-bit, 64-bit
+ * and/or 128-bit AtomicOp completion), or negative otherwise.
+ */
+int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 comp_caps)
+{
+	struct pci_bus *bus = dev->bus;
+
+	if (!pci_is_pcie(dev))
+		return -EINVAL;
+
+	switch (pci_pcie_type(dev)) {
+	/*
+	 * PCIe 3.0, 6.15 specifies that endpoints and root ports are permitted
+	 * to implement AtomicOp requester capabilities.
+	 */
+	case PCI_EXP_TYPE_ENDPOINT:
+	case PCI_EXP_TYPE_LEG_END:
+	case PCI_EXP_TYPE_RC_END:
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	while (bus->parent) {
+		struct pci_dev *bridge = bus->self;
+		u32 cap;
+
+		pcie_capability_read_dword(bridge, PCI_EXP_DEVCAP2, &cap);
+
+		switch (pci_pcie_type(bridge)) {
+		/*
+		 * Upstream, downstream and root ports may implement AtomicOp
+		 * routing capabilities. AtomicOp routing via a root port is
+		 * not considered.
+		 */
+		case PCI_EXP_TYPE_UPSTREAM:
+		case PCI_EXP_TYPE_DOWNSTREAM:
+			if (!(cap & PCI_EXP_DEVCAP2_ATOMIC_ROUTE))
+				return -EINVAL;
+			break;
+
+		/*
+		 * Root ports are permitted to implement AtomicOp completion
+		 * capabilities.
+		 */
+		case PCI_EXP_TYPE_ROOT_PORT:
+			if ((cap & comp_caps) != comp_caps)
+				return -EINVAL;
+			break;
+		}
+
+		/*
+		 * Upstream ports may block AtomicOps on egress.
+		 */
+		if (!bridge->has_secondary_link) {
+			u32 ctl2;
+
+			pcie_capability_read_dword(bridge, PCI_EXP_DEVCTL2,
+						   &ctl2);
+			if (ctl2 & PCI_EXP_DEVCTL2_ATOMIC_EGRESS_BLOCK)
+				return -EINVAL;
+		}
+
+		bus = bus->parent;
+	}
+
+	pcie_capability_set_word(dev, PCI_EXP_DEVCTL2,
+				 PCI_EXP_DEVCTL2_ATOMIC_REQ);
+
+	return 0;
+}
+EXPORT_SYMBOL(pci_enable_atomic_ops_to_root);
+
+/**
  * pci_swizzle_interrupt_pin - swizzle INTx for device behind bridge
  * @dev: the PCI device
  * @pin: the INTx pin (1=INTA, 2=INTB, 3=INTC, 4=INTD)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index c170c92..52a17754 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -2061,6 +2061,7 @@ void pci_request_acs(void);
 bool pci_acs_enabled(struct pci_dev *pdev, u16 acs_flags);
 bool pci_acs_path_enabled(struct pci_dev *start,
 			  struct pci_dev *end, u16 acs_flags);
+int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 comp_caps);
 
 #define PCI_VPD_LRDT			0x80	/* Large Resource Data Type */
 #define PCI_VPD_LRDT_ID(x)		((x) | PCI_VPD_LRDT)
diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index 70c2b2a..f31b56b 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -624,7 +624,9 @@
 #define PCI_EXP_DEVCAP2		36	/* Device Capabilities 2 */
 #define  PCI_EXP_DEVCAP2_ARI		0x00000020 /* Alternative Routing-ID */
 #define  PCI_EXP_DEVCAP2_ATOMIC_ROUTE	0x00000040 /* Atomic Op routing */
-#define PCI_EXP_DEVCAP2_ATOMIC_COMP64	0x00000100 /* Atomic 64-bit compare */
+#define  PCI_EXP_DEVCAP2_ATOMIC_COMP32	0x00000080 /* 32b AtomicOp completion */
+#define  PCI_EXP_DEVCAP2_ATOMIC_COMP64	0x00000100 /* 64b AtomicOp completion */
+#define  PCI_EXP_DEVCAP2_ATOMIC_COMP128	0x00000200 /* 128b AtomicOp completion */
 #define  PCI_EXP_DEVCAP2_LTR		0x00000800 /* Latency tolerance reporting */
 #define  PCI_EXP_DEVCAP2_OBFF_MASK	0x000c0000 /* OBFF support mechanism */
 #define  PCI_EXP_DEVCAP2_OBFF_MSG	0x00040000 /* New message signaling */
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 2/9] drm/amdkfd: Conditionally enable PCIe atomics
  2018-01-04 22:17 [PATCH 0/9] KFD dGPU initialization Felix Kuehling
  2018-01-04 22:17   ` Felix Kuehling
@ 2018-01-04 22:17 ` Felix Kuehling
  2018-01-31 15:09     ` Oded Gabbay
       [not found] ` <1515104268-25087-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2018-01-04 22:17 ` [PATCH 8/9] drm/amdkfd: Add dGPU device IDs and device info Felix Kuehling
  3 siblings, 1 reply; 47+ messages in thread
From: Felix Kuehling @ 2018-01-04 22:17 UTC (permalink / raw)
  To: amd-gfx, oded.gabbay; +Cc: Felix Kuehling, linux-pci

This will be needed for most dGPUs.

CC: linux-pci@vger.kernel.org
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c | 17 +++++++++++++++++
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h   |  1 +
 2 files changed, 18 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index a8fa33a..fafe971 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -41,6 +41,7 @@ static const struct kfd_device_info kaveri_device_info = {
 	.num_of_watch_points = 4,
 	.mqd_size_aligned = MQD_SIZE_ALIGNED,
 	.supports_cwsr = false,
+	.needs_pci_atomics = false,
 };
 
 static const struct kfd_device_info carrizo_device_info = {
@@ -53,6 +54,7 @@ static const struct kfd_device_info carrizo_device_info = {
 	.num_of_watch_points = 4,
 	.mqd_size_aligned = MQD_SIZE_ALIGNED,
 	.supports_cwsr = true,
+	.needs_pci_atomics = false,
 };
 
 struct kfd_deviceid {
@@ -127,6 +129,21 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
 		return NULL;
 	}
 
+	if (device_info->needs_pci_atomics) {
+		/* Allow BIF to recode atomics to PCIe 3.0
+		 * AtomicOps. 32 and 64-bit requests are possible and
+		 * must be supported.
+		 */
+		if (pci_enable_atomic_ops_to_root(pdev,
+				PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
+				PCI_EXP_DEVCAP2_ATOMIC_COMP64) < 0) {
+			dev_info(kfd_device,
+				"skipped device %x:%x, PCI rejects atomics",
+				 pdev->vendor, pdev->device);
+			return NULL;
+		}
+	}
+
 	kfd = kzalloc(sizeof(*kfd), GFP_KERNEL);
 	if (!kfd)
 		return NULL;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 6a48d29..eebfb1e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -158,6 +158,7 @@ struct kfd_device_info {
 	uint8_t num_of_watch_points;
 	uint16_t mqd_size_aligned;
 	bool supports_cwsr;
+	bool needs_pci_atomics;
 };
 
 struct kfd_mem_obj {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 3/9] drm/amdkfd: Make IOMMUv2 code conditional
       [not found] ` <1515104268-25087-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2018-01-04 22:17   ` Felix Kuehling
       [not found]     ` <1515104268-25087-4-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2018-01-04 22:17   ` [PATCH 4/9] drm/amdkfd: Make sched_policy a per-device setting Felix Kuehling
                     ` (5 subsequent siblings)
  6 siblings, 1 reply; 47+ messages in thread
From: Felix Kuehling @ 2018-01-04 22:17 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Felix Kuehling

dGPUs work without IOMMUv2. Make IOMMUv2 initialization dependent on
ASIC information. Also allow building KFD without IOMMUv2 support.
This is still useful for dGPUs and prepares for enabling KFD on
architectures that don't support AMD IOMMUv2.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/Kconfig        |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_crat.c     |  8 +++-
 drivers/gpu/drm/amd/amdkfd/kfd_device.c   | 62 +++++++++++++++++++++----------
 drivers/gpu/drm/amd/amdkfd/kfd_events.c   |  2 +
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h     |  5 +++
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  | 17 ++++++---
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c |  2 +
 drivers/gpu/drm/amd/amdkfd/kfd_topology.h |  2 +
 8 files changed, 74 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/Kconfig b/drivers/gpu/drm/amd/amdkfd/Kconfig
index bc5a294..5bbeb95 100644
--- a/drivers/gpu/drm/amd/amdkfd/Kconfig
+++ b/drivers/gpu/drm/amd/amdkfd/Kconfig
@@ -4,6 +4,6 @@
 
 config HSA_AMD
 	tristate "HSA kernel driver for AMD GPU devices"
-	depends on DRM_AMDGPU && AMD_IOMMU_V2 && X86_64
+	depends on DRM_AMDGPU && X86_64
 	help
 	  Enable this if you want to use HSA features on AMD GPU devices.
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
index 2bc2816..3478270 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
@@ -22,7 +22,9 @@
 
 #include <linux/pci.h>
 #include <linux/acpi.h>
+#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
 #include <linux/amd-iommu.h>
+#endif
 #include "kfd_crat.h"
 #include "kfd_priv.h"
 #include "kfd_topology.h"
@@ -1037,15 +1039,17 @@ static int kfd_create_vcrat_image_gpu(void *pcrat_image,
 	struct crat_subtype_generic *sub_type_hdr;
 	struct crat_subtype_computeunit *cu;
 	struct kfd_cu_info cu_info;
-	struct amd_iommu_device_info iommu_info;
 	int avail_size = *size;
 	uint32_t total_num_of_cu;
 	int num_of_cache_entries = 0;
 	int cache_mem_filled = 0;
 	int ret = 0;
+#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
+	struct amd_iommu_device_info iommu_info;
 	const u32 required_iommu_flags = AMD_IOMMU_DEVICE_FLAG_ATS_SUP |
 					 AMD_IOMMU_DEVICE_FLAG_PRI_SUP |
 					 AMD_IOMMU_DEVICE_FLAG_PASID_SUP;
+#endif
 	struct kfd_local_mem_info local_mem_info;
 
 	if (!pcrat_image || avail_size < VCRAT_SIZE_FOR_GPU)
@@ -1106,12 +1110,14 @@ static int kfd_create_vcrat_image_gpu(void *pcrat_image,
 	/* Check if this node supports IOMMU. During parsing this flag will
 	 * translate to HSA_CAP_ATS_PRESENT
 	 */
+#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
 	iommu_info.flags = 0;
 	if (amd_iommu_device_info(kdev->pdev, &iommu_info) == 0) {
 		if ((iommu_info.flags & required_iommu_flags) ==
 				required_iommu_flags)
 			cu->hsa_capability |= CRAT_CU_FLAGS_IOMMU_PRESENT;
 	}
+#endif
 
 	crat_table->length += sub_type_hdr->length;
 	crat_table->total_entries++;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index fafe971..5205b34 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -20,7 +20,9 @@
  * OTHER DEALINGS IN THE SOFTWARE.
  */
 
+#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
 #include <linux/amd-iommu.h>
+#endif
 #include <linux/bsearch.h>
 #include <linux/pci.h>
 #include <linux/slab.h>
@@ -31,6 +33,7 @@
 
 #define MQD_SIZE_ALIGNED 768
 
+#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
 static const struct kfd_device_info kaveri_device_info = {
 	.asic_family = CHIP_KAVERI,
 	.max_pasid_bits = 16,
@@ -41,6 +44,7 @@ static const struct kfd_device_info kaveri_device_info = {
 	.num_of_watch_points = 4,
 	.mqd_size_aligned = MQD_SIZE_ALIGNED,
 	.supports_cwsr = false,
+	.needs_iommu_device = true,
 	.needs_pci_atomics = false,
 };
 
@@ -54,8 +58,10 @@ static const struct kfd_device_info carrizo_device_info = {
 	.num_of_watch_points = 4,
 	.mqd_size_aligned = MQD_SIZE_ALIGNED,
 	.supports_cwsr = true,
+	.needs_iommu_device = true,
 	.needs_pci_atomics = false,
 };
+#endif
 
 struct kfd_deviceid {
 	unsigned short did;
@@ -64,6 +70,7 @@ struct kfd_deviceid {
 
 /* Please keep this sorted by increasing device id. */
 static const struct kfd_deviceid supported_devices[] = {
+#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
 	{ 0x1304, &kaveri_device_info },	/* Kaveri */
 	{ 0x1305, &kaveri_device_info },	/* Kaveri */
 	{ 0x1306, &kaveri_device_info },	/* Kaveri */
@@ -91,6 +98,7 @@ static const struct kfd_deviceid supported_devices[] = {
 	{ 0x9875, &carrizo_device_info },	/* Carrizo */
 	{ 0x9876, &carrizo_device_info },	/* Carrizo */
 	{ 0x9877, &carrizo_device_info }	/* Carrizo */
+#endif
 };
 
 static int kfd_gtt_sa_init(struct kfd_dev *kfd, unsigned int buf_size,
@@ -161,6 +169,7 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
 	return kfd;
 }
 
+#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
 static bool device_iommu_pasid_init(struct kfd_dev *kfd)
 {
 	const u32 required_iommu_flags = AMD_IOMMU_DEVICE_FLAG_ATS_SUP |
@@ -231,6 +240,7 @@ static int iommu_invalid_ppr_cb(struct pci_dev *pdev, int pasid,
 
 	return AMD_IOMMU_INV_PRI_RSP_INVALID;
 }
+#endif /* CONFIG_AMD_IOMMU_V2 */
 
 static void kfd_cwsr_init(struct kfd_dev *kfd)
 {
@@ -321,12 +331,14 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 		goto device_queue_manager_error;
 	}
 
-	if (!device_iommu_pasid_init(kfd)) {
-		dev_err(kfd_device,
-			"Error initializing iommuv2 for device %x:%x\n",
-			kfd->pdev->vendor, kfd->pdev->device);
-		goto device_iommu_pasid_error;
+#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
+	if (kfd->device_info->needs_iommu_device) {
+		if (!device_iommu_pasid_init(kfd)) {
+			dev_err(kfd_device, "Error initializing iommuv2\n");
+			goto device_iommu_pasid_error;
+		}
 	}
+#endif
 
 	kfd_cwsr_init(kfd);
 
@@ -386,11 +398,16 @@ void kgd2kfd_suspend(struct kfd_dev *kfd)
 
 	kfd->dqm->ops.stop(kfd->dqm);
 
+#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
+	if (!kfd->device_info->needs_iommu_device)
+		return;
+
 	kfd_unbind_processes_from_device(kfd);
 
 	amd_iommu_set_invalidate_ctx_cb(kfd->pdev, NULL);
 	amd_iommu_set_invalid_ppr_cb(kfd->pdev, NULL);
 	amd_iommu_free_device(kfd->pdev);
+#endif
 }
 
 int kgd2kfd_resume(struct kfd_dev *kfd)
@@ -405,19 +422,24 @@ int kgd2kfd_resume(struct kfd_dev *kfd)
 static int kfd_resume(struct kfd_dev *kfd)
 {
 	int err = 0;
-	unsigned int pasid_limit = kfd_get_pasid_limit();
 
-	err = amd_iommu_init_device(kfd->pdev, pasid_limit);
-	if (err)
-		return -ENXIO;
-	amd_iommu_set_invalidate_ctx_cb(kfd->pdev,
-					iommu_pasid_shutdown_callback);
-	amd_iommu_set_invalid_ppr_cb(kfd->pdev,
-				     iommu_invalid_ppr_cb);
-
-	err = kfd_bind_processes_to_device(kfd);
-	if (err)
-		goto processes_bind_error;
+#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
+	if (kfd->device_info->needs_iommu_device) {
+		unsigned int pasid_limit = kfd_get_pasid_limit();
+
+		err = amd_iommu_init_device(kfd->pdev, pasid_limit);
+		if (err)
+			return -ENXIO;
+		amd_iommu_set_invalidate_ctx_cb(kfd->pdev,
+						iommu_pasid_shutdown_callback);
+		amd_iommu_set_invalid_ppr_cb(kfd->pdev,
+					     iommu_invalid_ppr_cb);
+
+		err = kfd_bind_processes_to_device(kfd);
+		if (err)
+			goto processes_bind_error;
+	}
+#endif
 
 	err = kfd->dqm->ops.start(kfd->dqm);
 	if (err) {
@@ -431,8 +453,10 @@ static int kfd_resume(struct kfd_dev *kfd)
 
 dqm_start_error:
 processes_bind_error:
-	amd_iommu_free_device(kfd->pdev);
-
+#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
+	if (kfd->device_info->needs_iommu_device)
+		amd_iommu_free_device(kfd->pdev);
+#endif
 	return err;
 }
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
index 93aae5c..f770dc7 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
@@ -837,6 +837,7 @@ static void lookup_events_by_type_and_signal(struct kfd_process *p,
 	}
 }
 
+#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
 void kfd_signal_iommu_event(struct kfd_dev *dev, unsigned int pasid,
 		unsigned long address, bool is_write_requested,
 		bool is_execute_requested)
@@ -905,6 +906,7 @@ void kfd_signal_iommu_event(struct kfd_dev *dev, unsigned int pasid,
 	mutex_unlock(&p->event_mutex);
 	kfd_unref_process(p);
 }
+#endif /* CONFIG_AMD_IOMMU_V2_MODULE */
 
 void kfd_signal_hw_exception_event(unsigned int pasid)
 {
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index eebfb1e..9f4766c 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -158,6 +158,7 @@ struct kfd_device_info {
 	uint8_t num_of_watch_points;
 	uint16_t mqd_size_aligned;
 	bool supports_cwsr;
+	bool needs_iommu_device;
 	bool needs_pci_atomics;
 };
 
@@ -617,9 +618,11 @@ void kfd_unref_process(struct kfd_process *p);
 
 struct kfd_process_device *kfd_bind_process_to_device(struct kfd_dev *dev,
 						struct kfd_process *p);
+#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
 int kfd_bind_processes_to_device(struct kfd_dev *dev);
 void kfd_unbind_processes_from_device(struct kfd_dev *dev);
 void kfd_process_iommu_unbind_callback(struct kfd_dev *dev, unsigned int pasid);
+#endif
 struct kfd_process_device *kfd_get_process_device_data(struct kfd_dev *dev,
 							struct kfd_process *p);
 struct kfd_process_device *kfd_create_process_device_data(struct kfd_dev *dev,
@@ -784,9 +787,11 @@ int kfd_wait_on_events(struct kfd_process *p,
 		       uint32_t *wait_result);
 void kfd_signal_event_interrupt(unsigned int pasid, uint32_t partial_id,
 				uint32_t valid_id_bits);
+#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
 void kfd_signal_iommu_event(struct kfd_dev *dev,
 		unsigned int pasid, unsigned long address,
 		bool is_write_requested, bool is_execute_requested);
+#endif
 void kfd_signal_hw_exception_event(unsigned int pasid);
 int kfd_set_event(struct kfd_process *p, uint32_t event_id);
 int kfd_reset_event(struct kfd_process *p, uint32_t event_id);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index a22fb071..1d0e02c 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -173,14 +173,17 @@ static void kfd_process_wq_release(struct work_struct *work)
 {
 	struct kfd_process *p = container_of(work, struct kfd_process,
 					     release_work);
+#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
 	struct kfd_process_device *pdd;
 
 	pr_debug("Releasing process (pasid %d) in workqueue\n", p->pasid);
 
 	list_for_each_entry(pdd, &p->per_device_data, per_device_list) {
-		if (pdd->bound == PDD_BOUND)
+		if (pdd->bound == PDD_BOUND &&
+		    pdd->dev->device_info->needs_iommu_device)
 			amd_iommu_unbind_pasid(pdd->dev->pdev, p->pasid);
 	}
+#endif
 
 	kfd_process_destroy_pdds(p);
 
@@ -421,7 +424,6 @@ struct kfd_process_device *kfd_bind_process_to_device(struct kfd_dev *dev,
 							struct kfd_process *p)
 {
 	struct kfd_process_device *pdd;
-	int err;
 
 	pdd = kfd_get_process_device_data(dev, p);
 	if (!pdd) {
@@ -436,9 +438,14 @@ struct kfd_process_device *kfd_bind_process_to_device(struct kfd_dev *dev,
 		return ERR_PTR(-EINVAL);
 	}
 
-	err = amd_iommu_bind_pasid(dev->pdev, p->pasid, p->lead_thread);
-	if (err < 0)
-		return ERR_PTR(err);
+#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
+	if (dev->device_info->needs_iommu_device) {
+		int err = amd_iommu_bind_pasid(dev->pdev, p->pasid,
+					       p->lead_thread);
+		if (err < 0)
+			return ERR_PTR(err);
+	}
+#endif
 
 	pdd->bound = PDD_BOUND;
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
index c6a7609..f57c305 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
@@ -875,6 +875,7 @@ static void find_system_memory(const struct dmi_header *dm,
  */
 static int kfd_add_perf_to_topology(struct kfd_topology_device *kdev)
 {
+#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
 	struct kfd_perf_properties *props;
 
 	if (amd_iommu_pc_supported()) {
@@ -886,6 +887,7 @@ static int kfd_add_perf_to_topology(struct kfd_topology_device *kdev)
 			amd_iommu_pc_get_max_counters(0); /* assume one iommu */
 		list_add_tail(&props->list, &kdev->perf_props);
 	}
+#endif
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
index 53fca1f..111fda2 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
@@ -183,8 +183,10 @@ struct kfd_topology_device *kfd_create_topology_device(
 		struct list_head *device_list);
 void kfd_release_topology_device_list(struct list_head *device_list);
 
+#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
 extern bool amd_iommu_pc_supported(void);
 extern u8 amd_iommu_pc_get_max_banks(u16 devid);
 extern u8 amd_iommu_pc_get_max_counters(u16 devid);
+#endif
 
 #endif /* __KFD_TOPOLOGY_H__ */
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 4/9] drm/amdkfd: Make sched_policy a per-device setting
       [not found] ` <1515104268-25087-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2018-01-04 22:17   ` [PATCH 3/9] drm/amdkfd: Make IOMMUv2 code conditional Felix Kuehling
@ 2018-01-04 22:17   ` Felix Kuehling
       [not found]     ` <1515104268-25087-5-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2018-01-04 22:17   ` [PATCH 5/9] drm/amdkfd: Add dGPU support to the device queue manager Felix Kuehling
                     ` (4 subsequent siblings)
  6 siblings, 1 reply; 47+ messages in thread
From: Felix Kuehling @ 2018-01-04 22:17 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Felix Kuehling

Some dGPUs don't support HWS. Allow them to use a per-device
sched_policy that may be different from the global default.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c           |  3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c            |  3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_device.c            |  2 +-
 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 22 +++++++++++++++++++---
 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.h  |  1 +
 .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |  3 ++-
 6 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 62c3d9c..6fe2496 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -901,7 +901,8 @@ static int kfd_ioctl_set_scratch_backing_va(struct file *filep,
 
 	mutex_unlock(&p->mutex);
 
-	if (sched_policy == KFD_SCHED_POLICY_NO_HWS && pdd->qpd.vmid != 0)
+	if (dev->dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS &&
+	    pdd->qpd.vmid != 0)
 		dev->kfd2kgd->set_scratch_backing_va(
 			dev->kgd, args->va_addr, pdd->qpd.vmid);
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
index 3da25f7..9d4af96 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
@@ -33,6 +33,7 @@
 #include "kfd_pm4_headers_diq.h"
 #include "kfd_dbgmgr.h"
 #include "kfd_dbgdev.h"
+#include "kfd_device_queue_manager.h"
 
 static DEFINE_MUTEX(kfd_dbgmgr_mutex);
 
@@ -83,7 +84,7 @@ bool kfd_dbgmgr_create(struct kfd_dbgmgr **ppmgr, struct kfd_dev *pdev)
 	}
 
 	/* get actual type of DBGDevice cpsch or not */
-	if (sched_policy == KFD_SCHED_POLICY_NO_HWS)
+	if (pdev->dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS)
 		type = DBGDEV_TYPE_NODIQ;
 
 	kfd_dbgdev_init(new_buff->dbgdev, pdev, type);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 5205b34..6dd50cc 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -352,7 +352,7 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 		 kfd->pdev->device);
 
 	pr_debug("Starting kfd with the following scheduling policy %d\n",
-		sched_policy);
+		kfd->dqm->sched_policy);
 
 	goto out;
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index d0693fd..3e2f53b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -385,7 +385,7 @@ static int update_queue(struct device_queue_manager *dqm, struct queue *q)
 	prev_active = q->properties.is_active;
 
 	/* Make sure the queue is unmapped before updating the MQD */
-	if (sched_policy != KFD_SCHED_POLICY_NO_HWS) {
+	if (dqm->sched_policy != KFD_SCHED_POLICY_NO_HWS) {
 		retval = unmap_queues_cpsch(dqm,
 				KFD_UNMAP_QUEUES_FILTER_DYNAMIC_QUEUES, 0);
 		if (retval) {
@@ -417,7 +417,7 @@ static int update_queue(struct device_queue_manager *dqm, struct queue *q)
 	else if (!q->properties.is_active && prev_active)
 		dqm->queue_count--;
 
-	if (sched_policy != KFD_SCHED_POLICY_NO_HWS)
+	if (dqm->sched_policy != KFD_SCHED_POLICY_NO_HWS)
 		retval = map_queues_cpsch(dqm);
 	else if (q->properties.is_active &&
 		 (q->properties.type == KFD_QUEUE_TYPE_COMPUTE ||
@@ -1097,7 +1097,7 @@ static bool set_cache_memory_policy(struct device_queue_manager *dqm,
 			alternate_aperture_base,
 			alternate_aperture_size);
 
-	if ((sched_policy == KFD_SCHED_POLICY_NO_HWS) && (qpd->vmid != 0))
+	if ((dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS) && (qpd->vmid != 0))
 		program_sh_mem_settings(dqm, qpd);
 
 	pr_debug("sh_mem_config: 0x%x, ape1_base: 0x%x, ape1_limit: 0x%x\n",
@@ -1242,6 +1242,22 @@ struct device_queue_manager *device_queue_manager_init(struct kfd_dev *dev)
 	if (!dqm)
 		return NULL;
 
+	switch (dev->device_info->asic_family) {
+	/* HWS is not available on Hawaii. */
+	case CHIP_HAWAII:
+	/* HWS depends on CWSR for timely dequeue. CWSR is not
+	 * available on Tonga.
+	 *
+	 * FIXME: This argument also applies to Kaveri.
+	 */
+	case CHIP_TONGA:
+		dqm->sched_policy = KFD_SCHED_POLICY_NO_HWS;
+		break;
+	default:
+		dqm->sched_policy = sched_policy;
+		break;
+	}
+
 	dqm->dev = dev;
 	switch (sched_policy) {
 	case KFD_SCHED_POLICY_HWS:
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
index c61b693..9fdc9c2 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
@@ -180,6 +180,7 @@ struct device_queue_manager {
 	unsigned int		*fence_addr;
 	struct kfd_mem_obj	*fence_mem;
 	bool			active_runlist;
+	int			sched_policy;
 };
 
 void device_queue_manager_init_cik(
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index 8763806..7817e32 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -208,7 +208,8 @@ int pqm_create_queue(struct process_queue_manager *pqm,
 
 	case KFD_QUEUE_TYPE_COMPUTE:
 		/* check if there is over subscription */
-		if ((sched_policy == KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION) &&
+		if ((dev->dqm->sched_policy ==
+		     KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION) &&
 		((dev->dqm->processes_count >= dev->vm_info.vmid_num_kfd) ||
 		(dev->dqm->queue_count >= get_queues_num(dev->dqm)))) {
 			pr_err("Over-subscription is not allowed in radeon_kfd.sched_policy == 1\n");
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 5/9] drm/amdkfd: Add dGPU support to the device queue manager
       [not found] ` <1515104268-25087-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2018-01-04 22:17   ` [PATCH 3/9] drm/amdkfd: Make IOMMUv2 code conditional Felix Kuehling
  2018-01-04 22:17   ` [PATCH 4/9] drm/amdkfd: Make sched_policy a per-device setting Felix Kuehling
@ 2018-01-04 22:17   ` Felix Kuehling
       [not found]     ` <1515104268-25087-6-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2018-01-04 22:17   ` [PATCH 6/9] drm/amdkfd: Add dGPU support to the MQD manager Felix Kuehling
                     ` (3 subsequent siblings)
  6 siblings, 1 reply; 47+ messages in thread
From: Felix Kuehling @ 2018-01-04 22:17 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Felix Kuehling

GFXv7 and v8 dGPUs use a different addressing mode for KFD compared
to APUs (GPUVM64 vs HSA64). And dGPUs don't support MTYPE_CC. They
use MTYPE_UC instead for memory that requires coherency.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 11 +++
 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.h  |  4 +
 .../drm/amd/amdkfd/kfd_device_queue_manager_cik.c  | 56 +++++++++++++
 .../drm/amd/amdkfd/kfd_device_queue_manager_vi.c   | 93 ++++++++++++++++++++++
 4 files changed, 164 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 3e2f53b..092653f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -1308,6 +1308,17 @@ struct device_queue_manager *device_queue_manager_init(struct kfd_dev *dev)
 	case CHIP_KAVERI:
 		device_queue_manager_init_cik(&dqm->asic_ops);
 		break;
+
+	case CHIP_HAWAII:
+		device_queue_manager_init_cik_hawaii(&dqm->asic_ops);
+		break;
+
+	case CHIP_TONGA:
+	case CHIP_FIJI:
+	case CHIP_POLARIS10:
+	case CHIP_POLARIS11:
+		device_queue_manager_init_vi_tonga(&dqm->asic_ops);
+		break;
 	default:
 		WARN(1, "Unexpected ASIC family %u",
 		     dev->device_info->asic_family);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
index 9fdc9c2..68be0aa 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
@@ -185,8 +185,12 @@ struct device_queue_manager {
 
 void device_queue_manager_init_cik(
 		struct device_queue_manager_asic_ops *asic_ops);
+void device_queue_manager_init_cik_hawaii(
+		struct device_queue_manager_asic_ops *asic_ops);
 void device_queue_manager_init_vi(
 		struct device_queue_manager_asic_ops *asic_ops);
+void device_queue_manager_init_vi_tonga(
+		struct device_queue_manager_asic_ops *asic_ops);
 void program_sh_mem_settings(struct device_queue_manager *dqm,
 					struct qcm_process_device *qpd);
 unsigned int get_queues_num(struct device_queue_manager *dqm);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c
index 28e48c9..aed4c21 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c
@@ -34,8 +34,13 @@ static bool set_cache_memory_policy_cik(struct device_queue_manager *dqm,
 				   uint64_t alternate_aperture_size);
 static int update_qpd_cik(struct device_queue_manager *dqm,
 					struct qcm_process_device *qpd);
+static int update_qpd_cik_hawaii(struct device_queue_manager *dqm,
+					struct qcm_process_device *qpd);
 static void init_sdma_vm(struct device_queue_manager *dqm, struct queue *q,
 				struct qcm_process_device *qpd);
+static void init_sdma_vm_hawaii(struct device_queue_manager *dqm,
+				struct queue *q,
+				struct qcm_process_device *qpd);
 
 void device_queue_manager_init_cik(
 		struct device_queue_manager_asic_ops *asic_ops)
@@ -45,6 +50,14 @@ void device_queue_manager_init_cik(
 	asic_ops->init_sdma_vm = init_sdma_vm;
 }
 
+void device_queue_manager_init_cik_hawaii(
+		struct device_queue_manager_asic_ops *asic_ops)
+{
+	asic_ops->set_cache_memory_policy = set_cache_memory_policy_cik;
+	asic_ops->update_qpd = update_qpd_cik_hawaii;
+	asic_ops->init_sdma_vm = init_sdma_vm_hawaii;
+}
+
 static uint32_t compute_sh_mem_bases_64bit(unsigned int top_address_nybble)
 {
 	/* In 64-bit mode, we can only control the top 3 bits of the LDS,
@@ -132,6 +145,36 @@ static int update_qpd_cik(struct device_queue_manager *dqm,
 	return 0;
 }
 
+static int update_qpd_cik_hawaii(struct device_queue_manager *dqm,
+		struct qcm_process_device *qpd)
+{
+	struct kfd_process_device *pdd;
+	unsigned int temp;
+
+	pdd = qpd_to_pdd(qpd);
+
+	/* check if sh_mem_config register already configured */
+	if (qpd->sh_mem_config == 0) {
+		qpd->sh_mem_config =
+			ALIGNMENT_MODE(SH_MEM_ALIGNMENT_MODE_UNALIGNED) |
+			DEFAULT_MTYPE(MTYPE_NONCACHED) |
+			APE1_MTYPE(MTYPE_NONCACHED);
+		qpd->sh_mem_ape1_limit = 0;
+		qpd->sh_mem_ape1_base = 0;
+	}
+
+	/* On dGPU we're always in GPUVM64 addressing mode with 64-bit
+	 * aperture addresses.
+	 */
+	temp = get_sh_mem_bases_nybble_64(pdd);
+	qpd->sh_mem_bases = compute_sh_mem_bases_64bit(temp);
+
+	pr_debug("is32bit process: %d sh_mem_bases nybble: 0x%X and register 0x%X\n",
+		qpd->pqm->process->is_32bit_user_mode, temp, qpd->sh_mem_bases);
+
+	return 0;
+}
+
 static void init_sdma_vm(struct device_queue_manager *dqm, struct queue *q,
 				struct qcm_process_device *qpd)
 {
@@ -147,3 +190,16 @@ static void init_sdma_vm(struct device_queue_manager *dqm, struct queue *q,
 
 	q->properties.sdma_vm_addr = value;
 }
+
+static void init_sdma_vm_hawaii(struct device_queue_manager *dqm,
+				struct queue *q,
+				struct qcm_process_device *qpd)
+{
+	/* On dGPU we're always in GPUVM64 addressing mode with 64-bit
+	 * aperture addresses.
+	 */
+	q->properties.sdma_vm_addr =
+		((get_sh_mem_bases_nybble_64(qpd_to_pdd(qpd))) <<
+		 SDMA0_RLC0_VIRTUAL_ADDR__SHARED_BASE__SHIFT) &
+		SDMA0_RLC0_VIRTUAL_ADDR__SHARED_BASE_MASK;
+}
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c
index 2fbce57..fd60a11 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c
@@ -33,10 +33,21 @@ static bool set_cache_memory_policy_vi(struct device_queue_manager *dqm,
 				   enum cache_policy alternate_policy,
 				   void __user *alternate_aperture_base,
 				   uint64_t alternate_aperture_size);
+static bool set_cache_memory_policy_vi_tonga(struct device_queue_manager *dqm,
+			struct qcm_process_device *qpd,
+			enum cache_policy default_policy,
+			enum cache_policy alternate_policy,
+			void __user *alternate_aperture_base,
+			uint64_t alternate_aperture_size);
 static int update_qpd_vi(struct device_queue_manager *dqm,
 					struct qcm_process_device *qpd);
+static int update_qpd_vi_tonga(struct device_queue_manager *dqm,
+			struct qcm_process_device *qpd);
 static void init_sdma_vm(struct device_queue_manager *dqm, struct queue *q,
 				struct qcm_process_device *qpd);
+static void init_sdma_vm_tonga(struct device_queue_manager *dqm,
+			struct queue *q,
+			struct qcm_process_device *qpd);
 
 void device_queue_manager_init_vi(
 		struct device_queue_manager_asic_ops *asic_ops)
@@ -46,6 +57,14 @@ void device_queue_manager_init_vi(
 	asic_ops->init_sdma_vm = init_sdma_vm;
 }
 
+void device_queue_manager_init_vi_tonga(
+		struct device_queue_manager_asic_ops *asic_ops)
+{
+	asic_ops->set_cache_memory_policy = set_cache_memory_policy_vi_tonga;
+	asic_ops->update_qpd = update_qpd_vi_tonga;
+	asic_ops->init_sdma_vm = init_sdma_vm_tonga;
+}
+
 static uint32_t compute_sh_mem_bases_64bit(unsigned int top_address_nybble)
 {
 	/* In 64-bit mode, we can only control the top 3 bits of the LDS,
@@ -103,6 +122,33 @@ static bool set_cache_memory_policy_vi(struct device_queue_manager *dqm,
 	return true;
 }
 
+static bool set_cache_memory_policy_vi_tonga(struct device_queue_manager *dqm,
+		struct qcm_process_device *qpd,
+		enum cache_policy default_policy,
+		enum cache_policy alternate_policy,
+		void __user *alternate_aperture_base,
+		uint64_t alternate_aperture_size)
+{
+	uint32_t default_mtype;
+	uint32_t ape1_mtype;
+
+	default_mtype = (default_policy == cache_policy_coherent) ?
+			MTYPE_UC :
+			MTYPE_NC;
+
+	ape1_mtype = (alternate_policy == cache_policy_coherent) ?
+			MTYPE_UC :
+			MTYPE_NC;
+
+	qpd->sh_mem_config =
+			SH_MEM_ALIGNMENT_MODE_UNALIGNED <<
+				   SH_MEM_CONFIG__ALIGNMENT_MODE__SHIFT |
+			default_mtype << SH_MEM_CONFIG__DEFAULT_MTYPE__SHIFT |
+			ape1_mtype << SH_MEM_CONFIG__APE1_MTYPE__SHIFT;
+
+	return true;
+}
+
 static int update_qpd_vi(struct device_queue_manager *dqm,
 					struct qcm_process_device *qpd)
 {
@@ -144,6 +190,40 @@ static int update_qpd_vi(struct device_queue_manager *dqm,
 	return 0;
 }
 
+static int update_qpd_vi_tonga(struct device_queue_manager *dqm,
+			struct qcm_process_device *qpd)
+{
+	struct kfd_process_device *pdd;
+	unsigned int temp;
+
+	pdd = qpd_to_pdd(qpd);
+
+	/* check if sh_mem_config register already configured */
+	if (qpd->sh_mem_config == 0) {
+		qpd->sh_mem_config =
+				SH_MEM_ALIGNMENT_MODE_UNALIGNED <<
+					SH_MEM_CONFIG__ALIGNMENT_MODE__SHIFT |
+				MTYPE_UC <<
+					SH_MEM_CONFIG__DEFAULT_MTYPE__SHIFT |
+				MTYPE_UC <<
+					SH_MEM_CONFIG__APE1_MTYPE__SHIFT;
+
+		qpd->sh_mem_ape1_limit = 0;
+		qpd->sh_mem_ape1_base = 0;
+	}
+
+	/* On dGPU we're always in GPUVM64 addressing mode with 64-bit
+	 * aperture addresses.
+	 */
+	temp = get_sh_mem_bases_nybble_64(pdd);
+	qpd->sh_mem_bases = compute_sh_mem_bases_64bit(temp);
+
+	pr_debug("sh_mem_bases nybble: 0x%X and register 0x%X\n",
+		temp, qpd->sh_mem_bases);
+
+	return 0;
+}
+
 static void init_sdma_vm(struct device_queue_manager *dqm, struct queue *q,
 				struct qcm_process_device *qpd)
 {
@@ -159,3 +239,16 @@ static void init_sdma_vm(struct device_queue_manager *dqm, struct queue *q,
 
 	q->properties.sdma_vm_addr = value;
 }
+
+static void init_sdma_vm_tonga(struct device_queue_manager *dqm,
+			struct queue *q,
+			struct qcm_process_device *qpd)
+{
+	/* On dGPU we're always in GPUVM64 addressing mode with 64-bit
+	 * aperture addresses.
+	 */
+	q->properties.sdma_vm_addr =
+		((get_sh_mem_bases_nybble_64(qpd_to_pdd(qpd))) <<
+		 SDMA0_RLC0_VIRTUAL_ADDR__SHARED_BASE__SHIFT) &
+		SDMA0_RLC0_VIRTUAL_ADDR__SHARED_BASE_MASK;
+}
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 6/9] drm/amdkfd: Add dGPU support to the MQD manager
       [not found] ` <1515104268-25087-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (2 preceding siblings ...)
  2018-01-04 22:17   ` [PATCH 5/9] drm/amdkfd: Add dGPU support to the device queue manager Felix Kuehling
@ 2018-01-04 22:17   ` Felix Kuehling
       [not found]     ` <1515104268-25087-7-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2018-01-04 22:17   ` [PATCH 7/9] drm/amdkfd: Add dGPU support to kernel_queue_init Felix Kuehling
                     ` (2 subsequent siblings)
  6 siblings, 1 reply; 47+ messages in thread
From: Felix Kuehling @ 2018-01-04 22:17 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Felix Kuehling

On dGPUs don't set ATC addressing bits and use MTYPE_UC for coherent
memory.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c     |  7 +++++
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c | 35 ++++++++++++++++++++++--
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c  | 21 ++++++++++++++
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h            |  4 +++
 4 files changed, 64 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
index dfd260e..ee7061e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
@@ -29,8 +29,15 @@ struct mqd_manager *mqd_manager_init(enum KFD_MQD_TYPE type,
 	switch (dev->device_info->asic_family) {
 	case CHIP_KAVERI:
 		return mqd_manager_init_cik(type, dev);
+	case CHIP_HAWAII:
+		return mqd_manager_init_cik_hawaii(type, dev);
 	case CHIP_CARRIZO:
 		return mqd_manager_init_vi(type, dev);
+	case CHIP_TONGA:
+	case CHIP_FIJI:
+	case CHIP_POLARIS10:
+	case CHIP_POLARIS11:
+		return mqd_manager_init_vi_tonga(type, dev);
 	default:
 		WARN(1, "Unexpected ASIC family %u",
 		     dev->device_info->asic_family);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
index f8ef4a0..fbe3f83 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
@@ -170,14 +170,19 @@ static int load_mqd_sdma(struct mqd_manager *mm, void *mqd,
 					       mms);
 }
 
-static int update_mqd(struct mqd_manager *mm, void *mqd,
-			struct queue_properties *q)
+static int __update_mqd(struct mqd_manager *mm, void *mqd,
+			struct queue_properties *q, unsigned int atc_bit)
 {
 	struct cik_mqd *m;
 
 	m = get_mqd(mqd);
 	m->cp_hqd_pq_control = DEFAULT_RPTR_BLOCK_SIZE |
-				DEFAULT_MIN_AVAIL_SIZE | PQ_ATC_EN;
+				DEFAULT_MIN_AVAIL_SIZE;
+	m->cp_hqd_ib_control = DEFAULT_MIN_IB_AVAIL_SIZE;
+	if (atc_bit) {
+		m->cp_hqd_pq_control |= PQ_ATC_EN;
+		m->cp_hqd_ib_control |= IB_ATC_EN;
+	}
 
 	/*
 	 * Calculating queue size which is log base 2 of actual queue size -1
@@ -202,6 +207,18 @@ static int update_mqd(struct mqd_manager *mm, void *mqd,
 	return 0;
 }
 
+static int update_mqd(struct mqd_manager *mm, void *mqd,
+			struct queue_properties *q)
+{
+	return __update_mqd(mm, mqd, q, 1);
+}
+
+static int update_mqd_hawaii(struct mqd_manager *mm, void *mqd,
+			struct queue_properties *q)
+{
+	return __update_mqd(mm, mqd, q, 0);
+}
+
 static int update_mqd_sdma(struct mqd_manager *mm, void *mqd,
 				struct queue_properties *q)
 {
@@ -441,3 +458,15 @@ struct mqd_manager *mqd_manager_init_cik(enum KFD_MQD_TYPE type,
 	return mqd;
 }
 
+struct mqd_manager *mqd_manager_init_cik_hawaii(enum KFD_MQD_TYPE type,
+			struct kfd_dev *dev)
+{
+	struct mqd_manager *mqd;
+
+	mqd = mqd_manager_init_cik(type, dev);
+	if (!mqd)
+		return NULL;
+	if ((type == KFD_MQD_TYPE_CP) || (type == KFD_MQD_TYPE_COMPUTE))
+		mqd->update_mqd = update_mqd_hawaii;
+	return mqd;
+}
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
index 971aec0..58221c1 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
@@ -151,6 +151,8 @@ static int __update_mqd(struct mqd_manager *mm, void *mqd,
 
 	m->cp_hqd_pq_rptr_report_addr_lo = lower_32_bits((uint64_t)q->read_ptr);
 	m->cp_hqd_pq_rptr_report_addr_hi = upper_32_bits((uint64_t)q->read_ptr);
+	m->cp_hqd_pq_wptr_poll_addr_lo = lower_32_bits((uint64_t)q->write_ptr);
+	m->cp_hqd_pq_wptr_poll_addr_hi = upper_32_bits((uint64_t)q->write_ptr);
 
 	m->cp_hqd_pq_doorbell_control =
 		q->doorbell_off <<
@@ -208,6 +210,12 @@ static int update_mqd(struct mqd_manager *mm, void *mqd,
 	return __update_mqd(mm, mqd, q, MTYPE_CC, 1);
 }
 
+static int update_mqd_tonga(struct mqd_manager *mm, void *mqd,
+			struct queue_properties *q)
+{
+	return __update_mqd(mm, mqd, q, MTYPE_UC, 0);
+}
+
 static int destroy_mqd(struct mqd_manager *mm, void *mqd,
 			enum kfd_preempt_type type,
 			unsigned int timeout, uint32_t pipe_id,
@@ -432,3 +440,16 @@ struct mqd_manager *mqd_manager_init_vi(enum KFD_MQD_TYPE type,
 
 	return mqd;
 }
+
+struct mqd_manager *mqd_manager_init_vi_tonga(enum KFD_MQD_TYPE type,
+			struct kfd_dev *dev)
+{
+	struct mqd_manager *mqd;
+
+	mqd = mqd_manager_init_vi(type, dev);
+	if (!mqd)
+		return NULL;
+	if ((type == KFD_MQD_TYPE_CP) || (type == KFD_MQD_TYPE_COMPUTE))
+		mqd->update_mqd = update_mqd_tonga;
+	return mqd;
+}
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 9f4766c..993062e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -709,8 +709,12 @@ struct mqd_manager *mqd_manager_init(enum KFD_MQD_TYPE type,
 					struct kfd_dev *dev);
 struct mqd_manager *mqd_manager_init_cik(enum KFD_MQD_TYPE type,
 		struct kfd_dev *dev);
+struct mqd_manager *mqd_manager_init_cik_hawaii(enum KFD_MQD_TYPE type,
+		struct kfd_dev *dev);
 struct mqd_manager *mqd_manager_init_vi(enum KFD_MQD_TYPE type,
 		struct kfd_dev *dev);
+struct mqd_manager *mqd_manager_init_vi_tonga(enum KFD_MQD_TYPE type,
+		struct kfd_dev *dev);
 struct device_queue_manager *device_queue_manager_init(struct kfd_dev *dev);
 void device_queue_manager_uninit(struct device_queue_manager *dqm);
 struct kernel_queue *kernel_queue_init(struct kfd_dev *dev,
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 7/9] drm/amdkfd: Add dGPU support to kernel_queue_init
       [not found] ` <1515104268-25087-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (3 preceding siblings ...)
  2018-01-04 22:17   ` [PATCH 6/9] drm/amdkfd: Add dGPU support to the MQD manager Felix Kuehling
@ 2018-01-04 22:17   ` Felix Kuehling
       [not found]     ` <1515104268-25087-8-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2018-01-04 22:17   ` [PATCH 9/9] drm/amdgpu: Enable KFD initialization on dGPUs Felix Kuehling
  2018-01-27  0:35   ` [PATCH 0/9] KFD dGPU initialization Felix Kuehling
  6 siblings, 1 reply; 47+ messages in thread
From: Felix Kuehling @ 2018-01-04 22:17 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Felix Kuehling

Recognize dGPU ASIC families.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
index 5dc6567..69f4964 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
@@ -297,10 +297,15 @@ struct kernel_queue *kernel_queue_init(struct kfd_dev *dev,
 
 	switch (dev->device_info->asic_family) {
 	case CHIP_CARRIZO:
+	case CHIP_TONGA:
+	case CHIP_FIJI:
+	case CHIP_POLARIS10:
+	case CHIP_POLARIS11:
 		kernel_queue_init_vi(&kq->ops_asic_specific);
 		break;
 
 	case CHIP_KAVERI:
+	case CHIP_HAWAII:
 		kernel_queue_init_cik(&kq->ops_asic_specific);
 		break;
 	default:
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 8/9] drm/amdkfd: Add dGPU device IDs and device info
  2018-01-04 22:17 [PATCH 0/9] KFD dGPU initialization Felix Kuehling
                   ` (2 preceding siblings ...)
       [not found] ` <1515104268-25087-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2018-01-04 22:17 ` Felix Kuehling
  2018-01-31 15:20     ` Oded Gabbay
  3 siblings, 1 reply; 47+ messages in thread
From: Felix Kuehling @ 2018-01-04 22:17 UTC (permalink / raw)
  To: amd-gfx, oded.gabbay; +Cc: Felix Kuehling, linux-pci

CC: linux-pci@vger.kernel.org
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c | 153 +++++++++++++++++++++++++++++++-
 1 file changed, 151 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 6dd50cc..612afaf 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -63,12 +63,118 @@ static const struct kfd_device_info carrizo_device_info = {
 };
 #endif
 
+static const struct kfd_device_info hawaii_device_info = {
+	.asic_family = CHIP_HAWAII,
+	.max_pasid_bits = 16,
+	/* max num of queues for KV.TODO should be a dynamic value */
+	.max_no_of_hqd	= 24,
+	.ih_ring_entry_size = 4 * sizeof(uint32_t),
+	.event_interrupt_class = &event_interrupt_class_cik,
+	.num_of_watch_points = 4,
+	.mqd_size_aligned = MQD_SIZE_ALIGNED,
+	.supports_cwsr = false,
+	.needs_iommu_device = false,
+	.needs_pci_atomics = false,
+};
+
+static const struct kfd_device_info tonga_device_info = {
+	.asic_family = CHIP_TONGA,
+	.max_pasid_bits = 16,
+	.max_no_of_hqd  = 24,
+	.ih_ring_entry_size = 4 * sizeof(uint32_t),
+	.event_interrupt_class = &event_interrupt_class_cik,
+	.num_of_watch_points = 4,
+	.mqd_size_aligned = MQD_SIZE_ALIGNED,
+	.supports_cwsr = false,
+	.needs_iommu_device = false,
+	.needs_pci_atomics = true,
+};
+
+static const struct kfd_device_info tonga_vf_device_info = {
+	.asic_family = CHIP_TONGA,
+	.max_pasid_bits = 16,
+	.max_no_of_hqd  = 24,
+	.ih_ring_entry_size = 4 * sizeof(uint32_t),
+	.event_interrupt_class = &event_interrupt_class_cik,
+	.num_of_watch_points = 4,
+	.mqd_size_aligned = MQD_SIZE_ALIGNED,
+	.supports_cwsr = false,
+	.needs_iommu_device = false,
+	.needs_pci_atomics = false,
+};
+
+static const struct kfd_device_info fiji_device_info = {
+	.asic_family = CHIP_FIJI,
+	.max_pasid_bits = 16,
+	.max_no_of_hqd  = 24,
+	.ih_ring_entry_size = 4 * sizeof(uint32_t),
+	.event_interrupt_class = &event_interrupt_class_cik,
+	.num_of_watch_points = 4,
+	.mqd_size_aligned = MQD_SIZE_ALIGNED,
+	.supports_cwsr = true,
+	.needs_iommu_device = false,
+	.needs_pci_atomics = true,
+};
+
+static const struct kfd_device_info fiji_vf_device_info = {
+	.asic_family = CHIP_FIJI,
+	.max_pasid_bits = 16,
+	.max_no_of_hqd  = 24,
+	.ih_ring_entry_size = 4 * sizeof(uint32_t),
+	.event_interrupt_class = &event_interrupt_class_cik,
+	.num_of_watch_points = 4,
+	.mqd_size_aligned = MQD_SIZE_ALIGNED,
+	.supports_cwsr = true,
+	.needs_iommu_device = false,
+	.needs_pci_atomics = false,
+};
+
+
+static const struct kfd_device_info polaris10_device_info = {
+	.asic_family = CHIP_POLARIS10,
+	.max_pasid_bits = 16,
+	.max_no_of_hqd  = 24,
+	.ih_ring_entry_size = 4 * sizeof(uint32_t),
+	.event_interrupt_class = &event_interrupt_class_cik,
+	.num_of_watch_points = 4,
+	.mqd_size_aligned = MQD_SIZE_ALIGNED,
+	.supports_cwsr = true,
+	.needs_iommu_device = false,
+	.needs_pci_atomics = true,
+};
+
+static const struct kfd_device_info polaris10_vf_device_info = {
+	.asic_family = CHIP_POLARIS10,
+	.max_pasid_bits = 16,
+	.max_no_of_hqd  = 24,
+	.ih_ring_entry_size = 4 * sizeof(uint32_t),
+	.event_interrupt_class = &event_interrupt_class_cik,
+	.num_of_watch_points = 4,
+	.mqd_size_aligned = MQD_SIZE_ALIGNED,
+	.supports_cwsr = true,
+	.needs_iommu_device = false,
+	.needs_pci_atomics = false,
+};
+
+static const struct kfd_device_info polaris11_device_info = {
+	.asic_family = CHIP_POLARIS11,
+	.max_pasid_bits = 16,
+	.max_no_of_hqd  = 24,
+	.ih_ring_entry_size = 4 * sizeof(uint32_t),
+	.event_interrupt_class = &event_interrupt_class_cik,
+	.num_of_watch_points = 4,
+	.mqd_size_aligned = MQD_SIZE_ALIGNED,
+	.supports_cwsr = true,
+	.needs_iommu_device = false,
+	.needs_pci_atomics = true,
+};
+
+
 struct kfd_deviceid {
 	unsigned short did;
 	const struct kfd_device_info *device_info;
 };
 
-/* Please keep this sorted by increasing device id. */
 static const struct kfd_deviceid supported_devices[] = {
 #if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
 	{ 0x1304, &kaveri_device_info },	/* Kaveri */
@@ -97,8 +203,51 @@ static const struct kfd_deviceid supported_devices[] = {
 	{ 0x9874, &carrizo_device_info },	/* Carrizo */
 	{ 0x9875, &carrizo_device_info },	/* Carrizo */
 	{ 0x9876, &carrizo_device_info },	/* Carrizo */
-	{ 0x9877, &carrizo_device_info }	/* Carrizo */
+	{ 0x9877, &carrizo_device_info },	/* Carrizo */
 #endif
+	{ 0x67A0, &hawaii_device_info },	/* Hawaii */
+	{ 0x67A1, &hawaii_device_info },	/* Hawaii */
+	{ 0x67A2, &hawaii_device_info },	/* Hawaii */
+	{ 0x67A8, &hawaii_device_info },	/* Hawaii */
+	{ 0x67A9, &hawaii_device_info },	/* Hawaii */
+	{ 0x67AA, &hawaii_device_info },	/* Hawaii */
+	{ 0x67B0, &hawaii_device_info },	/* Hawaii */
+	{ 0x67B1, &hawaii_device_info },	/* Hawaii */
+	{ 0x67B8, &hawaii_device_info },	/* Hawaii */
+	{ 0x67B9, &hawaii_device_info },	/* Hawaii */
+	{ 0x67BA, &hawaii_device_info },	/* Hawaii */
+	{ 0x67BE, &hawaii_device_info },	/* Hawaii */
+	{ 0x6920, &tonga_device_info },		/* Tonga */
+	{ 0x6921, &tonga_device_info },		/* Tonga */
+	{ 0x6928, &tonga_device_info },		/* Tonga */
+	{ 0x6929, &tonga_device_info },		/* Tonga */
+	{ 0x692B, &tonga_device_info },		/* Tonga */
+	{ 0x692F, &tonga_vf_device_info },	/* Tonga vf */
+	{ 0x6938, &tonga_device_info },		/* Tonga */
+	{ 0x6939, &tonga_device_info },		/* Tonga */
+	{ 0x7300, &fiji_device_info },		/* Fiji */
+	{ 0x730F, &fiji_vf_device_info },	/* Fiji vf*/
+	{ 0x67C0, &polaris10_device_info },	/* Polaris10 */
+	{ 0x67C1, &polaris10_device_info },	/* Polaris10 */
+	{ 0x67C2, &polaris10_device_info },	/* Polaris10 */
+	{ 0x67C4, &polaris10_device_info },	/* Polaris10 */
+	{ 0x67C7, &polaris10_device_info },	/* Polaris10 */
+	{ 0x67C8, &polaris10_device_info },	/* Polaris10 */
+	{ 0x67C9, &polaris10_device_info },	/* Polaris10 */
+	{ 0x67CA, &polaris10_device_info },	/* Polaris10 */
+	{ 0x67CC, &polaris10_device_info },	/* Polaris10 */
+	{ 0x67CF, &polaris10_device_info },	/* Polaris10 */
+	{ 0x67D0, &polaris10_vf_device_info },	/* Polaris10 vf*/
+	{ 0x67DF, &polaris10_device_info },	/* Polaris10 */
+	{ 0x67E0, &polaris11_device_info },	/* Polaris11 */
+	{ 0x67E1, &polaris11_device_info },	/* Polaris11 */
+	{ 0x67E3, &polaris11_device_info },	/* Polaris11 */
+	{ 0x67E7, &polaris11_device_info },	/* Polaris11 */
+	{ 0x67E8, &polaris11_device_info },	/* Polaris11 */
+	{ 0x67E9, &polaris11_device_info },	/* Polaris11 */
+	{ 0x67EB, &polaris11_device_info },	/* Polaris11 */
+	{ 0x67EF, &polaris11_device_info },	/* Polaris11 */
+	{ 0x67FF, &polaris11_device_info },	/* Polaris11 */
 };
 
 static int kfd_gtt_sa_init(struct kfd_dev *kfd, unsigned int buf_size,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH 9/9] drm/amdgpu: Enable KFD initialization on dGPUs
       [not found] ` <1515104268-25087-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (4 preceding siblings ...)
  2018-01-04 22:17   ` [PATCH 7/9] drm/amdkfd: Add dGPU support to kernel_queue_init Felix Kuehling
@ 2018-01-04 22:17   ` Felix Kuehling
       [not found]     ` <1515104268-25087-10-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2018-01-27  0:35   ` [PATCH 0/9] KFD dGPU initialization Felix Kuehling
  6 siblings, 1 reply; 47+ messages in thread
From: Felix Kuehling @ 2018-01-04 22:17 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Felix Kuehling

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 335e454..7ebe430 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -78,10 +78,15 @@ void amdgpu_amdkfd_device_probe(struct amdgpu_device *adev)
 	switch (adev->asic_type) {
 #ifdef CONFIG_DRM_AMDGPU_CIK
 	case CHIP_KAVERI:
+	case CHIP_HAWAII:
 		kfd2kgd = amdgpu_amdkfd_gfx_7_get_functions();
 		break;
 #endif
 	case CHIP_CARRIZO:
+	case CHIP_TONGA:
+	case CHIP_FIJI:
+	case CHIP_POLARIS10:
+	case CHIP_POLARIS11:
 		kfd2kgd = amdgpu_amdkfd_gfx_8_0_get_functions();
 		break;
 	default:
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH 1/9] PCI: Add pci_enable_atomic_ops_to_root
  2018-01-04 22:17   ` Felix Kuehling
  (?)
@ 2018-01-05  0:17   ` Bjorn Helgaas
  2018-01-05  0:23       ` Felix Kuehling
  -1 siblings, 1 reply; 47+ messages in thread
From: Bjorn Helgaas @ 2018-01-05  0:17 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx, oded.gabbay, Jay Cornwall, linux-pci

On Thu, Jan 04, 2018 at 05:17:40PM -0500, Felix Kuehling wrote:
> From: Jay Cornwall <Jay.Cornwall@amd.com>
> 
> The PCIe 3.0 AtomicOp (6.15) feature allows atomic transctions to be
> requested by, routed through and completed by PCIe components. Routing and
> completion do not require software support. Component support for each is
> detectable via the DEVCAP2 register.
> 
> AtomicOp requests are permitted only if a component's
> DEVCTL2.ATOMICOP_REQUESTER_ENABLE field is set. This capability cannot be
> detected but is a no-op if set on a component with no support. 

I guess the driver is supposed to know whether its hardware is capable
of using AtomicOps, so the device itself doesn't need to advertise it.

I would word this as "A Requester is permitted to use AtomicOps only
if its PCI_EXP_DEVCTL2_ATOMIC_REQ is set.  A driver should set
PCI_EXP_DEVCTL2_ATOMIC_REQ only if the Completer and all intermediate
routing elements support AtomicOps."

> These
> requests can only be serviced if the upstream components support AtomicOp
> completion and/or routing to a component which does.
> 
> A concrete example is the AMD Fiji-class GPU, which is specified to
> support AtomicOp requests, routed through a PLX 8747 switch (advertising
> AtomicOp routing) to a Haswell host bridge (advertising AtomicOp
> completion support). When AtomicOp requests are disabled the GPU logs
> attempts to initiate requests to an MMIO register for debugging.

The last sentence isn't really relevant to this patch.

> Add pci_enable_atomic_ops_to_root for per-device control over AtomicOp
> requests. Upstream bridges are checked for AtomicOp routing capability and
> the call fails if any lack this capability. The root port is checked for
> AtomicOp completion capabilities and the call fails if it does not support
> any. Routes to other PCIe components are not checked for AtomicOp routing
> and completion capabilities.
> 
> v2: Check for AtomicOp route to root port with AtomicOp completion
> v3: Style fixes
> v4: Endpoint to root port only, check upstream egress blocking
> v5: Rebase, use existing PCI_EXP_DEVCTL2_ATOMIC_EGRESS_BLOCK define
> v6: Add comp_caps param, fix upstream port detection, cosmetic/comments
> 
> CC: linux-pci@vger.kernel.org
> Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/pci/pci.c             | 80 +++++++++++++++++++++++++++++++++++++++++++
>  include/linux/pci.h           |  1 +
>  include/uapi/linux/pci_regs.h |  4 ++-
>  3 files changed, 84 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 4a7c686..9cea399 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3066,6 +3066,86 @@ int pci_rebar_set_size(struct pci_dev *pdev, int bar, int size)
>  }
>  
>  /**
> + * pci_enable_atomic_ops_to_root - enable AtomicOp requests to root port
> + * @dev: the PCI device
> + * @comp_caps: Caps required for atomic request completion
> + *
> + * Return 0 if all upstream bridges support AtomicOp routing, egress
> + * blocking is disabled on all upstream ports, and the root port
> + * supports the requested completion capabilities (32-bit, 64-bit
> + * and/or 128-bit AtomicOp completion), or negative otherwise.
> + */
> +int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 comp_caps)

I still want to see this used to replace qedr_pci_set_atomic() if
that's possible.  I'm not convinced yet that they need to be
different.

Bjorn

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 1/9] PCI: Add pci_enable_atomic_ops_to_root
@ 2018-01-05  0:23       ` Felix Kuehling
  0 siblings, 0 replies; 47+ messages in thread
From: Felix Kuehling @ 2018-01-05  0:23 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: amd-gfx, oded.gabbay, Jay Cornwall, linux-pci, Ram Amrani, Doug Ledford

On 2018-01-04 07:17 PM, Bjorn Helgaas wrote:
> @@ -3066,6 +3066,86 @@ int pci_rebar_set_size(struct pci_dev *pdev,
> int bar, int size)
>>  }
>>  
>>  /**
>> + * pci_enable_atomic_ops_to_root - enable AtomicOp requests to root port
>> + * @dev: the PCI device
>> + * @comp_caps: Caps required for atomic request completion
>> + *
>> + * Return 0 if all upstream bridges support AtomicOp routing, egress
>> + * blocking is disabled on all upstream ports, and the root port
>> + * supports the requested completion capabilities (32-bit, 64-bit
>> + * and/or 128-bit AtomicOp completion), or negative otherwise.
>> + */
>> +int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 comp_caps)
> I still want to see this used to replace qedr_pci_set_atomic() if
> that's possible.  I'm not convinced yet that they need to be
> different.

I agree. This should be possible now. I've convinced myself that the
functions do the same thing. But I have no way of testing it. I'd also
do this in a separate patch. Maybe Ram or Doug could test it.

Regards,
  Felix

>
> Bjorn

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 1/9] PCI: Add pci_enable_atomic_ops_to_root
@ 2018-01-05  0:23       ` Felix Kuehling
  0 siblings, 0 replies; 47+ messages in thread
From: Felix Kuehling @ 2018-01-05  0:23 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w, Jay Cornwall,
	linux-pci-u79uwXL29TY76Z2rM5mHXA, Ram Amrani,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Doug Ledford

On 2018-01-04 07:17 PM, Bjorn Helgaas wrote:
> @@ -3066,6 +3066,86 @@ int pci_rebar_set_size(struct pci_dev *pdev,
> int bar, int size)
>>  }
>>  
>>  /**
>> + * pci_enable_atomic_ops_to_root - enable AtomicOp requests to root port
>> + * @dev: the PCI device
>> + * @comp_caps: Caps required for atomic request completion
>> + *
>> + * Return 0 if all upstream bridges support AtomicOp routing, egress
>> + * blocking is disabled on all upstream ports, and the root port
>> + * supports the requested completion capabilities (32-bit, 64-bit
>> + * and/or 128-bit AtomicOp completion), or negative otherwise.
>> + */
>> +int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 comp_caps)
> I still want to see this used to replace qedr_pci_set_atomic() if
> that's possible.  I'm not convinced yet that they need to be
> different.

I agree. This should be possible now. I've convinced myself that the
functions do the same thing. But I have no way of testing it. I'd also
do this in a separate patch. Maybe Ram or Doug could test it.

Regards,
  Felix

>
> Bjorn

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 0/9] KFD dGPU initialization
       [not found] ` <1515104268-25087-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (5 preceding siblings ...)
  2018-01-04 22:17   ` [PATCH 9/9] drm/amdgpu: Enable KFD initialization on dGPUs Felix Kuehling
@ 2018-01-27  0:35   ` Felix Kuehling
       [not found]     ` <f84a6f6f-0985-a161-f989-b41021085039-5C7GfCeVMHo@public.gmane.org>
  6 siblings, 1 reply; 47+ messages in thread
From: Felix Kuehling @ 2018-01-27  0:35 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w

The PCI atomic patch has been accepted by Bjorn Helgaas and should be
included in 4.16. That means the rest of these patches should be good to
apply once you update your tree to 4.16. I've rebased my tree on your
latest (still 4.15-rc4) without any conflicts. But let me know if you
want me to send you a rebased version of this series.

I'm about to send out a series of 25 patches that enables GPUVM support,
based on top of this series.

Regards,
  Felix


On 2018-01-04 05:17 PM, Felix Kuehling wrote:
> Remaining patches from the previous 37-patch series.
>
> Patch 1: Reworked PCIe atomic patch with feedback from PCI maintainers
> Patch 2-9: Rebased from previous series
>
> CC-ed linux-pci@vger.kernel.org on relevant patches for context.
>
> Felix Kuehling (8):
>   drm/amdkfd: Conditionally enable PCIe atomics
>   drm/amdkfd: Make IOMMUv2 code conditional
>   drm/amdkfd: Make sched_policy a per-device setting
>   drm/amdkfd: Add dGPU support to the device queue manager
>   drm/amdkfd: Add dGPU support to the MQD manager
>   drm/amdkfd: Add dGPU support to kernel_queue_init
>   drm/amdkfd: Add dGPU device IDs and device info
>   drm/amdgpu: Enable KFD initialization on dGPUs
>
> Jay Cornwall (1):
>   PCI: Add pci_enable_atomic_ops_to_root
>
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c         |   5 +
>  drivers/gpu/drm/amd/amdkfd/Kconfig                 |   2 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c           |   3 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_crat.c              |   8 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c            |   3 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c            | 234 +++++++++++++++++++--
>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  |  33 ++-
>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.h  |   5 +
>  .../drm/amd/amdkfd/kfd_device_queue_manager_cik.c  |  56 +++++
>  .../drm/amd/amdkfd/kfd_device_queue_manager_vi.c   |  93 ++++++++
>  drivers/gpu/drm/amd/amdkfd/kfd_events.c            |   2 +
>  drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c      |   5 +
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c       |   7 +
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   |  35 ++-
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c    |  21 ++
>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h              |  10 +
>  drivers/gpu/drm/amd/amdkfd/kfd_process.c           |  17 +-
>  .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |   3 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_topology.c          |   2 +
>  drivers/gpu/drm/amd/amdkfd/kfd_topology.h          |   2 +
>  drivers/pci/pci.c                                  |  80 +++++++
>  include/linux/pci.h                                |   1 +
>  include/uapi/linux/pci_regs.h                      |   4 +-
>  23 files changed, 592 insertions(+), 39 deletions(-)
>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 0/9] KFD dGPU initialization
       [not found]     ` <f84a6f6f-0985-a161-f989-b41021085039-5C7GfCeVMHo@public.gmane.org>
@ 2018-01-27 11:31       ` Oded Gabbay
       [not found]         ` <CAFCwf12vCku6JoH3Rcp1-+vQNzqX8zoO_2SG=UhAtTqsYn3SkA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 47+ messages in thread
From: Oded Gabbay @ 2018-01-27 11:31 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx list

Hi Felix,
Thanks for the ping, I also kept monitoring the status of that patch
and once it gets into Linus's branch I will rebase my tree on it. In
the meantime, I will begin reviewing the topology + GPUVM stuff and
hopefully send them all as one big pull request for 4.17, with any
additional patches you will send in the next month.
Sounds good ?

Oded


On Sat, Jan 27, 2018 at 2:35 AM, Felix Kuehling <felix.kuehling@amd.com> wrote:
> The PCI atomic patch has been accepted by Bjorn Helgaas and should be
> included in 4.16. That means the rest of these patches should be good to
> apply once you update your tree to 4.16. I've rebased my tree on your
> latest (still 4.15-rc4) without any conflicts. But let me know if you
> want me to send you a rebased version of this series.
>
> I'm about to send out a series of 25 patches that enables GPUVM support,
> based on top of this series.
>
> Regards,
>   Felix
>
>
> On 2018-01-04 05:17 PM, Felix Kuehling wrote:
>> Remaining patches from the previous 37-patch series.
>>
>> Patch 1: Reworked PCIe atomic patch with feedback from PCI maintainers
>> Patch 2-9: Rebased from previous series
>>
>> CC-ed linux-pci@vger.kernel.org on relevant patches for context.
>>
>> Felix Kuehling (8):
>>   drm/amdkfd: Conditionally enable PCIe atomics
>>   drm/amdkfd: Make IOMMUv2 code conditional
>>   drm/amdkfd: Make sched_policy a per-device setting
>>   drm/amdkfd: Add dGPU support to the device queue manager
>>   drm/amdkfd: Add dGPU support to the MQD manager
>>   drm/amdkfd: Add dGPU support to kernel_queue_init
>>   drm/amdkfd: Add dGPU device IDs and device info
>>   drm/amdgpu: Enable KFD initialization on dGPUs
>>
>> Jay Cornwall (1):
>>   PCI: Add pci_enable_atomic_ops_to_root
>>
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c         |   5 +
>>  drivers/gpu/drm/amd/amdkfd/Kconfig                 |   2 +-
>>  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c           |   3 +-
>>  drivers/gpu/drm/amd/amdkfd/kfd_crat.c              |   8 +-
>>  drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c            |   3 +-
>>  drivers/gpu/drm/amd/amdkfd/kfd_device.c            | 234 +++++++++++++++++++--
>>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  |  33 ++-
>>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.h  |   5 +
>>  .../drm/amd/amdkfd/kfd_device_queue_manager_cik.c  |  56 +++++
>>  .../drm/amd/amdkfd/kfd_device_queue_manager_vi.c   |  93 ++++++++
>>  drivers/gpu/drm/amd/amdkfd/kfd_events.c            |   2 +
>>  drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c      |   5 +
>>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c       |   7 +
>>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   |  35 ++-
>>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c    |  21 ++
>>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h              |  10 +
>>  drivers/gpu/drm/amd/amdkfd/kfd_process.c           |  17 +-
>>  .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |   3 +-
>>  drivers/gpu/drm/amd/amdkfd/kfd_topology.c          |   2 +
>>  drivers/gpu/drm/amd/amdkfd/kfd_topology.h          |   2 +
>>  drivers/pci/pci.c                                  |  80 +++++++
>>  include/linux/pci.h                                |   1 +
>>  include/uapi/linux/pci_regs.h                      |   4 +-
>>  23 files changed, 592 insertions(+), 39 deletions(-)
>>
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 0/9] KFD dGPU initialization
       [not found]         ` <CAFCwf12vCku6JoH3Rcp1-+vQNzqX8zoO_2SG=UhAtTqsYn3SkA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2018-01-27 22:19           ` Kuehling, Felix
  0 siblings, 0 replies; 47+ messages in thread
From: Kuehling, Felix @ 2018-01-27 22:19 UTC (permalink / raw)
  To: Oded Gabbay; +Cc: amd-gfx list

Yes, sounds great. I think I should be able to get userptr support done in time for 4.17, so that should get it into pretty good shape for running ROCm on an upstream kernel on Fiji and Polaris GPUs.

Regards,
  Felix

________________________________________
From: Oded Gabbay <oded.gabbay@gmail.com>
Sent: Saturday, January 27, 2018 6:31:36 AM
To: Kuehling, Felix
Cc: amd-gfx list
Subject: Re: [PATCH 0/9] KFD dGPU initialization

Hi Felix,
Thanks for the ping, I also kept monitoring the status of that patch
and once it gets into Linus's branch I will rebase my tree on it. In
the meantime, I will begin reviewing the topology + GPUVM stuff and
hopefully send them all as one big pull request for 4.17, with any
additional patches you will send in the next month.
Sounds good ?

Oded


On Sat, Jan 27, 2018 at 2:35 AM, Felix Kuehling <felix.kuehling@amd.com> wrote:
> The PCI atomic patch has been accepted by Bjorn Helgaas and should be
> included in 4.16. That means the rest of these patches should be good to
> apply once you update your tree to 4.16. I've rebased my tree on your
> latest (still 4.15-rc4) without any conflicts. But let me know if you
> want me to send you a rebased version of this series.
>
> I'm about to send out a series of 25 patches that enables GPUVM support,
> based on top of this series.
>
> Regards,
>   Felix
>
>
> On 2018-01-04 05:17 PM, Felix Kuehling wrote:
>> Remaining patches from the previous 37-patch series.
>>
>> Patch 1: Reworked PCIe atomic patch with feedback from PCI maintainers
>> Patch 2-9: Rebased from previous series
>>
>> CC-ed linux-pci@vger.kernel.org on relevant patches for context.
>>
>> Felix Kuehling (8):
>>   drm/amdkfd: Conditionally enable PCIe atomics
>>   drm/amdkfd: Make IOMMUv2 code conditional
>>   drm/amdkfd: Make sched_policy a per-device setting
>>   drm/amdkfd: Add dGPU support to the device queue manager
>>   drm/amdkfd: Add dGPU support to the MQD manager
>>   drm/amdkfd: Add dGPU support to kernel_queue_init
>>   drm/amdkfd: Add dGPU device IDs and device info
>>   drm/amdgpu: Enable KFD initialization on dGPUs
>>
>> Jay Cornwall (1):
>>   PCI: Add pci_enable_atomic_ops_to_root
>>
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c         |   5 +
>>  drivers/gpu/drm/amd/amdkfd/Kconfig                 |   2 +-
>>  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c           |   3 +-
>>  drivers/gpu/drm/amd/amdkfd/kfd_crat.c              |   8 +-
>>  drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c            |   3 +-
>>  drivers/gpu/drm/amd/amdkfd/kfd_device.c            | 234 +++++++++++++++++++--
>>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  |  33 ++-
>>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.h  |   5 +
>>  .../drm/amd/amdkfd/kfd_device_queue_manager_cik.c  |  56 +++++
>>  .../drm/amd/amdkfd/kfd_device_queue_manager_vi.c   |  93 ++++++++
>>  drivers/gpu/drm/amd/amdkfd/kfd_events.c            |   2 +
>>  drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c      |   5 +
>>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c       |   7 +
>>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   |  35 ++-
>>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c    |  21 ++
>>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h              |  10 +
>>  drivers/gpu/drm/amd/amdkfd/kfd_process.c           |  17 +-
>>  .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |   3 +-
>>  drivers/gpu/drm/amd/amdkfd/kfd_topology.c          |   2 +
>>  drivers/gpu/drm/amd/amdkfd/kfd_topology.h          |   2 +
>>  drivers/pci/pci.c                                  |  80 +++++++
>>  include/linux/pci.h                                |   1 +
>>  include/uapi/linux/pci_regs.h                      |   4 +-
>>  23 files changed, 592 insertions(+), 39 deletions(-)
>>
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 3/9] drm/amdkfd: Make IOMMUv2 code conditional
       [not found]     ` <1515104268-25087-4-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2018-01-31 14:56       ` Oded Gabbay
       [not found]         ` <CAFCwf125cHCf=fsfiMhhASjgMNEcau04gNGKKHFu7PQGeorpZQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 47+ messages in thread
From: Oded Gabbay @ 2018-01-31 14:56 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx list

Hi Felix,
Please don't spread 19 #ifdefs throughout the code.
I suggest to put one #ifdef in linux/amd-iommu.h itself around all the
functions declarations and in the #else section put macros with empty
implementations. This is much more readable and maintainable.

Oded


On Fri, Jan 5, 2018 at 12:17 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> dGPUs work without IOMMUv2. Make IOMMUv2 initialization dependent on
> ASIC information. Also allow building KFD without IOMMUv2 support.
> This is still useful for dGPUs and prepares for enabling KFD on
> architectures that don't support AMD IOMMUv2.
>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/Kconfig        |  2 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_crat.c     |  8 +++-
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c   | 62 +++++++++++++++++++++----------
>  drivers/gpu/drm/amd/amdkfd/kfd_events.c   |  2 +
>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h     |  5 +++
>  drivers/gpu/drm/amd/amdkfd/kfd_process.c  | 17 ++++++---
>  drivers/gpu/drm/amd/amdkfd/kfd_topology.c |  2 +
>  drivers/gpu/drm/amd/amdkfd/kfd_topology.h |  2 +
>  8 files changed, 74 insertions(+), 26 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/Kconfig b/drivers/gpu/drm/amd/amdkfd/Kconfig
> index bc5a294..5bbeb95 100644
> --- a/drivers/gpu/drm/amd/amdkfd/Kconfig
> +++ b/drivers/gpu/drm/amd/amdkfd/Kconfig
> @@ -4,6 +4,6 @@
>
>  config HSA_AMD
>         tristate "HSA kernel driver for AMD GPU devices"
> -       depends on DRM_AMDGPU && AMD_IOMMU_V2 && X86_64
> +       depends on DRM_AMDGPU && X86_64
>         help
>           Enable this if you want to use HSA features on AMD GPU devices.
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
> index 2bc2816..3478270 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
> @@ -22,7 +22,9 @@
>
>  #include <linux/pci.h>
>  #include <linux/acpi.h>
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>  #include <linux/amd-iommu.h>
> +#endif
>  #include "kfd_crat.h"
>  #include "kfd_priv.h"
>  #include "kfd_topology.h"
> @@ -1037,15 +1039,17 @@ static int kfd_create_vcrat_image_gpu(void *pcrat_image,
>         struct crat_subtype_generic *sub_type_hdr;
>         struct crat_subtype_computeunit *cu;
>         struct kfd_cu_info cu_info;
> -       struct amd_iommu_device_info iommu_info;
>         int avail_size = *size;
>         uint32_t total_num_of_cu;
>         int num_of_cache_entries = 0;
>         int cache_mem_filled = 0;
>         int ret = 0;
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
> +       struct amd_iommu_device_info iommu_info;
>         const u32 required_iommu_flags = AMD_IOMMU_DEVICE_FLAG_ATS_SUP |
>                                          AMD_IOMMU_DEVICE_FLAG_PRI_SUP |
>                                          AMD_IOMMU_DEVICE_FLAG_PASID_SUP;
> +#endif
>         struct kfd_local_mem_info local_mem_info;
>
>         if (!pcrat_image || avail_size < VCRAT_SIZE_FOR_GPU)
> @@ -1106,12 +1110,14 @@ static int kfd_create_vcrat_image_gpu(void *pcrat_image,
>         /* Check if this node supports IOMMU. During parsing this flag will
>          * translate to HSA_CAP_ATS_PRESENT
>          */
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>         iommu_info.flags = 0;
>         if (amd_iommu_device_info(kdev->pdev, &iommu_info) == 0) {
>                 if ((iommu_info.flags & required_iommu_flags) ==
>                                 required_iommu_flags)
>                         cu->hsa_capability |= CRAT_CU_FLAGS_IOMMU_PRESENT;
>         }
> +#endif
>
>         crat_table->length += sub_type_hdr->length;
>         crat_table->total_entries++;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index fafe971..5205b34 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -20,7 +20,9 @@
>   * OTHER DEALINGS IN THE SOFTWARE.
>   */
>
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>  #include <linux/amd-iommu.h>
> +#endif
>  #include <linux/bsearch.h>
>  #include <linux/pci.h>
>  #include <linux/slab.h>
> @@ -31,6 +33,7 @@
>
>  #define MQD_SIZE_ALIGNED 768
>
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>  static const struct kfd_device_info kaveri_device_info = {
>         .asic_family = CHIP_KAVERI,
>         .max_pasid_bits = 16,
> @@ -41,6 +44,7 @@ static const struct kfd_device_info kaveri_device_info = {
>         .num_of_watch_points = 4,
>         .mqd_size_aligned = MQD_SIZE_ALIGNED,
>         .supports_cwsr = false,
> +       .needs_iommu_device = true,
>         .needs_pci_atomics = false,
>  };
>
> @@ -54,8 +58,10 @@ static const struct kfd_device_info carrizo_device_info = {
>         .num_of_watch_points = 4,
>         .mqd_size_aligned = MQD_SIZE_ALIGNED,
>         .supports_cwsr = true,
> +       .needs_iommu_device = true,
>         .needs_pci_atomics = false,
>  };
> +#endif
>
>  struct kfd_deviceid {
>         unsigned short did;
> @@ -64,6 +70,7 @@ struct kfd_deviceid {
>
>  /* Please keep this sorted by increasing device id. */
>  static const struct kfd_deviceid supported_devices[] = {
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>         { 0x1304, &kaveri_device_info },        /* Kaveri */
>         { 0x1305, &kaveri_device_info },        /* Kaveri */
>         { 0x1306, &kaveri_device_info },        /* Kaveri */
> @@ -91,6 +98,7 @@ static const struct kfd_deviceid supported_devices[] = {
>         { 0x9875, &carrizo_device_info },       /* Carrizo */
>         { 0x9876, &carrizo_device_info },       /* Carrizo */
>         { 0x9877, &carrizo_device_info }        /* Carrizo */
> +#endif
>  };
>
>  static int kfd_gtt_sa_init(struct kfd_dev *kfd, unsigned int buf_size,
> @@ -161,6 +169,7 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
>         return kfd;
>  }
>
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>  static bool device_iommu_pasid_init(struct kfd_dev *kfd)
>  {
>         const u32 required_iommu_flags = AMD_IOMMU_DEVICE_FLAG_ATS_SUP |
> @@ -231,6 +240,7 @@ static int iommu_invalid_ppr_cb(struct pci_dev *pdev, int pasid,
>
>         return AMD_IOMMU_INV_PRI_RSP_INVALID;
>  }
> +#endif /* CONFIG_AMD_IOMMU_V2 */
>
>  static void kfd_cwsr_init(struct kfd_dev *kfd)
>  {
> @@ -321,12 +331,14 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>                 goto device_queue_manager_error;
>         }
>
> -       if (!device_iommu_pasid_init(kfd)) {
> -               dev_err(kfd_device,
> -                       "Error initializing iommuv2 for device %x:%x\n",
> -                       kfd->pdev->vendor, kfd->pdev->device);
> -               goto device_iommu_pasid_error;
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
> +       if (kfd->device_info->needs_iommu_device) {
> +               if (!device_iommu_pasid_init(kfd)) {
> +                       dev_err(kfd_device, "Error initializing iommuv2\n");
> +                       goto device_iommu_pasid_error;
> +               }
>         }
> +#endif
>
>         kfd_cwsr_init(kfd);
>
> @@ -386,11 +398,16 @@ void kgd2kfd_suspend(struct kfd_dev *kfd)
>
>         kfd->dqm->ops.stop(kfd->dqm);
>
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
> +       if (!kfd->device_info->needs_iommu_device)
> +               return;
> +
>         kfd_unbind_processes_from_device(kfd);
>
>         amd_iommu_set_invalidate_ctx_cb(kfd->pdev, NULL);
>         amd_iommu_set_invalid_ppr_cb(kfd->pdev, NULL);
>         amd_iommu_free_device(kfd->pdev);
> +#endif
>  }
>
>  int kgd2kfd_resume(struct kfd_dev *kfd)
> @@ -405,19 +422,24 @@ int kgd2kfd_resume(struct kfd_dev *kfd)
>  static int kfd_resume(struct kfd_dev *kfd)
>  {
>         int err = 0;
> -       unsigned int pasid_limit = kfd_get_pasid_limit();
>
> -       err = amd_iommu_init_device(kfd->pdev, pasid_limit);
> -       if (err)
> -               return -ENXIO;
> -       amd_iommu_set_invalidate_ctx_cb(kfd->pdev,
> -                                       iommu_pasid_shutdown_callback);
> -       amd_iommu_set_invalid_ppr_cb(kfd->pdev,
> -                                    iommu_invalid_ppr_cb);
> -
> -       err = kfd_bind_processes_to_device(kfd);
> -       if (err)
> -               goto processes_bind_error;
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
> +       if (kfd->device_info->needs_iommu_device) {
> +               unsigned int pasid_limit = kfd_get_pasid_limit();
> +
> +               err = amd_iommu_init_device(kfd->pdev, pasid_limit);
> +               if (err)
> +                       return -ENXIO;
> +               amd_iommu_set_invalidate_ctx_cb(kfd->pdev,
> +                                               iommu_pasid_shutdown_callback);
> +               amd_iommu_set_invalid_ppr_cb(kfd->pdev,
> +                                            iommu_invalid_ppr_cb);
> +
> +               err = kfd_bind_processes_to_device(kfd);
> +               if (err)
> +                       goto processes_bind_error;
> +       }
> +#endif
>
>         err = kfd->dqm->ops.start(kfd->dqm);
>         if (err) {
> @@ -431,8 +453,10 @@ static int kfd_resume(struct kfd_dev *kfd)
>
>  dqm_start_error:
>  processes_bind_error:
> -       amd_iommu_free_device(kfd->pdev);
> -
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
> +       if (kfd->device_info->needs_iommu_device)
> +               amd_iommu_free_device(kfd->pdev);
> +#endif
>         return err;
>  }
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
> index 93aae5c..f770dc7 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
> @@ -837,6 +837,7 @@ static void lookup_events_by_type_and_signal(struct kfd_process *p,
>         }
>  }
>
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>  void kfd_signal_iommu_event(struct kfd_dev *dev, unsigned int pasid,
>                 unsigned long address, bool is_write_requested,
>                 bool is_execute_requested)
> @@ -905,6 +906,7 @@ void kfd_signal_iommu_event(struct kfd_dev *dev, unsigned int pasid,
>         mutex_unlock(&p->event_mutex);
>         kfd_unref_process(p);
>  }
> +#endif /* CONFIG_AMD_IOMMU_V2_MODULE */
>
>  void kfd_signal_hw_exception_event(unsigned int pasid)
>  {
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> index eebfb1e..9f4766c 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> @@ -158,6 +158,7 @@ struct kfd_device_info {
>         uint8_t num_of_watch_points;
>         uint16_t mqd_size_aligned;
>         bool supports_cwsr;
> +       bool needs_iommu_device;
>         bool needs_pci_atomics;
>  };
>
> @@ -617,9 +618,11 @@ void kfd_unref_process(struct kfd_process *p);
>
>  struct kfd_process_device *kfd_bind_process_to_device(struct kfd_dev *dev,
>                                                 struct kfd_process *p);
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>  int kfd_bind_processes_to_device(struct kfd_dev *dev);
>  void kfd_unbind_processes_from_device(struct kfd_dev *dev);
>  void kfd_process_iommu_unbind_callback(struct kfd_dev *dev, unsigned int pasid);
> +#endif
>  struct kfd_process_device *kfd_get_process_device_data(struct kfd_dev *dev,
>                                                         struct kfd_process *p);
>  struct kfd_process_device *kfd_create_process_device_data(struct kfd_dev *dev,
> @@ -784,9 +787,11 @@ int kfd_wait_on_events(struct kfd_process *p,
>                        uint32_t *wait_result);
>  void kfd_signal_event_interrupt(unsigned int pasid, uint32_t partial_id,
>                                 uint32_t valid_id_bits);
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>  void kfd_signal_iommu_event(struct kfd_dev *dev,
>                 unsigned int pasid, unsigned long address,
>                 bool is_write_requested, bool is_execute_requested);
> +#endif
>  void kfd_signal_hw_exception_event(unsigned int pasid);
>  int kfd_set_event(struct kfd_process *p, uint32_t event_id);
>  int kfd_reset_event(struct kfd_process *p, uint32_t event_id);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> index a22fb071..1d0e02c 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> @@ -173,14 +173,17 @@ static void kfd_process_wq_release(struct work_struct *work)
>  {
>         struct kfd_process *p = container_of(work, struct kfd_process,
>                                              release_work);
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>         struct kfd_process_device *pdd;
>
>         pr_debug("Releasing process (pasid %d) in workqueue\n", p->pasid);
>
>         list_for_each_entry(pdd, &p->per_device_data, per_device_list) {
> -               if (pdd->bound == PDD_BOUND)
> +               if (pdd->bound == PDD_BOUND &&
> +                   pdd->dev->device_info->needs_iommu_device)
>                         amd_iommu_unbind_pasid(pdd->dev->pdev, p->pasid);
>         }
> +#endif
>
>         kfd_process_destroy_pdds(p);
>
> @@ -421,7 +424,6 @@ struct kfd_process_device *kfd_bind_process_to_device(struct kfd_dev *dev,
>                                                         struct kfd_process *p)
>  {
>         struct kfd_process_device *pdd;
> -       int err;
>
>         pdd = kfd_get_process_device_data(dev, p);
>         if (!pdd) {
> @@ -436,9 +438,14 @@ struct kfd_process_device *kfd_bind_process_to_device(struct kfd_dev *dev,
>                 return ERR_PTR(-EINVAL);
>         }
>
> -       err = amd_iommu_bind_pasid(dev->pdev, p->pasid, p->lead_thread);
> -       if (err < 0)
> -               return ERR_PTR(err);
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
> +       if (dev->device_info->needs_iommu_device) {
> +               int err = amd_iommu_bind_pasid(dev->pdev, p->pasid,
> +                                              p->lead_thread);
> +               if (err < 0)
> +                       return ERR_PTR(err);
> +       }
> +#endif
>
>         pdd->bound = PDD_BOUND;
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> index c6a7609..f57c305 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> @@ -875,6 +875,7 @@ static void find_system_memory(const struct dmi_header *dm,
>   */
>  static int kfd_add_perf_to_topology(struct kfd_topology_device *kdev)
>  {
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>         struct kfd_perf_properties *props;
>
>         if (amd_iommu_pc_supported()) {
> @@ -886,6 +887,7 @@ static int kfd_add_perf_to_topology(struct kfd_topology_device *kdev)
>                         amd_iommu_pc_get_max_counters(0); /* assume one iommu */
>                 list_add_tail(&props->list, &kdev->perf_props);
>         }
> +#endif
>
>         return 0;
>  }
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
> index 53fca1f..111fda2 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
> @@ -183,8 +183,10 @@ struct kfd_topology_device *kfd_create_topology_device(
>                 struct list_head *device_list);
>  void kfd_release_topology_device_list(struct list_head *device_list);
>
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>  extern bool amd_iommu_pc_supported(void);
>  extern u8 amd_iommu_pc_get_max_banks(u16 devid);
>  extern u8 amd_iommu_pc_get_max_counters(u16 devid);
> +#endif
>
>  #endif /* __KFD_TOPOLOGY_H__ */
> --
> 2.7.4
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 3/9] drm/amdkfd: Make IOMMUv2 code conditional
       [not found]         ` <CAFCwf125cHCf=fsfiMhhASjgMNEcau04gNGKKHFu7PQGeorpZQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2018-01-31 15:00           ` Oded Gabbay
       [not found]             ` <CAFCwf12pqRA4KdRLpkUmiBs7EQmTePcy80V2kP9mP3pN8V-eTg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 47+ messages in thread
From: Oded Gabbay @ 2018-01-31 15:00 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx list

On Wed, Jan 31, 2018 at 4:56 PM, Oded Gabbay <oded.gabbay@gmail.com> wrote:
> Hi Felix,
> Please don't spread 19 #ifdefs throughout the code.
> I suggest to put one #ifdef in linux/amd-iommu.h itself around all the
> functions declarations and in the #else section put macros with empty
> implementations. This is much more readable and maintainable.
>
> Oded

To emphasize my point, there is a call to amd_iommu_bind_pasid in
kfd_bind_processes_to_device() which isn't wrapped with the #ifdef so
the compliation breaks. Putting the #ifdefs around the calls is simply
not scalable.

Oded

>
>
> On Fri, Jan 5, 2018 at 12:17 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
>> dGPUs work without IOMMUv2. Make IOMMUv2 initialization dependent on
>> ASIC information. Also allow building KFD without IOMMUv2 support.
>> This is still useful for dGPUs and prepares for enabling KFD on
>> architectures that don't support AMD IOMMUv2.
>>
>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>> ---
>>  drivers/gpu/drm/amd/amdkfd/Kconfig        |  2 +-
>>  drivers/gpu/drm/amd/amdkfd/kfd_crat.c     |  8 +++-
>>  drivers/gpu/drm/amd/amdkfd/kfd_device.c   | 62 +++++++++++++++++++++----------
>>  drivers/gpu/drm/amd/amdkfd/kfd_events.c   |  2 +
>>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h     |  5 +++
>>  drivers/gpu/drm/amd/amdkfd/kfd_process.c  | 17 ++++++---
>>  drivers/gpu/drm/amd/amdkfd/kfd_topology.c |  2 +
>>  drivers/gpu/drm/amd/amdkfd/kfd_topology.h |  2 +
>>  8 files changed, 74 insertions(+), 26 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/Kconfig b/drivers/gpu/drm/amd/amdkfd/Kconfig
>> index bc5a294..5bbeb95 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/Kconfig
>> +++ b/drivers/gpu/drm/amd/amdkfd/Kconfig
>> @@ -4,6 +4,6 @@
>>
>>  config HSA_AMD
>>         tristate "HSA kernel driver for AMD GPU devices"
>> -       depends on DRM_AMDGPU && AMD_IOMMU_V2 && X86_64
>> +       depends on DRM_AMDGPU && X86_64
>>         help
>>           Enable this if you want to use HSA features on AMD GPU devices.
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
>> index 2bc2816..3478270 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
>> @@ -22,7 +22,9 @@
>>
>>  #include <linux/pci.h>
>>  #include <linux/acpi.h>
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>  #include <linux/amd-iommu.h>
>> +#endif
>>  #include "kfd_crat.h"
>>  #include "kfd_priv.h"
>>  #include "kfd_topology.h"
>> @@ -1037,15 +1039,17 @@ static int kfd_create_vcrat_image_gpu(void *pcrat_image,
>>         struct crat_subtype_generic *sub_type_hdr;
>>         struct crat_subtype_computeunit *cu;
>>         struct kfd_cu_info cu_info;
>> -       struct amd_iommu_device_info iommu_info;
>>         int avail_size = *size;
>>         uint32_t total_num_of_cu;
>>         int num_of_cache_entries = 0;
>>         int cache_mem_filled = 0;
>>         int ret = 0;
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>> +       struct amd_iommu_device_info iommu_info;
>>         const u32 required_iommu_flags = AMD_IOMMU_DEVICE_FLAG_ATS_SUP |
>>                                          AMD_IOMMU_DEVICE_FLAG_PRI_SUP |
>>                                          AMD_IOMMU_DEVICE_FLAG_PASID_SUP;
>> +#endif
>>         struct kfd_local_mem_info local_mem_info;
>>
>>         if (!pcrat_image || avail_size < VCRAT_SIZE_FOR_GPU)
>> @@ -1106,12 +1110,14 @@ static int kfd_create_vcrat_image_gpu(void *pcrat_image,
>>         /* Check if this node supports IOMMU. During parsing this flag will
>>          * translate to HSA_CAP_ATS_PRESENT
>>          */
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>         iommu_info.flags = 0;
>>         if (amd_iommu_device_info(kdev->pdev, &iommu_info) == 0) {
>>                 if ((iommu_info.flags & required_iommu_flags) ==
>>                                 required_iommu_flags)
>>                         cu->hsa_capability |= CRAT_CU_FLAGS_IOMMU_PRESENT;
>>         }
>> +#endif
>>
>>         crat_table->length += sub_type_hdr->length;
>>         crat_table->total_entries++;
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> index fafe971..5205b34 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> @@ -20,7 +20,9 @@
>>   * OTHER DEALINGS IN THE SOFTWARE.
>>   */
>>
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>  #include <linux/amd-iommu.h>
>> +#endif
>>  #include <linux/bsearch.h>
>>  #include <linux/pci.h>
>>  #include <linux/slab.h>
>> @@ -31,6 +33,7 @@
>>
>>  #define MQD_SIZE_ALIGNED 768
>>
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>  static const struct kfd_device_info kaveri_device_info = {
>>         .asic_family = CHIP_KAVERI,
>>         .max_pasid_bits = 16,
>> @@ -41,6 +44,7 @@ static const struct kfd_device_info kaveri_device_info = {
>>         .num_of_watch_points = 4,
>>         .mqd_size_aligned = MQD_SIZE_ALIGNED,
>>         .supports_cwsr = false,
>> +       .needs_iommu_device = true,
>>         .needs_pci_atomics = false,
>>  };
>>
>> @@ -54,8 +58,10 @@ static const struct kfd_device_info carrizo_device_info = {
>>         .num_of_watch_points = 4,
>>         .mqd_size_aligned = MQD_SIZE_ALIGNED,
>>         .supports_cwsr = true,
>> +       .needs_iommu_device = true,
>>         .needs_pci_atomics = false,
>>  };
>> +#endif
>>
>>  struct kfd_deviceid {
>>         unsigned short did;
>> @@ -64,6 +70,7 @@ struct kfd_deviceid {
>>
>>  /* Please keep this sorted by increasing device id. */
>>  static const struct kfd_deviceid supported_devices[] = {
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>         { 0x1304, &kaveri_device_info },        /* Kaveri */
>>         { 0x1305, &kaveri_device_info },        /* Kaveri */
>>         { 0x1306, &kaveri_device_info },        /* Kaveri */
>> @@ -91,6 +98,7 @@ static const struct kfd_deviceid supported_devices[] = {
>>         { 0x9875, &carrizo_device_info },       /* Carrizo */
>>         { 0x9876, &carrizo_device_info },       /* Carrizo */
>>         { 0x9877, &carrizo_device_info }        /* Carrizo */
>> +#endif
>>  };
>>
>>  static int kfd_gtt_sa_init(struct kfd_dev *kfd, unsigned int buf_size,
>> @@ -161,6 +169,7 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
>>         return kfd;
>>  }
>>
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>  static bool device_iommu_pasid_init(struct kfd_dev *kfd)
>>  {
>>         const u32 required_iommu_flags = AMD_IOMMU_DEVICE_FLAG_ATS_SUP |
>> @@ -231,6 +240,7 @@ static int iommu_invalid_ppr_cb(struct pci_dev *pdev, int pasid,
>>
>>         return AMD_IOMMU_INV_PRI_RSP_INVALID;
>>  }
>> +#endif /* CONFIG_AMD_IOMMU_V2 */
>>
>>  static void kfd_cwsr_init(struct kfd_dev *kfd)
>>  {
>> @@ -321,12 +331,14 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>>                 goto device_queue_manager_error;
>>         }
>>
>> -       if (!device_iommu_pasid_init(kfd)) {
>> -               dev_err(kfd_device,
>> -                       "Error initializing iommuv2 for device %x:%x\n",
>> -                       kfd->pdev->vendor, kfd->pdev->device);
>> -               goto device_iommu_pasid_error;
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>> +       if (kfd->device_info->needs_iommu_device) {
>> +               if (!device_iommu_pasid_init(kfd)) {
>> +                       dev_err(kfd_device, "Error initializing iommuv2\n");
>> +                       goto device_iommu_pasid_error;
>> +               }
>>         }
>> +#endif
>>
>>         kfd_cwsr_init(kfd);
>>
>> @@ -386,11 +398,16 @@ void kgd2kfd_suspend(struct kfd_dev *kfd)
>>
>>         kfd->dqm->ops.stop(kfd->dqm);
>>
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>> +       if (!kfd->device_info->needs_iommu_device)
>> +               return;
>> +
>>         kfd_unbind_processes_from_device(kfd);
>>
>>         amd_iommu_set_invalidate_ctx_cb(kfd->pdev, NULL);
>>         amd_iommu_set_invalid_ppr_cb(kfd->pdev, NULL);
>>         amd_iommu_free_device(kfd->pdev);
>> +#endif
>>  }
>>
>>  int kgd2kfd_resume(struct kfd_dev *kfd)
>> @@ -405,19 +422,24 @@ int kgd2kfd_resume(struct kfd_dev *kfd)
>>  static int kfd_resume(struct kfd_dev *kfd)
>>  {
>>         int err = 0;
>> -       unsigned int pasid_limit = kfd_get_pasid_limit();
>>
>> -       err = amd_iommu_init_device(kfd->pdev, pasid_limit);
>> -       if (err)
>> -               return -ENXIO;
>> -       amd_iommu_set_invalidate_ctx_cb(kfd->pdev,
>> -                                       iommu_pasid_shutdown_callback);
>> -       amd_iommu_set_invalid_ppr_cb(kfd->pdev,
>> -                                    iommu_invalid_ppr_cb);
>> -
>> -       err = kfd_bind_processes_to_device(kfd);
>> -       if (err)
>> -               goto processes_bind_error;
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>> +       if (kfd->device_info->needs_iommu_device) {
>> +               unsigned int pasid_limit = kfd_get_pasid_limit();
>> +
>> +               err = amd_iommu_init_device(kfd->pdev, pasid_limit);
>> +               if (err)
>> +                       return -ENXIO;
>> +               amd_iommu_set_invalidate_ctx_cb(kfd->pdev,
>> +                                               iommu_pasid_shutdown_callback);
>> +               amd_iommu_set_invalid_ppr_cb(kfd->pdev,
>> +                                            iommu_invalid_ppr_cb);
>> +
>> +               err = kfd_bind_processes_to_device(kfd);
>> +               if (err)
>> +                       goto processes_bind_error;
>> +       }
>> +#endif
>>
>>         err = kfd->dqm->ops.start(kfd->dqm);
>>         if (err) {
>> @@ -431,8 +453,10 @@ static int kfd_resume(struct kfd_dev *kfd)
>>
>>  dqm_start_error:
>>  processes_bind_error:
>> -       amd_iommu_free_device(kfd->pdev);
>> -
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>> +       if (kfd->device_info->needs_iommu_device)
>> +               amd_iommu_free_device(kfd->pdev);
>> +#endif
>>         return err;
>>  }
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
>> index 93aae5c..f770dc7 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
>> @@ -837,6 +837,7 @@ static void lookup_events_by_type_and_signal(struct kfd_process *p,
>>         }
>>  }
>>
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>  void kfd_signal_iommu_event(struct kfd_dev *dev, unsigned int pasid,
>>                 unsigned long address, bool is_write_requested,
>>                 bool is_execute_requested)
>> @@ -905,6 +906,7 @@ void kfd_signal_iommu_event(struct kfd_dev *dev, unsigned int pasid,
>>         mutex_unlock(&p->event_mutex);
>>         kfd_unref_process(p);
>>  }
>> +#endif /* CONFIG_AMD_IOMMU_V2_MODULE */
>>
>>  void kfd_signal_hw_exception_event(unsigned int pasid)
>>  {
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>> index eebfb1e..9f4766c 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>> @@ -158,6 +158,7 @@ struct kfd_device_info {
>>         uint8_t num_of_watch_points;
>>         uint16_t mqd_size_aligned;
>>         bool supports_cwsr;
>> +       bool needs_iommu_device;
>>         bool needs_pci_atomics;
>>  };
>>
>> @@ -617,9 +618,11 @@ void kfd_unref_process(struct kfd_process *p);
>>
>>  struct kfd_process_device *kfd_bind_process_to_device(struct kfd_dev *dev,
>>                                                 struct kfd_process *p);
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>  int kfd_bind_processes_to_device(struct kfd_dev *dev);
>>  void kfd_unbind_processes_from_device(struct kfd_dev *dev);
>>  void kfd_process_iommu_unbind_callback(struct kfd_dev *dev, unsigned int pasid);
>> +#endif
>>  struct kfd_process_device *kfd_get_process_device_data(struct kfd_dev *dev,
>>                                                         struct kfd_process *p);
>>  struct kfd_process_device *kfd_create_process_device_data(struct kfd_dev *dev,
>> @@ -784,9 +787,11 @@ int kfd_wait_on_events(struct kfd_process *p,
>>                        uint32_t *wait_result);
>>  void kfd_signal_event_interrupt(unsigned int pasid, uint32_t partial_id,
>>                                 uint32_t valid_id_bits);
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>  void kfd_signal_iommu_event(struct kfd_dev *dev,
>>                 unsigned int pasid, unsigned long address,
>>                 bool is_write_requested, bool is_execute_requested);
>> +#endif
>>  void kfd_signal_hw_exception_event(unsigned int pasid);
>>  int kfd_set_event(struct kfd_process *p, uint32_t event_id);
>>  int kfd_reset_event(struct kfd_process *p, uint32_t event_id);
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> index a22fb071..1d0e02c 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> @@ -173,14 +173,17 @@ static void kfd_process_wq_release(struct work_struct *work)
>>  {
>>         struct kfd_process *p = container_of(work, struct kfd_process,
>>                                              release_work);
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>         struct kfd_process_device *pdd;
>>
>>         pr_debug("Releasing process (pasid %d) in workqueue\n", p->pasid);
>>
>>         list_for_each_entry(pdd, &p->per_device_data, per_device_list) {
>> -               if (pdd->bound == PDD_BOUND)
>> +               if (pdd->bound == PDD_BOUND &&
>> +                   pdd->dev->device_info->needs_iommu_device)
>>                         amd_iommu_unbind_pasid(pdd->dev->pdev, p->pasid);
>>         }
>> +#endif
>>
>>         kfd_process_destroy_pdds(p);
>>
>> @@ -421,7 +424,6 @@ struct kfd_process_device *kfd_bind_process_to_device(struct kfd_dev *dev,
>>                                                         struct kfd_process *p)
>>  {
>>         struct kfd_process_device *pdd;
>> -       int err;
>>
>>         pdd = kfd_get_process_device_data(dev, p);
>>         if (!pdd) {
>> @@ -436,9 +438,14 @@ struct kfd_process_device *kfd_bind_process_to_device(struct kfd_dev *dev,
>>                 return ERR_PTR(-EINVAL);
>>         }
>>
>> -       err = amd_iommu_bind_pasid(dev->pdev, p->pasid, p->lead_thread);
>> -       if (err < 0)
>> -               return ERR_PTR(err);
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>> +       if (dev->device_info->needs_iommu_device) {
>> +               int err = amd_iommu_bind_pasid(dev->pdev, p->pasid,
>> +                                              p->lead_thread);
>> +               if (err < 0)
>> +                       return ERR_PTR(err);
>> +       }
>> +#endif
>>
>>         pdd->bound = PDD_BOUND;
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>> index c6a7609..f57c305 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>> @@ -875,6 +875,7 @@ static void find_system_memory(const struct dmi_header *dm,
>>   */
>>  static int kfd_add_perf_to_topology(struct kfd_topology_device *kdev)
>>  {
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>         struct kfd_perf_properties *props;
>>
>>         if (amd_iommu_pc_supported()) {
>> @@ -886,6 +887,7 @@ static int kfd_add_perf_to_topology(struct kfd_topology_device *kdev)
>>                         amd_iommu_pc_get_max_counters(0); /* assume one iommu */
>>                 list_add_tail(&props->list, &kdev->perf_props);
>>         }
>> +#endif
>>
>>         return 0;
>>  }
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
>> index 53fca1f..111fda2 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
>> @@ -183,8 +183,10 @@ struct kfd_topology_device *kfd_create_topology_device(
>>                 struct list_head *device_list);
>>  void kfd_release_topology_device_list(struct list_head *device_list);
>>
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>  extern bool amd_iommu_pc_supported(void);
>>  extern u8 amd_iommu_pc_get_max_banks(u16 devid);
>>  extern u8 amd_iommu_pc_get_max_counters(u16 devid);
>> +#endif
>>
>>  #endif /* __KFD_TOPOLOGY_H__ */
>> --
>> 2.7.4
>>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 4/9] drm/amdkfd: Make sched_policy a per-device setting
       [not found]     ` <1515104268-25087-5-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2018-01-31 15:06       ` Oded Gabbay
       [not found]         ` <CAFCwf11xXiKH-3sqpjk-cpQ5DyM_dL-6Vk=DrBCPJ=oSyyYyAg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 47+ messages in thread
From: Oded Gabbay @ 2018-01-31 15:06 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx list

On Fri, Jan 5, 2018 at 12:17 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> Some dGPUs don't support HWS. Allow them to use a per-device
> sched_policy that may be different from the global default.
>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c           |  3 ++-
>  drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c            |  3 ++-
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c            |  2 +-
>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 22 +++++++++++++++++++---
>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.h  |  1 +
>  .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |  3 ++-
>  6 files changed, 27 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> index 62c3d9c..6fe2496 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> @@ -901,7 +901,8 @@ static int kfd_ioctl_set_scratch_backing_va(struct file *filep,
>
>         mutex_unlock(&p->mutex);
>
> -       if (sched_policy == KFD_SCHED_POLICY_NO_HWS && pdd->qpd.vmid != 0)
> +       if (dev->dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS &&
> +           pdd->qpd.vmid != 0)
>                 dev->kfd2kgd->set_scratch_backing_va(
>                         dev->kgd, args->va_addr, pdd->qpd.vmid);
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
> index 3da25f7..9d4af96 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
> @@ -33,6 +33,7 @@
>  #include "kfd_pm4_headers_diq.h"
>  #include "kfd_dbgmgr.h"
>  #include "kfd_dbgdev.h"
> +#include "kfd_device_queue_manager.h"
>
>  static DEFINE_MUTEX(kfd_dbgmgr_mutex);
>
> @@ -83,7 +84,7 @@ bool kfd_dbgmgr_create(struct kfd_dbgmgr **ppmgr, struct kfd_dev *pdev)
>         }
>
>         /* get actual type of DBGDevice cpsch or not */
> -       if (sched_policy == KFD_SCHED_POLICY_NO_HWS)
> +       if (pdev->dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS)
>                 type = DBGDEV_TYPE_NODIQ;
>
>         kfd_dbgdev_init(new_buff->dbgdev, pdev, type);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index 5205b34..6dd50cc 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -352,7 +352,7 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>                  kfd->pdev->device);
>
>         pr_debug("Starting kfd with the following scheduling policy %d\n",
> -               sched_policy);
> +               kfd->dqm->sched_policy);
>
>         goto out;
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> index d0693fd..3e2f53b 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> @@ -385,7 +385,7 @@ static int update_queue(struct device_queue_manager *dqm, struct queue *q)
>         prev_active = q->properties.is_active;
>
>         /* Make sure the queue is unmapped before updating the MQD */
> -       if (sched_policy != KFD_SCHED_POLICY_NO_HWS) {
> +       if (dqm->sched_policy != KFD_SCHED_POLICY_NO_HWS) {
>                 retval = unmap_queues_cpsch(dqm,
>                                 KFD_UNMAP_QUEUES_FILTER_DYNAMIC_QUEUES, 0);
>                 if (retval) {
> @@ -417,7 +417,7 @@ static int update_queue(struct device_queue_manager *dqm, struct queue *q)
>         else if (!q->properties.is_active && prev_active)
>                 dqm->queue_count--;
>
> -       if (sched_policy != KFD_SCHED_POLICY_NO_HWS)
> +       if (dqm->sched_policy != KFD_SCHED_POLICY_NO_HWS)
>                 retval = map_queues_cpsch(dqm);
>         else if (q->properties.is_active &&
>                  (q->properties.type == KFD_QUEUE_TYPE_COMPUTE ||
> @@ -1097,7 +1097,7 @@ static bool set_cache_memory_policy(struct device_queue_manager *dqm,
>                         alternate_aperture_base,
>                         alternate_aperture_size);
>
> -       if ((sched_policy == KFD_SCHED_POLICY_NO_HWS) && (qpd->vmid != 0))
> +       if ((dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS) && (qpd->vmid != 0))
>                 program_sh_mem_settings(dqm, qpd);
>
>         pr_debug("sh_mem_config: 0x%x, ape1_base: 0x%x, ape1_limit: 0x%x\n",
> @@ -1242,6 +1242,22 @@ struct device_queue_manager *device_queue_manager_init(struct kfd_dev *dev)
>         if (!dqm)
>                 return NULL;
>
> +       switch (dev->device_info->asic_family) {
> +       /* HWS is not available on Hawaii. */
> +       case CHIP_HAWAII:
> +       /* HWS depends on CWSR for timely dequeue. CWSR is not
> +        * available on Tonga.
> +        *
> +        * FIXME: This argument also applies to Kaveri.
So why not add here "case CHIP_KAVERI:" ?

> +        */
> +       case CHIP_TONGA:
> +               dqm->sched_policy = KFD_SCHED_POLICY_NO_HWS;
> +               break;
> +       default:
> +               dqm->sched_policy = sched_policy;
> +               break;
> +       }
> +
>         dqm->dev = dev;
>         switch (sched_policy) {
This should be changed to:
switch (dqm->sched_policy) {


>         case KFD_SCHED_POLICY_HWS:
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
> index c61b693..9fdc9c2 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
> @@ -180,6 +180,7 @@ struct device_queue_manager {
>         unsigned int            *fence_addr;
>         struct kfd_mem_obj      *fence_mem;
>         bool                    active_runlist;
> +       int                     sched_policy;
>  };
>
>  void device_queue_manager_init_cik(
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> index 8763806..7817e32 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> @@ -208,7 +208,8 @@ int pqm_create_queue(struct process_queue_manager *pqm,
>
>         case KFD_QUEUE_TYPE_COMPUTE:
>                 /* check if there is over subscription */
> -               if ((sched_policy == KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION) &&
> +               if ((dev->dqm->sched_policy ==
> +                    KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION) &&
>                 ((dev->dqm->processes_count >= dev->vm_info.vmid_num_kfd) ||
>                 (dev->dqm->queue_count >= get_queues_num(dev->dqm)))) {
>                         pr_err("Over-subscription is not allowed in radeon_kfd.sched_policy == 1\n");
> --
> 2.7.4
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 2/9] drm/amdkfd: Conditionally enable PCIe atomics
@ 2018-01-31 15:09     ` Oded Gabbay
  0 siblings, 0 replies; 47+ messages in thread
From: Oded Gabbay @ 2018-01-31 15:09 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx list, linux-pci

On Fri, Jan 5, 2018 at 12:17 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> This will be needed for most dGPUs.
>
> CC: linux-pci@vger.kernel.org
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c | 17 +++++++++++++++++
>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h   |  1 +
>  2 files changed, 18 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index a8fa33a..fafe971 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -41,6 +41,7 @@ static const struct kfd_device_info kaveri_device_info = {
>         .num_of_watch_points = 4,
>         .mqd_size_aligned = MQD_SIZE_ALIGNED,
>         .supports_cwsr = false,
> +       .needs_pci_atomics = false,
>  };
>
>  static const struct kfd_device_info carrizo_device_info = {
> @@ -53,6 +54,7 @@ static const struct kfd_device_info carrizo_device_info = {
>         .num_of_watch_points = 4,
>         .mqd_size_aligned = MQD_SIZE_ALIGNED,
>         .supports_cwsr = true,
> +       .needs_pci_atomics = false,
>  };
>
>  struct kfd_deviceid {
> @@ -127,6 +129,21 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
>                 return NULL;
>         }
>
> +       if (device_info->needs_pci_atomics) {
> +               /* Allow BIF to recode atomics to PCIe 3.0
> +                * AtomicOps. 32 and 64-bit requests are possible and
> +                * must be supported.
> +                */
> +               if (pci_enable_atomic_ops_to_root(pdev,
> +                               PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
> +                               PCI_EXP_DEVCAP2_ATOMIC_COMP64) < 0) {
> +                       dev_info(kfd_device,
> +                               "skipped device %x:%x, PCI rejects atomics",
> +                                pdev->vendor, pdev->device);
> +                       return NULL;
> +               }
> +       }
> +
>         kfd = kzalloc(sizeof(*kfd), GFP_KERNEL);
>         if (!kfd)
>                 return NULL;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> index 6a48d29..eebfb1e 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> @@ -158,6 +158,7 @@ struct kfd_device_info {
>         uint8_t num_of_watch_points;
>         uint16_t mqd_size_aligned;
>         bool supports_cwsr;
> +       bool needs_pci_atomics;
>  };
>
>  struct kfd_mem_obj {
> --
> 2.7.4
>
This patch is:
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 2/9] drm/amdkfd: Conditionally enable PCIe atomics
@ 2018-01-31 15:09     ` Oded Gabbay
  0 siblings, 0 replies; 47+ messages in thread
From: Oded Gabbay @ 2018-01-31 15:09 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: linux-pci-u79uwXL29TY76Z2rM5mHXA, amd-gfx list

On Fri, Jan 5, 2018 at 12:17 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> This will be needed for most dGPUs.
>
> CC: linux-pci@vger.kernel.org
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c | 17 +++++++++++++++++
>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h   |  1 +
>  2 files changed, 18 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index a8fa33a..fafe971 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -41,6 +41,7 @@ static const struct kfd_device_info kaveri_device_info = {
>         .num_of_watch_points = 4,
>         .mqd_size_aligned = MQD_SIZE_ALIGNED,
>         .supports_cwsr = false,
> +       .needs_pci_atomics = false,
>  };
>
>  static const struct kfd_device_info carrizo_device_info = {
> @@ -53,6 +54,7 @@ static const struct kfd_device_info carrizo_device_info = {
>         .num_of_watch_points = 4,
>         .mqd_size_aligned = MQD_SIZE_ALIGNED,
>         .supports_cwsr = true,
> +       .needs_pci_atomics = false,
>  };
>
>  struct kfd_deviceid {
> @@ -127,6 +129,21 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
>                 return NULL;
>         }
>
> +       if (device_info->needs_pci_atomics) {
> +               /* Allow BIF to recode atomics to PCIe 3.0
> +                * AtomicOps. 32 and 64-bit requests are possible and
> +                * must be supported.
> +                */
> +               if (pci_enable_atomic_ops_to_root(pdev,
> +                               PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
> +                               PCI_EXP_DEVCAP2_ATOMIC_COMP64) < 0) {
> +                       dev_info(kfd_device,
> +                               "skipped device %x:%x, PCI rejects atomics",
> +                                pdev->vendor, pdev->device);
> +                       return NULL;
> +               }
> +       }
> +
>         kfd = kzalloc(sizeof(*kfd), GFP_KERNEL);
>         if (!kfd)
>                 return NULL;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> index 6a48d29..eebfb1e 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> @@ -158,6 +158,7 @@ struct kfd_device_info {
>         uint8_t num_of_watch_points;
>         uint16_t mqd_size_aligned;
>         bool supports_cwsr;
> +       bool needs_pci_atomics;
>  };
>
>  struct kfd_mem_obj {
> --
> 2.7.4
>
This patch is:
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 5/9] drm/amdkfd: Add dGPU support to the device queue manager
       [not found]     ` <1515104268-25087-6-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2018-01-31 15:09       ` Oded Gabbay
  0 siblings, 0 replies; 47+ messages in thread
From: Oded Gabbay @ 2018-01-31 15:09 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx list

On Fri, Jan 5, 2018 at 12:17 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> GFXv7 and v8 dGPUs use a different addressing mode for KFD compared
> to APUs (GPUVM64 vs HSA64). And dGPUs don't support MTYPE_CC. They
> use MTYPE_UC instead for memory that requires coherency.
>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 11 +++
>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.h  |  4 +
>  .../drm/amd/amdkfd/kfd_device_queue_manager_cik.c  | 56 +++++++++++++
>  .../drm/amd/amdkfd/kfd_device_queue_manager_vi.c   | 93 ++++++++++++++++++++++
>  4 files changed, 164 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> index 3e2f53b..092653f 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> @@ -1308,6 +1308,17 @@ struct device_queue_manager *device_queue_manager_init(struct kfd_dev *dev)
>         case CHIP_KAVERI:
>                 device_queue_manager_init_cik(&dqm->asic_ops);
>                 break;
> +
> +       case CHIP_HAWAII:
> +               device_queue_manager_init_cik_hawaii(&dqm->asic_ops);
> +               break;
> +
> +       case CHIP_TONGA:
> +       case CHIP_FIJI:
> +       case CHIP_POLARIS10:
> +       case CHIP_POLARIS11:
> +               device_queue_manager_init_vi_tonga(&dqm->asic_ops);
> +               break;
>         default:
>                 WARN(1, "Unexpected ASIC family %u",
>                      dev->device_info->asic_family);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
> index 9fdc9c2..68be0aa 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
> @@ -185,8 +185,12 @@ struct device_queue_manager {
>
>  void device_queue_manager_init_cik(
>                 struct device_queue_manager_asic_ops *asic_ops);
> +void device_queue_manager_init_cik_hawaii(
> +               struct device_queue_manager_asic_ops *asic_ops);
>  void device_queue_manager_init_vi(
>                 struct device_queue_manager_asic_ops *asic_ops);
> +void device_queue_manager_init_vi_tonga(
> +               struct device_queue_manager_asic_ops *asic_ops);
>  void program_sh_mem_settings(struct device_queue_manager *dqm,
>                                         struct qcm_process_device *qpd);
>  unsigned int get_queues_num(struct device_queue_manager *dqm);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c
> index 28e48c9..aed4c21 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c
> @@ -34,8 +34,13 @@ static bool set_cache_memory_policy_cik(struct device_queue_manager *dqm,
>                                    uint64_t alternate_aperture_size);
>  static int update_qpd_cik(struct device_queue_manager *dqm,
>                                         struct qcm_process_device *qpd);
> +static int update_qpd_cik_hawaii(struct device_queue_manager *dqm,
> +                                       struct qcm_process_device *qpd);
>  static void init_sdma_vm(struct device_queue_manager *dqm, struct queue *q,
>                                 struct qcm_process_device *qpd);
> +static void init_sdma_vm_hawaii(struct device_queue_manager *dqm,
> +                               struct queue *q,
> +                               struct qcm_process_device *qpd);
>
>  void device_queue_manager_init_cik(
>                 struct device_queue_manager_asic_ops *asic_ops)
> @@ -45,6 +50,14 @@ void device_queue_manager_init_cik(
>         asic_ops->init_sdma_vm = init_sdma_vm;
>  }
>
> +void device_queue_manager_init_cik_hawaii(
> +               struct device_queue_manager_asic_ops *asic_ops)
> +{
> +       asic_ops->set_cache_memory_policy = set_cache_memory_policy_cik;
> +       asic_ops->update_qpd = update_qpd_cik_hawaii;
> +       asic_ops->init_sdma_vm = init_sdma_vm_hawaii;
> +}
> +
>  static uint32_t compute_sh_mem_bases_64bit(unsigned int top_address_nybble)
>  {
>         /* In 64-bit mode, we can only control the top 3 bits of the LDS,
> @@ -132,6 +145,36 @@ static int update_qpd_cik(struct device_queue_manager *dqm,
>         return 0;
>  }
>
> +static int update_qpd_cik_hawaii(struct device_queue_manager *dqm,
> +               struct qcm_process_device *qpd)
> +{
> +       struct kfd_process_device *pdd;
> +       unsigned int temp;
> +
> +       pdd = qpd_to_pdd(qpd);
> +
> +       /* check if sh_mem_config register already configured */
> +       if (qpd->sh_mem_config == 0) {
> +               qpd->sh_mem_config =
> +                       ALIGNMENT_MODE(SH_MEM_ALIGNMENT_MODE_UNALIGNED) |
> +                       DEFAULT_MTYPE(MTYPE_NONCACHED) |
> +                       APE1_MTYPE(MTYPE_NONCACHED);
> +               qpd->sh_mem_ape1_limit = 0;
> +               qpd->sh_mem_ape1_base = 0;
> +       }
> +
> +       /* On dGPU we're always in GPUVM64 addressing mode with 64-bit
> +        * aperture addresses.
> +        */
> +       temp = get_sh_mem_bases_nybble_64(pdd);
> +       qpd->sh_mem_bases = compute_sh_mem_bases_64bit(temp);
> +
> +       pr_debug("is32bit process: %d sh_mem_bases nybble: 0x%X and register 0x%X\n",
> +               qpd->pqm->process->is_32bit_user_mode, temp, qpd->sh_mem_bases);
> +
> +       return 0;
> +}
> +
>  static void init_sdma_vm(struct device_queue_manager *dqm, struct queue *q,
>                                 struct qcm_process_device *qpd)
>  {
> @@ -147,3 +190,16 @@ static void init_sdma_vm(struct device_queue_manager *dqm, struct queue *q,
>
>         q->properties.sdma_vm_addr = value;
>  }
> +
> +static void init_sdma_vm_hawaii(struct device_queue_manager *dqm,
> +                               struct queue *q,
> +                               struct qcm_process_device *qpd)
> +{
> +       /* On dGPU we're always in GPUVM64 addressing mode with 64-bit
> +        * aperture addresses.
> +        */
> +       q->properties.sdma_vm_addr =
> +               ((get_sh_mem_bases_nybble_64(qpd_to_pdd(qpd))) <<
> +                SDMA0_RLC0_VIRTUAL_ADDR__SHARED_BASE__SHIFT) &
> +               SDMA0_RLC0_VIRTUAL_ADDR__SHARED_BASE_MASK;
> +}
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c
> index 2fbce57..fd60a11 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c
> @@ -33,10 +33,21 @@ static bool set_cache_memory_policy_vi(struct device_queue_manager *dqm,
>                                    enum cache_policy alternate_policy,
>                                    void __user *alternate_aperture_base,
>                                    uint64_t alternate_aperture_size);
> +static bool set_cache_memory_policy_vi_tonga(struct device_queue_manager *dqm,
> +                       struct qcm_process_device *qpd,
> +                       enum cache_policy default_policy,
> +                       enum cache_policy alternate_policy,
> +                       void __user *alternate_aperture_base,
> +                       uint64_t alternate_aperture_size);
>  static int update_qpd_vi(struct device_queue_manager *dqm,
>                                         struct qcm_process_device *qpd);
> +static int update_qpd_vi_tonga(struct device_queue_manager *dqm,
> +                       struct qcm_process_device *qpd);
>  static void init_sdma_vm(struct device_queue_manager *dqm, struct queue *q,
>                                 struct qcm_process_device *qpd);
> +static void init_sdma_vm_tonga(struct device_queue_manager *dqm,
> +                       struct queue *q,
> +                       struct qcm_process_device *qpd);
>
>  void device_queue_manager_init_vi(
>                 struct device_queue_manager_asic_ops *asic_ops)
> @@ -46,6 +57,14 @@ void device_queue_manager_init_vi(
>         asic_ops->init_sdma_vm = init_sdma_vm;
>  }
>
> +void device_queue_manager_init_vi_tonga(
> +               struct device_queue_manager_asic_ops *asic_ops)
> +{
> +       asic_ops->set_cache_memory_policy = set_cache_memory_policy_vi_tonga;
> +       asic_ops->update_qpd = update_qpd_vi_tonga;
> +       asic_ops->init_sdma_vm = init_sdma_vm_tonga;
> +}
> +
>  static uint32_t compute_sh_mem_bases_64bit(unsigned int top_address_nybble)
>  {
>         /* In 64-bit mode, we can only control the top 3 bits of the LDS,
> @@ -103,6 +122,33 @@ static bool set_cache_memory_policy_vi(struct device_queue_manager *dqm,
>         return true;
>  }
>
> +static bool set_cache_memory_policy_vi_tonga(struct device_queue_manager *dqm,
> +               struct qcm_process_device *qpd,
> +               enum cache_policy default_policy,
> +               enum cache_policy alternate_policy,
> +               void __user *alternate_aperture_base,
> +               uint64_t alternate_aperture_size)
> +{
> +       uint32_t default_mtype;
> +       uint32_t ape1_mtype;
> +
> +       default_mtype = (default_policy == cache_policy_coherent) ?
> +                       MTYPE_UC :
> +                       MTYPE_NC;
> +
> +       ape1_mtype = (alternate_policy == cache_policy_coherent) ?
> +                       MTYPE_UC :
> +                       MTYPE_NC;
> +
> +       qpd->sh_mem_config =
> +                       SH_MEM_ALIGNMENT_MODE_UNALIGNED <<
> +                                  SH_MEM_CONFIG__ALIGNMENT_MODE__SHIFT |
> +                       default_mtype << SH_MEM_CONFIG__DEFAULT_MTYPE__SHIFT |
> +                       ape1_mtype << SH_MEM_CONFIG__APE1_MTYPE__SHIFT;
> +
> +       return true;
> +}
> +
>  static int update_qpd_vi(struct device_queue_manager *dqm,
>                                         struct qcm_process_device *qpd)
>  {
> @@ -144,6 +190,40 @@ static int update_qpd_vi(struct device_queue_manager *dqm,
>         return 0;
>  }
>
> +static int update_qpd_vi_tonga(struct device_queue_manager *dqm,
> +                       struct qcm_process_device *qpd)
> +{
> +       struct kfd_process_device *pdd;
> +       unsigned int temp;
> +
> +       pdd = qpd_to_pdd(qpd);
> +
> +       /* check if sh_mem_config register already configured */
> +       if (qpd->sh_mem_config == 0) {
> +               qpd->sh_mem_config =
> +                               SH_MEM_ALIGNMENT_MODE_UNALIGNED <<
> +                                       SH_MEM_CONFIG__ALIGNMENT_MODE__SHIFT |
> +                               MTYPE_UC <<
> +                                       SH_MEM_CONFIG__DEFAULT_MTYPE__SHIFT |
> +                               MTYPE_UC <<
> +                                       SH_MEM_CONFIG__APE1_MTYPE__SHIFT;
> +
> +               qpd->sh_mem_ape1_limit = 0;
> +               qpd->sh_mem_ape1_base = 0;
> +       }
> +
> +       /* On dGPU we're always in GPUVM64 addressing mode with 64-bit
> +        * aperture addresses.
> +        */
> +       temp = get_sh_mem_bases_nybble_64(pdd);
> +       qpd->sh_mem_bases = compute_sh_mem_bases_64bit(temp);
> +
> +       pr_debug("sh_mem_bases nybble: 0x%X and register 0x%X\n",
> +               temp, qpd->sh_mem_bases);
> +
> +       return 0;
> +}
> +
>  static void init_sdma_vm(struct device_queue_manager *dqm, struct queue *q,
>                                 struct qcm_process_device *qpd)
>  {
> @@ -159,3 +239,16 @@ static void init_sdma_vm(struct device_queue_manager *dqm, struct queue *q,
>
>         q->properties.sdma_vm_addr = value;
>  }
> +
> +static void init_sdma_vm_tonga(struct device_queue_manager *dqm,
> +                       struct queue *q,
> +                       struct qcm_process_device *qpd)
> +{
> +       /* On dGPU we're always in GPUVM64 addressing mode with 64-bit
> +        * aperture addresses.
> +        */
> +       q->properties.sdma_vm_addr =
> +               ((get_sh_mem_bases_nybble_64(qpd_to_pdd(qpd))) <<
> +                SDMA0_RLC0_VIRTUAL_ADDR__SHARED_BASE__SHIFT) &
> +               SDMA0_RLC0_VIRTUAL_ADDR__SHARED_BASE_MASK;
> +}
> --
> 2.7.4
>

This patch is:
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 3/9] drm/amdkfd: Make IOMMUv2 code conditional
       [not found]             ` <CAFCwf12pqRA4KdRLpkUmiBs7EQmTePcy80V2kP9mP3pN8V-eTg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2018-01-31 15:11               ` Christian König
  2018-01-31 16:14               ` Felix Kuehling
  2018-02-03  2:29               ` Felix Kuehling
  2 siblings, 0 replies; 47+ messages in thread
From: Christian König @ 2018-01-31 15:11 UTC (permalink / raw)
  To: Oded Gabbay, Felix Kuehling; +Cc: amd-gfx list

Am 31.01.2018 um 16:00 schrieb Oded Gabbay:
> On Wed, Jan 31, 2018 at 4:56 PM, Oded Gabbay <oded.gabbay@gmail.com> wrote:
>> Hi Felix,
>> Please don't spread 19 #ifdefs throughout the code.
>> I suggest to put one #ifdef in linux/amd-iommu.h itself around all the
>> functions declarations and in the #else section put macros with empty
>> implementations. This is much more readable and maintainable.
>>
>> Oded
> To emphasize my point, there is a call to amd_iommu_bind_pasid in
> kfd_bind_processes_to_device() which isn't wrapped with the #ifdef so
> the compliation breaks. Putting the #ifdefs around the calls is simply
> not scalable.

I agree with Oded on that.

Additional to this you need to add "imply AMD_IOMMU_V2" to the Kconfig.

Otherwise you can compile amdkfd into the kernel and amd_iommu_v2 as 
module which would result in a kernel with unresolved symbols.

Christian.

>
> Oded
>
>>
>> On Fri, Jan 5, 2018 at 12:17 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
>>> dGPUs work without IOMMUv2. Make IOMMUv2 initialization dependent on
>>> ASIC information. Also allow building KFD without IOMMUv2 support.
>>> This is still useful for dGPUs and prepares for enabling KFD on
>>> architectures that don't support AMD IOMMUv2.
>>>
>>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdkfd/Kconfig        |  2 +-
>>>   drivers/gpu/drm/amd/amdkfd/kfd_crat.c     |  8 +++-
>>>   drivers/gpu/drm/amd/amdkfd/kfd_device.c   | 62 +++++++++++++++++++++----------
>>>   drivers/gpu/drm/amd/amdkfd/kfd_events.c   |  2 +
>>>   drivers/gpu/drm/amd/amdkfd/kfd_priv.h     |  5 +++
>>>   drivers/gpu/drm/amd/amdkfd/kfd_process.c  | 17 ++++++---
>>>   drivers/gpu/drm/amd/amdkfd/kfd_topology.c |  2 +
>>>   drivers/gpu/drm/amd/amdkfd/kfd_topology.h |  2 +
>>>   8 files changed, 74 insertions(+), 26 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/Kconfig b/drivers/gpu/drm/amd/amdkfd/Kconfig
>>> index bc5a294..5bbeb95 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/Kconfig
>>> +++ b/drivers/gpu/drm/amd/amdkfd/Kconfig
>>> @@ -4,6 +4,6 @@
>>>
>>>   config HSA_AMD
>>>          tristate "HSA kernel driver for AMD GPU devices"
>>> -       depends on DRM_AMDGPU && AMD_IOMMU_V2 && X86_64
>>> +       depends on DRM_AMDGPU && X86_64
>>>          help
>>>            Enable this if you want to use HSA features on AMD GPU devices.
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
>>> index 2bc2816..3478270 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
>>> @@ -22,7 +22,9 @@
>>>
>>>   #include <linux/pci.h>
>>>   #include <linux/acpi.h>
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>   #include <linux/amd-iommu.h>
>>> +#endif
>>>   #include "kfd_crat.h"
>>>   #include "kfd_priv.h"
>>>   #include "kfd_topology.h"
>>> @@ -1037,15 +1039,17 @@ static int kfd_create_vcrat_image_gpu(void *pcrat_image,
>>>          struct crat_subtype_generic *sub_type_hdr;
>>>          struct crat_subtype_computeunit *cu;
>>>          struct kfd_cu_info cu_info;
>>> -       struct amd_iommu_device_info iommu_info;
>>>          int avail_size = *size;
>>>          uint32_t total_num_of_cu;
>>>          int num_of_cache_entries = 0;
>>>          int cache_mem_filled = 0;
>>>          int ret = 0;
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>> +       struct amd_iommu_device_info iommu_info;
>>>          const u32 required_iommu_flags = AMD_IOMMU_DEVICE_FLAG_ATS_SUP |
>>>                                           AMD_IOMMU_DEVICE_FLAG_PRI_SUP |
>>>                                           AMD_IOMMU_DEVICE_FLAG_PASID_SUP;
>>> +#endif
>>>          struct kfd_local_mem_info local_mem_info;
>>>
>>>          if (!pcrat_image || avail_size < VCRAT_SIZE_FOR_GPU)
>>> @@ -1106,12 +1110,14 @@ static int kfd_create_vcrat_image_gpu(void *pcrat_image,
>>>          /* Check if this node supports IOMMU. During parsing this flag will
>>>           * translate to HSA_CAP_ATS_PRESENT
>>>           */
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>          iommu_info.flags = 0;
>>>          if (amd_iommu_device_info(kdev->pdev, &iommu_info) == 0) {
>>>                  if ((iommu_info.flags & required_iommu_flags) ==
>>>                                  required_iommu_flags)
>>>                          cu->hsa_capability |= CRAT_CU_FLAGS_IOMMU_PRESENT;
>>>          }
>>> +#endif
>>>
>>>          crat_table->length += sub_type_hdr->length;
>>>          crat_table->total_entries++;
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>> index fafe971..5205b34 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>> @@ -20,7 +20,9 @@
>>>    * OTHER DEALINGS IN THE SOFTWARE.
>>>    */
>>>
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>   #include <linux/amd-iommu.h>
>>> +#endif
>>>   #include <linux/bsearch.h>
>>>   #include <linux/pci.h>
>>>   #include <linux/slab.h>
>>> @@ -31,6 +33,7 @@
>>>
>>>   #define MQD_SIZE_ALIGNED 768
>>>
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>   static const struct kfd_device_info kaveri_device_info = {
>>>          .asic_family = CHIP_KAVERI,
>>>          .max_pasid_bits = 16,
>>> @@ -41,6 +44,7 @@ static const struct kfd_device_info kaveri_device_info = {
>>>          .num_of_watch_points = 4,
>>>          .mqd_size_aligned = MQD_SIZE_ALIGNED,
>>>          .supports_cwsr = false,
>>> +       .needs_iommu_device = true,
>>>          .needs_pci_atomics = false,
>>>   };
>>>
>>> @@ -54,8 +58,10 @@ static const struct kfd_device_info carrizo_device_info = {
>>>          .num_of_watch_points = 4,
>>>          .mqd_size_aligned = MQD_SIZE_ALIGNED,
>>>          .supports_cwsr = true,
>>> +       .needs_iommu_device = true,
>>>          .needs_pci_atomics = false,
>>>   };
>>> +#endif
>>>
>>>   struct kfd_deviceid {
>>>          unsigned short did;
>>> @@ -64,6 +70,7 @@ struct kfd_deviceid {
>>>
>>>   /* Please keep this sorted by increasing device id. */
>>>   static const struct kfd_deviceid supported_devices[] = {
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>          { 0x1304, &kaveri_device_info },        /* Kaveri */
>>>          { 0x1305, &kaveri_device_info },        /* Kaveri */
>>>          { 0x1306, &kaveri_device_info },        /* Kaveri */
>>> @@ -91,6 +98,7 @@ static const struct kfd_deviceid supported_devices[] = {
>>>          { 0x9875, &carrizo_device_info },       /* Carrizo */
>>>          { 0x9876, &carrizo_device_info },       /* Carrizo */
>>>          { 0x9877, &carrizo_device_info }        /* Carrizo */
>>> +#endif
>>>   };
>>>
>>>   static int kfd_gtt_sa_init(struct kfd_dev *kfd, unsigned int buf_size,
>>> @@ -161,6 +169,7 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
>>>          return kfd;
>>>   }
>>>
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>   static bool device_iommu_pasid_init(struct kfd_dev *kfd)
>>>   {
>>>          const u32 required_iommu_flags = AMD_IOMMU_DEVICE_FLAG_ATS_SUP |
>>> @@ -231,6 +240,7 @@ static int iommu_invalid_ppr_cb(struct pci_dev *pdev, int pasid,
>>>
>>>          return AMD_IOMMU_INV_PRI_RSP_INVALID;
>>>   }
>>> +#endif /* CONFIG_AMD_IOMMU_V2 */
>>>
>>>   static void kfd_cwsr_init(struct kfd_dev *kfd)
>>>   {
>>> @@ -321,12 +331,14 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>>>                  goto device_queue_manager_error;
>>>          }
>>>
>>> -       if (!device_iommu_pasid_init(kfd)) {
>>> -               dev_err(kfd_device,
>>> -                       "Error initializing iommuv2 for device %x:%x\n",
>>> -                       kfd->pdev->vendor, kfd->pdev->device);
>>> -               goto device_iommu_pasid_error;
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>> +       if (kfd->device_info->needs_iommu_device) {
>>> +               if (!device_iommu_pasid_init(kfd)) {
>>> +                       dev_err(kfd_device, "Error initializing iommuv2\n");
>>> +                       goto device_iommu_pasid_error;
>>> +               }
>>>          }
>>> +#endif
>>>
>>>          kfd_cwsr_init(kfd);
>>>
>>> @@ -386,11 +398,16 @@ void kgd2kfd_suspend(struct kfd_dev *kfd)
>>>
>>>          kfd->dqm->ops.stop(kfd->dqm);
>>>
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>> +       if (!kfd->device_info->needs_iommu_device)
>>> +               return;
>>> +
>>>          kfd_unbind_processes_from_device(kfd);
>>>
>>>          amd_iommu_set_invalidate_ctx_cb(kfd->pdev, NULL);
>>>          amd_iommu_set_invalid_ppr_cb(kfd->pdev, NULL);
>>>          amd_iommu_free_device(kfd->pdev);
>>> +#endif
>>>   }
>>>
>>>   int kgd2kfd_resume(struct kfd_dev *kfd)
>>> @@ -405,19 +422,24 @@ int kgd2kfd_resume(struct kfd_dev *kfd)
>>>   static int kfd_resume(struct kfd_dev *kfd)
>>>   {
>>>          int err = 0;
>>> -       unsigned int pasid_limit = kfd_get_pasid_limit();
>>>
>>> -       err = amd_iommu_init_device(kfd->pdev, pasid_limit);
>>> -       if (err)
>>> -               return -ENXIO;
>>> -       amd_iommu_set_invalidate_ctx_cb(kfd->pdev,
>>> -                                       iommu_pasid_shutdown_callback);
>>> -       amd_iommu_set_invalid_ppr_cb(kfd->pdev,
>>> -                                    iommu_invalid_ppr_cb);
>>> -
>>> -       err = kfd_bind_processes_to_device(kfd);
>>> -       if (err)
>>> -               goto processes_bind_error;
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>> +       if (kfd->device_info->needs_iommu_device) {
>>> +               unsigned int pasid_limit = kfd_get_pasid_limit();
>>> +
>>> +               err = amd_iommu_init_device(kfd->pdev, pasid_limit);
>>> +               if (err)
>>> +                       return -ENXIO;
>>> +               amd_iommu_set_invalidate_ctx_cb(kfd->pdev,
>>> +                                               iommu_pasid_shutdown_callback);
>>> +               amd_iommu_set_invalid_ppr_cb(kfd->pdev,
>>> +                                            iommu_invalid_ppr_cb);
>>> +
>>> +               err = kfd_bind_processes_to_device(kfd);
>>> +               if (err)
>>> +                       goto processes_bind_error;
>>> +       }
>>> +#endif
>>>
>>>          err = kfd->dqm->ops.start(kfd->dqm);
>>>          if (err) {
>>> @@ -431,8 +453,10 @@ static int kfd_resume(struct kfd_dev *kfd)
>>>
>>>   dqm_start_error:
>>>   processes_bind_error:
>>> -       amd_iommu_free_device(kfd->pdev);
>>> -
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>> +       if (kfd->device_info->needs_iommu_device)
>>> +               amd_iommu_free_device(kfd->pdev);
>>> +#endif
>>>          return err;
>>>   }
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
>>> index 93aae5c..f770dc7 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
>>> @@ -837,6 +837,7 @@ static void lookup_events_by_type_and_signal(struct kfd_process *p,
>>>          }
>>>   }
>>>
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>   void kfd_signal_iommu_event(struct kfd_dev *dev, unsigned int pasid,
>>>                  unsigned long address, bool is_write_requested,
>>>                  bool is_execute_requested)
>>> @@ -905,6 +906,7 @@ void kfd_signal_iommu_event(struct kfd_dev *dev, unsigned int pasid,
>>>          mutex_unlock(&p->event_mutex);
>>>          kfd_unref_process(p);
>>>   }
>>> +#endif /* CONFIG_AMD_IOMMU_V2_MODULE */
>>>
>>>   void kfd_signal_hw_exception_event(unsigned int pasid)
>>>   {
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>> index eebfb1e..9f4766c 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>> @@ -158,6 +158,7 @@ struct kfd_device_info {
>>>          uint8_t num_of_watch_points;
>>>          uint16_t mqd_size_aligned;
>>>          bool supports_cwsr;
>>> +       bool needs_iommu_device;
>>>          bool needs_pci_atomics;
>>>   };
>>>
>>> @@ -617,9 +618,11 @@ void kfd_unref_process(struct kfd_process *p);
>>>
>>>   struct kfd_process_device *kfd_bind_process_to_device(struct kfd_dev *dev,
>>>                                                  struct kfd_process *p);
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>   int kfd_bind_processes_to_device(struct kfd_dev *dev);
>>>   void kfd_unbind_processes_from_device(struct kfd_dev *dev);
>>>   void kfd_process_iommu_unbind_callback(struct kfd_dev *dev, unsigned int pasid);
>>> +#endif
>>>   struct kfd_process_device *kfd_get_process_device_data(struct kfd_dev *dev,
>>>                                                          struct kfd_process *p);
>>>   struct kfd_process_device *kfd_create_process_device_data(struct kfd_dev *dev,
>>> @@ -784,9 +787,11 @@ int kfd_wait_on_events(struct kfd_process *p,
>>>                         uint32_t *wait_result);
>>>   void kfd_signal_event_interrupt(unsigned int pasid, uint32_t partial_id,
>>>                                  uint32_t valid_id_bits);
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>   void kfd_signal_iommu_event(struct kfd_dev *dev,
>>>                  unsigned int pasid, unsigned long address,
>>>                  bool is_write_requested, bool is_execute_requested);
>>> +#endif
>>>   void kfd_signal_hw_exception_event(unsigned int pasid);
>>>   int kfd_set_event(struct kfd_process *p, uint32_t event_id);
>>>   int kfd_reset_event(struct kfd_process *p, uint32_t event_id);
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> index a22fb071..1d0e02c 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> @@ -173,14 +173,17 @@ static void kfd_process_wq_release(struct work_struct *work)
>>>   {
>>>          struct kfd_process *p = container_of(work, struct kfd_process,
>>>                                               release_work);
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>          struct kfd_process_device *pdd;
>>>
>>>          pr_debug("Releasing process (pasid %d) in workqueue\n", p->pasid);
>>>
>>>          list_for_each_entry(pdd, &p->per_device_data, per_device_list) {
>>> -               if (pdd->bound == PDD_BOUND)
>>> +               if (pdd->bound == PDD_BOUND &&
>>> +                   pdd->dev->device_info->needs_iommu_device)
>>>                          amd_iommu_unbind_pasid(pdd->dev->pdev, p->pasid);
>>>          }
>>> +#endif
>>>
>>>          kfd_process_destroy_pdds(p);
>>>
>>> @@ -421,7 +424,6 @@ struct kfd_process_device *kfd_bind_process_to_device(struct kfd_dev *dev,
>>>                                                          struct kfd_process *p)
>>>   {
>>>          struct kfd_process_device *pdd;
>>> -       int err;
>>>
>>>          pdd = kfd_get_process_device_data(dev, p);
>>>          if (!pdd) {
>>> @@ -436,9 +438,14 @@ struct kfd_process_device *kfd_bind_process_to_device(struct kfd_dev *dev,
>>>                  return ERR_PTR(-EINVAL);
>>>          }
>>>
>>> -       err = amd_iommu_bind_pasid(dev->pdev, p->pasid, p->lead_thread);
>>> -       if (err < 0)
>>> -               return ERR_PTR(err);
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>> +       if (dev->device_info->needs_iommu_device) {
>>> +               int err = amd_iommu_bind_pasid(dev->pdev, p->pasid,
>>> +                                              p->lead_thread);
>>> +               if (err < 0)
>>> +                       return ERR_PTR(err);
>>> +       }
>>> +#endif
>>>
>>>          pdd->bound = PDD_BOUND;
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>>> index c6a7609..f57c305 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>>> @@ -875,6 +875,7 @@ static void find_system_memory(const struct dmi_header *dm,
>>>    */
>>>   static int kfd_add_perf_to_topology(struct kfd_topology_device *kdev)
>>>   {
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>          struct kfd_perf_properties *props;
>>>
>>>          if (amd_iommu_pc_supported()) {
>>> @@ -886,6 +887,7 @@ static int kfd_add_perf_to_topology(struct kfd_topology_device *kdev)
>>>                          amd_iommu_pc_get_max_counters(0); /* assume one iommu */
>>>                  list_add_tail(&props->list, &kdev->perf_props);
>>>          }
>>> +#endif
>>>
>>>          return 0;
>>>   }
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
>>> index 53fca1f..111fda2 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
>>> @@ -183,8 +183,10 @@ struct kfd_topology_device *kfd_create_topology_device(
>>>                  struct list_head *device_list);
>>>   void kfd_release_topology_device_list(struct list_head *device_list);
>>>
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>   extern bool amd_iommu_pc_supported(void);
>>>   extern u8 amd_iommu_pc_get_max_banks(u16 devid);
>>>   extern u8 amd_iommu_pc_get_max_counters(u16 devid);
>>> +#endif
>>>
>>>   #endif /* __KFD_TOPOLOGY_H__ */
>>> --
>>> 2.7.4
>>>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 6/9] drm/amdkfd: Add dGPU support to the MQD manager
       [not found]     ` <1515104268-25087-7-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2018-01-31 15:11       ` Oded Gabbay
  0 siblings, 0 replies; 47+ messages in thread
From: Oded Gabbay @ 2018-01-31 15:11 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx list

On Fri, Jan 5, 2018 at 12:17 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> On dGPUs don't set ATC addressing bits and use MTYPE_UC for coherent
> memory.
>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c     |  7 +++++
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c | 35 ++++++++++++++++++++++--
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c  | 21 ++++++++++++++
>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h            |  4 +++
>  4 files changed, 64 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
> index dfd260e..ee7061e 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c
> @@ -29,8 +29,15 @@ struct mqd_manager *mqd_manager_init(enum KFD_MQD_TYPE type,
>         switch (dev->device_info->asic_family) {
>         case CHIP_KAVERI:
>                 return mqd_manager_init_cik(type, dev);
> +       case CHIP_HAWAII:
> +               return mqd_manager_init_cik_hawaii(type, dev);
>         case CHIP_CARRIZO:
>                 return mqd_manager_init_vi(type, dev);
> +       case CHIP_TONGA:
> +       case CHIP_FIJI:
> +       case CHIP_POLARIS10:
> +       case CHIP_POLARIS11:
> +               return mqd_manager_init_vi_tonga(type, dev);
>         default:
>                 WARN(1, "Unexpected ASIC family %u",
>                      dev->device_info->asic_family);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
> index f8ef4a0..fbe3f83 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
> @@ -170,14 +170,19 @@ static int load_mqd_sdma(struct mqd_manager *mm, void *mqd,
>                                                mms);
>  }
>
> -static int update_mqd(struct mqd_manager *mm, void *mqd,
> -                       struct queue_properties *q)
> +static int __update_mqd(struct mqd_manager *mm, void *mqd,
> +                       struct queue_properties *q, unsigned int atc_bit)
>  {
>         struct cik_mqd *m;
>
>         m = get_mqd(mqd);
>         m->cp_hqd_pq_control = DEFAULT_RPTR_BLOCK_SIZE |
> -                               DEFAULT_MIN_AVAIL_SIZE | PQ_ATC_EN;
> +                               DEFAULT_MIN_AVAIL_SIZE;
> +       m->cp_hqd_ib_control = DEFAULT_MIN_IB_AVAIL_SIZE;
> +       if (atc_bit) {
> +               m->cp_hqd_pq_control |= PQ_ATC_EN;
> +               m->cp_hqd_ib_control |= IB_ATC_EN;
> +       }
>
>         /*
>          * Calculating queue size which is log base 2 of actual queue size -1
> @@ -202,6 +207,18 @@ static int update_mqd(struct mqd_manager *mm, void *mqd,
>         return 0;
>  }
>
> +static int update_mqd(struct mqd_manager *mm, void *mqd,
> +                       struct queue_properties *q)
> +{
> +       return __update_mqd(mm, mqd, q, 1);
> +}
> +
> +static int update_mqd_hawaii(struct mqd_manager *mm, void *mqd,
> +                       struct queue_properties *q)
> +{
> +       return __update_mqd(mm, mqd, q, 0);
> +}
> +
>  static int update_mqd_sdma(struct mqd_manager *mm, void *mqd,
>                                 struct queue_properties *q)
>  {
> @@ -441,3 +458,15 @@ struct mqd_manager *mqd_manager_init_cik(enum KFD_MQD_TYPE type,
>         return mqd;
>  }
>
> +struct mqd_manager *mqd_manager_init_cik_hawaii(enum KFD_MQD_TYPE type,
> +                       struct kfd_dev *dev)
> +{
> +       struct mqd_manager *mqd;
> +
> +       mqd = mqd_manager_init_cik(type, dev);
> +       if (!mqd)
> +               return NULL;
> +       if ((type == KFD_MQD_TYPE_CP) || (type == KFD_MQD_TYPE_COMPUTE))
> +               mqd->update_mqd = update_mqd_hawaii;
> +       return mqd;
> +}
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
> index 971aec0..58221c1 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
> @@ -151,6 +151,8 @@ static int __update_mqd(struct mqd_manager *mm, void *mqd,
>
>         m->cp_hqd_pq_rptr_report_addr_lo = lower_32_bits((uint64_t)q->read_ptr);
>         m->cp_hqd_pq_rptr_report_addr_hi = upper_32_bits((uint64_t)q->read_ptr);
> +       m->cp_hqd_pq_wptr_poll_addr_lo = lower_32_bits((uint64_t)q->write_ptr);
> +       m->cp_hqd_pq_wptr_poll_addr_hi = upper_32_bits((uint64_t)q->write_ptr);
>
>         m->cp_hqd_pq_doorbell_control =
>                 q->doorbell_off <<
> @@ -208,6 +210,12 @@ static int update_mqd(struct mqd_manager *mm, void *mqd,
>         return __update_mqd(mm, mqd, q, MTYPE_CC, 1);
>  }
>
> +static int update_mqd_tonga(struct mqd_manager *mm, void *mqd,
> +                       struct queue_properties *q)
> +{
> +       return __update_mqd(mm, mqd, q, MTYPE_UC, 0);
> +}
> +
>  static int destroy_mqd(struct mqd_manager *mm, void *mqd,
>                         enum kfd_preempt_type type,
>                         unsigned int timeout, uint32_t pipe_id,
> @@ -432,3 +440,16 @@ struct mqd_manager *mqd_manager_init_vi(enum KFD_MQD_TYPE type,
>
>         return mqd;
>  }
> +
> +struct mqd_manager *mqd_manager_init_vi_tonga(enum KFD_MQD_TYPE type,
> +                       struct kfd_dev *dev)
> +{
> +       struct mqd_manager *mqd;
> +
> +       mqd = mqd_manager_init_vi(type, dev);
> +       if (!mqd)
> +               return NULL;
> +       if ((type == KFD_MQD_TYPE_CP) || (type == KFD_MQD_TYPE_COMPUTE))
> +               mqd->update_mqd = update_mqd_tonga;
> +       return mqd;
> +}
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> index 9f4766c..993062e 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> @@ -709,8 +709,12 @@ struct mqd_manager *mqd_manager_init(enum KFD_MQD_TYPE type,
>                                         struct kfd_dev *dev);
>  struct mqd_manager *mqd_manager_init_cik(enum KFD_MQD_TYPE type,
>                 struct kfd_dev *dev);
> +struct mqd_manager *mqd_manager_init_cik_hawaii(enum KFD_MQD_TYPE type,
> +               struct kfd_dev *dev);
>  struct mqd_manager *mqd_manager_init_vi(enum KFD_MQD_TYPE type,
>                 struct kfd_dev *dev);
> +struct mqd_manager *mqd_manager_init_vi_tonga(enum KFD_MQD_TYPE type,
> +               struct kfd_dev *dev);
>  struct device_queue_manager *device_queue_manager_init(struct kfd_dev *dev);
>  void device_queue_manager_uninit(struct device_queue_manager *dqm);
>  struct kernel_queue *kernel_queue_init(struct kfd_dev *dev,
> --
> 2.7.4
>

This patch is:
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 7/9] drm/amdkfd: Add dGPU support to kernel_queue_init
       [not found]     ` <1515104268-25087-8-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2018-01-31 15:17       ` Oded Gabbay
       [not found]         ` <CAFCwf11nyKTuxF4R+GfWt_Zg5pRjYezbp9TEW_-OWqRhhR-rVg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 47+ messages in thread
From: Oded Gabbay @ 2018-01-31 15:17 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx list

On Fri, Jan 5, 2018 at 12:17 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> Recognize dGPU ASIC families.
>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> index 5dc6567..69f4964 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> @@ -297,10 +297,15 @@ struct kernel_queue *kernel_queue_init(struct kfd_dev *dev,
>
>         switch (dev->device_info->asic_family) {
>         case CHIP_CARRIZO:
> +       case CHIP_TONGA:
> +       case CHIP_FIJI:
> +       case CHIP_POLARIS10:
> +       case CHIP_POLARIS11:
I believe POLARIS is from arcatic islands, no ?
Maybe rename kernel_queue_init_vi to kernel_queue_init_vi_ai ?
or create a new function kernel_queue_init_ai() and assign same
functions as vi ?
Either way, I think you need to address that.

>                 kernel_queue_init_vi(&kq->ops_asic_specific);
>                 break;
>
>         case CHIP_KAVERI:
> +       case CHIP_HAWAII:
>                 kernel_queue_init_cik(&kq->ops_asic_specific);
>                 break;
>         default:
> --
> 2.7.4
>

Other then that, This patch is:
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 8/9] drm/amdkfd: Add dGPU device IDs and device info
@ 2018-01-31 15:20     ` Oded Gabbay
  0 siblings, 0 replies; 47+ messages in thread
From: Oded Gabbay @ 2018-01-31 15:20 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx list, linux-pci

On Fri, Jan 5, 2018 at 12:17 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> CC: linux-pci@vger.kernel.org
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c | 153 +++++++++++++++++++++++++++++++-
>  1 file changed, 151 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index 6dd50cc..612afaf 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -63,12 +63,118 @@ static const struct kfd_device_info carrizo_device_info = {
>  };
>  #endif
>
> +static const struct kfd_device_info hawaii_device_info = {
> +       .asic_family = CHIP_HAWAII,
> +       .max_pasid_bits = 16,
> +       /* max num of queues for KV.TODO should be a dynamic value */
> +       .max_no_of_hqd  = 24,
> +       .ih_ring_entry_size = 4 * sizeof(uint32_t),
> +       .event_interrupt_class = &event_interrupt_class_cik,
> +       .num_of_watch_points = 4,
> +       .mqd_size_aligned = MQD_SIZE_ALIGNED,
> +       .supports_cwsr = false,
> +       .needs_iommu_device = false,
> +       .needs_pci_atomics = false,
> +};
> +
> +static const struct kfd_device_info tonga_device_info = {
> +       .asic_family = CHIP_TONGA,
> +       .max_pasid_bits = 16,
> +       .max_no_of_hqd  = 24,
> +       .ih_ring_entry_size = 4 * sizeof(uint32_t),
> +       .event_interrupt_class = &event_interrupt_class_cik,
Is there any point in keeping the name event_interrupt_class_cik?
maybe just rename to event_interrupt_class ?
What will happen in vega ? If its the same I think removing the _cik
makes the code more consistent.

Oded


> +       .num_of_watch_points = 4,
> +       .mqd_size_aligned = MQD_SIZE_ALIGNED,
> +       .supports_cwsr = false,
> +       .needs_iommu_device = false,
> +       .needs_pci_atomics = true,
> +};
> +
> +static const struct kfd_device_info tonga_vf_device_info = {
> +       .asic_family = CHIP_TONGA,
> +       .max_pasid_bits = 16,
> +       .max_no_of_hqd  = 24,
> +       .ih_ring_entry_size = 4 * sizeof(uint32_t),
> +       .event_interrupt_class = &event_interrupt_class_cik,
> +       .num_of_watch_points = 4,
> +       .mqd_size_aligned = MQD_SIZE_ALIGNED,
> +       .supports_cwsr = false,
> +       .needs_iommu_device = false,
> +       .needs_pci_atomics = false,
> +};
> +
> +static const struct kfd_device_info fiji_device_info = {
> +       .asic_family = CHIP_FIJI,
> +       .max_pasid_bits = 16,
> +       .max_no_of_hqd  = 24,
> +       .ih_ring_entry_size = 4 * sizeof(uint32_t),
> +       .event_interrupt_class = &event_interrupt_class_cik,
> +       .num_of_watch_points = 4,
> +       .mqd_size_aligned = MQD_SIZE_ALIGNED,
> +       .supports_cwsr = true,
> +       .needs_iommu_device = false,
> +       .needs_pci_atomics = true,
> +};
> +
> +static const struct kfd_device_info fiji_vf_device_info = {
> +       .asic_family = CHIP_FIJI,
> +       .max_pasid_bits = 16,
> +       .max_no_of_hqd  = 24,
> +       .ih_ring_entry_size = 4 * sizeof(uint32_t),
> +       .event_interrupt_class = &event_interrupt_class_cik,
> +       .num_of_watch_points = 4,
> +       .mqd_size_aligned = MQD_SIZE_ALIGNED,
> +       .supports_cwsr = true,
> +       .needs_iommu_device = false,
> +       .needs_pci_atomics = false,
> +};
> +
> +
> +static const struct kfd_device_info polaris10_device_info = {
> +       .asic_family = CHIP_POLARIS10,
> +       .max_pasid_bits = 16,
> +       .max_no_of_hqd  = 24,
> +       .ih_ring_entry_size = 4 * sizeof(uint32_t),
> +       .event_interrupt_class = &event_interrupt_class_cik,
> +       .num_of_watch_points = 4,
> +       .mqd_size_aligned = MQD_SIZE_ALIGNED,
> +       .supports_cwsr = true,
> +       .needs_iommu_device = false,
> +       .needs_pci_atomics = true,
> +};
> +
> +static const struct kfd_device_info polaris10_vf_device_info = {
> +       .asic_family = CHIP_POLARIS10,
> +       .max_pasid_bits = 16,
> +       .max_no_of_hqd  = 24,
> +       .ih_ring_entry_size = 4 * sizeof(uint32_t),
> +       .event_interrupt_class = &event_interrupt_class_cik,
> +       .num_of_watch_points = 4,
> +       .mqd_size_aligned = MQD_SIZE_ALIGNED,
> +       .supports_cwsr = true,
> +       .needs_iommu_device = false,
> +       .needs_pci_atomics = false,
> +};
> +
> +static const struct kfd_device_info polaris11_device_info = {
> +       .asic_family = CHIP_POLARIS11,
> +       .max_pasid_bits = 16,
> +       .max_no_of_hqd  = 24,
> +       .ih_ring_entry_size = 4 * sizeof(uint32_t),
> +       .event_interrupt_class = &event_interrupt_class_cik,
> +       .num_of_watch_points = 4,
> +       .mqd_size_aligned = MQD_SIZE_ALIGNED,
> +       .supports_cwsr = true,
> +       .needs_iommu_device = false,
> +       .needs_pci_atomics = true,
> +};
> +
> +
>  struct kfd_deviceid {
>         unsigned short did;
>         const struct kfd_device_info *device_info;
>  };
>
> -/* Please keep this sorted by increasing device id. */
>  static const struct kfd_deviceid supported_devices[] = {
>  #if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>         { 0x1304, &kaveri_device_info },        /* Kaveri */
> @@ -97,8 +203,51 @@ static const struct kfd_deviceid supported_devices[] = {
>         { 0x9874, &carrizo_device_info },       /* Carrizo */
>         { 0x9875, &carrizo_device_info },       /* Carrizo */
>         { 0x9876, &carrizo_device_info },       /* Carrizo */
> -       { 0x9877, &carrizo_device_info }        /* Carrizo */
> +       { 0x9877, &carrizo_device_info },       /* Carrizo */
>  #endif
> +       { 0x67A0, &hawaii_device_info },        /* Hawaii */
> +       { 0x67A1, &hawaii_device_info },        /* Hawaii */
> +       { 0x67A2, &hawaii_device_info },        /* Hawaii */
> +       { 0x67A8, &hawaii_device_info },        /* Hawaii */
> +       { 0x67A9, &hawaii_device_info },        /* Hawaii */
> +       { 0x67AA, &hawaii_device_info },        /* Hawaii */
> +       { 0x67B0, &hawaii_device_info },        /* Hawaii */
> +       { 0x67B1, &hawaii_device_info },        /* Hawaii */
> +       { 0x67B8, &hawaii_device_info },        /* Hawaii */
> +       { 0x67B9, &hawaii_device_info },        /* Hawaii */
> +       { 0x67BA, &hawaii_device_info },        /* Hawaii */
> +       { 0x67BE, &hawaii_device_info },        /* Hawaii */
> +       { 0x6920, &tonga_device_info },         /* Tonga */
> +       { 0x6921, &tonga_device_info },         /* Tonga */
> +       { 0x6928, &tonga_device_info },         /* Tonga */
> +       { 0x6929, &tonga_device_info },         /* Tonga */
> +       { 0x692B, &tonga_device_info },         /* Tonga */
> +       { 0x692F, &tonga_vf_device_info },      /* Tonga vf */
> +       { 0x6938, &tonga_device_info },         /* Tonga */
> +       { 0x6939, &tonga_device_info },         /* Tonga */
> +       { 0x7300, &fiji_device_info },          /* Fiji */
> +       { 0x730F, &fiji_vf_device_info },       /* Fiji vf*/
> +       { 0x67C0, &polaris10_device_info },     /* Polaris10 */
> +       { 0x67C1, &polaris10_device_info },     /* Polaris10 */
> +       { 0x67C2, &polaris10_device_info },     /* Polaris10 */
> +       { 0x67C4, &polaris10_device_info },     /* Polaris10 */
> +       { 0x67C7, &polaris10_device_info },     /* Polaris10 */
> +       { 0x67C8, &polaris10_device_info },     /* Polaris10 */
> +       { 0x67C9, &polaris10_device_info },     /* Polaris10 */
> +       { 0x67CA, &polaris10_device_info },     /* Polaris10 */
> +       { 0x67CC, &polaris10_device_info },     /* Polaris10 */
> +       { 0x67CF, &polaris10_device_info },     /* Polaris10 */
> +       { 0x67D0, &polaris10_vf_device_info },  /* Polaris10 vf*/
> +       { 0x67DF, &polaris10_device_info },     /* Polaris10 */
> +       { 0x67E0, &polaris11_device_info },     /* Polaris11 */
> +       { 0x67E1, &polaris11_device_info },     /* Polaris11 */
> +       { 0x67E3, &polaris11_device_info },     /* Polaris11 */
> +       { 0x67E7, &polaris11_device_info },     /* Polaris11 */
> +       { 0x67E8, &polaris11_device_info },     /* Polaris11 */
> +       { 0x67E9, &polaris11_device_info },     /* Polaris11 */
> +       { 0x67EB, &polaris11_device_info },     /* Polaris11 */
> +       { 0x67EF, &polaris11_device_info },     /* Polaris11 */
> +       { 0x67FF, &polaris11_device_info },     /* Polaris11 */
>  };
>
>  static int kfd_gtt_sa_init(struct kfd_dev *kfd, unsigned int buf_size,
> --
> 2.7.4
>

Other then the note above, This patch is:
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 8/9] drm/amdkfd: Add dGPU device IDs and device info
@ 2018-01-31 15:20     ` Oded Gabbay
  0 siblings, 0 replies; 47+ messages in thread
From: Oded Gabbay @ 2018-01-31 15:20 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: linux-pci-u79uwXL29TY76Z2rM5mHXA, amd-gfx list

On Fri, Jan 5, 2018 at 12:17 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> CC: linux-pci@vger.kernel.org
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c | 153 +++++++++++++++++++++++++++++++-
>  1 file changed, 151 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index 6dd50cc..612afaf 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -63,12 +63,118 @@ static const struct kfd_device_info carrizo_device_info = {
>  };
>  #endif
>
> +static const struct kfd_device_info hawaii_device_info = {
> +       .asic_family = CHIP_HAWAII,
> +       .max_pasid_bits = 16,
> +       /* max num of queues for KV.TODO should be a dynamic value */
> +       .max_no_of_hqd  = 24,
> +       .ih_ring_entry_size = 4 * sizeof(uint32_t),
> +       .event_interrupt_class = &event_interrupt_class_cik,
> +       .num_of_watch_points = 4,
> +       .mqd_size_aligned = MQD_SIZE_ALIGNED,
> +       .supports_cwsr = false,
> +       .needs_iommu_device = false,
> +       .needs_pci_atomics = false,
> +};
> +
> +static const struct kfd_device_info tonga_device_info = {
> +       .asic_family = CHIP_TONGA,
> +       .max_pasid_bits = 16,
> +       .max_no_of_hqd  = 24,
> +       .ih_ring_entry_size = 4 * sizeof(uint32_t),
> +       .event_interrupt_class = &event_interrupt_class_cik,
Is there any point in keeping the name event_interrupt_class_cik?
maybe just rename to event_interrupt_class ?
What will happen in vega ? If its the same I think removing the _cik
makes the code more consistent.

Oded


> +       .num_of_watch_points = 4,
> +       .mqd_size_aligned = MQD_SIZE_ALIGNED,
> +       .supports_cwsr = false,
> +       .needs_iommu_device = false,
> +       .needs_pci_atomics = true,
> +};
> +
> +static const struct kfd_device_info tonga_vf_device_info = {
> +       .asic_family = CHIP_TONGA,
> +       .max_pasid_bits = 16,
> +       .max_no_of_hqd  = 24,
> +       .ih_ring_entry_size = 4 * sizeof(uint32_t),
> +       .event_interrupt_class = &event_interrupt_class_cik,
> +       .num_of_watch_points = 4,
> +       .mqd_size_aligned = MQD_SIZE_ALIGNED,
> +       .supports_cwsr = false,
> +       .needs_iommu_device = false,
> +       .needs_pci_atomics = false,
> +};
> +
> +static const struct kfd_device_info fiji_device_info = {
> +       .asic_family = CHIP_FIJI,
> +       .max_pasid_bits = 16,
> +       .max_no_of_hqd  = 24,
> +       .ih_ring_entry_size = 4 * sizeof(uint32_t),
> +       .event_interrupt_class = &event_interrupt_class_cik,
> +       .num_of_watch_points = 4,
> +       .mqd_size_aligned = MQD_SIZE_ALIGNED,
> +       .supports_cwsr = true,
> +       .needs_iommu_device = false,
> +       .needs_pci_atomics = true,
> +};
> +
> +static const struct kfd_device_info fiji_vf_device_info = {
> +       .asic_family = CHIP_FIJI,
> +       .max_pasid_bits = 16,
> +       .max_no_of_hqd  = 24,
> +       .ih_ring_entry_size = 4 * sizeof(uint32_t),
> +       .event_interrupt_class = &event_interrupt_class_cik,
> +       .num_of_watch_points = 4,
> +       .mqd_size_aligned = MQD_SIZE_ALIGNED,
> +       .supports_cwsr = true,
> +       .needs_iommu_device = false,
> +       .needs_pci_atomics = false,
> +};
> +
> +
> +static const struct kfd_device_info polaris10_device_info = {
> +       .asic_family = CHIP_POLARIS10,
> +       .max_pasid_bits = 16,
> +       .max_no_of_hqd  = 24,
> +       .ih_ring_entry_size = 4 * sizeof(uint32_t),
> +       .event_interrupt_class = &event_interrupt_class_cik,
> +       .num_of_watch_points = 4,
> +       .mqd_size_aligned = MQD_SIZE_ALIGNED,
> +       .supports_cwsr = true,
> +       .needs_iommu_device = false,
> +       .needs_pci_atomics = true,
> +};
> +
> +static const struct kfd_device_info polaris10_vf_device_info = {
> +       .asic_family = CHIP_POLARIS10,
> +       .max_pasid_bits = 16,
> +       .max_no_of_hqd  = 24,
> +       .ih_ring_entry_size = 4 * sizeof(uint32_t),
> +       .event_interrupt_class = &event_interrupt_class_cik,
> +       .num_of_watch_points = 4,
> +       .mqd_size_aligned = MQD_SIZE_ALIGNED,
> +       .supports_cwsr = true,
> +       .needs_iommu_device = false,
> +       .needs_pci_atomics = false,
> +};
> +
> +static const struct kfd_device_info polaris11_device_info = {
> +       .asic_family = CHIP_POLARIS11,
> +       .max_pasid_bits = 16,
> +       .max_no_of_hqd  = 24,
> +       .ih_ring_entry_size = 4 * sizeof(uint32_t),
> +       .event_interrupt_class = &event_interrupt_class_cik,
> +       .num_of_watch_points = 4,
> +       .mqd_size_aligned = MQD_SIZE_ALIGNED,
> +       .supports_cwsr = true,
> +       .needs_iommu_device = false,
> +       .needs_pci_atomics = true,
> +};
> +
> +
>  struct kfd_deviceid {
>         unsigned short did;
>         const struct kfd_device_info *device_info;
>  };
>
> -/* Please keep this sorted by increasing device id. */
>  static const struct kfd_deviceid supported_devices[] = {
>  #if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>         { 0x1304, &kaveri_device_info },        /* Kaveri */
> @@ -97,8 +203,51 @@ static const struct kfd_deviceid supported_devices[] = {
>         { 0x9874, &carrizo_device_info },       /* Carrizo */
>         { 0x9875, &carrizo_device_info },       /* Carrizo */
>         { 0x9876, &carrizo_device_info },       /* Carrizo */
> -       { 0x9877, &carrizo_device_info }        /* Carrizo */
> +       { 0x9877, &carrizo_device_info },       /* Carrizo */
>  #endif
> +       { 0x67A0, &hawaii_device_info },        /* Hawaii */
> +       { 0x67A1, &hawaii_device_info },        /* Hawaii */
> +       { 0x67A2, &hawaii_device_info },        /* Hawaii */
> +       { 0x67A8, &hawaii_device_info },        /* Hawaii */
> +       { 0x67A9, &hawaii_device_info },        /* Hawaii */
> +       { 0x67AA, &hawaii_device_info },        /* Hawaii */
> +       { 0x67B0, &hawaii_device_info },        /* Hawaii */
> +       { 0x67B1, &hawaii_device_info },        /* Hawaii */
> +       { 0x67B8, &hawaii_device_info },        /* Hawaii */
> +       { 0x67B9, &hawaii_device_info },        /* Hawaii */
> +       { 0x67BA, &hawaii_device_info },        /* Hawaii */
> +       { 0x67BE, &hawaii_device_info },        /* Hawaii */
> +       { 0x6920, &tonga_device_info },         /* Tonga */
> +       { 0x6921, &tonga_device_info },         /* Tonga */
> +       { 0x6928, &tonga_device_info },         /* Tonga */
> +       { 0x6929, &tonga_device_info },         /* Tonga */
> +       { 0x692B, &tonga_device_info },         /* Tonga */
> +       { 0x692F, &tonga_vf_device_info },      /* Tonga vf */
> +       { 0x6938, &tonga_device_info },         /* Tonga */
> +       { 0x6939, &tonga_device_info },         /* Tonga */
> +       { 0x7300, &fiji_device_info },          /* Fiji */
> +       { 0x730F, &fiji_vf_device_info },       /* Fiji vf*/
> +       { 0x67C0, &polaris10_device_info },     /* Polaris10 */
> +       { 0x67C1, &polaris10_device_info },     /* Polaris10 */
> +       { 0x67C2, &polaris10_device_info },     /* Polaris10 */
> +       { 0x67C4, &polaris10_device_info },     /* Polaris10 */
> +       { 0x67C7, &polaris10_device_info },     /* Polaris10 */
> +       { 0x67C8, &polaris10_device_info },     /* Polaris10 */
> +       { 0x67C9, &polaris10_device_info },     /* Polaris10 */
> +       { 0x67CA, &polaris10_device_info },     /* Polaris10 */
> +       { 0x67CC, &polaris10_device_info },     /* Polaris10 */
> +       { 0x67CF, &polaris10_device_info },     /* Polaris10 */
> +       { 0x67D0, &polaris10_vf_device_info },  /* Polaris10 vf*/
> +       { 0x67DF, &polaris10_device_info },     /* Polaris10 */
> +       { 0x67E0, &polaris11_device_info },     /* Polaris11 */
> +       { 0x67E1, &polaris11_device_info },     /* Polaris11 */
> +       { 0x67E3, &polaris11_device_info },     /* Polaris11 */
> +       { 0x67E7, &polaris11_device_info },     /* Polaris11 */
> +       { 0x67E8, &polaris11_device_info },     /* Polaris11 */
> +       { 0x67E9, &polaris11_device_info },     /* Polaris11 */
> +       { 0x67EB, &polaris11_device_info },     /* Polaris11 */
> +       { 0x67EF, &polaris11_device_info },     /* Polaris11 */
> +       { 0x67FF, &polaris11_device_info },     /* Polaris11 */
>  };
>
>  static int kfd_gtt_sa_init(struct kfd_dev *kfd, unsigned int buf_size,
> --
> 2.7.4
>

Other then the note above, This patch is:
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 7/9] drm/amdkfd: Add dGPU support to kernel_queue_init
       [not found]         ` <CAFCwf11nyKTuxF4R+GfWt_Zg5pRjYezbp9TEW_-OWqRhhR-rVg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2018-01-31 15:23           ` Deucher, Alexander
       [not found]             ` <BN6PR12MB16520A78D8FFA5E0AF393BD2F7FB0-/b2+HYfkarQqUD6E6FAiowdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
  0 siblings, 1 reply; 47+ messages in thread
From: Deucher, Alexander @ 2018-01-31 15:23 UTC (permalink / raw)
  To: Oded Gabbay, Kuehling, Felix; +Cc: amd-gfx list


[-- Attachment #1.1: Type: text/plain, Size: 2118 bytes --]


________________________________
From: amd-gfx <amd-gfx-bounces-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org> on behalf of Oded Gabbay <oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Sent: Wednesday, January 31, 2018 10:17 AM
To: Kuehling, Felix
Cc: amd-gfx list
Subject: Re: [PATCH 7/9] drm/amdkfd: Add dGPU support to kernel_queue_init

On Fri, Jan 5, 2018 at 12:17 AM, Felix Kuehling <Felix.Kuehling-5C7GfCeVMHo@public.gmane.org> wrote:
> Recognize dGPU ASIC families.
>
> Signed-off-by: Felix Kuehling <Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> index 5dc6567..69f4964 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> @@ -297,10 +297,15 @@ struct kernel_queue *kernel_queue_init(struct kfd_dev *dev,
>
>         switch (dev->device_info->asic_family) {
>         case CHIP_CARRIZO:
> +       case CHIP_TONGA:
> +       case CHIP_FIJI:
> +       case CHIP_POLARIS10:
> +       case CHIP_POLARIS11:
I believe POLARIS is from arcatic islands, no ?
Maybe rename kernel_queue_init_vi to kernel_queue_init_vi_ai ?
or create a new function kernel_queue_init_ai() and assign same
functions as vi ?
Either way, I think you need to address that.

They are all gfx8.  adding ai just confuses things.

Alex


>                 kernel_queue_init_vi(&kq->ops_asic_specific);
>                 break;
>
>         case CHIP_KAVERI:
> +       case CHIP_HAWAII:
>                 kernel_queue_init_cik(&kq->ops_asic_specific);
>                 break;
>         default:
> --
> 2.7.4
>

Other then that, This patch is:
Reviewed-by: Oded Gabbay <oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
_______________________________________________
amd-gfx mailing list
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[-- Attachment #1.2: Type: text/html, Size: 4363 bytes --]

[-- Attachment #2: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 9/9] drm/amdgpu: Enable KFD initialization on dGPUs
       [not found]     ` <1515104268-25087-10-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2018-01-31 15:25       ` Oded Gabbay
       [not found]         ` <CAFCwf11WWuHydSRBu3Pk8-jFLgoxJ7k0GDfuO-HWRjpvSRm5xQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 47+ messages in thread
From: Oded Gabbay @ 2018-01-31 15:25 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx list

On Fri, Jan 5, 2018 at 12:17 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> index 335e454..7ebe430 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> @@ -78,10 +78,15 @@ void amdgpu_amdkfd_device_probe(struct amdgpu_device *adev)
>         switch (adev->asic_type) {
>  #ifdef CONFIG_DRM_AMDGPU_CIK
>         case CHIP_KAVERI:
> +       case CHIP_HAWAII:
>                 kfd2kgd = amdgpu_amdkfd_gfx_7_get_functions();
>                 break;
>  #endif
>         case CHIP_CARRIZO:
> +       case CHIP_TONGA:
> +       case CHIP_FIJI:
> +       case CHIP_POLARIS10:
> +       case CHIP_POLARIS11:
Polaris isn't gfx 9 ?
or is it called differently ?

>                 kfd2kgd = amdgpu_amdkfd_gfx_8_0_get_functions();
>                 break;
>         default:
> --
> 2.7.4
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 9/9] drm/amdgpu: Enable KFD initialization on dGPUs
       [not found]         ` <CAFCwf11WWuHydSRBu3Pk8-jFLgoxJ7k0GDfuO-HWRjpvSRm5xQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2018-01-31 15:28           ` Christian König
       [not found]             ` <5881ecb1-3d76-9783-2b60-5b43b5547a3d-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2018-01-31 16:33           ` Felix Kuehling
  1 sibling, 1 reply; 47+ messages in thread
From: Christian König @ 2018-01-31 15:28 UTC (permalink / raw)
  To: Oded Gabbay, Felix Kuehling; +Cc: amd-gfx list

Am 31.01.2018 um 16:25 schrieb Oded Gabbay:
> On Fri, Jan 5, 2018 at 12:17 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 5 +++++
>>   1 file changed, 5 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>> index 335e454..7ebe430 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>> @@ -78,10 +78,15 @@ void amdgpu_amdkfd_device_probe(struct amdgpu_device *adev)
>>          switch (adev->asic_type) {
>>   #ifdef CONFIG_DRM_AMDGPU_CIK
>>          case CHIP_KAVERI:
>> +       case CHIP_HAWAII:
>>                  kfd2kgd = amdgpu_amdkfd_gfx_7_get_functions();
>>                  break;
>>   #endif
>>          case CHIP_CARRIZO:
>> +       case CHIP_TONGA:
>> +       case CHIP_FIJI:
>> +       case CHIP_POLARIS10:
>> +       case CHIP_POLARIS11:
> Polaris isn't gfx 9 ?
> or is it called differently ?

No Polaris are just updated gfx8 variants.

gfx9 is Vega10.

Christian.

>
>>                  kfd2kgd = amdgpu_amdkfd_gfx_8_0_get_functions();
>>                  break;
>>          default:
>> --
>> 2.7.4
>>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 7/9] drm/amdkfd: Add dGPU support to kernel_queue_init
       [not found]             ` <BN6PR12MB16520A78D8FFA5E0AF393BD2F7FB0-/b2+HYfkarQqUD6E6FAiowdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
@ 2018-01-31 15:29               ` Oded Gabbay
       [not found]                 ` <CAFCwf127vkM7aEcyUK9VjrVekZAFin7d7sk6Ko=JV5gibBeukg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 47+ messages in thread
From: Oded Gabbay @ 2018-01-31 15:29 UTC (permalink / raw)
  To: Deucher, Alexander; +Cc: Kuehling, Felix, amd-gfx list

On Wed, Jan 31, 2018 at 5:23 PM, Deucher, Alexander
<Alexander.Deucher@amd.com> wrote:
>
> ________________________________
> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> on behalf of Oded
> Gabbay <oded.gabbay@gmail.com>
> Sent: Wednesday, January 31, 2018 10:17 AM
> To: Kuehling, Felix
> Cc: amd-gfx list
> Subject: Re: [PATCH 7/9] drm/amdkfd: Add dGPU support to kernel_queue_init
>
> On Fri, Jan 5, 2018 at 12:17 AM, Felix Kuehling <Felix.Kuehling@amd.com>
> wrote:
>> Recognize dGPU ASIC families.
>>
>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>> ---
>>  drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c | 5 +++++
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
>> b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
>> index 5dc6567..69f4964 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
>> @@ -297,10 +297,15 @@ struct kernel_queue *kernel_queue_init(struct
>> kfd_dev *dev,
>>
>>         switch (dev->device_info->asic_family) {
>>         case CHIP_CARRIZO:
>> +       case CHIP_TONGA:
>> +       case CHIP_FIJI:
>> +       case CHIP_POLARIS10:
>> +       case CHIP_POLARIS11:
> I believe POLARIS is from arcatic islands, no ?
> Maybe rename kernel_queue_init_vi to kernel_queue_init_vi_ai ?
> or create a new function kernel_queue_init_ai() and assign same
> functions as vi ?
> Either way, I think you need to address that.
>
> They are all gfx8.  adding ai just confuses things.
>
> Alex

In that case, I think it is better maybe to change it to
kernel_queue_init_gfx_7 and kernel_queue_init_gfx_8, to be consistent
with the calls to amdgpu_amdkfd_gfx_7_0_get_functions and
amdgpu_amdkfd_gfx_8_0_get_functions.

Leaving as cik and vi as the identifier when it clearly isn't seems
confusing to me as well.

Oded

>
>
>>                 kernel_queue_init_vi(&kq->ops_asic_specific);
>>                 break;
>>
>>         case CHIP_KAVERI:
>> +       case CHIP_HAWAII:
>>                 kernel_queue_init_cik(&kq->ops_asic_specific);
>>                 break;
>>         default:
>> --
>> 2.7.4
>>
>
> Other then that, This patch is:
> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 9/9] drm/amdgpu: Enable KFD initialization on dGPUs
       [not found]             ` <5881ecb1-3d76-9783-2b60-5b43b5547a3d-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2018-01-31 15:31               ` Oded Gabbay
       [not found]                 ` <CAFCwf10--hDY=0zFUaSM9+fZWXuk8h4AU5-PE+_0+adCAYJ34Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 47+ messages in thread
From: Oded Gabbay @ 2018-01-31 15:31 UTC (permalink / raw)
  To: Christian König; +Cc: Felix Kuehling, amd-gfx list

On Wed, Jan 31, 2018 at 5:28 PM, Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
> Am 31.01.2018 um 16:25 schrieb Oded Gabbay:
>>
>> On Fri, Jan 5, 2018 at 12:17 AM, Felix Kuehling <Felix.Kuehling@amd.com>
>> wrote:
>>>
>>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 5 +++++
>>>   1 file changed, 5 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>> index 335e454..7ebe430 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>> @@ -78,10 +78,15 @@ void amdgpu_amdkfd_device_probe(struct amdgpu_device
>>> *adev)
>>>          switch (adev->asic_type) {
>>>   #ifdef CONFIG_DRM_AMDGPU_CIK
>>>          case CHIP_KAVERI:
>>> +       case CHIP_HAWAII:
>>>                  kfd2kgd = amdgpu_amdkfd_gfx_7_get_functions();
>>>                  break;
>>>   #endif
>>>          case CHIP_CARRIZO:
>>> +       case CHIP_TONGA:
>>> +       case CHIP_FIJI:
>>> +       case CHIP_POLARIS10:
>>> +       case CHIP_POLARIS11:
>>
>> Polaris isn't gfx 9 ?
>> or is it called differently ?
>
>
> No Polaris are just updated gfx8 variants.
>
> gfx9 is Vega10.
>
> Christian.

OK, thanks. So this patch is fine and is
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>

Having said that, as I wrote to Alex, if that is the case, I think we
should rename all soc-dependent functions from using cik and vi as
identifiers to gfx7/gfx8.

Oded
>
>>
>>>                  kfd2kgd = amdgpu_amdkfd_gfx_8_0_get_functions();
>>>                  break;
>>>          default:
>>> --
>>> 2.7.4
>>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 9/9] drm/amdgpu: Enable KFD initialization on dGPUs
       [not found]                 ` <CAFCwf10--hDY=0zFUaSM9+fZWXuk8h4AU5-PE+_0+adCAYJ34Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2018-01-31 15:34                   ` Christian König
       [not found]                     ` <6ea9aa7d-f8c5-e6a5-a492-7506055527c6-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 47+ messages in thread
From: Christian König @ 2018-01-31 15:34 UTC (permalink / raw)
  To: Oded Gabbay; +Cc: Felix Kuehling, amd-gfx list

Am 31.01.2018 um 16:31 schrieb Oded Gabbay:
> On Wed, Jan 31, 2018 at 5:28 PM, Christian König
> <ckoenig.leichtzumerken@gmail.com> wrote:
>> Am 31.01.2018 um 16:25 schrieb Oded Gabbay:
>>> On Fri, Jan 5, 2018 at 12:17 AM, Felix Kuehling <Felix.Kuehling@amd.com>
>>> wrote:
>>>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>>>> ---
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 5 +++++
>>>>    1 file changed, 5 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>>> index 335e454..7ebe430 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>>> @@ -78,10 +78,15 @@ void amdgpu_amdkfd_device_probe(struct amdgpu_device
>>>> *adev)
>>>>           switch (adev->asic_type) {
>>>>    #ifdef CONFIG_DRM_AMDGPU_CIK
>>>>           case CHIP_KAVERI:
>>>> +       case CHIP_HAWAII:
>>>>                   kfd2kgd = amdgpu_amdkfd_gfx_7_get_functions();
>>>>                   break;
>>>>    #endif
>>>>           case CHIP_CARRIZO:
>>>> +       case CHIP_TONGA:
>>>> +       case CHIP_FIJI:
>>>> +       case CHIP_POLARIS10:
>>>> +       case CHIP_POLARIS11:
>>> Polaris isn't gfx 9 ?
>>> or is it called differently ?
>>
>> No Polaris are just updated gfx8 variants.
>>
>> gfx9 is Vega10.
>>
>> Christian.
> OK, thanks. So this patch is fine and is
> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
>
> Having said that, as I wrote to Alex, if that is the case, I think we
> should rename all soc-dependent functions from using cik and vi as
> identifiers to gfx7/gfx8.

Well that depends on what the name refers to.

gfx7/gfx8 just describe the version of the CP, not the full ASIC.

For example Tonga and Polaris are both gfx8, but have a different SDMA IIRC.

Christian.

>
> Oded
>>>>                   kfd2kgd = amdgpu_amdkfd_gfx_8_0_get_functions();
>>>>                   break;
>>>>           default:
>>>> --
>>>> 2.7.4
>>>>
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 9/9] drm/amdgpu: Enable KFD initialization on dGPUs
       [not found]                     ` <6ea9aa7d-f8c5-e6a5-a492-7506055527c6-5C7GfCeVMHo@public.gmane.org>
@ 2018-01-31 15:50                       ` Oded Gabbay
  0 siblings, 0 replies; 47+ messages in thread
From: Oded Gabbay @ 2018-01-31 15:50 UTC (permalink / raw)
  To: Christian König; +Cc: Felix Kuehling, amd-gfx list

On Wed, Jan 31, 2018 at 5:34 PM, Christian König
<christian.koenig@amd.com> wrote:
> Am 31.01.2018 um 16:31 schrieb Oded Gabbay:
>>
>> On Wed, Jan 31, 2018 at 5:28 PM, Christian König
>> <ckoenig.leichtzumerken@gmail.com> wrote:
>>>
>>> Am 31.01.2018 um 16:25 schrieb Oded Gabbay:
>>>>
>>>> On Fri, Jan 5, 2018 at 12:17 AM, Felix Kuehling <Felix.Kuehling@amd.com>
>>>> wrote:
>>>>>
>>>>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>>>>> ---
>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 5 +++++
>>>>>    1 file changed, 5 insertions(+)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>>>> index 335e454..7ebe430 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>>>> @@ -78,10 +78,15 @@ void amdgpu_amdkfd_device_probe(struct
>>>>> amdgpu_device
>>>>> *adev)
>>>>>           switch (adev->asic_type) {
>>>>>    #ifdef CONFIG_DRM_AMDGPU_CIK
>>>>>           case CHIP_KAVERI:
>>>>> +       case CHIP_HAWAII:
>>>>>                   kfd2kgd = amdgpu_amdkfd_gfx_7_get_functions();
>>>>>                   break;
>>>>>    #endif
>>>>>           case CHIP_CARRIZO:
>>>>> +       case CHIP_TONGA:
>>>>> +       case CHIP_FIJI:
>>>>> +       case CHIP_POLARIS10:
>>>>> +       case CHIP_POLARIS11:
>>>>
>>>> Polaris isn't gfx 9 ?
>>>> or is it called differently ?
>>>
>>>
>>> No Polaris are just updated gfx8 variants.
>>>
>>> gfx9 is Vega10.
>>>
>>> Christian.
>>
>> OK, thanks. So this patch is fine and is
>> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
>>
>> Having said that, as I wrote to Alex, if that is the case, I think we
>> should rename all soc-dependent functions from using cik and vi as
>> identifiers to gfx7/gfx8.
>
>
> Well that depends on what the name refers to.
>
> gfx7/gfx8 just describe the version of the CP, not the full ASIC.
>
> For example Tonga and Polaris are both gfx8, but have a different SDMA IIRC.
>
> Christian.

Understood, but see the following code from mqd_manager_init():

switch (dev->device_info->asic_family) {
case CHIP_KAVERI:
     return mqd_manager_init_cik(type, dev);
case CHIP_HAWAII:
     return mqd_manager_init_cik_hawaii(type, dev);
case CHIP_CARRIZO:
     return mqd_manager_init_vi(type, dev);
case CHIP_TONGA:
case CHIP_FIJI:
case CHIP_POLARIS10:
case CHIP_POLARIS11:
     return mqd_manager_init_vi_tonga(type, dev);
default:
WARN(1, "Unexpected ASIC family %u",
     dev->device_info->asic_family);
}


You started with mqd_manager_init_cik and mqd_manager_init_vi, which is nice.
Then, you add mqd_manager_init_cik_hawaii and
mqd_manager_init_vi_tonga, which is less nice.
Then, to top it off, you assign CHIP_POLARIS10 and CHIP_POLARIS11 to
mqd_manager_init_vi_tonga ? polaris is neither vi, nor tonga.

same thing in device_queue_manager_init()

That's what I meant by saying cik and vi aren't so good as identifiers.

Oded

>
>
>>
>> Oded
>>>>>
>>>>>                   kfd2kgd = amdgpu_amdkfd_gfx_8_0_get_functions();
>>>>>                   break;
>>>>>           default:
>>>>> --
>>>>> 2.7.4
>>>>>
>>>> _______________________________________________
>>>> amd-gfx mailing list
>>>> amd-gfx@lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>>
>>>
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 3/9] drm/amdkfd: Make IOMMUv2 code conditional
       [not found]             ` <CAFCwf12pqRA4KdRLpkUmiBs7EQmTePcy80V2kP9mP3pN8V-eTg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2018-01-31 15:11               ` Christian König
@ 2018-01-31 16:14               ` Felix Kuehling
  2018-02-03  2:29               ` Felix Kuehling
  2 siblings, 0 replies; 47+ messages in thread
From: Felix Kuehling @ 2018-01-31 16:14 UTC (permalink / raw)
  To: Oded Gabbay; +Cc: amd-gfx list

On 2018-01-31 10:00 AM, Oded Gabbay wrote:
> On Wed, Jan 31, 2018 at 4:56 PM, Oded Gabbay <oded.gabbay@gmail.com> wrote:
>> Hi Felix,
>> Please don't spread 19 #ifdefs throughout the code.
>> I suggest to put one #ifdef in linux/amd-iommu.h itself around all the
>> functions declarations and in the #else section put macros with empty
>> implementations. This is much more readable and maintainable.

I've seen that way of handling #ifdefs. But that would require changing
the IOMMU driver, which will add more time before this patch series can
be applied. Maybe drop just this patch for now, the rest should still apply.

BTW, there is another change coming later that puts a few #ifdef
CONFIG_ACPI around code that depends on ACPI (for getting the CRAT table
and some NUMA-related code). That's another change needed to allow KFD
to compile work on non-x86 architectures without ACPI.

>>
>> Oded
> To emphasize my point, there is a call to amd_iommu_bind_pasid in
> kfd_bind_processes_to_device() which isn't wrapped

There is a fix for that in my latest patch series ([PATCH 11/25]
drm/amdkfd: Add missing #ifdef CONFIG_AMD_IOMMU_V2 guard) that you could
squash with this commit.

>  with the #ifdef so
> the compliation breaks. Putting the #ifdefs around the calls is simply
> not scalable.

There are #ifdefs around more than just these calls. For example I'm
removing support for the APU device IDs if there is no IOMMU driver. As
an alternative to changing amd-iommu.h I could try to restructure the
code to put all iommu-related code in one place so the #ifdefs aren't
scattered all over the place.

Regards,
  Felix


>
> Oded
>
>>
>> On Fri, Jan 5, 2018 at 12:17 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
>>> dGPUs work without IOMMUv2. Make IOMMUv2 initialization dependent on
>>> ASIC information. Also allow building KFD without IOMMUv2 support.
>>> This is still useful for dGPUs and prepares for enabling KFD on
>>> architectures that don't support AMD IOMMUv2.
>>>
>>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>>> ---
>>>  drivers/gpu/drm/amd/amdkfd/Kconfig        |  2 +-
>>>  drivers/gpu/drm/amd/amdkfd/kfd_crat.c     |  8 +++-
>>>  drivers/gpu/drm/amd/amdkfd/kfd_device.c   | 62 +++++++++++++++++++++----------
>>>  drivers/gpu/drm/amd/amdkfd/kfd_events.c   |  2 +
>>>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h     |  5 +++
>>>  drivers/gpu/drm/amd/amdkfd/kfd_process.c  | 17 ++++++---
>>>  drivers/gpu/drm/amd/amdkfd/kfd_topology.c |  2 +
>>>  drivers/gpu/drm/amd/amdkfd/kfd_topology.h |  2 +
>>>  8 files changed, 74 insertions(+), 26 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/Kconfig b/drivers/gpu/drm/amd/amdkfd/Kconfig
>>> index bc5a294..5bbeb95 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/Kconfig
>>> +++ b/drivers/gpu/drm/amd/amdkfd/Kconfig
>>> @@ -4,6 +4,6 @@
>>>
>>>  config HSA_AMD
>>>         tristate "HSA kernel driver for AMD GPU devices"
>>> -       depends on DRM_AMDGPU && AMD_IOMMU_V2 && X86_64
>>> +       depends on DRM_AMDGPU && X86_64
>>>         help
>>>           Enable this if you want to use HSA features on AMD GPU devices.
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
>>> index 2bc2816..3478270 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
>>> @@ -22,7 +22,9 @@
>>>
>>>  #include <linux/pci.h>
>>>  #include <linux/acpi.h>
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>  #include <linux/amd-iommu.h>
>>> +#endif
>>>  #include "kfd_crat.h"
>>>  #include "kfd_priv.h"
>>>  #include "kfd_topology.h"
>>> @@ -1037,15 +1039,17 @@ static int kfd_create_vcrat_image_gpu(void *pcrat_image,
>>>         struct crat_subtype_generic *sub_type_hdr;
>>>         struct crat_subtype_computeunit *cu;
>>>         struct kfd_cu_info cu_info;
>>> -       struct amd_iommu_device_info iommu_info;
>>>         int avail_size = *size;
>>>         uint32_t total_num_of_cu;
>>>         int num_of_cache_entries = 0;
>>>         int cache_mem_filled = 0;
>>>         int ret = 0;
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>> +       struct amd_iommu_device_info iommu_info;
>>>         const u32 required_iommu_flags = AMD_IOMMU_DEVICE_FLAG_ATS_SUP |
>>>                                          AMD_IOMMU_DEVICE_FLAG_PRI_SUP |
>>>                                          AMD_IOMMU_DEVICE_FLAG_PASID_SUP;
>>> +#endif
>>>         struct kfd_local_mem_info local_mem_info;
>>>
>>>         if (!pcrat_image || avail_size < VCRAT_SIZE_FOR_GPU)
>>> @@ -1106,12 +1110,14 @@ static int kfd_create_vcrat_image_gpu(void *pcrat_image,
>>>         /* Check if this node supports IOMMU. During parsing this flag will
>>>          * translate to HSA_CAP_ATS_PRESENT
>>>          */
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>         iommu_info.flags = 0;
>>>         if (amd_iommu_device_info(kdev->pdev, &iommu_info) == 0) {
>>>                 if ((iommu_info.flags & required_iommu_flags) ==
>>>                                 required_iommu_flags)
>>>                         cu->hsa_capability |= CRAT_CU_FLAGS_IOMMU_PRESENT;
>>>         }
>>> +#endif
>>>
>>>         crat_table->length += sub_type_hdr->length;
>>>         crat_table->total_entries++;
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>> index fafe971..5205b34 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>> @@ -20,7 +20,9 @@
>>>   * OTHER DEALINGS IN THE SOFTWARE.
>>>   */
>>>
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>  #include <linux/amd-iommu.h>
>>> +#endif
>>>  #include <linux/bsearch.h>
>>>  #include <linux/pci.h>
>>>  #include <linux/slab.h>
>>> @@ -31,6 +33,7 @@
>>>
>>>  #define MQD_SIZE_ALIGNED 768
>>>
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>  static const struct kfd_device_info kaveri_device_info = {
>>>         .asic_family = CHIP_KAVERI,
>>>         .max_pasid_bits = 16,
>>> @@ -41,6 +44,7 @@ static const struct kfd_device_info kaveri_device_info = {
>>>         .num_of_watch_points = 4,
>>>         .mqd_size_aligned = MQD_SIZE_ALIGNED,
>>>         .supports_cwsr = false,
>>> +       .needs_iommu_device = true,
>>>         .needs_pci_atomics = false,
>>>  };
>>>
>>> @@ -54,8 +58,10 @@ static const struct kfd_device_info carrizo_device_info = {
>>>         .num_of_watch_points = 4,
>>>         .mqd_size_aligned = MQD_SIZE_ALIGNED,
>>>         .supports_cwsr = true,
>>> +       .needs_iommu_device = true,
>>>         .needs_pci_atomics = false,
>>>  };
>>> +#endif
>>>
>>>  struct kfd_deviceid {
>>>         unsigned short did;
>>> @@ -64,6 +70,7 @@ struct kfd_deviceid {
>>>
>>>  /* Please keep this sorted by increasing device id. */
>>>  static const struct kfd_deviceid supported_devices[] = {
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>         { 0x1304, &kaveri_device_info },        /* Kaveri */
>>>         { 0x1305, &kaveri_device_info },        /* Kaveri */
>>>         { 0x1306, &kaveri_device_info },        /* Kaveri */
>>> @@ -91,6 +98,7 @@ static const struct kfd_deviceid supported_devices[] = {
>>>         { 0x9875, &carrizo_device_info },       /* Carrizo */
>>>         { 0x9876, &carrizo_device_info },       /* Carrizo */
>>>         { 0x9877, &carrizo_device_info }        /* Carrizo */
>>> +#endif
>>>  };
>>>
>>>  static int kfd_gtt_sa_init(struct kfd_dev *kfd, unsigned int buf_size,
>>> @@ -161,6 +169,7 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
>>>         return kfd;
>>>  }
>>>
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>  static bool device_iommu_pasid_init(struct kfd_dev *kfd)
>>>  {
>>>         const u32 required_iommu_flags = AMD_IOMMU_DEVICE_FLAG_ATS_SUP |
>>> @@ -231,6 +240,7 @@ static int iommu_invalid_ppr_cb(struct pci_dev *pdev, int pasid,
>>>
>>>         return AMD_IOMMU_INV_PRI_RSP_INVALID;
>>>  }
>>> +#endif /* CONFIG_AMD_IOMMU_V2 */
>>>
>>>  static void kfd_cwsr_init(struct kfd_dev *kfd)
>>>  {
>>> @@ -321,12 +331,14 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>>>                 goto device_queue_manager_error;
>>>         }
>>>
>>> -       if (!device_iommu_pasid_init(kfd)) {
>>> -               dev_err(kfd_device,
>>> -                       "Error initializing iommuv2 for device %x:%x\n",
>>> -                       kfd->pdev->vendor, kfd->pdev->device);
>>> -               goto device_iommu_pasid_error;
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>> +       if (kfd->device_info->needs_iommu_device) {
>>> +               if (!device_iommu_pasid_init(kfd)) {
>>> +                       dev_err(kfd_device, "Error initializing iommuv2\n");
>>> +                       goto device_iommu_pasid_error;
>>> +               }
>>>         }
>>> +#endif
>>>
>>>         kfd_cwsr_init(kfd);
>>>
>>> @@ -386,11 +398,16 @@ void kgd2kfd_suspend(struct kfd_dev *kfd)
>>>
>>>         kfd->dqm->ops.stop(kfd->dqm);
>>>
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>> +       if (!kfd->device_info->needs_iommu_device)
>>> +               return;
>>> +
>>>         kfd_unbind_processes_from_device(kfd);
>>>
>>>         amd_iommu_set_invalidate_ctx_cb(kfd->pdev, NULL);
>>>         amd_iommu_set_invalid_ppr_cb(kfd->pdev, NULL);
>>>         amd_iommu_free_device(kfd->pdev);
>>> +#endif
>>>  }
>>>
>>>  int kgd2kfd_resume(struct kfd_dev *kfd)
>>> @@ -405,19 +422,24 @@ int kgd2kfd_resume(struct kfd_dev *kfd)
>>>  static int kfd_resume(struct kfd_dev *kfd)
>>>  {
>>>         int err = 0;
>>> -       unsigned int pasid_limit = kfd_get_pasid_limit();
>>>
>>> -       err = amd_iommu_init_device(kfd->pdev, pasid_limit);
>>> -       if (err)
>>> -               return -ENXIO;
>>> -       amd_iommu_set_invalidate_ctx_cb(kfd->pdev,
>>> -                                       iommu_pasid_shutdown_callback);
>>> -       amd_iommu_set_invalid_ppr_cb(kfd->pdev,
>>> -                                    iommu_invalid_ppr_cb);
>>> -
>>> -       err = kfd_bind_processes_to_device(kfd);
>>> -       if (err)
>>> -               goto processes_bind_error;
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>> +       if (kfd->device_info->needs_iommu_device) {
>>> +               unsigned int pasid_limit = kfd_get_pasid_limit();
>>> +
>>> +               err = amd_iommu_init_device(kfd->pdev, pasid_limit);
>>> +               if (err)
>>> +                       return -ENXIO;
>>> +               amd_iommu_set_invalidate_ctx_cb(kfd->pdev,
>>> +                                               iommu_pasid_shutdown_callback);
>>> +               amd_iommu_set_invalid_ppr_cb(kfd->pdev,
>>> +                                            iommu_invalid_ppr_cb);
>>> +
>>> +               err = kfd_bind_processes_to_device(kfd);
>>> +               if (err)
>>> +                       goto processes_bind_error;
>>> +       }
>>> +#endif
>>>
>>>         err = kfd->dqm->ops.start(kfd->dqm);
>>>         if (err) {
>>> @@ -431,8 +453,10 @@ static int kfd_resume(struct kfd_dev *kfd)
>>>
>>>  dqm_start_error:
>>>  processes_bind_error:
>>> -       amd_iommu_free_device(kfd->pdev);
>>> -
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>> +       if (kfd->device_info->needs_iommu_device)
>>> +               amd_iommu_free_device(kfd->pdev);
>>> +#endif
>>>         return err;
>>>  }
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
>>> index 93aae5c..f770dc7 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
>>> @@ -837,6 +837,7 @@ static void lookup_events_by_type_and_signal(struct kfd_process *p,
>>>         }
>>>  }
>>>
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>  void kfd_signal_iommu_event(struct kfd_dev *dev, unsigned int pasid,
>>>                 unsigned long address, bool is_write_requested,
>>>                 bool is_execute_requested)
>>> @@ -905,6 +906,7 @@ void kfd_signal_iommu_event(struct kfd_dev *dev, unsigned int pasid,
>>>         mutex_unlock(&p->event_mutex);
>>>         kfd_unref_process(p);
>>>  }
>>> +#endif /* CONFIG_AMD_IOMMU_V2_MODULE */
>>>
>>>  void kfd_signal_hw_exception_event(unsigned int pasid)
>>>  {
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>> index eebfb1e..9f4766c 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>> @@ -158,6 +158,7 @@ struct kfd_device_info {
>>>         uint8_t num_of_watch_points;
>>>         uint16_t mqd_size_aligned;
>>>         bool supports_cwsr;
>>> +       bool needs_iommu_device;
>>>         bool needs_pci_atomics;
>>>  };
>>>
>>> @@ -617,9 +618,11 @@ void kfd_unref_process(struct kfd_process *p);
>>>
>>>  struct kfd_process_device *kfd_bind_process_to_device(struct kfd_dev *dev,
>>>                                                 struct kfd_process *p);
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>  int kfd_bind_processes_to_device(struct kfd_dev *dev);
>>>  void kfd_unbind_processes_from_device(struct kfd_dev *dev);
>>>  void kfd_process_iommu_unbind_callback(struct kfd_dev *dev, unsigned int pasid);
>>> +#endif
>>>  struct kfd_process_device *kfd_get_process_device_data(struct kfd_dev *dev,
>>>                                                         struct kfd_process *p);
>>>  struct kfd_process_device *kfd_create_process_device_data(struct kfd_dev *dev,
>>> @@ -784,9 +787,11 @@ int kfd_wait_on_events(struct kfd_process *p,
>>>                        uint32_t *wait_result);
>>>  void kfd_signal_event_interrupt(unsigned int pasid, uint32_t partial_id,
>>>                                 uint32_t valid_id_bits);
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>  void kfd_signal_iommu_event(struct kfd_dev *dev,
>>>                 unsigned int pasid, unsigned long address,
>>>                 bool is_write_requested, bool is_execute_requested);
>>> +#endif
>>>  void kfd_signal_hw_exception_event(unsigned int pasid);
>>>  int kfd_set_event(struct kfd_process *p, uint32_t event_id);
>>>  int kfd_reset_event(struct kfd_process *p, uint32_t event_id);
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> index a22fb071..1d0e02c 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> @@ -173,14 +173,17 @@ static void kfd_process_wq_release(struct work_struct *work)
>>>  {
>>>         struct kfd_process *p = container_of(work, struct kfd_process,
>>>                                              release_work);
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>         struct kfd_process_device *pdd;
>>>
>>>         pr_debug("Releasing process (pasid %d) in workqueue\n", p->pasid);
>>>
>>>         list_for_each_entry(pdd, &p->per_device_data, per_device_list) {
>>> -               if (pdd->bound == PDD_BOUND)
>>> +               if (pdd->bound == PDD_BOUND &&
>>> +                   pdd->dev->device_info->needs_iommu_device)
>>>                         amd_iommu_unbind_pasid(pdd->dev->pdev, p->pasid);
>>>         }
>>> +#endif
>>>
>>>         kfd_process_destroy_pdds(p);
>>>
>>> @@ -421,7 +424,6 @@ struct kfd_process_device *kfd_bind_process_to_device(struct kfd_dev *dev,
>>>                                                         struct kfd_process *p)
>>>  {
>>>         struct kfd_process_device *pdd;
>>> -       int err;
>>>
>>>         pdd = kfd_get_process_device_data(dev, p);
>>>         if (!pdd) {
>>> @@ -436,9 +438,14 @@ struct kfd_process_device *kfd_bind_process_to_device(struct kfd_dev *dev,
>>>                 return ERR_PTR(-EINVAL);
>>>         }
>>>
>>> -       err = amd_iommu_bind_pasid(dev->pdev, p->pasid, p->lead_thread);
>>> -       if (err < 0)
>>> -               return ERR_PTR(err);
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>> +       if (dev->device_info->needs_iommu_device) {
>>> +               int err = amd_iommu_bind_pasid(dev->pdev, p->pasid,
>>> +                                              p->lead_thread);
>>> +               if (err < 0)
>>> +                       return ERR_PTR(err);
>>> +       }
>>> +#endif
>>>
>>>         pdd->bound = PDD_BOUND;
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>>> index c6a7609..f57c305 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>>> @@ -875,6 +875,7 @@ static void find_system_memory(const struct dmi_header *dm,
>>>   */
>>>  static int kfd_add_perf_to_topology(struct kfd_topology_device *kdev)
>>>  {
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>         struct kfd_perf_properties *props;
>>>
>>>         if (amd_iommu_pc_supported()) {
>>> @@ -886,6 +887,7 @@ static int kfd_add_perf_to_topology(struct kfd_topology_device *kdev)
>>>                         amd_iommu_pc_get_max_counters(0); /* assume one iommu */
>>>                 list_add_tail(&props->list, &kdev->perf_props);
>>>         }
>>> +#endif
>>>
>>>         return 0;
>>>  }
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
>>> index 53fca1f..111fda2 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
>>> @@ -183,8 +183,10 @@ struct kfd_topology_device *kfd_create_topology_device(
>>>                 struct list_head *device_list);
>>>  void kfd_release_topology_device_list(struct list_head *device_list);
>>>
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>  extern bool amd_iommu_pc_supported(void);
>>>  extern u8 amd_iommu_pc_get_max_banks(u16 devid);
>>>  extern u8 amd_iommu_pc_get_max_counters(u16 devid);
>>> +#endif
>>>
>>>  #endif /* __KFD_TOPOLOGY_H__ */
>>> --
>>> 2.7.4
>>>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 4/9] drm/amdkfd: Make sched_policy a per-device setting
       [not found]         ` <CAFCwf11xXiKH-3sqpjk-cpQ5DyM_dL-6Vk=DrBCPJ=oSyyYyAg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2018-01-31 16:18           ` Felix Kuehling
  0 siblings, 0 replies; 47+ messages in thread
From: Felix Kuehling @ 2018-01-31 16:18 UTC (permalink / raw)
  To: Oded Gabbay; +Cc: amd-gfx list

On 2018-01-31 10:06 AM, Oded Gabbay wrote:
> On Fri, Jan 5, 2018 at 12:17 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
>> Some dGPUs don't support HWS. Allow them to use a per-device
>> sched_policy that may be different from the global default.
>>
>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>> ---
>>  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c           |  3 ++-
>>  drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c            |  3 ++-
>>  drivers/gpu/drm/amd/amdkfd/kfd_device.c            |  2 +-
>>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 22 +++++++++++++++++++---
>>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.h  |  1 +
>>  .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |  3 ++-
>>  6 files changed, 27 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>> index 62c3d9c..6fe2496 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>> @@ -901,7 +901,8 @@ static int kfd_ioctl_set_scratch_backing_va(struct file *filep,
>>
>>         mutex_unlock(&p->mutex);
>>
>> -       if (sched_policy == KFD_SCHED_POLICY_NO_HWS && pdd->qpd.vmid != 0)
>> +       if (dev->dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS &&
>> +           pdd->qpd.vmid != 0)
>>                 dev->kfd2kgd->set_scratch_backing_va(
>>                         dev->kgd, args->va_addr, pdd->qpd.vmid);
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
>> index 3da25f7..9d4af96 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
>> @@ -33,6 +33,7 @@
>>  #include "kfd_pm4_headers_diq.h"
>>  #include "kfd_dbgmgr.h"
>>  #include "kfd_dbgdev.h"
>> +#include "kfd_device_queue_manager.h"
>>
>>  static DEFINE_MUTEX(kfd_dbgmgr_mutex);
>>
>> @@ -83,7 +84,7 @@ bool kfd_dbgmgr_create(struct kfd_dbgmgr **ppmgr, struct kfd_dev *pdev)
>>         }
>>
>>         /* get actual type of DBGDevice cpsch or not */
>> -       if (sched_policy == KFD_SCHED_POLICY_NO_HWS)
>> +       if (pdev->dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS)
>>                 type = DBGDEV_TYPE_NODIQ;
>>
>>         kfd_dbgdev_init(new_buff->dbgdev, pdev, type);
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> index 5205b34..6dd50cc 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> @@ -352,7 +352,7 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>>                  kfd->pdev->device);
>>
>>         pr_debug("Starting kfd with the following scheduling policy %d\n",
>> -               sched_policy);
>> +               kfd->dqm->sched_policy);
>>
>>         goto out;
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>> index d0693fd..3e2f53b 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>> @@ -385,7 +385,7 @@ static int update_queue(struct device_queue_manager *dqm, struct queue *q)
>>         prev_active = q->properties.is_active;
>>
>>         /* Make sure the queue is unmapped before updating the MQD */
>> -       if (sched_policy != KFD_SCHED_POLICY_NO_HWS) {
>> +       if (dqm->sched_policy != KFD_SCHED_POLICY_NO_HWS) {
>>                 retval = unmap_queues_cpsch(dqm,
>>                                 KFD_UNMAP_QUEUES_FILTER_DYNAMIC_QUEUES, 0);
>>                 if (retval) {
>> @@ -417,7 +417,7 @@ static int update_queue(struct device_queue_manager *dqm, struct queue *q)
>>         else if (!q->properties.is_active && prev_active)
>>                 dqm->queue_count--;
>>
>> -       if (sched_policy != KFD_SCHED_POLICY_NO_HWS)
>> +       if (dqm->sched_policy != KFD_SCHED_POLICY_NO_HWS)
>>                 retval = map_queues_cpsch(dqm);
>>         else if (q->properties.is_active &&
>>                  (q->properties.type == KFD_QUEUE_TYPE_COMPUTE ||
>> @@ -1097,7 +1097,7 @@ static bool set_cache_memory_policy(struct device_queue_manager *dqm,
>>                         alternate_aperture_base,
>>                         alternate_aperture_size);
>>
>> -       if ((sched_policy == KFD_SCHED_POLICY_NO_HWS) && (qpd->vmid != 0))
>> +       if ((dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS) && (qpd->vmid != 0))
>>                 program_sh_mem_settings(dqm, qpd);
>>
>>         pr_debug("sh_mem_config: 0x%x, ape1_base: 0x%x, ape1_limit: 0x%x\n",
>> @@ -1242,6 +1242,22 @@ struct device_queue_manager *device_queue_manager_init(struct kfd_dev *dev)
>>         if (!dqm)
>>                 return NULL;
>>
>> +       switch (dev->device_info->asic_family) {
>> +       /* HWS is not available on Hawaii. */
>> +       case CHIP_HAWAII:
>> +       /* HWS depends on CWSR for timely dequeue. CWSR is not
>> +        * available on Tonga.
>> +        *
>> +        * FIXME: This argument also applies to Kaveri.
> So why not add here "case CHIP_KAVERI:" ?

Right.

>
>> +        */
>> +       case CHIP_TONGA:
>> +               dqm->sched_policy = KFD_SCHED_POLICY_NO_HWS;
>> +               break;
>> +       default:
>> +               dqm->sched_policy = sched_policy;
>> +               break;
>> +       }
>> +
>>         dqm->dev = dev;
>>         switch (sched_policy) {
> This should be changed to:
> switch (dqm->sched_policy) {

The fix is in my latest patch series and could be squashed with this
([PATCH 12/25] drm/amdkfd: Use per-device sched_policy).

Regards,
  Felix

>
>
>>         case KFD_SCHED_POLICY_HWS:
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
>> index c61b693..9fdc9c2 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
>> @@ -180,6 +180,7 @@ struct device_queue_manager {
>>         unsigned int            *fence_addr;
>>         struct kfd_mem_obj      *fence_mem;
>>         bool                    active_runlist;
>> +       int                     sched_policy;
>>  };
>>
>>  void device_queue_manager_init_cik(
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
>> index 8763806..7817e32 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
>> @@ -208,7 +208,8 @@ int pqm_create_queue(struct process_queue_manager *pqm,
>>
>>         case KFD_QUEUE_TYPE_COMPUTE:
>>                 /* check if there is over subscription */
>> -               if ((sched_policy == KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION) &&
>> +               if ((dev->dqm->sched_policy ==
>> +                    KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION) &&
>>                 ((dev->dqm->processes_count >= dev->vm_info.vmid_num_kfd) ||
>>                 (dev->dqm->queue_count >= get_queues_num(dev->dqm)))) {
>>                         pr_err("Over-subscription is not allowed in radeon_kfd.sched_policy == 1\n");
>> --
>> 2.7.4
>>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 7/9] drm/amdkfd: Add dGPU support to kernel_queue_init
       [not found]                 ` <CAFCwf127vkM7aEcyUK9VjrVekZAFin7d7sk6Ko=JV5gibBeukg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2018-01-31 16:27                   ` Felix Kuehling
       [not found]                     ` <31443990-b612-e9cc-ec07-054b940c8c25-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 47+ messages in thread
From: Felix Kuehling @ 2018-01-31 16:27 UTC (permalink / raw)
  To: Oded Gabbay, Deucher, Alexander; +Cc: amd-gfx list

On 2018-01-31 10:29 AM, Oded Gabbay wrote:
> On Wed, Jan 31, 2018 at 5:23 PM, Deucher, Alexander
> <Alexander.Deucher@amd.com> wrote:
>> ________________________________
>> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> on behalf of Oded
>> Gabbay <oded.gabbay@gmail.com>
>> Sent: Wednesday, January 31, 2018 10:17 AM
>> To: Kuehling, Felix
>> Cc: amd-gfx list
>> Subject: Re: [PATCH 7/9] drm/amdkfd: Add dGPU support to kernel_queue_init
>>
>> On Fri, Jan 5, 2018 at 12:17 AM, Felix Kuehling <Felix.Kuehling@amd.com>
>> wrote:
>>> Recognize dGPU ASIC families.
>>>
>>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>>> ---
>>>  drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c | 5 +++++
>>>  1 file changed, 5 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
>>> b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
>>> index 5dc6567..69f4964 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
>>> @@ -297,10 +297,15 @@ struct kernel_queue *kernel_queue_init(struct
>>> kfd_dev *dev,
>>>
>>>         switch (dev->device_info->asic_family) {
>>>         case CHIP_CARRIZO:
>>> +       case CHIP_TONGA:
>>> +       case CHIP_FIJI:
>>> +       case CHIP_POLARIS10:
>>> +       case CHIP_POLARIS11:
>> I believe POLARIS is from arcatic islands, no ?
>> Maybe rename kernel_queue_init_vi to kernel_queue_init_vi_ai ?
>> or create a new function kernel_queue_init_ai() and assign same
>> functions as vi ?
>> Either way, I think you need to address that.
>>
>> They are all gfx8.  adding ai just confuses things.

Internally we use VI and GFX8 interchangably. I think what's confusing
is, that internal code names are used for marketing purposes and applied
to the wrong chip generation.

Another precedent for that is Hawaii. It was called the first "volcanic
island" GPU when it was launched at an event on Hawaii (a volcanic
island), but as far as the driver is concerned, it belongs to the CIK
generation.

It's really hard to keep consistent naming, when naming conventions get
misappropriated to mean different things over time.

>>
>> Alex
> In that case, I think it is better maybe to change it to
> kernel_queue_init_gfx_7 and kernel_queue_init_gfx_8, to be consistent
> with the calls to amdgpu_amdkfd_gfx_7_0_get_functions and
> amdgpu_amdkfd_gfx_8_0_get_functions.
>
> Leaving as cik and vi as the identifier when it clearly isn't seems
> confusing to me as well.

For Vega10 we use the suffix _v9 instead of _cik or _vi. For consistency
and brevity I could rename _cik->v7 and _vi->v8. However, that would be
a lot of churn and, in my eyes, a waste of time.

Regards,
  Felix

>
> Oded
>
>>
>>>                 kernel_queue_init_vi(&kq->ops_asic_specific);
>>>                 break;
>>>
>>>         case CHIP_KAVERI:
>>> +       case CHIP_HAWAII:
>>>                 kernel_queue_init_cik(&kq->ops_asic_specific);
>>>                 break;
>>>         default:
>>> --
>>> 2.7.4
>>>
>> Other then that, This patch is:
>> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 8/9] drm/amdkfd: Add dGPU device IDs and device info
@ 2018-01-31 16:29       ` Felix Kuehling
  0 siblings, 0 replies; 47+ messages in thread
From: Felix Kuehling @ 2018-01-31 16:29 UTC (permalink / raw)
  To: Oded Gabbay; +Cc: amd-gfx list, linux-pci



On 2018-01-31 10:20 AM, Oded Gabbay wrote:
> On Fri, Jan 5, 2018 at 12:17 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
>> CC: linux-pci@vger.kernel.org
>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>> ---
>>  drivers/gpu/drm/amd/amdkfd/kfd_device.c | 153 +++++++++++++++++++++++++++++++-
>>  1 file changed, 151 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> index 6dd50cc..612afaf 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> @@ -63,12 +63,118 @@ static const struct kfd_device_info carrizo_device_info = {
>>  };
>>  #endif
>>
>> +static const struct kfd_device_info hawaii_device_info = {
>> +       .asic_family = CHIP_HAWAII,
>> +       .max_pasid_bits = 16,
>> +       /* max num of queues for KV.TODO should be a dynamic value */
>> +       .max_no_of_hqd  = 24,
>> +       .ih_ring_entry_size = 4 * sizeof(uint32_t),
>> +       .event_interrupt_class = &event_interrupt_class_cik,
>> +       .num_of_watch_points = 4,
>> +       .mqd_size_aligned = MQD_SIZE_ALIGNED,
>> +       .supports_cwsr = false,
>> +       .needs_iommu_device = false,
>> +       .needs_pci_atomics = false,
>> +};
>> +
>> +static const struct kfd_device_info tonga_device_info = {
>> +       .asic_family = CHIP_TONGA,
>> +       .max_pasid_bits = 16,
>> +       .max_no_of_hqd  = 24,
>> +       .ih_ring_entry_size = 4 * sizeof(uint32_t),
>> +       .event_interrupt_class = &event_interrupt_class_cik,
> Is there any point in keeping the name event_interrupt_class_cik?
> maybe just rename to event_interrupt_class ?
> What will happen in vega ? If its the same I think removing the _cik
> makes the code more consistent.

Vega10 has its own class because the interrupt ring packet format
changed significantly.

Regards,
  Felix

>
> Oded
>
>
>> +       .num_of_watch_points = 4,
>> +       .mqd_size_aligned = MQD_SIZE_ALIGNED,
>> +       .supports_cwsr = false,
>> +       .needs_iommu_device = false,
>> +       .needs_pci_atomics = true,
>> +};
>> +
>> +static const struct kfd_device_info tonga_vf_device_info = {
>> +       .asic_family = CHIP_TONGA,
>> +       .max_pasid_bits = 16,
>> +       .max_no_of_hqd  = 24,
>> +       .ih_ring_entry_size = 4 * sizeof(uint32_t),
>> +       .event_interrupt_class = &event_interrupt_class_cik,
>> +       .num_of_watch_points = 4,
>> +       .mqd_size_aligned = MQD_SIZE_ALIGNED,
>> +       .supports_cwsr = false,
>> +       .needs_iommu_device = false,
>> +       .needs_pci_atomics = false,
>> +};
>> +
>> +static const struct kfd_device_info fiji_device_info = {
>> +       .asic_family = CHIP_FIJI,
>> +       .max_pasid_bits = 16,
>> +       .max_no_of_hqd  = 24,
>> +       .ih_ring_entry_size = 4 * sizeof(uint32_t),
>> +       .event_interrupt_class = &event_interrupt_class_cik,
>> +       .num_of_watch_points = 4,
>> +       .mqd_size_aligned = MQD_SIZE_ALIGNED,
>> +       .supports_cwsr = true,
>> +       .needs_iommu_device = false,
>> +       .needs_pci_atomics = true,
>> +};
>> +
>> +static const struct kfd_device_info fiji_vf_device_info = {
>> +       .asic_family = CHIP_FIJI,
>> +       .max_pasid_bits = 16,
>> +       .max_no_of_hqd  = 24,
>> +       .ih_ring_entry_size = 4 * sizeof(uint32_t),
>> +       .event_interrupt_class = &event_interrupt_class_cik,
>> +       .num_of_watch_points = 4,
>> +       .mqd_size_aligned = MQD_SIZE_ALIGNED,
>> +       .supports_cwsr = true,
>> +       .needs_iommu_device = false,
>> +       .needs_pci_atomics = false,
>> +};
>> +
>> +
>> +static const struct kfd_device_info polaris10_device_info = {
>> +       .asic_family = CHIP_POLARIS10,
>> +       .max_pasid_bits = 16,
>> +       .max_no_of_hqd  = 24,
>> +       .ih_ring_entry_size = 4 * sizeof(uint32_t),
>> +       .event_interrupt_class = &event_interrupt_class_cik,
>> +       .num_of_watch_points = 4,
>> +       .mqd_size_aligned = MQD_SIZE_ALIGNED,
>> +       .supports_cwsr = true,
>> +       .needs_iommu_device = false,
>> +       .needs_pci_atomics = true,
>> +};
>> +
>> +static const struct kfd_device_info polaris10_vf_device_info = {
>> +       .asic_family = CHIP_POLARIS10,
>> +       .max_pasid_bits = 16,
>> +       .max_no_of_hqd  = 24,
>> +       .ih_ring_entry_size = 4 * sizeof(uint32_t),
>> +       .event_interrupt_class = &event_interrupt_class_cik,
>> +       .num_of_watch_points = 4,
>> +       .mqd_size_aligned = MQD_SIZE_ALIGNED,
>> +       .supports_cwsr = true,
>> +       .needs_iommu_device = false,
>> +       .needs_pci_atomics = false,
>> +};
>> +
>> +static const struct kfd_device_info polaris11_device_info = {
>> +       .asic_family = CHIP_POLARIS11,
>> +       .max_pasid_bits = 16,
>> +       .max_no_of_hqd  = 24,
>> +       .ih_ring_entry_size = 4 * sizeof(uint32_t),
>> +       .event_interrupt_class = &event_interrupt_class_cik,
>> +       .num_of_watch_points = 4,
>> +       .mqd_size_aligned = MQD_SIZE_ALIGNED,
>> +       .supports_cwsr = true,
>> +       .needs_iommu_device = false,
>> +       .needs_pci_atomics = true,
>> +};
>> +
>> +
>>  struct kfd_deviceid {
>>         unsigned short did;
>>         const struct kfd_device_info *device_info;
>>  };
>>
>> -/* Please keep this sorted by increasing device id. */
>>  static const struct kfd_deviceid supported_devices[] = {
>>  #if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>         { 0x1304, &kaveri_device_info },        /* Kaveri */
>> @@ -97,8 +203,51 @@ static const struct kfd_deviceid supported_devices[] = {
>>         { 0x9874, &carrizo_device_info },       /* Carrizo */
>>         { 0x9875, &carrizo_device_info },       /* Carrizo */
>>         { 0x9876, &carrizo_device_info },       /* Carrizo */
>> -       { 0x9877, &carrizo_device_info }        /* Carrizo */
>> +       { 0x9877, &carrizo_device_info },       /* Carrizo */
>>  #endif
>> +       { 0x67A0, &hawaii_device_info },        /* Hawaii */
>> +       { 0x67A1, &hawaii_device_info },        /* Hawaii */
>> +       { 0x67A2, &hawaii_device_info },        /* Hawaii */
>> +       { 0x67A8, &hawaii_device_info },        /* Hawaii */
>> +       { 0x67A9, &hawaii_device_info },        /* Hawaii */
>> +       { 0x67AA, &hawaii_device_info },        /* Hawaii */
>> +       { 0x67B0, &hawaii_device_info },        /* Hawaii */
>> +       { 0x67B1, &hawaii_device_info },        /* Hawaii */
>> +       { 0x67B8, &hawaii_device_info },        /* Hawaii */
>> +       { 0x67B9, &hawaii_device_info },        /* Hawaii */
>> +       { 0x67BA, &hawaii_device_info },        /* Hawaii */
>> +       { 0x67BE, &hawaii_device_info },        /* Hawaii */
>> +       { 0x6920, &tonga_device_info },         /* Tonga */
>> +       { 0x6921, &tonga_device_info },         /* Tonga */
>> +       { 0x6928, &tonga_device_info },         /* Tonga */
>> +       { 0x6929, &tonga_device_info },         /* Tonga */
>> +       { 0x692B, &tonga_device_info },         /* Tonga */
>> +       { 0x692F, &tonga_vf_device_info },      /* Tonga vf */
>> +       { 0x6938, &tonga_device_info },         /* Tonga */
>> +       { 0x6939, &tonga_device_info },         /* Tonga */
>> +       { 0x7300, &fiji_device_info },          /* Fiji */
>> +       { 0x730F, &fiji_vf_device_info },       /* Fiji vf*/
>> +       { 0x67C0, &polaris10_device_info },     /* Polaris10 */
>> +       { 0x67C1, &polaris10_device_info },     /* Polaris10 */
>> +       { 0x67C2, &polaris10_device_info },     /* Polaris10 */
>> +       { 0x67C4, &polaris10_device_info },     /* Polaris10 */
>> +       { 0x67C7, &polaris10_device_info },     /* Polaris10 */
>> +       { 0x67C8, &polaris10_device_info },     /* Polaris10 */
>> +       { 0x67C9, &polaris10_device_info },     /* Polaris10 */
>> +       { 0x67CA, &polaris10_device_info },     /* Polaris10 */
>> +       { 0x67CC, &polaris10_device_info },     /* Polaris10 */
>> +       { 0x67CF, &polaris10_device_info },     /* Polaris10 */
>> +       { 0x67D0, &polaris10_vf_device_info },  /* Polaris10 vf*/
>> +       { 0x67DF, &polaris10_device_info },     /* Polaris10 */
>> +       { 0x67E0, &polaris11_device_info },     /* Polaris11 */
>> +       { 0x67E1, &polaris11_device_info },     /* Polaris11 */
>> +       { 0x67E3, &polaris11_device_info },     /* Polaris11 */
>> +       { 0x67E7, &polaris11_device_info },     /* Polaris11 */
>> +       { 0x67E8, &polaris11_device_info },     /* Polaris11 */
>> +       { 0x67E9, &polaris11_device_info },     /* Polaris11 */
>> +       { 0x67EB, &polaris11_device_info },     /* Polaris11 */
>> +       { 0x67EF, &polaris11_device_info },     /* Polaris11 */
>> +       { 0x67FF, &polaris11_device_info },     /* Polaris11 */
>>  };
>>
>>  static int kfd_gtt_sa_init(struct kfd_dev *kfd, unsigned int buf_size,
>> --
>> 2.7.4
>>
> Other then the note above, This patch is:
> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 8/9] drm/amdkfd: Add dGPU device IDs and device info
@ 2018-01-31 16:29       ` Felix Kuehling
  0 siblings, 0 replies; 47+ messages in thread
From: Felix Kuehling @ 2018-01-31 16:29 UTC (permalink / raw)
  To: Oded Gabbay; +Cc: linux-pci-u79uwXL29TY76Z2rM5mHXA, amd-gfx list



On 2018-01-31 10:20 AM, Oded Gabbay wrote:
> On Fri, Jan 5, 2018 at 12:17 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
>> CC: linux-pci@vger.kernel.org
>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>> ---
>>  drivers/gpu/drm/amd/amdkfd/kfd_device.c | 153 +++++++++++++++++++++++++++++++-
>>  1 file changed, 151 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> index 6dd50cc..612afaf 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> @@ -63,12 +63,118 @@ static const struct kfd_device_info carrizo_device_info = {
>>  };
>>  #endif
>>
>> +static const struct kfd_device_info hawaii_device_info = {
>> +       .asic_family = CHIP_HAWAII,
>> +       .max_pasid_bits = 16,
>> +       /* max num of queues for KV.TODO should be a dynamic value */
>> +       .max_no_of_hqd  = 24,
>> +       .ih_ring_entry_size = 4 * sizeof(uint32_t),
>> +       .event_interrupt_class = &event_interrupt_class_cik,
>> +       .num_of_watch_points = 4,
>> +       .mqd_size_aligned = MQD_SIZE_ALIGNED,
>> +       .supports_cwsr = false,
>> +       .needs_iommu_device = false,
>> +       .needs_pci_atomics = false,
>> +};
>> +
>> +static const struct kfd_device_info tonga_device_info = {
>> +       .asic_family = CHIP_TONGA,
>> +       .max_pasid_bits = 16,
>> +       .max_no_of_hqd  = 24,
>> +       .ih_ring_entry_size = 4 * sizeof(uint32_t),
>> +       .event_interrupt_class = &event_interrupt_class_cik,
> Is there any point in keeping the name event_interrupt_class_cik?
> maybe just rename to event_interrupt_class ?
> What will happen in vega ? If its the same I think removing the _cik
> makes the code more consistent.

Vega10 has its own class because the interrupt ring packet format
changed significantly.

Regards,
  Felix

>
> Oded
>
>
>> +       .num_of_watch_points = 4,
>> +       .mqd_size_aligned = MQD_SIZE_ALIGNED,
>> +       .supports_cwsr = false,
>> +       .needs_iommu_device = false,
>> +       .needs_pci_atomics = true,
>> +};
>> +
>> +static const struct kfd_device_info tonga_vf_device_info = {
>> +       .asic_family = CHIP_TONGA,
>> +       .max_pasid_bits = 16,
>> +       .max_no_of_hqd  = 24,
>> +       .ih_ring_entry_size = 4 * sizeof(uint32_t),
>> +       .event_interrupt_class = &event_interrupt_class_cik,
>> +       .num_of_watch_points = 4,
>> +       .mqd_size_aligned = MQD_SIZE_ALIGNED,
>> +       .supports_cwsr = false,
>> +       .needs_iommu_device = false,
>> +       .needs_pci_atomics = false,
>> +};
>> +
>> +static const struct kfd_device_info fiji_device_info = {
>> +       .asic_family = CHIP_FIJI,
>> +       .max_pasid_bits = 16,
>> +       .max_no_of_hqd  = 24,
>> +       .ih_ring_entry_size = 4 * sizeof(uint32_t),
>> +       .event_interrupt_class = &event_interrupt_class_cik,
>> +       .num_of_watch_points = 4,
>> +       .mqd_size_aligned = MQD_SIZE_ALIGNED,
>> +       .supports_cwsr = true,
>> +       .needs_iommu_device = false,
>> +       .needs_pci_atomics = true,
>> +};
>> +
>> +static const struct kfd_device_info fiji_vf_device_info = {
>> +       .asic_family = CHIP_FIJI,
>> +       .max_pasid_bits = 16,
>> +       .max_no_of_hqd  = 24,
>> +       .ih_ring_entry_size = 4 * sizeof(uint32_t),
>> +       .event_interrupt_class = &event_interrupt_class_cik,
>> +       .num_of_watch_points = 4,
>> +       .mqd_size_aligned = MQD_SIZE_ALIGNED,
>> +       .supports_cwsr = true,
>> +       .needs_iommu_device = false,
>> +       .needs_pci_atomics = false,
>> +};
>> +
>> +
>> +static const struct kfd_device_info polaris10_device_info = {
>> +       .asic_family = CHIP_POLARIS10,
>> +       .max_pasid_bits = 16,
>> +       .max_no_of_hqd  = 24,
>> +       .ih_ring_entry_size = 4 * sizeof(uint32_t),
>> +       .event_interrupt_class = &event_interrupt_class_cik,
>> +       .num_of_watch_points = 4,
>> +       .mqd_size_aligned = MQD_SIZE_ALIGNED,
>> +       .supports_cwsr = true,
>> +       .needs_iommu_device = false,
>> +       .needs_pci_atomics = true,
>> +};
>> +
>> +static const struct kfd_device_info polaris10_vf_device_info = {
>> +       .asic_family = CHIP_POLARIS10,
>> +       .max_pasid_bits = 16,
>> +       .max_no_of_hqd  = 24,
>> +       .ih_ring_entry_size = 4 * sizeof(uint32_t),
>> +       .event_interrupt_class = &event_interrupt_class_cik,
>> +       .num_of_watch_points = 4,
>> +       .mqd_size_aligned = MQD_SIZE_ALIGNED,
>> +       .supports_cwsr = true,
>> +       .needs_iommu_device = false,
>> +       .needs_pci_atomics = false,
>> +};
>> +
>> +static const struct kfd_device_info polaris11_device_info = {
>> +       .asic_family = CHIP_POLARIS11,
>> +       .max_pasid_bits = 16,
>> +       .max_no_of_hqd  = 24,
>> +       .ih_ring_entry_size = 4 * sizeof(uint32_t),
>> +       .event_interrupt_class = &event_interrupt_class_cik,
>> +       .num_of_watch_points = 4,
>> +       .mqd_size_aligned = MQD_SIZE_ALIGNED,
>> +       .supports_cwsr = true,
>> +       .needs_iommu_device = false,
>> +       .needs_pci_atomics = true,
>> +};
>> +
>> +
>>  struct kfd_deviceid {
>>         unsigned short did;
>>         const struct kfd_device_info *device_info;
>>  };
>>
>> -/* Please keep this sorted by increasing device id. */
>>  static const struct kfd_deviceid supported_devices[] = {
>>  #if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>         { 0x1304, &kaveri_device_info },        /* Kaveri */
>> @@ -97,8 +203,51 @@ static const struct kfd_deviceid supported_devices[] = {
>>         { 0x9874, &carrizo_device_info },       /* Carrizo */
>>         { 0x9875, &carrizo_device_info },       /* Carrizo */
>>         { 0x9876, &carrizo_device_info },       /* Carrizo */
>> -       { 0x9877, &carrizo_device_info }        /* Carrizo */
>> +       { 0x9877, &carrizo_device_info },       /* Carrizo */
>>  #endif
>> +       { 0x67A0, &hawaii_device_info },        /* Hawaii */
>> +       { 0x67A1, &hawaii_device_info },        /* Hawaii */
>> +       { 0x67A2, &hawaii_device_info },        /* Hawaii */
>> +       { 0x67A8, &hawaii_device_info },        /* Hawaii */
>> +       { 0x67A9, &hawaii_device_info },        /* Hawaii */
>> +       { 0x67AA, &hawaii_device_info },        /* Hawaii */
>> +       { 0x67B0, &hawaii_device_info },        /* Hawaii */
>> +       { 0x67B1, &hawaii_device_info },        /* Hawaii */
>> +       { 0x67B8, &hawaii_device_info },        /* Hawaii */
>> +       { 0x67B9, &hawaii_device_info },        /* Hawaii */
>> +       { 0x67BA, &hawaii_device_info },        /* Hawaii */
>> +       { 0x67BE, &hawaii_device_info },        /* Hawaii */
>> +       { 0x6920, &tonga_device_info },         /* Tonga */
>> +       { 0x6921, &tonga_device_info },         /* Tonga */
>> +       { 0x6928, &tonga_device_info },         /* Tonga */
>> +       { 0x6929, &tonga_device_info },         /* Tonga */
>> +       { 0x692B, &tonga_device_info },         /* Tonga */
>> +       { 0x692F, &tonga_vf_device_info },      /* Tonga vf */
>> +       { 0x6938, &tonga_device_info },         /* Tonga */
>> +       { 0x6939, &tonga_device_info },         /* Tonga */
>> +       { 0x7300, &fiji_device_info },          /* Fiji */
>> +       { 0x730F, &fiji_vf_device_info },       /* Fiji vf*/
>> +       { 0x67C0, &polaris10_device_info },     /* Polaris10 */
>> +       { 0x67C1, &polaris10_device_info },     /* Polaris10 */
>> +       { 0x67C2, &polaris10_device_info },     /* Polaris10 */
>> +       { 0x67C4, &polaris10_device_info },     /* Polaris10 */
>> +       { 0x67C7, &polaris10_device_info },     /* Polaris10 */
>> +       { 0x67C8, &polaris10_device_info },     /* Polaris10 */
>> +       { 0x67C9, &polaris10_device_info },     /* Polaris10 */
>> +       { 0x67CA, &polaris10_device_info },     /* Polaris10 */
>> +       { 0x67CC, &polaris10_device_info },     /* Polaris10 */
>> +       { 0x67CF, &polaris10_device_info },     /* Polaris10 */
>> +       { 0x67D0, &polaris10_vf_device_info },  /* Polaris10 vf*/
>> +       { 0x67DF, &polaris10_device_info },     /* Polaris10 */
>> +       { 0x67E0, &polaris11_device_info },     /* Polaris11 */
>> +       { 0x67E1, &polaris11_device_info },     /* Polaris11 */
>> +       { 0x67E3, &polaris11_device_info },     /* Polaris11 */
>> +       { 0x67E7, &polaris11_device_info },     /* Polaris11 */
>> +       { 0x67E8, &polaris11_device_info },     /* Polaris11 */
>> +       { 0x67E9, &polaris11_device_info },     /* Polaris11 */
>> +       { 0x67EB, &polaris11_device_info },     /* Polaris11 */
>> +       { 0x67EF, &polaris11_device_info },     /* Polaris11 */
>> +       { 0x67FF, &polaris11_device_info },     /* Polaris11 */
>>  };
>>
>>  static int kfd_gtt_sa_init(struct kfd_dev *kfd, unsigned int buf_size,
>> --
>> 2.7.4
>>
> Other then the note above, This patch is:
> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 9/9] drm/amdgpu: Enable KFD initialization on dGPUs
       [not found]         ` <CAFCwf11WWuHydSRBu3Pk8-jFLgoxJ7k0GDfuO-HWRjpvSRm5xQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2018-01-31 15:28           ` Christian König
@ 2018-01-31 16:33           ` Felix Kuehling
  1 sibling, 0 replies; 47+ messages in thread
From: Felix Kuehling @ 2018-01-31 16:33 UTC (permalink / raw)
  To: Oded Gabbay; +Cc: amd-gfx list

On 2018-01-31 10:25 AM, Oded Gabbay wrote:
> On Fri, Jan 5, 2018 at 12:17 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 5 +++++
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>> index 335e454..7ebe430 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>> @@ -78,10 +78,15 @@ void amdgpu_amdkfd_device_probe(struct amdgpu_device *adev)
>>         switch (adev->asic_type) {
>>  #ifdef CONFIG_DRM_AMDGPU_CIK
>>         case CHIP_KAVERI:
>> +       case CHIP_HAWAII:
>>                 kfd2kgd = amdgpu_amdkfd_gfx_7_get_functions();
>>                 break;
>>  #endif
>>         case CHIP_CARRIZO:
>> +       case CHIP_TONGA:
>> +       case CHIP_FIJI:
>> +       case CHIP_POLARIS10:
>> +       case CHIP_POLARIS11:
> Polaris isn't gfx 9 ?
> or is it called differently ?

The GFX IP in Polaris is same as in Fiji. Same CP packets, same shader
instructions, same register offsets etc. You'll see the same in amdgpu.
Enabling Polaris in KFD was really as simple as adding the names and
device IDs in a few places.

Regards,
  Felix

>
>>                 kfd2kgd = amdgpu_amdkfd_gfx_8_0_get_functions();
>>                 break;
>>         default:
>> --
>> 2.7.4
>>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 3/9] drm/amdkfd: Make IOMMUv2 code conditional
       [not found]             ` <CAFCwf12pqRA4KdRLpkUmiBs7EQmTePcy80V2kP9mP3pN8V-eTg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2018-01-31 15:11               ` Christian König
  2018-01-31 16:14               ` Felix Kuehling
@ 2018-02-03  2:29               ` Felix Kuehling
       [not found]                 ` <5a1e7696-b5bf-547c-4fe6-e71e0ae7f5e0-5C7GfCeVMHo@public.gmane.org>
  2 siblings, 1 reply; 47+ messages in thread
From: Felix Kuehling @ 2018-02-03  2:29 UTC (permalink / raw)
  To: Oded Gabbay; +Cc: amd-gfx list

[-- Attachment #1: Type: text/plain, Size: 18406 bytes --]

The attached patch is my attempt to keep most of the IOMMU code in one
place (new kfd_iommu.c) to avoid #ifdefs all over the place. This way I
can still conditionally compile a bunch of KFD code that is only needed
for IOMMU handling, with stub functions for kernel configs without IOMMU
support. About 300 lines of conditionally compiled code got moved to
kfd_iommu.c.

The only piece I didn't move into kfd_iommu.c is 
kfd_signal_iommu_event. I prefer to keep that in kfd_events.c because it
doesn't call any IOMMU driver functions, and because it's closely
related to the rest of the event handling logic. It could be compiled
unconditionally, but it would be dead code without IOMMU support.

And I moved pdd->bound to a place where it doesn't consume extra space
(on 64-bit systems due to structure alignment) instead of making it
conditional.

This is only compile-tested for now.

If you like this approach, I'll do more testing and squash it with "Make
IOMMUv2 code conditional".

Regards,
  Felix


On 2018-01-31 10:00 AM, Oded Gabbay wrote:
> On Wed, Jan 31, 2018 at 4:56 PM, Oded Gabbay <oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>> Hi Felix,
>> Please don't spread 19 #ifdefs throughout the code.
>> I suggest to put one #ifdef in linux/amd-iommu.h itself around all the
>> functions declarations and in the #else section put macros with empty
>> implementations. This is much more readable and maintainable.
>>
>> Oded
> To emphasize my point, there is a call to amd_iommu_bind_pasid in
> kfd_bind_processes_to_device() which isn't wrapped with the #ifdef so
> the compliation breaks. Putting the #ifdefs around the calls is simply
> not scalable.
>
> Oded
>
>>
>> On Fri, Jan 5, 2018 at 12:17 AM, Felix Kuehling <Felix.Kuehling-5C7GfCeVMHo@public.gmane.org> wrote:
>>> dGPUs work without IOMMUv2. Make IOMMUv2 initialization dependent on
>>> ASIC information. Also allow building KFD without IOMMUv2 support.
>>> This is still useful for dGPUs and prepares for enabling KFD on
>>> architectures that don't support AMD IOMMUv2.
>>>
>>> Signed-off-by: Felix Kuehling <Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
>>> ---
>>>  drivers/gpu/drm/amd/amdkfd/Kconfig        |  2 +-
>>>  drivers/gpu/drm/amd/amdkfd/kfd_crat.c     |  8 +++-
>>>  drivers/gpu/drm/amd/amdkfd/kfd_device.c   | 62 +++++++++++++++++++++----------
>>>  drivers/gpu/drm/amd/amdkfd/kfd_events.c   |  2 +
>>>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h     |  5 +++
>>>  drivers/gpu/drm/amd/amdkfd/kfd_process.c  | 17 ++++++---
>>>  drivers/gpu/drm/amd/amdkfd/kfd_topology.c |  2 +
>>>  drivers/gpu/drm/amd/amdkfd/kfd_topology.h |  2 +
>>>  8 files changed, 74 insertions(+), 26 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/Kconfig b/drivers/gpu/drm/amd/amdkfd/Kconfig
>>> index bc5a294..5bbeb95 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/Kconfig
>>> +++ b/drivers/gpu/drm/amd/amdkfd/Kconfig
>>> @@ -4,6 +4,6 @@
>>>
>>>  config HSA_AMD
>>>         tristate "HSA kernel driver for AMD GPU devices"
>>> -       depends on DRM_AMDGPU && AMD_IOMMU_V2 && X86_64
>>> +       depends on DRM_AMDGPU && X86_64
>>>         help
>>>           Enable this if you want to use HSA features on AMD GPU devices.
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
>>> index 2bc2816..3478270 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
>>> @@ -22,7 +22,9 @@
>>>
>>>  #include <linux/pci.h>
>>>  #include <linux/acpi.h>
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>  #include <linux/amd-iommu.h>
>>> +#endif
>>>  #include "kfd_crat.h"
>>>  #include "kfd_priv.h"
>>>  #include "kfd_topology.h"
>>> @@ -1037,15 +1039,17 @@ static int kfd_create_vcrat_image_gpu(void *pcrat_image,
>>>         struct crat_subtype_generic *sub_type_hdr;
>>>         struct crat_subtype_computeunit *cu;
>>>         struct kfd_cu_info cu_info;
>>> -       struct amd_iommu_device_info iommu_info;
>>>         int avail_size = *size;
>>>         uint32_t total_num_of_cu;
>>>         int num_of_cache_entries = 0;
>>>         int cache_mem_filled = 0;
>>>         int ret = 0;
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>> +       struct amd_iommu_device_info iommu_info;
>>>         const u32 required_iommu_flags = AMD_IOMMU_DEVICE_FLAG_ATS_SUP |
>>>                                          AMD_IOMMU_DEVICE_FLAG_PRI_SUP |
>>>                                          AMD_IOMMU_DEVICE_FLAG_PASID_SUP;
>>> +#endif
>>>         struct kfd_local_mem_info local_mem_info;
>>>
>>>         if (!pcrat_image || avail_size < VCRAT_SIZE_FOR_GPU)
>>> @@ -1106,12 +1110,14 @@ static int kfd_create_vcrat_image_gpu(void *pcrat_image,
>>>         /* Check if this node supports IOMMU. During parsing this flag will
>>>          * translate to HSA_CAP_ATS_PRESENT
>>>          */
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>         iommu_info.flags = 0;
>>>         if (amd_iommu_device_info(kdev->pdev, &iommu_info) == 0) {
>>>                 if ((iommu_info.flags & required_iommu_flags) ==
>>>                                 required_iommu_flags)
>>>                         cu->hsa_capability |= CRAT_CU_FLAGS_IOMMU_PRESENT;
>>>         }
>>> +#endif
>>>
>>>         crat_table->length += sub_type_hdr->length;
>>>         crat_table->total_entries++;
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>> index fafe971..5205b34 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>> @@ -20,7 +20,9 @@
>>>   * OTHER DEALINGS IN THE SOFTWARE.
>>>   */
>>>
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>  #include <linux/amd-iommu.h>
>>> +#endif
>>>  #include <linux/bsearch.h>
>>>  #include <linux/pci.h>
>>>  #include <linux/slab.h>
>>> @@ -31,6 +33,7 @@
>>>
>>>  #define MQD_SIZE_ALIGNED 768
>>>
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>  static const struct kfd_device_info kaveri_device_info = {
>>>         .asic_family = CHIP_KAVERI,
>>>         .max_pasid_bits = 16,
>>> @@ -41,6 +44,7 @@ static const struct kfd_device_info kaveri_device_info = {
>>>         .num_of_watch_points = 4,
>>>         .mqd_size_aligned = MQD_SIZE_ALIGNED,
>>>         .supports_cwsr = false,
>>> +       .needs_iommu_device = true,
>>>         .needs_pci_atomics = false,
>>>  };
>>>
>>> @@ -54,8 +58,10 @@ static const struct kfd_device_info carrizo_device_info = {
>>>         .num_of_watch_points = 4,
>>>         .mqd_size_aligned = MQD_SIZE_ALIGNED,
>>>         .supports_cwsr = true,
>>> +       .needs_iommu_device = true,
>>>         .needs_pci_atomics = false,
>>>  };
>>> +#endif
>>>
>>>  struct kfd_deviceid {
>>>         unsigned short did;
>>> @@ -64,6 +70,7 @@ struct kfd_deviceid {
>>>
>>>  /* Please keep this sorted by increasing device id. */
>>>  static const struct kfd_deviceid supported_devices[] = {
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>         { 0x1304, &kaveri_device_info },        /* Kaveri */
>>>         { 0x1305, &kaveri_device_info },        /* Kaveri */
>>>         { 0x1306, &kaveri_device_info },        /* Kaveri */
>>> @@ -91,6 +98,7 @@ static const struct kfd_deviceid supported_devices[] = {
>>>         { 0x9875, &carrizo_device_info },       /* Carrizo */
>>>         { 0x9876, &carrizo_device_info },       /* Carrizo */
>>>         { 0x9877, &carrizo_device_info }        /* Carrizo */
>>> +#endif
>>>  };
>>>
>>>  static int kfd_gtt_sa_init(struct kfd_dev *kfd, unsigned int buf_size,
>>> @@ -161,6 +169,7 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
>>>         return kfd;
>>>  }
>>>
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>  static bool device_iommu_pasid_init(struct kfd_dev *kfd)
>>>  {
>>>         const u32 required_iommu_flags = AMD_IOMMU_DEVICE_FLAG_ATS_SUP |
>>> @@ -231,6 +240,7 @@ static int iommu_invalid_ppr_cb(struct pci_dev *pdev, int pasid,
>>>
>>>         return AMD_IOMMU_INV_PRI_RSP_INVALID;
>>>  }
>>> +#endif /* CONFIG_AMD_IOMMU_V2 */
>>>
>>>  static void kfd_cwsr_init(struct kfd_dev *kfd)
>>>  {
>>> @@ -321,12 +331,14 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>>>                 goto device_queue_manager_error;
>>>         }
>>>
>>> -       if (!device_iommu_pasid_init(kfd)) {
>>> -               dev_err(kfd_device,
>>> -                       "Error initializing iommuv2 for device %x:%x\n",
>>> -                       kfd->pdev->vendor, kfd->pdev->device);
>>> -               goto device_iommu_pasid_error;
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>> +       if (kfd->device_info->needs_iommu_device) {
>>> +               if (!device_iommu_pasid_init(kfd)) {
>>> +                       dev_err(kfd_device, "Error initializing iommuv2\n");
>>> +                       goto device_iommu_pasid_error;
>>> +               }
>>>         }
>>> +#endif
>>>
>>>         kfd_cwsr_init(kfd);
>>>
>>> @@ -386,11 +398,16 @@ void kgd2kfd_suspend(struct kfd_dev *kfd)
>>>
>>>         kfd->dqm->ops.stop(kfd->dqm);
>>>
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>> +       if (!kfd->device_info->needs_iommu_device)
>>> +               return;
>>> +
>>>         kfd_unbind_processes_from_device(kfd);
>>>
>>>         amd_iommu_set_invalidate_ctx_cb(kfd->pdev, NULL);
>>>         amd_iommu_set_invalid_ppr_cb(kfd->pdev, NULL);
>>>         amd_iommu_free_device(kfd->pdev);
>>> +#endif
>>>  }
>>>
>>>  int kgd2kfd_resume(struct kfd_dev *kfd)
>>> @@ -405,19 +422,24 @@ int kgd2kfd_resume(struct kfd_dev *kfd)
>>>  static int kfd_resume(struct kfd_dev *kfd)
>>>  {
>>>         int err = 0;
>>> -       unsigned int pasid_limit = kfd_get_pasid_limit();
>>>
>>> -       err = amd_iommu_init_device(kfd->pdev, pasid_limit);
>>> -       if (err)
>>> -               return -ENXIO;
>>> -       amd_iommu_set_invalidate_ctx_cb(kfd->pdev,
>>> -                                       iommu_pasid_shutdown_callback);
>>> -       amd_iommu_set_invalid_ppr_cb(kfd->pdev,
>>> -                                    iommu_invalid_ppr_cb);
>>> -
>>> -       err = kfd_bind_processes_to_device(kfd);
>>> -       if (err)
>>> -               goto processes_bind_error;
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>> +       if (kfd->device_info->needs_iommu_device) {
>>> +               unsigned int pasid_limit = kfd_get_pasid_limit();
>>> +
>>> +               err = amd_iommu_init_device(kfd->pdev, pasid_limit);
>>> +               if (err)
>>> +                       return -ENXIO;
>>> +               amd_iommu_set_invalidate_ctx_cb(kfd->pdev,
>>> +                                               iommu_pasid_shutdown_callback);
>>> +               amd_iommu_set_invalid_ppr_cb(kfd->pdev,
>>> +                                            iommu_invalid_ppr_cb);
>>> +
>>> +               err = kfd_bind_processes_to_device(kfd);
>>> +               if (err)
>>> +                       goto processes_bind_error;
>>> +       }
>>> +#endif
>>>
>>>         err = kfd->dqm->ops.start(kfd->dqm);
>>>         if (err) {
>>> @@ -431,8 +453,10 @@ static int kfd_resume(struct kfd_dev *kfd)
>>>
>>>  dqm_start_error:
>>>  processes_bind_error:
>>> -       amd_iommu_free_device(kfd->pdev);
>>> -
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>> +       if (kfd->device_info->needs_iommu_device)
>>> +               amd_iommu_free_device(kfd->pdev);
>>> +#endif
>>>         return err;
>>>  }
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
>>> index 93aae5c..f770dc7 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
>>> @@ -837,6 +837,7 @@ static void lookup_events_by_type_and_signal(struct kfd_process *p,
>>>         }
>>>  }
>>>
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>  void kfd_signal_iommu_event(struct kfd_dev *dev, unsigned int pasid,
>>>                 unsigned long address, bool is_write_requested,
>>>                 bool is_execute_requested)
>>> @@ -905,6 +906,7 @@ void kfd_signal_iommu_event(struct kfd_dev *dev, unsigned int pasid,
>>>         mutex_unlock(&p->event_mutex);
>>>         kfd_unref_process(p);
>>>  }
>>> +#endif /* CONFIG_AMD_IOMMU_V2_MODULE */
>>>
>>>  void kfd_signal_hw_exception_event(unsigned int pasid)
>>>  {
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>> index eebfb1e..9f4766c 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>> @@ -158,6 +158,7 @@ struct kfd_device_info {
>>>         uint8_t num_of_watch_points;
>>>         uint16_t mqd_size_aligned;
>>>         bool supports_cwsr;
>>> +       bool needs_iommu_device;
>>>         bool needs_pci_atomics;
>>>  };
>>>
>>> @@ -617,9 +618,11 @@ void kfd_unref_process(struct kfd_process *p);
>>>
>>>  struct kfd_process_device *kfd_bind_process_to_device(struct kfd_dev *dev,
>>>                                                 struct kfd_process *p);
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>  int kfd_bind_processes_to_device(struct kfd_dev *dev);
>>>  void kfd_unbind_processes_from_device(struct kfd_dev *dev);
>>>  void kfd_process_iommu_unbind_callback(struct kfd_dev *dev, unsigned int pasid);
>>> +#endif
>>>  struct kfd_process_device *kfd_get_process_device_data(struct kfd_dev *dev,
>>>                                                         struct kfd_process *p);
>>>  struct kfd_process_device *kfd_create_process_device_data(struct kfd_dev *dev,
>>> @@ -784,9 +787,11 @@ int kfd_wait_on_events(struct kfd_process *p,
>>>                        uint32_t *wait_result);
>>>  void kfd_signal_event_interrupt(unsigned int pasid, uint32_t partial_id,
>>>                                 uint32_t valid_id_bits);
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>  void kfd_signal_iommu_event(struct kfd_dev *dev,
>>>                 unsigned int pasid, unsigned long address,
>>>                 bool is_write_requested, bool is_execute_requested);
>>> +#endif
>>>  void kfd_signal_hw_exception_event(unsigned int pasid);
>>>  int kfd_set_event(struct kfd_process *p, uint32_t event_id);
>>>  int kfd_reset_event(struct kfd_process *p, uint32_t event_id);
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> index a22fb071..1d0e02c 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> @@ -173,14 +173,17 @@ static void kfd_process_wq_release(struct work_struct *work)
>>>  {
>>>         struct kfd_process *p = container_of(work, struct kfd_process,
>>>                                              release_work);
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>         struct kfd_process_device *pdd;
>>>
>>>         pr_debug("Releasing process (pasid %d) in workqueue\n", p->pasid);
>>>
>>>         list_for_each_entry(pdd, &p->per_device_data, per_device_list) {
>>> -               if (pdd->bound == PDD_BOUND)
>>> +               if (pdd->bound == PDD_BOUND &&
>>> +                   pdd->dev->device_info->needs_iommu_device)
>>>                         amd_iommu_unbind_pasid(pdd->dev->pdev, p->pasid);
>>>         }
>>> +#endif
>>>
>>>         kfd_process_destroy_pdds(p);
>>>
>>> @@ -421,7 +424,6 @@ struct kfd_process_device *kfd_bind_process_to_device(struct kfd_dev *dev,
>>>                                                         struct kfd_process *p)
>>>  {
>>>         struct kfd_process_device *pdd;
>>> -       int err;
>>>
>>>         pdd = kfd_get_process_device_data(dev, p);
>>>         if (!pdd) {
>>> @@ -436,9 +438,14 @@ struct kfd_process_device *kfd_bind_process_to_device(struct kfd_dev *dev,
>>>                 return ERR_PTR(-EINVAL);
>>>         }
>>>
>>> -       err = amd_iommu_bind_pasid(dev->pdev, p->pasid, p->lead_thread);
>>> -       if (err < 0)
>>> -               return ERR_PTR(err);
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>> +       if (dev->device_info->needs_iommu_device) {
>>> +               int err = amd_iommu_bind_pasid(dev->pdev, p->pasid,
>>> +                                              p->lead_thread);
>>> +               if (err < 0)
>>> +                       return ERR_PTR(err);
>>> +       }
>>> +#endif
>>>
>>>         pdd->bound = PDD_BOUND;
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>>> index c6a7609..f57c305 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>>> @@ -875,6 +875,7 @@ static void find_system_memory(const struct dmi_header *dm,
>>>   */
>>>  static int kfd_add_perf_to_topology(struct kfd_topology_device *kdev)
>>>  {
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>         struct kfd_perf_properties *props;
>>>
>>>         if (amd_iommu_pc_supported()) {
>>> @@ -886,6 +887,7 @@ static int kfd_add_perf_to_topology(struct kfd_topology_device *kdev)
>>>                         amd_iommu_pc_get_max_counters(0); /* assume one iommu */
>>>                 list_add_tail(&props->list, &kdev->perf_props);
>>>         }
>>> +#endif
>>>
>>>         return 0;
>>>  }
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
>>> index 53fca1f..111fda2 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
>>> @@ -183,8 +183,10 @@ struct kfd_topology_device *kfd_create_topology_device(
>>>                 struct list_head *device_list);
>>>  void kfd_release_topology_device_list(struct list_head *device_list);
>>>
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>  extern bool amd_iommu_pc_supported(void);
>>>  extern u8 amd_iommu_pc_get_max_banks(u16 devid);
>>>  extern u8 amd_iommu_pc_get_max_counters(u16 devid);
>>> +#endif
>>>
>>>  #endif /* __KFD_TOPOLOGY_H__ */
>>> --
>>> 2.7.4
>>>


[-- Attachment #2: 0001-drm-amdkfd-Centralize-IOMMU-handling.patch --]
[-- Type: text/x-patch, Size: 32463 bytes --]

>From a7e4ba61c9ef58bd543df6ae4324e8825933f5fc Mon Sep 17 00:00:00 2001
From: Felix Kuehling <Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
Date: Fri, 2 Feb 2018 21:06:34 -0500
Subject: [PATCH 1/1] drm/amdkfd: Centralize IOMMU handling

Avoid scattering IOMMU ifdefs in too many places.

Signed-off-by: Felix Kuehling <Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
---
 drivers/gpu/drm/amd/amdkfd/Makefile       |   4 +
 drivers/gpu/drm/amd/amdkfd/kfd_crat.c     |  20 +-
 drivers/gpu/drm/amd/amdkfd/kfd_device.c   | 127 +----------
 drivers/gpu/drm/amd/amdkfd/kfd_events.c   |   5 +-
 drivers/gpu/drm/amd/amdkfd/kfd_iommu.c    | 357 ++++++++++++++++++++++++++++++
 drivers/gpu/drm/amd/amdkfd/kfd_iommu.h    |  78 +++++++
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h     |  17 +-
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  | 151 +------------
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c |  18 +-
 drivers/gpu/drm/amd/amdkfd/kfd_topology.h |   8 +-
 10 files changed, 476 insertions(+), 309 deletions(-)
 create mode 100644 drivers/gpu/drm/amd/amdkfd/kfd_iommu.c
 create mode 100644 drivers/gpu/drm/amd/amdkfd/kfd_iommu.h

diff --git a/drivers/gpu/drm/amd/amdkfd/Makefile b/drivers/gpu/drm/amd/amdkfd/Makefile
index a317e76..0d02422 100644
--- a/drivers/gpu/drm/amd/amdkfd/Makefile
+++ b/drivers/gpu/drm/amd/amdkfd/Makefile
@@ -37,6 +37,10 @@ amdkfd-y	:= kfd_module.o kfd_device.o kfd_chardev.o kfd_topology.o \
 		kfd_interrupt.o kfd_events.o cik_event_interrupt.o \
 		kfd_dbgdev.o kfd_dbgmgr.o kfd_crat.o
 
+ifneq ($(CONFIG_AMD_IOMMU_V2),)
+amdkfd-y += kfd_iommu.o
+endif
+
 amdkfd-$(CONFIG_DEBUG_FS) += kfd_debugfs.o
 
 obj-$(CONFIG_HSA_AMD)	+= amdkfd.o
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
index c1981b1..3c6c4cdd 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
@@ -22,12 +22,10 @@
 
 #include <linux/pci.h>
 #include <linux/acpi.h>
-#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
-#include <linux/amd-iommu.h>
-#endif
 #include "kfd_crat.h"
 #include "kfd_priv.h"
 #include "kfd_topology.h"
+#include "kfd_iommu.h"
 
 /* GPU Processor ID base for dGPUs for which VCRAT needs to be created.
  * GPU processor ID are expressed with Bit[31]=1.
@@ -1044,12 +1042,6 @@ static int kfd_create_vcrat_image_gpu(void *pcrat_image,
 	int num_of_cache_entries = 0;
 	int cache_mem_filled = 0;
 	int ret = 0;
-#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
-	struct amd_iommu_device_info iommu_info;
-	const u32 required_iommu_flags = AMD_IOMMU_DEVICE_FLAG_ATS_SUP |
-					 AMD_IOMMU_DEVICE_FLAG_PRI_SUP |
-					 AMD_IOMMU_DEVICE_FLAG_PASID_SUP;
-#endif
 	struct kfd_local_mem_info local_mem_info;
 
 	if (!pcrat_image || avail_size < VCRAT_SIZE_FOR_GPU)
@@ -1110,14 +1102,8 @@ static int kfd_create_vcrat_image_gpu(void *pcrat_image,
 	/* Check if this node supports IOMMU. During parsing this flag will
 	 * translate to HSA_CAP_ATS_PRESENT
 	 */
-#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
-	iommu_info.flags = 0;
-	if (amd_iommu_device_info(kdev->pdev, &iommu_info) == 0) {
-		if ((iommu_info.flags & required_iommu_flags) ==
-				required_iommu_flags)
-			cu->hsa_capability |= CRAT_CU_FLAGS_IOMMU_PRESENT;
-	}
-#endif
+	if (!kfd_iommu_check_device(kdev))
+		cu->hsa_capability |= CRAT_CU_FLAGS_IOMMU_PRESENT;
 
 	crat_table->length += sub_type_hdr->length;
 	crat_table->total_entries++;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 9299a91..f87ee43 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -30,11 +30,12 @@
 #include "kfd_device_queue_manager.h"
 #include "kfd_pm4_headers_vi.h"
 #include "cwsr_trap_handler_gfx8.asm"
+#include "kfd_iommu.h"
 
 #define MQD_SIZE_ALIGNED 768
 static atomic_t kfd_device_suspended = ATOMIC_INIT(0);
 
-#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
+#ifdef KFD_SUPPORT_IOMMU_V2
 static const struct kfd_device_info kaveri_device_info = {
 	.asic_family = CHIP_KAVERI,
 	.max_pasid_bits = 16,
@@ -177,7 +178,7 @@ struct kfd_deviceid {
 };
 
 static const struct kfd_deviceid supported_devices[] = {
-#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
+#ifdef KFD_SUPPORT_IOMMU_V2
 	{ 0x1304, &kaveri_device_info },	/* Kaveri */
 	{ 0x1305, &kaveri_device_info },	/* Kaveri */
 	{ 0x1306, &kaveri_device_info },	/* Kaveri */
@@ -319,79 +320,6 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
 	return kfd;
 }
 
-#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
-static bool device_iommu_pasid_init(struct kfd_dev *kfd)
-{
-	const u32 required_iommu_flags = AMD_IOMMU_DEVICE_FLAG_ATS_SUP |
-					AMD_IOMMU_DEVICE_FLAG_PRI_SUP |
-					AMD_IOMMU_DEVICE_FLAG_PASID_SUP;
-
-	struct amd_iommu_device_info iommu_info;
-	unsigned int pasid_limit;
-	int err;
-
-	err = amd_iommu_device_info(kfd->pdev, &iommu_info);
-	if (err < 0) {
-		dev_err(kfd_device,
-			"error getting iommu info. is the iommu enabled?\n");
-		return false;
-	}
-
-	if ((iommu_info.flags & required_iommu_flags) != required_iommu_flags) {
-		dev_err(kfd_device, "error required iommu flags ats %i, pri %i, pasid %i\n",
-		       (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_ATS_SUP) != 0,
-		       (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_PRI_SUP) != 0,
-		       (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_PASID_SUP)
-									!= 0);
-		return false;
-	}
-
-	pasid_limit = min_t(unsigned int,
-			(unsigned int)(1 << kfd->device_info->max_pasid_bits),
-			iommu_info.max_pasids);
-
-	if (!kfd_set_pasid_limit(pasid_limit)) {
-		dev_err(kfd_device, "error setting pasid limit\n");
-		return false;
-	}
-
-	return true;
-}
-
-static void iommu_pasid_shutdown_callback(struct pci_dev *pdev, int pasid)
-{
-	struct kfd_dev *dev = kfd_device_by_pci_dev(pdev);
-
-	if (dev)
-		kfd_process_iommu_unbind_callback(dev, pasid);
-}
-
-/*
- * This function called by IOMMU driver on PPR failure
- */
-static int iommu_invalid_ppr_cb(struct pci_dev *pdev, int pasid,
-		unsigned long address, u16 flags)
-{
-	struct kfd_dev *dev;
-
-	dev_warn(kfd_device,
-			"Invalid PPR device %x:%x.%x pasid %d address 0x%lX flags 0x%X",
-			PCI_BUS_NUM(pdev->devfn),
-			PCI_SLOT(pdev->devfn),
-			PCI_FUNC(pdev->devfn),
-			pasid,
-			address,
-			flags);
-
-	dev = kfd_device_by_pci_dev(pdev);
-	if (!WARN_ON(!dev))
-		kfd_signal_iommu_event(dev, pasid, address,
-			flags & PPR_FAULT_WRITE, flags & PPR_FAULT_EXEC);
-
-	return AMD_IOMMU_INV_PRI_RSP_INVALID;
-}
-#endif /* CONFIG_AMD_IOMMU_V2 */
-
 static void kfd_cwsr_init(struct kfd_dev *kfd)
 {
 	if (cwsr_enable && kfd->device_info->supports_cwsr) {
@@ -481,14 +409,10 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 		goto device_queue_manager_error;
 	}
 
-#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
-	if (kfd->device_info->needs_iommu_device) {
-		if (!device_iommu_pasid_init(kfd)) {
-			dev_err(kfd_device, "Error initializing iommuv2\n");
-			goto device_iommu_pasid_error;
-		}
+	if (!kfd_iommu_device_init(kfd)) {
+		dev_err(kfd_device, "Error initializing iommuv2\n");
+		goto device_iommu_error;
 	}
-#endif
 
 	kfd_cwsr_init(kfd);
 
@@ -507,7 +431,7 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 	goto out;
 
 kfd_resume_error:
-device_iommu_pasid_error:
+device_iommu_error:
 	device_queue_manager_uninit(kfd->dqm);
 device_queue_manager_error:
 	kfd_interrupt_exit(kfd);
@@ -552,16 +476,7 @@ void kgd2kfd_suspend(struct kfd_dev *kfd)
 
 	kfd->dqm->ops.stop(kfd->dqm);
 
-#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
-	if (!kfd->device_info->needs_iommu_device)
-		return;
-
-	kfd_unbind_processes_from_device(kfd);
-
-	amd_iommu_set_invalidate_ctx_cb(kfd->pdev, NULL);
-	amd_iommu_set_invalid_ppr_cb(kfd->pdev, NULL);
-	amd_iommu_free_device(kfd->pdev);
-#endif
+	kfd_iommu_suspend(kfd);
 }
 
 int kgd2kfd_resume(struct kfd_dev *kfd)
@@ -587,23 +502,9 @@ static int kfd_resume(struct kfd_dev *kfd)
 {
 	int err = 0;
 
-#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
-	if (kfd->device_info->needs_iommu_device) {
-		unsigned int pasid_limit = kfd_get_pasid_limit();
-
-		err = amd_iommu_init_device(kfd->pdev, pasid_limit);
-		if (err)
-			return -ENXIO;
-		amd_iommu_set_invalidate_ctx_cb(kfd->pdev,
-						iommu_pasid_shutdown_callback);
-		amd_iommu_set_invalid_ppr_cb(kfd->pdev,
-					     iommu_invalid_ppr_cb);
-
-		err = kfd_bind_processes_to_device(kfd);
-		if (err)
-			goto processes_bind_error;
-	}
-#endif
+	err = kfd_iommu_resume(kfd);
+	if (err)
+		return err;
 
 	err = kfd->dqm->ops.start(kfd->dqm);
 	if (err) {
@@ -616,11 +517,7 @@ static int kfd_resume(struct kfd_dev *kfd)
 	return err;
 
 dqm_start_error:
-processes_bind_error:
-#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
-	if (kfd->device_info->needs_iommu_device)
-		amd_iommu_free_device(kfd->pdev);
-#endif
+	kfd_iommu_suspend(kfd);
 	return err;
 }
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
index 56ec74a..4890a90 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
@@ -30,6 +30,7 @@
 #include <linux/memory.h>
 #include "kfd_priv.h"
 #include "kfd_events.h"
+#include "kfd_iommu.h"
 #include <linux/device.h>
 
 /*
@@ -864,7 +865,7 @@ static void lookup_events_by_type_and_signal(struct kfd_process *p,
 	}
 }
 
-#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
+#ifdef KFD_SUPPORT_IOMMU_V2
 void kfd_signal_iommu_event(struct kfd_dev *dev, unsigned int pasid,
 		unsigned long address, bool is_write_requested,
 		bool is_execute_requested)
@@ -933,7 +934,7 @@ void kfd_signal_iommu_event(struct kfd_dev *dev, unsigned int pasid,
 	mutex_unlock(&p->event_mutex);
 	kfd_unref_process(p);
 }
-#endif /* CONFIG_AMD_IOMMU_V2_MODULE */
+#endif /* KFD_SUPPORT_IOMMU_V2 */
 
 void kfd_signal_hw_exception_event(unsigned int pasid)
 {
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c b/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c
new file mode 100644
index 0000000..333c1ff
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c
@@ -0,0 +1,357 @@
+/*
+ * Copyright 2018 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#include <linux/printk.h>
+#include <linux/device.h>
+#include <linux/slab.h>
+#include <linux/pci.h>
+#include <linux/amd-iommu.h>
+#include "kfd_priv.h"
+#include "kfd_dbgmgr.h"
+#include "kfd_topology.h"
+#include "kfd_iommu.h"
+
+static const u32 required_iommu_flags = AMD_IOMMU_DEVICE_FLAG_ATS_SUP |
+					AMD_IOMMU_DEVICE_FLAG_PRI_SUP |
+					AMD_IOMMU_DEVICE_FLAG_PASID_SUP;
+
+/** kfd_iommu_check_device - Check whether IOMMU is available for device
+ */
+int kfd_iommu_check_device(struct kfd_dev *kfd)
+{
+	struct amd_iommu_device_info iommu_info;
+	int err;
+
+	if (!kfd->device_info->needs_iommu_device)
+		return -ENODEV;
+
+	iommu_info.flags = 0;
+	err = amd_iommu_device_info(kfd->pdev, &iommu_info);
+	if (err)
+		return err;
+
+	if ((iommu_info.flags & required_iommu_flags) != required_iommu_flags)
+		return -ENODEV;
+
+	return 0;
+}
+
+/** kfd_iommu_device_init - Initialize IOMMU for device
+ */
+int kfd_iommu_device_init(struct kfd_dev *kfd)
+{
+	struct amd_iommu_device_info iommu_info;
+	unsigned int pasid_limit;
+	int err;
+
+	if (!kfd->device_info->needs_iommu_device)
+		return 0;
+
+	iommu_info.flags = 0;
+	err = amd_iommu_device_info(kfd->pdev, &iommu_info);
+	if (err < 0) {
+		dev_err(kfd_device,
+			"error getting iommu info. is the iommu enabled?\n");
+		return -ENODEV;
+	}
+
+	if ((iommu_info.flags & required_iommu_flags) != required_iommu_flags) {
+		dev_err(kfd_device, "error required iommu flags ats %i, pri %i, pasid %i\n",
+		       (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_ATS_SUP) != 0,
+		       (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_PRI_SUP) != 0,
+		       (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_PASID_SUP)
+									!= 0);
+		return -ENODEV;
+	}
+
+	pasid_limit = min_t(unsigned int,
+			(unsigned int)(1 << kfd->device_info->max_pasid_bits),
+			iommu_info.max_pasids);
+
+	if (!kfd_set_pasid_limit(pasid_limit)) {
+		dev_err(kfd_device, "error setting pasid limit\n");
+		return -EBUSY;
+	}
+
+	return 0;
+}
+
+/** kfd_iommu_bind_process_to_device - Have the IOMMU bind a process
+ *
+ * Binds the given process to the given device using its PASID. This
+ * enables IOMMUv2 address translation for the process on the device.
+ *
+ * This function assumes that the process mutex is held.
+ */
+int kfd_iommu_bind_process_to_device(struct kfd_process_device *pdd)
+{
+	struct kfd_dev *dev = pdd->dev;
+	struct kfd_process *p = pdd->process;
+	int err;
+
+	if (!dev->device_info->needs_iommu_device || pdd->bound == PDD_BOUND)
+		return 0;
+
+	if (unlikely(pdd->bound == PDD_BOUND_SUSPENDED)) {
+		pr_err("Binding PDD_BOUND_SUSPENDED pdd is unexpected!\n");
+		return -EINVAL;
+	}
+
+	err = amd_iommu_bind_pasid(dev->pdev, p->pasid, p->lead_thread);
+	if (!err)
+		pdd->bound = PDD_BOUND;
+
+	return err;
+}
+
+/** kfd_iommu_unbind_process - Unbind process from all devices
+ *
+ * This removes all IOMMU device bindings of the process. To be used
+ * before process termination.
+ */
+void kfd_iommu_unbind_process(struct kfd_process *p)
+{
+	struct kfd_process_device *pdd;
+
+	list_for_each_entry(pdd, &p->per_device_data, per_device_list)
+		if (pdd->bound == PDD_BOUND)
+			amd_iommu_unbind_pasid(pdd->dev->pdev, p->pasid);
+}
+
+/* Callback for process shutdown invoked by the IOMMU driver */
+static void iommu_pasid_shutdown_callback(struct pci_dev *pdev, int pasid)
+{
+	struct kfd_dev *dev = kfd_device_by_pci_dev(pdev);
+	struct kfd_process *p;
+	struct kfd_process_device *pdd;
+
+	if (!dev)
+		return;
+
+	/*
+	 * Look for the process that matches the pasid. If there is no such
+	 * process, we either released it in amdkfd's own notifier, or there
+	 * is a bug. Unfortunately, there is no way to tell...
+	 */
+	p = kfd_lookup_process_by_pasid(pasid);
+	if (!p)
+		return;
+
+	pr_debug("Unbinding process %d from IOMMU\n", pasid);
+
+	mutex_lock(kfd_get_dbgmgr_mutex());
+
+	if (dev->dbgmgr && dev->dbgmgr->pasid == p->pasid) {
+		if (!kfd_dbgmgr_unregister(dev->dbgmgr, p)) {
+			kfd_dbgmgr_destroy(dev->dbgmgr);
+			dev->dbgmgr = NULL;
+		}
+	}
+
+	mutex_unlock(kfd_get_dbgmgr_mutex());
+
+	mutex_lock(&p->mutex);
+
+	pdd = kfd_get_process_device_data(dev, p);
+	if (pdd)
+		/* For GPU relying on IOMMU, we need to dequeue here
+		 * when PASID is still bound.
+		 */
+		kfd_process_dequeue_from_device(pdd);
+
+	mutex_unlock(&p->mutex);
+
+	kfd_unref_process(p);
+}
+
+/* This function called by IOMMU driver on PPR failure */
+static int iommu_invalid_ppr_cb(struct pci_dev *pdev, int pasid,
+		unsigned long address, u16 flags)
+{
+	struct kfd_dev *dev;
+
+	dev_warn(kfd_device,
+			"Invalid PPR device %x:%x.%x pasid %d address 0x%lX flags 0x%X",
+			PCI_BUS_NUM(pdev->devfn),
+			PCI_SLOT(pdev->devfn),
+			PCI_FUNC(pdev->devfn),
+			pasid,
+			address,
+			flags);
+
+	dev = kfd_device_by_pci_dev(pdev);
+	if (!WARN_ON(!dev))
+		kfd_signal_iommu_event(dev, pasid, address,
+			flags & PPR_FAULT_WRITE, flags & PPR_FAULT_EXEC);
+
+	return AMD_IOMMU_INV_PRI_RSP_INVALID;
+}
+
+/*
+ * Bind processes do the device that have been temporarily unbound
+ * (PDD_BOUND_SUSPENDED) in kfd_unbind_processes_from_device.
+ */
+static int kfd_bind_processes_to_device(struct kfd_dev *kfd)
+{
+	struct kfd_process_device *pdd;
+	struct kfd_process *p;
+	unsigned int temp;
+	int err = 0;
+
+	int idx = srcu_read_lock(&kfd_processes_srcu);
+
+	hash_for_each_rcu(kfd_processes_table, temp, p, kfd_processes) {
+		mutex_lock(&p->mutex);
+		pdd = kfd_get_process_device_data(kfd, p);
+
+		if (WARN_ON(!pdd) || pdd->bound != PDD_BOUND_SUSPENDED) {
+			mutex_unlock(&p->mutex);
+			continue;
+		}
+
+		err = amd_iommu_bind_pasid(kfd->pdev, p->pasid,
+				p->lead_thread);
+		if (err < 0) {
+			pr_err("Unexpected pasid %d binding failure\n",
+					p->pasid);
+			mutex_unlock(&p->mutex);
+			break;
+		}
+
+		pdd->bound = PDD_BOUND;
+		mutex_unlock(&p->mutex);
+	}
+
+	srcu_read_unlock(&kfd_processes_srcu, idx);
+
+	return err;
+}
+
+/*
+ * Mark currently bound processes as PDD_BOUND_SUSPENDED. These
+ * processes will be restored to PDD_BOUND state in
+ * kfd_bind_processes_to_device.
+ */
+static void kfd_unbind_processes_from_device(struct kfd_dev *kfd)
+{
+	struct kfd_process_device *pdd;
+	struct kfd_process *p;
+	unsigned int temp;
+
+	int idx = srcu_read_lock(&kfd_processes_srcu);
+
+	hash_for_each_rcu(kfd_processes_table, temp, p, kfd_processes) {
+		mutex_lock(&p->mutex);
+		pdd = kfd_get_process_device_data(kfd, p);
+
+		if (WARN_ON(!pdd)) {
+			mutex_unlock(&p->mutex);
+			continue;
+		}
+
+		if (pdd->bound == PDD_BOUND)
+			pdd->bound = PDD_BOUND_SUSPENDED;
+		mutex_unlock(&p->mutex);
+	}
+
+	srcu_read_unlock(&kfd_processes_srcu, idx);
+}
+
+/** kfd_iommu_suspend - Prepare IOMMU for suspend
+ *
+ * This unbinds processes from the device and disables the IOMMU for
+ * the device.
+ */
+void kfd_iommu_suspend(struct kfd_dev *kfd)
+{
+	if (!kfd->device_info->needs_iommu_device)
+		return;
+
+	kfd_unbind_processes_from_device(kfd);
+
+	amd_iommu_set_invalidate_ctx_cb(kfd->pdev, NULL);
+	amd_iommu_set_invalid_ppr_cb(kfd->pdev, NULL);
+	amd_iommu_free_device(kfd->pdev);
+}
+
+/** kfd_iommu_resume - Restore IOMMU after resume
+ *
+ * This reinitializes the IOMMU for the device and re-binds previously
+ * suspended processes to the device.
+ */
+int kfd_iommu_resume(struct kfd_dev *kfd)
+{
+	unsigned int pasid_limit;
+	int err;
+
+	if (kfd->device_info->needs_iommu_device)
+		return 0;
+
+	pasid_limit = kfd_get_pasid_limit();
+
+	err = amd_iommu_init_device(kfd->pdev, pasid_limit);
+	if (err)
+		return -ENXIO;
+
+	amd_iommu_set_invalidate_ctx_cb(kfd->pdev,
+					iommu_pasid_shutdown_callback);
+	amd_iommu_set_invalid_ppr_cb(kfd->pdev,
+				     iommu_invalid_ppr_cb);
+
+	err = kfd_bind_processes_to_device(kfd);
+	if (err)
+		goto processes_bind_error;
+
+processes_bind_error:
+	amd_iommu_set_invalidate_ctx_cb(kfd->pdev, NULL);
+	amd_iommu_set_invalid_ppr_cb(kfd->pdev, NULL);
+	amd_iommu_free_device(kfd->pdev);
+
+	return err;
+}
+
+extern bool amd_iommu_pc_supported(void);
+extern u8 amd_iommu_pc_get_max_banks(u16 devid);
+extern u8 amd_iommu_pc_get_max_counters(u16 devid);
+
+/** kfd_iommu_add_perf_counters - Add IOMMU performance counters to topology
+ */
+int kfd_iommu_add_perf_counters(struct kfd_topology_device *kdev)
+{
+	struct kfd_perf_properties *props;
+
+	if (!(kdev->node_props.capability & HSA_CAP_ATS_PRESENT))
+		return 0;
+
+	if (!amd_iommu_pc_supported())
+		return 0;
+
+	props = kfd_alloc_struct(props);
+	if (!props)
+		return -ENOMEM;
+	strcpy(props->block_name, "iommu");
+	props->max_concurrent = amd_iommu_pc_get_max_banks(0) *
+		amd_iommu_pc_get_max_counters(0); /* assume one iommu */
+	list_add_tail(&props->list, &kdev->perf_props);
+
+	return 0;
+}
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_iommu.h b/drivers/gpu/drm/amd/amdkfd/kfd_iommu.h
new file mode 100644
index 0000000..dd23d9f
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_iommu.h
@@ -0,0 +1,78 @@
+/*
+ * Copyright 2018 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#ifndef __KFD_IOMMU_H__
+#define __KFD_IOMMU_H__
+
+#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
+
+#define KFD_SUPPORT_IOMMU_V2
+
+int kfd_iommu_check_device(struct kfd_dev *kfd);
+int kfd_iommu_device_init(struct kfd_dev *kfd);
+
+int kfd_iommu_bind_process_to_device(struct kfd_process_device *pdd);
+void kfd_iommu_unbind_process(struct kfd_process *p);
+
+void kfd_iommu_suspend(struct kfd_dev *kfd);
+int kfd_iommu_resume(struct kfd_dev *kfd);
+
+int kfd_iommu_add_perf_counters(struct kfd_topology_device *kdev);
+
+#else
+
+static inline int kfd_iommu_check_device(struct kfd_dev *kfd)
+{
+	return -ENODEV;
+}
+static inline int kfd_iommu_device_init(struct kfd_dev *kfd)
+{
+	return 0;
+}
+
+static inline int kfd_iommu_bind_process_to_device(
+	struct kfd_process_device *pdd)
+{
+	return 0;
+}
+static inline void kfd_iommu_unbind_process(struct kfd_process *p)
+{
+	/* empty */
+}
+
+static inline void kfd_iommu_suspend(struct kfd_dev *kfd)
+{
+	/* empty */
+}
+static inline int kfd_iommu_resume(struct kfd_dev *kfd)
+{
+	return 0;
+}
+
+static inline int kfd_iommu_add_perf_counters(struct kfd_topology_device *kdev)
+{
+	return 0;
+}
+
+#endif /* defined(CONFIG_AMD_IOMMU_V2) */
+
+#endif /* __KFD_IOMMU_H__ */
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 2eba853..63e86926 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -554,9 +554,6 @@ struct kfd_process_device {
 	uint64_t scratch_base;
 	uint64_t scratch_limit;
 
-	/* Is this process/pasid bound to this device? (amd_iommu_bind_pasid) */
-	enum kfd_pdd_bound bound;
-
 	/* VM context for GPUVM allocations */
 	void *vm;
 
@@ -569,6 +566,9 @@ struct kfd_process_device {
 	 * function.
 	 */
 	bool already_dequeued;
+
+	/* Is this process/pasid bound to this device? (amd_iommu_bind_pasid) */
+	enum kfd_pdd_bound bound;
 };
 
 #define qpd_to_pdd(x) container_of(x, struct kfd_process_device, qpd)
@@ -651,6 +651,10 @@ struct kfd_process {
 	unsigned long last_restore_timestamp;
 };
 
+#define KFD_PROCESS_TABLE_SIZE 5 /* bits: 32 entries */
+extern DECLARE_HASHTABLE(kfd_processes_table, KFD_PROCESS_TABLE_SIZE);
+extern struct srcu_struct kfd_processes_srcu;
+
 /**
  * Ioctl function type.
  *
@@ -681,11 +685,6 @@ int kfd_resume_all_processes(void);
 
 struct kfd_process_device *kfd_bind_process_to_device(struct kfd_dev *dev,
 						struct kfd_process *p);
-#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
-int kfd_bind_processes_to_device(struct kfd_dev *dev);
-void kfd_unbind_processes_from_device(struct kfd_dev *dev);
-void kfd_process_iommu_unbind_callback(struct kfd_dev *dev, unsigned int pasid);
-#endif
 struct kfd_process_device *kfd_get_process_device_data(struct kfd_dev *dev,
 							struct kfd_process *p);
 struct kfd_process_device *kfd_create_process_device_data(struct kfd_dev *dev,
@@ -864,11 +863,9 @@ int kfd_wait_on_events(struct kfd_process *p,
 		       uint32_t *wait_result);
 void kfd_signal_event_interrupt(unsigned int pasid, uint32_t partial_id,
 				uint32_t valid_id_bits);
-#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
 void kfd_signal_iommu_event(struct kfd_dev *dev,
 		unsigned int pasid, unsigned long address,
 		bool is_write_requested, bool is_execute_requested);
-#endif
 void kfd_signal_hw_exception_event(unsigned int pasid);
 int kfd_set_event(struct kfd_process *p, uint32_t event_id);
 int kfd_reset_event(struct kfd_process *p, uint32_t event_id);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 32c0f34..0684c49 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -36,16 +36,16 @@ struct mm_struct;
 #include "kfd_priv.h"
 #include "kfd_device_queue_manager.h"
 #include "kfd_dbgmgr.h"
+#include "kfd_iommu.h"
 
 /*
  * List of struct kfd_process (field kfd_process).
  * Unique/indexed by mm_struct*
  */
-#define KFD_PROCESS_TABLE_SIZE 5 /* bits: 32 entries */
-static DEFINE_HASHTABLE(kfd_processes_table, KFD_PROCESS_TABLE_SIZE);
+DEFINE_HASHTABLE(kfd_processes_table, KFD_PROCESS_TABLE_SIZE);
 static DEFINE_MUTEX(kfd_processes_mutex);
 
-DEFINE_STATIC_SRCU(kfd_processes_srcu);
+DEFINE_SRCU(kfd_processes_srcu);
 
 static struct workqueue_struct *kfd_process_wq;
 
@@ -333,17 +333,8 @@ static void kfd_process_wq_release(struct work_struct *work)
 {
 	struct kfd_process *p = container_of(work, struct kfd_process,
 					     release_work);
-#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
-	struct kfd_process_device *pdd;
-
-	pr_debug("Releasing process (pasid %d) in workqueue\n", p->pasid);
 
-	list_for_each_entry(pdd, &p->per_device_data, per_device_list) {
-		if (pdd->bound == PDD_BOUND &&
-		    pdd->dev->device_info->needs_iommu_device)
-			amd_iommu_unbind_pasid(pdd->dev->pdev, p->pasid);
-	}
-#endif
+	kfd_iommu_unbind_process(p);
 
 	kfd_process_free_outstanding_kfd_bos(p);
 
@@ -636,6 +627,7 @@ struct kfd_process_device *kfd_bind_process_to_device(struct kfd_dev *dev,
 							struct kfd_process *p)
 {
 	struct kfd_process_device *pdd;
+	int err;
 
 	pdd = kfd_get_process_device_data(dev, p);
 	if (!pdd) {
@@ -643,140 +635,13 @@ struct kfd_process_device *kfd_bind_process_to_device(struct kfd_dev *dev,
 		return ERR_PTR(-ENOMEM);
 	}
 
-	if (pdd->bound == PDD_BOUND) {
-		return pdd;
-	} else if (unlikely(pdd->bound == PDD_BOUND_SUSPENDED)) {
-		pr_err("Binding PDD_BOUND_SUSPENDED pdd is unexpected!\n");
-		return ERR_PTR(-EINVAL);
-	}
-
-#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
-	if (dev->device_info->needs_iommu_device) {
-		int err = amd_iommu_bind_pasid(dev->pdev, p->pasid,
-					       p->lead_thread);
-		if (err < 0)
-			return ERR_PTR(err);
-	}
-#endif
-
-	pdd->bound = PDD_BOUND;
+	err = kfd_iommu_bind_process_to_device(pdd);
+	if (err)
+		return ERR_PTR(err);
 
 	return pdd;
 }
 
-#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
-/*
- * Bind processes do the device that have been temporarily unbound
- * (PDD_BOUND_SUSPENDED) in kfd_unbind_processes_from_device.
- */
-int kfd_bind_processes_to_device(struct kfd_dev *dev)
-{
-	struct kfd_process_device *pdd;
-	struct kfd_process *p;
-	unsigned int temp;
-	int err = 0;
-
-	int idx = srcu_read_lock(&kfd_processes_srcu);
-
-	hash_for_each_rcu(kfd_processes_table, temp, p, kfd_processes) {
-		mutex_lock(&p->mutex);
-		pdd = kfd_get_process_device_data(dev, p);
-
-		if (WARN_ON(!pdd) || pdd->bound != PDD_BOUND_SUSPENDED) {
-			mutex_unlock(&p->mutex);
-			continue;
-		}
-
-		err = amd_iommu_bind_pasid(dev->pdev, p->pasid,
-				p->lead_thread);
-		if (err < 0) {
-			pr_err("Unexpected pasid %d binding failure\n",
-					p->pasid);
-			mutex_unlock(&p->mutex);
-			break;
-		}
-
-		pdd->bound = PDD_BOUND;
-		mutex_unlock(&p->mutex);
-	}
-
-	srcu_read_unlock(&kfd_processes_srcu, idx);
-
-	return err;
-}
-
-/*
- * Mark currently bound processes as PDD_BOUND_SUSPENDED. These
- * processes will be restored to PDD_BOUND state in
- * kfd_bind_processes_to_device.
- */
-void kfd_unbind_processes_from_device(struct kfd_dev *dev)
-{
-	struct kfd_process_device *pdd;
-	struct kfd_process *p;
-	unsigned int temp;
-
-	int idx = srcu_read_lock(&kfd_processes_srcu);
-
-	hash_for_each_rcu(kfd_processes_table, temp, p, kfd_processes) {
-		mutex_lock(&p->mutex);
-		pdd = kfd_get_process_device_data(dev, p);
-
-		if (WARN_ON(!pdd)) {
-			mutex_unlock(&p->mutex);
-			continue;
-		}
-
-		if (pdd->bound == PDD_BOUND)
-			pdd->bound = PDD_BOUND_SUSPENDED;
-		mutex_unlock(&p->mutex);
-	}
-
-	srcu_read_unlock(&kfd_processes_srcu, idx);
-}
-
-void kfd_process_iommu_unbind_callback(struct kfd_dev *dev, unsigned int pasid)
-{
-	struct kfd_process *p;
-	struct kfd_process_device *pdd;
-
-	/*
-	 * Look for the process that matches the pasid. If there is no such
-	 * process, we either released it in amdkfd's own notifier, or there
-	 * is a bug. Unfortunately, there is no way to tell...
-	 */
-	p = kfd_lookup_process_by_pasid(pasid);
-	if (!p)
-		return;
-
-	pr_debug("Unbinding process %d from IOMMU\n", pasid);
-
-	mutex_lock(kfd_get_dbgmgr_mutex());
-
-	if (dev->dbgmgr && dev->dbgmgr->pasid == p->pasid) {
-		if (!kfd_dbgmgr_unregister(dev->dbgmgr, p)) {
-			kfd_dbgmgr_destroy(dev->dbgmgr);
-			dev->dbgmgr = NULL;
-		}
-	}
-
-	mutex_unlock(kfd_get_dbgmgr_mutex());
-
-	mutex_lock(&p->mutex);
-
-	pdd = kfd_get_process_device_data(dev, p);
-	if (pdd)
-		/* For GPU relying on IOMMU, we need to dequeue here
-		 * when PASID is still bound.
-		 */
-		kfd_process_dequeue_from_device(pdd);
-
-	mutex_unlock(&p->mutex);
-
-	kfd_unref_process(p);
-}
-#endif /* CONFIG_AMD_IOMMU_V2 */
-
 struct kfd_process_device *kfd_get_first_process_device_data(
 						struct kfd_process *p)
 {
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
index af77e42..9de9ac9 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
@@ -35,6 +35,7 @@
 #include "kfd_crat.h"
 #include "kfd_topology.h"
 #include "kfd_device_queue_manager.h"
+#include "kfd_iommu.h"
 
 /* topology_device_list - Master list of all topology devices */
 static struct list_head topology_device_list;
@@ -877,21 +878,8 @@ static void find_system_memory(const struct dmi_header *dm,
  */
 static int kfd_add_perf_to_topology(struct kfd_topology_device *kdev)
 {
-#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
-	struct kfd_perf_properties *props;
-
-	if (amd_iommu_pc_supported()) {
-		props = kfd_alloc_struct(props);
-		if (!props)
-			return -ENOMEM;
-		strcpy(props->block_name, "iommu");
-		props->max_concurrent = amd_iommu_pc_get_max_banks(0) *
-			amd_iommu_pc_get_max_counters(0); /* assume one iommu */
-		list_add_tail(&props->list, &kdev->perf_props);
-	}
-#endif
-
-	return 0;
+	/* These are the only counters supported so far */
+	return kfd_iommu_add_perf_counters(kdev);
 }
 
 /* kfd_add_non_crat_information - Add information that is not currently
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
index be812bb..eb54cfc 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
@@ -25,7 +25,7 @@
 
 #include <linux/types.h>
 #include <linux/list.h>
-#include "kfd_priv.h"
+#include "kfd_crat.h"
 
 #define KFD_TOPOLOGY_PUBLIC_NAME_SIZE 128
 
@@ -184,10 +184,4 @@ struct kfd_topology_device *kfd_create_topology_device(
 		struct list_head *device_list);
 void kfd_release_topology_device_list(struct list_head *device_list);
 
-#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
-extern bool amd_iommu_pc_supported(void);
-extern u8 amd_iommu_pc_get_max_banks(u16 devid);
-extern u8 amd_iommu_pc_get_max_counters(u16 devid);
-#endif
-
 #endif /* __KFD_TOPOLOGY_H__ */
-- 
2.7.4


[-- Attachment #3: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH 3/9] drm/amdkfd: Make IOMMUv2 code conditional
       [not found]                 ` <5a1e7696-b5bf-547c-4fe6-e71e0ae7f5e0-5C7GfCeVMHo@public.gmane.org>
@ 2018-02-05 19:00                   ` Christian König
       [not found]                     ` <044a3842-92d1-fe2a-c432-0719e8528416-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 47+ messages in thread
From: Christian König @ 2018-02-05 19:00 UTC (permalink / raw)
  To: Felix Kuehling, Oded Gabbay; +Cc: amd-gfx list


[-- Attachment #1.1: Type: text/plain, Size: 19406 bytes --]

Looks good to me on first glance.

You probably don't mind that I'm going to pull a good part of that into 
amdgpu as next step?

Regards,
Christian.

Am 03.02.2018 um 03:29 schrieb Felix Kuehling:
> The attached patch is my attempt to keep most of the IOMMU code in one
> place (new kfd_iommu.c) to avoid #ifdefs all over the place. This way I
> can still conditionally compile a bunch of KFD code that is only needed
> for IOMMU handling, with stub functions for kernel configs without IOMMU
> support. About 300 lines of conditionally compiled code got moved to
> kfd_iommu.c.
>
> The only piece I didn't move into kfd_iommu.c is
> kfd_signal_iommu_event. I prefer to keep that in kfd_events.c because it
> doesn't call any IOMMU driver functions, and because it's closely
> related to the rest of the event handling logic. It could be compiled
> unconditionally, but it would be dead code without IOMMU support.
>
> And I moved pdd->bound to a place where it doesn't consume extra space
> (on 64-bit systems due to structure alignment) instead of making it
> conditional.
>
> This is only compile-tested for now.
>
> If you like this approach, I'll do more testing and squash it with "Make
> IOMMUv2 code conditional".
>
> Regards,
>    Felix
>
>
> On 2018-01-31 10:00 AM, Oded Gabbay wrote:
>> On Wed, Jan 31, 2018 at 4:56 PM, Oded Gabbay <oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>>> Hi Felix,
>>> Please don't spread 19 #ifdefs throughout the code.
>>> I suggest to put one #ifdef in linux/amd-iommu.h itself around all the
>>> functions declarations and in the #else section put macros with empty
>>> implementations. This is much more readable and maintainable.
>>>
>>> Oded
>> To emphasize my point, there is a call to amd_iommu_bind_pasid in
>> kfd_bind_processes_to_device() which isn't wrapped with the #ifdef so
>> the compliation breaks. Putting the #ifdefs around the calls is simply
>> not scalable.
>>
>> Oded
>>
>>> On Fri, Jan 5, 2018 at 12:17 AM, Felix Kuehling <Felix.Kuehling-5C7GfCeVMHo@public.gmane.org> wrote:
>>>> dGPUs work without IOMMUv2. Make IOMMUv2 initialization dependent on
>>>> ASIC information. Also allow building KFD without IOMMUv2 support.
>>>> This is still useful for dGPUs and prepares for enabling KFD on
>>>> architectures that don't support AMD IOMMUv2.
>>>>
>>>> Signed-off-by: Felix Kuehling <Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
>>>> ---
>>>>   drivers/gpu/drm/amd/amdkfd/Kconfig        |  2 +-
>>>>   drivers/gpu/drm/amd/amdkfd/kfd_crat.c     |  8 +++-
>>>>   drivers/gpu/drm/amd/amdkfd/kfd_device.c   | 62 +++++++++++++++++++++----------
>>>>   drivers/gpu/drm/amd/amdkfd/kfd_events.c   |  2 +
>>>>   drivers/gpu/drm/amd/amdkfd/kfd_priv.h     |  5 +++
>>>>   drivers/gpu/drm/amd/amdkfd/kfd_process.c  | 17 ++++++---
>>>>   drivers/gpu/drm/amd/amdkfd/kfd_topology.c |  2 +
>>>>   drivers/gpu/drm/amd/amdkfd/kfd_topology.h |  2 +
>>>>   8 files changed, 74 insertions(+), 26 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/Kconfig b/drivers/gpu/drm/amd/amdkfd/Kconfig
>>>> index bc5a294..5bbeb95 100644
>>>> --- a/drivers/gpu/drm/amd/amdkfd/Kconfig
>>>> +++ b/drivers/gpu/drm/amd/amdkfd/Kconfig
>>>> @@ -4,6 +4,6 @@
>>>>
>>>>   config HSA_AMD
>>>>          tristate "HSA kernel driver for AMD GPU devices"
>>>> -       depends on DRM_AMDGPU && AMD_IOMMU_V2 && X86_64
>>>> +       depends on DRM_AMDGPU && X86_64
>>>>          help
>>>>            Enable this if you want to use HSA features on AMD GPU devices.
>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
>>>> index 2bc2816..3478270 100644
>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
>>>> @@ -22,7 +22,9 @@
>>>>
>>>>   #include <linux/pci.h>
>>>>   #include <linux/acpi.h>
>>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>>   #include <linux/amd-iommu.h>
>>>> +#endif
>>>>   #include "kfd_crat.h"
>>>>   #include "kfd_priv.h"
>>>>   #include "kfd_topology.h"
>>>> @@ -1037,15 +1039,17 @@ static int kfd_create_vcrat_image_gpu(void *pcrat_image,
>>>>          struct crat_subtype_generic *sub_type_hdr;
>>>>          struct crat_subtype_computeunit *cu;
>>>>          struct kfd_cu_info cu_info;
>>>> -       struct amd_iommu_device_info iommu_info;
>>>>          int avail_size = *size;
>>>>          uint32_t total_num_of_cu;
>>>>          int num_of_cache_entries = 0;
>>>>          int cache_mem_filled = 0;
>>>>          int ret = 0;
>>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>> +       struct amd_iommu_device_info iommu_info;
>>>>          const u32 required_iommu_flags = AMD_IOMMU_DEVICE_FLAG_ATS_SUP |
>>>>                                           AMD_IOMMU_DEVICE_FLAG_PRI_SUP |
>>>>                                           AMD_IOMMU_DEVICE_FLAG_PASID_SUP;
>>>> +#endif
>>>>          struct kfd_local_mem_info local_mem_info;
>>>>
>>>>          if (!pcrat_image || avail_size < VCRAT_SIZE_FOR_GPU)
>>>> @@ -1106,12 +1110,14 @@ static int kfd_create_vcrat_image_gpu(void *pcrat_image,
>>>>          /* Check if this node supports IOMMU. During parsing this flag will
>>>>           * translate to HSA_CAP_ATS_PRESENT
>>>>           */
>>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>>          iommu_info.flags = 0;
>>>>          if (amd_iommu_device_info(kdev->pdev, &iommu_info) == 0) {
>>>>                  if ((iommu_info.flags & required_iommu_flags) ==
>>>>                                  required_iommu_flags)
>>>>                          cu->hsa_capability |= CRAT_CU_FLAGS_IOMMU_PRESENT;
>>>>          }
>>>> +#endif
>>>>
>>>>          crat_table->length += sub_type_hdr->length;
>>>>          crat_table->total_entries++;
>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>>> index fafe971..5205b34 100644
>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>>> @@ -20,7 +20,9 @@
>>>>    * OTHER DEALINGS IN THE SOFTWARE.
>>>>    */
>>>>
>>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>>   #include <linux/amd-iommu.h>
>>>> +#endif
>>>>   #include <linux/bsearch.h>
>>>>   #include <linux/pci.h>
>>>>   #include <linux/slab.h>
>>>> @@ -31,6 +33,7 @@
>>>>
>>>>   #define MQD_SIZE_ALIGNED 768
>>>>
>>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>>   static const struct kfd_device_info kaveri_device_info = {
>>>>          .asic_family = CHIP_KAVERI,
>>>>          .max_pasid_bits = 16,
>>>> @@ -41,6 +44,7 @@ static const struct kfd_device_info kaveri_device_info = {
>>>>          .num_of_watch_points = 4,
>>>>          .mqd_size_aligned = MQD_SIZE_ALIGNED,
>>>>          .supports_cwsr = false,
>>>> +       .needs_iommu_device = true,
>>>>          .needs_pci_atomics = false,
>>>>   };
>>>>
>>>> @@ -54,8 +58,10 @@ static const struct kfd_device_info carrizo_device_info = {
>>>>          .num_of_watch_points = 4,
>>>>          .mqd_size_aligned = MQD_SIZE_ALIGNED,
>>>>          .supports_cwsr = true,
>>>> +       .needs_iommu_device = true,
>>>>          .needs_pci_atomics = false,
>>>>   };
>>>> +#endif
>>>>
>>>>   struct kfd_deviceid {
>>>>          unsigned short did;
>>>> @@ -64,6 +70,7 @@ struct kfd_deviceid {
>>>>
>>>>   /* Please keep this sorted by increasing device id. */
>>>>   static const struct kfd_deviceid supported_devices[] = {
>>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>>          { 0x1304, &kaveri_device_info },        /* Kaveri */
>>>>          { 0x1305, &kaveri_device_info },        /* Kaveri */
>>>>          { 0x1306, &kaveri_device_info },        /* Kaveri */
>>>> @@ -91,6 +98,7 @@ static const struct kfd_deviceid supported_devices[] = {
>>>>          { 0x9875, &carrizo_device_info },       /* Carrizo */
>>>>          { 0x9876, &carrizo_device_info },       /* Carrizo */
>>>>          { 0x9877, &carrizo_device_info }        /* Carrizo */
>>>> +#endif
>>>>   };
>>>>
>>>>   static int kfd_gtt_sa_init(struct kfd_dev *kfd, unsigned int buf_size,
>>>> @@ -161,6 +169,7 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
>>>>          return kfd;
>>>>   }
>>>>
>>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>>   static bool device_iommu_pasid_init(struct kfd_dev *kfd)
>>>>   {
>>>>          const u32 required_iommu_flags = AMD_IOMMU_DEVICE_FLAG_ATS_SUP |
>>>> @@ -231,6 +240,7 @@ static int iommu_invalid_ppr_cb(struct pci_dev *pdev, int pasid,
>>>>
>>>>          return AMD_IOMMU_INV_PRI_RSP_INVALID;
>>>>   }
>>>> +#endif /* CONFIG_AMD_IOMMU_V2 */
>>>>
>>>>   static void kfd_cwsr_init(struct kfd_dev *kfd)
>>>>   {
>>>> @@ -321,12 +331,14 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>>>>                  goto device_queue_manager_error;
>>>>          }
>>>>
>>>> -       if (!device_iommu_pasid_init(kfd)) {
>>>> -               dev_err(kfd_device,
>>>> -                       "Error initializing iommuv2 for device %x:%x\n",
>>>> -                       kfd->pdev->vendor, kfd->pdev->device);
>>>> -               goto device_iommu_pasid_error;
>>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>> +       if (kfd->device_info->needs_iommu_device) {
>>>> +               if (!device_iommu_pasid_init(kfd)) {
>>>> +                       dev_err(kfd_device, "Error initializing iommuv2\n");
>>>> +                       goto device_iommu_pasid_error;
>>>> +               }
>>>>          }
>>>> +#endif
>>>>
>>>>          kfd_cwsr_init(kfd);
>>>>
>>>> @@ -386,11 +398,16 @@ void kgd2kfd_suspend(struct kfd_dev *kfd)
>>>>
>>>>          kfd->dqm->ops.stop(kfd->dqm);
>>>>
>>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>> +       if (!kfd->device_info->needs_iommu_device)
>>>> +               return;
>>>> +
>>>>          kfd_unbind_processes_from_device(kfd);
>>>>
>>>>          amd_iommu_set_invalidate_ctx_cb(kfd->pdev, NULL);
>>>>          amd_iommu_set_invalid_ppr_cb(kfd->pdev, NULL);
>>>>          amd_iommu_free_device(kfd->pdev);
>>>> +#endif
>>>>   }
>>>>
>>>>   int kgd2kfd_resume(struct kfd_dev *kfd)
>>>> @@ -405,19 +422,24 @@ int kgd2kfd_resume(struct kfd_dev *kfd)
>>>>   static int kfd_resume(struct kfd_dev *kfd)
>>>>   {
>>>>          int err = 0;
>>>> -       unsigned int pasid_limit = kfd_get_pasid_limit();
>>>>
>>>> -       err = amd_iommu_init_device(kfd->pdev, pasid_limit);
>>>> -       if (err)
>>>> -               return -ENXIO;
>>>> -       amd_iommu_set_invalidate_ctx_cb(kfd->pdev,
>>>> -                                       iommu_pasid_shutdown_callback);
>>>> -       amd_iommu_set_invalid_ppr_cb(kfd->pdev,
>>>> -                                    iommu_invalid_ppr_cb);
>>>> -
>>>> -       err = kfd_bind_processes_to_device(kfd);
>>>> -       if (err)
>>>> -               goto processes_bind_error;
>>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>> +       if (kfd->device_info->needs_iommu_device) {
>>>> +               unsigned int pasid_limit = kfd_get_pasid_limit();
>>>> +
>>>> +               err = amd_iommu_init_device(kfd->pdev, pasid_limit);
>>>> +               if (err)
>>>> +                       return -ENXIO;
>>>> +               amd_iommu_set_invalidate_ctx_cb(kfd->pdev,
>>>> +                                               iommu_pasid_shutdown_callback);
>>>> +               amd_iommu_set_invalid_ppr_cb(kfd->pdev,
>>>> +                                            iommu_invalid_ppr_cb);
>>>> +
>>>> +               err = kfd_bind_processes_to_device(kfd);
>>>> +               if (err)
>>>> +                       goto processes_bind_error;
>>>> +       }
>>>> +#endif
>>>>
>>>>          err = kfd->dqm->ops.start(kfd->dqm);
>>>>          if (err) {
>>>> @@ -431,8 +453,10 @@ static int kfd_resume(struct kfd_dev *kfd)
>>>>
>>>>   dqm_start_error:
>>>>   processes_bind_error:
>>>> -       amd_iommu_free_device(kfd->pdev);
>>>> -
>>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>> +       if (kfd->device_info->needs_iommu_device)
>>>> +               amd_iommu_free_device(kfd->pdev);
>>>> +#endif
>>>>          return err;
>>>>   }
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
>>>> index 93aae5c..f770dc7 100644
>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
>>>> @@ -837,6 +837,7 @@ static void lookup_events_by_type_and_signal(struct kfd_process *p,
>>>>          }
>>>>   }
>>>>
>>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>>   void kfd_signal_iommu_event(struct kfd_dev *dev, unsigned int pasid,
>>>>                  unsigned long address, bool is_write_requested,
>>>>                  bool is_execute_requested)
>>>> @@ -905,6 +906,7 @@ void kfd_signal_iommu_event(struct kfd_dev *dev, unsigned int pasid,
>>>>          mutex_unlock(&p->event_mutex);
>>>>          kfd_unref_process(p);
>>>>   }
>>>> +#endif /* CONFIG_AMD_IOMMU_V2_MODULE */
>>>>
>>>>   void kfd_signal_hw_exception_event(unsigned int pasid)
>>>>   {
>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>>> index eebfb1e..9f4766c 100644
>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>>> @@ -158,6 +158,7 @@ struct kfd_device_info {
>>>>          uint8_t num_of_watch_points;
>>>>          uint16_t mqd_size_aligned;
>>>>          bool supports_cwsr;
>>>> +       bool needs_iommu_device;
>>>>          bool needs_pci_atomics;
>>>>   };
>>>>
>>>> @@ -617,9 +618,11 @@ void kfd_unref_process(struct kfd_process *p);
>>>>
>>>>   struct kfd_process_device *kfd_bind_process_to_device(struct kfd_dev *dev,
>>>>                                                  struct kfd_process *p);
>>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>>   int kfd_bind_processes_to_device(struct kfd_dev *dev);
>>>>   void kfd_unbind_processes_from_device(struct kfd_dev *dev);
>>>>   void kfd_process_iommu_unbind_callback(struct kfd_dev *dev, unsigned int pasid);
>>>> +#endif
>>>>   struct kfd_process_device *kfd_get_process_device_data(struct kfd_dev *dev,
>>>>                                                          struct kfd_process *p);
>>>>   struct kfd_process_device *kfd_create_process_device_data(struct kfd_dev *dev,
>>>> @@ -784,9 +787,11 @@ int kfd_wait_on_events(struct kfd_process *p,
>>>>                         uint32_t *wait_result);
>>>>   void kfd_signal_event_interrupt(unsigned int pasid, uint32_t partial_id,
>>>>                                  uint32_t valid_id_bits);
>>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>>   void kfd_signal_iommu_event(struct kfd_dev *dev,
>>>>                  unsigned int pasid, unsigned long address,
>>>>                  bool is_write_requested, bool is_execute_requested);
>>>> +#endif
>>>>   void kfd_signal_hw_exception_event(unsigned int pasid);
>>>>   int kfd_set_event(struct kfd_process *p, uint32_t event_id);
>>>>   int kfd_reset_event(struct kfd_process *p, uint32_t event_id);
>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>>> index a22fb071..1d0e02c 100644
>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>>> @@ -173,14 +173,17 @@ static void kfd_process_wq_release(struct work_struct *work)
>>>>   {
>>>>          struct kfd_process *p = container_of(work, struct kfd_process,
>>>>                                               release_work);
>>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>>          struct kfd_process_device *pdd;
>>>>
>>>>          pr_debug("Releasing process (pasid %d) in workqueue\n", p->pasid);
>>>>
>>>>          list_for_each_entry(pdd, &p->per_device_data, per_device_list) {
>>>> -               if (pdd->bound == PDD_BOUND)
>>>> +               if (pdd->bound == PDD_BOUND &&
>>>> +                   pdd->dev->device_info->needs_iommu_device)
>>>>                          amd_iommu_unbind_pasid(pdd->dev->pdev, p->pasid);
>>>>          }
>>>> +#endif
>>>>
>>>>          kfd_process_destroy_pdds(p);
>>>>
>>>> @@ -421,7 +424,6 @@ struct kfd_process_device *kfd_bind_process_to_device(struct kfd_dev *dev,
>>>>                                                          struct kfd_process *p)
>>>>   {
>>>>          struct kfd_process_device *pdd;
>>>> -       int err;
>>>>
>>>>          pdd = kfd_get_process_device_data(dev, p);
>>>>          if (!pdd) {
>>>> @@ -436,9 +438,14 @@ struct kfd_process_device *kfd_bind_process_to_device(struct kfd_dev *dev,
>>>>                  return ERR_PTR(-EINVAL);
>>>>          }
>>>>
>>>> -       err = amd_iommu_bind_pasid(dev->pdev, p->pasid, p->lead_thread);
>>>> -       if (err < 0)
>>>> -               return ERR_PTR(err);
>>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>> +       if (dev->device_info->needs_iommu_device) {
>>>> +               int err = amd_iommu_bind_pasid(dev->pdev, p->pasid,
>>>> +                                              p->lead_thread);
>>>> +               if (err < 0)
>>>> +                       return ERR_PTR(err);
>>>> +       }
>>>> +#endif
>>>>
>>>>          pdd->bound = PDD_BOUND;
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>>>> index c6a7609..f57c305 100644
>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>>>> @@ -875,6 +875,7 @@ static void find_system_memory(const struct dmi_header *dm,
>>>>    */
>>>>   static int kfd_add_perf_to_topology(struct kfd_topology_device *kdev)
>>>>   {
>>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>>          struct kfd_perf_properties *props;
>>>>
>>>>          if (amd_iommu_pc_supported()) {
>>>> @@ -886,6 +887,7 @@ static int kfd_add_perf_to_topology(struct kfd_topology_device *kdev)
>>>>                          amd_iommu_pc_get_max_counters(0); /* assume one iommu */
>>>>                  list_add_tail(&props->list, &kdev->perf_props);
>>>>          }
>>>> +#endif
>>>>
>>>>          return 0;
>>>>   }
>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
>>>> index 53fca1f..111fda2 100644
>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
>>>> @@ -183,8 +183,10 @@ struct kfd_topology_device *kfd_create_topology_device(
>>>>                  struct list_head *device_list);
>>>>   void kfd_release_topology_device_list(struct list_head *device_list);
>>>>
>>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>>   extern bool amd_iommu_pc_supported(void);
>>>>   extern u8 amd_iommu_pc_get_max_banks(u16 devid);
>>>>   extern u8 amd_iommu_pc_get_max_counters(u16 devid);
>>>> +#endif
>>>>
>>>>   #endif /* __KFD_TOPOLOGY_H__ */
>>>> --
>>>> 2.7.4
>>>>
>
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[-- Attachment #1.2: Type: text/html, Size: 19040 bytes --]

[-- Attachment #2: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 7/9] drm/amdkfd: Add dGPU support to kernel_queue_init
       [not found]                     ` <31443990-b612-e9cc-ec07-054b940c8c25-5C7GfCeVMHo@public.gmane.org>
@ 2018-02-06  8:39                       ` Oded Gabbay
  0 siblings, 0 replies; 47+ messages in thread
From: Oded Gabbay @ 2018-02-06  8:39 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: Deucher, Alexander, amd-gfx list

On Wed, Jan 31, 2018 at 6:27 PM, Felix Kuehling <felix.kuehling@amd.com> wrote:
> On 2018-01-31 10:29 AM, Oded Gabbay wrote:
>> On Wed, Jan 31, 2018 at 5:23 PM, Deucher, Alexander
>> <Alexander.Deucher@amd.com> wrote:
>>> ________________________________
>>> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> on behalf of Oded
>>> Gabbay <oded.gabbay@gmail.com>
>>> Sent: Wednesday, January 31, 2018 10:17 AM
>>> To: Kuehling, Felix
>>> Cc: amd-gfx list
>>> Subject: Re: [PATCH 7/9] drm/amdkfd: Add dGPU support to kernel_queue_init
>>>
>>> On Fri, Jan 5, 2018 at 12:17 AM, Felix Kuehling <Felix.Kuehling@amd.com>
>>> wrote:
>>>> Recognize dGPU ASIC families.
>>>>
>>>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>>>> ---
>>>>  drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c | 5 +++++
>>>>  1 file changed, 5 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
>>>> b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
>>>> index 5dc6567..69f4964 100644
>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
>>>> @@ -297,10 +297,15 @@ struct kernel_queue *kernel_queue_init(struct
>>>> kfd_dev *dev,
>>>>
>>>>         switch (dev->device_info->asic_family) {
>>>>         case CHIP_CARRIZO:
>>>> +       case CHIP_TONGA:
>>>> +       case CHIP_FIJI:
>>>> +       case CHIP_POLARIS10:
>>>> +       case CHIP_POLARIS11:
>>> I believe POLARIS is from arcatic islands, no ?
>>> Maybe rename kernel_queue_init_vi to kernel_queue_init_vi_ai ?
>>> or create a new function kernel_queue_init_ai() and assign same
>>> functions as vi ?
>>> Either way, I think you need to address that.
>>>
>>> They are all gfx8.  adding ai just confuses things.
>
> Internally we use VI and GFX8 interchangably. I think what's confusing
> is, that internal code names are used for marketing purposes and applied
> to the wrong chip generation.
>
> Another precedent for that is Hawaii. It was called the first "volcanic
> island" GPU when it was launched at an event on Hawaii (a volcanic
> island), but as far as the driver is concerned, it belongs to the CIK
> generation.
>
> It's really hard to keep consistent naming, when naming conventions get
> misappropriated to mean different things over time.
>
>>>
>>> Alex
>> In that case, I think it is better maybe to change it to
>> kernel_queue_init_gfx_7 and kernel_queue_init_gfx_8, to be consistent
>> with the calls to amdgpu_amdkfd_gfx_7_0_get_functions and
>> amdgpu_amdkfd_gfx_8_0_get_functions.
>>
>> Leaving as cik and vi as the identifier when it clearly isn't seems
>> confusing to me as well.
>
> For Vega10 we use the suffix _v9 instead of _cik or _vi. For consistency
> and brevity I could rename _cik->v7 and _vi->v8. However, that would be
> a lot of churn and, in my eyes, a waste of time.

I agree its a lot of churn but I don't think its a total waste of
time, even if those devices are not really important anymore.
I'll try to find some time to do it.
Oded

>
> Regards,
>   Felix
>
>>
>> Oded
>>
>>>
>>>>                 kernel_queue_init_vi(&kq->ops_asic_specific);
>>>>                 break;
>>>>
>>>>         case CHIP_KAVERI:
>>>> +       case CHIP_HAWAII:
>>>>                 kernel_queue_init_cik(&kq->ops_asic_specific);
>>>>                 break;
>>>>         default:
>>>> --
>>>> 2.7.4
>>>>
>>> Other then that, This patch is:
>>> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 3/9] drm/amdkfd: Make IOMMUv2 code conditional
       [not found]                     ` <044a3842-92d1-fe2a-c432-0719e8528416-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2018-02-06  8:53                       ` Oded Gabbay
       [not found]                         ` <CAFCwf12s6sjyxyTNWx+cdqCjQg+O-4WonDLmJ2X9QT0iRLBNsQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 47+ messages in thread
From: Oded Gabbay @ 2018-02-06  8:53 UTC (permalink / raw)
  To: Christian König; +Cc: Felix Kuehling, amd-gfx list

On Mon, Feb 5, 2018 at 9:00 PM, Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
> Looks good to me on first glance.
>
> You probably don't mind that I'm going to pull a good part of that into
> amdgpu as next step?
>

That indeed looks better then the first approach.
Felix, I've applied all other patches from the dGPU topology patchset.
Could you send this new patch after you tested it ?
Thanks.


Christian, I'm going to pull this patch (after its tested and sent
formally) to amdkfd next for 4.17, so if you will pull it to amdgpu we
will have a collision.

Oded


> Regards,
> Christian.
>
>
> Am 03.02.2018 um 03:29 schrieb Felix Kuehling:
>
> The attached patch is my attempt to keep most of the IOMMU code in one
> place (new kfd_iommu.c) to avoid #ifdefs all over the place. This way I
> can still conditionally compile a bunch of KFD code that is only needed
> for IOMMU handling, with stub functions for kernel configs without IOMMU
> support. About 300 lines of conditionally compiled code got moved to
> kfd_iommu.c.
>
> The only piece I didn't move into kfd_iommu.c is
> kfd_signal_iommu_event. I prefer to keep that in kfd_events.c because it
> doesn't call any IOMMU driver functions, and because it's closely
> related to the rest of the event handling logic. It could be compiled
> unconditionally, but it would be dead code without IOMMU support.
>
> And I moved pdd->bound to a place where it doesn't consume extra space
> (on 64-bit systems due to structure alignment) instead of making it
> conditional.
>
> This is only compile-tested for now.
>
> If you like this approach, I'll do more testing and squash it with "Make
> IOMMUv2 code conditional".
>
> Regards,
>   Felix
>
>
> On 2018-01-31 10:00 AM, Oded Gabbay wrote:
>
> On Wed, Jan 31, 2018 at 4:56 PM, Oded Gabbay <oded.gabbay@gmail.com> wrote:
>
> Hi Felix,
> Please don't spread 19 #ifdefs throughout the code.
> I suggest to put one #ifdef in linux/amd-iommu.h itself around all the
> functions declarations and in the #else section put macros with empty
> implementations. This is much more readable and maintainable.
>
> Oded
>
> To emphasize my point, there is a call to amd_iommu_bind_pasid in
> kfd_bind_processes_to_device() which isn't wrapped with the #ifdef so
> the compliation breaks. Putting the #ifdefs around the calls is simply
> not scalable.
>
> Oded
>
> On Fri, Jan 5, 2018 at 12:17 AM, Felix Kuehling <Felix.Kuehling@amd.com>
> wrote:
>
> dGPUs work without IOMMUv2. Make IOMMUv2 initialization dependent on
> ASIC information. Also allow building KFD without IOMMUv2 support.
> This is still useful for dGPUs and prepares for enabling KFD on
> architectures that don't support AMD IOMMUv2.
>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/Kconfig        |  2 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_crat.c     |  8 +++-
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c   | 62
> +++++++++++++++++++++----------
>  drivers/gpu/drm/amd/amdkfd/kfd_events.c   |  2 +
>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h     |  5 +++
>  drivers/gpu/drm/amd/amdkfd/kfd_process.c  | 17 ++++++---
>  drivers/gpu/drm/amd/amdkfd/kfd_topology.c |  2 +
>  drivers/gpu/drm/amd/amdkfd/kfd_topology.h |  2 +
>  8 files changed, 74 insertions(+), 26 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/Kconfig
> b/drivers/gpu/drm/amd/amdkfd/Kconfig
> index bc5a294..5bbeb95 100644
> --- a/drivers/gpu/drm/amd/amdkfd/Kconfig
> +++ b/drivers/gpu/drm/amd/amdkfd/Kconfig
> @@ -4,6 +4,6 @@
>
>  config HSA_AMD
>         tristate "HSA kernel driver for AMD GPU devices"
> -       depends on DRM_AMDGPU && AMD_IOMMU_V2 && X86_64
> +       depends on DRM_AMDGPU && X86_64
>         help
>           Enable this if you want to use HSA features on AMD GPU devices.
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
> index 2bc2816..3478270 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
> @@ -22,7 +22,9 @@
>
>  #include <linux/pci.h>
>  #include <linux/acpi.h>
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>  #include <linux/amd-iommu.h>
> +#endif
>  #include "kfd_crat.h"
>  #include "kfd_priv.h"
>  #include "kfd_topology.h"
> @@ -1037,15 +1039,17 @@ static int kfd_create_vcrat_image_gpu(void
> *pcrat_image,
>         struct crat_subtype_generic *sub_type_hdr;
>         struct crat_subtype_computeunit *cu;
>         struct kfd_cu_info cu_info;
> -       struct amd_iommu_device_info iommu_info;
>         int avail_size = *size;
>         uint32_t total_num_of_cu;
>         int num_of_cache_entries = 0;
>         int cache_mem_filled = 0;
>         int ret = 0;
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
> +       struct amd_iommu_device_info iommu_info;
>         const u32 required_iommu_flags = AMD_IOMMU_DEVICE_FLAG_ATS_SUP |
>                                          AMD_IOMMU_DEVICE_FLAG_PRI_SUP |
>                                          AMD_IOMMU_DEVICE_FLAG_PASID_SUP;
> +#endif
>         struct kfd_local_mem_info local_mem_info;
>
>         if (!pcrat_image || avail_size < VCRAT_SIZE_FOR_GPU)
> @@ -1106,12 +1110,14 @@ static int kfd_create_vcrat_image_gpu(void
> *pcrat_image,
>         /* Check if this node supports IOMMU. During parsing this flag will
>          * translate to HSA_CAP_ATS_PRESENT
>          */
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>         iommu_info.flags = 0;
>         if (amd_iommu_device_info(kdev->pdev, &iommu_info) == 0) {
>                 if ((iommu_info.flags & required_iommu_flags) ==
>                                 required_iommu_flags)
>                         cu->hsa_capability |= CRAT_CU_FLAGS_IOMMU_PRESENT;
>         }
> +#endif
>
>         crat_table->length += sub_type_hdr->length;
>         crat_table->total_entries++;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index fafe971..5205b34 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -20,7 +20,9 @@
>   * OTHER DEALINGS IN THE SOFTWARE.
>   */
>
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>  #include <linux/amd-iommu.h>
> +#endif
>  #include <linux/bsearch.h>
>  #include <linux/pci.h>
>  #include <linux/slab.h>
> @@ -31,6 +33,7 @@
>
>  #define MQD_SIZE_ALIGNED 768
>
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>  static const struct kfd_device_info kaveri_device_info = {
>         .asic_family = CHIP_KAVERI,
>         .max_pasid_bits = 16,
> @@ -41,6 +44,7 @@ static const struct kfd_device_info kaveri_device_info = {
>         .num_of_watch_points = 4,
>         .mqd_size_aligned = MQD_SIZE_ALIGNED,
>         .supports_cwsr = false,
> +       .needs_iommu_device = true,
>         .needs_pci_atomics = false,
>  };
>
> @@ -54,8 +58,10 @@ static const struct kfd_device_info carrizo_device_info =
> {
>         .num_of_watch_points = 4,
>         .mqd_size_aligned = MQD_SIZE_ALIGNED,
>         .supports_cwsr = true,
> +       .needs_iommu_device = true,
>         .needs_pci_atomics = false,
>  };
> +#endif
>
>  struct kfd_deviceid {
>         unsigned short did;
> @@ -64,6 +70,7 @@ struct kfd_deviceid {
>
>  /* Please keep this sorted by increasing device id. */
>  static const struct kfd_deviceid supported_devices[] = {
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>         { 0x1304, &kaveri_device_info },        /* Kaveri */
>         { 0x1305, &kaveri_device_info },        /* Kaveri */
>         { 0x1306, &kaveri_device_info },        /* Kaveri */
> @@ -91,6 +98,7 @@ static const struct kfd_deviceid supported_devices[] = {
>         { 0x9875, &carrizo_device_info },       /* Carrizo */
>         { 0x9876, &carrizo_device_info },       /* Carrizo */
>         { 0x9877, &carrizo_device_info }        /* Carrizo */
> +#endif
>  };
>
>  static int kfd_gtt_sa_init(struct kfd_dev *kfd, unsigned int buf_size,
> @@ -161,6 +169,7 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
>         return kfd;
>  }
>
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>  static bool device_iommu_pasid_init(struct kfd_dev *kfd)
>  {
>         const u32 required_iommu_flags = AMD_IOMMU_DEVICE_FLAG_ATS_SUP |
> @@ -231,6 +240,7 @@ static int iommu_invalid_ppr_cb(struct pci_dev *pdev,
> int pasid,
>
>         return AMD_IOMMU_INV_PRI_RSP_INVALID;
>  }
> +#endif /* CONFIG_AMD_IOMMU_V2 */
>
>  static void kfd_cwsr_init(struct kfd_dev *kfd)
>  {
> @@ -321,12 +331,14 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>                 goto device_queue_manager_error;
>         }
>
> -       if (!device_iommu_pasid_init(kfd)) {
> -               dev_err(kfd_device,
> -                       "Error initializing iommuv2 for device %x:%x\n",
> -                       kfd->pdev->vendor, kfd->pdev->device);
> -               goto device_iommu_pasid_error;
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
> +       if (kfd->device_info->needs_iommu_device) {
> +               if (!device_iommu_pasid_init(kfd)) {
> +                       dev_err(kfd_device, "Error initializing iommuv2\n");
> +                       goto device_iommu_pasid_error;
> +               }
>         }
> +#endif
>
>         kfd_cwsr_init(kfd);
>
> @@ -386,11 +398,16 @@ void kgd2kfd_suspend(struct kfd_dev *kfd)
>
>         kfd->dqm->ops.stop(kfd->dqm);
>
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
> +       if (!kfd->device_info->needs_iommu_device)
> +               return;
> +
>         kfd_unbind_processes_from_device(kfd);
>
>         amd_iommu_set_invalidate_ctx_cb(kfd->pdev, NULL);
>         amd_iommu_set_invalid_ppr_cb(kfd->pdev, NULL);
>         amd_iommu_free_device(kfd->pdev);
> +#endif
>  }
>
>  int kgd2kfd_resume(struct kfd_dev *kfd)
> @@ -405,19 +422,24 @@ int kgd2kfd_resume(struct kfd_dev *kfd)
>  static int kfd_resume(struct kfd_dev *kfd)
>  {
>         int err = 0;
> -       unsigned int pasid_limit = kfd_get_pasid_limit();
>
> -       err = amd_iommu_init_device(kfd->pdev, pasid_limit);
> -       if (err)
> -               return -ENXIO;
> -       amd_iommu_set_invalidate_ctx_cb(kfd->pdev,
> -                                       iommu_pasid_shutdown_callback);
> -       amd_iommu_set_invalid_ppr_cb(kfd->pdev,
> -                                    iommu_invalid_ppr_cb);
> -
> -       err = kfd_bind_processes_to_device(kfd);
> -       if (err)
> -               goto processes_bind_error;
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
> +       if (kfd->device_info->needs_iommu_device) {
> +               unsigned int pasid_limit = kfd_get_pasid_limit();
> +
> +               err = amd_iommu_init_device(kfd->pdev, pasid_limit);
> +               if (err)
> +                       return -ENXIO;
> +               amd_iommu_set_invalidate_ctx_cb(kfd->pdev,
> +
> iommu_pasid_shutdown_callback);
> +               amd_iommu_set_invalid_ppr_cb(kfd->pdev,
> +                                            iommu_invalid_ppr_cb);
> +
> +               err = kfd_bind_processes_to_device(kfd);
> +               if (err)
> +                       goto processes_bind_error;
> +       }
> +#endif
>
>         err = kfd->dqm->ops.start(kfd->dqm);
>         if (err) {
> @@ -431,8 +453,10 @@ static int kfd_resume(struct kfd_dev *kfd)
>
>  dqm_start_error:
>  processes_bind_error:
> -       amd_iommu_free_device(kfd->pdev);
> -
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
> +       if (kfd->device_info->needs_iommu_device)
> +               amd_iommu_free_device(kfd->pdev);
> +#endif
>         return err;
>  }
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
> index 93aae5c..f770dc7 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
> @@ -837,6 +837,7 @@ static void lookup_events_by_type_and_signal(struct
> kfd_process *p,
>         }
>  }
>
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>  void kfd_signal_iommu_event(struct kfd_dev *dev, unsigned int pasid,
>                 unsigned long address, bool is_write_requested,
>                 bool is_execute_requested)
> @@ -905,6 +906,7 @@ void kfd_signal_iommu_event(struct kfd_dev *dev,
> unsigned int pasid,
>         mutex_unlock(&p->event_mutex);
>         kfd_unref_process(p);
>  }
> +#endif /* CONFIG_AMD_IOMMU_V2_MODULE */
>
>  void kfd_signal_hw_exception_event(unsigned int pasid)
>  {
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> index eebfb1e..9f4766c 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> @@ -158,6 +158,7 @@ struct kfd_device_info {
>         uint8_t num_of_watch_points;
>         uint16_t mqd_size_aligned;
>         bool supports_cwsr;
> +       bool needs_iommu_device;
>         bool needs_pci_atomics;
>  };
>
> @@ -617,9 +618,11 @@ void kfd_unref_process(struct kfd_process *p);
>
>  struct kfd_process_device *kfd_bind_process_to_device(struct kfd_dev *dev,
>                                                 struct kfd_process *p);
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>  int kfd_bind_processes_to_device(struct kfd_dev *dev);
>  void kfd_unbind_processes_from_device(struct kfd_dev *dev);
>  void kfd_process_iommu_unbind_callback(struct kfd_dev *dev, unsigned int
> pasid);
> +#endif
>  struct kfd_process_device *kfd_get_process_device_data(struct kfd_dev *dev,
>                                                         struct kfd_process
> *p);
>  struct kfd_process_device *kfd_create_process_device_data(struct kfd_dev
> *dev,
> @@ -784,9 +787,11 @@ int kfd_wait_on_events(struct kfd_process *p,
>                        uint32_t *wait_result);
>  void kfd_signal_event_interrupt(unsigned int pasid, uint32_t partial_id,
>                                 uint32_t valid_id_bits);
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>  void kfd_signal_iommu_event(struct kfd_dev *dev,
>                 unsigned int pasid, unsigned long address,
>                 bool is_write_requested, bool is_execute_requested);
> +#endif
>  void kfd_signal_hw_exception_event(unsigned int pasid);
>  int kfd_set_event(struct kfd_process *p, uint32_t event_id);
>  int kfd_reset_event(struct kfd_process *p, uint32_t event_id);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> index a22fb071..1d0e02c 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> @@ -173,14 +173,17 @@ static void kfd_process_wq_release(struct work_struct
> *work)
>  {
>         struct kfd_process *p = container_of(work, struct kfd_process,
>                                              release_work);
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>         struct kfd_process_device *pdd;
>
>         pr_debug("Releasing process (pasid %d) in workqueue\n", p->pasid);
>
>         list_for_each_entry(pdd, &p->per_device_data, per_device_list) {
> -               if (pdd->bound == PDD_BOUND)
> +               if (pdd->bound == PDD_BOUND &&
> +                   pdd->dev->device_info->needs_iommu_device)
>                         amd_iommu_unbind_pasid(pdd->dev->pdev, p->pasid);
>         }
> +#endif
>
>         kfd_process_destroy_pdds(p);
>
> @@ -421,7 +424,6 @@ struct kfd_process_device
> *kfd_bind_process_to_device(struct kfd_dev *dev,
>                                                         struct kfd_process
> *p)
>  {
>         struct kfd_process_device *pdd;
> -       int err;
>
>         pdd = kfd_get_process_device_data(dev, p);
>         if (!pdd) {
> @@ -436,9 +438,14 @@ struct kfd_process_device
> *kfd_bind_process_to_device(struct kfd_dev *dev,
>                 return ERR_PTR(-EINVAL);
>         }
>
> -       err = amd_iommu_bind_pasid(dev->pdev, p->pasid, p->lead_thread);
> -       if (err < 0)
> -               return ERR_PTR(err);
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
> +       if (dev->device_info->needs_iommu_device) {
> +               int err = amd_iommu_bind_pasid(dev->pdev, p->pasid,
> +                                              p->lead_thread);
> +               if (err < 0)
> +                       return ERR_PTR(err);
> +       }
> +#endif
>
>         pdd->bound = PDD_BOUND;
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> index c6a7609..f57c305 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> @@ -875,6 +875,7 @@ static void find_system_memory(const struct dmi_header
> *dm,
>   */
>  static int kfd_add_perf_to_topology(struct kfd_topology_device *kdev)
>  {
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>         struct kfd_perf_properties *props;
>
>         if (amd_iommu_pc_supported()) {
> @@ -886,6 +887,7 @@ static int kfd_add_perf_to_topology(struct
> kfd_topology_device *kdev)
>                         amd_iommu_pc_get_max_counters(0); /* assume one
> iommu */
>                 list_add_tail(&props->list, &kdev->perf_props);
>         }
> +#endif
>
>         return 0;
>  }
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
> b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
> index 53fca1f..111fda2 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
> @@ -183,8 +183,10 @@ struct kfd_topology_device *kfd_create_topology_device(
>                 struct list_head *device_list);
>  void kfd_release_topology_device_list(struct list_head *device_list);
>
> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>  extern bool amd_iommu_pc_supported(void);
>  extern u8 amd_iommu_pc_get_max_banks(u16 devid);
>  extern u8 amd_iommu_pc_get_max_counters(u16 devid);
> +#endif
>
>  #endif /* __KFD_TOPOLOGY_H__ */
> --
> 2.7.4
>
>
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 3/9] drm/amdkfd: Make IOMMUv2 code conditional
       [not found]                         ` <CAFCwf12s6sjyxyTNWx+cdqCjQg+O-4WonDLmJ2X9QT0iRLBNsQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2018-02-07  0:30                           ` Felix Kuehling
       [not found]                             ` <fe5b1bd4-37fc-2ed0-6a68-247abe08406e-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 47+ messages in thread
From: Felix Kuehling @ 2018-02-07  0:30 UTC (permalink / raw)
  To: Oded Gabbay, Christian König; +Cc: amd-gfx list

On 2018-02-06 03:53 AM, Oded Gabbay wrote:
> On Mon, Feb 5, 2018 at 9:00 PM, Christian König
> <ckoenig.leichtzumerken@gmail.com> wrote:
>> Looks good to me on first glance.
>>
>> You probably don't mind that I'm going to pull a good part of that into
>> amdgpu as next step?
>>
> That indeed looks better then the first approach.
> Felix, I've applied all other patches from the dGPU topology patchset.
> Could you send this new patch after you tested it ?

Yes. I've fixed and tested it today on CZ and Fiji and I'm rebasing
everything on your updated branch right now.

I also have some fixes and updates in the GPUVM patch series that I'll
send out again after rebasing. One thing to note is, that
amdgpu_amdkfd_gpuvm.c will have to deal with conflicts at some point.
The amdgpu_bo_create function removed one parameter, and some structure
members were renamed.

If you submit the amdgpu changes through your branch, either you or Alex
will need to fix that up at some point, depending on who gets to push to
Dave first. Alternatively, I can submit the amdgpu changes through
Alex's tree, but then you'll need to wait for Alex to push them to Dave
before you can apply the amdkfd changes on top of them.

Which way do you prefer?

Regards,
  Felix

> Thanks.
>
>
> Christian, I'm going to pull this patch (after its tested and sent
> formally) to amdkfd next for 4.17, so if you will pull it to amdgpu we
> will have a collision.
>
> Oded
>
>
>> Regards,
>> Christian.
>>
>>
>> Am 03.02.2018 um 03:29 schrieb Felix Kuehling:
>>
>> The attached patch is my attempt to keep most of the IOMMU code in one
>> place (new kfd_iommu.c) to avoid #ifdefs all over the place. This way I
>> can still conditionally compile a bunch of KFD code that is only needed
>> for IOMMU handling, with stub functions for kernel configs without IOMMU
>> support. About 300 lines of conditionally compiled code got moved to
>> kfd_iommu.c.
>>
>> The only piece I didn't move into kfd_iommu.c is
>> kfd_signal_iommu_event. I prefer to keep that in kfd_events.c because it
>> doesn't call any IOMMU driver functions, and because it's closely
>> related to the rest of the event handling logic. It could be compiled
>> unconditionally, but it would be dead code without IOMMU support.
>>
>> And I moved pdd->bound to a place where it doesn't consume extra space
>> (on 64-bit systems due to structure alignment) instead of making it
>> conditional.
>>
>> This is only compile-tested for now.
>>
>> If you like this approach, I'll do more testing and squash it with "Make
>> IOMMUv2 code conditional".
>>
>> Regards,
>>   Felix
>>
>>
>> On 2018-01-31 10:00 AM, Oded Gabbay wrote:
>>
>> On Wed, Jan 31, 2018 at 4:56 PM, Oded Gabbay <oded.gabbay@gmail.com> wrote:
>>
>> Hi Felix,
>> Please don't spread 19 #ifdefs throughout the code.
>> I suggest to put one #ifdef in linux/amd-iommu.h itself around all the
>> functions declarations and in the #else section put macros with empty
>> implementations. This is much more readable and maintainable.
>>
>> Oded
>>
>> To emphasize my point, there is a call to amd_iommu_bind_pasid in
>> kfd_bind_processes_to_device() which isn't wrapped with the #ifdef so
>> the compliation breaks. Putting the #ifdefs around the calls is simply
>> not scalable.
>>
>> Oded
>>
>> On Fri, Jan 5, 2018 at 12:17 AM, Felix Kuehling <Felix.Kuehling@amd.com>
>> wrote:
>>
>> dGPUs work without IOMMUv2. Make IOMMUv2 initialization dependent on
>> ASIC information. Also allow building KFD without IOMMUv2 support.
>> This is still useful for dGPUs and prepares for enabling KFD on
>> architectures that don't support AMD IOMMUv2.
>>
>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>> ---
>>  drivers/gpu/drm/amd/amdkfd/Kconfig        |  2 +-
>>  drivers/gpu/drm/amd/amdkfd/kfd_crat.c     |  8 +++-
>>  drivers/gpu/drm/amd/amdkfd/kfd_device.c   | 62
>> +++++++++++++++++++++----------
>>  drivers/gpu/drm/amd/amdkfd/kfd_events.c   |  2 +
>>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h     |  5 +++
>>  drivers/gpu/drm/amd/amdkfd/kfd_process.c  | 17 ++++++---
>>  drivers/gpu/drm/amd/amdkfd/kfd_topology.c |  2 +
>>  drivers/gpu/drm/amd/amdkfd/kfd_topology.h |  2 +
>>  8 files changed, 74 insertions(+), 26 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/Kconfig
>> b/drivers/gpu/drm/amd/amdkfd/Kconfig
>> index bc5a294..5bbeb95 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/Kconfig
>> +++ b/drivers/gpu/drm/amd/amdkfd/Kconfig
>> @@ -4,6 +4,6 @@
>>
>>  config HSA_AMD
>>         tristate "HSA kernel driver for AMD GPU devices"
>> -       depends on DRM_AMDGPU && AMD_IOMMU_V2 && X86_64
>> +       depends on DRM_AMDGPU && X86_64
>>         help
>>           Enable this if you want to use HSA features on AMD GPU devices.
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
>> b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
>> index 2bc2816..3478270 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
>> @@ -22,7 +22,9 @@
>>
>>  #include <linux/pci.h>
>>  #include <linux/acpi.h>
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>  #include <linux/amd-iommu.h>
>> +#endif
>>  #include "kfd_crat.h"
>>  #include "kfd_priv.h"
>>  #include "kfd_topology.h"
>> @@ -1037,15 +1039,17 @@ static int kfd_create_vcrat_image_gpu(void
>> *pcrat_image,
>>         struct crat_subtype_generic *sub_type_hdr;
>>         struct crat_subtype_computeunit *cu;
>>         struct kfd_cu_info cu_info;
>> -       struct amd_iommu_device_info iommu_info;
>>         int avail_size = *size;
>>         uint32_t total_num_of_cu;
>>         int num_of_cache_entries = 0;
>>         int cache_mem_filled = 0;
>>         int ret = 0;
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>> +       struct amd_iommu_device_info iommu_info;
>>         const u32 required_iommu_flags = AMD_IOMMU_DEVICE_FLAG_ATS_SUP |
>>                                          AMD_IOMMU_DEVICE_FLAG_PRI_SUP |
>>                                          AMD_IOMMU_DEVICE_FLAG_PASID_SUP;
>> +#endif
>>         struct kfd_local_mem_info local_mem_info;
>>
>>         if (!pcrat_image || avail_size < VCRAT_SIZE_FOR_GPU)
>> @@ -1106,12 +1110,14 @@ static int kfd_create_vcrat_image_gpu(void
>> *pcrat_image,
>>         /* Check if this node supports IOMMU. During parsing this flag will
>>          * translate to HSA_CAP_ATS_PRESENT
>>          */
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>         iommu_info.flags = 0;
>>         if (amd_iommu_device_info(kdev->pdev, &iommu_info) == 0) {
>>                 if ((iommu_info.flags & required_iommu_flags) ==
>>                                 required_iommu_flags)
>>                         cu->hsa_capability |= CRAT_CU_FLAGS_IOMMU_PRESENT;
>>         }
>> +#endif
>>
>>         crat_table->length += sub_type_hdr->length;
>>         crat_table->total_entries++;
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> index fafe971..5205b34 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> @@ -20,7 +20,9 @@
>>   * OTHER DEALINGS IN THE SOFTWARE.
>>   */
>>
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>  #include <linux/amd-iommu.h>
>> +#endif
>>  #include <linux/bsearch.h>
>>  #include <linux/pci.h>
>>  #include <linux/slab.h>
>> @@ -31,6 +33,7 @@
>>
>>  #define MQD_SIZE_ALIGNED 768
>>
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>  static const struct kfd_device_info kaveri_device_info = {
>>         .asic_family = CHIP_KAVERI,
>>         .max_pasid_bits = 16,
>> @@ -41,6 +44,7 @@ static const struct kfd_device_info kaveri_device_info = {
>>         .num_of_watch_points = 4,
>>         .mqd_size_aligned = MQD_SIZE_ALIGNED,
>>         .supports_cwsr = false,
>> +       .needs_iommu_device = true,
>>         .needs_pci_atomics = false,
>>  };
>>
>> @@ -54,8 +58,10 @@ static const struct kfd_device_info carrizo_device_info =
>> {
>>         .num_of_watch_points = 4,
>>         .mqd_size_aligned = MQD_SIZE_ALIGNED,
>>         .supports_cwsr = true,
>> +       .needs_iommu_device = true,
>>         .needs_pci_atomics = false,
>>  };
>> +#endif
>>
>>  struct kfd_deviceid {
>>         unsigned short did;
>> @@ -64,6 +70,7 @@ struct kfd_deviceid {
>>
>>  /* Please keep this sorted by increasing device id. */
>>  static const struct kfd_deviceid supported_devices[] = {
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>         { 0x1304, &kaveri_device_info },        /* Kaveri */
>>         { 0x1305, &kaveri_device_info },        /* Kaveri */
>>         { 0x1306, &kaveri_device_info },        /* Kaveri */
>> @@ -91,6 +98,7 @@ static const struct kfd_deviceid supported_devices[] = {
>>         { 0x9875, &carrizo_device_info },       /* Carrizo */
>>         { 0x9876, &carrizo_device_info },       /* Carrizo */
>>         { 0x9877, &carrizo_device_info }        /* Carrizo */
>> +#endif
>>  };
>>
>>  static int kfd_gtt_sa_init(struct kfd_dev *kfd, unsigned int buf_size,
>> @@ -161,6 +169,7 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
>>         return kfd;
>>  }
>>
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>  static bool device_iommu_pasid_init(struct kfd_dev *kfd)
>>  {
>>         const u32 required_iommu_flags = AMD_IOMMU_DEVICE_FLAG_ATS_SUP |
>> @@ -231,6 +240,7 @@ static int iommu_invalid_ppr_cb(struct pci_dev *pdev,
>> int pasid,
>>
>>         return AMD_IOMMU_INV_PRI_RSP_INVALID;
>>  }
>> +#endif /* CONFIG_AMD_IOMMU_V2 */
>>
>>  static void kfd_cwsr_init(struct kfd_dev *kfd)
>>  {
>> @@ -321,12 +331,14 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>>                 goto device_queue_manager_error;
>>         }
>>
>> -       if (!device_iommu_pasid_init(kfd)) {
>> -               dev_err(kfd_device,
>> -                       "Error initializing iommuv2 for device %x:%x\n",
>> -                       kfd->pdev->vendor, kfd->pdev->device);
>> -               goto device_iommu_pasid_error;
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>> +       if (kfd->device_info->needs_iommu_device) {
>> +               if (!device_iommu_pasid_init(kfd)) {
>> +                       dev_err(kfd_device, "Error initializing iommuv2\n");
>> +                       goto device_iommu_pasid_error;
>> +               }
>>         }
>> +#endif
>>
>>         kfd_cwsr_init(kfd);
>>
>> @@ -386,11 +398,16 @@ void kgd2kfd_suspend(struct kfd_dev *kfd)
>>
>>         kfd->dqm->ops.stop(kfd->dqm);
>>
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>> +       if (!kfd->device_info->needs_iommu_device)
>> +               return;
>> +
>>         kfd_unbind_processes_from_device(kfd);
>>
>>         amd_iommu_set_invalidate_ctx_cb(kfd->pdev, NULL);
>>         amd_iommu_set_invalid_ppr_cb(kfd->pdev, NULL);
>>         amd_iommu_free_device(kfd->pdev);
>> +#endif
>>  }
>>
>>  int kgd2kfd_resume(struct kfd_dev *kfd)
>> @@ -405,19 +422,24 @@ int kgd2kfd_resume(struct kfd_dev *kfd)
>>  static int kfd_resume(struct kfd_dev *kfd)
>>  {
>>         int err = 0;
>> -       unsigned int pasid_limit = kfd_get_pasid_limit();
>>
>> -       err = amd_iommu_init_device(kfd->pdev, pasid_limit);
>> -       if (err)
>> -               return -ENXIO;
>> -       amd_iommu_set_invalidate_ctx_cb(kfd->pdev,
>> -                                       iommu_pasid_shutdown_callback);
>> -       amd_iommu_set_invalid_ppr_cb(kfd->pdev,
>> -                                    iommu_invalid_ppr_cb);
>> -
>> -       err = kfd_bind_processes_to_device(kfd);
>> -       if (err)
>> -               goto processes_bind_error;
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>> +       if (kfd->device_info->needs_iommu_device) {
>> +               unsigned int pasid_limit = kfd_get_pasid_limit();
>> +
>> +               err = amd_iommu_init_device(kfd->pdev, pasid_limit);
>> +               if (err)
>> +                       return -ENXIO;
>> +               amd_iommu_set_invalidate_ctx_cb(kfd->pdev,
>> +
>> iommu_pasid_shutdown_callback);
>> +               amd_iommu_set_invalid_ppr_cb(kfd->pdev,
>> +                                            iommu_invalid_ppr_cb);
>> +
>> +               err = kfd_bind_processes_to_device(kfd);
>> +               if (err)
>> +                       goto processes_bind_error;
>> +       }
>> +#endif
>>
>>         err = kfd->dqm->ops.start(kfd->dqm);
>>         if (err) {
>> @@ -431,8 +453,10 @@ static int kfd_resume(struct kfd_dev *kfd)
>>
>>  dqm_start_error:
>>  processes_bind_error:
>> -       amd_iommu_free_device(kfd->pdev);
>> -
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>> +       if (kfd->device_info->needs_iommu_device)
>> +               amd_iommu_free_device(kfd->pdev);
>> +#endif
>>         return err;
>>  }
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
>> b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
>> index 93aae5c..f770dc7 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
>> @@ -837,6 +837,7 @@ static void lookup_events_by_type_and_signal(struct
>> kfd_process *p,
>>         }
>>  }
>>
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>  void kfd_signal_iommu_event(struct kfd_dev *dev, unsigned int pasid,
>>                 unsigned long address, bool is_write_requested,
>>                 bool is_execute_requested)
>> @@ -905,6 +906,7 @@ void kfd_signal_iommu_event(struct kfd_dev *dev,
>> unsigned int pasid,
>>         mutex_unlock(&p->event_mutex);
>>         kfd_unref_process(p);
>>  }
>> +#endif /* CONFIG_AMD_IOMMU_V2_MODULE */
>>
>>  void kfd_signal_hw_exception_event(unsigned int pasid)
>>  {
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>> b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>> index eebfb1e..9f4766c 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>> @@ -158,6 +158,7 @@ struct kfd_device_info {
>>         uint8_t num_of_watch_points;
>>         uint16_t mqd_size_aligned;
>>         bool supports_cwsr;
>> +       bool needs_iommu_device;
>>         bool needs_pci_atomics;
>>  };
>>
>> @@ -617,9 +618,11 @@ void kfd_unref_process(struct kfd_process *p);
>>
>>  struct kfd_process_device *kfd_bind_process_to_device(struct kfd_dev *dev,
>>                                                 struct kfd_process *p);
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>  int kfd_bind_processes_to_device(struct kfd_dev *dev);
>>  void kfd_unbind_processes_from_device(struct kfd_dev *dev);
>>  void kfd_process_iommu_unbind_callback(struct kfd_dev *dev, unsigned int
>> pasid);
>> +#endif
>>  struct kfd_process_device *kfd_get_process_device_data(struct kfd_dev *dev,
>>                                                         struct kfd_process
>> *p);
>>  struct kfd_process_device *kfd_create_process_device_data(struct kfd_dev
>> *dev,
>> @@ -784,9 +787,11 @@ int kfd_wait_on_events(struct kfd_process *p,
>>                        uint32_t *wait_result);
>>  void kfd_signal_event_interrupt(unsigned int pasid, uint32_t partial_id,
>>                                 uint32_t valid_id_bits);
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>  void kfd_signal_iommu_event(struct kfd_dev *dev,
>>                 unsigned int pasid, unsigned long address,
>>                 bool is_write_requested, bool is_execute_requested);
>> +#endif
>>  void kfd_signal_hw_exception_event(unsigned int pasid);
>>  int kfd_set_event(struct kfd_process *p, uint32_t event_id);
>>  int kfd_reset_event(struct kfd_process *p, uint32_t event_id);
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> index a22fb071..1d0e02c 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> @@ -173,14 +173,17 @@ static void kfd_process_wq_release(struct work_struct
>> *work)
>>  {
>>         struct kfd_process *p = container_of(work, struct kfd_process,
>>                                              release_work);
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>         struct kfd_process_device *pdd;
>>
>>         pr_debug("Releasing process (pasid %d) in workqueue\n", p->pasid);
>>
>>         list_for_each_entry(pdd, &p->per_device_data, per_device_list) {
>> -               if (pdd->bound == PDD_BOUND)
>> +               if (pdd->bound == PDD_BOUND &&
>> +                   pdd->dev->device_info->needs_iommu_device)
>>                         amd_iommu_unbind_pasid(pdd->dev->pdev, p->pasid);
>>         }
>> +#endif
>>
>>         kfd_process_destroy_pdds(p);
>>
>> @@ -421,7 +424,6 @@ struct kfd_process_device
>> *kfd_bind_process_to_device(struct kfd_dev *dev,
>>                                                         struct kfd_process
>> *p)
>>  {
>>         struct kfd_process_device *pdd;
>> -       int err;
>>
>>         pdd = kfd_get_process_device_data(dev, p);
>>         if (!pdd) {
>> @@ -436,9 +438,14 @@ struct kfd_process_device
>> *kfd_bind_process_to_device(struct kfd_dev *dev,
>>                 return ERR_PTR(-EINVAL);
>>         }
>>
>> -       err = amd_iommu_bind_pasid(dev->pdev, p->pasid, p->lead_thread);
>> -       if (err < 0)
>> -               return ERR_PTR(err);
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>> +       if (dev->device_info->needs_iommu_device) {
>> +               int err = amd_iommu_bind_pasid(dev->pdev, p->pasid,
>> +                                              p->lead_thread);
>> +               if (err < 0)
>> +                       return ERR_PTR(err);
>> +       }
>> +#endif
>>
>>         pdd->bound = PDD_BOUND;
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>> b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>> index c6a7609..f57c305 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>> @@ -875,6 +875,7 @@ static void find_system_memory(const struct dmi_header
>> *dm,
>>   */
>>  static int kfd_add_perf_to_topology(struct kfd_topology_device *kdev)
>>  {
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>         struct kfd_perf_properties *props;
>>
>>         if (amd_iommu_pc_supported()) {
>> @@ -886,6 +887,7 @@ static int kfd_add_perf_to_topology(struct
>> kfd_topology_device *kdev)
>>                         amd_iommu_pc_get_max_counters(0); /* assume one
>> iommu */
>>                 list_add_tail(&props->list, &kdev->perf_props);
>>         }
>> +#endif
>>
>>         return 0;
>>  }
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
>> b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
>> index 53fca1f..111fda2 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
>> @@ -183,8 +183,10 @@ struct kfd_topology_device *kfd_create_topology_device(
>>                 struct list_head *device_list);
>>  void kfd_release_topology_device_list(struct list_head *device_list);
>>
>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>  extern bool amd_iommu_pc_supported(void);
>>  extern u8 amd_iommu_pc_get_max_banks(u16 devid);
>>  extern u8 amd_iommu_pc_get_max_counters(u16 devid);
>> +#endif
>>
>>  #endif /* __KFD_TOPOLOGY_H__ */
>> --
>> 2.7.4
>>
>>
>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>
>>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH 3/9] drm/amdkfd: Make IOMMUv2 code conditional
       [not found]                             ` <fe5b1bd4-37fc-2ed0-6a68-247abe08406e-5C7GfCeVMHo@public.gmane.org>
@ 2018-02-07  6:55                               ` Oded Gabbay
  0 siblings, 0 replies; 47+ messages in thread
From: Oded Gabbay @ 2018-02-07  6:55 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: Christian König, amd-gfx list

On Wed, Feb 7, 2018 at 2:30 AM, Felix Kuehling <felix.kuehling@amd.com> wrote:
> On 2018-02-06 03:53 AM, Oded Gabbay wrote:
>> On Mon, Feb 5, 2018 at 9:00 PM, Christian König
>> <ckoenig.leichtzumerken@gmail.com> wrote:
>>> Looks good to me on first glance.
>>>
>>> You probably don't mind that I'm going to pull a good part of that into
>>> amdgpu as next step?
>>>
>> That indeed looks better then the first approach.
>> Felix, I've applied all other patches from the dGPU topology patchset.
>> Could you send this new patch after you tested it ?
>
> Yes. I've fixed and tested it today on CZ and Fiji and I'm rebasing
> everything on your updated branch right now.
>
> I also have some fixes and updates in the GPUVM patch series that I'll
> send out again after rebasing. One thing to note is, that
> amdgpu_amdkfd_gpuvm.c will have to deal with conflicts at some point.
> The amdgpu_bo_create function removed one parameter, and some structure
> members were renamed.
>
> If you submit the amdgpu changes through your branch, either you or Alex
> will need to fix that up at some point, depending on who gets to push to
> Dave first. Alternatively, I can submit the amdgpu changes through
> Alex's tree, but then you'll need to wait for Alex to push them to Dave
> before you can apply the amdkfd changes on top of them.
>
> Which way do you prefer?
I don't think you should split it up.

Usually Alex is getting to Dave before me, so either I will fix it, or
Dave will fix it ;)

Oded
>
> Regards,
>   Felix
>
>> Thanks.
>>
>>
>> Christian, I'm going to pull this patch (after its tested and sent
>> formally) to amdkfd next for 4.17, so if you will pull it to amdgpu we
>> will have a collision.
>>
>> Oded
>>
>>
>>> Regards,
>>> Christian.
>>>
>>>
>>> Am 03.02.2018 um 03:29 schrieb Felix Kuehling:
>>>
>>> The attached patch is my attempt to keep most of the IOMMU code in one
>>> place (new kfd_iommu.c) to avoid #ifdefs all over the place. This way I
>>> can still conditionally compile a bunch of KFD code that is only needed
>>> for IOMMU handling, with stub functions for kernel configs without IOMMU
>>> support. About 300 lines of conditionally compiled code got moved to
>>> kfd_iommu.c.
>>>
>>> The only piece I didn't move into kfd_iommu.c is
>>> kfd_signal_iommu_event. I prefer to keep that in kfd_events.c because it
>>> doesn't call any IOMMU driver functions, and because it's closely
>>> related to the rest of the event handling logic. It could be compiled
>>> unconditionally, but it would be dead code without IOMMU support.
>>>
>>> And I moved pdd->bound to a place where it doesn't consume extra space
>>> (on 64-bit systems due to structure alignment) instead of making it
>>> conditional.
>>>
>>> This is only compile-tested for now.
>>>
>>> If you like this approach, I'll do more testing and squash it with "Make
>>> IOMMUv2 code conditional".
>>>
>>> Regards,
>>>   Felix
>>>
>>>
>>> On 2018-01-31 10:00 AM, Oded Gabbay wrote:
>>>
>>> On Wed, Jan 31, 2018 at 4:56 PM, Oded Gabbay <oded.gabbay@gmail.com> wrote:
>>>
>>> Hi Felix,
>>> Please don't spread 19 #ifdefs throughout the code.
>>> I suggest to put one #ifdef in linux/amd-iommu.h itself around all the
>>> functions declarations and in the #else section put macros with empty
>>> implementations. This is much more readable and maintainable.
>>>
>>> Oded
>>>
>>> To emphasize my point, there is a call to amd_iommu_bind_pasid in
>>> kfd_bind_processes_to_device() which isn't wrapped with the #ifdef so
>>> the compliation breaks. Putting the #ifdefs around the calls is simply
>>> not scalable.
>>>
>>> Oded
>>>
>>> On Fri, Jan 5, 2018 at 12:17 AM, Felix Kuehling <Felix.Kuehling@amd.com>
>>> wrote:
>>>
>>> dGPUs work without IOMMUv2. Make IOMMUv2 initialization dependent on
>>> ASIC information. Also allow building KFD without IOMMUv2 support.
>>> This is still useful for dGPUs and prepares for enabling KFD on
>>> architectures that don't support AMD IOMMUv2.
>>>
>>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>>> ---
>>>  drivers/gpu/drm/amd/amdkfd/Kconfig        |  2 +-
>>>  drivers/gpu/drm/amd/amdkfd/kfd_crat.c     |  8 +++-
>>>  drivers/gpu/drm/amd/amdkfd/kfd_device.c   | 62
>>> +++++++++++++++++++++----------
>>>  drivers/gpu/drm/amd/amdkfd/kfd_events.c   |  2 +
>>>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h     |  5 +++
>>>  drivers/gpu/drm/amd/amdkfd/kfd_process.c  | 17 ++++++---
>>>  drivers/gpu/drm/amd/amdkfd/kfd_topology.c |  2 +
>>>  drivers/gpu/drm/amd/amdkfd/kfd_topology.h |  2 +
>>>  8 files changed, 74 insertions(+), 26 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/Kconfig
>>> b/drivers/gpu/drm/amd/amdkfd/Kconfig
>>> index bc5a294..5bbeb95 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/Kconfig
>>> +++ b/drivers/gpu/drm/amd/amdkfd/Kconfig
>>> @@ -4,6 +4,6 @@
>>>
>>>  config HSA_AMD
>>>         tristate "HSA kernel driver for AMD GPU devices"
>>> -       depends on DRM_AMDGPU && AMD_IOMMU_V2 && X86_64
>>> +       depends on DRM_AMDGPU && X86_64
>>>         help
>>>           Enable this if you want to use HSA features on AMD GPU devices.
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
>>> b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
>>> index 2bc2816..3478270 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
>>> @@ -22,7 +22,9 @@
>>>
>>>  #include <linux/pci.h>
>>>  #include <linux/acpi.h>
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>  #include <linux/amd-iommu.h>
>>> +#endif
>>>  #include "kfd_crat.h"
>>>  #include "kfd_priv.h"
>>>  #include "kfd_topology.h"
>>> @@ -1037,15 +1039,17 @@ static int kfd_create_vcrat_image_gpu(void
>>> *pcrat_image,
>>>         struct crat_subtype_generic *sub_type_hdr;
>>>         struct crat_subtype_computeunit *cu;
>>>         struct kfd_cu_info cu_info;
>>> -       struct amd_iommu_device_info iommu_info;
>>>         int avail_size = *size;
>>>         uint32_t total_num_of_cu;
>>>         int num_of_cache_entries = 0;
>>>         int cache_mem_filled = 0;
>>>         int ret = 0;
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>> +       struct amd_iommu_device_info iommu_info;
>>>         const u32 required_iommu_flags = AMD_IOMMU_DEVICE_FLAG_ATS_SUP |
>>>                                          AMD_IOMMU_DEVICE_FLAG_PRI_SUP |
>>>                                          AMD_IOMMU_DEVICE_FLAG_PASID_SUP;
>>> +#endif
>>>         struct kfd_local_mem_info local_mem_info;
>>>
>>>         if (!pcrat_image || avail_size < VCRAT_SIZE_FOR_GPU)
>>> @@ -1106,12 +1110,14 @@ static int kfd_create_vcrat_image_gpu(void
>>> *pcrat_image,
>>>         /* Check if this node supports IOMMU. During parsing this flag will
>>>          * translate to HSA_CAP_ATS_PRESENT
>>>          */
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>         iommu_info.flags = 0;
>>>         if (amd_iommu_device_info(kdev->pdev, &iommu_info) == 0) {
>>>                 if ((iommu_info.flags & required_iommu_flags) ==
>>>                                 required_iommu_flags)
>>>                         cu->hsa_capability |= CRAT_CU_FLAGS_IOMMU_PRESENT;
>>>         }
>>> +#endif
>>>
>>>         crat_table->length += sub_type_hdr->length;
>>>         crat_table->total_entries++;
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>> b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>> index fafe971..5205b34 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>> @@ -20,7 +20,9 @@
>>>   * OTHER DEALINGS IN THE SOFTWARE.
>>>   */
>>>
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>  #include <linux/amd-iommu.h>
>>> +#endif
>>>  #include <linux/bsearch.h>
>>>  #include <linux/pci.h>
>>>  #include <linux/slab.h>
>>> @@ -31,6 +33,7 @@
>>>
>>>  #define MQD_SIZE_ALIGNED 768
>>>
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>  static const struct kfd_device_info kaveri_device_info = {
>>>         .asic_family = CHIP_KAVERI,
>>>         .max_pasid_bits = 16,
>>> @@ -41,6 +44,7 @@ static const struct kfd_device_info kaveri_device_info = {
>>>         .num_of_watch_points = 4,
>>>         .mqd_size_aligned = MQD_SIZE_ALIGNED,
>>>         .supports_cwsr = false,
>>> +       .needs_iommu_device = true,
>>>         .needs_pci_atomics = false,
>>>  };
>>>
>>> @@ -54,8 +58,10 @@ static const struct kfd_device_info carrizo_device_info =
>>> {
>>>         .num_of_watch_points = 4,
>>>         .mqd_size_aligned = MQD_SIZE_ALIGNED,
>>>         .supports_cwsr = true,
>>> +       .needs_iommu_device = true,
>>>         .needs_pci_atomics = false,
>>>  };
>>> +#endif
>>>
>>>  struct kfd_deviceid {
>>>         unsigned short did;
>>> @@ -64,6 +70,7 @@ struct kfd_deviceid {
>>>
>>>  /* Please keep this sorted by increasing device id. */
>>>  static const struct kfd_deviceid supported_devices[] = {
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>         { 0x1304, &kaveri_device_info },        /* Kaveri */
>>>         { 0x1305, &kaveri_device_info },        /* Kaveri */
>>>         { 0x1306, &kaveri_device_info },        /* Kaveri */
>>> @@ -91,6 +98,7 @@ static const struct kfd_deviceid supported_devices[] = {
>>>         { 0x9875, &carrizo_device_info },       /* Carrizo */
>>>         { 0x9876, &carrizo_device_info },       /* Carrizo */
>>>         { 0x9877, &carrizo_device_info }        /* Carrizo */
>>> +#endif
>>>  };
>>>
>>>  static int kfd_gtt_sa_init(struct kfd_dev *kfd, unsigned int buf_size,
>>> @@ -161,6 +169,7 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
>>>         return kfd;
>>>  }
>>>
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>  static bool device_iommu_pasid_init(struct kfd_dev *kfd)
>>>  {
>>>         const u32 required_iommu_flags = AMD_IOMMU_DEVICE_FLAG_ATS_SUP |
>>> @@ -231,6 +240,7 @@ static int iommu_invalid_ppr_cb(struct pci_dev *pdev,
>>> int pasid,
>>>
>>>         return AMD_IOMMU_INV_PRI_RSP_INVALID;
>>>  }
>>> +#endif /* CONFIG_AMD_IOMMU_V2 */
>>>
>>>  static void kfd_cwsr_init(struct kfd_dev *kfd)
>>>  {
>>> @@ -321,12 +331,14 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>>>                 goto device_queue_manager_error;
>>>         }
>>>
>>> -       if (!device_iommu_pasid_init(kfd)) {
>>> -               dev_err(kfd_device,
>>> -                       "Error initializing iommuv2 for device %x:%x\n",
>>> -                       kfd->pdev->vendor, kfd->pdev->device);
>>> -               goto device_iommu_pasid_error;
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>> +       if (kfd->device_info->needs_iommu_device) {
>>> +               if (!device_iommu_pasid_init(kfd)) {
>>> +                       dev_err(kfd_device, "Error initializing iommuv2\n");
>>> +                       goto device_iommu_pasid_error;
>>> +               }
>>>         }
>>> +#endif
>>>
>>>         kfd_cwsr_init(kfd);
>>>
>>> @@ -386,11 +398,16 @@ void kgd2kfd_suspend(struct kfd_dev *kfd)
>>>
>>>         kfd->dqm->ops.stop(kfd->dqm);
>>>
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>> +       if (!kfd->device_info->needs_iommu_device)
>>> +               return;
>>> +
>>>         kfd_unbind_processes_from_device(kfd);
>>>
>>>         amd_iommu_set_invalidate_ctx_cb(kfd->pdev, NULL);
>>>         amd_iommu_set_invalid_ppr_cb(kfd->pdev, NULL);
>>>         amd_iommu_free_device(kfd->pdev);
>>> +#endif
>>>  }
>>>
>>>  int kgd2kfd_resume(struct kfd_dev *kfd)
>>> @@ -405,19 +422,24 @@ int kgd2kfd_resume(struct kfd_dev *kfd)
>>>  static int kfd_resume(struct kfd_dev *kfd)
>>>  {
>>>         int err = 0;
>>> -       unsigned int pasid_limit = kfd_get_pasid_limit();
>>>
>>> -       err = amd_iommu_init_device(kfd->pdev, pasid_limit);
>>> -       if (err)
>>> -               return -ENXIO;
>>> -       amd_iommu_set_invalidate_ctx_cb(kfd->pdev,
>>> -                                       iommu_pasid_shutdown_callback);
>>> -       amd_iommu_set_invalid_ppr_cb(kfd->pdev,
>>> -                                    iommu_invalid_ppr_cb);
>>> -
>>> -       err = kfd_bind_processes_to_device(kfd);
>>> -       if (err)
>>> -               goto processes_bind_error;
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>> +       if (kfd->device_info->needs_iommu_device) {
>>> +               unsigned int pasid_limit = kfd_get_pasid_limit();
>>> +
>>> +               err = amd_iommu_init_device(kfd->pdev, pasid_limit);
>>> +               if (err)
>>> +                       return -ENXIO;
>>> +               amd_iommu_set_invalidate_ctx_cb(kfd->pdev,
>>> +
>>> iommu_pasid_shutdown_callback);
>>> +               amd_iommu_set_invalid_ppr_cb(kfd->pdev,
>>> +                                            iommu_invalid_ppr_cb);
>>> +
>>> +               err = kfd_bind_processes_to_device(kfd);
>>> +               if (err)
>>> +                       goto processes_bind_error;
>>> +       }
>>> +#endif
>>>
>>>         err = kfd->dqm->ops.start(kfd->dqm);
>>>         if (err) {
>>> @@ -431,8 +453,10 @@ static int kfd_resume(struct kfd_dev *kfd)
>>>
>>>  dqm_start_error:
>>>  processes_bind_error:
>>> -       amd_iommu_free_device(kfd->pdev);
>>> -
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>> +       if (kfd->device_info->needs_iommu_device)
>>> +               amd_iommu_free_device(kfd->pdev);
>>> +#endif
>>>         return err;
>>>  }
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
>>> b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
>>> index 93aae5c..f770dc7 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
>>> @@ -837,6 +837,7 @@ static void lookup_events_by_type_and_signal(struct
>>> kfd_process *p,
>>>         }
>>>  }
>>>
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>  void kfd_signal_iommu_event(struct kfd_dev *dev, unsigned int pasid,
>>>                 unsigned long address, bool is_write_requested,
>>>                 bool is_execute_requested)
>>> @@ -905,6 +906,7 @@ void kfd_signal_iommu_event(struct kfd_dev *dev,
>>> unsigned int pasid,
>>>         mutex_unlock(&p->event_mutex);
>>>         kfd_unref_process(p);
>>>  }
>>> +#endif /* CONFIG_AMD_IOMMU_V2_MODULE */
>>>
>>>  void kfd_signal_hw_exception_event(unsigned int pasid)
>>>  {
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>> b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>> index eebfb1e..9f4766c 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>> @@ -158,6 +158,7 @@ struct kfd_device_info {
>>>         uint8_t num_of_watch_points;
>>>         uint16_t mqd_size_aligned;
>>>         bool supports_cwsr;
>>> +       bool needs_iommu_device;
>>>         bool needs_pci_atomics;
>>>  };
>>>
>>> @@ -617,9 +618,11 @@ void kfd_unref_process(struct kfd_process *p);
>>>
>>>  struct kfd_process_device *kfd_bind_process_to_device(struct kfd_dev *dev,
>>>                                                 struct kfd_process *p);
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>  int kfd_bind_processes_to_device(struct kfd_dev *dev);
>>>  void kfd_unbind_processes_from_device(struct kfd_dev *dev);
>>>  void kfd_process_iommu_unbind_callback(struct kfd_dev *dev, unsigned int
>>> pasid);
>>> +#endif
>>>  struct kfd_process_device *kfd_get_process_device_data(struct kfd_dev *dev,
>>>                                                         struct kfd_process
>>> *p);
>>>  struct kfd_process_device *kfd_create_process_device_data(struct kfd_dev
>>> *dev,
>>> @@ -784,9 +787,11 @@ int kfd_wait_on_events(struct kfd_process *p,
>>>                        uint32_t *wait_result);
>>>  void kfd_signal_event_interrupt(unsigned int pasid, uint32_t partial_id,
>>>                                 uint32_t valid_id_bits);
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>  void kfd_signal_iommu_event(struct kfd_dev *dev,
>>>                 unsigned int pasid, unsigned long address,
>>>                 bool is_write_requested, bool is_execute_requested);
>>> +#endif
>>>  void kfd_signal_hw_exception_event(unsigned int pasid);
>>>  int kfd_set_event(struct kfd_process *p, uint32_t event_id);
>>>  int kfd_reset_event(struct kfd_process *p, uint32_t event_id);
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> index a22fb071..1d0e02c 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> @@ -173,14 +173,17 @@ static void kfd_process_wq_release(struct work_struct
>>> *work)
>>>  {
>>>         struct kfd_process *p = container_of(work, struct kfd_process,
>>>                                              release_work);
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>         struct kfd_process_device *pdd;
>>>
>>>         pr_debug("Releasing process (pasid %d) in workqueue\n", p->pasid);
>>>
>>>         list_for_each_entry(pdd, &p->per_device_data, per_device_list) {
>>> -               if (pdd->bound == PDD_BOUND)
>>> +               if (pdd->bound == PDD_BOUND &&
>>> +                   pdd->dev->device_info->needs_iommu_device)
>>>                         amd_iommu_unbind_pasid(pdd->dev->pdev, p->pasid);
>>>         }
>>> +#endif
>>>
>>>         kfd_process_destroy_pdds(p);
>>>
>>> @@ -421,7 +424,6 @@ struct kfd_process_device
>>> *kfd_bind_process_to_device(struct kfd_dev *dev,
>>>                                                         struct kfd_process
>>> *p)
>>>  {
>>>         struct kfd_process_device *pdd;
>>> -       int err;
>>>
>>>         pdd = kfd_get_process_device_data(dev, p);
>>>         if (!pdd) {
>>> @@ -436,9 +438,14 @@ struct kfd_process_device
>>> *kfd_bind_process_to_device(struct kfd_dev *dev,
>>>                 return ERR_PTR(-EINVAL);
>>>         }
>>>
>>> -       err = amd_iommu_bind_pasid(dev->pdev, p->pasid, p->lead_thread);
>>> -       if (err < 0)
>>> -               return ERR_PTR(err);
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>> +       if (dev->device_info->needs_iommu_device) {
>>> +               int err = amd_iommu_bind_pasid(dev->pdev, p->pasid,
>>> +                                              p->lead_thread);
>>> +               if (err < 0)
>>> +                       return ERR_PTR(err);
>>> +       }
>>> +#endif
>>>
>>>         pdd->bound = PDD_BOUND;
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>>> b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>>> index c6a7609..f57c305 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>>> @@ -875,6 +875,7 @@ static void find_system_memory(const struct dmi_header
>>> *dm,
>>>   */
>>>  static int kfd_add_perf_to_topology(struct kfd_topology_device *kdev)
>>>  {
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>         struct kfd_perf_properties *props;
>>>
>>>         if (amd_iommu_pc_supported()) {
>>> @@ -886,6 +887,7 @@ static int kfd_add_perf_to_topology(struct
>>> kfd_topology_device *kdev)
>>>                         amd_iommu_pc_get_max_counters(0); /* assume one
>>> iommu */
>>>                 list_add_tail(&props->list, &kdev->perf_props);
>>>         }
>>> +#endif
>>>
>>>         return 0;
>>>  }
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
>>> b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
>>> index 53fca1f..111fda2 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
>>> @@ -183,8 +183,10 @@ struct kfd_topology_device *kfd_create_topology_device(
>>>                 struct list_head *device_list);
>>>  void kfd_release_topology_device_list(struct list_head *device_list);
>>>
>>> +#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
>>>  extern bool amd_iommu_pc_supported(void);
>>>  extern u8 amd_iommu_pc_get_max_banks(u16 devid);
>>>  extern u8 amd_iommu_pc_get_max_counters(u16 devid);
>>> +#endif
>>>
>>>  #endif /* __KFD_TOPOLOGY_H__ */
>>> --
>>> 2.7.4
>>>
>>>
>>>
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>>
>>>
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 47+ messages in thread

end of thread, other threads:[~2018-02-07  6:55 UTC | newest]

Thread overview: 47+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-04 22:17 [PATCH 0/9] KFD dGPU initialization Felix Kuehling
2018-01-04 22:17 ` [PATCH 1/9] PCI: Add pci_enable_atomic_ops_to_root Felix Kuehling
2018-01-04 22:17   ` Felix Kuehling
2018-01-05  0:17   ` Bjorn Helgaas
2018-01-05  0:23     ` Felix Kuehling
2018-01-05  0:23       ` Felix Kuehling
2018-01-04 22:17 ` [PATCH 2/9] drm/amdkfd: Conditionally enable PCIe atomics Felix Kuehling
2018-01-31 15:09   ` Oded Gabbay
2018-01-31 15:09     ` Oded Gabbay
     [not found] ` <1515104268-25087-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2018-01-04 22:17   ` [PATCH 3/9] drm/amdkfd: Make IOMMUv2 code conditional Felix Kuehling
     [not found]     ` <1515104268-25087-4-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2018-01-31 14:56       ` Oded Gabbay
     [not found]         ` <CAFCwf125cHCf=fsfiMhhASjgMNEcau04gNGKKHFu7PQGeorpZQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2018-01-31 15:00           ` Oded Gabbay
     [not found]             ` <CAFCwf12pqRA4KdRLpkUmiBs7EQmTePcy80V2kP9mP3pN8V-eTg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2018-01-31 15:11               ` Christian König
2018-01-31 16:14               ` Felix Kuehling
2018-02-03  2:29               ` Felix Kuehling
     [not found]                 ` <5a1e7696-b5bf-547c-4fe6-e71e0ae7f5e0-5C7GfCeVMHo@public.gmane.org>
2018-02-05 19:00                   ` Christian König
     [not found]                     ` <044a3842-92d1-fe2a-c432-0719e8528416-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2018-02-06  8:53                       ` Oded Gabbay
     [not found]                         ` <CAFCwf12s6sjyxyTNWx+cdqCjQg+O-4WonDLmJ2X9QT0iRLBNsQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2018-02-07  0:30                           ` Felix Kuehling
     [not found]                             ` <fe5b1bd4-37fc-2ed0-6a68-247abe08406e-5C7GfCeVMHo@public.gmane.org>
2018-02-07  6:55                               ` Oded Gabbay
2018-01-04 22:17   ` [PATCH 4/9] drm/amdkfd: Make sched_policy a per-device setting Felix Kuehling
     [not found]     ` <1515104268-25087-5-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2018-01-31 15:06       ` Oded Gabbay
     [not found]         ` <CAFCwf11xXiKH-3sqpjk-cpQ5DyM_dL-6Vk=DrBCPJ=oSyyYyAg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2018-01-31 16:18           ` Felix Kuehling
2018-01-04 22:17   ` [PATCH 5/9] drm/amdkfd: Add dGPU support to the device queue manager Felix Kuehling
     [not found]     ` <1515104268-25087-6-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2018-01-31 15:09       ` Oded Gabbay
2018-01-04 22:17   ` [PATCH 6/9] drm/amdkfd: Add dGPU support to the MQD manager Felix Kuehling
     [not found]     ` <1515104268-25087-7-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2018-01-31 15:11       ` Oded Gabbay
2018-01-04 22:17   ` [PATCH 7/9] drm/amdkfd: Add dGPU support to kernel_queue_init Felix Kuehling
     [not found]     ` <1515104268-25087-8-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2018-01-31 15:17       ` Oded Gabbay
     [not found]         ` <CAFCwf11nyKTuxF4R+GfWt_Zg5pRjYezbp9TEW_-OWqRhhR-rVg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2018-01-31 15:23           ` Deucher, Alexander
     [not found]             ` <BN6PR12MB16520A78D8FFA5E0AF393BD2F7FB0-/b2+HYfkarQqUD6E6FAiowdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2018-01-31 15:29               ` Oded Gabbay
     [not found]                 ` <CAFCwf127vkM7aEcyUK9VjrVekZAFin7d7sk6Ko=JV5gibBeukg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2018-01-31 16:27                   ` Felix Kuehling
     [not found]                     ` <31443990-b612-e9cc-ec07-054b940c8c25-5C7GfCeVMHo@public.gmane.org>
2018-02-06  8:39                       ` Oded Gabbay
2018-01-04 22:17   ` [PATCH 9/9] drm/amdgpu: Enable KFD initialization on dGPUs Felix Kuehling
     [not found]     ` <1515104268-25087-10-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2018-01-31 15:25       ` Oded Gabbay
     [not found]         ` <CAFCwf11WWuHydSRBu3Pk8-jFLgoxJ7k0GDfuO-HWRjpvSRm5xQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2018-01-31 15:28           ` Christian König
     [not found]             ` <5881ecb1-3d76-9783-2b60-5b43b5547a3d-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2018-01-31 15:31               ` Oded Gabbay
     [not found]                 ` <CAFCwf10--hDY=0zFUaSM9+fZWXuk8h4AU5-PE+_0+adCAYJ34Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2018-01-31 15:34                   ` Christian König
     [not found]                     ` <6ea9aa7d-f8c5-e6a5-a492-7506055527c6-5C7GfCeVMHo@public.gmane.org>
2018-01-31 15:50                       ` Oded Gabbay
2018-01-31 16:33           ` Felix Kuehling
2018-01-27  0:35   ` [PATCH 0/9] KFD dGPU initialization Felix Kuehling
     [not found]     ` <f84a6f6f-0985-a161-f989-b41021085039-5C7GfCeVMHo@public.gmane.org>
2018-01-27 11:31       ` Oded Gabbay
     [not found]         ` <CAFCwf12vCku6JoH3Rcp1-+vQNzqX8zoO_2SG=UhAtTqsYn3SkA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2018-01-27 22:19           ` Kuehling, Felix
2018-01-04 22:17 ` [PATCH 8/9] drm/amdkfd: Add dGPU device IDs and device info Felix Kuehling
2018-01-31 15:20   ` Oded Gabbay
2018-01-31 15:20     ` Oded Gabbay
2018-01-31 16:29     ` Felix Kuehling
2018-01-31 16:29       ` Felix Kuehling

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.