linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/3] habanalabs: zero pci counters packet before submit to FW
@ 2021-01-18 18:52 Oded Gabbay
  2021-01-18 18:52 ` [PATCH 2/3] habanalabs: fix backward compatibility of idle check Oded Gabbay
  2021-01-18 18:52 ` [PATCH 3/3] habanalabs: disable FW events on device removal Oded Gabbay
  0 siblings, 2 replies; 3+ messages in thread
From: Oded Gabbay @ 2021-01-18 18:52 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ofir Bitton

From: Ofir Bitton <obitton@habana.ai>

Driver does not zero some pci counters packets before sending
to FW. This causes an out of sync PI/CI between driver and FW.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 drivers/misc/habanalabs/common/firmware_if.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/misc/habanalabs/common/firmware_if.c b/drivers/misc/habanalabs/common/firmware_if.c
index 20f77f58edef..c9a12980218a 100644
--- a/drivers/misc/habanalabs/common/firmware_if.c
+++ b/drivers/misc/habanalabs/common/firmware_if.c
@@ -402,6 +402,10 @@ int hl_fw_cpucp_pci_counters_get(struct hl_device *hdev,
 	}
 	counters->rx_throughput = result;
 
+	memset(&pkt, 0, sizeof(pkt));
+	pkt.ctl = cpu_to_le32(CPUCP_PACKET_PCIE_THROUGHPUT_GET <<
+			CPUCP_PKT_CTL_OPCODE_SHIFT);
+
 	/* Fetch PCI tx counter */
 	pkt.index = cpu_to_le32(cpucp_pcie_throughput_tx);
 	rc = hdev->asic_funcs->send_cpu_message(hdev, (u32 *) &pkt, sizeof(pkt),
@@ -414,6 +418,7 @@ int hl_fw_cpucp_pci_counters_get(struct hl_device *hdev,
 	counters->tx_throughput = result;
 
 	/* Fetch PCI replay counter */
+	memset(&pkt, 0, sizeof(pkt));
 	pkt.ctl = cpu_to_le32(CPUCP_PACKET_PCIE_REPLAY_CNT_GET <<
 			CPUCP_PKT_CTL_OPCODE_SHIFT);
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* [PATCH 2/3] habanalabs: fix backward compatibility of idle check
  2021-01-18 18:52 [PATCH 1/3] habanalabs: zero pci counters packet before submit to FW Oded Gabbay
@ 2021-01-18 18:52 ` Oded Gabbay
  2021-01-18 18:52 ` [PATCH 3/3] habanalabs: disable FW events on device removal Oded Gabbay
  1 sibling, 0 replies; 3+ messages in thread
From: Oded Gabbay @ 2021-01-18 18:52 UTC (permalink / raw)
  To: linux-kernel

Need to take the lower 32 bits of the driver's 64-bit idle mask and put
it in the legacy 32-bit variable that the userspace reads to know the
idle mask.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 drivers/misc/habanalabs/common/habanalabs_ioctl.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/misc/habanalabs/common/habanalabs_ioctl.c b/drivers/misc/habanalabs/common/habanalabs_ioctl.c
index 12efbd9d2e3a..d25892d61ec9 100644
--- a/drivers/misc/habanalabs/common/habanalabs_ioctl.c
+++ b/drivers/misc/habanalabs/common/habanalabs_ioctl.c
@@ -133,6 +133,8 @@ static int hw_idle(struct hl_device *hdev, struct hl_info_args *args)
 
 	hw_idle.is_idle = hdev->asic_funcs->is_device_idle(hdev,
 					&hw_idle.busy_engines_mask_ext, NULL);
+	hw_idle.busy_engines_mask =
+			lower_32_bits(hw_idle.busy_engines_mask_ext);
 
 	return copy_to_user(out, &hw_idle,
 		min((size_t) max_size, sizeof(hw_idle))) ? -EFAULT : 0;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* [PATCH 3/3] habanalabs: disable FW events on device removal
  2021-01-18 18:52 [PATCH 1/3] habanalabs: zero pci counters packet before submit to FW Oded Gabbay
  2021-01-18 18:52 ` [PATCH 2/3] habanalabs: fix backward compatibility of idle check Oded Gabbay
@ 2021-01-18 18:52 ` Oded Gabbay
  1 sibling, 0 replies; 3+ messages in thread
From: Oded Gabbay @ 2021-01-18 18:52 UTC (permalink / raw)
  To: linux-kernel

When device is removed, we need to make sure the F/W won't send us
any more events because during the remove process we disable the
interrupts.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 drivers/misc/habanalabs/common/device.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/misc/habanalabs/common/device.c b/drivers/misc/habanalabs/common/device.c
index 1ea57d86caa3..69d04eca767f 100644
--- a/drivers/misc/habanalabs/common/device.c
+++ b/drivers/misc/habanalabs/common/device.c
@@ -1487,6 +1487,15 @@ void hl_device_fini(struct hl_device *hdev)
 		}
 	}
 
+	/* Disable PCI access from device F/W so it won't send us additional
+	 * interrupts. We disable MSI/MSI-X at the halt_engines function and we
+	 * can't have the F/W sending us interrupts after that. We need to
+	 * disable the access here because if the device is marked disable, the
+	 * message won't be send. Also, in case of heartbeat, the device CPU is
+	 * marked as disable so this message won't be sent
+	 */
+	hl_fw_send_pci_access_msg(hdev,	CPUCP_PACKET_DISABLE_PCI_ACCESS);
+
 	/* Mark device as disabled */
 	hdev->disabled = true;
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-01-18 18:54 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-18 18:52 [PATCH 1/3] habanalabs: zero pci counters packet before submit to FW Oded Gabbay
2021-01-18 18:52 ` [PATCH 2/3] habanalabs: fix backward compatibility of idle check Oded Gabbay
2021-01-18 18:52 ` [PATCH 3/3] habanalabs: disable FW events on device removal Oded Gabbay

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).