* [PATCH 1/3] habanalabs: zero pci counters packet before submit to FW
@ 2021-01-18 18:52 Oded Gabbay
2021-01-18 18:52 ` [PATCH 2/3] habanalabs: fix backward compatibility of idle check Oded Gabbay
2021-01-18 18:52 ` [PATCH 3/3] habanalabs: disable FW events on device removal Oded Gabbay
0 siblings, 2 replies; 3+ messages in thread
From: Oded Gabbay @ 2021-01-18 18:52 UTC (permalink / raw)
To: linux-kernel; +Cc: Ofir Bitton
From: Ofir Bitton <obitton@habana.ai>
The driver does not zero some PCI counter packets before sending them
to the FW. This causes the PI/CI to go out of sync between the driver
and the FW.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
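Note: the effect of reusing a packet without clearing it can be sketched in
plain userspace C. This is only an illustration under stated assumptions:
struct demo_pkt, DEMO_OPCODE_SHIFT and demo_pkt_prepare() are hypothetical
names, and the layout is not the real struct cpucp_packet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a driver-to-firmware packet. This is NOT the
 * real struct cpucp_packet layout. */
struct demo_pkt {
	uint32_t ctl;   /* opcode, shifted into place */
	uint32_t index; /* which counter to read */
	uint64_t value; /* field a previous request may have left set */
};

#define DEMO_OPCODE_SHIFT 16 /* assumed value, for illustration only */

/* Prepare a packet for a new request. Zeroing first guarantees that no
 * field from the previous request leaks into this one. */
static void demo_pkt_prepare(struct demo_pkt *pkt, uint32_t opcode,
			     uint32_t index)
{
	assert(pkt != NULL);
	memset(pkt, 0, sizeof(*pkt));
	pkt->ctl = opcode << DEMO_OPCODE_SHIFT;
	pkt->index = index;
}
```

Calling demo_pkt_prepare() before every request on the same stack variable
mirrors what the patch does with memset() before the tx and replay counter
requests.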
drivers/misc/habanalabs/common/firmware_if.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/misc/habanalabs/common/firmware_if.c b/drivers/misc/habanalabs/common/firmware_if.c
index 20f77f58edef..c9a12980218a 100644
--- a/drivers/misc/habanalabs/common/firmware_if.c
+++ b/drivers/misc/habanalabs/common/firmware_if.c
@@ -402,6 +402,10 @@ int hl_fw_cpucp_pci_counters_get(struct hl_device *hdev,
}
counters->rx_throughput = result;
+ memset(&pkt, 0, sizeof(pkt));
+ pkt.ctl = cpu_to_le32(CPUCP_PACKET_PCIE_THROUGHPUT_GET <<
+ CPUCP_PKT_CTL_OPCODE_SHIFT);
+
/* Fetch PCI tx counter */
pkt.index = cpu_to_le32(cpucp_pcie_throughput_tx);
rc = hdev->asic_funcs->send_cpu_message(hdev, (u32 *) &pkt, sizeof(pkt),
@@ -414,6 +418,7 @@ int hl_fw_cpucp_pci_counters_get(struct hl_device *hdev,
counters->tx_throughput = result;
/* Fetch PCI replay counter */
+ memset(&pkt, 0, sizeof(pkt));
pkt.ctl = cpu_to_le32(CPUCP_PACKET_PCIE_REPLAY_CNT_GET <<
CPUCP_PKT_CTL_OPCODE_SHIFT);
--
2.25.1
* [PATCH 2/3] habanalabs: fix backward compatibility of idle check
From: Oded Gabbay @ 2021-01-18 18:52 UTC (permalink / raw)
To: linux-kernel
Take the lower 32 bits of the driver's 64-bit idle mask and store them
in the legacy 32-bit variable that userspace reads to get the idle
mask.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
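Note: a minimal userspace sketch of the compatibility fix, assuming a reply
structure that carries both a legacy 32-bit mask and an extended 64-bit
mask. struct demo_idle_info, demo_lower_32_bits() and demo_fill_idle_info()
are hypothetical names; demo_lower_32_bits() only imitates the kernel's
lower_32_bits() helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reply layout: a legacy 32-bit mask kept for old userspace
 * next to the newer 64-bit mask. */
struct demo_idle_info {
	uint32_t is_idle;
	uint32_t busy_mask;     /* legacy field read by old userspace */
	uint64_t busy_mask_ext; /* extended field */
};

/* Userspace imitation of the kernel's lower_32_bits() helper. */
static uint32_t demo_lower_32_bits(uint64_t v)
{
	return (uint32_t)(v & 0xffffffffu);
}

/* Fill both masks so that old and new readers see consistent data. */
static void demo_fill_idle_info(struct demo_idle_info *info,
				uint64_t busy_mask)
{
	assert(info != NULL);
	info->busy_mask_ext = busy_mask;
	info->busy_mask = demo_lower_32_bits(busy_mask);
	info->is_idle = (busy_mask == 0);
}
```

Engines reported only in the upper 32 bits remain invisible through the
legacy field, which is why the extended field is still filled as well.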
drivers/misc/habanalabs/common/habanalabs_ioctl.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/misc/habanalabs/common/habanalabs_ioctl.c b/drivers/misc/habanalabs/common/habanalabs_ioctl.c
index 12efbd9d2e3a..d25892d61ec9 100644
--- a/drivers/misc/habanalabs/common/habanalabs_ioctl.c
+++ b/drivers/misc/habanalabs/common/habanalabs_ioctl.c
@@ -133,6 +133,8 @@ static int hw_idle(struct hl_device *hdev, struct hl_info_args *args)
hw_idle.is_idle = hdev->asic_funcs->is_device_idle(hdev,
&hw_idle.busy_engines_mask_ext, NULL);
+ hw_idle.busy_engines_mask =
+ lower_32_bits(hw_idle.busy_engines_mask_ext);
return copy_to_user(out, &hw_idle,
min((size_t) max_size, sizeof(hw_idle))) ? -EFAULT : 0;
--
2.25.1
* [PATCH 3/3] habanalabs: disable FW events on device removal
From: Oded Gabbay @ 2021-01-18 18:52 UTC (permalink / raw)
To: linux-kernel
When the device is removed, we need to make sure the F/W won't send us
any more events, because during the remove process we disable the
interrupts.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
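Note: the ordering constraint can be modeled in a few lines of userspace C.
This is a sketch under the assumption, taken from the code comment in this
patch, that the send path refuses to send once the device is marked
disabled. struct demo_dev, demo_send_fw_msg() and demo_fini() are
hypothetical names, not driver functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device state, reduced to what the ordering depends on. */
struct demo_dev {
	bool disabled;
	int msgs_sent;
};

/* Models a send path that drops messages once the device is disabled. */
static int demo_send_fw_msg(struct demo_dev *dev)
{
	assert(dev != NULL);
	if (dev->disabled)
		return -1;
	dev->msgs_sent++;
	return 0;
}

/* Teardown: notify the firmware first, then mark the device disabled.
 * Swapping the two steps would make the notification a no-op. */
static int demo_fini(struct demo_dev *dev)
{
	int rc = demo_send_fw_msg(dev);

	dev->disabled = true;
	return rc;
}
```

In this model, any message attempted after demo_fini() returns is rejected,
which is the situation the patch avoids by sending the disable request
before setting hdev->disabled.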
drivers/misc/habanalabs/common/device.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/misc/habanalabs/common/device.c b/drivers/misc/habanalabs/common/device.c
index 1ea57d86caa3..69d04eca767f 100644
--- a/drivers/misc/habanalabs/common/device.c
+++ b/drivers/misc/habanalabs/common/device.c
@@ -1487,6 +1487,15 @@ void hl_device_fini(struct hl_device *hdev)
}
}
+ /* Disable PCI access from device F/W so it won't send us additional
+ * interrupts. We disable MSI/MSI-X at the halt_engines function and we
+ * can't have the F/W sending us interrupts after that. We need to
+ * disable the access here because if the device is marked disabled, the
+ * message won't be sent. Also, in case of heartbeat, the device CPU is
+ * marked as disabled so this message won't be sent
+ */
+ hl_fw_send_pci_access_msg(hdev, CPUCP_PACKET_DISABLE_PCI_ACCESS);
+
/* Mark device as disabled */
hdev->disabled = true;
--
2.25.1