Tracing NVMe Driver with BPF missing events

* Tracing NVMe Driver with BPF missing events
@ 2022-05-18 21:31 John Mazzie
  2022-05-21  0:10 ` Andrii Nakryiko
  0 siblings, 1 reply; 7+ messages in thread
From: John Mazzie @ 2022-05-18 21:31 UTC (permalink / raw)
  To: bpf, John Mazzie (jmazzie)

My group at Micron is using BPF and love the tracing capabilities it
provides. We are mainly focused on the storage subsystem and BPF has
been really helpful in understanding how the storage subsystem
interacts with our drives while running applications.

In the process of developing a tool using BPF to trace the nvme
driver, we ran into an issue with some missing events. I wanted to
check to see if this is possibly a bug/limitation that I'm hitting or
if it's expected behavior with heavy tracing. We are trying to trace 2
trace points (nvme_setup_cmd and nvme_complete_rq) around 1M times a
second.
We noticed if we just trace one of the two, we see all the expected
events, but if we trace both at the same time, the nvme_complete_rq
misses events. I am using two different percpu_hash maps to count both
events. One for setup and another for complete. My expectation was
that tracing these events would affect performance, somewhat, but not
miss events. Ultimately the tool would be used to trace nvme latencies
at the driver level by device and process.

My tool was developed using libbpf v0.7, and I've tested on Rocky
Linux 8.5 (Kernel 4.18.0), Ubuntu 20.04 (Kernel 5.4) and Fedora 36
(Kernel 5.17.6) with the same results.

Thanks,
John Mazzie
Principal Storage Solutions Engineer
Micron Technology, Inc.

^ permalink raw reply	[flat|nested] 7+ messages in thread