Kernel perf counter support (for apple M1 and others)

* Kernel perf counter support (for apple M1 and others)
@ 2022-04-01  1:39 Yichao Yu
  2022-04-13 12:58 ` Yichao Yu
  2022-04-18 12:01 ` Marc Zyngier
  0 siblings, 2 replies; 7+ messages in thread
From: Yichao Yu @ 2022-04-01  1:39 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: khuong, will, mark.rutland, Frank.li, zhangshaokun, liuqi115,
	john.garry, mathieu.poirier, leo.yan, marc.zyngier

Hi,

I am playing with the performance counters on the apple M1 chip from
linux with the hope that it could help making userspace tools like
perf and rr works on the M1. However, I was told that none of these
info should go into the kernel (not even raw event names) and the
userspace should only use the raw event numbers instead of
PERF_TYPE_HARDWARE even for events that have a canonical counterpart.

Although I'm not planning to submit any kernel patches anytime soon
and I'm mostly interested in running the test right now, I do want to
know what I should expect in the long term on the userspace side. I
was told to ask about this on "the list" (and I'm hoping this is the
right one after browsing through MAINTAINERS) instead. There are a few
issues/questions, not all of which are related to M1/asymmetric
systems. For context, see
https://oftc.irclog.whitequark.org/asahi-dev/2022-03-30 (there also
happens to be no other discussion on the channel that day)

1. Is it acceptable (to either kernel or perf source) to submit
patches that are based on a14.plist from macOS. I have personally
never looked at it but if it is acceptable then there's little point
doing the experiment I was doing (apart from the fun doing so and as a
practice to understand the system).

2. Should the kernel provide names for hardware events? Here I'm
talking about things under
`/sys/bus/event_source/devices/<pmu>/events` which I assume is
provided by the kernel (that or my understanding of sysfs has been
fundamentally wrong/out-of-date...). Based on the fact that the
current pmu kernel driver for the M1 does provide this and this
comment https://github.com/torvalds/linux/blob/e8b767f5e04097aaedcd6e06e2270f9fe5282696/drivers/perf/apple_m1_cpu_pmu.c#L31
I assume it's desired. This would also agree with what I've observed
on other (including non-x86) systems. If this is the case, I assume
the kernel driver for the M1 PMU isn't fully "done" yet.

3. For counting events on a system with asymmetric cores.
    I understand that if the system contains multiple processors of
different characteristics, it may not make sense to provide a counter
that counts events on both (or all) types of cores. However, there are
events (PERF_COUNT_HW_INSTRUCTIONS and
PERF_COUNT_HW_BRANCH_INSTRUCTIONS at the least) that shouldn't really
be affected by this (and in fact, any counters that counts events
visible directly to the software/userspace). I want to even say that
branch misses/cache reference/misses might be in this category as well
although certainly not as clear cut.

4. There are other events that may not make as much sense to combine
(cycles for example). However, I feel like a combined cycle count
isn't going to be much tricker to use given that the cycle count on a
single core is still affected by frequency scaling and it can still be
used correctly by pinning the thread.

The main reasons I'm asking about 3 and 4 is that
1. Right now, even to just count instructions without pinning the
thread, I need to create two counters.
2. Even if the number isn't exactly accurate, it can still be useful
as a general guideline. Right now, even if I just want to do a quick
check, I still need to manually specify a dozen of events in `perf
stat -e` rather than simply using `perf stat` (to make it worse, perf
doesn't even provide any useful warning about it). It is also much
harder to do things generically (which is at least partially because
of the lack of documentation....).

Yichao Yu

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 7+ messages in thread