* Re: [PATCH 0/2] Arm CMN-600 PMU driver
From: Zidenberg, Tsahi @ 2020-09-09  9:34 UTC
  To: Robin Murphy, will, mark.rutland, linux-arm-kernel
  Cc: devicetree, Saidi, Ali, harb, tuanphan, james.yang


On 05/08/2020, 15:57, "Robin Murphy" <robin.murphy@arm.com> wrote:
> At long last, here's an initial cut of the CMN PMU driver that's been
> festering in on-and-off development for years. It should be functionally
> complete now, although there is still scope for improving the current
> implementation (e.g. watchpoint register allocation could be cleverer).

Booted on Graviton2 (using ACPI). The cache-fill counter value (both general and
bynodeid) responds as expected to memory pressure from user processes.

Tested-by: Tsahi Zidenberg <tsahee@amazon.com>

> Of particular interest at this point is the user interface - is it
> sufficiently complete and useful? Is there any need for a third event
> targeting method in between "single node ID" and "all nodes"?

The one thing I'm missing (or didn't find) is a way for the user to determine
the list of relevant node ids for each node type or counter.
Using a wrong nodeid just gave me <not supported> as a counter value.
I don't think that it's required for a first version, though.

---
Thank you!
Tsahi
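
For anyone trying to reproduce the test above, here is a minimal sketch of
the two targeting modes (counting on all relevant nodes versus a single
node via bynodeid/nodeid) written as perf event strings. The PMU name
"arm_cmn", the event name "hnf_cache_fill", the node ID 0x24 and both
helper functions are assumptions made for illustration; check
Documentation/admin-guide/perf/arm-cmn.rst and the sysfs event_source
directory on your system for the real names.

```python
from typing import List, Optional


def cmn_event(event: str, nodeid: Optional[int] = None,
              pmu: str = "arm_cmn") -> str:
    """Build a perf event string for the CMN PMU (hypothetical helper).

    With nodeid=None the event is left untargeted, i.e. the default of
    counting across all relevant nodes. Otherwise bynodeid=1 and the
    node ID are appended to target a single node.
    """
    if nodeid is None:
        return f"{pmu}/{event}/"
    if nodeid < 0:
        raise ValueError("nodeid must be non-negative")
    return f"{pmu}/{event},bynodeid=1,nodeid={nodeid:#x}/"


def perf_stat_args(events: List[str], seconds: int = 1) -> List[str]:
    """Assemble a system-wide `perf stat` command line for the events."""
    args = ["perf", "stat", "-a"]
    for ev in events:
        args += ["-e", ev]
    # Measure for the duration of a short sleep.
    return args + ["--", "sleep", str(seconds)]


if __name__ == "__main__":
    events = [
        cmn_event("hnf_cache_fill"),               # all relevant nodes
        cmn_event("hnf_cache_fill", nodeid=0x24),  # one node (assumed ID)
    ]
    print(" ".join(perf_stat_args(events)))
```

The resulting argument list can be passed to subprocess.run() on a machine
that has the driver loaded. As noted above, a node ID that does not match
the event's node type is reported by perf as <not supported>.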



* Re: [PATCH 0/2] Arm CMN-600 PMU driver
From: Robin Murphy @ 2020-09-18 13:06 UTC
  To: Tuan Phan, Zidenberg, Tsahi
  Cc: will, mark.rutland, linux-arm-kernel, devicetree, Saidi, Ali,
	harb, tuanphan, james.yang

On 2020-09-09 17:54, Tuan Phan wrote:
> On Sep 9, 2020, at 2:34 AM, Zidenberg, Tsahi <tsahee@amazon.com> wrote:
>>
>>
>> On 05/08/2020, 15:57, "Robin Murphy" <robin.murphy@arm.com> wrote:
>>> At long last, here's an initial cut of the CMN PMU driver that's been
>>> festering in on-and-off development for years. It should be functionally
>>> complete now, although there is still scope for improving the current
>>> implementation (e.g. watchpoint register allocation could be cleverer).
>>
>> Booted on Graviton2 (using ACPI). The cache-fill counter value (both general and
>> bynodeid) responds as expected to memory pressure from user processes.
>>
>> Tested-by: Tsahi Zidenberg <tsahee@amazon.com>
>>
> 
> Tested on the Ampere Altra SoC platform. So far I have not seen any issues.
> 
> Tested-by: Tuan Phan <tuanphan@os.amperecomputing.com>
> 
>>> Of particular interest at this point is the user interface - is it
>>> sufficiently complete and useful? Is there any need for a third event
>>> targeting method in between "single node ID" and "all nodes"?
>>
>> The one thing I'm missing (or didn't find) is a way for the user to determine
>> the list of relevant node ids for each node type or counter.
>> Using a wrong nodeid just gave me <not supported> as a counter value.
>> I don't think that it's required for a first version, though.
>>
> 
> Yes, that has bothered me for a while. I either dump the node IDs during the discovery
> phase or keep a node list in an Excel file to get the correct nodeid.

For end users who want to measure straightforward things like SLC usage, 
the default of counting across all relevant nodes is usually what 
they're going to want anyway, and conveniently means they don't need to 
know the details. For targeted measurements, even if you know all the 
node IDs, how will you know which CPU/memory controller/peripheral/etc. 
they correspond to? Just like for CCN and CCI, to do pretty much 
anything *meaningful* you'll need independent information about the 
wider SoC configuration to know what's connected where, and that 
information should inherently give you the node/port IDs anyway.

That said, I did end up writing some more comprehensive debugfs output 
for my own development use, and also based on discussions around 
tangential development cases like CHI cache-stashing research. I've 
included that in v2, so if you think it deserves to be upstream please 
do say so.

Thanks,
Robin.


* [PATCH 0/2] Arm CMN-600 PMU driver
From: Robin Murphy @ 2020-08-05 12:56 UTC
  To: will, mark.rutland, linux-arm-kernel
  Cc: devicetree, alisaidi, tsahee, harb, tuanphan, james.yang

Hi all,

At long last, here's an initial cut of the CMN PMU driver that's been
festering in on-and-off development for years. It should be functionally
complete now, although there is still scope for improving the current
implementation (e.g. watchpoint register allocation could be cleverer).

Of particular interest at this point is the user interface - is it
sufficiently complete and useful? Is there any need for a third event
targeting method in between "single node ID" and "all nodes"? Is it
worth templating watchpoints by port and channel to mimic XP events? Do
we want to expose watchpoint-based bandwidth events as synthetic per-node
events? Not all of that would need to be implemented right now so as to
further stall upstreaming, but I really want to make sure that the initial
interface is solid and any further enhancements can cleanly extend it,
rather than painting ourselves into a corner in terms of ABI support.

Robin.


Robin Murphy (2):
  perf: Add Arm CMN-600 DT binding
  perf: Add Arm CMN-600 PMU driver

 Documentation/admin-guide/perf/arm-cmn.rst    |   65 +
 Documentation/admin-guide/perf/index.rst      |    1 +
 .../devicetree/bindings/perf/arm-cmn.yaml     |   57 +
 drivers/perf/Kconfig                          |    7 +
 drivers/perf/Makefile                         |    1 +
 drivers/perf/arm-cmn.c                        | 1653 +++++++++++++++++
 6 files changed, 1784 insertions(+)
 create mode 100644 Documentation/admin-guide/perf/arm-cmn.rst
 create mode 100644 Documentation/devicetree/bindings/perf/arm-cmn.yaml
 create mode 100644 drivers/perf/arm-cmn.c

-- 
2.28.0.dirty


