All of lore.kernel.org
 help / color / mirror / Atom feed
From: Robin Murphy <robin.murphy@arm.com>
To: John Garry <john.garry@huawei.com>,
	will@kernel.org, mark.rutland@arm.com,
	linux-arm-kernel@lists.infradead.org
Cc: devicetree@vger.kernel.org, tuanphan@os.amperecomputing.com,
	tsahee@amazon.com, harb@amperecomputing.com, james.yang@arm.com,
	alisaidi@amazon.com
Subject: Re: [PATCH 2/2] perf: Add Arm CMN-600 PMU driver
Date: Mon, 7 Sep 2020 14:27:33 +0100	[thread overview]
Message-ID: <c8803e30-7b67-bdb1-30ac-fe1137e4bd93@arm.com> (raw)
In-Reply-To: <512b3de2-9ece-baaf-ef93-246a8af9464f@huawei.com>

Hi John,

On 2020-09-07 12:23, John Garry wrote:
> On 05/08/2020 13:56, Robin Murphy wrote:
>> Initial driver for PMU event counting on the Arm CMN-600 interconnect.
>> CMN sports an obnoxiously complex distributed PMU system as part of
>> its debug and trace features, which can do all manner of things like
>> sampling, cross-triggering and generating CoreSight trace. This driver
>> covers the PMU functionality, plus the relevant aspects of watchpoints
>> for simply counting matching flits.
>>
> 
> Hi Robin,
> 
> I had a quick look at the TRM (nearly 1000 pages, wow) for this IP, and 
> could not see anything obvious to know the implementation id (similar 
> problem to SMMU PMCG today). Do you know if there is anything 
> appropiate? Is por_hnf_ppu_iidr any use here?

Note that PPU is a distinct thing with its own architecture (apparently 
https://developer.arm.com/documentation/den0051/latest/ if you're 
interested), so that particular IIDR is merely describing the 
implementation of the PPU interface, which AFAICS is not necessarily 
unique to CMN-600. In terms of the PMU and CMN overall there is no 
architecture to speak of, the implementation just is what it is :)

I'd expect that some future interconnect IPs are likely to be similar 
enough to share most of this driver, but those should get their own 
top-level compatibles since they will not be CMN-600. Thus I don't 
believe that identification is going to be too much of a concern.

> Just a couple of more questions for now...
> 
> [snip]
> 
>> +
>> +static struct attribute *arm_cmn_event_attrs[] = {
>> +    CMN_EVENT_DTC(cycles),
>> +
>> +    /*
>> +     * DVM node events conflict with HN-I events in the equivalent PMU
>> +     * slot, but our lazy short-cut of using the DTM counter index for
>> +     * the PMU index as well happens to avoid that by construction.
>> +     */
>> +    CMN_EVENT_DVM(rxreq_dvmop,    0x01),
>> +    CMN_EVENT_DVM(rxreq_dvmsync,    0x02),
>> +    CMN_EVENT_DVM(rxreq_dvmop_vmid_filtered, 0x03),
>> +    CMN_EVENT_DVM(rxreq_retried,    0x04),
>> +    _CMN_EVENT_DVM(rxreq_trk_occupancy_all, 0x05, 0),
>> +    _CMN_EVENT_DVM(rxreq_trk_occupancy_dvmop, 0x05, 1),
>> +    _CMN_EVENT_DVM(rxreq_trk_occupancy_dvmsync, 0x05, 2),
>> +
>> +    CMN_EVENT_HNF(cache_miss,    0x01),
>> +    CMN_EVENT_HNF(slc_sf_cache_access, 0x02),
>> +    CMN_EVENT_HNF(cache_fill,    0x03),
>> +    CMN_EVENT_HNF(pocq_retry,    0x04),
>> +    CMN_EVENT_HNF(pocq_reqs_recvd,    0x05),
>> +    CMN_EVENT_HNF(sf_hit,        0x06),
>> +    CMN_EVENT_HNF(sf_evictions,    0x07),
>> +    CMN_EVENT_HNF(dir_snoops_sent,    0x08),
>> +    CMN_EVENT_HNF(brd_snoops_sent,    0x09),
>> +    CMN_EVENT_HNF(slc_eviction,    0x0a),
>> +    CMN_EVENT_HNF(slc_fill_invalid_way, 0x0b),
>> +    CMN_EVENT_HNF(mc_retries,    0x0c),
>> +    CMN_EVENT_HNF(mc_reqs,        0x0d),
>> +    CMN_EVENT_HNF(qos_hh_retry,    0x0e),
>> +    _CMN_EVENT_HNF(qos_pocq_occupancy_all, 0x0f, 0),
>> +    _CMN_EVENT_HNF(qos_pocq_occupancy_read, 0x0f, 1),
>> +    _CMN_EVENT_HNF(qos_pocq_occupancy_write, 0x0f, 2),
>> +    _CMN_EVENT_HNF(qos_pocq_occupancy_atomic, 0x0f, 3),
>> +    _CMN_EVENT_HNF(qos_pocq_occupancy_stash, 0x0f, 4),
>> +    CMN_EVENT_HNF(pocq_addrhaz,    0x10),
>> +    CMN_EVENT_HNF(pocq_atomic_addrhaz, 0x11),
>> +    CMN_EVENT_HNF(ld_st_swp_adq_full, 0x12),
>> +    CMN_EVENT_HNF(cmp_adq_full,    0x13),
>> +    CMN_EVENT_HNF(txdat_stall,    0x14),
>> +    CMN_EVENT_HNF(txrsp_stall,    0x15),
>> +    CMN_EVENT_HNF(seq_full,        0x16),
>> +    CMN_EVENT_HNF(seq_hit,        0x17),
>> +    CMN_EVENT_HNF(snp_sent,        0x18),
>> +    CMN_EVENT_HNF(sfbi_dir_snp_sent, 0x19),
>> +    CMN_EVENT_HNF(sfbi_brd_snp_sent, 0x1a),
>> +    CMN_EVENT_HNF(snp_sent_untrk,    0x1b),
>> +    CMN_EVENT_HNF(intv_dirty,    0x1c),
>> +    CMN_EVENT_HNF(stash_snp_sent,    0x1d),
>> +    CMN_EVENT_HNF(stash_data_pull,    0x1e),
>> +    CMN_EVENT_HNF(snp_fwded,    0x1f),
>> +
>> +    CMN_EVENT_HNI(rrt_rd_occ_cnt_ovfl, 0x20),
>> +    CMN_EVENT_HNI(rrt_wr_occ_cnt_ovfl, 0x21),
>> +    CMN_EVENT_HNI(rdt_rd_occ_cnt_ovfl, 0x22),
>> +    CMN_EVENT_HNI(rdt_wr_occ_cnt_ovfl, 0x23),
>> +    CMN_EVENT_HNI(wdb_occ_cnt_ovfl,    0x24),
>> +    CMN_EVENT_HNI(rrt_rd_alloc,    0x25),
>> +    CMN_EVENT_HNI(rrt_wr_alloc,    0x26),
>> +    CMN_EVENT_HNI(rdt_rd_alloc,    0x27),
>> +    CMN_EVENT_HNI(rdt_wr_alloc,    0x28),
>> +    CMN_EVENT_HNI(wdb_alloc,    0x29),
>> +    CMN_EVENT_HNI(txrsp_retryack,    0x2a),
>> +    CMN_EVENT_HNI(arvalid_no_arready, 0x2b),
>> +    CMN_EVENT_HNI(arready_no_arvalid, 0x2c),
>> +    CMN_EVENT_HNI(awvalid_no_awready, 0x2d),
>> +    CMN_EVENT_HNI(awready_no_awvalid, 0x2e),
>> +    CMN_EVENT_HNI(wvalid_no_wready,    0x2f),
>> +    CMN_EVENT_HNI(txdat_stall,    0x30),
>> +    CMN_EVENT_HNI(nonpcie_serialization, 0x31),
>> +    CMN_EVENT_HNI(pcie_serialization, 0x32),
>> +
>> +    CMN_EVENT_XP(txflit_valid,    0x01),
>> +    CMN_EVENT_XP(txflit_stall,    0x02),
>> +    CMN_EVENT_XP(partial_dat_flit,    0x03),
>> +    /* We treat watchpoints as a special made-up class of XP events */
>> +    CMN_EVENT_ATTR(watchpoint_up, CMN_TYPE_WP, 0, 0),
>> +    CMN_EVENT_ATTR(watchpoint_down, CMN_TYPE_WP, 2, 0),
>> +
>> +    CMN_EVENT_SBSX(rd_req,        0x01),
>> +    CMN_EVENT_SBSX(wr_req,        0x02),
>> +    CMN_EVENT_SBSX(cmo_req,        0x03),
>> +    CMN_EVENT_SBSX(txrsp_retryack,    0x04),
>> +    CMN_EVENT_SBSX(txdat_flitv,    0x05),
>> +    CMN_EVENT_SBSX(txrsp_flitv,    0x06),
>> +    CMN_EVENT_SBSX(rd_req_trkr_occ_cnt_ovfl, 0x11),
>> +    CMN_EVENT_SBSX(wr_req_trkr_occ_cnt_ovfl, 0x12),
>> +    CMN_EVENT_SBSX(cmo_req_trkr_occ_cnt_ovfl, 0x13),
>> +    CMN_EVENT_SBSX(wdb_occ_cnt_ovfl, 0x14),
>> +    CMN_EVENT_SBSX(rd_axi_trkr_occ_cnt_ovfl, 0x15),
>> +    CMN_EVENT_SBSX(cmo_axi_trkr_occ_cnt_ovfl, 0x16),
>> +    CMN_EVENT_SBSX(arvalid_no_arready, 0x21),
>> +    CMN_EVENT_SBSX(awvalid_no_awready, 0x22),
>> +    CMN_EVENT_SBSX(wvalid_no_wready, 0x23),
>> +    CMN_EVENT_SBSX(txdat_stall,    0x24),
>> +    CMN_EVENT_SBSX(txrsp_stall,    0x25),
>> +
>> +    CMN_EVENT_RNID(s0_rdata_beats,    0x01),
>> +    CMN_EVENT_RNID(s1_rdata_beats,    0x02),
>> +    CMN_EVENT_RNID(s2_rdata_beats,    0x03),
>> +    CMN_EVENT_RNID(rxdat_flits,    0x04),
>> +    CMN_EVENT_RNID(txdat_flits,    0x05),
>> +    CMN_EVENT_RNID(txreq_flits_total, 0x06),
>> +    CMN_EVENT_RNID(txreq_flits_retried, 0x07),
>> +    CMN_EVENT_RNID(rrt_occ_ovfl,    0x08),
>> +    CMN_EVENT_RNID(wrt_occ_ovfl,    0x09),
>> +    CMN_EVENT_RNID(txreq_flits_replayed, 0x0a),
>> +    CMN_EVENT_RNID(wrcancel_sent,    0x0b),
>> +    CMN_EVENT_RNID(s0_wdata_beats,    0x0c),
>> +    CMN_EVENT_RNID(s1_wdata_beats,    0x0d),
>> +    CMN_EVENT_RNID(s2_wdata_beats,    0x0e),
>> +    CMN_EVENT_RNID(rrt_alloc,    0x0f),
>> +    CMN_EVENT_RNID(wrt_alloc,    0x10),
>> +    CMN_EVENT_RNID(rdb_unord,    0x11),
>> +    CMN_EVENT_RNID(rdb_replay,    0x12),
>> +    CMN_EVENT_RNID(rdb_hybrid,    0x13),
>> +    CMN_EVENT_RNID(rdb_ord,        0x14),
>> +
>> +    NULL
>> +};
> 
> Just wondering how does "perf list" look when you have multiple 
> instances of the device in the system?

Probably a giant mess, since I assume every (relevant) event should be 
listed for each instance, much like CPU events on big.LITTLE systems, 
and uncore events on many others (certainly most of the xgene_pmu events 
on my eMAG workstation are repeated up to 8 times).

> [snip]
> 
>> +static int arm_cmn_probe(struct platform_device *pdev)
>> +{
>> +    struct arm_cmn *cmn;
>> +    const char *name;
>> +    static atomic_t id;
>> +    int err, rootnode;
>> +
>> +    cmn = devm_kzalloc(&pdev->dev, sizeof(*cmn), GFP_KERNEL);
>> +    if (!cmn)
>> +        return -ENOMEM;
>> +
>> +    cmn->dev = &pdev->dev;
>> +    platform_set_drvdata(pdev, cmn);
>> +
>> +    if (has_acpi_companion(cmn->dev))
>> +        rootnode = arm_cmn_acpi_probe(pdev, cmn);
>> +    else
>> +        rootnode = arm_cmn_of_probe(pdev, cmn);
>> +    if (rootnode < 0)
>> +        return rootnode;
>> +
>> +    err = arm_cmn_discover(cmn, rootnode);
>> +    if (err)
>> +        return err;
>> +
>> +    err = arm_cmn_init_dtcs(cmn);
>> +    if (err)
>> +        return err;
>> +
>> +    err = arm_cmn_init_irqs(cmn);
>> +    if (err)
>> +        return err;
>> +
>> +    cmn->cpu = raw_smp_processor_id();
>> +    cmn->pmu = (struct pmu) {
>> +        .module = THIS_MODULE,
>> +        .attr_groups = arm_cmn_attr_groups,
>> +        .capabilities = PERF_PMU_CAP_NO_EXCLUDE,
>> +        .task_ctx_nr = perf_invalid_context,
>> +        .pmu_enable = arm_cmn_pmu_enable,
>> +        .pmu_disable = arm_cmn_pmu_disable,
>> +        .event_init = arm_cmn_event_init,
>> +        .add = arm_cmn_event_add,
>> +        .del = arm_cmn_event_del,
>> +        .start = arm_cmn_event_start,
>> +        .stop = arm_cmn_event_stop,
>> +        .read = arm_cmn_event_read,
>> +        .start_txn = arm_cmn_start_txn,
>> +        .commit_txn = arm_cmn_commit_txn,
>> +        .cancel_txn = arm_cmn_end_txn,
>> +    };
>> +
>> +    if (atomic_fetch_inc(&id) == 0) {
>> +        name = "arm_cmn";
>> +    } else {
>> +        name = devm_kasprintf(cmn->dev, GFP_KERNEL, "arm_cmn_%d", 
>> atomic_read(&id));
> 
> How is userspace supposed to know which device is which when we have 
> multiple instances? I mean, doesn't this depend on arbitary probe ordering?

Right, I had it in my head that there should be enough information in 
sysfs to link the PMU device back to its parent platform device, but 
apparently that isn't the case :(

Furthermore, you've now got me thinking I might want to rejig the naming 
scheme anyway - if I did have two instances named "foo" and "foo_1", is 
perf tool going to interpret "-e foo/event/" as specifying a common 
prefix and create the event on both PMUs even if I only wanted it on the 
first one?

Robin.

WARNING: multiple messages have this Message-ID (diff)
From: Robin Murphy <robin.murphy@arm.com>
To: John Garry <john.garry@huawei.com>,
	will@kernel.org, mark.rutland@arm.com,
	linux-arm-kernel@lists.infradead.org
Cc: devicetree@vger.kernel.org, tuanphan@os.amperecomputing.com,
	tsahee@amazon.com, harb@amperecomputing.com, james.yang@arm.com,
	alisaidi@amazon.com
Subject: Re: [PATCH 2/2] perf: Add Arm CMN-600 PMU driver
Date: Mon, 7 Sep 2020 14:27:33 +0100	[thread overview]
Message-ID: <c8803e30-7b67-bdb1-30ac-fe1137e4bd93@arm.com> (raw)
In-Reply-To: <512b3de2-9ece-baaf-ef93-246a8af9464f@huawei.com>

Hi John,

On 2020-09-07 12:23, John Garry wrote:
> On 05/08/2020 13:56, Robin Murphy wrote:
>> Initial driver for PMU event counting on the Arm CMN-600 interconnect.
>> CMN sports an obnoxiously complex distributed PMU system as part of
>> its debug and trace features, which can do all manner of things like
>> sampling, cross-triggering and generating CoreSight trace. This driver
>> covers the PMU functionality, plus the relevant aspects of watchpoints
>> for simply counting matching flits.
>>
> 
> Hi Robin,
> 
> I had a quick look at the TRM (nearly 1000 pages, wow) for this IP, and 
> could not see anything obvious to know the implementation id (similar 
> problem to SMMU PMCG today). Do you know if there is anything 
> appropiate? Is por_hnf_ppu_iidr any use here?

Note that PPU is a distinct thing with its own architecture (apparently 
https://developer.arm.com/documentation/den0051/latest/ if you're 
interested), so that particular IIDR is merely describing the 
implementation of the PPU interface, which AFAICS is not necessarily 
unique to CMN-600. In terms of the PMU and CMN overall there is no 
architecture to speak of, the implementation just is what it is :)

I'd expect that some future interconnect IPs are likely to be similar 
enough to share most of this driver, but those should get their own 
top-level compatibles since they will not be CMN-600. Thus I don't 
believe that identification is going to be too much of a concern.

> Just a couple of more questions for now...
> 
> [snip]
> 
>> +
>> +static struct attribute *arm_cmn_event_attrs[] = {
>> +    CMN_EVENT_DTC(cycles),
>> +
>> +    /*
>> +     * DVM node events conflict with HN-I events in the equivalent PMU
>> +     * slot, but our lazy short-cut of using the DTM counter index for
>> +     * the PMU index as well happens to avoid that by construction.
>> +     */
>> +    CMN_EVENT_DVM(rxreq_dvmop,    0x01),
>> +    CMN_EVENT_DVM(rxreq_dvmsync,    0x02),
>> +    CMN_EVENT_DVM(rxreq_dvmop_vmid_filtered, 0x03),
>> +    CMN_EVENT_DVM(rxreq_retried,    0x04),
>> +    _CMN_EVENT_DVM(rxreq_trk_occupancy_all, 0x05, 0),
>> +    _CMN_EVENT_DVM(rxreq_trk_occupancy_dvmop, 0x05, 1),
>> +    _CMN_EVENT_DVM(rxreq_trk_occupancy_dvmsync, 0x05, 2),
>> +
>> +    CMN_EVENT_HNF(cache_miss,    0x01),
>> +    CMN_EVENT_HNF(slc_sf_cache_access, 0x02),
>> +    CMN_EVENT_HNF(cache_fill,    0x03),
>> +    CMN_EVENT_HNF(pocq_retry,    0x04),
>> +    CMN_EVENT_HNF(pocq_reqs_recvd,    0x05),
>> +    CMN_EVENT_HNF(sf_hit,        0x06),
>> +    CMN_EVENT_HNF(sf_evictions,    0x07),
>> +    CMN_EVENT_HNF(dir_snoops_sent,    0x08),
>> +    CMN_EVENT_HNF(brd_snoops_sent,    0x09),
>> +    CMN_EVENT_HNF(slc_eviction,    0x0a),
>> +    CMN_EVENT_HNF(slc_fill_invalid_way, 0x0b),
>> +    CMN_EVENT_HNF(mc_retries,    0x0c),
>> +    CMN_EVENT_HNF(mc_reqs,        0x0d),
>> +    CMN_EVENT_HNF(qos_hh_retry,    0x0e),
>> +    _CMN_EVENT_HNF(qos_pocq_occupancy_all, 0x0f, 0),
>> +    _CMN_EVENT_HNF(qos_pocq_occupancy_read, 0x0f, 1),
>> +    _CMN_EVENT_HNF(qos_pocq_occupancy_write, 0x0f, 2),
>> +    _CMN_EVENT_HNF(qos_pocq_occupancy_atomic, 0x0f, 3),
>> +    _CMN_EVENT_HNF(qos_pocq_occupancy_stash, 0x0f, 4),
>> +    CMN_EVENT_HNF(pocq_addrhaz,    0x10),
>> +    CMN_EVENT_HNF(pocq_atomic_addrhaz, 0x11),
>> +    CMN_EVENT_HNF(ld_st_swp_adq_full, 0x12),
>> +    CMN_EVENT_HNF(cmp_adq_full,    0x13),
>> +    CMN_EVENT_HNF(txdat_stall,    0x14),
>> +    CMN_EVENT_HNF(txrsp_stall,    0x15),
>> +    CMN_EVENT_HNF(seq_full,        0x16),
>> +    CMN_EVENT_HNF(seq_hit,        0x17),
>> +    CMN_EVENT_HNF(snp_sent,        0x18),
>> +    CMN_EVENT_HNF(sfbi_dir_snp_sent, 0x19),
>> +    CMN_EVENT_HNF(sfbi_brd_snp_sent, 0x1a),
>> +    CMN_EVENT_HNF(snp_sent_untrk,    0x1b),
>> +    CMN_EVENT_HNF(intv_dirty,    0x1c),
>> +    CMN_EVENT_HNF(stash_snp_sent,    0x1d),
>> +    CMN_EVENT_HNF(stash_data_pull,    0x1e),
>> +    CMN_EVENT_HNF(snp_fwded,    0x1f),
>> +
>> +    CMN_EVENT_HNI(rrt_rd_occ_cnt_ovfl, 0x20),
>> +    CMN_EVENT_HNI(rrt_wr_occ_cnt_ovfl, 0x21),
>> +    CMN_EVENT_HNI(rdt_rd_occ_cnt_ovfl, 0x22),
>> +    CMN_EVENT_HNI(rdt_wr_occ_cnt_ovfl, 0x23),
>> +    CMN_EVENT_HNI(wdb_occ_cnt_ovfl,    0x24),
>> +    CMN_EVENT_HNI(rrt_rd_alloc,    0x25),
>> +    CMN_EVENT_HNI(rrt_wr_alloc,    0x26),
>> +    CMN_EVENT_HNI(rdt_rd_alloc,    0x27),
>> +    CMN_EVENT_HNI(rdt_wr_alloc,    0x28),
>> +    CMN_EVENT_HNI(wdb_alloc,    0x29),
>> +    CMN_EVENT_HNI(txrsp_retryack,    0x2a),
>> +    CMN_EVENT_HNI(arvalid_no_arready, 0x2b),
>> +    CMN_EVENT_HNI(arready_no_arvalid, 0x2c),
>> +    CMN_EVENT_HNI(awvalid_no_awready, 0x2d),
>> +    CMN_EVENT_HNI(awready_no_awvalid, 0x2e),
>> +    CMN_EVENT_HNI(wvalid_no_wready,    0x2f),
>> +    CMN_EVENT_HNI(txdat_stall,    0x30),
>> +    CMN_EVENT_HNI(nonpcie_serialization, 0x31),
>> +    CMN_EVENT_HNI(pcie_serialization, 0x32),
>> +
>> +    CMN_EVENT_XP(txflit_valid,    0x01),
>> +    CMN_EVENT_XP(txflit_stall,    0x02),
>> +    CMN_EVENT_XP(partial_dat_flit,    0x03),
>> +    /* We treat watchpoints as a special made-up class of XP events */
>> +    CMN_EVENT_ATTR(watchpoint_up, CMN_TYPE_WP, 0, 0),
>> +    CMN_EVENT_ATTR(watchpoint_down, CMN_TYPE_WP, 2, 0),
>> +
>> +    CMN_EVENT_SBSX(rd_req,        0x01),
>> +    CMN_EVENT_SBSX(wr_req,        0x02),
>> +    CMN_EVENT_SBSX(cmo_req,        0x03),
>> +    CMN_EVENT_SBSX(txrsp_retryack,    0x04),
>> +    CMN_EVENT_SBSX(txdat_flitv,    0x05),
>> +    CMN_EVENT_SBSX(txrsp_flitv,    0x06),
>> +    CMN_EVENT_SBSX(rd_req_trkr_occ_cnt_ovfl, 0x11),
>> +    CMN_EVENT_SBSX(wr_req_trkr_occ_cnt_ovfl, 0x12),
>> +    CMN_EVENT_SBSX(cmo_req_trkr_occ_cnt_ovfl, 0x13),
>> +    CMN_EVENT_SBSX(wdb_occ_cnt_ovfl, 0x14),
>> +    CMN_EVENT_SBSX(rd_axi_trkr_occ_cnt_ovfl, 0x15),
>> +    CMN_EVENT_SBSX(cmo_axi_trkr_occ_cnt_ovfl, 0x16),
>> +    CMN_EVENT_SBSX(arvalid_no_arready, 0x21),
>> +    CMN_EVENT_SBSX(awvalid_no_awready, 0x22),
>> +    CMN_EVENT_SBSX(wvalid_no_wready, 0x23),
>> +    CMN_EVENT_SBSX(txdat_stall,    0x24),
>> +    CMN_EVENT_SBSX(txrsp_stall,    0x25),
>> +
>> +    CMN_EVENT_RNID(s0_rdata_beats,    0x01),
>> +    CMN_EVENT_RNID(s1_rdata_beats,    0x02),
>> +    CMN_EVENT_RNID(s2_rdata_beats,    0x03),
>> +    CMN_EVENT_RNID(rxdat_flits,    0x04),
>> +    CMN_EVENT_RNID(txdat_flits,    0x05),
>> +    CMN_EVENT_RNID(txreq_flits_total, 0x06),
>> +    CMN_EVENT_RNID(txreq_flits_retried, 0x07),
>> +    CMN_EVENT_RNID(rrt_occ_ovfl,    0x08),
>> +    CMN_EVENT_RNID(wrt_occ_ovfl,    0x09),
>> +    CMN_EVENT_RNID(txreq_flits_replayed, 0x0a),
>> +    CMN_EVENT_RNID(wrcancel_sent,    0x0b),
>> +    CMN_EVENT_RNID(s0_wdata_beats,    0x0c),
>> +    CMN_EVENT_RNID(s1_wdata_beats,    0x0d),
>> +    CMN_EVENT_RNID(s2_wdata_beats,    0x0e),
>> +    CMN_EVENT_RNID(rrt_alloc,    0x0f),
>> +    CMN_EVENT_RNID(wrt_alloc,    0x10),
>> +    CMN_EVENT_RNID(rdb_unord,    0x11),
>> +    CMN_EVENT_RNID(rdb_replay,    0x12),
>> +    CMN_EVENT_RNID(rdb_hybrid,    0x13),
>> +    CMN_EVENT_RNID(rdb_ord,        0x14),
>> +
>> +    NULL
>> +};
> 
> Just wondering how does "perf list" look when you have multiple 
> instances of the device in the system?

Probably a giant mess, since I assume every (relevant) event should be 
listed for each instance, much like CPU events on big.LITTLE systems, 
and uncore events on many others (certainly most of the xgene_pmu events 
on my eMAG workstation are repeated up to 8 times).

> [snip]
> 
>> +static int arm_cmn_probe(struct platform_device *pdev)
>> +{
>> +    struct arm_cmn *cmn;
>> +    const char *name;
>> +    static atomic_t id;
>> +    int err, rootnode;
>> +
>> +    cmn = devm_kzalloc(&pdev->dev, sizeof(*cmn), GFP_KERNEL);
>> +    if (!cmn)
>> +        return -ENOMEM;
>> +
>> +    cmn->dev = &pdev->dev;
>> +    platform_set_drvdata(pdev, cmn);
>> +
>> +    if (has_acpi_companion(cmn->dev))
>> +        rootnode = arm_cmn_acpi_probe(pdev, cmn);
>> +    else
>> +        rootnode = arm_cmn_of_probe(pdev, cmn);
>> +    if (rootnode < 0)
>> +        return rootnode;
>> +
>> +    err = arm_cmn_discover(cmn, rootnode);
>> +    if (err)
>> +        return err;
>> +
>> +    err = arm_cmn_init_dtcs(cmn);
>> +    if (err)
>> +        return err;
>> +
>> +    err = arm_cmn_init_irqs(cmn);
>> +    if (err)
>> +        return err;
>> +
>> +    cmn->cpu = raw_smp_processor_id();
>> +    cmn->pmu = (struct pmu) {
>> +        .module = THIS_MODULE,
>> +        .attr_groups = arm_cmn_attr_groups,
>> +        .capabilities = PERF_PMU_CAP_NO_EXCLUDE,
>> +        .task_ctx_nr = perf_invalid_context,
>> +        .pmu_enable = arm_cmn_pmu_enable,
>> +        .pmu_disable = arm_cmn_pmu_disable,
>> +        .event_init = arm_cmn_event_init,
>> +        .add = arm_cmn_event_add,
>> +        .del = arm_cmn_event_del,
>> +        .start = arm_cmn_event_start,
>> +        .stop = arm_cmn_event_stop,
>> +        .read = arm_cmn_event_read,
>> +        .start_txn = arm_cmn_start_txn,
>> +        .commit_txn = arm_cmn_commit_txn,
>> +        .cancel_txn = arm_cmn_end_txn,
>> +    };
>> +
>> +    if (atomic_fetch_inc(&id) == 0) {
>> +        name = "arm_cmn";
>> +    } else {
>> +        name = devm_kasprintf(cmn->dev, GFP_KERNEL, "arm_cmn_%d", 
>> atomic_read(&id));
> 
> How is userspace supposed to know which device is which when we have 
> multiple instances? I mean, doesn't this depend on arbitary probe ordering?

Right, I had it in my head that there should be enough information in 
sysfs to link the PMU device back to its parent platform device, but 
apparently that isn't the case :(

Furthermore, you've now got me thinking I might want to rejig the naming 
scheme anyway - if I did have two instances named "foo" and "foo_1", is 
perf tool going to interpret "-e foo/event/" as specifying a common 
prefix and create the event on both PMUs even if I only wanted it on the 
first one?

Robin.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2020-09-07 13:29 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-05 12:56 [PATCH 0/2] Arm CMN-600 PMU driver Robin Murphy
2020-08-05 12:56 ` Robin Murphy
2020-08-05 12:56 ` [PATCH 1/2] perf: Add Arm CMN-600 DT binding Robin Murphy
2020-08-05 12:56   ` Robin Murphy
2020-08-13  2:34   ` Rob Herring
2020-08-13  2:34     ` Rob Herring
2020-08-05 12:56 ` [PATCH 2/2] perf: Add Arm CMN-600 PMU driver Robin Murphy
2020-08-05 12:56   ` Robin Murphy
2020-09-07 10:45   ` Will Deacon
2020-09-07 10:45     ` Will Deacon
2020-09-07 11:23   ` John Garry
2020-09-07 11:23     ` John Garry
2020-09-07 13:27     ` Robin Murphy [this message]
2020-09-07 13:27       ` Robin Murphy
2020-09-07 14:29       ` John Garry
2020-09-07 14:29         ` John Garry

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c8803e30-7b67-bdb1-30ac-fe1137e4bd93@arm.com \
    --to=robin.murphy@arm.com \
    --cc=alisaidi@amazon.com \
    --cc=devicetree@vger.kernel.org \
    --cc=harb@amperecomputing.com \
    --cc=james.yang@arm.com \
    --cc=john.garry@huawei.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=mark.rutland@arm.com \
    --cc=tsahee@amazon.com \
    --cc=tuanphan@os.amperecomputing.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.