linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Will Deacon <will@kernel.org>
To: "liuqi (BA)" <liuqi115@huawei.com>
Cc: Linuxarm <linuxarm@huawei.com>,
	mark.rutland@arm.com, bhelgaas@google.com,
	linux-pci@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, zhangshaokun@hisilicon.com
Subject: Re: [PATCH v6 2/2] drivers/perf: hisi: Add driver for HiSilicon PCIe PMU
Date: Mon, 21 Jun 2021 18:59:11 +0100	[thread overview]
Message-ID: <20210621175910.GC29713@willie-the-truck> (raw)
In-Reply-To: <0e7f6601-0d18-18da-f19c-d71ce1bc15dc@huawei.com>

On Fri, Jun 18, 2021 at 05:32:47PM +0800, liuqi (BA) wrote:
> 
> 
> On 2021/6/18 1:57, Will Deacon wrote:
> > On Thu, Jun 17, 2021 at 07:00:26PM +0800, liuqi (BA) wrote:
> > > 
> > > 
> > > On 2021/6/16 21:42, Will Deacon wrote:
> > > > Hi,
> > > > 
> > > > On Wed, Jun 16, 2021 at 09:54:23AM +0800, liuqi (BA) wrote:
> > > > > On 2021/6/15 17:35, Will Deacon wrote:
> > > > > > On Tue, Jun 15, 2021 at 04:57:09PM +0800, liuqi (BA) wrote:
> > > > > > > On 2021/6/12 0:23, Will Deacon wrote:
> > > > > > > > On Mon, May 31, 2021 at 09:32:31PM +0800, Qi Liu wrote:
> > > > > > > > > +	/* Process data to set unit of latency as "us". */
> > > > > > > > > +	if (is_latency_event(idx))
> > > > > > > > > +		return div64_u64(data * us_per_cycle, data_ext);
> > > > > > > > > +
> > > > > > > > > +	if (is_bus_util_event(idx))
> > > > > > > > > +		return div64_u64(data * us_per_cycle, data_ext);
> > > > > > > > > +
> > > > > > > > > +	if (is_buf_util_event(idx))
> > > > > > > > > +		return div64_u64(data, data_ext * us_per_cycle);
> > > > > > > > 
> > > > > > > > Why do we need to do all this division in the kernel? Can't we just expose
> > > > > > > > the underlying values and let userspace figure out what it wants to do with
> > > > > > > > the numbers?
> > > > > > > > 
> > > > > > > Our PMU hardware support 8 sets of counters to count bandwidth, latency and
> > > > > > > utilization events.
> > > > > > > 
> > > > > > > For example, when users set latency event, common counter will count delay
> > > > > > > cycles, and extern counter count number of PCIe packets automaticly. And we
> > > > > > > do not have a event number for counting number of PCIe packets.
> > > > > > > 
> > > > > > > So this division cannot move to userspace tool.
> > > > > > 
> > > > > > Why can't you expose the packet counter as an extra event to userspace?
> > > > > > 
> > > > > Maybe I didn’t express it clearly.
> > > > > 
> > > > > As there is no hardware event number for PCIe packets counting, extern
> > > > > counter count packets *automaticly* when latency events is selected by
> > > > > users.
> > > > > 
> > > > > This means users cannot set "config=0xXX" to start packets counting event.
> > > > > So we can only get the value of counter and extern counter in driver and do
> > > > > the division, then pass the result to userspace.
> > > > 
> > > > I still think it would be ideal if we could expose both values to userspace
> > > > rather than combine them somehow. Hmm. Anyway...
> > > > 
> > > > I struggled to figure out exactly what's being counted from the
> > > > documentation patch (please update that). Please can you explain exactly
> > > > what appears in the HISI_PCIE_CNT and HISI_PCIE_EXT_CNT registers for the
> > > > different modes of operation? Without that, the ratios you've chosen to
> > > > report seem rather arbitrary.
> > > > 
> > > 
> > > PCIe PMU events can be devided into 2 types: one type is counted by
> > > HISI_PCIE_CNT, the other type is counted by HISI_PCIE_EXT_CNT and
> > > HISI_PCIE_CNT, including bandwidth events, latency events, buffer
> > > utilization and bus utilization.
> > > 
> > > if user sets "event=0x10, subevent=0x02", this means "latency of RX memory
> > > read" is selected. HISI_PCIE_CNT counts total delay cycles and
> > > HISI_PCIE_EXT_CNT counts PCIe packets number at the same time. So PMU driver
> > > could obtain average latency by caculating: HISI_PCIE_CNT /
> > > HISI_PCIE_EXT_CNT.
> > > 
> > > if users sets "event=0x04, subevent=0x01", this means bandwidth of RX memory
> > > read is selected. HISI_PCIE_CNT counts total packet data volume and
> > > HISI_PCIE_EXT_CNT counts cycles, so PMU driver could obtain average
> > > bandwidth by caculating: HISI_PCIE_CNT / HISI_PCIE_EXT_CNT.
> > > 
> > > The same logic is used when calculating bus utilization and buffer
> > > utilization. Seems I should add this part in Document patch,I 'll do this in
> > > next version, thanks.
> > > 
> > > > I also couldn't figure out how the latency event works. For example, I was
> > > > assuming it would be a filter (a bit like the length), so you could say
> > > > things like "I'm only interested in packets with a latency higher than x"
> > > > but it doesn't look like it works that way.
> > > > 
> > > > Thanks,
> > > > 
> > > latency is not a filter, PCIe PMU has a group of lactency events, their
> > > event number are within the latency_events_list, and the above explains how
> > > latency events work.
> > > 
> > > PMU drivers have TLP length filter for bandwidth events, users could set
> > > like "I only interested in bandwidth of packets with TLP length bigger than
> > > x".
> > 
> > Thanks for the explanations, I think I get it a bit better now. But I still
> > think we should be exposing both of the values to userspace instead of
> > reporting the ratio from which the individual counters are then
> > unrecoverable.
> > 
> > It will complicate the driver slightly, but can we instead expose the
> > events independently and then allowing scheduling some of them in groups?
> > 
> > That way we just treat HISI_PCIE_CNT and HISI_PCIE_EXT_CNT as separate
> > counters, but with a scheduling constraint that events in a register pair
> > must be in the same group.
> > 
> > Will
> 
> Hi Will,
> 
> I got what you mean, treating HISI_PCIE_CNT and HISI_PCIE_EXT_CNT as
> separate counters is a great idea, but here is a hardware limitation.
> 
> The behavior of HISI_PCIE_EXT_CNT is controlled by hardware logical, so
> HISI_PCIE_EXT_CNT only works when latency/bandwidth/... events number are
> set in HISI_PCIE_EVENT_CTRL. So driver cannot separate this two counters,
> they must work together because of hardware limitation.

Why can't you express this as a scheduling constraint when adding a new
event though? In other words:

  * If you only have a latency event, then program the latency counter and
    "reserve" HISI_PCIE_EXT_CNT
  * If you only have a packets event, then program HISI_PCIE_EXT_CNT and
    "reserve" HISI_PCIE_CNT with a dummy latency event so that HISI_PCIE_EXT_CNT
    actually counts something
  * If the user wants to count both latency and packets, then they need to
    create two events in the same group, at which point you can program
    both HISI_PCIE_CNT and HISI_PCIE_EXT_CNT and return the results in the
    two events

Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2021-06-21 18:18 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-31 13:32 [PATCH v6 0/2] drivers/perf: hisi: Add support for PCIe PMU Qi Liu
2021-05-31 13:32 ` [PATCH v6 1/2] docs: perf: Add description for HiSilicon PCIe PMU driver Qi Liu
2021-05-31 13:32 ` [PATCH v6 2/2] drivers/perf: hisi: Add driver for HiSilicon PCIe PMU Qi Liu
2021-06-11 16:23   ` Will Deacon
2021-06-14  9:20     ` Jonathan Cameron
2021-06-15  9:26       ` Will Deacon
2021-06-15  8:57     ` liuqi (BA)
2021-06-15  9:35       ` Will Deacon
2021-06-16  1:54         ` liuqi (BA)
2021-06-16 13:42           ` Will Deacon
2021-06-17 11:00             ` liuqi (BA)
2021-06-17 17:57               ` Will Deacon
2021-06-18  9:32                 ` liuqi (BA)
2021-06-21 17:59                   ` Will Deacon [this message]
2021-06-23  9:59                     ` liuqi (BA)
2021-06-11 23:33   ` Krzysztof Wilczyński
2021-06-16  1:09     ` liuqi (BA)
2021-06-16 15:23       ` Bjorn Helgaas
2021-06-16 17:27         ` Will Deacon
2021-06-16 14:14     ` Bjorn Helgaas
2021-06-11 10:04 ` [PATCH v6 0/2] drivers/perf: hisi: Add support for " liuqi (BA)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210621175910.GC29713@willie-the-truck \
    --to=will@kernel.org \
    --cc=bhelgaas@google.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linuxarm@huawei.com \
    --cc=liuqi115@huawei.com \
    --cc=mark.rutland@arm.com \
    --cc=zhangshaokun@hisilicon.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).