From: Bjorn Andersson <bjorn.andersson@linaro.org>
To: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Cc: Rajendra Nayak <quic_rjendra@quicinc.com>,
	Andy Gross <agross@kernel.org>, Georgi Djakov <djakov@kernel.org>,
	Rob Herring <robh+dt@kernel.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	linux-arm-msm@vger.kernel.org, linux-pm@vger.kernel.org,
	devicetree@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	Thara Gopinath <thara.gopinath@linaro.org>
Subject: Re: [PATCH v4 4/4] arm64: dts: qcom: sdm845: Add CPU BWMON
Date: Sat, 25 Jun 2022 22:28:27 -0500	[thread overview]
Message-ID: <YrfSWw9Wpq5TsRUt@builder.lan> (raw)
In-Reply-To: <23320e3c-40c3-12bb-0a1c-7e659a1961f2@linaro.org>

On Thu 23 Jun 07:58 CDT 2022, Krzysztof Kozlowski wrote:

> On 23/06/2022 08:48, Rajendra Nayak wrote:
> >>>> diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> >>>> index 83e8b63f0910..adffb9c70566 100644
> >>>> --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
> >>>> +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> >>>> @@ -2026,6 +2026,60 @@ llcc: system-cache-controller@1100000 {
> >>>>    			interrupts = <GIC_SPI 582 IRQ_TYPE_LEVEL_HIGH>;
> >>>>    		};
> >>>>    
> >>>> +		pmu@1436400 {
> >>>> +			compatible = "qcom,sdm845-cpu-bwmon";
> >>>> +			reg = <0 0x01436400 0 0x600>;
> >>>> +
> >>>> +			interrupts = <GIC_SPI 581 IRQ_TYPE_LEVEL_HIGH>;
> >>>> +
> >>>> +			interconnects = <&gladiator_noc MASTER_APPSS_PROC 3 &mem_noc SLAVE_EBI1 3>,
> >>>> +					<&osm_l3 MASTER_OSM_L3_APPS &osm_l3 SLAVE_OSM_L3>;
> >>>> +			interconnect-names = "ddr", "l3c";
> >>>
> >>> Is this the pmu/bwmon instance between the cpu and caches or the one between the caches and DDR?
> >>
> >> To my understanding this is the one between CPU and caches.
> > 
> > Ok, but then since the OPP table lists the DDR bandwidth first and the cache bandwidth second, doesn't the
> > driver end up comparing the bandwidth values reported by the PMU against the DDR bandwidth instead of the cache bandwidth?
> 
> I double checked now and you're right.
> 
> > At least in my testing on sc7280 I found this to mess things up, and I always ended up at
> > higher OPPs even while the system was completely idle. Comparing the values against the cache bandwidth
> > fixed it. (sc7280 also has a bwmon4 instance between the CPU and caches and a bwmon5 between the cache
> > and DDR.)
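
For reference, the ordering issue above comes from the positional matching
between OPP entries and interconnect paths: each value in opp-peak-kBps is
applied to the path at the same index in interconnects/interconnect-names.
A minimal sketch (the bandwidth numbers are made up, only the ordering
matters):

	cpu_bwmon_opp_table: opp-table {
		compatible = "operating-points-v2";

		/* The first value votes on "ddr", the second on "l3c",
		 * following the interconnect-names order in the node
		 * above.
		 */
		opp-0 {
			opp-peak-kBps = <800000 2188000>;
		};

		opp-1 {
			opp-peak-kBps = <1804000 3072000>;
		};
	};

So if the measured traffic is really CPU-to-cache traffic, comparing it
against the first (DDR) column votes on the wrong path.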
> 
> In my case it exposes a different issue - underperformance. Somehow the
> bwmon does not report bandwidth high enough to vote for high bandwidth.
> 
> After removing the DDR interconnect and its bandwidth OPP values, I get the following for:
> sysbench --threads=8 --time=60 --memory-total-size=20T --test=memory
> --memory-block-size=4M run
> 
> 1. Vanilla: 29768 MB/s
> 2. Vanilla without CPU votes: 8728 MB/s
> 3. Previous bwmon (voting too high): 32007 MB/s
> 4. Fixed bwmon: 24911 MB/s
> 
> Bwmon does not vote for the maximum L3 speed:
> bwmon reports 9408 MB/s (thresholds set: <9216000 15052801>)
> osm l3 aggregate 14355 MBps -> 897 MHz, level 7, bw 14355 MBps
> 
> Maybe that's just a problem with the missing governor, which would vote for
> bandwidth rounded up or in anticipation of higher needs.
> 
> >>> Depending on which one it is, shouldn't we just be scaling one of the interconnect paths and not both?
> >>
> >> The interconnects are the same as the ones used for the CPU nodes, therefore if
> >> we want to scale both when scaling the CPU, then we also want to scale both
> >> when seeing traffic between the CPU and cache.
> > 
> > Well, they were both associated with the CPU node because, with no other input to decide on _when_
> > to scale the caches and DDR, we just put in a mapping table which simply mapped a CPU freq to an L3 _and_
> > DDR freq. So with just one input (CPU freq) we decided what both the L3 freq and the DDR freq should be.
> > 
> > Now with two PMUs we have two inputs, so we can individually scale the L3 based on the cache PMU
> > counters and the DDR based on the DDR PMU counters, no?
> > 
> > Since you said you have plans to add support for the other PMU as well (bwmon5, between the cache and DDR),
> > how else would you associate the OPP table with that PMU instance? Would you again have both the
> > L3 and DDR scale based on the inputs from that bwmon too?
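
The split described here would look roughly as follows (a sketch only: the
second node's address, compatible and contents are placeholders, and which
physical instance maps to which path is exactly the open question in this
thread):

	/* one instance voting only on the L3 path */
	pmu@1436400 {
		compatible = "qcom,sdm845-cpu-bwmon";
		...
		interconnects = <&osm_l3 MASTER_OSM_L3_APPS &osm_l3 SLAVE_OSM_L3>;
		interconnect-names = "l3c";
	};

	/* one instance voting only on the DDR path */
	pmu@... {
		...
		interconnects = <&gladiator_noc MASTER_APPSS_PROC 3 &mem_noc SLAVE_EBI1 3>;
		interconnect-names = "ddr";
	};

each with its own opp-table carrying a single opp-peak-kBps column.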
> 
> Good point, thanks for sharing. I think you're right. I'll keep only the
> l3c interconnect path.
> 

If I understand correctly, <&osm_l3 MASTER_OSM_L3_APPS &osm_l3
SLAVE_OSM_L3> relates to the L3 cache speed, which sits inside the CPU
subsystem. As such, traffic hitting this cache will not show up in either
bwmon instance.

The path <&gladiator_noc MASTER_APPSS_PROC 3 &mem_noc SLAVE_EBI1 3>
affects the DDR frequency. So the traffic measured by the cpu-bwmon
would be the CPU subsystem's traffic that misses the L1/L2/L3 caches and
hits the memory bus towards DDR.


If this is the case, it seems to make sense to keep the L3 scaling in the
opp-tables for the CPU and make bwmon scale only the DDR path. What do
you think?
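
Concretely, that would mean trimming the node from the patch to something
like this (untested, values taken from the hunk above):

	pmu@1436400 {
		compatible = "qcom,sdm845-cpu-bwmon";
		reg = <0 0x01436400 0 0x600>;

		interrupts = <GIC_SPI 581 IRQ_TYPE_LEVEL_HIGH>;

		interconnects = <&gladiator_noc MASTER_APPSS_PROC 3 &mem_noc SLAVE_EBI1 3>;
		interconnect-names = "ddr";
	};

with the L3 votes left in the CPU opp-tables as they are today.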

Regards,
Bjorn
