From: Saravana Kannan <saravanak@google.com>
To: Leonard Crestez <leonard.crestez@nxp.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>,
	Alexandre Bailon <abailon@baylibre.com>,
	Georgi Djakov <georgi.djakov@linaro.org>,
	Stephen Boyd <sboyd@kernel.org>,
	Michael Turquette <mturquette@baylibre.com>,
	MyungJoo Ham <myungjoo.ham@samsung.com>,
	Kyungmin Park <kyungmin.park@samsung.com>,
	Shawn Guo <shawnguo@kernel.org>,
	Aisheng Dong <aisheng.dong@nxp.com>,
	Fabio Estevam <fabio.estevam@nxp.com>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Jacky Bai <ping.bai@nxp.com>, Anson Huang <anson.huang@nxp.com>,
	Abel Vesa <abel.vesa@nxp.com>,
	Krzysztof Kozlowski <krzk@kernel.org>,
	Ulf Hansson <ulf.hansson@linaro.org>,
	"kernel@pengutronix.de" <kernel@pengutronix.de>,
	dl-linux-imx <linux-imx@nxp.com>,
	Linux PM <linux-pm@vger.kernel.org>,
	"linux-clk@vger.kernel.org" <linux-clk@vger.kernel.org>,
	"linux-arm-kernel@lists.infradead.org" 
	<linux-arm-kernel@lists.infradead.org>,
	Vincent Guittot <vincent.guittot@linaro.org>
Subject: Re: [RFCv2 0/8] Add imx8mm bus frequency switching
Date: Wed, 3 Jul 2019 20:02:25 -0700
Message-ID: <CAGETcx-p4L3LBVpDBmBrPKXxMUtUXtsw-7AntpWs+AL3kaaP5Q@mail.gmail.com>
In-Reply-To: <DB7PR04MB505163FDCAD7BE9A0C71A65EEEFB0@DB7PR04MB5051.eurprd04.prod.outlook.com>

On Wed, Jul 3, 2019 at 4:30 PM Leonard Crestez <leonard.crestez@nxp.com> wrote:
>
> On 7/4/2019 1:20 AM, Saravana Kannan wrote:
>
> >> The interconnect and devfreq parts do not communicate explicitly: they both
> >> just call clk_set_min_rate and the clk core picks the minimum value that can
> >> satisfy both. They are thus completely independent.
> >
> > Two different parts not talking to each other and just setting min
> > rate can cause problems for some concurrency use cases. ICC framework
> > is explicitly designed to handle cases like this and aggregate the
> > needs correctly. You might want to look into that more closely.
>
> The clk framework aggregates the min_rate requests from multiple
> consumers under a big "prepare_lock" so I expect it will deal with
> concurrent requests correctly. As for performance: frequency switching
> shouldn't be a fast path.

Sorry, I wasn't clear. I was not talking about locking issues or race
conditions when I said concurrency use cases. What I meant was: if the
GPU wants 5 GB/s and at the same time (concurrently) the camera also
wants 5 GB/s, you'll need to configure the interconnect for 10 GB/s.
If both of them just try to set the min freq equivalent of 5 GB/s,
performance would be bad or functionality might break.
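
To make that concrete, here is a rough C sketch of the difference
(purely illustrative: the gpu/cam clk and path handles and the
HZ_FOR_5GBPS constant are made-up placeholders):

/*
 * Both consumers hold their own handle to the same bus clk. The clk
 * core aggregates min-rate requests by taking the max of them, so
 * the bus still runs at only the 5 GB/s equivalent:
 */
clk_set_min_rate(gpu_bus_clk, HZ_FOR_5GBPS);	/* GPU's clk handle */
clk_set_min_rate(cam_bus_clk, HZ_FOR_5GBPS);	/* camera's clk handle */

/*
 * With the ICC API (paths obtained earlier via icc_get()), the
 * framework sums the average bandwidth of all requests crossing each
 * node, so the provider sees 10 GB/s. Values are in kBps:
 */
icc_set_bw(gpu_path, 5000000, 5000000);
icc_set_bw(cam_path, 5000000, 5000000);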

> >> The clk_set_min_rate approach does not mesh very well with the OPP framework.
> >> Some of interconnect nodes on imx8m can run at different voltages: OPP can
> >> handle this well but not in response to a clk_set_min_rate from an unrelated
> >> subsystem. Maybe set voltage on a clk notifier?
> >
> > I think if you design it something like below, it might make your life
> > a whole lot easier.
> > Hopefully the formatting looks okay on your end. The arrow going up is
> > just connecting devfreq to ICC.
> >
> >                          Proactive -> ICC -> clk/OPP API to set freq/volt
> >                                        ^
> >                                        |
> > HW measure -> governor -> devfreq ----+
> >
> > That way, all voltage/frequency requirements are managed cleanly by
> > clk/OPP frameworks. The ICC deals with aggregating all the
> > requirements and devfreq lets you handle reactive scaling and policy.
>
> If icc and devfreq are to directly communicate it makes more sense to do
> it backwards: ICC should set a min_freq on nodes which have a devfreq
> instance attached and devfreq should implement actual freq switching.
>
> HW measurement is done on individual nodes while ICC deals with requests
> along a path. In particular on imx we have a performance monitor
> attached to the ddr controller and I doubt it can distinguish between
> masters so how could this be mapped usefully to an interconnect request?

Ah, that was the missing piece for me -- you are trying to use a
central performance monitor. I see what you are trying to do.

So you are looking at system-wide traffic at the DDR and then using
that to scale the interconnect/DDRC. I don't know how complicated the
i.MX interconnect topology is, so please pardon my questions. If you
are using a performance monitor at the DDR controller, why do you need
the "proactive" requests from other clients? Wouldn't the performance
monitor already account for all the traffic to DDR?

> As far as I understand with devfreq the ddrc node could use "ondemand"
> while the other nodes would default to the "passive" governor and run at
> predefined default ratios relative to DDRC.

Yes, that approach would also work, but I'm not sure why you'd need
the ICC framework in that case.
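
For reference, a minimal sketch of how I understand that setup would
look, assuming the ddrc already has a devfreq device registered
(ddrc_devfreq, imx_bus_profile and the other names are hypothetical):

#include <linux/devfreq.h>

/*
 * The bus node runs no monitoring of its own; the passive governor
 * just follows the parent ddrc's frequency transitions.
 */
static struct devfreq_passive_data imx_bus_passive_data = {
	.parent = ddrc_devfreq,		/* devfreq device of the ddrc */
};

/*
 * imx_bus_profile->target() would apply the chosen frequency,
 * e.g. via clk_set_rate() on the bus clock.
 */
bus_devfreq = devfreq_add_device(dev, &imx_bus_profile,
				 DEVFREQ_GOV_PASSIVE,
				 &imx_bus_passive_data);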

> > If all of this makes sense, please take a look at [2] and provide your
> > thoughts. I've dropped a few patches from [1] to avoid confusion (too
> > much going on at once). I think BW OPP tables and having OPP tables
> > for interconnect paths will be something you'll need (if not now,
> > eventually) and something you can build on top of nicely.
>
> I found it very confusing that you're assigning BW OPP tables to
> devices. My initial understanding was that BW OPP would map "bandwidth"
> to "frequency" so BW OPPs should be assigned to links along the
> interconnect graph. But maybe what you want is to have OPPs for the BW
> values requested by devices?

I want to have OPPs for the bandwidths requested on paths. Each
interconnect node can also use BW OPPs if that makes sense for it,
but I think most nodes would be better served by frequency OPPs.
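
To give a feel for it, a BW OPP table for a path would look roughly
like this (the binding is still an RFC in [2], so the exact property
names may change, and the values here are made up):

interconnect-opp-table {
	opp-2000 {
		opp-peak-kBps = <2000000>;
		opp-avg-kBps = <1000000>;
	};
	opp-5000 {
		opp-peak-kBps = <5000000>;
		opp-avg-kBps = <2500000>;
	};
};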

> Looking at the sdm845 icc provider source and it seems that those
> "peak/avg" values are actually parameters which go into a firmware
> command!? It makes sense that you want interconnect to be "below"
> devfreq since icc_provider.set maps very closely to what firmware exposes.

Even without the firmware (it's mainly there to aggregate requests for
some system-wide resources), or if the interconnects were scaled
directly using clock APIs as on older chips, sdm845 would still want
ICC to be below devfreq. That's because sdm845 doesn't try to do ICC
scaling by measuring at the DDR: each master makes separate requests,
and the ICC framework aggregates them and sets the frequency. They
have their reasons (good ones) for doing that.

>  > Interconnects and interconnect paths quantify their performance
>  > levels in terms of bandwidth and not in terms of frequency.
>
> On i.MX we just have a bunch of interconnect IPs for which frequencies
> can be adjusted (in hz)

This is true for every chip. In the end, they all set the Hz of hardware IPs.

> so the above statement doesn't really hold. It
> is up to an icc provider to convert aggregate bandwidth values to
> frequencies along the path.

That's the job of every ICC provider. In the case of i.MX you use
clock APIs, and in the case of sdm845 they use a rather complicated
firmware interface. But in both cases, when the system is tuned for
performance/power, everyone ultimately cares about frequency and
voltage. If the frequency goes too high, they might reduce the
bandwidth to make sure the frequency remains reasonable for whatever
use case they are profiling.
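
As a rough illustration, an i.MX-style provider's set() hook could
boil down to something like this (struct imx_icc_node and its clk and
bytes_per_cycle fields are made up for the example):

static int imx_icc_set(struct icc_node *src, struct icc_node *dst)
{
	struct imx_icc_node *node = src->data;
	unsigned long rate_hz;

	/*
	 * src->avg_bw and src->peak_bw hold the bandwidth the ICC
	 * core already aggregated across all requests, in kBps.
	 */
	rate_hz = max(src->avg_bw, src->peak_bw);
	rate_hz = rate_hz / node->bytes_per_cycle * 1000;

	return clk_set_rate(node->clk, rate_hz);
}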

I think the main difference is where the performance monitoring or the
performance decision is done. If you don't have a central performance
monitor, or don't want to use one, then the policy decision (which is
where devfreq fits in) will have to happen at the bandwidth-request
level.

-Saravana

Thread overview:

2019-06-28  7:39 [RFCv2 0/8] Add imx8mm bus frequency switching Leonard Crestez
2019-06-28  7:39 ` [RFCv2 1/8] clk: imx8mm: Add dram freq switch support Leonard Crestez
2019-06-28  7:39 ` [RFCv2 2/8] clk: imx8m-composite: Switch to determine_rate Leonard Crestez
2019-06-28  8:45   ` Abel Vesa
2019-06-28  8:56     ` Leonard Crestez
2019-07-02  7:13       ` Abel Vesa
2019-06-28  7:39 ` [RFCv2 3/8] arm64: dts: imx8mm: Add dram dvfs irqs to ccm node Leonard Crestez
2019-06-28  7:39 ` [RFCv2 4/8] interconnect: Add generic driver for imx Leonard Crestez
2019-06-28  7:39 ` [RFCv2 5/8] interconnect: imx: Add platform driver for imx8mm Leonard Crestez
2019-06-28  7:39 ` [RFCv2 6/8] devfreq: Add imx-devfreq driver Leonard Crestez
2019-07-03  1:31   ` Chanwoo Choi
2019-06-28  7:39 ` [RFCv2 7/8] arm64: dts: imx8mm: Add interconnect node Leonard Crestez
2019-06-28  7:39 ` [RFCv2 8/8] arm64: dts: imx8mm: Add devfreq-imx nodes Leonard Crestez
2019-07-03 22:19 ` [RFCv2 0/8] Add imx8mm bus frequency switching Saravana Kannan
2019-07-03 23:30   ` Leonard Crestez
2019-07-04  3:02     ` Saravana Kannan [this message]
2019-07-04  8:32       ` Leonard Crestez
