Linux-ARM-MSM Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v2 0/2] Add interconnect sync state support
@ 2020-07-22 11:01 Georgi Djakov
  2020-07-22 11:01 ` [PATCH v2 1/2] interconnect: Add " Georgi Djakov
  2020-07-22 11:01 ` [PATCH v2 2/2] interconnect: qcom: Use icc_sync_state in sdm845 and osm-3l drivers Georgi Djakov
  0 siblings, 2 replies; 7+ messages in thread
From: Georgi Djakov @ 2020-07-22 11:01 UTC (permalink / raw)
  To: linux-pm
  Cc: saravanak, mdtipton, okukatla, bjorn.andersson, vincent.guittot,
	georgi.djakov, linux-arm-msm, linux-kernel

Bootloaders often leave some system resources enabled such as clocks,
regulators, interconnects etc. We want to keep these resources enabled
until all their consumers are probed. These resources are often shared,
so we must wait for all the consumers to come up, before deciding
whether to turn them off or change the configuration. This patchset is
trying to solve the above problem just for the on-chip interconnects.

The problem is solved by allowing the interconnect providers to specify
an initial bandwidth values, which are enforced during boot as floor
values, while the requests from all consumers are being collected.
Then the sync_state() callback is used to signal when all consumers have
been probed, meaning that the floor bandwidth is not needed anymore and
the framework is ready to re-aggregate and process all requests.

v2:
* Support initial values for both average and peak bandwidth (Mike)
* Skip aggregating/setting for nodes that don't specify initial bw (Mike)
* Drop patch 2/4: Add get_bw() callback (Mike)
* Squash patches 3 and 4.

v1: https://lore.kernel.org/lkml/20200709110705.30359-1-georgi.djakov@linaro.org/

Georgi Djakov (2):
  interconnect: Add sync state support
  interconnect: qcom: Use icc_sync_state in sdm845 and osm-3l drivers

 drivers/interconnect/core.c           | 61 +++++++++++++++++++++++++++
 drivers/interconnect/qcom/osm-l3.c    |  3 ++
 drivers/interconnect/qcom/sdm845.c    |  3 ++
 include/linux/interconnect-provider.h |  5 +++
 4 files changed, 72 insertions(+)


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v2 1/2] interconnect: Add sync state support
  2020-07-22 11:01 [PATCH v2 0/2] Add interconnect sync state support Georgi Djakov
@ 2020-07-22 11:01 ` Georgi Djakov
  2020-07-22 17:07   ` Saravana Kannan
  2020-07-22 11:01 ` [PATCH v2 2/2] interconnect: qcom: Use icc_sync_state in sdm845 and osm-3l drivers Georgi Djakov
  1 sibling, 1 reply; 7+ messages in thread
From: Georgi Djakov @ 2020-07-22 11:01 UTC (permalink / raw)
  To: linux-pm
  Cc: saravanak, mdtipton, okukatla, bjorn.andersson, vincent.guittot,
	georgi.djakov, linux-arm-msm, linux-kernel

The bootloaders often do some initial configuration of the interconnects
in the system and we want to keep this configuration until all consumers
have probed and expressed their bandwidth needs. This is because we don't
want to change the configuration by starting to disable unused paths until
every user had a chance to request the amount of bandwidth it needs.

To accomplish this we will implement an interconnect specific sync_state
callback which will synchronize (aggregate and set) the current bandwidth
settings when all consumers have been probed.

Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
---
 drivers/interconnect/core.c           | 61 +++++++++++++++++++++++++++
 include/linux/interconnect-provider.h |  5 +++
 2 files changed, 66 insertions(+)

diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c
index e5f998744501..0c4e38d9f1fa 100644
--- a/drivers/interconnect/core.c
+++ b/drivers/interconnect/core.c
@@ -26,6 +26,8 @@
 
 static DEFINE_IDR(icc_idr);
 static LIST_HEAD(icc_providers);
+static int providers_count;
+static bool synced_state;
 static DEFINE_MUTEX(icc_lock);
 static struct dentry *icc_debugfs_dir;
 
@@ -255,6 +257,12 @@ static int aggregate_requests(struct icc_node *node)
 			continue;
 		p->aggregate(node, r->tag, r->avg_bw, r->peak_bw,
 			     &node->avg_bw, &node->peak_bw);
+
+		/* during boot use the initial bandwidth as a floor value */
+		if (!synced_state) {
+			node->avg_bw = max(node->avg_bw, node->init_avg);
+			node->peak_bw = max(node->peak_bw, node->init_peak);
+		}
 	}
 
 	return 0;
@@ -919,6 +927,13 @@ void icc_node_add(struct icc_node *node, struct icc_provider *provider)
 	node->provider = provider;
 	list_add_tail(&node->node_list, &provider->nodes);
 
+	/* get the initial bandwidth values and sync them with hardware */
+	if ((node->init_avg || node->init_peak) && provider->set) {
+		node->avg_bw = node->init_avg;
+		node->peak_bw = node->init_peak;
+		provider->set(node, node);
+	}
+
 	mutex_unlock(&icc_lock);
 }
 EXPORT_SYMBOL_GPL(icc_node_add);
@@ -1014,8 +1029,54 @@ int icc_provider_del(struct icc_provider *provider)
 }
 EXPORT_SYMBOL_GPL(icc_provider_del);
 
+static int of_count_icc_providers(struct device_node *np)
+{
+	struct device_node *child;
+	int count = 0;
+
+	for_each_available_child_of_node(np, child) {
+		if (of_property_read_bool(child, "#interconnect-cells"))
+			count++;
+		count += of_count_icc_providers(child);
+	}
+	of_node_put(np);
+
+	return count;
+}
+
+void icc_sync_state(struct device *dev)
+{
+	struct icc_provider *p;
+	struct icc_node *n;
+	static int count;
+
+	count++;
+
+	if (count < providers_count)
+		return;
+
+	mutex_lock(&icc_lock);
+	synced_state = true;
+	list_for_each_entry(p, &icc_providers, provider_list) {
+		dev_dbg(p->dev, "interconnect provider is in synced state\n");
+		list_for_each_entry(n, &p->nodes, node_list) {
+			if (n->init_avg || n->init_peak) {
+				aggregate_requests(n);
+				p->set(n, n);
+			}
+		}
+	}
+	mutex_unlock(&icc_lock);
+}
+EXPORT_SYMBOL_GPL(icc_sync_state);
+
 static int __init icc_init(void)
 {
+	struct device_node *root = of_find_node_by_path("/");
+
+	providers_count = of_count_icc_providers(root);
+	of_node_put(root);
+
 	icc_debugfs_dir = debugfs_create_dir("interconnect", NULL);
 	debugfs_create_file("interconnect_summary", 0444,
 			    icc_debugfs_dir, NULL, &icc_summary_fops);
diff --git a/include/linux/interconnect-provider.h b/include/linux/interconnect-provider.h
index 0c494534b4d3..aa85b96296c4 100644
--- a/include/linux/interconnect-provider.h
+++ b/include/linux/interconnect-provider.h
@@ -71,6 +71,8 @@ struct icc_provider {
  * @req_list: a list of QoS constraint requests associated with this node
  * @avg_bw: aggregated value of average bandwidth requests from all consumers
  * @peak_bw: aggregated value of peak bandwidth requests from all consumers
+ * @init_avg: average bandwidth value that is read from the hardware during init
+ * @init_peak: peak bandwidth value that is read from the hardware during init
  * @data: pointer to private data
  */
 struct icc_node {
@@ -87,6 +89,8 @@ struct icc_node {
 	struct hlist_head	req_list;
 	u32			avg_bw;
 	u32			peak_bw;
+	u32			init_avg;
+	u32			init_peak;
 	void			*data;
 };
 
@@ -103,6 +107,7 @@ void icc_node_del(struct icc_node *node);
 int icc_nodes_remove(struct icc_provider *provider);
 int icc_provider_add(struct icc_provider *provider);
 int icc_provider_del(struct icc_provider *provider);
+void icc_sync_state(struct device *dev);
 
 #else
 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v2 2/2] interconnect: qcom: Use icc_sync_state in sdm845 and osm-3l drivers
  2020-07-22 11:01 [PATCH v2 0/2] Add interconnect sync state support Georgi Djakov
  2020-07-22 11:01 ` [PATCH v2 1/2] interconnect: Add " Georgi Djakov
@ 2020-07-22 11:01 ` Georgi Djakov
  1 sibling, 0 replies; 7+ messages in thread
From: Georgi Djakov @ 2020-07-22 11:01 UTC (permalink / raw)
  To: linux-pm
  Cc: saravanak, mdtipton, okukatla, bjorn.andersson, vincent.guittot,
	georgi.djakov, linux-arm-msm, linux-kernel

Lowering the bandwidth on the bus might have negative consequences if
it's done before all consumers had a chance to cast their vote. Let's
return the maximum amount of bandwidth as initial value. This bandwidth
level would be maintained until all consumers have probed.

Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
---
 drivers/interconnect/qcom/osm-l3.c | 3 +++
 drivers/interconnect/qcom/sdm845.c | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/drivers/interconnect/qcom/osm-l3.c b/drivers/interconnect/qcom/osm-l3.c
index 96fb9ff5ff2e..54aff6273af1 100644
--- a/drivers/interconnect/qcom/osm-l3.c
+++ b/drivers/interconnect/qcom/osm-l3.c
@@ -236,6 +236,8 @@ static int qcom_osm_l3_probe(struct platform_device *pdev)
 
 		node->name = qnodes[i]->name;
 		node->data = qnodes[i];
+		node->init_avg = INT_MAX;
+		node->init_peak = INT_MAX;
 		icc_node_add(node, provider);
 
 		for (j = 0; j < qnodes[i]->num_links; j++)
@@ -268,6 +270,7 @@ static struct platform_driver osm_l3_driver = {
 	.driver = {
 		.name = "osm-l3",
 		.of_match_table = osm_l3_of_match,
+		.sync_state = icc_sync_state,
 	},
 };
 module_platform_driver(osm_l3_driver);
diff --git a/drivers/interconnect/qcom/sdm845.c b/drivers/interconnect/qcom/sdm845.c
index f6c7b969520d..c04775820f15 100644
--- a/drivers/interconnect/qcom/sdm845.c
+++ b/drivers/interconnect/qcom/sdm845.c
@@ -503,6 +503,8 @@ static int qnoc_probe(struct platform_device *pdev)
 
 		node->name = qnodes[i]->name;
 		node->data = qnodes[i];
+		node->init_avg = INT_MAX;
+		node->init_peak = INT_MAX;
 		icc_node_add(node, provider);
 
 		for (j = 0; j < qnodes[i]->num_links; j++)
@@ -559,6 +561,7 @@ static struct platform_driver qnoc_driver = {
 	.driver = {
 		.name = "qnoc-sdm845",
 		.of_match_table = qnoc_of_match,
+		.sync_state = icc_sync_state,
 	},
 };
 module_platform_driver(qnoc_driver);

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 1/2] interconnect: Add sync state support
  2020-07-22 11:01 ` [PATCH v2 1/2] interconnect: Add " Georgi Djakov
@ 2020-07-22 17:07   ` Saravana Kannan
  2020-07-28  6:18     ` Mike Tipton
  0 siblings, 1 reply; 7+ messages in thread
From: Saravana Kannan @ 2020-07-22 17:07 UTC (permalink / raw)
  To: Georgi Djakov
  Cc: Linux PM, Mike Tipton, okukatla, Bjorn Andersson,
	Vincent Guittot, linux-arm-msm, LKML

On Wed, Jul 22, 2020 at 4:01 AM Georgi Djakov <georgi.djakov@linaro.org> wrote:
>
> The bootloaders often do some initial configuration of the interconnects
> in the system and we want to keep this configuration until all consumers
> have probed and expressed their bandwidth needs. This is because we don't
> want to change the configuration by starting to disable unused paths until
> every user had a chance to request the amount of bandwidth it needs.
>
> To accomplish this we will implement an interconnect specific sync_state
> callback which will synchronize (aggregate and set) the current bandwidth
> settings when all consumers have been probed.
>
> Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
> ---
>  drivers/interconnect/core.c           | 61 +++++++++++++++++++++++++++
>  include/linux/interconnect-provider.h |  5 +++
>  2 files changed, 66 insertions(+)
>
> diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c
> index e5f998744501..0c4e38d9f1fa 100644
> --- a/drivers/interconnect/core.c
> +++ b/drivers/interconnect/core.c
> @@ -26,6 +26,8 @@
>
>  static DEFINE_IDR(icc_idr);
>  static LIST_HEAD(icc_providers);
> +static int providers_count;
> +static bool synced_state;
>  static DEFINE_MUTEX(icc_lock);
>  static struct dentry *icc_debugfs_dir;
>
> @@ -255,6 +257,12 @@ static int aggregate_requests(struct icc_node *node)
>                         continue;
>                 p->aggregate(node, r->tag, r->avg_bw, r->peak_bw,
>                              &node->avg_bw, &node->peak_bw);
> +
> +               /* during boot use the initial bandwidth as a floor value */
> +               if (!synced_state) {
> +                       node->avg_bw = max(node->avg_bw, node->init_avg);
> +                       node->peak_bw = max(node->peak_bw, node->init_peak);
> +               }

Sorry I didn't reply earlier.

I liked your previous approach with the get_bw ops. The v2 approach
forces every interconnect provider driver to set up these values even
if they are okay with just maxing out the bandwidth. Also, if they can
actually query their hardware, this adds additional steps for them.

I think the default should be:
1. Query the current bandwidth at boot and use that.
2. If that's not available, max out the bandwidth.

The interconnect providers that don't like maxing out and don't have
real get_bw() capability can just cache and return the last set_bw()
values. And they start off with those cached values matching whatever
init_bw they need.

That way, the default case (can get bw or don't care about maxing out)
would be easy and the extra work would be limited to drivers that want
neither.

-Saravana

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 1/2] interconnect: Add sync state support
  2020-07-22 17:07   ` Saravana Kannan
@ 2020-07-28  6:18     ` Mike Tipton
  2020-07-30 19:07       ` Saravana Kannan
  0 siblings, 1 reply; 7+ messages in thread
From: Mike Tipton @ 2020-07-28  6:18 UTC (permalink / raw)
  To: Saravana Kannan, Georgi Djakov
  Cc: Linux PM, okukatla, Bjorn Andersson, Vincent Guittot,
	linux-arm-msm, LKML

On 7/22/2020 10:07 AM, Saravana Kannan wrote:
> On Wed, Jul 22, 2020 at 4:01 AM Georgi Djakov <georgi.djakov@linaro.org> wrote:
>>
>> The bootloaders often do some initial configuration of the interconnects
>> in the system and we want to keep this configuration until all consumers
>> have probed and expressed their bandwidth needs. This is because we don't
>> want to change the configuration by starting to disable unused paths until
>> every user had a chance to request the amount of bandwidth it needs.
>>
>> To accomplish this we will implement an interconnect specific sync_state
>> callback which will synchronize (aggregate and set) the current bandwidth
>> settings when all consumers have been probed.
>>
>> Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
>> ---
>>   drivers/interconnect/core.c           | 61 +++++++++++++++++++++++++++
>>   include/linux/interconnect-provider.h |  5 +++
>>   2 files changed, 66 insertions(+)
>>
>> diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c
>> index e5f998744501..0c4e38d9f1fa 100644
>> --- a/drivers/interconnect/core.c
>> +++ b/drivers/interconnect/core.c
>> @@ -26,6 +26,8 @@
>>
>>   static DEFINE_IDR(icc_idr);
>>   static LIST_HEAD(icc_providers);
>> +static int providers_count;
>> +static bool synced_state;
>>   static DEFINE_MUTEX(icc_lock);
>>   static struct dentry *icc_debugfs_dir;
>>
>> @@ -255,6 +257,12 @@ static int aggregate_requests(struct icc_node *node)
>>                          continue;
>>                  p->aggregate(node, r->tag, r->avg_bw, r->peak_bw,
>>                               &node->avg_bw, &node->peak_bw);
>> +
>> +               /* during boot use the initial bandwidth as a floor value */
>> +               if (!synced_state) {
>> +                       node->avg_bw = max(node->avg_bw, node->init_avg);
>> +                       node->peak_bw = max(node->peak_bw, node->init_peak);
>> +               }
> 
> Sorry I didn't reply earlier.
> 
> I liked your previous approach with the get_bw ops. The v2 approach
> forces every interconnect provider driver to set up these values even
> if they are okay with just maxing out the bandwidth. Also, if they can
> actually query their hardware, this adds additional steps for them.

The problem with using something like get_bw() is that while we can 
dynamically query the HW, we have far less granularity in HW than we 
have nodes in the framework. We vote at BCM-level granularity, but each 
BCM can have many nodes. For example, the sdm845 CN0 BCM has 47 nodes. 
If we implement get_bw() generically, then it would return the BW for 
each node, which would be the queried BCM vote scaled to account for 
differences in BCM/node widths. While this could be useful in general as 
an informational callback, we wouldn't want to use this as a proxy for 
our initial BW vote requirements. For CN0, we wouldn't want or need to 
vote 47 times for the same CN0 BCM. Each of the 47 node requests would 
result in the same BCM request.

All we'd really need is a single node per-BCM to serve as the proxy 
node. We'd query the HW, scale the queried value for the chosen proxy 
node, and set init_avg/init_peak appropriately. This would save a lot of 
unnecessary votes. Based on the current implementation, the set() call 
in icc_node_add() for initial BW wouldn't trigger any actual HW requests 
since we only queue BCMs that require updating in the aggregate() 
callback. However, the set() call in icc_sync_state() would, since we 
re-aggregate each node that has a non-zero init_avg/init_peak.

There's nothing stopping us from implementing get_bw() as if it were 
get_initial_bw(), but that only works until the framework decides to use 
get_bw() for more things than just the initial vote. I suppose we could 
also just have a "get_initial_bw" callback, but it only needs to be 
called once, so doesn't necessarily need a callback as opposed to 
additional init_avg/init_peak members in the icc_node struct.

> 
> I think the default should be:
> 1. Query the current bandwidth at boot and use that.
> 2. If that's not available, max out the bandwidth.
> 
> The interconnect providers that don't like maxing out and don't have
> real get_bw() capability can just cache and return the last set_bw()
> values. And they start off with those cached values matching whatever
> init_bw they need.
> 
> That way, the default case (can get bw or don't care about maxing out)
> would be easy and the extra work would be limited to drivers that want
> neither. >
> -Saravana
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 1/2] interconnect: Add sync state support
  2020-07-28  6:18     ` Mike Tipton
@ 2020-07-30 19:07       ` Saravana Kannan
  2020-08-01  2:36         ` Mike Tipton
  0 siblings, 1 reply; 7+ messages in thread
From: Saravana Kannan @ 2020-07-30 19:07 UTC (permalink / raw)
  To: Mike Tipton
  Cc: Georgi Djakov, Linux PM, okukatla, Bjorn Andersson,
	Vincent Guittot, linux-arm-msm, LKML

On Mon, Jul 27, 2020 at 11:18 PM Mike Tipton <mdtipton@codeaurora.org> wrote:
>
> On 7/22/2020 10:07 AM, Saravana Kannan wrote:
> > On Wed, Jul 22, 2020 at 4:01 AM Georgi Djakov <georgi.djakov@linaro.org> wrote:
> >>
> >> The bootloaders often do some initial configuration of the interconnects
> >> in the system and we want to keep this configuration until all consumers
> >> have probed and expressed their bandwidth needs. This is because we don't
> >> want to change the configuration by starting to disable unused paths until
> >> every user had a chance to request the amount of bandwidth it needs.
> >>
> >> To accomplish this we will implement an interconnect specific sync_state
> >> callback which will synchronize (aggregate and set) the current bandwidth
> >> settings when all consumers have been probed.
> >>
> >> Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
> >> ---
> >>   drivers/interconnect/core.c           | 61 +++++++++++++++++++++++++++
> >>   include/linux/interconnect-provider.h |  5 +++
> >>   2 files changed, 66 insertions(+)
> >>
> >> diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c
> >> index e5f998744501..0c4e38d9f1fa 100644
> >> --- a/drivers/interconnect/core.c
> >> +++ b/drivers/interconnect/core.c
> >> @@ -26,6 +26,8 @@
> >>
> >>   static DEFINE_IDR(icc_idr);
> >>   static LIST_HEAD(icc_providers);
> >> +static int providers_count;
> >> +static bool synced_state;
> >>   static DEFINE_MUTEX(icc_lock);
> >>   static struct dentry *icc_debugfs_dir;
> >>
> >> @@ -255,6 +257,12 @@ static int aggregate_requests(struct icc_node *node)
> >>                          continue;
> >>                  p->aggregate(node, r->tag, r->avg_bw, r->peak_bw,
> >>                               &node->avg_bw, &node->peak_bw);
> >> +
> >> +               /* during boot use the initial bandwidth as a floor value */
> >> +               if (!synced_state) {
> >> +                       node->avg_bw = max(node->avg_bw, node->init_avg);
> >> +                       node->peak_bw = max(node->peak_bw, node->init_peak);
> >> +               }
> >
> > Sorry I didn't reply earlier.
> >
> > I liked your previous approach with the get_bw ops. The v2 approach
> > forces every interconnect provider driver to set up these values even
> > if they are okay with just maxing out the bandwidth. Also, if they can
> > actually query their hardware, this adds additional steps for them.
>
> The problem with using something like get_bw() is that while we can
> dynamically query the HW, we have far less granularity in HW than we
> have nodes in the framework. We vote at BCM-level granularity, but each
> BCM can have many nodes. For example, the sdm845 CN0 BCM has 47 nodes.
> If we implement get_bw() generically, then it would return the BW for
> each node, which would be the queried BCM vote scaled to account for
> differences in BCM/node widths. While this could be useful in general as
> an informational callback, we wouldn't want to use this as a proxy for
> our initial BW vote requirements. For CN0, we wouldn't want or need to
> vote 47 times for the same CN0 BCM. Each of the 47 node requests would
> result in the same BCM request.

Firstly most people in the list don't know what BCM means. Also, all
of this is your provider driver specific issues. If you are exposing
more nodes than available HW granularity, then you might want to
question why it needs to be done (probably to make aggregation easier
for the driver). If it's needed, then optimize your get/set() calls by
caching the value in an internal variable so that you don't send a
request to your BCM if you haven't changed the value since the last
request. This is not a reason to not have get_bw() calls at the
framework level. Other providers might support it and it'd make their
lives easier.

> All we'd really need is a single node per-BCM to serve as the proxy
> node. We'd query the HW, scale the queried value for the chosen proxy
> node, and set init_avg/init_peak appropriately. This would save a lot of
> unnecessary votes. Based on the current implementation, the set() call
> in icc_node_add() for initial BW wouldn't trigger any actual HW requests
> since we only queue BCMs that require updating in the aggregate()
> callback. However, the set() call in icc_sync_state() would, since we
> re-aggregate each node that has a non-zero init_avg/init_peak.

Having a fake "proxy node" seems like a bad internal design. Also,
have you timed the cost of these calls to justify your concern? If you
cache the values after aggregation and check before you send it down
to a "BCM", at worst you get one additional call to rpmh per BCM due
to this feature. I'm guessing any time delta would be lost as noise
compared to the boot up time.

> There's nothing stopping us from implementing get_bw() as if it were
> get_initial_bw(), but that only works until the framework decides to use
> get_bw() for more things than just the initial vote. I suppose we could
> also just have a "get_initial_bw" callback, but it only needs to be
> called once, so doesn't necessarily need a callback as opposed to
> additional init_avg/init_peak members in the icc_node struct.

The benefit of "ops" vs "fill up these variables" is that you can
differentiate between "I don't care, framework can decide" vs "I need
it to be 0". Put another way, there's no way to say "I don't care" if
you use variables. And by default drivers that don't really care (as
in, okay if it's set to INT_MAX) shouldn't have to do extra code/work.

Long story short, there's nothing wrong with get_bw(). If your
specific driver needs to optimize the calls to your RPMH hardware,
that should be hidden inside your driver.

-Saravana

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 1/2] interconnect: Add sync state support
  2020-07-30 19:07       ` Saravana Kannan
@ 2020-08-01  2:36         ` Mike Tipton
  0 siblings, 0 replies; 7+ messages in thread
From: Mike Tipton @ 2020-08-01  2:36 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Georgi Djakov, Linux PM, okukatla, Bjorn Andersson,
	Vincent Guittot, linux-arm-msm, LKML

On 7/30/2020 12:07 PM, Saravana Kannan wrote:
> On Mon, Jul 27, 2020 at 11:18 PM Mike Tipton <mdtipton@codeaurora.org> wrote:
>>
>> On 7/22/2020 10:07 AM, Saravana Kannan wrote:
>>> On Wed, Jul 22, 2020 at 4:01 AM Georgi Djakov <georgi.djakov@linaro.org> wrote:
>>>>
>>>> The bootloaders often do some initial configuration of the interconnects
>>>> in the system and we want to keep this configuration until all consumers
>>>> have probed and expressed their bandwidth needs. This is because we don't
>>>> want to change the configuration by starting to disable unused paths until
>>>> every user had a chance to request the amount of bandwidth it needs.
>>>>
>>>> To accomplish this we will implement an interconnect specific sync_state
>>>> callback which will synchronize (aggregate and set) the current bandwidth
>>>> settings when all consumers have been probed.
>>>>
>>>> Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
>>>> ---
>>>>    drivers/interconnect/core.c           | 61 +++++++++++++++++++++++++++
>>>>    include/linux/interconnect-provider.h |  5 +++
>>>>    2 files changed, 66 insertions(+)
>>>>
>>>> diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c
>>>> index e5f998744501..0c4e38d9f1fa 100644
>>>> --- a/drivers/interconnect/core.c
>>>> +++ b/drivers/interconnect/core.c
>>>> @@ -26,6 +26,8 @@
>>>>
>>>>    static DEFINE_IDR(icc_idr);
>>>>    static LIST_HEAD(icc_providers);
>>>> +static int providers_count;
>>>> +static bool synced_state;
>>>>    static DEFINE_MUTEX(icc_lock);
>>>>    static struct dentry *icc_debugfs_dir;
>>>>
>>>> @@ -255,6 +257,12 @@ static int aggregate_requests(struct icc_node *node)
>>>>                           continue;
>>>>                   p->aggregate(node, r->tag, r->avg_bw, r->peak_bw,
>>>>                                &node->avg_bw, &node->peak_bw);
>>>> +
>>>> +               /* during boot use the initial bandwidth as a floor value */
>>>> +               if (!synced_state) {
>>>> +                       node->avg_bw = max(node->avg_bw, node->init_avg);
>>>> +                       node->peak_bw = max(node->peak_bw, node->init_peak);
>>>> +               }
>>>
>>> Sorry I didn't reply earlier.
>>>
>>> I liked your previous approach with the get_bw ops. The v2 approach
>>> forces every interconnect provider driver to set up these values even
>>> if they are okay with just maxing out the bandwidth. Also, if they can
>>> actually query their hardware, this adds additional steps for them.
>>
>> The problem with using something like get_bw() is that while we can
>> dynamically query the HW, we have far less granularity in HW than we
>> have nodes in the framework. We vote at BCM-level granularity, but each
>> BCM can have many nodes. For example, the sdm845 CN0 BCM has 47 nodes.
>> If we implement get_bw() generically, then it would return the BW for
>> each node, which would be the queried BCM vote scaled to account for
>> differences in BCM/node widths. While this could be useful in general as
>> an informational callback, we wouldn't want to use this as a proxy for
>> our initial BW vote requirements. For CN0, we wouldn't want or need to
>> vote 47 times for the same CN0 BCM. Each of the 47 node requests would
>> result in the same BCM request.
> 
> Firstly most people in the list don't know what BCM means. Also, all
> of this is your provider driver specific issues. If you are exposing
> more nodes than available HW granularity, then you might want to
> question why it needs to be done (probably to make aggregation easier
> for the driver). If it's needed, then optimize your get/set() calls by
> caching the value in an internal variable so that you don't send a
> request to your BCM if you haven't changed the value since the last
> request. This is not a reason to not have get_bw() calls at the
> framework level. Other providers might support it and it'd make their
> lives easier.

The nodes capture the HW topology, so they are all needed to expose the 
necessary paths and aggregate properly. However, HW doesn't support 
voting for each node individually. Instead, they are grouped together 
and collections of nodes are voted using a single "BCM". So in addition 
to the node aggregation done in the framework, we also aggregate across 
all nodes belonging to the BCM.

> 
>> All we'd really need is a single node per-BCM to serve as the proxy
>> node. We'd query the HW, scale the queried value for the chosen proxy
>> node, and set init_avg/init_peak appropriately. This would save a lot of
>> unnecessary votes. Based on the current implementation, the set() call
>> in icc_node_add() for initial BW wouldn't trigger any actual HW requests
>> since we only queue BCMs that require updating in the aggregate()
>> callback. However, the set() call in icc_sync_state() would, since we
>> re-aggregate each node that has a non-zero init_avg/init_peak.
> 
> Having a fake "proxy node" seems like a bad internal design. Also,
> have you timed the cost of these calls to justify your concern? If you
> cache the values after aggregation and check before you send it down
> to a "BCM", at worst you get one additional call to rpmh per BCM due
> to this feature. I'm guessing any time delta would be lost as noise
> compared to the boot up time.

It wouldn't be a fake node. It would be one of the real, existing nodes. 
But since we can only query/vote the BCM (and a single BCM can have 
multiple nodes), then we'd just have to choose one node per-BCM in order 
to "hold" the initial BW from boot.

I don't have specific timing numbers offhand. But now that I think about 
it it's a bigger functional issue. If we implement get_bw() to return 
the available BW for each node in the BCM, then our internal BCM 
aggregation will quickly ramp to max. The avg_bw for each node in the 
BCM is summed. If every node in the BCM votes for the queried BW, then 
it will sum to something much higher.

As I mentioned before, we could solve this problem by implementing 
get_bw() to only return non-zero BW for one node per-BCM. This would 
work for the initial proxy voting. The only downside is that it wouldn't 
be a generic "get_bw" callback anymore. It would return our initial BW 
vote instead. That's not a big deal, unless the framework wants to use 
it for other purposes in the future. But that doesn't seem particularly 
likely. The framework already knows the BW explicitly voted by clients. 
So querying is only useful to find out how much BW is configured in HW, 
but isn't explicitly voted for. Outside of the initial sync-state 
context I don't see that being necessary.

> 
>> There's nothing stopping us from implementing get_bw() as if it were
>> get_initial_bw(), but that only works until the framework decides to use
>> get_bw() for more things than just the initial vote. I suppose we could
>> also just have a "get_initial_bw" callback, but it only needs to be
>> called once, so doesn't necessarily need a callback as opposed to
>> additional init_avg/init_peak members in the icc_node struct.
> 
> The benefit of "ops" vs "fill up these variables" is that you can
> differentiate between "I don't care, framework can decide" vs "I need
> it to be 0". Put another way, there's no way to say "I don't care" if
> you use variables. And by default drivers that don't really care (as
> in, okay if it's set to INT_MAX) shouldn't have to do extra code/work.
> 
> Long story short, there's nothing wrong with get_bw(). If your
> specific driver needs to optimize the calls to your RPMH hardware,
> that should be hidden inside your driver.

Yeah, I get all that. I was mainly pointing out that sometimes there's a 
distinction between the available BW (get_bw) and the desired initial BW 
vote (get_initial_bw). I'm fine if we go with get_bw(), but our 
implementation may look a bit odd.

> 
> -Saravana
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, back to index

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-22 11:01 [PATCH v2 0/2] Add interconnect sync state support Georgi Djakov
2020-07-22 11:01 ` [PATCH v2 1/2] interconnect: Add " Georgi Djakov
2020-07-22 17:07   ` Saravana Kannan
2020-07-28  6:18     ` Mike Tipton
2020-07-30 19:07       ` Saravana Kannan
2020-08-01  2:36         ` Mike Tipton
2020-07-22 11:01 ` [PATCH v2 2/2] interconnect: qcom: Use icc_sync_state in sdm845 and osm-3l drivers Georgi Djakov

Linux-ARM-MSM Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-arm-msm/0 linux-arm-msm/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-arm-msm linux-arm-msm/ https://lore.kernel.org/linux-arm-msm \
		linux-arm-msm@vger.kernel.org
	public-inbox-index linux-arm-msm

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-arm-msm


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git