linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Patrick Bellasi <patrick.bellasi@arm.com>
To: Quentin Perret <quentin.perret@arm.com>
Cc: peterz@infradead.org, rjw@rjwysocki.net,
	linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org,
	gregkh@linuxfoundation.org, mingo@redhat.com,
	dietmar.eggemann@arm.com, morten.rasmussen@arm.com,
	chris.redpath@arm.com, valentin.schneider@arm.com,
	vincent.guittot@linaro.org, thara.gopinath@linaro.org,
	viresh.kumar@linaro.org, tkjos@google.com,
	joel@joelfernandes.org, smuckle@google.com,
	adharmap@codeaurora.org, skannan@codeaurora.org,
	pkondeti@codeaurora.org, juri.lelli@redhat.com,
	edubezval@gmail.com, srinivas.pandruvada@linux.intel.com,
	currojerez@riseup.net, javi.merino@kernel.org
Subject: Re: [PATCH v6 03/14] PM: Introduce an Energy Model management framework
Date: Wed, 29 Aug 2018 11:04:35 +0100	[thread overview]
Message-ID: <20180829100435.GP2960@e110439-lin> (raw)
In-Reply-To: <20180820094420.26590-4-quentin.perret@arm.com>

Hi Quentin,
few possible optimizations related to the (simplified) following
code:

On 20-Aug 10:44, Quentin Perret wrote:

[...]

> +struct em_perf_domain {
> +	struct em_cap_state *table; /* Capacity states, in ascending order. */
> +	int nr_cap_states;
> +	unsigned long cpus[0]; /* CPUs of the frequency domain. */
> +};

[...]

> +static DEFINE_PER_CPU(struct em_perf_domain *, em_data);

[...]

> +struct em_perf_domain *em_cpu_get(int cpu)
> +{
> +	return READ_ONCE(per_cpu(em_data, cpu));
> +}

[...]

> +int em_register_perf_domain(cpumask_t *span, unsigned int nr_states,
> +						struct em_data_callback *cb)
> +{

[...]

> +	mutex_lock(&em_pd_mutex);
> +
[...]

> +     for_each_cpu(cpu, span) {
> +             if (READ_ONCE(per_cpu(em_data, cpu))) {
> +                     ret = -EEXIST;
> +                     goto unlock;
> +             }

[...]

> +	for_each_cpu(cpu, span) {
> +		/*
> +		 * The per-cpu array can be concurrently accessed from
> +		 * em_cpu_get().
> +		 */
> +		smp_store_release(per_cpu_ptr(&em_data, cpu), pd);
> +	}

[...]

> +unlock:
> +	mutex_unlock(&em_pd_mutex);
> +}


In the loop above we use smp_store_release() to propagate the pointer
setting in a PER_CPU(em_data), which ultimate goal is to protect
em_register_perf_domain() from multiple clients registering the same
power domain.

I think there are two possible optimizations there:

1. use of a single memory barrier

   Since we are already em_pd_mutex protected, i.e. there cannot be a
   concurrent writers, we can use one single memory barrier after the
   loop, i.e.

        for_each_cpu(cpu, span)
                WRITE_ONCE()
        smp_wmb()

   which should be just enough to ensure that all other CPUs will see
   the pointer set once we release the mutex

2. avoid using PER_CPU variables

   Apart from the initialization code, i.e. boot time, the em_data is
   expected to be read only, isn't it?

   If that's the case, I think that using PER_CPU variables is not
   strictly required while it unnecessarily increases the cache pressure.

   In the worst case we can end up with one cache line for each CPU to
   host just an 8B pointer, instead of using that single cache line to host
   up to 8 pointers if we use just an array, i.e.

        struct em_perf_domain *em_data[NR_CPUS]
                ____cacheline_aligned_in_smp __read_mostly;

   Consider also that: up to 8 pointers in a single cache line means
   also that single cache line can be enough to access the EM from all
   the CPUs of almost every modern mobile phone SoC.

   Note entirely sure if PER_CPU uses less overall memory in case you
   have much less CPUs then the compile time defined NR_CPUS.
   But still, if the above makes sense, you still have a 8x gain
   factor between number Write allocated .data..percp sections and
   the value of NR_CPUS. Meaning that in the worst case we allocate
   the same amount of memory using NR_CPUS=64 (the default on arm64)
   while running on an 8 CPUs system... but still we should get less
   cluster caches pressure at run-time with the array approach, 1
   cache line vs 4.

Best,
Patrick

-- 
#include <best/regards.h>

Patrick Bellasi

  reply	other threads:[~2018-08-29 10:04 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-20  9:44 [PATCH v6 00/14] Energy Aware Scheduling Quentin Perret
2018-08-20  9:44 ` [PATCH v6 01/14] sched: Relocate arch_scale_cpu_capacity Quentin Perret
2018-08-20  9:44 ` [PATCH v6 02/14] sched/cpufreq: Factor out utilization to frequency mapping Quentin Perret
2018-09-10  9:29   ` Rafael J. Wysocki
2018-08-20  9:44 ` [PATCH v6 03/14] PM: Introduce an Energy Model management framework Quentin Perret
2018-08-29 10:04   ` Patrick Bellasi [this message]
2018-08-29 13:28     ` Quentin Perret
2018-08-31  9:04       ` Patrick Bellasi
2018-09-11  9:34       ` Andrea Parri
2018-09-11 12:32         ` Quentin Perret
2018-09-11 13:31           ` Andrea Parri
2018-09-10  9:44   ` Rafael J. Wysocki
2018-09-10 10:38     ` Quentin Perret
2018-09-10 10:40       ` Rafael J. Wysocki
2018-08-20  9:44 ` [PATCH v6 04/14] PM / EM: Expose the Energy Model in sysfs Quentin Perret
2018-09-06  6:56   ` Dietmar Eggemann
2018-09-06 14:09     ` Quentin Perret
2018-09-07  0:14       ` Dietmar Eggemann
2018-08-20  9:44 ` [PATCH v6 05/14] sched/topology: Reference the Energy Model of CPUs when available Quentin Perret
2018-08-29 16:22   ` Patrick Bellasi
2018-08-29 16:56     ` Quentin Perret
2018-08-30 10:00       ` Patrick Bellasi
2018-08-30 10:47         ` Quentin Perret
2018-08-30 12:50           ` Patrick Bellasi
2018-08-20  9:44 ` [PATCH v6 06/14] sched/topology: Lowest CPU asymmetry sched_domain level pointer Quentin Perret
2018-08-20  9:44 ` [PATCH v6 07/14] sched/topology: Introduce sched_energy_present static key Quentin Perret
2018-08-29 16:50   ` Patrick Bellasi
2018-08-29 17:20     ` Quentin Perret
2018-08-30  9:23       ` Patrick Bellasi
2018-08-30  9:57         ` Quentin Perret
2018-08-30 10:18           ` Patrick Bellasi
2018-09-06  6:06   ` Dietmar Eggemann
2018-09-06  9:29     ` Quentin Perret
2018-09-06 23:49       ` Dietmar Eggemann
2018-09-07  8:24         ` Quentin Perret
2018-08-20  9:44 ` [PATCH v6 08/14] sched/fair: Clean-up update_sg_lb_stats parameters Quentin Perret
2018-08-20  9:44 ` [PATCH v6 09/14] sched: Add over-utilization/tipping point indicator Quentin Perret
2018-08-20  9:44 ` [PATCH v6 10/14] sched/cpufreq: Refactor the utilization aggregation method Quentin Perret
2018-09-10  9:53   ` Rafael J. Wysocki
2018-09-10 10:07     ` Quentin Perret
2018-09-10 10:25       ` Rafael J. Wysocki
2018-08-20  9:44 ` [PATCH v6 11/14] sched/fair: Introduce an energy estimation helper function Quentin Perret
2018-08-20  9:44 ` [PATCH v6 12/14] sched/fair: Select an energy-efficient CPU on task wake-up Quentin Perret
2018-08-20  9:44 ` [PATCH v6 13/14] sched/topology: Make Energy Aware Scheduling depend on schedutil Quentin Perret
2018-09-04 10:59   ` Quentin Perret
2018-09-06  9:18     ` Rafael J. Wysocki
2018-09-06 14:38       ` Quentin Perret
2018-09-07  8:52         ` Rafael J. Wysocki
2018-09-07  8:56           ` Rafael J. Wysocki
2018-09-07  9:02             ` Quentin Perret
2018-09-07 15:29           ` Quentin Perret
2018-09-09 20:13             ` Rafael J. Wysocki
2018-09-10  8:24               ` Quentin Perret
2018-09-10  8:55                 ` Rafael J. Wysocki
2018-09-10  9:43                   ` Quentin Perret
2018-08-20  9:44 ` [PATCH v6 14/14] OPTIONAL: cpufreq: dt: Register an Energy Model Quentin Perret
2018-09-10  9:12 ` [PATCH v6 00/14] Energy Aware Scheduling Rafael J. Wysocki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180829100435.GP2960@e110439-lin \
    --to=patrick.bellasi@arm.com \
    --cc=adharmap@codeaurora.org \
    --cc=chris.redpath@arm.com \
    --cc=currojerez@riseup.net \
    --cc=dietmar.eggemann@arm.com \
    --cc=edubezval@gmail.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=javi.merino@kernel.org \
    --cc=joel@joelfernandes.org \
    --cc=juri.lelli@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=morten.rasmussen@arm.com \
    --cc=peterz@infradead.org \
    --cc=pkondeti@codeaurora.org \
    --cc=quentin.perret@arm.com \
    --cc=rjw@rjwysocki.net \
    --cc=skannan@codeaurora.org \
    --cc=smuckle@google.com \
    --cc=srinivas.pandruvada@linux.intel.com \
    --cc=thara.gopinath@linaro.org \
    --cc=tkjos@google.com \
    --cc=valentin.schneider@arm.com \
    --cc=vincent.guittot@linaro.org \
    --cc=viresh.kumar@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).