Re: [PATCH v4] sched/fair: Consider cpu affinity when allowing NUMA imbalance in find_idlest_group

From: K Prateek Nayak <kprateek.nayak@amd.com>
To: Mel Gorman <mgorman@techsingularity.net>
Cc: peterz@infradead.org, aubrey.li@linux.intel.com, efault@gmx.de,
	gautham.shenoy@amd.com, linux-kernel@vger.kernel.org,
	mingo@kernel.org, song.bao.hua@hisilicon.com,
	srikar@linux.vnet.ibm.com, valentin.schneider@arm.com,
	vincent.guittot@linaro.org
Subject: Re: [PATCH v4] sched/fair: Consider cpu affinity when allowing NUMA imbalance in find_idlest_group
Date: Fri, 18 Feb 2022 09:25:48 +0530
Message-ID: <9de59bef-0dbb-d94a-077e-28e06daa521a@amd.com>
In-Reply-To: <20220217131512.GW3366@techsingularity.net>

Hello Mel,

Thank you for the feedback.

On 2/17/2022 6:45 PM, Mel Gorman wrote:
> On Thu, Feb 17, 2022 at 04:53:51PM +0530, K Prateek Nayak wrote:
>> [..snip..]
>> Can we optimize this further as:
>>
>> 	imb = sd->imb_numa_nr;
>> 	if (unlikely(p->nr_cpus_allowed != num_online_cpus())) {
>> 		struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
>>
>> 		cpumask_and(cpus, sched_group_span(local), p->cpus_ptr);
>> 		imb = min(cpumask_weight(cpus), imb);
>> 	}
>>
>> For the most part, p->nr_cpus_allowed will be equal to num_online_cpus()
>> unless the user has specifically pinned the task.
>>
> I'm a little wary due to https://lwn.net/Articles/420019/ raising concerns
> from people that feel more strongly about likely/unlikely use.
I wasn't aware of this. Thank you for pointing this out to me.
> Whether that
> branch is likely true or not is specific to the deployment. On my desktop
> and most tests I run, the branch is very unlikely because most workloads
> I run are usually not CPU-constrained and not fork-intensive. Even those
> that are CPU constrained are generally not fork intensive. For a setup with
> lots of containers, virtual machines, locality-aware applications etc,
> the path is potentially very likely and harder to detect in the future.
Yes, you make a good point.
> I don't object to the change but I would wonder if it's measurable for
> anything other than a fork-intensive microbenchmark given it's one branch
> in a relatively heavy operation.
>
> I think a relatively harmless micro-optimisation would be
>
> -		imb = min(cpumask_weight(cpus), imb);
> +		imb = cpumask_weight(cpus);
>
> It assumes that the constrained cpus_allowed would have a lower imb
> than one calculated based on all cpus allowed which sounds like a safe
> assumption other than racing with hot-onlining a bunch of CPUs.
This is a good micro-optimization as long as the assumption holds
true.
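
To make the assumption concrete, here is a minimal user-space sketch, not
the kernel code. CPU masks are modelled as 64-bit words, mask_weight() is a
hypothetical stand-in for cpumask_weight(), and the other function names are
made up for illustration. It assumes a GCC/Clang compiler for
__builtin_popcountll().

```c
#include <stdint.h>

/* Hypothetical stand-in for cpumask_weight(): number of set bits. */
static int mask_weight(uint64_t mask)
{
	return __builtin_popcountll(mask);
}

/* With min(): the result never exceeds the domain's allowed imbalance. */
static int imb_with_min(uint64_t group_span, uint64_t cpus_allowed,
			int imb_numa_nr)
{
	int weight = mask_weight(group_span & cpus_allowed);

	return weight < imb_numa_nr ? weight : imb_numa_nr;
}

/* Without min(): trusts that the constrained weight is already lower. */
static int imb_without_min(uint64_t group_span, uint64_t cpus_allowed)
{
	return mask_weight(group_span & cpus_allowed);
}
```

With made-up inputs of group_span = 0xff and imb_numa_nr = 4, a task allowed
on 0x03 gets 2 from both variants, while a task allowed on 0x3f gets 4 with
min() and 6 without it. The two only agree while the constrained weight
stays at or below the unconstrained imbalance.
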
> I think both micro-optimisations are negligible in comparison to avoiding
> an unnecessary cpumask_and cpumask_weight call.
I agree. Checking for p->nr_cpus_allowed != num_online_cpus() will
avoid the relatively expensive cpumask operations.
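
A rough user-space model of that fast path, under the same assumptions as
the snippet quoted at the top of this mail. The struct, the function name
and the mask_ops counter are hypothetical and exist only to show when the
mask work is skipped; masks are again plain 64-bit words.

```c
#include <stdint.h>

/* Hypothetical result type: chosen imbalance plus modelled mask operations. */
struct imb_result {
	int imb;
	int mask_ops;
};

static struct imb_result pick_imb(int nr_cpus_allowed, int nr_online,
				  uint64_t group_span, uint64_t cpus_allowed,
				  int imb_numa_nr)
{
	struct imb_result r = { .imb = imb_numa_nr, .mask_ops = 0 };

	/* Only tasks with a restricted affinity pay for the mask work. */
	if (nr_cpus_allowed != nr_online) {
		uint64_t cpus = group_span & cpus_allowed;	/* models cpumask_and() */
		int weight = __builtin_popcountll(cpus);	/* models cpumask_weight() */

		r.mask_ops = 2;
		r.imb = weight < r.imb ? weight : r.imb;
	}

	return r;
}
```

For an unpinned task (nr_cpus_allowed == nr_online) the function returns
imb_numa_nr unchanged and performs no mask operations; a pinned task takes
the slower branch.
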
> FWIW, I looked at my own
> use of likely/unlikely recently and it's
>
> c49c2c47dab6b8d45022b3fabf0642a0e62e3109 unlikely that memory hotplug operation is in progress
> 3b12e7e97938424de2bb1b95ba0bd6a49bad39f9 hotplug active or machine booting
> df1acc856923c0a65c28b588585449106c316b71 memory isolated for hotplug or CMA attempt in progress
> 56f0e661ea8c0178e80048df7166653a51ef2c3d memory isolated for hotplug or CMA attempt in progress
> b3b64ebd38225d8032b5db42938d969b602040c2 bulk allocation request with an array that already has pages
>
> Of those, the last one is the most marginal because it really depends
> on whether network core or NFS is the heavy user of the interface and
> I made a guess that high-speed networks are more common critical paths
> than NFS servers.
Thank you for giving these examples. I considered the branch being
taken to be very unlikely based on my workloads, but as you pointed
out, there may be other cases where its outcome is not so predictable.
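
For readers unfamiliar with the annotation, a small user-space sketch of
the idiom under discussion. The macro and function names are hypothetical,
and it assumes a GCC/Clang compiler providing __builtin_expect(). The hint
only tells the compiler which way the branch is expected to go; the value
returned is the same with or without it.

```c
/* Hypothetical names, modelled on the usual __builtin_expect() idiom. */
#define hint_likely(x)		__builtin_expect(!!(x), 1)
#define hint_unlikely(x)	__builtin_expect(!!(x), 0)

/* Returns 1 if the task is treated as pinned, 0 otherwise. */
static int task_is_pinned(int nr_cpus_allowed, int nr_online)
{
	if (hint_unlikely(nr_cpus_allowed != nr_online))
		return 1;

	return 0;
}
```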

--
Thanks and Regards,
Prateek