All of lore.kernel.org
 help / color / mirror / Atom feed
From: Qais Yousef <qais.yousef@arm.com>
To: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org, linux-arch@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	Catalin Marinas <catalin.marinas@arm.com>,
	Marc Zyngier <maz@kernel.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Morten Rasmussen <morten.rasmussen@arm.com>,
	Suren Baghdasaryan <surenb@google.com>,
	Quentin Perret <qperret@google.com>, Tejun Heo <tj@kernel.org>,
	Li Zefan <lizefan@huawei.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Ingo Molnar <mingo@redhat.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	kernel-team@android.com
Subject: Re: [PATCH v4 12/14] arm64: Prevent offlining first CPU with 32-bit EL0 on mismatched system
Date: Wed, 2 Dec 2020 12:59:52 +0000	[thread overview]
Message-ID: <20201202125952.z2q6oucoqbwt6aq2@e107158-lin.cambridge.arm.com> (raw)
In-Reply-To: <20201201221335.GA28496@willie-the-truck>

On 12/01/20 22:13, Will Deacon wrote:
> On Fri, Nov 27, 2020 at 01:41:22PM +0000, Qais Yousef wrote:
> > On 11/24/20 15:50, Will Deacon wrote:
> > > If we want to support 32-bit applications, then when we identify a CPU
> > > with mismatched 32-bit EL0 support we must ensure that we will always
> > > have an active 32-bit CPU available to us from then on. This is important
> > > for the scheduler, because is_cpu_allowed() will be constrained to 32-bit
> > > CPUs for compat tasks and forced migration due to a hotplug event will
> > > hang if no 32-bit CPUs are available.
> > > 
> > > On detecting a mismatch, prevent offlining of either the mismatching CPU
> > > if it is 32-bit capable, or find the first active 32-bit capable CPU
> > > otherwise.
> >                                        ^^^^^
> > 
> > You use cpumask_any_and(). Better use cpumask_first_and()? We have a truly
> > random function now, cpumask_any_and_distribute(), if you'd like to pick
> > something 'truly' random.
> 
> I think cpumask_any_and() is better, because it makes it clear that I don't
> care about which CPU is chosen (and under the hood it ends up calling
> cpumask_first_and() _anyway_). So this is purely cosmetic.
> 
> > > diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> > > index 29017cbb6c8e..fe470683b43e 100644
> > > --- a/arch/arm64/kernel/cpufeature.c
> > > +++ b/arch/arm64/kernel/cpufeature.c
> > > @@ -1237,6 +1237,8 @@ has_cpuid_feature(const struct arm64_cpu_capabilities *entry, int scope)
> > >  
> > >  static int enable_mismatched_32bit_el0(unsigned int cpu)
> > >  {
> > > +	static int lucky_winner = -1;
> > > +
> > >  	struct cpuinfo_arm64 *info = &per_cpu(cpu_data, cpu);
> > >  	bool cpu_32bit = id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0);
> > >  
> > > @@ -1245,6 +1247,22 @@ static int enable_mismatched_32bit_el0(unsigned int cpu)
> > >  		static_branch_enable_cpuslocked(&arm64_mismatched_32bit_el0);
> > >  	}
> > >  
> > > +	if (cpumask_test_cpu(0, cpu_32bit_el0_mask) == cpu_32bit)
> > > +		return 0;
> > 
> > Hmm I'm struggling to get what you're doing here. You're treating CPU0 (the
> > boot CPU) specially here, but I don't get why?
> 
> If our ability to execute 32-bit code is the same as the boot CPU then we
> don't have to do anything. That way, we can postpone nominating the lucky
> winner until we really need to.

Okay I see what you're doing now. The '== cpu_32bit' part of the check gave me
trouble. If the first N cpus are 64bit only, we'll skip them here. Worth
a comment?

Wouldn't it be better to replace this with a check if cpu_32bit_el0_mask is
empty instead? That would be a lot easier to read.

> 
> > > +	if (lucky_winner >= 0)
> > > +		return 0;
> > > +
> > > +	/*
> > > +	 * We've detected a mismatch. We need to keep one of our CPUs with
> > > +	 * 32-bit EL0 online so that is_cpu_allowed() doesn't end up rejecting
> > > +	 * every CPU in the system for a 32-bit task.
> > > +	 */
> > > +	lucky_winner = cpu_32bit ? cpu : cpumask_any_and(cpu_32bit_el0_mask,
> > > +							 cpu_active_mask);
> > 
> > cpumask_any_and() could return an error. It could be hard or even impossible to
> > trigger, but better check if lucky_winner is not >= nr_cpu_ids before calling
> > get_cpu_device(lucky_winner) to stay in the safe side and avoid a potential
> > splat?
> 
> I don't see how it can return an error here. There are two cases to
> consider:
> 
>   1. The CPU being brought online is the first 32-bit-capable CPU. In which
>      case, we don't use cpumask_any_and() at all.
> 
>   2. The CPU being brought online is the first 64-bit-only CPU. In which
>      case, the CPU doing the onlining is 32-bit capable and will be in
>      the active mask.

Now I understand the if condition above; yeah I think this will not hit.
The condition above simply guarantees cpu_32bit_el0_mask is not empty. And
since this is online path, there's a guarantee there's a single bit shared
between the 2 masks since the same path must have set this shared bit.

> 
> > We can do better by the way and do smarter check in remove_cpu() to block
> > offlining the last aarch32 capable CPU without 'hardcoding' a specific cpu. But
> > won't insist and happy to wait for someone to come complaining this is not good
> > enough first.
> 
> I couldn't find a satisfactory way to do this without the possibility of
> subtle races, so I'd prefer to keep it simple for the moment. In particular,
> I wanted to make sure that somebody iterating over the cpu_possible_mask
> and calling is_cpu_allowed(p, cpu) for each CPU and a 32-bit task can not
> reach the end of the mask without ever getting a value of 'true'.
> 
> I'm open to revisiting this once some of this is merged, but right now
> I don't think it's needed and it certainly adds complexity.

Agreed. I just wanted to share some awareness. Let's not make this series more
complicated than it needs to be.

> 
> > Some vendors play games with hotplug to help with saving power. They might want
> > to dynamically nominate the last man standing 32bit capable CPU. Again, we can
> > wait for someone to complain first I guess.
> 
> The reality is that either all "big" cores or all "little" cores will be the
> ones that are 32-bit capable, so I doubt it matters an awful lot which one
> of the cluster is left online from a PM perspective. The real problem is
> that a core has to be left online at all, but I don't think we can avoid
> that.

I don't have specific info, but in theory it could matter. We enforce
a specific core to be always online, rather than allow any core to be offlined
until there's a single one left in the mask. I agree we shouldn't worry about
this for now.

And yes, we can't avoid keeping the last 32bit capable cpu online. The same way
we can't offline the last cpu in a normal system.

Thanks

--
Qais Yousef

WARNING: multiple messages have this Message-ID (diff)
From: Qais Yousef <qais.yousef@arm.com>
To: Will Deacon <will@kernel.org>
Cc: linux-arch@vger.kernel.org, Marc Zyngier <maz@kernel.org>,
	kernel-team@android.com,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Quentin Perret <qperret@google.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	linux-kernel@vger.kernel.org, Ingo Molnar <mingo@redhat.com>,
	Li Zefan <lizefan@huawei.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Tejun Heo <tj@kernel.org>, Suren Baghdasaryan <surenb@google.com>,
	Morten Rasmussen <morten.rasmussen@arm.com>,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v4 12/14] arm64: Prevent offlining first CPU with 32-bit EL0 on mismatched system
Date: Wed, 2 Dec 2020 12:59:52 +0000	[thread overview]
Message-ID: <20201202125952.z2q6oucoqbwt6aq2@e107158-lin.cambridge.arm.com> (raw)
In-Reply-To: <20201201221335.GA28496@willie-the-truck>

On 12/01/20 22:13, Will Deacon wrote:
> On Fri, Nov 27, 2020 at 01:41:22PM +0000, Qais Yousef wrote:
> > On 11/24/20 15:50, Will Deacon wrote:
> > > If we want to support 32-bit applications, then when we identify a CPU
> > > with mismatched 32-bit EL0 support we must ensure that we will always
> > > have an active 32-bit CPU available to us from then on. This is important
> > > for the scheduler, because is_cpu_allowed() will be constrained to 32-bit
> > > CPUs for compat tasks and forced migration due to a hotplug event will
> > > hang if no 32-bit CPUs are available.
> > > 
> > > On detecting a mismatch, prevent offlining of either the mismatching CPU
> > > if it is 32-bit capable, or find the first active 32-bit capable CPU
> > > otherwise.
> >                                        ^^^^^
> > 
> > You use cpumask_any_and(). Better use cpumask_first_and()? We have a truly
> > random function now, cpumask_any_and_distribute(), if you'd like to pick
> > something 'truly' random.
> 
> I think cpumask_any_and() is better, because it makes it clear that I don't
> care about which CPU is chosen (and under the hood it ends up calling
> cpumask_first_and() _anyway_). So this is purely cosmetic.
> 
> > > diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> > > index 29017cbb6c8e..fe470683b43e 100644
> > > --- a/arch/arm64/kernel/cpufeature.c
> > > +++ b/arch/arm64/kernel/cpufeature.c
> > > @@ -1237,6 +1237,8 @@ has_cpuid_feature(const struct arm64_cpu_capabilities *entry, int scope)
> > >  
> > >  static int enable_mismatched_32bit_el0(unsigned int cpu)
> > >  {
> > > +	static int lucky_winner = -1;
> > > +
> > >  	struct cpuinfo_arm64 *info = &per_cpu(cpu_data, cpu);
> > >  	bool cpu_32bit = id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0);
> > >  
> > > @@ -1245,6 +1247,22 @@ static int enable_mismatched_32bit_el0(unsigned int cpu)
> > >  		static_branch_enable_cpuslocked(&arm64_mismatched_32bit_el0);
> > >  	}
> > >  
> > > +	if (cpumask_test_cpu(0, cpu_32bit_el0_mask) == cpu_32bit)
> > > +		return 0;
> > 
> > Hmm I'm struggling to get what you're doing here. You're treating CPU0 (the
> > boot CPU) specially here, but I don't get why?
> 
> If our ability to execute 32-bit code is the same as the boot CPU then we
> don't have to do anything. That way, we can postpone nominating the lucky
> winner until we really need to.

Okay I see what you're doing now. The '== cpu_32bit' part of the check gave me
trouble. If the first N cpus are 64bit only, we'll skip them here. Worth
a comment?

Wouldn't it be better to replace this with a check if cpu_32bit_el0_mask is
empty instead? That would be a lot easier to read.

> 
> > > +	if (lucky_winner >= 0)
> > > +		return 0;
> > > +
> > > +	/*
> > > +	 * We've detected a mismatch. We need to keep one of our CPUs with
> > > +	 * 32-bit EL0 online so that is_cpu_allowed() doesn't end up rejecting
> > > +	 * every CPU in the system for a 32-bit task.
> > > +	 */
> > > +	lucky_winner = cpu_32bit ? cpu : cpumask_any_and(cpu_32bit_el0_mask,
> > > +							 cpu_active_mask);
> > 
> > cpumask_any_and() could return an error. It could be hard or even impossible to
> > trigger, but better check if lucky_winner is not >= nr_cpu_ids before calling
> > get_cpu_device(lucky_winner) to stay in the safe side and avoid a potential
> > splat?
> 
> I don't see how it can return an error here. There are two cases to
> consider:
> 
>   1. The CPU being brought online is the first 32-bit-capable CPU. In which
>      case, we don't use cpumask_any_and() at all.
> 
>   2. The CPU being brought online is the first 64-bit-only CPU. In which
>      case, the CPU doing the onlining is 32-bit capable and will be in
>      the active mask.

Now I understand the if condition above; yeah I think this will not hit.
The condition above simply guarantees cpu_32bit_el0_mask is not empty. And
since this is online path, there's a guarantee there's a single bit shared
between the 2 masks since the same path must have set this shared bit.

> 
> > We can do better by the way and do smarter check in remove_cpu() to block
> > offlining the last aarch32 capable CPU without 'hardcoding' a specific cpu. But
> > won't insist and happy to wait for someone to come complaining this is not good
> > enough first.
> 
> I couldn't find a satisfactory way to do this without the possibility of
> subtle races, so I'd prefer to keep it simple for the moment. In particular,
> I wanted to make sure that somebody iterating over the cpu_possible_mask
> and calling is_cpu_allowed(p, cpu) for each CPU and a 32-bit task can not
> reach the end of the mask without ever getting a value of 'true'.
> 
> I'm open to revisiting this once some of this is merged, but right now
> I don't think it's needed and it certainly adds complexity.

Agreed. I just wanted to share some awareness. Let's not make this series more
complicated than it needs to be.

> 
> > Some vendors play games with hotplug to help with saving power. They might want
> > to dynamically nominate the last man standing 32bit capable CPU. Again, we can
> > wait for someone to complain first I guess.
> 
> The reality is that either all "big" cores or all "little" cores will be the
> ones that are 32-bit capable, so I doubt it matters an awful lot which one
> of the cluster is left online from a PM perspective. The real problem is
> that a core has to be left online at all, but I don't think we can avoid
> that.

I don't have specific info, but in theory it could matter. We enforce
a specific core to be always online, rather than allow any core to be offlined
until there's a single one left in the mask. I agree we shouldn't worry about
this for now.

And yes, we can't avoid keeping the last 32bit capable cpu online. The same way
we can't offline the last cpu in a normal system.

Thanks

--
Qais Yousef

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2020-12-02 13:00 UTC|newest]

Thread overview: 122+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-24 15:50 [PATCH v4 00/14] An alternative series for asymmetric AArch32 systems Will Deacon
2020-11-24 15:50 ` Will Deacon
2020-11-24 15:50 ` [PATCH v4 01/14] arm64: cpuinfo: Split AArch32 registers out into a separate struct Will Deacon
2020-11-24 15:50   ` Will Deacon
2020-11-24 15:50 ` [PATCH v4 02/14] arm64: Allow mismatched 32-bit EL0 support Will Deacon
2020-11-24 15:50   ` Will Deacon
2020-11-27 10:25   ` Marc Zyngier
2020-11-27 10:25     ` Marc Zyngier
2020-11-27 11:50     ` Will Deacon
2020-11-27 11:50       ` Will Deacon
2020-11-27 13:09   ` Qais Yousef
2020-11-27 13:09     ` Qais Yousef
2020-12-01 16:56     ` Will Deacon
2020-12-01 16:56       ` Will Deacon
2020-12-02 13:16       ` Qais Yousef
2020-12-02 13:16         ` Qais Yousef
2020-11-24 15:50 ` [PATCH v4 03/14] KVM: arm64: Kill 32-bit vCPUs on systems with mismatched " Will Deacon
2020-11-24 15:50   ` Will Deacon
2020-11-27 10:26   ` Marc Zyngier
2020-11-27 10:26     ` Marc Zyngier
2020-11-27 11:53     ` Will Deacon
2020-11-27 11:53       ` Will Deacon
2020-11-27 17:14       ` Marc Zyngier
2020-11-27 17:14         ` Marc Zyngier
2020-11-27 17:24         ` Quentin Perret
2020-11-27 17:24           ` Quentin Perret
2020-11-27 18:16           ` Marc Zyngier
2020-11-27 18:16             ` Marc Zyngier
2020-12-01 16:57             ` Will Deacon
2020-12-01 16:57               ` Will Deacon
2020-12-02  8:18               ` Marc Zyngier
2020-12-02  8:18                 ` Marc Zyngier
2020-12-02 17:27                 ` Will Deacon
2020-12-02 17:27                   ` Will Deacon
2020-11-24 15:50 ` [PATCH v4 04/14] arm64: Kill 32-bit applications scheduled on 64-bit-only CPUs Will Deacon
2020-11-24 15:50   ` Will Deacon
2020-11-27 13:12   ` Qais Yousef
2020-11-27 13:12     ` Qais Yousef
2020-12-01 16:56     ` Will Deacon
2020-12-01 16:56       ` Will Deacon
2020-12-02 13:52       ` Qais Yousef
2020-12-02 13:52         ` Qais Yousef
2020-12-02 17:42         ` Will Deacon
2020-12-02 17:42           ` Will Deacon
2020-11-24 15:50 ` [PATCH v4 05/14] arm64: Advertise CPUs capable of running 32-bit applications in sysfs Will Deacon
2020-11-24 15:50   ` Will Deacon
2020-11-24 15:50 ` [PATCH v4 06/14] arm64: Hook up cmdline parameter to allow mismatched 32-bit EL0 Will Deacon
2020-11-24 15:50   ` Will Deacon
2020-11-27 13:17   ` Qais Yousef
2020-11-27 13:17     ` Qais Yousef
2020-12-01 16:56     ` Will Deacon
2020-12-01 16:56       ` Will Deacon
2020-11-24 15:50 ` [PATCH v4 07/14] sched: Introduce restrict_cpus_allowed_ptr() to limit task CPU affinity Will Deacon
2020-11-24 15:50   ` Will Deacon
2020-11-27  9:49   ` Quentin Perret
2020-11-27  9:49     ` Quentin Perret
2020-11-27 13:19   ` Qais Yousef
2020-11-27 13:19     ` Qais Yousef
2020-12-01 16:56     ` Will Deacon
2020-12-01 16:56       ` Will Deacon
2020-12-02 13:06       ` Qais Yousef
2020-12-02 13:06         ` Qais Yousef
2020-11-24 15:50 ` [PATCH v4 08/14] arm64: exec: Adjust affinity for compat tasks with mismatched 32-bit EL0 Will Deacon
2020-11-24 15:50   ` Will Deacon
2020-11-27 10:01   ` Quentin Perret
2020-11-27 10:01     ` Quentin Perret
2020-11-27 13:23   ` Qais Yousef
2020-11-27 13:23     ` Qais Yousef
2020-12-01 16:55     ` Will Deacon
2020-12-01 16:55       ` Will Deacon
2020-12-02 14:07       ` Qais Yousef
2020-12-02 14:07         ` Qais Yousef
2020-11-24 15:50 ` [PATCH v4 09/14] cpuset: Don't use the cpu_possible_mask as a last resort for cgroup v1 Will Deacon
2020-11-24 15:50   ` Will Deacon
2020-11-27 13:32   ` Qais Yousef
2020-11-27 13:32     ` Qais Yousef
2020-11-30 17:05     ` Qais Yousef
2020-11-30 17:05       ` Qais Yousef
2020-11-30 17:36       ` Quentin Perret
2020-11-30 17:36         ` Quentin Perret
2020-12-01 11:58         ` Qais Yousef
2020-12-01 11:58           ` Qais Yousef
2020-12-01 12:37           ` Quentin Perret
2020-12-01 12:37             ` Quentin Perret
2020-12-01 14:11             ` Qais Yousef
2020-12-01 14:11               ` Qais Yousef
2020-12-01 15:56               ` Quentin Perret
2020-12-01 15:56                 ` Quentin Perret
2020-12-01 22:30                 ` Will Deacon
2020-12-01 22:30                   ` Will Deacon
2020-12-02 11:34                   ` Qais Yousef
2020-12-02 11:34                     ` Qais Yousef
2020-12-02 11:33                 ` Qais Yousef
2020-12-02 11:33                   ` Qais Yousef
2020-11-24 15:50 ` [PATCH v4 10/14] sched: Introduce arch_task_cpu_possible_mask() to limit fallback rq selection Will Deacon
2020-11-24 15:50   ` Will Deacon
2020-11-24 15:50 ` [PATCH v4 11/14] sched: Reject CPU affinity changes based on arch_task_cpu_possible_mask() Will Deacon
2020-11-24 15:50   ` Will Deacon
2020-11-27  9:54   ` Quentin Perret
2020-11-27  9:54     ` Quentin Perret
2020-11-24 15:50 ` [PATCH v4 12/14] arm64: Prevent offlining first CPU with 32-bit EL0 on mismatched system Will Deacon
2020-11-24 15:50   ` Will Deacon
2020-11-27 13:41   ` Qais Yousef
2020-11-27 13:41     ` Qais Yousef
2020-12-01 22:13     ` Will Deacon
2020-12-01 22:13       ` Will Deacon
2020-12-02 12:59       ` Qais Yousef [this message]
2020-12-02 12:59         ` Qais Yousef
2020-12-02 17:42         ` Will Deacon
2020-12-02 17:42           ` Will Deacon
2020-12-02 18:08           ` Qais Yousef
2020-12-02 18:08             ` Qais Yousef
2020-11-24 15:50 ` [PATCH v4 13/14] arm64: Implement arch_task_cpu_possible_mask() Will Deacon
2020-11-24 15:50   ` Will Deacon
2020-11-27 13:41   ` Qais Yousef
2020-11-27 13:41     ` Qais Yousef
2020-11-24 15:50 ` [PATCH v4 14/14] arm64: Remove logic to kill 32-bit tasks on 64-bit-only cores Will Deacon
2020-11-24 15:50   ` Will Deacon
2020-11-27 13:58 ` [PATCH v4 00/14] An alternative series for asymmetric AArch32 systems Qais Yousef
2020-11-27 13:58   ` Qais Yousef
2020-12-05 20:43 ` Pavel Machek
2020-12-05 20:43   ` Pavel Machek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201202125952.z2q6oucoqbwt6aq2@e107158-lin.cambridge.arm.com \
    --to=qais.yousef@arm.com \
    --cc=catalin.marinas@arm.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=juri.lelli@redhat.com \
    --cc=kernel-team@android.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lizefan@huawei.com \
    --cc=maz@kernel.org \
    --cc=mingo@redhat.com \
    --cc=morten.rasmussen@arm.com \
    --cc=peterz@infradead.org \
    --cc=qperret@google.com \
    --cc=surenb@google.com \
    --cc=tj@kernel.org \
    --cc=vincent.guittot@linaro.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.