linux-kernel.vger.kernel.org archive mirror
* [PATCH] arm64: Set SSBS for user threads while creation
@ 2019-12-23 13:02 Srinivas Ramana
  2019-12-24  7:06 ` Anshuman Khandual
  2020-01-02 18:01 ` Catalin Marinas
  0 siblings, 2 replies; 7+ messages in thread
From: Srinivas Ramana @ 2019-12-23 13:02 UTC (permalink / raw)
  To: will, catalin.marinas, maz, will.deacon
  Cc: linux-arm-kernel, linux-kernel, linux-arm-msm, Srinivas Ramana

The current SSBS implementation sets the SSBS bit for user
threads in start_thread(). This works for tasks launched with
fork/clone followed by execve, but when userspace calls fork
without a subsequent execve (e.g., Java applications), the SSBS
bit is left unset. This results in a performance regression for
such tasks.

It is understood that commit cbdf8a189a66 ("arm64: Force SSBS
on context switch") masks this issue, but that commit addressed
a different problem: heterogeneous systems in which only some
CPUs support SSBS. It is appropriate to set the SSBS bit for
all user threads at creation time.

Fixes: 8f04e8e6e29c ("arm64: ssbd: Add support for PSTATE.SSBS rather than trapping to EL3")
Signed-off-by: Srinivas Ramana <sramana@codeaurora.org>
---
 arch/arm64/kernel/process.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 71f788cd2b18..a8f05cc39261 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -399,6 +399,13 @@ int copy_thread(unsigned long clone_flags, unsigned long stack_start,
 		 */
 		if (clone_flags & CLONE_SETTLS)
 			p->thread.uw.tp_value = childregs->regs[3];
+
+		if (arm64_get_ssbd_state() != ARM64_SSBD_FORCE_ENABLE) {
+			if (is_compat_thread(task_thread_info(p)))
+				set_compat_ssbs_bit(childregs);
+			else
+				set_ssbs_bit(childregs);
+		}
 	} else {
 		memset(childregs, 0, sizeof(struct pt_regs));
 		childregs->pstate = PSR_MODE_EL1h;
-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc., 
is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.


* Re: [PATCH] arm64: Set SSBS for user threads while creation
  2019-12-23 13:02 [PATCH] arm64: Set SSBS for user threads while creation Srinivas Ramana
@ 2019-12-24  7:06 ` Anshuman Khandual
  2019-12-24  8:30   ` Srinivas Ramana
  2020-01-02 18:01 ` Catalin Marinas
  1 sibling, 1 reply; 7+ messages in thread
From: Anshuman Khandual @ 2019-12-24  7:06 UTC (permalink / raw)
  To: Srinivas Ramana, will, catalin.marinas, maz, will.deacon
  Cc: linux-arm-msm, linux-kernel, linux-arm-kernel



On 12/23/2019 06:32 PM, Srinivas Ramana wrote:
> Current SSBS implementation takes care of setting the
> SSBS bit in start_thread() for user threads. While this works
> for tasks launched with fork/clone followed by execve, for cases
> where userspace would just call fork (eg, Java applications) this
> leaves the SSBS bit unset. This results in performance
> regression for such tasks.
> 
> It is understood that commit cbdf8a189a66 ("arm64: Force SSBS
> on context switch") masks this issue, but that was done for a
> different reason where heterogeneous CPUs(both SSBS supported
> and unsupported) are present. It is appropriate to take care
> of the SSBS bit for all threads while creation itself.

So this fixes the situation (i.e., low performance) from the creation time
of a task with fork() that will never see a subsequent execve, until it
gets context switched for the very first time?

> 
> Fixes: 8f04e8e6e29c ("arm64: ssbd: Add support for PSTATE.SSBS rather than trapping to EL3")
> Signed-off-by: Srinivas Ramana <sramana@codeaurora.org>
> ---
>  arch/arm64/kernel/process.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index 71f788cd2b18..a8f05cc39261 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -399,6 +399,13 @@ int copy_thread(unsigned long clone_flags, unsigned long stack_start,
>  		 */
>  		if (clone_flags & CLONE_SETTLS)
>  			p->thread.uw.tp_value = childregs->regs[3];
> +
> +		if (arm64_get_ssbd_state() != ARM64_SSBD_FORCE_ENABLE) {
> +			if (is_compat_thread(task_thread_info(p)))
> +				set_compat_ssbs_bit(childregs);
> +			else
> +				set_ssbs_bit(childregs);
> +		}
>  	} else {
>  		memset(childregs, 0, sizeof(struct pt_regs));
>  		childregs->pstate = PSR_MODE_EL1h;
> 


* Re: [PATCH] arm64: Set SSBS for user threads while creation
  2019-12-24  7:06 ` Anshuman Khandual
@ 2019-12-24  8:30   ` Srinivas Ramana
  0 siblings, 0 replies; 7+ messages in thread
From: Srinivas Ramana @ 2019-12-24  8:30 UTC (permalink / raw)
  To: Anshuman Khandual, will, catalin.marinas, maz, will.deacon
  Cc: linux-arm-msm, linux-kernel, linux-arm-kernel

On 12/24/2019 12:36 PM, Anshuman Khandual wrote:
> 
> 
> On 12/23/2019 06:32 PM, Srinivas Ramana wrote:
>> Current SSBS implementation takes care of setting the
>> SSBS bit in start_thread() for user threads. While this works
>> for tasks launched with fork/clone followed by execve, for cases
>> where userspace would just call fork (eg, Java applications) this
>> leaves the SSBS bit unset. This results in performance
>> regression for such tasks.
>>
>> It is understood that commit cbdf8a189a66 ("arm64: Force SSBS
>> on context switch") masks this issue, but that was done for a
>> different reason where heterogeneous CPUs(both SSBS supported
>> and unsupported) are present. It is appropriate to take care
>> of the SSBS bit for all threads while creation itself.
> 
> So this fixes the situation (i.e., low performance) from the creation time
> of a task with fork() that will never see a subsequent execve, until it
> gets context switched for the very first time?
> 
Yes, that is correct.

>>
>> Fixes: 8f04e8e6e29c ("arm64: ssbd: Add support for PSTATE.SSBS rather than trapping to EL3")
>> Signed-off-by: Srinivas Ramana <sramana@codeaurora.org>
>> ---
>>   arch/arm64/kernel/process.c | 7 +++++++
>>   1 file changed, 7 insertions(+)
>>
>> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
>> index 71f788cd2b18..a8f05cc39261 100644
>> --- a/arch/arm64/kernel/process.c
>> +++ b/arch/arm64/kernel/process.c
>> @@ -399,6 +399,13 @@ int copy_thread(unsigned long clone_flags, unsigned long stack_start,
>>   		 */
>>   		if (clone_flags & CLONE_SETTLS)
>>   			p->thread.uw.tp_value = childregs->regs[3];
>> +
>> +		if (arm64_get_ssbd_state() != ARM64_SSBD_FORCE_ENABLE) {
>> +			if (is_compat_thread(task_thread_info(p)))
>> +				set_compat_ssbs_bit(childregs);
>> +			else
>> +				set_ssbs_bit(childregs);
>> +		}
>>   	} else {
>>   		memset(childregs, 0, sizeof(struct pt_regs));
>>   		childregs->pstate = PSR_MODE_EL1h;
>>

Thanks,
-- Srinivas R



* Re: [PATCH] arm64: Set SSBS for user threads while creation
  2019-12-23 13:02 [PATCH] arm64: Set SSBS for user threads while creation Srinivas Ramana
  2019-12-24  7:06 ` Anshuman Khandual
@ 2020-01-02 18:01 ` Catalin Marinas
  2020-01-09 15:17   ` Will Deacon
  2020-01-29 11:48   ` Srinivas Ramana
  1 sibling, 2 replies; 7+ messages in thread
From: Catalin Marinas @ 2020-01-02 18:01 UTC (permalink / raw)
  To: Srinivas Ramana; +Cc: will, maz, linux-arm-kernel, linux-kernel, linux-arm-msm

On Mon, Dec 23, 2019 at 06:32:26PM +0530, Srinivas Ramana wrote:
> Current SSBS implementation takes care of setting the
> SSBS bit in start_thread() for user threads. While this works
> for tasks launched with fork/clone followed by execve, for cases
> where userspace would just call fork (eg, Java applications) this
> leaves the SSBS bit unset. This results in performance
> regression for such tasks.
> 
> It is understood that commit cbdf8a189a66 ("arm64: Force SSBS
> on context switch") masks this issue, but that was done for a
> different reason where heterogeneous CPUs(both SSBS supported
> and unsupported) are present. It is appropriate to take care
> of the SSBS bit for all threads while creation itself.
> 
> Fixes: 8f04e8e6e29c ("arm64: ssbd: Add support for PSTATE.SSBS rather than trapping to EL3")
> Signed-off-by: Srinivas Ramana <sramana@codeaurora.org>

I suppose the parent process cleared SSBS explicitly. Isn't the child
after fork() supposed to be nearly identical to the parent? If we did as
you suggest, someone else might complain that SSBS has been set in the
child after fork().

I think the fix is for user space to set SSBS in the child if the child
no longer needs the mitigation.

-- 
Catalin


* Re: [PATCH] arm64: Set SSBS for user threads while creation
  2020-01-02 18:01 ` Catalin Marinas
@ 2020-01-09 15:17   ` Will Deacon
  2020-01-29 11:48   ` Srinivas Ramana
  1 sibling, 0 replies; 7+ messages in thread
From: Will Deacon @ 2020-01-09 15:17 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Srinivas Ramana, maz, linux-arm-kernel, linux-kernel, linux-arm-msm

On Thu, Jan 02, 2020 at 06:01:45PM +0000, Catalin Marinas wrote:
> On Mon, Dec 23, 2019 at 06:32:26PM +0530, Srinivas Ramana wrote:
> > Current SSBS implementation takes care of setting the
> > SSBS bit in start_thread() for user threads. While this works
> > for tasks launched with fork/clone followed by execve, for cases
> > where userspace would just call fork (eg, Java applications) this
> > leaves the SSBS bit unset. This results in performance
> > regression for such tasks.
> > 
> > It is understood that commit cbdf8a189a66 ("arm64: Force SSBS
> > on context switch") masks this issue, but that was done for a
> > different reason where heterogeneous CPUs(both SSBS supported
> > and unsupported) are present. It is appropriate to take care
> > of the SSBS bit for all threads while creation itself.
> > 
> > Fixes: 8f04e8e6e29c ("arm64: ssbd: Add support for PSTATE.SSBS rather than trapping to EL3")
> > Signed-off-by: Srinivas Ramana <sramana@codeaurora.org>
> 
> I suppose the parent process cleared SSBS explicitly. Isn't the child
> after fork() supposed to be nearly identical to the parent? If we did as
> you suggest, someone else might complain that SSBS has been set in the
> child after fork().

Right, I'd expect the parent SSBS to be inherited when we copy the pstate
field along with the other regs, and I think this is the correct behaviour.

Is that broken somehow?

Will
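
For illustration, the inheritance described above can be modelled in a few lines of userspace C. This is only a sketch, not kernel code: `struct model_pt_regs`, `model_copy_user_regs()` and `model_ssbs_is_set()` are hypothetical stand-ins, and the only detail taken from the kernel is the PSTATE.SSBS bit position (bit 12, `PSR_SSBS_BIT`). The copy mirrors the user-thread path of copy_thread(), where the child's saved registers start out as a copy of the parent's.

```c
#include <stdbool.h>
#include <stdint.h>

/* PSTATE.SSBS is bit 12 (PSR_SSBS_BIT in the arm64 uapi ptrace header). */
#define MODEL_PSR_SSBS_BIT UINT64_C(0x00001000)

/* Hypothetical stand-in for struct pt_regs; only pstate is modelled. */
struct model_pt_regs {
	uint64_t pstate;
};

/*
 * Model of the user-thread path of copy_thread(): the child's saved
 * registers are a plain copy of the parent's, so every PSTATE bit,
 * SSBS included, carries over unchanged.
 */
static inline struct model_pt_regs
model_copy_user_regs(const struct model_pt_regs *parent)
{
	struct model_pt_regs child = *parent;

	return child;
}

/* True when speculative store bypass is permitted (SSBS bit set). */
static inline bool model_ssbs_is_set(const struct model_pt_regs *regs)
{
	return (regs->pstate & MODEL_PSR_SSBS_BIT) != 0;
}
```

Under this model a parent with SSBS set yields a child with SSBS set, and a parent that cleared it yields a child with it clear, which is the behaviour the question above assumes.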


* Re: [PATCH] arm64: Set SSBS for user threads while creation
  2020-01-02 18:01 ` Catalin Marinas
  2020-01-09 15:17   ` Will Deacon
@ 2020-01-29 11:48   ` Srinivas Ramana
  2020-01-29 16:13     ` Will Deacon
  1 sibling, 1 reply; 7+ messages in thread
From: Srinivas Ramana @ 2020-01-29 11:48 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: will, maz, linux-arm-kernel, linux-kernel, linux-arm-msm

On 1/2/2020 11:31 PM, Catalin Marinas wrote:
> On Mon, Dec 23, 2019 at 06:32:26PM +0530, Srinivas Ramana wrote:
>> Current SSBS implementation takes care of setting the
>> SSBS bit in start_thread() for user threads. While this works
>> for tasks launched with fork/clone followed by execve, for cases
>> where userspace would just call fork (eg, Java applications) this
>> leaves the SSBS bit unset. This results in performance
>> regression for such tasks.
>>
>> It is understood that commit cbdf8a189a66 ("arm64: Force SSBS
>> on context switch") masks this issue, but that was done for a
>> different reason where heterogeneous CPUs(both SSBS supported
>> and unsupported) are present. It is appropriate to take care
>> of the SSBS bit for all threads while creation itself.
>>
>> Fixes: 8f04e8e6e29c ("arm64: ssbd: Add support for PSTATE.SSBS rather than trapping to EL3")
>> Signed-off-by: Srinivas Ramana <sramana@codeaurora.org>
> 
> I suppose the parent process cleared SSBS explicitly. Isn't the child

Actually, we observe that the parent (on Android, the zygote that launches
the app) does have the SSBS bit set. However, the child does not.

> after fork() supposed to be nearly identical to the parent? If we did as
> you suggest, someone else might complain that SSBS has been set in the
> child after fork().

I am also wondering why a userspace process would clear the SSBS bit,
losing the performance benefit.
> 
> I think the fix is for user space to set SSBS in the child if it no
> longer needs it.
> 

Sorry for the late response on this.

Thanks,
-- Srinivas R




* Re: [PATCH] arm64: Set SSBS for user threads while creation
  2020-01-29 11:48   ` Srinivas Ramana
@ 2020-01-29 16:13     ` Will Deacon
  0 siblings, 0 replies; 7+ messages in thread
From: Will Deacon @ 2020-01-29 16:13 UTC (permalink / raw)
  To: Srinivas Ramana
  Cc: Catalin Marinas, maz, linux-arm-kernel, linux-kernel, linux-arm-msm

On Wed, Jan 29, 2020 at 05:18:53PM +0530, Srinivas Ramana wrote:
> On 1/2/2020 11:31 PM, Catalin Marinas wrote:
> > On Mon, Dec 23, 2019 at 06:32:26PM +0530, Srinivas Ramana wrote:
> > > Current SSBS implementation takes care of setting the
> > > SSBS bit in start_thread() for user threads. While this works
> > > for tasks launched with fork/clone followed by execve, for cases
> > > where userspace would just call fork (eg, Java applications) this
> > > leaves the SSBS bit unset. This results in performance
> > > regression for such tasks.
> > > 
> > > It is understood that commit cbdf8a189a66 ("arm64: Force SSBS
> > > on context switch") masks this issue, but that was done for a
> > > different reason where heterogeneous CPUs(both SSBS supported
> > > and unsupported) are present. It is appropriate to take care
> > > of the SSBS bit for all threads while creation itself.
> > > 
> > > Fixes: 8f04e8e6e29c ("arm64: ssbd: Add support for PSTATE.SSBS rather than trapping to EL3")
> > > Signed-off-by: Srinivas Ramana <sramana@codeaurora.org>
> > 
> > I suppose the parent process cleared SSBS explicitly. Isn't the child
> 
> > Actually, we observe that the parent (on Android, the zygote that launches
> > the app) does have the SSBS bit set. However, the child does not.

On which SoC? Your commit message talks about heterogeneous systems (wrt
SSBS) as though they don't apply in your case. Could you provide us with
a reproducer?

> > after fork() supposed to be nearly identical to the parent? If we did as
> > you suggest, someone else might complain that SSBS has been set in the
> > child after fork().
> 
> I am also wondering why a userspace process would clear the SSBS bit,
> losing the performance benefit.

I guess it could happen during sigreturn if the signal handler wasn't
careful about preserving bits in pstate, although it doesn't feel like
something you'd regularly run into.

But hang on a sec -- it looks like the context switch logic in
cbdf8a189a66 actually does the wrong thing for systems where all of the
CPUs implement SSBS. I don't think it explains the behaviour you're seeing,
but I do think it could end up in situations where SSBS is unexpectedly
*set*.

Diff below.

Will

--->8

diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index bbb0f0c145f6..e38284c9fb7b 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -466,6 +466,13 @@ static void ssbs_thread_switch(struct task_struct *next)
 	if (unlikely(next->flags & PF_KTHREAD))
 		return;
 
+	/*
+	 * If all CPUs implement the SSBS instructions, then we just
+	 * need to context-switch the PSTATE field.
+	 */
+	if (cpu_have_feature(cpu_feature(SSBS)))
+		return;
+
 	/* If the mitigation is enabled, then we leave SSBS clear. */
 	if ((arm64_get_ssbd_state() == ARM64_SSBD_FORCE_ENABLE) ||
 	    test_tsk_thread_flag(next, TIF_SSBD))
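
To make the effect of the added check easier to see, here is a rough userspace model of the decision sequence in ssbs_thread_switch() with the hunk above applied. It is a sketch under stated assumptions: `struct model_task`, `struct model_system` and `model_ssbs_thread_switch()` are hypothetical names, the boolean fields stand in for PF_KTHREAD, TIF_SSBD, the SSBS CPU feature check and ARM64_SSBD_FORCE_ENABLE, and the compat/native distinction made by the real function is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* PSTATE.SSBS is bit 12 (PSR_SSBS_BIT in the arm64 uapi ptrace header). */
#define MODEL_PSR_SSBS_BIT UINT64_C(0x00001000)

/* Hypothetical per-task state; fields stand in for kernel flags. */
struct model_task {
	uint64_t pstate;	/* saved user PSTATE */
	bool is_kthread;	/* models PF_KTHREAD */
	bool tif_ssbd;		/* models TIF_SSBD: task asked for the mitigation */
};

/* Hypothetical system-wide state. */
struct model_system {
	bool all_cpus_have_ssbs;	/* models the SSBS feature check */
	bool mitigation_forced;		/* models ARM64_SSBD_FORCE_ENABLE */
};

/*
 * Simplified model of ssbs_thread_switch() with the extra early return.
 * The saved PSTATE is only rewritten when no earlier condition applies.
 */
static inline void model_ssbs_thread_switch(const struct model_system *sys,
					    struct model_task *next)
{
	/* Kernel threads have no meaningful user PSTATE. */
	if (next->is_kthread)
		return;

	/* All CPUs implement SSBS: the saved PSTATE is used as it is. */
	if (sys->all_cpus_have_ssbs)
		return;

	/* Mitigation enabled: leave SSBS clear. */
	if (sys->mitigation_forced || next->tif_ssbd)
		return;

	/* Otherwise force SSBS on in the saved PSTATE. */
	next->pstate |= MODEL_PSR_SSBS_BIT;
}
```

In this model, a task that deliberately cleared SSBS keeps it clear across a context switch when every CPU implements SSBS. Without the second early return the final statement would set the bit again, which is the unexpected case described above.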


Thread overview: 7+ messages
2019-12-23 13:02 [PATCH] arm64: Set SSBS for user threads while creation Srinivas Ramana
2019-12-24  7:06 ` Anshuman Khandual
2019-12-24  8:30   ` Srinivas Ramana
2020-01-02 18:01 ` Catalin Marinas
2020-01-09 15:17   ` Will Deacon
2020-01-29 11:48   ` Srinivas Ramana
2020-01-29 16:13     ` Will Deacon
