All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Martin <Dave.Martin@arm.com>
To: Julien Grall <julien.grall@arm.com>
Cc: Anton.Kirilov@arm.com, catalin.marinas@arm.com,
	will.deacon@arm.com, oleg@redhat.com, zhang.lei@jp.fujitsu.com,
	alex.bennee@linaro.org, linux-arm-kernel@lists.infradead.org,
	Daniel.Kiss@arm.com
Subject: Re: [RFC PATCH v2 7/8] arm64/sve: Don't disable SVE on syscalls return
Date: Fri, 21 Jun 2019 16:33:16 +0100	[thread overview]
Message-ID: <20190621153316.GC2790@e103592.cambridge.arm.com> (raw)
In-Reply-To: <20190613161656.20765-8-julien.grall@arm.com>

On Thu, Jun 13, 2019 at 05:16:55PM +0100, Julien Grall wrote:
> Per the syscalls ABI, SVE registers will be unknown after a syscalls. In

This patch is quite hard to understand, though this is more down to the
code being modified than the patch itself.  So, I may ask some stupid
questions...

In particular, we now have up to 8 task states (all the combinations of
TIF_FOREIGN_FPSTATE, TIF_SVE and TIF_SVE_NEEDS_FLUSH).  Sketching out
the state machine and highlighting any states that we consider invalid
may be a useful exercise, but I've not attempted that yes.

Nit: "after a syscall"

(Inessential to fix these pedantic commit message nits, but probably
worth addressing if respinning the patch.)

> practice the kernel will disable SVE and zero all the registers but the
> first 128-bits of the vector on the next SVE instructions. In workload

Nit: "instruction"
Nit: "In workloads" or "In a workload"

> mixing SVE and syscall, this will result of 2 entry/exit to the kernel

Nit: "and syscalls"
Nit: "will result in"

> per exit.

"2 [...] exit [...] per exit"?  I can see what you're getting at, but I
wonder if this can be reworded, say:

"[...] will result in at least one extra kernel entry/exit per syscall".

> To avoid the second entry/exit, a new flag TIF_SVE_NEEDS_FLUSH is
> introduced to mark a task that needs to flush the SVE context on
> return to userspace.
> 
> On entry to a syscall, the flag TIF_SVE will still be cleared. It will
> be restored on return to userspace once the SVE state has been flushed.
> This means that if a task requires to synchronize the FP state during a
> syscall (e.g context switch, signal), only the FPSIMD registers will be
> saved. When the task is rescheduled, the SVE state will be loaded from
> FPSIMD state.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> 
> ---
>     Changes in v2:
>         - Fix typo in a comment
> ---
>  arch/arm64/include/asm/thread_info.h |  5 ++++-
>  arch/arm64/kernel/fpsimd.c           | 32 ++++++++++++++++++++++++++++++++
>  arch/arm64/kernel/process.c          |  1 +
>  arch/arm64/kernel/ptrace.c           |  7 +++++++
>  arch/arm64/kernel/signal.c           | 14 +++++++++++++-
>  arch/arm64/kernel/syscall.c          | 13 +++++--------
>  6 files changed, 62 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
> index f1d032be628a..d87bcd80cb0f 100644
> --- a/arch/arm64/include/asm/thread_info.h
> +++ b/arch/arm64/include/asm/thread_info.h
> @@ -86,6 +86,7 @@ void arch_release_task_struct(struct task_struct *tsk);
>  #define TIF_FOREIGN_FPSTATE	3	/* CPU's FP state is not current's */
>  #define TIF_UPROBE		4	/* uprobe breakpoint or singlestep */
>  #define TIF_FSCHECK		5	/* Check FS is USER_DS on return */
> +#define TIF_SVE_NEEDS_FLUSH	6	/* Flush SVE registers on return */
>  #define TIF_NOHZ		7
>  #define TIF_SYSCALL_TRACE	8
>  #define TIF_SYSCALL_AUDIT	9
> @@ -113,10 +114,12 @@ void arch_release_task_struct(struct task_struct *tsk);
>  #define _TIF_FSCHECK		(1 << TIF_FSCHECK)
>  #define _TIF_32BIT		(1 << TIF_32BIT)
>  #define _TIF_SVE		(1 << TIF_SVE)
> +#define _TIF_SVE_NEEDS_FLUSH	(1 << TIF_SVE_NEEDS_FLUSH)
>  
>  #define _TIF_WORK_MASK		(_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
>  				 _TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \
> -				 _TIF_UPROBE | _TIF_FSCHECK)
> +				 _TIF_UPROBE | _TIF_FSCHECK | \
> +				 _TIF_SVE_NEEDS_FLUSH)
>  
>  #define _TIF_SYSCALL_WORK	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
>  				 _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP | \
> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> index 92f418e4f989..41ab73b12f4a 100644
> --- a/arch/arm64/kernel/fpsimd.c
> +++ b/arch/arm64/kernel/fpsimd.c
> @@ -161,6 +161,8 @@ extern void __percpu *efi_sve_state;
>   */
>  static void __sve_free(struct task_struct *task)
>  {
> +	/* SVE context will be zeroed when allocated. */
> +	clear_tsk_thread_flag(task, TIF_SVE_NEEDS_FLUSH);
>  	kfree(task->thread.sve_state);
>  	task->thread.sve_state = NULL;
>  }
> @@ -217,6 +219,11 @@ static void sve_free(struct task_struct *task)
>   *  * FPSR and FPCR are always stored in task->thread.uw.fpsimd_state
>   *    irrespective of whether TIF_SVE is clear or set, since these are
>   *    not vector length dependent.
> + *
> + *  * When TIF_SVE_NEEDS_FLUSH is set, all the SVE registers but the first
> + *    128-bits of the Z-registers are logically zero but not stored anywhere.
> + *    Saving logically zero bits across context switches is therefore
> + *    pointless, although they must be zeroed before re-entering userspace.
>   */
>  
>  /*
> @@ -226,6 +233,14 @@ static void sve_free(struct task_struct *task)
>   * thread_struct is known to be up to date, when preparing to enter
>   * userspace.
>   *
> + * When TIF_SVE_NEEDS_FLUSH is set, the SVE state will be restored from the
> + * FPSIMD state.
> + *
> + * TIF_SVE_NEEDS_FLUSH and TIF_SVE set at the same time should never happen.
> + * In the unlikely case it happens, the code is able to cope with it. It will
> + * first restore the SVE registers and then flush them in
> + * fpsimd_restore_current_state.
> + *
>   * Softirqs (and preemption) must be disabled.
>   */

If we are scheduled out with TIF_SVE_NEEDS_FLUSH set, 
>  static void task_fpsimd_load(void)
> @@ -236,6 +251,12 @@ static void task_fpsimd_load(void)
>  		sve_load_state(sve_pffr(&current->thread),
>  			       &current->thread.uw.fpsimd_state.fpsr,
>  			       sve_vq_from_vl(current->thread.sve_vl) - 1);
> +	else if (system_supports_sve() &&

Can we factor this in a way that doesn't duplicate the
system_supports_sve() check?  Static key checks, are cheap, but not
free, and the compiler can't merge them.

> +		 test_and_clear_thread_flag(TIF_SVE_NEEDS_FLUSH)) {
> +		sve_load_from_fpsimd_state(&current->thread.uw.fpsimd_state,
> +					   sve_vq_from_vl(current->thread.sve_vl) - 1);
> +		set_thread_flag(TIF_SVE);
> +	}
>  	else
>  		fpsimd_load_state(&current->thread.uw.fpsimd_state);
>  }
> @@ -1070,6 +1091,17 @@ void fpsimd_restore_current_state(void)
>  		fpsimd_bind_task_to_cpu();
>  	}

Should this be an else if?

IIUC, if TIF_FOREIGN_FPSTATE and TIF_SVE_NEEDS_FLUSH are both set, we'll
already have gone down the sve_load_from_fpsimd_state() path in
task_fpsimd_load().

>  
> +	if (system_supports_sve() &&
> +	    test_and_clear_thread_flag(TIF_SVE_NEEDS_FLUSH)) {
> +		/*
> +		 * The userspace had SVE enabled on entry to the kernel
> +		 * and requires the state to be flushed.
> +		 */
> +		sve_flush_live();
> +		sve_user_enable();

Is this needed?  It TIF_FOREIGN_FPSTATE wasn't set on entry to
fpsimd_restore_current_state(), SVE should already be enabled.

Maybe we can simplify things by simply calling
fpsimd_bind_state_to_cpu() unconditionally before the local_bh_enable().
If the state is already bound, rebinding it should to no harm.

Otherwise, this bare call to sve_user_enable() feels a bit weird here.

Alternatively, perhaps it makes sense to do the TIF_SVE_NEEDS_FLUSH
handling in fpsimd_bind_state_to_cpu() itself.  I haven't thought about
this in detail though.

> +		set_thread_flag(TIF_SVE);
> +	}
> +
>  	local_bh_enable();
>  }
>  
> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index 3767fb21a5b8..8c67ef89b01a 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -367,6 +367,7 @@ int copy_thread(unsigned long clone_flags, unsigned long stack_start,
>  	 * and disable discard SVE state for p:
>  	 */
>  	clear_tsk_thread_flag(p, TIF_SVE);
> +	clear_tsk_thread_flag(p, TIF_SVE_NEEDS_FLUSH);
>  	p->thread.sve_state = NULL;

Should we clear it in fpsimd_flush_thread() too?  Otherwise, I think we
could leak SVE-ness across exec, which is not really the intent.

Any place that we clear TIF_SVE but not TIF_SVE_NEEDS_FLUSH is
potentially suspicious.

What happens in sve_set_vector_length() for example?

>  	/*
> diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
> index b82e0a9b3da3..f44016052cba 100644
> --- a/arch/arm64/kernel/ptrace.c
> +++ b/arch/arm64/kernel/ptrace.c
> @@ -899,6 +899,11 @@ static int sve_set(struct task_struct *target,
>  		ret = __fpr_set(target, regset, pos, count, kbuf, ubuf,
>  				SVE_PT_FPSIMD_OFFSET);
>  		clear_tsk_thread_flag(target, TIF_SVE);
> +		/*
> +		 * If ptrace requested to use FPSIMD, then don't try to
> +		 * re-enable SVE when the task is running again.
> +		 */
> +		clear_tsk_thread_flag(target, TIF_SVE_NEEDS_FLUSH);

Since we have this clear-both-flags op in a few places, it may be worth
having a helper to express what we're trying to do more clearly.

>  		goto out;
>  	}
>  
> @@ -923,6 +928,8 @@ static int sve_set(struct task_struct *target,
>  	 */
>  	fpsimd_sync_to_sve(target);
>  	set_tsk_thread_flag(target, TIF_SVE);
> +	/* Don't flush SVE registers on return as ptrace will update them. */
> +	clear_tsk_thread_flag(target, TIF_SVE_NEEDS_FLUSH);
>  	BUILD_BUG_ON(SVE_PT_SVE_OFFSET != sizeof(header));
>  	start = SVE_PT_SVE_OFFSET;
> diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
> index ab3e56bbfb07..83a23a1edc7e 100644
> --- a/arch/arm64/kernel/signal.c
> +++ b/arch/arm64/kernel/signal.c
> @@ -530,6 +530,17 @@ static int restore_sigframe(struct pt_regs *regs,
>  		} else {
>  			err = restore_fpsimd_context(user.fpsimd);
>  		}
> +
> +		/*
> +		 * When successfully restoring the:
> +		 *	- FPSIMD context, we don't want to re-enable SVE
> +		 *	- SVE context, we don't want to override what was
> +		 *	restored
> +		 */
> +		if (err == 0)
> +			clear_thread_flag(TIF_SVE_NEEDS_FLUSH);
> +
> +

Similarly, we seem to have a common pattern of setting TIF_SVE and
TIF_SVE_NEEDS_FLUSH to specific values together.

Can these be abstracted in some way?

>  	}
>  
>  	return err;
> @@ -942,7 +953,8 @@ asmlinkage void do_notify_resume(struct pt_regs *regs,
>  				rseq_handle_notify_resume(NULL, regs);
>  			}
>  
> -			if (thread_flags & _TIF_FOREIGN_FPSTATE)
> +			if (thread_flags & (_TIF_FOREIGN_FPSTATE |
> +					    _TIF_SVE_NEEDS_FLUSH))
>  				fpsimd_restore_current_state();
>  		}
>  
> diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
> index 871c739f060a..b9bd7092e253 100644
> --- a/arch/arm64/kernel/syscall.c
> +++ b/arch/arm64/kernel/syscall.c
> @@ -142,16 +142,13 @@ static inline void sve_user_discard(void)
>  	if (!system_supports_sve())
>  		return;
>  
> -	clear_thread_flag(TIF_SVE);
> -
>  	/*
> -	 * task_fpsimd_load() won't be called to update CPACR_EL1 in
> -	 * ret_to_user unless TIF_FOREIGN_FPSTATE is still set, which only
> -	 * happens if a context switch or kernel_neon_begin() or context
> -	 * modification (sigreturn, ptrace) intervenes.
> -	 * So, ensure that CPACR_EL1 is already correct for the fast-path case.
> +	 * TIF_SVE is cleared to save the FPSIMD state rather than the SVE

Nit: somethine like: "cleared so that any writeback via
task_fpsimd_save() will save the FPSIMD state rather than [...]"

(Otherwise, this sounds a bit like this code is actually doing the save
somehow.)

> +	 * state on context switch. The bit will be set again while
> +	 * restoring/zeroing the registers.
>  	 */
> -	sve_user_disable();
> +	if (test_and_clear_thread_flag(TIF_SVE))
> +		set_thread_flag(TIF_SVE_NEEDS_FLUSH);

Do we need to do this under local_bh_disable()?  Does it matter?

It looks to me like this is called when we still have interrupts
disabled, which is probably worth fixing.  This may actually be a bug:
if kzalloc() causes us to sleep, we have the potential to trigger
sleeping-while-atomic splats.

If you agree that's a bug, try to come up with a preliminary patch
that's suitable for stable.

(I may have just misunderstood / misremembered something here.)

[...]

Cheers
---Dave

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2019-06-21 15:34 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-13 16:16 [RFC PATCH v2 0/8] arm64/sve: First steps towards optimizing syscalls Julien Grall
2019-06-13 16:16 ` [RFC PATCH v2 1/8] arm64/fpsimd: Update documentation of do_sve_acc Julien Grall
2019-06-13 16:19   ` Julien Grall
2019-06-21 15:32   ` Dave Martin
2019-06-13 16:16 ` [RFC PATCH v2 2/8] arm64/signal: Update the comment in preserve_sve_context Julien Grall
2019-06-21 15:32   ` Dave Martin
2019-06-13 16:16 ` [RFC PATCH v2 3/8] arm64/fpsimdmacros: Allow the macro "for" to be used in more cases Julien Grall
2019-06-21 15:32   ` Dave Martin
2019-06-24 16:10     ` Julien Grall
2019-06-25  9:35       ` Dave Martin
2019-06-13 16:16 ` [RFC PATCH v2 4/8] arm64/fpsimdmacros: Introduce a macro to update ZCR_EL1.LEN Julien Grall
2019-06-21 15:32   ` Dave Martin
2019-06-13 16:16 ` [RFC PATCH v2 5/8] arm64/sve: Implement an helper to flush SVE registers Julien Grall
2019-06-21 15:33   ` Dave Martin
2019-06-24 16:28     ` Julien Grall
2019-06-25  9:37       ` Dave Martin
2019-06-13 16:16 ` [RFC PATCH v2 6/8] arm64/sve: Implement an helper to load SVE registers from FPSIMD state Julien Grall
2019-06-21 15:33   ` Dave Martin
2019-06-24 16:29     ` Julien Grall
2019-06-13 16:16 ` [RFC PATCH v2 7/8] arm64/sve: Don't disable SVE on syscalls return Julien Grall
2019-06-21 15:33   ` Dave Martin [this message]
2019-06-24 16:44     ` Julien Grall
2019-06-25  9:41       ` Dave Martin
2019-07-04 14:15     ` Catalin Marinas
2019-08-02 11:06       ` Julien Grall
2019-06-13 16:16 ` [RFC PATCH v2 8/8] arm64/sve: Rework SVE trap access to use TIF_SVE_NEEDS_FLUSH Julien Grall
2019-06-21 15:33   ` Dave Martin
2019-06-21 15:32 ` [RFC PATCH v2 0/8] arm64/sve: First steps towards optimizing syscalls Dave Martin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190621153316.GC2790@e103592.cambridge.arm.com \
    --to=dave.martin@arm.com \
    --cc=Anton.Kirilov@arm.com \
    --cc=Daniel.Kiss@arm.com \
    --cc=alex.bennee@linaro.org \
    --cc=catalin.marinas@arm.com \
    --cc=julien.grall@arm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=oleg@redhat.com \
    --cc=will.deacon@arm.com \
    --cc=zhang.lei@jp.fujitsu.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.