All of lore.kernel.org
 help / color / mirror / Atom feed
From: Julien Grall <julien.grall@arm.com>
To: Catalin Marinas <catalin.marinas@arm.com>,
	Dave Martin <Dave.Martin@arm.com>
Cc: Anton.Kirilov@arm.com, will.deacon@arm.com, oleg@redhat.com,
	zhang.lei@jp.fujitsu.com, alex.bennee@linaro.org,
	linux-arm-kernel@lists.infradead.org, Daniel.Kiss@arm.com
Subject: Re: [RFC PATCH v2 7/8] arm64/sve: Don't disable SVE on syscalls return
Date: Fri, 2 Aug 2019 12:06:54 +0100	[thread overview]
Message-ID: <b40a2499-945f-c49d-1681-0027d0aa1269@arm.com> (raw)
In-Reply-To: <20190704141559.GA51773@arrakis.emea.arm.com>

Hi Catalin,

On 04/07/2019 15:15, Catalin Marinas wrote:
> On Fri, Jun 21, 2019 at 04:33:16PM +0100, Dave P Martin wrote:
>> On Thu, Jun 13, 2019 at 05:16:55PM +0100, Julien Grall wrote:
>>> Per the syscalls ABI, SVE registers will be unknown after a syscalls. In
>>
>> This patch is quite hard to understand, though this is more down to the
>> code being modified than the patch itself.  So, I may ask some stupid
>> questions...
>>
>> In particular, we now have up to 8 task states (all the combinations of
>> TIF_FOREIGN_FPSTATE, TIF_SVE and TIF_SVE_NEEDS_FLUSH).  Sketching out
>> the state machine and highlighting any states that we consider invalid
>> may be a useful exercise, but I've not attempted that yes.
> 
> We definitely need a state machine sketched out (and a formal model as I
> can't really get all of it in my head at once). I don't fully understand
> the need for a new TIF_SVE_NEEDS_FLUSH. Maybe it becomes obvious if we
> had a state machine description.
Dave and I drafted a state machine on a whiteboard recently. I will clean it up 
and send it on the ML. But I am not sure the state machine will help to 
understand the approach :/.

I realize that I didn't explain why I chose this approach over another one. See 
more below.

> 
> So, we currently have (IIUC):
> 
> TIF_SVE - tells us whether the user space can access SVE registers
> without a fault (doesn't CPACR_EL1 tell us this already on kernel entry?
> I guess we'd need to store it in a TIF flag anyway for switch_to). The

That's correct CPACR_EL1 will tell us on entry whether SVE has been enabled or 
not. But as you pointed out we need to save on context switch but we may also 
disable/enable it via ptrace.

> implications of TIF_SVE on kernel entry is that the SVE state could have
> been touched by the user. If entering via syscall, we discard this state
> in sve_user_discard().
> 
> TIF_FOREIGN_FPSTATE - tells us whether the hardware state is out of sync
> with the current thread.
Note that in this case, TIF_SVE will indicate where the context has been saved 
(fpsimd_state vs sve_state).

> 
> For flushing the SVE state on return from syscall, can we not handle
> this entirely in el0_svc_handler while enabling the SVE access at the
> same time to avoid a subsequent trap? We need to know whether the SVE
> state is valid when we do the context switching but we have TIF_SVE for
> this, cleared on syscall entry but can be set again on return from
> syscall if we enable access in CPACR_EL1 (sve_user_enable()).

If we were to handle it in el0_svc_handler(), we would need to do a similar job 
as fpsimd_restore_current_state().

Indeed, the task may have been switched out (when using PREEMPT and PREEMPT_RT). 
Above you suggested to clear TIF_SVE on syscall entry, so the state would be 
saved in fpsimd_state. We would need to convert the state back to SVE. Ideally, 
this should be done in hardware (see patch #6) as this is likely going to be 
faster than the software version (see fpsimd_to_sve()). This means the state 
would need to be loaded by el0_svc_handler().

Furthermore, handling everything in el0_svc_handler() means that you are mostly 
optimizing for fully preemptible kernel (PREEMPT_RT) and preemptible kernel 
(PREEMPT). Although, you have the risk to get the kernel preempted after you 
return from the el0_svc_handler() and before you return to userspace. So you 
would end up to save the full SVE context (even the zeroed bits) which takes 
longer than just saving the first 128-bits. I will admit, I haven't really 
looked how often this condition can happen.

For voluntary preemptible kernel, they will suffer the same problem when as the 
PREEMPT_RT and PREEMPT when rescheduled in do_notify_resume().

So it feels to me that do_notify_resume() is a better fit to handle the flush 
(i.e zeroing all SVE state but the first 128-bits of each vector). This would 
have the advantage to use the optimize other places (such as the trap to enable 
SVE as done by patch #8).

Handling in do_notify_resume() will require an extra flag because we need to 
know when to flush the SVE state.

To summarize, el0_svc_handler() is a possibility but result to leave some cases 
unoptimized. I don't have any numbers to back this yet, so I don't know whether 
this is a major concerns.

> 
> It probably needs some more thinking on signal handling.

The signal handling is quite tricky in all the cases :). We need to ensure the 
SVE state is not flushed.

Cheers,

-- 
Julien Grall

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2019-08-02 11:07 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-13 16:16 [RFC PATCH v2 0/8] arm64/sve: First steps towards optimizing syscalls Julien Grall
2019-06-13 16:16 ` [RFC PATCH v2 1/8] arm64/fpsimd: Update documentation of do_sve_acc Julien Grall
2019-06-13 16:19   ` Julien Grall
2019-06-21 15:32   ` Dave Martin
2019-06-13 16:16 ` [RFC PATCH v2 2/8] arm64/signal: Update the comment in preserve_sve_context Julien Grall
2019-06-21 15:32   ` Dave Martin
2019-06-13 16:16 ` [RFC PATCH v2 3/8] arm64/fpsimdmacros: Allow the macro "for" to be used in more cases Julien Grall
2019-06-21 15:32   ` Dave Martin
2019-06-24 16:10     ` Julien Grall
2019-06-25  9:35       ` Dave Martin
2019-06-13 16:16 ` [RFC PATCH v2 4/8] arm64/fpsimdmacros: Introduce a macro to update ZCR_EL1.LEN Julien Grall
2019-06-21 15:32   ` Dave Martin
2019-06-13 16:16 ` [RFC PATCH v2 5/8] arm64/sve: Implement an helper to flush SVE registers Julien Grall
2019-06-21 15:33   ` Dave Martin
2019-06-24 16:28     ` Julien Grall
2019-06-25  9:37       ` Dave Martin
2019-06-13 16:16 ` [RFC PATCH v2 6/8] arm64/sve: Implement an helper to load SVE registers from FPSIMD state Julien Grall
2019-06-21 15:33   ` Dave Martin
2019-06-24 16:29     ` Julien Grall
2019-06-13 16:16 ` [RFC PATCH v2 7/8] arm64/sve: Don't disable SVE on syscalls return Julien Grall
2019-06-21 15:33   ` Dave Martin
2019-06-24 16:44     ` Julien Grall
2019-06-25  9:41       ` Dave Martin
2019-07-04 14:15     ` Catalin Marinas
2019-08-02 11:06       ` Julien Grall [this message]
2019-06-13 16:16 ` [RFC PATCH v2 8/8] arm64/sve: Rework SVE trap access to use TIF_SVE_NEEDS_FLUSH Julien Grall
2019-06-21 15:33   ` Dave Martin
2019-06-21 15:32 ` [RFC PATCH v2 0/8] arm64/sve: First steps towards optimizing syscalls Dave Martin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b40a2499-945f-c49d-1681-0027d0aa1269@arm.com \
    --to=julien.grall@arm.com \
    --cc=Anton.Kirilov@arm.com \
    --cc=Daniel.Kiss@arm.com \
    --cc=Dave.Martin@arm.com \
    --cc=alex.bennee@linaro.org \
    --cc=catalin.marinas@arm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=oleg@redhat.com \
    --cc=will.deacon@arm.com \
    --cc=zhang.lei@jp.fujitsu.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.