From mboxrd@z Thu Jan 1 00:00:00 1970 From: Dave.Martin@arm.com (Dave Martin) Date: Wed, 24 May 2017 16:35:58 +0100 Subject: [RFC PATCH v2 2/2] arm64: neon: Remove support for nested or hardirq kernel-mode NEON In-Reply-To: References: <1495636986-30515-1-git-send-email-Dave.Martin@arm.com> <1495636986-30515-3-git-send-email-Dave.Martin@arm.com> Message-ID: <20170524153555.GM3559@e103592.cambridge.arm.com> To: linux-arm-kernel@lists.infradead.org List-Id: linux-arm-kernel.lists.infradead.org On Wed, May 24, 2017 at 08:22:18AM -0700, Ard Biesheuvel wrote: > On 24 May 2017 at 07:42, Dave Martin wrote: > > Support for kernel-mode NEON to be nested and/or used in hardirq > > context adds significant complexity, and the benefits may be > > marginal. In practice, kernel-mode NEON is not used in hardirq > > context, and is rarely used in softirq context (by certain mac80211 > > drivers). > > > > This patch implements an arm64 may_use_simd() function to allow > > clients to check whether kernel-mode NEON is usable in the current > > context, and simplifies kernel_neon_{begin,end}() to handle only > > saving of the task FPSIMD state (if any). Without nesting, there > > is no other state to save. > > > > The partial fpsimd save/restore functions become redundant as a > > result of these changes, so they are removed too. > > > > The save/restore model is changed to operate directly on > > task_struct without additional percpu storage. This simplifies the > > code and saves a bit of memory, but means that softirqs must now be > > disabled when manipulating the task fpsimd state from task context: > > correspondingly, preempt_{en,dis}sable() calls are upgraded to > > local_bh_{en,dis}able() as appropriate, and softirqs are also > > disabled around the context switch code in fpsimd_thread_switch(). > > The will be some increase in softirq latency, but hardirqs should > > be unaffected. > > > > These changes should make it easier to support kernel-mode NEON in > > the presence of the Scalable Vector Extension in the future. > > > > Signed-off-by: Dave Martin [...] > > diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c > > index 06da8ea..02023da 100644 > > --- a/arch/arm64/kernel/fpsimd.c > > +++ b/arch/arm64/kernel/fpsimd.c > > @@ -17,16 +17,20 @@ [...] > > + * In order to allow softirq handlers to use FPSIMD, kernel_neon_begin() may > > + * save the task's FPSIMD context back to task_struct from softirq context. > > + * To prevent this from racing with the manipulation of the task's FPSIMD state > > + * from task context and thereby corrupting the state, it is necessary to > > + * protect any manipulation of a task's fpsimd_state or TIF_FOREIGN_FPSTATE > > + * flag with local_bh_disable(). > > + * > > Surely, this doesn't mean all kernel mode NEON should execute with > bottom halves disabled? Because that is what this patch appears to be > doing. Dang, I was deliberately paranoid when updating kernel_neon_{begin,end}(), but was _too_ paranoid. softirq must be masked around the read-modify-write on TIF_FOREIGN_FPSTATE and thread.fpsimd_state: from kernel_neon_begin() to kernel_neon_end() we only need to disable preemption, to hide the lack of context switch machinery for kernel-mode NEON state in task context (as at present). I think they should do: [...] > > +void kernel_neon_begin(void) > > { > > if (WARN_ON(!system_supports_fpsimd())) > > return; [...] > > + BUG_ON(!may_use_simd()); > > + > > + local_bh_disable(); > > + > > + __this_cpu_write(kernel_neon_busy, true); > > + > > + /* > > + * If there is unsaved task fpsimd state in the CPU, save it > > + * and invalidate the copy stored in the fpsimd regs: > > + */ > > + if (current->mm && !test_and_set_thread_flag(TIF_FOREIGN_FPSTATE)) { > > + fpsimd_save_state(¤t->thread.fpsimd_state); > > this_cpu_write(fpsimd_last_state, NULL); > > } preempt_disable(); local_bh_enable(); > > } > > -EXPORT_SYMBOL(kernel_neon_begin_partial); > > +EXPORT_SYMBOL(kernel_neon_begin); > > > > +/* > > + * kernel_neon_end(): give the CPU FPSIMD registers back to the current task > > + * > > + * Must be called from a context in which kernel_neon_begin() was previously > > + * called, with no call to kernel_neon_end() in the meantime. > > + * > > + * The caller must not use the FPSIMD registers after this function is called, > > + * unless kernel_neon_begin() is called again in the meantime. > > + */ > > void kernel_neon_end(void) > > { > > + bool busy; > > + > > if (!system_supports_fpsimd()) > > return; [...] > > + > > + busy = __this_cpu_xchg(kernel_neon_busy, false); > > + WARN_ON(!busy); /* No matching kernel_neon_begin()? */ > > + preempt_enable(); > > } > > EXPORT_SYMBOL(kernel_neon_end); Make sense? Cheers ---Dave