linux-rt-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
To: Ard Biesheuvel <ardb@kernel.org>
Cc: Pavel Pisa <pisa@cmp.felk.cvut.cz>,
	linux-rt-users@vger.kernel.org,
	Pavel Hronek <hronepa1@fel.cvut.cz>,
	Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: Preemp-RT kernel builds from 6.3 on fails on Xilinx Zynq 7Z010
Date: Wed, 17 May 2023 16:37:06 +0200	[thread overview]
Message-ID: <20230517143706.6hi48xeo@linutronix.de> (raw)
In-Reply-To: <CAMj1kXEG1F5RgzhjWfELfh8REQoOGjEH065aHwzHVP9DVEt7Bw@mail.gmail.com>

On 2023-05-17 13:05:32 [+0200], Ard Biesheuvel wrote:
> For context, these changes give a substantial performance boost on the
> RX path to IPsec or WireGuard using SIMD based software crypto.
> 
> > …
> > > Please take a look, and confirm whether or not we still need to drop
> > > the get_cpu() call from vfp_sync_hwstate()
> >
> > Yes, it needs to be dealt with. At the time vfp_sync_hwstate() the
> > context is fully preemptible. So that get_cpu() needs to be removed
> > before local_bh_disable(). But…
> >
> > If I understand the logic right, then the VFP state is saved on context
> > switch and the FPU access is disabled but the content remains and is in
> > sync with vfp_current_hw_state. Upon first usage (after the context
> > switch) in userland there is an exception at which point the access to
> > the FPU is enabled (FPEXC_EX) and the FPU content is loaded unless it is
> > in sync as per vfp_current_hw_state.
> >
> 
> Exactly. On UP we lazily preserve the user space FP state, and on SMP
> we always preserve it on a context switch. Then, only when user space
> actually tries to use the SIMD context, the correct data is copied in.
> That way, we don't take the performance hit for tasks that are blocked
> in the kernel.
> 
> > That means it relies on disabled preemption while operating on
> > vfp_current_hw_state.
> 
> In general, this code relies on preserving/restoring the VFP context
> to/from memory to be a critical section, as it needs to run to
> completion once it is started.
> 
> > On PREEMPT_RT softirqs are only handled in process
> > context (for instance at the end of the threaded interrupt). Therefore
> > it is sufficient to disable preemption instead of BH. It would need
> > something like x86's fpregs_lock().
> 
> I think this is not the only issue, to be honest. We cannot preempt
> tasks while they are using the SIMD unit in kernel mode, as the kernel
> mode context is never preserved/restored. So we at least need to
> disable preemption again in kernel_neon_begin() [which now relies on
> BH disable to disable preemption but as you point out, that is only
> the case on !RT]
> 
> Given this, I wonder whether supporting kernel mode NEON on PREEMPT_RT
> is worth it to begin with. AIUI, the alternative is to disable both
> preemption and BH when touching the VFP state.

Only preemption on RT. So it is debatable if it makes sense to use NEON
on RT. The preempt-off section is limited to PAGE_SIZE due the scatter/
gather walk if I remember correctly.

> > Otherwise vfp_entry() can get preempted after decisions based on
> > vfp_current_hw_state have been made. Also kernel_neon_begin() could get
> > preempted after enabling FPU. I think based on current logic, after
> > kernel_neon_begin() a task can get preempted on PREEMPT_RT and since
> > vfp_current_hw_state is NULL the registers won't be saved and a zero set
> > of registers will be loaded once the neon task gets back on the CPU (it
> > seems that VFP exceptions in kernel mode are treated the same way as
> > those in user mode).
> >
> > What do you think?
> >
> 
> Indeed. kernel_neon_begin() no longer disabling preemption is
> definitely wrong on PREEMPT_RT. The question is whether we want to
> untangle this or just make PREEMPT_RT and KERNEL_MODE_NEON mutually
> exclusive. This, by itself, makes quite a lot of sense, actually,
> given that on actual 32-bit hardware, the SIMD speedup is not that
> spectacular (the AES and other crypto instructions, which do give an
> order of magnitude speedup are only implemented on 64-bit cores, which
> usually run a 64-bit kernel). So if real-time behavior is a
> requirement, using ordinary crypto implemented in C (which does not
> require preemption to be disabled) is likely preferred anyway.

I have no opinion which can be backed up by numbers. So I have no
problem to go along with it and disable KERNEL_MODE_NEON since it is
simpler and the benefit is questionable.

So patch
- #1 KERNEL_MODE_NEON depends on !RT
- #2 disable BH followed by preemption in vfp_sync_hwstate() and
  vfp_entry().

if so, let me prepare something unless you want to :)

Sebastian

  reply	other threads:[~2023-05-17 14:37 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <202305142217.46875.pisa@cmp.felk.cvut.cz>
2023-05-16 16:33 ` Preemp-RT kernel builds from 6.3 on fails on Xilinx Zynq 7Z010 Sebastian Andrzej Siewior
2023-05-16 17:13   ` Ard Biesheuvel
2023-05-17  8:54     ` Sebastian Andrzej Siewior
2023-05-17 11:05       ` Ard Biesheuvel
2023-05-17 14:37         ` Sebastian Andrzej Siewior [this message]
2023-05-17 22:05           ` Ard Biesheuvel
2023-05-19 14:57             ` [PATCH 0/3] ARM: vfp: Fixes for PREEMP_RT Sebastian Andrzej Siewior
2023-05-19 14:57               ` [PATCH 1/3] ARM: vfp: Provide vfp_lock() for VFP locking Sebastian Andrzej Siewior
2023-05-19 14:57               ` [PATCH 2/3] ARM: vfp: Use vfp_lock() in vfp_sync_hwstate() Sebastian Andrzej Siewior
2023-05-19 14:57               ` [PATCH 3/3] ARM: vfp: Use vfp_lock() in vfp_entry() Sebastian Andrzej Siewior
2023-05-19 16:14               ` [PATCH 0/3] ARM: vfp: Fixes for PREEMP_RT Ard Biesheuvel
2023-05-19 18:06                 ` Sebastian Andrzej Siewior
2023-05-19 19:42                   ` Ard Biesheuvel
2023-05-19 18:52 ` Preemp-RT kernel builds from 6.3 on fails on Xilinx Zynq 7Z010 Sebastian Andrzej Siewior
2023-05-21 14:57   ` Pavel Pisa
2023-05-22  7:44     ` Sebastian Andrzej Siewior

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230517143706.6hi48xeo@linutronix.de \
    --to=bigeasy@linutronix.de \
    --cc=ardb@kernel.org \
    --cc=hronepa1@fel.cvut.cz \
    --cc=linux-rt-users@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=pisa@cmp.felk.cvut.cz \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).