linux-kernel.vger.kernel.org archive mirror
From: Andy Lutomirski <luto@amacapital.net>
To: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	LKML <linux-kernel@vger.kernel.org>,
	Netdev <netdev@vger.kernel.org>,
	David Miller <davem@davemloft.net>,
	Andrew Lutomirski <luto@kernel.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Samuel Neves <sneves@dei.uc.pt>,
	linux-arch@vger.kernel.org, Rik van Riel <riel@surriel.com>
Subject: Re: [PATCH v2 01/17] asm: simd context helper API
Date: Sun, 26 Aug 2018 07:25:01 -0700
Message-ID: <01BF319B-D6F3-432F-AE1A-1B8B4E3A36A4@amacapital.net>
In-Reply-To: <CAHmME9q+JcT9pnZ5jgowf15O4BkVF-2-QkHA2o1ZKbVe4nAg6g@mail.gmail.com>




> On Aug 26, 2018, at 7:18 AM, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> 
> On Sun, Aug 26, 2018 at 8:06 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>>> Do you mean to say you intend to make kernel_fpu_end() and
>>> kernel_neon_end() only actually do something upon context switch, but
>>> not when it's actually called? So that multiple calls to
>>> kernel_fpu_begin() and kernel_neon_begin() can be made without
>>> penalty?
>> 
>> On context switch and exit to user. That allows us to keep those code paths
>> fully preemptible. Still twisting my brain around the details.
> 
> Just to make sure we're on the same page, the goal is so that this code:
> 
> kernel_fpu_begin();
> kernel_fpu_end();
> kernel_fpu_begin();
> kernel_fpu_end();
> kernel_fpu_begin();
> kernel_fpu_end();
> kernel_fpu_begin();
> kernel_fpu_end();
> kernel_fpu_begin();
> kernel_fpu_end();
> kernel_fpu_begin();
> kernel_fpu_end();
> ...
> 
> has the same performance as this code:
> 
> kernel_fpu_begin();
> kernel_fpu_end();
> 
> (Unless of course the process is preempted or the like.)
> 
> Currently the present situation makes the performance of the above
> wildly different, since kernel_fpu_end() does something immediately.
> 
> What about something like this:
> - Add a tristate flag connected to task_struct (or in the global fpu
> struct in the case that this happens in irq and there isn't a valid
> current).
> - On kernel_fpu_begin(), if the flag is 0, do the usual expensive
> XSAVE stuff, and set the flag to 1.
> - On kernel_fpu_begin(), if the flag is non-0, just set the flag to 1
> and return.
> - On kernel_fpu_end(), if the flag is non-0, set the flag to 2.
> (Otherwise WARN() or BUG() or something.)
> - On context switch / preemption / etc away from the task, if the flag
> is non-0, XRSTOR and such.

It’s not that simple. First, these states need names, at least for thinking about. 0 is “user state in regs”. 1 is “kernel state active”. 2 is “nothing active”.

The actual encoding will be something like two flags: TIF_XSTATE_UNLOADED (user state is not in regs) and TIF_KERNEL_XSTATE (kernel is using the FPU). And this fundamentally doubles the size of struct fpu.

Tglx, that doubling-the-size-of-fpu makes me question the idea of letting the kernel use the fpu while preemptible.
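
To make the encoding and the size concern concrete, here is a rough userspace sketch, not kernel code. Only the two flag names come from the text above. The bit values, the mapping from the three states to flag combinations, the struct names and the 4096-byte save area are all assumptions for illustration. The size doubles on the assumption that a task preempted while the kernel is using the FPU needs somewhere to keep the kernel's registers in addition to the saved user state.

```c
#include <stddef.h>

/* Hypothetical bit values; only the flag names appear in the message. */
#define TIF_XSTATE_UNLOADED (1u << 0) /* user state is not in regs */
#define TIF_KERNEL_XSTATE   (1u << 1) /* kernel is using the FPU */

/* The three states of the proposed tristate flag, with names. */
enum xstate {
	USER_STATE_IN_REGS  = 0,
	KERNEL_STATE_ACTIVE = 1,
	NOTHING_ACTIVE      = 2,
};

/* Assumed mapping from the tristate value to the two flags. */
static inline unsigned int xstate_to_flags(enum xstate s)
{
	switch (s) {
	case USER_STATE_IN_REGS:
		return 0;
	case KERNEL_STATE_ACTIVE:
		return TIF_XSTATE_UNLOADED | TIF_KERNEL_XSTATE;
	case NOTHING_ACTIVE:
		return TIF_XSTATE_UNLOADED;
	}
	return 0;
}

/* Placeholder size; the real save area size depends on CPU features. */
#define SKETCH_XSAVE_BYTES 4096

/* One save area: enough if kernel FPU use is never preempted. */
struct fpu_sketch_today {
	unsigned char user_state[SKETCH_XSAVE_BYTES];
};

/* Two save areas: a preempted kernel FPU user needs its own buffer. */
struct fpu_sketch_preemptible {
	unsigned char user_state[SKETCH_XSAVE_BYTES];
	unsigned char kernel_state[SKETCH_XSAVE_BYTES];
};
```

With these placeholder definitions, sizeof(struct fpu_sketch_preemptible) is exactly twice sizeof(struct fpu_sketch_today), which is the cost in question.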

> - On context switch / preemption / etc back to the task, if the flag
> is 1, XSAVE and such. If the flag is 2, set it to 0.
> 



> Jason
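
For reference, the quoted proposal can be written down as a small userspace simulation that only counts the expensive operations. This is a sketch of the rules exactly as quoted, with the state names from above. The sim_* function names, the struct and the counters are made up for illustration, and nothing here touches real FPU state. Where the proposal does not say what happens to the flag (on switching away), the sketch leaves it unchanged.

```c
/* Tristate flag from the proposal, using the state names given above. */
enum fpu_flag {
	USER_STATE_IN_REGS  = 0,
	KERNEL_STATE_ACTIVE = 1,
	NOTHING_ACTIVE      = 2,
};

/* Hypothetical per-task bookkeeping; counters stand in for real work. */
struct task_sim {
	enum fpu_flag flag;
	unsigned int xsave_count;  /* expensive saves performed */
	unsigned int xrstor_count; /* expensive restores performed */
	unsigned int warn_count;   /* unbalanced kernel_fpu_end() calls */
};

/* begin: save only if user state is still in the registers. */
static inline void sim_kernel_fpu_begin(struct task_sim *t)
{
	if (t->flag == USER_STATE_IN_REGS)
		t->xsave_count++;
	t->flag = KERNEL_STATE_ACTIVE;
}

/* end: no restore here, just record that nothing is active. */
static inline void sim_kernel_fpu_end(struct task_sim *t)
{
	if (t->flag == USER_STATE_IN_REGS) {
		t->warn_count++; /* stands in for WARN() or BUG() */
		return;
	}
	t->flag = NOTHING_ACTIVE;
}

/* switch away: the deferred restore happens here if the flag is non-0. */
static inline void sim_switch_away(struct task_sim *t)
{
	if (t->flag != USER_STATE_IN_REGS)
		t->xrstor_count++;
}

/* switch back: flag 1 saves again, flag 2 drops back to 0. */
static inline void sim_switch_back(struct task_sim *t)
{
	if (t->flag == KERNEL_STATE_ACTIVE)
		t->xsave_count++;
	else if (t->flag == NOTHING_ACTIVE)
		t->flag = USER_STATE_IN_REGS;
}
```

Running any number of begin/end pairs back to back on a zero-initialized struct task_sim bumps xsave_count only once, which is the stated goal. What the sketch does not model is where the kernel's own register contents go when the task is switched away while the flag is 1.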

