From: Andy Lutomirski <luto@amacapital.net>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: George Spelvin <linux@sciencehorizons.net>,
	"Jason A. Donenfeld" <Jason@zx2c4.com>,
	Andi Kleen <ak@linux.intel.com>,
	David Miller <davem@davemloft.net>,
	David Laight <David.Laight@aculab.com>,
	"Daniel J . Bernstein" <djb@cr.yp.to>,
	Eric Biggers <ebiggers3@gmail.com>,
	Eric Dumazet <eric.dumazet@gmail.com>,
	Hannes Frederic Sowa <hannes@stressinduktion.org>,
	Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>,
	"kernel-hardening@lists.openwall.com" <kernel-hardening@lists.openwall.com>,
	Linux Crypto Mailing List <linux-crypto@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Network Development <netdev@vger.kernel.org>,
	Tom Herbert <tom@herbertland.com>,
	"Theodore Ts'o" <tytso@mit.edu>,
	Vegard Nossum <vegard.nossum@gmail.com>
Subject: Re: HalfSipHash Acceptable Usage
Date: Wed, 21 Dec 2016 17:54:26 -0800
Message-ID: <CALCETrVxFDQGJeQX5k39pM3TvqH4q10SduPY=Os_RiJGEg_0Hg@mail.gmail.com>
In-Reply-To: <CA+55aFy8fNOxw3bnwkX1S46jKnW6i26mueaiuOsScyN3kFJp+A@mail.gmail.com>

On Wed, Dec 21, 2016 at 9:25 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Wed, Dec 21, 2016 at 7:55 AM, George Spelvin
> <linux@sciencehorizons.net> wrote:
>>
>> How much does kernel_fpu_begin()/kernel_fpu_end() cost?
>
> It's now better than it used to be, but it's absolutely disastrous
> still. We're talking easily many hundreds of cycles. Under some loads,
> thousands.
>
> And I warn you already: it will _benchmark_ a hell of a lot better
> than it will work in reality. In benchmarks, you'll hit all the
> optimizations ("oh, I've already saved away all the FP registers, no
> need to do it again").
>
> In contrast, in reality, especially with things like "do it once or
> twice per incoming packet", you'll easily hit the absolute worst
> cases, where not only does it take a few hundred cycles to save the FP
> state, you'll then return to user space in between packets, which
> triggers the slow-path return code and reloads the FP state, which is
> another few hundred cycles plus.

Hah, you're thinking that the x86 code works the way that Rik and I
want it to work, and you just made my day. :)  What actually happens
is that the state is saved in kernel_fpu_begin() and restored in
kernel_fpu_end(), and it'll take a few hundred cycles best case.  If
you do it a bunch of times in a loop, you *might* trigger a CPU
optimization that notices that the state being saved is the same
state that was just restored, but you're still going to pay the full
restore code each round trip no matter what.  The code is much
clearer in 4.10 kernels now that I deleted the unused "lazy" branches.

>
> Similarly, in benchmarks you'll hit the "modern CPU's power on the AVX
> unit and keep it powered up for a while afterwards", while in real
> life you would quite easily hit the "oh, AVX is powered down because
> we were idle, now it powers up at half speed which is another latency
> hit _and_ the AVX unit won't run full out anyway".

I *think* that was mostly fixed in Broadwell or thereabouts (in terms
of latency -- throughput and power consumption still suffers).
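To put rough numbers on the point above, here is a small userspace cost model, offered as a sketch rather than a measurement. It charges a fixed save cost for every kernel_fpu_begin() and a fixed restore cost for every kernel_fpu_end(), as the message describes for the current x86 code. The cycle counts (300, 300, 50) are assumed, illustrative values picked to match "a few hundred cycles"; the model_* helpers and the batching variant are hypothetical and are not kernel APIs. Whether a real caller can batch its work at all is a separate question that this model does not answer.

```c
#include <stdio.h>

/* Assumed, illustrative costs in cycles. These are NOT measurements. */
enum {
	SAVE_COST    = 300,	/* state save in kernel_fpu_begin() */
	RESTORE_COST = 300,	/* state restore in kernel_fpu_end() */
	HASH_COST    = 50	/* hypothetical SIMD hash of one packet */
};

/* Running total of modeled cycles. */
static unsigned long cycles;

/* Hypothetical stand-ins that only account for cost. */
static void model_fpu_begin(void) { cycles += SAVE_COST; }
static void model_fpu_end(void)   { cycles += RESTORE_COST; }
static void model_simd_hash(void) { cycles += HASH_COST; }

/* One begin/end pair around every packet ("once per incoming packet"). */
unsigned long cost_per_packet(unsigned int npackets)
{
	unsigned int i;

	cycles = 0;
	for (i = 0; i < npackets; i++) {
		model_fpu_begin();
		model_simd_hash();
		model_fpu_end();
	}
	return cycles;
}

/* One begin/end pair around the whole batch, assuming batching is possible. */
unsigned long cost_batched(unsigned int npackets)
{
	unsigned int i;

	cycles = 0;
	model_fpu_begin();
	for (i = 0; i < npackets; i++)
		model_simd_hash();
	model_fpu_end();
	return cycles;
}
```

Under these assumed costs, ten packets hashed one at a time are charged 10 * (300 + 50 + 300) cycles, while a single begin/end pair around all ten is charged 300 + 10 * 50 + 300. The fixed save/restore cost dominates whenever the SIMD work per section is small, which is the per-packet case discussed in the thread.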