In-Reply-To: <20161222050138.12011.qmail@ns.sciencehorizons.net>
References: <20161222050138.12011.qmail@ns.sciencehorizons.net>
From: Andy Lutomirski
Date: Wed, 21 Dec 2016 21:42:45 -0800
Subject: Re: George's crazy full state idea (Re: HalfSipHash Acceptable Usage)
To: George Spelvin
Cc: Andrew Lutomirski, Andi Kleen, "David S. Miller", David Laight,
    "D. J. Bernstein", Eric Biggers, Eric Dumazet, Hannes Frederic Sowa,
    "Jason A. Donenfeld", Jean-Philippe Aumasson,
    "kernel-hardening@lists.openwall.com", Linux Crypto Mailing List,
    "linux-kernel@vger.kernel.org", Network Development, Tom Herbert,
    Linus Torvalds, "Ted Ts'o", Vegard Nossum
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Dec 21, 2016 at 9:01 PM, George Spelvin wrote:
> Andy Lutomirski wrote:
>> I don't even think it needs that.  This is just adding a
>> non-destructive final operation, right?
>
> It is, but the problem is that SipHash is intended for *small* inputs,
> so the standard implementations aren't broken into init/update/final
> functions.
>
> There's just one big function that keeps the state variables in
> registers and never stores them anywhere.
>
> If we *had* init/update/final functions, then it would be trivial.
>
>> Just to clarify, if we replace SipHash with a black box, I think this
>> effectively means, where "entropy" is random_get_entropy() || jiffies
>> || current->pid:
>
>> The first call returns H(random seed || entropy_0 || secret).
>> The second call returns H(random seed || entropy_0 || secret ||
>> entropy_1 || secret).  Etc.
>
> Basically, yes.  I was skipping the padding byte and keying the
> finalization rounds on the grounds of "can't hurt and might help",
> but we could do it a more standard way.
>
>> If not, then I have a fairly strong preference to keep whatever
>> construction we come up with consistent with something that could
>> actually happen with invocations of unmodified SipHash -- then all the
>> security analysis on SipHash goes through.
>
> Okay.  I don't think it makes a difference, but it's not a *big* waste
> of time.  If we have finalization rounds, we can reduce the secret
> to 128 bits.
>
> If we include the padding byte, we can do one of two things:
> 1) Make the secret 184 bits, to fill up the final partial word as
>    much as possible, or
> 2) Make the entropy 1 byte smaller and conceptually misalign the
>    secret.  What we'd actually do is remove the last byte of
>    the secret and include it in the entropy words, but that's
>    just a rotation of the secret between storage and hashing.
>
> Also, I assume you'd like SipHash-2-4, since you want to rely
> on a security analysis.

I haven't looked, but I assume that the analysis at least thought
about reduced rounds, so maybe other variants are okay.

>> The one thing I don't like is
>> that I don't see how to prove that you can't run it backwards if you
>> manage to acquire a memory dump.  In fact, I think that there exist, at
>> least in theory, hash functions that are secure in the random oracle
>> model but that *can* be run backwards given the full state.  From
>> memory, SHA-3 has exactly that property, and it would be a bit sad for
>> a CSPRNG to be reversible.
>
> Er... get_random_int() is specifically *not* designed to be resistant
> to state capture, and I didn't try.
> Remember, what it's used for
> is ASLR, what we're worried about is someone learning the layouts
> of still-running processes, and if you get a memory dump, you have
> the memory layout!

True, but it's called get_random_int(), and it seems like making it
stronger, especially if the performance cost is low to zero, is a good
thing.

>
> If you want anti-backtracking, though, it's easy to add.  What we
> hash is:
>
> entropy_0 || secret || output_0 || entropy_1 || secret || output_1 || ...
>
> You mix the output word right back in to the (unfinalized) state after
> generating it.  This is still equivalent to unmodified black-box SipHash,
> you're just using a (conceptually independent) SipHash invocation to
> produce some of its input.

Ah, cute.  This could probably be sped up by doing something like:

entropy_0 || secret || output_0 ^ entropy_1 || secret || ...

It's a little weak because the output is only 64 bits, so you could
plausibly backtrack it on a GPU or FPGA cluster or on an ASIC if the
old entropy is guessable.  I suspect there are sneaky ways around it
like using output_n-1 ^ output_n-2 or similar.  I'll sleep on it.

>
> The only remaining issues are:
> 1) How many rounds, and
> 2) May we use HalfSipHash?

I haven't looked closely enough to have a real opinion here.  I don't
know what the security margin is believed to be.

>
> I'd *like* to persuade you that skipping the padding byte wouldn't
> invalidate any security proofs, because it's true and would simplify
> the code.  But if you want 100% stock, I'm willing to cater to that.

I lean toward stock in the absence of a particularly good reason.  At
the very least I'd want to read that paper carefully.

>
> Ted, what do you think?

-- 
Andy Lutomirski
AMA Capital Management, LLC
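
For illustration only, the "stock" variant of the construction discussed
above (init/update plus a non-destructive final, with each output mixed
back into the unfinalized state) might be sketched roughly as follows.
This is not code from the thread or from the kernel.  The names SipState,
peek, siphash24 and ChainedGenerator are hypothetical, the sketch assumes
SipHash-2-4 with the standard padding byte, it assumes the random seed is
used as the 128-bit SipHash key, and the entropy bytes are a stand-in for
random_get_entropy() || jiffies || current->pid.

```python
MASK = (1 << 64) - 1

def _rotl(x, b):
    return ((x << b) | (x >> (64 - b))) & MASK

def _sipround(v):
    # One SipRound on the four 64-bit state words.
    v0, v1, v2, v3 = v
    v0 = (v0 + v1) & MASK; v1 = _rotl(v1, 13) ^ v0; v0 = _rotl(v0, 32)
    v2 = (v2 + v3) & MASK; v3 = _rotl(v3, 16) ^ v2
    v0 = (v0 + v3) & MASK; v3 = _rotl(v3, 21) ^ v0
    v2 = (v2 + v1) & MASK; v1 = _rotl(v1, 17) ^ v2; v2 = _rotl(v2, 32)
    return [v0, v1, v2, v3]

def _absorb(v, m, rounds=2):
    # Compress one 64-bit message word; returns a new list, input untouched.
    v = list(v)
    v[3] ^= m
    for _ in range(rounds):
        v = _sipround(v)
    v[0] ^= m
    return v

class SipState:
    """Hypothetical init/update/final split of SipHash-2-4."""

    def __init__(self, key):
        k0 = int.from_bytes(key[:8], "little")
        k1 = int.from_bytes(key[8:16], "little")
        self.v = [k0 ^ 0x736f6d6570736575, k1 ^ 0x646f72616e646f6d,
                  k0 ^ 0x6c7967656e657261, k1 ^ 0x7465646279746573]
        self.buf = b""      # bytes not yet forming a full 64-bit word
        self.length = 0     # total bytes absorbed so far

    def update(self, data):
        self.length += len(data)
        self.buf += data
        while len(self.buf) >= 8:
            self.v = _absorb(self.v, int.from_bytes(self.buf[:8], "little"))
            self.buf = self.buf[8:]
        return self

    def peek(self):
        # Non-destructive final: pad and finalize a copy of the state,
        # so later update() calls continue from the unfinalized state.
        last = self.buf + bytes(7 - len(self.buf)) + bytes([self.length & 0xFF])
        v = _absorb(self.v, int.from_bytes(last, "little"))
        v[2] ^= 0xFF
        for _ in range(4):
            v = _sipround(v)
        return v[0] ^ v[1] ^ v[2] ^ v[3]

def siphash24(key, msg):
    # One-shot, black-box SipHash-2-4 built from the same pieces.
    return SipState(key).update(msg).peek()

class ChainedGenerator:
    """Hashes entropy_0 || secret || output_0 || entropy_1 || secret || ..."""

    def __init__(self, seed_key, secret):
        self.state = SipState(seed_key)
        self.secret = secret

    def next_output(self, entropy):
        self.state.update(entropy).update(self.secret)
        out = self.state.peek()
        # Anti-backtracking step: feed the output word back into the state.
        self.state.update(out.to_bytes(8, "little"))
        return out
```

Because peek() works on a copy, each output equals what an unmodified
one-shot SipHash would return on the full concatenated input so far,
which is the property that lets the existing analysis carry over.  The
sketch implements the feed-back form (output_0 appended as its own word),
not the "output_0 ^ entropy_1" shortcut.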