From mboxrd@z Thu Jan 1 00:00:00 1970
From: Linus Torvalds
Date: Sun, 11 Dec 2016 20:01:36 -0800
Subject: Re: [PATCH v2] siphash: add cryptographically secure hashtable function
To: "Jason A. Donenfeld"
Cc: "kernel-hardening@lists.openwall.com", LKML, Linux Crypto Mailing List,
	George Spelvin, Scott Bauer, Andi Kleen, Andy Lutomirski, Greg KH,
	Jean-Philippe Aumasson, "Daniel J. Bernstein"
In-Reply-To: <20161212034817.1773-1-Jason@zx2c4.com>
References: <20161211204345.GA1558@kroah.com> <20161212034817.1773-1-Jason@zx2c4.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Dec 11, 2016 at 7:48 PM, Jason A. Donenfeld wrote:
> +	switch (left) {
> +	case 7: b |= ((u64)data[6]) << 48;
> +	case 6: b |= ((u64)data[5]) << 40;
> +	case 5: b |= ((u64)data[4]) << 32;
> +	case 4: b |= ((u64)data[3]) << 24;
> +	case 3: b |= ((u64)data[2]) << 16;
> +	case 2: b |= ((u64)data[1]) << 8;
> +	case 1: b |= ((u64)data[0]); break;
> +	case 0: break;
> +	}

The above is extremely inefficient. Considering that most kernel data
would be expected to be smallish, that matters (ie the usual benchmark
would not be about hashing megabytes of data, but instead millions of
hashes of small data).

I think this could be rewritten (at least for 64-bit architectures) as

 #ifdef CONFIG_DCACHE_WORD_ACCESS

	if (left)
		b |= le64_to_cpu(load_unaligned_zeropad(data) &
				 bytemask_from_count(left));

 #else

	.. do the duff's device thing with the switch() ..
 #endif

which should give you basically perfect code generation (ie a single
64-bit load and a byte mask).

Totally untested, just looking at the code and trying to make sense of it.

... and obviously, it requires an actual high-performance use-case to
make any difference.

             Linus