Date: Mon, 2 May 2016 09:11:34 +0200 (CEST)
From: Thomas Gleixner
To: George Spelvin
cc: eric.dumazet@gmail.com, linux-kernel@vger.kernel.org, riel@redhat.com,
    torvalds@linux-foundation.org
Subject: Re: [patch 2/7] lib/hashmod: Add modulo based hash mechanism
In-Reply-To: <20160501094358.4706.qmail@ns.horizon.com>

On Sun, 1 May 2016, George Spelvin wrote:
> But I noticed a much greater difference.
>
>         Wang      * PHI     % 4093        Wang      * PHI     % 4093
>     13599486    3494816    5238442    13584166    3460266    5239463
>     13589552    3466764    5237327    13582381    3422304    5276253
>     13569409    3407317    5236239    13566738    3393215    5267909
>     13575257    3413736    5237708    13596428    3379811    5280118
>     13583607    3540416    5325609    13650964    3380301    5265210
>
> At 3.7 GHz, that's
>
> * PHI:    1059 M ops/second
> * Modulo:  706 M ops/second
> * Wang:    271 M ops/second
>
> Of course, that's a tight loop hashing; I presume your test case
> has more overhead.

Indeed.

> Anyway, my recommendation (I'll write the patches if you like) is:
>
> * Modify the multiplicative constants to be
>
>   #define GOLDEN_RATIO_32 0x61C88647
>   #define GOLDEN_RATIO_64 0x61C8864680B583EB

Works for me. I ran them through my test case and they behaved reasonably
well.

> * Change the prototype of hash_64 to return u32.

Makes sense.

> * Create a separate 32-bit implementation of hash_64() for the
>   BITS_PER_LONG < 64 case. This should not be Wang's or anything
>   similar, because 64-bit shifts are slow and awkward on 32-bit
>   machines. One option is something like __jhash_final(), but I
>   think it will suffice to do:
>
>   static __always_inline u32 hash_64(u64 val, unsigned int bits)
>   {
>           u32 hash = (u32)(val >> 32) * GOLDEN_RATIO_32 + (u32)val;
>
>           hash *= GOLDEN_RATIO_32;
>           return hash >> (32 - bits);
>   }

Works. That's more or less the same overhead as the modulo one, which
behaved well on 32-bit.

Thanks,

	tglx
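
For contrast with the 32-bit fallback quoted above, the native 64-bit
path implied by the new constants is a single 64-bit multiply. Below is
a minimal sketch, under the assumption (not spelled out in this mail)
that the BITS_PER_LONG == 64 case keeps the multiply-then-take-top-bits
shape and adopts the proposed u32 return type; the name hash_64_sketch
is illustrative.

	#include <stdint.h>

	#define GOLDEN_RATIO_64 0x61C8864680B583EBull	/* constant proposed above */

	/*
	 * Native 64-bit multiplicative hash: one multiply, then keep the
	 * top 'bits' bits (bits must be 1..32), returned as u32 per the
	 * prototype change proposed above. The 32-bit fallback quoted in
	 * the mail would replace this when BITS_PER_LONG < 64.
	 */
	static inline uint32_t hash_64_sketch(uint64_t val, unsigned int bits)
	{
		return (uint32_t)((val * GOLDEN_RATIO_64) >> (64 - bits));
	}

The top bits are the ones worth keeping: in a multiply, the low k output
bits depend only on the low k input bits, while the high bits mix in
contributions from the whole word.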
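
George's benchmark harness is not shown in the thread. A stand-alone
sketch of the kind of tight-loop measurement described above could look
like the following; the iteration count, function names, and timing via
CLOCK_MONOTONIC are illustrative assumptions, not taken from the mail.

	/* Tight-loop hash microbenchmark sketch (illustrative only). */
	#include <stdint.h>
	#include <stdio.h>
	#include <time.h>

	#define GOLDEN_RATIO_32 0x61C88647u	/* constant proposed above */

	/* Multiplicative hash: one 32-bit multiply, keep the top bits. */
	static inline uint32_t hash_mult(uint32_t val, unsigned int bits)
	{
		return (val * GOLDEN_RATIO_32) >> (32 - bits);
	}

	/* Modulo hash: remainder by 4093, the divisor used in the table. */
	static inline uint32_t hash_mod(uint32_t val)
	{
		return val % 4093;
	}

	int main(void)
	{
		/* volatile sink keeps the compiler from deleting the loop */
		volatile uint32_t sink = 0;
		struct timespec t0, t1;
		uint64_t i, n = 100000000ULL;	/* 100 M iterations, arbitrary */
		double secs;

		clock_gettime(CLOCK_MONOTONIC, &t0);
		for (i = 0; i < n; i++)
			sink = hash_mult((uint32_t)i, 12);	/* swap in hash_mod() to time modulo */
		clock_gettime(CLOCK_MONOTONIC, &t1);

		secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
		printf("%.0f M ops/second\n", n / secs / 1e6);
		(void)sink;
		return 0;
	}

As a sanity check on the quoted rates: at 3.7 GHz, 1059 M ops/second is
about 3.5 cycles per hash, roughly one pipelined 32-bit multiply plus
loop overhead, while 271 M ops/second for Wang's hash is about 13.7
cycles, consistent with its longer dependency chain of shifts, adds,
and xors.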