From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757334Ab2DXTiR (ORCPT ); Tue, 24 Apr 2012 15:38:17 -0400 Received: from mail-we0-f174.google.com ([74.125.82.174]:43613 "EHLO mail-we0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757164Ab2DXTiO convert rfc822-to-8bit (ORCPT ); Tue, 24 Apr 2012 15:38:14 -0400 MIME-Version: 1.0 In-Reply-To: <20120424162224.526249106@chello.nl> References: <20120424161039.293018424@chello.nl> <20120424162224.526249106@chello.nl> From: Linus Torvalds Date: Tue, 24 Apr 2012 12:37:53 -0700 X-Google-Sender-Auth: rp_qFneZVih0vBackfdm35Dk_Qw Message-ID: Subject: Re: [RFC][PATCH 2/3] math128: Introduce {mult,add,cmp}_u128 To: Peter Zijlstra Cc: linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org, Andrew Morton , Juri Lelli , Ingo Molnar , Thomas Gleixner Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8BIT Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Apr 24, 2012 at 9:10 AM, Peter Zijlstra wrote: > Grow rudimentary u128 support without relying on gcc/libgcc. > > +#ifndef add_u128 > +static inline u128 add_u128(u128 a, u128 b) > +{ > +       u128 res; > + > +       res.hi = a.hi + b.hi; > +       res.lo = a.lo + b.lo; > + > +       if (res.lo < a.lo || res.lo < b.lo) > +               res.hi++; This is wrong. Or at least stupid. Just do one of the comparisons, not both. If overflow occurs, the result will be smaller than *either* of the added numbers, so comparing both is just silly and confused. So just pick one. Also, it might be worth looking at code generation, to see if it's better to just do a.hi += b.hi; a.low += b.low; if (a.low < b.low) a.hi++; return a; because that might make it clear that there are fewer actual values live at any particular time. But gcc may not care. Try it. Also, for the multiply, please make sure gcc knows to do a "32x32->64" multiplication, rather than thinking it needs to do full 64x64 multiplies.. I'm not sure gcc understands that as you wrote it. You are probably better off actually using 32-bit values, and then an explicit cast, ie u32 a32_0 = .. low 32 bits of a .. u32 b32_0 = .. low 32 bits of b .. u64 res64_0 = (u64) a32_0 * (u64) b32_0; but if gcc understands it from the shifts and masks, I guess it doesn't matter. Linus