From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1751717AbdLUKIC (ORCPT <rfc822;w@1wt.eu>);
        Thu, 21 Dec 2017 05:08:02 -0500
Received: from smtp-out4.electric.net ([192.162.216.187]:52484 "EHLO
        smtp-out4.electric.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1750884AbdLUKH7 (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Thu, 21 Dec 2017 05:07:59 -0500
From: David Laight <David.Laight@ACULAB.COM>
To: "'Crt Mori'" <cmo@melexis.com>
CC: Peter Zijlstra <peterz@infradead.org>,
        Jonathan Cameron <jic23@kernel.org>, Ingo Molnar <mingo@kernel.org>,
        Andrew Morton <akpm@linux-foundation.org>,
        Kees Cook <keescook@chromium.org>,
        Rusty Russell <rusty@rustcorp.com.au>, Ian Abbott <abbotti@mev.co.uk>,
        Larry Finger <Larry.Finger@lwfinger.net>,
        Niklas Soderlund <niklas.soderlund+renesas@ragnatech.se>,
        Thomas Gleixner <tglx@linutronix.de>,
        Krzysztof Kozlowski <krzk@kernel.org>,
        Masahiro Yamada <yamada.masahiro@socionext.com>,
        "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
        "linux-iio@vger.kernel.org" <linux-iio@vger.kernel.org>,
        Joe Perches <joe@perches.com>
Subject: RE: [PATCH v10 1/3] lib: Add strongly typed 64bit int_sqrt
Thread-Topic: [PATCH v10 1/3] lib: Add strongly typed 64bit int_sqrt
Thread-Index: AQHTeZ24vj4uJGdJVEmg6EfyY0tMJ6NMSiFAgAAfCS+AAAbEMIAADUYAgAEVOQA=
Date: Thu, 21 Dec 2017 10:08:09 +0000
Message-ID: <e618459d3d7045f4b763ce62bf472519@AcuMS.aculab.com>
References: <20171220142001.18161-1-cmo@melexis.com>
 <1c1d0ffa8ee140bf9adbc78f1559b1e8@AcuMS.aculab.com>
 <20171220160001.manjff26gfbjccsw@hirez.programming.kicks-ass.net>
 <CAKv63uuL9+xzF7KruhYwSY68-M0=aJSvJOr5Y0vVBiX8ebqfeg@mail.gmail.com>
 <c3462afd27d14c8684ee33ef6623a31a@AcuMS.aculab.com>
 <CAKv63utKNW2--ZnkoJ0z++68764cX_S9xS9Gst_vj7eDsFAZrg@mail.gmail.com>
In-Reply-To: <CAKv63utKNW2--ZnkoJ0z++68764cX_S9xS9Gst_vj7eDsFAZrg@mail.gmail.com>
Accept-Language: en-GB, en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-ms-exchange-transport-fromentityheader: Hosted
x-originating-ip: [10.202.205.33]
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
X-Outbound-IP: 156.67.243.126
X-Env-From: David.Laight@ACULAB.COM
X-Proto: esmtps
X-Revdns: 
X-HELO: AcuMS.aculab.com
X-TLS: TLSv1.2:ECDHE-RSA-AES256-SHA384:256
X-Authenticated_ID: 
X-PolicySMART: 3396946, 3397078
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from base64 to 8bit by mail.home.local id vBLA8ISj007769

From: Crt Mori
> Sent: 20 December 2017 17:30
> To: David Laight
...
> > I think this version works.
> > It doesn't have the optimisation for small values.
> >
> > unsigned int sqrt64(unsigned long long x)
> > {
> >         unsigned int x_hi = x >> 32;
> >
> >         unsigned int b = 0;
> >         unsigned int y = 0;
> >         unsigned int i;
> >
> >         for (i = 0; i < 32; i++) {
> >                 b <<= 2;
> >                 b |= x_hi >> 30;
> >                 x_hi <<= 2;
> >                 if (i == 15)
> >                         x_hi = x;
> >                 y <<= 1;
> >                 if (b > y)
> >                         b -= ++y;
> >         }
> >         return y;
> > }
> 
> Wow, thanks. This seems like a major change.

Not really, it is just doing it slightly differently.
The algorithm is the 'schoolboy' one that used to be taught at school.
(Although I found it in a boot from the 1930s.)

I compared the object code for amd64 (doesn't need to reload
'x_hi' half way through) against the original loop.
They are much the same size, execution speed will depend the lengths
of the dependency chains.

> I did a quick run through unit tests for the sensor and the results
> are way off. On the sensor I had to convert double calculations to
> integer calculations and target was to get end result within 0.02 degC
> (with previous approximate sqrt implementation) in sensor working
> range. This now gets into 3 degC delta at least and some are way off.
> It might be off because of some scaling on the other hand during the
> equation (not exactly comparing sqrt implementations here).
> 
> I will need a bit more testing of this to see if it is replaceable.

You probably need to put values through both (all 3) functions to see
where you are getting errors.
It might be rounding, or there might be a fubar somewhere.

	David