From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 91906C433FE for ; Wed, 27 Oct 2021 01:09:37 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 685FB60F92 for ; Wed, 27 Oct 2021 01:09:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236892AbhJ0BMA (ORCPT ); Tue, 26 Oct 2021 21:12:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60292 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236798AbhJ0BL7 (ORCPT ); Tue, 26 Oct 2021 21:11:59 -0400 Received: from mail-pf1-x42a.google.com (mail-pf1-x42a.google.com [IPv6:2607:f8b0:4864:20::42a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9012DC061745 for ; Tue, 26 Oct 2021 18:09:34 -0700 (PDT) Received: by mail-pf1-x42a.google.com with SMTP id 127so1146810pfu.1 for ; Tue, 26 Oct 2021 18:09:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=IE1yW+OMDFYK7DVFzbjLOEHKVUKZvs6KTijL3WSzHgs=; b=NLNyXqReix36CnFm1oFZAZzWewcBWEwwtwbwJQL1EqMUIYV8pEqTcKuyH6Kdy0Px4V 2pNR+PqQqvFi/tW0FOcQ5fsKwQjkn5w2HBZwkDMcA+U+MAi5uXLCl+3s874yt0wyhFpT GMjsk+L/F72TgL/e9+UJcfgPXjbBJZY6ngzNKjMAFZxAy1uBQGShKV49ljsdDZoxtbFy soxwso5992LkvJM/vKNtmDLloczp2Hq++1GH4SiW9cWja2WhpGpwoOReyQoaFs9oSPar s+wULBwFIa8Re4QBM0DghmfyIUxe3BKsXSc6VnamBgDdu2YNf5y/d6CoPLN3R4GjxTkE ehMA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=IE1yW+OMDFYK7DVFzbjLOEHKVUKZvs6KTijL3WSzHgs=; b=gGFNKn8AHrZWvIzDt5hIj39sTVO9uuV7LGterhvx0L6/vXwzP0H1F2E5vABhsRmF7Q 9coT0yGv/qOUP7rF8bzZ2rYZKj9rPnzQ//Ydo1qfQgpdbQoI4DRIdb2hXGtSsoVva/Af iVJxbXIt2Th7o3J7eiO8tVvjbGx9ny8Hql+MejPuTXZMv4Rqi3yEzCtLWFozcex0mZK3 Xw5qoXpeWg5gVZ+bebZRdWXngOzYs+eO5OmlPAmVxFjyf/zC7VyKUbDZdBBaPWrJ4YpW Wkoze5QlNTqeI+rkhxERQb9CdDqJ9oMKouJpvCCwitDFx+MlnwDS31fNU37le/GgcHWU qkPQ== X-Gm-Message-State: AOAM530M+fEfD4pUlQAW/LwCCRNSVuc7YDCY7KswImEgb9I8hkpeZek2 4RtMPTldWURzamBttXlKrIDOxHksCvGrTZlrh42LUA== X-Google-Smtp-Source: ABdhPJwd5SxRRHuClfQQAiRjCXQTrpzQChfLU0PAB6BN/EZhNgwPC0gDz0AC+Eq2iNzGTJUXciRXHM1e4aJmtX+MHVk= X-Received: by 2002:a05:6a00:179c:b0:47c:2092:c28c with SMTP id s28-20020a056a00179c00b0047c2092c28cmr3188193pfg.59.1635296973719; Tue, 26 Oct 2021 18:09:33 -0700 (PDT) MIME-Version: 1.0 References: <20211025200852.3002369-1-kaleshsingh@google.com> <20211025200852.3002369-7-kaleshsingh@google.com> <20211026151451.7f3e09a4@gandalf.local.home> <20211026201846.08990d1d@rorschach.local.home> In-Reply-To: <20211026201846.08990d1d@rorschach.local.home> From: Kalesh Singh Date: Tue, 26 Oct 2021 18:09:22 -0700 Message-ID: Subject: Re: [PATCH v4 6/8] tracing/histogram: Optimize division by a power of 2 To: Steven Rostedt Cc: surenb@google.com, hridya@google.com, namhyung@kernel.org, kernel-team@android.com, Jonathan Corbet , Ingo Molnar , Shuah Khan , Masami Hiramatsu , Tom Zanussi , linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Oct 26, 2021 at 5:18 PM Steven Rostedt wrote: > > On Tue, 26 Oct 2021 16:39:13 -0700 > Kalesh Singh wrote: > > > > // This works best for small divisors > > > if (div > max_div) { > > > // only do a real division > > > return; > > > } > > > shift = 20; > > > mult = ((1 << shift) + div - 1) / div; > > > delta = mult * div - (1 << shift); > > > if (!delta) { > > > /* div is a power of 2 */ > > > max = -1; > > > return; > > > } > > > max = (1 << shift) / delta; > > > > I'm still trying to digest the above algorithm. > > mult = (2^20 + div - 1) / div; > > The "div - 1" is to round up. > > Basically, it's doing: X / div = X * (2^20 / div) / 2^20 > > If div is constant, the 2^20 / div is constant, and the "2^20" is the > same as a shift. > > So multiplier is 2^20 / div, and the shift is 20. > > But because there's rounding errors it is only accurate up to the > difference of: > > delta = mult * div / 2^20 > > That is if mult is a power of two, then there would be no rounding > errors, and the delta is zero, making the max infinite: > > max = 2^20 / delta as delta goes to zero. > > > But doesn't this add 2 extra divisions? What am I missing here? > > The above is only done at parsing not during the trace, where we care > about. Hi Steve, Thanks for the explanation, this cleared it up for me. - Kalesh > > > > > > > > > > We would of course need to use 64 bit operations (maybe only do this for 64 > > > bit machines). And perhaps even use bigger shift values to get a bigger max. > > > > > > Then we could do: > > > > > > if (val1 < max) > > > return (val1 * mult) >> shift; > > This is done at the time of recording. > > Actually, it would be: > > if (val1 < max) > return (val1 * mult) >> shift; > else > return val1 / div; > > -- Steve