From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 13 Apr 2021 16:58:03 -0500
From: Segher Boessenkool
To: Christophe Leroy
Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1 1/2] powerpc/bitops: Use immediate operand when possible
Message-ID: <20210413215803.GT26583@gate.crashing.org>
References: <09da6fec57792d6559d1ea64e00be9870b02dab4.1617896018.git.christophe.leroy@csgroup.eu>
	<20210412215428.GM26583@gate.crashing.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To:
User-Agent: Mutt/1.4.2.3i
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Apr 13, 2021 at 06:33:19PM +0200, Christophe Leroy wrote:
> On 12/04/2021 at 23:54, Segher Boessenkool wrote:
> >On Thu, Apr 08, 2021 at 03:33:44PM +0000, Christophe Leroy wrote:
> >>For clear bits, on 32 bits 'rlwinm' can be used instead of 'andc' for
> >>when all bits to be cleared are consecutive.
> >
> >Also on 64-bit, as long as both the top and bottom bits are in the low
> >32-bit half (for 32 bit mode, it can wrap as well).
>
> Yes. But here we are talking about clearing a few bits, all other ones must
> remain unchanged. An rlwinm on PPC64 will always clear the upper part,
> which is unlikely to be what we want.

No, it does not.  It takes the low 32 bits of the source reg, duplicated
to the top half as well, then rotated, then ANDed with the mask (which
can wrap around).  This isn't very often very useful, but :-)
(One useful operation is splatting 32 bits to both halves of a 64-bit
register, which is just rlwinm d,s,0,1,0).

If you only look at the low 32 bits, it does exactly the same as on
32-bit implementations.

> >>For the time being only
> >>handle the single bit case, which we detect by checking whether the
> >>mask is a power of two.
> >
> >You could look at rs6000_is_valid_mask in GCC:
> >
> >used by rs6000_is_valid_and_mask immediately after it.  You probably
> >want to allow only rlwinm in your case, and please note this checks if
> >something is a valid mask, not the inverse of a valid mask (as you
> >want here).
>
> This check looks more complex than what I need. It is used for both rlw...
> and rld..., and it calculates the operands. The only thing I need is to
> validate the mask.

It has to do exactly the same thing for rlwinm as for all 64-bit
variants (rldicl, rldicr, rldic).

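For illustration, the rlwinm behaviour on a 64-bit implementation
described earlier in this mail (low word duplicated into both halves,
rotated, then ANDed with a mask that may wrap) can be modelled in
portable C.  This is only a sketch under stated assumptions: mask64()
and rlwinm64() are hypothetical helper names, not kernel or GCC
functions, and bits are numbered IBM-style (bit 0 is the most
significant bit).

```c
#include <stdint.h>

/*
 * Mask with 1 bits from mb through me, IBM numbering (bit 0 = MSB of the
 * 64-bit register).  If mb > me, the run of ones wraps around from bit 63
 * to bit 0.  Both arguments must be in 0..63.
 */
static uint64_t mask64(unsigned int mb, unsigned int me)
{
	uint64_t from_mb = ~UINT64_C(0) >> mb;		/* bits mb..63 */
	uint64_t upto_me = ~UINT64_C(0) << (63 - me);	/* bits 0..me  */

	return mb <= me ? (from_mb & upto_me) : (from_mb | upto_me);
}

/*
 * Hypothetical model of "rlwinm ra,rs,sh,mb,me" on a 64-bit
 * implementation; sh, mb and me must be in 0..31.
 */
static uint64_t rlwinm64(uint64_t rs, unsigned int sh, unsigned int mb,
			 unsigned int me)
{
	uint64_t lo = (uint32_t)rs;		/* low 32 bits of the source */
	uint64_t dup = (lo << 32) | lo;		/* duplicated to the top half */
	/* rotate left by sh; a shift by 64 is avoided for sh == 0 */
	uint64_t rot = sh ? (dup << sh) | (dup >> (64 - sh)) : dup;

	return rot & mask64(mb + 32, me + 32);
}
```

In this model, rlwinm64(x, 0, 1, 0) returns the low word of x copied
into both halves.  A non-wrapping mask such as mb=0, me=30 leaves the
upper half zero, while a wrapping mask keeps the duplicated low word in
the upper half.
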
One side effect of calculating the bit positions with exact_log2 is that
it returns negative if the argument is not a power of two.

Here is a simpler way, that handles all cases: input in "u32 val":

	if (!val)
		return nonono;
	if (val & 1)
		val = ~val;	// make the mask non-wrapping
	val += val & -val;	// adding the low set bit should result in
				// at most one bit set
	if (!(val & (val - 1)))
		return okidoki_all_good;

> I found a way: By anding the mask with the complement of itself rotated
> left by one bit, we identify the transitions from 0 to 1. If the result is a
> power of 2, it means there's only one transition so the mask is as expected.

That does not handle all cases (it misses the all-bits-set mask, at
least).  Which isn't all that interesting of course, but it is a valid
mask (it won't clear any bits though, so it is not too interesting for
your specific case :-) )


Segher
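
The check above can be wrapped into a small self-contained C function,
shown here next to one possible reading of the rotate-based transition
test from the quoted text.  This is a sketch only: is_rlwinm_mask(),
can_clear_with_rlwinm() and has_single_transition() are hypothetical
names, not existing kernel or GCC helpers, and the transition test is an
interpretation of the prose description, not code taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if val is one contiguous run of 1 bits, where the run may wrap
 * around from bit 31 to bit 0.  All bits set counts as valid, zero does not.
 */
static bool is_rlwinm_mask(uint32_t val)
{
	if (!val)
		return false;
	if (val & 1)
		val = ~val;		/* make the mask non-wrapping */
	val += val & (0u - val);	/* add the lowest set bit */
	return (val & (val - 1)) == 0;	/* at most one bit left set */
}

/* Clearing bits means ANDing with the inverse, so test the complement. */
static bool can_clear_with_rlwinm(uint32_t bits_to_clear)
{
	return is_rlwinm_mask(~bits_to_clear);
}

/* Assumed reading of the transition test: exactly one 0 -> 1 edge. */
static bool has_single_transition(uint32_t mask)
{
	uint32_t rot = (mask << 1) | (mask >> 31);	/* rotate left by 1 */
	uint32_t edges = mask & ~rot;			/* 1 where a run starts */

	return edges && !(edges & (edges - 1));		/* exactly one bit set */
}
```

For example, is_rlwinm_mask(0xffffffff) is true while
has_single_transition(0xffffffff) is false, which is the all-bits-set
case mentioned above.  Both accept a wrapping mask such as 0x80000001
and reject 0x00000005.
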