From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751699AbdFHCz6 (ORCPT ); Wed, 7 Jun 2017 22:55:58 -0400
Received: from bombadil.infradead.org ([65.50.211.133]:59734 "EHLO
	bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751439AbdFHCz5 (ORCPT ); Wed, 7 Jun 2017 22:55:57 -0400
Date: Wed, 7 Jun 2017 19:55:56 -0700
From: Matthew Wilcox
To: Andy Shevchenko
Cc: "linux-kernel@vger.kernel.org", Andrew Morton, Martin Schwidefsky,
	Rasmus Villemoes, Matthew Wilcox
Subject: Re: [PATCH 3/3] bitmap: Use memcmp optimisation in more situations
Message-ID: <20170608025556.GB20010@bombadil.infradead.org>
References: <20170607142924.28552-1-willy@infradead.org>
	<20170607142924.28552-4-willy@infradead.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: 
User-Agent: Mutt/1.8.0 (2017-02-23)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: 
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jun 08, 2017 at 04:48:04AM +0300, Andy Shevchenko wrote:
> > Commit 7dd968163f ("bitmap: bitmap_equal memcmp optimization") was
> > rather more restrictive than necessary; we can use memcmp() to implement
> > bitmap_equal() as long as the number of bits can be proved to be a
> > multiple of 8.  And architectures other than s390 may be able to make
> > good use of this optimisation.
>
> > -	if (__builtin_constant_p(nbits) && (nbits % BITS_PER_LONG) == 0)
> > +	if (__builtin_constant_p(nbits & 7) && IS_ALIGNED(nbits, 8))
> > 		return !memcmp(src1, src2, nbits / 8);
>
> I'm not sure this is a fully correct change.
> What exactly ' & 7' part does?
> For me looks like you may just drop it.

We only need to know if the bottom 3 bits are 0 to apply this optimisation.
For example, if we have a user which does this:

	nbits = 8;
	if (argle)
		nbits += 8;
	if (bitmap_equal(ptr1, ptr2, nbits))
		blah();

then we can use memcmp() because gcc can deduce that the bottom 3 bits
are never set (try it! it works!).  We don't need nbits as a whole to be
const.
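
For anyone who wants to try it outside the kernel, here is a minimal
userspace sketch of the same check.  It is not the kernel code: every
sketch_/SKETCH_ name is made up for illustration, it assumes a GCC-style
compiler providing __builtin_constant_p(), and the endianness guard is an
assumption of this sketch (on little-endian the mask is just the "& 7"
discussed above).  Whether the memcmp() branch is actually selected
depends on the compiler and optimisation level; the result is the same
either way.

```c
#include <limits.h>
#include <stdbool.h>
#include <string.h>

#define SKETCH_BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Assumption of this sketch: a byte-wise memcmp() only follows bit order
 * on little-endian machines, so elsewhere require whole-word alignment.
 */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#define SKETCH_MEM_ALIGN 8U
#else
#define SKETCH_MEM_ALIGN SKETCH_BITS_PER_LONG
#endif

/* Generic fallback: compare whole words, then the masked tail word. */
static bool sketch_bitmap_equal_slow(const unsigned long *a,
				     const unsigned long *b,
				     unsigned int nbits)
{
	unsigned int k, lim = nbits / SKETCH_BITS_PER_LONG;
	unsigned int rem = nbits % SKETCH_BITS_PER_LONG;

	for (k = 0; k < lim; k++)
		if (a[k] != b[k])
			return false;

	if (rem) {
		/* Keep only the low 'rem' bits of the last word. */
		unsigned long mask = ~0UL >> (SKETCH_BITS_PER_LONG - rem);

		if ((a[k] ^ b[k]) & mask)
			return false;
	}
	return true;
}

static inline bool sketch_bitmap_equal(const unsigned long *a,
				       const unsigned long *b,
				       unsigned int nbits)
{
	/*
	 * Only the low bits of nbits need to be provably zero at compile
	 * time; nbits itself may still be a runtime value.
	 */
	if (__builtin_constant_p(nbits & (SKETCH_MEM_ALIGN - 1)) &&
	    (nbits & (SKETCH_MEM_ALIGN - 1)) == 0)
		return !memcmp(a, b, nbits / 8);

	return sketch_bitmap_equal_slow(a, b, nbits);
}

/* The caller from the example: nbits is 8 or 16, never a constant. */
bool sketch_user(const unsigned long *p1, const unsigned long *p2, int argle)
{
	unsigned int nbits = 8;

	if (argle)
		nbits += 8;
	return sketch_bitmap_equal(p1, p2, nbits);
}
```

Building this with optimisation enabled (say, gcc -O2 -S) and looking at
what sketch_user() compiles to is one way to see which branch the
compiler kept.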