* [PATCH] bitmap: bitmap_equal memcmp optimization
[not found] <1465288661-3273-1-git-send-email-schwidefsky@de.ibm.com>
@ 2016-06-07 8:37 ` Martin Schwidefsky
2016-06-08 20:42 ` Andrew Morton
0 siblings, 1 reply; 3+ messages in thread
From: Martin Schwidefsky @ 2016-06-07 8:37 UTC (permalink / raw)
To: linux-arch, linux-kernel, Benjamin Herrenschmidt, David S. Miller
Cc: Martin Schwidefsky
The bitmap_equal function has optimized code for small bitmaps with a
constant size of at most BITS_PER_LONG bits. For larger bitmaps the
out-of-line function __bitmap_equal is called.
For a constant number of bits divisible by BITS_PER_LONG the memcmp
function can be used. On s390, gcc knows how to optimize this function:
memcmp calls with up to 256 bytes / 2048 bits are translated into a
single instruction.
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
---
include/linux/bitmap.h | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index e9b0b9a..27bfc0b 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -267,6 +267,10 @@ static inline int bitmap_equal(const unsigned long *src1,
 {
 	if (small_const_nbits(nbits))
 		return ! ((*src1 ^ *src2) & BITMAP_LAST_WORD_MASK(nbits));
+#ifdef CONFIG_S390
+	else if (__builtin_constant_p(nbits) && (nbits % BITS_PER_LONG) == 0)
+		return !memcmp(src1, src2, nbits / 8);
+#endif
 	else
 		return __bitmap_equal(src1, src2, nbits);
 }
--
2.6.6
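The divisibility condition in the patch can be illustrated with a small
user-space sketch. It assumes nothing beyond standard C; the names
equal_by_words and equal_by_memcmp are hypothetical and are not kernel
functions. When nbits is not a multiple of BITS_PER_LONG, the last word
carries padding bits that must be masked out, which memcmp cannot do, so the
memcmp shortcut is only safe for whole-word bitmaps.

```c
#include <limits.h>
#include <string.h>

#define BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Word-by-word comparison that masks the unused bits of the last word.
 * Illustrative stand-in for the generic path, not the kernel's
 * __bitmap_equal.
 */
static int equal_by_words(const unsigned long *a, const unsigned long *b,
			  unsigned int nbits)
{
	unsigned int i, words = nbits / BITS_PER_LONG;
	unsigned int rest = nbits % BITS_PER_LONG;

	for (i = 0; i < words; i++)
		if (a[i] != b[i])
			return 0;
	if (rest) {
		/* Only the low 'rest' bits of the last word belong to the bitmap. */
		unsigned long mask = ~0UL >> (BITS_PER_LONG - rest);

		if ((a[words] ^ b[words]) & mask)
			return 0;
	}
	return 1;
}

/*
 * Byte-wise comparison of the backing storage. Only meaningful when nbits
 * is a multiple of BITS_PER_LONG, because padding bits cannot be masked.
 */
static int equal_by_memcmp(const unsigned long *a, const unsigned long *b,
			   unsigned int nbits)
{
	return !memcmp(a, b, nbits / CHAR_BIT);
}
```

For whole-word sizes both helpers agree. For a size such as
BITS_PER_LONG + 4, two bitmaps that differ only in the padding bits of the
last word still compare equal with equal_by_words.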
* Re: [PATCH] bitmap: bitmap_equal memcmp optimization
2016-06-07 8:37 ` [PATCH] bitmap: bitmap_equal memcmp optimization Martin Schwidefsky
@ 2016-06-08 20:42 ` Andrew Morton
2016-06-09 6:31 ` Martin Schwidefsky
0 siblings, 1 reply; 3+ messages in thread
From: Andrew Morton @ 2016-06-08 20:42 UTC (permalink / raw)
To: Martin Schwidefsky
Cc: linux-arch, linux-kernel, Benjamin Herrenschmidt, David S. Miller
On Tue, 7 Jun 2016 10:37:41 +0200 Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:
> The bitmap_equal function has optimized code for small bitmaps with a
> constant size of at most BITS_PER_LONG bits. For larger bitmaps the
> out-of-line function __bitmap_equal is called.
>
> For a constant number of bits divisible by BITS_PER_LONG the memcmp
> function can be used. On s390, gcc knows how to optimize this function:
> memcmp calls with up to 256 bytes / 2048 bits are translated into a
> single instruction.
Patch looks simple enough, although it would benefit from a comment
explaining what's special about s390.
> --- a/include/linux/bitmap.h
> +++ b/include/linux/bitmap.h
> @@ -267,6 +267,10 @@ static inline int bitmap_equal(const unsigned long *src1,
>  {
>  	if (small_const_nbits(nbits))
>  		return ! ((*src1 ^ *src2) & BITMAP_LAST_WORD_MASK(nbits));
> +#ifdef CONFIG_S390
> +	else if (__builtin_constant_p(nbits) && (nbits % BITS_PER_LONG) == 0)
> +		return !memcmp(src1, src2, nbits / 8);
> +#endif
>  	else
>  		return __bitmap_equal(src1, src2, nbits);
>  }
Those elses are a bit daffy. This:
--- a/include/linux/bitmap.h~bitmap-bitmap_equal-memcmp-optimization-fix
+++ a/include/linux/bitmap.h
@@ -266,13 +266,12 @@ static inline int bitmap_equal(const uns
 			const unsigned long *src2, unsigned int nbits)
 {
 	if (small_const_nbits(nbits))
-		return ! ((*src1 ^ *src2) & BITMAP_LAST_WORD_MASK(nbits));
+		return !((*src1 ^ *src2) & BITMAP_LAST_WORD_MASK(nbits));
 #ifdef CONFIG_S390
-	else if (__builtin_constant_p(nbits) && (nbits % BITS_PER_LONG) == 0)
+	if (__builtin_constant_p(nbits) && (nbits % BITS_PER_LONG) == 0)
 		return !memcmp(src1, src2, nbits / 8);
 #endif
-	else
-		return __bitmap_equal(src1, src2, nbits);
+	return __bitmap_equal(src1, src2, nbits);
 }

 static inline int bitmap_intersects(const unsigned long *src1,
_
* Re: [PATCH] bitmap: bitmap_equal memcmp optimization
2016-06-08 20:42 ` Andrew Morton
@ 2016-06-09 6:31 ` Martin Schwidefsky
0 siblings, 0 replies; 3+ messages in thread
From: Martin Schwidefsky @ 2016-06-09 6:31 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-arch, linux-kernel, Benjamin Herrenschmidt, David S. Miller
On Wed, 8 Jun 2016 13:42:12 -0700
Andrew Morton <akpm@linux-foundation.org> wrote:
> On Tue, 7 Jun 2016 10:37:41 +0200 Martin Schwidefsky <schwidefsky@de.ibm.com> wrote:
>
> > The bitmap_equal function has optimized code for small bitmaps with a
> > constant size of at most BITS_PER_LONG bits. For larger bitmaps the
> > out-of-line function __bitmap_equal is called.
> >
> > For a constant number of bits divisible by BITS_PER_LONG the memcmp
> > function can be used. On s390, gcc knows how to optimize this function:
> > memcmp calls with up to 256 bytes / 2048 bits are translated into a
> > single instruction.
>
> Patch looks simple enough, although it would benefit from a comment
> explaining what's special about s390.
The explanation from the git commit could be transferred into a comment.
> > --- a/include/linux/bitmap.h
> > +++ b/include/linux/bitmap.h
> > @@ -267,6 +267,10 @@ static inline int bitmap_equal(const unsigned long *src1,
> >  {
> >  	if (small_const_nbits(nbits))
> >  		return ! ((*src1 ^ *src2) & BITMAP_LAST_WORD_MASK(nbits));
> > +#ifdef CONFIG_S390
> > +	else if (__builtin_constant_p(nbits) && (nbits % BITS_PER_LONG) == 0)
> > +		return !memcmp(src1, src2, nbits / 8);
> > +#endif
> >  	else
> >  		return __bitmap_equal(src1, src2, nbits);
> >  }
>
> Those elses are a bit daffy. This:
>
> --- a/include/linux/bitmap.h~bitmap-bitmap_equal-memcmp-optimization-fix
> +++ a/include/linux/bitmap.h
> @@ -266,13 +266,12 @@ static inline int bitmap_equal(const uns
>  			const unsigned long *src2, unsigned int nbits)
>  {
>  	if (small_const_nbits(nbits))
> -		return ! ((*src1 ^ *src2) & BITMAP_LAST_WORD_MASK(nbits));
> +		return !((*src1 ^ *src2) & BITMAP_LAST_WORD_MASK(nbits));
>  #ifdef CONFIG_S390
> -	else if (__builtin_constant_p(nbits) && (nbits % BITS_PER_LONG) == 0)
> +	if (__builtin_constant_p(nbits) && (nbits % BITS_PER_LONG) == 0)
>  		return !memcmp(src1, src2, nbits / 8);
>  #endif
> -	else
> -		return __bitmap_equal(src1, src2, nbits);
> +	return __bitmap_equal(src1, src2, nbits);
>  }
>
>  static inline int bitmap_intersects(const unsigned long *src1,
> _
>
Yeah that looks better. Thanks for picking this up!
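For illustration only, the reworked function with the commit explanation
moved into a comment might look roughly like the user-space sketch below.
The comment wording is one possible phrasing, not text from an applied patch.
The macros are stand-ins assumed to mirror the kernel helpers, and
generic_bitmap_equal and bitmap_equal_sketch are hypothetical names used so
the example compiles outside the kernel. CONFIG_S390 is normally undefined in
user space, so the generic path is the one exercised there.

```c
#include <limits.h>
#include <string.h>

#define BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* User-space stand-ins for the kernel helpers (assumed definitions). */
#define small_const_nbits(nbits) \
	(__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG)
#define BITMAP_LAST_WORD_MASK(nbits) \
	(~0UL >> (-(nbits) & (BITS_PER_LONG - 1)))

/* Simplified stand-in for the out-of-line __bitmap_equal(). */
static int generic_bitmap_equal(const unsigned long *src1,
				const unsigned long *src2,
				unsigned int nbits)
{
	unsigned int k, lim = nbits / BITS_PER_LONG;

	for (k = 0; k < lim; k++)
		if (src1[k] != src2[k])
			return 0;
	if (nbits % BITS_PER_LONG)
		if ((src1[k] ^ src2[k]) & BITMAP_LAST_WORD_MASK(nbits))
			return 0;
	return 1;
}

static inline int bitmap_equal_sketch(const unsigned long *src1,
				      const unsigned long *src2,
				      unsigned int nbits)
{
	if (small_const_nbits(nbits))
		return !((*src1 ^ *src2) & BITMAP_LAST_WORD_MASK(nbits));
#ifdef CONFIG_S390
	/*
	 * On s390 gcc translates memcmp calls with a constant length of up
	 * to 256 bytes (2048 bits) into a single instruction, so whole-word
	 * bitmaps of constant size can avoid the out-of-line call.
	 */
	if (__builtin_constant_p(nbits) && (nbits % BITS_PER_LONG) == 0)
		return !memcmp(src1, src2, nbits / 8);
#endif
	return generic_bitmap_equal(src1, src2, nbits);
}
```

With the early returns, each special case stands on its own and the generic
call is the plain fall-through, which is the structure the follow-up fix
introduces.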
--
blue skies,
Martin.
"Reality continues to ruin my life." - Calvin.