From: Yury Norov <yury.norov@gmail.com>
To: "Yury Norov" <yury.norov@gmail.com>,
	"Andy Shevchenko" <andriy.shevchenko@linux.intel.com>,
	"Rasmus Villemoes" <linux@rasmusvillemoes.dk>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Michał Mirosław" <mirq-linux@rere.qmqm.pl>,
	"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"David Laight" <David.Laight@aculab.com>,
	"Joe Perches" <joe@perches.com>,
	"Dennis Zhou" <dennis@kernel.org>,
	"Emil Renner Berthing" <kernel@esmil.dk>,
	"Nicholas Piggin" <npiggin@gmail.com>,
	"Matti Vaittinen" <matti.vaittinen@fi.rohmeurope.com>,
	"Alexey Klimov" <aklimov@redhat.com>,
	linux-kernel@vger.kernel.org
Subject: [PATCH 26/49] bitmap: add bitmap_weight_{cmp, eq, gt, ge, lt, le} functions
Date: Thu, 10 Feb 2022 14:49:10 -0800
Message-ID: <20220210224933.379149-27-yury.norov@gmail.com>
In-Reply-To: <20220210224933.379149-1-yury.norov@gmail.com>

Many kernel users use bitmap_weight() to compare the result against
some number or expression:

	if (bitmap_weight(...) > 1)
		do_something();

This works, but it can be significantly improved for large bitmaps: if
the set bits counted in the first few words already exceed the given
number, we can stop counting and return immediately.

The same idea works in the other direction: if the number of set bits
counted so far is small enough that the total would stay below the
required number even if all remaining bits of the bitmap were set, we
can stop counting early.
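
The two early exits described above can be sketched in userspace C. This
is only an illustration, not the code added by this patch: the names
weight_cmp_sketch(), popcount_word() and WORD_BITS are hypothetical, the
GCC/Clang builtin __builtin_popcountl() stands in for the kernel's
hweight_long(), and the sketch assumes 0 <= num <= nbits.

```c
#include <limits.h>

/* Hypothetical stand-in for the kernel's BITS_PER_LONG. */
#define WORD_BITS ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* Hypothetical stand-in for hweight_long(); relies on a GCC/Clang builtin. */
static unsigned int popcount_word(unsigned long w)
{
	return (unsigned int)__builtin_popcountl(w);
}

/*
 * Compare the number of set bits in the first nbits of map with num.
 * Assumes 0 <= num <= nbits. Returns a negative value, zero, or a
 * positive value if the weight is less than, equal to, or greater
 * than num.
 */
static int weight_cmp_sketch(const unsigned long *map, unsigned int nbits,
			     unsigned int num)
{
	unsigned int k, w = 0, full = nbits / WORD_BITS;

	for (k = 0; k < full; k++) {
		/* Even if every remaining bit were set, num is out of reach. */
		if (w + (nbits - k * WORD_BITS) < num)
			return -1;

		w += popcount_word(map[k]);

		/* Already above num, and the count can only grow. */
		if (w > num)
			return 1;
	}

	/* Count the partial last word, ignoring bits beyond nbits. */
	if (nbits % WORD_BITS)
		w += popcount_word(map[k] &
				   (~0UL >> (WORD_BITS - nbits % WORD_BITS)));

	return (w > num) - (w < num);
}
```

For a bitmap whose first word is fully set, comparing against 1 returns
after that first word, no matter how long the bitmap is.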

This patch adds a new bitmap_weight_cmp(), as suggested by Michał Mirosław,
and a family of eq, gt, ge, lt, and le wrappers to enable this optimization.
The following patches apply the new functions where appropriate.
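
As a rough illustration of how the wrappers map onto a single compare
function, and of what a converted caller might look like, here is a
userspace sketch. All sketch_* names and more_than_one_set() are
hypothetical; the compare helper below does a plain full count without
any early exit, and it assumes nbits is a multiple of the word size.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's BITS_PER_LONG. */
#define WORD_BITS ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* Full count, standing in for bitmap_weight(); whole words only. */
static int sketch_weight(const unsigned long *map, unsigned int nbits)
{
	unsigned int k;
	int w = 0;

	for (k = 0; k < nbits / WORD_BITS; k++)
		w += __builtin_popcountl(map[k]);
	return w;
}

/* Sign of (weight - num); no early exit here, for brevity. */
static int sketch_weight_cmp(const unsigned long *map, unsigned int nbits,
			     int num)
{
	return sketch_weight(map, nbits) - num;
}

static bool sketch_weight_eq(const unsigned long *map, unsigned int nbits, int num)
{
	return sketch_weight_cmp(map, nbits, num) == 0;
}

static bool sketch_weight_gt(const unsigned long *map, unsigned int nbits, int num)
{
	return sketch_weight_cmp(map, nbits, num) > 0;
}

/* weight >= num is the same as weight > num - 1 */
static bool sketch_weight_ge(const unsigned long *map, unsigned int nbits, int num)
{
	return sketch_weight_cmp(map, nbits, num - 1) > 0;
}

/* weight < num is the same as weight <= num - 1 */
static bool sketch_weight_lt(const unsigned long *map, unsigned int nbits, int num)
{
	return sketch_weight_cmp(map, nbits, num - 1) <= 0;
}

static bool sketch_weight_le(const unsigned long *map, unsigned int nbits, int num)
{
	return sketch_weight_cmp(map, nbits, num) <= 0;
}

/* Hypothetical caller: is more than one bit set? */
static bool more_than_one_set(const unsigned long *map, unsigned int nbits)
{
	/* Before the conversion: return sketch_weight(map, nbits) > 1; */
	return sketch_weight_gt(map, nbits, 1);
}
```

The ge and lt variants compare against num - 1 so that each wrapper
reduces to a "> 0" or "<= 0" test on the compare result.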

Suggested-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> (for bitmap_weight_cmp)
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com>
---
 include/linux/bitmap.h | 78 ++++++++++++++++++++++++++++++++++++++++++
 lib/bitmap.c           | 21 ++++++++++++
 2 files changed, 99 insertions(+)
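
The kernel-doc comment for bitmap_weight_cmp() in this patch notes that
testing "weight < num" as cmp(num - 1) <= 0 can stop earlier than
cmp(num) < 0. A small userspace sketch that counts the words examined may
make this concrete. It is an illustration only: cmp_counting() and
WORD_BITS are hypothetical names, it handles whole words only, and it uses
the GCC/Clang builtin __builtin_popcountl().

```c
#include <limits.h>

/* Hypothetical stand-in for the kernel's BITS_PER_LONG. */
#define WORD_BITS ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Early-exit weight comparison over nwords whole words. Stores the
 * number of words actually examined in *visited.
 */
static int cmp_counting(const unsigned long *map, unsigned int nwords,
			unsigned int num, unsigned int *visited)
{
	unsigned int k, w = 0;

	*visited = 0;
	for (k = 0; k < nwords; k++) {
		/* num is unreachable even if all remaining bits are set */
		if (w + (nwords - k) * WORD_BITS < num)
			break;

		w += (unsigned int)__builtin_popcountl(map[k]);
		(*visited)++;

		/* already above num */
		if (w > num)
			break;
	}

	return (w > num) - (w < num);
}
```

With a four-word bitmap that has a single bit set in its first word,
cmp_counting(map, 4, 1, &v) has to examine all four words before it can
report equality, while cmp_counting(map, 4, 0, &v) stops after the first
word. Both calls give the same answer to "is the weight less than 1?".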

diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 7dba0847510c..a89b626d0fbe 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -51,6 +51,12 @@ struct device;
  *  bitmap_empty(src, nbits)                    Are all bits zero in *src?
  *  bitmap_full(src, nbits)                     Are all bits set in *src?
  *  bitmap_weight(src, nbits)                   Hamming Weight: number set bits
+ *  bitmap_weight_cmp(src, nbits, num)          compare Hamming Weight with a number
+ *  bitmap_weight_eq(src, nbits, num)           Hamming Weight == num
+ *  bitmap_weight_gt(src, nbits, num)           Hamming Weight >  num
+ *  bitmap_weight_ge(src, nbits, num)           Hamming Weight >= num
+ *  bitmap_weight_lt(src, nbits, num)           Hamming Weight <  num
+ *  bitmap_weight_le(src, nbits, num)           Hamming Weight <= num
  *  bitmap_set(dst, pos, nbits)                 Set specified bit area
  *  bitmap_clear(dst, pos, nbits)               Clear specified bit area
  *  bitmap_find_next_zero_area(buf, len, pos, n, mask)  Find bit free area
@@ -162,6 +168,7 @@ int __bitmap_intersects(const unsigned long *bitmap1,
 int __bitmap_subset(const unsigned long *bitmap1,
 		    const unsigned long *bitmap2, unsigned int nbits);
 int __bitmap_weight(const unsigned long *bitmap, unsigned int nbits);
+int __bitmap_weight_cmp(const unsigned long *bitmap, unsigned int bits, int num);
 void __bitmap_set(unsigned long *map, unsigned int start, int len);
 void __bitmap_clear(unsigned long *map, unsigned int start, int len);
 
@@ -403,6 +410,77 @@ static __always_inline int bitmap_weight(const unsigned long *src, unsigned int
 	return __bitmap_weight(src, nbits);
 }
 
+/**
+ * bitmap_weight_cmp - compares number of set bits in @src with @num.
+ * @src:   source bitmap
+ * @nbits: length of bitmap in bits
+ * @num:   number to compare with
+ *
+ * Unlike bitmap_weight(), this function doesn't necessarily
+ * traverse the full bitmap and may return early.
+ *
+ * Because the number of set bits cannot decrease while counting, when the
+ * user wants to know if the number of set bits in the bitmap is less than
+ * @num, calling
+ *	bitmap_weight_cmp(..., @num) < 0
+ * is potentially less efficient than
+ *	bitmap_weight_cmp(..., @num - 1) <= 0
+ *
+ * Consider an example:
+ * bitmap_weight_cmp(1000 0000 0000 0000, 1) < 0
+ *				    ^
+ *				    stop here
+ *
+ * bitmap_weight_cmp(1000 0000 0000 0000, 0) <= 0
+ *		     ^
+ *		     stop here
+ *
+ * Returns: zero if weight of @src is equal to @num;
+ *	   negative number if weight of @src is less than @num;
+ *	   positive number if weight of @src is greater than @num.
+ */
+static __always_inline
+int bitmap_weight_cmp(const unsigned long *src, unsigned int nbits, int num)
+{
+	if ((unsigned int)num > nbits)
+		return -num;
+
+	if (small_const_nbits(nbits))
+		return hweight_long(*src & BITMAP_LAST_WORD_MASK(nbits)) - num;
+
+	return __bitmap_weight_cmp(src, nbits, num);
+}
+
+static __always_inline
+bool bitmap_weight_eq(const unsigned long *src, unsigned int nbits, int num)
+{
+	return bitmap_weight_cmp(src, nbits, num) == 0;
+}
+
+static __always_inline
+bool bitmap_weight_gt(const unsigned long *src, unsigned int nbits, int num)
+{
+	return bitmap_weight_cmp(src, nbits, num) > 0;
+}
+
+static __always_inline
+bool bitmap_weight_ge(const unsigned long *src, unsigned int nbits, int num)
+{
+	return bitmap_weight_cmp(src, nbits, num - 1) > 0;
+}
+
+static __always_inline
+bool bitmap_weight_lt(const unsigned long *src, unsigned int nbits, int num)
+{
+	return bitmap_weight_cmp(src, nbits, num - 1) <= 0;
+}
+
+static __always_inline
+bool bitmap_weight_le(const unsigned long *src, unsigned int nbits, int num)
+{
+	return bitmap_weight_cmp(src, nbits, num) <= 0;
+}
+
 static __always_inline void bitmap_set(unsigned long *map, unsigned int start,
 		unsigned int nbits)
 {
diff --git a/lib/bitmap.c b/lib/bitmap.c
index 926408883456..fb84ca70c5d9 100644
--- a/lib/bitmap.c
+++ b/lib/bitmap.c
@@ -348,6 +348,27 @@ int __bitmap_weight(const unsigned long *bitmap, unsigned int bits)
 }
 EXPORT_SYMBOL(__bitmap_weight);
 
+int __bitmap_weight_cmp(const unsigned long *bitmap, unsigned int bits, int num)
+{
+	unsigned int k, w, lim = bits / BITS_PER_LONG;
+
+	for (k = 0, w = 0; k < lim; k++) {
+		if (w + bits - k * BITS_PER_LONG < num)
+			goto out;
+
+		w += hweight_long(bitmap[k]);
+
+		if (w > num)
+			goto out;
+	}
+
+	if (bits % BITS_PER_LONG)
+		w += hweight_long(bitmap[k] & BITMAP_LAST_WORD_MASK(bits));
+out:
+	return w - num;
+}
+EXPORT_SYMBOL(__bitmap_weight_cmp);
+
 void __bitmap_set(unsigned long *map, unsigned int start, int len)
 {
 	unsigned long *p = map + BIT_WORD(start);
-- 
2.32.0

