linux-kernel.vger.kernel.org archive mirror
* [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage
@ 2022-02-10 22:48 Yury Norov
  2022-02-10 22:48 ` [PATCH 01/49] net: dsa: don't use bitmap_weight() in b53_arl_read() Yury Norov
                   ` (49 more replies)
  0 siblings, 50 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:48 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel

In many cases people use bitmap_weight()-based functions to compare
the result against a number or expression:

        if (cpumask_weight(mask) > 1)
                do_something();

This may take a considerable amount of time on many-CPU machines because
cpumask_weight() traverses every word of the underlying cpumask
unconditionally.

We can significantly improve on this in many real cases if we stop
traversing the mask as soon as the count of set CPUs exceeds 1:

        if (cpumask_weight_gt(mask, 1))
                do_something();
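
For illustration only, here is a minimal userspace sketch of the
early-exit idea. The names sketch_weight_gt(), word_weight() and
WORD_BITS are hypothetical and this is not the kernel implementation;
it only shows that the loop can return as soon as the running count
exceeds the threshold:

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Count the set bits in one word (Kernighan's method). */
static unsigned int word_weight(unsigned long w)
{
	unsigned int n = 0;

	for (; w; w &= w - 1)
		n++;
	return n;
}

/*
 * Hypothetical stand-in for cpumask_weight_gt(): true if more than
 * 'num' of the first 'nbits' bits are set. The scan stops at the first
 * word where the running count exceeds 'num'.
 */
static bool sketch_weight_gt(const unsigned long *map, size_t nbits,
			     unsigned int num)
{
	size_t full = nbits / WORD_BITS;
	size_t tail = nbits % WORD_BITS;
	unsigned int count = 0;
	size_t i;

	for (i = 0; i < full; i++) {
		count += word_weight(map[i]);
		if (count > num)
			return true;	/* early exit, rest is not read */
	}

	/* Only the low 'tail' bits of the last word belong to the map. */
	if (tail)
		count += word_weight(map[full] & ((1UL << tail) - 1));

	return count > num;
}
```

With a mask whose first word already has two bits set, the sketch
answers "greater than 1" after reading a single word, no matter how
long the mask is.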

The first part of the series cleans up and reworks code that uses the
bitmap_weight() API incorrectly.

The second part converts cpumask_weight() to cpumask_empty() where the
number to compare with is 0. Ditto for bitmap_weight() and nodes_weight().
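
As a rough userspace sketch of why the conversion is safe and cheaper,
assuming hypothetical helpers sketch_weight() and sketch_empty() (these
are not the kernel functions): both answer "is any bit set?", but the
emptiness check can stop at the first non-zero word.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Full count: always visits every bit of the map. */
static unsigned int sketch_weight(const unsigned long *map, size_t nbits)
{
	unsigned int count = 0;
	size_t i;

	for (i = 0; i < nbits; i++)
		if (map[i / WORD_BITS] & (1UL << (i % WORD_BITS)))
			count++;
	return count;
}

/* Emptiness check: returns at the first non-zero word. */
static bool sketch_empty(const unsigned long *map, size_t nbits)
{
	size_t full = nbits / WORD_BITS;
	size_t tail = nbits % WORD_BITS;
	size_t i;

	for (i = 0; i < full; i++)
		if (map[i])
			return false;

	/* Bits above 'nbits' in the last word are not part of the map. */
	if (tail && (map[full] & ((1UL << tail) - 1)))
		return false;

	return true;
}
```

For any map, sketch_empty(map, n) gives the same answer as
sketch_weight(map, n) == 0.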

In the third part of the series, bitmap_weight_cmp() is added together
with bitmap_weight_{eq,gt,ge,lt,le} wrappers on top of it. Corresponding
wrappers for cpumask and nodemask are added as well.
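
The layering can be sketched in userspace roughly as follows. This is
an assumption-laden illustration, not the code from the series: the
sketch_* names, the signatures and the exact return convention are all
hypothetical. One three-way compare does the scanning, and each wrapper
only interprets its sign.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Count the set bits in one word (Kernighan's method). */
static unsigned int word_weight(unsigned long w)
{
	unsigned int n = 0;

	for (; w; w &= w - 1)
		n++;
	return n;
}

/*
 * Hypothetical three-way compare: negative, zero or positive if the
 * number of set bits among the first 'nbits' is less than, equal to or
 * greater than 'num'. Stops once the count exceeds 'num'.
 */
static int sketch_weight_cmp(const unsigned long *map, size_t nbits,
			     unsigned int num)
{
	size_t full = nbits / WORD_BITS;
	size_t tail = nbits % WORD_BITS;
	unsigned int count = 0;
	size_t i;

	for (i = 0; i < full; i++) {
		count += word_weight(map[i]);
		if (count > num)
			return 1;
	}
	if (tail)
		count += word_weight(map[full] & ((1UL << tail) - 1));

	return (count > num) - (count < num);
}

/* Thin wrappers: each one only looks at the sign of the compare. */
static bool sketch_weight_eq(const unsigned long *m, size_t n, unsigned int k)
{
	return sketch_weight_cmp(m, n, k) == 0;
}

static bool sketch_weight_gt(const unsigned long *m, size_t n, unsigned int k)
{
	return sketch_weight_cmp(m, n, k) > 0;
}

static bool sketch_weight_ge(const unsigned long *m, size_t n, unsigned int k)
{
	return sketch_weight_cmp(m, n, k) >= 0;
}

static bool sketch_weight_lt(const unsigned long *m, size_t n, unsigned int k)
{
	return sketch_weight_cmp(m, n, k) < 0;
}

static bool sketch_weight_le(const unsigned long *m, size_t n, unsigned int k)
{
	return sketch_weight_cmp(m, n, k) <= 0;
}
```

Keeping the scan in one place means the early-exit logic is written
once and shared by all five comparisons.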

Rough usage counts for the new functions, as counted by grep:

	{bitmap,cpumask,nodes}_weight_eq	26
	{bitmap,cpumask,nodes}_weight_ge	25
	{bitmap,cpumask,nodes}_weight_gt	19
	{bitmap,cpumask,nodes}_weight_le	18
	{bitmap,cpumask,nodes}_weight_lt	14

v1: https://lkml.org/lkml/2021/11/27/339
v2: https://lkml.org/lkml/2021/12/18/241
v3: https://lkml.org/lkml/2022/1/27/913
v4: 
 - rebase on next-20220209;
 - exclude patches that are already in next-20220209;
 - drop patches 41, 43, 47, 48 from v3 as they are not performance
   critical;
 - deeply rework iio_simple_dummy_trigger_h (patch #4) and
   qed_rdma_bmap_free (#10), instead of replacing bitmap_weight;
 - use more standard tags.

Yury Norov (49):
  net: dsa: don't use bitmap_weight() in b53_arl_read()
  net: systemport: don't use bitmap_weight() in bcm_sysport_rule_set()
  net: mellanox: fix open-coded for_each_set_bit()
  iio: fix opencoded for_each_set_bit()
  qed: rework qed_rdma_bmap_free()
  nds32: perf: replace bitmap_weight with bitmap_empty where appropriate
  KVM: x86: replace bitmap_weight with bitmap_empty where appropriate
  drm: replace bitmap_weight with bitmap_empty where appropriate
  ice: replace bitmap_weight with bitmap_empty for intel
  octeontx2-pf: replace bitmap_weight with bitmap_empty for Marvell
  qed: replace bitmap_weight with bitmap_empty in qed_roce_stop()
  perf/arm-cci: replace bitmap_weight with bitmap_empty where
    appropriate
  perf tools: replace bitmap_weight with bitmap_empty where appropriate
  arch/alpha: replace cpumask_weight with cpumask_empty where
    appropriate
  arch/ia64: replace cpumask_weight with cpumask_empty where appropriate
  arch/x86: replace cpumask_weight with cpumask_empty where appropriate
  cpufreq: replace cpumask_weight with cpumask_empty where appropriate
  gpu: drm: replace cpumask_weight with cpumask_empty where appropriate
  RDMA/hfi: replace cpumask_weight with cpumask_empty where appropriate
  irq: mips: replace cpumask_weight with cpumask_empty where appropriate
  genirq/affinity: replace cpumask_weight with cpumask_empty where
    appropriate
  sched: replace cpumask_weight with cpumask_empty where appropriate
  clocksource: replace cpumask_weight with cpumask_empty in
    clocksource.c
  mm/vmstat: replace cpumask_weight with cpumask_empty where appropriate
  arch/x86: replace nodes_weight with nodes_empty where appropriate
  bitmap: add bitmap_weight_{cmp, eq, gt, ge, lt, le} functions
  arch/x86: replace bitmap_weight with bitmap_weight_{eq,gt,ge,lt,le}
    where appropriate
  iio: replace bitmap_weight() with bitmap_weight_{eq,gt} where
    appropriate
  memstick: replace bitmap_weight with bitmap_weight_eq where
    appropriate
  ixgbe: replace bitmap_weight with bitmap_weight_eq for intel
  octeontx2-pf: replace bitmap_weight with bitmap_weight_{eq,gt} for
    OcteonTX2
  mlx4: replace bitmap_weight with bitmap_weight_{eq,gt,ge,lt,le} for
    mellanox
  perf: replace bitmap_weight with bitmap_weight_eq for ThunderX2
  media: tegra-video: replace bitmap_weight with bitmap_weight_le
  cpumask: add cpumask_weight_{eq,gt,ge,lt,le}
  arch/ia64: replace cpumask_weight with cpumask_weight_eq in mm/tlb.c
  arch/mips: replace cpumask_weight with cpumask_weight_{eq, ...} where
    appropriate
  arch/powerpc: replace cpumask_weight with cpumask_weight_{eq, ...}
    where appropriate
  arch/s390: replace cpumask_weight with cpumask_weight_eq where
    appropriate
  firmware: psci: replace cpumask_weight with cpumask_weight_eq
  RDMA/hfi1: replace cpumask_weight with cpumask_weight_{eq, ...} where
    appropriate
  scsi: lpfc: replace cpumask_weight with cpumask_weight_gt
  soc/qman: replace cpumask_weight with cpumask_weight_lt
  nodemask: add nodemask_weight_{eq,gt,ge,lt,le}
  ACPI: replace nodes_weight with nodes_weight_ge for numa
  mm/mempolicy: replace nodes_weight with nodes_weight_eq
  nodemask: add num_node_state_eq()
  tools: bitmap: sync bitmap_weight
  MAINTAINERS: add cpumask and nodemask files to BITMAP_API

 MAINTAINERS                                   |  4 +
 arch/alpha/kernel/process.c                   |  2 +-
 arch/ia64/kernel/setup.c                      |  2 +-
 arch/ia64/mm/tlb.c                            |  2 +-
 arch/mips/cavium-octeon/octeon-irq.c          |  4 +-
 arch/mips/kernel/crash.c                      |  2 +-
 arch/nds32/kernel/perf_event_cpu.c            |  2 +-
 arch/powerpc/kernel/smp.c                     |  2 +-
 arch/powerpc/kernel/watchdog.c                |  2 +-
 arch/powerpc/xmon/xmon.c                      |  4 +-
 arch/s390/kernel/perf_cpum_cf.c               |  2 +-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c        | 16 ++--
 arch/x86/kvm/hyperv.c                         |  8 +-
 arch/x86/mm/amdtopology.c                     |  2 +-
 arch/x86/mm/mmio-mod.c                        |  2 +-
 arch/x86/mm/numa_emulation.c                  |  4 +-
 arch/x86/platform/uv/uv_nmi.c                 |  2 +-
 drivers/acpi/numa/srat.c                      |  2 +-
 drivers/cpufreq/qcom-cpufreq-hw.c             |  2 +-
 drivers/cpufreq/scmi-cpufreq.c                |  2 +-
 drivers/firmware/psci/psci_checker.c          |  2 +-
 drivers/gpu/drm/i915/i915_pmu.c               |  2 +-
 drivers/gpu/drm/msm/disp/mdp5/mdp5_smp.c      |  2 +-
 drivers/iio/dummy/iio_simple_dummy_buffer.c   | 48 +++++-------
 drivers/iio/industrialio-trigger.c            |  2 +-
 drivers/infiniband/hw/hfi1/affinity.c         | 13 ++--
 drivers/infiniband/hw/qib/qib_file_ops.c      |  2 +-
 drivers/infiniband/hw/qib/qib_iba7322.c       |  2 +-
 drivers/irqchip/irq-bcm6345-l1.c              |  2 +-
 drivers/memstick/core/ms_block.c              |  4 +-
 drivers/net/dsa/b53/b53_common.c              |  6 +-
 drivers/net/ethernet/broadcom/bcmsysport.c    |  6 +-
 .../net/ethernet/intel/ice/ice_virtchnl_pf.c  |  4 +-
 .../net/ethernet/intel/ixgbe/ixgbe_sriov.c    |  2 +-
 .../marvell/octeontx2/nic/otx2_ethtool.c      |  2 +-
 .../marvell/octeontx2/nic/otx2_flows.c        |  8 +-
 .../ethernet/marvell/octeontx2/nic/otx2_pf.c  |  2 +-
 drivers/net/ethernet/mellanox/mlx4/cmd.c      | 33 +++-----
 drivers/net/ethernet/mellanox/mlx4/eq.c       |  4 +-
 drivers/net/ethernet/mellanox/mlx4/fw.c       |  4 +-
 drivers/net/ethernet/mellanox/mlx4/main.c     |  2 +-
 drivers/net/ethernet/qlogic/qed/qed_rdma.c    | 45 ++++-------
 drivers/net/ethernet/qlogic/qed/qed_roce.c    |  2 +-
 drivers/perf/arm-cci.c                        |  2 +-
 drivers/perf/arm_pmu.c                        |  4 +-
 drivers/perf/hisilicon/hisi_uncore_pmu.c      |  2 +-
 drivers/perf/thunderx2_pmu.c                  |  4 +-
 drivers/perf/xgene_pmu.c                      |  2 +-
 drivers/scsi/lpfc/lpfc_init.c                 |  2 +-
 drivers/soc/fsl/qbman/qman_test_stash.c       |  2 +-
 drivers/staging/media/tegra-video/vi.c        |  2 +-
 include/linux/bitmap.h                        | 78 +++++++++++++++++++
 include/linux/cpumask.h                       | 50 ++++++++++++
 include/linux/nodemask.h                      | 40 ++++++++++
 kernel/irq/affinity.c                         |  2 +-
 kernel/sched/core.c                           |  2 +-
 kernel/sched/topology.c                       |  2 +-
 kernel/time/clocksource.c                     |  2 +-
 lib/bitmap.c                                  | 21 +++++
 mm/mempolicy.c                                |  2 +-
 mm/page_alloc.c                               |  2 +-
 mm/vmstat.c                                   |  4 +-
 tools/include/linux/bitmap.h                  | 44 +++++++++++
 tools/lib/bitmap.c                            | 20 +++++
 tools/perf/builtin-c2c.c                      |  4 +-
 tools/perf/util/pmu.c                         |  2 +-
 66 files changed, 384 insertions(+), 178 deletions(-)

-- 
2.32.0



* [PATCH 01/49] net: dsa: don't use bitmap_weight() in b53_arl_read()
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
@ 2022-02-10 22:48 ` Yury Norov
  2022-02-10 22:48 ` [PATCH 02/49] net: systemport: don't use bitmap_weight() in bcm_sysport_rule_set() Yury Norov
                   ` (48 subsequent siblings)
  49 siblings, 0 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:48 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Florian Fainelli, Andrew Lunn, Vivien Didelot, Vladimir Oltean,
	David S. Miller, Jakub Kicinski, netdev

Don't call bitmap_weight() if the following code can get by
without it.
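
A small userspace sketch of the idea, with hypothetical names
(sketch_find_first_bit() and sketch_pick_free_bin() are stand-ins, not
the driver code). It assumes the search returns the bitmap size when no
bit is set, so a single scan both detects the "no free bin" case and
yields the index:

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical stand-in for find_first_bit(): index of the first set
 * bit, or 'nbits' if no bit is set.
 */
static size_t sketch_find_first_bit(const unsigned long *map, size_t nbits)
{
	size_t i;

	for (i = 0; i < nbits; i++)
		if (map[i / WORD_BITS] & (1UL << (i % WORD_BITS)))
			return i;
	return nbits;
}

/*
 * One scan instead of a weight pass followed by a search: an index at
 * or past 'nbins' means there is no free bin at all.
 */
static int sketch_pick_free_bin(const unsigned long *free_bins,
				size_t nbins, size_t *idx)
{
	*idx = sketch_find_first_bit(free_bins, nbins);
	return *idx >= nbins ? -ENOSPC : -ENOENT;
}
```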

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
---
 drivers/net/dsa/b53/b53_common.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index a3b98992f180..d99813bf3cdd 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1620,12 +1620,8 @@ static int b53_arl_read(struct b53_device *dev, u64 mac,
 		return 0;
 	}
 
-	if (bitmap_weight(free_bins, dev->num_arl_bins) == 0)
-		return -ENOSPC;
-
 	*idx = find_first_bit(free_bins, dev->num_arl_bins);
-
-	return -ENOENT;
+	return *idx >= dev->num_arl_bins ? -ENOSPC : -ENOENT;
 }
 
 static int b53_arl_op(struct b53_device *dev, int op, int port,
-- 
2.32.0



* [PATCH 02/49] net: systemport: don't use bitmap_weight() in bcm_sysport_rule_set()
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
  2022-02-10 22:48 ` [PATCH 01/49] net: dsa: don't use bitmap_weight() in b53_arl_read() Yury Norov
@ 2022-02-10 22:48 ` Yury Norov
  2022-02-10 22:48 ` [PATCH 03/49] net: mellanox: fix open-coded for_each_set_bit() Yury Norov
                   ` (47 subsequent siblings)
  49 siblings, 0 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:48 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Florian Fainelli, David S . Miller ,
	Jakub Kicinski, bcm-kernel-feedback-list, netdev

Don't call bitmap_weight() if the following code can get by
without it.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
---
 drivers/net/ethernet/broadcom/bcmsysport.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c b/drivers/net/ethernet/broadcom/bcmsysport.c
index 60dde29974bf..5284a5c961db 100644
--- a/drivers/net/ethernet/broadcom/bcmsysport.c
+++ b/drivers/net/ethernet/broadcom/bcmsysport.c
@@ -2180,13 +2180,9 @@ static int bcm_sysport_rule_set(struct bcm_sysport_priv *priv,
 	if (nfc->fs.ring_cookie != RX_CLS_FLOW_WAKE)
 		return -EOPNOTSUPP;
 
-	/* All filters are already in use, we cannot match more rules */
-	if (bitmap_weight(priv->filters, RXCHK_BRCM_TAG_MAX) ==
-	    RXCHK_BRCM_TAG_MAX)
-		return -ENOSPC;
-
 	index = find_first_zero_bit(priv->filters, RXCHK_BRCM_TAG_MAX);
 	if (index >= RXCHK_BRCM_TAG_MAX)
+		/* All filters are already in use, we cannot match more rules */
 		return -ENOSPC;
 
 	/* Location is the classification ID, and index is the position
-- 
2.32.0



* [PATCH 03/49] net: mellanox: fix open-coded for_each_set_bit()
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
  2022-02-10 22:48 ` [PATCH 01/49] net: dsa: don't use bitmap_weight() in b53_arl_read() Yury Norov
  2022-02-10 22:48 ` [PATCH 02/49] net: systemport: don't use bitmap_weight() in bcm_sysport_rule_set() Yury Norov
@ 2022-02-10 22:48 ` Yury Norov
  2022-02-11  9:01   ` David Laight
  2022-02-10 22:48 ` [PATCH 04/49] iio: fix opencoded for_each_set_bit() Yury Norov
                   ` (46 subsequent siblings)
  49 siblings, 1 reply; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:48 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Rafael J. Wysocki, Daniel Lezcano, Amit Kucheria, Zhang Rui,
	Sebastian Andrzej Siewior, Christophe JAILLET, Rikard Falkeborn,
	linux-pm
  Cc: Tariq Toukan

The Mellanox mlx4 driver open-codes for_each_set_bit(). Use the macro
instead.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx4/cmd.c | 23 ++++++-----------------
 1 file changed, 6 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/cmd.c b/drivers/net/ethernet/mellanox/mlx4/cmd.c
index e10b7b04b894..c56d2194cbfc 100644
--- a/drivers/net/ethernet/mellanox/mlx4/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx4/cmd.c
@@ -1994,21 +1994,16 @@ static void mlx4_allocate_port_vpps(struct mlx4_dev *dev, int port)
 
 static int mlx4_master_activate_admin_state(struct mlx4_priv *priv, int slave)
 {
-	int port, err;
+	int p, port, err;
 	struct mlx4_vport_state *vp_admin;
 	struct mlx4_vport_oper_state *vp_oper;
 	struct mlx4_slave_state *slave_state =
 		&priv->mfunc.master.slave_state[slave];
 	struct mlx4_active_ports actv_ports = mlx4_get_active_ports(
 			&priv->dev, slave);
-	int min_port = find_first_bit(actv_ports.ports,
-				      priv->dev.caps.num_ports) + 1;
-	int max_port = min_port - 1 +
-		bitmap_weight(actv_ports.ports, priv->dev.caps.num_ports);
 
-	for (port = min_port; port <= max_port; port++) {
-		if (!test_bit(port - 1, actv_ports.ports))
-			continue;
+	for_each_set_bit(p, actv_ports.ports, priv->dev.caps.num_ports) {
+		port = p + 1;
 		priv->mfunc.master.vf_oper[slave].smi_enabled[port] =
 			priv->mfunc.master.vf_admin[slave].enable_smi[port];
 		vp_oper = &priv->mfunc.master.vf_oper[slave].vport[port];
@@ -2063,19 +2058,13 @@ static int mlx4_master_activate_admin_state(struct mlx4_priv *priv, int slave)
 
 static void mlx4_master_deactivate_admin_state(struct mlx4_priv *priv, int slave)
 {
-	int port;
+	int p, port;
 	struct mlx4_vport_oper_state *vp_oper;
 	struct mlx4_active_ports actv_ports = mlx4_get_active_ports(
 			&priv->dev, slave);
-	int min_port = find_first_bit(actv_ports.ports,
-				      priv->dev.caps.num_ports) + 1;
-	int max_port = min_port - 1 +
-		bitmap_weight(actv_ports.ports, priv->dev.caps.num_ports);
 
-
-	for (port = min_port; port <= max_port; port++) {
-		if (!test_bit(port - 1, actv_ports.ports))
-			continue;
+	for_each_set_bit(p, actv_ports.ports, priv->dev.caps.num_ports) {
+		port = p + 1;
 		priv->mfunc.master.vf_oper[slave].smi_enabled[port] =
 			MLX4_VF_SMI_DISABLED;
 		vp_oper = &priv->mfunc.master.vf_oper[slave].vport[port];
-- 
2.32.0



* [PATCH 04/49] iio: fix opencoded for_each_set_bit()
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (2 preceding siblings ...)
  2022-02-10 22:48 ` [PATCH 03/49] net: mellanox: fix open-coded for_each_set_bit() Yury Norov
@ 2022-02-10 22:48 ` Yury Norov
  2022-02-11  8:45   ` Andy Shevchenko
  2022-02-11 17:17   ` Christophe JAILLET
  2022-02-10 22:48 ` [RFC PATCH 05/49] qed: rework qed_rdma_bmap_free() Yury Norov
                   ` (45 subsequent siblings)
  49 siblings, 2 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:48 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Jonathan Cameron, Lars-Peter Clausen, Alexandru Ardelean,
	Nathan Chancellor, linux-iio

iio_simple_dummy_trigger_h() is mostly an open-coded for_each_set_bit().
Using for_each_set_bit() makes the code much cleaner and more efficient.
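
The pattern can be sketched in userspace as below. All names are
hypothetical (sketch_next_bit(), sketch_for_each_set_bit() and
sketch_gather() are not the kernel or driver code): the iterator visits
only the set bits, so the bitmap is walked once and no separate weight
computation is needed for the loop bound.

```c
#include <limits.h>
#include <stddef.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Index of the first set bit at or after 'start', or 'nbits' if none. */
static size_t sketch_next_bit(const unsigned long *map, size_t nbits,
			      size_t start)
{
	size_t i;

	for (i = start; i < nbits; i++)
		if (map[i / WORD_BITS] & (1UL << (i % WORD_BITS)))
			return i;
	return nbits;
}

/* Hypothetical userspace analogue of for_each_set_bit(). */
#define sketch_for_each_set_bit(bit, map, nbits)			\
	for ((bit) = sketch_next_bit((map), (nbits), 0);		\
	     (bit) < (nbits);						\
	     (bit) = sketch_next_bit((map), (nbits), (bit) + 1))

/*
 * Pack table[j] for every enabled channel j into 'out' and return the
 * number of values written.
 */
static size_t sketch_gather(const unsigned long *mask, size_t nbits,
			    const unsigned short *table,
			    unsigned short *out)
{
	size_t i = 0, j;

	sketch_for_each_set_bit(j, mask, nbits)
		out[i++] = table[j];

	return i;
}
```

An empty mask simply produces zero iterations, which is why a separate
emptiness check in front of the loop is not needed.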

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/iio/dummy/iio_simple_dummy_buffer.c | 48 ++++++++-------------
 1 file changed, 19 insertions(+), 29 deletions(-)

diff --git a/drivers/iio/dummy/iio_simple_dummy_buffer.c b/drivers/iio/dummy/iio_simple_dummy_buffer.c
index d81c2b2dad82..3bc1b7529e2a 100644
--- a/drivers/iio/dummy/iio_simple_dummy_buffer.c
+++ b/drivers/iio/dummy/iio_simple_dummy_buffer.c
@@ -45,41 +45,31 @@ static irqreturn_t iio_simple_dummy_trigger_h(int irq, void *p)
 {
 	struct iio_poll_func *pf = p;
 	struct iio_dev *indio_dev = pf->indio_dev;
+	int i = 0, j;
 	u16 *data;
 
 	data = kmalloc(indio_dev->scan_bytes, GFP_KERNEL);
 	if (!data)
 		goto done;
 
-	if (!bitmap_empty(indio_dev->active_scan_mask, indio_dev->masklength)) {
-		/*
-		 * Three common options here:
-		 * hardware scans: certain combinations of channels make
-		 *   up a fast read.  The capture will consist of all of them.
-		 *   Hence we just call the grab data function and fill the
-		 *   buffer without processing.
-		 * software scans: can be considered to be random access
-		 *   so efficient reading is just a case of minimal bus
-		 *   transactions.
-		 * software culled hardware scans:
-		 *   occasionally a driver may process the nearest hardware
-		 *   scan to avoid storing elements that are not desired. This
-		 *   is the fiddliest option by far.
-		 * Here let's pretend we have random access. And the values are
-		 * in the constant table fakedata.
-		 */
-		int i, j;
-
-		for (i = 0, j = 0;
-		     i < bitmap_weight(indio_dev->active_scan_mask,
-				       indio_dev->masklength);
-		     i++, j++) {
-			j = find_next_bit(indio_dev->active_scan_mask,
-					  indio_dev->masklength, j);
-			/* random access read from the 'device' */
-			data[i] = fakedata[j];
-		}
-	}
+	/*
+	 * Three common options here:
+	 * hardware scans: certain combinations of channels make
+	 *   up a fast read.  The capture will consist of all of them.
+	 *   Hence we just call the grab data function and fill the
+	 *   buffer without processing.
+	 * software scans: can be considered to be random access
+	 *   so efficient reading is just a case of minimal bus
+	 *   transactions.
+	 * software culled hardware scans:
+	 *   occasionally a driver may process the nearest hardware
+	 *   scan to avoid storing elements that are not desired. This
+	 *   is the fiddliest option by far.
+	 * Here let's pretend we have random access. And the values are
+	 * in the constant table fakedata.
+	 */
+	for_each_set_bit(j, indio_dev->active_scan_mask, indio_dev->masklength)
+		data[i++] = fakedata[j];
 
 	iio_push_to_buffers_with_timestamp(indio_dev, data,
 					   iio_get_time_ns(indio_dev));
-- 
2.32.0



* [RFC PATCH 05/49] qed: rework qed_rdma_bmap_free()
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (3 preceding siblings ...)
  2022-02-10 22:48 ` [PATCH 04/49] iio: fix opencoded for_each_set_bit() Yury Norov
@ 2022-02-10 22:48 ` Yury Norov
  2022-02-11  8:48   ` Andy Shevchenko
  2022-02-10 22:48 ` [PATCH 06/49] nds32: perf: replace bitmap_weight with bitmap_empty where appropriate Yury Norov
                   ` (44 subsequent siblings)
  49 siblings, 1 reply; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:48 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Ariel Elior, Manish Chopra, David S. Miller, Jakub Kicinski,
	netdev

qed_rdma_bmap_free() is mostly an opencoded version of printk("%*pb").
Using %*pb format simplifies the code, and helps to avoid inefficient
usage of bitmap_weight().

While here, reorganize logic to avoid calculating bmap weight if check
is false.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---

This is an RFC because it changes the line printing format to the bitmap
%*pb format. If that hurts userspace, it's better to drop the patch.

 drivers/net/ethernet/qlogic/qed/qed_rdma.c | 45 +++++++---------------
 1 file changed, 14 insertions(+), 31 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_rdma.c b/drivers/net/ethernet/qlogic/qed/qed_rdma.c
index 23b668de4640..f4c04af9d4dd 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_rdma.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_rdma.c
@@ -319,44 +319,27 @@ static int qed_rdma_alloc(struct qed_hwfn *p_hwfn)
 void qed_rdma_bmap_free(struct qed_hwfn *p_hwfn,
 			struct qed_bmap *bmap, bool check)
 {
-	int weight = bitmap_weight(bmap->bitmap, bmap->max_count);
-	int last_line = bmap->max_count / (64 * 8);
-	int last_item = last_line * 8 +
-	    DIV_ROUND_UP(bmap->max_count % (64 * 8), 64);
-	u64 *pmap = (u64 *)bmap->bitmap;
-	int line, item, offset;
-	u8 str_last_line[200] = { 0 };
-
-	if (!weight || !check)
+	unsigned int bit, weight, nbits;
+	unsigned long *b;
+
+	if (!check)
+		goto end;
+
+	weight = bitmap_weight(bmap->bitmap, bmap->max_count);
+	if (!weight)
 		goto end;
 
 	DP_NOTICE(p_hwfn,
 		  "%s bitmap not free - size=%d, weight=%d, 512 bits per line\n",
 		  bmap->name, bmap->max_count, weight);
 
-	/* print aligned non-zero lines, if any */
-	for (item = 0, line = 0; line < last_line; line++, item += 8)
-		if (bitmap_weight((unsigned long *)&pmap[item], 64 * 8))
+	for (bit = 0; bit < bmap->max_count; bit += 512) {
+		b =  bmap->bitmap + BITS_TO_LONGS(bit);
+		nbits = min(bmap->max_count - bit, 512);
+
+		if (!bitmap_empty(b, nbits))
 			DP_NOTICE(p_hwfn,
-				  "line 0x%04x: 0x%016llx 0x%016llx 0x%016llx 0x%016llx 0x%016llx 0x%016llx 0x%016llx 0x%016llx\n",
-				  line,
-				  pmap[item],
-				  pmap[item + 1],
-				  pmap[item + 2],
-				  pmap[item + 3],
-				  pmap[item + 4],
-				  pmap[item + 5],
-				  pmap[item + 6], pmap[item + 7]);
-
-	/* print last unaligned non-zero line, if any */
-	if ((bmap->max_count % (64 * 8)) &&
-	    (bitmap_weight((unsigned long *)&pmap[item],
-			   bmap->max_count - item * 64))) {
-		offset = sprintf(str_last_line, "line 0x%04x: ", line);
-		for (; item < last_item; item++)
-			offset += sprintf(str_last_line + offset,
-					  "0x%016llx ", pmap[item]);
-		DP_NOTICE(p_hwfn, "%s\n", str_last_line);
+				  "line 0x%04x: %*pb\n", bit / 512, nbits, b);
 	}
 
 end:
-- 
2.32.0



* [PATCH 06/49] nds32: perf: replace bitmap_weight with bitmap_empty where appropriate
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (4 preceding siblings ...)
  2022-02-10 22:48 ` [RFC PATCH 05/49] qed: rework qed_rdma_bmap_free() Yury Norov
@ 2022-02-10 22:48 ` Yury Norov
  2022-02-10 22:48 ` [PATCH 07/49] KVM: x86: " Yury Norov
                   ` (43 subsequent siblings)
  49 siblings, 0 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:48 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Nick Hu,
	Greentime Hu, Vincent Chen, linux-perf-users

nds32_pmu_enable() calls bitmap_weight() to check if any bit of a given
bitmap is set. It's better to use bitmap_empty() in that case because
bitmap_empty() stops traversing the bitmap as soon as it finds the first
set bit, while bitmap_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 arch/nds32/kernel/perf_event_cpu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/nds32/kernel/perf_event_cpu.c b/arch/nds32/kernel/perf_event_cpu.c
index a78a879e7ef1..ea44e9ecb5c7 100644
--- a/arch/nds32/kernel/perf_event_cpu.c
+++ b/arch/nds32/kernel/perf_event_cpu.c
@@ -695,7 +695,7 @@ static void nds32_pmu_enable(struct pmu *pmu)
 {
 	struct nds32_pmu *nds32_pmu = to_nds32_pmu(pmu);
 	struct pmu_hw_events *hw_events = nds32_pmu->get_hw_events();
-	int enabled = bitmap_weight(hw_events->used_mask,
+	bool enabled = !bitmap_empty(hw_events->used_mask,
 				    nds32_pmu->num_events);
 
 	if (enabled)
-- 
2.32.0



* [PATCH 07/49] KVM: x86: replace bitmap_weight with bitmap_empty where appropriate
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (5 preceding siblings ...)
  2022-02-10 22:48 ` [PATCH 06/49] nds32: perf: replace bitmap_weight with bitmap_empty where appropriate Yury Norov
@ 2022-02-10 22:48 ` Yury Norov
  2022-02-11 16:34   ` Sean Christopherson
  2022-02-11 17:13   ` Christophe JAILLET
  2022-02-10 22:48 ` [PATCH 08/49] drm: " Yury Norov
                   ` (42 subsequent siblings)
  49 siblings, 2 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:48 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86, kvm

In some places the kvm/hyperv.c code calls bitmap_weight() to check if
any bit of a given bitmap is set. It's better to use bitmap_empty() in
that case because bitmap_empty() stops traversing the bitmap as soon as
it finds the first set bit, while bitmap_weight() counts all bits
unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/hyperv.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 6e38a7d22e97..06c2a5603123 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -90,7 +90,7 @@ static void synic_update_vector(struct kvm_vcpu_hv_synic *synic,
 {
 	struct kvm_vcpu *vcpu = hv_synic_to_vcpu(synic);
 	struct kvm_hv *hv = to_kvm_hv(vcpu->kvm);
-	int auto_eoi_old, auto_eoi_new;
+	bool auto_eoi_old, auto_eoi_new;
 
 	if (vector < HV_SYNIC_FIRST_VALID_VECTOR)
 		return;
@@ -100,16 +100,16 @@ static void synic_update_vector(struct kvm_vcpu_hv_synic *synic,
 	else
 		__clear_bit(vector, synic->vec_bitmap);
 
-	auto_eoi_old = bitmap_weight(synic->auto_eoi_bitmap, 256);
+	auto_eoi_old = !bitmap_empty(synic->auto_eoi_bitmap, 256);
 
 	if (synic_has_vector_auto_eoi(synic, vector))
 		__set_bit(vector, synic->auto_eoi_bitmap);
 	else
 		__clear_bit(vector, synic->auto_eoi_bitmap);
 
-	auto_eoi_new = bitmap_weight(synic->auto_eoi_bitmap, 256);
+	auto_eoi_new = !bitmap_empty(synic->auto_eoi_bitmap, 256);
 
-	if (!!auto_eoi_old == !!auto_eoi_new)
+	if (auto_eoi_old == auto_eoi_new)
 		return;
 
 	down_write(&vcpu->kvm->arch.apicv_update_lock);
-- 
2.32.0



* [PATCH 08/49] drm: replace bitmap_weight with bitmap_empty where appropriate
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (6 preceding siblings ...)
  2022-02-10 22:48 ` [PATCH 07/49] KVM: x86: " Yury Norov
@ 2022-02-10 22:48 ` Yury Norov
  2022-02-11  2:11   ` Dmitry Baryshkov
  2022-02-10 22:48 ` [PATCH 09/49] ice: replace bitmap_weight with bitmap_empty Yury Norov
                   ` (41 subsequent siblings)
  49 siblings, 1 reply; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:48 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Rob Clark, Sean Paul, Abhinav Kumar, David Airlie, Daniel Vetter,
	linux-arm-msm, dri-devel, freedreno

smp_request_block() in drivers/gpu/drm/msm/disp/mdp5/mdp5_smp.c calls
bitmap_weight() to check if any bit of a given bitmap is set. It's
better to use bitmap_empty() in that case because bitmap_empty() stops
traversing the bitmap as soon as it finds the first set bit, while
bitmap_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/gpu/drm/msm/disp/mdp5/mdp5_smp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_smp.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_smp.c
index d7fa2c49e741..56a3063545ec 100644
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_smp.c
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_smp.c
@@ -68,7 +68,7 @@ static int smp_request_block(struct mdp5_smp *smp,
 	uint8_t reserved;
 
 	/* we shouldn't be requesting blocks for an in-use client: */
-	WARN_ON(bitmap_weight(cs, cnt) > 0);
+	WARN_ON(!bitmap_empty(cs, cnt));
 
 	reserved = smp->reserved[cid];
 
-- 
2.32.0



* [PATCH 09/49] ice: replace bitmap_weight with bitmap_empty
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (7 preceding siblings ...)
  2022-02-10 22:48 ` [PATCH 08/49] drm: " Yury Norov
@ 2022-02-10 22:48 ` Yury Norov
  2022-02-10 22:48 ` [PATCH 10/49] octeontx2-pf: replace bitmap_weight with bitmap_empty where appropriate Yury Norov
                   ` (40 subsequent siblings)
  49 siblings, 0 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:48 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Jesse Brandeburg, Tony Nguyen, David S. Miller, Jakub Kicinski,
	intel-wired-lan, netdev

ice_vf_has_no_qs_ena() calls bitmap_weight() to check if any bit of a
given bitmap is set. It's better to use bitmap_empty() in that case
because bitmap_empty() stops traversing the bitmap as soon as it finds
the first set bit, while bitmap_weight() counts all bits
unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c b/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c
index 39b80124d282..9a86eeb6e3f2 100644
--- a/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c
@@ -267,8 +267,8 @@ ice_set_pfe_link(struct ice_vf *vf, struct virtchnl_pf_event *pfe,
  */
 static bool ice_vf_has_no_qs_ena(struct ice_vf *vf)
 {
-	return (!bitmap_weight(vf->rxq_ena, ICE_MAX_RSS_QS_PER_VF) &&
-		!bitmap_weight(vf->txq_ena, ICE_MAX_RSS_QS_PER_VF));
+	return bitmap_empty(vf->rxq_ena, ICE_MAX_RSS_QS_PER_VF) &&
+		bitmap_empty(vf->txq_ena, ICE_MAX_RSS_QS_PER_VF);
 }
 
 /**
-- 
2.32.0


* [PATCH 10/49] octeontx2-pf: replace bitmap_weight with bitmap_empty where appropriate
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (8 preceding siblings ...)
  2022-02-10 22:48 ` [PATCH 09/49] ice: replace bitmap_weight with bitmap_empty Yury Norov
@ 2022-02-10 22:48 ` Yury Norov
  2022-02-10 22:48 ` [PATCH 11/49] qed: replace bitmap_weight with bitmap_empty in qed_roce_stop() Yury Norov
                   ` (39 subsequent siblings)
  49 siblings, 0 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:48 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Sunil Goutham, Geetha sowjanya, Subbaraya Sundeep, hariprasad,
	David S. Miller, Jakub Kicinski, netdev

In some places, octeontx2 code calls bitmap_weight() to check if any bit of
a given bitmap is set. It's better to use bitmap_empty() in that case
because bitmap_empty() stops traversing the bitmap as soon as it finds
the first set bit, while bitmap_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c | 4 ++--
 drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c    | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c
index 77a13fb555fb..80b2d64b4136 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c
@@ -353,7 +353,7 @@ int otx2_add_macfilter(struct net_device *netdev, const u8 *mac)
 {
 	struct otx2_nic *pf = netdev_priv(netdev);
 
-	if (bitmap_weight(&pf->flow_cfg->dmacflt_bmap,
+	if (!bitmap_empty(&pf->flow_cfg->dmacflt_bmap,
 			  pf->flow_cfg->dmacflt_max_flows))
 		netdev_warn(netdev,
 			    "Add %pM to CGX/RPM DMAC filters list as well\n",
@@ -436,7 +436,7 @@ int otx2_get_maxflows(struct otx2_flow_config *flow_cfg)
 		return 0;
 
 	if (flow_cfg->nr_flows == flow_cfg->max_flows ||
-	    bitmap_weight(&flow_cfg->dmacflt_bmap,
+	    !bitmap_empty(&flow_cfg->dmacflt_bmap,
 			  flow_cfg->dmacflt_max_flows))
 		return flow_cfg->max_flows + flow_cfg->dmacflt_max_flows;
 	else
diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
index 86c1c2f77bd7..0f671df46694 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
@@ -1120,7 +1120,7 @@ static int otx2_cgx_config_loopback(struct otx2_nic *pf, bool enable)
 	struct msg_req *msg;
 	int err;
 
-	if (enable && bitmap_weight(&pf->flow_cfg->dmacflt_bmap,
+	if (enable && !bitmap_empty(&pf->flow_cfg->dmacflt_bmap,
 				    pf->flow_cfg->dmacflt_max_flows))
 		netdev_warn(pf->netdev,
 			    "CGX/RPM internal loopback might not work as DMAC filters are active\n");
-- 
2.32.0


* [PATCH 11/49] qed: replace bitmap_weight with bitmap_empty in qed_roce_stop()
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (9 preceding siblings ...)
  2022-02-10 22:48 ` [PATCH 10/49] octeontx2-pf: replace bitmap_weight with bitmap_empty where appropriate Yury Norov
@ 2022-02-10 22:48 ` Yury Norov
  2022-02-10 22:48 ` [PATCH 12/49] perf: replace bitmap_weight with bitmap_empty where appropriate Yury Norov
                   ` (38 subsequent siblings)
  49 siblings, 0 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:48 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Ariel Elior, Manish Chopra, David S. Miller, Jakub Kicinski,
	netdev

qed_roce_stop() calls bitmap_weight() to check if any bit of a given
bitmap is set. We can do it more efficiently with bitmap_empty() because
bitmap_empty() stops traversing the bitmap as soon as it finds the first
set bit, while bitmap_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/net/ethernet/qlogic/qed/qed_roce.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_roce.c b/drivers/net/ethernet/qlogic/qed/qed_roce.c
index 071b4aeaddf2..134ecfca96a3 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_roce.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_roce.c
@@ -76,7 +76,7 @@ void qed_roce_stop(struct qed_hwfn *p_hwfn)
 	 * We delay for a short while if an async destroy QP is still expected.
 	 * Beyond the added delay we clear the bitmap anyway.
 	 */
-	while (bitmap_weight(rcid_map->bitmap, rcid_map->max_count)) {
+	while (!bitmap_empty(rcid_map->bitmap, rcid_map->max_count)) {
 		/* If the HW device is during recovery, all resources are
 		 * immediately reset without receiving a per-cid indication
 		 * from HW. In this case we don't expect the cid bitmap to be
-- 
2.32.0


* [PATCH 12/49] perf: replace bitmap_weight with bitmap_empty where appropriate
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (10 preceding siblings ...)
  2022-02-10 22:48 ` [PATCH 11/49] qed: replace bitmap_weight with bitmap_empty in qed_roce_stop() Yury Norov
@ 2022-02-10 22:48 ` Yury Norov
  2022-02-11 10:25   ` Mark Rutland
  2022-02-11 17:27   ` Christophe JAILLET
  2022-02-10 22:48 ` [PATCH 13/49] perf tools: " Yury Norov
                   ` (37 subsequent siblings)
  49 siblings, 2 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:48 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Will Deacon, Mark Rutland, Shaokun Zhang, Qi Liu, Khuong Dinh,
	linux-arm-kernel

In some places, drivers/perf code calls bitmap_weight() to check if any
bit of a given bitmap is set. It's better to use bitmap_empty() in that
case because bitmap_empty() stops traversing the bitmap as soon as it
finds the first set bit, while bitmap_weight() counts all bits
unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
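Note for readers: the hunks below also change "int enabled" to "bool
enabled". A minimal userspace sketch of why that preserves behaviour,
assuming the value is only ever tested for zero as in the "if (!enabled)"
checks visible in the hunks. The sketch_* helpers are hypothetical
single-word stand-ins, not the kernel functions.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Keep only the low nbits of a single-word mask (hypothetical helper). */
static unsigned long sketch_valid(unsigned long mask, unsigned int nbits)
{
	assert(nbits <= sizeof(unsigned long) * CHAR_BIT);
	if (nbits == sizeof(unsigned long) * CHAR_BIT)
		return mask;
	return mask & ((1UL << nbits) - 1);
}

/* Stand-in for bitmap_weight() on one word. */
static int sketch_weight1(unsigned long mask, unsigned int nbits)
{
	int weight = 0;

	for (mask = sketch_valid(mask, nbits); mask; mask &= mask - 1)
		weight++;
	return weight;
}

/* Stand-in for bitmap_empty() on one word. */
static bool sketch_empty1(unsigned long mask, unsigned int nbits)
{
	return sketch_valid(mask, nbits) == 0;
}

/* Old pattern: a full count that is only used as a truth value. */
static bool sketch_pmu_active_old(unsigned long used_mask, unsigned int num)
{
	int enabled = sketch_weight1(used_mask, num);

	return enabled != 0;
}

/* New pattern: ask the yes/no question directly. */
static bool sketch_pmu_active_new(unsigned long used_mask, unsigned int num)
{
	bool enabled = !sketch_empty1(used_mask, num);

	return enabled;
}
```

Both variants ignore bits at or above the counter limit, so a mask of 0x10
counts as idle with 4 counters and as active with 5.
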
 drivers/perf/arm-cci.c                   | 2 +-
 drivers/perf/arm_pmu.c                   | 4 ++--
 drivers/perf/hisilicon/hisi_uncore_pmu.c | 2 +-
 drivers/perf/xgene_pmu.c                 | 2 +-
 4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/perf/arm-cci.c b/drivers/perf/arm-cci.c
index 54aca3a62814..96e09fa40909 100644
--- a/drivers/perf/arm-cci.c
+++ b/drivers/perf/arm-cci.c
@@ -1096,7 +1096,7 @@ static void cci_pmu_enable(struct pmu *pmu)
 {
 	struct cci_pmu *cci_pmu = to_cci_pmu(pmu);
 	struct cci_pmu_hw_events *hw_events = &cci_pmu->hw_events;
-	int enabled = bitmap_weight(hw_events->used_mask, cci_pmu->num_cntrs);
+	bool enabled = !bitmap_empty(hw_events->used_mask, cci_pmu->num_cntrs);
 	unsigned long flags;
 
 	if (!enabled)
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 295cc7952d0e..a31b302b0ade 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -524,7 +524,7 @@ static void armpmu_enable(struct pmu *pmu)
 {
 	struct arm_pmu *armpmu = to_arm_pmu(pmu);
 	struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events);
-	int enabled = bitmap_weight(hw_events->used_mask, armpmu->num_events);
+	bool enabled = !bitmap_empty(hw_events->used_mask, armpmu->num_events);
 
 	/* For task-bound events we may be called on other CPUs */
 	if (!cpumask_test_cpu(smp_processor_id(), &armpmu->supported_cpus))
@@ -785,7 +785,7 @@ static int cpu_pm_pmu_notify(struct notifier_block *b, unsigned long cmd,
 {
 	struct arm_pmu *armpmu = container_of(b, struct arm_pmu, cpu_pm_nb);
 	struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events);
-	int enabled = bitmap_weight(hw_events->used_mask, armpmu->num_events);
+	bool enabled = !bitmap_empty(hw_events->used_mask, armpmu->num_events);
 
 	if (!cpumask_test_cpu(smp_processor_id(), &armpmu->supported_cpus))
 		return NOTIFY_DONE;
diff --git a/drivers/perf/hisilicon/hisi_uncore_pmu.c b/drivers/perf/hisilicon/hisi_uncore_pmu.c
index a738aeab5c04..358e4e284a62 100644
--- a/drivers/perf/hisilicon/hisi_uncore_pmu.c
+++ b/drivers/perf/hisilicon/hisi_uncore_pmu.c
@@ -393,7 +393,7 @@ EXPORT_SYMBOL_GPL(hisi_uncore_pmu_read);
 void hisi_uncore_pmu_enable(struct pmu *pmu)
 {
 	struct hisi_pmu *hisi_pmu = to_hisi_pmu(pmu);
-	int enabled = bitmap_weight(hisi_pmu->pmu_events.used_mask,
+	bool enabled = !bitmap_empty(hisi_pmu->pmu_events.used_mask,
 				    hisi_pmu->num_counters);
 
 	if (!enabled)
diff --git a/drivers/perf/xgene_pmu.c b/drivers/perf/xgene_pmu.c
index 5283608dc055..0c32dffc7ede 100644
--- a/drivers/perf/xgene_pmu.c
+++ b/drivers/perf/xgene_pmu.c
@@ -867,7 +867,7 @@ static void xgene_perf_pmu_enable(struct pmu *pmu)
 {
 	struct xgene_pmu_dev *pmu_dev = to_pmu_dev(pmu);
 	struct xgene_pmu *xgene_pmu = pmu_dev->parent;
-	int enabled = bitmap_weight(pmu_dev->cntr_assign_mask,
+	bool enabled = !bitmap_empty(pmu_dev->cntr_assign_mask,
 			pmu_dev->max_counters);
 
 	if (!enabled)
-- 
2.32.0


* [PATCH 13/49] perf tools: replace bitmap_weight with bitmap_empty where appropriate
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (11 preceding siblings ...)
  2022-02-10 22:48 ` [PATCH 12/49] perf: replace bitmap_weight with bitmap_empty where appropriate Yury Norov
@ 2022-02-10 22:48 ` Yury Norov
  2022-02-10 22:48 ` [PATCH 14/49] arch/alpha: replace cpumask_weight with cpumask_empty " Yury Norov
                   ` (36 subsequent siblings)
  49 siblings, 0 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:48 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, linux-perf-users

Some code in builtin-c2c.c calls bitmap_weight() to check if any bit of
a given bitmap is set. It's better to use bitmap_empty() in that case
because bitmap_empty() stops traversing the bitmap as soon as it finds
the first set bit, while bitmap_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 tools/perf/builtin-c2c.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 77dd4afacca4..14f787c67140 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -1080,7 +1080,7 @@ node_entry(struct perf_hpp_fmt *fmt __maybe_unused, struct perf_hpp *hpp,
 		bitmap_zero(set, c2c.cpus_cnt);
 		bitmap_and(set, c2c_he->cpuset, c2c.nodes[node], c2c.cpus_cnt);
 
-		if (!bitmap_weight(set, c2c.cpus_cnt)) {
+		if (bitmap_empty(set, c2c.cpus_cnt)) {
 			if (c2c.node_info == 1) {
 				ret = scnprintf(hpp->buf, hpp->size, "%21s", " ");
 				advance_hpp(hpp, ret);
@@ -1944,7 +1944,7 @@ static int set_nodestr(struct c2c_hist_entry *c2c_he)
 	if (c2c_he->nodestr)
 		return 0;
 
-	if (bitmap_weight(c2c_he->nodeset, c2c.nodes_cnt)) {
+	if (!bitmap_empty(c2c_he->nodeset, c2c.nodes_cnt)) {
 		len = bitmap_scnprintf(c2c_he->nodeset, c2c.nodes_cnt,
 				      buf, sizeof(buf));
 	} else {
-- 
2.32.0


* [PATCH 14/49] arch/alpha: replace cpumask_weight with cpumask_empty where appropriate
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (12 preceding siblings ...)
  2022-02-10 22:48 ` [PATCH 13/49] perf tools: " Yury Norov
@ 2022-02-10 22:48 ` Yury Norov
  2022-02-10 22:48 ` [PATCH 15/49] arch/ia64: " Yury Norov
                   ` (35 subsequent siblings)
  49 siblings, 0 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:48 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Richard Henderson, Ivan Kokshaysky, Matt Turner,
	Geert Uytterhoeven, Davidlohr Bueso, Russell King (Oracle),
	Kees Cook, Zheng Yongjun, Jens Axboe, linux-alpha

common_shutdown_1() calls cpumask_weight() to check if any bit of a
given cpumask is set. We can do it more efficiently with cpumask_empty()
because cpumask_empty() stops traversing the cpumask as soon as it finds
the first set bit, while cpumask_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 arch/alpha/kernel/process.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/alpha/kernel/process.c b/arch/alpha/kernel/process.c
index 5f8527081da9..0d4bc60828bf 100644
--- a/arch/alpha/kernel/process.c
+++ b/arch/alpha/kernel/process.c
@@ -125,7 +125,7 @@ common_shutdown_1(void *generic_ptr)
 	/* Wait for the secondaries to halt. */
 	set_cpu_present(boot_cpuid, false);
 	set_cpu_possible(boot_cpuid, false);
-	while (cpumask_weight(cpu_present_mask))
+	while (!cpumask_empty(cpu_present_mask))
 		barrier();
 #endif
 
-- 
2.32.0


* [PATCH 15/49] arch/ia64: replace cpumask_weight with cpumask_empty where appropriate
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (13 preceding siblings ...)
  2022-02-10 22:48 ` [PATCH 14/49] arch/alpha: replace cpumask_weight with cpumask_empty " Yury Norov
@ 2022-02-10 22:48 ` Yury Norov
  2022-02-10 22:49 ` [PATCH 16/49] arch/x86: " Yury Norov
                   ` (34 subsequent siblings)
  49 siblings, 0 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:48 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Geert Uytterhoeven, Yang Guang, linux-ia64

setup_arch() calls cpumask_weight() to check if any bit of a given cpumask
is set. We can do it more efficiently with cpumask_empty() because
cpumask_empty() stops traversing the cpumask as soon as it finds the first
set bit, while cpumask_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
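Note for readers: in the hunk below only the zero test changes; the count
is still needed when the map is not empty. A minimal single-word sketch
of the two forms of the ternary, with hypothetical sketch_* names standing
in for the cpumask helpers and the fallback of 32 taken from the hunk.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_DEFAULT_CPUS 32U	/* the fallback value used in the hunk */

/* Stand-in for cpumask_weight() on a single-word mask. */
static unsigned int sketch_mask_weight(unsigned long mask)
{
	unsigned int weight = 0;

	for (; mask; mask &= mask - 1)	/* clear lowest set bit */
		weight++;
	return weight;
}

/* Stand-in for cpumask_empty() on a single-word mask. */
static bool sketch_mask_empty(unsigned long mask)
{
	return mask == 0;
}

/* Old form: the mask is counted twice when it is not empty. */
static unsigned int sketch_cpus_old(unsigned long possible)
{
	return sketch_mask_weight(possible) == 0 ?
		SKETCH_DEFAULT_CPUS : sketch_mask_weight(possible);
}

/* New form: cheap emptiness test first, one count afterwards. */
static unsigned int sketch_cpus_new(unsigned long possible)
{
	unsigned int n = sketch_mask_empty(possible) ?
		SKETCH_DEFAULT_CPUS : sketch_mask_weight(possible);

	assert(n > 0);	/* either the fallback or a non-empty count */
	return n;
}
```

The two forms return the same value for every mask; the new one just
avoids a full count to answer the zero test.
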
 arch/ia64/kernel/setup.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/ia64/kernel/setup.c b/arch/ia64/kernel/setup.c
index 5010348fa21b..fd6301eafa9d 100644
--- a/arch/ia64/kernel/setup.c
+++ b/arch/ia64/kernel/setup.c
@@ -572,7 +572,7 @@ setup_arch (char **cmdline_p)
 #ifdef CONFIG_ACPI_HOTPLUG_CPU
 	prefill_possible_map();
 #endif
-	per_cpu_scan_finalize((cpumask_weight(&early_cpu_possible_map) == 0 ?
+	per_cpu_scan_finalize((cpumask_empty(&early_cpu_possible_map) ?
 		32 : cpumask_weight(&early_cpu_possible_map)),
 		additional_cpus > 0 ? additional_cpus : 0);
 #endif /* CONFIG_ACPI_NUMA */
-- 
2.32.0


* [PATCH 16/49] arch/x86: replace cpumask_weight with cpumask_empty where appropriate
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (14 preceding siblings ...)
  2022-02-10 22:48 ` [PATCH 15/49] arch/ia64: " Yury Norov
@ 2022-02-10 22:49 ` Yury Norov
  2022-04-10 20:42   ` [tip: x86/cleanups] x86: Replace cpumask_weight() with cpumask_empty() " tip-bot2 for Yury Norov
  2022-02-10 22:49 ` [PATCH 17/49] cpufreq: replace cpumask_weight with cpumask_empty " Yury Norov
                   ` (33 subsequent siblings)
  49 siblings, 1 reply; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:49 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, Steven Rostedt,
	Karol Herbst, Pekka Paalanen, Andy Lutomirski, Steve Wahl,
	Mike Travis, Dimitri Sivanich, Russ Anderson, Darren Hart,
	Andy Shevchenko, x86, nouveau, platform-driver-x86

In some cases, arch/x86 code calls cpumask_weight() to check if any bit of
a given cpumask is set. We can do it more efficiently with cpumask_empty()
because cpumask_empty() stops traversing the cpumask as soon as it finds
the first set bit, while cpumask_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: Steve Wahl <steve.wahl@hpe.com>
---
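Note for readers: most rdtgroup hunks below follow an "andnot, then test
for non-empty" pattern. A minimal single-word sketch of what that pattern
computes; the sketch_* names are hypothetical and plain unsigned long
values stand in for struct cpumask.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for cpumask_andnot(): bits set in a but not in b. */
static unsigned long sketch_andnot(unsigned long a, unsigned long b)
{
	return a & ~b;
}

/* "Does newmask contain CPUs that the parent group does not own?" */
static bool sketch_outside_parent(unsigned long newmask, unsigned long parent)
{
	return sketch_andnot(newmask, parent) != 0;
}

/* CPUs that were in the group and are missing from newmask. */
static unsigned long sketch_dropped(unsigned long current_mask,
				    unsigned long newmask)
{
	unsigned long dropped = sketch_andnot(current_mask, newmask);

	assert((dropped & newmask) == 0);	/* dropped CPUs are not in newmask */
	return dropped;
}

/* CPUs that are in newmask and were not in the group before. */
static unsigned long sketch_added(unsigned long current_mask,
				  unsigned long newmask)
{
	return sketch_andnot(newmask, current_mask);
}
```

In each case the caller only needs to know whether the resulting mask has
any bit set before acting on it, which is the question cpumask_empty()
answers. The kernel's cpumask API also provides cpumask_subset(), but most
of these call sites go on to use tmpmask itself, so the andnot result is
needed anyway.
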
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 14 +++++++-------
 arch/x86/mm/mmio-mod.c                 |  2 +-
 arch/x86/platform/uv/uv_nmi.c          |  2 +-
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index b57b3db9a6a7..e23ff03290b8 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -341,14 +341,14 @@ static int cpus_mon_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask,
 
 	/* Check whether cpus belong to parent ctrl group */
 	cpumask_andnot(tmpmask, newmask, &prgrp->cpu_mask);
-	if (cpumask_weight(tmpmask)) {
+	if (!cpumask_empty(tmpmask)) {
 		rdt_last_cmd_puts("Can only add CPUs to mongroup that belong to parent\n");
 		return -EINVAL;
 	}
 
 	/* Check whether cpus are dropped from this group */
 	cpumask_andnot(tmpmask, &rdtgrp->cpu_mask, newmask);
-	if (cpumask_weight(tmpmask)) {
+	if (!cpumask_empty(tmpmask)) {
 		/* Give any dropped cpus to parent rdtgroup */
 		cpumask_or(&prgrp->cpu_mask, &prgrp->cpu_mask, tmpmask);
 		update_closid_rmid(tmpmask, prgrp);
@@ -359,7 +359,7 @@ static int cpus_mon_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask,
 	 * and update per-cpu rmid
 	 */
 	cpumask_andnot(tmpmask, newmask, &rdtgrp->cpu_mask);
-	if (cpumask_weight(tmpmask)) {
+	if (!cpumask_empty(tmpmask)) {
 		head = &prgrp->mon.crdtgrp_list;
 		list_for_each_entry(crgrp, head, mon.crdtgrp_list) {
 			if (crgrp == rdtgrp)
@@ -394,7 +394,7 @@ static int cpus_ctrl_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask,
 
 	/* Check whether cpus are dropped from this group */
 	cpumask_andnot(tmpmask, &rdtgrp->cpu_mask, newmask);
-	if (cpumask_weight(tmpmask)) {
+	if (!cpumask_empty(tmpmask)) {
 		/* Can't drop from default group */
 		if (rdtgrp == &rdtgroup_default) {
 			rdt_last_cmd_puts("Can't drop CPUs from default group\n");
@@ -413,12 +413,12 @@ static int cpus_ctrl_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask,
 	 * and update per-cpu closid/rmid.
 	 */
 	cpumask_andnot(tmpmask, newmask, &rdtgrp->cpu_mask);
-	if (cpumask_weight(tmpmask)) {
+	if (!cpumask_empty(tmpmask)) {
 		list_for_each_entry(r, &rdt_all_groups, rdtgroup_list) {
 			if (r == rdtgrp)
 				continue;
 			cpumask_and(tmpmask1, &r->cpu_mask, tmpmask);
-			if (cpumask_weight(tmpmask1))
+			if (!cpumask_empty(tmpmask1))
 				cpumask_rdtgrp_clear(r, tmpmask1);
 		}
 		update_closid_rmid(tmpmask, rdtgrp);
@@ -488,7 +488,7 @@ static ssize_t rdtgroup_cpus_write(struct kernfs_open_file *of,
 
 	/* check that user didn't specify any offline cpus */
 	cpumask_andnot(tmpmask, newmask, cpu_online_mask);
-	if (cpumask_weight(tmpmask)) {
+	if (!cpumask_empty(tmpmask)) {
 		ret = -EINVAL;
 		rdt_last_cmd_puts("Can only assign online CPUs\n");
 		goto unlock;
diff --git a/arch/x86/mm/mmio-mod.c b/arch/x86/mm/mmio-mod.c
index 933a2ebad471..c3317f0650d8 100644
--- a/arch/x86/mm/mmio-mod.c
+++ b/arch/x86/mm/mmio-mod.c
@@ -400,7 +400,7 @@ static void leave_uniprocessor(void)
 	int cpu;
 	int err;
 
-	if (!cpumask_available(downed_cpus) || cpumask_weight(downed_cpus) == 0)
+	if (!cpumask_available(downed_cpus) || cpumask_empty(downed_cpus))
 		return;
 	pr_notice("Re-enabling CPUs...\n");
 	for_each_cpu(cpu, downed_cpus) {
diff --git a/arch/x86/platform/uv/uv_nmi.c b/arch/x86/platform/uv/uv_nmi.c
index 1e9ff28bc2e0..ea277fc08357 100644
--- a/arch/x86/platform/uv/uv_nmi.c
+++ b/arch/x86/platform/uv/uv_nmi.c
@@ -985,7 +985,7 @@ static int uv_handle_nmi(unsigned int reason, struct pt_regs *regs)
 
 	/* Clear global flags */
 	if (master) {
-		if (cpumask_weight(uv_nmi_cpu_mask))
+		if (!cpumask_empty(uv_nmi_cpu_mask))
 			uv_nmi_cleanup_mask();
 		atomic_set(&uv_nmi_cpus_in_nmi, -1);
 		atomic_set(&uv_nmi_cpu, -1);
-- 
2.32.0


* [PATCH 17/49] cpufreq: replace cpumask_weight with cpumask_empty where appropriate
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (15 preceding siblings ...)
  2022-02-10 22:49 ` [PATCH 16/49] arch/x86: " Yury Norov
@ 2022-02-10 22:49 ` Yury Norov
  2022-02-11  4:30   ` Viresh Kumar
  2022-02-10 22:49 ` [PATCH 18/49] drm/i915/pmu: " Yury Norov
                   ` (32 subsequent siblings)
  49 siblings, 1 reply; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:49 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Andy Gross, Bjorn Andersson, Rafael J. Wysocki, Viresh Kumar,
	Sudeep Holla, Cristian Marussi, linux-arm-msm, linux-pm,
	linux-arm-kernel

Code in drivers/cpufreq calls cpumask_weight() to check if any bit of a
given cpumask is set. We can do it more efficiently with cpumask_empty()
because cpumask_empty() stops traversing the cpumask as soon as it finds
the first set bit, while cpumask_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> (for SCMI cpufreq driver)
---
 drivers/cpufreq/qcom-cpufreq-hw.c | 2 +-
 drivers/cpufreq/scmi-cpufreq.c    | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
index 05f3d7876e44..95a0c57ab5bb 100644
--- a/drivers/cpufreq/qcom-cpufreq-hw.c
+++ b/drivers/cpufreq/qcom-cpufreq-hw.c
@@ -482,7 +482,7 @@ static int qcom_cpufreq_hw_cpu_init(struct cpufreq_policy *policy)
 	}
 
 	qcom_get_related_cpus(index, policy->cpus);
-	if (!cpumask_weight(policy->cpus)) {
+	if (cpumask_empty(policy->cpus)) {
 		dev_err(dev, "Domain-%d failed to get related CPUs\n", index);
 		ret = -ENOENT;
 		goto error;
diff --git a/drivers/cpufreq/scmi-cpufreq.c b/drivers/cpufreq/scmi-cpufreq.c
index 1e0cd4d165f0..919fa6e3f462 100644
--- a/drivers/cpufreq/scmi-cpufreq.c
+++ b/drivers/cpufreq/scmi-cpufreq.c
@@ -154,7 +154,7 @@ static int scmi_cpufreq_init(struct cpufreq_policy *policy)
 	 * table and opp-shared.
 	 */
 	ret = dev_pm_opp_of_get_sharing_cpus(cpu_dev, priv->opp_shared_cpus);
-	if (ret || !cpumask_weight(priv->opp_shared_cpus)) {
+	if (ret || cpumask_empty(priv->opp_shared_cpus)) {
 		/*
 		 * Either opp-table is not set or no opp-shared was found.
 		 * Use the CPU mask from SCMI to designate CPUs sharing an OPP
-- 
2.32.0


* [PATCH 18/49] drm/i915/pmu: replace cpumask_weight with cpumask_empty where appropriate
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (16 preceding siblings ...)
  2022-02-10 22:49 ` [PATCH 17/49] cpufreq: replace cpumask_weight with cpumask_empty " Yury Norov
@ 2022-02-10 22:49 ` Yury Norov
  2022-02-10 22:49 ` [PATCH 19/49] RDMA/hfi: " Yury Norov
                   ` (31 subsequent siblings)
  49 siblings, 0 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:49 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Jani Nikula, Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin,
	David Airlie, Daniel Vetter, intel-gfx, dri-devel
  Cc: Tvrtko Ursulin

i915_pmu_cpu_online() calls cpumask_weight() to check if any bit of a
given cpumask is set. We can do it more efficiently with cpumask_empty()
because cpumask_empty() stops traversing the cpumask as soon as it finds
the first set bit, while cpumask_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_pmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index cfc21042499d..7299ed9937dd 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -1050,7 +1050,7 @@ static int i915_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
 	GEM_BUG_ON(!pmu->base.event_init);
 
 	/* Select the first online CPU as a designated reader. */
-	if (!cpumask_weight(&i915_pmu_cpumask))
+	if (cpumask_empty(&i915_pmu_cpumask))
 		cpumask_set_cpu(cpu, &i915_pmu_cpumask);
 
 	return 0;
-- 
2.32.0


* [PATCH 19/49] RDMA/hfi: replace cpumask_weight with cpumask_empty where appropriate
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (17 preceding siblings ...)
  2022-02-10 22:49 ` [PATCH 18/49] drm/i915/pmu: " Yury Norov
@ 2022-02-10 22:49 ` Yury Norov
  2022-02-11 19:10   ` Jason Gunthorpe
  2022-02-10 22:49 ` [PATCH 20/49] irq: mips: " Yury Norov
                   ` (30 subsequent siblings)
  49 siblings, 1 reply; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:49 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Mike Marciniszyn, Dennis Dalessandro, Jason Gunthorpe,
	linux-rdma
  Cc: Leon Romanovsky

drivers/infiniband/hw/hfi1/affinity.c code calls cpumask_weight() to check
if any bit of a given cpumask is set. We can do it more efficiently with
cpumask_empty() because cpumask_empty() stops traversing the cpumask as
soon as it finds the first set bit, while cpumask_weight() counts all bits
unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/infiniband/hw/hfi1/affinity.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/affinity.c b/drivers/infiniband/hw/hfi1/affinity.c
index 706b3b659713..877f8e84a672 100644
--- a/drivers/infiniband/hw/hfi1/affinity.c
+++ b/drivers/infiniband/hw/hfi1/affinity.c
@@ -666,7 +666,7 @@ int hfi1_dev_affinity_init(struct hfi1_devdata *dd)
 			 * engines, use the same CPU cores as general/control
 			 * context.
 			 */
-			if (cpumask_weight(&entry->def_intr.mask) == 0)
+			if (cpumask_empty(&entry->def_intr.mask))
 				cpumask_copy(&entry->def_intr.mask,
 					     &entry->general_intr_mask);
 		}
@@ -686,7 +686,7 @@ int hfi1_dev_affinity_init(struct hfi1_devdata *dd)
 		 * vectors, use the same CPU core as the general/control
 		 * context.
 		 */
-		if (cpumask_weight(&entry->comp_vect_mask) == 0)
+		if (cpumask_empty(&entry->comp_vect_mask))
 			cpumask_copy(&entry->comp_vect_mask,
 				     &entry->general_intr_mask);
 	}
-- 
2.32.0


* [PATCH 20/49] irq: mips: replace cpumask_weight with cpumask_empty where appropriate
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (18 preceding siblings ...)
  2022-02-10 22:49 ` [PATCH 19/49] RDMA/hfi: " Yury Norov
@ 2022-02-10 22:49 ` Yury Norov
  2022-04-10 20:34   ` [tip: irq/core] irqchip/bmips: Replace cpumask_weight() with cpumask_empty() tip-bot2 for Yury Norov
  2022-02-10 22:49 ` [PATCH 21/49] genirq/affinity: replace cpumask_weight with cpumask_empty where appropriate Yury Norov
                   ` (29 subsequent siblings)
  49 siblings, 1 reply; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:49 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Florian Fainelli, Thomas Gleixner, Marc Zyngier,
	bcm-kernel-feedback-list, linux-mips

bcm6345_l1_of_init() calls cpumask_weight() to check if any bit of a given
cpumask is set. We can do it more efficiently with cpumask_empty() because
cpumask_empty() stops traversing the cpumask as soon as it finds the first
set bit, while cpumask_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
---
 drivers/irqchip/irq-bcm6345-l1.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-bcm6345-l1.c b/drivers/irqchip/irq-bcm6345-l1.c
index fd079215c17f..142a7431745f 100644
--- a/drivers/irqchip/irq-bcm6345-l1.c
+++ b/drivers/irqchip/irq-bcm6345-l1.c
@@ -315,7 +315,7 @@ static int __init bcm6345_l1_of_init(struct device_node *dn,
 			cpumask_set_cpu(idx, &intc->cpumask);
 	}
 
-	if (!cpumask_weight(&intc->cpumask)) {
+	if (cpumask_empty(&intc->cpumask)) {
 		ret = -ENODEV;
 		goto out_free;
 	}
-- 
2.32.0


* [PATCH 21/49] genirq/affinity: replace cpumask_weight with cpumask_empty where appropriate
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (19 preceding siblings ...)
  2022-02-10 22:49 ` [PATCH 20/49] irq: mips: " Yury Norov
@ 2022-02-10 22:49 ` Yury Norov
  2022-04-10 20:27   ` [tip: irq/core] genirq/affinity: Replace cpumask_weight() with cpumask_empty() " tip-bot2 for Yury Norov
  2022-02-10 22:49 ` [PATCH 22/49] sched: replace cpumask_weight with cpumask_empty " Yury Norov
                   ` (28 subsequent siblings)
  49 siblings, 1 reply; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:49 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Thomas Gleixner

__irq_build_affinity_masks() calls cpumask_weight() to check if
any bit of a given cpumask is set. We can do it more efficiently with
cpumask_empty() because cpumask_empty() stops traversing the cpumask as
soon as it finds the first set bit, while cpumask_weight() counts all bits
unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 kernel/irq/affinity.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
index f7ff8919dc9b..18740faf0eb1 100644
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -258,7 +258,7 @@ static int __irq_build_affinity_masks(unsigned int startvec,
 	nodemask_t nodemsk = NODE_MASK_NONE;
 	struct node_vectors *node_vectors;
 
-	if (!cpumask_weight(cpu_mask))
+	if (cpumask_empty(cpu_mask))
 		return 0;
 
 	nodes = get_nodes_in_cpumask(node_to_cpumask, cpu_mask, &nodemsk);
-- 
2.32.0



* [PATCH 22/49] sched: replace cpumask_weight with cpumask_empty where appropriate
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (20 preceding siblings ...)
  2022-02-10 22:49 ` [PATCH 21/49] genirq/affinity: replace cpumask_weight with cpumask_empty where appropriate Yury Norov
@ 2022-02-10 22:49 ` Yury Norov
  2022-02-11 10:19   ` Peter Zijlstra
  2022-02-17 18:56   ` [tip: sched/core] " tip-bot2 for Yury Norov
  2022-02-10 22:49 ` [PATCH 23/49] clocksource: replace cpumask_weight with cpumask_empty in clocksource.c Yury Norov
                   ` (27 subsequent siblings)
  49 siblings, 2 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:49 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Ingo Molnar, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira

In some places, kernel/sched code calls cpumask_weight() to check if
any bit of a given cpumask is set. We can do it more efficiently with
cpumask_empty() because cpumask_empty() stops traversing the cpumask as
soon as it finds the first set bit, while cpumask_weight() counts all bits
unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 kernel/sched/core.c     | 2 +-
 kernel/sched/topology.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 28d1b7af03dc..ed7b392945b7 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8711,7 +8711,7 @@ int cpuset_cpumask_can_shrink(const struct cpumask *cur,
 {
 	int ret = 1;
 
-	if (!cpumask_weight(cur))
+	if (cpumask_empty(cur))
 		return ret;
 
 	ret = dl_cpuset_cpumask_can_shrink(cur, trial);
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index d201a7052a29..8478e2a8cd65 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -74,7 +74,7 @@ static int sched_domain_debug_one(struct sched_domain *sd, int cpu, int level,
 			break;
 		}
 
-		if (!cpumask_weight(sched_group_span(group))) {
+		if (cpumask_empty(sched_group_span(group))) {
 			printk(KERN_CONT "\n");
 			printk(KERN_ERR "ERROR: empty group\n");
 			break;
-- 
2.32.0



* [PATCH 23/49] clocksource: replace cpumask_weight with cpumask_empty in clocksource.c
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (21 preceding siblings ...)
  2022-02-10 22:49 ` [PATCH 22/49] sched: replace cpumask_weight with cpumask_empty " Yury Norov
@ 2022-02-10 22:49 ` Yury Norov
  2022-04-10 20:35   ` [tip: timers/core] clocksource: Replace cpumask_weight() with cpumask_empty() tip-bot2 for Yury Norov
  2022-02-10 22:49 ` [PATCH 24/49] mm/vmstat: replace cpumask_weight with cpumask_empty where appropriate Yury Norov
                   ` (26 subsequent siblings)
  49 siblings, 1 reply; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:49 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	John Stultz, Thomas Gleixner, Stephen Boyd

clocksource_verify_percpu() calls cpumask_weight() to check if any bit of
a given cpumask is set. We can do it more efficiently with cpumask_empty()
because cpumask_empty() stops traversing the cpumask as soon as it finds
the first set bit, while cpumask_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 kernel/time/clocksource.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 95d7ca35bdf2..cee5da1e54c4 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -343,7 +343,7 @@ void clocksource_verify_percpu(struct clocksource *cs)
 	cpus_read_lock();
 	preempt_disable();
 	clocksource_verify_choose_cpus();
-	if (cpumask_weight(&cpus_chosen) == 0) {
+	if (cpumask_empty(&cpus_chosen)) {
 		preempt_enable();
 		cpus_read_unlock();
 		pr_warn("Not enough CPUs to check clocksource '%s'.\n", cs->name);
-- 
2.32.0



* [PATCH 24/49] mm/vmstat: replace cpumask_weight with cpumask_empty where appropriate
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (22 preceding siblings ...)
  2022-02-10 22:49 ` [PATCH 23/49] clocksource: replace cpumask_weight with cpumask_empty in clocksource.c Yury Norov
@ 2022-02-10 22:49 ` Yury Norov
  2022-02-11 10:39   ` Mike Rapoport
  2022-02-10 22:49 ` [PATCH 25/49] arch/x86: replace nodes_weight with nodes_empty " Yury Norov
                   ` (25 subsequent siblings)
  49 siblings, 1 reply; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:49 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	linux-mm

Code in mm/vmstat.c calls cpumask_weight() to check if any bit of a given
cpumask is set. We can do it more efficiently with cpumask_empty() because
cpumask_empty() stops traversing the cpumask as soon as it finds the first
set bit, while cpumask_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 mm/vmstat.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/vmstat.c b/mm/vmstat.c
index d5cc8d739fac..27a94afd4ee5 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -2041,7 +2041,7 @@ static void __init init_cpu_node_state(void)
 	int node;
 
 	for_each_online_node(node) {
-		if (cpumask_weight(cpumask_of_node(node)) > 0)
+		if (!cpumask_empty(cpumask_of_node(node)))
 			node_set_state(node, N_CPU);
 	}
 }
@@ -2068,7 +2068,7 @@ static int vmstat_cpu_dead(unsigned int cpu)
 
 	refresh_zone_stat_thresholds();
 	node_cpus = cpumask_of_node(node);
-	if (cpumask_weight(node_cpus) > 0)
+	if (!cpumask_empty(node_cpus))
 		return 0;
 
 	node_clear_state(node, N_CPU);
-- 
2.32.0



* [PATCH 25/49] arch/x86: replace nodes_weight with nodes_empty where appropriate
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (23 preceding siblings ...)
  2022-02-10 22:49 ` [PATCH 24/49] mm/vmstat: replace cpumask_weight with cpumask_empty where appropriate Yury Norov
@ 2022-02-10 22:49 ` Yury Norov
  2022-04-10 20:42   ` [tip: x86/cleanups] x86/mm: Replace nodes_weight() with nodes_empty() " tip-bot2 for Yury Norov
  2022-02-10 22:49 ` [PATCH 26/49] bitmap: add bitmap_weight_{cmp, eq, gt, ge, lt, le} functions Yury Norov
                   ` (24 subsequent siblings)
  49 siblings, 1 reply; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:49 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Dave Hansen, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, x86

x86 mm code calls nodes_weight() to check if any bit of a given nodemask is
set. We can do it more efficiently with nodes_empty() because nodes_empty()
stops traversing the nodemask as soon as it finds the first set bit, while
nodes_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 arch/x86/mm/amdtopology.c    | 2 +-
 arch/x86/mm/numa_emulation.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/mm/amdtopology.c b/arch/x86/mm/amdtopology.c
index 058b2f36b3a6..b3ca7d23e4b0 100644
--- a/arch/x86/mm/amdtopology.c
+++ b/arch/x86/mm/amdtopology.c
@@ -154,7 +154,7 @@ int __init amd_numa_init(void)
 		node_set(nodeid, numa_nodes_parsed);
 	}
 
-	if (!nodes_weight(numa_nodes_parsed))
+	if (nodes_empty(numa_nodes_parsed))
 		return -ENOENT;
 
 	/*
diff --git a/arch/x86/mm/numa_emulation.c b/arch/x86/mm/numa_emulation.c
index 1a02b791d273..9a9305367fdd 100644
--- a/arch/x86/mm/numa_emulation.c
+++ b/arch/x86/mm/numa_emulation.c
@@ -123,7 +123,7 @@ static int __init split_nodes_interleave(struct numa_meminfo *ei,
 	 * Continue to fill physical nodes with fake nodes until there is no
 	 * memory left on any of them.
 	 */
-	while (nodes_weight(physnode_mask)) {
+	while (!nodes_empty(physnode_mask)) {
 		for_each_node_mask(i, physnode_mask) {
 			u64 dma32_end = PFN_PHYS(MAX_DMA32_PFN);
 			u64 start, limit, end;
@@ -270,7 +270,7 @@ static int __init split_nodes_size_interleave_uniform(struct numa_meminfo *ei,
 	 * Fill physical nodes with fake nodes of size until there is no memory
 	 * left on any of them.
 	 */
-	while (nodes_weight(physnode_mask)) {
+	while (!nodes_empty(physnode_mask)) {
 		for_each_node_mask(i, physnode_mask) {
 			u64 dma32_end = PFN_PHYS(MAX_DMA32_PFN);
 			u64 start, limit, end;
-- 
2.32.0



* [PATCH 26/49] bitmap: add bitmap_weight_{cmp, eq, gt, ge, lt, le} functions
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (24 preceding siblings ...)
  2022-02-10 22:49 ` [PATCH 25/49] arch/x86: replace nodes_weight with nodes_empty " Yury Norov
@ 2022-02-10 22:49 ` Yury Norov
  2022-02-10 22:49 ` [PATCH 27/49] arch/x86: replace bitmap_weight with bitmap_weight_{eq,gt,ge,lt,le} where appropriate Yury Norov
                   ` (23 subsequent siblings)
  49 siblings, 0 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:49 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel

Many kernel users use bitmap_weight() to compare the result against
some number or expression:

	if (bitmap_weight(...) > 1)
		do_something();

It works OK, but may be significantly improved for large bitmaps: if the
set bits counted in the first few words already exceed the given number,
we can stop counting and return immediately.

The same idea works in the other direction: if the number of set bits
counted so far is so small that the total would stay below the required
number even if every remaining bit of the bitmap were set, we can also
stop counting early.

This patch adds new bitmap_weight_cmp() as suggested by Michał Mirosław
and a family of eq, gt, ge, lt, and le wrappers to allow this optimization.
The following patches apply new functions where appropriate.

Suggested-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> (for bitmap_weight_cmp)
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com>
---
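A user-space sketch of the two early exits described above, for
illustration only. model_weight_cmp() is a hypothetical model that works
on whole words and assumes 0 <= num <= nwords * WORD_BITS; the visited
out-parameter is not part of the proposed API and exists only to show how
many words were read.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define WORD_BITS ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* Count set bits in one word by clearing the lowest set bit each step. */
static unsigned int popcount_word(unsigned long v)
{
	unsigned int n = 0;

	for (; v; v &= v - 1)
		n++;
	return n;
}

/*
 * Compare the weight of nwords whole words with num.
 * Returns 0 if equal, negative if the weight is (or must be) smaller,
 * positive if it is already greater. *visited is the number of words read.
 */
static int model_weight_cmp(const unsigned long *map, unsigned int nwords,
			    unsigned int num, unsigned int *visited)
{
	unsigned int k, w = 0;

	for (k = 0; k < nwords; k++) {
		/* Even if every remaining bit were set, num is out of reach. */
		if (w + (nwords - k) * WORD_BITS < num)
			break;

		w += popcount_word(map[k]);

		/* Already above num; later words cannot lower the count. */
		if (w > num) {
			k++;	/* word k was read before stopping */
			break;
		}
	}

	*visited = k;
	return (int)w - (int)num;
}
```

With only bit 0 set in a four-word map, comparing against 1 reads all
four words, while comparing against 0 stops after the first word. This is
why a "less than num" test is better phrased as a comparison with num - 1.
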
 include/linux/bitmap.h | 78 ++++++++++++++++++++++++++++++++++++++++++
 lib/bitmap.c           | 21 ++++++++++++
 2 files changed, 99 insertions(+)

diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 7dba0847510c..a89b626d0fbe 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -51,6 +51,12 @@ struct device;
  *  bitmap_empty(src, nbits)                    Are all bits zero in *src?
  *  bitmap_full(src, nbits)                     Are all bits set in *src?
  *  bitmap_weight(src, nbits)                   Hamming Weight: number set bits
+ *  bitmap_weight_cmp(src, nbits, num)          compare Hamming Weight with a number
+ *  bitmap_weight_eq(src, nbits, num)           Hamming Weight == num
+ *  bitmap_weight_gt(src, nbits, num)           Hamming Weight >  num
+ *  bitmap_weight_ge(src, nbits, num)           Hamming Weight >= num
+ *  bitmap_weight_lt(src, nbits, num)           Hamming Weight <  num
+ *  bitmap_weight_le(src, nbits, num)           Hamming Weight <= num
  *  bitmap_set(dst, pos, nbits)                 Set specified bit area
  *  bitmap_clear(dst, pos, nbits)               Clear specified bit area
  *  bitmap_find_next_zero_area(buf, len, pos, n, mask)  Find bit free area
@@ -162,6 +168,7 @@ int __bitmap_intersects(const unsigned long *bitmap1,
 int __bitmap_subset(const unsigned long *bitmap1,
 		    const unsigned long *bitmap2, unsigned int nbits);
 int __bitmap_weight(const unsigned long *bitmap, unsigned int nbits);
+int __bitmap_weight_cmp(const unsigned long *bitmap, unsigned int bits, int num);
 void __bitmap_set(unsigned long *map, unsigned int start, int len);
 void __bitmap_clear(unsigned long *map, unsigned int start, int len);
 
@@ -403,6 +410,77 @@ static __always_inline int bitmap_weight(const unsigned long *src, unsigned int
 	return __bitmap_weight(src, nbits);
 }
 
+/**
+ * bitmap_weight_cmp - compares number of set bits in @src with @num.
+ * @src:   source bitmap
+ * @nbits: length of bitmap in bits
+ * @num:   number to compare with
+ *
+ * Unlike bitmap_weight(), this function doesn't necessarily traverse
+ * the full bitmap and may return early.
+ *
+ * Because the number of set bits cannot decrease while counting, when
+ * the caller wants to know if the number of set bits in the bitmap is
+ * less than @num, calling
+ *	bitmap_weight_cmp(..., @num) < 0
+ * is potentially less efficient than
+ *	bitmap_weight_cmp(..., @num - 1) <= 0
+ *
+ * Consider an example:
+ * bitmap_weight_cmp(1000 0000 0000 0000, 1) < 0
+ *				    ^
+ *				    stop here
+ *
+ * bitmap_weight_cmp(1000 0000 0000 0000, 0) <= 0
+ *		     ^
+ *		     stop here
+ *
+ * Returns: zero if weight of @src is equal to @num;
+ *	   negative number if weight of @src is less than @num;
+ *	   positive number if weight of @src is greater than @num.
+ */
+static __always_inline
+int bitmap_weight_cmp(const unsigned long *src, unsigned int nbits, int num)
+{
+	if ((unsigned int)num > nbits)
+		return -num;
+
+	if (small_const_nbits(nbits))
+		return hweight_long(*src & BITMAP_LAST_WORD_MASK(nbits)) - num;
+
+	return __bitmap_weight_cmp(src, nbits, num);
+}
+
+static __always_inline
+bool bitmap_weight_eq(const unsigned long *src, unsigned int nbits, int num)
+{
+	return bitmap_weight_cmp(src, nbits, num) == 0;
+}
+
+static __always_inline
+bool bitmap_weight_gt(const unsigned long *src, unsigned int nbits, int num)
+{
+	return bitmap_weight_cmp(src, nbits, num) > 0;
+}
+
+static __always_inline
+bool bitmap_weight_ge(const unsigned long *src, unsigned int nbits, int num)
+{
+	return bitmap_weight_cmp(src, nbits, num - 1) > 0;
+}
+
+static __always_inline
+bool bitmap_weight_lt(const unsigned long *src, unsigned int nbits, int num)
+{
+	return bitmap_weight_cmp(src, nbits, num - 1) <= 0;
+}
+
+static __always_inline
+bool bitmap_weight_le(const unsigned long *src, unsigned int nbits, int num)
+{
+	return bitmap_weight_cmp(src, nbits, num) <= 0;
+}
+
 static __always_inline void bitmap_set(unsigned long *map, unsigned int start,
 		unsigned int nbits)
 {
diff --git a/lib/bitmap.c b/lib/bitmap.c
index 926408883456..fb84ca70c5d9 100644
--- a/lib/bitmap.c
+++ b/lib/bitmap.c
@@ -348,6 +348,27 @@ int __bitmap_weight(const unsigned long *bitmap, unsigned int bits)
 }
 EXPORT_SYMBOL(__bitmap_weight);
 
+int __bitmap_weight_cmp(const unsigned long *bitmap, unsigned int bits, int num)
+{
+	unsigned int k, w, lim = bits / BITS_PER_LONG;
+
+	for (k = 0, w = 0; k < lim; k++) {
+		if (w + bits - k * BITS_PER_LONG < num)
+			goto out;
+
+		w += hweight_long(bitmap[k]);
+
+		if (w > num)
+			goto out;
+	}
+
+	if (bits % BITS_PER_LONG)
+		w += hweight_long(bitmap[k] & BITMAP_LAST_WORD_MASK(bits));
+out:
+	return w - num;
+}
+EXPORT_SYMBOL(__bitmap_weight_cmp);
+
 void __bitmap_set(unsigned long *map, unsigned int start, int len)
 {
 	unsigned long *p = map + BIT_WORD(start);
-- 
2.32.0



* [PATCH 27/49] arch/x86: replace bitmap_weight with bitmap_weight_{eq,gt,ge,lt,le} where appropriate
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (25 preceding siblings ...)
  2022-02-10 22:49 ` [PATCH 26/49] bitmap: add bitmap_weight_{cmp, eq, gt, ge, lt, le} functions Yury Norov
@ 2022-02-10 22:49 ` Yury Norov
  2022-02-10 22:49 ` [PATCH 28/49] iio: replace bitmap_weight() with bitmap_weight_{eq,gt} " Yury Norov
                   ` (22 subsequent siblings)
  49 siblings, 0 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:49 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86

__init_one_rdt_domain() in rdtgroup.c calls bitmap_weight() to compare
the weight of a bitmap with a given number. We can do it more efficiently
with bitmap_weight_lt() because the conditional variants of bitmap_weight()
may stop traversing the bitmap early, as soon as the condition is met or
can no longer be met.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index e23ff03290b8..9d42e592c1cf 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2752,7 +2752,7 @@ static int __init_one_rdt_domain(struct rdt_domain *d, struct resctrl_schema *s,
 	 * bitmap_weight() does not access out-of-bound memory.
 	 */
 	tmp_cbm = cfg->new_ctrl;
-	if (bitmap_weight(&tmp_cbm, r->cache.cbm_len) < r->cache.min_cbm_bits) {
+	if (bitmap_weight_lt(&tmp_cbm, r->cache.cbm_len, r->cache.min_cbm_bits)) {
 		rdt_last_cmd_printf("No space on %s:%d\n", s->name, d->id);
 		return -ENOSPC;
 	}
-- 
2.32.0



* [PATCH 28/49] iio: replace bitmap_weight() with bitmap_weight_{eq,gt} where appropriate
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (26 preceding siblings ...)
  2022-02-10 22:49 ` [PATCH 27/49] arch/x86: replace bitmap_weight with bitmap_weight_{eq,gt,ge,lt,le} where appropriate Yury Norov
@ 2022-02-10 22:49 ` Yury Norov
  2022-02-10 22:49 ` [PATCH 29/49] memstick: replace bitmap_weight with bitmap_weight_eq " Yury Norov
                   ` (21 subsequent siblings)
  49 siblings, 0 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:49 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Jesse Brandeburg, Tony Nguyen, David S. Miller, Jakub Kicinski,
	intel-wired-lan, netdev

drivers/iio calls bitmap_weight() to compare the weight of a bitmap with
a given number. We can do it more efficiently with bitmap_weight_eq()
because the conditional variants of bitmap_weight() may stop traversing
the bitmap early, as soon as the condition is met or can no longer be met.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/iio/industrialio-trigger.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iio/industrialio-trigger.c b/drivers/iio/industrialio-trigger.c
index f504ed351b3e..98c54022fecf 100644
--- a/drivers/iio/industrialio-trigger.c
+++ b/drivers/iio/industrialio-trigger.c
@@ -331,7 +331,7 @@ int iio_trigger_detach_poll_func(struct iio_trigger *trig,
 {
 	struct iio_dev_opaque *iio_dev_opaque = to_iio_dev_opaque(pf->indio_dev);
 	bool no_other_users =
-		bitmap_weight(trig->pool, CONFIG_IIO_CONSUMERS_PER_TRIGGER) == 1;
+		bitmap_weight_eq(trig->pool, CONFIG_IIO_CONSUMERS_PER_TRIGGER, 1);
 	int ret = 0;
 
 	if (trig->ops && trig->ops->set_trigger_state && no_other_users) {
-- 
2.32.0



* [PATCH 29/49] memstick: replace bitmap_weight with bitmap_weight_eq where appropriate
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (27 preceding siblings ...)
  2022-02-10 22:49 ` [PATCH 28/49] iio: replace bitmap_weight() with bitmap_weight_{eq,gt} " Yury Norov
@ 2022-02-10 22:49 ` Yury Norov
  2022-02-17 15:39   ` Ulf Hansson
  2022-02-10 22:49 ` [PATCH 30/49] ixgbe: replace bitmap_weight with bitmap_weight_eq Yury Norov
                   ` (20 subsequent siblings)
  49 siblings, 1 reply; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:49 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Maxim Levitsky, Alex Dubov, Ulf Hansson, Jens Axboe,
	Luis Chamberlain, Colin Ian King, Arnd Bergmann,
	Shubhankar Kuranagatti, linux-mmc
  Cc: Shubhankar Kuranagatti

msb_validate_used_block_bitmap() calls bitmap_weight() to compare the
weight of a bitmap with a given number. We can do it more efficiently with
bitmap_weight_eq() because the conditional variants of bitmap_weight() may
stop traversing the bitmap early, as soon as the condition is met or can
no longer be met.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Shubhankar Kuranagatti <shubhankar.vk@gmail.com>
---
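The change below moves the weight to one side of the comparison so that
the other side becomes a plain number that can be passed to
bitmap_weight_eq(). A minimal sketch of that rearrangement, using plain
integers in place of the bitmap; check_old() and check_new() are
hypothetical names, not functions from the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Original form: free blocks derived from the weight of the used bitmap. */
static bool check_old(int block_count, int used_weight, int total_free)
{
	return block_count - used_weight == total_free;
}

/* Rewritten form: the weight is compared with a precomputed target. */
static bool check_new(int block_count, int used_weight, int total_free)
{
	return used_weight == block_count - total_free;
}
```

The two predicates agree for any inputs that do not overflow int, which
is the property the conversion relies on.
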
 drivers/memstick/core/ms_block.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/memstick/core/ms_block.c b/drivers/memstick/core/ms_block.c
index 0cda6c6baefc..5cdd987e78f7 100644
--- a/drivers/memstick/core/ms_block.c
+++ b/drivers/memstick/core/ms_block.c
@@ -155,8 +155,8 @@ static int msb_validate_used_block_bitmap(struct msb_data *msb)
 	for (i = 0; i < msb->zone_count; i++)
 		total_free_blocks += msb->free_block_count[i];
 
-	if (msb->block_count - bitmap_weight(msb->used_blocks_bitmap,
-					msb->block_count) == total_free_blocks)
+	if (bitmap_weight_eq(msb->used_blocks_bitmap, msb->block_count,
+				msb->block_count - total_free_blocks))
 		return 0;
 
 	pr_err("BUG: free block counts don't match the bitmap");
-- 
2.32.0



* [PATCH 30/49] ixgbe: replace bitmap_weight with bitmap_weight_eq
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (28 preceding siblings ...)
  2022-02-10 22:49 ` [PATCH 29/49] memstick: replace bitmap_weight with bitmap_weight_eq " Yury Norov
@ 2022-02-10 22:49 ` Yury Norov
  2022-02-10 22:49 ` [PATCH 31/49] octeontx2-pf: replace bitmap_weight with bitmap_weight_{eq,gt} Yury Norov
                   ` (19 subsequent siblings)
  49 siblings, 0 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:49 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Jesse Brandeburg, Tony Nguyen, David S. Miller, Jakub Kicinski,
	intel-wired-lan, netdev

ixgbe_disable_sriov() calls bitmap_weight() to compare the weight of a
bitmap with a given number. We can do it more efficiently with
bitmap_weight_eq() because the conditional variants of bitmap_weight() may
stop traversing the bitmap early, as soon as the condition is met or can
no longer be met.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
index 214a38de3f41..35297d8a488b 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
@@ -246,7 +246,7 @@ int ixgbe_disable_sriov(struct ixgbe_adapter *adapter)
 #endif
 
 	/* Disable VMDq flag so device will be set in VM mode */
-	if (bitmap_weight(adapter->fwd_bitmask, adapter->num_rx_pools) == 1) {
+	if (bitmap_weight_eq(adapter->fwd_bitmask, adapter->num_rx_pools, 1)) {
 		adapter->flags &= ~IXGBE_FLAG_VMDQ_ENABLED;
 		adapter->flags &= ~IXGBE_FLAG_SRIOV_ENABLED;
 		rss = min_t(int, ixgbe_max_rss_indices(adapter),
-- 
2.32.0



* [PATCH 31/49] octeontx2-pf: replace bitmap_weight with bitmap_weight_{eq,gt}
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (29 preceding siblings ...)
  2022-02-10 22:49 ` [PATCH 30/49] ixgbe: replace bitmap_weight with bitmap_weight_eq Yury Norov
@ 2022-02-10 22:49 ` Yury Norov
  2022-02-10 22:49 ` [PATCH 32/49] mlx4: replace bitmap_weight with bitmap_weight_{eq,gt,ge,lt,le} Yury Norov
                   ` (18 subsequent siblings)
  49 siblings, 0 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:49 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Sunil Goutham, Geetha sowjanya, Subbaraya Sundeep, hariprasad,
	David S. Miller, Jakub Kicinski, netdev

OcteonTX2 code calls bitmap_weight() to compare the weight of a bitmap with
a given number. We can do it more efficiently with bitmap_weight_{eq,gt}()
because the conditional variants of bitmap_weight() may stop traversing
the bitmap early, as soon as the condition is met or can no longer be met.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c | 2 +-
 drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c   | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c
index abe5267210ef..152890066c2a 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c
@@ -287,7 +287,7 @@ static int otx2_set_channels(struct net_device *dev,
 	if (!channel->rx_count || !channel->tx_count)
 		return -EINVAL;
 
-	if (bitmap_weight(&pfvf->rq_bmap, pfvf->hw.rx_queues) > 1) {
+	if (bitmap_weight_gt(&pfvf->rq_bmap, pfvf->hw.rx_queues, 1)) {
 		netdev_err(dev,
 			   "Receive queues are in use by TC police action\n");
 		return -EINVAL;
diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c
index 80b2d64b4136..55c899a6fcdd 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c
@@ -1170,8 +1170,8 @@ int otx2_remove_flow(struct otx2_nic *pfvf, u32 location)
 		 * interface mac address and configure CGX/RPM block in
 		 * promiscuous mode
 		 */
-		if (bitmap_weight(&flow_cfg->dmacflt_bmap,
-				  flow_cfg->dmacflt_max_flows) == 1)
+		if (bitmap_weight_eq(&flow_cfg->dmacflt_bmap,
+				     flow_cfg->dmacflt_max_flows, 1))
 			otx2_update_rem_pfmac(pfvf, DMAC_ADDR_DEL);
 	} else {
 		err = otx2_remove_flow_msg(pfvf, flow->entry, false);
-- 
2.32.0



* [PATCH 32/49] mlx4: replace bitmap_weight with bitmap_weight_{eq,gt,ge,lt,le}
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (30 preceding siblings ...)
  2022-02-10 22:49 ` [PATCH 31/49] octeontx2-pf: replace bitmap_weight with bitmap_weight_{eq,gt} Yury Norov
@ 2022-02-10 22:49 ` Yury Norov
  2022-02-10 22:49 ` [PATCH 33/49] perf: replace bitmap_weight with bitmap_weight_eq for ThunderX2 Yury Norov
                   ` (17 subsequent siblings)
  49 siblings, 0 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:49 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Sunil Goutham, Geetha sowjanya, Subbaraya Sundeep, hariprasad,
	David S. Miller, Jakub Kicinski, netdev

Mellanox code uses bitmap_weight() to compare the weight of a bitmap with
a given number. We can do it more efficiently with bitmap_weight_{eq, ...}()
because the conditional variants of bitmap_weight() may stop traversing
the bitmap early, as soon as the condition is met or can no longer be met.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/net/ethernet/mellanox/mlx4/cmd.c  | 10 +++-------
 drivers/net/ethernet/mellanox/mlx4/eq.c   |  4 ++--
 drivers/net/ethernet/mellanox/mlx4/fw.c   |  4 ++--
 drivers/net/ethernet/mellanox/mlx4/main.c |  2 +-
 4 files changed, 8 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/cmd.c b/drivers/net/ethernet/mellanox/mlx4/cmd.c
index c56d2194cbfc..5bca0c68f00a 100644
--- a/drivers/net/ethernet/mellanox/mlx4/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx4/cmd.c
@@ -2792,9 +2792,8 @@ int mlx4_slave_convert_port(struct mlx4_dev *dev, int slave, int port)
 {
 	unsigned n;
 	struct mlx4_active_ports actv_ports = mlx4_get_active_ports(dev, slave);
-	unsigned m = bitmap_weight(actv_ports.ports, dev->caps.num_ports);
 
-	if (port <= 0 || port > m)
+	if (port <= 0 || bitmap_weight_lt(actv_ports.ports, dev->caps.num_ports, port))
 		return -EINVAL;
 
 	n = find_first_bit(actv_ports.ports, dev->caps.num_ports);
@@ -3404,10 +3403,6 @@ int mlx4_vf_set_enable_smi_admin(struct mlx4_dev *dev, int slave, int port,
 	struct mlx4_priv *priv = mlx4_priv(dev);
 	struct mlx4_active_ports actv_ports = mlx4_get_active_ports(
 			&priv->dev, slave);
-	int min_port = find_first_bit(actv_ports.ports,
-				      priv->dev.caps.num_ports) + 1;
-	int max_port = min_port - 1 +
-		bitmap_weight(actv_ports.ports, priv->dev.caps.num_ports);
 
 	if (slave == mlx4_master_func_num(dev))
 		return 0;
@@ -3417,7 +3412,8 @@ int mlx4_vf_set_enable_smi_admin(struct mlx4_dev *dev, int slave, int port,
 	    enabled < 0 || enabled > 1)
 		return -EINVAL;
 
-	if (min_port == max_port && dev->caps.num_ports > 1) {
+	if (dev->caps.num_ports > 1 &&
+	    bitmap_weight_eq(actv_ports.ports, priv->dev.caps.num_ports, 1)) {
 		mlx4_info(dev, "SMI access disallowed for single ported VFs\n");
 		return -EPROTONOSUPPORT;
 	}
diff --git a/drivers/net/ethernet/mellanox/mlx4/eq.c b/drivers/net/ethernet/mellanox/mlx4/eq.c
index 414e390e6b48..0c09432ff389 100644
--- a/drivers/net/ethernet/mellanox/mlx4/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/eq.c
@@ -1435,8 +1435,8 @@ int mlx4_is_eq_shared(struct mlx4_dev *dev, int vector)
 	if (vector <= 0 || (vector >= dev->caps.num_comp_vectors + 1))
 		return -EINVAL;
 
-	return !!(bitmap_weight(priv->eq_table.eq[vector].actv_ports.ports,
-				dev->caps.num_ports) > 1);
+	return bitmap_weight_gt(priv->eq_table.eq[vector].actv_ports.ports,
+				dev->caps.num_ports, 1);
 }
 EXPORT_SYMBOL(mlx4_is_eq_shared);
 
diff --git a/drivers/net/ethernet/mellanox/mlx4/fw.c b/drivers/net/ethernet/mellanox/mlx4/fw.c
index 42c96c9d7fb1..855aae326ccb 100644
--- a/drivers/net/ethernet/mellanox/mlx4/fw.c
+++ b/drivers/net/ethernet/mellanox/mlx4/fw.c
@@ -1300,8 +1300,8 @@ int mlx4_QUERY_DEV_CAP_wrapper(struct mlx4_dev *dev, int slave,
 	actv_ports = mlx4_get_active_ports(dev, slave);
 	first_port = find_first_bit(actv_ports.ports, dev->caps.num_ports);
 	for (slave_port = 0, real_port = first_port;
-	     real_port < first_port +
-	     bitmap_weight(actv_ports.ports, dev->caps.num_ports);
+	     bitmap_weight_gt(actv_ports.ports, dev->caps.num_ports,
+			      real_port - first_port);
 	     ++real_port, ++slave_port) {
 		if (flags & (MLX4_DEV_CAP_FLAG_WOL_PORT1 << real_port))
 			flags |= MLX4_DEV_CAP_FLAG_WOL_PORT1 << slave_port;
diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index b187c210d4d6..cfbaa7ac712f 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -1383,7 +1383,7 @@ static int mlx4_mf_bond(struct mlx4_dev *dev)
 		   dev->persist->num_vfs + 1);
 
 	/* only single port vfs are allowed */
-	if (bitmap_weight(slaves_port_1_2, dev->persist->num_vfs + 1) > 1) {
+	if (bitmap_weight_gt(slaves_port_1_2, dev->persist->num_vfs + 1, 1)) {
 		mlx4_warn(dev, "HA mode unsupported for dual ported VFs\n");
 		return -EINVAL;
 	}
-- 
2.32.0


* [PATCH 33/49] perf: replace bitmap_weight with bitmap_weight_eq for ThunderX2
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (31 preceding siblings ...)
  2022-02-10 22:49 ` [PATCH 32/49] mlx4: replace bitmap_weight with bitmap_weight_{eq,gt,ge,lt,le} Yury Norov
@ 2022-02-10 22:49 ` Yury Norov
  2022-02-11 10:30   ` Mark Rutland
  2022-02-10 22:49 ` [PATCH 34/49] media: tegra-video: replace bitmap_weight with bitmap_weight_le Yury Norov
                   ` (16 subsequent siblings)
  49 siblings, 1 reply; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:49 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Will Deacon, Mark Rutland, linux-arm-kernel

tx2_uncore_event_start() calls bitmap_weight() to compare the weight
of a bitmap with a given number. We can do it more efficiently with
bitmap_weight_eq() because it may stop traversing the bitmap early, as
soon as the condition is met (or can no longer be met).

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/perf/thunderx2_pmu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/perf/thunderx2_pmu.c b/drivers/perf/thunderx2_pmu.c
index 1edb9c03704f..97d5b39778fa 100644
--- a/drivers/perf/thunderx2_pmu.c
+++ b/drivers/perf/thunderx2_pmu.c
@@ -623,8 +623,8 @@ static void tx2_uncore_event_start(struct perf_event *event, int flags)
 		return;
 
 	/* Start timer for first event */
-	if (bitmap_weight(tx2_pmu->active_counters,
-				tx2_pmu->max_counters) == 1) {
+	if (bitmap_weight_eq(tx2_pmu->active_counters,
+				tx2_pmu->max_counters, 1)) {
 		hrtimer_start(&tx2_pmu->hrtimer,
 			ns_to_ktime(tx2_pmu->hrtimer_interval),
 			HRTIMER_MODE_REL_PINNED);
-- 
2.32.0


* [PATCH 34/49] media: tegra-video: replace bitmap_weight with bitmap_weight_le
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (32 preceding siblings ...)
  2022-02-10 22:49 ` [PATCH 33/49] perf: replace bitmap_weight with bitmap_weight_eq for ThunderX2 Yury Norov
@ 2022-02-10 22:49 ` Yury Norov
  2022-04-28  7:31   ` Hans Verkuil
  2022-02-10 22:49 ` [PATCH 35/49] cpumask: add cpumask_weight_{eq,gt,ge,lt,le} Yury Norov
                   ` (15 subsequent siblings)
  49 siblings, 1 reply; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:49 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Thierry Reding, Jonathan Hunter, Sowjanya Komatineni,
	Mauro Carvalho Chehab, linux-media, linux-tegra, linux-staging

tegra_channel_enum_format() calls bitmap_weight() to compare the weight
of a bitmap with a given number. We can do it more efficiently with
bitmap_weight_le() because it may stop traversing the bitmap early, as
soon as the condition is met (or can no longer be met).

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
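For illustration only: the index check rejects f->index when the bitmap
holds f->index or fewer set bits, i.e. when there is no set bit number
f->index (counting from 0). The userspace sketch below shows that
relationship; nth_set_bit() is a hypothetical helper, not a kernel or
driver function.

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical helper: position of the n-th (0-based) set bit among the
 * first nbits of map, or -1 if the weight of the bitmap is <= n.
 */
static long nth_set_bit(const unsigned long *map, size_t nbits, size_t n)
{
	for (size_t bit = 0; bit < nbits; bit++) {
		unsigned long mask = 1UL << (bit % BITS_PER_WORD);

		if (!(map[bit / BITS_PER_WORD] & mask))
			continue;
		if (n == 0)
			return (long)bit;
		n--;	/* skip this set bit, keep looking */
	}

	return -1;	/* fewer than n + 1 set bits: index out of range */
}
```

A return value of -1 here corresponds to the case in which the patched
check returns -EINVAL.
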
 drivers/staging/media/tegra-video/vi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/media/tegra-video/vi.c b/drivers/staging/media/tegra-video/vi.c
index d1f43f465c22..4e79a80e9307 100644
--- a/drivers/staging/media/tegra-video/vi.c
+++ b/drivers/staging/media/tegra-video/vi.c
@@ -436,7 +436,7 @@ static int tegra_channel_enum_format(struct file *file, void *fh,
 	if (!IS_ENABLED(CONFIG_VIDEO_TEGRA_TPG))
 		fmts_bitmap = chan->fmts_bitmap;
 
-	if (f->index >= bitmap_weight(fmts_bitmap, MAX_FORMAT_NUM))
+	if (bitmap_weight_le(fmts_bitmap, MAX_FORMAT_NUM, f->index))
 		return -EINVAL;
 
 	for (i = 0; i < f->index + 1; i++, index++)
-- 
2.32.0


* [PATCH 35/49] cpumask: add cpumask_weight_{eq,gt,ge,lt,le}
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (33 preceding siblings ...)
  2022-02-10 22:49 ` [PATCH 34/49] media: tegra-video: replace bitmap_weight with bitmap_weight_le Yury Norov
@ 2022-02-10 22:49 ` Yury Norov
  2022-02-10 22:49 ` [PATCH 36/49] arch/ia64: replace cpumask_weight with cpumask_weight_eq in mm/tlb.c Yury Norov
                   ` (14 subsequent siblings)
  49 siblings, 0 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:49 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel

In many cases people use cpumask_weight() to compare the result against
some number or expression:

	if (cpumask_weight(...) > 1)
		do_something();

This can be improved significantly for large cpumasks: if the set bits in
the first few words already add up to more than the given number, we can
stop counting and return immediately.

The same idea works in the other direction: if the number of set bits
counted so far is so small that the total would stay below the required
number even if all remaining bits of the cpumask were set, we can stop
counting early.

This patch adds cpumask_weight_{eq, gt, ge, lt, le} helpers based on the
corresponding bitmap functions. The following patches apply the new
functions where appropriate.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
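For illustration only: a userspace sketch of the early-exit idea in both
directions. It is not the bitmap_weight_cmp() implementation from this
series; the name weight_cmp_sketch(), its return convention, the wrapper
names and the use of the GCC/Clang builtin __builtin_popcountl() are
assumptions made for the example.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical helper: returns a value > 0, == 0 or < 0 when the number
 * of set bits among the first nbits of map is greater than, equal to or
 * less than num.
 */
static int weight_cmp_sketch(const unsigned long *map, size_t nbits, long num)
{
	size_t nwords = (nbits + BITS_PER_WORD - 1) / BITS_PER_WORD;
	long w = 0;

	if (num < 0)
		return 1;	/* a weight is never negative */

	for (size_t i = 0; i < nwords; i++) {
		unsigned long word = map[i];
		size_t seen = (i + 1) * BITS_PER_WORD;
		size_t left = seen >= nbits ? 0 : nbits - seen;

		/* ignore bits past nbits in a partial last word */
		if (i == nwords - 1 && nbits % BITS_PER_WORD)
			word &= (1UL << (nbits % BITS_PER_WORD)) - 1;

		w += __builtin_popcountl(word);

		if (w > num)
			return 1;	/* already above num: stop */
		if (w + (long)left < num)
			return -1;	/* num is out of reach: stop */
	}

	/* the loop never left w above num, so only two outcomes remain */
	return w == num ? 0 : -1;
}

/* Two of the five comparisons, expressed through the helper above. */
static bool weight_gt_sketch(const unsigned long *map, size_t nbits, long num)
{
	return weight_cmp_sketch(map, nbits, num) > 0;
}

static bool weight_eq_sketch(const unsigned long *map, size_t nbits, long num)
{
	return weight_cmp_sketch(map, nbits, num) == 0;
}
```

With a mask whose first word is fully set, weight_gt_sketch(map, nbits, 1)
returns after the first word regardless of how long the mask is.
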
 include/linux/cpumask.h | 50 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 6b06c698cd2a..0037297c542a 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -575,6 +575,56 @@ static inline unsigned int cpumask_weight(const struct cpumask *srcp)
 	return bitmap_weight(cpumask_bits(srcp), nr_cpumask_bits);
 }
 
+/**
+ * cpumask_weight_eq - Check if # of bits in *srcp is equal to a given number
+ * @srcp: the cpumask to count bits (< nr_cpu_ids) in.
+ * @num: the number to check.
+ */
+static inline bool cpumask_weight_eq(const struct cpumask *srcp, unsigned int num)
+{
+	return bitmap_weight_eq(cpumask_bits(srcp), nr_cpumask_bits, num);
+}
+
+/**
+ * cpumask_weight_gt - Check if # of bits in *srcp is greater than a given number
+ * @srcp: the cpumask to count bits (< nr_cpu_ids) in.
+ * @num: the number to check.
+ */
+static inline bool cpumask_weight_gt(const struct cpumask *srcp, int num)
+{
+	return bitmap_weight_gt(cpumask_bits(srcp), nr_cpumask_bits, num);
+}
+
+/**
+ * cpumask_weight_ge - Check if # of bits in *srcp is greater than or equal to a given number
+ * @srcp: the cpumask to count bits (< nr_cpu_ids) in.
+ * @num: the number to check.
+ */
+static inline bool cpumask_weight_ge(const struct cpumask *srcp, int num)
+{
+	return bitmap_weight_ge(cpumask_bits(srcp), nr_cpumask_bits, num);
+}
+
+/**
+ * cpumask_weight_lt - Check if # of bits in *srcp is less than a given number
+ * @srcp: the cpumask to count bits (< nr_cpu_ids) in.
+ * @num: the number to check.
+ */
+static inline bool cpumask_weight_lt(const struct cpumask *srcp, int num)
+{
+	return bitmap_weight_lt(cpumask_bits(srcp), nr_cpumask_bits, num);
+}
+
+/**
+ * cpumask_weight_le - Check if # of bits in *srcp is less than or equal to a given number
+ * @srcp: the cpumask to count bits (< nr_cpu_ids) in.
+ * @num: the number to check.
+ */
+static inline bool cpumask_weight_le(const struct cpumask *srcp, int num)
+{
+	return bitmap_weight_le(cpumask_bits(srcp), nr_cpumask_bits, num);
+}
+
 /**
  * cpumask_shift_right - *dstp = *srcp >> n
  * @dstp: the cpumask result
-- 
2.32.0


* [PATCH 36/49] arch/ia64: replace cpumask_weight with cpumask_weight_eq in mm/tlb.c
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (34 preceding siblings ...)
  2022-02-10 22:49 ` [PATCH 35/49] cpumask: add cpumask_weight_{eq,gt,ge,lt,le} Yury Norov
@ 2022-02-10 22:49 ` Yury Norov
  2022-02-10 22:49 ` [PATCH 37/49] arch/mips: replace cpumask_weight with cpumask_weight_{eq, ...} where appropriate Yury Norov
                   ` (13 subsequent siblings)
  49 siblings, 0 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:49 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	linux-ia64

__flush_tlb_range() calls cpumask_weight() to compare the weight of a
cpumask with a given number. We can do it more efficiently with
cpumask_weight_eq() because it may stop traversing the cpumask early, as
soon as the condition is met (or can no longer be met).

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 arch/ia64/mm/tlb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/ia64/mm/tlb.c b/arch/ia64/mm/tlb.c
index 135b5135cace..a5bce13ab047 100644
--- a/arch/ia64/mm/tlb.c
+++ b/arch/ia64/mm/tlb.c
@@ -332,7 +332,7 @@ __flush_tlb_range (struct vm_area_struct *vma, unsigned long start,
 
 	preempt_disable();
 #ifdef CONFIG_SMP
-	if (mm != current->active_mm || cpumask_weight(mm_cpumask(mm)) != 1) {
+	if (mm != current->active_mm || !cpumask_weight_eq(mm_cpumask(mm), 1)) {
 		ia64_global_tlb_purge(mm, start, end, nbits);
 		preempt_enable();
 		return;
-- 
2.32.0


* [PATCH 37/49] arch/mips: replace cpumask_weight with cpumask_weight_{eq, ...} where appropriate
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (35 preceding siblings ...)
  2022-02-10 22:49 ` [PATCH 36/49] arch/ia64: replace cpumask_weight with cpumask_weight_eq in mm/tlb.c Yury Norov
@ 2022-02-10 22:49 ` Yury Norov
  2022-02-10 22:49 ` [PATCH 38/49] arch/powerpc: " Yury Norov
                   ` (12 subsequent siblings)
  49 siblings, 0 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:49 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Thomas Bogendoerfer, Mark Rutland, Marc Zyngier, linux-mips

MIPS code calls cpumask_weight() to compare the weight of a cpumask
with a given number. We can do it more efficiently with
cpumask_weight_{eq, ...} because the conditional variants may stop
traversing the cpumask early, as soon as the condition is met (or can no
longer be met).

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
---
 arch/mips/cavium-octeon/octeon-irq.c | 4 ++--
 arch/mips/kernel/crash.c             | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/mips/cavium-octeon/octeon-irq.c b/arch/mips/cavium-octeon/octeon-irq.c
index 844f882096e6..914871f15fb7 100644
--- a/arch/mips/cavium-octeon/octeon-irq.c
+++ b/arch/mips/cavium-octeon/octeon-irq.c
@@ -763,7 +763,7 @@ static void octeon_irq_cpu_offline_ciu(struct irq_data *data)
 	if (!cpumask_test_cpu(cpu, mask))
 		return;
 
-	if (cpumask_weight(mask) > 1) {
+	if (cpumask_weight_gt(mask, 1)) {
 		/*
 		 * It has multi CPU affinity, just remove this CPU
 		 * from the affinity set.
@@ -795,7 +795,7 @@ static int octeon_irq_ciu_set_affinity(struct irq_data *data,
 	 * This removes the need to do locking in the .ack/.eoi
 	 * functions.
 	 */
-	if (cpumask_weight(dest) != 1)
+	if (!cpumask_weight_eq(dest, 1))
 		return -EINVAL;
 
 	if (!enable_one)
diff --git a/arch/mips/kernel/crash.c b/arch/mips/kernel/crash.c
index 81845ba04835..5b690d52491f 100644
--- a/arch/mips/kernel/crash.c
+++ b/arch/mips/kernel/crash.c
@@ -72,7 +72,7 @@ static void crash_kexec_prepare_cpus(void)
 	 */
 	pr_emerg("Sending IPI to other cpus...\n");
 	msecs = 10000;
-	while ((cpumask_weight(&cpus_in_crash) < ncpus) && (--msecs > 0)) {
+	while (cpumask_weight_lt(&cpus_in_crash, ncpus) && (--msecs > 0)) {
 		cpu_relax();
 		mdelay(1);
 	}
-- 
2.32.0


* [PATCH 38/49] arch/powerpc: replace cpumask_weight with cpumask_weight_{eq, ...} where appropriate
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (36 preceding siblings ...)
  2022-02-10 22:49 ` [PATCH 37/49] arch/mips: replace cpumask_weight with cpumask_weight_{eq, ...} where appropriate Yury Norov
@ 2022-02-10 22:49 ` Yury Norov
  2022-02-11  4:10   ` Michael Ellerman
  2022-02-10 22:49 ` [PATCH 39/49] arch/s390: replace cpumask_weight with cpumask_weight_eq " Yury Norov
                   ` (11 subsequent siblings)
  49 siblings, 1 reply; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:49 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	Srikar Dronamraju, Gautham R. Shenoy, Valentin Schneider,
	Parth Shah, Cédric Le Goater, Hari Bathini, Rob Herring,
	Laurent Dufour, Petr Mladek, John Ogness, Sudeep Holla,
	Christophe Leroy, Naveen N. Rao, Xiongwei Song, Arnd Bergmann,
	linuxppc-dev

PowerPC code uses cpumask_weight() to compare the weight of a cpumask with
a given number. We can do it more efficiently with cpumask_weight_{eq, ...}
because the conditional variants may stop traversing the cpumask early, as
soon as the condition is met (or can no longer be met).

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 arch/powerpc/kernel/smp.c      | 2 +-
 arch/powerpc/kernel/watchdog.c | 2 +-
 arch/powerpc/xmon/xmon.c       | 4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index b7fd6a72aa76..8bff748df402 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1656,7 +1656,7 @@ void start_secondary(void *unused)
 		if (has_big_cores)
 			sibling_mask = cpu_smallcore_mask;
 
-		if (cpumask_weight(mask) > cpumask_weight(sibling_mask(cpu)))
+		if (cpumask_weight_gt(mask, cpumask_weight(sibling_mask(cpu))))
 			shared_caches = true;
 	}
 
diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
index bfc27496fe7e..62937a077de7 100644
--- a/arch/powerpc/kernel/watchdog.c
+++ b/arch/powerpc/kernel/watchdog.c
@@ -483,7 +483,7 @@ static void start_watchdog(void *arg)
 
 	wd_smp_lock(&flags);
 	cpumask_set_cpu(cpu, &wd_cpus_enabled);
-	if (cpumask_weight(&wd_cpus_enabled) == 1) {
+	if (cpumask_weight_eq(&wd_cpus_enabled, 1)) {
 		cpumask_set_cpu(cpu, &wd_smp_cpus_pending);
 		wd_smp_last_reset_tb = get_tb();
 	}
diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index fd72753e8ad5..b423812e94e0 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -469,7 +469,7 @@ static bool wait_for_other_cpus(int ncpus)
 
 	/* We wait for 2s, which is a metric "little while" */
 	for (timeout = 20000; timeout != 0; --timeout) {
-		if (cpumask_weight(&cpus_in_xmon) >= ncpus)
+		if (cpumask_weight_ge(&cpus_in_xmon, ncpus))
 			return true;
 		udelay(100);
 		barrier();
@@ -1338,7 +1338,7 @@ static int cpu_cmd(void)
 			case 'S':
 			case 't':
 				cpumask_copy(&xmon_batch_cpus, &cpus_in_xmon);
-				if (cpumask_weight(&xmon_batch_cpus) <= 1) {
+				if (cpumask_weight_le(&xmon_batch_cpus, 1)) {
 					printf("There are no other cpus in xmon\n");
 					break;
 				}
-- 
2.32.0


* [PATCH 39/49] arch/s390: replace cpumask_weight with cpumask_weight_eq where appropriate
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (37 preceding siblings ...)
  2022-02-10 22:49 ` [PATCH 38/49] arch/powerpc: " Yury Norov
@ 2022-02-10 22:49 ` Yury Norov
  2022-02-11  6:54   ` Sven Schnelle
  2022-02-10 22:49 ` [PATCH 40/49] firmware: pcsi: replace cpumask_weight with cpumask_weight_eq Yury Norov
                   ` (10 subsequent siblings)
  49 siblings, 1 reply; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:49 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Heiko Carstens, Vasily Gorbik, Christian Borntraeger,
	Alexander Gordeev, Sven Schnelle, Thomas Richter,
	Sumanth Korikkar, Sebastian Andrzej Siewior, Jiapeng Chong,
	kernel test robot, linux-s390

cfset_all_start() calls cpumask_weight() to compare the weight of a cpumask
with a given number. We can do it more efficiently with
cpumask_weight_eq() because it may stop traversing the cpumask early, as
soon as the condition is met (or can no longer be met).

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 arch/s390/kernel/perf_cpum_cf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/s390/kernel/perf_cpum_cf.c b/arch/s390/kernel/perf_cpum_cf.c
index ee8707abdb6a..4d217f7f5ccf 100644
--- a/arch/s390/kernel/perf_cpum_cf.c
+++ b/arch/s390/kernel/perf_cpum_cf.c
@@ -975,7 +975,7 @@ static int cfset_all_start(struct cfset_request *req)
 		return -ENOMEM;
 	cpumask_and(mask, &req->mask, cpu_online_mask);
 	on_each_cpu_mask(mask, cfset_ioctl_on, &p, 1);
-	if (atomic_read(&p.cpus_ack) != cpumask_weight(mask)) {
+	if (!cpumask_weight_eq(mask, atomic_read(&p.cpus_ack))) {
 		on_each_cpu_mask(mask, cfset_ioctl_off, &p, 1);
 		rc = -EIO;
 		debug_sprintf_event(cf_dbg, 4, "%s CPUs missing", __func__);
-- 
2.32.0


* [PATCH 40/49] firmware: pcsi: replace cpumask_weight with cpumask_weight_eq
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (38 preceding siblings ...)
  2022-02-10 22:49 ` [PATCH 39/49] arch/s390: replace cpumask_weight with cpumask_weight_eq " Yury Norov
@ 2022-02-10 22:49 ` Yury Norov
  2022-02-11  9:45   ` Sudeep Holla
  2022-02-11 10:32   ` Mark Rutland
  2022-02-10 22:49 ` [PATCH 41/49] RDMA/hfi1: replace cpumask_weight with cpumask_weight_{eq, ...} where appropriate Yury Norov
                   ` (9 subsequent siblings)
  49 siblings, 2 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:49 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Mark Rutland, Lorenzo Pieralisi, linux-arm-kernel

down_and_up_cpus() calls cpumask_weight() to compare the weight of a
cpumask with a given number. We can do it more efficiently with
cpumask_weight_eq() because it may stop traversing the cpumask early, as
soon as the condition is met (or can no longer be met).

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/firmware/psci/psci_checker.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/firmware/psci/psci_checker.c b/drivers/firmware/psci/psci_checker.c
index 116eb465cdb4..90c9473832a9 100644
--- a/drivers/firmware/psci/psci_checker.c
+++ b/drivers/firmware/psci/psci_checker.c
@@ -90,7 +90,7 @@ static unsigned int down_and_up_cpus(const struct cpumask *cpus,
 		 * cpu_down() checks the number of online CPUs before the TOS
 		 * resident CPU.
 		 */
-		if (cpumask_weight(offlined_cpus) + 1 == nb_available_cpus) {
+		if (cpumask_weight_eq(offlined_cpus, nb_available_cpus - 1)) {
 			if (ret != -EBUSY) {
 				pr_err("Unexpected return code %d while trying "
 				       "to power down last online CPU %d\n",
-- 
2.32.0


* [PATCH 41/49] RDMA/hfi1: replace cpumask_weight with cpumask_weight_{eq, ...} where appropriate
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (39 preceding siblings ...)
  2022-02-10 22:49 ` [PATCH 40/49] firmware: pcsi: replace cpumask_weight with cpumask_weight_eq Yury Norov
@ 2022-02-10 22:49 ` Yury Norov
  2022-02-11 19:11   ` Jason Gunthorpe
  2022-02-10 22:49 ` [PATCH 42/49] scsi: lpfc: replace cpumask_weight with cpumask_weight_gt Yury Norov
                   ` (8 subsequent siblings)
  49 siblings, 1 reply; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:49 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Mike Marciniszyn, Dennis Dalessandro, Jason Gunthorpe,
	linux-rdma

InfiniBand code uses cpumask_weight() to compare the weight of a cpumask
with a given number. We can do it more efficiently with
cpumask_weight_{eq, ...} because the conditional variants may stop
traversing the cpumask early, as soon as the condition is met (or can no
longer be met).

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/infiniband/hw/hfi1/affinity.c    | 9 ++++-----
 drivers/infiniband/hw/qib/qib_file_ops.c | 2 +-
 drivers/infiniband/hw/qib/qib_iba7322.c  | 2 +-
 3 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/affinity.c b/drivers/infiniband/hw/hfi1/affinity.c
index 877f8e84a672..a9ad07808dea 100644
--- a/drivers/infiniband/hw/hfi1/affinity.c
+++ b/drivers/infiniband/hw/hfi1/affinity.c
@@ -506,7 +506,7 @@ static int _dev_comp_vect_cpu_mask_init(struct hfi1_devdata *dd,
 	 * available CPUs divide it by the number of devices in the
 	 * local NUMA node.
 	 */
-	if (cpumask_weight(&entry->comp_vect_mask) == 1) {
+	if (cpumask_weight_eq(&entry->comp_vect_mask, 1)) {
 		possible_cpus_comp_vect = 1;
 		dd_dev_warn(dd,
 			    "Number of kernel receive queues is too large for completion vector affinity to be effective\n");
@@ -592,7 +592,7 @@ int hfi1_dev_affinity_init(struct hfi1_devdata *dd)
 {
 	struct hfi1_affinity_node *entry;
 	const struct cpumask *local_mask;
-	int curr_cpu, possible, i, ret;
+	int curr_cpu, i, ret;
 	bool new_entry = false;
 
 	local_mask = cpumask_of_node(dd->node);
@@ -625,10 +625,9 @@ int hfi1_dev_affinity_init(struct hfi1_devdata *dd)
 			    local_mask);
 
 		/* fill in the receive list */
-		possible = cpumask_weight(&entry->def_intr.mask);
 		curr_cpu = cpumask_first(&entry->def_intr.mask);
 
-		if (possible == 1) {
+		if (cpumask_weight_eq(&entry->def_intr.mask, 1)) {
 			/* only one CPU, everyone will use it */
 			cpumask_set_cpu(curr_cpu, &entry->rcv_intr.mask);
 			cpumask_set_cpu(curr_cpu, &entry->general_intr_mask);
@@ -1016,7 +1015,7 @@ int hfi1_get_proc_affinity(int node)
 		cpu = cpumask_first(proc_mask);
 		cpumask_set_cpu(cpu, &set->used);
 		goto done;
-	} else if (current->nr_cpus_allowed < cpumask_weight(&set->mask)) {
+	} else if (cpumask_weight_gt(&set->mask, current->nr_cpus_allowed)) {
 		hfi1_cdbg(PROC, "PID %u %s affinity set to CPU set(s) %*pbl",
 			  current->pid, current->comm,
 			  cpumask_pr_args(proc_mask));
diff --git a/drivers/infiniband/hw/qib/qib_file_ops.c b/drivers/infiniband/hw/qib/qib_file_ops.c
index aa290928cf96..add89bc21b0a 100644
--- a/drivers/infiniband/hw/qib/qib_file_ops.c
+++ b/drivers/infiniband/hw/qib/qib_file_ops.c
@@ -1151,7 +1151,7 @@ static void assign_ctxt_affinity(struct file *fp, struct qib_devdata *dd)
 	 * reserve a processor for it on the local NUMA node.
 	 */
 	if ((weight >= qib_cpulist_count) &&
-		(cpumask_weight(local_mask) <= qib_cpulist_count)) {
+		(cpumask_weight_le(local_mask, qib_cpulist_count))) {
 		for_each_cpu(local_cpu, local_mask)
 			if (!test_and_set_bit(local_cpu, qib_cpulist)) {
 				fd->rec_cpu_num = local_cpu;
diff --git a/drivers/infiniband/hw/qib/qib_iba7322.c b/drivers/infiniband/hw/qib/qib_iba7322.c
index ceed302cf6a0..b17f96509d2c 100644
--- a/drivers/infiniband/hw/qib/qib_iba7322.c
+++ b/drivers/infiniband/hw/qib/qib_iba7322.c
@@ -3405,7 +3405,7 @@ static void qib_setup_7322_interrupt(struct qib_devdata *dd, int clearpend)
 	local_mask = cpumask_of_pcibus(dd->pcidev->bus);
 	firstcpu = cpumask_first(local_mask);
 	if (firstcpu >= nr_cpu_ids ||
-			cpumask_weight(local_mask) == num_online_cpus()) {
+			cpumask_weight_eq(local_mask, num_online_cpus())) {
 		local_mask = topology_core_cpumask(0);
 		firstcpu = cpumask_first(local_mask);
 	}
-- 
2.32.0


* [PATCH 42/49] scsi: lpfc: replace cpumask_weight with cpumask_weight_gt
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (40 preceding siblings ...)
  2022-02-10 22:49 ` [PATCH 41/49] RDMA/hfi1: replace cpumask_weight with cpumask_weight_{eq, ...} where appropriate Yury Norov
@ 2022-02-10 22:49 ` Yury Norov
  2022-02-10 22:49 ` [PATCH 43/49] soc/qman: replace cpumask_weight with cpumask_weight_lt Yury Norov
                   ` (7 subsequent siblings)
  49 siblings, 0 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:49 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	James Smart, Dick Kennedy, James E.J. Bottomley,
	Martin K. Petersen, linux-scsi

lpfc_cpuhp_get_eq() calls cpumask_weight() to compare the weight of a
cpumask with a given number. We can do it more efficiently with
cpumask_weight_gt() because it may stop traversing the cpumask early, as
soon as the condition is met (or can no longer be met).

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/scsi/lpfc/lpfc_init.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index f5c363f663f6..35688427cb7f 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -12642,7 +12642,7 @@ lpfc_cpuhp_get_eq(struct lpfc_hba *phba, unsigned int cpu,
 		 * gone offline yet, we need >1.
 		 */
 		cpumask_and(tmp, maskp, cpu_online_mask);
-		if (cpumask_weight(tmp) > 1)
+		if (cpumask_weight_gt(tmp, 1))
 			continue;
 
 		/* Now that we have an irq to shutdown, get the eq
-- 
2.32.0


* [PATCH 43/49] soc/qman: replace cpumask_weight with cpumask_weight_lt
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (41 preceding siblings ...)
  2022-02-10 22:49 ` [PATCH 42/49] scsi: lpfc: replace cpumask_weight with cpumask_weight_gt Yury Norov
@ 2022-02-10 22:49 ` Yury Norov
  2022-02-10 22:49 ` [PATCH 44/49] nodemask: add nodemask_weight_{eq,gt,ge,lt,le} Yury Norov
                   ` (6 subsequent siblings)
  49 siblings, 0 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:49 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Li Yang, linuxppc-dev, linux-arm-kernel

qman_test_stash() calls cpumask_weight() to compare the weight of a cpumask
with a given number. We can do it more efficiently with cpumask_weight_lt()
because it may stop traversing the cpumask early, as soon as the condition
is met (or can no longer be met).

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/soc/fsl/qbman/qman_test_stash.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/soc/fsl/qbman/qman_test_stash.c b/drivers/soc/fsl/qbman/qman_test_stash.c
index b7e8e5ec884c..28b08568a349 100644
--- a/drivers/soc/fsl/qbman/qman_test_stash.c
+++ b/drivers/soc/fsl/qbman/qman_test_stash.c
@@ -561,7 +561,7 @@ int qman_test_stash(void)
 {
 	int err;
 
-	if (cpumask_weight(cpu_online_mask) < 2) {
+	if (cpumask_weight_lt(cpu_online_mask, 2)) {
 		pr_info("%s(): skip - only 1 CPU\n", __func__);
 		return 0;
 	}
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH 44/49] nodemask: add nodemask_weight_{eq,gt,ge,lt,le}
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (42 preceding siblings ...)
  2022-02-10 22:49 ` [PATCH 43/49] soc/qman: replace cpumask_weight with cpumask_weight_lt Yury Norov
@ 2022-02-10 22:49 ` Yury Norov
  2022-02-10 22:49 ` [PATCH 45/49] ACPI: replace nodes_weight with nodes_weight_ge for numa Yury Norov
                   ` (5 subsequent siblings)
  49 siblings, 0 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:49 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel

In many cases kernel code uses nodes_weight() to compare the result
against some number or expression:

	if (nodes_weight(...) > 1)
		do_something();

This can be significantly improved for large nodemasks: if the set bits
counted in the first few words already exceed the given number, we can
stop counting and return immediately.

The same idea works in the other direction: if the number of set bits
counted so far is so small that the total would stay below the required
number even if all remaining bits of the nodemask were set, we can also
stop counting early.
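
As a rough userspace illustration of both early exits (a sketch under
assumptions, not the kernel code: weight_cmp_sketch() and popcount_word()
are hypothetical names, and the result is reduced to -1/0/1):

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Portable popcount for one word (clears the lowest set bit per step). */
static inline unsigned int popcount_word(unsigned long w)
{
	unsigned int n = 0;

	for (; w; w &= w - 1)
		n++;
	return n;
}

/*
 * Hypothetical helper: returns -1, 0 or 1 if the number of set bits in
 * the first nbits bits of map is less than, equal to or greater than num.
 */
static inline int weight_cmp_sketch(const unsigned long *map,
				    unsigned int nbits, unsigned int num)
{
	unsigned int full = nbits / WORD_BITS;
	unsigned int tail = nbits % WORD_BITS;
	unsigned int k, w = 0;

	for (k = 0; k < full; k++) {
		unsigned int remaining = nbits - k * WORD_BITS;

		/* Even if every remaining bit were set, num is out of reach. */
		if (w + remaining < num)
			return -1;

		w += popcount_word(map[k]);

		/* Already above num: later words cannot change the answer. */
		if (w > num)
			return 1;
	}

	/* Count only the valid low bits of a partial last word. */
	if (tail)
		w += popcount_word(map[k] & (~0UL >> (WORD_BITS - tail)));

	return (w > num) - (w < num);
}

/* Small usage demo: a four-word map with only the first word fully set. */
static inline void weight_cmp_sketch_demo(void)
{
	unsigned long map[4] = { ~0UL, 0, 0, 0 };

	/* "weight > 1" is decided after the first word. */
	assert(weight_cmp_sketch(map, 4 * WORD_BITS, 1) > 0);
	/* "weight >= all bits" is rejected before the last word is read. */
	assert(weight_cmp_sketch(map, 4 * WORD_BITS, 4 * WORD_BITS) < 0);
}
```

In this sketch the "greater than" exit pays off when the mask is dense,
and the "can't be reached" exit pays off when num is close to nbits.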

This patch adds nodes_weight_{eq,gt,ge,lt,le} helpers based on the
corresponding bitmap functions. The following patches use the new helpers
where appropriate.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 include/linux/nodemask.h | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/include/linux/nodemask.h b/include/linux/nodemask.h
index 567c3ddba2c4..197598e075e9 100644
--- a/include/linux/nodemask.h
+++ b/include/linux/nodemask.h
@@ -38,6 +38,11 @@
  * int nodes_empty(mask)		Is mask empty (no bits sets)?
  * int nodes_full(mask)			Is mask full (all bits sets)?
  * int nodes_weight(mask)		Hamming weight - number of set bits
+ * bool nodes_weight_eq(mask, num)	Hamming weight is equal to num
+ * bool nodes_weight_gt(mask, num)	Hamming weight is greater than num
+ * bool nodes_weight_ge(mask, num)	Hamming weight is greater than or equal to num
+ * bool nodes_weight_lt(mask, num)	Hamming weight is less than num
+ * bool nodes_weight_le(mask, num)	Hamming weight is less than or equal to num
  *
  * void nodes_shift_right(dst, src, n)	Shift right
  * void nodes_shift_left(dst, src, n)	Shift left
@@ -240,6 +245,36 @@ static inline int __nodes_weight(const nodemask_t *srcp, unsigned int nbits)
 	return bitmap_weight(srcp->bits, nbits);
 }
 
+#define nodes_weight_eq(nodemask, num) __nodes_weight_eq(&(nodemask), MAX_NUMNODES, (num))
+static inline int __nodes_weight_eq(const nodemask_t *srcp, unsigned int nbits, int num)
+{
+	return bitmap_weight_eq(srcp->bits, nbits, num);
+}
+
+#define nodes_weight_gt(nodemask, num) __nodes_weight_gt(&(nodemask), MAX_NUMNODES, (num))
+static inline int __nodes_weight_gt(const nodemask_t *srcp, unsigned int nbits, int num)
+{
+	return bitmap_weight_gt(srcp->bits, nbits, num);
+}
+
+#define nodes_weight_ge(nodemask, num) __nodes_weight_ge(&(nodemask), MAX_NUMNODES, (num))
+static inline int __nodes_weight_ge(const nodemask_t *srcp, unsigned int nbits, int num)
+{
+	return bitmap_weight_ge(srcp->bits, nbits, num);
+}
+
+#define nodes_weight_lt(nodemask, num) __nodes_weight_lt(&(nodemask), MAX_NUMNODES, (num))
+static inline int __nodes_weight_lt(const nodemask_t *srcp, unsigned int nbits, int num)
+{
+	return bitmap_weight_lt(srcp->bits, nbits, num);
+}
+
+#define nodes_weight_le(nodemask, num) __nodes_weight_le(&(nodemask), MAX_NUMNODES, (num))
+static inline int __nodes_weight_le(const nodemask_t *srcp, unsigned int nbits, int num)
+{
+	return bitmap_weight_le(srcp->bits, nbits, num);
+}
+
 #define nodes_shift_right(dst, src, n) \
 			__nodes_shift_right(&(dst), &(src), (n), MAX_NUMNODES)
 static inline void __nodes_shift_right(nodemask_t *dstp,
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH 45/49] ACPI: replace nodes_weight with nodes_weight_ge for numa
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (43 preceding siblings ...)
  2022-02-10 22:49 ` [PATCH 44/49] nodemask: add nodemask_weight_{eq,gt,ge,lt,le} Yury Norov
@ 2022-02-10 22:49 ` Yury Norov
  2022-02-14 19:18   ` Rafael J. Wysocki
  2022-02-10 22:49 ` [PATCH 46/49] mm/mempolicy: replace nodes_weight with nodes_weight_eq Yury Norov
                   ` (4 subsequent siblings)
  49 siblings, 1 reply; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:49 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Rafael J. Wysocki, Len Brown, Dan Williams, Huacai Chen,
	Vitaly Kuznetsov, Alison Schofield, linux-acpi

acpi_map_pxm_to_node() calls nodes_weight() to compare the weight
of a nodemask with a given number. We can do it more efficiently with
nodes_weight_ge() because the conditional nodes_weight variants may stop
traversing the nodemask early, as soon as the condition is met (or can
no longer be met).

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/acpi/numa/srat.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/acpi/numa/srat.c b/drivers/acpi/numa/srat.c
index 3b818ab186be..fe7a7996f553 100644
--- a/drivers/acpi/numa/srat.c
+++ b/drivers/acpi/numa/srat.c
@@ -67,7 +67,7 @@ int acpi_map_pxm_to_node(int pxm)
 	node = pxm_to_node_map[pxm];
 
 	if (node == NUMA_NO_NODE) {
-		if (nodes_weight(nodes_found_map) >= MAX_NUMNODES)
+		if (nodes_weight_ge(nodes_found_map, MAX_NUMNODES))
 			return NUMA_NO_NODE;
 		node = first_unset_node(nodes_found_map);
 		__acpi_map_pxm_to_node(pxm, node);
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH 46/49] mm/mempolicy: replace nodes_weight with nodes_weight_eq
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (44 preceding siblings ...)
  2022-02-10 22:49 ` [PATCH 45/49] ACPI: replace nodes_weight with nodes_weight_ge for numa Yury Norov
@ 2022-02-10 22:49 ` Yury Norov
  2022-02-11 10:40   ` Mike Rapoport
  2022-02-11 17:44   ` Christophe JAILLET
  2022-02-10 22:49 ` [PATCH 47/49] nodemask: add num_node_state_eq() Yury Norov
                   ` (3 subsequent siblings)
  49 siblings, 2 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:49 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	linux-mm

do_migrate_pages() calls nodes_weight() to compare the weight of the
'from' nodemask with the weight of the 'to' nodemask. We can do it more
efficiently with nodes_weight_eq() because the conditional nodes_weight
variant may stop traversing 'from' early, as soon as the condition is
decided. The weight of 'to' is still counted in full.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 mm/mempolicy.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 7c852793d9e8..56efd00b1b6e 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1154,7 +1154,7 @@ int do_migrate_pages(struct mm_struct *mm, const nodemask_t *from,
 			 *          [0-7] - > [3,4,5] moves only 0,1,2,6,7.
 			 */
 
-			if ((nodes_weight(*from) != nodes_weight(*to)) &&
+			if (!nodes_weight_eq(*from, nodes_weight(*to)) &&
 						(node_isset(s, *to)))
 				continue;
 
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH 47/49] nodemask: add num_node_state_eq()
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (45 preceding siblings ...)
  2022-02-10 22:49 ` [PATCH 46/49] mm/mempolicy: replace nodes_weight with nodes_weight_eq Yury Norov
@ 2022-02-10 22:49 ` Yury Norov
  2022-02-11 10:41   ` Mike Rapoport
  2022-02-10 22:49 ` [PATCH 48/49] tools: bitmap: sync bitmap_weight Yury Norov
                   ` (2 subsequent siblings)
  49 siblings, 1 reply; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:49 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	linux-mm

The page allocator uses num_node_state() to compare the number of nodes
in a given state with a number. The underlying code calls bitmap_weight(),
and we can do it more efficiently with num_node_state_eq() because the
conditional nodes_weight_eq() may stop traversing the nodemask early, as
soon as the condition is met (or can no longer be met).

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 include/linux/nodemask.h | 5 +++++
 mm/page_alloc.c          | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/linux/nodemask.h b/include/linux/nodemask.h
index 197598e075e9..c5014dbf3cce 100644
--- a/include/linux/nodemask.h
+++ b/include/linux/nodemask.h
@@ -466,6 +466,11 @@ static inline int num_node_state(enum node_states state)
 	return nodes_weight(node_states[state]);
 }
 
+static inline int num_node_state_eq(enum node_states state, int num)
+{
+	return nodes_weight_eq(node_states[state], num);
+}
+
 #define for_each_node_state(__node, __state) \
 	for_each_node_mask((__node), node_states[__state])
 
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index cface1d38093..897e64b66ca4 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8434,7 +8434,7 @@ void __init page_alloc_init(void)
 	int ret;
 
 #ifdef CONFIG_NUMA
-	if (num_node_state(N_MEMORY) == 1)
+	if (num_node_state_eq(N_MEMORY, 1))
 		hashdist = 0;
 #endif
 
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH 48/49] tools: bitmap: sync bitmap_weight
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (46 preceding siblings ...)
  2022-02-10 22:49 ` [PATCH 47/49] nodemask: add num_node_state_eq() Yury Norov
@ 2022-02-10 22:49 ` Yury Norov
  2022-02-10 22:49 ` [PATCH 49/49] MAINTAINERS: add cpumask and nodemask files to BITMAP_API Yury Norov
  2022-02-15 23:18 ` [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Will Deacon
  49 siblings, 0 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:49 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Jin Yao, John Garry,
	Ian Rogers, Kan Liang, linux-perf-users

Pull bitmap_weight_{cmp,eq,gt,ge,lt,le} from the main kernel sources and
use them where applicable.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 tools/include/linux/bitmap.h | 44 ++++++++++++++++++++++++++++++++++++
 tools/lib/bitmap.c           | 20 ++++++++++++++++
 tools/perf/util/pmu.c        |  2 +-
 3 files changed, 65 insertions(+), 1 deletion(-)

diff --git a/tools/include/linux/bitmap.h b/tools/include/linux/bitmap.h
index ea97804d04d4..29bf54996a84 100644
--- a/tools/include/linux/bitmap.h
+++ b/tools/include/linux/bitmap.h
@@ -12,6 +12,8 @@
 	unsigned long name[BITS_TO_LONGS(bits)]
 
 int __bitmap_weight(const unsigned long *bitmap, int bits);
+int __bitmap_weight_cmp(const unsigned long *bitmap, unsigned int bits,
+			 int num);
 void __bitmap_or(unsigned long *dst, const unsigned long *bitmap1,
 		 const unsigned long *bitmap2, int bits);
 int __bitmap_and(unsigned long *dst, const unsigned long *bitmap1,
@@ -68,6 +70,48 @@ static inline int bitmap_weight(const unsigned long *src, unsigned int nbits)
 	return __bitmap_weight(src, nbits);
 }
 
+static __always_inline
+int bitmap_weight_cmp(const unsigned long *src, unsigned int nbits, int num)
+{
+	if ((unsigned int)num > nbits)
+		return -num;
+
+	if (small_const_nbits(nbits))
+		return hweight_long(*src & BITMAP_LAST_WORD_MASK(nbits)) - num;
+
+	return __bitmap_weight_cmp(src, nbits, num);
+}
+
+static __always_inline
+bool bitmap_weight_eq(const unsigned long *src, unsigned int nbits, int num)
+{
+	return bitmap_weight_cmp(src, nbits, num) == 0;
+}
+
+static __always_inline
+bool bitmap_weight_gt(const unsigned long *src, unsigned int nbits, int num)
+{
+	return bitmap_weight_cmp(src, nbits, num) > 0;
+}
+
+static __always_inline
+bool bitmap_weight_ge(const unsigned long *src, unsigned int nbits, int num)
+{
+	return bitmap_weight_cmp(src, nbits, num - 1) > 0;
+}
+
+static __always_inline
+bool bitmap_weight_lt(const unsigned long *src, unsigned int nbits, int num)
+{
+	return bitmap_weight_cmp(src, nbits, num - 1) <= 0;
+}
+
+static __always_inline
+bool bitmap_weight_le(const unsigned long *src, unsigned int nbits, int num)
+{
+	return bitmap_weight_cmp(src, nbits, num) <= 0;
+}
+
 static inline void bitmap_or(unsigned long *dst, const unsigned long *src1,
 			     const unsigned long *src2, unsigned int nbits)
 {
diff --git a/tools/lib/bitmap.c b/tools/lib/bitmap.c
index db466ef7be9d..06e58fee8523 100644
--- a/tools/lib/bitmap.c
+++ b/tools/lib/bitmap.c
@@ -18,6 +18,26 @@ int __bitmap_weight(const unsigned long *bitmap, int bits)
 	return w;
 }
 
+int __bitmap_weight_cmp(const unsigned long *bitmap, unsigned int bits, int num)
+{
+	unsigned int k, w, lim = bits / BITS_PER_LONG;
+
+	for (k = 0, w = 0; k < lim; k++) {
+		if (w + bits - k * BITS_PER_LONG < num)
+			goto out;
+
+		w += hweight_long(bitmap[k]);
+
+		if (w > num)
+			goto out;
+	}
+
+	if (bits % BITS_PER_LONG)
+		w += hweight_long(bitmap[k] & BITMAP_LAST_WORD_MASK(bits));
+out:
+	return w - num;
+}
+
 void __bitmap_or(unsigned long *dst, const unsigned long *bitmap1,
 		 const unsigned long *bitmap2, int bits)
 {
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 8dfbba15aeb8..2c26cdd7f9b0 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -1314,7 +1314,7 @@ static int pmu_config_term(const char *pmu_name,
 	 */
 	if (term->type_val == PARSE_EVENTS__TERM_TYPE_NUM) {
 		if (term->no_value &&
-		    bitmap_weight(format->bits, PERF_PMU_FORMAT_BITS) > 1) {
+		    bitmap_weight_gt(format->bits, PERF_PMU_FORMAT_BITS, 1)) {
 			if (err) {
 				parse_events_error__handle(err, term->err_val,
 					   strdup("no value assigned for term"),
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH 49/49] MAINTAINERS: add cpumask and nodemask files to BITMAP_API
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (47 preceding siblings ...)
  2022-02-10 22:49 ` [PATCH 48/49] tools: bitmap: sync bitmap_weight Yury Norov
@ 2022-02-10 22:49 ` Yury Norov
  2022-02-15 23:18 ` [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Will Deacon
  49 siblings, 0 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-10 22:49 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel

The cpumask and nodemask APIs are thin wrappers around the basic bitmap
API, and the corresponding files are not formally maintained. This patch
adds them to the BITMAP_API section, so that bitmap folks would have a
closer look at them.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 MAINTAINERS | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index bc32519e5c02..718ed3b81c8e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3434,10 +3434,14 @@ R:	Andy Shevchenko <andriy.shevchenko@linux.intel.com>
 R:	Rasmus Villemoes <linux@rasmusvillemoes.dk>
 S:	Maintained
 F:	include/linux/bitmap.h
+F:	include/linux/cpumask.h
 F:	include/linux/find.h
+F:	include/linux/nodemask.h
 F:	lib/bitmap.c
+F:	lib/cpumask.c
 F:	lib/find_bit.c
 F:	lib/find_bit_benchmark.c
+F:	lib/nodemask.c
 F:	lib/test_bitmap.c
 F:	tools/include/linux/bitmap.h
 F:	tools/include/linux/find.h
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 98+ messages in thread

* Re: [PATCH 08/49] drm: replace bitmap_weight with bitmap_empty where appropriate
  2022-02-10 22:48 ` [PATCH 08/49] drm: " Yury Norov
@ 2022-02-11  2:11   ` Dmitry Baryshkov
  0 siblings, 0 replies; 98+ messages in thread
From: Dmitry Baryshkov @ 2022-02-11  2:11 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Rob Clark, Sean Paul, Abhinav Kumar, David Airlie, Daniel Vetter,
	linux-arm-msm, dri-devel, freedreno

On Fri, 11 Feb 2022 at 02:09, Yury Norov <yury.norov@gmail.com> wrote:
>
> smp_request_block() in drivers/gpu/drm/msm/disp/mdp5/mdp5_smp.c calls
> bitmap_weight() to check if any bit of a given bitmap is set. It's
> better to use bitmap_empty() in that case because bitmap_empty() stops
> traversing the bitmap as soon as it finds first set bit, while
> bitmap_weight() counts all bits unconditionally.
>
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>  drivers/gpu/drm/msm/disp/mdp5/mdp5_smp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_smp.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_smp.c
> index d7fa2c49e741..56a3063545ec 100644
> --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_smp.c
> +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_smp.c
> @@ -68,7 +68,7 @@ static int smp_request_block(struct mdp5_smp *smp,
>         uint8_t reserved;
>
>         /* we shouldn't be requesting blocks for an in-use client: */
> -       WARN_ON(bitmap_weight(cs, cnt) > 0);
> +       WARN_ON(!bitmap_empty(cs, cnt));

Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>

>
>         reserved = smp->reserved[cid];
>
> --
> 2.32.0
>


-- 
With best wishes
Dmitry

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH 38/49] arch/powerpc: replace cpumask_weight with cpumask_weight_{eq, ...} where appropriate
  2022-02-10 22:49 ` [PATCH 38/49] arch/powerpc: " Yury Norov
@ 2022-02-11  4:10   ` Michael Ellerman
  0 siblings, 0 replies; 98+ messages in thread
From: Michael Ellerman @ 2022-02-11  4:10 UTC (permalink / raw)
  To: Yury Norov, Yury Norov, Andy Shevchenko, Rasmus Villemoes,
	Andrew Morton, Michał Mirosław, Greg Kroah-Hartman,
	Peter Zijlstra, David Laight, Joe Perches, Dennis Zhou,
	Emil Renner Berthing, Nicholas Piggin, Matti Vaittinen,
	Alexey Klimov, linux-kernel, Benjamin Herrenschmidt,
	Paul Mackerras, Srikar Dronamraju, Gautham R. Shenoy,
	Valentin Schneider, Parth Shah, Cédric Le Goater,
	Hari Bathini, Rob Herring, Laurent Dufour, Petr Mladek,
	John Ogness, Sudeep Holla, Christophe Leroy, Naveen N. Rao,
	Xiongwei Song, Arnd Bergmann, linuxppc-dev

Yury Norov <yury.norov@gmail.com> writes:
> PowerPC code uses cpumask_weight() to compare the weight of cpumask with
> a given number. We can do it more efficiently with cpumask_weight_{eq, ...}
> because conditional cpumask_weight may stop traversing the cpumask earlier,
> as soon as condition is (or can't be)  met.
>
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>  arch/powerpc/kernel/smp.c      | 2 +-
>  arch/powerpc/kernel/watchdog.c | 2 +-
>  arch/powerpc/xmon/xmon.c       | 4 ++--
>  3 files changed, 4 insertions(+), 4 deletions(-)

Acked-by: Michael Ellerman <mpe@ellerman.id.au>

cheers

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH 17/49] cpufreq: replace cpumask_weight with cpumask_empty where appropriate
  2022-02-10 22:49 ` [PATCH 17/49] cpufreq: replace cpumask_weight with cpumask_empty " Yury Norov
@ 2022-02-11  4:30   ` Viresh Kumar
  2022-02-11  5:17     ` Yury Norov
  0 siblings, 1 reply; 98+ messages in thread
From: Viresh Kumar @ 2022-02-11  4:30 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Andy Gross, Bjorn Andersson, Rafael J. Wysocki, Sudeep Holla,
	Cristian Marussi, linux-arm-msm, linux-pm, linux-arm-kernel

On 10-02-22, 14:49, Yury Norov wrote:
> drivers/cpufreq calls cpumask_weight() to check if any bit of a given
> cpumask is set. We can do it more efficiently with cpumask_empty() because
> cpumask_empty() stops traversing the cpumask as soon as it finds first set
> bit, while cpumask_weight() counts all bits unconditionally.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> (for SCMI cpufreq driver)
> ---
>  drivers/cpufreq/qcom-cpufreq-hw.c | 2 +-
>  drivers/cpufreq/scmi-cpufreq.c    | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)

I already applied it yesterday and replied to you as well. Did I miss
something ?

-- 
viresh

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH 17/49] cpufreq: replace cpumask_weight with cpumask_empty where appropriate
  2022-02-11  4:30   ` Viresh Kumar
@ 2022-02-11  5:17     ` Yury Norov
  0 siblings, 0 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-11  5:17 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Andy Gross, Bjorn Andersson, Rafael J. Wysocki, Sudeep Holla,
	Cristian Marussi, linux-arm-msm, linux-pm, linux-arm-kernel

On Fri, Feb 11, 2022 at 10:00:57AM +0530, Viresh Kumar wrote:
> On 10-02-22, 14:49, Yury Norov wrote:
> > drivers/cpufreq calls cpumask_weight() to check if any bit of a given
> > cpumask is set. We can do it more efficiently with cpumask_empty() because
> > cpumask_empty() stops traversing the cpumask as soon as it finds first set
> > bit, while cpumask_weight() counts all bits unconditionally.
> > 
> > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> (for SCMI cpufreq driver)
> > ---
> >  drivers/cpufreq/qcom-cpufreq-hw.c | 2 +-
> >  drivers/cpufreq/scmi-cpufreq.c    | 2 +-
> >  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> I already applied it yesterday and replied to you as well. Did I miss
> something ?

It appeared in next today after I prepared this series, that's why it
slipped through. Sorry for that. Please ignore this patch.

Thanks,
Yury

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH 39/49] arch/s390: replace cpumask_weight with cpumask_weight_eq where appropriate
  2022-02-10 22:49 ` [PATCH 39/49] arch/s390: replace cpumask_weight with cpumask_weight_eq " Yury Norov
@ 2022-02-11  6:54   ` Sven Schnelle
  2022-02-11 23:40     ` Yury Norov
  0 siblings, 1 reply; 98+ messages in thread
From: Sven Schnelle @ 2022-02-11  6:54 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Heiko Carstens, Vasily Gorbik, Christian Borntraeger,
	Alexander Gordeev, Thomas Richter, Sumanth Korikkar,
	Sebastian Andrzej Siewior, Jiapeng Chong, kernel test robot,
	linux-s390

Hi Yury,

Yury Norov <yury.norov@gmail.com> writes:

> cfset_all_start() calls cpumask_weight() to compare the weight of cpumask
> with a given number. We can do it more efficiently with
> cpumask_weight_{eq, ...} because conditional cpumask_weight may stop
> traversing the cpumask earlier, as soon as condition is (or can't be) met.
>
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>  arch/s390/kernel/perf_cpum_cf.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/s390/kernel/perf_cpum_cf.c b/arch/s390/kernel/perf_cpum_cf.c
> index ee8707abdb6a..4d217f7f5ccf 100644
> --- a/arch/s390/kernel/perf_cpum_cf.c
> +++ b/arch/s390/kernel/perf_cpum_cf.c
> @@ -975,7 +975,7 @@ static int cfset_all_start(struct cfset_request *req)
>  		return -ENOMEM;
>  	cpumask_and(mask, &req->mask, cpu_online_mask);
>  	on_each_cpu_mask(mask, cfset_ioctl_on, &p, 1);
> -	if (atomic_read(&p.cpus_ack) != cpumask_weight(mask)) {
> +	if (!cpumask_weight_eq(mask, atomic_read(&p.cpus_ack))) {
>  		on_each_cpu_mask(mask, cfset_ioctl_off, &p, 1);
>  		rc = -EIO;
>  		debug_sprintf_event(cf_dbg, 4, "%s CPUs missing", __func__);

given that you're adding a bunch of these functions - gt,lt,eq and
others, i wonder whether it makes sense to also add cpumask_weight_ne(),
so one could just write:

if (cpumask_weight_ne(mask, atomic_read(&p.cpus_ack))) {
	...
}

?
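
A minimal sketch of how such a helper could sit next to the others, assuming
every wrapper is derived from one three-way compare. The word_weight_*()
names are hypothetical single-word stand-ins, not functions from this
series, and __builtin_popcountl() assumes GCC or Clang:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for bitmap_weight_cmp(), limited to one word. */
static inline int word_weight_cmp(unsigned long word, int num)
{
	return __builtin_popcountl(word) - num;
}

/* Existing style of wrapper: weight == num. */
static inline bool word_weight_eq(unsigned long word, int num)
{
	return word_weight_cmp(word, num) == 0;
}

/* The suggested addition: weight != num, so callers avoid "!..._eq()". */
static inline bool word_weight_ne(unsigned long word, int num)
{
	return word_weight_cmp(word, num) != 0;
}

/* Usage demo: 0x7 has three bits set. */
static inline void word_weight_ne_demo(void)
{
	assert(word_weight_ne(0x7UL, 2));
	assert(!word_weight_ne(0x7UL, 3));
	assert(word_weight_eq(0x7UL, 3));
}
```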

/Sven

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH 04/49] iio: fix opencoded for_each_set_bit()
  2022-02-10 22:48 ` [PATCH 04/49] iio: fix opencoded for_each_set_bit() Yury Norov
@ 2022-02-11  8:45   ` Andy Shevchenko
  2022-02-11 17:17   ` Christophe JAILLET
  1 sibling, 0 replies; 98+ messages in thread
From: Andy Shevchenko @ 2022-02-11  8:45 UTC (permalink / raw)
  To: Yury Norov
  Cc: Rasmus Villemoes, Andrew Morton, Michał Mirosław,
	Greg Kroah-Hartman, Peter Zijlstra, David Laight, Joe Perches,
	Dennis Zhou, Emil Renner Berthing, Nicholas Piggin,
	Matti Vaittinen, Alexey Klimov, linux-kernel, Jonathan Cameron,
	Lars-Peter Clausen, Alexandru Ardelean, Nathan Chancellor,
	linux-iio

On Thu, Feb 10, 2022 at 02:48:48PM -0800, Yury Norov wrote:
> iio_simple_dummy_trigger_h() is mostly an opencoded for_each_set_bit().
> Using for_each_set_bit() make code much cleaner, and more effective.

I would wait for some testing, but from code perspective looks good.
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>  drivers/iio/dummy/iio_simple_dummy_buffer.c | 48 ++++++++-------------
>  1 file changed, 19 insertions(+), 29 deletions(-)
> 
> diff --git a/drivers/iio/dummy/iio_simple_dummy_buffer.c b/drivers/iio/dummy/iio_simple_dummy_buffer.c
> index d81c2b2dad82..3bc1b7529e2a 100644
> --- a/drivers/iio/dummy/iio_simple_dummy_buffer.c
> +++ b/drivers/iio/dummy/iio_simple_dummy_buffer.c
> @@ -45,41 +45,31 @@ static irqreturn_t iio_simple_dummy_trigger_h(int irq, void *p)
>  {
>  	struct iio_poll_func *pf = p;
>  	struct iio_dev *indio_dev = pf->indio_dev;
> +	int i = 0, j;
>  	u16 *data;
>  
>  	data = kmalloc(indio_dev->scan_bytes, GFP_KERNEL);
>  	if (!data)
>  		goto done;
>  
> -	if (!bitmap_empty(indio_dev->active_scan_mask, indio_dev->masklength)) {
> -		/*
> -		 * Three common options here:
> -		 * hardware scans: certain combinations of channels make
> -		 *   up a fast read.  The capture will consist of all of them.
> -		 *   Hence we just call the grab data function and fill the
> -		 *   buffer without processing.
> -		 * software scans: can be considered to be random access
> -		 *   so efficient reading is just a case of minimal bus
> -		 *   transactions.
> -		 * software culled hardware scans:
> -		 *   occasionally a driver may process the nearest hardware
> -		 *   scan to avoid storing elements that are not desired. This
> -		 *   is the fiddliest option by far.
> -		 * Here let's pretend we have random access. And the values are
> -		 * in the constant table fakedata.
> -		 */
> -		int i, j;
> -
> -		for (i = 0, j = 0;
> -		     i < bitmap_weight(indio_dev->active_scan_mask,
> -				       indio_dev->masklength);
> -		     i++, j++) {
> -			j = find_next_bit(indio_dev->active_scan_mask,
> -					  indio_dev->masklength, j);
> -			/* random access read from the 'device' */
> -			data[i] = fakedata[j];
> -		}
> -	}
> +	/*
> +	 * Three common options here:
> +	 * hardware scans: certain combinations of channels make
> +	 *   up a fast read.  The capture will consist of all of them.
> +	 *   Hence we just call the grab data function and fill the
> +	 *   buffer without processing.
> +	 * software scans: can be considered to be random access
> +	 *   so efficient reading is just a case of minimal bus
> +	 *   transactions.
> +	 * software culled hardware scans:
> +	 *   occasionally a driver may process the nearest hardware
> +	 *   scan to avoid storing elements that are not desired. This
> +	 *   is the fiddliest option by far.
> +	 * Here let's pretend we have random access. And the values are
> +	 * in the constant table fakedata.
> +	 */
> +	for_each_set_bit(j, indio_dev->active_scan_mask, indio_dev->masklength)
> +		data[i++] = fakedata[j];
>  
>  	iio_push_to_buffers_with_timestamp(indio_dev, data,
>  					   iio_get_time_ns(indio_dev));
> -- 
> 2.32.0
> 

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [RFC PATCH 05/49] qed: rework qed_rdma_bmap_free()
  2022-02-10 22:48 ` [RFC PATCH 05/49] qed: rework qed_rdma_bmap_free() Yury Norov
@ 2022-02-11  8:48   ` Andy Shevchenko
  0 siblings, 0 replies; 98+ messages in thread
From: Andy Shevchenko @ 2022-02-11  8:48 UTC (permalink / raw)
  To: Yury Norov
  Cc: Rasmus Villemoes, Andrew Morton, Michał Mirosław,
	Greg Kroah-Hartman, Peter Zijlstra, David Laight, Joe Perches,
	Dennis Zhou, Emil Renner Berthing, Nicholas Piggin,
	Matti Vaittinen, Alexey Klimov, linux-kernel, Ariel Elior,
	Manish Chopra, David S. Miller, Jakub Kicinski, netdev

On Thu, Feb 10, 2022 at 02:48:49PM -0800, Yury Norov wrote:
> qed_rdma_bmap_free() is mostly an opencoded version of printk("%*pb").
> Using %*pb format simplifies the code, and helps to avoid inefficient
> usage of bitmap_weight().
> 
> While here, reorganize logic to avoid calculating bmap weight if check
> is false.

I like this kind of patches,
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
> 
> This is RFC because it changes lines printing format to bitmap %*pb. If
> it hurts userspace, it's better to drop the patch.

How? The only way is some strange script that parses dmesg, but dmesg was
almost never an ABI. Moreover, with the recently introduced printk()
indexing feature, whoever parses such messages can actually find the (new)
format as well.

>  drivers/net/ethernet/qlogic/qed/qed_rdma.c | 45 +++++++---------------
>  1 file changed, 14 insertions(+), 31 deletions(-)
> 
> diff --git a/drivers/net/ethernet/qlogic/qed/qed_rdma.c b/drivers/net/ethernet/qlogic/qed/qed_rdma.c
> index 23b668de4640..f4c04af9d4dd 100644
> --- a/drivers/net/ethernet/qlogic/qed/qed_rdma.c
> +++ b/drivers/net/ethernet/qlogic/qed/qed_rdma.c
> @@ -319,44 +319,27 @@ static int qed_rdma_alloc(struct qed_hwfn *p_hwfn)
>  void qed_rdma_bmap_free(struct qed_hwfn *p_hwfn,
>  			struct qed_bmap *bmap, bool check)
>  {
> -	int weight = bitmap_weight(bmap->bitmap, bmap->max_count);
> -	int last_line = bmap->max_count / (64 * 8);
> -	int last_item = last_line * 8 +
> -	    DIV_ROUND_UP(bmap->max_count % (64 * 8), 64);
> -	u64 *pmap = (u64 *)bmap->bitmap;
> -	int line, item, offset;
> -	u8 str_last_line[200] = { 0 };
> -
> -	if (!weight || !check)
> +	unsigned int bit, weight, nbits;
> +	unsigned long *b;
> +
> +	if (!check)
> +		goto end;
> +
> +	weight = bitmap_weight(bmap->bitmap, bmap->max_count);
> +	if (!weight)
>  		goto end;
>  
>  	DP_NOTICE(p_hwfn,
>  		  "%s bitmap not free - size=%d, weight=%d, 512 bits per line\n",
>  		  bmap->name, bmap->max_count, weight);
>  
> -	/* print aligned non-zero lines, if any */
> -	for (item = 0, line = 0; line < last_line; line++, item += 8)
> -		if (bitmap_weight((unsigned long *)&pmap[item], 64 * 8))
> +	for (bit = 0; bit < bmap->max_count; bit += 512) {
> +		b =  bmap->bitmap + BITS_TO_LONGS(bit);
> +		nbits = min(bmap->max_count - bit, 512);
> +
> +		if (!bitmap_empty(b, nbits))
>  			DP_NOTICE(p_hwfn,
> -				  "line 0x%04x: 0x%016llx 0x%016llx 0x%016llx 0x%016llx 0x%016llx 0x%016llx 0x%016llx 0x%016llx\n",
> -				  line,
> -				  pmap[item],
> -				  pmap[item + 1],
> -				  pmap[item + 2],
> -				  pmap[item + 3],
> -				  pmap[item + 4],
> -				  pmap[item + 5],
> -				  pmap[item + 6], pmap[item + 7]);
> -
> -	/* print last unaligned non-zero line, if any */
> -	if ((bmap->max_count % (64 * 8)) &&
> -	    (bitmap_weight((unsigned long *)&pmap[item],
> -			   bmap->max_count - item * 64))) {
> -		offset = sprintf(str_last_line, "line 0x%04x: ", line);
> -		for (; item < last_item; item++)
> -			offset += sprintf(str_last_line + offset,
> -					  "0x%016llx ", pmap[item]);
> -		DP_NOTICE(p_hwfn, "%s\n", str_last_line);
> +				  "line 0x%04x: %*pb\n", bit / 512, nbits, b);
>  	}
>  
>  end:
> -- 
> 2.32.0
> 

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 98+ messages in thread

* RE: [PATCH 03/49] net: mellanox: fix open-coded for_each_set_bit()
  2022-02-10 22:48 ` [PATCH 03/49] net: mellanox: fix open-coded for_each_set_bit() Yury Norov
@ 2022-02-11  9:01   ` David Laight
  0 siblings, 0 replies; 98+ messages in thread
From: David Laight @ 2022-02-11  9:01 UTC (permalink / raw)
  To: 'Yury Norov',
	Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	Joe Perches, Dennis Zhou, Emil Renner Berthing, Nicholas Piggin,
	Matti Vaittinen, Alexey Klimov, linux-kernel, Rafael J. Wysocki,
	Daniel Lezcano, Amit Kucheria, Zhang Rui,
	Sebastian Andrzej Siewior, Christophe JAILLET, Rikard Falkeborn,
	linux-pm
  Cc: Tariq Toukan

From: Yury Norov
> Sent: 10 February 2022 22:49
> 
> Mellanox driver has an open-coded for_each_set_bit(). Fix it.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
> ---
>  drivers/net/ethernet/mellanox/mlx4/cmd.c | 23 ++++++-----------------
>  1 file changed, 6 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx4/cmd.c b/drivers/net/ethernet/mellanox/mlx4/cmd.c
> index e10b7b04b894..c56d2194cbfc 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/cmd.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/cmd.c
> @@ -1994,21 +1994,16 @@ static void mlx4_allocate_port_vpps(struct mlx4_dev *dev, int port)
> 
>  static int mlx4_master_activate_admin_state(struct mlx4_priv *priv, int slave)
>  {
> -	int port, err;
> +	int p, port, err;
>  	struct mlx4_vport_state *vp_admin;
>  	struct mlx4_vport_oper_state *vp_oper;
>  	struct mlx4_slave_state *slave_state =
>  		&priv->mfunc.master.slave_state[slave];
>  	struct mlx4_active_ports actv_ports = mlx4_get_active_ports(
>  			&priv->dev, slave);
> -	int min_port = find_first_bit(actv_ports.ports,
> -				      priv->dev.caps.num_ports) + 1;
> -	int max_port = min_port - 1 +
> -		bitmap_weight(actv_ports.ports, priv->dev.caps.num_ports);
> 
> -	for (port = min_port; port <= max_port; port++) {
> -		if (!test_bit(port - 1, actv_ports.ports))
> -			continue;
> +	for_each_set_bit(p, actv_ports.ports, priv->dev.caps.num_ports) {
> +		port = p + 1;

This is an 'interesting' change in behaviour, and looks like a bug fix.
Did anyone actually test the old code?

	David
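
[Editor's sketch] The behaviour change noted above can be reproduced in
userspace. This is a minimal illustration, not the mlx4 code: the helper
names (test_bit_ul, visit_old, visit_new) are hypothetical, and the bitmap
is assumed to fit in a single unsigned long. The old loop walks the range
[first set bit, first set bit + weight), which only covers every set bit
when the set bits are contiguous; iterating over the set bits directly
does not depend on that.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's test_bit(), one word only. */
static bool test_bit_ul(unsigned int bit, unsigned long map)
{
	return (map >> bit) & 1UL;
}

/*
 * Old logic: find the first set bit and the weight, then test only the
 * bits in [first, first + weight). Returns how many set bits get visited.
 */
static unsigned int visit_old(unsigned long map, unsigned int nbits)
{
	unsigned int first = nbits, weight = 0, visited = 0, bit;

	for (bit = 0; bit < nbits; bit++) {
		if (!test_bit_ul(bit, map))
			continue;
		if (first == nbits)
			first = bit;
		weight++;
	}

	for (bit = first; bit < first + weight; bit++)
		if (test_bit_ul(bit, map))
			visited++;

	return visited;
}

/* New logic: visit every set bit, as a for_each_set_bit() loop would. */
static unsigned int visit_new(unsigned long map, unsigned int nbits)
{
	unsigned int visited = 0, bit;

	for (bit = 0; bit < nbits; bit++)
		if (test_bit_ul(bit, map))
			visited++;

	return visited;
}
```

With a mask of 0x5 (bits 0 and 2 set) and nbits = 4, visit_old() only
inspects bits 0 and 1 and so visits one set bit, while visit_new() visits
both. For a contiguous mask such as 0x3 the two agree.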



^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH 40/49] firmware: pcsi: replace cpumask_weight with cpumask_weight_eq
  2022-02-10 22:49 ` [PATCH 40/49] firmware: pcsi: replace cpumask_weight with cpumask_weight_eq Yury Norov
@ 2022-02-11  9:45   ` Sudeep Holla
  2022-02-11 10:32   ` Mark Rutland
  1 sibling, 0 replies; 98+ messages in thread
From: Sudeep Holla @ 2022-02-11  9:45 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Sudeep Holla, Joe Perches, Dennis Zhou,
	Emil Renner Berthing, Nicholas Piggin, Matti Vaittinen,
	Alexey Klimov, linux-kernel, Mark Rutland, Lorenzo Pieralisi,
	linux-arm-kernel

On Thu, Feb 10, 2022 at 02:49:24PM -0800, Yury Norov wrote:
> down_and_up_cpus() calls cpumask_weight() to compare the weight of
> cpumask with a given number. We can do it more efficiently with
> cpumask_weight_{eq, ...} because conditional cpumask_weight may stop
> traversing the cpumask earlier, as soon as condition is (or can't be) met.
>

Nit: s/pcsi/psci/ in $subject. With that fixed,

Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>

-- 
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH 22/49] sched: replace cpumask_weight with cpumask_empty where appropriate
  2022-02-10 22:49 ` [PATCH 22/49] sched: replace cpumask_weight with cpumask_empty " Yury Norov
@ 2022-02-11 10:19   ` Peter Zijlstra
  2022-02-11 14:19     ` Yury Norov
  2022-02-17 18:56   ` [tip: sched/core] " tip-bot2 for Yury Norov
  1 sibling, 1 reply; 98+ messages in thread
From: Peter Zijlstra @ 2022-02-11 10:19 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, David Laight,
	Joe Perches, Dennis Zhou, Emil Renner Berthing, Nicholas Piggin,
	Matti Vaittinen, Alexey Klimov, linux-kernel, Ingo Molnar,
	Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
	Ben Segall, Mel Gorman, Daniel Bristot de Oliveira

On Thu, Feb 10, 2022 at 02:49:06PM -0800, Yury Norov wrote:
> In some places, kernel/sched code calls cpumask_weight() to check if
> any bit of a given cpumask is set. We can do it more efficiently with
> cpumask_empty() because cpumask_empty() stops traversing the cpumask as
> soon as it finds first set bit, while cpumask_weight() counts all bits
> unconditionally.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>

Neither of these paths is really performance sensitive, but whatever.

Do you want me to take this now, or do you want to merge the whole
series somewhere else? In which case:

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

> ---
>  kernel/sched/core.c     | 2 +-
>  kernel/sched/topology.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 28d1b7af03dc..ed7b392945b7 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -8711,7 +8711,7 @@ int cpuset_cpumask_can_shrink(const struct cpumask *cur,
>  {
>  	int ret = 1;
>  
> -	if (!cpumask_weight(cur))
> +	if (cpumask_empty(cur))
>  		return ret;
>  
>  	ret = dl_cpuset_cpumask_can_shrink(cur, trial);
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index d201a7052a29..8478e2a8cd65 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -74,7 +74,7 @@ static int sched_domain_debug_one(struct sched_domain *sd, int cpu, int level,
>  			break;
>  		}
>  
> -		if (!cpumask_weight(sched_group_span(group))) {
> +		if (cpumask_empty(sched_group_span(group))) {
>  			printk(KERN_CONT "\n");
>  			printk(KERN_ERR "ERROR: empty group\n");
>  			break;
> -- 
> 2.32.0
> 

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH 12/49] perf: replace bitmap_weight with bitmap_empty where appropriate
  2022-02-10 22:48 ` [PATCH 12/49] perf: replace bitmap_weight with bitmap_empty where appropriate Yury Norov
@ 2022-02-11 10:25   ` Mark Rutland
  2022-02-11 17:59     ` Yury Norov
  2022-02-11 17:27   ` Christophe JAILLET
  1 sibling, 1 reply; 98+ messages in thread
From: Mark Rutland @ 2022-02-11 10:25 UTC (permalink / raw)
  To: Yury Norov, Will Deacon
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Shaokun Zhang, Qi Liu, Khuong Dinh, linux-arm-kernel

Hi Yury,

On Thu, Feb 10, 2022 at 02:48:56PM -0800, Yury Norov wrote:
> In some places, drivers/perf code calls bitmap_weight() to check if any
> bit of a given bitmap is set. It's better to use bitmap_empty() in that
> case because bitmap_empty() stops traversing the bitmap as soon as it
> finds first set bit, while bitmap_weight() counts all bits unconditionally.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>

This looks like a nice semantic cleanup to me, so FWIW:

Acked-by: Mark Rutland <mark.rutland@arm.com>

How are you expecting to queue all of this? Should Will and I pick this patch?

Thanks,
Mark.

> ---
>  drivers/perf/arm-cci.c                   | 2 +-
>  drivers/perf/arm_pmu.c                   | 4 ++--
>  drivers/perf/hisilicon/hisi_uncore_pmu.c | 2 +-
>  drivers/perf/xgene_pmu.c                 | 2 +-
>  4 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/perf/arm-cci.c b/drivers/perf/arm-cci.c
> index 54aca3a62814..96e09fa40909 100644
> --- a/drivers/perf/arm-cci.c
> +++ b/drivers/perf/arm-cci.c
> @@ -1096,7 +1096,7 @@ static void cci_pmu_enable(struct pmu *pmu)
>  {
>  	struct cci_pmu *cci_pmu = to_cci_pmu(pmu);
>  	struct cci_pmu_hw_events *hw_events = &cci_pmu->hw_events;
> -	int enabled = bitmap_weight(hw_events->used_mask, cci_pmu->num_cntrs);
> +	bool enabled = !bitmap_empty(hw_events->used_mask, cci_pmu->num_cntrs);
>  	unsigned long flags;
>  
>  	if (!enabled)
> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> index 295cc7952d0e..a31b302b0ade 100644
> --- a/drivers/perf/arm_pmu.c
> +++ b/drivers/perf/arm_pmu.c
> @@ -524,7 +524,7 @@ static void armpmu_enable(struct pmu *pmu)
>  {
>  	struct arm_pmu *armpmu = to_arm_pmu(pmu);
>  	struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events);
> -	int enabled = bitmap_weight(hw_events->used_mask, armpmu->num_events);
> +	bool enabled = !bitmap_empty(hw_events->used_mask, armpmu->num_events);
>  
>  	/* For task-bound events we may be called on other CPUs */
>  	if (!cpumask_test_cpu(smp_processor_id(), &armpmu->supported_cpus))
> @@ -785,7 +785,7 @@ static int cpu_pm_pmu_notify(struct notifier_block *b, unsigned long cmd,
>  {
>  	struct arm_pmu *armpmu = container_of(b, struct arm_pmu, cpu_pm_nb);
>  	struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events);
> -	int enabled = bitmap_weight(hw_events->used_mask, armpmu->num_events);
> +	bool enabled = !bitmap_empty(hw_events->used_mask, armpmu->num_events);
>  
>  	if (!cpumask_test_cpu(smp_processor_id(), &armpmu->supported_cpus))
>  		return NOTIFY_DONE;
> diff --git a/drivers/perf/hisilicon/hisi_uncore_pmu.c b/drivers/perf/hisilicon/hisi_uncore_pmu.c
> index a738aeab5c04..358e4e284a62 100644
> --- a/drivers/perf/hisilicon/hisi_uncore_pmu.c
> +++ b/drivers/perf/hisilicon/hisi_uncore_pmu.c
> @@ -393,7 +393,7 @@ EXPORT_SYMBOL_GPL(hisi_uncore_pmu_read);
>  void hisi_uncore_pmu_enable(struct pmu *pmu)
>  {
>  	struct hisi_pmu *hisi_pmu = to_hisi_pmu(pmu);
> -	int enabled = bitmap_weight(hisi_pmu->pmu_events.used_mask,
> +	bool enabled = !bitmap_empty(hisi_pmu->pmu_events.used_mask,
>  				    hisi_pmu->num_counters);
>  
>  	if (!enabled)
> diff --git a/drivers/perf/xgene_pmu.c b/drivers/perf/xgene_pmu.c
> index 5283608dc055..0c32dffc7ede 100644
> --- a/drivers/perf/xgene_pmu.c
> +++ b/drivers/perf/xgene_pmu.c
> @@ -867,7 +867,7 @@ static void xgene_perf_pmu_enable(struct pmu *pmu)
>  {
>  	struct xgene_pmu_dev *pmu_dev = to_pmu_dev(pmu);
>  	struct xgene_pmu *xgene_pmu = pmu_dev->parent;
> -	int enabled = bitmap_weight(pmu_dev->cntr_assign_mask,
> +	bool enabled = !bitmap_empty(pmu_dev->cntr_assign_mask,
>  			pmu_dev->max_counters);
>  
>  	if (!enabled)
> -- 
> 2.32.0
> 
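
[Editor's sketch] The difference the commit message describes can be shown
with a simplified userspace model. This is not the kernel implementation:
sketch_weight() and sketch_empty() are hypothetical names, and the bitmap
is assumed to be a whole number of unsigned long words (the kernel helpers
also handle a partial last word). The weight loop has to read every word,
while the emptiness check can return at the first non-zero word.

```c
#include <stdbool.h>
#include <stddef.h>

/* Counts every set bit; always reads all nwords words. */
static unsigned int sketch_weight(const unsigned long *map, size_t nwords)
{
	unsigned int w = 0;
	size_t i;

	for (i = 0; i < nwords; i++) {
		unsigned long word = map[i];

		while (word) {
			word &= word - 1;	/* clear the lowest set bit */
			w++;
		}
	}

	return w;
}

/* Returns false as soon as any non-zero word is found. */
static bool sketch_empty(const unsigned long *map, size_t nwords)
{
	size_t i;

	for (i = 0; i < nwords; i++)
		if (map[i])
			return false;

	return true;
}
```

A caller that only needs a yes/no answer, such as the "enabled" flags in
the patch above, can use !sketch_empty(map, n) in place of
sketch_weight(map, n) != 0 and get the same result.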

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH 33/49] perf: replace bitmap_weight with bitmap_weight_eq for ThunderX2
  2022-02-10 22:49 ` [PATCH 33/49] perf: replace bitmap_weight with bitmap_weight_eq for ThunderX2 Yury Norov
@ 2022-02-11 10:30   ` Mark Rutland
  0 siblings, 0 replies; 98+ messages in thread
From: Mark Rutland @ 2022-02-11 10:30 UTC (permalink / raw)
  To: Yury Norov, Will Deacon
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	linux-arm-kernel

On Thu, Feb 10, 2022 at 02:49:17PM -0800, Yury Norov wrote:
> tx2_uncore_event_start() calls bitmap_weight() to compare the weight
> of bitmap with a given number. We can do it more efficiently with
> bitmap_weight_eq because conditional bitmap_weight may stop traversing
> the bitmap earlier, as soon as condition is (or can't be) met.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>

Given the max counters value is either 4 or 8 I doubt this should matter, but
for consistency this is fine, so:

Acked-by: Mark Rutland <mark.rutland@arm.com>

I now see bitmap_weight_eq() is introduced within this series, so I assume you
need to queue that and its users together, and will want to take the prior
drivers/perf/ bit together with that.

Thanks,
Mark.

> ---
>  drivers/perf/thunderx2_pmu.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/perf/thunderx2_pmu.c b/drivers/perf/thunderx2_pmu.c
> index 1edb9c03704f..97d5b39778fa 100644
> --- a/drivers/perf/thunderx2_pmu.c
> +++ b/drivers/perf/thunderx2_pmu.c
> @@ -623,8 +623,8 @@ static void tx2_uncore_event_start(struct perf_event *event, int flags)
>  		return;
>  
>  	/* Start timer for first event */
> -	if (bitmap_weight(tx2_pmu->active_counters,
> -				tx2_pmu->max_counters) == 1) {
> +	if (bitmap_weight_eq(tx2_pmu->active_counters,
> +				tx2_pmu->max_counters, 1)) {
>  		hrtimer_start(&tx2_pmu->hrtimer,
>  			ns_to_ktime(tx2_pmu->hrtimer_interval),
>  			HRTIMER_MODE_REL_PINNED);
> -- 
> 2.32.0
> 
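
[Editor's sketch] One way a conditional weight check can stop early is
shown below. This is an illustration under stated assumptions, not the
bitmap_weight_eq() added by the series: sketch_weight_eq() is a
hypothetical name, the bitmap is assumed to be a whole number of words,
and only the "count already exceeds the target" exit is modelled.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns true if exactly num bits are set. Bails out as soon as the
 * running count exceeds num, because equality can then no longer hold.
 */
static bool sketch_weight_eq(const unsigned long *map, size_t nwords,
			     unsigned int num)
{
	unsigned int w = 0;
	size_t i;

	for (i = 0; i < nwords; i++) {
		unsigned long word = map[i];

		while (word) {
			word &= word - 1;	/* clear the lowest set bit */
			if (++w > num)
				return false;
		}
	}

	return w == num;
}
```

For a bitmap of only 4 or 8 bits, as in the ThunderX2 case discussed
above, everything fits in one word and the early exit saves very little,
which matches the reviewer's remark.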

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH 40/49] firmware: pcsi: replace cpumask_weight with cpumask_weight_eq
  2022-02-10 22:49 ` [PATCH 40/49] firmware: pcsi: replace cpumask_weight with cpumask_weight_eq Yury Norov
  2022-02-11  9:45   ` Sudeep Holla
@ 2022-02-11 10:32   ` Mark Rutland
  1 sibling, 0 replies; 98+ messages in thread
From: Mark Rutland @ 2022-02-11 10:32 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Lorenzo Pieralisi, linux-arm-kernel

On Thu, Feb 10, 2022 at 02:49:24PM -0800, Yury Norov wrote:
> down_and_up_cpus() calls cpumask_weight() to compare the weight of
> cpumask with a given number. We can do it more efficiently with
> cpumask_weight_{eq, ...} because conditional cpumask_weight may stop
> traversing the cpumask earlier, as soon as condition is (or can't be) met.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>

With the 'pcsi' typo fixed:

Acked-by: Mark Rutland <mark.rutland@arm.com>

Mark.

> ---
>  drivers/firmware/psci/psci_checker.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/firmware/psci/psci_checker.c b/drivers/firmware/psci/psci_checker.c
> index 116eb465cdb4..90c9473832a9 100644
> --- a/drivers/firmware/psci/psci_checker.c
> +++ b/drivers/firmware/psci/psci_checker.c
> @@ -90,7 +90,7 @@ static unsigned int down_and_up_cpus(const struct cpumask *cpus,
>  		 * cpu_down() checks the number of online CPUs before the TOS
>  		 * resident CPU.
>  		 */
> -		if (cpumask_weight(offlined_cpus) + 1 == nb_available_cpus) {
> +		if (cpumask_weight_eq(offlined_cpus, nb_available_cpus - 1)) {
>  			if (ret != -EBUSY) {
>  				pr_err("Unexpected return code %d while trying "
>  				       "to power down last online CPU %d\n",
> -- 
> 2.32.0
> 

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH 24/49] mm/vmstat: replace cpumask_weight with cpumask_empty where appropriate
  2022-02-10 22:49 ` [PATCH 24/49] mm/vmstat: replace cpumask_weight with cpumask_empty where appropriate Yury Norov
@ 2022-02-11 10:39   ` Mike Rapoport
  0 siblings, 0 replies; 98+ messages in thread
From: Mike Rapoport @ 2022-02-11 10:39 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	linux-mm

On Thu, Feb 10, 2022 at 02:49:08PM -0800, Yury Norov wrote:
> mm/vmstat.c code calls cpumask_weight() to check if any bit of a given
> cpumask is set. We can do it more efficiently with cpumask_empty() because
> cpumask_empty() stops traversing the cpumask as soon as it finds first set
> bit, while cpumask_weight() counts all bits unconditionally.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>

Acked-by: Mike Rapoport <rppt@linux.ibm.com>

> ---
>  mm/vmstat.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/vmstat.c b/mm/vmstat.c
> index d5cc8d739fac..27a94afd4ee5 100644
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -2041,7 +2041,7 @@ static void __init init_cpu_node_state(void)
>  	int node;
>  
>  	for_each_online_node(node) {
> -		if (cpumask_weight(cpumask_of_node(node)) > 0)
> +		if (!cpumask_empty(cpumask_of_node(node)))
>  			node_set_state(node, N_CPU);
>  	}
>  }
> @@ -2068,7 +2068,7 @@ static int vmstat_cpu_dead(unsigned int cpu)
>  
>  	refresh_zone_stat_thresholds();
>  	node_cpus = cpumask_of_node(node);
> -	if (cpumask_weight(node_cpus) > 0)
> +	if (!cpumask_empty(node_cpus))
>  		return 0;
>  
>  	node_clear_state(node, N_CPU);
> -- 
> 2.32.0
> 
> 

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH 46/49] mm/mempolicy: replace nodes_weight with nodes_weight_eq
  2022-02-10 22:49 ` [PATCH 46/49] mm/mempolicy: replace nodes_weight with nodes_weight_eq Yury Norov
@ 2022-02-11 10:40   ` Mike Rapoport
  2022-02-11 17:44   ` Christophe JAILLET
  1 sibling, 0 replies; 98+ messages in thread
From: Mike Rapoport @ 2022-02-11 10:40 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	linux-mm

On Thu, Feb 10, 2022 at 02:49:30PM -0800, Yury Norov wrote:
> do_migrate_pages() calls nodes_weight() to compare the weight
> of nodemask with a given number. We can do it more efficiently with
> nodes_weight_eq() because conditional nodes_weight() may stop
> traversing the nodemask earlier, as soon as condition is (or is not)
> met.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>

Acked-by: Mike Rapoport <rppt@linux.ibm.com>

> ---
>  mm/mempolicy.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 7c852793d9e8..56efd00b1b6e 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1154,7 +1154,7 @@ int do_migrate_pages(struct mm_struct *mm, const nodemask_t *from,
>  			 *          [0-7] - > [3,4,5] moves only 0,1,2,6,7.
>  			 */
>  
> -			if ((nodes_weight(*from) != nodes_weight(*to)) &&
> +			if (!nodes_weight_eq(*from, nodes_weight(*to)) &&
>  						(node_isset(s, *to)))
>  				continue;
>  
> -- 
> 2.32.0
> 
> 

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH 47/49] nodemask: add num_node_state_eq()
  2022-02-10 22:49 ` [PATCH 47/49] nodemask: add num_node_state_eq() Yury Norov
@ 2022-02-11 10:41   ` Mike Rapoport
  0 siblings, 0 replies; 98+ messages in thread
From: Mike Rapoport @ 2022-02-11 10:41 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	linux-mm

On Thu, Feb 10, 2022 at 02:49:31PM -0800, Yury Norov wrote:
> Page allocator uses num_node_state() to compare number of nodes with a
> given number. The underlying code calls bitmap_weight(), and we can do
> it more efficiently with num_node_state_eq because conditional nodes_weight
> may stop traversing the nodemask earlier, as soon as condition is (or is
> not) met.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>

Acked-by: Mike Rapoport <rppt@linux.ibm.com>

> ---
>  include/linux/nodemask.h | 5 +++++
>  mm/page_alloc.c          | 2 +-
>  2 files changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/nodemask.h b/include/linux/nodemask.h
> index 197598e075e9..c5014dbf3cce 100644
> --- a/include/linux/nodemask.h
> +++ b/include/linux/nodemask.h
> @@ -466,6 +466,11 @@ static inline int num_node_state(enum node_states state)
>  	return nodes_weight(node_states[state]);
>  }
>  
> +static inline int num_node_state_eq(enum node_states state, int num)
> +{
> +	return nodes_weight_eq(node_states[state], num);
> +}
> +
>  #define for_each_node_state(__node, __state) \
>  	for_each_node_mask((__node), node_states[__state])
>  
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index cface1d38093..897e64b66ca4 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -8434,7 +8434,7 @@ void __init page_alloc_init(void)
>  	int ret;
>  
>  #ifdef CONFIG_NUMA
> -	if (num_node_state(N_MEMORY) == 1)
> +	if (num_node_state_eq(N_MEMORY, 1))
>  		hashdist = 0;
>  #endif
>  
> -- 
> 2.32.0
> 
> 

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH 22/49] sched: replace cpumask_weight with cpumask_empty where appropriate
  2022-02-11 10:19   ` Peter Zijlstra
@ 2022-02-11 14:19     ` Yury Norov
  0 siblings, 0 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-11 14:19 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, David Laight,
	Joe Perches, Dennis Zhou, Emil Renner Berthing, Nicholas Piggin,
	Matti Vaittinen, Alexey Klimov, linux-kernel, Ingo Molnar,
	Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
	Ben Segall, Mel Gorman, Daniel Bristot de Oliveira

On Fri, Feb 11, 2022 at 11:19:58AM +0100, Peter Zijlstra wrote:
> On Thu, Feb 10, 2022 at 02:49:06PM -0800, Yury Norov wrote:
> > In some places, kernel/sched code calls cpumask_weight() to check if
> > any bit of a given cpumask is set. We can do it more efficiently with
> > cpumask_empty() because cpumask_empty() stops traversing the cpumask as
> > soon as it finds first set bit, while cpumask_weight() counts all bits
> > unconditionally.
> > 
> > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> 
> Neither of these paths is really performance sensitive, but whatever.
> 
> Do you want me to take this now,

Yes please. Many patches from this series have already been merged this way.

> or do you want to merge the whole
> series somewhere else? In which case:
> 
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> 
> > ---
> >  kernel/sched/core.c     | 2 +-
> >  kernel/sched/topology.c | 2 +-
> >  2 files changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 28d1b7af03dc..ed7b392945b7 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -8711,7 +8711,7 @@ int cpuset_cpumask_can_shrink(const struct cpumask *cur,
> >  {
> >  	int ret = 1;
> >  
> > -	if (!cpumask_weight(cur))
> > +	if (cpumask_empty(cur))
> >  		return ret;
> >  
> >  	ret = dl_cpuset_cpumask_can_shrink(cur, trial);
> > diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> > index d201a7052a29..8478e2a8cd65 100644
> > --- a/kernel/sched/topology.c
> > +++ b/kernel/sched/topology.c
> > @@ -74,7 +74,7 @@ static int sched_domain_debug_one(struct sched_domain *sd, int cpu, int level,
> >  			break;
> >  		}
> >  
> > -		if (!cpumask_weight(sched_group_span(group))) {
> > +		if (cpumask_empty(sched_group_span(group))) {
> >  			printk(KERN_CONT "\n");
> >  			printk(KERN_ERR "ERROR: empty group\n");
> >  			break;
> > -- 
> > 2.32.0
> > 

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH 07/49] KVM: x86: replace bitmap_weight with bitmap_empty where appropriate
  2022-02-10 22:48 ` [PATCH 07/49] KVM: x86: " Yury Norov
@ 2022-02-11 16:34   ` Sean Christopherson
  2022-02-11 17:13   ` Christophe JAILLET
  1 sibling, 0 replies; 98+ messages in thread
From: Sean Christopherson @ 2022-02-11 16:34 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, x86, kvm

On Thu, Feb 10, 2022, Yury Norov wrote:
> In some places kvm/hyperv.c code calls bitmap_weight() to check if any bit
> of a given bitmap is set. It's better to use bitmap_empty() in that case
> because bitmap_empty() stops traversing the bitmap as soon as it finds
> first set bit, while bitmap_weight() counts all bits unconditionally.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---

Reviewed-by: Sean Christopherson <seanjc@google.com>

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH 07/49] KVM: x86: replace bitmap_weight with bitmap_empty where appropriate
  2022-02-10 22:48 ` [PATCH 07/49] KVM: x86: " Yury Norov
  2022-02-11 16:34   ` Sean Christopherson
@ 2022-02-11 17:13   ` Christophe JAILLET
  2022-02-11 17:19     ` Sean Christopherson
  1 sibling, 1 reply; 98+ messages in thread
From: Christophe JAILLET @ 2022-02-11 17:13 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov

Le 10/02/2022 à 23:48, Yury Norov a écrit :
> In some places kvm/hyperv.c code calls bitmap_weight() to check if any bit
> of a given bitmap is set. It's better to use bitmap_empty() in that case
> because bitmap_empty() stops traversing the bitmap as soon as it finds
> first set bit, while bitmap_weight() counts all bits unconditionally.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>   arch/x86/kvm/hyperv.c | 8 ++++----
>   1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index 6e38a7d22e97..06c2a5603123 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -90,7 +90,7 @@ static void synic_update_vector(struct kvm_vcpu_hv_synic *synic,
>   {
>   	struct kvm_vcpu *vcpu = hv_synic_to_vcpu(synic);
>   	struct kvm_hv *hv = to_kvm_hv(vcpu->kvm);
> -	int auto_eoi_old, auto_eoi_new;
> +	bool auto_eoi_old, auto_eoi_new;
>   
>   	if (vector < HV_SYNIC_FIRST_VALID_VECTOR)
>   		return;
> @@ -100,16 +100,16 @@ static void synic_update_vector(struct kvm_vcpu_hv_synic *synic,
>   	else
>   		__clear_bit(vector, synic->vec_bitmap);
>   
> -	auto_eoi_old = bitmap_weight(synic->auto_eoi_bitmap, 256);
> +	auto_eoi_old = !bitmap_empty(synic->auto_eoi_bitmap, 256);

I think that you can also remove the "!" here, ...

>   
>   	if (synic_has_vector_auto_eoi(synic, vector))
>   		__set_bit(vector, synic->auto_eoi_bitmap);
>   	else
>   		__clear_bit(vector, synic->auto_eoi_bitmap);
>   
> -	auto_eoi_new = bitmap_weight(synic->auto_eoi_bitmap, 256);
> +	auto_eoi_new = !bitmap_empty(synic->auto_eoi_bitmap, 256);

... and there...

>   
> -	if (!!auto_eoi_old == !!auto_eoi_new)
> +	if (auto_eoi_old == auto_eoi_new)

... because this test would still give the same result.

Just my 2c,
CJ

>   		return;
>   
>   	down_write(&vcpu->kvm->arch.apicv_update_lock);
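
[Editor's sketch] The point about dropping the "!" can be checked in
isolation: comparing two booleans for equality gives the same answer
whether both are negated or neither is. The function names below are
hypothetical and only model the final comparison, not the KVM code.

```c
#include <stdbool.h>

/* Compares the "not empty" states, as in the patch as posted. */
static bool same_state_negated(bool old_empty, bool new_empty)
{
	return !old_empty == !new_empty;
}

/* Compares the "empty" states directly, as the review suggests. */
static bool same_state_plain(bool old_empty, bool new_empty)
{
	return old_empty == new_empty;
}
```

Both functions agree for all four input combinations, so the choice
between them is about how the variable names read, not about behaviour.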


^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH 04/49] iio: fix opencoded for_each_set_bit()
  2022-02-10 22:48 ` [PATCH 04/49] iio: fix opencoded for_each_set_bit() Yury Norov
  2022-02-11  8:45   ` Andy Shevchenko
@ 2022-02-11 17:17   ` Christophe JAILLET
  2022-06-04 15:41     ` Jonathan Cameron
  1 sibling, 1 reply; 98+ messages in thread
From: Christophe JAILLET @ 2022-02-11 17:17 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Jonathan Cameron, Lars-Peter Clausen, Alexandru Ardelean,
	Nathan Chancellor, linux-iio

Le 10/02/2022 à 23:48, Yury Norov a écrit :
> iio_simple_dummy_trigger_h() is mostly an opencoded for_each_set_bit().
> Using for_each_set_bit() make code much cleaner, and more effective.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>   drivers/iio/dummy/iio_simple_dummy_buffer.c | 48 ++++++++-------------
>   1 file changed, 19 insertions(+), 29 deletions(-)
> 
> diff --git a/drivers/iio/dummy/iio_simple_dummy_buffer.c b/drivers/iio/dummy/iio_simple_dummy_buffer.c
> index d81c2b2dad82..3bc1b7529e2a 100644
> --- a/drivers/iio/dummy/iio_simple_dummy_buffer.c
> +++ b/drivers/iio/dummy/iio_simple_dummy_buffer.c
> @@ -45,41 +45,31 @@ static irqreturn_t iio_simple_dummy_trigger_h(int irq, void *p)
>   {
>   	struct iio_poll_func *pf = p;
>   	struct iio_dev *indio_dev = pf->indio_dev;
> +	int i = 0, j;
>   	u16 *data;
>   
>   	data = kmalloc(indio_dev->scan_bytes, GFP_KERNEL);
>   	if (!data)
>   		goto done;
>   
> -	if (!bitmap_empty(indio_dev->active_scan_mask, indio_dev->masklength)) {
> -		/*
> -		 * Three common options here:
> -		 * hardware scans: certain combinations of channels make
> -		 *   up a fast read.  The capture will consist of all of them.
> -		 *   Hence we just call the grab data function and fill the
> -		 *   buffer without processing.
> -		 * software scans: can be considered to be random access
> -		 *   so efficient reading is just a case of minimal bus
> -		 *   transactions.
> -		 * software culled hardware scans:
> -		 *   occasionally a driver may process the nearest hardware
> -		 *   scan to avoid storing elements that are not desired. This
> -		 *   is the fiddliest option by far.
> -		 * Here let's pretend we have random access. And the values are
> -		 * in the constant table fakedata.
> -		 */
> -		int i, j;
> -
> -		for (i = 0, j = 0;
> -		     i < bitmap_weight(indio_dev->active_scan_mask,
> -				       indio_dev->masklength);
> -		     i++, j++) {
> -			j = find_next_bit(indio_dev->active_scan_mask,
> -					  indio_dev->masklength, j);
> -			/* random access read from the 'device' */
> -			data[i] = fakedata[j];
> -		}
> -	}
> +	/*
> +	 * Three common options here:
> +	 * hardware scans: certain combinations of channels make
> +	 *   up a fast read.  The capture will consist of all of them.
> +	 *   Hence we just call the grab data function and fill the
> +	 *   buffer without processing.
> +	 * software scans: can be considered to be random access
> +	 *   so efficient reading is just a case of minimal bus
> +	 *   transactions.
> +	 * software culled hardware scans:
> +	 *   occasionally a driver may process the nearest hardware
> +	 *   scan to avoid storing elements that are not desired. This
> +	 *   is the fiddliest option by far.
> +	 * Here let's pretend we have random access. And the values are
> +	 * in the constant table fakedata.
> +	 */

Nitpicking: you could take advantage of the tab you save to use the full 
width of the line and save some lines of code.

Just my 2c.

CJ


> +	for_each_set_bit(j, indio_dev->active_scan_mask, indio_dev->masklength)
> +		data[i++] = fakedata[j];
>   
>   	iio_push_to_buffers_with_timestamp(indio_dev, data,
>   					   iio_get_time_ns(indio_dev));
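
For reference, the gather that the new for_each_set_bit() loop performs can
be sketched in plain userspace C. This is a minimal sketch under stated
assumptions: test_bit_sketch() and gather_by_mask() are hypothetical
stand-ins written for illustration, and the loop tests one bit at a time,
whereas the kernel iterator is built on find_next_bit().

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Userspace stand-in for the kernel's test_bit(); an assumption, not kernel code. */
static int test_bit_sketch(const unsigned long *map, size_t bit)
{
	return (int)((map[bit / BITS_PER_LONG] >> (bit % BITS_PER_LONG)) & 1UL);
}

/*
 * Copy src[j] for every set bit j in mask into consecutive slots of dst,
 * mirroring "for_each_set_bit(j, ...) data[i++] = fakedata[j];".
 * Returns the number of elements written.
 */
size_t gather_by_mask(const unsigned long *mask, size_t nbits,
		      const unsigned short *src, unsigned short *dst)
{
	size_t i = 0, j;

	for (j = 0; j < nbits; j++)
		if (test_bit_sketch(mask, j))
			dst[i++] = src[j];

	return i;
}
```

The output index i only advances when a bit is set, so no separate call to
bitmap_weight() is needed to bound the loop.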



* Re: [PATCH 07/49] KVM: x86: replace bitmap_weight with bitmap_empty where appropriate
  2022-02-11 17:13   ` Christophe JAILLET
@ 2022-02-11 17:19     ` Sean Christopherson
  2022-02-11 17:47       ` Yury Norov
  0 siblings, 1 reply; 98+ messages in thread
From: Sean Christopherson @ 2022-02-11 17:19 UTC (permalink / raw)
  To: Christophe JAILLET
  Cc: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Paolo Bonzini, Vitaly Kuznetsov

On Fri, Feb 11, 2022, Christophe JAILLET wrote:
> Le 10/02/2022 à 23:48, Yury Norov a écrit :
> > In some places kvm/hyperv.c code calls bitmap_weight() to check if any bit
> > of a given bitmap is set. It's better to use bitmap_empty() in that case
> > because bitmap_empty() stops traversing the bitmap as soon as it finds
> > first set bit, while bitmap_weight() counts all bits unconditionally.
> > 
> > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> > ---
> >   arch/x86/kvm/hyperv.c | 8 ++++----
> >   1 file changed, 4 insertions(+), 4 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> > index 6e38a7d22e97..06c2a5603123 100644
> > --- a/arch/x86/kvm/hyperv.c
> > +++ b/arch/x86/kvm/hyperv.c
> > @@ -90,7 +90,7 @@ static void synic_update_vector(struct kvm_vcpu_hv_synic *synic,
> >   {
> >   	struct kvm_vcpu *vcpu = hv_synic_to_vcpu(synic);
> >   	struct kvm_hv *hv = to_kvm_hv(vcpu->kvm);
> > -	int auto_eoi_old, auto_eoi_new;
> > +	bool auto_eoi_old, auto_eoi_new;
> >   	if (vector < HV_SYNIC_FIRST_VALID_VECTOR)
> >   		return;
> > @@ -100,16 +100,16 @@ static void synic_update_vector(struct kvm_vcpu_hv_synic *synic,
> >   	else
> >   		__clear_bit(vector, synic->vec_bitmap);
> > -	auto_eoi_old = bitmap_weight(synic->auto_eoi_bitmap, 256);
> > +	auto_eoi_old = !bitmap_empty(synic->auto_eoi_bitmap, 256);
> 
> I think that you can also remove the "!" here, ...
> 
> >   	if (synic_has_vector_auto_eoi(synic, vector))
> >   		__set_bit(vector, synic->auto_eoi_bitmap);
> >   	else
> >   		__clear_bit(vector, synic->auto_eoi_bitmap);
> > -	auto_eoi_new = bitmap_weight(synic->auto_eoi_bitmap, 256);
> > +	auto_eoi_new = !bitmap_empty(synic->auto_eoi_bitmap, 256);
> 
> ... and there...
> 
> > -	if (!!auto_eoi_old == !!auto_eoi_new)
> > +	if (auto_eoi_old == auto_eoi_new)
> 
> ... because this test would still give the same result.

It would give the same result, but the variable names would be inverted as they
track if "auto EOI" is being used.  So yes, it's technically unnecessary, but
also very deliberate.
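
The point about naming can be illustrated with a small userspace sketch.
bitmap_empty_sketch() and auto_eoi_changed() are hypothetical helpers, and
the stand-in takes a word count rather than a bit count for brevity. Negating
both sides does not change the comparison, but it keeps each variable true
exactly when auto EOI is in use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for bitmap_empty(): true when no word has a bit set. */
bool bitmap_empty_sketch(const unsigned long *map, size_t nwords)
{
	for (size_t i = 0; i < nwords; i++)
		if (map[i])
			return false;	/* stop at the first non-zero word */
	return true;
}

/* Reports whether "auto EOI in use" differs between two bitmap snapshots. */
bool auto_eoi_changed(const unsigned long *old_map,
		      const unsigned long *new_map, size_t nwords)
{
	/* The '!' keeps each name truthful: true means auto EOI is used. */
	bool auto_eoi_old = !bitmap_empty_sketch(old_map, nwords);
	bool auto_eoi_new = !bitmap_empty_sketch(new_map, nwords);

	/* Comparing the un-negated values would give the same answer. */
	return auto_eoi_old != auto_eoi_new;
}
```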


* Re: [PATCH 12/49] perf: replace bitmap_weight with bitmap_empty where appropriate
  2022-02-10 22:48 ` [PATCH 12/49] perf: replace bitmap_weight with bitmap_empty where appropriate Yury Norov
  2022-02-11 10:25   ` Mark Rutland
@ 2022-02-11 17:27   ` Christophe JAILLET
  2022-02-11 23:23     ` Yury Norov
  1 sibling, 1 reply; 98+ messages in thread
From: Christophe JAILLET @ 2022-02-11 17:27 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Will Deacon, Mark Rutland, Shaokun Zhang, Qi Liu, Khuong Dinh,
	linux-arm-kernel

Le 10/02/2022 à 23:48, Yury Norov a écrit :
> In some places, drivers/perf code calls bitmap_weight() to check if any
> bit of a given bitmap is set. It's better to use bitmap_empty() in that
> case because bitmap_empty() stops traversing the bitmap as soon as it
> finds first set bit, while bitmap_weight() counts all bits unconditionally.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>   drivers/perf/arm-cci.c                   | 2 +-
>   drivers/perf/arm_pmu.c                   | 4 ++--
>   drivers/perf/hisilicon/hisi_uncore_pmu.c | 2 +-
>   drivers/perf/xgene_pmu.c                 | 2 +-
>   4 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/perf/arm-cci.c b/drivers/perf/arm-cci.c
> index 54aca3a62814..96e09fa40909 100644
> --- a/drivers/perf/arm-cci.c
> +++ b/drivers/perf/arm-cci.c
> @@ -1096,7 +1096,7 @@ static void cci_pmu_enable(struct pmu *pmu)
>   {
>   	struct cci_pmu *cci_pmu = to_cci_pmu(pmu);
>   	struct cci_pmu_hw_events *hw_events = &cci_pmu->hw_events;
> -	int enabled = bitmap_weight(hw_events->used_mask, cci_pmu->num_cntrs);
> +	bool enabled = !bitmap_empty(hw_events->used_mask, cci_pmu->num_cntrs);
>   	unsigned long flags;
>   
>   	if (!enabled)
> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> index 295cc7952d0e..a31b302b0ade 100644
> --- a/drivers/perf/arm_pmu.c
> +++ b/drivers/perf/arm_pmu.c
> @@ -524,7 +524,7 @@ static void armpmu_enable(struct pmu *pmu)
>   {
>   	struct arm_pmu *armpmu = to_arm_pmu(pmu);
>   	struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events);
> -	int enabled = bitmap_weight(hw_events->used_mask, armpmu->num_events);
> +	bool enabled = !bitmap_empty(hw_events->used_mask, armpmu->num_events);
>   
>   	/* For task-bound events we may be called on other CPUs */
>   	if (!cpumask_test_cpu(smp_processor_id(), &armpmu->supported_cpus))
> @@ -785,7 +785,7 @@ static int cpu_pm_pmu_notify(struct notifier_block *b, unsigned long cmd,
>   {
>   	struct arm_pmu *armpmu = container_of(b, struct arm_pmu, cpu_pm_nb);
>   	struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events);
> -	int enabled = bitmap_weight(hw_events->used_mask, armpmu->num_events);
> +	bool enabled = !bitmap_empty(hw_events->used_mask, armpmu->num_events);
>   
>   	if (!cpumask_test_cpu(smp_processor_id(), &armpmu->supported_cpus))
>   		return NOTIFY_DONE;
> diff --git a/drivers/perf/hisilicon/hisi_uncore_pmu.c b/drivers/perf/hisilicon/hisi_uncore_pmu.c
> index a738aeab5c04..358e4e284a62 100644
> --- a/drivers/perf/hisilicon/hisi_uncore_pmu.c
> +++ b/drivers/perf/hisilicon/hisi_uncore_pmu.c
> @@ -393,7 +393,7 @@ EXPORT_SYMBOL_GPL(hisi_uncore_pmu_read);
>   void hisi_uncore_pmu_enable(struct pmu *pmu)
>   {
>   	struct hisi_pmu *hisi_pmu = to_hisi_pmu(pmu);
> -	int enabled = bitmap_weight(hisi_pmu->pmu_events.used_mask,
> +	bool enabled = !bitmap_empty(hisi_pmu->pmu_events.used_mask,
>   				    hisi_pmu->num_counters);
>   
>   	if (!enabled)
> diff --git a/drivers/perf/xgene_pmu.c b/drivers/perf/xgene_pmu.c
> index 5283608dc055..0c32dffc7ede 100644
> --- a/drivers/perf/xgene_pmu.c
> +++ b/drivers/perf/xgene_pmu.c
> @@ -867,7 +867,7 @@ static void xgene_perf_pmu_enable(struct pmu *pmu)
>   {
>   	struct xgene_pmu_dev *pmu_dev = to_pmu_dev(pmu);
>   	struct xgene_pmu *xgene_pmu = pmu_dev->parent;
> -	int enabled = bitmap_weight(pmu_dev->cntr_assign_mask,
> +	bool enabled = !bitmap_empty(pmu_dev->cntr_assign_mask,
>   			pmu_dev->max_counters);

Would it make sense to call it 'disabled', remove the "!"...

>   
>   	if (!enabled)
... and 'if (disabled)' here?

Just my 2c,

CJ
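
The difference the commit message describes can be made visible with a
userspace sketch that reports how many words each approach reads.
weight_sketch() and any_set_sketch() are hypothetical names, and the
'visited' out-parameter exists only for illustration; the kernel functions
have no such argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Counts set bits; always reads all nwords words. *visited reports how many. */
unsigned int weight_sketch(const unsigned long *map, size_t nwords,
			   size_t *visited)
{
	unsigned int w = 0;
	size_t i;

	for (i = 0; i < nwords; i++) {
		unsigned long v = map[i];

		for (; v; v &= v - 1)	/* clear the lowest set bit */
			w++;
	}
	*visited = i;
	return w;
}

/* Returns true if any bit is set; stops at the first non-zero word. */
bool any_set_sketch(const unsigned long *map, size_t nwords, size_t *visited)
{
	size_t i;

	for (i = 0; i < nwords; i++) {
		if (map[i]) {
			*visited = i + 1;
			return true;
		}
	}
	*visited = i;
	return false;
}
```

When the result is only used as a truth value, as with 'enabled' above, the
second form gives the same answer while reading no more words than the first.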


* Re: [PATCH 46/49] mm/mempolicy: replace nodes_weight with nodes_weight_eq
  2022-02-10 22:49 ` [PATCH 46/49] mm/mempolicy: replace nodes_weight with nodes_weight_eq Yury Norov
  2022-02-11 10:40   ` Mike Rapoport
@ 2022-02-11 17:44   ` Christophe JAILLET
  2022-02-11 19:47     ` Yury Norov
  1 sibling, 1 reply; 98+ messages in thread
From: Christophe JAILLET @ 2022-02-11 17:44 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	linux-mm

Le 10/02/2022 à 23:49, Yury Norov a écrit :
> do_migrate_pages() calls nodes_weight() to compare the weight
> of nodemask with a given number. We can do it more efficiently with
> nodes_weight_eq() because conditional nodes_weight() may stop
> traversing the nodemask earlier, as soon as condition is (or is not)
> met.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>   mm/mempolicy.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 7c852793d9e8..56efd00b1b6e 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1154,7 +1154,7 @@ int do_migrate_pages(struct mm_struct *mm, const nodemask_t *from,
>   			 *          [0-7] - > [3,4,5] moves only 0,1,2,6,7.
>   			 */
>   
> -			if ((nodes_weight(*from) != nodes_weight(*to)) &&
> +			if (!nodes_weight_eq(*from, nodes_weight(*to)) &&
>   						(node_isset(s, *to)))

Hi,

I haven't looked at this in detail, but would it make sense to hoist the 
"(nodes_weight(*from) != nodes_weight(*to))" test out of the 
for_each_node_mask() loop so that it is computed only once?

'from' and 'to' look unmodified in the loop.

Just my 2c,
CJ

>   				continue;
>   



* Re: [PATCH 07/49] KVM: x86: replace bitmap_weight with bitmap_empty where appropriate
  2022-02-11 17:19     ` Sean Christopherson
@ 2022-02-11 17:47       ` Yury Norov
  0 siblings, 0 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-11 17:47 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Christophe JAILLET, Andy Shevchenko, Rasmus Villemoes,
	Andrew Morton, Michał Mirosław, Greg Kroah-Hartman,
	Peter Zijlstra, David Laight, Joe Perches, Dennis Zhou,
	Emil Renner Berthing, Nicholas Piggin, Matti Vaittinen,
	Alexey Klimov, linux-kernel, Paolo Bonzini, Vitaly Kuznetsov

On Fri, Feb 11, 2022 at 05:19:36PM +0000, Sean Christopherson wrote:
> On Fri, Feb 11, 2022, Christophe JAILLET wrote:
> > Le 10/02/2022 à 23:48, Yury Norov a écrit :
> > > In some places kvm/hyperv.c code calls bitmap_weight() to check if any bit
> > > of a given bitmap is set. It's better to use bitmap_empty() in that case
> > > because bitmap_empty() stops traversing the bitmap as soon as it finds
> > > first set bit, while bitmap_weight() counts all bits unconditionally.
> > > 
> > > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > > Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> > > ---
> > >   arch/x86/kvm/hyperv.c | 8 ++++----
> > >   1 file changed, 4 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> > > index 6e38a7d22e97..06c2a5603123 100644
> > > --- a/arch/x86/kvm/hyperv.c
> > > +++ b/arch/x86/kvm/hyperv.c
> > > @@ -90,7 +90,7 @@ static void synic_update_vector(struct kvm_vcpu_hv_synic *synic,
> > >   {
> > >   	struct kvm_vcpu *vcpu = hv_synic_to_vcpu(synic);
> > >   	struct kvm_hv *hv = to_kvm_hv(vcpu->kvm);
> > > -	int auto_eoi_old, auto_eoi_new;
> > > +	bool auto_eoi_old, auto_eoi_new;
> > >   	if (vector < HV_SYNIC_FIRST_VALID_VECTOR)
> > >   		return;
> > > @@ -100,16 +100,16 @@ static void synic_update_vector(struct kvm_vcpu_hv_synic *synic,
> > >   	else
> > >   		__clear_bit(vector, synic->vec_bitmap);
> > > -	auto_eoi_old = bitmap_weight(synic->auto_eoi_bitmap, 256);
> > > +	auto_eoi_old = !bitmap_empty(synic->auto_eoi_bitmap, 256);
> > 
> > I think that you can also remove the "!" here, ...
> > 
> > >   	if (synic_has_vector_auto_eoi(synic, vector))
> > >   		__set_bit(vector, synic->auto_eoi_bitmap);
> > >   	else
> > >   		__clear_bit(vector, synic->auto_eoi_bitmap);
> > > -	auto_eoi_new = bitmap_weight(synic->auto_eoi_bitmap, 256);
> > > +	auto_eoi_new = !bitmap_empty(synic->auto_eoi_bitmap, 256);
> > 
> > ... and there...
> > 
> > > -	if (!!auto_eoi_old == !!auto_eoi_new)
> > > +	if (auto_eoi_old == auto_eoi_new)
> > 
> > ... because this test would still give the same result.

This is how it was in v3. Vitaly asked to add '!' to keep the variable
names correct.
https://lore.kernel.org/lkml/CAAH8bW_u6oNOkMsA_jRyWFHkzjMi0CB7gXmvLYAdjNMSqrrY7w@mail.gmail.com/t/#m51d28c03eafed5754a69f95f24c7d0a0510cc5c0
> 
> It would give the same result, but the variable names would be inverted as they
> track if "auto EOI" is being used.  So yes, it's technically unnecessary, but
> also very deliberate.

auto_eoi_old_not_used = bitmap_empty() is worse to me than
auto_eoi_old = !bitmap_empty().

Thanks,
Yury


* Re: [PATCH 12/49] perf: replace bitmap_weight with bitmap_empty where appropriate
  2022-02-11 10:25   ` Mark Rutland
@ 2022-02-11 17:59     ` Yury Norov
  0 siblings, 0 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-11 17:59 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Will Deacon, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Shaokun Zhang, Qi Liu, Khuong Dinh, linux-arm-kernel

On Fri, Feb 11, 2022 at 10:25:23AM +0000, Mark Rutland wrote:
> Hi Yury,
> 
> On Thu, Feb 10, 2022 at 02:48:56PM -0800, Yury Norov wrote:
> > In some places, drivers/perf code calls bitmap_weight() to check if any
> > bit of a given bitmap is set. It's better to use bitmap_empty() in that
> > case because bitmap_empty() stops traversing the bitmap as soon as it
> > finds first set bit, while bitmap_weight() counts all bits unconditionally.
> > 
> > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> 
> This looks like a nice semantic cleanup to me, so FWIW:

Thanks :)
 
> Acked-by: Mark Rutland <mark.rutland@arm.com>
> 
> How are you expecting to queue all of this?

I expect maintainers of corresponding subsystems will pick most of the
material. For the rest, I have my own bitmap branch.

> Should Will and I pick this patch?

Yes please.

> Thanks,
> Mark.
> 
> > ---
> >  drivers/perf/arm-cci.c                   | 2 +-
> >  drivers/perf/arm_pmu.c                   | 4 ++--
> >  drivers/perf/hisilicon/hisi_uncore_pmu.c | 2 +-
> >  drivers/perf/xgene_pmu.c                 | 2 +-
> >  4 files changed, 5 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/perf/arm-cci.c b/drivers/perf/arm-cci.c
> > index 54aca3a62814..96e09fa40909 100644
> > --- a/drivers/perf/arm-cci.c
> > +++ b/drivers/perf/arm-cci.c
> > @@ -1096,7 +1096,7 @@ static void cci_pmu_enable(struct pmu *pmu)
> >  {
> >  	struct cci_pmu *cci_pmu = to_cci_pmu(pmu);
> >  	struct cci_pmu_hw_events *hw_events = &cci_pmu->hw_events;
> > -	int enabled = bitmap_weight(hw_events->used_mask, cci_pmu->num_cntrs);
> > +	bool enabled = !bitmap_empty(hw_events->used_mask, cci_pmu->num_cntrs);
> >  	unsigned long flags;
> >  
> >  	if (!enabled)
> > diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> > index 295cc7952d0e..a31b302b0ade 100644
> > --- a/drivers/perf/arm_pmu.c
> > +++ b/drivers/perf/arm_pmu.c
> > @@ -524,7 +524,7 @@ static void armpmu_enable(struct pmu *pmu)
> >  {
> >  	struct arm_pmu *armpmu = to_arm_pmu(pmu);
> >  	struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events);
> > -	int enabled = bitmap_weight(hw_events->used_mask, armpmu->num_events);
> > +	bool enabled = !bitmap_empty(hw_events->used_mask, armpmu->num_events);
> >  
> >  	/* For task-bound events we may be called on other CPUs */
> >  	if (!cpumask_test_cpu(smp_processor_id(), &armpmu->supported_cpus))
> > @@ -785,7 +785,7 @@ static int cpu_pm_pmu_notify(struct notifier_block *b, unsigned long cmd,
> >  {
> >  	struct arm_pmu *armpmu = container_of(b, struct arm_pmu, cpu_pm_nb);
> >  	struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events);
> > -	int enabled = bitmap_weight(hw_events->used_mask, armpmu->num_events);
> > +	bool enabled = !bitmap_empty(hw_events->used_mask, armpmu->num_events);
> >  
> >  	if (!cpumask_test_cpu(smp_processor_id(), &armpmu->supported_cpus))
> >  		return NOTIFY_DONE;
> > diff --git a/drivers/perf/hisilicon/hisi_uncore_pmu.c b/drivers/perf/hisilicon/hisi_uncore_pmu.c
> > index a738aeab5c04..358e4e284a62 100644
> > --- a/drivers/perf/hisilicon/hisi_uncore_pmu.c
> > +++ b/drivers/perf/hisilicon/hisi_uncore_pmu.c
> > @@ -393,7 +393,7 @@ EXPORT_SYMBOL_GPL(hisi_uncore_pmu_read);
> >  void hisi_uncore_pmu_enable(struct pmu *pmu)
> >  {
> >  	struct hisi_pmu *hisi_pmu = to_hisi_pmu(pmu);
> > -	int enabled = bitmap_weight(hisi_pmu->pmu_events.used_mask,
> > +	bool enabled = !bitmap_empty(hisi_pmu->pmu_events.used_mask,
> >  				    hisi_pmu->num_counters);
> >  
> >  	if (!enabled)
> > diff --git a/drivers/perf/xgene_pmu.c b/drivers/perf/xgene_pmu.c
> > index 5283608dc055..0c32dffc7ede 100644
> > --- a/drivers/perf/xgene_pmu.c
> > +++ b/drivers/perf/xgene_pmu.c
> > @@ -867,7 +867,7 @@ static void xgene_perf_pmu_enable(struct pmu *pmu)
> >  {
> >  	struct xgene_pmu_dev *pmu_dev = to_pmu_dev(pmu);
> >  	struct xgene_pmu *xgene_pmu = pmu_dev->parent;
> > -	int enabled = bitmap_weight(pmu_dev->cntr_assign_mask,
> > +	bool enabled = !bitmap_empty(pmu_dev->cntr_assign_mask,
> >  			pmu_dev->max_counters);
> >  
> >  	if (!enabled)
> > -- 
> > 2.32.0
> > 


* Re: [PATCH 19/49] RDMA/hfi: replace cpumask_weight with cpumask_empty where appropriate
  2022-02-10 22:49 ` [PATCH 19/49] RDMA/hfi: " Yury Norov
@ 2022-02-11 19:10   ` Jason Gunthorpe
  0 siblings, 0 replies; 98+ messages in thread
From: Jason Gunthorpe @ 2022-02-11 19:10 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Mike Marciniszyn, Dennis Dalessandro, linux-rdma,
	Leon Romanovsky

On Thu, Feb 10, 2022 at 02:49:03PM -0800, Yury Norov wrote:
> drivers/infiniband/hw/hfi1/affinity.c code calls cpumask_weight() to check
> if any bit of a given cpumask is set. We can do it more efficiently with
> cpumask_empty() because cpumask_empty() stops traversing the cpumask as
> soon as it finds first set bit, while cpumask_weight() counts all bits
> unconditionally.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
> ---
>  drivers/infiniband/hw/hfi1/affinity.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Applied to the rdma tree

Thanks,
Jason


* Re: [PATCH 41/49] RDMA/hfi1: replace cpumask_weight with cpumask_weight_{eq, ...} where appropriate
  2022-02-10 22:49 ` [PATCH 41/49] RDMA/hfi1: replace cpumask_weight with cpumask_weight_{eq, ...} where appropriate Yury Norov
@ 2022-02-11 19:11   ` Jason Gunthorpe
  0 siblings, 0 replies; 98+ messages in thread
From: Jason Gunthorpe @ 2022-02-11 19:11 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Mike Marciniszyn, Dennis Dalessandro, linux-rdma

On Thu, Feb 10, 2022 at 02:49:25PM -0800, Yury Norov wrote:
> Infiniband code uses cpumask_weight() to compare the weight of cpumask
> with a given number. We can do it more efficiently with
> cpumask_weight_{eq, ...} because conditional cpumask_weight may stop
> traversing the cpumask earlier, as soon as condition is (or can't be) met.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>  drivers/infiniband/hw/hfi1/affinity.c    | 9 ++++-----
>  drivers/infiniband/hw/qib/qib_file_ops.c | 2 +-
>  drivers/infiniband/hw/qib/qib_iba7322.c  | 2 +-
>  3 files changed, 6 insertions(+), 7 deletions(-)

I suppose you'll send this with the prior patch adding the functions
in which case

Acked-by: Jason Gunthorpe <jgg@nvidia.com>

Jason
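
How a conditional weight check can stop early is sketched below in userspace
C. This is an illustration under stated assumptions, not the
bitmap_weight_cmp() added by the series: weight_capped(), weight_gt_sketch()
and weight_eq_sketch() are hypothetical names, and the counting is done bit
by bit rather than with the kernel's per-word helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Count set bits, but give up as soon as the count exceeds 'limit'.
 * Returns the weight if it is <= limit, otherwise limit + 1.
 */
static unsigned int weight_capped(const unsigned long *map, size_t nwords,
				  unsigned int limit)
{
	unsigned int w = 0;

	for (size_t i = 0; i < nwords; i++) {
		for (unsigned long v = map[i]; v; v &= v - 1) {
			if (++w > limit)
				return w;	/* the answer is already decided */
		}
	}
	return w;
}

/* True when more than n bits are set. */
bool weight_gt_sketch(const unsigned long *map, size_t nwords, unsigned int n)
{
	return weight_capped(map, nwords, n) > n;
}

/* True when exactly n bits are set. */
bool weight_eq_sketch(const unsigned long *map, size_t nwords, unsigned int n)
{
	return weight_capped(map, nwords, n) == n;
}
```

For a check such as "more than one CPU set", the scan ends at the second set
bit regardless of how long the mask is.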


* Re: [PATCH 46/49] mm/mempolicy: replace nodes_weight with nodes_weight_eq
  2022-02-11 17:44   ` Christophe JAILLET
@ 2022-02-11 19:47     ` Yury Norov
  0 siblings, 0 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-11 19:47 UTC (permalink / raw)
  To: Christophe JAILLET, Larry Woodman
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	linux-mm

 + Larry Woodman <lwoodman@redhat.com>

On Fri, Feb 11, 2022 at 06:44:39PM +0100, Christophe JAILLET wrote:
> Le 10/02/2022 à 23:49, Yury Norov a écrit :
> > do_migrate_pages() calls nodes_weight() to compare the weight
> > of nodemask with a given number. We can do it more efficiently with
> > nodes_weight_eq() because conditional nodes_weight() may stop
> > traversing the nodemask earlier, as soon as condition is (or is not)
> > met.
> > 
> > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > ---
> >   mm/mempolicy.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> > index 7c852793d9e8..56efd00b1b6e 100644
> > --- a/mm/mempolicy.c
> > +++ b/mm/mempolicy.c
> > @@ -1154,7 +1154,7 @@ int do_migrate_pages(struct mm_struct *mm, const nodemask_t *from,
> >   			 *          [0-7] - > [3,4,5] moves only 0,1,2,6,7.
> >   			 */
> > -			if ((nodes_weight(*from) != nodes_weight(*to)) &&
> > +			if (!nodes_weight_eq(*from, nodes_weight(*to)) &&
> >   						(node_isset(s, *to)))
> 
> Hi,
> 
> I've not looked in details, but would it make sense to hoist the
> "(nodes_weight(*from) != nodes_weight(*to))" test out of the
> for_each_node_mask() to compute it only once?
> 
> 'from' and 'to' look unmodified in the loop.

It seems that 'from' and 'to' are untouched in the outer while()
loop as well, so we can compare the weights of the nodemasks only once
at the beginning.

Larry, can you please comment on that?

Thanks,
Yury
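
The hoisting under discussion can be sketched in userspace C as follows. It
assumes single-word masks and uses hypothetical helpers (weight1(),
count_skipped()); it models only the skip condition quoted above, not the
rest of do_migrate_pages().

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for nodes_weight() on a single-word mask. */
static unsigned int weight1(unsigned long mask)
{
	unsigned int w = 0;

	for (; mask; mask &= mask - 1)
		w++;
	return w;
}

/*
 * Count the source nodes the loop would skip: nodes present in both masks
 * when the two masks differ in weight. The weight comparison depends only
 * on 'from' and 'to', which the loop never modifies, so it is done once.
 */
unsigned int count_skipped(unsigned long from, unsigned long to)
{
	const bool weights_differ = weight1(from) != weight1(to);
	unsigned int skipped = 0;

	for (unsigned int s = 0; s < sizeof(from) * CHAR_BIT; s++) {
		if (!(from & (1UL << s)))
			continue;	/* s is not a source node */
		if (weights_differ && (to & (1UL << s)))
			skipped++;	/* mirrors the 'continue' in the patch */
	}
	return skipped;
}
```

With from = [0-7] and to = [3,4,5], the case given in the code comment in the
patch, nodes 3, 4 and 5 are skipped and only 0, 1, 2, 6 and 7 are moved.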


* Re: [PATCH 12/49] perf: replace bitmap_weight with bitmap_empty where appropriate
  2022-02-11 17:27   ` Christophe JAILLET
@ 2022-02-11 23:23     ` Yury Norov
  0 siblings, 0 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-11 23:23 UTC (permalink / raw)
  To: Christophe JAILLET
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Will Deacon, Mark Rutland, Shaokun Zhang, Qi Liu, Khuong Dinh,
	linux-arm-kernel

On Fri, Feb 11, 2022 at 06:27:56PM +0100, Christophe JAILLET wrote:
> Le 10/02/2022 à 23:48, Yury Norov a écrit :
> > In some places, drivers/perf code calls bitmap_weight() to check if any
> > bit of a given bitmap is set. It's better to use bitmap_empty() in that
> > case because bitmap_empty() stops traversing the bitmap as soon as it
> > finds first set bit, while bitmap_weight() counts all bits unconditionally.
> > 
> > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > ---
> >   drivers/perf/arm-cci.c                   | 2 +-
> >   drivers/perf/arm_pmu.c                   | 4 ++--
> >   drivers/perf/hisilicon/hisi_uncore_pmu.c | 2 +-
> >   drivers/perf/xgene_pmu.c                 | 2 +-
> >   4 files changed, 5 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/perf/arm-cci.c b/drivers/perf/arm-cci.c
> > index 54aca3a62814..96e09fa40909 100644
> > --- a/drivers/perf/arm-cci.c
> > +++ b/drivers/perf/arm-cci.c
> > @@ -1096,7 +1096,7 @@ static void cci_pmu_enable(struct pmu *pmu)
> >   {
> >   	struct cci_pmu *cci_pmu = to_cci_pmu(pmu);
> >   	struct cci_pmu_hw_events *hw_events = &cci_pmu->hw_events;
> > -	int enabled = bitmap_weight(hw_events->used_mask, cci_pmu->num_cntrs);
> > +	bool enabled = !bitmap_empty(hw_events->used_mask, cci_pmu->num_cntrs);
> >   	unsigned long flags;
> >   	if (!enabled)
> > diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> > index 295cc7952d0e..a31b302b0ade 100644
> > --- a/drivers/perf/arm_pmu.c
> > +++ b/drivers/perf/arm_pmu.c
> > @@ -524,7 +524,7 @@ static void armpmu_enable(struct pmu *pmu)
> >   {
> >   	struct arm_pmu *armpmu = to_arm_pmu(pmu);
> >   	struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events);
> > -	int enabled = bitmap_weight(hw_events->used_mask, armpmu->num_events);
> > +	bool enabled = !bitmap_empty(hw_events->used_mask, armpmu->num_events);
> >   	/* For task-bound events we may be called on other CPUs */
> >   	if (!cpumask_test_cpu(smp_processor_id(), &armpmu->supported_cpus))
> > @@ -785,7 +785,7 @@ static int cpu_pm_pmu_notify(struct notifier_block *b, unsigned long cmd,
> >   {
> >   	struct arm_pmu *armpmu = container_of(b, struct arm_pmu, cpu_pm_nb);
> >   	struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events);
> > -	int enabled = bitmap_weight(hw_events->used_mask, armpmu->num_events);
> > +	bool enabled = !bitmap_empty(hw_events->used_mask, armpmu->num_events);
> >   	if (!cpumask_test_cpu(smp_processor_id(), &armpmu->supported_cpus))
> >   		return NOTIFY_DONE;
> > diff --git a/drivers/perf/hisilicon/hisi_uncore_pmu.c b/drivers/perf/hisilicon/hisi_uncore_pmu.c
> > index a738aeab5c04..358e4e284a62 100644
> > --- a/drivers/perf/hisilicon/hisi_uncore_pmu.c
> > +++ b/drivers/perf/hisilicon/hisi_uncore_pmu.c
> > @@ -393,7 +393,7 @@ EXPORT_SYMBOL_GPL(hisi_uncore_pmu_read);
> >   void hisi_uncore_pmu_enable(struct pmu *pmu)
> >   {
> >   	struct hisi_pmu *hisi_pmu = to_hisi_pmu(pmu);
> > -	int enabled = bitmap_weight(hisi_pmu->pmu_events.used_mask,
> > +	bool enabled = !bitmap_empty(hisi_pmu->pmu_events.used_mask,
> >   				    hisi_pmu->num_counters);
> >   	if (!enabled)
> > diff --git a/drivers/perf/xgene_pmu.c b/drivers/perf/xgene_pmu.c
> > index 5283608dc055..0c32dffc7ede 100644
> > --- a/drivers/perf/xgene_pmu.c
> > +++ b/drivers/perf/xgene_pmu.c
> > @@ -867,7 +867,7 @@ static void xgene_perf_pmu_enable(struct pmu *pmu)
> >   {
> >   	struct xgene_pmu_dev *pmu_dev = to_pmu_dev(pmu);
> >   	struct xgene_pmu *xgene_pmu = pmu_dev->parent;
> > -	int enabled = bitmap_weight(pmu_dev->cntr_assign_mask,
> > +	bool enabled = !bitmap_empty(pmu_dev->cntr_assign_mask,
> >   			pmu_dev->max_counters);
> 
> Would it make sense to call it 'disabled', remove the "!"...
> 
> >   	if (!enabled)
> ... and 'if (disabled)' here?

People like positive names (as do I):
        $ git grep bool | grep "= \!" | grep -v "= \!\!" | wc -l
        334

The authors probably chose a positive name in this case for a reason.

Replacing 'enabled' with 'disabled' just to avoid the negation would add
nothing to performance or to readability. It would only increase the
noise level of this and other patches for no benefit.

To me that looks like a net negative.

Thanks,
Yury


* Re: [PATCH 39/49] arch/s390: replace cpumask_weight with cpumask_weight_eq where appropriate
  2022-02-11  6:54   ` Sven Schnelle
@ 2022-02-11 23:40     ` Yury Norov
  0 siblings, 0 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-11 23:40 UTC (permalink / raw)
  To: Sven Schnelle
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Heiko Carstens, Vasily Gorbik, Christian Borntraeger,
	Alexander Gordeev, Thomas Richter, Sumanth Korikkar,
	Sebastian Andrzej Siewior, Jiapeng Chong, kernel test robot,
	linux-s390

On Fri, Feb 11, 2022 at 07:54:26AM +0100, Sven Schnelle wrote:
> Hi Yury,
> 
> Yury Norov <yury.norov@gmail.com> writes:
> 
> > cfset_all_start() calls cpumask_weight() to compare the weight of cpumask
> > with a given number. We can do it more efficiently with
> > cpumask_weight_{eq, ...} because conditional cpumask_weight may stop
> > traversing the cpumask earlier, as soon as condition is (or can't be) met.
> >
> > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > ---
> >  arch/s390/kernel/perf_cpum_cf.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/arch/s390/kernel/perf_cpum_cf.c b/arch/s390/kernel/perf_cpum_cf.c
> > index ee8707abdb6a..4d217f7f5ccf 100644
> > --- a/arch/s390/kernel/perf_cpum_cf.c
> > +++ b/arch/s390/kernel/perf_cpum_cf.c
> > @@ -975,7 +975,7 @@ static int cfset_all_start(struct cfset_request *req)
> >  		return -ENOMEM;
> >  	cpumask_and(mask, &req->mask, cpu_online_mask);
> >  	on_each_cpu_mask(mask, cfset_ioctl_on, &p, 1);
> > -	if (atomic_read(&p.cpus_ack) != cpumask_weight(mask)) {
> > +	if (!cpumask_weight_eq(mask, atomic_read(&p.cpus_ack))) {
> >  		on_each_cpu_mask(mask, cfset_ioctl_off, &p, 1);
> >  		rc = -EIO;
> >  		debug_sprintf_event(cf_dbg, 4, "%s CPUs missing", __func__);
> 
> given that you're adding a bunch of these functions - gt,lt,eq and
> others, i wonder whether it makes sense to also add cpumask_weight_ne(),
> so one could just write:
> 
> if (cpumask_weight_ne(mask, atomic_read(&p.cpus_ack))) {
> 	...
> }
> 
> ?

It will have 3 users in cpumask + 1 in nodemask. I have no strong opinion
whether we need it or not. Let's see what people say.

Thanks,
Yury
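
For illustration only, the suggested "ne" variant can be sketched in
userspace C as a thin wrapper over "eq". mask_weight_eq() and
mask_weight_ne() are hypothetical names; nothing here is taken from the
series, and whether such a wrapper should be added is the open question
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for cpumask_weight_eq(); stops once the count passes n. */
bool mask_weight_eq(const unsigned long *mask, size_t nwords, unsigned int n)
{
	unsigned int w = 0;

	for (size_t i = 0; i < nwords; i++) {
		for (unsigned long v = mask[i]; v; v &= v - 1) {
			if (++w > n)
				return false;	/* too many bits: cannot be equal */
		}
	}
	return w == n;
}

/* The suggested "ne" form, expressed as the negation of "eq". */
bool mask_weight_ne(const unsigned long *mask, size_t nwords, unsigned int n)
{
	return !mask_weight_eq(mask, nwords, n);
}
```

A caller could then write "if (mask_weight_ne(mask, 2, expected))" instead of
negating the "eq" call at each site.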


* Re: [PATCH 45/49] ACPI: replace nodes__weight with nodes_weight_ge for numa
  2022-02-10 22:49 ` [PATCH 45/49] ACPI: replace nodes__weight with nodes_weight_ge for numa Yury Norov
@ 2022-02-14 19:18   ` Rafael J. Wysocki
  2022-02-14 19:34     ` Yury Norov
  0 siblings, 1 reply; 98+ messages in thread
From: Rafael J. Wysocki @ 2022-02-14 19:18 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov,
	Linux Kernel Mailing List, Rafael J. Wysocki, Len Brown,
	Dan Williams, Huacai Chen, Vitaly Kuznetsov, Alison Schofield,
	ACPI Devel Maling List

On Fri, Feb 11, 2022 at 1:31 AM Yury Norov <yury.norov@gmail.com> wrote:
>
> acpi_map_pxm_to_node() calls nodes_weight() to compare the weight
> of nodemask with a given number. We can do it more efficiently with
> nodes_weight_ge() because conditional nodes_weight may stop
> traversing the nodemask earlier, as soon as condition is (or is not)
> met.
>
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>  drivers/acpi/numa/srat.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/acpi/numa/srat.c b/drivers/acpi/numa/srat.c
> index 3b818ab186be..fe7a7996f553 100644
> --- a/drivers/acpi/numa/srat.c
> +++ b/drivers/acpi/numa/srat.c
> @@ -67,7 +67,7 @@ int acpi_map_pxm_to_node(int pxm)
>         node = pxm_to_node_map[pxm];
>
>         if (node == NUMA_NO_NODE) {
> -               if (nodes_weight(nodes_found_map) >= MAX_NUMNODES)
> +               if (nodes_weight_ge(nodes_found_map, MAX_NUMNODES))
>                         return NUMA_NO_NODE;
>                 node = first_unset_node(nodes_found_map);
>                 __acpi_map_pxm_to_node(pxm, node);
> --

Applied as 5.18 material, thanks!
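
For illustration, the early exit the commit message refers to can be
sketched in userspace C as below. This is only a sketch under stated
assumptions: mask_weight_ge() is a made-up name, not the kernel's
nodes_weight_ge(); the mask is assumed to be a whole number of
unsigned long words; and __builtin_popcountl() assumes GCC or Clang.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: true if at least n bits are set in mask.
 * Stops as soon as the answer is known in either direction.
 */
static bool mask_weight_ge(const unsigned long *mask, size_t nwords, size_t n)
{
	const size_t bits_per_word = sizeof(unsigned long) * CHAR_BIT;
	size_t w = 0;

	for (size_t i = 0; i < nwords; i++) {
		/* bits not yet examined, including word i */
		size_t remaining = (nwords - i) * bits_per_word;

		if (w >= n)
			return true;	/* condition already met */
		if (w + remaining < n)
			return false;	/* condition can no longer be met */
		w += (size_t)__builtin_popcountl(mask[i]);
	}
	return w >= n;
}
```

With n equal to the total number of bits, as in the MAX_NUMNODES check
above, the loop gives up right after the first word that is not all ones.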

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH 45/49] ACPI: replace nodes__weight with nodes_weight_ge for numa
  2022-02-14 19:18   ` Rafael J. Wysocki
@ 2022-02-14 19:34     ` Yury Norov
  2022-02-14 19:45       ` Rafael J. Wysocki
  0 siblings, 1 reply; 98+ messages in thread
From: Yury Norov @ 2022-02-14 19:34 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov,
	Linux Kernel Mailing List, Len Brown, Dan Williams, Huacai Chen,
	Vitaly Kuznetsov, Alison Schofield, ACPI Devel Maling List

On Mon, Feb 14, 2022 at 08:18:27PM +0100, Rafael J. Wysocki wrote:
> On Fri, Feb 11, 2022 at 1:31 AM Yury Norov <yury.norov@gmail.com> wrote:
> >
> > acpi_map_pxm_to_node() calls nodes_weight() to compare the weight
> > of the nodemask with a given number. We can do it more efficiently with
> > nodes_weight_ge() because conditional nodes_weight may stop
> > traversing the nodemask earlier, as soon as the condition is (or is not)
> > met.
> >
> > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > ---
> >  drivers/acpi/numa/srat.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/acpi/numa/srat.c b/drivers/acpi/numa/srat.c
> > index 3b818ab186be..fe7a7996f553 100644
> > --- a/drivers/acpi/numa/srat.c
> > +++ b/drivers/acpi/numa/srat.c
> > @@ -67,7 +67,7 @@ int acpi_map_pxm_to_node(int pxm)
> >         node = pxm_to_node_map[pxm];
> >
> >         if (node == NUMA_NO_NODE) {
> > -               if (nodes_weight(nodes_found_map) >= MAX_NUMNODES)
> > +               if (nodes_weight_ge(nodes_found_map, MAX_NUMNODES))
> >                         return NUMA_NO_NODE;
> >                 node = first_unset_node(nodes_found_map);
> >                 __acpi_map_pxm_to_node(pxm, node);
> > --
> 
> Applied as 5.18 material, thanks!

It depends on patches 44 and 26. Are you applying them too?

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH 45/49] ACPI: replace nodes__weight with nodes_weight_ge for numa
  2022-02-14 19:34     ` Yury Norov
@ 2022-02-14 19:45       ` Rafael J. Wysocki
  2022-02-14 19:55         ` Yury Norov
  0 siblings, 1 reply; 98+ messages in thread
From: Rafael J. Wysocki @ 2022-02-14 19:45 UTC (permalink / raw)
  To: Yury Norov
  Cc: Rafael J. Wysocki, Andy Shevchenko, Rasmus Villemoes,
	Andrew Morton, Michał Mirosław, Greg Kroah-Hartman,
	Peter Zijlstra, David Laight, Joe Perches, Dennis Zhou,
	Emil Renner Berthing, Nicholas Piggin, Matti Vaittinen,
	Alexey Klimov, Linux Kernel Mailing List, Len Brown,
	Dan Williams, Huacai Chen, Vitaly Kuznetsov, Alison Schofield,
	ACPI Devel Maling List

On Mon, Feb 14, 2022 at 8:36 PM Yury Norov <yury.norov@gmail.com> wrote:
>
> On Mon, Feb 14, 2022 at 08:18:27PM +0100, Rafael J. Wysocki wrote:
> > On Fri, Feb 11, 2022 at 1:31 AM Yury Norov <yury.norov@gmail.com> wrote:
> > >
> > > acpi_map_pxm_to_node() calls nodes_weight() to compare the weight
> > > of the nodemask with a given number. We can do it more efficiently with
> > > nodes_weight_ge() because conditional nodes_weight may stop
> > > traversing the nodemask earlier, as soon as the condition is (or is not)
> > > met.
> > >
> > > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > > ---
> > >  drivers/acpi/numa/srat.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/acpi/numa/srat.c b/drivers/acpi/numa/srat.c
> > > index 3b818ab186be..fe7a7996f553 100644
> > > --- a/drivers/acpi/numa/srat.c
> > > +++ b/drivers/acpi/numa/srat.c
> > > @@ -67,7 +67,7 @@ int acpi_map_pxm_to_node(int pxm)
> > >         node = pxm_to_node_map[pxm];
> > >
> > >         if (node == NUMA_NO_NODE) {
> > > -               if (nodes_weight(nodes_found_map) >= MAX_NUMNODES)
> > > +               if (nodes_weight_ge(nodes_found_map, MAX_NUMNODES))
> > >                         return NUMA_NO_NODE;
> > >                 node = first_unset_node(nodes_found_map);
> > >                 __acpi_map_pxm_to_node(pxm, node);
> > > --
> >
> > Applied as 5.18 material, thanks!
>
> It depends on patches 44 and 26. Are you applying them too?

No, I'm not (I've only received this one directly).

I'll drop this patch now and please feel free to add my ACK to it.

Thanks!

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH 45/49] ACPI: replace nodes__weight with nodes_weight_ge for numa
  2022-02-14 19:45       ` Rafael J. Wysocki
@ 2022-02-14 19:55         ` Yury Norov
  0 siblings, 0 replies; 98+ messages in thread
From: Yury Norov @ 2022-02-14 19:55 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov,
	Linux Kernel Mailing List, Len Brown, Dan Williams, Huacai Chen,
	Vitaly Kuznetsov, Alison Schofield, ACPI Devel Maling List

On Mon, Feb 14, 2022 at 11:45 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Mon, Feb 14, 2022 at 8:36 PM Yury Norov <yury.norov@gmail.com> wrote:
> >
> > On Mon, Feb 14, 2022 at 08:18:27PM +0100, Rafael J. Wysocki wrote:
> > > On Fri, Feb 11, 2022 at 1:31 AM Yury Norov <yury.norov@gmail.com> wrote:
> > > >
> > > > acpi_map_pxm_to_node() calls nodes_weight() to compare the weight
> > > > of the nodemask with a given number. We can do it more efficiently with
> > > > nodes_weight_ge() because conditional nodes_weight may stop
> > > > traversing the nodemask earlier, as soon as the condition is (or is not)
> > > > met.
> > > >
> > > > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > > > ---
> > > >  drivers/acpi/numa/srat.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/acpi/numa/srat.c b/drivers/acpi/numa/srat.c
> > > > index 3b818ab186be..fe7a7996f553 100644
> > > > --- a/drivers/acpi/numa/srat.c
> > > > +++ b/drivers/acpi/numa/srat.c
> > > > @@ -67,7 +67,7 @@ int acpi_map_pxm_to_node(int pxm)
> > > >         node = pxm_to_node_map[pxm];
> > > >
> > > >         if (node == NUMA_NO_NODE) {
> > > > -               if (nodes_weight(nodes_found_map) >= MAX_NUMNODES)
> > > > +               if (nodes_weight_ge(nodes_found_map, MAX_NUMNODES))
> > > >                         return NUMA_NO_NODE;
> > > >                 node = first_unset_node(nodes_found_map);
> > > >                 __acpi_map_pxm_to_node(pxm, node);
> > > > --
> > >
> > > Applied as 5.18 material, thanks!
> >
> > It depends on patches 44 and 26. Are you applying them too?
>
> No, I'm not (I've only received this one directly).
>
> I'll drop this patch now and please feel free to add my ACK to it.

OK, thanks.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage
  2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (48 preceding siblings ...)
  2022-02-10 22:49 ` [PATCH 49/49] MAINTAINERS: add cpumask and nodemask files to BITMAP_API Yury Norov
@ 2022-02-15 23:18 ` Will Deacon
  49 siblings, 0 replies; 98+ messages in thread
From: Will Deacon @ 2022-02-15 23:18 UTC (permalink / raw)
  To: Peter Zijlstra, Greg Kroah-Hartman, Andrew Morton, David Laight,
	Andy Shevchenko, Alexey Klimov, Yury Norov, Matti Vaittinen,
	Dennis Zhou, Joe Perches, linux-kernel, Rasmus Villemoes,
	Emil Renner Berthing, Nicholas Piggin, Michał Mirosław
  Cc: catalin.marinas, kernel-team, Will Deacon

On Thu, 10 Feb 2022 14:48:44 -0800, Yury Norov wrote:
> In many cases people use bitmap_weight()-based functions to compare
> the result against a number of expression:
> 
>         if (cpumask_weight(mask) > 1)
>                 do_something();
> 
> This may take considerable amount of time on many-cpus machines because
> cpumask_weight() will traverse every word of underlying cpumask
> unconditionally.
> 
> [...]

Applied to will (for-next/perf), thanks!

[12/49] perf: replace bitmap_weight with bitmap_empty where appropriate
        https://git.kernel.org/will/c/95ed57c73bbc

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH 29/49] memstick: replace bitmap_weight with bitmap_weight_eq where appropriate
  2022-02-10 22:49 ` [PATCH 29/49] memstick: replace bitmap_weight with bitmap_weight_eq " Yury Norov
@ 2022-02-17 15:39   ` Ulf Hansson
  2022-02-17 16:55     ` Yury Norov
  0 siblings, 1 reply; 98+ messages in thread
From: Ulf Hansson @ 2022-02-17 15:39 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Maxim Levitsky, Alex Dubov, Jens Axboe, Luis Chamberlain,
	Colin Ian King, Arnd Bergmann, Shubhankar Kuranagatti, linux-mmc,
	Shubhankar Kuranagatti

On Fri, 11 Feb 2022 at 00:55, Yury Norov <yury.norov@gmail.com> wrote:
>
> msb_validate_used_block_bitmap() calls bitmap_weight() to compare the
> weight of bitmap with a given number. We can do it more efficiently with
> bitmap_weight_eq because conditional bitmap_weight may stop traversing the
> bitmap earlier, as soon as condition is (or can't be) met.
>
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
> Acked-by: Shubhankar Kuranagatti <shubhankar.vk@gmail.com>

Applied for next, thanks!

Kind regards
Uffe


> ---
>  drivers/memstick/core/ms_block.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/memstick/core/ms_block.c b/drivers/memstick/core/ms_block.c
> index 0cda6c6baefc..5cdd987e78f7 100644
> --- a/drivers/memstick/core/ms_block.c
> +++ b/drivers/memstick/core/ms_block.c
> @@ -155,8 +155,8 @@ static int msb_validate_used_block_bitmap(struct msb_data *msb)
>         for (i = 0; i < msb->zone_count; i++)
>                 total_free_blocks += msb->free_block_count[i];
>
> -       if (msb->block_count - bitmap_weight(msb->used_blocks_bitmap,
> -                                       msb->block_count) == total_free_blocks)
> +       if (bitmap_weight_eq(msb->used_blocks_bitmap, msb->block_count,
> +                               msb->block_count - total_free_blocks))
>                 return 0;
>
>         pr_err("BUG: free block counts don't match the bitmap");
> --
> 2.32.0
>

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH 29/49] memstick: replace bitmap_weight with bitmap_weight_eq where appropriate
  2022-02-17 15:39   ` Ulf Hansson
@ 2022-02-17 16:55     ` Yury Norov
  2022-02-22 15:49       ` Ulf Hansson
  0 siblings, 1 reply; 98+ messages in thread
From: Yury Norov @ 2022-02-17 16:55 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Maxim Levitsky, Alex Dubov, Jens Axboe, Luis Chamberlain,
	Colin Ian King, Arnd Bergmann, Shubhankar Kuranagatti, linux-mmc,
	Shubhankar Kuranagatti

On Thu, Feb 17, 2022 at 7:39 AM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> On Fri, 11 Feb 2022 at 00:55, Yury Norov <yury.norov@gmail.com> wrote:
> >
> > msb_validate_used_block_bitmap() calls bitmap_weight() to compare the
> > weight of bitmap with a given number. We can do it more efficiently with
> > bitmap_weight_eq because conditional bitmap_weight may stop traversing the
> > bitmap earlier, as soon as condition is (or can't be) met.
> >
> > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
> > Acked-by: Shubhankar Kuranagatti <shubhankar.vk@gmail.com>
>
> Applied for next, thanks!

Hi Ulf,

This patch depends on patch 26/49 "bitmap: add bitmap_weight_{cmp, eq,
gt, ge, lt, le} functions" from this series. Can you make sure you applied
them together? Or I can apply it later.

Thanks,
Yury
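
For readers without patch 26/49 at hand, the general shape of such a
helper family can be sketched in userspace C roughly as follows. This is
an illustration under assumptions, not the code from the patch: the
mask_weight_*() names and signatures are made up, the bitmap is assumed
to be a whole number of unsigned long words, and __builtin_popcountl()
assumes GCC or Clang.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical comparison helper: returns 0 if the number of set bits
 * equals num, 1 if it is greater, -1 if it is less. Counting stops as
 * soon as the running weight exceeds num.
 */
static int mask_weight_cmp(const unsigned long *mask, size_t nwords, size_t num)
{
	size_t w = 0;

	for (size_t i = 0; i < nwords; i++) {
		w += (size_t)__builtin_popcountl(mask[i]);
		if (w > num)
			return 1;
	}
	return w == num ? 0 : -1;
}

/* Thin wrappers on top of the comparison helper. */
static bool mask_weight_eq(const unsigned long *m, size_t nwords, size_t num)
{
	return mask_weight_cmp(m, nwords, num) == 0;
}

static bool mask_weight_gt(const unsigned long *m, size_t nwords, size_t num)
{
	return mask_weight_cmp(m, nwords, num) > 0;
}

static bool mask_weight_ge(const unsigned long *m, size_t nwords, size_t num)
{
	return mask_weight_cmp(m, nwords, num) >= 0;
}

static bool mask_weight_lt(const unsigned long *m, size_t nwords, size_t num)
{
	return mask_weight_cmp(m, nwords, num) < 0;
}

static bool mask_weight_le(const unsigned long *m, size_t nwords, size_t num)
{
	return mask_weight_cmp(m, nwords, num) <= 0;
}
```

The memstick conversion relies on rewriting
"block_count - weight == total_free_blocks" as
"weight == block_count - total_free_blocks", so that the number to compare
against is known before counting starts. In this sketch, a "not equal"
check like the one floated elsewhere in the thread would simply be
mask_weight_cmp(...) != 0.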

^ permalink raw reply	[flat|nested] 98+ messages in thread

* [tip: sched/core] sched: replace cpumask_weight with cpumask_empty where appropriate
  2022-02-10 22:49 ` [PATCH 22/49] sched: replace cpumask_weight with cpumask_empty " Yury Norov
  2022-02-11 10:19   ` Peter Zijlstra
@ 2022-02-17 18:56   ` tip-bot2 for Yury Norov
  1 sibling, 0 replies; 98+ messages in thread
From: tip-bot2 for Yury Norov @ 2022-02-17 18:56 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Yury Norov, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     1087ad4e3f88c474b8134a482720782922bf3fdf
Gitweb:        https://git.kernel.org/tip/1087ad4e3f88c474b8134a482720782922bf3fdf
Author:        Yury Norov <yury.norov@gmail.com>
AuthorDate:    Thu, 10 Feb 2022 14:49:06 -08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Wed, 16 Feb 2022 15:57:53 +01:00

sched: replace cpumask_weight with cpumask_empty where appropriate

In some places, kernel/sched code calls cpumask_weight() to check if
any bit of a given cpumask is set. We can do it more efficiently with
cpumask_empty() because cpumask_empty() stops traversing the cpumask as
soon as it finds the first set bit, while cpumask_weight() counts all bits
unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220210224933.379149-23-yury.norov@gmail.com
---
 kernel/sched/core.c     | 2 +-
 kernel/sched/topology.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 1d863d7..c620aab 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8708,7 +8708,7 @@ int cpuset_cpumask_can_shrink(const struct cpumask *cur,
 {
 	int ret = 1;
 
-	if (!cpumask_weight(cur))
+	if (cpumask_empty(cur))
 		return ret;
 
 	ret = dl_cpuset_cpumask_can_shrink(cur, trial);
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index e6cd559..1c84b48 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -74,7 +74,7 @@ static int sched_domain_debug_one(struct sched_domain *sd, int cpu, int level,
 			break;
 		}
 
-		if (!cpumask_weight(sched_group_span(group))) {
+		if (cpumask_empty(sched_group_span(group))) {
 			printk(KERN_CONT "\n");
 			printk(KERN_ERR "ERROR: empty group\n");
 			break;
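
As a rough userspace illustration of the difference described in the
commit message: the sketch below is not the kernel implementation.
mask_weight() and mask_empty() are made-up names, the mask is assumed to
be a whole number of unsigned long words, and __builtin_popcountl()
assumes GCC or Clang.

```c
#include <stdbool.h>
#include <stddef.h>

/* Counts set bits in every word: cost is always proportional to nwords. */
static unsigned int mask_weight(const unsigned long *mask, size_t nwords)
{
	unsigned int w = 0;

	for (size_t i = 0; i < nwords; i++)
		w += (unsigned int)__builtin_popcountl(mask[i]);
	return w;
}

/* Emptiness test: returns as soon as a non-zero word is seen. */
static bool mask_empty(const unsigned long *mask, size_t nwords)
{
	for (size_t i = 0; i < nwords; i++)
		if (mask[i])
			return false;
	return true;
}
```

Both "!mask_weight(m, n)" and "mask_empty(m, n)" give the same answer; the
second one just does not need to look past the first non-zero word.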

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* Re: [PATCH 29/49] memstick: replace bitmap_weight with bitmap_weight_eq where appropriate
  2022-02-17 16:55     ` Yury Norov
@ 2022-02-22 15:49       ` Ulf Hansson
  0 siblings, 0 replies; 98+ messages in thread
From: Ulf Hansson @ 2022-02-22 15:49 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Maxim Levitsky, Alex Dubov, Jens Axboe, Luis Chamberlain,
	Colin Ian King, Arnd Bergmann, Shubhankar Kuranagatti, linux-mmc,
	Shubhankar Kuranagatti

On Thu, 17 Feb 2022 at 17:55, Yury Norov <yury.norov@gmail.com> wrote:
>
> On Thu, Feb 17, 2022 at 7:39 AM Ulf Hansson <ulf.hansson@linaro.org> wrote:
> >
> > On Fri, 11 Feb 2022 at 00:55, Yury Norov <yury.norov@gmail.com> wrote:
> > >
> > > msb_validate_used_block_bitmap() calls bitmap_weight() to compare the
> > > weight of bitmap with a given number. We can do it more efficiently with
> > > bitmap_weight_eq because conditional bitmap_weight may stop traversing the
> > > bitmap earlier, as soon as condition is (or can't be) met.
> > >
> > > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > > Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
> > > Acked-by: Shubhankar Kuranagatti <shubhankar.vk@gmail.com>
> >
> > Applied for next, thanks!
>
> Hi Ulf,
>
> This patch depends on patch 26/49 "bitmap: add bitmap_weight_{cmp, eq,
> gt, ge, lt, le} functions" from this series. Can you make sure you applied
> them together? Or I can apply it later.

I can't apply them, unless there is an immutable branch being shared
between the different trees.

Therefore I have dropped the patch for now.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 98+ messages in thread

* [tip: irq/core] genirq/affinity: Replace cpumask_weight() with cpumask_empty() where appropriate
  2022-02-10 22:49 ` [PATCH 21/49] genirq/affinity: replace cpumask_weight with cpumask_empty where appropriate Yury Norov
@ 2022-04-10 20:27   ` tip-bot2 for Yury Norov
       [not found]     ` <573841649622719@mail.yandex.com>
  0 siblings, 1 reply; 98+ messages in thread
From: tip-bot2 for Yury Norov @ 2022-04-10 20:27 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Yury Norov, Thomas Gleixner, x86, linux-kernel, maz

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     911488de0565f1d53bd36174d20917ebc4b44c0e
Gitweb:        https://git.kernel.org/tip/911488de0565f1d53bd36174d20917ebc4b44c0e
Author:        Yury Norov <yury.norov@gmail.com>
AuthorDate:    Thu, 10 Feb 2022 14:49:05 -08:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Sun, 10 Apr 2022 22:20:28 +02:00

genirq/affinity: Replace cpumask_weight() with cpumask_empty() where appropriate

__irq_build_affinity_masks() calls cpumask_weight() to check if any bit of
a given cpumask is set.

This can be done more efficiently with cpumask_empty() because
cpumask_empty() stops traversing the cpumask as soon as it finds the first
set bit, while cpumask_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20220210224933.379149-22-yury.norov@gmail.com

---
 kernel/irq/affinity.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
index f7ff891..18740fa 100644
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -258,7 +258,7 @@ static int __irq_build_affinity_masks(unsigned int startvec,
 	nodemask_t nodemsk = NODE_MASK_NONE;
 	struct node_vectors *node_vectors;
 
-	if (!cpumask_weight(cpu_mask))
+	if (cpumask_empty(cpu_mask))
 		return 0;
 
 	nodes = get_nodes_in_cpumask(node_to_cpumask, cpu_mask, &nodemsk);

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [tip: irq/core] irqchip/bmips: Replace cpumask_weight() with cpumask_empty()
  2022-02-10 22:49 ` [PATCH 20/49] irq: mips: " Yury Norov
@ 2022-04-10 20:34   ` tip-bot2 for Yury Norov
  0 siblings, 0 replies; 98+ messages in thread
From: tip-bot2 for Yury Norov @ 2022-04-10 20:34 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Yury Norov, Thomas Gleixner, Florian Fainelli, x86, linux-kernel, maz

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     0de61d739c21003201a0adb1f5c403f89a7c2441
Gitweb:        https://git.kernel.org/tip/0de61d739c21003201a0adb1f5c403f89a7c2441
Author:        Yury Norov <yury.norov@gmail.com>
AuthorDate:    Thu, 10 Feb 2022 14:49:04 -08:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Sun, 10 Apr 2022 22:28:28 +02:00

irqchip/bmips: Replace cpumask_weight() with cpumask_empty()

bcm6345_l1_of_init() calls cpumask_weight() to check if any bit of a given
cpumask is set.

This can be done more efficiently with cpumask_empty() because
cpumask_empty() stops traversing the cpumask as soon as it finds the first
set bit, while cpumask_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220210224933.379149-21-yury.norov@gmail.com

---
 drivers/irqchip/irq-bcm6345-l1.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-bcm6345-l1.c b/drivers/irqchip/irq-bcm6345-l1.c
index fd07921..142a743 100644
--- a/drivers/irqchip/irq-bcm6345-l1.c
+++ b/drivers/irqchip/irq-bcm6345-l1.c
@@ -315,7 +315,7 @@ static int __init bcm6345_l1_of_init(struct device_node *dn,
 			cpumask_set_cpu(idx, &intc->cpumask);
 	}
 
-	if (!cpumask_weight(&intc->cpumask)) {
+	if (cpumask_empty(&intc->cpumask)) {
 		ret = -ENODEV;
 		goto out_free;
 	}

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [tip: timers/core] clocksource: Replace cpumask_weight() with cpumask_empty()
  2022-02-10 22:49 ` [PATCH 23/49] clocksource: replace cpumask_weight with cpumask_empty in clocksource.c Yury Norov
@ 2022-04-10 20:35   ` tip-bot2 for Yury Norov
  0 siblings, 0 replies; 98+ messages in thread
From: tip-bot2 for Yury Norov @ 2022-04-10 20:35 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Yury Norov, Thomas Gleixner, x86, linux-kernel

The following commit has been merged into the timers/core branch of tip:

Commit-ID:     8afbcaf8690dac19ebf570a4e4fef9c59c75bf8e
Gitweb:        https://git.kernel.org/tip/8afbcaf8690dac19ebf570a4e4fef9c59c75bf8e
Author:        Yury Norov <yury.norov@gmail.com>
AuthorDate:    Thu, 10 Feb 2022 14:49:07 -08:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Sun, 10 Apr 2022 22:30:04 +02:00

clocksource: Replace cpumask_weight() with cpumask_empty()

clocksource_verify_percpu() calls cpumask_weight() to check if any bit of a
given cpumask is set.

This can be done more efficiently with cpumask_empty() because
cpumask_empty() stops traversing the cpumask as soon as it finds the first
set bit, while cpumask_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20220210224933.379149-24-yury.norov@gmail.com

---
 kernel/time/clocksource.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 95d7ca3..cee5da1 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -343,7 +343,7 @@ void clocksource_verify_percpu(struct clocksource *cs)
 	cpus_read_lock();
 	preempt_disable();
 	clocksource_verify_choose_cpus();
-	if (cpumask_weight(&cpus_chosen) == 0) {
+	if (cpumask_empty(&cpus_chosen)) {
 		preempt_enable();
 		cpus_read_unlock();
 		pr_warn("Not enough CPUs to check clocksource '%s'.\n", cs->name);

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [tip: x86/cleanups] x86/mm: Replace nodes_weight() with nodes_empty() where appropriate
  2022-02-10 22:49 ` [PATCH 25/49] arch/x86: replace nodes_weight with nodes_empty " Yury Norov
@ 2022-04-10 20:42   ` tip-bot2 for Yury Norov
  0 siblings, 0 replies; 98+ messages in thread
From: tip-bot2 for Yury Norov @ 2022-04-10 20:42 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Yury Norov, Thomas Gleixner, x86, linux-kernel

The following commit has been merged into the x86/cleanups branch of tip:

Commit-ID:     c2a911d302b0d014a4d0d732a2bfc319e643eb62
Gitweb:        https://git.kernel.org/tip/c2a911d302b0d014a4d0d732a2bfc319e643eb62
Author:        Yury Norov <yury.norov@gmail.com>
AuthorDate:    Thu, 10 Feb 2022 14:49:09 -08:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Sun, 10 Apr 2022 22:35:38 +02:00

x86/mm: Replace nodes_weight() with nodes_empty() where appropriate

Various mm code calls nodes_weight() to check if any bit of a given
nodemask is set.

This can be done more efficiently with nodes_empty() because nodes_empty()
stops traversing the nodemask as soon as it finds the first set bit, while
nodes_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20220210224933.379149-26-yury.norov@gmail.com

---
 arch/x86/mm/amdtopology.c    | 2 +-
 arch/x86/mm/numa_emulation.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/mm/amdtopology.c b/arch/x86/mm/amdtopology.c
index 058b2f3..b3ca7d2 100644
--- a/arch/x86/mm/amdtopology.c
+++ b/arch/x86/mm/amdtopology.c
@@ -154,7 +154,7 @@ int __init amd_numa_init(void)
 		node_set(nodeid, numa_nodes_parsed);
 	}
 
-	if (!nodes_weight(numa_nodes_parsed))
+	if (nodes_empty(numa_nodes_parsed))
 		return -ENOENT;
 
 	/*
diff --git a/arch/x86/mm/numa_emulation.c b/arch/x86/mm/numa_emulation.c
index 1a02b79..9a93053 100644
--- a/arch/x86/mm/numa_emulation.c
+++ b/arch/x86/mm/numa_emulation.c
@@ -123,7 +123,7 @@ static int __init split_nodes_interleave(struct numa_meminfo *ei,
 	 * Continue to fill physical nodes with fake nodes until there is no
 	 * memory left on any of them.
 	 */
-	while (nodes_weight(physnode_mask)) {
+	while (!nodes_empty(physnode_mask)) {
 		for_each_node_mask(i, physnode_mask) {
 			u64 dma32_end = PFN_PHYS(MAX_DMA32_PFN);
 			u64 start, limit, end;
@@ -270,7 +270,7 @@ static int __init split_nodes_size_interleave_uniform(struct numa_meminfo *ei,
 	 * Fill physical nodes with fake nodes of size until there is no memory
 	 * left on any of them.
 	 */
-	while (nodes_weight(physnode_mask)) {
+	while (!nodes_empty(physnode_mask)) {
 		for_each_node_mask(i, physnode_mask) {
 			u64 dma32_end = PFN_PHYS(MAX_DMA32_PFN);
 			u64 start, limit, end;

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [tip: x86/cleanups] x86: Replace cpumask_weight() with cpumask_empty() where appropriate
  2022-02-10 22:49 ` [PATCH 16/49] arch/x86: " Yury Norov
@ 2022-04-10 20:42   ` tip-bot2 for Yury Norov
  0 siblings, 0 replies; 98+ messages in thread
From: tip-bot2 for Yury Norov @ 2022-04-10 20:42 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Yury Norov, Thomas Gleixner, Steve Wahl, x86, linux-kernel

The following commit has been merged into the x86/cleanups branch of tip:

Commit-ID:     3a5ff1f6dd50f5e1c2aa87491910dd6d275af24b
Gitweb:        https://git.kernel.org/tip/3a5ff1f6dd50f5e1c2aa87491910dd6d275af24b
Author:        Yury Norov <yury.norov@gmail.com>
AuthorDate:    Thu, 10 Feb 2022 14:49:00 -08:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Sun, 10 Apr 2022 22:35:38 +02:00

x86: Replace cpumask_weight() with cpumask_empty() where appropriate

In some cases, x86 code calls cpumask_weight() to check if any bit of a
given cpumask is set.

This can be done more efficiently with cpumask_empty() because
cpumask_empty() stops traversing the cpumask as soon as it finds the first
set bit, while cpumask_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Steve Wahl <steve.wahl@hpe.com>
Link: https://lore.kernel.org/r/20220210224933.379149-17-yury.norov@gmail.com

---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 14 +++++++-------
 arch/x86/mm/mmio-mod.c                 |  2 +-
 arch/x86/platform/uv/uv_nmi.c          |  2 +-
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 83f901e..f276aff 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -341,14 +341,14 @@ static int cpus_mon_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask,
 
 	/* Check whether cpus belong to parent ctrl group */
 	cpumask_andnot(tmpmask, newmask, &prgrp->cpu_mask);
-	if (cpumask_weight(tmpmask)) {
+	if (!cpumask_empty(tmpmask)) {
 		rdt_last_cmd_puts("Can only add CPUs to mongroup that belong to parent\n");
 		return -EINVAL;
 	}
 
 	/* Check whether cpus are dropped from this group */
 	cpumask_andnot(tmpmask, &rdtgrp->cpu_mask, newmask);
-	if (cpumask_weight(tmpmask)) {
+	if (!cpumask_empty(tmpmask)) {
 		/* Give any dropped cpus to parent rdtgroup */
 		cpumask_or(&prgrp->cpu_mask, &prgrp->cpu_mask, tmpmask);
 		update_closid_rmid(tmpmask, prgrp);
@@ -359,7 +359,7 @@ static int cpus_mon_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask,
 	 * and update per-cpu rmid
 	 */
 	cpumask_andnot(tmpmask, newmask, &rdtgrp->cpu_mask);
-	if (cpumask_weight(tmpmask)) {
+	if (!cpumask_empty(tmpmask)) {
 		head = &prgrp->mon.crdtgrp_list;
 		list_for_each_entry(crgrp, head, mon.crdtgrp_list) {
 			if (crgrp == rdtgrp)
@@ -394,7 +394,7 @@ static int cpus_ctrl_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask,
 
 	/* Check whether cpus are dropped from this group */
 	cpumask_andnot(tmpmask, &rdtgrp->cpu_mask, newmask);
-	if (cpumask_weight(tmpmask)) {
+	if (!cpumask_empty(tmpmask)) {
 		/* Can't drop from default group */
 		if (rdtgrp == &rdtgroup_default) {
 			rdt_last_cmd_puts("Can't drop CPUs from default group\n");
@@ -413,12 +413,12 @@ static int cpus_ctrl_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask,
 	 * and update per-cpu closid/rmid.
 	 */
 	cpumask_andnot(tmpmask, newmask, &rdtgrp->cpu_mask);
-	if (cpumask_weight(tmpmask)) {
+	if (!cpumask_empty(tmpmask)) {
 		list_for_each_entry(r, &rdt_all_groups, rdtgroup_list) {
 			if (r == rdtgrp)
 				continue;
 			cpumask_and(tmpmask1, &r->cpu_mask, tmpmask);
-			if (cpumask_weight(tmpmask1))
+			if (!cpumask_empty(tmpmask1))
 				cpumask_rdtgrp_clear(r, tmpmask1);
 		}
 		update_closid_rmid(tmpmask, rdtgrp);
@@ -488,7 +488,7 @@ static ssize_t rdtgroup_cpus_write(struct kernfs_open_file *of,
 
 	/* check that user didn't specify any offline cpus */
 	cpumask_andnot(tmpmask, newmask, cpu_online_mask);
-	if (cpumask_weight(tmpmask)) {
+	if (!cpumask_empty(tmpmask)) {
 		ret = -EINVAL;
 		rdt_last_cmd_puts("Can only assign online CPUs\n");
 		goto unlock;
diff --git a/arch/x86/mm/mmio-mod.c b/arch/x86/mm/mmio-mod.c
index 933a2eb..c3317f0 100644
--- a/arch/x86/mm/mmio-mod.c
+++ b/arch/x86/mm/mmio-mod.c
@@ -400,7 +400,7 @@ static void leave_uniprocessor(void)
 	int cpu;
 	int err;
 
-	if (!cpumask_available(downed_cpus) || cpumask_weight(downed_cpus) == 0)
+	if (!cpumask_available(downed_cpus) || cpumask_empty(downed_cpus))
 		return;
 	pr_notice("Re-enabling CPUs...\n");
 	for_each_cpu(cpu, downed_cpus) {
diff --git a/arch/x86/platform/uv/uv_nmi.c b/arch/x86/platform/uv/uv_nmi.c
index 1e9ff28..ea277fc 100644
--- a/arch/x86/platform/uv/uv_nmi.c
+++ b/arch/x86/platform/uv/uv_nmi.c
@@ -985,7 +985,7 @@ static int uv_handle_nmi(unsigned int reason, struct pt_regs *regs)
 
 	/* Clear global flags */
 	if (master) {
-		if (cpumask_weight(uv_nmi_cpu_mask))
+		if (!cpumask_empty(uv_nmi_cpu_mask))
 			uv_nmi_cleanup_mask();
 		atomic_set(&uv_nmi_cpus_in_nmi, -1);
 		atomic_set(&uv_nmi_cpu, -1);

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* Re: [tip: irq/core] genirq/affinity: Replace cpumask_weight() with cpumask_empty() where appropriate
       [not found]     ` <573841649622719@mail.yandex.com>
@ 2022-04-10 21:17       ` Yury Norov
  0 siblings, 0 replies; 98+ messages in thread
From: Yury Norov @ 2022-04-10 21:17 UTC (permalink / raw)
  To: Ozgur
  Cc: linux-kernel, linux-tip-commits, Thomas Gleixner, x86, maz, Ming Lei

On Sun, Apr 10, 2022 at 1:35 PM Ozgur <ozgur@linux.com> wrote:
>
>
>
> 10.04.2022, 23:27, "tip-bot2 for Yury Norov" <tip-bot2@linutronix.de>:
>
> The following commit has been merged into the irq/core branch of tip:
>
> Commit-ID: 911488de0565f1d53bd36174d20917ebc4b44c0e
> Gitweb: https://git.kernel.org/tip/911488de0565f1d53bd36174d20917ebc4b44c0e
> Author: Yury Norov <yury.norov@gmail.com>
> AuthorDate: Thu, 10 Feb 2022 14:49:05 -08:00
> Committer: Thomas Gleixner <tglx@linutronix.de>
> CommitterDate: Sun, 10 Apr 2022 22:20:28 +02:00
>
> genirq/affinity: Replace cpumask_weight() with cpumask_empty() where appropriate
>
> __irq_build_affinity_masks() calls cpumask_weight() to check if any bit of
> a given cpumask is set.
>
> This can be done more efficiently with cpumask_empty() because
> cpumask_empty() stops traversing the cpumask as soon as it finds the first
> set bit, while cpumask_weight() counts all bits unconditionally.
>
> Hello,
> In this patch, struct cpumask *nmsk will also be affected, because
> ncpus = cpumask_weight(nmsk); is called, right?

Sorry, I don't understand that. The line that you mentioned can't
modify nmsk, either before or after this patch. Can you clarify your
concern in greater detail?

Thanks,
Yury

> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Link: https://lore.kernel.org/r/20220210224933.379149-22-yury.norov@gmail.com
>
> ---
>  kernel/irq/affinity.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
> index f7ff891..18740fa 100644
> --- a/kernel/irq/affinity.c
> +++ b/kernel/irq/affinity.c
> @@ -258,7 +258,7 @@ static int __irq_build_affinity_masks(unsigned int startvec,
>          nodemask_t nodemsk = NODE_MASK_NONE;
>          struct node_vectors *node_vectors;
>
> - if (!cpumask_weight(cpu_mask))
> + if (cpumask_empty(cpu_mask))
>                  return 0;
>
>          nodes = get_nodes_in_cpumask(node_to_cpumask, cpu_mask, &nodemsk);
>
>
> Ozgur
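
For illustration, here is a minimal userspace sketch of the two points made
above: an emptiness check can return at the first nonzero word, while a
weight count has to visit every word, and neither one writes to the mask.
mask_weight() and mask_empty() are hypothetical stand-ins, not the kernel's
cpumask API, and __builtin_popcountl() assumes GCC or Clang.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a weight function: counts set bits in
 * every word, no matter how early a set bit is seen.
 */
static unsigned int mask_weight(const unsigned long *mask, size_t nwords)
{
	unsigned int w = 0;

	assert(mask != NULL);
	for (size_t i = 0; i < nwords; i++)
		w += (unsigned int)__builtin_popcountl(mask[i]);
	return w;
}

/*
 * Hypothetical stand-in for an emptiness check: returns as soon as
 * one nonzero word is found.
 */
static bool mask_empty(const unsigned long *mask, size_t nwords)
{
	assert(mask != NULL);
	for (size_t i = 0; i < nwords; i++)
		if (mask[i])
			return false;	/* first set bit found: stop here */
	return true;
}
```

Both helpers take a pointer to const, so the caller's mask is left untouched
whichever one is used; only the amount of scanning differs.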


* Re: [PATCH 34/49] media: tegra-video: replace bitmap_weight with bitmap_weight_le
  2022-02-10 22:49 ` [PATCH 34/49] media: tegra-video: replace bitmap_weight with bitmap_weight_le Yury Norov
@ 2022-04-28  7:31   ` Hans Verkuil
  0 siblings, 0 replies; 98+ messages in thread
From: Hans Verkuil @ 2022-04-28  7:31 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Thierry Reding, Jonathan Hunter, Sowjanya Komatineni,
	Mauro Carvalho Chehab, linux-media, linux-tegra, linux-staging

On 10/02/2022 23:49, Yury Norov wrote:
> tegra_channel_enum_format() calls bitmap_weight() to compare the weight
> of a bitmap with a given number. We can do it more efficiently with
> bitmap_weight_le() because the conditional variant may stop traversing
> the bitmap early, as soon as the condition is met (or can no longer be met).
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>

Acked-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>

Regards,

	Hans

> ---
>  drivers/staging/media/tegra-video/vi.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/staging/media/tegra-video/vi.c b/drivers/staging/media/tegra-video/vi.c
> index d1f43f465c22..4e79a80e9307 100644
> --- a/drivers/staging/media/tegra-video/vi.c
> +++ b/drivers/staging/media/tegra-video/vi.c
> @@ -436,7 +436,7 @@ static int tegra_channel_enum_format(struct file *file, void *fh,
>  	if (!IS_ENABLED(CONFIG_VIDEO_TEGRA_TPG))
>  		fmts_bitmap = chan->fmts_bitmap;
>  
> -	if (f->index >= bitmap_weight(fmts_bitmap, MAX_FORMAT_NUM))
> +	if (bitmap_weight_le(fmts_bitmap, MAX_FORMAT_NUM, f->index))
>  		return -EINVAL;
>  
>  	for (i = 0; i < f->index + 1; i++, index++)
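
As a rough userspace sketch of the idea in the commit message: the old test
"f->index >= bitmap_weight(...)" is the same as "weight <= f->index", and a
"weight <= num" check can give up as soon as the running count exceeds num.
weight_le() below is a hypothetical helper, not the bitmap_weight_le() added
by this series; it assumes GCC or Clang for __builtin_popcountl() and that
any bits past nbits in the last word are zero.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical sketch: true if the number of set bits in map is <= num.
 * Stops scanning once the running count is already greater than num.
 */
static bool weight_le(const unsigned long *map, size_t nbits, unsigned int num)
{
	size_t nwords = (nbits + BITS_PER_WORD - 1) / BITS_PER_WORD;
	unsigned int w = 0;

	assert(map != NULL);
	for (size_t i = 0; i < nwords; i++) {
		w += (unsigned int)__builtin_popcountl(map[i]);
		if (w > num)
			return false;	/* condition can no longer be met */
	}
	return true;
}
```

This sketch only exits early on the "can't be met" side; the remaining words
are still scanned when the count stays at or below num.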



* Re: [PATCH 04/49] iio: fix opencoded for_each_set_bit()
  2022-02-11 17:17   ` Christophe JAILLET
@ 2022-06-04 15:41     ` Jonathan Cameron
  2022-06-11 13:50       ` Jonathan Cameron
  0 siblings, 1 reply; 98+ messages in thread
From: Jonathan Cameron @ 2022-06-04 15:41 UTC (permalink / raw)
  To: Christophe JAILLET
  Cc: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Lars-Peter Clausen, Alexandru Ardelean, Nathan Chancellor,
	linux-iio

On Fri, 11 Feb 2022 18:17:37 +0100
Christophe JAILLET <christophe.jaillet@wanadoo.fr> wrote:

> Le 10/02/2022 à 23:48, Yury Norov a écrit :
> > iio_simple_dummy_trigger_h() is mostly an opencoded for_each_set_bit().
> > Using for_each_set_bit() makes the code much cleaner and more efficient.
> > 
> > Signed-off-by: Yury Norov <yury.norov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> > ---
> >   drivers/iio/dummy/iio_simple_dummy_buffer.c | 48 ++++++++-------------
> >   1 file changed, 19 insertions(+), 29 deletions(-)
> > 
> > diff --git a/drivers/iio/dummy/iio_simple_dummy_buffer.c b/drivers/iio/dummy/iio_simple_dummy_buffer.c
> > index d81c2b2dad82..3bc1b7529e2a 100644
> > --- a/drivers/iio/dummy/iio_simple_dummy_buffer.c
> > +++ b/drivers/iio/dummy/iio_simple_dummy_buffer.c
> > @@ -45,41 +45,31 @@ static irqreturn_t iio_simple_dummy_trigger_h(int irq, void *p)
> >   {
> >   	struct iio_poll_func *pf = p;
> >   	struct iio_dev *indio_dev = pf->indio_dev;
> > +	int i = 0, j;
> >   	u16 *data;
> >   
> >   	data = kmalloc(indio_dev->scan_bytes, GFP_KERNEL);
> >   	if (!data)
> >   		goto done;
> >   
> > -	if (!bitmap_empty(indio_dev->active_scan_mask, indio_dev->masklength)) {
> > -		/*
> > -		 * Three common options here:
> > -		 * hardware scans: certain combinations of channels make
> > -		 *   up a fast read.  The capture will consist of all of them.
> > -		 *   Hence we just call the grab data function and fill the
> > -		 *   buffer without processing.
> > -		 * software scans: can be considered to be random access
> > -		 *   so efficient reading is just a case of minimal bus
> > -		 *   transactions.
> > -		 * software culled hardware scans:
> > -		 *   occasionally a driver may process the nearest hardware
> > -		 *   scan to avoid storing elements that are not desired. This
> > -		 *   is the fiddliest option by far.
> > -		 * Here let's pretend we have random access. And the values are
> > -		 * in the constant table fakedata.
> > -		 */
> > -		int i, j;
> > -
> > -		for (i = 0, j = 0;
> > -		     i < bitmap_weight(indio_dev->active_scan_mask,
> > -				       indio_dev->masklength);
> > -		     i++, j++) {
> > -			j = find_next_bit(indio_dev->active_scan_mask,
> > -					  indio_dev->masklength, j);
> > -			/* random access read from the 'device' */
> > -			data[i] = fakedata[j];
> > -		}
> > -	}
> > +	/*
> > +	 * Three common options here:
> > +	 * hardware scans: certain combinations of channels make
> > +	 *   up a fast read.  The capture will consist of all of them.
> > +	 *   Hence we just call the grab data function and fill the
> > +	 *   buffer without processing.
> > +	 * software scans: can be considered to be random access
> > +	 *   so efficient reading is just a case of minimal bus
> > +	 *   transactions.
> > +	 * software culled hardware scans:
> > +	 *   occasionally a driver may process the nearest hardware
> > +	 *   scan to avoid storing elements that are not desired. This
> > +	 *   is the fiddliest option by far.
> > +	 * Here let's pretend we have random access. And the values are
> > +	 * in the constant table fakedata.
> > +	 */  
> 
> Nitpicking: you could take advantage of the tab you save to use the full 
> width of the line and save some lines of code.

Tweaked whilst applying.

Sorry this one took so long. I marked it as a patch that I'd revisit and
tidy up if no v2 was sent, but then managed to forget about it until
I came to do a clean out of patchwork today.

Anyhow, now applied to the togreg branch of iio.git - initially pushed out
as testing for 0-day to see if we missed anything.

Thanks,

Jonathan

> 
> Just my 2c.
> 
> CJ
> 
> 
> > +	for_each_set_bit(j, indio_dev->active_scan_mask, indio_dev->masklength)
> > +		data[i++] = fakedata[j];
> >   
> >   	iio_push_to_buffers_with_timestamp(indio_dev, data,
> >   					   iio_get_time_ns(indio_dev));  
> 
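
A minimal userspace sketch of the pattern this patch moves to, for readers
without the kernel headers at hand. The removed loop re-evaluated
bitmap_weight() in its condition on every iteration and called
find_next_bit() in the body; iterating over set bits directly does one pass.
next_set_bit(), each_set_bit() and gather() are hypothetical names used only
here, not the kernel's find_next_bit() or for_each_set_bit().

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: index of the first set bit >= start, or nbits. */
static size_t next_set_bit(const unsigned long *map, size_t nbits, size_t start)
{
	assert(map != NULL);
	for (size_t b = start; b < nbits; b++)
		if (map[b / BITS_PER_WORD] & (1UL << (b % BITS_PER_WORD)))
			return b;
	return nbits;
}

/* Hypothetical iterator macro: visits each set bit once, in order. */
#define each_set_bit(bit, map, nbits)				\
	for ((bit) = next_set_bit((map), (nbits), 0);		\
	     (bit) < (nbits);					\
	     (bit) = next_set_bit((map), (nbits), (bit) + 1))

/*
 * Copy table[j] for every set bit j in mask into out, packed densely.
 * Returns the number of entries written.
 */
static size_t gather(const unsigned long *mask, size_t nbits,
		     const unsigned short *table, unsigned short *out)
{
	size_t i = 0, j;

	each_set_bit(j, mask, nbits)
		out[i++] = table[j];
	return i;
}
```

An empty mask needs no special case: the loop body simply never runs, which
is why the outer bitmap_empty() check could go away as well.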



* Re: [PATCH 04/49] iio: fix opencoded for_each_set_bit()
  2022-06-04 15:41     ` Jonathan Cameron
@ 2022-06-11 13:50       ` Jonathan Cameron
  0 siblings, 0 replies; 98+ messages in thread
From: Jonathan Cameron @ 2022-06-11 13:50 UTC (permalink / raw)
  To: Christophe JAILLET
  Cc: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Lars-Peter Clausen, Alexandru Ardelean, Nathan Chancellor,
	linux-iio

On Sat, 4 Jun 2022 16:41:13 +0100
Jonathan Cameron <jic23@kernel.org> wrote:

> On Fri, 11 Feb 2022 18:17:37 +0100
> Christophe JAILLET <christophe.jaillet@wanadoo.fr> wrote:
> 
> > Le 10/02/2022 à 23:48, Yury Norov a écrit :  
> > > iio_simple_dummy_trigger_h() is mostly an opencoded for_each_set_bit().
> > > > Using for_each_set_bit() makes the code much cleaner and more efficient.
> > > 
> > > Signed-off-by: Yury Norov <yury.norov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> > > ---
> > >   drivers/iio/dummy/iio_simple_dummy_buffer.c | 48 ++++++++-------------
> > >   1 file changed, 19 insertions(+), 29 deletions(-)
> > > 
> > > diff --git a/drivers/iio/dummy/iio_simple_dummy_buffer.c b/drivers/iio/dummy/iio_simple_dummy_buffer.c
> > > index d81c2b2dad82..3bc1b7529e2a 100644
> > > --- a/drivers/iio/dummy/iio_simple_dummy_buffer.c
> > > +++ b/drivers/iio/dummy/iio_simple_dummy_buffer.c
> > > @@ -45,41 +45,31 @@ static irqreturn_t iio_simple_dummy_trigger_h(int irq, void *p)
> > >   {
> > >   	struct iio_poll_func *pf = p;
> > >   	struct iio_dev *indio_dev = pf->indio_dev;
> > > +	int i = 0, j;
> > >   	u16 *data;
> > >   
> > >   	data = kmalloc(indio_dev->scan_bytes, GFP_KERNEL);
> > >   	if (!data)
> > >   		goto done;
> > >   
> > > -	if (!bitmap_empty(indio_dev->active_scan_mask, indio_dev->masklength)) {
> > > -		/*
> > > -		 * Three common options here:
> > > -		 * hardware scans: certain combinations of channels make
> > > -		 *   up a fast read.  The capture will consist of all of them.
> > > -		 *   Hence we just call the grab data function and fill the
> > > -		 *   buffer without processing.
> > > -		 * software scans: can be considered to be random access
> > > -		 *   so efficient reading is just a case of minimal bus
> > > -		 *   transactions.
> > > -		 * software culled hardware scans:
> > > -		 *   occasionally a driver may process the nearest hardware
> > > -		 *   scan to avoid storing elements that are not desired. This
> > > -		 *   is the fiddliest option by far.
> > > -		 * Here let's pretend we have random access. And the values are
> > > -		 * in the constant table fakedata.
> > > -		 */
> > > -		int i, j;
> > > -
> > > -		for (i = 0, j = 0;
> > > -		     i < bitmap_weight(indio_dev->active_scan_mask,
> > > -				       indio_dev->masklength);
> > > -		     i++, j++) {
> > > -			j = find_next_bit(indio_dev->active_scan_mask,
> > > -					  indio_dev->masklength, j);
> > > -			/* random access read from the 'device' */
> > > -			data[i] = fakedata[j];
> > > -		}
> > > -	}
> > > +	/*
> > > +	 * Three common options here:
> > > +	 * hardware scans: certain combinations of channels make
> > > +	 *   up a fast read.  The capture will consist of all of them.
> > > +	 *   Hence we just call the grab data function and fill the
> > > +	 *   buffer without processing.
> > > +	 * software scans: can be considered to be random access
> > > +	 *   so efficient reading is just a case of minimal bus
> > > +	 *   transactions.
> > > +	 * software culled hardware scans:
> > > +	 *   occasionally a driver may process the nearest hardware
> > > +	 *   scan to avoid storing elements that are not desired. This
> > > +	 *   is the fiddliest option by far.
> > > +	 * Here let's pretend we have random access. And the values are
> > > +	 * in the constant table fakedata.
> > > +	 */    
> > 
> > Nitpicking: you could take advantage of the tab you save to use the full 
> > width of the line and save some lines of code.  
> 
> Tweaked whilst applying.
> 
> Sorry this one took so long. I marked it as a patch that I'd revisit and
> tidy up if no v2 was sent, but then managed to forget about it until
> I came to do a clean out of patchwork today.
> 
> Anyhow, now applied to the togreg branch of iio.git - initially pushed out
> as testing for 0-day to see if we missed anything.

And dropped again during a rebase, as a different version has gone upstream
through a pull request to Linus.

Whilst I have no strong opinion on that in general, I am a little grumpy
that a version was merged that was never posted to the mailing lists (that
I can find on lore.kernel.org). Sure, the changes were minor and easy to verify
as harmless, but nonetheless they should have been posted.

Jonathan

> 
> Thanks,
> 
> Jonathan
> 
> > 
> > Just my 2c.
> > 
> > CJ
> > 
> >   
> > > +	for_each_set_bit(j, indio_dev->active_scan_mask, indio_dev->masklength)
> > > +		data[i++] = fakedata[j];
> > >   
> > >   	iio_push_to_buffers_with_timestamp(indio_dev, data,
> > >   					   iio_get_time_ns(indio_dev));    
> >   
> 



end of thread

Thread overview: 98+ messages
2022-02-10 22:48 [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Yury Norov
2022-02-10 22:48 ` [PATCH 01/49] net: dsa: don't use bitmap_weight() in b53_arl_read() Yury Norov
2022-02-10 22:48 ` [PATCH 02/49] net: systemport: don't use bitmap_weight() in bcm_sysport_rule_set() Yury Norov
2022-02-10 22:48 ` [PATCH 03/49] net: mellanox: fix open-coded for_each_set_bit() Yury Norov
2022-02-11  9:01   ` David Laight
2022-02-10 22:48 ` [PATCH 04/49] iio: fix opencoded for_each_set_bit() Yury Norov
2022-02-11  8:45   ` Andy Shevchenko
2022-02-11 17:17   ` Christophe JAILLET
2022-06-04 15:41     ` Jonathan Cameron
2022-06-11 13:50       ` Jonathan Cameron
2022-02-10 22:48 ` [RFC PATCH 05/49] qed: rework qed_rdma_bmap_free() Yury Norov
2022-02-11  8:48   ` Andy Shevchenko
2022-02-10 22:48 ` [PATCH 06/49] nds32: perf: replace bitmap_weight with bitmap_empty where appropriate Yury Norov
2022-02-10 22:48 ` [PATCH 07/49] KVM: x86: " Yury Norov
2022-02-11 16:34   ` Sean Christopherson
2022-02-11 17:13   ` Christophe JAILLET
2022-02-11 17:19     ` Sean Christopherson
2022-02-11 17:47       ` Yury Norov
2022-02-10 22:48 ` [PATCH 08/49] drm: " Yury Norov
2022-02-11  2:11   ` Dmitry Baryshkov
2022-02-10 22:48 ` [PATCH 09/49] ice: replace bitmap_weight with bitmap_empty Yury Norov
2022-02-10 22:48 ` [PATCH 10/49] octeontx2-pf: replace bitmap_weight with bitmap_empty where appropriate Yury Norov
2022-02-10 22:48 ` [PATCH 11/49] qed: replace bitmap_weight with bitmap_empty in qed_roce_stop() Yury Norov
2022-02-10 22:48 ` [PATCH 12/49] perf: replace bitmap_weight with bitmap_empty where appropriate Yury Norov
2022-02-11 10:25   ` Mark Rutland
2022-02-11 17:59     ` Yury Norov
2022-02-11 17:27   ` Christophe JAILLET
2022-02-11 23:23     ` Yury Norov
2022-02-10 22:48 ` [PATCH 13/49] perf tools: " Yury Norov
2022-02-10 22:48 ` [PATCH 14/49] arch/alpha: replace cpumask_weight with cpumask_empty " Yury Norov
2022-02-10 22:48 ` [PATCH 15/49] arch/ia64: " Yury Norov
2022-02-10 22:49 ` [PATCH 16/49] arch/x86: " Yury Norov
2022-04-10 20:42   ` [tip: x86/cleanups] x86: Replace cpumask_weight() with cpumask_empty() " tip-bot2 for Yury Norov
2022-02-10 22:49 ` [PATCH 17/49] cpufreq: replace cpumask_weight with cpumask_empty " Yury Norov
2022-02-11  4:30   ` Viresh Kumar
2022-02-11  5:17     ` Yury Norov
2022-02-10 22:49 ` [PATCH 18/49] drm/i915/pmu: " Yury Norov
2022-02-10 22:49 ` [PATCH 19/49] RDMA/hfi: " Yury Norov
2022-02-11 19:10   ` Jason Gunthorpe
2022-02-10 22:49 ` [PATCH 20/49] irq: mips: " Yury Norov
2022-04-10 20:34   ` [tip: irq/core] irqchip/bmips: Replace cpumask_weight() with cpumask_empty() tip-bot2 for Yury Norov
2022-02-10 22:49 ` [PATCH 21/49] genirq/affinity: replace cpumask_weight with cpumask_empty where appropriate Yury Norov
2022-04-10 20:27   ` [tip: irq/core] genirq/affinity: Replace cpumask_weight() with cpumask_empty() " tip-bot2 for Yury Norov
     [not found]     ` <573841649622719@mail.yandex.com>
2022-04-10 21:17       ` Yury Norov
2022-02-10 22:49 ` [PATCH 22/49] sched: replace cpumask_weight with cpumask_empty " Yury Norov
2022-02-11 10:19   ` Peter Zijlstra
2022-02-11 14:19     ` Yury Norov
2022-02-17 18:56   ` [tip: sched/core] " tip-bot2 for Yury Norov
2022-02-10 22:49 ` [PATCH 23/49] clocksource: replace cpumask_weight with cpumask_empty in clocksource.c Yury Norov
2022-04-10 20:35   ` [tip: timers/core] clocksource: Replace cpumask_weight() with cpumask_empty() tip-bot2 for Yury Norov
2022-02-10 22:49 ` [PATCH 24/49] mm/vmstat: replace cpumask_weight with cpumask_empty where appropriate Yury Norov
2022-02-11 10:39   ` Mike Rapoport
2022-02-10 22:49 ` [PATCH 25/49] arch/x86: replace nodes_weight with nodes_empty " Yury Norov
2022-04-10 20:42   ` [tip: x86/cleanups] x86/mm: Replace nodes_weight() with nodes_empty() " tip-bot2 for Yury Norov
2022-02-10 22:49 ` [PATCH 26/49] bitmap: add bitmap_weight_{cmp, eq, gt, ge, lt, le} functions Yury Norov
2022-02-10 22:49 ` [PATCH 27/49] arch/x86: replace bitmap_weight with bitmap_weight_{eq,gt,ge,lt,le} where appropriate Yury Norov
2022-02-10 22:49 ` [PATCH 28/49] iio: replace bitmap_weight() with bitmap_weight_{eq,gt} " Yury Norov
2022-02-10 22:49 ` [PATCH 29/49] memstick: replace bitmap_weight with bitmap_weight_eq " Yury Norov
2022-02-17 15:39   ` Ulf Hansson
2022-02-17 16:55     ` Yury Norov
2022-02-22 15:49       ` Ulf Hansson
2022-02-10 22:49 ` [PATCH 30/49] ixgbe: replace bitmap_weight with bitmap_weight_eq Yury Norov
2022-02-10 22:49 ` [PATCH 31/49] octeontx2-pf: replace bitmap_weight with bitmap_weight_{eq,gt} Yury Norov
2022-02-10 22:49 ` [PATCH 32/49] mlx4: replace bitmap_weight with bitmap_weight_{eq,gt,ge,lt,le} Yury Norov
2022-02-10 22:49 ` [PATCH 33/49] perf: replace bitmap_weight with bitmap_weight_eq for ThunderX2 Yury Norov
2022-02-11 10:30   ` Mark Rutland
2022-02-10 22:49 ` [PATCH 34/49] media: tegra-video: replace bitmap_weight with bitmap_weight_le Yury Norov
2022-04-28  7:31   ` Hans Verkuil
2022-02-10 22:49 ` [PATCH 35/49] cpumask: add cpumask_weight_{eq,gt,ge,lt,le} Yury Norov
2022-02-10 22:49 ` [PATCH 36/49] arch/ia64: replace cpumask_weight with cpumask_weight_eq in mm/tlb.c Yury Norov
2022-02-10 22:49 ` [PATCH 37/49] arch/mips: replace cpumask_weight with cpumask_weight_{eq, ...} where appropriate Yury Norov
2022-02-10 22:49 ` [PATCH 38/49] arch/powerpc: " Yury Norov
2022-02-11  4:10   ` Michael Ellerman
2022-02-10 22:49 ` [PATCH 39/49] arch/s390: replace cpumask_weight with cpumask_weight_eq " Yury Norov
2022-02-11  6:54   ` Sven Schnelle
2022-02-11 23:40     ` Yury Norov
2022-02-10 22:49 ` [PATCH 40/49] firmware: pcsi: replace cpumask_weight with cpumask_weight_eq Yury Norov
2022-02-11  9:45   ` Sudeep Holla
2022-02-11 10:32   ` Mark Rutland
2022-02-10 22:49 ` [PATCH 41/49] RDMA/hfi1: replace cpumask_weight with cpumask_weight_{eq, ...} where appropriate Yury Norov
2022-02-11 19:11   ` Jason Gunthorpe
2022-02-10 22:49 ` [PATCH 42/49] scsi: lpfc: replace cpumask_weight with cpumask_weight_gt Yury Norov
2022-02-10 22:49 ` [PATCH 43/49] soc/qman: replace cpumask_weight with cpumask_weight_lt Yury Norov
2022-02-10 22:49 ` [PATCH 44/49] nodemask: add nodemask_weight_{eq,gt,ge,lt,le} Yury Norov
2022-02-10 22:49 ` [PATCH 45/49] ACPI: replace nodes__weight with nodes_weight_ge for numa Yury Norov
2022-02-14 19:18   ` Rafael J. Wysocki
2022-02-14 19:34     ` Yury Norov
2022-02-14 19:45       ` Rafael J. Wysocki
2022-02-14 19:55         ` Yury Norov
2022-02-10 22:49 ` [PATCH 46/49] mm/mempolicy: replace nodes_weight with nodes_weight_eq Yury Norov
2022-02-11 10:40   ` Mike Rapoport
2022-02-11 17:44   ` Christophe JAILLET
2022-02-11 19:47     ` Yury Norov
2022-02-10 22:49 ` [PATCH 47/49] nodemask: add num_node_state_eq() Yury Norov
2022-02-11 10:41   ` Mike Rapoport
2022-02-10 22:49 ` [PATCH 48/49] tools: bitmap: sync bitmap_weight Yury Norov
2022-02-10 22:49 ` [PATCH 49/49] MAINTAINERS: add cpumask and nodemask files to BITMAP_API Yury Norov
2022-02-15 23:18 ` [PATCH v4 00/49] bitmap: optimize bitmap_weight() usage Will Deacon
