All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage
@ 2022-01-23 18:38 Yury Norov
  2022-01-23 18:38 ` [PATCH 01/54] net/dsa: don't use bitmap_weight() in b53_arl_read() Yury Norov
                   ` (54 more replies)
  0 siblings, 55 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel

In many cases people use bitmap_weight()-based functions to compare
the result against a number of expression:

	if (cpumask_weight(mask) > 1)
		do_something();

This may take considerable amount of time on many-cpus machines because
cpumask_weight() will traverse every word of underlying cpumask
unconditionally.

We can significantly improve on it for many real cases if stop traversing
the mask as soon as we count cpus to any number greater than 1:

	if (cpumask_weight_gt(mask, 1))
		do_something();

The first part of series converts cpumask_weight() to cpumask_empty()
if the number to compare with is 0. Ditto for bitmap_weigth() and
nodes_weight().

In the 2nd part of the series bitmap_weight_cmp() is added together with
bitmap_weight_{eq,gt,ge,lt,le} wrappers on top of it. Corresponding
wrappers for cpumask and nodemask are added as well.

v1: https://lkml.org/lkml/2021/11/27/339
v2: https://lkml.org/lkml/2021/12/18/241
v3:
  - drop subseries for possible, present and active cpumasks. Will
    submit it separately if needed;
  - split patches per subsystems as requested by Greg and Michał;
  - trim the recipient list. Add drivers and arch maintainers to 
    corresponding patches only.

Yury Norov (54):
  net/dsa: don't use bitmap_weight() in b53_arl_read()
  net/ethernet: don't use bitmap_weight() in bcm_sysport_rule_set()
  thermal/intel: don't use bitmap_weight() in end_power_clamp()
  net: mellanox: fix open-coded for_each_set_bit()
  nds32: perf: replace bitmap_weight with bitmap_empty where appropriate
  x86/kvm: replace bitmap_weight with bitmap_empty where appropriate
  gpu: drm: replace bitmap_weight with bitmap_empty where appropriate
  net: ethernet: replace bitmap_weight with bitmap_empty for intel
  net: ethernet: replace bitmap_weight with bitmap_empty for Marvell
  net: ethernet: replace bitmap_weight with bitmap_empty for qlogic
  perf: replace bitmap_weight with bitmap_empty where appropriate
  tools/perf: replace bitmap_weight with bitmap_empty where appropriate
  arch/alpha: replace cpumask_weight with cpumask_empty where
    appropriate
  arch/ia64: replace cpumask_weight with cpumask_empty where appropriate
  arch/x86: replace cpumask_weight with cpumask_empty where appropriate
  cpufreq: replace cpumask_weight with cpumask_empty where appropriate
  gpu: drm: replace cpumask_weight with cpumask_empty where appropriate
  drivers/infiniband: replace cpumask_weight with cpumask_empty where
    appropriate
  drivers/irqchip: replace cpumask_weight with cpumask_empty where
    appropriate
  kernel/irq: replace cpumask_weight with cpumask_empty where
    appropriate
  kernel: replace cpumask_weight with cpumask_empty in padata.c
  rcu: replace cpumask_weight with cpumask_empty where appropriate
  sched: replace cpumask_weight with cpumask_empty where appropriate
  time: replace cpumask_weight with cpumask_empty in clocksource.c
  mm/vmstat: replace cpumask_weight with cpumask_empty where appropriate
  arch/x86: replace nodes_weight with nodes_empty where appropriate
  lib/bitmap: add bitmap_weight_{cmp, eq, gt, ge, lt, le} functions
  arch/x86: replace bitmap_weight with bitmap_weight_{eq,gt,ge,lt,le}
    where appropriate
  drivers/iio: replace bitmap_weight() with bitmap_weight_{eq,gt} where
    appropriate
  drivers/memstick: replace bitmap_weight with bitmap_weight_eq where
    appropriate
  net: ethernet: replace bitmap_weight with bitmap_weight_eq for intel
  net: ethernet: replace bitmap_weight with bitmap_weight_{eq,gt} for
    OcteonTX2
  net: ethernet: replace bitmap_weight with
    bitmap_weight_{eq,gt,ge,lt,le} for mellanox
  perf: replace bitmap_weight with bitmap_weight_eq for ThunderX2
  drivers/staging: replace bitmap_weight with bitmap_weight_le for
    tegra-video
  lib/cpumask: add cpumask_weight_{eq,gt,ge,lt,le}
  arch/ia64: replace cpumask_weight with cpumask_weight_eq in mm/tlb.c
  arch/mips: replace cpumask_weight with cpumask_weight_{eq, ...} where
    appropriate
  arch/powerpc: replace cpumask_weight with cpumask_weight_{eq, ...}
    where appropriate
  arch/s390: replace cpumask_weight with cpumask_weight_eq where
    appropriate
  arch/x86: replace cpumask_weight with cpumask_weight_eq where
    appropriate
  firmware: pcsi: replace cpumask_weight with cpumask_weight_eq
  drivers/hv: replace cpumask_weight with cpumask_weight_eq
  infiniband: replace cpumask_weight with cpumask_weight_{eq, ...} where
    appropriate
  scsi: replace cpumask_weight with cpumask_weight_gt
  soc: replace cpumask_weight with cpumask_weight_lt
  sched: replace cpumask_weight with cpumask_weight_eq where appropriate
  kernel/time: replace cpumask_weight with cpumask_weight_eq where
    appropriate
  lib/nodemask: add nodemask_weight_{eq,gt,ge,lt,le}
  acpi: replace nodes__weight with nodes_weight_ge for numa
  mm: replace nodes_weight with nodes_weight_eq in mempolicy
  lib/nodemask: add num_node_state_eq()
  tools/bitmap: sync bitmap_weight
  MAINTAINERS: add cpumask and nodemask files to BITMAP_API

 MAINTAINERS                                   |  4 +
 arch/alpha/kernel/process.c                   |  2 +-
 arch/ia64/kernel/setup.c                      |  2 +-
 arch/ia64/mm/tlb.c                            |  2 +-
 arch/mips/cavium-octeon/octeon-irq.c          |  4 +-
 arch/mips/kernel/crash.c                      |  2 +-
 arch/nds32/kernel/perf_event_cpu.c            |  2 +-
 arch/powerpc/kernel/smp.c                     |  2 +-
 arch/powerpc/kernel/watchdog.c                |  2 +-
 arch/powerpc/xmon/xmon.c                      |  4 +-
 arch/s390/kernel/perf_cpum_cf.c               |  2 +-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c        | 16 ++--
 arch/x86/kernel/smpboot.c                     |  4 +-
 arch/x86/kvm/hyperv.c                         |  8 +-
 arch/x86/mm/amdtopology.c                     |  2 +-
 arch/x86/mm/mmio-mod.c                        |  2 +-
 arch/x86/mm/numa_emulation.c                  |  4 +-
 arch/x86/platform/uv/uv_nmi.c                 |  2 +-
 drivers/acpi/numa/srat.c                      |  2 +-
 drivers/cpufreq/qcom-cpufreq-hw.c             |  2 +-
 drivers/cpufreq/scmi-cpufreq.c                |  2 +-
 drivers/firmware/psci/psci_checker.c          |  2 +-
 drivers/gpu/drm/i915/i915_pmu.c               |  2 +-
 drivers/gpu/drm/msm/disp/mdp5/mdp5_smp.c      |  2 +-
 drivers/hv/channel_mgmt.c                     |  4 +-
 drivers/iio/dummy/iio_simple_dummy_buffer.c   |  4 +-
 drivers/iio/industrialio-trigger.c            |  2 +-
 drivers/infiniband/hw/hfi1/affinity.c         | 13 ++-
 drivers/infiniband/hw/qib/qib_file_ops.c      |  2 +-
 drivers/infiniband/hw/qib/qib_iba7322.c       |  2 +-
 drivers/irqchip/irq-bcm6345-l1.c              |  2 +-
 drivers/memstick/core/ms_block.c              |  4 +-
 drivers/net/dsa/b53/b53_common.c              |  6 +-
 drivers/net/ethernet/broadcom/bcmsysport.c    |  6 +-
 .../net/ethernet/intel/ice/ice_virtchnl_pf.c  |  4 +-
 .../net/ethernet/intel/ixgbe/ixgbe_sriov.c    |  2 +-
 .../marvell/octeontx2/nic/otx2_ethtool.c      |  2 +-
 .../marvell/octeontx2/nic/otx2_flows.c        |  8 +-
 .../ethernet/marvell/octeontx2/nic/otx2_pf.c  |  2 +-
 drivers/net/ethernet/mellanox/mlx4/cmd.c      | 33 +++-----
 drivers/net/ethernet/mellanox/mlx4/eq.c       |  4 +-
 drivers/net/ethernet/mellanox/mlx4/fw.c       |  4 +-
 drivers/net/ethernet/mellanox/mlx4/main.c     |  2 +-
 drivers/net/ethernet/qlogic/qed/qed_rdma.c    |  4 +-
 drivers/net/ethernet/qlogic/qed/qed_roce.c    |  2 +-
 drivers/perf/arm-cci.c                        |  2 +-
 drivers/perf/arm_pmu.c                        |  4 +-
 drivers/perf/hisilicon/hisi_uncore_pmu.c      |  2 +-
 drivers/perf/thunderx2_pmu.c                  |  4 +-
 drivers/perf/xgene_pmu.c                      |  2 +-
 drivers/scsi/lpfc/lpfc_init.c                 |  2 +-
 drivers/soc/fsl/qbman/qman_test_stash.c       |  2 +-
 drivers/staging/media/tegra-video/vi.c        |  2 +-
 drivers/thermal/intel/intel_powerclamp.c      |  9 +--
 include/linux/bitmap.h                        | 80 +++++++++++++++++++
 include/linux/cpumask.h                       | 50 ++++++++++++
 include/linux/nodemask.h                      | 40 ++++++++++
 kernel/irq/affinity.c                         |  2 +-
 kernel/padata.c                               |  2 +-
 kernel/rcu/tree_nocb.h                        |  4 +-
 kernel/rcu/tree_plugin.h                      |  2 +-
 kernel/sched/core.c                           | 10 +--
 kernel/sched/topology.c                       |  4 +-
 kernel/time/clockevents.c                     |  2 +-
 kernel/time/clocksource.c                     |  2 +-
 lib/bitmap.c                                  | 21 +++++
 mm/mempolicy.c                                |  2 +-
 mm/page_alloc.c                               |  2 +-
 mm/vmstat.c                                   |  4 +-
 tools/include/linux/bitmap.h                  | 44 ++++++++++
 tools/lib/bitmap.c                            | 20 +++++
 tools/perf/builtin-c2c.c                      |  4 +-
 tools/perf/util/pmu.c                         |  2 +-
 73 files changed, 374 insertions(+), 142 deletions(-)

-- 
2.30.2


^ permalink raw reply	[flat|nested] 117+ messages in thread

* [PATCH 01/54] net/dsa: don't use bitmap_weight() in b53_arl_read()
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
@ 2022-01-23 18:38 ` Yury Norov
  2022-01-24  3:11   ` Florian Fainelli
  2022-01-23 18:38 ` [PATCH 02/54] net/ethernet: don't use bitmap_weight() in bcm_sysport_rule_set() Yury Norov
                   ` (53 subsequent siblings)
  54 siblings, 1 reply; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Florian Fainelli, Andrew Lunn, Vivien Didelot, Vladimir Oltean,
	David S. Miller, Jakub Kicinski, netdev

Don't call bitmap_weight() if the following code can get by
without it.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/net/dsa/b53/b53_common.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index 3867f3d4545f..9a10d80125d9 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1620,12 +1620,8 @@ static int b53_arl_read(struct b53_device *dev, u64 mac,
 		return 0;
 	}
 
-	if (bitmap_weight(free_bins, dev->num_arl_bins) == 0)
-		return -ENOSPC;
-
 	*idx = find_first_bit(free_bins, dev->num_arl_bins);
-
-	return -ENOENT;
+	return *idx >= dev->num_arl_bins ? -ENOSPC : -ENOENT;
 }
 
 static int b53_arl_op(struct b53_device *dev, int op, int port,
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 02/54] net/ethernet: don't use bitmap_weight() in bcm_sysport_rule_set()
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
  2022-01-23 18:38 ` [PATCH 01/54] net/dsa: don't use bitmap_weight() in b53_arl_read() Yury Norov
@ 2022-01-23 18:38 ` Yury Norov
  2022-01-24  3:11   ` Florian Fainelli
  2022-01-23 18:38 ` [PATCH 03/54] thermal/intel: don't use bitmap_weight() in end_power_clamp() Yury Norov
                   ` (52 subsequent siblings)
  54 siblings, 1 reply; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Florian Fainelli, David S . Miller ,
	Jakub Kicinski, bcm-kernel-feedback-list, netdev

Don't call bitmap_weight() if the following code can get by
without it.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/net/ethernet/broadcom/bcmsysport.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c b/drivers/net/ethernet/broadcom/bcmsysport.c
index 60dde29974bf..5284a5c961db 100644
--- a/drivers/net/ethernet/broadcom/bcmsysport.c
+++ b/drivers/net/ethernet/broadcom/bcmsysport.c
@@ -2180,13 +2180,9 @@ static int bcm_sysport_rule_set(struct bcm_sysport_priv *priv,
 	if (nfc->fs.ring_cookie != RX_CLS_FLOW_WAKE)
 		return -EOPNOTSUPP;
 
-	/* All filters are already in use, we cannot match more rules */
-	if (bitmap_weight(priv->filters, RXCHK_BRCM_TAG_MAX) ==
-	    RXCHK_BRCM_TAG_MAX)
-		return -ENOSPC;
-
 	index = find_first_zero_bit(priv->filters, RXCHK_BRCM_TAG_MAX);
 	if (index >= RXCHK_BRCM_TAG_MAX)
+		/* All filters are already in use, we cannot match more rules */
 		return -ENOSPC;
 
 	/* Location is the classification ID, and index is the position
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 03/54] thermal/intel: don't use bitmap_weight() in end_power_clamp()
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
  2022-01-23 18:38 ` [PATCH 01/54] net/dsa: don't use bitmap_weight() in b53_arl_read() Yury Norov
  2022-01-23 18:38 ` [PATCH 02/54] net/ethernet: don't use bitmap_weight() in bcm_sysport_rule_set() Yury Norov
@ 2022-01-23 18:38 ` Yury Norov
  2022-02-04 18:29   ` Rafael J. Wysocki
  2022-01-23 18:38 ` [PATCH 04/54] net: mellanox: fix open-coded for_each_set_bit() Yury Norov
                   ` (51 subsequent siblings)
  54 siblings, 1 reply; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Rafael J. Wysocki, Daniel Lezcano, Amit Kucheria, Zhang Rui,
	Sebastian Andrzej Siewior, Christophe JAILLET, Rikard Falkeborn,
	linux-pm

Don't call bitmap_weight() if the following code can get by
without it.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/thermal/intel/intel_powerclamp.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/thermal/intel/intel_powerclamp.c b/drivers/thermal/intel/intel_powerclamp.c
index 14256421d98c..c841ab37e7c6 100644
--- a/drivers/thermal/intel/intel_powerclamp.c
+++ b/drivers/thermal/intel/intel_powerclamp.c
@@ -556,12 +556,9 @@ static void end_power_clamp(void)
 	 * stop faster.
 	 */
 	clamping = false;
-	if (bitmap_weight(cpu_clamping_mask, num_possible_cpus())) {
-		for_each_set_bit(i, cpu_clamping_mask, num_possible_cpus()) {
-			pr_debug("clamping worker for cpu %d alive, destroy\n",
-				 i);
-			stop_power_clamp_worker(i);
-		}
+	for_each_set_bit(i, cpu_clamping_mask, num_possible_cpus()) {
+		pr_debug("clamping worker for cpu %d alive, destroy\n", i);
+		stop_power_clamp_worker(i);
 	}
 }
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 04/54] net: mellanox: fix open-coded for_each_set_bit()
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (2 preceding siblings ...)
  2022-01-23 18:38 ` [PATCH 03/54] thermal/intel: don't use bitmap_weight() in end_power_clamp() Yury Norov
@ 2022-01-23 18:38 ` Yury Norov
  2022-01-26  9:01   ` Tariq Toukan
  2022-01-23 18:38 ` [PATCH 05/54] nds32: perf: replace bitmap_weight with bitmap_empty where appropriate Yury Norov
                   ` (50 subsequent siblings)
  54 siblings, 1 reply; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Tariq Toukan, David S. Miller, Jakub Kicinski, netdev,
	linux-rdma

Mellanox driver has an open-coded for_each_set_bit(). Fix it.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/net/ethernet/mellanox/mlx4/cmd.c | 23 ++++++-----------------
 1 file changed, 6 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/cmd.c b/drivers/net/ethernet/mellanox/mlx4/cmd.c
index e10b7b04b894..c56d2194cbfc 100644
--- a/drivers/net/ethernet/mellanox/mlx4/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx4/cmd.c
@@ -1994,21 +1994,16 @@ static void mlx4_allocate_port_vpps(struct mlx4_dev *dev, int port)
 
 static int mlx4_master_activate_admin_state(struct mlx4_priv *priv, int slave)
 {
-	int port, err;
+	int p, port, err;
 	struct mlx4_vport_state *vp_admin;
 	struct mlx4_vport_oper_state *vp_oper;
 	struct mlx4_slave_state *slave_state =
 		&priv->mfunc.master.slave_state[slave];
 	struct mlx4_active_ports actv_ports = mlx4_get_active_ports(
 			&priv->dev, slave);
-	int min_port = find_first_bit(actv_ports.ports,
-				      priv->dev.caps.num_ports) + 1;
-	int max_port = min_port - 1 +
-		bitmap_weight(actv_ports.ports, priv->dev.caps.num_ports);
 
-	for (port = min_port; port <= max_port; port++) {
-		if (!test_bit(port - 1, actv_ports.ports))
-			continue;
+	for_each_set_bit(p, actv_ports.ports, priv->dev.caps.num_ports) {
+		port = p + 1;
 		priv->mfunc.master.vf_oper[slave].smi_enabled[port] =
 			priv->mfunc.master.vf_admin[slave].enable_smi[port];
 		vp_oper = &priv->mfunc.master.vf_oper[slave].vport[port];
@@ -2063,19 +2058,13 @@ static int mlx4_master_activate_admin_state(struct mlx4_priv *priv, int slave)
 
 static void mlx4_master_deactivate_admin_state(struct mlx4_priv *priv, int slave)
 {
-	int port;
+	int p, port;
 	struct mlx4_vport_oper_state *vp_oper;
 	struct mlx4_active_ports actv_ports = mlx4_get_active_ports(
 			&priv->dev, slave);
-	int min_port = find_first_bit(actv_ports.ports,
-				      priv->dev.caps.num_ports) + 1;
-	int max_port = min_port - 1 +
-		bitmap_weight(actv_ports.ports, priv->dev.caps.num_ports);
 
-
-	for (port = min_port; port <= max_port; port++) {
-		if (!test_bit(port - 1, actv_ports.ports))
-			continue;
+	for_each_set_bit(p, actv_ports.ports, priv->dev.caps.num_ports) {
+		port = p + 1;
 		priv->mfunc.master.vf_oper[slave].smi_enabled[port] =
 			MLX4_VF_SMI_DISABLED;
 		vp_oper = &priv->mfunc.master.vf_oper[slave].vport[port];
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 05/54] nds32: perf: replace bitmap_weight with bitmap_empty where appropriate
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (3 preceding siblings ...)
  2022-01-23 18:38 ` [PATCH 04/54] net: mellanox: fix open-coded for_each_set_bit() Yury Norov
@ 2022-01-23 18:38 ` Yury Norov
  2022-01-23 18:38 ` [PATCH 06/54] x86/kvm: " Yury Norov
                   ` (49 subsequent siblings)
  54 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Nick Hu,
	Greentime Hu, Vincent Chen, linux-perf-users

nds32_pmu_enable calls bitmap_weight() to check if any bit of a given
bitmap is set. It's better to use bitmap_empty() in that case because
bitmap_empty() stops traversing the bitmap as soon as it finds first
set bit, while bitmap_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 arch/nds32/kernel/perf_event_cpu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/nds32/kernel/perf_event_cpu.c b/arch/nds32/kernel/perf_event_cpu.c
index a78a879e7ef1..ea44e9ecb5c7 100644
--- a/arch/nds32/kernel/perf_event_cpu.c
+++ b/arch/nds32/kernel/perf_event_cpu.c
@@ -695,7 +695,7 @@ static void nds32_pmu_enable(struct pmu *pmu)
 {
 	struct nds32_pmu *nds32_pmu = to_nds32_pmu(pmu);
 	struct pmu_hw_events *hw_events = nds32_pmu->get_hw_events();
-	int enabled = bitmap_weight(hw_events->used_mask,
+	bool enabled = !bitmap_empty(hw_events->used_mask,
 				    nds32_pmu->num_events);
 
 	if (enabled)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 06/54] x86/kvm: replace bitmap_weight with bitmap_empty where appropriate
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (4 preceding siblings ...)
  2022-01-23 18:38 ` [PATCH 05/54] nds32: perf: replace bitmap_weight with bitmap_empty where appropriate Yury Norov
@ 2022-01-23 18:38 ` Yury Norov
  2022-01-24  9:11   ` Vitaly Kuznetsov
  2022-01-23 18:38 ` [PATCH 07/54] gpu: drm: " Yury Norov
                   ` (48 subsequent siblings)
  54 siblings, 1 reply; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86, kvm

In some places kvm/hyperv.c code calls bitmap_weight() to check if any bit
of a given bitmap is set. It's better to use bitmap_empty() in that case
because bitmap_empty() stops traversing the bitmap as soon as it finds
first set bit, while bitmap_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 arch/x86/kvm/hyperv.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 6e38a7d22e97..2c3400dea4b3 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -90,7 +90,7 @@ static void synic_update_vector(struct kvm_vcpu_hv_synic *synic,
 {
 	struct kvm_vcpu *vcpu = hv_synic_to_vcpu(synic);
 	struct kvm_hv *hv = to_kvm_hv(vcpu->kvm);
-	int auto_eoi_old, auto_eoi_new;
+	bool auto_eoi_old, auto_eoi_new;
 
 	if (vector < HV_SYNIC_FIRST_VALID_VECTOR)
 		return;
@@ -100,16 +100,16 @@ static void synic_update_vector(struct kvm_vcpu_hv_synic *synic,
 	else
 		__clear_bit(vector, synic->vec_bitmap);
 
-	auto_eoi_old = bitmap_weight(synic->auto_eoi_bitmap, 256);
+	auto_eoi_old = bitmap_empty(synic->auto_eoi_bitmap, 256);
 
 	if (synic_has_vector_auto_eoi(synic, vector))
 		__set_bit(vector, synic->auto_eoi_bitmap);
 	else
 		__clear_bit(vector, synic->auto_eoi_bitmap);
 
-	auto_eoi_new = bitmap_weight(synic->auto_eoi_bitmap, 256);
+	auto_eoi_new = bitmap_empty(synic->auto_eoi_bitmap, 256);
 
-	if (!!auto_eoi_old == !!auto_eoi_new)
+	if (auto_eoi_old == auto_eoi_new)
 		return;
 
 	down_write(&vcpu->kvm->arch.apicv_update_lock);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 07/54] gpu: drm: replace bitmap_weight with bitmap_empty where appropriate
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (5 preceding siblings ...)
  2022-01-23 18:38 ` [PATCH 06/54] x86/kvm: " Yury Norov
@ 2022-01-23 18:38 ` Yury Norov
  2022-01-23 18:38   ` [Intel-wired-lan] " Yury Norov
                   ` (47 subsequent siblings)
  54 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Rob Clark, Sean Paul, Abhinav Kumar, David Airlie, Daniel Vetter,
	linux-arm-msm, dri-devel, freedreno

smp_request_block() in drivers/gpu/drm/msm/disp/mdp5/mdp5_smp.c calls
bitmap_weight() to check if any bit of a given bitmap is set. It's
better to use bitmap_empty() in that case because bitmap_empty() stops
traversing the bitmap as soon as it finds first set bit, while
bitmap_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/gpu/drm/msm/disp/mdp5/mdp5_smp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_smp.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_smp.c
index d7fa2c49e741..56a3063545ec 100644
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_smp.c
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_smp.c
@@ -68,7 +68,7 @@ static int smp_request_block(struct mdp5_smp *smp,
 	uint8_t reserved;
 
 	/* we shouldn't be requesting blocks for an in-use client: */
-	WARN_ON(bitmap_weight(cs, cnt) > 0);
+	WARN_ON(!bitmap_empty(cs, cnt));
 
 	reserved = smp->reserved[cid];
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 08/54] net: ethernet: replace bitmap_weight with bitmap_empty for intel
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
@ 2022-01-23 18:38   ` Yury Norov
  2022-01-23 18:38 ` [PATCH 02/54] net/ethernet: don't use bitmap_weight() in bcm_sysport_rule_set() Yury Norov
                     ` (53 subsequent siblings)
  54 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Jesse Brandeburg, Tony Nguyen, David S. Miller, Jakub Kicinski,
	intel-wired-lan, netdev

The ice_vf_has_no_qs_ena() calls bitmap_weight() to check if any bit
of a given bitmap is set. It's better to use bitmap_empty() in that
case because bitmap_empty() stops traversing the bitmap as soon as it
finds first set bit, while bitmap_weight() counts all bits
unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c b/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c
index 39b80124d282..9dd52aab68cc 100644
--- a/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c
@@ -267,8 +267,8 @@ ice_set_pfe_link(struct ice_vf *vf, struct virtchnl_pf_event *pfe,
  */
 static bool ice_vf_has_no_qs_ena(struct ice_vf *vf)
 {
-	return (!bitmap_weight(vf->rxq_ena, ICE_MAX_RSS_QS_PER_VF) &&
-		!bitmap_weight(vf->txq_ena, ICE_MAX_RSS_QS_PER_VF));
+	return (bitmap_empty(vf->rxq_ena, ICE_MAX_RSS_QS_PER_VF) &&
+		bitmap_empty(vf->txq_ena, ICE_MAX_RSS_QS_PER_VF));
 }
 
 /**
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [Intel-wired-lan] [PATCH 08/54] net: ethernet: replace bitmap_weight with bitmap_empty for intel
@ 2022-01-23 18:38   ` Yury Norov
  0 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: intel-wired-lan

The ice_vf_has_no_qs_ena() calls bitmap_weight() to check if any bit
of a given bitmap is set. It's better to use bitmap_empty() in that
case because bitmap_empty() stops traversing the bitmap as soon as it
finds first set bit, while bitmap_weight() counts all bits
unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c b/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c
index 39b80124d282..9dd52aab68cc 100644
--- a/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c
@@ -267,8 +267,8 @@ ice_set_pfe_link(struct ice_vf *vf, struct virtchnl_pf_event *pfe,
  */
 static bool ice_vf_has_no_qs_ena(struct ice_vf *vf)
 {
-	return (!bitmap_weight(vf->rxq_ena, ICE_MAX_RSS_QS_PER_VF) &&
-		!bitmap_weight(vf->txq_ena, ICE_MAX_RSS_QS_PER_VF));
+	return (bitmap_empty(vf->rxq_ena, ICE_MAX_RSS_QS_PER_VF) &&
+		bitmap_empty(vf->txq_ena, ICE_MAX_RSS_QS_PER_VF));
 }
 
 /**
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 09/54] net: ethernet: replace bitmap_weight with bitmap_empty for Marvell
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (7 preceding siblings ...)
  2022-01-23 18:38   ` [Intel-wired-lan] " Yury Norov
@ 2022-01-23 18:38 ` Yury Norov
  2022-01-23 18:38 ` [PATCH 10/54] net: ethernet: replace bitmap_weight with bitmap_empty for qlogic Yury Norov
                   ` (45 subsequent siblings)
  54 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Sunil Goutham, Geetha sowjanya, Subbaraya Sundeep, hariprasad,
	David S. Miller, Jakub Kicinski, netdev

In some places, octeontx2 code calls bitmap_weight() to check if any bit of
a given bitmap is set. It's better to use bitmap_empty() in that case
because bitmap_empty() stops traversing the bitmap as soon as it finds
first set bit, while bitmap_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c | 4 ++--
 drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c    | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c
index 77a13fb555fb..80b2d64b4136 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c
@@ -353,7 +353,7 @@ int otx2_add_macfilter(struct net_device *netdev, const u8 *mac)
 {
 	struct otx2_nic *pf = netdev_priv(netdev);
 
-	if (bitmap_weight(&pf->flow_cfg->dmacflt_bmap,
+	if (!bitmap_empty(&pf->flow_cfg->dmacflt_bmap,
 			  pf->flow_cfg->dmacflt_max_flows))
 		netdev_warn(netdev,
 			    "Add %pM to CGX/RPM DMAC filters list as well\n",
@@ -436,7 +436,7 @@ int otx2_get_maxflows(struct otx2_flow_config *flow_cfg)
 		return 0;
 
 	if (flow_cfg->nr_flows == flow_cfg->max_flows ||
-	    bitmap_weight(&flow_cfg->dmacflt_bmap,
+	    !bitmap_empty(&flow_cfg->dmacflt_bmap,
 			  flow_cfg->dmacflt_max_flows))
 		return flow_cfg->max_flows + flow_cfg->dmacflt_max_flows;
 	else
diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
index 6080ebd9bd94..3d369ccc7ab9 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
@@ -1115,7 +1115,7 @@ static int otx2_cgx_config_loopback(struct otx2_nic *pf, bool enable)
 	struct msg_req *msg;
 	int err;
 
-	if (enable && bitmap_weight(&pf->flow_cfg->dmacflt_bmap,
+	if (enable && !bitmap_empty(&pf->flow_cfg->dmacflt_bmap,
 				    pf->flow_cfg->dmacflt_max_flows))
 		netdev_warn(pf->netdev,
 			    "CGX/RPM internal loopback might not work as DMAC filters are active\n");
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 10/54] net: ethernet: replace bitmap_weight with bitmap_empty for qlogic
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (8 preceding siblings ...)
  2022-01-23 18:38 ` [PATCH 09/54] net: ethernet: replace bitmap_weight with bitmap_empty for Marvell Yury Norov
@ 2022-01-23 18:38 ` Yury Norov
  2022-01-24 12:28   ` Andy Shevchenko
  2022-01-23 18:38   ` Yury Norov
                   ` (44 subsequent siblings)
  54 siblings, 1 reply; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Ariel Elior, Manish Chopra, David S. Miller, Jakub Kicinski,
	netdev

qlogic/qed code calls bitmap_weight() to check if any bit of a given
bitmap is set. It's better to use bitmap_empty() in that case because
bitmap_empty() stops traversing the bitmap as soon as it finds first
set bit, while bitmap_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/net/ethernet/qlogic/qed/qed_rdma.c | 4 ++--
 drivers/net/ethernet/qlogic/qed/qed_roce.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_rdma.c b/drivers/net/ethernet/qlogic/qed/qed_rdma.c
index 23b668de4640..b6e2e17bac04 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_rdma.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_rdma.c
@@ -336,7 +336,7 @@ void qed_rdma_bmap_free(struct qed_hwfn *p_hwfn,
 
 	/* print aligned non-zero lines, if any */
 	for (item = 0, line = 0; line < last_line; line++, item += 8)
-		if (bitmap_weight((unsigned long *)&pmap[item], 64 * 8))
+		if (!bitmap_empty((unsigned long *)&pmap[item], 64 * 8))
 			DP_NOTICE(p_hwfn,
 				  "line 0x%04x: 0x%016llx 0x%016llx 0x%016llx 0x%016llx 0x%016llx 0x%016llx 0x%016llx 0x%016llx\n",
 				  line,
@@ -350,7 +350,7 @@ void qed_rdma_bmap_free(struct qed_hwfn *p_hwfn,
 
 	/* print last unaligned non-zero line, if any */
 	if ((bmap->max_count % (64 * 8)) &&
-	    (bitmap_weight((unsigned long *)&pmap[item],
+	    (!bitmap_empty((unsigned long *)&pmap[item],
 			   bmap->max_count - item * 64))) {
 		offset = sprintf(str_last_line, "line 0x%04x: ", line);
 		for (; item < last_item; item++)
diff --git a/drivers/net/ethernet/qlogic/qed/qed_roce.c b/drivers/net/ethernet/qlogic/qed/qed_roce.c
index 071b4aeaddf2..134ecfca96a3 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_roce.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_roce.c
@@ -76,7 +76,7 @@ void qed_roce_stop(struct qed_hwfn *p_hwfn)
 	 * We delay for a short while if an async destroy QP is still expected.
 	 * Beyond the added delay we clear the bitmap anyway.
 	 */
-	while (bitmap_weight(rcid_map->bitmap, rcid_map->max_count)) {
+	while (!bitmap_empty(rcid_map->bitmap, rcid_map->max_count)) {
 		/* If the HW device is during recovery, all resources are
 		 * immediately reset without receiving a per-cid indication
 		 * from HW. In this case we don't expect the cid bitmap to be
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 11/54] perf: replace bitmap_weight with bitmap_empty where appropriate
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
@ 2022-01-23 18:38   ` Yury Norov
  2022-01-23 18:38 ` [PATCH 02/54] net/ethernet: don't use bitmap_weight() in bcm_sysport_rule_set() Yury Norov
                     ` (53 subsequent siblings)
  54 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Will Deacon, Mark Rutland, Shaokun Zhang, Qi Liu, Khuong Dinh,
	linux-arm-kernel

In some places, drivers/perf code calls bitmap_weight() to check if any
bit of a given bitmap is set. It's better to use bitmap_empty() in that
case because bitmap_empty() stops traversing the bitmap as soon as it
finds first set bit, while bitmap_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/perf/arm-cci.c                   | 2 +-
 drivers/perf/arm_pmu.c                   | 4 ++--
 drivers/perf/hisilicon/hisi_uncore_pmu.c | 2 +-
 drivers/perf/xgene_pmu.c                 | 2 +-
 4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/perf/arm-cci.c b/drivers/perf/arm-cci.c
index 54aca3a62814..96e09fa40909 100644
--- a/drivers/perf/arm-cci.c
+++ b/drivers/perf/arm-cci.c
@@ -1096,7 +1096,7 @@ static void cci_pmu_enable(struct pmu *pmu)
 {
 	struct cci_pmu *cci_pmu = to_cci_pmu(pmu);
 	struct cci_pmu_hw_events *hw_events = &cci_pmu->hw_events;
-	int enabled = bitmap_weight(hw_events->used_mask, cci_pmu->num_cntrs);
+	bool enabled = !bitmap_empty(hw_events->used_mask, cci_pmu->num_cntrs);
 	unsigned long flags;
 
 	if (!enabled)
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 295cc7952d0e..a31b302b0ade 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -524,7 +524,7 @@ static void armpmu_enable(struct pmu *pmu)
 {
 	struct arm_pmu *armpmu = to_arm_pmu(pmu);
 	struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events);
-	int enabled = bitmap_weight(hw_events->used_mask, armpmu->num_events);
+	bool enabled = !bitmap_empty(hw_events->used_mask, armpmu->num_events);
 
 	/* For task-bound events we may be called on other CPUs */
 	if (!cpumask_test_cpu(smp_processor_id(), &armpmu->supported_cpus))
@@ -785,7 +785,7 @@ static int cpu_pm_pmu_notify(struct notifier_block *b, unsigned long cmd,
 {
 	struct arm_pmu *armpmu = container_of(b, struct arm_pmu, cpu_pm_nb);
 	struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events);
-	int enabled = bitmap_weight(hw_events->used_mask, armpmu->num_events);
+	bool enabled = !bitmap_empty(hw_events->used_mask, armpmu->num_events);
 
 	if (!cpumask_test_cpu(smp_processor_id(), &armpmu->supported_cpus))
 		return NOTIFY_DONE;
diff --git a/drivers/perf/hisilicon/hisi_uncore_pmu.c b/drivers/perf/hisilicon/hisi_uncore_pmu.c
index a738aeab5c04..358e4e284a62 100644
--- a/drivers/perf/hisilicon/hisi_uncore_pmu.c
+++ b/drivers/perf/hisilicon/hisi_uncore_pmu.c
@@ -393,7 +393,7 @@ EXPORT_SYMBOL_GPL(hisi_uncore_pmu_read);
 void hisi_uncore_pmu_enable(struct pmu *pmu)
 {
 	struct hisi_pmu *hisi_pmu = to_hisi_pmu(pmu);
-	int enabled = bitmap_weight(hisi_pmu->pmu_events.used_mask,
+	bool enabled = !bitmap_empty(hisi_pmu->pmu_events.used_mask,
 				    hisi_pmu->num_counters);
 
 	if (!enabled)
diff --git a/drivers/perf/xgene_pmu.c b/drivers/perf/xgene_pmu.c
index 2b6d476bd213..88bd100a9633 100644
--- a/drivers/perf/xgene_pmu.c
+++ b/drivers/perf/xgene_pmu.c
@@ -867,7 +867,7 @@ static void xgene_perf_pmu_enable(struct pmu *pmu)
 {
 	struct xgene_pmu_dev *pmu_dev = to_pmu_dev(pmu);
 	struct xgene_pmu *xgene_pmu = pmu_dev->parent;
-	int enabled = bitmap_weight(pmu_dev->cntr_assign_mask,
+	bool enabled = !bitmap_empty(pmu_dev->cntr_assign_mask,
 			pmu_dev->max_counters);
 
 	if (!enabled)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 11/54] perf: replace bitmap_weight with bitmap_empty where appropriate
@ 2022-01-23 18:38   ` Yury Norov
  0 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Will Deacon, Mark Rutland, Shaokun Zhang, Qi Liu, Khuong Dinh,
	linux-arm-kernel

In some places, drivers/perf code calls bitmap_weight() to check if any
bit of a given bitmap is set. It's better to use bitmap_empty() in that
case because bitmap_empty() stops traversing the bitmap as soon as it
finds first set bit, while bitmap_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/perf/arm-cci.c                   | 2 +-
 drivers/perf/arm_pmu.c                   | 4 ++--
 drivers/perf/hisilicon/hisi_uncore_pmu.c | 2 +-
 drivers/perf/xgene_pmu.c                 | 2 +-
 4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/perf/arm-cci.c b/drivers/perf/arm-cci.c
index 54aca3a62814..96e09fa40909 100644
--- a/drivers/perf/arm-cci.c
+++ b/drivers/perf/arm-cci.c
@@ -1096,7 +1096,7 @@ static void cci_pmu_enable(struct pmu *pmu)
 {
 	struct cci_pmu *cci_pmu = to_cci_pmu(pmu);
 	struct cci_pmu_hw_events *hw_events = &cci_pmu->hw_events;
-	int enabled = bitmap_weight(hw_events->used_mask, cci_pmu->num_cntrs);
+	bool enabled = !bitmap_empty(hw_events->used_mask, cci_pmu->num_cntrs);
 	unsigned long flags;
 
 	if (!enabled)
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 295cc7952d0e..a31b302b0ade 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -524,7 +524,7 @@ static void armpmu_enable(struct pmu *pmu)
 {
 	struct arm_pmu *armpmu = to_arm_pmu(pmu);
 	struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events);
-	int enabled = bitmap_weight(hw_events->used_mask, armpmu->num_events);
+	bool enabled = !bitmap_empty(hw_events->used_mask, armpmu->num_events);
 
 	/* For task-bound events we may be called on other CPUs */
 	if (!cpumask_test_cpu(smp_processor_id(), &armpmu->supported_cpus))
@@ -785,7 +785,7 @@ static int cpu_pm_pmu_notify(struct notifier_block *b, unsigned long cmd,
 {
 	struct arm_pmu *armpmu = container_of(b, struct arm_pmu, cpu_pm_nb);
 	struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events);
-	int enabled = bitmap_weight(hw_events->used_mask, armpmu->num_events);
+	bool enabled = !bitmap_empty(hw_events->used_mask, armpmu->num_events);
 
 	if (!cpumask_test_cpu(smp_processor_id(), &armpmu->supported_cpus))
 		return NOTIFY_DONE;
diff --git a/drivers/perf/hisilicon/hisi_uncore_pmu.c b/drivers/perf/hisilicon/hisi_uncore_pmu.c
index a738aeab5c04..358e4e284a62 100644
--- a/drivers/perf/hisilicon/hisi_uncore_pmu.c
+++ b/drivers/perf/hisilicon/hisi_uncore_pmu.c
@@ -393,7 +393,7 @@ EXPORT_SYMBOL_GPL(hisi_uncore_pmu_read);
 void hisi_uncore_pmu_enable(struct pmu *pmu)
 {
 	struct hisi_pmu *hisi_pmu = to_hisi_pmu(pmu);
-	int enabled = bitmap_weight(hisi_pmu->pmu_events.used_mask,
+	bool enabled = !bitmap_empty(hisi_pmu->pmu_events.used_mask,
 				    hisi_pmu->num_counters);
 
 	if (!enabled)
diff --git a/drivers/perf/xgene_pmu.c b/drivers/perf/xgene_pmu.c
index 2b6d476bd213..88bd100a9633 100644
--- a/drivers/perf/xgene_pmu.c
+++ b/drivers/perf/xgene_pmu.c
@@ -867,7 +867,7 @@ static void xgene_perf_pmu_enable(struct pmu *pmu)
 {
 	struct xgene_pmu_dev *pmu_dev = to_pmu_dev(pmu);
 	struct xgene_pmu *xgene_pmu = pmu_dev->parent;
-	int enabled = bitmap_weight(pmu_dev->cntr_assign_mask,
+	bool enabled = !bitmap_empty(pmu_dev->cntr_assign_mask,
 			pmu_dev->max_counters);
 
 	if (!enabled)
-- 
2.30.2


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 12/54] tools/perf: replace bitmap_weight with bitmap_empty where appropriate
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (10 preceding siblings ...)
  2022-01-23 18:38   ` Yury Norov
@ 2022-01-23 18:38 ` Yury Norov
  2022-02-15 14:34   ` Arnaldo Carvalho de Melo
  2022-01-23 18:38   ` Yury Norov
                   ` (42 subsequent siblings)
  54 siblings, 1 reply; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, linux-perf-users

Some code in builtin-c2c.c calls bitmap_weight() to check if any bit of
a given bitmap is set. It's better to use bitmap_empty() in that case
because bitmap_empty() stops traversing the bitmap as soon as it finds
first set bit, while bitmap_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 tools/perf/builtin-c2c.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 77dd4afacca4..14f787c67140 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -1080,7 +1080,7 @@ node_entry(struct perf_hpp_fmt *fmt __maybe_unused, struct perf_hpp *hpp,
 		bitmap_zero(set, c2c.cpus_cnt);
 		bitmap_and(set, c2c_he->cpuset, c2c.nodes[node], c2c.cpus_cnt);
 
-		if (!bitmap_weight(set, c2c.cpus_cnt)) {
+		if (bitmap_empty(set, c2c.cpus_cnt)) {
 			if (c2c.node_info == 1) {
 				ret = scnprintf(hpp->buf, hpp->size, "%21s", " ");
 				advance_hpp(hpp, ret);
@@ -1944,7 +1944,7 @@ static int set_nodestr(struct c2c_hist_entry *c2c_he)
 	if (c2c_he->nodestr)
 		return 0;
 
-	if (bitmap_weight(c2c_he->nodeset, c2c.nodes_cnt)) {
+	if (!bitmap_empty(c2c_he->nodeset, c2c.nodes_cnt)) {
 		len = bitmap_scnprintf(c2c_he->nodeset, c2c.nodes_cnt,
 				      buf, sizeof(buf));
 	} else {
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 13/54] arch/alpha: replace cpumask_weight with cpumask_empty where appropriate
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
@ 2022-01-23 18:38   ` Yury Norov
  2022-01-23 18:38 ` [PATCH 02/54] net/ethernet: don't use bitmap_weight() in bcm_sysport_rule_set() Yury Norov
                     ` (53 subsequent siblings)
  54 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Richard Henderson, Ivan Kokshaysky, Matt Turner,
	Geert Uytterhoeven, Davidlohr Bueso, Russell King (Oracle),
	Kees Cook, Zheng Yongjun, Jens Axboe, linux-alpha

common_shutdown_1() calls cpumask_weight() to check if any bit of a
given cpumask is set. We can do it more efficiently with cpumask_empty()
because cpumask_empty() stops traversing the cpumask as soon as it finds
first set bit, while cpumask_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 arch/alpha/kernel/process.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/alpha/kernel/process.c b/arch/alpha/kernel/process.c
index 5f8527081da9..0d4bc60828bf 100644
--- a/arch/alpha/kernel/process.c
+++ b/arch/alpha/kernel/process.c
@@ -125,7 +125,7 @@ common_shutdown_1(void *generic_ptr)
 	/* Wait for the secondaries to halt. */
 	set_cpu_present(boot_cpuid, false);
 	set_cpu_possible(boot_cpuid, false);
-	while (cpumask_weight(cpu_present_mask))
+	while (!cpumask_empty(cpu_present_mask))
 		barrier();
 #endif
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 13/54] arch/alpha: replace cpumask_weight with cpumask_empty where appropriate
@ 2022-01-23 18:38   ` Yury Norov
  0 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Richard Henderson, Ivan Kokshaysky, Matt Turner

common_shutdown_1() calls cpumask_weight() to check if any bit of a
given cpumask is set. We can do it more efficiently with cpumask_empty()
because cpumask_empty() stops traversing the cpumask as soon as it finds
first set bit, while cpumask_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 arch/alpha/kernel/process.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/alpha/kernel/process.c b/arch/alpha/kernel/process.c
index 5f8527081da9..0d4bc60828bf 100644
--- a/arch/alpha/kernel/process.c
+++ b/arch/alpha/kernel/process.c
@@ -125,7 +125,7 @@ common_shutdown_1(void *generic_ptr)
 	/* Wait for the secondaries to halt. */
 	set_cpu_present(boot_cpuid, false);
 	set_cpu_possible(boot_cpuid, false);
-	while (cpumask_weight(cpu_present_mask))
+	while (!cpumask_empty(cpu_present_mask))
 		barrier();
 #endif
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 14/54] arch/ia64: replace cpumask_weight with cpumask_empty where appropriate
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
@ 2022-01-23 18:38   ` Yury Norov
  2022-01-23 18:38 ` [PATCH 02/54] net/ethernet: don't use bitmap_weight() in bcm_sysport_rule_set() Yury Norov
                     ` (53 subsequent siblings)
  54 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Geert Uytterhoeven, Yang Guang, linux-ia64

setup_arch() calls cpumask_weight() to check if any bit of a given cpumask
is set. We can do it more efficiently with cpumask_empty() because
cpumask_empty() stops traversing the cpumask as soon as it finds first set
bit, while cpumask_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 arch/ia64/kernel/setup.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/ia64/kernel/setup.c b/arch/ia64/kernel/setup.c
index 5010348fa21b..fd6301eafa9d 100644
--- a/arch/ia64/kernel/setup.c
+++ b/arch/ia64/kernel/setup.c
@@ -572,7 +572,7 @@ setup_arch (char **cmdline_p)
 #ifdef CONFIG_ACPI_HOTPLUG_CPU
 	prefill_possible_map();
 #endif
-	per_cpu_scan_finalize((cpumask_weight(&early_cpu_possible_map) == 0 ?
+	per_cpu_scan_finalize((cpumask_empty(&early_cpu_possible_map) ?
 		32 : cpumask_weight(&early_cpu_possible_map)),
 		additional_cpus > 0 ? additional_cpus : 0);
 #endif /* CONFIG_ACPI_NUMA */
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 14/54] arch/ia64: replace cpumask_weight with cpumask_empty where appropriate
@ 2022-01-23 18:38   ` Yury Norov
  0 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Geert Uytterhoeven, Yang Guang, linux-ia64

setup_arch() calls cpumask_weight() to check if any bit of a given cpumask
is set. We can do it more efficiently with cpumask_empty() because
cpumask_empty() stops traversing the cpumask as soon as it finds first set
bit, while cpumask_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 arch/ia64/kernel/setup.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/ia64/kernel/setup.c b/arch/ia64/kernel/setup.c
index 5010348fa21b..fd6301eafa9d 100644
--- a/arch/ia64/kernel/setup.c
+++ b/arch/ia64/kernel/setup.c
@@ -572,7 +572,7 @@ setup_arch (char **cmdline_p)
 #ifdef CONFIG_ACPI_HOTPLUG_CPU
 	prefill_possible_map();
 #endif
-	per_cpu_scan_finalize((cpumask_weight(&early_cpu_possible_map) = 0 ?
+	per_cpu_scan_finalize((cpumask_empty(&early_cpu_possible_map) ?
 		32 : cpumask_weight(&early_cpu_possible_map)),
 		additional_cpus > 0 ? additional_cpus : 0);
 #endif /* CONFIG_ACPI_NUMA */
-- 
2.30.2

^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 15/54] arch/x86: replace cpumask_weight with cpumask_empty where appropriate
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
@ 2022-01-23 18:38   ` Yury Norov
  2022-01-23 18:38 ` [PATCH 02/54] net/ethernet: don't use bitmap_weight() in bcm_sysport_rule_set() Yury Norov
                     ` (53 subsequent siblings)
  54 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, Steven Rostedt,
	Karol Herbst, Pekka Paalanen, Andy Lutomirski, Steve Wahl,
	Mike Travis, Dimitri Sivanich, Russ Anderson, Darren Hart,
	Andy Shevchenko, x86, nouveau, platform-driver-x86

In some cases, arch/x86 code calls cpumask_weight() to check if any bit of
a given cpumask is set. We can do it more efficiently with cpumask_empty()
because cpumask_empty() stops traversing the cpumask as soon as it finds
first set bit, while cpumask_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 14 +++++++-------
 arch/x86/mm/mmio-mod.c                 |  2 +-
 arch/x86/platform/uv/uv_nmi.c          |  2 +-
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index b57b3db9a6a7..e23ff03290b8 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -341,14 +341,14 @@ static int cpus_mon_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask,
 
 	/* Check whether cpus belong to parent ctrl group */
 	cpumask_andnot(tmpmask, newmask, &prgrp->cpu_mask);
-	if (cpumask_weight(tmpmask)) {
+	if (!cpumask_empty(tmpmask)) {
 		rdt_last_cmd_puts("Can only add CPUs to mongroup that belong to parent\n");
 		return -EINVAL;
 	}
 
 	/* Check whether cpus are dropped from this group */
 	cpumask_andnot(tmpmask, &rdtgrp->cpu_mask, newmask);
-	if (cpumask_weight(tmpmask)) {
+	if (!cpumask_empty(tmpmask)) {
 		/* Give any dropped cpus to parent rdtgroup */
 		cpumask_or(&prgrp->cpu_mask, &prgrp->cpu_mask, tmpmask);
 		update_closid_rmid(tmpmask, prgrp);
@@ -359,7 +359,7 @@ static int cpus_mon_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask,
 	 * and update per-cpu rmid
 	 */
 	cpumask_andnot(tmpmask, newmask, &rdtgrp->cpu_mask);
-	if (cpumask_weight(tmpmask)) {
+	if (!cpumask_empty(tmpmask)) {
 		head = &prgrp->mon.crdtgrp_list;
 		list_for_each_entry(crgrp, head, mon.crdtgrp_list) {
 			if (crgrp == rdtgrp)
@@ -394,7 +394,7 @@ static int cpus_ctrl_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask,
 
 	/* Check whether cpus are dropped from this group */
 	cpumask_andnot(tmpmask, &rdtgrp->cpu_mask, newmask);
-	if (cpumask_weight(tmpmask)) {
+	if (!cpumask_empty(tmpmask)) {
 		/* Can't drop from default group */
 		if (rdtgrp == &rdtgroup_default) {
 			rdt_last_cmd_puts("Can't drop CPUs from default group\n");
@@ -413,12 +413,12 @@ static int cpus_ctrl_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask,
 	 * and update per-cpu closid/rmid.
 	 */
 	cpumask_andnot(tmpmask, newmask, &rdtgrp->cpu_mask);
-	if (cpumask_weight(tmpmask)) {
+	if (!cpumask_empty(tmpmask)) {
 		list_for_each_entry(r, &rdt_all_groups, rdtgroup_list) {
 			if (r == rdtgrp)
 				continue;
 			cpumask_and(tmpmask1, &r->cpu_mask, tmpmask);
-			if (cpumask_weight(tmpmask1))
+			if (!cpumask_empty(tmpmask1))
 				cpumask_rdtgrp_clear(r, tmpmask1);
 		}
 		update_closid_rmid(tmpmask, rdtgrp);
@@ -488,7 +488,7 @@ static ssize_t rdtgroup_cpus_write(struct kernfs_open_file *of,
 
 	/* check that user didn't specify any offline cpus */
 	cpumask_andnot(tmpmask, newmask, cpu_online_mask);
-	if (cpumask_weight(tmpmask)) {
+	if (!cpumask_empty(tmpmask)) {
 		ret = -EINVAL;
 		rdt_last_cmd_puts("Can only assign online CPUs\n");
 		goto unlock;
diff --git a/arch/x86/mm/mmio-mod.c b/arch/x86/mm/mmio-mod.c
index 933a2ebad471..c3317f0650d8 100644
--- a/arch/x86/mm/mmio-mod.c
+++ b/arch/x86/mm/mmio-mod.c
@@ -400,7 +400,7 @@ static void leave_uniprocessor(void)
 	int cpu;
 	int err;
 
-	if (!cpumask_available(downed_cpus) || cpumask_weight(downed_cpus) == 0)
+	if (!cpumask_available(downed_cpus) || cpumask_empty(downed_cpus))
 		return;
 	pr_notice("Re-enabling CPUs...\n");
 	for_each_cpu(cpu, downed_cpus) {
diff --git a/arch/x86/platform/uv/uv_nmi.c b/arch/x86/platform/uv/uv_nmi.c
index 1e9ff28bc2e0..ea277fc08357 100644
--- a/arch/x86/platform/uv/uv_nmi.c
+++ b/arch/x86/platform/uv/uv_nmi.c
@@ -985,7 +985,7 @@ static int uv_handle_nmi(unsigned int reason, struct pt_regs *regs)
 
 	/* Clear global flags */
 	if (master) {
-		if (cpumask_weight(uv_nmi_cpu_mask))
+		if (!cpumask_empty(uv_nmi_cpu_mask))
 			uv_nmi_cleanup_mask();
 		atomic_set(&uv_nmi_cpus_in_nmi, -1);
 		atomic_set(&uv_nmi_cpu, -1);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [Nouveau] [PATCH 15/54] arch/x86: replace cpumask_weight with cpumask_empty where appropriate
@ 2022-01-23 18:38   ` Yury Norov
  0 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, Steven Rostedt,
	Karol Herbst, Pekka Paalanen, Andy Lutomirski, Steve Wahl,
	Mike Travis, Dimitri Sivanich, Russ Anderson, Darren Hart,
	Andy Shevchenko, x86, nouveau, platform-driver-x86

In some cases, arch/x86 code calls cpumask_weight() to check if any bit of
a given cpumask is set. We can do it more efficiently with cpumask_empty()
because cpumask_empty() stops traversing the cpumask as soon as it finds
first set bit, while cpumask_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 14 +++++++-------
 arch/x86/mm/mmio-mod.c                 |  2 +-
 arch/x86/platform/uv/uv_nmi.c          |  2 +-
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index b57b3db9a6a7..e23ff03290b8 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -341,14 +341,14 @@ static int cpus_mon_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask,
 
 	/* Check whether cpus belong to parent ctrl group */
 	cpumask_andnot(tmpmask, newmask, &prgrp->cpu_mask);
-	if (cpumask_weight(tmpmask)) {
+	if (!cpumask_empty(tmpmask)) {
 		rdt_last_cmd_puts("Can only add CPUs to mongroup that belong to parent\n");
 		return -EINVAL;
 	}
 
 	/* Check whether cpus are dropped from this group */
 	cpumask_andnot(tmpmask, &rdtgrp->cpu_mask, newmask);
-	if (cpumask_weight(tmpmask)) {
+	if (!cpumask_empty(tmpmask)) {
 		/* Give any dropped cpus to parent rdtgroup */
 		cpumask_or(&prgrp->cpu_mask, &prgrp->cpu_mask, tmpmask);
 		update_closid_rmid(tmpmask, prgrp);
@@ -359,7 +359,7 @@ static int cpus_mon_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask,
 	 * and update per-cpu rmid
 	 */
 	cpumask_andnot(tmpmask, newmask, &rdtgrp->cpu_mask);
-	if (cpumask_weight(tmpmask)) {
+	if (!cpumask_empty(tmpmask)) {
 		head = &prgrp->mon.crdtgrp_list;
 		list_for_each_entry(crgrp, head, mon.crdtgrp_list) {
 			if (crgrp == rdtgrp)
@@ -394,7 +394,7 @@ static int cpus_ctrl_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask,
 
 	/* Check whether cpus are dropped from this group */
 	cpumask_andnot(tmpmask, &rdtgrp->cpu_mask, newmask);
-	if (cpumask_weight(tmpmask)) {
+	if (!cpumask_empty(tmpmask)) {
 		/* Can't drop from default group */
 		if (rdtgrp == &rdtgroup_default) {
 			rdt_last_cmd_puts("Can't drop CPUs from default group\n");
@@ -413,12 +413,12 @@ static int cpus_ctrl_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask,
 	 * and update per-cpu closid/rmid.
 	 */
 	cpumask_andnot(tmpmask, newmask, &rdtgrp->cpu_mask);
-	if (cpumask_weight(tmpmask)) {
+	if (!cpumask_empty(tmpmask)) {
 		list_for_each_entry(r, &rdt_all_groups, rdtgroup_list) {
 			if (r == rdtgrp)
 				continue;
 			cpumask_and(tmpmask1, &r->cpu_mask, tmpmask);
-			if (cpumask_weight(tmpmask1))
+			if (!cpumask_empty(tmpmask1))
 				cpumask_rdtgrp_clear(r, tmpmask1);
 		}
 		update_closid_rmid(tmpmask, rdtgrp);
@@ -488,7 +488,7 @@ static ssize_t rdtgroup_cpus_write(struct kernfs_open_file *of,
 
 	/* check that user didn't specify any offline cpus */
 	cpumask_andnot(tmpmask, newmask, cpu_online_mask);
-	if (cpumask_weight(tmpmask)) {
+	if (!cpumask_empty(tmpmask)) {
 		ret = -EINVAL;
 		rdt_last_cmd_puts("Can only assign online CPUs\n");
 		goto unlock;
diff --git a/arch/x86/mm/mmio-mod.c b/arch/x86/mm/mmio-mod.c
index 933a2ebad471..c3317f0650d8 100644
--- a/arch/x86/mm/mmio-mod.c
+++ b/arch/x86/mm/mmio-mod.c
@@ -400,7 +400,7 @@ static void leave_uniprocessor(void)
 	int cpu;
 	int err;
 
-	if (!cpumask_available(downed_cpus) || cpumask_weight(downed_cpus) == 0)
+	if (!cpumask_available(downed_cpus) || cpumask_empty(downed_cpus))
 		return;
 	pr_notice("Re-enabling CPUs...\n");
 	for_each_cpu(cpu, downed_cpus) {
diff --git a/arch/x86/platform/uv/uv_nmi.c b/arch/x86/platform/uv/uv_nmi.c
index 1e9ff28bc2e0..ea277fc08357 100644
--- a/arch/x86/platform/uv/uv_nmi.c
+++ b/arch/x86/platform/uv/uv_nmi.c
@@ -985,7 +985,7 @@ static int uv_handle_nmi(unsigned int reason, struct pt_regs *regs)
 
 	/* Clear global flags */
 	if (master) {
-		if (cpumask_weight(uv_nmi_cpu_mask))
+		if (!cpumask_empty(uv_nmi_cpu_mask))
 			uv_nmi_cleanup_mask();
 		atomic_set(&uv_nmi_cpus_in_nmi, -1);
 		atomic_set(&uv_nmi_cpu, -1);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 16/54] cpufreq: replace cpumask_weight with cpumask_empty where appropriate
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
@ 2022-01-23 18:38   ` Yury Norov
  2022-01-23 18:38 ` [PATCH 02/54] net/ethernet: don't use bitmap_weight() in bcm_sysport_rule_set() Yury Norov
                     ` (53 subsequent siblings)
  54 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Andy Gross, Bjorn Andersson, Rafael J. Wysocki, Viresh Kumar,
	Sudeep Holla, Cristian Marussi, linux-arm-msm, linux-pm,
	linux-arm-kernel

drivers/cpufreq calls cpumask_weight() to check if any bit of a given
cpumask is set. We can do it more efficiently with cpumask_empty() because
cpumask_empty() stops traversing the cpumask as soon as it finds first set
bit, while cpumask_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/cpufreq/qcom-cpufreq-hw.c | 2 +-
 drivers/cpufreq/scmi-cpufreq.c    | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
index 05f3d7876e44..95a0c57ab5bb 100644
--- a/drivers/cpufreq/qcom-cpufreq-hw.c
+++ b/drivers/cpufreq/qcom-cpufreq-hw.c
@@ -482,7 +482,7 @@ static int qcom_cpufreq_hw_cpu_init(struct cpufreq_policy *policy)
 	}
 
 	qcom_get_related_cpus(index, policy->cpus);
-	if (!cpumask_weight(policy->cpus)) {
+	if (cpumask_empty(policy->cpus)) {
 		dev_err(dev, "Domain-%d failed to get related CPUs\n", index);
 		ret = -ENOENT;
 		goto error;
diff --git a/drivers/cpufreq/scmi-cpufreq.c b/drivers/cpufreq/scmi-cpufreq.c
index 1e0cd4d165f0..919fa6e3f462 100644
--- a/drivers/cpufreq/scmi-cpufreq.c
+++ b/drivers/cpufreq/scmi-cpufreq.c
@@ -154,7 +154,7 @@ static int scmi_cpufreq_init(struct cpufreq_policy *policy)
 	 * table and opp-shared.
 	 */
 	ret = dev_pm_opp_of_get_sharing_cpus(cpu_dev, priv->opp_shared_cpus);
-	if (ret || !cpumask_weight(priv->opp_shared_cpus)) {
+	if (ret || cpumask_empty(priv->opp_shared_cpus)) {
 		/*
 		 * Either opp-table is not set or no opp-shared was found.
 		 * Use the CPU mask from SCMI to designate CPUs sharing an OPP
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 16/54] cpufreq: replace cpumask_weight with cpumask_empty where appropriate
@ 2022-01-23 18:38   ` Yury Norov
  0 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Andy Gross, Bjorn Andersson, Rafael J. Wysocki, Viresh Kumar,
	Sudeep Holla, Cristian Marussi, linux-arm-msm, linux-pm,
	linux-arm-kernel

drivers/cpufreq calls cpumask_weight() to check if any bit of a given
cpumask is set. We can do it more efficiently with cpumask_empty() because
cpumask_empty() stops traversing the cpumask as soon as it finds first set
bit, while cpumask_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/cpufreq/qcom-cpufreq-hw.c | 2 +-
 drivers/cpufreq/scmi-cpufreq.c    | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
index 05f3d7876e44..95a0c57ab5bb 100644
--- a/drivers/cpufreq/qcom-cpufreq-hw.c
+++ b/drivers/cpufreq/qcom-cpufreq-hw.c
@@ -482,7 +482,7 @@ static int qcom_cpufreq_hw_cpu_init(struct cpufreq_policy *policy)
 	}
 
 	qcom_get_related_cpus(index, policy->cpus);
-	if (!cpumask_weight(policy->cpus)) {
+	if (cpumask_empty(policy->cpus)) {
 		dev_err(dev, "Domain-%d failed to get related CPUs\n", index);
 		ret = -ENOENT;
 		goto error;
diff --git a/drivers/cpufreq/scmi-cpufreq.c b/drivers/cpufreq/scmi-cpufreq.c
index 1e0cd4d165f0..919fa6e3f462 100644
--- a/drivers/cpufreq/scmi-cpufreq.c
+++ b/drivers/cpufreq/scmi-cpufreq.c
@@ -154,7 +154,7 @@ static int scmi_cpufreq_init(struct cpufreq_policy *policy)
 	 * table and opp-shared.
 	 */
 	ret = dev_pm_opp_of_get_sharing_cpus(cpu_dev, priv->opp_shared_cpus);
-	if (ret || !cpumask_weight(priv->opp_shared_cpus)) {
+	if (ret || cpumask_empty(priv->opp_shared_cpus)) {
 		/*
 		 * Either opp-table is not set or no opp-shared was found.
 		 * Use the CPU mask from SCMI to designate CPUs sharing an OPP
-- 
2.30.2


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 17/54] gpu: drm: replace cpumask_weight with cpumask_empty where appropriate
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
@ 2022-01-23 18:38   ` Yury Norov
  2022-01-23 18:38 ` [PATCH 02/54] net/ethernet: don't use bitmap_weight() in bcm_sysport_rule_set() Yury Norov
                     ` (53 subsequent siblings)
  54 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Jani Nikula, Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin,
	David Airlie, Daniel Vetter, intel-gfx, dri-devel

i915_pmu_cpu_online() calls cpumask_weight() to check if any bit of a
given cpumask is set. We can do it more efficiently with cpumask_empty()
because cpumask_empty() stops traversing the cpumask as soon as it finds
first set bit, while cpumask_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/gpu/drm/i915/i915_pmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index ea655161793e..1894c876b31d 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -1048,7 +1048,7 @@ static int i915_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
 	GEM_BUG_ON(!pmu->base.event_init);
 
 	/* Select the first online CPU as a designated reader. */
-	if (!cpumask_weight(&i915_pmu_cpumask))
+	if (cpumask_empty(&i915_pmu_cpumask))
 		cpumask_set_cpu(cpu, &i915_pmu_cpumask);
 
 	return 0;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [Intel-gfx] [PATCH 17/54] gpu: drm: replace cpumask_weight with cpumask_empty where appropriate
@ 2022-01-23 18:38   ` Yury Norov
  0 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Jani Nikula, Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin,
	David Airlie, Daniel Vetter, intel-gfx, dri-devel

i915_pmu_cpu_online() calls cpumask_weight() to check if any bit of a
given cpumask is set. We can do it more efficiently with cpumask_empty()
because cpumask_empty() stops traversing the cpumask as soon as it finds
first set bit, while cpumask_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/gpu/drm/i915/i915_pmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index ea655161793e..1894c876b31d 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -1048,7 +1048,7 @@ static int i915_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
 	GEM_BUG_ON(!pmu->base.event_init);
 
 	/* Select the first online CPU as a designated reader. */
-	if (!cpumask_weight(&i915_pmu_cpumask))
+	if (cpumask_empty(&i915_pmu_cpumask))
 		cpumask_set_cpu(cpu, &i915_pmu_cpumask);
 
 	return 0;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 18/54] drivers/infiniband: replace cpumask_weight with cpumask_empty where appropriate
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (16 preceding siblings ...)
  2022-01-23 18:38   ` [Intel-gfx] " Yury Norov
@ 2022-01-23 18:38 ` Yury Norov
  2022-01-23 19:13   ` Leon Romanovsky
  2022-01-23 18:38 ` [PATCH 19/54] drivers/irqchip: " Yury Norov
                   ` (36 subsequent siblings)
  54 siblings, 1 reply; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Mike Marciniszyn, Dennis Dalessandro, Jason Gunthorpe,
	linux-rdma

drivers/infiniband/hw/hfi1/affinity.c code calls cpumask_weight() to check
if any bit of a given cpumask is set. We can do it more efficiently with
cpumask_empty() because cpumask_empty() stops traversing the cpumask as
soon as it finds first set bit, while cpumask_weight() counts all bits
unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/infiniband/hw/hfi1/affinity.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/affinity.c b/drivers/infiniband/hw/hfi1/affinity.c
index 98c813ba4304..38eee675369a 100644
--- a/drivers/infiniband/hw/hfi1/affinity.c
+++ b/drivers/infiniband/hw/hfi1/affinity.c
@@ -667,7 +667,7 @@ int hfi1_dev_affinity_init(struct hfi1_devdata *dd)
 			 * engines, use the same CPU cores as general/control
 			 * context.
 			 */
-			if (cpumask_weight(&entry->def_intr.mask) == 0)
+			if (cpumask_empty(&entry->def_intr.mask))
 				cpumask_copy(&entry->def_intr.mask,
 					     &entry->general_intr_mask);
 		}
@@ -687,7 +687,7 @@ int hfi1_dev_affinity_init(struct hfi1_devdata *dd)
 		 * vectors, use the same CPU core as the general/control
 		 * context.
 		 */
-		if (cpumask_weight(&entry->comp_vect_mask) == 0)
+		if (cpumask_empty(&entry->comp_vect_mask))
 			cpumask_copy(&entry->comp_vect_mask,
 				     &entry->general_intr_mask);
 	}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 19/54] drivers/irqchip: replace cpumask_weight with cpumask_empty where appropriate
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (17 preceding siblings ...)
  2022-01-23 18:38 ` [PATCH 18/54] drivers/infiniband: " Yury Norov
@ 2022-01-23 18:38 ` Yury Norov
  2022-01-24  3:11   ` Florian Fainelli
  2022-01-23 18:38 ` [PATCH 20/54] kernel/irq: " Yury Norov
                   ` (35 subsequent siblings)
  54 siblings, 1 reply; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Florian Fainelli, Thomas Gleixner, Marc Zyngier,
	bcm-kernel-feedback-list, linux-mips

bcm6345_l1_of_init() calls cpumask_weight() to check if any bit of a given
cpumask is set. We can do it more efficiently with cpumask_empty() because
cpumask_empty() stops traversing the cpumask as soon as it finds first set
bit, while cpumask_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/irqchip/irq-bcm6345-l1.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-bcm6345-l1.c b/drivers/irqchip/irq-bcm6345-l1.c
index fd079215c17f..142a7431745f 100644
--- a/drivers/irqchip/irq-bcm6345-l1.c
+++ b/drivers/irqchip/irq-bcm6345-l1.c
@@ -315,7 +315,7 @@ static int __init bcm6345_l1_of_init(struct device_node *dn,
 			cpumask_set_cpu(idx, &intc->cpumask);
 	}
 
-	if (!cpumask_weight(&intc->cpumask)) {
+	if (cpumask_empty(&intc->cpumask)) {
 		ret = -ENODEV;
 		goto out_free;
 	}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 20/54] kernel/irq: replace cpumask_weight with cpumask_empty where appropriate
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (18 preceding siblings ...)
  2022-01-23 18:38 ` [PATCH 19/54] drivers/irqchip: " Yury Norov
@ 2022-01-23 18:38 ` Yury Norov
  2022-01-23 18:38 ` [PATCH 21/54] kernel: replace cpumask_weight with cpumask_empty in padata.c Yury Norov
                   ` (34 subsequent siblings)
  54 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Thomas Gleixner

__irq_build_affinity_masks() calls cpumask_weight() to check if
any bit of a given cpumask is set. We can do it more efficiently with
cpumask_empty() because cpumask_empty() stops traversing the cpumask as
soon as it finds first set bit, while cpumask_weight() counts all bits
unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 kernel/irq/affinity.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
index f7ff8919dc9b..18740faf0eb1 100644
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -258,7 +258,7 @@ static int __irq_build_affinity_masks(unsigned int startvec,
 	nodemask_t nodemsk = NODE_MASK_NONE;
 	struct node_vectors *node_vectors;
 
-	if (!cpumask_weight(cpu_mask))
+	if (cpumask_empty(cpu_mask))
 		return 0;
 
 	nodes = get_nodes_in_cpumask(node_to_cpumask, cpu_mask, &nodemsk);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 21/54] kernel: replace cpumask_weight with cpumask_empty in padata.c
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (19 preceding siblings ...)
  2022-01-23 18:38 ` [PATCH 20/54] kernel/irq: " Yury Norov
@ 2022-01-23 18:38 ` Yury Norov
  2022-01-28  6:29   ` Herbert Xu
  2022-01-23 18:38 ` [PATCH 22/54] rcu: replace cpumask_weight with cpumask_empty where appropriate Yury Norov
                   ` (33 subsequent siblings)
  54 siblings, 1 reply; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Steffen Klassert, Daniel Jordan, linux-crypto

padata_do_parallel() calls cpumask_weight() to check if any bit of a
given cpumask is set. We can do it more efficiently with cpumask_empty()
because cpumask_empty() stops traversing the cpumask as soon as it finds
first set bit, while cpumask_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 kernel/padata.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/padata.c b/kernel/padata.c
index 18d3a5c699d8..e5819bb8bd1d 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -181,7 +181,7 @@ int padata_do_parallel(struct padata_shell *ps,
 		goto out;
 
 	if (!cpumask_test_cpu(*cb_cpu, pd->cpumask.cbcpu)) {
-		if (!cpumask_weight(pd->cpumask.cbcpu))
+		if (cpumask_empty(pd->cpumask.cbcpu))
 			goto out;
 
 		/* Select an alternate fallback CPU and notify the caller. */
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 22/54] rcu: replace cpumask_weight with cpumask_empty where appropriate
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (20 preceding siblings ...)
  2022-01-23 18:38 ` [PATCH 21/54] kernel: replace cpumask_weight with cpumask_empty in padata.c Yury Norov
@ 2022-01-23 18:38 ` Yury Norov
  2022-01-24  0:07   ` Paul E. McKenney
  2022-01-23 18:38 ` [PATCH 23/54] sched: " Yury Norov
                   ` (32 subsequent siblings)
  54 siblings, 1 reply; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Paul E. McKenney, Josh Triplett, Steven Rostedt,
	Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, rcu

In some places, RCU code calls cpumask_weight() to check if any bit of a
given cpumask is set. We can do it more efficiently with cpumask_empty()
because cpumask_empty() stops traversing the cpumask as soon as it finds
first set bit, while cpumask_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 kernel/rcu/tree_nocb.h   | 4 ++--
 kernel/rcu/tree_plugin.h | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index eeafb546a7a0..f83c7b1d6110 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -1169,7 +1169,7 @@ void __init rcu_init_nohz(void)
 	struct rcu_data *rdp;
 
 #if defined(CONFIG_NO_HZ_FULL)
-	if (tick_nohz_full_running && cpumask_weight(tick_nohz_full_mask))
+	if (tick_nohz_full_running && !cpumask_empty(tick_nohz_full_mask))
 		need_rcu_nocb_mask = true;
 #endif /* #if defined(CONFIG_NO_HZ_FULL) */
 
@@ -1348,7 +1348,7 @@ static void __init rcu_organize_nocb_kthreads(void)
  */
 void rcu_bind_current_to_nocb(void)
 {
-	if (cpumask_available(rcu_nocb_mask) && cpumask_weight(rcu_nocb_mask))
+	if (cpumask_available(rcu_nocb_mask) && !cpumask_empty(rcu_nocb_mask))
 		WARN_ON(sched_setaffinity(current->pid, rcu_nocb_mask));
 }
 EXPORT_SYMBOL_GPL(rcu_bind_current_to_nocb);
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index c5b45c2f68a1..0dc0c8d6717c 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1215,7 +1215,7 @@ static void rcu_boost_kthread_setaffinity(struct rcu_node *rnp, int outgoingcpu)
 		    cpu != outgoingcpu)
 			cpumask_set_cpu(cpu, cm);
 	cpumask_and(cm, cm, housekeeping_cpumask(HK_FLAG_RCU));
-	if (cpumask_weight(cm) == 0)
+	if (cpumask_empty(cm))
 		cpumask_copy(cm, housekeeping_cpumask(HK_FLAG_RCU));
 	set_cpus_allowed_ptr(t, cm);
 	free_cpumask_var(cm);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 23/54] sched: replace cpumask_weight with cpumask_empty where appropriate
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (21 preceding siblings ...)
  2022-01-23 18:38 ` [PATCH 22/54] rcu: replace cpumask_weight with cpumask_empty where appropriate Yury Norov
@ 2022-01-23 18:38 ` Yury Norov
  2022-01-23 18:38 ` [PATCH 24/54] time: replace cpumask_weight with cpumask_empty in clocksource.c Yury Norov
                   ` (31 subsequent siblings)
  54 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Ingo Molnar, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira

In some places, kernel/sched code calls cpumask_weight() to check if
any bit of a given cpumask is set. We can do it more efficiently with
cpumask_empty() because cpumask_empty() stops traversing the cpumask as
soon as it finds first set bit, while cpumask_weight() counts all bits
unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 kernel/sched/core.c     | 2 +-
 kernel/sched/topology.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 2e4ae00e52d1..918d0bdc2ea8 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8707,7 +8707,7 @@ int cpuset_cpumask_can_shrink(const struct cpumask *cur,
 {
 	int ret = 1;
 
-	if (!cpumask_weight(cur))
+	if (cpumask_empty(cur))
 		return ret;
 
 	ret = dl_cpuset_cpumask_can_shrink(cur, trial);
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index d201a7052a29..8478e2a8cd65 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -74,7 +74,7 @@ static int sched_domain_debug_one(struct sched_domain *sd, int cpu, int level,
 			break;
 		}
 
-		if (!cpumask_weight(sched_group_span(group))) {
+		if (cpumask_empty(sched_group_span(group))) {
 			printk(KERN_CONT "\n");
 			printk(KERN_ERR "ERROR: empty group\n");
 			break;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 24/54] time: replace cpumask_weight with cpumask_empty in clocksource.c
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (22 preceding siblings ...)
  2022-01-23 18:38 ` [PATCH 23/54] sched: " Yury Norov
@ 2022-01-23 18:38 ` Yury Norov
  2022-01-23 18:38 ` [PATCH 25/54] mm/vmstat: replace cpumask_weight with cpumask_empty where appropriate Yury Norov
                   ` (30 subsequent siblings)
  54 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	John Stultz, Thomas Gleixner, Stephen Boyd

clocksource_verify_percpu() calls cpumask_weight() to check if any bit of
a given cpumask is set. We can do it more efficiently with cpumask_empty()
because cpumask_empty() stops traversing the cpumask as soon as it finds
first set bit, while cpumask_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 kernel/time/clocksource.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 1cf73807b450..a2fecb4d8c0e 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -337,7 +337,7 @@ void clocksource_verify_percpu(struct clocksource *cs)
 	cpus_read_lock();
 	preempt_disable();
 	clocksource_verify_choose_cpus();
-	if (cpumask_weight(&cpus_chosen) == 0) {
+	if (cpumask_empty(&cpus_chosen)) {
 		preempt_enable();
 		cpus_read_unlock();
 		pr_warn("Not enough CPUs to check clocksource '%s'.\n", cs->name);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 25/54] mm/vmstat: replace cpumask_weight with cpumask_empty where appropriate
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (23 preceding siblings ...)
  2022-01-23 18:38 ` [PATCH 24/54] time: replace cpumask_weight with cpumask_empty in clocksource.c Yury Norov
@ 2022-01-23 18:38 ` Yury Norov
  2022-01-23 18:38 ` [PATCH 26/54] arch/x86: replace nodes_weight with nodes_empty " Yury Norov
                   ` (29 subsequent siblings)
  54 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	linux-mm

mm/vmstat.c code calls cpumask_weight() to check if any bit of a given
cpumask is set. We can do it more efficiently with cpumask_empty() because
cpumask_empty() stops traversing the cpumask as soon as it finds first set
bit, while cpumask_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 mm/vmstat.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/vmstat.c b/mm/vmstat.c
index 4057372745d0..f56f11e3eef5 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -2035,7 +2035,7 @@ static void __init init_cpu_node_state(void)
 	int node;
 
 	for_each_online_node(node) {
-		if (cpumask_weight(cpumask_of_node(node)) > 0)
+		if (!cpumask_empty(cpumask_of_node(node)))
 			node_set_state(node, N_CPU);
 	}
 }
@@ -2062,7 +2062,7 @@ static int vmstat_cpu_dead(unsigned int cpu)
 
 	refresh_zone_stat_thresholds();
 	node_cpus = cpumask_of_node(node);
-	if (cpumask_weight(node_cpus) > 0)
+	if (!cpumask_empty(node_cpus))
 		return 0;
 
 	node_clear_state(node, N_CPU);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 26/54] arch/x86: replace nodes_weight with nodes_empty where appropriate
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (24 preceding siblings ...)
  2022-01-23 18:38 ` [PATCH 25/54] mm/vmstat: replace cpumask_weight with cpumask_empty where appropriate Yury Norov
@ 2022-01-23 18:38 ` Yury Norov
  2022-01-23 18:38 ` [PATCH 27/54] lib/bitmap: add bitmap_weight_{cmp, eq, gt, ge, lt, le} functions Yury Norov
                   ` (28 subsequent siblings)
  54 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Dave Hansen, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, x86

mm code calls nodes_weight() to check if any bit of a given nodemask is
set. We can do it more efficiently with nodes_empty() because nodes_empty()
stops traversing the nodemask as soon as it finds first set bit, while
nodes_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 arch/x86/mm/amdtopology.c    | 2 +-
 arch/x86/mm/numa_emulation.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/mm/amdtopology.c b/arch/x86/mm/amdtopology.c
index 058b2f36b3a6..b3ca7d23e4b0 100644
--- a/arch/x86/mm/amdtopology.c
+++ b/arch/x86/mm/amdtopology.c
@@ -154,7 +154,7 @@ int __init amd_numa_init(void)
 		node_set(nodeid, numa_nodes_parsed);
 	}
 
-	if (!nodes_weight(numa_nodes_parsed))
+	if (nodes_empty(numa_nodes_parsed))
 		return -ENOENT;
 
 	/*
diff --git a/arch/x86/mm/numa_emulation.c b/arch/x86/mm/numa_emulation.c
index 1a02b791d273..9a9305367fdd 100644
--- a/arch/x86/mm/numa_emulation.c
+++ b/arch/x86/mm/numa_emulation.c
@@ -123,7 +123,7 @@ static int __init split_nodes_interleave(struct numa_meminfo *ei,
 	 * Continue to fill physical nodes with fake nodes until there is no
 	 * memory left on any of them.
 	 */
-	while (nodes_weight(physnode_mask)) {
+	while (!nodes_empty(physnode_mask)) {
 		for_each_node_mask(i, physnode_mask) {
 			u64 dma32_end = PFN_PHYS(MAX_DMA32_PFN);
 			u64 start, limit, end;
@@ -270,7 +270,7 @@ static int __init split_nodes_size_interleave_uniform(struct numa_meminfo *ei,
 	 * Fill physical nodes with fake nodes of size until there is no memory
 	 * left on any of them.
 	 */
-	while (nodes_weight(physnode_mask)) {
+	while (!nodes_empty(physnode_mask)) {
 		for_each_node_mask(i, physnode_mask) {
 			u64 dma32_end = PFN_PHYS(MAX_DMA32_PFN);
 			u64 start, limit, end;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 27/54] lib/bitmap: add bitmap_weight_{cmp, eq, gt, ge, lt, le} functions
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (25 preceding siblings ...)
  2022-01-23 18:38 ` [PATCH 26/54] arch/x86: replace nodes_weight with nodes_empty " Yury Norov
@ 2022-01-23 18:38 ` Yury Norov
  2022-01-24 12:41   ` Andy Shevchenko
  2022-01-28  6:59   ` Vaittinen, Matti
  2022-01-23 18:38 ` [PATCH 28/54] arch/x86: replace bitmap_weight with bitmap_weight_{eq,gt,ge,lt,le} where appropriate Yury Norov
                   ` (27 subsequent siblings)
  54 siblings, 2 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel

Many kernel users use bitmap_weight() to compare the result against
some number or expression:

	if (bitmap_weight(...) > 1)
		do_something();

It works OK, but may be significantly improved for large bitmaps: if
first few words count set bits to a number greater than given, we can
stop counting and immediately return.

The same idea would work in other direction: if we know that the number
of set bits that we counted so far is small enough, so that it would be
smaller than required number even if all bits of the rest of the bitmap
are set, we can stop counting earlier.

This patch adds new bitmap_weight_cmp() as suggested by Michał Mirosław
and a family of eq, gt, ge, lt and le wrappers to allow this optimization.
The following patches apply new functions where appropriate.

Suggested-by: "Michał Mirosław" <mirq-linux@rere.qmqm.pl> (for bitmap_weight_cmp)
Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 include/linux/bitmap.h | 80 ++++++++++++++++++++++++++++++++++++++++++
 lib/bitmap.c           | 21 +++++++++++
 2 files changed, 101 insertions(+)

diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 7dba0847510c..708e57b32362 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -51,6 +51,12 @@ struct device;
  *  bitmap_empty(src, nbits)                    Are all bits zero in *src?
  *  bitmap_full(src, nbits)                     Are all bits set in *src?
  *  bitmap_weight(src, nbits)                   Hamming Weight: number set bits
+ *  bitmap_weight_cmp(src, nbits)               compare Hamming Weight with a number
+ *  bitmap_weight_eq(src, nbits, num)           Hamming Weight == num
+ *  bitmap_weight_gt(src, nbits, num)           Hamming Weight >  num
+ *  bitmap_weight_ge(src, nbits, num)           Hamming Weight >= num
+ *  bitmap_weight_lt(src, nbits, num)           Hamming Weight <  num
+ *  bitmap_weight_le(src, nbits, num)           Hamming Weight <= num
  *  bitmap_set(dst, pos, nbits)                 Set specified bit area
  *  bitmap_clear(dst, pos, nbits)               Clear specified bit area
  *  bitmap_find_next_zero_area(buf, len, pos, n, mask)  Find bit free area
@@ -162,6 +168,7 @@ int __bitmap_intersects(const unsigned long *bitmap1,
 int __bitmap_subset(const unsigned long *bitmap1,
 		    const unsigned long *bitmap2, unsigned int nbits);
 int __bitmap_weight(const unsigned long *bitmap, unsigned int nbits);
+int __bitmap_weight_cmp(const unsigned long *bitmap, unsigned int bits, int num);
 void __bitmap_set(unsigned long *map, unsigned int start, int len);
 void __bitmap_clear(unsigned long *map, unsigned int start, int len);
 
@@ -403,6 +410,79 @@ static __always_inline int bitmap_weight(const unsigned long *src, unsigned int
 	return __bitmap_weight(src, nbits);
 }
 
+/**
+ * bitmap_weight_cmp - compares number of set bits in @src with @num.
+ * @src:   source bitmap
+ * @nbits: length of bitmap in bits
+ * @num:   number to compare with
+ *
+ * As opposite to bitmap_weight() this function doesn't necessarily
+ * traverse full bitmap and may return earlier.
+ *
+ * Returns zero if weight of @src is equal to @num;
+ *	   negative number if weight of @src is less than @num;
+ *	   positive number if weight of @src is greater than @num;
+ *
+ * NOTES
+ *
+ * Because number of set bits cannot decrease while counting, when user
+ * wants to know if the number of set bits in the bitmap is less than
+ * @num, calling
+ *	bitmap_weight_cmp(..., @num) < 0
+ * is potentially less effective than
+ *	bitmap_weight_cmp(..., @num - 1) <= 0
+ *
+ * Consider an example:
+ * bitmap_weight_cmp(1000 0000 0000 0000, 1) < 0
+ *				    ^
+ *				    stop here
+ *
+ * bitmap_weight_cmp(1000 0000 0000 0000, 0) <= 0
+ *		     ^
+ *		     stop here
+ */
+static __always_inline
+int bitmap_weight_cmp(const unsigned long *src, unsigned int nbits, int num)
+{
+	if (num > (int)nbits || num < 0)
+		return -num;
+
+	if (small_const_nbits(nbits))
+		return hweight_long(*src & BITMAP_LAST_WORD_MASK(nbits)) - num;
+
+	return __bitmap_weight_cmp(src, nbits, num);
+}
+
+static __always_inline
+bool bitmap_weight_eq(const unsigned long *src, unsigned int nbits, int num)
+{
+	return bitmap_weight_cmp(src, nbits, num) == 0;
+}
+
+static __always_inline
+bool bitmap_weight_gt(const unsigned long *src, unsigned int nbits, int num)
+{
+	return bitmap_weight_cmp(src, nbits, num) > 0;
+}
+
+static __always_inline
+bool bitmap_weight_ge(const unsigned long *src, unsigned int nbits, int num)
+{
+	return bitmap_weight_cmp(src, nbits, num - 1) > 0;
+}
+
+static __always_inline
+bool bitmap_weight_lt(const unsigned long *src, unsigned int nbits, int num)
+{
+	return bitmap_weight_cmp(src, nbits, num - 1) <= 0;
+}
+
+static __always_inline
+bool bitmap_weight_le(const unsigned long *src, unsigned int nbits, int num)
+{
+	return bitmap_weight_cmp(src, nbits, num) <= 0;
+}
+
 static __always_inline void bitmap_set(unsigned long *map, unsigned int start,
 		unsigned int nbits)
 {
diff --git a/lib/bitmap.c b/lib/bitmap.c
index 926408883456..fb84ca70c5d9 100644
--- a/lib/bitmap.c
+++ b/lib/bitmap.c
@@ -348,6 +348,27 @@ int __bitmap_weight(const unsigned long *bitmap, unsigned int bits)
 }
 EXPORT_SYMBOL(__bitmap_weight);
 
+int __bitmap_weight_cmp(const unsigned long *bitmap, unsigned int bits, int num)
+{
+	unsigned int k, w, lim = bits / BITS_PER_LONG;
+
+	for (k = 0, w = 0; k < lim; k++) {
+		if (w + bits - k * BITS_PER_LONG < num)
+			goto out;
+
+		w += hweight_long(bitmap[k]);
+
+		if (w > num)
+			goto out;
+	}
+
+	if (bits % BITS_PER_LONG)
+		w += hweight_long(bitmap[k] & BITMAP_LAST_WORD_MASK(bits));
+out:
+	return w - num;
+}
+EXPORT_SYMBOL(__bitmap_weight_cmp);
+
 void __bitmap_set(unsigned long *map, unsigned int start, int len)
 {
 	unsigned long *p = map + BIT_WORD(start);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 28/54] arch/x86: replace bitmap_weight with bitmap_weight_{eq,gt,ge,lt,le} where appropriate
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (26 preceding siblings ...)
  2022-01-23 18:38 ` [PATCH 27/54] lib/bitmap: add bitmap_weight_{cmp, eq, gt, ge, lt, le} functions Yury Norov
@ 2022-01-23 18:38 ` Yury Norov
  2022-01-23 18:39 ` [PATCH 29/54] drivers/iio: replace bitmap_weight() with bitmap_weight_{eq,gt} " Yury Norov
                   ` (26 subsequent siblings)
  54 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:38 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86

__init_one_rdt_domain in rdtgroup.c code calls bitmap_weight() to compare
the weight of bitmap with a given number. We can do it more efficiently
with bitmap_weight_lt because conditional bitmap_weight() may stop
traversing the bitmap earlier, as soon as condition is met.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index e23ff03290b8..9d42e592c1cf 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2752,7 +2752,7 @@ static int __init_one_rdt_domain(struct rdt_domain *d, struct resctrl_schema *s,
 	 * bitmap_weight() does not access out-of-bound memory.
 	 */
 	tmp_cbm = cfg->new_ctrl;
-	if (bitmap_weight(&tmp_cbm, r->cache.cbm_len) < r->cache.min_cbm_bits) {
+	if (bitmap_weight_lt(&tmp_cbm, r->cache.cbm_len, r->cache.min_cbm_bits)) {
 		rdt_last_cmd_printf("No space on %s:%d\n", s->name, d->id);
 		return -ENOSPC;
 	}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 29/54] drivers/iio: replace bitmap_weight() with bitmap_weight_{eq,gt} where appropriate
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (27 preceding siblings ...)
  2022-01-23 18:38 ` [PATCH 28/54] arch/x86: replace bitmap_weight with bitmap_weight_{eq,gt,ge,lt,le} where appropriate Yury Norov
@ 2022-01-23 18:39 ` Yury Norov
  2022-01-24 12:46   ` Andy Shevchenko
  2022-01-23 18:39 ` [PATCH 30/54] drivers/memstick: replace bitmap_weight with bitmap_weight_eq " Yury Norov
                   ` (25 subsequent siblings)
  54 siblings, 1 reply; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:39 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Jonathan Cameron, Lars-Peter Clausen, Nathan Chancellor,
	Alexandru Ardelean, linux-iio

drivers/iio calls bitmap_weight() to compare the weight of bitmap with
a given number. We can do it more efficiently with bitmap_weight_{eq, gt}
because conditional bitmap_weight may stop traversing the bitmap earlier,
as soon as condition is met.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/iio/dummy/iio_simple_dummy_buffer.c | 4 ++--
 drivers/iio/industrialio-trigger.c          | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/iio/dummy/iio_simple_dummy_buffer.c b/drivers/iio/dummy/iio_simple_dummy_buffer.c
index d81c2b2dad82..670997301e47 100644
--- a/drivers/iio/dummy/iio_simple_dummy_buffer.c
+++ b/drivers/iio/dummy/iio_simple_dummy_buffer.c
@@ -71,8 +71,8 @@ static irqreturn_t iio_simple_dummy_trigger_h(int irq, void *p)
 		int i, j;
 
 		for (i = 0, j = 0;
-		     i < bitmap_weight(indio_dev->active_scan_mask,
-				       indio_dev->masklength);
+		     bitmap_weight_gt(indio_dev->active_scan_mask,
+				       indio_dev->masklength, i);
 		     i++, j++) {
 			j = find_next_bit(indio_dev->active_scan_mask,
 					  indio_dev->masklength, j);
diff --git a/drivers/iio/industrialio-trigger.c b/drivers/iio/industrialio-trigger.c
index f504ed351b3e..98c54022fecf 100644
--- a/drivers/iio/industrialio-trigger.c
+++ b/drivers/iio/industrialio-trigger.c
@@ -331,7 +331,7 @@ int iio_trigger_detach_poll_func(struct iio_trigger *trig,
 {
 	struct iio_dev_opaque *iio_dev_opaque = to_iio_dev_opaque(pf->indio_dev);
 	bool no_other_users =
-		bitmap_weight(trig->pool, CONFIG_IIO_CONSUMERS_PER_TRIGGER) == 1;
+		bitmap_weight_eq(trig->pool, CONFIG_IIO_CONSUMERS_PER_TRIGGER, 1);
 	int ret = 0;
 
 	if (trig->ops && trig->ops->set_trigger_state && no_other_users) {
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 30/54] drivers/memstick: replace bitmap_weight with bitmap_weight_eq where appropriate
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (28 preceding siblings ...)
  2022-01-23 18:39 ` [PATCH 29/54] drivers/iio: replace bitmap_weight() with bitmap_weight_{eq,gt} " Yury Norov
@ 2022-01-23 18:39 ` Yury Norov
  2022-01-31 16:07   ` Ulf Hansson
  2022-01-23 18:39   ` [Intel-wired-lan] " Yury Norov
                   ` (24 subsequent siblings)
  54 siblings, 1 reply; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:39 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Maxim Levitsky, Alex Dubov, Ulf Hansson, Jens Axboe,
	Luis Chamberlain, Colin Ian King, Arnd Bergmann,
	Shubhankar Kuranagatti, linux-mmc

msb_validate_used_block_bitmap() calls bitmap_weight() to compare the
weight of bitmap with a given number. We can do it more efficiently with
bitmap_weight_eq because conditional bitmap_weight may stop traversing the
bitmap earlier, as soon as condition is met.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/memstick/core/ms_block.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/memstick/core/ms_block.c b/drivers/memstick/core/ms_block.c
index 0cda6c6baefc..5cdd987e78f7 100644
--- a/drivers/memstick/core/ms_block.c
+++ b/drivers/memstick/core/ms_block.c
@@ -155,8 +155,8 @@ static int msb_validate_used_block_bitmap(struct msb_data *msb)
 	for (i = 0; i < msb->zone_count; i++)
 		total_free_blocks += msb->free_block_count[i];
 
-	if (msb->block_count - bitmap_weight(msb->used_blocks_bitmap,
-					msb->block_count) == total_free_blocks)
+	if (bitmap_weight_eq(msb->used_blocks_bitmap, msb->block_count,
+				msb->block_count - total_free_blocks))
 		return 0;
 
 	pr_err("BUG: free block counts don't match the bitmap");
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 31/54] net: ethernet: replace bitmap_weight with bitmap_weight_eq for intel
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
@ 2022-01-23 18:39   ` Yury Norov
  2022-01-23 18:38 ` [PATCH 02/54] net/ethernet: don't use bitmap_weight() in bcm_sysport_rule_set() Yury Norov
                     ` (53 subsequent siblings)
  54 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:39 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Jesse Brandeburg, Tony Nguyen, David S. Miller, Jakub Kicinski,
	intel-wired-lan, netdev

ixgbe_disable_sriov calls bitmap_weight() to compare the weight of bitmap
with a given number. We can do it more efficiently with bitmap_weight_eq
because conditional bitmap_weight may stop traversing the bitmap earlier,
as soon as condition is met.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
index 214a38de3f41..35297d8a488b 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
@@ -246,7 +246,7 @@ int ixgbe_disable_sriov(struct ixgbe_adapter *adapter)
 #endif
 
 	/* Disable VMDq flag so device will be set in VM mode */
-	if (bitmap_weight(adapter->fwd_bitmask, adapter->num_rx_pools) == 1) {
+	if (bitmap_weight_eq(adapter->fwd_bitmask, adapter->num_rx_pools, 1)) {
 		adapter->flags &= ~IXGBE_FLAG_VMDQ_ENABLED;
 		adapter->flags &= ~IXGBE_FLAG_SRIOV_ENABLED;
 		rss = min_t(int, ixgbe_max_rss_indices(adapter),
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [Intel-wired-lan] [PATCH 31/54] net: ethernet: replace bitmap_weight with bitmap_weight_eq for intel
@ 2022-01-23 18:39   ` Yury Norov
  0 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:39 UTC (permalink / raw)
  To: intel-wired-lan

ixgbe_disable_sriov calls bitmap_weight() to compare the weight of bitmap
with a given number. We can do it more efficiently with bitmap_weight_eq
because conditional bitmap_weight may stop traversing the bitmap earlier,
as soon as condition is met.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
index 214a38de3f41..35297d8a488b 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
@@ -246,7 +246,7 @@ int ixgbe_disable_sriov(struct ixgbe_adapter *adapter)
 #endif
 
 	/* Disable VMDq flag so device will be set in VM mode */
-	if (bitmap_weight(adapter->fwd_bitmask, adapter->num_rx_pools) == 1) {
+	if (bitmap_weight_eq(adapter->fwd_bitmask, adapter->num_rx_pools, 1)) {
 		adapter->flags &= ~IXGBE_FLAG_VMDQ_ENABLED;
 		adapter->flags &= ~IXGBE_FLAG_SRIOV_ENABLED;
 		rss = min_t(int, ixgbe_max_rss_indices(adapter),
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 32/54] net: ethernet: replace bitmap_weight with bitmap_weight_{eq,gt} for OcteonTX2
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (30 preceding siblings ...)
  2022-01-23 18:39   ` [Intel-wired-lan] " Yury Norov
@ 2022-01-23 18:39 ` Yury Norov
  2022-01-23 18:39 ` [PATCH 33/54] net: ethernet: replace bitmap_weight with bitmap_weight_{eq,gt,ge,lt,le} for mellanox Yury Norov
                   ` (22 subsequent siblings)
  54 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:39 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Sunil Goutham, Geetha sowjanya, Subbaraya Sundeep, hariprasad,
	David S. Miller, Jakub Kicinski, netdev

OcteonTX2 code calls bitmap_weight() to compare the weight of bitmap with
a given number. We can do it more efficiently with bitmap_weight_{eq,gt}
because conditional bitmap_weight may stop traversing the bitmap earlier,
as soon as condition is met.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c | 2 +-
 drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c   | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c
index d85db90632d6..a55fd1d0c653 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c
@@ -287,7 +287,7 @@ static int otx2_set_channels(struct net_device *dev,
 	if (!channel->rx_count || !channel->tx_count)
 		return -EINVAL;
 
-	if (bitmap_weight(&pfvf->rq_bmap, pfvf->hw.rx_queues) > 1) {
+	if (bitmap_weight_gt(&pfvf->rq_bmap, pfvf->hw.rx_queues, 1)) {
 		netdev_err(dev,
 			   "Receive queues are in use by TC police action\n");
 		return -EINVAL;
diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c
index 80b2d64b4136..55c899a6fcdd 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c
@@ -1170,8 +1170,8 @@ int otx2_remove_flow(struct otx2_nic *pfvf, u32 location)
 		 * interface mac address and configure CGX/RPM block in
 		 * promiscuous mode
 		 */
-		if (bitmap_weight(&flow_cfg->dmacflt_bmap,
-				  flow_cfg->dmacflt_max_flows) == 1)
+		if (bitmap_weight_eq(&flow_cfg->dmacflt_bmap,
+				     flow_cfg->dmacflt_max_flows, 1))
 			otx2_update_rem_pfmac(pfvf, DMAC_ADDR_DEL);
 	} else {
 		err = otx2_remove_flow_msg(pfvf, flow->entry, false);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 33/54] net: ethernet: replace bitmap_weight with bitmap_weight_{eq,gt,ge,lt,le} for mellanox
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (31 preceding siblings ...)
  2022-01-23 18:39 ` [PATCH 32/54] net: ethernet: replace bitmap_weight with bitmap_weight_{eq,gt} for OcteonTX2 Yury Norov
@ 2022-01-23 18:39 ` Yury Norov
  2022-01-24 12:48   ` Andy Shevchenko
  2022-01-23 18:39   ` Yury Norov
                   ` (21 subsequent siblings)
  54 siblings, 1 reply; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:39 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Sunil Goutham, Geetha sowjanya, Subbaraya Sundeep, hariprasad,
	David S. Miller, Jakub Kicinski, netdev

Mellanox code uses bitmap_weight() to compare the weight of bitmap with
a given number. We can do it more efficiently with bitmap_weight_{eq, ...}
because conditional bitmap_weight may stop traversing the bitmap earlier,
as soon as condition is met.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/net/ethernet/mellanox/mlx4/cmd.c  | 10 +++-------
 drivers/net/ethernet/mellanox/mlx4/eq.c   |  4 ++--
 drivers/net/ethernet/mellanox/mlx4/fw.c   |  4 ++--
 drivers/net/ethernet/mellanox/mlx4/main.c |  2 +-
 4 files changed, 8 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/cmd.c b/drivers/net/ethernet/mellanox/mlx4/cmd.c
index c56d2194cbfc..5bca0c68f00a 100644
--- a/drivers/net/ethernet/mellanox/mlx4/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx4/cmd.c
@@ -2792,9 +2792,8 @@ int mlx4_slave_convert_port(struct mlx4_dev *dev, int slave, int port)
 {
 	unsigned n;
 	struct mlx4_active_ports actv_ports = mlx4_get_active_ports(dev, slave);
-	unsigned m = bitmap_weight(actv_ports.ports, dev->caps.num_ports);
 
-	if (port <= 0 || port > m)
+	if (port <= 0 || bitmap_weight_lt(actv_ports.ports, dev->caps.num_ports, port))
 		return -EINVAL;
 
 	n = find_first_bit(actv_ports.ports, dev->caps.num_ports);
@@ -3404,10 +3403,6 @@ int mlx4_vf_set_enable_smi_admin(struct mlx4_dev *dev, int slave, int port,
 	struct mlx4_priv *priv = mlx4_priv(dev);
 	struct mlx4_active_ports actv_ports = mlx4_get_active_ports(
 			&priv->dev, slave);
-	int min_port = find_first_bit(actv_ports.ports,
-				      priv->dev.caps.num_ports) + 1;
-	int max_port = min_port - 1 +
-		bitmap_weight(actv_ports.ports, priv->dev.caps.num_ports);
 
 	if (slave == mlx4_master_func_num(dev))
 		return 0;
@@ -3417,7 +3412,8 @@ int mlx4_vf_set_enable_smi_admin(struct mlx4_dev *dev, int slave, int port,
 	    enabled < 0 || enabled > 1)
 		return -EINVAL;
 
-	if (min_port == max_port && dev->caps.num_ports > 1) {
+	if (dev->caps.num_ports > 1 &&
+	    bitmap_weight_eq(actv_ports.ports, priv->dev.caps.num_ports, 1)) {
 		mlx4_info(dev, "SMI access disallowed for single ported VFs\n");
 		return -EPROTONOSUPPORT;
 	}
diff --git a/drivers/net/ethernet/mellanox/mlx4/eq.c b/drivers/net/ethernet/mellanox/mlx4/eq.c
index 414e390e6b48..0c09432ff389 100644
--- a/drivers/net/ethernet/mellanox/mlx4/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/eq.c
@@ -1435,8 +1435,8 @@ int mlx4_is_eq_shared(struct mlx4_dev *dev, int vector)
 	if (vector <= 0 || (vector >= dev->caps.num_comp_vectors + 1))
 		return -EINVAL;
 
-	return !!(bitmap_weight(priv->eq_table.eq[vector].actv_ports.ports,
-				dev->caps.num_ports) > 1);
+	return bitmap_weight_gt(priv->eq_table.eq[vector].actv_ports.ports,
+				dev->caps.num_ports, 1);
 }
 EXPORT_SYMBOL(mlx4_is_eq_shared);
 
diff --git a/drivers/net/ethernet/mellanox/mlx4/fw.c b/drivers/net/ethernet/mellanox/mlx4/fw.c
index 42c96c9d7fb1..855aae326ccb 100644
--- a/drivers/net/ethernet/mellanox/mlx4/fw.c
+++ b/drivers/net/ethernet/mellanox/mlx4/fw.c
@@ -1300,8 +1300,8 @@ int mlx4_QUERY_DEV_CAP_wrapper(struct mlx4_dev *dev, int slave,
 	actv_ports = mlx4_get_active_ports(dev, slave);
 	first_port = find_first_bit(actv_ports.ports, dev->caps.num_ports);
 	for (slave_port = 0, real_port = first_port;
-	     real_port < first_port +
-	     bitmap_weight(actv_ports.ports, dev->caps.num_ports);
+	     bitmap_weight_gt(actv_ports.ports, dev->caps.num_ports,
+			      real_port - first_port);
 	     ++real_port, ++slave_port) {
 		if (flags & (MLX4_DEV_CAP_FLAG_WOL_PORT1 << real_port))
 			flags |= MLX4_DEV_CAP_FLAG_WOL_PORT1 << slave_port;
diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index b187c210d4d6..cfbaa7ac712f 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -1383,7 +1383,7 @@ static int mlx4_mf_bond(struct mlx4_dev *dev)
 		   dev->persist->num_vfs + 1);
 
 	/* only single port vfs are allowed */
-	if (bitmap_weight(slaves_port_1_2, dev->persist->num_vfs + 1) > 1) {
+	if (bitmap_weight_gt(slaves_port_1_2, dev->persist->num_vfs + 1, 1)) {
 		mlx4_warn(dev, "HA mode unsupported for dual ported VFs\n");
 		return -EINVAL;
 	}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 34/54] perf: replace bitmap_weight with bitmap_weight_eq for ThunderX2
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
@ 2022-01-23 18:39   ` Yury Norov
  2022-01-23 18:38 ` [PATCH 02/54] net/ethernet: don't use bitmap_weight() in bcm_sysport_rule_set() Yury Norov
                     ` (53 subsequent siblings)
  54 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:39 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Will Deacon, Mark Rutland, linux-arm-kernel

tx2_uncore_event_start() calls bitmap_weight() to compare the weight
of bitmap with a given number. We can do it more efficiently with
bitmap_weight_eq because conditional bitmap_weight may stop traversing
the bitmap earlier, as soon as condition is met.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/perf/thunderx2_pmu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/perf/thunderx2_pmu.c b/drivers/perf/thunderx2_pmu.c
index 05378c0fd8f3..ebfa66b212c7 100644
--- a/drivers/perf/thunderx2_pmu.c
+++ b/drivers/perf/thunderx2_pmu.c
@@ -623,8 +623,8 @@ static void tx2_uncore_event_start(struct perf_event *event, int flags)
 		return;
 
 	/* Start timer for first event */
-	if (bitmap_weight(tx2_pmu->active_counters,
-				tx2_pmu->max_counters) == 1) {
+	if (bitmap_weight_eq(tx2_pmu->active_counters,
+				tx2_pmu->max_counters, 1)) {
 		hrtimer_start(&tx2_pmu->hrtimer,
 			ns_to_ktime(tx2_pmu->hrtimer_interval),
 			HRTIMER_MODE_REL_PINNED);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 34/54] perf: replace bitmap_weight with bitmap_weight_eq for ThunderX2
@ 2022-01-23 18:39   ` Yury Norov
  0 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:39 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Will Deacon, Mark Rutland, linux-arm-kernel

tx2_uncore_event_start() calls bitmap_weight() to compare the weight
of bitmap with a given number. We can do it more efficiently with
bitmap_weight_eq because conditional bitmap_weight may stop traversing
the bitmap earlier, as soon as condition is met.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/perf/thunderx2_pmu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/perf/thunderx2_pmu.c b/drivers/perf/thunderx2_pmu.c
index 05378c0fd8f3..ebfa66b212c7 100644
--- a/drivers/perf/thunderx2_pmu.c
+++ b/drivers/perf/thunderx2_pmu.c
@@ -623,8 +623,8 @@ static void tx2_uncore_event_start(struct perf_event *event, int flags)
 		return;
 
 	/* Start timer for first event */
-	if (bitmap_weight(tx2_pmu->active_counters,
-				tx2_pmu->max_counters) == 1) {
+	if (bitmap_weight_eq(tx2_pmu->active_counters,
+				tx2_pmu->max_counters, 1)) {
 		hrtimer_start(&tx2_pmu->hrtimer,
 			ns_to_ktime(tx2_pmu->hrtimer_interval),
 			HRTIMER_MODE_REL_PINNED);
-- 
2.30.2


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 35/54] drivers/staging: replace bitmap_weight with bitmap_weight_le for tegra-video
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (33 preceding siblings ...)
  2022-01-23 18:39   ` Yury Norov
@ 2022-01-23 18:39 ` Yury Norov
  2022-01-23 18:39 ` [PATCH 36/54] lib/cpumask: add cpumask_weight_{eq,gt,ge,lt,le} Yury Norov
                   ` (19 subsequent siblings)
  54 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:39 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Thierry Reding, Jonathan Hunter, Sowjanya Komatineni,
	Mauro Carvalho Chehab, linux-media, linux-tegra, linux-staging

tegra_channel_enum_format() calls bitmap_weight() to compare the weight
of bitmap with a given number. We can do it more efficiently with
bitmap_weight_le() because conditional bitmap_weight may stop traversing
the bitmap earlier, as soon as condition is met.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/staging/media/tegra-video/vi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/media/tegra-video/vi.c b/drivers/staging/media/tegra-video/vi.c
index d1f43f465c22..4e79a80e9307 100644
--- a/drivers/staging/media/tegra-video/vi.c
+++ b/drivers/staging/media/tegra-video/vi.c
@@ -436,7 +436,7 @@ static int tegra_channel_enum_format(struct file *file, void *fh,
 	if (!IS_ENABLED(CONFIG_VIDEO_TEGRA_TPG))
 		fmts_bitmap = chan->fmts_bitmap;
 
-	if (f->index >= bitmap_weight(fmts_bitmap, MAX_FORMAT_NUM))
+	if (bitmap_weight_le(fmts_bitmap, MAX_FORMAT_NUM, f->index))
 		return -EINVAL;
 
 	for (i = 0; i < f->index + 1; i++, index++)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 36/54] lib/cpumask: add cpumask_weight_{eq,gt,ge,lt,le}
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (34 preceding siblings ...)
  2022-01-23 18:39 ` [PATCH 35/54] drivers/staging: replace bitmap_weight with bitmap_weight_le for tegra-video Yury Norov
@ 2022-01-23 18:39 ` Yury Norov
  2022-01-23 18:39   ` Yury Norov
                   ` (18 subsequent siblings)
  54 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:39 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel

In many cases people use cpumask_weight() to compare the result against
some number or expression:

	if (cpumask_weight(...) > 1)
		do_something();

It may be significantly improved for large cpumasks: if first few words
count set bits to a number greater than given, we can stop counting and
immediately return.

The same idea would work in other direction: if we know that the number
of set bits that we counted so far is small enough, so that it would be
smaller than required number even if all bits of the rest of the cpumask
are set, we can stop counting earlier.

This patch adds cpumask_weight_{eq, gt, ge, lt, le} helpers based on
corresponding bitmap functions. The following patches apply new functions
where appropriate.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 include/linux/cpumask.h | 50 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 64dae70d31f5..1906e3225737 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -575,6 +575,56 @@ static inline unsigned int cpumask_weight(const struct cpumask *srcp)
 	return bitmap_weight(cpumask_bits(srcp), nr_cpumask_bits);
 }
 
+/**
+ * cpumask_weight_eq - Check if # of bits in *srcp is equal to a given number
+ * @srcp: the cpumask to count bits (< nr_cpu_ids) in.
+ * @num: the number to check.
+ */
+static inline bool cpumask_weight_eq(const struct cpumask *srcp, unsigned int num)
+{
+	return bitmap_weight_eq(cpumask_bits(srcp), nr_cpumask_bits, num);
+}
+
+/**
+ * cpumask_weight_gt - Check if # of bits in *srcp is greater than a given number
+ * @srcp: the cpumask to count bits (< nr_cpu_ids) in.
+ * @num: the number to check.
+ */
+static inline bool cpumask_weight_gt(const struct cpumask *srcp, int num)
+{
+	return bitmap_weight_gt(cpumask_bits(srcp), nr_cpumask_bits, num);
+}
+
+/**
+ * cpumask_weight_ge - Check if # of bits in *srcp is greater than or equal to a given number
+ * @srcp: the cpumask to count bits (< nr_cpu_ids) in.
+ * @num: the number to check.
+ */
+static inline bool cpumask_weight_ge(const struct cpumask *srcp, int num)
+{
+	return bitmap_weight_ge(cpumask_bits(srcp), nr_cpumask_bits, num);
+}
+
+/**
+ * cpumask_weight_lt - Check if # of bits in *srcp is less than a given number
+ * @srcp: the cpumask to count bits (< nr_cpu_ids) in.
+ * @num: the number to check.
+ */
+static inline bool cpumask_weight_lt(const struct cpumask *srcp, int num)
+{
+	return bitmap_weight_lt(cpumask_bits(srcp), nr_cpumask_bits, num);
+}
+
+/**
+ * cpumask_weight_le - Check if # of bits in *srcp is less than or equal to a given number
+ * @srcp: the cpumask to count bits (< nr_cpu_ids) in.
+ * @num: the number to check.
+ */
+static inline bool cpumask_weight_le(const struct cpumask *srcp, int num)
+{
+	return bitmap_weight_le(cpumask_bits(srcp), nr_cpumask_bits, num);
+}
+
 /**
  * cpumask_shift_right - *dstp = *srcp >> n
  * @dstp: the cpumask result
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 37/54] arch/ia64: replace cpumask_weight with cpumask_weight_eq in mm/tlb.c
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
@ 2022-01-23 18:39   ` Yury Norov
  2022-01-23 18:38 ` [PATCH 02/54] net/ethernet: don't use bitmap_weight() in bcm_sysport_rule_set() Yury Norov
                     ` (53 subsequent siblings)
  54 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:39 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	linux-ia64

__flush_tlb_range() code calls cpumask_weight() to compare the
weight of cpumask with a given number. We can do it more efficiently with
cpumask_weight_eq because conditional cpumask_weight may stop traversing
the cpumask earlier, as soon as condition is met.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 arch/ia64/mm/tlb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/ia64/mm/tlb.c b/arch/ia64/mm/tlb.c
index 135b5135cace..a5bce13ab047 100644
--- a/arch/ia64/mm/tlb.c
+++ b/arch/ia64/mm/tlb.c
@@ -332,7 +332,7 @@ __flush_tlb_range (struct vm_area_struct *vma, unsigned long start,
 
 	preempt_disable();
 #ifdef CONFIG_SMP
-	if (mm != current->active_mm || cpumask_weight(mm_cpumask(mm)) != 1) {
+	if (mm != current->active_mm || !cpumask_weight_eq(mm_cpumask(mm), 1)) {
 		ia64_global_tlb_purge(mm, start, end, nbits);
 		preempt_enable();
 		return;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 37/54] arch/ia64: replace cpumask_weight with cpumask_weight_eq in mm/tlb.c
@ 2022-01-23 18:39   ` Yury Norov
  0 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:39 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	linux-ia64

__flush_tlb_range() code calls cpumask_weight() to compare the
weight of cpumask with a given number. We can do it more efficiently with
cpumask_weight_eq because conditional cpumask_weight may stop traversing
the cpumask earlier, as soon as condition is met.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 arch/ia64/mm/tlb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/ia64/mm/tlb.c b/arch/ia64/mm/tlb.c
index 135b5135cace..a5bce13ab047 100644
--- a/arch/ia64/mm/tlb.c
+++ b/arch/ia64/mm/tlb.c
@@ -332,7 +332,7 @@ __flush_tlb_range (struct vm_area_struct *vma, unsigned long start,
 
 	preempt_disable();
 #ifdef CONFIG_SMP
-	if (mm != current->active_mm || cpumask_weight(mm_cpumask(mm)) != 1) {
+	if (mm != current->active_mm || !cpumask_weight_eq(mm_cpumask(mm), 1)) {
 		ia64_global_tlb_purge(mm, start, end, nbits);
 		preempt_enable();
 		return;
-- 
2.30.2

^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 38/54] arch/mips: replace cpumask_weight with cpumask_weight_{eq, ...} where appropriate
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (36 preceding siblings ...)
  2022-01-23 18:39   ` Yury Norov
@ 2022-01-23 18:39 ` Yury Norov
  2022-01-25 19:05   ` Thomas Bogendoerfer
  2022-01-23 18:39 ` [PATCH 39/54] arch/powerpc: " Yury Norov
                   ` (16 subsequent siblings)
  54 siblings, 1 reply; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:39 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Thomas Bogendoerfer, Mark Rutland, Marc Zyngier, linux-mips

Mips code uses calls cpumask_weight() to compare the weight of
cpumask with a given number. We can do it more efficiently with
cpumask_weight_{eq, ...} because conditional cpumask_weight may stop
traversing the cpumask earlier, as soon as condition is met.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 arch/mips/cavium-octeon/octeon-irq.c | 4 ++--
 arch/mips/kernel/crash.c             | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/mips/cavium-octeon/octeon-irq.c b/arch/mips/cavium-octeon/octeon-irq.c
index 844f882096e6..914871f15fb7 100644
--- a/arch/mips/cavium-octeon/octeon-irq.c
+++ b/arch/mips/cavium-octeon/octeon-irq.c
@@ -763,7 +763,7 @@ static void octeon_irq_cpu_offline_ciu(struct irq_data *data)
 	if (!cpumask_test_cpu(cpu, mask))
 		return;
 
-	if (cpumask_weight(mask) > 1) {
+	if (cpumask_weight_gt(mask, 1)) {
 		/*
 		 * It has multi CPU affinity, just remove this CPU
 		 * from the affinity set.
@@ -795,7 +795,7 @@ static int octeon_irq_ciu_set_affinity(struct irq_data *data,
 	 * This removes the need to do locking in the .ack/.eoi
 	 * functions.
 	 */
-	if (cpumask_weight(dest) != 1)
+	if (!cpumask_weight_eq(dest, 1))
 		return -EINVAL;
 
 	if (!enable_one)
diff --git a/arch/mips/kernel/crash.c b/arch/mips/kernel/crash.c
index 81845ba04835..5b690d52491f 100644
--- a/arch/mips/kernel/crash.c
+++ b/arch/mips/kernel/crash.c
@@ -72,7 +72,7 @@ static void crash_kexec_prepare_cpus(void)
 	 */
 	pr_emerg("Sending IPI to other cpus...\n");
 	msecs = 10000;
-	while ((cpumask_weight(&cpus_in_crash) < ncpus) && (--msecs > 0)) {
+	while (cpumask_weight_lt(&cpus_in_crash, ncpus) && (--msecs > 0)) {
 		cpu_relax();
 		mdelay(1);
 	}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 39/54] arch/powerpc: replace cpumask_weight with cpumask_weight_{eq, ...} where appropriate
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (37 preceding siblings ...)
  2022-01-23 18:39 ` [PATCH 38/54] arch/mips: replace cpumask_weight with cpumask_weight_{eq, ...} where appropriate Yury Norov
@ 2022-01-23 18:39 ` Yury Norov
  2022-01-23 18:39 ` [PATCH 40/54] arch/s390: replace cpumask_weight with cpumask_weight_eq " Yury Norov
                   ` (15 subsequent siblings)
  54 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:39 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	Srikar Dronamraju, Gautham R. Shenoy, Valentin Schneider,
	Parth Shah, Cédric Le Goater, Hari Bathini, Rob Herring,
	Laurent Dufour, Petr Mladek, John Ogness, Sudeep Holla,
	Christophe Leroy, Naveen N. Rao, Xiongwei Song, Arnd Bergmann,
	linuxppc-dev

PowerPC code uses cpumask_weight() to compare the weight of cpumask
with a given number. We can do it more efficiently with
cpumask_weight_{eq, ...} because conditional cpumask_weight may stop
traversing the cpumask earlier, as soon as condition is met.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 arch/powerpc/kernel/smp.c      | 2 +-
 arch/powerpc/kernel/watchdog.c | 2 +-
 arch/powerpc/xmon/xmon.c       | 4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index b7fd6a72aa76..8bff748df402 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1656,7 +1656,7 @@ void start_secondary(void *unused)
 		if (has_big_cores)
 			sibling_mask = cpu_smallcore_mask;
 
-		if (cpumask_weight(mask) > cpumask_weight(sibling_mask(cpu)))
+		if (cpumask_weight_gt(mask, cpumask_weight(sibling_mask(cpu))))
 			shared_caches = true;
 	}
 
diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
index bfc27496fe7e..62937a077de7 100644
--- a/arch/powerpc/kernel/watchdog.c
+++ b/arch/powerpc/kernel/watchdog.c
@@ -483,7 +483,7 @@ static void start_watchdog(void *arg)
 
 	wd_smp_lock(&flags);
 	cpumask_set_cpu(cpu, &wd_cpus_enabled);
-	if (cpumask_weight(&wd_cpus_enabled) == 1) {
+	if (cpumask_weight_eq(&wd_cpus_enabled, 1)) {
 		cpumask_set_cpu(cpu, &wd_smp_cpus_pending);
 		wd_smp_last_reset_tb = get_tb();
 	}
diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index fd72753e8ad5..b423812e94e0 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -469,7 +469,7 @@ static bool wait_for_other_cpus(int ncpus)
 
 	/* We wait for 2s, which is a metric "little while" */
 	for (timeout = 20000; timeout != 0; --timeout) {
-		if (cpumask_weight(&cpus_in_xmon) >= ncpus)
+		if (cpumask_weight_ge(&cpus_in_xmon, ncpus))
 			return true;
 		udelay(100);
 		barrier();
@@ -1338,7 +1338,7 @@ static int cpu_cmd(void)
 			case 'S':
 			case 't':
 				cpumask_copy(&xmon_batch_cpus, &cpus_in_xmon);
-				if (cpumask_weight(&xmon_batch_cpus) <= 1) {
+				if (cpumask_weight_le(&xmon_batch_cpus, 1)) {
 					printf("There are no other cpus in xmon\n");
 					break;
 				}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 40/54] arch/s390: replace cpumask_weight with cpumask_weight_eq where appropriate
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (38 preceding siblings ...)
  2022-01-23 18:39 ` [PATCH 39/54] arch/powerpc: " Yury Norov
@ 2022-01-23 18:39 ` Yury Norov
  2022-01-23 18:39 ` [PATCH 41/54] arch/x86: " Yury Norov
                   ` (14 subsequent siblings)
  54 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:39 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Heiko Carstens, Vasily Gorbik, Christian Borntraeger,
	Alexander Gordeev, Sven Schnelle, Thomas Richter,
	Sumanth Korikkar, Sebastian Andrzej Siewior, Jiapeng Chong,
	kernel test robot, linux-s390

cfset_all_start() calls cpumask_weight() to compare the weight of cpumask
with a given number. We can do it more efficiently with
cpumask_weight_{eq, ...} because conditional cpumask_weight may stop
traversing the cpumask earlier, as soon as condition is met.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 arch/s390/kernel/perf_cpum_cf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/s390/kernel/perf_cpum_cf.c b/arch/s390/kernel/perf_cpum_cf.c
index ee8707abdb6a..4d217f7f5ccf 100644
--- a/arch/s390/kernel/perf_cpum_cf.c
+++ b/arch/s390/kernel/perf_cpum_cf.c
@@ -975,7 +975,7 @@ static int cfset_all_start(struct cfset_request *req)
 		return -ENOMEM;
 	cpumask_and(mask, &req->mask, cpu_online_mask);
 	on_each_cpu_mask(mask, cfset_ioctl_on, &p, 1);
-	if (atomic_read(&p.cpus_ack) != cpumask_weight(mask)) {
+	if (!cpumask_weight_eq(mask, atomic_read(&p.cpus_ack))) {
 		on_each_cpu_mask(mask, cfset_ioctl_off, &p, 1);
 		rc = -EIO;
 		debug_sprintf_event(cf_dbg, 4, "%s CPUs missing", __func__);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 41/54] arch/x86: replace cpumask_weight with cpumask_weight_eq where appropriate
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (39 preceding siblings ...)
  2022-01-23 18:39 ` [PATCH 40/54] arch/s390: replace cpumask_weight with cpumask_weight_eq " Yury Norov
@ 2022-01-23 18:39 ` Yury Norov
  2022-01-24  8:05   ` Peter Zijlstra
  2022-01-24  9:16   ` Vitaly Kuznetsov
  2022-01-23 18:39   ` Yury Norov
                   ` (13 subsequent siblings)
  54 siblings, 2 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:39 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Rafael J. Wysocki, Vitaly Kuznetsov, Tim Chen,
	Alison Schofield, Boris Ostrovsky

smpboot code in somw places calls cpumask_weight() to compare the weight
of cpumask with a given number. We can do it more efficiently with
cpumask_weight_eq() because conditional cpumask_weight may stop traversing
the cpumask earlier, as soon as condition is met.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 arch/x86/kernel/smpboot.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 617012f4619f..e851e9945eb5 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1608,7 +1608,7 @@ static void remove_siblinginfo(int cpu)
 		/*/
 		 * last thread sibling in this cpu core going down
 		 */
-		if (cpumask_weight(topology_sibling_cpumask(cpu)) == 1)
+		if (cpumask_weight_eq(topology_sibling_cpumask(cpu), 1))
 			cpu_data(sibling).booted_cores--;
 	}
 
@@ -1617,7 +1617,7 @@ static void remove_siblinginfo(int cpu)
 
 	for_each_cpu(sibling, topology_sibling_cpumask(cpu)) {
 		cpumask_clear_cpu(cpu, topology_sibling_cpumask(sibling));
-		if (cpumask_weight(topology_sibling_cpumask(sibling)) == 1)
+		if (cpumask_weight_eq(topology_sibling_cpumask(sibling), 1))
 			cpu_data(sibling).smt_active = false;
 	}
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 42/54] firmware: pcsi: replace cpumask_weight with cpumask_weight_eq
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
@ 2022-01-23 18:39   ` Yury Norov
  2022-01-23 18:38 ` [PATCH 02/54] net/ethernet: don't use bitmap_weight() in bcm_sysport_rule_set() Yury Norov
                     ` (53 subsequent siblings)
  54 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:39 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Mark Rutland, Lorenzo Pieralisi, linux-arm-kernel

down_and_up_cpus() calls cpumask_weight() to compare the weight of
cpumask with a given number. We can do it more efficiently with
cpumask_weight_{eq, ...} because conditional cpumask_weight may stop
traversing the cpumask earlier, as soon as condition is met.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/firmware/psci/psci_checker.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/firmware/psci/psci_checker.c b/drivers/firmware/psci/psci_checker.c
index 116eb465cdb4..90c9473832a9 100644
--- a/drivers/firmware/psci/psci_checker.c
+++ b/drivers/firmware/psci/psci_checker.c
@@ -90,7 +90,7 @@ static unsigned int down_and_up_cpus(const struct cpumask *cpus,
 		 * cpu_down() checks the number of online CPUs before the TOS
 		 * resident CPU.
 		 */
-		if (cpumask_weight(offlined_cpus) + 1 == nb_available_cpus) {
+		if (cpumask_weight_eq(offlined_cpus, nb_available_cpus - 1)) {
 			if (ret != -EBUSY) {
 				pr_err("Unexpected return code %d while trying "
 				       "to power down last online CPU %d\n",
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 42/54] firmware: pcsi: replace cpumask_weight with cpumask_weight_eq
@ 2022-01-23 18:39   ` Yury Norov
  0 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:39 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Mark Rutland, Lorenzo Pieralisi, linux-arm-kernel

down_and_up_cpus() calls cpumask_weight() to compare the weight of
cpumask with a given number. We can do it more efficiently with
cpumask_weight_{eq, ...} because conditional cpumask_weight may stop
traversing the cpumask earlier, as soon as condition is met.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/firmware/psci/psci_checker.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/firmware/psci/psci_checker.c b/drivers/firmware/psci/psci_checker.c
index 116eb465cdb4..90c9473832a9 100644
--- a/drivers/firmware/psci/psci_checker.c
+++ b/drivers/firmware/psci/psci_checker.c
@@ -90,7 +90,7 @@ static unsigned int down_and_up_cpus(const struct cpumask *cpus,
 		 * cpu_down() checks the number of online CPUs before the TOS
 		 * resident CPU.
 		 */
-		if (cpumask_weight(offlined_cpus) + 1 == nb_available_cpus) {
+		if (cpumask_weight_eq(offlined_cpus, nb_available_cpus - 1)) {
 			if (ret != -EBUSY) {
 				pr_err("Unexpected return code %d while trying "
 				       "to power down last online CPU %d\n",
-- 
2.30.2


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 43/54] drivers/hv: replace cpumask_weight with cpumask_weight_eq
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (41 preceding siblings ...)
  2022-01-23 18:39   ` Yury Norov
@ 2022-01-23 18:39 ` Yury Norov
  2022-01-23 22:00   ` Wei Liu
                     ` (2 more replies)
  2022-01-23 18:39 ` [PATCH 44/54] infiniband: replace cpumask_weight with cpumask_weight_{eq, ...} where appropriate Yury Norov
                   ` (11 subsequent siblings)
  54 siblings, 3 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:39 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger, Wei Liu,
	Dexuan Cui, linux-hyperv

init_vp_index() calls cpumask_weight() to compare the weights of cpumasks
We can do it more efficiently with cpumask_weight_eq because conditional
cpumask_weight may stop traversing the cpumask earlier (at least one), as
soon as condition is met.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/hv/channel_mgmt.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 60375879612f..7420a5fd47b5 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -762,8 +762,8 @@ static void init_vp_index(struct vmbus_channel *channel)
 		}
 		alloced_mask = &hv_context.hv_numa_map[numa_node];
 
-		if (cpumask_weight(alloced_mask) ==
-		    cpumask_weight(cpumask_of_node(numa_node))) {
+		if (cpumask_weight_eq(alloced_mask,
+			    cpumask_weight(cpumask_of_node(numa_node)))) {
 			/*
 			 * We have cycled through all the CPUs in the node;
 			 * reset the alloced map.
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 44/54] infiniband: replace cpumask_weight with cpumask_weight_{eq, ...} where appropriate
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (42 preceding siblings ...)
  2022-01-23 18:39 ` [PATCH 43/54] drivers/hv: " Yury Norov
@ 2022-01-23 18:39 ` Yury Norov
  2022-01-23 18:39 ` [PATCH 45/54] scsi: replace cpumask_weight with cpumask_weight_gt Yury Norov
                   ` (10 subsequent siblings)
  54 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:39 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Mike Marciniszyn, Dennis Dalessandro, Jason Gunthorpe,
	linux-rdma

Infiniband code uses cpumask_weight() to compare the weight of
cpumask with a given number. We can do it more efficiently with
cpumask_weight_{eq, ...} because conditional cpumask_weight may stop
traversing the cpumask earlier, as soon as condition is met.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/infiniband/hw/hfi1/affinity.c    | 9 ++++-----
 drivers/infiniband/hw/qib/qib_file_ops.c | 2 +-
 drivers/infiniband/hw/qib/qib_iba7322.c  | 2 +-
 3 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/affinity.c b/drivers/infiniband/hw/hfi1/affinity.c
index 38eee675369a..7c5ca5c5306a 100644
--- a/drivers/infiniband/hw/hfi1/affinity.c
+++ b/drivers/infiniband/hw/hfi1/affinity.c
@@ -507,7 +507,7 @@ static int _dev_comp_vect_cpu_mask_init(struct hfi1_devdata *dd,
 	 * available CPUs divide it by the number of devices in the
 	 * local NUMA node.
 	 */
-	if (cpumask_weight(&entry->comp_vect_mask) == 1) {
+	if (cpumask_weight_eq(&entry->comp_vect_mask, 1)) {
 		possible_cpus_comp_vect = 1;
 		dd_dev_warn(dd,
 			    "Number of kernel receive queues is too large for completion vector affinity to be effective\n");
@@ -593,7 +593,7 @@ int hfi1_dev_affinity_init(struct hfi1_devdata *dd)
 {
 	struct hfi1_affinity_node *entry;
 	const struct cpumask *local_mask;
-	int curr_cpu, possible, i, ret;
+	int curr_cpu, i, ret;
 	bool new_entry = false;
 
 	local_mask = cpumask_of_node(dd->node);
@@ -626,10 +626,9 @@ int hfi1_dev_affinity_init(struct hfi1_devdata *dd)
 			    local_mask);
 
 		/* fill in the receive list */
-		possible = cpumask_weight(&entry->def_intr.mask);
 		curr_cpu = cpumask_first(&entry->def_intr.mask);
 
-		if (possible == 1) {
+		if (cpumask_weight_eq(&entry->def_intr.mask, 1)) {
 			/* only one CPU, everyone will use it */
 			cpumask_set_cpu(curr_cpu, &entry->rcv_intr.mask);
 			cpumask_set_cpu(curr_cpu, &entry->general_intr_mask);
@@ -1017,7 +1016,7 @@ int hfi1_get_proc_affinity(int node)
 		cpu = cpumask_first(proc_mask);
 		cpumask_set_cpu(cpu, &set->used);
 		goto done;
-	} else if (current->nr_cpus_allowed < cpumask_weight(&set->mask)) {
+	} else if (cpumask_weight_gt(&set->mask, current->nr_cpus_allowed)) {
 		hfi1_cdbg(PROC, "PID %u %s affinity set to CPU set(s) %*pbl",
 			  current->pid, current->comm,
 			  cpumask_pr_args(proc_mask));
diff --git a/drivers/infiniband/hw/qib/qib_file_ops.c b/drivers/infiniband/hw/qib/qib_file_ops.c
index aa290928cf96..add89bc21b0a 100644
--- a/drivers/infiniband/hw/qib/qib_file_ops.c
+++ b/drivers/infiniband/hw/qib/qib_file_ops.c
@@ -1151,7 +1151,7 @@ static void assign_ctxt_affinity(struct file *fp, struct qib_devdata *dd)
 	 * reserve a processor for it on the local NUMA node.
 	 */
 	if ((weight >= qib_cpulist_count) &&
-		(cpumask_weight(local_mask) <= qib_cpulist_count)) {
+		(cpumask_weight_le(local_mask, qib_cpulist_count))) {
 		for_each_cpu(local_cpu, local_mask)
 			if (!test_and_set_bit(local_cpu, qib_cpulist)) {
 				fd->rec_cpu_num = local_cpu;
diff --git a/drivers/infiniband/hw/qib/qib_iba7322.c b/drivers/infiniband/hw/qib/qib_iba7322.c
index ceed302cf6a0..b17f96509d2c 100644
--- a/drivers/infiniband/hw/qib/qib_iba7322.c
+++ b/drivers/infiniband/hw/qib/qib_iba7322.c
@@ -3405,7 +3405,7 @@ static void qib_setup_7322_interrupt(struct qib_devdata *dd, int clearpend)
 	local_mask = cpumask_of_pcibus(dd->pcidev->bus);
 	firstcpu = cpumask_first(local_mask);
 	if (firstcpu >= nr_cpu_ids ||
-			cpumask_weight(local_mask) == num_online_cpus()) {
+			cpumask_weight_eq(local_mask, num_online_cpus())) {
 		local_mask = topology_core_cpumask(0);
 		firstcpu = cpumask_first(local_mask);
 	}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 45/54] scsi: replace cpumask_weight with cpumask_weight_gt
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (43 preceding siblings ...)
  2022-01-23 18:39 ` [PATCH 44/54] infiniband: replace cpumask_weight with cpumask_weight_{eq, ...} where appropriate Yury Norov
@ 2022-01-23 18:39 ` Yury Norov
  2022-01-23 18:39   ` Yury Norov
                   ` (9 subsequent siblings)
  54 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:39 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	James Smart, Dick Kennedy, James E.J. Bottomley,
	Martin K. Petersen, linux-scsi

lpfc_cpuhp_get_eq() calls cpumask_weight() to compare the weight of
cpumask with a given number. We can do it more efficiently with
cpumask_weight_gt because conditional cpumask_weight may stop
traversing the cpumask earlier, as soon as condition is met.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/scsi/lpfc/lpfc_init.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index a56f01f659f8..325e9004dacd 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -12643,7 +12643,7 @@ lpfc_cpuhp_get_eq(struct lpfc_hba *phba, unsigned int cpu,
 		 * gone offline yet, we need >1.
 		 */
 		cpumask_and(tmp, maskp, cpu_online_mask);
-		if (cpumask_weight(tmp) > 1)
+		if (cpumask_weight_gt(tmp, 1))
 			continue;
 
 		/* Now that we have an irq to shutdown, get the eq
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 46/54] soc: replace cpumask_weight with cpumask_weight_lt
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
@ 2022-01-23 18:39   ` Yury Norov
  2022-01-23 18:38 ` [PATCH 02/54] net/ethernet: don't use bitmap_weight() in bcm_sysport_rule_set() Yury Norov
                     ` (53 subsequent siblings)
  54 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:39 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Li Yang, linuxppc-dev, linux-arm-kernel

qman_test_stash() calls cpumask_weight() to compare the weight of
cpumask with a given number. We can do it more efficiently with
cpumask_weight_lt because conditional cpumask_weight may stop
traversing the cpumask earlier, as soon as condition is met.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/soc/fsl/qbman/qman_test_stash.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/soc/fsl/qbman/qman_test_stash.c b/drivers/soc/fsl/qbman/qman_test_stash.c
index b7e8e5ec884c..28b08568a349 100644
--- a/drivers/soc/fsl/qbman/qman_test_stash.c
+++ b/drivers/soc/fsl/qbman/qman_test_stash.c
@@ -561,7 +561,7 @@ int qman_test_stash(void)
 {
 	int err;
 
-	if (cpumask_weight(cpu_online_mask) < 2) {
+	if (cpumask_weight_lt(cpu_online_mask, 2)) {
 		pr_info("%s(): skip - only 1 CPU\n", __func__);
 		return 0;
 	}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 46/54] soc: replace cpumask_weight with cpumask_weight_lt
@ 2022-01-23 18:39   ` Yury Norov
  0 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:39 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Li Yang, linuxppc-dev, linux-arm-kernel

qman_test_stash() calls cpumask_weight() to compare the weight of
cpumask with a given number. We can do it more efficiently with
cpumask_weight_lt because conditional cpumask_weight may stop
traversing the cpumask earlier, as soon as condition is met.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/soc/fsl/qbman/qman_test_stash.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/soc/fsl/qbman/qman_test_stash.c b/drivers/soc/fsl/qbman/qman_test_stash.c
index b7e8e5ec884c..28b08568a349 100644
--- a/drivers/soc/fsl/qbman/qman_test_stash.c
+++ b/drivers/soc/fsl/qbman/qman_test_stash.c
@@ -561,7 +561,7 @@ int qman_test_stash(void)
 {
 	int err;
 
-	if (cpumask_weight(cpu_online_mask) < 2) {
+	if (cpumask_weight_lt(cpu_online_mask, 2)) {
 		pr_info("%s(): skip - only 1 CPU\n", __func__);
 		return 0;
 	}
-- 
2.30.2


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 47/54] sched: replace cpumask_weight with cpumask_weight_eq where appropriate
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (45 preceding siblings ...)
  2022-01-23 18:39   ` Yury Norov
@ 2022-01-23 18:39 ` Yury Norov
  2022-01-24  8:05   ` Peter Zijlstra
  2022-01-23 18:39 ` [PATCH 48/54] kernel/time: " Yury Norov
                   ` (7 subsequent siblings)
  54 siblings, 1 reply; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:39 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Ingo Molnar, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira

kernel/sched code uses cpumask_weight() to compare the weight of
cpumask with a given number. We can do it more efficiently with
cpumask_weight_eq because conditional cpumask_weight may stop
traversing the cpumask earlier, as soon as condition is met.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 kernel/sched/core.c     | 8 ++++----
 kernel/sched/topology.c | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 918d0bdc2ea8..7494f51a3608 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6006,7 +6006,7 @@ static void sched_core_cpu_starting(unsigned int cpu)
 	WARN_ON_ONCE(rq->core != rq);
 
 	/* if we're the first, we'll be our own leader */
-	if (cpumask_weight(smt_mask) == 1)
+	if (cpumask_weight_eq(smt_mask, 1))
 		goto unlock;
 
 	/* find the leader */
@@ -6047,7 +6047,7 @@ static void sched_core_cpu_deactivate(unsigned int cpu)
 	sched_core_lock(cpu, &flags);
 
 	/* if we're the last man standing, nothing to do */
-	if (cpumask_weight(smt_mask) == 1) {
+	if (cpumask_weight_eq(smt_mask, 1)) {
 		WARN_ON_ONCE(rq->core != rq);
 		goto unlock;
 	}
@@ -9045,7 +9045,7 @@ int sched_cpu_activate(unsigned int cpu)
 	/*
 	 * When going up, increment the number of cores with SMT present.
 	 */
-	if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
+	if (cpumask_weight_eq(cpu_smt_mask(cpu), 2))
 		static_branch_inc_cpuslocked(&sched_smt_present);
 #endif
 	set_cpu_active(cpu, true);
@@ -9120,7 +9120,7 @@ int sched_cpu_deactivate(unsigned int cpu)
 	/*
 	 * When going down, decrement the number of cores with SMT present.
 	 */
-	if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
+	if (cpumask_weight_eq(cpu_smt_mask(cpu), 2))
 		static_branch_dec_cpuslocked(&sched_smt_present);
 
 	sched_core_cpu_deactivate(cpu);
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 8478e2a8cd65..79395571599f 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -169,7 +169,7 @@ static const unsigned int SD_DEGENERATE_GROUPS_MASK =
 
 static int sd_degenerate(struct sched_domain *sd)
 {
-	if (cpumask_weight(sched_domain_span(sd)) == 1)
+	if (cpumask_weight_eq(sched_domain_span(sd), 1))
 		return 1;
 
 	/* Following flags need at least 2 groups */
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 48/54] kernel/time: replace cpumask_weight with cpumask_weight_eq where appropriate
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (46 preceding siblings ...)
  2022-01-23 18:39 ` [PATCH 47/54] sched: replace cpumask_weight with cpumask_weight_eq where appropriate Yury Norov
@ 2022-01-23 18:39 ` Yury Norov
  2022-01-24  8:06   ` Peter Zijlstra
  2022-01-23 18:39 ` [PATCH 49/54] lib/nodemask: add nodemask_weight_{eq,gt,ge,lt,le} Yury Norov
                   ` (6 subsequent siblings)
  54 siblings, 1 reply; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:39 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Thomas Gleixner

tick_cleanup_dead_cpu() calls cpumask_weight() to compare the weight
of cpumask with a given number. We can do it more efficiently with
cpumask_weight_eq() because conditional cpumask_weight may stop
traversing the cpumask earlier, as soon as condition is met.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 kernel/time/clockevents.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
index 003ccf338d20..32d6629a55b2 100644
--- a/kernel/time/clockevents.c
+++ b/kernel/time/clockevents.c
@@ -648,7 +648,7 @@ void tick_cleanup_dead_cpu(int cpu)
 	 */
 	list_for_each_entry_safe(dev, tmp, &clockevent_devices, list) {
 		if (cpumask_test_cpu(cpu, dev->cpumask) &&
-		    cpumask_weight(dev->cpumask) == 1 &&
+		    cpumask_weight_eq(dev->cpumask, 1) &&
 		    !tick_is_broadcast_device(dev)) {
 			BUG_ON(!clockevent_state_detached(dev));
 			list_del(&dev->list);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 49/54] lib/nodemask: add nodemask_weight_{eq,gt,ge,lt,le}
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (47 preceding siblings ...)
  2022-01-23 18:39 ` [PATCH 48/54] kernel/time: " Yury Norov
@ 2022-01-23 18:39 ` Yury Norov
  2022-01-23 18:39 ` [PATCH 50/54] acpi: replace nodes__weight with nodes_weight_ge for numa Yury Norov
                   ` (5 subsequent siblings)
  54 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:39 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel

In many cases kernel code uses nodemask_weight() to compare the result
against some number or expression:

	if (nodes_weight(...) > 1)
		do_something();

It may be significantly improved for large nodemasks: if first few words
count set bits to a number greater than given, we can stop counting and
immediately return.

The same idea would work in other direction: if we know that the number
of set bits that we counted so far is small enough, so that it would be
smaller than required number even if all bits of the rest of the nodemask
are set, we can stop counting earlier.

This patch adds nodes_weight{eq, gt, ge, lt, le} helpers based on
corresponding bitmap functions. The following patches apply new functions
where appropriate.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 include/linux/nodemask.h | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/include/linux/nodemask.h b/include/linux/nodemask.h
index 567c3ddba2c4..197598e075e9 100644
--- a/include/linux/nodemask.h
+++ b/include/linux/nodemask.h
@@ -38,6 +38,11 @@
  * int nodes_empty(mask)		Is mask empty (no bits sets)?
  * int nodes_full(mask)			Is mask full (all bits sets)?
  * int nodes_weight(mask)		Hamming weight - number of set bits
+ * bool nodes_weight_eq(src, nbits, num) Hamming Weight is equal to num
+ * bool nodes_weight_gt(src, nbits, num) Hamming Weight is greater than num
+ * bool nodes_weight_ge(src, nbits, num) Hamming Weight is greater than or equal to num
+ * bool nodes_weight_lt(src, nbits, num) Hamming Weight is less than num
+ * bool nodes_weight_le(src, nbits, num) Hamming Weight is less than or equal to num
  *
  * void nodes_shift_right(dst, src, n)	Shift right
  * void nodes_shift_left(dst, src, n)	Shift left
@@ -240,6 +245,36 @@ static inline int __nodes_weight(const nodemask_t *srcp, unsigned int nbits)
 	return bitmap_weight(srcp->bits, nbits);
 }
 
+#define nodes_weight_eq(nodemask, num) __nodes_weight_eq(&(nodemask), MAX_NUMNODES, (num))
+static inline int __nodes_weight_eq(const nodemask_t *srcp, unsigned int nbits, int num)
+{
+	return bitmap_weight_eq(srcp->bits, nbits, num);
+}
+
+#define nodes_weight_gt(nodemask, num) __nodes_weight_gt(&(nodemask), MAX_NUMNODES, (num))
+static inline int __nodes_weight_gt(const nodemask_t *srcp, unsigned int nbits, int num)
+{
+	return bitmap_weight_gt(srcp->bits, nbits, num);
+}
+
+#define nodes_weight_ge(nodemask, num) __nodes_weight_ge(&(nodemask), MAX_NUMNODES, (num))
+static inline int __nodes_weight_ge(const nodemask_t *srcp, unsigned int nbits, int num)
+{
+	return bitmap_weight_ge(srcp->bits, nbits, num);
+}
+
+#define nodes_weight_lt(nodemask, num) __nodes_weight_lt(&(nodemask), MAX_NUMNODES, (num))
+static inline int __nodes_weight_lt(const nodemask_t *srcp, unsigned int nbits, int num)
+{
+	return bitmap_weight_lt(srcp->bits, nbits, num);
+}
+
+#define nodes_weight_le(nodemask, num) __nodes_weight_le(&(nodemask), MAX_NUMNODES, (num))
+static inline int __nodes_weight_le(const nodemask_t *srcp, unsigned int nbits, int num)
+{
+	return bitmap_weight_le(srcp->bits, nbits, num);
+}
+
 #define nodes_shift_right(dst, src, n) \
 			__nodes_shift_right(&(dst), &(src), (n), MAX_NUMNODES)
 static inline void __nodes_shift_right(nodemask_t *dstp,
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 50/54] acpi: replace nodes__weight with nodes_weight_ge for numa
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (48 preceding siblings ...)
  2022-01-23 18:39 ` [PATCH 49/54] lib/nodemask: add nodemask_weight_{eq,gt,ge,lt,le} Yury Norov
@ 2022-01-23 18:39 ` Yury Norov
  2022-01-23 18:39 ` [PATCH 51/54] mm: replace nodes_weight with nodes_weight_eq in mempolicy Yury Norov
                   ` (4 subsequent siblings)
  54 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:39 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Rafael J. Wysocki, Len Brown, Dan Williams, Huacai Chen,
	Vitaly Kuznetsov, Alison Schofield, linux-acpi

acpi_map_pxm_to_node() calls nodes_weight() to compare the weight
of nodemask with a given number. We can do it more efficiently with
nodes_weight_eq() because conditional nodes_weight may stop
traversing the nodemask earlier, as soon as condition is met.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/acpi/numa/srat.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/acpi/numa/srat.c b/drivers/acpi/numa/srat.c
index 3b818ab186be..fe7a7996f553 100644
--- a/drivers/acpi/numa/srat.c
+++ b/drivers/acpi/numa/srat.c
@@ -67,7 +67,7 @@ int acpi_map_pxm_to_node(int pxm)
 	node = pxm_to_node_map[pxm];
 
 	if (node == NUMA_NO_NODE) {
-		if (nodes_weight(nodes_found_map) >= MAX_NUMNODES)
+		if (nodes_weight_ge(nodes_found_map, MAX_NUMNODES))
 			return NUMA_NO_NODE;
 		node = first_unset_node(nodes_found_map);
 		__acpi_map_pxm_to_node(pxm, node);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 51/54] mm: replace nodes_weight with nodes_weight_eq in mempolicy
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (49 preceding siblings ...)
  2022-01-23 18:39 ` [PATCH 50/54] acpi: replace nodes__weight with nodes_weight_ge for numa Yury Norov
@ 2022-01-23 18:39 ` Yury Norov
  2022-01-23 18:39 ` [PATCH 52/54] lib/nodemask: add num_node_state_eq() Yury Norov
                   ` (3 subsequent siblings)
  54 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:39 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	linux-mm

do_migrate_pages() calls nodes_weight() to compare the weight
of nodemask with a given number. We can do it more efficiently with
nodes_weight_eq() because conditional nodes_weight() may stop
traversing the nodemask earlier, as soon as condition is met.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 mm/mempolicy.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index a86590b2507d..27817cf2f2a0 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1157,7 +1157,7 @@ int do_migrate_pages(struct mm_struct *mm, const nodemask_t *from,
 			 *          [0-7] - > [3,4,5] moves only 0,1,2,6,7.
 			 */
 
-			if ((nodes_weight(*from) != nodes_weight(*to)) &&
+			if (!nodes_weight_eq(*from, nodes_weight(*to)) &&
 						(node_isset(s, *to)))
 				continue;
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 52/54] lib/nodemask: add num_node_state_eq()
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (50 preceding siblings ...)
  2022-01-23 18:39 ` [PATCH 51/54] mm: replace nodes_weight with nodes_weight_eq in mempolicy Yury Norov
@ 2022-01-23 18:39 ` Yury Norov
  2022-01-23 18:39 ` [PATCH 53/54] tools/bitmap: sync bitmap_weight Yury Norov
                   ` (2 subsequent siblings)
  54 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:39 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	linux-mm

Kernel code uses num_node_state() to compare number of nodes with a given
number. The underlying code calls bitmap_weight(), and we can do it more
efficiently with num_node_state_eq because conditional nodes_weight may
stop traversing the nodemask earlier, as soon as condition is met.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 include/linux/nodemask.h | 5 +++++
 mm/page_alloc.c          | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/linux/nodemask.h b/include/linux/nodemask.h
index 197598e075e9..c5014dbf3cce 100644
--- a/include/linux/nodemask.h
+++ b/include/linux/nodemask.h
@@ -466,6 +466,11 @@ static inline int num_node_state(enum node_states state)
 	return nodes_weight(node_states[state]);
 }
 
+static inline int num_node_state_eq(enum node_states state, int num)
+{
+	return nodes_weight_eq(node_states[state], num);
+}
+
 #define for_each_node_state(__node, __state) \
 	for_each_node_mask((__node), node_states[__state])
 
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 8dd6399bafb5..37496d764643 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8328,7 +8328,7 @@ void __init page_alloc_init(void)
 	int ret;
 
 #ifdef CONFIG_NUMA
-	if (num_node_state(N_MEMORY) == 1)
+	if (num_node_state_eq(N_MEMORY, 1))
 		hashdist = 0;
 #endif
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 53/54] tools/bitmap: sync bitmap_weight
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (51 preceding siblings ...)
  2022-01-23 18:39 ` [PATCH 52/54] lib/nodemask: add num_node_state_eq() Yury Norov
@ 2022-01-23 18:39 ` Yury Norov
  2022-01-23 18:39 ` [PATCH 54/54] MAINTAINERS: add cpumask and nodemask files to BITMAP_API Yury Norov
  2022-01-26  7:30 ` [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Vaittinen, Matti
  54 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:39 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Jin Yao, John Garry,
	Ian Rogers, Kan Liang, linux-perf-users

Pull bitmap_weight_{cmp,eq,gt,ge,lt,le} from mother kernel and
use where applicable.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 tools/include/linux/bitmap.h | 44 ++++++++++++++++++++++++++++++++++++
 tools/lib/bitmap.c           | 20 ++++++++++++++++
 tools/perf/util/pmu.c        |  2 +-
 3 files changed, 65 insertions(+), 1 deletion(-)

diff --git a/tools/include/linux/bitmap.h b/tools/include/linux/bitmap.h
index ea97804d04d4..e8ae9a85d555 100644
--- a/tools/include/linux/bitmap.h
+++ b/tools/include/linux/bitmap.h
@@ -12,6 +12,8 @@
 	unsigned long name[BITS_TO_LONGS(bits)]
 
 int __bitmap_weight(const unsigned long *bitmap, int bits);
+int __bitmap_weight_cmp(const unsigned long *bitmap, unsigned int bits,
+			 unsigned int num);
 void __bitmap_or(unsigned long *dst, const unsigned long *bitmap1,
 		 const unsigned long *bitmap2, int bits);
 int __bitmap_and(unsigned long *dst, const unsigned long *bitmap1,
@@ -68,6 +70,48 @@ static inline int bitmap_weight(const unsigned long *src, unsigned int nbits)
 	return __bitmap_weight(src, nbits);
 }
 
+static __always_inline
+int bitmap_weight_cmp(const unsigned long *src, unsigned int nbits, int num)
+{
+	if (num > (int)nbits || num < 0)
+		return -num;
+
+	if (small_const_nbits(nbits))
+		return hweight_long(*src & BITMAP_LAST_WORD_MASK(nbits)) - num;
+
+	return __bitmap_weight_cmp(src, nbits, num);
+}
+
+static __always_inline
+bool bitmap_weight_eq(const unsigned long *src, unsigned int nbits, int num)
+{
+	return bitmap_weight_cmp(src, nbits, num) == 0;
+}
+
+static __always_inline
+bool bitmap_weight_gt(const unsigned long *src, unsigned int nbits, int num)
+{
+	return bitmap_weight_cmp(src, nbits, num) > 0;
+}
+
+static __always_inline
+bool bitmap_weight_ge(const unsigned long *src, unsigned int nbits, int num)
+{
+	return bitmap_weight_cmp(src, nbits, num - 1) > 0;
+}
+
+static __always_inline
+bool bitmap_weight_lt(const unsigned long *src, unsigned int nbits, int num)
+{
+	return bitmap_weight_cmp(src, nbits, num - 1) <= 0;
+}
+
+static __always_inline
+bool bitmap_weight_le(const unsigned long *src, unsigned int nbits, int num)
+{
+	return bitmap_weight_cmp(src, nbits, num) <= 0;
+}
+
 static inline void bitmap_or(unsigned long *dst, const unsigned long *src1,
 			     const unsigned long *src2, unsigned int nbits)
 {
diff --git a/tools/lib/bitmap.c b/tools/lib/bitmap.c
index db466ef7be9d..06e58fee8523 100644
--- a/tools/lib/bitmap.c
+++ b/tools/lib/bitmap.c
@@ -18,6 +18,26 @@ int __bitmap_weight(const unsigned long *bitmap, int bits)
 	return w;
 }
 
+int __bitmap_weight_cmp(const unsigned long *bitmap, unsigned int bits, int num)
+{
+	unsigned int k, w, lim = bits / BITS_PER_LONG;
+
+	for (k = 0, w = 0; k < lim; k++) {
+		if (w + bits - k * BITS_PER_LONG < num)
+			goto out;
+
+		w += hweight_long(bitmap[k]);
+
+		if (w > num)
+			goto out;
+	}
+
+	if (bits % BITS_PER_LONG)
+		w += hweight_long(bitmap[k] & BITMAP_LAST_WORD_MASK(bits));
+out:
+	return w - num;
+}
+
 void __bitmap_or(unsigned long *dst, const unsigned long *bitmap1,
 		 const unsigned long *bitmap2, int bits)
 {
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 8dfbba15aeb8..2c26cdd7f9b0 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -1314,7 +1314,7 @@ static int pmu_config_term(const char *pmu_name,
 	 */
 	if (term->type_val == PARSE_EVENTS__TERM_TYPE_NUM) {
 		if (term->no_value &&
-		    bitmap_weight(format->bits, PERF_PMU_FORMAT_BITS) > 1) {
+		    bitmap_weight_gt(format->bits, PERF_PMU_FORMAT_BITS, 1)) {
 			if (err) {
 				parse_events_error__handle(err, term->err_val,
 					   strdup("no value assigned for term"),
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* [PATCH 54/54] MAINTAINERS: add cpumask and nodemask files to BITMAP_API
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (52 preceding siblings ...)
  2022-01-23 18:39 ` [PATCH 53/54] tools/bitmap: sync bitmap_weight Yury Norov
@ 2022-01-23 18:39 ` Yury Norov
  2022-01-26  7:30 ` [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Vaittinen, Matti
  54 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-23 18:39 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel

cpumask and nodemask APIs are thin wrappers around basic bitmap API, and
corresponding files are not formally maintained. This patch adds them to
BITMAP_API section, so that bitmap folks would have closer look at it.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 MAINTAINERS | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 27730a5a6345..7a3798de61c9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3412,10 +3412,14 @@ R:	Andy Shevchenko <andriy.shevchenko@linux.intel.com>
 R:	Rasmus Villemoes <linux@rasmusvillemoes.dk>
 S:	Maintained
 F:	include/linux/bitmap.h
+F:	include/linux/cpumask.h
 F:	include/linux/find.h
+F:	include/linux/nodemask.h
 F:	lib/bitmap.c
+F:	lib/cpumask.c
 F:	lib/find_bit.c
 F:	lib/find_bit_benchmark.c
+F:	lib/nodemask.c
 F:	lib/test_bitmap.c
 F:	tools/include/linux/bitmap.h
 F:	tools/include/linux/find.h
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 117+ messages in thread

* Re: [PATCH 18/54] drivers/infiniband: replace cpumask_weight with cpumask_empty where appropriate
  2022-01-23 18:38 ` [PATCH 18/54] drivers/infiniband: " Yury Norov
@ 2022-01-23 19:13   ` Leon Romanovsky
  0 siblings, 0 replies; 117+ messages in thread
From: Leon Romanovsky @ 2022-01-23 19:13 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Mike Marciniszyn, Dennis Dalessandro, Jason Gunthorpe,
	linux-rdma

On Sun, Jan 23, 2022 at 10:38:49AM -0800, Yury Norov wrote:
> drivers/infiniband/hw/hfi1/affinity.c code calls cpumask_weight() to check
> if any bit of a given cpumask is set. We can do it more efficiently with
> cpumask_empty() because cpumask_empty() stops traversing the cpumask as
> soon as it finds first set bit, while cpumask_weight() counts all bits
> unconditionally.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>  drivers/infiniband/hw/hfi1/affinity.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Except that title needs to be: "RDMA/hfi: ....", the change looks ok.

Thanks,
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 43/54] drivers/hv: replace cpumask_weight with cpumask_weight_eq
  2022-01-23 18:39 ` [PATCH 43/54] drivers/hv: " Yury Norov
@ 2022-01-23 22:00   ` Wei Liu
  2022-01-23 22:02   ` Haiyang Zhang
  2022-01-24  9:20   ` Vitaly Kuznetsov
  2 siblings, 0 replies; 117+ messages in thread
From: Wei Liu @ 2022-01-23 22:00 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger, Wei Liu,
	Dexuan Cui, linux-hyperv

On Sun, Jan 23, 2022 at 10:39:14AM -0800, Yury Norov wrote:
> init_vp_index() calls cpumask_weight() to compare the weights of cpumasks
> We can do it more efficiently with cpumask_weight_eq because conditional
> cpumask_weight may stop traversing the cpumask earlier (at least one), as
> soon as condition is met.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>

Acked-by: Wei Liu <wei.liu@kernel.org>

^ permalink raw reply	[flat|nested] 117+ messages in thread

* RE: [PATCH 43/54] drivers/hv: replace cpumask_weight with cpumask_weight_eq
  2022-01-23 18:39 ` [PATCH 43/54] drivers/hv: " Yury Norov
  2022-01-23 22:00   ` Wei Liu
@ 2022-01-23 22:02   ` Haiyang Zhang
  2022-01-24  9:20   ` Vitaly Kuznetsov
  2 siblings, 0 replies; 117+ messages in thread
From: Haiyang Zhang @ 2022-01-23 22:02 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	KY Srinivasan, Stephen Hemminger, Wei Liu, Dexuan Cui,
	linux-hyperv



> -----Original Message-----
> From: Yury Norov <yury.norov@gmail.com>
> Sent: Sunday, January 23, 2022 1:39 PM
> To: Yury Norov <yury.norov@gmail.com>; Andy Shevchenko <andriy.shevchenko@linux.intel.com>;
> Rasmus Villemoes <linux@rasmusvillemoes.dk>; Andrew Morton <akpm@linux-foundation.org>;
> Michał Mirosław <mirq-linux@rere.qmqm.pl>; Greg Kroah-Hartman <gregkh@linuxfoundation.org>;
> Peter Zijlstra <peterz@infradead.org>; David Laight <David.Laight@aculab.com>; Joe Perches
> <joe@perches.com>; Dennis Zhou <dennis@kernel.org>; Emil Renner Berthing <kernel@esmil.dk>;
> Nicholas Piggin <npiggin@gmail.com>; Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com>;
> Alexey Klimov <aklimov@redhat.com>; linux-kernel@vger.kernel.org; KY Srinivasan
> <kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>; Stephen Hemminger
> <sthemmin@microsoft.com>; Wei Liu <wei.liu@kernel.org>; Dexuan Cui <decui@microsoft.com>;
> linux-hyperv@vger.kernel.org
> Subject: [PATCH 43/54] drivers/hv: replace cpumask_weight with cpumask_weight_eq
> 
> init_vp_index() calls cpumask_weight() to compare the weights of cpumasks
> We can do it more efficiently with cpumask_weight_eq because conditional
> cpumask_weight may stop traversing the cpumask earlier (at least one), as
> soon as condition is met.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>  drivers/hv/channel_mgmt.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
> index 60375879612f..7420a5fd47b5 100644
> --- a/drivers/hv/channel_mgmt.c
> +++ b/drivers/hv/channel_mgmt.c
> @@ -762,8 +762,8 @@ static void init_vp_index(struct vmbus_channel *channel)
>  		}
>  		alloced_mask = &hv_context.hv_numa_map[numa_node];
> 
> -		if (cpumask_weight(alloced_mask) ==
> -		    cpumask_weight(cpumask_of_node(numa_node))) {
> +		if (cpumask_weight_eq(alloced_mask,
> +			    cpumask_weight(cpumask_of_node(numa_node)))) {
>  			/*
>  			 * We have cycled through all the CPUs in the node;
>  			 * reset the alloced map.

Thanks.

Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>


^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 22/54] rcu: replace cpumask_weight with cpumask_empty where appropriate
  2022-01-23 18:38 ` [PATCH 22/54] rcu: replace cpumask_weight with cpumask_empty where appropriate Yury Norov
@ 2022-01-24  0:07   ` Paul E. McKenney
  0 siblings, 0 replies; 117+ messages in thread
From: Paul E. McKenney @ 2022-01-24  0:07 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	Joel Fernandes, rcu

On Sun, Jan 23, 2022 at 10:38:53AM -0800, Yury Norov wrote:
> In some places, RCU code calls cpumask_weight() to check if any bit of a
> given cpumask is set. We can do it more efficiently with cpumask_empty()
> because cpumask_empty() stops traversing the cpumask as soon as it finds
> first set bit, while cpumask_weight() counts all bits unconditionally.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>

Good point!  Queued and pushed, thank you!

							Thanx, Paul

> ---
>  kernel/rcu/tree_nocb.h   | 4 ++--
>  kernel/rcu/tree_plugin.h | 2 +-
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> index eeafb546a7a0..f83c7b1d6110 100644
> --- a/kernel/rcu/tree_nocb.h
> +++ b/kernel/rcu/tree_nocb.h
> @@ -1169,7 +1169,7 @@ void __init rcu_init_nohz(void)
>  	struct rcu_data *rdp;
>  
>  #if defined(CONFIG_NO_HZ_FULL)
> -	if (tick_nohz_full_running && cpumask_weight(tick_nohz_full_mask))
> +	if (tick_nohz_full_running && !cpumask_empty(tick_nohz_full_mask))
>  		need_rcu_nocb_mask = true;
>  #endif /* #if defined(CONFIG_NO_HZ_FULL) */
>  
> @@ -1348,7 +1348,7 @@ static void __init rcu_organize_nocb_kthreads(void)
>   */
>  void rcu_bind_current_to_nocb(void)
>  {
> -	if (cpumask_available(rcu_nocb_mask) && cpumask_weight(rcu_nocb_mask))
> +	if (cpumask_available(rcu_nocb_mask) && !cpumask_empty(rcu_nocb_mask))
>  		WARN_ON(sched_setaffinity(current->pid, rcu_nocb_mask));
>  }
>  EXPORT_SYMBOL_GPL(rcu_bind_current_to_nocb);
> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> index c5b45c2f68a1..0dc0c8d6717c 100644
> --- a/kernel/rcu/tree_plugin.h
> +++ b/kernel/rcu/tree_plugin.h
> @@ -1215,7 +1215,7 @@ static void rcu_boost_kthread_setaffinity(struct rcu_node *rnp, int outgoingcpu)
>  		    cpu != outgoingcpu)
>  			cpumask_set_cpu(cpu, cm);
>  	cpumask_and(cm, cm, housekeeping_cpumask(HK_FLAG_RCU));
> -	if (cpumask_weight(cm) == 0)
> +	if (cpumask_empty(cm))
>  		cpumask_copy(cm, housekeeping_cpumask(HK_FLAG_RCU));
>  	set_cpus_allowed_ptr(t, cm);
>  	free_cpumask_var(cm);
> -- 
> 2.30.2
> 

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 01/54] net/dsa: don't use bitmap_weight() in b53_arl_read()
  2022-01-23 18:38 ` [PATCH 01/54] net/dsa: don't use bitmap_weight() in b53_arl_read() Yury Norov
@ 2022-01-24  3:11   ` Florian Fainelli
  0 siblings, 0 replies; 117+ messages in thread
From: Florian Fainelli @ 2022-01-24  3:11 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Andrew Lunn, Vivien Didelot, Vladimir Oltean, David S. Miller,
	Jakub Kicinski, netdev



On 1/23/2022 10:38 AM, Yury Norov wrote:
> Don't call bitmap_weight() if the following code can get by
> without it.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>

Acked-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 02/54] net/ethernet: don't use bitmap_weight() in bcm_sysport_rule_set()
  2022-01-23 18:38 ` [PATCH 02/54] net/ethernet: don't use bitmap_weight() in bcm_sysport_rule_set() Yury Norov
@ 2022-01-24  3:11   ` Florian Fainelli
  0 siblings, 0 replies; 117+ messages in thread
From: Florian Fainelli @ 2022-01-24  3:11 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	David S . Miller, Jakub Kicinski, bcm-kernel-feedback-list,
	netdev



On 1/23/2022 10:38 AM, Yury Norov wrote:
> Don't call bitmap_weight() if the following code can get by
> without it.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>

Acked-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 19/54] drivers/irqchip: replace cpumask_weight with cpumask_empty where appropriate
  2022-01-23 18:38 ` [PATCH 19/54] drivers/irqchip: " Yury Norov
@ 2022-01-24  3:11   ` Florian Fainelli
  0 siblings, 0 replies; 117+ messages in thread
From: Florian Fainelli @ 2022-01-24  3:11 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Thomas Gleixner, Marc Zyngier, bcm-kernel-feedback-list,
	linux-mips



On 1/23/2022 10:38 AM, Yury Norov wrote:
> bcm6345_l1_of_init() calls cpumask_weight() to check if any bit of a given
> cpumask is set. We can do it more efficiently with cpumask_empty() because
> cpumask_empty() stops traversing the cpumask as soon as it finds first set
> bit, while cpumask_weight() counts all bits unconditionally.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>

Acked-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 41/54] arch/x86: replace cpumask_weight with cpumask_weight_eq where appropriate
  2022-01-23 18:39 ` [PATCH 41/54] arch/x86: " Yury Norov
@ 2022-01-24  8:05   ` Peter Zijlstra
  2022-01-24  9:16   ` Vitaly Kuznetsov
  1 sibling, 0 replies; 117+ messages in thread
From: Peter Zijlstra @ 2022-01-24  8:05 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, David Laight,
	Joe Perches, Dennis Zhou, Emil Renner Berthing, Nicholas Piggin,
	Matti Vaittinen, Alexey Klimov, linux-kernel, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	Rafael J. Wysocki, Vitaly Kuznetsov, Tim Chen, Alison Schofield,
	Boris Ostrovsky

On Sun, Jan 23, 2022 at 10:39:12AM -0800, Yury Norov wrote:
> smpboot code in somw places calls cpumask_weight() to compare the weight
> of cpumask with a given number. We can do it more efficiently with
> cpumask_weight_eq() because conditional cpumask_weight may stop traversing
> the cpumask earlier, as soon as condition is met.

Why use a more complicated API for code that has no performance
requirements?

From where I'm sitting this is a net negative for making the code harder
to read.

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 47/54] sched: replace cpumask_weight with cpumask_weight_eq where appropriate
  2022-01-23 18:39 ` [PATCH 47/54] sched: replace cpumask_weight with cpumask_weight_eq where appropriate Yury Norov
@ 2022-01-24  8:05   ` Peter Zijlstra
  0 siblings, 0 replies; 117+ messages in thread
From: Peter Zijlstra @ 2022-01-24  8:05 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, David Laight,
	Joe Perches, Dennis Zhou, Emil Renner Berthing, Nicholas Piggin,
	Matti Vaittinen, Alexey Klimov, linux-kernel, Ingo Molnar,
	Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
	Ben Segall, Mel Gorman, Daniel Bristot de Oliveira

On Sun, Jan 23, 2022 at 10:39:18AM -0800, Yury Norov wrote:
> kernel/sched code uses cpumask_weight() to compare the weight of
> cpumask with a given number. We can do it more efficiently with
> cpumask_weight_eq because conditional cpumask_weight may stop
> traversing the cpumask earlier, as soon as condition is met.
> 

Same as with the other patch, you're just making the code more difficult
to read for no reason.

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 48/54] kernel/time: replace cpumask_weight with cpumask_weight_eq where appropriate
  2022-01-23 18:39 ` [PATCH 48/54] kernel/time: " Yury Norov
@ 2022-01-24  8:06   ` Peter Zijlstra
  0 siblings, 0 replies; 117+ messages in thread
From: Peter Zijlstra @ 2022-01-24  8:06 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, David Laight,
	Joe Perches, Dennis Zhou, Emil Renner Berthing, Nicholas Piggin,
	Matti Vaittinen, Alexey Klimov, linux-kernel, Thomas Gleixner

On Sun, Jan 23, 2022 at 10:39:19AM -0800, Yury Norov wrote:
> tick_cleanup_dead_cpu() calls cpumask_weight() to compare the weight
> of cpumask with a given number. We can do it more efficiently with
> cpumask_weight_eq() because conditional cpumask_weight may stop
> traversing the cpumask earlier, as soon as condition is met.

But again, nobody gives a crap about performance here..

> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>  kernel/time/clockevents.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
> index 003ccf338d20..32d6629a55b2 100644
> --- a/kernel/time/clockevents.c
> +++ b/kernel/time/clockevents.c
> @@ -648,7 +648,7 @@ void tick_cleanup_dead_cpu(int cpu)
>  	 */
>  	list_for_each_entry_safe(dev, tmp, &clockevent_devices, list) {
>  		if (cpumask_test_cpu(cpu, dev->cpumask) &&
> -		    cpumask_weight(dev->cpumask) == 1 &&
> +		    cpumask_weight_eq(dev->cpumask, 1) &&
>  		    !tick_is_broadcast_device(dev)) {
>  			BUG_ON(!clockevent_state_detached(dev));
>  			list_del(&dev->list);
> -- 
> 2.30.2
> 

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 06/54] x86/kvm: replace bitmap_weight with bitmap_empty where appropriate
  2022-01-23 18:38 ` [PATCH 06/54] x86/kvm: " Yury Norov
@ 2022-01-24  9:11   ` Vitaly Kuznetsov
  0 siblings, 0 replies; 117+ messages in thread
From: Vitaly Kuznetsov @ 2022-01-24  9:11 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Paolo Bonzini, Sean Christopherson, Wanpeng Li, Jim Mattson,
	Joerg Roedel, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, x86, kvm

Yury Norov <yury.norov@gmail.com> writes:

> In some places kvm/hyperv.c code calls bitmap_weight() to check if any bit
> of a given bitmap is set. It's better to use bitmap_empty() in that case
> because bitmap_empty() stops traversing the bitmap as soon as it finds
> first set bit, while bitmap_weight() counts all bits unconditionally.
>
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>  arch/x86/kvm/hyperv.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index 6e38a7d22e97..2c3400dea4b3 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -90,7 +90,7 @@ static void synic_update_vector(struct kvm_vcpu_hv_synic *synic,
>  {
>  	struct kvm_vcpu *vcpu = hv_synic_to_vcpu(synic);
>  	struct kvm_hv *hv = to_kvm_hv(vcpu->kvm);
> -	int auto_eoi_old, auto_eoi_new;
> +	bool auto_eoi_old, auto_eoi_new;
>  
>  	if (vector < HV_SYNIC_FIRST_VALID_VECTOR)
>  		return;
> @@ -100,16 +100,16 @@ static void synic_update_vector(struct kvm_vcpu_hv_synic *synic,
>  	else
>  		__clear_bit(vector, synic->vec_bitmap);
>  
> -	auto_eoi_old = bitmap_weight(synic->auto_eoi_bitmap, 256);
> +	auto_eoi_old = bitmap_empty(synic->auto_eoi_bitmap, 256);

I would've preferred this written as 

	auto_eoi_old = !bitmap_empty(synic->auto_eoi_bitmap, 256);

so the variable would indicate wether AutoEOI was previosly enabled, not
disabled.

>  
>  	if (synic_has_vector_auto_eoi(synic, vector))
>  		__set_bit(vector, synic->auto_eoi_bitmap);
>  	else
>  		__clear_bit(vector, synic->auto_eoi_bitmap);
>  
> -	auto_eoi_new = bitmap_weight(synic->auto_eoi_bitmap, 256);
> +	auto_eoi_new = bitmap_empty(synic->auto_eoi_bitmap, 256);

Same here, of course. "auto_eoi_new = true" sounds like "AutoEOI is now
enabled".

>  
> -	if (!!auto_eoi_old == !!auto_eoi_new)
> +	if (auto_eoi_old == auto_eoi_new)
>  		return;
>  
>  	down_write(&vcpu->kvm->arch.apicv_update_lock);

The change look good to me otherwise, feel free to add

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>

-- 
Vitaly


^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 41/54] arch/x86: replace cpumask_weight with cpumask_weight_eq where appropriate
  2022-01-23 18:39 ` [PATCH 41/54] arch/x86: " Yury Norov
  2022-01-24  8:05   ` Peter Zijlstra
@ 2022-01-24  9:16   ` Vitaly Kuznetsov
  1 sibling, 0 replies; 117+ messages in thread
From: Vitaly Kuznetsov @ 2022-01-24  9:16 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Rafael J. Wysocki, Tim Chen, Alison Schofield,
	Boris Ostrovsky

Yury Norov <yury.norov@gmail.com> writes:

> smpboot code in somw places calls cpumask_weight() to compare the weight
> of cpumask with a given number. We can do it more efficiently with
> cpumask_weight_eq() because conditional cpumask_weight may stop traversing
> the cpumask earlier, as soon as condition is met.

I think this is misleading. cpumask_weight_eq() with any implementation
can only stop earlier if the condition is NOT met (when the number of
set bits is already higher than needed), to check for equality all bits
always need to be examined.

>
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>  arch/x86/kernel/smpboot.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index 617012f4619f..e851e9945eb5 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -1608,7 +1608,7 @@ static void remove_siblinginfo(int cpu)
>  		/*/
>  		 * last thread sibling in this cpu core going down
>  		 */
> -		if (cpumask_weight(topology_sibling_cpumask(cpu)) == 1)
> +		if (cpumask_weight_eq(topology_sibling_cpumask(cpu), 1))
>  			cpu_data(sibling).booted_cores--;
>  	}
>  
> @@ -1617,7 +1617,7 @@ static void remove_siblinginfo(int cpu)
>  
>  	for_each_cpu(sibling, topology_sibling_cpumask(cpu)) {
>  		cpumask_clear_cpu(cpu, topology_sibling_cpumask(sibling));
> -		if (cpumask_weight(topology_sibling_cpumask(sibling)) == 1)
> +		if (cpumask_weight_eq(topology_sibling_cpumask(sibling), 1))
>  			cpu_data(sibling).smt_active = false;
>  	}

-- 
Vitaly


^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 43/54] drivers/hv: replace cpumask_weight with cpumask_weight_eq
  2022-01-23 18:39 ` [PATCH 43/54] drivers/hv: " Yury Norov
  2022-01-23 22:00   ` Wei Liu
  2022-01-23 22:02   ` Haiyang Zhang
@ 2022-01-24  9:20   ` Vitaly Kuznetsov
  2022-01-27 15:02     ` Michael Kelley (LINUX)
  2 siblings, 1 reply; 117+ messages in thread
From: Vitaly Kuznetsov @ 2022-01-24  9:20 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger, Wei Liu,
	Dexuan Cui, linux-hyperv

Yury Norov <yury.norov@gmail.com> writes:

> init_vp_index() calls cpumask_weight() to compare the weights of cpumasks
> We can do it more efficiently with cpumask_weight_eq because conditional
> cpumask_weight may stop traversing the cpumask earlier (at least one), as
> soon as condition is met.

Same comment as for "PATCH 41/54": cpumask_weight_eq() can only stop
earlier if the condition is not met, to prove the equality all bits need
always have to be examined.

>
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>  drivers/hv/channel_mgmt.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
> index 60375879612f..7420a5fd47b5 100644
> --- a/drivers/hv/channel_mgmt.c
> +++ b/drivers/hv/channel_mgmt.c
> @@ -762,8 +762,8 @@ static void init_vp_index(struct vmbus_channel *channel)
>  		}
>  		alloced_mask = &hv_context.hv_numa_map[numa_node];
>  
> -		if (cpumask_weight(alloced_mask) ==
> -		    cpumask_weight(cpumask_of_node(numa_node))) {
> +		if (cpumask_weight_eq(alloced_mask,
> +			    cpumask_weight(cpumask_of_node(numa_node)))) {

This code is not performace critical and I prefer the old version:

 	cpumask_weight() == cpumask_weight()

 looks better than

	cpumask_weight_eq(..., cpumask_weight())

(let alone the inner cpumask_of_node()) to me.

>  			/*
>  			 * We have cycled through all the CPUs in the node;
>  			 * reset the alloced map.

-- 
Vitaly


^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 16/54] cpufreq: replace cpumask_weight with cpumask_empty where appropriate
  2022-01-23 18:38   ` Yury Norov
@ 2022-01-24 10:41     ` Sudeep Holla
  -1 siblings, 0 replies; 117+ messages in thread
From: Sudeep Holla @ 2022-01-24 10:41 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Sudeep Holla,
	Peter Zijlstra, David Laight, Joe Perches, Dennis Zhou,
	Emil Renner Berthing, Nicholas Piggin, Matti Vaittinen,
	Alexey Klimov, linux-kernel, Andy Gross, Bjorn Andersson,
	Rafael J. Wysocki, Viresh Kumar, Cristian Marussi, linux-arm-msm,
	linux-pm, linux-arm-kernel

On Sun, Jan 23, 2022 at 10:38:47AM -0800, Yury Norov wrote:
> drivers/cpufreq calls cpumask_weight() to check if any bit of a given
> cpumask is set. We can do it more efficiently with cpumask_empty() because
> cpumask_empty() stops traversing the cpumask as soon as it finds first set
> bit, while cpumask_weight() counts all bits unconditionally.
>

Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> (for SCMI cpufreq driver)

-- 
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 16/54] cpufreq: replace cpumask_weight with cpumask_empty where appropriate
@ 2022-01-24 10:41     ` Sudeep Holla
  0 siblings, 0 replies; 117+ messages in thread
From: Sudeep Holla @ 2022-01-24 10:41 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Sudeep Holla,
	Peter Zijlstra, David Laight, Joe Perches, Dennis Zhou,
	Emil Renner Berthing, Nicholas Piggin, Matti Vaittinen,
	Alexey Klimov, linux-kernel, Andy Gross, Bjorn Andersson,
	Rafael J. Wysocki, Viresh Kumar, Cristian Marussi, linux-arm-msm,
	linux-pm, linux-arm-kernel

On Sun, Jan 23, 2022 at 10:38:47AM -0800, Yury Norov wrote:
> drivers/cpufreq calls cpumask_weight() to check if any bit of a given
> cpumask is set. We can do it more efficiently with cpumask_empty() because
> cpumask_empty() stops traversing the cpumask as soon as it finds first set
> bit, while cpumask_weight() counts all bits unconditionally.
>

Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> (for SCMI cpufreq driver)

-- 
Regards,
Sudeep

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 10/54] net: ethernet: replace bitmap_weight with bitmap_empty for qlogic
  2022-01-23 18:38 ` [PATCH 10/54] net: ethernet: replace bitmap_weight with bitmap_empty for qlogic Yury Norov
@ 2022-01-24 12:28   ` Andy Shevchenko
  2022-01-25 21:09     ` Yury Norov
  0 siblings, 1 reply; 117+ messages in thread
From: Andy Shevchenko @ 2022-01-24 12:28 UTC (permalink / raw)
  To: Yury Norov
  Cc: Rasmus Villemoes, Andrew Morton, Michał Mirosław,
	Greg Kroah-Hartman, Peter Zijlstra, David Laight, Joe Perches,
	Dennis Zhou, Emil Renner Berthing, Nicholas Piggin,
	Matti Vaittinen, Alexey Klimov, linux-kernel, Ariel Elior,
	Manish Chopra, David S. Miller, Jakub Kicinski, netdev

On Sun, Jan 23, 2022 at 10:38:41AM -0800, Yury Norov wrote:
> qlogic/qed code calls bitmap_weight() to check if any bit of a given
> bitmap is set. It's better to use bitmap_empty() in that case because
> bitmap_empty() stops traversing the bitmap as soon as it finds first
> set bit, while bitmap_weight() counts all bits unconditionally.

> -		if (bitmap_weight((unsigned long *)&pmap[item], 64 * 8))
> +		if (!bitmap_empty((unsigned long *)&pmap[item], 64 * 8))

> -	    (bitmap_weight((unsigned long *)&pmap[item],
> +	    (!bitmap_empty((unsigned long *)&pmap[item],

Side note, these castings reminds me previous discussion and I'm wondering
if you have this kind of potentially problematic places in your TODO as
subject to fix.


-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 27/54] lib/bitmap: add bitmap_weight_{cmp, eq, gt, ge, lt, le} functions
  2022-01-23 18:38 ` [PATCH 27/54] lib/bitmap: add bitmap_weight_{cmp, eq, gt, ge, lt, le} functions Yury Norov
@ 2022-01-24 12:41   ` Andy Shevchenko
  2022-01-24 12:43     ` Andy Shevchenko
  2022-01-26 15:56     ` Yury Norov
  2022-01-28  6:59   ` Vaittinen, Matti
  1 sibling, 2 replies; 117+ messages in thread
From: Andy Shevchenko @ 2022-01-24 12:41 UTC (permalink / raw)
  To: Yury Norov
  Cc: Rasmus Villemoes, Andrew Morton, Michał Mirosław,
	Greg Kroah-Hartman, Peter Zijlstra, David Laight, Joe Perches,
	Dennis Zhou, Emil Renner Berthing, Nicholas Piggin,
	Matti Vaittinen, Alexey Klimov, linux-kernel

On Sun, Jan 23, 2022 at 10:38:58AM -0800, Yury Norov wrote:
> Many kernel users use bitmap_weight() to compare the result against
> some number or expression:
> 
> 	if (bitmap_weight(...) > 1)
> 		do_something();
> 
> It works OK, but may be significantly improved for large bitmaps: if
> first few words count set bits to a number greater than given, we can
> stop counting and immediately return.
> 
> The same idea would work in other direction: if we know that the number
> of set bits that we counted so far is small enough, so that it would be
> smaller than required number even if all bits of the rest of the bitmap
> are set, we can stop counting earlier.
> 
> This patch adds new bitmap_weight_cmp() as suggested by Michał Mirosław
> and a family of eq, gt, ge, lt and le wrappers to allow this optimization.

lt, and le

> The following patches apply new functions where appropriate.

What I missed in the above message is the rough statistics like some of them
are used more often, some less, and some, perhaps, just added for the sake of
symmetry (the latter is what would be important to see if there are APIs which
have no users at all).

> Suggested-by: "Michał Mirosław" <mirq-linux@rere.qmqm.pl> (for bitmap_weight_cmp)

Please, avoid using double quotes in the tags.

While at it, as a few folks already noticed, keep the subject lines in align
with the policies established in the certain subsystems (in this case seems
'bitmap:' should suffice). I would recommend to run

  `git log --oneline --no-merges -- ...file(s)_in_question...`

to figure out what is the most used and best fit in each case individually.

...

> + * Returns zero if weight of @src is equal to @num;
> + *	   negative number if weight of @src is less than @num;
> + *	   positive number if weight of @src is greater than @num;

> + * NOTES
> + *
> + * Because number of set bits cannot decrease while counting, when user
> + * wants to know if the number of set bits in the bitmap is less than
> + * @num, calling
> + *	bitmap_weight_cmp(..., @num) < 0
> + * is potentially less effective than
> + *	bitmap_weight_cmp(..., @num - 1) <= 0
> + *
> + * Consider an example:
> + * bitmap_weight_cmp(1000 0000 0000 0000, 1) < 0
> + *				    ^
> + *				    stop here
> + *
> + * bitmap_weight_cmp(1000 0000 0000 0000, 0) <= 0
> + *		     ^
> + *		     stop here

This probably should precede the Returns paragraph, also that paragraph can be
converted to a section in the documentation as follows:

 *
 * Returns:
 *   ...
 *

...

> +	if (num > (int)nbits || num < 0)

Wonder if

	if (abs(num) > nbits)

would be sufficient.

> +		return -num;

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 27/54] lib/bitmap: add bitmap_weight_{cmp, eq, gt, ge, lt, le} functions
  2022-01-24 12:41   ` Andy Shevchenko
@ 2022-01-24 12:43     ` Andy Shevchenko
  2022-01-24 12:49       ` Peter Zijlstra
  2022-01-26 15:56     ` Yury Norov
  1 sibling, 1 reply; 117+ messages in thread
From: Andy Shevchenko @ 2022-01-24 12:43 UTC (permalink / raw)
  To: Yury Norov
  Cc: Rasmus Villemoes, Andrew Morton, Michał Mirosław,
	Greg Kroah-Hartman, Peter Zijlstra, David Laight, Joe Perches,
	Dennis Zhou, Emil Renner Berthing, Nicholas Piggin,
	Matti Vaittinen, Alexey Klimov, linux-kernel

On Mon, Jan 24, 2022 at 02:41:38PM +0200, Andy Shevchenko wrote:
> On Sun, Jan 23, 2022 at 10:38:58AM -0800, Yury Norov wrote:

...

> > +	if (num > (int)nbits || num < 0)
> 
> Wonder if
> 
> 	if (abs(num) > nbits)
> 
> would be sufficient.

Scratch it. Of course it won't work.

It may be other way around:

	if ((unsigned int)num > nbits)

> > +		return -num;

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 29/54] drivers/iio: replace bitmap_weight() with bitmap_weight_{eq,gt} where appropriate
  2022-01-23 18:39 ` [PATCH 29/54] drivers/iio: replace bitmap_weight() with bitmap_weight_{eq,gt} " Yury Norov
@ 2022-01-24 12:46   ` Andy Shevchenko
  2022-01-30 12:54     ` Jonathan Cameron
  0 siblings, 1 reply; 117+ messages in thread
From: Andy Shevchenko @ 2022-01-24 12:46 UTC (permalink / raw)
  To: Yury Norov
  Cc: Rasmus Villemoes, Andrew Morton, Michał Mirosław,
	Greg Kroah-Hartman, Peter Zijlstra, David Laight, Joe Perches,
	Dennis Zhou, Emil Renner Berthing, Nicholas Piggin,
	Matti Vaittinen, Alexey Klimov, linux-kernel, Jonathan Cameron,
	Lars-Peter Clausen, Nathan Chancellor, Alexandru Ardelean,
	linux-iio

On Sun, Jan 23, 2022 at 10:39:00AM -0800, Yury Norov wrote:
> drivers/iio calls bitmap_weight() to compare the weight of bitmap with
> a given number. We can do it more efficiently with bitmap_weight_{eq, gt}
> because conditional bitmap_weight may stop traversing the bitmap earlier,
> as soon as condition is met.

...

>  		int i, j;
>  
>  		for (i = 0, j = 0;
> -		     i < bitmap_weight(indio_dev->active_scan_mask,
> -				       indio_dev->masklength);
> +		     bitmap_weight_gt(indio_dev->active_scan_mask,
> +				       indio_dev->masklength, i);
>  		     i++, j++) {
>  			j = find_next_bit(indio_dev->active_scan_mask,
>  					  indio_dev->masklength, j);

This smells like room for improvement. Have you checked this deeply?

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 33/54] net: ethernet: replace bitmap_weight with bitmap_weight_{eq,gt,ge,lt,le} for mellanox
  2022-01-23 18:39 ` [PATCH 33/54] net: ethernet: replace bitmap_weight with bitmap_weight_{eq,gt,ge,lt,le} for mellanox Yury Norov
@ 2022-01-24 12:48   ` Andy Shevchenko
  2022-02-09  6:46     ` Yury Norov
  0 siblings, 1 reply; 117+ messages in thread
From: Andy Shevchenko @ 2022-01-24 12:48 UTC (permalink / raw)
  To: Yury Norov
  Cc: Rasmus Villemoes, Andrew Morton, Michał Mirosław,
	Greg Kroah-Hartman, Peter Zijlstra, David Laight, Joe Perches,
	Dennis Zhou, Emil Renner Berthing, Nicholas Piggin,
	Matti Vaittinen, Alexey Klimov, linux-kernel, Sunil Goutham,
	Geetha sowjanya, Subbaraya Sundeep, hariprasad, David S. Miller,
	Jakub Kicinski, netdev

On Sun, Jan 23, 2022 at 10:39:04AM -0800, Yury Norov wrote:
> Mellanox code uses bitmap_weight() to compare the weight of bitmap with
> a given number. We can do it more efficiently with bitmap_weight_{eq, ...}
> because conditional bitmap_weight may stop traversing the bitmap earlier,
> as soon as condition is met.

> -	if (port <= 0 || port > m)
> +	if (port <= 0 || bitmap_weight_lt(actv_ports.ports, dev->caps.num_ports, port))
>  		return -EINVAL;

Can we eliminate now the port <= 0 check? Or at least make it port == 0?

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 27/54] lib/bitmap: add bitmap_weight_{cmp, eq, gt, ge, lt, le} functions
  2022-01-24 12:43     ` Andy Shevchenko
@ 2022-01-24 12:49       ` Peter Zijlstra
  0 siblings, 0 replies; 117+ messages in thread
From: Peter Zijlstra @ 2022-01-24 12:49 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Yury Norov, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, David Laight,
	Joe Perches, Dennis Zhou, Emil Renner Berthing, Nicholas Piggin,
	Matti Vaittinen, Alexey Klimov, linux-kernel

On Mon, Jan 24, 2022 at 02:43:30PM +0200, Andy Shevchenko wrote:
> It may be other way around:
> 
> 	if ((unsigned int)num > nbits)

Yes, that's my preferred method too :-)

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 17/54] gpu: drm: replace cpumask_weight with cpumask_empty where appropriate
  2022-01-23 18:38   ` [Intel-gfx] " Yury Norov
@ 2022-01-25  9:28     ` Tvrtko Ursulin
  -1 siblings, 0 replies; 117+ messages in thread
From: Tvrtko Ursulin @ 2022-01-25  9:28 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Jani Nikula, Joonas Lahtinen, Rodrigo Vivi, David Airlie,
	Daniel Vetter, intel-gfx, dri-devel


On 23/01/2022 18:38, Yury Norov wrote:
> i915_pmu_cpu_online() calls cpumask_weight() to check if any bit of a
> given cpumask is set. We can do it more efficiently with cpumask_empty()
> because cpumask_empty() stops traversing the cpumask as soon as it finds
> first set bit, while cpumask_weight() counts all bits unconditionally.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>   drivers/gpu/drm/i915/i915_pmu.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> index ea655161793e..1894c876b31d 100644
> --- a/drivers/gpu/drm/i915/i915_pmu.c
> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> @@ -1048,7 +1048,7 @@ static int i915_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
>   	GEM_BUG_ON(!pmu->base.event_init);
>   
>   	/* Select the first online CPU as a designated reader. */
> -	if (!cpumask_weight(&i915_pmu_cpumask))
> +	if (cpumask_empty(&i915_pmu_cpumask))
>   		cpumask_set_cpu(cpu, &i915_pmu_cpumask);
>   
>   	return 0;
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

I see it's a large series which only partially appeared on our mailing 
lists. So for instance it hasn't got tested by our automated CI. (Not 
that I expect any problems in this patch.)

What are the plans in terms of which tree will it get merged through?

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [Intel-gfx] [PATCH 17/54] gpu: drm: replace cpumask_weight with cpumask_empty where appropriate
@ 2022-01-25  9:28     ` Tvrtko Ursulin
  0 siblings, 0 replies; 117+ messages in thread
From: Tvrtko Ursulin @ 2022-01-25  9:28 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Jani Nikula, Joonas Lahtinen, Rodrigo Vivi, David Airlie,
	Daniel Vetter, intel-gfx, dri-devel


On 23/01/2022 18:38, Yury Norov wrote:
> i915_pmu_cpu_online() calls cpumask_weight() to check if any bit of a
> given cpumask is set. We can do it more efficiently with cpumask_empty()
> because cpumask_empty() stops traversing the cpumask as soon as it finds
> first set bit, while cpumask_weight() counts all bits unconditionally.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>   drivers/gpu/drm/i915/i915_pmu.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> index ea655161793e..1894c876b31d 100644
> --- a/drivers/gpu/drm/i915/i915_pmu.c
> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> @@ -1048,7 +1048,7 @@ static int i915_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
>   	GEM_BUG_ON(!pmu->base.event_init);
>   
>   	/* Select the first online CPU as a designated reader. */
> -	if (!cpumask_weight(&i915_pmu_cpumask))
> +	if (cpumask_empty(&i915_pmu_cpumask))
>   		cpumask_set_cpu(cpu, &i915_pmu_cpumask);
>   
>   	return 0;
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

I see it's a large series which only partially appeared on our mailing 
lists. So for instance it hasn't got tested by our automated CI. (Not 
that I expect any problems in this patch.)

What are the plans in terms of which tree will it get merged through?

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 17/54] gpu: drm: replace cpumask_weight with cpumask_empty where appropriate
  2022-01-25  9:28     ` [Intel-gfx] " Tvrtko Ursulin
  (?)
@ 2022-01-25 18:16       ` Yury Norov
  -1 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-25 18:16 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: Emil Renner Berthing, dri-devel, Peter Zijlstra,
	Greg Kroah-Hartman, David Airlie, Rasmus Villemoes, linux-kernel,
	Nicholas Piggin, Michał Mirosław, Alexey Klimov,
	David Laight, Matti Vaittinen, Rodrigo Vivi, Joe Perches,
	Dennis Zhou, Andrew Morton, Andy Shevchenko, intel-gfx

On Tue, Jan 25, 2022 at 1:28 AM Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
>
> On 23/01/2022 18:38, Yury Norov wrote:
> > i915_pmu_cpu_online() calls cpumask_weight() to check if any bit of a
> > given cpumask is set. We can do it more efficiently with cpumask_empty()
> > because cpumask_empty() stops traversing the cpumask as soon as it finds
> > first set bit, while cpumask_weight() counts all bits unconditionally.
> >
> > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > ---
> >   drivers/gpu/drm/i915/i915_pmu.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> > index ea655161793e..1894c876b31d 100644
> > --- a/drivers/gpu/drm/i915/i915_pmu.c
> > +++ b/drivers/gpu/drm/i915/i915_pmu.c
> > @@ -1048,7 +1048,7 @@ static int i915_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
> >       GEM_BUG_ON(!pmu->base.event_init);
> >
> >       /* Select the first online CPU as a designated reader. */
> > -     if (!cpumask_weight(&i915_pmu_cpumask))
> > +     if (cpumask_empty(&i915_pmu_cpumask))
> >               cpumask_set_cpu(cpu, &i915_pmu_cpumask);
> >
> >       return 0;
> >
>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> I see it's a large series which only partially appeared on our mailing
> lists.

The series is here: https://lkml.org/lkml/2022/1/23/223
The branch: https://github.com/norov/linux/tree/bitmap-20220123

> So for instance it hasn't got tested by our automated CI. (Not
> that I expect any problems in this patch.)

Would be great if you give a test for the whole series, thanks!

> What are the plans in terms of which tree will it get merged through?

For the patches that will not be merged by maintainers of corresponding
subsystems, I'll use my bitmap branch and send it to linux-next.

Thanks,
Yury

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 17/54] gpu: drm: replace cpumask_weight with cpumask_empty where appropriate
@ 2022-01-25 18:16       ` Yury Norov
  0 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-25 18:16 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Jani Nikula, Joonas Lahtinen, Rodrigo Vivi, David Airlie,
	Daniel Vetter, intel-gfx, dri-devel

On Tue, Jan 25, 2022 at 1:28 AM Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
>
> On 23/01/2022 18:38, Yury Norov wrote:
> > i915_pmu_cpu_online() calls cpumask_weight() to check if any bit of a
> > given cpumask is set. We can do it more efficiently with cpumask_empty()
> > because cpumask_empty() stops traversing the cpumask as soon as it finds
> > first set bit, while cpumask_weight() counts all bits unconditionally.
> >
> > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > ---
> >   drivers/gpu/drm/i915/i915_pmu.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> > index ea655161793e..1894c876b31d 100644
> > --- a/drivers/gpu/drm/i915/i915_pmu.c
> > +++ b/drivers/gpu/drm/i915/i915_pmu.c
> > @@ -1048,7 +1048,7 @@ static int i915_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
> >       GEM_BUG_ON(!pmu->base.event_init);
> >
> >       /* Select the first online CPU as a designated reader. */
> > -     if (!cpumask_weight(&i915_pmu_cpumask))
> > +     if (cpumask_empty(&i915_pmu_cpumask))
> >               cpumask_set_cpu(cpu, &i915_pmu_cpumask);
> >
> >       return 0;
> >
>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> I see it's a large series which only partially appeared on our mailing
> lists.

The series is here: https://lkml.org/lkml/2022/1/23/223
The branch: https://github.com/norov/linux/tree/bitmap-20220123

> So for instance it hasn't got tested by our automated CI. (Not
> that I expect any problems in this patch.)

Would be great if you give a test for the whole series, thanks!

> What are the plans in terms of which tree will it get merged through?

For the patches that will not be merged by maintainers of corresponding
subsystems, I'll use my bitmap branch and send it to linux-next.

Thanks,
Yury

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [Intel-gfx] [PATCH 17/54] gpu: drm: replace cpumask_weight with cpumask_empty where appropriate
@ 2022-01-25 18:16       ` Yury Norov
  0 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-25 18:16 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: Emil Renner Berthing, dri-devel, Peter Zijlstra,
	Greg Kroah-Hartman, David Airlie, Rasmus Villemoes, linux-kernel,
	Nicholas Piggin, Michał Mirosław, Alexey Klimov,
	David Laight, Matti Vaittinen, Joe Perches, Dennis Zhou,
	Andrew Morton, Andy Shevchenko, intel-gfx

On Tue, Jan 25, 2022 at 1:28 AM Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
>
> On 23/01/2022 18:38, Yury Norov wrote:
> > i915_pmu_cpu_online() calls cpumask_weight() to check if any bit of a
> > given cpumask is set. We can do it more efficiently with cpumask_empty()
> > because cpumask_empty() stops traversing the cpumask as soon as it finds
> > first set bit, while cpumask_weight() counts all bits unconditionally.
> >
> > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > ---
> >   drivers/gpu/drm/i915/i915_pmu.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> > index ea655161793e..1894c876b31d 100644
> > --- a/drivers/gpu/drm/i915/i915_pmu.c
> > +++ b/drivers/gpu/drm/i915/i915_pmu.c
> > @@ -1048,7 +1048,7 @@ static int i915_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
> >       GEM_BUG_ON(!pmu->base.event_init);
> >
> >       /* Select the first online CPU as a designated reader. */
> > -     if (!cpumask_weight(&i915_pmu_cpumask))
> > +     if (cpumask_empty(&i915_pmu_cpumask))
> >               cpumask_set_cpu(cpu, &i915_pmu_cpumask);
> >
> >       return 0;
> >
>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> I see it's a large series which only partially appeared on our mailing
> lists.

The series is here: https://lkml.org/lkml/2022/1/23/223
The branch: https://github.com/norov/linux/tree/bitmap-20220123

> So for instance it hasn't got tested by our automated CI. (Not
> that I expect any problems in this patch.)

Would be great if you give a test for the whole series, thanks!

> What are the plans in terms of which tree will it get merged through?

For the patches that will not be merged by maintainers of corresponding
subsystems, I'll use my bitmap branch and send it to linux-next.

Thanks,
Yury

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 38/54] arch/mips: replace cpumask_weight with cpumask_weight_{eq, ...} where appropriate
  2022-01-23 18:39 ` [PATCH 38/54] arch/mips: replace cpumask_weight with cpumask_weight_{eq, ...} where appropriate Yury Norov
@ 2022-01-25 19:05   ` Thomas Bogendoerfer
  0 siblings, 0 replies; 117+ messages in thread
From: Thomas Bogendoerfer @ 2022-01-25 19:05 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Mark Rutland, Marc Zyngier, linux-mips

On Sun, Jan 23, 2022 at 10:39:09AM -0800, Yury Norov wrote:
> Mips code uses calls cpumask_weight() to compare the weight of
> cpumask with a given number. We can do it more efficiently with
> cpumask_weight_{eq, ...} because conditional cpumask_weight may stop
> traversing the cpumask earlier, as soon as condition is met.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>  arch/mips/cavium-octeon/octeon-irq.c | 4 ++--
>  arch/mips/kernel/crash.c             | 2 +-
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/mips/cavium-octeon/octeon-irq.c b/arch/mips/cavium-octeon/octeon-irq.c
> index 844f882096e6..914871f15fb7 100644
> --- a/arch/mips/cavium-octeon/octeon-irq.c
> +++ b/arch/mips/cavium-octeon/octeon-irq.c
> @@ -763,7 +763,7 @@ static void octeon_irq_cpu_offline_ciu(struct irq_data *data)
>  	if (!cpumask_test_cpu(cpu, mask))
>  		return;
>  
> -	if (cpumask_weight(mask) > 1) {
> +	if (cpumask_weight_gt(mask, 1)) {
>  		/*
>  		 * It has multi CPU affinity, just remove this CPU
>  		 * from the affinity set.
> @@ -795,7 +795,7 @@ static int octeon_irq_ciu_set_affinity(struct irq_data *data,
>  	 * This removes the need to do locking in the .ack/.eoi
>  	 * functions.
>  	 */
> -	if (cpumask_weight(dest) != 1)
> +	if (!cpumask_weight_eq(dest, 1))
>  		return -EINVAL;
>  
>  	if (!enable_one)
> diff --git a/arch/mips/kernel/crash.c b/arch/mips/kernel/crash.c
> index 81845ba04835..5b690d52491f 100644
> --- a/arch/mips/kernel/crash.c
> +++ b/arch/mips/kernel/crash.c
> @@ -72,7 +72,7 @@ static void crash_kexec_prepare_cpus(void)
>  	 */
>  	pr_emerg("Sending IPI to other cpus...\n");
>  	msecs = 10000;
> -	while ((cpumask_weight(&cpus_in_crash) < ncpus) && (--msecs > 0)) {
> +	while (cpumask_weight_lt(&cpus_in_crash, ncpus) && (--msecs > 0)) {
>  		cpu_relax();
>  		mdelay(1);
>  	}
> -- 
> 2.30.2

Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>

-- 
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea.                                                [ RFC1925, 2.3 ]

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 10/54] net: ethernet: replace bitmap_weight with bitmap_empty for qlogic
  2022-01-24 12:28   ` Andy Shevchenko
@ 2022-01-25 21:09     ` Yury Norov
  2022-01-25 22:14       ` David Laight
  0 siblings, 1 reply; 117+ messages in thread
From: Yury Norov @ 2022-01-25 21:09 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Rasmus Villemoes, Andrew Morton, Michał Mirosław,
	Greg Kroah-Hartman, Peter Zijlstra, David Laight, Joe Perches,
	Dennis Zhou, Emil Renner Berthing, Nicholas Piggin,
	Matti Vaittinen, Alexey Klimov, linux-kernel, Ariel Elior,
	Manish Chopra, David S. Miller, Jakub Kicinski, netdev

On Mon, Jan 24, 2022 at 4:29 AM Andy Shevchenko
<andriy.shevchenko@linux.intel.com> wrote:
>
> On Sun, Jan 23, 2022 at 10:38:41AM -0800, Yury Norov wrote:
> > qlogic/qed code calls bitmap_weight() to check if any bit of a given
> > bitmap is set. It's better to use bitmap_empty() in that case because
> > bitmap_empty() stops traversing the bitmap as soon as it finds first
> > set bit, while bitmap_weight() counts all bits unconditionally.
>
> > -             if (bitmap_weight((unsigned long *)&pmap[item], 64 * 8))
> > +             if (!bitmap_empty((unsigned long *)&pmap[item], 64 * 8))
>
> > -         (bitmap_weight((unsigned long *)&pmap[item],
> > +         (!bitmap_empty((unsigned long *)&pmap[item],
>
> Side note, these castings reminds me previous discussion and I'm wondering
> if you have this kind of potentially problematic places in your TODO as
> subject to fix.

In the discussion you mentioned above, the u32* was cast to u64*,
which is wrong. The code
here is safe because in the worst case, it casts u64* to u32*. This
would be OK wrt
 -Werror=array-bounds.

The function itself looks like doing this unsigned long <-> u64
conversions just for printing
purpose. I'm not a qlogic expert, so let's wait what people say?

The printing part may be refactored although to use %pb" format,
similarly to the snippet below
(not tested).

Thanks,
Yury

diff --git a/drivers/net/ethernet/qlogic/qed/qed_rdma.c
b/drivers/net/ethernet/qlogic/qed/qed_rdma.c
index 23b668de4640..72505517ced1 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_rdma.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_rdma.c
@@ -336,17 +336,8 @@ void qed_rdma_bmap_free(struct qed_hwfn *p_hwfn,

        /* print aligned non-zero lines, if any */
        for (item = 0, line = 0; line < last_line; line++, item += 8)
-               if (bitmap_weight((unsigned long *)&pmap[item], 64 * 8))
-                       DP_NOTICE(p_hwfn,
-                                 "line 0x%04x: 0x%016llx 0x%016llx
0x%016llx 0x%016llx 0x%016llx 0x%016llx 0x%016llx 0x%016llx\n",
-                                 line,
-                                 pmap[item],
-                                 pmap[item + 1],
-                                 pmap[item + 2],
-                                 pmap[item + 3],
-                                 pmap[item + 4],
-                                 pmap[item + 5],
-                                 pmap[item + 6], pmap[item + 7]);
+               if (bitmap_weight(bmap->bitmap, 64 * 8))
+                       DP_NOTICE(p_hwfn, "line 0x%04x: %512pb\n",
line, bmap->bitmap);

        /* print last unaligned non-zero line, if any */
        if ((bmap->max_count % (64 * 8)) &&

^ permalink raw reply related	[flat|nested] 117+ messages in thread

* RE: [PATCH 10/54] net: ethernet: replace bitmap_weight with bitmap_empty for qlogic
  2022-01-25 21:09     ` Yury Norov
@ 2022-01-25 22:14       ` David Laight
  2022-01-25 23:10         ` Yury Norov
  0 siblings, 1 reply; 117+ messages in thread
From: David Laight @ 2022-01-25 22:14 UTC (permalink / raw)
  To: 'Yury Norov', Andy Shevchenko
  Cc: Rasmus Villemoes, Andrew Morton, Michał Mirosław,
	Greg Kroah-Hartman, Peter Zijlstra, Joe Perches, Dennis Zhou,
	Emil Renner Berthing, Nicholas Piggin, Matti Vaittinen,
	Alexey Klimov, linux-kernel, Ariel Elior, Manish Chopra,
	David S. Miller, Jakub Kicinski, netdev

From: Yury Norov
> Sent: 25 January 2022 21:10
> On Mon, Jan 24, 2022 at 4:29 AM Andy Shevchenko
> <andriy.shevchenko@linux.intel.com> wrote:
> >
> > On Sun, Jan 23, 2022 at 10:38:41AM -0800, Yury Norov wrote:
> > > qlogic/qed code calls bitmap_weight() to check if any bit of a given
> > > bitmap is set. It's better to use bitmap_empty() in that case because
> > > bitmap_empty() stops traversing the bitmap as soon as it finds first
> > > set bit, while bitmap_weight() counts all bits unconditionally.
> >
> > > -             if (bitmap_weight((unsigned long *)&pmap[item], 64 * 8))
> > > +             if (!bitmap_empty((unsigned long *)&pmap[item], 64 * 8))
> >
> > > -         (bitmap_weight((unsigned long *)&pmap[item],
> > > +         (!bitmap_empty((unsigned long *)&pmap[item],
> >
> > Side note, these castings reminds me previous discussion and I'm wondering
> > if you have this kind of potentially problematic places in your TODO as
> > subject to fix.
> 
> In the discussion you mentioned above, the u32* was cast to u64*,
> which is wrong. The code
> here is safe because in the worst case, it casts u64* to u32*. This
> would be OK wrt
>  -Werror=array-bounds.
> 
> The function itself looks like doing this unsigned long <-> u64
> conversions just for printing
> purpose. I'm not a qlogic expert, so let's wait what people say?

It'll be wrong on BE systems.
You just can't cast the argument it has to be long[].

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 10/54] net: ethernet: replace bitmap_weight with bitmap_empty for qlogic
  2022-01-25 22:14       ` David Laight
@ 2022-01-25 23:10         ` Yury Norov
  0 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-25 23:10 UTC (permalink / raw)
  To: David Laight
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	Joe Perches, Dennis Zhou, Emil Renner Berthing, Nicholas Piggin,
	Matti Vaittinen, Alexey Klimov, linux-kernel, Ariel Elior,
	Manish Chopra, David S. Miller, Jakub Kicinski, netdev

On Tue, Jan 25, 2022 at 2:15 PM David Laight <David.Laight@aculab.com> wrote:
>
> From: Yury Norov
> > Sent: 25 January 2022 21:10
> > On Mon, Jan 24, 2022 at 4:29 AM Andy Shevchenko
> > <andriy.shevchenko@linux.intel.com> wrote:
> > >
> > > On Sun, Jan 23, 2022 at 10:38:41AM -0800, Yury Norov wrote:
> > > > qlogic/qed code calls bitmap_weight() to check if any bit of a given
> > > > bitmap is set. It's better to use bitmap_empty() in that case because
> > > > bitmap_empty() stops traversing the bitmap as soon as it finds first
> > > > set bit, while bitmap_weight() counts all bits unconditionally.
> > >
> > > > -             if (bitmap_weight((unsigned long *)&pmap[item], 64 * 8))
> > > > +             if (!bitmap_empty((unsigned long *)&pmap[item], 64 * 8))
> > >
> > > > -         (bitmap_weight((unsigned long *)&pmap[item],
> > > > +         (!bitmap_empty((unsigned long *)&pmap[item],
> > >
> > > Side note, these castings reminds me previous discussion and I'm wondering
> > > if you have this kind of potentially problematic places in your TODO as
> > > subject to fix.
> >
> > In the discussion you mentioned above, the u32* was cast to u64*,
> > which is wrong. The code
> > here is safe because in the worst case, it casts u64* to u32*. This
> > would be OK wrt
> >  -Werror=array-bounds.
> >
> > The function itself looks like doing this unsigned long <-> u64
> > conversions just for printing
> > purpose. I'm not a qlogic expert, so let's wait what people say?
>
> It'll be wrong on BE systems.

The bitmap_weigh() result will be correct. As you can see, the address
is 64-bit aligned anyways. The array boundary violation will never happen
as well.

DP_NOTICE() may be wrong, or may not. It depends on how important
the absolute position of the bit in the printed bitmap is. Nevertheless,
printk("%pb") is better and should be used.

This whole concern may be simply irrelevant if QED is not supported
on 32-bit BE machines. From what I can see, at least Infiniband requires
64BIT.

Thanks,
Yury

> You just can't cast the argument it has to be long[].
>
>         David
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage
  2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
                   ` (53 preceding siblings ...)
  2022-01-23 18:39 ` [PATCH 54/54] MAINTAINERS: add cpumask and nodemask files to BITMAP_API Yury Norov
@ 2022-01-26  7:30 ` Vaittinen, Matti
  2022-01-27 17:44   ` Yury Norov
  54 siblings, 1 reply; 117+ messages in thread
From: Vaittinen, Matti @ 2022-01-26  7:30 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Alexey Klimov, linux-kernel

On 1/23/22 20:38, Yury Norov wrote:
> In many cases people use bitmap_weight()-based functions to compare
> the result against a number of expression:
> 
> 	if (cpumask_weight(mask) > 1)
> 		do_something();
> 
> This may take considerable amount of time on many-cpus machines because
> cpumask_weight() will traverse every word of underlying cpumask
> unconditionally.
> 
> We can significantly improve on it for many real cases if stop traversing
> the mask as soon as we count cpus to any number greater than 1:
> 
> 	if (cpumask_weight_gt(mask, 1))
> 		do_something();

I guess I am part of the recipient list because I did the original 
suggestion of adding the single_bit_set()?

If this is the case - well, I do like this series. Overall it looks good 
to me - but I for sure did not go through all the changes in detail ;) 
If there is some other reason to loop me in (Eg, if someone expects me 
to take a more specific look on something) - please give me a nudge.

Best Regards
	-- Matti Vaittinen


-- 
The Linux Kernel guy at ROHM Semiconductors

Matti Vaittinen, Linux device drivers
ROHM Semiconductors, Finland SWDC
Kiviharjunlenkki 1E
90220 OULU
FINLAND

~~ this year is the year of a signature writers block ~~

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 04/54] net: mellanox: fix open-coded for_each_set_bit()
  2022-01-23 18:38 ` [PATCH 04/54] net: mellanox: fix open-coded for_each_set_bit() Yury Norov
@ 2022-01-26  9:01   ` Tariq Toukan
  0 siblings, 0 replies; 117+ messages in thread
From: Tariq Toukan @ 2022-01-26  9:01 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Tariq Toukan, David S. Miller, Jakub Kicinski, netdev,
	linux-rdma



On 1/23/2022 8:38 PM, Yury Norov wrote:
> Mellanox driver has an open-coded for_each_set_bit(). Fix it.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>   drivers/net/ethernet/mellanox/mlx4/cmd.c | 23 ++++++-----------------
>   1 file changed, 6 insertions(+), 17 deletions(-)
> 

Reviewed-by: Tariq Toukan <tariqt@nvidia.com>

Thanks,
Tariq

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 17/54] gpu: drm: replace cpumask_weight with cpumask_empty where appropriate
  2022-01-25 18:16       ` Yury Norov
  (?)
@ 2022-01-26  9:43         ` Tvrtko Ursulin
  -1 siblings, 0 replies; 117+ messages in thread
From: Tvrtko Ursulin @ 2022-01-26  9:43 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Jani Nikula, Joonas Lahtinen, Rodrigo Vivi, David Airlie,
	Daniel Vetter, intel-gfx, dri-devel


On 25/01/2022 18:16, Yury Norov wrote:
> On Tue, Jan 25, 2022 at 1:28 AM Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
>>
>>
>> On 23/01/2022 18:38, Yury Norov wrote:
>>> i915_pmu_cpu_online() calls cpumask_weight() to check if any bit of a
>>> given cpumask is set. We can do it more efficiently with cpumask_empty()
>>> because cpumask_empty() stops traversing the cpumask as soon as it finds
>>> first set bit, while cpumask_weight() counts all bits unconditionally.
>>>
>>> Signed-off-by: Yury Norov <yury.norov@gmail.com>
>>> ---
>>>    drivers/gpu/drm/i915/i915_pmu.c | 2 +-
>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
>>> index ea655161793e..1894c876b31d 100644
>>> --- a/drivers/gpu/drm/i915/i915_pmu.c
>>> +++ b/drivers/gpu/drm/i915/i915_pmu.c
>>> @@ -1048,7 +1048,7 @@ static int i915_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
>>>        GEM_BUG_ON(!pmu->base.event_init);
>>>
>>>        /* Select the first online CPU as a designated reader. */
>>> -     if (!cpumask_weight(&i915_pmu_cpumask))
>>> +     if (cpumask_empty(&i915_pmu_cpumask))
>>>                cpumask_set_cpu(cpu, &i915_pmu_cpumask);
>>>
>>>        return 0;
>>>
>>
>> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> I see it's a large series which only partially appeared on our mailing
>> lists.
> 
> The series is here: https://lkml.org/lkml/2022/1/23/223
> The branch: https://github.com/norov/linux/tree/bitmap-20220123
> 
>> So for instance it hasn't got tested by our automated CI. (Not
>> that I expect any problems in this patch.)
> 
> Would be great if you give a test for the whole series, thanks!

Can't really test the whole series for you, but if you want to send just 
the i915 patch standalone to the intel-gfx mailing list, that would 
trigger the CI run and if that passes we can merge that single one.

>> What are the plans in terms of which tree will it get merged through?
> 
> For the patches that will not be merged by maintainers of corresponding
> subsystems, I'll use my bitmap branch and send it to linux-next.

Or I guess we can wait for them to trickle back to us this way.

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 17/54] gpu: drm: replace cpumask_weight with cpumask_empty where appropriate
@ 2022-01-26  9:43         ` Tvrtko Ursulin
  0 siblings, 0 replies; 117+ messages in thread
From: Tvrtko Ursulin @ 2022-01-26  9:43 UTC (permalink / raw)
  To: Yury Norov
  Cc: Emil Renner Berthing, dri-devel, Peter Zijlstra,
	Greg Kroah-Hartman, David Airlie, Rasmus Villemoes, linux-kernel,
	Nicholas Piggin, Michał Mirosław, Alexey Klimov,
	David Laight, Matti Vaittinen, Rodrigo Vivi, Joe Perches,
	Dennis Zhou, Andrew Morton, Andy Shevchenko, intel-gfx


On 25/01/2022 18:16, Yury Norov wrote:
> On Tue, Jan 25, 2022 at 1:28 AM Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
>>
>>
>> On 23/01/2022 18:38, Yury Norov wrote:
>>> i915_pmu_cpu_online() calls cpumask_weight() to check if any bit of a
>>> given cpumask is set. We can do it more efficiently with cpumask_empty()
>>> because cpumask_empty() stops traversing the cpumask as soon as it finds
>>> first set bit, while cpumask_weight() counts all bits unconditionally.
>>>
>>> Signed-off-by: Yury Norov <yury.norov@gmail.com>
>>> ---
>>>    drivers/gpu/drm/i915/i915_pmu.c | 2 +-
>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
>>> index ea655161793e..1894c876b31d 100644
>>> --- a/drivers/gpu/drm/i915/i915_pmu.c
>>> +++ b/drivers/gpu/drm/i915/i915_pmu.c
>>> @@ -1048,7 +1048,7 @@ static int i915_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
>>>        GEM_BUG_ON(!pmu->base.event_init);
>>>
>>>        /* Select the first online CPU as a designated reader. */
>>> -     if (!cpumask_weight(&i915_pmu_cpumask))
>>> +     if (cpumask_empty(&i915_pmu_cpumask))
>>>                cpumask_set_cpu(cpu, &i915_pmu_cpumask);
>>>
>>>        return 0;
>>>
>>
>> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> I see it's a large series which only partially appeared on our mailing
>> lists.
> 
> The series is here: https://lkml.org/lkml/2022/1/23/223
> The branch: https://github.com/norov/linux/tree/bitmap-20220123
> 
>> So for instance it hasn't got tested by our automated CI. (Not
>> that I expect any problems in this patch.)
> 
> Would be great if you give a test for the whole series, thanks!

Can't really test the whole series for you, but if you want to send just 
the i915 patch standalone to the intel-gfx mailing list, that would 
trigger the CI run and if that passes we can merge that single one.

>> What are the plans in terms of which tree will it get merged through?
> 
> For the patches that will not be merged by maintainers of corresponding
> subsystems, I'll use my bitmap branch and send it to linux-next.

Or I guess we can wait for them to trickle back to us this way.

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [Intel-gfx] [PATCH 17/54] gpu: drm: replace cpumask_weight with cpumask_empty where appropriate
@ 2022-01-26  9:43         ` Tvrtko Ursulin
  0 siblings, 0 replies; 117+ messages in thread
From: Tvrtko Ursulin @ 2022-01-26  9:43 UTC (permalink / raw)
  To: Yury Norov
  Cc: Emil Renner Berthing, dri-devel, Peter Zijlstra,
	Greg Kroah-Hartman, David Airlie, Rasmus Villemoes, linux-kernel,
	Nicholas Piggin, Michał Mirosław, Alexey Klimov,
	David Laight, Matti Vaittinen, Joe Perches, Dennis Zhou,
	Andrew Morton, Andy Shevchenko, intel-gfx


On 25/01/2022 18:16, Yury Norov wrote:
> On Tue, Jan 25, 2022 at 1:28 AM Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
>>
>>
>> On 23/01/2022 18:38, Yury Norov wrote:
>>> i915_pmu_cpu_online() calls cpumask_weight() to check if any bit of a
>>> given cpumask is set. We can do it more efficiently with cpumask_empty()
>>> because cpumask_empty() stops traversing the cpumask as soon as it finds
>>> first set bit, while cpumask_weight() counts all bits unconditionally.
>>>
>>> Signed-off-by: Yury Norov <yury.norov@gmail.com>
>>> ---
>>>    drivers/gpu/drm/i915/i915_pmu.c | 2 +-
>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
>>> index ea655161793e..1894c876b31d 100644
>>> --- a/drivers/gpu/drm/i915/i915_pmu.c
>>> +++ b/drivers/gpu/drm/i915/i915_pmu.c
>>> @@ -1048,7 +1048,7 @@ static int i915_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
>>>        GEM_BUG_ON(!pmu->base.event_init);
>>>
>>>        /* Select the first online CPU as a designated reader. */
>>> -     if (!cpumask_weight(&i915_pmu_cpumask))
>>> +     if (cpumask_empty(&i915_pmu_cpumask))
>>>                cpumask_set_cpu(cpu, &i915_pmu_cpumask);
>>>
>>>        return 0;
>>>
>>
>> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> I see it's a large series which only partially appeared on our mailing
>> lists.
> 
> The series is here: https://lkml.org/lkml/2022/1/23/223
> The branch: https://github.com/norov/linux/tree/bitmap-20220123
> 
>> So for instance it hasn't got tested by our automated CI. (Not
>> that I expect any problems in this patch.)
> 
> Would be great if you give a test for the whole series, thanks!

Can't really test the whole series for you, but if you want to send just 
the i915 patch standalone to the intel-gfx mailing list, that would 
trigger the CI run and if that passes we can merge that single one.

>> What are the plans in terms of which tree will it get merged through?
> 
> For the patches that will not be merged by maintainers of corresponding
> subsystems, I'll use my bitmap branch and send it to linux-next.

Or I guess we can wait for them to trickle back to us this way.

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 27/54] lib/bitmap: add bitmap_weight_{cmp, eq, gt, ge, lt, le} functions
  2022-01-24 12:41   ` Andy Shevchenko
  2022-01-24 12:43     ` Andy Shevchenko
@ 2022-01-26 15:56     ` Yury Norov
  1 sibling, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-26 15:56 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Rasmus Villemoes, Andrew Morton, Michał Mirosław,
	Greg Kroah-Hartman, Peter Zijlstra, David Laight, Joe Perches,
	Dennis Zhou, Emil Renner Berthing, Nicholas Piggin,
	Matti Vaittinen, Alexey Klimov, linux-kernel

On Mon, Jan 24, 2022 at 4:42 AM Andy Shevchenko
<andriy.shevchenko@linux.intel.com> wrote:
>
> On Sun, Jan 23, 2022 at 10:38:58AM -0800, Yury Norov wrote:
> > Many kernel users use bitmap_weight() to compare the result against
> > some number or expression:
> >
> >       if (bitmap_weight(...) > 1)
> >               do_something();
> >
> > It works OK, but may be significantly improved for large bitmaps: if
> > first few words count set bits to a number greater than given, we can
> > stop counting and immediately return.
> >
> > The same idea would work in other direction: if we know that the number
> > of set bits that we counted so far is small enough, so that it would be
> > smaller than required number even if all bits of the rest of the bitmap
> > are set, we can stop counting earlier.
> >
> > This patch adds new bitmap_weight_cmp() as suggested by Michał Mirosław
> > and a family of eq, gt, ge, lt and le wrappers to allow this optimization.
>
> lt, and le
>
> > The following patches apply new functions where appropriate.
>
> What I missed in the above message is the rough statistics like some of them
> are used more often, some less, and some, perhaps, just added for the sake of
> symmetry (the latter is what would be important to see if there are APIs which
> have no users at all).

These are my grep numbers. Some lines are declarations and comments, so minus
6 or 8 for each number, but all new functions have actual users.

$ git grep weight_eq|wc -l
35
$ git grep weight_gt|wc -l
20
$ git grep weight_ge|wc -l
25
$ git grep weight_lt|wc -l
14
$ git grep weight_le|wc -l
18

^ permalink raw reply	[flat|nested] 117+ messages in thread

* RE: [PATCH 43/54] drivers/hv: replace cpumask_weight with cpumask_weight_eq
  2022-01-24  9:20   ` Vitaly Kuznetsov
@ 2022-01-27 15:02     ` Michael Kelley (LINUX)
  2022-01-28  9:31       ` Vitaly Kuznetsov
  0 siblings, 1 reply; 117+ messages in thread
From: Michael Kelley (LINUX) @ 2022-01-27 15:02 UTC (permalink / raw)
  To: vkuznets, Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	KY Srinivasan, Haiyang Zhang, Stephen Hemminger, Wei Liu,
	Dexuan Cui, linux-hyperv

From: Vitaly Kuznetsov <vkuznets@redhat.com> Sent: Monday, January 24, 2022 1:20 AM
> 
> Yury Norov <yury.norov@gmail.com> writes:
> 
> > init_vp_index() calls cpumask_weight() to compare the weights of cpumasks
> > We can do it more efficiently with cpumask_weight_eq because conditional
> > cpumask_weight may stop traversing the cpumask earlier (at least one), as
> > soon as condition is met.
> 
> Same comment as for "PATCH 41/54": cpumask_weight_eq() can only stop
> earlier if the condition is not met, to prove the equality all bits need
> always have to be examined.
> 
> >
> > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > ---
> >  drivers/hv/channel_mgmt.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
> > index 60375879612f..7420a5fd47b5 100644
> > --- a/drivers/hv/channel_mgmt.c
> > +++ b/drivers/hv/channel_mgmt.c
> > @@ -762,8 +762,8 @@ static void init_vp_index(struct vmbus_channel *channel)
> >  		}
> >  		alloced_mask = &hv_context.hv_numa_map[numa_node];
> >
> > -		if (cpumask_weight(alloced_mask) ==
> > -		    cpumask_weight(cpumask_of_node(numa_node))) {
> > +		if (cpumask_weight_eq(alloced_mask,
> > +			    cpumask_weight(cpumask_of_node(numa_node)))) {
> 
> This code is not performace critical and I prefer the old version:
> 
>  	cpumask_weight() == cpumask_weight()
> 
>  looks better than
> 
> 	cpumask_weight_eq(..., cpumask_weight())
> 
> (let alone the inner cpumask_of_node()) to me.
> 
> >  			/*
> >  			 * We have cycled through all the CPUs in the node;
> >  			 * reset the alloced map.
> 
> --
> Vitaly

I agree with Vitaly in preferring the old version, and indeed performance
here is a shrug.  But actually, I think the old version is a poorly coded way
to determine if the two cpumasks are equal. The following would correctly
capture the intent:

	if (cpumask_equal(alloced_mask, cpumask_of_node(numa_node))

Michael




^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 15/54] arch/x86: replace cpumask_weight with cpumask_empty where appropriate
  2022-01-23 18:38   ` [Nouveau] " Yury Norov
@ 2022-01-27 17:20     ` Steve Wahl
  -1 siblings, 0 replies; 117+ messages in thread
From: Steve Wahl @ 2022-01-27 17:20 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, Steven Rostedt,
	Karol Herbst, Pekka Paalanen, Andy Lutomirski, Steve Wahl,
	Mike Travis, Dimitri Sivanich, Russ Anderson, Darren Hart,
	Andy Shevchenko, x86, nouveau, platform-driver-x86

Reviewed-by: Steve Wahl <steve.wahl@hpe.com>

On Sun, Jan 23, 2022 at 10:38:46AM -0800, Yury Norov wrote:
> In some cases, arch/x86 code calls cpumask_weight() to check if any bit of
> a given cpumask is set. We can do it more efficiently with cpumask_empty()
> because cpumask_empty() stops traversing the cpumask as soon as it finds
> first set bit, while cpumask_weight() counts all bits unconditionally.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>  arch/x86/kernel/cpu/resctrl/rdtgroup.c | 14 +++++++-------
>  arch/x86/mm/mmio-mod.c                 |  2 +-
>  arch/x86/platform/uv/uv_nmi.c          |  2 +-
>  3 files changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index b57b3db9a6a7..e23ff03290b8 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -341,14 +341,14 @@ static int cpus_mon_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask,
>  
>  	/* Check whether cpus belong to parent ctrl group */
>  	cpumask_andnot(tmpmask, newmask, &prgrp->cpu_mask);
> -	if (cpumask_weight(tmpmask)) {
> +	if (!cpumask_empty(tmpmask)) {
>  		rdt_last_cmd_puts("Can only add CPUs to mongroup that belong to parent\n");
>  		return -EINVAL;
>  	}
>  
>  	/* Check whether cpus are dropped from this group */
>  	cpumask_andnot(tmpmask, &rdtgrp->cpu_mask, newmask);
> -	if (cpumask_weight(tmpmask)) {
> +	if (!cpumask_empty(tmpmask)) {
>  		/* Give any dropped cpus to parent rdtgroup */
>  		cpumask_or(&prgrp->cpu_mask, &prgrp->cpu_mask, tmpmask);
>  		update_closid_rmid(tmpmask, prgrp);
> @@ -359,7 +359,7 @@ static int cpus_mon_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask,
>  	 * and update per-cpu rmid
>  	 */
>  	cpumask_andnot(tmpmask, newmask, &rdtgrp->cpu_mask);
> -	if (cpumask_weight(tmpmask)) {
> +	if (!cpumask_empty(tmpmask)) {
>  		head = &prgrp->mon.crdtgrp_list;
>  		list_for_each_entry(crgrp, head, mon.crdtgrp_list) {
>  			if (crgrp == rdtgrp)
> @@ -394,7 +394,7 @@ static int cpus_ctrl_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask,
>  
>  	/* Check whether cpus are dropped from this group */
>  	cpumask_andnot(tmpmask, &rdtgrp->cpu_mask, newmask);
> -	if (cpumask_weight(tmpmask)) {
> +	if (!cpumask_empty(tmpmask)) {
>  		/* Can't drop from default group */
>  		if (rdtgrp == &rdtgroup_default) {
>  			rdt_last_cmd_puts("Can't drop CPUs from default group\n");
> @@ -413,12 +413,12 @@ static int cpus_ctrl_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask,
>  	 * and update per-cpu closid/rmid.
>  	 */
>  	cpumask_andnot(tmpmask, newmask, &rdtgrp->cpu_mask);
> -	if (cpumask_weight(tmpmask)) {
> +	if (!cpumask_empty(tmpmask)) {
>  		list_for_each_entry(r, &rdt_all_groups, rdtgroup_list) {
>  			if (r == rdtgrp)
>  				continue;
>  			cpumask_and(tmpmask1, &r->cpu_mask, tmpmask);
> -			if (cpumask_weight(tmpmask1))
> +			if (!cpumask_empty(tmpmask1))
>  				cpumask_rdtgrp_clear(r, tmpmask1);
>  		}
>  		update_closid_rmid(tmpmask, rdtgrp);
> @@ -488,7 +488,7 @@ static ssize_t rdtgroup_cpus_write(struct kernfs_open_file *of,
>  
>  	/* check that user didn't specify any offline cpus */
>  	cpumask_andnot(tmpmask, newmask, cpu_online_mask);
> -	if (cpumask_weight(tmpmask)) {
> +	if (!cpumask_empty(tmpmask)) {
>  		ret = -EINVAL;
>  		rdt_last_cmd_puts("Can only assign online CPUs\n");
>  		goto unlock;
> diff --git a/arch/x86/mm/mmio-mod.c b/arch/x86/mm/mmio-mod.c
> index 933a2ebad471..c3317f0650d8 100644
> --- a/arch/x86/mm/mmio-mod.c
> +++ b/arch/x86/mm/mmio-mod.c
> @@ -400,7 +400,7 @@ static void leave_uniprocessor(void)
>  	int cpu;
>  	int err;
>  
> -	if (!cpumask_available(downed_cpus) || cpumask_weight(downed_cpus) == 0)
> +	if (!cpumask_available(downed_cpus) || cpumask_empty(downed_cpus))
>  		return;
>  	pr_notice("Re-enabling CPUs...\n");
>  	for_each_cpu(cpu, downed_cpus) {
> diff --git a/arch/x86/platform/uv/uv_nmi.c b/arch/x86/platform/uv/uv_nmi.c
> index 1e9ff28bc2e0..ea277fc08357 100644
> --- a/arch/x86/platform/uv/uv_nmi.c
> +++ b/arch/x86/platform/uv/uv_nmi.c
> @@ -985,7 +985,7 @@ static int uv_handle_nmi(unsigned int reason, struct pt_regs *regs)
>  
>  	/* Clear global flags */
>  	if (master) {
> -		if (cpumask_weight(uv_nmi_cpu_mask))
> +		if (!cpumask_empty(uv_nmi_cpu_mask))
>  			uv_nmi_cleanup_mask();
>  		atomic_set(&uv_nmi_cpus_in_nmi, -1);
>  		atomic_set(&uv_nmi_cpu, -1);
> -- 
> 2.30.2
> 

-- 
Steve Wahl, Hewlett Packard Enterprise

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [Nouveau] [PATCH 15/54] arch/x86: replace cpumask_weight with cpumask_empty where appropriate
@ 2022-01-27 17:20     ` Steve Wahl
  0 siblings, 0 replies; 117+ messages in thread
From: Steve Wahl @ 2022-01-27 17:20 UTC (permalink / raw)
  To: Yury Norov
  Cc: Emil Renner Berthing, Russ Anderson, Steve Wahl, Peter Zijlstra,
	nouveau, Dave Hansen, Rasmus Villemoes, H. Peter Anvin,
	Andy Shevchenko, x86, Mike Travis, Ingo Molnar, Dennis Zhou,
	platform-driver-x86, Fenghua Yu, Alexey Klimov, Steven Rostedt,
	Michał Mirosław, Pekka Paalanen, Borislav Petkov,
	Nicholas Piggin, Andy Lutomirski, Darren Hart, Thomas Gleixner,
	Reinette Chatre, Dimitri Sivanich, Greg Kroah-Hartman,
	Matti Vaittinen, linux-kernel, David Laight, Joe Perches,
	Andrew Morton, Andy Shevchenko

Reviewed-by: Steve Wahl <steve.wahl@hpe.com>

On Sun, Jan 23, 2022 at 10:38:46AM -0800, Yury Norov wrote:
> In some cases, arch/x86 code calls cpumask_weight() to check if any bit of
> a given cpumask is set. We can do it more efficiently with cpumask_empty()
> because cpumask_empty() stops traversing the cpumask as soon as it finds
> first set bit, while cpumask_weight() counts all bits unconditionally.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>  arch/x86/kernel/cpu/resctrl/rdtgroup.c | 14 +++++++-------
>  arch/x86/mm/mmio-mod.c                 |  2 +-
>  arch/x86/platform/uv/uv_nmi.c          |  2 +-
>  3 files changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index b57b3db9a6a7..e23ff03290b8 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -341,14 +341,14 @@ static int cpus_mon_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask,
>  
>  	/* Check whether cpus belong to parent ctrl group */
>  	cpumask_andnot(tmpmask, newmask, &prgrp->cpu_mask);
> -	if (cpumask_weight(tmpmask)) {
> +	if (!cpumask_empty(tmpmask)) {
>  		rdt_last_cmd_puts("Can only add CPUs to mongroup that belong to parent\n");
>  		return -EINVAL;
>  	}
>  
>  	/* Check whether cpus are dropped from this group */
>  	cpumask_andnot(tmpmask, &rdtgrp->cpu_mask, newmask);
> -	if (cpumask_weight(tmpmask)) {
> +	if (!cpumask_empty(tmpmask)) {
>  		/* Give any dropped cpus to parent rdtgroup */
>  		cpumask_or(&prgrp->cpu_mask, &prgrp->cpu_mask, tmpmask);
>  		update_closid_rmid(tmpmask, prgrp);
> @@ -359,7 +359,7 @@ static int cpus_mon_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask,
>  	 * and update per-cpu rmid
>  	 */
>  	cpumask_andnot(tmpmask, newmask, &rdtgrp->cpu_mask);
> -	if (cpumask_weight(tmpmask)) {
> +	if (!cpumask_empty(tmpmask)) {
>  		head = &prgrp->mon.crdtgrp_list;
>  		list_for_each_entry(crgrp, head, mon.crdtgrp_list) {
>  			if (crgrp == rdtgrp)
> @@ -394,7 +394,7 @@ static int cpus_ctrl_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask,
>  
>  	/* Check whether cpus are dropped from this group */
>  	cpumask_andnot(tmpmask, &rdtgrp->cpu_mask, newmask);
> -	if (cpumask_weight(tmpmask)) {
> +	if (!cpumask_empty(tmpmask)) {
>  		/* Can't drop from default group */
>  		if (rdtgrp == &rdtgroup_default) {
>  			rdt_last_cmd_puts("Can't drop CPUs from default group\n");
> @@ -413,12 +413,12 @@ static int cpus_ctrl_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask,
>  	 * and update per-cpu closid/rmid.
>  	 */
>  	cpumask_andnot(tmpmask, newmask, &rdtgrp->cpu_mask);
> -	if (cpumask_weight(tmpmask)) {
> +	if (!cpumask_empty(tmpmask)) {
>  		list_for_each_entry(r, &rdt_all_groups, rdtgroup_list) {
>  			if (r == rdtgrp)
>  				continue;
>  			cpumask_and(tmpmask1, &r->cpu_mask, tmpmask);
> -			if (cpumask_weight(tmpmask1))
> +			if (!cpumask_empty(tmpmask1))
>  				cpumask_rdtgrp_clear(r, tmpmask1);
>  		}
>  		update_closid_rmid(tmpmask, rdtgrp);
> @@ -488,7 +488,7 @@ static ssize_t rdtgroup_cpus_write(struct kernfs_open_file *of,
>  
>  	/* check that user didn't specify any offline cpus */
>  	cpumask_andnot(tmpmask, newmask, cpu_online_mask);
> -	if (cpumask_weight(tmpmask)) {
> +	if (!cpumask_empty(tmpmask)) {
>  		ret = -EINVAL;
>  		rdt_last_cmd_puts("Can only assign online CPUs\n");
>  		goto unlock;
> diff --git a/arch/x86/mm/mmio-mod.c b/arch/x86/mm/mmio-mod.c
> index 933a2ebad471..c3317f0650d8 100644
> --- a/arch/x86/mm/mmio-mod.c
> +++ b/arch/x86/mm/mmio-mod.c
> @@ -400,7 +400,7 @@ static void leave_uniprocessor(void)
>  	int cpu;
>  	int err;
>  
> -	if (!cpumask_available(downed_cpus) || cpumask_weight(downed_cpus) == 0)
> +	if (!cpumask_available(downed_cpus) || cpumask_empty(downed_cpus))
>  		return;
>  	pr_notice("Re-enabling CPUs...\n");
>  	for_each_cpu(cpu, downed_cpus) {
> diff --git a/arch/x86/platform/uv/uv_nmi.c b/arch/x86/platform/uv/uv_nmi.c
> index 1e9ff28bc2e0..ea277fc08357 100644
> --- a/arch/x86/platform/uv/uv_nmi.c
> +++ b/arch/x86/platform/uv/uv_nmi.c
> @@ -985,7 +985,7 @@ static int uv_handle_nmi(unsigned int reason, struct pt_regs *regs)
>  
>  	/* Clear global flags */
>  	if (master) {
> -		if (cpumask_weight(uv_nmi_cpu_mask))
> +		if (!cpumask_empty(uv_nmi_cpu_mask))
>  			uv_nmi_cleanup_mask();
>  		atomic_set(&uv_nmi_cpus_in_nmi, -1);
>  		atomic_set(&uv_nmi_cpu, -1);
> -- 
> 2.30.2
> 

-- 
Steve Wahl, Hewlett Packard Enterprise

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage
  2022-01-26  7:30 ` [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Vaittinen, Matti
@ 2022-01-27 17:44   ` Yury Norov
  0 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-01-27 17:44 UTC (permalink / raw)
  To: Vaittinen, Matti
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Alexey Klimov, linux-kernel

On Tue, Jan 25, 2022 at 11:30 PM Vaittinen, Matti
<Matti.Vaittinen@fi.rohmeurope.com> wrote:
>
> On 1/23/22 20:38, Yury Norov wrote:
> > In many cases people use bitmap_weight()-based functions to compare
> > the result against a number of expression:
> >
> >       if (cpumask_weight(mask) > 1)
> >               do_something();
> >
> > This may take considerable amount of time on many-cpus machines because
> > cpumask_weight() will traverse every word of underlying cpumask
> > unconditionally.
> >
> > We can significantly improve on it for many real cases if stop traversing
> > the mask as soon as we count cpus to any number greater than 1:
> >
> >       if (cpumask_weight_gt(mask, 1))
> >               do_something();
>
> I guess I am part of the recipient list because I did the original
> suggestion of adding the single_bit_set()?

Yes, because of single_bit_set()

> If this is the case - well, I do like this series. Overall it looks good
> to me - but I for sure did not go through all the changes in detail ;)
> If there is some other reason to loop me in (Eg, if someone expects me
> to take a more specific look on something) - please give me a nudge.

The key patch of the series is #27: "lib/bitmap: add bitmap_weight_{cmp, eq,
gt, ge, lt, le} functions"

Feel free to add suggested/reviewed (or whatever you find appropriate) tags
if you want.

Thanks,
Yury

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 21/54] kernel: replace cpumask_weight with cpumask_empty in padata.c
  2022-01-23 18:38 ` [PATCH 21/54] kernel: replace cpumask_weight with cpumask_empty in padata.c Yury Norov
@ 2022-01-28  6:29   ` Herbert Xu
  0 siblings, 0 replies; 117+ messages in thread
From: Herbert Xu @ 2022-01-28  6:29 UTC (permalink / raw)
  To: Yury Norov
  Cc: yury.norov, andriy.shevchenko, linux, akpm, mirq-linux, gregkh,
	peterz, David.Laight, joe, dennis, kernel, npiggin,
	matti.vaittinen, aklimov, linux-kernel, steffen.klassert,
	daniel.m.jordan, linux-crypto

Yury Norov <yury.norov@gmail.com> wrote:
> padata_do_parallel() calls cpumask_weight() to check if any bit of a
> given cpumask is set. We can do it more efficiently with cpumask_empty()
> because cpumask_empty() stops traversing the cpumask as soon as it finds
> first set bit, while cpumask_weight() counts all bits unconditionally.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
> kernel/padata.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)

Patch applied.  Thanks.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 27/54] lib/bitmap: add bitmap_weight_{cmp, eq, gt, ge, lt, le} functions
  2022-01-23 18:38 ` [PATCH 27/54] lib/bitmap: add bitmap_weight_{cmp, eq, gt, ge, lt, le} functions Yury Norov
  2022-01-24 12:41   ` Andy Shevchenko
@ 2022-01-28  6:59   ` Vaittinen, Matti
  1 sibling, 0 replies; 117+ messages in thread
From: Vaittinen, Matti @ 2022-01-28  6:59 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Alexey Klimov, linux-kernel

On 1/23/22 20:38, Yury Norov wrote:
> Many kernel users use bitmap_weight() to compare the result against
> some number or expression:
> 
> 	if (bitmap_weight(...) > 1)
> 		do_something();
> 
> It works OK, but may be significantly improved for large bitmaps: if
> first few words count set bits to a number greater than given, we can
> stop counting and immediately return.
> 
> The same idea would work in other direction: if we know that the number
> of set bits that we counted so far is small enough, so that it would be
> smaller than required number even if all bits of the rest of the bitmap
> are set, we can stop counting earlier.
> 
> This patch adds new bitmap_weight_cmp() as suggested by Michał Mirosław
> and a family of eq, gt, ge, lt and le wrappers to allow this optimization.
> The following patches apply new functions where appropriate.
> 

Thanks for pushing this improvement Yury. Seeing how much this has 
evolved from the single_bit_set() suggestion - it'd be a bit thick from 
me to add a suggested-by ;) I did review it though and it looks good to me!

Reviewed-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com>

> Suggested-by: "Michał Mirosław" <mirq-linux@rere.qmqm.pl> (for bitmap_weight_cmp)
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>   include/linux/bitmap.h | 80 ++++++++++++++++++++++++++++++++++++++++++
>   lib/bitmap.c           | 21 +++++++++++
>   2 files changed, 101 insertions(+)


-- 
The Linux Kernel guy at ROHM Semiconductors

Matti Vaittinen, Linux device drivers
ROHM Semiconductors, Finland SWDC
Kiviharjunlenkki 1E
90220 OULU
FINLAND

~~ this year is the year of a signature writers block ~~

^ permalink raw reply	[flat|nested] 117+ messages in thread

* RE: [PATCH 43/54] drivers/hv: replace cpumask_weight with cpumask_weight_eq
  2022-01-27 15:02     ` Michael Kelley (LINUX)
@ 2022-01-28  9:31       ` Vitaly Kuznetsov
  0 siblings, 0 replies; 117+ messages in thread
From: Vitaly Kuznetsov @ 2022-01-28  9:31 UTC (permalink / raw)
  To: Michael Kelley (LINUX)
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	KY Srinivasan, Haiyang Zhang, Stephen Hemminger, Wei Liu,
	Dexuan Cui, linux-hyperv, Yury Norov

"Michael Kelley (LINUX)" <mikelley@microsoft.com> writes:

> From: Vitaly Kuznetsov <vkuznets@redhat.com> Sent: Monday, January 24, 2022 1:20 AM
>> 
>> Yury Norov <yury.norov@gmail.com> writes:
>> 
...
>> >
>> > diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
>> > index 60375879612f..7420a5fd47b5 100644
>> > --- a/drivers/hv/channel_mgmt.c
>> > +++ b/drivers/hv/channel_mgmt.c
>> > @@ -762,8 +762,8 @@ static void init_vp_index(struct vmbus_channel *channel)
>> >  		}
>> >  		alloced_mask = &hv_context.hv_numa_map[numa_node];
>> >
>> > -		if (cpumask_weight(alloced_mask) ==
>> > -		    cpumask_weight(cpumask_of_node(numa_node))) {
>> > +		if (cpumask_weight_eq(alloced_mask,
>> > +			    cpumask_weight(cpumask_of_node(numa_node)))) {
>> 
>> This code is not performace critical and I prefer the old version:
>> 
>>  	cpumask_weight() == cpumask_weight()
>> 
>>  looks better than
>> 
>> 	cpumask_weight_eq(..., cpumask_weight())
>> 
>> (let alone the inner cpumask_of_node()) to me.
>> 
>> >  			/*
>> >  			 * We have cycled through all the CPUs in the node;
>> >  			 * reset the alloced map.
>> 
> I agree with Vitaly in preferring the old version, and indeed performance
> here is a shrug.  But actually, I think the old version is a poorly coded way
> to determine if the two cpumasks are equal. The following would correctly
> capture the intent:
>
> 	if (cpumask_equal(alloced_mask, cpumask_of_node(numa_node))
>

Indeed. While it seems that only CPUs from 'cpumask_of_node(numa_node)'
can be set in 'alloced_mask' (and thus the comparison is valid), there's
no real need to weigh anything. I'll send a patch.

-- 
Vitaly


^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 29/54] drivers/iio: replace bitmap_weight() with bitmap_weight_{eq,gt} where appropriate
  2022-01-24 12:46   ` Andy Shevchenko
@ 2022-01-30 12:54     ` Jonathan Cameron
  0 siblings, 0 replies; 117+ messages in thread
From: Jonathan Cameron @ 2022-01-30 12:54 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Yury Norov, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Lars-Peter Clausen, Nathan Chancellor, Alexandru Ardelean,
	linux-iio

On Mon, 24 Jan 2022 14:46:43 +0200
Andy Shevchenko <andriy.shevchenko@linux.intel.com> wrote:

> On Sun, Jan 23, 2022 at 10:39:00AM -0800, Yury Norov wrote:
> > drivers/iio calls bitmap_weight() to compare the weight of bitmap with
> > a given number. We can do it more efficiently with bitmap_weight_{eq, gt}
> > because conditional bitmap_weight may stop traversing the bitmap earlier,
> > as soon as condition is met.  
> 
> ...
> 
> >  		int i, j;
> >  
> >  		for (i = 0, j = 0;
> > -		     i < bitmap_weight(indio_dev->active_scan_mask,
> > -				       indio_dev->masklength);
> > +		     bitmap_weight_gt(indio_dev->active_scan_mask,
> > +				       indio_dev->masklength, i);
> >  		     i++, j++) {
> >  			j = find_next_bit(indio_dev->active_scan_mask,
> >  					  indio_dev->masklength, j);  
> 
> This smells like room for improvement. Have you checked this deeply?
> 

I have no idea what I was smoking that day.
It was near 10 years ago, so I'll blame my younger self ;)

Jonathan

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 30/54] drivers/memstick: replace bitmap_weight with bitmap_weight_eq where appropriate
  2022-01-23 18:39 ` [PATCH 30/54] drivers/memstick: replace bitmap_weight with bitmap_weight_eq " Yury Norov
@ 2022-01-31 16:07   ` Ulf Hansson
  0 siblings, 0 replies; 117+ messages in thread
From: Ulf Hansson @ 2022-01-31 16:07 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Maxim Levitsky, Alex Dubov, Jens Axboe, Luis Chamberlain,
	Colin Ian King, Arnd Bergmann, Shubhankar Kuranagatti, linux-mmc

On Sun, 23 Jan 2022 at 19:41, Yury Norov <yury.norov@gmail.com> wrote:
>
> msb_validate_used_block_bitmap() calls bitmap_weight() to compare the
> weight of bitmap with a given number. We can do it more efficiently with
> bitmap_weight_eq because conditional bitmap_weight may stop traversing the
> bitmap earlier, as soon as condition is met.
>
> Signed-off-by: Yury Norov <yury.norov@gmail.com>

Acked-by: Ulf Hansson <ulf.hansson@linaro.org>

Kind regards
Uffe

> ---
>  drivers/memstick/core/ms_block.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/memstick/core/ms_block.c b/drivers/memstick/core/ms_block.c
> index 0cda6c6baefc..5cdd987e78f7 100644
> --- a/drivers/memstick/core/ms_block.c
> +++ b/drivers/memstick/core/ms_block.c
> @@ -155,8 +155,8 @@ static int msb_validate_used_block_bitmap(struct msb_data *msb)
>         for (i = 0; i < msb->zone_count; i++)
>                 total_free_blocks += msb->free_block_count[i];
>
> -       if (msb->block_count - bitmap_weight(msb->used_blocks_bitmap,
> -                                       msb->block_count) == total_free_blocks)
> +       if (bitmap_weight_eq(msb->used_blocks_bitmap, msb->block_count,
> +                               msb->block_count - total_free_blocks))
>                 return 0;
>
>         pr_err("BUG: free block counts don't match the bitmap");
> --
> 2.30.2
>

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 03/54] thermal/intel: don't use bitmap_weight() in end_power_clamp()
  2022-01-23 18:38 ` [PATCH 03/54] thermal/intel: don't use bitmap_weight() in end_power_clamp() Yury Norov
@ 2022-02-04 18:29   ` Rafael J. Wysocki
  0 siblings, 0 replies; 117+ messages in thread
From: Rafael J. Wysocki @ 2022-02-04 18:29 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov,
	Linux Kernel Mailing List, Rafael J. Wysocki, Daniel Lezcano,
	Amit Kucheria, Zhang Rui, Sebastian Andrzej Siewior,
	Christophe JAILLET, Rikard Falkeborn, Linux PM

On Sun, Jan 23, 2022 at 7:39 PM Yury Norov <yury.norov@gmail.com> wrote:
>
> Don't call bitmap_weight() if the following code can get by
> without it.
>
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>  drivers/thermal/intel/intel_powerclamp.c | 9 +++------
>  1 file changed, 3 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/thermal/intel/intel_powerclamp.c b/drivers/thermal/intel/intel_powerclamp.c
> index 14256421d98c..c841ab37e7c6 100644
> --- a/drivers/thermal/intel/intel_powerclamp.c
> +++ b/drivers/thermal/intel/intel_powerclamp.c
> @@ -556,12 +556,9 @@ static void end_power_clamp(void)
>          * stop faster.
>          */
>         clamping = false;
> -       if (bitmap_weight(cpu_clamping_mask, num_possible_cpus())) {
> -               for_each_set_bit(i, cpu_clamping_mask, num_possible_cpus()) {
> -                       pr_debug("clamping worker for cpu %d alive, destroy\n",
> -                                i);
> -                       stop_power_clamp_worker(i);
> -               }
> +       for_each_set_bit(i, cpu_clamping_mask, num_possible_cpus()) {
> +               pr_debug("clamping worker for cpu %d alive, destroy\n", i);
> +               stop_power_clamp_worker(i);
>         }
>  }
>
> --

Applied as 5.18 material, thanks!

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 16/54] cpufreq: replace cpumask_weight with cpumask_empty where appropriate
  2022-01-23 18:38   ` Yury Norov
@ 2022-02-09  6:15     ` Viresh Kumar
  -1 siblings, 0 replies; 117+ messages in thread
From: Viresh Kumar @ 2022-02-09  6:15 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Andy Gross, Bjorn Andersson, Rafael J. Wysocki, Sudeep Holla,
	Cristian Marussi, linux-arm-msm, linux-pm, linux-arm-kernel

On 23-01-22, 10:38, Yury Norov wrote:
> drivers/cpufreq calls cpumask_weight() to check if any bit of a given
> cpumask is set. We can do it more efficiently with cpumask_empty() because
> cpumask_empty() stops traversing the cpumask as soon as it finds first set
> bit, while cpumask_weight() counts all bits unconditionally.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>  drivers/cpufreq/qcom-cpufreq-hw.c | 2 +-
>  drivers/cpufreq/scmi-cpufreq.c    | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)

Applied. Thanks.

-- 
viresh

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 16/54] cpufreq: replace cpumask_weight with cpumask_empty where appropriate
@ 2022-02-09  6:15     ` Viresh Kumar
  0 siblings, 0 replies; 117+ messages in thread
From: Viresh Kumar @ 2022-02-09  6:15 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Andy Gross, Bjorn Andersson, Rafael J. Wysocki, Sudeep Holla,
	Cristian Marussi, linux-arm-msm, linux-pm, linux-arm-kernel

On 23-01-22, 10:38, Yury Norov wrote:
> drivers/cpufreq calls cpumask_weight() to check if any bit of a given
> cpumask is set. We can do it more efficiently with cpumask_empty() because
> cpumask_empty() stops traversing the cpumask as soon as it finds first set
> bit, while cpumask_weight() counts all bits unconditionally.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>  drivers/cpufreq/qcom-cpufreq-hw.c | 2 +-
>  drivers/cpufreq/scmi-cpufreq.c    | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)

Applied. Thanks.

-- 
viresh

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 33/54] net: ethernet: replace bitmap_weight with bitmap_weight_{eq,gt,ge,lt,le} for mellanox
  2022-01-24 12:48   ` Andy Shevchenko
@ 2022-02-09  6:46     ` Yury Norov
  0 siblings, 0 replies; 117+ messages in thread
From: Yury Norov @ 2022-02-09  6:46 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Rasmus Villemoes, Andrew Morton, Michał Mirosław,
	Greg Kroah-Hartman, Peter Zijlstra, David Laight, Joe Perches,
	Dennis Zhou, Emil Renner Berthing, Nicholas Piggin,
	Matti Vaittinen, Alexey Klimov, linux-kernel, Sunil Goutham,
	Geetha sowjanya, Subbaraya Sundeep, hariprasad, David S. Miller,
	Jakub Kicinski, netdev

On Mon, Jan 24, 2022 at 02:48:12PM +0200, Andy Shevchenko wrote:
> On Sun, Jan 23, 2022 at 10:39:04AM -0800, Yury Norov wrote:
> > Mellanox code uses bitmap_weight() to compare the weight of bitmap with
> > a given number. We can do it more efficiently with bitmap_weight_{eq, ...}
> > because conditional bitmap_weight may stop traversing the bitmap earlier,
> > as soon as condition is met.
> 
> > -	if (port <= 0 || port > m)
> > +	if (port <= 0 || bitmap_weight_lt(actv_ports.ports, dev->caps.num_ports, port))
> >  		return -EINVAL;
> 
> Can we eliminate now the port <= 0 check? Or at least make it port == 0?

The port is a parameter of exported function. I'd rather not take this risk.
Even if it makes sense, it should be a separate patch anyways.

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH 12/54] tools/perf: replace bitmap_weight with bitmap_empty where appropriate
  2022-01-23 18:38 ` [PATCH 12/54] tools/perf: " Yury Norov
@ 2022-02-15 14:34   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 117+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-02-15 14:34 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Michał Mirosław, Greg Kroah-Hartman, Peter Zijlstra,
	David Laight, Joe Perches, Dennis Zhou, Emil Renner Berthing,
	Nicholas Piggin, Matti Vaittinen, Alexey Klimov, linux-kernel,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, linux-perf-users

Em Sun, Jan 23, 2022 at 10:38:43AM -0800, Yury Norov escreveu:
> Some code in builtin-c2c.c calls bitmap_weight() to check if any bit of
> a given bitmap is set. It's better to use bitmap_empty() in that case
> because bitmap_empty() stops traversing the bitmap as soon as it finds
> first set bit, while bitmap_weight() counts all bits unconditionally.

Thanks, applied.

- Arnaldo

 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>  tools/perf/builtin-c2c.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
> index 77dd4afacca4..14f787c67140 100644
> --- a/tools/perf/builtin-c2c.c
> +++ b/tools/perf/builtin-c2c.c
> @@ -1080,7 +1080,7 @@ node_entry(struct perf_hpp_fmt *fmt __maybe_unused, struct perf_hpp *hpp,
>  		bitmap_zero(set, c2c.cpus_cnt);
>  		bitmap_and(set, c2c_he->cpuset, c2c.nodes[node], c2c.cpus_cnt);
>  
> -		if (!bitmap_weight(set, c2c.cpus_cnt)) {
> +		if (bitmap_empty(set, c2c.cpus_cnt)) {
>  			if (c2c.node_info == 1) {
>  				ret = scnprintf(hpp->buf, hpp->size, "%21s", " ");
>  				advance_hpp(hpp, ret);
> @@ -1944,7 +1944,7 @@ static int set_nodestr(struct c2c_hist_entry *c2c_he)
>  	if (c2c_he->nodestr)
>  		return 0;
>  
> -	if (bitmap_weight(c2c_he->nodeset, c2c.nodes_cnt)) {
> +	if (!bitmap_empty(c2c_he->nodeset, c2c.nodes_cnt)) {
>  		len = bitmap_scnprintf(c2c_he->nodeset, c2c.nodes_cnt,
>  				      buf, sizeof(buf));
>  	} else {
> -- 
> 2.30.2

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 117+ messages in thread

end of thread, other threads:[~2022-02-15 14:34 UTC | newest]

Thread overview: 117+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-23 18:38 [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Yury Norov
2022-01-23 18:38 ` [PATCH 01/54] net/dsa: don't use bitmap_weight() in b53_arl_read() Yury Norov
2022-01-24  3:11   ` Florian Fainelli
2022-01-23 18:38 ` [PATCH 02/54] net/ethernet: don't use bitmap_weight() in bcm_sysport_rule_set() Yury Norov
2022-01-24  3:11   ` Florian Fainelli
2022-01-23 18:38 ` [PATCH 03/54] thermal/intel: don't use bitmap_weight() in end_power_clamp() Yury Norov
2022-02-04 18:29   ` Rafael J. Wysocki
2022-01-23 18:38 ` [PATCH 04/54] net: mellanox: fix open-coded for_each_set_bit() Yury Norov
2022-01-26  9:01   ` Tariq Toukan
2022-01-23 18:38 ` [PATCH 05/54] nds32: perf: replace bitmap_weight with bitmap_empty where appropriate Yury Norov
2022-01-23 18:38 ` [PATCH 06/54] x86/kvm: " Yury Norov
2022-01-24  9:11   ` Vitaly Kuznetsov
2022-01-23 18:38 ` [PATCH 07/54] gpu: drm: " Yury Norov
2022-01-23 18:38 ` [PATCH 08/54] net: ethernet: replace bitmap_weight with bitmap_empty for intel Yury Norov
2022-01-23 18:38   ` [Intel-wired-lan] " Yury Norov
2022-01-23 18:38 ` [PATCH 09/54] net: ethernet: replace bitmap_weight with bitmap_empty for Marvell Yury Norov
2022-01-23 18:38 ` [PATCH 10/54] net: ethernet: replace bitmap_weight with bitmap_empty for qlogic Yury Norov
2022-01-24 12:28   ` Andy Shevchenko
2022-01-25 21:09     ` Yury Norov
2022-01-25 22:14       ` David Laight
2022-01-25 23:10         ` Yury Norov
2022-01-23 18:38 ` [PATCH 11/54] perf: replace bitmap_weight with bitmap_empty where appropriate Yury Norov
2022-01-23 18:38   ` Yury Norov
2022-01-23 18:38 ` [PATCH 12/54] tools/perf: " Yury Norov
2022-02-15 14:34   ` Arnaldo Carvalho de Melo
2022-01-23 18:38 ` [PATCH 13/54] arch/alpha: replace cpumask_weight with cpumask_empty " Yury Norov
2022-01-23 18:38   ` Yury Norov
2022-01-23 18:38 ` [PATCH 14/54] arch/ia64: " Yury Norov
2022-01-23 18:38   ` Yury Norov
2022-01-23 18:38 ` [PATCH 15/54] arch/x86: " Yury Norov
2022-01-23 18:38   ` [Nouveau] " Yury Norov
2022-01-27 17:20   ` Steve Wahl
2022-01-27 17:20     ` [Nouveau] " Steve Wahl
2022-01-23 18:38 ` [PATCH 16/54] cpufreq: " Yury Norov
2022-01-23 18:38   ` Yury Norov
2022-01-24 10:41   ` Sudeep Holla
2022-01-24 10:41     ` Sudeep Holla
2022-02-09  6:15   ` Viresh Kumar
2022-02-09  6:15     ` Viresh Kumar
2022-01-23 18:38 ` [PATCH 17/54] gpu: drm: " Yury Norov
2022-01-23 18:38   ` [Intel-gfx] " Yury Norov
2022-01-25  9:28   ` Tvrtko Ursulin
2022-01-25  9:28     ` [Intel-gfx] " Tvrtko Ursulin
2022-01-25 18:16     ` Yury Norov
2022-01-25 18:16       ` [Intel-gfx] " Yury Norov
2022-01-25 18:16       ` Yury Norov
2022-01-26  9:43       ` Tvrtko Ursulin
2022-01-26  9:43         ` [Intel-gfx] " Tvrtko Ursulin
2022-01-26  9:43         ` Tvrtko Ursulin
2022-01-23 18:38 ` [PATCH 18/54] drivers/infiniband: " Yury Norov
2022-01-23 19:13   ` Leon Romanovsky
2022-01-23 18:38 ` [PATCH 19/54] drivers/irqchip: " Yury Norov
2022-01-24  3:11   ` Florian Fainelli
2022-01-23 18:38 ` [PATCH 20/54] kernel/irq: " Yury Norov
2022-01-23 18:38 ` [PATCH 21/54] kernel: replace cpumask_weight with cpumask_empty in padata.c Yury Norov
2022-01-28  6:29   ` Herbert Xu
2022-01-23 18:38 ` [PATCH 22/54] rcu: replace cpumask_weight with cpumask_empty where appropriate Yury Norov
2022-01-24  0:07   ` Paul E. McKenney
2022-01-23 18:38 ` [PATCH 23/54] sched: " Yury Norov
2022-01-23 18:38 ` [PATCH 24/54] time: replace cpumask_weight with cpumask_empty in clocksource.c Yury Norov
2022-01-23 18:38 ` [PATCH 25/54] mm/vmstat: replace cpumask_weight with cpumask_empty where appropriate Yury Norov
2022-01-23 18:38 ` [PATCH 26/54] arch/x86: replace nodes_weight with nodes_empty " Yury Norov
2022-01-23 18:38 ` [PATCH 27/54] lib/bitmap: add bitmap_weight_{cmp, eq, gt, ge, lt, le} functions Yury Norov
2022-01-24 12:41   ` Andy Shevchenko
2022-01-24 12:43     ` Andy Shevchenko
2022-01-24 12:49       ` Peter Zijlstra
2022-01-26 15:56     ` Yury Norov
2022-01-28  6:59   ` Vaittinen, Matti
2022-01-23 18:38 ` [PATCH 28/54] arch/x86: replace bitmap_weight with bitmap_weight_{eq,gt,ge,lt,le} where appropriate Yury Norov
2022-01-23 18:39 ` [PATCH 29/54] drivers/iio: replace bitmap_weight() with bitmap_weight_{eq,gt} " Yury Norov
2022-01-24 12:46   ` Andy Shevchenko
2022-01-30 12:54     ` Jonathan Cameron
2022-01-23 18:39 ` [PATCH 30/54] drivers/memstick: replace bitmap_weight with bitmap_weight_eq " Yury Norov
2022-01-31 16:07   ` Ulf Hansson
2022-01-23 18:39 ` [PATCH 31/54] net: ethernet: replace bitmap_weight with bitmap_weight_eq for intel Yury Norov
2022-01-23 18:39   ` [Intel-wired-lan] " Yury Norov
2022-01-23 18:39 ` [PATCH 32/54] net: ethernet: replace bitmap_weight with bitmap_weight_{eq,gt} for OcteonTX2 Yury Norov
2022-01-23 18:39 ` [PATCH 33/54] net: ethernet: replace bitmap_weight with bitmap_weight_{eq,gt,ge,lt,le} for mellanox Yury Norov
2022-01-24 12:48   ` Andy Shevchenko
2022-02-09  6:46     ` Yury Norov
2022-01-23 18:39 ` [PATCH 34/54] perf: replace bitmap_weight with bitmap_weight_eq for ThunderX2 Yury Norov
2022-01-23 18:39   ` Yury Norov
2022-01-23 18:39 ` [PATCH 35/54] drivers/staging: replace bitmap_weight with bitmap_weight_le for tegra-video Yury Norov
2022-01-23 18:39 ` [PATCH 36/54] lib/cpumask: add cpumask_weight_{eq,gt,ge,lt,le} Yury Norov
2022-01-23 18:39 ` [PATCH 37/54] arch/ia64: replace cpumask_weight with cpumask_weight_eq in mm/tlb.c Yury Norov
2022-01-23 18:39   ` Yury Norov
2022-01-23 18:39 ` [PATCH 38/54] arch/mips: replace cpumask_weight with cpumask_weight_{eq, ...} where appropriate Yury Norov
2022-01-25 19:05   ` Thomas Bogendoerfer
2022-01-23 18:39 ` [PATCH 39/54] arch/powerpc: " Yury Norov
2022-01-23 18:39 ` [PATCH 40/54] arch/s390: replace cpumask_weight with cpumask_weight_eq " Yury Norov
2022-01-23 18:39 ` [PATCH 41/54] arch/x86: " Yury Norov
2022-01-24  8:05   ` Peter Zijlstra
2022-01-24  9:16   ` Vitaly Kuznetsov
2022-01-23 18:39 ` [PATCH 42/54] firmware: pcsi: replace cpumask_weight with cpumask_weight_eq Yury Norov
2022-01-23 18:39   ` Yury Norov
2022-01-23 18:39 ` [PATCH 43/54] drivers/hv: " Yury Norov
2022-01-23 22:00   ` Wei Liu
2022-01-23 22:02   ` Haiyang Zhang
2022-01-24  9:20   ` Vitaly Kuznetsov
2022-01-27 15:02     ` Michael Kelley (LINUX)
2022-01-28  9:31       ` Vitaly Kuznetsov
2022-01-23 18:39 ` [PATCH 44/54] infiniband: replace cpumask_weight with cpumask_weight_{eq, ...} where appropriate Yury Norov
2022-01-23 18:39 ` [PATCH 45/54] scsi: replace cpumask_weight with cpumask_weight_gt Yury Norov
2022-01-23 18:39 ` [PATCH 46/54] soc: replace cpumask_weight with cpumask_weight_lt Yury Norov
2022-01-23 18:39   ` Yury Norov
2022-01-23 18:39 ` [PATCH 47/54] sched: replace cpumask_weight with cpumask_weight_eq where appropriate Yury Norov
2022-01-24  8:05   ` Peter Zijlstra
2022-01-23 18:39 ` [PATCH 48/54] kernel/time: " Yury Norov
2022-01-24  8:06   ` Peter Zijlstra
2022-01-23 18:39 ` [PATCH 49/54] lib/nodemask: add nodemask_weight_{eq,gt,ge,lt,le} Yury Norov
2022-01-23 18:39 ` [PATCH 50/54] acpi: replace nodes__weight with nodes_weight_ge for numa Yury Norov
2022-01-23 18:39 ` [PATCH 51/54] mm: replace nodes_weight with nodes_weight_eq in mempolicy Yury Norov
2022-01-23 18:39 ` [PATCH 52/54] lib/nodemask: add num_node_state_eq() Yury Norov
2022-01-23 18:39 ` [PATCH 53/54] tools/bitmap: sync bitmap_weight Yury Norov
2022-01-23 18:39 ` [PATCH 54/54] MAINTAINERS: add cpumask and nodemask files to BITMAP_API Yury Norov
2022-01-26  7:30 ` [PATCH v3 00/54] lib/bitmap: optimize bitmap_weight() usage Vaittinen, Matti
2022-01-27 17:44   ` Yury Norov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.