All of lore.kernel.org
* [PATCHSET v2 wq/for-6.5-cleanup-ordered] workqueue: Ordered workqueue creation cleanup
@ 2023-05-09  1:50 Tejun Heo
  2023-05-09  1:50 ` [PATCH 01/13] scsi: ncr53c8xx: Use default @max_active for hostdata->work_q Tejun Heo
                   ` (12 more replies)
  0 siblings, 13 replies; 45+ messages in thread
From: Tejun Heo @ 2023-05-09  1:50 UTC (permalink / raw)
  To: jiangshanlai; +Cc: linux-kernel, kernel-team

Hello,

v2: Acked patches are applied to wq/for-6.5-cleanup-ordered. Some conversion
    patches were dropped (e.g. because they use WQ_SYSFS and thus can't be
    ordered) or fixed. The final patch to remove the implicit ordered
    promotion logic was broken and could trigger WARN spuriously. Fixed.

When multiple work items are queued to a workqueue, their execution order
doesn't match the queueing order. They may get executed in any order and
simultaneously. When fully serialized execution - one by one in the queueing
order - is needed, an ordered workqueue should be used which can be created
with alloc_ordered_workqueue().
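
For reference, the two allocation styles discussed above look roughly like
this (a minimal sketch against the in-kernel workqueue API; the "example"
name and the flag choices are illustrative, not taken from the patches
below):

```c
#include <linux/workqueue.h>

/* Unordered: queued work items may run concurrently and in any order. */
struct workqueue_struct *wq =
	alloc_workqueue("example", WQ_UNBOUND | WQ_MEM_RECLAIM, 0);

/* Ordered: work items run one at a time, in queueing order. */
struct workqueue_struct *ordered_wq =
	alloc_ordered_workqueue("example", WQ_MEM_RECLAIM);
```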

However, alloc_ordered_workqueue() was a later addition. Before it, an
ordered workqueue could be obtained by creating an UNBOUND workqueue with
@max_active==1. This originally was an implementation side-effect which was
broken by 4c16bd327c74 ("workqueue: implement NUMA affinity for unbound
workqueues"). Because there were users that depended on the ordered
execution, 5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
ordered") made the workqueue allocation path implicitly promote UNBOUND
workqueues w/ @max_active==1 to ordered workqueues.

While this has worked okay, overloading the UNBOUND allocation interface
this way creates other issues. It's difficult to tell whether a given
workqueue actually needs to be ordered, and users that legitimately want a
minimum concurrency level unexpectedly get an ordered workqueue instead.
With planned UNBOUND workqueue updates to improve execution locality and the
growing prevalence of chiplet designs which can benefit from such
improvements, this isn't a state we want to be in forever.

This patch series audits all callsites that create an UNBOUND workqueue w/
@max_active==1 and converts them to alloc_ordered_workqueue() as necessary.
It contains the following 13 patches on top of the
wq/for-6.5-cleanup-ordered branch.

 0001-scsi-ncr53c8xx-Use-default-max_active-for-hostdata-w.patch
 0002-wifi-mwifiex-Use-default-max_active-for-workqueues.patch
 0003-dm-integrity-Use-alloc_ordered_workqueue-to-create-o.patch
 0004-media-amphion-Use-alloc_ordered_workqueue-to-create-.patch
 0005-wifi-ath10-11-12k-Use-alloc_ordered_workqueue-to-cre.patch
 0006-net-wwan-t7xx-Use-alloc_ordered_workqueue-to-create-.patch
 0007-soc-qcom-qmi-Use-alloc_ordered_workqueue-to-create-o.patch
 0008-btrfs-Use-alloc_ordered_workqueue-to-create-ordered-.patch
 0009-net-qrtr-Use-alloc_ordered_workqueue-to-create-order.patch
 0010-rxrpc-Use-alloc_ordered_workqueue-to-create-ordered-.patch
 0011-crypto-octeontx2-Use-alloc_ordered_workqueue-to-crea.patch
 0012-media-coda-Use-alloc_ordered_workqueue-to-create-ord.patch
 0013-workqueue-Don-t-implicitly-make-UNBOUND-workqueues-w.patch

0001-0012 convert the existing users and 0013 drops the implicit ordered
promotion logic from alloc_workqueue(). I'll keep an eye out for a while
after merging 0013. Thankfully, these are pretty easy to grep for. The
patches can also be found in the following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git cleanup-ordered-v2

diffstat follows. Thanks.

 drivers/crypto/marvell/octeontx2/otx2_cptpf_main.c |   12 ++++++------
 drivers/crypto/marvell/octeontx2/otx2_cptvf_main.c |    6 +++---
 drivers/md/dm-integrity.c                          |    4 ++--
 drivers/md/dm.c                                    |    2 +-
 drivers/media/platform/amphion/vpu_core.c          |    2 +-
 drivers/media/platform/amphion/vpu_v4l2.c          |    2 +-
 drivers/media/platform/chips-media/coda-common.c   |    2 +-
 drivers/net/wireless/ath/ath10k/qmi.c              |    3 +--
 drivers/net/wireless/ath/ath11k/qmi.c              |    3 +--
 drivers/net/wireless/ath/ath12k/qmi.c              |    3 +--
 drivers/net/wireless/marvell/mwifiex/cfg80211.c    |    4 ++--
 drivers/net/wireless/marvell/mwifiex/main.c        |    8 ++++----
 drivers/net/wwan/t7xx/t7xx_hif_cldma.c             |   13 +++++++------
 drivers/net/wwan/t7xx/t7xx_hif_dpmaif_tx.c         |    5 +++--
 drivers/scsi/NCR5380.c                             |    2 +-
 drivers/soc/qcom/qmi_interface.c                   |    2 +-
 fs/btrfs/disk-io.c                                 |    2 +-
 fs/btrfs/scrub.c                                   |    6 ++++--
 include/linux/workqueue.h                          |    4 +---
 kernel/workqueue.c                                 |   23 ++++-------------------
 net/qrtr/ns.c                                      |    2 +-
 net/rxrpc/af_rxrpc.c                               |    2 +-
 22 files changed, 48 insertions(+), 64 deletions(-)



* [PATCH 01/13] scsi: ncr53c8xx: Use default @max_active for hostdata->work_q
  2023-05-09  1:50 [PATCHSET v2 wq/for-6.5-cleanup-ordered] workqueue: Ordered workqueue creation cleanup Tejun Heo
@ 2023-05-09  1:50 ` Tejun Heo
  2023-05-21  2:48   ` [PATCH RESEND " Tejun Heo
  2023-05-09  1:50 ` [PATCH 02/13] wifi: mwifiex: Use default @max_active for workqueues Tejun Heo
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 45+ messages in thread
From: Tejun Heo @ 2023-05-09  1:50 UTC (permalink / raw)
  To: jiangshanlai; +Cc: linux-kernel, kernel-team, Tejun Heo

hostdata->work_q only hosts a single work item, hostdata->main_task, and
thus doesn't need an explicit concurrency limit. Let's use the default
@max_active. This doesn't cost anything and clearly expresses that
@max_active doesn't matter.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Finn Thain <fthain@linux-m68k.org>
Cc: Michael Schmitz <schmitzmic@gmail.com>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 drivers/scsi/NCR5380.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/NCR5380.c b/drivers/scsi/NCR5380.c
index ca85bddb582b..cea3a79d538e 100644
--- a/drivers/scsi/NCR5380.c
+++ b/drivers/scsi/NCR5380.c
@@ -417,7 +417,7 @@ static int NCR5380_init(struct Scsi_Host *instance, int flags)
 	INIT_WORK(&hostdata->main_task, NCR5380_main);
 	hostdata->work_q = alloc_workqueue("ncr5380_%d",
 	                        WQ_UNBOUND | WQ_MEM_RECLAIM,
-	                        1, instance->host_no);
+				0, instance->host_no);
 	if (!hostdata->work_q)
 		return -ENOMEM;
 
-- 
2.40.1



* [PATCH 02/13] wifi: mwifiex: Use default @max_active for workqueues
  2023-05-09  1:50 [PATCHSET v2 wq/for-6.5-cleanup-ordered] workqueue: Ordered workqueue creation cleanup Tejun Heo
  2023-05-09  1:50 ` [PATCH 01/13] scsi: ncr53c8xx: Use default @max_active for hostdata->work_q Tejun Heo
@ 2023-05-09  1:50 ` Tejun Heo
  2023-05-10  8:45   ` Kalle Valo
                     ` (2 more replies)
  2023-05-09  1:50   ` [dm-devel] " Tejun Heo
                   ` (10 subsequent siblings)
  12 siblings, 3 replies; 45+ messages in thread
From: Tejun Heo @ 2023-05-09  1:50 UTC (permalink / raw)
  To: jiangshanlai
  Cc: linux-kernel, kernel-team, Tejun Heo, Amitkumar Karwar,
	Ganapathi Bhat, Sharvari Harisangam, Xinming Hu, Kalle Valo,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-wireless, netdev

These workqueues each host only a single work item and thus don't need an
explicit concurrency limit. Let's use the default @max_active. This doesn't
cost anything and clearly expresses that @max_active doesn't matter.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Amitkumar Karwar <amitkarwar@gmail.com>
Cc: Ganapathi Bhat <ganapathi017@gmail.com>
Cc: Sharvari Harisangam <sharvari.harisangam@nxp.com>
Cc: Xinming Hu <huxinming820@gmail.com>
Cc: Kalle Valo <kvalo@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: linux-wireless@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 drivers/net/wireless/marvell/mwifiex/cfg80211.c | 4 ++--
 drivers/net/wireless/marvell/mwifiex/main.c     | 8 ++++----
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/wireless/marvell/mwifiex/cfg80211.c b/drivers/net/wireless/marvell/mwifiex/cfg80211.c
index bcd564dc3554..5337ee4b6f10 100644
--- a/drivers/net/wireless/marvell/mwifiex/cfg80211.c
+++ b/drivers/net/wireless/marvell/mwifiex/cfg80211.c
@@ -3127,7 +3127,7 @@ struct wireless_dev *mwifiex_add_virtual_intf(struct wiphy *wiphy,
 	priv->dfs_cac_workqueue = alloc_workqueue("MWIFIEX_DFS_CAC%s",
 						  WQ_HIGHPRI |
 						  WQ_MEM_RECLAIM |
-						  WQ_UNBOUND, 1, name);
+						  WQ_UNBOUND, 0, name);
 	if (!priv->dfs_cac_workqueue) {
 		mwifiex_dbg(adapter, ERROR, "cannot alloc DFS CAC queue\n");
 		ret = -ENOMEM;
@@ -3138,7 +3138,7 @@ struct wireless_dev *mwifiex_add_virtual_intf(struct wiphy *wiphy,
 
 	priv->dfs_chan_sw_workqueue = alloc_workqueue("MWIFIEX_DFS_CHSW%s",
 						      WQ_HIGHPRI | WQ_UNBOUND |
-						      WQ_MEM_RECLAIM, 1, name);
+						      WQ_MEM_RECLAIM, 0, name);
 	if (!priv->dfs_chan_sw_workqueue) {
 		mwifiex_dbg(adapter, ERROR, "cannot alloc DFS channel sw queue\n");
 		ret = -ENOMEM;
diff --git a/drivers/net/wireless/marvell/mwifiex/main.c b/drivers/net/wireless/marvell/mwifiex/main.c
index ea22a08e6c08..1cd9d20cca16 100644
--- a/drivers/net/wireless/marvell/mwifiex/main.c
+++ b/drivers/net/wireless/marvell/mwifiex/main.c
@@ -1547,7 +1547,7 @@ mwifiex_reinit_sw(struct mwifiex_adapter *adapter)
 
 	adapter->workqueue =
 		alloc_workqueue("MWIFIEX_WORK_QUEUE",
-				WQ_HIGHPRI | WQ_MEM_RECLAIM | WQ_UNBOUND, 1);
+				WQ_HIGHPRI | WQ_MEM_RECLAIM | WQ_UNBOUND, 0);
 	if (!adapter->workqueue)
 		goto err_kmalloc;
 
@@ -1557,7 +1557,7 @@ mwifiex_reinit_sw(struct mwifiex_adapter *adapter)
 		adapter->rx_workqueue = alloc_workqueue("MWIFIEX_RX_WORK_QUEUE",
 							WQ_HIGHPRI |
 							WQ_MEM_RECLAIM |
-							WQ_UNBOUND, 1);
+							WQ_UNBOUND, 0);
 		if (!adapter->rx_workqueue)
 			goto err_kmalloc;
 		INIT_WORK(&adapter->rx_work, mwifiex_rx_work_queue);
@@ -1702,7 +1702,7 @@ mwifiex_add_card(void *card, struct completion *fw_done,
 
 	adapter->workqueue =
 		alloc_workqueue("MWIFIEX_WORK_QUEUE",
-				WQ_HIGHPRI | WQ_MEM_RECLAIM | WQ_UNBOUND, 1);
+				WQ_HIGHPRI | WQ_MEM_RECLAIM | WQ_UNBOUND, 0);
 	if (!adapter->workqueue)
 		goto err_kmalloc;
 
@@ -1712,7 +1712,7 @@ mwifiex_add_card(void *card, struct completion *fw_done,
 		adapter->rx_workqueue = alloc_workqueue("MWIFIEX_RX_WORK_QUEUE",
 							WQ_HIGHPRI |
 							WQ_MEM_RECLAIM |
-							WQ_UNBOUND, 1);
+							WQ_UNBOUND, 0);
 		if (!adapter->rx_workqueue)
 			goto err_kmalloc;
 
-- 
2.40.1



* [PATCH 03/13] dm integrity: Use alloc_ordered_workqueue() to create ordered workqueues
  2023-05-09  1:50 [PATCHSET v2 wq/for-6.5-cleanup-ordered] workqueue: Ordered workqueue creation cleanup Tejun Heo
@ 2023-05-09  1:50   ` Tejun Heo
  2023-05-09  1:50 ` [PATCH 02/13] wifi: mwifiex: Use default @max_active for workqueues Tejun Heo
                     ` (11 subsequent siblings)
  12 siblings, 0 replies; 45+ messages in thread
From: Tejun Heo @ 2023-05-09  1:50 UTC (permalink / raw)
  To: jiangshanlai
  Cc: linux-kernel, kernel-team, Tejun Heo, Alasdair Kergon,
	Mike Snitzer, dm-devel

BACKGROUND
==========

When multiple work items are queued to a workqueue, their execution order
doesn't match the queueing order. They may get executed in any order and
simultaneously. When fully serialized execution - one by one in the queueing
order - is needed, an ordered workqueue should be used which can be created
with alloc_ordered_workqueue().

However, alloc_ordered_workqueue() was a later addition. Before it, an
ordered workqueue could be obtained by creating an UNBOUND workqueue with
@max_active==1. This originally was an implementation side-effect which was
broken by 4c16bd327c74 ("workqueue: implement NUMA affinity for unbound
workqueues"). Because there were users that depended on the ordered
execution, 5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
ordered") made the workqueue allocation path implicitly promote UNBOUND
workqueues w/ @max_active==1 to ordered workqueues.

While this has worked okay, overloading the UNBOUND allocation interface
this way creates other issues. It's difficult to tell whether a given
workqueue actually needs to be ordered, and users that legitimately want a
minimum concurrency level unexpectedly get an ordered workqueue instead.
With planned UNBOUND workqueue updates to improve execution locality and the
growing prevalence of chiplet designs which can benefit from such
improvements, this isn't a state we want to be in forever.

This patch series audits all callsites that create an UNBOUND workqueue w/
@max_active==1 and converts them to alloc_ordered_workqueue() as necessary.

WHAT TO LOOK FOR
================

The conversions are from

  alloc_workqueue(WQ_UNBOUND | flags, 1, args...)

to

  alloc_ordered_workqueue(flags, args...)

which don't cause any functional changes. If you know that fully ordered
execution is not necessary, please let me know. I'll drop the conversion and
instead add a comment noting the fact to reduce confusion while the
conversion is in progress.

If you aren't fully sure, it's completely fine to let the conversion
through. The behavior will stay exactly the same and we can always
reconsider later.

As there are follow-up workqueue core changes, I'd really appreciate if the
patch can be routed through the workqueue tree w/ your acks. Thanks.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Alasdair Kergon <agk@redhat.com>
Cc: Mike Snitzer <snitzer@kernel.org>
Cc: dm-devel@redhat.com
Cc: linux-kernel@vger.kernel.org
---
 drivers/md/dm-integrity.c | 4 ++--
 drivers/md/dm.c           | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c
index 31838b13ea54..63ec502fcb12 100644
--- a/drivers/md/dm-integrity.c
+++ b/drivers/md/dm-integrity.c
@@ -4268,10 +4268,10 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned int argc, char **argv
 	}
 
 	/*
-	 * If this workqueue were percpu, it would cause bio reordering
+	 * If this workqueue weren't ordered, it would cause bio reordering
 	 * and reduced performance.
 	 */
-	ic->wait_wq = alloc_workqueue("dm-integrity-wait", WQ_MEM_RECLAIM | WQ_UNBOUND, 1);
+	ic->wait_wq = alloc_ordered_workqueue("dm-integrity-wait", WQ_MEM_RECLAIM);
 	if (!ic->wait_wq) {
 		ti->error = "Cannot allocate workqueue";
 		r = -ENOMEM;
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 3b694ba3a106..9599d76cc9a9 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -207,7 +207,7 @@ static int __init local_init(void)
 	if (r)
 		return r;
 
-	deferred_remove_workqueue = alloc_workqueue("kdmremove", WQ_UNBOUND, 1);
+	deferred_remove_workqueue = alloc_ordered_workqueue("kdmremove", 0);
 	if (!deferred_remove_workqueue) {
 		r = -ENOMEM;
 		goto out_uevent_exit;
-- 
2.40.1




* [PATCH 04/13] media: amphion: Use alloc_ordered_workqueue() to create ordered workqueues
  2023-05-09  1:50 [PATCHSET v2 wq/for-6.5-cleanup-ordered] workqueue: Ordered workqueue creation cleanup Tejun Heo
                   ` (2 preceding siblings ...)
  2023-05-09  1:50   ` [dm-devel] " Tejun Heo
@ 2023-05-09  1:50 ` Tejun Heo
  2023-05-23 12:19   ` Hans Verkuil
  2023-05-09  1:50 ` [PATCH 05/13] wifi: ath10/11/12k: " Tejun Heo
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 45+ messages in thread
From: Tejun Heo @ 2023-05-09  1:50 UTC (permalink / raw)
  To: jiangshanlai
  Cc: linux-kernel, kernel-team, Tejun Heo, Ming Qian, Shijie Qin,
	Zhou Peng, Mauro Carvalho Chehab, linux-media

BACKGROUND
==========

When multiple work items are queued to a workqueue, their execution order
doesn't match the queueing order. They may get executed in any order and
simultaneously. When fully serialized execution - one by one in the queueing
order - is needed, an ordered workqueue should be used which can be created
with alloc_ordered_workqueue().

However, alloc_ordered_workqueue() was a later addition. Before it, an
ordered workqueue could be obtained by creating an UNBOUND workqueue with
@max_active==1. This originally was an implementation side-effect which was
broken by 4c16bd327c74 ("workqueue: implement NUMA affinity for unbound
workqueues"). Because there were users that depended on the ordered
execution, 5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
ordered") made the workqueue allocation path implicitly promote UNBOUND
workqueues w/ @max_active==1 to ordered workqueues.

While this has worked okay, overloading the UNBOUND allocation interface
this way creates other issues. It's difficult to tell whether a given
workqueue actually needs to be ordered, and users that legitimately want a
minimum concurrency level unexpectedly get an ordered workqueue instead.
With planned UNBOUND workqueue updates to improve execution locality and the
growing prevalence of chiplet designs which can benefit from such
improvements, this isn't a state we want to be in forever.

This patch series audits all callsites that create an UNBOUND workqueue w/
@max_active==1 and converts them to alloc_ordered_workqueue() as necessary.

WHAT TO LOOK FOR
================

The conversions are from

  alloc_workqueue(WQ_UNBOUND | flags, 1, args...)

to

  alloc_ordered_workqueue(flags, args...)

which don't cause any functional changes. If you know that fully ordered
execution is not necessary, please let me know. I'll drop the conversion and
instead add a comment noting the fact to reduce confusion while the
conversion is in progress.

If you aren't fully sure, it's completely fine to let the conversion
through. The behavior will stay exactly the same and we can always
reconsider later.

As there are follow-up workqueue core changes, I'd really appreciate if the
patch can be routed through the workqueue tree w/ your acks. Thanks.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ming Qian <ming.qian@nxp.com>
Cc: Shijie Qin <shijie.qin@nxp.com>
Cc: Zhou Peng <eagle.zhou@nxp.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: linux-media@vger.kernel.org
---
 drivers/media/platform/amphion/vpu_core.c | 2 +-
 drivers/media/platform/amphion/vpu_v4l2.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/media/platform/amphion/vpu_core.c b/drivers/media/platform/amphion/vpu_core.c
index de23627a119a..43d85a54268b 100644
--- a/drivers/media/platform/amphion/vpu_core.c
+++ b/drivers/media/platform/amphion/vpu_core.c
@@ -254,7 +254,7 @@ static int vpu_core_register(struct device *dev, struct vpu_core *core)
 	if (vpu_core_is_exist(vpu, core))
 		return 0;
 
-	core->workqueue = alloc_workqueue("vpu", WQ_UNBOUND | WQ_MEM_RECLAIM, 1);
+	core->workqueue = alloc_ordered_workqueue("vpu", WQ_MEM_RECLAIM);
 	if (!core->workqueue) {
 		dev_err(core->dev, "fail to alloc workqueue\n");
 		return -ENOMEM;
diff --git a/drivers/media/platform/amphion/vpu_v4l2.c b/drivers/media/platform/amphion/vpu_v4l2.c
index 6773b885597c..a48edb445eea 100644
--- a/drivers/media/platform/amphion/vpu_v4l2.c
+++ b/drivers/media/platform/amphion/vpu_v4l2.c
@@ -740,7 +740,7 @@ int vpu_v4l2_open(struct file *file, struct vpu_inst *inst)
 	inst->fh.ctrl_handler = &inst->ctrl_handler;
 	file->private_data = &inst->fh;
 	inst->state = VPU_CODEC_STATE_DEINIT;
-	inst->workqueue = alloc_workqueue("vpu_inst", WQ_UNBOUND | WQ_MEM_RECLAIM, 1);
+	inst->workqueue = alloc_ordered_workqueue("vpu_inst", WQ_MEM_RECLAIM);
 	if (inst->workqueue) {
 		INIT_WORK(&inst->msg_work, vpu_inst_run_work);
 		ret = kfifo_init(&inst->msg_fifo,
-- 
2.40.1



* [PATCH 05/13] wifi: ath10/11/12k: Use alloc_ordered_workqueue() to create ordered workqueues
  2023-05-09  1:50 [PATCHSET v2 wq/for-6.5-cleanup-ordered] workqueue: Ordered workqueue creation cleanup Tejun Heo
                   ` (3 preceding siblings ...)
  2023-05-09  1:50 ` [PATCH 04/13] media: amphion: " Tejun Heo
@ 2023-05-09  1:50 ` Tejun Heo
  2023-05-10  8:41   ` Kalle Valo
  2023-05-19  0:41   ` Tejun Heo
  2023-05-09  1:50 ` [PATCH 06/13] net: wwan: t7xx: " Tejun Heo
                   ` (7 subsequent siblings)
  12 siblings, 2 replies; 45+ messages in thread
From: Tejun Heo @ 2023-05-09  1:50 UTC (permalink / raw)
  To: jiangshanlai
  Cc: linux-kernel, kernel-team, Tejun Heo, Kalle Valo,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-wireless, netdev

BACKGROUND
==========

When multiple work items are queued to a workqueue, their execution order
doesn't match the queueing order. They may get executed in any order and
simultaneously. When fully serialized execution - one by one in the queueing
order - is needed, an ordered workqueue should be used which can be created
with alloc_ordered_workqueue().

However, alloc_ordered_workqueue() was a later addition. Before it, an
ordered workqueue could be obtained by creating an UNBOUND workqueue with
@max_active==1. This originally was an implementation side-effect which was
broken by 4c16bd327c74 ("workqueue: implement NUMA affinity for unbound
workqueues"). Because there were users that depended on the ordered
execution, 5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
ordered") made the workqueue allocation path implicitly promote UNBOUND
workqueues w/ @max_active==1 to ordered workqueues.

While this has worked okay, overloading the UNBOUND allocation interface
this way creates other issues. It's difficult to tell whether a given
workqueue actually needs to be ordered, and users that legitimately want a
minimum concurrency level unexpectedly get an ordered workqueue instead.
With planned UNBOUND workqueue updates to improve execution locality and the
growing prevalence of chiplet designs which can benefit from such
improvements, this isn't a state we want to be in forever.

This patch series audits all callsites that create an UNBOUND workqueue w/
@max_active==1 and converts them to alloc_ordered_workqueue() as necessary.

WHAT TO LOOK FOR
================

The conversions are from

  alloc_workqueue(WQ_UNBOUND | flags, 1, args...)

to

  alloc_ordered_workqueue(flags, args...)

which don't cause any functional changes. If you know that fully ordered
execution is not necessary, please let me know. I'll drop the conversion and
instead add a comment noting the fact to reduce confusion while the
conversion is in progress.

If you aren't fully sure, it's completely fine to let the conversion
through. The behavior will stay exactly the same and we can always
reconsider later.

As there are follow-up workqueue core changes, I'd really appreciate if the
patch can be routed through the workqueue tree w/ your acks. Thanks.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Kalle Valo <kvalo@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: linux-wireless@vger.kernel.org
Cc: netdev@vger.kernel.org
---
 drivers/net/wireless/ath/ath10k/qmi.c | 3 +--
 drivers/net/wireless/ath/ath11k/qmi.c | 3 +--
 drivers/net/wireless/ath/ath12k/qmi.c | 3 +--
 3 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/qmi.c b/drivers/net/wireless/ath/ath10k/qmi.c
index 038c5903c0dc..52c1a3de8da6 100644
--- a/drivers/net/wireless/ath/ath10k/qmi.c
+++ b/drivers/net/wireless/ath/ath10k/qmi.c
@@ -1082,8 +1082,7 @@ int ath10k_qmi_init(struct ath10k *ar, u32 msa_size)
 	if (ret)
 		goto err;
 
-	qmi->event_wq = alloc_workqueue("ath10k_qmi_driver_event",
-					WQ_UNBOUND, 1);
+	qmi->event_wq = alloc_ordered_workqueue("ath10k_qmi_driver_event", 0);
 	if (!qmi->event_wq) {
 		ath10k_err(ar, "failed to allocate workqueue\n");
 		ret = -EFAULT;
diff --git a/drivers/net/wireless/ath/ath11k/qmi.c b/drivers/net/wireless/ath/ath11k/qmi.c
index ab923e24b0a9..26b252e62909 100644
--- a/drivers/net/wireless/ath/ath11k/qmi.c
+++ b/drivers/net/wireless/ath/ath11k/qmi.c
@@ -3256,8 +3256,7 @@ int ath11k_qmi_init_service(struct ath11k_base *ab)
 		return ret;
 	}
 
-	ab->qmi.event_wq = alloc_workqueue("ath11k_qmi_driver_event",
-					   WQ_UNBOUND, 1);
+	ab->qmi.event_wq = alloc_ordered_workqueue("ath11k_qmi_driver_event", 0);
 	if (!ab->qmi.event_wq) {
 		ath11k_err(ab, "failed to allocate workqueue\n");
 		return -EFAULT;
diff --git a/drivers/net/wireless/ath/ath12k/qmi.c b/drivers/net/wireless/ath/ath12k/qmi.c
index 03ba245fbee9..0a7892b1a8f8 100644
--- a/drivers/net/wireless/ath/ath12k/qmi.c
+++ b/drivers/net/wireless/ath/ath12k/qmi.c
@@ -3056,8 +3056,7 @@ int ath12k_qmi_init_service(struct ath12k_base *ab)
 		return ret;
 	}
 
-	ab->qmi.event_wq = alloc_workqueue("ath12k_qmi_driver_event",
-					   WQ_UNBOUND, 1);
+	ab->qmi.event_wq = alloc_ordered_workqueue("ath12k_qmi_driver_event", 0);
 	if (!ab->qmi.event_wq) {
 		ath12k_err(ab, "failed to allocate workqueue\n");
 		return -EFAULT;
-- 
2.40.1



* [PATCH 06/13] net: wwan: t7xx: Use alloc_ordered_workqueue() to create ordered workqueues
  2023-05-09  1:50 [PATCHSET v2 wq/for-6.5-cleanup-ordered] workqueue: Ordered workqueue creation cleanup Tejun Heo
                   ` (4 preceding siblings ...)
  2023-05-09  1:50 ` [PATCH 05/13] wifi: ath10/11/12k: " Tejun Heo
@ 2023-05-09  1:50 ` Tejun Heo
  2023-05-25 22:11   ` Tejun Heo
  2023-05-09  1:50 ` [PATCH 07/13] soc: qcom: qmi: " Tejun Heo
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 45+ messages in thread
From: Tejun Heo @ 2023-05-09  1:50 UTC (permalink / raw)
  To: jiangshanlai
  Cc: linux-kernel, kernel-team, Tejun Heo, Chandrashekar Devegowda,
	Intel Corporation, Chiranjeevi Rapolu, Liu Haijun,
	M Chetan Kumar, Ricardo Martinez, Loic Poulain, Sergey Ryazanov,
	Johannes Berg, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, netdev

BACKGROUND
==========

When multiple work items are queued to a workqueue, their execution order
doesn't match the queueing order. They may get executed in any order and
simultaneously. When fully serialized execution - one by one in the queueing
order - is needed, an ordered workqueue should be used which can be created
with alloc_ordered_workqueue().

However, alloc_ordered_workqueue() was a later addition. Before it, an
ordered workqueue could be obtained by creating an UNBOUND workqueue with
@max_active==1. This originally was an implementation side-effect which was
broken by 4c16bd327c74 ("workqueue: implement NUMA affinity for unbound
workqueues"). Because there were users that depended on the ordered
execution, 5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
ordered") made the workqueue allocation path implicitly promote UNBOUND
workqueues w/ @max_active==1 to ordered workqueues.

While this has worked okay, overloading the UNBOUND allocation interface
this way creates other issues. It's difficult to tell whether a given
workqueue actually needs to be ordered, and users that legitimately want a
minimum concurrency level unexpectedly get an ordered workqueue instead.
With planned UNBOUND workqueue updates to improve execution locality and the
growing prevalence of chiplet designs which can benefit from such
improvements, this isn't a state we want to be in forever.

This patch series audits all callsites that create an UNBOUND workqueue w/
@max_active==1 and converts them to alloc_ordered_workqueue() as necessary.

WHAT TO LOOK FOR
================

The conversions are from

  alloc_workqueue(fmt, WQ_UNBOUND | flags, 1, args...)

to

  alloc_ordered_workqueue(fmt, flags, args...)

which don't cause any functional changes. If you know that fully ordered
execution is not necessary, please let me know. I'll drop the conversion and
instead add a comment noting the fact to reduce confusion while the
conversion is in progress.

If you aren't fully sure, it's completely fine to let the conversion
through. The behavior will stay exactly the same and we can always
reconsider later.
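
For reviewers who want to picture the change outside the diffs, the pattern
looks like the following (an illustrative kernel-code fragment, not
compilable on its own; the workqueue name and the @id format argument are
made up for the example):

```c
/* Before: relies on the implicit UNBOUND + @max_active==1 promotion */
wq = alloc_workqueue("foo_wq_%d", WQ_UNBOUND | WQ_MEM_RECLAIM, 1, id);

/* After: explicitly ordered. WQ_UNBOUND and @max_active==1 are implied
 * by alloc_ordered_workqueue(), so only the extra flags remain.
 */
wq = alloc_ordered_workqueue("foo_wq_%d", WQ_MEM_RECLAIM, id);
```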

As there are follow-up workqueue core changes, I'd really appreciate if the
patch can be routed through the workqueue tree w/ your acks. Thanks.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Chandrashekar Devegowda <chandrashekar.devegowda@intel.com>
Cc: Intel Corporation <linuxwwan@intel.com>
Cc: Chiranjeevi Rapolu <chiranjeevi.rapolu@linux.intel.com>
Cc: Liu Haijun <haijun.liu@mediatek.com>
Cc: M Chetan Kumar <m.chetan.kumar@linux.intel.com>
Cc: Ricardo Martinez <ricardo.martinez@linux.intel.com>
Cc: Loic Poulain <loic.poulain@linaro.org>
Cc: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: netdev@vger.kernel.org
---
 drivers/net/wwan/t7xx/t7xx_hif_cldma.c     | 13 +++++++------
 drivers/net/wwan/t7xx/t7xx_hif_dpmaif_tx.c |  5 +++--
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/net/wwan/t7xx/t7xx_hif_cldma.c b/drivers/net/wwan/t7xx/t7xx_hif_cldma.c
index aec3a18d44bd..7162bf38a8c9 100644
--- a/drivers/net/wwan/t7xx/t7xx_hif_cldma.c
+++ b/drivers/net/wwan/t7xx/t7xx_hif_cldma.c
@@ -1293,9 +1293,9 @@ int t7xx_cldma_init(struct cldma_ctrl *md_ctrl)
 	for (i = 0; i < CLDMA_TXQ_NUM; i++) {
 		md_cd_queue_struct_init(&md_ctrl->txq[i], md_ctrl, MTK_TX, i);
 		md_ctrl->txq[i].worker =
-			alloc_workqueue("md_hif%d_tx%d_worker",
-					WQ_UNBOUND | WQ_MEM_RECLAIM | (i ? 0 : WQ_HIGHPRI),
-					1, md_ctrl->hif_id, i);
+			alloc_ordered_workqueue("md_hif%d_tx%d_worker",
+					WQ_MEM_RECLAIM | (i ? 0 : WQ_HIGHPRI),
+					md_ctrl->hif_id, i);
 		if (!md_ctrl->txq[i].worker)
 			goto err_workqueue;
 
@@ -1306,9 +1306,10 @@ int t7xx_cldma_init(struct cldma_ctrl *md_ctrl)
 		md_cd_queue_struct_init(&md_ctrl->rxq[i], md_ctrl, MTK_RX, i);
 		INIT_WORK(&md_ctrl->rxq[i].cldma_work, t7xx_cldma_rx_done);
 
-		md_ctrl->rxq[i].worker = alloc_workqueue("md_hif%d_rx%d_worker",
-							 WQ_UNBOUND | WQ_MEM_RECLAIM,
-							 1, md_ctrl->hif_id, i);
+		md_ctrl->rxq[i].worker =
+			alloc_ordered_workqueue("md_hif%d_rx%d_worker",
+						WQ_MEM_RECLAIM,
+						md_ctrl->hif_id, i);
 		if (!md_ctrl->rxq[i].worker)
 			goto err_workqueue;
 	}
diff --git a/drivers/net/wwan/t7xx/t7xx_hif_dpmaif_tx.c b/drivers/net/wwan/t7xx/t7xx_hif_dpmaif_tx.c
index 46514208d4f9..8dab025a088a 100644
--- a/drivers/net/wwan/t7xx/t7xx_hif_dpmaif_tx.c
+++ b/drivers/net/wwan/t7xx/t7xx_hif_dpmaif_tx.c
@@ -618,8 +618,9 @@ int t7xx_dpmaif_txq_init(struct dpmaif_tx_queue *txq)
 		return ret;
 	}
 
-	txq->worker = alloc_workqueue("md_dpmaif_tx%d_worker", WQ_UNBOUND | WQ_MEM_RECLAIM |
-				      (txq->index ? 0 : WQ_HIGHPRI), 1, txq->index);
+	txq->worker = alloc_ordered_workqueue("md_dpmaif_tx%d_worker",
+				WQ_MEM_RECLAIM | (txq->index ? 0 : WQ_HIGHPRI),
+				txq->index);
 	if (!txq->worker)
 		return -ENOMEM;
 
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 07/13] soc: qcom: qmi: Use alloc_ordered_workqueue() to create ordered workqueues
  2023-05-09  1:50 [PATCHSET v2 wq/for-6.5-cleanup-ordered] workqueue: Ordered workqueue creation cleanup Tejun Heo
                   ` (5 preceding siblings ...)
  2023-05-09  1:50 ` [PATCH 06/13] net: wwan: t7xx: " Tejun Heo
@ 2023-05-09  1:50 ` Tejun Heo
  2023-05-09  1:50 ` [PATCH 08/13] btrfs: " Tejun Heo
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 45+ messages in thread
From: Tejun Heo @ 2023-05-09  1:50 UTC (permalink / raw)
  To: jiangshanlai
  Cc: linux-kernel, kernel-team, Tejun Heo, Andy Gross,
	Bjorn Andersson, Konrad Dybcio, linux-arm-msm

BACKGROUND
==========

When multiple work items are queued to a workqueue, their execution order
doesn't match the queueing order. They may get executed in any order and
simultaneously. When fully serialized execution - one by one in the queueing
order - is needed, an ordered workqueue should be used which can be created
with alloc_ordered_workqueue().

However, alloc_ordered_workqueue() was a later addition. Before it, an
ordered workqueue could be obtained by creating an UNBOUND workqueue with
@max_active==1. This originally was an implementation side-effect which was
broken by 4c16bd327c74 ("workqueue: implement NUMA affinity for unbound
workqueues"). Because there were users that depended on the ordered
execution, 5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
ordered") made the workqueue allocation path implicitly promote UNBOUND
workqueues w/ @max_active==1 to ordered workqueues.

While this has worked okay, overloading the UNBOUND allocation interface
this way creates other issues. It's difficult to tell whether a given
workqueue actually needs to be ordered, and users that legitimately want a
min concurrency level wq unexpectedly get an ordered one instead. With
planned UNBOUND workqueue updates to improve execution locality and more
prevalence of chiplet designs which can benefit from such improvements, this
isn't a state we wanna be in forever.

This patch series audits all callsites that create an UNBOUND workqueue w/
@max_active==1 and converts them to alloc_ordered_workqueue() as necessary.

WHAT TO LOOK FOR
================

The conversions are from

  alloc_workqueue(fmt, WQ_UNBOUND | flags, 1, args...)

to

  alloc_ordered_workqueue(fmt, flags, args...)

which don't cause any functional changes. If you know that fully ordered
execution is not necessary, please let me know. I'll drop the conversion and
instead add a comment noting the fact to reduce confusion while the
conversion is in progress.

If you aren't fully sure, it's completely fine to let the conversion
through. The behavior will stay exactly the same and we can always
reconsider later.

As there are follow-up workqueue core changes, I'd really appreciate if the
patch can be routed through the workqueue tree w/ your acks. Thanks.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Andy Gross <agross@kernel.org>
Cc: Bjorn Andersson <andersson@kernel.org>
Cc: Konrad Dybcio <konrad.dybcio@linaro.org>
Cc: linux-arm-msm@vger.kernel.org
---
 drivers/soc/qcom/qmi_interface.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/soc/qcom/qmi_interface.c b/drivers/soc/qcom/qmi_interface.c
index 820bdd9f8e46..78d7361fdcf2 100644
--- a/drivers/soc/qcom/qmi_interface.c
+++ b/drivers/soc/qcom/qmi_interface.c
@@ -650,7 +650,7 @@ int qmi_handle_init(struct qmi_handle *qmi, size_t recv_buf_size,
 	if (!qmi->recv_buf)
 		return -ENOMEM;
 
-	qmi->wq = alloc_workqueue("qmi_msg_handler", WQ_UNBOUND, 1);
+	qmi->wq = alloc_ordered_workqueue("qmi_msg_handler", 0);
 	if (!qmi->wq) {
 		ret = -ENOMEM;
 		goto err_free_recv_buf;
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 08/13] btrfs: Use alloc_ordered_workqueue() to create ordered workqueues
  2023-05-09  1:50 [PATCHSET v2 wq/for-6.5-cleanup-ordered] workqueue: Ordered workqueue creation cleanup Tejun Heo
                   ` (6 preceding siblings ...)
  2023-05-09  1:50 ` [PATCH 07/13] soc: qcom: qmi: " Tejun Heo
@ 2023-05-09  1:50 ` Tejun Heo
  2023-05-09 14:53   ` David Sterba
  2023-05-09  1:50 ` [PATCH 09/13] net: qrtr: " Tejun Heo
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 45+ messages in thread
From: Tejun Heo @ 2023-05-09  1:50 UTC (permalink / raw)
  To: jiangshanlai
  Cc: linux-kernel, kernel-team, Tejun Heo, Wang Yugui, Chris Mason,
	Josef Bacik, David Sterba, linux-btrfs

BACKGROUND
==========

When multiple work items are queued to a workqueue, their execution order
doesn't match the queueing order. They may get executed in any order and
simultaneously. When fully serialized execution - one by one in the queueing
order - is needed, an ordered workqueue should be used which can be created
with alloc_ordered_workqueue().

However, alloc_ordered_workqueue() was a later addition. Before it, an
ordered workqueue could be obtained by creating an UNBOUND workqueue with
@max_active==1. This originally was an implementation side-effect which was
broken by 4c16bd327c74 ("workqueue: implement NUMA affinity for unbound
workqueues"). Because there were users that depended on the ordered
execution, 5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
ordered") made the workqueue allocation path implicitly promote UNBOUND
workqueues w/ @max_active==1 to ordered workqueues.

While this has worked okay, overloading the UNBOUND allocation interface
this way creates other issues. It's difficult to tell whether a given
workqueue actually needs to be ordered, and users that legitimately want a
min concurrency level wq unexpectedly get an ordered one instead. With
planned UNBOUND workqueue updates to improve execution locality and more
prevalence of chiplet designs which can benefit from such improvements, this
isn't a state we wanna be in forever.

This patch series audits all callsites that create an UNBOUND workqueue w/
@max_active==1 and converts them to alloc_ordered_workqueue() as necessary.

WHAT TO LOOK FOR
================

The conversions are from

  alloc_workqueue(fmt, WQ_UNBOUND | flags, 1, args...)

to

  alloc_ordered_workqueue(fmt, flags, args...)

which don't cause any functional changes. If you know that fully ordered
execution is not necessary, please let me know. I'll drop the conversion and
instead add a comment noting the fact to reduce confusion while the
conversion is in progress.

If you aren't fully sure, it's completely fine to let the conversion
through. The behavior will stay exactly the same and we can always
reconsider later.

As there are follow-up workqueue core changes, I'd really appreciate if the
patch can be routed through the workqueue tree w/ your acks. Thanks.

v3: btrfs_alloc_workqueue() excluded again. The concurrency level of a
    workqueue allocated through btrfs_alloc_workqueue() may be dynamically
    adjusted through thresh_exec_hook(), so they can't be ordered.

v2: btrfs_alloc_workqueue() updated too as suggested by Wang.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Wang Yugui <wangyugui@e16-tech.com>
Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: David Sterba <dsterba@suse.com>
Cc: linux-btrfs@vger.kernel.org
---
 fs/btrfs/disk-io.c | 2 +-
 fs/btrfs/scrub.c   | 6 ++++--
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 59ea049fe7ee..32d08aed88b6 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2217,7 +2217,7 @@ static int btrfs_init_workqueues(struct btrfs_fs_info *fs_info)
 	fs_info->qgroup_rescan_workers =
 		btrfs_alloc_workqueue(fs_info, "qgroup-rescan", flags, 1, 0);
 	fs_info->discard_ctl.discard_workers =
-		alloc_workqueue("btrfs_discard", WQ_UNBOUND | WQ_FREEZABLE, 1);
+		alloc_ordered_workqueue("btrfs_discard", WQ_FREEZABLE);
 
 	if (!(fs_info->workers && fs_info->hipri_workers &&
 	      fs_info->delalloc_workers && fs_info->flush_workers &&
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 836725a19661..d5401d7813a5 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -2734,8 +2734,10 @@ static noinline_for_stack int scrub_workers_get(struct btrfs_fs_info *fs_info,
 	if (refcount_inc_not_zero(&fs_info->scrub_workers_refcnt))
 		return 0;
 
-	scrub_workers = alloc_workqueue("btrfs-scrub", flags,
-					is_dev_replace ? 1 : max_active);
+	if (is_dev_replace)
+		scrub_workers = alloc_ordered_workqueue("btrfs-scrub", flags);
+	else
+		scrub_workers = alloc_workqueue("btrfs-scrub", flags, max_active);
 	if (!scrub_workers)
 		goto fail_scrub_workers;
 
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 09/13] net: qrtr: Use alloc_ordered_workqueue() to create ordered workqueues
  2023-05-09  1:50 [PATCHSET v2 wq/for-6.5-cleanup-ordered] workqueue: Ordered workqueue creation cleanup Tejun Heo
                   ` (7 preceding siblings ...)
  2023-05-09  1:50 ` [PATCH 08/13] btrfs: " Tejun Heo
@ 2023-05-09  1:50 ` Tejun Heo
  2023-05-25 22:14   ` Tejun Heo
  2023-05-09  1:50 ` [PATCH 10/13] rxrpc: " Tejun Heo
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 45+ messages in thread
From: Tejun Heo @ 2023-05-09  1:50 UTC (permalink / raw)
  To: jiangshanlai
  Cc: linux-kernel, kernel-team, Tejun Heo, Manivannan Sadhasivam,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-arm-msm, netdev

BACKGROUND
==========

When multiple work items are queued to a workqueue, their execution order
doesn't match the queueing order. They may get executed in any order and
simultaneously. When fully serialized execution - one by one in the queueing
order - is needed, an ordered workqueue should be used which can be created
with alloc_ordered_workqueue().

However, alloc_ordered_workqueue() was a later addition. Before it, an
ordered workqueue could be obtained by creating an UNBOUND workqueue with
@max_active==1. This originally was an implementation side-effect which was
broken by 4c16bd327c74 ("workqueue: implement NUMA affinity for unbound
workqueues"). Because there were users that depended on the ordered
execution, 5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
ordered") made the workqueue allocation path implicitly promote UNBOUND
workqueues w/ @max_active==1 to ordered workqueues.

While this has worked okay, overloading the UNBOUND allocation interface
this way creates other issues. It's difficult to tell whether a given
workqueue actually needs to be ordered, and users that legitimately want a
min concurrency level wq unexpectedly get an ordered one instead. With
planned UNBOUND workqueue updates to improve execution locality and more
prevalence of chiplet designs which can benefit from such improvements, this
isn't a state we wanna be in forever.

This patch series audits all callsites that create an UNBOUND workqueue w/
@max_active==1 and converts them to alloc_ordered_workqueue() as necessary.

WHAT TO LOOK FOR
================

The conversions are from

  alloc_workqueue(fmt, WQ_UNBOUND | flags, 1, args...)

to

  alloc_ordered_workqueue(fmt, flags, args...)

which don't cause any functional changes. If you know that fully ordered
execution is not necessary, please let me know. I'll drop the conversion and
instead add a comment noting the fact to reduce confusion while the
conversion is in progress.

If you aren't fully sure, it's completely fine to let the conversion
through. The behavior will stay exactly the same and we can always
reconsider later.

As there are follow-up workqueue core changes, I'd really appreciate if the
patch can be routed through the workqueue tree w/ your acks. Thanks.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Manivannan Sadhasivam <mani@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: linux-arm-msm@vger.kernel.org
Cc: netdev@vger.kernel.org
---
 net/qrtr/ns.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/qrtr/ns.c b/net/qrtr/ns.c
index 0f25a386138c..0f7a729f1a1f 100644
--- a/net/qrtr/ns.c
+++ b/net/qrtr/ns.c
@@ -783,7 +783,7 @@ int qrtr_ns_init(void)
 		goto err_sock;
 	}
 
-	qrtr_ns.workqueue = alloc_workqueue("qrtr_ns_handler", WQ_UNBOUND, 1);
+	qrtr_ns.workqueue = alloc_ordered_workqueue("qrtr_ns_handler", 0);
 	if (!qrtr_ns.workqueue) {
 		ret = -ENOMEM;
 		goto err_sock;
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 10/13] rxrpc: Use alloc_ordered_workqueue() to create ordered workqueues
  2023-05-09  1:50 [PATCHSET v2 wq/for-6.5-cleanup-ordered] workqueue: Ordered workqueue creation cleanup Tejun Heo
                   ` (8 preceding siblings ...)
  2023-05-09  1:50 ` [PATCH 09/13] net: qrtr: " Tejun Heo
@ 2023-05-09  1:50 ` Tejun Heo
  2023-05-25 22:14   ` Tejun Heo
  2023-05-09  1:50 ` [PATCH 11/13] crypto: octeontx2: " Tejun Heo
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 45+ messages in thread
From: Tejun Heo @ 2023-05-09  1:50 UTC (permalink / raw)
  To: jiangshanlai
  Cc: linux-kernel, kernel-team, Tejun Heo, David Howells, Marc Dionne,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-afs, netdev

BACKGROUND
==========

When multiple work items are queued to a workqueue, their execution order
doesn't match the queueing order. They may get executed in any order and
simultaneously. When fully serialized execution - one by one in the queueing
order - is needed, an ordered workqueue should be used which can be created
with alloc_ordered_workqueue().

However, alloc_ordered_workqueue() was a later addition. Before it, an
ordered workqueue could be obtained by creating an UNBOUND workqueue with
@max_active==1. This originally was an implementation side-effect which was
broken by 4c16bd327c74 ("workqueue: implement NUMA affinity for unbound
workqueues"). Because there were users that depended on the ordered
execution, 5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
ordered") made the workqueue allocation path implicitly promote UNBOUND
workqueues w/ @max_active==1 to ordered workqueues.

While this has worked okay, overloading the UNBOUND allocation interface
this way creates other issues. It's difficult to tell whether a given
workqueue actually needs to be ordered, and users that legitimately want a
min concurrency level wq unexpectedly get an ordered one instead. With
planned UNBOUND workqueue updates to improve execution locality and more
prevalence of chiplet designs which can benefit from such improvements, this
isn't a state we wanna be in forever.

This patch series audits all callsites that create an UNBOUND workqueue w/
@max_active==1 and converts them to alloc_ordered_workqueue() as necessary.

WHAT TO LOOK FOR
================

The conversions are from

  alloc_workqueue(fmt, WQ_UNBOUND | flags, 1, args...)

to

  alloc_ordered_workqueue(fmt, flags, args...)

which don't cause any functional changes. If you know that fully ordered
execution is not necessary, please let me know. I'll drop the conversion and
instead add a comment noting the fact to reduce confusion while the
conversion is in progress.

If you aren't fully sure, it's completely fine to let the conversion
through. The behavior will stay exactly the same and we can always
reconsider later.

As there are follow-up workqueue core changes, I'd really appreciate if the
patch can be routed through the workqueue tree w/ your acks. Thanks.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Marc Dionne <marc.dionne@auristor.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: linux-afs@lists.infradead.org
Cc: netdev@vger.kernel.org
---
 net/rxrpc/af_rxrpc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/rxrpc/af_rxrpc.c b/net/rxrpc/af_rxrpc.c
index 31f738d65f1c..7eb24c25c731 100644
--- a/net/rxrpc/af_rxrpc.c
+++ b/net/rxrpc/af_rxrpc.c
@@ -988,7 +988,7 @@ static int __init af_rxrpc_init(void)
 		goto error_call_jar;
 	}
 
-	rxrpc_workqueue = alloc_workqueue("krxrpcd", WQ_HIGHPRI | WQ_MEM_RECLAIM | WQ_UNBOUND, 1);
+	rxrpc_workqueue = alloc_ordered_workqueue("krxrpcd", WQ_HIGHPRI | WQ_MEM_RECLAIM);
 	if (!rxrpc_workqueue) {
 		pr_notice("Failed to allocate work queue\n");
 		goto error_work_queue;
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 11/13] crypto: octeontx2: Use alloc_ordered_workqueue() to create ordered workqueues
  2023-05-09  1:50 [PATCHSET v2 wq/for-6.5-cleanup-ordered] workqueue: Ordered workqueue creation cleanup Tejun Heo
                   ` (9 preceding siblings ...)
  2023-05-09  1:50 ` [PATCH 10/13] rxrpc: " Tejun Heo
@ 2023-05-09  1:50 ` Tejun Heo
  2023-05-09  2:18   ` Herbert Xu
  2023-05-19  0:42   ` Tejun Heo
  2023-05-09  1:50 ` [PATCH 12/13] media: coda: " Tejun Heo
  2023-05-09  1:50 ` [PATCH 13/13] workqueue: Don't implicitly make UNBOUND workqueues w/ @max_active==1 ordered Tejun Heo
  12 siblings, 2 replies; 45+ messages in thread
From: Tejun Heo @ 2023-05-09  1:50 UTC (permalink / raw)
  To: jiangshanlai
  Cc: linux-kernel, kernel-team, Tejun Heo, Boris Brezillon,
	Arnaud Ebalard, Srujana Challa, Herbert Xu, David S. Miller,
	Shijith Thotton, Vladis Dronov, Vincent Mailhol, Wolfram Sang,
	Alexander Lobakin, Minghao Chi, ye xingchen, linux-crypto

BACKGROUND
==========

When multiple work items are queued to a workqueue, their execution order
doesn't match the queueing order. They may get executed in any order and
simultaneously. When fully serialized execution - one by one in the queueing
order - is needed, an ordered workqueue should be used which can be created
with alloc_ordered_workqueue().

However, alloc_ordered_workqueue() was a later addition. Before it, an
ordered workqueue could be obtained by creating an UNBOUND workqueue with
@max_active==1. This originally was an implementation side-effect which was
broken by 4c16bd327c74 ("workqueue: implement NUMA affinity for unbound
workqueues"). Because there were users that depended on the ordered
execution, 5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
ordered") made the workqueue allocation path implicitly promote UNBOUND
workqueues w/ @max_active==1 to ordered workqueues.

While this has worked okay, overloading the UNBOUND allocation interface
this way creates other issues. It's difficult to tell whether a given
workqueue actually needs to be ordered, and users that legitimately want a
min concurrency level wq unexpectedly get an ordered one instead. With
planned UNBOUND workqueue updates to improve execution locality and more
prevalence of chiplet designs which can benefit from such improvements, this
isn't a state we wanna be in forever.

This patch series audits all callsites that create an UNBOUND workqueue w/
@max_active==1 and converts them to alloc_ordered_workqueue() as necessary.

WHAT TO LOOK FOR
================

The conversions are from

  alloc_workqueue(fmt, WQ_UNBOUND | flags, 1, args...)

to

  alloc_ordered_workqueue(fmt, flags, args...)

which don't cause any functional changes. If you know that fully ordered
execution is not necessary, please let me know. I'll drop the conversion and
instead add a comment noting the fact to reduce confusion while the
conversion is in progress.

If you aren't fully sure, it's completely fine to let the conversion
through. The behavior will stay exactly the same and we can always
reconsider later.

As there are follow-up workqueue core changes, I'd really appreciate if the
patch can be routed through the workqueue tree w/ your acks. Thanks.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Boris Brezillon <bbrezillon@kernel.org>
Cc: Arnaud Ebalard <arno@natisbad.org>
Cc: Srujana Challa <schalla@marvell.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Shijith Thotton <sthotton@marvell.com>
Cc: Vladis Dronov <vdronov@redhat.com>
Cc: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Cc: Wolfram Sang <wsa+renesas@sang-engineering.com>
Cc: Alexander Lobakin <alobakin@pm.me>
Cc: Minghao Chi <chi.minghao@zte.com.cn>
Cc: ye xingchen <ye.xingchen@zte.com.cn>
Cc: linux-crypto@vger.kernel.org
---
 drivers/crypto/marvell/octeontx2/otx2_cptpf_main.c | 12 ++++++------
 drivers/crypto/marvell/octeontx2/otx2_cptvf_main.c |  6 +++---
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/crypto/marvell/octeontx2/otx2_cptpf_main.c b/drivers/crypto/marvell/octeontx2/otx2_cptpf_main.c
index ddf6e913c1c4..30e6acfc93d9 100644
--- a/drivers/crypto/marvell/octeontx2/otx2_cptpf_main.c
+++ b/drivers/crypto/marvell/octeontx2/otx2_cptpf_main.c
@@ -357,9 +357,9 @@ static int cptpf_vfpf_mbox_init(struct otx2_cptpf_dev *cptpf, int num_vfs)
 	u64 vfpf_mbox_base;
 	int err, i;
 
-	cptpf->vfpf_mbox_wq = alloc_workqueue("cpt_vfpf_mailbox",
-					      WQ_UNBOUND | WQ_HIGHPRI |
-					      WQ_MEM_RECLAIM, 1);
+	cptpf->vfpf_mbox_wq =
+		alloc_ordered_workqueue("cpt_vfpf_mailbox",
+					WQ_HIGHPRI | WQ_MEM_RECLAIM);
 	if (!cptpf->vfpf_mbox_wq)
 		return -ENOMEM;
 
@@ -453,9 +453,9 @@ static int cptpf_afpf_mbox_init(struct otx2_cptpf_dev *cptpf)
 	resource_size_t offset;
 	int err;
 
-	cptpf->afpf_mbox_wq = alloc_workqueue("cpt_afpf_mailbox",
-					      WQ_UNBOUND | WQ_HIGHPRI |
-					      WQ_MEM_RECLAIM, 1);
+	cptpf->afpf_mbox_wq =
+		alloc_ordered_workqueue("cpt_afpf_mailbox",
+					WQ_HIGHPRI | WQ_MEM_RECLAIM);
 	if (!cptpf->afpf_mbox_wq)
 		return -ENOMEM;
 
diff --git a/drivers/crypto/marvell/octeontx2/otx2_cptvf_main.c b/drivers/crypto/marvell/octeontx2/otx2_cptvf_main.c
index 392e9fee05e8..6023a7adb70c 100644
--- a/drivers/crypto/marvell/octeontx2/otx2_cptvf_main.c
+++ b/drivers/crypto/marvell/octeontx2/otx2_cptvf_main.c
@@ -75,9 +75,9 @@ static int cptvf_pfvf_mbox_init(struct otx2_cptvf_dev *cptvf)
 	resource_size_t offset, size;
 	int ret;
 
-	cptvf->pfvf_mbox_wq = alloc_workqueue("cpt_pfvf_mailbox",
-					      WQ_UNBOUND | WQ_HIGHPRI |
-					      WQ_MEM_RECLAIM, 1);
+	cptvf->pfvf_mbox_wq =
+		alloc_ordered_workqueue("cpt_pfvf_mailbox",
+					WQ_HIGHPRI | WQ_MEM_RECLAIM);
 	if (!cptvf->pfvf_mbox_wq)
 		return -ENOMEM;
 
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 12/13] media: coda: Use alloc_ordered_workqueue() to create ordered workqueues
  2023-05-09  1:50 [PATCHSET v2 wq/for-6.5-cleanup-ordered] workqueue: Ordered workqueue creation cleanup Tejun Heo
                   ` (10 preceding siblings ...)
  2023-05-09  1:50 ` [PATCH 11/13] crypto: octeontx2: " Tejun Heo
@ 2023-05-09  1:50 ` Tejun Heo
  2023-05-09 10:46   ` Philipp Zabel
  2023-05-19  0:50   ` Tejun Heo
  2023-05-09  1:50 ` [PATCH 13/13] workqueue: Don't implicitly make UNBOUND workqueues w/ @max_active==1 ordered Tejun Heo
  12 siblings, 2 replies; 45+ messages in thread
From: Tejun Heo @ 2023-05-09  1:50 UTC (permalink / raw)
  To: jiangshanlai
  Cc: linux-kernel, kernel-team, Tejun Heo, Philipp Zabel,
	Mauro Carvalho Chehab, linux-media

BACKGROUND
==========

When multiple work items are queued to a workqueue, their execution order
doesn't match the queueing order. They may get executed in any order and
simultaneously. When fully serialized execution - one by one in the queueing
order - is needed, an ordered workqueue should be used which can be created
with alloc_ordered_workqueue().

However, alloc_ordered_workqueue() was a later addition. Before it, an
ordered workqueue could be obtained by creating an UNBOUND workqueue with
@max_active==1. This originally was an implementation side-effect which was
broken by 4c16bd327c74 ("workqueue: implement NUMA affinity for unbound
workqueues"). Because there were users that depended on the ordered
execution, 5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
ordered") made the workqueue allocation path implicitly promote UNBOUND
workqueues w/ @max_active==1 to ordered workqueues.

While this has worked okay, overloading the UNBOUND allocation interface
this way creates other issues. It's difficult to tell whether a given
workqueue actually needs to be ordered, and users that legitimately want a
min concurrency level wq unexpectedly get an ordered one instead. With
planned UNBOUND workqueue updates to improve execution locality and more
prevalence of chiplet designs which can benefit from such improvements, this
isn't a state we wanna be in forever.

This patch series audits all callsites that create an UNBOUND workqueue w/
@max_active==1 and converts them to alloc_ordered_workqueue() as necessary.

WHAT TO LOOK FOR
================

The conversions are from

  alloc_workqueue(fmt, WQ_UNBOUND | flags, 1, args...)

to

  alloc_ordered_workqueue(fmt, flags, args...)

which don't cause any functional changes. If you know that fully ordered
execution is not necessary, please let me know. I'll drop the conversion and
instead add a comment noting the fact to reduce confusion while the
conversion is in progress.

If you aren't fully sure, it's completely fine to let the conversion
through. The behavior will stay exactly the same and we can always
reconsider later.

As there are follow-up workqueue core changes, I'd really appreciate if the
patch can be routed through the workqueue tree w/ your acks. Thanks.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Philipp Zabel <p.zabel@pengutronix.de>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: linux-media@vger.kernel.org
---
 drivers/media/platform/chips-media/coda-common.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/media/platform/chips-media/coda-common.c b/drivers/media/platform/chips-media/coda-common.c
index d013ea5d9d3d..ac9a642ae76f 100644
--- a/drivers/media/platform/chips-media/coda-common.c
+++ b/drivers/media/platform/chips-media/coda-common.c
@@ -3268,7 +3268,7 @@ static int coda_probe(struct platform_device *pdev)
 						       &dev->iram.blob);
 	}
 
-	dev->workqueue = alloc_workqueue("coda", WQ_UNBOUND | WQ_MEM_RECLAIM, 1);
+	dev->workqueue = alloc_ordered_workqueue("coda", WQ_MEM_RECLAIM);
 	if (!dev->workqueue) {
 		dev_err(&pdev->dev, "unable to alloc workqueue\n");
 		ret = -ENOMEM;
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 13/13] workqueue: Don't implicitly make UNBOUND workqueues w/ @max_active==1 ordered
  2023-05-09  1:50 [PATCHSET v2 wq/for-6.5-cleanup-ordered] workqueue: Ordered workqueue creation cleanup Tejun Heo
                   ` (11 preceding siblings ...)
  2023-05-09  1:50 ` [PATCH 12/13] media: coda: " Tejun Heo
@ 2023-05-09  1:50 ` Tejun Heo
  12 siblings, 0 replies; 45+ messages in thread
From: Tejun Heo @ 2023-05-09  1:50 UTC (permalink / raw)
  To: jiangshanlai; +Cc: linux-kernel, kernel-team, Tejun Heo, kernel test robot

5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")
automatically promoted UNBOUND workqueues w/ @max_active==1 to ordered
workqueues because UNBOUND workqueues w/ @max_active==1 used to be the way
to create ordered workqueues and the new NUMA support broke that guarantee.
The resulting breakages can be subtle, and the fact that they can only
trigger on NUMA machines made them even more difficult to debug.

However, overloading the UNBOUND allocation interface this way creates other
issues. It's difficult to tell whether a given workqueue actually needs to
be ordered, and users that legitimately want a min concurrency level wq
unexpectedly get an ordered one instead. With planned UNBOUND workqueue
updates to improve execution locality and more prevalence of chiplet designs
which can benefit from such improvements, this isn't a state we wanna be in
forever.

There aren't that many UNBOUND w/ @max_active==1 users in the tree and the
preceding patches audited all and converted them to
alloc_ordered_workqueue() as appropriate. This patch removes the implicit
promotion of UNBOUND w/ @max_active==1 workqueues to ordered ones.

Workqueue will also add a debug option to make all unordered UNBOUND
workqueues use per-cpu pool_workqueues so that these problems can be
surfaced more easily on most machines.

v2: v1 patch incorrectly dropped !list_empty(&wq->pwqs) condition in
    apply_workqueue_attrs_locked() which spuriously triggers WARNING and
    fails workqueue creation. Fix it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: kernel test robot <oliver.sang@intel.com>
Link: https://lore.kernel.org/oe-lkp/202304251050.45a5df1f-oliver.sang@intel.com
---
 include/linux/workqueue.h |  4 +---
 kernel/workqueue.c        | 23 ++++-------------------
 2 files changed, 5 insertions(+), 22 deletions(-)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 3992c994787f..79901dea932e 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -339,7 +339,6 @@ enum {
 	__WQ_DRAINING		= 1 << 16, /* internal: workqueue is draining */
 	__WQ_ORDERED		= 1 << 17, /* internal: workqueue is ordered */
 	__WQ_LEGACY		= 1 << 18, /* internal: create*_workqueue() */
-	__WQ_ORDERED_EXPLICIT	= 1 << 19, /* internal: alloc_ordered_workqueue() */
 
 	WQ_MAX_ACTIVE		= 512,	  /* I like 512, better ideas? */
 	WQ_MAX_UNBOUND_PER_CPU	= 4,	  /* 4 * #cpus for unbound wq */
@@ -417,8 +416,7 @@ alloc_workqueue(const char *fmt, unsigned int flags, int max_active, ...);
  * Pointer to the allocated workqueue on success, %NULL on failure.
  */
 #define alloc_ordered_workqueue(fmt, flags, args...)			\
-	alloc_workqueue(fmt, WQ_UNBOUND | __WQ_ORDERED |		\
-			__WQ_ORDERED_EXPLICIT | (flags), 1, ##args)
+	alloc_workqueue(fmt, WQ_UNBOUND | __WQ_ORDERED | (flags), 1, ##args)
 
 #define create_workqueue(name)						\
 	alloc_workqueue("%s", __WQ_LEGACY | WQ_MEM_RECLAIM, 1, (name))
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 4666a1a92a31..34a91eee7332 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -4195,12 +4195,8 @@ static int apply_workqueue_attrs_locked(struct workqueue_struct *wq,
 		return -EINVAL;
 
 	/* creating multiple pwqs breaks ordering guarantee */
-	if (!list_empty(&wq->pwqs)) {
-		if (WARN_ON(wq->flags & __WQ_ORDERED_EXPLICIT))
-			return -EINVAL;
-
-		wq->flags &= ~__WQ_ORDERED;
-	}
+	if (!list_empty(&wq->pwqs) && WARN_ON(wq->flags & __WQ_ORDERED))
+		return -EINVAL;
 
 	ctx = apply_wqattrs_prepare(wq, attrs, wq_unbound_cpumask);
 	if (!ctx)
@@ -4428,16 +4424,6 @@ struct workqueue_struct *alloc_workqueue(const char *fmt,
 	struct workqueue_struct *wq;
 	struct pool_workqueue *pwq;
 
-	/*
-	 * Unbound && max_active == 1 used to imply ordered, which is no
-	 * longer the case on NUMA machines due to per-node pools.  While
-	 * alloc_ordered_workqueue() is the right way to create an ordered
-	 * workqueue, keep the previous behavior to avoid subtle breakages
-	 * on NUMA.
-	 */
-	if ((flags & WQ_UNBOUND) && max_active == 1)
-		flags |= __WQ_ORDERED;
-
 	/* see the comment above the definition of WQ_POWER_EFFICIENT */
 	if ((flags & WQ_POWER_EFFICIENT) && wq_power_efficient)
 		flags |= WQ_UNBOUND;
@@ -4645,14 +4631,13 @@ void workqueue_set_max_active(struct workqueue_struct *wq, int max_active)
 	struct pool_workqueue *pwq;
 
 	/* disallow meddling with max_active for ordered workqueues */
-	if (WARN_ON(wq->flags & __WQ_ORDERED_EXPLICIT))
+	if (WARN_ON(wq->flags & __WQ_ORDERED))
 		return;
 
 	max_active = wq_clamp_max_active(max_active, wq->flags, wq->name);
 
 	mutex_lock(&wq->mutex);
 
-	wq->flags &= ~__WQ_ORDERED;
 	wq->saved_max_active = max_active;
 
 	for_each_pwq(pwq, wq)
@@ -5920,7 +5905,7 @@ int workqueue_sysfs_register(struct workqueue_struct *wq)
 	 * attributes breaks ordering guarantee.  Disallow exposing ordered
 	 * workqueues.
 	 */
-	if (WARN_ON(wq->flags & __WQ_ORDERED_EXPLICIT))
+	if (WARN_ON(wq->flags & __WQ_ORDERED))
 		return -EINVAL;
 
 	wq->wq_dev = wq_dev = kzalloc(sizeof(*wq_dev), GFP_KERNEL);
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH 11/13] crypto: octeontx2: Use alloc_ordered_workqueue() to create ordered workqueues
  2023-05-09  1:50 ` [PATCH 11/13] crypto: octeontx2: " Tejun Heo
@ 2023-05-09  2:18   ` Herbert Xu
  2023-05-19  0:42   ` Tejun Heo
  1 sibling, 0 replies; 45+ messages in thread
From: Herbert Xu @ 2023-05-09  2:18 UTC (permalink / raw)
  To: Tejun Heo
  Cc: jiangshanlai, linux-kernel, kernel-team, Boris Brezillon,
	Arnaud Ebalard, Srujana Challa, David S. Miller, Shijith Thotton,
	Vladis Dronov, Vincent Mailhol, Wolfram Sang, Alexander Lobakin,
	Minghao Chi, ye xingchen, linux-crypto

On Mon, May 08, 2023 at 03:50:30PM -1000, Tejun Heo wrote:
> BACKGROUND
> ==========
> 
> When multiple work items are queued to a workqueue, their execution order
> doesn't match the queueing order. They may get executed in any order and
> simultaneously. When fully serialized execution - one by one in the queueing
> order - is needed, an ordered workqueue should be used which can be created
> with alloc_ordered_workqueue().
> 
> However, alloc_ordered_workqueue() was a later addition. Before it, an
> ordered workqueue could be obtained by creating an UNBOUND workqueue with
> @max_active==1. This originally was an implementation side-effect which was
> broken by 4c16bd327c74 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
> ordered"). Because there were users that depended on the ordered execution,
> 5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")
> made the workqueue allocation path implicitly promote UNBOUND workqueues w/
> @max_active==1 to ordered workqueues.
> 
> While this has worked okay, overloading the UNBOUND allocation interface
> this way creates other issues. It's difficult to tell whether a given
> workqueue actually needs to be ordered and users that legitimately want a
> min concurrency level wq unexpectedly get an ordered one instead. With
> planned UNBOUND workqueue updates to improve execution locality and more
> prevalence of chiplet designs which can benefit from such improvements, this
> isn't a state we wanna be in forever.
> 
> This patch series audits all callsites that create an UNBOUND workqueue w/
> @max_active==1 and converts them to alloc_ordered_workqueue() as necessary.
> 
> WHAT TO LOOK FOR
> ================
> 
> The conversions are from
> 
>   alloc_workqueue(WQ_UNBOUND | flags, 1, args..)
> 
> to
> 
>   alloc_ordered_workqueue(flags, args...)
> 
> which don't cause any functional changes. If you know that fully ordered
> execution is not necessary, please let me know. I'll drop the conversion and
> instead add a comment noting the fact to reduce confusion while conversion
> is in progress.
> 
> If you aren't fully sure, it's completely fine to let the conversion
> through. The behavior will stay exactly the same and we can always
> reconsider later.
> 
> As there are follow-up workqueue core changes, I'd really appreciate if the
> patch can be routed through the workqueue tree w/ your acks. Thanks.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Boris Brezillon <bbrezillon@kernel.org>
> Cc: Arnaud Ebalard <arno@natisbad.org>
> Cc: Srujana Challa <schalla@marvell.com>
> Cc: Herbert Xu <herbert@gondor.apana.org.au>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Shijith Thotton <sthotton@marvell.com>
> Cc: Vladis Dronov <vdronov@redhat.com>
> Cc: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
> Cc: Wolfram Sang <wsa+renesas@sang-engineering.com>
> Cc: Alexander Lobakin <alobakin@pm.me>
> Cc: Minghao Chi <chi.minghao@zte.com.cn>
> Cc: ye xingchen <ye.xingchen@zte.com.cn>
> Cc: linux-crypto@vger.kernel.org
> ---
>  drivers/crypto/marvell/octeontx2/otx2_cptpf_main.c | 12 ++++++------
>  drivers/crypto/marvell/octeontx2/otx2_cptvf_main.c |  6 +++---
>  2 files changed, 9 insertions(+), 9 deletions(-)

Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 12/13] media: coda: Use alloc_ordered_workqueue() to create ordered workqueues
  2023-05-09  1:50 ` [PATCH 12/13] media: coda: " Tejun Heo
@ 2023-05-09 10:46   ` Philipp Zabel
  2023-05-19  0:50   ` Tejun Heo
  1 sibling, 0 replies; 45+ messages in thread
From: Philipp Zabel @ 2023-05-09 10:46 UTC (permalink / raw)
  To: Tejun Heo, jiangshanlai
  Cc: linux-kernel, kernel-team, Mauro Carvalho Chehab, linux-media

On Mo, 2023-05-08 at 15:50 -1000, Tejun Heo wrote:
> BACKGROUND
> ==========
> 
> When multiple work items are queued to a workqueue, their execution order
> doesn't match the queueing order. They may get executed in any order and
> simultaneously. When fully serialized execution - one by one in the queueing
> order - is needed, an ordered workqueue should be used which can be created
> with alloc_ordered_workqueue().
> 
> However, alloc_ordered_workqueue() was a later addition. Before it, an
> ordered workqueue could be obtained by creating an UNBOUND workqueue with
> @max_active==1. This originally was an implementation side-effect which was
> broken by 4c16bd327c74 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
> ordered"). Because there were users that depended on the ordered execution,
> 5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")
> made the workqueue allocation path implicitly promote UNBOUND workqueues w/
> @max_active==1 to ordered workqueues.
> 
> While this has worked okay, overloading the UNBOUND allocation interface
> this way creates other issues. It's difficult to tell whether a given
> workqueue actually needs to be ordered and users that legitimately want a
> min concurrency level wq unexpectedly get an ordered one instead. With
> planned UNBOUND workqueue updates to improve execution locality and more
> prevalence of chiplet designs which can benefit from such improvements, this
> isn't a state we wanna be in forever.
> 
> This patch series audits all callsites that create an UNBOUND workqueue w/
> @max_active==1 and converts them to alloc_ordered_workqueue() as necessary.
> 
> WHAT TO LOOK FOR
> ================
> 
> The conversions are from
> 
>   alloc_workqueue(WQ_UNBOUND | flags, 1, args..)
> 
> to
> 
>   alloc_ordered_workqueue(flags, args...)
> 
> which don't cause any functional changes. If you know that fully ordered
> execution is not necessary, please let me know. I'll drop the conversion and
> instead add a comment noting the fact to reduce confusion while conversion
> is in progress.
> 
> If you aren't fully sure, it's completely fine to let the conversion
> through. The behavior will stay exactly the same and we can always
> reconsider later.
> 
> As there are follow-up workqueue core changes, I'd really appreciate if the
> patch can be routed through the workqueue tree w/ your acks. Thanks.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>

Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>

regards
Philipp

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 08/13] btrfs: Use alloc_ordered_workqueue() to create ordered workqueues
  2023-05-09  1:50 ` [PATCH 08/13] btrfs: " Tejun Heo
@ 2023-05-09 14:53   ` David Sterba
  2023-05-09 15:57     ` Tejun Heo
  0 siblings, 1 reply; 45+ messages in thread
From: David Sterba @ 2023-05-09 14:53 UTC (permalink / raw)
  To: Tejun Heo
  Cc: jiangshanlai, linux-kernel, kernel-team, Wang Yugui, Chris Mason,
	Josef Bacik, David Sterba, linux-btrfs

On Mon, May 08, 2023 at 03:50:27PM -1000, Tejun Heo wrote:
> BACKGROUND
> ==========
> 
> When multiple work items are queued to a workqueue, their execution order
> doesn't match the queueing order. They may get executed in any order and
> simultaneously. When fully serialized execution - one by one in the queueing
> order - is needed, an ordered workqueue should be used which can be created
> with alloc_ordered_workqueue().
> 
> However, alloc_ordered_workqueue() was a later addition. Before it, an
> ordered workqueue could be obtained by creating an UNBOUND workqueue with
> @max_active==1. This originally was an implementation side-effect which was
> broken by 4c16bd327c74 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
> ordered"). Because there were users that depended on the ordered execution,
> 5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")
> made the workqueue allocation path implicitly promote UNBOUND workqueues w/
> @max_active==1 to ordered workqueues.
> 
> While this has worked okay, overloading the UNBOUND allocation interface
> this way creates other issues. It's difficult to tell whether a given
> workqueue actually needs to be ordered and users that legitimately want a
> min concurrency level wq unexpectedly get an ordered one instead. With
> planned UNBOUND workqueue updates to improve execution locality and more
> prevalence of chiplet designs which can benefit from such improvements, this
> isn't a state we wanna be in forever.
> 
> This patch series audits all callsites that create an UNBOUND workqueue w/
> @max_active==1 and converts them to alloc_ordered_workqueue() as necessary.
> 
> WHAT TO LOOK FOR
> ================
> 
> The conversions are from
> 
>   alloc_workqueue(WQ_UNBOUND | flags, 1, args..)
> 
> to
> 
>   alloc_ordered_workqueue(flags, args...)
> 
> which don't cause any functional changes. If you know that fully ordered
> execution is not necessary, please let me know. I'll drop the conversion and
> instead add a comment noting the fact to reduce confusion while conversion
> is in progress.
> 
> If you aren't fully sure, it's completely fine to let the conversion
> through. The behavior will stay exactly the same and we can always
> reconsider later.
> 
> As there are follow-up workqueue core changes, I'd really appreciate if the
> patch can be routed through the workqueue tree w/ your acks. Thanks.
> 
> v3: btrfs_alloc_workqueue() excluded again. The concurrency level of a
>     workqueue allocated through btrfs_alloc_workqueue() may be dynamically
>     adjusted through thresh_exec_hook(), so they can't be ordered.
> 
> v2: btrfs_alloc_workqueue() updated too as suggested by Wang.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Wang Yugui <wangyugui@e16-tech.com>
> Cc: Chris Mason <clm@fb.com>
> Cc: Josef Bacik <josef@toxicpanda.com>
> Cc: David Sterba <dsterba@suse.com>
> Cc: linux-btrfs@vger.kernel.org
> ---
>  fs/btrfs/disk-io.c | 2 +-
>  fs/btrfs/scrub.c   | 6 ++++--
>  2 files changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 59ea049fe7ee..32d08aed88b6 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -2217,7 +2217,7 @@ static int btrfs_init_workqueues(struct btrfs_fs_info *fs_info)
>  	fs_info->qgroup_rescan_workers =
>  		btrfs_alloc_workqueue(fs_info, "qgroup-rescan", flags, 1, 0);
>  	fs_info->discard_ctl.discard_workers =
> -		alloc_workqueue("btrfs_discard", WQ_UNBOUND | WQ_FREEZABLE, 1);
> +		alloc_ordered_workqueue("btrfs_discard", WQ_FREEZABLE);
>  
>  	if (!(fs_info->workers && fs_info->hipri_workers &&
>  	      fs_info->delalloc_workers && fs_info->flush_workers &&

I think there are a few more conversions missing. There's a local flags
variable in btrfs_init_workqueues

2175 static int btrfs_init_workqueues(struct btrfs_fs_info *fs_info)
2176 {
2177         u32 max_active = fs_info->thread_pool_size;
2178         unsigned int flags = WQ_MEM_RECLAIM | WQ_FREEZABLE | WQ_UNBOUND;

And used like

2194         fs_info->fixup_workers =
2195                 btrfs_alloc_workqueue(fs_info, "fixup", flags, 1, 0);

2213         fs_info->qgroup_rescan_workers =
2214                 btrfs_alloc_workqueue(fs_info, "qgroup-rescan", flags, 1, 0);

WQ_UNBOUND is not mentioned explicitly like for the "btrfs_discard"
workqueue.  Patch v2 did the switch in btrfs_alloc_workqueue according
to the max_active/limit_active parameter, but this would affect all
queues, and not all of them need to be ordered.

In btrfs_resize_thread_pool the workqueue_set_max_active is called
directly or indirectly so this can set the max_active to a user-defined
mount option. Could this be a problem or trigger a warning? This would
lead to max_active==1 + WQ_UNBOUND.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 08/13] btrfs: Use alloc_ordered_workqueue() to create ordered workqueues
  2023-05-09 14:53   ` David Sterba
@ 2023-05-09 15:57     ` Tejun Heo
  2023-05-09 23:36       ` David Sterba
  0 siblings, 1 reply; 45+ messages in thread
From: Tejun Heo @ 2023-05-09 15:57 UTC (permalink / raw)
  To: David Sterba
  Cc: jiangshanlai, linux-kernel, kernel-team, Wang Yugui, Chris Mason,
	Josef Bacik, David Sterba, linux-btrfs

Hello, David.

Thanks for taking a look.

On Tue, May 09, 2023 at 04:53:32PM +0200, David Sterba wrote:
> > diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> > index 59ea049fe7ee..32d08aed88b6 100644
> > --- a/fs/btrfs/disk-io.c
> > +++ b/fs/btrfs/disk-io.c
> > @@ -2217,7 +2217,7 @@ static int btrfs_init_workqueues(struct btrfs_fs_info *fs_info)
> >  	fs_info->qgroup_rescan_workers =
> >  		btrfs_alloc_workqueue(fs_info, "qgroup-rescan", flags, 1, 0);
> >  	fs_info->discard_ctl.discard_workers =
> > -		alloc_workqueue("btrfs_discard", WQ_UNBOUND | WQ_FREEZABLE, 1);
> > +		alloc_ordered_workqueue("btrfs_discard", WQ_FREEZABLE);
> >  
> >  	if (!(fs_info->workers && fs_info->hipri_workers &&
> >  	      fs_info->delalloc_workers && fs_info->flush_workers &&
> 
> I think there are a few more conversions missing. There's a local flags
> variable in btrfs_init_workqueues
> 
> 2175 static int btrfs_init_workqueues(struct btrfs_fs_info *fs_info)
> 2176 {
> 2177         u32 max_active = fs_info->thread_pool_size;
> 2178         unsigned int flags = WQ_MEM_RECLAIM | WQ_FREEZABLE | WQ_UNBOUND;
> 
> And used like
> 
> 2194         fs_info->fixup_workers =
> 2195                 btrfs_alloc_workqueue(fs_info, "fixup", flags, 1, 0);
> 
> 2213         fs_info->qgroup_rescan_workers =
> 2214                 btrfs_alloc_workqueue(fs_info, "qgroup-rescan", flags, 1, 0);

Right you are.

> WQ_UNBOUND is not mentioned explicitly like for the "btrfs_discard"
> workqueue.  Patch v2 did the switch in btrfs_alloc_workqueue according
> to the max_active/limit_active parameter but this would affect all
> queues and not all of them need to be ordered.

The thresh mechanism which auto adjusts max active means that the workqueues
allocated by btrfs_alloc_workqueue() can't be ordered, right? When thresh is
smaller than DFT_THRESHOLD, the mechanism is disabled but that looks like an
optimization.

> In btrfs_resize_thread_pool the workqueue_set_max_active is called
> directly or indirectly so this can set the max_active to a user-defined
> mount option. Could this be a problem or trigger a warning? This would
> lead to max_active==1 + WQ_UNBOUND.

That's not a problem. The only thing we need to make sure is that the
workqueues which actually *must* be ordered use alloc_ordered_workqueue() as
they won't be implicitly treated as ordered in the future.

* The current patch converts two - fs_info->discard_ctl.discard_workers and
  scrub_workers when @is_dev_replace is set. Do they actually need to be
  ordered?

* As you pointed out, fs_info->fixup_workers and
  fs_info->qgroup_rescan_workers are also currently implicitly ordered. Do
  they actually need to be ordered?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 08/13] btrfs: Use alloc_ordered_workqueue() to create ordered workqueues
  2023-05-09 15:57     ` Tejun Heo
@ 2023-05-09 23:36       ` David Sterba
  2023-05-09 23:47         ` Tejun Heo
  0 siblings, 1 reply; 45+ messages in thread
From: David Sterba @ 2023-05-09 23:36 UTC (permalink / raw)
  To: Tejun Heo
  Cc: David Sterba, jiangshanlai, linux-kernel, kernel-team,
	Wang Yugui, Chris Mason, Josef Bacik, David Sterba, linux-btrfs

On Tue, May 09, 2023 at 05:57:16AM -1000, Tejun Heo wrote:
> Hello, David.
> 
> Thanks for taking a look.
> 
> On Tue, May 09, 2023 at 04:53:32PM +0200, David Sterba wrote:
> > > diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> > > index 59ea049fe7ee..32d08aed88b6 100644
> > > --- a/fs/btrfs/disk-io.c
> > > +++ b/fs/btrfs/disk-io.c
> > > @@ -2217,7 +2217,7 @@ static int btrfs_init_workqueues(struct btrfs_fs_info *fs_info)
> > >  	fs_info->qgroup_rescan_workers =
> > >  		btrfs_alloc_workqueue(fs_info, "qgroup-rescan", flags, 1, 0);
> > >  	fs_info->discard_ctl.discard_workers =
> > > -		alloc_workqueue("btrfs_discard", WQ_UNBOUND | WQ_FREEZABLE, 1);
> > > +		alloc_ordered_workqueue("btrfs_discard", WQ_FREEZABLE);
> > >  
> > >  	if (!(fs_info->workers && fs_info->hipri_workers &&
> > >  	      fs_info->delalloc_workers && fs_info->flush_workers &&
> > 
> > I think there are a few more conversions missing. There's a local flags
> > variable in btrfs_init_workqueues
> > 
> > 2175 static int btrfs_init_workqueues(struct btrfs_fs_info *fs_info)
> > 2176 {
> > 2177         u32 max_active = fs_info->thread_pool_size;
> > 2178         unsigned int flags = WQ_MEM_RECLAIM | WQ_FREEZABLE | WQ_UNBOUND;
> > 
> > And used like
> > 
> > 2194         fs_info->fixup_workers =
> > 2195                 btrfs_alloc_workqueue(fs_info, "fixup", flags, 1, 0);
> > 
> > 2213         fs_info->qgroup_rescan_workers =
> > 2214                 btrfs_alloc_workqueue(fs_info, "qgroup-rescan", flags, 1, 0);
> 
> Right you are.
> 
> > WQ_UNBOUND is not mentioned explicitly like for the "btrfs_discard"
> > workqueue.  Patch v2 did the switch in btrfs_alloc_workqueue according
> > to the max_active/limit_active parameter but this would affect all
> > queues and not all of them need to be ordered.
> 
> The thresh mechanism which auto adjusts max active means that the workqueues
> allocated by btrfs_alloc_workqueue() can't be ordered, right? When thresh is
> smaller than DFT_THRESHOLD, the mechanism is disabled but that looks like an
> optimization.

Yeah I think so but I'm not entirely sure. The ordering for all queues
that don't start with max_active > 1 should not be required, here the
parallelization and out of order processing is expected and serialized
or decided once the work is done.

> > In btrfs_resize_thread_pool the workqueue_set_max_active is called
> > directly or indirectly so this can set the max_active to a user-defined
> > mount option. Could this be a problem or trigger a warning? This would
> > lead to max_active==1 + WQ_UNBOUND.
> 
> That's not a problem. The only thing we need to make sure is that the
> workqueues which actually *must* be ordered use alloc_ordered_workqueue() as
> they won't be implicitly treated as ordered in the future.
> 
> * The current patch converts two - fs_info->discard_ctl.discard_workers and
>   scrub_workers when @is_dev_replace is set. Do they actually need to be
>   ordered?
> 
> * As you pointed out, fs_info->fixup_workers and
>   fs_info->qgroup_rescan_workers are also currently implicitly ordered. Do
>   they actually need to be ordered?

I think all of them somehow implicitly depend on the ordering. The
replace process sequentially goes over a block group and copies blocks.

The fixup process is quite obscure and we should preserve the semantics
as much as possible. It has something to do with pages that get out of
sync with extent state without btrfs knowing. The chance that more such
requests happen at the same time is low, but once it happens it can
lead to corruptions.

Quota rescan is by its nature also a sequential process, but I think it
does not need to be ordered; it's started from a higher-level context like
enabling quotas or rescan, but there are also calls at remount time, so
this makes it less clear.

In summary, if the ordered queue can be used then I'd recommend doing so
as the safe option.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 08/13] btrfs: Use alloc_ordered_workqueue() to create ordered workqueues
  2023-05-09 23:36       ` David Sterba
@ 2023-05-09 23:47         ` Tejun Heo
  2023-05-25 23:33           ` Tejun Heo
  0 siblings, 1 reply; 45+ messages in thread
From: Tejun Heo @ 2023-05-09 23:47 UTC (permalink / raw)
  To: David Sterba
  Cc: jiangshanlai, linux-kernel, kernel-team, Wang Yugui, Chris Mason,
	Josef Bacik, David Sterba, linux-btrfs

Hello, David.

On Wed, May 10, 2023 at 01:36:20AM +0200, David Sterba wrote:
...
> Yeah I think so but I'm not entirely sure. The ordering for all queues
> that don't start with max_active > 1 should not be required, here the
> parallelization and out of order processing is expected and serialized
> or decided once the work is done.
> 
> > > In btrfs_resize_thread_pool the workqueue_set_max_active is called
> > > directly or indirectly so this can set the max_active to a user-defined
> > > mount option. Could this be a problem or trigger a warning? This would
> > > lead to max_active==1 + WQ_UNBOUND.
> > 
> > That's not a problem. The only thing we need to make sure is that the
> > workqueues which actually *must* be ordered use alloc_ordered_workqueue() as
> > they won't be implicitly treated as ordered in the future.
> > 
> > * The current patch converts two - fs_info->discard_ctl.discard_workers and
> >   scrub_workers when @is_dev_replace is set. Do they actually need to be
> >   ordered?
> > 
> > * As you pointed out, fs_info->fixup_workers and
> >   fs_info->qgroup_rescan_workers are also currently implicitly ordered. Do
> >   they actually need to be ordered?
> 
> I think all of them somehow implicitly depend on the ordering. The
> replace process sequentially goes over a block group and copies blocks.
> 
> The fixup process is quite obscure and we should preserve the semantics
> as much as possible. It has something to do with pages that get out of
> sync with extent state without btrfs knowing and that there are more such
> requests happening at the same time is low but once it happens it can
> lead to corruptions.
> 
> Quota rescan is in its nature also a sequential process but I think it
> does not need to be ordered, it's started from higher level context like
> enabling quotas or rescan but there are also calls at remount time so
> this makes it less clear.
> 
> In summary, if the ordered queue could be used then I'd recommend to do
> it as the safe option.

I see. It seems rather error-prone to make workqueues implicitly ordered
from btrfs_alloc_workqueue(). I'll see if I can make it explicit and keep
all workqueues which are currently guaranteed to be ordered, ordered.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 05/13] wifi: ath10/11/12k: Use alloc_ordered_workqueue() to create ordered workqueues
  2023-05-09  1:50 ` [PATCH 05/13] wifi: ath10/11/12k: " Tejun Heo
@ 2023-05-10  8:41   ` Kalle Valo
  2023-05-19  0:41   ` Tejun Heo
  1 sibling, 0 replies; 45+ messages in thread
From: Kalle Valo @ 2023-05-10  8:41 UTC (permalink / raw)
  To: Tejun Heo
  Cc: jiangshanlai, linux-kernel, kernel-team, Tejun Heo,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-wireless, netdev

Tejun Heo <tj@kernel.org> wrote:

> BACKGROUND
> ==========
> 
> When multiple work items are queued to a workqueue, their execution order
> doesn't match the queueing order. They may get executed in any order and
> simultaneously. When fully serialized execution - one by one in the queueing
> order - is needed, an ordered workqueue should be used which can be created
> with alloc_ordered_workqueue().
> 
> However, alloc_ordered_workqueue() was a later addition. Before it, an
> ordered workqueue could be obtained by creating an UNBOUND workqueue with
> @max_active==1. This originally was an implementation side-effect which was
> broken by 4c16bd327c74 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
> ordered"). Because there were users that depended on the ordered execution,
> 5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")
> made the workqueue allocation path implicitly promote UNBOUND workqueues w/
> @max_active==1 to ordered workqueues.
> 
> While this has worked okay, overloading the UNBOUND allocation interface
> this way creates other issues. It's difficult to tell whether a given
> workqueue actually needs to be ordered and users that legitimately want a
> min concurrency level wq unexpectedly get an ordered one instead. With
> planned UNBOUND workqueue updates to improve execution locality and more
> prevalence of chiplet designs which can benefit from such improvements, this
> isn't a state we wanna be in forever.
> 
> This patch series audits all callsites that create an UNBOUND workqueue w/
> @max_active==1 and converts them to alloc_ordered_workqueue() as necessary.
> 
> WHAT TO LOOK FOR
> ================
> 
> The conversions are from
> 
>   alloc_workqueue(WQ_UNBOUND | flags, 1, args..)
> 
> to
> 
>   alloc_ordered_workqueue(flags, args...)
> 
> which don't cause any functional changes. If you know that fully ordered
> execution is not necessary, please let me know. I'll drop the conversion and
> instead add a comment noting the fact to reduce confusion while conversion
> is in progress.
> 
> If you aren't fully sure, it's completely fine to let the conversion
> through. The behavior will stay exactly the same and we can always
> reconsider later.
> 
> As there are follow-up workqueue core changes, I'd really appreciate if the
> patch can be routed through the workqueue tree w/ your acks. Thanks.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Kalle Valo <kvalo@kernel.org>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Jakub Kicinski <kuba@kernel.org>
> Cc: Paolo Abeni <pabeni@redhat.com>
> Cc: linux-wireless@vger.kernel.org
> Cc: netdev@vger.kernel.org

I ran quick smoke tests with ath11k and ath12k, didn't see any issues. Feel
free to take via the workqueue tree:

Acked-by: Kalle Valo <kvalo@kernel.org>

-- 
https://patchwork.kernel.org/project/linux-wireless/patch/20230509015032.3768622-6-tj@kernel.org/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 02/13] wifi: mwifiex: Use default @max_active for workqueues
  2023-05-09  1:50 ` [PATCH 02/13] wifi: mwifiex: Use default @max_active for workqueues Tejun Heo
@ 2023-05-10  8:45   ` Kalle Valo
  2023-05-10 18:09   ` Brian Norris
  2023-05-19  0:36   ` Tejun Heo
  2 siblings, 0 replies; 45+ messages in thread
From: Kalle Valo @ 2023-05-10  8:45 UTC (permalink / raw)
  To: Tejun Heo
  Cc: jiangshanlai, linux-kernel, kernel-team, Tejun Heo,
	Amitkumar Karwar, Ganapathi Bhat, Sharvari Harisangam,
	Xinming Hu, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, linux-wireless, netdev

Tejun Heo <tj@kernel.org> wrote:

These workqueues only host a single work item and thus don't need explicit
> concurrency limit. Let's use the default @max_active. This doesn't cost
> anything and clearly expresses that @max_active doesn't matter.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Amitkumar Karwar <amitkarwar@gmail.com>
> Cc: Ganapathi Bhat <ganapathi017@gmail.com>
> Cc: Sharvari Harisangam <sharvari.harisangam@nxp.com>
> Cc: Xinming Hu <huxinming820@gmail.com>
> Cc: Kalle Valo <kvalo@kernel.org>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Jakub Kicinski <kuba@kernel.org>
> Cc: Paolo Abeni <pabeni@redhat.com>
> Cc: linux-wireless@vger.kernel.org
> Cc: netdev@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org

I didn't review the patch but I assume it's ok. Feel free to take it via your
tree:

Acked-by: Kalle Valo <kvalo@kernel.org>

Patch set to Not Applicable.

-- 
https://patchwork.kernel.org/project/linux-wireless/patch/20230509015032.3768622-3-tj@kernel.org/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 02/13] wifi: mwifiex: Use default @max_active for workqueues
  2023-05-09  1:50 ` [PATCH 02/13] wifi: mwifiex: Use default @max_active for workqueues Tejun Heo
  2023-05-10  8:45   ` Kalle Valo
@ 2023-05-10 18:09   ` Brian Norris
  2023-05-10 18:16     ` Tejun Heo
  2023-05-19  0:36   ` Tejun Heo
  2 siblings, 1 reply; 45+ messages in thread
From: Brian Norris @ 2023-05-10 18:09 UTC (permalink / raw)
  To: Tejun Heo
  Cc: jiangshanlai, linux-kernel, kernel-team, Amitkumar Karwar,
	Ganapathi Bhat, Sharvari Harisangam, Xinming Hu, Kalle Valo,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-wireless, netdev

On Mon, May 08, 2023 at 03:50:21PM -1000, Tejun Heo wrote:
> These workqueues only host a single work item and thus don't need an explicit
> concurrency limit. Let's use the default @max_active. This doesn't cost
> anything and clearly expresses that @max_active doesn't matter.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Amitkumar Karwar <amitkarwar@gmail.com>
> Cc: Ganapathi Bhat <ganapathi017@gmail.com>
> Cc: Sharvari Harisangam <sharvari.harisangam@nxp.com>
> Cc: Xinming Hu <huxinming820@gmail.com>
> Cc: Kalle Valo <kvalo@kernel.org>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Jakub Kicinski <kuba@kernel.org>
> Cc: Paolo Abeni <pabeni@redhat.com>
> Cc: linux-wireless@vger.kernel.org
> Cc: netdev@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org

Reviewed-by: Brian Norris <briannorris@chromium.org>

I'll admit, the workqueue documentation sounds a bit like "max_active ==
1 + WQ_UNBOUND" is what we want ("one work item [...] active at any
given time"), but that's more of my misunderstanding than anything --
each work item can only be active in a single context at any given time,
so that note is talking about distinct (i.e., more than 1) work items.
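
That distinction can be sketched as follows (hypothetical driver code, not from this series; the queue names and handler are illustrative): an ordered workqueue serializes *distinct* work items, while a single work item never runs concurrently with itself on either kind of queue.

```c
#include <linux/module.h>
#include <linux/workqueue.h>

static struct workqueue_struct *ordered_wq, *unbound_wq;
static struct work_struct work_a, work_b;

static void work_fn(struct work_struct *work)
{
	/* one given work item is never active in two contexts at once */
}

static int __init example_init(void)
{
	INIT_WORK(&work_a, work_fn);
	INIT_WORK(&work_b, work_fn);

	/* distinct items execute strictly one at a time, in queueing order */
	ordered_wq = alloc_ordered_workqueue("example_ordered", 0);
	if (!ordered_wq)
		return -ENOMEM;

	/* distinct items may execute simultaneously on different CPUs */
	unbound_wq = alloc_workqueue("example_unbound", WQ_UNBOUND, 0);
	if (!unbound_wq) {
		destroy_workqueue(ordered_wq);
		return -ENOMEM;
	}

	/* only on ordered_wq is work_b guaranteed to run after work_a */
	queue_work(ordered_wq, &work_a);
	queue_work(ordered_wq, &work_b);
	return 0;
}
```

With only one work item ever queued, as in the mwifiex case, the two behave identically, which is why the default @max_active is fine.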

While I'm here: we're still debugging what's affecting WiFi performance
on some of our WiFi systems, but it's possible I'll be turning some of
these into struct kthread_worker instead. We can cross that bridge
(including potential conflicts) if/when we come to it though.

Thanks,
Brian


* Re: [PATCH 02/13] wifi: mwifiex: Use default @max_active for workqueues
  2023-05-10 18:09   ` Brian Norris
@ 2023-05-10 18:16     ` Tejun Heo
  2023-05-10 18:57       ` Brian Norris
  0 siblings, 1 reply; 45+ messages in thread
From: Tejun Heo @ 2023-05-10 18:16 UTC (permalink / raw)
  To: Brian Norris
  Cc: jiangshanlai, linux-kernel, kernel-team, Amitkumar Karwar,
	Ganapathi Bhat, Sharvari Harisangam, Xinming Hu, Kalle Valo,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-wireless, netdev

Hello,

On Wed, May 10, 2023 at 11:09:55AM -0700, Brian Norris wrote:
> I'll admit, the workqueue documentation sounds a bit like "max_active ==
> 1 + WQ_UNBOUND" is what we want ("one work item [...] active at any
> given time"), but that's more of my misunderstanding than anything --
> each work item can only be active in a single context at any given time,
> so that note is talking about distinct (i.e., more than 1) work items.

Yeah, a future patch is gonna change the semantics a bit and I'll update the
doc to be clearer.

> While I'm here: we're still debugging what's affecting WiFi performance
> on some of our WiFi systems, but it's possible I'll be turning some of
> these into struct kthread_worker instead. We can cross that bridge
> (including potential conflicts) if/when we come to it though.

Can you elaborate on the performance problem you're seeing? I'm working on a
major update for workqueue to improve its locality behavior, so if you're
experiencing issues on CPUs w/ multiple L3 caches, it'd be a good test case.

Thanks.

-- 
tejun


* Re: [PATCH 02/13] wifi: mwifiex: Use default @max_active for workqueues
  2023-05-10 18:16     ` Tejun Heo
@ 2023-05-10 18:57       ` Brian Norris
  2023-05-10 19:19         ` Tejun Heo
  0 siblings, 1 reply; 45+ messages in thread
From: Brian Norris @ 2023-05-10 18:57 UTC (permalink / raw)
  To: Tejun Heo
  Cc: jiangshanlai, linux-kernel, kernel-team, Amitkumar Karwar,
	Ganapathi Bhat, Sharvari Harisangam, Xinming Hu, Kalle Valo,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-wireless, netdev, Pin-yen Lin

Hi,

On Wed, May 10, 2023 at 08:16:00AM -1000, Tejun Heo wrote:
> > While I'm here: we're still debugging what's affecting WiFi performance
> > on some of our WiFi systems, but it's possible I'll be turning some of
> > these into struct kthread_worker instead. We can cross that bridge
> > (including potential conflicts) if/when we come to it though.
> 
> Can you elaborate on the performance problem you're seeing? I'm working on a
> major update for workqueue to improve its locality behavior, so if you're
> experiencing issues on CPUs w/ multiple L3 caches, it'd be a good test case.

Sure!

Test case: iperf TCP RX (i.e., hits "MWIFIEX_RX_WORK_QUEUE" a lot) at
some of the higher (VHT 80 MHz) data rates.

Hardware: Mediatek MT8173 2xA53 (little) + 2xA72 (big) CPU
(I'm not familiar with its cache details)
+
Marvell SD8897 SDIO WiFi (mwifiex_sdio)

We're looking at a major regression from our 4.19 kernel to a 5.15
kernel (yeah, that's downstream reality). So far, we've found that
performance is:

(1) much better (nearly the same as 4.19) if we add WQ_SYSFS and pin the
work queue to one CPU (doesn't really matter which CPU, as long as it's
not the one loaded with IRQ(?) work)

(2) moderately better if we pin the CPU frequency (e.g., "performance"
cpufreq governor instead of "schedutil")

(3) moderately better (not quite as good as (2)) if we switch to a
kthread_worker and don't pin anything.

We tried (2) because we saw a lot more CPU migration on kernel 5.15
(work moves across all 4 CPUs throughout the run; on kernel 4.19 it
mostly switched between 2 CPUs).

We tried (3) suspecting some kind of EAS issue (instead of distributing
our workload onto 4 different kworkers, our work (and therefore our load
calculation) is mostly confined to a single kernel thread). But it still
seems like our issues are more than "just" EAS / cpufreq issues, since
(2) and (3) aren't as good as (1).

NB: there weren't many relevant mwifiex or MTK-SDIO changes in this
range.

So we're still investigating a few other areas, but it does seem like
"locality" (in some sense of the word) is relevant. We'd probably be
open to testing any patches you have, although it's likely we'd have the
easiest time if we can port those to 5.15. We're constantly working on
getting good upstream support for Chromebook chips, but ARM SoC reality
is that it still varies a lot as to how much works upstream on any given
system.

Thanks,
Brian


* Re: [PATCH 02/13] wifi: mwifiex: Use default @max_active for workqueues
  2023-05-10 18:57       ` Brian Norris
@ 2023-05-10 19:19         ` Tejun Heo
  2023-05-10 19:50           ` Brian Norris
  0 siblings, 1 reply; 45+ messages in thread
From: Tejun Heo @ 2023-05-10 19:19 UTC (permalink / raw)
  To: Brian Norris
  Cc: jiangshanlai, linux-kernel, kernel-team, Amitkumar Karwar,
	Ganapathi Bhat, Sharvari Harisangam, Xinming Hu, Kalle Valo,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-wireless, netdev, Pin-yen Lin

Hello,

On Wed, May 10, 2023 at 11:57:41AM -0700, Brian Norris wrote:
> Test case: iperf TCP RX (i.e., hits "MWIFIEX_RX_WORK_QUEUE" a lot) at
> some of the higher (VHT 80 MHz) data rates.
> 
> Hardware: Mediatek MT8173 2xA53 (little) + 2xA72 (big) CPU
> (I'm not familiar with its cache details)
> +
> Marvell SD8897 SDIO WiFi (mwifiex_sdio)

Yeah, we had multiple similar cases on what I think are similar
configurations, which is why I'm working on improving workqueue locality.

> We're looking at a major regression from our 4.19 kernel to a 5.15
> kernel (yeah, that's downstream reality). So far, we've found that
> performance is:

That's curious. 4.19 is old but I scanned the history and there's nothing
which can cause that kind of perf regression for unbound workqueues between
4.19 and 5.15.

> (1) much better (nearly the same as 4.19) if we add WQ_SYSFS and pin the
> work queue to one CPU (doesn't really matter which CPU, as long as it's
> not the one loaded with IRQ(?) work)
> 
> (2) moderately better if we pin the CPU frequency (e.g., "performance"
> cpufreq governor instead of "schedutil")
> 
> (3) moderately better (not quite as good as (2)) if we switch to a
> kthread_worker and don't pin anything.

Hmm... so it's not just workqueue.

> We tried (2) because we saw a lot more CPU migration on kernel 5.15
> (work moves across all 4 CPUs throughout the run; on kernel 4.19 it
> mostly switched between 2 CPUs).

Workqueue can contribute to this but it seems more likely that scheduling
changes are also part of the story.

> We tried (3) suspecting some kind of EAS issue (instead of distributing
> our workload onto 4 different kworkers, our work (and therefore our load
> calculation) is mostly confined to a single kernel thread). But it still
> seems like our issues are more than "just" EAS / cpufreq issues, since
> (2) and (3) aren't as good as (1).
> 
> NB: there weren't many relevant mwifiex or MTK-SDIO changes in this
> range.
> 
> So we're still investigating a few other areas, but it does seem like
> "locality" (in some sense of the word) is relevant. We'd probably be
> open to testing any patches you have, although it's likely we'd have the
> easiest time if we can port those to 5.15. We're constantly working on
> getting good upstream support for Chromebook chips, but ARM SoC reality
> is that it still varies a lot as to how much works upstream on any given
> system.

I should be able to post the patchset later today or tomorrow. It comes with
sysfs knobs to control affinity scopes and strictness, so hopefully you
should be able to find the configuration that works without too much
difficulty.

Thanks.

-- 
tejun


* Re: [PATCH 02/13] wifi: mwifiex: Use default @max_active for workqueues
  2023-05-10 19:19         ` Tejun Heo
@ 2023-05-10 19:50           ` Brian Norris
  0 siblings, 0 replies; 45+ messages in thread
From: Brian Norris @ 2023-05-10 19:50 UTC (permalink / raw)
  To: Tejun Heo
  Cc: jiangshanlai, linux-kernel, kernel-team, Amitkumar Karwar,
	Ganapathi Bhat, Sharvari Harisangam, Xinming Hu, Kalle Valo,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-wireless, netdev, Pin-yen Lin

On Wed, May 10, 2023 at 09:19:20AM -1000, Tejun Heo wrote:
> On Wed, May 10, 2023 at 11:57:41AM -0700, Brian Norris wrote:
> > (1) much better (nearly the same as 4.19) if we add WQ_SYSFS and pin the
> > work queue to one CPU (doesn't really matter which CPU, as long as it's
> > not the one loaded with IRQ(?) work)
> > 
> > (2) moderately better if we pin the CPU frequency (e.g., "performance"
> > cpufreq governor instead of "schedutil")
> > 
> > (3) moderately better (not quite as good as (2)) if we switch to a
> > kthread_worker and don't pin anything.
> 
> Hmm... so it's not just workqueue.

Right. And not just cpufreq either.

> > We tried (2) because we saw a lot more CPU migration on kernel 5.15
> > (work moves across all 4 CPUs throughout the run; on kernel 4.19 it
> > mostly switched between 2 CPUs).
> 
> Workqueue can contribute to this but it seems more likely that scheduling
> changes are also part of the story.

Yeah, that's one theory. And in that vein, that's one reason we might
consider switching to a kthread_worker anyway, even if that doesn't
solve all the regression -- because schedutil relies on per-entity load
calculations to make decisions, and workqueues don't help the scheduler
understand that load when spread across N CPUs (workers). A dedicated
kthread would better represent our workload to the scheduler.

(Threaded NAPI -- mwifiex doesn't support NAPI -- takes a similar
approach, as it has its own thread per NAPI context.)
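
A minimal sketch of that kthread_worker alternative (hypothetical code, not an actual mwifiex patch; names are illustrative) — all queued work runs in one dedicated kthread, so the scheduler sees the load as a single entity:

```c
#include <linux/err.h>
#include <linux/kthread.h>

static struct kthread_worker *rx_worker;
static struct kthread_work rx_work;

static void rx_work_fn(struct kthread_work *work)
{
	/* process received frames in the dedicated thread's context */
}

static int example_start(void)
{
	/* one thread hosts all work queued to this worker */
	rx_worker = kthread_create_worker(0, "example_rx");
	if (IS_ERR(rx_worker))
		return PTR_ERR(rx_worker);

	kthread_init_work(&rx_work, rx_work_fn);
	kthread_queue_work(rx_worker, &rx_work);
	return 0;
}

static void example_stop(void)
{
	/* flushes pending work and stops the thread */
	kthread_destroy_worker(rx_worker);
}
```

Unlike a workqueue, the worker thread is a stable scheduling entity that cpufreq/EAS load tracking can follow, at the cost of losing workqueue's concurrency management.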

> > We tried (3) suspecting some kind of EAS issue (instead of distributing
> > our workload onto 4 different kworkers, our work (and therefore our load
> > calculation) is mostly confined to a single kernel thread). But it still
> > seems like our issues are more than "just" EAS / cpufreq issues, since
> > (2) and (3) aren't as good as (1).
> > 
> > NB: there weren't many relevant mwifiex or MTK-SDIO changes in this
> > range.
> > 
> > So we're still investigating a few other areas, but it does seem like
> > "locality" (in some sense of the word) is relevant. We'd probably be
> > open to testing any patches you have, although it's likely we'd have the
> > easiest time if we can port those to 5.15. We're constantly working on
> > getting good upstream support for Chromebook chips, but ARM SoC reality
> > is that it still varies a lot as to how much works upstream on any given
> > system.
> 
> I should be able to post the patchset later today or tomorrow. It comes with
> sysfs knobs to control affinity scopes and strictness, so hopefully you
> should be able to find the configuration that works without too much
> difficulty.

Great!

Brian


* Re: [PATCH 02/13] wifi: mwifiex: Use default @max_active for workqueues
  2023-05-09  1:50 ` [PATCH 02/13] wifi: mwifiex: Use default @max_active for workqueues Tejun Heo
  2023-05-10  8:45   ` Kalle Valo
  2023-05-10 18:09   ` Brian Norris
@ 2023-05-19  0:36   ` Tejun Heo
  2 siblings, 0 replies; 45+ messages in thread
From: Tejun Heo @ 2023-05-19  0:36 UTC (permalink / raw)
  To: jiangshanlai
  Cc: linux-kernel, kernel-team, Amitkumar Karwar, Ganapathi Bhat,
	Sharvari Harisangam, Xinming Hu, Kalle Valo, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, linux-wireless,
	netdev

On Mon, May 08, 2023 at 03:50:21PM -1000, Tejun Heo wrote:
> These workqueues only host a single work item and thus don't need an explicit
> concurrency limit. Let's use the default @max_active. This doesn't cost
> anything and clearly expresses that @max_active doesn't matter.

Applied to wq/for-6.5-cleanup-ordered.

Thanks.

-- 
tejun


* Re: [PATCH 05/13] wifi: ath10/11/12k: Use alloc_ordered_workqueue() to create ordered workqueues
  2023-05-09  1:50 ` [PATCH 05/13] wifi: ath10/11/12k: " Tejun Heo
  2023-05-10  8:41   ` Kalle Valo
@ 2023-05-19  0:41   ` Tejun Heo
  1 sibling, 0 replies; 45+ messages in thread
From: Tejun Heo @ 2023-05-19  0:41 UTC (permalink / raw)
  To: jiangshanlai
  Cc: linux-kernel, kernel-team, Kalle Valo, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, linux-wireless,
	netdev

Applied to wq/for-6.5-cleanup-ordered.

Thanks.

-- 
tejun


* Re: [PATCH 11/13] crypto: octeontx2: Use alloc_ordered_workqueue() to create ordered workqueues
  2023-05-09  1:50 ` [PATCH 11/13] crypto: octeontx2: " Tejun Heo
  2023-05-09  2:18   ` Herbert Xu
@ 2023-05-19  0:42   ` Tejun Heo
  1 sibling, 0 replies; 45+ messages in thread
From: Tejun Heo @ 2023-05-19  0:42 UTC (permalink / raw)
  To: jiangshanlai
  Cc: linux-kernel, kernel-team, Boris Brezillon, Arnaud Ebalard,
	Srujana Challa, Herbert Xu, David S. Miller, Shijith Thotton,
	Vladis Dronov, Vincent Mailhol, Wolfram Sang, Alexander Lobakin,
	Minghao Chi, ye xingchen, linux-crypto

Applied to wq/for-6.5-cleanup-ordered.

Thanks.

-- 
tejun


* Re: [PATCH 12/13] media: coda: Use alloc_ordered_workqueue() to create ordered workqueues
  2023-05-09  1:50 ` [PATCH 12/13] media: coda: " Tejun Heo
  2023-05-09 10:46   ` Philipp Zabel
@ 2023-05-19  0:50   ` Tejun Heo
  1 sibling, 0 replies; 45+ messages in thread
From: Tejun Heo @ 2023-05-19  0:50 UTC (permalink / raw)
  To: jiangshanlai
  Cc: linux-kernel, kernel-team, Philipp Zabel, Mauro Carvalho Chehab,
	linux-media

Applied to wq/for-6.5-cleanup-ordered.

Thanks.

-- 
tejun


* [PATCH RESEND 01/13] scsi: ncr53c8xx: Use default @max_active for hostdata->work_q
  2023-05-09  1:50 ` [PATCH 01/13] scsi: ncr53c8xx: Use default @max_active for hostdata->work_q Tejun Heo
@ 2023-05-21  2:48   ` Tejun Heo
  2023-05-21  6:42     ` Finn Thain
  2023-05-22 22:06     ` Martin K. Petersen
  0 siblings, 2 replies; 45+ messages in thread
From: Tejun Heo @ 2023-05-21  2:48 UTC (permalink / raw)
  To: jiangshanlai
  Cc: linux-kernel, kernel-team, Finn Thain, Michael Schmitz,
	James E.J. Bottomley, Martin K. Petersen, linux-scsi

From: Tejun Heo <tj@kernel.org>
Subject: scsi: ncr53c8xx: Use default @max_active for hostdata->work_q

hostdata->work_q only hosts a single work item, hostdata->main_task, and
thus doesn't need explicit concurrency limit. Let's use the default
@max_active. This doesn't cost anything and clearly expresses that
@max_active doesn't matter.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Finn Thain <fthain@linux-m68k.org>
Cc: Michael Schmitz <schmitzmic@gmail.com>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
Hello,

Resending because I screwed up the cc list in the original posting. The
whole series can be viewed at:

  http://lkml.kernel.org/r/20230509015032.3768622-1-tj@kernel.org

It's not a must but it'd be great if I can route this through the workqueue
tree so that it can go together with other related and followup cleanups.

Thanks.

 drivers/scsi/NCR5380.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/scsi/NCR5380.c
+++ b/drivers/scsi/NCR5380.c
@@ -417,7 +417,7 @@ static int NCR5380_init(struct Scsi_Host
 	INIT_WORK(&hostdata->main_task, NCR5380_main);
 	hostdata->work_q = alloc_workqueue("ncr5380_%d",
 	                        WQ_UNBOUND | WQ_MEM_RECLAIM,
-	                        1, instance->host_no);
+				0, instance->host_no);
 	if (!hostdata->work_q)
 		return -ENOMEM;
 


* Re: [PATCH RESEND 01/13] scsi: ncr53c8xx: Use default @max_active for hostdata->work_q
  2023-05-21  2:48   ` [PATCH RESEND " Tejun Heo
@ 2023-05-21  6:42     ` Finn Thain
  2023-05-22 22:06     ` Martin K. Petersen
  1 sibling, 0 replies; 45+ messages in thread
From: Finn Thain @ 2023-05-21  6:42 UTC (permalink / raw)
  To: Tejun Heo
  Cc: jiangshanlai, linux-kernel, kernel-team, Michael Schmitz,
	James E.J. Bottomley, Martin K. Petersen, linux-scsi

On Sat, 20 May 2023, Tejun Heo wrote:

> From: Tejun Heo <tj@kernel.org>
> Subject: scsi: ncr53c8xx: Use default @max_active for hostdata->work_q
> 

This driver is normally referred to as ncr5380 or NCR5380. (It doesn't 
support any other member of the 538x family.)

> hostdata->work_q only hosts a single work item, hostdata->main_task, and
> thus doesn't need explicit concurrency limit. Let's use the default
> @max_active. This doesn't cost anything and clearly expresses that
> @max_active doesn't matter.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>

Acked-by: Finn Thain <fthain@linux-m68k.org>

> Cc: Finn Thain <fthain@linux-m68k.org>
> Cc: Michael Schmitz <schmitzmic@gmail.com>
> Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
> Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
> Cc: linux-scsi@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> ---
> Hello,
> 
> Resending because I screwed up the cc list in the original posting. The
> whole series can be viewed at:
> 
>   http://lkml.kernel.org/r/20230509015032.3768622-1-tj@kernel.org
> 
> It's not a must but it'd be great if I can route this through the 
> workqueue tree so that it can go together with other related and 
> followup cleanups.
> 
> Thanks.
> 

No objection from me. I guess it's Martin's call?


* Re: [PATCH RESEND 01/13] scsi: ncr53c8xx: Use default @max_active for hostdata->work_q
  2023-05-21  2:48   ` [PATCH RESEND " Tejun Heo
  2023-05-21  6:42     ` Finn Thain
@ 2023-05-22 22:06     ` Martin K. Petersen
  2023-05-23  0:58       ` Tejun Heo
  1 sibling, 1 reply; 45+ messages in thread
From: Martin K. Petersen @ 2023-05-22 22:06 UTC (permalink / raw)
  To: Tejun Heo
  Cc: jiangshanlai, linux-kernel, kernel-team, Finn Thain,
	Michael Schmitz, James E.J. Bottomley, Martin K. Petersen,
	linux-scsi


Hi Tejun!

> Subject: scsi: ncr53c8xx: Use default @max_active for hostdata->work_q
>
> hostdata->work_q only hosts a single work item, hostdata->main_task, and
> thus doesn't need explicit concurrency limit. Let's use the default
> @max_active. This doesn't cost anything and clearly expresses that
> @max_active doesn't matter.

> It's not a must but it'd be great if I can route this through the workqueue
> tree so that it can go together with other related and followup cleanups.

As Finn pointed out, please make sure it's tagged NCR5380: instead of
ncr53c8xx:. Otherwise OK.

Acked-by: Martin K. Petersen <martin.petersen@oracle.com>

-- 
Martin K. Petersen	Oracle Linux Engineering


* Re: [PATCH RESEND 01/13] scsi: ncr53c8xx: Use default @max_active for hostdata->work_q
  2023-05-22 22:06     ` Martin K. Petersen
@ 2023-05-23  0:58       ` Tejun Heo
  0 siblings, 0 replies; 45+ messages in thread
From: Tejun Heo @ 2023-05-23  0:58 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: jiangshanlai, linux-kernel, kernel-team, Finn Thain,
	Michael Schmitz, James E.J. Bottomley, linux-scsi

On Mon, May 22, 2023 at 06:06:56PM -0400, Martin K. Petersen wrote:
> 
> Hi Tejun!
> 
> > Subject: scsi: ncr53c8xx: Use default @max_active for hostdata->work_q
> >
> > hostdata->work_q only hosts a single work item, hostdata->main_task, and
> > thus doesn't need explicit concurrency limit. Let's use the default
> > @max_active. This doesn't cost anything and clearly expresses that
> > @max_active doesn't matter.
> 
> > It's not a must but it'd be great if I can route this through the workqueue
> > tree so that it can go together with other related and followup cleanups.
> 
> As Finn pointed out, please make sure it's tagged NCR5380: instead of
> ncr53c8xx:. Otherwise OK.
> 
> Acked-by: Martin K. Petersen <martin.petersen@oracle.com>

Will update and apply to wq/for-6.5-cleanup-ordered.

Thank you.

-- 
tejun


* Re: [PATCH 04/13] media: amphion: Use alloc_ordered_workqueue() to create ordered workqueues
  2023-05-09  1:50 ` [PATCH 04/13] media: amphion: " Tejun Heo
@ 2023-05-23 12:19   ` Hans Verkuil
  2023-05-23 21:25     ` Tejun Heo
  0 siblings, 1 reply; 45+ messages in thread
From: Hans Verkuil @ 2023-05-23 12:19 UTC (permalink / raw)
  To: Tejun Heo, jiangshanlai
  Cc: linux-kernel, kernel-team, Ming Qian, Shijie Qin, Zhou Peng,
	Mauro Carvalho Chehab, linux-media

On 09/05/2023 03:50, Tejun Heo wrote:
> BACKGROUND
> ==========
> 
> When multiple work items are queued to a workqueue, their execution order
> doesn't match the queueing order. They may get executed in any order and
> simultaneously. When fully serialized execution - one by one in the queueing
> order - is needed, an ordered workqueue should be used which can be created
> with alloc_ordered_workqueue().
> 
> However, alloc_ordered_workqueue() was a later addition. Before it, an
> ordered workqueue could be obtained by creating an UNBOUND workqueue with
> @max_active==1. This originally was an implementation side-effect which was
> broken by 4c16bd327c74 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
> ordered").

That's the wrong patch description, it should be:

4c16bd327c74 ("workqueue: implement NUMA affinity for unbound workqueues")

> Because there were users that depended on the ordered execution,
> 5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")
> made workqueue allocation path to implicitly promote UNBOUND workqueues w/
> @max_active==1 to ordered workqueues.
> 
> While this has worked okay, overloading the UNBOUND allocation interface
> this way creates other issues. It's difficult to tell whether a given
> workqueue actually needs to be ordered and users that legitimately want a
> min concurrency level wq unexpectedly gets an ordered one instead. With
> planned UNBOUND workqueue updates to improve execution locality and more
> prevalence of chiplet designs which can benefit from such improvements, this
> isn't a state we wanna be in forever.
> 
> This patch series audits all callsites that create an UNBOUND workqueue w/
> @max_active==1 and converts them to alloc_ordered_workqueue() as necessary.
> 
> WHAT TO LOOK FOR
> ================
> 
> The conversions are from
> 
>   alloc_workqueue(WQ_UNBOUND | flags, 1, args..)
> 
> to
> 
>   alloc_ordered_workqueue(flags, args...)
> 
> which don't cause any functional changes. If you know that fully ordered
> execution is not ncessary, please let me know. I'll drop the conversion and

ncessary -> necessary

> instead add a comment noting the fact to reduce confusion while conversion
> is in progress.
> 
> If you aren't fully sure, it's completely fine to let the conversion
> through. The behavior will stay exactly the same and we can always
> reconsider later.
> 
> As there are follow-up workqueue core changes, I'd really appreciate if the
> patch can be routed through the workqueue tree w/ your acks. Thanks.

Note that the max line length of 75 is exceeded in this commit message.

> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Ming Qian <ming.qian@nxp.com>
> Cc: Shijie Qin <shijie.qin@nxp.com>
> Cc: Zhou Peng <eagle.zhou@nxp.com>
> Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
> Cc: linux-media@vger.kernel.org
> ---
>  drivers/media/platform/amphion/vpu_core.c | 2 +-
>  drivers/media/platform/amphion/vpu_v4l2.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/media/platform/amphion/vpu_core.c b/drivers/media/platform/amphion/vpu_core.c
> index de23627a119a..43d85a54268b 100644
> --- a/drivers/media/platform/amphion/vpu_core.c
> +++ b/drivers/media/platform/amphion/vpu_core.c
> @@ -254,7 +254,7 @@ static int vpu_core_register(struct device *dev, struct vpu_core *core)
>  	if (vpu_core_is_exist(vpu, core))
>  		return 0;
>  
> -	core->workqueue = alloc_workqueue("vpu", WQ_UNBOUND | WQ_MEM_RECLAIM, 1);
> +	core->workqueue = alloc_ordered_workqueue("vpu", WQ_MEM_RECLAIM);
>  	if (!core->workqueue) {
>  		dev_err(core->dev, "fail to alloc workqueue\n");
>  		return -ENOMEM;
> diff --git a/drivers/media/platform/amphion/vpu_v4l2.c b/drivers/media/platform/amphion/vpu_v4l2.c
> index 6773b885597c..a48edb445eea 100644
> --- a/drivers/media/platform/amphion/vpu_v4l2.c
> +++ b/drivers/media/platform/amphion/vpu_v4l2.c
> @@ -740,7 +740,7 @@ int vpu_v4l2_open(struct file *file, struct vpu_inst *inst)
>  	inst->fh.ctrl_handler = &inst->ctrl_handler;
>  	file->private_data = &inst->fh;
>  	inst->state = VPU_CODEC_STATE_DEINIT;
> -	inst->workqueue = alloc_workqueue("vpu_inst", WQ_UNBOUND | WQ_MEM_RECLAIM, 1);
> +	inst->workqueue = alloc_ordered_workqueue("vpu_inst", WQ_MEM_RECLAIM);
>  	if (inst->workqueue) {
>  		INIT_WORK(&inst->msg_work, vpu_inst_run_work);
>  		ret = kfifo_init(&inst->msg_fifo,

Acked-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>

Feel free to take this with the commit log changes mentioned above.

Regards,

	Hans


* Re: [PATCH 04/13] media: amphion: Use alloc_ordered_workqueue() to create ordered workqueues
  2023-05-23 12:19   ` Hans Verkuil
@ 2023-05-23 21:25     ` Tejun Heo
  0 siblings, 0 replies; 45+ messages in thread
From: Tejun Heo @ 2023-05-23 21:25 UTC (permalink / raw)
  To: Hans Verkuil
  Cc: jiangshanlai, linux-kernel, kernel-team, Ming Qian, Shijie Qin,
	Zhou Peng, Mauro Carvalho Chehab, linux-media

Applied to wq/for-6.5-cleanup-ordered with the suggested description
updates.

Thanks.

-- 
tejun


* Re: [PATCH 03/13] dm integrity: Use alloc_ordered_workqueue() to create ordered workqueues
  2023-05-09  1:50   ` [dm-devel] " Tejun Heo
@ 2023-05-25 22:10     ` Tejun Heo
  -1 siblings, 0 replies; 45+ messages in thread
From: Tejun Heo @ 2023-05-25 22:10 UTC (permalink / raw)
  To: jiangshanlai
  Cc: linux-kernel, kernel-team, Alasdair Kergon, Mike Snitzer, dm-devel

On Mon, May 08, 2023 at 03:50:22PM -1000, Tejun Heo wrote:
...
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Alasdair Kergon <agk@redhat.com>
> Cc: Mike Snitzer <snitzer@kernel.org>
> Cc: dm-devel@redhat.com
> Cc: linux-kernel@vger.kernel.org

Hey, I'm going to apply this to wq/for-6.5-cleanup-ordered. As it's an
identity conversion, it shouldn't break anything. Please holler if you have
any concerns.

Thanks.

-- 
tejun



* Re: [PATCH 06/13] net: wwan: t7xx: Use alloc_ordered_workqueue() to create ordered workqueues
  2023-05-09  1:50 ` [PATCH 06/13] net: wwan: t7xx: " Tejun Heo
@ 2023-05-25 22:11   ` Tejun Heo
  0 siblings, 0 replies; 45+ messages in thread
From: Tejun Heo @ 2023-05-25 22:11 UTC (permalink / raw)
  To: jiangshanlai
  Cc: linux-kernel, kernel-team, Chandrashekar Devegowda,
	Intel Corporation, Chiranjeevi Rapolu, Liu Haijun,
	M Chetan Kumar, Ricardo Martinez, Loic Poulain, Sergey Ryazanov,
	Johannes Berg, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, netdev

On Mon, May 08, 2023 at 03:50:25PM -1000, Tejun Heo wrote:
...
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Chandrashekar Devegowda <chandrashekar.devegowda@intel.com>
> Cc: Intel Corporation <linuxwwan@intel.com>
> Cc: Chiranjeevi Rapolu <chiranjeevi.rapolu@linux.intel.com>
> Cc: Liu Haijun <haijun.liu@mediatek.com>
> Cc: M Chetan Kumar <m.chetan.kumar@linux.intel.com>
> Cc: Ricardo Martinez <ricardo.martinez@linux.intel.com>
> Cc: Loic Poulain <loic.poulain@linaro.org>
> Cc: Sergey Ryazanov <ryazanov.s.a@gmail.com>
> Cc: Johannes Berg <johannes@sipsolutions.net>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Jakub Kicinski <kuba@kernel.org>
> Cc: Paolo Abeni <pabeni@redhat.com>
> Cc: netdev@vger.kernel.org

Hey, I'm going to apply this to wq/for-6.5-cleanup-ordered. As it's an
identity conversion, it shouldn't break anything. Please holler if you have
any concerns.

Thanks.

-- 
tejun


* Re: [PATCH 09/13] net: qrtr: Use alloc_ordered_workqueue() to create ordered workqueues
  2023-05-09  1:50 ` [PATCH 09/13] net: qrtr: " Tejun Heo
@ 2023-05-25 22:14   ` Tejun Heo
  0 siblings, 0 replies; 45+ messages in thread
From: Tejun Heo @ 2023-05-25 22:14 UTC (permalink / raw)
  To: jiangshanlai
  Cc: linux-kernel, kernel-team, Manivannan Sadhasivam,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-arm-msm, netdev

On Mon, May 08, 2023 at 03:50:28PM -1000, Tejun Heo wrote:
...
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Manivannan Sadhasivam <mani@kernel.org>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Jakub Kicinski <kuba@kernel.org>
> Cc: Paolo Abeni <pabeni@redhat.com>
> Cc: linux-arm-msm@vger.kernel.org
> Cc: netdev@vger.kernel.org

Hey, I'm going to apply this to wq/for-6.5-cleanup-ordered. As it's an
identity conversion, it shouldn't break anything. Please holler if you have
any concerns.

Thanks.

-- 
tejun


* Re: [PATCH 10/13] rxrpc: Use alloc_ordered_workqueue() to create ordered workqueues
  2023-05-09  1:50 ` [PATCH 10/13] rxrpc: " Tejun Heo
@ 2023-05-25 22:14   ` Tejun Heo
  0 siblings, 0 replies; 45+ messages in thread
From: Tejun Heo @ 2023-05-25 22:14 UTC (permalink / raw)
  To: jiangshanlai
  Cc: linux-kernel, kernel-team, David Howells, Marc Dionne,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-afs, netdev

On Mon, May 08, 2023 at 03:50:29PM -1000, Tejun Heo wrote:
...
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: David Howells <dhowells@redhat.com>
> Cc: Marc Dionne <marc.dionne@auristor.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Jakub Kicinski <kuba@kernel.org>
> Cc: Paolo Abeni <pabeni@redhat.com>
> Cc: linux-afs@lists.infradead.org
> Cc: netdev@vger.kernel.org

Hey, I'm going to apply this to wq/for-6.5-cleanup-ordered. As it's an
identity conversion, it shouldn't break anything. Please holler if you have
any concerns.

Thanks.

-- 
tejun


* [PATCH 08/13] btrfs: Use alloc_ordered_workqueue() to create ordered workqueues
  2023-05-09 23:47         ` Tejun Heo
@ 2023-05-25 23:33           ` Tejun Heo
  2023-05-26 12:52             ` David Sterba
  0 siblings, 1 reply; 45+ messages in thread
From: Tejun Heo @ 2023-05-25 23:33 UTC (permalink / raw)
  To: David Sterba
  Cc: jiangshanlai, linux-kernel, kernel-team, Wang Yugui, Chris Mason,
	Josef Bacik, David Sterba, linux-btrfs

BACKGROUND
==========

When multiple work items are queued to a workqueue, their execution order
doesn't match the queueing order. They may get executed in any order and
simultaneously. When fully serialized execution - one by one in the queueing
order - is needed, an ordered workqueue should be used which can be created
with alloc_ordered_workqueue().
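
For illustration only (not part of the patch; @work_a and @work_b are
hypothetical work items initialized elsewhere with INIT_WORK()), the
ordering guarantee looks roughly like this:

```c
/* sketch: items queued to an ordered workqueue run one at a time,
 * strictly in queueing order */
struct workqueue_struct *wq;

wq = alloc_ordered_workqueue("example-ordered", WQ_MEM_RECLAIM);
if (!wq)
	return -ENOMEM;

queue_work(wq, &work_a);	/* executes first */
queue_work(wq, &work_b);	/* starts only after work_a has finished */
```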

However, alloc_ordered_workqueue() was a later addition. Before it, an
ordered workqueue could be obtained by creating an UNBOUND workqueue with
@max_active==1. This ordering was originally an implementation side effect,
which was broken by 4c16bd327c74 ("workqueue: implement NUMA affinity for
unbound workqueues"). Because there were users that depended on the ordered
execution, 5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
ordered") made the workqueue allocation path implicitly promote UNBOUND
workqueues w/ @max_active==1 to ordered workqueues.

While this has worked okay, overloading the UNBOUND allocation interface
this way creates other issues. It's difficult to tell whether a given
workqueue actually needs to be ordered, and users that legitimately want a
min concurrency level wq unexpectedly get an ordered one instead. With
planned UNBOUND workqueue updates to improve execution locality and more
prevalence of chiplet designs which can benefit from such improvements, this
isn't a state we want to be in forever.

This patch series audits all callsites that create an UNBOUND workqueue w/
@max_active==1 and converts them to alloc_ordered_workqueue() as necessary.
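
In most cases the conversion is a one-line identity change of the following
shape (a hypothetical example, not an actual hunk from the series);
alloc_ordered_workqueue() implies WQ_UNBOUND and @max_active==1, so both are
dropped at the callsite:

```
-	wq = alloc_workqueue("foo", WQ_UNBOUND | WQ_MEM_RECLAIM, 1);
+	wq = alloc_ordered_workqueue("foo", WQ_MEM_RECLAIM);
```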

BTRFS
=====

* fs_info->scrub_workers initialized in scrub_workers_get() was setting
  @max_active to 1 when @is_dev_replace is set and it seems that the
  workqueue actually needs to be ordered if @is_dev_replace. Update the code
  so that alloc_ordered_workqueue() is used if @is_dev_replace.

* fs_info->discard_ctl.discard_workers initialized in
  btrfs_init_workqueues() was directly using alloc_workqueue() w/
  @max_active==1. Converted to alloc_ordered_workqueue().

* fs_info->fixup_workers and fs_info->qgroup_rescan_workers initialized in
  btrfs_init_workqueues() use btrfs's workqueue wrapper, btrfs_workqueue,
  which is allocated with btrfs_alloc_workqueue().

  btrfs_workqueue implements automatic @max_active adjustment which is
  disabled when the specified max limit is below a certain threshold, so
  calling btrfs_alloc_workqueue() with @limit_active==1 yields an ordered
  workqueue whose @max_active won't be changed as the auto-tuning is
  disabled.

  This is rather brittle in that nothing clearly indicates that the two
  workqueues should be ordered or btrfs_alloc_workqueue() must disable
  auto-tuning when @limit_active==1.

  This patch factors out the common btrfs_workqueue init code into
  btrfs_init_workqueue() and adds an explicit btrfs_alloc_ordered_workqueue().
  The two workqueues are converted to use the new ordered allocation
  interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Wang Yugui <wangyugui@e16-tech.com>
Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: David Sterba <dsterba@suse.com>
Cc: linux-btrfs@vger.kernel.org
---
Hello,

David, I think this is a bit too invasive to carry through workqueue tree.
If this looks okay, can you please route it through the btrfs tree?

Thanks.

 fs/btrfs/async-thread.c |   43 ++++++++++++++++++++++++++++++++++++++-----
 fs/btrfs/async-thread.h |    3 +++
 fs/btrfs/disk-io.c      |    8 +++++---
 fs/btrfs/scrub.c        |    6 ++++--
 4 files changed, 50 insertions(+), 10 deletions(-)

--- a/fs/btrfs/async-thread.c
+++ b/fs/btrfs/async-thread.c
@@ -71,6 +71,16 @@ bool btrfs_workqueue_normal_congested(co
 	return atomic_read(&wq->pending) > wq->thresh * 2;
 }
 
+static void btrfs_init_workqueue(struct btrfs_workqueue *wq,
+				 struct btrfs_fs_info *fs_info)
+{
+	wq->fs_info = fs_info;
+	atomic_set(&wq->pending, 0);
+	INIT_LIST_HEAD(&wq->ordered_list);
+	spin_lock_init(&wq->list_lock);
+	spin_lock_init(&wq->thres_lock);
+}
+
 struct btrfs_workqueue *btrfs_alloc_workqueue(struct btrfs_fs_info *fs_info,
 					      const char *name, unsigned int flags,
 					      int limit_active, int thresh)
@@ -80,9 +90,9 @@ struct btrfs_workqueue *btrfs_alloc_work
 	if (!ret)
 		return NULL;
 
-	ret->fs_info = fs_info;
+	btrfs_init_workqueue(ret, fs_info);
+
 	ret->limit_active = limit_active;
-	atomic_set(&ret->pending, 0);
 	if (thresh == 0)
 		thresh = DFT_THRESHOLD;
 	/* For low threshold, disabling threshold is a better choice */
@@ -106,9 +116,32 @@ struct btrfs_workqueue *btrfs_alloc_work
 		return NULL;
 	}
 
-	INIT_LIST_HEAD(&ret->ordered_list);
-	spin_lock_init(&ret->list_lock);
-	spin_lock_init(&ret->thres_lock);
+	trace_btrfs_workqueue_alloc(ret, name);
+	return ret;
+}
+
+struct btrfs_workqueue *btrfs_alloc_ordered_workqueue(
+				struct btrfs_fs_info *fs_info, const char *name,
+				unsigned int flags)
+{
+	struct btrfs_workqueue *ret = kzalloc(sizeof(*ret), GFP_KERNEL);
+
+	if (!ret)
+		return NULL;
+
+	btrfs_init_workqueue(ret, fs_info);
+
+	/* ordered workqueues don't allow @max_active adjustments */
+	ret->limit_active = 1;
+	ret->current_active = 1;
+	ret->thresh = NO_THRESHOLD;
+
+	ret->normal_wq = alloc_ordered_workqueue("btrfs-%s", flags, name);
+	if (!ret->normal_wq) {
+		kfree(ret);
+		return NULL;
+	}
+
 	trace_btrfs_workqueue_alloc(ret, name);
 	return ret;
 }
--- a/fs/btrfs/async-thread.h
+++ b/fs/btrfs/async-thread.h
@@ -31,6 +31,9 @@ struct btrfs_workqueue *btrfs_alloc_work
 					      unsigned int flags,
 					      int limit_active,
 					      int thresh);
+struct btrfs_workqueue *btrfs_alloc_ordered_workqueue(
+				struct btrfs_fs_info *fs_info, const char *name,
+				unsigned int flags);
 void btrfs_init_work(struct btrfs_work *work, btrfs_func_t func,
 		     btrfs_func_t ordered_func, btrfs_func_t ordered_free);
 void btrfs_queue_work(struct btrfs_workqueue *wq,
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2177,6 +2177,7 @@ static int btrfs_init_workqueues(struct
 {
 	u32 max_active = fs_info->thread_pool_size;
 	unsigned int flags = WQ_MEM_RECLAIM | WQ_FREEZABLE | WQ_UNBOUND;
+	unsigned int ordered_flags = WQ_MEM_RECLAIM | WQ_FREEZABLE;
 
 	fs_info->workers =
 		btrfs_alloc_workqueue(fs_info, "worker", flags, max_active, 16);
@@ -2196,7 +2197,7 @@ static int btrfs_init_workqueues(struct
 		btrfs_alloc_workqueue(fs_info, "cache", flags, max_active, 0);
 
 	fs_info->fixup_workers =
-		btrfs_alloc_workqueue(fs_info, "fixup", flags, 1, 0);
+		btrfs_alloc_ordered_workqueue(fs_info, "fixup", ordered_flags);
 
 	fs_info->endio_workers =
 		alloc_workqueue("btrfs-endio", flags, max_active);
@@ -2215,9 +2216,10 @@ static int btrfs_init_workqueues(struct
 		btrfs_alloc_workqueue(fs_info, "delayed-meta", flags,
 				      max_active, 0);
 	fs_info->qgroup_rescan_workers =
-		btrfs_alloc_workqueue(fs_info, "qgroup-rescan", flags, 1, 0);
+		btrfs_alloc_ordered_workqueue(fs_info, "qgroup-rescan",
+					      ordered_flags);
 	fs_info->discard_ctl.discard_workers =
-		alloc_workqueue("btrfs_discard", WQ_UNBOUND | WQ_FREEZABLE, 1);
+		alloc_ordered_workqueue("btrfs_discard", WQ_FREEZABLE);
 
 	if (!(fs_info->workers && fs_info->hipri_workers &&
 	      fs_info->delalloc_workers && fs_info->flush_workers &&
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -2734,8 +2734,10 @@ static noinline_for_stack int scrub_work
 	if (refcount_inc_not_zero(&fs_info->scrub_workers_refcnt))
 		return 0;
 
-	scrub_workers = alloc_workqueue("btrfs-scrub", flags,
-					is_dev_replace ? 1 : max_active);
+	if (is_dev_replace)
+		scrub_workers = alloc_ordered_workqueue("btrfs-scrub", flags);
+	else
+		scrub_workers = alloc_workqueue("btrfs-scrub", flags, max_active);
 	if (!scrub_workers)
 		goto fail_scrub_workers;
 


* Re: [PATCH 08/13] btrfs: Use alloc_ordered_workqueue() to create ordered workqueues
  2023-05-25 23:33           ` Tejun Heo
@ 2023-05-26 12:52             ` David Sterba
  0 siblings, 0 replies; 45+ messages in thread
From: David Sterba @ 2023-05-26 12:52 UTC (permalink / raw)
  To: Tejun Heo
  Cc: jiangshanlai, linux-kernel, kernel-team, Wang Yugui, Chris Mason,
	Josef Bacik, David Sterba, linux-btrfs

On Thu, May 25, 2023 at 01:33:08PM -1000, Tejun Heo wrote:
...
> David, I think this is a bit too invasive to carry through workqueue tree.
> If this looks okay, can you plase apply route it through the btrfs tree?

Yes, and I actually prefer to take such patches via the btrfs tree unless
there's a strong dependency on other patches from another subsystem.
Thanks.


end of thread, other threads:[~2023-05-26 12:58 UTC | newest]

Thread overview: 45+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-09  1:50 [PATCHSET v2 wq/for-6.5-cleanup-ordered] workqueue: Ordered workqueue creation cleanup Tejun Heo
2023-05-09  1:50 ` [PATCH 01/13] scsi: ncr53c8xx: Use default @max_active for hostdata->work_q Tejun Heo
2023-05-21  2:48   ` [PATCH RESEND " Tejun Heo
2023-05-21  6:42     ` Finn Thain
2023-05-22 22:06     ` Martin K. Petersen
2023-05-23  0:58       ` Tejun Heo
2023-05-09  1:50 ` [PATCH 02/13] wifi: mwifiex: Use default @max_active for workqueues Tejun Heo
2023-05-10  8:45   ` Kalle Valo
2023-05-10 18:09   ` Brian Norris
2023-05-10 18:16     ` Tejun Heo
2023-05-10 18:57       ` Brian Norris
2023-05-10 19:19         ` Tejun Heo
2023-05-10 19:50           ` Brian Norris
2023-05-19  0:36   ` Tejun Heo
2023-05-09  1:50 ` [PATCH 03/13] dm integrity: Use alloc_ordered_workqueue() to create ordered workqueues Tejun Heo
2023-05-09  1:50   ` [dm-devel] " Tejun Heo
2023-05-25 22:10   ` Tejun Heo
2023-05-25 22:10     ` [dm-devel] " Tejun Heo
2023-05-09  1:50 ` [PATCH 04/13] media: amphion: " Tejun Heo
2023-05-23 12:19   ` Hans Verkuil
2023-05-23 21:25     ` Tejun Heo
2023-05-09  1:50 ` [PATCH 05/13] wifi: ath10/11/12k: " Tejun Heo
2023-05-10  8:41   ` Kalle Valo
2023-05-19  0:41   ` Tejun Heo
2023-05-09  1:50 ` [PATCH 06/13] net: wwan: t7xx: " Tejun Heo
2023-05-25 22:11   ` Tejun Heo
2023-05-09  1:50 ` [PATCH 07/13] soc: qcom: qmi: " Tejun Heo
2023-05-09  1:50 ` [PATCH 08/13] btrfs: " Tejun Heo
2023-05-09 14:53   ` David Sterba
2023-05-09 15:57     ` Tejun Heo
2023-05-09 23:36       ` David Sterba
2023-05-09 23:47         ` Tejun Heo
2023-05-25 23:33           ` Tejun Heo
2023-05-26 12:52             ` David Sterba
2023-05-09  1:50 ` [PATCH 09/13] net: qrtr: " Tejun Heo
2023-05-25 22:14   ` Tejun Heo
2023-05-09  1:50 ` [PATCH 10/13] rxrpc: " Tejun Heo
2023-05-25 22:14   ` Tejun Heo
2023-05-09  1:50 ` [PATCH 11/13] crypto: octeontx2: " Tejun Heo
2023-05-09  2:18   ` Herbert Xu
2023-05-19  0:42   ` Tejun Heo
2023-05-09  1:50 ` [PATCH 12/13] media: coda: " Tejun Heo
2023-05-09 10:46   ` Philipp Zabel
2023-05-19  0:50   ` Tejun Heo
2023-05-09  1:50 ` [PATCH 13/13] workqueue: Don't implicitly make UNBOUND workqueues w/ @max_active==1 ordered Tejun Heo
