All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH 0/4] workqueue: Introduce low-level unbound wq sysfs cpumask
@ 2014-04-24 14:37 Frederic Weisbecker
  2014-04-24 14:37 ` [PATCH 1/4] workqueue: Create low-level unbound workqueues cpumask Frederic Weisbecker
                   ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: Frederic Weisbecker @ 2014-04-24 14:37 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Christoph Lameter, Kevin Hilman,
	Lai Jiangshan, Mike Galbraith, Paul E. McKenney, Tejun Heo,
	Viresh Kumar

Hi,

So this new version takes another direction suggested by Tejun.
Now we have a cpumask at the root of the workqueue directory
to bound the affinity of all unbound workqueues.

It's an early release as more plumbing is required. Most notably:

* Drop last patch and integrate Lai's instead to support
apply_workqueue_attrs() on ordered workqueues: https://lkml.org/lkml/2014/4/15/181

* Handle cpumask value expansion (in fact the per wq cpumask must be
stored seperately than the effective cpumask to handle that correctly).
I should take care of that in the next version.

But I prefer to post the current state now in case I'm wandering off.

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
	core/workqueue-v3

Thanks,
	Frederic
---

Frederic Weisbecker (4):
      workqueue: Create low-level unbound workqueues cpumask
      workqueue: Split apply attrs code from its locking
      workqueue: Allow modifying low level unbound workqueue cpumask
      workqueue: Handle ordered workqueues on cpumask_unbounds change


 kernel/workqueue.c | 219 ++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 185 insertions(+), 34 deletions(-)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/4] workqueue: Create low-level unbound workqueues cpumask
  2014-04-24 14:37 [RFC PATCH 0/4] workqueue: Introduce low-level unbound wq sysfs cpumask Frederic Weisbecker
@ 2014-04-24 14:37 ` Frederic Weisbecker
  2014-04-24 15:37   ` Tejun Heo
  2014-04-24 22:42   ` Kevin Hilman
  2014-04-24 14:37 ` [PATCH 2/4] workqueue: Split apply attrs code from its locking Frederic Weisbecker
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 18+ messages in thread
From: Frederic Weisbecker @ 2014-04-24 14:37 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Christoph Lameter, Kevin Hilman,
	Lai Jiangshan, Mike Galbraith, Paul E. McKenney, Tejun Heo,
	Viresh Kumar

Create a cpumask that limit the affinity of all unbound workqueues.
This cpumask is controlled though a file at the root of the workqueue
sysfs directory.

It works on a lower-level than the per WQ_SYSFS workqueues cpumask files
such that the effective cpumask applied for a given unbound workqueue is
the intersection of /sys/devices/virtual/workqueue/$WORKQUEUE/cpumask and
the new /sys/devices/virtual/workqueue/cpumask_unbounds file.

This patch implements the basic infrastructure and the read interface.
cpumask_unbounds is initially set to cpu_possible_mask.

Cc: Christoph Lameter <cl@linux.com>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Mike Galbraith <bitbucket@online.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 kernel/workqueue.c | 29 +++++++++++++++++++++++++++--
 1 file changed, 27 insertions(+), 2 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 0ee63af..b456ed4 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -293,6 +293,8 @@ static DEFINE_SPINLOCK(wq_mayday_lock);	/* protects wq->maydays list */
 static LIST_HEAD(workqueues);		/* PL: list of all workqueues */
 static bool workqueue_freezing;		/* PL: have wqs started freezing? */
 
+static cpumask_var_t wq_unbound_cpumask;
+
 /* the per-cpu worker pools */
 static DEFINE_PER_CPU_SHARED_ALIGNED(struct worker_pool [NR_STD_WORKER_POOLS],
 				     cpu_worker_pools);
@@ -3323,9 +3325,29 @@ static struct bus_type wq_subsys = {
 	.dev_groups			= wq_sysfs_groups,
 };
 
+static ssize_t unbounds_cpumask_show(struct device *dev,
+				     struct device_attribute *attr, char *buf)
+{
+	int written;
+
+	written = cpumask_scnprintf(buf, PAGE_SIZE, wq_unbound_cpumask);
+	written += scnprintf(buf + written, PAGE_SIZE - written, "\n");
+
+	return written;
+}
+
+static struct device_attribute wq_sysfs_cpumask_attr =
+	__ATTR(cpumask_unbounds, 0444, unbounds_cpumask_show, NULL);
+
 static int __init wq_sysfs_init(void)
 {
-	return subsys_virtual_register(&wq_subsys, NULL);
+	int err;
+
+	err = subsys_virtual_register(&wq_subsys, NULL);
+	if (err)
+		return err;
+
+	return device_create_file(wq_subsys.dev_root, &wq_sysfs_cpumask_attr);
 }
 core_initcall(wq_sysfs_init);
 
@@ -3944,7 +3966,7 @@ int apply_workqueue_attrs(struct workqueue_struct *wq,
 
 	/* make a copy of @attrs and sanitize it */
 	copy_workqueue_attrs(new_attrs, attrs);
-	cpumask_and(new_attrs->cpumask, new_attrs->cpumask, cpu_possible_mask);
+	cpumask_and(new_attrs->cpumask, new_attrs->cpumask, wq_unbound_cpumask);
 
 	/*
 	 * We may create multiple pwqs with differing cpumasks.  Make a
@@ -5033,6 +5055,9 @@ static int __init init_workqueues(void)
 
 	WARN_ON(__alignof__(struct pool_workqueue) < __alignof__(long long));
 
+	BUG_ON(!alloc_cpumask_var(&wq_unbound_cpumask, GFP_KERNEL));
+	cpumask_copy(wq_unbound_cpumask, cpu_possible_mask);
+
 	pwq_cache = KMEM_CACHE(pool_workqueue, SLAB_PANIC);
 
 	cpu_notifier(workqueue_cpu_up_callback, CPU_PRI_WORKQUEUE_UP);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/4] workqueue: Split apply attrs code from its locking
  2014-04-24 14:37 [RFC PATCH 0/4] workqueue: Introduce low-level unbound wq sysfs cpumask Frederic Weisbecker
  2014-04-24 14:37 ` [PATCH 1/4] workqueue: Create low-level unbound workqueues cpumask Frederic Weisbecker
@ 2014-04-24 14:37 ` Frederic Weisbecker
  2014-04-24 14:48   ` Tejun Heo
  2014-04-24 14:37 ` [PATCH 3/4] workqueue: Allow modifying low level unbound workqueue cpumask Frederic Weisbecker
  2014-04-24 14:37 ` [PATCH 4/4] workqueue: Handle ordered workqueues on cpumask_unbounds change Frederic Weisbecker
  3 siblings, 1 reply; 18+ messages in thread
From: Frederic Weisbecker @ 2014-04-24 14:37 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Christoph Lameter, Kevin Hilman,
	Lai Jiangshan, Mike Galbraith, Paul E. McKenney, Tejun Heo,
	Viresh Kumar

In order to allow overriding the unbound wqs low-level cpumask, we
need to be able to call apply_workqueue_attr() on all workqueues in
the pool list.

Now since traversing the pool list require to lock it, we can't currently
call apply_workqueue_attr() under the pool traversal.

So lets provide a version of apply_workqueue_attrs() that can be
called when the pool is already locked.

Suggested-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Mike Galbraith <bitbucket@online.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 kernel/workqueue.c | 73 ++++++++++++++++++++++++++++++------------------------
 1 file changed, 41 insertions(+), 32 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index b456ed4..2c38e32 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -3927,24 +3927,8 @@ static struct pool_workqueue *numa_pwq_tbl_install(struct workqueue_struct *wq,
 	return old_pwq;
 }
 
-/**
- * apply_workqueue_attrs - apply new workqueue_attrs to an unbound workqueue
- * @wq: the target workqueue
- * @attrs: the workqueue_attrs to apply, allocated with alloc_workqueue_attrs()
- *
- * Apply @attrs to an unbound workqueue @wq.  Unless disabled, on NUMA
- * machines, this function maps a separate pwq to each NUMA node with
- * possibles CPUs in @attrs->cpumask so that work items are affine to the
- * NUMA node it was issued on.  Older pwqs are released as in-flight work
- * items finish.  Note that a work item which repeatedly requeues itself
- * back-to-back will stay on its current pwq.
- *
- * Performs GFP_KERNEL allocations.
- *
- * Return: 0 on success and -errno on failure.
- */
-int apply_workqueue_attrs(struct workqueue_struct *wq,
-			  const struct workqueue_attrs *attrs)
+static int apply_workqueue_attrs_locked(struct workqueue_struct *wq,
+					const struct workqueue_attrs *attrs)
 {
 	struct workqueue_attrs *new_attrs, *tmp_attrs;
 	struct pool_workqueue **pwq_tbl, *dfl_pwq;
@@ -3976,15 +3960,6 @@ int apply_workqueue_attrs(struct workqueue_struct *wq,
 	copy_workqueue_attrs(tmp_attrs, new_attrs);
 
 	/*
-	 * CPUs should stay stable across pwq creations and installations.
-	 * Pin CPUs, determine the target cpumask for each node and create
-	 * pwqs accordingly.
-	 */
-	get_online_cpus();
-
-	mutex_lock(&wq_pool_mutex);
-
-	/*
 	 * If something goes wrong during CPU up/down, we'll fall back to
 	 * the default pwq covering whole @attrs->cpumask.  Always create
 	 * it even if we don't use it immediately.
@@ -4004,8 +3979,6 @@ int apply_workqueue_attrs(struct workqueue_struct *wq,
 		}
 	}
 
-	mutex_unlock(&wq_pool_mutex);
-
 	/* all pwqs have been created successfully, let's install'em */
 	mutex_lock(&wq->mutex);
 
@@ -4026,7 +3999,6 @@ int apply_workqueue_attrs(struct workqueue_struct *wq,
 		put_pwq_unlocked(pwq_tbl[node]);
 	put_pwq_unlocked(dfl_pwq);
 
-	put_online_cpus();
 	ret = 0;
 	/* fall through */
 out_free:
@@ -4040,14 +4012,51 @@ enomem_pwq:
 	for_each_node(node)
 		if (pwq_tbl && pwq_tbl[node] != dfl_pwq)
 			free_unbound_pwq(pwq_tbl[node]);
-	mutex_unlock(&wq_pool_mutex);
-	put_online_cpus();
 enomem:
 	ret = -ENOMEM;
 	goto out_free;
 }
 
 /**
+ * apply_workqueue_attrs - apply new workqueue_attrs to an unbound workqueue
+ * @wq: the target workqueue
+ * @attrs: the workqueue_attrs to apply, allocated with alloc_workqueue_attrs()
+ *
+ * Apply @attrs to an unbound workqueue @wq.  Unless disabled, on NUMA
+ * machines, this function maps a separate pwq to each NUMA node with
+ * possibles CPUs in @attrs->cpumask so that work items are affine to the
+ * NUMA node it was issued on.  Older pwqs are released as in-flight work
+ * items finish.  Note that a work item which repeatedly requeues itself
+ * back-to-back will stay on its current pwq.
+ *
+ * Performs GFP_KERNEL allocations.
+ *
+ * Return: 0 on success and -errno on failure.
+ */
+int apply_workqueue_attrs(struct workqueue_struct *wq,
+			  const struct workqueue_attrs *attrs)
+{
+	int ret;
+
+	/*
+	 * CPUs should stay stable across pwq creations and installations.
+	 * Pin CPUs, determine the target cpumask for each node and create
+	 * pwqs accordingly.
+	 */
+
+	get_online_cpus();
+	/*
+	 * Lock for alloc_unbound_pwq()
+	 */
+	mutex_lock(&wq_pool_mutex);
+	ret = apply_workqueue_attrs_locked(wq, attrs);
+	mutex_unlock(&wq_pool_mutex);
+	put_online_cpus();
+
+	return ret;
+}
+
+/**
  * wq_update_unbound_numa - update NUMA affinity of a wq for CPU hot[un]plug
  * @wq: the target workqueue
  * @cpu: the CPU coming up or going down
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 3/4] workqueue: Allow modifying low level unbound workqueue cpumask
  2014-04-24 14:37 [RFC PATCH 0/4] workqueue: Introduce low-level unbound wq sysfs cpumask Frederic Weisbecker
  2014-04-24 14:37 ` [PATCH 1/4] workqueue: Create low-level unbound workqueues cpumask Frederic Weisbecker
  2014-04-24 14:37 ` [PATCH 2/4] workqueue: Split apply attrs code from its locking Frederic Weisbecker
@ 2014-04-24 14:37 ` Frederic Weisbecker
  2014-04-24 15:30   ` Tejun Heo
  2014-04-24 14:37 ` [PATCH 4/4] workqueue: Handle ordered workqueues on cpumask_unbounds change Frederic Weisbecker
  3 siblings, 1 reply; 18+ messages in thread
From: Frederic Weisbecker @ 2014-04-24 14:37 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Christoph Lameter, Kevin Hilman,
	Lai Jiangshan, Mike Galbraith, Paul E. McKenney, Tejun Heo,
	Viresh Kumar

Allow to modify the low-level unbound workqueues cpumask through
sysfs. This is performed by traversing the entire workqueue list
and calling apply_workqueue_attrs() on the unbound workqueues.

Ordered workqueues need some specific treatment and will be dealt with
in a subsequent patch.

Cc: Christoph Lameter <cl@linux.com>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Mike Galbraith <bitbucket@online.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 kernel/workqueue.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 61 insertions(+), 2 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 2c38e32..387ce38 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -293,7 +293,7 @@ static DEFINE_SPINLOCK(wq_mayday_lock);	/* protects wq->maydays list */
 static LIST_HEAD(workqueues);		/* PL: list of all workqueues */
 static bool workqueue_freezing;		/* PL: have wqs started freezing? */
 
-static cpumask_var_t wq_unbound_cpumask;
+static cpumask_var_t wq_unbound_cpumask; /* PL: low level cpumask for all unbound wqs */
 
 /* the per-cpu worker pools */
 static DEFINE_PER_CPU_SHARED_ALIGNED(struct worker_pool [NR_STD_WORKER_POOLS],
@@ -3325,19 +3325,78 @@ static struct bus_type wq_subsys = {
 	.dev_groups			= wq_sysfs_groups,
 };
 
+static int apply_workqueue_attrs_locked(struct workqueue_struct *wq,
+					const struct workqueue_attrs *attrs);
+
+/* Must be called with wq_unbound_mutex held */
+static int unbounds_cpumask_apply_all(cpumask_var_t cpumask)
+{
+	struct workqueue_struct *wq;
+
+	list_for_each_entry(wq, &workqueues, list) {
+		struct workqueue_attrs *attrs;
+
+		if (!(wq->flags & WQ_UNBOUND))
+			continue;
+		/* Ordered workqueues need specific treatment */
+		if (wq->flags & __WQ_ORDERED)
+			continue;
+
+		attrs = wq_sysfs_prep_attrs(wq);
+		if (!attrs)
+			return -ENOMEM;
+
+		WARN_ON_ONCE(apply_workqueue_attrs_locked(wq, attrs));
+		free_workqueue_attrs(attrs);
+	}
+
+	return 0;
+}
+
+static ssize_t unbounds_cpumask_store(struct device *dev,
+				      struct device_attribute *attr,
+				      const char *buf, size_t count)
+{
+	cpumask_var_t cpumask;
+	int ret = -EINVAL;
+
+	if (!zalloc_cpumask_var(&cpumask, GFP_KERNEL))
+		return -ENOMEM;
+
+	ret = cpumask_parse(buf, cpumask);
+	if (ret)
+		goto out;
+
+	get_online_cpus();
+	if (cpumask_intersects(cpumask, cpu_online_mask)) {
+		mutex_lock(&wq_pool_mutex);
+		cpumask_copy(wq_unbound_cpumask, cpumask);
+		ret = unbounds_cpumask_apply_all(cpumask);
+		mutex_unlock(&wq_pool_mutex);
+	}
+	put_online_cpus();
+out:
+	free_cpumask_var(cpumask);
+	return ret ? ret : count;
+}
+
 static ssize_t unbounds_cpumask_show(struct device *dev,
 				     struct device_attribute *attr, char *buf)
 {
 	int written;
 
+	mutex_lock(&wq_pool_mutex);
 	written = cpumask_scnprintf(buf, PAGE_SIZE, wq_unbound_cpumask);
+	mutex_unlock(&wq_pool_mutex);
+
 	written += scnprintf(buf + written, PAGE_SIZE - written, "\n");
 
 	return written;
 }
 
 static struct device_attribute wq_sysfs_cpumask_attr =
-	__ATTR(cpumask_unbounds, 0444, unbounds_cpumask_show, NULL);
+	__ATTR(cpumask_unbounds, 0644, unbounds_cpumask_show,
+	       unbounds_cpumask_store);
 
 static int __init wq_sysfs_init(void)
 {
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 4/4] workqueue: Handle ordered workqueues on cpumask_unbounds change
  2014-04-24 14:37 [RFC PATCH 0/4] workqueue: Introduce low-level unbound wq sysfs cpumask Frederic Weisbecker
                   ` (2 preceding siblings ...)
  2014-04-24 14:37 ` [PATCH 3/4] workqueue: Allow modifying low level unbound workqueue cpumask Frederic Weisbecker
@ 2014-04-24 14:37 ` Frederic Weisbecker
  2014-04-24 15:33   ` Tejun Heo
  3 siblings, 1 reply; 18+ messages in thread
From: Frederic Weisbecker @ 2014-04-24 14:37 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Christoph Lameter, Kevin Hilman,
	Lai Jiangshan, Mike Galbraith, Paul E. McKenney, Tejun Heo,
	Viresh Kumar

Ordered unbound workqueues need some special care if we want to
modify their CPU affinity. These can't be simply handled through
apply_workqueue_attrs() since it works by hot plugging worker pools
which has parallelism side effects and this would break ordering.

The way we solve this is to change the affinity of the (presumaly
unique) worker backing the ordered workqueues.

NOTE: Now like Lai said, there may be bad side effects on this because
ordered wq may share their worker pool with non-ordered workqueues.
So changing the affinity of the worker itself is not a nice solution.
This patch is very likely to be replaced by Lai's patch
"workqueue: allow changing attributions of ordered workqueue"
https://lkml.org/lkml/2014/4/15/181

Cc: Christoph Lameter <cl@linux.com>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Mike Galbraith <bitbucket@online.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 kernel/workqueue.c | 76 +++++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 67 insertions(+), 9 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 387ce38..564e034 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -3328,26 +3328,84 @@ static struct bus_type wq_subsys = {
 static int apply_workqueue_attrs_locked(struct workqueue_struct *wq,
 					const struct workqueue_attrs *attrs);
 
+static int unbounds_cpumask_apply_ordered(struct workqueue_struct *wq,
+					  cpumask_var_t cpumask)
+{
+	struct pool_workqueue *pwq;
+	struct worker_pool *pool;
+	struct worker *worker;
+	int ret = 0;
+	int wi;
+
+	/* Ordered wq have a single pool */
+	pwq = list_first_entry(&wq->pwqs, typeof(*pwq), pwqs_node);
+	pool = pwq->pool;
+
+	mutex_lock(&pool->manager_mutex);
+	/* There is a single worker on that pool, but the iterator is convenient */
+	for_each_pool_worker(worker, wi, pool) {
+		ret = set_cpus_allowed_ptr(worker->task, cpumask);
+		if (ret)
+			goto fail;
+	}
+
+	cpumask_copy(pool->attrs->cpumask, cpumask);
+	mutex_unlock(&pool->manager_mutex);
+
+	mutex_lock(&wq->mutex);
+	cpumask_copy(wq->unbound_attrs->cpumask, cpumask);
+	mutex_unlock(&wq->mutex);
+
+	return 0;
+
+fail:
+	mutex_unlock(&pool->manager_mutex);
+	return ret;
+}
+
+static int unbounds_cpumask_apply(struct workqueue_struct *wq,
+				  cpumask_var_t cpumask)
+{
+	struct workqueue_attrs *attrs;
+	int ret;
+
+	attrs = wq_sysfs_prep_attrs(wq);
+	if (!attrs)
+		return -ENOMEM;
+
+	/*
+	 * TODO: this works well when the cpumask is schrinked
+	 * but more plumbing is needed to handle cpumask value
+	 * expansion
+	 */
+	ret = apply_workqueue_attrs_locked(wq, attrs);
+	free_workqueue_attrs(attrs);
+
+	return ret;
+}
+
 /* Must be called with wq_unbound_mutex held */
 static int unbounds_cpumask_apply_all(cpumask_var_t cpumask)
 {
 	struct workqueue_struct *wq;
 
 	list_for_each_entry(wq, &workqueues, list) {
-		struct workqueue_attrs *attrs;
+		int ret;
 
 		if (!(wq->flags & WQ_UNBOUND))
 			continue;
 		/* Ordered workqueues need specific treatment */
 		if (wq->flags & __WQ_ORDERED)
-			continue;
-
-		attrs = wq_sysfs_prep_attrs(wq);
-		if (!attrs)
-			return -ENOMEM;
-
-		WARN_ON_ONCE(apply_workqueue_attrs_locked(wq, attrs));
-		free_workqueue_attrs(attrs);
+			/*
+			 * Calling unbounds_cpumask_apply_ordered() once on
+			 * the first ordered wq we meet should be enough because
+			 * all ordered workqueues all share the same single worker pool.
+			 * But this detail might change in the future
+			 */
+			ret = unbounds_cpumask_apply_ordered(wq, cpumask);
+		else
+			ret = unbounds_cpumask_apply(wq, cpumask);
+		WARN_ON_ONCE(ret);
 	}
 
 	return 0;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/4] workqueue: Split apply attrs code from its locking
  2014-04-24 14:37 ` [PATCH 2/4] workqueue: Split apply attrs code from its locking Frederic Weisbecker
@ 2014-04-24 14:48   ` Tejun Heo
  2014-05-01 14:40     ` Frederic Weisbecker
  0 siblings, 1 reply; 18+ messages in thread
From: Tejun Heo @ 2014-04-24 14:48 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Christoph Lameter, Kevin Hilman, Lai Jiangshan,
	Mike Galbraith, Paul E. McKenney, Viresh Kumar

On Thu, Apr 24, 2014 at 04:37:34PM +0200, Frederic Weisbecker wrote:
> +static int apply_workqueue_attrs_locked(struct workqueue_struct *wq,
> +					const struct workqueue_attrs *attrs)
>  {
>  	struct workqueue_attrs *new_attrs, *tmp_attrs;
>  	struct pool_workqueue **pwq_tbl, *dfl_pwq;
> @@ -3976,15 +3960,6 @@ int apply_workqueue_attrs(struct workqueue_struct *wq,
>  	copy_workqueue_attrs(tmp_attrs, new_attrs);
>  
>  	/*
> -	 * CPUs should stay stable across pwq creations and installations.
> -	 * Pin CPUs, determine the target cpumask for each node and create
> -	 * pwqs accordingly.
> -	 */
> -	get_online_cpus();
> -
> -	mutex_lock(&wq_pool_mutex);

lockdep_assert_held()

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/4] workqueue: Allow modifying low level unbound workqueue cpumask
  2014-04-24 14:37 ` [PATCH 3/4] workqueue: Allow modifying low level unbound workqueue cpumask Frederic Weisbecker
@ 2014-04-24 15:30   ` Tejun Heo
  2014-05-01 14:49     ` Frederic Weisbecker
  0 siblings, 1 reply; 18+ messages in thread
From: Tejun Heo @ 2014-04-24 15:30 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Christoph Lameter, Kevin Hilman, Lai Jiangshan,
	Mike Galbraith, Paul E. McKenney, Viresh Kumar

On Thu, Apr 24, 2014 at 04:37:35PM +0200, Frederic Weisbecker wrote:
> +static int apply_workqueue_attrs_locked(struct workqueue_struct *wq,
> +					const struct workqueue_attrs *attrs);

Can't we reorder things so that we don't need the above prototype?

> +/* Must be called with wq_unbound_mutex held */

Please use lockdep_assert_held() instead.

> +static int unbounds_cpumask_apply_all(cpumask_var_t cpumask)
> +{
> +	struct workqueue_struct *wq;
> +
> +	list_for_each_entry(wq, &workqueues, list) {
> +		struct workqueue_attrs *attrs;
> +
> +		if (!(wq->flags & WQ_UNBOUND))
> +			continue;
> +		/* Ordered workqueues need specific treatment */
> +		if (wq->flags & __WQ_ORDERED)
> +			continue;
> +
> +		attrs = wq_sysfs_prep_attrs(wq);
> +		if (!attrs)
> +			return -ENOMEM;

So, we're failing in the middle without rolling back?

> +
> +		WARN_ON_ONCE(apply_workqueue_attrs_locked(wq, attrs));

Are we triggering WARN on -ENOMEM too and then ignore the failure?

> +		free_workqueue_attrs(attrs);
> +	}
> +
> +	return 0;
> +}

Shouldn't we separate allocation stage from switching stage so that we
can either succeed or fail?  The above is very mushy about error
handling.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/4] workqueue: Handle ordered workqueues on cpumask_unbounds change
  2014-04-24 14:37 ` [PATCH 4/4] workqueue: Handle ordered workqueues on cpumask_unbounds change Frederic Weisbecker
@ 2014-04-24 15:33   ` Tejun Heo
  2014-05-01 14:51     ` Frederic Weisbecker
  0 siblings, 1 reply; 18+ messages in thread
From: Tejun Heo @ 2014-04-24 15:33 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Christoph Lameter, Kevin Hilman, Lai Jiangshan,
	Mike Galbraith, Paul E. McKenney, Viresh Kumar

On Thu, Apr 24, 2014 at 04:37:36PM +0200, Frederic Weisbecker wrote:
> Ordered unbound workqueues need some special care if we want to
> modify their CPU affinity. These can't be simply handled through
> apply_workqueue_attrs() since it works by hot plugging worker pools
> which has parallelism side effects and this would break ordering.
> 
> The way we solve this is to change the affinity of the (presumaly
> unique) worker backing the ordered workqueues.
> 
> NOTE: Now like Lai said, there may be bad side effects on this because
> ordered wq may share their worker pool with non-ordered workqueues.
> So changing the affinity of the worker itself is not a nice solution.
> This patch is very likely to be replaced by Lai's patch
> "workqueue: allow changing attributions of ordered workqueue"
> https://lkml.org/lkml/2014/4/15/181

Yeah, it bothers me that we're taking two completely different
approaches for ordered and !ordered workqueues.  The only difference
between them is concurrency and it probably would be a better idea to
address that directly.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/4] workqueue: Create low-level unbound workqueues cpumask
  2014-04-24 14:37 ` [PATCH 1/4] workqueue: Create low-level unbound workqueues cpumask Frederic Weisbecker
@ 2014-04-24 15:37   ` Tejun Heo
  2014-05-01 15:01     ` Frederic Weisbecker
  2014-04-24 22:42   ` Kevin Hilman
  1 sibling, 1 reply; 18+ messages in thread
From: Tejun Heo @ 2014-04-24 15:37 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Christoph Lameter, Kevin Hilman, Lai Jiangshan,
	Mike Galbraith, Paul E. McKenney, Viresh Kumar

On Thu, Apr 24, 2014 at 04:37:33PM +0200, Frederic Weisbecker wrote:
> Create a cpumask that limit the affinity of all unbound workqueues.
> This cpumask is controlled though a file at the root of the workqueue
> sysfs directory.
> 
> It works on a lower-level than the per WQ_SYSFS workqueues cpumask files
> such that the effective cpumask applied for a given unbound workqueue is
> the intersection of /sys/devices/virtual/workqueue/$WORKQUEUE/cpumask and
> the new /sys/devices/virtual/workqueue/cpumask_unbounds file.

Let's drop "_unbounds" postfix and name it just "cpumask".  We don't
apply it to per-cpu workqueues now but that really is an
implementation detail and later when (and if) we actually distinguish
per-cpu usages for correctness from for optimization, we may as well
apply the same cpumask to per-cpu ones too.

Another thing with naming is that I didn't anticipate having
attributes at the top directory so the workqueue directories aren't
namespaced.  Maybe we want to namespace top level knobs?
"system_cpumask" maybe?  Any better ideas?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/4] workqueue: Create low-level unbound workqueues cpumask
  2014-04-24 14:37 ` [PATCH 1/4] workqueue: Create low-level unbound workqueues cpumask Frederic Weisbecker
  2014-04-24 15:37   ` Tejun Heo
@ 2014-04-24 22:42   ` Kevin Hilman
  1 sibling, 0 replies; 18+ messages in thread
From: Kevin Hilman @ 2014-04-24 22:42 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Christoph Lameter, Lai Jiangshan, Mike Galbraith,
	Paul E. McKenney, Tejun Heo, Viresh Kumar

Frederic Weisbecker <fweisbec@gmail.com> writes:

> Create a cpumask that limit the affinity of all unbound workqueues.
> This cpumask is controlled though a file at the root of the workqueue
> sysfs directory.
>
> It works on a lower-level than the per WQ_SYSFS workqueues cpumask files
> such that the effective cpumask applied for a given unbound workqueue is
> the intersection of /sys/devices/virtual/workqueue/$WORKQUEUE/cpumask and
> the new /sys/devices/virtual/workqueue/cpumask_unbounds file.
>
> This patch implements the basic infrastructure and the read interface.
> cpumask_unbounds is initially set to cpu_possible_mask.

For a nice default in the nohz_full case, how about this defaults to
!tick_nohz_full_mask? ...

[...]

> @@ -5033,6 +5055,9 @@ static int __init init_workqueues(void)
>  
>  	WARN_ON(__alignof__(struct pool_workqueue) < __alignof__(long long));
>  
> +	BUG_ON(!alloc_cpumask_var(&wq_unbound_cpumask, GFP_KERNEL));
> +	cpumask_copy(wq_unbound_cpumask, cpu_possible_mask);
> +

... so, something like this instead:

#ifdef CONFIG_NO_HZ_FULL
       cpumask_complement(wq_unbound_cpumask, tick_nohz_full_mask);
#else
       cpumask_copy(wq_unbound_cpumask, cpu_possible_mask);
#endif

Kevin


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/4] workqueue: Split apply attrs code from its locking
  2014-04-24 14:48   ` Tejun Heo
@ 2014-05-01 14:40     ` Frederic Weisbecker
  2014-05-01 14:41       ` Tejun Heo
  0 siblings, 1 reply; 18+ messages in thread
From: Frederic Weisbecker @ 2014-05-01 14:40 UTC (permalink / raw)
  To: Tejun Heo
  Cc: LKML, Christoph Lameter, Kevin Hilman, Lai Jiangshan,
	Mike Galbraith, Paul E. McKenney, Viresh Kumar

On Thu, Apr 24, 2014 at 10:48:32AM -0400, Tejun Heo wrote:
> On Thu, Apr 24, 2014 at 04:37:34PM +0200, Frederic Weisbecker wrote:
> > +static int apply_workqueue_attrs_locked(struct workqueue_struct *wq,
> > +					const struct workqueue_attrs *attrs)
> >  {
> >  	struct workqueue_attrs *new_attrs, *tmp_attrs;
> >  	struct pool_workqueue **pwq_tbl, *dfl_pwq;
> > @@ -3976,15 +3960,6 @@ int apply_workqueue_attrs(struct workqueue_struct *wq,
> >  	copy_workqueue_attrs(tmp_attrs, new_attrs);
> >  
> >  	/*
> > -	 * CPUs should stay stable across pwq creations and installations.
> > -	 * Pin CPUs, determine the target cpumask for each node and create
> > -	 * pwqs accordingly.
> > -	 */
> > -	get_online_cpus();
> > -
> > -	mutex_lock(&wq_pool_mutex);
> 
> lockdep_assert_held()

Not sure... Only a small part of the function actually needs to be locked. Namely
those doing the pwq allocations, which already have the lockdep_assert_held().

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/4] workqueue: Split apply attrs code from its locking
  2014-05-01 14:40     ` Frederic Weisbecker
@ 2014-05-01 14:41       ` Tejun Heo
  0 siblings, 0 replies; 18+ messages in thread
From: Tejun Heo @ 2014-05-01 14:41 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Christoph Lameter, Kevin Hilman, Lai Jiangshan,
	Mike Galbraith, Paul E. McKenney, Viresh Kumar

On Thu, May 01, 2014 at 04:40:21PM +0200, Frederic Weisbecker wrote:
> > lockdep_assert_held()
> 
> Not sure... Only a small part of the function actually needs to be locked. Namely
> those doing the pwq allocations, which already have the lockdep_assert_held().

Ah, in that case, never mind.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/4] workqueue: Allow modifying low level unbound workqueue cpumask
  2014-04-24 15:30   ` Tejun Heo
@ 2014-05-01 14:49     ` Frederic Weisbecker
  0 siblings, 0 replies; 18+ messages in thread
From: Frederic Weisbecker @ 2014-05-01 14:49 UTC (permalink / raw)
  To: Tejun Heo
  Cc: LKML, Christoph Lameter, Kevin Hilman, Lai Jiangshan,
	Mike Galbraith, Paul E. McKenney, Viresh Kumar

On Thu, Apr 24, 2014 at 11:30:48AM -0400, Tejun Heo wrote:
> On Thu, Apr 24, 2014 at 04:37:35PM +0200, Frederic Weisbecker wrote:
> > +static int apply_workqueue_attrs_locked(struct workqueue_struct *wq,
> > +					const struct workqueue_attrs *attrs);
> 
> Can't we reorder things so that we don't need the above prototype?

Yeah I'll give it a try.

> 
> > +/* Must be called with wq_unbound_mutex held */
> 
> Please use lockdep_assert_held() instead.

Ok.

> 
> > +static int unbounds_cpumask_apply_all(cpumask_var_t cpumask)
> > +{
> > +	struct workqueue_struct *wq;
> > +
> > +	list_for_each_entry(wq, &workqueues, list) {
> > +		struct workqueue_attrs *attrs;
> > +
> > +		if (!(wq->flags & WQ_UNBOUND))
> > +			continue;
> > +		/* Ordered workqueues need specific treatment */
> > +		if (wq->flags & __WQ_ORDERED)
> > +			continue;
> > +
> > +		attrs = wq_sysfs_prep_attrs(wq);
> > +		if (!attrs)
> > +			return -ENOMEM;
> 
> So, we're failing in the middle without rolling back?

Yeah, early patch :)

> 
> > +
> > +		WARN_ON_ONCE(apply_workqueue_attrs_locked(wq, attrs));
> 
> Are we triggering WARN on -ENOMEM too and then ignore the failure?

Yeah some more thought is needed on error handling.

> 
> > +		free_workqueue_attrs(attrs);
> > +	}
> > +
> > +	return 0;
> > +}
> 
> Shouldn't we separate allocation stage from switching stage so that we
> can either succeed or fail?  The above is very mushy about error
> handling.

They are already pretty seperate above. But yeah I need to rework the error
handling.

Thanks.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/4] workqueue: Handle ordered workqueues on cpumask_unbounds change
  2014-04-24 15:33   ` Tejun Heo
@ 2014-05-01 14:51     ` Frederic Weisbecker
  0 siblings, 0 replies; 18+ messages in thread
From: Frederic Weisbecker @ 2014-05-01 14:51 UTC (permalink / raw)
  To: Tejun Heo
  Cc: LKML, Christoph Lameter, Kevin Hilman, Lai Jiangshan,
	Mike Galbraith, Paul E. McKenney, Viresh Kumar

On Thu, Apr 24, 2014 at 11:33:20AM -0400, Tejun Heo wrote:
> On Thu, Apr 24, 2014 at 04:37:36PM +0200, Frederic Weisbecker wrote:
> > Ordered unbound workqueues need some special care if we want to
> > modify their CPU affinity. These can't be simply handled through
> > apply_workqueue_attrs() since it works by hot plugging worker pools
> > which has parallelism side effects and this would break ordering.
> > 
> > The way we solve this is to change the affinity of the (presumaly
> > unique) worker backing the ordered workqueues.
> > 
> > NOTE: Now like Lai said, there may be bad side effects on this because
> > ordered wq may share their worker pool with non-ordered workqueues.
> > So changing the affinity of the worker itself is not a nice solution.
> > This patch is very likely to be replaced by Lai's patch
> > "workqueue: allow changing attributions of ordered workqueue"
> > https://lkml.org/lkml/2014/4/15/181
> 
> Yeah, it bothers me that we're taking two completely different
> approaches for ordered and !ordered workqueues.  The only difference
> between them is concurrency and it probably would be a better idea to
> address that directly.

Yeah I've tried with Lai's patch and it seems to work like a charm so
next version will likely be better.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/4] workqueue: Create low-level unbound workqueues cpumask
  2014-04-24 15:37   ` Tejun Heo
@ 2014-05-01 15:01     ` Frederic Weisbecker
  2014-05-01 15:02       ` Tejun Heo
  0 siblings, 1 reply; 18+ messages in thread
From: Frederic Weisbecker @ 2014-05-01 15:01 UTC (permalink / raw)
  To: Tejun Heo
  Cc: LKML, Christoph Lameter, Kevin Hilman, Lai Jiangshan,
	Mike Galbraith, Paul E. McKenney, Viresh Kumar

On Thu, Apr 24, 2014 at 11:37:16AM -0400, Tejun Heo wrote:
> On Thu, Apr 24, 2014 at 04:37:33PM +0200, Frederic Weisbecker wrote:
> > Create a cpumask that limit the affinity of all unbound workqueues.
> > This cpumask is controlled though a file at the root of the workqueue
> > sysfs directory.
> > 
> > It works on a lower-level than the per WQ_SYSFS workqueues cpumask files
> > such that the effective cpumask applied for a given unbound workqueue is
> > the intersection of /sys/devices/virtual/workqueue/$WORKQUEUE/cpumask and
> > the new /sys/devices/virtual/workqueue/cpumask_unbounds file.
> 
> Let's drop "_unbounds" postfix and name it just "cpumask".  We don't
> apply it to per-cpu workqueues now but that really is an
> implementation detail and later when (and if) we actually distinguish
> per-cpu usages for correctness from for optimization, we may as well
> apply the same cpumask to per-cpu ones too.

Makes sense. But I hope this won't confused too much people. Having
a cpumask file suggests it applies to all of them.

> 
> Another thing with naming is that I didn't anticipate having
> attributes at the top directory so the workqueue directories aren't
> namespaced.  Maybe we want to namespace top level knobs?
> "system_cpumask" maybe?  Any better ideas?

Not sure why you want that. It makes sense on directories grouping
file for different subsystem. But here?

Thanks.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/4] workqueue: Create low-level unbound workqueues cpumask
  2014-05-01 15:01     ` Frederic Weisbecker
@ 2014-05-01 15:02       ` Tejun Heo
  2014-05-01 15:09         ` Frederic Weisbecker
  0 siblings, 1 reply; 18+ messages in thread
From: Tejun Heo @ 2014-05-01 15:02 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Christoph Lameter, Kevin Hilman, Lai Jiangshan,
	Mike Galbraith, Paul E. McKenney, Viresh Kumar

Hello,

On Thu, May 01, 2014 at 05:01:17PM +0200, Frederic Weisbecker wrote:
> > Another thing with naming is that I didn't anticipate having
> > attributes at the top directory so the workqueue directories aren't
> > namespaced.  Maybe we want to namespace top level knobs?
> > "system_cpumask" maybe?  Any better ideas?
> 
> Not sure why you want that. It makes sense on directories grouping
> file for different subsystem. But here?

Worried about possible conflicts with workqueue names if we end up
with more attributes.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/4] workqueue: Create low-level unbound workqueues cpumask
  2014-05-01 15:02       ` Tejun Heo
@ 2014-05-01 15:09         ` Frederic Weisbecker
  2014-05-01 15:13           ` Tejun Heo
  0 siblings, 1 reply; 18+ messages in thread
From: Frederic Weisbecker @ 2014-05-01 15:09 UTC (permalink / raw)
  To: Tejun Heo
  Cc: LKML, Christoph Lameter, Kevin Hilman, Lai Jiangshan,
	Mike Galbraith, Paul E. McKenney, Viresh Kumar

On Thu, May 01, 2014 at 11:02:31AM -0400, Tejun Heo wrote:
> Hello,
> 
> On Thu, May 01, 2014 at 05:01:17PM +0200, Frederic Weisbecker wrote:
> > > Another thing with naming is that I didn't anticipate having
> > > attributes at the top directory so the workqueue directories aren't
> > > namespaced.  Maybe we want to namespace top level knobs?
> > > "system_cpumask" maybe?  Any better ideas?
> > 
> > Not sure why you want that. It makes sense on directories grouping
> > file for different subsystem. But here?
> 
> Worried about possible conflicts with workqueue names if we end up
> with more attributes.

But they are already protected in their own directories. Also having
the same file attribute names in root and in individual directories
suggests that root attributes primes on the childs.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/4] workqueue: Create low-level unbound workqueues cpumask
  2014-05-01 15:09         ` Frederic Weisbecker
@ 2014-05-01 15:13           ` Tejun Heo
  0 siblings, 0 replies; 18+ messages in thread
From: Tejun Heo @ 2014-05-01 15:13 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Christoph Lameter, Kevin Hilman, Lai Jiangshan,
	Mike Galbraith, Paul E. McKenney, Viresh Kumar

On Thu, May 01, 2014 at 05:09:20PM +0200, Frederic Weisbecker wrote:
> > > Not sure why you want that. It makes sense on directories grouping
> > > file for different subsystem. But here?
> > 
> > Worried about possible conflicts with workqueue names if we end up
> > with more attributes.
> 
> But they are already protected in their own directories. Also having
> the same file attribute names in root and in individual directories
> suggests that root attributes primes on the childs.

Yeah but workqueues already tend to have pretty short and fundamental
names, and it can get confusing to have mix of files and directories
all with short names without much recognizable pattern.  I don't know.
Maybe I'm worrying too much but I'd really hate to be in a situation
where an attribute has to be renamed because it conflicts with a
workqueue name.

-- 
tejun

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2014-05-01 15:13 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-04-24 14:37 [RFC PATCH 0/4] workqueue: Introduce low-level unbound wq sysfs cpumask Frederic Weisbecker
2014-04-24 14:37 ` [PATCH 1/4] workqueue: Create low-level unbound workqueues cpumask Frederic Weisbecker
2014-04-24 15:37   ` Tejun Heo
2014-05-01 15:01     ` Frederic Weisbecker
2014-05-01 15:02       ` Tejun Heo
2014-05-01 15:09         ` Frederic Weisbecker
2014-05-01 15:13           ` Tejun Heo
2014-04-24 22:42   ` Kevin Hilman
2014-04-24 14:37 ` [PATCH 2/4] workqueue: Split apply attrs code from its locking Frederic Weisbecker
2014-04-24 14:48   ` Tejun Heo
2014-05-01 14:40     ` Frederic Weisbecker
2014-05-01 14:41       ` Tejun Heo
2014-04-24 14:37 ` [PATCH 3/4] workqueue: Allow modifying low level unbound workqueue cpumask Frederic Weisbecker
2014-04-24 15:30   ` Tejun Heo
2014-05-01 14:49     ` Frederic Weisbecker
2014-04-24 14:37 ` [PATCH 4/4] workqueue: Handle ordered workqueues on cpumask_unbounds change Frederic Weisbecker
2014-04-24 15:33   ` Tejun Heo
2014-05-01 14:51     ` Frederic Weisbecker

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.