All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lai Jiangshan <laijs@cn.fujitsu.com>
To: Tejun Heo <tj@kernel.org>, Frederic Weisbecker <fweisbec@gmail.com>
Cc: <linux-kernel@vger.kernel.org>, Christoph Lameter <cl@linux.com>,
	Kevin Hilman <khilman@linaro.org>,
	Mike Galbraith <bitbucket@online.de>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Viresh Kumar <viresh.kumar@linaro.org>
Subject: Re: [PATCH 4/4 V5] workqueue: Allow modifying low level unbound workqueue cpumask
Date: Wed, 1 Apr 2015 16:33:30 +0800	[thread overview]
Message-ID: <551BAD5A.2070205@cn.fujitsu.com> (raw)
In-Reply-To: <551A50DC.6060904@cn.fujitsu.com>

Hi, Frederic, TJ

I considered a special case and forgot to consider an another case.

====

Let @L = the low level unbound workqueue cpumask.
Let @U = the user setting cpumask (wq->unbound_attrs->cpumask).

Thus the pwqs in the specified wq are controlled by @L & @U (& = cpmask_and()).
But the per-node pwqs are mandatory controlled by  @L & @U, we don't discus it.
So this mail only takes focus on the default pwq.

What happens for the default pwq when @L & @U == empty cpumask?
I considered this case in the patch.  In my patch, the dfl_pwq directly
use the @U. The reasons:
1) it is not a good idea to directly use cpu_possible_mask.
2) and it is not a good idea  either to use @L & @U ( = empty), in this case,
	the scheduler will use cpu_possible_mask.
3) so we has to chose one from @L and @U.
4) I chose @U instead of @L.
   @L: it is low level global cpumask and it controls *ALL* wqs.
   @U: it is set by the *USER*, it controls only one *SPECIFIC* wq.

Frederic, TJ, both you didn't say anything about my this early quick decision.
Should this case be handled specially? And if yes, does this decision met your requirements?

======

A comment from TJ reminded me that the final cpumask determined by the scheduler
is more important.

Let @O = cpu_online_mask.

The missing case:
(@L & @U) is not empty but (@L & @U @O) is empty.

In my old code (V5 patchset), the dfl_pwq uses (@L & @U), the scheduler will
use cpu_possible_mask instead due to there is no cpu onlined among all cpu in (@L & @U).
It is bad, the pwq is NOT controlled by @L nor @U now.

I think we may use @U for the dfl_pwq in this case.  But it will introduces
a problem:

When (@L & @U) has online cpu, the dfl_pwq's cpumaks is (@L & @U).
when (@L & @U) has no online cpu, the dfl_pwq's cpumask is @U.
It means dfl_pwq may need to be reallocated during the cpuhotplug-add/remove
and it means wq_update_unbound_numa() can fail.

Frederic, TJ, any comments about this case?
TJ, would you like to make wq_update_unbound_numa() be failure-able?

thanks
Lai

On 03/31/2015 03:46 PM, Lai Jiangshan wrote:
> On 03/25/2015 01:31 AM, Tejun Heo wrote:
>> On Wed, Mar 18, 2015 at 12:40:17PM +0800, Lai Jiangshan wrote:
>>> The oreder-workquue is ignore from the low level unbound workqueue cpumask,
>>> it will be handled in near future.
>>
>> Ugh, right, ordered workqueues are tricky.  Maybe we should change how
>> ordered workqueues are implemented.  Just gate work items at the
>> workqueue layer instead of fiddling with max_active and the number of
>> pwqs.
>>
>>>  static struct wq_unbound_install_ctx *
>>>  wq_unbound_install_ctx_prepare(struct workqueue_struct *wq,
>>> -			       const struct workqueue_attrs *attrs)
>>> +			       const struct workqueue_attrs *attrs,
>>> +			       cpumask_var_t unbound_cpumask)
>>>  {
>> ...
>>>  	/* make a copy of @attrs and sanitize it */
>>>  	copy_workqueue_attrs(new_attrs, attrs);
>>> -	cpumask_and(new_attrs->cpumask, new_attrs->cpumask, wq_unbound_cpumask);
>>> +	copy_workqueue_attrs(pwq_attrs, attrs);
>>> +	cpumask_and(new_attrs->cpumask, new_attrs->cpumask, cpu_possible_mask);
>>> +	cpumask_and(pwq_attrs->cpumask, pwq_attrs->cpumask, unbound_cpumask);
>>
>> Hmmm... we weren't checking whether the intersection becomes null
>> before.
> 
> Di you refer to the unquoted following code "cpumask_empty(pwq_attrs->cpumask)"?
> 
> It is explained in the changelog and the comments.
> 
>>  Why are we doing it now?  Note that this doesn't really make
>> things water-tight as cpu on/offlining can still leave the mask w/o
>> any online cpus.  Shouldn't we just let the scheduler handle it as
>> before?
> 
> Did you refer to "cpumask_and(new_attrs->cpumask, new_attrs->cpumask, cpu_possible_mask);"?
> 
> new_attrs will be copied to wq->unbound_attrs, so we hope it is sanity.
> the same code before this patchset did the same work.
> 
> And it maybe be used for default pwq, and it can reduce the pool creation:
> 	cpu_possible_mask = 0-7
> 	wq_unbound_cpumask = 0-3
> 	user1 try to set wq1:	attrs->cpumask = 4-9
> 	user2 try to set wq2:	attrs->cpumask = 4-11
> thus both wq1 and wq2's default pwq's pool is the same pool. (pool's cpumask = 4-7)
> 	
> 
>>
>>> @@ -3712,6 +3726,9 @@ static void wq_update_unbound_numa(struct workqueue_struct *wq, int cpu,
>>>  	 * wq's, the default pwq should be used.
>>>  	 */
>>>  	if (wq_calc_node_cpumask(wq->unbound_attrs, node, cpu_off, cpumask)) {
>>> +		cpumask_and(cpumask, cpumask, wq_unbound_cpumask);
>>> +		if (cpumask_empty(cpumask))
>>> +			goto use_dfl_pwq;
>>
>> So, this special handling is necessary only because we did special in
>> the above for dfl_pwq.  Why do we need these?
> 
> wq->unbound_attrs is user setting attrs, its cpumask is not controlled by
> wq_unbound_cpumask. so we need these cpumask_and().
> 
> Another question:
> Why wq->unbound_attrs' cpumask is not controlled by wq_unbound_cpumask?
> 
> I hope the wq->unbound_attrs is always as the same as the user's last setting,
> regardless how much times the wq_unbound_cpumask is changed.
> 
>>
>>> +static int unbounds_cpumask_apply(cpumask_var_t cpumask)
>>> +{
>> ..
>>> +	list_for_each_entry_safe(ctx, n, &ctxs, list) {
>>> +		if (ret >= 0)
>>
>> Let's do !ret.
>>
>>> +			wq_unbound_install_ctx_commit(ctx);
>>> +		wq_unbound_install_ctx_free(ctx);
>>> +	}
>> ...
>>> +/**
>>> + *  workqueue_unbounds_cpumask_set - Set the low-level unbound cpumask
>>> + *  @cpumask: the cpumask to set
>>> + *
>>> + *  The low-level workqueues cpumask is a global cpumask that limits
>>> + *  the affinity of all unbound workqueues.  This function check the @cpumask
>>> + *  and apply it to all unbound workqueues and updates all pwqs of them.
>>> + *  When all succeed, it saves @cpumask to the global low-level unbound
>>> + *  cpumask.
>>> + *
>>> + *  Retun:	0	- Success
>>> + *  		-EINVAL	- No online cpu in the @cpumask
>>> + *  		-ENOMEM	- Failed to allocate memory for attrs or pwqs.
>>> + */
>>> +int workqueue_unbounds_cpumask_set(cpumask_var_t cpumask)
>>> +{
>>> +	int ret = -EINVAL;
>>> +
>>> +	get_online_cpus();
>>> +	cpumask_and(cpumask, cpumask, cpu_possible_mask);
>>> +	if (cpumask_intersects(cpumask, cpu_online_mask)) {
>>
>> Does this make sense?  We can't prevent cpus going down right after
>> the mask is set.  What's the point of preventing empty config if we
>> can't prevent transitions into it and have to handle it anyway?
> 
> Like set_cpus_allowed_ptr(). The cpumask must be valid when setting,
> although it can be transited into non-intersection later.
> 
> This code is originated from Frederic.  Maybe he has some stronger reason.
> 
>>
>>> +static ssize_t unbounds_cpumask_store(struct device *dev,
>>> +				      struct device_attribute *attr,
>>> +				      const char *buf, size_t count)
>>
>> Naming is too confusing.  Please pick a name which clearly
>> distinguishes per-wq and global masking.
> 
> What about these names?
> wq_unbound_cpumask ==> wq_unbound_global_cpumask
> workqueue_unbounds_cpumask_set() ==> workqueue_set_unbound_global_cpumask(). (public API)
> unbounds_cpumask_store() ==> wq_store_unbound_global_cpumask()   (static function for sysfs)
> 
>>
>> Thanks.
>>
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> .
> 


  reply	other threads:[~2015-04-01  8:32 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-12  5:00 [PATCH 0/4] workqueue: Introduce low-level unbound wq sysfs cpumask v4 Lai Jiangshan
2015-03-12  5:00 ` [PATCH 1/4] workqueue: Reorder sysfs code Lai Jiangshan
2015-03-12  5:00 ` [PATCH 2/4] workqueue: split apply_workqueue_attrs() into 3 stages Lai Jiangshan
2015-03-12  5:00 ` [PATCH 3/4] workqueue: Create low-level unbound workqueues cpumask Lai Jiangshan
2015-03-12 17:33   ` Christoph Lameter
2015-03-13 23:49   ` Kevin Hilman
2015-03-14  0:52     ` Kevin Hilman
2015-03-14  7:52     ` Lai Jiangshan
2015-03-16 17:12       ` Kevin Hilman
2015-03-16 17:25         ` Frederic Weisbecker
2015-03-16 19:38           ` Kevin Hilman
2015-03-12  5:00 ` [PATCH 4/4] workqueue: Allow modifying low level unbound workqueue cpumask Lai Jiangshan
2015-03-12 17:42   ` Christoph Lameter
2015-03-13  1:38     ` Lai Jiangshan
2015-03-13  7:49   ` Lai Jiangshan
2015-03-12 17:49 ` [PATCH 0/4] workqueue: Introduce low-level unbound wq sysfs cpumask v4 Frederic Weisbecker
2015-03-13  6:02 ` Mike Galbraith
2015-03-18  4:40 ` [PATCH 0/4 V5] workqueue: Introduce low-level unbound wq sysfs cpumask v5 Lai Jiangshan
2015-03-18  4:40   ` [PATCH 1/4 V5] workqueue: Reorder sysfs code Lai Jiangshan
2015-03-24 15:41     ` Tejun Heo
2015-03-18  4:40   ` [PATCH 2/4 V5] workqueue: split apply_workqueue_attrs() into 3 stages Lai Jiangshan
2015-03-24 15:55     ` Tejun Heo
2015-03-18  4:40   ` [PATCH 3/4 V5] workqueue: Create low-level unbound workqueues cpumask Lai Jiangshan
2015-03-18  4:40   ` [PATCH 4/4 V5] workqueue: Allow modifying low level unbound workqueue cpumask Lai Jiangshan
2015-03-24 17:31     ` Tejun Heo
2015-03-31  7:46       ` Lai Jiangshan
2015-04-01  8:33         ` Lai Jiangshan [this message]
2015-04-01 15:52           ` Tejun Heo
2015-03-19  8:54   ` [PATCH 0/4 V5] workqueue: Introduce low-level unbound wq sysfs cpumask v5 Mike Galbraith
2015-04-02 11:14   ` [PATCH 0/4 V6] " Lai Jiangshan
2015-04-02 11:14     ` [PATCH 1/4 V6] workqueue: Reorder sysfs code Lai Jiangshan
2015-04-06 15:22       ` Tejun Heo
2015-04-02 11:14     ` [PATCH 2/4 V6] workqueue: split apply_workqueue_attrs() into 3 stages Lai Jiangshan
2015-04-06 15:39       ` Tejun Heo
2015-04-02 11:14     ` [PATCH 3/4 V6] workqueue: Create low-level unbound workqueues cpumask Lai Jiangshan
2015-04-02 11:14     ` [PATCH 4/4 V6] workqueue: Allow modifying low level unbound workqueue cpumask Lai Jiangshan
2015-04-06 15:53       ` Tejun Heo
2015-04-07  1:25         ` Lai Jiangshan
2015-04-07  1:58           ` Tejun Heo
2015-04-07  2:33             ` Lai Jiangshan
2015-04-07 11:26     ` [PATCH 1/3 V7] workqueue: split apply_workqueue_attrs() into 3 stages Lai Jiangshan
2015-04-07 11:26       ` [PATCH 2/3 V7] workqueue: Create low-level unbound workqueues cpumask Lai Jiangshan
2015-04-07 11:26       ` [PATCH 3/3 V7] workqueue: Allow modifying low level unbound workqueue cpumask Lai Jiangshan
2015-04-22 19:39         ` Tejun Heo
2015-04-22 23:02           ` Frederic Weisbecker
2015-04-23  6:29             ` Mike Galbraith
2015-04-17 14:57       ` [PATCH 1/3 V7] workqueue: split apply_workqueue_attrs() into 3 stages Tejun Heo
2015-04-20  3:21         ` Lai Jiangshan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=551BAD5A.2070205@cn.fujitsu.com \
    --to=laijs@cn.fujitsu.com \
    --cc=bitbucket@online.de \
    --cc=cl@linux.com \
    --cc=fweisbec@gmail.com \
    --cc=khilman@linaro.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=tj@kernel.org \
    --cc=viresh.kumar@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.