From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752566AbbDAIc5 (ORCPT ); Wed, 1 Apr 2015 04:32:57 -0400 Received: from cn.fujitsu.com ([59.151.112.132]:41830 "EHLO heian.cn.fujitsu.com" rhost-flags-OK-FAIL-OK-FAIL) by vger.kernel.org with ESMTP id S1750701AbbDAIcx (ORCPT ); Wed, 1 Apr 2015 04:32:53 -0400 X-IronPort-AV: E=Sophos;i="5.04,848,1406563200"; d="scan'208";a="90019322" Message-ID: <551BAD5A.2070205@cn.fujitsu.com> Date: Wed, 1 Apr 2015 16:33:30 +0800 From: Lai Jiangshan User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.3.0 MIME-Version: 1.0 To: Tejun Heo , Frederic Weisbecker CC: , Christoph Lameter , Kevin Hilman , Mike Galbraith , "Paul E. McKenney" , Viresh Kumar Subject: Re: [PATCH 4/4 V5] workqueue: Allow modifying low level unbound workqueue cpumask References: <1426136412-7594-1-git-send-email-laijs@cn.fujitsu.com> <1426653617-3240-1-git-send-email-laijs@cn.fujitsu.com> <1426653617-3240-5-git-send-email-laijs@cn.fujitsu.com> <20150324173120.GI3880@htj.duckdns.org> <551A50DC.6060904@cn.fujitsu.com> In-Reply-To: <551A50DC.6060904@cn.fujitsu.com> Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit X-Originating-IP: [10.167.226.103] Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi, Frederic, TJ I considered a special case and forgot to consider an another case. ==== Let @L = the low level unbound workqueue cpumask. Let @U = the user setting cpumask (wq->unbound_attrs->cpumask). Thus the pwqs in the specified wq are controlled by @L & @U (& = cpmask_and()). But the per-node pwqs are mandatory controlled by @L & @U, we don't discus it. So this mail only takes focus on the default pwq. What happens for the default pwq when @L & @U == empty cpumask? I considered this case in the patch. In my patch, the dfl_pwq directly use the @U. The reasons: 1) it is not a good idea to directly use cpu_possible_mask. 2) and it is not a good idea either to use @L & @U ( = empty), in this case, the scheduler will use cpu_possible_mask. 3) so we has to chose one from @L and @U. 4) I chose @U instead of @L. @L: it is low level global cpumask and it controls *ALL* wqs. @U: it is set by the *USER*, it controls only one *SPECIFIC* wq. Frederic, TJ, both you didn't say anything about my this early quick decision. Should this case be handled specially? And if yes, does this decision met your requirements? ====== A comment from TJ reminded me that the final cpumask determined by the scheduler is more important. Let @O = cpu_online_mask. The missing case: (@L & @U) is not empty but (@L & @U @O) is empty. In my old code (V5 patchset), the dfl_pwq uses (@L & @U), the scheduler will use cpu_possible_mask instead due to there is no cpu onlined among all cpu in (@L & @U). It is bad, the pwq is NOT controlled by @L nor @U now. I think we may use @U for the dfl_pwq in this case. But it will introduces a problem: When (@L & @U) has online cpu, the dfl_pwq's cpumaks is (@L & @U). when (@L & @U) has no online cpu, the dfl_pwq's cpumask is @U. It means dfl_pwq may need to be reallocated during the cpuhotplug-add/remove and it means wq_update_unbound_numa() can fail. Frederic, TJ, any comments about this case? TJ, would you like to make wq_update_unbound_numa() be failure-able? thanks Lai On 03/31/2015 03:46 PM, Lai Jiangshan wrote: > On 03/25/2015 01:31 AM, Tejun Heo wrote: >> On Wed, Mar 18, 2015 at 12:40:17PM +0800, Lai Jiangshan wrote: >>> The oreder-workquue is ignore from the low level unbound workqueue cpumask, >>> it will be handled in near future. >> >> Ugh, right, ordered workqueues are tricky. Maybe we should change how >> ordered workqueues are implemented. Just gate work items at the >> workqueue layer instead of fiddling with max_active and the number of >> pwqs. >> >>> static struct wq_unbound_install_ctx * >>> wq_unbound_install_ctx_prepare(struct workqueue_struct *wq, >>> - const struct workqueue_attrs *attrs) >>> + const struct workqueue_attrs *attrs, >>> + cpumask_var_t unbound_cpumask) >>> { >> ... >>> /* make a copy of @attrs and sanitize it */ >>> copy_workqueue_attrs(new_attrs, attrs); >>> - cpumask_and(new_attrs->cpumask, new_attrs->cpumask, wq_unbound_cpumask); >>> + copy_workqueue_attrs(pwq_attrs, attrs); >>> + cpumask_and(new_attrs->cpumask, new_attrs->cpumask, cpu_possible_mask); >>> + cpumask_and(pwq_attrs->cpumask, pwq_attrs->cpumask, unbound_cpumask); >> >> Hmmm... we weren't checking whether the intersection becomes null >> before. > > Di you refer to the unquoted following code "cpumask_empty(pwq_attrs->cpumask)"? > > It is explained in the changelog and the comments. > >> Why are we doing it now? Note that this doesn't really make >> things water-tight as cpu on/offlining can still leave the mask w/o >> any online cpus. Shouldn't we just let the scheduler handle it as >> before? > > Did you refer to "cpumask_and(new_attrs->cpumask, new_attrs->cpumask, cpu_possible_mask);"? > > new_attrs will be copied to wq->unbound_attrs, so we hope it is sanity. > the same code before this patchset did the same work. > > And it maybe be used for default pwq, and it can reduce the pool creation: > cpu_possible_mask = 0-7 > wq_unbound_cpumask = 0-3 > user1 try to set wq1: attrs->cpumask = 4-9 > user2 try to set wq2: attrs->cpumask = 4-11 > thus both wq1 and wq2's default pwq's pool is the same pool. (pool's cpumask = 4-7) > > >> >>> @@ -3712,6 +3726,9 @@ static void wq_update_unbound_numa(struct workqueue_struct *wq, int cpu, >>> * wq's, the default pwq should be used. >>> */ >>> if (wq_calc_node_cpumask(wq->unbound_attrs, node, cpu_off, cpumask)) { >>> + cpumask_and(cpumask, cpumask, wq_unbound_cpumask); >>> + if (cpumask_empty(cpumask)) >>> + goto use_dfl_pwq; >> >> So, this special handling is necessary only because we did special in >> the above for dfl_pwq. Why do we need these? > > wq->unbound_attrs is user setting attrs, its cpumask is not controlled by > wq_unbound_cpumask. so we need these cpumask_and(). > > Another question: > Why wq->unbound_attrs' cpumask is not controlled by wq_unbound_cpumask? > > I hope the wq->unbound_attrs is always as the same as the user's last setting, > regardless how much times the wq_unbound_cpumask is changed. > >> >>> +static int unbounds_cpumask_apply(cpumask_var_t cpumask) >>> +{ >> .. >>> + list_for_each_entry_safe(ctx, n, &ctxs, list) { >>> + if (ret >= 0) >> >> Let's do !ret. >> >>> + wq_unbound_install_ctx_commit(ctx); >>> + wq_unbound_install_ctx_free(ctx); >>> + } >> ... >>> +/** >>> + * workqueue_unbounds_cpumask_set - Set the low-level unbound cpumask >>> + * @cpumask: the cpumask to set >>> + * >>> + * The low-level workqueues cpumask is a global cpumask that limits >>> + * the affinity of all unbound workqueues. This function check the @cpumask >>> + * and apply it to all unbound workqueues and updates all pwqs of them. >>> + * When all succeed, it saves @cpumask to the global low-level unbound >>> + * cpumask. >>> + * >>> + * Retun: 0 - Success >>> + * -EINVAL - No online cpu in the @cpumask >>> + * -ENOMEM - Failed to allocate memory for attrs or pwqs. >>> + */ >>> +int workqueue_unbounds_cpumask_set(cpumask_var_t cpumask) >>> +{ >>> + int ret = -EINVAL; >>> + >>> + get_online_cpus(); >>> + cpumask_and(cpumask, cpumask, cpu_possible_mask); >>> + if (cpumask_intersects(cpumask, cpu_online_mask)) { >> >> Does this make sense? We can't prevent cpus going down right after >> the mask is set. What's the point of preventing empty config if we >> can't prevent transitions into it and have to handle it anyway? > > Like set_cpus_allowed_ptr(). The cpumask must be valid when setting, > although it can be transited into non-intersection later. > > This code is originated from Frederic. Maybe he has some stronger reason. > >> >>> +static ssize_t unbounds_cpumask_store(struct device *dev, >>> + struct device_attribute *attr, >>> + const char *buf, size_t count) >> >> Naming is too confusing. Please pick a name which clearly >> distinguishes per-wq and global masking. > > What about these names? > wq_unbound_cpumask ==> wq_unbound_global_cpumask > workqueue_unbounds_cpumask_set() ==> workqueue_set_unbound_global_cpumask(). (public API) > unbounds_cpumask_store() ==> wq_store_unbound_global_cpumask() (static function for sysfs) > >> >> Thanks. >> > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > . >