From: Dan Williams <dan.j.williams@intel.com>
To: alexander.h.duyck@linux.intel.com
Cc: "Pasha Tatashin" <pavel.tatashin@microsoft.com>,
"Michal Hocko" <mhocko@suse.com>,
linux-nvdimm <linux-nvdimm@lists.01.org>,
"Dave Hansen" <dave.hansen@intel.com>,
"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
"Linux MM" <linux-mm@kvack.org>,
"Jérôme Glisse" <jglisse@redhat.com>,
"Andrew Morton" <akpm@linux-foundation.org>,
"Ingo Molnar" <mingo@kernel.org>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: Re: [PATCH v4 4/5] async: Add support for queueing on specific node
Date: Fri, 21 Sep 2018 07:57:21 -0700
Message-ID: <CAPcyv4iFs5WXMYgbC6mBSxcHggv5y1kPW5BoZ4JMy5o-bv6cOg@mail.gmail.com>
In-Reply-To: <20180920222938.19464.34102.stgit@localhost.localdomain>
On Thu, Sep 20, 2018 at 3:31 PM Alexander Duyck
<alexander.h.duyck@linux.intel.com> wrote:
>
> This patch introduces two new variants of the async_schedule_ functions
> that allow scheduling on a specific node. These functions are
> async_schedule_on and async_schedule_on_domain which end up mapping to
> async_schedule and async_schedule_domain but provide NUMA node specific
> functionality. The original functions were moved to inline function
> definitions that call the new functions while passing NUMA_NO_NODE.
>
> The main motivation behind this is to address the need to be able to
> schedule NVDIMM init work on specific NUMA nodes in order to improve
> performance of memory initialization.
>
> One additional change I made is I dropped the "extern" from the function
> prototypes in the async.h kernel header since they aren't needed.
>
> Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> ---
> include/linux/async.h | 20 +++++++++++++++++---
> kernel/async.c | 36 +++++++++++++++++++++++++-----------
> 2 files changed, 42 insertions(+), 14 deletions(-)
>
> diff --git a/include/linux/async.h b/include/linux/async.h
> index 6b0226bdaadc..9878b99cbb01 100644
> --- a/include/linux/async.h
> +++ b/include/linux/async.h
> @@ -14,6 +14,7 @@
>
> #include <linux/types.h>
> #include <linux/list.h>
> +#include <linux/numa.h>
>
> typedef u64 async_cookie_t;
> typedef void (*async_func_t) (void *data, async_cookie_t cookie);
> @@ -37,9 +38,22 @@ struct async_domain {
> struct async_domain _name = { .pending = LIST_HEAD_INIT(_name.pending), \
> .registered = 0 }
>
> -extern async_cookie_t async_schedule(async_func_t func, void *data);
> -extern async_cookie_t async_schedule_domain(async_func_t func, void *data,
> - struct async_domain *domain);
> +async_cookie_t async_schedule_on(async_func_t func, void *data, int node);
> +async_cookie_t async_schedule_on_domain(async_func_t func, void *data, int node,
> + struct async_domain *domain);
I would expect this to take a cpu instead of a node, so as not to
surprise users coming from queue_work_on() / schedule_work_on()...
> +
> +static inline async_cookie_t async_schedule(async_func_t func, void *data)
> +{
> + return async_schedule_on(func, data, NUMA_NO_NODE);
> +}
> +
> +static inline async_cookie_t
> +async_schedule_domain(async_func_t func, void *data,
> + struct async_domain *domain)
> +{
> + return async_schedule_on_domain(func, data, NUMA_NO_NODE, domain);
> +}
> +
> void async_unregister_domain(struct async_domain *domain);
> extern void async_synchronize_full(void);
> extern void async_synchronize_full_domain(struct async_domain *domain);
> diff --git a/kernel/async.c b/kernel/async.c
> index a893d6170944..1d7ce81c1949 100644
> --- a/kernel/async.c
> +++ b/kernel/async.c
> @@ -56,6 +56,7 @@ synchronization with the async_synchronize_full() function, before returning
> #include <linux/sched.h>
> #include <linux/slab.h>
> #include <linux/workqueue.h>
> +#include <linux/cpu.h>
>
> #include "workqueue_internal.h"
>
> @@ -149,8 +150,11 @@ static void async_run_entry_fn(struct work_struct *work)
> wake_up(&async_done);
> }
>
> -static async_cookie_t __async_schedule(async_func_t func, void *data, struct async_domain *domain)
> +static async_cookie_t __async_schedule(async_func_t func, void *data,
> + struct async_domain *domain,
> + int node)
> {
> + int cpu = WORK_CPU_UNBOUND;
> struct async_entry *entry;
> unsigned long flags;
> async_cookie_t newcookie;
> @@ -194,30 +198,40 @@ static async_cookie_t __async_schedule(async_func_t func, void *data, struct asy
> /* mark that this task has queued an async job, used by module init */
> current->flags |= PF_USED_ASYNC;
>
> + /* guarantee cpu_online_mask doesn't change during scheduling */
> + get_online_cpus();
> +
> + if (node >= 0 && node < MAX_NUMNODES && node_online(node))
> + cpu = cpumask_any_and(cpumask_of_node(node), cpu_online_mask);
...I think this node-to-cpu helper should be up-leveled for callers. I
suspect that taking the cpu_hotplug_lock via get_online_cpus() within a
"do_something_on()" routine may cause lockdep problems. For example,
I found this when auditing queue_work_on() users:
/*
* Doesn't need any cpu hotplug locking because we do rely on per-cpu
* kworkers being shut down before our page_alloc_cpu_dead callback is
* executed on the offlined cpu.
* Calling this function with cpu hotplug locks held can actually lead
* to obscure indirect dependencies via WQ context.
*/
void lru_add_drain_all(void)
I think it's a gotcha waiting to happen if async_schedule_on() has
more restrictive calling contexts than queue_work_on().
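
As a rough illustration of the "up-leveled" shape suggested above, here is
a minimal userspace model, not kernel code. The caller resolves a node to
a cpu first, then hands that cpu to a scheduling routine that takes a cpu
the way queue_work_on() does. Every model_* name, the fixed two-node
topology, and the bitmask stand-ins for cpumasks are assumptions made for
this sketch. It deliberately leaves out any hotplug locking, which is the
open question in this review.

```c
#include <stdint.h>

/* Hypothetical stand-ins for kernel constants; values chosen for the model. */
#define MODEL_NR_CPUS      8
#define MODEL_NR_NODES     2
#define MODEL_NO_NODE      (-1)  /* plays the role of NUMA_NO_NODE */
#define MODEL_CPU_UNBOUND  (-1)  /* plays the role of WORK_CPU_UNBOUND */

/* Bit i set means cpu i belongs to the node (assumed topology). */
const uint32_t model_node_cpus[MODEL_NR_NODES] = { 0x0f, 0xf0 };

/* Bit i set means cpu i is online; tests may change this. */
uint32_t model_online_cpus = 0xff;

/*
 * Caller-side helper: pick the lowest-numbered online cpu on @node, or
 * fall back to "unbound" when the node is invalid or has no online cpu.
 */
int model_node_to_cpu(int node)
{
	uint32_t candidates;
	int cpu;

	if (node < 0 || node >= MODEL_NR_NODES)
		return MODEL_CPU_UNBOUND;

	candidates = model_node_cpus[node] & model_online_cpus;
	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (candidates & (1u << cpu))
			return cpu;

	return MODEL_CPU_UNBOUND;
}

/*
 * Scheduling routine that takes a cpu, as queue_work_on() does. The model
 * only reports where the work would be queued; it does no node lookup and
 * takes no locks of its own.
 */
int model_async_schedule_on(int cpu)
{
	return cpu;
}
```

A caller would then write something like
model_async_schedule_on(model_node_to_cpu(node)), which keeps the node
lookup, and any locking it might need, visible at the call site rather
than hidden inside the "_on" routine.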