From: Tejun Heo <tj@kernel.org>
To: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: len.brown@intel.com, linux-pm@vger.kernel.org,
	gregkh@linuxfoundation.org, linux-nvdimm@lists.01.org,
	jiangshanlai@gmail.com, linux-kernel@vger.kernel.org,
	zwisler@kernel.org, pavel@ucw.cz, rafael@kernel.org,
	akpm@linux-foundation.org
Subject: Re: [RFC workqueue/driver-core PATCH 1/5] workqueue: Provide queue_work_near to queue work near a given NUMA node
Date: Mon, 1 Oct 2018 09:01:42 -0700
Message-ID: <20181001160142.GE270328@devbig004.ftw2.facebook.com>
In-Reply-To: <ba72f007-84e2-6fe0-b128-d876dadef5f5@linux.intel.com>

Hello,

On Wed, Sep 26, 2018 at 03:19:21PM -0700, Alexander Duyck wrote:
> On 9/26/2018 3:09 PM, Tejun Heo wrote:
> I could just use queue_work_on probably, but is there any issue if I
> am passing CPU values that are not in the wq_unbound_cpumask? That

That should be fine.  If it can't find any available cpu, it'll fall
back to round-robin.  We probably can improve it so that it can
consider the numa distance when falling back.
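
[Editor's illustration] The fallback described above can be modeled in
user space. This is only a sketch under stated assumptions, not the
kernel's code: NR_CPUS, unbound_mask, rr_cursor and select_unbound_cpu()
are hypothetical names, a plain bitmask stands in for
wq_unbound_cpumask, and returning -1 for an empty mask is a
simplification chosen for this model.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 8	/* hypothetical CPU count for this model */

/* Bit i set means CPU i may run unbound work (stand-in for wq_unbound_cpumask). */
static uint32_t unbound_mask;
/* Last CPU handed out by the round-robin fallback; -1 means none yet. */
static int rr_cursor = -1;

static bool cpu_allowed(int cpu)
{
	return cpu >= 0 && cpu < NR_CPUS && (unbound_mask & (1u << cpu));
}

/*
 * Return the requested CPU if the mask allows it. Otherwise walk the
 * mask round-robin, starting just after the last CPU handed out.
 * Returns -1 if no CPU is allowed at all (simplification of this model).
 */
static int select_unbound_cpu(int requested)
{
	if (cpu_allowed(requested))
		return requested;

	for (int i = 1; i <= NR_CPUS; i++) {
		int cpu = (rr_cursor + i) % NR_CPUS;

		if (cpu_allowed(cpu)) {
			rr_cursor = cpu;
			return cpu;
		}
	}
	return -1;
}
```

In this model, a request for a CPU outside the mask is never an error
for the issuer; it just lands on whichever allowed CPU is next in the
rotation.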

> was mostly my concern. Also for an unbound queue do I need to worry
> about the hotplug lock? I wasn't sure if that was the case or not as

Issuers don't need to worry about them.

> I know it is called out as something to be concerned with using
> queue_work_on, but in __queue_work the value is just used to
> determine which node to grab a work queue from.

It might be better to leave queue_work_on() to be used for per-cpu
workqueues and introduce queue_work_near() as you suggested.  I just
don't want it to duplicate the node selection code in it.  Would that
work?
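
[Editor's illustration] One way to picture a queue_work_near()-style
helper that keeps node selection in a single place is the user-space
sketch below. Everything in it is an assumption made for illustration:
the two-node topology in node_cpus[], the NO_CPU sentinel, and the
function names first_cpu() and select_cpu_near() do not come from the
patch under discussion.

```c
#include <stdint.h>

#define NR_NODES 2
#define NO_CPU (-1)

/* Hypothetical topology: node 0 owns CPUs 0-3, node 1 owns CPUs 4-7. */
static const uint32_t node_cpus[NR_NODES] = { 0x0Fu, 0xF0u };

/* Lowest-numbered CPU set in mask, or NO_CPU if the mask is empty. */
static int first_cpu(uint32_t mask)
{
	for (int cpu = 0; cpu < 32; cpu++)
		if (mask & (1u << cpu))
			return cpu;
	return NO_CPU;
}

/*
 * Pick a CPU that is both on `node` and in `allowed`. The caller's own
 * CPU is preferred when it qualifies, so local work stays local.
 * NO_CPU tells the caller to fall back to its normal unbound selection
 * rather than repeating any selection logic here.
 */
static int select_cpu_near(int node, int this_cpu, uint32_t allowed)
{
	uint32_t candidates;

	if (node < 0 || node >= NR_NODES)
		return NO_CPU;

	candidates = node_cpus[node] & allowed;
	if (this_cpu >= 0 && this_cpu < 32 && (candidates & (1u << this_cpu)))
		return this_cpu;

	return first_cpu(candidates);
}
```

The point of the sketch is the split: the "near" helper only narrows
the choice to a node and hands back one CPU, and whatever code already
maps a CPU to a per-node queue is left untouched.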

> I forgot to address your question about the advantages. They are
> pretty significant. The test system I was working with was
> initializing 3TB of nvdimm memory per node. If the node is aligned
> it takes something like 24 seconds, whereas an unaligned core can
> take 36 seconds or more.

Oh yeah, sure, numa affinity matters quite a bit on memory heavy
workloads.  I mistakenly thought that you were adding numa affinity
to per-cpu workqueues.

Thanks.

-- 
tejun

Thread overview:
2018-09-26 21:51 [RFC workqueue/driver-core PATCH 0/5] Add NUMA aware async_schedule calls Alexander Duyck
2018-09-26 21:51 ` [RFC workqueue/driver-core PATCH 1/5] workqueue: Provide queue_work_near to queue work near a given NUMA node Alexander Duyck
2018-09-26 21:53   ` Tejun Heo
2018-09-26 22:05     ` Alexander Duyck
2018-09-26 22:09       ` Tejun Heo
2018-09-26 22:19         ` Alexander Duyck
2018-10-01 16:01           ` Tejun Heo [this message]
2018-10-01 21:54             ` Alexander Duyck
2018-10-02 17:41               ` Tejun Heo
2018-10-02 18:23                 ` Alexander Duyck
2018-10-02 18:41                   ` Tejun Heo
2018-10-02 20:49                     ` Alexander Duyck
2018-09-26 21:51 ` [RFC workqueue/driver-core PATCH 2/5] async: Add support for queueing on specific " Alexander Duyck
2018-09-27  0:31   ` Dan Williams
2018-09-27 15:16     ` Alexander Duyck
2018-09-27 19:48       ` Dan Williams
2018-09-27 20:03         ` Alexander Duyck
2018-09-26 21:51 ` [RFC workqueue/driver-core PATCH 3/5] driver core: Probe devices asynchronously instead of the driver Alexander Duyck
2018-09-27  0:48   ` Dan Williams
2018-09-27 15:27     ` Alexander Duyck
2018-09-28  2:48       ` Dan Williams
2018-09-26 21:51 ` [RFC workqueue/driver-core PATCH 4/5] driver core: Use new async_schedule_dev command Alexander Duyck
2018-09-28 17:42   ` Dan Williams
2018-09-26 21:52 ` [RFC workqueue/driver-core PATCH 5/5] nvdimm: Schedule device registration on node local to the device Alexander Duyck
2018-09-28 17:46   ` Dan Williams
