* [PATCHSET wq/for-3.10] workqueue: NUMA affinity for unbound workqueues
From: Tejun Heo @ 2013-03-20  0:00 UTC
  To: laijs
  Cc: axboe, jack, fengguang.wu, jmoyer, zab, linux-kernel, herbert,
	davem, linux-crypto

Hello,

There are two types of workqueues - per-cpu and unbound.  The former
is bound to each CPU and the latter isn't bound to any by default.
While the recently added attrs support allows unbound workqueues to be
confined to a subset of CPUs, it still is quite cumbersome for
applications where CPU affinity is too restrictive but NUMA locality
still matters.

This patchset tries to solve that issue by automatically making
unbound workqueues affine to NUMA nodes by default.  A work item
queued to an unbound workqueue is executed on one of the CPUs allowed
by the workqueue in the same node.  If there's none allowed, it may be
executed on any CPU allowed by the workqueue.  It doesn't require any
changes on the user side.  Every interface of workqueues functions the
same as before.
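
The selection rule above can be modelled in a few lines.  This is only
a user-space sketch under stated assumptions, not code from the
patches: effective_cpus(), NODE_CPUS and the two example cpumasks are
hypothetical names, and cpumasks are represented as plain Python sets.

```python
def effective_cpus(wq_allowed, node_cpus):
    """CPUs on which a work item queued from the given node may run."""
    local = wq_allowed & node_cpus      # allowed CPUs that sit in the same node
    # No allowed CPU in this node: fall back to everything the workqueue allows.
    return local if local else set(wq_allowed)


# Hypothetical two-node machine: node 0 owns CPUs 0-3, node 1 owns CPUs 4-7.
NODE_CPUS = {0: {0, 1, 2, 3}, 1: {4, 5, 6, 7}}

spread = {2, 3, 4, 5}    # workqueue allowed on CPUs in both nodes
confined = {0, 1}        # workqueue confined to node 0 only

for name, allowed in (("spread", spread), ("confined", confined)):
    for node, cpus in NODE_CPUS.items():
        print(name, "node", node, "->", sorted(effective_cpus(allowed, cpus)))
```

With the "spread" mask each node keeps work on its own CPUs; with the
"confined" mask node 1 has no allowed CPU and falls back to the full
allowed set.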

This would be most helpful to subsystems which use some form of async
execution to process a significant amount of data - e.g. crypto and
btrfs; however, I wanted to find out whether it would make any dent in
much less favorable use cases.  The following is the total run time in
seconds of building an allmodconfig kernel w/ -j20 on a dual-socket
Opteron machine with the writeback thread pool converted to an unbound
workqueue and thus made NUMA-affine.  The file system is ext4 on top
of a WD SSD.

	before conversion		after conversion
	1396.126			1394.763
	1397.621			1394.965
	1399.636			1394.738
	1397.463			1398.162
	1395.543			1393.670

AVG	1397.278			1395.260	DIFF	2.018
STDEV	   1.585			   1.700

And, yes, it actually made things go faster by about 1.2 sigma, which
isn't completely conclusive but is a pretty good indication that it's
actually faster.  Note that this is a workload which is dominated by
CPU time and, while there's writeback going on continuously, it really
isn't touching much data, nor is it a dominating factor, so the gain
is understandably small, 0.14%, but hey it's still a gain and it should
be much more interesting for crypto and btrfs, which would actually
access the data, or workloads which are more sensitive to NUMA
affinity.
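
The summary figures can be rechecked from the five runs in the table.
The sketch below assumes STDEV is the sample standard deviation
(n - 1); which column's deviation the "1.2 sigma" figure refers to is
not stated, so both ratios are printed.

```python
from statistics import mean, stdev

# Run times in seconds, copied from the table above.
before = [1396.126, 1397.621, 1399.636, 1397.463, 1395.543]
after = [1394.763, 1394.965, 1394.738, 1398.162, 1393.670]

avg_before, avg_after = mean(before), mean(after)
diff = avg_before - avg_after
sd_before, sd_after = stdev(before), stdev(after)  # sample stdev (n - 1)

print(f"AVG   {avg_before:.3f}  {avg_after:.3f}  DIFF {diff:.3f}")
print(f"STDEV {sd_before:.3f}  {sd_after:.3f}")
print(f"gain  {100 * diff / avg_before:.2f}%")
print(f"diff/stdev: {diff / sd_before:.2f} (before), {diff / sd_after:.2f} (after)")
```

The averages, deviations and the 0.14% gain reproduce the numbers
quoted above, and the difference is roughly 1.2 to 1.3 standard
deviations depending on which column is used.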

The implementation is fairly simple.  After the recent attrs support
changes, a lot of the differences in pwq (pool_workqueue) handling
between unbound and per-cpu workqueues are gone.  An unbound workqueue
still has one "current" pwq that it uses for queueing any new work
items but can handle multiple pwqs perfectly well while they're
draining, so this patchset adds a pwq dispatch table to unbound
workqueues which is indexed by NUMA node and points to the matching
pwq.  Unbound workqueues now simply have multiple "current" pwqs keyed
by NUMA node.
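
As a rough illustration of the dispatch table idea, here is a toy
user-space model.  It is a sketch only: the class names, the
numa_pwq_tbl field, queue_work() and the cpu_to_node mapping are
assumptions made for the example, and locking, draining and pwq
replacement are left out entirely.

```python
from collections import deque


class PoolWorkqueue:
    """Toy stand-in for a pwq: a FIFO of work items tied to one NUMA node."""

    def __init__(self, node):
        self.node = node
        self.items = deque()


class UnboundWorkqueue:
    """Toy unbound workqueue holding one "current" pwq per NUMA node."""

    def __init__(self, nr_nodes):
        # Dispatch table: index is the NUMA node, value is that node's pwq.
        self.numa_pwq_tbl = [PoolWorkqueue(node) for node in range(nr_nodes)]

    def queue_work(self, work, cpu, cpu_to_node):
        # Pick the pwq matching the node of the CPU doing the queueing.
        pwq = self.numa_pwq_tbl[cpu_to_node[cpu]]
        pwq.items.append(work)
        return pwq


# Hypothetical topology: CPUs 0-1 on node 0, CPUs 2-3 on node 1.
cpu_to_node = {0: 0, 1: 0, 2: 1, 3: 1}
wq = UnboundWorkqueue(nr_nodes=2)
first = wq.queue_work("work-a", cpu=0, cpu_to_node=cpu_to_node)
second = wq.queue_work("work-b", cpu=3, cpu_to_node=cpu_to_node)
print(first.node, list(first.items), second.node, list(second.items))
```

Work queued from CPU 0 lands on node 0's pwq and work queued from
CPU 3 lands on node 1's, which is all that "keyed by NUMA node" means
in this model.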

NUMA affinity can be turned off system-wide with the
workqueue.disable_numa kernel param or per workqueue using the "numa"
sysfs file.

This patchset contains the following ten patches.

 0001-workqueue-add-wq_numa_tbl_len-and-wq_numa_possible_c.patch
 0002-workqueue-drop-H-from-kworker-names-of-unbound-worke.patch
 0003-workqueue-determine-NUMA-node-of-workers-accourding-.patch
 0004-workqueue-add-workqueue-unbound_attrs.patch
 0005-workqueue-make-workqueue-name-fixed-len.patch
 0006-workqueue-move-hot-fields-of-workqueue_struct-to-the.patch
 0007-workqueue-map-an-unbound-workqueues-to-multiple-per-.patch
 0008-workqueue-break-init_and_link_pwq-into-two-functions.patch
 0009-workqueue-implement-NUMA-affinity-for-unbound-workqu.patch
 0010-workqueue-update-sysfs-interface-to-reflect-NUMA-awa.patch

0001 adds basic NUMA topology knowledge to workqueue.

0002-0006 are prep patches.

0007-0009 implement NUMA affinity.

0010 adds control knobs and updates sysfs interface.

This patchset is on top of

 wq/for-3.10 a0265a7f51 ("workqueue: define workqueue_freezing static variable iff CONFIG_FREEZER")

and also available in the following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git review-numa

diffstat follows.

 Documentation/kernel-parameters.txt |    9
 include/linux/workqueue.h           |    5
 kernel/workqueue.c                  |  393 ++++++++++++++++++++++++++++--------
 3 files changed, 325 insertions(+), 82 deletions(-)

Thanks.

--
tejun

Thread overview:
2013-03-20  0:00 [PATCHSET wq/for-3.10] workqueue: NUMA affinity for unbound workqueues Tejun Heo
2013-03-20  0:00 ` [PATCH 01/10] workqueue: add wq_numa_tbl_len and wq_numa_possible_cpumask[] Tejun Heo
2013-03-20 14:08   ` JoonSoo Kim
2013-03-20 14:48     ` Tejun Heo
2013-03-20 15:43   ` Lai Jiangshan
2013-03-20 15:48     ` Tejun Heo
2013-03-20 16:43       ` Lai Jiangshan
2013-03-20  0:00 ` [PATCH 02/10] workqueue: drop 'H' from kworker names of unbound worker pools Tejun Heo
2013-03-20  0:00 ` [PATCH 03/10] workqueue: determine NUMA node of workers accourding to the allowed cpumask Tejun Heo
2013-03-20  0:00 ` [PATCH 04/10] workqueue: add workqueue->unbound_attrs Tejun Heo
2013-03-20  0:00 ` [PATCH 05/10] workqueue: make workqueue->name[] fixed len Tejun Heo
2013-03-20  0:00 ` [PATCH 06/10] workqueue: move hot fields of workqueue_struct to the end Tejun Heo
2013-03-20  0:00 ` [PATCH 07/10] workqueue: map an unbound workqueues to multiple per-node pool_workqueues Tejun Heo
2013-03-20  0:00 ` [PATCH 08/10] workqueue: break init_and_link_pwq() into two functions and introduce alloc_unbound_pwq() Tejun Heo
2013-03-20 15:52   ` Lai Jiangshan
2013-03-20 16:04     ` Tejun Heo
2013-03-20  0:00 ` [PATCH 09/10] workqueue: implement NUMA affinity for unbound workqueues Tejun Heo
2013-03-20 15:03   ` Lai Jiangshan
2013-03-20 15:05     ` Tejun Heo
2013-03-20 15:26       ` Lai Jiangshan
2013-03-20 15:32         ` Tejun Heo
2013-03-20 17:08   ` [PATCH v2 " Tejun Heo
2013-03-20 18:54     ` [PATCH v2 UPDATED " Tejun Heo
2013-03-20  0:00 ` [PATCH 10/10] workqueue: update sysfs interface to reflect NUMA awareness and a kernel param to disable NUMA affinity Tejun Heo
2013-03-20 12:14 ` [PATCHSET wq/for-3.10] workqueue: NUMA affinity for unbound workqueues Lai Jiangshan
2013-03-20 17:08 ` [PATCH 11/10] workqueue: use NUMA-aware allocation for pool_workqueues workqueues Tejun Heo
2013-03-20 18:57 ` [PATCHSET wq/for-3.10] workqueue: NUMA affinity for unbound workqueues Tejun Heo
2013-03-24 16:04   ` Lai Jiangshan
2013-03-24 18:55     ` Tejun Heo
2013-03-25 19:15       ` Tejun Heo
2013-03-25 20:48         ` Tejun Heo
