* [PATCHSET] workqueue: reimplement high priority using a separate worker pool
From: Tejun Heo @ 2012-07-09 18:41 UTC
  To: linux-kernel
  Cc: axboe, elder, rni, martin.petersen, linux-bluetooth, torvalds,
	marcel, vwadekar, swhiteho, herbert, bpm, linux-crypto, gustavo,
	xfs, joshhunt00, davem, vgoyal, johan.hedberg

Currently, WQ_HIGHPRI workqueues share the same worker pool as the
normal priority ones.  The only difference is that work items from
highpri wq are queued at the head instead of tail of the worklist.  In
pathological cases, this simplistic highpri implementation doesn't
seem to be sufficient.
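
To make the head-vs-tail point concrete, here is a minimal userspace
sketch of a single shared worklist.  Everything prefixed toy_ is made
up for illustration and is not the kernel's data structure or API; the
real insertion logic may order highpri items among themselves
differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified work item; not the kernel's work_struct. */
struct toy_work {
	int id;
	int highpri;		/* nonzero if queued through a highpri wq */
	struct toy_work *next;
};

/* One worklist shared by normal and highpri items. */
struct toy_worklist {
	struct toy_work *head;
	struct toy_work *tail;
};

/* Highpri items go to the head, everything else to the tail. */
static void toy_queue(struct toy_worklist *wl, struct toy_work *w)
{
	assert(wl && w);
	w->next = NULL;
	if (!wl->head) {
		wl->head = wl->tail = w;
	} else if (w->highpri) {
		w->next = wl->head;
		wl->head = w;
	} else {
		wl->tail->next = w;
		wl->tail = w;
	}
}

/* Workers always take the item at the head of the list. */
static struct toy_work *toy_dequeue(struct toy_worklist *wl)
{
	struct toy_work *w = wl->head;

	if (w) {
		wl->head = w->next;
		if (!wl->head)
			wl->tail = NULL;
		w->next = NULL;
	}
	return w;
}
```

In this model the only thing WQ_HIGHPRI changes is the position in the
list.  The item is still picked up by whichever worker of the shared
pool gets to run next, so it competes for CPU like any other work item.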

For example, block layer request_queue delayed processing uses high
priority delayed_work to restart request processing after a short
delay.  Unfortunately, it doesn't seem to take too much to push the
latency between the delay timer expiring and the work item execution
into the range of a few seconds, leading to unintended long idling of
the underlying device.  There seem to be real-world cases where this
latency shows up[1].

A simplistic test case is measuring queue-to-execution latencies with
a lot of threads saturating CPU cycles.  Measured over a 300sec period
with 3000 0-nice threads performing 1ms sleeps continuously and a
highpri work item being repeatedly queued at 1 jiffy intervals on a
single CPU machine, the top latency was 1624ms and the average of the
top 20 was 1268ms with stdev 927ms.
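
For anyone wanting to post-process their own latency samples in a
similar way, a rough userspace sketch follows.  The helper names are
hypothetical, and which sample set the stdev above was computed over is
not stated here, so the caller chooses the range.  The functions return
the variance; the stdev is its square root.

```c
#include <stddef.h>
#include <stdlib.h>

/* qsort() comparator that orders doubles largest first. */
static int lat_cmp_desc(const void *a, const void *b)
{
	double x = *(const double *)a;
	double y = *(const double *)b;

	return (x < y) - (x > y);
}

/* Sort the latency samples in place, largest first. */
static void lat_sort_desc(double *lat, size_t n)
{
	qsort(lat, n, sizeof(*lat), lat_cmp_desc);
}

/* Mean of the first n samples; after sorting, that is the top n. */
static double lat_mean(const double *lat, size_t n)
{
	double sum = 0.0;
	size_t i;

	if (n == 0)
		return 0.0;
	for (i = 0; i < n; i++)
		sum += lat[i];
	return sum / (double)n;
}

/* Population variance of the first n samples. */
static double lat_variance(const double *lat, size_t n)
{
	double m = lat_mean(lat, n);
	double acc = 0.0;
	size_t i;

	if (n == 0)
		return 0.0;
	for (i = 0; i < n; i++)
		acc += (lat[i] - m) * (lat[i] - m);
	return acc / (double)n;
}
```

After lat_sort_desc(), lat[0] is the top latency and lat_mean(lat, 20)
is the average of the top 20, assuming at least 20 samples were
collected.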

This patchset reimplements high priority workqueues so that they use a
separate worklist and worker pool.  Now each global_cwq contains two
worker_pools - one for normal priority work items and the other for
high priority.  Each has its own worklist and workers, and the highpri
worker pool is populated with worker threads w/ -20 nice value.
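
As a rough picture of the resulting layout, the following userspace
sketch models a global_cwq holding two pools and picking one by flag.
All toy_ and TOY_ names are invented for illustration; the flag bit is
a stand-in rather than the kernel's value, and nice 0 for the normal
pool is an assumption.  Only the -20 nice value for highpri workers is
taken from the description above.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_WORKER_POOLS	2		/* normal + highpri */
#define TOY_WQ_HIGHPRI		(1u << 0)	/* stand-in flag bit */

/* Each pool has its own pending count (worklist) and worker nice level. */
struct toy_worker_pool {
	int nice;			/* nice value of this pool's workers */
	unsigned int nr_pending;	/* items on this pool's own worklist */
};

/* Per-CPU container, modelled loosely on global_cwq. */
struct toy_global_cwq {
	struct toy_worker_pool pools[TOY_NR_WORKER_POOLS];
};

static void toy_gcwq_init(struct toy_global_cwq *gcwq)
{
	assert(gcwq);
	gcwq->pools[0].nice = 0;	/* assumed default for normal workers */
	gcwq->pools[0].nr_pending = 0;
	gcwq->pools[1].nice = -20;	/* highpri workers, per the text above */
	gcwq->pools[1].nr_pending = 0;
}

/* Route a work item to a pool based on its workqueue's flags. */
static struct toy_worker_pool *
toy_pick_pool(struct toy_global_cwq *gcwq, unsigned int wq_flags)
{
	return &gcwq->pools[(wq_flags & TOY_WQ_HIGHPRI) ? 1 : 0];
}

/* Queueing only touches the selected pool's worklist. */
static void toy_queue_work(struct toy_global_cwq *gcwq, unsigned int wq_flags)
{
	toy_pick_pool(gcwq, wq_flags)->nr_pending++;
}
```

The point of the split is that a highpri item no longer waits behind,
or depends on, the normal pool's workers; it is served by threads that
the scheduler already favours.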

This reimplementation brings down the top latency to 16ms with top 20
average of 3.8ms w/ stdev 5.6ms.  The original block layer bug hasn't
been verified to be fixed yet (Josh?).

The addition of separate worker pools doesn't add much to the
complexity but does add more threads per cpu.  Highpri worker pool is
expected to remain small, but the effect is noticeable especially in
idle states.

I'm cc'ing all WQ_HIGHPRI users - block, bio-integrity, crypto, gfs2,
xfs and bluetooth.  Now you guys get proper high priority scheduling
for highpri work items; however, with more power comes more
responsibility.

In particular, the ones with both WQ_HIGHPRI and WQ_CPU_INTENSIVE -
bio-integrity and crypto - may end up dominating CPU usage.  I think
it should be mostly okay for bio-integrity considering it sits right
in the block request completion path.  I don't know enough about
tegra-aes tho.  aes_workqueue_handler() seems to mostly interact with
the hardware crypto.  Is it actually cpu cycle intensive?

This patchset contains the following six patches.

 0001-workqueue-don-t-use-WQ_HIGHPRI-for-unbound-workqueue.patch
 0002-workqueue-factor-out-worker_pool-from-global_cwq.patch
 0003-workqueue-use-pool-instead-of-gcwq-or-cpu-where-appl.patch
 0004-workqueue-separate-out-worker_pool-flags.patch
 0005-workqueue-introduce-NR_WORKER_POOLS-and-for_each_wor.patch
 0006-workqueue-reimplement-WQ_HIGHPRI-using-a-separate-wo.patch

0001 makes unbound wq not use WQ_HIGHPRI as its meaning will be
changing and won't suit the purpose unbound wq is using it for.

0002-0005 gradually pull out worker_pool from global_cwq and update
code paths to be able to deal with multiple worker_pools per
global_cwq.

0006 replaces the head-queueing WQ_HIGHPRI implementation with the one
with separate worker_pool using the multiple worker_pool mechanism
previously implemented.

The patchset is available in the following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git review-wq-highpri

diffstat follows.

 Documentation/workqueue.txt      |  103 ++----
 include/trace/events/workqueue.h |    2 
 kernel/workqueue.c               |  624 +++++++++++++++++++++------------------
 3 files changed, 385 insertions(+), 344 deletions(-)

Thanks.

--
tejun

[1] https://lkml.org/lkml/2012/3/6/475

