KVM Archive on lore.kernel.org
From: Daniel Jordan <daniel.m.jordan@oracle.com>
To: linux-mm@kvack.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: aarcange@redhat.com, aaron.lu@intel.com,
	akpm@linux-foundation.org, alex.williamson@redhat.com,
	bsd@redhat.com, daniel.m.jordan@oracle.com,
	darrick.wong@oracle.com, dave.hansen@linux.intel.com,
	jgg@mellanox.com, jwadams@google.com, jiangshanlai@gmail.com,
	mhocko@kernel.org, mike.kravetz@oracle.com,
	Pavel.Tatashin@microsoft.com, prasad.singamsetty@oracle.com,
	rdunlap@infradead.org, steven.sistare@oracle.com,
	tim.c.chen@intel.com, tj@kernel.org, vbabka@suse.cz
Subject: [RFC PATCH v4 04/13] ktask: run helper threads at MAX_NICE
Date: Mon,  5 Nov 2018 11:55:49 -0500
Message-ID: <20181105165558.11698-5-daniel.m.jordan@oracle.com> (raw)
In-Reply-To: <20181105165558.11698-1-daniel.m.jordan@oracle.com>

Multithreading may speed up long-running kernel tasks, but overly
optimistic parallelization can go wrong if too many helper threads are
started on an already-busy system.  Such helpers can degrade the
performance of other tasks, so they should be sensitive to current CPU
utilization[1].

To achieve this, run helpers at MAX_NICE so that their CPU time is
proportional to idle CPU time.  The main thread that called into ktask
naturally runs at its original priority so that it can make progress on
a heavily loaded system, as it would if ktask were not in the picture.
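
To make "proportional to idle CPU time" concrete: under CFS, a runnable
thread's share of a contended CPU is proportional to its load weight,
and the kernel's sched_prio_to_weight[] table maps nice 0 to 1024 and
nice 19 (MAX_NICE) to 15.  A back-of-the-envelope calculation, written
as a standalone userspace sketch:

    #include <stdio.h>

    int main(void)
    {
            const double w_nice0    = 1024.0; /* sched_prio_to_weight[20] */
            const double w_max_nice =   15.0; /* sched_prio_to_weight[39] */

            /* One MAX_NICE helper competing with one nice-0 thread. */
            printf("MAX_NICE share: %.1f%%\n",
                   100.0 * w_max_nice / (w_max_nice + w_nice0));
            return 0;
    }

This prints ~1.4%, in line with the small MAX_NICE slowdowns measured
below.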

I tested two cases in which a non-ktask and a ktask workload compete
for the same CPUs.  The goal is to show that normal-priority
(i.e. nice=0) ktask helpers cause the non-ktask workload to run more
slowly, whereas MAX_NICE ktask helpers don't.

Testing notes:
  - Each case was run using 8 CPUs on a large two-socket server, with a
    cpumask allowing all test threads to run anywhere within the 8.
  - The non-ktask workload used 7 threads and the ktask workload used 8
    threads to evaluate how much ktask helpers, rather than the main ktask
    thread, disturbed the non-ktask workload.
  - The non-ktask workload was started after the ktask workload and run
    for less time to maximize the chances that the non-ktask workload would
    be disturbed.
  - Runtimes in seconds.

Case 1: Synthetic, worst-case CPU contention

    ktask_test - a tight loop doing integer multiplication to max out on CPU;
                 used for testing only, does not appear in this series
    stress-ng  - cpu stressor ("-c --cpu-method ackermann --cpu-ops 1200")

                 stress-ng
                     alone  (stdev)   max_nice  (stdev)   normal_prio  (stdev)
                  ------------------------------------------------------------
    ktask_test                           96.87  ( 1.09)         90.81  ( 0.29)
    stress-ng        43.04  ( 0.00)      43.58  ( 0.01)         75.86  ( 0.39)

This case shows MAX_NICE helpers make a significant difference compared
to normal priority helpers, with stress-ng taking 76% longer to finish
when competing with normal priority ktask threads than when run by
itself, but only 1% longer when run with MAX_NICE helpers.  The 1% comes
from the small amount of CPU time MAX_NICE threads are given despite
their low priority.
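
Since the ktask_test module is not part of this series, here is a
hypothetical sketch of a per-thread CPU burner in its spirit (the real
test's loop body is not shown in this patch):

    #include <stdint.h>

    /* Tight integer-multiply loop; volatile keeps the compiler from
     * optimizing the work away, so the thread maxes out its CPU. */
    static uint64_t burn_cpu(uint64_t iters)
    {
            volatile uint64_t x = 88172645463325253ULL;
            uint64_t i;

            for (i = 0; i < iters; i++)
                    x *= 6364136223846793005ULL; /* arbitrary odd constant */
            return x;
    }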

Case 2: Real-world CPU contention

    ktask_vfio - VFIO page pin a 175G kvm guest
    usemem     - faults in 25G of anonymous THP per thread, PAGE_SIZE stride;
                 used to mimic the page clearing that dominates in ktask_vfio
                 so that usemem competes for the same system resources

                    usemem
                     alone  (stdev)   max_nice  (stdev)   normal_prio  (stdev)
                  ------------------------------------------------------------
    ktask_vfio                           14.74  ( 0.04)          9.93  ( 0.09)
        usemem       10.45  ( 0.04)      10.75  ( 0.04)         14.14  ( 0.07)

In the more realistic case 2, the effect is similar although not as
pronounced.  The usemem threads take 35% longer to finish with normal
priority ktask threads than when run alone, but only 3% longer when
MAX_NICE is used.
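
As an illustration of the usemem access pattern described above
(hypothetical; usemem is a separate benchmark with many more options),
each thread essentially does the following:

    #include <stddef.h>
    #include <sys/mman.h>
    #include <unistd.h>

    /* Fault in an anonymous mapping one page at a time (PAGE_SIZE
     * stride); MADV_HUGEPAGE asks the kernel to back it with THP, so
     * page clearing dominates, as it does in ktask_vfio. */
    static void fault_in(size_t bytes)
    {
            long page = sysconf(_SC_PAGESIZE);
            char *p = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

            if (p == MAP_FAILED)
                    return;
            madvise(p, bytes, MADV_HUGEPAGE);
            for (size_t off = 0; off < bytes; off += page)
                    p[off] = 1;
            munmap(p, bytes);
    }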

All ktask users outside of VFIO boil down to page clearing, so I imagine
the results would be similar for them.

[1] lkml.kernel.org/r/20171206143509.GG7515@dhcp22.suse.cz

Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
---
 kernel/ktask.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/kernel/ktask.c b/kernel/ktask.c
index b91c62f14dcd..72293a0f50c3 100644
--- a/kernel/ktask.c
+++ b/kernel/ktask.c
@@ -575,6 +575,18 @@ void __init ktask_init(void)
 		goto alloc_fail;
 	}
 
+	/*
+	 * All ktask worker threads have the lowest priority on the system so
+	 * they don't disturb other workloads.
+	 */
+	attrs->nice = MAX_NICE;
+
+	ret = apply_workqueue_attrs(ktask_wq, attrs);
+	if (ret != 0) {
+		pr_warn("disabled (couldn't apply attrs to ktask_wq)\n");
+		goto apply_fail;
+	}
+
 	attrs->no_numa = true;
 
 	ret = apply_workqueue_attrs(ktask_nonuma_wq, attrs);
-- 
2.19.1
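
For context, the hunk above follows the usual pattern for retuning an
unbound workqueue.  A sketch only (the attrs allocation and the
alloc_fail/apply_fail labels live in surrounding ktask_init() code not
shown in this hunk; note that in this 2018-era kernel,
alloc_workqueue_attrs() still took a gfp_t argument):

    #include <linux/workqueue.h>

    struct workqueue_attrs *attrs;
    int ret;

    attrs = alloc_workqueue_attrs(GFP_KERNEL);
    if (!attrs)
            return -ENOMEM;

    attrs->nice = MAX_NICE;                 /* lowest scheduling priority */
    ret = apply_workqueue_attrs(wq, attrs); /* wq must be WQ_UNBOUND */
    free_workqueue_attrs(attrs);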


Thread overview: 54+ messages
2018-11-05 16:55 [RFC PATCH v4 00/13] ktask: multithread CPU-intensive kernel work Daniel Jordan
2018-11-05 16:55 ` [RFC PATCH v4 01/13] ktask: add documentation Daniel Jordan
2018-11-05 21:19   ` Randy Dunlap
2018-11-06  2:27     ` Daniel Jordan
2018-11-06  8:49   ` Peter Zijlstra
2018-11-06 20:34     ` Daniel Jordan
2018-11-06 20:51       ` Jason Gunthorpe
2018-11-07 10:27         ` Peter Zijlstra
2018-11-07 20:21           ` Daniel Jordan
2018-11-07 10:35       ` Peter Zijlstra
2018-11-07 21:20         ` Daniel Jordan
2018-11-08 17:26   ` Jonathan Corbet
2018-11-08 19:15     ` Daniel Jordan
2018-11-08 19:24       ` Jonathan Corbet
2018-11-27 19:50   ` Pavel Machek
2018-11-28 16:56     ` Daniel Jordan
2018-11-05 16:55 ` [RFC PATCH v4 02/13] ktask: multithread CPU-intensive kernel work Daniel Jordan
2018-11-05 20:51   ` Randy Dunlap
2018-11-06  2:24     ` Daniel Jordan
2018-11-05 16:55 ` [RFC PATCH v4 03/13] ktask: add undo support Daniel Jordan
2018-11-05 16:55 ` Daniel Jordan [this message]
2018-11-05 16:55 ` [RFC PATCH v4 05/13] workqueue, ktask: renice helper threads to prevent starvation Daniel Jordan
2018-11-13 16:34   ` Tejun Heo
2018-11-19 16:45     ` Daniel Jordan
2018-11-20 16:33       ` Tejun Heo
2018-11-20 17:03         ` Daniel Jordan
2018-11-05 16:55 ` [RFC PATCH v4 06/13] vfio: parallelize vfio_pin_map_dma Daniel Jordan
2018-11-05 21:51   ` Alex Williamson
2018-11-06  2:42     ` Daniel Jordan
2018-11-05 16:55 ` [RFC PATCH v4 07/13] mm: change locked_vm's type from unsigned long to atomic_long_t Daniel Jordan
2018-11-05 16:55 ` [RFC PATCH v4 08/13] vfio: remove unnecessary mmap_sem writer acquisition around locked_vm Daniel Jordan
2018-11-05 16:55 ` [RFC PATCH v4 09/13] vfio: relieve mmap_sem reader cacheline bouncing by holding it longer Daniel Jordan
2018-11-05 16:55 ` [RFC PATCH v4 10/13] mm: enlarge type of offset argument in mem_map_offset and mem_map_next Daniel Jordan
2018-11-05 16:55 ` [RFC PATCH v4 11/13] mm: parallelize deferred struct page initialization within each node Daniel Jordan
2018-11-10  3:48   ` Elliott, Robert (Persistent Memory)
2018-11-12 16:54     ` Daniel Jordan
2018-11-12 22:15       ` Elliott, Robert (Persistent Memory)
2018-11-19 16:01         ` Daniel Jordan
2018-11-27  0:12           ` Elliott, Robert (Persistent Memory)
2018-11-27 20:23             ` Daniel Jordan
2018-11-19 16:29       ` Daniel Jordan
2018-11-05 16:55 ` [RFC PATCH v4 12/13] mm: parallelize clear_gigantic_page Daniel Jordan
2018-11-05 16:55 ` [RFC PATCH v4 13/13] hugetlbfs: parallelize hugetlbfs_fallocate with ktask Daniel Jordan
2018-11-05 17:29 ` [RFC PATCH v4 00/13] ktask: multithread CPU-intensive kernel work Michal Hocko
2018-11-06  1:29   ` Daniel Jordan
2018-11-06  9:21     ` Michal Hocko
2018-11-07 20:17       ` Daniel Jordan
2018-11-05 18:49 ` Zi Yan
2018-11-06  2:20   ` Daniel Jordan
2018-11-06  2:48     ` Zi Yan
2018-11-06 19:00       ` Daniel Jordan
2018-11-30 19:18 ` Tejun Heo
2018-12-01  0:13   ` Daniel Jordan
2018-12-03 16:16     ` Tejun Heo
