From: Andrew Morton <akpm@linux-foundation.org>
To: alex.williamson@redhat.com, alexander.h.duyck@linux.intel.com,
corbet@lwn.net, dan.j.williams@intel.com,
daniel.m.jordan@oracle.com, dave.hansen@linux.intel.com,
david@redhat.com, elliott@hpe.com, herbert@gondor.apana.org.au,
jgg@ziepe.ca, josh@joshtriplett.org, ktkhai@virtuozzo.com,
mhocko@kernel.org, mm-commits@vger.kernel.org,
pasha.tatashin@soleen.com, pavel@ucw.cz, peterz@infradead.org,
rdunlap@infradead.org, shile.zhang@linux.alibaba.com,
steffen.klassert@secunet.com, steven.sistare@oracle.com,
tj@kernel.org, ziy@nvidia.com
Subject: + padata-allocate-work-structures-for-parallel-jobs-from-a-pool.patch added to -mm tree
Date: Wed, 27 May 2020 14:48:57 -0700
Message-ID: <20200527214857.chb9oi4X2%akpm@linux-foundation.org>
In-Reply-To: <20200522222217.ee14ad7eda7aab1e6697da6c@linux-foundation.org>
The patch titled
Subject: padata: allocate work structures for parallel jobs from a pool
has been added to the -mm tree. Its filename is
padata-allocate-work-structures-for-parallel-jobs-from-a-pool.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/padata-allocate-work-structures-for-parallel-jobs-from-a-pool.patch
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/padata-allocate-work-structures-for-parallel-jobs-from-a-pool.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Daniel Jordan <daniel.m.jordan@oracle.com>
Subject: padata: allocate work structures for parallel jobs from a pool
padata allocates per-CPU, per-instance work structs for parallel jobs. A
do_parallel call assigns a job to a sequence number and hashes the number
to a CPU, where the job will eventually run using the corresponding work.
This approach fit with how padata used to bind a job to each CPU
round-robin, but makes less sense after commit bfde23ce200e6 ("padata:
unbind parallel jobs from specific CPUs") because a work isn't bound to a
particular CPU anymore, and isn't needed at all for multithreaded jobs
because they don't have sequence numbers.
Replace the per-CPU works with a preallocated pool, which allows sharing
them between existing padata users and the upcoming multithreaded user.
The pool will also facilitate setting NUMA-aware concurrency limits with
later users.
The pool is sized according to the number of possible CPUs. With this
limit, MAX_OBJ_NUM no longer makes sense, so remove it.
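
As a rough illustration of the preallocated pool described above, here is a
minimal userspace sketch in C. It is not the kernel code: the names
work_item, work_pool, work_pool_init, work_pool_alloc and work_pool_free
are hypothetical, a singly linked free list stands in for the kernel's
list_head, and locking is omitted (the patch serializes pool access with
padata_works_lock).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct padata_work. */
struct work_item {
	struct work_item *next;	/* free-list linkage */
	void *data;		/* job payload */
};

struct work_pool {
	struct work_item *items;	/* backing array, allocated once */
	struct work_item *free;		/* head of the free list */
};

/* Preallocate nr items and thread them all onto the free list. */
static int work_pool_init(struct work_pool *pool, size_t nr)
{
	size_t i;

	pool->free = NULL;
	pool->items = calloc(nr, sizeof(*pool->items));
	if (!pool->items)
		return -1;

	for (i = 0; i < nr; i++) {
		pool->items[i].next = pool->free;
		pool->free = &pool->items[i];
	}
	return 0;
}

/* Pop an item, or return NULL when the pool is exhausted. */
static struct work_item *work_pool_alloc(struct work_pool *pool)
{
	struct work_item *item = pool->free;

	if (item)
		pool->free = item->next;
	return item;
}

/* Push an item back so a later caller can reuse it. */
static void work_pool_free(struct work_pool *pool, struct work_item *item)
{
	assert(item != NULL);
	item->next = pool->free;
	pool->free = item;
}
```

In this sketch the pool size is whatever the caller passes to
work_pool_init(); the patch itself sizes the pool with
num_possible_cpus().
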
If the global pool is exhausted, a parallel job is run in the current task
instead to throttle a system trying to do too much in parallel.
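
The throttling behavior can be sketched in userspace C as follows. This is
only an illustration under stated assumptions: dispatcher, dispatch,
run_pending, count_job and MAX_SLOTS are hypothetical names, the pool is
reduced to a counter of free slots, and run_pending() stands in for the
workqueue threads that would normally run deferred jobs.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SLOTS 8	/* hypothetical upper bound for this sketch */

typedef void (*job_fn)(void *arg);

struct pending {
	job_fn fn;
	void *arg;
};

struct dispatcher {
	unsigned int free_slots;		/* work items still available */
	unsigned int nr_pending;		/* jobs deferred so far */
	struct pending queue[MAX_SLOTS];
};

static void dispatcher_init(struct dispatcher *d, unsigned int nr_slots)
{
	assert(nr_slots <= MAX_SLOTS);
	d->free_slots = nr_slots;
	d->nr_pending = 0;
}

/*
 * Defer the job if a slot is free and return true. Otherwise run it
 * in the caller and return false, which slows down the submitter.
 */
static bool dispatch(struct dispatcher *d, job_fn fn, void *arg)
{
	if (d->free_slots == 0) {
		fn(arg);
		return false;
	}

	d->free_slots--;
	d->queue[d->nr_pending].fn = fn;
	d->queue[d->nr_pending].arg = arg;
	d->nr_pending++;
	return true;
}

/* Stand-in for worker threads: run deferred jobs and release their slots. */
static void run_pending(struct dispatcher *d)
{
	unsigned int i;

	for (i = 0; i < d->nr_pending; i++) {
		d->queue[i].fn(d->queue[i].arg);
		d->free_slots++;
	}
	d->nr_pending = 0;
}

/* Example job: bump the counter passed in as the argument. */
static void count_job(void *arg)
{
	(*(int *)arg)++;
}
```

With a single slot, a first dispatch() of count_job is deferred and a second
one runs immediately in the caller; run_pending() then runs the deferred job
and makes the slot available again.
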
Link: http://lkml.kernel.org/r/20200527173608.2885243-4-daniel.m.jordan@oracle.com
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Tested-by: Josh Triplett <josh@joshtriplett.org>
Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Robert Elliott <elliott@hpe.com>
Cc: Shile Zhang <shile.zhang@linux.alibaba.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
include/linux/padata.h | 8 --
kernel/padata.c | 118 +++++++++++++++++++++++++--------------
2 files changed, 78 insertions(+), 48 deletions(-)
--- a/include/linux/padata.h~padata-allocate-work-structures-for-parallel-jobs-from-a-pool
+++ a/include/linux/padata.h
@@ -24,7 +24,6 @@
* @list: List entry, to attach to the padata lists.
* @pd: Pointer to the internal control structure.
* @cb_cpu: Callback cpu for serializatioon.
- * @cpu: Cpu for parallelization.
* @seq_nr: Sequence number of the parallelized data object.
* @info: Used to pass information from the parallel to the serial function.
* @parallel: Parallel execution function.
@@ -34,7 +33,6 @@ struct padata_priv {
struct list_head list;
struct parallel_data *pd;
int cb_cpu;
- int cpu;
unsigned int seq_nr;
int info;
void (*parallel)(struct padata_priv *padata);
@@ -68,15 +66,11 @@ struct padata_serial_queue {
/**
* struct padata_parallel_queue - The percpu padata parallel queue
*
- * @parallel: List to wait for parallelization.
* @reorder: List to wait for reordering after parallel processing.
- * @work: work struct for parallelization.
* @num_obj: Number of objects that are processed by this cpu.
*/
struct padata_parallel_queue {
- struct padata_list parallel;
struct padata_list reorder;
- struct work_struct work;
atomic_t num_obj;
};
@@ -111,7 +105,7 @@ struct parallel_data {
struct padata_parallel_queue __percpu *pqueue;
struct padata_serial_queue __percpu *squeue;
atomic_t refcnt;
- atomic_t seq_nr;
+ unsigned int seq_nr;
unsigned int processed;
int cpu;
struct padata_cpumask cpumask;
--- a/kernel/padata.c~padata-allocate-work-structures-for-parallel-jobs-from-a-pool
+++ a/kernel/padata.c
@@ -32,7 +32,15 @@
#include <linux/sysfs.h>
#include <linux/rcupdate.h>
-#define MAX_OBJ_NUM 1000
+struct padata_work {
+ struct work_struct pw_work;
+ struct list_head pw_list; /* padata_free_works linkage */
+ void *pw_data;
+};
+
+static DEFINE_SPINLOCK(padata_works_lock);
+static struct padata_work *padata_works;
+static LIST_HEAD(padata_free_works);
static void padata_free_pd(struct parallel_data *pd);
@@ -58,30 +66,44 @@ static int padata_cpu_hash(struct parall
return padata_index_to_cpu(pd, cpu_index);
}
-static void padata_parallel_worker(struct work_struct *parallel_work)
+static struct padata_work *padata_work_alloc(void)
{
- struct padata_parallel_queue *pqueue;
- LIST_HEAD(local_list);
+ struct padata_work *pw;
- local_bh_disable();
- pqueue = container_of(parallel_work,
- struct padata_parallel_queue, work);
+ lockdep_assert_held(&padata_works_lock);
- spin_lock(&pqueue->parallel.lock);
- list_replace_init(&pqueue->parallel.list, &local_list);
- spin_unlock(&pqueue->parallel.lock);
+ if (list_empty(&padata_free_works))
+ return NULL; /* No more work items allowed to be queued. */
- while (!list_empty(&local_list)) {
- struct padata_priv *padata;
+ pw = list_first_entry(&padata_free_works, struct padata_work, pw_list);
+ list_del(&pw->pw_list);
+ return pw;
+}
- padata = list_entry(local_list.next,
- struct padata_priv, list);
+static void padata_work_init(struct padata_work *pw, work_func_t work_fn,
+ void *data)
+{
+ INIT_WORK(&pw->pw_work, work_fn);
+ pw->pw_data = data;
+}
- list_del_init(&padata->list);
+static void padata_work_free(struct padata_work *pw)
+{
+ lockdep_assert_held(&padata_works_lock);
+ list_add(&pw->pw_list, &padata_free_works);
+}
- padata->parallel(padata);
- }
+static void padata_parallel_worker(struct work_struct *parallel_work)
+{
+ struct padata_work *pw = container_of(parallel_work, struct padata_work,
+ pw_work);
+ struct padata_priv *padata = pw->pw_data;
+ local_bh_disable();
+ padata->parallel(padata);
+ spin_lock(&padata_works_lock);
+ padata_work_free(pw);
+ spin_unlock(&padata_works_lock);
local_bh_enable();
}
@@ -105,9 +127,9 @@ int padata_do_parallel(struct padata_she
struct padata_priv *padata, int *cb_cpu)
{
struct padata_instance *pinst = ps->pinst;
- int i, cpu, cpu_index, target_cpu, err;
- struct padata_parallel_queue *queue;
+ int i, cpu, cpu_index, err;
struct parallel_data *pd;
+ struct padata_work *pw;
rcu_read_lock_bh();
@@ -135,25 +157,25 @@ int padata_do_parallel(struct padata_she
if ((pinst->flags & PADATA_RESET))
goto out;
- if (atomic_read(&pd->refcnt) >= MAX_OBJ_NUM)
- goto out;
-
- err = 0;
atomic_inc(&pd->refcnt);
padata->pd = pd;
padata->cb_cpu = *cb_cpu;
- padata->seq_nr = atomic_inc_return(&pd->seq_nr);
- target_cpu = padata_cpu_hash(pd, padata->seq_nr);
- padata->cpu = target_cpu;
- queue = per_cpu_ptr(pd->pqueue, target_cpu);
-
- spin_lock(&queue->parallel.lock);
- list_add_tail(&padata->list, &queue->parallel.list);
- spin_unlock(&queue->parallel.lock);
+ rcu_read_unlock_bh();
- queue_work(pinst->parallel_wq, &queue->work);
+ spin_lock(&padata_works_lock);
+ padata->seq_nr = ++pd->seq_nr;
+ pw = padata_work_alloc();
+ spin_unlock(&padata_works_lock);
+ if (pw) {
+ padata_work_init(pw, padata_parallel_worker, padata);
+ queue_work(pinst->parallel_wq, &pw->pw_work);
+ } else {
+ /* Maximum works limit exceeded, run in the current task. */
+ padata->parallel(padata);
+ }
+ return 0;
out:
rcu_read_unlock_bh();
@@ -324,8 +346,9 @@ static void padata_serial_worker(struct
void padata_do_serial(struct padata_priv *padata)
{
struct parallel_data *pd = padata->pd;
+ int hashed_cpu = padata_cpu_hash(pd, padata->seq_nr);
struct padata_parallel_queue *pqueue = per_cpu_ptr(pd->pqueue,
- padata->cpu);
+ hashed_cpu);
struct padata_priv *cur;
spin_lock(&pqueue->reorder.lock);
@@ -416,8 +439,6 @@ static void padata_init_pqueues(struct p
pqueue = per_cpu_ptr(pd->pqueue, cpu);
__padata_list_init(&pqueue->reorder);
- __padata_list_init(&pqueue->parallel);
- INIT_WORK(&pqueue->work, padata_parallel_worker);
atomic_set(&pqueue->num_obj, 0);
}
}
@@ -451,7 +472,7 @@ static struct parallel_data *padata_allo
padata_init_pqueues(pd);
padata_init_squeues(pd);
- atomic_set(&pd->seq_nr, -1);
+ pd->seq_nr = -1;
atomic_set(&pd->refcnt, 1);
spin_lock_init(&pd->lock);
pd->cpu = cpumask_first(pd->cpumask.pcpu);
@@ -1051,6 +1072,7 @@ EXPORT_SYMBOL(padata_free_shell);
void __init padata_init(void)
{
+ unsigned int i, possible_cpus;
#ifdef CONFIG_HOTPLUG_CPU
int ret;
@@ -1062,13 +1084,27 @@ void __init padata_init(void)
ret = cpuhp_setup_state_multi(CPUHP_PADATA_DEAD, "padata:dead",
NULL, padata_cpu_dead);
- if (ret < 0) {
- cpuhp_remove_multi_state(hp_online);
- goto err;
- }
+ if (ret < 0)
+ goto remove_online_state;
+#endif
+
+ possible_cpus = num_possible_cpus();
+ padata_works = kmalloc_array(possible_cpus, sizeof(struct padata_work),
+ GFP_KERNEL);
+ if (!padata_works)
+ goto remove_dead_state;
+
+ for (i = 0; i < possible_cpus; ++i)
+ list_add(&padata_works[i].pw_list, &padata_free_works);
return;
+
+remove_dead_state:
+#ifdef CONFIG_HOTPLUG_CPU
+ cpuhp_remove_multi_state(CPUHP_PADATA_DEAD);
+remove_online_state:
+ cpuhp_remove_multi_state(hp_online);
err:
- pr_warn("padata: initialization failed\n");
#endif
+ pr_warn("padata: initialization failed\n");
}
_
Patches currently in -mm which might be from daniel.m.jordan@oracle.com are
mm-call-touch_nmi_watchdog-on-max-order-boundaries-in-deferred-init.patch
padata-remove-exit-routine.patch
padata-initialize-earlier.patch
padata-allocate-work-structures-for-parallel-jobs-from-a-pool.patch
padata-add-basic-support-for-multithreaded-jobs.patch
mm-dont-track-number-of-pages-during-deferred-initialization.patch
mm-parallelize-deferred_init_memmap.patch
mm-make-deferred-inits-max-threads-arch-specific.patch
padata-document-multithreaded-jobs.patch