From: paulmck@kernel.org
To: rcu@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@fb.com,
mingo@kernel.org, jiangshanlai@gmail.com,
akpm@linux-foundation.org, mathieu.desnoyers@efficios.com,
josh@joshtriplett.org, tglx@linutronix.de, peterz@infradead.org,
rostedt@goodmis.org, dhowells@redhat.com, edumazet@google.com,
fweisbec@gmail.com, oleg@redhat.com, joel@joelfernandes.org,
mhocko@kernel.org, mgorman@techsingularity.net,
torvalds@linux-foundation.org,
"Uladzislau Rezki (Sony)" <urezki@gmail.com>,
"Paul E . McKenney" <paulmck@kernel.org>
Subject: [PATCH tip/core/rcu 14/15] rcu/tree: Allocate a page when caller is preemptible
Date: Mon, 28 Sep 2020 16:31:01 -0700 [thread overview]
Message-ID: <20200928233102.24265-14-paulmck@kernel.org> (raw)
In-Reply-To: <20200928233041.GA23230@paulmck-ThinkPad-P72>
From: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
The current memory-allocation interface poses the following challenges:
a) In kernels built with CONFIG_PROVE_RAW_LOCK_NESTING, lockdep
complains ("BUG: Invalid wait context"). This complaint is due
to the memory allocator acquiring non-raw spinlocks while a raw
spinlocks is held. This problem can also arise if kvfree_rcu()
is invoked while holding a raw spinlock.
b) In -rt kernels built with CONFIG_PREEMPT_RT, the situation
described in (a) above results in an attempt to acquire a
sleeplock while holding a spinlock, which is of course forbidden.
This can lead to "BUG: scheduling while atomic".
c) Please note that call_rcu() is invoked from raw atomic context,
so that kfree_rcu() and kvfree_rcu() are therefore also expected
to be callable from atomic raw context as well.
However given that CONFIG_PREEMPT_COUNT is unconditionally enabled
by the earlier commits in this series, the preemptible() macro now
properly detects preempt-disable code regions even in kernels built
with CONFIG_PREEMPT_NONE.
This commit therefore uses preemptible() to determine whether allocation
is possible at all for double-argument kvfree_rcu(). If !preemptible(),
then allocation is not possible, and kvfree_rcu() falls back to using
the less cache-friendly rcu_head approach. Even when preemptible(),
the caller might be involved in reclaim, so the GFP_ flags used by
double-argument kvfree_rcu() must avoid invoking reclaim processing.
Note that single-argument kvfree_rcu() must be invoked in sleepable
contexts, and that its fallback is the relatively high latency
synchronize_rcu(). Single-argument kvfree_rcu() therefore uses
GFP_KERNEL|__GFP_RETRY_MAYFAIL to allow limited sleeping within the
memory allocator.
Link: https://lore.kernel.org/lkml/20200630164543.4mdcf6zb4zfclhln@linutronix.de/
Fixes: 3042f83f19be ("rcu: Support reclaim for head-less object")
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
kernel/rcu/tree.c | 70 ++++++++++++++++---------------------------------------
1 file changed, 20 insertions(+), 50 deletions(-)
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 8ce77d9..cc998d7 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3166,7 +3166,7 @@ static void kfree_rcu_work(struct work_struct *work)
krc_this_cpu_unlock(krcp, flags);
if (bkvhead[i])
- free_page((unsigned long) bkvhead[i]);
+ kfree(bkvhead[i]);
cond_resched_tasks_rcu_qs();
}
@@ -3291,43 +3291,28 @@ static void kfree_rcu_monitor(struct work_struct *work)
}
static inline bool
-kvfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp, void *ptr)
+add_ptr_to_bulk_krc_lock(struct kfree_rcu_cpu **krcp,
+ unsigned long *flags, void *ptr, bool can_sleep)
{
struct kvfree_rcu_bulk_data *bnode;
+ bool can_alloc_page = preemptible();
+ gfp_t gfp = (can_sleep ? GFP_KERNEL | __GFP_RETRY_MAYFAIL : GFP_ATOMIC) | __GFP_NOWARN;
int idx;
- if (unlikely(!krcp->initialized))
+ *krcp = krc_this_cpu_lock(flags);
+ if (unlikely(!(*krcp)->initialized))
return false;
- lockdep_assert_held(&krcp->lock);
idx = !!is_vmalloc_addr(ptr);
/* Check if a new block is required. */
- if (!krcp->bkvhead[idx] ||
- krcp->bkvhead[idx]->nr_records == KVFREE_BULK_MAX_ENTR) {
- bnode = get_cached_bnode(krcp);
- if (!bnode) {
- /*
- * To keep this path working on raw non-preemptible
- * sections, prevent the optional entry into the
- * allocator as it uses sleeping locks. In fact, even
- * if the caller of kfree_rcu() is preemptible, this
- * path still is not, as krcp->lock is a raw spinlock.
- * With additional page pre-allocation in the works,
- * hitting this return is going to be much less likely.
- */
- if (IS_ENABLED(CONFIG_PREEMPT_RT))
- return false;
-
- /*
- * NOTE: For one argument of kvfree_rcu() we can
- * drop the lock and get the page in sleepable
- * context. That would allow to maintain an array
- * for the CONFIG_PREEMPT_RT as well if no cached
- * pages are available.
- */
- bnode = (struct kvfree_rcu_bulk_data *)
- __get_free_page(GFP_NOWAIT | __GFP_NOWARN);
+ if (!(*krcp)->bkvhead[idx] ||
+ (*krcp)->bkvhead[idx]->nr_records == KVFREE_BULK_MAX_ENTR) {
+ bnode = get_cached_bnode(*krcp);
+ if (!bnode && can_alloc_page) {
+ krc_this_cpu_unlock(*krcp, *flags);
+ bnode = kmalloc(PAGE_SIZE, gfp);
+ *krcp = krc_this_cpu_lock(flags);
}
/* Switch to emergency path. */
@@ -3336,15 +3321,15 @@ kvfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp, void *ptr)
/* Initialize the new block. */
bnode->nr_records = 0;
- bnode->next = krcp->bkvhead[idx];
+ bnode->next = (*krcp)->bkvhead[idx];
/* Attach it to the head. */
- krcp->bkvhead[idx] = bnode;
+ (*krcp)->bkvhead[idx] = bnode;
}
/* Finally insert. */
- krcp->bkvhead[idx]->records
- [krcp->bkvhead[idx]->nr_records++] = ptr;
+ (*krcp)->bkvhead[idx]->records
+ [(*krcp)->bkvhead[idx]->nr_records++] = ptr;
return true;
}
@@ -3382,24 +3367,20 @@ void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
ptr = (unsigned long *) func;
}
- krcp = krc_this_cpu_lock(&flags);
-
// Queue the object but don't yet schedule the batch.
if (debug_rcu_head_queue(ptr)) {
// Probable double kfree_rcu(), just leak.
WARN_ONCE(1, "%s(): Double-freed call. rcu_head %p\n",
__func__, head);
- // Mark as success and leave.
- success = true;
- goto unlock_return;
+ return;
}
/*
* Under high memory pressure GFP_NOWAIT can fail,
* in that case the emergency path is maintained.
*/
- success = kvfree_call_rcu_add_ptr_to_bulk(krcp, ptr);
+ success = add_ptr_to_bulk_krc_lock(&krcp, &flags, ptr, !head);
if (!success) {
if (head == NULL)
// Inline if kvfree_rcu(one_arg) call.
@@ -4394,23 +4375,12 @@ static void __init kfree_rcu_batch_init(void)
for_each_possible_cpu(cpu) {
struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu);
- struct kvfree_rcu_bulk_data *bnode;
for (i = 0; i < KFREE_N_BATCHES; i++) {
INIT_RCU_WORK(&krcp->krw_arr[i].rcu_work, kfree_rcu_work);
krcp->krw_arr[i].krcp = krcp;
}
- for (i = 0; i < rcu_min_cached_objs; i++) {
- bnode = (struct kvfree_rcu_bulk_data *)
- __get_free_page(GFP_NOWAIT | __GFP_NOWARN);
-
- if (bnode)
- put_cached_bnode(krcp, bnode);
- else
- pr_err("Failed to preallocate for %d CPU!\n", cpu);
- }
-
INIT_DELAYED_WORK(&krcp->monitor_work, kfree_rcu_monitor);
krcp->initialized = true;
}
--
2.9.5
next prev parent reply other threads:[~2020-09-28 23:31 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-09-28 23:30 [PATCH tip/core/rcu 0/15] Paul E. McKenney
2020-09-28 23:30 ` [PATCH tip/core/rcu 01/15] lib/debug: Remove pointless ARCH_NO_PREEMPT dependencies paulmck
2020-09-28 23:30 ` [PATCH tip/core/rcu 02/15] preempt: Make preempt count unconditional paulmck
2020-09-28 23:30 ` [PATCH tip/core/rcu 03/15] preempt: Cleanup PREEMPT_COUNT leftovers paulmck
2020-09-28 23:30 ` [PATCH tip/core/rcu 04/15] lockdep: " paulmck
2020-09-28 23:30 ` [PATCH tip/core/rcu 05/15] mm/pagemap: " paulmck
2020-09-28 23:30 ` [PATCH tip/core/rcu 06/15] locking/bitspinlock: " paulmck
2020-09-28 23:30 ` [PATCH tip/core/rcu 07/15] uaccess: " paulmck
2020-09-28 23:30 ` [PATCH tip/core/rcu 08/15] sched: " paulmck
2020-09-28 23:30 ` [PATCH tip/core/rcu 09/15] ARM: " paulmck
2020-09-28 23:30 ` [PATCH tip/core/rcu 10/15] xtensa: " paulmck
2020-09-28 23:30 ` [PATCH tip/core/rcu 11/15] drm/i915: " paulmck
2020-10-01 7:17 ` Joonas Lahtinen
2020-10-01 8:25 ` Thomas Gleixner
2020-10-01 16:03 ` Paul E. McKenney
2020-09-28 23:30 ` [PATCH tip/core/rcu 12/15] rcutorture: " paulmck
2020-09-28 23:31 ` [PATCH tip/core/rcu 13/15] preempt: Remove PREEMPT_COUNT from Kconfig paulmck
2020-09-28 23:31 ` paulmck [this message]
2020-09-29 12:07 ` [PATCH tip/core/rcu 14/15] rcu/tree: Allocate a page when caller is preemptible Michal Hocko
2020-09-30 1:53 ` Paul E. McKenney
2020-09-30 8:41 ` Michal Hocko
2020-09-30 12:31 ` Uladzislau Rezki
2020-09-30 23:21 ` Paul E. McKenney
2020-10-01 9:02 ` Michal Hocko
2020-10-01 16:27 ` Paul E. McKenney
2020-10-02 6:57 ` Michal Hocko
2020-10-02 14:12 ` Paul E. McKenney
2020-10-01 16:28 ` Paul E. McKenney
2020-10-01 20:03 ` Uladzislau Rezki
2020-09-28 23:31 ` [PATCH tip/core/rcu 15/15] kvfree_rcu(): Fix ifnullfree.cocci warnings paulmck
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200928233102.24265-14-paulmck@kernel.org \
--to=paulmck@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=dhowells@redhat.com \
--cc=edumazet@google.com \
--cc=fweisbec@gmail.com \
--cc=jiangshanlai@gmail.com \
--cc=joel@joelfernandes.org \
--cc=josh@joshtriplett.org \
--cc=kernel-team@fb.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mathieu.desnoyers@efficios.com \
--cc=mgorman@techsingularity.net \
--cc=mhocko@kernel.org \
--cc=mingo@kernel.org \
--cc=oleg@redhat.com \
--cc=peterz@infradead.org \
--cc=rcu@vger.kernel.org \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
--cc=urezki@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).