RCU Archive on lore.kernel.org
 help / color / Atom feed
From: Joel Fernandes <joel@joelfernandes.org>
To: Uladzislau Rezki <urezki@gmail.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	rcu@vger.kernel.org, linux-kernel@vger.kernel.org,
	kernel-team@fb.com, mingo@kernel.org, jiangshanlai@gmail.com,
	dipankar@in.ibm.com, akpm@linux-foundation.org,
	mathieu.desnoyers@efficios.com, josh@joshtriplett.org,
	tglx@linutronix.de, peterz@infradead.org, rostedt@goodmis.org,
	dhowells@redhat.com, edumazet@google.com, fweisbec@gmail.com,
	oleg@redhat.com
Subject: Re: [PATCH tip/core/rcu 03/17] rcu/tree: Skip entry into the page allocator for PREEMPT_RT
Date: Mon, 6 Jul 2020 15:42:32 -0400
Message-ID: <20200706194232.GA233429@google.com> (raw)
In-Reply-To: <20200702194506.GA31478@pc636>

On Thu, Jul 02, 2020 at 09:45:06PM +0200, Uladzislau Rezki wrote:
> On Thu, Jul 02, 2020 at 04:12:16PM +0200, Sebastian Andrzej Siewior wrote:
> > On 2020-06-30 11:35:34 [-0700], Paul E. McKenney wrote:
> > > > This is not going to work together with the "wait context validator"
> > > > (CONFIG_PROVE_RAW_LOCK_NESTING). As of -rc3 it should complain about
> > > > printk() which is why it is still disabled by default.
> > > 
> > > Fixing that should be "interesting".  In particular, RCU CPU stall
> > > warnings rely on the raw spin lock to reduce false positives due
> > > to race conditions.  Some thought will be required here.
> > 
> > I don't get this part. Can you explain/give me an example where to look
> > at?
> > 
> > > > So assume that this is fixed and enabled then on !PREEMPT_RT it will
> > > > complain that you have a raw_spinlock_t acquired (the one from patch
> > > > 02/17) and attempt to acquire a spinlock_t in the memory allocator.
> > > 
> > > Given that the slab allocator doesn't acquire any locks until it gets
> > > a fair way in, wouldn't it make sense to allow a "shallow" allocation
> > > while a raw spinlock is held?  This would require yet another GFP_ flag,
> > > but that won't make all that much of a difference in the total.  ;-)
> > 
> > That would be one way of dealing with. But we could go back to
> > spinlock_t and keep the memory allocation even for RT as is. I don't see
> > a downside of this. And we would worry about kfree_rcu() from real
> > IRQ-off region once we get to it.
> > 

Sorry for my late reply as the day job and family demanded a lot last week...

> Another way of fixing it is just dropping the lock letting the page
> allocator to do an allocation without our "upper/local" lock. I did a
> proposal like that once upon a time, so maybe it is a time to highlight
> it again:
> <snip>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 21c2fa5bd8c3..249f10a89bb9 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -3278,9 +3278,11 @@ static void kfree_rcu_monitor(struct work_struct *work)
>  }
> 
>  static inline bool
> -kvfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp, void *ptr)
> +kvfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp,
> +       void *ptr, unsigned long *flags)
>  {
>         struct kvfree_rcu_bulk_data *bnode;
> +       struct kfree_rcu_cpu *tmp;
>         int idx;
> 
>         if (unlikely(!krcp->initialized))
> @@ -3306,6 +3308,9 @@ kvfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp, void *ptr)
>                         if (IS_ENABLED(CONFIG_PREEMPT_RT))
>                                 return false;
> 
> +                       migrate_disable();
> +                       krc_this_cpu_unlock(krcp, *flags);

If I remember, the issue here is that migrate_disable is not implemented on a
non-RT kernel due to issues with starvation.

What about using the mempools. If we can allocate from that without entry
into the page allocator, then that would be one way. Not sure if there are
other issues such as the mempool allocation itself acquiring regular spinlocks.

We could also just do what Sebastian is suggesting, which is revert to
regular spinlocks and allow this code to sleep while worrying about
atomic callers later. For RT, could such atomic callers perhaps do their
allocations differently such as allocating in advance or later on in process
context?

thanks,

 - Joel


> +
>                         /*
>                          * NOTE: For one argument of kvfree_rcu() we can
>                          * drop the lock and get the page in sleepable
> @@ -3315,6 +3320,12 @@ kvfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp, void *ptr)
>                          */
>                         bnode = (struct kvfree_rcu_bulk_data *)
>                                 __get_free_page(GFP_NOWAIT | __GFP_NOWARN);
> +
> +                       tmp = krc_this_cpu_lock(flags);
> +                       migrate_enable();
> +
> +                       /* Sanity check, just in case. */
> +                       WARN_ON(tmp != krcp);
>                 }
>  
>                 /* Switch to emergency path. */
> @@ -3386,7 +3397,7 @@ void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
>          * Under high memory pressure GFP_NOWAIT can fail,
>          * in that case the emergency path is maintained.
>          */
> -       success = kvfree_call_rcu_add_ptr_to_bulk(krcp, ptr);
> +       success = kvfree_call_rcu_add_ptr_to_bulk(krcp, ptr, &flags);
>         if (!success) {
>                 if (head == NULL)
>                         // Inline if kvfree_rcu(one_arg) call.
> <snip>
> 
> --
> Vlad Rezki

  reply index

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-24 20:12 [PATCH tip/core/rcu 0/17] kfree_rcu updates for v5.9 Paul E. McKenney
2020-06-24 20:12 ` [PATCH tip/core/rcu 01/17] rcu: Fix a kernel-doc warnings for "count" paulmck
2020-06-24 20:12 ` [PATCH tip/core/rcu 02/17] rcu/tree: Keep kfree_rcu() awake during lock contention paulmck
2020-06-24 20:12 ` [PATCH tip/core/rcu 03/17] rcu/tree: Skip entry into the page allocator for PREEMPT_RT paulmck
2020-06-30 16:45   ` Sebastian Andrzej Siewior
2020-06-30 18:35     ` Paul E. McKenney
2020-07-02 14:12       ` Sebastian Andrzej Siewior
2020-07-02 16:48         ` Paul E. McKenney
2020-07-02 20:19           ` Sebastian Andrzej Siewior
2020-07-06 21:06             ` Paul E. McKenney
2020-07-07 17:34               ` Uladzislau Rezki
2020-07-07 18:45                 ` Joel Fernandes
2020-07-08 18:48                   ` Uladzislau Rezki
2020-07-02 19:45         ` Uladzislau Rezki
2020-07-06 19:42           ` Joel Fernandes [this message]
2020-07-06 19:55             ` Uladzislau Rezki
2020-07-06 20:29               ` Joel Fernandes
2020-07-07  9:27                 ` Sebastian Andrzej Siewior
2020-07-15 13:38     ` Uladzislau Rezki
2020-07-15 14:16       ` Sebastian Andrzej Siewior
2020-07-15 14:20         ` Uladzislau Rezki
2020-06-24 20:12 ` [PATCH tip/core/rcu 04/17] rcu/tree: Repeat the monitor if any free channel is busy paulmck
2020-06-24 20:12 ` [PATCH tip/core/rcu 05/17] rcu/tree: Make debug_objects logic independent of rcu_head paulmck
2020-06-24 20:12 ` [PATCH tip/core/rcu 06/17] rcu/tree: Simplify KFREE_BULK_MAX_ENTR macro paulmck
2020-06-24 20:12 ` [PATCH tip/core/rcu 07/17] rcu/tree: Move kfree_rcu_cpu locking/unlocking to separate functions paulmck
2020-06-24 20:12 ` [PATCH tip/core/rcu 08/17] rcu/tree: Use static initializer for krc.lock paulmck
2020-06-24 20:12 ` [PATCH tip/core/rcu 09/17] rcu/tree: cache specified number of objects paulmck
2020-06-24 20:12 ` [PATCH tip/core/rcu 10/17] rcu/tree: Maintain separate array for vmalloc ptrs paulmck
2020-06-24 20:12 ` [PATCH tip/core/rcu 11/17] rcu/tiny: support vmalloc in tiny-RCU paulmck
2020-06-24 20:12 ` [PATCH tip/core/rcu 12/17] rcu: Rename *_kfree_callback/*_kfree_rcu_offset/kfree_call_* paulmck
2020-06-24 20:12 ` [PATCH tip/core/rcu 13/17] mm/list_lru.c: Rename kvfree_rcu() to local variant paulmck
2020-06-24 20:12 ` [PATCH tip/core/rcu 14/17] rcu: Introduce 2 arg kvfree_rcu() interface paulmck
2020-06-24 20:12 ` [PATCH tip/core/rcu 15/17] rcu: Support reclaim for head-less object paulmck
2020-06-24 20:12 ` [PATCH tip/core/rcu 16/17] rcu: Introduce single argument kvfree_rcu() interface paulmck
2020-06-24 20:12 ` [PATCH tip/core/rcu 17/17] lib/test_vmalloc.c: Add test cases for kvfree_rcu() paulmck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200706194232.GA233429@google.com \
    --to=joel@joelfernandes.org \
    --cc=akpm@linux-foundation.org \
    --cc=bigeasy@linutronix.de \
    --cc=dhowells@redhat.com \
    --cc=dipankar@in.ibm.com \
    --cc=edumazet@google.com \
    --cc=fweisbec@gmail.com \
    --cc=jiangshanlai@gmail.com \
    --cc=josh@joshtriplett.org \
    --cc=kernel-team@fb.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mingo@kernel.org \
    --cc=oleg@redhat.com \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rcu@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=urezki@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

RCU Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/rcu/0 rcu/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 rcu rcu/ https://lore.kernel.org/rcu \
		rcu@vger.kernel.org
	public-inbox-index rcu

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.rcu


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git