All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@kernel.org>
To: Uladzislau Rezki <urezki@gmail.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	LKML <linux-kernel@vger.kernel.org>, RCU <rcu@vger.kernel.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Andrew Morton <akpm@linux-foundation.org>,
	Daniel Axtens <dja@axtens.net>,
	Frederic Weisbecker <frederic@kernel.org>,
	Neeraj Upadhyay <neeraju@codeaurora.org>,
	Joel Fernandes <joel@joelfernandes.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Michal Hocko <mhocko@suse.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	"Theodore Y . Ts'o" <tytso@mit.edu>,
	Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
Subject: Re: [PATCH 1/3] kvfree_rcu: Allocate a page for a single argument
Date: Thu, 21 Jan 2021 07:07:40 -0800	[thread overview]
Message-ID: <20210121150740.GO2743@paulmck-ThinkPad-P72> (raw)
In-Reply-To: <20210121133510.GB1872@pc638.lan>

On Thu, Jan 21, 2021 at 02:35:10PM +0100, Uladzislau Rezki wrote:
> On Wed, Jan 20, 2021 at 01:54:03PM -0800, Paul E. McKenney wrote:
> > On Wed, Jan 20, 2021 at 08:57:57PM +0100, Sebastian Andrzej Siewior wrote:

[ . . . ]

> > > so if bnode is NULL you could retry get_cached_bnode() since it might
> > > have been filled (given preemption or CPU migration changed something).
> > > Judging from patch #3 you think that a CPU migration is a bad thing. But
> > > why?
> > 
> > So that the later "(*krcp)->bkvhead[idx] = bnode" assignment associates
> > it with the correct CPU.
> > 
> > Though now that you mention it, couldn't the following happen?
> > 
> > o	Task A on CPU 0 notices that allocation is needed, so it
> > 	drops the lock disables migration, and sleeps while
> > 	allocating.
> > 
> > o	Task B on CPU 0 does the same.
> > 
> > o	The two tasks wake up in some order, and the second one
> > 	causes trouble at the "(*krcp)->bkvhead[idx] = bnode"
> > 	assignment.
> > 
> > Uladzislau, do we need to recheck "!(*krcp)->bkvhead[idx]" just after
> > the migrate_enable()?  Along with the KVFREE_BULK_MAX_ENTR check?
> > 
> Probably i should have mentioned your sequence you described, that two tasks
> can get a page on same CPU, i was thinking about it :) Yep, it can happen
> since we drop the lock and a context is fully preemptible, so another one
> can trigger kvfree_rcu() ending up at the same place - entering a page
> allocator.
> 
> I spent some time simulating it, but with no any luck, therefore i did not
> reflect this case in the commit message, thus did no pay much attention to
> such scenario.
> 
> >
> > Uladzislau, do we need to recheck "!(*krcp)->bkvhead[idx]" just after
> > the migrate_enable()?  Along with the KVFREE_BULK_MAX_ENTR check?
> >
> Two woken tasks will be serialized, i.e. an assignment is protected by
> the our local lock. We do krc_this_cpu_lock(flags); as a first step
> right after that we do restore a migration. A migration in that case
> can occur only when krc_this_cpu_unlock(*krcp, *flags); is invoked.
> 
> The scenario you described can happen, in that case a previous bnode
> in the drain list can be either empty or partly utilized. But, again
> i was non able to trigger such scenario.

Ah, we did discuss this previously, and yes, the result for a very
rare race is just underutilization of a page.  With the change below,
the result of this race is instead needless use of the slowpath.

> If we should fix it, i think we can go with below "alloc_in_progress"
> protection:
> 
> <snip>
> urezki@pc638:~/data/raid0/coding/linux-rcu.git$ git diff
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index cad36074366d..95485ec7267e 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -3488,12 +3488,19 @@ add_ptr_to_bulk_krc_lock(struct kfree_rcu_cpu **krcp,
>         if (!(*krcp)->bkvhead[idx] ||
>                         (*krcp)->bkvhead[idx]->nr_records == KVFREE_BULK_MAX_ENTR) {
>                 bnode = get_cached_bnode(*krcp);
> -               if (!bnode && can_alloc) {
> +               if (!bnode && can_alloc && !(*krcp)->alloc_in_progress)  {
>                         migrate_disable();
> +
> +                       /* Set it before dropping the lock. */
> +                       (*krcp)->alloc_in_progress = true;
>                         krc_this_cpu_unlock(*krcp, *flags);
> +
>                         bnode = (struct kvfree_rcu_bulk_data *)
>                                 __get_free_page(GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOMEMALLOC | __GFP_NOWARN);
>                         *krcp = krc_this_cpu_lock(flags);
> +
> +                       /* Clear it, the lock was taken back. */
> +                       (*krcp)->alloc_in_progress = false;
>                         migrate_enable();
>                 }
>  
> urezki@pc638:~/data/raid0/coding/linux-rcu.git$
> <snip>
> 
> in that case a second task will follow a fallback path bypassing a page
> request. I can send it as a separate patch if there are no any objections.

I was thinking in terms of something like the following.  Thoughts?

							Thanx, Paul

------------------------------------------------------------------------

static bool add_ptr_to_bulk_krc_no_space(struct kfree_rcu_cpu *krcp)
{
	return !(krcp)->bkvhead[idx] ||
	       (krcp)->bkvhead[idx]->nr_records == KVFREE_BULK_MAX_ENTR;
}

static inline bool
add_ptr_to_bulk_krc_lock(struct kfree_rcu_cpu **krcp,
	unsigned long *flags, void *ptr, bool can_alloc)
{
	struct kvfree_rcu_bulk_data *bnode;
	int idx;

	*krcp = krc_this_cpu_lock(flags);
	if (unlikely(!(*krcp)->initialized))
		return false;

	idx = !!is_vmalloc_addr(ptr);

	/* Check if a new block is required. */
	if (add_ptr_to_bulk_krc_no_space(*krcp)) {
		bnode = get_cached_bnode(*krcp);
		if (!bnode && can_alloc) {
			migrate_disable();
			krc_this_cpu_unlock(*krcp, *flags);
			bnode = (struct kvfree_rcu_bulk_data *)
				__get_free_page(GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOMEMALLOC | __GFP_NOWARN);
			*krcp = krc_this_cpu_lock(flags);
			migrate_enable();
		}

		if (!bnode && add_ptr_to_bulk_krc_no_space(*krcp)) {
			return false;
		} else if (bnode && add_ptr_to_bulk_krc_no_space(*krcp))
			/* Initialize the new block. */
			bnode->nr_records = 0;
			bnode->next = (*krcp)->bkvhead[idx];

			/* Attach it to the head. */
			(*krcp)->bkvhead[idx] = bnode;
		} else if (bnode) {
			// Or attempt to add it to the cache?
			free_page((unsigned long)bnode);
		}
	}

	/* Finally insert. */
	(*krcp)->bkvhead[idx]->records
		[(*krcp)->bkvhead[idx]->nr_records++] = ptr;

	return true;
}

  reply	other threads:[~2021-01-21 15:09 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-20 16:21 [PATCH 1/3] kvfree_rcu: Allocate a page for a single argument Uladzislau Rezki (Sony)
2021-01-20 16:21 ` [PATCH 2/3] kvfree_rcu: Use __GFP_NOMEMALLOC for single-argument kvfree_rcu() Uladzislau Rezki (Sony)
2021-01-28 18:06   ` Uladzislau Rezki
2021-01-20 16:21 ` [PATCH 3/3] kvfree_rcu: use migrate_disable/enable() Uladzislau Rezki (Sony)
2021-01-20 19:45   ` Sebastian Andrzej Siewior
2021-01-20 21:42     ` Paul E. McKenney
2021-01-23  9:31   ` 回复: " Zhang, Qiang
2021-01-24 21:57     ` Uladzislau Rezki
2021-01-25  1:50       ` 回复: " Zhang, Qiang
2021-01-25  2:18         ` Zhang, Qiang
2021-01-25 13:49           ` Uladzislau Rezki
2021-01-26  9:33             ` 回复: " Zhang, Qiang
2021-01-26 13:43               ` Uladzislau Rezki
2021-01-20 18:40 ` [PATCH 1/3] kvfree_rcu: Allocate a page for a single argument Paul E. McKenney
2021-01-20 19:57 ` Sebastian Andrzej Siewior
2021-01-20 21:54   ` Paul E. McKenney
2021-01-21 13:35     ` Uladzislau Rezki
2021-01-21 15:07       ` Paul E. McKenney [this message]
2021-01-21 19:17         ` Uladzislau Rezki
2021-01-22 11:17     ` Sebastian Andrzej Siewior
2021-01-22 15:28       ` Paul E. McKenney
2021-01-21 12:38   ` Uladzislau Rezki
2021-01-22 11:34     ` Sebastian Andrzej Siewior
2021-01-22 14:21       ` Uladzislau Rezki
2021-01-25 13:22 ` Michal Hocko
2021-01-25 14:31   ` Uladzislau Rezki
2021-01-25 15:39     ` Michal Hocko
2021-01-25 16:25       ` Uladzislau Rezki
2021-01-28 15:11         ` Uladzislau Rezki
2021-01-28 15:17           ` Michal Hocko
2021-01-28 15:30             ` Uladzislau Rezki
2021-01-28 18:02               ` Uladzislau Rezki
     [not found]                 ` <YBPNvbJLg56XU8co@dhcp22.suse.cz>
2021-01-29 16:35                   ` Uladzislau Rezki
2021-02-01 11:47                     ` Michal Hocko
2021-02-01 14:44                       ` Uladzislau Rezki
2021-02-03 19:37                       ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210121150740.GO2743@paulmck-ThinkPad-P72 \
    --to=paulmck@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=bigeasy@linutronix.de \
    --cc=dja@axtens.net \
    --cc=frederic@kernel.org \
    --cc=joel@joelfernandes.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhocko@suse.com \
    --cc=mpe@ellerman.id.au \
    --cc=neeraju@codeaurora.org \
    --cc=oleksiy.avramchenko@sonymobile.com \
    --cc=peterz@infradead.org \
    --cc=rcu@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=tytso@mit.edu \
    --cc=urezki@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.