All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@kernel.org>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: "Uladzislau Rezki (Sony)" <urezki@gmail.com>,
	LKML <linux-kernel@vger.kernel.org>, RCU <rcu@vger.kernel.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Andrew Morton <akpm@linux-foundation.org>,
	Daniel Axtens <dja@axtens.net>,
	Frederic Weisbecker <frederic@kernel.org>,
	Neeraj Upadhyay <neeraju@codeaurora.org>,
	Joel Fernandes <joel@joelfernandes.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Michal Hocko <mhocko@suse.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	"Theodore Y . Ts'o" <tytso@mit.edu>,
	Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
Subject: Re: [PATCH 1/3] kvfree_rcu: Allocate a page for a single argument
Date: Fri, 22 Jan 2021 07:28:55 -0800	[thread overview]
Message-ID: <20210122152855.GE2743@paulmck-ThinkPad-P72> (raw)
In-Reply-To: <20210122111733.tcwfl43akypz3x42@linutronix.de>

On Fri, Jan 22, 2021 at 12:17:33PM +0100, Sebastian Andrzej Siewior wrote:
> On 2021-01-20 13:54:03 [-0800], Paul E. McKenney wrote:
> > > > +// Record ptr in a page managed by krcp, with the pre-krc_this_cpu_lock()
> > > > +// state specified by flags.  If can_alloc is true, the caller must
> > > > +// be schedulable and not be holding any locks or mutexes that might be
> > > > +// acquired by the memory allocator or anything that it might invoke.
> > > > +// Returns true if ptr was successfully recorded, else the caller must
> > > > +// use a fallback.
> > > 
> > > The whole RCU department is getting swamped by the // comments. Can't we
> > > have proper kernel doc and /* */ style comments like the remaining part
> > > of the kernel?
> > 
> > Because // comments are easier to type and take up less horizontal space.
> 
> As for the typing I could try to sell you 
>   ab // /*
> 
> for your .vimrc and then //<enter> would become /* ;) As for the
> horizontal space, I don't have currently anything in my shop. I'm sorry.

;-)

> > Also, this kvfree_call_rcu_add_ptr_to_bulk() function is local to
> > kvfree_rcu(), and we don't normally docbook-ify such functions.
> 
> I didn't mean to promote using docbook to use every. For instance if you
> look at kernel/trace/trace.c, there are no // comments around, just /*
> style, even for things like tracing_selftest_running.
> 
> Basically I was curious if I could learn where this // is coming and if
> I could stop it.

Because they are now allowed and because they make my life easier as
noted above.  Also in-function comment blocks are either one line or two
lines shorter.  Yeah, they look strange at first, but it is not that hard
to get used to them.  After all, I did manage to get used to the /* */
comment style shortly after first encountering it.  ;-)

> > > >  static inline bool
> > > > -kvfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp, void *ptr)
> > > > +add_ptr_to_bulk_krc_lock(struct kfree_rcu_cpu **krcp,
> > > > +	unsigned long *flags, void *ptr, bool can_alloc)
> > > >  {
> > > >  	struct kvfree_rcu_bulk_data *bnode;
> > > >  	int idx;
> > > >  
> > > > -	if (unlikely(!krcp->initialized))
> > > > +	*krcp = krc_this_cpu_lock(flags);
> > > > +	if (unlikely(!(*krcp)->initialized))
> > > >  		return false;
> > > >  
> > > > -	lockdep_assert_held(&krcp->lock);
> > > >  	idx = !!is_vmalloc_addr(ptr);
> > > >  
> > > >  	/* Check if a new block is required. */
> > > > -	if (!krcp->bkvhead[idx] ||
> > > > -			krcp->bkvhead[idx]->nr_records == KVFREE_BULK_MAX_ENTR) {
> > > > -		bnode = get_cached_bnode(krcp);
> > > > -		/* Switch to emergency path. */
> > > > +	if (!(*krcp)->bkvhead[idx] ||
> > > > +			(*krcp)->bkvhead[idx]->nr_records == KVFREE_BULK_MAX_ENTR) {
> > > > +		bnode = get_cached_bnode(*krcp);
> > > > +		if (!bnode && can_alloc) {
> > > > +			krc_this_cpu_unlock(*krcp, *flags);
> > > > +			bnode = (struct kvfree_rcu_bulk_data *)
> > > 
> > > There is no need for this cast.
> > 
> > Without it, gcc version 7.5.0 says:
> > 
> > 	warning: assignment makes pointer from integer without a cast
> > 
> 
> I'm sorry. I forgot the part where __get_free_page() does not return
> (void *).
> But maybe it should given that free_pages() casts that long back to
> (void *) and __get_free_pages() -> page_address() returns (void *)
> which is then casted long.

No argument here.  Then again, I am not the one in need of convincing.

There are use cases like this from pte_alloc_one_kernel():

	unsigned long page = __get_free_page(GFP_DMA);

But a quick look indicates that they are in the minority.

> > > > +				__get_free_page(GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN);
> > > > +			*krcp = krc_this_cpu_lock(flags);
> > > 
> > > so if bnode is NULL you could retry get_cached_bnode() since it might
> > > have been filled (given preemption or CPU migration changed something).
> > > Judging from patch #3 you think that a CPU migration is a bad thing. But
> > > why?
> > 
> > So that the later "(*krcp)->bkvhead[idx] = bnode" assignment associates
> > it with the correct CPU.
> > 
> > Though now that you mention it, couldn't the following happen?
> > 
> > o	Task A on CPU 0 notices that allocation is needed, so it
> > 	drops the lock disables migration, and sleeps while
> > 	allocating.
> > 
> > o	Task B on CPU 0 does the same.
> > 
> > o	The two tasks wake up in some order, and the second one
> > 	causes trouble at the "(*krcp)->bkvhead[idx] = bnode"
> > 	assignment.
> 
> Yes it could, good point.
> I would really recommend using migrate_disable() at a minimum and only
> if it is really needed. It is more expensive than preempt_disable() and
> it isn't exactly good in terms of scheduling since the task is run able
> but restricted to a specific CPU.
> If it is unavoidable it is unavoidable but in this case I wouldn't use
> migrate_disable() but re-evaluate the situation after the allocation.

I could imagine the following alternatives:

o	Acquire the old CPU's lock despite having been migrated.
	If the above race happened, put the extra page in the
	per-CPU cache.  As Uladzislau notes, this would require
	some sort of periodic cleanup that would be good to avoid.

o	As now, which can result in an unfilled page, though only
	in an uncommon race condition.  (Uladzislau convinced me
	that this was a good approach some months ago, and the
	fact that he cannot make it happen easily does add some
	weight to his argument.)

o	Use migrate_disable().

Other ideas?

							Thanx, Paul

> > Uladzislau, do we need to recheck "!(*krcp)->bkvhead[idx]" just after
> > the migrate_enable()?  Along with the KVFREE_BULK_MAX_ENTR check?
> > 
> > 							Thanx, Paul
> 
> Sebastian

  reply	other threads:[~2021-01-22 15:31 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-20 16:21 [PATCH 1/3] kvfree_rcu: Allocate a page for a single argument Uladzislau Rezki (Sony)
2021-01-20 16:21 ` [PATCH 2/3] kvfree_rcu: Use __GFP_NOMEMALLOC for single-argument kvfree_rcu() Uladzislau Rezki (Sony)
2021-01-28 18:06   ` Uladzislau Rezki
2021-01-20 16:21 ` [PATCH 3/3] kvfree_rcu: use migrate_disable/enable() Uladzislau Rezki (Sony)
2021-01-20 19:45   ` Sebastian Andrzej Siewior
2021-01-20 21:42     ` Paul E. McKenney
2021-01-23  9:31   ` 回复: " Zhang, Qiang
2021-01-24 21:57     ` Uladzislau Rezki
2021-01-25  1:50       ` 回复: " Zhang, Qiang
2021-01-25  2:18         ` Zhang, Qiang
2021-01-25 13:49           ` Uladzislau Rezki
2021-01-26  9:33             ` 回复: " Zhang, Qiang
2021-01-26 13:43               ` Uladzislau Rezki
2021-01-20 18:40 ` [PATCH 1/3] kvfree_rcu: Allocate a page for a single argument Paul E. McKenney
2021-01-20 19:57 ` Sebastian Andrzej Siewior
2021-01-20 21:54   ` Paul E. McKenney
2021-01-21 13:35     ` Uladzislau Rezki
2021-01-21 15:07       ` Paul E. McKenney
2021-01-21 19:17         ` Uladzislau Rezki
2021-01-22 11:17     ` Sebastian Andrzej Siewior
2021-01-22 15:28       ` Paul E. McKenney [this message]
2021-01-21 12:38   ` Uladzislau Rezki
2021-01-22 11:34     ` Sebastian Andrzej Siewior
2021-01-22 14:21       ` Uladzislau Rezki
2021-01-25 13:22 ` Michal Hocko
2021-01-25 14:31   ` Uladzislau Rezki
2021-01-25 15:39     ` Michal Hocko
2021-01-25 16:25       ` Uladzislau Rezki
2021-01-28 15:11         ` Uladzislau Rezki
2021-01-28 15:17           ` Michal Hocko
2021-01-28 15:30             ` Uladzislau Rezki
2021-01-28 18:02               ` Uladzislau Rezki
     [not found]                 ` <YBPNvbJLg56XU8co@dhcp22.suse.cz>
2021-01-29 16:35                   ` Uladzislau Rezki
2021-02-01 11:47                     ` Michal Hocko
2021-02-01 14:44                       ` Uladzislau Rezki
2021-02-03 19:37                       ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210122152855.GE2743@paulmck-ThinkPad-P72 \
    --to=paulmck@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=bigeasy@linutronix.de \
    --cc=dja@axtens.net \
    --cc=frederic@kernel.org \
    --cc=joel@joelfernandes.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhocko@suse.com \
    --cc=mpe@ellerman.id.au \
    --cc=neeraju@codeaurora.org \
    --cc=oleksiy.avramchenko@sonymobile.com \
    --cc=peterz@infradead.org \
    --cc=rcu@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=tytso@mit.edu \
    --cc=urezki@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.