linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@suse.com>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: rcu@vger.kernel.org, linux-kernel@vger.kernel.org,
	kernel-team@fb.com, mingo@kernel.org, jiangshanlai@gmail.com,
	akpm@linux-foundation.org, mathieu.desnoyers@efficios.com,
	josh@joshtriplett.org, tglx@linutronix.de, peterz@infradead.org,
	rostedt@goodmis.org, dhowells@redhat.com, edumazet@google.com,
	fweisbec@gmail.com, oleg@redhat.com, joel@joelfernandes.org,
	mgorman@techsingularity.net, torvalds@linux-foundation.org,
	"Uladzislau Rezki (Sony)" <urezki@gmail.com>
Subject: Re: [PATCH tip/core/rcu 14/15] rcu/tree: Allocate a page when caller is preemptible
Date: Thu, 1 Oct 2020 11:02:20 +0200	[thread overview]
Message-ID: <20201001090220.GA22560@dhcp22.suse.cz> (raw)
In-Reply-To: <20200930232154.GA29330@paulmck-ThinkPad-P72>

On Wed 30-09-20 16:21:54, Paul E. McKenney wrote:
> On Wed, Sep 30, 2020 at 10:41:39AM +0200, Michal Hocko wrote:
> > On Tue 29-09-20 18:53:27, Paul E. McKenney wrote:
[...]
> > > No argument on it being confusing, and I hope that the added header
> > > comment helps.  But specifically, can_sleep==true is a promise by the
> > > caller to be schedulable and not to be holding any lock/mutex/whatever
> > > that might possibly be acquired by the memory allocator or by anything
> > > else that the memory allocator might invoke, to your point, including
> > > for but one example the reclaim logic.
> > > 
> > > The only way that can_sleep==true is if this function was invoked due
> > > to a call to single-argument kvfree_rcu(), which must be schedulable
> > > because its fallback is to invoke synchronize_rcu().
> > 
> > OK. I have to say that it is still not clear to me whether this call
> > path can be called from the memory reclaim context. If yes then you need
> > __GFP_NOMEMALLOC as well.
> 
> Right now the restriction is that single-argument (AKA can_sleep==true)
> kvfree_rcu() cannot be invoked from memory reclaim context.
> 
> But would adding __GFP_NOMEMALLOC to the can_sleep==true GFP_ flags
> allow us to remove this restriction?  If so, I will queue a separate
> patch making this change.  The improved ease of use would be well
> worth it, if I understand correctly (ha!!!).

It would be quite daring to claim it will be ok but it will certainly be
less problematic. Adding the flag will not hurt in any case. As this is
a shared called that might be called from many contexts I think it will
be safer to have it there. The justification is that it will prevent
consumption of memory reserves from MEMALLOC contexts.

> 
> > [...]
> > 
> > > > What is the point of calling kmalloc  for a PAGE_SIZE object? Wouldn't
> > > > using the page allocator directly be better?
> > > 
> > > Well, you guys gave me considerable heat about abusing internal allocator
> > > interfaces, and kmalloc() and kfree() seem to be about as non-internal
> > > as you can get and still be invoking the allocator.  ;-)
> > 
> > alloc_pages resp. __get_free_pages is a normal page allocator interface
> > to use for page size granular allocations. kmalloc is for more fine
> > grained allocations.
> 
> OK, in the short term, both work, but I have queued a separate patch
> making this change and recording the tradeoffs.  This is not yet a
> promise to push this patch, but it is a promise not to lose this part
> of the picture.  Please see below.

It doesn't matter all that much. Both allocators will work. It is just a
matter of using optimal tool for the specific purose.

> You mentioned alloc_pages().  I reverted to __get_free_pages(), but
> alloc_pages() of course looks nicer.  What are the tradeoffs between
> __get_free_pages() and alloc_pages()?

alloc_pages will return struct page but you need a kernel pointer. That
is what __get_free_pages will give you (or you can call page_address
directly).

> 							Thanx, Paul
> 
> ------------------------------------------------------------------------
> 
> commit 490b638d7c241ac06cee168ccf8688bb8b872478
> Author: Paul E. McKenney <paulmck@kernel.org>
> Date:   Wed Sep 30 16:16:39 2020 -0700
> 
>     kvfree_rcu(): Switch from kmalloc/kfree to __get_free_page/free_page.
>     
>     The advantages of using kmalloc() and kfree() are a possible small speedup
>     on CONFIG_SLAB=y systems, avoiding the allocation-side cast, and use of
>     more-familiar API members.  The advantages of using __get_free_page()
>     and free_page() are a possible reduction in fragmentation and direct
>     access to the buddy allocator.
>     
>     To help settle the question as to which to use, this commit switches
>     from kmalloc() and kfree() to __get_free_page() and free_page().
>     
>     Suggested-by: Michal Hocko <mhocko@suse.com>
>     Suggested-by: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
>     Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

Yes, looks good to me. I am not entirely sure about the fragmentation
argument. It really depends on the SL.B allocator internals. The same
applies for the potential speed up. I would be even surprised if the
SLAB was faster in average considering it has to use the page allocator
as well. So to me the primary motivation would be "use the right tool
for the purpose".

> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 2886e81..242f0f0 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -3225,7 +3225,8 @@ static void kfree_rcu_work(struct work_struct *work)
>  				bkvhead[i] = NULL;
>  			krc_this_cpu_unlock(krcp, flags);
>  
> -			kfree(bkvhead[i]);
> +			if (bkvhead[i])
> +				free_page((unsigned long)bkvhead[i]);
>  
>  			cond_resched_tasks_rcu_qs();
>  		}
> @@ -3378,7 +3379,7 @@ add_ptr_to_bulk_krc_lock(struct kfree_rcu_cpu **krcp,
>  		bnode = get_cached_bnode(*krcp);
>  		if (!bnode && can_alloc_page) {
>  			krc_this_cpu_unlock(*krcp, *flags);
> -			bnode = kmalloc(PAGE_SIZE, gfp);
> +			bnode = (struct kvfree_rcu_bulk_data *)__get_free_page(gfp);
>  			*krcp = krc_this_cpu_lock(flags);
>  		}
>  

-- 
Michal Hocko
SUSE Labs

  reply	other threads:[~2020-10-01  9:02 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-28 23:30 [PATCH tip/core/rcu 0/15] Paul E. McKenney
2020-09-28 23:30 ` [PATCH tip/core/rcu 01/15] lib/debug: Remove pointless ARCH_NO_PREEMPT dependencies paulmck
2020-09-28 23:30 ` [PATCH tip/core/rcu 02/15] preempt: Make preempt count unconditional paulmck
2020-09-28 23:30 ` [PATCH tip/core/rcu 03/15] preempt: Cleanup PREEMPT_COUNT leftovers paulmck
2020-09-28 23:30 ` [PATCH tip/core/rcu 04/15] lockdep: " paulmck
2020-09-28 23:30 ` [PATCH tip/core/rcu 05/15] mm/pagemap: " paulmck
2020-09-28 23:30 ` [PATCH tip/core/rcu 06/15] locking/bitspinlock: " paulmck
2020-09-28 23:30 ` [PATCH tip/core/rcu 07/15] uaccess: " paulmck
2020-09-28 23:30 ` [PATCH tip/core/rcu 08/15] sched: " paulmck
2020-09-28 23:30 ` [PATCH tip/core/rcu 09/15] ARM: " paulmck
2020-09-28 23:30 ` [PATCH tip/core/rcu 10/15] xtensa: " paulmck
2020-09-28 23:30 ` [PATCH tip/core/rcu 11/15] drm/i915: " paulmck
2020-10-01  7:17   ` Joonas Lahtinen
2020-10-01  8:25     ` Thomas Gleixner
2020-10-01 16:03       ` Paul E. McKenney
2020-09-28 23:30 ` [PATCH tip/core/rcu 12/15] rcutorture: " paulmck
2020-09-28 23:31 ` [PATCH tip/core/rcu 13/15] preempt: Remove PREEMPT_COUNT from Kconfig paulmck
2020-09-28 23:31 ` [PATCH tip/core/rcu 14/15] rcu/tree: Allocate a page when caller is preemptible paulmck
2020-09-29 12:07   ` Michal Hocko
2020-09-30  1:53     ` Paul E. McKenney
2020-09-30  8:41       ` Michal Hocko
2020-09-30 12:31         ` Uladzislau Rezki
2020-09-30 23:21         ` Paul E. McKenney
2020-10-01  9:02           ` Michal Hocko [this message]
2020-10-01 16:27             ` Paul E. McKenney
2020-10-02  6:57               ` Michal Hocko
2020-10-02 14:12                 ` Paul E. McKenney
2020-10-01 16:28             ` Paul E. McKenney
2020-10-01 20:03             ` Uladzislau Rezki
2020-09-28 23:31 ` [PATCH tip/core/rcu 15/15] kvfree_rcu(): Fix ifnullfree.cocci warnings paulmck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201001090220.GA22560@dhcp22.suse.cz \
    --to=mhocko@suse.com \
    --cc=akpm@linux-foundation.org \
    --cc=dhowells@redhat.com \
    --cc=edumazet@google.com \
    --cc=fweisbec@gmail.com \
    --cc=jiangshanlai@gmail.com \
    --cc=joel@joelfernandes.org \
    --cc=josh@joshtriplett.org \
    --cc=kernel-team@fb.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mgorman@techsingularity.net \
    --cc=mingo@kernel.org \
    --cc=oleg@redhat.com \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rcu@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=urezki@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).