From: Roman Gushchin <guro@fb.com>
To: Uladzislau Rezki <urezki@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Hillf Danton <hdanton@sina.com>, Michal Hocko <mhocko@suse.com>,
	Matthew Wilcox <willy@infradead.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Thomas Garnier <thgarnie@google.com>,
	"Oleksiy Avramchenko" <oleksiy.avramchenko@sonymobile.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Joel Fernandes <joelaf@google.com>,
	Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>,
	Tejun Heo <tj@kernel.org>
Subject: Re: [PATCH v3 2/4] mm/vmap: preload a CPU with one object for split purpose
Date: Wed, 29 May 2019 16:34:40 +0000
Message-ID: <20190529163435.GC3228@tower.DHCP.thefacebook.com>
In-Reply-To: <20190529142715.pxzrjthsthqudgh2@pc636>

On Wed, May 29, 2019 at 04:27:15PM +0200, Uladzislau Rezki wrote:
> Hello, Roman!
> 
> > On Mon, May 27, 2019 at 11:38:40AM +0200, Uladzislau Rezki (Sony) wrote:
> > > Refactor the NE_FIT_TYPE split case, which requires the
> > > allocation of one extra object. We need that object in
> > > order to build the remaining space.
> > > 
> > > Introduce ne_fit_preload()/ne_fit_preload_end() functions
> > > for preloading one extra vmap_area object to ensure that
> > > we have it available when fit type is NE_FIT_TYPE.
> > > 
> > > The preload is done per CPU in non-atomic context, and thus
> > > with GFP_KERNEL allocation masks. More permissive parameters
> > > can be beneficial for systems which suffer from high memory
> > > pressure or low memory conditions.
> > > 
> > > Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> > > ---
> > >  mm/vmalloc.c | 79 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
> > >  1 file changed, 76 insertions(+), 3 deletions(-)
> > 
> > Hi Uladzislau!
> > 
> > This patch generally looks good to me (see some nits below),
> > but it would be really great to add some motivation, e.g. numbers.
> > 
> The main goal of this patch is to get rid of GFP_NOWAIT, since it is
> more restricted due to allocation from atomic context. IMHO, if we can
> avoid using it, that is the right way to go.
> 
> On the other hand, as I mentioned before, I have not seen any issues
> with it on any of my test systems during the big rework. But it could
> be beneficial for tiny systems that have no swap and are limited in
> memory size.

Ok, that makes sense to me. Is it possible to emulate such a tiny system
on kvm and measure the benefits? Again, not a strong opinion here,
but it will be easier to justify adding a good chunk of code.

> 
> > > 
> > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > > index ea1b65fac599..b553047aa05b 100644
> > > --- a/mm/vmalloc.c
> > > +++ b/mm/vmalloc.c
> > > @@ -364,6 +364,13 @@ static LIST_HEAD(free_vmap_area_list);
> > >   */
> > >  static struct rb_root free_vmap_area_root = RB_ROOT;
> > >  
> > > +/*
> > > + * Preload a CPU with one object for "no edge" split case. The
> > > + * aim is to get rid of allocations from the atomic context, thus
> > > + * to use more permissive allocation masks.
> > > + */
> > > +static DEFINE_PER_CPU(struct vmap_area *, ne_fit_preload_node);
> > > +
> > >  static __always_inline unsigned long
> > >  va_size(struct vmap_area *va)
> > >  {
> > > @@ -950,9 +957,24 @@ adjust_va_to_fit_type(struct vmap_area *va,
> > >  		 *   L V  NVA  V R
> > >  		 * |---|-------|---|
> > >  		 */
> > > -		lva = kmem_cache_alloc(vmap_area_cachep, GFP_NOWAIT);
> > > -		if (unlikely(!lva))
> > > -			return -1;
> > > +		lva = __this_cpu_xchg(ne_fit_preload_node, NULL);
> > > +		if (unlikely(!lva)) {
> > > +			/*
> > > +			 * For percpu allocator we do not do any pre-allocation
> > > +			 * and leave it as it is. The reason is it most likely
> > > +			 * never ends up with NE_FIT_TYPE splitting. In case of
> > > +			 * percpu allocations offsets and sizes are aligned to
> > > +			 * fixed align request, i.e. RE_FIT_TYPE and FL_FIT_TYPE
> > > +			 * are its main fitting cases.
> > > +			 *
> > > +			 * There are a few exceptions though, as an example it is
> > > +			 * a first allocation (early boot up) when we have "one"
> > > +			 * big free space that has to be split.
> > > +			 */
> > > +			lva = kmem_cache_alloc(vmap_area_cachep, GFP_NOWAIT);
> > > +			if (!lva)
> > > +				return -1;
> > > +		}
> > >  
> > >  		/*
> > >  		 * Build the remainder.
> > > @@ -1023,6 +1045,48 @@ __alloc_vmap_area(unsigned long size, unsigned long align,
> > >  }
> > >  
> > >  /*
> > > + * Preload this CPU with one extra vmap_area object to ensure
> > > + * that we have it available when fit type of free area is
> > > + * NE_FIT_TYPE.
> > > + *
> > > + * The preload is done in non-atomic context, thus it allows us
> > > + * to use more permissive allocation masks to be more stable under
> > > + * low memory condition and high memory pressure.
> > > + *
> > > + * If success it returns 1 with preemption disabled. In case
> > > + * of error 0 is returned with preemption not disabled. Note it
> > > + * has to be paired with ne_fit_preload_end().
> > > + */
> > > +static int
> > 
> > Cosmetic nit: you don't need a new line here.
> > 
> > > +ne_fit_preload(int nid)
> > 
> I can fix that.
> 
> > > +{
> > > +	preempt_disable();
> > > +
> > > +	if (!__this_cpu_read(ne_fit_preload_node)) {
> > > +		struct vmap_area *node;
> > > +
> > > +		preempt_enable();
> > > +		node = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, nid);
> > > +		if (node == NULL)
> > > +			return 0;
> > > +
> > > +		preempt_disable();
> > > +
> > > +		if (__this_cpu_cmpxchg(ne_fit_preload_node, NULL, node))
> > > +			kmem_cache_free(vmap_area_cachep, node);
> > > +	}
> > > +
> > > +	return 1;
> > > +}
> > > +
> > > +static void
> > 
> > Here too.
> > 
> > > +ne_fit_preload_end(int preloaded)
> > > +{
> > > +	if (preloaded)
> > > +		preempt_enable();
> > > +}
> I can fix that.
> 
> > 
> > I'd open code it. It's used only once, but hiding preempt_disable()
> > behind a helper makes it harder to understand and easier to mess up.
> > 
> > Then ne_fit_preload() might require disabled preemption (which it can
> > temporarily re-enable), so that preempt_enable()/disable() logic
> > will be in one place.
> > 
> I see your point. One of the aims was to make the alloc_vmap_area()
> function less cluttered. But we can refactor it as you suggest:
> 
> <snip>
>  static void
> @@ -1091,7 +1089,7 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
>                                 unsigned long vstart, unsigned long vend,
>                                 int node, gfp_t gfp_mask)
>  {
> -       struct vmap_area *va;
> +       struct vmap_area *va, *pva;
>         unsigned long addr;
>         int purged = 0;
>         int preloaded;
> @@ -1122,16 +1120,26 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
>          * Just proceed as it is. "overflow" path will refill
>          * the cache we allocate from.
>          */
> -       ne_fit_preload(&preloaded);
> +       preempt_disable();
> +       if (!__this_cpu_read(ne_fit_preload_node)) {
> +               preempt_enable();
> +               pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
> +               preempt_disable();
> +
> +               if (__this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) {
> +                       if (pva)
> +                               kmem_cache_free(vmap_area_cachep, pva);
> +               }
> +       }
> +
>         spin_lock(&vmap_area_lock);
> +       preempt_enable();
>  
>         /*
>          * If an allocation fails, the "vend" address is
>          * returned. Therefore trigger the overflow path.
>          */
>         addr = __alloc_vmap_area(size, align, vstart, vend);
> -       ne_fit_preload_end(preloaded);
> -
>         if (unlikely(addr == vend))
>                 goto overflow;
> <snip>
> 
> Do you mean something like that? If so, I can go with that, unless there
> are any objections from others.

Yes, it looks much better to me!

Thank you!
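
For readers following the thread, here is a minimal user-space sketch of
the pattern being discussed: fill a spare object before taking the lock,
where a blocking allocation is acceptable, then consume it under the lock
so that nothing is allocated while the lock is held. It is an illustration
under stated assumptions, not the kernel code. struct area, preload_slot,
preload_spare() and split_remainder() are hypothetical names, a
thread-local pointer stands in for the per-CPU ne_fit_preload_node, a C11
atomic_flag stands in for vmap_area_lock, and malloc() stands in for
kmem_cache_alloc_node() with GFP_KERNEL.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct vmap_area; not the kernel type. */
struct area {
	unsigned long va_start;
	unsigned long va_end;
};

/* One spare object per thread, emulating the per-CPU preload slot. */
static _Thread_local struct area *preload_slot;

/* Minimal spinlock standing in for vmap_area_lock. */
static atomic_flag area_lock = ATOMIC_FLAG_INIT;

static void area_lock_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&area_lock,
						 memory_order_acquire))
		; /* spin until the flag was previously clear */
}

static void area_lock_release(void)
{
	atomic_flag_clear_explicit(&area_lock, memory_order_release);
}

/* Fill the slot before the lock is taken, where blocking is allowed. */
static void preload_spare(void)
{
	if (!preload_slot)
		preload_slot = malloc(sizeof(*preload_slot));
}

/*
 * Build the "remainder" object for a split. The spare is consumed
 * under the lock, so no allocation happens while the lock is held.
 * Returns NULL if the preload failed; the kernel patch instead falls
 * back to a GFP_NOWAIT allocation at that point.
 */
static struct area *split_remainder(unsigned long start, unsigned long end)
{
	struct area *rem;

	assert(start < end);
	preload_spare();

	area_lock_acquire();
	rem = preload_slot;
	preload_slot = NULL;	/* plays the role of __this_cpu_xchg(..., NULL) */
	if (rem) {
		rem->va_start = start;
		rem->va_end = end;
	}
	area_lock_release();

	return rem;
}
```

A caller would invoke split_remainder(start, end) and free() the result
when done. The sketch leaves out one thing the kernel version has to
handle: a task can migrate to another CPU while preemption is enabled for
the GFP_KERNEL allocation, which is why the patch re-checks the slot with
__this_cpu_cmpxchg() and frees the extra object if the slot is already
full.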
