From: Uladzislau Rezki <urezki@gmail.com>
To: Mel Gorman <mgorman@suse.de>
Cc: Uladzislau Rezki <urezki@gmail.com>,
	Stephen Rothwell <sfr@canb.auug.org.au>,
	Andrew Morton <akpm@linux-foundation.org>,
	Hillf Danton <hdanton@sina.com>, Michal Hocko <mhocko@suse.com>,
	mm-commits@vger.kernel.org, Nicholas Piggin <npiggin@gmail.com>,
	Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Matthew Wilcox <willy@infradead.org>
Subject: Re: [failures] mm-vmalloc-print-a-warning-message-first-on-failure.patch removed from -mm tree
Date: Fri, 14 May 2021 16:50:26 +0200
Message-ID: <20210514145026.GA7183@pc638.lan>
In-Reply-To: <20210514134530.GP3672@suse.de>

> On Fri, May 14, 2021 at 01:45:43PM +0200, Uladzislau Rezki wrote:
> > > > Seems like "zoneref" refers to an invalid address.
> > > > 
> > > > Thoughts?
> > > 
> > > I have not previously read the patch but there are a few concerns and it's
> > > probably just as well this blew up early. The bulk allocator assumes a
> > > valid node but the patch can send in NUMA_NO_NODE (-1). 
> > >
> >
> > Should the bulk allocator handle NUMA_NO_NODE on its own? I mean, instead
> > of the caller handling it, the allocator itself would fix it up if NUMA_NO_NODE is passed.
> > 
> 
> No, for API similarity reasons. __alloc_pages_bulk is the bulk API
> equivalent of __alloc_pages() and both expect valid node IDs. vmalloc
> is using alloc_pages_node for high-order pages, which first checks
> the node ID, so your options are to check it within vmalloc.c or add an
> alloc_pages_node_bulk helper that is API-equivalent to alloc_pages_node
> as a prerequisite to your patch.
> 
OK. Thanks.
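
For what it is worth, a helper along those lines might look like the sketch
below. It is only an illustration from my side: the helper name and the exact
signature are assumptions rather than existing kernel code, although
__alloc_pages_bulk(), alloc_pages_node() and numa_mem_id() are real.

<snip>
#include <linux/gfp.h>
#include <linux/topology.h>

/*
 * Hypothetical bulk counterpart of alloc_pages_node(): fix up
 * NUMA_NO_NODE before handing the request to __alloc_pages_bulk(),
 * so that callers such as vmalloc do not need to check the node ID.
 */
static inline unsigned long
alloc_pages_node_bulk(gfp_t gfp, int nid, int nr_pages,
		      struct page **page_array)
{
	if (nid == NUMA_NO_NODE)
		nid = numa_mem_id();

	/* NULL nodemask and page_list: fill the page array only. */
	return __alloc_pages_bulk(gfp, nid, NULL, nr_pages, NULL, page_array);
}
<snip>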

> > > On the high-order path alloc_pages_node is used which checks nid == NUMA_NO_NODE.
> > > Also, area->pages is not necessarily initialised so that could be interpreted
> > > as a partially populated array, so minimally you need.
> > >
> >
> > area->pages is zeroed, because __GFP_ZERO is used when allocating the array.
> > 
> 
> Ah, yes.
> 
> > > However, the high-order path also looks suspicious. area->nr_pages is
> > > advanced before the allocation attempt so in the event alloc_pages_node()
> > > returns NULL prematurely, area->nr_pages does not reflect the number of
> > > pages allocated so that needs examination.
> > > 
> > <snip>
> >     for (area->nr_pages = 0; area->nr_pages < nr_small_pages;
> >         area->nr_pages += 1U << page_order) {
> > <snip>
> > 
> > If alloc_pages_node() fails, we break out of the loop. area->nr_pages is
> > initialized inside the for(...) statement, so it will be zero if the
> > single-page allocator fails on the first iteration.
> > 
> > Or am I missing your point?
> > 
> 
> At the time of the break, area->nr_pages += 1U << page_order happened
> before the allocation failure occurred. That looks very suspicious.
> 
The "for" loop does not work that way. If you break the loop the
"area->nr_pages += 1U << page_order" or an "increment" is not increased.
It is increased only after the body of the "for" loop executes and it
goes to next iteration.
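
A small userspace sketch (my own illustration, not kernel code) demonstrates
the same thing: when the body breaks out on a simulated failure, the
increment clause is skipped and the counter stays at zero.

<snip>
#include <stdio.h>

/* Stand-in for an allocator that fails on the very first attempt. */
static int alloc_one(void)
{
	return 0;	/* pretend a NULL page was returned */
}

int main(void)
{
	unsigned int nr_pages;
	const unsigned int page_order = 0;

	for (nr_pages = 0; nr_pages < 8; nr_pages += 1U << page_order) {
		if (!alloc_one())
			break;	/* the increment clause is never executed */
	}

	printf("nr_pages = %u\n", nr_pages);	/* prints 0 */
	return 0;
}
<snip>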

> > > As an aside, where or what is test_vmalloc.sh? It appears to have been
> > > used a few times but it's not clear it's representative so are you aware
> > > of workloads that are vmalloc-intensive? It does not matter for the
> > > patch as such but it would be nice to know examples of vmalloc-intensive
> > > workloads because I cannot recall a time during the last few years where
> > > I saw vmalloc.c high in profiles.
> > > 
> > test_vmalloc.sh is a shell script that is used for stressing and testing the
> > vmalloc subsystem, as well as for performance evaluation. You can find it here:
> > 
> > ./tools/testing/selftests/vm/test_vmalloc.sh
> > 
> 
> Thanks.
> 
> > As for workloads: mostly those that are time- and latency-critical, for
> > example audio/video, especially in the mobile area. I did a big rework of
> > the KVA allocator because I found its allocation time not optimal.
> > 
> 
> Can you give an example benchmark that triggers it or is it somewhat
> specific to mobile platforms with drivers that use vmalloc heavily?
> 
Below is an example of audio glitches. It was related to our phones
and audio workloads:

# Explanation is here
wget ftp://vps418301.ovh.net/incoming/analysis_audio_glitches.txt

# A 10-second audio sample is here.
# The drop occurs at 00:09.295; you can hear it.
wget ftp://vps418301.ovh.net/incoming/tst_440_HZ_tmp_1.wav

Apart from that, a slow allocation can cause two types of issues. The first
one is direct: for example, a high-priority RT thread does an allocation in
order to pass data to a DSP, and a long allocation latency delays the data
that is to be handed over to the DSP. This is in the drivers area.

Another example is when a task is doing an allocation and an RT task is
placed onto the same CPU. In that case a long preemption-off (milliseconds)
section can starve the RT task. On mobile devices that is the UI stack,
where RT tasks are used. As a result we see frame drops.

All such issues were solved after the rework:

wget ftp://vps418301.ovh.net/incoming/Reworking_of_KVA_allocator_in_Linux_kernel.pdf

--
Vlad Rezki

Thread overview: 17+ messages
2021-05-12 20:29 [failures] mm-vmalloc-print-a-warning-message-first-on-failure.patch removed from -mm tree akpm
2021-05-12 22:56 ` Stephen Rothwell
2021-05-13 10:31   ` Uladzislau Rezki
2021-05-13 11:11     ` Mel Gorman
2021-05-13 12:46       ` Uladzislau Rezki
2021-05-13 13:24         ` Uladzislau Rezki
2021-05-13 14:18           ` Mel Gorman
     [not found]             ` <CA+KHdyXwdkosDYk4bKtRLVodrwUJnq3NN39xuRQzKJSPTn7+bQ@mail.gmail.com>
2021-05-13 15:51               ` Mel Gorman
2021-05-13 20:18                 ` Uladzislau Rezki
2021-05-14 10:19                   ` Mel Gorman
2021-05-14 11:45                     ` Uladzislau Rezki
2021-05-14 13:45                       ` Mel Gorman
2021-05-14 14:50                         ` Uladzislau Rezki [this message]
2021-05-14 15:41                           ` Mel Gorman
2021-05-14 17:16                             ` Uladzislau Rezki
2021-05-16 17:17                               ` Mel Gorman
2021-05-16 20:31                                 ` Uladzislau Rezki
