All of lore.kernel.org
 help / color / mirror / Atom feed
From: Minchan Kim <minchan@kernel.org>
To: Michal Hocko <mhocko@suse.com>
Cc: David Hildenbrand <david@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	hyesoo.yu@samsung.com, willy@infradead.org,
	iamjoonsoo.kim@lge.com, vbabka@suse.cz, surenb@google.com,
	pullip.cho@samsung.com, joaodias@google.com, hridya@google.com,
	sumit.semwal@linaro.org, john.stultz@linaro.org,
	Brian.Starkey@arm.com, linux-media@vger.kernel.org,
	devicetree@vger.kernel.org, robh@kernel.org,
	christian.koenig@amd.com, linaro-mm-sig@lists.linaro.org
Subject: Re: [PATCH v2 2/4] mm: introduce cma_alloc_bulk API
Date: Wed, 2 Dec 2020 08:15:49 -0800	[thread overview]
Message-ID: <X8e9tSwcsrEsAv1O@google.com> (raw)
In-Reply-To: <20201202154915.GU17338@dhcp22.suse.cz>

On Wed, Dec 02, 2020 at 04:49:15PM +0100, Michal Hocko wrote:
> On Wed 02-12-20 10:14:41, David Hildenbrand wrote:
> > On 01.12.20 18:51, Minchan Kim wrote:
> > > There is a need for special HW to require bulk allocation of
> > > high-order pages. For example, 4800 * order-4 pages, which
> > > would be minimum, sometimes, it requires more.
> > > 
> > > To meet the requirement, a option reserves 300M CMA area and
> > > requests the whole 300M contiguous memory. However, it doesn't
> > > work if even one of those pages in the range is long-term pinned
> > > directly or indirectly. The other option is to ask higher-order
> > 
> > My latest knowledge is that pages in the CMA area are never long term
> > pinned.
> > 
> > https://lore.kernel.org/lkml/20201123090129.GD27488@dhcp22.suse.cz/
> > 
> > "gup already tries to deal with long term pins on CMA regions and migrate
> > to a non CMA region. Have a look at __gup_longterm_locked."
> > 
> > We should rather identify ways how that is still possible and get rid of
> > them.
> > 
> > 
> > Now, short-term pinnings and PCP are other issues where
> > alloc_contig_range() could be improved (e.g., in contrast to a FAST
> > mode, a HARD mode which temporarily disables the PCP, ...).
> 
> Agreed!
> 
> > > size (e.g., 2M) than requested order(64K) repeatedly until driver
> > > could gather necessary amount of memory. Basically, this approach
> > > makes the allocation very slow due to cma_alloc's function
> > > slowness and it could be stuck on one of the pageblocks if it
> > > encounters unmigratable page.
> > > 
> > > To solve the issue, this patch introduces cma_alloc_bulk.
> > > 
> > > 	int cma_alloc_bulk(struct cma *cma, unsigned int align,
> > > 		bool fast, unsigned int order, size_t nr_requests,
> > > 		struct page **page_array, size_t *nr_allocated);
> > > 
> > > Most parameters are same with cma_alloc but it additionally passes
> > > vector array to store allocated memory. What's different with cma_alloc
> > > is it will skip pageblocks without waiting/stopping if it has unmovable
> > > page so that API continues to scan other pageblocks to find requested
> > > order page.
> > > 
> > > cma_alloc_bulk is best effort approach in that it skips some pageblocks
> > > if they have unmovable pages unlike cma_alloc. It doesn't need to be
> > > perfect from the beginning at the cost of performance. Thus, the API
> > > takes "bool fast parameter" which is propagated into alloc_contig_range to
> > > avoid significat overhead functions to inrecase CMA allocation success
> > > ratio(e.g., migration retrial, PCP, LRU draining per pageblock)
> > > at the cost of less allocation success ratio. If the caller couldn't
> > > allocate enough, they could call it with "false" to increase success ratio
> > > if they are okay to expense the overhead for the success ratio.
> > 
> > Just so I understand what the idea is:
> > 
> > alloc_contig_range() sometimes fails on CMA regions when trying to
> > allocate big chunks (e.g., 300M). Instead of tackling that issue, you
> > rather allocate plenty of small chunks, and make these small allocations
> > fail faster/ make the allocations less reliable. Correct?
> > 
> > I don't really have a strong opinion on that. Giving up fast rather than
> > trying for longer sounds like a useful thing to have - but I wonder if
> > it's strictly necessary for the use case you describe.
> > 
> > I'd like to hear Michals opinion on that.
> 
> Well, what I can see is that this new interface is an antipatern to our
> allocation routines. We tend to control allocations by gfp mask yet you
> are introducing a bool parameter to make something faster... What that
> really means is rather arbitrary. Would it make more sense to teach
> cma_alloc resp. alloc_contig_range to recognize GFP_NOWAIT, GFP_NORETRY resp.
> GFP_RETRY_MAYFAIL instead?

If we use cma_alloc, that interface requires "allocate one big memory
chunk". IOW, return value is just struct page and expected that the page
is a big contiguos memory. That means it couldn't have a hole in the
range. However the idea here, what we asked is much smaller chunk rather
than a big contiguous memory so we could skip some of pages if they are
randomly pinned(long-term/short-term whatever) and search other pages
in the CMA area to avoid long stall. Thus, it couldn't work with exising
cma_alloc API with simple gfp_mak.

> 
> I am not deeply familiar with the cma allocator so sorry for a
> potentially stupid question. Why does a bulk interface performs better
> than repeated calls to cma_alloc? Is this because a failure would help
> to move on to the next pfn range while a repeated call would have to
> deal with the same range?

Yub, true with other overheads(e.g., migration retrial, waiting writeback
PCP/LRU draining IPI)

> 
> > > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > > ---
> > >  include/linux/cma.h |   5 ++
> > >  include/linux/gfp.h |   2 +
> > >  mm/cma.c            | 126 ++++++++++++++++++++++++++++++++++++++++++--
> > >  mm/page_alloc.c     |  19 ++++---
> > >  4 files changed, 140 insertions(+), 12 deletions(-)
> > > 
> -- 
> Michal Hocko
> SUSE Labs

  parent reply	other threads:[~2020-12-02 16:16 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-01 17:51 [PATCH v2 0/4] Chunk Heap Support on DMA-HEAP Minchan Kim
2020-12-01 17:51 ` [PATCH v2 1/4] mm: introduce alloc_contig_mode Minchan Kim
2020-12-01 17:51 ` [PATCH v2 2/4] mm: introduce cma_alloc_bulk API Minchan Kim
2020-12-02  9:14   ` David Hildenbrand
2020-12-02 15:49     ` Michal Hocko
2020-12-02 16:00       ` David Hildenbrand
2020-12-02 16:15       ` Minchan Kim [this message]
2020-12-02 16:48         ` Michal Hocko
2020-12-02 17:54           ` Minchan Kim
2020-12-02 18:51             ` Michal Hocko
2020-12-02 19:26               ` Minchan Kim
2020-12-02 20:22                 ` David Hildenbrand
2020-12-02 20:48                   ` Minchan Kim
2020-12-03  8:28                   ` Michal Hocko
2020-12-03  9:47                     ` David Hildenbrand
2020-12-03 11:47                       ` Michal Hocko
2020-12-03 11:57                         ` David Hildenbrand
2020-12-02 16:00     ` Minchan Kim
2020-12-01 17:51 ` [PATCH v2 3/4] dma-buf: add export symbol for dma-heap Minchan Kim
2020-12-02 13:51   ` Christoph Hellwig
2020-12-01 17:51 ` [PATCH v2 4/4] dma-buf: heaps: add chunk heap to dmabuf heaps Minchan Kim
2020-12-01 19:48   ` John Stultz
2020-12-01 19:48     ` John Stultz
2020-12-01 22:55     ` Minchan Kim
2020-12-01 22:55       ` Minchan Kim
2020-12-01 23:38       ` John Stultz
2020-12-01 23:38         ` John Stultz
2020-12-02  0:13         ` Minchan Kim
2020-12-02  0:13           ` Minchan Kim
2020-12-02  0:33           ` John Stultz
2020-12-02  0:33             ` John Stultz
2020-12-02  0:57             ` Minchan Kim
2020-12-02  0:57               ` Minchan Kim
2020-12-02 13:54   ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=X8e9tSwcsrEsAv1O@google.com \
    --to=minchan@kernel.org \
    --cc=Brian.Starkey@arm.com \
    --cc=akpm@linux-foundation.org \
    --cc=christian.koenig@amd.com \
    --cc=david@redhat.com \
    --cc=devicetree@vger.kernel.org \
    --cc=hridya@google.com \
    --cc=hyesoo.yu@samsung.com \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=joaodias@google.com \
    --cc=john.stultz@linaro.org \
    --cc=linaro-mm-sig@lists.linaro.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-media@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=pullip.cho@samsung.com \
    --cc=robh@kernel.org \
    --cc=sumit.semwal@linaro.org \
    --cc=surenb@google.com \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.