From: Chris Goldsworthy <cgoldswo@codeaurora.org>
To: David Hildenbrand <david@redhat.com>
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-arm-msm@vger.kernel.org, linux-kernel@vger.kernel.org,
	pratikp@codeaurora.org, pdaly@codeaurora.org,
	sudaraja@codeaurora.org, iamjoonsoo.kim@lge.com,
	linux-arm-msm-owner@vger.kernel.org,
	Vinayak Menon <vinmenon@codeaurora.org>,
	linux-kernel-owner@vger.kernel.org
Subject: Re: [PATCH v2] mm: cma: indefinitely retry allocations in cma_alloc
Date: Thu, 17 Sep 2020 10:54:04 -0700
Message-ID: <5cfa914fca107d884aa845b9273ec656@codeaurora.org>
In-Reply-To: <a3d62a77-4c4f-e86c-de6d-5222c2a747e0@redhat.com>

On 2020-09-15 00:53, David Hildenbrand wrote:
> On 14.09.20 20:33, Chris Goldsworthy wrote:
>> On 2020-09-14 02:31, David Hildenbrand wrote:
>>> On 11.09.20 21:17, Chris Goldsworthy wrote:
>>>> 
>>>> So, inside of cma_alloc(), instead of giving up when
>>>> alloc_contig_range()
>>>> returns -EBUSY after having scanned a whole CMA-region bitmap,
>>>> perform
>>>> retries indefinitely, with sleeps, to give the system an opportunity
>>>> to
>>>> unpin any pinned pages.
>>>> 
>>>> Signed-off-by: Chris Goldsworthy <cgoldswo@codeaurora.org>
>>>> Co-developed-by: Vinayak Menon <vinmenon@codeaurora.org>
>>>> Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
>>>> ---
>>>>  mm/cma.c | 25 +++++++++++++++++++++++--
>>>>  1 file changed, 23 insertions(+), 2 deletions(-)
>>>> 
>>>> diff --git a/mm/cma.c b/mm/cma.c
>>>> index 7f415d7..90bb505 100644
>>>> --- a/mm/cma.c
>>>> +++ b/mm/cma.c
>>>> @@ -442,8 +443,28 @@ struct page *cma_alloc(struct cma *cma, size_t
>>>> count, unsigned int align,
>>>>  				bitmap_maxno, start, bitmap_count, mask,
>>>>  				offset);
>>>>  		if (bitmap_no >= bitmap_maxno) {
>>>> -			mutex_unlock(&cma->lock);
>>>> -			break;
>>>> +			if (ret == -EBUSY) {
>>>> +				mutex_unlock(&cma->lock);
>>>> +
>>>> +				/*
>>>> +				 * Page may be momentarily pinned by some other
>>>> +				 * process which has been scheduled out, e.g.
>>>> +				 * in exit path, during unmap call, or process
>>>> +				 * fork and so cannot be freed there. Sleep
>>>> +				 * for 100ms and retry the allocation.
>>>> +				 */
>>>> +				start = 0;
>>>> +				ret = -ENOMEM;
>>>> +				msleep(100);
>>>> +				continue;
>>>> +			} else {
>>>> +				/*
>>>> +				 * ret == -ENOMEM - all bits in cma->bitmap are
>>>> +				 * set, so we break accordingly.
>>>> +				 */
>>>> +				mutex_unlock(&cma->lock);
>>>> +				break;
>>>> +			}
>>>>  		}
>>>>  		bitmap_set(cma->bitmap, bitmap_no, bitmap_count);
>>>>  		/*
>>>> 
>>> 
>>> What about long-term pinnings? IIRC, that can happen easily e.g.,
>>> with
>>> vfio (and I remember there is a way via vmsplice).
>>> 
>>> Not convinced trying forever is a sane approach in the general case
>>> ...
>> 
>> V1:
>> [1] https://lkml.org/lkml/2020/8/5/1097
>> [2] https://lkml.org/lkml/2020/8/6/1040
>> [3] https://lkml.org/lkml/2020/8/11/893
>> [4] https://lkml.org/lkml/2020/8/21/1490
>> [5] https://lkml.org/lkml/2020/9/11/1072
>> 
>> We're fine with doing indefinite retries, on the grounds that if there
>> is some long-term pinning that occurs when alloc_contig_range returns
>> -EBUSY, that it should be debugged and fixed.  Would it be possible to
>> make this infinite-retrying something that could be enabled or
>> disabled
>> by a defconfig option?
> 
> Two thoughts:
> 
> This means I strongly prefer something like [3] if feasible.

_Resending so that this ends up on LKML_

I can give [3] some further thought then.  Also, I realized that [3] will
not completely solve the problem; it only reduces the window in which
_refcount > _mapcount (as mentioned in earlier threads, we encountered
the pinning when a task in copy_one_pte() or in the exit_mmap() path
gets context switched out).  If we were to try a sleeping-lock based
solution, do you think it would be permissible to add another lock to
struct page?

> 2. The issue that I am having is that long-term pinnings are
> (unfortunately) a real thing. It's not something to debug and fix as
> you
> suggest. Like, run a VM with VFIO (e.g., PCI passthrough). While that
> VM
> is running, all VM memory will be pinned. If memory falls onto a CMA
> region your cma_alloc() will be stuck in an (endless, meaning until the
> VM ended) loop. I am not sure if all cma users are fine with that -
> especially, think about CMA being used for gigantic pages now.
> 
> Assume you want to start a new VM while the other one is running and
> use
> some (new) gigantic pages for it. Suddenly you're trapped in an endless
> loop in the kernel. That's nasty.


Thanks for providing this example.

> 
> If we want to stick to retrying forever, can't we use flags like
> __GFP_NOFAIL to explicitly enable this new behavior for selected
> cma_alloc() users that really can't fail/retry manually again?

This would work; we would just have to undo the work done by this patch
and re-introduce the GFP parameter for cma_alloc():
http://lkml.kernel.org/r/20180709122019eucas1p2340da484acfcc932537e6014f4fd2c29~-sqTPJKij2939229392eucas1p2j@eucas1p2.samsung.com
We would then add support for __GFP_NOFAIL (and ignore any flag that is
not one of __GFP_NOFAIL or __GFP_NOWARN).

-- 
The Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora 
Forum,
a Linux Foundation Collaborative Project
