From: Mike Kravetz <mike.kravetz@oracle.com>
To: Reinette Chatre <reinette.chatre@intel.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-api@vger.kernel.org
Cc: Michal Hocko <mhocko@kernel.org>,
	Christopher Lameter <cl@linux.com>,
	Guy Shattah <sguy@mellanox.com>,
	Anshuman Khandual <khandual@linux.vnet.ibm.com>,
	Michal Nazarewicz <mina86@mina86.com>,
	David Nellans <dnellans@nvidia.com>,
	Laura Abbott <labbott@redhat.com>, Pavel Machek <pavel@ucw.cz>,
	Dave Hansen <dave.hansen@intel.com>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH v2 3/4] mm: add find_alloc_contig_pages() interface
Date: Tue, 22 May 2018 13:35:49 -0700
Message-ID: <652bb498-8393-4738-a987-9bed31786261@oracle.com>
In-Reply-To: <c7972da1-a908-7550-7253-9de9a963174c@intel.com>

On 05/22/2018 09:41 AM, Reinette Chatre wrote:
> On 5/21/2018 4:48 PM, Mike Kravetz wrote:
>> On 05/21/2018 01:54 AM, Vlastimil Babka wrote:
>>> On 05/04/2018 01:29 AM, Mike Kravetz wrote:
>>>> +/**
>>>> + * find_alloc_contig_pages() -- attempt to find and allocate a contiguous
>>>> + *				range of pages
>>>> + * @nr_pages:	number of pages to find/allocate
>>>> + * @gfp:	gfp mask used to limit search as well as during compaction
>>>> + * @nid:	target node
>>>> + * @nodemask:	mask of other possible nodes
>>>> + *
>>>> + * Pages can be freed with a call to free_contig_pages(), or by manually
>>>> + * calling __free_page() for each page allocated.
>>>> + *
>>>> + * Return: pointer to 'order' pages on success, or NULL if not successful.
>>>> + */
>>>> +struct page *find_alloc_contig_pages(unsigned long nr_pages, gfp_t gfp,
>>>> +					int nid, nodemask_t *nodemask)
>>>> +{
>>>> +	unsigned long i, alloc_order, order_pages;
>>>> +	struct page *pages;
>>>> +
>>>> +	/*
>>>> +	 * Underlying allocators perform page order sized allocations.
>>>> +	 */
>>>> +	alloc_order = get_count_order(nr_pages);
>>>
>>> So it takes an arbitrary nr_pages but converts it to an order anyway? I think
>>> that's rather suboptimal and wasteful... e.g. a range could be skipped
>>> because some of the pages added by rounding cannot be migrated away.
>>
>> Yes.  My idea with this series was to use existing allocators which are
>> all order based.  Let me think about how to do allocation for an arbitrary
>> number of pages.
>> - For less than MAX_ORDER size we rely on the buddy allocator, so we are
>>   pretty much stuck with order sized allocation.  However, allocations of
>>   this size are not really interesting as you can call existing routines
>>   directly.
>> - For sizes greater than MAX_ORDER, we know that the allocation size will
>>   be at least pageblock sized.  So, the isolate/migrate scheme can still
>>   be used for full pageblocks.  We can then use direct migration for the
>>   remaining pages.  This does complicate things a bit.
>>
>> I'm guessing that most (?all?) allocations will be order based.  The use
>> cases I am aware of (hugetlbfs, Intel Cache Pseudo-Locking, RDMA) are all
>> order based.  However, as commented on the previous version, taking an
>> arbitrary nr_pages makes the interface more future proof.
>>
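A rough user-space illustration of the split described in the second bullet
above: whole pageblocks would go through isolate/migrate, and the tail would
be left for direct migration.  This is only a sketch.  PAGEBLOCK_NR_PAGES is
an assumed value (512 pages, i.e. 2MB pageblocks with 4KB pages), and
struct contig_split and split_request() are hypothetical names that are not
part of the patch.

```c
#define PAGEBLOCK_NR_PAGES 512UL	/* assumption: 2MB pageblocks, 4KB pages */

/* Hypothetical description of how a request divides up. */
struct contig_split {
	unsigned long full_pageblocks;	/* candidates for isolate/migrate */
	unsigned long tail_pages;	/* remainder left for direct migration */
};

/* Split a request of nr_pages into whole pageblocks plus a tail. */
static struct contig_split split_request(unsigned long nr_pages)
{
	struct contig_split s = {
		.full_pageblocks = nr_pages / PAGEBLOCK_NR_PAGES,
		.tail_pages = nr_pages % PAGEBLOCK_NR_PAGES,
	};

	return s;
}
```

Under these assumptions a request for 2112 pages splits into 4 full
pageblocks and a 64 page tail.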
> 
> I noticed this Cache Pseudo-Locking statement and would like to clarify.
> I have not been following this thread in detail so I would like to
> apologize first if my comments are out of context.
> 
> Currently the Cache Pseudo-Locking allocations are order based because I
> assumed it was required by the allocator. The contiguous regions needed
> by Cache Pseudo-Locking will not always be order based - instead they are
> based on the granularity of the cache allocation. One example is a
> platform with 55MB L3 cache that can be divided into 20 equal portions.
> To support Cache Pseudo-Locking on this platform we need to be able to
> allocate contiguous regions at increments of 2816KB (the size of each
> portion). In support of this example platform regions needed would thus
> be 2816KB, 5632KB, 8448KB, etc.

Thank you Reinette.  I was not aware of these details.  Yours is the most
concrete new use case.

This certainly makes more of a case for arbitrary sized allocations.
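
To put rough numbers on the example above, here is a small user-space sketch
(not kernel code) of what rounding up to an order costs for those region
sizes.  It assumes 4KB pages.  count_order() is a hypothetical stand-in that
mimics the rounding done by get_count_order(), and rounding_waste() and
print_waste() are names made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

#define PAGE_SIZE_KB	4UL	/* assumption: 4KB base pages */
#define PORTION_KB	2816UL	/* 55MB L3 divided into 20 portions */

/*
 * Hypothetical user-space stand-in for get_count_order(): the smallest
 * order such that (1 << order) >= count.
 */
static unsigned int count_order(unsigned long count)
{
	unsigned int order = 0;

	assert(count > 0);
	while ((1UL << order) < count)
		order++;
	return order;
}

/* Pages allocated beyond nr_pages when rounding up to a full order. */
static unsigned long rounding_waste(unsigned long nr_pages)
{
	return (1UL << count_order(nr_pages)) - nr_pages;
}

/* Print the rounding cost for the first n multiples of one portion. */
static void print_waste(unsigned int n)
{
	unsigned int i;

	for (i = 1; i <= n; i++) {
		unsigned long nr_pages = i * PORTION_KB / PAGE_SIZE_KB;

		printf("%luKB: %lu pages, order %u, %lu pages unused\n",
		       i * PORTION_KB, nr_pages, count_order(nr_pages),
		       rounding_waste(nr_pages));
	}
}
```

Calling print_waste(3) covers the 2816KB, 5632KB and 8448KB cases.  With
these assumptions 2816KB is 704 pages, which rounds up to order 10 (1024
pages), so 320 pages (1280KB) would be allocated but not needed.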

-- 
Mike Kravetz
