From: Anshuman Khandual <khandual@linux.vnet.ibm.com>
To: Vlastimil Babka <vbabka@suse.cz>,
	Anshuman Khandual <khandual@linux.vnet.ibm.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: mhocko@suse.com, mgorman@suse.de, minchan@kernel.org,
	aneesh.kumar@linux.vnet.ibm.com, bsingharora@gmail.com,
	srikar@linux.vnet.ibm.com, haren@linux.vnet.ibm.com,
	jglisse@redhat.com, dave.hansen@intel.com,
	dan.j.williams@intel.com
Subject: Re: [PATCH 3/3] mm: Enable Buddy allocation isolation for CDM nodes
Date: Thu, 9 Feb 2017 10:35:58 +0530
Message-ID: <8982ccfc-3b96-89bd-60e6-471971aee609@linux.vnet.ibm.com>
In-Reply-To: <8ef1de25-d4fd-482c-c55e-df93d0730484@suse.cz>

On 02/08/2017 10:48 PM, Vlastimil Babka wrote:
> On 02/08/2017 03:01 PM, Anshuman Khandual wrote:
>> This implements allocation isolation for CDM nodes in buddy allocator by
>> discarding CDM memory zones all the time except in the cases where the
>> gfp flag has got __GFP_THISNODE or the nodemask contains CDM nodes in
>> cases where it is non NULL (explicit allocation request in the kernel or
>> user process MPOL_BIND policy based requests).
>>
>> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
>> ---
>>  mm/page_alloc.c | 19 +++++++++++++++++++
>>  1 file changed, 19 insertions(+)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 40908de..7d8c82a 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -64,6 +64,7 @@
>>  #include <linux/page_owner.h>
>>  #include <linux/kthread.h>
>>  #include <linux/memcontrol.h>
>> +#include <linux/node.h>
>>
>>  #include <asm/sections.h>
>>  #include <asm/tlbflush.h>
>> @@ -2908,6 +2909,24 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
>>          struct page *page;
>>          unsigned long mark;
>>
>> +        /*
>> +         * CDM nodes get skipped if the requested gfp flag
>> +         * does not have __GFP_THISNODE set or the nodemask
>> +         * does not have any CDM nodes in case the nodemask
>> +         * is non NULL (explicit allocation requests from
>> +         * kernel or user process MPOL_BIND policy which has
>> +         * CDM nodes).
>> +         */
>> +        if (is_cdm_node(zone->zone_pgdat->node_id)) {
>> +            if (!(gfp_mask & __GFP_THISNODE)) {
>> +                if (!ac->nodemask)
>> +                    continue;
>> +
>> +                if (!nodemask_has_cdm(*ac->nodemask))
>> +                    continue;
> 
> nodemask_has_cdm() looks quite expensive, combined with the loop here
> that's O(n^2). But I don't understand why you need it. If there is no
> cdm node in the nodemask, then we never reach this code with a cdm node,
> because the zonelist iterator already checks the nodemask? Am I missing
> something?

A CDM zone can come up during zonelist iteration in two cases:

	(1) The nodemask is NULL (all zones are eligible)

		(1) Skip it if __GFP_THISNODE is not set
		(2) Pick it if __GFP_THISNODE is set

	(2) The nodemask has CDM (CDM zones are eligible)

		(1) Pick it if the nodemask has CDM
		(2) Pick it if __GFP_THISNODE is set

(1) (1) enforces the primary isolation.
(2) (1) is the only case that could be O(n^2) in the worst case.

For (2) (1), the check that the zone is a CDM zone and the check that
the nodemask contains a CDM node have to happen together. But we don't
reach that check until we have first established that the request lacks
__GFP_THISNODE and that the nodemask is really non-NULL. Hence the
number of requests getting into (2) (1) should be small. IIUC only the
user space MPOL_BIND ones will come here.
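
The cases above can be written down as a small user-space model of the
skip decision. This is only a sketch for discussion, not the patch
itself: nodemask_model_t, cdm_nodes, model_is_cdm_node(),
model_nodemask_has_cdm() and skip_cdm_zone() are hypothetical stand-ins,
and a 64-bit word is assumed in place of the kernel's nodemask_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: one bit per node, standing in for nodemask_t. */
#define MAX_NODES 64
typedef uint64_t nodemask_model_t;

/* Assumed global: the set of nodes that are CDM nodes. */
static nodemask_model_t cdm_nodes;

/* Stand-in for is_cdm_node(): true if node nid is a CDM node. */
static bool model_is_cdm_node(int nid)
{
	assert(nid >= 0 && nid < MAX_NODES);
	return ((cdm_nodes >> nid) & 1u) != 0;
}

/* Stand-in for nodemask_has_cdm(): true if mask contains any CDM node. */
static bool model_nodemask_has_cdm(nodemask_model_t mask)
{
	return (mask & cdm_nodes) != 0;
}

/*
 * Returns true when a zone on node nid should be skipped.
 * thisnode models (gfp_mask & __GFP_THISNODE); nodemask may be NULL.
 */
static bool skip_cdm_zone(int nid, bool thisnode,
			  const nodemask_model_t *nodemask)
{
	if (!model_is_cdm_node(nid))
		return false;	/* non-CDM zones are never affected */
	if (thisnode)
		return false;	/* cases (1) (2) and (2) (2): pick it */
	if (!nodemask)
		return true;	/* case (1) (1): primary isolation */
	/* case (2) (1): the only path that scans the nodemask */
	return !model_nodemask_has_cdm(*nodemask);
}
```

In this model the nodemask scan runs only for a CDM zone, without
__GFP_THISNODE, and with a non-NULL nodemask. Whether a CDM zone can
reach that point at all when the nodemask has no CDM node is exactly
the question raised above about the zonelist iterator; the model keeps
the branch only to mirror the patch as posted.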
