From: Balbir Singh <bsingharora@gmail.com>
To: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Anshuman Khandual <khandual@linux.vnet.ibm.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: mhocko@suse.com, js1304@gmail.com, vbabka@suse.cz,
	mgorman@suse.de, minchan@kernel.org, akpm@linux-foundation.org
Subject: Re: [RFC 3/8] mm: Isolate coherent device memory nodes from HugeTLB allocation paths
Date: Tue, 25 Oct 2016 18:25:49 +1100
Message-ID: <96132402-e934-02ef-525f-636e4e132e9a@gmail.com>
In-Reply-To: <5f9f43c1-115f-e3fe-fca2-37e6c1eed73f@gmail.com>



On 25/10/16 18:17, Balbir Singh wrote:
> 
> 
> On 25/10/16 15:15, Aneesh Kumar K.V wrote:
>> Dave Hansen <dave.hansen@intel.com> writes:
>>
>>> On 10/23/2016 09:31 PM, Anshuman Khandual wrote:
>>>> This change is part of the isolation requiring coherent device memory nodes
>>>> implementation.
>>>>
>>>> Isolation seeking coherent device memory node requires allocation isolation
>>>> from implicit memory allocations from user space. Towards that effect, the
>>>> memory should not be used for generic HugeTLB page pool allocations. This
>>>> modifies relevant functions to skip all coherent memory nodes present on
>>>> the system during allocation, freeing and auditing for HugeTLB pages.
>>>
>>> This seems really fragile.  You had to hit, what, 18 call sites?  What
>>> are the odds that this is going to stay working?
>>
>>
>> I guess a better approach is to introduce new node_states entry such
>> that we have one that excludes coherent device memory numa nodes. One
>> possibility is to add N_SYSTEM_MEMORY and N_MEMORY.
>>
>> Current N_MEMORY becomes N_SYSTEM_MEMORY and N_MEMORY includes
>> system and device/any other memory which is coherent.
>>
> 
> I thought of this as well, but I would rather see N_COHERENT_MEMORY
> as a flag. The idea being that some device memory is a part of
> N_MEMORY, but N_COHERENT_MEMORY gives it additional attributes.
> 
>> All the isolation can then be achieved based on the nodemask_t used for
>> allocation. So for allocations we want to avoid from coherent device we
>> use N_SYSTEM_MEMORY mask or a derivative of that and where we are ok to
>> allocate from CDM with fallbacks we use N_MEMORY.
>>
> 
> I suspect it's going to be easier to exclude N_COHERENT_MEMORY.
> 
>> All nodes zonelist will have zones from the coherent device nodes but we
>> will not end up allocating from coherent device node zone due to the
>> node mask used.
>>
>>
>> This will also make sure we end up allocating from the correct coherent
>> device numa node in the presence of multiple of them based on the
>> distance of the coherent device node from the current executing numa
>> node.
>>
> 
> The idea is good overall, but I think it's going to be good to document
> the exclusions with the flags.
> 

FWIW, some of this is present in 8/8.

Balbir
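
The nodemask-based isolation discussed above can be sketched in userspace
C. This is only an illustration under stated assumptions, not code from
the patch series: the cdm_* names, the 64-bit mask type and the helper
functions are hypothetical stand-ins for the kernel's nodemask_t and
node_states[] machinery. N_COHERENT_MEMORY is the flag name proposed in
this thread. The sketch derives a "system memory" mask by excluding
coherent nodes from N_MEMORY, then walks a distance-ordered node list and
picks the first node the mask allows.

```c
#include <assert.h>
#include <stdint.h>

#define CDM_MAX_NODES 64

/* Hypothetical stand-ins for entries in the kernel's node_states[]. */
enum cdm_node_state { N_MEMORY, N_COHERENT_MEMORY, CDM_NR_STATES };

/* One bit per NUMA node; a simplification of the kernel's nodemask_t. */
typedef uint64_t cdm_nodemask;

static cdm_nodemask cdm_node_states[CDM_NR_STATES];

/* Mark node 'nid' as having the given state. */
static void cdm_node_set_state(int nid, enum cdm_node_state state)
{
	assert(nid >= 0 && nid < CDM_MAX_NODES);
	cdm_node_states[state] |= (cdm_nodemask)1 << nid;
}

/* Nodes with memory, minus the nodes flagged as coherent device memory. */
static cdm_nodemask cdm_system_memory_mask(void)
{
	return cdm_node_states[N_MEMORY] & ~cdm_node_states[N_COHERENT_MEMORY];
}

/*
 * Walk a node list ordered by distance (nearest first) and return the
 * first node permitted by 'allowed', or -1 if none is permitted.
 */
static int cdm_first_allowed_node(const int *order, int count,
				  cdm_nodemask allowed)
{
	for (int i = 0; i < count; i++) {
		if ((allowed >> order[i]) & 1)
			return order[i];
	}
	return -1;
}
```

With nodes 0 and 1 as system RAM and node 2 flagged coherent, the derived
mask covers only nodes 0 and 1, so an allocation restricted to it skips
node 2 even when node 2 comes first in the ordered list. Passing the full
N_MEMORY mask instead allows node 2, which corresponds to the "ok to
allocate from CDM with fallbacks" case.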

