From: Anshuman Khandual <khandual@linux.vnet.ibm.com>
To: Balbir Singh <bsingharora@gmail.com>, Mel Gorman <mgorman@suse.de>
Cc: "Anshuman Khandual" <khandual@linux.vnet.ibm.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>, "Michal Hocko" <mhocko@suse.com>,
	"Vlastimil Babka" <vbabka@suse.cz>,
	"Minchan Kim" <minchan@kernel.org>,
	"Aneesh Kumar KV" <aneesh.kumar@linux.vnet.ibm.com>,
	"Srikar Dronamraju" <srikar@linux.vnet.ibm.com>,
	haren@linux.vnet.ibm.com, "Jérôme Glisse" <jglisse@redhat.com>,
	"Dave Hansen" <dave.hansen@intel.com>,
	"Dan Williams" <dan.j.williams@intel.com>
Subject: Re: [PATCH V3 0/4] Define coherent device memory node
Date: Wed, 8 Mar 2017 14:34:05 +0530
Message-ID: <1d67f38b-548f-26a2-23f5-240d6747f286@linux.vnet.ibm.com>
In-Reply-To: <CAKTCnzm+nyAaNMmPgKQVNtxdbTLoarnGtebT=MfsZaSq8Umw0w@mail.gmail.com>

On 03/01/2017 04:29 PM, Balbir Singh wrote:
> On Wed, Mar 1, 2017 at 8:55 PM, Mel Gorman <mgorman@suse.de> wrote:
>> On Wed, Mar 01, 2017 at 01:42:40PM +1100, Balbir Singh wrote:
>>>>>> The idea of this patchset was to introduce
>>>>>> the concept of memory that is not necessarily system memory, but is coherent
>>>>>> in terms of visibility/access with some restrictions
>>>>>>
>>>>> Which should be done without special casing the page allocator, cpusets and
>>>>> special casing how cpusets are handled. It's not necessary for any other
>>>>> mechanism used to restrict access to portions of memory such as cpusets,
>>>>> mempolicies or even memblock reservations.
>>>> Agreed, I mentioned a limitation that we see in cpusets. I do agree that
>>>> we should reuse any infrastructure we have, but cpusets are more static
>>>> in nature and inheritance compared to the requirements of CDM.
>>>>
>>> Mel, I went back and looked at cpusets and found some limitations that
>>> I mentioned earlier. Isolating a particular node requires a laborious
>>> sequence: move all tasks away from the root cpuset, then create a
>>> hierarchy where the root cpuset is empty and the tasks now belong to a
>>> child cpuset that has everything but the node we intend to isolate.
>>> Even with hardwalling, it does not prevent allocations from the parent
>>> cpuset.
>>>
>> That it is difficult does not in itself justify adding a third mechanism
>> specific to one type of device for controlling access to memory.
>>
> Not only is it difficult, but there are several tasks that refuse to
> change cpusets once created. I also noticed that the isolation may
> begin a little too late, so some allocations may end up on the node we
> want to isolate.
> 
> I also want to eventually control whether auto-numa
> balancing/kswapd/reclaim etc. run on this node (something that cpusets
> do not provide). These decisions depend heavily on the properties of
> the node. The isolation mechanism that exists today is insufficient.
> Moreover, the correct abstraction for device memory would be a class
> similar to N_MEMORY, but limited in what we include (which is why I
> was asking if questions 3 and 4 are clear). You might argue these are
> not NUMA nodes then, but they are NUMA nodes in the general sense
> (with non-uniform properties and access times). With the right
> hardware, NUMA lets us expose the right programming model. Please
> consider reading the full details at
> 
> https://patchwork.kernel.org/patch/9566393/
> https://lkml.org/lkml/2016/11/22/339

As Balbir explained, the cpuset mechanism today provides only isolation
and is insufficient for expressing the other properties required for a
full-fledged CDM representation. A NUMA representation is the closest
match for CDM memory, with non-uniform attributes, rather than distance
alone, as the differentiating property. Once CDM memory is represented
as a NUMA node in the kernel, we can meet the isolation requirement
either through the buddy allocator changes proposed in this series or
through alternative approaches. As I mentioned in the last RFC, another
way to achieve isolation is through changes to the zonelist rebuild
process and to the mbind() implementation. The two relevant commits are
here:

https://github.com/akhandual/linux/commit/da1093599db29c31d12422a34d4e0cbf4683618f
https://github.com/akhandual/linux/commit/faadab4e9dc9685ab7a564a84d4a06bde8fc79d8

I will post these commits on this thread for further discussion. Please
let me know your views and suggestions on this approach.
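
To make the zonelist/mbind() alternative easier to discuss, here is a
toy user-space model of the idea in Python. It is only a sketch and is
not the code in the two commits above: the function names, the distance
table and the node numbering are all made up for illustration. The
assumption it encodes is that implicit allocations walk a fallback list
that leaves CDM nodes out, while an explicit bind-style nodemask can
still name a CDM node.

```python
# Toy model only: names and data below are hypothetical, not kernel APIs.

def build_fallback_list(origin, distance, cdm_nodes):
    """Order nodes by distance from `origin`, leaving CDM nodes out.

    Models the assumption that implicit allocations never see CDM memory.
    """
    candidates = [n for n in distance[origin] if n not in cdm_nodes]
    return sorted(candidates, key=lambda n: (distance[origin][n], n))


def allocation_candidates(origin, distance, cdm_nodes, bind_mask=None):
    """Return the nodes an allocation may try, nearest first.

    Without a mask, use the CDM-free fallback list. With an explicit
    mask (modelled loosely on an MPOL_BIND-style request), every node
    named in the mask is eligible, including CDM nodes.
    """
    if bind_mask is None:
        return build_fallback_list(origin, distance, cdm_nodes)
    return sorted(bind_mask, key=lambda n: (distance[origin][n], n))


# Hypothetical topology: nodes 0 and 1 are system memory, node 2 is CDM.
DISTANCE = {
    0: {0: 10, 1: 20, 2: 40},
    1: {0: 20, 1: 10, 2: 40},
    2: {0: 40, 1: 40, 2: 10},
}
CDM_NODES = {2}

if __name__ == "__main__":
    print(allocation_candidates(0, DISTANCE, CDM_NODES))
    print(allocation_candidates(0, DISTANCE, CDM_NODES, bind_mask={2}))
```

In this model a task running on node 0 never lands on node 2 unless it
asks for it by name, which is the isolation property under discussion.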

