All of lore.kernel.org
 help / color / mirror / Atom feed
From: Anshuman Khandual <khandual@linux.vnet.ibm.com>
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: mhocko@suse.com, js1304@gmail.com, vbabka@suse.cz,
	mgorman@suse.de, minchan@kernel.org, akpm@linux-foundation.org,
	aneesh.kumar@linux.vnet.ibm.com, bsingharora@gmail.com
Subject: [DRAFT 1/2] mm/cpuset: Exclude CDM nodes from each task's mems_allowed node mask
Date: Thu, 17 Nov 2016 13:29:08 +0530	[thread overview]
Message-ID: <1479369549-13309-1-git-send-email-khandual@linux.vnet.ibm.com> (raw)
In-Reply-To: <582D5F02.6010705@linux.vnet.ibm.com>

task->mems_allowed decides the final node mask of nodes from which memory
can be allocated irrespective of process or VMA based memory policy. CDM
nodes should not be used for any user space memory allocation, hence they
should not be part of any mems_allowed mask in user space to begin with.
This adds a function system_ram() which computes system RAM only nodes
and excludes all the CDM nodes on the platform. This resultant system RAM
nodemask is used instead of N_MEMORY mask during cpuset and mems_allowed
initialization. This achieves isolation of the coherent device memory
from userspace allocations.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
This completely isolates CDM nodes from user space allocations. Hence
explicit allocation to the CDM nodes would not be possible any more.
To again enable explicit allocation capability from user space, cpuset
needs to be changed to accommodate CDM nodes into task's mems_allowed.

 include/linux/mm.h |  9 +++++++++
 kernel/cpuset.c    | 12 +++++++-----
 2 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index a92c8d7..f338492 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -446,6 +446,15 @@ static inline int put_page_testzero(struct page *page)
 	return page_ref_dec_and_test(page);
 }
 
+static inline nodemask_t system_ram(void)
+{
+	nodemask_t ram_nodes;
+
+	nodes_clear(ram_nodes);
+	nodes_andnot(ram_nodes, node_states[N_MEMORY], node_states[N_COHERENT_DEVICE]);
+	return ram_nodes;
+}
+
 /*
  * Try to grab a ref unless the page has a refcount of zero, return false if
  * that is the case.
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 29f815d..78c6fa3 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -364,9 +364,11 @@ static void guarantee_online_cpus(struct cpuset *cs, struct cpumask *pmask)
  */
 static void guarantee_online_mems(struct cpuset *cs, nodemask_t *pmask)
 {
-	while (!nodes_intersects(cs->effective_mems, node_states[N_MEMORY]))
+	nodemask_t nodes = system_ram();
+
+	while (!nodes_intersects(cs->effective_mems, nodes))
 		cs = parent_cs(cs);
-	nodes_and(*pmask, cs->effective_mems, node_states[N_MEMORY]);
+	nodes_and(*pmask, cs->effective_mems, nodes);
 }
 
 /*
@@ -2301,7 +2303,7 @@ static void cpuset_hotplug_workfn(struct work_struct *work)
 
 	/* fetch the available cpus/mems and find out which changed how */
 	cpumask_copy(&new_cpus, cpu_active_mask);
-	new_mems = node_states[N_MEMORY];
+	new_mems = system_ram();
 
 	cpus_updated = !cpumask_equal(top_cpuset.effective_cpus, &new_cpus);
 	mems_updated = !nodes_equal(top_cpuset.effective_mems, new_mems);
@@ -2393,11 +2395,11 @@ static int cpuset_track_online_nodes(struct notifier_block *self,
 void __init cpuset_init_smp(void)
 {
 	cpumask_copy(top_cpuset.cpus_allowed, cpu_active_mask);
-	top_cpuset.mems_allowed = node_states[N_MEMORY];
+	top_cpuset.mems_allowed = system_ram();
 	top_cpuset.old_mems_allowed = top_cpuset.mems_allowed;
 
 	cpumask_copy(top_cpuset.effective_cpus, cpu_active_mask);
-	top_cpuset.effective_mems = node_states[N_MEMORY];
+	top_cpuset.effective_mems = system_ram();
 
 	register_hotmemory_notifier(&cpuset_track_online_nodes_nb);
 
-- 
1.8.3.1

WARNING: multiple messages have this Message-ID (diff)
From: Anshuman Khandual <khandual@linux.vnet.ibm.com>
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: mhocko@suse.com, js1304@gmail.com, vbabka@suse.cz,
	mgorman@suse.de, minchan@kernel.org, akpm@linux-foundation.org,
	aneesh.kumar@linux.vnet.ibm.com, bsingharora@gmail.com
Subject: [DRAFT 1/2] mm/cpuset: Exclude CDM nodes from each task's mems_allowed node mask
Date: Thu, 17 Nov 2016 13:29:08 +0530	[thread overview]
Message-ID: <1479369549-13309-1-git-send-email-khandual@linux.vnet.ibm.com> (raw)
In-Reply-To: <582D5F02.6010705@linux.vnet.ibm.com>

task->mems_allowed decides the final node mask of nodes from which memory
can be allocated irrespective of process or VMA based memory policy. CDM
nodes should not be used for any user space memory allocation, hence they
should not be part of any mems_allowed mask in user space to begin with.
This adds a function system_ram() which computes system RAM only nodes
and excludes all the CDM nodes on the platform. This resultant system RAM
nodemask is used instead of N_MEMORY mask during cpuset and mems_allowed
initialization. This achieves isolation of the coherent device memory
from userspace allocations.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
This completely isolates CDM nodes from user space allocations. Hence
explicit allocation to the CDM nodes would not be possible any more.
To again enable explicit allocation capability from user space, cpuset
needs to be changed to accommodate CDM nodes into task's mems_allowed.

 include/linux/mm.h |  9 +++++++++
 kernel/cpuset.c    | 12 +++++++-----
 2 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index a92c8d7..f338492 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -446,6 +446,15 @@ static inline int put_page_testzero(struct page *page)
 	return page_ref_dec_and_test(page);
 }
 
+static inline nodemask_t system_ram(void)
+{
+	nodemask_t ram_nodes;
+
+	nodes_clear(ram_nodes);
+	nodes_andnot(ram_nodes, node_states[N_MEMORY], node_states[N_COHERENT_DEVICE]);
+	return ram_nodes;
+}
+
 /*
  * Try to grab a ref unless the page has a refcount of zero, return false if
  * that is the case.
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 29f815d..78c6fa3 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -364,9 +364,11 @@ static void guarantee_online_cpus(struct cpuset *cs, struct cpumask *pmask)
  */
 static void guarantee_online_mems(struct cpuset *cs, nodemask_t *pmask)
 {
-	while (!nodes_intersects(cs->effective_mems, node_states[N_MEMORY]))
+	nodemask_t nodes = system_ram();
+
+	while (!nodes_intersects(cs->effective_mems, nodes))
 		cs = parent_cs(cs);
-	nodes_and(*pmask, cs->effective_mems, node_states[N_MEMORY]);
+	nodes_and(*pmask, cs->effective_mems, nodes);
 }
 
 /*
@@ -2301,7 +2303,7 @@ static void cpuset_hotplug_workfn(struct work_struct *work)
 
 	/* fetch the available cpus/mems and find out which changed how */
 	cpumask_copy(&new_cpus, cpu_active_mask);
-	new_mems = node_states[N_MEMORY];
+	new_mems = system_ram();
 
 	cpus_updated = !cpumask_equal(top_cpuset.effective_cpus, &new_cpus);
 	mems_updated = !nodes_equal(top_cpuset.effective_mems, new_mems);
@@ -2393,11 +2395,11 @@ static int cpuset_track_online_nodes(struct notifier_block *self,
 void __init cpuset_init_smp(void)
 {
 	cpumask_copy(top_cpuset.cpus_allowed, cpu_active_mask);
-	top_cpuset.mems_allowed = node_states[N_MEMORY];
+	top_cpuset.mems_allowed = system_ram();
 	top_cpuset.old_mems_allowed = top_cpuset.mems_allowed;
 
 	cpumask_copy(top_cpuset.effective_cpus, cpu_active_mask);
-	top_cpuset.effective_mems = node_states[N_MEMORY];
+	top_cpuset.effective_mems = system_ram();
 
 	register_hotmemory_notifier(&cpuset_track_online_nodes_nb);
 
-- 
1.8.3.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2016-11-17  7:59 UTC|newest]

Thread overview: 135+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-10-24  4:31 [RFC 0/8] Define coherent device memory node Anshuman Khandual
2016-10-24  4:31 ` Anshuman Khandual
2016-10-24  4:31 ` [RFC 1/8] mm: " Anshuman Khandual
2016-10-24  4:31   ` Anshuman Khandual
2016-10-24 17:09   ` Dave Hansen
2016-10-24 17:09     ` Dave Hansen
2016-10-25  1:22     ` Anshuman Khandual
2016-10-25  1:22       ` Anshuman Khandual
2016-10-25 15:47       ` Dave Hansen
2016-10-25 15:47         ` Dave Hansen
2016-10-24  4:31 ` [RFC 2/8] mm: Add specialized fallback zonelist for coherent device memory nodes Anshuman Khandual
2016-10-24  4:31   ` Anshuman Khandual
2016-10-24 17:10   ` Dave Hansen
2016-10-24 17:10     ` Dave Hansen
2016-10-25  1:27     ` Anshuman Khandual
2016-10-25  1:27       ` Anshuman Khandual
2016-11-17  7:40   ` Anshuman Khandual
2016-11-17  7:40     ` Anshuman Khandual
2016-11-17  7:59     ` Anshuman Khandual [this message]
2016-11-17  7:59       ` [DRAFT 1/2] mm/cpuset: Exclude CDM nodes from each task's mems_allowed node mask Anshuman Khandual
2016-11-17  7:59       ` [DRAFT 2/2] mm/hugetlb: Restrict HugeTLB allocations only to the system RAM nodes Anshuman Khandual
2016-11-17  7:59         ` Anshuman Khandual
2016-11-17  8:28       ` [DRAFT 1/2] mm/cpuset: Exclude CDM nodes from each task's mems_allowed node mask kbuild test robot
2016-10-24  4:31 ` [RFC 3/8] mm: Isolate coherent device memory nodes from HugeTLB allocation paths Anshuman Khandual
2016-10-24  4:31   ` Anshuman Khandual
2016-10-24 17:16   ` Dave Hansen
2016-10-24 17:16     ` Dave Hansen
2016-10-25  4:15     ` Aneesh Kumar K.V
2016-10-25  4:15       ` Aneesh Kumar K.V
2016-10-25  7:17       ` Balbir Singh
2016-10-25  7:17         ` Balbir Singh
2016-10-25  7:25         ` Balbir Singh
2016-10-25  7:25           ` Balbir Singh
2016-10-24  4:31 ` [RFC 4/8] mm: Accommodate coherent device memory nodes in MPOL_BIND implementation Anshuman Khandual
2016-10-24  4:31   ` Anshuman Khandual
2016-10-24  4:31 ` [RFC 5/8] mm: Add new flag VM_CDM for coherent device memory Anshuman Khandual
2016-10-24  4:31   ` Anshuman Khandual
2016-10-24 17:38   ` Dave Hansen
2016-10-24 17:38     ` Dave Hansen
2016-10-24 18:00     ` Dave Hansen
2016-10-24 18:00       ` Dave Hansen
2016-10-25 12:36     ` Balbir Singh
2016-10-25 12:36       ` Balbir Singh
2016-10-25 19:20     ` Aneesh Kumar K.V
2016-10-25 19:20       ` Aneesh Kumar K.V
2016-10-25 20:01       ` Dave Hansen
2016-10-25 20:01         ` Dave Hansen
2016-10-24  4:31 ` [RFC 6/8] mm: Make VM_CDM marked VMAs non migratable Anshuman Khandual
2016-10-24  4:31   ` Anshuman Khandual
2016-10-24  4:31 ` [RFC 7/8] mm: Add a new migration function migrate_virtual_range() Anshuman Khandual
2016-10-24  4:31   ` Anshuman Khandual
2016-10-24  4:31 ` [RFC 8/8] mm: Add N_COHERENT_DEVICE node type into node_states[] Anshuman Khandual
2016-10-24  4:31   ` Anshuman Khandual
2016-10-25  7:22   ` Balbir Singh
2016-10-25  7:22     ` Balbir Singh
2016-10-26  4:52     ` Anshuman Khandual
2016-10-26  4:52       ` Anshuman Khandual
2016-10-24  4:42 ` [DEBUG 00/10] Test and debug patches for coherent device memory Anshuman Khandual
2016-10-24  4:42   ` Anshuman Khandual
2016-10-24  4:42   ` [DEBUG 01/10] dt-bindings: Add doc for ibm,hotplug-aperture Anshuman Khandual
2016-10-24  4:42     ` Anshuman Khandual
2016-10-24  4:42   ` [DEBUG 02/10] powerpc/mm: Create numa nodes for hotplug memory Anshuman Khandual
2016-10-24  4:42     ` Anshuman Khandual
2016-10-24  4:42   ` [DEBUG 03/10] powerpc/mm: Allow memory hotplug into a memory less node Anshuman Khandual
2016-10-24  4:42     ` Anshuman Khandual
2016-10-24  4:42   ` [DEBUG 04/10] mm: Enable CONFIG_MOVABLE_NODE on powerpc Anshuman Khandual
2016-10-24  4:42     ` Anshuman Khandual
2016-10-24  4:42   ` [DEBUG 05/10] powerpc/mm: Identify isolation seeking coherent memory nodes during boot Anshuman Khandual
2016-10-24  4:42     ` Anshuman Khandual
2016-10-24  4:42   ` [DEBUG 06/10] mm: Export definition of 'zone_names' array through mmzone.h Anshuman Khandual
2016-10-24  4:42     ` Anshuman Khandual
2016-10-24  4:42   ` [DEBUG 07/10] mm: Add debugfs interface to dump each node's zonelist information Anshuman Khandual
2016-10-24  4:42     ` Anshuman Khandual
2016-10-24  4:42   ` [DEBUG 08/10] powerpc: Enable CONFIG_MOVABLE_NODE for PPC64 platform Anshuman Khandual
2016-10-24  4:42     ` Anshuman Khandual
2016-10-24  4:42   ` [DEBUG 09/10] drivers: Add two drivers for coherent device memory tests Anshuman Khandual
2016-10-24  4:42     ` Anshuman Khandual
2016-10-24  4:42   ` [DEBUG 10/10] test: Add a script to perform random VMA migrations across nodes Anshuman Khandual
2016-10-24  4:42     ` Anshuman Khandual
2016-10-24 17:09 ` [RFC 0/8] Define coherent device memory node Jerome Glisse
2016-10-24 17:09   ` Jerome Glisse
2016-10-25  4:26   ` Aneesh Kumar K.V
2016-10-25  4:26     ` Aneesh Kumar K.V
2016-10-25 15:16     ` Jerome Glisse
2016-10-25 15:16       ` Jerome Glisse
2016-10-26 11:09       ` Aneesh Kumar K.V
2016-10-26 11:09         ` Aneesh Kumar K.V
2016-10-26 16:07         ` Jerome Glisse
2016-10-26 16:07           ` Jerome Glisse
2016-10-28  5:29           ` Aneesh Kumar K.V
2016-10-28  5:29             ` Aneesh Kumar K.V
2016-10-28 16:16             ` Jerome Glisse
2016-10-28 16:16               ` Jerome Glisse
2016-11-05  5:21     ` Anshuman Khandual
2016-11-05  5:21       ` Anshuman Khandual
2016-11-05 18:02       ` Jerome Glisse
2016-11-05 18:02         ` Jerome Glisse
2016-10-25  4:59   ` Aneesh Kumar K.V
2016-10-25  4:59     ` Aneesh Kumar K.V
2016-10-25 15:32     ` Jerome Glisse
2016-10-25 15:32       ` Jerome Glisse
2016-10-25 17:31       ` Aneesh Kumar K.V
2016-10-25 17:31         ` Aneesh Kumar K.V
2016-10-25 18:52         ` Jerome Glisse
2016-10-25 18:52           ` Jerome Glisse
2016-10-26 11:13           ` Anshuman Khandual
2016-10-26 11:13             ` Anshuman Khandual
2016-10-26 16:02             ` Jerome Glisse
2016-10-26 16:02               ` Jerome Glisse
2016-10-27  4:38               ` Anshuman Khandual
2016-10-27  4:38                 ` Anshuman Khandual
2016-10-27  7:03                 ` Anshuman Khandual
2016-10-27  7:03                   ` Anshuman Khandual
2016-10-27 15:05                   ` Jerome Glisse
2016-10-27 15:05                     ` Jerome Glisse
2016-10-28  5:47                     ` Anshuman Khandual
2016-10-28  5:47                       ` Anshuman Khandual
2016-10-28 16:08                       ` Jerome Glisse
2016-10-28 16:08                         ` Jerome Glisse
2016-10-26 12:56           ` Anshuman Khandual
2016-10-26 12:56             ` Anshuman Khandual
2016-10-26 16:28             ` Jerome Glisse
2016-10-26 16:28               ` Jerome Glisse
2016-10-27 10:23               ` Balbir Singh
2016-10-27 10:23                 ` Balbir Singh
2016-10-25 12:07   ` Balbir Singh
2016-10-25 12:07     ` Balbir Singh
2016-10-25 15:21     ` Jerome Glisse
2016-10-25 15:21       ` Jerome Glisse
2016-10-24 18:04 ` Dave Hansen
2016-10-24 18:04   ` Dave Hansen
2016-10-24 18:32   ` David Nellans
2016-10-24 18:32     ` David Nellans
2016-10-24 19:36     ` Dave Hansen
2016-10-24 19:36       ` Dave Hansen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1479369549-13309-1-git-send-email-khandual@linux.vnet.ibm.com \
    --to=khandual@linux.vnet.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=aneesh.kumar@linux.vnet.ibm.com \
    --cc=bsingharora@gmail.com \
    --cc=js1304@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=mhocko@suse.com \
    --cc=minchan@kernel.org \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.