* [RFC 0/4] Define coherent device memory node
@ 2016-11-22 14:19 Anshuman Khandual
  2016-11-22 14:19 ` [RFC 1/4] mm: " Anshuman Khandual
                   ` (11 more replies)
  0 siblings, 12 replies; 20+ messages in thread
From: Anshuman Khandual @ 2016-11-22 14:19 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: mhocko, vbabka, mgorman, minchan, aneesh.kumar, bsingharora,
	srikar, haren, jglisse, dave.hansen

	There are certain devices like accelerators, GPU cards, network
cards, FPGA cards, PLD cards etc which might contain on board memory. This
on board memory can be coherent along with system RAM and may be accessible
from either the CPU or from the device. The coherency is usually achieved
through synchronizing the cache accesses from either side. This makes the
device memory appear in the same address space as that of the system RAM.
The on board device memory and system RAM are coherent but have differences
in their properties, as explained and elaborated below. The following
diagram shows how the coherent device memory appears in the memory address
space.

                +-----------------+         +-----------------+
                |                 |         |                 |
                |       CPU       |         |     DEVICE      |
                |                 |         |                 |
                +-----------------+         +-----------------+
                         |                           |
                         |   Shared Address Space    |
 +---------------------------------------------------------------------+
 |                                             |                       |
 |                                             |                       |
 |                 System RAM                  |     Coherent Memory   |
 |                                             |                       |
 |                                             |                       |
 +---------------------------------------------------------------------+

	User space applications might be interested in using the coherent
device memory either explicitly or implicitly along with the system RAM,
utilizing the basic semantics for memory allocation, access and release.
Basically the user applications should be able to allocate memory anywhere
(system RAM or coherent memory) and then have it accessed either from the
CPU or from the coherent device for various computation or data
transformation purposes. User space should not have to be concerned about
memory placement or about how the subsequent allocations are satisfied when
the memory faults on access.
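
	As a rough sketch of that model (device_submit(), dev and BUF_SIZE
below are hypothetical placeholders, not interfaces proposed by this
series), a user application would simply do:

	void *buf = malloc(BUF_SIZE);       /* may land on system RAM or coherent memory */
	memset(buf, 0, BUF_SIZE);           /* CPU access faults the pages in */
	device_submit(dev, buf, BUF_SIZE);  /* device computes on the same addresses */
	free(buf);                          /* normal release semantics */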

	To achieve seamless integration between system RAM and coherent
device memory, the coherent device memory must be able to utilize core
kernel memory features like anon mapping, file mapping, page cache, driver
managed pages, HW poisoning, migrations, reclaim, compaction, etc. Making
the coherent device memory appear as a distinct memory only NUMA node,
initialized like any other node with memory, creates this integration with
the currently available system RAM. At the same time there should be a
differentiating mark which indicates that this node is a coherent device
memory node and not just another memory only system RAM node.
 
	Coherent device memory invariably isn't available until the driver
for the device has been initialized. It is desirable but not required for
the device to support memory offlining for purposes such as power
management, link management and hardware errors. Kernel allocations should
not come from this memory as they cannot be moved out. Hence coherent
device memory should go into the ZONE_MOVABLE zone instead. This guarantees
that kernel allocations will never be satisfied from this memory and that
any process having un-movable pages on this coherent device memory (likely
achieved through pinning later on after the initial allocation) can be
killed to free up the memory from its page tables and eventually allow the
node to be hot plugged out.
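
	As an illustration of that flow, the debug hotplug driver at the end
of this series onlines the device memory ranges as movable roughly as shown
below (memory_probe_store() and store_mem_state() are helpers local to that
driver, wrapping the usual memory probe/state sysfs semantics):

	/* hot add the coherent memory range and online it into ZONE_MOVABLE */
	ret = memory_probe_store(start, size);
	if (!ret)
		ret = store_mem_state(start, size, "online_movable");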

	Even after being represented as a NUMA node, the coherent memory
might still need some special consideration inside the kernel. There can be
a variety of coherent device memory nodes with different expectations and
special consideration requirements from the core kernel. This RFC discusses
only one such scenario, where the coherent device memory requires just
isolation.

	Now let us consider in detail the case of a coherent device memory
node which requires isolation. This kind of coherent device memory is on
board an external device attached to the system through a link where there
is a chance of link errors plugging out the entire memory node with it.
Moreover the memory might also have a higher chance of ECC errors compared
to the system RAM. These are just some possibilities. But the fact remains
that the coherent device memory can have other differing properties which
might not be desirable for some user space applications. An application
should not be exposed to the risks of a device if it is not taking
advantage of the special features of that device and its memory.

	Because of the reasons explained above, allocations into an
isolation based coherent device memory node should be further regulated,
apart from the earlier requirement of kernel allocations not coming there.
User space allocations should not come here implicitly without the user
application explicitly knowing about it. This summarizes the isolation
requirement of a certain kind of coherent device memory node as an example.

	Some coherent memory devices may not require isolation altogether.
Then there might be other coherent memory devices which require some other
special treatment after becoming part of the core memory representation in
the kernel. Though the framework suggested by this RFC has made provisions
for them, it has not considered any kind of requirement other than
isolation for now.

	Though this RFC series currently attempts to implement one such
isolation seeking coherent device memory example, this framework can be
extended to accommodate any present or future coherent memory devices which
fit the requirements explained before, even ones with new requirements
other than isolation. In the case of an isolation seeking coherent device
memory node, there are other core VM code paths which need to be taken care
of before it can be completely isolated as required.

	Core kernel memory features like reclamation, evictions etc. might
need to be restricted or modified on the coherent device memory node as
they can be performance limiting. The RFC does not propose anything on this
yet but it can be looked into later on. For now it just disables Auto NUMA
for any VMA which has coherent device memory.

	Seamless integration of coherent device memory with system memory
will enable various other features, some of which can be listed as follows.

	a. Seamless migrations between system RAM and the coherent memory
	b. Asynchronous and high throughput migrations
	c. The ability to allocate huge order pages from these memory regions
	d. Restricting allocations to a large extent to the tasks using the
	   device for workload acceleration

	Before concluding, let us look into the reasons why the existing
solutions don't work. There are two basic requirements which have to be
satisfied before the coherent device memory can be integrated with the core
kernel seamlessly.

	a. The PFN must have a struct page
	b. The struct page must be able to be put on the standard LRU lists

	The above two basic requirements rule out the existing device
memory representation approaches listed below, which then creates the need
for a new framework.

(1) Traditional ioremap

	a. Memory is mapped into kernel (linear and virtual) and user space
	b. These PFNs do not have struct pages associated with them
	c. These special PFNs are marked with special flags inside the PTE
	d. Cannot participate in core VM functions much because of this
	e. Cannot do easy user space migrations

(2) Zone ZONE_DEVICE

	a. Memory is mapped into kernel and user space
	b. PFNs do have struct pages associated with them
	c. These struct pages are allocated inside the device's own memory range
	d. Unfortunately the struct page's union containing the LRU has been
	   used for the struct dev_pagemap pointer (see the layout below)
	e. Hence it cannot be part of any LRU (like Page cache)
	f. Hence file cached mapping cannot reside on these PFNs
	g. Cannot do easy migrations
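
	The conflict in item (d) under (2) above comes from these fields
sharing a union inside struct page; a simplified sketch of the relevant
part of include/linux/mm_types.h:

	struct page {
		/* ... other fields omitted ... */
		union {
			struct list_head lru;		/* needed for LRU lists / page cache */
			struct dev_pagemap *pgmap;	/* ZONE_DEVICE pages are never on an LRU */
		};
		/* ... */
	};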

	I had also explored a non LRU representation of this coherent device
memory where the integration with system RAM in the core VM is limited to
only the following functions. Not being inside the LRU is definitely going
to reduce the scope of tight integration with system RAM.

(1) Migration support between system RAM and coherent memory
(2) Migration support between various coherent memory nodes
(3) Isolation of the coherent memory
(4) Mapping the coherent memory into user space through driver's
    struct vm_operations
(5) HW poisoning of the coherent memory

	Allocating the entire memory of the coherent device node right
after hot plug into ZONE_MOVABLE (where the memory is already inside the
buddy system) will still expose a time window where other user space
allocations can come into the coherent device memory node and prevent the
intended isolation. So traditional hot plug is not the solution. Hence I
started looking into a CMA based non LRU solution but then hit the
following roadblocks.

(1) CMA does not support hot plugging of a new memory node
	a. A CMA area needs to be marked during boot before the buddy
	   allocator is initialized
	b. cma_alloc()/cma_release() can happen on the marked area
	c. We should be able to mark the CMA areas just after memory hot plug
	d. cma_alloc()/cma_release() can happen later after the hot plug
	e. This is not supported right now

(2) Mapped non LRU migration of pages
	a. Recent work from Minchan Kim makes non LRU pages migratable
	b. But it still does not support migration of mapped non LRU pages
	c. With non LRU CMA reserved, again there are some additional
	   challenges

	With hot pluggable CMA and non LRU mapped migration support there
may be an alternate approach to represent coherent device memory.

Changes compared to specialized zonelist rebuild approach
=========================================================
* Moved from specialized zonelist rebuilding to cpuset based isolation for
  the coherent device memory nodes
* Right now with this new approach, there is no explicit way of allocating
  into the coherent device memory nodes from user space, though it can be
  explored later on
* Changed the behaviour of __alloc_pages_nodemask() when both cpuset is
  enabled and the allocation request has __GFP_THISNODE flag
* Dropped the VMA flag VM_CDM and related auto NUMA changes
* Dropped migrate_virtual_range() function from the series and moved that
  into the DEBUG patches

The previous CDM RFC post is here: https://lkml.org/lkml/2016/10/24/19. The
current series has been tested to some extent for isolation purposes and to
see that __GFP_THISNODE based allocation works on the coherent device node.
I am wondering whether this approach or the previous one is better
positioned to represent coherent device memory in the kernel in a NUMA
visible manner.
Inputs, thoughts or suggestions on other alternate approaches are welcome.
Thank you.

Anshuman Khandual (4):
  mm: Define coherent device memory node
  mm/cpuset: Exclude coherent device memory nodes from mems_allowed
  mm/hugetlb: Restrict HugeTLB page allocations only to system ram nodemask
  mm: Ignore cpuset enforcement when allocation flag has __GFP_THISNODE

 Documentation/ABI/stable/sysfs-devices-node |  7 +++++++
 arch/powerpc/Kconfig                        |  1 +
 arch/powerpc/mm/numa.c                      |  7 +++++++
 drivers/base/node.c                         |  6 ++++++
 include/linux/mm.h                          |  1 +
 include/linux/node.h                        | 18 ++++++++++++++++
 include/linux/nodemask.h                    |  3 +++
 kernel/cpuset.c                             | 12 ++++++-----
 mm/Kconfig                                  |  5 +++++
 mm/hugetlb.c                                | 32 +++++++++++++++++++++--------
 mm/memory_hotplug.c                         | 10 +++++++++
 mm/page_alloc.c                             |  2 +-
 12 files changed, 89 insertions(+), 15 deletions(-)

-- 
1.8.3.1

* [RFC 1/4] mm: Define coherent device memory node
  2016-11-22 14:19 [RFC 0/4] Define coherent device memory node Anshuman Khandual
@ 2016-11-22 14:19 ` Anshuman Khandual
  2016-11-29 17:57   ` Dave Hansen
  2016-11-22 14:19 ` [RFC 2/4] mm/cpuset: Exclude coherent device memory nodes from mems_allowed Anshuman Khandual
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 20+ messages in thread
From: Anshuman Khandual @ 2016-11-22 14:19 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: mhocko, vbabka, mgorman, minchan, aneesh.kumar, bsingharora,
	srikar, haren, jglisse, dave.hansen

There are certain devices like specialized accelerators, GPU cards, network
cards, FPGA cards etc which might contain onboard memory which is coherent
along with the existing system RAM while being accessed either from the CPU
or from the device. They share some properties with normal system RAM but
at the same time can also differ from it in other respects.

User applications might be interested in using this kind of coherent device
memory explicitly or implicitly alongside the system RAM, utilizing all
possible core memory functions like anon mapping (LRU), file mapping (LRU),
page cache (LRU), driver managed (non LRU), HW poisoning, NUMA migrations
etc. To achieve this kind of tight integration with the core memory
subsystem, the device onboard coherent memory must be represented as a
memory only NUMA node. At the same time the arch must export some kind of a
function to identify this node as coherent device memory and not just
another regular cpu less memory only NUMA node.

After achieving the integration with the core memory subsystem, coherent
device memory might still need some special consideration inside the
kernel. There can be a variety of coherent memory nodes with different
expectations from the core kernel memory. But right now only one kind of
special treatment is considered, which requires certain isolation.

Now consider the case of a coherent device memory node type which requires
isolation. This kind of coherent memory is onboard an external device
attached to the system through a link where there is always a chance of a
link failure taking down the entire memory node with it. Moreover the
memory might also have a higher chance of ECC failure compared to the
system RAM. Hence allocations into this kind of coherent memory node should
be regulated. Kernel allocations must not come here. Normal user space
allocations too should not come here implicitly (without the user
application knowing about it). This summarizes the isolation requirement of
a certain kind of coherent device memory node as an example. There can be
different kinds of isolation requirements as well.

Some coherent memory devices might not require isolation altogether. Then
there might be other coherent memory devices which might require some other
special treatment after becoming part of the core memory representation.
For now, only the isolation seeking coherent device memory nodes are
considered, not the other ones.

To implement the integration as well as the isolation, the coherent memory
node must be present in N_MEMORY and in a new N_COHERENT_DEVICE node mask
inside the node_states[] array. During memory hotplug operations, the new
nodemask N_COHERENT_DEVICE is updated along with N_MEMORY for these
coherent device memory nodes. This also creates the following new sysfs
based interface to list all the coherent memory nodes of the system.

	/sys/devices/system/node/is_coherent_device

Architectures must export function arch_check_node_cdm() which identifies
any coherent device memory node in case they enable CONFIG_COHERENT_DEVICE.
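
For reference, the powerpc DEBUG patch later in this series implements this
hook by marking nodes in a node_to_phys_device_map[] array while parsing the
device tree; a trimmed sketch of that implementation:

	static int node_to_phys_device_map[MAX_NUMNODES];

	#ifdef CONFIG_COHERENT_DEVICE
	int arch_check_node_cdm(int nid)
	{
		/* non-zero when nid was marked as a device memory node at init */
		return node_to_phys_device_map[nid];
	}
	#endif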
---
 Documentation/ABI/stable/sysfs-devices-node |  7 +++++++
 arch/powerpc/Kconfig                        |  1 +
 arch/powerpc/mm/numa.c                      |  7 +++++++
 drivers/base/node.c                         |  6 ++++++
 include/linux/node.h                        |  6 ++++++
 include/linux/nodemask.h                    |  3 +++
 mm/Kconfig                                  |  5 +++++
 mm/memory_hotplug.c                         | 10 ++++++++++
 8 files changed, 45 insertions(+)

diff --git a/Documentation/ABI/stable/sysfs-devices-node b/Documentation/ABI/stable/sysfs-devices-node
index 5b2d0f0..6f039a4 100644
--- a/Documentation/ABI/stable/sysfs-devices-node
+++ b/Documentation/ABI/stable/sysfs-devices-node
@@ -29,6 +29,13 @@ Description:
 		Nodes that have regular or high memory.
 		Depends on CONFIG_HIGHMEM.
 
+What:		/sys/devices/system/node/is_coherent_device
+Date:		November 2016
+Contact:	Linux Memory Management list <linux-mm@kvack.org>
+Description:
+		Lists the nodemask of nodes that have coherent device memory.
+		Depends on CONFIG_COHERENT_DEVICE.
+
 What:		/sys/devices/system/node/nodeX
 Date:		October 2002
 Contact:	Linux Memory Management list <linux-mm@kvack.org>
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 65fba4c..81bf679 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -162,6 +162,7 @@ config PPC
 	select HAVE_VIRT_CPU_ACCOUNTING
 	select HAVE_ARCH_HARDENED_USERCOPY
 	select HAVE_KERNEL_GZIP
+	select COHERENT_DEVICE if PPC64 && CPUSETS
 
 config GENERIC_CSUM
 	def_bool CPU_LITTLE_ENDIAN
diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index a51c188..31efc27 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -41,6 +41,13 @@
 #include <asm/setup.h>
 #include <asm/vdso.h>
 
+#ifdef CONFIG_COHERENT_DEVICE
+int arch_check_node_cdm(int nid)
+{
+	return 0;
+}
+#endif
+
 static int numa_enabled = 1;
 
 static char *cmdline __initdata;
diff --git a/drivers/base/node.c b/drivers/base/node.c
index 5548f96..5b5dd89 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -661,6 +661,9 @@ static ssize_t show_node_state(struct device *dev,
 	[N_MEMORY] = _NODE_ATTR(has_memory, N_MEMORY),
 #endif
 	[N_CPU] = _NODE_ATTR(has_cpu, N_CPU),
+#ifdef CONFIG_COHERENT_DEVICE
+	[N_COHERENT_DEVICE] = _NODE_ATTR(is_coherent_device, N_COHERENT_DEVICE),
+#endif
 };
 
 static struct attribute *node_state_attrs[] = {
@@ -674,6 +677,9 @@ static ssize_t show_node_state(struct device *dev,
 	&node_state_attr[N_MEMORY].attr.attr,
 #endif
 	&node_state_attr[N_CPU].attr.attr,
+#ifdef CONFIG_COHERENT_DEVICE
+	&node_state_attr[N_COHERENT_DEVICE].attr.attr,
+#endif
 	NULL
 };
 
diff --git a/include/linux/node.h b/include/linux/node.h
index 2115ad5..fc319de 100644
--- a/include/linux/node.h
+++ b/include/linux/node.h
@@ -81,4 +81,10 @@ static inline void register_hugetlbfs_with_node(node_registration_func_t reg,
 
 #define to_node(device) container_of(device, struct node, dev)
 
+#ifdef CONFIG_COHERENT_DEVICE
+extern int arch_check_node_cdm(int nid);
+#else
+static inline int arch_check_node_cdm(int nid) {return 0;}
+#endif
+
 #endif /* _LINUX_NODE_H_ */
diff --git a/include/linux/nodemask.h b/include/linux/nodemask.h
index f746e44..6e66cfd 100644
--- a/include/linux/nodemask.h
+++ b/include/linux/nodemask.h
@@ -393,6 +393,9 @@ enum node_states {
 	N_MEMORY = N_HIGH_MEMORY,
 #endif
 	N_CPU,		/* The node has one or more cpus */
+#ifdef CONFIG_COHERENT_DEVICE
+	N_COHERENT_DEVICE,
+#endif
 	NR_NODE_STATES
 };
 
diff --git a/mm/Kconfig b/mm/Kconfig
index 86e3e0e..546dc69 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -143,6 +143,11 @@ config HAVE_GENERIC_RCU_GUP
 config ARCH_DISCARD_MEMBLOCK
 	bool
 
+config COHERENT_DEVICE
+	bool
+	depends on CPUSETS
+	default n
+
 config NO_BOOTMEM
 	bool
 
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index cad4b91..269af7c 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1030,6 +1030,11 @@ static void node_states_set_node(int node, struct memory_notify *arg)
 	if (arg->status_change_nid_high >= 0)
 		node_set_state(node, N_HIGH_MEMORY);
 
+#ifdef CONFIG_COHERENT_DEVICE
+	if (arch_check_node_cdm(node))
+		node_set_state(node, N_COHERENT_DEVICE);
+#endif
+
 	node_set_state(node, N_MEMORY);
 }
 
@@ -1844,6 +1849,11 @@ static void node_states_clear_node(int node, struct memory_notify *arg)
 	if ((N_MEMORY != N_HIGH_MEMORY) &&
 	    (arg->status_change_nid >= 0))
 		node_clear_state(node, N_MEMORY);
+
+#ifdef CONFIG_COHERENT_DEVICE
+	if (arch_check_node_cdm(node))
+		node_clear_state(node, N_COHERENT_DEVICE);
+#endif
 }
 
 static int __ref __offline_pages(unsigned long start_pfn,
-- 
1.8.3.1

* [RFC 2/4] mm/cpuset: Exclude coherent device memory nodes from mems_allowed
  2016-11-22 14:19 [RFC 0/4] Define coherent device memory node Anshuman Khandual
  2016-11-22 14:19 ` [RFC 1/4] mm: " Anshuman Khandual
@ 2016-11-22 14:19 ` Anshuman Khandual
  2016-11-22 14:19 ` [RFC 3/4] mm/hugetlb: Restrict HugeTLB page allocations only to system ram nodemask Anshuman Khandual
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: Anshuman Khandual @ 2016-11-22 14:19 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: mhocko, vbabka, mgorman, minchan, aneesh.kumar, bsingharora,
	srikar, haren, jglisse, dave.hansen

A task's mems_allowed decides the final node mask of nodes from which memory
can be allocated, irrespective of the process or VMA based memory policy.
Coherent device memory nodes should not be used for any user space memory
allocation, hence they should not be part of any mems_allowed mask in user
space to begin with. This adds a new helper ram_nodemask() which computes
the system RAM only node mask by excluding all the coherent memory nodes on
the platform. For example, on a system where nodes 0-1 carry system RAM and
node 2 is coherent device memory, this mask contains only nodes 0-1 while
node_states[N_MEMORY] contains all three. The resulting system RAM node mask
is used instead of the N_MEMORY node mask during cpuset updates and
mems_allowed initialization. This achieves isolation of the coherent device
memory nodes from user space allocations.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 include/linux/mm.h   |  1 +
 include/linux/node.h | 12 ++++++++++++
 kernel/cpuset.c      | 12 +++++++-----
 3 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index a92c8d7..c40b454 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -446,6 +446,7 @@ static inline int put_page_testzero(struct page *page)
 	return page_ref_dec_and_test(page);
 }
 
+
 /*
  * Try to grab a ref unless the page has a refcount of zero, return false if
  * that is the case.
diff --git a/include/linux/node.h b/include/linux/node.h
index fc319de..99978f9 100644
--- a/include/linux/node.h
+++ b/include/linux/node.h
@@ -87,4 +87,16 @@ static inline void register_hugetlbfs_with_node(node_registration_func_t reg,
 static inline int arch_check_node_cdm(int nid) {return 0;}
 #endif
 
+static inline nodemask_t ram_nodemask(void)
+{
+#ifdef CONFIG_COHERENT_DEVICE
+	nodemask_t ram_nodes;
+
+	nodes_clear(ram_nodes);
+	nodes_andnot(ram_nodes, node_states[N_MEMORY], node_states[N_COHERENT_DEVICE]);
+	return ram_nodes;
+#else
+	return node_states[N_MEMORY];
+#endif
+}
 #endif /* _LINUX_NODE_H_ */
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 29f815d..bdbe847 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -364,9 +364,11 @@ static void guarantee_online_cpus(struct cpuset *cs, struct cpumask *pmask)
  */
 static void guarantee_online_mems(struct cpuset *cs, nodemask_t *pmask)
 {
-	while (!nodes_intersects(cs->effective_mems, node_states[N_MEMORY]))
+	nodemask_t ram_nodes = ram_nodemask();
+
+	while (!nodes_intersects(cs->effective_mems, ram_nodes))
 		cs = parent_cs(cs);
-	nodes_and(*pmask, cs->effective_mems, node_states[N_MEMORY]);
+	nodes_and(*pmask, cs->effective_mems, ram_nodes);
 }
 
 /*
@@ -2301,7 +2303,7 @@ static void cpuset_hotplug_workfn(struct work_struct *work)
 
 	/* fetch the available cpus/mems and find out which changed how */
 	cpumask_copy(&new_cpus, cpu_active_mask);
-	new_mems = node_states[N_MEMORY];
+	new_mems = ram_nodemask();
 
 	cpus_updated = !cpumask_equal(top_cpuset.effective_cpus, &new_cpus);
 	mems_updated = !nodes_equal(top_cpuset.effective_mems, new_mems);
@@ -2393,11 +2395,11 @@ static int cpuset_track_online_nodes(struct notifier_block *self,
 void __init cpuset_init_smp(void)
 {
 	cpumask_copy(top_cpuset.cpus_allowed, cpu_active_mask);
-	top_cpuset.mems_allowed = node_states[N_MEMORY];
+	top_cpuset.mems_allowed = ram_nodemask();
 	top_cpuset.old_mems_allowed = top_cpuset.mems_allowed;
 
 	cpumask_copy(top_cpuset.effective_cpus, cpu_active_mask);
-	top_cpuset.effective_mems = node_states[N_MEMORY];
+	top_cpuset.effective_mems = ram_nodemask();
 
 	register_hotmemory_notifier(&cpuset_track_online_nodes_nb);
 
-- 
1.8.3.1

* [RFC 3/4] mm/hugetlb: Restrict HugeTLB page allocations only to system ram nodemask
  2016-11-22 14:19 [RFC 0/4] Define coherent device memory node Anshuman Khandual
  2016-11-22 14:19 ` [RFC 1/4] mm: " Anshuman Khandual
  2016-11-22 14:19 ` [RFC 2/4] mm/cpuset: Exclude coherent device memory nodes from mems_allowed Anshuman Khandual
@ 2016-11-22 14:19 ` Anshuman Khandual
  2016-11-22 14:19 ` [RFC 4/4] mm: Ignore cpuset enforcement when allocation flag has __GFP_THISNODE Anshuman Khandual
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: Anshuman Khandual @ 2016-11-22 14:19 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: mhocko, vbabka, mgorman, minchan, aneesh.kumar, bsingharora,
	srikar, haren, jglisse, dave.hansen

HugeTLB allocation/release/accounting currently spans across all the nodes
in the N_MEMORY node mask. Coherent memory nodes should not be part of these
allocations. So use the ram_nodemask() helper to fetch the system RAM only
nodes on the platform, which can then be used for HugeTLB allocation
purposes instead of the N_MEMORY node mask. This isolates coherent device
memory nodes from HugeTLB allocations.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 mm/hugetlb.c | 32 +++++++++++++++++++++++---------
 1 file changed, 23 insertions(+), 9 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 418bf01..f7236e1 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1782,6 +1782,9 @@ static void return_unused_surplus_pages(struct hstate *h,
 					unsigned long unused_resv_pages)
 {
 	unsigned long nr_pages;
+	nodemask_t nodes;
+
+	nodes = ram_nodemask();
 
 	/* Uncommit the reservation */
 	h->resv_huge_pages -= unused_resv_pages;
@@ -1801,7 +1804,7 @@ static void return_unused_surplus_pages(struct hstate *h,
 	 * on-line nodes with memory and will handle the hstate accounting.
 	 */
 	while (nr_pages--) {
-		if (!free_pool_huge_page(h, &node_states[N_MEMORY], 1))
+		if (!free_pool_huge_page(h, &nodes, 1))
 			break;
 		cond_resched_lock(&hugetlb_lock);
 	}
@@ -2088,8 +2091,10 @@ int __weak alloc_bootmem_huge_page(struct hstate *h)
 {
 	struct huge_bootmem_page *m;
 	int nr_nodes, node;
+	nodemask_t nodes;
 
-	for_each_node_mask_to_alloc(h, nr_nodes, node, &node_states[N_MEMORY]) {
+	nodes = ram_nodemask();
+	for_each_node_mask_to_alloc(h, nr_nodes, node, &nodes) {
 		void *addr;
 
 		addr = memblock_virt_alloc_try_nid_nopanic(
@@ -2158,13 +2163,15 @@ static void __init gather_bootmem_prealloc(void)
 static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 {
 	unsigned long i;
+	nodemask_t nodes;
+
 
+	nodes = ram_nodemask();
 	for (i = 0; i < h->max_huge_pages; ++i) {
 		if (hstate_is_gigantic(h)) {
 			if (!alloc_bootmem_huge_page(h))
 				break;
-		} else if (!alloc_fresh_huge_page(h,
-					 &node_states[N_MEMORY]))
+		} else if (!alloc_fresh_huge_page(h, &nodes))
 			break;
 	}
 	h->max_huge_pages = i;
@@ -2401,8 +2408,11 @@ static ssize_t __nr_hugepages_store_common(bool obey_mempolicy,
 					   unsigned long count, size_t len)
 {
 	int err;
+	nodemask_t ram_nodes;
+
 	NODEMASK_ALLOC(nodemask_t, nodes_allowed, GFP_KERNEL | __GFP_NORETRY);
 
+	ram_nodes = ram_nodemask();
 	if (hstate_is_gigantic(h) && !gigantic_page_supported()) {
 		err = -EINVAL;
 		goto out;
@@ -2415,7 +2425,7 @@ static ssize_t __nr_hugepages_store_common(bool obey_mempolicy,
 		if (!(obey_mempolicy &&
 				init_nodemask_of_mempolicy(nodes_allowed))) {
 			NODEMASK_FREE(nodes_allowed);
-			nodes_allowed = &node_states[N_MEMORY];
+			nodes_allowed = &ram_nodes;
 		}
 	} else if (nodes_allowed) {
 		/*
@@ -2425,11 +2435,11 @@ static ssize_t __nr_hugepages_store_common(bool obey_mempolicy,
 		count += h->nr_huge_pages - h->nr_huge_pages_node[nid];
 		init_nodemask_of_node(nodes_allowed, nid);
 	} else
-		nodes_allowed = &node_states[N_MEMORY];
+		nodes_allowed = &ram_nodes;
 
 	h->max_huge_pages = set_max_huge_pages(h, count, nodes_allowed);
 
-	if (nodes_allowed != &node_states[N_MEMORY])
+	if (nodes_allowed != &ram_nodes)
 		NODEMASK_FREE(nodes_allowed);
 
 	return len;
@@ -2726,9 +2736,11 @@ static void hugetlb_register_node(struct node *node)
  */
 static void __init hugetlb_register_all_nodes(void)
 {
+	nodemask_t nodes;
 	int nid;
 
-	for_each_node_state(nid, N_MEMORY) {
+	nodes = ram_nodemask();
+	for_each_node_mask(nid, nodes) {
 		struct node *node = node_devices[nid];
 		if (node->dev.id == nid)
 			hugetlb_register_node(node);
@@ -2998,13 +3010,15 @@ int hugetlb_report_node_meminfo(int nid, char *buf)
 
 void hugetlb_show_meminfo(void)
 {
+	nodemask_t nodes;
 	struct hstate *h;
 	int nid;
 
 	if (!hugepages_supported())
 		return;
 
-	for_each_node_state(nid, N_MEMORY)
+	nodes = ram_nodemask();
+	for_each_node_mask(nid, nodes)
 		for_each_hstate(h)
 			pr_info("Node %d hugepages_total=%u hugepages_free=%u hugepages_surp=%u hugepages_size=%lukB\n",
 				nid,
-- 
1.8.3.1

* [RFC 4/4] mm: Ignore cpuset enforcement when allocation flag has __GFP_THISNODE
  2016-11-22 14:19 [RFC 0/4] Define coherent device memory node Anshuman Khandual
                   ` (2 preceding siblings ...)
  2016-11-22 14:19 ` [RFC 3/4] mm/hugetlb: Restrict HugeTLB page allocations only to system ram nodemask Anshuman Khandual
@ 2016-11-22 14:19 ` Anshuman Khandual
  2016-11-28 21:12   ` Dave Hansen
  2016-11-22 14:19 ` [DEBUG 05/12] powerpc/mm: Identify coherent device memory nodes during platform init Anshuman Khandual
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 20+ messages in thread
From: Anshuman Khandual @ 2016-11-22 14:19 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: mhocko, vbabka, mgorman, minchan, aneesh.kumar, bsingharora,
	srikar, haren, jglisse, dave.hansen

__GFP_THISNODE specifically asks for the memory to be allocated from the
given node. Not all the requests that end up in __alloc_pages_nodemask()
originate from process context, where cpuset enforcement makes more sense.
The current condition enforces the cpuset limitation on every allocation,
whether it originated from process context or not, which prevents
__GFP_THISNODE mandated allocations from coming from the specified node. In
the context of the coherent device memory node, which is isolated from all
cpuset nodemasks in the system, this blocks the only way of allocating into
it, which is what gets changed with this patch.
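
As an example of the allocation path this enables, the demo driver later in
the series allocates from the coherent device node in its fault handler
roughly as below (cdm_nid here stands for the coherent device node id):

	/* allocate a movable page strictly from the coherent device node */
	page = alloc_pages_node(cdm_nid, GFP_HIGHUSER_MOVABLE | __GFP_THISNODE, 0);
	if (!page)
		return VM_FAULT_SIGBUS;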

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 mm/page_alloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6de9440..1697e21 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3715,7 +3715,7 @@ struct page *
 		.migratetype = gfpflags_to_migratetype(gfp_mask),
 	};
 
-	if (cpusets_enabled()) {
+	if (cpusets_enabled() && !(alloc_mask & __GFP_THISNODE)) {
 		alloc_mask |= __GFP_HARDWALL;
 		alloc_flags |= ALLOC_CPUSET;
 		if (!ac.nodemask)
-- 
1.8.3.1

* [DEBUG 05/12] powerpc/mm: Identify coherent device memory nodes during platform init
  2016-11-22 14:19 [RFC 0/4] Define coherent device memory node Anshuman Khandual
                   ` (3 preceding siblings ...)
  2016-11-22 14:19 ` [RFC 4/4] mm: Ignore cpuset enforcement when allocation flag has __GFP_THISNODE Anshuman Khandual
@ 2016-11-22 14:19 ` Anshuman Khandual
  2016-11-22 14:19 ` [DEBUG 06/12] powerpc/mm: Create numa nodes for hotplug memory Anshuman Khandual
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: Anshuman Khandual @ 2016-11-22 14:19 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: mhocko, vbabka, mgorman, minchan, aneesh.kumar, bsingharora,
	srikar, haren, jglisse, dave.hansen

Coherent device memory nodes will have "ibm,hotplug-aperture" as one of the
compatible properties in their respective device nodes in the device tree.
Detect them early during NUMA platform initialization and mark them as such
in the node_to_phys_device_map[] array, which in turn is used to support the
arch_check_node_cdm() function for the core VM.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 arch/powerpc/mm/numa.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 31efc27..b625e0e 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -41,10 +41,12 @@
 #include <asm/setup.h>
 #include <asm/vdso.h>
 
+static int node_to_phys_device_map[MAX_NUMNODES];
+
 #ifdef CONFIG_COHERENT_DEVICE
 int arch_check_node_cdm(int nid)
 {
-	return 0;
+	return node_to_phys_device_map[nid];
 }
 #endif
 
@@ -790,6 +792,9 @@ static int __init parse_numa_properties(void)
 		if (nid < 0)
 			nid = default_nid;
 
+		if (of_device_is_compatible(memory, "ibm,hotplug-aperture"))
+			node_to_phys_device_map[nid] = 1;
+
 		fake_numa_create_new_node(((start + size) >> PAGE_SHIFT), &nid);
 		node_set_online(nid);
 
-- 
1.8.3.1

* [DEBUG 06/12] powerpc/mm: Create numa nodes for hotplug memory
  2016-11-22 14:19 [RFC 0/4] Define coherent device memory node Anshuman Khandual
                   ` (4 preceding siblings ...)
  2016-11-22 14:19 ` [DEBUG 05/12] powerpc/mm: Identify coherent device memory nodes during platform init Anshuman Khandual
@ 2016-11-22 14:19 ` Anshuman Khandual
  2016-11-22 14:19 ` [DEBUG 07/12] powerpc/mm: Allow memory hotplug into a memory less node Anshuman Khandual
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: Anshuman Khandual @ 2016-11-22 14:19 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: mhocko, vbabka, mgorman, minchan, aneesh.kumar, bsingharora,
	srikar, haren, jglisse, dave.hansen

From: Reza Arbab <arbab@linux.vnet.ibm.com>

When scanning the device tree to initialize the system NUMA topology,
process device tree elements with the compatible id "ibm,hotplug-aperture"
to create memoryless NUMA nodes.

These nodes will be filled when hotplug occurs within the associated
address range.

Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 arch/powerpc/mm/numa.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index b625e0e..e4cb4e62 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -717,6 +717,12 @@ static void __init parse_drconf_memory(struct device_node *memory)
 	}
 }
 
+static const struct of_device_id memory_match[] = {
+	{ .type = "memory" },
+	{ .compatible = "ibm,hotplug-aperture" },
+	{ /* sentinel */ }
+};
+
 static int __init parse_numa_properties(void)
 {
 	struct device_node *memory;
@@ -761,7 +767,7 @@ static int __init parse_numa_properties(void)
 
 	get_n_mem_cells(&n_mem_addr_cells, &n_mem_size_cells);
 
-	for_each_node_by_type(memory, "memory") {
+	for_each_matching_node(memory, memory_match) {
 		unsigned long start;
 		unsigned long size;
 		int nid;
@@ -1056,7 +1062,7 @@ static int hot_add_node_scn_to_nid(unsigned long scn_addr)
 	struct device_node *memory;
 	int nid = -1;
 
-	for_each_node_by_type(memory, "memory") {
+	for_each_matching_node(memory, memory_match) {
 		unsigned long start, size;
 		int ranges;
 		const __be32 *memcell_buf;
-- 
1.8.3.1

* [DEBUG 07/12] powerpc/mm: Allow memory hotplug into a memory less node
  2016-11-22 14:19 [RFC 0/4] Define coherent device memory node Anshuman Khandual
                   ` (5 preceding siblings ...)
  2016-11-22 14:19 ` [DEBUG 06/12] powerpc/mm: Create numa nodes for hotplug memory Anshuman Khandual
@ 2016-11-22 14:19 ` Anshuman Khandual
  2016-11-22 14:19 ` [DEBUG 08/12] mm: Enable CONFIG_MOVABLE_NODE on powerpc Anshuman Khandual
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: Anshuman Khandual @ 2016-11-22 14:19 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: mhocko, vbabka, mgorman, minchan, aneesh.kumar, bsingharora,
	srikar, haren, jglisse, dave.hansen

From: Reza Arbab <arbab@linux.vnet.ibm.com>

Remove the check which prevents us from hotplugging into an empty node.

Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 arch/powerpc/mm/numa.c | 13 +------------
 1 file changed, 1 insertion(+), 12 deletions(-)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index e4cb4e62..4086ff7 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -1103,7 +1103,7 @@ static int hot_add_node_scn_to_nid(unsigned long scn_addr)
 int hot_add_scn_to_nid(unsigned long scn_addr)
 {
 	struct device_node *memory = NULL;
-	int nid, found = 0;
+	int nid;
 
 	if (!numa_enabled || (min_common_depth < 0))
 		return first_online_node;
@@ -1119,17 +1119,6 @@ int hot_add_scn_to_nid(unsigned long scn_addr)
 	if (nid < 0 || !node_online(nid))
 		nid = first_online_node;
 
-	if (NODE_DATA(nid)->node_spanned_pages)
-		return nid;
-
-	for_each_online_node(nid) {
-		if (NODE_DATA(nid)->node_spanned_pages) {
-			found = 1;
-			break;
-		}
-	}
-
-	BUG_ON(!found);
 	return nid;
 }
 
-- 
1.8.3.1

* [DEBUG 08/12] mm: Enable CONFIG_MOVABLE_NODE on powerpc
  2016-11-22 14:19 [RFC 0/4] Define coherent device memory node Anshuman Khandual
                   ` (6 preceding siblings ...)
  2016-11-22 14:19 ` [DEBUG 07/12] powerpc/mm: Allow memory hotplug into a memory less node Anshuman Khandual
@ 2016-11-22 14:19 ` Anshuman Khandual
  2016-11-22 14:19 ` [DEBUG 09/12] powerpc: Enable CONFIG_MOVABLE_NODE for PPC64 platform Anshuman Khandual
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: Anshuman Khandual @ 2016-11-22 14:19 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: mhocko, vbabka, mgorman, minchan, aneesh.kumar, bsingharora,
	srikar, haren, jglisse, dave.hansen

From: Reza Arbab <arbab@linux.vnet.ibm.com>

Onlining memory into ZONE_MOVABLE requires CONFIG_MOVABLE_NODE.

Enable the use of this config option on PPC64 platforms.

Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 Documentation/kernel-parameters.txt | 2 +-
 mm/Kconfig                          | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 37babf9..61cfa0b 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2401,7 +2401,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			that the amount of memory usable for all allocations
 			is not too small.
 
-	movable_node	[KNL,X86] Boot-time switch to enable the effects
+	movable_node	[KNL,X86,PPC] Boot-time switch to enable the effects
 			of CONFIG_MOVABLE_NODE=y. See mm/Kconfig for details.
 
 	MTD_Partition=	[MTD]
diff --git a/mm/Kconfig b/mm/Kconfig
index 546dc69..dd0ac83 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -158,7 +158,7 @@ config MOVABLE_NODE
 	bool "Enable to assign a node which has only movable memory"
 	depends on HAVE_MEMBLOCK
 	depends on NO_BOOTMEM
-	depends on X86_64
+	depends on X86_64 || PPC64
 	depends on NUMA
 	default n
 	help
-- 
1.8.3.1

* [DEBUG 09/12] powerpc: Enable CONFIG_MOVABLE_NODE for PPC64 platform
  2016-11-22 14:19 [RFC 0/4] Define coherent device memory node Anshuman Khandual
                   ` (7 preceding siblings ...)
  2016-11-22 14:19 ` [DEBUG 08/12] mm: Enable CONFIG_MOVABLE_NODE on powerpc Anshuman Khandual
@ 2016-11-22 14:19 ` Anshuman Khandual
  2016-11-22 14:19 ` [DEBUG 10/12] mm: Add a new migration function migrate_virtual_range() Anshuman Khandual
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: Anshuman Khandual @ 2016-11-22 14:19 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: mhocko, vbabka, mgorman, minchan, aneesh.kumar, bsingharora,
	srikar, haren, jglisse, dave.hansen

Just enable MOVABLE_NODE config option for PPC64 platform by default.
This prevents accidentally building the kernel without the required
config option.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 arch/powerpc/Kconfig | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 81bf679..c2ed822 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -311,6 +311,10 @@ config PGTABLE_LEVELS
 	default 3 if PPC_64K_PAGES && !PPC_BOOK3S_64
 	default 4
 
+config MOVABLE_NODE
+	bool
+	default y if PPC64
+
 source "init/Kconfig"
 
 source "kernel/Kconfig.freezer"
-- 
1.8.3.1

* [DEBUG 10/12] mm: Add a new migration function migrate_virtual_range()
  2016-11-22 14:19 [RFC 0/4] Define coherent device memory node Anshuman Khandual
                   ` (8 preceding siblings ...)
  2016-11-22 14:19 ` [DEBUG 09/12] powerpc: Enable CONFIG_MOVABLE_NODE for PPC64 platform Anshuman Khandual
@ 2016-11-22 14:19 ` Anshuman Khandual
  2016-11-22 14:19 ` [DEBUG 11/12] drivers: Add two drivers for coherent device memory tests Anshuman Khandual
  2016-11-22 14:19 ` [DEBUG 12/12] test: Add a script to perform random VMA migrations across nodes Anshuman Khandual
  11 siblings, 0 replies; 20+ messages in thread
From: Anshuman Khandual @ 2016-11-22 14:19 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: mhocko, vbabka, mgorman, minchan, aneesh.kumar, bsingharora,
	srikar, haren, jglisse, dave.hansen

This adds a new virtual address range based migration interface which
can migrate all the mapped pages from a virtual range of a process to
a destination node. This also exports this new function symbol.
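
For example, the demo driver later in the series uses it to move all the
pages mapped by a task's VMA over to a chosen target node (nid):

	/* migrate every mapped page in [vma->vm_start, vma->vm_end) to nid */
	ret = migrate_virtual_range(current->pid, vma->vm_start, vma->vm_end, nid);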

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 include/linux/mempolicy.h |  7 +++++
 include/linux/migrate.h   |  3 ++
 mm/mempolicy.c            |  7 ++---
 mm/migrate.c              | 71 +++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 83 insertions(+), 5 deletions(-)

diff --git a/include/linux/mempolicy.h b/include/linux/mempolicy.h
index 5e5b296..c2b4a18 100644
--- a/include/linux/mempolicy.h
+++ b/include/linux/mempolicy.h
@@ -152,6 +152,9 @@ extern struct zonelist *huge_zonelist(struct vm_area_struct *vma,
 extern bool mempolicy_nodemask_intersects(struct task_struct *tsk,
 				const nodemask_t *mask);
 extern unsigned int mempolicy_slab_node(void);
+extern int queue_pages_range(struct mm_struct *mm, unsigned long start,
+			unsigned long end, nodemask_t *nodes,
+			unsigned long flags, struct list_head *pagelist);
 
 extern enum zone_type policy_zone;
 
@@ -302,4 +305,8 @@ static inline void mpol_put_task_policy(struct task_struct *task)
 {
 }
 #endif /* CONFIG_NUMA */
+
+#define MPOL_MF_DISCONTIG_OK (MPOL_MF_INTERNAL << 0)	/* Skip checks for continuous vmas */
+#define MPOL_MF_INVERT (MPOL_MF_INTERNAL << 1)		/* Invert check for nodemask */
+
 #endif
diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index ae8d475..e2a1af5 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -49,6 +49,9 @@ extern int migrate_page_move_mapping(struct address_space *mapping,
 		struct page *newpage, struct page *page,
 		struct buffer_head *head, enum migrate_mode mode,
 		int extra_count);
+
+extern int migrate_virtual_range(int pid, unsigned long start,
+				unsigned long end, int nid);
 #else
 
 static inline void putback_movable_pages(struct list_head *l) {}
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 0b859af..728347a 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -100,10 +100,6 @@
 
 #include "internal.h"
 
-/* Internal flags */
-#define MPOL_MF_DISCONTIG_OK (MPOL_MF_INTERNAL << 0)	/* Skip checks for continuous vmas */
-#define MPOL_MF_INVERT (MPOL_MF_INTERNAL << 1)		/* Invert check for nodemask */
-
 static struct kmem_cache *policy_cache;
 static struct kmem_cache *sn_cache;
 
@@ -662,7 +658,7 @@ static int queue_pages_test_walk(unsigned long start, unsigned long end,
  * @nodes and @flags,) it's isolated and queued to the pagelist which is
  * passed via @private.)
  */
-static int
+int
 queue_pages_range(struct mm_struct *mm, unsigned long start, unsigned long end,
 		nodemask_t *nodes, unsigned long flags,
 		struct list_head *pagelist)
@@ -683,6 +679,7 @@ static int queue_pages_test_walk(unsigned long start, unsigned long end,
 
 	return walk_page_range(start, end, &queue_pages_walk);
 }
+EXPORT_SYMBOL(queue_pages_range);
 
 /*
  * Apply policy to a single VMA
diff --git a/mm/migrate.c b/mm/migrate.c
index 99250ae..4f20415 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1367,6 +1367,77 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 	return rc;
 }
 
+static struct page *new_node_page(struct page *page,
+		unsigned long node, int **x)
+{
+	return __alloc_pages_node(node, GFP_HIGHUSER_MOVABLE
+					| __GFP_THISNODE, 0);
+}
+
+/*
+ * migrate_virtual_range - migrate all the pages faulted within a virtual
+ *			address range to a specified node.
+ *
+ * @pid:		PID of the task
+ * @start:		Virtual address range beginning
+ * @end:		Virtual address range end
+ * @nid:		Target migration node
+ *
+ * The function first scans the process VMA list to find out the VMA which
+ * contains the given virtual range. Then validates that the virtual range
+ * is within the given VMA's limits.
+ *
+ * Returns the number of pages that were not migrated or an error code.
+ */
+int migrate_virtual_range(int pid, unsigned long start,
+			unsigned long end, int nid)
+{
+	struct mm_struct *mm;
+	struct vm_area_struct *vma;
+	nodemask_t nmask;
+	int ret = -EINVAL;
+
+	LIST_HEAD(mlist);
+
+	nodes_clear(nmask);
+	nodes_setall(nmask);
+
+	if ((!start) || (!end))
+		return -EINVAL;
+
+	rcu_read_lock();
+	mm = find_task_by_vpid(pid)->mm;
+	rcu_read_unlock();
+
+	start &= PAGE_MASK;
+	end &= PAGE_MASK;
+	down_write(&mm->mmap_sem);
+	for (vma = mm->mmap; vma; vma = vma->vm_next) {
+		if  ((start < vma->vm_start) || (end > vma->vm_end))
+			continue;
+
+		ret = queue_pages_range(mm, start, end, &nmask, MPOL_MF_MOVE_ALL
+						| MPOL_MF_DISCONTIG_OK, &mlist);
+		if (ret) {
+			putback_movable_pages(&mlist);
+			break;
+		}
+
+		if (list_empty(&mlist)) {
+			ret = -ENOMEM;
+			break;
+		}
+
+		ret = migrate_pages(&mlist, new_node_page, NULL, nid,
+					MIGRATE_SYNC, MR_COMPACTION);
+		if (ret)
+			putback_movable_pages(&mlist);
+	}
+	up_write(&mm->mmap_sem);
+	return ret;
+}
+EXPORT_SYMBOL(migrate_virtual_range);
+
 #ifdef CONFIG_NUMA
 /*
  * Move a list of individual pages
-- 
1.8.3.1

* [DEBUG 11/12] drivers: Add two drivers for coherent device memory tests
  2016-11-22 14:19 [RFC 0/4] Define coherent device memory node Anshuman Khandual
                   ` (9 preceding siblings ...)
  2016-11-22 14:19 ` [DEBUG 10/12] mm: Add a new migration function migrate_virtual_range() Anshuman Khandual
@ 2016-11-22 14:19 ` Anshuman Khandual
  2016-11-22 14:19 ` [DEBUG 12/12] test: Add a script to perform random VMA migrations across nodes Anshuman Khandual
  11 siblings, 0 replies; 20+ messages in thread
From: Anshuman Khandual @ 2016-11-22 14:19 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: mhocko, vbabka, mgorman, minchan, aneesh.kumar, bsingharora,
	srikar, haren, jglisse, dave.hansen

This adds two different drivers inside the drivers/char/ directory under two
new kernel config options COHERENT_HOTPLUG_DEMO and COHERENT_MEMORY_DEMO.

1) coherent_hotplug_demo: Detects and hotplugs the coherent device memory
2) coherent_memory_demo:  Exports a debugfs interface for VMA migrations

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 drivers/char/Kconfig                 |  23 +++
 drivers/char/Makefile                |   2 +
 drivers/char/coherent_hotplug_demo.c | 133 ++++++++++++++
 drivers/char/coherent_memory_demo.c  | 337 +++++++++++++++++++++++++++++++++++
 drivers/char/memory_online_sysfs.h   | 148 +++++++++++++++
 mm/migrate.c                         |  14 +-
 6 files changed, 656 insertions(+), 1 deletion(-)
 create mode 100644 drivers/char/coherent_hotplug_demo.c
 create mode 100644 drivers/char/coherent_memory_demo.c
 create mode 100644 drivers/char/memory_online_sysfs.h

diff --git a/drivers/char/Kconfig b/drivers/char/Kconfig
index dcc0973..22c538d 100644
--- a/drivers/char/Kconfig
+++ b/drivers/char/Kconfig
@@ -588,6 +588,29 @@ config TILE_SROM
 	  device appear much like a simple EEPROM, and knows
 	  how to partition a single ROM for multiple purposes.
 
+config COHERENT_HOTPLUG_DEMO
+	tristate "Demo driver to test coherent memory node hotplug"
+	depends on PPC64 || COHERENT_DEVICE
+	default n
+	help
+	  Say yes when you want to build a test driver to hotplug all
+	  the coherent memory nodes present on the system. This driver
+	  scans through the device tree, checks on "ibm,memory-device"
+	  property device nodes and onlines its memory. When unloaded,
+	  compatible device nodes and onlines their memory. When unloaded,
+	  it goes through the list of memory ranges it onlined before
+	  and offlines them one by one. If not sure, select N.
+config COHERENT_MEMORY_DEMO
+	tristate "Demo driver to test coherent memory node functionality"
+	depends on PPC64 || COHERENT_DEVICE
+	default n
+	help
+	  Say yes when you want to build a test driver to demonstrate
+	  the coherent memory functionalities, capabilities and probable
+	  utilization. It also exports a debugfs file to accept inputs for
+	  virtual address range migration for any process. If not sure,
+	  select N.
+
 source "drivers/char/xillybus/Kconfig"
 
 endmenu
diff --git a/drivers/char/Makefile b/drivers/char/Makefile
index 6e6c244..92fa338 100644
--- a/drivers/char/Makefile
+++ b/drivers/char/Makefile
@@ -60,3 +60,5 @@ js-rtc-y = rtc.o
 obj-$(CONFIG_TILE_SROM)		+= tile-srom.o
 obj-$(CONFIG_XILLYBUS)		+= xillybus/
 obj-$(CONFIG_POWERNV_OP_PANEL)	+= powernv-op-panel.o
+obj-$(CONFIG_COHERENT_HOTPLUG_DEMO)	+= coherent_hotplug_demo.o
+obj-$(CONFIG_COHERENT_MEMORY_DEMO)	+= coherent_memory_demo.o
diff --git a/drivers/char/coherent_hotplug_demo.c b/drivers/char/coherent_hotplug_demo.c
new file mode 100644
index 0000000..3670081
--- /dev/null
+++ b/drivers/char/coherent_hotplug_demo.c
@@ -0,0 +1,133 @@
+/*
+ * Memory hotplug support for coherent memory nodes in runtime.
+ *
+ * Copyright (C) 2016, Reza Arbab, IBM Corporation.
+ * Copyright (C) 2016, Anshuman Khandual, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+#include <linux/of.h>
+#include <linux/export.h>
+#include <linux/spinlock.h>
+#include <linux/init.h>
+#include <linux/memblock.h>
+#include <linux/module.h>
+#include <linux/memory.h>
+#include <linux/sizes.h>
+#include <linux/bitops.h>
+#include <linux/device.h>
+#include <linux/fs.h>
+#include <linux/slab.h>
+#include <linux/mm.h>
+#include <linux/pagemap.h>
+#include <linux/migrate.h>
+#include <linux/memblock.h>
+#include <linux/uaccess.h>
+
+#include <asm/mmu.h>
+#include <asm/pgalloc.h>
+#include "memory_online_sysfs.h"
+
+#define MAX_HOTADD_NODES 100
+phys_addr_t addr[MAX_HOTADD_NODES][2];
+int nr_addr;
+
+/*
+ * extern int memory_failure(unsigned long pfn, int trapno, int flags);
+ * extern int min_free_kbytes;
+ * extern int user_min_free_kbytes;
+ *
+ * extern unsigned long nr_kernel_pages;
+ * extern unsigned long nr_all_pages;
+ * extern unsigned long dma_reserve;
+ */
+
+static void dump_core_vm_tunables(void)
+{
+/*
+ *	printk(":::::::: VM TUNABLES :::::::\n");
+ *	printk("[min_free_kbytes]	%d\n", min_free_kbytes);
+ *	printk("[user_min_free_kbytes]	%d\n", user_min_free_kbytes);
+ *	printk("[nr_kernel_pages]	%ld\n", nr_kernel_pages);
+ *	printk("[nr_all_pages]		%ld\n", nr_all_pages);
+ *	printk("[dma_reserve]		%ld\n", dma_reserve);
+ */
+}
+
+
+
+static int online_coherent_memory(void)
+{
+	struct device_node *memory;
+
+	nr_addr = 0;
+	disable_auto_online();
+	dump_core_vm_tunables();
+	for_each_compatible_node(memory, NULL, "ibm,memory-device") {
+		struct device_node *mem;
+		const __be64 *reg;
+		unsigned int len, ret;
+		phys_addr_t start, size;
+
+		mem = of_parse_phandle(memory, "memory-region", 0);
+		if (!mem) {
+			pr_info("memory-region property not found\n");
+			return -1;
+		}
+
+		reg = of_get_property(mem, "reg", &len);
+		if (!reg || len <= 0) {
+			pr_info("reg property not found\n");
+			return -1;
+		}
+		start = be64_to_cpu(*reg);
+		size = be64_to_cpu(*(reg + 1));
+		pr_info("Coherent memory start %llx size %llx\n", start, size);
+		ret = memory_probe_store(start, size);
+		if (ret)
+			pr_info("probe failed\n");
+
+		ret = store_mem_state(start, size, "online_movable");
+		if (ret)
+			pr_info("online_movable failed\n");
+
+		addr[nr_addr][0] = start;
+		addr[nr_addr][1] = size;
+		nr_addr++;
+	}
+	dump_core_vm_tunables();
+	enable_auto_online();
+	return 0;
+}
+
+static int offline_coherent_memory(void)
+{
+	int i;
+
+	for (i = 0; i < nr_addr; i++)
+		store_mem_state(addr[i][0], addr[i][1], "offline");
+	return 0;
+}
+
+static void __exit coherent_hotplug_exit(void)
+{
+	pr_info("%s\n", __func__);
+	offline_coherent_memory();
+}
+
+static int __init coherent_hotplug_init(void)
+{
+	pr_info("%s\n", __func__);
+	return online_coherent_memory();
+}
+module_init(coherent_hotplug_init);
+module_exit(coherent_hotplug_exit);
+MODULE_LICENSE("GPL");
diff --git a/drivers/char/coherent_memory_demo.c b/drivers/char/coherent_memory_demo.c
new file mode 100644
index 0000000..1dcd9f7
--- /dev/null
+++ b/drivers/char/coherent_memory_demo.c
@@ -0,0 +1,337 @@
+/*
+ * Demonstrating various aspects of the coherent memory.
+ *
+ * Copyright (C) 2016, Anshuman Khandual, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+#include <linux/of.h>
+#include <linux/export.h>
+#include <linux/spinlock.h>
+#include <linux/init.h>
+#include <linux/memblock.h>
+#include <linux/module.h>
+#include <linux/memory.h>
+#include <linux/sizes.h>
+#include <linux/bitops.h>
+#include <linux/device.h>
+#include <linux/fs.h>
+#include <linux/slab.h>
+#include <linux/mm.h>
+#include <linux/pagemap.h>
+#include <linux/migrate.h>
+#include <linux/memblock.h>
+#include <linux/debugfs.h>
+#include <linux/uaccess.h>
+
+#include <asm/mmu.h>
+#include <asm/pgalloc.h>
+
+#define COHERENT_DEV_MAJOR 89
+#define COHERENT_DEV_NAME  "coherent_memory"
+
+#define CRNT_NODE_NID1 1
+#define CRNT_NODE_NID2 2
+#define CRNT_NODE_NID3 3
+
+#define RAM_CRNT_MIGRATE 1
+#define CRNT_RAM_MIGRATE 2
+
+struct vma_map_info {
+	struct list_head list;
+	unsigned long nr_pages;
+	spinlock_t lock;
+};
+
+static void vma_map_info_init(struct vm_area_struct *vma)
+{
+	struct vma_map_info *info = kmalloc(sizeof(struct vma_map_info),
+								GFP_KERNEL);
+
+	BUG_ON(!info);
+	INIT_LIST_HEAD(&info->list);
+	spin_lock_init(&info->lock);
+	vma->vm_private_data = info;
+	info->nr_pages = 0;
+}
+
+static void coherent_vmops_open(struct vm_area_struct *vma)
+{
+	vma_map_info_init(vma);
+}
+
+static void coherent_vmops_close(struct vm_area_struct *vma)
+{
+	struct vma_map_info *info = vma->vm_private_data;
+
+	BUG_ON(!info);
+again:
+	cond_resched();
+	spin_lock(&info->lock);
+	while (info->nr_pages) {
+		struct page *page, *page2;
+
+		list_for_each_entry_safe(page, page2, &info->list, lru) {
+			if (!trylock_page(page)) {
+				spin_unlock(&info->lock);
+				goto again;
+			}
+
+			list_del_init(&page->lru);
+			info->nr_pages--;
+			unlock_page(page);
+			SetPageReclaim(page);
+			put_page(page);
+		}
+		spin_unlock(&info->lock);
+		cond_resched();
+		spin_lock(&info->lock);
+	}
+	spin_unlock(&info->lock);
+	kfree(info);
+	vma->vm_private_data = NULL;
+}
+
+static int coherent_vmops_fault(struct vm_area_struct *vma,
+					struct vm_fault *vmf)
+{
+	struct vma_map_info *info;
+	struct page *page;
+	static int coherent_node = CRNT_NODE_NID1;
+
+	if (coherent_node == CRNT_NODE_NID1)
+		coherent_node = CRNT_NODE_NID2;
+	else
+		coherent_node = CRNT_NODE_NID1;
+
+	page = alloc_pages_node(coherent_node,
+				GFP_HIGHUSER_MOVABLE | __GFP_THISNODE, 0);
+	if (!page)
+		return VM_FAULT_SIGBUS;
+
+	info = (struct vma_map_info *) vma->vm_private_data;
+	BUG_ON(!info);
+	spin_lock(&info->lock);
+	list_add(&page->lru, &info->list);
+	info->nr_pages++;
+	spin_unlock(&info->lock);
+
+	page->index = vmf->pgoff;
+	get_page(page);
+	vmf->page = page;
+	return 0;
+}
+
+static const struct vm_operations_struct coherent_memory_vmops = {
+	.open = coherent_vmops_open,
+	.close = coherent_vmops_close,
+	.fault = coherent_vmops_fault,
+};
+
+static int coherent_memory_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	pr_info("Mmap opened (file: %lx vma: %lx)\n",
+			(unsigned long) file, (unsigned long) vma);
+	vma->vm_ops = &coherent_memory_vmops;
+	coherent_vmops_open(vma);
+	return 0;
+}
+
+static int coherent_memory_open(struct inode *inode, struct file *file)
+{
+	pr_info("Device opened (inode: %lx file: %lx)\n",
+			(unsigned long) inode, (unsigned long) file);
+	return 0;
+}
+
+static int coherent_memory_close(struct inode *inode, struct file *file)
+{
+	pr_info("Device closed (inode: %lx file: %lx)\n",
+			(unsigned long) inode, (unsigned long) file);
+	return 0;
+}
+
+static void lru_ram_coherent_migrate(unsigned long addr)
+{
+	struct mm_struct *mm = current->mm;
+	struct vm_area_struct *vma;
+	nodemask_t nmask;
+	LIST_HEAD(mlist);
+
+	nodes_clear(nmask);
+	nodes_setall(nmask);
+	down_write(&mm->mmap_sem);
+	for (vma = mm->mmap; vma; vma = vma->vm_next) {
+		if ((addr < vma->vm_start) || (addr >= vma->vm_end))
+			continue;
+		break;
+	}
+	up_write(&mm->mmap_sem);
+	if (!vma) {
+		pr_info("%s: No VMA found\n", __func__);
+		return;
+	}
+	migrate_virtual_range(current->pid, vma->vm_start, vma->vm_end, 2);
+}
+
+static void lru_coherent_ram_migrate(unsigned long addr)
+{
+	struct mm_struct *mm = current->mm;
+	struct vm_area_struct *vma;
+	nodemask_t nmask;
+	LIST_HEAD(mlist);
+
+	nodes_clear(nmask);
+	nodes_setall(nmask);
+	down_write(&mm->mmap_sem);
+	for (vma = mm->mmap; vma; vma = vma->vm_next) {
+		if ((addr < vma->vm_start) || (addr >= vma->vm_end))
+			continue;
+		break;
+	}
+	up_write(&mm->mmap_sem);
+	if (!vma) {
+		pr_info("%s: No VMA found\n", __func__);
+		return;
+	}
+	migrate_virtual_range(current->pid, vma->vm_start, vma->vm_end, 0);
+}
+
+static long coherent_memory_ioctl(struct file *file,
+					unsigned int cmd, unsigned long arg)
+{
+	switch (cmd) {
+	case RAM_CRNT_MIGRATE:
+		lru_ram_coherent_migrate(arg);
+		break;
+
+	case CRNT_RAM_MIGRATE:
+		lru_coherent_ram_migrate(arg);
+		break;
+
+	default:
+		pr_info("%s: Invalid ioctl() command: %u\n", __func__, cmd);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static const struct file_operations fops = {
+	.mmap = coherent_memory_mmap,
+	.open = coherent_memory_open,
+	.release = coherent_memory_close,
+	.unlocked_ioctl = coherent_memory_ioctl,
+};
+
+static char kbuf[100];	/* Will store original user passed buffer */
+static char str[100];	/* Working copy for individual substring */
+
+static u64 args[4];
+static u64 index;
+static void convert_substring(const char *buf)
+{
+	u64 val = 0;
+
+	if (kstrtou64(buf, 0, &val))
+		pr_info("String conversion failed\n");
+
+	if (index < ARRAY_SIZE(args))
+		args[index++] = val;
+}
+
+static ssize_t coherent_debug_write(struct file *file,
+					const char __user *user_buf,
+					size_t count, loff_t *ppos)
+{
+	char *tmp, *tmp1;
+	ssize_t ret;
+
+	memset(args, 0, sizeof(args));
+	index = 0;
+
+	ret = simple_write_to_buffer(kbuf, sizeof(kbuf) - 1, ppos, user_buf, count);
+	if (ret < 0)
+		return ret;
+
+	kbuf[ret] = '\0';
+	tmp = kbuf;
+	do {
+		tmp1 = strchr(tmp, ',');
+		if (tmp1) {
+			*tmp1 = '\0';
+			strncpy(str, (const char *)tmp, strlen(tmp));
+			convert_substring(str);
+		} else {
+			strncpy(str, (const char *)tmp, strlen(tmp));
+			convert_substring(str);
+			break;
+		}
+		tmp = tmp1 + 1;
+		memset(str, 0, sizeof(str));
+	} while (true);
+	migrate_virtual_range(args[0], args[1], args[2], args[3]);
+	return ret;
+}
+
+static int coherent_debug_show(struct seq_file *m, void *v)
+{
+	seq_puts(m, "Expected Value: <pid,vaddr_start,vaddr_end,nid>\n");
+	return 0;
+}
+
+static int coherent_debug_open(struct inode *inode, struct file *filp)
+{
+	return single_open(filp, coherent_debug_show, NULL);
+}
+
+static const struct file_operations coherent_debug_fops = {
+	.open		= coherent_debug_open,
+	.write		= coherent_debug_write,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= single_release,
+};
+
+static struct dentry *debugfile;
+
+static void coherent_memory_debugfs(void)
+{
+
+	debugfile = debugfs_create_file("coherent_debug", 0644, NULL, NULL,
+				&coherent_debug_fops);
+	if (!debugfile)
+		pr_warn("Failed to create coherent_memory in debugfs");
+}
+
+static void __exit coherent_memory_exit(void)
+{
+	pr_info("%s\n", __func__);
+	debugfs_remove(debugfile);
+	unregister_chrdev(COHERENT_DEV_MAJOR, COHERENT_DEV_NAME);
+}
+
+static int __init coherent_memory_init(void)
+{
+	int ret;
+
+	pr_info("%s\n", __func__);
+	ret = register_chrdev(COHERENT_DEV_MAJOR, COHERENT_DEV_NAME, &fops);
+	if (ret < 0) {
+		pr_info("%s register_chrdev() failed\n", __func__);
+		return -1;
+	}
+	coherent_memory_debugfs();
+	return 0;
+}
+
+module_init(coherent_memory_init);
+module_exit(coherent_memory_exit);
+MODULE_LICENSE("GPL");
diff --git a/drivers/char/memory_online_sysfs.h b/drivers/char/memory_online_sysfs.h
new file mode 100644
index 0000000..a5f022d
--- /dev/null
+++ b/drivers/char/memory_online_sysfs.h
@@ -0,0 +1,148 @@
+/*
+ * Accessing sysfs interface for memory hotplug operation from
+ * inside the kernel.
+ *
+ * Licensed under GPL V2
+ */
+#ifndef __SYSFS_H
+#define __SYSFS_H
+
+#include <linux/fs.h>
+#include <linux/uaccess.h>
+
+#define AUTO_ONLINE_BLOCKS "/sys/devices/system/memory/auto_online_blocks"
+#define BLOCK_SIZE_BYTES   "/sys/devices/system/memory/block_size_bytes"
+#define MEMORY_PROBE       "/sys/devices/system/memory/probe"
+
+static ssize_t read_buf(char *filename, char *buf, ssize_t count)
+{
+	mm_segment_t old_fs;
+	struct file *filp;
+	loff_t pos = 0;
+
+	if (!count)
+		return 0;
+
+	old_fs = get_fs();
+	set_fs(KERNEL_DS);
+
+	filp = filp_open(filename, O_RDONLY, 0);
+	if (IS_ERR(filp)) {
+		count = PTR_ERR(filp);
+		goto err_open;
+	}
+
+	count = vfs_read(filp, buf, count - 1, &pos);
+	buf[count < 0 ? 0 : count] = '\0';
+
+	filp_close(filp, NULL);
+
+err_open:
+	set_fs(old_fs);
+
+	return count;
+}
+
+static unsigned long long read_0x(char *filename)
+{
+	unsigned long long ret;
+	char buf[32];
+
+	if (read_buf(filename, buf, 32) <= 0)
+		return 0;
+
+	if (kstrtoull(buf, 16, &ret))
+		return 0;
+
+	return ret;
+}
+
+static ssize_t write_buf(char *filename, char *buf)
+{
+	int ret;
+	mm_segment_t old_fs;
+	struct file *filp;
+	loff_t pos = 0;
+
+	old_fs = get_fs();
+	set_fs(KERNEL_DS);
+
+	filp = filp_open(filename, O_WRONLY, 0);
+	if (IS_ERR(filp)) {
+		ret = PTR_ERR(filp);
+		goto err_open;
+	}
+
+	ret = vfs_write(filp, buf, strlen(buf), &pos);
+
+	filp_close(filp, NULL);
+
+err_open:
+	set_fs(old_fs);
+
+	return ret;
+}
+
+int memory_probe_store(phys_addr_t addr, phys_addr_t size)
+{
+	phys_addr_t block_sz =
+		read_0x(BLOCK_SIZE_BYTES);
+	long i;
+
+	for (i = 0; i < size / block_sz; i++, addr += block_sz) {
+		char s[32];
+		ssize_t count;
+
+		snprintf(s, 32, "0x%llx", addr);
+
+		count = write_buf(MEMORY_PROBE, s);
+		if (count < 0)
+			return count;
+	}
+
+	return 0;
+}
+
+int store_mem_state(phys_addr_t addr, phys_addr_t size, char *state)
+{
+	phys_addr_t block_sz = read_0x(BLOCK_SIZE_BYTES);
+	unsigned long start_block, end_block, i;
+
+	start_block = addr / block_sz;
+	end_block = start_block + size / block_sz;
+
+	for (i = end_block - 1; i >= start_block; i--) {
+		char filename[64];
+		ssize_t count;
+
+		snprintf(filename, 64,
+			 "/sys/devices/system/memory/memory%ld/state", i);
+
+		count = write_buf(filename, state);
+		if (count < 0)
+			return count;
+	}
+
+	return 0;
+}
+
+int disable_auto_online(void)
+{
+	int ret;
+
+	ret = write_buf(AUTO_ONLINE_BLOCKS, "offline");
+	if (ret < 0)
+		return ret;
+	return 0;
+}
+
+int enable_auto_online(void)
+{
+	int ret;
+
+	ret = write_buf(AUTO_ONLINE_BLOCKS, "online");
+	if (ret < 0)
+		return ret;
+	return 0;
+}
+#endif
diff --git a/mm/migrate.c b/mm/migrate.c
index 4f20415..87861f6 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1396,6 +1396,7 @@ int migrate_virtual_range(int pid, unsigned long start,
 	struct vm_area_struct *vma;
 	nodemask_t nmask;
 	int ret = -EINVAL;
+	bool found = false;
 
 	LIST_HEAD(mlist);
 
@@ -1405,6 +1406,7 @@ int migrate_virtual_range(int pid, unsigned long start,
 	if ((!start) || (!end))
 		return -EINVAL;
 
+	pr_info("%s: %d %lx %lx %d: ", __func__, pid, start, end, nid);
 	rcu_read_lock();
 	mm = find_task_by_vpid(pid)->mm;
 	rcu_read_unlock();
@@ -1416,23 +1418,33 @@ int migrate_virtual_range(int pid, unsigned long start,
 		if  ((start < vma->vm_start) || (end > vma->vm_end))
 			continue;
 
+		found = true;
 		ret = queue_pages_range(mm, start, end, &nmask, MPOL_MF_MOVE_ALL
 						| MPOL_MF_DISCONTIG_OK, &mlist);
 		if (ret) {
+			pr_info("queue_pages_range_failed\n");
 			putback_movable_pages(&mlist);
 			break;
 		}
 
 		if (list_empty(&mlist)) {
+			pr_info("list_empty\n");
 			ret = -ENOMEM;
 			break;
 		}
 
 		ret = migrate_pages(&mlist, new_node_page, NULL, nid,
 					MIGRATE_SYNC, MR_COMPACTION);
-		if (ret)
+		if (ret) {
+			pr_info("migration_failed\n");
 			putback_movable_pages(&mlist);
+		} else {
+			pr_info("migration_passed\n");
+		}
 	}
+	if (!found)
+		pr_info("vma_missing\n");
+
 	up_write(&mm->mmap_sem);
 	return ret;
 }
-- 
1.8.3.1


* [DEBUG 12/12] test: Add a script to perform random VMA migrations across nodes
  2016-11-22 14:19 [RFC 0/4] Define coherent device memory node Anshuman Khandual
                   ` (10 preceding siblings ...)
  2016-11-22 14:19 ` [DEBUG 11/12] drivers: Add two drivers for coherent device memory tests Anshuman Khandual
@ 2016-11-22 14:19 ` Anshuman Khandual
  11 siblings, 0 replies; 20+ messages in thread
From: Anshuman Khandual @ 2016-11-22 14:19 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: mhocko, vbabka, mgorman, minchan, aneesh.kumar, bsingharora,
	srikar, haren, jglisse, dave.hansen

This is a test script which creates a workload (e.g. ebizzy), goes through
its VMAs (/proc/<pid>/maps) and initiates migration to random nodes, which
can be either system memory nodes or coherent memory nodes.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 tools/testing/selftests/vm/cdm_migration.sh | 76 +++++++++++++++++++++++++++++
 1 file changed, 76 insertions(+)
 create mode 100755 tools/testing/selftests/vm/cdm_migration.sh

diff --git a/tools/testing/selftests/vm/cdm_migration.sh b/tools/testing/selftests/vm/cdm_migration.sh
new file mode 100755
index 0000000..fab11ed
--- /dev/null
+++ b/tools/testing/selftests/vm/cdm_migration.sh
@@ -0,0 +1,76 @@
+#!/usr/bin/bash
+#
+# Should work with any workload and workload command line.
+# But for now ebizzy should be installed. Please run it
+# as root.
+#
+# Copyright (C) Anshuman Khandual 2016, IBM Corporation
+#
+# Licensed under GPL V2
+
+# Unload, build and reload modules
+if [ "$1" = "reload" ]
+then
+	rmmod coherent_memory_demo
+	rmmod coherent_hotplug_demo
+	cd ../../../../
+	make -s -j 64 modules
+	insmod drivers/char/coherent_hotplug_demo.ko
+	insmod drivers/char/coherent_memory_demo.ko
+	cd -
+fi
+
+# Workload
+workload=ebizzy
+work_cmd="ebizzy -T -z -m -t 128 -n 100000 -s 32768 -S 10000"
+
+pkill $workload
+$work_cmd &
+
+# File
+if [ -e input_file.txt ]
+then
+	rm input_file.txt
+fi
+
+# Inputs
+pid=`pidof ebizzy`
+cp /proc/$pid/maps input_file.txt
+if [ ! -e input_file.txt ]
+then
+	echo "Input file was not created"
+	exit
+fi
+input=input_file.txt
+
+# Migrations
+dmesg -C
+while read line
+do
+	addr_start=$(echo $line | cut -d '-' -f1)
+	addr_end=$(echo $line | cut -d '-' -f2 | cut -d ' ' -f1)
+	node=`expr $RANDOM % 5`
+
+	echo $pid,0x$addr_start,0x$addr_end,$node > /sys/kernel/debug/coherent_debug
+done < "$input"
+
+# Analyze dmesg output
+passed=`dmesg | grep "migration_passed" | wc -l`
+failed=`dmesg | grep "migration_failed" | wc -l`
+queuef=`dmesg | grep "queue_pages_range_failed" | wc -l`
+empty=`dmesg | grep "list_empty" | wc -l`
+missing=`dmesg | grep "vma_missing" | wc -l`
+
+# Stats
+echo passed	$passed
+echo failed	$failed
+echo queuef	$queuef
+echo empty	$empty
+echo missing	$missing
+
+# Cleanup
+rm input_file.txt
+if pgrep -x $workload > /dev/null
+then
+	pkill $workload
+fi
-- 
1.8.3.1


* Re: [RFC 4/4] mm: Ignore cpuset enforcement when allocation flag has __GFP_THISNODE
  2016-11-22 14:19 ` [RFC 4/4] mm: Ignore cpuset enforcement when allocation flag has __GFP_THISNODE Anshuman Khandual
@ 2016-11-28 21:12   ` Dave Hansen
  2016-11-29  6:51     ` Anshuman Khandual
  0 siblings, 1 reply; 20+ messages in thread
From: Dave Hansen @ 2016-11-28 21:12 UTC (permalink / raw)
  To: Anshuman Khandual, linux-kernel, linux-mm
  Cc: mhocko, vbabka, mgorman, minchan, aneesh.kumar, bsingharora,
	srikar, haren, jglisse

On 11/22/2016 06:19 AM, Anshuman Khandual wrote:
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3715,7 +3715,7 @@ struct page *
>  		.migratetype = gfpflags_to_migratetype(gfp_mask),
>  	};
>  
> -	if (cpusets_enabled()) {
> +	if (cpusets_enabled() && !(alloc_mask & __GFP_THISNODE)) {
>  		alloc_mask |= __GFP_HARDWALL;
>  		alloc_flags |= ALLOC_CPUSET;
>  		if (!ac.nodemask)

This means now that any __GFP_THISNODE allocation can "escape" the
cpuset.  That seems like a pretty major change to how cpusets works.  Do
we know that *ALL* __GFP_THISNODE allocations are truly lacking in a
cpuset context that can be enforced?
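
A hedged illustration of the point (not from the mail; the helper name
below is made up): plenty of ordinary call sites pass __GFP_THISNODE on
behalf of a task that may well sit inside a cpuset, and with the proposed
check ALLOC_CPUSET would no longer be applied to such an allocation.

	#include <linux/gfp.h>
	#include <linux/topology.h>

	/* An ordinary node-local allocation made in task context */
	static struct page *node_local_alloc_example(void)
	{
		return alloc_pages_node(numa_node_id(),
					GFP_KERNEL | __GFP_THISNODE, 0);
	}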


* Re: [RFC 4/4] mm: Ignore cpuset enforcement when allocation flag has __GFP_THISNODE
  2016-11-28 21:12   ` Dave Hansen
@ 2016-11-29  6:51     ` Anshuman Khandual
  2016-11-29 16:52       ` Dave Hansen
  0 siblings, 1 reply; 20+ messages in thread
From: Anshuman Khandual @ 2016-11-29  6:51 UTC (permalink / raw)
  To: Dave Hansen, linux-kernel, linux-mm
  Cc: mhocko, vbabka, mgorman, minchan, aneesh.kumar, bsingharora,
	srikar, haren, jglisse

On 11/29/2016 02:42 AM, Dave Hansen wrote:
> On 11/22/2016 06:19 AM, Anshuman Khandual wrote:
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -3715,7 +3715,7 @@ struct page *
>>  		.migratetype = gfpflags_to_migratetype(gfp_mask),
>>  	};
>>  
>> -	if (cpusets_enabled()) {
>> +	if (cpusets_enabled() && !(alloc_mask & __GFP_THISNODE)) {
>>  		alloc_mask |= __GFP_HARDWALL;
>>  		alloc_flags |= ALLOC_CPUSET;
>>  		if (!ac.nodemask)
> 
> This means now that any __GFP_THISNODE allocation can "escape" the
> cpuset.  That seems like a pretty major change to how cpusets works.  Do
> we know that *ALL* __GFP_THISNODE allocations are truly lacking in a
> cpuset context that can be enforced?

Right, I know it's a very blunt change. The cpuset based isolation of
the coherent device node for user space tasks leads to a side effect:
a driver or even the kernel cannot allocate memory from the coherent
device node in the task's own context (ioctl() calls or similar). For
non task context allocations (work queues, interrupts, anything async,
etc.) this problem can be fixed by modifying the kernel thread's
task->mems_allowed to include all nodes of the system, including the
coherent device nodes, though I have not figured out the details yet.
What are your thoughts on this? What we are looking for is an explicit
and definite way of allocating from the coherent device node inside
the kernel.
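
A rough sketch of that direction (details not settled here; the helper
name is made up, set_mems_allowed() is the existing cpuset interface):

	#include <linux/cpuset.h>
	#include <linux/nodemask.h>

	/* From a kernel thread: widen mems_allowed to every online node,
	 * coherent device nodes included, before doing CDM allocations. */
	static void cdm_allow_all_nodes(void)
	{
		set_mems_allowed(node_states[N_ONLINE]);
	}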


* Re: [RFC 4/4] mm: Ignore cpuset enforcement when allocation flag has __GFP_THISNODE
  2016-11-29  6:51     ` Anshuman Khandual
@ 2016-11-29 16:52       ` Dave Hansen
  2016-11-30 11:17         ` Anshuman Khandual
  0 siblings, 1 reply; 20+ messages in thread
From: Dave Hansen @ 2016-11-29 16:52 UTC (permalink / raw)
  To: Anshuman Khandual, linux-kernel, linux-mm
  Cc: mhocko, vbabka, mgorman, minchan, aneesh.kumar, bsingharora,
	srikar, haren, jglisse

On 11/28/2016 10:51 PM, Anshuman Khandual wrote:
> On 11/29/2016 02:42 AM, Dave Hansen wrote:
>> > On 11/22/2016 06:19 AM, Anshuman Khandual wrote:
>>> >> --- a/mm/page_alloc.c
>>> >> +++ b/mm/page_alloc.c
>>> >> @@ -3715,7 +3715,7 @@ struct page *
>>> >>  		.migratetype = gfpflags_to_migratetype(gfp_mask),
>>> >>  	};
>>> >>  
>>> >> -	if (cpusets_enabled()) {
>>> >> +	if (cpusets_enabled() && !(alloc_mask & __GFP_THISNODE)) {
>>> >>  		alloc_mask |= __GFP_HARDWALL;
>>> >>  		alloc_flags |= ALLOC_CPUSET;
>>> >>  		if (!ac.nodemask)
>> > 
>> > This means now that any __GFP_THISNODE allocation can "escape" the
>> > cpuset.  That seems like a pretty major change to how cpusets works.  Do
>> > we know that *ALL* __GFP_THISNODE allocations are truly lacking in a
>> > cpuset context that can be enforced?
> Right, I know it's a very blunt change. The cpuset based isolation of
> the coherent device node for user space tasks leads to a side effect:
> a driver or even the kernel cannot allocate memory from the coherent
...

Well, we have __GFP_HARDWALL:

	 * __GFP_HARDWALL enforces the cpuset memory allocation policy.

which you can clear in the places where you want to do an allocation but
want to ignore cpusets.  But, __cpuset_node_allowed() looks like it gets
a little funky if you do that since it would probably be falling back to
the root cpuset that also would not have the new node in mems_allowed.

What exactly are the kernel-internal places that need to allocate from
the coherent device node?  When would this be done out of the context of
an application *asking* for memory in the new node?
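
A minimal sketch of that approach (not from the series; cdm_alloc_page()
and its nid argument are made-up names): a driver-side allocation which
drops __GFP_HARDWALL so that only the nearest hardwalled ancestor cpuset,
rather than the task's own cpuset, is consulted.

	#include <linux/gfp.h>

	static struct page *cdm_alloc_page(int nid)
	{
		gfp_t gfp = (GFP_HIGHUSER_MOVABLE | __GFP_THISNODE) &
				~__GFP_HARDWALL;

		return alloc_pages_node(nid, gfp, 0);
	}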


* Re: [RFC 1/4] mm: Define coherent device memory node
  2016-11-22 14:19 ` [RFC 1/4] mm: " Anshuman Khandual
@ 2016-11-29 17:57   ` Dave Hansen
  2016-11-30 11:46     ` Anshuman Khandual
  0 siblings, 1 reply; 20+ messages in thread
From: Dave Hansen @ 2016-11-29 17:57 UTC (permalink / raw)
  To: Anshuman Khandual, linux-kernel, linux-mm
  Cc: mhocko, vbabka, mgorman, minchan, aneesh.kumar, bsingharora,
	srikar, haren, jglisse

On 11/22/2016 06:19 AM, Anshuman Khandual wrote:
> @@ -393,6 +393,9 @@ enum node_states {
>  	N_MEMORY = N_HIGH_MEMORY,
>  #endif
>  	N_CPU,		/* The node has one or more cpus */
> +#ifdef CONFIG_COHERENT_DEVICE
> +	N_COHERENT_DEVICE,
> +#endif
>  	NR_NODE_STATES
>  };

Don't we really want this to be N_MEMORY_ISOLATED?  Or, better yet,
N_MEMORY_UNISOLATED so that we can just drop the bitmap in for N_MEMORY
and not have to do any bit manipulation operations at runtime.


* Re: [RFC 4/4] mm: Ignore cpuset enforcement when allocation flag has __GFP_THISNODE
  2016-11-29 16:52       ` Dave Hansen
@ 2016-11-30 11:17         ` Anshuman Khandual
  2016-11-30 19:43           ` Dave Hansen
  0 siblings, 1 reply; 20+ messages in thread
From: Anshuman Khandual @ 2016-11-30 11:17 UTC (permalink / raw)
  To: Dave Hansen, linux-kernel, linux-mm
  Cc: mhocko, vbabka, mgorman, minchan, aneesh.kumar, bsingharora,
	srikar, haren, jglisse

On 11/29/2016 10:22 PM, Dave Hansen wrote:
> On 11/28/2016 10:51 PM, Anshuman Khandual wrote:
>> On 11/29/2016 02:42 AM, Dave Hansen wrote:
>>>> On 11/22/2016 06:19 AM, Anshuman Khandual wrote:
>>>>>> --- a/mm/page_alloc.c
>>>>>> +++ b/mm/page_alloc.c
>>>>>> @@ -3715,7 +3715,7 @@ struct page *
>>>>>>  		.migratetype = gfpflags_to_migratetype(gfp_mask),
>>>>>>  	};
>>>>>>  
>>>>>> -	if (cpusets_enabled()) {
>>>>>> +	if (cpusets_enabled() && !(alloc_mask & __GFP_THISNODE)) {
>>>>>>  		alloc_mask |= __GFP_HARDWALL;
>>>>>>  		alloc_flags |= ALLOC_CPUSET;
>>>>>>  		if (!ac.nodemask)
>>>>
>>>> This means now that any __GFP_THISNODE allocation can "escape" the
>>>> cpuset.  That seems like a pretty major change to how cpusets works.  Do
>>>> we know that *ALL* __GFP_THISNODE allocations are truly lacking in a
>>>> cpuset context that can be enforced?
>> Right, I know it's a very blunt change. The cpuset based isolation of
>> the coherent device node for user space tasks leads to a side effect:
>> a driver or even the kernel cannot allocate memory from the coherent
> ...
> 
> Well, we have __GFP_HARDWALL:
> 
> 	 * __GFP_HARDWALL enforces the cpuset memory allocation policy.
> 
> which you can clear in the places where you want to do an allocation but
> want to ignore cpusets.  But, __cpuset_node_allowed() looks like it gets
> a little funky if you do that since it would probably be falling back to
> the root cpuset that also would not have the new node in mems_allowed.

Right, but what is the rationale behind this? This is what the in-code
documentation for __cpuset_node_allowed() says:

 *	GFP_KERNEL   - any node in enclosing hardwalled cpuset ok
 
If the allocation has requested GFP_KERNEL, should it not look for memory
across the entire system? Does the cpuset still have to be enforced?

> 
> What exactly are the kernel-internal places that need to allocate from
> the coherent device node?  When would this be done out of the context of
> an application *asking* for memory in the new node?

The primary user right now is a driver who wants to move around mapped
pages of an application from system RAM to CDM nodes and back. If the
application has requested it through an ioctl(), during migration
the destination pages will be allocated on the CDM *in* the task context.

The driver could also have scheduled migration chunks on a work queue
which can execute later on. IIUC, that execution and the corresponding
allocations into the CDM node will be *out* of the context of the task.

Ideally we are looking for both scenarios to work, which they don't right now.
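
As a hedged illustration of the second, out-of-context case (not code from
the series; the struct and function names are made up, only
migrate_virtual_range() comes from DEBUG patch 10/12), the driver could
defer the migration to a workqueue, so the destination pages get allocated
from a kworker whose mems_allowed has nothing to do with the requesting
task:

	#include <linux/kernel.h>
	#include <linux/workqueue.h>
	#include <linux/slab.h>
	#include <linux/migrate.h>

	struct cdm_migrate_work {
		struct work_struct work;
		int pid;
		unsigned long start;
		unsigned long end;
		int nid;
	};

	static void cdm_migrate_workfn(struct work_struct *work)
	{
		struct cdm_migrate_work *mw =
			container_of(work, struct cdm_migrate_work, work);

		/* Runs in kworker context, not in the context of mw->pid */
		migrate_virtual_range(mw->pid, mw->start, mw->end, mw->nid);
		kfree(mw);
	}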


* Re: [RFC 1/4] mm: Define coherent device memory node
  2016-11-29 17:57   ` Dave Hansen
@ 2016-11-30 11:46     ` Anshuman Khandual
  0 siblings, 0 replies; 20+ messages in thread
From: Anshuman Khandual @ 2016-11-30 11:46 UTC (permalink / raw)
  To: Dave Hansen, linux-kernel, linux-mm
  Cc: mhocko, vbabka, mgorman, minchan, aneesh.kumar, bsingharora,
	srikar, haren, jglisse

On 11/29/2016 11:27 PM, Dave Hansen wrote:
> On 11/22/2016 06:19 AM, Anshuman Khandual wrote:
>> @@ -393,6 +393,9 @@ enum node_states {
>>  	N_MEMORY = N_HIGH_MEMORY,
>>  #endif
>>  	N_CPU,		/* The node has one or more cpus */
>> +#ifdef CONFIG_COHERENT_DEVICE
>> +	N_COHERENT_DEVICE,
>> +#endif
>>  	NR_NODE_STATES
>>  };
> 
> Don't we really want this to be N_MEMORY_ISOLATED?  Or, better yet,

Sure, if we move from a CDM description to a purely node isolation one.
I am still thinking through this.

> N_MEMORY_UNISOLATED so that we can just drop the bitmap in for N_MEMORY

I did not get that. N_MEMORY_UNISOLATED for the system RAM nodes which are
not isolated? Then where do the isolated/CDM nodes go?

> and not have to do any bit manipulation operations at runtime.
> 


* Re: [RFC 4/4] mm: Ignore cpuset enforcement when allocation flag has __GFP_THISNODE
  2016-11-30 11:17         ` Anshuman Khandual
@ 2016-11-30 19:43           ` Dave Hansen
  0 siblings, 0 replies; 20+ messages in thread
From: Dave Hansen @ 2016-11-30 19:43 UTC (permalink / raw)
  To: Anshuman Khandual, linux-kernel, linux-mm
  Cc: mhocko, vbabka, mgorman, minchan, aneesh.kumar, bsingharora,
	srikar, haren, jglisse, Li Zefan

On 11/30/2016 03:17 AM, Anshuman Khandual wrote:
> Right but what is the rationale behind this ? This what is in the in-code
> documentation for this function __cpuset_node_allowed().
> 
>  *	GFP_KERNEL   - any node in enclosing hardwalled cpuset ok
>  
> If the allocation has requested GFP_KERNEL, should not it look for the
> entire system for memory ? Does cpuset still has to be enforced ?

Documentation/cgroup-v1/cpusets.txt explains it quite a bit.

>> What exactly are the kernel-internal places that need to allocate from
>> the coherent device node?  When would this be done out of the context of
>> an application *asking* for memory in the new node?
> 
> The primary user right now is a driver who wants to move around mapped
> pages of an application from system RAM to CDM nodes and back. If the
> application has requested it through an ioctl(), during migration
> the destination pages will be allocated on the CDM *in* the task context.

Side note: uhh, so you're doing migrate_pages() through some kind of new
ioctl()?  Why?

I think you're actually pointing out a hole in how cpusets currently
works, especially around the workqueue case.  I'm not quite sure if this
is by design for migrate_pages() (a task calling migrate_pages() can
allocate pages for a task in a cpuset even though that task isn't able to
allocate them itself).

> The driver could also have scheduled migration chunks on a work queue
> which can execute later on. IIUC, that execution and the corresponding
> allocations into the CDM node will be *out* of the context of the task.

Yeah, the current->mems_allowed in __cpuset_node_allowed() does seem
rather wrong for something happening in another task's context.

