linux-kernel.vger.kernel.org archive mirror
* Re: [7/7,v8] NUMA Hotplug Emulator: Implement per-node add_memory debugfs interface
       [not found] <A24AE1FFE7AEC5489F83450EE98351BF2A40FED20A@shsmsx502.ccr.corp.intel.com>
@ 2010-12-09  1:21 ` Shaohui Zheng
  2010-12-09 21:29   ` David Rientjes
  0 siblings, 1 reply; 8+ messages in thread
From: Shaohui Zheng @ 2010-12-09  1:21 UTC (permalink / raw)
  To: rientjes
  Cc: akpm, linux-mm, linux-kernel, haicheng.li, lethal, ak, gregkh,
	shaohui.zheng, shaohui.zheng

> 
> > From:  Shaohui Zheng <shaohui.zheng@intel.com>
> > 
> > Add an add_memory interface under debugfs for each online node to support
> > memory hotplug emulation. The reserved memory can be added to the desired node
> > with this interface.
> > 
> > The layout on debugfs:
> > 	mem_hotplug/node0/add_memory
> > 	mem_hotplug/node1/add_memory
> > 	mem_hotplug/node2/add_memory
> > 	...
> > 
> > Add a memory section (128M) to node 3 (booted with mem=1024m):
> > 
> > 	echo 0x40000000 > mem_hotplug/node3/add_memory
> > 
> > To make it friendlier, it is also possible to add memory with:
> > 
> > 	echo 1024m > mem_hotplug/node3/add_memory
> > 
> 
> I don't think you should be using memparse() to support this type of 
> interface, the standard way of writing memory locations is by writing 
> address in hex as the first example does.  The idea is to not try to make 
> things simpler by introducing multiple ways of doing the same thing but 
> rather to standardize on a single interface.

Undoubtedly, hex is the best way to represent a physical address. If we use the
memparse() function, we can write an address in a much simpler form. It is not
the official way, but it is very convenient if we just want to do some simple
testing.

When we reserve memory, we use memparse() to parse the mem=XXX parameter, so we
can avoid the complicated translation when we add memory through the add_memory
interface. How about still using memparse() here, but removing it from the
document, since it is just for some simple testing?
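
To illustrate the convenience in question: a memparse()-style parser accepts
both spellings of the same address. The following is only a userspace sketch of
that behaviour, not the kernel function itself; parse_mem_sketch() is a
hypothetical name, and only the K/M/G suffixes are handled here.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Userspace sketch of memparse()-style parsing (hypothetical helper).
 * Parses a number in any base strtoull() accepts with base 0, then applies
 * an optional K/M/G suffix as a power-of-two multiplier.
 */
unsigned long long parse_mem_sketch(const char *s)
{
	char *end;
	unsigned long long val;

	assert(s != NULL);
	val = strtoull(s, &end, 0);

	switch (*end) {
	case 'G': case 'g':
		val <<= 10;
		/* fall through */
	case 'M': case 'm':
		val <<= 10;
		/* fall through */
	case 'K': case 'k':
		val <<= 10;
		break;
	default:
		break;	/* no suffix: plain decimal, octal or hex value */
	}
	return val;
}
```

Under these assumptions, parse_mem_sketch("1024m") and
parse_mem_sketch("0x40000000") return the same value, which is the whole
appeal of keeping memparse() here.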

> 
> > CC: David Rientjes <rientjes@google.com>
> > CC: Dave Hansen <dave@linux.vnet.ibm.com>
> > Signed-off-by: Haicheng Li <haicheng.li@intel.com>
> > Signed-off-by: Shaohui Zheng <shaohui.zheng@intel.com>
> > ---
> > Index: linux-hpe4/mm/memory_hotplug.c
> > ===================================================================
> > --- linux-hpe4.orig/mm/memory_hotplug.c	2010-12-02 12:35:31.557622002 +0800
> > +++ linux-hpe4/mm/memory_hotplug.c	2010-12-06 07:30:36.067622001 +0800
> > @@ -930,6 +930,80 @@
> >  
> >  static struct dentry *memhp_debug_root;
> >  
> > +#ifdef CONFIG_ARCH_MEMORY_PROBE
> > +
> > +static ssize_t add_memory_store(struct file *file, const char __user *buf,
> > +				size_t count, loff_t *ppos)
> > +{
> > +	u64 phys_addr = 0;
> > +	int nid = file->private_data - NULL;
> > +	int ret;
> > +
> > +	phys_addr = simple_strtoull(buf, NULL, 0);
> 
> This isn't doing anything.
> 
This should be removed.

> > +	printk(KERN_INFO "Add a memory section to node: %d.\n", nid);
> > +	phys_addr = memparse(buf, NULL);
> > +	ret = add_memory(nid, phys_addr, PAGES_PER_SECTION << PAGE_SHIFT);
> 
> Does the add_memory() call handle memoryless nodes such that they 
> appropriately transition to N_HIGH_MEMORY when memory is added?

For memoryless nodes, it will cause an OOM issue on old kernel versions, but
memoryless nodes are supported now, and the test results match that. The
emulator is a tool to reproduce the OOM issue on early kernels.

> 
> > +
> > +	if (ret)
> > +		count = ret;
> > +
> > +	return count;
> > +}
> > +
> > +static int add_memory_open(struct inode *inode, struct file *file)
> > +{
> > +	file->private_data = inode->i_private;
> > +	return 0;
> > +}
> > +
> > +static const struct file_operations add_memory_file_ops = {
> > +	.open		= add_memory_open,
> > +	.write		= add_memory_store,
> > +	.llseek		= generic_file_llseek,
> > +};
> > +
> > +/*
> > + * Create add_memory debugfs entry under specified node
> > + */
> > +static int debugfs_create_add_memory_entry(int nid)
> > +{
> > +	char buf[32];
> > +	static struct dentry *node_debug_root;
> > +
> > +	snprintf(buf, sizeof(buf), "node%d", nid);
> > +	node_debug_root = debugfs_create_dir(buf, memhp_debug_root);
> 
> This can fail, and if it does then the subsequent debugfs_create_file() 
> will be added to root while we don't want, so this needs error handling.
> 
I will add error handling code for it.

> > +
> > +	/* the nid information was represented by the offset of pointer(NULL+nid) */
> > +	if (!debugfs_create_file("add_memory", S_IWUSR, node_debug_root,
> > +			NULL + nid, &add_memory_file_ops))
> > +		return -ENOMEM;
> > +
> > +	return 0;
> > +}
> > +
> > +static int __init memory_debug_init(void)
> > +{
> > +	int nid;
> > +
> > +	if (!memhp_debug_root)
> > +		memhp_debug_root = debugfs_create_dir("mem_hotplug", NULL);
> > +	if (!memhp_debug_root)
> > +		return -ENOMEM;
> > +
> > +	for_each_online_node(nid)
> > +		 debugfs_create_add_memory_entry(nid);
> > +
> > +	return 0;
> > +}
> > +
> > +module_init(memory_debug_init);
> > +#else
> > +static int debugfs_create_add_memory_entry(int nid)
> > +{
> > +	return 0;
> > +}
> > +#endif /* CONFIG_ARCH_MEMORY_PROBE */
> > +
> >  static ssize_t add_node_store(struct file *file, const char __user *buf,
> >  				size_t count, loff_t *ppos)
> >  {
> > @@ -960,6 +1034,8 @@
> >  		return -ENOMEM;
> >  
> >  	ret = add_memory(nid, start, size);
> > +
> > +	debugfs_create_add_memory_entry(nid);
> >  	return ret ? ret : count;
> >  }
> >  
> > Index: linux-hpe4/Documentation/memory-hotplug.txt
> > ===================================================================
> > --- linux-hpe4.orig/Documentation/memory-hotplug.txt	2010-12-02 12:35:31.557622002 +0800
> > +++ linux-hpe4/Documentation/memory-hotplug.txt	2010-12-06 07:39:36.007622000 +0800
> > @@ -19,6 +19,7 @@
> >    4.1 Hardware(Firmware) Support
> >    4.2 Notify memory hot-add event by hand
> >    4.3 Node hotplug emulation
> > +  4.4 Memory hotplug emulation
> >  5. Logical Memory hot-add phase
> >    5.1. State of memory
> >    5.2. How to online memory
> > @@ -239,6 +240,29 @@
> >  Once the new node has been added, it is possible to online the memory by
> >  toggling the "state" of its memory section(s) as described in section 5.1.
> >  
> > +4.4 Memory hotplug emulation
> > +------------
> > +With debugfs, it is possible to test memory hotplug in software: we can add a
> > +memory section to the desired node with the add_memory interface. It is a much
> > +more powerful interface than the "probe" interface described in section 4.2.
> > +
> > +There is an add_memory interface for each online node at the debugfs mount
> > +point.
> > +	mem_hotplug/node0/add_memory
> > +	mem_hotplug/node1/add_memory
> > +	mem_hotplug/node2/add_memory
> > +	...
> > +
> > +Add a memory section (128M) to node 3 (booted with mem=1024m):
> > +
> > +	echo 0x40000000 > mem_hotplug/node3/add_memory
> > +
> > +To make it friendlier, it is also possible to add memory with:
> > +
> > +	echo 1024m > mem_hotplug/node3/add_memory
> > +
> > +Once the new memory section has been added, it is possible to online the memory
> > +by toggling the "state" described in section 5.1.
> >  
> >  ------------------------------
> >  5. Logical Memory hot-add phase
> > 
> > -- 
> > Thanks & Regards,
> > Shaohui
> > 
> > 
> > 

-- 
Thanks & Regards,
Shaohui



* Re: [7/7,v8] NUMA Hotplug Emulator: Implement per-node add_memory debugfs interface
  2010-12-09  1:21 ` [7/7,v8] NUMA Hotplug Emulator: Implement per-node add_memory debugfs interface Shaohui Zheng
@ 2010-12-09 21:29   ` David Rientjes
  2010-12-09 23:57     ` Shaohui Zheng
  0 siblings, 1 reply; 8+ messages in thread
From: David Rientjes @ 2010-12-09 21:29 UTC (permalink / raw)
  To: Shaohui Zheng
  Cc: akpm, linux-mm, linux-kernel, haicheng.li, lethal, ak, gregkh,
	shaohui.zheng

On Thu, 9 Dec 2010, Shaohui Zheng wrote:

> > I don't think you should be using memparse() to support this type of 
> > interface, the standard way of writing memory locations is by writing 
> > address in hex as the first example does.  The idea is to not try to make 
> > things simpler by introducing multiple ways of doing the same thing but 
> > rather to standardize on a single interface.
> 
> Undoubtedly, hex is the best way to represent a physical address. If we use the
> memparse() function, we can write an address in a much simpler form. It is not
> the official way, but it is very convenient if we just want to do some simple
> testing.
> 

Testing code should be removed from the patch prior to proposal.

> When we reserve memory, we use memparse() to parse the mem=XXX parameter, so we
> can avoid the complicated translation when we add memory through the add_memory
> interface. How about still using memparse() here, but removing it from the
> document, since it is just for some simple testing?
> 

We really don't want a public interface to have undocumented behavior, so 
it would be much better to retain the documentation if you choose to keep 
the memparse().  I disagree that converting the mem= parameter to hex is 
"complicated," however, so I'd prefer that the interface is similar to 
that of add_node.
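
As a rough illustration of that conversion, assuming the 128M section size and
the mem=1024m boot parameter from the patch description, the hex value for each
section is simple arithmetic. section_addr_sketch() and SECTION_SIZE_SKETCH are
hypothetical names, not kernel symbols.

```c
#include <assert.h>
#include <stdint.h>

/* 128M per memory section, as stated in the patch description (assumption). */
#define SECTION_SIZE_SKETCH (128ULL << 20)

/*
 * Physical start address of the n-th memory section above the mem= limit.
 * mem_mib is the mem= value in MiB (1024 for mem=1024m).
 */
uint64_t section_addr_sketch(uint64_t mem_mib, unsigned int n)
{
	uint64_t base = mem_mib << 20;	/* MiB to bytes */

	/* The sketch only handles a section-aligned mem= boundary. */
	assert(base % SECTION_SIZE_SKETCH == 0);
	return base + (uint64_t)n * SECTION_SIZE_SKETCH;
}
```

With mem=1024m, sections 0, 1 and 2 start at 0x40000000, 0x48000000 and
0x50000000; printing the result in hex gives the string to echo into
add_memory.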

> > > +	printk(KERN_INFO "Add a memory section to node: %d.\n", nid);
> > > +	phys_addr = memparse(buf, NULL);
> > > +	ret = add_memory(nid, phys_addr, PAGES_PER_SECTION << PAGE_SHIFT);
> > 
> > Does the add_memory() call handle memoryless nodes such that they 
> > appropriately transition to N_HIGH_MEMORY when memory is added?
> 
> For memoryless nodes, it will cause an OOM issue on old kernel versions, but
> memoryless nodes are supported now, and the test results match that. The
> emulator is a tool to reproduce the OOM issue on early kernels.
> 

That doesn't address the question.  My question is whether or not adding 
memory to a memoryless node in this way transitions its state to 
N_HIGH_MEMORY in the VM?


* Re: [7/7,v8] NUMA Hotplug Emulator: Implement per-node add_memory debugfs interface
  2010-12-09 21:29   ` David Rientjes
@ 2010-12-09 23:57     ` Shaohui Zheng
  2010-12-10 23:30       ` David Rientjes
  0 siblings, 1 reply; 8+ messages in thread
From: Shaohui Zheng @ 2010-12-09 23:57 UTC (permalink / raw)
  To: David Rientjes
  Cc: Shaohui Zheng, akpm, linux-mm, linux-kernel, haicheng.li, lethal,
	ak, gregkh

On Thu, Dec 09, 2010 at 01:29:28PM -0800, David Rientjes wrote:
> On Thu, 9 Dec 2010, Shaohui Zheng wrote:
> 
> > > I don't think you should be using memparse() to support this type of 
> > > interface, the standard way of writing memory locations is by writing 
> > > address in hex as the first example does.  The idea is to not try to make 
> > > things simpler by introducing multiple ways of doing the same thing but 
> > > rather to standardize on a single interface.
> > 
> > Undoubtedly, hex is the best way to represent a physical address. If we use the
> > memparse() function, we can write an address in a much simpler form. It is not
> > the official way, but it is very convenient if we just want to do some simple
> > testing.
> > 
> 
> Testing code should be removed from the patch prior to proposal.
> 
> > When we reserve memory, we use memparse() to parse the mem=XXX parameter, so we
> > can avoid the complicated translation when we add memory through the add_memory
> > interface. How about still using memparse() here, but removing it from the
> > document, since it is just for some simple testing?
> > 
> 
> We really don't want a public interface to have undocumented behavior, so 
> it would be much better to retain the documentation if you choose to keep 
> the memparse().  I disagree that converting the mem= parameter to hex is 
> "complicated," however, so I'd prefer that the interface is similar to 
> that of add_node.
> 

Okay, I will keep the interface accepting only a hex address, similar to add_node.

> > > > +	printk(KERN_INFO "Add a memory section to node: %d.\n", nid);
> > > > +	phys_addr = memparse(buf, NULL);
> > > > +	ret = add_memory(nid, phys_addr, PAGES_PER_SECTION << PAGE_SHIFT);
> > > 
> > > Does the add_memory() call handle memoryless nodes such that they 
> > > appropriately transition to N_HIGH_MEMORY when memory is added?
> > 
> > > For memoryless nodes, it will cause an OOM issue on old kernel versions, but
> > > memoryless nodes are supported now, and the test results match that. The
> > > emulator is a tool to reproduce the OOM issue on early kernels.
> > 
> 
> That doesn't address the question.  My question is whether or not adding 
> memory to a memoryless node in this way transitions its state to 
> N_HIGH_MEMORY in the VM?
I guess that you are talking about memory hotplug on x86_32. Memory hotplug is
NOT well supported on x86_32, and the add_memory() function does not consider
this situation.

For 64-bit, N_HIGH_MEMORY == N_NORMAL_MEMORY, so we do not need to do the transition.

-- 
Thanks & Regards,
Shaohui



* Re: [7/7,v8] NUMA Hotplug Emulator: Implement per-node add_memory debugfs interface
  2010-12-09 23:57     ` Shaohui Zheng
@ 2010-12-10 23:30       ` David Rientjes
  2010-12-13  2:09         ` Shaohui Zheng
  0 siblings, 1 reply; 8+ messages in thread
From: David Rientjes @ 2010-12-10 23:30 UTC (permalink / raw)
  To: Shaohui Zheng
  Cc: Andrew Morton, linux-mm, linux-kernel, haicheng.li, lethal,
	Andi Kleen, Greg Kroah-Hartman

On Fri, 10 Dec 2010, Shaohui Zheng wrote:

> > That doesn't address the question.  My question is whether or not adding 
> > memory to a memoryless node in this way transitions its state to 
> > N_HIGH_MEMORY in the VM?
> I guess that you are talking about memory hotplug on x86_32. Memory hotplug is
> NOT well supported on x86_32, and the add_memory() function does not consider
> this situation.
> 
> For 64-bit, N_HIGH_MEMORY == N_NORMAL_MEMORY, so we do not need to do the transition.
> 

One more time :)  Memoryless nodes do not have their bit set in 
N_HIGH_MEMORY.  When memory is added to a memoryless node with this new 
interface, does the bit get set?


* Re: [7/7,v8] NUMA Hotplug Emulator: Implement per-node add_memory debugfs interface
  2010-12-10 23:30       ` David Rientjes
@ 2010-12-13  2:09         ` Shaohui Zheng
  2010-12-13 20:56           ` David Rientjes
  0 siblings, 1 reply; 8+ messages in thread
From: Shaohui Zheng @ 2010-12-13  2:09 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, linux-mm, linux-kernel, haicheng.li, lethal,
	Andi Kleen, Greg Kroah-Hartman

On Fri, Dec 10, 2010 at 03:30:38PM -0800, David Rientjes wrote:
> On Fri, 10 Dec 2010, Shaohui Zheng wrote:
> 
> > > That doesn't address the question.  My question is whether or not adding 
> > > memory to a memoryless node in this way transitions its state to 
> > > N_HIGH_MEMORY in the VM?
> > I guess that you are talking about memory hotplug on x86_32. Memory hotplug is
> > NOT well supported on x86_32, and the add_memory() function does not consider
> > this situation.
> > 
> > For 64-bit, N_HIGH_MEMORY == N_NORMAL_MEMORY, so we do not need to do the transition.
> > 
> 
> One more time :)  Memoryless nodes do not have their bit set in 
> N_HIGH_MEMORY.  When memory is added to a memoryless node with this new 
> interface, does the bit get set?

When we use the debugfs add_node interface to add a fake node, the node is
created and its memory sections are created, but the state of the memory
sections is still __offline__, so the newly added node is still a memoryless
node. The debugfs add_memory interface does a similar thing to add_node; it
just adds memory to an existing node.

The state transition to N_HIGH_MEMORY does not happen in either of the above
two interfaces. It happens when the memory is onlined through the sysfs
/sys/devices/system/memory/memoryXX/state interface.

This is the code path:
store_mem_state
	->memory_block_change_state
	 	->memory_block_action
			->online_pages

			if (onlined_pages) {
				kswapd_run(zone_to_nid(zone));
				node_set_state(zone_to_nid(zone), N_HIGH_MEMORY);
			}
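
The same behaviour can be modelled in a few lines of userspace C. This is only
a toy model of the sequence described above, not kernel code: every name ending
in _sketch is hypothetical, and a plain bitmask stands in for the N_HIGH_MEMORY
node state.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES_SKETCH 8

/* Per-node count of sections that were added but are still offline. */
static unsigned int offline_sections[MAX_NODES_SKETCH];

/* One bit per node, standing in for the N_HIGH_MEMORY node state. */
static unsigned long high_memory_mask;

/* Models debugfs add_node/add_memory: a section is created but stays offline. */
void add_memory_sketch(int nid)
{
	assert(nid >= 0 && nid < MAX_NODES_SKETCH);
	offline_sections[nid]++;
}

/* Models the sysfs online step: only here is the node state bit set. */
bool online_section_sketch(int nid)
{
	assert(nid >= 0 && nid < MAX_NODES_SKETCH);
	if (offline_sections[nid] == 0)
		return false;	/* nothing to online */
	offline_sections[nid]--;
	high_memory_mask |= 1UL << nid;
	return true;
}

/* True once at least one section of the node has been onlined. */
bool node_has_memory_sketch(int nid)
{
	assert(nid >= 0 && nid < MAX_NODES_SKETCH);
	return (high_memory_mask & (1UL << nid)) != 0;
}
```

In this model, add_memory_sketch(3) leaves node 3 memoryless, and
node_has_memory_sketch(3) only becomes true after online_section_sketch(3).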

Does that address your question? Thanks.

-- 
Thanks & Regards,
Shaohui



* Re: [7/7,v8] NUMA Hotplug Emulator: Implement per-node add_memory debugfs interface
  2010-12-13  2:09         ` Shaohui Zheng
@ 2010-12-13 20:56           ` David Rientjes
  0 siblings, 0 replies; 8+ messages in thread
From: David Rientjes @ 2010-12-13 20:56 UTC (permalink / raw)
  To: Shaohui Zheng
  Cc: Andrew Morton, linux-mm, linux-kernel, haicheng.li, lethal,
	Andi Kleen, Greg Kroah-Hartman

On Mon, 13 Dec 2010, Shaohui Zheng wrote:

> The state transition to N_HIGH_MEMORY does not happen in either of the above
> two interfaces. It happens when the memory is onlined through the sysfs
> /sys/devices/system/memory/memoryXX/state interface.
> 
> That is the code path:
> store_mem_state
> 	->memory_block_change_state
> 	 	->memory_block_action
> 			->online_pages
> 
> 			if (onlined_pages) {
> 				kswapd_run(zone_to_nid(zone));
> 				node_set_state(zone_to_nid(zone), N_HIGH_MEMORY);
> 			}
> 
> Does that address your question? Thanks.
> 

Ok, thanks!


* Re: [7/7,v8] NUMA Hotplug Emulator: Implement per-node add_memory debugfs interface
  2010-12-07  1:00 ` [7/7,v8] NUMA Hotplug Emulator: Implement per-node add_memory debugfs interface shaohui.zheng
@ 2010-12-08 21:31   ` David Rientjes
  0 siblings, 0 replies; 8+ messages in thread
From: David Rientjes @ 2010-12-08 21:31 UTC (permalink / raw)
  To: Shaohui Zheng
  Cc: Andrew Morton, linux-mm, linux-kernel, haicheng.li, lethal,
	Andi Kleen, dave, Greg Kroah-Hartman, Haicheng Li

On Tue, 7 Dec 2010, shaohui.zheng@intel.com wrote:

> From:  Shaohui Zheng <shaohui.zheng@intel.com>
> 
> Add an add_memory interface under debugfs for each online node to support
> memory hotplug emulation. The reserved memory can be added to the desired node
> with this interface.
> 
> The layout on debugfs:
> 	mem_hotplug/node0/add_memory
> 	mem_hotplug/node1/add_memory
> 	mem_hotplug/node2/add_memory
> 	...
> 
> Add a memory section (128M) to node 3 (booted with mem=1024m):
> 
> 	echo 0x40000000 > mem_hotplug/node3/add_memory
> 
> To make it friendlier, it is also possible to add memory with:
> 
> 	echo 1024m > mem_hotplug/node3/add_memory
> 

I don't think you should be using memparse() to support this type of 
interface, the standard way of writing memory locations is by writing 
address in hex as the first example does.  The idea is to not try to make 
things simpler by introducing multiple ways of doing the same thing but 
rather to standardize on a single interface.

> CC: David Rientjes <rientjes@google.com>
> CC: Dave Hansen <dave@linux.vnet.ibm.com>
> Signed-off-by: Haicheng Li <haicheng.li@intel.com>
> Signed-off-by: Shaohui Zheng <shaohui.zheng@intel.com>
> ---
> Index: linux-hpe4/mm/memory_hotplug.c
> ===================================================================
> --- linux-hpe4.orig/mm/memory_hotplug.c	2010-12-02 12:35:31.557622002 +0800
> +++ linux-hpe4/mm/memory_hotplug.c	2010-12-06 07:30:36.067622001 +0800
> @@ -930,6 +930,80 @@
>  
>  static struct dentry *memhp_debug_root;
>  
> +#ifdef CONFIG_ARCH_MEMORY_PROBE
> +
> +static ssize_t add_memory_store(struct file *file, const char __user *buf,
> +				size_t count, loff_t *ppos)
> +{
> +	u64 phys_addr = 0;
> +	int nid = file->private_data - NULL;
> +	int ret;
> +
> +	phys_addr = simple_strtoull(buf, NULL, 0);

This isn't doing anything.

> +	printk(KERN_INFO "Add a memory section to node: %d.\n", nid);
> +	phys_addr = memparse(buf, NULL);
> +	ret = add_memory(nid, phys_addr, PAGES_PER_SECTION << PAGE_SHIFT);

Does the add_memory() call handle memoryless nodes such that they 
appropriately transition to N_HIGH_MEMORY when memory is added?

> +
> +	if (ret)
> +		count = ret;
> +
> +	return count;
> +}
> +
> +static int add_memory_open(struct inode *inode, struct file *file)
> +{
> +	file->private_data = inode->i_private;
> +	return 0;
> +}
> +
> +static const struct file_operations add_memory_file_ops = {
> +	.open		= add_memory_open,
> +	.write		= add_memory_store,
> +	.llseek		= generic_file_llseek,
> +};
> +
> +/*
> + * Create add_memory debugfs entry under specified node
> + */
> +static int debugfs_create_add_memory_entry(int nid)
> +{
> +	char buf[32];
> +	static struct dentry *node_debug_root;
> +
> +	snprintf(buf, sizeof(buf), "node%d", nid);
> +	node_debug_root = debugfs_create_dir(buf, memhp_debug_root);

This can fail, and if it does then the subsequent debugfs_create_file() 
will be added to root while we don't want, so this needs error handling.

> +
> +	/* the nid information was represented by the offset of pointer(NULL+nid) */
> +	if (!debugfs_create_file("add_memory", S_IWUSR, node_debug_root,
> +			NULL + nid, &add_memory_file_ops))
> +		return -ENOMEM;
> +
> +	return 0;
> +}
> +
> +static int __init memory_debug_init(void)
> +{
> +	int nid;
> +
> +	if (!memhp_debug_root)
> +		memhp_debug_root = debugfs_create_dir("mem_hotplug", NULL);
> +	if (!memhp_debug_root)
> +		return -ENOMEM;
> +
> +	for_each_online_node(nid)
> +		 debugfs_create_add_memory_entry(nid);
> +
> +	return 0;
> +}
> +
> +module_init(memory_debug_init);
> +#else
> +static int debugfs_create_add_memory_entry(int nid)
> +{
> +	return 0;
> +}
> +#endif /* CONFIG_ARCH_MEMORY_PROBE */
> +
>  static ssize_t add_node_store(struct file *file, const char __user *buf,
>  				size_t count, loff_t *ppos)
>  {
> @@ -960,6 +1034,8 @@
>  		return -ENOMEM;
>  
>  	ret = add_memory(nid, start, size);
> +
> +	debugfs_create_add_memory_entry(nid);
>  	return ret ? ret : count;
>  }
>  
> Index: linux-hpe4/Documentation/memory-hotplug.txt
> ===================================================================
> --- linux-hpe4.orig/Documentation/memory-hotplug.txt	2010-12-02 12:35:31.557622002 +0800
> +++ linux-hpe4/Documentation/memory-hotplug.txt	2010-12-06 07:39:36.007622000 +0800
> @@ -19,6 +19,7 @@
>    4.1 Hardware(Firmware) Support
>    4.2 Notify memory hot-add event by hand
>    4.3 Node hotplug emulation
> +  4.4 Memory hotplug emulation
>  5. Logical Memory hot-add phase
>    5.1. State of memory
>    5.2. How to online memory
> @@ -239,6 +240,29 @@
>  Once the new node has been added, it is possible to online the memory by
>  toggling the "state" of its memory section(s) as described in section 5.1.
>  
> +4.4 Memory hotplug emulation
> +------------
> +With debugfs, it is possible to test memory hotplug in software: we can add a
> +memory section to the desired node with the add_memory interface. It is a much
> +more powerful interface than the "probe" interface described in section 4.2.
> +
> +There is an add_memory interface for each online node at the debugfs mount
> +point.
> +	mem_hotplug/node0/add_memory
> +	mem_hotplug/node1/add_memory
> +	mem_hotplug/node2/add_memory
> +	...
> +
> +Add a memory section (128M) to node 3 (booted with mem=1024m):
> +
> +	echo 0x40000000 > mem_hotplug/node3/add_memory
> +
> +To make it friendlier, it is also possible to add memory with:
> +
> +	echo 1024m > mem_hotplug/node3/add_memory
> +
> +Once the new memory section has been added, it is possible to online the memory
> +by toggling the "state" described in section 5.1.
>  
>  ------------------------------
>  5. Logical Memory hot-add phase
> 
> -- 
> Thanks & Regards,
> Shaohui
> 
> 
> 


* [7/7,v8] NUMA Hotplug Emulator: Implement per-node add_memory debugfs interface
  2010-12-07  1:00 [0/7,v8] NUMA Hotplug Emulator (v8) shaohui.zheng
@ 2010-12-07  1:00 ` shaohui.zheng
  2010-12-08 21:31   ` David Rientjes
  0 siblings, 1 reply; 8+ messages in thread
From: shaohui.zheng @ 2010-12-07  1:00 UTC (permalink / raw)
  To: akpm, linux-mm
  Cc: linux-kernel, haicheng.li, lethal, ak, shaohui.zheng, rientjes,
	dave, gregkh, Haicheng Li, Shaohui Zheng

[-- Attachment #1: 007-hotplug-emulator-add-memory-debugfs-interface.patch --]
[-- Type: text/plain, Size: 4673 bytes --]

From:  Shaohui Zheng <shaohui.zheng@intel.com>

Add an add_memory interface under debugfs for each online node to support
memory hotplug emulation. The reserved memory can be added to the desired node
with this interface.

The layout on debugfs:
	mem_hotplug/node0/add_memory
	mem_hotplug/node1/add_memory
	mem_hotplug/node2/add_memory
	...

Add a memory section (128M) to node 3 (booted with mem=1024m):

	echo 0x40000000 > mem_hotplug/node3/add_memory

To make it friendlier, it is also possible to add memory with:

	echo 1024m > mem_hotplug/node3/add_memory

CC: David Rientjes <rientjes@google.com>
CC: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Haicheng Li <haicheng.li@intel.com>
Signed-off-by: Shaohui Zheng <shaohui.zheng@intel.com>
---
Index: linux-hpe4/mm/memory_hotplug.c
===================================================================
--- linux-hpe4.orig/mm/memory_hotplug.c	2010-12-02 12:35:31.557622002 +0800
+++ linux-hpe4/mm/memory_hotplug.c	2010-12-06 07:30:36.067622001 +0800
@@ -930,6 +930,80 @@
 
 static struct dentry *memhp_debug_root;
 
+#ifdef CONFIG_ARCH_MEMORY_PROBE
+
+static ssize_t add_memory_store(struct file *file, const char __user *buf,
+				size_t count, loff_t *ppos)
+{
+	u64 phys_addr = 0;
+	int nid = file->private_data - NULL;
+	int ret;
+
+	phys_addr = simple_strtoull(buf, NULL, 0);
+	printk(KERN_INFO "Add a memory section to node: %d.\n", nid);
+	phys_addr = memparse(buf, NULL);
+	ret = add_memory(nid, phys_addr, PAGES_PER_SECTION << PAGE_SHIFT);
+
+	if (ret)
+		count = ret;
+
+	return count;
+}
+
+static int add_memory_open(struct inode *inode, struct file *file)
+{
+	file->private_data = inode->i_private;
+	return 0;
+}
+
+static const struct file_operations add_memory_file_ops = {
+	.open		= add_memory_open,
+	.write		= add_memory_store,
+	.llseek		= generic_file_llseek,
+};
+
+/*
+ * Create add_memory debugfs entry under specified node
+ */
+static int debugfs_create_add_memory_entry(int nid)
+{
+	char buf[32];
+	static struct dentry *node_debug_root;
+
+	snprintf(buf, sizeof(buf), "node%d", nid);
+	node_debug_root = debugfs_create_dir(buf, memhp_debug_root);
+
+	/* the nid information was represented by the offset of pointer(NULL+nid) */
+	if (!debugfs_create_file("add_memory", S_IWUSR, node_debug_root,
+			NULL + nid, &add_memory_file_ops))
+		return -ENOMEM;
+
+	return 0;
+}
+
+static int __init memory_debug_init(void)
+{
+	int nid;
+
+	if (!memhp_debug_root)
+		memhp_debug_root = debugfs_create_dir("mem_hotplug", NULL);
+	if (!memhp_debug_root)
+		return -ENOMEM;
+
+	for_each_online_node(nid)
+		 debugfs_create_add_memory_entry(nid);
+
+	return 0;
+}
+
+module_init(memory_debug_init);
+#else
+static int debugfs_create_add_memory_entry(int nid)
+{
+	return 0;
+}
+#endif /* CONFIG_ARCH_MEMORY_PROBE */
+
 static ssize_t add_node_store(struct file *file, const char __user *buf,
 				size_t count, loff_t *ppos)
 {
@@ -960,6 +1034,8 @@
 		return -ENOMEM;
 
 	ret = add_memory(nid, start, size);
+
+	debugfs_create_add_memory_entry(nid);
 	return ret ? ret : count;
 }
 
Index: linux-hpe4/Documentation/memory-hotplug.txt
===================================================================
--- linux-hpe4.orig/Documentation/memory-hotplug.txt	2010-12-02 12:35:31.557622002 +0800
+++ linux-hpe4/Documentation/memory-hotplug.txt	2010-12-06 07:39:36.007622000 +0800
@@ -19,6 +19,7 @@
   4.1 Hardware(Firmware) Support
   4.2 Notify memory hot-add event by hand
   4.3 Node hotplug emulation
+  4.4 Memory hotplug emulation
 5. Logical Memory hot-add phase
   5.1. State of memory
   5.2. How to online memory
@@ -239,6 +240,29 @@
 Once the new node has been added, it is possible to online the memory by
 toggling the "state" of its memory section(s) as described in section 5.1.
 
+4.4 Memory hotplug emulation
+------------
+With debugfs, it is possible to test memory hotplug in software: we can add a
+memory section to the desired node with the add_memory interface. It is a much
+more powerful interface than the "probe" interface described in section 4.2.
+
+There is an add_memory interface for each online node at the debugfs mount
+point.
+	mem_hotplug/node0/add_memory
+	mem_hotplug/node1/add_memory
+	mem_hotplug/node2/add_memory
+	...
+
+Add a memory section (128M) to node 3 (booted with mem=1024m):
+
+	echo 0x40000000 > mem_hotplug/node3/add_memory
+
+To make it friendlier, it is also possible to add memory with:
+
+	echo 1024m > mem_hotplug/node3/add_memory
+
+Once the new memory section has been added, it is possible to online the memory
+by toggling the "state" described in section 5.1.
 
 ------------------------------
 5. Logical Memory hot-add phase

-- 
Thanks & Regards,
Shaohui




Thread overview: 8+ messages
     [not found] <A24AE1FFE7AEC5489F83450EE98351BF2A40FED20A@shsmsx502.ccr.corp.intel.com>
2010-12-09  1:21 ` [7/7,v8] NUMA Hotplug Emulator: Implement per-node add_memory debugfs interface Shaohui Zheng
2010-12-09 21:29   ` David Rientjes
2010-12-09 23:57     ` Shaohui Zheng
2010-12-10 23:30       ` David Rientjes
2010-12-13  2:09         ` Shaohui Zheng
2010-12-13 20:56           ` David Rientjes
2010-12-07  1:00 [0/7,v8] NUMA Hotplug Emulator (v8) shaohui.zheng
2010-12-07  1:00 ` [7/7,v8] NUMA Hotplug Emulator: Implement per-node add_memory debugfs interface shaohui.zheng
2010-12-08 21:31   ` David Rientjes
