linux-kernel.vger.kernel.org archive mirror
* [PATCH -v2] x86, mm: Probe memory block size for generic x86 64bit
@ 2012-01-30  8:24 Yinghai Lu
  2012-02-07 22:11 ` H. Peter Anvin
  0 siblings, 1 reply; 3+ messages in thread
From: Yinghai Lu @ 2012-01-30  8:24 UTC (permalink / raw)
  To: Andrew Morton, Ingo Molnar, Thomas Gleixner, H. Peter Anvin
  Cc: linux-kernel, Yinghai Lu

Usually, if the system supports memory remapping to get back the memory
hidden by the mmio range, the remapped memory at the end will be
128M ... 2G in size.

Try to probe that size and use it as the memory block size.

That way we get fewer entries in /sys/devices/system/memory/

-v2: don't probe it every time /sys/../block_size_bytes is shown...

Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 arch/x86/mm/init_64.c |   34 ++++++++++++++++++++++++++++++----
 1 file changed, 30 insertions(+), 4 deletions(-)

Index: linux-2.6/arch/x86/mm/init_64.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/init_64.c
+++ linux-2.6/arch/x86/mm/init_64.c
@@ -890,17 +890,43 @@ const char *arch_vma_name(struct vm_area
 	return NULL;
 }
 
-#ifdef CONFIG_X86_UV
-unsigned long memory_block_size_bytes(void)
+static unsigned long probe_memory_block_size(void)
 {
+	/* start from 2g */
+	unsigned long bz = 1UL<<31;
+
+#ifdef CONFIG_X86_UV
 	if (is_uv_system()) {
 		printk(KERN_INFO "UV: memory block size 2GB\n");
 		return 2UL * 1024 * 1024 * 1024;
 	}
-	return MIN_MEMORY_BLOCK_SIZE;
-}
 #endif
 
+	/* less than 64g installed */
+	if ((max_pfn << PAGE_SHIFT) < (16UL << 32))
+		return MIN_MEMORY_BLOCK_SIZE;
+
+	/* get the tail size */
+	while (bz > MIN_MEMORY_BLOCK_SIZE) {
+		if (!((max_pfn << PAGE_SHIFT) & (bz - 1)))
+			break;
+		bz >>= 1;
+	}
+
+	printk(KERN_DEBUG "memory block size : %luMB\n", bz >> 20);
+
+	return bz;
+}
+
+static unsigned long memory_block_size_probed;
+unsigned long memory_block_size_bytes(void)
+{
+	if (!memory_block_size_probed)
+		memory_block_size_probed = probe_memory_block_size();
+
+	return memory_block_size_probed;
+}
+
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
 /*
  * Initialise the sparsemem vmemmap using huge-pages at the PMD level.

* Re: [PATCH -v2] x86, mm: Probe memory block size for generic x86 64bit
  2012-01-30  8:24 [PATCH -v2] x86, mm: Probe memory block size for generic x86 64bit Yinghai Lu
@ 2012-02-07 22:11 ` H. Peter Anvin
  2012-02-07 22:35   ` Yinghai Lu
  0 siblings, 1 reply; 3+ messages in thread
From: H. Peter Anvin @ 2012-02-07 22:11 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Andrew Morton, Ingo Molnar, Thomas Gleixner, linux-kernel

On 01/30/2012 12:24 AM, Yinghai Lu wrote:
> Usually, if the system supports memory remapping to get back the memory
> hidden by the mmio range, the remapped memory at the end will be
> 128M ... 2G in size.
> 
> Try to probe that size and use it as the memory block size.
> 
> That way we get fewer entries in /sys/devices/system/memory/
> 
> -v2: don't probe it every time /sys/../block_size_bytes is shown...
> 

Okay... what on Earth is the point of this?

This just screams "dangerous toxic heuristic that's likely to break
unusual configurations", and reducing the number of entries in a /sys
directory is hardly motivation for anything.

You need a much better description of what you're trying to accomplish
and why that matters in the first place.

	-hpa

* Re: [PATCH -v2] x86, mm: Probe memory block size for generic x86 64bit
  2012-02-07 22:11 ` H. Peter Anvin
@ 2012-02-07 22:35   ` Yinghai Lu
  0 siblings, 0 replies; 3+ messages in thread
From: Yinghai Lu @ 2012-02-07 22:35 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Andrew Morton, Ingo Molnar, Thomas Gleixner, linux-kernel

On Tue, Feb 7, 2012 at 2:11 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 01/30/2012 12:24 AM, Yinghai Lu wrote:
>> Usually, if the system supports memory remapping to get back the memory
>> hidden by the mmio range, the remapped memory at the end will be
>> 128M ... 2G in size.
>>
>> Try to probe that size and use it as the memory block size.
>>
>> That way we get fewer entries in /sys/devices/system/memory/
>>
>> -v2: don't probe it every time /sys/../block_size_bytes is shown...
>>
>
> Okay... what on Earth is the point of this?
>
> This just screams "dangerous toxic heuristic that's likely to break
> unusual configurations", and reducing the number of entries in a /sys
> directory is hardly motivation for anything.
>
> You need a much better description of what you're trying to accomplish
> and why that matters in the first place.

That is for memory hotplug support.

On a 2048g system, x86_64 currently uses 128M as the section size, and
one memory_block includes only one section.
So that directory will have 16400 entries.
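
As a rough check of that count, here is a small userspace sketch. It is
only an illustration: block_count() is a made-up helper, not a kernel
function, and it assumes the memory is one contiguous range.

```c
#include <assert.h>

/*
 * Hypothetical helper (not kernel code): number of memory blocks
 * needed to cover `total` bytes with blocks of `block` bytes each.
 */
unsigned long long block_count(unsigned long long total,
			       unsigned long long block)
{
	assert(block != 0);

	/* round up so that a partial tail still gets its own block */
	return (total + block - 1) / block;
}
```

With total = 2048ULL << 30, a 128M block size (128ULL << 20) gives 16384
blocks, while a 2g block size (2ULL << 30) gives 1024. The sketch ignores
the mmio hole and the non-block files in that directory, which may be why
it lands slightly under 16400.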

A recent change tries to use the block id to find the block pointer in
/sys for any section, and reuses that block pointer, even when there is
only one section per block. That lookup takes some time ... there is
already a patch that skips the search in that case during boot.

But in other cases we may still have this lookup problem.

So a solution could be to increase the block size, just like the SGI UV
systems do (hard-coded to 2g).

This patch tries to probe the block size so that it matches the mmio
remap size. For example, AMD systems after rev E and Intel Nehalem or
later systems will have the memory ranges
[0, TOM), [4g, TOM2]. If the memory hole is 2g and the total is 128g, TOM
will be 2g, and TOM2 will be 130g.
We could then use 2g as the block size instead of the default 128M.
That will reduce the number of entries in /sys/devices/system/memory/
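
To make the 130g example concrete, the probing loop from the patch can be
sketched in userspace roughly like this. It is a sketch under assumptions,
not the kernel code: probe_block_size_sketch() and the SKETCH_* names are
made up, mem_end stands in for max_pfn << PAGE_SHIFT, SKETCH_MIN_BLOCK
assumes the 128M default mentioned above, and the UV special case is
left out.

```c
#include <assert.h>

/* Hypothetical stand-ins; 128M is the default block size named above. */
#define SKETCH_MIN_BLOCK (128ULL << 20)
#define SKETCH_MAX_BLOCK (2ULL << 30)

/* mem_end: end of memory in bytes, i.e. TOM2 in the example above. */
unsigned long long probe_block_size_sketch(unsigned long long mem_end)
{
	unsigned long long bz = SKETCH_MAX_BLOCK; /* start from 2g */

	/* the masking below only works for power-of-two sizes */
	assert((SKETCH_MIN_BLOCK & (SKETCH_MIN_BLOCK - 1)) == 0);

	/* less than 64g installed: keep the default */
	if (mem_end < (64ULL << 30))
		return SKETCH_MIN_BLOCK;

	/* halve the size until the end of memory is aligned to it */
	while (bz > SKETCH_MIN_BLOCK) {
		if (!(mem_end & (bz - 1)))
			break;
		bz >>= 1;
	}

	return bz;
}
```

For mem_end = 130g this returns 2g, because 130g is a multiple of 2g. An
end at 129g would give 1g, and anything below 64g stays at 128M. Note
that the loop stops at the minimum size even when the end of memory is
not aligned to it.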

Yinghai
