* alloc_pages_node(... GFP_KERNEL | GFP_THISNODE ...) fails
@ 2009-05-29  0:30 Max Laier
       [not found] ` <f568093c0905281743i63e1a24ak681df87bc83826ce@mail.gmail.com>
  0 siblings, 1 reply; 15+ messages in thread
From: Max Laier @ 2009-05-29  0:30 UTC (permalink / raw)
  To: linux-numa; +Cc: christoph

[-- Attachment #1: Type: text/plain, Size: 1703 bytes --]

Hello,

I'm having a bit of trouble with the NUMA allocator in the kernel.  This 
is in a numa=fake test-setup (though this shouldn't matter, I guess).

I'm trying to allocate pages for KVM VMs from selected nodes (minimal PoC
diff attached - hard-coding the preferred node to 7 [the last node in my
setup, but that doesn't matter - it just demonstrates the point most
effectively]).

The call of interest is:

  page = alloc_pages_node(7, GFP_KERNEL | GFP_THISNODE, 0)
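
(My expectation here comes from the GFP_THISNODE definition - on a
CONFIG_NUMA kernel it is, if I'm reading include/linux/gfp.h right,
roughly the following, i.e. stay on the requested node, don't retry,
don't warn on failure:)

  #define GFP_THISNODE	(__GFP_THISNODE | __GFP_NOWARN | __GFP_NORETRY)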

The problem is that - while page_to_nid() reports that the page is from 
node 7 - "numactl --hardware" doesn't show any allocations from node 7.  
In fact it seems that the memory is allocated from the first node with 
free pages until these run out.  Only then pages from the selected (last) 
node are given out.  Once the selected node is full, alloc_pages_node(... 
GFP_THISNODE ...) returns NULL - as it should - and I fall back to a 
normal allocation, which then also reports a different node ID from
page_to_nid() (cf. the attached diff).

The strange thing is that a simple test module (attached as well) works
as expected.  The allocation succeeds, reports the selected node in
page_to_nid, *and* the free memory reported by "numactl --hardware" for
the selected node decreases.

Any insight as to why the KVM allocation might be special would be much
appreciated.  I tried to follow the call path, but didn't find any red
flags that would explain the difference.

Thanks.

-- 
/"\  Best regards,                      | mlaier@freebsd.org
\ /  Max Laier                          | ICQ #67774661
 X   http://pf4freebsd.love2party.net/  | mlaier@EFnet
/ \  ASCII Ribbon Campaign              | Against HTML Mail and News


[-- Attachment #2: kvm.mmu.c.diff --]
[-- Type: text/x-patch, Size: 704 bytes --]

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 479e748..82a3f56 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -300,9 +300,13 @@ static int mmu_topup_memory_cache_page(struct kvm_mmu_memory_cache *cache,
 	if (cache->nobjs >= min)
 		return 0;
 	while (cache->nobjs < ARRAY_SIZE(cache->objects)) {
-		page = alloc_page(GFP_KERNEL);
-		if (!page)
-			return -ENOMEM;
+		page = alloc_pages_node(7, GFP_KERNEL | GFP_THISNODE, 0);
+		if (!page) {
+			page = alloc_page(GFP_KERNEL);
+			if (!page)                
+				return -ENOMEM;
+			printk("Page from node %d\n", page_to_nid(page));
+		}
 		set_page_private(page, 0);
 		cache->objects[cache->nobjs++] = page_address(page);
 	}

[-- Attachment #3: nodemem.c --]
[-- Type: text/x-csrc, Size: 814 bytes --]

#include <linux/init.h>
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/gfp.h>
#include <linux/mm.h>

MODULE_LICENSE("Dual BSD/GPL");

/* 200 MB */
#define NUM_PAGES (51200)

struct page *pages[NUM_PAGES];

static int __init nodemem_init(void) {
  int i;

  printk(KERN_INFO "Trying to allocate %d pages from node 7\n", NUM_PAGES);

  for (i = 0; i < NUM_PAGES; i++) {
    pages[i] = alloc_pages_node(7, GFP_KERNEL | GFP_THISNODE, 0);
    if (!pages[i]) {
      /* node 7 ran out - release what we already allocated and bail */
      for (i--; i >= 0; i--)
        __free_pages(pages[i], 0);
      return -ENOMEM;
    }
    printk(KERN_INFO "Page %d from node %d\n", i, page_to_nid(pages[i]));
  }

  return 0;
}

static void __exit nodemem_exit(void) {
  int i;

  for (i = 0; i < NUM_PAGES; i++)
    __free_pages(pages[i], 0);
}

module_init(nodemem_init);
module_exit(nodemem_exit);


* Re: alloc_pages_node(... GFP_KERNEL | GFP_THISNODE ...) fails
       [not found] ` <f568093c0905281743i63e1a24ak681df87bc83826ce@mail.gmail.com>
@ 2009-05-29  0:46   ` Christoph Lameter
  2009-05-29  1:09     ` Max Laier
  0 siblings, 1 reply; 15+ messages in thread
From: Christoph Lameter @ 2009-05-29  0:46 UTC (permalink / raw)
  To: Max Laier; +Cc: linux-numa

On Thu, 28 May 2009, Christoph Lameter wrote:

> I'm having a bit of trouble with the NUMA allocator in the kernel.  This
> is in a numa=fake test-setup (though this shouldn't matter, I guess).

Not sure how fake numa works. This could affect the result. page_to_nid
shows a different number than the node from which the page actually came?
Sounds broken.

> node 7 - "numactl --hardware" doesn't show any allocations from node 7.
> In fact it seems that the memory is allocated from the first node with
> free pages until these run out.  Only then pages from the selected (last)
> node are given out.  Once the selected node is full, alloc_pages_node(...
> GFP_THISNODE ...) returns NULL - as it should - and I fall back to a
> normal allocation that then also reports a different node ID from
> page_to_nid (c.f. the attached diff).
>
> The strange thing is, that a simple test module (attached as well) works
> as expected.  The allocation succeeds, reports the selected node in
> page_to_nid *and* the free memory reported from "numactl --hardware" in
> the selected node decreases.
>
> Any insight as to why the KVM allocation might be special are very
> appreciated.  I tried to follow the call path, but didn't find any red
> flags that would indicate the difference.

Please verify your numbers using /proc/zoneinfo.


* Re: alloc_pages_node(... GFP_KERNEL | GFP_THISNODE ...) fails
  2009-05-29  0:46   ` Christoph Lameter
@ 2009-05-29  1:09     ` Max Laier
  2009-05-29 13:54       ` Christoph Lameter
  0 siblings, 1 reply; 15+ messages in thread
From: Max Laier @ 2009-05-29  1:09 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: linux-numa

On Friday 29 May 2009 02:46:57 Christoph Lameter wrote:
> On Thu, 28 May 2009, Christoph Lameter wrote:
> > I'm having a bit of trouble with the NUMA allocator in the kernel. 
> > This is in a numa=fake test-setup (though this shouldn't matter, I
> > guess).
>
> Not sure how fake numa works. This could affect the result. page_to_nid

Simply by giving "numa=fake=256,256*" as a boot option I get several nodes 
of 256M each - that seemed the easiest way to go for testing.  It's set up
in arch/x86/mm/numa_64.c::numa_emulation, and I didn't spot any obvious
difference in the setup compared to the hardware detection.  I do have a
box with k8 NUMA, but unfortunately no KVM support.  Again, remember that 
my test module works flawlessly.

My first reaction was that there must be something in the context that 
affects the allocation (waiting not allowed or some such), but I couldn't 
find anything like this in the call path.

> shows a different number than the node from which the page actually
> came? Sounds broken.
>
> > node 7 - "numactl --hardware" doesn't show any allocations from node
> > 7. In fact it seems that the memory is allocated from the first node
> > with free pages until these run out.  Only then pages from the
> > selected (last) node are given out.  Once the selected node is full,
> > alloc_pages_node(... GFP_THISNODE ...) returns NULL - as it should -
> > and I fall back to a normal allocation that then also reports a
> > different node ID from page_to_nid (c.f. the attached diff).
> >
> > The strange thing is, that a simple test module (attached as well)
> > works as expected.  The allocation succeeds, reports the selected
> > node in page_to_nid *and* the free memory reported from "numactl
> > --hardware" in the selected node decreases.
> >
> > Any insight as to why the KVM allocation might be special are very
> > appreciated.  I tried to follow the call path, but didn't find any
> > red flags that would indicate the difference.
>
> Please verify your numbers using /proc/zoneinfo.

Same result.  "numa_hit" in node 7 increases, while "nr_free_pages" stays 
the same.  Anything else you'd want me to watch out for?

-- 
/"\  Best regards,                      | mlaier@freebsd.org
\ /  Max Laier                          | ICQ #67774661
 X   http://pf4freebsd.love2party.net/  | mlaier@EFnet
/ \  ASCII Ribbon Campaign              | Against HTML Mail and News



* Re: alloc_pages_node(... GFP_KERNEL | GFP_THISNODE ...) fails
  2009-05-29  1:09     ` Max Laier
@ 2009-05-29 13:54       ` Christoph Lameter
  2009-05-29 15:01         ` Andi Kleen
  0 siblings, 1 reply; 15+ messages in thread
From: Christoph Lameter @ 2009-05-29 13:54 UTC (permalink / raw)
  To: Max Laier; +Cc: linux-numa, David Rientjes

On Fri, 29 May 2009, Max Laier wrote:

> Same result.  "numa_hit" in node 7 increases, while "nr_free_pages" stays
> the same.  Anything else you'd want me to watch out for?

That looks like a bug in fake numa.




* Re: alloc_pages_node(... GFP_KERNEL | GFP_THISNODE ...) fails
  2009-05-29 13:54       ` Christoph Lameter
@ 2009-05-29 15:01         ` Andi Kleen
  2009-05-29 16:18           ` Max Laier
  0 siblings, 1 reply; 15+ messages in thread
From: Andi Kleen @ 2009-05-29 15:01 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Max Laier, linux-numa, David Rientjes

On Fri, May 29, 2009 at 09:54:06AM -0400, Christoph Lameter wrote:
> On Fri, 29 May 2009, Max Laier wrote:
> 
> > Same result.  "numa_hit" in node 7 increases, while "nr_free_pages" stays
> > the same.  Anything else you'd want me to watch out for?
> 
> That looks like a bug in fake numa.

I also got some reports of fake numa being a bit broken recently.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.


* Re: alloc_pages_node(... GFP_KERNEL | GFP_THISNODE ...) fails
  2009-05-29 15:01         ` Andi Kleen
@ 2009-05-29 16:18           ` Max Laier
  2009-05-29 16:36             ` Andi Kleen
  0 siblings, 1 reply; 15+ messages in thread
From: Max Laier @ 2009-05-29 16:18 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Christoph Lameter, Max Laier, linux-numa, David Rientjes


Am Fr, 29.05.2009, 17:01, schrieb Andi Kleen:
> On Fri, May 29, 2009 at 09:54:06AM -0400, Christoph Lameter wrote:
>> On Fri, 29 May 2009, Max Laier wrote:
>>
>> > Same result.  "numa_hit" in node 7 increases, while "nr_free_pages"
>> stays
>> > the same.  Anything else you'd want me to watch out for?
>>
>> That looks like a bug in fake numa.
>
> I also got some reports of fake numa being a bit broken recently.

That might have been me too - I sent you a private mail earlier this week.

I guess the question is, what is special about my allocation in KVM as
opposed to the allocation in the test module (that works as expected). 
The userland test tools from the numactl package also work as expected in
membind mode.
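
(For reference, the userland check I mean is roughly the libnuma sketch
below - node 7 and 200 MB are just placeholders from my setup, the memory
has to actually be touched before it shows up in the per-node counters,
and it builds with -lnuma:)

/* Rough userland sketch of the membind-style check: allocate on a
 * chosen node via libnuma, fault the pages in, then compare
 * "numactl --hardware" before and after. */
#include <numa.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
	size_t size = 200UL << 20;	/* 200 MB */
	void *buf;

	if (numa_available() < 0) {
		fprintf(stderr, "no NUMA support\n");
		return 1;
	}
	buf = numa_alloc_onnode(size, 7);
	if (!buf) {
		fprintf(stderr, "numa_alloc_onnode failed\n");
		return 1;
	}
	memset(buf, 0, size);	/* touch it so the pages are really allocated */
	getchar();		/* check numactl --hardware / zoneinfo now */
	numa_free(buf, size);
	return 0;
}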

Also, I have been reading the fake numa setup back and forth and I really
don't see any obvious difference from the ACPI-driven setup - so maybe it's
a general problem, but it's just that more people are running fake numa so
the problem is reported there first?

I'll take another look at the fake numa setup later today.  Any chance
somebody could give the KVM thing a try on real numa hardware?  Though
there are probably not that many systems that have real numa and kvm
support ... dual socket Gainestown setup anyone?

-- 
/"\  Best regards,                      | mlaier@freebsd.org
\ /  Max Laier                          | ICQ #67774661
 X   http://pf4freebsd.love2party.net/  | mlaier@EFnet
/ \  ASCII Ribbon Campaign              | Against HTML Mail and News


* Re: alloc_pages_node(... GFP_KERNEL | GFP_THISNODE ...) fails
  2009-05-29 16:18           ` Max Laier
@ 2009-05-29 16:36             ` Andi Kleen
  2009-05-29 16:45               ` Max Laier
  0 siblings, 1 reply; 15+ messages in thread
From: Andi Kleen @ 2009-05-29 16:36 UTC (permalink / raw)
  To: Max Laier; +Cc: Andi Kleen, Christoph Lameter, linux-numa, David Rientjes

On Fri, May 29, 2009 at 06:18:44PM +0200, Max Laier wrote:
> 
> Am Fr, 29.05.2009, 17:01, schrieb Andi Kleen:
> > On Fri, May 29, 2009 at 09:54:06AM -0400, Christoph Lameter wrote:
> >> On Fri, 29 May 2009, Max Laier wrote:
> >>
> >> > Same result.  "numa_hit" in node 7 increases, while "nr_free_pages"
> >> stays
> >> > the same.  Anything else you'd want me to watch out for?
> >>
> >> That looks like a bug in fake numa.
> >
> > I also got some reports of fake numa being a bit broken recently.
> 
> That might have been me too - I sent you a private mail earlier this week.

No, that was from someone else. Also I don't have an email from you in 
my mailbox, so I didn't see it.

> I guess the question is, what is special about my allocation in KVM as

Define "in KVM"?

Inside the guest? 

> opposed to the allocation in the test module (that works as expected). 

I thought you complained that the test module didn't increase the numastat
counters as expected?

> The userland test tools from the numactl package also work as expected in
> membind mode.
> 
> Also, I have been reading the fake numa setup back and forth and I really

To be honest I don't understand it anymore either since it got so much
new stuff a couple of years back.  All I can say is that it worked when
I wrote it originally and set up the nodes in exactly the same
way as the native NUMA setup on x86-64.


> I'll take another look at the fake numa setup later today.  Any chance
> somebody could give the KVM thing a try on real numa hardware?  Though
> there are probably not that many systems that have real numa and kvm
> support ... dual socket Gainestown setup anyone?

I don't understand. There are lots of KVM capable NUMA systems.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.


* Re: alloc_pages_node(... GFP_KERNEL | GFP_THISNODE ...) fails
  2009-05-29 16:36             ` Andi Kleen
@ 2009-05-29 16:45               ` Max Laier
  2009-05-29 18:26                 ` Andi Kleen
  0 siblings, 1 reply; 15+ messages in thread
From: Max Laier @ 2009-05-29 16:45 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Christoph Lameter, linux-numa, David Rientjes


Am Fr, 29.05.2009, 18:36, schrieb Andi Kleen:
> On Fri, May 29, 2009 at 06:18:44PM +0200, Max Laier wrote:
>> Am Fr, 29.05.2009, 17:01, schrieb Andi Kleen:
>> > On Fri, May 29, 2009 at 09:54:06AM -0400, Christoph Lameter wrote:
>> >> On Fri, 29 May 2009, Max Laier wrote:
>> I guess the question is, what is special about my allocation in KVM as
>
> Define "in KVM"?
>
> Inside the guest?

No, inside the host /for/ the guest.  See the mail that started this
thread.  What I'm trying to do is to pin page allocations for a VM to a
(set of) node(s).

>> opposed to the allocation in the test module (that works as expected).
>
> I thought you complained that the test module didn't increase the numastat
> counters as expected?

No, the test module is fine.  The exact same call in the KVM allocation
routine, however, does not work.  Again, please see my initial post.

-- 
/"\  Best regards,                      | mlaier@freebsd.org
\ /  Max Laier                          | ICQ #67774661
 X   http://pf4freebsd.love2party.net/  | mlaier@EFnet
/ \  ASCII Ribbon Campaign              | Against HTML Mail and News


* Re: alloc_pages_node(... GFP_KERNEL | GFP_THISNODE ...) fails
  2009-05-29 16:45               ` Max Laier
@ 2009-05-29 18:26                 ` Andi Kleen
  2009-05-29 20:39                   ` Max Laier
  0 siblings, 1 reply; 15+ messages in thread
From: Andi Kleen @ 2009-05-29 18:26 UTC (permalink / raw)
  To: Max Laier; +Cc: Andi Kleen, Christoph Lameter, linux-numa, David Rientjes

> No, the test module is fine.  The exact same call in the KVM allocation
> routine, however, does not work.  Again, please see my initial post.

Sorry I don't have time to reverse engineer/debug your patches.

If you can't state the problem clearly there's probably no way
other people can help you on the mailing list.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.


* Re: alloc_pages_node(... GFP_KERNEL | GFP_THISNODE ...) fails
  2009-05-29 18:26                 ` Andi Kleen
@ 2009-05-29 20:39                   ` Max Laier
  2009-06-02 20:13                     ` Andi Kleen
  2009-06-02 22:59                     ` David Rientjes
  0 siblings, 2 replies; 15+ messages in thread
From: Max Laier @ 2009-05-29 20:39 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Christoph Lameter, linux-numa, David Rientjes


Am Fr, 29.05.2009, 20:26, schrieb Andi Kleen:
>> No, the test module is fine.  The exact same call in the KVM allocation
>> routine, however, does not work.  Again, please see my initial post.
>
> Sorry I don't have time to reverse engineer/debug your patches.
>
> If you can't state the problem clearly there's probably no way
> other people can help you on the mailing list.

I'm sorry if I sounded snippy; that wasn't intended.  I hope I haven't
pissed away your good will yet - which is indeed greatly appreciated.  So
let me try again to be more precise in specifying my problem:

I'm trying to replace a call to alloc_page() with alloc_pages_node() in
order to membind the allocation to a specific node set.  This seemed to
work - as page_to_nid() returns the selected node.  However, the free
memory in that node isn't decreasing.
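
(To make "node set" concrete: eventually I want something roughly like
the sketch below, with the allowed mask coming from userland - the
attached patch simply hard-codes node 7 for the PoC.  This is only an
illustration, not the actual change:)

/* Illustration only - not the actual patch.  Assumes the usual headers
 * (<linux/gfp.h>, <linux/nodemask.h>, <linux/mm.h>).  Try each allowed
 * node with GFP_THISNODE and fall back to an ordinary allocation if all
 * of the preferred nodes are full. */
static struct page *alloc_page_from_nodes(nodemask_t *allowed)
{
	struct page *page;
	int nid;

	for_each_node_mask(nid, *allowed) {
		page = alloc_pages_node(nid, GFP_KERNEL | GFP_THISNODE, 0);
		if (page)
			return page;
	}
	return alloc_page(GFP_KERNEL);
}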

I tried to build a test module to recreate the problem, but the test
module doesn't exhibit the bug.  I assume that is because there is
something different with the context from which the allocation is
happening.  I tried to follow the code path of alloc_pages_node() to find
out what that difference might be, but didn't succeed.  Hence I attached a
boiled-down version of my changes that still demonstrates the problem.
It's really a one-line change in KVM.

I was hoping that someone with real numa hardware and kvm capabilities
would be able to test this tiny patch in order to find out if the problem
lies with the fake numa or indeed the context of the allocation so I could
focus my efforts on the right target.

Alternatively, I'd also be grateful if anybody had any idea why
alloc_pages_node() would behave differently in different contexts and what
that difference would be.

Again, sorry if I sounded ungrateful.  I do appreciate any help or
pointers you could give me.  I don't expect a finished solution and am
willing and ready to invest more of my own time in fixing this.  I just
need some idea of which direction to go: fake numa or context.

Thanks.

-- 
/"\  Best regards,                      | mlaier@freebsd.org
\ /  Max Laier                          | ICQ #67774661
 X   http://pf4freebsd.love2party.net/  | mlaier@EFnet
/ \  ASCII Ribbon Campaign              | Against HTML Mail and News


* Re: alloc_pages_node(... GFP_KERNEL | GFP_THISNODE ...) fails
  2009-05-29 20:39                   ` Max Laier
@ 2009-06-02 20:13                     ` Andi Kleen
  2009-06-02 22:59                     ` David Rientjes
  1 sibling, 0 replies; 15+ messages in thread
From: Andi Kleen @ 2009-06-02 20:13 UTC (permalink / raw)
  To: Max Laier; +Cc: Andi Kleen, Christoph Lameter, linux-numa, David Rientjes

> I tried to build a test module to recreate the problem, but the test
> module doesn't exhibit the bug.  I assume that is because there is
> something different with the context from which the allocation is
> happening.  I tried to follow the code path of alloc_pages_node() to find
> out what that difference might be, but didn't succeed.  Hence I attached a
> boiled down version of my changes that still demonstrate the problem. 
> It's really a one line change in kvm.

Yes, it looks harmless.
> 
> I was hoping that someone with real numa hardware and kvm capabilities
> would be able to test this tiny patch in order to find out if the problem
> lies with the fake numa or indeed the context of the allocation so I could
> focus my efforts on the right target.

One way you could test that yourself would be to fake real NUMA -
as in, hardcode a very simple SRAT into drivers/acpi/numa.c that splits
your memory/cores in half.  That should be about equivalent to the
real thing at the Linux level.  Originally fake numa worked like this too,
but that got changed later.

> Alternatively, I'd also be greatful if anybody had any idea why
> alloc_pages_node() would behave differently in different context and what
> that difference would be.

Maybe someone set a task memory policy? I thought NUMA support for that got
added to KVM qemu at some point.  You can check with stracing qemu
or by dumping the policies of the kvm task.
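
(Dumping the policy: for an already running process the easiest way is
to look at /proc/<pid>/numa_maps; from within a process you control you
can query it with get_mempolicy().  Rough sketch of the latter, just to
show what a task policy looks like - link with -lnuma:)

/* Rough sketch: print the calling task's default memory policy.
 * For an already running qemu, look at /proc/<pid>/numa_maps instead. */
#include <numaif.h>
#include <stdio.h>

int main(void)
{
	int mode;
	unsigned long nodes[16] = { 0 };	/* room for up to 1024 nodes */

	if (get_mempolicy(&mode, nodes, 8 * sizeof(nodes), NULL, 0)) {
		perror("get_mempolicy");
		return 1;
	}
	printf("mode=%d (0=default 1=preferred 2=bind 3=interleave) "
	       "nodes[0]=0x%lx\n", mode, nodes[0]);
	return 0;
}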

Actually, you specified GFP_THISNODE, which should avoid this, but maybe
there is a bug somewhere - very likely there is, in fact, judging from your
description.  But I can't see it from a quick look either.

Did you try it without GFP_THISNODE?

-Andi
-- 
ak@linux.intel.com -- Speaking for myself only.


* Re: alloc_pages_node(... GFP_KERNEL | GFP_THISNODE ...) fails
  2009-05-29 20:39                   ` Max Laier
  2009-06-02 20:13                     ` Andi Kleen
@ 2009-06-02 22:59                     ` David Rientjes
  2009-06-03 14:04                       ` Christoph Lameter
  1 sibling, 1 reply; 15+ messages in thread
From: David Rientjes @ 2009-06-02 22:59 UTC (permalink / raw)
  To: Max Laier; +Cc: Andi Kleen, Christoph Lameter, linux-numa

On Fri, 29 May 2009, Max Laier wrote:

> I'm trying to replace a call to alloc_page() with alloc_pages_node() in
> order to membind the allocation to a specific node set.  This seemed to
> work - as page_to_nid() returns the selected node.  However, the free
> memory in that node isn't decreasing.
> 

I'm assuming that pgalloc_{zone} is being incremented correctly in
/proc/vmstat if you have CONFIG_VM_EVENT_COUNTERS enabled (and that you're
using a recent kernel; you never mentioned your version).

Keep in mind the vm stat threshold which will purposefully neglect to 
update ZVCs if it is not reached.

nr_free_pages in /proc/zoneinfo won't decrease if the allocated page was 
from the pcp and is order 0; check the difference in `count' for the 
allocating cpu's pageset on the target node in the same file.
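
Something along these lines will print the fields I mean for one node
(rough sketch, node number hard-coded):

/* Rough sketch: dump nr_free_pages, numa_hit and the per-cpu pageset
 * "count" lines for node 7 from /proc/zoneinfo. */
#include <stdio.h>
#include <string.h>

int main(void)
{
	FILE *f = fopen("/proc/zoneinfo", "r");
	char line[256];
	int in_node = 0;

	if (!f) {
		perror("/proc/zoneinfo");
		return 1;
	}
	while (fgets(line, sizeof(line), f)) {
		if (strncmp(line, "Node ", 5) == 0)
			in_node = (strncmp(line, "Node 7,", 7) == 0);
		if (in_node && (strstr(line, "nr_free_pages") ||
				strstr(line, "numa_hit") ||
				strstr(line, "count:")))
			fputs(line, stdout);
	}
	fclose(f);
	return 0;
}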


* Re: alloc_pages_node(... GFP_KERNEL | GFP_THISNODE ...) fails
  2009-06-02 22:59                     ` David Rientjes
@ 2009-06-03 14:04                       ` Christoph Lameter
  2009-06-03 18:24                         ` David Rientjes
  0 siblings, 1 reply; 15+ messages in thread
From: Christoph Lameter @ 2009-06-03 14:04 UTC (permalink / raw)
  To: David Rientjes; +Cc: Max Laier, Andi Kleen, linux-numa

On Tue, 2 Jun 2009, David Rientjes wrote:

> I'm assuming that pgalloc_{zone} is being incremented correctly in
> /proc/vmstat if you have CONFIG_VM_EVENT_COUNTERS enabled (and that you're
> using a recent kernel, you never mentioned your version).
>
> Keep in mind the vm stat threshold which will purposefully neglect to
> update ZVCs if it is not reached.

ZVCs will be updated after a second or so.  The threshold is to bound
deviations within the ZVC update interval.




* Re: alloc_pages_node(... GFP_KERNEL | GFP_THISNODE ...) fails
  2009-06-03 14:04                       ` Christoph Lameter
@ 2009-06-03 18:24                         ` David Rientjes
  2009-06-14  4:50                           ` Max Laier
  0 siblings, 1 reply; 15+ messages in thread
From: David Rientjes @ 2009-06-03 18:24 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Max Laier, Andi Kleen, linux-numa

On Wed, 3 Jun 2009, Christoph Lameter wrote:

> > I'm assuming that pgalloc_{zone} is being incremented correctly in
> > /proc/vmstat if you have CONFIG_VM_EVENT_COUNTERS enabled (and that you're
> > using a recent kernel, you never mentioned your version).
> >
> > Keep in mind the vm stat threshold which will purposefully neglect to
> > update ZVCs if it is not reached.
> 
> ZVCs will be updated after a second or so. The threshhold is to bound
> deviations within the ZVC update interval.
> 

The point is that you can see disagreement between NUMA_HIT and 
NR_FREE_PAGES depending on when you collect your statistics if the 
threshold has been reached for one ZVC and not another inside the 
interval.

This likely isn't the case for Max since it's easily reproducible; I would 
suggest checking the `count' for the allocating cpu's pcp from 
/proc/zoneinfo, as I described earlier.


* Re: alloc_pages_node(... GFP_KERNEL | GFP_THISNODE ...) fails
  2009-06-03 18:24                         ` David Rientjes
@ 2009-06-14  4:50                           ` Max Laier
  0 siblings, 0 replies; 15+ messages in thread
From: Max Laier @ 2009-06-14  4:50 UTC (permalink / raw)
  To: David Rientjes; +Cc: Christoph Lameter, Andi Kleen, linux-numa

Hi,

thank you for the comments.  The vmstat counters finally put me on the
right track, and I owe you all (Andi in particular) a big apology - the
bug was all mine after all.  The NUMA framework is working as expected,
no matter whether in fake mode or not.  Sorry for the confusion, and
thank you for your patience.

-- 
/"\  Best regards,                      | mlaier@freebsd.org
\ /  Max Laier                          | ICQ #67774661
 X   http://pf4freebsd.love2party.net/  | mlaier@EFnet
/ \  ASCII Ribbon Campaign              | Against HTML Mail and News


Thread overview: 15+ messages
2009-05-29  0:30 alloc_pages_node(... GFP_KERNEL | GFP_THISNODE ...) fails Max Laier
     [not found] ` <f568093c0905281743i63e1a24ak681df87bc83826ce@mail.gmail.com>
2009-05-29  0:46   ` Christoph Lameter
2009-05-29  1:09     ` Max Laier
2009-05-29 13:54       ` Christoph Lameter
2009-05-29 15:01         ` Andi Kleen
2009-05-29 16:18           ` Max Laier
2009-05-29 16:36             ` Andi Kleen
2009-05-29 16:45               ` Max Laier
2009-05-29 18:26                 ` Andi Kleen
2009-05-29 20:39                   ` Max Laier
2009-06-02 20:13                     ` Andi Kleen
2009-06-02 22:59                     ` David Rientjes
2009-06-03 14:04                       ` Christoph Lameter
2009-06-03 18:24                         ` David Rientjes
2009-06-14  4:50                           ` Max Laier
