linux-mm.kvack.org archive mirror
* brk1 tests the wrong thing
From: Matthew Wilcox @ 2020-12-10 20:07 UTC
  To: anton; +Cc: linux-mm, Liam R. Howlett

Linux has this horrendously complicated anon_vma structure that you don't
care about, but the upshot is that after calling fork(), each process
that calls brk() gets a _new_ VMA created.  That is, after calling brk()
the first time, the process address space looks like this:

557777fab000-557777ff0000 rw-p 00000000 00:00 0                          [heap]
557777ff0000-557777ff1000 rw-p 00000000 00:00 0                          [heap]

so what brk1 is actually testing is how long it takes to create & destroy
a new VMA.  This does not match what most programs do -- most will call
exec() which resets the anon_vma structures and starts each program off
with its own heap.  And if you do have a multi-process program which
uses brk(), chances are it doesn't just oscillate between zero and one
extra page of heap compared to its parent.
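
One way to observe the effect described above is to fork, grow the heap in
the child, and count the "[heap]" lines in /proc/self/maps.  The following
is a minimal sketch, not part of brk1.c or brk2.c: the helper names
count_heap_vmas() and heap_vmas_in_child_after_brk() are made up for
illustration, and it assumes a Linux system with /proc mounted.  Whether the
child ends up with one or two heap VMAs depends on the kernel's VMA merging
rules, so the count is printed rather than assumed.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: count the "[heap]" labels in /proc/self/maps.
 * Uses read() into a static buffer so that the helper itself does not
 * call malloc() and move the break.  Returns -1 if /proc is unavailable.
 */
static int count_heap_vmas(void)
{
	static char buf[65536];
	size_t len = 0;
	ssize_t r;
	int n = 0;
	int fd = open("/proc/self/maps", O_RDONLY);

	if (fd < 0)
		return -1;
	while (len < sizeof(buf) - 1 &&
	       (r = read(fd, buf + len, sizeof(buf) - 1 - len)) > 0)
		len += (size_t)r;
	close(fd);
	buf[len] = '\0';

	for (char *p = buf; (p = strstr(p, "[heap]")) != NULL; p += 6)
		n++;
	return n;
}

/*
 * Hypothetical helper: make sure the parent has a populated heap, fork,
 * grow the heap by one page in the child, and report how many heap VMAs
 * the child sees.  Returns -1 on any failure.
 */
static int heap_vmas_in_child_after_brk(void)
{
	long page_size = sysconf(_SC_PAGESIZE);
	char *p = sbrk(page_size);
	pid_t pid;
	int status;

	if (p == (void *)-1)
		return -1;
	p[0] = 1;	/* fault the page in so the parent's heap is in use */

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		int n = -1;

		if (sbrk(page_size) != (void *)-1)
			n = count_heap_vmas();
		/* Pass the count back through the exit status. */
		_exit(n < 0 || n > 254 ? 255 : n);
	}
	if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
		return -1;
	int n = WEXITSTATUS(status);
	return n == 255 ? -1 : n;
}

#ifdef DEMO_MAIN
int main(void)
{
	printf("[heap] VMAs in child: %d\n", heap_vmas_in_child_after_brk());
	return 0;
}
#endif
```

Build with -DDEMO_MAIN to get a standalone program.  A count of two would
correspond to the split heap shown in the maps output above; a count of one
would mean the kernel merged the child's extension into the inherited VMA.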

A better test starts out by allocating one page on the heap and then
throbs between one and two pages instead of throbbing between zero and
one page.  That means we're actually testing expanding and contracting
the heap instead of creating and destroying a new heap.

For realism, I wanted to add actually accessing the memory in the new
heap, but that doesn't work for the threaded case -- another thread
might remove the memory you just allocated while you're accessing it.
Threaded programs give each thread its own heap anyway, so asking about
the threaded scalability of this syscall is kind of pointless.
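
For the single-process case, where no other thread can move the break,
touching the new page is safe.  The sketch below shows what that could look
like; it is an illustration only and not the proposed testcase.  The function
name brk_touch_rounds() and its bounded loop are assumptions made here so the
example terminates and can be checked.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <unistd.h>

/*
 * Hypothetical single-process variant: keep one extra page of heap, then
 * repeatedly grow by a second page, write to it, and shrink again.
 * Returns the number of brk() calls made, or -1 on failure.
 * Not safe with threads: another thread could shrink the heap between
 * the brk() call and the store.
 */
static long brk_touch_rounds(long rounds)
{
	long page_size = sysconf(_SC_PAGESIZE);
	char *base = sbrk(page_size);	/* returns the old break */
	char *addr;
	long calls = 0;

	if (base == (void *)-1)
		return -1;
	addr = base + page_size;	/* current break: one extra page */

	for (long i = 0; i < rounds; i++) {
		if (brk(addr + page_size) != 0)
			return -1;
		addr[0] = 1;	/* touch the first byte of the new page */
		if (brk(addr) != 0)
			return -1;
		calls += 2;
	}
	return calls;
}

#ifdef DEMO_MAIN
int main(void)
{
	return brk_touch_rounds(100000) < 0;
}
#endif
```

The store forces a page fault on every round, so this variant would measure
fault and unmap cost on top of the VMA expansion and contraction itself.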

Anyway, here's brk2.c.  It is not very different from brk1.c, but the
performance results are quite different (actually worse by about 10-15%).


#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

char *testcase_description = "brk unshared increase/decrease of one page";

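void testcase(unsigned long long *iterations, unsigned long nr)
{
	unsigned long page_size = getpagesize();
	/* Grow the heap by one page up front; addr is the new break. */
	void *addr = sbrk(page_size) + page_size;

	while (1) {
		/* Expand the heap by one more page... */
		addr += page_size;
		assert(brk(addr) == 0);

		/* ...and shrink it back, keeping the first extra page. */
		addr -= page_size;
		assert(brk(addr) == 0);

		(*iterations) += 2;
	}
}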



* Re: brk1 tests the wrong thing
From: Matthew Wilcox @ 2021-01-11 15:41 UTC
  To: anton; +Cc: linux-mm, Liam R. Howlett


ping




* Re: brk1 tests the wrong thing
From: Anton Blanchard @ 2021-01-11 20:17 UTC
  To: Matthew Wilcox; +Cc: linux-mm, Liam R. Howlett

Hi Willy,

Thanks for this, I've merged it.

Anton



