From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Juergen Gross <jgross@suse.com>,
Stephen Hemminger <sthemmin@microsoft.com>,
"H. Peter Anvin" <hpa@zytor.com>,
Peter Zijlstra <peterz@infradead.org>,
Andrew Cooper <andrew.cooper3@citrix.com>,
Boris Ostrovsky <boris.ostrovsky@oracle.com>,
the arch/x86 maintainers <x86@kernel.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Steven Rostedt <rostedt@goodmis.org>,
Andy Lutomirski <luto@amacapital.net>,
Jork Loeser <Jork.Loeser@microsoft.com>,
Ingo Molnar <mingo@redhat.com>,
xen-devel <xen-devel@lists.xenproject.org>,
"Kirill A. Shutemov" <kirill@shutemov.name>,
Vitaly Kuznetsov <vkuznets@redhat.com>,
KY Srinivasan <kys@microsoft.com>,
Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH] x86: enable RCU based table free when PARAVIRT
Date: Thu, 24 Aug 2017 01:36:38 +0300
Message-ID: <20170823223637.bjke4w3wpolrn7md__6650.52086483247$1503549801$gmane$org@black.fi.intel.com>
In-Reply-To: <CA+55aFxDwWgMQa2HGfgWKOxqfepiBu5XVpGj3VJ=f53a=w0kpA@mail.gmail.com>
On Wed, Aug 23, 2017 at 08:27:18PM +0000, Linus Torvalds wrote:
> On Wed, Aug 23, 2017 at 12:59 PM, Kirill A. Shutemov
> <kirill@shutemov.name> wrote:
> >
> > In this case we need performance numbers for !PARAVIRT kernel.
>
> Yes.
>
> > Numbers for tight loop of "mmap(MAP_POPULATE); munmap()" might be
> > interesting too for worst case scenario.
>
> Actually, I don't think you want to populate all the pages. You just
> want to populate *one* page, in order to build up the page directory
> structure, not allocate all the final points.
>
> And we only free the actual page tables when there is nothing around,
> so it should be at least a 2MB-aligned region etc.
>
> So you should do a *big* allocation, and then touch a single page in
> the middle, and then munmap it - that should give you maximal page
> table activity. Otherwise the page tables will generally just stay
> around.
>
> Realistically, it's mainly exit() that frees page tables. Yes, you may
> have a few page tables free'd by a normal munmap(), but it's usually
> very limited. Which is why I suggested that script-heavy thing with
> lots of small executables. That tends to be the main realistic load
> that really causes a ton of page directory activity.
Below is a test case that allocates a lot of page tables and measures
fork/exit time. (I'm not entirely sure it's the best way to stress the
codepath.)
Unpatched: average 4.8322s, stddev 0.114s
Patched: average 4.8362s, stddev 0.111s
Both runs are without PARAVIRT. The patch was modified to enable
HAVE_RCU_TABLE_FREE for !PARAVIRT too.
The test case requires "echo 1 > /proc/sys/vm/overcommit_memory".
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <sys/wait.h>

#define PUD_SIZE (1UL << 30)
#define PMD_SIZE (1UL << 21)
#define NR_PUD 4096
#define NSEC_PER_SEC 1000000000L

int main(void)
{
	char *addr = NULL;
	unsigned long i, j;
	struct timespec start, finish;
	long long nsec;

	/* arg2 = 1 disables THP; the unused arguments must be zero. */
	prctl(PR_SET_THP_DISABLE, 1, 0, 0, 0);

	/* Map NR_PUD regions of 1GB each, hinting each one after the last. */
	for (i = 0; i < NR_PUD ; i++) {
		addr = mmap(addr + PUD_SIZE, PUD_SIZE, PROT_WRITE|PROT_READ,
			    MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
		if (addr == MAP_FAILED) {
			perror("mmap");
			break;
		}
		/* Read one byte per 2MB so every PTE table gets allocated. */
		for (j = 0; j < PUD_SIZE; j += PMD_SIZE)
			assert(addr[j] == 0);
	}

	/* Time fork() + exit() + wait() and print each sample in ns. */
	for (i = 0; i < 10; i++) {
		pid_t pid;

		clock_gettime(CLOCK_MONOTONIC, &start);
		pid = fork();
		if (pid == -1)
			perror("fork");
		if (!pid)
			exit(0);
		wait(NULL);
		clock_gettime(CLOCK_MONOTONIC, &finish);
		nsec = (finish.tv_sec - start.tv_sec) * NSEC_PER_SEC +
			(finish.tv_nsec - start.tv_nsec);
		printf("%lld\n", nsec);
	}
	return 0;
}
--
Kirill A. Shutemov
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel