* [PATCH] x86: enable RCU based table free when PARAVIRT
@ 2017-08-23 13:45 Vitaly Kuznetsov
  2017-08-23 18:26 ` Linus Torvalds
  2017-08-23 18:26 ` Linus Torvalds
  0 siblings, 2 replies; 16+ messages in thread
From: Vitaly Kuznetsov @ 2017-08-23 13:45 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, xen-devel, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Kirill A. Shutemov, Peter Zijlstra,
	Linus Torvalds, Jork Loeser, KY Srinivasan, Stephen Hemminger,
	Steven Rostedt, Juergen Gross, Boris Ostrovsky, Andrew Cooper,
	Andy Lutomirski

On x86, software page-table walkers depend on the fact that a remote TLB
flush does an IPI: the walk is performed locklessly but with interrupts
disabled, so if a page table is being freed, the freeing CPU is blocked
until the walker re-enables interrupts and the remote TLB flush can
complete. On other architectures, which don't require an IPI to do a
remote TLB flush, we have an RCU-based mechanism (see
include/asm-generic/tlb.h for more details).

In virtualized environments we may want to override the .flush_tlb_others
hook in pv_mmu_ops and use a hypercall asking the hypervisor to do the
remote TLB flush for us. This breaks the assumption about the IPI. Xen PV
has done this for years, and the upcoming remote TLB flush for Hyper-V
will do it too. This is not safe: software page-table walkers may step on
an already freed page.
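
As a rough illustration of the RCU-based idea (a user-space toy model, not
the kernel implementation): tables are queued rather than freed
immediately, and are only released once a grace period guarantees that no
lockless walker can still be looking at them. Every name below (toy_batch,
toy_remove_table, toy_grace_period_done, TOY_BATCH_MAX) is hypothetical,
and the grace period is simulated by an explicit call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_BATCH_MAX 8		/* hypothetical batch capacity */

/* Tables queued for freeing; nothing is released until the grace period ends. */
struct toy_batch {
	size_t nr;
	void *tables[TOY_BATCH_MAX];
};

/* Queue a table instead of freeing it right away.
 * Returns 0 on success, -1 if the batch is full. */
int toy_remove_table(struct toy_batch *b, void *table)
{
	if (b->nr == TOY_BATCH_MAX)
		return -1;
	b->tables[b->nr++] = table;
	return 0;
}

/* Simulated end of a grace period: no walker can hold a reference any
 * more, so free everything that was queued. Returns the number freed. */
size_t toy_grace_period_done(struct toy_batch *b)
{
	size_t freed = b->nr;

	while (b->nr)
		free(b->tables[--b->nr]);
	return freed;
}
```

In the kernel, the counterpart of toy_grace_period_done() would presumably
run from an RCU callback rather than being called directly.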

Solve the issue by enabling RCU-based table free mechanism when PARAVIRT
is selected in config. Testing with kernbench doesn't show any notable
performance impact:

6-CPU host:

Average Half load -j 3 Run (std deviation):
CURRENT                                 HAVE_RCU_TABLE_FREE
=======                                 ===================
Elapsed Time 400.498 (0.179679)         Elapsed Time 399.909 (0.162853)
User Time 1098.72 (0.278536)            User Time 1097.59 (0.283894)
System Time 100.301 (0.201629)          System Time 99.736 (0.196254)
Percent CPU 299 (0)                     Percent CPU 299 (0)
Context Switches 5774.1 (69.2121)       Context Switches 5744.4 (79.4162)
Sleeps 87621.2 (78.1093)                Sleeps 87586.1 (99.7079)

Average Optimal load -j 24 Run (std deviation):
CURRENT                                 HAVE_RCU_TABLE_FREE
=======                                 ===================
Elapsed Time 219.03 (0.652534)          Elapsed Time 218.959 (0.598674)
User Time 1119.51 (21.3284)             User Time 1118.81 (21.7793)
System Time 100.499 (0.389308)          System Time 99.8335 (0.251423)
Percent CPU 432.5 (136.974)             Percent CPU 432.45 (136.922)
Context Switches 81827.4 (78029.5)      Context Switches 81818.5 (78051)
Sleeps 97124.8 (9822.4)                 Sleeps 97207.9 (9955.04)

16-CPU host:

Average Half load -j 8 Run (std deviation):
CURRENT                                 HAVE_RCU_TABLE_FREE
=======                                 ===================
Elapsed Time 213.538 (3.7891)           Elapsed Time 212.5 (3.10939)
User Time 1306.4 (1.83399)              User Time 1307.65 (1.01364)
System Time 194.59 (0.864378)           System Time 195.478 (0.794588)
Percent CPU 702.6 (13.5388)             Percent CPU 707 (11.1131)
Context Switches 21189.2 (1199.4)       Context Switches 21288.2 (552.388)
Sleeps 89390.2 (482.325)                Sleeps 89677 (277.06)

Average Optimal load -j 64 Run (std deviation):
CURRENT                                 HAVE_RCU_TABLE_FREE
=======                                 ===================
Elapsed Time 137.866 (0.787928)         Elapsed Time 138.438 (0.218792)
User Time 1488.92 (192.399)             User Time 1489.92 (192.135)
System Time 234.981 (42.5806)           System Time 236.09 (42.8138)
Percent CPU 1057.1 (373.826)            Percent CPU 1057.1 (369.114)
Context Switches 187514 (175324)        Context Switches 187358 (175060)
Sleeps 112633 (24535.5)                 Sleeps 111743 (23297.6)

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: Juergen Gross <jgross@suse.com>
---
Changes since RFC:
- Added Juergen's Acked-by. Fixed a typo in the description.

I didn't get any other feedback on my RFC, so I'm assuming there are no
objections and dropping the RFC tag.
---
 arch/x86/Kconfig           |  1 +
 arch/x86/include/asm/tlb.h |  7 +++++++
 arch/x86/mm/pgtable.c      | 15 +++++++++++----
 3 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 323cb065be5e..8032e1ac14f5 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -168,6 +168,7 @@ config X86
 	select HAVE_PERF_REGS
 	select HAVE_PERF_USER_STACK_DUMP
 	select HAVE_REGS_AND_STACK_ACCESS_API
+	select HAVE_RCU_TABLE_FREE              if SMP && PARAVIRT
 	select HAVE_RELIABLE_STACKTRACE		if X86_64 && FRAME_POINTER && STACK_VALIDATION
 	select HAVE_STACK_VALIDATION		if X86_64
 	select HAVE_SYSCALL_TRACEPOINTS
diff --git a/arch/x86/include/asm/tlb.h b/arch/x86/include/asm/tlb.h
index c7797307fc2b..1d074c560a48 100644
--- a/arch/x86/include/asm/tlb.h
+++ b/arch/x86/include/asm/tlb.h
@@ -15,4 +15,11 @@
 
 #include <asm-generic/tlb.h>
 
+#ifdef CONFIG_HAVE_RCU_TABLE_FREE
+static inline void __tlb_remove_table(void *table)
+{
+	free_page_and_swap_cache(table);
+}
+#endif
+
 #endif /* _ASM_X86_TLB_H */
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 508a708eb9a6..f9a3cdb9b574 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -52,11 +52,18 @@ static int __init setup_userpte(char *arg)
 }
 early_param("userpte", setup_userpte);
 
+#ifndef CONFIG_HAVE_RCU_TABLE_FREE
+static inline void tlb_remove_table(struct mmu_gather *tlb, void *table)
+{
+	return tlb_remove_page(tlb, table);
+}
+#endif
+
 void ___pte_free_tlb(struct mmu_gather *tlb, struct page *pte)
 {
 	pgtable_page_dtor(pte);
 	paravirt_release_pte(page_to_pfn(pte));
-	tlb_remove_page(tlb, pte);
+	tlb_remove_table(tlb, pte);
 }
 
 #if CONFIG_PGTABLE_LEVELS > 2
@@ -72,21 +79,21 @@ void ___pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd)
 	tlb->need_flush_all = 1;
 #endif
 	pgtable_pmd_page_dtor(page);
-	tlb_remove_page(tlb, page);
+	tlb_remove_table(tlb, page);
 }
 
 #if CONFIG_PGTABLE_LEVELS > 3
 void ___pud_free_tlb(struct mmu_gather *tlb, pud_t *pud)
 {
 	paravirt_release_pud(__pa(pud) >> PAGE_SHIFT);
-	tlb_remove_page(tlb, virt_to_page(pud));
+	tlb_remove_table(tlb, virt_to_page(pud));
 }
 
 #if CONFIG_PGTABLE_LEVELS > 4
 void ___p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d)
 {
 	paravirt_release_p4d(__pa(p4d) >> PAGE_SHIFT);
-	tlb_remove_page(tlb, virt_to_page(p4d));
+	tlb_remove_table(tlb, virt_to_page(p4d));
 }
 #endif	/* CONFIG_PGTABLE_LEVELS > 4 */
 #endif	/* CONFIG_PGTABLE_LEVELS > 3 */
-- 
2.13.5

* Re: [PATCH] x86: enable RCU based table free when PARAVIRT
  2017-08-23 13:45 [PATCH] x86: enable RCU based table free when PARAVIRT Vitaly Kuznetsov
  2017-08-23 18:26 ` Linus Torvalds
@ 2017-08-23 18:26 ` Linus Torvalds
  2017-08-23 19:59   ` Kirill A. Shutemov
  2017-08-23 19:59   ` Kirill A. Shutemov
  1 sibling, 2 replies; 16+ messages in thread
From: Linus Torvalds @ 2017-08-23 18:26 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: the arch/x86 maintainers, Linux Kernel Mailing List, xen-devel,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Kirill A. Shutemov,
	Peter Zijlstra, Jork Loeser, KY Srinivasan, Stephen Hemminger,
	Steven Rostedt, Juergen Gross, Boris Ostrovsky, Andrew Cooper,
	Andy Lutomirski

On Wed, Aug 23, 2017 at 6:45 AM, Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
> Solve the issue by enabling RCU-based table free mechanism when PARAVIRT
> is selected in config. Testing with kernbench doesn't show any notable
> performance impact:

I wonder if we should just make it unconditional if it doesn't really
show any performance difference. One less config complexity to worry
about (and in this case I'm not so much worried about Kconfig itself,
as just "oh, you have totally different paths in the core VM depending
on PARAVIRT").

That said, the thing to test for these kinds of things is often
heavily scripted loads that just run thousands and thousands of really
small processes, and build up and tear down page tables all the time
because of fork/exit.

The load I've used occasionally is just "make test" in the git source
tree. Tons and tons of trivial fork/exec/exit things for all those
small tests and shell scripts.

I think 'kernbench' just does kernel compiles. Which is not very
kernel or VM intensive at all. It's mostly just user mode compilers in
parallel.

               Linus

* Re: [PATCH] x86: enable RCU based table free when PARAVIRT
  2017-08-23 18:26 ` Linus Torvalds
@ 2017-08-23 19:59   ` Kirill A. Shutemov
  2017-08-23 20:27     ` Linus Torvalds
  2017-08-23 20:27     ` Linus Torvalds
  2017-08-23 19:59   ` Kirill A. Shutemov
  1 sibling, 2 replies; 16+ messages in thread
From: Kirill A. Shutemov @ 2017-08-23 19:59 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Vitaly Kuznetsov, the arch/x86 maintainers,
	Linux Kernel Mailing List, xen-devel, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Kirill A. Shutemov, Peter Zijlstra,
	Jork Loeser, KY Srinivasan, Stephen Hemminger, Steven Rostedt,
	Juergen Gross, Boris Ostrovsky, Andrew Cooper, Andy Lutomirski

On Wed, Aug 23, 2017 at 11:26:46AM -0700, Linus Torvalds wrote:
> On Wed, Aug 23, 2017 at 6:45 AM, Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> >
> > Solve the issue by enabling RCU-based table free mechanism when PARAVIRT
> > is selected in config. Testing with kernbench doesn't show any notable
> > performance impact:
> 
> I wonder if we should just make it unconditional if it doesn't really
> show any performance difference. One less config complexity to worry
> about (and in this case I'm not so much worried about Kconfig itself,
> as just "oh, you have totally different paths in the core VM depending
> on PARAVIRT").

In this case we need performance numbers for !PARAVIRT kernel.

> That said, the thing to test for these kinds of things is often
> heavily scripted loads that just run thousands and thousands of really
> small processes, and build up and tear down page tables all the time
> because of fork/exit.
> 
> The load I've used occasionally is just "make test" in the git source
> tree. Tons and tons of trivial fork/exec/exit things for all those
> small tests and shell scripts.

Numbers for tight loop of "mmap(MAP_POPULATE); munmap()" might be
interesting too for worst case scenario.

-- 
 Kirill A. Shutemov

* Re: [PATCH] x86: enable RCU based table free when PARAVIRT
  2017-08-23 19:59   ` Kirill A. Shutemov
  2017-08-23 20:27     ` Linus Torvalds
@ 2017-08-23 20:27     ` Linus Torvalds
  2017-08-23 22:36       ` Kirill A. Shutemov
  2017-08-23 22:36       ` Kirill A. Shutemov
  1 sibling, 2 replies; 16+ messages in thread
From: Linus Torvalds @ 2017-08-23 20:27 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Vitaly Kuznetsov, the arch/x86 maintainers,
	Linux Kernel Mailing List, xen-devel, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Kirill A. Shutemov, Peter Zijlstra,
	Jork Loeser, KY Srinivasan, Stephen Hemminger, Steven Rostedt,
	Juergen Gross, Boris Ostrovsky, Andrew Cooper, Andy Lutomirski

On Wed, Aug 23, 2017 at 12:59 PM, Kirill A. Shutemov
<kirill@shutemov.name> wrote:
>
> In this case we need performance numbers for !PARAVIRT kernel.

Yes.

> Numbers for tight loop of "mmap(MAP_POPULATE); munmap()" might be
> interesting too for worst case scenario.

Actually, I don't think you want to populate all the pages. You just
want to populate *one* page, in order to build up the page directory
structure, not allocate all the final points.

And we only free the actual page tables when there is nothing around,
so it should be at least a 2MB-aligned region etc.

So you should do a *big* allocation, and then touch a single page in
the middle, and then munmap it - that should give you maximal page
table activity. Otherwise the page tables will generally just stay
around.

Realistically, it's mainly exit() that frees page tables. Yes, you may
have a few page tables free'd by a normal munmap(), but it's usually
very limited. Which is why I suggested that script-heavy thing with
lots of small executables. That tends to be the main realistic load
that really causes a ton of page directory activity.

              Linus
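
The loop described above could be sketched as follows. This is an
illustration only, not code posted in this thread: the function name
time_map_touch_unmap() is made up, and MAP_NORESERVE is an assumption
added so that large mappings do not depend on the overcommit setting.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <time.h>

/*
 * Map `len` bytes, touch one page in the middle so the page-table
 * levels get built, then unmap the region again. Repeats `iters`
 * times and returns the elapsed time in nanoseconds, or -1 on error.
 */
long long time_map_touch_unmap(size_t len, unsigned int iters)
{
	struct timespec start, finish;
	unsigned int i;

	if (clock_gettime(CLOCK_MONOTONIC, &start))
		return -1;

	for (i = 0; i < iters; i++) {
		char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE,
			       -1, 0);
		if (p == MAP_FAILED)
			return -1;
		p[len / 2] = 1;		/* fault in a single page */
		if (munmap(p, len))
			return -1;
	}

	if (clock_gettime(CLOCK_MONOTONIC, &finish))
		return -1;

	return (finish.tv_sec - start.tv_sec) * 1000000000LL +
	       (finish.tv_nsec - start.tv_nsec);
}
```

Calling it with a large length, for example
time_map_touch_unmap(1UL << 30, 100000), should keep allocating and
freeing page tables on every iteration; the iteration count here is
arbitrary.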

* Re: [PATCH] x86: enable RCU based table free when PARAVIRT
  2017-08-23 20:27     ` Linus Torvalds
  2017-08-23 22:36       ` Kirill A. Shutemov
@ 2017-08-23 22:36       ` Kirill A. Shutemov
  2017-08-23 23:03         ` Linus Torvalds
  2017-08-23 23:03         ` Linus Torvalds
  1 sibling, 2 replies; 16+ messages in thread
From: Kirill A. Shutemov @ 2017-08-23 22:36 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kirill A. Shutemov, Vitaly Kuznetsov, the arch/x86 maintainers,
	Linux Kernel Mailing List, xen-devel, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Peter Zijlstra, Jork Loeser,
	KY Srinivasan, Stephen Hemminger, Steven Rostedt, Juergen Gross,
	Boris Ostrovsky, Andrew Cooper, Andy Lutomirski

On Wed, Aug 23, 2017 at 08:27:18PM +0000, Linus Torvalds wrote:
> On Wed, Aug 23, 2017 at 12:59 PM, Kirill A. Shutemov
> <kirill@shutemov.name> wrote:
> >
> > In this case we need performance numbers for !PARAVIRT kernel.
> 
> Yes.
> 
> > Numbers for tight loop of "mmap(MAP_POPULATE); munmap()" might be
> > interesting too for worst case scenario.
> 
> Actually, I don't think you want to populate all the pages. You just
> want to populate *one* page, in order to build up the page directory
> structure, not allocate all the final points.
> 
> And we only free the actual page tables when there is nothing around,
> so it should be at least a 2MB-aligned region etc.
> 
> So you should do a *big* allocation, and then touch a single page in
> the middle, and then munmap it - that should give you maximal page
> table activity. Otherwise the page tables will generally just stay
> around.
> 
> Realistically, it's mainly exit() that frees page tables. Yes, you may
> have a few page tables free'd by a normal munmap(), but it's usually
> very limited. Which is why I suggested that script-heavy thing with
> lots of small executables. That tends to be the main realistic load
> that really causes a ton of page directory activity.

Below is a test case that allocates a lot of page tables and measures
fork/exit time. (I'm not entirely sure it's the best way to stress the
codepath.)

Unpatched:	average 4.8322s, stddev	0.114s
Patched:	average 4.8362s, stddev	0.111s

Both without PARAVIRT. Patch is modified to enable HAVE_RCU_TABLE_FREE for
!PARAVIRT too.
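
To put the two averages in perspective, a small helper can express the
difference relative to the baseline and to the reported standard
deviation. This is only a sketch and the function names are hypothetical;
for the numbers above the patched run comes out roughly 0.08% slower,
which is well under one standard deviation.

```c
#include <assert.h>

/* Relative change of `patched` against `base`, in percent. */
double rel_change_pct(double base, double patched)
{
	return (patched - base) / base * 100.0;
}

/* Difference expressed in units of the given standard deviation. */
double diff_in_stddevs(double base, double patched, double stddev)
{
	return (patched - base) / stddev;
}
```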

The test-case requires "echo 1 > /proc/sys/vm/overcommit_memory".

#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <sys/wait.h>

#define PUD_SIZE (1UL << 30)
#define PMD_SIZE (1UL << 21)

#define NR_PUD 4096

#define NSEC_PER_SEC	1000000000L

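int main(void)
{
	char *addr = NULL;
	unsigned long i, j;
	struct timespec start, finish;
	long long nsec;

	/*
	 * Disable THP so that touched ranges are mapped with regular
	 * page tables. This option needs arg2 = 1 and arg3..arg5 = 0.
	 */
	if (prctl(PR_SET_THP_DISABLE, 1, 0, 0, 0))
		perror("prctl");

	/*
	 * Map NR_PUD regions of 1GB each and read one byte in every
	 * 2MB range so that a page table is allocated for each of them.
	 */
	for (i = 0; i < NR_PUD ; i++) {
		addr = mmap(addr + PUD_SIZE, PUD_SIZE, PROT_WRITE|PROT_READ,
				MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
		if (addr == MAP_FAILED) {
			perror("mmap");
			break;
		}

		for (j = 0; j < PUD_SIZE; j += PMD_SIZE)
			assert(addr[j] == 0);
	}

	/* Time a fork() + exit() + wait() round trip, ten times. */
	for (i = 0; i < 10; i++) {
		pid_t pid;

		clock_gettime(CLOCK_MONOTONIC, &start);
		pid = fork();
		if (pid == -1)
			perror("fork");
		if (!pid)
			exit(0);
		wait(NULL);
		clock_gettime(CLOCK_MONOTONIC, &finish);

		nsec = (finish.tv_sec - start.tv_sec) * NSEC_PER_SEC +
			(finish.tv_nsec - start.tv_nsec);
		printf("%lld\n", nsec);
	}

	return 0;
}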
-- 
 Kirill A. Shutemov

* Re: [PATCH] x86: enable RCU based table free when PARAVIRT
  2017-08-23 22:36       ` Kirill A. Shutemov
  2017-08-23 23:03         ` Linus Torvalds
@ 2017-08-23 23:03         ` Linus Torvalds
  2017-08-24  8:47           ` Vitaly Kuznetsov
                             ` (3 more replies)
  1 sibling, 4 replies; 16+ messages in thread
From: Linus Torvalds @ 2017-08-23 23:03 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Kirill A. Shutemov, Vitaly Kuznetsov, the arch/x86 maintainers,
	Linux Kernel Mailing List, xen-devel, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Peter Zijlstra, Jork Loeser,
	KY Srinivasan, Stephen Hemminger, Steven Rostedt, Juergen Gross,
	Boris Ostrovsky, Andrew Cooper, Andy Lutomirski

On Wed, Aug 23, 2017 at 3:36 PM, Kirill A. Shutemov
<kirill.shutemov@linux.intel.com> wrote:
>
> Below is a test case that allocates a lot of page tables and measures
> fork/exit time. (I'm not entirely sure it's the best way to stress the
> codepath.)

Looks ok to me. Doing a profile (without the RCU freeing, obviously) gives me

   0.77%  a.out    [kernel.vmlinux]  [k] free_pgd_range

so it does seem to spend time in the page directory code.

> Unpatched:      average 4.8322s, stddev 0.114s
> Patched:        average 4.8362s, stddev 0.111s

Ok, I vote for avoiding the complexity of two different behaviors, and
just making the page table freeing use RCU unconditionally.

If actively trying to trigger that code doesn't show a real measurable
difference, I don't think it matters, and the fewer different code
paths we have, the better.

              Linus

* Re: [PATCH] x86: enable RCU based table free when PARAVIRT
  2017-08-23 23:03         ` Linus Torvalds
  2017-08-24  8:47           ` Vitaly Kuznetsov
@ 2017-08-24  8:47           ` Vitaly Kuznetsov
  2017-08-24  8:47           ` Kirill A. Shutemov
  2017-08-24  8:47           ` Kirill A. Shutemov
  3 siblings, 0 replies; 16+ messages in thread
From: Vitaly Kuznetsov @ 2017-08-24  8:47 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kirill A. Shutemov, Kirill A. Shutemov, the arch/x86 maintainers,
	Linux Kernel Mailing List, xen-devel, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Peter Zijlstra, Jork Loeser,
	KY Srinivasan, Stephen Hemminger, Steven Rostedt, Juergen Gross,
	Boris Ostrovsky, Andrew Cooper, Andy Lutomirski

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Wed, Aug 23, 2017 at 3:36 PM, Kirill A. Shutemov
> <kirill.shutemov@linux.intel.com> wrote:
>>
>> Below is a test case that allocates a lot of page tables and measures
>> fork/exit time. (I'm not entirely sure it's the best way to stress the
>> codepath.)
>
> Looks ok to me. Doing a profile (without the RCU freeing, obviously) gives me
>
>    0.77%  a.out    [kernel.vmlinux]  [k] free_pgd_range
>
> so it does seem to spend time in the page directory code.
>
>> Unpatched:      average 4.8322s, stddev 0.114s
>> Patched:        average 4.8362s, stddev 0.111s
>
> Ok, I vote for avoiding the complexity of two different behaviors, and
> just making the page table freeing use RCU unconditionally.

Thanks Linus & Kirill,

I actually did a microbenchmark with mmap/munmap too but wasn't able
to see any measurable performance difference.

>
> If actively trying to trigger that code doesn't show a real measurable
> difference, I don't think it matters, and the fewer different code
> paths we have, the better.

I'll send v2 enabling HAVE_RCU_TABLE_FREE on x86 unconditionally, thanks!

-- 
  Vitaly

* Re: [PATCH] x86: enable RCU based table free when PARAVIRT
  2017-08-23 23:03         ` Linus Torvalds
                             ` (2 preceding siblings ...)
  2017-08-24  8:47           ` Kirill A. Shutemov
@ 2017-08-24  8:47           ` Kirill A. Shutemov
  3 siblings, 0 replies; 16+ messages in thread
From: Kirill A. Shutemov @ 2017-08-24  8:47 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kirill A. Shutemov, Vitaly Kuznetsov, the arch/x86 maintainers,
	Linux Kernel Mailing List, xen-devel, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Peter Zijlstra, Jork Loeser,
	KY Srinivasan, Stephen Hemminger, Steven Rostedt, Juergen Gross,
	Boris Ostrovsky, Andrew Cooper, Andy Lutomirski

On Wed, Aug 23, 2017 at 04:03:53PM -0700, Linus Torvalds wrote:
> On Wed, Aug 23, 2017 at 3:36 PM, Kirill A. Shutemov
> <kirill.shutemov@linux.intel.com> wrote:
> >
> > Below is a test case that allocates a lot of page tables and measures
> > fork/exit time. (I'm not entirely sure it's the best way to stress the
> > codepath.)
> 
> Looks ok to me. Doing a profile (without the RCU freeing, obviously) gives me
> 
>    0.77%  a.out    [kernel.vmlinux]  [k] free_pgd_range
> 
> so it does seem to spend time in the page directory code.
> 
> > Unpatched:      average 4.8322s, stddev 0.114s
> > Patched:        average 4.8362s, stddev 0.111s
> 
> Ok, I vote for avoiding the complexity of two different behaviors, and
> just making the page table freeing use RCU unconditionally.
> 
> If actively trying to trigger that code doesn't show a real measurable
> difference, I don't think it matters, and the fewer different code
> paths we have, the better.

Numbers from bigger 2-socket machine:

Unpatched:	average 5.0542s, stddev 0.058s
Patched:	average 5.0440s, stddev 0.072s

Still fine.

I don't see a reason not to go this path.
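
To put the fork/exit timings quoted in this thread in perspective, here is
a minimal sketch in C that computes the relative change of the patched
average and checks whether the difference is smaller than the run-to-run
standard deviation. The struct, the function names and the "smaller than
the smaller stddev" criterion are assumptions made for illustration; they
are not part of the benchmark that was posted.

```c
#include <assert.h>

/* Fork/exit timings quoted in this thread, in seconds. */
struct sample {
	const char *label;
	double unpatched_avg, unpatched_sd;
	double patched_avg, patched_sd;
};

/* Hypothetical table holding the two result pairs from the thread. */
const struct sample thread_samples[] = {
	{ "first run",        4.8322, 0.114, 4.8362, 0.111 },
	{ "2-socket machine", 5.0542, 0.058, 5.0440, 0.072 },
};

/* Relative change of the patched average versus unpatched, in percent. */
double rel_change_pct(const struct sample *s)
{
	assert(s->unpatched_avg > 0.0);
	return (s->patched_avg - s->unpatched_avg) / s->unpatched_avg * 100.0;
}

/*
 * Returns 1 if the two averages differ by less than the smaller of the
 * two standard deviations, 0 otherwise. This is a crude noise check,
 * not a proper significance test.
 */
int within_noise(const struct sample *s)
{
	double delta = s->patched_avg - s->unpatched_avg;
	double sd = s->unpatched_sd < s->patched_sd ?
		    s->unpatched_sd : s->patched_sd;

	if (delta < 0.0)
		delta = -delta;
	return delta < sd;
}
```

For both result pairs the relative change is well under 0.3% in magnitude
(roughly +0.08% and -0.2%), and within_noise() returns 1, which matches the
"no measurable difference" reading above.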

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH] x86: enable RCU based table free when PARAVIRT
@ 2017-08-23 13:45 Vitaly Kuznetsov
  0 siblings, 0 replies; 16+ messages in thread
From: Vitaly Kuznetsov @ 2017-08-23 13:45 UTC (permalink / raw)
  To: x86
  Cc: Juergen Gross, Stephen Hemminger, Peter Zijlstra, Andrew Cooper,
	linux-kernel, Steven Rostedt, Andy Lutomirski, Jork Loeser,
	Ingo Molnar, H. Peter Anvin, xen-devel, Thomas Gleixner,
	KY Srinivasan, Linus Torvalds, Boris Ostrovsky,
	Kirill A. Shutemov

On x86, software page-table walkers depend on the fact that a remote TLB
flush does an IPI: the walk is performed locklessly but with interrupts
disabled, so if a page table is being freed, the freeing CPU is blocked
because the remote TLB flush it needs cannot complete while the walker
has interrupts disabled. Other architectures, which don't require an IPI
for a remote TLB flush, use an RCU-based mechanism (see
include/asm-generic/tlb.h for more details).

In virtualized environments we may want to override the .flush_tlb_others
hook in pv_mmu_ops and use a hypercall asking the hypervisor to do the
remote TLB flush for us. This breaks the assumption about the IPI. Xen PV
has done this for years, and the upcoming remote TLB flush for Hyper-V
will do it too. This is not safe: software page-table walkers may step on
an already freed page.

Solve the issue by enabling the RCU-based table free mechanism when
PARAVIRT is selected in the config. Testing with kernbench doesn't show
any notable performance impact:

6-CPU host:

Average Half load -j 3 Run (std deviation):
CURRENT                                 HAVE_RCU_TABLE_FREE
=======                                 ===================
Elapsed Time 400.498 (0.179679)         Elapsed Time 399.909 (0.162853)
User Time 1098.72 (0.278536)            User Time 1097.59 (0.283894)
System Time 100.301 (0.201629)          System Time 99.736 (0.196254)
Percent CPU 299 (0)                     Percent CPU 299 (0)
Context Switches 5774.1 (69.2121)       Context Switches 5744.4 (79.4162)
Sleeps 87621.2 (78.1093)                Sleeps 87586.1 (99.7079)

Average Optimal load -j 24 Run (std deviation):
CURRENT                                 HAVE_RCU_TABLE_FREE
=======                                 ===================
Elapsed Time 219.03 (0.652534)          Elapsed Time 218.959 (0.598674)
User Time 1119.51 (21.3284)             User Time 1118.81 (21.7793)
System Time 100.499 (0.389308)          System Time 99.8335 (0.251423)
Percent CPU 432.5 (136.974)             Percent CPU 432.45 (136.922)
Context Switches 81827.4 (78029.5)      Context Switches 81818.5 (78051)
Sleeps 97124.8 (9822.4)                 Sleeps 97207.9 (9955.04)

16-CPU host:

Average Half load -j 8 Run (std deviation):
CURRENT                                 HAVE_RCU_TABLE_FREE
=======                                 ===================
Elapsed Time 213.538 (3.7891)           Elapsed Time 212.5 (3.10939)
User Time 1306.4 (1.83399)              User Time 1307.65 (1.01364)
System Time 194.59 (0.864378)           System Time 195.478 (0.794588)
Percent CPU 702.6 (13.5388)             Percent CPU 707 (11.1131)
Context Switches 21189.2 (1199.4)       Context Switches 21288.2 (552.388)
Sleeps 89390.2 (482.325)                Sleeps 89677 (277.06)

Average Optimal load -j 64 Run (std deviation):
CURRENT                                 HAVE_RCU_TABLE_FREE
=======                                 ===================
Elapsed Time 137.866 (0.787928)         Elapsed Time 138.438 (0.218792)
User Time 1488.92 (192.399)             User Time 1489.92 (192.135)
System Time 234.981 (42.5806)           System Time 236.09 (42.8138)
Percent CPU 1057.1 (373.826)            Percent CPU 1057.1 (369.114)
Context Switches 187514 (175324)        Context Switches 187358 (175060)
Sleeps 112633 (24535.5)                 Sleeps 111743 (23297.6)

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: Juergen Gross <jgross@suse.com>
---
Changes since RFC:
- Added Juergen's Acked-by. Fixed a typo in the description.

I didn't get any other feedback on my RFC, so I'm assuming there are no
objections and dropping the RFC tag.
---
 arch/x86/Kconfig           |  1 +
 arch/x86/include/asm/tlb.h |  7 +++++++
 arch/x86/mm/pgtable.c      | 15 +++++++++++----
 3 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 323cb065be5e..8032e1ac14f5 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -168,6 +168,7 @@ config X86
 	select HAVE_PERF_REGS
 	select HAVE_PERF_USER_STACK_DUMP
 	select HAVE_REGS_AND_STACK_ACCESS_API
+	select HAVE_RCU_TABLE_FREE              if SMP && PARAVIRT
 	select HAVE_RELIABLE_STACKTRACE		if X86_64 && FRAME_POINTER && STACK_VALIDATION
 	select HAVE_STACK_VALIDATION		if X86_64
 	select HAVE_SYSCALL_TRACEPOINTS
diff --git a/arch/x86/include/asm/tlb.h b/arch/x86/include/asm/tlb.h
index c7797307fc2b..1d074c560a48 100644
--- a/arch/x86/include/asm/tlb.h
+++ b/arch/x86/include/asm/tlb.h
@@ -15,4 +15,11 @@
 
 #include <asm-generic/tlb.h>
 
+#ifdef CONFIG_HAVE_RCU_TABLE_FREE
+static inline void __tlb_remove_table(void *table)
+{
+	free_page_and_swap_cache(table);
+}
+#endif
+
 #endif /* _ASM_X86_TLB_H */
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 508a708eb9a6..f9a3cdb9b574 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -52,11 +52,18 @@ static int __init setup_userpte(char *arg)
 }
 early_param("userpte", setup_userpte);
 
+#ifndef CONFIG_HAVE_RCU_TABLE_FREE
+static inline void tlb_remove_table(struct mmu_gather *tlb, void *table)
+{
+	return tlb_remove_page(tlb, table);
+}
+#endif
+
 void ___pte_free_tlb(struct mmu_gather *tlb, struct page *pte)
 {
 	pgtable_page_dtor(pte);
 	paravirt_release_pte(page_to_pfn(pte));
-	tlb_remove_page(tlb, pte);
+	tlb_remove_table(tlb, pte);
 }
 
 #if CONFIG_PGTABLE_LEVELS > 2
@@ -72,21 +79,21 @@ void ___pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd)
 	tlb->need_flush_all = 1;
 #endif
 	pgtable_pmd_page_dtor(page);
-	tlb_remove_page(tlb, page);
+	tlb_remove_table(tlb, page);
 }
 
 #if CONFIG_PGTABLE_LEVELS > 3
 void ___pud_free_tlb(struct mmu_gather *tlb, pud_t *pud)
 {
 	paravirt_release_pud(__pa(pud) >> PAGE_SHIFT);
-	tlb_remove_page(tlb, virt_to_page(pud));
+	tlb_remove_table(tlb, virt_to_page(pud));
 }
 
 #if CONFIG_PGTABLE_LEVELS > 4
 void ___p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d)
 {
 	paravirt_release_p4d(__pa(p4d) >> PAGE_SHIFT);
-	tlb_remove_page(tlb, virt_to_page(p4d));
+	tlb_remove_table(tlb, virt_to_page(p4d));
 }
 #endif	/* CONFIG_PGTABLE_LEVELS > 4 */
 #endif	/* CONFIG_PGTABLE_LEVELS > 3 */
-- 
2.13.5
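
The idea behind the tlb_remove_table() calls in the patch above is that a
page table is queued and only freed once no lockless walker can still be
looking at it. The following is a single-threaded toy model of that
ordering in userspace C. It is a sketch under loud assumptions: a plain
counter stands in for "a walker is inside its lockless section", the batch
size is arbitrary, and every toy_* name is hypothetical. It is not how the
kernel implements RCU or HAVE_RCU_TABLE_FREE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_MAX_BATCH 8	/* arbitrary batch size for this model */

struct toy_mm {
	int walkers;			/* walkers inside a lockless walk */
	void *pending[TOY_MAX_BATCH];	/* tables unlinked but not yet freed */
	size_t nr_pending;
	size_t nr_freed;
};

/* A walker enters its lockless section (stand-in for IRQs off / RCU read). */
void toy_walk_begin(struct toy_mm *mm)
{
	mm->walkers++;
}

/* Free every queued table, but only if no walker is active. */
size_t toy_try_reclaim(struct toy_mm *mm)
{
	size_t i, n;

	if (mm->walkers > 0)
		return 0;	/* someone may still dereference the tables */

	n = mm->nr_pending;
	for (i = 0; i < n; i++) {
		free(mm->pending[i]);
		mm->pending[i] = NULL;
	}
	mm->nr_pending = 0;
	mm->nr_freed += n;
	return n;
}

/* A walker leaves its section; queued tables may now be reclaimed. */
void toy_walk_end(struct toy_mm *mm)
{
	assert(mm->walkers > 0);
	mm->walkers--;
	toy_try_reclaim(mm);
}

/*
 * Queue a table for deferred freeing instead of freeing it right away.
 * Returns 0 on success, -1 if the batch is full (the toy model has no
 * fallback path for that case).
 */
int toy_remove_table(struct toy_mm *mm, void *table)
{
	if (mm->nr_pending == TOY_MAX_BATCH)
		return -1;
	mm->pending[mm->nr_pending++] = table;
	toy_try_reclaim(mm);
	return 0;
}
```

In this model, a table passed to toy_remove_table() while a walker is
active stays in the pending array and is released only by the matching
toy_walk_end(). With no walker active it is freed immediately. The point
of the design is that the freeing side never relies on an IPI reaching
the walker.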


^ permalink raw reply related	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2017-08-24  8:47 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-23 13:45 [PATCH] x86: enable RCU based table free when PARAVIRT Vitaly Kuznetsov
2017-08-23 18:26 ` Linus Torvalds
2017-08-23 18:26 ` Linus Torvalds
2017-08-23 19:59   ` Kirill A. Shutemov
2017-08-23 20:27     ` Linus Torvalds
2017-08-23 20:27     ` Linus Torvalds
2017-08-23 22:36       ` Kirill A. Shutemov
2017-08-23 22:36       ` Kirill A. Shutemov
2017-08-23 23:03         ` Linus Torvalds
2017-08-23 23:03         ` Linus Torvalds
2017-08-24  8:47           ` Vitaly Kuznetsov
2017-08-24  8:47           ` Vitaly Kuznetsov
2017-08-24  8:47           ` Kirill A. Shutemov
2017-08-24  8:47           ` Kirill A. Shutemov
2017-08-23 19:59   ` Kirill A. Shutemov
2017-08-23 13:45 Vitaly Kuznetsov
