* [RFC v4 0/2] x86/xen: add xen hypercall preemption
@ 2015-01-23  0:29 Luis R. Rodriguez
  2015-01-23  0:29 ` [RFC v4 1/2] x86/xen: add xen_is_preemptible_hypercall() Luis R. Rodriguez
                   ` (5 more replies)
  0 siblings, 6 replies; 31+ messages in thread
From: Luis R. Rodriguez @ 2015-01-23  0:29 UTC (permalink / raw)
  To: david.vrabel, konrad.wilk, boris.ostrovsky, xen-devel
  Cc: linux-kernel, x86, kvm, paulmck, rostedt, Luis R. Rodriguez

From: "Luis R. Rodriguez" <mcgrof@suse.com>

This v4 addresses some of the recommended cleanups and adds a
tracing option for when we actually do preempt a hypercall.
I kept the NOKPROBE_SYMBOL() for now but did remove the 'notrace'
stuff.

This still goes out as an RFC as I have not been able to test 32-bit.
Can anyone test that, or at least confirm that on 32-bit the point
where we do the upcall is definitely not on the IRQ stack?

Luis R. Rodriguez (2):
  x86/xen: add xen_is_preemptible_hypercall()
  x86/xen: allow privcmd hypercalls to be preempted

 arch/arm/include/asm/xen/hypercall.h |  5 +++++
 arch/x86/include/asm/xen/hypercall.h | 20 ++++++++++++++++++++
 arch/x86/kernel/entry_32.S           |  2 ++
 arch/x86/kernel/entry_64.S           |  2 ++
 arch/x86/xen/enlighten.c             |  7 +++++++
 arch/x86/xen/xen-head.S              | 18 +++++++++++++++++-
 drivers/xen/events/events_base.c     | 23 +++++++++++++++++++++++
 include/trace/events/xen.h           |  9 +++++++++
 include/xen/events.h                 |  1 +
 9 files changed, 86 insertions(+), 1 deletion(-)

-- 
2.1.1



* [RFC v4 1/2] x86/xen: add xen_is_preemptible_hypercall()
  2015-01-23  0:29 [RFC v4 0/2] x86/xen: add xen hypercall preemption Luis R. Rodriguez
@ 2015-01-23  0:29 ` Luis R. Rodriguez
  2015-01-23  1:40   ` Andy Lutomirski
                     ` (3 more replies)
  2015-01-23  0:29 ` Luis R. Rodriguez
                   ` (4 subsequent siblings)
  5 siblings, 4 replies; 31+ messages in thread
From: Luis R. Rodriguez @ 2015-01-23  0:29 UTC (permalink / raw)
  To: david.vrabel, konrad.wilk, boris.ostrovsky, xen-devel
  Cc: linux-kernel, x86, kvm, paulmck, rostedt, Luis R. Rodriguez,
	Andy Lutomirski, Borislav Petkov, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Masami Hiramatsu, Jan Beulich

From: "Luis R. Rodriguez" <mcgrof@suse.com>

On kernels with voluntary or no preemption we can run into
situations where a hypercall issued from userspace lingers
as it works through its sub-operations in kernel context
(multicalls). Such operations can trigger soft lockup
detection.

We want a way to let the kernel voluntarily preempt such calls
even on non-preempt kernels. To do that we first need to
distinguish which hypercalls fall under this category. This
implements xen_is_preemptible_hypercall(), which lets us do just
that by adding a secondary hypercall page: calls made via the
new page may be preempted.
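
As a rough userspace illustration of the idea (not the kernel code
itself), the check boils down to an address-range test against the
secondary page, with each hypercall stub occupying a 32-byte slot.
All sketch_* names, the 4 KiB page size and the byte array standing
in for the page are assumptions made for this example only:

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE  4096u  /* assumed 4 KiB page, as on x86 */
#define SKETCH_ENTRY_SIZE 32u    /* each hypercall stub occupies 32 bytes */

/* Hypothetical stand-in for the secondary page; the real one lives in .text. */
static unsigned char sketch_preemptible_page[SKETCH_PAGE_SIZE];

/* Address of the stub for hypercall number `call` inside the page. */
static uintptr_t sketch_entry_addr(unsigned int call)
{
	return (uintptr_t)sketch_preemptible_page +
	       (uintptr_t)call * SKETCH_ENTRY_SIZE;
}

/*
 * True when the interrupted instruction pointer lies inside the
 * secondary page and the interrupted context was not user mode.
 */
static bool sketch_is_preemptible(uintptr_t ip, bool user_mode)
{
	uintptr_t base = (uintptr_t)sketch_preemptible_page;

	return !user_mode && ip >= base && ip < base + SKETCH_PAGE_SIZE;
}
```

An address a few bytes into any stub returned by sketch_entry_addr()
passes the test, while the first byte past the page, or any address
seen while in user mode, does not.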

Andrew had originally submitted a version of this work [0].

[0] http://lists.xen.org/archives/html/xen-devel/2014-02/msg01056.html

Based on original work by: Andrew Cooper <andrew.cooper3@citrix.com>

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Jan Beulich <JBeulich@suse.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
---
 arch/arm/include/asm/xen/hypercall.h |  5 +++++
 arch/x86/include/asm/xen/hypercall.h | 20 ++++++++++++++++++++
 arch/x86/xen/enlighten.c             |  7 +++++++
 arch/x86/xen/xen-head.S              | 18 +++++++++++++++++-
 4 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/xen/hypercall.h b/arch/arm/include/asm/xen/hypercall.h
index 712b50e..4fc8395 100644
--- a/arch/arm/include/asm/xen/hypercall.h
+++ b/arch/arm/include/asm/xen/hypercall.h
@@ -74,4 +74,9 @@ MULTI_mmu_update(struct multicall_entry *mcl, struct mmu_update *req,
 	BUG();
 }
 
+static inline bool xen_is_preemptible_hypercall(struct pt_regs *regs)
+{
+	return false;
+}
+
 #endif /* _ASM_ARM_XEN_HYPERCALL_H */
diff --git a/arch/x86/include/asm/xen/hypercall.h b/arch/x86/include/asm/xen/hypercall.h
index ca08a27..221008e 100644
--- a/arch/x86/include/asm/xen/hypercall.h
+++ b/arch/x86/include/asm/xen/hypercall.h
@@ -84,6 +84,22 @@
 
 extern struct { char _entry[32]; } hypercall_page[];
 
+#ifndef CONFIG_PREEMPT
+extern struct { char _entry[32]; } preemptible_hypercall_page[];
+
+static inline bool xen_is_preemptible_hypercall(struct pt_regs *regs)
+{
+	return !user_mode_vm(regs) &&
+		regs->ip >= (unsigned long)preemptible_hypercall_page &&
+		regs->ip < (unsigned long)preemptible_hypercall_page + PAGE_SIZE;
+}
+#else
+static inline bool xen_is_preemptible_hypercall(struct pt_regs *regs)
+{
+	return false;
+}
+#endif
+
 #define __HYPERCALL		"call hypercall_page+%c[offset]"
 #define __HYPERCALL_ENTRY(x)						\
 	[offset] "i" (__HYPERVISOR_##x * sizeof(hypercall_page[0]))
@@ -215,7 +231,11 @@ privcmd_call(unsigned call,
 
 	asm volatile("call *%[call]"
 		     : __HYPERCALL_5PARAM
+#ifndef CONFIG_PREEMPT
+		     : [call] "a" (&preemptible_hypercall_page[call])
+#else
 		     : [call] "a" (&hypercall_page[call])
+#endif
 		     : __HYPERCALL_CLOBBER5);
 
 	return (long)__res;
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 6bf3a13..9c01b48 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -84,6 +84,9 @@
 #include "multicalls.h"
 
 EXPORT_SYMBOL_GPL(hypercall_page);
+#ifndef CONFIG_PREEMPT
+EXPORT_SYMBOL_GPL(preemptible_hypercall_page);
+#endif
 
 /*
  * Pointer to the xen_vcpu_info structure or
@@ -1531,6 +1534,10 @@ asmlinkage __visible void __init xen_start_kernel(void)
 #endif
 	xen_setup_machphys_mapping();
 
+#ifndef CONFIG_PREEMPT
+	copy_page(preemptible_hypercall_page, hypercall_page);
+#endif
+
 	/* Install Xen paravirt ops */
 	pv_info = xen_info;
 	pv_init_ops = xen_init_ops;
diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S
index 674b2225..6e6a9517 100644
--- a/arch/x86/xen/xen-head.S
+++ b/arch/x86/xen/xen-head.S
@@ -85,9 +85,18 @@ ENTRY(xen_pvh_early_cpu_init)
 .pushsection .text
 	.balign PAGE_SIZE
 ENTRY(hypercall_page)
+
+#ifdef CONFIG_PREEMPT
+#  define PREEMPT_HYPERCALL_ENTRY(x)
+#else
+#  define PREEMPT_HYPERCALL_ENTRY(x) \
+       .global xen_hypercall_##x ## _p ASM_NL \
+       .set preemptible_xen_hypercall_##x, xen_hypercall_##x + PAGE_SIZE ASM_NL
+#endif
 #define NEXT_HYPERCALL(x) \
 	ENTRY(xen_hypercall_##x) \
-	.skip 32
+	.skip 32 ASM_NL \
+	PREEMPT_HYPERCALL_ENTRY(x)
 
 NEXT_HYPERCALL(set_trap_table)
 NEXT_HYPERCALL(mmu_update)
@@ -138,6 +147,13 @@ NEXT_HYPERCALL(arch_4)
 NEXT_HYPERCALL(arch_5)
 NEXT_HYPERCALL(arch_6)
 	.balign PAGE_SIZE
+
+#ifndef CONFIG_PREEMPT
+ENTRY(preemptible_hypercall_page)
+	.skip PAGE_SIZE
+#endif /* CONFIG_PREEMPT */
+
+#undef NEXT_HYPERCALL
 .popsection
 
 	ELFNOTE(Xen, XEN_ELFNOTE_GUEST_OS,       .asciz "linux")
-- 
2.1.1



* [RFC v4 2/2] x86/xen: allow privcmd hypercalls to be preempted
  2015-01-23  0:29 [RFC v4 0/2] x86/xen: add xen hypercall preemption Luis R. Rodriguez
  2015-01-23  0:29 ` [RFC v4 1/2] x86/xen: add xen_is_preemptible_hypercall() Luis R. Rodriguez
  2015-01-23  0:29 ` Luis R. Rodriguez
@ 2015-01-23  0:29 ` Luis R. Rodriguez
  2015-01-23  1:40   ` Andy Lutomirski
                     ` (3 more replies)
  2015-01-23  0:29 ` Luis R. Rodriguez
                   ` (2 subsequent siblings)
  5 siblings, 4 replies; 31+ messages in thread
From: Luis R. Rodriguez @ 2015-01-23  0:29 UTC (permalink / raw)
  To: david.vrabel, konrad.wilk, boris.ostrovsky, xen-devel
  Cc: linux-kernel, x86, kvm, paulmck, rostedt, Luis R. Rodriguez,
	Andy Lutomirski, Borislav Petkov, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Masami Hiramatsu, Jan Beulich

From: "Luis R. Rodriguez" <mcgrof@suse.com>

Xen has support for splitting heavy work into a series of
hypercalls, called multicalls, and preempting them through
what Xen calls continuation [0]. Despite this, without
CONFIG_PREEMPT preemption won't happen, and without preemption
a system can become pretty useless on heavy-handed hypercalls.
Such is the case, for example, when creating a > 50 GiB HVM guest,
where we can get soft lockups [1] with:

kernel: [  802.084335] BUG: soft lockup - CPU#1 stuck for 22s! [xend:31351]

The soft lockup triggers on the TASK_UNINTERRUPTIBLE hanger check
(default 120 seconds). On the Xen side, in this particular case,
this happens when the following Xen hypervisor code is used:

xc_domain_set_pod_target() -->
  do_memory_op() -->
    arch_memory_op() -->
      p2m_pod_set_mem_target()
	-- long delay (real or emulated) --

This happens on arch_memory_op() on the XENMEM_set_pod_target memory
op even though arch_memory_op() can handle continuation via
hypercall_create_continuation() for example.

Machines with over 50 GiB of memory are in high demand and hard to
come by, so to help replicate this sort of issue long delays on
select hypercalls have been emulated in order to be able to test
this on smaller machines [2].

On one hand this issue can be considered expected given that
CONFIG_PREEMPT=n is used. However, the kernel has precedent for
forcing voluntary preemption even with CONFIG_PREEMPT=n through
the use of cond_resched() sprinkled in many places. To address
this issue with Xen hypercalls, though, we need to find a way to
yield to the scheduler in the middle of hypercalls. We are motivated
to address this issue on CONFIG_PREEMPT=n as otherwise the system
becomes rather unresponsive for long periods of time. In the worst
case (so far seen only when emulating long delays on select
disk-I/O-bound hypercalls) this can lead to filesystem corruption if
the delay happens, for example, on SCHEDOP_remote_shutdown (when we
call 'xl <domain> shutdown').

We can address this problem by checking, on return from the Xen
timer interrupt in the middle of a hypercall, whether we should
schedule. We want to be careful not to always force voluntary
preemption, though, so we only selectively enable preemption
on very specific Xen hypercalls.

This enables hypercall preemption by selectively forcing checks for
voluntary preemption only on ioctl-initiated private hypercalls,
where we know some folks have run into reported issues [1].
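
The decision made at the end of the upcall can be modelled in plain
C roughly as follows. This is only a sketch under stated assumptions:
struct sketch_cpu and the sketch_* helpers are hypothetical stand-ins
for the scheduler state, for _cond_resched() and for the trace event,
and none of them exist in the kernel:

```c
#include <stdbool.h>

/* Hypothetical model of one CPU's state at the end of an event upcall. */
struct sketch_cpu {
	bool need_resched;        /* scheduler wants to run something else */
	bool in_preemptible_call; /* interrupted IP was in the secondary page */
	unsigned int preemptions; /* stand-in for the trace event */
};

/* Models a conditional reschedule: yields only if one is pending. */
static bool sketch_cond_resched(struct sketch_cpu *cpu)
{
	if (!cpu->need_resched)
		return false;
	cpu->need_resched = false; /* pretend we scheduled away and came back */
	return true;
}

/* Models the check made on return from the upcall. */
static void sketch_end_upcall(struct sketch_cpu *cpu)
{
	/* Only hypercalls issued via the secondary page are candidates. */
	if (cpu->in_preemptible_call && sketch_cond_resched(cpu))
		cpu->preemptions++;
}
```

In this model a pending reschedule is honoured only when the upcall
interrupted a preemptible hypercall; every other upcall returns
without yielding, which is the selectivity described above.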

This also adds a trace event to be able to review when Xen hypercalls
are preempted. Right now we just tell you when it happens and which
CPU got preempted.

ergon:~ # echo 1 > /sys/kernel/debug/tracing/events/xen/xen_hypercall_preemption/trigger
ergon:~ # cat /sys/kernel/debug/tracing/trace_pipe
...
 qemu-system-i38-2114  [000] ....   491.038440: xen_hypercall_preemption: on CPU 0
 qemu-system-i38-2114  [003] ....   518.138592: xen_hypercall_preemption: on CPU 3
...

[0] http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=42217cbc5b3e84b8c145d8cfb62dd5de0134b9e8;hp=3a0b9c57d5c9e82c55dd967c84dd06cb43c49ee9
[1] https://bugzilla.novell.com/show_bug.cgi?id=861093
[2] http://ftp.suse.com/pub/people/mcgrof/xen/emulate-long-xen-hypercalls.patch

Based on original work by: David Vrabel <david.vrabel@citrix.com>
Suggested-by: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Jan Beulich <JBeulich@suse.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
---
 arch/x86/kernel/entry_32.S       |  2 ++
 arch/x86/kernel/entry_64.S       |  2 ++
 drivers/xen/events/events_base.c | 23 +++++++++++++++++++++++
 include/trace/events/xen.h       |  9 +++++++++
 include/xen/events.h             |  1 +
 5 files changed, 37 insertions(+)

diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
index 000d419..b4b1f42 100644
--- a/arch/x86/kernel/entry_32.S
+++ b/arch/x86/kernel/entry_32.S
@@ -982,6 +982,8 @@ ENTRY(xen_hypervisor_callback)
 ENTRY(xen_do_upcall)
 1:	mov %esp, %eax
 	call xen_evtchn_do_upcall
+	movl %esp,%eax
+	call xen_end_upcall
 	jmp  ret_from_intr
 	CFI_ENDPROC
 ENDPROC(xen_hypervisor_callback)
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 9ebaf63..ee28733 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -1198,6 +1198,8 @@ ENTRY(xen_do_hypervisor_callback)   # do_hypervisor_callback(struct *pt_regs)
 	popq %rsp
 	CFI_DEF_CFA_REGISTER rsp
 	decl PER_CPU_VAR(irq_count)
+	movq %rsp, %rdi  /* pass pt_regs as first argument */
+	call xen_end_upcall
 	jmp  error_exit
 	CFI_ENDPROC
 END(xen_do_hypervisor_callback)
diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index b4bca2d..8126642 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -32,6 +32,10 @@
 #include <linux/slab.h>
 #include <linux/irqnr.h>
 #include <linux/pci.h>
+#include <linux/sched.h>
+#include <linux/kprobes.h>
+
+#include <trace/events/xen.h>
 
 #ifdef CONFIG_X86
 #include <asm/desc.h>
@@ -1243,6 +1247,25 @@ void xen_evtchn_do_upcall(struct pt_regs *regs)
 	set_irq_regs(old_regs);
 }
 
+/*
+ * CONFIG_PREEMPT=n kernels can end up triggering the softlock
+ * TASK_UNINTERRUPTIBLE hanger check (default 120 seconds)
+ * when certain multicalls are used [0] on large systems, in
+ * that case we need a way to voluntarily preempt. This is
+ * only an issue on CONFIG_PREEMPT=n kernels.
+ *
+ * [0] https://bugzilla.novell.com/show_bug.cgi?id=861093
+ */
+void xen_end_upcall(struct pt_regs *regs)
+{
+	if (xen_is_preemptible_hypercall(regs)) {
+		int cpuid = smp_processor_id();
+		if (_cond_resched())
+			trace_xen_hypercall_preemption(cpuid);
+	}
+}
+NOKPROBE_SYMBOL(xen_end_upcall);
+
 void xen_hvm_evtchn_do_upcall(void)
 {
 	__xen_evtchn_do_upcall();
diff --git a/include/trace/events/xen.h b/include/trace/events/xen.h
index d06b6da..77e333b 100644
--- a/include/trace/events/xen.h
+++ b/include/trace/events/xen.h
@@ -509,6 +509,15 @@ TRACE_EVENT(xen_cpu_set_ldt,
 		      __entry->addr, __entry->entries)
 	);
 
+TRACE_EVENT(xen_hypercall_preemption,
+	    TP_PROTO(int cpu_id),
+	    TP_ARGS(cpu_id),
+	    TP_STRUCT__entry(
+		    __field(int, cpuid)
+		    ),
+	    TP_fast_assign(__entry->cpuid = cpu_id),
+	    TP_printk("on CPU %d", __entry->cpuid)
+	);
 
 #endif /*  _TRACE_XEN_H */
 
diff --git a/include/xen/events.h b/include/xen/events.h
index 5321cd9..f08df87 100644
--- a/include/xen/events.h
+++ b/include/xen/events.h
@@ -95,6 +95,7 @@ void xen_hvm_callback_vector(void);
 extern int xen_have_vector_callback;
 int xen_set_callback_via(uint64_t via);
 void xen_evtchn_do_upcall(struct pt_regs *regs);
+void xen_end_upcall(struct pt_regs *regs);
 void xen_hvm_evtchn_do_upcall(void);
 
 /* Bind a pirq for a physical interrupt to an irq. */
-- 
2.1.1



* Re: [RFC v4 2/2] x86/xen: allow privcmd hypercalls to be preempted
  2015-01-23  0:29 ` [RFC v4 2/2] x86/xen: allow privcmd hypercalls to be preempted Luis R. Rodriguez
  2015-01-23  1:40   ` Andy Lutomirski
@ 2015-01-23  1:40   ` Andy Lutomirski
  2015-01-23  1:57     ` Steven Rostedt
  2015-01-23  1:57     ` Steven Rostedt
  2015-01-23 11:45   ` [Xen-devel] " David Vrabel
  2015-01-23 11:45   ` David Vrabel
  3 siblings, 2 replies; 31+ messages in thread
From: Andy Lutomirski @ 2015-01-23  1:40 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: David Vrabel, Konrad Rzeszutek Wilk, Boris Ostrovsky, xen-devel,
	linux-kernel, X86 ML, kvm list, Paul McKenney, Steven Rostedt,
	Luis R. Rodriguez, Borislav Petkov, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Masami Hiramatsu, Jan Beulich

On Thu, Jan 22, 2015 at 4:29 PM, Luis R. Rodriguez
<mcgrof@do-not-panic.com> wrote:
> From: "Luis R. Rodriguez" <mcgrof@suse.com>
>
> Xen has support for splitting heavy work into a series
> of hypercalls, called multicalls, and preempting them through
> what Xen calls continuation [0]. Despite this though without
> CONFIG_PREEMPT preemption won't happen, without preemption
> a system can become pretty useless on heavy handed hypercalls.
> Such is the case for example when creating a > 50 GiB HVM guest,
> we can get soft lockups [1] with:
>
> kernel: [  802.084335] BUG: soft lockup - CPU#1 stuck for 22s! [xend:31351]
>
> The soft lockup triggers on the TASK_UNINTERRUPTIBLE hanger check
> (default 120 seconds), on the Xen side in this particular case
> this happens when the following Xen hypervisor code is used:
>
> xc_domain_set_pod_target() -->
>   do_memory_op() -->
>     arch_memory_op() -->
>       p2m_pod_set_mem_target()
>         -- long delay (real or emulated) --
>
> This happens on arch_memory_op() on the XENMEM_set_pod_target memory
> op even though arch_memory_op() can handle continuation via
> hypercall_create_continuation() for example.
>
> Machines over 50 GiB of memory are on high demand and hard to come
> by so to help replicate this sort of issue long delays on select
> hypercalls have been emulated in order to be able to test this on
> smaller machines [2].
>
> On one hand this issue can be considered as expected given that
> CONFIG_PREEMPT=n is used however we have forced voluntary preemption
> precedent practices in the kernel even for CONFIG_PREEMPT=n through
> the usage of cond_resched() sprinkled in many places. To address
> this issue with Xen hypercalls though we need to find a way to aid
> to the scheduler in the middle of hypercalls. We are motivated to
> address this issue on CONFIG_PREEMPT=n as otherwise the system becomes
> rather unresponsive for long periods of time; in the worst case, at least
> only currently by emulating long delays on select io disk bound
> hypercalls, this can lead to filesystem corruption if the delay happens
> for example on SCHEDOP_remote_shutdown (when we call 'xl <domain> shutdown').
>
> We can address this problem by trying to check if we should schedule
> on the xen timer in the middle of a hypercall on the return from the
> timer interrupt. We want to be careful to not always force voluntary
> preemption though so to do this we only selectively enable preemption
> on very specific xen hypercalls.
>
> This enables hypercall preemption by selectively forcing checks for
> voluntary preempting only on ioctl initiated private hypercalls
> where we know some folks have run into reported issues [1].
>
> This also adds a trace event to be able to review when xen hypercalls
> are preempted; right now we just tell you when it happens and which
> CPU got preempted.
>
> ergon:~ # echo 1 > /sys/kernel/debug/tracing/events/xen/xen_hypercall_preemption/trigger
> ergon:~ # cat /sys/kernel/debug/tracing/trace_pipe
> ...
>  qemu-system-i38-2114  [000] ....   491.038440: xen_hypercall_preemption: on CPU 0
>  qemu-system-i38-2114  [003] ....   518.138592: xen_hypercall_preemption: on CPU 3
> ...
>
> [0] http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=42217cbc5b3e84b8c145d8cfb62dd5de0134b9e8;hp=3a0b9c57d5c9e82c55dd967c84dd06cb43c49ee9
> [1] https://bugzilla.novell.com/show_bug.cgi?id=861093
> [2] http://ftp.suse.com/pub/people/mcgrof/xen/emulate-long-xen-hypercalls.patch
>
> Based on original work by: David Vrabel <david.vrabel@citrix.com>
> Suggested-by: Andy Lutomirski <luto@amacapital.net>
> Cc: Andy Lutomirski <luto@amacapital.net>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: David Vrabel <david.vrabel@citrix.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: x86@kernel.org
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Cc: Jan Beulich <JBeulich@suse.com>
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
> ---
>  arch/x86/kernel/entry_32.S       |  2 ++
>  arch/x86/kernel/entry_64.S       |  2 ++
>  drivers/xen/events/events_base.c | 23 +++++++++++++++++++++++
>  include/trace/events/xen.h       |  9 +++++++++
>  include/xen/events.h             |  1 +
>  5 files changed, 37 insertions(+)
>

Reviewed-by: Andy Lutomirski <luto@amacapital.net>

>
> +/*
> + * CONFIG_PREEMPT=n kernels can end up triggering the softlock
> + * TASK_UNINTERRUPTIBLE hanger check (default 120 seconds)
> + * when certain multicalls are used [0] on large systems, in
> + * that case we need a way to voluntarily preempt. This is
> + * only an issue on CONFIG_PREEMPT=n kernels.
> + *
> + * [0] https://bugzilla.novell.com/show_bug.cgi?id=861093
> + */
> +void xen_end_upcall(struct pt_regs *regs)
> +{
> +       if (xen_is_preemptible_hypercall(regs)) {
> +               int cpuid = smp_processor_id();
> +               if (_cond_resched())
> +                       trace_xen_hypercall_preemption(cpuid);

If you want to speed this up a bit, I think you could move the
smp_processor_id() into the TP_fast_assign.  But don't tracepoints
report the cpu number even without any action?

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC v4 1/2] x86/xen: add xen_is_preemptible_hypercall()
  2015-01-23  0:29 ` [RFC v4 1/2] x86/xen: add xen_is_preemptible_hypercall() Luis R. Rodriguez
@ 2015-01-23  1:40   ` Andy Lutomirski
  2015-01-27  1:45     ` Luis R. Rodriguez
  2015-01-27  1:45     ` Luis R. Rodriguez
  2015-01-23  1:40   ` Andy Lutomirski
                     ` (2 subsequent siblings)
  3 siblings, 2 replies; 31+ messages in thread
From: Andy Lutomirski @ 2015-01-23  1:40 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: David Vrabel, Konrad Rzeszutek Wilk, Boris Ostrovsky, xen-devel,
	linux-kernel, X86 ML, kvm list, Paul McKenney, Steven Rostedt,
	Luis R. Rodriguez, Borislav Petkov, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Masami Hiramatsu, Jan Beulich

On Thu, Jan 22, 2015 at 4:29 PM, Luis R. Rodriguez
<mcgrof@do-not-panic.com> wrote:
> From: "Luis R. Rodriguez" <mcgrof@suse.com>
>
> On kernels with voluntary or no preemption we can run
> into situations where a hypercall issued through userspace
> will linger around as it addresses sub-operations in kernel
> context (multicalls). Such operations can trigger soft lockup
> detection.

Looks reasonable.

--Andy

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC v4 2/2] x86/xen: allow privcmd hypercalls to be preempted
  2015-01-23  1:40   ` Andy Lutomirski
  2015-01-23  1:57     ` Steven Rostedt
@ 2015-01-23  1:57     ` Steven Rostedt
  1 sibling, 0 replies; 31+ messages in thread
From: Steven Rostedt @ 2015-01-23  1:57 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Luis R. Rodriguez, David Vrabel, Konrad Rzeszutek Wilk,
	Boris Ostrovsky, xen-devel, linux-kernel, X86 ML, kvm list,
	Paul McKenney, Luis R. Rodriguez, Borislav Petkov,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Masami Hiramatsu,
	Jan Beulich

On Thu, 22 Jan 2015 17:40:27 -0800
Andy Lutomirski <luto@amacapital.net> wrote:

> >
> > +/*
> > + * CONFIG_PREEMPT=n kernels can end up triggering the softlock
> > + * TASK_UNINTERRUPTIBLE hanger check (default 120 seconds)
> > + * when certain multicalls are used [0] on large systems, in
> > + * that case we need a way to voluntarily preempt. This is
> > + * only an issue on CONFIG_PREEMPT=n kernels.
> > + *
> > + * [0] https://bugzilla.novell.com/show_bug.cgi?id=861093
> > + */
> > +void xen_end_upcall(struct pt_regs *regs)
> > +{
> > +       if (xen_is_preemptible_hypercall(regs)) {
> > +               int cpuid = smp_processor_id();
> > +               if (_cond_resched())
> > +                       trace_xen_hypercall_preemption(cpuid);
> 
> If you want to speed this up a bit, I think you could move the
> smp_processor_id() into the TP_fast_assign.  But don't tracepoints
> report the cpu number even without any action?

Yes, but if you scheduled here, the tracepoint could happen on a
different CPU. Thus, cpuid will not equal smp_processor_id().
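
The ordering issue above can be sketched in plain userspace C. This is only
an illustration, not kernel code: fake_smp_processor_id(), fake_cond_resched(),
current_cpu and struct preempt_record are hypothetical stand-ins, and the
"migration" is simulated by assigning a new CPU number. It shows why a CPU id
sampled before the reschedule can differ from the CPU on which the tracepoint
would later fire.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the CPU the task is currently running on. */
static int current_cpu = 1;

/* Stand-in for smp_processor_id(): reports the CPU we are on right now. */
static int fake_smp_processor_id(void)
{
	return current_cpu;
}

/*
 * Stand-in for _cond_resched(): if a reschedule is needed, pretend the
 * task was switched out and resumed on next_cpu. Returns true when a
 * reschedule happened, false otherwise.
 */
static bool fake_cond_resched(bool need_resched, int next_cpu)
{
	if (!need_resched)
		return false;
	current_cpu = next_cpu;
	return true;
}

struct preempt_record {
	int cpu_before;	/* CPU that was running the hypercall */
	int cpu_after;	/* CPU on which a tracepoint would fire */
	bool preempted;	/* whether the reschedule actually happened */
};

/* Same shape as xen_end_upcall(): sample the CPU first, then resched. */
static struct preempt_record end_upcall_sketch(bool need_resched, int next_cpu)
{
	struct preempt_record rec;

	rec.cpu_before = fake_smp_processor_id();
	rec.preempted = fake_cond_resched(need_resched, next_cpu);
	rec.cpu_after = fake_smp_processor_id();
	return rec;
}
```

With current_cpu starting at 1, end_upcall_sketch(true, 3) yields
cpu_before == 1 and cpu_after == 3. A value read inside the tracepoint itself
would correspond to cpu_after, which is why the patch samples the CPU before
calling _cond_resched().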

-- Steve

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Xen-devel] [RFC v4 1/2] x86/xen: add xen_is_preemptible_hypercall()
  2015-01-23  0:29 ` [RFC v4 1/2] x86/xen: add xen_is_preemptible_hypercall() Luis R. Rodriguez
  2015-01-23  1:40   ` Andy Lutomirski
  2015-01-23  1:40   ` Andy Lutomirski
@ 2015-01-23 11:30   ` David Vrabel
  2015-01-23 18:57     ` Luis R. Rodriguez
  2015-01-23 18:57     ` [Xen-devel] " Luis R. Rodriguez
  2015-01-23 11:30   ` David Vrabel
  3 siblings, 2 replies; 31+ messages in thread
From: David Vrabel @ 2015-01-23 11:30 UTC (permalink / raw)
  To: Luis R. Rodriguez, david.vrabel, konrad.wilk, boris.ostrovsky, xen-devel
  Cc: Borislav Petkov, kvm, Luis R. Rodriguez, x86, linux-kernel,
	rostedt, Andy Lutomirski, Ingo Molnar, Jan Beulich,
	H. Peter Anvin, Masami Hiramatsu, Thomas Gleixner, paulmck

On 23/01/15 00:29, Luis R. Rodriguez wrote:
> From: "Luis R. Rodriguez" <mcgrof@suse.com>
> 
> On kernels with voluntary or no preemption we can run
> into situations where a hypercall issued through userspace
> will linger around as it addresses sub-operations in kernel
> context (multicalls). Such operations can trigger soft lockup
> detection.
> 
> We want a way to let the kernel voluntarily preempt
> such calls even on non-preempt kernels; to do this we first
> need to distinguish which hypercalls fall under this category.
> This implements xen_is_preemptible_hypercall() which lets us do
> just that by adding a secondary hypercall page, calls made via
> the new page may be preempted.
[...]
> --- a/arch/x86/include/asm/xen/hypercall.h
> +++ b/arch/x86/include/asm/xen/hypercall.h
> @@ -84,6 +84,22 @@
>  
>  extern struct { char _entry[32]; } hypercall_page[];
>  
> +#ifndef CONFIG_PREEMPT
> +extern struct { char _entry[32]; } preemptible_hypercall_page[];
> +
> +static inline bool xen_is_preemptible_hypercall(struct pt_regs *regs)
> +{
> +	return !user_mode_vm(regs) &&
> +		regs->ip >= (unsigned long)preemptible_hypercall_page &&
> +		regs->ip < (unsigned long)preemptible_hypercall_page + PAGE_SIZE;

I think you can optimize this to:

    return (regs->ip >> PAGE_SHIFT) == preemptible_hypercall_pfn
	&& !user_mode_vm(regs);

David


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Xen-devel] [RFC v4 2/2] x86/xen: allow privcmd hypercalls to be preempted
  2015-01-23  0:29 ` [RFC v4 2/2] x86/xen: allow privcmd hypercalls to be preempted Luis R. Rodriguez
  2015-01-23  1:40   ` Andy Lutomirski
  2015-01-23  1:40   ` Andy Lutomirski
@ 2015-01-23 11:45   ` David Vrabel
  2015-01-23 18:58     ` Luis R. Rodriguez
                       ` (3 more replies)
  2015-01-23 11:45   ` David Vrabel
  3 siblings, 4 replies; 31+ messages in thread
From: David Vrabel @ 2015-01-23 11:45 UTC (permalink / raw)
  To: Luis R. Rodriguez, david.vrabel, konrad.wilk, boris.ostrovsky, xen-devel
  Cc: Borislav Petkov, kvm, Luis R. Rodriguez, x86, linux-kernel,
	rostedt, Andy Lutomirski, Ingo Molnar, Jan Beulich,
	H. Peter Anvin, Masami Hiramatsu, Thomas Gleixner, paulmck

On 23/01/15 00:29, Luis R. Rodriguez wrote:
> From: "Luis R. Rodriguez" <mcgrof@suse.com>
> 
> Xen has support for splitting heavy work into a series
> of hypercalls, called multicalls, and preempting them through
> what Xen calls continuation [0]. Despite this, without
> CONFIG_PREEMPT preemption won't happen, and without preemption
> a system can become pretty useless on heavy-handed hypercalls.
> Such is the case, for example, when creating a > 50 GiB HVM guest,
> where we can get soft lockups [1] with:
> 
> kernel: [  802.084335] BUG: soft lockup - CPU#1 stuck for 22s! [xend:31351]
> 
> The soft lockup triggers on the TASK_UNINTERRUPTIBLE hanger check
> (default 120 seconds), on the Xen side in this particular case
> this happens when the following Xen hypervisor code is used:
> 
> xc_domain_set_pod_target() -->
>   do_memory_op() -->
>     arch_memory_op() -->
>       p2m_pod_set_mem_target()
> 	-- long delay (real or emulated) --
> 
> This happens on arch_memory_op() on the XENMEM_set_pod_target memory
> op even though arch_memory_op() can handle continuation via
> hypercall_create_continuation() for example.
> 
> Machines over 50 GiB of memory are on high demand and hard to come
> by so to help replicate this sort of issue long delays on select
> hypercalls have been emulated in order to be able to test this on
> smaller machines [2].
> 
> On one hand this issue can be considered expected given that
> CONFIG_PREEMPT=n is used; however, the kernel has precedent for
> forcing voluntary preemption even with CONFIG_PREEMPT=n through
> the use of cond_resched() sprinkled in many places. To address
> this issue with Xen hypercalls, though, we need to find a way to aid
> the scheduler in the middle of hypercalls. We are motivated to
> address this issue on CONFIG_PREEMPT=n as otherwise the system becomes
> rather unresponsive for long periods of time; in the worst case (so far
> seen only by emulating long delays on select disk-I/O-bound
> hypercalls) this can lead to filesystem corruption if the delay happens,
> for example, on SCHEDOP_remote_shutdown (when we call 'xl <domain> shutdown').
> 
> We can address this problem by trying to check if we should schedule
> on the xen timer in the middle of a hypercall on the return from the
> timer interrupt. We want to be careful to not always force voluntary
> preemption though so to do this we only selectively enable preemption
> on very specific xen hypercalls.
[...]
> @@ -1243,6 +1247,25 @@ void xen_evtchn_do_upcall(struct pt_regs *regs)
>  	set_irq_regs(old_regs);
>  }
>  
> +/*
> + * CONFIG_PREEMPT=n kernels can end up triggering the softlock
> + * TASK_UNINTERRUPTIBLE hanger check (default 120 seconds)
> + * when certain multicalls are used [0] on large systems, in
> + * that case we need a way to voluntarily preempt. This is
> + * only an issue on CONFIG_PREEMPT=n kernels.

Rewrite this comment as:

* Some hypercalls issued by the toolstack can take many 10s of
* seconds.  Allow tasks running hypercalls via the privcmd driver to be
* voluntarily preempted even if full kernel preemption is disabled.

> + * [0] https://bugzilla.novell.com/show_bug.cgi?id=861093

This link isn't accessible so I don't think it should be included here.

> + */
> +void xen_end_upcall(struct pt_regs *regs)
> +{
> +	if (xen_is_preemptible_hypercall(regs)) {
> +		int cpuid = smp_processor_id();
> +		if (_cond_resched())
> +			trace_xen_hypercall_preemption(cpuid);

I don't think a tracepoint here is useful.

> +	}
> +}
> +NOKPROBE_SYMBOL(xen_end_upcall);

Do we need this if this function is no longer notrace?

David

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Xen-devel] [RFC v4 0/2] x86/xen: add xen hypercall preemption
  2015-01-23  0:29 [RFC v4 0/2] x86/xen: add xen hypercall preemption Luis R. Rodriguez
                   ` (3 preceding siblings ...)
  2015-01-23  0:29 ` Luis R. Rodriguez
@ 2015-01-23 11:51 ` David Vrabel
  2015-01-23 18:58   ` Luis R. Rodriguez
  2015-01-23 18:58   ` Luis R. Rodriguez
  2015-01-23 11:51 ` David Vrabel
  5 siblings, 2 replies; 31+ messages in thread
From: David Vrabel @ 2015-01-23 11:51 UTC (permalink / raw)
  To: Luis R. Rodriguez, david.vrabel, konrad.wilk, boris.ostrovsky, xen-devel
  Cc: kvm, Luis R. Rodriguez, x86, linux-kernel, rostedt, paulmck

On 23/01/15 00:29, Luis R. Rodriguez wrote:
> From: "Luis R. Rodriguez" <mcgrof@suse.com>
> 
> This v4 addresses some of the cleanups recommended and adds
> a tracing option for when we do actually preempt a hypercall.
> I kept the NOKPROBE_SYMBOL() for now but did remove the 'notrace'
> stuff.
> 
> This goes out as RFC still as I have not been able to test 32-bit.
> Can anyone test that or at least confirm that the 32-bit point
> we do the upcall is definitely not on the IRQ stack?

You can omit fixing this for 32-bit guests (provided you note as such in
the commit message).

David

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Xen-devel] [RFC v4 1/2] x86/xen: add xen_is_preemptible_hypercall()
  2015-01-23 11:30   ` [Xen-devel] " David Vrabel
  2015-01-23 18:57     ` Luis R. Rodriguez
@ 2015-01-23 18:57     ` Luis R. Rodriguez
  1 sibling, 0 replies; 31+ messages in thread
From: Luis R. Rodriguez @ 2015-01-23 18:57 UTC (permalink / raw)
  To: David Vrabel
  Cc: Luis R. Rodriguez, konrad.wilk, boris.ostrovsky, xen-devel,
	Borislav Petkov, kvm, x86, linux-kernel, rostedt,
	Andy Lutomirski, Ingo Molnar, Jan Beulich, H. Peter Anvin,
	Masami Hiramatsu, Thomas Gleixner, paulmck

On Fri, Jan 23, 2015 at 11:30:16AM +0000, David Vrabel wrote:
> On 23/01/15 00:29, Luis R. Rodriguez wrote:
> > From: "Luis R. Rodriguez" <mcgrof@suse.com>
> > 
> > On kernels with voluntary or no preemption we can run
> > into situations where a hypercall issued through userspace
> > will linger around as it addresses sub-operations in kernel
> > context (multicalls). Such operations can trigger soft lockup
> > detection.
> > 
> > We want a way to let the kernel voluntarily preempt
> > such calls even on non-preempt kernels; to do this we first
> > need to distinguish which hypercalls fall under this category.
> > This implements xen_is_preemptible_hypercall() which lets us do
> > just that by adding a secondary hypercall page, calls made via
> > the new page may be preempted.
> [...]
> > --- a/arch/x86/include/asm/xen/hypercall.h
> > +++ b/arch/x86/include/asm/xen/hypercall.h
> > @@ -84,6 +84,22 @@
> >  
> >  extern struct { char _entry[32]; } hypercall_page[];
> >  
> > +#ifndef CONFIG_PREEMPT
> > +extern struct { char _entry[32]; } preemptible_hypercall_page[];
> > +
> > +static inline bool xen_is_preemptible_hypercall(struct pt_regs *regs)
> > +{
> > +	return !user_mode_vm(regs) &&
> > +		regs->ip >= (unsigned long)preemptible_hypercall_page &&
> > +		regs->ip < (unsigned long)preemptible_hypercall_page + PAGE_SIZE;
> 
> I think you can optimize this to:
> 
>     return (regs->ip >> PAGE_SHIFT) == preemptible_hypercall_pfn

I take it you meant preemptible_hypercall_page ?

> 	&& !user_mode_vm(regs);

If so I don't see how this can work as an identical replacement.
Consider a PAGE_SIZE of 16, so PAGE_SHIFT would be 4, and let's
say we are checking for byte 2, which should be in the page:

; 0b0010 >>4
	0

Can you elaborate more on this, or can we perhaps leave such
optimization as an evolution to avoid regressions if you are
not 100% certain?
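
The two checks can be compared directly in a small userspace sketch.
It assumes the suggestion refers to the page frame number of the
preemptible hypercall page (its page-aligned base address shifted
right by PAGE_SHIFT) rather than to the base address itself. The toy
16-byte page size is taken from the example above, and the helper names
(in_page_by_range, in_page_by_pfn, checks_agree) are made up for
illustration; none of them come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy values from the example above: 16-byte pages, so the shift is 4. */
#define TOY_PAGE_SHIFT 4
#define TOY_PAGE_SIZE  (1UL << TOY_PAGE_SHIFT)

/* Range check, in the same form as the posted patch. */
static bool in_page_by_range(uintptr_t ip, uintptr_t page_base)
{
	return ip >= page_base && ip < page_base + TOY_PAGE_SIZE;
}

/*
 * Frame-number check: the shifted ip is compared against the page's
 * frame number (base >> shift), not against the base address.
 */
static bool in_page_by_pfn(uintptr_t ip, uintptr_t page_pfn)
{
	return (ip >> TOY_PAGE_SHIFT) == page_pfn;
}

/* Returns true if both checks agree for every ip in [0, limit). */
static bool checks_agree(uintptr_t page_base, uintptr_t limit)
{
	uintptr_t pfn = page_base >> TOY_PAGE_SHIFT;
	uintptr_t ip;

	/* The equivalence only holds for a page-aligned base. */
	assert((page_base & (TOY_PAGE_SIZE - 1)) == 0);

	for (ip = 0; ip < limit; ip++)
		if (in_page_by_range(ip, page_base) != in_page_by_pfn(ip, pfn))
			return false;
	return true;
}
```

Under that assumption, byte 2 of a page based at address 0 shifts down
to 0, which equals that page's frame number, so the check passes.
Comparing the shifted value against the base address instead would only
match for a page based at address 0.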

  Luis

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Xen-devel] [RFC v4 2/2] x86/xen: allow privcmd hypercalls to be preempted
  2015-01-23 11:45   ` [Xen-devel] " David Vrabel
  2015-01-23 18:58     ` Luis R. Rodriguez
@ 2015-01-23 18:58     ` Luis R. Rodriguez
  2015-01-26 10:46       ` Jan Beulich
                         ` (3 more replies)
  2015-01-23 19:16     ` Luis R. Rodriguez
  2015-01-23 19:16     ` Luis R. Rodriguez
  3 siblings, 4 replies; 31+ messages in thread
From: Luis R. Rodriguez @ 2015-01-23 18:58 UTC (permalink / raw)
  To: David Vrabel
  Cc: Luis R. Rodriguez, konrad.wilk, boris.ostrovsky, xen-devel,
	Borislav Petkov, kvm, x86, linux-kernel, rostedt,
	Andy Lutomirski, Ingo Molnar, Jan Beulich, H. Peter Anvin,
	Masami Hiramatsu, Thomas Gleixner, paulmck

On Fri, Jan 23, 2015 at 11:45:06AM +0000, David Vrabel wrote:
> On 23/01/15 00:29, Luis R. Rodriguez wrote:
> > From: "Luis R. Rodriguez" <mcgrof@suse.com>
> > 
> > Xen has support for splitting heavy work into a series
> > of hypercalls, called multicalls, and preempting them through
> > what Xen calls continuation [0]. Despite this though without
> > CONFIG_PREEMPT preemption won't happen, without preemption
> > a system can become pretty useless on heavy handed hypercalls.
> > Such is the case for example when creating a > 50 GiB HVM guest,
> > we can get softlockups [1] with:
> > 
> > kernel: [  802.084335] BUG: soft lockup - CPU#1 stuck for 22s! [xend:31351]
> > 
> > The soft lockup triggers on the TASK_UNINTERRUPTIBLE hanger check
> > (default 120 seconds), on the Xen side in this particular case
> > this happens when the following Xen hypervisor code is used:
> > 
> > xc_domain_set_pod_target() -->
> >   do_memory_op() -->
> >     arch_memory_op() -->
> >       p2m_pod_set_mem_target()
> > 	-- long delay (real or emulated) --
> > 
> > This happens on arch_memory_op() on the XENMEM_set_pod_target memory
> > op even though arch_memory_op() can handle continuation via
> > hypercall_create_continuation() for example.
> > 
> > Machines over 50 GiB of memory are on high demand and hard to come
> > by so to help replicate this sort of issue long delays on select
> > hypercalls have been emulated in order to be able to test this on
> > smaller machines [2].
> > 
> > On one hand this issue can be considered as expected given that
> > CONFIG_PREEMPT=n is used however we have forced voluntary preemption
> > precedent practices in the kernel even for CONFIG_PREEMPT=n through
> > the usage of cond_resched() sprinkled in many places. To address
> > this issue with Xen hypercalls though we need to find a way to aid
> > the scheduler in the middle of hypercalls. We are motivated to
> > address this issue on CONFIG_PREEMPT=n as otherwise the system becomes
> > rather unresponsive for long periods of time; in the worst case, at least
> > only currently by emulating long delays on select io disk bound
> > hypercalls, this can lead to filesystem corruption if the delay happens
> > for example on SCHEDOP_remote_shutdown (when we call 'xl <domain> shutdown').
> > 
> > We can address this problem by trying to check if we should schedule
> > on the xen timer in the middle of a hypercall on the return from the
> > timer interrupt. We want to be careful to not always force voluntary
> > preemption though so to do this we only selectively enable preemption
> > on very specific xen hypercalls.
> [...]
> > @@ -1243,6 +1247,25 @@ void xen_evtchn_do_upcall(struct pt_regs *regs)
> >  	set_irq_regs(old_regs);
> >  }
> >  
> > +/*
> > + * CONFIG_PREEMPT=n kernels can end up triggering the softlock
> > + * TASK_UNINTERRUPTIBLE hanger check (default 120 seconds)
> > + * when certain multicalls are used [0] on large systems, in
> > + * that case we need a way to voluntarily preempt. This is
> > + * only an issue on CONFIG_PREEMPT=n kernels.
> 
> Rewrite this comment as:
> 
> * Some hypercalls issued by the toolstack can take many 10s of

It's not just hypercalls though; this is all about the interactions
with multicalls, no?

> * seconds.  Allow tasks running hypercalls via the privcmd driver to be
> * voluntarily preempted even if full kernel preemption is disabled.
> 
> > + * [0] https://bugzilla.novell.com/show_bug.cgi?id=861093
> 
> This link isn't accessible so I don't think it should be included here.

OK.

 Luis

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Xen-devel] [RFC v4 0/2] x86/xen: add xen hypercall preemption
  2015-01-23 11:51 ` [Xen-devel] [RFC v4 0/2] x86/xen: add xen hypercall preemption David Vrabel
@ 2015-01-23 18:58   ` Luis R. Rodriguez
  2015-01-23 18:58   ` Luis R. Rodriguez
  1 sibling, 0 replies; 31+ messages in thread
From: Luis R. Rodriguez @ 2015-01-23 18:58 UTC (permalink / raw)
  To: David Vrabel
  Cc: Luis R. Rodriguez, konrad.wilk, boris.ostrovsky, xen-devel, kvm,
	x86, linux-kernel, rostedt, paulmck

On Fri, Jan 23, 2015 at 11:51:09AM +0000, David Vrabel wrote:
> On 23/01/15 00:29, Luis R. Rodriguez wrote:
> > From: "Luis R. Rodriguez" <mcgrof@suse.com>
> > 
> > This v4 addresses some of the cleanups recommended and adds
> > tracing option for when we do actually preempt a hypercall.
> > I kept the NOKPROBE_SYMBOL() for now but did remove the 'notrace'
> > stuff.
> > 
> > This goes out as RFC still as I have not been able to test 32-bit.
> > Can anyone test that or at least confirm that the 32-bit point
> > we do the upcall is definitely not on the IRQ stack?
> 
> You can omit fixing this for 32-bit guests (provided you note as such in
> the commit message).

I'll be happy to do that.

 Luis

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Xen-devel] [RFC v4 2/2] x86/xen: allow privcmd hypercalls to be preempted
  2015-01-23 11:45   ` [Xen-devel] " David Vrabel
  2015-01-23 18:58     ` Luis R. Rodriguez
  2015-01-23 18:58     ` [Xen-devel] " Luis R. Rodriguez
@ 2015-01-23 19:16     ` Luis R. Rodriguez
  2015-01-23 19:16     ` Luis R. Rodriguez
  3 siblings, 0 replies; 31+ messages in thread
From: Luis R. Rodriguez @ 2015-01-23 19:16 UTC (permalink / raw)
  To: David Vrabel, Andy Lutomirski, Steven Rostedt
  Cc: Luis R. Rodriguez, konrad.wilk, boris.ostrovsky, xen-devel,
	Borislav Petkov, kvm, x86, linux-kernel, rostedt,
	Andy Lutomirski, Ingo Molnar, Jan Beulich, H. Peter Anvin,
	Masami Hiramatsu, Thomas Gleixner, paulmck

On Fri, Jan 23, 2015 at 11:45:06AM +0000, David Vrabel wrote:
> > + */
> > +void xen_end_upcall(struct pt_regs *regs)
> > +{
> > +	if (xen_is_preemptible_hypercall(regs)) {
> > +		int cpuid = smp_processor_id();
> > +		if (_cond_resched())
> > +			trace_xen_hypercall_preemption(cpuid);
> 
> I don't think a tracepoint here is useful.

OK.. I'll remove.

> > +	}
> > +}
> > +NOKPROBE_SYMBOL(xen_end_upcall);
> 
> > Do we need this if this function is no longer notrace?

Steven and Andy were going down some corner case rabbit hole
and it seemed to me that the conclusion was not settled so
to be safe I kept it. I'll let them decide. I did remove
the notrace junk.

  Luis

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Xen-devel] [RFC v4 2/2] x86/xen: allow privcmd hypercalls to be preempted
  2015-01-23 18:58     ` [Xen-devel] " Luis R. Rodriguez
@ 2015-01-26 10:46       ` Jan Beulich
  2015-01-26 10:46       ` Jan Beulich
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 31+ messages in thread
From: Jan Beulich @ 2015-01-26 10:46 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Andy Lutomirski, David Vrabel, Luis R. Rodriguez, rostedt,
	Masami Hiramatsu, x86, Thomas Gleixner, paulmck, xen-devel,
	boris.ostrovsky, konrad.wilk, Ingo Molnar, Borislav Petkov, kvm,
	linux-kernel, H. Peter Anvin

>>> On 23.01.15 at 19:58, <mcgrof@suse.com> wrote:
> On Fri, Jan 23, 2015 at 11:45:06AM +0000, David Vrabel wrote:
>> On 23/01/15 00:29, Luis R. Rodriguez wrote:
>> > @@ -1243,6 +1247,25 @@ void xen_evtchn_do_upcall(struct pt_regs *regs)
>> >  	set_irq_regs(old_regs);
>> >  }
>> >  
>> > +/*
>> > + * CONFIG_PREEMPT=n kernels can end up triggering the softlock
>> > + * TASK_UNINTERRUPTIBLE hanger check (default 120 seconds)
>> > + * when certain multicalls are used [0] on large systems, in
>> > + * that case we need a way to voluntarily preempt. This is
>> > + * only an issue on CONFIG_PREEMPT=n kernels.
>> 
>> Rewrite this comment as;
>> 
>> * Some hypercalls issued by the toolstack can take many 10s of
> 
> It's not just hypercalls though; this is all about the interactions
> with multicalls, no?

multicalls are just a special case of hypercalls.

Jan


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [Xen-devel] [RFC v4 2/2] x86/xen: allow privcmd hypercalls to be preempted
  2015-01-23 18:58     ` [Xen-devel] " Luis R. Rodriguez
                         ` (2 preceding siblings ...)
  2015-01-26 10:47       ` David Vrabel
@ 2015-01-26 10:47       ` David Vrabel
  3 siblings, 0 replies; 31+ messages in thread
From: David Vrabel @ 2015-01-26 10:47 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Luis R. Rodriguez, konrad.wilk, boris.ostrovsky, xen-devel,
	Borislav Petkov, kvm, x86, linux-kernel, rostedt,
	Andy Lutomirski, Ingo Molnar, Jan Beulich, H. Peter Anvin,
	Masami Hiramatsu, Thomas Gleixner, paulmck

On 23/01/15 18:58, Luis R. Rodriguez wrote:
> 
> It's not just hypercalls though; this is all about the interactions
> with multicalls, no?

No.  This applies to any preemptible hypercall and the toolstack doesn't
use multicalls for most of its work.

David

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC v4 1/2] x86/xen: add xen_is_preemptible_hypercall()
  2015-01-23  1:40   ` Andy Lutomirski
  2015-01-27  1:45     ` Luis R. Rodriguez
@ 2015-01-27  1:45     ` Luis R. Rodriguez
  1 sibling, 0 replies; 31+ messages in thread
From: Luis R. Rodriguez @ 2015-01-27  1:45 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Luis R. Rodriguez, David Vrabel, Konrad Rzeszutek Wilk,
	Boris Ostrovsky, xen-devel, linux-kernel, X86 ML, kvm list,
	Paul McKenney, Steven Rostedt, Borislav Petkov, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Masami Hiramatsu, Jan Beulich

On Thu, Jan 22, 2015 at 05:40:45PM -0800, Andy Lutomirski wrote:
> On Thu, Jan 22, 2015 at 4:29 PM, Luis R. Rodriguez
> <mcgrof@do-not-panic.com> wrote:
> > From: "Luis R. Rodriguez" <mcgrof@suse.com>
> >
> > On kernels with voluntary or no preemption we can run
> > into situations where a hypercall issued through userspace
> > will linger around as it addresses sub-operations in kernel
> > context (multicalls). Such operations can trigger soft lockup
> > detection.
> 
> Looks reasonable.

I'll add a reviewed-by...

  Luis

^ permalink raw reply	[flat|nested] 31+ messages in thread

Thread overview: 31+ messages
2015-01-23  0:29 [RFC v4 0/2] x86/xen: add xen hypercall preemption Luis R. Rodriguez
2015-01-23  0:29 ` [RFC v4 1/2] x86/xen: add xen_is_preemptible_hypercall() Luis R. Rodriguez
2015-01-23  1:40   ` Andy Lutomirski
2015-01-27  1:45     ` Luis R. Rodriguez
2015-01-27  1:45     ` Luis R. Rodriguez
2015-01-23  1:40   ` Andy Lutomirski
2015-01-23 11:30   ` [Xen-devel] " David Vrabel
2015-01-23 18:57     ` Luis R. Rodriguez
2015-01-23 18:57     ` [Xen-devel] " Luis R. Rodriguez
2015-01-23 11:30   ` David Vrabel
2015-01-23  0:29 ` Luis R. Rodriguez
2015-01-23  0:29 ` [RFC v4 2/2] x86/xen: allow privcmd hypercalls to be preempted Luis R. Rodriguez
2015-01-23  1:40   ` Andy Lutomirski
2015-01-23  1:40   ` Andy Lutomirski
2015-01-23  1:57     ` Steven Rostedt
2015-01-23  1:57     ` Steven Rostedt
2015-01-23 11:45   ` [Xen-devel] " David Vrabel
2015-01-23 18:58     ` Luis R. Rodriguez
2015-01-23 18:58     ` [Xen-devel] " Luis R. Rodriguez
2015-01-26 10:46       ` Jan Beulich
2015-01-26 10:46       ` Jan Beulich
2015-01-26 10:47       ` David Vrabel
2015-01-26 10:47       ` [Xen-devel] " David Vrabel
2015-01-23 19:16     ` Luis R. Rodriguez
2015-01-23 19:16     ` Luis R. Rodriguez
2015-01-23 11:45   ` David Vrabel
2015-01-23  0:29 ` Luis R. Rodriguez
2015-01-23 11:51 ` [Xen-devel] [RFC v4 0/2] x86/xen: add xen hypercall preemption David Vrabel
2015-01-23 18:58   ` Luis R. Rodriguez
2015-01-23 18:58   ` Luis R. Rodriguez
2015-01-23 11:51 ` David Vrabel
