linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH, RFC] x86: also CFI-annotate certain inline asm()s
@ 2014-11-04  9:24 Jan Beulich
  2014-11-04 19:40 ` Andy Lutomirski
  2014-11-10 10:01 ` Ingo Molnar
  0 siblings, 2 replies; 13+ messages in thread
From: Jan Beulich @ 2014-11-04  9:24 UTC (permalink / raw)
  To: mingo, tglx, hpa; +Cc: Tony Jones, linux-kernel

The main obstacle to having done this long ago was the need to
determine whether annotations are needed in the first place: They need
to be avoided when a frame pointer got set up. Since I can't see a way
to determine this before the compilation phase, this is being achieved
by inspecting the memory address generated by the compiler in an
interposed assembler macro. Of course this isn't really nice code, and
this the main reason I'm posting this as RFC only at this point (with
the hope that maybe someone has an idea of how to achieve the same
thing in a more elegant way).

The reason for this being needed are various inline stack pointer
adjustments. In particular in the context of perf and its use of NMIs,
Tony observed the unwinder to get confused when an interruption
happened in the middle of an inline push/pop pair. While the live
unwinder continues to be of no interest upstream, NMIs being used to
trigger crash dumps (via external button or watchdog) would appear to
suffer from the same problem.

With the irqflags generic code generation issue out of the way (see
https://patchwork.kernel.org/patch/5223561/), the only differences in
generated code are a number pointless extra frame pointer adjustments
the compiler decides it needs.

Several pieces of code were intentionally left untouched:

- use site compiled with -fno-omit-frame-pointer:
  arch/x86/include/asm/switch_to.h:switch_to()

- annotations in helper sections (like .fixup) don't work anyway:
  arch/x86/lib/usercopy_32.c

- call stack reaching there unlikely to be of interest:
  arch/x86/boot/
  arch/x86/kernel/cpu/common.c:flag_is_changeable_p()
  arch/x86/include/asm/apm.h
  arch/x86/include/asm/kexec.h:crash_setup_regs()
  arch/x86/pci/pcbios.c:pcibios_get_irq_routing_table()
  arch/x86/xen/enlighten.c:xen_setup_gdt()
  drivers/input/misc/wistron_btns.c:call_bios()
  drivers/pci/hotplug/cpqphp_nvram.c:access_EV()
  drivers/pnp/pnpbios/
  drivers/watchdog/hpwdt.c:asminline_call

- to be done separately (if desired/needed):
  anything pv-ops related (patching as well as wrappers)
  arch/x86/kernel/kprobes/
  arch/x86/kernel/kvm/
  drivers/char/i8k.c:i8k_smm()
  drivers/char/toshiba.c:tosh_smm()
  drivers/lguest/x86/

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Tony Jones <tonyj@suse.de>
---
 arch/x86/Makefile                |   22 ++++++++
 arch/x86/include/asm/frame.h     |  103 ++++++++++++++++++++++++++++++++++++++-
 arch/x86/include/asm/irqflags.h  |   18 ++++--
 arch/x86/include/asm/processor.h |   22 +++++---
 arch/x86/kernel/irq_32.c         |   15 ++++-
 drivers/cpufreq/speedstep-smi.c  |   32 ++++++------
 drivers/crypto/padlock-aes.c     |   14 +++--
 7 files changed, 188 insertions(+), 38 deletions(-)

--- 3.18-rc3/arch/x86/Makefile
+++ 3.18-rc3-x86-inline-unwind-info/arch/x86/Makefile
@@ -131,6 +131,28 @@ ifdef CONFIG_X86_X32
 endif
 export CONFIG_X86_X32_ABI
 
+ifeq ($(CONFIG_UNWIND_INFO),y)
+# Does gcc support generating CFI annotations via .cfi_* directives?
+ifeq ($(call cc-option,-fdwarf2-cfi-asm),-fdwarf2-cfi-asm)
+cfi-directives := $(call try-run,\
+		    /bin/echo '\
+			void test(void) { \
+				CFI_DECL; \
+				asm volatile(".cfi_undefined 0"); \
+				asm volatile(CFI_ADJUST_CFA_OFFSET(1) :: CFI_INPUTS()); \
+		 	}' | \
+		    $(CC) -DCONFIG_CC_AS_CFI -include $(srctree)/arch/x86/include/asm/frame.h -c -x c -o "$$TMP" - \
+		    ,y,n)
+ifeq ($(cfi-directives),y)
+KBUILD_CFLAGS += -DCONFIG_CC_AS_CFI
+else
+$(warning Inline function CFI annotations not usable with $(CC))
+endif
+else
+$(warning Inline function CFI annotations not supported by $(CC))
+endif
+endif
+
 # Don't unroll struct assignments with kmemcheck enabled
 ifeq ($(CONFIG_KMEMCHECK),y)
 	KBUILD_CFLAGS += $(call cc-option,-fno-builtin-memcpy)
--- 3.18-rc3/arch/x86/include/asm/frame.h
+++ 3.18-rc3-x86-inline-unwind-info/arch/x86/include/asm/frame.h
@@ -1,3 +1,8 @@
+#if 0
+.if 0
+#elif !defined(_ASM_X86_FRAME_H)
+#define _ASM_X86_FRAME_H
+
 #ifdef __ASSEMBLY__
 
 #include <asm/asm.h>
@@ -23,4 +28,100 @@
 	.endm
 #endif
 
-#endif  /*  __ASSEMBLY__  */
+#elif defined(CONFIG_CC_AS_CFI)
+
+#include <linux/stringify.h>
+
+asm(".include \"" __FILE__ "\"");
+
+#define CFI_DECL volatile struct {} cfi_var
+/* Helper: Even inside .if 0 blocks gas warns when strings start a line. */
+#define CFI_BLANK
+#define CFI_PREFIX "; maybe_cfi_annotate$ %[cfi_var],%c[cfi_sz],"
+#define CFI_ADJUST_CFA_OFFSET(nr) CFI_PREFIX "adjust_cfa_offset" \
+	CFI_BLANK " (" __stringify(nr) ") * (%c[cfi_sz])"
+#define CFI_DEF_CFA_REGISTER(reg) CFI_PREFIX "def_cfa_register " #reg
+#define CFI_INPUTS(more...) [cfi_var] "m" (cfi_var), \
+			    [cfi_sz] "i" (sizeof(void *)), ## more
+
+#else
+
+#define CFI_DECL struct cfi_dummy
+#define CFI_ADJUST_CFA_OFFSET(nr)
+#define CFI_DEF_CFA_REGISTER(reg)
+#define CFI_INPUTS(more...) more
+
+#endif /* __ASSEMBLY__ / CONFIG_CC_AS_CFI */
+
+#elif 0
+.endif
+.macro maybe_cfi_annotate$ mem:req, sz:req, annot:vararg
+ nest$ = 0
+ state$ = 0
+ skip$ = 0
+ .irpc c,\mem
+  .ifeqs "(", "\c"
+   nest$ = nest$ + 1
+   state$ = 0
+  .endif
+
+  .ifeqs ")", "\c"
+   nest$ = nest$ - 1
+  .elseif (nest$ == 1) && (skip$ == 0)
+   .if state$ == 0
+    .ifeqs "%", "\c"
+     state$ = 1
+    .endif
+   .elseif state$ == 1
+    .if \sz == 4
+     .ifeqs "e", "\c"
+      state$ = 2
+     .endif
+    .endif
+
+    .if \sz == 8
+     .ifeqs "r", "\c"
+      state$ = 2
+     .endif
+    .endif
+
+    .if state$ <> 2
+     .exitm
+    .endif
+   .elseif state$ == 2
+    .ifeqs "s", "\c"
+     state$ = state$ | 0x04
+    .endif
+
+    .ifeqs "b", "\c"
+     state$ = state$ | 0x08
+    .endif
+
+    .if (state$ & 0x0c) == 0
+     .exitm
+    .endif
+   .elseif ((state$ & 3) == 2) && ((state$ & 0x0c) <> 0)
+    .ifeqs "p", "\c"
+     state$ = state$ | 0x10
+    .else
+     .exitm
+    .endif
+   .else
+    .ifeqs ",", "\c"
+     skip$ = 1
+    .else
+     .exitm
+    .endif
+   .endif
+  .endif
+ .endr
+
+ .if (nest$ <> 0) && (state$ == 0)
+  .error "unmatched parentheses in '\mem\(')"
+ .elseif (nest$ == 0) && (state$ == 0x16)
+  .cfi_\annot
+ .elseif (nest$ <> 0) || (state$ <> 0x1a)
+  .error "unsupported memory expression '\mem\(')"
+ .endif
+.endm
+#endif
--- 3.18-rc3/arch/x86/include/asm/irqflags.h
+++ 3.18-rc3-x86-inline-unwind-info/arch/x86/include/asm/irqflags.h
@@ -4,6 +4,9 @@
 #include <asm/processor-flags.h>
 
 #ifndef __ASSEMBLY__
+
+#include <asm/frame.h>
+
 /*
  * Interrupt control:
  */
@@ -11,6 +14,7 @@
 static inline unsigned long native_save_fl(void)
 {
 	unsigned long flags;
+	CFI_DECL;
 
 	/*
 	 * "=rm" is safe here, because "pop" adjusts the stack before
@@ -18,9 +22,10 @@ static inline unsigned long native_save_
 	 * documented behavior of the "pop" instruction.
 	 */
 	asm volatile("# __raw_save_flags\n\t"
-		     "pushf ; pop %0"
+		     "pushf" CFI_ADJUST_CFA_OFFSET(1) "\n\t"
+		     "pop %0" CFI_ADJUST_CFA_OFFSET(-1)
 		     : "=rm" (flags)
-		     : /* no input */
+		     : CFI_INPUTS()
 		     : "memory");
 
 	return flags;
@@ -28,10 +33,13 @@ static inline unsigned long native_save_
 
 static inline void native_restore_fl(unsigned long flags)
 {
-	asm volatile("push %0 ; popf"
+	CFI_DECL;
+
+	asm volatile("push %[flags]" CFI_ADJUST_CFA_OFFSET(1) "\n\t"
+		     "popf" CFI_ADJUST_CFA_OFFSET(-1)
 		     : /* no output */
-		     :"g" (flags)
-		     :"memory", "cc");
+		     : CFI_INPUTS([flags] "g" (flags))
+		     : "memory", "cc");
 }
 
 static inline void native_irq_disable(void)
--- 3.18-rc3/arch/x86/include/asm/processor.h
+++ 3.18-rc3-x86-inline-unwind-info/arch/x86/include/asm/processor.h
@@ -8,7 +8,6 @@ struct task_struct;
 struct mm_struct;
 
 #include <asm/vm86.h>
-#include <asm/math_emu.h>
 #include <asm/segment.h>
 #include <asm/types.h>
 #include <asm/sigcontext.h>
@@ -22,6 +21,11 @@ struct mm_struct;
 #include <asm/nops.h>
 #include <asm/special_insns.h>
 
+#ifdef CONFIG_X86_32
+#include <asm/math_emu.h>
+#include <asm/frame.h>
+#endif
+
 #include <linux/personality.h>
 #include <linux/cpumask.h>
 #include <linux/cache.h>
@@ -531,15 +535,17 @@ static inline void native_set_iopl_mask(
 {
 #ifdef CONFIG_X86_32
 	unsigned int reg;
+	CFI_DECL;
 
-	asm volatile ("pushfl;"
-		      "popl %0;"
-		      "andl %1, %0;"
-		      "orl %2, %0;"
-		      "pushl %0;"
-		      "popfl"
+	asm volatile ("pushfl" CFI_ADJUST_CFA_OFFSET(1) "\n\t"
+		      "popl %0" CFI_ADJUST_CFA_OFFSET(-1) "\n\t"
+		      "andl %[invmask], %0\n\t"
+		      "orl %[mask], %0\n\t"
+		      "pushl %0" CFI_ADJUST_CFA_OFFSET(1) "\n\t"
+		      "popfl" CFI_ADJUST_CFA_OFFSET(-1)
 		      : "=&r" (reg)
-		      : "i" (~X86_EFLAGS_IOPL), "r" (mask));
+		      : CFI_INPUTS([invmask] "i" (~X86_EFLAGS_IOPL),
+				   [mask] "r" (mask)));
 #endif
 }
 
--- 3.18-rc3/arch/x86/kernel/irq_32.c
+++ 3.18-rc3-x86-inline-unwind-info/arch/x86/kernel/irq_32.c
@@ -20,6 +20,7 @@
 #include <linux/mm.h>
 
 #include <asm/apic.h>
+#include <asm/frame.h>
 
 DEFINE_PER_CPU_SHARED_ALIGNED(irq_cpustat_t, irq_stat);
 EXPORT_PER_CPU_SYMBOL(irq_stat);
@@ -60,12 +61,15 @@ DEFINE_PER_CPU(struct irq_stack *, softi
 
 static void call_on_stack(void *func, void *stack)
 {
+	CFI_DECL;
+
 	asm volatile("xchgl	%%ebx,%%esp	\n"
+		     CFI_DEF_CFA_REGISTER(ebx) "\n"
 		     "call	*%%edi		\n"
 		     "movl	%%ebx,%%esp	\n"
+		     CFI_DEF_CFA_REGISTER(esp)
 		     : "=b" (stack)
-		     : "0" (stack),
-		       "D"(func)
+		     : CFI_INPUTS("0" (stack), "D" (func))
 		     : "memory", "cc", "edx", "ecx", "eax");
 }
 
@@ -86,6 +90,7 @@ execute_on_irq_stack(int overflow, struc
 {
 	struct irq_stack *curstk, *irqstk;
 	u32 *isp, *prev_esp, arg1, arg2;
+	CFI_DECL;
 
 	curstk = (struct irq_stack *) current_stack();
 	irqstk = __this_cpu_read(hardirq_stack);
@@ -109,11 +114,13 @@ execute_on_irq_stack(int overflow, struc
 		call_on_stack(print_stack_overflow, isp);
 
 	asm volatile("xchgl	%%ebx,%%esp	\n"
+		     CFI_DEF_CFA_REGISTER(ebx) "\n"
 		     "call	*%%edi		\n"
 		     "movl	%%ebx,%%esp	\n"
+		     CFI_DEF_CFA_REGISTER(esp)
 		     : "=a" (arg1), "=d" (arg2), "=b" (isp)
-		     :  "0" (irq),   "1" (desc),  "2" (isp),
-			"D" (desc->handle_irq)
+		     : CFI_INPUTS("0" (irq),   "1" (desc),  "2" (isp),
+				  "D" (desc->handle_irq))
 		     : "memory", "cc", "ecx");
 	return 1;
 }
--- 3.18-rc3/drivers/cpufreq/speedstep-smi.c
+++ 3.18-rc3-x86-inline-unwind-info/drivers/cpufreq/speedstep-smi.c
@@ -19,6 +19,7 @@
 #include <linux/cpufreq.h>
 #include <linux/delay.h>
 #include <linux/io.h>
+#include <asm/frame.h>
 #include <asm/ist.h>
 #include <asm/cpu_device_id.h>
 
@@ -64,6 +65,7 @@ static int speedstep_smi_ownership(void)
 	u32 command, result, magic, dummy;
 	u32 function = GET_SPEEDSTEP_OWNER;
 	unsigned char magic_data[] = "Copyright (c) 1999 Intel Corporation";
+	CFI_DECL;
 
 	command = (smi_sig & 0xffffff00) | (smi_cmd & 0xff);
 	magic = virt_to_phys(magic_data);
@@ -72,14 +74,14 @@ static int speedstep_smi_ownership(void)
 			command, smi_port);
 
 	__asm__ __volatile__(
-		"push %%ebp\n"
+		"push %%ebp" CFI_ADJUST_CFA_OFFSET(1) "\n\t"
 		"out %%al, (%%dx)\n"
-		"pop %%ebp\n"
+		"pop %%ebp" CFI_ADJUST_CFA_OFFSET(-1)
 		: "=D" (result),
 		  "=a" (dummy), "=b" (dummy), "=c" (dummy), "=d" (dummy),
 		  "=S" (dummy)
-		: "a" (command), "b" (function), "c" (0), "d" (smi_port),
-		  "D" (0), "S" (magic)
+		: CFI_INPUTS("a" (command), "b" (function), "c" (0),
+			     "d" (smi_port), "D" (0), "S" (magic))
 		: "memory"
 	);
 
@@ -102,6 +104,7 @@ static int speedstep_smi_get_freqs(unsig
 	u32 command, result = 0, edi, high_mhz, low_mhz, dummy;
 	u32 state = 0;
 	u32 function = GET_SPEEDSTEP_FREQS;
+	CFI_DECL;
 
 	if (!(ist_info.event & 0xFFFF)) {
 		pr_debug("bug #1422 -- can't read freqs from BIOS\n");
@@ -114,17 +117,15 @@ static int speedstep_smi_get_freqs(unsig
 			command, smi_port);
 
 	__asm__ __volatile__(
-		"push %%ebp\n"
+		"push %%ebp" CFI_ADJUST_CFA_OFFSET(1) "\n\t"
 		"out %%al, (%%dx)\n"
-		"pop %%ebp"
+		"pop %%ebp" CFI_ADJUST_CFA_OFFSET(-1)
 		: "=a" (result),
 		  "=b" (high_mhz),
 		  "=c" (low_mhz),
 		  "=d" (state), "=D" (edi), "=S" (dummy)
-		: "a" (command),
-		  "b" (function),
-		  "c" (state),
-		  "d" (smi_port), "S" (0), "D" (0)
+		: CFI_INPUTS("a" (command), "b" (function), "c" (state),
+			     "d" (smi_port), "S" (0), "D" (0))
 	);
 
 	pr_debug("result %x, low_freq %u, high_freq %u\n",
@@ -165,6 +166,8 @@ static void speedstep_set_state(unsigned
 		state, command, smi_port);
 
 	do {
+		CFI_DECL;
+
 		if (retry) {
 			pr_debug("retry %u, previous result %u, waiting...\n",
 					retry, result);
@@ -172,14 +175,15 @@ static void speedstep_set_state(unsigned
 		}
 		retry++;
 		__asm__ __volatile__(
-			"push %%ebp\n"
+			"push %%ebp" CFI_ADJUST_CFA_OFFSET(1) "\n\t"
 			"out %%al, (%%dx)\n"
-			"pop %%ebp"
+			"pop %%ebp" CFI_ADJUST_CFA_OFFSET(-1)
 			: "=b" (new_state), "=D" (result),
 			  "=c" (dummy), "=a" (dummy),
 			  "=d" (dummy), "=S" (dummy)
-			: "a" (command), "b" (function), "c" (state),
-			  "d" (smi_port), "S" (0), "D" (0)
+			: CFI_INPUTS("a" (command), "b" (function),
+				     "c" (state), "d" (smi_port),
+				     "S" (0), "D" (0))
 			);
 	} while ((new_state != state) && (retry <= SMI_TRIES));
 
--- 3.18-rc3/drivers/crypto/padlock-aes.c
+++ 3.18-rc3-x86-inline-unwind-info/drivers/crypto/padlock-aes.c
@@ -21,6 +21,7 @@
 #include <linux/slab.h>
 #include <asm/cpu_device_id.h>
 #include <asm/byteorder.h>
+#include <asm/frame.h>
 #include <asm/processor.h>
 #include <asm/i387.h>
 
@@ -168,12 +169,13 @@ static inline void padlock_reset_key(str
 {
 	int cpu = raw_smp_processor_id();
 
-	if (cword != per_cpu(paes_last_cword, cpu))
-#ifndef CONFIG_X86_64
-		asm volatile ("pushfl; popfl");
-#else
-		asm volatile ("pushfq; popfq");
-#endif
+	if (cword != per_cpu(paes_last_cword, cpu)) {
+		CFI_DECL;
+
+		asm volatile ("pushf" CFI_ADJUST_CFA_OFFSET(1) "\n\t"
+			      "popf" CFI_ADJUST_CFA_OFFSET(-1)
+			      :: CFI_INPUTS());
+	}
 }
 
 static inline void padlock_store_cword(struct cword *cword)



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH, RFC] x86: also CFI-annotate certain inline asm()s
  2014-11-04  9:24 [PATCH, RFC] x86: also CFI-annotate certain inline asm()s Jan Beulich
@ 2014-11-04 19:40 ` Andy Lutomirski
  2014-11-05 17:13   ` Jan Beulich
  2014-11-10 10:01 ` Ingo Molnar
  1 sibling, 1 reply; 13+ messages in thread
From: Andy Lutomirski @ 2014-11-04 19:40 UTC (permalink / raw)
  To: Jan Beulich, mingo, tglx, hpa; +Cc: Tony Jones, linux-kernel

On 11/04/2014 01:24 AM, Jan Beulich wrote:
> The main obstacle to having done this long ago was the need to
> determine whether annotations are needed in the first place: They need
> to be avoided when a frame pointer got set up. Since I can't see a way
> to determine this before the compilation phase, this is being achieved
> by inspecting the memory address generated by the compiler in an
> interposed assembler macro. Of course this isn't really nice code, and
> this the main reason I'm posting this as RFC only at this point (with
> the hope that maybe someone has an idea of how to achieve the same
> thing in a more elegant way).

Ask binutils for help?

Is the issue that the CFI annotation you need is different depending on
whether there's a frame pointer or not?  If so, can you add some
comments so that mere asm mortals have some prayer of understanding how
your magic works and what the desired output annotations are in the
various cases?

I have enough trouble understanding what the CFA is, and the sorry state
of the docs don't really help.

--Andy

> 
> The reason for this being needed are various inline stack pointer
> adjustments. In particular in the context of perf and its use of NMIs,
> Tony observed the unwinder to get confused when an interruption
> happened in the middle of an inline push/pop pair. While the live
> unwinder continues to be of no interest upstream, NMIs being used to
> trigger crash dumps (via external button or watchdog) would appear to
> suffer from the same problem.
> 
> With the irqflags generic code generation issue out of the way (see
> https://patchwork.kernel.org/patch/5223561/), the only differences in
> generated code are a number pointless extra frame pointer adjustments
> the compiler decides it needs.
> 
> Several pieces of code were intentionally left untouched:
> 
> - use site compiled with -fno-omit-frame-pointer:
>   arch/x86/include/asm/switch_to.h:switch_to()
> 
> - annotations in helper sections (like .fixup) don't work anyway:
>   arch/x86/lib/usercopy_32.c
> 
> - call stack reaching there unlikely to be of interest:
>   arch/x86/boot/
>   arch/x86/kernel/cpu/common.c:flag_is_changeable_p()
>   arch/x86/include/asm/apm.h
>   arch/x86/include/asm/kexec.h:crash_setup_regs()
>   arch/x86/pci/pcbios.c:pcibios_get_irq_routing_table()
>   arch/x86/xen/enlighten.c:xen_setup_gdt()
>   drivers/input/misc/wistron_btns.c:call_bios()
>   drivers/pci/hotplug/cpqphp_nvram.c:access_EV()
>   drivers/pnp/pnpbios/
>   drivers/watchdog/hpwdt.c:asminline_call
> 
> - to be done separately (if desired/needed):
>   anything pv-ops related (patching as well as wrappers)
>   arch/x86/kernel/kprobes/
>   arch/x86/kernel/kvm/
>   drivers/char/i8k.c:i8k_smm()
>   drivers/char/toshiba.c:tosh_smm()
>   drivers/lguest/x86/
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> Cc: Tony Jones <tonyj@suse.de>
> ---
>  arch/x86/Makefile                |   22 ++++++++
>  arch/x86/include/asm/frame.h     |  103 ++++++++++++++++++++++++++++++++++++++-
>  arch/x86/include/asm/irqflags.h  |   18 ++++--
>  arch/x86/include/asm/processor.h |   22 +++++---
>  arch/x86/kernel/irq_32.c         |   15 ++++-
>  drivers/cpufreq/speedstep-smi.c  |   32 ++++++------
>  drivers/crypto/padlock-aes.c     |   14 +++--
>  7 files changed, 188 insertions(+), 38 deletions(-)
> 
> --- 3.18-rc3/arch/x86/Makefile
> +++ 3.18-rc3-x86-inline-unwind-info/arch/x86/Makefile
> @@ -131,6 +131,28 @@ ifdef CONFIG_X86_X32
>  endif
>  export CONFIG_X86_X32_ABI
>  
> +ifeq ($(CONFIG_UNWIND_INFO),y)
> +# Does gcc support generating CFI annotations via .cfi_* directives?
> +ifeq ($(call cc-option,-fdwarf2-cfi-asm),-fdwarf2-cfi-asm)
> +cfi-directives := $(call try-run,\
> +		    /bin/echo '\
> +			void test(void) { \
> +				CFI_DECL; \
> +				asm volatile(".cfi_undefined 0"); \
> +				asm volatile(CFI_ADJUST_CFA_OFFSET(1) :: CFI_INPUTS()); \
> +		 	}' | \
> +		    $(CC) -DCONFIG_CC_AS_CFI -include $(srctree)/arch/x86/include/asm/frame.h -c -x c -o "$$TMP" - \
> +		    ,y,n)
> +ifeq ($(cfi-directives),y)
> +KBUILD_CFLAGS += -DCONFIG_CC_AS_CFI
> +else
> +$(warning Inline function CFI annotations not usable with $(CC))
> +endif
> +else
> +$(warning Inline function CFI annotations not supported by $(CC))
> +endif
> +endif
> +
>  # Don't unroll struct assignments with kmemcheck enabled
>  ifeq ($(CONFIG_KMEMCHECK),y)
>  	KBUILD_CFLAGS += $(call cc-option,-fno-builtin-memcpy)
> --- 3.18-rc3/arch/x86/include/asm/frame.h
> +++ 3.18-rc3-x86-inline-unwind-info/arch/x86/include/asm/frame.h
> @@ -1,3 +1,8 @@
> +#if 0
> +.if 0
> +#elif !defined(_ASM_X86_FRAME_H)
> +#define _ASM_X86_FRAME_H
> +
>  #ifdef __ASSEMBLY__
>  
>  #include <asm/asm.h>
> @@ -23,4 +28,100 @@
>  	.endm
>  #endif
>  
> -#endif  /*  __ASSEMBLY__  */
> +#elif defined(CONFIG_CC_AS_CFI)
> +
> +#include <linux/stringify.h>
> +
> +asm(".include \"" __FILE__ "\"");
> +
> +#define CFI_DECL volatile struct {} cfi_var
> +/* Helper: Even inside .if 0 blocks gas warns when strings start a line. */
> +#define CFI_BLANK
> +#define CFI_PREFIX "; maybe_cfi_annotate$ %[cfi_var],%c[cfi_sz],"
> +#define CFI_ADJUST_CFA_OFFSET(nr) CFI_PREFIX "adjust_cfa_offset" \
> +	CFI_BLANK " (" __stringify(nr) ") * (%c[cfi_sz])"
> +#define CFI_DEF_CFA_REGISTER(reg) CFI_PREFIX "def_cfa_register " #reg
> +#define CFI_INPUTS(more...) [cfi_var] "m" (cfi_var), \
> +			    [cfi_sz] "i" (sizeof(void *)), ## more
> +
> +#else
> +
> +#define CFI_DECL struct cfi_dummy
> +#define CFI_ADJUST_CFA_OFFSET(nr)
> +#define CFI_DEF_CFA_REGISTER(reg)
> +#define CFI_INPUTS(more...) more
> +
> +#endif /* __ASSEMBLY__ / CONFIG_CC_AS_CFI */
> +
> +#elif 0
> +.endif
> +.macro maybe_cfi_annotate$ mem:req, sz:req, annot:vararg
> + nest$ = 0
> + state$ = 0
> + skip$ = 0
> + .irpc c,\mem
> +  .ifeqs "(", "\c"
> +   nest$ = nest$ + 1
> +   state$ = 0
> +  .endif
> +
> +  .ifeqs ")", "\c"
> +   nest$ = nest$ - 1
> +  .elseif (nest$ == 1) && (skip$ == 0)
> +   .if state$ == 0
> +    .ifeqs "%", "\c"
> +     state$ = 1
> +    .endif
> +   .elseif state$ == 1
> +    .if \sz == 4
> +     .ifeqs "e", "\c"
> +      state$ = 2
> +     .endif
> +    .endif
> +
> +    .if \sz == 8
> +     .ifeqs "r", "\c"
> +      state$ = 2
> +     .endif
> +    .endif
> +
> +    .if state$ <> 2
> +     .exitm
> +    .endif
> +   .elseif state$ == 2
> +    .ifeqs "s", "\c"
> +     state$ = state$ | 0x04
> +    .endif
> +
> +    .ifeqs "b", "\c"
> +     state$ = state$ | 0x08
> +    .endif
> +
> +    .if (state$ & 0x0c) == 0
> +     .exitm
> +    .endif
> +   .elseif ((state$ & 3) == 2) && ((state$ & 0x0c) <> 0)
> +    .ifeqs "p", "\c"
> +     state$ = state$ | 0x10
> +    .else
> +     .exitm
> +    .endif
> +   .else
> +    .ifeqs ",", "\c"
> +     skip$ = 1
> +    .else
> +     .exitm
> +    .endif
> +   .endif
> +  .endif
> + .endr
> +
> + .if (nest$ <> 0) && (state$ == 0)
> +  .error "unmatched parentheses in '\mem\(')"
> + .elseif (nest$ == 0) && (state$ == 0x16)
> +  .cfi_\annot
> + .elseif (nest$ <> 0) || (state$ <> 0x1a)
> +  .error "unsupported memory expression '\mem\(')"
> + .endif
> +.endm
> +#endif
> --- 3.18-rc3/arch/x86/include/asm/irqflags.h
> +++ 3.18-rc3-x86-inline-unwind-info/arch/x86/include/asm/irqflags.h
> @@ -4,6 +4,9 @@
>  #include <asm/processor-flags.h>
>  
>  #ifndef __ASSEMBLY__
> +
> +#include <asm/frame.h>
> +
>  /*
>   * Interrupt control:
>   */
> @@ -11,6 +14,7 @@
>  static inline unsigned long native_save_fl(void)
>  {
>  	unsigned long flags;
> +	CFI_DECL;
>  
>  	/*
>  	 * "=rm" is safe here, because "pop" adjusts the stack before
> @@ -18,9 +22,10 @@ static inline unsigned long native_save_
>  	 * documented behavior of the "pop" instruction.
>  	 */
>  	asm volatile("# __raw_save_flags\n\t"
> -		     "pushf ; pop %0"
> +		     "pushf" CFI_ADJUST_CFA_OFFSET(1) "\n\t"
> +		     "pop %0" CFI_ADJUST_CFA_OFFSET(-1)
>  		     : "=rm" (flags)
> -		     : /* no input */
> +		     : CFI_INPUTS()
>  		     : "memory");
>  
>  	return flags;
> @@ -28,10 +33,13 @@ static inline unsigned long native_save_
>  
>  static inline void native_restore_fl(unsigned long flags)
>  {
> -	asm volatile("push %0 ; popf"
> +	CFI_DECL;
> +
> +	asm volatile("push %[flags]" CFI_ADJUST_CFA_OFFSET(1) "\n\t"
> +		     "popf" CFI_ADJUST_CFA_OFFSET(-1)
>  		     : /* no output */
> -		     :"g" (flags)
> -		     :"memory", "cc");
> +		     : CFI_INPUTS([flags] "g" (flags))
> +		     : "memory", "cc");
>  }
>  
>  static inline void native_irq_disable(void)
> --- 3.18-rc3/arch/x86/include/asm/processor.h
> +++ 3.18-rc3-x86-inline-unwind-info/arch/x86/include/asm/processor.h
> @@ -8,7 +8,6 @@ struct task_struct;
>  struct mm_struct;
>  
>  #include <asm/vm86.h>
> -#include <asm/math_emu.h>
>  #include <asm/segment.h>
>  #include <asm/types.h>
>  #include <asm/sigcontext.h>
> @@ -22,6 +21,11 @@ struct mm_struct;
>  #include <asm/nops.h>
>  #include <asm/special_insns.h>
>  
> +#ifdef CONFIG_X86_32
> +#include <asm/math_emu.h>
> +#include <asm/frame.h>
> +#endif
> +
>  #include <linux/personality.h>
>  #include <linux/cpumask.h>
>  #include <linux/cache.h>
> @@ -531,15 +535,17 @@ static inline void native_set_iopl_mask(
>  {
>  #ifdef CONFIG_X86_32
>  	unsigned int reg;
> +	CFI_DECL;
>  
> -	asm volatile ("pushfl;"
> -		      "popl %0;"
> -		      "andl %1, %0;"
> -		      "orl %2, %0;"
> -		      "pushl %0;"
> -		      "popfl"
> +	asm volatile ("pushfl" CFI_ADJUST_CFA_OFFSET(1) "\n\t"
> +		      "popl %0" CFI_ADJUST_CFA_OFFSET(-1) "\n\t"
> +		      "andl %[invmask], %0\n\t"
> +		      "orl %[mask], %0\n\t"
> +		      "pushl %0" CFI_ADJUST_CFA_OFFSET(1) "\n\t"
> +		      "popfl" CFI_ADJUST_CFA_OFFSET(-1)
>  		      : "=&r" (reg)
> -		      : "i" (~X86_EFLAGS_IOPL), "r" (mask));
> +		      : CFI_INPUTS([invmask] "i" (~X86_EFLAGS_IOPL),
> +				   [mask] "r" (mask)));
>  #endif
>  }
>  
> --- 3.18-rc3/arch/x86/kernel/irq_32.c
> +++ 3.18-rc3-x86-inline-unwind-info/arch/x86/kernel/irq_32.c
> @@ -20,6 +20,7 @@
>  #include <linux/mm.h>
>  
>  #include <asm/apic.h>
> +#include <asm/frame.h>
>  
>  DEFINE_PER_CPU_SHARED_ALIGNED(irq_cpustat_t, irq_stat);
>  EXPORT_PER_CPU_SYMBOL(irq_stat);
> @@ -60,12 +61,15 @@ DEFINE_PER_CPU(struct irq_stack *, softi
>  
>  static void call_on_stack(void *func, void *stack)
>  {
> +	CFI_DECL;
> +
>  	asm volatile("xchgl	%%ebx,%%esp	\n"
> +		     CFI_DEF_CFA_REGISTER(ebx) "\n"
>  		     "call	*%%edi		\n"
>  		     "movl	%%ebx,%%esp	\n"
> +		     CFI_DEF_CFA_REGISTER(esp)
>  		     : "=b" (stack)
> -		     : "0" (stack),
> -		       "D"(func)
> +		     : CFI_INPUTS("0" (stack), "D" (func))
>  		     : "memory", "cc", "edx", "ecx", "eax");
>  }
>  
> @@ -86,6 +90,7 @@ execute_on_irq_stack(int overflow, struc
>  {
>  	struct irq_stack *curstk, *irqstk;
>  	u32 *isp, *prev_esp, arg1, arg2;
> +	CFI_DECL;
>  
>  	curstk = (struct irq_stack *) current_stack();
>  	irqstk = __this_cpu_read(hardirq_stack);
> @@ -109,11 +114,13 @@ execute_on_irq_stack(int overflow, struc
>  		call_on_stack(print_stack_overflow, isp);
>  
>  	asm volatile("xchgl	%%ebx,%%esp	\n"
> +		     CFI_DEF_CFA_REGISTER(ebx) "\n"
>  		     "call	*%%edi		\n"
>  		     "movl	%%ebx,%%esp	\n"
> +		     CFI_DEF_CFA_REGISTER(esp)
>  		     : "=a" (arg1), "=d" (arg2), "=b" (isp)
> -		     :  "0" (irq),   "1" (desc),  "2" (isp),
> -			"D" (desc->handle_irq)
> +		     : CFI_INPUTS("0" (irq),   "1" (desc),  "2" (isp),
> +				  "D" (desc->handle_irq))
>  		     : "memory", "cc", "ecx");
>  	return 1;
>  }
> --- 3.18-rc3/drivers/cpufreq/speedstep-smi.c
> +++ 3.18-rc3-x86-inline-unwind-info/drivers/cpufreq/speedstep-smi.c
> @@ -19,6 +19,7 @@
>  #include <linux/cpufreq.h>
>  #include <linux/delay.h>
>  #include <linux/io.h>
> +#include <asm/frame.h>
>  #include <asm/ist.h>
>  #include <asm/cpu_device_id.h>
>  
> @@ -64,6 +65,7 @@ static int speedstep_smi_ownership(void)
>  	u32 command, result, magic, dummy;
>  	u32 function = GET_SPEEDSTEP_OWNER;
>  	unsigned char magic_data[] = "Copyright (c) 1999 Intel Corporation";
> +	CFI_DECL;
>  
>  	command = (smi_sig & 0xffffff00) | (smi_cmd & 0xff);
>  	magic = virt_to_phys(magic_data);
> @@ -72,14 +74,14 @@ static int speedstep_smi_ownership(void)
>  			command, smi_port);
>  
>  	__asm__ __volatile__(
> -		"push %%ebp\n"
> +		"push %%ebp" CFI_ADJUST_CFA_OFFSET(1) "\n\t"
>  		"out %%al, (%%dx)\n"
> -		"pop %%ebp\n"
> +		"pop %%ebp" CFI_ADJUST_CFA_OFFSET(-1)
>  		: "=D" (result),
>  		  "=a" (dummy), "=b" (dummy), "=c" (dummy), "=d" (dummy),
>  		  "=S" (dummy)
> -		: "a" (command), "b" (function), "c" (0), "d" (smi_port),
> -		  "D" (0), "S" (magic)
> +		: CFI_INPUTS("a" (command), "b" (function), "c" (0),
> +			     "d" (smi_port), "D" (0), "S" (magic))
>  		: "memory"
>  	);
>  
> @@ -102,6 +104,7 @@ static int speedstep_smi_get_freqs(unsig
>  	u32 command, result = 0, edi, high_mhz, low_mhz, dummy;
>  	u32 state = 0;
>  	u32 function = GET_SPEEDSTEP_FREQS;
> +	CFI_DECL;
>  
>  	if (!(ist_info.event & 0xFFFF)) {
>  		pr_debug("bug #1422 -- can't read freqs from BIOS\n");
> @@ -114,17 +117,15 @@ static int speedstep_smi_get_freqs(unsig
>  			command, smi_port);
>  
>  	__asm__ __volatile__(
> -		"push %%ebp\n"
> +		"push %%ebp" CFI_ADJUST_CFA_OFFSET(1) "\n\t"
>  		"out %%al, (%%dx)\n"
> -		"pop %%ebp"
> +		"pop %%ebp" CFI_ADJUST_CFA_OFFSET(-1)
>  		: "=a" (result),
>  		  "=b" (high_mhz),
>  		  "=c" (low_mhz),
>  		  "=d" (state), "=D" (edi), "=S" (dummy)
> -		: "a" (command),
> -		  "b" (function),
> -		  "c" (state),
> -		  "d" (smi_port), "S" (0), "D" (0)
> +		: CFI_INPUTS("a" (command), "b" (function), "c" (state),
> +			     "d" (smi_port), "S" (0), "D" (0))
>  	);
>  
>  	pr_debug("result %x, low_freq %u, high_freq %u\n",
> @@ -165,6 +166,8 @@ static void speedstep_set_state(unsigned
>  		state, command, smi_port);
>  
>  	do {
> +		CFI_DECL;
> +
>  		if (retry) {
>  			pr_debug("retry %u, previous result %u, waiting...\n",
>  					retry, result);
> @@ -172,14 +175,15 @@ static void speedstep_set_state(unsigned
>  		}
>  		retry++;
>  		__asm__ __volatile__(
> -			"push %%ebp\n"
> +			"push %%ebp" CFI_ADJUST_CFA_OFFSET(1) "\n\t"
>  			"out %%al, (%%dx)\n"
> -			"pop %%ebp"
> +			"pop %%ebp" CFI_ADJUST_CFA_OFFSET(-1)
>  			: "=b" (new_state), "=D" (result),
>  			  "=c" (dummy), "=a" (dummy),
>  			  "=d" (dummy), "=S" (dummy)
> -			: "a" (command), "b" (function), "c" (state),
> -			  "d" (smi_port), "S" (0), "D" (0)
> +			: CFI_INPUTS("a" (command), "b" (function),
> +				     "c" (state), "d" (smi_port),
> +				     "S" (0), "D" (0))
>  			);
>  	} while ((new_state != state) && (retry <= SMI_TRIES));
>  
> --- 3.18-rc3/drivers/crypto/padlock-aes.c
> +++ 3.18-rc3-x86-inline-unwind-info/drivers/crypto/padlock-aes.c
> @@ -21,6 +21,7 @@
>  #include <linux/slab.h>
>  #include <asm/cpu_device_id.h>
>  #include <asm/byteorder.h>
> +#include <asm/frame.h>
>  #include <asm/processor.h>
>  #include <asm/i387.h>
>  
> @@ -168,12 +169,13 @@ static inline void padlock_reset_key(str
>  {
>  	int cpu = raw_smp_processor_id();
>  
> -	if (cword != per_cpu(paes_last_cword, cpu))
> -#ifndef CONFIG_X86_64
> -		asm volatile ("pushfl; popfl");
> -#else
> -		asm volatile ("pushfq; popfq");
> -#endif
> +	if (cword != per_cpu(paes_last_cword, cpu)) {
> +		CFI_DECL;
> +
> +		asm volatile ("pushf" CFI_ADJUST_CFA_OFFSET(1) "\n\t"
> +			      "popf" CFI_ADJUST_CFA_OFFSET(-1)
> +			      :: CFI_INPUTS());
> +	}
>  }
>  
>  static inline void padlock_store_cword(struct cword *cword)
> 
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH, RFC] x86: also CFI-annotate certain inline asm()s
  2014-11-04 19:40 ` Andy Lutomirski
@ 2014-11-05 17:13   ` Jan Beulich
  2014-11-05 17:23     ` Andy Lutomirski
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2014-11-05 17:13 UTC (permalink / raw)
  To: luto, mingo, tglx, hpa; +Cc: tonyj, linux-kernel

>>> Andy Lutomirski <luto@amacapital.net> 11/04/14 8:40 PM >>>
>On 11/04/2014 01:24 AM, Jan Beulich wrote:
>> The main obstacle to having done this long ago was the need to
>> determine whether annotations are needed in the first place: They need
>> to be avoided when a frame pointer got set up. Since I can't see a way
>> to determine this before the compilation phase, this is being achieved
>> by inspecting the memory address generated by the compiler in an
>> interposed assembler macro. Of course this isn't really nice code, and
>> this the main reason I'm posting this as RFC only at this point (with
>> the hope that maybe someone has an idea of how to achieve the same
>> thing in a more elegant way).
>
>Ask binutils for help?

Binutils know as little about the code the compiler generated as we do.

>Is the issue that the CFI annotation you need is different depending on
>whether there's a frame pointer or not?

No - as said above, they need to be avoided altogether when there's a
frame pointer.

> If so, can you add some
>comments so that mere asm mortals have some prayer of understanding how
>your magic works and what the desired output annotations are in the
>various cases?

Honestly I have a hard time seeing where comments would help here. Plus
the difficult part isn't how the annotations look like, but (see above) simply
whether to emit them at all.

Jan



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH, RFC] x86: also CFI-annotate certain inline asm()s
  2014-11-05 17:13   ` Jan Beulich
@ 2014-11-05 17:23     ` Andy Lutomirski
  2014-11-06 10:35       ` Jan Beulich
  0 siblings, 1 reply; 13+ messages in thread
From: Andy Lutomirski @ 2014-11-05 17:23 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, tonyj, linux-kernel

On Wed, Nov 5, 2014 at 9:13 AM, Jan Beulich <jbeulich@suse.com> wrote:
>>>> Andy Lutomirski <luto@amacapital.net> 11/04/14 8:40 PM >>>
>>On 11/04/2014 01:24 AM, Jan Beulich wrote:
>>> The main obstacle to having done this long ago was the need to
>>> determine whether annotations are needed in the first place: They need
>>> to be avoided when a frame pointer got set up. Since I can't see a way
>>> to determine this before the compilation phase, this is being achieved
>>> by inspecting the memory address generated by the compiler in an
>>> interposed assembler macro. Of course this isn't really nice code, and
>>> this the main reason I'm posting this as RFC only at this point (with
>>> the hope that maybe someone has an idea of how to achieve the same
>>> thing in a more elegant way).
>>
>>Ask binutils for help?
>
> Binutils know as little about the code the compiler generated as we do.

Could binutils add a
.cfi_adjust_cfa_offset_if_the_cfa_depends_on_sp_right_now directive?
IIUC, the issue is that, when you push, you don't want the canonical
frame address to change as a result, but you just changed the stack
pointer, so if the CFA is computed as an offset from the stack pointer
in the current context, that offset needs to change.

Alternatively, is there any sane way to get the inline asm to act as
though it creates an entirely new frame?  It would have CFA == rsp
initially (or rsp + 8 or whatever -- I can never keep track of what
the CFA is actually supposed to point to) and unwind instructions that
tell the unwinder that the caller pc is at a known address instead of
being stuck in the stack frame?

>
>>Is the issue that the CFI annotation you need is different depending on
>>whether there's a frame pointer or not?
>
> No - as said above, they need to be avoided altogether when there's a
> frame pointer.
>
>> If so, can you add some
>>comments so that mere asm mortals have some prayer of understanding how
>>your magic works and what the desired output annotations are in the
>>various cases?
>
> Honestly I have a hard time seeing where comments would help here. Plus
> the difficult part isn't how the annotations look like, but (see above) simply
> whether to emit them at all.

It could be something simple like an example of what the inputs to the
asm macros are in the two cases.  Currently even figuring out where
those inputs come from involves following a big tangle of C
preprocessor stuff, and I don't know what it's supposed to output, and
what that's supposed to do the expansions in the inline asm, and how
that ends up influencing the gas macros.

I.e. I sort of expect I'll need to want to one of these things some
day, and I'd like a couple pointers :)

--Andy

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH, RFC] x86: also CFI-annotate certain inline asm()s
  2014-11-05 17:23     ` Andy Lutomirski
@ 2014-11-06 10:35       ` Jan Beulich
  2014-11-06 17:00         ` Andy Lutomirski
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2014-11-06 10:35 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Ingo Molnar, Thomas Gleixner, tonyj, linux-kernel, H. Peter Anvin

>>> On 05.11.14 at 18:23, <luto@amacapital.net> wrote:
> On Wed, Nov 5, 2014 at 9:13 AM, Jan Beulich <jbeulich@suse.com> wrote:
>>>>> Andy Lutomirski <luto@amacapital.net> 11/04/14 8:40 PM >>>
>>>On 11/04/2014 01:24 AM, Jan Beulich wrote:
>>>> The main obstacle to having done this long ago was the need to
>>>> determine whether annotations are needed in the first place: They need
>>>> to be avoided when a frame pointer got set up. Since I can't see a way
>>>> to determine this before the compilation phase, this is being achieved
>>>> by inspecting the memory address generated by the compiler in an
>>>> interposed assembler macro. Of course this isn't really nice code, and
>>>> this the main reason I'm posting this as RFC only at this point (with
>>>> the hope that maybe someone has an idea of how to achieve the same
>>>> thing in a more elegant way).
>>>
>>>Ask binutils for help?
>>
>> Binutils know as little about the code the compiler generated as we do.
> 
> Could binutils add a
> .cfi_adjust_cfa_offset_if_the_cfa_depends_on_sp_right_now directive?
> IIUC, the issue is that, when you push, you don't want the canonical
> frame address to change as a result, but you just changed the stack
> pointer, so if the CFA is computed as an offset from the stack pointer
> in the current context, that offset needs to change.

While that's theoretically doable, I don't think this would be a
reasonable approach.

> Alternatively, is there any sane way to get the inline asm to act as
> though it creates an entirely new frame?  It would have CFA == rsp
> initially (or rsp + 8 or whatever -- I can never keep track of what
> the CFA is actually supposed to point to) and unwind instructions that
> tell the unwinder that the caller pc is at a known address instead of
> being stuck in the stack frame?

No, that can't work: You'd have to
- end the previous function (from the CFI engine's pov)
- start a new function
- do what you suggest above
- end the "nested" function
- start a continuation function for the subsequent compiler
  generated code
- magically know the state of things at the point the original
  function got (artificially) ended

>>>Is the issue that the CFI annotation you need is different depending on
>>>whether there's a frame pointer or not?
>>
>> No - as said above, they need to be avoided altogether when there's a
>> frame pointer.
>>
>>> If so, can you add some
>>>comments so that mere asm mortals have some prayer of understanding how
>>>your magic works and what the desired output annotations are in the
>>>various cases?
>>
>> Honestly I have a hard time seeing where comments would help here. Plus
>> the difficult part isn't how the annotations look like, but (see above) 
> simply
>> whether to emit them at all.
> 
> It could be something simple like an example of what the inputs to the
> asm macros are in the two cases.  Currently even figuring out where
> those inputs come from involves following a big tangle of C
> preprocessor stuff, and I don't know what it's supposed to output, and
> what that's supposed to do the expansions in the inline asm, and how
> that ends up influencing the gas macros.
> 
> I.e. I sort of expect I'll need to want to one of these things some
> day, and I'd like a couple pointers :)

I'll see what I can do (but I'll invest time into doing so only if there
are at least signs of this having a remote chance of going in at some
point).

Jan


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH, RFC] x86: also CFI-annotate certain inline asm()s
  2014-11-06 10:35       ` Jan Beulich
@ 2014-11-06 17:00         ` Andy Lutomirski
  0 siblings, 0 replies; 13+ messages in thread
From: Andy Lutomirski @ 2014-11-06 17:00 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Ingo Molnar, Thomas Gleixner, Tony Jones, linux-kernel, H. Peter Anvin

On Thu, Nov 6, 2014 at 2:35 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 05.11.14 at 18:23, <luto@amacapital.net> wrote:
>> On Wed, Nov 5, 2014 at 9:13 AM, Jan Beulich <jbeulich@suse.com> wrote:
>>>>>> Andy Lutomirski <luto@amacapital.net> 11/04/14 8:40 PM >>>
>>>>On 11/04/2014 01:24 AM, Jan Beulich wrote:
>>>>> The main obstacle to having done this long ago was the need to
>>>>> determine whether annotations are needed in the first place: They need
>>>>> to be avoided when a frame pointer got set up. Since I can't see a way
>>>>> to determine this before the compilation phase, this is being achieved
>>>>> by inspecting the memory address generated by the compiler in an
>>>>> interposed assembler macro. Of course this isn't really nice code, and
>>>>> this the main reason I'm posting this as RFC only at this point (with
>>>>> the hope that maybe someone has an idea of how to achieve the same
>>>>> thing in a more elegant way).
>>>>
>>>>Ask binutils for help?
>>>
>>> Binutils know as little about the code the compiler generated as we do.
>>
>> Could binutils add a
>> .cfi_adjust_cfa_offset_if_the_cfa_depends_on_sp_right_now directive?
>> IIUC, the issue is that, when you push, you don't want the canonical
>> frame address to change as a result, but you just changed the stack
>> pointer, so if the CFA is computed as an offset from the stack pointer
>> in the current context, that offset needs to change.
>
> While that's theoretically doable, I don't think this would be a
> reasonable approach.
>

I'll defer to your judgment about this.  You clearly know a lot more
about cfi than I do :)

That being said, I've occasionally wanted the ability to do things
like this in userspace code, so maybe it wouldn't be a terrible
feature request.

>> Alternatively, is there any sane way to get the inline asm to act as
>> though it creates an entirely new frame?  It would have CFA == rsp
>> initially (or rsp + 8 or whatever -- I can never keep track of what
>> the CFA is actually supposed to point to) and unwind instructions that
>> tell the unwinder that the caller pc is at a known address instead of
>> being stuck in the stack frame?
>
> No, that can't work: You'd have to
> - end the previous function (from the CFI engine's pov)
> - start a new function
> - do what you suggest above
> - end the "nested" function
> - start a continuation function for the subsequent compiler
>   generated code
> - magically know the state of things at the point the original
>   function got (artificially) ended

Fair enough.  Empirically, sticking this in the middle of a function
doesn't work:

    .cfi_remember_state
    .cfi_endproc
    .cfi_startproc
    .cfi_restore_state

Oh, well.

--Andy

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH, RFC] x86: also CFI-annotate certain inline asm()s
  2014-11-04  9:24 [PATCH, RFC] x86: also CFI-annotate certain inline asm()s Jan Beulich
  2014-11-04 19:40 ` Andy Lutomirski
@ 2014-11-10 10:01 ` Ingo Molnar
       [not found]   ` <CA+55aFw9MV1n8T7_k5oftfY-sOu8s1ywKYM-3k4+PF022vv9ow@mail.gmail.com>
  1 sibling, 1 reply; 13+ messages in thread
From: Ingo Molnar @ 2014-11-10 10:01 UTC (permalink / raw)
  To: Jan Beulich; +Cc: mingo, tglx, hpa, Tony Jones, linux-kernel, Linus Torvalds


* Jan Beulich <JBeulich@suse.com> wrote:

> @@ -11,6 +14,7 @@
>  static inline unsigned long native_save_fl(void)
>  {
>  	unsigned long flags;
> +	CFI_DECL;
>  
>  	/*
>  	 * "=rm" is safe here, because "pop" adjusts the stack before
> @@ -18,9 +22,10 @@ static inline unsigned long native_save_
>  	 * documented behavior of the "pop" instruction.
>  	 */
>  	asm volatile("# __raw_save_flags\n\t"
> -		     "pushf ; pop %0"
> +		     "pushf" CFI_ADJUST_CFA_OFFSET(1) "\n\t"
> +		     "pop %0" CFI_ADJUST_CFA_OFFSET(-1)
>  		     : "=rm" (flags)
> -		     : /* no input */
> +		     : CFI_INPUTS()
>  		     : "memory");
>  
>  	return flags;
> @@ -28,10 +33,13 @@ static inline unsigned long native_save_
>  
>  static inline void native_restore_fl(unsigned long flags)
>  {
> -	asm volatile("push %0 ; popf"
> +	CFI_DECL;
> +
> +	asm volatile("push %[flags]" CFI_ADJUST_CFA_OFFSET(1) "\n\t"
> +		     "popf" CFI_ADJUST_CFA_OFFSET(-1)
>  		     : /* no output */
> -		     :"g" (flags)
> -		     :"memory", "cc");
> +		     : CFI_INPUTS([flags] "g" (flags))
> +		     : "memory", "cc");
>  }

In principle I'm not against generating better debug info for our 
assembly code, but I think it should be more readable - for 
example I would not hide 'cfi_var' in the CFI_DECL above but I'd 
make it something like:

	CFI_DECL(cfi_var);

	...
		CFI_ADJUST_CFA_OFFSET(cfi_var, ...);

etc. - even if this is slightly more verbose than what you wrote.

There's few things worse than hidden state connections between 
various primitives - even if this is build only.

I'd also suggest splitting up the patch into 'add new primitives' 
and 'use them to improve stuff' parts.

(Assuming nobody objects strongly.)

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH, RFC] x86: also CFI-annotate certain inline asm()s
       [not found]   ` <CA+55aFw9MV1n8T7_k5oftfY-sOu8s1ywKYM-3k4+PF022vv9ow@mail.gmail.com>
@ 2014-11-10 18:10     ` Linus Torvalds
  2014-11-11  7:52       ` Jan Beulich
  2014-11-10 18:17     ` Ingo Molnar
  2014-11-11  7:42     ` Jan Beulich
  2 siblings, 1 reply; 13+ messages in thread
From: Linus Torvalds @ 2014-11-10 18:10 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Anvin, Jan Beulich, Thomas Gleixner, Ingo Molnar,
	Linux Kernel Mailing List, Tony Jones

On Mon, Nov 10, 2014 at 9:56 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> So no. A very strong NACK. The code was too ugly to live, there is no good
> stated reason for it, and the only reason I can even remotely imagine is
> wrong and complete crap anyway (ie making the CFI annotations a correctness
> issue by introducing other infrastructure that depends on it always being
> right).

Btw, the sane thing to do is to make your infrastructure just say "If
my frame walker hits a push/pop without CFI information, I'll just add
it myself".

Yes, that involved having to actuall ylook at the instruction. Tough
shit. Just do it right. There aren't that many push/pop patterns.
Don't make the kernel more fragile by introducing these kinds of hacky
macros-from-hell.

Improve the debugger, don't make kernel code worse because your
out-of-tree debugging infrastructure is too broken to live.

                      Linus

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH, RFC] x86: also CFI-annotate certain inline asm()s
       [not found]   ` <CA+55aFw9MV1n8T7_k5oftfY-sOu8s1ywKYM-3k4+PF022vv9ow@mail.gmail.com>
  2014-11-10 18:10     ` Linus Torvalds
@ 2014-11-10 18:17     ` Ingo Molnar
  2014-11-11  7:42     ` Jan Beulich
  2 siblings, 0 replies; 13+ messages in thread
From: Ingo Molnar @ 2014-11-10 18:17 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Peter Anvin, Jan Beulich, tglx, mingo, linux-kernel, Tony Jones


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Nov 10, 2014 2:01 AM, "Ingo Molnar" <mingo@kernel.org> wrote:
> >
> > (Assuming nobody objects strongly.)
> 
> I object strongly to anything this disgustingly ugly, 
> especially since it doesn't even have a strong stated reason.

Fair enough!

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH, RFC] x86: also CFI-annotate certain inline asm()s
       [not found]   ` <CA+55aFw9MV1n8T7_k5oftfY-sOu8s1ywKYM-3k4+PF022vv9ow@mail.gmail.com>
  2014-11-10 18:10     ` Linus Torvalds
  2014-11-10 18:17     ` Ingo Molnar
@ 2014-11-11  7:42     ` Jan Beulich
  2014-11-12 20:36       ` Linus Torvalds
  2 siblings, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2014-11-11  7:42 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: mingo, Ingo Molnar, tglx, Tony Jones, linux-kernel, Peter Anvin

>>> On 10.11.14 at 18:56, <torvalds@linux-foundation.org> wrote:
> So no. A very strong NACK. The code was too ugly to live, there is no good
> stated reason for it, and the only reason I can even remotely imagine is
> wrong and complete crap anyway (ie making the CFI annotations a correctness
> issue by introducing other infrastructure that depends on it always being
> right).
> 
> We've been through this before. The same suspects were involved. If this is
> some crazy "debug info has to be right, because otherwise our total crap
> unwinding goes boom", then the solution is not to add more broken and ugly
> CFI annotations, the solution is to not depend on a broken and fragile
> unwinder.
> 
> Any debug infrastructure *must* be robust enough that missing or even
> actively wrong frame information must not break it. If it's too fragile to
> handle missing or bad debug information, then it is too fragile to be used
> for debugging.

And no-one said this not to be the case. The fact here is that the
unwind information can be improved, resulting in better use of
whatever consumes it. And the "whatever" that prompted this was
upstream's perf subsystem (I think I even mentioned this in the
patch description), which - through the way things get connected
together in the kernel - just _happens_ to use the unwinder when
it's present in the kernel.

Nothing crashes with the unwind information being wrong. It is
solely you who was claiming (without proof) years ago that the
unwinder repeatedly caused issues. Yes, we did find a bug or two
over the years in it, but there being bugs in software is pretty
much unavoidable. If you didn't accept that, you shouldn't be
merging any patches introducing new features.

The point of the patch, however, is to make the unwinder do a
better job. And even if you prefer to ignore this fact, people
over here actually prefer looking at stack traces they don't have
to guess or verify for each entry whether it actually belongs to
the call stack - this saves them time, and hence allows them to
be more productive.

> The fact is, debug information had better *never* be so important that it
> has to be right. Seriously. Even gcc historically gets the debug
> information wrong quite regularly, exactly because there are so few things
> that depend on it. We have to assume that frame information is incomplete,
> and that also means that we don't add insane crap to our inline asm code.
> 
> So for something like this to be workable, it has better be:

So finally you're getting to at least slightly productive criticism.

> - much cleaner and less hacky

And I asked for suggestions in the patch comment.

> - not have to depend on random gcc code generation

I don't think I can correlate this to any particular aspect of the
change.

Jan


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH, RFC] x86: also CFI-annotate certain inline asm()s
  2014-11-10 18:10     ` Linus Torvalds
@ 2014-11-11  7:52       ` Jan Beulich
  0 siblings, 0 replies; 13+ messages in thread
From: Jan Beulich @ 2014-11-11  7:52 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ingo Molnar, Ingo Molnar, Thomas Gleixner, Tony Jones,
	Linux Kernel Mailing List, Peter Anvin

>>> On 10.11.14 at 19:10, <torvalds@linux-foundation.org> wrote:
> Btw, the sane thing to do is to make your infrastructure just say "If
> my frame walker hits a push/pop without CFI information, I'll just add
> it myself".
> 
> Yes, that involved having to actuall ylook at the instruction. Tough
> shit. Just do it right. There aren't that many push/pop patterns.

Did you think this through? Inspecting instructions while unwinding
the stack would involve significant amounts of architecture specific
code, whereas the unwinder is largely architecture independent.
Apart from code to obtain machine state, only the annotations are
(necessarily) connected to the architecture since they accompany
machine instructions.

Did you ever write a disassembler capable of correctly dealing with
everything a compiler may generate (i.e. including data literals in the
middle of code)?

Anyway - I'm sure I won't convince you now or ever, this is too
religious a topic for you afaict, and hence an objective and fair
discussion is impossible.

Jan


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH, RFC] x86: also CFI-annotate certain inline asm()s
  2014-11-11  7:42     ` Jan Beulich
@ 2014-11-12 20:36       ` Linus Torvalds
  2014-11-13  7:49         ` Jan Beulich
  0 siblings, 1 reply; 13+ messages in thread
From: Linus Torvalds @ 2014-11-12 20:36 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Ingo Molnar, Ingo Molnar, Thomas Gleixner, Tony Jones,
	Linux Kernel Mailing List, Peter Anvin

On Mon, Nov 10, 2014 at 11:42 PM, Jan Beulich <JBeulich@suse.com> wrote:
>
> Nothing crashes with the unwind information being wrong. It is
> solely you who was claiming (without proof) years ago that the
> unwinder repeatedly caused issues.

Umm. We had oopses showing it. Several times.

> Yes, we did find a bug or two over the years in it

.. and you and Andi repeatedly refused to make the code more robust
when I asked.

Which is why I don't work with Andi or you directly any more, and why
the code got entirely removed when I got fed up with the last time it
crashed in ways that it *wouldn't* have crashed if it had been made
more robust.

Every time there were just excuses. Like now. "All code has bugs".
Sure. But debugging code had better be better than "all code", and it
had better be written to be robust, and it had better not make bugs
more _likely_ either by failing, or by making the non-debug code
harder to read.

I'm done with your crazy unwinder games. We removed a lot of CFI stuff
entirely because it made the asm code hard to read. Much of it ended
up getting re-introduced when you made the infrastructure more
bearable.

But this patch I NAK'ed because the code is not readable, and the
infrastructure is not bearable.

Live with it.

                      Linus

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH, RFC] x86: also CFI-annotate certain inline asm()s
  2014-11-12 20:36       ` Linus Torvalds
@ 2014-11-13  7:49         ` Jan Beulich
  0 siblings, 0 replies; 13+ messages in thread
From: Jan Beulich @ 2014-11-13  7:49 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ingo Molnar, Ingo Molnar, Thomas Gleixner, Tony Jones,
	Linux Kernel Mailing List, Peter Anvin

>>> On 12.11.14 at 21:36, <torvalds@linux-foundation.org> wrote:
> On Mon, Nov 10, 2014 at 11:42 PM, Jan Beulich <JBeulich@suse.com> wrote:
>>
>> Nothing crashes with the unwind information being wrong. It is
>> solely you who was claiming (without proof) years ago that the
>> unwinder repeatedly caused issues.
> 
> Umm. We had oopses showing it. Several times.

And I know you've been saying so before. The only problem here is
- these weren't sent my way for investigation, at least as far as I
recall.

>> Yes, we did find a bug or two over the years in it
> 
> .. and you and Andi repeatedly refused to make the code more robust
> when I asked.

True, we considered _some_ of the requests you made wrong.

> Which is why I don't work with Andi or you directly any more,

People thinking differently than you in certain aspects shouldn't
preclude working with them directly, should it? Yes, it's a project you
started, but it has long become a community one, and as such
excluding people just because they don't conform to every opinion
of yours is, well, odd.

> But this patch I NAK'ed because the code is not readable, and the
> infrastructure is not bearable.
> 
> Live with it.

I got the message.

Jan


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2014-11-13  7:49 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-11-04  9:24 [PATCH, RFC] x86: also CFI-annotate certain inline asm()s Jan Beulich
2014-11-04 19:40 ` Andy Lutomirski
2014-11-05 17:13   ` Jan Beulich
2014-11-05 17:23     ` Andy Lutomirski
2014-11-06 10:35       ` Jan Beulich
2014-11-06 17:00         ` Andy Lutomirski
2014-11-10 10:01 ` Ingo Molnar
     [not found]   ` <CA+55aFw9MV1n8T7_k5oftfY-sOu8s1ywKYM-3k4+PF022vv9ow@mail.gmail.com>
2014-11-10 18:10     ` Linus Torvalds
2014-11-11  7:52       ` Jan Beulich
2014-11-10 18:17     ` Ingo Molnar
2014-11-11  7:42     ` Jan Beulich
2014-11-12 20:36       ` Linus Torvalds
2014-11-13  7:49         ` Jan Beulich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).