* Re: [patches] [PATCH 2.6.21 review I] [4/25] x86: kernel-mode faults pollute current->thead
@ 2007-02-13 7:52 Jan Beulich
2007-02-13 10:00 ` Andi Kleen
2007-02-14 17:51 ` Jeff Dike
0 siblings, 2 replies; 8+ messages in thread
From: Jan Beulich @ 2007-02-13 7:52 UTC (permalink / raw)
To: Jeff Dike; +Cc: Andi Kleen, linux-kernel, patches
>Yup. How does this patch look to you? We set error_code and trap_no
>for userspace faults and kernel faults which call die(). We don't set
>them for kernelspace faults which are fixed up.
Actually, after a second round of thinking I believe there's still more to do
- your second patch missed fixing i386's do_trap() similarly to x86-64's
and, vice versa, x86-64's do_general_protection() similarly to i386's.
Jan
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [patches] [PATCH 2.6.21 review I] [4/25] x86: kernel-mode faults pollute current->thead
2007-02-13 7:52 [patches] [PATCH 2.6.21 review I] [4/25] x86: kernel-mode faults pollute current->thead Jan Beulich
@ 2007-02-13 10:00 ` Andi Kleen
2007-02-14 17:51 ` Jeff Dike
1 sibling, 0 replies; 8+ messages in thread
From: Andi Kleen @ 2007-02-13 10:00 UTC (permalink / raw)
To: Jan Beulich; +Cc: Jeff Dike, linux-kernel, patches
On Tuesday 13 February 2007 08:52, Jan Beulich wrote:
> >Yup. How does this patch look to you? We set error_code and trap_no
> >for userspace faults and kernel faults which call die(). We don't set
> >them for kernelspace faults which are fixed up.
>
> Actually, after a second round of thinking I believe there's still more to do
> - your second patch missed fixing i386's do_trap() similarly to x86-64's
> and, vice versa, x86-64's do_general_protection() similarly to i386's.
I dropped the patch for now until that is all worked out.
-Andi
* Re: [patches] [PATCH 2.6.21 review I] [4/25] x86: kernel-mode faults pollute current->thead
2007-02-13 7:52 [patches] [PATCH 2.6.21 review I] [4/25] x86: kernel-mode faults pollute current->thead Jan Beulich
2007-02-13 10:00 ` Andi Kleen
@ 2007-02-14 17:51 ` Jeff Dike
2007-02-15 8:01 ` Jan Beulich
1 sibling, 1 reply; 8+ messages in thread
From: Jeff Dike @ 2007-02-14 17:51 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andi Kleen, linux-kernel, patches
On Tue, Feb 13, 2007 at 07:52:54AM +0000, Jan Beulich wrote:
> Actually, after a second round of thinking I believe there's still more to do
> - your second patch missed fixing i386's do_trap() similarly to x86-64's
> and, vice versa, x86-64's do_general_protection() similarly to i386's.
Sigh, here's another go at it - the full patch instead of
incrementally fixing the old one:
Index: linux-2.6/arch/i386/kernel/traps.c
===================================================================
--- linux-2.6.orig/arch/i386/kernel/traps.c
+++ linux-2.6/arch/i386/kernel/traps.c
@@ -473,8 +473,6 @@ static void __kprobes do_trap(int trapnr
siginfo_t *info)
{
struct task_struct *tsk = current;
- tsk->thread.error_code = error_code;
- tsk->thread.trap_no = trapnr;
if (regs->eflags & VM_MASK) {
if (vm86)
@@ -486,6 +484,9 @@ static void __kprobes do_trap(int trapnr
goto kernel_trap;
trap_signal: {
+ tsk->thread.error_code = error_code;
+ tsk->thread.trap_no = trapnr;
+
if (info)
force_sig_info(signr, info, tsk);
else
@@ -494,8 +495,11 @@ static void __kprobes do_trap(int trapnr
}
kernel_trap: {
- if (!fixup_exception(regs))
+ if (!fixup_exception(regs)) {
+ tsk->thread.error_code = error_code;
+ tsk->thread.trap_no = trapnr;
die(str, regs, error_code);
+ }
return;
}
@@ -600,15 +604,21 @@ fastcall void __kprobes do_general_prote
}
put_cpu();
- current->thread.error_code = error_code;
- current->thread.trap_no = 13;
-
if (regs->eflags & VM_MASK)
goto gp_in_vm86;
if (!user_mode(regs))
goto gp_in_kernel;
+ /*
+ * We want error_code and trap_no set for userspace faults and
+ * kernelspace faults which result in die(), but not
+ * kernelspace faults which are fixed up. die() gives the
+ * process no chance to handle the signal and notice the
+ * kernel fault information, so this won't pollute the
+ * information about a previously queued, but not yet
+ * delivered, fault.
+ */
current->thread.error_code = error_code;
current->thread.trap_no = 13;
force_sig(SIGSEGV, current);
@@ -621,6 +631,8 @@ gp_in_vm86:
gp_in_kernel:
if (!fixup_exception(regs)) {
+ current->thread.error_code = error_code;
+ current->thread.trap_no = 13;
if (notify_die(DIE_GPF, "general protection fault", regs,
error_code, 13, SIGSEGV) == NOTIFY_STOP)
return;
Index: linux-2.6/arch/x86_64/kernel/traps.c
===================================================================
--- linux-2.6.orig/arch/x86_64/kernel/traps.c
+++ linux-2.6/arch/x86_64/kernel/traps.c
@@ -581,10 +581,19 @@ static void __kprobes do_trap(int trapnr
{
struct task_struct *tsk = current;
- tsk->thread.error_code = error_code;
- tsk->thread.trap_no = trapnr;
-
if (user_mode(regs)) {
+ /*
+ * We want error_code and trap_no set for userspace faults
+ * and kernelspace faults which result in die(), but
+ * not kernelspace faults which are fixed up. die()
+ * gives the process no chance to handle the signal
+ * and notice the kernel fault information, so this
+ * won't pollute the information about a previously
+ * queued, but not yet delivered, fault.
+ */
+ tsk->thread.error_code = error_code;
+ tsk->thread.trap_no = trapnr;
+
if (exception_trace && unhandled_signal(tsk, signr))
printk(KERN_INFO
"%s[%d] trap %s rip:%lx rsp:%lx error:%lx\n",
@@ -605,8 +614,11 @@ static void __kprobes do_trap(int trapnr
fixup = search_exception_tables(regs->rip);
if (fixup)
regs->rip = fixup->fixup;
- else
+ else {
+ tsk->thread.error_code = error_code;
+ tsk->thread.trap_no = trapnr;
die(str, regs, error_code);
+ }
return;
}
}
@@ -682,10 +694,10 @@ asmlinkage void __kprobes do_general_pro
conditional_sti(regs);
- tsk->thread.error_code = error_code;
- tsk->thread.trap_no = 13;
-
if (user_mode(regs)) {
+ tsk->thread.error_code = error_code;
+ tsk->thread.trap_no = 13;
+
if (exception_trace && unhandled_signal(tsk, SIGSEGV))
printk(KERN_INFO
"%s[%d] general protection rip:%lx rsp:%lx error:%lx\n",
@@ -704,6 +716,10 @@ asmlinkage void __kprobes do_general_pro
regs->rip = fixup->fixup;
return;
}
+
+
+ tsk->thread.error_code = error_code;
+ tsk->thread.trap_no = 13;
if (notify_die(DIE_GPF, "general protection fault", regs,
error_code, 13, SIGSEGV) == NOTIFY_STOP)
return;
--
Work email - jdike at linux dot intel dot com
* Re: [patches] [PATCH 2.6.21 review I] [4/25] x86: kernel-mode faults pollute current->thead
2007-02-14 17:51 ` Jeff Dike
@ 2007-02-15 8:01 ` Jan Beulich
2007-02-15 16:23 ` Jeff Dike
0 siblings, 1 reply; 8+ messages in thread
From: Jan Beulich @ 2007-02-15 8:01 UTC (permalink / raw)
To: Jeff Dike; +Cc: Andi Kleen, linux-kernel, patches
>>> Jeff Dike <jdike@addtoit.com> 14.02.07 18:51 >>>
>On Tue, Feb 13, 2007 at 07:52:54AM +0000, Jan Beulich wrote:
>> Actually, after a second round of thinking I believe there's still more to do
>> - your second patch missed fixing i386's do_trap() similarly to x86-64's
>> and, vice versa, x86-64's do_general_protection() similarly to i386's.
>
>Sigh, here's another go at it - the full patch instead of
>incrementally fixing the old one:
Ack.
* [PATCH 2.6.21 review I] [1/25] x86_64: Add __copy_from_user_nocache
@ 2007-02-10 11:50 Andi Kleen
2007-02-10 11:50 ` [PATCH 2.6.21 review I] [4/25] x86: kernel-mode faults pollute current->thead Andi Kleen
0 siblings, 1 reply; 8+ messages in thread
From: Andi Kleen @ 2007-02-10 11:50 UTC (permalink / raw)
To: patches, linux-kernel
This does user copies in fs write() into the page cache with write combining.
This pushes the destination out of the CPU's cache, but allows higher bandwidth
in some cases.
The theory is that the page cache data is usually not touched by the
CPU again, so it's better not to pollute the cache with it. It is also a little
faster.
Signed-off-by: Andi Kleen <ak@suse.de>
---
arch/x86_64/kernel/x8664_ksyms.c | 1
arch/x86_64/lib/Makefile | 2
arch/x86_64/lib/copy_user_nocache.S | 217 ++++++++++++++++++++++++++++++++++++
include/asm-x86_64/uaccess.h | 14 ++
4 files changed, 233 insertions(+), 1 deletion(-)
Index: linux/arch/x86_64/lib/Makefile
===================================================================
--- linux.orig/arch/x86_64/lib/Makefile
+++ linux/arch/x86_64/lib/Makefile
@@ -9,4 +9,4 @@ obj-y := io.o iomap_copy.o
lib-y := csum-partial.o csum-copy.o csum-wrappers.o delay.o \
usercopy.o getuser.o putuser.o \
thunk.o clear_page.o copy_page.o bitstr.o bitops.o
-lib-y += memcpy.o memmove.o memset.o copy_user.o rwlock.o
+lib-y += memcpy.o memmove.o memset.o copy_user.o rwlock.o copy_user_nocache.o
Index: linux/arch/x86_64/lib/copy_user_nocache.S
===================================================================
--- /dev/null
+++ linux/arch/x86_64/lib/copy_user_nocache.S
@@ -0,0 +1,217 @@
+/* Copyright 2002 Andi Kleen, SuSE Labs.
+ * Subject to the GNU Public License v2.
+ *
+ * Functions to copy from and to user space.
+ */
+
+#include <linux/linkage.h>
+#include <asm/dwarf2.h>
+
+#define FIX_ALIGNMENT 1
+
+#include <asm/current.h>
+#include <asm/asm-offsets.h>
+#include <asm/thread_info.h>
+#include <asm/cpufeature.h>
+
+/*
+ * copy_user_nocache - Uncached memory copy with exception handling
+ * This will force destination/source out of cache for more performance.
+ *
+ * Input:
+ * rdi destination
+ * rsi source
+ * rdx count
+ * rcx zero flag; when 1, zero the remaining bytes on exception
+ *
+ * Output:
+ * eax uncopied bytes or 0 if successful.
+ */
+ENTRY(__copy_user_nocache)
+ CFI_STARTPROC
+ pushq %rbx
+ CFI_ADJUST_CFA_OFFSET 8
+ CFI_REL_OFFSET rbx, 0
+ pushq %rcx /* save zero flag */
+ CFI_ADJUST_CFA_OFFSET 8
+ CFI_REL_OFFSET rcx, 0
+
+ xorl %eax,%eax /* zero for the exception handler */
+
+#ifdef FIX_ALIGNMENT
+ /* check for bad alignment of destination */
+ movl %edi,%ecx
+ andl $7,%ecx
+ jnz .Lbad_alignment
+.Lafter_bad_alignment:
+#endif
+
+ movq %rdx,%rcx
+
+ movl $64,%ebx
+ shrq $6,%rdx
+ decq %rdx
+ js .Lhandle_tail
+
+ .p2align 4
+.Lloop:
+.Ls1: movq (%rsi),%r11
+.Ls2: movq 1*8(%rsi),%r8
+.Ls3: movq 2*8(%rsi),%r9
+.Ls4: movq 3*8(%rsi),%r10
+.Ld1: movnti %r11,(%rdi)
+.Ld2: movnti %r8,1*8(%rdi)
+.Ld3: movnti %r9,2*8(%rdi)
+.Ld4: movnti %r10,3*8(%rdi)
+
+.Ls5: movq 4*8(%rsi),%r11
+.Ls6: movq 5*8(%rsi),%r8
+.Ls7: movq 6*8(%rsi),%r9
+.Ls8: movq 7*8(%rsi),%r10
+.Ld5: movnti %r11,4*8(%rdi)
+.Ld6: movnti %r8,5*8(%rdi)
+.Ld7: movnti %r9,6*8(%rdi)
+.Ld8: movnti %r10,7*8(%rdi)
+
+ dec %rdx
+
+ leaq 64(%rsi),%rsi
+ leaq 64(%rdi),%rdi
+
+ jns .Lloop
+
+ .p2align 4
+.Lhandle_tail:
+ movl %ecx,%edx
+ andl $63,%ecx
+ shrl $3,%ecx
+ jz .Lhandle_7
+ movl $8,%ebx
+ .p2align 4
+.Lloop_8:
+.Ls9: movq (%rsi),%r8
+.Ld9: movnti %r8,(%rdi)
+ decl %ecx
+ leaq 8(%rdi),%rdi
+ leaq 8(%rsi),%rsi
+ jnz .Lloop_8
+
+.Lhandle_7:
+ movl %edx,%ecx
+ andl $7,%ecx
+ jz .Lende
+ .p2align 4
+.Lloop_1:
+.Ls10: movb (%rsi),%bl
+.Ld10: movb %bl,(%rdi)
+ incq %rdi
+ incq %rsi
+ decl %ecx
+ jnz .Lloop_1
+
+ CFI_REMEMBER_STATE
+.Lende:
+ popq %rcx
+ CFI_ADJUST_CFA_OFFSET -8
+ CFI_RESTORE %rcx
+ popq %rbx
+ CFI_ADJUST_CFA_OFFSET -8
+ CFI_RESTORE rbx
+ ret
+ CFI_RESTORE_STATE
+
+#ifdef FIX_ALIGNMENT
+ /* align destination */
+ .p2align 4
+.Lbad_alignment:
+ movl $8,%r9d
+ subl %ecx,%r9d
+ movl %r9d,%ecx
+ cmpq %r9,%rdx
+ jz .Lhandle_7
+ js .Lhandle_7
+.Lalign_1:
+.Ls11: movb (%rsi),%bl
+.Ld11: movb %bl,(%rdi)
+ incq %rsi
+ incq %rdi
+ decl %ecx
+ jnz .Lalign_1
+ subq %r9,%rdx
+ jmp .Lafter_bad_alignment
+#endif
+
+ /* table sorted by exception address */
+ .section __ex_table,"a"
+ .align 8
+ .quad .Ls1,.Ls1e
+ .quad .Ls2,.Ls2e
+ .quad .Ls3,.Ls3e
+ .quad .Ls4,.Ls4e
+ .quad .Ld1,.Ls1e
+ .quad .Ld2,.Ls2e
+ .quad .Ld3,.Ls3e
+ .quad .Ld4,.Ls4e
+ .quad .Ls5,.Ls5e
+ .quad .Ls6,.Ls6e
+ .quad .Ls7,.Ls7e
+ .quad .Ls8,.Ls8e
+ .quad .Ld5,.Ls5e
+ .quad .Ld6,.Ls6e
+ .quad .Ld7,.Ls7e
+ .quad .Ld8,.Ls8e
+ .quad .Ls9,.Le_quad
+ .quad .Ld9,.Le_quad
+ .quad .Ls10,.Le_byte
+ .quad .Ld10,.Le_byte
+#ifdef FIX_ALIGNMENT
+ .quad .Ls11,.Lzero_rest
+ .quad .Ld11,.Lzero_rest
+#endif
+ .quad .Le5,.Le_zero
+ .previous
+
+ /* compute 64-offset for main loop. 8 bytes accuracy with error on the
+ pessimistic side. this is gross. it would be better to fix the
+ interface. */
+ /* eax: zero, ebx: 64 */
+.Ls1e: addl $8,%eax
+.Ls2e: addl $8,%eax
+.Ls3e: addl $8,%eax
+.Ls4e: addl $8,%eax
+.Ls5e: addl $8,%eax
+.Ls6e: addl $8,%eax
+.Ls7e: addl $8,%eax
+.Ls8e: addl $8,%eax
+ addq %rbx,%rdi /* +64 */
+ subq %rax,%rdi /* correct destination with computed offset */
+
+ shlq $6,%rdx /* loop counter * 64 (stride length) */
+ addq %rax,%rdx /* add offset to loopcnt */
+ andl $63,%ecx /* remaining bytes */
+ addq %rcx,%rdx /* add them */
+ jmp .Lzero_rest
+
+ /* exception on quad word loop in tail handling */
+ /* ecx: loopcnt/8, %edx: length, rdi: correct */
+.Le_quad:
+ shll $3,%ecx
+ andl $7,%edx
+ addl %ecx,%edx
+ /* edx: bytes to zero, rdi: dest, eax:zero */
+.Lzero_rest:
+ cmpl $0,(%rsp) /* zero flag set? */
+ jz .Le_zero
+ movq %rdx,%rcx
+.Le_byte:
+ xorl %eax,%eax
+.Le5: rep
+ stosb
+ /* when there is another exception while zeroing the rest just return */
+.Le_zero:
+ movq %rdx,%rax
+ jmp .Lende
+ CFI_ENDPROC
+ENDPROC(__copy_user_nocache)
+
+
Index: linux/include/asm-x86_64/uaccess.h
===================================================================
--- linux.orig/include/asm-x86_64/uaccess.h
+++ linux/include/asm-x86_64/uaccess.h
@@ -367,4 +367,18 @@ __copy_to_user_inatomic(void __user *dst
return copy_user_generic((__force void *)dst, src, size);
}
+#define ARCH_HAS_NOCACHE_UACCESS 1
+extern long __copy_user_nocache(void *dst, const void __user *src, unsigned size, int zerorest);
+
+static inline int __copy_from_user_nocache(void *dst, const void __user *src, unsigned size)
+{
+ might_sleep();
+ return __copy_user_nocache(dst, (__force void *)src, size, 1);
+}
+
+static inline int __copy_from_user_inatomic_nocache(void *dst, const void __user *src, unsigned size)
+{
+ return __copy_user_nocache(dst, (__force void *)src, size, 0);
+}
+
#endif /* __X86_64_UACCESS_H */
Index: linux/arch/x86_64/kernel/x8664_ksyms.c
===================================================================
--- linux.orig/arch/x86_64/kernel/x8664_ksyms.c
+++ linux/arch/x86_64/kernel/x8664_ksyms.c
@@ -26,6 +26,7 @@ EXPORT_SYMBOL(__put_user_4);
EXPORT_SYMBOL(__put_user_8);
EXPORT_SYMBOL(copy_user_generic);
+EXPORT_SYMBOL(__copy_user_nocache);
EXPORT_SYMBOL(copy_from_user);
EXPORT_SYMBOL(copy_to_user);
EXPORT_SYMBOL(__copy_from_user_inatomic);
* [PATCH 2.6.21 review I] [4/25] x86: kernel-mode faults pollute current->thead
2007-02-10 11:50 [PATCH 2.6.21 review I] [1/25] x86_64: Add __copy_from_user_nocache Andi Kleen
@ 2007-02-10 11:50 ` Andi Kleen
2007-02-12 9:32 ` [patches] " Jan Beulich
0 siblings, 1 reply; 8+ messages in thread
From: Andi Kleen @ 2007-02-10 11:50 UTC (permalink / raw)
To: Jeff Dike, Andi Kleen, patches, linux-kernel
From: Jeff Dike <jdike@addtoit.com>
Kernel-mode traps on x86_64 can pollute the trap information for a previous
userspace trap for which the signal has not yet been delivered to the
process.
do_trap and do_general_protection set task->thread.error_code and .trapno
for kernel traps. If a kernel-mode trap arrives between the arrival of a
userspace trap and the delivery of the associated SIGSEGV to the process,
the process will get the kernel trap information in its sigcontext.
This causes UML process segfaults, as the trapno that the UML kernel sees
is 13, rather than the 14 for normal page faults. So, the UML kernel
passes the SIGSEGV along to its process.
I don't claim to fully understand the problem. On the one hand, a check in
do_general_protection for a pending SIGSEGV turned up nothing. On the
other hand, this patch fixed the UML process segfault problem.
The patch below moves the setting of error_code and trapno so that it
happens only for userspace faults. As a side-effect, this
should speed up kernel-mode fault handling a tiny bit.
I looked at i386, and there is a similar situation. In this case, there is
duplicate code setting task->thread.error_code and trapno. I deleted one,
leaving the copy that runs in the case of a userspace fault.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
---
arch/i386/kernel/traps.c | 8 +++-----
arch/x86_64/kernel/traps.c | 12 ++++++------
2 files changed, 9 insertions(+), 11 deletions(-)
Index: linux/arch/i386/kernel/traps.c
===================================================================
--- linux.orig/arch/i386/kernel/traps.c
+++ linux/arch/i386/kernel/traps.c
@@ -474,8 +474,6 @@ static void __kprobes do_trap(int trapnr
siginfo_t *info)
{
struct task_struct *tsk = current;
- tsk->thread.error_code = error_code;
- tsk->thread.trap_no = trapnr;
if (regs->eflags & VM_MASK) {
if (vm86)
@@ -487,6 +485,9 @@ static void __kprobes do_trap(int trapnr
goto kernel_trap;
trap_signal: {
+ tsk->thread.error_code = error_code;
+ tsk->thread.trap_no = trapnr;
+
if (info)
force_sig_info(signr, info, tsk);
else
@@ -601,9 +602,6 @@ fastcall void __kprobes do_general_prote
}
put_cpu();
- current->thread.error_code = error_code;
- current->thread.trap_no = 13;
-
if (regs->eflags & VM_MASK)
goto gp_in_vm86;
Index: linux/arch/x86_64/kernel/traps.c
===================================================================
--- linux.orig/arch/x86_64/kernel/traps.c
+++ linux/arch/x86_64/kernel/traps.c
@@ -581,10 +581,10 @@ static void __kprobes do_trap(int trapnr
{
struct task_struct *tsk = current;
- tsk->thread.error_code = error_code;
- tsk->thread.trap_no = trapnr;
-
if (user_mode(regs)) {
+ tsk->thread.error_code = error_code;
+ tsk->thread.trap_no = trapnr;
+
if (exception_trace && unhandled_signal(tsk, signr))
printk(KERN_INFO
"%s[%d] trap %s rip:%lx rsp:%lx error:%lx\n",
@@ -682,10 +682,10 @@ asmlinkage void __kprobes do_general_pro
conditional_sti(regs);
- tsk->thread.error_code = error_code;
- tsk->thread.trap_no = 13;
-
if (user_mode(regs)) {
+ tsk->thread.error_code = error_code;
+ tsk->thread.trap_no = 13;
+
if (exception_trace && unhandled_signal(tsk, SIGSEGV))
printk(KERN_INFO
"%s[%d] general protection rip:%lx rsp:%lx error:%lx\n",
* Re: [patches] [PATCH 2.6.21 review I] [4/25] x86: kernel-mode faults pollute current->thead
2007-02-10 11:50 ` [PATCH 2.6.21 review I] [4/25] x86: kernel-mode faults pollute current->thead Andi Kleen
@ 2007-02-12 9:32 ` Jan Beulich
2007-02-12 16:42 ` Jeff Dike
0 siblings, 1 reply; 8+ messages in thread
From: Jan Beulich @ 2007-02-12 9:32 UTC (permalink / raw)
To: Jeff Dike, Andi Kleen; +Cc: linux-kernel, patches
>>> Andi Kleen <ak@suse.de> 10.02.07 12:50 >>>
>
>From: Jeff Dike <jdike@addtoit.com>
>
>Kernel-mode traps on x86_64 can pollute the trap information for a previous
>userspace trap for which the signal has not yet been delivered to the
>process.
>
>do_trap and do_general_protection set task->thread.error_code and .trapno
>for kernel traps. If a kernel-mode trap arrives between the arrival of a
>userspace trap and the delivery of the associated SIGSEGV to the process,
>the process will get the kernel trap information in its sigcontext.
>
>This causes UML process segfaults, as the trapno that the UML kernel sees
>is 13, rather than the 14 for normal page faults. So, the UML kernel
>passes the SIGSEGV along to its process.
>
>I don't claim to fully understand the problem. On the one hand, a check in
>do_general_protection for a pending SIGSEGV turned up nothing. On the
>other hand, this patch fixed the UML process segfault problem.
>
>The patch below moves the setting of error_code and trapno so that it
>happens only for userspace faults. As a side-effect, this
>should speed up kernel-mode fault handling a tiny bit.
This breaks consumers of notify_die() relying on the proper trap number being
passed, as the call to notify_die() from die() currently reads
current->thread.trap_no.
Also, you seem to leave other places where trap_no gets set untouched -
is this intentional (do_debug - probably correct here, kernel_math_error -
probably incorrect here)?
>I looked at i386, and there is a similar situation. In this case, there is
>duplicate code setting task->thread.error_code and trapno. I deleted one,
>leaving the copy that runs in the case of a userspace fault.
Likewise.
Jan
* Re: [patches] [PATCH 2.6.21 review I] [4/25] x86: kernel-mode faults pollute current->thead
2007-02-12 9:32 ` [patches] " Jan Beulich
@ 2007-02-12 16:42 ` Jeff Dike
2007-02-12 17:01 ` Jan Beulich
0 siblings, 1 reply; 8+ messages in thread
From: Jeff Dike @ 2007-02-12 16:42 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andi Kleen, linux-kernel, patches
On Mon, Feb 12, 2007 at 09:32:10AM +0000, Jan Beulich wrote:
> This breaks consumers of notify_die() relying on the proper trap number being
> passed, as the call to notify_die() from die() currently reads
> current->thread.trap_no.
Rats, good point.
> Also, you seem to leave other places where trap_no gets set untouched -
> is this intentional (do_debug - probably correct here, kernel_math_error -
> probably incorrect here)?
I did check the other trap handlers. kernel_math_error calls die,
which calls do_exit(SIGSEGV). This doesn't seem to allow the process
the opportunity to trap the SIGSEGV and examine the fault information.
> >I looked at i386, and there is a similar situation. In this case, there is
> >duplicate code setting task->thread.error_code and trapno. I deleted one,
> >leaving the copy that runs in the case of a userspace fault.
>
> Likewise.
Yup. How does this patch look to you? We set error_code and trap_no
for userspace faults and kernel faults which call die(). We don't set
them for kernelspace faults which are fixed up.
Index: linux-2.6/arch/i386/kernel/traps.c
===================================================================
--- linux-2.6.orig/arch/i386/kernel/traps.c
+++ linux-2.6/arch/i386/kernel/traps.c
@@ -619,6 +619,8 @@ gp_in_vm86:
gp_in_kernel:
if (!fixup_exception(regs)) {
+ current->thread.error_code = error_code;
+ current->thread.trap_no = 13;
if (notify_die(DIE_GPF, "general protection fault", regs,
error_code, 13, SIGSEGV) == NOTIFY_STOP)
return;
Index: linux-2.6/arch/x86_64/kernel/traps.c
===================================================================
--- linux-2.6.orig/arch/x86_64/kernel/traps.c
+++ linux-2.6/arch/x86_64/kernel/traps.c
@@ -605,8 +605,11 @@ static void __kprobes do_trap(int trapnr
fixup = search_exception_tables(regs->rip);
if (fixup)
regs->rip = fixup->fixup;
- else
+ else {
+ tsk->thread.error_code = error_code;
+ tsk->thread.trap_no = trapnr;
die(str, regs, error_code);
+ }
return;
}
}
--
Work email - jdike at linux dot intel dot com
* Re: [patches] [PATCH 2.6.21 review I] [4/25] x86: kernel-mode faults pollute current->thead
2007-02-12 16:42 ` Jeff Dike
@ 2007-02-12 17:01 ` Jan Beulich
0 siblings, 0 replies; 8+ messages in thread
From: Jan Beulich @ 2007-02-12 17:01 UTC (permalink / raw)
To: Jeff Dike; +Cc: Andi Kleen, linux-kernel, patches
>Yup. How does this patch look to you? We set error_code and trap_no
>for userspace faults and kernel faults which call die(). We don't set
>them for kernelspace faults which are fixed up.
That seems a reasonable approach.
Thanks, Jan
end of thread, other threads:[~2007-02-15 16:31 UTC | newest]
Thread overview: 8+ messages
-- links below jump to the message on this page --
2007-02-13 7:52 [patches] [PATCH 2.6.21 review I] [4/25] x86: kernel-mode faults pollute current->thead Jan Beulich
2007-02-13 10:00 ` Andi Kleen
2007-02-14 17:51 ` Jeff Dike
2007-02-15 8:01 ` Jan Beulich
2007-02-15 16:23 ` Jeff Dike
-- strict thread matches above, loose matches on Subject: below --
2007-02-10 11:50 [PATCH 2.6.21 review I] [1/25] x86_64: Add __copy_from_user_nocache Andi Kleen
2007-02-10 11:50 ` [PATCH 2.6.21 review I] [4/25] x86: kernel-mode faults pollute current->thead Andi Kleen
2007-02-12 9:32 ` [patches] " Jan Beulich
2007-02-12 16:42 ` Jeff Dike
2007-02-12 17:01 ` Jan Beulich